February 25, 2006 14:58 WSPC/148-RMP
J070-00257
EDITORIAL STATEMENT
As of 1st January 2006, changes have taken place in the Editorial Board of Reviews in Mathematical Physics. In particular, the undersigned has taken over the position of Editor-in-Chief. The predecessors, Detlev Buchholz and the founding editor, Huzihiro Araki, have during the past 17 years built up a widely recognized journal in the field of mathematical physics. This success has been based on an editorial policy that puts scientific excellence and timeliness of the published papers in the first place, but also on the cooperation among the editorial board, the authors and the numerous referees who have devoted part of their valuable time to this service. This editorial policy will be continued, and the journal will also in the future count on the active participation of members of the scientific community in this enterprise.

As stated on the cover page, Reviews in Mathematical Physics is a journal for both review and original research papers. The original papers should ideally also have an expository part understandable to a wider readership. This profile makes the journal rather unique in the field. Mathematical physics is a discipline where common techniques and insights often apply to a variety of physical problems, and survey articles may here have even more impact than in some other areas where specialization is narrower. One of the goals of the new editorial board is to increase the number of review articles even further, keeping the high standards set by the predecessors.

Jakob Yngvason
J070-00255
Reviews in Mathematical Physics Vol. 18, No. 1 (2006) 1–18 © World Scientific Publishing Company
ON THE NOTION OF CONDITIONAL SYMMETRY OF DIFFERENTIAL EQUATIONS
GIAMPAOLO CICOGNA
Dipartimento di Fisica “E. Fermi” dell’Università and I.N.F.N., Sezione di Pisa, Largo B. Pontecorvo 3, Ed. B-C, I-56127, Pisa, Italy
[email protected]

MICHELE LAINO
Dipartimento di Fisica “E. Fermi” dell’Università, Largo B. Pontecorvo 3, Ed. B-C, I-56127, Pisa, Italy

Received 8 March 2005
Revised 23 August 2005

Symmetry properties of PDE’s are considered within a systematic and unifying scheme: particular attention is devoted to the notion of conditional symmetry, leading to the distinction and a precise characterization of the notions of “true” and “weak” conditional symmetry. Their relationship with exact and partial symmetries is also discussed. An extensive use of “symmetry-adapted” variables is made; several clarifying examples, including the case of Boussinesq equation, are also provided.

Keywords: Lie point-symmetries; partial differential equations; conditional symmetries.

Mathematics Subject Classification 2000: 22E05, 35A25
1. Introduction

In the study of general aspects of differential equations, and also in the concrete problem of finding their explicit solutions, a fundamental role is played, as is well known, by the analysis of symmetry properties of the equations. In addition to the classical notion of Lie “exact” symmetries (see e.g., [1–7]), an important class of symmetries is given by the “conditional symmetries” (or “nonclassical symmetries”), introduced and developed by Bluman and Cole [8, 9], Levi and Winternitz [10, 11], Fushchych [12, 13] and many others (see e.g., [5, 11]). In this paper, we will be concerned with partial differential equations (PDE) and with the above mentioned types of symmetries, and also with the notion of “partial symmetry”, as defined in [14]: in the context of a simple comprehensive scheme, we will distinguish different notions of conditional symmetry, with a precise characterization of their properties and a clear comparison with other types of symmetries.
An extensive use will be made of the “symmetry-adapted” variables (also called “canonical coordinates”, see e.g. [2, 7] and also [15]), which prove to be extremely useful; several clarifying examples will also be provided, including the case of the Boussinesq equation, which offers good examples for all the different notions of symmetry considered in this paper (see also [16]).

For the sake of simplicity, only “geometrical” or Lie point-symmetries will be considered, although the relevant results could be extended to more general classes of symmetries, such as generalized or Bäcklund, potential or nonlocal symmetries, whose importance is well known and has recently been further emphasized (see e.g. [17–20]).

2. Preliminary Statements

Let us start with a preliminary lemma, simple but important for our applications. In view of this, the notation is chosen to be as similar as possible to that used below.

Lemma 2.1. Consider a system of $n$ equations for the $n$ functions $y_a = y_a(s)$ ($a = 1, \dots, n$; $s \in \mathbb{R}$) of the form (sum over repeated indices)
\[ \frac{dy_a}{ds} = G_{ab}(s, y)\, y_b, \tag{2.1} \]
where $G_{ab}$ are $n \times n$ given functions of $s$ and of $y \equiv (y_1, \dots, y_n)$, which are assumed regular enough (e.g., analytic in a neighborhood of $s = 0$, $y = 0$). Then, any solution of (2.1) can be written, in a neighborhood of $s = 0$,
\[ y_a(s) = R_{ab}(s)\, \kappa_b, \tag{2.2} \]
where $\kappa \equiv (\kappa_1, \dots, \kappa_n)$ are constants, and $R_{ab}$ are regular functions with $R_{ab}(0) = \delta_{ab}$ (then, $\kappa_a = y_a(0)$). Conversely, for any solution $y_a(s)$, there are regular functions $S_{ab}(s)$ such that
\[ \kappa_a = S_{ab}(s)\, y_b(s). \tag{2.3} \]

Proof. The result is nearly trivial if $G_{ab}$ do not depend on $y$. In the general case, let $y_a = \bar{y}_a(s)$ denote any given solution of (2.1) in a neighborhood of $s = 0$, determined by $n$ initial values $\kappa_a = \bar{y}_a(0)$ (we do not write explicitly the dependence on $\kappa$). Let us put $K_a(s) = S_{ab}(s)\, \bar{y}_b(s)$, where $S_{ab}$ are functions to be determined; we then get (with $' = d/ds$)
\[ K_a' = S_{ab}'\, \bar{y}_b + S_{ab}\, \bar{y}_b' = \bigl(S_{ab}' + S_{ac}\, G_{cb}(s, \bar{y}(s))\bigr)\, \bar{y}_b. \tag{2.4} \]
Consider now the equation for the matrix $S$, written in matrix form,
\[ S' = -S G, \tag{2.5} \]
where it is understood that the generic solution $\bar{y}$ is replaced in $G$ by its expression depending on $s$ (and on $\kappa$, of course). Now, Eq. (2.5) always admits a solution
$S$, as is well known (see e.g., [21]), which can be characterized as a fundamental matrix for the associated “adjoint” system $\zeta_a' = -\zeta_b G_{ba}$. In particular, this fundamental matrix can be constructed assuming as initial value at $s = 0$ the matrix $S(0) = I$. Therefore, choosing this matrix $S$, one gets from (2.4) $K_a = \mathrm{const} = K_a(0) = S_{ab}(0)\, \bar{y}_b(0) = \bar{y}_a(0) = \kappa_a$, and $\kappa_a = S_{ab}(s)\, \bar{y}_b(s)$. The matrix $S(s)$ can be locally inverted, giving for any solution $y_a(s)$, $y_a = S^{-1}_{ab}\, \kappa_b := R_{ab}\, \kappa_b$ with $R = R(s)$ and $R(0) = I$.

Remark 2.2. As is clear from the proof, $S$ and $R$ also depend on the initial values $\kappa$ which indeed determine the generic solution $\bar{y}(s)$; the only relevant points here are the “factorization” of the $\kappa$ as in (2.2) and the form (2.3), i.e. the possibility of obtaining $s$-independent “combinations” (with coefficients $S$ depending on $s$) of the components of each solution. In our applications, the functions $G_{ab}$ will also depend on some other parameters; then all results hold true, but clearly $S$, $R$ and $\kappa$ turn out to be functions of these additional quantities.

In the following, we will consider systems of PDE’s, denoted by
\[ \Delta \equiv \Delta_a(x, u^{(m)}) = 0, \qquad a = 1, \dots, \nu, \qquad x \equiv (x_1, \dots, x_p), \qquad u \equiv (u_1, \dots, u_q), \tag{2.6} \]
for the $q$ functions $u_\alpha = u_\alpha(x)$ of the $p$ variables $x_i$, where $u^{(m)}$ denotes the functions $u_\alpha$ together with their $x$ derivatives up to the order $m$, with usual notations and assumptions (as stated, e.g., in [2]). In particular, we will always assume that all standard smoothness properties and the maximal rank condition are satisfied. As anticipated, only Lie point-symmetries will be considered, with infinitesimal generator given by the vector field
\[ X = \xi_i(x, u)\, \frac{\partial}{\partial x_i} + \varphi_\alpha(x, u)\, \frac{\partial}{\partial u_\alpha}. \tag{2.7} \]
To simplify notations, we shall denote by $X^*$ the “appropriate” prolongation of $X$ for the equation at hand or, alternatively, its infinite prolongation (indeed, only a finite number of terms will appear in calculations).

For completeness and, even more, for comparison with the subsequent Definition 2.4, let us start with the following (completely standard) definition (cf. [2]).
Definition 2.3. A (nondegenerate) system of PDE $\Delta_a(x, u^{(m)}) = 0$ is said to admit the Lie point-symmetry generated by the vector field $X$ (or to be symmetric under $X$) if the following condition
\[ X^*(\Delta)|_{\Delta = 0} = 0 \tag{2.8} \]
is satisfied, or, equivalently (at least under mild hypotheses, see [2]), if there are functions $G_{ab}(x, u^{(m)})$ such that
\[ (X^*(\Delta))_a = G_{ab}\, \Delta_b. \tag{2.9} \]
Let us also give this other definition.
Definition 2.4. A system of PDE as before is said to be invariant under a vector field $X$ if
\[ X^*(\Delta) = 0. \tag{2.10} \]

For instance, the Laplace equation $u_{xx} + u_{yy} = 0$ is invariant under the rotation symmetry generated by $X = y\, \partial/\partial x - x\, \partial/\partial y$; the heat equation $u_t = u_{xx}$ is symmetric but not invariant under
\[ X = 2t\, \frac{\partial}{\partial x} - x u\, \frac{\partial}{\partial u}, \tag{2.11} \]
indeed one has $X^*(u_t - u_{xx}) = -x\,(u_t - u_{xx})$.

3. Symmetric and Invariant Equations

Let us introduce a first simplification: we will assume that the vector fields $X$ are “projectable” or, more explicitly, that the functions $\xi$ in (2.7) do not depend on $u$, as often happens in the study of PDE’s. This strongly simplifies calculations, especially in the introduction of the more “convenient” or “symmetry-adapted” variables, and allows a more direct relationship between symmetries and symmetry-invariant solutions, as discussed in [22].

A first result, concerning “exact” (to be distinguished from conditional or partial, see below) symmetries is the following (see also [2, 7]).

Theorem 3.1. Let $\Delta = 0$ be a nondegenerate system of PDE, symmetric under a vector field $X$, according to Definition 2.3. Then, there are new $p + q$ variables $s$, $z$ and $v$, with $s \in \mathbb{R}$, $z \in \mathbb{R}^{p-1}$, $v \equiv (v_1(s, z), \dots, v_q(s, z))$, and a new system of PDE’s, say $K = 0$, with $K_a = S_{ab}(s, z, v^{(m)})\, \tilde{\Delta}_b(s, z, v^{(m)})$ [where $v^{(m)}$ stands for $v(s, z)$ and its derivatives with respect to $s$ and $z$, and $\tilde{\Delta} = \tilde{\Delta}(s, z, v^{(m)})$ is $\Delta$ when expressed in terms of the new variables $s$, $z$, $v$], which is locally equivalent to the initial system and is invariant (as in Definition 2.4) under the symmetry $X = \partial/\partial s$, i.e. $K_a = K_a(z, v^{(m)})$.

Proof. Given $X$, one introduces “canonical variables” $s$, $z$, defined by
\[ X s \equiv \xi_i\, \frac{\partial s}{\partial x_i} + \varphi_\alpha\, \frac{\partial s}{\partial u_\alpha} = 1; \qquad X z_k = 0 \quad (k = 1, \dots, p - 1). \tag{3.1} \]
One first considers the subset of characteristic equations $dx_i/\xi_i = ds$ which do not contain the variables $u_\alpha$, and finds the variable $s$ together with the $X$-invariant variables $z_k$. Then, using the characteristic equations $dx_i/\xi_i = du_\alpha/\varphi_\alpha$, one finds the $q$ invariant quantities $v_\alpha$, and expresses the $u_\alpha$ in terms of $v_\alpha$ and of the new independent variables $s$, $z_k$. Once written in these coordinates, the symmetry field and all its prolongations are simply given by
\[ X = X^* = \frac{\partial}{\partial s}, \tag{3.2} \]
whereas the symmetry condition becomes $\partial \tilde{\Delta}/\partial s\,|_{\tilde{\Delta} = 0} = 0$ or
\[ \frac{\partial}{\partial s}\, \tilde{\Delta}_a = G_{ab}\, \tilde{\Delta}_b. \tag{3.3} \]
An application of Lemma 2.1, where the role of $y$ is played here by $\tilde{\Delta}$ and that of $\kappa$ by $K$, shows that there are suitable “combinations” $K_a = S_{ab}\, \tilde{\Delta}_b$ of the $\tilde{\Delta}_a$ which do not depend explicitly on $s$, i.e. $(\partial/\partial s) K = 0$.

This result can be compared with an analogous result presented in [23], where however the point of view is different (i.e. constructing equations with a prescribed algebra of symmetries).

Example 3.2. Consider the quite trivial system of PDE for $u = u(x, y)$
\[ u_{xx} + u_{yy} + u_{xxx} = 0, \tag{3.4a} \]
\[ u_{xxx} = u_{xxy} = u_{xyy} = u_{yyy} = 0. \tag{3.4b} \]
This system admits the rotation symmetry $X = y\, \partial/\partial x - x\, \partial/\partial y$, although none of the equations above is invariant or symmetric under rotations. The variables $s$, $z$ are in this case obviously the polar variables $\theta$, $r$ and $X = X^* = \partial/\partial\theta$; it is now easy to construct combinations of the above equations for $v = v(r, \theta)$ which are invariant under $\partial/\partial\theta$. For example,
\[ y u_{xxx} - x u_{xxy} + y u_{xyy} - x u_{yyy} = -r^{-2}\,(r v_{r\theta} + r^2 v_{rr\theta} + v_{\theta\theta\theta}) = 0, \]
\[ x u_{xxx} + y u_{xxy} + x u_{xyy} + y u_{yyy} = r^{-2}\,(-r v_r + r^2 v_{rr} - 2 v_{\theta\theta} + r^3 v_{rrr} + r v_{r\theta\theta}) = 0. \]
It can be remarked that considering Eq. (3.4a), together with only the first one of the (3.4b), i.e. $u_{xxx} = 0$, one obtains a system which is not symmetric under rotations, although the equation $u_{xxx} = 0$ expresses the vanishing of the “symmetry breaking term” in (3.4a). As a consequence, the system of these two equations would admit solutions, e.g., $u = x^2 y - y^3/3$, which are not transformed by rotations into other solutions.

Example 3.3. In the example of the heat equation mentioned at the end of the previous section, choosing the variables $s = x/2t$, $z = t$ and with $u = \exp(-z s^2)\, v(s, z)$ as determined by the symmetry vector field (2.11), the equation is transformed into the equivalent equation for $v = v(s, z)$,
\[ 4 z^2 v_z + 2 z v - v_{ss} = 0 \qquad (v_s = \partial v/\partial s, \text{ etc.}), \]
which indeed does not depend explicitly on $s$ and therefore is invariant under the symmetry $X = \partial/\partial s$ (but does contain a function $v$ depending on $s$). Now, looking for solutions with $v_s = 0$, i.e. with $v = w(z)$, one obtains the known reduced equation $2 z w_z + w = 0$ (see [2]).

It should be emphasized that the result in Theorem 3.1 is not the same as (but is related to, and includes in particular) the well-known result concerning
the reduction of the given PDE to the $X$-invariant equations for the variables $w(z)$: indeed, introducing the new “symmetry-adapted” variables $s$, $z$ and $v(s, z)$, we have transformed the equation into a locally equivalent equation for $v(s, z)$. If one now further assumes that $\partial v/\partial s = 0$, i.e. if one looks for the $X$-invariant solutions $v = w(z)$, then the equations $K_a = 0$ become a system of equations
\[ K_a^{(0)}(z, w^{(m)}) = 0 \tag{3.5} \]
involving only the variables $z$ and functions depending only on $z$ (see [24] for a detailed discussion on the reduction procedure). In particular, in the case of a single PDE for a single unknown function depending on two variables, the PDE is reduced to an ODE, as is well known, and as in Example 3.3 above.

4. Conditional Symmetries, in “True” and “Weak” Sense

The above approach includes in a completely natural way some other important situations. It is known indeed that, by means of the introduction of the notion of conditional symmetry (CS), one may obtain other solutions which turn out to be invariant under these “nonclassical” symmetries [5, 8–13]. But there are different types of CS, and it is useful to distinguish these different notions and to see how they can be fitted in this scheme.

To avoid unessential complications with notations, we will consider from now on only the case of a single PDE $\Delta = 0$ for a single unknown function $u(x)$; the extension to the general cases is in principle straightforward.

As is well known, a vector field $X$ is said to be a conditional symmetry for the equation $\Delta = 0$ if $X$ is an “exact” Lie point-symmetry for the system
\[ \Delta = 0; \qquad X_Q u \equiv \varphi - \xi_i\, \frac{\partial u}{\partial x_i} = 0, \tag{4.1} \]
where $X_Q$ is the symmetry written in “evolutionary form” [2]. The second equation in (4.1), indicating that we are looking for solutions invariant under $X$, is automatically symmetric under $X$; we have then only to impose
\[ X^*(\Delta)|_{\Sigma} = 0, \tag{4.2} \]
where $\Sigma$ is the set of the simultaneous solutions of the two equations (4.1), plus (possibly) some differential consequences of the second one (see [2, 25–31] for a precise and detailed discussion on this point and the related notion of degenerate systems of PDE).

In the canonical variables $s$, $z$ and $v = v(s, z)$ determined by the vector field $X$, the invariance condition $X_Q v = 0$ becomes
\[ \frac{\partial v}{\partial s} = 0 \tag{4.3} \]
and the condition of CS (4.2) takes the simple form (let us now retain for simplicity the same notation $\Delta$, instead of $\tilde{\Delta}$, also in the new coordinates)
\[ \left. \frac{\partial \Delta}{\partial s} \right|_{\Sigma} = 0, \tag{4.4} \]
here $\Sigma$ stands for the set of the simultaneous solutions of $\Delta = 0$ and $v_s = 0$, together with the derivatives of $v_s$ with respect to all variables $s$ and $z_k$. Using the global notation $v_s^{(\ell)}$ to indicate $v_s$, $v_{ss}$, $v_{s z_k}$ etc., the CS condition (4.4) then becomes equivalently, according to Definition 2.3,
\[ \frac{\partial}{\partial s}\, \Delta = G(s, z, v^{(m)})\, \Delta + \sum_\ell H_\ell(s, z, v^{(m)})\, v_s^{(\ell)}, \tag{4.5} \]
which, in the original coordinates $x$, $u$, states that $X^* \Delta$ is a “combination” of $\Delta$ and of $X_Q u$ with its differential consequences. Now, another application of Lemma 2.1 (the role of $y$ being played by $\Delta$ and $v_s^{(\ell)}$) gives that $\Delta$ must have the form
\[ \Delta = R(s, z, v^{(m)})\, K(z, v^{(m)}) + \sum_\ell \Theta_\ell(s, z, v^{(m)})\, v_s^{(\ell)}, \tag{4.6} \]
where the points to be emphasized are that $R$, $K$ do not contain $v_s^{(\ell)}$ and that $K$ does not depend explicitly on $s$.

If one now looks for solutions of $\Delta = 0$ which are independent of $s$, i.e. $v = w(z)$ and $v_s^{(\ell)} = 0$, then Eq. (4.6) becomes a “reduced” equation $K^{(0)}(z, w^{(m)}) = 0$, just as in the exact symmetry case.

Remark 4.1. If $X$ is a CS for a differential equation, then clearly also $X_\psi = \psi(x, u)\, X$, for any smooth function $\psi$, is another CS. While the invariant variables $z$ are the same for $X$ and $X_\psi$, the variable $s$ turns out to be different. This implies that, writing the differential equation in terms of the canonical variables, one obtains, in general, different equations for different choices of $\psi$. All these equations will produce the same reduced equation when one looks for invariant solutions $v = w(z)$.

Example 4.2. It is known that the nonlinear acoustic equation [12, 13, 26]
\[ u_{tt} = u\, u_{xx}; \qquad u = u(x, t) \]
admits the CS
\[ X = 2t\, \frac{\partial}{\partial x} + \frac{\partial}{\partial t} + 8t\, \frac{\partial}{\partial u}. \]
Introducing the canonical variables $s = t$, $z = x - t^2$ and $u = v(s, z) + 4 s^2$, the equation becomes
\[ 8 - 2 v_z - v v_{zz} + v_{ss} - 4 s v_{sz} = 0. \tag{4.7} \]
Considering instead $(1/2t)\, X = \partial/\partial x + (1/2t)\, \partial/\partial t + 4\, \partial/\partial u$, one gets $s = x$, $z = x - t^2$, $u = v(s, z) + 4(s - z)$ and the equation becomes
\[ 8 - 2 v_z - v v_{zz} - 4 s v_{ss} + 4 z v_{ss} - 2 v v_{sz} - 8 s v_{sz} + 8 z v_{sz} - v v_{ss} = 0. \tag{4.8} \]
Both Eqs. (4.7) and (4.8) have the form (4.6), as expected, and both become the reduced ODE $8 - 2 w_z - w w_{zz} = 0$.
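As a check on Example 4.2, the following worked computation sketches how Eq. (4.7) arises from the change of variables $s = t$, $z = x - t^2$, $u = v(s, z) + 4 s^2$. It uses only the chain rule; the labelling of the two groups of terms in the last line is our reading of the form (4.6) with $R = 1$, not a formula taken from the text.

```latex
% Chain rule for s = t, z = x - t^2, u = v(s,z) + 4 s^2:
%   \partial_x = \partial_z, \qquad \partial_t = \partial_s - 2 s\,\partial_z .
\begin{align*}
u_{xx} &= v_{zz}, \\
u_t    &= v_s - 2 s\, v_z + 8 s, \\
u_{tt} &= v_{ss} - 4 s\, v_{sz} + 4 s^2 v_{zz} - 2 v_z + 8, \\
u_{tt} - u\,u_{xx}
       &= v_{ss} - 4 s\, v_{sz} + 4 s^2 v_{zz} - 2 v_z + 8 - (v + 4 s^2)\, v_{zz} \\
       &= \underbrace{8 - 2 v_z - v\, v_{zz}}_{K(z,\,v^{(m)})}
          \;+\; \underbrace{v_{ss} - 4 s\, v_{sz}}_{\text{terms containing } v_s^{(\ell)}} .
\end{align*}
```

The two $4 s^2 v_{zz}$ contributions cancel, so the only explicit dependence on $s$ sits in the coefficient of $v_{sz}$; setting $v = w(z)$ then leaves $8 - 2 w_z - w w_{zz} = 0$.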
The presence of some terms containing $s$ in the above Eqs. (4.7) and (4.8) shows that $X$ is not an exact symmetry, and the fact that these terms disappear when $v_s = 0$ shows that $X$ is a CS.

However, the above one is not the only way to obtain reduced equations. Indeed, the rather disappointing remark is that, as pointed out by Olver and Rosenau [26] (see also [25]), given an arbitrary vector field $X$, if one can find some particular simultaneous solution $\hat{u}$ of the two equations (4.1), then the CS condition (4.2) turns out to be automatically satisfied when evaluated along this solution, i.e.
\[ X^*(\Delta)|_{\hat{u}} = 0. \tag{4.9} \]
It can be interesting to verify this fact in terms of the canonical variables $s$, $z$, $v$. Indeed one has ($d/ds$ is the total derivative)
\[ X^*(\Delta) = \frac{\partial \Delta}{\partial s} = \frac{d \Delta}{ds} - \sum_\ell \frac{\partial \Delta}{\partial v^{(\ell)}}\, v_s^{(\ell)} \tag{4.10} \]
which vanishes if one chooses a solution of $\Delta = 0$ of the form $\hat{v} = \hat{w}(z)$. Even more, it is enough to find an arbitrary solution of $\Delta = 0$; then, choosing any vector field leaving invariant this solution, one could conclude that, essentially, any vector field is a CS, and any solution is invariant under some CS, cf. [26]. This issue has also been considered in [32], from another point of view (see also the end of this section).

The point is that the existence of some solution $\hat{u}$ of the two equations (4.1) is not exactly equivalent to the condition (4.2); this happens essentially because $X$ in this case is a symmetry of an enlarged system which includes the compatibility conditions of the differential consequences of both equations in (4.1) (or the “integrability conditions”: see [2, 25–27]).

Therefore, it is important to clearly distinguish different notions and introduce a sort of “classification” of CS. We will say that $X$ is a CS in “true” or standard sense if $X^*(\Delta)|_{\Sigma} = 0$ is satisfied: the discussion and the examples above cover precisely this case; also the examples of CS considered in the literature are usually CS of standard type (see e.g., [8–13], [28–31], but see also [19, 26, 32, 33]). Instead, when $X^*(\Delta)|_{\hat{u}} = 0$ is satisfied only for some $\hat{u}$, we shall say that a “weak” CS is concerned (we will be more precise in a moment; notice however that some authors call generically “weak” symmetries all non-exact symmetries). What happens in this case is, once again, more clearly seen in the canonical variables determined by the given vector field $X$ (see also [15]): assume indeed for a moment that in these coordinates the PDE takes the form
\[ \Delta = \sum_{r=1}^{\sigma} s^{r-1} K_r(z, v^{(m)}) + \sum_\ell \Theta_\ell(s, z, v^{(m)})\, v_s^{(\ell)} = 0, \tag{4.11} \]
where the part not containing $v_s^{(\ell)}$ is a polynomial in the variable $s$, with coefficients $K_r$ not depending explicitly on $s$. Now, if one looks for $X$-invariant solutions $w(z)$ of $\Delta = 0$, one no longer obtains reduced equations involving only the invariant
variables $z$ and $w(z)$, as in the case of Eq. (4.6), but one is faced (cf. [26, 27]) with the system of reduced equations (not containing $s$ nor functions of $s$)
\[ K_r^{(0)}(z, w^{(m)}) = 0; \qquad r = 1, \dots, \sigma. \tag{4.12} \]
Assume that this system admits some solution (it is known that the existence of invariant solutions is by no means guaranteed in general, neither for “true” CS, nor for “exact” Lie symmetries), and denote by $\Sigma_\sigma$ the set of these solutions: for any $\hat{w}(z) \in \Sigma_\sigma$, we are precisely in the case of weak CS.

The identical conclusion holds if the initial PDE is transformed into an expression of this completely general form (instead of (4.11))
\[ \Delta = \sum_{r=1}^{\sigma} R_r(s, z, v^{(m)})\, K_r(z, v^{(m)}) + \sum_\ell \Theta_\ell(s, z, v^{(m)})\, v_s^{(\ell)} = 0 \tag{4.13} \]
with the presence here of a sum of $\sigma$ terms $R_r K_r$ (with $\sigma > 1$), where the coefficients $R_r$ which depend on $s$ are grouped together, with the only obvious condition that the coefficients $R_r$ be linearly independent (the idea should be that of obtaining the minimum number of independent conditions (4.12)). We now see that the set $\Sigma_\sigma$ can be characterized equivalently as the set of the solutions of the system
\[ \Delta = 0; \qquad \frac{\partial \Delta}{\partial s} = 0; \qquad \dots \qquad \frac{\partial^{\sigma-1} \Delta}{\partial s^{\sigma-1}} = 0; \qquad v_s^{(\ell)} = 0 \tag{4.14} \]
(indeed the $R_r$ are also functionally independent as functions of $s$). Conversely, if a $\Delta(s, z, v^{(m)}) = 0$ is such that a system like (4.14) admits the symmetry $X = \partial/\partial s$, then condition (2.9) must be satisfied, and applying once again Lemma 2.1, we see that $\Delta$ must have the form (4.13). Therefore, (4.13) is the most general form of an equation exhibiting the weak CS $X = \partial/\partial s$, to be compared with (4.6), which corresponds to the case of true CS.

Let us now come back to the original coordinates $x$, $u$: we will see that the set of conditions (4.14) is the result of the following procedure. Given the equation $\Delta = 0$, and a vector field $X$, assume that the system of equations (4.1) is not symmetric under $X$ (therefore, that $X$ is not a “true” CS for $\Delta = 0$), then put
\[ \Delta^{(1)} := X^*(\Delta) \tag{4.15} \]
and consider $\Delta^{(1)} = 0$ as a new condition to be fulfilled, obtaining in this way the augmented system (the first step of this approach is similar to a procedure, involving contact vector fields, which has been proposed in [34])
\[ \Delta = 0; \qquad \Delta^{(1)} = 0; \qquad X_Q u = 0. \tag{4.16} \]
If this system is symmetric under $X$, i.e. if
\[ X^*(\Delta)|_{\Sigma_1} = 0, \tag{4.17} \]
where $\Sigma_1$ is the set of simultaneous solutions of (4.16), we can say that $X$ is a “weak CS of order 2” (according to this, a true CS is of order 1). If instead (4.16) is not symmetric under $X$, the procedure can be iterated, introducing
\[ \Delta^{(2)} := X^*(\Delta^{(1)}) \tag{4.18} \]
and appending the new equation $\Delta^{(2)} = 0$, and so on. Finally, we will say that $X$ is a weak CS of order $\sigma$ if
\[ X^*(\Delta)|_{\Sigma_\sigma} = 0, \tag{4.19} \]
where $\Sigma_\sigma$ is the set (if not empty, of course) of the solutions of the system
\[ \Delta = \Delta^{(1)} = \dots = \Delta^{(\sigma-1)} = 0, \qquad X_Q u = 0 \tag{4.20} \]
(as already pointed out, it is understood, here and in the following, that also the differential consequences of $X_Q u = 0$ must be taken into account; clearly, the additional conditions $\partial \Delta/\partial s = 0$ or $X^* \Delta = 0$ and so on, should not be confused with the differential consequences of the equation $\Delta = 0$).

Remark 4.3 (The “Partial” Symmetries). The above procedure for finding weak CS is reminiscent of the procedure used for constructing partial symmetries, according to the definition proposed in [14] (see also [35]), the (relevant!) difference being the presence in the weak CS case of the additional condition $X_Q u = 0$. Let us recall indeed that a vector field $X$ is said to be a partial symmetry of order $\sigma$ for $\Delta = 0$ if the set of equations, with the above definitions (4.15) and (4.18),
\[ \Delta = \Delta^{(1)} = \dots = \Delta^{(\sigma-1)} = 0 \tag{4.21} \]
admits some solutions. In terms of the variables $s$, $z$ and $v(s, z)$, conditions (4.21) are the same as (4.14) but without the conditions $v_s^{(\ell)} = 0$. The set of solutions found in the presence of a partial symmetry provides a “symmetric set of solutions”, meaning that the symmetry transforms a solution belonging to this set into a, generally different, solution in the same set. If, in particular, this set includes some solutions which are left fixed by $X$, then this symmetry is also a CS, either true or weak. So, we could call the weak CS, by analogy, “partial conditional symmetries of order $\sigma$”.

We can now summarize our discussion in the following way.

Definition 4.4. Given a PDE $\Delta = 0$, a projectable vector field $X$ is a “true” conditional symmetry for the equation if it is a symmetry for the system
\[ \Delta = 0; \qquad X_Q u = 0. \tag{4.22} \]
A vector field $X$ is a “weak” CS (of order $\sigma$) if it is a symmetry of the system
\[ \Delta = 0; \qquad \Delta^{(1)} := X^*(\Delta) = 0; \qquad \Delta^{(2)} := X^*(\Delta^{(1)}) = 0; \qquad \dots \qquad \Delta^{(\sigma-1)} = 0; \qquad X_Q u = 0. \tag{4.23} \]
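To see in the simplest case how the chain of conditions in Definition 4.4 leads back to the reduced system (4.12), the following sketch takes $\sigma = 2$ in the polynomial form (4.11), in canonical variables where $X^* = \partial/\partial s$. The choice $\sigma = 2$ and the sum over the index $\ell$ are assumptions made for illustration; the general case follows the same pattern.

```latex
% Canonical variables: X^* = \partial/\partial s. Take \sigma = 2 in (4.11):
\begin{align*}
\Delta &= K_1(z, v^{(m)}) + s\, K_2(z, v^{(m)})
          + \sum_\ell \Theta_\ell(s, z, v^{(m)})\, v_s^{(\ell)}, \\
\Delta^{(1)} = \frac{\partial \Delta}{\partial s}
       &= K_2(z, v^{(m)})
          + \sum_\ell \frac{\partial \Theta_\ell}{\partial s}\, v_s^{(\ell)}, \\
\Delta^{(2)} = \frac{\partial^2 \Delta}{\partial s^2}
       &= \sum_\ell \frac{\partial^2 \Theta_\ell}{\partial s^2}\, v_s^{(\ell)} .
\end{align*}
% On invariant functions v = w(z), i.e. v_s^{(\ell)} = 0:
\[
\Delta = \Delta^{(1)} = 0
\quad\Longleftrightarrow\quad
K_1^{(0)}(z, w^{(m)}) = 0, \qquad K_2^{(0)}(z, w^{(m)}) = 0 ,
\]
% while \Delta^{(2)} vanishes identically there.
```

In this sketch the iteration stops after one step: appending $\Delta^{(1)} = 0$ already yields the two reduced equations, and $\Delta^{(2)}$ imposes no further condition on invariant solutions, which is the situation called a weak CS of order 2.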
Proposition 4.5. If $X$ is a true CS, the system (4.22) gives rise to a reduced equation in $p - 1$ independent variables, which, if it admits solutions, produces $X$-invariant solutions of $\Delta = 0$. If $X$ is a weak CS of order $\sigma$, the system (4.23) gives rise to a system of $\sigma$ reduced equations, which, if it admits solutions, produces $X$-invariant solutions of $\Delta = 0$. Introducing $X$-adapted variables $s$, $z$, such that $X s = 1$, $X z = 0$, the PDE has the form (4.6) in the case of true CS, or (4.13) in the case of weak CS.

We can then rephrase the Olver–Rosenau statement [26] saying that any vector field $X$ is either an exact symmetry, or a true CS, or a weak CS. Similarly, rewriting the equation $\Delta = 0$ as in (4.11) or in (4.13) but without isolating the terms $v_s^{(\ell)}$, we can also say, recalling the procedure used for obtaining partial symmetries, that any $X$ is either an exact or a partial symmetry. It is clear however that, as already remarked, the set of solutions which can be obtained in this way may be empty, or contain only trivial solutions (e.g., $u = \mathrm{const}$).

Example 4.6. It is known that the Korteweg–de Vries equation
\[ \Delta := u_t + u_{xxx} + u u_x = 0, \qquad u = u(x, t), \]
does not admit (true) CS (apart from its exact symmetries). There are however weak CS; e.g., it is simple to verify that the scaling vector field
\[ X = 2x\, \frac{\partial}{\partial x} + t\, \frac{\partial}{\partial t} + u\, \frac{\partial}{\partial u} \]
is indeed an exact symmetry for the system $\Delta = 0$, $\Delta^{(1)} = X^*(\Delta) = 0$ and $X_Q u = 0$, and therefore is a weak CS of order $\sigma = 2$, and $u = x/t$ is a scaling-invariant solution. But also, if we consider only the system $\Delta = 0$, $\Delta^{(1)} = X^*(\Delta) = 0$ (i.e. without the invariance condition $X_Q u = 0$), we obtain the symmetric set of solutions
\[ u = \frac{x + c_1}{t + c_2} \qquad (c_1, c_2 = \mathrm{const}), \]
showing that $X$ is also a partial symmetry.

A few words, for completeness, about the so-called “direct method” [25, 32, 36–38] for finding solutions to PDE’s. The simplest and typical application of this method deals with a PDE involving a function of two variables (which we shall call $x$, $t$, in view of the next applications), and one looks for solutions of the form (also called “similarity reduction solutions”)
\[ u(x, t) = U(x, t, w(z)) \quad \text{with } z = z(x, t) \tag{4.24} \]
or, more simply, of the form (according to a remark by Clarkson and Kruskal [36], this is not a restriction, see also Lou [37])
\[ u(x, t) = \alpha(x, t) + \beta(x, t)\, w(z) \quad \text{with } z = z(x, t); \tag{4.25} \]
one then substitutes (4.25) into the PDE and imposes that $w(z)$ satisfies an ODE. Although this method is not based on any symmetry, there is clearly a close and
fully investigated relationship with symmetry properties; referring to [22, 24, 38] for a complete and detailed discussion, we only add here the following remark, to illustrate the idea in the present setting. Assuming in (4.25) that $z_t \neq 0$, one can choose $x$ and $z$ as independent variables, and then write (4.25) in the form $u = \tilde{u}(x, z) = \tilde{\alpha}(x, z) + \tilde{\beta}(x, z)\, w(z)$. Then, putting
\[ X = \xi(x, z)\, \frac{\partial}{\partial x} + \zeta(x, z)\, \frac{\partial}{\partial z} + \varphi(x, z, u)\, \frac{\partial}{\partial u}, \tag{4.26} \]
one can fix $\zeta = 0$, in such a way that $X z = 0$, choose $\xi = 1$, and finally impose that $X_Q u \equiv \tilde{\alpha}_x + \tilde{\beta}_x\, w(z) - \varphi(x, z, u) = 0$ in order to determine the coefficient $\varphi$ in (4.26). Then, by construction, $\tilde{u}(x, z)$ is invariant under this $X$. It is also easy to see that the invariance condition $X_Q u = 0$ is satisfied exactly by the family (4.25). If $z_t = 0$ in (4.25), the same result is true retaining $z = x$ and $t$ as independent variables, and choosing
\[ X = \frac{\partial}{\partial t} + \varphi(x, t, u)\, \frac{\partial}{\partial u}. \tag{4.27} \]
So, the direct method has produced a set of solutions to the given PDE which also satisfies the invariance condition $X_Q u = 0$. Then, according to our discussion, $X$ is a CS for the PDE: it is a true CS if $w(z)$ satisfies a single ODE, as is usually the case in the direct method, or a weak CS if the method has produced a separation of the PDE into a system of ODE’s.

Notice that a generalization of this method has been proposed in [39], with the introduction of two functions of the similarity variable $z$; this procedure has been further extended in [32], where its relationship with the method of differential constraints is also carefully examined. Other reduction procedures, based on the introduction of suitable multiple differential constraints, have also been proposed, aimed at finding nonclassical symmetries and solutions of differential problems (see, e.g., [17, 40–42], and also [5]).

It can also be remarked that in our discussion we have only considered the case of a single vector field $X$; clearly, the situation becomes richer and richer if more than one vector field is taken into consideration. First of all, the reduction procedure itself must be adapted and refined when the given equation admits an algebra of symmetries of dimension larger than 1 (possibly infinite). For a recent discussion see [43].

5. Examples from the Boussinesq Equation

The symmetry properties of the Boussinesq equation
\[ \Delta := u_{tt} + u_{xxxx} + u u_{xx} + u_x^2 = 0; \qquad u = u(x, t) \tag{5.1} \]
have been the object of several papers (see e.g., [10, 16, 17]), but it is useful here to consider some special cases to illustrate the above discussion. First of all, let us give
the invariant form (according to Theorem 3.1) of the equation under the (exact) dilational symmetry
\[ D = x\frac{\partial}{\partial x} + 2t\frac{\partial}{\partial t} - 2u\frac{\partial}{\partial u}: \tag{5.2} \]
with $s = \log x$, $z = x^2/t$ and $u = z\exp(-2s)\,v(s, z)$, we get
\[ z^2\bigl(16z^2 v_{zzzz} + 4z v_z^2 + 2v + 12 v_{zz} + 4 v z v_{zz} + z^2 v_{zz} + 48 z v_{zzz} + 2 v v_z + 4 z v_z\bigr) - 6 v_s - v z v_s + z v_s^2 + 4 z^2 v_s v_z + 11 v_{ss} + z v v_{ss} + 4 z v_{sz} + 4 z^2 v v_{sz} - 6 v_{sss} - 12 z v_{ssz} + 24 z^2 v_{szz} + v_{ssss} + 8 z v_{sssz} + 24 z^2 v_{sszz} + 32 z^3 v_{szzz} = 0, \]
which indeed does not depend explicitly on $s$. The dilational-invariant solutions are found by putting $v = w(z)$, and only the terms in the parentheses survive.

As for “true” CS, writing the general vector field in the form
\[ X = \xi(x, t, u)\frac{\partial}{\partial x} + \tau(x, t, u)\frac{\partial}{\partial t} + \varphi(x, t, u)\frac{\partial}{\partial u}, \tag{5.3} \]
a complete list of CS has been given both for the case $\tau \neq 0$ (and therefore, without any restriction, $\tau = 1$) [10] and for the case $\tau = 0$ [16, 37], see also [44]; it has also been shown that the invariant solutions under these CS are precisely those found by means of the “direct method” [10, 16, 36, 37]. Let us give for completeness the form taken by the Boussinesq equation when rewritten in terms of the canonical variables determined by some of these CS. For instance, choosing the CS
\[ X = t\frac{\partial}{\partial x} + \frac{\partial}{\partial t} - 2t\frac{\partial}{\partial u}, \tag{5.4} \]
we get $s = t$, $z = x - t^2/2$ and $u = v(s, z) - s^2$, and the equation becomes
\[ -2 - v_z + v_z^2 + v v_{zz} + v_{zzzz} + v_{ss} - 2 s v_{sz} = 0. \tag{5.5} \]
Starting instead (cf. Remark 4.1) from $(1/t)X = \partial/\partial x + (1/t)\,\partial/\partial t - 2\,\partial/\partial u$, we get $s = x$, $z = x - t^2/2$, $u = v(s, z) + 2(z - s)$ and
\[ -2 - v_z + v_z^2 + v v_{zz} + v_{zzzz} + 2 v_s v_z + v v_{ss} - 2 s v_{ss} + 2 z v_{ss} + 2 v v_{sz} - 4 s v_{sz} + 4 z v_{sz} + v_s^2 + v_{ssss} + 4 v_{sssz} + 6 v_{sszz} + 4 v_{szzz} = 0. \tag{5.6} \]
Both Eqs. (5.5) and (5.6) have the expected form (4.6), and become the known reduced ODE (cf. [10]) if one looks for solutions with $v = w(z)$ and $u = w(z) - t^2$.
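The passage from (5.1) to (5.5) can be checked mechanically. The following is only a minimal sketch, assuming exact polynomial arithmetic is acceptable as a test: it picks an arbitrary polynomial $v(s, z)$, builds $u(x, t) = v(t, x - t^2/2) - t^2$, and compares the Boussinesq expression $\Delta$ with the left-hand side of (5.5) evaluated at $s = t$, $z = x - t^2/2$. All helper names (`add`, `mul`, `diff`, `compose`, `boussinesq`, `eq_5_5`) and the test polynomial are hypothetical choices made for this illustration; they are not part of the original text.

```python
from fractions import Fraction

# A polynomial in two variables (a, b) is stored as {(i, j): coeff},
# meaning the sum of coeff * a**i * b**j.

def add(p, q, sign=1):
    """Return p + sign*q, dropping zero coefficients."""
    r = dict(p)
    for k, c in q.items():
        r[k] = r.get(k, 0) + sign * c
    return {k: c for k, c in r.items() if c != 0}

def mul(p, q):
    """Return the product p*q."""
    r = {}
    for (i, j), c in p.items():
        for (k, l), d in q.items():
            r[(i + k, j + l)] = r.get((i + k, j + l), 0) + c * d
    return {k: c for k, c in r.items() if c != 0}

def diff(p, var, n=1):
    """n-th partial derivative with respect to variable number var (0 or 1)."""
    for _ in range(n):
        q = {}
        for (i, j), c in p.items():
            e = (i, j)[var]
            if e:
                key = (i - 1, j) if var == 0 else (i, j - 1)
                q[key] = q.get(key, 0) + c * e
        p = q
    return p

def compose(p, a, b):
    """Substitute the polynomials a and b for the two variables of p."""
    r = {}
    for (i, j), c in p.items():
        term = {(0, 0): c}
        for _ in range(i):
            term = mul(term, a)
        for _ in range(j):
            term = mul(term, b)
        r = add(r, term)
    return r

def boussinesq(u):
    """Delta = u_tt + u_xxxx + u*u_xx + u_x**2, with variables ordered (x, t)."""
    ux = diff(u, 0)
    out = add(diff(u, 1, 2), diff(u, 0, 4))
    out = add(out, mul(u, diff(u, 0, 2)))
    return add(out, mul(ux, ux))

def eq_5_5(v):
    """Left-hand side of (5.5), with variables ordered (s, z)."""
    vz = diff(v, 1)
    two_s = {(1, 0): 2}
    out = {(0, 0): -2}
    out = add(out, vz, -1)
    out = add(out, mul(vz, vz))
    out = add(out, mul(v, diff(v, 1, 2)))
    out = add(out, diff(v, 1, 4))
    out = add(out, diff(v, 0, 2))
    return add(out, mul(two_s, diff(vz, 0)), -1)

# Arbitrary test polynomial v(s, z) (hypothetical choice).
v = {(2, 4): 3, (0, 5): -1, (3, 1): 2, (1, 2): 7, (1, 0): 1}

# Canonical variables of the CS (5.4), written as polynomials in (x, t).
s_of_xt = {(0, 1): 1}                          # s = t
z_of_xt = {(1, 0): 1, (0, 2): Fraction(-1, 2)}  # z = x - t^2/2

# u = v(s, z) - s^2
u = add(compose(v, s_of_xt, z_of_xt), {(0, 2): 1}, -1)

# Difference between Delta[u] and (5.5) rewritten in (x, t);
# an empty dict means the two sides agree identically.
residual = add(boussinesq(u), compose(eq_5_5(v), s_of_xt, z_of_xt), -1)
print("residual:", residual)
```

Because the arithmetic is exact, an empty residual confirms the change of variables for the chosen test polynomial; it does not replace the general derivation.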
In the case of CS with $\tau = 0$, the invariant solutions have the form (instead of (4.25))
\[ u(x, t) = \alpha(x, t) + \beta(x, t)\,w(t), \tag{5.7} \]
where $w(t)$ depends only on $t$ and satisfies an ODE. Choosing, e.g., (cf. [17])
\[ X = \frac{\partial}{\partial x} + \Bigl(\frac{2u}{x} + \delta\,\frac{48}{x^3}\Bigr)\frac{\partial}{\partial u}, \tag{5.8} \]
where $\delta$ may assume the values 0 or 1, the canonical variables are given by $s = x$, $z = t$, and $u(x, t) = -12\delta/x^2 + x^2 v(x, t)$, and the equation becomes
\[ x^2(v_{tt} + 6v^2) + 8x^3 v v_x + x^4 v_x^2 + 12 v_{xx} - 12\delta v_{xx} + x^4 v v_{xx} + 8x v_{xxx} + x^2 v_{xxxx} = 0, \]
which has the form (4.6), observing that the role of the variable $s$ is played here by $x$; as expected, looking for solutions in which $v = w(t)$, this equation becomes one of the reduced equations listed in [37].

To complete the analysis, one can also look for solutions of the form
\[ u(x, t) = \alpha(x, t) + \beta(x, t)\,w(x) \tag{5.9} \]
with $w(x)$ satisfying an ODE, or for CS of the form
\[ X = \frac{\partial}{\partial t} + \varphi(x, t, u)\frac{\partial}{\partial u}, \tag{5.10} \]
i.e. with $\xi = 0$. It is not difficult to verify that no true CS of this form is admitted by the Boussinesq equation. However, solutions of the form (5.9) can be obtained via weak CS. Indeed, choosing e.g.,
\[ X = \frac{\partial}{\partial t} + \Bigl(\frac{1}{t^2} - \frac{2u}{t}\Bigr)\frac{\partial}{\partial u}, \tag{5.11} \]
one now obtains $s = t$, $z = x$ and $u(x, t) = 1/t + v(x, t)/t^2$, giving
\[ v v_{xx} + v_x^2 + 6v + t(v_{xx} + 2) + t^2 v_{xxxx} - 4t v_t + t^2 v_{tt} = 0, \tag{5.12} \]
which is precisely of the form (4.13), showing that (5.11) is a weak CS (the role of $s$ is played here by $t$). Looking indeed for solutions with $v = w(x)$, one gets (cf. (4.14)) a system of the three ODE's
\[ v v_{xx} + v_x^2 + 6v = 0, \qquad 2 + v_{xx} = 0, \qquad v_{xxxx} = 0 \]
admitting the common solution $w = -x^2$ and giving the solution $u = 1/t - x^2/t^2$ of the Boussinesq equation.

Another example of weak CS is the following:
\[ X = t^2\frac{\partial}{\partial x} + \frac{\partial}{\partial t} - \Bigl(2x + \frac{10}{3}t^3\Bigr)\frac{\partial}{\partial u}, \tag{5.13} \]
where $s = t$, $z = x - t^3/3$ and $u = -2sz - s^4 + v(s, z)$. Instead of giving the form of the equation in these variables, let us now evaluate, according to our discussion
(cf. (4.20)), the additional equations $\Delta^{(1)} = X^*(\Delta) = 0$, etc. We get
\[ \Delta^{(1)} = -10t - 3u_x - 2t u_{xt} - \frac{5}{3}t^3 u_{xx} - x u_{xx} = 0, \tag{5.14} \]
\[ \Delta^{(2)} = 2 + u_{xt} + t^2 u_{xx} = 0. \tag{5.15} \]
The most general solution of the equation $\Delta^{(2)} = 0$ is
\[ u = F(t) + G\Bigl(x - \frac{t^3}{3}\Bigr) - 2tx, \]
where $F$, $G$ are arbitrary functions of the indicated arguments; we easily conclude that (5.13) is a weak CS of order $\sigma = 3$ and, taking into account also the invariance condition $X_Q u = 0$, we obtain the invariant solutions
\[ u(x, t) = -\frac{t^4}{3} - 2tx - \frac{12}{(x - t^3/3)^2} \]
and
\[ u(x, t) = -\frac{t^4}{3} - 2tx + c \]
($c$ = const). If instead we do not impose the invariance condition $X_Q u = 0$ and solve, according to Remark 4.3, the three equations (5.1), (5.14) and (5.15), we find the slightly more general families of solutions
\[ u(x, t) = -\frac{t^4}{3} - 2tx + c_1 t - \frac{12}{(x - t^3/3 - c_1)^2} \]
and
\[ u(x, t) = -\frac{t^4}{3} - 2tx + c_2 t + c, \]
which are transformed by the symmetry into one another, showing that (5.13) is also a partial symmetry for the Boussinesq equation.

6. Concluding Remarks

Some of the facts presented in this paper were certainly already known, although largely dispersed in the literature, often in different forms and with different languages. This paper is an attempt to provide a unifying scheme where various notions and peculiarities of symmetries of differential equations can be stated in a natural and simple way. This allows us, in particular, to provide a precise characterization and a clear distinction between different notions of conditional symmetry: this is indeed one of the main objectives of our paper. We can also give a neat comparison between the notions of conditional, partial and exact symmetries; several new and explicit examples elucidate the discussion.

In the same unifying spirit, it can also be remarked that all the above notions can be viewed as particular cases of a unique comprehensive idea, which can be traced back to the general idea of appending suitable additional equations to the given differential problem $\Delta = 0$, and of searching for (exact) symmetries of this augmented problem (cf. [45]). In other words, one looks for a supplementary equation, say $E = 0$, and a vector field $X$ satisfying
\[ X^*(\Delta) = G\Delta + HE; \qquad X^*(E) = G_E\,\Delta + H_E\,E \tag{6.1} \]
(some authors generically call all these symmetries “conditional symmetries” for the equation $\Delta = 0$, and call “Q-conditional” symmetries the more commonly named
conditional symmetries.) Now, it is clear that all our above notions of symmetries simply correspond to different choices of the supplementary equation $E = 0$. Indeed,
(i) if $E = 0$ is chosen to be $X_Q u = 0$, we are in the case of true CS,
(ii) if $E = 0$ is given by the system $X^*(\Delta) = X^*(\Delta^{(1)}) = \cdots = 0$, we are in the case of partial symmetries,
(iii) if $E = 0$ is the same as in (ii) plus the condition $X_Q u = 0$, we are precisely in the case of weak CS.

References

[1] L. V. Ovsjannikov, Group Properties of Differential Equations (Siberian Academy of Sciences, Novosibirsk, 1962); Group Analysis of Differential Equations (Academic Press, New York, 1982).
[2] P. J. Olver, Applications of Lie Groups to Differential Equations (Springer, Berlin, 1986); 2nd edn. (Springer-Verlag, New York, 1993).
[3] G. W. Bluman and S. Kumei, Symmetries and Differential Equations (Springer, Berlin, 1989).
[4] H. Stephani, Differential Equations. Their Solution Using Symmetries (Cambridge University Press, 1989).
[5] N. H. Ibragimov (ed.), CRC Handbook of Lie Group Analysis of Differential Equations, Vols. 1–3 (CRC Press, Boca Raton, 1995).
[6] G. Gaeta, Nonlinear Symmetries and Nonlinear Equations (Kluwer, Dordrecht, 1994).
[7] G. W. Bluman and S. C. Anco, Symmetry and Integration Methods for Differential Equations (Springer, New York, 2002).
[8] G. W. Bluman and J. D. Cole, The general similarity solution of the heat equation, J. Math. Mech. 18 (1969) 1025–1042.
[9] G. W. Bluman and J. D. Cole, Similarity Methods for Differential Equations (Springer, Berlin, 1974).
[10] D. Levi and P. Winternitz, Nonclassical symmetry reduction: Example of the Boussinesq equation, J. Phys. A 22 (1989) 2915–2924.
[11] P. Winternitz, Conditional symmetries and conditional integrability for nonlinear systems, in Group Theoretical Methods in Physics (XVIII ICGTMP), eds. V. V. Dodonov and V. I. Man’ko (Springer, Berlin, 1991), pp. 298–322.
[12] W. I. Fushchych (ed.), Symmetry Analysis of Equations of Mathematical Physics (Institute of Mathematics National Academy of Science of Ukraine, Kiev, 1992).
[13] W. I. Fushchych, Conditional symmetries of the equations of mathematical physics, in Modern Group Analysis: Advanced Analytical and Computational Methods in Mathematical Physics, eds. N. H. Ibragimov, M. Torrisi and A. Valenti (Kluwer, Dordrecht, 1993), pp. 231–239.
[14] G. Cicogna and G. Gaeta, Partial Lie point symmetries of differential equations, J. Phys. A 34 (2001) 491–512.
[15] G. Cicogna, A discussion of the different notions of symmetry of differential equations, in Proc. Inst. Math. N.A.S. Uk. 50 (2004), pp. 77–84; Weak symmetries and symmetry adapted coordinates in differential problems, Int. J. Geom. Meth. Mod. Phys. 1 (2004) 23–31.
[16] P. A. Clarkson, Nonclassical symmetry reduction of the Boussinesq equation, Chaos, Solitons and Fractals 5 (1995) 2261–2301.
[17] A. M. Grundland, L. Martina and G. Rideau, Partial differential equations with differential constraints, in Advances in Mathematical Sciences, CRM Proc. Lect. Notes, Vol. 11 (American Mathematical Society, Providence, RI, 1997), pp. 135–154.
[18] L. Fatibene, M. Ferraris, M. Francaviglia and R. G. McLenaghan, Generalized symmetries in mechanics and field theory, J. Math. Phys. 43 (2002) 3147–3161.
[19] R. Z. Zhdanov, Higher conditional symmetry and reduction of initial value problems, Nonlinear Dynamics 28 (2002) 17–27.
[20] C. Sophocleous, Classification of potential symmetries of generalized inhomogeneous nonlinear diffusion equations, Physica A 320 (2003) 169–183.
[21] E. A. Coddington and N. Levinson, Theory of Ordinary Differential Equations (McGraw-Hill, New York, 1955).
[22] E. Pucci, Similarity reduction of partial differential equations, J. Phys. A 25 (1992) 2631–2640.
[23] J. F. Cariñena, M. A. Del Olmo and P. Winternitz, On the relation between weak and strong invariance of differential equations, Lett. Math. Phys. 29 (1993) 151–163.
[24] R. Z. Zhdanov, I. M. Tsyfra and R. O. Popovych, A precise definition of reduction of partial differential equations, J. Math. Anal. Appl. 238 (1999) 101–123.
[25] E. Pucci and G. Saccomandi, Evolution equations, invariant surface conditions and functional separation of variables, Physica D 139 (2000) 28–47.
[26] P. J. Olver and Ph. Rosenau, The construction of special solutions to partial differential equations, Phys. Lett. A 114 (1986) 107–112; Group-invariant solutions of differential equations, SIAM J. Appl. Math. 47 (1987) 263–278.
[27] P. J. Olver, Symmetry and explicit solutions of partial differential equations, Appl. Numer. Math. 10 (1992) 307–324.
[28] E. Pucci and G. Saccomandi, On the weak symmetry group of partial differential equations, J. Math. Anal. Appl. 163 (1992) 588–598.
[29] W. I. Fushchich and I. M. Tsyfra, On a reduction and solutions of the nonlinear wave equations with broken symmetry, J. Phys. A 20 (1987) L45–L48.
[30] R. Z. Zhdanov and I. M. Tsyfra, Reduction of differential equations and conditional symmetry, Ukr. Math. Z. 48 (1996) 595–602.
[31] R. Popovich, On reduction and Q-conditional (nonclassical) symmetry, in Symmetry in Nonlinear Mathematical Physics, Vols. 1–2 (Natl. Acad. Sci. Ukraine, Kyiv, 1997), pp. 437–443.
[32] P. J. Olver, Direct reduction and differential constraints, Proc. R. Soc. London A 444 (1994) 509–523.
[33] M. V. Foursov and E. M. Vorob’ev, Solutions of nonlinear wave equation $u_{tt} = (uu_x)_x$ invariant under conditional symmetries, J. Phys. A 29 (1996) 6363–6373.
[34] A. V. Dzhamay and E. M. Vorob’ev, Infinitesimal weak symmetries of nonlinear differential equations in two independent variables, J. Phys. A 27 (1994) 5541–5549.
[35] G. Cicogna, Partial symmetries and symmetric sets of solutions to PDE’s, in Symmetry and Perturbation Theory, Proc. 2002 SPT Conference, eds. S. Abenda, G. Gaeta and S. Walcher (World Scientific, Singapore, 2002), pp. 26–33.
[36] P. A. Clarkson and M. D. Kruskal, New similarity solutions of the Boussinesq equation, J. Math. Phys. 30 (1989) 2201–2213.
[37] S.-Y. Lou, A note on the new similarity reductions of the Boussinesq equation, Phys. Lett. A 151 (1990) 133–135.
[38] M. C. Nucci and P. A. Clarkson, The nonclassical method is more general than the direct method for symmetry reduction, An example of the Fitzhugh–Nagumo equation, Phys. Lett. A 164 (1992) 49–56.
[39] V. A. Galaktionov, On new exact blow-up solutions for nonlinear heat conduction equations with source and applications, Differ. Integral Equ. 3 (1990) 863–874.
[40] E. M. Vorob’ev, Reduction and quotient equations for differential equations with symmetries, Acta Appl. Math. 23 (1991) 1–24; Symmetries of compatibility conditions for systems of differential equations, Acta Appl. Math. 26 (1992) 61–86.
[41] M. C. Nucci, Iterating the nonclassical symmetries method, Physica D 78 (1994) 124–134; Nonclassical symmetries as special solutions of heir-equations, J. Math. Anal. Appl. 279 (2003) 168–179.
[42] O. V. Kaptsov and I. V. Verevkin, Differential constraints and exact solutions of nonlinear diffusion equations, J. Phys. A 36 (2003) 1401–1414.
[43] A. M. Grundland, P. Tempesta and P. Winternitz, Weak transversality and partially invariant solutions, J. Math. Phys. 44 (2003) 2704–2722.
[44] P. J. Olver and E. M. Vorob’ev, Nonclassical and conditional symmetries, in CRC Handbook of Lie Group Analysis of Differential Equations, ed. N. H. Ibragimov, Vols. 1–3 (CRC Press, Boca Raton, 1995).
[45] W. I. Fushchych, On symmetry and particular solutions of some multidimensional physics equations, in Algebraic-Theoretical Methods in Mathematical Physics Problems (Inst. Math. Acad. Sci. of Ukraine, Kyiv, 1983), pp. 4–23.
J070-00256
Reviews in Mathematical Physics Vol. 18, No. 1 (2006) 19–60
© World Scientific Publishing Company

QUANTUM STATE ESTIMATION AND LARGE DEVIATIONS

M. KEYL
Istituto Nazionale di Fisica della Materia, Unità di Pavia, Dipartimento di Fisica “A. Volta”, via Bassi 6, I-27100 Pavia, Italy
[email protected]

Received 2 May 2005
Revised 7 October 2005

In this paper we propose a method to estimate the density matrix $\rho$ of a $d$-level quantum system by measurements on the $N$-fold system in the joint state $\rho^{\otimes N}$. The scheme is based on covariant observables and representation theory of unitary groups and it extends previous results concerning pure states and the estimation of the spectrum of $\rho$. We show that it is consistent (i.e. the original input state $\rho$ is recovered with certainty if $N \to \infty$), analyze its large deviation behavior, and calculate explicitly the corresponding rate function which describes the exponential decrease of error probabilities in the limit $N \to \infty$. Finally, we discuss the question whether the proposed scheme provides the fastest possible decay of error probabilities.

Keywords: Quantum information theory; quantum state estimation; large deviations; covariant observables.

Mathematics Subject Classification 2000: 81P68, 81P15, 60F10
1. Introduction

The density operator $\rho$ of a $d$-level quantum system ($d \in \mathbb{N}$) describes the preparation of the system in all details relevant to statistical experiments, and the task of quantum state estimation is to determine $\rho$ by measurements on a (possibly large) number $N$ of systems, which are all prepared according to $\rho$. In the limit of infinitely many input systems, it is of course possible to get exact estimates. If $N$ remains finite, however, estimation errors are unavoidable. The best we can get (if $N$ is large enough) is an estimation scheme which produces only small errors or, better to say, which produces large errors only with a small probability.

There are several ways to get “good” estimation schemes. One possibility is to choose an appropriate figure of merit which measures the quality of the estimates (e.g., averaged fidelities with respect to the original density matrix) and to solve the corresponding optimization problem. If we know a priori that the input state $\rho$ is pure (but otherwise unknown), this approach is very successful and leads to optimal estimators, which can be given in closed form for all finite values of $N$
(cf. [24, 30, 6, 11, 19, 29, 7]). In the general case, however (i.e. if nothing is known about $\rho$), the situation is much more difficult. First of all, the result depends much more on the figure of merit chosen than in the pure state case, and even if we have found an appropriate quality criterion, it is in general very hard to determine the corresponding optimal estimator explicitly for arbitrary $N$; some results related to this approach can be found in [36, 15, 2].

A way out of this dilemma is to neglect the quality of the estimates for finite $N$ and to look for estimation schemes which guarantee at least that error probabilities vanish “as fast as possible” as $N$ goes to infinity (cf. [22, 24]; for a collection of recent publications on the subject, see also [18]). There are two approaches which implement this somewhat vague idea in a mathematically exact way. One possibility is to look at variances (rescaled by $N$) in the limit $N \to \infty$. This is done in several works (cf. [17, 16, 31] and in particular, the papers reprinted in [18]) and it leads to quantum analogs of classical Cramér–Rao type bounds. The second idea is to analyze the large deviation behavior of the estimators. To make this more precise, let us denote an estimate derived from a measurement on $N$ systems in the joint state $\rho^{\otimes N}$ by $\sigma$. Then, we can look at the probability $P_{N,\epsilon}$ that the trace-norm distance between $\rho$ and $\sigma$ (or any other appropriate distance measure for states) is at least $\epsilon$, i.e. $\|\rho - \sigma\|_1 \geq \epsilon$. Since $\rho = \sigma$ would be the exact estimate, this is clearly an error probability. Now, we are interested in those cases where $P_{N,\epsilon}$ vanishes exponentially fast in $N$, i.e.
\[ P_{N,\epsilon} \approx C_N \exp\Bigl(-N \inf_{\|\rho - \sigma\|_1 \geq \epsilon} I(\rho, \sigma)\Bigr). \tag{1.1} \]
Here, $(C_N)_{N \in \mathbb{N}}$ is an unknown sequence of positive real numbers, growing at most subexponentially with $N$ (and which is of no interest for the following), and $I(\rho, \sigma)$ is a positive function which vanishes iff $\sigma = \rho$ holds. $I$ is called the rate function because it describes the exponential rate with which estimation errors vanish asymptotically.

In classical statistics, this analysis was initiated by Bahadur [3–5] and has become, in the meantime, a classical topic (“Bahadur efficiency”). For the quantum case, however, much less is known, and the results available so far cover three different areas:
(1) In [27, 1, 21], an explicit scheme to estimate the spectrum of $\rho$ is proposed and its rate function is calculated. The latter is shown to be optimal in [20].
(2) The rate function of the optimal pure state estimator is calculated in [19].
(3) In [20], the behavior of quantities like $\lim_{\epsilon \to 0} \inf_{\|\rho - \sigma\|_1 \geq \epsilon} I(\rho, \sigma)$ is analyzed for one-parameter families of states, and the relation to quantum Fisher information is discussed.

The purpose of the present paper is to extend the results about the spectrum in [27] and about pure states in [19] in two respects. Firstly, we will propose a scheme to estimate the full density matrix which is based on covariant observables [24] and which reduces to [27] if we look only at the spectrum of $\rho$. And secondly, we will pose the question whether the proposed scheme is “asymptotically optimal”, i.e. whether its rate function is bigger than the rate function of any other scheme.
There is of course no guarantee that a given set of functions admits a maximal element, but in the classical case, it is known that such an “optimal rate function” exists (and is given by the classical relative entropy; this is again a consequence of Bahadur’s work [3–5]). For quantum systems, however, the situation is, not very surprisingly, much more difficult.

The outline of the paper is as follows: In Sec. 2, we will give a more formal introduction to the questions we are considering and in Sec. 3, we will state our main results. The proofs and a more detailed discussion are then distributed among Sec. 4 (where we will consider U(d)-covariant estimation schemes) and Sec. 5 (where upper bounds on rate functions will be discussed).

2. Basic Definitions

In this section, we will present some mathematical preliminaries (in particular, basic definitions and terminology) concerning quantum state estimation. A short summary of material from the theory of large deviations used throughout this paper can be found in Appendix A.

2.1. State estimation

Let us consider the $d$-dimensional Hilbert space $H = \mathbb{C}^d$ and the corresponding set $S$ of density operators. The task of quantum state estimation is to determine a state $\rho \in S$ by a measurement on an $N$-fold system, which is prepared in the joint state $\rho^{\otimes N}$. Mathematically, this can be described by a normalized POV measure $E_N$ on the state space $S$ with values in the algebra $B(H^{\otimes N})$ of (bounded) operators on $H^{\otimes N}$. More precisely, $E_N$ is a (strongly) $\sigma$-additive set function
\[ E_N : B(S) \to B(H^{\otimes N}) \quad \text{with} \quad E_N(\Delta) \geq 0, \quad E_N(\emptyset) = 0, \quad E_N(S) = \mathbb{1}, \tag{2.1} \]
on the Borel $\sigma$-algebra $B(S)$ of $S$, and the probability to get an estimate in a Borel set $\Delta \subset S$ is given by
\[ \mu_{N,\rho}(\Delta) = \mathrm{tr}\bigl(\rho^{\otimes N} E_N(\Delta)\bigr). \tag{2.2} \]
Since the number $N$ of systems is arbitrary, we need a whole sequence of observables and we will call each such sequence in the following a full estimation scheme. For a good estimation scheme, the quality of the estimates should increase with $N$, i.e. the error probability should decrease and in the limit of infinitely many input systems, the estimate should be exact; in other words, the sequence of probability measures $(\mu_{N,\rho})_{N \in \mathbb{N}}$ should converge for each $\rho$ weakly to the point measure concentrated at $\rho$. Such an estimation scheme is called consistent.

If we are interested not in the whole state but only in some special properties of $\rho$ (e.g., its von Neumann entropy), described by a function $S \ni \rho \mapsto p(\rho) \in X$ taking its values in a locally compact, separable metric space $X$, we have to consider more generally POV measures $E_N : B(X) \to B(H^{\otimes N})$ on $X$ instead of $S$. As before, $\mathrm{tr}\bigl(\rho^{\otimes N} E_N(\Delta)\bigr)$ is the probability to get an estimate in $\Delta \subset X$. Estimating the
spectrum of a density operator is a particular example of this kind. In this case, $p$ coincides with
\[ s : S \to \Sigma = \Bigl\{ x \in [0, 1]^d \,\Big|\, x_1 \geq \cdots \geq x_d \geq 0,\ \sum_{j=1}^d x_j = 1 \Bigr\} \tag{2.3} \]
which maps a density operator $\rho$ to its spectrum $s(\rho) \in \Sigma$, i.e. $s_j(\rho) = \langle \chi_j, \rho \chi_j \rangle$ where $\chi_1, \ldots, \chi_d$ denotes an appropriate eigenbasis of $\rho$. We will call $\Sigma$ the set of ordered spectra and $s$ the canonical projection onto $\Sigma$. Let us summarize the discussion up to now in the following definition.

Definition 2.1. Consider a finite dimensional Hilbert space $H$, the corresponding set $S$ of density operators, and a function $p : S \to X$ taking its values in the locally compact, separable metric space $X$. A sequence $(E_N)_{N \in \mathbb{N}}$ of POV measures $E_N : B(X) \to B(H^{\otimes N})$ is called a $p$-estimation scheme (or just an estimation scheme if there is no danger of confusion). A $p$-estimation scheme is called consistent, if the sequence $(\mu_{N,\rho})_{N \in \mathbb{N}}$ of probability measures defined in (2.2) converges for each $\rho \in S$ weakly to a point measure concentrated at $p(\rho) \in X$.

We recover both cases we are mainly interested in if we set $X = S$ and $p = \mathrm{Id}$ for the full problem and $X = \Sigma$ and $p = s$ for spectral estimation.

Of special importance in this work are the estimation schemes with additional symmetry properties: Let us denote the permutation group on $N$ points by $S_N$ and its natural representation on $H^{\otimes N}$ by $V$, i.e.
\[ V_\sigma\, \psi_1 \otimes \cdots \otimes \psi_N = \psi_{\sigma^{-1}(1)} \otimes \cdots \otimes \psi_{\sigma^{-1}(N)}, \quad \sigma \in S_N, \quad \psi_1, \ldots, \psi_N \in H. \tag{2.4} \]
An estimation scheme $(E_N)_{N \in \mathbb{N}}$ is called permutation invariant, if
\[ V_\sigma E_N(\Delta) V_\sigma^* = E_N(\Delta), \quad \forall \sigma \in S_N, \quad \forall \Delta \in B(X) \tag{2.5} \]
holds. Likewise, it is called U(d)-covariant (or just covariant) if U(d) acts continuously on $X$ by $\mathrm{U}(d) \times X \ni (U, x) \mapsto \alpha_U(x) \in X$ such that the conditions
\[ U^{\otimes N} E_N(\Delta) U^{\otimes N *} = E_N(\alpha_U(\Delta)), \quad \forall U \in \mathrm{U}(d), \quad \forall \Delta \in B(X) \tag{2.6} \]
and
\[ p(U \rho U^*) = \alpha_U(p(\rho)), \quad \forall U \in \mathrm{U}(d), \quad \forall \rho \in S \tag{2.7} \]
are satisfied. If the scheme $(E_N)_{N \in \mathbb{N}}$ is consistent, covariance of the projection $p$ (2.7) is implied by covariance of the measures $E_N$ (2.6). Furthermore, note that the U(d) operation $\alpha_U$ is uniquely determined (if it exists) due to surjectivity of $p$. For full estimation, we have $\alpha_U(\rho) = U \rho U^*$ and for spectral estimation, it is the trivial action, i.e. $\alpha_U(x) = x$. Hence, the covariant estimation schemes are defined in both cases we are interested in.
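As a small numerical illustration of condition (2.7) for spectral estimation, where the action is trivial ($\alpha_U(x) = x$), the following sketch computes the ordered spectrum of a qubit density matrix before and after conjugation. It is only an illustration under simplifying assumptions that are not in the text: $d = 2$, a real symmetric $\rho$, and a real rotation in place of a general $U \in \mathrm{U}(2)$. The function names `spectrum` and `conjugate` and the numerical values are hypothetical.

```python
import math

def spectrum(rho):
    """Ordered spectrum s(rho) of a real symmetric 2x2 matrix ((a, b), (b, c))."""
    (a, b), (_, c) = rho
    mean = (a + c) / 2
    radius = math.hypot((a - c) / 2, b)
    return (mean + radius, mean - radius)

def conjugate(rho, theta):
    """U rho U^* for the real rotation U = ((cos, -sin), (sin, cos))."""
    co, si = math.cos(theta), math.sin(theta)
    U = ((co, -si), (si, co))
    # (U rho U^T)_{ij} = sum over k, l of U_ik * rho_kl * U_jl
    return tuple(
        tuple(
            sum(U[i][k] * rho[k][l] * U[j][l] for k in range(2) for l in range(2))
            for j in range(2)
        )
        for i in range(2)
    )

rho = ((0.7, 0.2), (0.2, 0.3))        # trace one, positive (illustrative values)
x = spectrum(rho)
x_rot = spectrum(conjugate(rho, 0.9))  # s(U rho U^*), to be compared with s(rho)
print(x, x_rot)
```

Up to rounding, both tuples coincide, which is the statement $s(U\rho U^*) = s(\rho)$ for this restricted family of unitaries.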
2.2. Large deviations

Consider now a Borel set $\Delta \subset X$ and a state $\rho \in S$ such that $p(\rho) \notin \bar{\Delta}$ (the closure of $\Delta$). The quantity $\mu_{N,\rho}(\Delta)$ is then the probability to get a false estimate in $\Delta$. If the scheme is consistent, this probability goes to zero. This is, however, a very weak statement because the convergence can be very slow. As already pointed out in the introduction, we are therefore interested in schemes where convergence of error probabilities to zero is exponentially fast; in other words, for each $\rho \in S$, the sequence $(\mu_{N,\rho})_{N \in \mathbb{N}}$ of probability measures from Eq. (2.2) should satisfy the large deviation principle^a with a rate function $I(\rho, \cdot\,)$. This idea leads to the following definition:

^a A short summary of definitions and theorems from large deviations theory which are relevant for this paper can be found in Appendix A.

Definition 2.2. A $p$-estimation scheme $(E_N)_{N \in \mathbb{N}}$, as described in Definition 2.1, satisfies the large deviation principle (LDP) with rate function $I : S \times X \to [0, \infty]$ if
(1) $I_\rho = I(\rho, \cdot\,)$ is a rate function (cf. Definition A.1) for each $\rho \in S$.
(2) $I(\rho, x) = 0$ iff $p(\rho) = x$ holds.
(3) The sequence $(\mu_{N,\rho})_{N \in \mathbb{N}}$ of probability measures (2.2) satisfies for each $\rho \in S$ the large deviation principle with rate function $I_\rho$.

Note that condition (2) guarantees that each scheme which satisfies the LDP is consistent, because $\mu_{N,\rho}(\Delta)$ converges to 0 if $\Delta$ is a closed set which does not contain $p(\rho)$. Occasionally, we will have to refer to the rate function $I$ of an estimation scheme $(E_N)_{N \in \mathbb{N}}$ without using $(E_N)_{N \in \mathbb{N}}$ directly. In this case, we will call $I$ an admissible rate function.

Definition 2.3. A function $I : S \times X \to [0, \infty]$ which is the rate function of a $p$-estimation scheme is called $p$-admissible (or just admissible if $p$ is understood). The set of all $p$-admissible rate functions is denoted by $E(p)$.

We do not yet know how continuous or discontinuous admissible rate functions can be in their first argument. For example, an otherwise very bad estimation scheme might provide very fast exponential decay for a particular input state. The discussion in Secs. 4.1 and 5.4 will indicate that discontinuities might occur in particular at the boundary of the state space, while the behavior in the interior of $S$ (i.e. at non-degenerate density matrices) seems to be more regular. To avoid such difficulties, let us introduce the following subset of $E(p)$:
\[ E^0(p) = \{ I \in E(p) \mid I \text{ is lower semicontinuous} \}. \tag{2.8} \]
If the map $p$ we want to estimate is covariant in the sense of Eq. (2.7), we can introduce, in addition,
\[ E^c(p) = \{ I \in E(p) \mid I \text{ is covariant} \}, \tag{2.9} \]
where we call an admissible rate function covariant, if it is the rate function of a U(d)-covariant estimation scheme. In contrast to this, any function $F : S \times X \to [0, \infty]$ is called U(d)-invariant if
\[ F\bigl(U \rho U^*, \alpha_U(x)\bigr) = F(\rho, x), \quad \forall U \in \mathrm{U}(d), \quad \forall \rho \in S, \quad \forall x \in X \tag{2.10} \]
is satisfied. Obviously, each admissible rate function which is covariant is U(d)-invariant too. It is not clear whether the converse holds as well (i.e. whether U(d)-invariance of $I \in E(p)$ implies covariance). However, problems can occur only on the boundary of $S$ (i.e. for degenerate density matrices) and even there only if $I$ is not lower semicontinuous (cf. Sec. 4.2 for details). Finally, note that U(d)-invariance of $I \in E^c(p)$ implies, together with lower semicontinuity of $I_\rho(\,\cdot\,) = I(\rho, \cdot\,)$, lower semicontinuity of $I^x(\,\cdot\,) = I(\,\cdot\,, x)$ along the orbits of the U(d) action on $S$. The general relation between $E^0(p)$ and $E^c(p)$ is, however, not clear (i.e. $I \in E^c(p)$ can be discontinuous transversal to the orbits).

Ideally, we would like to have estimation schemes $(E_N)_{N \in \mathbb{N}}$ which provide the fastest possible exponential decay of error probabilities. Hence, for a given map $p : S \to X$, we are mainly interested in the quantities
\[ I_p(\rho, \sigma) = \sup_{I \in E(p)} I(\rho, \sigma), \qquad I_p^0(\rho, \sigma) = \sup_{I \in E^0(p)} I(\rho, \sigma) \tag{2.11} \]
and
\[ I_p^c(\rho, \sigma) = \sup_{I \in E^c(p)} I(\rho, \sigma). \tag{2.12} \]
The functions $I_p^\# : S \times X \to [0, \infty]$ thus defined (following the notation introduced above, we will write $I_{\mathrm{Id}}^\#$ for full and $I_s^\#$ for spectral estimation), are the least upper bounds on the sets $E^\#(p)$, but they are not necessarily admissible themselves. In slight abuse of language, we will call them nevertheless the optimal rate functions. If $I_p$ can be realized as the rate function of a particular estimation scheme $(E_N)_{N \in \mathbb{N}}$, we will call $(E_N)_{N \in \mathbb{N}}$ (strongly) asymptotically optimal.

3. Summary of Main Results

A particular example for asymptotic optimality arises in classical estimation theory (for finite probability distributions). It is known from the Bahadur efficiency [3–5] that the classical relative entropy is an upper bound for all admissible rate functions; and Sanov’s theorem (cf. [10]) states that this bound can be achieved by the empirical distribution (i.e. relative frequencies in a given sample). Therefore, the latter provides an asymptotically optimal estimation scheme.

For quantum systems, the situation is more difficult, and our knowledge is (unfortunately) not yet as complete as for classical estimation. Nevertheless, we have some significant partial results which we want to summarize in this section. The proofs and a more detailed discussion are given in Secs. 4 and 5.
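The classical statement can be made concrete with a small numerical sketch. Assuming a three-outcome distribution chosen only for illustration, the code below compares the exact multinomial probability that the empirical distribution of $N$ samples equals a fixed "wrong" estimate $x$ with the classical relative entropy between $x$ and the true distribution. The function names and the numerical values are hypothetical, and the check is pointwise (a single estimate $x$), whereas Sanov's theorem concerns sets of estimates.

```python
import math

def relative_entropy(x, p):
    """Classical relative entropy sum_j x_j [ln x_j - ln p_j], with 0 ln 0 = 0."""
    return sum(xj * (math.log(xj) - math.log(pj)) for xj, pj in zip(x, p) if xj > 0)

def log_prob_of_counts(counts, p):
    """ln of the multinomial probability that i.i.d. draws from p give these counts."""
    n = sum(counts)
    log_coeff = math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)
    return log_coeff + sum(c * math.log(pj) for c, pj in zip(counts, p) if c > 0)

p = (0.5, 0.3, 0.2)   # true distribution (illustrative)
x = (0.2, 0.3, 0.5)   # a fixed false estimate (illustrative)

def empirical_rate(n):
    """-(1/N) ln Prob(empirical distribution = x), for N a multiple of 10."""
    counts = (2 * n // 10, 3 * n // 10, 5 * n // 10)
    return -log_prob_of_counts(counts, p) / n

for n in (10, 100, 1000, 10000):
    print(n, empirical_rate(n), relative_entropy(x, p))
```

The printed decay rates approach the relative entropy from above as $N$ grows; the remaining gap is the subexponential prefactor, which plays the role of $C_N$ in (1.1).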
3.1. Estimating the spectrum

The most complete result is available for spectral estimation. To state it, let us recall the definition of the scheme presented in [27]. It is based on the decomposition of the representation $U \mapsto U^{\otimes N}$ of the unitary group U(d) into irreducible components. The latter is given by
\[ H^{\otimes N} = \bigoplus_{Y \in Y_d(N)} H_Y \otimes K_Y, \qquad U^{\otimes N} = \bigoplus_{Y \in Y_d(N)} \pi_Y(U) \otimes \mathbb{1}, \tag{3.1} \]
where $Y_d(N)$ denotes the set of Young frames with $d$ rows and $N$ boxes
\[ Y_d(N) = \Bigl\{ Y \in \mathbb{N}^d \,\Big|\, Y_1 \geq \cdots \geq Y_d,\ \sum_{j=1}^d Y_j = N \Bigr\}, \tag{3.2} \]
$\pi_Y$ denotes the irreducible representation with highest weight^b $Y$, and $K_Y$ is a multiplicity space which carries an irreducible representation of the symmetric group $S_N$ on $N$ elements:
\[ V_\sigma = \bigoplus_{Y \in Y_d(N)} \mathbb{1} \otimes \Pi_Y(\sigma), \quad \sigma \in S_N, \tag{3.3} \]
where $V_\sigma$ is defined in Eq. (2.4) and $\Pi_Y$ is the irreducible $S_N$ representation defined by the Young frame $Y$.

^b More precisely, the $Y_1, \ldots, Y_d$ are the components of the highest weight in a particular basis of the Cartan subalgebra.

Now, we can define a spectral estimation scheme $(\hat{F}_N)_{N \in \mathbb{N}}$ by
\[ \hat{F}_N(\Delta) = \sum_{Y/N \in \Delta} P_Y, \tag{3.4} \]
where $P_Y$ denotes the projection onto $H_Y \otimes K_Y$:
\[ P_Y \in B(H^{\otimes N}), \quad P_Y^2 = P_Y, \quad P_Y^* = P_Y, \quad P_Y H^{\otimes N} = H_Y \otimes K_Y. \tag{3.5} \]
In other words, $\hat{F}_N$ is a discrete measure with normalized Young frames $Y/N$ as possible estimates and the probability to get the outcome $Y/N$ for input systems in the joint state $\rho^{\otimes N}$ is $\mathrm{tr}(\rho^{\otimes N} P_Y)$. In [27], it is shown that $\hat{F}_N$ satisfies the large deviation principle with the classical relative entropy between the probability vectors $x \in \Sigma$ and $s(\rho)$ as the rate function $I(\rho, x)$. As we will see in Sec. 5.2, this is in fact the best that can be achieved (cf. also [20]).

Theorem 3.1. The spectral estimation scheme $(\hat{F}_N)_{N \in \mathbb{N}}$ defined in (3.4) is asymptotically optimal; i.e. it satisfies the LDP with the optimal rate function $I_s$ defined in Eq. (2.11). In addition, $I_s = I_s^0 = I_s^c$ holds, and $I_s$ is given explicitly by
\[ S \times \Sigma \ni (\rho, x) \mapsto I_s(\rho, x) = \sum_{j=1}^d x_j \bigl[\ln(x_j) - \ln(s_j(\rho))\bigr], \tag{3.6} \]
where $s : S \to \Sigma$ is the canonical projection from Eq. (2.3).
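For qubits the outcome distribution of this scheme can be written down in closed form, which allows a quick numerical comparison with (3.6). The sketch below is an illustration only and rests on standard representation-theoretic facts that are not stated in the text and are therefore assumptions here: for $d = 2$ and $Y = (N - k, k)$, the multiplicity is $\dim K_Y = \binom{N}{k} - \binom{N}{k-1}$, and the trace of $\pi_Y(\rho)$ for a state with spectrum $(p, q)$ is $(pq)^k \sum_{j=0}^{N-2k} p^{N-2k-j} q^j$, so that $\mathrm{tr}(\rho^{\otimes N} P_Y)$ is their product. All function names and numerical values are hypothetical.

```python
from fractions import Fraction
from math import comb, log

def young_frames_d2(n):
    """All Young frames (Y1, Y2) with two rows and n boxes."""
    return [(n - k, k) for k in range(n // 2 + 1)]

def outcome_probability(frame, p):
    """tr(rho^{(x)N} P_Y) for a qubit state with spectrum (p, 1 - p) (assumed formula)."""
    y1, y2 = frame
    n, q = y1 + y2, 1 - p
    # dim K_Y: multiplicity of the U(2) representation pi_Y
    multiplicity = comb(n, y2) - (comb(n, y2 - 1) if y2 > 0 else 0)
    # trace of pi_Y(rho): Schur polynomial in the eigenvalues p and q
    character = (p * q) ** y2 * sum(p ** (y1 - y2 - j) * q ** j for j in range(y1 - y2 + 1))
    return multiplicity * character

def distribution(n, p):
    """Probabilities of all outcomes Y (to be read as estimates Y/N)."""
    return {frame: outcome_probability(frame, p) for frame in young_frames_d2(n)}

def spectral_rate(x, spectrum):
    """Rate function (3.6): sum_j x_j [ln x_j - ln s_j(rho)]."""
    return sum(xj * (log(xj) - log(sj)) for xj, sj in zip(x, spectrum) if xj > 0)

p = Fraction(7, 10)            # largest eigenvalue of rho (illustrative)
n = 200
dist = distribution(n, p)
frame = (180, 20)              # estimate Y/N = (0.9, 0.1)
rate = -log(float(dist[frame])) / n
print(rate, spectral_rate((0.9, 0.1), (0.7, 0.3)))
```

With exact rational arithmetic the probabilities sum to one, and the empirical decay rate for the estimate $(0.9, 0.1)$ lies slightly above the value of (3.6), the difference again being a subexponential correction.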
3.2. The full density matrix

For the full problem, the best scheme $(\hat E_N)_{N\in\mathbb{N}}$ we have found so far is defined by the integral (with an arbitrary continuous function $f : \mathcal{S} \to \mathbb{R}$)
$$\int_{\mathcal{S}} f(\rho)\,\hat E_N(d\rho) = \bigoplus_{Y \in Y_d(N)} \dim \mathcal{H}_Y \int_{\mathrm{U}(d)} f\left(U \rho_{Y/N} U^*\right) |\pi_Y(U)\phi_Y\rangle\langle\pi_Y(U)\phi_Y| \otimes \mathbb{1}\; dU, \tag{3.7}$$
where $\phi_Y \in \mathcal{H}_Y$ is the highest weight vector of the irreducible representation $\pi_Y$ and $\rho_x$ denotes for each $x \in \Sigma$ the diagonal density matrix
$$\rho_x = \mathrm{diag}(x_1, \ldots, x_d). \tag{3.8}$$
The main properties of this scheme are: It projects to the spectral estimation scheme $\hat F_N$ from Sec. 3.1,
$$\hat E_N(s^{-1}(\Delta)) = \hat F_N(\Delta), \quad \forall \Delta \in \mathcal{B}(\Sigma), \tag{3.9}$$
it is covariant (i.e. Eq. (2.6) holds with $\alpha_U(\rho) = U \rho U^*$) and permutation invariant (cf. Eq. (2.5)). Measuring $\hat E_N$ can therefore be regarded as a two-step process: First, measure the observable $\hat F_N$ in terms of the instrument $T$, which is defined by the family of channels (given in the Schrödinger picture):
$$T_Y : \mathcal{B}(\mathcal{H}^{\otimes N}) \ni \omega \mapsto \mathrm{tr}_{\mathcal{K}_Y}(P_Y \omega P_Y) \in \mathcal{B}(\mathcal{H}_Y), \quad Y \in Y_d(N), \tag{3.10}$$
where $\mathrm{tr}_{\mathcal{K}_Y}$ denotes the partial trace over $\mathcal{K}_Y$ and the $P_Y$ are again the projections from (3.5). If the estimate for the spectrum we get in this way (with probability $\mathrm{tr}(P_Y \rho^{\otimes N})$) is $Y/N$, the output of $T$ is a quantum system (described by the Hilbert space $\mathcal{H}_Y$, hence of a different type than the input system^c) in the state $\mathrm{tr}(P_Y \rho^{\otimes N})^{-1}\, T_Y(\rho^{\otimes N})$. On this system, we perform a measurement of a covariant observable $E_Y$ with values in $\mathcal{S}_Y = s^{-1}(Y/N)$, which is defined by the integral
$$\int_{\mathcal{S}_Y} f(\sigma)\, E_Y(d\sigma) = \int_{\mathrm{U}(d)} f\left(U \rho_{Y/N} U^*\right) |\pi_Y(U)\phi_Y\rangle\langle\pi_Y(U)\phi_Y|\; dU \tag{3.11}$$
(where $f$ now denotes a continuous function on $\mathcal{S}_Y$), and this gives us an estimate for the eigenvectors of $\rho$.

c If $d = 2$ holds, the situation is special. In this case, the output of $T$ can be regarded as an $M = Y_1 - Y_2$ qubit system, and $T$ itself coincides with the "natural purifier" studied in [8, 28].

In the special case of pure states (i.e. if the first measurement gives $Y/N = (1, 0, 0, \ldots, 0)$), the observable $\hat E_Y$ is given by
$$\int_{\mathcal{P}} f(\sigma)\, \hat E_Y(d\sigma) = \int_{\mathcal{P}} f(\sigma)\, \sigma^{\otimes N}, \quad \text{for } Y = (N, 0, \ldots, 0), \tag{3.12}$$
where $\mathcal{P} = s^{-1}(1, 0, \ldots, 0)$ denotes the set of pure states. This observable is known to optimize for each $N$ global quality criteria like averaged fidelity [24, 30, 19]. Hence, we can look at $\hat E_N$ as a direct generalization of the best known estimation schemes for the spectrum and for pure states. We discuss this point of view in
Quantum State Estimation and Large Deviations
greater detail in Sec. 4.4. The large deviation behavior of $\hat E_N$ is described by the following theorem (cf. Sec. 4.5 for the proof):

Theorem 3.2. The full estimation scheme $(\hat E_N)_{N\in\mathbb{N}}$ defined in Eq. (3.7) satisfies the large deviation principle with rate function $\hat I : \mathcal{S} \times \mathcal{S} \to [0, \infty]$,
$$\hat I(\rho, U \rho_x U^*) = \sum_{k=1}^{d} \left[ x_k \ln(x_k) - (x_k - x_{k+1}) \ln\left[\mathrm{pm}_k(U^* \rho U)\right] \right], \tag{3.13}$$
where $x = (x_1, \ldots, x_d) \in \Sigma$, $x_{d+1} = 0$, $\rho_x$ is the density matrix from Eq. (3.8), $U \in \mathrm{U}(d)$, and $\mathrm{pm}_j(\sigma)$ denotes the principal minor (i.e. the upper left rank $j$ subdeterminant) of the matrix $\sigma$.

The best upper bound on the rate function for full estimation schemes we have found so far is derived from quantum hypothesis testing.

Theorem 3.3. Each admissible rate function $I : \mathcal{S} \times \mathcal{S} \to [0, \infty]$ is bounded from above by the relative entropy, i.e.
$$I(\rho, \sigma) \le S(\rho, \sigma) = \mathrm{tr}(\sigma \ln(\sigma) - \sigma \ln(\rho)), \quad \forall \rho, \sigma \in \mathcal{S}. \tag{3.14}$$

The proof will be given in Sec. 5.2; cf. also [20]. It is easy to check numerically that $\hat I(\rho, \sigma)$ and $S(\rho, \sigma)$ do not coincide in general. If we consider in particular the qubit case ($d = 2$) and express the density operators $\rho$, $\sigma$ in Bloch form, i.e.
$$\rho = \frac{1}{2}\left[\mathbb{1} + \vec x \cdot \vec\sigma\right], \quad \sigma = \frac{1}{2}\left[\mathbb{1} + \vec y \cdot \vec\sigma\right] \tag{3.15}$$
(where $\vec\sigma = (\sigma_1, \sigma_2, \sigma_3)$ are the Pauli matrices and $\vec x, \vec y \in \mathbb{R}^3$ with $|\vec x|, |\vec y| \le 1$), we get for the rate function $\hat I$ from Eq. (3.13),
$$\hat I(\rho, \sigma) = -S(\sigma) - |\vec y| \ln\left(\frac{1 + |\vec x| \cos\theta}{2}\right) - \frac{1 - |\vec y|}{2} \ln\left(\frac{1 - |\vec x|^2}{4}\right), \tag{3.16}$$
where $\theta$ denotes the angle between $\vec x$ and $\vec y$, and $S(\sigma)$ is the von Neumann entropy of $\sigma$. Evaluating Eq. (3.14) in the same parametrization, the relative entropy of $\sigma$ and $\rho$ becomes [9]
$$S(\rho, \sigma) = -S(\sigma) - \frac{1}{2} \ln\left(\frac{1 - |\vec x|^2}{4}\right) - \frac{|\vec y| \cos(\theta)}{2} \ln\left(\frac{1 + |\vec x|}{1 - |\vec x|}\right). \tag{3.17}$$
We have plotted both quantities as functions of $\theta$ for two different values of $|\vec x| = |\vec y|$ in Fig. 1, which shows that $\hat I(\rho, \sigma)$ is in general strictly smaller than $S(\rho, \sigma)$.

3.3. Optimal rate functions

Hence, for a general input state $\rho$, we only know for sure that the optimal rate functions defined in Eqs. (2.11) and (2.12) have to satisfy (with $p = \mathrm{Id}$ for full estimation)
$$\hat I \le I^c_{\mathrm{Id}},\ I^0_{\mathrm{Id}} \le I_{\mathrm{Id}} \le S. \tag{3.18}$$
[Figure 1: two panels, each plotting the curves "rate function" and "relative entropy" against the angle θ on the range 0 to 3; the vertical axis runs from 0 to 3 in the upper panel and from 0 to 0.025 in the lower panel.]

Fig. 1. Relative entropy and rate function $\hat I$ as a function of the angle $\theta$ between the two Bloch vectors $\vec x$ and $\vec y$. The upper plot corresponds to the case $|\vec x| = |\vec y| = 0.9$ and the lower plot to $|\vec x| = |\vec y| = 0.1$.
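The comparison shown in Fig. 1 can be reproduced approximately with a few lines of code. The sketch below evaluates the qubit rate function of Eq. (3.16) and the relative entropy $\mathrm{tr}(\sigma \ln \sigma - \sigma \ln \rho)$ of Eq. (3.14), written out directly in the Bloch parametrization of Eq. (3.15). The function names are hypothetical, natural logarithms are used throughout, and $|\vec x| < 1$ is assumed so that all logarithms are finite.

```python
import math

def binary_entropy(r):
    """von Neumann entropy of a qubit state whose Bloch vector has length r."""
    total = 0.0
    for p in ((1 + r) / 2, (1 - r) / 2):
        if p > 0:
            total -= p * math.log(p)
    return total

def rate_hat(rx, ry, theta):
    """Rate function of Eq. (3.16); rx = |x|, ry = |y|, theta = angle between them."""
    return (-binary_entropy(ry)
            - ry * math.log((1 + rx * math.cos(theta)) / 2)
            - (1 - ry) / 2 * math.log((1 - rx ** 2) / 4))

def relative_entropy(rx, ry, theta):
    """tr(sigma ln sigma - sigma ln rho) for qubits in Bloch form."""
    return (-binary_entropy(ry)
            - 0.5 * math.log((1 - rx ** 2) / 4)
            - ry * math.cos(theta) / 2 * math.log((1 + rx) / (1 - rx)))

# Tabulate both quantities for the two cases shown in Fig. 1.
for r in (0.9, 0.1):
    for k in range(0, 7):
        theta = k * math.pi / 6
        print(r, round(theta, 3), rate_hat(r, r, theta), relative_entropy(r, r, theta))
```

In this parametrization the two quantities agree at $\theta = 0$ and $\theta = \pi$, where $\rho$ and $\sigma$ commute, and the rate function lies strictly below the relative entropy in between.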
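This is, however, not as bad as it looks at first glance: Since $S(\rho, \sigma)$ and $\hat I(\rho, \sigma)$ coincide if $\rho$ and $\sigma$ commute, we get
$$\hat I(\rho, \sigma) = I^c_{\mathrm{Id}}(\rho, \sigma) = I^0_{\mathrm{Id}}(\rho, \sigma) = I_{\mathrm{Id}}(\rho, \sigma) = S(\rho, \sigma) = \sum_{j=1}^{d} s_j(\sigma)\left(\ln s_j(\sigma) - \ln s_j(\rho)\right), \quad \forall \rho, \sigma \in \mathcal{S} \text{ with } [\rho, \sigma] = 0. \tag{3.19}$$
A second partial result arises if the input state is pure. In Proposition 5.5, we will show
$$I^c_{\mathrm{Id}}(\rho, \sigma) = \hat I(\rho, \sigma), \quad \forall \rho, \sigma \in \mathcal{S} \text{ with } \rho \text{ pure}, \tag{3.20}$$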
and in Sec. 4.4, we will give some heuristic arguments which indicate that $\hat I$ and $I^c_{\mathrm{Id}}$ coincide even for general input states. This indicates that $(\hat E_N)_{N\in\mathbb{N}}$ is the best scheme as long as we insist on some additional regularity conditions on the rate function; in the case at hand, this is covariance. It is not clear, however, whether covariance can be replaced by something more general without breaking the equality with $\hat I$. There are at least some indications (cf. Sec. 5.3) that Eq. (3.20) would still hold if we replace $I^c_{\mathrm{Id}}$ with $I^0_{\mathrm{Id}}$. Note that $\hat I \in \mathcal{E}^0(p)$, hence (3.20) already implies $I^0_{\mathrm{Id}}(\rho, \sigma) \ge I^c_{\mathrm{Id}}(\rho, \sigma)$ for pure $\rho$. Our conjecture here is that equality holds for all $\rho$ and $\sigma$.

Another result which can be derived easily from Eq. (3.20) and Proposition 4.9 is $S \notin \mathcal{E}(\mathrm{Id})$, i.e. there is no estimation scheme with relative entropy as its rate function. This follows from the fact that $S$ is lower semicontinuous and $\mathrm{U}(d)$-invariant in the sense of Eq. (2.10). Hence, $S \in \mathcal{E}(\mathrm{Id})$ would imply, according to Proposition 4.9, $S \in \mathcal{E}^c(\mathrm{Id})$, in contradiction to Eq. (3.20) and the fact that $S(\rho, \sigma) > \hat I(\rho, \sigma)$ holds for all pure states $\rho$, $\sigma$ with $\rho \ne \sigma$ and $\rho\sigma \ne 0$. On the other hand, there is strong evidence that $I_{\mathrm{Id}} = S$ holds, i.e. that $S$ is the best upper bound of the set of all admissible rate functions. This would imply that we can find for each pair $\rho_0, \sigma_0 \in \mathcal{S}$ an $I \in \mathcal{E}(\mathrm{Id})$ such that $I(\rho_0, \sigma_0) = S(\rho_0, \sigma_0)$ holds, but $I$ is much smaller than $S$ (most probably even smaller than $\hat I$) almost everywhere else. In Sec. 5.3, we will discuss these topics in greater detail. For now, let us summarize all our conjectures in the following equation
$$\hat I = I^c_{\mathrm{Id}} = I^0_{\mathrm{Id}} \le I_{\mathrm{Id}} = S. \tag{3.21}$$

4. Covariant Observables

The aim of this section is to study estimation schemes which are $\mathrm{U}(d)$-covariant and permutation invariant, i.e. they do not prefer a special copy of the input state or a particular direction in the Hilbert space $\mathcal{H}$. Besides the proof of Theorem 3.2, we will provide several general results, which are useful within the discussion of the questions raised in Sec. 3.3. Therefore, only full estimation schemes are considered in this section (i.e. $p = \mathrm{Id}$), but most of the results in Secs. 4.2 and 4.3 can be generalized quite easily to $p$-estimation schemes, if $p$ is sufficiently covariant.

4.1. Continuity properties

Let us start with some technical results concerning continuity and uniform convergence with respect to the original density matrix $\rho$. They will become crucial within the discussion of group averages in the next section. Some of them, however, are quite interesting in their own right, and it is therefore reasonable to devote a whole subsection to them. Central subjects of this discussion will be integrals of the form
$$h_N(\rho, f) = -\frac{1}{N} \ln \int_{\mathcal{S}} e^{-N f(\sigma)}\, \mathrm{tr}\left(\rho^{\otimes N} E_N(d\sigma)\right), \tag{4.1}$$
where $f$ denotes an arbitrary, real-valued continuous function on $\mathcal{S}$. Quantities of this form usually appear in Varadhan's Theorem (cf. Theorem A.3), i.e. if the estimation scheme $(E_N)_{N\in\mathbb{N}}$ satisfies the LDP with rate function $I$, we have
$$\lim_{N\to\infty} h_N(\rho, f) = h(\rho, f) = \inf_{\sigma\in\mathcal{S}} \left(I(\rho, \sigma) + f(\sigma)\right). \tag{4.2}$$
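The convergence $h_N \to h$ can be observed numerically in a purely classical toy model, which is offered here only as an illustration and is not the quantum scheme studied in the text. Assume a coin with success probability $p$; the empirical frequency $k/N$ after $N$ tosses plays the role of the estimate, its binomial distribution replaces $\mathrm{tr}(\rho^{\otimes N} E_N(\cdot))$, and the rate function is the binary relative entropy. All names below (`h_N`, `h_limit`, the test function `f`) are hypothetical.

```python
import math

def log_binom_pmf(n, k, p):
    """Logarithm of the binomial probability of k successes in n trials (0 < p < 1)."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))

def h_N(n, p, f):
    """Classical analog of Eq. (4.1): -(1/N) ln sum_k P(k) exp(-N f(k/N))."""
    logs = [log_binom_pmf(n, k, p) - n * f(k / n) for k in range(n + 1)]
    m = max(logs)  # log-sum-exp trick to avoid underflow
    return -(m + math.log(sum(math.exp(v - m) for v in logs))) / n

def rate(p, q):
    """Binary relative entropy q ln(q/p) + (1-q) ln((1-q)/(1-p))."""
    total = 0.0
    for a, b in ((q, p), (1 - q, 1 - p)):
        if a > 0:
            total += a * math.log(a / b)
    return total

def h_limit(p, f, grid=10001):
    """Right-hand side of Eq. (4.2), approximated by a minimum over a grid."""
    return min(rate(p, j / (grid - 1)) + f(j / (grid - 1)) for j in range(grid))

f = lambda q: 2.0 * (q - 0.8) ** 2   # arbitrary continuous test function
p = 0.3
for n in (100, 1000, 4000):
    print(n, h_N(n, p, f), h_limit(p, f))
```

In this toy model the difference between `h_N` and `h_limit` is bounded by roughly $\ln(N+1)/N$, so the agreement improves slowly but steadily with $N$.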
If, on the other hand, $(E_N)_{N\in\mathbb{N}}$ does not necessarily satisfy the LDP but (4.2) holds for each $f$ and a density matrix $\rho$, the sequence of probability measures $\mathrm{tr}(\rho^{\otimes N} E_N(\,\cdot\,))$ satisfies the Laplace principle (Definition A.4), which is equivalent to the large deviation principle (Theorem A.5). Hence, the study of convergence properties of the $h_N(\rho, f)$ is a useful tool to prove that the LDP holds for a given estimation scheme. In this section, we will discuss continuity of $h$ with respect to $\rho$ and uniformity of the convergence $h_N \to h$ (again with respect to $\rho$). The most crucial step in this direction is the following lemma.

Lemma 4.1. Consider an estimation scheme $(E_N)_{N\in\mathbb{N}}$ satisfying the LDP with rate function $I$, an arbitrary continuous (real-valued) function $f$ and the functionals $h_N$, $h$ defined in Eqs. (4.1) and (4.2).
(1) For each nondegenerate density matrix $\rho \in \mathcal{S}$ and each sequence $\mathbb{N} \ni N \mapsto \rho_N \in \mathcal{S}$ converging to $\rho$, we have
$$\lim_{N\to\infty} h_N(\rho_N, f) = \lim_{N\to\infty} h_N(\rho, f) = h(\rho, f). \tag{4.3}$$
(2) If $I$ is lower semicontinuous in both arguments, the lower bound
$$\liminf_{N\to\infty} h_N(\rho_N, f) \ge h(\rho, f) \tag{4.4}$$
holds even for degenerate $\rho$.

Proof. Let us consider statement (1) first. In this case, the proof mainly depends on the following lemma, which allows us to represent one sequence as a convex combination of two others.

Lemma 4.2. Consider two sequences $\mathbb{N} \ni N \mapsto \rho_N^{(j)} \in \mathcal{S}$, $j = 1, 2$, both converging to the same nondegenerate density matrix $\rho \in \mathcal{S}$. For each $\lambda \in \mathbb{R}$ with $0 < \lambda < 1$, there exists an integer $N_\lambda \in \mathbb{N}$ and a third sequence $\mathbb{N} \ni N \mapsto \sigma_N \in \mathcal{S}$ such that
$$\rho_N^{(1)} = \lambda \rho_N^{(2)} + (1 - \lambda)\sigma_N, \quad \forall N > N_\lambda, \tag{4.5}$$
holds.

Proof. Let $\kappa = \inf_{\|\phi\|=1} \langle\phi, \rho\phi\rangle$ and define
$$\epsilon = \frac{(1 - \lambda)\kappa}{\lambda + 1}. \tag{4.6}$$
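Since $\rho$ is nondegenerate, we have $\kappa > 0$ and therefore $\epsilon > 0$ as well. Hence, there is an $N_\lambda \in \mathbb{N}$ such that (with $\phi \in \mathcal{H}$ and $A \in \mathcal{B}(\mathcal{H})$)
$$\sup_{\|\phi\|=1} \left|\langle\phi, (\rho_N^{(j)} - \rho)\phi\rangle\right| \le \sup_{\|A\|=1} \left|\mathrm{tr}\left((\rho_N^{(j)} - \rho)A\right)\right| \tag{4.7}$$
$$= \left\|\rho_N^{(j)} - \rho\right\|_1 < \epsilon \tag{4.8}$$
holds for all $N > N_\lambda$ and for $j = 1, 2$. In addition, we see by the triangle inequality that
$$\sup_{\|\phi\|=1} \left|\langle\phi, (\rho_N^{(1)} - \rho_N^{(2)})\phi\rangle\right| < 2\epsilon \tag{4.9}$$
holds as well for all $N > N_\lambda$. Now, define
$$\delta = \frac{\kappa}{2\epsilon} - \frac{1}{2} = \frac{\lambda}{1 - \lambda} \tag{4.10}$$
(the second equality follows from Eq. (4.6)) and
$$\sigma_N = -\delta \rho_N^{(2)} + (1 + \delta)\rho_N^{(1)}, \quad \text{for } N > N_\lambda \tag{4.11}$$
(and $\sigma_N \in \mathcal{S}$ arbitrary otherwise). Obviously, $\mathrm{tr}(\sigma_N) = 1$ and
$$\frac{-\lambda}{1 - \lambda}\rho_N^{(2)} + \frac{1}{1 - \lambda}\rho_N^{(1)} = \sigma_N. \tag{4.12}$$
Hence,
$$\rho_N^{(1)} = \lambda \rho_N^{(2)} + (1 - \lambda)\sigma_N, \quad \forall N > N_\lambda \tag{4.13}$$
as stated. It only remains to show that $\sigma_N \ge 0$ (and therefore, $\sigma_N \in \mathcal{S}$) holds for all $N > N_\lambda$. This follows, for each $\phi$ with $\|\phi\| = 1$, from
$$\langle\phi, \sigma_N \phi\rangle = -\delta \langle\phi, \rho_N^{(2)}\phi\rangle + (1 + \delta)\langle\phi, \rho_N^{(1)}\phi\rangle \tag{4.14}$$
$$\ge -2\delta\epsilon - \delta\langle\phi, \rho_N^{(1)}\phi\rangle + (1 + \delta)\langle\phi, \rho_N^{(1)}\phi\rangle \tag{4.15}$$
$$= -2\delta\epsilon + \langle\phi, \rho_N^{(1)}\phi\rangle \ge -2\delta\epsilon + \langle\phi, \rho\phi\rangle - \epsilon \tag{4.16}$$
$$\ge -2\delta\epsilon + \kappa - \epsilon = -(2\delta + 1)\epsilon + \kappa = 0, \tag{4.17}$$
where we have used Eq. (4.9) in (4.15), Eq. (4.8) in (4.16) and the definition of $\delta$ (4.10) in (4.17).

Now, let us apply this lemma to $\rho_N^{(1)} = \rho$ and $\rho_N^{(2)} = \rho_N$ for all $N \in \mathbb{N}$. For each $\lambda \in (0, 1)$, we get an $N_\lambda \in \mathbb{N}$ such that $h_N(\rho, f) = h_N(\lambda\rho_N + (1 - \lambda)\sigma_N, f)$ holds for all $N > N_\lambda$. Hence,
$$\lim_{N\to\infty} h_N(\rho, f) = \lim_{N\to\infty} h_N(\lambda\rho_N + (1 - \lambda)\sigma_N, f). \tag{4.18}$$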
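The construction used in the proof of Lemma 4.2 can be checked on a small example. The sketch below is restricted, as a simplifying assumption, to diagonal (commuting) density matrices, so that positivity can be read off entrywise and the trace norm is the sum of absolute entry differences. The concrete numbers and the helper names `decompose` and `trace_distance` are hypothetical.

```python
def decompose(rho1, rho2, lam):
    """sigma = -delta * rho2 + (1 + delta) * rho1 with delta = lam / (1 - lam), cf. Eqs. (4.10), (4.11)."""
    delta = lam / (1.0 - lam)
    return [-delta * b + (1.0 + delta) * a for a, b in zip(rho1, rho2)]

def trace_distance(a, b):
    """Trace norm of the difference of two diagonal matrices."""
    return sum(abs(u - v) for u, v in zip(a, b))

rho = [0.5, 0.3, 0.2]        # nondegenerate limit state (diagonal entries)
lam = 0.5
kappa = min(rho)             # smallest eigenvalue, kappa > 0
eps = (1 - lam) * kappa / (lam + 1)   # Eq. (4.6)

rho1 = [0.52, 0.29, 0.19]    # both states lie within eps of rho in trace norm
rho2 = [0.49, 0.32, 0.19]
assert trace_distance(rho1, rho) < eps and trace_distance(rho2, rho) < eps

sigma = decompose(rho1, rho2, lam)
mix = [lam * b + (1 - lam) * s for b, s in zip(rho2, sigma)]
print(sigma)   # entries are nonnegative and sum to one
print(mix)     # reproduces rho1 up to rounding, cf. Eq. (4.13)
```

The closeness condition matters: if `rho1` and `rho2` are further apart than the bound derived from $\kappa$ allows, some entries of `sigma` can become negative and the decomposition fails.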
Using the definition of $h_N$ in (4.1), we get:
$$h_N(\lambda\rho_N + (1 - \lambda)\sigma_N, f) = -\frac{1}{N} \ln\left[\lambda^N e^{-N h_N(\rho_N, f)} + \sum_{n=1}^{N} \lambda^{N-n}(1 - \lambda)^n \int_{\mathcal{S}} e^{-N f(\sigma)}\, \mathrm{tr}(A_{N,n} E_N(d\sigma))\right], \tag{4.19}$$
where $A_{N,n}$ denotes the sum of all tensor products consisting of $N - n$ factors $\rho_N$ and $n$ factors $\sigma_N$. We can rewrite this expression as
$$h_N(\lambda\rho_N + (1 - \lambda)\sigma_N, f) = -\ln\lambda + h_N(\rho_N, f) - \frac{1}{N} \ln\left[1 + e^{N h_N(\rho_N, f)} \sum_{n=1}^{N} \left(\frac{1 - \lambda}{\lambda}\right)^n \int_{\mathcal{S}} e^{-N f(\sigma)}\, \mathrm{tr}(A_{N,n} E_N(d\sigma))\right]. \tag{4.20}$$
Since $\rho_N$ and $\sigma_N$ are density matrices, the operators $A_{N,n}$ are positive. Hence, the argument of the last logarithm in Eq. (4.20) is greater than or equal to one and the logarithm, therefore, is nonnegative. This implies:
$$h_N(\rho_N, f) \ge h_N(\lambda\rho_N + (1 - \lambda)\sigma_N, f) + \ln(\lambda), \tag{4.21}$$
and with Eq. (4.18),
$$\liminf_{N\to\infty} h_N(\rho_N, f) \ge \liminf_{N\to\infty} h_N(\lambda\rho_N + (1 - \lambda)\sigma_N, f) + \ln(\lambda) = \lim_{N\to\infty} h_N(\rho, f) + \ln(\lambda). \tag{4.22}$$
Since $\lambda \in (0, 1)$ is arbitrary, we get $\liminf_{N\to\infty} h_N(\rho_N, f) \ge \lim_{N\to\infty} h_N(\rho, f)$. The other inequality (i.e. $\limsup_{N\to\infty} h_N(\rho_N, f) \le \lim_{N\to\infty} h_N(\rho, f)$) can be derived with the same argument, if we exchange the roles of $\rho$ and $\rho_N$ (i.e. apply Lemma 4.2 to $\rho_N^{(1)} = \rho_N$ and $\rho_N^{(2)} = \rho$ for all $N \in \mathbb{N}$). Hence, $\lim_{N\to\infty} h_N(\rho_N, f) = \lim_{N\to\infty} h_N(\rho, f)$ as stated. The equality $\lim_{N\to\infty} h_N(\rho, f) = h(\rho, f)$ follows from Varadhan's Theorem (Theorem A.3).

Now consider statement (2). If $\rho$ is degenerate, the method used above cannot be applied. However, if the rate function $I$ is sufficiently continuous, we can extend (parts of) the result derived for nondegenerate density matrices to the degenerate case. To this end, we need the following lemma:

Lemma 4.3. Consider a compact metric space $(X, d)$ and a lower semicontinuous function $F : X \times X \to [c, \infty]$, $c \in \mathbb{R}$. The infimum $\underline{F}(x) = \inf_{y\in X} F(x, y)$ is lower semicontinuous as well.
Proof. Due to lower semicontinuity of $F$, we find for each $(x, y) \in X \times X$ and each $\epsilon > 0$, a $\delta_{x,y} > 0$ with
$$d(x, x') < \delta_{x,y}, \quad d(y, y') < \delta_{x,y} \ \Rightarrow\ F(x', y') > F(x, y) - \epsilon. \tag{4.23}$$
Since $X$ is compact, there are for each fixed $x \in X$ finitely many points $y_1, \ldots, y_k \in X$ such that the neighborhoods $U_j = \{y' \in X \mid d(y', y_j) < \delta_{x,y_j}\}$ cover $X$. Now, define $\delta = \min_j \delta_{x,y_j} > 0$. For each $x'$ satisfying $d(x, x') < \delta$ and each $y' \in X$, there is a $j \in \{1, \ldots, k\}$ with $F(x', y') > F(x, y_j) - \epsilon$. Hence, $F(x', y') > \inf_y F(x, y) - \epsilon$ and we get
$$d(x, x') < \delta \ \Rightarrow\ \underline{F}(x') = \inf_{y'} F(x', y') > \inf_{y} F(x, y) - \epsilon. \tag{4.24}$$
Since $\delta > 0$, this shows that $\underline{F}$ is lower semicontinuous at $x$ and since $x$ is arbitrary, the statement follows.

Let us apply this lemma to $F(\rho, \sigma) = I(\rho, \sigma) + f(\sigma)$. Since $I$ is lower semicontinuous by assumption, we get for each $\epsilon > 0$, a $\delta > 0$ such that $\|\rho - \rho'\|_1 < \delta$ implies $h(\rho', f) > h(\rho, f) - \epsilon$. Together with the convexity of the $\delta$-ball around $\rho$, this implies
$$h(\lambda\rho + (1 - \lambda)\rho', f) > h(\rho, f) - \epsilon, \quad \forall \lambda \in (0, 1). \tag{4.25}$$
If $(\rho_N)_{N\in\mathbb{N}}$ is a sequence in $\mathcal{S}$ converging to $\rho$, the convex linear combinations $\lambda\rho_N + (1 - \lambda)\rho'$ converge to $\lambda\rho + (1 - \lambda)\rho'$. As in Eq. (4.22), we get
$$\liminf_{N\to\infty} h_N\left(\lambda\rho_N + (1 - \lambda)\rho', f\right) \le \liminf_{N\to\infty} h_N(\rho_N, f) - \ln(\lambda). \tag{4.26}$$
Now, assume without loss of generality that $\rho'$ is nondegenerate. Then, $\lambda\rho + (1 - \lambda)\rho'$ is nondegenerate as well and we have according to statement (1)
$$\liminf_{N\to\infty} h_N\left(\lambda\rho_N + (1 - \lambda)\rho', f\right) = h(\lambda\rho + (1 - \lambda)\rho', f) > h(\rho, f) - \epsilon. \tag{4.27}$$
Hence,
$$\liminf_{N\to\infty} h_N(\rho_N, f) \ge h(\rho, f) - \epsilon + \ln(\lambda). \tag{4.28}$$
Since $\epsilon > 0$ and $\lambda \in (0, 1)$ are arbitrary, the statement follows.

According to [13, Proposition 1.2.7], Lemma 4.1 implies immediately that the convergence $h_N \to h$ is uniform on each compact set of nondegenerate density matrices.

Proposition 4.4. Consider the same assumptions as in the preceding lemma and a compact set $K \subset \mathcal{S}$ consisting only of nondegenerate density matrices. Then, the convergence $h_N \to h$ is uniform on $K$, i.e.
$$\lim_{N\to\infty} \sup_{\rho\in K} |h_N(\rho, f) - h(\rho, f)| = 0 \tag{4.29}$$
holds.
Another simple consequence of Lemma 4.1 is the continuity of $h(\,\cdot\,, f)$ on the interior of $\mathcal{S}$. The proof is again omitted, since it can be taken without change from [13, first paragraph of the proof of Proposition 1.2.7].

Proposition 4.5. Consider again the assumptions from Lemma 4.1. The function $\mathcal{S} \ni \rho \mapsto h(\rho, f) \in \mathbb{R}$ is continuous at each nondegenerate $\rho$.

This is a somewhat surprising result, because it is derived without any further assumption on the rate function $I$. Although it does not imply that $I(\rho, \sigma)$ is continuous in $\rho$, it shows at least that the dependence of $I$ on the original density matrix $\rho$ is quite regular on the interior of the state space $\mathcal{S}$. On the boundary, however, nothing can be said. The discussion in Secs. 5.3 and 5.4 will indicate that this is probably a fundamental aspect of admissible rate functions and not just a problem of the methods used in the proofs.

Let us consider now the natural action of $\mathrm{U}(d)$ on the set $C(\mathcal{S})$ of continuous functions on $\mathcal{S}$, i.e. for each $U \in \mathrm{U}(d)$ and each $f \in C(\mathcal{S})$ define $\alpha_U f \in C(\mathcal{S})$ by $\alpha_U f(\sigma) = f(U\sigma U^*)$. Then, we can consider for each fixed $\rho \in \mathcal{S}$ and each $f$ the functions
$$\mathrm{U}(d) \ni U \mapsto h_N\left(U^*\rho U, \alpha_U f\right) \in \mathbb{R} \tag{4.30}$$
and
$$\mathrm{U}(d) \ni U \mapsto h\left(U^*\rho U, \alpha_U f\right) \in \mathbb{R}, \tag{4.31}$$
and pose the same question as above, but now considering the dependency on $U$ rather than on $\rho$. The following is the analog of Lemma 4.1.

Lemma 4.6. Consider an estimation scheme $(E_N)_{N\in\mathbb{N}}$ satisfying the LDP with rate function $I$, an arbitrary continuous (real-valued) function $f$ and the functionals $h_N$, $h$ defined in Eqs. (4.1) and (4.2).
(1) For each nondegenerate density matrix $\rho \in \mathcal{S}$ and each sequence $\mathbb{N} \ni N \mapsto U_N \in \mathrm{U}(d)$ converging to $U \in \mathrm{U}(d)$, we have
$$\lim_{N\to\infty} h_N\left(U_N^*\rho U_N, \alpha_{U_N} f\right) = \lim_{N\to\infty} h_N\left(U^*\rho U, \alpha_U f\right) = h\left(U^*\rho U, \alpha_U f\right). \tag{4.32}$$
(2) If $I$ is lower semicontinuous in both arguments, the lower bound
$$\liminf_{N\to\infty} h_N\left(U_N^*\rho U_N, \alpha_{U_N} f\right) \ge h\left(U^*\rho U, \alpha_U f\right) \tag{4.33}$$
holds even for degenerate $\rho$.

Proof. To prove statement (1), let us start with the observation that the function sequence $(\alpha_{U_M} f)_{M\in\mathbb{N}}$ converges uniformly to $\alpha_U f$: Due to the compactness of $\mathcal{S}$, the function $f$ is not just continuous but even uniformly continuous, i.e. for each $\epsilon > 0$, there is a $\delta > 0$ with
$$\|\sigma_1 - \sigma_2\|_1 < \delta \ \Rightarrow\ |f(\sigma_1) - f(\sigma_2)| < \epsilon. \tag{4.34}$$
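Convergence of $(U_M)_{M\in\mathbb{N}}$ implies the existence of $M_\epsilon \in \mathbb{N}$ with $M > M_\epsilon \Rightarrow \|U_M - U\| < \delta/2$. For each $\sigma$ and each $M > M_\epsilon$, we therefore get
$$\|U_M \sigma U_M^* - U\sigma U^*\|_1 \le \|U_M \sigma U_M^* - U_M \sigma U^*\|_1 + \|U_M \sigma U^* - U\sigma U^*\|_1 \tag{4.35}$$
$$\le \|U_M^* - U^*\|\, \|U_M \sigma\|_1 + \|U_M - U\|\, \|\sigma U^*\|_1 < \delta, \tag{4.36}$$
which implies together with (4.34) for an arbitrary $\sigma$ and $M > M_\epsilon$,
$$|\alpha_{U_M} f(\sigma) - \alpha_U f(\sigma)| = \left|f\left(U_M \sigma U_M^*\right) - f(U\sigma U^*)\right| < \epsilon. \tag{4.37}$$
In other words, the convergence $\alpha_{U_M} f \to \alpha_U f$ is uniform as stated (since $M_\epsilon$ does not depend on $\sigma$). To proceed, it is necessary to consider the following simple properties of the functionals $h_N$ and $h$: If $f$, $f_1$ denote continuous functions on $\mathcal{S}$ and $\epsilon \in \mathbb{R}$, we have for all $\rho$,
$$f \ge f_1 \Rightarrow h_N(\rho, f) \ge h_N(\rho, f_1) \quad \text{and} \quad h_N(\rho, f + \epsilon) = h_N(\rho, f) + \epsilon, \tag{4.38}$$
and from Lemma 4.1, we already know that for all $\epsilon > 0$ and all $f$, there is an $N[\epsilon, f] \in \mathbb{N}$ with
$$N > N[\epsilon, f] \ \Rightarrow\ \left|h_N\left(U_N^*\rho U_N, f\right) - h(U^*\rho U, f)\right| < \epsilon. \tag{4.39}$$
Uniform convergence $\alpha_{U_M} f \to \alpha_U f$ implies that $\alpha_U f - \epsilon \le \alpha_{U_M} f \le \alpha_U f + \epsilon$ holds for all $M > M_\epsilon$. Hence, for all $N \in \mathbb{N}$, we have
$$h_N\left(U_N^*\rho U_N, \alpha_U f - \epsilon\right) \le h_N\left(U_N^*\rho U_N, \alpha_{U_M} f\right) \le h_N\left(U_N^*\rho U_N, \alpha_U f + \epsilon\right) \tag{4.40}$$
according to (4.38). Together with (4.39), we get
$$N > N[\epsilon, \alpha_U f],\ M > M_\epsilon \ \Rightarrow\ \left|h_N\left(U_N^*\rho U_N, \alpha_{U_M} f\right) - h(U^*\rho U, \alpha_U f)\right| < 2\epsilon, \tag{4.41}$$
which implies Eq. (4.32). Statement (2) can be shown in the same way, if we replace Eq. (4.39) by (cf. Lemma 4.1)
$$N > N[\epsilon, f] \ \Rightarrow\ h_N\left(U_N^*\rho U_N, f\right) \ge h(U^*\rho U, f) - \epsilon \tag{4.42}$$
and use only the lower bound of (4.40).

As in the case of Lemma 4.1, we can now derive continuity and uniformity properties from this result. The following proposition is (again) an immediate consequence of [13, Proposition 1.2.7]. The proof is therefore omitted.

Proposition 4.7. Consider the same assumptions as in Lemma 4.6 and a nondegenerate density matrix $\rho$.
(1) The function
$$\mathrm{U}(d) \ni U \mapsto h\left(U^*\rho U, \alpha_U f\right) = \inf_{\sigma\in\mathcal{S}} \left(I\left(U^*\rho U, U^*\sigma U\right) + f(\sigma)\right) \in \mathbb{R} \tag{4.43}$$
is continuous.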
(2) The convergence of $h_N\left(U^*\rho U, \alpha_U f\right)$ to $h\left(U^*\rho U, \alpha_U f\right)$ is uniform in $U$, i.e.
$$\lim_{N\to\infty} \sup_{U\in\mathrm{U}(d)} \left|h_N\left(U^*\rho U, \alpha_U f\right) - h\left(U^*\rho U, \alpha_U f\right)\right| = 0 \tag{4.44}$$
holds.

4.2. Averaging

Let us consider now the question whether covariance and permutation invariance are "harmful" for the rate function; i.e. can we hope to exhaust the optimal upper bounds from Eq. (2.11) with schemes admitting these symmetry properties? One possible way to answer this question is to start with a general scheme $(E_N)_{N\in\mathbb{N}}$ and to average over the unitary and the permutation group. For the latter, this leads to
$$\bar E_N(\Delta) = \frac{1}{N!} \sum_{p\in S_N} V_p E_N(\Delta) V_p^*, \tag{4.45}$$
and since we have
$$\mathrm{tr}\left(\rho^{\otimes N} V_p E_N(\Delta) V_p^*\right) = \mathrm{tr}\left(V_p^* \rho^{\otimes N} V_p E_N(\Delta)\right) = \mathrm{tr}\left(\rho^{\otimes N} E_N(\Delta)\right) \tag{4.46}$$
for each permutation $p \in S_N$, we see that the rate function has not changed at all by this procedure. Hence, for the rest of this section, we can assume without loss of generality that each scheme is permutation invariant. This leads us to averages over the unitary group, i.e.
$$\bar E_N(\Delta) = \int_{\mathrm{U}(d)} U^{\otimes N} E_N(U^*\Delta U)\, U^{\otimes N *}\, dU. \tag{4.47}$$
Here, the situation is (unfortunately) different. The following proposition shows that the convergence behavior of $\bar E_N$ is in general worse than that of $E_N$.

Proposition 4.8. Consider an estimation scheme $(E_N)_{N\in\mathbb{N}}$ satisfying the LDP with rate function $I$ and the corresponding averaged scheme $(\bar E_N)_{N\in\mathbb{N}}$ from Eq. (4.47). For each nondegenerate density matrix $\rho$, the sequence of probability measures $\mathrm{tr}\left(\rho^{\otimes N} \bar E_N(\,\cdot\,)\right)$ satisfies the LDP with rate function $\bar I_\rho$ given by
$$\bar I_\rho(\sigma) = \bar I(\rho, \sigma) = \inf_{U\in\mathrm{U}(d)} I(U^*\rho U, U^*\sigma U). \tag{4.48}$$

Proof. It is sufficient to show that the measures $\mathrm{tr}\left(\rho^{\otimes N} \bar E_N(\,\cdot\,)\right)$ satisfy the Laplace principle with the same rate function (cf. Theorem A.5), because the Laplace principle is equivalent to the large deviation principle. Hence, we have to show that
$$\lim_{N\to\infty} -\frac{1}{N} \ln \int_{\mathcal{S}} e^{-N f(\sigma)}\, \mathrm{tr}\left(\rho^{\otimes N} \bar E_N(d\sigma)\right) = \inf_{\sigma\in\mathcal{S}} \left(f(\sigma) + \bar I(\rho, \sigma)\right) \tag{4.49}$$
holds for all continuous functions $f$ on $\mathcal{S}$. Inserting the definition of $\bar E_N$, we get
$$\int_{\mathcal{S}} e^{-N f(\sigma)}\, \mathrm{tr}\left(\rho^{\otimes N} \bar E_N(d\sigma)\right) = \int_{\mathrm{U}(d)} \int_{\mathcal{S}} e^{-N f(U\sigma U^*)}\, \mathrm{tr}\left((U^*\rho U)^{\otimes N} E_N(d\sigma)\right) dU, \tag{4.50}$$
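or with the notation from Sec. 4.1 (cf. Eqs. (4.1) and (4.30)),
$$\int_{\mathcal{S}} e^{-N f(\sigma)}\, \mathrm{tr}\left(\rho^{\otimes N} \bar E_N(d\sigma)\right) = \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU. \tag{4.51}$$
According to Proposition 4.7, the quantity $h_N(U^*\rho U, \alpha_U f)$ converges uniformly in $U$ to $h\left(U^*\rho U, \alpha_U f\right)$, i.e. for each $\epsilon > 0$, there is an $N_\epsilon \in \mathbb{N}$ such that
$$N > N_\epsilon \ \Rightarrow\ h\left(U^*\rho U, \alpha_U f\right) + \epsilon \ge h_N\left(U^*\rho U, \alpha_U f\right) \ge h\left(U^*\rho U, \alpha_U f\right) - \epsilon, \quad \forall U \in \mathrm{U}(d) \tag{4.52}$$
holds. Hence, for each $\epsilon > 0$, we get
$$\limsup_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N\left(h(U^*\rho U, \alpha_U f) + \epsilon\right)}\, dU \ge \limsup_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU. \tag{4.53}$$
From Proposition 4.7, we know that $h\left(U^*\rho U, \alpha_U f\right)$ is continuous in $U$ and we can apply Varadhan's Theorem (Theorem A.3) to the left-hand side of this inequality. Together with
$$\inf_{U\in\mathrm{U}(d)} h(U^*\rho U, \alpha_U f) = \inf_{U\in\mathrm{U}(d)} \inf_{\sigma\in\mathcal{S}} \left(I(U^*\rho U, \sigma) + f(U\sigma U^*)\right) \tag{4.54}$$
$$= \inf_{U\in\mathrm{U}(d)} \inf_{\sigma\in\mathcal{S}} \left(I(U^*\rho U, U^*\sigma U) + f(\sigma)\right) \tag{4.55}$$
$$= \inf_{\sigma\in\mathcal{S}} \left(\bar I(\rho, \sigma) + f(\sigma)\right), \tag{4.56}$$
this implies the upper bound
$$\limsup_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \le \inf_{\sigma\in\mathcal{S}} \left(\bar I(\rho, \sigma) + f(\sigma)\right) + \epsilon. \tag{4.57}$$
The lower bound
$$\liminf_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \ge \inf_{\sigma\in\mathcal{S}} \left(\bar I(\rho, \sigma) + f(\sigma)\right) - \epsilon \tag{4.58}$$
can be shown in the same way. Since $\epsilon > 0$ is arbitrary, Eq. (4.49) follows from (4.51), (4.57) and (4.58), which concludes the proof.

Hence, the best we can hope for is that the averaged scheme satisfies the LDP with rate function $\bar I$, which is actually the worst $\mathrm{U}(d)$-invariant rate function which can be derived from $I$. Only if $I$ is $\mathrm{U}(d)$-invariant itself (such that $\bar I = I$ holds) is the convergence behavior of $(\bar E_N)_{N\in\mathbb{N}}$ as good as that of $(E_N)_{N\in\mathbb{N}}$. The following proposition shows that at least in this case, the convergence problems on the boundary of $\mathcal{S}$ can be solved.

Proposition 4.9. If $(E_N)_{N\in\mathbb{N}}$ is an estimation scheme satisfying the LDP with a $\mathrm{U}(d)$-invariant, lower semicontinuous (in both arguments) rate function $I$, the averaged scheme $(\bar E_N)_{N\in\mathbb{N}}$ defined in Eq. (4.47) satisfies the LDP with the same rate function.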
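Proof. We will show again the alternative statement that the sequence $\mathrm{tr}\left(\rho^{\otimes N} \bar E_N(\,\cdot\,)\right)$ satisfies the Laplace principle, i.e. Eq. (4.49) holds for all continuous real-valued functions $f$ and with $\bar I$ replaced by $I$. As in the last proof, we can rewrite this in terms of the functionals $h_N$ and $h$ defined in Eqs. (4.1) and (4.2), i.e. we have to show that (cf. Eq. (4.51)),
$$\limsup_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \le h(\rho, f) \tag{4.59}$$
and
$$\liminf_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \ge h(\rho, f) \tag{4.60}$$
hold. However, now the convergence of $h_N\left(U^*\rho U, \alpha_U f\right)$ to $h(\rho, f)$ is only known to be pointwise (and not necessarily uniform) in $U$. Therefore, we cannot proceed as in Proposition 4.8. Instead, we will use different strategies for the upper and the lower bound.

To get the upper bound, note that $f$ is (as a continuous function on a compact set) bounded from above by a constant $K > 0$. Therefore, the functions $U \mapsto h_N\left(U^*\rho U, \alpha_U f\right)$ are bounded as well (by the same constant) and we get (note that $h\left(U^*\rho U, \alpha_U f\right) = h(\rho, f)$ holds for all $U$ by assumption),
$$\lim_{N\to\infty} \int_{\mathrm{U}(d)} \left|h_N\left(U^*\rho U, \alpha_U f\right) - h(\rho, f)\right| dU = 0 \tag{4.61}$$
by the dominated convergence theorem. Now, let us introduce for each $\epsilon > 0$ and each $N \in \mathbb{N}$, the set
$$\Delta_{N,\epsilon} = \left\{U \in \mathrm{U}(d) \,\middle|\, \left|h_N\left(U^*\rho U, \alpha_U f\right) - h(\rho, f)\right| > \epsilon\right\}. \tag{4.62}$$
From Eq. (4.61), we see that for each $\delta > 0$, there is an $N_\delta \in \mathbb{N}$ such that $N > N_\delta$ implies
$$\epsilon\, |\Delta_{N,\epsilon}| \le \int_{\mathrm{U}(d)} \left|h_N\left(U^*\rho U, \alpha_U f\right) - h(\rho, f)\right| dU < \delta, \tag{4.63}$$
where $|\Delta_{N,\epsilon}|$ denotes the volume of $\Delta_{N,\epsilon}$ with respect to the Haar measure (note that $\Delta_{N,\epsilon}$ is, due to continuity of $U \mapsto h_N\left(U^*\rho U, \alpha_U f\right)$, open and therefore measurable). Now, choose $\epsilon > 0$ arbitrarily and $\delta = \epsilon/2$, so that $|\Delta_{N,\epsilon}| < 1/2$; then we have for all $N > N_\delta$,
$$\int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \ge \int_{\mathrm{U}(d)\setminus\Delta_{N,\epsilon}} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \ge \frac{1}{2}\, e^{-N(h(\rho, f) + \epsilon)}, \tag{4.64}$$
where we have used the fact that $h_N\left(U^*\rho U, \alpha_U f\right) \le h(\rho, f) + \epsilon$ holds for all $U \notin \Delta_{N,\epsilon}$. Taking logarithms and the limit $N \to \infty$, this implies
$$\limsup_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \le h(\rho, f) + \epsilon. \tag{4.65}$$
Since $\epsilon > 0$ is arbitrary, we get the upper bound (4.59).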
To prove the lower bound, let us assume first that
$$\liminf_{N\to\infty} \inf_{U\in\mathrm{U}(d)} \left(h_N\left(U^*\rho U, \alpha_U f\right) - h(\rho, f)\right) \ge 0 \tag{4.66}$$
does not hold. Then, we can find a sequence $(U_N)_{N\in\mathbb{N}}$ of unitaries with
$$\liminf_{N\to\infty} \left(h_N\left(U_N^*\rho U_N, \alpha_{U_N} f\right) - h(\rho, f)\right) < 0. \tag{4.67}$$
However, due to compactness of $\mathrm{U}(d)$, we can assume without loss of generality that $(U_N)_{N\in\mathbb{N}}$ converges to a unitary $U$. Hence, Eq. (4.67) contradicts statement (2) of Lemma 4.6 (since the rate function $I$ is lower semicontinuous by assumption). Hence, Eq. (4.66) is valid and we can find for each $\epsilon > 0$, an $N_\epsilon \in \mathbb{N}$ such that $N > N_\epsilon$ implies
$$h_N\left(U^*\rho U, \alpha_U f\right) > h(\rho, f) - \epsilon, \quad \forall U \in \mathrm{U}(d). \tag{4.68}$$
Hence,
$$\liminf_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \ge h(\rho, f) - \epsilon. \tag{4.69}$$
Since $\epsilon > 0$ is arbitrary, we get the lower bound (4.60) and the proof is completed.
This result is very useful if we want to check whether a given rate function is admissible or not. Many prominent candidates are $\mathrm{U}(d)$-invariant and lower semicontinuous (like relative entropy), and in this case, it is according to Proposition 4.9 sufficient to consider only covariant schemes. Important examples of functions which can be tested this way are the optimal rate functions $I_{\mathrm{Id}}$ and $I^0_{\mathrm{Id}}$ (for $I_{\mathrm{Id}}$, this is true at least on the interior of $\mathcal{S}$):

Proposition 4.10. The optimal rate functions $I_{\mathrm{Id}}$ and $I^0_{\mathrm{Id}}$ are $\mathrm{U}(d)$-invariant (i.e. Eq. (2.10) holds with $\alpha_U(\sigma) = U\sigma U^*$).

Proof. Since $I_{\mathrm{Id}}$ and $I^0_{\mathrm{Id}}$ are defined as the upper bounds on $\mathcal{E}(\mathrm{Id})$ and $\mathcal{E}^0(\mathrm{Id})$, we have to show that these sets are invariant under the operation $I \mapsto I_U$ with $I_U(\rho, \sigma) = I(U\rho U^*, U\sigma U^*)$. Hence, consider $I \in \mathcal{E}(\mathrm{Id})$. Then, there is a full estimation scheme $(E_N)_{N\in\mathbb{N}}$ satisfying the LDP with rate function $I$. For each fixed $U \in \mathrm{U}(d)$, we can define the translated scheme $(E_N^U)_{N\in\mathbb{N}}$ with $E_N^U(\Delta) = U^{\otimes N *} E_N(U\Delta U^*)\, U^{\otimes N}$. If $\Delta$ is open, we get
$$\liminf_{N\to\infty} \frac{1}{N} \ln \mathrm{tr}\left(\rho^{\otimes N} E_N^U(\Delta)\right) = \liminf_{N\to\infty} \frac{1}{N} \ln \mathrm{tr}\left((U\rho U^*)^{\otimes N} E_N(U\Delta U^*)\right) \tag{4.70}$$
$$\le -\inf_{\sigma\in U\Delta U^*} I(U\rho U^*, \sigma) \tag{4.71}$$
$$= -\inf_{\sigma\in\Delta} I(U\rho U^*, U\sigma U^*). \tag{4.72}$$
This shows that the large deviation upper bound holds with rate function $I_U$. The lower bound can be shown in the same way. Hence, $(E_N^U)_{N\in\mathbb{N}}$ satisfies the LDP
with rate function $I_U$, and this implies $I_U \in \mathcal{E}(\mathrm{Id})$. Since the operation $I \mapsto I_U$ respects semicontinuity of $I$, invariance of $\mathcal{E}^0(\mathrm{Id})$ is trivial and this concludes the proof.

Summarizing the discussion of this subsection, we can conclude that averaging is, in the context of large deviations, not as powerful as it is in other areas like optimal cloning. Nevertheless, it is not completely useless either. In particular, the conjecture $I^0_{\mathrm{Id}} \in \mathcal{E}(p)$ is interesting in this regard, because it would imply that $I^0_{\mathrm{Id}}$ can be derived as the rate function of a covariant scheme. Hence, covariant schemes are an important special case (and therefore worth studying), although they probably cannot tell us the whole truth.

4.3. General structure

Now, let us have a look at the general structure of covariant and permutation invariant estimation schemes. Our main tool is the following theorem about covariant observables [24].

Theorem 4.11. Consider a compact group $G$ which acts transitively on a locally compact, separable metric space $X$ by $G \times X \ni (g, x) \mapsto \alpha_g(x)$, and a representation $\pi$ of $G$ on a Hilbert space $\mathcal{H}$. Each POV measure $E : \mathcal{B}(X) \to \mathcal{B}(\mathcal{H})$ which is covariant (i.e. $E(\alpha_g \Delta) = \pi(g) E(\Delta) \pi(g)^*$ for all $\Delta \in \mathcal{B}(X)$ and all $g \in G$) has the form
$$\int_X f(x)\, E(dx) = \int_G f(\alpha_g x_0)\, \pi(g) Q_0 \pi(g)^*\, \mu(dg), \tag{4.73}$$
where $x_0 \in X$ is an (arbitrary) reference point, $\mu$ is the Haar measure on $G$ and $Q_0 \in \mathcal{B}(\mathcal{H})$ a positive operator which is uniquely determined by (4.73) and the choice of $x_0$.

Unfortunately, this theorem is not applicable to our case, because the action of $\mathrm{U}(d)$ on $\mathcal{S}$ is not transitive. A way out of this dilemma is to look at the fibration $s : \mathcal{S} \to \Sigma$ defined in Eq. (2.3) and to apply the results about transitive group actions to each fiber separately. (For the rest of this section, we will frequently use the notations introduced in Sec. 3.1.)

Theorem 4.12. Each covariant and permutation invariant observable $E : \mathcal{B}(\mathcal{S}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ has the form (with a continuous function $f$ on $\mathcal{S}$)
$$\int_{\mathcal{S}} f(\rho)\, E(d\rho) = \bigoplus_{Y\in Y_d(N)} \int_{\mathrm{U}(d)} \pi_Y(U) \left[\int_\Sigma f\left(U\rho_x U^*\right) q_Y(dx)\right] \pi_Y(U)^*\, dU \otimes \mathbb{1}_Y \tag{4.74}$$
with a sequence of (non-normalized) POV measures $q_Y : \mathcal{B}(\Sigma) \to \mathcal{B}(\mathcal{H}_Y)$, the diagonal matrices $\rho_x = \mathrm{diag}(x_1, \ldots, x_d)$ from Eq. (3.8) and the unit matrix $\mathbb{1}_Y \in \mathcal{B}(\mathcal{K}_Y)$.
Proof. Permutation invariance implies immediately that
$$E_N(\Delta) = \bigoplus_{Y\in Y_d(N)} E_{N,Y}(\Delta) \otimes \mathbb{1}_Y \tag{4.75}$$
holds with $\mathbb{1}_Y \in \mathcal{B}(\mathcal{K}_Y)$ and a family of POV measures $E_{N,Y} : \mathcal{B}(\mathcal{S}) \to \mathcal{B}(\mathcal{H}_Y)$, which are again $\mathrm{U}(d)$-covariant:
$$E_{N,Y}(U\Delta U^*) = \pi_Y(U) E_{N,Y}(\Delta) \pi_Y(U)^*, \quad \forall U \in \mathrm{U}(d). \tag{4.76}$$
Hence, we only have to look at $E_{N,Y}$ for a fixed $Y \in Y_d(N)$, and the statement is a consequence of the following lemma.

Lemma 4.13. Each $\mathrm{U}(d)$-covariant observable $E : \mathcal{B}(\mathcal{S}) \to \mathcal{B}(\mathcal{H}_Y)$ has the form
$$\int_{\mathcal{S}} f(\rho)\, E(d\rho) = \int_{\mathrm{U}(d)} \pi_Y(U) \left[\int_\Sigma f\left(U\rho_x U^*\right) q(dx)\right] \pi_Y(U^*)\, dU \tag{4.77}$$
with an appropriate POV measure $q : \mathcal{B}(\Sigma) \to \mathcal{B}(\mathcal{H}_Y)$.

Proof. To each $\rho \in \mathcal{S}$, we can associate the stabilizer subgroup $G_\rho = \{U \in \mathrm{U}(d) \mid U\rho U^* = \rho\}$ of $\mathrm{U}(d)$, whose structure is uniquely determined by the degeneracy of the eigenvalues of $\rho$. Hence, the set
$$J = \{G_{\rho_x} \mid x \in \Sigma\} \quad \text{with } \rho_x = \mathrm{diag}(x_1, \ldots, x_d) \tag{4.78}$$
is finite and for each $\rho$, there is exactly one $G \in J$ such that $G_\rho = UGU^*$ holds with an appropriate unitary $U \in \mathrm{U}(d)$. Therefore, we can decompose $\mathcal{S}$ into a disjoint union $\mathcal{S} = \bigcup_{G\in J} \mathcal{S}_G$ of finitely many subsets^d
$$\mathcal{S}_G = \{\rho \in \mathcal{S} \mid \exists U \in \mathrm{U}(d) \text{ with } G_\rho = UGU^*\}; \tag{4.79}$$
and similarly, we have $\Sigma = \bigcup_G \Sigma_G$ with $\Sigma_G = s(\mathcal{S}_G)$. By construction, each orbit $s^{-1}(x)$, $x \in \Sigma_G$ is naturally homeomorphic to the homogeneous space $X_G = \mathrm{U}(d)/G$. Hence, there is a natural homeomorphism $\Phi_G : \Sigma_G \times X_G \to \mathcal{S}_G$ which is uniquely determined by
$$\Phi_G(x, [\mathbb{1}]) = \rho_x \quad \text{and} \quad \Phi_G(x, [V]) = V\rho_x V^*, \quad \forall x \in \Sigma_G,\ \forall [V] \in X_G. \tag{4.80}$$

d The decomposition of $\mathcal{S}$ into a finite union of fiber bundles we are describing here is a special case of a much more general result ("slice theorem") about compact G-manifolds; cf. [26].

Note that the crucial property of $\Phi_G$ is to intertwine the group actions $\rho \mapsto U\rho U^*$ and $[V] \mapsto [UV]$ of $\mathrm{U}(d)$ on $\mathcal{S}_G$ and $X_G$, respectively. The $\mathcal{S}_G$ are in general neither open nor closed, but they are Borel subsets of $\mathcal{S}$ (more precisely, differentiable submanifolds with boundary): Since $s$ is continuous, it is obviously sufficient to show that $\Sigma_G \in \mathcal{B}(\Sigma)$ holds. However, this follows from the fact that each $\Sigma_G$ can be expressed as the complement of a Borel set in a finite union of closed sets (this is easy to see but tedious to write down). $\mathcal{S}_G \in \mathcal{B}(\mathcal{S})$ now implies $\mathcal{B}(\mathcal{S}_G) = \{\Delta \cap \mathcal{S}_G \mid \Delta \in \mathcal{B}(\mathcal{S})\} \subset \mathcal{B}(\mathcal{S})$ and we can define the POV
measures $E_G : B(S_G) \to B(H_Y)$, $E_G(\Delta) = E(\Delta)$. Note that the $E_G$ are not normalized and some of them can vanish completely. Since we can reconstruct $E$ from the $E_G$ by $E(\Delta) = \sum_G E_G(\Delta \cap S_G)$, it is sufficient to prove the statement for each $G$ separately. In addition, we can use the homeomorphism $\Phi_G$ from Eq. (4.80) to identify $S_G$ with $\Sigma_G \times X_G$ and $E_G$ with a POVM on $B(\Sigma_G \times X_G)$ which is covariant with respect to the group action
$$\Sigma_G \times X_G \ni (x, [V]) \mapsto \alpha^G_U(x, [V]) = (x, [UV]) \in \Sigma_G \times X_G \tag{4.81}$$
of $\mathrm{U}(d)$, i.e.
$$E_G\big(\alpha^G_U \Delta\big) = \pi_Y(U) E_G(\Delta) \pi_Y(U)^*, \qquad \forall \Delta \in B(\Sigma_G \times X_G),\ \forall U \in \mathrm{U}(d). \tag{4.82}$$
This is a direct consequence of the intertwining property of $\Phi_G$ mentioned above. Now, let us consider the Abelian algebras $C(X_G)$ and $C(\Sigma_G)$ of continuous functions on $X_G$ and $\Sigma_G$. Each $h \in C(\Sigma_G)$ defines a positive linear map by
$$C(X_G) \ni k \mapsto \tilde{E}_{G,h}(k) = \int_{\Sigma_G \times X_G} h(x) k(y)\, E_G(dx \times dy) \in B(H_Y). \tag{4.83}$$
Positivity and linearity of $\tilde{E}_{G,h}$ imply that it can be expressed as an integral over $X_G$ with respect to a POV measure $E_{G,h}$,
$$\tilde{E}_{G,h}(k) = \int_{X_G} k(y)\, E_{G,h}(dy) \tag{4.84}$$
(this is a general property of positive maps on Abelian algebras; cf. [33]). From (4.82), it follows immediately that $E_{G,h}$ is covariant and we can apply Theorem 4.11, i.e. there is a positive operator $Q_G(h)$ such that
$$\tilde{E}_{G,h}(k) = \int_{\mathrm{U}(d)} k([U])\, \pi_Y(U) Q_G(h) \pi_Y(U^*)\, dU \tag{4.85}$$
holds. Note that the distinguished point $x_o$ from Theorem 4.11 is in our case $[\mathbb{1}] \in X_G$. Since $Q_G(h)$ is uniquely defined by this equation (cf. Theorem 4.11), we get another positive linear map $Q_G : C(\Sigma_G) \ni h \mapsto Q_G(h) \in B(H_Y)$ which can again be expressed as an integral
$$Q_G(h) = \int_{\Sigma_G} h(x)\, q_G(dx), \tag{4.86}$$
and we get
$$\int_{\Sigma_G \times X_G} f(x, y)\, E_G(dx \times dy) = \int_{\mathrm{U}(d)} \pi_Y(U) \left[ \int_{\Sigma_G} f(x, [U])\, q_G(dx) \right] \pi_Y(U^*)\, dU \tag{4.87}$$
for each $f$ of the form $f(x, y) = k(x) h(y)$ with $k \in C(\Sigma_G)$, $h \in C(X_G)$, and by linearity and continuity for each continuous $f$ on $\Sigma_G \times X_G$. Now, we can again
February 25, 2006 14:58 WSPC/148-RMP
J070-00256
Quantum State Estimation and Large Deviations
43
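apply the homeomorphism $\Phi_G$ to map $E_G$ back to a measure on $S_G$. Since $\Phi_G$ intertwines the action of $\mathrm{U}(d)$ on $S_G$ and $\Sigma_G \times X_G$, we get from (4.87),
$$\int_{S} f(\rho)\, E_G(d\rho) = \int_{\mathrm{U}(d)} \pi_Y(U) \left[ \int_{\Sigma} f\big(U \rho_x U^*\big)\, q_G(dx) \right] \pi_Y(U^*)\, dU. \tag{4.88}$$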
Hence, the statement of the lemma follows with $q(\Delta) = \sum_G q_G(\Delta \cap \Sigma_G)$.

Together with the decomposition of $E$ from Eq. (4.75), the statement of this lemma concludes the proof of the theorem.

4.4. An explicit scheme

The class of observables described in Theorem 4.12 is still quite big. To reduce the freedom of choice further, we can focus our attention on estimation schemes which coincide with $(\hat{F}_N)_{N \in \mathbb{N}}$ from Theorem 3.1, as long as only information about the spectrum of $\rho$ is required. In other words, $E_N$ should satisfy for all $N \in \mathbb{N}$
$$E_N\big(s^{-1}(\Delta)\big) = \hat{F}_N(\Delta), \qquad \forall \Delta \in B(\Sigma). \tag{4.89}$$
This leads to the following corollary.

Corollary 4.14. Each covariant and permutation invariant estimation scheme $(E_N)_{N \in \mathbb{N}}$ which satisfies Eq. (4.89) can be written as
$$\int_S f(\rho)\, E_N(d\rho) = \sum_{Y \in Y_d(N)} \int_{\mathrm{U}(d)} f\big(U \rho_{Y/N} U^*\big)\, U^{\otimes N} (Q_Y \otimes \mathbb{1}) U^{\otimes N *}\, dU, \tag{4.90}$$
with a family of operators $Q_Y \in B(H_Y)$.

Proof. Equation (4.89) implies immediately that the POV measures $q_Y$ from Proposition 4.12 are discrete, i.e.
$$q_Y = \sum_{Z \in Y_d(N)} q_{YZ}\, \delta_{Z/N}, \tag{4.91}$$
where $\delta_{Z/N}$ denotes the Dirac measure at $Z/N \in \Sigma$ and $q_{YZ} \in B(H_Y)$. Hence, $E_N$ becomes
$$\int_S f(\rho)\, E_N(d\rho) = \sum_{Y \in Y_d(N)} \int_{\mathrm{U}(d)} f\big(U \rho_{Y/N} U^*\big)\, U^{\otimes N} \tilde{Q}_Y U^{\otimes N *}\, dU, \tag{4.92}$$
with
$$\tilde{Q}_Y = \sum_{Z \in Y_d(N)} q_{ZY} \otimes \mathbb{1}. \tag{4.93}$$
Using the definition of $\hat{F}_N$ in Eq. (3.4) and again Eq. (4.89), we get
$$P_Y = \hat{F}_N(\{Y/N\}) = E_N\big(s^{-1}(Y/N)\big) = \int_{\mathrm{U}(d)} U^{\otimes N} \tilde{Q}_Y U^{\otimes N *}\, dU, \tag{4.94}$$
but this implies that $\tilde{Q}_Y$ must be of the form $\tilde{q}_Y \otimes \mathbb{1}$ with $\tilde{q}_Y \in B(H_Y)$. Hence, (4.93) implies $q_{ZY} = 0$ for $Y \neq Z$, which proves the corollary.

Since the estimation scheme $(\hat{F}_N)_{N \in \mathbb{N}}$ is asymptotically optimal, condition (4.89) looks at first glance very natural. In contrast to permutation invariance and covariance, however, we have no proof that it does not “harm” the rate function. In other words, the crucial question is: Given a covariant and permutation invariant estimation scheme $(E_N)_{N \in \mathbb{N}}$ satisfying the LDP with rate function $I$, does a scheme $(\tilde{E}_N)_{N \in \mathbb{N}}$ exist which satisfies Eq. (4.89) and the LDP with a rate function $\tilde{I}$ such that $I \leq \tilde{I}$ holds? A possible strategy towards a proof might be to define $\tilde{E}_N$ by Eq. (4.90) with $Q_Y = \int_\Sigma q_Y(dx)$ and the POV measures $q_Y$ which define $E_N$ according to Theorem 4.12. The hard part (which we have not solved up to now) is of course to show that the rate function $\tilde{I}$ of such a scheme is at least as good as $I$. If we accept condition (4.89) nevertheless, the estimation scheme $(\hat{E}_N)_{N \in \mathbb{N}}$ arises from Corollary 4.14 if we choose
$$Q_Y = \dim H_Y\, |\phi_Y\rangle\langle\phi_Y|, \tag{4.95}$$
where $\phi_Y$ denotes the highest weight vector of the irreducible representation $\pi_Y$. To see (heuristically) why this should be a good choice for the $Q_Y$, consider a nonsingular, diagonal density matrix $\rho = e^h$ with $h = \mathrm{diag}(h_1, \ldots, h_d)$ and $h_1 \geq \cdots \geq h_d$. Since $\hat{E}_N$ projects to $\hat{F}_N$, we know already that we get an exact estimate for the spectrum of $\rho$ in the limit $N \to \infty$. To get a consistent scheme, we need operators $Q_Y$ such that the quantities
$$\mathrm{tr}\big[\pi_Y(U^* \rho U) Q_Y\big] \dim K_Y = \mathrm{tr}\big[(U^* \rho U)^{\otimes N} (Q_Y \otimes \mathbb{1})\big] \tag{4.96}$$
(regarded as densities along the orbits $S_Y = s^{-1}(Y/N)$) are more and more concentrated on the density operators with the correct eigenvectors, i.e. to $\rho_{Y/N}$. Since $Y \in Y_d(N)$ is the highest weight of the irreducible representation $\pi_Y$ and $\phi_Y$ its highest weight vector, the highest eigenvalue of $\pi_Y(\rho)$ is given by $\exp\big(\sum_j Y_j h_j\big)$ and $\phi_Y \in H_Y$ is the corresponding eigenvector. All other eigenvalues grow with a lower exponential rate (or decay faster, depending on the chosen normalization). Therefore, the matrix element $\langle \phi_Y, \pi_Y(\rho) \phi_Y \rangle$ dominates all other eigenvalues in the limit $N \to \infty$. Hence, the density (4.96) has the desired behavior if we choose $Q_Y = |\phi_Y\rangle\langle\phi_Y|$. Note that the reasoning just sketched indicates that for any consistent scheme of the form (4.90), the overlap of the $Q_Y$ with $|\phi_Y\rangle\langle\phi_Y|$ should not decay too fast (at most polynomially). In the case of pure input states, we will make this reasoning more precise; cf. Sec. 5.3.

4.5. Proof of Theorem 3.2

Our next task is to prove Theorem 3.2, i.e. we have to show that the estimation scheme $\hat{E}_N$ defined in Eq. (3.7) satisfies the LDP with rate function $\hat{I}$ given in (3.13). The first step is to check that $\hat{I}$ is well defined.
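Lemma 4.15. There is a (unique) function $\hat{I}$ on $S \times S$ which satisfies
$$\hat{I}\big(\rho, U \rho_x U^*\big) = \sum_{j=1}^d x_j \ln(x_j) - I_1(\rho, U, x) \quad\text{and}\quad I_1(\rho, U, x) = \sum_{j=1}^d (x_j - x_{j+1}) \ln\big[\mathrm{pm}_j(U^* \rho U)\big], \tag{4.97}$$
where we have set $x_{d+1} = 0$. $\hat{I}$ is positive and $\hat{I}(\rho, \sigma) = 0$ implies $\sigma = \rho$.

Proof. To prove that $\hat{I}$ is well defined, we have to show that $U_1 \rho_x U_1^* = U_2 \rho_x U_2^*$ implies $I_1(\rho, U_1, x) = I_1(\rho, U_2, x)$. This is equivalent to $[U, \rho_x] = 0 \Rightarrow I_1(\rho, U, x) = I_1(\rho, \mathbb{1}, x)$. To exploit the relation $[U, \rho_x] = 0$, let us introduce $k \leq d$ integers $1 = j_0 < j_1 < \cdots < j_k = d + 1$ such that $x_{j_\alpha} > x_{j_{\alpha+1}}$ and $x_j = x_{j_\alpha} > 0$ holds for $j_\alpha \leq j < j_{\alpha+1}$ and $\alpha < k$. Then, we have
$$I_1(\rho, U, x) = \sum_{\alpha=1}^k \big(x_{j_\alpha - 1} - x_{j_\alpha}\big) \ln\big[\mathrm{pm}_{j_\alpha - 1}(U^* \rho U)\big]. \tag{4.98}$$
On the other hand, $[U, \rho_x] = 0$ implies that $U$ is block diagonal,
$$U = \mathrm{diag}(U_0, \ldots, U_{k-1}) \quad\text{with}\quad U_\alpha \in \mathrm{U}(d_\alpha),\quad d_\alpha = j_{\alpha+1} - j_\alpha. \tag{4.99}$$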
Hence, we have $\mathrm{pm}_{j_\alpha - 1}(U^* \rho U) = \mathrm{pm}_{j_\alpha - 1}(\rho)$ for all such $U$ and all $\alpha$ with $1 \leq \alpha \leq k$. Together with Eq. (4.98), this shows that $\hat{I}$ is well defined.

To prove positivity, we have to show that $\inf_U \hat{I}(\rho, U \rho_x U^*) \geq 0$ holds for each $\rho$ and $x$. Hence, we have to minimize $\hat{I}$ (for fixed $x$ and $\rho$) and since $x_j \geq x_{j+1}$, this implies that we have to maximize the minors of $U^* \rho U$. To this end, let us denote the eigenvalues of $\rho$ and of the upper left $j \times j$ submatrix of $U^* \rho U$ by $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_d$ and $\lambda_1^{(j)} \geq \lambda_2^{(j)} \geq \cdots \geq \lambda_j^{(j)}$, respectively. The minors of $U^* \rho U$ then become $\mathrm{pm}_j(U^* \rho U) = \lambda_1^{(j)} \cdots \lambda_j^{(j)}$. According to [25, Theorem 4.3.15], the $\lambda_k^{(j)}$ satisfy the constraint $\lambda_k \geq \lambda_k^{(j)}$ for all $k = 1, \ldots, j$, and this bound is (obviously) saturated if $U^* \rho U$ is diagonal in the preferred basis. Hence, we get $\mathrm{pm}_j(U^* \rho U) \leq \lambda_1 \cdots \lambda_j$ and therefore,
$$\hat{I}(\rho, U \rho_x U^*) \geq \sum_{j=1}^d x_j \ln(x_j) - \sum_{j=1}^d (x_j - x_{j+1}) \ln(\lambda_1 \cdots \lambda_j). \tag{4.100}$$
Expanding the logarithms and reshuffling the second sum leads to
$$\hat{I}\big(\rho, U \rho_x U^*\big) \geq \sum_{j=1}^d x_j \big(\ln(x_j) - \ln(\lambda_j)\big), \tag{4.101}$$
and equality holds iff $\rho$ and $\sigma = U \rho_x U^*$ are simultaneously diagonalizable. Since the right-hand side of this inequality is a relative entropy of classical probability distributions, we see that $\hat{I}$ is positive and $\hat{I}(\rho, \sigma) = 0$ holds iff $\sigma = \rho$.

Now, let us show that $(\hat{E}_N)_{N \in \mathbb{N}}$ satisfies the LDP with rate function $\hat{I}$. As in the proof of Proposition 4.9, we will do this by proving the equivalent statement
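that $(\hat{E}_N)_{N \in \mathbb{N}}$ satisfies the Laplace principle with the same rate function, i.e.
$$\lim_{N \to \infty} \frac{-1}{N} \ln \int_S e^{-N f(\sigma)}\, \mathrm{tr}\big[\rho^{\otimes N} \hat{E}_N(d\sigma)\big] = \inf_{\sigma \in S} \big(f(\sigma) + \hat{I}(\rho, \sigma)\big) \tag{4.102}$$
should hold for all continuous functions $f$ on $S$. If we insert the definition of $\hat{E}_N$, the integral on the left-hand side becomes
$$\int_S e^{-N f(\sigma)}\, \mathrm{tr}\big[\rho^{\otimes N} \hat{E}_N(d\sigma)\big] = \sum_{Y \in Y_d(N)} \dim H_Y \int_{\mathrm{U}(d)} e^{-N f(U \rho_{Y/N} U^*)}\, \mathrm{tr}\big[(U^* \rho U)^{\otimes N} \big(|\phi_Y\rangle\langle\phi_Y| \otimes \mathbb{1}_Y\big)\big]\, dU, \tag{4.103}$$
where $\mathbb{1}_Y$ denotes the unit operator on $K_Y$. Now, assume that $\rho$ is nondegenerate (i.e. $\rho \in \mathrm{GL}(d, \mathbb{C})$); then we can rewrite the density in this integral as
$$\mathrm{tr}\big[(U^* \rho U)^{\otimes N} \big(|\phi_Y\rangle\langle\phi_Y| \otimes \mathbb{1}_Y\big)\big] = \mathrm{tr}\big[P_Y (U^* \rho U)^{\otimes N} P_Y \big(|\phi_Y\rangle\langle\phi_Y| \otimes \mathbb{1}_Y\big)\big] \tag{4.104}$$
$$= \dim K_Y\, \mathrm{tr}\big[\pi_Y(U^* \rho U) |\phi_Y\rangle\langle\phi_Y|\big] \tag{4.105}$$
$$= \dim K_Y\, \langle \phi_Y, \pi_Y(U^* \rho U) \phi_Y \rangle, \tag{4.106}$$
where we have used in the second equation that $P_Y (U^* \rho U)^{\otimes N} P_Y = \pi_Y(U^* \rho U) \otimes \mathbb{1}_Y$ holds. The matrix elements of $\pi_Y(U^* \rho U)$ with respect to the highest weight vector can be expressed as ([38, §49] or [34, Sec. IX.8])
$$\langle \phi_Y, \pi_Y(U^* \rho U) \phi_Y \rangle = \prod_{k=1}^d \mathrm{pm}_k(U^* \rho U)^{Y_k - Y_{k+1}}, \tag{4.107}$$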
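where we have set $Y_{d+1} = 0$. The right-hand side of this equation makes sense even if the exponents are not integer valued. We can therefore rewrite Eq. (4.103) with the probability measure
$$\int_\Sigma h(x)\, \nu_N(dx) = \frac{1}{d^N} \sum_{Y \in Y_d(N)} h\!\left(\frac{Y}{N}\right) \dim(H_Y) \dim(K_Y) \tag{4.108}$$
to get
$$\int_S e^{-N f(\sigma)}\, \mathrm{tr}\big[\rho^{\otimes N} \hat{E}_N(d\sigma)\big] \tag{4.109}$$
$$= \int_\Sigma \int_{\mathrm{U}(d)} d^N e^{-N f(U \rho_x U^*)} \prod_{k=1}^d \mathrm{pm}_k(U^* \rho U)^{N (x_k - x_{k+1})}\, dU\, \nu_N(dx) \tag{4.110}$$
$$= \int_\Sigma \int_{\mathrm{U}(d)} \exp\Big(-N \big[f\big(U \rho_x U^*\big) - \ln(d) - I_1(\rho, U, x)\big]\Big)\, dU\, \nu_N(dx), \tag{4.111}$$
where
$$I_1(\rho, U, x) = \sum_{k=1}^d (x_k - x_{k+1}) \ln\big[\mathrm{pm}_k(U^* \rho U)\big] \tag{4.112}$$
is the function from Eq. (4.97). Now, we need the following lemma.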
Lemma 4.16. The probability measures $\nu_N$ defined in Eq. (4.108) satisfy the large deviation principle with rate function
$$I_0(x) = \ln(d) + \sum_{j=1}^d x_j \ln(x_j). \tag{4.113}$$
Proof. This follows immediately from Theorem 3.1 with $\rho = \mathbb{1}/d$ (cf. also [12]).
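Lemma 4.16 can be checked numerically in the simplest case. The following Python sketch is only an illustration under stated assumptions: it takes $d = 2$, so that Young frames are pairs $(N-k, k)$, and it uses the standard dimension formulas $\dim H_Y = Y_1 - Y_2 + 1$ and $\dim K_Y = \binom{N}{k} - \binom{N}{k-1}$, which are not stated in the text and are brought in here as assumptions. All function names are hypothetical.

```python
import math

def young_frames_d2(N):
    """All two-row Young frames (Y1, Y2) with Y1 >= Y2 >= 0 and Y1 + Y2 = N."""
    return [(N - k, k) for k in range(N // 2 + 1)]

def dim_H(Y):
    """Dimension of the U(2) irrep with highest weight Y (assumed formula)."""
    return Y[0] - Y[1] + 1

def dim_K(Y):
    """Dimension of the S_N irrep for the two-row frame Y (assumed formula)."""
    N, k = Y[0] + Y[1], Y[1]
    return math.comb(N, k) - (math.comb(N, k - 1) if k > 0 else 0)

def total_weight(N):
    """Sum of dim(H_Y) * dim(K_Y) over all frames; should equal d^N = 2^N."""
    return sum(dim_H(Y) * dim_K(Y) for Y in young_frames_d2(N))

def log_nu(Y):
    """ln nu_N({Y/N}) for d = 2, following Eq. (4.108)."""
    N = Y[0] + Y[1]
    return math.log(dim_H(Y) * dim_K(Y)) - N * math.log(2)

def I0(x):
    """Rate function of Eq. (4.113) for d = 2, with the convention 0 ln 0 = 0."""
    return math.log(2) + sum(t * math.log(t) for t in x if t > 0)

if __name__ == "__main__":
    N = 2000
    Y = (1500, 500)  # corresponds to x = (0.75, 0.25)
    print("normalized:", total_weight(20) == 2 ** 20)
    print("-(1/N) ln nu_N:", -log_nu(Y) / N)
    print("I0(x):         ", I0((0.75, 0.25)))
```

The two printed rates should agree up to corrections of order $\ln(N)/N$, which is what the large deviation principle predicts for a single atom of $\nu_N$.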
Obviously, the product measure $\nu_N(dx) \times dU$ satisfies the LDP with the same rate function. Moreover, the function in the argument of the exponential in Eq. (4.111) is continuous in $x$ and $U$. Hence, we can apply Varadhan’s Theorem to Eq. (4.111) and get
$$\lim_{N \to \infty} \frac{-1}{N} \ln \int_S e^{-N f(\sigma)}\, \mathrm{tr}\big[\rho^{\otimes N} \hat{E}_N(d\sigma)\big] \tag{4.114}$$
$$= \inf_{x, U} \Big[ f\big(U \rho_x U^*\big) - \ln(d) - I_1(\rho, U, x) + I_0(x) \Big] \tag{4.115}$$
$$= \inf_{x, U} \Big[ f\big(U \rho_x U^*\big) + \sum_{j=1}^d x_j \ln(x_j) - I_1(\rho, U, x) \Big], \tag{4.116}$$
which proves Theorem 3.2 for nondegenerate density matrices.

Now, assume that $\rho$ is degenerate and has rank $r < d$. By continuity in $\rho$, Eqs. (4.106) and (4.107) imply that
$$\mathrm{tr}\big[(U^* \rho U)^{\otimes N} \big(|\phi_Y\rangle\langle\phi_Y| \otimes \mathbb{1}_Y\big)\big] = \dim K_Y \prod_{k=1}^d \mathrm{pm}_k(U^* \rho U)^{Y_k - Y_{k+1}} \tag{4.117}$$
holds as in the nondegenerate case. The only difference is that the right-hand side can vanish now, and it vanishes in particular for all $Y$ with $Y_k > 0$ for $k > r$ (because all minors with $k > r$ vanish for any $U$). Instead of (4.110), we therefore get
$$\int_S e^{-N f(\sigma)}\, \mathrm{tr}\big[\rho^{\otimes N} \hat{E}_N(d\sigma)\big] = \int_{\Sigma_r} \int_{\mathrm{U}(d)} r^N e^{-N f(U \rho_x U^*)} \prod_{k=1}^r \mathrm{pm}_k(U^* \rho U)^{N (x_k - x_{k+1})}\, dU\, \nu_{N,r}(dx) \tag{4.118}$$
with
$$\Sigma_r = \{x \in \Sigma \mid x_k = 0\ \ \forall k > r\} \tag{4.119}$$
and
$$\int_{\Sigma_r} h(x)\, \nu_{N,r}(dx) = \frac{1}{d^N} \sum_{Y \in Y_r(N)} h\!\left(\frac{Y}{N}\right) \dim(H_Y) \dim(K_Y). \tag{4.120}$$
Note that the difference between $\nu_N$ and $\nu_{N,r}$ is just the summation over all Young frames with $r$ rows instead of $d$ rows. The right-hand side of Eq. (4.117) can still
vanish because the unitary matrix $U$ is a $d \times d$ matrix. Hence, we can exclude
$$M = \{U \in \mathrm{U}(d) \mid \mathrm{pm}_r(U^* \rho U) = 0\} \tag{4.121}$$
from the domain of integration without changing the value of the integral in (4.118). Hence, we get
$$\int_S e^{-N f(\sigma)}\, \mathrm{tr}\big[\rho^{\otimes N} \hat{E}_N(d\sigma)\big] = \int_{\Sigma_r} \int_{\mathrm{U}(d) \setminus M} \exp\Big(-N \big[f\big(U \rho_x U^*\big) - \ln(r) - I_1(\rho, U, x)\big]\Big)\, dU\, \nu_{N,r}(dx). \tag{4.122}$$
The domain $\Sigma_r \times (\mathrm{U}(d) \setminus M)$ is open in $\Sigma_r \times \mathrm{U}(d)$ and $I_1$ is continuous on it. Hence, we can apply Varadhan’s Theorem and proceed as in the nondegenerate case.

5. Upper Bounds

In this section, we will provide a detailed discussion of general upper bounds on admissible rate functions. This includes in particular the proofs of Theorems 3.1 and 3.3.

5.1. Hypothesis testing

Let us start with a very brief review of some material from quantum hypothesis testing (for a detailed discussion cf. [22, 24, 18]), because it can be used to derive related results for estimation schemes. As in state estimation, the task of hypothesis testing is to determine a state from measurements on $N$ systems. In hypothesis testing, however, we know a priori that only a finite number of different states can occur. For our purposes, it is sufficient to distinguish only between two states $\rho_0, \rho_1 \in S$. This can be done by an observable of the $N$-fold system with values in the set $\{0, 1\}$, where we conclude from the outcome $j \in \{0, 1\}$ that the initial preparation was done according to $\rho_j$. Mathematically, such an observable is given by a positive operator $A_N \in B(H^{\otimes N})$ with $A_N \leq \mathbb{1}$, and $\mathrm{tr}\big(\rho_j^{\otimes N} A_N\big)$ is the probability to get the result 0 during a measurement on $N$ systems in the joint state $\rho_j^{\otimes N}$. Hence, the two quantities
$$\alpha_N(A_N) = \mathrm{tr}\big(\rho_0^{\otimes N} (\mathbb{1} - A_N)\big), \qquad \beta_N(A_N) = \mathrm{tr}\big(\rho_1^{\otimes N} A_N\big) \tag{5.1}$$
are error probabilities. More precisely, $\alpha_N(A_N)$ is the probability to detect $\rho_1$ although the initial preparation was given by $\rho_0^{\otimes N}$ (error of the first kind) and $\beta_N(A_N)$ is the probability for the converse situation (error of the second kind). Ideally, we would like to have a test which minimizes both $\alpha_N$ and $\beta_N$. This is however impossible because we can always reduce one quantity at the expense of the other. A possible solution of this problem is to make $\beta_N(A_N)$ as small as possible under the constraint that $\alpha_N(A_N)$ remains bounded by some $\epsilon > 0$. The corresponding minimal (second kind) error probability is therefore
$$\beta_N^*(\epsilon) = \inf\{\beta_N(A_N) \mid A_N \in B(H^{\otimes N}),\ 0 \leq A_N \leq \mathbb{1},\ \alpha_N(A_N) \leq \epsilon\}. \tag{5.2}$$
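The trade-off between the two error probabilities in (5.1) can be made concrete in a commuting (classical) toy case. The sketch below is an illustration only, under assumptions that are not part of the text: $\rho_0 = \mathrm{diag}(a, 1-a)$ and $\rho_1 = \mathrm{diag}(b, 1-b)$ with $a > b$, and the tests are restricted to deterministic threshold tests on the number $k$ of outcomes "0" (so the minimum found is an upper bound on the true $\beta_N^*(\epsilon)$ of (5.2), which would also allow randomized tests). All names are hypothetical.

```python
import math

def binom_pmf(n, k, p):
    """Probability of k successes in n independent trials with success probability p."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def error_probabilities(n, t, a, b):
    """Threshold test: decide for rho_0 iff the count k of '0' outcomes is >= t.

    alpha = probability under rho_0 of deciding for rho_1 (k < t),
    beta  = probability under rho_1 of deciding for rho_0 (k >= t).
    """
    alpha = sum(binom_pmf(n, k, a) for k in range(t))
    beta = sum(binom_pmf(n, k, b) for k in range(t, n + 1))
    return alpha, beta

def best_threshold_beta(n, eps, a, b):
    """Smallest beta among threshold tests whose alpha does not exceed eps."""
    candidates = (error_probabilities(n, t, a, b) for t in range(n + 2))
    return min(beta for alpha, beta in candidates if alpha <= eps)

if __name__ == "__main__":
    a, b, eps = 0.8, 0.4, 0.1
    for n in (20, 40, 80, 160):
        beta = best_threshold_beta(n, eps, a, b)
        print(n, beta, -math.log(beta) / n)
```

The last printed column is the empirical decay rate of the second-kind error for this restricted family of tests; it is shown only to make the exponential decay visible, not as a proof of any limit.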
Stein’s Lemma describes the behavior of $\beta_N^*(\epsilon)$ in the limit $N \to \infty$; the quantum version is shown in [23, 32].

Theorem 5.1 (Quantum Stein’s Lemma). For any $0 < \epsilon < 1$, the equality
$$\lim_{N \to \infty} \frac{1}{N} \ln \beta_N^*(\epsilon) = -S(\rho_1, \rho_0) \tag{5.3}$$
holds.

5.2. State estimation

Let us consider now a (full) estimation scheme $(E_N)_{N \in \mathbb{N}}$. One possibility to distinguish between the two states $\rho$ and $\sigma$ is to choose a neighborhood $\Delta \in B(S)$ of $\sigma$ with $\rho \notin \Delta$ and to use the tests $A_N = E_N(\Delta)$. If $(E_N)_{N \in \mathbb{N}}$ is consistent, the corresponding first kind error probability $\alpha_N(A_N)$ vanishes in the limit $N \to \infty$ and we can apply Stein’s Lemma to get a bound on $\beta_N(A_N) = \mathrm{tr}\big(\rho^{\otimes N} E_N(\Delta)\big)$. Exploiting this idea more carefully leads to the following theorem.

Theorem 5.2. Consider a continuous map $p : S \to X$ onto a locally compact, separable metric space $X$. The optimal rate function $I_p$ defined in Eq. (2.11) satisfies the inequality
$$I_p(\rho, x) \leq \inf_{\sigma \in p^{-1}(x)} S(\rho, \sigma), \qquad \forall \rho \in S,\ \forall x \in X, \tag{5.4}$$
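where $S$ denotes the quantum relative entropy.

Proof. For each pair $\rho_0, \rho_1$ of density operators with $p(\rho_0) \neq p(\rho_1)$, we can find a sequence of tests $(A_N)_{N \in \mathbb{N}}$ by $A_N = E_N(\Delta)$ with an appropriate Borel set $\Delta \subset X$. If $\Delta \in B(X)$ is a neighborhood of $p(\rho_0)$, consistency of $(E_N)_{N \in \mathbb{N}}$ implies that for all $\epsilon > 0$, there is an $N_\epsilon \in \mathbb{N}$ such that
$$\alpha_N(A_N) = 1 - \mathrm{tr}\big(E_N(\Delta) \rho_0^{\otimes N}\big) < \epsilon \tag{5.5}$$
holds for all $N > N_\epsilon$. Hence, Stein’s Lemma implies
$$\limsup_{N \to \infty} \frac{-1}{N} \ln \beta_N(A_N) = \limsup_{N \to \infty} \frac{-1}{N} \ln \mathrm{tr}\big(\rho_1^{\otimes N} E_N(\Delta)\big) \leq S(\rho_1, \rho_0). \tag{5.6}$$
Now, assume that the rate function $I$ satisfies $I(\rho_1, x_0) > S(\rho_1, \rho_0)$ for some $\rho_0, \rho_1$ with $p(\rho_0) = x_0$ and $p(\rho_1) \neq x_0$. Since $I(\rho_1, \cdot\,)$ is lower semicontinuous, we find a closed neighborhood $\Delta$ of $x_0$ such that
$$I(\rho_1, x) \geq S(\rho_1, \rho_0) + \delta, \qquad \forall x \in \Delta \tag{5.7}$$
holds for an appropriate $\delta > 0$. Hence, the large deviation upper bound (A.1) implies
$$\limsup_{N \to \infty} \frac{1}{N} \ln \mathrm{tr}\big(\rho_1^{\otimes N} E_N(\Delta)\big) \leq -\inf_{x \in \Delta} I(\rho_1, x), \tag{5.8}$$
$$\liminf_{N \to \infty} \frac{-1}{N} \ln \mathrm{tr}\big(\rho_1^{\otimes N} E_N(\Delta)\big) \geq \inf_{x \in \Delta} I(\rho_1, x) \geq S(\rho_1, \rho_0) + \delta \tag{5.9}$$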
in contradiction to Eq. (5.6). Hence, $I(\rho_1, x_0) \leq S(\rho_1, \rho_0)$ for all $\rho_0$ with $p(\rho_0) = x_0$, which concludes the proof.

Proof of Theorem 3.3. If we apply this theorem to full estimation schemes (i.e. $X = S$ and $p = \mathrm{Id}$), we get $I(\rho, \sigma) \leq S(\rho, \sigma)$ $\forall \rho, \sigma \in S$ and Theorem 3.3 follows as a simple corollary.

Proof of Theorem 3.1. For a spectral estimation scheme with rate function $I$, Theorem 5.2 implies that $I(\rho, x) \leq \inf_{s(\sigma) = x} S(\rho, \sigma)$ holds. However, the infimum on the right-hand side is achieved if $\sigma$ and $\rho$ commute and the eigenvalues in a joint eigenbasis are given in the same order. In this case, we have
$$S(\rho, \sigma) = \sum_{j=1}^d x_j (\ln x_j - \ln r_j) = S(r, x), \tag{5.10}$$
where $s(\sigma) = x = (x_1, \ldots, x_d)$ and $s(\rho) = r = (r_1, \ldots, r_d)$ denote the ordered spectra of $\sigma$ and $\rho$, and $S(r, x)$ is the classical relative entropy of the probability vectors $r$ and $x$. Hence, for spectral estimation, the upper bound (5.4) becomes
$$I(\rho, x) \leq S(s(\rho), x), \qquad \forall \rho \in S,\ \forall x \in \Sigma. \tag{5.11}$$
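The role of the eigenvalue ordering in (5.10) can be illustrated for commuting states only. The following sketch assumes that $\sigma$ is diagonal in the eigenbasis of $\rho$, so that the only remaining freedom is a permutation of the eigenvalues; it does not address non-commuting $\sigma$. The function names are hypothetical, and the sign convention follows (5.10), i.e. $S(r, x) = \sum_j x_j (\ln x_j - \ln r_j)$.

```python
import math
from itertools import permutations

def relative_entropy_permuted(x, r, perm):
    """sum_j x_j (ln x_j - ln r_{perm(j)}), with the convention 0 ln 0 = 0."""
    return sum(xj * (math.log(xj) - math.log(r[pj]))
               for xj, pj in zip(x, perm) if xj > 0)

def classical_relative_entropy(x, r):
    """S(r, x) in the convention of Eq. (5.10): identity pairing of the spectra."""
    return relative_entropy_permuted(x, r, tuple(range(len(x))))

def best_pairing(x, r):
    """Permutation of the eigenvalues of rho that minimizes the expression above."""
    return min(permutations(range(len(x))),
               key=lambda perm: relative_entropy_permuted(x, r, perm))

if __name__ == "__main__":
    x = (0.6, 0.3, 0.1)   # ordered spectrum of sigma
    r = (0.5, 0.3, 0.2)   # ordered spectrum of rho
    print("minimizing pairing:", best_pairing(x, r))
    print("S(r, x):", classical_relative_entropy(x, r))
```

For spectra sorted in decreasing order, the minimizing pairing is the identity, in line with the statement that the eigenvalues should be given in the same order.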
However, from [27], we know already that the scheme $(\hat{F}_N)_{N \in \mathbb{N}}$ defined in (3.4) saturates this bound; hence, $(\hat{F}_N)_{N \in \mathbb{N}}$ is asymptotically optimal as stated in Theorem 3.1.

If we are looking at full estimation in particular, the method used in the proof of Theorem 5.2 can be improved significantly. The following lemma, which expresses the rate function explicitly as a limit over a sequence of operators, is of great use in the next subsection.

Lemma 5.3. Consider a full estimation scheme $(E_N)_{N \in \mathbb{N}}$ satisfying the LDP with rate function $I : S \times S \to [0, \infty]$ and two states $\rho, \sigma \in S$. There is a sequence $(\Delta_N)_{N \in \mathbb{N}}$ of Borel sets $\Delta_N \subset S$ satisfying
$$\lim_{N \to \infty} \mathrm{tr}\big(\sigma^{\otimes N} E_N(\Delta_N)\big) = 1, \tag{5.12}$$
$$\lim_{N \to \infty} \frac{-1}{N} \ln \mathrm{tr}\big(\rho^{\otimes N} E_N(\Delta_N)\big) = I(\rho, \sigma) \tag{5.13}$$
and
$$U \Delta_N U^* = \Delta_N, \qquad \forall U \in \mathrm{U}(d) \text{ with } [U, \sigma] = 0. \tag{5.14}$$
Proof. For each $k \in \mathbb{N}$, consider the set
$$\tilde{\Delta}_k = \{\omega \in S \mid \|\sigma - \omega\|_1 \leq k^{-1}\} \subset S, \tag{5.15}$$
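which obviously has the symmetry property (5.14). Since the scheme $(E_N)_{N \in \mathbb{N}}$ is consistent (since $(E_N)_{N \in \mathbb{N}}$ satisfies the LDP, this follows directly from Definition 2.2), we have for each $k \in \mathbb{N}$ an index $N_k' \in \mathbb{N}$ such that
$$\mathrm{tr}\big(\sigma^{\otimes N} E_N(\tilde{\Delta}_k)\big) \geq 1 - \frac{1}{k} \tag{5.16}$$
holds for all $N \geq N_k'$. In addition, we get for each $k \in \mathbb{N}$,
$$\lim_{N \to \infty} \frac{-1}{N} \ln \mathrm{tr}\big(\rho^{\otimes N} E_N(\tilde{\Delta}_k)\big) = \inf_{\omega \in \tilde{\Delta}_k} I(\rho, \omega) \tag{5.17}$$
by combining the large deviation upper and lower bounds. Hence, for each $k \in \mathbb{N}$, there is an $N_k'' \in \mathbb{N}$ with
$$\left| \frac{-1}{N} \ln \mathrm{tr}\big(\rho^{\otimes N} E_N(\tilde{\Delta}_k)\big) - \inf_{\omega \in \tilde{\Delta}_k} I(\rho, \omega) \right| < \frac{1}{k} \tag{5.18}$$
for all $N \geq N_k''$. Now, let us recursively define a strictly increasing sequence $(N_k)_{k \in \mathbb{N}}$ of integers by $N_1 = 1$ and $N_k = \max\{N_k', N_k'', N_{k-1} + 1\}$, and set
$$\Delta_N = \tilde{\Delta}_k, \qquad \text{for } N_k \leq N < N_{k+1}. \tag{5.19}$$
For each $N \geq N_k$, we therefore have an integer $l \geq k$ with $N_l \leq N < N_{l+1}$ and $\Delta_N = \tilde{\Delta}_l$. Since $N_l \leq N$ implies in particular $N \geq N_l'$, we have due to (5.16)
$$\mathrm{tr}\big(\sigma^{\otimes N} E_N(\Delta_N)\big) = \mathrm{tr}\big(\sigma^{\otimes N} E_N(\tilde{\Delta}_l)\big) \geq 1 - \frac{1}{l} \geq 1 - \frac{1}{k} \tag{5.20}$$
and this implies Eq. (5.12). Similarly, we have $N \geq N_l \geq N_l''$ and therefore, with (5.18),
$$\left| \frac{-1}{N} \ln \mathrm{tr}\big(\rho^{\otimes N} E_N(\Delta_N)\big) - \inf_{\omega \in \Delta_N} I(\rho, \omega) \right| = \left| \frac{-1}{N} \ln \mathrm{tr}\big(\rho^{\otimes N} E_N(\tilde{\Delta}_l)\big) - \inf_{\omega \in \tilde{\Delta}_l} I(\rho, \omega) \right| < \frac{1}{l} \leq \frac{1}{k}. \tag{5.21}$$
Now, note that the sequence $(\Delta_N)_{N \in \mathbb{N}}$ forms a neighborhood base at $\sigma \in S$, more precisely
$$\Delta_{N+1} \subset \Delta_N, \quad \forall N \in \mathbb{N} \qquad\text{and}\qquad \bigcap_{N=1}^\infty \Delta_N = \{\sigma\}. \tag{5.22}$$
Lower semicontinuity of $I_\rho(\,\cdot\,) = I(\rho, \cdot\,)$ implies in addition that
$$U_k = I_\rho^{-1}\big((I_\rho(\sigma) - k^{-1}, \infty]\big) \tag{5.23}$$
is for each $k \in \mathbb{N}$ an open neighborhood of $\sigma$. Hence, we have an $M_k \in \mathbb{N}$ such that $M \geq M_k$ implies $\Delta_M \subset U_k$ and therefore
$$I(\rho, \sigma) \geq \inf_{\omega \in \Delta_M} I(\rho, \omega) \geq I(\rho, \sigma) - \frac{1}{k}, \qquad \forall M \geq M_k. \tag{5.24}$$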
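Now, assume that $N \geq \max\{N_k, M_k\}$; then we get with Eq. (5.21),
$$\left| \frac{-1}{N} \ln \mathrm{tr}\big(\rho^{\otimes N} E_N(\Delta_N)\big) - I(\rho, \sigma) \right| \leq \left| \frac{-1}{N} \ln \mathrm{tr}\big(\rho^{\otimes N} E_N(\Delta_N)\big) - \inf_{\omega \in \Delta_N} I(\rho, \omega) \right| + \left| \inf_{\omega \in \Delta_N} I(\rho, \omega) - I(\rho, \sigma) \right| \leq \frac{2}{k} \tag{5.25}$$
and this implies Eq. (5.13), which concludes the proof.

5.3. Pure states

The main purpose of this section is to provide a proof of Eq. (3.20), where we have claimed that $\hat{I}$ and $I^c_{\mathrm{Id}}$ coincide for pure input states. This is basically quite simple. We will take, however, a small detour which allows us to have a closer look beyond the covariant case (Sec. 5.4).

Let us consider first a pure state $\rho$ and a mixed state $\sigma$. From Eq. (3.13), we see immediately that this implies $\hat{I}(\rho, \sigma) = \infty$. Since $\hat{I}$ is a lower bound on all $I^{\#}_{\mathrm{Id}}$, we get
$$I_{\mathrm{Id}}(\rho, \sigma) = I^0_{\mathrm{Id}}(\rho, \sigma) = I^c_{\mathrm{Id}}(\rho, \sigma) = \hat{I}(\rho, \sigma) = \infty, \qquad \forall \rho \text{ pure},\ \sigma \text{ mixed}. \tag{5.26}$$
Hence, only the case where $\rho$ and $\sigma$ are both pure needs to be discussed. For the rest of this section, we will therefore assume (unless something different is explicitly stated) that
$$\rho = |\phi\rangle\langle\phi|, \quad \sigma = |\psi\rangle\langle\psi| \quad\text{with}\quad \phi, \psi \in H, \quad \|\phi\| = \|\psi\| = 1 \tag{5.27}$$
holds. The rate function $\hat{I}$ then has the following simple structure:
$$\hat{I}(\rho, \sigma) = -\ln \mathrm{tr}(\rho\sigma) = -\ln\big(|\langle \phi, \psi \rangle|^2\big). \tag{5.28}$$
Now, we need the following lemma which shows that we can assume without loss of generality that the operators $E_N(\Delta_N)$ from Lemma 5.3 are rank one projectors.

Lemma 5.4. Consider an admissible rate function $I \in E(\mathrm{Id})$ and two pure states $\rho = |\phi\rangle\langle\phi|$, $\sigma = |\psi\rangle\langle\psi|$. There is a sequence $(\Psi_N)_{N \in \mathbb{N}}$ of normalized vectors $\Psi_N \in H_+^{\otimes N}$ (the symmetric subspace of $H^{\otimes N}$) such that
$$\liminf_{N \to \infty} \frac{-1}{N} \ln |\langle \phi^{\otimes N}, \Psi_N \rangle|^2 \geq I(\rho, \sigma) \tag{5.29}$$
and
$$\lim_{N \to \infty} \big|\langle \Psi_N, \psi^{\otimes N} \rangle\big|^2 = 1 \tag{5.30}$$
holds. If $I$ is covariant, we can choose $\Psi_N = \psi^{\otimes N}$.

Proof. Consider a full estimation scheme $(E_N)_{N \in \mathbb{N}}$ satisfying the LDP with rate function $I$ and the sequence $(\Delta_N)_{N \in \mathbb{N}}$ of Borel sets $\Delta_N \subset S$ from Lemma 5.3.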
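Since only the overlaps of $E_N(\Delta_N)$ with $\phi^{\otimes N}$ and $\psi^{\otimes N}$ are of interest, we can assume without loss of generality that $E_N(\Delta_N)$ is supported by the symmetric tensor product $H_+^{\otimes N}$. Now, choose a $0 < \lambda < 1$ and denote the spectral projector of $E_N(\Delta_N)$ belonging to the interval $[1 - \lambda, 1]$ by $P_{N,\lambda}$. Obviously, we have due to $E_N(\Delta_N) \leq \mathbb{1}$,
$$\langle \psi^{\otimes N}, E_N(\Delta_N) \psi^{\otimes N} \rangle \leq \langle \psi^{\otimes N}, P_{N,\lambda} \psi^{\otimes N} \rangle + (1 - \lambda) \langle \psi^{\otimes N}, (\mathbb{1} - P_{N,\lambda}) \psi^{\otimes N} \rangle \tag{5.31}$$
$$= (1 - \lambda) + \lambda \langle \psi^{\otimes N}, P_{N,\lambda} \psi^{\otimes N} \rangle. \tag{5.32}$$
Equation (5.12) therefore implies
$$\lim_{N \to \infty} \langle \psi^{\otimes N}, P_{N,\lambda} \psi^{\otimes N} \rangle = 1. \tag{5.33}$$
Hence, for each $0 < \delta < 1$, there is an $N_\delta \in \mathbb{N}$ such that
$$\langle \psi^{\otimes N}, P_{N,\lambda} \psi^{\otimes N} \rangle \geq 1 - \delta \tag{5.34}$$
holds for all $N \geq N_\delta$. Now, we define for $N$ with $P_{N,\lambda} \psi^{\otimes N} \neq 0$ (which due to Eq. (5.34) is true if $N$ is large enough)
$$\Psi_N = \frac{P_{N,\lambda} \psi^{\otimes N}}{\|P_{N,\lambda} \psi^{\otimes N}\|} \tag{5.35}$$
and $\Psi_N$ arbitrary for all other $N$. Equation (5.34) implies immediately (5.30). The bound (5.29) follows from
$$\langle \phi^{\otimes N}, E_N(\Delta_N) \phi^{\otimes N} \rangle \geq (1 - \lambda) \langle \phi^{\otimes N}, P_{N,\lambda} \phi^{\otimes N} \rangle, \tag{5.36}$$
which in turn implies
$$I(\rho, \sigma) = \lim_{N \to \infty} \frac{-1}{N} \ln \langle \phi^{\otimes N}, E_N(\Delta_N) \phi^{\otimes N} \rangle \tag{5.37}$$
$$\leq \liminf_{N \to \infty} \frac{-1}{N} \ln \langle \phi^{\otimes N}, (1 - \lambda) P_{N,\lambda} \phi^{\otimes N} \rangle \tag{5.38}$$
$$= \liminf_{N \to \infty} \frac{-1}{N} \ln \langle \phi^{\otimes N}, P_{N,\lambda} \phi^{\otimes N} \rangle \tag{5.39}$$
$$\leq \liminf_{N \to \infty} \frac{-1}{N} \ln\big(|\langle \phi^{\otimes N}, \Psi_N \rangle|^2\big), \tag{5.40}$$
where we have used in the last inequality that $P_{N,\lambda} \Psi_N = \Psi_N$ and therefore $P_{N,\lambda} \geq |\Psi_N\rangle\langle\Psi_N|$ holds if $N$ is large enough.

Now, assume that $I$ is covariant. This implies, by definition, that we can choose $(E_N)_{N \in \mathbb{N}}$ to be covariant as well and we get according to Eq. (5.14),
$$U^{\otimes N} E_N(\Delta_N) U^{\otimes N *} = E_N(\Delta_N), \qquad \forall U \in \mathrm{U}(d) \text{ with } [U, \sigma] = 0. \tag{5.41}$$
Since $P_{N,\lambda}$ is a spectral projector of $E_N(\Delta_N)$, we get $U^{\otimes N} P_{N,\lambda} U^{\otimes N *} = P_{N,\lambda}$ for the same set of $U$ and since $\sigma = |\psi\rangle\langle\psi|$, this implies
$$U^{\otimes N} P_{N,\lambda} \psi^{\otimes N} = P_{N,\lambda} U^{\otimes N} \psi^{\otimes N} = P_{N,\lambda} \psi^{\otimes N}, \quad\text{hence}\quad U^{\otimes N} \Psi_N = \Psi_N \tag{5.42}$$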
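for all $U$ with $U\psi = \psi$ and all $\Psi_N$ from Eq. (5.35). It is easy to see that $\Psi_N = \psi^{\otimes N}$ is the only vector in $H_+^{\otimes N}$ with this property.

With this lemma, it is now very easy to determine $I^c_{\mathrm{Id}}(\rho, \cdot\,)$ for pure input states $\rho$. As already stated in Sec. 3.3, we get (cf. in this context, the analysis of covariant pure state estimation in [19]):

Proposition 5.5. For each pure state $\rho$ and all $\sigma \in S$, the equality
$$I^c_{\mathrm{Id}}(\rho, \sigma) = \hat{I}(\rho, \sigma) = \begin{cases} \infty, & \text{if } \sigma \text{ is mixed}, \\ -\ln \mathrm{tr}(\rho\sigma), & \text{if } \sigma \text{ is pure}, \end{cases} \tag{5.43}$$
holds.

Proof. Since $\hat{I}$ is covariant, we have $I^c_{\mathrm{Id}}(\rho, \sigma) \geq \hat{I}(\rho, \sigma)$ for all $\rho, \sigma \in S$. If $\rho$ is pure and $\sigma$ is mixed, we have $\hat{I}(\rho, \sigma) = \infty$ and therefore, $I^c(\rho, \sigma) = \hat{I}(\rho, \sigma)$. If both states are pure, we get from Lemma 5.4
$$I^c(\rho, \sigma) \leq \lim_{N \to \infty} \frac{-1}{N} \ln |\langle \phi^{\otimes N}, \psi^{\otimes N} \rangle|^2 = -\ln \mathrm{tr}(\rho\sigma) = \hat{I}(\rho, \sigma) \tag{5.44}$$
which concludes the proof.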
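The middle equality in (5.44) rests on the factorization $\langle \phi^{\otimes N}, \psi^{\otimes N} \rangle = \langle \phi, \psi \rangle^N$. The short sketch below checks this numerically for small $N$ by building the tensor powers explicitly; it is an illustration only, with hypothetical function names and an arbitrarily chosen pair of qubit vectors.

```python
import cmath
import math

def kron(u, v):
    """Kronecker product of two vectors given as lists of complex numbers."""
    return [a * b for a in u for b in v]

def tensor_power(v, n):
    """n-fold tensor power of the vector v."""
    out = [1.0 + 0j]
    for _ in range(n):
        out = kron(out, v)
    return out

def inner(u, v):
    """Inner product, conjugate-linear in the first argument."""
    return sum(complex(a).conjugate() * b for a, b in zip(u, v))

def overlap_rate(phi, psi, n):
    """-(1/n) ln |<phi^{(x)n}, psi^{(x)n}>|^2, computed from explicit tensor powers."""
    overlap = abs(inner(tensor_power(phi, n), tensor_power(psi, n))) ** 2
    return -math.log(overlap) / n

if __name__ == "__main__":
    theta = 0.4
    phi = [1.0, 0.0]
    psi = [math.cos(theta), cmath.exp(0.7j) * math.sin(theta)]
    single_copy = -math.log(abs(inner(phi, psi)) ** 2)  # equals -ln tr(rho sigma)
    for n in (1, 2, 4, 8):
        print(n, overlap_rate(phi, psi, n), single_copy)
```

The rate is independent of $N$, so the limit in (5.44) is attained already for a single copy.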
Together with the arguments from Sec. 4.4, this result supports our conjecture from Sec. 3.3 that $I^c_{\mathrm{Id}}$ and $\hat{I}$ coincide also for mixed input states.

5.4. Beyond covariance

If we look at Eq. (5.30) and compare it with the reasoning in the last proof, we might think that covariance is not really needed here, because $\Psi_N$ converges to $\psi^{\otimes N}$ in the limit $N \to \infty$ even without further assumptions on $I$. This impression, however, is wrong, because the vectors $\psi^{\otimes N}$ and $\phi^{\otimes N}$ become more and more orthogonal as $N$ increases and therefore, the part of $\Psi_N$ which is orthogonal to $\psi^{\otimes N}$ can play a crucial role (although it vanishes in the limit $N \to \infty$). The relation of the optimal rate functions $I_{\mathrm{Id}}$ and $I^0_{\mathrm{Id}}$ to $\hat{I}$ and relative entropy $S$ needs, therefore, more discussion. Although we are not yet able to give complete results, we will collect in the following some (informal) arguments which support the two conjectures $I_{\mathrm{Id}} = S$ and $I^0_{\mathrm{Id}} = \hat{I}$ from the end of Sec. 3.3.

As in the last section, we will consider only pure states, i.e. we will evaluate a rate function $I(\rho, \sigma)$ only for $\rho = |\phi\rangle\langle\phi|$ and $\sigma = |\psi\rangle\langle\psi|$. In addition, we will assume that $H$ is two-dimensional (this can be done without loss of generality, because we just have to replace $H$ with the subspace generated by $\psi$ and $\phi$). Hence, we can set
$$\psi = |0\rangle \quad\text{and}\quad \phi = \phi_{p,\alpha} = \sqrt{p}\,|0\rangle + e^{i\alpha} \sqrt{1 - p}\,|1\rangle \tag{5.45}$$
with $0 \leq p \leq 1$, $\alpha \in (-\pi, \pi]$ and an arbitrary but fixed basis $|0\rangle, |1\rangle$ of $H$. In the number basis $|k; N\rangle \in H_+^{\otimes N}$, $k = 0, \ldots, N$,
$$|k; N\rangle = \binom{N}{k}^{1/2} S_N\, |0\rangle^{\otimes (N-k)} \otimes |1\rangle^{\otimes k}, \tag{5.46}$$
(where $S_N$ is the projector to $H_+^{\otimes N}$) the vectors $\Psi_N \in H_+^{\otimes N}$ from Lemma 5.4 can then be written as
$$\Psi_N = \sum_{k=0}^N f_{N,k}\, |k; N\rangle \tag{5.47}$$
and $\phi^{\otimes N}$ becomes
$$\phi^{\otimes N} = \phi_{p,\alpha}^{\otimes N} = \sum_{k=0}^N \left[ \binom{N}{k} p^{N-k} (1 - p)^k \right]^{1/2} e^{ik\alpha}\, |k; N\rangle. \tag{5.48}$$
Let us consider the conjecture $I_{\mathrm{Id}} = S$ first. In the case of pure states, this would imply that we can find for each pair of pure states $\sigma \neq \rho_0$ an admissible rate function $I$ with $I(\rho_0, \sigma) = \infty$. A possible way to prove this would consist of two steps:

• Step 1. Find a sequence $(A_N)_{N \in \mathbb{N}}$ of operators such that
$$\lim_{N \to \infty} \frac{-1}{N} \ln \mathrm{tr}\big(\rho_0^{\otimes N} A_N\big) = \infty, \qquad \lim_{N \to \infty} \mathrm{tr}\big(\sigma^{\otimes N} A_N\big) = 1 \tag{5.49}$$
and
$$\lim_{N \to \infty} \frac{-1}{N} \ln \mathrm{tr}\big(\rho^{\otimes N} A_N\big) = I^\sigma(\rho) > 0, \qquad \forall \rho \neq \sigma \tag{5.50}$$
holds.
• Step 2. Find a full estimation scheme $(E_N)_{N \in \mathbb{N}}$ and a sequence $(\Delta_N)_{N \in \mathbb{N}}$ of Borel sets $\Delta_N \subset S$ shrinking to $\sigma$ such that $E_N(\Delta_N) = A_N$ holds for all $N \in \mathbb{N}$.

To implement the second step we would need a converse of Lemma 5.3, and such a result is (unfortunately) not yet available. The problem here is not to construct some POV measures with $E_N(\Delta_N) = A_N$, but to construct them such that the resulting scheme satisfies the LDP (which includes in particular consistency). It seems, however, that this is more a technical than a fundamental problem.

The first step is much easier to perform.$^{e}$ Assume that $\rho_0 = |\phi_{q,\beta}\rangle\langle\phi_{q,\beta}|$ holds with $\phi_{q,\beta}$ from (5.45). Then, we set $A_N = |\Psi_N\rangle\langle\Psi_N|$ and define $\Psi_N$ according to (5.47) by
$$f_{N,0} = -\mathcal{N}_N \sqrt{N} \sqrt{1 - q}\, e^{i\beta}, \qquad f_{N,1} = \mathcal{N}_N \sqrt{q} \tag{5.51}$$
with the normalization $\mathcal{N}_N = (N(1 - q) + q)^{-1/2}$ and $f_{N,k} = 0$ for all $k > 1$. Obviously, we have
$$\langle \Psi_N, \phi_{q,\beta}^{\otimes N} \rangle = 0 \tag{5.52}$$
and
$$\lim_{N \to \infty} |f_{N,0}| = 1 \tag{5.53}$$
$^{e}$ However, it is not sufficient to find a sequence of tests which saturates the bound from Stein’s Lemma, because Eq. (5.50) would not necessarily hold in this case.
which implies Eq. (5.49). On the other hand, we get $I^\sigma(\rho) = -\ln \mathrm{tr}(\rho\sigma)$ for each pure $\rho \neq \rho_0$ and therefore Eq. (5.50) holds as well. Hence, there is strong evidence behind the conjecture $I_{\mathrm{Id}} = S$ from Sec. 3.3 (at least for pure input states).

The method used in the last paragraph can be easily generalized to construct a sequence of operators $(A_N)_{N \in \mathbb{N}}$ such that the function $I^\sigma$ from (5.50) becomes infinite at finitely many points or even on a countable dense subset of the space $P$ of pure states. This is, however, not sufficient to disprove the conjecture $I^0_{\mathrm{Id}} = \hat{I}$, because in this case, we would need $(A_N)_{N \in \mathbb{N}}$ such that $I^\sigma$ becomes lower semicontinuous. $I^\sigma(\rho_0) > -\ln \mathrm{tr}(\rho_0 \sigma)$ for one state $\rho_0$ implies for such an $I^\sigma$ that $I^\sigma(\rho) > -\ln \mathrm{tr}(\rho\sigma)$ holds for all $\rho$ in a whole neighborhood of $\rho_0$ in $P$. We will show in the following why it is (at least) very difficult to find a sequence $(A_N)_{N \in \mathbb{N}}$ with this special property. To this end, consider $A_N = |\Psi_N\rangle\langle\Psi_N|$ with $\Psi_N$ from Lemma 5.4 and a fixed $0 < p < 1$ such that
$$\lim_{N \to \infty} \frac{-1}{N} \ln \big|\langle \Psi_N, \phi_{p,\alpha}^{\otimes N} \rangle\big|^2 > -\ln |\langle \psi, \phi_{p,\alpha} \rangle|^2 = -\ln p \tag{5.54}$$
holds for all $\alpha$ with $-\pi < \alpha_- < \alpha < \alpha_+ < \pi$ for some bounds $\alpha_-, \alpha_+$. To rewrite this in a more convenient way, let us identify the interval $(-\pi, \pi]$ with the unit circle $S^1$ and consider the sequence $(F_N)_{N \in \mathbb{N}}$, $F_N \in L^2(S^1)$ with
$$F_N = \|\tilde{F}_N\|^{-1} \tilde{F}_N, \qquad \tilde{F}_N(\alpha) = \langle \Psi_N, \phi_{p,\alpha}^{\otimes N} \rangle. \tag{5.55}$$
In the orthonormal basis $(e_k)_{k \in \mathbb{Z}}$, $e_k \in L^2(S^1)$, $e_k(\alpha) = (2\pi)^{-1/2} \exp(ik\alpha)$, these vectors become
$$\tilde{F}_N(\alpha) = \sum_{k=0}^N f_{N,k} \left[ \binom{N}{k} p^{N-k} (1 - p)^k \right]^{1/2} e^{ik\alpha}, \tag{5.56}$$
hence all $F_N$ are elements of the positive frequency subspace
$$H^2(S^1) = \mathrm{span}\{e_k \mid k \geq 0\} \subset L^2(S^1). \tag{5.57}$$
In addition, we can conclude immediately from Eq. (5.30) and $|0; N\rangle = \psi^{\otimes N}$, the inequality
$$\lim_{N \to \infty} \frac{-1}{N} \ln \|\tilde{F}_N\|^2 \leq -\ln p. \tag{5.58}$$
Hence, to get (5.54), the functions $F_N$ have to converge pointwise and exponentially fast to 0 on the interval $(\alpha_-, \alpha_+)$. To find such a sequence is difficult due to the following lemma.

Lemma 5.6. A function $F \in H^2(S^1)$ which vanishes on a non-empty subinterval $(\alpha_-, \alpha_+)$ of $S^1$ vanishes completely.

The proof of this lemma uses the fact that each smooth element of $H^2(S^1)$ is the boundary value of an analytic function on the unit disc (cf. [37] for details). For us, it shows that the $F_N$ cannot vanish on $(\alpha_-, \alpha_+)$ because $\|F_N\| = 1$ by
construction. It is even impossible that the sequence $(F_N)_{N\in\mathbb{N}}$ converges (in norm) to a function $F \in L^2(S^1)$, because this $F$ would again satisfy $\|F\| = 1$, $F \in H^2(S^1)$ and $F(\alpha) = 0$ for all $\alpha \in (\alpha_-, \alpha_+)$. The only way out is to find a sequence which does not converge for all $\alpha$. Such a sequence can be constructed if we allow infinitely fast oscillations in the limit $N \to \infty$ (start with a sequence which converges in $L^2(S^1)$ and shift its elements to the positive frequency space). However, even then there are two additional requirements:

1. The vectors $\Psi_N$ (and therefore the coefficients $f_{N,k}$) have to satisfy the constraints $\|\Psi_N\| = 1$ and $\lim_{N\to\infty} |f_{N,0}| = 1$.
2. $\lim_{N\to\infty} F_N(\alpha) = 0$ must hold not only for all $\alpha \in (\alpha_-, \alpha_+)$, but also for all $p \in (p_-, p_+)$ for some $0 < p_- < p_+ < 1$.

We have not yet succeeded in constructing a sequence $(\Psi_N)_{N\in\mathbb{N}}$ which satisfies all these conditions, but what we can say already at this point is the following: If there is a rate function I ∈ E 0 (Id) with $I(\rho,\sigma) > \hat{I}(\rho,\sigma)$ for some $\rho, \sigma$, then the corresponding estimation scheme must develop very irregular behavior with respect to relative phases, and this indicates that a more detailed analysis of phase estimation might solve our problem.

Appendix A. Some Material from Large Deviations Theory

The purpose of this appendix is to collect some material about large deviation theory which is used throughout this paper. For a more detailed presentation, we refer the reader to the monographs [14, 13, 10].

Definition A.1. A function $I : X \to [0,\infty]$ on a locally compact, separable, metric space $X$ is called a rate function if

(1) $I \not\equiv \infty$.
(2) $I$ is lower semicontinuous.
(3) $I$ has compact level sets, i.e. $I^{-1}([-\infty, c])$ is compact for all $c \in \mathbb{R}$.

Definition A.2. Let $(\mu_N)_{N\in\mathbb{N}}$ be a sequence of probability measures on the Borel subsets of a locally compact, separable metric space $X$ and $I : X \to [0,\infty]$ a rate function in the sense of Definition A.1. We say that $(\mu_N)_{N\in\mathbb{N}}$ satisfies the large deviation principle with rate function $I$ if the following conditions hold:

(1) For each closed subset $\Delta \subset X$, we have
\[ \limsup_{N\to\infty} \frac{1}{N} \ln \mu_N(\Delta) \le -\inf_{x\in\Delta} I(x). \tag{A.1} \]
(2) For each open subset $\Delta \subset X$, we have
\[ \liminf_{N\to\infty} \frac{1}{N} \ln \mu_N(\Delta) \ge -\inf_{x\in\Delta} I(x). \tag{A.2} \]
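As a hedged illustration of Definition A.2 (not part of the original text), the following Python sketch takes for $\mu_N$ the law of the empirical mean of $N$ fair coin flips. For this model Cramér's theorem gives the rate function $I(x) = x\ln x + (1-x)\ln(1-x) + \ln 2$ on $[0,1]$. The sketch compares $-\frac{1}{N}\ln\mu_N([a,1])$ with $\inf_{x\ge a} I(x) = I(a)$ for $a > 1/2$. The coin-flip model, the function names and the threshold $a = 0.7$ are assumptions chosen only for illustration.

```python
import math


def rate_function(x: float) -> float:
    """Cramer rate function for the empirical mean of fair coin flips, x in [0, 1]."""
    total = math.log(2.0)
    for p in (x, 1.0 - x):
        if p > 0.0:  # convention 0 * ln 0 = 0
            total += p * math.log(p)
    return total


def log_tail_probability(n: int, a: float) -> float:
    """Exact value of ln mu_N([a, 1]) for the empirical mean of n fair coin flips."""
    k_min = math.ceil(a * n)
    # Logarithms of the binomial weights C(n, k) / 2^n, k = k_min, ..., n.
    log_terms = [
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        - n * math.log(2.0)
        for k in range(k_min, n + 1)
    ]
    top = max(log_terms)
    # log-sum-exp keeps the tiny tail probabilities representable.
    return top + math.log(sum(math.exp(t - top) for t in log_terms))


if __name__ == "__main__":
    a = 0.7
    for n in (100, 1000, 10000):
        print(n, -log_tail_probability(n, a) / n, rate_function(a))
```

The printed decay rates approach $I(a)$ from above as $N$ grows, which is the content of the bounds (A.1) and (A.2) for the closed set $[a,1]$ and its interior.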
The most relevant consequence of this definition is the following theorem of Varadhan [35], which describes the behavior of some expectation values in the
limit $N \to \infty$:

Theorem A.3 (Varadhan). Consider a sequence $(\mu_N)_{N\in\mathbb{N}}$ of probability measures on $X$ satisfying the large deviation principle with rate function $I : X \to [0,\infty]$ and a continuous function $f : X \to \mathbb{R}$ which is bounded from below. Then the following equality holds:
\[ \lim_{N\to\infty} \frac{1}{N} \ln \int_X e^{-N f(x)}\, \mu_N(dx) = -\inf_{x\in X} \bigl(f(x) + I(x)\bigr). \tag{A.3} \]

Varadhan's Theorem has a converse: If we know that a sequence of measures $\mu_N$ satisfies Eq. (A.3) for all bounded continuous functions, it can be shown that the $\mu_N$ satisfy the large deviation principle as well. Following [13], we have:

Definition A.4. Let $(\mu_N)_{N\in\mathbb{N}}$ be a sequence of measures on a locally compact, separable metric space $X$ and $I : X \to [0,\infty]$ a rate function. We say that $(\mu_N)_{N\in\mathbb{N}}$ satisfies the Laplace principle with rate function $I$, if we have
\[ \lim_{N\to\infty} \frac{1}{N} \ln \int_X e^{-N f(x)}\, \mu_N(dx) = -\inf_{x\in X} \bigl(f(x) + I(x)\bigr) \tag{A.4} \]
for all bounded continuous functions $f : X \to \mathbb{R}$.

Theorem A.5. The Laplace principle implies the large deviation principle with the same rate function.

Acknowledgments

I would like to thank R. D. Gill and R. F. Werner for many useful discussions, and M. Hayashi for comments on an earlier version of this manuscript. Financial support by the European Union project ATESIT (contract no. IST-2000-29681) is also gratefully acknowledged.

References

[1] R. Alicki, S. Rudnicki and S. Sadowski, Symmetry properties of product states for the system of N n-level atoms, J. Math. Phys. 29(5) (1988) 1158–1162.
[2] E. Bagan, M. Baig, R. Munoz-Tapia and A. Rodriguez, Collective vs local measurements in qubit mixed state estimation, Phys. Rev. A 69 (2004) 010304.
[3] R. R. Bahadur, On the asymptotic efficiency of tests and estimates, Sankhyā 22 (1960) 229–252.
[4] R. R. Bahadur, Rates of convergence of estimates and test statistics, Ann. Math. Statist. 38 (1967) 303–324.
[5] R. R. Bahadur, Some Limit Theorems in Statistics, Conference Board of the Mathematical Sciences Regional Conference Series in Applied Mathematics, No. 4 (Society for Industrial and Applied Mathematics, Philadelphia, Pa, 1971).
[6] D. Bruß, D. P. DiVincenzo, A. Ekert, C. A. Fuchs, C. Macchiavello and J. A. Smolin, Optimal universal and state-dependent cloning, Phys. Rev. A 57(4) (1998) 2368–2378.
[7] D. Bruß and C. Macchiavello, Optimal state estimation for d-dimensional quantum systems, Phys. Lett. A 253 (1999) 249–251.
[8] J. I. Cirac, A. K. Ekert and C. Macchiavello, Optimal purification of single qubits, Phys. Rev. Lett. 82 (1999) 4344–4347.
[9] J. Cortese, Relative entropy and single qubit Holevo–Schumacher–Westmoreland channel capacity, quant-ph/0207128 (2002).
[10] F. den Hollander, Large deviations, in Fields Institute Monographs, Vol. 14 (American Mathematical Society, Providence, RI, 2000).
[11] R. Derka, V. Bužek and A. K. Ekert, Universal algorithm for optimal estimation of quantum states from finite ensembles via realizable generalized measurements, Phys. Rev. Lett. 80(8) (1998) 1571–1575.
[12] N. G. Duffield, A large deviation principle for the reduction of product representations, Proc. Amer. Math. Soc. 109 (1990) 503–515.
[13] P. Dupuis and R. S. Ellis, A Weak Convergence Approach to the Theory of Large Deviations (Wiley, New York, 1997).
[14] R. S. Ellis, Entropy, Large Deviations, and Statistical Mechanics (Springer, Berlin, 1985).
[15] D. G. Fischer and M. Freyberger, Estimating mixed quantum states, Phys. Lett. A 273 (2000) 293–302.
[16] R. D. Gill, Quantum asymptotics, in State of the Art in Probability and Statistics, eds. A. W. van der Vaart, M. de Gunst and C. A. J. Klaassen, IMS Lecture Notes-Monograph Series, Vol. 36 (Institute of Mathematical Statistics, 2001), pp. 255–285.
[17] R. D. Gill and S. Massar, State estimation for large ensembles, Phys. Rev. A 61 (2000) 2312–2327.
[18] M. Hayashi (ed.), Asymptotic Theory of Quantum Statistical Inference: Selected Papers (World Scientific, to appear in 2005).
[19] M. Hayashi, Asymptotic estimation theory for a finite dimensional pure state model, J. Phys. A 31 (1998) 4633–4655.
[20] M. Hayashi, Two quantum analogues of Fisher information from a large deviation viewpoint of quantum estimation, J. Phys. A 35(36) (2002) 7689–7727; the arXiv version (quant-ph 0202003) is more recent and contains more materials.
[21] M. Hayashi and K. Matsumoto, Quantum universal variable-length source coding, Phys. Rev. A 66(2) (2002) 022311, 13.
[22] C. W. Helstrom, Quantum Detection and Estimation Theory (Academic Press, New York, 1976).
[23] F. Hiai and D. Petz, The proper formula for relative entropy and its asymptotics in quantum probability, Commun. Math. Phys. 143 (1991) 99–114.
[24] A. S. Holevo, Probabilistic and Statistical Aspects of Quantum Theory (North-Holland, Amsterdam, 1982).
[25] R. A. Horn and C. R. Johnson, Matrix Analysis (Cambridge University Press, Cambridge, 1985).
[26] K. Jänich, Differenzierbare G-Mannigfaltigkeiten, Lecture Notes in Mathematics, No. 59 (Springer-Verlag, Berlin, 1968).
[27] M. Keyl and R. F. Werner, Estimating the spectrum of a density operator, Phys. Rev. A 64(5) (2001) 052311.
[28] M. Keyl and R. F. Werner, The rate of optimal purification procedures, Ann. H. Poincaré 2 (2001) 1–26.
[29] J. I. Latorre, P. Pascual and R. Tarrach, Minimal optimal generalized quantum measurements, Phys. Rev. Lett. 81 (1998) 1351–1354.
[30] S. Massar and S. Popescu, Optimal extraction of information from finite quantum ensembles, Phys. Rev. Lett. 74(8) (1995) 1259–1263.
[31] K. Matsumoto, A new approach to the Cramer–Rao type bound of the pure state model, J. Phys. A 35 (2002) 3111–3124.
[32] T. Ogawa and H. Nagaoka, Strong converse and Stein's Lemma in quantum hypothesis testing, IEEE Trans. Inform. Theory 46 (2000) 2428–2433.
[33] V. I. Paulsen, Completely Bounded Maps and Dilations (Cambridge University Press, Cambridge, 2002).
[34] B. Simon, Representations of Finite and Compact Groups (American Mathematical Society, Providence, 1996).
[35] S. R. S. Varadhan, Asymptotic probabilities and differential equations, Commun. Pure Appl. Math. 19 (1966) 261–286.
[36] G. Vidal, J. I. Latorre, P. Pascual and R. Tarrach, Optimal minimal measurements of mixed states, Phys. Rev. A 60 (1999) 126–135.
[37] A. Wassermann, Operator algebras and conformal field theory, III, Fusion of positive energy representations of LSU(N) using bounded operators, Invent. Math. 133(3) (1998) 467–538.
[38] D. P. Zhelobenko, Compact Lie Groups and Their Representations (American Mathematical Society, Providence, 1978).
J070-00258
Reviews in Mathematical Physics Vol. 18, No. 1 (2006) 61–78 © World Scientific Publishing Company
GENERALIZED EIGENVECTORS FOR RESONANCES IN THE FRIEDRICHS MODEL AND THEIR ASSOCIATED GAMOV VECTORS
HELLMUT BAUMGÄRTEL
University of Potsdam, Mathematical Institute, D-14415 Potsdam, Germany

Received 27 September 2005
Revised 29 December 2005

A Gelfand triplet for the Hamiltonian $H$ of the Friedrichs model on $\mathbb{R}$ with multiplicity space $\mathcal{K}$, $\dim \mathcal{K} < \infty$, is constructed such that exactly the resonances (poles of the inverse of the Livšic-matrix) are (generalized) eigenvalues of $H$. The corresponding eigen(anti)linear forms are calculated explicitly. Using the wave matrices for the wave (Möller) operators, the corresponding eigen(anti)linear forms on the Schwartz space $\mathcal{S}$ for the unperturbed Hamiltonian $H_0$ are also calculated. It turns out that they are of pure Dirac type and can be characterized by their corresponding Gamov vector $\lambda \to k(\zeta_0 - \lambda)^{-1}$, $\zeta_0$ resonance, $k \in \mathcal{K}$, which is uniquely determined by restriction of $\mathcal{S}$ to $\mathcal{S} \cap \mathcal{H}^2_+$, where $\mathcal{H}^2_+$ denotes the Hardy space of the upper half-plane. Simultaneously this restriction yields a truncation of the generalized evolution to the well-known decay semigroup for $t \ge 0$ of the Toeplitz type on $\mathcal{H}^2_+$. That is: Exactly those pre-Gamov vectors $\lambda \to k(\zeta - \lambda)^{-1}$, $\zeta$ from the lower half-plane, $k \in \mathcal{K}$, have an extension to a generalized eigenvector of $H$ if $\zeta$ is a resonance and if $k$ is from that subspace of $\mathcal{K}$ which is uniquely determined by its corresponding Dirac type antilinear form.

Keywords: Friedrichs model; scattering theory; resonances; generalized eigenvectors; Gamov vectors.

Mathematics Subject Classification 2000: 47A40, 47D06, 81U20
1. Introduction

In quantum scattering systems, bumps in cross-sections can often be described by expressions like $\lambda \to c\,\bigl((\lambda-\lambda_0)^2 + (\Gamma/2)^2\bigr)^{-1}$, where $\lambda_0$ is the resonance energy and $\Gamma/2$ the half-width, called Breit–Wigner formulas (see, e.g., [1, pp. 428–429]). Sometimes, if the scattering matrix is analytically continuable into the lower half-plane $\mathbb{C}_-$, these bumps can be connected with complex poles $\lambda_0 - i\Gamma/2$ of the scattering matrix in $\mathbb{C}_-$. Then $c\,\bigl((\lambda-\lambda_0) - i\Gamma/2\bigr)^{-1}$ is called the Breit–Wigner amplitude, if the pole is of first order (see, e.g., [1, pp. 428–429]). These poles are called resonances (see, e.g., [2, 3]).
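The relation between the Breit–Wigner amplitude and the bump shape can be checked numerically. The Python sketch below is an illustration only: the function names and the sample values of $\lambda_0$, $\Gamma$ and $c$ are assumptions, and the sign of the imaginary part follows the form of the amplitude quoted above. It confirms that the squared modulus of the amplitude is the Lorentzian bump and that the bump drops to half its maximum at $\lambda_0 \pm \Gamma/2$.

```python
def breit_wigner_amplitude(lam: float, lam0: float, gamma: float,
                           c: complex = 1.0) -> complex:
    """Amplitude c / ((lam - lam0) - i * gamma / 2), in the form quoted in the text."""
    return c / complex(lam - lam0, -gamma / 2.0)


def breit_wigner_bump(lam: float, lam0: float, gamma: float, c: float = 1.0) -> float:
    """Lorentzian bump c / ((lam - lam0)^2 + (gamma / 2)^2)."""
    return c / ((lam - lam0) ** 2 + (gamma / 2.0) ** 2)


if __name__ == "__main__":
    lam0, gamma = 2.0, 0.5  # hypothetical resonance energy and width
    for lam in (1.5, 1.75, 2.0, 2.25, 2.5):
        amp = breit_wigner_amplitude(lam, lam0, gamma)
        # |amplitude|^2 and the bump agree for every real lam.
        print(lam, abs(amp) ** 2, breit_wigner_bump(lam, lam0, gamma))
```

Because only the modulus enters, the bump shape does not depend on the sign of the imaginary part in the denominator of the amplitude.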
The basic idea is that these points should coincide with eigenvalues for generalized eigenvectors of the evolution which is determined by the Hamiltonian H of the scattering system. Obviously this (first) problem cannot be solved within the Hilbert space H, it requires extension techniques, e.g., the use of Gelfand triplets. A further (second) problem is to establish a rigorous mathematical framework to derive modified associated states, also corresponding to resonances as eigenvectors, but of a truncated evolution, such that the eigenvectors satisfy the exponential decay law. These vectors are called Gamov vectors in the literature (see, e.g., [4–6] and further references therein). An obvious suggestion is that also this problem has to be solved by techniques beyond the Hilbert space. Such an approach was presented by Bohm and Gadella, and others by using Gelfand triplets (Rigged Hilbert Spaces (RHS) in their terminology) on Hardy subspaces of H0 , the Hilbert space of the unperturbed Hamiltonian H0 of the scattering system (see [5–7] and papers quoted therein). Originally, the theory of Gelfand triplets (see, e.g., [8], see also [9]) was developed for self-adjoint operators to generalize eigenvector expansions also for the absolutely continuous spectrum. For this purpose, the occurrence of complex eigenvalues is only a nuisance. In this paper, it is shown that for the finite-dimensional Friedrichs model on R, the first problem can be solved rigorously by the Gelfand triplet approach, i.e. the construction of a triplet is presented such that exactly the resonances are eigenvalues of the extended Hamiltonian. The corresponding (generalized) eigenvectors are calculated explicitly (a slightly modified triplet was already considered in [10]). This result confirms the basic idea mentioned above. On the other hand, recently it turned out that to solve the second problem, the use of the triplet approach is not indispensable. 
On the contrary, the Gamov vectors can be identified as vectors in the Hilbert space $\mathcal{H}_0$, resp. $\mathcal{H}$; more precisely, they are eigenvectors of the decay semigroup for $t \ge 0$, which is of Toeplitz type and which can be defined by a truncation of the quantum evolution. This insight emerged from, and was supported by, analogies with the Lax–Phillips scattering theory. This approach has been promoted and emphasized by Strauss [11] (see also [12]). A detailed presentation for positive Hamiltonians, where one starts with properties of analytic continuation of the scattering matrix, can be found in [17] while a brief version can be found in [18]. The connection with the original Lax–Phillips theory is considered in [19]. However, if one adopts this point of view, then a third problem arises: One has to point out the connection between the generalized eigenvector (the solution of the first problem) and the corresponding Gamov vector, i.e. one has to determine the selection principle which selects the right Gamov vector from the whole collection of all pre-Gamov vectors (eigenvectors of the decay semigroup). This problem is also solved in this paper: Exactly those eigenvectors of the decay semigroup have extensions to a generalized eigenvector whose eigenvalue is a resonance and which belong to a distinguished subeigenspace, which is calculated explicitly. Vice versa,
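the restriction of the generalized eigenvector (for $H_0$), which is an eigenantilinear form on the Schwartz space of pure Dirac type, to the Hardy subspace for the upper half-plane $\mathbb{C}_+$ is (via the Paley–Wiener theorem) characterized by a vector from this Hardy space. This vector is the Gamov vector corresponding to the generalized eigenvector.

We consider the Friedrichs model on $\mathbb{R}$ as an example to demonstrate the topic because of the direct (spectral) connection with the Lax–Phillips theory. Here the absolutely continuous spectrum of $H_0$ and $H$ is $\mathbb{R}$. However, corresponding results are also true in the case of the half-axis $[0,\infty)$. Modified conditions for this case are mentioned in Sec. 5.

2. Preliminaries

2.1. Basic objects of the Friedrichs model

In the following, we collect the concepts and notation for the finite-dimensional Friedrichs model on $\mathbb{R}$. Let $\mathcal{H}_0 := L^2(\mathbb{R}, \mathcal{K}, d\lambda)$, where $\mathcal{K}$ denotes a multiplicity Hilbert space, $\dim \mathcal{K} < \infty$. Further, let $\mathcal{E}$ be a finite-dimensional Hilbert space, $\dim \mathcal{E} =: N$, and put $\mathcal{H} := \mathcal{H}_0 \oplus \mathcal{E}$. The projection onto $\mathcal{E}$ is denoted by $P_{\mathcal{E}}$. $H_0$ is a self-adjoint operator on $\mathcal{H}$ with reducing projection $P_{\mathcal{E}}$, where $H_0 \restriction \mathcal{H}_0$ is the multiplication operator on $\mathcal{H}_0$. The self-adjoint operator $H$ on $\mathcal{H}$ is given by a perturbation of $H_0$ as $H := H_0 + \Gamma + \Gamma^*$, where $\Gamma$ denotes a partial isometry on $\mathcal{H}$ with the properties
\[ \Gamma^*\Gamma = P_{\mathcal{E}}, \qquad \Gamma\Gamma^* \le P_{\mathcal{E}}^{\perp} := \mathbf{1} - P_{\mathcal{E}}. \]
The operator function
\[ L_\pm(z) := (z - H_0)P_{\mathcal{E}} - \Gamma^*(z - H_0)^{-1}\Gamma, \qquad z \in \mathbb{C}_\pm, \]
the so-called Livšic-matrix, is decisive in the following. One has $L_\pm(z)\restriction\mathcal{E} \in \mathcal{L}(\mathcal{E})$, and it is holomorphic on $\mathbb{C}_\pm$. For brevity, if there is no danger of confusion, we write $L_\pm(z)$ instead of $L_\pm(z)\restriction\mathcal{E}$. Further, we need the so-called partial resolvent $P_{\mathcal{E}}(z - H)^{-1}P_{\mathcal{E}}$. It turns out that
\[ L_\pm(z)\cdot P_{\mathcal{E}}(z - H)^{-1}P_{\mathcal{E}} = P_{\mathcal{E}}(z - H)^{-1}P_{\mathcal{E}}\cdot L_\pm(z) = P_{\mathcal{E}}, \qquad z \in \mathbb{C}_\pm, \]
(see, e.g., [10]), that is
\[ P_{\mathcal{E}}(z - H)^{-1}P_{\mathcal{E}}\restriction\mathcal{E} = (L_\pm(z)\restriction\mathcal{E})^{-1}, \qquad z \in \mathbb{C}_\pm, \]
and this equation shows that $(L_\pm(z)\restriction\mathcal{E})^{-1} \in \mathcal{L}(\mathcal{E})$ is holomorphic on $\mathbb{C}_\pm$. For $\mathcal{H} \ni x := f + e$, $f \in \mathcal{H}_0$, $e \in \mathcal{E}$, one has $\Gamma x = \Gamma e$, $\Gamma^* x = \Gamma^* f$. Therefore,
\[ (\Gamma e)(\lambda) = M(\lambda)e, \qquad \mathcal{E} \ni \Gamma^* f = \int_{-\infty}^{\infty} M(\lambda)^* f(\lambda)\, d\lambda, \]
where $\lambda \to M(\lambda) \in \mathcal{L}(\mathcal{E} \to \mathcal{K})$ is a.e. defined on $\mathbb{R}$.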
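Assumption 1. $M(\cdot)$ is a Schwartz function, i.e. $M(\cdot) \in \mathcal{S}(\mathcal{L}(\mathcal{E} \to \mathcal{K}))$.

For example, this implies
\[ \int_{-\infty}^{\infty} \|M(\lambda)^* M(\lambda)\|_{2,\mathcal{E}}^2\, d\lambda < \infty, \qquad \int_{-\infty}^{\infty} \|M(\lambda)^* M(\lambda)\|_{2,\mathcal{E}}\, d\lambda < \infty, \]
where $\|\cdot\|_{2,\mathcal{E}}$ denotes the Hilbert–Schmidt norm on $\mathcal{E}$. Obviously one has
\[ \Gamma^*(z - H_0)^{-1}\Gamma\restriction\mathcal{E} = \int_{-\infty}^{\infty} \frac{M(\lambda)^* M(\lambda)}{z - \lambda}\, d\lambda, \qquad z \in \mathbb{C}_\pm. \tag{2.1} \]
Therefore, $\text{s-}\lim_{\epsilon\to+0} \Gamma^*(\lambda \pm i\epsilon - H_0)^{-1}\Gamma$ exists on $\mathbb{R}$, hence also $L_\pm(\lambda) := \text{s-}\lim_{\epsilon\to+0} L_\pm(\lambda \pm i\epsilon)$ exists and it is infinitely differentiable and polynomially bounded. From (2.1), we obtain
\[ \frac{\Gamma^* E_0(d\lambda)\Gamma}{d\lambda}\restriction\mathcal{E} = M(\lambda)^* M(\lambda), \qquad \lambda \in \mathbb{R}, \]
where $E_0(\cdot)$ denotes the spectral measure of $H_0$ on $\mathcal{H}_0$.

Assumption 2. $H$ has no eigenvalues.

This is equivalent to $\det L_+(\lambda) \ne 0$ for all $\lambda \in \mathbb{R}$ (see, e.g., [10]). Then, $L_+(\lambda)^{-1}$ exists for all $\lambda \in \mathbb{R}$, it is infinitely differentiable and $\sup_\lambda \|L_+(\lambda)^{-1}\|_{\mathcal{E}} < \infty$. Furthermore, we have
\[ \text{s-}\lim_{\epsilon\to+0} P_{\mathcal{E}}(\lambda \pm i\epsilon - H)^{-1}P_{\mathcal{E}}\restriction\mathcal{E} = (L_\pm(\lambda)\restriction\mathcal{E})^{-1}, \qquad \lambda \in \mathbb{R}. \tag{2.2} \]
$H$ has no singular continuous spectrum. From (2.2), we obtain
\[ \frac{P_{\mathcal{E}} E(d\lambda) P_{\mathcal{E}}}{d\lambda}\restriction\mathcal{E} = \frac{1}{2\pi i}\bigl(L_-(\lambda)^{-1} - L_+(\lambda)^{-1}\bigr) = L_\pm(\lambda)^{-1} M(\lambda)^* M(\lambda) L_\mp(\lambda)^{-1}, \qquad \lambda \in \mathbb{R}, \]
where $E(\cdot)$ denotes the spectral measure of $H$.

2.2. Wave operators and wave matrices

Since $\Gamma + \Gamma^*$ is a finite-dimensional perturbation, the wave operators $W_\pm = W_\pm(H, H_0) := \text{s-}\lim_{t\to\pm\infty} e^{itH} e^{-itH_0} P_{\mathcal{E}}^{\perp}$ exist; they are isometric from $\mathcal{H}_0$ onto $\mathcal{H}$. Furthermore, $W_\pm^* = W_\pm(H_0, H) = \text{s-}\lim_{t\to\pm\infty} e^{itH_0} e^{-itH}$. In the following we rewrite the wave operators as limits of operator spectral integrals. We refer to [15] for details on operator spectral integrals, where this theory is presented. We also use results of Baumgärtel [13] (see also [14]). Here we mention only the following facts: If $\mu \to t(\mu) := \sum_{j=1}^{m} \chi_{\Delta_j}(\mu)\, t_j$, $t_j \in \mathcal{H}_0$, is a step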
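function, then the spectral integral $\int_{-\infty}^{\infty} E_0(d\mu)\, t(\mu)$ is given by
\[ \int_{-\infty}^{\infty} E_0(d\mu)\, t(\mu) = \sum_{j=1}^{m} \int_{-\infty}^{\infty} E_0(d\mu)\, \chi_{\Delta_j}(\mu)\, t_j = \sum_{j=1}^{m} \int_{-\infty}^{\infty} \chi_{\Delta_j}(\mu)\, E_0(d\mu)\, t_j = \sum_{j=1}^{m} E_0(\Delta_j)\, t_j. \]
The spectral integral $\int_{-\infty}^{\infty} E_0(d\mu)\, x(\mu)$ for a more general function $\mu \to x(\mu) \in \mathcal{H}_0$ exists if
\[ \int_{-\infty}^{\infty} \left.\frac{(x(\lambda), E_0(d\mu)\, x(\lambda))}{d\mu}\right|_{\mu=\lambda} d\lambda < \infty. \]
Note that $\frac{(g, E_0(d\mu) g)}{d\mu}$ exists a.e. on $\mathbb{R}$ for all $g \in \mathcal{H}_0$ because the spectral measure $E_0(\cdot)$ is absolutely continuous. Now, put $\mathcal{H}_{\mathcal{E}}^0 := \operatorname{clo}\operatorname{spa}(E_0(\Delta)f,\ f \in \Gamma\mathcal{E})$ and $\mathcal{H}_{\mathcal{E}} := \operatorname{clo}\operatorname{spa}(E(\Delta)e,\ e \in \mathcal{E})$. It is not hard to see that $\mathcal{H}_{\mathcal{E}}^0$ and $\mathcal{H}_{\mathcal{E}}$ have natural spectral representations with respect to $E_0(\cdot)$, $E(\cdot)$, respectively, which are explicitly given by spectral integrals:
\[ \mathcal{H}_{\mathcal{E}}^0 \ni x = \int_{-\infty}^{\infty} E_0(d\mu)\, \Gamma f(\mu), \qquad \mathcal{H}_{\mathcal{E}} \ni y = \int_{-\infty}^{\infty} E(d\lambda)\, g(\lambda), \tag{2.3} \]
where $\mu \to f(\mu) \in \mathcal{E}$, $\lambda \to g(\lambda) \in \mathcal{E}$ are vector functions with values in $\mathcal{E}$ such that the integrals (2.3) exist. Note that $\int_{-\infty}^{\infty} E_0(d\mu)\, \Gamma f(\mu)$ exists iff $\int_{-\infty}^{\infty} \|M(\mu) f(\mu)\|_{\mathcal{K}}^2\, d\mu < \infty$, i.e. iff the function $\mu \to M(\mu) f(\mu)$ is an element of $\mathcal{H}_0$. The integral $\int_{-\infty}^{\infty} E(d\lambda)\, g(\lambda)$ exists iff $\int_{-\infty}^{\infty} \|M(\lambda) L_+(\lambda)^{-1} g(\lambda)\|_{\mathcal{K}}^2\, d\lambda < \infty$, i.e. iff the function $\lambda \to M(\lambda) L_+(\lambda)^{-1} g(\lambda)$ is an element of $\mathcal{H}_0$. The function $f(\cdot)$ is called the representer of $x$ and $g(\cdot)$ the representer of $y$ with respect to the corresponding spectral representation. Note further that
\[ \left(\int_{-\infty}^{\infty} E_0(d\mu)\, \Gamma f(\mu)\right)(\lambda) = (\Gamma f(\lambda))(\lambda) = M(\lambda) f(\lambda) \]
and $\mathcal{H}_0 \ominus \mathcal{H}_{\mathcal{E}}^0 = \{f \in \mathcal{H}_0 : M(\lambda)^* f(\lambda) = 0 \text{ a.e. on } \mathbb{R}\}$. The wave operators $W_\pm$, $W_\pm^*$ can be written as strong limits of certain spectral integrals (see [13]):
\[ \mathcal{H}_0 \ni f \to W_\pm f = \text{s-}\lim_{\epsilon\to+0} \int_{-\infty}^{\infty} E(d\lambda)\bigl(\mathbf{1} - \Gamma^* R_0(\lambda \pm i\epsilon)\bigr) f, \tag{2.4} \]
\[ \mathcal{H} \ni g \to W_\pm^* g = \text{s-}\lim_{\epsilon\to+0} \int_{-\infty}^{\infty} E_0(d\lambda)\bigl(\mathbf{1} + (\Gamma + \Gamma^*) R(\lambda \pm i\epsilon)\bigr) g, \tag{2.5} \]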
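where $R_0(z) := (z - H_0)^{-1}$, $R(z) := (z - H)^{-1}$ denote the resolvent of $H_0$, $H$ on $\mathcal{H}_0$, $\mathcal{H}$, respectively. From (2.4), we get immediately
\[ W_\pm f = f, \qquad f \in \mathcal{H}_0 \ominus \mathcal{H}_{\mathcal{E}}^0. \tag{2.6} \]
$W_\pm$ on $\mathcal{H}_{\mathcal{E}}^0$ and $W_\pm^*$ on $\mathcal{H}_{\mathcal{E}}$ can be calculated explicitly.

Lemma 2.1. The wave operators are given by the following expressions:
\[ W_\pm \int_{-\infty}^{\infty} E_0(d\mu)\, \Gamma f(\mu) = \int_{-\infty}^{\infty} E(d\lambda)\, L_\pm(\lambda) f(\lambda), \tag{2.7} \]
\[ W_\pm^* \int_{-\infty}^{\infty} E(d\lambda)\, g(\lambda) = \int_{-\infty}^{\infty} E_0(d\lambda)\, \Gamma L_\pm(\lambda)^{-1} g(\lambda). \tag{2.8} \]

Proof. (2.7). First we calculate $W_\pm(\Gamma e)$, $e \in \mathcal{E}$. From (2.4), we obtain
\begin{align*}
W_\pm(\Gamma e) &= \text{s-}\lim_{\epsilon\to+0} \int_{-\infty}^{\infty} E(d\lambda)\bigl(\Gamma e - \Gamma^* R_0(\lambda \pm i\epsilon)\Gamma e\bigr) \\
&= \text{s-}\lim_{\epsilon\to+0} \int_{-\infty}^{\infty} E(d\lambda)\bigl(\Gamma e + L_\pm(\lambda \pm i\epsilon)e - ((\lambda \pm i\epsilon) - H_0)e\bigr) \\
&= \text{s-}\lim_{\epsilon\to+0} \int_{-\infty}^{\infty} E(d\lambda)\bigl(L_\pm(\lambda \pm i\epsilon)e - \lambda e \mp i\epsilon e + H_0 e + \Gamma e\bigr) \\
&= \text{s-}\lim_{\epsilon\to+0} \int_{-\infty}^{\infty} E(d\lambda)\bigl(L_\pm(\lambda \pm i\epsilon)e + (H - \lambda)e\bigr),
\end{align*}
but $\int_{-\infty}^{\infty} E(d\lambda)(H - \lambda)e = 0$, i.e.
\[ W_\pm(\Gamma e) = \text{s-}\lim_{\epsilon\to+0} \int_{-\infty}^{\infty} E(d\lambda)\, L_\pm(\lambda \pm i\epsilon)e. \]
Now, the spectral integral $\int_{-\infty}^{\infty} E(d\lambda)\, L_\pm(\lambda)e$ exists and it turns out by straightforward calculation that one can interchange s-lim and integral, i.e. finally we have
\[ W_\pm(\Gamma e) = \int_{-\infty}^{\infty} E(d\lambda)\, L_\pm(\lambda)e. \]
Straightforward extension to the spectral integrals $\int_{-\infty}^{\infty} E_0(d\mu)\, \Gamma f(\mu)$ yields (2.7).

(2.8). Correspondingly, first we calculate $W_\pm^* e$. According to (2.5) we have
\begin{align*}
W_\pm^* e &= \text{s-}\lim_{\epsilon\to+0} \int_{-\infty}^{\infty} E_0(d\lambda)\, P_{\mathcal{E}}^{\perp}(\Gamma + \Gamma^*) R(\lambda \pm i\epsilon)e \\
&= \text{s-}\lim_{\epsilon\to+0} \int_{-\infty}^{\infty} E_0(d\lambda)\, \Gamma P_{\mathcal{E}} R(\lambda \pm i\epsilon) P_{\mathcal{E}} e \\
&= \text{s-}\lim_{\epsilon\to+0} \int_{-\infty}^{\infty} E_0(d\lambda)\, \Gamma L_\pm(\lambda \pm i\epsilon)^{-1} e.
\end{align*}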
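Again, the spectral integral $\int_{-\infty}^{\infty} E_0(d\lambda)\, \Gamma L_\pm(\lambda)^{-1} e$ exists and we can interchange s-lim and integral, i.e. we arrive at
\[ W_\pm^* e = \int_{-\infty}^{\infty} E_0(d\lambda)\, \Gamma L_\pm(\lambda)^{-1} e. \]
Extension to the spectral integrals $\int_{-\infty}^{\infty} E(d\lambda)\, g(\lambda)$ gives (2.8).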
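Therefore, $W_\pm(\mathcal{H}_{\mathcal{E}}^0) = \mathcal{H}_{\mathcal{E}}$ and $W_\pm(\mathcal{H}_0 \ominus \mathcal{H}_{\mathcal{E}}^0) = \mathcal{H} \ominus \mathcal{H}_{\mathcal{E}}$. Using (2.6), we get $\mathcal{H}_0 \ominus \mathcal{H}_{\mathcal{E}}^0 = \mathcal{H} \ominus \mathcal{H}_{\mathcal{E}}$. Note that this is compatible with $\mathcal{E} \subset \mathcal{H}_{\mathcal{E}}$. Thus, the wave operators act nontrivially only on $\mathcal{H}_{\mathcal{E}}^0$, $\mathcal{H}_{\mathcal{E}}$. Lemma 2.1 says: If $\lambda \to f(\lambda)$ is the representer of $x \in \mathcal{H}_{\mathcal{E}}^0$ with respect to $E_0$, then the representer of $W_\pm x \in \mathcal{H}_{\mathcal{E}}$ with respect to $E$ is given by $\lambda \to L_\pm(\lambda) f(\lambda)$. Conversely, if $\lambda \to g(\lambda)$ is the representer of $y \in \mathcal{H}_{\mathcal{E}}$ with respect to $E$, then the representer of $W_\pm^* y \in \mathcal{H}_{\mathcal{E}}^0$ with respect to $E_0$ is given by $\lambda \to L_\pm(\lambda)^{-1} g(\lambda)$. In general, operator functions with these properties are called the wave matrices of $W_\pm$, $W_\pm^*$ with respect to given fixed spectral representations (see [15, p. 177] for these concepts). Note that wave matrices are well defined only if the spectral representations are fixed.

Lemma 2.2. The wave matrices of $W_\pm$, $W_\pm^*$ with respect to the natural spectral representations in $\mathcal{H}_{\mathcal{E}}^0$, $\mathcal{H}_{\mathcal{E}}$ are given by
\[ W_\pm(\lambda) = L_\pm(\lambda), \qquad W_\pm^*(\lambda) = L_\pm(\lambda)^{-1}, \qquad \lambda \in \mathbb{R}. \]

Note that in the natural spectral representation of $\mathcal{H}_{\mathcal{E}}^0$, the vectors $\Gamma e$, $e \in \mathcal{E}$, are considered in some sense as “constants”, whereas the corresponding function as a function in $\mathcal{H}_0$ with respect to the usual $\mathcal{K}$-representation is given by $\lambda \to (\Gamma e)(\lambda) = M(\lambda)e$. As is well known (see, e.g., [15, p. 398]), the scattering matrix $S_{\mathcal{K}}(\lambda) := (W_+^* W_-)(\lambda)$ in the usual $\mathcal{K}$-representation of $\mathcal{H}_0 = \mathcal{H}_{\mathcal{E}}^0 \oplus (\mathcal{H}_0 \ominus \mathcal{H}_{\mathcal{E}}^0)$ is given by
\[ S_{\mathcal{K}}(\lambda) = \mathbf{1}_{\mathcal{K}} - 2\pi i\, M(\lambda) L_+(\lambda)^{-1} M(\lambda)^*, \qquad \lambda \in \mathbb{R}. \tag{2.9} \]

Lemma 2.3. On $\mathcal{H}_{\mathcal{E}}^0$ and with respect to the natural spectral representation of $\mathcal{H}_{\mathcal{E}}^0$ the scattering matrix $S_{\mathcal{E}}(\cdot)$ is given by
\[ S_{\mathcal{E}}(\lambda) = L_+(\lambda)^{-1} L_-(\lambda) = L_+(\lambda)^{-1} L_+(\lambda)^*. \tag{2.10} \]
This means: if $f \in \mathcal{H}_{\mathcal{E}}^0$ and $\tilde f(\cdot)$ is its representer with respect to $E_0$, i.e. $f(\lambda) = M(\lambda)\tilde f(\lambda)$, then $S_{\mathcal{E}}(\lambda)\tilde f(\lambda)$ is the $E_0$-representer of $Sf$, where $(Sf)(\lambda) = S_{\mathcal{K}}(\lambda) f(\lambda)$.

Proof. We have to prove that $S_{\mathcal{K}}(\lambda) M(\lambda)\tilde f(\lambda) = M(\lambda) S_{\mathcal{E}}(\lambda)\tilde f(\lambda)$. But this is obvious because of
\[ M(\lambda) L_+(\lambda)^{-1} L_-(\lambda) = \bigl(\mathbf{1}_{\mathcal{K}} - 2\pi i\, M(\lambda) L_+(\lambda)^{-1} M(\lambda)^*\bigr) M(\lambda) = S_{\mathcal{K}}(\lambda) M(\lambda). \tag{2.11} \]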
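The step called obvious in the proof of Lemma 2.3 can be spelled out as follows. This is a sketch added for the reader rather than the author's own argument. It assumes the Sokhotski–Plemelj formula for the boundary values of (2.1), which is consistent with the spectral density formula following (2.2), and it writes $\mathrm{p.v.}$ for the principal value.

```latex
\begin{align*}
L_\pm(\lambda)\restriction\mathcal{E}
  &= (\lambda - H_0)\restriction\mathcal{E}
     - \mathrm{p.v.}\!\int_{-\infty}^{\infty}
       \frac{M(\mu)^* M(\mu)}{\lambda - \mu}\, d\mu
     \pm i\pi\, M(\lambda)^* M(\lambda), \\
L_+(\lambda) - L_-(\lambda)
  &= 2\pi i\, M(\lambda)^* M(\lambda), \\
M(\lambda) L_+(\lambda)^{-1} L_-(\lambda)
  &= M(\lambda) L_+(\lambda)^{-1}
     \bigl(L_+(\lambda) - 2\pi i\, M(\lambda)^* M(\lambda)\bigr) \\
  &= \bigl(\mathbf{1}_{\mathcal{K}}
     - 2\pi i\, M(\lambda) L_+(\lambda)^{-1} M(\lambda)^*\bigr) M(\lambda)
   = S_{\mathcal{K}}(\lambda) M(\lambda).
\end{align*}
```

The first line also shows $L_-(\lambda) = L_+(\lambda)^*$ on the real axis, which is the second equality in (2.10).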
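Remark 2.4. In the following, we restrict the consideration to the case that $\Gamma\mathcal{E}$ is generating for $H_0$ and $\mathcal{E}$ is generating for $H$, i.e. we assume $\mathcal{H}_{\mathcal{E}} = \mathcal{H}$ and $\mathcal{H}_{\mathcal{E}}^0 = \mathcal{H}_0$. This implies $\dim \mathcal{E} = \dim \mathcal{K}$. Moreover, the operator function $\lambda \to M(\lambda) \in \mathcal{L}(\mathcal{E} \to \mathcal{K})$ is then invertible for all $\lambda$, $M(\lambda)^{-1} \in \mathcal{L}(\mathcal{K} \to \mathcal{E})$.

3. Gelfand Triplets

3.1. The Schwartz space triplet on $\mathcal{H}_0$ and its transformation to $\mathcal{H}$

By $\mathcal{S}$ we denote the space of all Schwartz functions $\lambda \to s(\lambda) \in \mathcal{K}$ with values in $\mathcal{K}$. The canonical norms on $\mathcal{S}$ are denoted by $\|\cdot\|_\sigma$, where $\sigma$ labels these norms. $\mathcal{S} \subset \mathcal{H}_0$ is dense in $\mathcal{H}_0$ with respect to the Hilbert space norm of $\mathcal{H}_0$. The space of all continuous antilinear forms on $\mathcal{S}$ is denoted by $\mathcal{S}^\times$. Then, $\mathcal{S} \subset \mathcal{H}_0 \subset \mathcal{S}^\times$ is a Gelfand triplet with respect to $H_0$, the Schwartz space triplet. The representer of $s$ in the $E_0$-representation is denoted by $\tilde s$, $s(\lambda) = M(\lambda)\tilde s(\lambda)$, $\lambda \to \tilde s(\lambda) \in \mathcal{E}$. By the wave operator $W_+$, the Schwartz space triplet can be transformed to a triplet with respect to $H$. We put $\mathcal{D} := W_+\mathcal{S}$ and equip $\mathcal{D}$ with the topology of $\mathcal{S}$. Thus, we obtain the triplet
\[ \mathcal{D} \subset \mathcal{H} \subset \mathcal{D}^\times. \tag{3.1} \]
Note that $\mathcal{D}^\times = W_+^\times \mathcal{S}^\times$, where $\mathcal{D}^\times \ni d^\times = W_+^\times s^\times$ is defined by
\[ \langle W_+^* d \mid s^\times\rangle = \langle d \mid W_+^\times s^\times\rangle, \qquad d \in \mathcal{D}. \]

Lemma 3.1. The triplet (3.1) satisfies the following properties:

(i) $\mathcal{E} \subset \mathcal{D}$ and $\mathcal{E} = W_+ T$, where $T := \{f \in \mathcal{H}_0 : f(\lambda) = M(\lambda) L_+(\lambda)^{-1} e,\ e \in \mathcal{E}\}$ is an $N$-dimensional subspace of $\mathcal{H}_0$ with $T \subset \mathcal{S}$,
(ii) $\mathcal{D} = \Phi \oplus \mathcal{E}$, where $\Phi := \{W_+ s : s \in \mathcal{S} \cap (\mathcal{H}_0 \ominus T)\} = P_{\mathcal{E}}^{\perp}\mathcal{D} \subset \mathcal{H}_0$ and $\Phi$ is dense in $\mathcal{H}_0$,
(iii) $\mathcal{D}^\times = \Phi^\times \times \mathcal{E}$ (Cartesian product), where $\Phi^\times$ is the space of all continuous antilinear forms on $\Phi$,
(iv) if $d = \phi + e$ and $d^\times = \{\phi^\times, e^\times\}$, then $\langle d \mid d^\times\rangle = \langle \phi \mid \phi^\times\rangle + (e, e^\times)_{\mathcal{E}}$,
(v) $H_0\Phi \subseteq \Phi$ and $H\mathcal{D} \subseteq \mathcal{D}$.

Proof. (i)–(iv) are obvious because of Lemma 2.1. (v) $H\mathcal{D} \subseteq \mathcal{D}$ is obvious because $H$ acts on the representers of elements in $\mathcal{D}$ by multiplication by the spectral parameter; this implies $H_0\Phi \subseteq \Phi$ because of $H = H_0 + \Gamma + \Gamma^*$.

3.2. A modified Gelfand triplet

Recall that $\operatorname{spec}(H_0\restriction\mathcal{E})$ is a finite set of (real) eigenvalues. Let $(a, b) \subset \mathbb{R}$ be an open interval with $\operatorname{spec}(H_0\restriction\mathcal{E}) \subset (a, b)$. Further let $G_0 \subset \mathbb{C}$ be an (open) connected symmetric region (symmetric with respect to complex conjugation) such that $G_0 \cap \mathbb{R} = (a, b)$.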
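Assumption 3. The operator function $\mathbb{R} \ni \lambda \to M(\lambda) \in \mathcal{L}(\mathcal{E} \to \mathcal{K})$ has a holomorphic continuation into $G_0$.

Then, $L_+(\cdot)$ is holomorphic in $\mathbb{C}_+ \cup G_0$ and $L_+(\cdot)^{-1}$ is meromorphic there and even holomorphic in $\mathbb{C}_+ \cup (a, b)$. We introduce a modified Gelfand triplet: Recall first that the Schwartz functions have the representation $s(\lambda) = M(\lambda) L_+(\lambda)^{-1} x(\lambda)$, $x(\lambda) \in \mathcal{E}$, where the representer in the $E_0$-representation is given by $\tilde s(\lambda) = L_+(\lambda)^{-1} x(\lambda)$. Now, let $\mathcal{S}_0 \subset \mathcal{S}$ be the following submanifold of the Schwartz space:
\[ \mathcal{S}_0 := \{s \in \mathcal{S} : \lambda \to x(\lambda) \text{ is holomorphic continuable into } G_0\}. \]
$\mathcal{S}_0$ is dense in $\mathcal{S}$ with respect to the Schwartz topology. The (stronger) topology in $\mathcal{S}_0$ is defined by the collection of norms
\[ \|s_0\|_{\sigma,K} := \|s_0\|_\sigma + \sup_{z \in K \subset G_0} \|x(z)\|_{\mathcal{E}}, \]
where $K$ runs through all compact subsets of $G_0$. Then $\mathcal{S}_0 \subset \mathcal{H}_0 \subset \mathcal{S}_0^\times$ is a modified Gelfand triplet with respect to $H_0$. The transformation of $\mathcal{S}_0$ to $\mathcal{H}$ is given, as before, by $\mathcal{D}_0 := W_+\mathcal{S}_0$. Then, $\mathcal{D}_0 \subset \mathcal{H} \subset \mathcal{D}_0^\times$ is a Gelfand triplet with respect to $H$. Similarly as in Lemma 3.1 we obtain

Lemma 3.2. The modified Gelfand triplet satisfies the following properties:

(i) $\mathcal{E} \subset \mathcal{D}_0$,
(ii) $\mathcal{D}_0 = \Phi_0 \oplus \mathcal{E}$, where $\Phi_0 = P_{\mathcal{E}}^{\perp}\mathcal{D}_0$,
(iii) $\mathcal{D}_0^\times = \Phi_0^\times \times \mathcal{E}$ and for $d_0 = \phi_0 + e$, $d_0^\times = \{\phi_0^\times, e^\times\}$, one has $\langle d_0 \mid d_0^\times\rangle = \langle \phi_0 \mid \phi_0^\times\rangle + (e, e^\times)_{\mathcal{E}}$,
(iv) $H_0\Phi_0 \subseteq \Phi_0$ and $H\mathcal{D}_0 \subseteq \mathcal{D}_0$.

Proof. (i) Since the functions $x(\cdot)$ for the elements $f \in T$ are given by $x(\lambda) = e$ for all $\lambda$, i.e. by constants, the condition of holomorphic continuability is obviously satisfied. (ii)–(iv) are true because of Lemma 3.1.

Remark 3.3. A simple example satisfying Assumptions 1–3 is given for multiplicity $N = 1$, i.e. $\mathcal{E} = \mathbb{C}e_0$; then, according to Remark 2.4, one has also $\mathcal{K} = \mathbb{C}$. Let $\lambda_0 \in \mathbb{R}$ be the eigenvalue of $H_0$, $H_0 e_0 = \lambda_0 e_0$. Choose $\Gamma e_0(\lambda) := e^{-\lambda^2/2}$. Then,
\[ \Gamma^*(z - H_0)^{-1}\Gamma e_0 = \int_{-\infty}^{\infty} \frac{e^{-\lambda^2}}{z - \lambda}\, d\lambda\; e_0 \]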
and
\[ L_+(z) = z - \lambda_0 + \int_{-\infty}^{\infty} \frac{e^{-\lambda^2}}{\lambda - z}\, d\lambda, \]
where we have omitted the factor $e_0$. Let $x_0 \in \mathbb{R}$. The calculation $z \to x_0 + i0$ gives
\[ L_+(x_0) = x_0 - \lambda_0 + i\pi e^{-x_0^2} + \int_{-\infty}^{\infty} \frac{e^{-\lambda^2}}{\lambda - x_0}\, d\lambda, \]
where the integral is the Cauchy principal value. This shows that $L_+(x_0) = 0$ is impossible because the Cauchy principal value is real, so that the imaginary part of $L_+(x_0)$ equals $\pi e^{-x_0^2} > 0$. That is, Assumptions 1 and 2 are satisfied. Assumption 3 is satisfied because $\lambda \to e^{-\lambda^2/2}$ is holomorphic in $\mathbb{C}$, hence $z \to L_+(z)$ is also holomorphic in $\mathbb{C}$. The same is true for $L_-(\cdot)$.

3.3. Resonances

We define the concept of resonance for the Friedrichs model satisfying Assumptions 1, 2 and 3 as follows: The point $\zeta_0 \in G_0 \cap \mathbb{C}_-$ is called a resonance if $\det L_+(\zeta_0) = 0$. In other words, $\zeta_0$ is a resonance iff $\zeta_0$ is a pole of $L_+(\cdot)^{-1}$, i.e. a pole of the analytic continuation of the partial resolvent into $G_0 \cap \mathbb{C}_-$. From Lemma 2.3, we obtain: A point $\zeta_0 \in G_0 \cap \mathbb{C}_-$ is a pole of $L_+(\cdot)^{-1}$ iff it is a pole of $S_{\mathcal{K}}(\cdot)$, resp. of $S_{\mathcal{E}}(\cdot)$.

4. Results

The first result (Theorem 4.1) says that exactly the resonances are eigenvalues of the extended Hamiltonian $H^\times$ with respect to the modified Gelfand triplet for $H$, if for the corresponding eigenvectors a certain analyticity condition is required.

Theorem 4.1. The point $\zeta_0 \in G_0 \cap \mathbb{C}_-$ is an eigenvalue of the extended Hamiltonian $H^\times$ with respect to the Gelfand triplet $\mathcal{D}_0 \subset \mathcal{H} \subset \mathcal{D}_0^\times$ with eigenantilinear form $d_0^\times := \{\phi_0^\times(\zeta_0, e_0), e_0\}$ satisfying the eigenvalue equation $H^\times d_0^\times = \zeta_0 d_0^\times$, where $\phi_0^\times(\zeta_0, e_0)$ is the analytic continuation into $G_0 \cap \mathbb{C}_-$ of a holomorphic vector antilinear form $\phi_0^\times(z, e_0)$ in $\mathbb{C}_+$, iff $\zeta_0$ is a resonance. The antilinear form $\mathbb{C}_+ \ni z \to \phi_0^\times(z, e)$ is given by
\[ \langle \phi \mid \phi_0^\times(z, e)\rangle := \bigl(\phi, (z - H_0)^{-1}\Gamma e\bigr)_{\mathcal{H}_0}, \qquad \phi \in \Phi_0,\ z \in \mathbb{C}_+, \]
and $e_0$ satisfies $L_+(\zeta_0)e_0 = 0$, i.e. $e_0 \in \ker L_+(\zeta_0)$. That is, the (generalized) eigenspace of $\zeta_0$ is $q$-dimensional, where $q$ is the geometric multiplicity of the eigenvalue 0 of $L_+(\zeta_0)$.

The second result (Theorem 4.2) concerns the structure of the corresponding eigenantilinear form $s_0^\times$ of $H_0^\times$ with respect to the modified Schwartz space triplet. This antilinear form is given by
\[ s_0^\times(\zeta_0, e_0) = (W_+^*)^\times d_0^\times(\zeta_0, e_0). \]
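Returning to the scalar example of Remark 3.3, the objects entering Theorem 4.1 can be explored numerically. The Python sketch below is an illustration under stated assumptions, not part of the original argument. It evaluates $L_+(z)$ by a trapezoidal rule for $\operatorname{Im} z > 0$ and, for $\operatorname{Im} z < 0$, uses the continuation obtained by adding the jump $2\pi i\, e^{-z^2}$ of the Cauchy integral across the real axis. The function names, the choice $\lambda_0 = 0$ and the discretization parameters are hypothetical.

```python
import cmath
import math


def cauchy_integral(z: complex, half_width: float = 8.0, step: float = 5e-4) -> complex:
    """Trapezoidal approximation of the integral of exp(-lam^2) / (lam - z) over R.

    Requires Im z != 0; the step should be well below |Im z| for accuracy.
    """
    if z.imag == 0.0:
        raise ValueError("z must lie off the real axis")
    n = int(round(2.0 * half_width / step))
    total = 0.0 + 0.0j
    for j in range(n + 1):
        lam = -half_width + j * step
        weight = 0.5 if j in (0, n) else 1.0
        total += weight * math.exp(-lam * lam) / (lam - z)
    return total * step


def livsic_plus(z: complex, lam0: float = 0.0) -> complex:
    """Scalar L_+(z) of Remark 3.3, continued analytically into the lower half-plane."""
    value = z - lam0 + cauchy_integral(z)
    if z.imag < 0.0:
        # Jump of the Cauchy integral across the real axis (Sokhotski-Plemelj).
        value += 2j * math.pi * cmath.exp(-z * z)
    return value


if __name__ == "__main__":
    x0, eps = 0.5, 0.01
    above = livsic_plus(complex(x0, eps))
    below = livsic_plus(complex(x0, -eps))
    print("Im L_+ just above the axis:", above.imag)
    print("pi * exp(-x0^2):           ", math.pi * math.exp(-x0 * x0))
    print("|L_+(x0 + i eps) - L_+(x0 - i eps)|:", abs(above - below))
```

The imaginary part just above the real axis stays close to $\pi e^{-x_0^2} > 0$, in line with Assumption 2, and the values on both sides of the axis nearly agree, as expected for the continuation in which zeros of $L_+$ (resonances) are sought.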
It turns out that $s_0^\times$ is an antilinear form on $\mathcal{S}_0$ of a pure Dirac type with respect to the point $\zeta_0$ and there is a very simple transformation formula from $e_0$ to the corresponding vector $k_0 \in \mathcal{K}$.

Theorem 4.2. The eigenantilinear form $s_0^\times$ of $H_0^\times$ with respect to the Gelfand triplet $\mathcal{S}_0 \subset \mathcal{H}_0 \subset \mathcal{S}_0^\times$, associated to $d_0^\times$ by $s_0^\times := (W_+^*)^\times d_0^\times$, is given by
\[ \langle s \mid s_0^\times(\zeta_0, e_0)\rangle = 2\pi i\, (s(\zeta_0), k_0)_{\mathcal{K}}, \qquad s \in \mathcal{S}_0, \]
where $k_0 := M(\zeta_0)e_0$.

The third result (Corollary 4.3) connects the eigenantilinear form $s_0^\times(\zeta_0, e_0)$ with a corresponding Gamov vector which is uniquely determined by $s_0^\times$. Recall that pre-Gamov vectors are considered (in this paper) as the eigenvectors of the truncated evolution $t \to Q_+ e^{-itH_0}\restriction\mathcal{H}^2_+$, $t \ge 0$, where $\mathcal{H}^2_+ \subset \mathcal{H}_0$ is the Hardy subspace for $\mathbb{C}_+$ and $Q_+$ the projection onto this Hardy subspace. The truncated evolution is a strongly continuous contractive semigroup on $\mathcal{H}^2_+$ of the Toeplitz type (see, e.g., [11]). As is well known, each point $\zeta \in \mathbb{C}_-$ is an eigenvalue of the generator of this semigroup and the corresponding eigenspace is given by $\{f \in \mathcal{H}^2_+ : f(\lambda) := k(\lambda - \zeta)^{-1},\ k \in \mathcal{K}\}$, i.e. the dimension of the eigenspace of $\zeta$ coincides with $\dim \mathcal{K}$. Now, the decisive question is which pre-Gamov vectors are connected with eigenantilinear forms of $H_0^\times$. The first answer is that one has to select the poles of $L_+(\cdot)^{-1}$, resp. of $S_{\mathcal{K}}(\cdot)$. However, the question remains: Which values of $k \in \mathcal{K}$ have to be chosen such that the pre-Gamov vector given by $k$ is in fact connected to an eigenantilinear form of $H_0^\times$? Recall first that $\mathcal{S}_0 \cap \mathcal{H}^2_+ \subset \mathcal{H}^2_+$ is dense in $\mathcal{H}^2_+$ with respect to the Hilbert space norm of $\mathcal{H}^2_+$. The mentioned connection is then simply given by restriction of $s_0^\times$ to $\mathcal{S}_0 \cap \mathcal{H}^2_+$.

Corollary 4.3. The restricted eigenantilinear form $s_0^\times\restriction\mathcal{S}_0 \cap \mathcal{H}^2_+$,
\[ \mathcal{S}_0 \cap \mathcal{H}^2_+ \ni s \to 2\pi i\, (s(\zeta_0), k_0)_{\mathcal{K}}, \]
is even continuous with respect to the Hilbert space topology of $\mathcal{H}^2_+$, i.e. it can be continuously extended onto $\operatorname{clo}(\mathcal{S}_0 \cap \mathcal{H}^2_+) = \mathcal{H}^2_+$. That is, $s_0^\times\restriction\mathcal{H}^2_+$ is realized by the $\mathcal{H}^2_+$-vector $k_0(\zeta_0 - \lambda)^{-1}$ via the relation
\[ 2\pi i\, (s(\zeta_0), k_0) = \int_{-\infty}^{\infty} \left(s(\lambda), \frac{k_0}{\zeta_0 - \lambda}\right)_{\mathcal{K}} d\lambda. \tag{4.1} \]

Proof. (4.1) follows immediately from the Paley–Wiener theorem.

Corollary 4.3 means: The restriction on $\mathcal{H}^2_+$ of the eigenantilinear form $s_0^\times$, which is the back transform $s_0^\times = (W_+^*)^\times d_0^\times$ of $d_0^\times$, associated to the resonance $\zeta_0$ and to the parameter vector $e_0 \in \ker L_+(\zeta_0)$, to the Hilbert space $\mathcal{H}_0$, resp. the corresponding Gelfand triplet, yields the associated Gamov vector $\lambda \to k_0(\zeta_0 - \lambda)^{-1}$,
where $k_0 = M(\zeta_0)e_0$. Conversely, exactly the pre-Gamov vectors where $\zeta_0$ is a resonance and $k_0 = M(\zeta_0)e_0$ with $e_0 \in \ker L_+(\zeta_0)$ have an extension (or “continuation”) to an eigenantilinear form of the extended Hamiltonian $H^\times$ with respect to the Gelfand triplet $\mathcal{D}_0 \subset \mathcal{H} \subset \mathcal{D}_0^\times$. That is, exactly these pre-Gamov vectors are true Gamov vectors.

The last result presents a simple partial answer to the question of how the parameter space $M(\zeta_0)\ker L_+(\zeta_0)$ can be derived from the Laurent expansion of the scattering matrix $S_{\mathcal{E}}(\cdot)$ at $\zeta_0$.

Proposition 4.4. If $\zeta_0$ is a simple pole of $S_{\mathcal{E}}(\cdot)$, then
\[ \ker L_+(\zeta_0) = \operatorname{ima}\{\operatorname{Res}_{z=\zeta_0} S_{\mathcal{E}}(z)\}. \tag{4.2} \]

Proof. An easy calculation gives $\ker L_+(\zeta_0) = \operatorname{ima} L_{-1} = \operatorname{ima}(L_{-1} L_+(\bar\zeta_0)^*)$, where $L_{-1} = \operatorname{Res}_{z=\zeta_0} L_+(z)^{-1}$. This gives (4.2). Note that $(L_+(\bar\zeta_0)^*)^{-1}$ exists.

Remark 4.5. The relation between the order $g$ of the pole $\zeta_0$ of $S_{\mathcal{E}}(\cdot)$ and $q := \dim \ker L_+(\zeta_0)$ is complicated. If $m \le N = \dim \mathcal{E}$ is the algebraic multiplicity of the eigenvalue 0 of $L_+(\zeta_0)$ and $r$, $1 \le r \le m$, the order of the zero $\zeta_0$ of $\det L_+(z)$, then in any case $1 \le g \le r$ (see, e.g., [16] for details).

5. The Case of the Friedrichs Model on the Half Axis $[0, \infty)$

In this case, $\mathcal{H}_0$ has to be replaced by $P_+\mathcal{H}_0 \cong L^2(\mathbb{R}_+, \mathcal{K}, d\lambda)$, where $\mathbb{R}_+ := [0, \infty)$ and $P_+$ is the projection given as the multiplication operator by $\chi_{\mathbb{R}_+}(\cdot)$, where $\chi$ denotes the corresponding characteristic function. Now, one assumes $(a, b) \subset \mathbb{R}_+$. $L_+(\cdot)$ and $L_-(\cdot)$ are branches of a unique analytic function, defined on $\mathbb{C}_{>0} := \{z \in \mathbb{C} : z \notin \mathbb{R}_+\}$, which is again denoted by $L_\pm(\cdot)$ if it is considered in $\mathbb{C}_\pm$. Then, $\mathcal{S}$ is replaced by $\mathcal{S}_+ := P_+(\mathcal{S} \cap \mathcal{H}^2_+)$. Then
\[ \mathcal{S}_+ \subset P_+\mathcal{H}^2_+ \subset P_+\mathcal{H}_0, \]
where the inclusions are dense inclusions with respect to the Hilbert space topology of $P_+\mathcal{H}_0$. Note that $P_+$ is a bijection on $\mathcal{H}^2_+$, i.e. $P_+\mathcal{H}^2_+ \ni s \to P_+^{-1}s =: \tilde s \in \mathcal{H}^2_+$ is uniquely defined by $s$. Therefore we choose in $\mathcal{S}_+$ the topology induced by $P_+$ from $\mathcal{S} \cap \mathcal{H}^2_+$, i.e. the Schwartz space topology used in Sec. 3.1. The assumption on $M(\cdot)$ is replaced by $M(\cdot) \in \mathcal{S}_+(\mathcal{L}(\mathcal{E} \to \mathcal{K}))$, which means that all matrix elements of $M(\cdot)$ are elements of $\mathcal{S}_+$-type (in the scalar sense). Assumption 2 remains unchanged, as does the definition of $\mathcal{S}_{0,+}$ (the former $\mathcal{S}_0$), which is then also dense in $P_+\mathcal{H}^2_+$, even with respect to the Hilbert space topology of $\mathcal{H}^2_+$, induced by $P_+$ in $P_+\mathcal{H}^2_+$. Then, Theorem 4.1 remains true literally, also Theorem 4.2. Concerning Corollary 4.3, note that in this case the eigenantilinear form
\[ \mathcal{S}_{0,+} \ni s \to 2\pi i\, (s(\zeta_0), k_0)_{\mathcal{K}} \]
is even continuous with respect to the Hilbert space topology of $\mathcal{H}^2_+$ (injected into $P_+\mathcal{H}^2_+$). Therefore, it can be continuously extended onto $\mathcal{H}^2_+$ (via $P_+^{-1}$) such that (4.1) is true also in this case. The Gamov vector, considered in the unperturbed Hilbert space $P_+\mathcal{H}_0$, is then again given by
$$\mathbb{R}_+ \ni \lambda \to \frac{k_0}{\zeta_0 - \lambda}.$$
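Proposition 4.4 remains true in this case. For the interplay between $\mathbb{R}$, $\mathbb{R}_+$ and the Hardy spaces, i.e. the interplay between $P_+$ and the projections $Q_\pm$ onto the Hardy spaces, which is a special case of what Halmos called “Two subspaces in generic position” ([20], refinement by Kato, see [21]), see [17, 19], for example.

6. Proofs

6.1. Proof of Theorem 1

The eigenvalue equation for eigenvalues $\zeta_0 \in G_0 \cap \mathbb{C}_-$ of $H^\times$ with respect to the triplet $D_0 \subset \mathcal{H} \subset D_0^\times$ reads
$$\langle d \mid H^\times d_0^\times \rangle = \langle d \mid \zeta_0 d_0^\times \rangle, \quad d \in D_0,$$
or
$$\langle Hd \mid d_0^\times \rangle = \zeta_0 \langle d \mid d_0^\times \rangle, \quad d \in D_0,$$
where $d = \phi + e$, $\phi \in \Phi_0$, $e \in \mathcal{E}$, with $d_0^\times = \{\phi_0^\times, e_0\}$, $\phi_0^\times \in \Phi_0^\times$, $e_0 \in \mathcal{E}$. This is equivalent to
$$(H_0 e - \zeta_0 e, e_0) + \langle \Gamma e \mid \phi_0^\times \rangle = \langle \zeta_0\phi - H_0\phi \mid \phi_0^\times \rangle - (\Gamma^*\phi, e_0).$$
Since $e$ and $\phi$ vary independently we obtain two equations:
$$((\zeta_0 - H_0)e, e_0) = \langle \Gamma e \mid \phi_0^\times \rangle, \quad e \in \mathcal{E}, \tag{6.1}$$
and
$$\langle (\zeta_0 - H_0)\phi \mid \phi_0^\times \rangle = (\Gamma^*\phi, e_0), \quad \phi \in \Phi_0. \tag{6.2}$$
$\phi_0^\times$ depends on $\zeta_0$, the possible eigenvalue (and on $e_0$). According to our analyticity condition for $\phi_0^\times$ this antilinear form is required to be the analytic continuation of a holomorphic vector antilinear form $\mathbb{C}_+ \ni z \to \phi_0^\times(z)$. This means that Eq. (6.2) has to be valid also on $\mathbb{C}_+$ and it is a vector antilinear form there:
$$((\bar{z} - H_0)\phi, \phi_0^\times(z))_{\mathcal{H}_0} = (\Gamma^*\phi, e_0)_{\mathcal{E}}, \quad z \in \mathbb{C}_+,\ \phi \in \Phi_0,$$
or
$$(\phi, (z - H_0)\phi_0^\times(z))_{\mathcal{H}_0} = (\phi, \Gamma e_0)_{\mathcal{H}_0}, \quad z \in \mathbb{C}_+,\ \phi \in \Phi_0.$$
This means $(z - H_0)\phi_0^\times(z) = \Gamma e_0$ or
$$\phi_0^\times(z) = (z - H_0)^{-1}\Gamma e_0, \quad z \in \mathbb{C}_+. \tag{6.3}$$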
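Now, we have to check that this antilinear form on $\Phi_0$ is analytically continuable into $\mathbb{C}_+ \cup G_0$ as a holomorphic antilinear form according to the requirement in Theorem 4.1: We have shown in Sec. 3.2 that the elements $s \in S_0$ have the representation $s(\lambda) = M(\lambda)L_+(\lambda)^{-1}x(\lambda)$, where $\lambda \to x(\lambda) \in \mathcal{E}$. Then, $(W_+s)(\lambda) = x(\lambda)$ and the function $x(\cdot)$ is holomorphically continuable into $G_0$. If $\zeta \in \mathbb{C}_+$, we have
$$\langle \phi \mid \phi_0^\times(\zeta) \rangle = (P_{\mathcal{E}}^\perp W_+ s, (\zeta - H_0)^{-1}\Gamma e_0) = (W_+ s, (\zeta - H_0)^{-1}\Gamma e_0)$$
$$= \left( \int_{-\infty}^{\infty} E(d\lambda)x(\lambda), (\zeta - H_0)^{-1}\Gamma e_0 \right) = \int_{-\infty}^{\infty} \frac{(E(d\lambda)x(\lambda), (\zeta - H_0)^{-1}\Gamma e_0)}{d\lambda}\, d\lambda.$$
Since $x(\lambda) = \sum_{j=1}^{N} x_j(\lambda)b_j$, where the $\{b_j\}_j$ form an orthonormal basis of $\mathcal{E}$, we obtain
$$\langle \phi \mid \phi_0^\times(\zeta) \rangle = \sum_{j=1}^{N} \int_{-\infty}^{\infty} x_j(\lambda)\, \frac{(E(d\lambda)b_j, R_0(\zeta)\Gamma e_0)}{d\lambda}\, d\lambda,$$
so that we have to calculate the expression
$$\frac{(E(d\lambda)e, R_0(\zeta)\Gamma e_0)}{d\lambda}$$
for any $e \in \mathcal{E}$. This calculation starts with the identity
$$(R(z)e, R_0(\zeta)\Gamma e_0) = (R_0(z)\Gamma L_+(z)^{-1}e, R_0(\zeta)\Gamma e_0), \quad z, \zeta \in \mathbb{C}_+,$$
where for the calculation of the right-hand side the explicit expression for the resolvent $R(z) = (z - H)^{-1}$ is used. This implies
$$(R(\mu \pm i0)e, R_0(\zeta)\Gamma e_0) = \frac{1}{\mu - \zeta}\left\{ (R_0(\bar{\zeta})\Gamma L_\pm(\mu)^{-1}e, \Gamma e_0) - (R_0(\mu \pm i0)\Gamma L_\pm(\mu)^{-1}e, \Gamma e_0) \right\}.$$
Using
$$\frac{E(d\mu)}{d\mu} = \frac{1}{2\pi i}\,(R(\mu - i0) - R(\mu + i0)),$$
finally, after a lengthy but straightforward calculation, we obtain
$$\frac{(E(d\mu)e, R_0(\zeta)\Gamma e_0)}{d\mu} = \frac{1}{\mu - \zeta}\left( L_\pm(\mu)^{-1}M(\mu)^*M(\mu)L_\mp(\mu)^{-1}e, (\zeta - \mu - L_+(\zeta))e_0 \right). \tag{6.4}$$
Inspection of (6.4) proves the assertion. Now we know that the antilinear form $\phi_0^\times(z)$ satisfies Eq. (6.3) for $z \in \mathbb{C}_+$. Therefore, $\phi_0^\times(\zeta, e_0)$ satisfies Eq. (6.2) for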
all $\zeta \in G_0 \cup \mathbb{C}_+$ (where now we have taken into account the second parameter $e_0$). Since $z \to \phi_0^\times(z, e_0)$ is holomorphic in the whole region $G_0 \cup \mathbb{C}_+$, we consider the (second) equation (6.1) first on $\mathbb{C}_+$. Then, it reads
$$((\bar{z} - H_0)e, e_0) = \langle \Gamma e \mid \phi_0^\times(z, e_0) \rangle = (\Gamma e, (z - H_0)^{-1}\Gamma e_0) = (e, \Gamma^*(z - H_0)^{-1}\Gamma e_0),$$
so that we have
$$(e, (z - H_0)e_0) - \langle \Gamma e \mid \phi_0^\times(z, e_0) \rangle = (e, L_+(z)e_0), \quad e \in \mathcal{E},\ z \in \mathbb{C}_+, \tag{6.5}$$
and Eq. (6.1) reads simply $(e, L_+(z)e_0) = 0$ for all $e \in \mathcal{E}$, which obviously has no solution in $\mathbb{C}_+ \cup (a, b)$. But by analytic continuation, the identity (6.5) is true also in $\mathbb{C}_- \cap G_0$. That is, Eq. (6.1) is equivalent to
$$L_+(\zeta_0)e_0 = 0, \quad \zeta_0 \in \mathbb{C}_- \cap G_0. \tag{6.6}$$
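The algebra behind (6.3), (6.5) and (6.6) can be illustrated in a finite-dimensional toy setting. The sketch below is not the Friedrichs model of the text: it assumes a real symmetric block matrix on $\mathcal{E} \oplus \mathcal{E}^\perp$, and the names `H_E`, `H_0`, `Gamma` and `livsic` are hypothetical, chosen only for illustration. Such a matrix has real eigenvalues only, so no resonance in $\mathbb{C}_-$ appears; the sketch merely checks that an eigenvector $\{\phi_0, e_0\}$ satisfies $\phi_0 = (z - H_0)^{-1}\Gamma e_0$ and $L(z)e_0 = 0$ with $L(z) = z - H_E - \Gamma^*(z - H_0)^{-1}\Gamma$.

```python
import numpy as np

rng = np.random.default_rng(0)
n_E, n_0 = 2, 5  # illustrative dimensions of E and of its complement


def random_symmetric(n):
    a = rng.normal(size=(n, n))
    return (a + a.T) / 2


H_E = random_symmetric(n_E)          # assumed: H restricted to E
H_0 = random_symmetric(n_0)          # assumed: H_0 on the complement of E
Gamma = rng.normal(size=(n_0, n_E))  # assumed coupling from E into the complement

# Full operator in the decomposition E (+) complement.
H = np.block([[H_E, Gamma.T], [Gamma, H_0]])


def livsic(z):
    """Finite-dimensional analogue of L_+(z) = z - H_E - Gamma^* (z - H_0)^{-1} Gamma."""
    resolvent_gamma = np.linalg.solve(z * np.eye(n_0) - H_0, Gamma)
    return z * np.eye(n_E) - H_E - Gamma.T @ resolvent_gamma


eigenvalues, eigenvectors = np.linalg.eigh(H)
z0 = eigenvalues[0]
d0 = eigenvectors[:, 0]
e0, phi0 = d0[:n_E], d0[n_E:]

# Analogue of (6.6): e0 lies in the kernel of L(z0).
residual_kernel = np.linalg.norm(livsic(z0) @ e0)
# Analogue of (6.3): phi0 = (z0 - H_0)^{-1} Gamma e0.
residual_phi = np.linalg.norm(
    phi0 - np.linalg.solve(z0 * np.eye(n_0) - H_0, Gamma @ e0)
)
print(residual_kernel, residual_phi)
```

In the model of the text the same algebra is applied after analytic continuation of $L_+(\cdot)$ into $G_0 \cap \mathbb{C}_-$, a step that has no counterpart in this matrix example.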
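Thus, Eq. (6.1) has a solution $\zeta_0$ with corresponding parameter $e_0 \in \mathcal{E}$ iff Eq. (6.6) is satisfied. Conversely, if $\zeta_0 \in \mathbb{C}_- \cap G_0$ and $e_0 \in \mathcal{E}$ satisfy Eq. (6.6) then $\zeta_0$ is an eigenvalue of $H^\times$ and $d_0^\times := \{\phi_0^\times(\zeta_0, e_0), e_0\}$ is a corresponding eigenantilinear form. The dimension of the eigenspace of $\zeta_0$ is then $\dim\ker L_+(\zeta_0)$.

6.2. Proof of Theorem 2

To calculate $s_0^\times(\zeta_0, e_0) = (W_+^*)^\times d_0^\times$ with $d_0^\times = \{\phi_0^\times(\zeta_0, e_0), e_0\}$, first we consider again $\phi_0^\times$ for $z \in \mathbb{C}_+$ and calculate $s_0^\times(z, e_0) = (W_+^*)^\times\{\phi_0^\times(z, e_0), e_0\}$. Later on, we consider the analytic continuation into $G_0 \cap \mathbb{C}_-$. We start with
$$\langle s \mid s_0^\times(z, e_0) \rangle = \langle W_+ s \mid d_0^\times(z, e_0) \rangle = \langle P_{\mathcal{E}}^\perp W_+ s \mid \phi_0^\times(z, e_0) \rangle + (P_{\mathcal{E}} W_+ s, e_0)$$
$$= (P_{\mathcal{E}}^\perp W_+ s, (z - H_0)^{-1}\Gamma e_0) + (P_{\mathcal{E}} W_+ s, e_0) = (W_+ s, (z - H_0)^{-1}\Gamma e_0) + (s, W_+^* e_0).$$
We have $W_+^* e_0 = \int_{-\infty}^{\infty} E_0(d\lambda)\Gamma L_+(\lambda)^{-1}e_0\, d\lambda$ and $W_+ s = \int_{-\infty}^{\infty} E(d\lambda)L_+(\lambda)\tilde{s}(\lambda)$, where $s = \int_{-\infty}^{\infty} E_0(d\lambda)\Gamma\tilde{s}(\lambda)$, i.e. $\tilde{s}(\cdot)$ is the representer of $s$ with respect to the $E_0$-representation, $s(\lambda) = M(\lambda)\tilde{s}(\lambda)$. Then
$$(W_+ s, R_0(z)\Gamma e_0) = \int_{-\infty}^{\infty} \frac{(E(d\lambda)L_+(\lambda)\tilde{s}(\lambda), R_0(z)\Gamma e_0)}{d\lambda}\, d\lambda.$$
Again, we use (6.4) for the calculation of this expression and obtain
$$(W_+ s, R_0(z)\Gamma e_0) = \int_{-\infty}^{\infty} \frac{1}{\mu - z}\left( L_-(\mu)^{-1}M(\mu)^*M(\mu)L_+(\mu)^{-1}L_+(\mu)\tilde{s}(\mu), (z - \mu - L_+(z))e_0 \right) d\mu$$
$$= -\int_{-\infty}^{\infty} (L_-(\mu)^{-1}M(\mu)^*s(\mu), e_0)\, d\mu + \int_{-\infty}^{\infty} \frac{1}{z - \mu}\left( L_-(\mu)^{-1}M(\mu)^*M(\mu)\tilde{s}(\mu), L_+(z)e_0 \right) d\mu.$$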
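Furthermore we have
$$(s, W_+^* e_0) = \left( s, \int_{-\infty}^{\infty} E_0(d\lambda)\Gamma L_+(\lambda)^{-1}e_0\, d\lambda \right) = \int_{-\infty}^{\infty} (s(\lambda), M(\lambda)L_+(\lambda)^{-1}e_0)_{\mathcal{K}}\, d\lambda = \int_{-\infty}^{\infty} (L_-(\lambda)^{-1}M(\lambda)^*s(\lambda), e_0)_{\mathcal{E}}\, d\lambda,$$
so that we finally obtain
$$\langle s \mid s_0^\times(z, e_0) \rangle = \left( \int_{-\infty}^{\infty} \frac{1}{\bar{z} - \mu}\, L_-(\mu)^{-1}M(\mu)^*s(\mu)\, d\mu,\ L_+(z)e_0 \right)_{\mathcal{E}}.$$
For the analytic continuation from $z \in \mathbb{C}_+$ into $\mathbb{C}_+ \cup G_0$, we have to check the integral
$$\Psi_-(\bar{z}) := \int_{-\infty}^{\infty} \frac{1}{\bar{z} - \mu}\, L_-(\mu)^{-1}M(\mu)^*s(\mu)\, d\mu. \tag{6.7}$$
Since this integral is the left factor in the scalar product, we substitute for the moment $z \to \bar{z}$, consider
$$\Psi_-(z) := \int_{-\infty}^{\infty} \frac{1}{z - \mu}\, L_-(\mu)^{-1}M(\mu)^*s(\mu)\, d\mu, \quad z \in \mathbb{C}_-, \tag{6.8}$$
and check the continuation into $\mathbb{C}_+$. Recall that $z \to \Psi_+(z)$ for $z \in \mathbb{C}_+$ is defined by one and the same formula (6.8). Then, we obtain for $z \in \mathbb{C}_+$
$$\Psi_-(z) = \Psi_+(z) + 2\pi i\, L_-(z)^{-1}M(\bar{z})^*s(z) = \Psi_+(z) + 2\pi i\,(L_+(\bar{z})^{-1})^*M(\bar{z})^*s(z).$$
Substituting again $z \to \bar{z}$, i.e. now we have $\bar{z} \in \mathbb{C}_+$ and $z \in \mathbb{C}_-$, we obtain
$$(\Psi_-(\bar{z}), L_+(z)e_0) = (\Psi_+(\bar{z}), L_+(z)e_0) + 2\pi i\,((L_+(z)^{-1})^*M(z)^*s(\bar{z}), L_+(z)e_0),$$
where $\Psi_+(\bar{z})$ is a holomorphic part such that the first term vanishes for $z = \zeta_0$. Then, we have
$$\langle s \mid s_0^\times(z, e_0) \rangle = 2\pi i\,(M(z)^*s(\bar{z}), L_+(z)^{-1}L_+(z)e_0) + (\Psi_+(\bar{z}), L_+(z)e_0)$$
and
$$\langle s \mid s_0^\times(\zeta_0, e_0) \rangle = 2\pi i\,(M(\zeta_0)^*s(\zeta_0), e_0) = 2\pi i\,(s(\zeta_0), M(\zeta_0)e_0)_{\mathcal{K}},$$
that is, the antilinear form $s_0^\times(\zeta_0, e_0)$ is of pure Dirac type with respect to the point $\zeta_0$ and the corresponding vector $k_0 \in \mathcal{K}$ with
$$\langle s \mid s_0^\times(\zeta_0, e_0) \rangle = 2\pi i\,(s(\zeta_0), k_0)_{\mathcal{K}}$$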
is given by $k_0 := M(\zeta_0)e_0$. This confirms the fact (which is known from the beginning) that the subspace of the admissible vectors $k \in \mathcal{K}$ has the dimension $\dim\ker L_+(\zeta_0)$, too.

Acknowledgment

It is a pleasure to thank Professor A. Bohm for discussions on the subject at the 3rd International Workshop on Pseudo-Hermitean Hamiltonians in Quantum Physics at Koç University, Istanbul, June 20–22 and at DESY Zeuthen, July 5, 2005.

References

[1] A. Bohm, Quantum Mechanics (Springer-Verlag, Berlin, 1979).
[2] E. Brändas and N. Elander (eds.), Resonances, Lecture Notes in Physics, Vol. 325 (Springer-Verlag, Berlin, 1989).
[3] S. Albeverio, J. C. Ferreira and L. Streit, Resonances - Models and Phenomena, Lecture Notes in Physics, Vol. 211 (Springer-Verlag, Berlin, 1984).
[4] G. Gamov, Zur Quantentheorie des Atomkerns, Z. Phys. 51 (1928) 204–212.
[5] A. Bohm and M. Gadella, Dirac Kets, Gamov Vectors and Gelfand Triplets, Lecture Notes in Physics, Vol. 348 (Springer-Verlag, Berlin, 1989).
[6] A. Bohm and N. L. Harshman, Quantum theory in the rigged Hilbert space - Irreversibility from causality, in Irreversibility and Causality, Semigroups and Rigged Hilbert Spaces, Lecture Notes in Physics, Vol. 504 (Springer-Verlag, Berlin, 1998), pp. 181–237.
[7] A. Bohm, S. Maxson, M. Loewe and M. Gadella, Quantum mechanical irreversibility, Phys. A 236 (1997) 485–549.
[8] I. M. Gelfand and N. J. Wilenkin, Verallgemeinerte Funktionen (Distributionen), IV (VEB Deutscher Verlag der Wissenschaften, Berlin, 1964).
[9] H. Baumgärtel, Resonanzen und Gelfandsche Raumtripel, Math. Nachr. 72 (1976) 93–98.
[10] H. Baumgärtel, Resonances of Perturbed Self Adjoint Operators and their Eigenfunctionals, Math. Nachr. 75 (1976) 133–151.
[11] Y. Strauss, Resonances in the rigged Hilbert space and Lax–Phillips scattering theory, Internat. J. Theoret. Phys. 42 (2003) 2285–2317.
[12] E. Eisenberg, L. P. Horwitz and Y. Strauss, The Lax–Phillips semigroup of the unstable quantum system, in Irreversibility and Causality, Semigroups and Rigged Hilbert Spaces, Lecture Notes in Physics, Vol. 504 (Springer-Verlag, Berlin, 1998), pp. 323–332.
[13] H. Baumgärtel, Eine Bemerkung zur Theorie der Wellenoperatoren, Math. Nachr. 42 (1969) 359–363.
[14] H. Baumgärtel, Integraldarstellungen der Wellenoperatoren von Streusystemen, Mber. Dt. Akad. Wiss. 9 (1967) 169–174.
[15] H. Baumgärtel and M. Wollenberg, Mathematical Scattering Theory, Operator Theory: Advances and Applications, Vol. 9 (Birkhäuser-Verlag, Basel, Boston, Stuttgart, 1983).
[16] H. Baumgärtel, Analytic Perturbation Theory for Matrices and Operators, Operator Theory: Advances and Applications, Vol. 15 (Birkhäuser-Verlag, Basel, Boston, Stuttgart, 1985).
[17] H. Baumgärtel, Gamov vectors for resonances: A Lax–Phillips point of view, arXiv: math-ph/0407059.
[18] H. Baumgärtel, Gamov vectors for resonances: A Lax–Phillips approach, Inst. Phys. Conf. Ser. 185 (2005) 151–156.
[19] H. Baumgärtel, On Lax–Phillips semigroups, to appear in J. Operator Theory; arXiv: math-ph/0410036.
[20] P. R. Halmos, Two subspaces, Trans. Amer. Math. Soc. 144 (1969) 381–389.
[21] T. Kato, Perturbation Theory for Linear Operators (Springer-Verlag, Berlin, 1976).
J070-00259
Reviews in Mathematical Physics Vol. 18, No. 1 (2006) 79–117 © World Scientific Publishing Company
ON THE HERMITICITY OF q-DIFFERENTIAL OPERATORS AND FORMS ON THE QUANTUM EUCLIDEAN SPACES $\mathbb{R}_q^N$
GAETANO FIORE Dip. di Matematica e Applicazioni, Fac. di Ingegneria, Universit` a di Napoli, V. Claudio 21, 80125 Napoli and I.N.F.N., Sezione di Napoli, Complesso MSA, V. Cintia, 80126 Napoli gaetano.fi
[email protected] Received 3 June 2005 Revised 17 January 2006

We show that the complicated $\star$-structure characterizing for positive $q$ the $U_q so(N)$-covariant differential calculus on the noncommutative manifold $\mathbb{R}_q^N$ boils down to similarity transformations involving the ribbon element of a central extension of $U_q so(N)$ and its formal square root $\tilde{v}$. Subspaces of the spaces of functions and of $p$-forms on $\mathbb{R}_q^N$ are made into Hilbert spaces by introducing non-conventional “weights” in the integrals defining the corresponding scalar products, namely suitable positive-definite $q$-pseudodifferential operators $\tilde{v}^{\pm 1}$ realizing the action of $\tilde{v}^{\pm 1}$; this serves to make the partial $q$-derivatives anti-hermitean and the exterior coderivative equal to the hermitean conjugate of the exterior derivative, as usual. There is a residual freedom in the choice of the weight $m(r)$ along the “radial coordinate” $r$. Unless we choose a constant $m$, the square-integrable functions/forms must fulfill an additional condition, namely, their analytic continuations to the complex $r$ plane can have poles only on the sites of some special lattice. Among the functions naturally selected by this condition there are $q$-special functions with “quantized” free parameters.

Keywords: Hopf algebras; quantum groups and related algebraic methods; $\star$-structures; differential calculus; noncommutative geometry on noncompact manifolds.

Mathematics Subject Classification 2000: 81R50, 81R60, 16W10, 16W30, 20G42
1. Introduction

Over the past two decades, the noncommutative geometry program [4] and the related programs of generalizing the concept of symmetries through quantum groups [8, 40, 10] and quantum group covariant noncommutative spaces (shortly: quantum spaces) [28, 10] have found widespread interest in the mathematical and theoretical physics community and accomplished substantial progress. Initially, mathematical investigations have been concentrated essentially on compact
noncommutative manifolds, the non-compact ones being usually much more complicated to deal with, especially when trying to proceed from an algebraic to a functional-analytical treatment. In particular, so are $\star$-structures and $\star$-representations of the involved algebras. Recently, an increasing number of works has been devoted to extending results to non-compact noncommutative manifolds. We might divide these works into two subgroups. The first (see, e.g., [5, 18, 19, 21, 37]) essentially deals with non-compact noncommutative manifolds which can be obtained by isospectral deformations [6] of commutative Connes’ spectral triples and carry the action of an Abelian group $\mathbb{T}^k \times \mathbb{R}^h$. The second, and even more difficult (see, e.g., [29], and references therein), deals with non-compact noncommutative manifolds which underlie some quantum group or, more generally, carry the action of some quantum group; it is still under debate what the most convenient axiomatization of these models is (see [29]).

The noncommutative manifold we are going to consider in the present work belongs to the second category and is relatively old and famous, but presents an additional complication even at the formal level (i.e. before entering a functional-analytical treatment): the $\star$-structure characterizing for real $q$ the $U_q so(N)$-covariant differential calculus [1] on the quantum Euclidean space $\mathbb{R}_q^N$ [10] is characterized by an unpleasant nonlinear action on the differentials, the partial derivatives and the exterior derivative [30]. This is at the origin of a host of formal and substantial complications. As examples we mention the following difficulties: determining the actual geometry of $\mathbb{R}_q^N$ [17, 2]; identifying the “right” momentum sector within the algebra of observables of quantum mechanics on a $\mathbb{R}_q^N$-configuration space and solving the corresponding eigenvalue problems for Hermitean operators in the form of differential operators [38, 13, 39]; more generally formulating and solving differential equations on $\mathbb{R}_q^N$; finally, writing down tractable kinetic terms for Lagrangians of potential field theory models on $\mathbb{R}_q^N$. A similar situation occurs for other non-compact quantum spaces, notably for the $q$-Minkowski space [32].

It turns out that we are facing a problem similar to the one we encounter in functional analysis on the real line when taking the Hermitean conjugate of a differential operator like
$$D = \sigma(x)\,\frac{d}{dx}\,\frac{1}{\sigma(x)}, \tag{1.1}$$
where $\sigma(x)$ is a smooth complex function vanishing for no $x$. As an element of the Heisenberg algebra, $D$ is not imaginary (excluding the trivial case $\sigma \equiv 1$) with respect to the $\star$-structure
$$\left(\frac{d}{dx}\right)^\star = -\frac{d}{dx}, \qquad x^\star = x,$$
but fulfills the similarity transformation
$$D^\star = -|\sigma|^{-2}\,D\,|\sigma|^{2},$$
this corresponding to the fact that it is not anti-hermitean as an operator on $L^2(\mathbb{R})$. $D$ is however (formally) anti-hermitean on $L^2(\mathbb{R}, |\sigma|^{-2}dx)$. In other words, if we insert the weight $|\sigma|^{-2} > 0$ in the integral giving the scalar product,
$$(\phi, \psi) = \int \phi^\star(x)\,|\sigma|^{-2}\,\psi(x)\,dx,$$
[as one does when setting the Sturm–Liouville problem for $D^2$], $D$ becomes antihermitean under the corresponding Hermitean conjugation $\dagger$ (see footnote a):
$$(A^\dagger\phi, \psi) := (\phi, A\psi) \quad\Rightarrow\quad D^\dagger = -D.$$
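The role of the weight can be checked numerically on a grid. The sketch below is only an illustration under stated assumptions: the particular $\sigma(x)$, the Gaussian test functions and the helper names `D` and `pairing` are arbitrary choices, not taken from the text, and the derivative is replaced by a finite difference. It compares $(\phi, D\psi) + (D\phi, \psi)$ with and without the weight $|\sigma|^{-2}$.

```python
import numpy as np

x = np.linspace(-10.0, 10.0, 4001)
h = x[1] - x[0]

# Assumed example of a smooth complex function that vanishes nowhere.
sigma = (2.0 + np.tanh(x)) * np.exp(1j * x)
weight = np.abs(sigma) ** -2  # the weight |sigma|^{-2} > 0


def D(f):
    """D f = sigma * d/dx (f / sigma), with a finite-difference derivative."""
    return sigma * np.gradient(f / sigma, h)


def pairing(f, g, w):
    """Riemann-sum approximation of the integral of conj(f) * w * g."""
    return h * np.sum(np.conj(f) * w * g)


# Rapidly decaying test functions (arbitrary choices).
phi = np.exp(-x**2) * (1.0 + 0.0j)
psi = np.exp(-(x - 0.5) ** 2) * (1.0 + 0.5j)

# Anti-hermiticity means (phi, D psi) + (D phi, psi) = 0.
weighted = pairing(phi, D(psi), weight) + pairing(D(phi), psi, weight)
unweighted = pairing(phi, D(psi), 1.0) + pairing(D(phi), psi, 1.0)
print(abs(weighted), abs(unweighted))
```

With the weight the sum vanishes up to rounding, whereas without it a term proportional to the real part of $\sigma'/\sigma$ survives.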
In this work, we show that the partial derivatives $\partial^\alpha$ and the exterior derivative $d$ of the $U_q so(N)$-covariant differential calculus on $\mathbb{R}_q^N$ can be expressed by the similarity transformation
$$\partial^\alpha = \tilde{\nu}\,\tilde{\partial}^\alpha\,\tilde{\nu}^{-1}, \qquad d = \tilde{\nu}^{-1}\,\tilde{d}\,\tilde{\nu}, \tag{1.2}$$
in terms of elements $\tilde{\partial}^\alpha$, $\tilde{d}$ which are purely imaginary under the $\star$-structure studied in [30]. The unusual and novel feature here is that $\tilde{\nu}$ is not a function on $\mathbb{R}_q^N$ but a positive-definite pseudodifferential operator, more precisely the realization of the fourth root of the ribbon element of the extension of $U_q so(N)$ with a central element generating dilatations of $\mathbb{R}_q^N$. Therefore, the $\partial^\alpha$ become anti-hermitean and the exterior coderivative $\delta$ becomes the Hermitean conjugate of $d$ (on the space of differential forms) if we introduce the “weights” $\tilde{v}^{\mp 1} := \tilde{\nu}^{\mp 2}$ in the integral defining the scalar product of two “wave-functions/forms” on $\mathbb{R}_q^N$. For practical purposes, it is much more convenient to use the $\partial^\alpha$ rather than the $\tilde{\partial}^\alpha$ because the former have much simpler commutation relations (in the form of modified Leibniz rules) with the coordinates of $\mathbb{R}_q^N$, whereas for the commutation relations involving the $\tilde{\partial}^\alpha$, we have not even found a closed form. This suggests curing the complications mentioned at the beginning as one does in the undeformed, functional-analytical setting.

Section 2 contains preliminaries about the quantum group $U_q so(N)$, the differential calculus on $\mathbb{R}_q^N$, frame bases, Hodge map and the analog of Lebesgue integration over $\mathbb{R}_q^N$; the latter is completely determined apart from a residual freedom in choosing the integration measure $m(r)\,dr$ along the radial direction $r$. In Sec. 3, we prove at the algebraic level (i.e. at the level of formal power series) Eq. (1.2) and the corresponding formula for the differentials $dx^i$ of the coordinates $x^i$ of $\mathbb{R}_q^N$. In Sec. 4, we deal with implementing the previous algebraic results in a functional-analytical setting: we introduce spaces of square-integrable functions/forms over $\mathbb{R}_q^N$ and show how the algebraic $\star$-structure can be implemented in different “pictures” (i.e. configuration space realizations) as Hermitean conjugation of operators acting on them.
(Footnote a: The Hermitean conjugation $\dagger$ is the representation of the following modified $\star$-structure of the Heisenberg algebra: $a \mapsto [|\sigma|^{-2}\,a\,|\sigma|^{2}]^\star = |\sigma|^{2}\,a^\star\,|\sigma|^{-2}$.)

As applications,
we first consider quantum mechanics on $\mathbb{R}_q^N$ and recall how one can diagonalize a set of commuting observables including various momentum components, then we write down “tractable” kinetic terms for (bosonic) field theories on $\mathbb{R}_q^N$. These steps require promoting the formally (i.e. algebraically) defined $\tilde{\nu}^{\pm 1}$ into corresponding well defined pseudodifferential operators, and this is done in Sec. 5 passing to the Fourier transform of the variable $y = \ln r$. No further constraint is needed if $m(r) \equiv 1$, whereas an additional one must be imposed on the spaces of square-integrable functions/forms if $m(r)$ is not constant (non-homogeneous space along the radial direction), e.g., if $m(r)\,dr$ is the measure of the so-called Jackson integral: they have to be restricted to interesting subspaces $L_2^m$ consisting of functions whose analytic continuation in the complex $r$-plane has poles at locations $r_\alpha$ on a certain number $\gamma$ of “rays” originating from $r = 0$, forming with each other angles equal to $2\pi/\gamma$, and such that $|r_\alpha| = q^{j}$ (or $|r_\alpha| = q^{j+\frac{1}{2}}$), with $j \in \mathbb{Z}$. Surprisingly, this is a condition which automatically selects $q$-special functions where their free parameters (which will play the role of fundamental physical quantities, e.g., a universal energy scale) are “quantized”.

2. Preliminaries

2.1. $\mathbb{R}_q^N$ and its covariant differential calculi

As a noncommutative space, we consider the $U_q so(N)$-covariant deformation [10] of the Euclidean space $\mathbb{R}^N$ ($h := \ln q$ plays the role of deformation parameter). We shall call the deformed algebra of functions on this space “algebra of functions on the quantum Euclidean space $\mathbb{R}_q^N$”, and denote it by $F$.
It is essentially the unital associative algebra over $\mathbb{C}[[h]]$ generated by $N$ elements $x^i$ (the Cartesian “coordinates”) modulo the relations (2.1) given below, and will be extended to include formal power series in the generators; out of $F$, we shall extract subspaces consisting of elements that can be considered integrable or square-integrable functions. The $U_q so(N)$-covariant differential calculus on $\mathbb{R}_q^N$ [1] is defined introducing the invariant exterior derivative $d$, satisfying nilpotency and the Leibniz rule $d(fg) = df\,g + f\,dg$, and imposing the covariant commutation relations (2.2) between the $x^i$ and the differentials $\xi^i := dx^i$. Partial derivatives are introduced through the decomposition $d =: \xi^i\partial_i$. All the other commutation relations are derived by consistency. The complete list is
$$P_a{}^{ij}_{hk}\,x^h x^k = 0, \tag{2.1}$$
$$x^h\xi^i = q\,\hat{R}^{hi}_{jk}\,\xi^j x^k, \tag{2.2}$$
$$(P_s + P_t)^{ij}_{hk}\,\xi^h\xi^k = 0, \tag{2.3}$$
$$P_a{}^{ij}_{hk}\,\partial_j\partial_i = 0, \tag{2.4}$$
$$\partial_i x^j = \delta_i^j + q\,\hat{R}^{jh}_{ik}\,x^k\partial_h, \tag{2.5}$$
$$\partial^h\xi^i = q^{-1}\hat{R}^{hi}_{jk}\,\xi^j\partial^k. \tag{2.6}$$
The $N^2 \times N^2$ matrix $\hat{R}$ is the braid matrix of $SO_q(N)$ (see [10]). The matrices $P_s$, $P_a$, $P_t$ are $SO_q(N)$-covariant deformations of the symmetric trace-free, antisymmetric and trace projectors, respectively, which appear in the projector decomposition of $\hat{R}$:
$$\hat{R} = qP_s - q^{-1}P_a + q^{1-N}P_t. \tag{2.7}$$
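As a sanity check on the structure of (2.7), one can look at the undeformed limit. The sketch below assumes $q = 1$, where the braid matrix is taken to reduce to the flip operator on $\mathbb{C}^N \otimes \mathbb{C}^N$ and the three projectors to the ordinary symmetric trace-free, antisymmetric and trace projectors built from the Euclidean metric $\delta_{ij}$; it does not construct the deformed $\hat{R}$ of [10], and all variable names are illustrative.

```python
import numpy as np

N = 3
dim = N * N
identity = np.eye(dim)

# Flip operator P(u (x) v) = v (x) u on C^N (x) C^N.
flip = np.zeros((dim, dim))
for i in range(N):
    for j in range(N):
        flip[i * N + j, j * N + i] = 1.0

# Classical projectors (q = 1), with the metric delta_ij.
delta = np.eye(N).reshape(dim)
P_t = np.outer(delta, delta) / N    # trace projector, rank 1
P_a = (identity - flip) / 2         # antisymmetric projector
P_s = (identity + flip) / 2 - P_t   # symmetric trace-free projector

# Right-hand side of (2.7) evaluated at q = 1.
q = 1.0
R_hat_classical = q * P_s - P_a / q + q ** (1 - N) * P_t

ranks = [int(round(np.trace(P))) for P in (P_s, P_a, P_t)]
print(np.allclose(R_hat_classical, flip), ranks)
```

The ranks $N(N+1)/2 - 1$, $N(N-1)/2$ and $1$ add up to $N^2$, as they must for a complete set of projectors.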
The $P_t$ projects on a one-dimensional subspace and can be written in the form
$$P_t{}^{ij}_{kl} = (g^{sm}g_{sm})^{-1}\,g^{ij}g_{kl} = \frac{q^2 - 1}{(q^N - 1)(1 + q^{2-N})}\,g^{ij}g_{kl}, \tag{2.8}$$
where the $N \times N$ matrix $g_{ij}$ is a $SO_q(N)$-isotropic tensor, deformation of the ordinary Euclidean metric. The metric and the braid matrix satisfy the relations [10]
$$g_{il}\,\hat{R}^{\pm 1\,lh}_{jk} = \hat{R}^{\mp 1\,hl}_{ij}\,g_{lk}, \qquad g^{il}\,\hat{R}^{\pm 1\,jk}_{lh} = \hat{R}^{\mp 1\,ij}_{hl}\,g^{lk}. \tag{2.9}$$
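The normalization in (2.8) can be verified numerically from the explicit metric $g_{ij} = g^{ij} = q^{-\rho_i}\delta_{i,-j}$ quoted in footnote b. The following is a minimal sketch: the function names are hypothetical, and the index $i = -n, \ldots, n$ is mapped to array positions $0, \ldots, N-1$ in increasing order.

```python
import numpy as np


def rho(N):
    """Exponents rho_i of footnote b, ordered as i = -n, ..., n."""
    n = N // 2
    upper = [N / 2 - k for k in range(1, n + 1)]
    middle = [0.0] if N % 2 else []
    return np.array(upper + middle + [-r for r in reversed(upper)])


def metric(N, q):
    """g_ij = q^{-rho_i} delta_{i,-j}: an anti-diagonal N x N matrix."""
    g = np.zeros((N, N))
    for a, r in enumerate(rho(N)):
        g[a, N - 1 - a] = q ** (-r)
    return g


def trace_projector(N, q):
    """P_t of (2.8), rows labelled by the pair (ij), columns by (kl)."""
    g = metric(N, q).reshape(N * N)
    return np.outer(g, g) / np.dot(g, g)


def closed_form(N, q):
    """Right-hand normalization of (2.8): (q^N - 1)(1 + q^{2-N}) / (q^2 - 1)."""
    return (q**N - 1) * (1 + q ** (2 - N)) / (q**2 - 1)


results = {}
for N in (3, 4, 5, 6):
    for q in (0.5, 1.3):
        g = metric(N, q)
        results[(N, q)] = (np.sum(g * g), closed_form(N, q))
print(results)
```

Since $g^{ij} = g_{ij}$ here, the contraction $g^{sm}g_{sm}$ is simply the sum of the squared entries of the matrix, which is what the sketch compares with the closed form.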
Indices will be lowered and raised using $g_{ij}$ and its inverse $g^{ij}$, e.g.,
$$\partial^i := g^{ij}\partial_j, \qquad x_i := g_{ij}x^j.$$
We shall call $DC^*$ (differential calculus algebra on $\mathbb{R}_q^N$) the unital associative algebra over $\mathbb{C}[[h]]$ generated by $x^i$, $\xi^i$, $\partial_i$ modulo these relations. We shall denote by $\bigwedge^*$ (exterior algebra, or algebra of exterior forms) the graded unital subalgebra generated by the $\xi^i$ alone, with grading given by the degree in $\xi^i$, and by $\bigwedge^p$ (vector space of exterior $p$-forms) the component with grading $p$, $p = 0, 1, 2, \ldots$. Each $\bigwedge^p$ carries an irreducible representation of $U_q so(N)$, and its dimension is the binomial coefficient $\binom{N}{p}$ [12], exactly as in the $q = 1$ (i.e. undeformed) case; in particular, there are no forms with $p > N$, and $\dim(\bigwedge^N) = 1$, therefore $\bigwedge^N$ carries the singlet representation of $U_q so(N)$. We shall endow $DC^*$ with the same grading, and call $DC^p$ its component with grading $p$. The elements of $DC^p$ can be considered differential-operator-valued $p$-forms. We shall denote by $\Omega^*$ (algebra of differential forms) the graded unital subalgebra generated by the $\xi^i$, $x^i$, with the same grading, and by $\Omega^p$ (space of differential $p$-forms) its component with grading $p$; by definition, $\Omega^0 = F$ itself. Clearly, both $\Omega^*$ and $\Omega^p$ are $F$-bimodules. We shall denote by $\mathcal{H}$ (Heisenberg algebra on $\mathbb{R}_q^N$) the unital subalgebra generated by the $x^i$, $\partial_i$. Note that by definition, $DC^0 = \mathcal{H}$, and that both $DC^*$ and $DC^p$ are $\mathcal{H}$-bimodules. Using (2.4) and (2.9), one can easily verify that the $\partial^i$ satisfy the same commutation relations as the $x^i$, and therefore together with the unit 1 generate a subalgebra of $\mathcal{H}$ isomorphic to $F$, which we shall call $F'$. Denote by $\{D_\pi\}_{\pi\in\Pi}$ a basis of the vector space underlying $F'$ consisting of homogeneous polynomials in the $\partial$’s and with first element $D_0 = 1$. Any “pseudodifferential-operator-valued
form”, i.e. any element $O \in DC^*$ (in particular, $O \in \mathcal{H}$), can be uniquely expressed in the “normal-ordered” form
$$O = \sum_{\pi\in\Pi} O_\pi D_\pi, \qquad O_\pi \in \Omega^*, \tag{2.10}$$
by repeated application of relations (2.5) and (2.6) to move step by step all $\partial$’s to the right of all $x$, $\xi$’s. For any $\omega \in \Omega^*$, we shall denote by $O\omega|$ the $\pi = 0$ component $(O\omega)_0$ of the normal-ordered form of $O\omega$:
$$O\omega = \sum_{\nu\in\Pi} (O\omega)_\nu D_\nu = O\omega| + \sum_{\nu\neq 0} (O\omega)_\nu D_\nu.$$
In particular, for $O = \partial_i$ and $\omega \equiv f \in F$, the previous formula becomes the deformed Leibniz rule
$$\partial_i f = \partial_i f| + f_i^j\,\partial_j, \qquad f_i^j \in F. \tag{2.11}$$
From (2.5), we find, e.g., that if $f = x^h$, then $\partial_i f| = \delta_i^h$ and $f_i^j = q\hat{R}^{hj}_{ik}x^k$. We have introduced this vertical bar $|$ in the notation to always make clear “where the action of the derivatives is meant to stop”, while sometimes this remains ambiguous by the mere use of brackets. From associativity, the obvious property $O(O'\omega|)| = OO'\omega|$ follows. $F'$, $F$ are dual vector spaces with respect to the pairing [27]
$$\langle \partial_{i_1}\cdots\partial_{i_l},\, x^{j_1}\cdots x^{j_m} \rangle = \delta_{lm}\,\partial_{i_1}\cdots\partial_{i_l}\,x^{j_1}\cdots x^{j_l}| \in \mathbb{C} \tag{2.12}$$
with $m = 0, 1, \ldots$. The elements
$$r^2 \equiv x\cdot x := x^k x_k, \qquad \partial\cdot\partial := g^{kl}\partial_l\partial_k = \partial^k\partial_k$$
are $U_q so(N)$-invariant and respectively generate the centers of $F$, $F'$. $\partial\cdot\partial$ is a deformation of the Laplacian on $\mathbb{R}^N$. We shall slightly extend $F$ by introducing the square root $r$ of $r^2$ and its inverse $r^{-1}$ as new (central) generators; $r$ can be considered as the deformed “Euclidean distance of the generic point of coordinates $(x^i)$ of $\mathbb{R}_q^N$ from the origin”. Then, the elements $t^i := x^i r^{-1}$ fulfill (2.1) as well as the relation $t\cdot t = 1$; they generate the deformed algebra $F(S_q^{N-1})$ of “functions on the unit quantum Euclidean sphere”. The latter can be completely decomposed into eigenspaces $V_l$ of the deformed quadratic Casimir of $U_q so(N)$, or equivalently of the Casimir $w$ defined in (2.31) with eigenvalues $w_l := q^{-l(l+N-2)}$, implying a corresponding decomposition for $F$:
$$F(S_q^{N-1}) = \bigoplus_{l=0}^{\infty} V_l, \qquad F = \bigoplus_{l=0}^{\infty}\left( V_l \otimes \mathbb{C}[[r, r^{-1}]] \right). \tag{2.13}$$
An orthonormal basis $\{S_l^I\}$ (consisting of “spherical harmonics”) of $V_l$ can be extracted out of the set of homogeneous, completely symmetric and trace-free
polynomials of degree $l$
$$S_l^I \equiv S_l^{i_1 i_2\cdots i_l} := P^{s,l}{}^{\,i_1 i_2\cdots i_l}_{\,j_1 j_2\cdots j_l}\; t^{j_1}t^{j_2}\cdots t^{j_l} \tag{2.14}$$
suitably normalized ($I$ denotes the multi-index $i_1 i_2\cdots i_l$, $P^{s,l}$ denotes the $U_q so(N)$-covariant, completely symmetric and trace-free projector with $l$ indices [11, 16]). Therefore, for the generic $f \in F$,
$$f = \sum_{l=0}^{\infty} f_l = \sum_{l=0}^{\infty}\sum_{I} S_l^I\, f_{l,I}(r). \tag{2.15}$$
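The $\star$-structure compatible with the compact $\star$-structure of $U_q so(N)$ requires $q \in \mathbb{R}\setminus\{0\}$. On the generators $x^i$ it is given by [10] (see footnote b)
$$x^{i\star} = x^j g_{ji}, \tag{2.16}$$
whereas the $\star$-conjugates of the derivatives $\partial^i$ (resp. the differentials) are not combinations of the derivatives (resp. the differentials) themselves. One can complete a $U_q so(N)$-covariant $\star$-structure by the relations [31]
$$\xi^{i\star} = \hat{\xi}^j g_{ji}, \qquad \partial^{i\star} = -q^{-N}\,\hat{\partial}^j g_{ji}, \tag{2.17}$$
where
$$\hat{\partial}^i := \Lambda^2\left[ \partial^i + \frac{qk}{1 + q^{2-N}}\,x^i\,\partial\cdot\partial \right], \qquad k := q - q^{-1}, \tag{2.18}$$
$$\hat{\xi}^i := \sigma q^{N}\Lambda^{-2}\left[ \xi^i + q^{-1}k\,x^i d - k\left( q^{1-N}\,\xi\cdot x + \frac{kq^{-2}}{1 + q^{N-2}}\,r^2 d \right)\hat{\partial}^i \right]$$
$$\phantom{\hat{\xi}^i :}= \sigma q^{N-2}\Lambda^{-2}\left[ \xi^i + qk\,\xi^j\partial_j x^i - k\left( q^{1-N}\,\xi\cdot x + \frac{k}{1 + q^{N-2}}\,\xi^j\partial_j r^2 \right)\hat{\partial}^i \right]; \tag{2.19}$$
the second expression in (2.19) is derived from the first [31] using the Leibniz rule and the decomposition $d = \xi^i\partial_i$. Here, $\sigma$ is a pure phase factor which we shall set $= 1$, whereas the element $\Lambda^{-2}$ is defined by
$$\Lambda^{-2} := 1 + qk\,x^i\partial_i + \frac{q^N k^2}{(1 + q^{N-2})^2}\,r^2\,\partial\cdot\partial \equiv 1 + O(h) \tag{2.20}$$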
(Footnote b: If we enumerate the $x^i$ of [10] as in [30] by $i = -n, \ldots, -1, 0, 1, \ldots, n$ for $N$ odd, and $i = -n, \ldots, -1, 1, \ldots, n$ for $N$ even, where $n := \left[\frac{N}{2}\right]$ is the rank of $so(N)$, then the metric matrix reads $g_{ij} = g^{ij} = q^{-\rho_i}\delta_{i,-j}$, where $(\rho_i) := \left(\frac{N}{2} - 1, \frac{N}{2} - 2, \ldots, \frac{1}{2}, 0, -\frac{1}{2}, \ldots, 1 - \frac{N}{2}\right)$ for $N$ odd, and $(\rho_i) := \left(\frac{N}{2} - 1, \frac{N}{2} - 2, \ldots, 0, 0, \ldots, 1 - \frac{N}{2}\right)$ for $N$ even. We can obtain a set of $N$ real coordinates $x^\alpha$ by a linear transformation $x^\alpha := V_i^\alpha x^i$ ($\alpha = 0, 1, \ldots, 2n$ for odd $N$, $\alpha = 1, \ldots, 2n$ for even $N$) defined by ($h \ge 1$)
$$V_i^{2h-1} := \frac{1}{\sqrt{2}}\,(\delta_i^h + g_{ih}), \qquad V_i^{2h} := \frac{-i}{\sqrt{2}}\,(\delta_i^h - g_{ih}), \qquad V_i^0 := \delta_i^0 \ \text{(only for odd } N\text{)}.)$$
(in [31] it was denoted by $\Lambda$). Its square root and inverse square root $\Lambda^{-1}$, $\Lambda$ can be either introduced as additional generators or as formal power series in the deformation parameter $h = \ln q$. They fulfill the relations
$$\Lambda x^i = q^{-1}x^i\Lambda, \qquad \Lambda\partial^i = q\,\partial^i\Lambda, \qquad \Lambda\xi^i = \xi^i\Lambda, \qquad \Lambda 1| = 1 \tag{2.21}$$
and the corresponding ones for $\Lambda^{-1}$. The elements $\hat{\xi}^i$, $\hat{\partial}^i$ satisfy relations (2.3) and (2.4) and the analogue of (2.5) and (2.6) with $q$, $\hat{R}$ replaced by $q^{-1}$, $\hat{R}^{-1}$. As a consequence, $\hat{d} := \hat{\xi}^i\hat{\partial}_i = -d^\star$ is also $U_q so(N)$-invariant, nilpotent and satisfies the Leibniz rule on $F$. In fact, $\hat{d}$, $\hat{\xi}^i$, $\hat{\partial}^i$ can be introduced also as independent objects defining an alternative $U_q so(N)$-covariant differential calculus. We shall denote by $\hat{F}$ the subalgebra generated by the $\hat{\partial}^i$; it is isomorphic to $F$, $F'$, too. One finds [31] that under the action of $\star$
$$r^\star = r, \qquad (\partial\cdot\partial)^\star = q^{-2N}\,\hat{\partial}\cdot\hat{\partial} = q^{2-N}\,\partial\cdot\partial\,\Lambda^2, \qquad \Lambda^\star = q^N\Lambda^{-1}. \tag{2.22}$$
2.2. $U_q so(N)$ and its action on $DC^*$

We extend as in [26] the compact Hopf $\star$-algebra $U_q so(N)$ (this requires real $q$) by adding a central, primitive and imaginary generator $\eta$,
$$\Delta(\eta) = 1\otimes\eta + \eta\otimes 1, \qquad \varepsilon(\eta) = 0, \qquad S\eta = -\eta$$
(here $\Delta$, $\varepsilon$, $S$ respectively denote the coproduct, counit, antipode), and we endow the resulting Hopf $\star$-algebra $H := \widetilde{U_q so(N)}$ by the quasitriangular structure
$$\tilde{R} := R\,q^{\eta\otimes\eta}, \tag{2.23}$$
where $R \equiv R^{(1)}\otimes R^{(2)}$ (in a Sweedler notation with upper indices and suppressed summation index) denotes the quasitriangular structure of $U_q so(N)$. The $\star$-structure of $H$ thus can be summarized by the relations
$$R^{(1)\star}\otimes R^{(2)\star} = R_{21}, \qquad \eta^\star = -\eta. \tag{2.24}$$
$DC^*$ is an $H$-module $\star$-algebra (which here we choose to be right),
$$(aa')\triangleleft g = (a\triangleleft g_{(1)})\,(a'\triangleleft g_{(2)}). \tag{2.25}$$
Here $g_{(1)}\otimes g_{(2)} = \Delta(g)$ in Sweedler notation. The transformation laws of the generators $\sigma^i = x^i, \xi^i, \partial^i$ of $DC^*$ under the $H$-action $\triangleleft$ read
$$\sigma^i\triangleleft g = \rho^i_j(g)\,\sigma^j, \qquad g \in U_q so(N), \tag{2.26}$$
$$x^i\triangleleft\eta = x^i, \qquad \xi^i\triangleleft\eta = \xi^i, \qquad \partial^i\triangleleft\eta = -\partial^i; \tag{2.27}$$
here $\rho$ denotes the $N$-dimensional representation of $U_q so(N)$. The braid matrix $\hat{R}$ is related to $R$ by $\hat{R}^{ij}_{hk} = \rho^j_h(R^{(1)})\,\rho^i_k(R^{(2)})$; its explicit form can be found in [10]. The elements
$$Z^i_j := T^{(1)}\rho^i_j(T^{(2)}), \qquad \text{where } T = R_{21}R \equiv T^{(1)}\otimes T^{(2)}, \quad R_{21} \equiv R^{(2)}\otimes R^{(1)},$$
are generators of Uq so(N ), and make up the “SOq (N ) vector field matrix” Z [41, 42, 34, 35]. The Zji are related to the Faddeev–Reshetikin–Takhtadjan generators [10] := R(1) ρal (R(2) ), L+,a l
L−,a := ρal (R−1(1) )R−1(2) l
(2.28)
)L+,i by the relation Zkh = (SL−,h i k . Equation (2.24) implies that T is real, and Zkh = Zhk ,
∓,j kj (L±,i = gih L∓,h j ) = SLi k g ,
(2.29)
as ρ is a -representation; the second equality in (2.29)2 is based on the following useful property of the N -dimensional representation of Uq so(N ): ρab (Sh) = g ad ρcd (h)gcb .
(2.30)
We recall that Uq so(N ) is a Ribbon Hopf algebra [33]: the ribbon element w ∈ Uq so(N ) is a special, central element such that w2 = u1 S(u1 ),
u1 := (SR(2) )R(1) ,
∆(w) = (w ⊗ w)T −1 ,
(2.31)
Sw = w = S −1 w.
(2.32)
It is well known [9] that there exist isomorphisms $U_h so(N)[[h]] \simeq (U so(N))[[h]]$ of $\star$-algebras over $\mathbb{C}[[h]]$. This essentially means that it is possible to express the elements of either algebra as power series in $h = \ln q$ with coefficients in the other. In particular, $w$ has an extremely simple expression in terms of the quadratic Casimir $C$ of $so(N)$:${}^{c}$
$$w = q^{-C} = e^{-hC} = 1 + O(h), \qquad C := X^a X_a =: L(L + N - 2), \tag{2.33}$$
($\{X^a\}$ is a basis of $so(N)$). We denote $v := w^{1/2}$, $\nu := w^{1/4}$, and by $\tilde w, \tilde v, \tilde\nu, \tilde T, \tilde Z$ the analogs of $w, v, \nu, T, Z$ obtained by replacing $\mathcal{R}$ by $\tilde{\mathcal{R}}$. As an immediate consequence,
$$\tilde w = q^{-C} q^{-\eta^2}, \qquad \tilde v = q^{-\frac{C}{2}} q^{-\frac{\eta^2}{2}}, \qquad \tilde\nu = q^{-\frac{C}{4}} q^{-\frac{\eta^2}{4}}, \qquad \tilde T = T q^{2\eta\otimes\eta}.$$
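As a small numerical illustration of (2.33), the following Python sketch evaluates the eigenvalues of $w$, $v = w^{1/2}$ and $\nu = w^{1/4}$ on a level-$l$ subspace, where the Casimir takes the value $l(l+N-2)$. The function names `casimir` and `ribbon_eigenvalues` are hypothetical and introduced only for this example; the $\eta$-dependent factors of the tilded elements are not modelled.

```python
import math

def casimir(l: int, N: int) -> int:
    """Eigenvalue l(l+N-2) of the quadratic Casimir C on level l."""
    return l * (l + N - 2)

def ribbon_eigenvalues(q: float, l: int, N: int) -> dict:
    """Eigenvalues of w = q^{-C}, v = w^{1/2}, nu = w^{1/4} on level l (q > 0)."""
    if q <= 0:
        raise ValueError("q must be positive")
    h = math.log(q)                      # h = ln q, so that q^{-C} = e^{-hC}
    c = casimir(l, N)
    return {
        "w": math.exp(-h * c),
        "v": math.exp(-h * c / 2),
        "nu": math.exp(-h * c / 4),
    }

if __name__ == "__main__":
    for level in range(4):
        print(level, ribbon_eigenvalues(0.9, level, 3))
```

For $q > 0$ all three eigenvalues are positive, in line with the positivity remark that follows.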
Since $C$, $-\eta^2$ are real (even positive-definite), if $q > 0$ all these elements make sense either as positive-definite formal power series in $h$ of the form $1 + O(h)$, or as additional positive-definite generators of our Hopf $\star$-algebra. In Sec. 5, we shall make them into positive-definite operators acting on the spaces of functions and of $p$-forms on $\mathbb{R}^N_q$. All the information on the $\star$-algebras $DC^*$, $H$ and the right action $\triangleleft$ can be encoded in the cross-product $\star$-algebra $DC^* {>\!\!\triangleleft}\, H$. We recall that this is $H \otimes DC^*$ as a vector space, and so we denote as usual $g \otimes a$ simply by $ga$; that $H 1_{DC^*}$, $1_H DC^*$ are subalgebras isomorphic to $H$, $DC^*$, and so we omit either unit $1_{DC^*}$, $1_H$ whenever it is multiplied by non-unit elements; that for any $a \in DC^*$, $g \in H$ the product fulfills
$$ag = g_{(1)}\,(a \triangleleft g_{(2)}). \tag{2.34}$$

${}^{c}$ This can be easily proved using the properties of the Drinfel’d twist $\mathcal{F}$ and the relation $\mathcal{R} = \mathcal{F}_{21}\, q^{X^a \otimes X_a}\, \mathcal{F}^{-1}$.
$DC^* {>\!\!\triangleleft}\, H$ is an $H$-module algebra itself, if we extend $\triangleleft$ to $H$ as the adjoint action, namely as $h \triangleleft g = Sg_{(1)}\, h\, g_{(2)}$. In view of (2.34), this formula correctly reproduces the action also on the elements of $DC^*$, and therefore on any element $h \in DC^* {>\!\!\triangleleft}\, H$. The “cross commutation relations” (2.34) on the generators $\sigma^h$ and $Z^i_j$, $\eta$ take the form
$$\sigma_1 Z_2 = \hat R_{12} Z_1 \hat R_{12}\, \sigma_1, \qquad \text{i.e.} \quad \sigma^h Z^i_j = \hat R^{hi}_{km}\, Z^k_l\, \hat R^{lm}_{nj}\, \sigma^n, \tag{2.35}$$
$$x^i \eta = (\eta + 1)x^i, \qquad \xi^i \eta = (\eta + 1)\xi^i, \qquad \partial^i \eta = (\eta - 1)\partial^i. \tag{2.36}$$
The right relation in (2.35) is the translation into components of the left one, in which the conventional matrix tensor notation has been used. An alternative $\star$-structure for the whole $DC^* {>\!\!\triangleleft}\, H$ will be given in (3.7). As shown in [15, 3], there exists a $\star$-algebra homomorphism
$$\varphi : \mathcal{A} {>\!\!\triangleleft}\, H \to \mathcal{A}, \tag{2.37}$$
acting as the identity on $\mathcal{A}$ itself,
$$\varphi(a) = a, \qquad a \in \mathcal{A}, \tag{2.38}$$
where $H$ is the Hopf algebra $H = U_q so(N)$, and $\mathcal{A} = \mathcal{H}$ is the deformed Heisenberg algebra. In [16], we have extended $\varphi$ to the Hopf algebra $H = \tilde U_q so(N)$ by introducing an additional generator $\eta' = \varphi(\eta) \in DC^*$ subject to the condition $\varphi(q^\eta) = q^{\eta'} = q^{-N/2}\Lambda$, so that
$$[\eta', x^i] = -x^i, \qquad [\eta', \partial^i] = \partial^i, \qquad [\eta', \xi^i] = 0, \qquad q^{\eta'} 1| = q^{-N/2}. \tag{2.39}$$
For real $q$, $\varphi$ is even a $\star$-algebra homomorphism. Applying $\varphi$ to both sides of (2.34), one finds in particular
$$a\,\varphi(g) = \varphi(g_{(1)})\,(a \triangleleft g_{(2)}). \tag{2.40}$$
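In the sequel, we shall often use the shorthand notation
$$\varphi(g) =: g', \qquad g \in H. \tag{2.41}$$
We shall need in particular the images $Z'^h_k = \varphi(Z^h_k)$ explicitly. We determine them here, starting from an ansatz inspired by the images $\varphi_l(Z^h_k)$ found in [3] for the analogous map $\varphi_l : U_q so(N) {\triangleright\!\!<} \mathcal{H} \to \mathcal{H}$ (where $U_q so(N)$ acts with a left action):

Proposition 2.1. Let $q \in \mathbb{R}$. Under the $\star$-algebra map $\varphi : \mathcal{H} {>\!\!\triangleleft}\, H \to \mathcal{H}$ the $\varphi(Z^h_k)$ are given by
$$Z'^h_k = q^{-2}\delta^h_k + q^{-1} k\, \partial^h x^j g_{jk} - q^{-1-N} k\, x^h \hat\partial^j g_{jk} - \frac{k^2 q^{-2}}{1 + q^{N-2}}\, \partial^h r^2 \hat\partial^j g_{jk}, \tag{2.42}$$
where we have defined $k := q - q^{-1}$. Moreover,
$$g' 1| = \varepsilon(g)\, 1, \qquad g \in H. \tag{2.43}$$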
The latter relation together with (2.40) implies
$$g' f| = f \triangleleft S^{-1}g. \tag{2.44}$$
In particular, we find (A.3) on the spherical harmonics of level $l$. One may ask if $\varphi$ trivially extends to a map of the type (2.37) and (2.38) with the Heisenberg algebra $\mathcal{H}$ replaced by the whole $DC^*$. The answer is no: by using formula (2.53), one easily finds the commutation relation
$$\xi_1 Z'_2 = \hat R^{-1}_{12}\, Z'_1\, \hat R_{12}\, \xi_1, \tag{2.45}$$
which differs from what one would obtain from (2.35) with $\sigma^i = \xi^i$ by applying such a $\varphi$. Clearly, this formula holds also if we replace the matrix $Z'$ with any of its powers $Z'^h$. Now, note that the $\xi^i$ commute with $\Lambda$, see (2.21)$_3$. Recalling [10] that the center $Z(U_q so(N))$ of $U_q so(N)$ is generated by the Casimirs $C_l$ defined by
$$C_l := \mathrm{tr}[U Z^l], \qquad U^i_j := g^{ik} g_{jk}, \qquad l = 1, 2, \ldots, [N/2], \tag{2.46}$$
one easily checks and concludes that
$$[\xi^i, C'_h] = 0 \quad \Rightarrow \quad [\xi^i, \varphi(Z(H))] = 0 \tag{2.47}$$
with $H = \tilde U_q so(N)$, in particular $[\xi^i, \tilde w'] = 0$, whereas the $\xi^i$ do not commute with the center $Z(H)$ itself (in fact $[\xi^i, C_h] \neq 0$, $[\xi^i, \eta] \neq 0$).

2.3. Vielbein basis, Hodge map and Laplacian

The set of $N$ exact forms $\{\xi^i\}$ is a natural basis for the $\mathcal{H}$-bimodule $DC^1$, as well as for the $\mathcal{H} {>\!\!\triangleleft}\, U_q so(N)$-bimodule $DC^1 {>\!\!\triangleleft}\, U_q so(N)$. In [2, 16], we introduced “frame” [7] (or “vielbein”) bases $\{\theta^i\}$ and $\{\vartheta^i\}$ for the two, which are very useful for many purposes. These 1-forms are given by
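$$\vartheta^i := q^{-\eta-\frac{N}{2}}\, L^{-,i}_l\, \xi^l = \xi^m q^{1-\eta}\, \rho^j_m(u_4)\, L^{-,i}_j, \tag{2.48}$$
$$\theta^i := \Lambda^{-1} \varphi(L^{-,i}_l)\,\xi^l = \Lambda^{-1}\xi^h\, (U^{-1})^i_k\, \varphi(L^{-,k}_j U^j_h) = \Lambda^{-1}\xi^h\, \varphi(S^2 L^{-,i}_h) \tag{2.49}$$
[$u_4 := \mathcal{R}^{-1(1)} S^{-1}\mathcal{R}^{-1(2)}$, and the $L^{\pm,i}_l$ are the FRT generators, see (2.28)], and are characterized by the property
$$[\vartheta^i, \mathcal{H}] = 0, \qquad [\theta^i, \mathcal{H}] = 0. \tag{2.50}$$
They satisfy the same commutation relations as the $\xi^i$. As already recalled, from (2.3) it follows [12] that $\dim(\bigwedge^N) = 1$. The matrix elements of the $q$-epsilon tensor are defined [12] up to a normalization constant $\gamma_N$ by either relation
$$\xi^{i_1}\xi^{i_2}\cdots\xi^{i_N} = d^N x\; \varepsilon^{i_1 i_2\cdots i_N}, \qquad \theta^{i_1}\theta^{i_2}\cdots\theta^{i_N} = dV\, \varepsilon^{i_1 i_2\cdots i_N}, \tag{2.51}$$
where
$$\gamma_N\, d^N x := \xi^{-n}\xi^{1-n}\cdots\xi^{n} \in \textstyle\bigwedge^N, \qquad \gamma_N\, dV := \theta^{-n}\theta^{1-n}\cdots\theta^{n}. \tag{2.52}$$
One finds [16] that the “volume form” $dV$ is central in $DC^*$ and equal to $dV = d^N x\, \Lambda^{-N}$. As a consequence of (2.21), $dV| = d^N x$.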
Note that (2.50) in particular implies $[\theta^i, \varphi(g)] = 0$ for any $g \in U_q so(N)$. Going to the differential basis $\xi^h$ by means of the inverse transformation of (2.49), one finds the following commutation relations between the $\xi^h$ and $g' = \varphi(g)$:
$$\xi^h \varphi(g) = \varphi(SL^{-,h}_i\, g\, L^{-,i}_l)\,\xi^l, \qquad \varphi(g)\,\xi^h = \xi^l\, \varphi(S^2 L^{-,i}_l\, g\, SL^{-,h}_i). \tag{2.53}$$
As shown in [16], for any $p = 0, 1, \ldots, N$, one can define a $U_q so(N)$-covariant, $\mathcal{H}$-bilinear map
$${}^* : DC^p \to DC^{N-p} \tag{2.54}$$
(the “Hodge map”), such that ${}^*1 = dV$ and on each $DC^p$ (and therefore on the whole $DC^*$)${}^{d}$
$${}^{*2} \equiv {}^* \circ {}^* = \mathrm{id} \tag{2.56}$$
by setting on the monomials in the $\theta^a$
$${}^*(\theta^{a_1}\theta^{a_2}\cdots\theta^{a_p}) = c_p\, \theta^{a_{p+1}}\cdots\theta^{a_N}\, \varepsilon_{a_N\cdots a_{p+1}}{}^{a_1\cdots a_p}, \tag{2.57}$$
(the normalization constants $c_p$ are given in [16]). $\mathcal{H}$-bilinearity of the Hodge map implies in particular
$${}^*(a\, \omega_p\, b) = a\; {}^*\omega_p\; b, \qquad \forall\, a, b \in \mathcal{H}, \quad \omega_p \in DC^p; \tag{2.58}$$
i.e. applying Hodge and multiplying by “functions or differential operators” are commuting operations; in other words, a differential form $\omega_p$ and its Hodge image have the same commutation relations with $x^i$, $\partial^j$. Restricting the domain of ${}^*$ to the unital subalgebra $\tilde\Omega^* \subset DC^*$ generated by $x^i$, $\xi^j$, $\Lambda^{\pm 1}$, one obtains also a $U_q so(N)$-covariant, $\tilde F$-bilinear map
$${}^* : \tilde\Omega^p \to \tilde\Omega^{N-p} \tag{2.59}$$
fulfilling again ${}^*1 = dV$ and (2.56) (here, $\tilde F \equiv \tilde\Omega^0$). The restriction (2.59) is the notion closest to the conventional notion of a Hodge map on $\mathbb{R}^N_q$: as a matter of fact, there is no $F$-bilinear restriction of ${}^*$ to $\Omega^*$. Note, however, that $\tilde\Omega^*$ is not closed under the $\star$-structure.${}^{e}$ One would think that, since the vielbein $\theta^a$ do not belong to $\Omega^*$, they cannot be used to describe a $p$-form $\omega \in \Omega^*$ through components $\omega^\theta_{a_p\cdots a_1} \in F$. On the contrary, in Sec. 4, we shall give a very useful notion of such components.

${}^{d}$ There is no sign at the right-hand side of (2.56) [contrary to the standard $(-1)^{p(N-p)}$ of the undeformed case] because of the non-standard ordering of the indices in (2.57). The latter in turn is the only correct one: had we used a different order, at the right-hand side of (2.56) tensor products of the matrices $U^{\pm 1}$, instead of the unit matrix, would have appeared, because of the property [36],
$$\varepsilon^{i_1\cdots i_N} = (-1)^{N-1}\, U^{i_1}_{j_1}\, \varepsilon^{i_2\cdots i_N j_1}. \tag{2.55}$$
${}^{e}$ In [2], we introduced a different $\star$-structure under which $\tilde\Omega^*$ is closed.
Finally, introducing the exterior coderivative
$$\delta := -{}^*d\,{}^* \tag{2.60}$$
one finds that on all of $DC^*$, and in particular on all of $\Omega^*$, the Laplacian $\Delta^{[\tilde\nu^{-1}]}$ is given by
$$\Delta^{[\tilde\nu^{-1}]} := d\delta + \delta d = -q^2\, \partial\cdot\partial\, \Lambda^2 = -q^{-N}\, \hat\partial\cdot\hat\partial. \tag{2.61}$$
For the exterior coderivative $\hat\delta := -{}^*\hat d\,{}^*$ of the “hatted” differential calculus, one similarly finds that the Laplacian $\Delta \equiv \Delta^{[\tilde\nu]} := \hat d\hat\delta + \hat\delta\hat d$ is equal to $\Delta = -q^{-2}\, \hat\partial\cdot\hat\partial\, \Lambda^{-2} = -q^N\, \partial\cdot\partial$. The reason for the awkward superscripts $[\tilde\nu]$, $[\tilde\nu^{-1}]$ will become clear in Sec. 4.

2.4. Integration over $\mathbb{R}^N_q$ and naive scalar products

In defining integration over $\mathbb{R}^N_q$, i.e. a suitable $\mathbb{C}$-linear functional
$$f \in \Gamma \subset F \ \mapsto\ \int_q f\, d^N x \in \mathbb{C},$$
we adopt the approach of [36] (already sketched in [22]), rather than the preceding one of [11, 24, 14],${}^{f}$ since the former is applicable to a larger domain $\Gamma \subset F$ of “functions” (specified below). Going to “polar coordinates” $\{x^i\} \to \{t^i, r\}$, $f(x) = f(t, r)$, allows one to define the integral by decomposing it into an integral over the “angular coordinates” $t^i$, i.e. over the $q$-sphere $S_q^{N-1}$, followed by the integral over the “radial coordinate” $r$:
$$\int_q f(x)\, d^N x = \int_0^\infty dr\, m(r)\, r^{N-1} \int_{S_q^{N-1}} d^{N-1}t\; f(t, r).$$
Up to a normalization factor $A_N(q)$ (playing the role of the volume of $S_q^{N-1}$), which we here choose to be 1 for the sake of brevity, the integration $\int_{S_q^{N-1}} d^{N-1}t$ coincides with the projection $f \in \Gamma \mapsto f_0 \in \Gamma_0$, where $\Gamma_0 = \Gamma \cap \mathbb{C}[[r, r^{-1}]]$ is the “zero angular momentum” subspace of $\Gamma$ [see (2.13)]: $\int_{S_q^{N-1}} d^{N-1}t\; f(x) = f_0(r)$. This implies
$$\int_q f(x)\, d^N x = \int_0^\infty dr\, m(r)\, r^{N-1} f_0(r). \tag{2.62}$$
This has to be understood as an integral of the analytic continuation of $f_0(r)$ to $\mathbb{R}^+$, if $f_0$ is not assigned as a function on $\mathbb{R}^+$ from the very beginning; by $dr$ we mean Lebesgue measure, whereas $dr\, m(r) \equiv d\mu(r)$ denotes a Borel measure fulfilling the $q$-scaling property $d\mu(qr) = q\, d\mu(r)$ (in other words, the “weight” $m(r)$ fulfills $m(qr) = m(r)$), which ensures the invariance under $q$-dilatations
$$\int_q f(qx)\, d^N(qx) \equiv \int_q \Lambda^{-1} f(x)|\; d^N(qx) = \int_q f(x)\, d^N x. \tag{2.63}$$
The “weight” $m_{J,r_0}(r) := |q - 1| \sum_{n\in\mathbb{Z}} r\,\delta(r - r_0 q^n)$ gives the so-called Jackson integral, $m(r) = 1$ the standard Lebesgue integral, over $\mathbb{R}^+$. Thus, we can define integration on the functional space
$$\Gamma = \Big\{ f \in F \ \Big|\ f_0 \in \mathbb{C}[[r, r^{-1}]],\ \int_q f_0\, d^N x \neq \pm\infty \Big\}.$$

${}^{f}$ The construction of [11, 24, 14] is purely algebraic, namely, based on the fact that by repeated application of the Stokes theorem, one can reduce $\int_q d^N x\, f$ to $\int_q d^N x\, e_{q^2}[-r^2/a^2]$ for any function $f = e_{q^2}[-r^2/a^2]\,p(x)$, where $e_{q^2}[-r^2]$ is the $q$-gaussian and $p$ is a monomial in $x^i$; by linearity, this can be extended also to power series $p(x)$ in a certain (not so large) class with fast decrease at infinity.
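The $q$-scaling property of the radial measure can be checked numerically for the Jackson weight. The following is a minimal Python sketch, not taken from the paper: the function name `jackson_integral` and the truncation parameter `n_max` are assumptions made for illustration, and only the one-dimensional radial measure $d\mu(r) = dr\, m_{J,r_0}(r)$ is modelled.

```python
import math
from typing import Callable

def jackson_integral(f: Callable[[float], float], q: float,
                     r0: float = 1.0, n_max: int = 400) -> float:
    """Integral of f against d mu(r) = dr m(r) for the Jackson weight
    m(r) = |q - 1| * sum_n r * delta(r - r0 * q**n), truncated to |n| <= n_max."""
    total = 0.0
    for n in range(-n_max, n_max + 1):
        r = r0 * q ** n          # support point of the n-th delta function
        total += r * f(r)
    return abs(q - 1.0) * total

if __name__ == "__main__":
    q = 0.9
    f = lambda r: math.exp(-r)
    # d mu(q r) = q d mu(r): integrating f(q r) against q d mu(r) reproduces the integral of f
    print(jackson_integral(f, q), jackson_integral(lambda r: q * f(q * r), q))
```

The two printed values agree up to truncation and rounding, because the rescaling $r \to qr$ merely shifts the lattice $\{r_0 q^n\}$ into itself; for $q$ close to 1 the Jackson integral approaches the Lebesgue one.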
For real $q$, integration over $\mathbb{R}^N_q$ fulfills the following properties:
$$\Big(\int_q f\, d^N x\Big)^{\star} = \int_q f^\star\, d^N x \qquad \text{(reality)}, \tag{2.64}$$
$$\int_q f^\star f\, d^N x \ge 0, \ \text{and} = 0 \ \text{iff} \ f = 0 \qquad \text{(positivity)}, \tag{2.65}$$
$$\int_q f\, d^N x \triangleleft g = \varepsilon(g) \int_q f\, d^N x \qquad (U_q so(N)\text{-invariance}). \tag{2.66}$$
Moreover, if $f$ is a regular function decreasing faster than $1/r^{N-1}$ as $r \to \infty$, the Stokes theorem holds:
$$\int_q \partial^i f(x)|\; d^N x = 0, \qquad \int_q \hat\partial^i f(x)|\; d^N x = 0. \tag{2.67}$$
Properties (2.66) and (2.67) express invariance under deformed “infinitesimal rotations and translations”, respectively. On the contrary, the cyclic property for the integral of a product of functions is $q$-deformed [36]. Integration of functions immediately leads to integration of $N$-forms $\omega_N$. Upon moving all the $\xi$'s to the right of the $x$'s and using (2.51), we can express $\omega_N$ in the form $\omega_N = f\, d^N x$, and just have to set
$$\int_q \omega_N = \int_q f\, d^N x. \tag{2.68}$$
Then, Eq. (2.67) takes the form $\int_q d\omega_{N-1}| = 0$, $\int_q \hat d\omega_{N-1}| = 0$. Finally, by using Stokes theorem, it is easy to show that for any $p = 0, 1, \ldots, N$ and any $\alpha_p \in DC^p$, $\beta_{N-p} \in DC^{N-p}$,
$$\int_q \alpha_p\, \beta_{N-p}| = \int_q (\alpha_p|)\, \beta_{N-p}|, \tag{2.69}$$
provided the product $\alpha_p \beta_{N-p}|$ decreases fast enough as $r \to \infty$. Because of the $\mathbb{C}$-linearity of $\int_q d^N x$ and properties (2.64), (2.65) and (2.69), one can introduce the (naive) scalar products of two “wave-functions” $\phi, \psi \in F$ and more generally of two “wave-forms” $\alpha_p, \beta_p \in \Omega^p$ by
$$\langle \phi, \psi \rangle := \int_q \phi^\star \psi\, d^N x, \qquad \langle \alpha_p, \beta_p \rangle := \int_q \alpha_p^\star\; {}^*\beta_p|. \tag{2.70}$$
From the decomposition (2.15) for $\phi$, $\psi$ and the orthonormality relations $\int_{S_q^{N-1}} d^{N-1}t\; S_l^{I\star} S_{l'}^{I'} = (S_l^{I\star} S_{l'}^{I'})_0 = \delta_{ll'}\, \delta^{II'}$, we find
$$\langle \phi, \psi \rangle = \int_0^\infty dr\, r^{N-1} m(r)\, (\phi^\star \psi)_0(r) = \sum_{l,l'=0}^{\infty} \sum_{I,I'} (S_l^{I\star} S_{l'}^{I'})_0 \int_0^\infty dr\, r^{N-1} m(r)\, \phi^\star_{l,I}(r)\, \psi_{l',I'}(r) = \sum_{l=0}^{\infty} \sum_I \langle \phi_{l,I}, \psi_{l,I} \rangle', \tag{2.71}$$
where we have introduced the “reduced scalar product”
$$\langle \phi, \psi \rangle' := \int_0^\infty dr\, r^{N-1} m(r)\, \phi^\star(r)\, \psi(r) = \int_{-\infty}^{\infty} dy\, e^{Ny}\, \tilde m(y)\, \tilde\phi^\star(y)\, \tilde\psi(y) \tag{2.72}$$
of two functions $\phi(r)$, $\psi(r)$ defined on the positive real line, and we have defined $y := \log r$, $\tilde m(y) := m(e^y)$. A glance at (A.3) is sufficient to verify that for any real $a$, the operator $w'^a$ (in particular, $\nu'^{\pm 1}$) is Hermitean with respect to $\langle\cdot,\cdot\rangle$. Using (2.17), Stokes theorem (2.67) and the analog of (2.11) for the $\hat\partial^\alpha$ derivatives, we find that the $p^\alpha$ are not Hermitean with respect to $\langle\cdot,\cdot\rangle$, but [13]:
$$\langle \phi, p^\alpha \psi \rangle = \int_q \phi^\star\, p^\alpha \psi|\; d^N x = \int_q (\hat p^\alpha \phi|)^\star\, \psi\, d^N x = \langle \hat p^\alpha \phi, \psi \rangle, \tag{2.73}$$
with pˆ = −i∂ . Using Stokes theorem, in the Appendix we show that (2.70)2 equals 1 αθ ap ···a1 β θ ap ···a1 dNx αp , βp = cN −p q 1 = αθ ap ···a1 βθ ap ···a1 | dNx, (2.74) cN −p q α
where we have introduced the notation ωp = ξ i1 · · · ξ ip ωip ···i1 (x) = θa1 · · · θap ωaθp ···a1 =: θa1 · · · θap ωaθp ···a1 (x)|
(2.75)
for any p-form ωp ∈ Ωp . We shall call the functions ωip ···i1 , ωaθp ···a1 (note also the latter belong to F , not to H!) the components of the p-form ωp ∈ Ωp respectively in the bases {ξ i }, {θa }. The ωaθp ···a1 must not be confused with the components ωaθp ···a1 of ωp in the basis {θa }, defined above by ωp =: θa1 · · · θap ωaθp ···a1 (without the final vertical bar); the latter belong to H, because θa ∈ DC ∗ \Ω∗ ! Clearly ωaθp ···a1 = ωaθp ···a1 |.
The above “open-minded” definition implies the following generalized notion of transformation of the components of a given differential p-form under the change of basis of 1-forms ξ i ↔ θa : −,a 1 ωip ···i1 (x) = Λ−p ϕ S 2 (Lip p · · · L−,a ) ωaθp ···a1 (x)|, i1 (2.76) −,i 1 ωaθp ···a1 (x) = Λp ϕ S(Lap p · · · L−,i a1 ) ωip ···i1 (x)|. In the Appendix, we also show αp , β p = ∗ αp , ∗ β p .
(2.77)
Formula (2.74) shows that (2.70)2 defines a “good” scalar product in Ωp , reduc ing it to the scalar product in p F . In particular, if p = 0, then α0 , β0 ∈ F and we recover the scalar product (2.70)1 , because ∗ α0 β0 | = α0 dV β0 | = α0 β 0 dV | = α0 β 0 dNx. q
q
q
q
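One defines a “naive” Hilbert space of square integrable functions on $\mathbb{R}^N_q$ by
$$\tilde L^m_2 := \Big\{ f(x) \equiv \sum_{l=0}^{\infty} \sum_I S_l^I f_{l,I}(r) \in F \ \Big|\ \langle f, f \rangle < \infty \Big\} \tag{2.78}$$
(the superscript $m$ refers to the choice of the radial measure $m$), and similarly, one defines the “naive” Hilbert space of square integrable $p$-forms.

3. The $\star$-Structure Expressed by Similarity Transformations

Theorem 3.1. For positive $q$, the $\star$-structure of $DC^*$ given in (2.16) and (2.17) can be expressed in the form
$$x^{i\star} = x^h g_{hi}, \tag{3.1}$$
$$\xi^{i\star} = q^N \xi^h g_{hj}\, Z'^j_i\, \Lambda^{-2}, \tag{3.2}$$
$$\partial^{i\star} = -q^{\frac{1-N}{2}}\, v'^{-1} \partial^h g_{hi}\, v' \Lambda = -\tilde v'^{-1} \partial^h \tilde v'\, g_{hi}, \tag{3.3}$$
$$d^\star = -\tilde v'\, d\, \tilde v'^{-1}, \tag{3.4}$$
$$\theta^\star = \tilde w'\, \theta\, \tilde w'^{-1}. \tag{3.5}$$
(The proof of the theorem is in the Appendix.) By the linear transformation $V^\alpha_i$ (see Sec. 2.1), we obtain a set of derivatives $\partial^\alpha$ such that on $-i\partial^\alpha$, $\star$ acts as a similarity transformation:
$$p^\alpha \equiv -i\partial^\alpha := -iV^\alpha_i \partial^i \quad \Rightarrow \quad p^{\alpha\star} = \tilde v'^{-1} p^\alpha\, \tilde v'. \tag{3.6}$$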
Incidentally, one can endow the whole A > H with an alternative -structure by keeping Eq. (2.16) unchanged while removing the map ϕ from (3.2) and (3.3), and readjusting the normalization factors in the latter formulae:
˜−1 ξ h ghi w, ˜ ξ i = q N ξ h ghj Zij q −2η = w
∂ i = −˜ v −1 ∂ h v˜ghi ,
d = −w ˜−1 ξ i v˜∂i v˜ = −ξ i v˜∂i v˜−1 ,
(3.7)
θ = wθ ˜ w ˜−1 ; the second equality in the first line is easily proved by means of the formulae given in Sec. 2.2 and (A.2). We see that acts as a similarity transformation also on the differentials ξ α = Viα ξ i . Using (3.3), (3.7), (2.28) and (2.29), the fact that for real q, ϕ is a -algebra )L+,i map and the relation Zkh = (SL−,h i k , it is now straightforward to prove: Proposition 3.2. For real q,
ϑi = ϑj gji ,
θi = θj gji ,
dV = dV = dV .
(3.8)
Moreover, the -structure and the Hodge map commute: (∗ ωp ) = ∗ (ωp ).
(3.9)
4. New Solutions for Old Problems: Improved Real Momentum, Scalar Products and Hermitean Conjugation

We now come to some problems addressed in the introduction.

(1) Quantum mechanics on $\mathbb{R}^N_q$ as a configuration space. One question that has been asked in the literature [38, 14, 13] is: what is the “right” momentum sector subalgebra $P$ within the algebra of observables $\mathcal{H}$? In particular, what should be considered the “right” square momentum (i.e. Laplacian) [22, 38, 14, 13]? What are their spectral decompositions?
(2) Field theory on $\mathbb{R}^N_q$. What is the “right” kinetic term in the action functional of a field-theoretic model on $\mathbb{R}^N_q$? This is clearly related also to the question: what is the “right” propagator after quantization of the model?

As for Problem (1), we wish to fulfill at least the following requirements. $P$ must be: (1) isomorphic to F (and therefore to F ); (2) closed under the action of $U_q so(N)$; (3) closed under the $\star$-structure. The solution proposed in [38, 13] was essentially the subalgebra $P \subset \mathcal{H}$ generated by the $p^\alpha_R$ defined by
$$p_R^{2i+1} = \partial^i + \partial^{i\star} = \partial^i - q^{-N}\hat\partial^j g_{ji}, \qquad p_R^{2i} = i[\partial^i - \partial^{i\star}] = i[\partial^i + q^{-N}\hat\partial^j g_{ji}], \tag{4.1}$$
(where we adopt the indices’ convention of [30], as in the previous section); in [13], we even erroneously stated that it was uniquely determined (the proof of [13, Theorem 2] has a bug). The $p^\alpha_R$ are real and fulfill relations (2.4), whereas (2.5) and (2.6) are replaced by rather complicated ones involving the angular momentum components (see [38, relation (3)] for the $\mathbb{R}^3_q$ case). Finding eigenfunctions of a complete set of commuting observables including one or more $p^\alpha_R$ is thus a rather hard task. Trying the same even with just the square momentum (i.e. Laplacian) $p_R \cdot p_R$${}^{g}$ leads to lengthy calculations and complicated formulae. On the basis of the results of the previous section, one could propose as an alternative solution that $P \subset \mathcal{H}$ be the subalgebra generated by the $\tilde p^\alpha$ defined by
$$\tilde p^\alpha := -iV^\alpha_i\, \tilde\partial^i, \qquad \tilde\partial^i := \tilde\nu'^{-1}\, \partial^i\, \tilde\nu'. \tag{4.2}$$
Also, the $\tilde p^\alpha$ are real. They fulfill relations (2.4) and (2.6), whereas (2.5) is to be replaced by a relation so complicated that it probably cannot be put in closed form.${}^{h}$ Similarly, one can introduce a purely imaginary nilpotent exterior derivative by
$$\tilde d := \tilde\nu'\, d\, \tilde\nu'^{-1} \quad \Rightarrow \quad \tilde d^\star = -\tilde d; \tag{4.3}$$
unpleasantly, it does not fulfill the ordinary Leibniz rule any more.

As we now point out, the choice among the set $\{p^\alpha_R\}$, the $\{\tilde p^\alpha\}$, the $\{p^\alpha\}$ or any other sets of derivatives, or between $d$ and $\tilde d$, will have physical significance only together with a specific choice of the scalar product within the Hilbert space upon which they are meant to act. The standard “naive” scalar product (2.70) is just one of the possible choices, but not the only one; our goal is to adapt this choice to the choice of the (most manageable) momentum components and exterior derivative. Both the $p^\alpha_R$ and the $\tilde p^\alpha$ are (formally) Hermitean with respect to the “naive” scalar product $\langle\cdot,\cdot\rangle$:
$$\langle \phi, p^\alpha_R \psi \rangle = \langle p^\alpha_R \phi, \psi \rangle, \qquad \langle \phi, \tilde p^\alpha \psi \rangle = \langle \tilde p^\alpha \phi, \psi \rangle. \tag{4.4}$$
The first equality (on the appropriate domains) follows from (2.73), and was already proved in [38, 13, 39]; as we shall see in Sec. 5, the second actually holds (on the appropriate domains) if the radial measure $m(r)$ is 1 or satisfies some other specific condition. As already noted, the computation of the action of either $p^\alpha_R$ or $\tilde p^\alpha$ is rather complicated because none of them fulfills a simple Leibniz rule like (2.11).

${}^{g}$ To see this, note that $p_R \cdot p_R$ is a combination of $\partial\cdot\partial$, $\hat\partial\cdot\hat\partial$ and $\hat\partial\cdot\partial$. The latter on its own is an alternative, simpler candidate for a real Laplacian, and in fact was diagonalized in [22], formula (40), where a rather long expression for its eigenvalues (involving also the orbital angular momentum number $l$) was found. This is related to the occurrence of the angular momentum in the commutation relations between these Laplacians and the coordinates $x^i$.
${}^{h}$ At least, one advantage is, however, that the Laplacian $-\tilde p\cdot\tilde p \equiv \tilde\partial\cdot\tilde\partial$ is equal to $\partial\cdot\partial\,\Lambda\, q^{1-\frac{N}{2}}$ and therefore, its commutation relation with the coordinate $x^i$ is pretty manageable for iterated applications,
$$\tilde\partial\cdot\tilde\partial\, x^i = (1 + q^{2-N})\, q^{-\frac{N}{2}}\, \partial^i \Lambda + q\, x^i\, \tilde\partial\cdot\tilde\partial,$$
whereas the commutation relation of $-p_R\cdot p_R$ with $x^i$ is more complicated.

As an alternative, we tentatively introduce the “improved” scalar products
$$(\check\phi, \check\psi) := \langle \tilde\nu'^{-1}\check\phi,\ \tilde\nu'^{-1}\check\psi \rangle, \qquad (\check\alpha_p, \check\beta_p) := \langle \tilde\nu'^{-1}\check\alpha_p,\ \tilde\nu'^{-1}\check\beta_p \rangle, \tag{4.5}$$
the “improved” Hilbert space of square integrable functions on $\mathbb{R}^N_q$
$$\check L^m_2 := \Big\{ f(x) \equiv \sum_{l=0}^{\infty} \sum_I S_l^I f_{l,I}(r) \in F \ \Big|\ (f, f) < \infty \Big\}, \tag{4.6}$$
and similarly the “improved” Hilbert space of square integrable $p$-forms. Under the conditions specified in Sec. 5, the (in the algebraic sense) positive-definite elements $\tilde\nu'^{\pm 1}$ can be represented as Hermitean, positive-definite pseudodifferential operators on appropriate domains. Then,
$$(\check\phi, \check\psi) = \int_q \check\phi^\star\, \tilde\nu'^{-2} \check\psi|\; d^N x, \qquad (\check\alpha_p, \check\beta_p) = \int_q \check\alpha_p^\star\; {}^*\tilde\nu'^{-2} \check\beta_p|. \tag{4.7}$$
As a consequence of Theorem 3.1 and of the equality $\tilde\nu'^2 = \tilde v'$, we obtain
$$(\check\alpha_p, \hat d\check\beta_{p-1}) = (\hat\delta\check\alpha_p, \check\beta_{p-1}), \qquad (\hat d\check\beta_{p-1}, \check\alpha_p) = (\check\beta_{p-1}, \hat\delta\check\alpha_p) \tag{4.8}$$
and the (formal) hermiticity of both the momenta $p^\alpha = -i\partial^\alpha$ and the Laplacian $\Delta$ with respect to the “improved” scalar product $(\cdot,\cdot)$:
$$(\check\phi, p^\alpha \check\psi) = (p^\alpha \check\phi, \check\psi), \qquad (\check\alpha_p, \Delta\check\beta_p) = (\Delta\check\alpha_p, \check\beta_p). \tag{4.9}$$
In other words, the hermiticity of $\tilde p^\alpha$, $\tilde\partial\cdot\tilde\partial$ with respect to $\langle\cdot,\cdot\rangle$ becomes equivalent to the hermiticity of $p^\alpha$, $\Delta \propto \partial\cdot\partial$ with respect to $(\cdot,\cdot)$! If we impose the relation $\phi = \tilde\nu'^{-1}\check\phi$, we can regard $\phi$, $\check\phi$ as wave-functions representing the same ket and $\tilde p^\alpha$, $p^\alpha$ as pseudodifferential operators representing the same abstract operator in two different, but physically equivalent (configuration-space) “pictures”, because
$$(\check\phi, \check\psi) = \langle \phi, \psi \rangle. \tag{4.10}$$
Our answer to Problem (1) is therefore as follows: in the original, “naive” picture the momentum observables act on a wave-function $\phi(x)$ as the pseudodifferential operators $\tilde p^\alpha$, whereas the “position” observables act simply by (left) multiplication by $x^\alpha$, yielding $x^\alpha\phi(x)$. This picture is thus more convenient for computing the action of the latter than the action of the former. Instead, in the second, “improved” picture, the momentum operators act on a wave-function $\check\phi(x)$ as the differential operators $p^\alpha$, whereas the “position” observables act as the pseudodifferential operators $\tilde\nu'\, x^\alpha\, \tilde\nu'^{-1}$. Therefore, the second picture is definitely more convenient for computing the action of the momentum operators, as well as for answering question (2) (as we shall see below).

This notion of “picture” can be generalized as follows. For any pseudodifferential operator $\sigma = \mathrm{id} + O(h)$ depending only on $C'$, $\eta'$, we introduce the “$\sigma$-picture” by
$$f^{[\sigma]} := \sigma f|, \qquad \langle f, g \rangle_{[\sigma]} := \langle \sigma^{-1} f, \sigma^{-1} g \rangle \equiv \int_q (\sigma^{-1} f|)^\star\, \sigma^{-1} g|\; d^N x, \qquad O^{[\sigma]} := \sigma O \sigma^{-1}, \tag{4.11}$$
for $f, g \in F$, $O \in \mathcal{H}$ (note that for $\sigma = 1$, one recovers the original picture). For our purposes, it will be enough to stick to pseudodifferential operators of the
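form $\sigma = q^{a(\eta'+b)^2}\, g(C')$, where $b$ is a real constant and $g(C')$ is a positive-definite pseudodifferential operator depending only on the quadratic Casimir of $so(N)$. We tentatively introduce the “Hilbert space of square integrable functions on $\mathbb{R}^N_q$ in the $\sigma$-picture” by
$$\tilde L^{m,\sigma}_2 := \Big\{ f(x) \equiv \sum_{l=0}^{\infty} \sum_I S_l^I f_{l,I}(r) \in F \ \Big|\ \|f\|^2_\sigma < \infty \Big\}, \tag{4.12}$$
where $\|f\|^2_\sigma := \langle f, f \rangle_{[\sigma]}$. In particular, $\check\phi = \phi^{[\tilde\nu']}$, $\phi = \phi^{[1]}$, $\check L^m_2 = \tilde L^{m,\tilde\nu'}_2$. Then, trivially
$$\phi^{[\sigma]} \in \tilde L^{m,\sigma}_2 \ \Leftrightarrow\ \phi \in \tilde L^m_2, \qquad \langle \phi^{[\sigma]}, \psi^{[\sigma]} \rangle_{[\sigma]} = \langle \phi, \psi \rangle, \tag{4.13}$$
and (denoting by $D^{[\sigma]}(O^{[\sigma]})$ the domain of the operator $O^{[\sigma]}$ within $\tilde L^{m,\sigma}_2$),
$$\phi^{[\sigma]} \in D^{[\sigma]}(O^{[\sigma]}) \subset \tilde L^{m,\sigma}_2 \ \Leftrightarrow\ \phi \in D(O) \subset \tilde L^m_2, \qquad O^{[\sigma]}\phi^{[\sigma]}| = (O\phi|)^{[\sigma]}, \tag{4.14}$$
implying that one can describe the same “physics” by any of the $\sigma$-pictures. So, one can choose the most convenient for each computation. The generalization of the notion of $\sigma$-pictures to forms is straightforward.

In Sec. 5, we determine radial measures $m$ and for each $\sigma$ of the above type a ($m$-dependent) subspace $L^{m,\sigma}_2 \subset \tilde L^{m,\sigma}_2$ and define $\sigma$ as a pseudodifferential operator such that
$$\langle f, g \rangle_{[\sigma]} = \langle f, g^{[(\sigma\sigma^\star)^{-1}]} \rangle = \langle f^{[(\sigma\sigma^\star)^{-1}]}, g \rangle \tag{4.15}$$
for any $f, g \in L^{m,\sigma}_2$, in particular,
$$(\check\phi, \check\psi) = \langle \hat\phi, \check\psi \rangle = \langle \check\phi, \hat\psi \rangle, \tag{4.16}$$
where $\hat\phi \equiv \phi^{[\tilde\nu'^{-1}]}$, $\check\phi \equiv \phi^{[\tilde\nu']}$. After the replacements $\phi \to \hat\phi$, $\psi \to \check\psi$, (2.73) becomes
$$\langle \hat\phi, p^\alpha \check\psi \rangle = \langle \hat p^\alpha \hat\phi, \check\psi \rangle. \tag{4.17}$$
Then, (3.3) and (4.2) will imply (4.4)$_2$ and (4.9)$_1$, respectively, for any $\phi, \psi \in D(\tilde p^\alpha)$, and $\check\phi, \check\psi \in D^{[\tilde\nu']}(p^\alpha)$ (note that with our notation $p^\alpha = \tilde p^{\alpha[\tilde\nu']}$) and more generally
$$\langle \phi^{[\sigma]}, \tilde p^{\alpha[\sigma]} \psi^{[\sigma]} \rangle_{[\sigma]} = \langle \tilde p^{\alpha[\sigma]} \phi^{[\sigma]}, \psi^{[\sigma]} \rangle_{[\sigma]} \tag{4.18}$$
for any $\sigma$ and $\phi^{[\sigma]}, \psi^{[\sigma]} \in D^{[\sigma]}(\tilde p^{\alpha[\sigma]})$.

As an application, we recall how one can diagonalize observables of $P$ using improved pictures. In [13], we constructed irreducible $\star$-representations of the $\star$-algebra $P {>\!\!\triangleleft}\, H \subset \mathcal{H}$ and diagonalized within the latter a complete set of commuting observables, consisting not only of the square total momentum $P\cdot P =: (P\cdot P)_n$,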
but of all the $(P\cdot P)_a := \sum_{j=-a}^{a} P^j P_j$ with $a = 1, 2, \ldots, n$ (these are the squares of the projections of the momentum on the hyperplanes with coordinates $P^{-a}, P^{1-a}, \ldots, P^{a}$), of $P^0$ (only for odd $N$), and of the generators $K^a$ of the Cartan subalgebra of $U_q so(N)$. Diagonalization was performed first at the abstract level, i.e. eigenvectors were abstract kets and $P$ was the $\star$-algebra generated by abstract $U_q so(N)$-covariant generators $P^i$ fulfilling (2.4) and the same $\star$-relations (2.16) as the $x^i$. Then, we realized the scheme in $\mathbb{R}^N_q$-configuration space in two different realizations, i.e. pictures: in the first one (which we called “unbarred”), the $P^i$ were realized as $-i\Lambda\partial^i = -i\tau\tilde\partial^i\tau^{-1}$, in the second (which we called “barred”), the $P^i$ were realized as $-i\hat\partial^i\Lambda^{-1} = -i\tau^{-1}\tilde\partial^i\tau$, where $\tau := \nu'\, q^{(\eta'+N+1)^2/4}$. In the previous notation, they amount, respectively, to the $\sigma = \tau$ and the $\sigma = \tau^{-1}$ pictures.${}^{i}$ To compute the action of the $P^i$, either one is much more convenient than the “naive” one, where the $P^i$ are realized as the pseudodifferential operators $-i\tilde\partial^i$, because of the relatively simple commutation relations (2.5) and (2.21), and the analogous ones involving the $\hat\partial^i$. For $0 < q < 1$, we found the following spectral decompositions of the above observables:
$$(p\cdot p)\,\phi^{[\tau]}_{\pi,j} = \kappa^2 q^{2\pi_n}\, \phi^{[\tau]}_{\pi,j}, \qquad (p\cdot p)_a\, \phi^{[\tau]}_{\pi,j} = \kappa_a^2\, q^{2\sum_{k=a}^{n}\pi_k}\, \phi^{[\tau]}_{\pi,j}, \qquad p^0 \phi^{[\tau]}_{\pi,j} = \kappa_0\, q^{\pi_0}\, \phi^{[\tau]}_{\pi,j} \quad \text{(only for odd } N\text{)}; \tag{4.19}$$
here κ ≡ κn is a positive constant characterizing the irreducible representation (by a redefinition of πn , it can always be chosen in [1, q[), and −2ρa 1 + q 1 + q −1 n κa = κq n−a , κ = ±κq (only for odd N ), 0 1 + q N −2 1 + q N −2 whereas π, j are vectors (the component ja of j labels eigenvalues of K a ) with suitable [13] integer components, in particular, πn ∈ Z and πh ∈ N if h < n. Up [τ ] to normalization, in the unbarred realization (or “picture”), the eigenfuntions φπ,j with π = 0 will be given by [13] −1 j1 0 (x ) eq−1 [iκ0 x ], if N = 2n + 1, [τ ] φ0,j ∼ (x−n )jn · · · (x−2 )j2 · qκ2 (x−sign(j1 )·1 )|j1 | ϕJq−1 x1 x1 , if N = 2n, q 2−N + 1 n where J := a=1 ja and, having set (l)q := (q l − 1)/(q − 1), eq (z) :=
$$\sum_{l=0}^{\infty} \frac{z^l}{(l)_q!}, \qquad \varphi^J_q(z) := \sum_{l=0}^{\infty} \frac{(-z)^l}{(l)_{q^2}!\,(l+J)_{q^2}!}. \tag{4.20}$$

${}^{i}$ We warn the reader that in the conventions of [13], $\Lambda$ is what here is denoted by $\Lambda^{-1}$, and conversely.
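The $q$-deformed series in (4.20) are easy to evaluate by truncation. The following Python sketch is an illustration only: the function names `q_number`, `q_exp` and `phi_J` and the truncation order `terms` are assumptions introduced here, not part of the construction of [13]. For base $0 < q < 1$ the series for $e_q$ converges only for $|z| < 1/(1-q)$, whereas for base larger than 1 (such as the $q^{-1}$ appearing in the eigenfunctions when $0 < q < 1$) it converges for all $z$.

```python
def q_number(l: int, q: float) -> float:
    """(l)_q := (q^l - 1)/(q - 1), which reduces to l as q -> 1."""
    if q == 1.0:
        return float(l)
    return (q ** l - 1.0) / (q - 1.0)

def q_exp(z: complex, q: float, terms: int = 200) -> complex:
    """Truncated series e_q(z) = sum_l z^l / (l)_q!."""
    total, term = 0j, 1 + 0j
    for l in range(terms):
        total += term
        term *= z / q_number(l + 1, q)       # next term: z^(l+1) / (l+1)_q!
    return total

def phi_J(z: complex, q: float, J: int, terms: int = 200) -> complex:
    """Truncated series phi^J_q(z) = sum_l (-z)^l / ((l)_{q^2}! (l+J)_{q^2}!)."""
    q2 = q * q
    term = 1 + 0j
    for m in range(1, J + 1):
        term /= q_number(m, q2)              # l = 0 term: 1 / (J)_{q^2}!
    total = 0j
    for l in range(terms):
        total += term
        term *= -z / (q_number(l + 1, q2) * q_number(l + 1 + J, q2))
    return total

if __name__ == "__main__":
    q, z = 0.5, 0.3
    # e_q is an eigenfunction of the q-derivative: e_q(qz) - e_q(z) = (q - 1) z e_q(z)
    print(q_exp(q * z, q) - q_exp(z, q), (q - 1) * z * q_exp(z, q))
```

In the limit $q \to 1$ the first series reduces to the ordinary exponential and the second to a Bessel-type series, which gives a simple sanity check of the truncation.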
(As we expect, for odd $N$, in the limit $q = 1$, $\phi^{[\tau]}_{0,0}$ formally becomes a plane wave orthogonal to the $x^0$ coordinate.) The $\phi^{[\tau]}_{0,j}$ can also be obtained from the cyclic eigenfunction $\phi^{[\tau]}_{0,0}$ by applying to the latter suitable elements in $P {>\!\!\triangleleft}\, H$. The $\phi^{[\tau]}_{\pi,j}$ with $\pi \neq 0$ are obtained by applying to $\phi^{[\tau]}_{0,j}$ powers of the $\Lambda\partial^i$ with $i > 0$. We thus find relatively “tractable” eigenfunctions, which can actually be expressed through $q$-special functions (see Sec. 5.2). Formula (4.19) shows that these operators have very simple discrete spectra, essentially consisting of integer powers of $q$. As a matter of fact, the eigenfunctions are also normalizable: this was proved in [13] adopting a slightly different definition of integration, and is true also adopting the definition of integration [36] recalled in Sec. 2.4.${}^{j}$ This situation is to be contrasted with the undeformed one, where the corresponding operators have continuous spectra and generalized eigenfunctions. Therefore, $q$-deformation can be seen as a “regularizing” device! Moreover, in Sec. 5, we shall see that the constant $\kappa$ characterizing the irreducible representation can take any value if we choose a trivial radial weight [$m(r) \equiv 1$] in (2.62), whereas (at least for even $N$) it is quantized to a specific value (defined up to powers of $q$) if we choose a nontrivial $m(r)$. In other words, in the latter case, the nature of space(time) fixes an energy scale independent of the particular irreducible representation we have chosen, namely of the particular type of particles we describe by the latter! Similarly, one can treat the case $q > 1$.

${}^{j}$ In either case, the question of the normalizability of all $\phi^{[\tau]}_{\pi,j}$ is reduced to the question of the normalizability of the cyclic eigenfunction $\phi^{[\tau]}_{0,0}$ by manipulations involving the use of Stokes theorem, similarly as in the undeformed context the normalizability of the Hermite functions is reduced to that of the gaussian $e^{-r^2/2}$. That $\phi^{[\tau]}_{0,0}$ is normalizable is true by the definition of integration of [13] in the first case, and can be proved by a rather lengthy computation in the present case.

We come now to Question (2). The kinetic term in the action for a $p$-form (i.e. an antisymmetric tensor with $p$ indices) Euclidean field theory with mass $M$ can be most simply introduced as
$$S_k = ((\Delta + M^2)\check\alpha_k, \check\alpha_k).$$
It will be rather “tractable” because $\Delta = -q^N \partial\cdot\partial$ has the rather simple action (A.6) as a differential operator. Consider in particular a scalar field (i.e. $k = 0$). The “propagator” (or Green function) $G(y, x)$ of the theory should be expressible in terms of any orthonormal basis $\{\check\phi_{\pi_n,l,I}\}$ of eigenfunctions of $\Delta + M^2$, $\nu'$,
$$(\Delta + M^2)\check\phi_{\pi_n,l,I} = (\kappa^2 q^{2\pi_n} + M^2)\,\check\phi_{\pi_n,l,I}, \qquad \nu'\check\phi_{\pi_n,l,I} = q^{-l(l+N-2)/4}\,\check\phi_{\pi_n,l,I}, \tag{4.21}$$
and some other observables (whose eigenvalues we label by a multi-index $I$) commuting with each other and making up a complete set, through the relatively simple formula
$$G(y, x) = \sum_{\pi_n,l,I} \check\phi_{\pi_n,l,I}(y)\, \big[\tilde\nu'^{-2}(\Delta + M^2)^{-1}\check\phi_{\pi_n,l,I}|\big]^\star(x) = \sum_{\pi_n,l,I} \check\phi_{\pi_n,l,I}(y)\, \frac{1}{\kappa^2 q^{2\pi_n} + M^2}\, \big[\tilde\nu'^{-2}\check\phi_{\pi_n,l,I}|\big]^\star(x) = \sum_{\pi_n,l,I} \check\phi_{\pi_n,l,I}(y)\, \frac{q^{l(l+N-2)/2}}{\kappa^2 q^{2\pi_n} + M^2}\, \big[q^{\eta'^2/2}\check\phi_{\pi_n,l,I}|\big]^\star(x), \tag{4.22}$$
where $y^i$ denote the generators of another copy of $\mathbb{R}^N_q$. If we choose $I$ as the multi-index labelling spherical harmonics (2.14), one thus looks for the basis elements in the form $\check\phi_{\pi_n,l,I} = S_l^I\, \phi_{\pi_n,l}(r)$. Using the formulae given in Appendix A.1 reduces Eq. (4.21)$_1$ to a $q$-difference equation for $\phi_{\pi_n,l}(r)$; solving it is now an affordable task, which is left for future work.
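The equivalence of the $\sigma$-pictures expressed by (4.11), (4.13) and (4.14) has a transparent finite-dimensional analogue, sketched below in Python. This is only an illustration under simplifying assumptions: wave-functions are replaced by vectors in $\mathbb{C}^n$, the “naive” scalar product by the standard Hermitean one, and $\sigma$ by a positive diagonal matrix; all function names are hypothetical.

```python
from typing import List

Vector = List[complex]
Matrix = List[List[complex]]

def inner(f: Vector, g: Vector) -> complex:
    """'Naive' scalar product <f, g>, antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(f, g))

def apply(O: Matrix, f: Vector) -> Vector:
    """Matrix-vector product O f."""
    return [sum(row[j] * f[j] for j in range(len(f))) for row in O]

def to_picture(s: List[float], f: Vector) -> Vector:
    """f^[sigma] := sigma f for the diagonal sigma = diag(s), s_i > 0."""
    return [si * fi for si, fi in zip(s, f)]

def inner_sigma(s: List[float], f: Vector, g: Vector) -> complex:
    """<f, g>_[sigma] := <sigma^-1 f, sigma^-1 g>."""
    return inner([fi / si for si, fi in zip(s, f)],
                 [gi / si for si, gi in zip(s, g)])

def op_to_picture(s: List[float], O: Matrix) -> Matrix:
    """O^[sigma] := sigma O sigma^-1."""
    n = len(s)
    return [[s[i] * O[i][j] / s[j] for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    s = [1.0, 2.0, 0.5]
    O = [[1, 2 - 1j, 0], [2 + 1j, 3, 1j], [0, -1j, -1]]   # Hermitean w.r.t. <.,.>
    f, g = [1 + 1j, 2, -1j], [0.5, 1j, 3]
    fs, gs, Os = to_picture(s, f), to_picture(s, g), op_to_picture(s, O)
    print(inner(f, g), inner_sigma(s, fs, gs))             # analogue of (4.13)
    print(inner_sigma(s, fs, apply(Os, gs)), inner_sigma(s, apply(Os, fs), gs))
```

In this toy setting $O^{[\sigma]}$ is in general not a Hermitean matrix, yet it is Hermitean with respect to $\langle\cdot,\cdot\rangle_{[\sigma]}$, which mirrors the relation between $\tilde p^\alpha$ and $p^\alpha$ described above.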
5. Defining the Pseudodifferential Operators q a(η
+b)2
As said, in order that the formal considerations of the previous section are implemented at the operator level, we have to make sense out of $\sigma = q^{a(\eta+b)^2} g(C)$ as pseudodifferential operators on $F$ (more generally on $\Omega^*$) and investigate whether we need to restrict $\tilde L^{m,\sigma}_{2,p}$ to some subspace $L^{m,\sigma}_{2,p}$ in order that on the latter (4.15) holds. We are going to do this next, distinguishing the case $m \equiv 1$ from the others. Clearly, it is sufficient to do this for $p = 0$-forms, i.e. functions, because the form components are functions themselves. Recalling the decomposition (2.15) for $\phi$, (A.3) and (2.71), we see that $g(C)$ fulfills the requirement, so the problem is reduced to showing that one can define $q^{a(\eta+b)^2}$ so that the latter also does. To define the action of $q^{a(\eta+b)^2}$ on the functions $\phi_{l,I}(r)$, we perform the change of variable $r \to y := \ln r$, whereby $\eta = -\partial_y - N/2$ and $r^{N-1}dr = e^{Ny}dy$; for any function $\phi(r)$ we denote $\tilde\phi(y) := \phi(e^y)$, and express $e^{yN/2}\tilde\phi(y) = r^{N/2}\phi(r)$ in terms of its Fourier transform $\hat\phi(\omega)$:
$$e^{\frac{N}{2}y}\,\tilde\phi(y) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}\hat\phi(\omega)\,e^{i\omega y}\,d\omega. \tag{5.1}$$
Here we are assuming in addition that all $e^{\frac{N}{2}y}\tilde\phi_{l,I}(y) \in L^2(\mathbb{R}) \equiv L^2(\mathbb{R}, dy)$, in other words that all $\phi_{l,I}(r) \in L^2(\mathbb{R}^+, dr^N)$, which guarantees that the Fourier transform exists and is invertible. One initial motivation behind such a change of variable is that $y$ is more suitable to describe the behavior of functions occurring in q-analysis, notably q-special functions (which are typically involved as solutions of q-difference equations) as $r \to 0, \infty$ (i.e. $y \to -\infty, \infty$), since often they wildly fluctuate as $r \to 0$ or as $r \to \infty$; this can be inferred from the typical exponential scaling laws
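of the zeroes/poles $r_n$ of q-special functions either as $r \to 0$ or $r \to \infty$.$^k$ From (5.1), we find
$$e^{\frac{N}{2}y}\,q^{a(\eta+b)^2}\tilde\phi(y)| = q^{a(\partial_y-b)^2}\,e^{\frac{N}{2}y}\tilde\phi(y)| = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} d\omega\,\hat\phi(\omega)\,e^{i\omega y}\,q^{-a(\omega+ib)^2},$$
i.e. $q^{a(\eta+b)^2}$ acts as multiplication by $q^{-a(\omega+ib)^2}$ on the Fourier transform, implying
$$\sigma\phi(x)| = \frac{e^{-\frac{N}{2}y}}{\sqrt{2\pi}}\sum_{l=0}^{\infty} g[l(l+N-2)]\sum_I S_l^I\int_{-\infty}^{\infty} d\omega\,\hat\phi_{l,I}(\omega)\,e^{i\omega y}\,q^{-a(\omega+ib)^2}. \tag{5.2}$$
Of course this is well defined only for $q^{-a}$ such that the integrals are. We also easily see that one can extend the domain of the partial derivatives $\partial^i$ to $\phi$ with $\phi_{l,I}(r) \in L^2(\mathbb{R}^+, dr^N)$ using (A.11) and (A.5), provided we can extend also the action $\Lambda^{\pm1}f(x) = f(q^{\mp1}x)$ of $\Lambda^{\pm1} \equiv e^{\mp h\partial_y}$ on such $\phi$'s; this is done of course by setting
$$\Lambda^{\pm1}\phi(x)| = \frac{e^{-\frac{N}{2}(y\mp h)}}{\sqrt{2\pi}}\sum_{l=0}^{\infty}\sum_I S_l^I\int_{-\infty}^{\infty} d\omega\,\hat\phi_{l,I}(\omega)\,e^{i\omega(y\mp h)}. \tag{5.3}$$
In terms of Fourier transforms, the reduced scalar product (2.72) becomes
$$\langle\phi,\psi\rangle = \int_{-\infty}^{\infty} d\omega\int_{-\infty}^{\infty} d\omega'\,\hat\phi^{\star}(\omega)\,\hat\psi(\omega')\int_{-\infty}^{\infty}\frac{dy}{2\pi}\,\tilde m(y)\,e^{i(\omega'-\omega)y}. \tag{5.4}$$

5.1. The case $m \equiv 1$

In the case $m(r) = \tilde m(y) \equiv 1$, the third integral at the right-hand side of (5.4) reduces to $\delta(\omega-\omega')$, implying
$$\langle\phi,\psi\rangle = \int_{-\infty}^{\infty} d\omega\,\hat\phi^{\star}(\omega)\,\hat\psi(\omega), \qquad \langle\phi,\psi\rangle = \sum_{l=0}^{\infty}\sum_I\int_{-\infty}^{\infty} d\omega\,\hat\phi^{\star}_{l,I}(\omega)\,\hat\psi_{l,I}(\omega). \tag{5.5}$$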
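The multiplier rule behind (5.2) lends itself to a quick numerical check. The Python sketch below is only an illustration under assumptions that are not in the text: the test function is chosen so that $e^{Ny/2}\tilde\phi(y) = e^{-y^2/2}$ (whose Fourier transform in the convention of (5.1) is $e^{-\omega^2/2}$), $q = e^h$ with sample values of $a$, $b$, $h$, and the names `apply_multiplier`, `gaussian_hat` and `closed_form` are hypothetical. It evaluates the $\omega$-integral with the weight $q^{-a(\omega+ib)^2}$ by the trapezoidal rule and compares it with the same Gaussian integral done by hand.

```python
import cmath
import math


def apply_multiplier(F_hat, y, a, b, h, omega_max=12.0, steps=6000):
    """Trapezoidal estimate of
    (1/sqrt(2 pi)) * int d(omega) F_hat(omega) e^{i omega y} q^{-a (omega + i b)^2},
    with q = e^h, on the window [-omega_max, omega_max]."""
    d = 2 * omega_max / steps
    total = 0j
    for i in range(steps + 1):
        w = -omega_max + i * d
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * F_hat(w) * cmath.exp(1j * w * y - a * h * (w + 1j * b) ** 2)
    return total * d / math.sqrt(2 * math.pi)


def gaussian_hat(omega):
    """Fourier transform of e^{-y^2/2} in the unitary convention of (5.1)."""
    return math.exp(-omega ** 2 / 2)


def closed_form(y, a, b, h):
    """Hand-computed Gaussian integral for the same input (sample check only)."""
    s = 1 + 2 * a * h
    shift = 2 * a * h * b
    return math.exp(a * h * b ** 2) * math.exp(-(y - shift) ** 2 / (2 * s)) / math.sqrt(s)


if __name__ == "__main__":
    a, b, h = 0.5, 1.0, 0.3
    for y in (-1.0, 0.0, 0.7):
        print(y, apply_multiplier(gaussian_hat, y, a, b, h).real, closed_form(y, a, b, h))
```

For $ah > 0$ the operator broadens the Gaussian (the width $1$ becomes $1 + 2ah$), which is the damping behaviour referred to later as a "UV regulator".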
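For $\phi^{[\sigma]}, \psi^{[\sigma]} \in F$, this and (5.2) for $\sigma = q^{a(\eta+b)^2} g(C)$ imply
$$\langle\phi^{[\sigma]},\psi^{[\sigma]}\rangle_{[\sigma]} := \langle\sigma^{-1}\phi^{[\sigma]},\sigma^{-1}\psi^{[\sigma]}\rangle = \sum_{l=0}^{\infty} g^2(l(l+N-2))\sum_I\int_{-\infty}^{\infty} d\omega\,\big(\hat\phi^{[\sigma]}_{l,I}(\omega)\big)^{\star}\,\hat\psi^{[\sigma]}_{l,I}(\omega)\,q^{2ab^2-2a\omega^2} \tag{5.6}$$
$$= \langle\phi^{[\sigma]},\psi^{[\sigma^{-1}]}\rangle = \langle\phi^{[\sigma^{-1}]},\psi^{[\sigma]}\rangle, \tag{5.7}$$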
$^k$ This happens for instance with the q-Gaussian $e_{q^2}[-r^2] := {}_0\varphi_0[q^2,(q^2-1)r^2]$: property (5.32) implies $e_{q^2}[-q^2r^2] = [1-(q^2-1)r^2]\,e_{q^2}[-r^2]$, whence we see that for $q > 1$ and sufficiently large $r$, the modulus of $e_{q^2}[-q^{2n}r^2]$ grows with $n$ and its sign flips at each step $n \to n+1$.
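in particular
$$\|\phi^{[\sigma]}\|^2_{\sigma} = \sum_{l=0}^{\infty} g^2(l(l+N-2))\sum_I\int_{-\infty}^{\infty} d\omega\,\big|\hat\phi^{[\sigma]}_{l,I}(\omega)\big|^2\,q^{2ab^2-2a\omega^2}. \tag{5.8}$$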
The function $\phi^{[\sigma]}$ will belong to $\tilde L_2^{1,\sigma}$ if this is finite. If both $\phi^{[\sigma]}, \psi^{[\sigma]} \in \tilde L_2^{1,\sigma}$ then, by the Schwarz inequality, the right-hand side of (5.6) is finite as well; the equalities in (5.7) are then just the proof of relation (4.15) we were seeking. Note that in the present $m \equiv 1$ case, by (2.71) the condition $\|\phi^{[\sigma]}\|^2_{\sigma} < \infty$ characterizing $\tilde L_2^{1,\sigma}$ implies $q^{a\eta^2}\phi_{l,I} \in L^2(\mathbb{R}^+, dr^N)$ for all $l, I$, whence the assumed existence and invertibility of the Fourier transform automatically follows. We summarize the results by stating the following:

Theorem 5.1. If $m \equiv 1$, for any real $s$, the scalar product of the Hilbert space $L_2^{1,\sigma} := \tilde L_2^{1,\sigma}$ can be expressed by any of the expressions in (4.15) and the $\tilde p^{\alpha[s]}$ are (formally) hermitean operators defined on $L_2^{1,\sigma}$.

Remark 5.2. If $q^a > 1$, the factor $q^{-2a\omega^2}$ in (5.8) acts as a "UV regulator".

5.2. The case $m \not\equiv 1$

The measure $m \equiv 1$ describes a continuous and homogeneous space along the radial direction. It is important to leave room for a discretized space by allowing for a non-unit $m$, notably a measure concentrated in points, like Jackson's measure $m_{J,r_0}(r)\,dr^N$, where
$$m_{J,r_0}(r) := |q-1|\sum_{l\in\mathbb{Z}} r\,\delta(r-r_0q^l) = |q-1|\sum_{l\in\mathbb{Z}}\delta(y-y_0-lh)$$
(here $y_0 = \log r_0$). The case $m \not\equiv 1$ turns out to be rather interesting and full of surprises; in the sequel, we disclose some of its features by performing a preliminary analysis, leaving an exhaustive investigation as the subject for some other work. We assume that all the $\phi_{l,I}(r)$ can be analytically continued to the complex $r$-plane. Sticking for simplicity to the case that the $\phi_{l,I}(r)$ are uni-valued, the analytic continuation of $\tilde\phi_{l,I}(y) := \phi_{l,I}(e^y)$ will fulfill the periodicity condition
$$\tilde\phi_{l,I}(y) = \tilde\phi_{l,I}\Big(y + i2\pi\frac{k}{\gamma}\Big), \qquad k \in \mathbb{Z}, \tag{5.9}$$
with $\gamma = 1$; more generally, they will also fulfill this condition with $\gamma = 2, 3, \ldots$ if $\phi_{l,I}(r)$ can be expressed in the form $\phi_{l,I}(r) = \bar\phi_{l,I}(r^\gamma)$, with $\bar\phi_{l,I}(z)$ uni-valued. Below we shall occasionally suppress the subscripts $l, I$ in the intermediate results to avoid too heavy a notation. Now, we compute the Fourier transform $\hat\phi$ of $\tilde\phi(y)e^{\frac{N}{2}y}$,
$$\hat\phi(\omega) = \int_{-\infty}^{\infty}\frac{dy}{\sqrt{2\pi}}\,\Phi(\omega,y), \qquad \Phi(\omega,y) := \tilde\phi(y)\,e^{\frac{N}{2}y-i\omega y}, \tag{5.10}$$
Fig. 1. Pole locations of the integrand of (5.10) and anticlockwise integration contour.
using the method of residues. We first assume that $\phi(r)$ has no poles on $\mathbb{R}^+$ (or, equivalently, that $\tilde\phi(y)$ has no poles on the real axis). For $\omega < 0$, the exponential $e^{-i\omega y}$ rapidly goes to zero as $\Im(y) \to \infty$. Choose a contour like the one depicted in Fig. 1, with $M \in \mathbb{N}$. By (5.9), the integral on the upper horizontal side equals $-e^{iNM\pi+\omega M2\pi}$ times (5.10), and therefore vanishes in the limit $M \to \infty$, together with the integral on the vertical sides. Therefore, taking this limit, we find
$$\hat\phi(\omega) = i\sqrt{2\pi}\sum_{\text{poles } y'\in\mathbb{C}^+}\mathrm{Res}\,\Phi(\omega,y') = i\sqrt{2\pi}\sum_{\text{poles } y'\in\mathbb{C}^+}\mathrm{Res}\,\tilde\phi(y')\,e^{(\frac{N}{2}-i\omega)y'}.$$
By (5.9), the poles of $\tilde\phi(y)$, and therefore of $\Phi(\omega,y)$, can be parametrized in the form
$$y_{j_\phi,k} = y_{j_\phi} + 2\pi i\frac{k}{\gamma}, \qquad 0 < \Im(y_{j_\phi}) < \frac{2\pi}{\gamma}, \tag{5.11}$$
where $k \in \mathbb{Z}$ and $j_\phi$ is some possible additional index. Therefore,
$$\hat\phi(\omega) = i\sqrt{2\pi}\sum_{j_\phi}\mathrm{Res}\,\tilde\phi(y_{j_\phi})\,e^{(\frac{N}{2}-i\omega)y_{j_\phi}}\sum_{k=0}^{\infty}e^{(\frac{N}{2}-i\omega)i2\pi\frac{k}{\gamma}} = \frac{i\sqrt{2\pi}}{1-e^{\frac{\pi}{\gamma}(iN+2\omega)}}\sum_{j_\phi}\mathrm{Res}\,\tilde\phi(y_{j_\phi})\,e^{(\frac{N}{2}-i\omega)y_{j_\phi}} \tag{5.12}$$
since by (5.9) $\mathrm{Res}\,\tilde\phi(y_{j_\phi}) = \mathrm{Res}\,\tilde\phi(y_{j_\phi}+i2\pi k/\gamma)$. By applying the method of residues instead to an analogous clockwise contour in the lower complex $y$-half-plane $\mathbb{C}^-$, one finds that the latter formula gives $\hat\phi(\omega)$ also for $\omega > 0$. Note that if $N/\gamma$ is an even integer, $\hat\phi(\omega)$ has a first-order pole in $\omega = 0$ and $\int_{-\infty}^{\infty}d\omega$ in (5.1) has to be understood as a principal value integral around $\omega = 0$, unless cancellations of contributions of different poles $j_\phi$ occur.
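Formula (5.12) can be tested on an explicit example. The sketch below rests on choices made here, not in the text: $N = 2$, $\gamma = 1$ and $\phi(r) = 1/(1+r^2)$, so that $e^{\frac{N}{2}y}\tilde\phi(y) = 1/(2\cosh y)$. In the strip $0 < \Im(y) < 2\pi$ the function $\tilde\phi$ has simple poles at $y = i\pi/2$ and $y = 3i\pi/2$, each with residue $-1/2$. The hypothetical helpers `phi_hat_residues` and `phi_hat_quadrature` evaluate $\hat\phi(\omega)$ through (5.12) and through direct quadrature of (5.10); both should agree with the classical result $\frac{\sqrt{2\pi}}{4}\,\mathrm{sech}(\pi\omega/2)$ for the transform of $1/(2\cosh y)$.

```python
import cmath
import math

N = 2       # dimension, assumed for this example only
GAMMA = 1   # index gamma of the periodicity condition (5.9)

# Poles of phi_tilde(y) = 1/(1 + e^{2y}) with 0 < Im(y) < 2*pi/GAMMA, paired
# with their residues: the denominator has derivative 2 e^{2y} = -2 there.
POLES = [(1j * math.pi / 2, -0.5), (3j * math.pi / 2, -0.5)]


def phi_tilde(y):
    """phi(r) = 1/(1 + r^2) expressed in the variable y = ln r."""
    return 1.0 / (1.0 + cmath.exp(2 * y))


def phi_hat_residues(omega):
    """Right-hand side of (5.12) for the example above (omega != 0)."""
    prefactor = 1j * math.sqrt(2 * math.pi) / (
        1 - cmath.exp(math.pi / GAMMA * (1j * N + 2 * omega)))
    total = sum(res * cmath.exp((N / 2 - 1j * omega) * y) for y, res in POLES)
    return prefactor * total


def phi_hat_quadrature(omega, y_max=40.0, steps=40000):
    """Trapezoidal estimate of the integral (5.10) on [-y_max, y_max]."""
    d = 2 * y_max / steps
    total = 0j
    for i in range(steps + 1):
        y = -y_max + i * d
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * phi_tilde(y) * cmath.exp((N / 2 - 1j * omega) * y)
    return total * d / math.sqrt(2 * math.pi)


if __name__ == "__main__":
    for omega in (-0.7, 0.5):
        print(omega, phi_hat_residues(omega), phi_hat_quadrature(omega))
```

Here $N/\gamma = 2$ is even, so the prefactor of (5.12) is singular at $\omega = 0$; in this particular example the two pole contributions cancel there, which is the situation alluded to at the end of the paragraph above. The sketch simply avoids $\omega = 0$.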
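Replacing (5.12) in (5.2), if no $\phi_{l,I}(r)$ has poles on $\mathbb{R}^+$ we find
$$\sigma\phi(x)| = i\sum_{l=0}^{\infty} g[l(l+N-2)]\sum_I S_l^I\sum_{j_{l,I}}\mathrm{Res}\,\tilde\phi(y_{j_{l,I}})\int_{-\infty}^{\infty} d\omega\,\frac{e^{(i\omega-\frac{N}{2})(y-y_{j_{l,I}})}\,q^{-a(\omega+ib)^2}}{1-e^{\frac{\pi}{\gamma}(2\omega+iN)}}, \tag{5.13}$$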
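where we have used the shorthand notation $j_{l,I} := j_{\tilde\phi_{l,I}}$. The integral is well defined for $q^{-a} \le 1$, i.e. $ah \ge 0$. Note that if $ah > 0$, because of the damping factor $q^{-a\omega^2}$, $\tilde w^{-a}\phi(x)$ has no more poles in $y = y_{j_{l,I}}$. Formula (5.3) will still give the action of $\Lambda^{\pm1}$ on $\phi$. Let us now evaluate $\langle\sigma\phi,\sigma'\psi\rangle$ (with $ah, a'h \ge 0$) in the present case. By (2.71) and the previous equation, we find
$$\langle\sigma\phi,\sigma'\psi\rangle = \sum_{l=0}^{\infty}\sum_I gg'[l(l+N-2)]\,\langle q^{a(\eta+b)^2}\phi_{l,I},\,q^{a'(\eta+b')^2}\psi_{l,I}\rangle \tag{5.14}$$
and
$$\langle q^{a(\eta+b)^2}\phi,\,q^{a'(\eta+b')^2}\psi\rangle = \sum_{j,j'}[\mathrm{Res}\,\tilde\phi(y_j)]^{\star}\,\mathrm{Res}\,\tilde\psi(y_{j'})\,M_j^{j'}(a,b;a',b'), \tag{5.15}$$
$$M_j^{j'} := \int_{-\infty}^{\infty} d\omega\int_{-\infty}^{\infty} d\omega'\,\frac{e^{\frac{N}{2}(y_j+y_{j'})+i(\omega y_j-\omega' y_{j'})}\,q^{-a(\omega+ib)^2-a'(\omega'-ib')^2}}{[1-e^{\frac{\pi}{\gamma}(2\omega-iN)}][1-e^{\frac{\pi}{\gamma}(2\omega'+iN)}]}\times\int_{-\infty}^{\infty} dy\,\tilde m(y)\,e^{i(\omega'-\omega)y}$$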
(here $y_{j'}$ denote the pole locations of $\tilde\psi(y)$ with $0 < \Im(y_{j'}) < 2\pi/\gamma$). We ask whether $\langle\phi, q^{a(\eta+b)^2}\psi\rangle = \langle q^{a(\eta-b)^2}\phi,\psi\rangle$ for $\phi, \psi$ within a suitable space of functions to be identified. For $\gamma \in \mathbb{N}$ and $\beta = 0, \frac12$, let
$$L_2^{m,[\beta,\gamma]} := \{\phi \in L^2(\mathbb{R}^+, m(r)\,dr^N) \mid \phi(r) = f(r)\bar\phi(r^\gamma), \text{ where } \bar\phi \text{ is analytic with poles only in } z = -q^{\gamma(j+\beta)},\ j \in \mathbb{Z}\}. \tag{5.16}$$
The poles of $\phi(r)$ will be only in
$$r_{j,k} := q^{j+\beta}\,e^{i\frac{\pi(2k+1)}{\gamma}} \tag{5.17}$$
with $k = 0, 1, \ldots, \gamma-1$ and $j$ belonging to some subset $J \subset \mathbb{Z}$, and those of $\tilde\phi(y)$ only in
$$y_{j,k} := h(j+\beta) + i\frac{\pi(2k+1)}{\gamma}. \tag{5.18}$$
Condition (5.17) amounts to saying that the pole locations lie on $\gamma$ special straight half-lines starting from $r = 0$ and forming with each other angles equal to $2\pi/\gamma$, and are such that their absolute values are either $q^j$ or $q^{j+\frac12}$, with $j \in J \subset \mathbb{Z}$.
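The pole pattern (5.17)–(5.18) can be made concrete on the simplest function of this type, $\phi(r) = 1/(1+(q^{j+\beta}r)^\gamma)$. The following sketch is illustrative only; the function names and the sample values of $q$, $j$, $\beta$, $\gamma$ are assumptions. It lists the poles, which sit at $r = q^{-(j+\beta)}e^{i\pi(2k+1)/\gamma}$, and their images $y = \ln r$. Since $2\beta \in \{0, 1\}$, the modulus $q^{-(j+\beta)}$ is again of the form $q^{j''+\beta}$ with the integer $j'' = -j-2\beta$, as required by (5.17).

```python
import cmath
import math


def pole_positions(q, j, beta, gamma):
    """Poles r of 1/(1 + (q**(j+beta) * r)**gamma) and their logarithms y.

    Returns a list of (r, y) with r = exp(y) and
    y = -h*(j+beta) + i*pi*(2k+1)/gamma, h = ln q, k = 0, ..., gamma-1.
    """
    h = math.log(q)
    poles = []
    for k in range(gamma):
        angle = math.pi * (2 * k + 1) / gamma
        r = q ** (-(j + beta)) * cmath.exp(1j * angle)
        y = complex(-h * (j + beta), angle)
        poles.append((r, y))
    return poles


def denominator(q, j, beta, gamma, r):
    """Denominator of the sample function, which must vanish at each pole."""
    return 1 + (q ** (j + beta) * r) ** gamma


if __name__ == "__main__":
    q, j, beta, gamma = 1.5, 2, 0.5, 3
    for r, y in pole_positions(q, j, beta, gamma):
        print(r, y, abs(denominator(q, j, beta, gamma, r)))
```

The $\gamma$ poles share the same modulus and are separated by angles $2\pi/\gamma$, which is the geometric statement made after (5.18).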
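The condition appearing in (5.16) thus implies (5.11) (with $\Im(y_j) = \pi/\gamma$), whence (5.13)–(5.15). Thus, if $\phi, \psi \in L_2^{m,[\beta,\gamma]}$, then
$$M_j^{j'}(a,b;a',b') = \frac14\int_{-\infty}^{\infty} d\omega\int_{-\infty}^{\infty} d\omega'\,\frac{q^{-a(\omega+ib)^2-a'(\omega'-ib')^2+\frac{N}{2}(j+j'+2\beta)}}{\sin\!\big[\frac{\pi}{2\gamma}(N-2i\omega)\big]\,\sin\!\big[\frac{\pi}{2\gamma}(N+2i\omega')\big]}\times\int_{-\infty}^{\infty} dy\,\tilde m(y)\,e^{i[(\omega'-\omega)(y-h\beta)+\omega jh-\omega' j'h]}. \tag{5.19}$$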
Note that in (5.15), one can consider the indices $j, j'$ as running over the whole $\mathbb{Z}$ for any $\phi, \psi$ because the residues will vanish in the $y_j$ which are not poles for these functions. Then, one can consider $M(a,b;a',b')$ as a universal infinite matrix and express the left-hand side of (5.15) in terms of the row-by-column matrix product
$$\langle q^{a(\eta+b)^2}\phi,\,q^{a'(\eta+b')^2}\psi\rangle = R_\phi^{\dagger}\,M(a,b;a',b')\,R_\psi, \tag{5.20}$$
where by $R_\phi$ we have denoted the column vector with infinitely many components $R_\phi^j$, $j \in \mathbb{Z}$, given by $R_\phi^j = \mathrm{Res}\,\tilde\phi|_{y=[h(j+\beta)+i\pi/\gamma]}$. Now, performing the change of integration variables $\omega \to -\omega'$, one immediately finds that $M_{j'}^{j}(a',b';a,b) = M_j^{j'}(a,b;a',b')$. Moreover, taking the complex conjugate and performing the change of integration variables $\omega \to -\omega$, $\omega' \to -\omega'$, we find that the $M_j^{j'}$ are real,
$$\big[M_j^{j'}(a,b;a',b')\big]^{\star} = M_j^{j'}(a,b;a',b'). \tag{5.21}$$
By the q-scaling property, the transformed weight $\tilde m(y) := m(e^y)$ is periodic with period $h = \ln q$; we shall also assume that $m$ is invariant under $r$-inversion,$^l$ so for any $k \in \mathbb{Z}$,
$$m(q^kr) = m(r),\quad m(r^{-1}) = m(r), \qquad\text{i.e.}\qquad \tilde m(y+kh) = \tilde m(y),\quad \tilde m(-y) = \tilde m(y). \tag{5.22}$$
Performing the change of integration variables $\omega' \leftrightarrow \omega$, $y \to -y+(j+j'+2\beta+2a'b'-2ab)h$, we now find
$$M_j^{j'}(a,b;a',b') = M_j^{j'}(a',b';a,b), \qquad\text{if } N/\gamma \in \mathbb{N},\quad 2(a'b'-ab) \in \mathbb{Z}; \tag{5.23}$$
in fact, the weight $\tilde m$ and the last integral in (5.19) are automatically invariant under this change of integration variables, whereas the condition $N/\gamma \in \mathbb{N}$ ensures also that the denominator in the first two is. From these relations, we find that the matrix $M$ is Hermitean:
$$M^{\dagger}(a,b;a',b') = M(a,b;a',b'). \tag{5.24}$$
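This is true in particular if $a = a'$, $b = b'$. Choosing instead $a' = 0 = b'$, relations (5.23) and (5.24) together with (5.15), respectively, imply
$$\langle\phi,\,q^{a(\eta+b)^2}\psi\rangle = \langle q^{a(\eta-b)^2}\phi,\,\psi\rangle, \tag{5.25}$$
$$\langle\phi,\,q^{a(\eta-b)^2}\psi\rangle^{\star} = \langle q^{a(\eta-b)^2}\psi,\,\phi\rangle = \langle\psi,\,q^{a(\eta+b)^2}\phi\rangle. \tag{5.26}$$

$^l$ For the Jackson weight $m_{J,r_0}$ given above, this necessarily requires $r_0 = 1$ or $r_0 = q^{1/2}$.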
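Footnote l and the definition of Jackson's measure in Sec. 5.2 can be illustrated numerically. The sketch below is a simplified, hypothetical setting: it integrates against $m_{J,r_0}(r)\,dr$ on the half-line (a one-dimensional stand-in for the radial measure of the text), so that the integral becomes the sum $|q-1|\sum_l r_0q^l f(r_0q^l)$, and it checks on which lattices $\{r_0q^l\}$ the inversion $r \to 1/r$ maps the support into itself. The function names and the truncation rule are assumptions.

```python
import math


def jackson_integral(f, q, r0=1.0):
    """Sum |q - 1| * sum_l r0 q^l f(r0 q^l), truncated where q^l = e^{+-40}.

    This is the integral of f against the Jackson weight m_{J,r0}(r) dr.
    """
    terms = int(40 / abs(math.log(q)))
    return abs(q - 1) * sum(
        r0 * q ** l * f(r0 * q ** l) for l in range(-terms, terms + 1))


def support_is_inversion_invariant(q, r0, tol=1e-9):
    """True if 1/(r0 q^l) is again of the form r0 q^l' with l' integer.

    With y0 = ln r0 and h = ln q this requires 2*y0/h to be an integer.
    """
    n = 2 * math.log(r0) / math.log(q)
    return abs(n - round(n)) < tol


if __name__ == "__main__":
    q = 1.01
    print(jackson_integral(lambda r: math.exp(-r), q))       # close to 1 for q near 1
    print(support_is_inversion_invariant(q, 1.0))            # True
    print(support_is_inversion_invariant(q, math.sqrt(q)))   # True
    print(support_is_inversion_invariant(q, q ** 0.3))       # False
```

Up to integer powers of $q$, which only relabel the lattice, the two invariant choices are $r_0 = 1$ and $r_0 = q^{1/2}$, in line with the footnote; for $q \to 1$ the Jackson sum approaches the ordinary integral.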
In formula (A.17) in the Appendix, we give a necessary and sufficient condition on the weight $m$ (which is satisfied in particular by the Jackson measure) and on the parameters $a, h, \gamma$ in order that the positivity condition
$$\langle\phi,\,q^{a\eta^2}\phi\rangle \ge 0, \qquad \langle\phi,\,q^{a\eta^2}\phi\rangle = 0 \ \text{ iff } \ \phi = 0 \tag{5.27}$$
is fulfilled. We need this to be true with any $a$ such that $ah \ge 0$, in particular with $a = 1/2$ for (4.9)$_1$ to be valid, or alternatively with $a = -1/2$ for the analog of (4.9)$_1$ with $p^\alpha$ replaced by the $\hat p^\alpha$ to be valid. Then, for any $\sigma = g(C)\,q^{a(\eta+b)^2}$ with $2ab \in \mathbb{Z}$,
$$\langle\phi,\psi\rangle_{[\sigma]} = \sum_{l,I} g^2[l(l+N-2)]\,\langle\phi_{l,I},\,q^{2a(\eta^2+b^2)}\psi_{l,I}\rangle = \langle\phi,\,\sigma^{-2}\psi\rangle$$
$$= \sum_{l,I} g^2[l(l+N-2)]\,\langle q^{2a(\eta^2+b^2)}\phi_{l,I},\,\psi_{l,I}\rangle = \langle\sigma^{-2}\phi,\,\psi\rangle \tag{5.28}$$
defines a "good" scalar product within the following subspace of $\tilde L_2^{m,s}$,
$$L_2^{m,\sigma,[\beta,\gamma]} := \Big\{\phi \equiv \sum_{l,I} S_l^I\,\phi_{l,I}(r) \ \Big|\ \phi_{l,I} \in L_2^{m,[\beta,\gamma]} \text{ with } \|\phi\|_\sigma < \infty\Big\} \tag{5.29}$$
(here $\|\phi\|^2_\sigma := \langle\phi,\phi\rangle_{[\sigma]}$), making the latter a pre-Hilbert space. Relation (5.26) ensures the sesquilinearity of $\langle\cdot,\cdot\rangle_{[\sigma]}$, (5.27) its positivity. The $\tilde p^{\alpha[\sigma]}$ are (formally) hermitean operators on their domain within $L_2^{m,\sigma,[\beta,\gamma]}$, as a consequence of (5.25). Investigating their essential self-adjointness in the completed Hilbert space is left for future work. We collect the results by stating the following:

Theorem 5.3. Let $\beta \in \{0, 1/2\}$, $\gamma \in \mathbb{N}$ be a submultiple of $N$, $ah \ge 0$, $4ab \in \mathbb{Z}$, $\sigma = g(C)\,q^{a(\eta+b)^2}$. Assume that the radial weight $m(r)$ fulfills (5.22) and (A.17), where $\check m(y) \equiv m(e^{h(y/2+\beta)})$. Then, (5.28) defines the scalar product of a pre-Hilbert space $L_2^{m,\sigma,[\beta,\gamma]}$ and the $\tilde p^{\alpha[\sigma]}$ are (formally) hermitean operators defined on $L_2^{m,\sigma,[\beta,\gamma]}$.

The spaces introduced in (5.29) are very interesting. Functions $\phi_{l,I}$ fulfilling (5.17) are for instance
$$\frac{1}{1+(q^{j+\beta}r)^\gamma}, \qquad f(r)\prod_l\frac{1}{1+(q^{j_l+\beta}r)^\gamma}, \tag{5.30}$$
where $j_l \in \mathbb{Z}$, $\beta = 0, 1/2$ and $f(r)$ is a polynomial or more generally analytic in a domain including all $\mathbb{R}^+$. To this category belong also some q-special functions with distinguished (i.e. quantized) values of the parameters characterizing them. Essentially all special functions can be defined as particular cases of the q-hypergeometric functions ${}_r\varphi_s(a_1,\ldots,a_r;b_1,\ldots,b_s;q,z)$,$^m$ defined as (analytic continuations in the complex $z$-plane of)
$${}_r\varphi_s(a_1,\ldots,a_r;b_1,\ldots,b_s;q,z) := \sum_{n=0}^{\infty}\frac{(a_1;q)_n\cdots(a_r;q)_n}{(b_1;q)_n\cdots(b_s;q)_n}\,\big((-1)^nq^{n(n-1)/2}\big)^{1+s-r}\,\frac{z^n}{(q;q)_n} \tag{5.31}$$

$^m$ See, for instance, [20, 23].
(with parameters such that the series has at least a finite convergence radius), where
$$(a;q)_0 := 1, \qquad (a;q)_n := \prod_{i=0}^{n-1}(1-aq^i), \quad n = 1, 2, \ldots$$
(whenever $|q| < 1$, the latter definition makes sense also for $n = \infty$). For instance, the functions introduced in (4.20) can be expressed as
$$e_q(z) = {}_0\varphi_0(q,(1-q)z), \qquad \varphi_q^J(z) = \frac{1}{(J)_{q^2}!}\,{}_2\varphi_1(0,0;q^{2J};q^2,-(1-q^2)^2z).$$
One can rewrite them in the form (5.30)$_2$, using their interesting properties (see, e.g., [20, 23]). For example,
$${}_0\varphi_0(q,z) = \prod_{i=0}^{\infty}\frac{1}{1-zq^i}, \qquad {}_1\varphi_0(a;q,z) = \prod_{i=0}^{\infty}\frac{1-azq^i}{1-zq^i}, \tag{5.32}$$
$${}_2\varphi_1(a_1,a_2;b;q,z) = \frac{(a_2;q)_\infty\,(a_1z;q)_\infty}{(b;q)_\infty\,(z;q)_\infty}\;{}_2\varphi_1\Big(\frac{b}{a_2},z;a_1z;q,a_2\Big). \tag{5.33}$$
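The second product formula in (5.32) is the q-binomial theorem and can be checked numerically for $|q| < 1$, $|z| < 1$. The sketch below sums the ${}_1\varphi_0$ series of (5.31) (for $r = 1$, $s = 0$ the factor $((-1)^nq^{n(n-1)/2})^{1+s-r}$ equals 1) and compares it with a truncated infinite product. The function names, the truncation orders and the sample values are assumptions made for illustration.

```python
def q_pochhammer(a, q, n):
    """(a; q)_n = prod_{i=0}^{n-1} (1 - a q^i), with (a; q)_0 = 1."""
    prod = 1.0
    for i in range(n):
        prod *= 1 - a * q ** i
    return prod


def phi_1_0_series(a, q, z, terms=200):
    """Truncated series sum_n (a;q)_n / (q;q)_n * z^n."""
    return sum(
        q_pochhammer(a, q, n) / q_pochhammer(q, q, n) * z ** n
        for n in range(terms))


def phi_1_0_product(a, q, z, factors=200):
    """Truncated product prod_i (1 - a z q^i) / (1 - z q^i)."""
    prod = 1.0
    for i in range(factors):
        prod *= (1 - a * z * q ** i) / (1 - z * q ** i)
    return prod


if __name__ == "__main__":
    a, q, z = 0.3, 0.5, 0.4
    print(phi_1_0_series(a, q, z), phi_1_0_product(a, q, z))
    # The case a = 0 gives sum_n z^n / (q;q)_n = prod_i 1 / (1 - z q^i).
    print(phi_1_0_series(0.0, q, z), phi_1_0_product(0.0, q, z))
```

The product form makes the pole structure explicit: the poles in $z$ sit at $z = q^{-i}$, $i = 0, 1, \ldots$, i.e. on a geometric lattice of the kind required in (5.16).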
Using (5.32)$_1$ and (5.33) with $a_1, a_2 = 0$, $b = q^l$ ($l \in \mathbb{Z}$), one can check$^n$ that the eigenfunctions $\phi^{[\tau]}_{\pi,j}$ written in Sec. 4 belong to the space $L_2^{m,\sigma,[\beta,\gamma]}$ where $\sigma = \tau := \nu\,q^{(\eta+N+1)^2/4}$, $\beta = 0$, $\gamma = 1$, provided the energy scale $\kappa^2$ appearing in their definition is quantized (up to powers of $q$) as follows:
$$\kappa^2 = \frac{(1+q^{2-N})^2}{(1-q^2)^2}. \tag{5.34}$$
Acknowledgments

This work is partially supported by the European Commission RTN Programme HPRN-CT-2000-00131 and by MIUR.

A. Appendix

A.1. Proof of Theorem 3.1 and related lemmas

For $\sigma^i = x^i, \xi^i, \partial^i$, we easily find
$$\sigma^i \triangleleft w^{\pm1} = q^{\pm(1-N)}\sigma^i, \qquad \sigma^i \triangleleft \tilde w^{\pm1} = q^{\mp N}\sigma^i. \tag{A.1}$$
In fact,
$$\sigma^i \triangleleft u_1 \overset{(2.26)}{=} \sigma^j\rho^i_j(u_1) \overset{(2.31)}{=} \sigma^j\rho^i_h(SR^{(2)})\,\rho^h_j(R^{(1)}) \overset{(2.30)}{=} \sigma^j g^{il}\rho^m_l(R^{(2)})\,g_{mh}\,\rho^h_j(R^{(1)}) = \sigma^j g^{il}g_{mh}\hat R^{mh}_{jl} \overset{(2.7),(2.8)}{=} q^{1-N}\sigma^j g^{il}g_{jl}.$$
Recalling (2.30) and (2.31), we find $\sigma^i \triangleleft w^2 = \rho^i_j(u_1Su_1)\,\sigma^j = q^{2-2N}\sigma^i$, whence the first part of the claim. The proof of the second statement is completely analogous.

$^n$ Details will be given elsewhere.
It is not difficult to check that (A.1) implies
$$w\,\sigma^i\,w^{-1} = q^{N-1}Z^i_j\,\sigma^j. \tag{A.2}$$
Lemma A.1. Let $w_l := q^{-l(l+N-2)}$. Then, on the spherical harmonics of level $l$ (with $l = 0, 1, 2, \ldots$)
$$w\,S_l^I| = w_l\,S_l^I = S_l^I \triangleleft w, \qquad w^a\,S_l^I| = (w_l)^a\,S_l^I = S_l^I \triangleleft w^a, \tag{A.3}$$
for any real a. In particular, ν SlI | = q −l(l+N −2)/4 SlI = SlI ν. Proof. We determine the eigenvalue wl applying the pseudodifferential operator w ≡ ϕ(w) to Sln···n = (tn )l : w (tn )l |
(2.44)
(tn )l S −1 w
(2.25)
tn w(1) [(tn )l−1 w(2) ]
(2.32)1
tn wT −1(1) [(tn )l−1 wT −1(2) ]
(A.1)1
q 1−N wl−1 tn T −1(1) [(tn )l−1 T −1(2) ] −1(2) −1(2) · · · tn T(l−1) q 1−N wl−1 (tn T −1(1) ) tn T(1) −1(2) −1(2) · · · ρnil T(l−1) ti1 ti2 · · · til . q 1−N wl−1 ρni1 (T −1(1) )ρni2 T(1)
= = = =
(2.25)
=
(2.26)
=
(2.32)2
=
(tn )l w
From the definition of $T$ and the relations
$$(\mathrm{id}\otimes\Delta)R = R_{13}R_{12}, \qquad (\Delta\otimes\mathrm{id})R = R_{13}R_{23},$$
it follows that $T^{-1(1)}\otimes T^{-1(2)}_{(1)}\otimes\cdots\otimes T^{-1(2)}_{(l-1)}$ is a product of $2(l-1)$ factors $R^{-1}_{mn}$, with suitable $m, n = 1, 2, \ldots, l$. A glance at the explicit form [10] of the Yang–Baxter matrix $R$ shows that $R^{-1\,nn}_{hk} := \rho^n_h(R^{-1(1)})\,\rho^n_k(R^{-1(2)}) = q^{-1}\delta^n_h\delta^n_k$. It follows that
$$\rho^n_{i_1}(T^{-1(1)})\,\rho^n_{i_2}(T^{-1(2)}_{(1)})\cdots\rho^n_{i_l}(T^{-1(2)}_{(l-1)}) = q^{-2(l-1)}\delta^n_{i_1}\cdots\delta^n_{i_l},$$
which together with the preceding relation gives the recursive relation $w_l = q^{3-2l-N}w_{l-1}$; we solve the latter starting from $w_1 = q^{1-N}$ (see (A.1)) and we find (A.3)$_1$, and consequently also (A.3)$_2$.

Lemma A.2. An element $O \in H$ is identically zero iff for any $f \in \mathbb{R}_q^N$,
$$Of| = 0. \tag{A.4}$$
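Proof. Let $\{X^\pi\}_{\pi\in\Pi}$ be the basis of $\mathbb{R}_q^N$ dual to the one $\{D_\pi\}_{\pi\in\Pi}$ of (2.10) with respect to the pairing (2.12). From the hypothesis, we obtain
$$O_\nu = OX^\nu| = 0 \quad \forall\nu\in\Pi \qquad\Rightarrow\qquad O = \sum_{\nu\in\Pi}O_\nu D_\nu = 0.$$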
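Returning briefly to the recursion used at the end of the proof of Lemma A.1: with $w_1 = q^{1-N}$ and $w_l = q^{3-2l-N}w_{l-1}$, the exponents simply add up, $\sum_{k=1}^{l}(3-2k-N) = -l(l+N-2)$, which is the closed form stated in the lemma. A small sketch of this bookkeeping follows; the function names are hypothetical and only exponents of $q$ are tracked.

```python
def w_exponent_by_recursion(l, N):
    """Exponent e_l with w_l = q**e_l, from w_1 = q^(1-N) and
    w_l = q^(3 - 2l - N) * w_{l-1} for l >= 2."""
    exponent = 1 - N
    for k in range(2, l + 1):
        exponent += 3 - 2 * k - N
    return exponent


def w_exponent_closed_form(l, N):
    """Exponent of q in the closed form w_l = q^(-l(l+N-2))."""
    return -l * (l + N - 2)


if __name__ == "__main__":
    for N in (3, 4):
        print(N, [w_exponent_by_recursion(l, N) for l in range(1, 6)])
```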
In order to prove the theorem, we need some more useful relations. Let us introduce the shorthand notations
$$\mu := 1+q^{2-N}, \qquad \bar\mu := 1+q^{N-2}, \qquad l_z := \frac{z^l-1}{z-1}$$
($l_z$ is called "z-number" because $l_z \to l$ as $z \to 1$). Moreover, we introduce z-derivatives (with $z = q, q^{-1}$)
$$D_zf(r)| := \frac{f(zr)-f(r)}{(z-1)r} \qquad\Rightarrow\qquad D_qf(q^{-1}r)| = q^{-1}D_{q^{-1}}f(r)|.$$
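The z-number and z-derivative just introduced can be tried out directly. The sketch below uses hypothetical names and sample values; it checks the two elementary facts used implicitly in what follows, namely $D_q r^l = l_q\,r^{l-1}$ and the stated relation $D_q[f(q^{-1}r)] = q^{-1}D_{q^{-1}}f(r)$.

```python
def z_number(l, z):
    """l_z = (z^l - 1)/(z - 1), which tends to l as z -> 1 (z != 1 here)."""
    return (z ** l - 1) / (z - 1)


def z_derivative(f, z):
    """Return the function r -> (f(z r) - f(r)) / ((z - 1) r)."""
    def Df(r):
        return (f(z * r) - f(r)) / ((z - 1) * r)
    return Df


if __name__ == "__main__":
    q, r, l = 1.3, 0.8, 3

    def power(x):
        return x ** l

    # D_q r^l equals l_q r^(l-1)
    print(z_derivative(power, q)(r), z_number(l, q) * r ** (l - 1))

    # D_q applied to r -> f(r/q) equals (1/q) * D_{1/q} f
    def rescaled(x):
        return power(x / q)

    print(z_derivative(rescaled, q)(r), z_derivative(power, 1 / q)(r) / q)
```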
ˆ ˆ := ∂ˆ · ∂, Then, setting henceforth for brevity := ∂ · ∂, µ xi + qr∂ i , 1+q r xi µ ¯ + q −1 r∂ˆi , ∂ˆi r = −1 1+q r ˆ ˆ i=µ x ¯ ∂ˆi + q −2 xi ,
∂ i r2 = µxi + q 2 r2 ∂ i ,
∂ ir =
∂ˆi r2 = µ ¯xi + q −2 r2 ∂ˆi , xi = µ ∂ i + q 2 xi ,
r2 = µ2 (q N Λ−2 − 1)(q 2 − 1)−1 + q 2 r2 , µ xi Dq f (r) + f (qr)∂i , ∂ i f (r) = 1+q r xi µ ¯ D −1 f (r) + f (q −1 r)∂ˆi . ∂ˆi f (r) = −1 1+q r q
(A.5)
(A.6) (A.7) (A.8) (A.9)
Let i1 ···il j1 x · · · xjl , Xli1 ···il := rl Sli1 ···il = Pjs,l 1 ···jl
i1 ···il j2 i0 i1 ···il Tl−1 := g i0 j1 Pjs,l x · · · xjl 1 ···jl
i0 i1 ···il [compare with (2.14)]. Clearly r1−l Tl−1 ∈ Vl−1 . The projector Ps,l is uniquely characterized by the following property [11]
Ps,l Pπ,m(m+1) = Pπ,m(m+1) Ps,l = δsπ Ps,l ,
Ps,l 2 = Ps,l ,
(A.10)
where π = a, s, t, m = 1, . . . , l − 1 and by Pπ,m(m+1) we have denoted the matrix acting as Pπ on the mth, (m + 1)th indices and as the identity on the remaining ones. Using (2.5) and (A.6), this implies for m = 1, 2, . . . , l, i1 ···il jm jm+1 i1 ···il jm jm+1 [∂ x · · · xjl − xjm · · · xjl−1 ∂ jl ] = 0 = P s,l x · · · xjl |, Pjs,l j1 ···jl ∂ 1 ···jl s,l i1 ···il jm jl Pj1 ···jl x · · · x | = 0.
Using (2.5) (as well as its analog for the ∂ˆi ), (2.9) and (2.7), it follows i0 i1 ···il , ∂ i0 Xli1 ···il | = lq2 Tl−1
xi ∂i Xli1 ···il | = lq2 Xli1 ···il ,
(A.11)
i0 i1 ···il ∂ˆi0 Xli1 ···il | = lq−2 Tl−1 ,
Xli1 ···il | = 0,
(A.12)
i0 i1 ···il xi0 Xli1 ···il = Xl+1 +
r 2 lq 2 T i0 i1 ···il . µ(l − 1 + N/2)q2 l−1
(A.13)
To prove (A.13), note that the decomposition (2.13) of the left-hand side gives (suppressing indices) xXl = Yl+1 + r2 Yl−1 , with Yj combinations of the Xj ’s. Yl−1
can be determined applying the Laplacian to both sides and, recalling (A.11) and (A.12)2 : i0 i1 ···il | 0 = xi0Xli1 ···il − r2 Yl−1 i0 i1 ···il = µ∂ i0Xli1 ···il | − µ2 (N/2)q2 + q N xi ∂i Yl−1 | i0 i1 ···il i0 i1 ···il = µlq2 Tl−1 − µ2 (N/2)q2 + q N (l − 1)q2 Yl−1 i0 i1 ···il i0 i1 ···il = µ lq2 Tl−1 − µ(l − 1 + N/2)q2 Yl−1 . Now, from (A.10), it follows Ps,l+1 Yl−1 ∝ Ps,l+1 Tl−1 = 0, whence Ps,l+1 Yl+1 = Ps,l+1 xXl = Ps,l+1 Xl+1 = Xl+1 , and we find that indeed Yl+1 = Xl+1 . Proof of Theorem 3.1 Relation (3.2) is an immediate consequence of (2.17)1 , (2.19) and (2.42). The second equality in (3.3) is immediate. As for the first, ∂ˆi0 f (r)Xli1 ···il | (A.9)
=
qµ ¯ xi0 i1 ···il (Dq−1 f |) X + f (q −1 r)∂ˆi0 Xli1 ···il | 1+q r l
(A.12−A.13)
=
q N Dq−1 f | r 2 lq 2 i0 i1 ···il i0 i1 ···il Tl−1 + µXl+1 N q(1 + q)r l−1+ 2 q2
i0 i1 ···il + f (q −1 r) lq−2 Tl−1
on one hand, and on the other v −1∂ i0 v Λ f (r)Xli1 ···il | (2.21),(A.3)
v −1∂ i0f (q −1 r)Xli1 ···il |q −(l+N )l/2
xi0 (A.8) −1 −1 µ i0 (Dq−1 f |) + f (r)∂ Xli1 ···il |q −(l+N )l/2 = v q 1+q r D −1 f | r 2 lq 2 (A.11−A.13) −1 q i0 i1 ···il i0 i1 ···il Tl−1 = v + µXl+1 N q(1 + q)r l−1+ 2 q2 i0 i1 ···il q −(l+N )l/2 + lq2 f Tl−1 =
(A.3)
=
2 2 (l+N −3)(l−1)/2 lq q (l+N −1)(l+1)/2 i0 i1 ···il r i0 i1 ···il Xl+1 + Tl−1 µq N l−1+ 2 q2 i0 i1 ···il (l+N −3)(l−1)/2 q −(l+N )l/2 + lq 2 q f Tl−1
Dq−1 f | q(1 + q)r
=
Dq−1 f | (1 + q)r
2 −2l−(N −1)/2 (N −3)/2 i0 i1 ···il rlq2 q i0 i1 ···il Tl−1 Xl+1 + µq N l−1+ 2 q2
i0 i1 ···il , + lq2 q (3−N )/2−2l f Tl−1
whence [∂ˆi0 − q
=
N +1 2
v −1∂ i0 v Λ]f (r)Xli1 ···il |
Dq−1 f | (1 + q)r
r2 l 2 (q N −1 − q 1−2l ) q i0 i1 ···il i0 i1 ···il Tl−1 + [f (q −1 r) − f (r)]lq2 q 2−2l Tl−1 N l−1+ 2 q2
Dq−1 f | 2 i0 i1 ···il i0 i1 ···il r lq2 q 1−2l (q 2 − 1)Tl−1 + (Dq−1 f |)r(q −1 − 1)lq2 q 2−2l Tl−1 (1 + q)r =0 =
leading to ∂ˆi = q (N +1)/2 v −1∂ i v Λ, equivalent to the claim (3.3). To prove (3.4) now, we just have to proceed as follows. By (2.6), d := ξ i ∂ j gij = q N ∂ i ξ j gij , whence (3.3),(3.2)
d = q N ξ j ∂ i gij (A.2)
3
1
(2.39)
1
1
=
q 2N ξ k gkl Z lj Λ−2 (−q (1−N )/2 v −1 ∂ j v Λ)
= −q 2 N − 2 ξ k gkl Λ−1 q 1−N v −1 w ∂ l w−1 v N
1
2
2
= −q 2 N + 2 ξ k gkl v q − 2 − 2 q −η ∂ l q η v −1 = −˜ v ξ k gkl ∂ l v˜−1 = −˜ v d˜ v −1 .
The proof of (3.5) is completely analogous. A.2. Proof of formulae (2.74) and (2.77) a1 b ···b ∗ θ · · · θap αθap ···a1 θbp+1 · · · θbNεb1N ···bpp+1 βbθp ···b1 αp βp | = ck q
q
= cp q
= cp
q
b ···b
αθap···a1 θbp · · · θb1 gbp ap · · · gb1 a1 θbp+1 · · · θbN εb1N ···bpp+1 βbθp ···b1 b ···b
αθap···a1 εbp ···b1 bp+1 ···bN gbp ap · · · gb1 a1 εb1N ···bpp+1 βbθp ···b1 dV
= cp q
= = =
···bN b1 ···bp εbN ···bp+1 βbθp ···b1
p 1 αθap···a1 Ua−1c · · · Ua−1c εcp+1 p ···c1 p 1
1 cN −p 1 cN −p 1 cN −p
b
113
dV
b ···b
q
q
p 1 αθap···a1 Ua−1c · · · Ua−1c Pac11 ···cpp βbθp ···b1 dV p 1
p 1 θ αθap···a1 Ua−1c · · · Ua−1c βcp ···c1 dNx p 1
αθ ap ···a1 β θ ap ···a1 dNx. q
Here U is the (diagonal, positive-definite) matrix defined in (2.46). The second equality is based on the relation [12] gi1 j1 gi2 j2 · · · giN jN εjN ···j2 j1 =: εi1 i2 ···iN = εiN ···i2 i1 . ∗ αp , ∗ βp =
(A.14)
(∗ α) ∗∗ βp q θ a ···a αap ···a1 εa1N ···app+1 θbN · · · θbp+1 gbp+1 ap+1 · · · gbN aN θb1 · · · θbp βbθp ···b1 = cp q θ a1 ···ap αap ···a1 gbp+1 ap+1 · · · gbN aN βbθp ···b1 dV = cp εaN ···ap+1 εbN ···bp+1 b1 ···bp q θ a1 ···ap bN ···bp+1 b1 ···bp αap ···a1 gbp+1 ap+1 · · · gbN aN βbθp ···b1 dV = cp εaN ···ap+1 ε q
= = = =
× gc1 d1 · · · gcp dp g d1 b1 · · · g dp bp θ a ···a αap ···a1 g d1 b1 · · · g dp bp βbθp ···b1 dV cp εa1N ···app+1 εdp ···d1 ap+1 ···aN q θ ap+1 ···aN a1 ···ap −1dp 1 αap ···a1 βbθp ···b1 dV cp εdp ···d1 εaN ···ap+1 Ubp · · · Ub−1d 1 q θ 1 a1 ···ap −1dp −1d1 αap ···a1 βbθp ···b1 dV Pad1 ···dp Ubp · · · Ub1 cN −p q θ 1 −1dp −1d1 αdp ···d1 βbθp ···b1 dV = rhs (2.77). Ubp · · · Ub1 cN −p q
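A.3. Studying the positivity relation (5.27)

According to odd or even $p = N/\gamma$, the matrix elements of $M(a)$ will take the two different forms
$$M_j^{j'} = \frac{\pi^2}{8h}\int_{-\infty}^{\infty} d\omega\int_{-\infty}^{\infty} d\omega'\int_{-\infty}^{\infty} dy\,\check m(y)\,\frac{e^{i\frac{\pi}{2}(\omega'-\omega)y+i\pi(\omega j-\omega'j')-\frac{a\pi^2}{h}\omega^2+\frac{N}{2}h(j+j'+2\beta)}}{C\!\big(\frac{\pi^2}{\gamma h}\omega\big)\,C\!\big(\frac{\pi^2}{h\gamma}\omega'\big)},$$
$$C(\omega) := \begin{cases}\cosh(\omega) & \text{if } p := N/\gamma \text{ is odd},\\ \sinh(\omega) & \text{if } p := N/\gamma \text{ is even}.\end{cases} \tag{A.15}$$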
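To obtain the previous formula from (5.19), we have also performed the change of integration variables $y \to h(y/2+\beta)$, $\omega \to \pi\omega/h$, $\omega' \to \pi\omega'/h$ and set $\check m(y) := \tilde m(hy/2+h\beta)$, whence it follows for any $k \in \mathbb{Z}$,
$$\check m(y+2k) = \check m(y), \qquad \check m(-y) = \check m(y),$$
so that
$$\check m(y) = \sum_{k=-\infty}^{\infty} m_k\,e^{ik\pi y}, \qquad\text{with } m_{-k} = m_k = m_k^{\star}.$$
We also define
$$\check\phi(\omega) := \sum_{j\in\mathbb{Z}} e^{-i\pi\omega j+\frac{N}{2}h(j+\beta)}\,R_\phi^j \qquad\Rightarrow\qquad \check\phi(\omega+2k) = \check\phi(\omega), \quad \forall k\in\mathbb{Z}.$$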
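Replacing in (5.20) (with $\psi = \phi$), we find
$$\begin{aligned}
\langle\phi,\,q^{a\eta^2}\phi\rangle &= \frac{\pi^2}{8h}\int_{-\infty}^{\infty} d\omega\int_{-\infty}^{\infty} d\omega'\int_{-\infty}^{\infty} dy\,\check m(y)\,\frac{e^{i\frac{\pi}{2}(\omega'-\omega)y-\frac{a\pi^2}{h}\omega^2}\,[\check\phi(\omega)]^{\star}\,\check\phi(\omega')}{C\!\big(\frac{\pi^2}{h\gamma}\omega\big)\,C\!\big(\frac{\pi^2}{h\gamma}\omega'\big)} \\
&= \frac{\pi^2}{2h}\int_{-\infty}^{\infty} d\omega\int_{-\infty}^{\infty} d\omega'\sum_{k=-\infty}^{\infty} m_k\,\delta(\omega-\omega'+2k)\,\frac{e^{-\frac{a\pi^2}{h}\omega^2}\,[\check\phi(\omega)]^{\star}\,\check\phi(\omega')}{C\!\big(\frac{\pi^2}{h\gamma}\omega\big)\,C\!\big(\frac{\pi^2}{h\gamma}\omega'\big)} \\
&= \frac{\pi^2}{2h}\sum_{k=-\infty}^{\infty} m_k\int_{-\infty}^{\infty} d\omega\,\frac{e^{-\frac{a\pi^2}{h}\omega^2}\,|\check\phi(\omega)|^2}{C\!\big(\frac{\pi^2}{h\gamma}(\omega+2k)\big)\,C\!\big(\frac{\pi^2}{h\gamma}\omega\big)} \\
&= \frac{\pi^2}{4h}\int_{-1}^{1} dy\,\check m(y)\int_{-\infty}^{\infty} d\omega\sum_{k=-\infty}^{\infty}\frac{e^{-\frac{a\pi^2}{h}\omega^2}\cos(k\pi y)\,|\check\phi(\omega)|^2}{C\!\big(\frac{\pi^2}{h\gamma}\omega\big)\,C\!\big(\frac{\pi^2}{h\gamma}(\omega+2k)\big)} \\
&= \frac{\pi^2}{4h}\int_{-1}^{1} d\omega\,|\check\phi(\omega)|^2\int_{-1}^{1} dy\,\check m(y)\,K\Big(\omega,y,\frac{\pi^2}{h\gamma},\frac{a\pi^2}{h}\Big).
\end{aligned} \tag{A.16}$$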
∞
2
Thus, φ, q aη φ will be positive for any φ if 1 π 2 aπ 2 , dy m(y) ˇ K ω, y, >0 hγ h −1
∀ω ∈ ] − 1, 1],
(A.17)
where ∞
2 ∞ e−t(ω+2l) cos(kπy) . K(ω, y, δ, t) := C[δ(ω + 2l)] C[δ(ω + 2(k + l)] l=−∞ k=−∞ ∞ The weight characterizing the Jackson integral, m(y) ˇ ∼ l=−∞ δ(y − 2l) certainly fulfills (A.17) for any choice of a, h, γ because the integral appearing there reduces to K (y = 0) which is manifestly positive. In fact, by continuity, K will
remain positive at least in a neighborhood of $y = 0$, so that (A.17) will be fulfilled also by weights $\check m(y)$ non-vanishing on some suitable interval including $y = 0$. A more detailed characterization of weights $\check m(y)$ and parameters $a, h, \gamma$ such that (A.17) is fulfilled is left as a possible subject for future work. If $K$ were strictly positive for all $y$, all weights $\check m(y)$ would do the job. Note also that for $h \to 0$, one finds $\sqrt{a\pi/h}\;e^{-\frac{a\pi^2}{h}\omega^2} \sim \delta(\omega)$ and also $1/C(k\pi^2/\gamma h) \to \delta_{k0}$ and, therefore,
$$\langle\phi,\,q^{a\eta^2}\phi\rangle \sim m_0\,\frac{\pi^{3/2}}{2\sqrt{ah}}\,|\check\phi(0)|^2 \ge 0$$
which is non-negative for any $\phi$ and any choice of $\check m(y)$.

References

[1] U. Carow-Watamura, M. Schlieker and S. Watamura, $SO_q(N)$ covariant differential calculus on quantum space and quantum deformation of Schrödinger equation, Z. Physik C 49 (1991) 439.
[2] B. L. Cerchiai, G. Fiore and J. Madore, Geometrical tools for quantum Euclidean spaces, Commun. Math. Phys. 217 (2001) 521–554; math.QA/0002007.
[3] C.-S. Chu and B. Zumino, Realization of vector fields for quantum groups as pseudodifferential operators on quantum spaces, in Proc. 20th Int. Conf. Group Theory Methods in Physics, Toyonaka, Japan (1995); q-alg/9502005.
[4] A. Connes, Non-commutative differential geometry, Publ. I.H.E.S. 62 (1986) 257; Noncommutative Geometry (Academic Press, 1994).
[5] A. Connes and M. Dubois-Violette, Noncommutative finite-dimensional manifolds. I. Spherical manifolds and related examples, Commun. Math. Phys. 230 (2002) 539–579; math.QA/0107070.
[6] A. Connes and G. Landi, Noncommutative manifolds, the instanton algebra and isospectral deformations, Commun. Math. Phys. 221 (2001) 141–159.
[7] A. Dimakis and J. Madore, Differential calculi and linear connections, J. Math. Phys. 37(9) (1996) 4647–4661.
[8] V. G. Drinfel'd, Hopf algebras and the quantum Yang–Baxter equation, Dokl. Akad. Nauk SSSR 283 (1985) 1060–1064; translated in English in J. Sov. Math. 32 (1985) 254–258; Quantum groups, in Proc. I.C.M., Berkeley (1986), pp. 798–820; ibid., J. Sov. Math. 41 (1988) 898–915.
[9] V. G. Drinfeld, Quasi Hopf algebras, Leningrad Math. J. 1 (1990) 1419.
[10] L. D. Faddeev, N. Y. Reshetikhin and L. Takhtadjan, Quantization of Lie groups and Lie algebras, Algebra i Analiz 1 (1989) 178–206; translated from the Russian in Leningrad Math. J. 1 (1990) 193–225.
[11] G. Fiore, The $SO_q(N,\mathbb{R})$-symmetric harmonic oscillator on the quantum Euclidean space $\mathbb{R}_q^N$ and its Hilbert space structure, Int. J. Mod. Phys. A 8 (1993) 4679–4729.
[12] G. Fiore, Quantum groups $SO_q(N)$, $Sp_q(n)$ have q-determinants, too, J. Phys. A 27 (1994) 3795.
[13] G. Fiore, The Euclidean Hopf algebra $U_q(e^N)$ and its fundamental Hilbert space representations, J. Math. Phys. 36 (1995) 4363–4405; hep-th/9407195.
[14] G. Fiore, q-Euclidean covariant quantum mechanics on $\mathbb{R}_q^N$: Isotropic harmonic oscillator and free particle, PhD thesis, SISSA-ISAS (May 1994).
[15] G. Fiore, Realization of $U_q(so(N))$ within the differential algebra on $\mathbb{R}_q^N$, Commun. Math. Phys. 169 (1995) 475–500.
[16] G. Fiore, Quantum group covariant (anti)symmetrizers, ε-tensors, vielbein, Hodge map and Laplacian, J. Phys. A 37 (2004) 9175–9193; math.QA/0405096.
[17] G. Fiore and J. Madore, The geometry of the quantum Euclidean space, J. Geom. Phys. 33 (2000) 257–287; math/9904027.
[18] V. Gayral, B. Iochum and J. C. Várilly, Dixmier traces on noncompact isospectral deformations, hep-th/0507206.
[19] V. Gayral, J. M. Gracia-Bondía, B. Iochum, T. Schücker and J. C. Várilly, Moyal planes are spectral triples, Commun. Math. Phys. 246 (2004) 569–623; hep-th/0307241.
[20] G. Gasper and M. Rahman, Basic Hypergeometric Series, Encyclopedia of Mathematics and its Applications, Vol. 35 (Cambridge University Press, 1990).
[21] J. M. Gracia-Bondía, F. Lizzi, G. Marmo and P. Vitale, Infinitely many star products to play with, J. High Energy Phys. 2(4) (2002) 026; hep-th/0112092.
[22] A. Hebecker and W. Weich, Free particle in q deformed configuration space, Lett. Math. Phys. 26 (1992) 245–258.
[23] A. Klimyk and K. Schmüdgen, Quantum Groups and Their Representations (Springer, 1997).
[24] A. Kempf and S. Majid, Algebraic q-integration and Fourier theory on quantum and braided spaces, J. Math. Phys. 35 (1994) 6802–6837.
[25] S. Majid, q-Epsilon tensor for quantum and braided spaces, J. Math. Phys. 34 (1995) 2045–2058.
[26] S. Majid, Braided momentum structure of the q-Poincaré group, J. Math. Phys. 36 (1995) 1991–2007.
[27] S. Majid, Foundations of Quantum Groups (Cambridge University Press, 1995).
[28] Yu. I. Manin, Some remarks on Koszul algebras and quantum groups, Ann. Inst. Fourier (Grenoble) 27 (1987) 191–205; Quantum groups and noncommutative geometry, preprint CRM-1561 (Montreal, 1988); Topics in Noncommutative Geometry (Princeton University Press, 1991).
[29] T. Masuda, Y. Nakagami and S. L. Woronowicz, A $C^*$-algebraic framework for quantum groups, to appear in Int. J. Math.; math.QA/0309338.
[30] O. Ogievetsky, Differential operators on quantum spaces for $GL_q(n)$ and $SO_q(n)$, Lett. Math. Phys. 24 (1992) 245.
[31] O. Ogievetsky and B. Zumino, Reality in the differential calculus on the q-Euclidean spaces, Lett. Math. Phys. 25 (1992) 121–130.
[32] O. Ogievetsky, W. B. Schmidke, J. Wess and B. Zumino, q-Deformed Poincaré algebra, Commun. Math. Phys. 150 (1992) 495–518.
[33] N. Y. Reshetikhin and V. G. Turaev, Ribbon graphs and their invariants derived from quantum groups, Commun. Math. Phys. 127 (1990) 1–26.
[34] P. Schupp, P. Watts and B. Zumino, Differential geometry on linear quantum groups, Lett. Math. Phys. 25 (1992) 139.
[35] P. Schupp, P. Watts and B. Zumino, Bicovariant quantum algebras and quantum Lie algebras, Commun. Math. Phys. 157 (1993) 305.
[36] H. Steinacker, Integration on quantum Euclidean space and sphere in N dimensions, J. Math. Phys. 37 (1996) 4738.
[37] W. D. van Suijlekom, The noncommutative Lorentzian cylinder as an isospectral deformation, J. Math. Phys. 45 (2004) 537–556; math-ph/0310009.
[38] W. Weich, The Hilbert space representations for $SO_q(3)$-symmetric quantum mechanics, LMU-TPW 1994-5; hep-th/9404029.
[39] J. Wess, q-Deformed Heisenberg algebras, Lectures given at the Internationale Universitaetswochen fuer Kern- und Teilchenphysik, Schladming, Austria (January 1999); math-ph/9910013.
[40] S. L. Woronowicz, Twisted SU(2) group. An example of noncommutative differential calculus, Publ. Res. Inst. Math. Sci. 23 (1987) 117–181; Differential calculus on compact matrix pseudogroups (quantum groups), Commun. Math. Phys. 122 (1989) 125–170.
[41] B. Zumino, Introduction to the differential geometry of quantum groups, in ed. K. Schmüdgen, Mathematical Physics X (Springer-Verlag, 1992).
[42] B. Zumino, Differential calculus on quantum spaces and quantum groups, in Proc. 19th ICGTMP Conf. eds. M. O., M. S. and J. M. G., Vol. 1 (CIEMAT/RSEF, Madrid, 1993).
May 2, 2006 15:57 WSPC/148-RMP
J070-00260
Reviews in Mathematical Physics Vol. 18, No. 2 (2006) 119–162 © World Scientific Publishing Company
ENERGY EXPANSION AND VORTEX LOCATION FOR A TWO-DIMENSIONAL ROTATING BOSE–EINSTEIN CONDENSATE
RADU IGNAT Laboratoire J.L. Lions, Université Pierre et Marie Curie, B.C. 187, 4 Place Jussieu, 75252 Paris Cedex 05, France
[email protected] VINCENT MILLOT Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
[email protected] Received 26 June 2005 Revised 16 February 2006
We continue the analysis started in [14] on a model describing a two-dimensional rotating Bose–Einstein condensate. This model consists in minimizing, under the unit mass constraint, a Gross–Pitaevskii energy defined in $\mathbb{R}^2$. In this contribution, we estimate the critical rotational speeds $\Omega_d$ for having exactly $d$ vortices in the bulk of the condensate and we determine their topological charge and their precise location. Our approach relies on asymptotic energy expansion techniques developed by Serfaty [20–22] for the Ginzburg–Landau energy of superconductivity in the high $\kappa$ limit.
Keywords: Bose–Einstein condensate; renormalized energy; vortices.
Mathematics Subject Classification 2000: 35A15, 35Q55
1. Introduction
Since its first experimental achievement in dilute alkali gases, the phenomenon of Bose–Einstein condensation has given rise to a very active area of research in condensed matter physics. A Bose–Einstein condensate (BEC) is a quantum object in which every atom is in the lowest quantum state, so that it can be described by a single wave function. One of the most interesting features of these systems is their superfluid behavior (see [10]): above some critical velocity, a BEC rotates through the existence of vortices, i.e. zeroes of the wave function around which there is a circulation of phase. When the angular speed gets larger, the number of vortices increases and they arrange themselves in a regular pattern around the center of the
condensate. This has been observed experimentally by the ENS group [16, 17] and by the MIT group [1]. We consider here a two-dimensional model describing a condensate placed in a trap that strongly confines the atoms in the direction of the rotation axis (see [10, 11]). In the non-dimensionalized form (see [2, 14]), the wave function minimizes the Gross–Pitaevskii (GP) energy
\[
F_\varepsilon(u) = \int_{\mathbb{R}^2} \Big\{ \frac{1}{2}|\nabla u|^2 + \frac{1}{4\varepsilon^2}\big[(|u|^2 - a(x))^2 - (a^-(x))^2\big] - \Omega\, x^\perp \cdot (iu, \nabla u) \Big\}\, dx \tag{1.1}
\]
under the constraint
\[
\int_{\mathbb{R}^2} |u|^2 = 1, \tag{1.2}
\]
where $\varepsilon > 0$ is small and describes the ratio of two characteristic lengths and $\Omega = \Omega(\varepsilon) \geq 0$ is the angular velocity. The function $a(x)$ in (1.1) comes from the existence of a potential trapping the atoms, and is normalized such that $\int_{\mathbb{R}^2} a^+(x) = 1$. We will restrict our attention to the specific case of a harmonic trapping, that is
\[
a(x) = a_0 - x_1^2 - \Lambda^2 x_2^2 \quad \text{with} \quad a_0 = \sqrt{2\Lambda/\pi}
\]
for some constant $\Lambda \in (0, 1]$, which corresponds to actual experiments (see [16, 17]). Our goal is to compute an asymptotic expansion of the energy $F_\varepsilon(u_\varepsilon)$ and to determine the number and the location of vortices according to the value of the angular speed $\Omega(\varepsilon)$ in the limit $\varepsilon \to 0$. More precisely, we want to estimate the critical velocity $\Omega_d$ for which the $d$th vortex becomes energetically favorable and to derive a reduced energy governing the location of the vortices (the so-called "renormalized energy" by analogy with [8, 20, 21]). We have started in [14] the analysis of minimizers $u_\varepsilon$ of the functional $F_\varepsilon$ under the constraint (1.2) and we have already determined the critical rotational speed
\[
\Omega_1 = \frac{\sqrt{\pi}\,(1+\Lambda^2)}{\sqrt{2\Lambda}}\, |\ln \varepsilon|
\]
of nucleation of the first vortex inside the domain
\[
D = \{x \in \mathbb{R}^2 : a(x) > 0\}.
\]
In the physical context, the set $D$ represents the region occupied by the condensate since in the limit $\varepsilon \to 0$, the minimization of $F_\varepsilon$ forces $|u_\varepsilon|^2$ to be close to the function $a^+(x)$ ($F_\varepsilon(u_\varepsilon)$ remaining small in front of $1/\varepsilon^2$). We proved that for subcritical velocities $\Omega \leq \Omega_1 - \delta \ln|\ln \varepsilon|$ with $-\delta < \omega_1 < 0$ for some constant $\omega_1$, there are no vortices in the region $D$ and $u_\varepsilon$ behaves as the vortex-free profile $\tilde\eta_\varepsilon e^{i\Omega S}$ where the phase function $S : \mathbb{R}^2 \to \mathbb{R}$ is given by
\[
S(x) = \frac{\Lambda^2 - 1}{\Lambda^2 + 1}\, x_1 x_2 \tag{1.3}
\]
and $\tilde\eta_\varepsilon$ is the (unique) positive solution of the minimization problem
\[
\operatorname{Min}\big\{ E_\varepsilon(u) : u \in \mathcal{H},\ \|u\|_{L^2(\mathbb{R}^2)} = 1 \big\} \tag{1.4}
\]
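with
\[
E_\varepsilon(u) = \int_{\mathbb{R}^2} \frac{1}{2}|\nabla u|^2 + \frac{1}{4\varepsilon^2}\big[(|u|^2 - a(x))^2 - (a^-(x))^2\big]
\quad \text{and} \quad
\mathcal{H} = \Big\{ u \in H^1(\mathbb{R}^2, \mathbb{C}) : \int_{\mathbb{R}^2} |x|^2 |u|^2 < \infty \Big\}.
\]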
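The normalization of the trap can be checked numerically. The following is a minimal sketch, not part of the paper's argument: it assumes the harmonic profile $a(x) = a_0 - x_1^2 - \Lambda^2 x_2^2$ with $a_0 = \sqrt{2\Lambda/\pi}$, approximates $\int_{\mathbb{R}^2} a^+$ by a midpoint rule, and evaluates the first critical speed $\Omega_1 = \frac{1+\Lambda^2}{a_0}|\ln\varepsilon|$. The function names are hypothetical and chosen only for illustration.

```python
import math


def a0_from_lambda(lam: float) -> float:
    """Trap height a0 = sqrt(2*Lambda/pi), fixed by the normalization of a^+."""
    return math.sqrt(2.0 * lam / math.pi)


def integral_a_plus(lam: float, n: int = 400) -> float:
    """Midpoint-rule approximation of the integral of a^+ over R^2."""
    a0 = a0_from_lambda(lam)
    # a(x) > 0 exactly on the ellipse x1^2 + lam^2 * x2^2 < a0.
    x1_max = math.sqrt(a0)
    x2_max = math.sqrt(a0) / lam
    h1 = 2.0 * x1_max / n
    h2 = 2.0 * x2_max / n
    total = 0.0
    for i in range(n):
        x1 = -x1_max + (i + 0.5) * h1
        for j in range(n):
            x2 = -x2_max + (j + 0.5) * h2
            total += max(a0 - x1 * x1 - lam * lam * x2 * x2, 0.0)
    return total * h1 * h2


def omega_1(lam: float, eps: float) -> float:
    """Leading-order first critical speed (1 + Lambda^2) / a0 * |ln eps|."""
    return (1.0 + lam * lam) / a0_from_lambda(lam) * abs(math.log(eps))


if __name__ == "__main__":
    for lam in (1.0, 0.5):
        print(lam, integral_a_plus(lam), omega_1(lam, 1e-3))
```

The quadrature is deliberately naive; it only serves to confirm that the stated value of $a_0$ is the one compatible with the unit-mass normalization.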
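In this contribution, which constitutes the sequel of [14], we push forward the study of minimizers $u_\varepsilon$. First, we prove the following estimate on the critical speed $\Omega_d$ for any integer $d \geq 1$ in the asymptotic $\varepsilon \to 0$,
\[
\Omega_d = \frac{1+\Lambda^2}{a_0}\big(|\ln \varepsilon| + (d-1)\ln|\ln \varepsilon|\big) = \frac{\sqrt{\pi}\,(1+\Lambda^2)}{\sqrt{2\Lambda}}\big(|\ln \varepsilon| + (d-1)\ln|\ln \varepsilon|\big).
\]
Then, we show that for velocities ranging between $\Omega_d$ and $\Omega_{d+1}$, any minimizer has exactly $d$ vortices of degree $+1$ inside $D$. Establishing an asymptotic expansion of $F_\varepsilon(u_\varepsilon)$ as $\varepsilon \to 0$, we derive the distribution of vortices within $D$ as a minimizing configuration of the reduced energy given by (1.5) below. We also improve the result stated in [14] for the non-existence of vortices in the subcritical case by showing that the best constant is $\omega_1 = 0$, that is subcritical velocities go up to $\Omega_1 - \delta \ln|\ln \varepsilon|$ for any $\delta > 0$. Our main theorem can be stated as follows:

Theorem 1.1. Let $u_\varepsilon$ be any minimizer of $F_\varepsilon$ in $\mathcal{H}$ under the constraint (1.2) and let $0 < \delta \ll 1$ be any small constant.
(i) If $\Omega \leq \Omega_1 - \delta \ln|\ln \varepsilon|$, then for any $R_0 < \sqrt{a_0}$, there exists $\varepsilon_0 = \varepsilon_0(R_0, \delta) > 0$ such that for any $\varepsilon < \varepsilon_0$, $u_\varepsilon$ is vortex free in $B^\Lambda_{R_0} = \{x \in \mathbb{R}^2 : |x|^2_\Lambda = x_1^2 + \Lambda^2 x_2^2 < R_0^2\}$, i.e. $u_\varepsilon$ does not vanish in $B^\Lambda_{R_0}$. In addition,
\[
F_\varepsilon(u_\varepsilon) = F_\varepsilon(\tilde\eta_\varepsilon e^{i\Omega S}) + o(1).
\]
(ii) If $\Omega_d + \delta \ln|\ln \varepsilon| \leq \Omega \leq \Omega_{d+1} - \delta \ln|\ln \varepsilon|$ for some integer $d \geq 1$, then for any $R_0 < \sqrt{a_0}$, there exists $\varepsilon_1 = \varepsilon_1(R_0, d, \delta) > 0$ such that for any $\varepsilon < \varepsilon_1$, $u_\varepsilon$ has exactly $d$ vortices $x_1^\varepsilon, \ldots, x_d^\varepsilon$ of degree one in $B^\Lambda_{R_0}$. Moreover,
\[
|x_j^\varepsilon| \leq C\, \Omega^{-1/2} \ \text{ for any } j = 1, \ldots, d, \quad \text{and} \quad |x_i^\varepsilon - x_j^\varepsilon| \geq C\, \Omega^{-1/2} \ \text{ for any } i \neq j,
\]
where $C > 0$ denotes a constant independent of $\varepsilon$. Setting $\tilde x_j^\varepsilon = \sqrt{\Omega}\, x_j^\varepsilon$, the configuration $(\tilde x_1^\varepsilon, \ldots, \tilde x_d^\varepsilon)$ tends to minimize, as $\varepsilon \to 0$, the renormalized energy
\[
w(b_1, \ldots, b_d) = -\pi a_0 \sum_{i \neq j} \ln|b_i - b_j| + \frac{\pi a_0}{1+\Lambda^2} \sum_{j=1}^{d} |b_j|^2_\Lambda. \tag{1.5}
\]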
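In addition,
\[
F_\varepsilon(u_\varepsilon) = F_\varepsilon(\tilde\eta_\varepsilon e^{i\Omega S}) - \frac{\pi a_0^2\, d}{1+\Lambda^2}\,(\Omega - \Omega_1) + \frac{\pi a_0}{2}\,(d^2 - d)\ln|\ln \varepsilon| + \operatorname*{Min}_{b \in \mathbb{R}^{2d}} w(b) + Q_{d,\Lambda} + o(1), \tag{1.6}
\]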
where $Q_{d,\Lambda}$ is a constant depending only on $d$ and $\Lambda$.

These results are in agreement with the study made by Castin and Dum [11] who have looked for minimizers in a reduced class of functions. More precisely, we find the same critical angular velocities $\Omega_d$ as well as a distribution of vortices around the origin at a scale $\Omega^{-1/2}$. The minimizing configurations for the renormalized energy $w(\cdot)$ have been studied in the radial case $\Lambda = 1$ by Gueron and Shafrir in [12]. They prove that for $d \leq 6$, regular polygons centered at the origin and stars are local minimizers. For larger $d$, they numerically found minimizers with a shape of concentric polygons and then triangular lattices as $d$ increases. These figures are exactly the ones observed in physical experiments (see [16, 17]).

Our approach, suggested in [2] by Aftalion and Du, strongly relies on techniques developed by Serfaty [20–22] for the Ginzburg–Landau (GL) energy of superconductivity in the high $\kappa$ limit. We point out that Serfaty has already applied the method to a simplified GP energy (the study is made in a ball instead of $\mathbb{R}^2$ with $a(x) \equiv 1$ and the minimization is performed without mass constraint) and has obtained in [23] a result analogous to Theorem 1.1 which shows that the simple model captures the main features of the full model concerning vortices. We emphasize once more that we treat here the exact physical model without any simplifying assumptions. The outline of our proof follows Serfaty's method but many technical difficulties arise from the specificities of the problem such as the unit mass constraint or the degenerate behavior of the function $a(x)$ near the boundary of $D$. As we shall see, a very delicate analysis is required, so we sometimes prefer to write all the details even if some proofs closely follow those of other authors. More precisely, we also make use of the following results on the GL functional [3–5, 9, 15, 18, 19, 24], starting from the pioneering work of Bethuel, Brezis and Hélein [8].
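To make the critical speeds $\Omega_d$ and the renormalized energy (1.5) concrete, here is a small numerical sketch. It is only an illustration under stated assumptions: the sum in (1.5) is taken over ordered pairs $i \neq j$ (the convention that matches the $\frac{\pi a_0}{2}(d^2-d)\ln|\ln\varepsilon|$ term in (1.6)), the search is restricted to regular $d$-gons centered at the origin, and all function names are hypothetical. Within that restricted class and for $\Lambda = 1$, differentiating in the radius $r$ gives the optimal value $r = \sqrt{d-1}$, which the code recovers by a ternary search.

```python
import math


def a0_from_lambda(lam: float) -> float:
    """Trap height a0 = sqrt(2*Lambda/pi)."""
    return math.sqrt(2.0 * lam / math.pi)


def omega_d(d: int, lam: float, eps: float) -> float:
    """Leading-order critical speed for the d-th vortex (formula of the Introduction)."""
    log_eps = abs(math.log(eps))
    return (1.0 + lam * lam) / a0_from_lambda(lam) * (log_eps + (d - 1) * math.log(log_eps))


def renormalized_energy(points, lam: float = 1.0) -> float:
    """w(b_1, ..., b_d) as in (1.5); the log sum runs over ordered pairs i != j (assumption)."""
    a0 = a0_from_lambda(lam)
    interaction = 0.0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            if i != j:
                interaction += math.log(math.hypot(xi - xj, yi - yj))
    confinement = sum(x * x + lam * lam * y * y for x, y in points)
    return -math.pi * a0 * interaction + math.pi * a0 / (1.0 + lam * lam) * confinement


def regular_polygon(d: int, radius: float):
    """Vertices of a regular d-gon centered at the origin."""
    return [
        (radius * math.cos(2.0 * math.pi * k / d), radius * math.sin(2.0 * math.pi * k / d))
        for k in range(d)
    ]


def best_polygon_radius(d: int, lam: float = 1.0, lo: float = 0.1, hi: float = 5.0,
                        iters: int = 200) -> float:
    """Ternary search for the radius minimizing w over regular d-gons (convex in r)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if renormalized_energy(regular_polygon(d, m1), lam) < renormalized_energy(
            regular_polygon(d, m2), lam
        ):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for d in (2, 3, 5):
        print(d, best_polygon_radius(d), math.sqrt(d - 1))
```

This does not test whether polygons are global minimizers of $w$; it only locates the best radius inside the polygon family, which is the simplest configuration mentioned in the discussion of [12].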
We finally refer to our first part [14] for additional references on mathematical studies of vortices in BECs.

For the convenience of the reader, we recall now some results already established in [14]. First, we have proved the existence and smoothness of any minimizer $u_\varepsilon$ of $F_\varepsilon$ under the constraint (1.2) in the regime
\[
\Omega \leq \frac{1+\Lambda^2}{a_0}\big(|\ln \varepsilon| + \omega_1 \ln|\ln \varepsilon|\big) \tag{1.7}
\]
for a constant $\omega_1 \in \mathbb{R}$, as well as some qualitative properties: $E_\varepsilon(u_\varepsilon) \leq C|\ln \varepsilon|^2$, $|u_\varepsilon| \lesssim \sqrt{a^+}$ in any compact $K \subset D$ and $|u_\varepsilon|$ decreases exponentially fast to $0$ outside $D$. We have also shown the existence and uniqueness of the positive minimizer $\tilde\eta_\varepsilon$ of $E_\varepsilon$ under the mass constraint (1.2) for every $\varepsilon > 0$. Concerning the Lagrange
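multiplier $k_\varepsilon \in \mathbb{R}$ associated to $\tilde\eta_\varepsilon$ and the qualitative properties of $\tilde\eta_\varepsilon$, we have obtained:
\[
|k_\varepsilon| \leq C|\ln \varepsilon|, \tag{1.8}
\]
$E_\varepsilon(\tilde\eta_\varepsilon) \leq C|\ln \varepsilon|$ for $\varepsilon$ small and $\tilde\eta_\varepsilon \to \sqrt{a^+}$ in $L^\infty(\mathbb{R}^2) \cap C^1_{\mathrm{loc}}(D)$ as $\varepsilon \to 0$. Using a splitting technique introduced by Lassoued and Mironescu [15], we were able to decouple into two independent parts the energy $F_\varepsilon(u)$ for any $u \in \mathcal{H}$. The first part corresponds to the energy of the vortex-free profile $\tilde\eta_\varepsilon e^{i\Omega S}$ and the second part to a reduced energy of $v = u/(\tilde\eta_\varepsilon e^{i\Omega S})$, i.e.
\[
F_\varepsilon(u) = F_\varepsilon(\tilde\eta_\varepsilon e^{i\Omega S}) + \tilde F_\varepsilon(v) + \tilde T_\varepsilon(v), \tag{1.9}
\]
where the functionals $\tilde F_\varepsilon$ and $\tilde T_\varepsilon$ are defined by $\tilde F_\varepsilon(v) = \tilde E_\varepsilon(v) + \tilde R_\varepsilon(v)$,
\[
\tilde E_\varepsilon(v) = \int_{\mathbb{R}^2} \frac{\tilde\eta_\varepsilon^2}{2}|\nabla v|^2 + \frac{\tilde\eta_\varepsilon^4}{4\varepsilon^2}(|v|^2 - 1)^2, \tag{1.10}
\]
\[
\tilde R_\varepsilon(v) = \frac{\Omega}{1+\Lambda^2} \int_{\mathbb{R}^2} \tilde\eta_\varepsilon^2\, \nabla^\perp a \cdot (iv, \nabla v), \tag{1.11}
\]
\[
\tilde T_\varepsilon(v) = \frac{1}{2} \int_{\mathbb{R}^2} \big(\Omega^2 |\nabla S|^2 - 2\Omega^2 x^\perp \cdot \nabla S + k_\varepsilon\big)\, \tilde\eta_\varepsilon^2\, (|v|^2 - 1). \tag{1.12}
\]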
(1.12)
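Since the function $\tilde\eta_\varepsilon$ does not vanish, the vortex structure of any minimizer $u_\varepsilon$ can be studied via the map
\[
v_\varepsilon = u_\varepsilon / (\tilde\eta_\varepsilon e^{i\Omega S}),
\]
applying the Ginzburg–Landau techniques to the weighted energy $\tilde E_\varepsilon(v_\varepsilon)$. It is intuitively clear that difficulties will arise in the region where $\tilde\eta_\varepsilon$ is small and we will require the following properties of $v_\varepsilon$ inherited from $u_\varepsilon$ and $\tilde\eta_\varepsilon$: $\tilde E_\varepsilon(v_\varepsilon) \leq C|\ln \varepsilon|^2$, $|\tilde R_\varepsilon(v_\varepsilon)| \leq C|\ln \varepsilon|^2$, $\tilde T_\varepsilon(v_\varepsilon) \leq o(1)$, $|\nabla v_\varepsilon| \leq C_K \varepsilon^{-1}$ and $|v_\varepsilon| \lesssim 1$ in any compact $K \subset D$.

In the sequel, it will be more convenient to replace in the different functionals the function $\tilde\eta_\varepsilon^2$ by its limit $a^+(x)$. We denote by $\mathcal{F}_\varepsilon$, $\mathcal{E}_\varepsilon$ and $\mathcal{R}_\varepsilon$ the corresponding functionals (see notations below). In the regime (1.7), we have computed in [14] some fundamental bounds for the energy of $v_\varepsilon$ in a domain slightly smaller than $D$:
\[
\mathcal{F}_\varepsilon(v_\varepsilon, D_\varepsilon) \leq o(1), \tag{1.13}
\]
\[
\mathcal{E}_\varepsilon(v_\varepsilon, D_\varepsilon) \leq C_{\omega_1} |\ln \varepsilon|, \tag{1.14}
\]
\[
\mathcal{E}_\varepsilon\big(v_\varepsilon, D_\varepsilon \setminus \{|x|_\Lambda < 2|\ln \varepsilon|^{-1/6}\}\big) \leq C_{\omega_1} \ln|\ln \varepsilon|, \tag{1.15}
\]
where
\[
D_\varepsilon = \{x \in D : a(x) > \nu_\varepsilon |\ln \varepsilon|^{-3/2}\} \tag{1.16}
\]
and $\nu_\varepsilon$ is a chosen parameter in the interval $(1, 2)$ (see Proposition 2.13). These estimates represent the starting point of our analysis here.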
The plan of the paper is as follows. In Sec. 2, we prove that the subset of $D$ where $|v_\varepsilon|$ is smaller than $1/2$ can be covered by a family of disjoint discs such that each radius vanishes as $\varepsilon \to 0$, the cardinality of this family is uniformly bounded with respect to $\varepsilon$ and $v_\varepsilon$ has a non-vanishing degree around each disc of the family. We will call such a collection of discs a fine structure of vortices and a vortex one of these discs (identified with their center). In Sec. 3, we establish various lower energy estimates, namely inside a vortex and away from the vortices. In Sec. 4, we prove Theorem 1.1 matching the lower energy estimates with upper estimates coming from the construction of trial functions. These constructions are presented in Sec. 5 which can be read independently from the rest of the paper. Finally, we prove in the Appendix an auxiliary result that we shall use in the proof of Theorem 1.1.

Notations. Throughout the paper, we denote by $C$ a positive constant independent of $\varepsilon$ and we use the subscript to point out a possible dependence on the argument. For $x = (x_1, x_2) \in \mathbb{R}^2$, we write
\[
|x|_\Lambda = \sqrt{x_1^2 + \Lambda^2 x_2^2} \quad \text{and} \quad B^\Lambda_R = \{x \in \mathbb{R}^2,\ |x|_\Lambda < R\}
\]
and for $A \subset \mathbb{R}^2$,
\[
\begin{aligned}
\tilde E_\varepsilon(v, A) &= \int_A \frac{1}{2}\tilde\eta_\varepsilon^2 |\nabla v|^2 + \frac{\tilde\eta_\varepsilon^4}{4\varepsilon^2}(1 - |v|^2)^2, &
\mathcal{E}_\varepsilon(v, A) &= \int_A \frac{1}{2} a |\nabla v|^2 + \frac{a^2}{4\varepsilon^2}(1 - |v|^2)^2, \\
\tilde R_\varepsilon(v, A) &= \frac{\Omega}{1+\Lambda^2} \int_A \tilde\eta_\varepsilon^2\, \nabla^\perp a \cdot (iv, \nabla v), &
\mathcal{R}_\varepsilon(v, A) &= \frac{\Omega}{1+\Lambda^2} \int_A a\, \nabla^\perp a \cdot (iv, \nabla v), \\
\tilde F_\varepsilon(v, A) &= \tilde E_\varepsilon(v, A) + \tilde R_\varepsilon(v, A), &
\mathcal{F}_\varepsilon(v, A) &= \mathcal{E}_\varepsilon(v, A) + \mathcal{R}_\varepsilon(v, A).
\end{aligned} \tag{1.17}
\]
We do not write the dependence on $A$ when $A = \mathbb{R}^2$.
2. Fine Structure of Vortices The main goal of this section is to construct a fine structure of vortices away from the boundary of D. The analysis here follows the ideas in [8, 9]. The main difficulty in our situation is due to the presence in the energy of the weight function a(x) which vanishes on ∂D and it does not allow us to construct the structure up to the boundary because of the resulting degeneracy in the energy estimates. Throughout this paper, we assume that Ω satisfies (1.7), so that (1.13)–(1.15) hold. We will
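prove the following results for the map $v_\varepsilon = u_\varepsilon / (\tilde\eta_\varepsilon e^{i\Omega S})$:

Theorem 2.1. (1) For any $R \in \big(\tfrac{\sqrt{a_0}}{2}, \sqrt{a_0}\big)$, there exists $\varepsilon_R > 0$ such that for any $\varepsilon < \varepsilon_R$,
\[
|v_\varepsilon| \geq \frac{1}{2} \quad \text{in } B^\Lambda_R \setminus B^\Lambda_{\sqrt{a_0}/2}.
\]
(2) There exist some constants $N \in \mathbb{N}$, $\lambda_0 > 0$ and $\varepsilon_0 > 0$ (which only depend on $\omega_1$) such that for any $\varepsilon < \varepsilon_0$, one can find a finite collection of points $\{x_j^\varepsilon\}_{j \in J_\varepsilon} \subset B^\Lambda_{\sqrt{a_0}/4}$ such that $\operatorname{Card}(J_\varepsilon) \leq N$ and
\[
|v_\varepsilon| \geq \frac{1}{2} \quad \text{in } \overline{B}^\Lambda_{\sqrt{a_0}/2} \setminus \bigcup_{j \in J_\varepsilon} B(x_j^\varepsilon, \lambda_0 \varepsilon).
\]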
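Remark 2.2. The statement of Theorem 2.1 also holds if the radius $\tfrac{\sqrt{a_0}}{2}$ is replaced by an arbitrary $r \in (0, R)$ but then the constants in Theorem 2.1 depend on $r$. For the sake of simplicity, we prefer to fix $r = \tfrac{\sqrt{a_0}}{2}$.

In the next proposition, we replace as in [20] the discs $\{B(x_j^\varepsilon, \lambda_0 \varepsilon)\}_{j \in J_\varepsilon}$ obtained in Theorem 2.1 by slightly larger discs $B(x_j^\varepsilon, \rho)$ (deleting some of the points $x_j^\varepsilon$, if necessary), in order to get precise information on the behavior of $v_\varepsilon$ on $\partial B(x_j^\varepsilon, \rho)$. The resulting family of discs will represent the vortices of the map $v_\varepsilon$ (and hence, the vortices of $u_\varepsilon$ also).

Proposition 2.3. Let $0 < \beta < \mu < 1$ be given constants such that $\bar\mu := \mu^{N+1} > \beta$ and let $\{x_j^\varepsilon\}_{j \in J_\varepsilon}$ be the collection of points given by (2) in Theorem 2.1. There exists $0 < \varepsilon_1 < \varepsilon_0$ such that for any $\varepsilon < \varepsilon_1$, we can find $\tilde J_\varepsilon \subset J_\varepsilon$ and $\rho > 0$ verifying
(i) $\lambda_0 \varepsilon \leq \varepsilon^{\mu} \leq \rho \leq \varepsilon^{\bar\mu} < \varepsilon^{\beta}$,
(ii) $|v_\varepsilon| \geq \frac{1}{2}$ in $\overline{B}^\Lambda_{\sqrt{a_0}/2} \setminus \bigcup_{j \in \tilde J_\varepsilon} B(x_j^\varepsilon, \rho)$,
(iii) $|v_\varepsilon| \geq 1 - \dfrac{2}{|\ln \varepsilon|^2}$ on $\partial B(x_j^\varepsilon, \rho)$ for every $j \in \tilde J_\varepsilon$,
(iv) $\displaystyle \int_{\partial B(x_j^\varepsilon, \rho)} |\nabla v_\varepsilon|^2 + \frac{1}{2\varepsilon^2}(1 - |v_\varepsilon|^2)^2 \leq \frac{C(\beta, \mu)}{\rho}$ for every $j \in \tilde J_\varepsilon$,
(v) $|x_i^\varepsilon - x_j^\varepsilon| \geq 8\rho$ for every $i, j \in \tilde J_\varepsilon$ with $i \neq j$.
Moreover, for each $j \in \tilde J_\varepsilon$, we have
\[
D_j := \deg\Big(\frac{v_\varepsilon}{|v_\varepsilon|}, \partial B(x_j^\varepsilon, \rho)\Big) \neq 0 \quad \text{and} \quad |D_j| \leq C \tag{2.1}
\]
for a constant $C$ independent of $\varepsilon$.

Remark 2.4. We point out that for every $j \in \tilde J_\varepsilon$, the disc $B(x_j^\varepsilon, \rho)$ carries at least one zero of $v_\varepsilon$ since the degree $D_j \neq 0$.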
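2.1. Some local estimates
We start with a fundamental lemma. It strongly relies on Pohozaev's identity and it will play a similar role as in [8, Theorem III.2]. In our situation, we only derive local estimates as in [3, 9, 24]. Some of the arguments used in the proof are taken from [3, 9].

Lemma 2.5. For any $0 < R < \sqrt{a_0}$ and $\frac{2}{3} < \alpha < 1$, there exists a positive constant $C_{R,\alpha}$ such that
\[
\frac{1}{\varepsilon^2} \int_{B(x_0, \varepsilon^\alpha)} (1 - |v_\varepsilon|^2)^2 \leq C_{R,\alpha} \quad \text{for any } x_0 \in B^\Lambda_R.
\]
Proof. Step 1. Set $\tilde u_\varepsilon = u_\varepsilon e^{-i\Omega S}$. We claim that
\[
E_\varepsilon(\tilde u_\varepsilon, D_\varepsilon) \leq C|\ln \varepsilon|, \tag{2.2}
\]
where $D_\varepsilon$ is defined in (1.16). Indeed, since $\tilde u_\varepsilon = \tilde\eta_\varepsilon v_\varepsilon$, we get that
\[
|\nabla \tilde u_\varepsilon|^2 \leq C\big(\tilde\eta_\varepsilon^2 |\nabla v_\varepsilon|^2 + |v_\varepsilon|^2 |\nabla \tilde\eta_\varepsilon|^2\big).
\]
By [14, Propositions 2.2 and 3.3], $|v_\varepsilon| \leq C$ in $D_\varepsilon$, $\tilde\eta_\varepsilon^2 \leq C a$ in $D_\varepsilon$ and $E_\varepsilon(\tilde\eta_\varepsilon) \leq C|\ln \varepsilon|$ and consequently,
\[
\int_{D_\varepsilon} |\nabla \tilde u_\varepsilon|^2 \leq C\Big(\int_{D_\varepsilon} a(x)|\nabla v_\varepsilon|^2 + \int_{D_\varepsilon} |\nabla \tilde\eta_\varepsilon|^2\Big) \leq C|\ln \varepsilon|
\]
by (1.14). On the other hand, we also have
\[
\begin{aligned}
\frac{1}{\varepsilon^2} \int_{D_\varepsilon} (a(x) - |\tilde u_\varepsilon|^2)^2
&\leq \frac{C}{\varepsilon^2} \int_{D_\varepsilon} \big[(a(x) - \tilde\eta_\varepsilon^2)^2 + \tilde\eta_\varepsilon^4 (1 - |v_\varepsilon|^2)^2\big] \\
&\leq \frac{C}{\varepsilon^2} \Big(\int_{D_\varepsilon} (a(x) - \tilde\eta_\varepsilon^2)^2 + \int_{D_\varepsilon} a^2(x)(1 - |v_\varepsilon|^2)^2\Big) \leq C|\ln \varepsilon|
\end{aligned}
\]
and therefore (2.2) follows.

Step 2. We are going to show that one can find a constant $C_{R,\alpha} > 0$, independent of $\varepsilon$, such that for any $x_0 \in B^\Lambda_R$, there is some $r_0 \in (\varepsilon^\alpha, \varepsilon^{\alpha/2+1/3})$ satisfying
\[
E_\varepsilon(\tilde u_\varepsilon, \partial B(x_0, r_0)) \leq \frac{C_{R,\alpha}}{r_0}.
\]
We proceed by contradiction. Assume that for all $M > 0$, there is $x_M \in B^\Lambda_R$ such that
\[
E_\varepsilon(\tilde u_\varepsilon, \partial B(x_M, r)) \geq \frac{M}{r} \quad \text{for any } r \in (\varepsilon^\alpha, \varepsilon^{\alpha/2+1/3}). \tag{2.3}
\]
Obviously, for $\varepsilon$ small, $B(x_M, \varepsilon^{\alpha/2+1/3}) \subset D_\varepsilon$. Integrating (2.3) for $r \in (\varepsilon^\alpha, \varepsilon^{\alpha/2+1/3})$, we derive that
\[
E_\varepsilon(\tilde u_\varepsilon, D_\varepsilon) \geq M \int_{\varepsilon^\alpha}^{\varepsilon^{\alpha/2+1/3}} \frac{dr}{r} = M(\alpha/2 - 1/3)|\ln \varepsilon|
\]
which contradicts Step 1 for $M$ large enough.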
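The integration used in Step 2 can be written out explicitly. The display below is a sketch of that computation; the first inequality (bounding the energy on $D_\varepsilon$ from below by the integral of the energies on the circles $\partial B(x_M, r)$, via polar coordinates around $x_M$) is the step we assume is intended by "integrating (2.3)".

```latex
\begin{align*}
E_\varepsilon(\tilde u_\varepsilon, D_\varepsilon)
  &\geq \int_{\varepsilon^{\alpha}}^{\varepsilon^{\alpha/2+1/3}}
        E_\varepsilon\big(\tilde u_\varepsilon, \partial B(x_M, r)\big)\, dr
   \geq M \int_{\varepsilon^{\alpha}}^{\varepsilon^{\alpha/2+1/3}} \frac{dr}{r} \\
  &= M\Big[\Big(\frac{\alpha}{2} + \frac{1}{3}\Big)\ln\varepsilon - \alpha\ln\varepsilon\Big]
   = M\Big(\frac{1}{3} - \frac{\alpha}{2}\Big)\ln\varepsilon
   = M\Big(\frac{\alpha}{2} - \frac{1}{3}\Big)|\ln\varepsilon| .
\end{align*}
```

The prefactor $\alpha/2 - 1/3$ is positive exactly when $\alpha > 2/3$, which is where the restriction $\frac{2}{3} < \alpha < 1$ in Lemma 2.5 enters: it guarantees $\varepsilon^\alpha < \varepsilon^{\alpha/2+1/3}$ for small $\varepsilon$. Comparing with (2.2) gives $M(\alpha/2 - 1/3) \leq C$, which fails for $M$ large.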
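Step 3. Fix $x_0 \in B^\Lambda_R$ and let $r_0 \in (\varepsilon^\alpha, \varepsilon^{\alpha/2+1/3})$ be given by Step 2. We recall that any minimizer $u_\varepsilon$ of $F_\varepsilon$ in $\{u \in \mathcal{H},\ \|u\|_{L^2(\mathbb{R}^2)} = 1\}$ satisfies
\[
-\Delta u_\varepsilon + 2i\Omega\, x^\perp \cdot \nabla u_\varepsilon = \frac{1}{\varepsilon^2}(a(x) - |u_\varepsilon|^2) u_\varepsilon + \ell_\varepsilon u_\varepsilon \quad \text{in } \mathbb{R}^2,
\]
where $\ell_\varepsilon$ denotes the Lagrange multiplier. Therefore, we have
\[
\begin{aligned}
-\Delta \tilde u_\varepsilon = {} & \frac{1}{\varepsilon^2}(a(x_0) - |\tilde u_\varepsilon|^2)\tilde u_\varepsilon + \frac{1}{\varepsilon^2}(a(x) - a(x_0))\tilde u_\varepsilon + 2i\Omega(\nabla S - x^\perp) \cdot \nabla \tilde u_\varepsilon \\
& + (\ell_\varepsilon + 2\Omega^2 x^\perp \cdot \nabla S - \Omega^2 |\nabla S|^2)\tilde u_\varepsilon \quad \text{in } B(x_0, r_0).
\end{aligned} \tag{2.4}
\]
As in the proof of the Pohozaev identity, we multiply (2.4) by $(x - x_0) \cdot \nabla \tilde u_\varepsilon$ and we integrate by parts in $B(x_0, r_0)$. We have
\[
\int_{B(x_0, r_0)} -\Delta \tilde u_\varepsilon \cdot [(x - x_0) \cdot \nabla \tilde u_\varepsilon] = \frac{r_0}{2} \int_{\partial B(x_0, r_0)} |\nabla \tilde u_\varepsilon|^2 - r_0 \int_{\partial B(x_0, r_0)} \Big|\frac{\partial \tilde u_\varepsilon}{\partial \nu}\Big|^2 \tag{2.5}
\]
and
\[
\frac{1}{\varepsilon^2} \int_{B(x_0, r_0)} (a(x_0) - |\tilde u_\varepsilon|^2)\tilde u_\varepsilon \cdot [(x - x_0) \cdot \nabla \tilde u_\varepsilon] = \frac{1}{2\varepsilon^2} \int_{B(x_0, r_0)} (a(x_0) - |\tilde u_\varepsilon|^2)^2 - \frac{r_0}{4\varepsilon^2} \int_{\partial B(x_0, r_0)} (a(x_0) - |\tilde u_\varepsilon|^2)^2 \tag{2.6}
\]
(where $\nu$ is the outer normal vector to $\partial B(x_0, r_0)$). From (2.4)–(2.6), we derive that
\[
\begin{aligned}
\frac{1}{\varepsilon^2} \int_{B(x_0, r_0)} (a(x_0) - |\tilde u_\varepsilon|^2)^2 \leq C \Big[ & r_0 \int_{\partial B(x_0, r_0)} |\nabla \tilde u_\varepsilon|^2 + r_0 \varepsilon^{-2} \int_{\partial B(x_0, r_0)} (a(x_0) - |\tilde u_\varepsilon|^2)^2 \\
& + r_0 \varepsilon^{-2} \int_{B(x_0, r_0)} |a(x) - a(x_0)|\,|\tilde u_\varepsilon|\,|\nabla \tilde u_\varepsilon| + \Omega r_0 \int_{B(x_0, r_0)} |\nabla \tilde u_\varepsilon|^2 \\
& + (\Omega^2 + |\ell_\varepsilon|)\, r_0 \int_{B(x_0, r_0)} |\tilde u_\varepsilon|\,|\nabla \tilde u_\varepsilon| \Big].
\end{aligned}
\]
Then, we estimate each integral term in the right-hand side of the previous inequality. By [14, Proposition 3.2], we have $|\ell_\varepsilon| \leq C\varepsilon^{-1}|\ln \varepsilon|$ and $|\tilde u_\varepsilon| \leq C$ in $\mathbb{R}^2$. According to (2.2), we obtain
\[
\begin{aligned}
\varepsilon^{-2} \int_{\partial B(x_0, r_0)} (a(x_0) - |\tilde u_\varepsilon|^2)^2
&\leq C\varepsilon^{-2} \int_{\partial B(x_0, r_0)} \big[(a(x_0) - a(x))^2 + (a(x) - |\tilde u_\varepsilon|^2)^2\big] \\
&\leq C\varepsilon^{-2} \int_{\partial B(x_0, r_0)} (a(x) - |\tilde u_\varepsilon|^2)^2 + C_R\, \varepsilon^{\frac{3}{2}\alpha - 1},
\end{aligned}
\]
and
\[
\Omega r_0 \int_{B(x_0, r_0)} |\nabla \tilde u_\varepsilon|^2 \leq 2\Omega r_0\, E_\varepsilon(\tilde u_\varepsilon, D_\varepsilon) \leq C_R\, \varepsilon^{\alpha/2+1/3} |\ln \varepsilon|^2,
\]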
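and
\[
r_0 \varepsilon^{-2} \int_{B(x_0, r_0)} |a(x) - a(x_0)|\,|\tilde u_\varepsilon|\,|\nabla \tilde u_\varepsilon| \leq C_R\, r_0^2 \varepsilon^{-2} \int_{B(x_0, r_0)} |\nabla \tilde u_\varepsilon| \leq C_R\, r_0^3 \varepsilon^{-2} [E_\varepsilon(\tilde u_\varepsilon, D_\varepsilon)]^{1/2} \leq C_R\, \varepsilon^{\frac{3}{2}\alpha - 1} |\ln \varepsilon|^{1/2},
\]
and
\[
(\Omega^2 + |\ell_\varepsilon|)\, r_0 \int_{B(x_0, r_0)} |\tilde u_\varepsilon|\,|\nabla \tilde u_\varepsilon| \leq C_R\, \varepsilon^{-1} |\ln \varepsilon|\, r_0^2\, [E_\varepsilon(\tilde u_\varepsilon, D_\varepsilon)]^{1/2} \leq C_R\, \varepsilon^{\alpha - \frac{1}{3}} |\ln \varepsilon|^{3/2}
\]
(here we use that $|a(x) - a(x_0)| \leq C_R r_0$ for any $x \in B(x_0, r_0)$). We finally get that
\[
\frac{1}{\varepsilon^2} \int_{B(x_0, r_0)} (a(x_0) - |\tilde u_\varepsilon|^2)^2 \leq C_{R,\alpha}\big(1 + r_0\, E_\varepsilon(\tilde u_\varepsilon, \partial B(x_0, r_0))\big)
\]
for some constant $C_{R,\alpha}$ independent of $\varepsilon$. By Step 2, we conclude that
\[
\frac{1}{\varepsilon^2} \int_{B(x_0, \varepsilon^\alpha)} (a(x_0) - |\tilde u_\varepsilon|^2)^2 \leq C_{R,\alpha}. \tag{2.7}
\]
Since $\|\tilde\eta_\varepsilon - \sqrt{a}\|_{C^1(B^\Lambda_R)} \leq C_R\, \varepsilon^2 |\ln \varepsilon|$ by [14, Proposition 2.2], we have
\[
\begin{aligned}
\frac{1}{\varepsilon^2} \int_{B(x_0, \varepsilon^\alpha)} (1 - |v_\varepsilon|^2)^2
&\leq \frac{C_R}{\varepsilon^2} \int_{B(x_0, \varepsilon^\alpha)} (\tilde\eta_\varepsilon^2 - |\tilde u_\varepsilon|^2)^2
\leq \frac{C_R}{\varepsilon^2} \int_{B(x_0, \varepsilon^\alpha)} (a(x) - |\tilde u_\varepsilon|^2)^2 + o(1) \\
&\leq \frac{C_R}{\varepsilon^2} \int_{B(x_0, \varepsilon^\alpha)} (a(x_0) - |\tilde u_\varepsilon|^2)^2 + o(1) \leq C_{R,\alpha}
\end{aligned}
\]
and we conclude with (2.7).

The next result will allow us to define the notion of a bad disc as in [8].

Proposition 2.6. For any $0 < R < \sqrt{a_0}$, there exist two positive constants $\lambda_R$ and $\mu_R$ such that if
\[
\frac{1}{\varepsilon^2} \int_{B(x_0, 2l)} (1 - |v_\varepsilon|^2)^2 \leq \mu_R \quad \text{with } x_0 \in B^\Lambda_R, \quad \frac{l}{\varepsilon} \geq \lambda_R \quad \text{and} \quad l \leq \frac{\sqrt{a_0} - R}{2},
\]
then $|v_\varepsilon| \geq 1/2$ in $B(x_0, l)$.

Proof. In [14, Proposition 3.3], we proved the existence of a constant $C_R > 0$ independent of $\varepsilon$ such that
\[
|\nabla v_\varepsilon| \leq \frac{C_R}{\varepsilon} \quad \text{in } B^\Lambda_{\frac{\sqrt{a_0} + R}{2}}.
\]
Then, the result follows as in [8, Theorem III.3].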
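Definition 2.7. For $0 < R < \sqrt{a_0}$ and $x \in B^\Lambda_R$, we say that $B(x, \lambda_R \varepsilon)$ is a bad disc if
\[
\frac{1}{\varepsilon^2} \int_{B(x, 2\lambda_R \varepsilon)} (1 - |v_\varepsilon|^2)^2 \geq \mu_R.
\]
Now we can give a local version of Theorem 2.1. We will see that Lemma 2.5 plays a crucial role in the proof.

Proposition 2.8. For any $0 < R < \sqrt{a_0}$ and $\frac{2}{3} < \alpha < 1$, there exist positive constants $N_{R,\alpha}$ and $\varepsilon_{R,\alpha}$ such that for every $\varepsilon < \varepsilon_{R,\alpha}$ and $x_0 \in B^\Lambda_R$, one can find $x_1, \ldots, x_{N_\varepsilon} \in B(x_0, \varepsilon^\alpha)$ with $N_\varepsilon \leq N_{R,\alpha}$ verifying
\[
|v_\varepsilon| \geq \frac{1}{2} \quad \text{in } B(x_0, \varepsilon^\alpha) \setminus \bigcup_{k=1}^{N_\varepsilon} B(x_k, \lambda_R \varepsilon).
\]
Proof. We follow the ideas in [8, Chapter IV]. Consider a family of discs $\{B(x_i, \lambda_R \varepsilon)\}_{i \in F}$ such that $x_i \in B(x_0, \varepsilon^\alpha)$,
\[
B\Big(x_i, \frac{\lambda_R \varepsilon}{4}\Big) \cap B\Big(x_j, \frac{\lambda_R \varepsilon}{4}\Big) = \emptyset \quad \text{for } i \neq j, \tag{2.8}
\]
\[
B(x_0, \varepsilon^\alpha) \subset \bigcup_{i \in F} B(x_i, \lambda_R \varepsilon). \tag{2.9}
\]
Obviously, the discs $\{B(x_i, 2\lambda_R \varepsilon)\}_{i \in F}$ cannot intersect more than $C$ times (where $C$ is a universal constant) and
\[
\bigcup_{i \in F} B(x_i, 2\lambda_R \varepsilon) \subset B(x_0, \varepsilon^{\alpha'}) \quad \text{with } \alpha' = \frac{1}{2}\Big(\alpha + \frac{2}{3}\Big).
\]
We denote by $F'$ the set of indices $i \in F$ such that $B(x_i, \lambda_R \varepsilon)$ is a bad disc. We derive from Definition 2.7 that
\[
\mu_R \operatorname{Card}(F') \leq \sum_{i \in F'} \frac{1}{\varepsilon^2} \int_{B(x_i, 2\lambda_R \varepsilon)} (1 - |v_\varepsilon|^2)^2 \leq \frac{C}{\varepsilon^2} \int_{B(x_0, \varepsilon^{\alpha'})} (1 - |v_\varepsilon|^2)^2.
\]
The conclusion now follows by Lemma 2.5 and Proposition 2.6.

Remark 2.9. By the proof of Proposition 2.8, it follows that any family of discs $\{B(x_i, \lambda_R \varepsilon)\}_{i \in F}$ satisfying (2.8) and (2.9) cannot contain more than $N_{R,\alpha}$ bad discs.

In the sequel, we will require the following crucial lemma to prove that vortices of degree zero do not occur. This result has its source in [3, 9] and the proof is based on the construction of a suitable test function. Hence, the main difference and difficulty in our case come from the mass constraint we have to take into account in the construction of test functions.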
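Lemma 2.10. Let $D > 0$, $0 < \beta < 1$ and $\gamma > 1$ be given constants such that $\gamma\beta < 1$. Let $0 < R < \sqrt{a_0}$ and $0 < \rho < \varepsilon^\beta$ be such that $\rho^\gamma > \lambda_R \varepsilon$. We assume that for $x_0 \in B^\Lambda_R$,
(i) $\displaystyle \int_{\partial B(x_0, \rho)} |\nabla v_\varepsilon|^2 + \frac{1}{2\varepsilon^2}(1 - |v_\varepsilon|^2)^2 < \frac{D}{\rho}$,
(ii) $|v_\varepsilon| \geq \frac{1}{2}$ on $\partial B(x_0, \rho)$,
(iii) $\deg\Big(\dfrac{v_\varepsilon}{|v_\varepsilon|}, \partial B(x_0, \rho)\Big) = 0$.
Then, we have
\[
|v_\varepsilon| \geq \frac{1}{2} \quad \text{in } B(x_0, \rho^\gamma).
\]
Proof of Lemma 2.10. We are going to construct a comparison function as in [3] or [9] to obtain the following estimate:
\[
\int_{B(x_0, \rho)} |\nabla v_\varepsilon|^2 + \frac{1}{2\varepsilon^2}(1 - |v_\varepsilon|^2)^2 \leq C_{\beta, R}. \tag{2.10}
\]
Since the degree of $v_\varepsilon$ restricted to $\partial B(x_0, \rho)$ is zero, we may write on $\partial B(x_0, \rho)$
\[
v_\varepsilon = |v_\varepsilon| e^{i\phi_\varepsilon},
\]
where $\phi_\varepsilon$ is a smooth map from $\partial B(x_0, \rho)$ into $\mathbb{R}$. Then, we define $\hat v_\varepsilon : \mathbb{R}^2 \to \mathbb{C}$ by
\[
\hat v_\varepsilon = \chi_\varepsilon e^{i\psi_\varepsilon} \ \text{ in } B(x_0, \rho), \qquad \hat v_\varepsilon = v_\varepsilon \ \text{ in } \mathbb{R}^2 \setminus B(x_0, \rho),
\]
where $\psi_\varepsilon$ is the solution of
\[
\Delta \psi_\varepsilon = 0 \ \text{ in } B(x_0, \rho), \qquad \psi_\varepsilon = \phi_\varepsilon \ \text{ on } \partial B(x_0, \rho),
\]
and $\chi_\varepsilon$ has the form, written in polar coordinates centered at $x_0$,
\[
\chi_\varepsilon(r, \theta) = (|v_\varepsilon(\rho e^{i\theta})| - 1)\,\xi(r) + 1
\]
and $\xi$ is a smooth function taking values in $[0, 1]$ with small support near $\rho$ with $\xi(\rho) = 1$. By [14, Proposition 3.3], we know that $|v_\varepsilon(x)| \leq 1 + C\varepsilon^{1/3}$ for $x \in D$ with $|x|_\Lambda \geq \sqrt{a_0} - \varepsilon^{1/8}$ and we deduce that $0 \leq \chi_\varepsilon \leq 1 + C\varepsilon^{1/3}$. Arguing as in [7, proof of Theorem 2], we may prove that
\[
\int_{B(x_0, \rho)} |\nabla \psi_\varepsilon|^2 \leq C\rho \int_{\partial B(x_0, \rho)} \Big|\frac{\partial \phi_\varepsilon}{\partial \tau}\Big|^2 \leq C\rho \int_{\partial B(x_0, \rho)} |\nabla v_\varepsilon|^2 \tag{2.11}
\]
and
\[
\int_{B(x_0, \rho)} |\nabla \chi_\varepsilon|^2 + \frac{1}{\varepsilon^2}(1 - \chi_\varepsilon^2)^2 \leq C\rho \int_{\partial B(x_0, \rho)} |\nabla v_\varepsilon|^2 + \frac{1}{2\varepsilon^2}(1 - |v_\varepsilon|^2)^2 + O(\rho). \tag{2.12}
\]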
From (2.11), (2.12) and assumption (i), we infer that
\[
\int_{B(x_0, \rho)} |\nabla \hat v_\varepsilon|^2 + \frac{1}{2\varepsilon^2}(1 - |\hat v_\varepsilon|^2)^2 \leq C. \tag{2.13}
\]
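We set $\tilde v_\varepsilon = m_\varepsilon^{-1} \hat v_\varepsilon$ with $m_\varepsilon = \|\tilde\eta_\varepsilon \hat v_\varepsilon\|_{L^2(\mathbb{R}^2)}$. Clearly, $\tilde\eta_\varepsilon e^{i\Omega S} \tilde v_\varepsilon \in \mathcal{H}$ and $\|\tilde\eta_\varepsilon e^{i\Omega S} \tilde v_\varepsilon\|_{L^2(\mathbb{R}^2)} = 1$. Since $u_\varepsilon = \tilde\eta_\varepsilon e^{i\Omega S} v_\varepsilon$ minimizes the functional $F_\varepsilon$ under the constraint (1.2), we have $F_\varepsilon(u_\varepsilon) \leq F_\varepsilon(\tilde\eta_\varepsilon e^{i\Omega S} \tilde v_\varepsilon)$ and by (1.9), it yields
\[
\tilde F_\varepsilon(v_\varepsilon) + \tilde T_\varepsilon(v_\varepsilon) \leq \tilde F_\varepsilon(\tilde v_\varepsilon) + \tilde T_\varepsilon(\tilde v_\varepsilon). \tag{2.14}
\]
We claim that
\[
\tilde F_\varepsilon(\tilde v_\varepsilon) \leq \tilde F_\varepsilon(\hat v_\varepsilon) + C\rho|\ln \varepsilon|^2 \quad \text{and} \quad \tilde T_\varepsilon(v_\varepsilon) - \tilde T_\varepsilon(\tilde v_\varepsilon) = O\big(\rho^2 |\ln \varepsilon|^2\big). \tag{2.15}
\]
Indeed, we have already established in the proof of [14, Proposition 3.3] that
\[
\tilde E_\varepsilon(v_\varepsilon) \leq C|\ln \varepsilon|^2 \quad \text{and} \quad \tilde R_\varepsilon(v_\varepsilon) \leq C|\ln \varepsilon|^2 \tag{2.16}
\]
so that, using (2.13), $\|\tilde\eta_\varepsilon v_\varepsilon\|_{L^2(\mathbb{R}^2)} = 1$, $\hat v_\varepsilon = v_\varepsilon$ in $\mathbb{R}^2 \setminus B(x_0, \rho)$ and (2.16), we obtain
\[
m_\varepsilon^2 = 1 + \int_{B(x_0, \rho)} \tilde\eta_\varepsilon^2 \big(|\hat v_\varepsilon|^2 - 1\big) + \int_{B(x_0, \rho)} \tilde\eta_\varepsilon^2 \big(1 - |v_\varepsilon|^2\big) = 1 + O(\rho\, \varepsilon |\ln \varepsilon|). \tag{2.17}
\]
From (2.13), (2.16) and (2.17), we derive
\[
\int_{\mathbb{R}^2} \tilde\eta_\varepsilon^2 |\nabla \tilde v_\varepsilon|^2 = m_\varepsilon^{-2} \int_{\mathbb{R}^2} \tilde\eta_\varepsilon^2 |\nabla \hat v_\varepsilon|^2 = \int_{\mathbb{R}^2} \tilde\eta_\varepsilon^2 |\nabla \hat v_\varepsilon|^2 + O(\rho\varepsilon|\ln \varepsilon|^3) \tag{2.18}
\]
and
\[
\tilde R_\varepsilon(\tilde v_\varepsilon) = m_\varepsilon^{-2} \tilde R_\varepsilon(\hat v_\varepsilon) = \tilde R_\varepsilon(\hat v_\varepsilon) + O(\rho\varepsilon|\ln \varepsilon|^3). \tag{2.19}
\]
Since $u_\varepsilon$ remains bounded in $\mathbb{R}^2$ and $E_\varepsilon(u_\varepsilon) \leq C|\ln \varepsilon|^2$ by [14, Proposition 3.3], we infer from (2.16),
\[
\begin{aligned}
\frac{1}{\varepsilon^2} \int_{\mathbb{R}^2} \tilde\eta_\varepsilon^4 (1 - |\tilde v_\varepsilon|^2)^2
= {} & \frac{1}{\varepsilon^2} \int_{\mathbb{R}^2} \tilde\eta_\varepsilon^4 (1 - |\hat v_\varepsilon|^2)^2 + \frac{2(1 - m_\varepsilon^{-2})}{\varepsilon^2} \int_{\mathbb{R}^2} \tilde\eta_\varepsilon^2 (1 - |\hat v_\varepsilon|^2)\, |\tilde\eta_\varepsilon \hat v_\varepsilon|^2 \\
& + \frac{(1 - m_\varepsilon^{-2})^2}{\varepsilon^2} \int_{\mathbb{R}^2} |\tilde\eta_\varepsilon \hat v_\varepsilon|^4 \\
\leq {} & \frac{1}{\varepsilon^2} \int_{\mathbb{R}^2} \tilde\eta_\varepsilon^4 (1 - |\hat v_\varepsilon|^2)^2 \\
& + C\rho|\ln \varepsilon| \Big(\frac{1}{\varepsilon^2} \int_{\mathbb{R}^2 \setminus B(x_0, \rho)} \tilde\eta_\varepsilon^4 (1 - |v_\varepsilon|^2)^2\Big)^{1/2} \Big(\int_{\mathbb{R}^2 \setminus B(x_0, \rho)} |u_\varepsilon|^4\Big)^{1/2} + C\rho^2 |\ln \varepsilon|^2 \\
\leq {} & \frac{1}{\varepsilon^2} \int_{\mathbb{R}^2} \tilde\eta_\varepsilon^4 (1 - |\hat v_\varepsilon|^2)^2 + C\rho|\ln \varepsilon|^2.
\end{aligned} \tag{2.20}
\]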
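Finally, we obtain in the same way,
\[
\big|\tilde T_\varepsilon(v_\varepsilon) - \tilde T_\varepsilon(\tilde v_\varepsilon)\big| \leq \big|\tilde T_\varepsilon(v_\varepsilon) - \tilde T_\varepsilon(\hat v_\varepsilon)\big| + \big|\tilde T_\varepsilon(\hat v_\varepsilon) - \tilde T_\varepsilon(\tilde v_\varepsilon)\big| \tag{2.21}
\]
\[
\leq C|\ln \varepsilon|^2 \Big(\int_{B(x_0, \rho)} (1 + |x|^2)\tilde\eta_\varepsilon^2 + |1 - m_\varepsilon^{-2}| \int_{\mathbb{R}^2} (1 + |x|^2)\tilde\eta_\varepsilon^2 |\hat v_\varepsilon|^2\Big) \leq C\rho^2 |\ln \varepsilon|^2. \tag{2.22}
\]
From (2.18)–(2.21), we conclude that (2.15) holds. Since $\hat v_\varepsilon = v_\varepsilon$ in $\mathbb{R}^2 \setminus B(x_0, \rho)$, we get from (2.14) and (2.15) that
\[
\tilde F_\varepsilon(v_\varepsilon, B(x_0, \rho)) \leq \tilde F_\varepsilon(\hat v_\varepsilon, B(x_0, \rho)) + C\rho|\ln \varepsilon|^2.
\]
By (2.13), we have $\tilde E_\varepsilon(\hat v_\varepsilon, B(x_0, \rho)) \leq C$ and therefore,
\[
|\tilde R_\varepsilon(\hat v_\varepsilon, B(x_0, \rho))| \leq C\Omega \int_{B(x_0, \rho)} |\nabla \hat v_\varepsilon| \leq C\Omega\rho\, \|\nabla \hat v_\varepsilon\|_{L^2(B(x_0, \rho))} = O(\rho|\ln \varepsilon|). \tag{2.23}
\]
Hence, $\tilde F_\varepsilon(\hat v_\varepsilon, B(x_0, \rho)) \leq C$ and we conclude that $\tilde F_\varepsilon(v_\varepsilon, B(x_0, \rho)) \leq C_\beta$. As for (2.23), using (2.16), we easily derive that $|\tilde R_\varepsilon(v_\varepsilon, B(x_0, \rho))| = O(\rho|\ln \varepsilon|^2)$ and we finally get that $\tilde E_\varepsilon(v_\varepsilon, B(x_0, \rho)) \leq C_\beta$ which clearly implies (2.10) since $\tilde\eta_\varepsilon^2 \to a^+$ uniformly as $\varepsilon \to 0$ (see [14, Proposition 2.2]). We deduce from (2.10) that
\[
\int_{2\rho^\gamma}^{\rho} \Big(\int_{\partial B(x_0, s)} |\nabla v_\varepsilon|^2 + \frac{1}{2\varepsilon^2}(1 - |v_\varepsilon|^2)^2\Big)\, ds \leq C_{\beta, R}.
\]
Since $\int_{2\rho^\gamma}^{\rho} \frac{ds}{s|\ln s|^{1/2}} \geq C_\gamma |\ln \varepsilon|^{1/2}$, we derive that for small $\varepsilon$ there exists $s_0 \in [2\rho^\gamma, \rho]$ such that
\[
\int_{\partial B(x_0, s_0)} |\nabla v_\varepsilon|^2 + \frac{1}{2\varepsilon^2}(1 - |v_\varepsilon|^2)^2 \leq \frac{C_{\beta, R}}{s_0 |\ln s_0|^{1/2}}.
\]
Repeating the arguments used to prove (2.10), we find that
\[
\int_{B(x_0, s_0)} |\nabla v_\varepsilon|^2 + \frac{1}{2\varepsilon^2}(1 - |v_\varepsilon|^2)^2 \leq \frac{C_{\beta, R}}{|\ln s_0|^{1/2}}.
\]
In particular, we have
\[
\frac{1}{\varepsilon^2} \int_{B(x_0, 2\rho^\gamma)} (1 - |v_\varepsilon|^2)^2 = o(1)
\]
and the conclusion follows by Proposition 2.6.

We obtain as in [9, Proposition IV.3] the following result which gives us an estimate of the contribution in the energy of any vortex. We reproduce here the proof for completeness.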
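Proposition 2.11. Let $0 < R < \sqrt{a_0}$ and $\frac{2}{3} < \alpha < 1$. Let $x_0 \in B^\Lambda_R$ and assume that $|v_\varepsilon(x_0)| < \frac{1}{2}$. Then there exists a positive constant $C_{R,\alpha}$ (which only depends on $R$, $\alpha$ and $\omega_1$) such that
\[
\int_{B(x_0, \varepsilon^\alpha)} |\nabla v_\varepsilon|^2 \geq C_{R,\alpha} |\ln \varepsilon|.
\]
Proof. Let $N_{R,\alpha}$ and $x_1, \ldots, x_{N_\varepsilon} \in B(x_0, \varepsilon^\alpha)$ be as in Proposition 2.8. We set
\[
\delta_\alpha = \frac{\alpha^{1/2} - \alpha}{3(N_{R,\alpha} + 1)}
\]
and for $k = 0, \ldots, 3N_{R,\alpha} + 2$, we consider
\[
\alpha_k = \alpha^{1/2} - k\delta_\alpha, \quad I_k = [\varepsilon^{\alpha_k}, \varepsilon^{\alpha_{k+1}}] \quad \text{and} \quad \mathcal{C}_k = B(x_0, \varepsilon^{\alpha_{k+1}}) \setminus B(x_0, \varepsilon^{\alpha_k}).
\]
Then, there is some $k_0 \in \{1, \ldots, 3N_{R,\alpha} + 1\}$ such that
\[
\mathcal{C}_{k_0} \cap \Big(\bigcup_{j=1}^{N_\varepsilon} B(x_j, \lambda_R \varepsilon)\Big) = \emptyset. \tag{2.24}
\]
Indeed, since $N_\varepsilon \leq N_{R,\alpha}$ and $2\lambda_R \varepsilon < |I_k|$ for small $\varepsilon$, the union of $N_\varepsilon$ intervals of length $2\lambda_R \varepsilon$
\[
\bigcup_{j=1}^{N_\varepsilon} \big(|x_j - x_0| - \lambda_R \varepsilon,\ |x_j - x_0| + \lambda_R \varepsilon\big)
\]
cannot intersect all the intervals $I_k$ of disjoint interior, for $1 \leq k \leq 3N_{R,\alpha} + 1$. From (2.24), we deduce that
\[
|v_\varepsilon(x)| \geq \frac{1}{2} \quad \text{for any } x \in \mathcal{C}_{k_0}.
\]
Therefore, for every $\rho \in I_{k_0}$,
\[
d_{k_0} = \deg\Big(\frac{v_\varepsilon}{|v_\varepsilon|}, \partial B(x_0, \rho)\Big)
\]
is well defined and does not depend on $\rho$. We claim that
\[
d_{k_0} \neq 0. \tag{2.25}
\]
By contradiction, we suppose that $d_{k_0} = 0$. According to (1.14), it results that
\[
\int_{B^\Lambda_{\frac{\sqrt{a_0} + R}{2}}} |\nabla v_\varepsilon|^2 + \frac{1}{2\varepsilon^2}(1 - |v_\varepsilon|^2)^2 \leq C_R |\ln \varepsilon|.
\]
Using the same argument as in Step 2 of the proof of Lemma 2.5, there is a constant $C_{R,\alpha}$ such that
\[
\int_{\partial B(x_0, \rho_0)} |\nabla v_\varepsilon|^2 + \frac{1}{2\varepsilon^2}(1 - |v_\varepsilon|^2)^2 \leq \frac{C_{R,\alpha}}{\rho_0} \quad \text{for some } \rho_0 \in I_{k_0}.
\]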
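According to Lemma 2.10 with $\beta = \alpha_{k_0+1}$ and $\gamma = \dfrac{\alpha_{k_0-1}}{\alpha_{k_0}}$, we should have
\[
|v_\varepsilon(x_0)| \geq \frac{1}{2}
\]
which is a contradiction. By (2.25), we obtain for every $\rho \in I_{k_0}$,
\[
1 \leq |d_{k_0}| = \frac{1}{2\pi} \Big|\int_{\partial B(x_0, \rho)} \frac{1}{|v_\varepsilon|^2}\, v_\varepsilon \wedge \frac{\partial v_\varepsilon}{\partial \tau}\Big| \leq C \int_{\partial B(x_0, \rho)} |\nabla v_\varepsilon|
\]
(we use that $|v_\varepsilon| \geq \frac{1}{2}$ in $\mathcal{C}_{k_0}$). Then, the Cauchy–Schwarz inequality yields
\[
\int_{\partial B(x_0, \rho)} |\nabla v_\varepsilon|^2 \geq \frac{C}{\rho} \quad \text{for any } \rho \in I_{k_0}
\]
and the conclusion follows integrating on $I_{k_0}$.

2.2. Proofs of Theorem 2.1 and Proposition 2.3
The part (1) in Theorem 2.1 follows directly from Lemma 2.12 below.

Lemma 2.12. There exists a constant $\varepsilon_R > 0$ such that for any $0 < \varepsilon < \varepsilon_R$,
\[
|v_\varepsilon| \geq \frac{1}{2} \quad \text{in } B^\Lambda_R \setminus B^\Lambda_{\sqrt{a_0}/5}.
\]
Proof. First, we fix some $\alpha \in (\frac{2}{3}, 1)$. We proceed by contradiction. Suppose that there is some $x_0 \in B^\Lambda_R \setminus B^\Lambda_{\sqrt{a_0}/5}$ such that $|v_\varepsilon(x_0)| < 1/2$. Then, for any $\varepsilon$ sufficiently small, we have
\[
B(x_0, \varepsilon^\alpha) \subset D_\varepsilon \setminus \{|x|_\Lambda < 2|\ln \varepsilon|^{-1/6}\}
\]
and therefore, by (1.15), we get that
\[
\int_{B(x_0, \varepsilon^\alpha)} |\nabla v_\varepsilon|^2 \leq C_R\, \mathcal{E}_\varepsilon\big(v_\varepsilon, D_\varepsilon \setminus \{|x|_\Lambda < 2|\ln \varepsilon|^{-1/6}\}\big) \leq C_R \ln|\ln \varepsilon|
\]
which contradicts Proposition 2.11 for $\varepsilon$ small enough.

Proof of (2) in Theorem 2.1. We fix some $\frac{2}{3} < \alpha < 1$. As in the proof of Proposition 2.8, we consider a finite family of points $\{x_j\}_{j \in J}$ satisfying
\[
x_j \in B^\Lambda_{\sqrt{a_0}/2}, \qquad B\Big(x_i, \frac{\lambda_0 \varepsilon}{4}\Big) \cap B\Big(x_j, \frac{\lambda_0 \varepsilon}{4}\Big) = \emptyset \ \text{ for } i \neq j, \qquad B^\Lambda_{\sqrt{a_0}/2} \subset \bigcup_{j \in J} B(x_j, \lambda_0 \varepsilon),
\]
where $\lambda_0 := \lambda_{\sqrt{a_0}/2}$ is defined in Proposition 2.6 with $R = \frac{\sqrt{a_0}}{2}$, and we denote by $J_\varepsilon$ the set of indices $j \in J$ such that $B(x_j, \lambda_0 \varepsilon)$ contains at least one point $y_j$ verifying
\[
|v_\varepsilon(y_j)| < \frac{1}{2}. \tag{2.26}
\]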
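Obviously, $B(x_j, \lambda_0 \varepsilon)$ is a bad disc for every $j \in J_\varepsilon$. Applying Lemma 2.12 (with $R = \frac{3\sqrt{a_0}}{4}$), we infer that there exists $\varepsilon_0$ such that for any $0 < \varepsilon < \varepsilon_0$,
\[
B(x_j, \lambda_0 \varepsilon) \subset B^\Lambda_{\sqrt{a_0}/4} \quad \text{for any } j \in J_\varepsilon. \tag{2.27}
\]
Then, it remains to prove that $\operatorname{Card}(J_\varepsilon)$ is bounded independently of $\varepsilon$. Using Proposition 2.11 (with $R = \frac{\sqrt{a_0}}{2}$), we derive that for any $j \in J_\varepsilon$ and any point $y_j$ satisfying (2.26) in the ball $B(x_j, \lambda_0 \varepsilon)$,
\[
\int_{B(x_j, 2\varepsilon^\alpha)} |\nabla v_\varepsilon|^2 \geq \int_{B(y_j, \varepsilon^\alpha)} |\nabla v_\varepsilon|^2 \geq C_\alpha |\ln \varepsilon| \tag{2.28}
\]
for some positive constant $C_\alpha$ which only depends on $\alpha$. We set for $\varepsilon$ small enough,
\[
W = \bigcup_{j \in J_\varepsilon} B(x_j, 2\varepsilon^\alpha) \subset B^\Lambda_{\sqrt{a_0}/3}.
\]
We claim that there is a positive integer $M_\alpha$ independent of $\varepsilon$ such that any $y \in W$ belongs to at most $M_\alpha$ balls in the collection $\{B(x_j, 2\varepsilon^\alpha)\}_{j \in J_\varepsilon}$. Indeed, for each $y \in W$, consider the subset $K_y \subset J_\varepsilon$ defined by
\[
K_y = \{j \in J_\varepsilon : y \in B(x_j, 2\varepsilon^\alpha)\}.
\]
We have for every $j \in K_y$,
\[
x_j \in B(y, 2\varepsilon^\alpha) \subset B(y, \varepsilon^{\alpha'}) \subset B^\Lambda_{\sqrt{a_0}/2} \quad \text{with } \alpha' = \frac{1}{2}\Big(\alpha + \frac{2}{3}\Big). \tag{2.29}
\]
Since the family of discs $\{B(x_j, \lambda_0 \varepsilon)\}_{j \in K_y}$ is a subcover of $B(y, \varepsilon^{\alpha'})$ satisfying (2.8) and (2.9), we conclude from Remark 2.9 that $\operatorname{Card}(K_y) \leq M_\alpha$ with $M_\alpha = N_{\sqrt{a_0}/2,\, \alpha'}$. From (2.28), we infer that
\[
\int_{B^\Lambda_{\sqrt{a_0}/2}} |\nabla v_\varepsilon|^2 \geq \int_W |\nabla v_\varepsilon|^2 \geq \frac{1}{M_\alpha} \sum_{j \in J_\varepsilon} \int_{B(x_j, 2\varepsilon^\alpha)} |\nabla v_\varepsilon|^2 \geq C_\alpha \operatorname{Card}(J_\varepsilon) |\ln \varepsilon|. \tag{2.30}
\]
On the other hand, we know by (1.14),
\[
\int_{B^\Lambda_{\sqrt{a_0}/2}} |\nabla v_\varepsilon|^2 \leq C \int_{B^\Lambda_{\sqrt{a_0}/2}} a(x) |\nabla v_\varepsilon|^2 \leq C|\ln \varepsilon| \tag{2.31}
\]
for a constant $C$ independent of $\varepsilon$. Matching (2.30) and (2.31), we conclude that $\operatorname{Card}(J_\varepsilon)$ is uniformly bounded.

In the following, we will prove Proposition 2.3. We proceed exactly as in [20, Theorem 2.1] and an adaptation of [3, Theorem V.1]. Before starting our proof, we recall, for the convenience of the reader, a result obtained in [14, Proposition 4.1],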
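by a method due to Sandier [18] and Sandier–Serfaty [19]:

Proposition 2.13 [14]. There exists a positive constant $K_0$ such that for $\varepsilon$ sufficiently small, there exist $\nu_\varepsilon \in (1, 2)$ and a finite collection of disjoint balls $\{B_i\}_{i \in I_\varepsilon} := \{B(p_i, r_i)\}_{i \in I_\varepsilon}$ satisfying:
(i) for every $i \in I_\varepsilon$, $B_i \subset\subset D_\varepsilon = \{x \in \mathbb{R}^2,\ a(x) > \nu_\varepsilon |\ln \varepsilon|^{-3/2}\}$,
(ii) $\{x \in D_\varepsilon,\ |v_\varepsilon(x)| < 1 - |\ln \varepsilon|^{-5}\} \subset \bigcup_{i \in I_\varepsilon} B_i$,
(iii) $\sum_{i \in I_\varepsilon} r_i \leq |\ln \varepsilon|^{-10}$,
(iv) $\displaystyle \frac{1}{2} \int_{B_i} a(x) |\nabla v_\varepsilon|^2 \geq \pi a(p_i) |d_i| \big(|\ln \varepsilon| - K_0 \ln|\ln \varepsilon|\big)$, where $d_i = \deg\Big(\dfrac{v_\varepsilon}{|v_\varepsilon|}, \partial B_i\Big)$ for every $i \in I_\varepsilon$.

Proof of Proposition 2.3. By Theorem 2.1, we have for $\varepsilon$ small enough,
\[
\bigcup_{j \in J_\varepsilon} B(x_j^\varepsilon, \lambda_0 \varepsilon) \subset B^\Lambda_{\sqrt{a_0}/3}.
\]
From (iii) in Proposition 2.13, there exists a radius $r_\varepsilon \in \big(\frac{\sqrt{a_0}}{3}, \frac{\sqrt{a_0}}{2}\big]$ such that
\[
\overline{B}_i \cap \partial B^\Lambda_{r_\varepsilon} = \emptyset \quad \text{for every } i \in I_\varepsilon. \tag{2.32}
\]
Hence, we have
\[
|v_\varepsilon| \geq 1 - |\ln \varepsilon|^{-5} \quad \text{on } \partial B^\Lambda_{r_\varepsilon}.
\]
The existence of a subset $\tilde J_\varepsilon \subset J_\varepsilon$ satisfying (i)–(v) can now be proved identically as in [20, Proposition 3.2] and it remains to prove (2.1). From the proof of Theorem 2.1, we know (by construction) that each disc $B(x_k^\varepsilon, \lambda_0 \varepsilon)$, $k \in J_\varepsilon$, contains at least one point $y_k$ such that $|v_\varepsilon(y_k)| < \frac{1}{2}$. Therefore, each disc $B(x_j^\varepsilon, \rho)$, $j \in \tilde J_\varepsilon$, contains at least one of the $y_k$'s with $|x_j^\varepsilon - y_k| < \lambda_0 \varepsilon$. Assume now that $D_j = 0$. By Lemma 2.10 with $\gamma = \mu^{-1/2}$, it would lead to $|v_\varepsilon| \geq \frac{1}{2}$ in $B(x_j^\varepsilon, \rho^\gamma)$ and then $|v_\varepsilon(y_k)| \geq \frac{1}{2}$ for $\varepsilon$ small enough, a contradiction. We also find a bound on the degrees $D_j$:
\[
|D_j| = \frac{1}{2\pi} \Big|\int_{\partial B(x_j^\varepsilon, \rho)} \frac{1}{|v_\varepsilon|^2}\, v_\varepsilon \wedge \frac{\partial v_\varepsilon}{\partial \tau}\Big| \leq C\, \|\nabla v_\varepsilon\|_{L^2(\partial B(x_j^\varepsilon, \rho))} \sqrt{\rho} \leq C
\]
by (iv) in Proposition 2.3.

3. Some Lower Energy Estimates
In this section, we obtain various lower energy estimates for $v_\varepsilon$ in terms of the vortex structure defined in Sec. 2, Proposition 2.3. We start by proving a lower bound on the kinetic energy away from the vortices which brings out the interaction between vortices. The method that we use is based on the techniques developed in [3, 8, 20, 21]. As in the previous section, the main difficulty is due to the degenerate behavior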
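near the boundary of $D$ of the function $a(x)$ since the method involves in our case the operator $-\mathrm{div}(a^{-1}\nabla)$ which is not uniformly elliptic in $D$. To avoid this problem, we shall establish our estimates in $B_R^\Lambda$ for an arbitrary radius $R \in [\sqrt{a_0}/2, \sqrt{a_0})$. The underlying idea here is to let $R \to \sqrt{a_0}$ at the end of the analysis. To emphasize the possible dependence on $R$ in the "error term", we will denote by $O_R(1)$ (respectively, $o_R(1)$) any quantity which remains uniformly bounded in $\varepsilon$ for fixed $R$ (respectively, any quantity which tends to 0 as $\varepsilon \to 0$ for fixed $R$). In the sequel, we will also write $\tilde J_\varepsilon = \{1, \ldots, n_\varepsilon\}$.
Proposition 3.1. For any $R \in \bigl[\frac{\sqrt{a_0}}{2}, \sqrt{a_0}\bigr)$, let $\Theta_\rho = B_R^\Lambda \setminus \cup_{j=1}^{n_\varepsilon} B(x_j^\varepsilon, \rho)$. We have
$$\frac{1}{2}\int_{\Theta_\rho} a(x)|\nabla v_\varepsilon|^2 \ge \pi \sum_{j=1}^{n_\varepsilon} D_j^2\, a(x_j^\varepsilon)|\ln\rho| + W_{R,\varepsilon}\bigl((x_1^\varepsilon, D_1), \ldots, (x_{n_\varepsilon}^\varepsilon, D_{n_\varepsilon})\bigr) + O_R(1), \tag{3.1}$$
where
$$W_{R,\varepsilon}\bigl((x_1^\varepsilon, D_1), \ldots, (x_{n_\varepsilon}^\varepsilon, D_{n_\varepsilon})\bigr) = -\pi \sum_{i\neq j} D_i D_j\, a(x_j^\varepsilon) \ln|x_i^\varepsilon - x_j^\varepsilon| - \pi \sum_{j=1}^{n_\varepsilon} D_j \Psi_{R,\varepsilon}(x_j^\varepsilon)$$
and $\Psi_{R,\varepsilon}$ is the unique solution of
$$\begin{cases} \mathrm{div}\Bigl(\dfrac{1}{a}\nabla\Psi_{R,\varepsilon}\Bigr) = -\displaystyle\sum_{j=1}^{n_\varepsilon} D_j\, a(x_j^\varepsilon)\, \nabla\Bigl(\dfrac{1}{a}\Bigr)\cdot\nabla\bigl(\ln|x - x_j^\varepsilon|\bigr) & \text{in } B_R^\Lambda, \\[2mm] \Psi_{R,\varepsilon} = -\displaystyle\sum_{j=1}^{n_\varepsilon} D_j\, a(x_j^\varepsilon) \ln|x - x_j^\varepsilon| & \text{on } \partial B_R^\Lambda. \end{cases} \tag{3.2}$$
Moreover, if $\frac{\rho}{|x_i^\varepsilon - x_j^\varepsilon|} \to 0$ as $\varepsilon \to 0$ for any $i \neq j$, then the term $O_R(1)$ in (3.1) is in fact $o_R(1)$.
Remark 3.2. We point out that the dependence on $R$ in the interaction term $W_{R,\varepsilon}$ only appears in the function $\Psi_{R,\varepsilon}$. Moreover, for $\Psi_{R,\varepsilon}$ to be well defined, $1/a(x)$ has to be bounded inside $B_R^\Lambda$ so that we cannot pass to the limit $R \to \sqrt{a_0}$ in (3.1) without an a priori deterioration of the error term.
Proof of Proposition 3.1. We consider the solution $\Phi_\rho$ of the linear problem
$$\begin{cases} \mathrm{div}\Bigl(\dfrac{1}{a}\nabla\Phi_\rho\Bigr) = 0 & \text{in } \Theta_\rho, \\[1mm] \Phi_\rho = 0 & \text{on } \partial B_R^\Lambda, \\[1mm] \Phi_\rho = \text{const.} & \text{on } \partial B(x_j^\varepsilon, \rho), \\[1mm] \displaystyle\int_{\partial B(x_j^\varepsilon,\rho)} \dfrac{1}{a}\dfrac{\partial\Phi_\rho}{\partial\nu} = 2\pi D_j & \text{for } j = 1, \ldots, n_\varepsilon, \end{cases}$$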
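and $\Phi_{R,\varepsilon}$ the solution of
$$\begin{cases} \mathrm{div}\Bigl(\dfrac{1}{a}\nabla\Phi_{R,\varepsilon}\Bigr) = 2\pi \displaystyle\sum_{j=1}^{n_\varepsilon} D_j\, \delta_{x_j^\varepsilon} & \text{in } B_R^\Lambda, \\[1mm] \Phi_{R,\varepsilon} = 0 & \text{on } \partial B_R^\Lambda. \end{cases} \tag{3.3}$$
For $x \in \Theta_\rho$, we set $w_\varepsilon(x) = \frac{v_\varepsilon(x)}{|v_\varepsilon(x)|}$ and
$$S = \Bigl(-w_\varepsilon \wedge \frac{\partial w_\varepsilon}{\partial x_2} + \frac{1}{a}\frac{\partial\Phi_\rho}{\partial x_1},\ \ w_\varepsilon \wedge \frac{\partial w_\varepsilon}{\partial x_1} + \frac{1}{a}\frac{\partial\Phi_\rho}{\partial x_2}\Bigr).$$
We easily check that $\mathrm{div}\, S = 0$ in $\Theta_\rho$ and $\int_{\partial B_R^\Lambda} S\cdot\nu = \int_{\partial B(x_j^\varepsilon,\rho)} S\cdot\nu = 0$. By [8, Lemma I.1], there exists $H \in C^1(\bar\Theta_\rho)$ such that $S = \nabla^\perp H$ and hence, we can write the Hodge–de Rham type decomposition
$$w_\varepsilon \wedge \nabla w_\varepsilon = \frac{1}{a}\nabla^\perp\Phi_\rho + \nabla H.$$
Consequently,
$$\int_{\Theta_\rho} a(x)|\nabla w_\varepsilon|^2 = \int_{\Theta_\rho} \frac{1}{a(x)}|\nabla\Phi_\rho|^2 + 2\int_{\Theta_\rho} \nabla^\perp\Phi_\rho\cdot\nabla H + \int_{\Theta_\rho} a(x)|\nabla H|^2 \ge \int_{\Theta_\rho} \frac{1}{a(x)}|\nabla\Phi_\rho|^2 + 2\int_{\Theta_\rho} \nabla^\perp\Phi_\rho\cdot\nabla H.$$
We observe that the last term is in fact equal to zero since it is the integral of a Jacobian and $\Phi_\rho$ is constant on $\partial\Theta_\rho$. Hence,
$$\int_{\Theta_\rho} a(x)|\nabla w_\varepsilon|^2 \ge \int_{\Theta_\rho} \frac{1}{a(x)}|\nabla\Phi_\rho|^2.$$
Since $|\nabla v_\varepsilon|^2 \ge |v_\varepsilon|^2|\nabla w_\varepsilon|^2$ in $\Theta_\rho$, we derive that
$$\int_{\Theta_\rho} a(x)|\nabla v_\varepsilon|^2 \ge \int_{\Theta_\rho} \frac{1}{a(x)}|\nabla\Phi_\rho|^2 + T_1 + 2T_2$$
with
$$T_1 = \int_{\Theta_\rho} \frac{1}{a(x)}\bigl(|v_\varepsilon|^2 - 1\bigr)|\nabla\Phi_\rho|^2 \quad\text{and}\quad T_2 = \int_{\Theta_\rho} \bigl(|v_\varepsilon|^2 - 1\bigr)\nabla^\perp\Phi_\rho\cdot\nabla H.$$
Arguing as in [3] (see Step 4 in the proof of Theorem 6), it turns out that $T_1 = o_R(1)$ and $T_2 = o_R(1)$ and therefore,
$$\int_{\Theta_\rho} a(x)|\nabla v_\varepsilon|^2 \ge \int_{\Theta_\rho} \frac{1}{a(x)}|\nabla\Phi_\rho|^2 + o_R(1). \tag{3.4}$$
On the other hand, by integrating by parts, we obtain
$$\int_{\Theta_\rho} \frac{1}{a(x)}|\nabla\Phi_\rho|^2 = \int_{\partial\Theta_\rho} \frac{1}{a(x)}\frac{\partial\Phi_\rho}{\partial\nu}\,\Phi_\rho = -2\pi\sum_{j=1}^{n_\varepsilon} D_j\,\Phi_\rho(z_j)$$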
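for any point $z_j \in \partial B(x_j^\varepsilon, \rho)$. Since $n_\varepsilon$ and each $D_j$ remain uniformly bounded in $\varepsilon$ by Proposition 2.3, we may rewrite this equality as
$$\int_{\Theta_\rho} \frac{1}{a(x)}|\nabla\Phi_\rho|^2 = -2\pi\sum_{j=1}^{n_\varepsilon} D_j\,\Phi_{R,\varepsilon}(z_j) + O\bigl(\|\Phi_{R,\varepsilon} - \Phi_\rho\|_{L^\infty(\Theta_\rho)}\bigr). \tag{3.5}$$
Using an adaptation of [8, Lemma I.4] (see, e.g., [6, Lemma 3.5]), we derive that
$$\|\Phi_{R,\varepsilon} - \Phi_\rho\|_{L^\infty(\Theta_\rho)} \le \sum_{j=1}^{n_\varepsilon}\Bigl(\sup_{\partial B(x_j^\varepsilon,\rho)} \Phi_{R,\varepsilon} - \inf_{\partial B(x_j^\varepsilon,\rho)} \Phi_{R,\varepsilon}\Bigr). \tag{3.6}$$
To estimate the right-hand side term in (3.6), we introduce for $x \in B_R^\Lambda$,
$$\Psi_{R,\varepsilon}(x) = \Phi_{R,\varepsilon}(x) - \sum_{j=1}^{n_\varepsilon} D_j\, a(x_j^\varepsilon)\ln|x - x_j^\varepsilon|.$$
Since $\Phi_{R,\varepsilon}$ solves (3.3), we deduce that $\Psi_{R,\varepsilon}$ may be characterized as the solution of Eq. (3.2). By elliptic regularity, we infer that $\|\Psi_{R,\varepsilon}\|_{W^{2,p}(B_R^\Lambda)} \le C_{R,p}$ for any $1 \le p < 2$ (here we used that $\{x_j^\varepsilon\}_{j=1}^{n_\varepsilon} \subset B^\Lambda_{\sqrt{a_0}/4}$ by Theorem 2.1). In particular, $\Psi_{R,\varepsilon}$ is uniformly bounded with respect to $\varepsilon$ in $C^{0,1/2}(B_R^\Lambda)$ and hence,
$$\sup_{\partial B(x_j^\varepsilon,\rho)} \Psi_{R,\varepsilon} - \inf_{\partial B(x_j^\varepsilon,\rho)} \Psi_{R,\varepsilon} \le C_R\sqrt{\rho} = o_R(1).$$
Since $|x_j^\varepsilon - x_i^\varepsilon| \ge 8\rho$, we derive from (2.1),
$$\sup_{\partial B(x_j^\varepsilon,\rho)} \sum_{i=1}^{n_\varepsilon} D_i\, a(x_i^\varepsilon)\ln|x - x_i^\varepsilon| - \inf_{\partial B(x_j^\varepsilon,\rho)} \sum_{i=1}^{n_\varepsilon} D_i\, a(x_i^\varepsilon)\ln|x - x_i^\varepsilon| \le \rho \sum_{i=1,\, i\neq j}^{n_\varepsilon} a(x_i^\varepsilon) \sup_{\partial B(x_j^\varepsilon,\rho)} \frac{|D_i|}{|x - x_i^\varepsilon|} \le O(1)$$
(respectively, $\le o(1)$ if $\frac{\rho}{|x_i^\varepsilon - x_j^\varepsilon|} \to 0$ as $\varepsilon \to 0$ for any $i \neq j$). Coming back to (3.6), we obtain that $\|\Phi_{R,\varepsilon} - \Phi_\rho\|_{L^\infty(\Theta_\rho)} \le O_R(1)$ (respectively, $\le o_R(1)$ if $\frac{\rho}{|x_i^\varepsilon - x_j^\varepsilon|} \to 0$ as $\varepsilon \to 0$ for any $i \neq j$). Inserting this estimate in (3.5), we get that
$$\begin{aligned} \int_{\Theta_\rho} \frac{1}{a(x)}|\nabla\Phi_\rho|^2 &= -2\pi\sum_{j=1}^{n_\varepsilon} D_j\,\Phi_{R,\varepsilon}(z_j) + O_R(1) \\ &= -2\pi\sum_{j=1}^{n_\varepsilon} D_j\,\Psi_{R,\varepsilon}(z_j) - 2\pi\sum_{i\neq j} D_i D_j\, a(x_i^\varepsilon)\ln|z_j - x_i^\varepsilon| + 2\pi\sum_{j=1}^{n_\varepsilon} D_j^2\, a(x_j^\varepsilon)|\ln\rho| + O_R(1) \end{aligned} \tag{3.7}$$
(respectively, $+\, o_R(1)$ as $\varepsilon \to 0$). Since $\Psi_{R,\varepsilon}$ is uniformly bounded with respect to $\varepsilon$ in $C^{0,1/2}(B_R^\Lambda)$, we have $|\Psi_{R,\varepsilon}(z_j) - \Psi_{R,\varepsilon}(x_j^\varepsilon)| \le C_R\sqrt{\rho} = o_R(1)$. Moreover, using (2.1) and $|x_j^\varepsilon - x_i^\varepsilon| \ge 8\rho$, we derive that
$$\Bigl|\sum_{i\neq j} D_i D_j\, a(x_i^\varepsilon)\bigl(\ln|z_j - x_i^\varepsilon| - \ln|x_j^\varepsilon - x_i^\varepsilon|\bigr)\Bigr| \le \sum_{i\neq j} |D_i||D_j| \ln\Bigl(1 + \frac{|z_j - x_j^\varepsilon|}{|x_j^\varepsilon - x_i^\varepsilon|}\Bigr) \le \sum_{i\neq j} |D_i||D_j| \frac{\rho}{|x_j^\varepsilon - x_i^\varepsilon|} \le O(1)$$
(respectively, $\le o(1)$ as $\varepsilon \to 0$). Hence, (3.7) yields
$$\int_{\Theta_\rho} \frac{1}{a(x)}|\nabla\Phi_\rho|^2 = -2\pi\sum_{j=1}^{n_\varepsilon} D_j\,\Psi_{R,\varepsilon}(x_j^\varepsilon) - 2\pi\sum_{i\neq j} D_i D_j\, a(x_i^\varepsilon)\ln|x_j^\varepsilon - x_i^\varepsilon| + 2\pi\sum_{j=1}^{n_\varepsilon} D_j^2\, a(x_j^\varepsilon)|\ln\rho| + O_R(1)$$
(respectively, $+\, o_R(1)$ as $\varepsilon \to 0$). Combining this estimate with (3.4), we obtain the announced result.
Arguing as in [20, 21], we estimate the contribution in the energy of each vortex which yields the following lower bounds for $E_\varepsilon(v_\varepsilon)$: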
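Lemma 3.3. For any $R \in \bigl[\frac{\sqrt{a_0}}{2}, \sqrt{a_0}\bigr)$, we have
$$E_\varepsilon(v_\varepsilon, B_R^\Lambda) \ge \pi\sum_{j=1}^{n_\varepsilon} D_j^2\, a(x_j^\varepsilon)|\ln\rho| + \pi\sum_{j=1}^{n_\varepsilon} |D_j|\, a(x_j^\varepsilon)\ln\frac{\rho}{\varepsilon} + W_{R,\varepsilon} + O_R(1) \tag{3.8}$$
and
$$E_\varepsilon(v_\varepsilon, B_R^\Lambda) \ge \pi\sum_{j=1}^{n_\varepsilon} |D_j|\, a(x_j^\varepsilon)\ln\frac{\rho}{\varepsilon} + O(1). \tag{3.9}$$
Proof. In view of Proposition 3.1, it suffices to show that
$$E_\varepsilon(v_\varepsilon, B(x_j^\varepsilon, \rho)) \ge \pi|D_j|\, a(x_j^\varepsilon)\ln\frac{\rho}{\varepsilon} + O(1) \quad\text{for } j = 1, \ldots, n_\varepsilon,$$
which is equivalent to
$$\frac{1}{2}\int_{B(x_j^\varepsilon,\rho)} |\nabla v_\varepsilon|^2 + \frac{a(x_j^\varepsilon)}{2\varepsilon^2}\bigl(1 - |v_\varepsilon|^2\bigr)^2 \ge \pi|D_j|\ln\frac{\rho}{\varepsilon} + O(1) \quad\text{for } j = 1, \ldots, n_\varepsilon \tag{3.10}$$
(we used that $|a(x) - a(x_j^\varepsilon)| \le C\rho$ for $x \in B(x_j^\varepsilon, \rho)$ and $E_\varepsilon(v_\varepsilon, B_R^\Lambda) \le C_R|\ln\varepsilon|$). Setting
$$\hat v(y) = v_\varepsilon(\rho y + x_j^\varepsilon) \ \text{ for } y \in B(0,1) \quad\text{and}\quad \hat\varepsilon = \frac{\varepsilon}{\rho\sqrt{a(x_j^\varepsilon)}},$$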
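we infer from Proposition 2.3 that $|\hat v| \ge 1 - \frac{2}{|\ln\varepsilon|^2}$ on $\partial B(0,1)$,
$$\frac{1}{2}\int_{\partial B(0,1)} |\nabla\hat v|^2 + \frac{1}{2\hat\varepsilon^2}\bigl(1 - |\hat v|^2\bigr)^2 = \frac{\rho}{2}\int_{\partial B(x_j^\varepsilon,\rho)} |\nabla v_\varepsilon|^2 + \frac{a(x_j^\varepsilon)}{2\varepsilon^2}\bigl(1 - |v_\varepsilon|^2\bigr)^2 \le C \tag{3.11}$$
and
$$\frac{1}{2}\int_{B(0,1)} |\nabla\hat v|^2 + \frac{1}{2\hat\varepsilon^2}\bigl(1 - |\hat v|^2\bigr)^2 = \frac{1}{2}\int_{B(x_j^\varepsilon,\rho)} |\nabla v_\varepsilon|^2 + \frac{a(x_j^\varepsilon)}{2\varepsilon^2}\bigl(1 - |v_\varepsilon|^2\bigr)^2.$$
As in the proof of [3, Lemma VI.1], (3.11) yields for $\varepsilon$ small enough,
$$\frac{1}{2}\int_{B(0,1)} |\nabla\hat v|^2 + \frac{1}{2\hat\varepsilon^2}\bigl(1 - |\hat v|^2\bigr)^2 \ge \pi|D_j|\,|\ln\hat\varepsilon| + O(1) = \pi|D_j|\ln\frac{\rho}{\varepsilon} + O(1)$$
and hence, (3.10) holds.
As in [14, Proposition 4.2], we may compute an asymptotic expansion of $R_\varepsilon(v_\varepsilon, D_\varepsilon)$ in terms of vortices which leads, in view of Lemma 3.3, to lower expansions of $F_\varepsilon(v_\varepsilon, D_\varepsilon)$:
Lemma 3.4. For any $R \in \bigl[\frac{\sqrt{a_0}}{2}, \sqrt{a_0}\bigr)$, we have
$$F_\varepsilon(v_\varepsilon, D_\varepsilon) \ge \pi\sum_{j=1}^{n_\varepsilon} D_j^2\, a(x_j^\varepsilon)|\ln\rho| + \pi\sum_{j=1}^{n_\varepsilon} |D_j|\, a(x_j^\varepsilon)\ln\frac{\rho}{\varepsilon} - \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} a^2(x_j^\varepsilon) D_j + W_{R,\varepsilon} + O_R(1) \tag{3.12}$$
and
$$F_\varepsilon(v_\varepsilon, D_\varepsilon) \ge \pi\sum_{j=1}^{n_\varepsilon} |D_j|\, a(x_j^\varepsilon)\ln\frac{\rho}{\varepsilon} - \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} a^2(x_j^\varepsilon) D_j + O(1). \tag{3.13}$$
Proof. We consider the family of balls $\{B_i\}_{i\in I_\varepsilon}$ given in Proposition 2.13. As in the proof of Proposition 2.3, we can find $r_\varepsilon \in [R, (R + \sqrt{a_0})/2]$ such that (2.32) holds. Setting
$$I_R^+ = \{i \in I_\varepsilon,\ |p_i|_\Lambda > r_\varepsilon \text{ and } d_i \ge 0\} \quad\text{and}\quad I_R^- = \{i \in I_\varepsilon,\ |p_i|_\Lambda > r_\varepsilon \text{ and } d_i < 0\}, \tag{3.14}$$
we have $\bar B_i \subset D_\varepsilon \setminus \bar B^\Lambda_{r_\varepsilon}$ for any $i \in I_R^+ \cup I_R^-$. By Theorem 2.1, Propositions 2.3 and 2.13, we infer that for $\varepsilon$ small enough,
$$|v_\varepsilon| \ge \frac{1}{2} \quad\text{in } \Xi_\varepsilon := D_\varepsilon \setminus \Bigl(\bigcup_{i\in I_R^+\cup I_R^-} B_i \ \cup\ \bigcup_{j=1}^{n_\varepsilon} B(x_j^\varepsilon, \rho)\Bigr).$$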
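Arguing exactly as in [14, Proposition 4.2], we obtain that
$$R_\varepsilon(v_\varepsilon, \Xi_\varepsilon) = -\frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} a^2(x_j^\varepsilon) D_j - \frac{\pi\Omega}{1+\Lambda^2}\sum_{i\in I_R^+\cup I_R^-} \bigl(a^2(p_i) - \nu_\varepsilon^2|\ln\varepsilon|^{-3}\bigr) d_i + o_R(1). \tag{3.15}$$
We recall that we have showed in the proof of [14, Proposition 4.2] that $R_\varepsilon(v_\varepsilon, \cup_{i\in I_R^+\cup I_R^-} B_i) = o(1)$. In the same way, we may prove that $R_\varepsilon(v_\varepsilon, \cup_{j=1}^{n_\varepsilon} B(x_j^\varepsilon, \rho)) = o(1)$. From (iv) in Proposition 2.13 and (3.15), we deduce that
$$\begin{aligned} F_\varepsilon(v_\varepsilon, D_\varepsilon) &\ge E_\varepsilon\Bigl(v_\varepsilon, D_\varepsilon \setminus \bigcup_{i\in I_R^+\cup I_R^-} B_i\Bigr) + \sum_{i\in I_R^+\cup I_R^-} \frac{1}{2}\int_{B_i} a(x)|\nabla v_\varepsilon|^2 + R_\varepsilon(v_\varepsilon, \Xi_\varepsilon) + o_R(1) \\ &\ge E_\varepsilon(v_\varepsilon, B_R^\Lambda) - \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} a^2(x_j^\varepsilon) D_j + \pi\sum_{i\in I_R^+\cup I_R^-} a(p_i)|d_i|\bigl(|\ln\varepsilon| - K_0\ln|\ln\varepsilon|\bigr) \\ &\quad - \frac{\pi\Omega}{1+\Lambda^2}\sum_{i\in I_R^+\cup I_R^-} \bigl(a^2(p_i) - \nu_\varepsilon^2|\ln\varepsilon|^{-3}\bigr) d_i + o_R(1). \end{aligned} \tag{3.16}$$
Since $p_i \notin \bar B^\Lambda_{r_\varepsilon}$ for $i \in I_R^+ \cup I_R^-$, we have $a(p_i) < a_0$ and we deduce that for $\varepsilon$ small enough,
$$\pi\sum_{i\in I_R^+\cup I_R^-} a(p_i)|d_i|\bigl(|\ln\varepsilon| - K_0\ln|\ln\varepsilon|\bigr) - \frac{\pi\Omega}{1+\Lambda^2}\sum_{i\in I_R^+\cup I_R^-} \bigl(a^2(p_i) - \nu_\varepsilon^2|\ln\varepsilon|^{-3}\bigr) d_i \ge 0$$
which leads to
$$F_\varepsilon(v_\varepsilon, D_\varepsilon) \ge E_\varepsilon(v_\varepsilon, B_R^\Lambda) - \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} a^2(x_j^\varepsilon) D_j + o_R(1). \tag{3.17}$$
Combining (3.8) and (3.17), we obtain (3.12). Similarly, the inequality (3.17) applied with $R = \sqrt{a_0}/2$, and (3.9) yield (3.13).
4. Proof of Theorem 1.1
In this section, we are going to prove Theorem 1.1 in terms of the map $v_\varepsilon$. We start by showing that vortices must be of degree one. This yields a fundamental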
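improvement of the estimates obtained in the previous section. Then, we treat separately the points (i) and (ii) of Theorem 1.1.
4.1. Vortices have degree one
Lemma 4.1. Whenever $\varepsilon$ is small enough, $D_j = +1$ for $j = 1, \ldots, n_\varepsilon$.
Proof. By [14, Proposition 3.5], we have $F_\varepsilon(v_\varepsilon, D_\varepsilon) \le o(1)$. According to (3.13), it yields
$$\pi\sum_{j=1}^{n_\varepsilon} |D_j|\, a(x_j^\varepsilon)\ln\frac{\rho}{\varepsilon} - \frac{\pi a_0\Omega}{1+\Lambda^2}\sum_{D_j>0} a(x_j^\varepsilon) D_j \le \pi\sum_{j=1}^{n_\varepsilon} |D_j|\, a(x_j^\varepsilon)\ln\frac{\rho}{\varepsilon} - \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} a^2(x_j^\varepsilon) D_j \le O(1).$$
From (1.7), we derive that
$$\sum_{j=1}^{n_\varepsilon} |D_j|\, a(x_j^\varepsilon)\ln\frac{\rho}{\varepsilon} \le \sum_{D_j>0} D_j\, a(x_j^\varepsilon)|\ln\varepsilon| + o(|\ln\varepsilon|).$$
Since $\rho \ge \varepsilon^\mu$, it leads to (we recall that $D_j \neq 0$)
$$(1-\mu)\sum_{D_j<0} |D_j|\, a(x_j^\varepsilon)|\ln\varepsilon| \le \mu\sum_{D_j>0} |D_j|\, a(x_j^\varepsilon)|\ln\varepsilon| + o(|\ln\varepsilon|).$$
By Theorem 2.1, $a(x_j^\varepsilon) \ge a_0/2$ and consequently,
$$\sum_{D_j<0} |D_j| \le \frac{2\mu}{1-\mu}\sum_{D_j>0} |D_j| + o(1) \le \frac{C\mu}{1-\mu} + o(1).$$
Choosing $\mu$ sufficiently small, it yields $D_j > 0$ for $j = 1, \ldots, n_\varepsilon$ whenever $\varepsilon$ is small enough. Since $|x_j^\varepsilon| \le C$ and $D_j > 0$, we may now assert that
$$-\pi\sum_{i\neq j} D_i D_j\, a(x_j^\varepsilon)\ln|x_i^\varepsilon - x_j^\varepsilon| \ge O(1)$$
and thus, $W_{\frac{\sqrt{a_0}}{2},\varepsilon} \ge -\pi\sum_{j=1}^{n_\varepsilon} D_j\,\Psi_{\frac{\sqrt{a_0}}{2},\varepsilon}(x_j^\varepsilon) = O(1)$. Hence, the inequality (3.12) (applied with $R = \sqrt{a_0}/2$) together with $F_\varepsilon(v_\varepsilon, D_\varepsilon) \le o(1)$ leads us to
$$\pi\sum_{j=1}^{n_\varepsilon} D_j^2\, a(x_j^\varepsilon)|\ln\rho| + \pi\sum_{j=1}^{n_\varepsilon} D_j\, a(x_j^\varepsilon)\ln\frac{\rho}{\varepsilon} - \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} a^2(x_j^\varepsilon) D_j \le O(1).$$
As previously, we derive from (1.7), $\sum_{j=1}^{n_\varepsilon} (D_j^2 - D_j)\, a(x_j^\varepsilon)|\ln\rho| \le o(|\ln\varepsilon|)$. Since $\rho \le \varepsilon^{\bar\mu}$ and $a(x_j^\varepsilon) \ge a_0/2$, we conclude that
$$\frac{\bar\mu\, a_0}{2}\sum_{j=1}^{n_\varepsilon} (D_j^2 - D_j) \le o(1)$$
which yields $D_j = +1$ whenever $\varepsilon$ is small enough.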
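The final implication of the proof can be spelled out as follows. This is only a sketch: it uses that each $D_j$ is a positive integer (first half of the proof), that $n_\varepsilon$ is uniformly bounded, and it assumes that $\bar\mu > 0$ is a fixed exponent independent of $\varepsilon$, as the bound $\rho \le \varepsilon^{\bar\mu}$ suggests.

```latex
% Each summand D_j^2 - D_j = D_j (D_j - 1) is a nonnegative integer
% because D_j is a positive integer.
\[
0 \;\le\; \frac{\bar\mu\, a_0}{2}\sum_{j=1}^{n_\varepsilon} D_j\,(D_j-1) \;\le\; o(1)
\quad\Longrightarrow\quad
\sum_{j=1}^{n_\varepsilon} D_j\,(D_j-1) \;<\; 1
\quad\text{for } \varepsilon \text{ small enough.}
\]
% An integer-valued sum of nonnegative integers that is < 1 must vanish term by term:
\[
D_j\,(D_j-1) = 0 \ \text{ and } \ D_j \ge 1
\quad\Longrightarrow\quad
D_j = +1 \qquad (j = 1,\dots,n_\varepsilon).
\]
```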
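As a direct consequence of Lemma 4.1, we obtain the following improvement of Lemma 3.4:
Corollary 4.2. For any $R \in \bigl[\frac{\sqrt{a_0}}{2}, \sqrt{a_0}\bigr)$, we have
$$\tilde F_\varepsilon(v_\varepsilon) \ge \pi\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)|\ln\varepsilon| - \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} a^2(x_j^\varepsilon) + W_{R,\varepsilon}\bigl((x_1^\varepsilon, +1), \ldots, (x_{n_\varepsilon}^\varepsilon, +1)\bigr) + O_R(1).$$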
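The passage from (3.12) to this bound rests on an elementary identity for the two logarithmic terms once $D_j = +1$. A minimal worked check, assuming $0 < \varepsilon < \rho < 1$ (which is consistent with bounds of the form $\varepsilon^{\mu} \le \rho \le \varepsilon^{\bar\mu}$ used earlier):

```latex
% With D_j = 1 we have D_j^2 = |D_j| = 1, so the first two sums in (3.12) combine.
% Since 0 < rho < 1, |ln rho| = -ln rho; since 0 < eps < 1, |ln eps| = -ln eps.
\[
|\ln\rho| + \ln\frac{\rho}{\varepsilon}
  \;=\; -\ln\rho + \ln\rho - \ln\varepsilon
  \;=\; |\ln\varepsilon| ,
\]
\[
\text{hence}\qquad
\pi\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)\,|\ln\rho|
 + \pi\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)\,\ln\frac{\rho}{\varepsilon}
 \;=\; \pi\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)\,|\ln\varepsilon| .
\]
```

In particular the intermediate scale $\rho$ drops out of the leading term, which is why it no longer appears in the statement.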
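Proof. It follows directly from (3.12) and Lemma 4.1 that for any $R \in \bigl[\frac{\sqrt{a_0}}{2}, \sqrt{a_0}\bigr)$,
$$F_\varepsilon(v_\varepsilon, D_\varepsilon) \ge \pi\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)|\ln\varepsilon| - \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} a^2(x_j^\varepsilon) + W_{R,\varepsilon}\bigl((x_1^\varepsilon, +1), \ldots, (x_{n_\varepsilon}^\varepsilon, +1)\bigr) + O_R(1).$$
On the other hand, we have proved in the proofs of [14, Propositions 3.4 and 3.5], that $|F_\varepsilon(v_\varepsilon, D_\varepsilon) - \tilde F_\varepsilon(v_\varepsilon, D_\varepsilon)| = o(1)$ and $\tilde F_\varepsilon(v_\varepsilon, \mathbb{R}^2 \setminus D_\varepsilon) \ge o(1)$. Hence, we have $\tilde F_\varepsilon(v_\varepsilon) \ge F_\varepsilon(v_\varepsilon, D_\varepsilon) + o(1)$ and the conclusion follows.
4.2. The subcritical case
We are now able to prove (i) in Theorem 1.1. Following the proof of [14, Theorem 1.1], it suffices to show Proposition 4.3 below.
Proposition 4.3. Assume that (1.7) holds with $\omega_1 < 0$. Then, for $\varepsilon$ sufficiently small, we have that
$$|v_\varepsilon| \to 1 \quad\text{in } L^\infty_{\rm loc}(D) \text{ as } \varepsilon \to 0. \tag{4.1}$$
Moreover,
$$\tilde F_\varepsilon(v_\varepsilon) = o(1) \quad\text{and}\quad \tilde E_\varepsilon(v_\varepsilon) = o(1). \tag{4.2}$$
Proof. We fix some $\frac{\sqrt{a_0}}{2} < R_0 < \sqrt{a_0}$. In the proof of [14, Proposition 3.4], we have proved that $\tilde F_\varepsilon(v_\varepsilon) \le o(1)$ so that Corollary 4.2 applied with $R = \frac{\sqrt{a_0}}{2}$ leads to
$$\pi\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)|\ln\varepsilon| - \frac{\pi a_0\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon) \le \pi\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)|\ln\varepsilon| - \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} a^2(x_j^\varepsilon) \le O(1).$$
Since $a(x_j^\varepsilon) \ge a_0/2$ and $\omega_1 < 0$, we deduce that
$$\frac{a_0|\omega_1|}{2}\, n_\varepsilon \ln|\ln\varepsilon| \le -\omega_1\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)\ln|\ln\varepsilon| \le O(1)$$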
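and then $n_\varepsilon \le o(1)$ which implies that $n_\varepsilon \equiv 0$ whenever $\varepsilon$ is small enough. Using the notation (3.14), we derive from (3.16) that
$$F_\varepsilon(v_\varepsilon, D_\varepsilon) \ge \pi\sum_{i\in I_{R_0}^+\cup I_{R_0}^-} a(p_i)|d_i|\bigl(|\ln\varepsilon| - K_0\ln|\ln\varepsilon|\bigr) - \frac{\pi\Omega}{1+\Lambda^2}\sum_{i\in I_{R_0}^+\cup I_{R_0}^-} \bigl(a^2(p_i) - \nu_\varepsilon^2|\ln\varepsilon|^{-3}\bigr) d_i.$$
By [14, Proposition 3.5], we have $F_\varepsilon(v_\varepsilon, D_\varepsilon) \le O(|\ln\varepsilon|^{-1})$. Since $a(p_i) < a_0$ for $i \in I_{R_0}^+ \cup I_{R_0}^-$, we infer that there exists $c > 0$ independent of $\varepsilon$ such that
$$c\sum_{i\in I_{R_0}^+\cup I_{R_0}^-} a(p_i)|d_i||\ln\varepsilon| \le \pi\sum_{i\in I_{R_0}^+\cup I_{R_0}^-} a(p_i)|d_i|\bigl(|\ln\varepsilon| - K_0\ln|\ln\varepsilon|\bigr) - \frac{\pi\Omega}{1+\Lambda^2}\sum_{i\in I_{R_0}^+\cup I_{R_0}^-} \bigl(a^2(p_i) - \nu_\varepsilon^2|\ln\varepsilon|^{-3}\bigr) d_i \le O(|\ln\varepsilon|^{-1}).$$
Since $a(x) \ge |\ln\varepsilon|^{-3/2}$ in $D_\varepsilon$, we finally obtain
$$\sum_{i\in I_{R_0}^+\cup I_{R_0}^-} |d_i| \le O(|\ln\varepsilon|^{-1/2}).$$
Hence, $\sum_{i\in I_{R_0}^+\cup I_{R_0}^-} |d_i| = 0$ for $\varepsilon$ sufficiently small and we conclude from (3.15),
$$R_\varepsilon\bigl(v_\varepsilon, D_\varepsilon \setminus \cup_{i\in I_{R_0}^+\cup I_{R_0}^-} B_i\bigr) = o(1).$$
By the proof of [14, Proposition 4.2], we also have $R_\varepsilon(v_\varepsilon, \cup_{i\in I_{R_0}^+\cup I_{R_0}^-} B_i) = o(1)$ so that $R_\varepsilon(v_\varepsilon, D_\varepsilon) = o(1)$. Consequently,
$$E_\varepsilon(v_\varepsilon, D_\varepsilon) = F_\varepsilon(v_\varepsilon, D_\varepsilon) + o(1) \le o(1).$$
Then the rest of the proof follows as in [14, Proposition 4.3].
4.3. The supercritical case
In this section, we will prove (ii) in Theorem 1.1. Writing
$$\Omega = \frac{1+\Lambda^2}{a_0}\bigl(|\ln\varepsilon| + \omega(\varepsilon)\ln|\ln\varepsilon|\bigr), \tag{4.3}$$
we assume that
$$(d-1) + \delta \le \omega(\varepsilon) \le d - \delta \tag{4.4}$$
for some integer $d \ge 1$ and some positive number $\delta \ll 1$ independent of $\varepsilon$. We start by proving that, in this regime, $v_\varepsilon$ has vortices whenever $\varepsilon$ is small enough:
Proposition 4.4. Assume that (4.4) holds. Then, for $\varepsilon$ sufficiently small, $v_\varepsilon$ has exactly $d$ vortices of degree one, i.e. $n_\varepsilon \equiv d$, and
$$\tilde F_\varepsilon(v_\varepsilon) = -\pi a_0\, d\,\omega(\varepsilon)\ln|\ln\varepsilon| + \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon| + O(1). \tag{4.5}$$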
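Proof. Step 1. We start by proving that $n_\varepsilon \ge 1$ for $\varepsilon$ sufficiently small. By Theorem 5.1 in Sec. 5 (with $d = 1$), there exists $\tilde u_\varepsilon \in \mathcal{H}$ such that $\|\tilde u_\varepsilon\|_{L^2(\mathbb{R}^2)} = 1$ and
$$F_\varepsilon(\tilde u_\varepsilon) \le F_\varepsilon(\tilde\eta_\varepsilon e^{i\Omega S}) - \pi a_0\,\omega(\varepsilon)\ln|\ln\varepsilon| + O(1).$$
By the minimizing property of $u_\varepsilon$ and (1.9), we have
$$F_\varepsilon(u_\varepsilon) = F_\varepsilon(\tilde\eta_\varepsilon e^{i\Omega S}) + \tilde F_\varepsilon(v_\varepsilon) + \tilde T_\varepsilon(v_\varepsilon) \le F_\varepsilon(\tilde u_\varepsilon)$$
and since $|\tilde T_\varepsilon(v_\varepsilon)| = o(1)$ (see [14, Proposition 3.3]), we deduce that
$$\tilde F_\varepsilon(v_\varepsilon) \le -\pi a_0\,\omega(\varepsilon)\ln|\ln\varepsilon| + O(1).$$
From here, it turns out by Corollary 4.2 applied with $R = \frac{\sqrt{a_0}}{2}$ (recall that $W_{\frac{\sqrt{a_0}}{2},\varepsilon} \ge O(1)$),
$$\begin{aligned} -\pi a_0\,\omega(\varepsilon)\ln|\ln\varepsilon| + O(1) \ge \tilde F_\varepsilon(v_\varepsilon) &\ge \pi\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)|\ln\varepsilon| - \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} a^2(x_j^\varepsilon) + O(1) \\ &\ge \pi\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)\Bigl(-\omega(\varepsilon)\ln|\ln\varepsilon| + \frac{\Omega|x_j^\varepsilon|_\Lambda^2}{1+\Lambda^2}\Bigr) + O(1) \\ &\ge -\pi a_0\,\omega(\varepsilon)\, n_\varepsilon \ln|\ln\varepsilon| + O(1). \end{aligned}$$
Hence, $n_\varepsilon \ge 1 + o(1)$ and the conclusion follows.
Step 2. Now, we show that
$$\tilde F_\varepsilon(v_\varepsilon) \ge -\pi a_0\, n_\varepsilon\,\omega(\varepsilon)\ln|\ln\varepsilon| + \frac{\pi a_0}{2}(n_\varepsilon^2 - n_\varepsilon)\ln|\ln\varepsilon| + O(1). \tag{4.6}$$
In the case $n_\varepsilon = 1$, we have already proved the result in the previous step. Then, we may assume that $n_\varepsilon \ge 2$. Since $\|\Psi_{\frac{\sqrt{a_0}}{2},\varepsilon}\|_\infty = O(1)$, we get from Corollary 4.2 applied with $R = \frac{\sqrt{a_0}}{2}$,
$$\begin{aligned} \tilde F_\varepsilon(v_\varepsilon) &\ge \pi\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)\Bigl(|\ln\varepsilon| - \sum_{i=1,\, i\neq j}^{n_\varepsilon} \ln|x_i^\varepsilon - x_j^\varepsilon| - \frac{\Omega\, a(x_j^\varepsilon)}{1+\Lambda^2}\Bigr) + O(1) \\ &\ge \pi\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)\Bigl(-\omega(\varepsilon)\ln|\ln\varepsilon| - \sum_{i=1,\, i\neq j}^{n_\varepsilon} \ln|x_i^\varepsilon - x_j^\varepsilon| + \frac{\Omega|x_j^\varepsilon|_\Lambda^2}{1+\Lambda^2}\Bigr) + O(1). \end{aligned} \tag{4.7}$$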
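The step from the first to the second line of (4.7) is an expansion of $\Omega$ and $a(x_j^\varepsilon)$. A short worked version, under the assumption (not restated in this section) that the profile is $a(x) = a_0 - |x|_\Lambda^2$ with $|x|_\Lambda^2 = x_1^2 + \Lambda^2 x_2^2$, and using the form (4.3) of $\Omega$:

```latex
% (4.3) gives  Omega * a_0 / (1 + Lambda^2) = |ln eps| + omega(eps) ln|ln eps|.
% Assumed profile: a(x) = a_0 - |x|_Lambda^2.
\begin{align*}
|\ln\varepsilon| - \frac{\Omega\,a(x_j^\varepsilon)}{1+\Lambda^2}
 &= |\ln\varepsilon| - \frac{\Omega\,a_0}{1+\Lambda^2}
    + \frac{\Omega\,|x_j^\varepsilon|_\Lambda^2}{1+\Lambda^2} \\
 &= |\ln\varepsilon| - \bigl(|\ln\varepsilon| + \omega(\varepsilon)\ln|\ln\varepsilon|\bigr)
    + \frac{\Omega\,|x_j^\varepsilon|_\Lambda^2}{1+\Lambda^2} \\
 &= -\,\omega(\varepsilon)\ln|\ln\varepsilon|
    + \frac{\Omega\,|x_j^\varepsilon|_\Lambda^2}{1+\Lambda^2}.
\end{align*}
```

Under this assumption the two brackets in (4.7) agree exactly, so the second inequality is in fact an equality up to the $O(1)$ term.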
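Since $\tilde F_\varepsilon(v_\varepsilon) \le o(1)$, we derive that
$$-\sum_{i\neq j} \ln|x_i^\varepsilon - x_j^\varepsilon| + \frac{\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} |x_j^\varepsilon|_\Lambda^2 \le C\ln|\ln\varepsilon|.$$
On the other hand, $-\sum_{i\neq j} \ln|x_i^\varepsilon - x_j^\varepsilon| \ge O(1)$ so that $|x_j^\varepsilon|^2 \le C\ln|\ln\varepsilon|\,|\ln\varepsilon|^{-1}$ and hence,
$$\pi\sum_{j=1}^{n_\varepsilon} a(x_j^\varepsilon)\Bigl(-\omega(\varepsilon)\ln|\ln\varepsilon| - \sum_{i=1,\, i\neq j}^{n_\varepsilon} \ln|x_i^\varepsilon - x_j^\varepsilon| + \frac{\Omega|x_j^\varepsilon|_\Lambda^2}{1+\Lambda^2}\Bigr) = -\pi a_0\, n_\varepsilon\,\omega(\varepsilon)\ln|\ln\varepsilon| - \pi a_0\sum_{i\neq j} \ln|x_i^\varepsilon - x_j^\varepsilon| + \frac{\pi a_0\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} |x_j^\varepsilon|_\Lambda^2 + o(1). \tag{4.8}$$
Setting $r = \max_j |x_j^\varepsilon|$, we remark that
$$-\sum_{i\neq j} \ln|x_i^\varepsilon - x_j^\varepsilon| + \frac{\Omega}{1+\Lambda^2}\sum_{j=1}^{n_\varepsilon} |x_j^\varepsilon|_\Lambda^2 \ge -(n_\varepsilon^2 - n_\varepsilon)\ln 2r + \frac{\Omega\Lambda^2 r^2}{1+\Lambda^2} \ge \frac{n_\varepsilon^2 - n_\varepsilon}{2}\ln|\ln\varepsilon| + O(1). \tag{4.9}$$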
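The last inequality in (4.9) comes from minimizing the middle expression over $r > 0$. A sketch of that one-variable computation, with the shorthand $N = n_\varepsilon^2 - n_\varepsilon$ and $c = \Lambda^2/(1+\Lambda^2)$ (both introduced here only for illustration), using that $n_\varepsilon \ge 2$ is uniformly bounded and that $\ln\Omega = \ln|\ln\varepsilon| + O(1)$ by (4.3):

```latex
% g(r) = -N ln(2r) + c * Omega * r^2,  with N > 0 and c > 0 fixed.
\[
g'(r) = -\frac{N}{r} + 2c\,\Omega\, r = 0
\quad\Longleftrightarrow\quad
r_*^2 = \frac{N}{2c\,\Omega},
\]
% g is convex on (0, infinity), so r_* is the global minimizer:
\[
g(r) \;\ge\; g(r_*)
 = -N\ln 2 - \frac{N}{2}\ln\frac{N}{2c\,\Omega} + \frac{N}{2}
 = \frac{N}{2}\ln\Omega + O(1)
 = \frac{n_\varepsilon^2 - n_\varepsilon}{2}\,\ln|\ln\varepsilon| + O(1).
\]
```

The minimizing radius is of order $\Omega^{-1/2} \sim |\ln\varepsilon|^{-1/2}$, which is the length scale that reappears in Lemma 4.5.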
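Combining (4.7)–(4.9), we obtain (4.6).
Step 3. We start by proving that $n_\varepsilon \ge d$. The case $d = 1$ is proved in Step 1 so that we may assume that $d \ge 2$. By Theorem 5.1 in Sec. 5, there exists for $\varepsilon$ small enough, $\tilde u_\varepsilon \in \mathcal{H}$ such that $\|\tilde u_\varepsilon\|_{L^2(\mathbb{R}^2)} = 1$ and
$$F_\varepsilon(\tilde u_\varepsilon) \le F_\varepsilon(\tilde\eta_\varepsilon e^{i\Omega S}) - \pi a_0\, d\,\omega(\varepsilon)\ln|\ln\varepsilon| + \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon| + O(1).$$
As in Step 1, $F_\varepsilon(u_\varepsilon) \le F_\varepsilon(\tilde u_\varepsilon)$ yields
$$\tilde F_\varepsilon(v_\varepsilon) \le -\pi a_0\, d\,\omega(\varepsilon)\ln|\ln\varepsilon| + \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon| + O(1). \tag{4.10}$$
Matching (4.6) with (4.10), we deduce that
$$-\omega(\varepsilon)\, n_\varepsilon + \frac{n_\varepsilon^2 - n_\varepsilon}{2} \le -\omega(\varepsilon)\, d + \frac{d^2 - d}{2} + o(1)$$
and it yields
$$\omega(\varepsilon)(d - n_\varepsilon) \le \frac{(d - n_\varepsilon)(d + n_\varepsilon - 1)}{2} + o(1). \tag{4.11}$$
If we assume that $n_\varepsilon \le d - 1$, it would lead to
$$(d-1) + \delta \le \frac{d + n_\varepsilon - 1}{2} + o(1) \le d - 1 + o(1)$$
which is impossible for $\varepsilon$ small enough.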
Assume now that $n_\varepsilon \ge d + 1$. As previously, we infer that (4.11) holds and therefore,
$$d - \delta \ge \frac{d + n_\varepsilon - 1}{2} + o(1) \ge d + o(1)$$
which is also impossible for $\varepsilon$ small. Hence, $n_\varepsilon \equiv d$ whenever $\varepsilon$ is small enough which leads to (4.5) by (4.6) and (4.10).
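The counting argument of Step 3 reduces to a factorization. A worked version of the algebra behind the comparison of (4.6) with (4.10), writing $n = n_\varepsilon$ and $\omega = \omega(\varepsilon)$ for brevity:

```latex
% Divide the matched inequality by  pi * a_0 * ln|ln eps|  (this turns O(1) into o(1)):
\[
-\omega n + \frac{n^2-n}{2} \;\le\; -\omega d + \frac{d^2-d}{2} + o(1)
\quad\Longleftrightarrow\quad
\omega\,(d-n) \;\le\; \frac{(d^2-d)-(n^2-n)}{2} + o(1).
\]
% Factorization of the right-hand side:
\[
(d^2-d)-(n^2-n) = (d-n)(d+n) - (d-n) = (d-n)(d+n-1).
\]
% Case n <= d-1: d-n >= 1, divide by d-n (sign preserved), then use (4.4):
\[
(d-1)+\delta \;\le\; \omega \;\le\; \frac{d+n-1}{2} + o(1) \;\le\; d-1+o(1).
\]
% Case n >= d+1: d-n <= -1, dividing reverses the inequality, then use (4.4):
\[
d-\delta \;\ge\; \omega \;\ge\; \frac{d+n-1}{2} + o(1) \;\ge\; d+o(1).
\]
```

Both chains contradict $\delta > 0$ for $\varepsilon$ small, which leaves $n_\varepsilon = d$ as the only possibility.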
By Proposition 4.4, we may now assume that $v_\varepsilon$ has exactly $d$ vortices. We now obtain a first piece of information on their location:
Lemma 4.5. We have
$$|x_j^\varepsilon| \le C|\ln\varepsilon|^{-1/2} \quad\text{for } j = 1, \ldots, d$$
and, if $d \ge 2$,
$$|x_i^\varepsilon - x_j^\varepsilon| \ge C|\ln\varepsilon|^{-1/2} \quad\text{for } i \neq j.$$
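Proof. Matching (4.5) with (4.7) and (4.8) and using that $n_\varepsilon = d$, we deduce that
$$-\pi a_0\sum_{i\neq j} \ln|x_i^\varepsilon - x_j^\varepsilon| + \frac{\pi a_0\Omega}{1+\Lambda^2}\sum_{j=1}^{d} |x_j^\varepsilon|_\Lambda^2 \le \pi a_0 (d^2 - d)\ln|\ln\varepsilon|^{1/2} + O(1).$$
Hence,
$$-\sum_{i\neq j} \ln\bigl(|\ln\varepsilon|^{1/2}\,|x_i^\varepsilon - x_j^\varepsilon|\bigr) + \sum_{j=1}^{d} \Omega|x_j^\varepsilon|^2 \le O(1)$$
and the conclusion follows.
Since $\frac{\rho}{|x_i^\varepsilon - x_j^\varepsilon|} = o(1)$ by Lemma 4.5, we may now improve the lower estimates obtained in Lemma 3.3 following the method of the proof of Proposition 5.2 in [20, 21].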
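Lemma 4.6. For any $R \in \bigl[\frac{\sqrt{a_0}}{2}, \sqrt{a_0}\bigr)$, we have
$$E_\varepsilon(v_\varepsilon, B_R^\Lambda) \ge \pi\sum_{j=1}^{d} a(x_j^\varepsilon)|\ln\varepsilon| + W_{R,\varepsilon}(x_1^\varepsilon, \ldots, x_d^\varepsilon) + \frac{\pi a_0 d}{2}\ln a_0 + a_0 d\gamma_0 + o_R(1),$$
where $\gamma_0$ is an absolute constant.
Proof. Since $\frac{\rho}{|x_i^\varepsilon - x_j^\varepsilon|} = o(1)$ and $D_j = 1$, Proposition 3.1 yields
$$\frac{1}{2}\int_{\Theta_\rho} a(x)|\nabla v_\varepsilon|^2 \ge \pi\sum_{j=1}^{d} a(x_j^\varepsilon)|\ln\rho| + W_{R,\varepsilon}(x_1^\varepsilon, \ldots, x_d^\varepsilon) + o_R(1) \tag{4.12}$$
and it remains to estimate $E_\varepsilon(v_\varepsilon, B(x_j^\varepsilon, \rho))$ for $j = 1, \ldots, d$. We proceed as follows. Since $D_j = 1$, we may write on $\partial B(x_j^\varepsilon, \rho)$ in polar coordinates with center $x_j^\varepsilon$,
$$v_\varepsilon(x) = |v_\varepsilon(x)|\, e^{i(\theta + \psi_j(\theta))}, \quad \theta \in [0, 2\pi],$$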
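where $\psi_j \in H^1([0, 2\pi], \mathbb{R})$ and $\psi_j(0) = \psi_j(2\pi) = 0$. Then, in each disc $B(x_j^\varepsilon, 2\rho)$, we consider the map $\hat v_\varepsilon$ defined by
$$\hat v_\varepsilon(x) = \begin{cases} v_\varepsilon(x) & \text{if } x \in B(x_j^\varepsilon, \rho), \\[1mm] \Bigl(\dfrac{r-\rho}{\rho} + \dfrac{2\rho - r}{\rho}\,|v_\varepsilon(x_j^\varepsilon + \rho\, e^{i\theta})|\Bigr) \exp i\Bigl(\theta + \dfrac{2\rho - r}{\rho}\,\psi_j(\theta) + \dfrac{\rho - r}{\rho}\,\psi_j(0)\Bigr) & \text{if } x \in B(x_j^\varepsilon, 2\rho)\setminus B(x_j^\varepsilon, \rho). \end{cases}$$
Then, $\hat v_\varepsilon = \exp i(\theta + \psi_j(0))$ on $\partial B(x_j^\varepsilon, 2\rho)$. Exactly as in the proof of Proposition 5.2 in [20, 21], we prove that
$$E_\varepsilon\bigl(\hat v_\varepsilon, B(x_j^\varepsilon, 2\rho)\setminus B(x_j^\varepsilon, \rho)\bigr) - \pi a(x_j^\varepsilon)\ln 2 = o(1). \tag{4.13}$$
Since $|a(x) - a(x_j^\varepsilon)| = O(\rho)$ in $B(x_j^\varepsilon, 2\rho)$, we may write
$$E_\varepsilon(\hat v_\varepsilon, B(x_j^\varepsilon, 2\rho)) = \frac{a(x_j^\varepsilon)}{2}\int_{B(x_j^\varepsilon,2\rho)} |\nabla\hat v_\varepsilon|^2 + \frac{a(x_j^\varepsilon)}{2\varepsilon^2}\bigl(1 - |\hat v_\varepsilon|^2\bigr)^2 + o(1). \tag{4.14}$$
Now, we shall recall a result in [8]. For $\tilde\varepsilon > 0$, we consider
$$I(\tilde\varepsilon) = \min_{u\in\mathcal{C}} \frac{1}{2}\int_{B(0,1)} |\nabla u|^2 + \frac{1}{2\tilde\varepsilon^2}\bigl(1 - |u|^2\bigr)^2,$$
where
$$\mathcal{C} = \Bigl\{u \in H^1(B(0,1), \mathbb{C}),\ u(x) = \frac{x}{|x|} \text{ on } \partial B(0,1)\Bigr\}.$$
Then, we have
$$\lim_{\tilde\varepsilon\to 0}\bigl(I(\tilde\varepsilon) + \pi\ln\tilde\varepsilon\bigr) = \gamma_0. \tag{4.15}$$
Since $\hat v_\varepsilon(x) = \frac{x - x_j^\varepsilon}{|x - x_j^\varepsilon|}\, e^{i\psi_j(0)}$ on $\partial B(x_j^\varepsilon, 2\rho)$, we obtain by scaling
$$\frac{1}{2}\int_{B(x_j^\varepsilon,2\rho)} |\nabla\hat v_\varepsilon|^2 + \frac{a(x_j^\varepsilon)}{2\varepsilon^2}\bigl(1 - |\hat v_\varepsilon|^2\bigr)^2 \ge I\Bigl(\frac{\varepsilon}{2\rho\sqrt{a(x_j^\varepsilon)}}\Bigr) = \pi\ln\frac{\rho}{\varepsilon} + \pi\ln 2 + \frac{\pi}{2}\ln a(x_j^\varepsilon) + \gamma_0 + o(1).$$
With (4.13) and (4.14), we derive that for $j = 1, \ldots, d$,
$$E_\varepsilon(v_\varepsilon, B(x_j^\varepsilon, \rho)) \ge \pi a(x_j^\varepsilon)\ln\frac{\rho}{\varepsilon} + \frac{\pi a(x_j^\varepsilon)}{2}\ln a(x_j^\varepsilon) + a(x_j^\varepsilon)\gamma_0 + o(1) \ge \pi a(x_j^\varepsilon)\ln\frac{\rho}{\varepsilon} + \frac{\pi a_0}{2}\ln a_0 + a_0\gamma_0 + o(1).$$
Combining this estimate with (4.12), we get the result.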
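The scaling step in this proof can be made explicit. A minimal sketch, where the rescaled map $u$ is a name introduced here for illustration, and where $a_j$ abbreviates $a(x_j^\varepsilon)$:

```latex
% Rescale the disc B(x_j, 2 rho) to the unit disc and remove the constant phase:
%   u(y) = exp(-i psi_j(0)) * \hat v_eps(x_j + 2 rho y),  y in B(0,1),
% so that u(y) = y/|y| on the unit circle, i.e. u belongs to the class C.
\[
\frac12\int_{B(x_j^\varepsilon,2\rho)} |\nabla \hat v_\varepsilon|^2
   + \frac{a_j}{2\varepsilon^2}\bigl(1-|\hat v_\varepsilon|^2\bigr)^2
 \;=\; \frac12\int_{B(0,1)} |\nabla u|^2
   + \frac{(2\rho)^2 a_j}{2\varepsilon^2}\bigl(1-|u|^2\bigr)^2
 \;\ge\; I(\tilde\varepsilon),
\qquad
\tilde\varepsilon = \frac{\varepsilon}{2\rho\sqrt{a_j}} .
\]
% The Dirichlet term is scale invariant in two dimensions; only the potential term rescales.
% Expansion of I via (4.15):
\[
I(\tilde\varepsilon) = -\pi\ln\tilde\varepsilon + \gamma_0 + o(1)
 = \pi\ln\frac{\rho}{\varepsilon} + \pi\ln 2 + \frac{\pi}{2}\ln a_j + \gamma_0 + o(1).
\]
```

Multiplying by $a_j$ and subtracting the annulus contribution $\pi a_j \ln 2$ from (4.13) gives the stated lower bound on $B(x_j^\varepsilon, \rho)$.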
We are now able to give the asymptotic expansion of $\tilde F_\varepsilon(v_\varepsilon)$ which will allow us to locate precisely the vortices. This concludes the proof of Theorem 1.1.
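Proposition 4.7. Setting $\tilde x_j^\varepsilon = \sqrt{\Omega}\, x_j^\varepsilon$ for $j = 1, \ldots, d$, as $\varepsilon \to 0$ the $\tilde x_j^\varepsilon$'s tend to minimize the renormalized energy $w : \mathbb{R}^{2d} \to \mathbb{R}$ given by
$$w(b_1, \ldots, b_d) = -\pi a_0\sum_{i\neq j} \ln|b_i - b_j| + \frac{\pi a_0}{1+\Lambda^2}\sum_{j=1}^{d} |b_j|_\Lambda^2.$$
Moreover, we have
$$\tilde F_\varepsilon(v_\varepsilon) = -\pi a_0\, d\,\omega(\varepsilon)\ln|\ln\varepsilon| + \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon| + \min_{b\in\mathbb{R}^{2d}} w(b) + Q_{\Lambda,d} + o(1) \tag{4.16}$$
where
$$Q_{\Lambda,d} = \frac{\pi a_0}{2}(d^2 - d)\ln(1+\Lambda^2) + \pi a_0 d\ln a_0 - \frac{\pi a_0}{2} d^2\ln a_0 + a_0 d\gamma_0 - \pi a_0 d^2\,\ell(\Lambda)$$
and $\ell(\Lambda)$ is given by (A.2).
Proof. From Lemma 4.6 and (3.17), we infer that for any $R \in \bigl[\frac{\sqrt{a_0}}{2}, \sqrt{a_0}\bigr)$,
$$F_\varepsilon(v_\varepsilon, D_\varepsilon) \ge \pi\sum_{j=1}^{d} a(x_j^\varepsilon)|\ln\varepsilon| - \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{d} a^2(x_j^\varepsilon) + W_{R,\varepsilon} + \frac{\pi a_0}{2}\, d\ln a_0 + a_0 d\gamma_0 + o_R(1).$$
As in the proof of Corollary 4.2, this estimate implies
$$\tilde F_\varepsilon(v_\varepsilon) \ge \pi\sum_{j=1}^{d} a(x_j^\varepsilon)|\ln\varepsilon| - \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{d} a^2(x_j^\varepsilon) + W_{R,\varepsilon} + \frac{\pi a_0}{2}\, d\ln a_0 + a_0 d\gamma_0 + o_R(1).$$
Expanding $\Omega$ and $a(x_j^\varepsilon)$, we derive that
$$\tilde F_\varepsilon(v_\varepsilon) \ge \pi\sum_{j=1}^{d} a(x_j^\varepsilon)\Bigl(-\omega(\varepsilon)\ln|\ln\varepsilon| + \frac{\Omega|x_j^\varepsilon|_\Lambda^2}{1+\Lambda^2}\Bigr) + W_{R,\varepsilon} + \frac{\pi a_0}{2}\, d\ln a_0 + a_0 d\gamma_0 + o_R(1)$$
and by Lemma 4.5, it yields
$$\tilde F_\varepsilon(v_\varepsilon) \ge -\pi a_0\, d\,\omega(\varepsilon)\ln|\ln\varepsilon| + \frac{\pi a_0}{1+\Lambda^2}\sum_{j=1}^{d} \Omega|x_j^\varepsilon|_\Lambda^2 + W_{R,\varepsilon} + \frac{\pi a_0}{2}\, d\ln a_0 + a_0 d\gamma_0 + o_R(1). \tag{4.17}$$
By Lemma 4.5, we also have
$$W_{R,\varepsilon} = -\pi a_0\sum_{i\neq j} \ln|x_i^\varepsilon - x_j^\varepsilon| - \pi\sum_{j=1}^{d} \Psi_{R,\varepsilon}(x_j^\varepsilon) + o(1). \tag{4.18}$$
Since $D_j = 1$ for all $j$, the function $\Psi_{R,\varepsilon}$ satisfies the equation
$$\begin{cases} \mathrm{div}\Bigl(\dfrac{1}{a}\nabla\Psi_{R,\varepsilon}\Bigr) = -\displaystyle\sum_{j=1}^{d} a(x_j^\varepsilon)\,\nabla\Bigl(\dfrac{1}{a}\Bigr)\cdot\nabla\bigl(\ln|x - x_j^\varepsilon|\bigr) & \text{in } B_R^\Lambda, \\[2mm] \Psi_{R,\varepsilon} = -\displaystyle\sum_{j=1}^{d} a(x_j^\varepsilon)\ln|x - x_j^\varepsilon| & \text{on } \partial B_R^\Lambda. \end{cases} \tag{4.19}$$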
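We infer from Lemma 4.5 that for $j = 1, \ldots, d$,
$$-a(x_j^\varepsilon)\,\nabla\Bigl(\frac{1}{a}\Bigr)\cdot\nabla\bigl(\ln|x - x_j^\varepsilon|\bigr) = \frac{-2a_0|x|_\Lambda^2}{a^2(x)|x|^2} + f_\varepsilon^j(x),$$
where $f_\varepsilon^j$ satisfies $\|f_\varepsilon^j\|_{L^p(B_R^\Lambda)} = o_R(1)$ for any $p \in [1, 2)$ and
$$\bigl\|a_0\ln|x| - a(x_j^\varepsilon)\ln|x - x_j^\varepsilon|\bigr\|_{C^1(\partial B_R^\Lambda)} = o(1).$$
Letting $\Psi_R$ be the solution of the equation
$$\begin{cases} \mathrm{div}\Bigl(\dfrac{1}{a}\nabla\Psi_R\Bigr) = \dfrac{-2|x|_\Lambda^2}{a^2(x)|x|^2} & \text{in } B_R^\Lambda, \\[2mm] \Psi_R = -\ln|x| & \text{on } \partial B_R^\Lambda, \end{cases} \tag{4.20}$$
it follows by classical results that $\|\Psi_{R,\varepsilon} - a_0 d\,\Psi_R\|_{L^\infty(B_R^\Lambda)} = o_R(1)$. Hence, we obtain from (4.18),
$$\lim_{\varepsilon\to 0}\Bigl\{W_{R,\varepsilon}(x_1^\varepsilon, \ldots, x_d^\varepsilon) + \pi a_0\sum_{i\neq j} \ln|x_i^\varepsilon - x_j^\varepsilon|\Bigr\} = -\pi a_0 d^2\,\Psi_R(0). \tag{4.21}$$
Combining (4.17) and (4.21), we are led to
$$\liminf_{\varepsilon\to 0}\Bigl\{\tilde F_\varepsilon(v_\varepsilon) + \pi a_0\, d\,\omega(\varepsilon)\ln|\ln\varepsilon| + \pi a_0\sum_{i\neq j} \ln|x_i^\varepsilon - x_j^\varepsilon| - \frac{\pi a_0}{1+\Lambda^2}\sum_{j=1}^{d} \Omega|x_j^\varepsilon|_\Lambda^2\Bigr\} \ge \frac{\pi a_0}{2}\, d\ln a_0 + a_0 d\gamma_0 - \pi a_0 d^2\,\Psi_R(0).$$
Setting $\tilde x_j^\varepsilon = \sqrt{\Omega}\, x_j^\varepsilon$, it yields
$$\liminf_{\varepsilon\to 0}\Bigl\{\tilde F_\varepsilon(v_\varepsilon) + \pi a_0\, d\,\omega(\varepsilon)\ln|\ln\varepsilon| - \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon| - w(\tilde x_1^\varepsilon, \ldots, \tilde x_d^\varepsilon)\Bigr\} \ge \frac{\pi a_0}{2}(d^2 - d)\ln(1+\Lambda^2) + \pi a_0 d\ln a_0 - \frac{\pi a_0 d^2}{2}\ln a_0 + a_0 d\gamma_0 - \pi a_0 d^2\,\Psi_R(0).$$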
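Since $\Psi_R(0) \to \ell(\Lambda)$ as $R \to \sqrt{a_0}$ by Lemma A.1 in Appendix A, we conclude that
$$\liminf_{\varepsilon\to 0}\Bigl\{\tilde F_\varepsilon(v_\varepsilon) + \pi a_0\,\omega(\varepsilon)\, d\ln|\ln\varepsilon| - \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon| - w(\tilde x_1^\varepsilon, \ldots, \tilde x_d^\varepsilon)\Bigr\} \ge Q_{\Lambda,d} \tag{4.22}$$
and hence,
$$\liminf_{\varepsilon\to 0}\Bigl\{\tilde F_\varepsilon(v_\varepsilon) + \pi a_0\,\omega(\varepsilon)\, d\ln|\ln\varepsilon| - \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon|\Bigr\} \ge \min_{b\in\mathbb{R}^{2d}} w(b) + Q_{\Lambda,d}. \tag{4.23}$$
By Theorem 5.1 in Sec. 5, for any $\delta > 0$, there exists $\tilde u_\varepsilon \in \mathcal{H}$ such that $\|\tilde u_\varepsilon\|_{L^2(\mathbb{R}^2)} = 1$ and
$$\limsup_{\varepsilon\to 0}\Bigl\{F_\varepsilon(\tilde u_\varepsilon) - F_\varepsilon(\tilde\eta_\varepsilon e^{i\Omega S}) + \pi a_0\, d\,\omega(\varepsilon)\ln|\ln\varepsilon| - \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon|\Bigr\} \le \min_{b\in\mathbb{R}^{2d}} w(b) + Q_{\Lambda,d} + \delta.$$
As in the proof of Proposition 4.4, $F_\varepsilon(u_\varepsilon) \le F_\varepsilon(\tilde u_\varepsilon)$ implies
$$\limsup_{\varepsilon\to 0}\Bigl\{\tilde F_\varepsilon(v_\varepsilon) + \pi a_0\, d\,\omega(\varepsilon)\ln|\ln\varepsilon| - \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon|\Bigr\} \le \min_{b\in\mathbb{R}^{2d}} w(b) + Q_{\Lambda,d} + \delta. \tag{4.24}$$
Matching (4.23) with (4.24), we conclude that
$$\lim_{\varepsilon\to 0}\Bigl\{\tilde F_\varepsilon(v_\varepsilon) + \pi a_0\, d\,\omega(\varepsilon)\ln|\ln\varepsilon| - \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon|\Bigr\} = \min_{b\in\mathbb{R}^{2d}} w(b) + Q_{\Lambda,d}$$
since $\delta$ is arbitrarily small. Coming back to (4.22), we are led to
$$\min_{b\in\mathbb{R}^{2d}} w(b) + Q_{\Lambda,d} - \limsup_{\varepsilon\to 0} w(\tilde x_1^\varepsilon, \ldots, \tilde x_d^\varepsilon) \ge Q_{\Lambda,d}$$
and therefore, $\lim_{\varepsilon\to 0} w(\tilde x_1^\varepsilon, \ldots, \tilde x_d^\varepsilon) = \min_{b\in\mathbb{R}^{2d}} w(b)$ which ends the proof.
Remark 4.8. In the case $d = 1$, the expansion of the energy takes the simpler form
$$\tilde F_\varepsilon(v_\varepsilon) = -\pi a_0\,\omega(\varepsilon)\ln|\ln\varepsilon| + Q_{\Lambda,1} + o(1)$$
and the renormalized energy $w(\cdot)$ reduces to $w(b) = (\pi a_0|b|_\Lambda^2)/(1+\Lambda^2)$. In particular, if $x^\varepsilon$ denotes the single vortex of $v_\varepsilon$, we have $\sqrt{\Omega}\, x^\varepsilon \to 0$ as $\varepsilon$ goes to 0.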
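For $d = 1$ the constant can be evaluated by direct substitution. The following is a sketch under two stated assumptions: that the constant of Proposition 4.7 reads $Q_{\Lambda,d} = \frac{\pi a_0}{2}(d^2-d)\ln(1+\Lambda^2) + \pi a_0 d \ln a_0 - \frac{\pi a_0}{2} d^2 \ln a_0 + a_0 d \gamma_0 - \pi a_0 d^2 \ell(\Lambda)$, and that $\ell(\Lambda)$ denotes the quantity defined in (A.2), whose symbol is not legible in the present text.

```latex
% d = 1: the interaction sum over i != j in w is empty and d^2 - d = 0.
\begin{align*}
Q_{\Lambda,1}
 &= \frac{\pi a_0}{2}\,(1-1)\,\ln(1+\Lambda^2) + \pi a_0\ln a_0
    - \frac{\pi a_0}{2}\ln a_0 + a_0\gamma_0 - \pi a_0\,\ell(\Lambda) \\
 &= \frac{\pi a_0}{2}\ln a_0 + a_0\gamma_0 - \pi a_0\,\ell(\Lambda),
\end{align*}
% and the renormalized energy is a positive definite quadratic form:
\[
w(b) = \frac{\pi a_0}{1+\Lambda^2}\,|b|_\Lambda^2 \;\ge\; 0,
\qquad
\min_{b\in\mathbb{R}^2} w(b) = w(0) = 0 .
\]
```

Since the minimum of $w$ is attained only at $b = 0$, convergence of $w(\sqrt{\Omega}\,x^\varepsilon)$ to the minimum forces $\sqrt{\Omega}\,x^\varepsilon \to 0$, as stated in the remark.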
5. Upper Bound of the Energy
Here, we give the construction of the test functions used in the previous sections. The difficulties are twofold: the mass constraint we have to take into account and the vanishing property of the function $a(x)$ on the boundary of $D$. Hence, the classical methods cannot be applied directly. Concerning the mass constraint, we simply renormalize a suitable trial function. This procedure requires a high precision in the energy estimates and an almost optimal choice of the preliminary trial function. To overcome the degeneracy problem induced by the function $a(x)$, we proceed by upper approximation of $a(x)$. In the sequel, we assume that (1.7) holds. Using notation (4.3), the result can be stated as follows:
Theorem 5.1. Let $d \ge 1$ be an integer. For any $\delta > 0$, there exists $(\tilde u_\varepsilon)_{\varepsilon>0} \subset \mathcal{H}$ verifying $\|\tilde u_\varepsilon\|_{L^2(\mathbb{R}^2)} = 1$ and
$$\limsup_{\varepsilon\to 0}\Bigl\{F_\varepsilon(\tilde u_\varepsilon) - F_\varepsilon(\tilde\eta_\varepsilon e^{i\Omega S}) + \pi a_0\,\omega(\varepsilon)\, d\ln|\ln\varepsilon| - \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon|\Bigr\} \le \min_{b\in\mathbb{R}^{2d}} w(b) + Q_{\Lambda,d} + \delta,$$
where the constant $Q_{\Lambda,d}$ is defined in Proposition 4.7.
As mentioned above, the proof of Theorem 5.1 is based on a first construction which is given by the following proposition. Here, some of the main ingredients are taken from a previous construction due to André and Shafrir [5].
Proposition 5.2. Let $d \ge 1$ be an integer. For any $\delta > 0$, there exists $(\hat v_\varepsilon)_{\varepsilon>0}$ such that $\tilde\eta_\varepsilon\hat v_\varepsilon \in \mathcal{H}$ and
$$\limsup_{\varepsilon\to 0}\Bigl\{\tilde F_\varepsilon(\hat v_\varepsilon) + \pi a_0\,\omega(\varepsilon)\, d\ln|\ln\varepsilon| - \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon|\Bigr\} \le \min_{b\in\mathbb{R}^{2d}} w(b) + Q_{\Lambda,d} + \delta.$$
Proof. Step 1. Let $\sigma > 0$ and $\kappa > 0$ be two small parameters that we will choose later. We consider the function $a_\sigma : \bar D \to \mathbb{R}$ given by
$$a_\sigma(x) = \begin{cases} a(x) & \text{if } |x|_\Lambda \le \sqrt{a_0 - \sigma}, \\ -2\sqrt{a_0 - \sigma}\,|x|_\Lambda + 2a_0 - \sigma & \text{otherwise.} \end{cases}$$
It turns out that $a_\sigma \in C^1(\bar D)$, $a_\sigma \ge a$ and $a_\sigma \ge C\sigma^2$ in $\bar D$ for some positive constant $C$. Since $a_\sigma$ does not vanish in $\bar D$, we may define $\Phi_\sigma : D \to \mathbb{R}$ the solution of the equation
$$\begin{cases} \mathrm{div}\Bigl(\dfrac{1}{a_\sigma}\nabla\Phi_\sigma\Bigr) = 2\pi d\,\delta_0 & \text{in } D, \\[1mm] \Phi_\sigma = 0 & \text{on } \partial D. \end{cases} \tag{5.1}$$
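The three properties claimed for the regularized profile $a_\sigma$ in Step 1 can be verified by hand. This is a sketch under the assumptions (not restated in this section) that $a(x) = a_0 - |x|_\Lambda^2$ and $D = \{|x|_\Lambda < \sqrt{a_0}\}$; the shorthand $s = |x|_\Lambda$, $s_0 = \sqrt{a_0 - \sigma}$ and $L(s)$ for the affine branch is introduced here for illustration.

```latex
% Affine branch: L(s) = -2 s_0 s + 2 a_0 - sigma;  parabolic branch: a = a_0 - s^2.
% C^1 matching at s = s_0 (note s_0^2 = a_0 - sigma):
\[
L(s_0) = -2(a_0-\sigma) + 2a_0 - \sigma = \sigma = a_0 - s_0^2,
\qquad
L'(s_0) = -2 s_0 = \frac{d}{ds}\bigl(a_0 - s^2\bigr)\Big|_{s=s_0}.
\]
% a_sigma >= a: L is the tangent line of the concave parabola at s_0,
\[
L(s) - (a_0 - s^2) = s^2 - 2 s_0 s + s_0^2 = (s - s_0)^2 \;\ge\; 0 .
\]
% Lower bound: a_sigma is decreasing in s, so its minimum on the closure of D is at s = sqrt(a_0):
\[
L(\sqrt{a_0}) = 2a_0 - \sigma - 2\sqrt{a_0}\,s_0 = \bigl(\sqrt{a_0} - s_0\bigr)^2
 = \Bigl(\frac{\sigma}{\sqrt{a_0} + s_0}\Bigr)^{2} \;\ge\; \frac{\sigma^2}{4a_0}.
\]
```

So one admissible constant in $a_\sigma \ge C\sigma^2$ is $C = 1/(4a_0)$ under these assumptions.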
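By the results in [8, Chap. I], we may find a map $v_0^\sigma \in C^2(\bar D\setminus\{0\}, S^1)$ satisfying
$$v_0^\sigma \wedge \nabla v_0^\sigma = \frac{1}{a_\sigma}\nabla^\perp\Phi_\sigma \quad\text{in } D\setminus\{0\}. \tag{5.2}$$
Set $\Theta_{\kappa,\varepsilon} = D\setminus B(0, \kappa^{-1}\Omega^{-1/2})$. By (5.1) and (5.2), we have for $\varepsilon$ small enough,
$$\int_{\Theta_{\kappa,\varepsilon}} a_\sigma|\nabla v_0^\sigma|^2 = \int_{\Theta_{\kappa,\varepsilon}} \frac{1}{a_\sigma}|\nabla\Phi_\sigma|^2 = -\int_{\partial B(0,\kappa^{-1}\Omega^{-1/2})} \frac{1}{a}\frac{\partial\Phi_\sigma}{\partial\nu}\,\Phi_\sigma = -a_0^2 d^2\int_{\partial B(0,\kappa^{-1}\Omega^{-1/2})} \frac{1}{a}\Bigl(\frac{\partial\Psi_\sigma}{\partial\nu} + \frac{1}{|x|}\Bigr)\bigl(\Psi_\sigma + \ln|x|\bigr), \tag{5.3}$$
where $\Psi_\sigma(x) = (a_0 d)^{-1}\Phi_\sigma(x) - \ln|x|$. Notice that $\Psi_\sigma \in C^{1,\alpha}(\bar D)$ for any $0 < \alpha < 1$, since it satisfies the equation
$$\begin{cases} \mathrm{div}\Bigl(\dfrac{1}{a_\sigma}\nabla\Psi_\sigma\Bigr) = f_\sigma(x) & \text{in } D, \\[1mm] \Psi_\sigma = -\ln|x| & \text{on } \partial D \end{cases} \tag{5.4}$$
with
$$f_\sigma(x) = -\nabla\Bigl(\frac{1}{a_\sigma(x)}\Bigr)\cdot\frac{x}{|x|^2} = \begin{cases} \dfrac{-2|x|_\Lambda^2}{a^2(x)|x|^2} & \text{if } |x|_\Lambda \le \sqrt{a_0 - \sigma}, \\[2mm] \dfrac{-2\sqrt{a_0 - \sigma}\,|x|_\Lambda}{a_\sigma^2(x)|x|^2} & \text{otherwise.} \end{cases}$$
From (5.3), we derive that
$$\limsup_{\varepsilon\to 0}\Bigl\{\frac{1}{2}\int_{\Theta_{\kappa,\varepsilon}} a|\nabla v_0^\sigma|^2 - \pi a_0 d^2\ln(\kappa\Omega^{1/2})\Bigr\} \le \lim_{\varepsilon\to 0}\Bigl\{\frac{1}{2}\int_{\Theta_{\kappa,\varepsilon}} a_\sigma|\nabla v_0^\sigma|^2 - \pi a_0 d^2\ln(\kappa\Omega^{1/2})\Bigr\} \le -\pi a_0 d^2\,\Psi_\sigma(0).$$
By Lemma A.1 in Appendix A, $\Psi_\sigma(0) \to \ell(\Lambda)$ as $\sigma \to 0$ where the constant $\ell(\Lambda)$ is defined in (A.2). Consequently, we may choose $\sigma$ small such that
$$\limsup_{\varepsilon\to 0}\Bigl\{\frac{1}{2}\int_{\Theta_{\kappa,\varepsilon}} a|\nabla v_0^\sigma|^2 - \pi a_0 d^2\ln(\kappa\Omega^{1/2})\Bigr\} \le -\pi a_0 d^2\,\ell(\Lambda) + \frac{\delta}{2}. \tag{5.5}$$
In $\mathbb{R}^2\setminus B(0, \kappa^{-1}\Omega^{-1/2})$, we define
$$\hat v_\varepsilon(x) = \begin{cases} v_0^\sigma(x) & \text{if } x \in \Theta_{\kappa,\varepsilon}, \\[1mm] v_0^\sigma\Bigl(\dfrac{\sqrt{a_0}\, x}{|x|_\Lambda}\Bigr) & \text{if } x \in \mathbb{R}^2\setminus D. \end{cases}$$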
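By [14, Proposition 2.2], we have $\|\tilde\eta_\varepsilon^2\|_{L^\infty(\mathbb{R}^2\setminus D_\varepsilon)} = o(1)$. Since $\hat v_\varepsilon$ does not depend on $\varepsilon$ in $\mathbb{R}^2\setminus D_\varepsilon$ and $|\hat v_\varepsilon| = 1$ in $\mathbb{R}^2\setminus D_\varepsilon$, we derive that
$$\lim_{\varepsilon\to 0} \tilde E_\varepsilon(\hat v_\varepsilon, \mathbb{R}^2\setminus D_\varepsilon) = 0. \tag{5.6}$$
From [14, Proposition 2.2], we also know that
$$\Bigl\|\frac{a - \tilde\eta_\varepsilon^2}{\tilde\eta_\varepsilon^2}\Bigr\|_{L^\infty(D_\varepsilon)} \le C\varepsilon^{1/3} \tag{5.7}$$
and hence, (5.5) remains valid if one replaces $a$ by $\tilde\eta_\varepsilon^2$ in the left-hand side. Since $v_0^\sigma$ is $S^1$-valued, we deduce that
$$\limsup_{\varepsilon\to 0}\Bigl\{\tilde E_\varepsilon\bigl(\hat v_\varepsilon, \mathbb{R}^2\setminus B(0, \kappa^{-1}\Omega^{-1/2})\bigr) - \pi a_0 d^2\ln(\kappa\Omega^{1/2})\Bigr\} \le -\pi a_0 d^2\,\ell(\Lambda) + \frac{\delta}{2}. \tag{5.8}$$
Step 2. We are going to extend $\hat v_\varepsilon$ to $B(0, \kappa^{-1}\Omega^{-1/2})$. As in [8], we may write in a neighborhood of 0 (using polar coordinates),
$$v_0^\sigma(x) = \exp i\bigl(d\theta + \psi_\sigma(x)\bigr),$$
where $\psi_\sigma$ is a smooth function in that neighborhood. Let $(b_1, \ldots, b_d) \in \mathbb{R}^{2d}$ be a minimizing configuration for $w(\cdot)$, i.e.
$$w(b_1, \ldots, b_d) = \min_{b\in\mathbb{R}^{2d}} w(b) \tag{5.9}$$
(note that we necessarily have $b_i \neq b_j$ for $i \neq j$). We choose $\kappa$ sufficiently small such that $\max|b_j| \le 1/4\kappa$ and we set $b_j^{(\varepsilon)} = \Omega^{-1/2} b_j$. Following the proof of [5, Lemma 2.6], we write
$$e^{i\psi_\sigma(0)}\prod_{j=1}^{d} \frac{x - b_j^{(\varepsilon)}}{|x - b_j^{(\varepsilon)}|} = \exp i\bigl(d\theta + \phi_\varepsilon(x)\bigr) \quad\text{for } x \in A_{\kappa,\varepsilon} = B(0, \kappa^{-1}\Omega^{-1/2})\setminus B(0, (2\kappa)^{-1}\Omega^{-1/2}),$$
where $\phi_\varepsilon$ is a smooth function satisfying $|\nabla\phi_\varepsilon(x)| \le C_\sigma\kappa^2\Omega^{1/2}$ and $|\phi_\varepsilon(x) - \psi_\sigma(0)| \le C_\sigma\kappa^2$ for $x \in A_{\kappa,\varepsilon}$. We define in $A_{\kappa,\varepsilon}$,
$$\hat v_\varepsilon(x) = \exp i\bigl(d\theta + \hat\psi_\varepsilon(x)\bigr)$$
with
$$\hat\psi_\varepsilon(x) = \bigl(2 - 2\kappa\Omega^{1/2}|x|\bigr)\phi_\varepsilon(x) + \bigl(2\kappa\Omega^{1/2}|x| - 1\bigr)\psi_\sigma(x).$$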
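As in [5], we get that (using (5.7)),
$$\limsup_{\varepsilon\to 0}\bigl\{\tilde E_\varepsilon(\hat v_\varepsilon, A_{\kappa,\varepsilon}) - \pi a_0 d^2\ln 2\bigr\} \le \limsup_{\varepsilon\to 0}\Bigl\{\frac{1}{2}\int_{A_{\kappa,\varepsilon}} a_\sigma|\nabla\hat v_\varepsilon|^2 - \pi a_0 d^2\ln 2\Bigr\} \le C_\sigma\kappa^2. \tag{5.10}$$
Next, we define $\hat v_\varepsilon$ in $\Xi_{\kappa,\varepsilon} = B(0, (2\kappa)^{-1}\Omega^{-1/2})\setminus\cup_{j=1}^{d} B(b_j^{(\varepsilon)}, 2\kappa\Omega^{-1/2})$ by
$$\hat v_\varepsilon(x) = e^{i\psi_\sigma(0)}\prod_{j=1}^{d} \frac{x - b_j^{(\varepsilon)}}{|x - b_j^{(\varepsilon)}|}.$$
Once more as in [5], we have (using (5.7)),
$$\limsup_{\varepsilon\to 0} \tilde E_\varepsilon(\hat v_\varepsilon, \Xi_{\kappa,\varepsilon}) \le \limsup_{\varepsilon\to 0} \frac{1}{2}\int_{\Xi_{\kappa,\varepsilon}} a_\sigma|\nabla\hat v_\varepsilon|^2 \le \pi a_0(d^2 + d)\ln\frac{1}{2\kappa} - \pi a_0\sum_{i\neq j} \ln|b_i - b_j| + C_\sigma\kappa. \tag{5.11}$$
Finally, in each $B_j^{(\varepsilon)} := B(b_j^{(\varepsilon)}, 2\kappa\Omega^{-1/2})$, we set
$$\hat v_\varepsilon(x) = e^{i\psi_\sigma(0)}\,\tilde w_\varepsilon^j\Bigl(\frac{x - b_j^{(\varepsilon)}}{2\kappa\Omega^{-1/2}}\Bigr), \tag{5.12}$$
where $\tilde w_\varepsilon^j$ realizes
$$\min\Bigl\{\frac{1}{2}\int_{B(0,1)} |\nabla v|^2 + \frac{1}{2\hat\varepsilon^2}\bigl(1 - |v|^2\bigr)^2,\ \ v(y) = \prod_{i=1}^{d} \frac{2\kappa y + b_j - b_i}{|2\kappa y + b_j - b_i|} \text{ on } \partial B(0,1)\Bigr\} \tag{5.13}$$
with
$$\hat\varepsilon = \frac{\varepsilon}{2\kappa\sqrt{a_0}\,\Omega^{-1/2}}.$$
As in the proof of [5, Lemma 2.3], we derive
$$\lim_{\varepsilon\to 0}\Bigl\{\frac{1}{2}\int_{B(0,1)} |\nabla\tilde w_\varepsilon^j|^2 + \frac{1}{2\hat\varepsilon^2}\bigl(1 - |\tilde w_\varepsilon^j|^2\bigr)^2 - \pi|\ln\hat\varepsilon|\Bigr\} = \gamma_0 + X(\kappa),$$
where $\gamma_0$ is defined in (4.15) and $X(\kappa)$ denotes a quantity satisfying $X(\kappa) \to 0$ as $\kappa \to 0$. By scaling, we obtain
$$\lim_{\varepsilon\to 0}\Bigl\{\frac{1}{2}\int_{B_j^{(\varepsilon)}} |\nabla\hat v_\varepsilon|^2 + \frac{a_0}{2\varepsilon^2}\bigl(1 - |\hat v_\varepsilon|^2\bigr)^2 - \pi\ln\frac{2\kappa\Omega^{-1/2}}{\varepsilon}\Bigr\} = \frac{\pi}{2}\ln a_0 + \gamma_0 + X(\kappa).$$
Notice that in $B_j^{(\varepsilon)}$,
$$a_\sigma(x) = a(x) \le a_0 - \bigl(|\ln\varepsilon| + \omega_1\ln|\ln\varepsilon|\bigr)^{-1}\min_{y\in B(b_j,2\kappa)} \frac{a_0|y|_\Lambda^2}{1+\Lambda^2}$$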
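and consequently,
$$\limsup_{\varepsilon\to 0}\Bigl\{\frac{1}{2}\int_{B_j^{(\varepsilon)}} a_\sigma|\nabla\hat v_\varepsilon|^2 + \frac{a_0 a_\sigma}{2\varepsilon^2}\bigl(1 - |\hat v_\varepsilon|^2\bigr)^2 - \pi a_0\ln\frac{2\kappa\Omega^{-1/2}}{\varepsilon}\Bigr\} \le \frac{\pi a_0}{2}\ln a_0 + a_0\gamma_0 - \frac{\pi a_0|b_j|_\Lambda^2}{1+\Lambda^2} + X(\kappa).$$
By (5.7), it yields
$$\limsup_{\varepsilon\to 0}\Bigl\{\tilde E_\varepsilon(\hat v_\varepsilon, B_j^{(\varepsilon)}) - \pi a_0\ln\frac{2\kappa\Omega^{-1/2}}{\varepsilon}\Bigr\} \le \frac{\pi a_0}{2}\ln a_0 + a_0\gamma_0 - \frac{\pi a_0|b_j|_\Lambda^2}{1+\Lambda^2} + X(\kappa). \tag{5.14}$$
Combining (5.8), (5.10), (5.11) and (5.14), we conclude that for $\kappa$ small enough,
$$\limsup_{\varepsilon\to 0}\Bigl\{\tilde E_\varepsilon(\hat v_\varepsilon) - \pi a_0 d|\ln\varepsilon| - \frac{\pi a_0}{2}(d^2 - d)\ln|\ln\varepsilon|\Bigr\} \le -\pi a_0\sum_{i\neq j} \ln|b_i - b_j| - \frac{\pi a_0}{1+\Lambda^2}\sum_{j=1}^{d} |b_j|_\Lambda^2 + Q_{\Lambda,d} + \delta. \tag{5.15}$$
Step 3. Now, it remains to estimate $\tilde R_\varepsilon(\hat v_\varepsilon)$. The Cauchy–Schwarz inequality yields
$$|\tilde R_\varepsilon(\hat v_\varepsilon, \mathbb{R}^2\setminus D_\varepsilon)| \le C\Bigl(\Omega^2\int_{\mathbb{R}^2\setminus D_\varepsilon} |x|^2\tilde\eta_\varepsilon^2\Bigr)^{1/2}\bigl(\tilde E_\varepsilon(\hat v_\varepsilon, \mathbb{R}^2\setminus D_\varepsilon)\bigr)^{1/2}. \tag{5.16}$$
By [14, Proposition 2.2], $\Omega^2\int_{\mathbb{R}^2\setminus D_\varepsilon} |x|^2\tilde\eta_\varepsilon^2 \to 0$ as $\varepsilon \to 0$ and according to (5.6), it leads to
$$\lim_{\varepsilon\to 0}\bigl(\tilde R_\varepsilon(\hat v_\varepsilon) - \tilde R_\varepsilon(\hat v_\varepsilon, D_\varepsilon)\bigr) = 0. \tag{5.17}$$
By the results in [8, Chap. IX], for $\hat\varepsilon$ sufficiently small and each $j = 1, \ldots, d$, there exists exactly one disc $\hat D_\varepsilon^j \subset B(0,1)$ with $\mathrm{diam}(\hat D_\varepsilon^j) \le C\hat\varepsilon$ such that $|\tilde w_\varepsilon^j| \ge 1/2$ in $B(0,1)\setminus\hat D_\varepsilon^j$. By scaling, we infer that there exist exactly $d$ discs $D_\varepsilon^1, \ldots, D_\varepsilon^d$ with $D_\varepsilon^j \subset B_j^{(\varepsilon)}$ and $\mathrm{diam}(D_\varepsilon^j) \le C\varepsilon$ such that
$$|\hat v_\varepsilon| \ge \frac{1}{2} \quad\text{in } D_\varepsilon\setminus\bigcup_{j=1}^{d} D_\varepsilon^j.$$
We derive from (5.14) that
$$\Bigl|\tilde R_\varepsilon\Bigl(\hat v_\varepsilon, \bigcup_{j=1}^{d} D_\varepsilon^j\Bigr)\Bigr| \le C\Omega\varepsilon\sum_{j=1}^{d}\bigl(\tilde E_\varepsilon(\hat v_\varepsilon, B_j^{(\varepsilon)})\bigr)^{1/2} \underset{\varepsilon\to 0}{\longrightarrow} 0,$$
and by (5.17), it leads to $\lim_{\varepsilon\to 0}\bigl(\tilde R_\varepsilon(\hat v_\varepsilon) - \tilde R_\varepsilon(\hat v_\varepsilon, D_\varepsilon\setminus\cup_{j=1}^{d} D_\varepsilon^j)\bigr) = 0$. From (5.7), we infer that
$$\lim_{\varepsilon\to 0}\Bigl(\tilde R_\varepsilon\Bigl(\hat v_\varepsilon, D_\varepsilon\setminus\bigcup_{j=1}^{d} D_\varepsilon^j\Bigr) - R_\varepsilon\Bigl(\hat v_\varepsilon, D_\varepsilon\setminus\bigcup_{j=1}^{d} D_\varepsilon^j\Bigr)\Bigr) = 0$$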
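and hence,
$$\lim_{\varepsilon\to 0}\Bigl(\tilde R_\varepsilon(\hat v_\varepsilon) - R_\varepsilon\Bigl(\hat v_\varepsilon, D_\varepsilon\setminus\bigcup_{j=1}^{d} D_\varepsilon^j\Bigr)\Bigr) = 0. \tag{5.18}$$
To compute $R_\varepsilon(\hat v_\varepsilon, D_\varepsilon\setminus\cup_{j=1}^{d} D_\varepsilon^j)$, we proceed as in [14, Proposition 4.2] (here, we use that $\tilde E_\varepsilon(\hat v_\varepsilon) \le C|\ln\varepsilon|$ by (5.15)). It yields
$$\lim_{\varepsilon\to 0}\Bigl\{R_\varepsilon\Bigl(\hat v_\varepsilon, D_\varepsilon\setminus\bigcup_{j=1}^{d} D_\varepsilon^j\Bigr) + \frac{\pi\Omega}{1+\Lambda^2}\sum_{j=1}^{d} a^2(b_j^{(\varepsilon)})\Bigr\} = 0$$
since $\deg(\hat v_\varepsilon/|\hat v_\varepsilon|, \partial D_\varepsilon^j) = +1$ for $j = 1, \ldots, d$. Expanding $a^2(b_j^{(\varepsilon)})$ and $\Omega$, we deduce from (5.18) that
$$\lim_{\varepsilon\to 0}\bigl\{\tilde R_\varepsilon(\hat v_\varepsilon) + \pi a_0 d|\ln\varepsilon| + \pi a_0\,\omega(\varepsilon)\, d\ln|\ln\varepsilon|\bigr\} = \frac{2\pi a_0}{1+\Lambda^2}\sum_{j=1}^{d} |b_j|_\Lambda^2. \tag{5.19}$$
Combining (5.9), (5.15) and (5.19), we obtain the announced result.
Proof of Theorem 5.1. We consider the map $\hat v_\varepsilon$ given in Proposition 5.2 and we set
$$\tilde v_\varepsilon = m_\varepsilon^{-1}\hat v_\varepsilon \quad\text{and}\quad \tilde u_\varepsilon = \tilde\eta_\varepsilon e^{i\Omega S}\tilde v_\varepsilon \quad\text{with } m_\varepsilon = \|\tilde\eta_\varepsilon\hat v_\varepsilon\|_{L^2(\mathbb{R}^2)}.$$
We are going to prove that the map $\tilde u_\varepsilon$ satisfies the required property. By [14, Lemma 3.2], we have
$$F_\varepsilon(\tilde u_\varepsilon) = F_\varepsilon(\tilde\eta_\varepsilon e^{i\Omega S}) + \tilde F_\varepsilon(\tilde v_\varepsilon) + \tilde T_\varepsilon(\tilde v_\varepsilon).$$
In view of Proposition 5.2, it suffices to prove that $\tilde F_\varepsilon(\tilde v_\varepsilon) - \tilde F_\varepsilon(\hat v_\varepsilon) \to 0$ and $\tilde T_\varepsilon(\tilde v_\varepsilon) \to 0$ as $\varepsilon \to 0$. We first estimate $m_\varepsilon$. Since $|\hat v_\varepsilon| = 1$ in $\mathbb{R}^2\setminus\cup_{j=1}^{d} B_j^{(\varepsilon)}$ and $\|\tilde\eta_\varepsilon\|_{L^2(\mathbb{R}^2)} = 1$, we have
$$m_\varepsilon^2 = \int_{\mathbb{R}^2} \tilde\eta_\varepsilon^2 + \int_{\cup_{j=1}^{d} B_j^{(\varepsilon)}} \tilde\eta_\varepsilon^2\bigl(|\hat v_\varepsilon|^2 - 1\bigr) = 1 + \int_{\cup_{j=1}^{d} B_j^{(\varepsilon)}} \tilde\eta_\varepsilon^2\bigl(|\hat v_\varepsilon|^2 - 1\bigr).$$
Using the Cauchy–Schwarz inequality, we derive from (5.12), (5.13) and [8, Theorem III.2] that
$$\Bigl|\int_{\cup_{j=1}^{d} B_j^{(\varepsilon)}} \tilde\eta_\varepsilon^2\bigl(|\hat v_\varepsilon|^2 - 1\bigr)\Bigr| \le C|\ln\varepsilon|^{-1/2}\Bigl(\int_{\cup_{j=1}^{d} B_j^{(\varepsilon)}} \bigl(|\hat v_\varepsilon|^2 - 1\bigr)^2\Bigr)^{1/2} \le C\varepsilon|\ln\varepsilon|^{-1/2} \tag{5.20}$$
and thus
$$m_\varepsilon^2 = 1 + O(\varepsilon|\ln\varepsilon|^{-1/2}). \tag{5.21}$$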
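Using $|\hat v_\varepsilon| = 1$ in $\mathbb{R}^2\setminus\cup_{j=1}^d B^{(\varepsilon)}_j$, $|\nabla S| \le C|x|$, $|k_\varepsilon| \le C|\ln\varepsilon|$, (5.20) and (5.21), we derive that
\[ \bigl|\tilde T_\varepsilon(\tilde v_\varepsilon)\bigr| \le C|\ln\varepsilon|^2\Bigl(|1 - m_\varepsilon^{-2}|\int_{\mathbb{R}^2}(1 + |x|^2)\,\tilde\eta_\varepsilon^2 + \int_{\cup_{j=1}^d B^{(\varepsilon)}_j}\tilde\eta_\varepsilon^2\bigl[\,|1 - m_\varepsilon^{-2}|\,|\hat v_\varepsilon|^2 + (1 - |\hat v_\varepsilon|^2)\bigr]\Bigr) \le C\varepsilon|\ln\varepsilon|^{3/2}. \]
Now, we may estimate using (5.15), (5.19) and (5.21),
\[ \int_{\mathbb{R}^2}\tilde\eta_\varepsilon^2|\nabla\tilde v_\varepsilon|^2 = m_\varepsilon^{-2}\int_{\mathbb{R}^2}\tilde\eta_\varepsilon^2|\nabla\hat v_\varepsilon|^2 = \int_{\mathbb{R}^2}\tilde\eta_\varepsilon^2|\nabla\hat v_\varepsilon|^2 + O(\varepsilon|\ln\varepsilon|^{1/2}), \tag{5.22} \]
and
\[ \tilde R_\varepsilon(\tilde v_\varepsilon) = m_\varepsilon^{-2}\tilde R_\varepsilon(\hat v_\varepsilon) = \tilde R_\varepsilon(\hat v_\varepsilon) + O(\varepsilon|\ln\varepsilon|^{1/2}). \tag{5.23} \]
We write
\[ \frac{1}{\varepsilon^2}\int_{\mathbb{R}^2}\tilde\eta_\varepsilon^4(1 - |\tilde v_\varepsilon|^2)^2 = \frac{1}{\varepsilon^2}\int_{\mathbb{R}^2}\tilde\eta_\varepsilon^4(1 - |\hat v_\varepsilon|^2)^2 + \frac{2(1 - m_\varepsilon^{-2})}{\varepsilon^2}\int_{\cup_{j=1}^d B^{(\varepsilon)}_j}\tilde\eta_\varepsilon^4(1 - |\hat v_\varepsilon|^2)|\hat v_\varepsilon|^2 + \frac{(1 - m_\varepsilon^{-2})^2}{\varepsilon^2}\int_{\mathbb{R}^2}\tilde\eta_\varepsilon^4|\hat v_\varepsilon|^4 . \tag{5.24} \]
We infer from (5.15) and (5.21) that
\[ \frac{(1 - m_\varepsilon^{-2})^2}{\varepsilon^2}\int_{\mathbb{R}^2}\tilde\eta_\varepsilon^4|\hat v_\varepsilon|^4 \le C|\ln\varepsilon|^{-1}, \tag{5.25} \]
and from (5.20) and (5.21),
\[ \frac{|1 - m_\varepsilon^{-2}|}{\varepsilon^2}\int_{\cup_{j=1}^d B^{(\varepsilon)}_j}\tilde\eta_\varepsilon^4|\hat v_\varepsilon|^2\,\bigl|1 - |\hat v_\varepsilon|^2\bigr| \le C|\ln\varepsilon|^{-1}. \tag{5.26} \]
Combining (5.22)–(5.26), we finally obtain that $\tilde F_\varepsilon(\tilde v_\varepsilon) = \tilde F_\varepsilon(\hat v_\varepsilon) + o(1)$ and the proof is complete.

Acknowledgments
We express our gratitude to A. Aftalion, who suggested this problem to us, for his very helpful suggestions and comments. We also thank E. Sandier and I. Shafrir for very interesting discussions, and H. Brezis for his hearty encouragement and constant support. The research of the authors was partially supported by the RTN Program "Fronts-Singularities" of the European Commission, HPRN-CT-2002-00274.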
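Appendix A
In this appendix, we prove that the functions $\Psi_R$ and $\Psi_\sigma$ defined by (4.20) and, respectively, (5.4) converge to the same limiting function as $R \to \sqrt{a_0}$ and $\sigma \to 0$. The proof is based on the construction of suitable barrier functions.
Lemma A.1. For any $0 < R < \sqrt{a_0}$, respectively, any $\sigma > 0$, let $\Psi_R$ be the solution of Eq. (4.20), respectively, $\Psi_\sigma$ the solution of (5.4). Then, $\Psi_R \to \Psi$ as $R \to \sqrt{a_0}$, respectively, $\Psi_\sigma \to \Psi$ as $\sigma \to 0$, in $C^1_{\mathrm{loc}}(D)$, where $\Psi$ is the unique solution in $C^0(\bar D)$ of
\[ \begin{cases} \operatorname{div}\Bigl(\dfrac{1}{a}\nabla\Psi\Bigr) = -\dfrac{2|x|^2_\Lambda}{a^2(x)|x|^2} & \text{in } D,\\[2mm] \Psi = -\ln|x| & \text{on } \partial D. \end{cases} \tag{A.1} \]
In particular,
\[ \lim_{R\to\sqrt{a_0}}\Psi_R(0) = \lim_{\sigma\to0}\Psi_\sigma(0) = \Psi(0) =: \ell(\Lambda). \tag{A.2} \]
Proof. Step 1. Uniqueness of $\Psi$. Assume that (A.1) admits two solutions $\Psi_1$ and $\Psi_2$ in $C^0(\bar D)$. Then, the difference $\Psi_1 - \Psi_2$ satisfies $\operatorname{div}(\frac1a\nabla(\Psi_1 - \Psi_2)) = 0$ in $D$ and $\Psi_1 - \Psi_2 = 0$ on $\partial D$. By elliptic regularity, we infer that $\Psi_1 - \Psi_2 \in C^2(D)\cap C^0(\bar D)$. Hence, it follows that $\Psi_1 - \Psi_2 \equiv 0$ by the classical maximum principle.
Step 2. Existence of $\Psi$. We set for $y \in D$,
\[ \Upsilon_R(y) = \Psi_R\Bigl(\frac{Ry}{\sqrt{a_0}}\Bigr) - \zeta(y) + \ln(R/\sqrt{a_0}), \]
where $\zeta$ is the solution of
\[ \begin{cases} \Delta\zeta = 0 & \text{in } D,\\ \zeta = -\ln|y| & \text{on } \partial D. \end{cases} \]
Since $\Psi_R$ solves (4.20), we deduce that $\Upsilon_R$ is the unique solution of
\[ \begin{cases} -\operatorname{div}\Bigl(\dfrac{1}{a_R(y)}\nabla\Upsilon_R\Bigr) = \dfrac{f(y)}{a_R^2(y)} & \text{in } D,\\[2mm] \Upsilon_R = 0 & \text{on } \partial D, \end{cases} \tag{A.3} \]
where $a_R(y) = a_0^2/R^2 - |y|^2_\Lambda$ and
\[ f(y) = \frac{2|y|^2_\Lambda}{|y|^2} + 2(y_1, \Lambda^2 y_2)\cdot\nabla\zeta(y). \]
We easily check that $y \mapsto K a_R(y)$, respectively, $y \mapsto -K a_R(y)$, defines a supersolution, respectively, a subsolution, of (A.3) whenever the constant $K$ satisfies $K \ge \|f\|_{L^\infty(D)}/(\Lambda^2 a_0)$. Hence,
\[ |\Upsilon_R| \le C a_R \quad\text{in } D \tag{A.4} \]
for a constant $C$ independent of $R$. By elliptic regularity, we deduce that $\Upsilon_R$ remains bounded in $W^{2,p}_{\mathrm{loc}}(D)$ as $R \to \sqrt{a_0}$ for any $1 \le p < \infty$. Therefore, from any sequence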
$R_n \to \sqrt{a_0}$, we may extract a subsequence, still denoted by $(R_n)$, such that $\Upsilon_{R_n} \to \Upsilon$ in $C^1_{\mathrm{loc}}(D)$ where $\Upsilon$ satisfies
\[ -\operatorname{div}\Bigl(\frac{1}{a(y)}\nabla\Upsilon\Bigr) = \frac{f}{a^2(y)} \quad\text{in } D. \]
We infer from (A.4) that $|\Upsilon(y)| \le C a(y)$ for any $y \in D$ and hence, $\Upsilon \in C^0(\bar D)$ with $\Upsilon|_{\partial D} = 0$. Consequently, the function $\Psi := \Upsilon + \zeta$ defines a solution of (A.1) which is continuous in $\bar D$.
Step 3. By the uniqueness of $\Psi$, we have that $\Upsilon_R \to \Psi - \zeta$ in $C^1_{\mathrm{loc}}(D)$ as $R \to \sqrt{a_0}$, which clearly implies $\Psi_R \to \Psi$ in $C^1_{\mathrm{loc}}(D)$ as $R \to \sqrt{a_0}$. To prove that $\Psi_\sigma \to \Psi$ in $C^1_{\mathrm{loc}}(D)$ as $\sigma \to 0$, we may proceed as in Step 2. Indeed, we may show, as in Step 2, that $|\Psi_\sigma - \zeta| \le C a_\sigma$ in $D$ for a constant $C$ independent of $\sigma$.

References
[1] J. R. Abo-Shaeer, C. Raman, J. M. Vogels and W. Ketterle, Observation of vortex lattices in Bose–Einstein condensate, Science 292 (2001) 476–479.
[2] A. Aftalion and Q. Du, Vortices in a rotating Bose–Einstein condensate: Critical angular velocities and energy diagrams in the Thomas–Fermi regime, Phys. Rev. A 64 (2001).
[3] L. Almeida and F. Bethuel, Topological methods for the Ginzburg–Landau equations, J. Math. Pures Appl. 77 (1998) 1–49.
[4] N. André and I. Shafrir, Asymptotic behavior of minimizers for the Ginzburg–Landau functional with weight, I, Arch. Ration. Mech. Anal. 142 (1998) 45–73.
[5] N. André and I. Shafrir, Asymptotic behavior of minimizers for the Ginzburg–Landau functional with weight, II, Arch. Ration. Mech. Anal. 142 (1998) 75–98.
[6] A. Beaulieu and R. Hadiji, On a class of Ginzburg–Landau equations with weight, Panamer. Math. J. 5 (1995) 1–33.
[7] F. Bethuel, H. Brezis and F. Hélein, Asymptotics for the minimization of a Ginzburg–Landau functional, Calc. Var. Partial Differential Equations 1 (1993) 123–148.
[8] F. Bethuel, H. Brezis and F. Hélein, Ginzburg–Landau Vortices (Birkhäuser, 1993).
[9] F. Bethuel and T. Rivière, Vortices for a variational problem related to superconductivity, Ann. Inst. H. Poincaré Anal. Non Linéaire 12 (1995) 243–303.
[10] D. Butts and D. Rokhsar, Predicted signatures of rotating Bose–Einstein condensates, Nature 397 (1999) 327–329.
[11] Y. Castin and R. Dum, Bose–Einstein condensates with vortices in rotating traps, Eur. Phys. J. D 7 (1999) 399–412.
[12] S. Gueron and I. Shafrir, On a discrete variational problem involving interacting particles, SIAM J. Appl. Math. 60 (2000) 1–17.
[13] R. Ignat and V. Millot, Vortices in 2d rotating Bose–Einstein condensate, C. R. Acad. Sci. Paris Sér. I 340 (2005) 571–576.
[14] R. Ignat and V. Millot, The critical velocity for vortex existence in a two dimensional rotating Bose–Einstein condensate, J. Funct. Anal. 233 (2006) 260–306.
[15] L. Lassoued and P. Mironescu, Ginzburg–Landau type energy with discontinuous constraint, J. Anal. Math. 77 (1999) 1–26.
[16] K. Madison, F. Chevy, J. Dalibard and W. Wohlleben, Vortex formation in a stirred Bose–Einstein condensate, Phys. Rev. Lett. 84 (2000) 806–809.
[17] K. Madison, F. Chevy, J. Dalibard and W. Wohlleben, Vortices in a stirred Bose–Einstein condensate, J. Modern Opt. 47 (2000) 2715–2723.
[18] E. Sandier, Lower bounds for the energy of unit vector fields and applications, J. Funct. Anal. 152 (1998) 119–145.
[19] E. Sandier and S. Serfaty, A rigorous derivation of a free boundary problem arising in superconductivity, Ann. Sci. Ecole Norm. Sup. 33 (2000) 561–592.
[20] S. Serfaty, Local minimizers for the Ginzburg–Landau energy near critical magnetic field: Part I, Commun. Contemp. Math. 1 (1999) 213–254.
[21] S. Serfaty, Local minimizers for the Ginzburg–Landau energy near critical magnetic field: Part II, Commun. Contemp. Math. 1 (1999) 295–333.
[22] S. Serfaty, Stable configurations in superconductivity: Uniqueness, multiplicity, and vortex-nucleation, Arch. Ration. Mech. Anal. 149 (1999) 329–365.
[23] S. Serfaty, On a model of rotating superfluids, ESAIM Control Optim. Calc. Var. 6 (2001) 201–238.
[24] M. Struwe, On the asymptotic behavior of minimizers of the Ginzburg–Landau model in 2-dimensions, J. Diff. Int. Equations 7 (1994) 1617–1624; Erratum J. Diff. Int. Equations 8 (1995) 224.
May 2, 2006 15:57 WSPC/148-RMP
J070-00261
Reviews in Mathematical Physics Vol. 18, No. 2 (2006) 163–199 © World Scientific Publishing Company
A HOLOMORPHIC REPRESENTATION OF THE JACOBI ALGEBRA
STEFAN BERCEANU
National Institute for Physics and Nuclear Engineering, Department of Theoretical Physics, PO Box MG-6, Bucharest-Magurele, Romania
[email protected]
Received 8 September 2005
Revised 22 February 2006
A representation of the Jacobi algebra $\mathfrak{h}_1 \rtimes \mathfrak{su}(1,1)$ by first-order differential operators with polynomial coefficients on the manifold $\mathbb{C}\times\mathcal{D}_1$ is presented. The Hilbert space of holomorphic functions on which the holomorphic first-order differential operators with polynomial coefficients act is constructed.
Keywords: Coherent states; representations of coherent state Lie algebras; Jacobi group; first-order holomorphic differential operators with polynomial coefficients.
Mathematics Subject Classification 2000: 81R30, 32WXX, 12E10, 33C47, 32Q15, 81V80
Contents
1. Introduction
2. Coherent States: The General Setting
   2.1. Coherent state groups
   2.2. The symmetric Fock space
   2.3. Representation of CS-Lie algebras by differential operators
3. The Jacobi Algebra
4. The Differential Action
5. The Reproducing Kernel
6. The Group Action on the Base Manifold
   6.1. Formulas for the Heisenberg–Weyl group $H_1$ and $SU(1,1)$
   6.2. Holstein–Primakoff–Bogoliubov-type equations
   6.3. The action of the Jacobi group
7. The Symmetric Fock Space
   7.1. The Heisenberg–Weyl group
   7.2. The group $SU(1,1)$
   7.3. The Jacobi group
   7.4. The geometry of the manifold $\mathbb{C}\times\mathcal{D}_1$
8. Physical Applications: Classical and Quantum Equations of Motion
9. Comparison with Kähler–Berndt's Approach
10. Some More Comments
Appendix
1. Introduction
In this paper we deal with realizations of finite-dimensional Lie algebras by first-order differential operators on homogeneous spaces. Our method, first developed in [6], makes it possible to obtain the holomorphic differential action of the generators of a continuous unitary representation $\pi$ of a Lie group $G$ with Lie algebra $\mathfrak{g}$ on a homogeneous space $M = G/H$. We consider homogeneous manifolds realized as the Kähler coherent state (CS)-orbits obtained by the action of the representation $\pi$ on a fixed cyclic vector $e_0$ belonging to the complex separable Hilbert space $\mathfrak{H}$ of the representation [55]. We have applied our method to compact (non-compact) hermitian symmetric spaces in [7] (respectively, [8]) and we have produced simple formulas which show that the differential action of the generators of a hermitian group $G$ on holomorphic functions defined on the hermitian symmetric spaces $G/H$ can be written down as a sum of two terms, one a polynomial $P$, and the second one a sum of partial derivatives times some polynomials $Q$, the degree of the polynomials being less than 3. This is a generalization of the well-known realization [47] of the generators $J_{0,+,-}$ of the group $G = SU(2)$ (and similarly, for its non-compact dual $G = SU(1,1)$) on the homogeneous manifold $G/U(1)$ by the differential operators
\[ J_+ = -\frac{\partial}{\partial z}, \qquad J_- = -2jz + z^2\frac{\partial}{\partial z}, \qquad J_0 = j - z\frac{\partial}{\partial z}, \]
where the generators verify the commutation relations $[J_0, J_\pm] = \pm J_\pm$, $[J_-, J_+] = -2J_0$ and $J_0 e_0 = -j e_0$. In [10, 13] we have generalized the results of [7, 8] to Kähler CS-orbits of semisimple Lie groups. The differential action of the generators of the groups is of the same type as in the case of hermitian symmetric orbits, i.e. first-order differential operators with holomorphic polynomial coefficients, but the maximal degree of the polynomials is greater than 2. We have presented explicit formulas involving the Bernoulli numbers and the structure constants for semisimple Lie groups [10, 13].
The simplest example in which the maximum degree of the polynomials multiplying the derivative is already 3 was worked out in detail in [10, 13], where we have constructed CS on the non-symmetric space $M := SU(3)/S(U(1)\times U(1)\times U(1))$. Let us now recall the standard Segal–Bargmann–Fock [3] realization $a \mapsto \frac{\partial}{\partial z}$; $a^+ \mapsto z$ of the canonical commutation relations (CCR) $[a, a^+] = 1$ on the symmetric Fock space $\mathcal{F}_{\mathfrak{H}} := \Gamma^{\mathrm{hol}}(\mathbb{C}, \frac{i}{2\pi}\exp(-|z|^2)\,dz\wedge d\bar z)$ attached to the Hilbert space $\mathfrak{H} := L^2(\mathbb{R}, dx)$. The Segal–Bargmann–Fock realization can be considered as a representation by differential operators of the real three-dimensional Heisenberg algebra $\mathfrak{h}_1 \equiv \mathfrak{g}_{HW} = \langle is1 + za^+ - \bar z a\rangle_{s\in\mathbb{R};\,z\in\mathbb{C}}$ of the Heisenberg–Weyl group (HW) $H_1$, where $H_n$ denotes the $(2n+1)$-dimensional HW group. We can look at this construction from a group-theoretic point of view, considering the complex number
$z$ as a local coordinate on the homogeneous manifold $M := H_1/\mathbb{R} \cong \mathbb{C}$. Glauber [30] has attached field coherent states to the points of the manifold $M$. In the present paper, we are interested in representations of Lie algebras which are semi-direct sums of Heisenberg algebras and semisimple Lie algebras by first-order differential operators with holomorphic polynomial coefficients. The most appropriate framework for such an approach is furnished by the so-called CS-groups, i.e. groups which admit an orbit which is a complex submanifold of a projective Hilbert space [48, 52]. Indeed, such groups contain all compact groups, all simple hermitian groups, certain solvable groups and also some mixed groups such as the semi-direct product of the HW group and the symplectic group [52]. In reference [11], we have advanced the hypothesis that the generators of CS-groups admit representations by first-order differential operators with holomorphic polynomial coefficients on CS-manifolds. Here, we just present explicit formulas for the simplest example of such a representation, namely for the Lie algebra which is the semi-direct sum of the three-dimensional Heisenberg algebra $\mathfrak{h}_1$ and the algebra of the group $SU(1,1)$ acting on it in the canonical fashion, $\mathfrak{g}^J_1 := \mathfrak{h}_1 \rtimes \mathfrak{su}(1,1)$, called the Jacobi algebra (cf. [28] or [52, p. 78]). The case of the Jacobi algebra $\mathfrak{g}^J_n = \mathfrak{h}_n \rtimes \mathfrak{sp}(n,\mathbb{R})$ is treated separately [14]. Let us also recall that the Jacobi algebra $\mathfrak{g}^J_n$, also denoted $\mathfrak{st}(n,\mathbb{R})$ by Kirillov in [42, § 18.4] or $\mathfrak{tsp}(2n+2,\mathbb{R})$ in [44], is isomorphic to the subalgebra of the Weyl algebra $A_n$ (see also [25]) of polynomials of degree at most 2 in the variables $p_1, \dots, p_n, q_1, \dots, q_n$ with the Poisson bracket, while the Heisenberg algebra $\mathfrak{h}_n$ is the nilpotent ideal isomorphic to the polynomials of degree $\le 1$ and the real symplectic algebra $\mathfrak{sp}(n,\mathbb{R})$ is isomorphic to the subspace of symmetric homogeneous polynomials of degree 2.
In this paper, we study the six-dimensional Jacobi algebra $\mathfrak{g}^J_1$ and we denote it just as $\mathfrak{g}^J$ when there is no possibility of confusion with $\mathfrak{g}^J_n$. The representations of the Jacobi group were investigated also by the orbit method [42, 43], starting from a matrix representation (see [43, p. 182]) of the Jacobi algebra $\mathfrak{g}^J$ in [19, 20]. Our method is inspired by the squeezed states of Quantum Optics, see, e.g., the reviews [66, 62, 26, 27]. It is well known that for the harmonic oscillator CSs the uncertainties in momentum and position are equal to $1/\sqrt{2}$ (in units of $\hbar$). "The squeezed states" [41, 63, 49, 21, 65, 33, 64] are the states for which the uncertainty in position is less than $1/\sqrt{2}$. The squeezed states are a particular class of "minimum uncertainty states" (MUS) [50], i.e. states which saturate the Heisenberg uncertainty relation. In the present paper, we do not insist on the applications of our paper to the squeezed states, the Gaussian states [60, 1], disentangling theorems, i.e. analytic Baker–Campbell–Hausdorff relations defined from a $(4\times4)$-matrix representation of the Jacobi algebra, or nonlinear coherent states [62]. Let us just mention that "Gaussian pure states" ("Gaussons") [60] are more general MUSs. In fact, as was shown in [1], these states are CSs based on the manifold $X^J_n := \mathcal{H}_n\times\mathbb{R}^{2n}$, where $\mathcal{H}_n$ is the Siegel upper half-plane $\mathcal{H}_n := \{Z \in M_n(\mathbb{C}) \mid Z = U + iV,\ U, V \in M_n(\mathbb{R}),\ V > 0,\ U^t = U;\ V^t = V\}$. $M_n(R)$ denotes the $n\times n$ matrices with entries in $R$, $R = \mathbb{R}$ or $\mathbb{C}$, and $X^t$ denotes the transpose
of the matrix $X$. In [14], we have started the generalization of CSs attached to the Jacobi group $G^J_1 = H_1 \rtimes SU(1,1)$ to the Jacobi group $G^J_n = H_n \rtimes Sp(n,\mathbb{R})$. The connection of our construction of coherent states based on $\mathcal{D}^J_n = \mathbb{C}^n\times\mathcal{D}_n$ [14] and the Gaussons of [60] is a subtle one and should be investigated separately. $\mathcal{D}_n$ denotes the Siegel ball $\mathcal{D}_n := \{Z \in M_n(\mathbb{C}) \mid Z = Z^t,\ 1 - Z\bar Z > 0\}$. In Sec. 9, we indicate the clue to this connection in the present case, $n = 1$, which is offered by the Kähler–Berndt construction, briefly sketched in the same Sec. 9. The only physical applications are contained in Sec. 8, where we use the expressions of the generators of the Jacobi group $G^J_1$ to determine the quantum and classical evolution on the manifold $\mathcal{D}^J_1$, generated by a Hamiltonian linear in the generators of the group. We emphasize that some of the results obtained in this paper, such as the reproducing kernel or the group action on the base manifold, can be obtained as particular cases of some of the formulas in [59, Chap. III, Propositions 5.1–5.3] and [52, Sec. XII.4]. We also stress that some of the formulas presented here appear in the context of automorphic Jacobi forms [28, 19] (this denomination is inspired by the book [56]). The Jacobi group can be associated (see [40, Chap. 5]) with the group $G^K$ investigated by Kähler [37–39] as a group of the Universal Theory of Everything, including relativity, quantum mechanics and even biology. In the paper [37], Kähler has determined the structure of the real ten-dimensional Lie algebra $\mathfrak{g}^K$ of the (Poincaré or New Poincaré) group $G^K$ and has realized this algebra by differential operators in four real variables. However, our approach and the proofs are independent and, we hope, more accessible to people familiar with the coherent state approach in Theoretical Physics and in Mathematical Physics. Moreover, as far as we know, some of the formulas presented in this paper are completely new, e.g., Eq. (7.8), expressing the base of polynomials defined on the manifold $\mathcal{D}^J_1$ (the homogeneous space of the Jacobi group $G^J_1$, acting by biholomorphic maps), or the resolution of unity (7.14) and (7.15).
In order to facilitate the understanding of all subsequent sections, we present in Sec. 2 the general setting concerning the CS-groups: Sec. 2.1 briefly recalls the definition of CS-groups and Sec. 2.2 defines the space of functions, called the symmetric Fock space, on which the differential operators act (Sec. 2.3). However, we shall not enter into a detailed analysis of the root structure of CS-Lie algebras [52], keeping the exposition as elementary as possible. Section 3 presents the Jacobi algebra $\mathfrak{g}^J$. Perelomov's CS-vectors associated with the Jacobi group $G^J_1$ (cf. the denomination used in [19] or [52, p. 701]) are based on the complex homogeneous manifold $M := \mathcal{D}^J_1$. The differential action of the generators of the Jacobi group is given in Lemma 4.1 of Sec. 4. The operators $a$ and $a^+$ are unbounded operators, but it is enough to work on the dense subspace of smooth vectors of the Hilbert space of the hermitian representation (cf. [52, p. 40] and also Sec. 2.3 of our paper). In Lemma 5.1 of Sec. 5 we calculate the reproducing kernel $K : \mathcal{D}^J_1\times\bar{\mathcal{D}}^J_1 \to \mathbb{C}$. Some facts concerning the representations of the HW group $H_1$ and $SU(1,1)$ are collected in Sec. 6.1. Several relations are obtained in Sec. 6.2 as a consequence of
the fact that the Heisenberg algebra is an ideal of the Jacobi algebra, and we find how to change the order of the representations of the groups HW and $SU(1,1)$. Some of the relations presented in Sec. 6.2 have appeared earlier in connection with the squeezed states [41] in Quantum Optics [63]. The main result of Sec. 6.3 is given in Proposition 6.13, which expresses the action of the Jacobi group on Perelomov's CS-vectors. Remark 6.15 establishes the connection of our results in the context of coherent states with those obtained in the theory of automorphic Jacobi forms [28]. In Sec. 7.3, we construct the symmetric Fock space attached to the reproducing kernel $K$ from the symmetric Fock spaces associated with the groups HW (cf. Sec. 7.1) and $SU(1,1)$ (cf. Sec. 7.2). The $G^J_1$-invariant Kähler two-form $\omega$, the Liouville form and the equations of geodesics on the manifold $\mathcal{D}^J_1$ are calculated in Sec. 7.4. Proposition 7.1 summarizes all the information obtained in Sec. 7 concerning the symmetric Fock space $\mathcal{F}_K$ attached to the reproducing kernel $K$ for the Jacobi group $G^J_1$, while Proposition 7.2 gives the continuous unitary holomorphic representation $\pi_K$ of $G^J_1$ on $\mathcal{F}_K$. Simple applications to equations of motion on $\mathcal{D}^J_1$ determined by Hamiltonians linear in the generators of the Jacobi group are presented in Sec. 8. The equation of motion is a matrix Riccati equation on the manifold $\mathcal{D}^J_1$. In order to compare our Kähler two-form $\omega$ with that given by Kähler (see [40], which reproduces [37–39]) and Berndt [17, 19], we express in Sec. 9 our $\omega$, given in coordinates on $\mathcal{D}^J_1$, in appropriate coordinates (called EZ in [19]) on $X^J_1$. The Kähler–Berndt two-form is in fact the Kähler two-form attached to the manifold $X^J_1$ on which the Gaussons considered in [60] are based in the case $n = 1$. Section 10 contains some more remarks referring to the connection between the formulas proved in the present article for the Jacobi algebra and the formalism used in [52] for CS-groups.
In order to be self-contained, two formulas referring to the groups HW and $SU(1,1)$ are proved in the Appendix.
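2. Coherent States: The General Setting
2.1. Coherent state groups
Let us consider the triplet $(G, \pi, \mathfrak{H})$, where $\pi$ is a continuous, unitary representation of the Lie group $G$ on the separable complex Hilbert space $\mathfrak{H}$. Let us denote by $\mathfrak{H}^\infty$ the smooth vectors. Let us pick $e_0 \in \mathfrak{H}^\infty$ and introduce the notation $e_{g,0} := \pi(g)\cdot e_0$, $g \in G$. We have an action $G\times\mathfrak{H}^\infty \to \mathfrak{H}^\infty$, $g\cdot e_0 := e_{g,0}$. When there is no possibility of confusion, we write just $e_g$ for $e_{g,0}$. Let us denote by $[\ ] : \mathfrak{H}^\times := \mathfrak{H}\setminus\{0\} \to P(\mathfrak{H}) = \mathfrak{H}^\times/\!\sim$ the projection with respect to the equivalence relation $[\lambda x] \sim [x]$, $\lambda \in \mathbb{C}^\times$, $x \in \mathfrak{H}^\times$. So, $[\cdot] : \mathfrak{H}^\times \to P(\mathfrak{H})$, $[v] = \mathbb{C}v$. The action $G\times\mathfrak{H}^\infty \to \mathfrak{H}^\infty$ extends to the action $G\times P(\mathfrak{H}^\infty) \to P(\mathfrak{H}^\infty)$, $g\cdot[v] := [g\cdot v]$. For $X \in \mathfrak{g}$, where $\mathfrak{g}$ is the Lie algebra of the Lie group $G$, let us define the (unbounded) operator $d\pi(X)$ on $\mathfrak{H}$ by $d\pi(X)\cdot v := \frac{d}{dt}\big|_{t=0}\,\pi(\exp tX)\cdot v$, whenever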
the limit on the right-hand side exists. We obtain a representation of the Lie algebra $\mathfrak{g}$ on $\mathfrak{H}^\infty$, the derived representation, and we denote $X\cdot v := d\pi(X)\cdot v$ for $X \in \mathfrak{g}$, $v \in \mathfrak{H}^\infty$. Extending $d\pi$ by complex linearity, we get a representation of the universal enveloping algebra of the complex Lie algebra $\mathfrak{g}_{\mathbb{C}}$ on the complex vector space $\mathfrak{H}^\infty$, $d\pi : \mathcal{S} := \mathcal{U}(\mathfrak{g}_{\mathbb{C}}) \to B_0(\mathfrak{H}^\infty)$. Here $B_0(\mathfrak{H}^0) \subset \mathcal{L}(\mathfrak{H})$, where $\mathfrak{H}^0 := \mathfrak{H}^\infty$, denotes the subset of linear operators $A : \mathfrak{H}^0 \to \mathfrak{H}^0$ which have a formal adjoint (cf. [52, p. 29]). Let us now denote by $H$ the isotropy group $H := G_{[e_0]} := \{g \in G \mid g\cdot e_0 \in \mathbb{C}e_0\}$. We shall consider (generalized) coherent states on complex homogeneous manifolds $M \cong G/H$ [55], imposing the restriction that $M$ be a complex submanifold of $P(\mathfrak{H}^\infty)$. In such a case, the orbit $M$ is called a CS-manifold and the groups $G$ which generate such orbits are called CS-groups (cf. [52, Definition XV.2.1, p. 650; Theorem XV.1.1, p. 646]), while their Lie algebras are called CS-Lie algebras. The coherent vector mapping is defined locally, on a coordinate neighborhood $\mathcal{V}_0$, $\varphi : M \to \bar{\mathfrak{H}}$, $\varphi(z) = e_{\bar z}$ (cf. [11]), where $\bar{\mathfrak{H}}$ denotes the Hilbert space conjugate to $\mathfrak{H}$. The vectors $e_{\bar z} \in \bar{\mathfrak{H}}$ indexed by the points $z \in M$ are called Perelomov's coherent state vectors. The precise definition depends on the root structure of the CS-Lie algebras and we do not go into the details here (see [11]); only in Sec. 10 do we specify the root structure according to [52] in the case of the Jacobi algebra. We use for the scalar product the convention: $(\lambda x, y) = \bar\lambda(x, y)$, $x, y \in \mathfrak{H}$, $\lambda \in \mathbb{C}$.
2.2. The symmetric Fock space
The space of holomorphic functions (in fact, holomorphic sections of a certain $G$-homogeneous line bundle over $M$ [52, 11]) $\mathcal{F}_{\mathfrak{H}}$ is defined as the set of square integrable functions with respect to the scalar product
\[ (f, g)_{\mathcal{F}_{\mathfrak{H}}} = \int_M \bar f(z)\,g(z)\,d\nu_M(z, \bar z), \tag{2.1} \]
\[ d\nu_M(z, \bar z) = \frac{\Omega_M(z, \bar z)}{(e_{\bar z}, e_{\bar z})}. \tag{2.2} \]
Here $\Omega_M$ is the normalized $G$-invariant volume form
\[ \Omega_M := (-1)^{\binom{n}{2}}\,\frac{1}{n!}\,\underbrace{\omega\wedge\cdots\wedge\omega}_{n\ \text{times}}, \tag{2.3} \]
and the $G$-invariant Kähler two-form $\omega$ on the $2n$-dimensional manifold $M$ is given by
\[ \omega(z) = i\sum_{\alpha,\beta} G_{\alpha,\beta}\,dz_\alpha\wedge d\bar z_\beta, \qquad G_{\alpha,\beta}(z) = \frac{\partial^2}{\partial z_\alpha\,\partial\bar z_\beta}\log(e_{\bar z}, e_{\bar z}). \tag{2.4} \]
It can be shown that (2.1) is nothing else but the Parseval overcompleteness identity [15]
\[ (\psi_1, \psi_2) = \int_{M = G/H} (\psi_1, e_{\bar z})(e_{\bar z}, \psi_2)\,d\nu_M(z, \bar z), \qquad (\psi_1, \psi_2 \in \mathfrak{H}). \tag{2.5} \]
It can be seen that relation (2.1) (or (2.5)) on homogeneous manifolds fits into Rawnsley's global realization [58] of Berezin's coherent states on quantizable Kähler manifolds [15], modulo Rawnsley's "epsilon" function [58, 24], a constant for homogeneous quantization. If $(M, \omega)$ is a Kähler manifold and $(L, h, \nabla)$ is a (quantum) holomorphic line bundle $L$ on $M$, where $h$ is the hermitian metric and $\nabla$ is the connection compatible with the metric and the complex structure, then $h(z, z) = (e_{\bar z}, e_{\bar z})^{-1}$ and the Kähler potential $f$ is $f = -\log h(z)$ (see, e.g., [9]).
Let us now introduce the map
\[ \Phi : \mathfrak{H}^\star \to \mathcal{F}_{\mathfrak{H}}, \qquad \Phi(\psi) := f_\psi, \qquad f_\psi(z) = \Phi(\psi)(z) = (\varphi(z), \psi)_{\mathfrak{H}} = (e_{\bar z}, \psi)_{\mathfrak{H}}, \qquad z \in \mathcal{V}_0, \tag{2.6} \]
where we have identified the space $\bar{\mathfrak{H}}$ complex conjugate to $\mathfrak{H}$ with the dual space $\mathfrak{H}^\star$ of $\mathfrak{H}$.
The kernel can be defined as a function $K : M\times\bar M \to \mathbb{C}$, which on $\mathcal{V}_0\times\bar{\mathcal{V}}_0$ reads
\[ K(z, \bar w) := K_w(z) = (e_{\bar z}, e_{\bar w})_{\mathfrak{H}}. \tag{2.7} \]
For CS-groups, the function $K$ (2.7) is a positive definite reproducing kernel; the symmetric Fock space $\mathcal{F}_{\mathfrak{H}}$ (or $\mathcal{F}_K$) is the reproducing kernel Hilbert space of holomorphic functions on $M$, $\mathfrak{H}_K \subset \mathbb{C}^M$, associated to the kernel $K$ (2.7), and the evaluation map $\Phi$ defined in (2.6) extends to an isometric $G$-equivariant embedding $\mathfrak{H}^\star \to \mathcal{F}_{\mathfrak{H}}$ [11]
\[ (\psi_1, \psi_2)_{\mathfrak{H}^\star} = (\Phi(\psi_1), \Phi(\psi_2))_{\mathcal{F}_{\mathfrak{H}}} = (f_{\psi_1}, f_{\psi_2})_{\mathcal{F}_{\mathfrak{H}}} = \int_M \bar f_{\psi_1}(z)\,f_{\psi_2}(z)\,d\nu_M(z). \tag{2.8} \]
Sometimes the kernel $K$ is considered as a Bergman section [54] of a certain bundle over $M\times\bar M$, first considered by Kobayashi [46], see [51, Chap. V–VIII] and [52, Chap. XII].
2.3. Representation of CS-Lie algebras by differential operators
Let us consider again the triplet $(G, \pi, \mathfrak{H})$. The derived representation $d\pi$ is a hermitian representation of the semigroup $\mathcal{S} := \mathcal{U}(\mathfrak{g}_{\mathbb{C}})$ on $\mathfrak{H}^\infty$ (cf. [52, p. 30]). The unitarity and the continuity of the representation $\pi$ imply that $i\,d\pi(X)|_{\mathfrak{H}^\infty}$ is essentially self-adjoint (cf. [52, p. 391]). Let us denote this image in $B_0(\mathfrak{H}^\infty)$ by $\mathbf{A}_M := d\pi(\mathcal{S})$. If $\Phi : \mathfrak{H}^\star \to \mathcal{F}_{\mathfrak{H}}$ is the isometry (2.6), we are interested in the study of the image of $\mathbf{A}_M$ via $\Phi$ as a subset in the algebra of holomorphic, linear differential operators, $\Phi\mathbf{A}_M\Phi^{-1} := \mathbb{A}_M \subset \mathfrak{D}_M$. The set $\mathfrak{D}_M$ (or simply $\mathfrak{D}$) of holomorphic, finite-order, linear differential operators on $M$ is a subalgebra of homomorphisms $\mathrm{Hom}_{\mathbb{C}}(\mathcal{O}_M, \mathcal{O}_M)$ generated by the set $\mathcal{O}_M$ of germs of holomorphic functions of $M$ and the vector fields. We consider also the subalgebra $\mathfrak{A}_M$ of $\mathbb{A}_M$ of differential operators with holomorphic polynomial coefficients. Let $U := \mathcal{V}_0 \subset M$, endowed with the local coordinates $(z_1, z_2, \dots, z_n)$.
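We set $\partial_i := \frac{\partial}{\partial z_i}$ and $\partial^\alpha := \partial_1^{\alpha_1}\partial_2^{\alpha_2}\cdots\partial_n^{\alpha_n}$, $\alpha := (\alpha_1, \alpha_2, \dots, \alpha_n) \in \mathbb{N}^n$. The sections of $\mathfrak{D}_M$ on $U$ are $A : f \mapsto \sum_\alpha a_\alpha\,\partial^\alpha f$, $a_\alpha \in \Gamma(U, \mathcal{O})$, all but finitely many of the $a_\alpha$ being zero. For $k \in \mathbb{N}$, let us denote by $\mathfrak{D}_k$ the subset of differential operators of degree $\le k$. The filtration of $\mathfrak{D}$ induces a filtration on $\mathfrak{A}$. Summarizing, we have a correspondence between the following three objects:
\[ \mathfrak{g}_{\mathbb{C}} \ni X \mapsto \mathbf{X} \in \mathbf{A}_M \mapsto \mathbb{X} \in \mathbb{A}_M \subset \mathfrak{D}_M, \quad\text{differential operator on } \mathcal{F}_{\mathfrak{H}}. \tag{2.9} \]
Moreover, it is easy to see [11] that if $\Phi$ is the isometry (2.6), then $\Phi\,d\pi(\mathfrak{g}_{\mathbb{C}})\,\Phi^{-1} \subseteq \mathfrak{D}_1$ and we have
\[ \mathfrak{g}_{\mathbb{C}} \ni X \mapsto \mathbb{X} \in \mathfrak{D}_1; \qquad \mathbb{X}_z(f_\psi(z)) = \mathbb{X}_z(e_{\bar z}, \psi) = (e_{\bar z}, \mathbf{X}\psi), \tag{2.10} \]
where
\[ \mathbb{X}_z(f_\psi(z)) = \Bigl(P_X(z) + \sum_i Q^i_X(z)\,\frac{\partial}{\partial z_i}\Bigr) f_\psi(z). \tag{2.11} \]
In [11], we have advanced the hypothesis that for CS-groups the holomorphic functions $P$ and $Q$ in (2.11) are polynomials, i.e. $\mathbb{A} \subset \mathfrak{A}_1 \subset \mathfrak{D}_1$. In this paper, we present explicit formulas for (2.11) in the case of the simplest example of a mixed group which is a CS-group, the Jacobi group $G^J_1$. We start with the Jacobi algebra.
3. The Jacobi Algebra
The Heisenberg–Weyl group is the group with the three-dimensional real Lie algebra isomorphic to the Heisenberg algebra
\[ \mathfrak{h}_1 \equiv \mathfrak{g}_{HW} = \langle is1 + xa^+ - \bar x a\rangle_{s\in\mathbb{R},\,x\in\mathbb{C}}, \tag{3.1} \]
where $a^+$ ($a$) are the boson creation (respectively, annihilation) operators which verify the CCR (3.5a). Let us also consider the Lie algebra of the group $SU(1,1)$:
\[ \mathfrak{su}(1,1) = \langle 2i\theta K_0 + yK_+ - \bar y K_-\rangle_{\theta\in\mathbb{R},\,y\in\mathbb{C}}, \tag{3.2} \]
where the generators $K_{0,+,-}$ verify the standard commutation relations (3.5b). We consider the matrix realization
\[ K_0 = \frac12\begin{pmatrix} 1 & 0\\ 0 & -1 \end{pmatrix}, \qquad K_+ = i\begin{pmatrix} 0 & 1\\ 0 & 0 \end{pmatrix}, \qquad K_- = i\begin{pmatrix} 0 & 0\\ 1 & 0 \end{pmatrix}. \tag{3.3} \]
Now, let us define the Jacobi algebra as the semi-direct sum
\[ \mathfrak{g}^J_1 := \mathfrak{h}_1 \rtimes \mathfrak{su}(1,1), \tag{3.4} \]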
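where $\mathfrak{h}_1$ is an ideal in $\mathfrak{g}^J_1$, i.e. $[\mathfrak{h}_1, \mathfrak{g}^J_1] = \mathfrak{h}_1$, determined by the commutation relations:
\[ [a, a^+] = 1, \tag{3.5a} \]
\[ [K_0, K_\pm] = \pm K_\pm, \qquad [K_-, K_+] = 2K_0, \tag{3.5b} \]
\[ [a, K_+] = a^+, \qquad [K_-, a^+] = a, \tag{3.5c} \]
\[ [K_+, a^+] = [K_-, a] = 0, \tag{3.5d} \]
\[ [K_0, a^+] = \frac12 a^+, \qquad [K_0, a] = -\frac12 a. \tag{3.5e} \]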
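As a quick sanity check, the matrix realization (3.3) can be tested against the $\mathfrak{su}(1,1)$ relations (3.5b) numerically. The following is only a minimal sketch in plain Python; the helper names (`matmul`, `scale`, `commutator`, `check_su11_relations`) are illustrative choices and not notation from the paper.

```python
# Illustrative check that the 2x2 matrices of Eq. (3.3) satisfy the
# commutation relations (3.5b). Helper names are hypothetical.

def matmul(A, B):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return tuple(tuple(c * x for x in row) for row in A)

def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return tuple(tuple(AB[i][j] - BA[i][j] for j in range(2)) for i in range(2))

# Matrices of Eq. (3.3).
K0 = ((0.5, 0.0), (0.0, -0.5))
K_PLUS = ((0.0, 1j), (0.0, 0.0))
K_MINUS = ((0.0, 0.0), (1j, 0.0))

def check_su11_relations():
    """True if [K0, K+-] = +-K+- and [K-, K+] = 2 K0 hold entrywise."""
    return (
        commutator(K0, K_PLUS) == K_PLUS
        and commutator(K0, K_MINUS) == scale(-1, K_MINUS)
        and commutator(K_MINUS, K_PLUS) == scale(2, K0)
    )

print(check_su11_relations())
```

All entries involved are exactly representable in floating point, so the comparisons are exact rather than approximate.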
4. The Differential Action
We shall suppose that we know the derived representation $d\pi$ of the Lie algebra $\mathfrak{g}^J_1$ (3.4) of the Jacobi group $G^J_1$. We associate to the generators $a$, $a^+$ of the HW group and to the generators $K_{0,+,-}$ of the group $SU(1,1)$ the operators $\mathbf{a}$, $\mathbf{a}^+$, respectively $\mathbf{K}_{0,+,-}$, where $(\mathbf{a}^+)^+ = \mathbf{a}$, $\mathbf{K}_0^+ = \mathbf{K}_0$, $\mathbf{K}_\pm^+ = \mathbf{K}_\mp$, and we require the cyclic vector $e_0$ to satisfy simultaneously the conditions
\[ \mathbf{a}e_0 = 0, \tag{4.1a} \]
\[ \mathbf{K}_- e_0 = 0, \tag{4.1b} \]
\[ \mathbf{K}_0 e_0 = k e_0; \qquad k > 0, \quad 2k = 2, 3, \dots. \tag{4.1c} \]
We consider in Eq. (4.1c) the positive discrete series representations $D^+_k$ of $SU(1,1)$ (cf. [2, Sec. 9]). Perelomov's coherent state vectors associated to the group $G^J_1$ with Lie algebra the Jacobi algebra (3.4), based on the manifold $M$:
\[ M := H_1/\mathbb{R}\times SU(1,1)/U(1), \tag{4.2a} \]
\[ M = \mathcal{D}^J_1 := \mathbb{C}\times\mathcal{D}_1, \tag{4.2b} \]
are defined as
\[ e_{z,w} := e^{z\mathbf{a}^+ + w\mathbf{K}_+}e_0, \qquad z \in \mathbb{C}, \quad |w| < 1. \tag{4.3} \]
The general scheme (2.9) associates to elements of the Lie algebra $\mathfrak{g}$ differential operators: $X \in \mathfrak{g} \to \mathbb{X} \in \mathfrak{D}_1$. The space of functions on which these operators act in the case of the Jacobi group will be made precise in Sec. 7. The following lemma expresses the differential action of the generators of the Jacobi algebra as operators of the type $\mathfrak{A}_1$ in two variables on $M$.
Lemma 4.1. The differential action of the generators (3.5a)–(3.5e) of the Jacobi algebra (3.4) is given by the formulas:
\[ a = \frac{\partial}{\partial z}; \qquad a^+ = z + w\frac{\partial}{\partial z}; \tag{4.4a} \]
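\[ K_- = \frac{\partial}{\partial w}; \qquad K_0 = k + \frac12 z\frac{\partial}{\partial z} + w\frac{\partial}{\partial w}; \tag{4.4b} \]
\[ K_+ = \frac12 z^2 + 2kw + zw\frac{\partial}{\partial z} + w^2\frac{\partial}{\partial w}, \tag{4.4c} \]
where $z \in \mathbb{C}$, $|w| < 1$.
Proof. With the definition (4.3), we have the formal relations:
\[ \mathbf{a}^+ e_{z,w} = \frac{\partial}{\partial z}e_{z,w}; \qquad \mathbf{K}_+ e_{z,w} = \frac{\partial}{\partial w}e_{z,w}. \]
The proof is based on the general formula
\[ \mathrm{Ad}(\exp X) = \exp(\mathrm{ad}_X), \tag{4.5} \]
valid for Lie algebras $\mathfrak{g}$, which here we write down explicitly as
\[ Ae^X = e^X\Bigl(A - [X, A] + \frac12[X, [X, A]] + \cdots\Bigr), \tag{4.6} \]
and we take $X = z\mathbf{a}^+ + w\mathbf{K}_+$ because of the definition (4.3).
(1) Firstly, we take in (4.6) $A = \mathbf{a}$. Then, $[X, A] = -z - w\mathbf{a}^+$; $[X, [X, A]] = 0$, and
\[ \mathbf{a}e^X = e^X(\mathbf{a} + z + w\mathbf{a}^+); \qquad \mathbf{a}e^X e_0 = \Bigl(z + w\frac{\partial}{\partial z}\Bigr)e^X e_0. \]
(2) Now, we take in (4.6) $A = \mathbf{K}_0$. Then, $[X, A] = -\frac{z}{2}\mathbf{a}^+ - w\mathbf{K}_+$; $[X, [X, A]] = 0$, and
\[ \mathbf{K}_0 e_{z,w} = \Bigl(k + \frac{z}{2}\frac{\partial}{\partial z} + w\frac{\partial}{\partial w}\Bigr)e_{z,w}. \]
(3) Finally, we take in (4.6) $A = \mathbf{K}_-$. We have $[X, A] = -z\mathbf{a} - 2w\mathbf{K}_0$, and
\[ [X, [X, A]] = [z\mathbf{a}^+ + w\mathbf{K}_+, -z\mathbf{a} - 2w\mathbf{K}_0] = -z^2[\mathbf{a}^+, \mathbf{a}] - 2zw[\mathbf{a}^+, \mathbf{K}_0] - wz[\mathbf{K}_+, \mathbf{a}] - 2w^2[\mathbf{K}_+, \mathbf{K}_0] = z^2 + 2zw\mathbf{a}^+ + 2w^2\mathbf{K}_+. \]
Using (4.6), we have
\[ Ae^X e_0 = e^X\Bigl(\mathbf{K}_- + (z\mathbf{a} + 2w\mathbf{K}_0) + \frac12(z^2 + 2wz\mathbf{a}^+ + 2w^2\mathbf{K}_+)\Bigr)e_0 = \Bigl(2wk + \frac{z^2}{2} + wz\frac{\partial}{\partial z} + w^2\frac{\partial}{\partial w}\Bigr)e^X e_0. \]
Now, we make some general considerations. For $X \in \mathfrak{g}$, let $X\cdot e_z := \mathbb{X}_z e_z$. Then, $X\cdot e_{\bar z} = \mathbb{X}_{\bar z}e_{\bar z}$. But $(e_{\bar z}, X\cdot e_{\bar z'}) = (X^+\cdot e_{\bar z}, e_{\bar z'})$ and finally, with Eq. (2.10), we have
\[ \mathbb{X}_{\bar z'}(e_{\bar z}, e_{\bar z'}) = \mathbb{X}^+_z(e_{\bar z}, e_{\bar z'}). \tag{4.7} \]
With observation (4.7) and the previous calculation, Lemma 4.1 is proved.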
Comment 4.2. We illustrate (4.7) for $X = a$. It can be checked that
$$\frac{\partial}{\partial z}(e_{\bar z,\bar w}, e_{\bar z',\bar w'}) = \left(\bar z' + \bar w'\frac{\partial}{\partial \bar z'}\right)(e_{\bar z,\bar w}, e_{\bar z',\bar w'}) = \frac{\bar z' + z\bar w'}{1 - w\bar w'}(e_{\bar z,\bar w}, e_{\bar z',\bar w'}),$$
where the kernel has the expression (5.3) calculated below.

5. The Reproducing Kernel

Now, we calculate the reproducing kernel $K$ on the base manifold $M = \mathcal{D}^J_1$ as the scalar product of two Perelomov's CS-vectors (4.3), taking into account the conditions (4.1) and the orthonormality of the basis of the Hilbert spaces associated with the factors of the Jacobi group.

Lemma 5.1. Let $K = K(\bar z, \bar w, z, w)$, where $z \in \mathbb{C}$, $w \in \mathbb{C}$, $|w| < 1$,
$$K := \left(e_0, e^{\bar z a + \bar w K_-} e^{z a^+ + w K_+} e_0\right). \tag{5.1}$$
Then, the reproducing kernel is
$$K = (1 - w\bar w)^{-2k} \exp\frac{2 z\bar z + z^2\bar w + \bar z^2 w}{2(1 - w\bar w)}. \tag{5.2}$$
More generally, the kernel $K : \mathcal{D}^J_1 \times \bar{\mathcal{D}}^J_1 \to \mathbb{C}$ is:
$$K(z, w; \bar z', \bar w') := (e_{\bar z,\bar w}, e_{\bar z',\bar w'}) = (1 - w\bar w')^{-2k} \exp\frac{2\bar z' z + z^2\bar w' + \bar z'^2 w}{2(1 - w\bar w')}. \tag{5.3}$$
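As a quick numerical sanity check of (5.3), the closed form can be evaluated directly. The sketch below is illustrative only: the helper name `jacobi_kernel` and the sample points are assumptions, not part of the paper. It checks the Hermitian symmetry expected of a reproducing kernel and the positivity of the diagonal value (5.2).

```python
import cmath


def jacobi_kernel(z, w, zp, wp, k):
    """Evaluate K(z, w; conj(zp), conj(wp)) from Eq. (5.3).

    Hypothetical helper: (z, w) and (zp, wp) are points of C x D_1
    (|w| < 1, |wp| < 1) and k is the real SU(1,1) weight.
    """
    if abs(w) >= 1 or abs(wp) >= 1:
        raise ValueError("w and wp must lie in the open unit disk")
    zb, wb = complex(zp).conjugate(), complex(wp).conjugate()
    denom = 1 - w * wb  # positive real part, so the principal log is safe
    prefactor = cmath.exp(-2 * k * cmath.log(denom))
    exponent = (2 * zb * z + z * z * wb + zb * zb * w) / (2 * denom)
    return prefactor * cmath.exp(exponent)


x = (0.3 + 0.2j, 0.1 - 0.4j)
y = (-0.5 + 0.1j, 0.2 + 0.3j)
k = 1.5
# Hermitian symmetry: K(x; y) equals the complex conjugate of K(y; x).
print(abs(jacobi_kernel(*x, *y, k) - jacobi_kernel(*y, *x, k).conjugate()))
# The diagonal value (e_x, e_x), Eq. (5.2), is real and positive.
print(jacobi_kernel(*x, *x, k))
```

For $z = z' = 0$ the function reduces to the SU(1,1) kernel $(1 - w\bar w')^{-2k}$ of (7.6).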
Proof. We introduce the auxiliary operators $K'_\sigma$, $\sigma = \pm, 0$:
$$K_+ = \frac{1}{2}(a^+)^2 + K'_+, \tag{5.4a}$$
$$K_- = \frac{1}{2}a^2 + K'_-, \tag{5.4b}$$
$$K_0 = \frac{1}{2}\left(a^+ a + \frac{1}{2}\right) + K'_0, \tag{5.4c}$$
which have the properties
$$K'_- e_0 = 0, \tag{5.5a}$$
$$K'_0 e_0 = k' e_0; \qquad k = k' + \frac{1}{4}; \tag{5.5b}$$
$$[K'_\sigma, a] = [K'_\sigma, a^+] = 0, \qquad \sigma = \pm, 0, \tag{5.6a}$$
$$[K'_0, K'_\pm] = \pm K'_\pm; \qquad [K'_-, K'_+] = 2K'_0. \tag{5.6b}$$
We use the fact that $e_{k,k+m}$ is an orthonormal system (see also Sec. 7.2 and the Appendix), where
$$e_{k,k+m} := a_{km}(K_+)^m e_{k,k}; \qquad a_{km}^2 = \frac{\Gamma(2k)}{m!\,\Gamma(m + 2k)}, \tag{5.7}$$
the relation (see, e.g., [31, Eq. (1.110)])
$$(1 - x)^{-q} = \sum_{m=0}^{\infty} \frac{x^m}{m!}\frac{\Gamma(q + m)}{\Gamma(q)}, \tag{5.8}$$
and the orthonormality of the $n$-particle states (see also Sec. 7.1 and the Appendix):
$$|n\rangle = (n!)^{-\frac{1}{2}}(a^+)^n|0\rangle; \qquad \langle n', n\rangle = \delta_{nn'}. \tag{5.9}$$
With these ingredients, applied to the primed operators with weight $k'$, one proves the relation
$$\left(e_0, e^{\bar w K'_-} e^{w' K'_+} e_0\right) = (1 - w'\bar w)^{-2k'}. \tag{5.10}$$
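We introduce the notation
$$E = E(z, w) := e^{z a^+ + \frac{w}{2}(a^+)^2} = \sum_{p,q \ge 0} \frac{z^p}{p!}\frac{\left(\frac{w}{2}\right)^q}{q!}(a^+)^{p+2q}. \tag{5.11}$$
With the change of variable $n := p + 2q$, i.e. $p = n - 2q$, Eq. (5.11) becomes
$$E = \sum_{n \ge 0}\sum_{q=0}^{[\frac{n}{2}]} \left(\frac{w}{2}\right)^q \frac{z^{n-2q}}{(n - 2q)!\,q!}(a^+)^n.$$
Recalling that the Hermite polynomials can be represented as (cf. [5, Eq. (10.13.9)])
$$H_n(x) = n!\sum_{m=0}^{[\frac{n}{2}]} \frac{(-1)^m (2x)^{n-2m}}{m!\,(n - 2m)!}, \tag{5.12}$$
the expression (5.11) becomes
$$E(z, w) = \sum_{n \ge 0} \frac{i^{-n}}{n!}\left(\frac{w}{2}\right)^{\frac{n}{2}} H_n\!\left(\frac{iz}{\sqrt{2w}}\right)(a^+)^n. \tag{5.13}$$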
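The passage from the double sum to the Hermite form (5.13) can be checked coefficient by coefficient. The following sketch is only a numerical illustration; the function names and the sample values of $z$ and $w$ are assumptions, and $w \neq 0$ is required because of the factor $1/\sqrt{2w}$.

```python
import cmath
from math import factorial


def hermite(n, x):
    """Hermite polynomial H_n(x) evaluated from the finite sum (5.12)."""
    return factorial(n) * sum(
        (-1) ** m * (2 * x) ** (n - 2 * m) / (factorial(m) * factorial(n - 2 * m))
        for m in range(n // 2 + 1)
    )


def coeff_double_sum(n, z, w):
    """Coefficient of (a^+)^n in E after the substitution n = p + 2q."""
    return sum(
        (w / 2) ** q * z ** (n - 2 * q) / (factorial(n - 2 * q) * factorial(q))
        for q in range(n // 2 + 1)
    )


def coeff_hermite(n, z, w):
    """Coefficient of (a^+)^n in E according to Eq. (5.13)."""
    half_root = cmath.sqrt(w / 2)  # principal branch, consistent with sqrt(2w)
    argument = 1j * z / cmath.sqrt(2 * w)
    return (1j) ** (-n) / factorial(n) * half_root ** n * hermite(n, argument)


z, w = 0.7 - 0.3j, 0.2 + 0.5j
for n in range(6):
    print(n, abs(coeff_double_sum(n, z, w) - coeff_hermite(n, z, w)))
```

The printed differences should be at the level of floating-point rounding.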
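Then,
$$K := K(\bar z, \bar w; z', w') = (e_{z,w}, e_{z',w'}) = \left(e_0, e^{\bar z a + \bar w K_-} e^{z' a^+ + w' K_+} e_0\right).$$
But due to Eqs. (5.4a), (5.4b), $K$ can be written down as
$$K = \left(e_0, e^{\bar z a + \frac{\bar w}{2}a^2}\, e^{\bar w K'_-} e^{w' K'_+}\, e^{z' a^+ + \frac{w'}{2}(a^+)^2} e_0\right).$$
We introduce the notation
$$F := F(\bar z, \bar w; z', w') = \left(e_0, E^+(\bar z, \bar w)E(z', w')e_0\right).$$
Because of the orthonormality relation (5.9), $(e_0, a^n (a^+)^{n'} e_0) = n!\,\delta_{nn'}$, we get:
$$F = \sum_n \frac{1}{n!}\left(\frac{\bar w w'}{4}\right)^{\frac{n}{2}} H_n\!\left(-i\frac{\bar z}{\sqrt{2\bar w}}\right) H_n\!\left(i\frac{z'}{\sqrt{2w'}}\right).$$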
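We use the summation relation of the Hermite polynomials (Mehler formula, cf. [5, Eq. (10.13.22)])
$$\sum_{n=0}^{\infty} \frac{\left(\frac{s}{2}\right)^n}{n!} H_n(x)H_n(y) = \frac{1}{\sqrt{1 - s^2}} \exp\frac{2xys - (x^2 + y^2)s^2}{1 - s^2}, \qquad |s| < 1, \tag{5.14}$$
where
$$x = -i\frac{\bar z}{\sqrt{2\bar w}}; \qquad y = i\frac{z'}{\sqrt{2w'}}; \qquad s = (\bar w w')^{1/2},$$
and we get
$$F = \frac{1}{(1 - \bar w w')^{1/2}} \exp\frac{2\bar z z' + \bar z^2 w' + z'^2\bar w}{2(1 - \bar w w')}.$$
Recalling (5.10), we have
$$K = (1 - \bar w w')^{-2k'} F,$$
and finally, since $2k = 2k' + \frac{1}{2}$ by (5.5b),
$$(e_{\bar z,\bar w}, e_{\bar z',\bar w'}) = (1 - w\bar w')^{-2k} \exp\frac{2 z\bar z' + z^2\bar w' + \bar z'^2 w}{2(1 - w\bar w')}.$$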
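The Mehler formula (5.14) carries the whole proof, so a direct numerical comparison of both sides may be reassuring. This is a minimal sketch, assuming a plain truncated series; the function names, the truncation length and the sample arguments are illustrative choices, not taken from the paper.

```python
import cmath


def mehler_sum(x, y, s, terms=80):
    """Truncated left-hand side of (5.14): sum of (s/2)^n / n! * H_n(x) H_n(y)."""
    h_prev_x, h_x = 0.0, 1.0  # H_{-1} placeholder and H_0
    h_prev_y, h_y = 0.0, 1.0
    weight = 1.0  # holds (s/2)^n / n!
    total = 0.0
    for n in range(terms):
        total += weight * h_x * h_y
        # Three-term recurrence H_{n+1}(t) = 2 t H_n(t) - 2 n H_{n-1}(t).
        h_prev_x, h_x = h_x, 2 * x * h_x - 2 * n * h_prev_x
        h_prev_y, h_y = h_y, 2 * y * h_y - 2 * n * h_prev_y
        weight *= (s / 2) / (n + 1)
    return total


def mehler_closed_form(x, y, s):
    """Right-hand side of (5.14), valid for |s| < 1."""
    one_minus = 1 - s * s
    exponent = (2 * x * y * s - (x * x + y * y) * s * s) / one_minus
    return cmath.exp(exponent) / cmath.sqrt(one_minus)


print(mehler_sum(0.4, -0.7, 0.3), mehler_closed_form(0.4, -0.7, 0.3))
```

The same functions accept complex arguments, which is the situation in the proof, where $x$, $y$ and $s$ are built from $\bar z$, $z'$, $\bar w$ and $w'$.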
6. The Group Action on the Base Manifold

We start this section by recalling in Sec. 6.1 some useful relations for representations of the groups $H_1$ and SU(1, 1). Then, we obtain formulas (6.14), (6.15) for the change of order of the action of these groups.

6.1. Formulas for the Heisenberg–Weyl group $H_1$ and SU(1, 1)

Let us recall some relations for the displacement operator:
$$D(\alpha) := \exp(\alpha a^+ - \bar\alpha a) = \exp\left(-\frac{1}{2}|\alpha|^2\right)\exp(\alpha a^+)\exp(-\bar\alpha a), \tag{6.1}$$
$$D(\alpha_2)D(\alpha_1) = e^{i\theta_h(\alpha_2,\alpha_1)}D(\alpha_2 + \alpha_1), \qquad \theta_h(\alpha_2, \alpha_1) := \Im(\alpha_2\bar\alpha_1). \tag{6.2}$$
Let us denote by $S$ the $D^k_+$ representation of the group SU(1, 1) and let us introduce the notation $\underline{S}(z) = S(w)$, where $w \in \mathbb{C}$, $|w| < 1$, and $z \in \mathbb{C}$ are related by (6.3c), (6.3d). We have the relations:
$$\underline{S}(z) := \exp(z K_+ - \bar z K_-), \qquad z \in \mathbb{C}; \tag{6.3a}$$
$$S(w) = \exp(w K_+)\exp(\eta K_0)\exp(-\bar w K_-); \tag{6.3b}$$
$$w = w(z) = \frac{z}{|z|}\tanh(|z|), \qquad w \in \mathbb{C},\ |w| < 1; \tag{6.3c}$$
$$z = z(w) = \frac{w}{|w|}\operatorname{arctanh}(|w|) = \frac{w}{2|w|}\log\frac{1 + |w|}{1 - |w|}; \tag{6.3d}$$
$$\eta = \log(1 - w\bar w) = -2\log(\cosh(|z|)). \tag{6.3e}$$
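The change of parameters (6.3c)–(6.3e) between the exponent $z$ and the disk coordinate $w$ is easy to get wrong by a branch or a factor, so a small round-trip check may help. The function names below are hypothetical; only the formulas come from the text.

```python
import math


def w_of_z(z):
    """Map (6.3c): z in C to w in the open unit disk."""
    r = abs(z)
    return 0j if r == 0 else z / r * math.tanh(r)


def z_of_w(w):
    """Map (6.3d): inverse of w_of_z, defined for |w| < 1."""
    r = abs(w)
    if r >= 1:
        raise ValueError("w must lie in the open unit disk")
    return 0j if r == 0 else w / r * math.atanh(r)


def eta_of_w(w):
    """Exponent (6.3e) multiplying K_0 in the factorized form (6.3b)."""
    return math.log(1 - abs(w) ** 2)


z = 0.6 - 1.1j
w = w_of_z(z)
print(abs(z_of_w(w) - z))                                # round trip
print(eta_of_w(w) + 2 * math.log(math.cosh(abs(z))))     # both forms of (6.3e)
```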
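Let us consider an element $g \in \mathrm{SU}(1, 1)$,
$$g = \begin{pmatrix} a & b \\ \bar b & \bar a \end{pmatrix}, \qquad \text{where } |a|^2 - |b|^2 = 1. \tag{6.4}$$
Remark 6.1. The following relations hold:
$$\underline{S}(z)e_0 = (1 - |w|^2)^k e_{0,w}, \tag{6.5}$$
$$e_g := S(g)e_0 = \bar a^{-2k} e_{0,\,w = -i\frac{b}{\bar a}} = \left(\frac{a}{\bar a}\right)^k \underline{S}(z)e_0, \tag{6.6}$$
$$S(g)e_{0,w} = (\bar a + \bar b w)^{-2k} e_{0,\,g\cdot w}, \tag{6.7}$$
where $w \in \mathbb{C}$, $|w| < 1$ and $z \in \mathbb{C}$ in (6.6) are related by Eqs. (6.3c), (6.3d), and the linear-fractional action of the group SU(1, 1) on the unit disk $\mathcal{D}_1$ in (6.7) is
$$g \cdot w = \frac{a w + b}{\bar b w + \bar a}. \tag{6.8}$$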
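The action (6.8) can be explored numerically. In the sketch below an element of SU(1, 1) of the form (6.4) is stored as a pair `(a, b)`; the helper names, including `boost`, are assumptions introduced for illustration. The code checks that the action is compatible with matrix multiplication and that the special element with $a > 0$, $b = a w_2$ reproduces the composition rule $w_3 = (w_1 + w_2)/(1 + \bar w_2 w_1)$.

```python
import math


def su11_act(g, w):
    """Linear-fractional action (6.8) of g = (a, b) on a point w of the disk."""
    a, b = g
    return (a * w + b) / (b.conjugate() * w + a.conjugate())


def su11_mul(g1, g2):
    """Matrix product of two elements of the form (6.4), stored as (a, b)."""
    (a1, b1), (a2, b2) = g1, g2
    return (a1 * a2 + b1 * b2.conjugate(), a1 * b2 + b1 * a2.conjugate())


def boost(w2):
    """Hypothetical helper: element with a > 0 and b = a * w2, so g . 0 = w2."""
    a = 1 / math.sqrt(1 - abs(w2) ** 2)
    return (complex(a), a * complex(w2))


w1, w2 = 0.3 + 0.2j, -0.4 + 0.5j
g = boost(w2)
print(su11_act(g, w1), (w1 + w2) / (1 + w2.conjugate() * w1))

h = (complex(math.cosh(0.7)), math.sinh(0.7) * cmath_phase) if False else boost(0.1 - 0.6j)
print(abs(su11_act(su11_mul(g, h), w1) - su11_act(g, su11_act(h, w1))))
```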
We recall also the following property, which is a particular case of a more general result proved in [12]:

Remark 6.2. If $\underline{S}(z)$ is defined by (6.3a), then:
$$\underline{S}(z_2)\underline{S}(z_1) = \underline{S}(z_3)e^{i\theta_s K_0}; \tag{6.9a}$$
$$w_3 = \frac{w_1 + w_2}{1 + \bar w_2 w_1}; \tag{6.9b}$$
$$e^{i\theta_s} = \frac{1 + w_2\bar w_1}{1 + w_1\bar w_2}, \tag{6.9c}$$
where $w_i$ and $z_i$, $i = 1, 2, 3$, in Eq. (6.9b) are related by the relations (6.3c), (6.3d).

Comment 6.3. Note that when $z_1, z_2 \in \mathbb{R}$, then (6.9a) expresses just the additivity of the "rapidities",
$$\underline{S}(z_2)\underline{S}(z_1) = \underline{S}(z_2 + z_1),$$
while (6.9b) becomes just the Lorentz composition of velocities in special relativity:
$$w_3 = \frac{w_1 + w_2}{1 + w_2 w_1}.$$

6.2. Holstein–Primakoff–Bogoliubov-type equations

We recall the Holstein–Primakoff–Bogoliubov equations [34, 23] (see also [53]), a consequence of Eq. (4.5) and of the fact that the Heisenberg algebra is an ideal in the Jacobi algebra (3.4), as expressed in (3.5c)–(3.5e):
$$\underline{S}^{-1}(z)\,a\,\underline{S}(z) = \cosh(|z|)\,a + \frac{z}{|z|}\sinh(|z|)\,a^+, \tag{6.10a}$$
$$\underline{S}^{-1}(z)\,a^+\,\underline{S}(z) = \cosh(|z|)\,a^+ + \frac{\bar z}{|z|}\sinh(|z|)\,a, \tag{6.10b}$$
and the CCR are still fulfilled in the new creation and annihilation operators.
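Let us introduce the notation:
$$\tilde A = \begin{pmatrix} A \\ \bar A \end{pmatrix}; \qquad \mathcal{D} = \mathcal{D}(z) = \begin{pmatrix} M & N \\ P & Q \end{pmatrix}, \tag{6.11}$$
where
$$M = \cosh(|z|); \qquad N = \frac{z}{|z|}\sinh(|z|); \qquad P = \bar N; \qquad Q = M. \tag{6.12}$$
Note that
$$\mathcal{D}(z) = e^X, \qquad \text{where } X := \begin{pmatrix} 0 & z \\ \bar z & 0 \end{pmatrix}. \tag{6.13}$$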
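Relation (6.13) follows from $X^2 = |z|^2\,\mathbb{1}$, which splits the exponential series into its even (cosh) and odd (sinh) parts. A minimal numerical sketch, assuming a truncated Taylor series for the matrix exponential and hypothetical helper names, compares the series with the closed form (6.11), (6.12):

```python
import math


def mat_mul(p, q):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_exp(x, terms=40):
    """Truncated Taylor series of exp(x) for a 2x2 matrix x."""
    result = [[1, 0], [0, 1]]
    term = [[1, 0], [0, 1]]
    for n in range(1, terms):
        term = [[entry / n for entry in row] for row in mat_mul(term, x)]
        result = [[result[i][j] + term[i][j] for j in range(2)] for i in range(2)]
    return result


def bogoliubov_matrix(z):
    """Closed form (6.11), (6.12): [[M, N], [conj(N), M]]."""
    z = complex(z)
    r = abs(z)
    m = math.cosh(r)
    n = z if r == 0 else z / r * math.sinh(r)
    return [[m, n], [n.conjugate(), m]]


z = 0.8 - 0.6j
series = mat_exp([[0, z], [z.conjugate(), 0]])
closed = bogoliubov_matrix(z)
print(max(abs(series[i][j] - closed[i][j]) for i in range(2) for j in range(2)))
```

Since $M^2 - |N|^2 = 1$, the matrix has unit determinant, in line with its role as a Bogoliubov transformation.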
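Remark 6.4. With the notation (6.11), (6.12), Eqs. (6.10) become:
$$\underline{S}^{-1}(z)\,\tilde a\,\underline{S}(z) = \mathcal{D}(z)\,\tilde a.$$
Using formula (4.5) and the fact that the HW group is a normal subgroup of the Jacobi group, we obtain the relations (6.14), (6.15) (or (6.16)), which allow us to interchange the order of the representations of the groups SU(1, 1) and HW:

Remark 6.5. If $D$ and $\underline{S}(z)$ are defined by (6.1) and (6.3a), respectively, then
$$D(\alpha)\underline{S}(z) = \underline{S}(z)D(\beta), \tag{6.14}$$
where
$$\beta = \alpha\cosh(|z|) - \bar\alpha\frac{z}{|z|}\sinh(|z|); \qquad \alpha = \beta\cosh(|z|) + \bar\beta\frac{z}{|z|}\sinh(|z|); \tag{6.15a}$$
$$\beta = \frac{\alpha - \bar\alpha w}{(1 - |w|^2)^{1/2}}; \qquad \alpha = \frac{\beta + \bar\beta w}{(1 - |w|^2)^{1/2}}. \tag{6.15b}$$
With the convention (6.11), Eq. (6.15a) can be written down as:
$$\tilde\beta = \mathcal{D}(-z)\tilde\alpha; \qquad \tilde\alpha = \mathcal{D}(z)\tilde\beta. \tag{6.16}$$
Let us introduce the notation
$$\underline{S}(z, \theta) := \exp(2i\theta K_0 + z K_+ - \bar z K_-). \tag{6.17}$$
Using (4.5), more general formulas than the Holstein–Primakoff–Bogoliubov equations (6.10) can be proved, namely:
$$\underline{S}(z, \theta)^{-1}\,a\,\underline{S}(z, \theta) = \left(\mathrm{cs}(x) + i\theta\frac{\mathrm{si}(x)}{x}\right)a + z\frac{\mathrm{si}(x)}{x}a^+, \tag{6.18a}$$
$$\underline{S}(z, \theta)^{-1}\,a^+\,\underline{S}(z, \theta) = \left(\mathrm{cs}(x) - i\theta\frac{\mathrm{si}(x)}{x}\right)a^+ + \bar z\frac{\mathrm{si}(x)}{x}a, \tag{6.18b}$$
where
$$\mathrm{cs}(x) := \begin{cases} \cosh(x), & \text{if } \lambda = x^2 > 0, \\ \cos(x), & \text{if } \lambda = -x^2 < 0, \end{cases} \qquad \lambda := |z|^2 - \theta^2, \tag{6.19}$$
and similarly for $\mathrm{si}(x)$.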
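Let us consider $X \in \mathfrak{su}(1, 1)$,
$$X = \begin{pmatrix} i\theta & z \\ \bar z & -i\theta \end{pmatrix}, \qquad \theta \in \mathbb{R},\ z \in \mathbb{C}. \tag{6.20}$$
Then $g = e^X \in \mathrm{SU}(1, 1)$ is an element of the form (6.4), where
$$a = \mathrm{cs}(x) + i\theta\frac{\mathrm{si}(x)}{x}, \qquad b = z\frac{\mathrm{si}(x)}{x}. \tag{6.21}$$
If $g = \begin{pmatrix} \alpha & \beta \\ \bar\beta & \bar\alpha \end{pmatrix} \in \mathrm{SU}(1, 1)$, then Eq. (6.18) can be written down as
$$S^{-1}(g)\,a\,S(g) = \alpha a + \beta a^+, \tag{6.22a}$$
$$S^{-1}(g)\,a^+\,S(g) = \bar\beta a + \bar\alpha a^+, \tag{6.22b}$$
and we have the following (generalized Holstein–Primakoff–Bogoliubov) equations:

Remark 6.6. If $S$ denotes the representation of SU(1, 1), with the convention (6.11), we have
$$S^{-1}(g)\,\tilde a\,S(g) = g \cdot \tilde a. \tag{6.23}$$
Applying again formula (4.5), we obtain a more general formula than (6.14), namely:
$$\underline{S}(z, \theta)D(\alpha)\underline{S}(z, \theta)^{-1} = D(\alpha_1), \tag{6.24}$$
where
$$\alpha_1 = \alpha_1(z, \alpha, \theta) = \alpha\,\mathrm{cs}(x) + (i\theta\alpha + z\bar\alpha)\frac{\mathrm{si}(x)}{x}. \tag{6.25}$$
Written down in a form similar to (6.14), Eq. (6.24) reads
$$D(\alpha)\underline{S}(z, \theta) = \underline{S}(z, \theta)D(\beta_1), \tag{6.26}$$
where $\beta_1 = \beta_1(z, \alpha, \theta) = \alpha_1(-z, \alpha, -\theta)$, i.e.
$$\beta_1 = \alpha\,\mathrm{cs}(x) - (i\theta\alpha + z\bar\alpha)\frac{\mathrm{si}(x)}{x}, \qquad \alpha = \beta_1\,\mathrm{cs}(x) + (i\theta\beta_1 + z\bar\beta_1)\frac{\mathrm{si}(x)}{x}. \tag{6.27}$$
Note that if $\theta = 0$, then $\underline{S}(z, \theta) = \underline{S}(z)$ and $\beta_1$ in (6.27) becomes $\beta_1 = \beta$ with $\beta$ given by (6.15). We also underline that if $z = 0$ in (6.24), then (6.25) becomes just
$$\alpha_1 = \alpha\left(\cos(|\theta|) + i\theta\frac{\sin(|\theta|)}{|\theta|}\right).$$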
Summarizing, we now rewrite Eq. (6.24) in the following useful form:

Remark 6.7. In the matrix realization (3.3), Eq. (6.24) can be written down as
$$S(g)D(\alpha)S^{-1}(g) = D(\alpha_g), \tag{6.28}$$
where Eq. (6.25) has the expression of the natural action $\mathrm{SU}(1, 1) \times \mathbb{C} \to \mathbb{C}$: $g \cdot \tilde\alpha := \alpha_g$,
$$\alpha_g = a\alpha + b\bar\alpha, \tag{6.29}$$
and $a$, $b$ have the expression (6.21).

Let us remark that the commutation relations (3.5c)–(3.5e) between the generators of the groups SU(1, 1) and HW were chosen in such a way that the action of the group SU(1, 1) on the complex plane $M \approx \mathbb{C} = H_1/\mathbb{R}$ be the natural one, cf. Remark 6.7. Such a choice of the action of the group SU(1, 1) on the group $H_1$, a normal subgroup of the Jacobi group $G^J_1$, was inspired by the squeezed states in Quantum Optics (cf., e.g., [53]). If we had started from the natural action of SU(1, 1) on $\mathbb{C}$ given in Remark 6.7, then the commutation relations (3.5c)–(3.5e) would have followed by taking the derivatives in (6.23) realized as (6.18) using the development (6.21).

Now, we consider the product of two representations $D$ and $\underline{S}$ and apply Remark 6.5:
$$D(\alpha_2)\underline{S}(z_2)D(\alpha_1)\underline{S}(z_1) = D(\alpha_2)D(\alpha)\underline{S}(z_2)\underline{S}(z_1), \tag{6.30}$$
where
$$\alpha = \alpha_1\cosh(|z_2|) + \bar\alpha_1\frac{z_2}{|z_2|}\sinh(|z_2|), \tag{6.31}$$
or $\tilde\alpha = \mathcal{D}(z_2)\tilde\alpha_1$. Equations (6.30) and (6.31) allow us to determine the composition law:

Remark 6.8. The action $(\alpha_2, z_2) \times (\alpha_1, w_1) = (A, w)$, where $z_2, \alpha_{1,2}, A \in \mathbb{C}$, $w, w_1 \in \mathcal{D}_1$ and the variables of type $w$ and $z$ are related by Eqs. (6.3c), (6.3d), can be expressed as:
$$A = \alpha_2 + \alpha_1\cosh|z_2| + \bar\alpha_1\frac{z_2}{|z_2|}\sinh|z_2| = \alpha_2 + \frac{\alpha_1 + \bar\alpha_1 w_2}{(1 - |w_2|^2)^{1/2}}, \tag{6.32a}$$
$$w = \frac{\cosh|z_2|\,w_1 + \frac{z_2}{|z_2|}\sinh|z_2|}{\frac{\bar z_2}{|z_2|}\sinh|z_2|\,w_1 + \cosh|z_2|} = \frac{w_1 + w_2}{1 + w_1\bar w_2}. \tag{6.32b}$$
Equations (6.32) express the action $(\alpha_2, w_2) \times (\alpha_1, w_1) = (\alpha_2 + w_2 \circ \alpha_1, w_2 \circ w_1)$, $\alpha_{1,2} \in \mathbb{C}$, $w_{1,2} \in \mathcal{D}_1$. Equations (6.32) can be written down as:
$$\tilde A = \tilde\alpha_2 + \mathcal{D}\tilde\alpha_1, \tag{6.33a}$$
$$w = \frac{M w_1 + N}{P w_1 + Q}. \tag{6.33b}$$
Let us introduce the normalized vectors:
$$\Psi_{\alpha,w} := D(\alpha)S(w)e_0; \qquad \alpha \in \mathbb{C},\ w \in \mathbb{C},\ |w| < 1. \tag{6.34}$$
As a consequence of (6.30), we have:

Remark 6.9. The product of the representations $D$ and $\underline{S}$ acts on the CS-vector (6.34) with the effect:
$$D(\alpha_2)\underline{S}(z_2)\Psi_{\alpha_1,w_1} = J\Psi_{A,w}, \qquad \text{where } J = e^{i(\theta_h(\alpha_2,\alpha) + k\theta_s)}. \tag{6.35}$$
Above, $(A, w)$ are given by Remark 6.8, $\theta_h(\alpha_2, \alpha)$ is given by (6.2) with $\alpha$ given by (6.31), while $\theta_s$ is given by (6.9c) and the dependence $w_2 = w_2(z_2)$ is given by Eq. (6.3c).

Note also the following important property (6.36), well known in the Quantum Optics of squeezed states (see, e.g., [63, p. 3219, Eq. (20)]):

Comment 6.10. The action of the HW group on the ("squeezed") state vector
$$\Psi_{z,\alpha} = \underline{S}(z)D(\alpha)e_0$$
modifies only the part of the HW group. More precisely, we have
$$D(\beta)\Psi_{z,\alpha} = e^{i\eta}\Psi_{z,\alpha+\gamma}, \qquad \text{where } \eta = \Im(\gamma\bar\alpha), \tag{6.36}$$
and
$$\gamma = \beta\cosh(|z|) - \bar\beta\frac{z}{|z|}\sinh(|z|) \quad \text{or} \quad \tilde\gamma = \mathcal{D}(-z)\tilde\beta. \tag{6.37}$$
Indeed, we apply formula (6.14):
$$D(\beta)\underline{S}(z)D(\alpha) = \underline{S}(z)D(\gamma)D(\alpha),$$
where $\gamma$ has the expression (6.37). Then, (6.36) follows.

6.3. The action of the Jacobi group

Now we find a relation between the (normalized) vector (6.34) and the (unnormalized) Perelomov's CS-vector (4.3), which will be important in the proof of Proposition 6.13, the main result of this section.

Lemma 6.11. The vectors (6.34), (4.3), i.e.
$$\Psi_{\alpha,w} := D(\alpha)S(w)e_0; \qquad e_{z,w} := \exp(z a^+ + w K_+)e_0,$$
are related by the relation
$$\Psi_{\alpha,w} = (1 - w\bar w)^k \exp\left(-\frac{\bar\alpha}{2}z\right)e_{z,w}, \qquad \text{where } z = \alpha - w\bar\alpha. \tag{6.38}$$
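Proof. Due to (6.3a), (6.3b), (4.1b) and (4.1c), we have the relations
$$S(w)e_0 = \exp(w K_+)\exp(\eta K_0)\exp(-\bar w K_-)e_0 = \exp(w K_+)\exp(k\ln(1 - w\bar w))e_0 = (1 - w\bar w)^k\exp(w K_+)e_0,$$
which is also a proof of (6.5). We obtain successively
$$\begin{aligned}
\Psi_{\alpha,w} &= (1 - w\bar w)^k D(\alpha)\exp(w K_+)e_0 \\
&= (1 - w\bar w)^k \exp\left(-\tfrac{1}{2}|\alpha|^2\right)\exp(\alpha a^+)\exp(-\bar\alpha a)\exp(w K_+)e_0 \\
&= (1 - w\bar w)^k \exp\left(-\tfrac{1}{2}|\alpha|^2\right)\exp(\alpha a^+)\exp(-\bar\alpha a)\exp(w K_+)\exp(\bar\alpha a)\exp(-\bar\alpha a)e_0 \\
&= (1 - w\bar w)^k \exp\left(-\tfrac{1}{2}|\alpha|^2\right)\exp(\alpha a^+)E e_0,
\end{aligned}$$
where here
$$E := \exp(-\bar\alpha a)\exp(w K_+)\exp(\bar\alpha a). \tag{6.39}$$
As a consequence of (4.5),
$$\exp(Z)\exp(X)\exp(-Z) = \exp\left(X + [Z, X] + \frac{1}{2}[Z, [Z, X]] + \cdots\right),$$
where, if we take $Z = -\bar\alpha a$; $X = w K_+$, then
$$[Z, X] = -\bar\alpha w a^+; \qquad [Z, [Z, X]] = \bar\alpha^2 w.$$
We find for $E$ defined by (6.39) the value
$$E = \exp\left[w\left(K_+ - \bar\alpha a^+ + \frac{\bar\alpha^2}{2}\right)\right],$$
and finally
$$\Psi_{\alpha,w} = \exp\left(-\frac{1}{2}|\alpha|^2\right)\exp\left(\frac{w}{2}\bar\alpha^2\right)(1 - w\bar w)^k e_{\alpha - w\bar\alpha,\,w},$$
i.e. (6.38).

Comment 6.12. Starting from (6.38), we reobtain the expression (5.2) of the reproducing kernel $K$. Indeed, the normalization $(\Psi_{\alpha,w}, \Psi_{\alpha,w}) = 1$ implies that
$$(e_{\alpha - w\bar\alpha,\,w}, e_{\alpha - w\bar\alpha,\,w}) = \exp\left(|\alpha|^2 - \frac{w}{2}\bar\alpha^2 - \mathrm{c.c.}\right)(1 - w\bar w)^{-2k}. \tag{6.40}$$
With the notation $\alpha - w\bar\alpha = z$, we have
$$\alpha = \frac{z + \bar z w}{1 - w\bar w},$$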
and then (6.40) can be rewritten as
$$(e_{z,w}, e_{z,w}) = (1 - w\bar w)^{-2k}\exp\frac{2z\bar z + w\bar z^2 + \bar w z^2}{2(1 - w\bar w)},$$
i.e. we get another proof of (5.2).

From the following proposition, we can see the holomorphic action of the Jacobi group
$$G^J_1 := H_1 \rtimes \mathrm{SU}(1, 1), \tag{6.41}$$
on the manifold $\mathcal{D}^J_1$ (4.2b):

Proposition 6.13. Let us consider the action $S(g)D(\alpha)e_{z,w}$, where $g \in \mathrm{SU}(1, 1)$ has the form (6.4), $D(\alpha)$ is given by (6.1), and the coherent state vector is defined in (4.3). Then we have the formula (6.42) and the relations (6.43), (6.44)–(6.46):
$$S(g)D(\alpha)e_{z,w} = \lambda e_{z_1,w_1}, \qquad \lambda = \lambda(g, \alpha; z, w), \tag{6.42}$$
$$z_1 = \frac{\alpha - \bar\alpha w + z}{\bar b w + \bar a}; \qquad w_1 = g \cdot w = \frac{a w + b}{\bar b w + \bar a}, \tag{6.43}$$
$$\lambda = (\bar a + \bar b w)^{-2k}\exp\left(\frac{z}{2}\bar\alpha_0 - \frac{z_1}{2}\bar\alpha_2\right)\exp\left(i\theta_h(\alpha, \alpha_0)\right), \tag{6.44}$$
$$\alpha_0 = \frac{z + \bar z w}{1 - w\bar w}, \tag{6.45}$$
$$\alpha_2 = (\alpha + \alpha_0)a + (\bar\alpha + \bar\alpha_0)b. \tag{6.46}$$
Corollary 6.14. The action of the six-dimensional Jacobi group (6.41) on the four-dimensional manifold (4.2b), where $\mathcal{D}_1 = \mathrm{SU}(1, 1)/\mathrm{U}(1)$, is given by Eqs. (6.42), (6.43). The composition law in $G^J_1$ is
$$(g_1, \alpha_1, t_1) \circ (g_2, \alpha_2, t_2) = \left(g_1 \circ g_2,\ g_2^{-1} \cdot \tilde\alpha_1 + \alpha_2,\ t_1 + t_2 + \Im(g_2^{-1} \cdot \alpha_1\,\bar\alpha_2)\right), \tag{6.47}$$
where $g \cdot \tilde\alpha := \alpha_g$ is given by (6.29), and if $g$ has the form given by (6.4), then $g^{-1} \cdot \tilde\alpha = \alpha_{g^{-1}} = \bar a\alpha - b\bar\alpha$.

Proof of Proposition 6.13. With Lemma 6.11, we have $e_{z,w} = \lambda_1\Psi_{\alpha_0,w}$, where $\alpha_0$ is given by (6.45) and $\lambda_1 = \exp\left(\frac{z}{2}\bar\alpha_0\right)(1 - |w|^2)^{-k}$. Then, $I := S(g)D(\alpha)e_{z,w}$ becomes successively
$$I = \lambda_1 S(g)D(\alpha)\Psi_{\alpha_0,w} = \lambda_1 S(g)D(\alpha)D(\alpha_0)S(w)e_0 = \lambda_2 S(g)D(\alpha_1)S(w)e_0,$$
where $\alpha_1 = \alpha + \alpha_0$ and $\lambda_2 = \lambda_1 e^{i\theta_h(\alpha_1,\alpha_0)}$. With Eqs. (6.28), (6.29), we have $I = \lambda_2 D(\alpha_2)S(g)S(w)e_0$, where $\alpha_2 = a\alpha_1 + b\bar\alpha_1$. But (6.5) implies $I = \lambda_3 D(\alpha_2)S(g)e_{0,w}$, with $\lambda_3 = \lambda_2(1 - |w|^2)^k$. Now, we use (6.7) and we find $I = \lambda_4 D(\alpha_2)e_{0,w_1}$, where, in accord with (6.8), $w_1$ is given by (6.43) and $\lambda_4 = (\bar a + \bar b w)^{-2k}\lambda_3$. We rewrite the last equation as $I = \lambda_5 D(\alpha_2)S(w_1)e_0$, where $\lambda_5 = (1 - |w_1|^2)^{-k}\lambda_4$. Then, we apply again Lemma 6.11 and we find $I = \lambda_6 e_{z_1,w_1}$, where $\lambda_6 = \lambda_5(1 - |w_1|^2)^k\exp\left(-\frac{\bar\alpha_2}{2}z_1\right)$, and $z_1 = \alpha_2 - w_1\bar\alpha_2$. Proposition 6.13 is proved.

Remark 6.15. Combining the expressions (6.43)–(6.46), the factor $\lambda$ in (6.42) can be written down as
$$\lambda = (\bar a + \bar b w)^{-2k}\exp(-\lambda_1), \tag{6.48}$$
where
$$\lambda_1 = \frac{\bar b z^2 + (\bar a\bar\alpha + \bar b\alpha)(2z + z_0)}{2(\bar a + \bar b w)}, \qquad z_0 = \alpha - \bar\alpha w, \tag{6.49}$$
or
$$\lambda_1 = \bar\alpha\left(z + \frac{z_0}{2}\right) + \frac{\bar b(z + z_0)^2}{2(\bar a + \bar b w)}. \tag{6.50}$$
Note that the expression (6.48)–(6.50) is identical with the expression given in [28, Theorem 1.4] for the Jacobi forms, under the identification of $c, d, \tau, z, \mu, \lambda$ in [28] with, respectively, $\bar b, \bar a, w, z, \alpha, -\bar\alpha$ in our notation. Note also that the composition law (6.47) of the Jacobi group $G^J$ and the action of the Jacobi group on the base manifold (4.2b) are similar to those in the paper [18]. See also Sec. 9 and [19, Corollary 3.4.4].

7. The Symmetric Fock Space

We recall the construction (2.6) of the map
$$\Phi : \mathcal{H}^* \to \mathcal{F}_{\mathcal{H}}; \qquad \Phi(\psi) = f_\psi, \qquad f_\psi(z) := (e_{\bar z}, \psi)_{\mathcal{H}},$$
and the isometric embedding (2.8). Knowing the symmetric Fock spaces associated to the groups HW and SU(1, 1), we shall construct in this section the symmetric Fock space associated to the Jacobi group. We begin by recalling the construction for the Heisenberg–Weyl group.

7.1. The Heisenberg–Weyl group

In the orthonormal base (5.9), Perelomov's CS-vectors associated to the HW group, defined on $M := H_1/\mathbb{R} = \mathbb{C}$, are
$$e_z := e^{z a^+}e_0 = \sum_n \frac{z^n}{(n!)^{1/2}}|n\rangle, \tag{7.1}$$
and their corresponding holomorphic functions are (see, e.g., [3])
$$f_{|n\rangle}(z) := (e_{\bar z}, |n\rangle) = \frac{z^n}{(n!)^{1/2}}. \tag{7.2}$$
The reproducing kernel $K : \mathbb{C} \times \bar{\mathbb{C}} \to \mathbb{C}$ is
$$K(z, \bar z') := (e_{\bar z}, e_{\bar z'}) = \sum_n f_{|n\rangle}(z)\bar f_{|n\rangle}(z') = e^{z\bar z'}, \tag{7.3}$$
where the vector $e_z$ is given by (7.1), while the function $f_{|n\rangle}(z)$ is given by (7.2). In order to obtain the equality (7.3) with $e_z$ given by (7.1), Eq. (5.9) is used. The scalar product (2.1) on the Segal–Bargmann–Fock space is (cf. [3])
$$(\phi, \psi)_{\mathcal{H}^*} = (f_\phi, f_\psi)_{\mathcal{F}_{\mathcal{H}}} = \frac{1}{\pi}\int \bar f_\phi(z)f_\psi(z)e^{-|z|^2}\,d\Re z\,d\Im z.$$
Now, we recall the similar construction for the group SU(1, 1).

7.2. The group SU(1, 1)

In the orthonormal base (5.7), Perelomov's CS-vectors for SU(1, 1), based on the unit disk $\mathcal{D}_1 = \mathrm{SU}(1, 1)/\mathrm{U}(1)$, are
$$e_z := e^{z K_+}e_0 = \sum_n \frac{z^n K_+^n}{n!}e_0 = \sum_n \frac{z^n e_{k,k+n}}{n!\,a_{kn}}, \tag{7.4}$$
and the corresponding holomorphic functions are (see, e.g., [2, Eq. (9.14)])
$$f_{e_{k,k+n}}(z) := (e_{\bar z}, e_{k,k+n}) = \sqrt{\frac{\Gamma(n + 2k)}{n!\,\Gamma(2k)}}\,z^n. \tag{7.5}$$
The reproducing kernel $K : \mathcal{D}_1 \times \bar{\mathcal{D}}_1 \to \mathbb{C}$ is
$$K(z, \bar z') := (e_{\bar z}, e_{\bar z'}) = \sum_m f_{e_{k,k+m}}(z)\bar f_{e_{k,k+m}}(z') = (1 - z\bar z')^{-2k}, \tag{7.6}$$
where the vector $e_z$ is given by (7.4), while the function $f_{e_{k,k+m}}(z)$ is given by (7.5). In order to obtain the equality (7.6) for $e_z$ given by (7.4), the orthonormality given by (5.7) is used, while for the second equality involving the functions (7.5), use is made of Eq. (5.8). The scalar product (2.1) on $\mathcal{D}_1 = \mathrm{SU}(1, 1)/\mathrm{U}(1)$ is (see, e.g., [2, Eq. (9.9)])
$$(\phi, \psi)_{\mathcal{H}^*} = (f_\phi, f_\psi)_{\mathcal{F}_{\mathcal{H}}} = \frac{2k - 1}{\pi}\int_{|z|<1}\bar f_\phi(z)f_\psi(z)(1 - |z|^2)^{2k-2}\,d\Re z\,d\Im z, \qquad 2k = 2, 3, \ldots.$$

7.3. The Jacobi group

In formula (4.3) defining Perelomov's CS-vectors for the Jacobi group (6.41), we take into account (5.4a), (5.6a) and we have
$$e_{z,w} = \exp\left(z a^+ + \frac{1}{2}(a^+)^2 w\right)\exp(w K'_+)e_0.$$
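With (5.13), (5.11), we have
$$e_{z,w} = \sum_n \frac{i^{-n}}{n!}\left(\frac{w}{2}\right)^{\frac{n}{2}}H_n\!\left(\frac{iz}{\sqrt{2w}}\right)(a^+)^n\sum_m \frac{w^m}{m!}(K'_+)^m e_0.$$
Now, we take into account (5.7) and we get
$$e_{z,w} = \sum_n \frac{i^{-n}}{(n!)^{1/2}}\left(\frac{w}{2}\right)^{\frac{n}{2}}H_n\!\left(\frac{iz}{\sqrt{2w}}\right)|n\rangle\sum_m \frac{w^m}{m!\,a_{k'm}}e_{k',k'+m}.$$
The base of functions associated to the CS-vectors attached to the Jacobi group (6.41), based on the manifold $M$ (4.2b),
$$f_{|n\rangle;\,e_{k',k'+m}}(z, w) := (e_{\bar z,\bar w}, |n\rangle e_{k',k'+m}), \qquad z \in \mathbb{C},\ |w| < 1, \tag{7.7}$$
consists of the functions
$$f_{|n\rangle;\,e_{k',k'+m}}(z, w) = (n!)^{-1/2}\left(\frac{i}{\sqrt{2}}\right)^n\sqrt{\frac{\Gamma(m + 2k')}{m!\,\Gamma(2k')}}\,w^{m + \frac{n}{2}}H_n\!\left(\frac{-iz}{\sqrt{2w}}\right). \tag{7.8}$$
Using Eq. (5.12), we can write down
$$\left(\frac{i}{\sqrt{2}}\right)^n w^{\frac{n}{2}}H_n\!\left(\frac{-iz}{\sqrt{2w}}\right) := P_n(z, w), \tag{7.9}$$
where the polynomials $P_n(z, w)$ have the expression
$$P_n(z, w) = n!\sum_{k=0}^{[\frac{n}{2}]}\left(\frac{w}{2}\right)^k\frac{z^{n-2k}}{k!\,(n - 2k)!}. \tag{7.10}$$
With the notation (7.9), Eq. (7.8) can be written down as
$$f_{|n\rangle;\,e_{k',k'+m}}(z, w) = f_{e_{k',k'+m}}(w)\frac{P_n(z, w)}{\sqrt{n!}}, \tag{7.11}$$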
where the functions $f_{e_{k',k'+m}}$ are defined in (7.5). Above, we have $2k = 2k' + 1/2$ and $m = 0, 1, \ldots$. In order to illustrate (7.10), we present the first six polynomials $P_n(z, w)$:
$$\begin{aligned}
P_0 &= 1; & P_1 &= z; \\
P_2 &= z^2 + w; & P_3 &= z^3 + 3zw; \\
P_4 &= z^4 + 6z^2 w + 3w^2; & P_5 &= z^5 + 10z^3 w + 15zw^2.
\end{aligned} \tag{7.12}$$
The reproducing kernel (5.3) $K : M \times \bar M \to \mathbb{C}$ has the property:
$$K(z, w; \bar z', \bar w') := (e_{\bar z,\bar w}, e_{\bar z',\bar w'}) = \sum_{n,m} f_{|n\rangle,\,e_{k',k'+m}}(z, w)\bar f_{|n\rangle,\,e_{k',k'+m}}(z', w') \tag{7.13a}$$
$$= (1 - w\bar w')^{-2k}\exp\frac{2\bar z' z + z^2\bar w' + \bar z'^2 w}{2(1 - w\bar w')}. \tag{7.13b}$$
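The polynomials listed in (7.12) follow mechanically from the sum (7.10). The sketch below generates their coefficients with exact integer arithmetic; the function name `p_coefficients` and the dictionary encoding are illustrative assumptions.

```python
from math import factorial


def p_coefficients(n):
    """Coefficients of P_n(z, w) from Eq. (7.10).

    Returns a dict mapping (power of z, power of w) to the integer
    coefficient n! / (2^j j! (n - 2j)!), which is always an integer.
    """
    return {
        (n - 2 * j, j): factorial(n) // (2 ** j * factorial(j) * factorial(n - 2 * j))
        for j in range(n // 2 + 1)
    }


for n in range(6):
    print(n, p_coefficients(n))
```

For $n = 5$ the term with $j = 2$ is $5!/(2^2 \cdot 2! \cdot 1!)\, z w^2 = 15 z w^2$, which fixes the power of $w$ in the last term of $P_5$.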
The fact that the coherent state vectors (4.3) have the scalar product given by (7.13b) was proved in Lemma 5.1, Eq. (5.3). In order to check the equality (7.13) for the functions (7.8), we use the form (7.11), sum up the part corresponding to the functions (7.5) using the summation formula (7.6) for $k'$, while for the part corresponding to the mixed part $P_n(z, w)$ in (7.11), we go back with (7.9) to the representation (7.8), and we apply again the summation formula (5.14) of the Hermite polynomials.

In accord with the general scheme of Sec. 2.1, the scalar product (2.1) of functions from the space $\mathcal{F}_K$ corresponding to the kernel defined by (5.3) on the manifold (4.2b) is:
$$(\phi, \psi) = \Lambda\int_{z \in \mathbb{C};\,|w|<1}\bar f_\phi(z, w)f_\psi(z, w)(1 - w\bar w)^{2k}\exp\left(-\frac{|z|^2}{1 - w\bar w}\right)\exp\left(-\frac{z^2\bar w + \bar z^2 w}{2(1 - w\bar w)}\right)d\nu, \tag{7.14}$$
where the value of the $G^J$-invariant measure $d\nu$,
$$d\nu = \frac{d\Re w\,d\Im w\,d\Re z\,d\Im z}{(1 - w\bar w)^3}, \tag{7.15}$$
will be deduced later in (7.22), in accord with the recipe (2.2). In order to find the value of the constant $\Lambda$ in (7.14), we take the functions $\phi, \psi = 1$, we change the variable $z \to (1 - w\bar w)^{1/2}z$ and we get
$$1 = \Lambda\int_{|w|<1}(1 - w\bar w)^{2k-2}\,d\Re w\,d\Im w\int_{z \in \mathbb{C}}\exp(-|z|^2)\exp\left(-\frac{z^2\bar w + \bar z^2 w}{2}\right)d\Re z\,d\Im z.$$
We apply Eqs. (A1), (A2) in [4]:
$$I(B, C) = \int\exp\left(\frac{1}{2}(z \cdot Bz + \bar z \cdot C\bar z)\right)\pi^{-n}e^{-|z|^2}\prod_{k=1}^{n}d\Re z_k\,d\Im z_k = [\det(1 - CB)]^{-\frac{1}{2}},$$
where $B$, $C$ are complex symmetric matrices such that $|B| < 1$, $|C| < 1$. Here, $n = 1$, $B = -\bar w$, $C = -w$. So, we get
$$1 = \pi\Lambda\int_{|w|<1}(1 - w\bar w)^{2k-5/2}\,d\Re w\,d\Im w.$$
But
$$\int_{|w|<1}\frac{d\Re w\,d\Im w}{(1 - |w|^2)^\lambda} = \frac{\pi}{1 - \lambda}, \qquad \text{where } \lambda < 1,$$
and we find the value of the constant $\Lambda$ in (7.14):
$$\Lambda = \frac{4k - 3}{2\pi^2}. \tag{7.16}$$
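The value (7.16) rests on the disk integral $\int_{|w|<1}(1 - |w|^2)^{2k - 5/2}\,d\Re w\,d\Im w = \pi/(2k - 3/2)$. A rough numerical confirmation, assuming a simple midpoint rule in polar coordinates (the function names and step count are illustrative choices), could look as follows:

```python
import math


def disk_integral(power, steps=20000):
    """Midpoint-rule value of the integral of (1 - |w|^2)^power over |w| < 1.

    In polar coordinates this is 2 * pi * integral_0^1 r (1 - r^2)^power dr.
    """
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        total += r * (1 - r * r) ** power
    return 2 * math.pi * h * total


def normalization_constant(k):
    """Constant Lambda of Eq. (7.16)."""
    return (4 * k - 3) / (2 * math.pi ** 2)


k = 2.0
# The condition 1 = pi * Lambda * (disk integral) should hold up to quadrature error.
print(math.pi * normalization_constant(k) * disk_integral(2 * k - 2.5))
```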
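7.4. The geometry of the manifold $\mathbb{C} \times \mathcal{D}_1$

Now, we follow the general prescription of Sec. 2.2. We calculate the Kähler potential as the logarithm of the reproducing kernel (5.3), $f := \log K$, i.e.
$$f = \frac{2z\bar z + z^2\bar w + \bar z^2 w}{2(1 - w\bar w)} - 2k\log(1 - w\bar w). \tag{7.17}$$
The Kähler two-form $\omega$ is given by the formula:
$$-i\omega = f_{z\bar z}\,dz \wedge d\bar z + f_{z\bar w}\,dz \wedge d\bar w - f_{\bar z w}\,d\bar z \wedge dw + f_{w\bar w}\,dw \wedge d\bar w. \tag{7.18}$$
The volume form is:
$$-\omega \wedge \omega = 2\begin{vmatrix} f_{z\bar z} & f_{z\bar w} \\ f_{\bar z w} & f_{w\bar w} \end{vmatrix}dz \wedge d\bar z \wedge dw \wedge d\bar w. \tag{7.19}$$
We start by calculating the partial derivatives of the function $f$. We have
$$f_{\bar z} = \frac{z + \bar z w}{1 - w\bar w},$$
$$f_{\bar w} = \frac{z^2 + w(2z\bar z + \bar z^2 w)}{2(1 - w\bar w)^2} + \frac{2kw}{1 - w\bar w},$$
$$f_{w\bar w} = \frac{A}{2(1 - w\bar w)^4} + \frac{2k}{(1 - w\bar w)^2},$$
where $A$ is
$$A = (2\bar z^2 w + 2z\bar z)(1 - w\bar w)^2 + 2\bar w(1 - w\bar w)(\bar z^2 w^2 + 2z\bar z w + z^2) = (1 - w\bar w)A'.$$
Here, $A'$ is
$$A' = 2\bar z^2 w + 2z\bar z - 2\bar z^2 w^2\bar w - 2z\bar z w\bar w + 2\bar z^2 w^2\bar w + 4z\bar z w\bar w + 2z^2\bar w,$$
and we have finally
$$A' = 2(\bar z + \bar w z)(z + w\bar z).$$
So, we find for the manifold (4.2) the fundamental two-form $\omega$ (7.18), where
$$f_{z\bar z} = \frac{1}{1 - w\bar w}, \tag{7.20a}$$
$$f_{z\bar w} = \frac{z + w\bar z}{(1 - w\bar w)^2}, \tag{7.20b}$$
$$f_{w\bar w} = \frac{(\bar z + \bar w z)(z + w\bar z)}{(1 - w\bar w)^3} + \frac{2k}{(1 - w\bar w)^2}. \tag{7.20c}$$
We can write down the two-form $\omega$ (7.18)–(7.20) as
$$-i\omega = \frac{2k}{(1 - w\bar w)^2}\,dw \wedge d\bar w + \frac{\mathcal{A} \wedge \bar{\mathcal{A}}}{1 - w\bar w}, \qquad \mathcal{A} = dz + \bar\alpha_0\,dw, \qquad \alpha_0 = \frac{z + \bar z w}{1 - w\bar w}. \tag{7.21}$$
For the volume form (7.19), we find
$$\omega \wedge \omega = 4k(1 - w\bar w)^{-3}\,4\,d\Re z \wedge d\Im z \wedge d\Re w \wedge d\Im w. \tag{7.22}$$
It can be checked that the measure $d\nu$ and the fundamental two-form $\omega$ are indeed invariant under the action (6.43) of the Jacobi group (6.41).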
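The components (7.20a)–(7.20c) can be cross-checked against the potential (7.17) by finite differences. This is only a numerical sketch: the helper names, the step size and the test point are assumptions, and the mixed Wirtinger derivative is assembled from real second partial derivatives.

```python
import math


def kahler_potential(point, k):
    """Potential (7.17) at point = [Re z, Im z, Re w, Im w]."""
    z, w = complex(point[0], point[1]), complex(point[2], point[3])
    p = 1 - abs(w) ** 2
    numerator = 2 * abs(z) ** 2 + 2 * (z * z * w.conjugate()).real
    return numerator / (2 * p) - 2 * k * math.log(p)


def second_partial(func, point, i, j, h=1e-4):
    """Central-difference approximation of d^2 func / (dx_i dx_j)."""
    def shifted(di, dj):
        q = list(point)
        q[i] += di * h
        q[j] += dj * h
        return func(q)
    return (shifted(1, 1) - shifted(1, -1) - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)


def mixed_wirtinger(func, point, a, b):
    """d^2 func / (d u_a d conj(u_b)), with u_0 = z and u_1 = w."""
    xa, ya, xb, yb = 2 * a, 2 * a + 1, 2 * b, 2 * b + 1
    real = second_partial(func, point, xa, xb) + second_partial(func, point, ya, yb)
    imag = second_partial(func, point, xa, yb) - second_partial(func, point, ya, xb)
    return complex(real, imag) / 4


k = 1.0
point = [0.3, 0.4, 0.2, -0.1]
z, w = complex(0.3, 0.4), complex(0.2, -0.1)
p = 1 - abs(w) ** 2
f = lambda q: kahler_potential(q, k)
print(mixed_wirtinger(f, point, 0, 0), 1 / p)                            # (7.20a)
print(mixed_wirtinger(f, point, 0, 1), (z + w * z.conjugate()) / p ** 2)  # (7.20b)
print(mixed_wirtinger(f, point, 1, 1),
      abs(z + w * z.conjugate()) ** 2 / p ** 3 + 2 * k / p ** 2)          # (7.20c)
```

The determinant of these components equals $2k/(1 - w\bar w)^3$, which is the density appearing in (7.22) and (7.15).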
Now, we summarize the contents of this section as follows:

Proposition 7.1. Let us consider the Jacobi group $G^J_1$ (6.41) with the composition rule (6.47) acting on the coherent state manifold (4.2) via Eq. (6.43). The manifold $\mathcal{D}^J_1$ has the Kähler potential (7.17) and the $G^J_1$-invariant Kähler two-form $\omega$ given by (7.21). The holomorphic polynomials (7.7) associated to the coherent state vectors (4.3) are given by (7.11), where the functions $f$ are given by (7.5), while the polynomials $P$ are given by (7.10). The Hilbert space of holomorphic functions $\mathcal{F}_K$ associated to the holomorphic kernel $K : \mathcal{D}^J_1 \times \bar{\mathcal{D}}^J_1 \to \mathbb{C}$ given by (5.3) is endowed with the scalar product (7.14), where the normalization constant $\Lambda$ is given by (7.16) and the $G^J_1$-invariant measure $d\nu$ is given by (7.15).

Recalling [52, Proposition IV.1.9, p. 104; Proposition XII.2.1, p. 515], Proposition 6.13 can be formulated as follows:

Proposition 7.2. Let $h := (g, \alpha) \in G^J_1$, where $G^J_1$ is the Jacobi group (6.41), consider the representation $\pi(h) := S(g)D(\alpha)$, $g \in \mathrm{SU}(1, 1)$, $\alpha \in \mathbb{C}$, and let $x := (z, w) \in \mathcal{D}^J_1 := \mathbb{C} \times \mathcal{D}_1$. Then, the continuous unitary representation $(\pi_K, \mathcal{H}_K)$ attached to the positive definite holomorphic kernel $K$ defined by (5.3) is
$$(\pi_K(h) \cdot f)(x) = J(h^{-1}, x)^{-1}f(h^{-1} \cdot x), \tag{7.23}$$
where the cocycle $J(h^{-1}, x)^{-1} := \lambda(h^{-1}, x)$ with $\lambda$ defined by Eqs. (6.42)–(6.46) and the function $f$ belongs to the Hilbert space of holomorphic functions $\mathcal{H}_K \equiv \mathcal{F}_K$ endowed with the scalar product (7.14), where $\Lambda$ is given by (7.16).

Remark 7.3. The equations of the geodesics on the manifold $\mathcal{D}^J_1$ endowed with the two-form (7.21) in the variables $w \in \mathcal{D}_1$, $z \in \mathbb{C}$ are
$$2k\frac{d^2 z}{dt^2} - \bar\alpha_0\left(\frac{dz}{dt}\right)^2 + 2\left(2k\frac{\bar w}{P} - \bar\alpha_0^2\right)\frac{dz}{dt}\frac{dw}{dt} - \bar\alpha_0^3\left(\frac{dw}{dt}\right)^2 = 0; \tag{7.24a}$$
$$2k\frac{d^2 w}{dt^2} + \left(\frac{dz}{dt}\right)^2 + 2\bar\alpha_0\frac{dz}{dt}\frac{dw}{dt} + \left(4k\frac{\bar w}{P} + \bar\alpha_0^2\right)\left(\frac{dw}{dt}\right)^2 = 0, \tag{7.24b}$$
where $\alpha_0$ is given by (6.45) and $P = 1 - w\bar w$.

8. Physical Applications: Classical and Quantum Equations of Motion

We consider applications of the formulas (4.4) proved in this paper for the Jacobi group $G^J_1$ to equations of motion on the CS-manifold $\mathcal{D}^J_1$. This extends our previous results for hermitian groups [7, 8] or semisimple groups which generate CS-orbits [10, 13] to an example of a non-reductive CS-group. Passing from the dynamical system problem in the Hilbert space $\mathcal{H}$ to the corresponding one on $M$ is sometimes called dequantization, and the system on $M$
is a classical one [7, 8]. Following Berezin [16], the motion on the classical phase space can be described by the local equations of motion z˙γ = i{H, zγ },
(8.1)
where H is the energy function attached to the Hamiltonian H. In (8.1), {·, ·} denotes the Poisson bracket: ∂f ∂g ∂f ∂g {f, g} = G−1 − , f, g ∈ C ∞ (M ), α,β ∂zα ∂ z¯β ∂ z¯α ∂zβ α,β∈∆+
where G is defined in (2.4). The equations of motion (8.1) can be written down as ∂ ∂z 0 G z˙ i = − (8.2) ∂ H. ¯ 0 −G z¯˙ ∂ z¯ We consider an algebraic Hamiltonian linear in the generators of the group of symmetry λ X λ . (8.3) H= λ∈∆
We recall (cf. [7, 8]) that if the differential action of the generators of the group $G$ is given by formulas (2.11), then the classical motion and the quantum evolution generated by the Hamiltonian (8.3) are given by the same equations of motion (8.1) on $M = G/H$:
$$i\dot z_\alpha = \sum_\lambda \epsilon_\lambda Q_{\lambda,\alpha}. \tag{8.4}$$
The two-form $\omega$ on $M$ permits one to determine the Berry phase [8]. Let us consider a linear Hamiltonian in the generators of the Jacobi group (6.41):
$$H = \epsilon_a a + \bar\epsilon_a a^+ + \epsilon_0 K_0 + \epsilon_+ K_+ + \epsilon_- K_-. \tag{8.5}$$
We have Remark 8.1. The equations of motion on the manifold (4.2b) generated by the linear Hamiltonian (8.5) are given by the matrix Riccati equation: 0 (8.6a) iz˙ = a + z + + zw, 2 iw˙ = − + (¯ a + 0 )w + + w2 . (8.6b) Note that the second equation (8.6b) is a Riccati equation on D1 . The procedure of linearization of matrix Riccati equation on manifolds is discussed in [8]. An interesting development of the present construction should be to consider nonlinear CSs [62] attached to a deformed Jacobi group. However, difficulties in the physical interpretation of the creation and annihilation of q-deformed oscillator, related to the quantum groups SUq (2) [22], appear as these are symmetric but not self-adjoint operators [45].
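9. Comparison with Kähler–Berndt's Approach

Berndt, alone or in collaboration, has studied the real Jacobi group $G^J(\mathbb{R})$ in several references [17–20]. The Jacobi group appears (see explanation in [40]) in the context of the so-called Poincaré group or The New Poincaré group investigated by Kähler as the ten-dimensional group $G^K$ which leaves invariant a hyperbolic metric [37–39]. Kähler and Berndt have investigated the Jacobi group $G^J_0(\mathbb{R}) := \mathrm{SL}_2(\mathbb{R}) \ltimes \mathbb{R}^2$ acting on the manifold $\mathcal{X}^J_1 := \mathcal{H}_1 \times \mathbb{C}$, where $\mathcal{H}_1$ is the upper half-plane $\mathcal{H}_1 := \{v \in \mathbb{C} \mid \Im(v) > 0\}$. For the sake of self-containedness, in Remarks 9.1 and 9.2 below, we briefly prove two results from [19], which we need in order to express our two-form $\omega$ in the coordinates used by Kähler and Berndt.

Remark 9.1. The action of $G^J_0(\mathbb{R})$ on $\mathcal{X}^J_1$ is given by $(g, (v, z)) \to (v_1, z_1)$, $g = (M, l)$, where
$$v_1 = \frac{av + b}{cv + d}, \qquad z_1 = \frac{z + l_1 v + l_2}{cv + d}; \qquad M = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}_2(\mathbb{R}), \quad (l_1, l_2) \in \mathbb{R}^2. \tag{9.1}$$
Proof. Let us use the notation of [19]. We denote $G^J(\mathbb{R}) := \mathrm{SL}_2(\mathbb{R}) \ltimes H(\mathbb{R})$, where $H(\mathbb{R})$ is the real HW group with the composition law:
$$(\lambda, \mu, \kappa)(\lambda', \mu', \kappa') = \left(\lambda + \lambda', \mu + \mu', \kappa + \kappa' + \begin{vmatrix} X \\ X' \end{vmatrix}\right), \qquad \begin{vmatrix} X \\ X' \end{vmatrix} = \det\begin{pmatrix} X \\ X' \end{pmatrix}. \tag{9.2}$$
If $g = (M, X, \kappa) \in G^J(\mathbb{R})$, where $M \in \mathrm{SL}_2(\mathbb{R})$, $X = (\lambda, \mu)$, $(X, \kappa) \in \mathbb{R}^3$, then the composition law in the real Jacobi group is
$$gg' = \left(MM', XM' + X', \kappa + \kappa' + \begin{vmatrix} XM' \\ X' \end{vmatrix}\right). \tag{9.3}$$
The action of $G^J(\mathbb{R})$ on $H(\mathbb{R})$ is
$$M(X, \kappa)M^{-1} = (XM^{-1}, \kappa). \tag{9.4}$$
Let us consider the Iwasawa decomposition for a matrix $M \in \mathrm{SL}_2(\mathbb{R})$:
$$M = \begin{pmatrix} 1 & x \\ 0 & 1 \end{pmatrix}\begin{pmatrix} y^{1/2} & 0 \\ 0 & y^{-1/2} \end{pmatrix}\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}, \qquad y > 0. \tag{9.5}$$
If
$$M = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \tag{9.6}$$
then we find for $x, y, \theta$ in (9.5)
$$x = \frac{ac + bd}{d^2 + c^2}; \qquad y = \frac{1}{d^2 + c^2}; \qquad \sin\theta = -\frac{c}{\sqrt{c^2 + d^2}}; \qquad \cos\theta = \frac{d}{\sqrt{c^2 + d^2}}. \tag{9.7}$$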
For $g = (M, X, \kappa) \in G^J(\mathbb{R})$, the EZ-coordinates (Eichler–Zagier, cf. [19, the definition, pp. 12 and 51]) are $(x, y, \theta, \lambda, \mu, \kappa)$. Let $\tau = x + iy \in \mathcal{H}_1$, $z = \xi + i\eta = p\tau + q$, where
$$(p, q) = XM^{-1} = (\lambda d - \mu c, -\lambda b + \mu a). \tag{9.8}$$
Using the multiplication law (9.3), the Iwasawa decomposition (9.5) and Eqs. (9.7), (9.8), we find the action of $G^J(\mathbb{R})$ on the base $\mathcal{X}^J_1$,
$$g(\tau, z) = \left(\frac{a\tau + b}{c\tau + d}, \frac{z + \lambda\tau + \mu}{c\tau + d}\right), \tag{9.9}$$
and Remark 9.1 is proved.

Let us now recall that
$$C^{-1}\mathrm{SL}_2(\mathbb{R})C = \mathrm{SU}(1, 1), \qquad \text{where } C = \begin{pmatrix} i & i \\ -1 & 1 \end{pmatrix}. \tag{9.10}$$
If $M \in \mathrm{SL}_2(\mathbb{R})$ is the matrix (9.6), then, under the transformation (9.10),
$$M^* = C^{-1}MC = \begin{pmatrix} \alpha & \beta \\ \bar\beta & \bar\alpha \end{pmatrix}, \qquad \alpha, \beta \in \mathbb{C}, \quad |\alpha|^2 - |\beta|^2 = 1, \tag{9.11}$$
where
$$2\alpha = a + d + i(b - c); \qquad 2\beta = a - d - i(b + c). \tag{9.12}$$
Now, we pass to the complex group $G^J_{\mathbb{C}} = C^{-1}G^J(\mathbb{R})C$. We recall that the Jacobi group $G^J_{\mathbb{C}}$ is a group of Harish–Chandra type (cf., e.g., [52, p. 514]; see the definition in [59, Chap. III, Sec. 5] and [52, Chap. XII.1]). Moreover, it is well known that the Jacobi algebra (3.4) is a CS-Lie algebra (cf., e.g., [48, Theorem 5.2]). The correspondence between our notation and that of Berndt–Schmidt in [19, p. 12] is as follows: $a^+, a, K_+, K_-, 1, K_0$ correspond, respectively, to $Y_+, Y_-, X_+, -X_-, -Z_0, \frac{1}{2}Z$. See also our Remark 10.2 below. We see that under the transformation (9.10), $g = (M, X, \kappa) \in \mathrm{SL}_2(\mathbb{R}) \ltimes H(\mathbb{R})$ is twisted to $g^* = (M^*, X^*, \kappa)$, where $M^*$ is given by (9.11), while, due to the action (9.4), $X^* = XC = (i\lambda - \mu, i\lambda + \mu)$. Also the map (9.10) induces a transformation of the upper half-plane $\mathcal{H}_1$ into the bounded domain $\mathcal{D}_1$:
$$\tau \in \mathcal{H}_1 \to \tau^* = C^{-1}(\tau) = \frac{\tau - i}{\tau + i} \in \mathcal{D}_1. \tag{9.13}$$
The action $C^{-1}G^J_0(\mathbb{R})C$ descends on the basis as the biholomorphic map $\check{C}^{-1} : \mathcal{X}^J_1 := \mathcal{H}_1 \times \mathbb{C} \to \mathcal{D}^J_1 := \mathcal{D}_1 \times \mathbb{C}$: $(\tau, z) \to (\tau^*, z^*)$. Here $\tau^*$ is given by (9.13), while $z^* = p^*\tau^* + q^*$. So, $(p, q) = (\lambda, \mu)M^{-1}$, and $(p^*, q^*) = (\lambda^*, \mu^*)M^{*-1}$. But $M^* = C^{-1}MC$, and $(p^*, q^*) = (p, q)C = (-q + ip, q + ip)$, and we get $z^* = \frac{2iz}{\tau + i}$. Note that in [19, p. 53] the factor $2i$ in this formula is missing.
In a different notation, we have shown that Remark 9.2. The action C −1 GJ0 (R)C, descends on the basis as the biholomorphic map: Cˇ −1 : XJ1 := H1 × C → DJ1 := D1 × C: w=
v−i ; v+i
z=
2iu , v+i
w ∈ D1 ,
v ∈ H1 , z ∈ C.
(9.14)
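The $G^J_0(\mathbb{R})$-invariant closed two-form considered by Kähler–Berndt is:
$$\omega = \alpha \frac{dv \wedge d\bar v}{(v - \bar v)^2} + \beta \frac{1}{v - \bar v} B \wedge \bar B, \qquad B = du - \frac{u - \bar u}{v - \bar v}\, dv, \qquad v, u \in \mathbb{C}, \quad \Im(v) > 0, \tag{9.15}$$
cf. [39, Sec. 36]; see also [17, Sec. 3.2], where the first term is misprinted as $\alpha \frac{dv \wedge d\bar v}{v - \bar v}$. Under the mapping (9.14), the two-form $\omega$ (7.21) reads
$$-i\omega = -\frac{2k}{(\bar v - v)^2}\, dv \wedge d\bar v + \frac{2}{i(\bar v - v)}\, B \wedge \bar B, \tag{9.16}$$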
i.e. (9.15). In fact, we have proved that

Remark 9.3. When expressed in the coordinates $(v, u) \in X^J_1$ which are related to the coordinates $(w, z) \in D^J_1$ by the map (9.14) given by Remark 9.2, our Kähler two-form (7.21) is identical with the one (9.16) considered by Kähler–Berndt.

If we use the EZ coordinates adapted to our notation
$$v = x + iy; \qquad u = pv + q, \qquad x, p, q, y \in \mathbb{R}, \quad y > 0, \tag{9.17}$$
the Kähler–Berndt $G^J_0(\mathbb{R})$-invariant Kähler metric on $X^J_1$ corresponding to the Kähler two-form $\omega$ (9.16) reads
$$ds^2 = \frac{k}{2y^2}(dx^2 + dy^2) + \frac{1}{y}\left[(x^2 + y^2)\, dp^2 + dq^2 + 2x\, dp\, dq\right], \tag{9.18}$$
i.e. the equation in [19, p. 62] or the equation given in [17, p. 30]. The Kähler two-form (9.15) of Kähler–Berndt corresponds (cf. [39, Eq. (9), Chap. 36]) to the Kähler potential
$$f = -\frac{\lambda}{2} \log \frac{v - \bar v}{2i} - i\pi\mu \frac{(u - \bar u)^2}{v - \bar v}, \qquad u \in \mathbb{C}, \quad v \in H_1. \tag{9.19}$$
The Kähler potential (9.19) corresponds to some (Kähler) Perelomov's CS-vectors based on the CS-manifold $X^J_1$ on which the group $G^J_0(\mathbb{R})$ acts via the action (9.1), which, instead of the scalar product $K$ (5.2), should have, in the EZ coordinates (9.17) with $x, y, p, q \in \mathbb{R}$, the scalar product
$$K = y^{-\frac{\lambda}{2}} \exp(2\pi\mu p^2 y). \tag{9.20}$$
It would be interesting to extend Eq. (9.16) to the case $H_n \rtimes Sp(n, \mathbb{R})$, see [14]. This would be useful for a better understanding of the Gaussons [60, 1].
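The passage from (9.19) to (9.20) can be checked numerically. The sketch below assumes the usual relation $f = \log K$ between a Kähler potential and the corresponding scalar product (this relation is our assumption, it is not restated above); the function names and sample values are illustrative only.

```python
import cmath
import math

def potential(v, u, lam, mu):
    """Kaehler potential f of (9.19) for v in the upper half-plane, u complex."""
    dv = v - v.conjugate()
    du = u - u.conjugate()
    return -(lam / 2) * cmath.log(dv / 2j) - 1j * math.pi * mu * du ** 2 / dv

def kernel(y, p, lam, mu):
    """Scalar product K of (9.20) in the EZ coordinates (9.17)."""
    return y ** (-lam / 2) * math.exp(2 * math.pi * mu * p ** 2 * y)

# Arbitrary sample point in the EZ coordinates: v = x + iy, u = pv + q, y > 0.
x, y, p, q = 0.3, 1.7, -0.4, 0.9
lam, mu = 2.0, 0.25

v = complex(x, y)
u = p * v + q

# ASSUMPTION: f = log K, so exp(f) / K should equal 1.
ratio = cmath.exp(potential(v, u, lam, mu)) / kernel(y, p, lam, mu)
print(ratio)
```

The check rests on $(v - \bar v)/2i = y$ and $(u - \bar u)^2/(v - \bar v) = 2i p^2 y$, so that $-i\pi\mu (u - \bar u)^2/(v - \bar v) = 2\pi\mu p^2 y$.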
10. Some More Comments

Remark 10.1. We have called the algebra (3.4) the Jacobi algebra and the group (6.41) the Jacobi group, in agreement with the name used in [19] or [52, p. 178], where the algebra $\mathfrak{g}^J_1 := \mathfrak{h}_1 \rtimes \mathfrak{sl}(2, \mathbb{R})$ is called "Jacobi algebra". The denomination adopted in the present paper is of course in accord with the one used in [52] because of the isomorphism of the Lie algebras $\mathfrak{su}(1, 1) \sim \mathfrak{sl}(2, \mathbb{R}) \sim \mathfrak{sp}(1, \mathbb{R}) (\sim \mathfrak{so}(2, 1))$. Also the name "Jacobi algebra" is used in [52, p. 248] for the semi-direct sum of the $(2n + 1)$-dimensional Heisenberg algebra and the symplectic algebra, $\mathfrak{hsp} := \mathfrak{h}_n \rtimes \mathfrak{sp}(n, \mathbb{R})$. The group corresponding to this algebra is sometimes called in the Mathematical Physics literature (see, e.g., [1, Sec. 10.1], which is based on [60]) the "metaplectic group", but in reference [52] the term "metaplectic group" is reserved for the two-fold covering group of the symplectic group, cf. [52, p. 402] (see also [4, 5]). Other names of the metaplectic representation are the oscillator representation, the harmonic representation or the Segal–Shale–Weil representation, see [29, Chap. 4] and [19].

Remark 10.2. We now discuss the Jacobi algebra (3.4) from the viewpoint of the book [52]. We know (cf. Lemma XII.1.20, p. 509) that quasihermitian Lie algebras, i.e. Lie algebras for which a maximal compactly embedded subalgebra $\mathfrak{k}$ (cf. [52, Definition VII.1.1, p. 222]) satisfies the relation $\mathfrak{z}_{\mathfrak{g}}(\mathfrak{z}(\mathfrak{k})) = \mathfrak{k}$ (cf. [52, Definition VII.2.15, p. 241]), admit the five-grading of the complexification $\mathfrak{g}_{\mathbb{C}} = \mathfrak{p}_s^+ \oplus \mathfrak{p}_r^+ \oplus \mathfrak{k}_{\mathbb{C}} \oplus \mathfrak{p}_r^- \oplus \mathfrak{p}_s^-$, where $\mathfrak{k}$ is the maximal compactly embedded subalgebra of $\mathfrak{g}$, $\mathfrak{p}_s$ ($\mathfrak{p}_r$) represent the semisimple roots (respectively, the solvable roots) (cf. [52, Definition VII.2.4, p. 234]), while "$+$" ("$-$") refers to the positive (respectively, negative) roots with respect to an adopted positive system $\Delta^+$ (cf. [52, Definition VII.2.6, p. 236]). But the Jacobi algebra is quasihermitian, cf. [52, Example VIII.2.3, p. 294], with $\mathfrak{t} = \{0\} \oplus \mathbb{R} \oplus \mathfrak{u}(1)$ a compactly embedded Cartan algebra and also a maximal compactly embedded subalgebra of $\mathfrak{g}^J$ (cf. [52, Example VII.2.30, p. 249] and [52, Example XII.1.22, p. 513]). So, the generators $K_+, K_-, a^+, a$ of the Jacobi algebra (3.4) belong to $\mathfrak{p}_s^+$, $\mathfrak{p}_s^-$, $\mathfrak{p}_r^+$, respectively $\mathfrak{p}_r^-$, $1$ belongs to the $\mathbb{R}$-part of $\mathfrak{k}$, while $K_0$ belongs to the $\mathfrak{u}(1)$-part of $\mathfrak{k}$. Note that due to relation (3.5d), the subalgebra $\mathfrak{p}^+ = \mathfrak{p}_r^+ \oplus \mathfrak{p}_s^+$ which appears in the definition (4.3) of Perelomov's coherent state vectors is an abelian one, as it should be (cf. [52, Lemma XII.1.20, p. 509]).

Remark 10.3. We emphasize that the representation given in Lemma 4.1 is different from the extended metaplectic representation (cf. [29, p. 196], see also [59]). As was already mentioned, the Jacobi algebra admits a realization as subalgebra of the Weyl algebra $A_1$ of polynomials in $p, q$ of degree $\leq 2$. In the present paper, we have presented a realization of the Jacobi algebra as subalgebra of the Weyl algebra $A_2$ defined by holomorphic first-order differential operators with holomorphic polynomials of degree $\leq 2$. This algebra is realized in the variables $x = (z, w) \in D^J_1$. We recall that the only finite-dimensional non-solvable Lie algebras that can be realized
as Lie subalgebras of the complex Weyl algebra $A_1$ are: $\mathfrak{sp}(1, \mathbb{R})$, $\mathfrak{sp}(1, \mathbb{R}) \times \mathbb{C}$ and the Jacobi algebra [61, 36, 57].

Remark 10.4. Note that the expression (5.3) of the reproducing kernel is the particular case $n = 1$, $2k = \frac{1}{2}$ of the reproducing kernel on the space $D^J_n := \mathbb{C}^n \times D_n$, where $D_n$ is the Siegel ball; see [52, p. 532], the article [32] and [59, (5.28)]. See also [14].

Appendix

For completeness, we also give the following proofs.

Proof of (5.9). We shall calculate $\lambda_{n;m} := (e_0, a^n (a^+)^m e_0)$. We shall use the formula
$$[A, B^m] = \sum_{s=0}^{m-1} B^s [A, B] B^{m-s-1}, \tag{A.1}$$
for $A = a^n$, $B = a^+$. We have
$$[A, B^m] = -\sum_{s=0}^{m-1} B^s [B, A] B^{m-s-1}.$$
But
$$[B, A] = [a^+, a^n] = \sum_{p=0}^{n-1} a^p [a^+, a] a^{n-p-1} = -\sum_{p=0}^{n-1} a^{n-1} = -n a^{n-1},$$
and
$$[A, B^m] = n \sum_{s=0}^{m-1} (a^+)^s a^{n-1} (a^+)^{m-s-1},$$
so, we have $\lambda_{n;m} = n \lambda_{n-1;m-1}$. If $n = m$, then $\lambda_{nn} = n!$. If $n < m$, then $\lambda_{n;m} = \mathrm{ct}\,(e_0, [a, (a^+)^p] e_0)$, where $p > 1$. But $[a, (a^+)^p] = \sum_{s=0}^{p-1} (a^+)^s [a, a^+] (a^+)^{p-s-1} = p (a^+)^{p-1}$, and $\lambda_{n;m} = \mathrm{ct}\,(e_0, (a^+)^{p-1} e_0) = 0$. Similarly, if $n > m$, then $[a^p, a^+] = p a^{p-1}$ and also $\lambda_{n;m} = \mathrm{ct}\,(e_0, a^{p-1} e_0) = 0$. So, we have $\lambda_{n;m} = n! \delta_{n;m}$.

Proof of (5.7). We calculate $\mu_{n;m} := (e_0, C_{n;m} e_0)$, where $C_{n;m} = K_-^n K_+^m$, using (A.1) with $A = K_-^n$ and $B = K_+$. We find
$$\mu_{n;m} = \left(e_0, \sum_{s=0}^{m-1} K_+^s [K_-^n, K_+] K_+^{m-s-1} e_0\right) = (e_0, [K_-^n, K_+] K_+^{m-1} e_0).$$
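Both ingredients of the proof of (5.9) lend themselves to a quick machine check. The sketch below tests the purely algebraic identity (A.1) on arbitrary integer matrices, and recomputes $\lambda_{n;m}$ using only the relations $[a, a^+] = 1$, $a e_0 = 0$ and $(e_0, (a^+)^j e_0) = \delta_{j0}$; the function names are ours and are not part of the paper.

```python
from math import factorial

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matadd(A, B, sign=1):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def matpow(B, m):
    n = len(B)
    P = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(m):
        P = matmul(P, B)
    return P

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return matadd(matmul(A, B), matmul(B, A), sign=-1)

def rhs_A1(A, B, m):
    """Right-hand side of (A.1): sum over s of B^s [A, B] B^(m-s-1)."""
    n = len(A)
    total = [[0] * n for _ in range(n)]
    for s in range(m):
        term = matmul(matmul(matpow(B, s), comm(A, B)), matpow(B, m - s - 1))
        total = matadd(total, term)
    return total

def lam(n, m):
    """(e0, a^n (a+)^m e0), using a (a+)^j e0 = j (a+)^(j-1) e0."""
    coeff, power = 1, m
    for _ in range(n):
        coeff *= power        # each a lowers the power of a+ by one
        power -= 1
        if coeff == 0:        # a has reached e0, which it annihilates
            return 0
    return coeff if power == 0 else 0   # (e0, (a+)^j e0) vanishes for j > 0

A = [[1, 2, 0], [0, 1, 3], [4, 0, 1]]
B = [[0, 1, 1], [2, 0, 1], [1, 1, 0]]
print(comm(A, matpow(B, 4)) == rhs_A1(A, B, 4))
print([[lam(n, m) for m in range(4)] for n in range(4)])
```

The table printed in the last line is diagonal with entries $n!$, in agreement with $\lambda_{n;m} = n!\,\delta_{n;m}$.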
But
$$[K_+, K_-^n] = \sum_{p=0}^{n-1} K_-^p [K_+, K_-] K_-^{n-p-1} = -2 \sum_{p=0}^{n-1} K_-^p K_0 K_-^{n-p-1}$$
$$= -2 \sum_{p=0}^{n-1} K_-^p K_-^{n-p-1} K_0 - 2 \sum_{p=0}^{n-1} K_-^p [K_0, K_-^{n-p-1}]$$
$$= -2n K_-^{n-1} K_0 - 2 \sum_{p=0}^{n-1} K_-^p [K_0, K_-^{n-p-1}].$$
We find
$$\mu_{n;m} = 2n \left(e_0, K_-^{n-1} K_0 K_+^{m-1} e_0\right) + R,$$
where
$$R := 2(e_0, R_0 e_0); \qquad R_0 := \sum_{p=0}^{n-1} K_-^p [K_0, K_-^{n-p-1}] K_+^{m-1},$$
and we get
$$\mu_{n;m} = 2nk \mu_{n-1,m-1} + 2n \left(e_0, K_-^{n-1} [K_0, K_+^{m-1}] e_0\right) + R.$$
But
$$[K_0, K_+^{m-1}] = \sum_{s=0}^{m-2} K_+^s [K_0, K_+] K_+^{m-2-s} = (m - 1) K_+^{m-1},$$
and
$$[K_0, K_-^{n-p-1}] = \sum_{q=0}^{n-p-2} K_-^q [K_0, K_-] K_-^{n-p-q-2} = -(n - p - 1) K_-^{n-p-1}.$$
We get successively
$$R_0 = -\sum_{p=0}^{n-1} (n - p - 1) C_{n-1;m-1}, \qquad R = -n(n - 1) \mu_{n-1;m-1},$$
and
$$\mu_{n;m} = (2nk + 2n(m - 1) - n(n - 1)) \mu_{n-1;m-1},$$
$$\mu_{n;n} = n(2k + n - 1) \mu_{n-1;n-1}; \qquad \mu_{1;1} = 2k,$$
$$\mu_{n;n} = \frac{n!(2k + n - 1)!}{(2k - 1)!} = \frac{n!\,\Gamma(2k + n)}{\Gamma(2k)}.$$
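The closed form for $\mu_{n;n}$ can be compared with the recursion $\mu_{n;n} = n(2k + n - 1)\mu_{n-1;n-1}$ numerically. This is only a sketch; we take $\mu_{0;0} = (e_0, e_0) = 1$ as the starting value, which is consistent with $\mu_{1;1} = 2k$ above, and the function names are ours.

```python
from math import factorial, gamma, isclose

def mu_recursive(n, k):
    """Iterate mu_{j;j} = j (2k + j - 1) mu_{j-1;j-1}, starting from mu_{0;0} = 1."""
    value = 1.0
    for j in range(1, n + 1):
        value *= j * (2 * k + j - 1)
    return value

def mu_closed(n, k):
    """Closed form n! Gamma(2k + n) / Gamma(2k)."""
    return factorial(n) * gamma(2 * k + n) / gamma(2 * k)

for k in (0.25, 1.0, 1.5):
    for n in range(7):
        print(k, n, mu_recursive(n, k), mu_closed(n, k))
```

Writing the result with $\Gamma$ rather than factorials matters here because $2k$ need not be an integer (for instance $2k = \frac{1}{2}$ in Remark 10.4).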
If $n < m$, then there is a $p > 1$ such that
$$[K_-, K_+^p] = \sum_{q=0}^{p-1} K_+^q [K_-, K_+] K_+^{p-1-q} = 2 \sum_{q=0}^{p-1} K_+^q K_0 K_+^{p-1-q},$$
which leads in the expression of $\mu_{n;m}$ to the term $2K_0 K_+^{p-1}$, and, after acting to the left with $K_0$, we get a 0 contribution. Similarly, if $n > m$, then
$$[K_-^p, K_+] = -\sum_{s=0}^{p-1} K_-^s [K_0, K_-] K_-^{p-s-1},$$
and $s = p - 1$ in the sum. Acting on the right with $K_0$, the contribution is also 0 because of the action on the right with $K_-^{p-1}$.

Acknowledgments

The author is thankful to the organizers of the 2nd Operator Algebras and Mathematical Physics Conference, Sinaia, Romania, June 26–July 4, 2003, and the XXIII Workshop on Geometric Methods in Physics, June 27–July 3, 2004, Białowieża, Poland, for the opportunity to report results on this subject and for the financial support for attending the conferences. Discussions with Professor S. Twareque Ali and Professor Peter Kramer are kindly acknowledged. The author is grateful to Professor Rolf Berndt and to Dr. Adrian Tanasă for correspondence, to Professor Karl-Hermann Neeb for criticism and to Professor John Klauder for his involved interest.

References

[1] S. T. Ali, J.-P. Antoine and J.-P. Gazeau, Coherent States, Wavelets, and their Generalizations (Springer-Verlag, New York, 2000).
[2] V. Bargmann, Irreducible unitary representations of the Lorentz group, Ann. Math. 48 (1947) 568–640.
[3] V. Bargmann, On the Hilbert space of analytic functions and the associated integral transform, Commun. Pure Appl. Math. 14 (1961) 187–214.
[4] V. Bargmann, Group representations on Hilbert spaces of analytic functions, in Analytic methods in Mathematical Physics, eds. R. P. Gilbert and R. G. Newton (Gordon and Breach, Science Publishers, New York-London-Paris, 1970), pp. 27–63.
[5] H. Bateman, Higher Transcendental Functions, Vol. 2 (McGraw-Hill, New York, 1958).
[6] S. Berceanu and C. A. Gheorghe, On the construction of perfect Morse functions on compact manifolds of coherent states, J. Math. Phys. 28 (1987) 2899–2907.
[7] S. Berceanu and A. Gheorghe, On equations of motion on Hermitian symmetric spaces, J. Math. Phys. 33 (1992) 998–1007.
[8] S. Berceanu and L. Boutet de Monvel, Linear dynamical systems, coherent state manifolds, flows and matrix Riccati equation, J. Math. Phys. 34 (1993) 2353–2371.
[9] S. Berceanu and M.
Schlichenmaier, Coherent state embeddings, polar divisors and Cauchy formulas, J. Geom. Phys. 34 (2000) 336–358; arXiv: math.DG/9903105.
[10] S. Berceanu and A. Gheorghe, Linear Hamiltonians on homogeneous Kähler manifolds of coherent states, An. Univ. Timişoara Ser. Mat.-Inform. 39 (2001) 31–56; arXiv: math.DG/0408254.
[11] S. Berceanu and A. Gheorghe, Differential operators on orbits of coherent states, Rom. Jour. Phys. 48 (2003) 545–556; arXiv: math.DG/0211054.
[12] S. Berceanu, Geometrical phases on hermitian symmetric spaces, in Recent Advances in Geometry and Topology, eds. Dorin Andrica and Paul A. Blaga (Cluj University Press, 2004), pp. 83–98; arXiv: math.DG/0408233. [13] S. Berceanu, Realization of coherent state algebras by differential operators, in Advances in Operator Algebras and Mathematical Physics, eds. F. Boca, O. Bratteli, R. Longo and H. Siedentop (The Theta Foundation, Bucharest, 2005), pp. 1–24; arXiv: math.DG/0504053. [14] S. Berceanu, A Holomorphic Representation of the Semidirect Sum of Symplectic and Heisenberg Lie Algebras, XXIV Workshop on geometric methods in Physics (Bialowie˙za, Poland, June–July 2005); A Holomorphic Representation of Jacobi Algebra in Several Dimensions, 6th Operator Algebras International Conference: Operator Algebras and Mathematical Physics-3 (Bucharest, Romania, August, 2005). [15] F. A. Berezin, The general concept of quantization, Commun. Math. Phys. 40 (1975) 153–174. [16] F. A. Berezin, Models of Gross–Neveu type are quantization of a classical mechanics with a nonlinear phase space, Commun. Math. Phys. 63 (1978) 131–153. [17] R. Berndt, Sur l’arithm´etique du corps des fonctions elliptiques de niveau N , in Seminar on Number Theory (Paris 1982–83), Progress in Mathematics, Vol. 51 (Birkh¨ auser Boston, Boston, MA, 1984), pp. 21–32. [18] R. Berndt and S. B¨ ocherer, Jacobi forms and discrete series representations of the Jacobi group, Math. Z. 204 (1990) 13–44. [19] R. Berndt and R. Schmidt, Elements of the Representation Theory of the Jacobi Group, Progress in Mathematics, Vol. 163 (Birkh¨ auser Verlag, Basel, 1998). [20] R. Berndt, Coadjoint Orbits and Representations of the Jacobi Group, IHES/M/ 03/37, preprint (2003). [21] I. Bialynicki-Birula, Solutions of the equations of motion in classical and quantum theories, Ann. Phys. 67 (1971) 252–273. [22] L. C. Biedenharn, The quantum group SUq (2) and a q-analogue of the boson operators, J. Phys. A 22 (1989) 4581–4588. 
[23] N. N. Bogoliubov, A contribution to the theory of super-fluidity, Izv. Akad. Nauk. 11 (1947) 77–90. [24] M. Cahen, S. Gutt and J. Rawnsley, Quantization of K¨ ahler manifolds I: Geometric interpretation of Berezin’s quantization, J. Geom. Phys. 7 (1990) 45–62. [25] J. Dixmier, Sur les alg´ebres de Weyl, Bull. Soc. Math. France 96 (1968) 209–242. [26] V. V. Dodonov, ‘Nonclassical’ states in quantum optics: A ‘squeezed’ review of the first 75 years, J. Opt. B. Quantum Semiclass. Opt. 4 (2002) R1–R33. [27] P. D. Drummond and Z. Ficek (eds.), Quantum Squeezing (Springer, Berlin, 2004). [28] M. Eichler and D. Zagier, The Theory of Jacobi Forms, Progress in Mathematics, Vol. 55 (Birkh¨ auser, Boston, MA, 1985). [29] G. B. Folland, Harmonic Analysis in Phase Space (Princeton University Press, Princeton, New Jersey, 1989). [30] R. J. Glauber, Coherent and incoherent states of the radiation field, Phys. Rev. 131 (1963) 2766–2788. [31] I. S. Gradˇste˘ın and I. M. Ryˇzik, Tables of Integrals, Sums, Series and Products (Gosudarstv. Izdat. Fiz.-Mat. Lit., Moscow, 1963) (in Russian). [32] J. Hilgert, K.-H. Neeb and B. Ørsted, Conal Heisenberg algebras and associated Hilbert spaces, J. Reine Angew. Math. 474 (1976) 67–112. [33] J. N. Hollenhorst, Quantum limits on resonant-mass gravitational-wave detectors, Phys. Rev. D 19 (1979) 1669–1679. [34] T. Holstein and H. Primakoff, Field dependence of the intrinsic domain magnetization of a ferromagnet, Phys. Rev. 58 (1940) 1098–1113.
[35] C. Itzykson, Remarks on boson commutation rules, Commun. Math. Phys. 4 (1967) 92–122. [36] A. Joseph, Commuting polynomials in quantum canonical operators and realizations of Lie algebras, J. Math. Phys. 13 (1972) 351–357. [37] E. K¨ ahler, Die Poincar´e–Gruppe, Rend. Sem. Mat. Fis. Milano 53 (1983) 359–390. [38] E. K¨ ahler, The Poincar´e group, in Clifford Algebras and Their Applications in Mathematical Physics NATO Advanced Science Institute Series, Series C: Mathematical and Physical Sciences, Vol. 183 (Reidel, Dordrecht, 1986), pp. 265–272. [39] E. K¨ ahler, Raum-Zeit-Individuum, Rend. Accad. Naz. Sci. XL Mem. Mat. (5) 16 (1992) 115–177. [40] E. K¨ ahler, Mathematische Werke, Mathematical Works, eds. R. Berndt and O. Riemenschneider (Walter de Gruyter, Berlin-New York, 2003). [41] E. H. Kennard, Zur Quantenmechanik einfacher Bewegungstypen, Zeit. Phys. 44 (1927) 326–352. ´ ements de la Th´eorie des Representations (Editions Mir, Moscou, 1974). [42] A. Kirillov, El´ [43] A. A. Kirillov, Lectures on the Orbit Method, Graduate Studies in Mathematics Vol. 64 (American Mathematical Society, Providence, Rhode Island, 2004). [44] A. A. Kirillov, Merits and demerits of the orbit method, Bull. Amer. Math. Soc. 36 (1999) 43–73. [45] A. U. Klimyk, On position and momentum operators in the q-oscillator, J. Phys. A 38 (2005) 4447–4458. [46] S. Kobayashi, Irreducibility of certain unitary representations, J. Math. Soc. Japan 20 (1968) 638–642. [47] S. Lie, Theorie der transformationsgruppen, Math. Ann. 16 (1880) 441–528. [48] W. Lisiecki, Coherent state representations. A survey, Rep. Math. Phys. 35 (1995) 327–358. [49] E. Y. C. Lu, New coherent states of the electromagnetic field, Lett. Nuovo. Cimento 2 (1971) 1241–1244. [50] B. R. Mollow and R. J. Glauber, Quantum theory of parametric amplifications: I, Phys. Rev. 160 (1967) 1076–1096. [51] K.-H. 
Neeb, Realization of general unitary highest weight representations, preprint, Technische Hochschule Darmstadt 1662 (1994). [52] K.-H. Neeb, Holomorphy and Convexity in Lie Theory, de Gruyter Expositions in Mathematics, Vol. 28 (Walter de Gruyter, Berlin-New York, 2000). [53] M. N. Nieto and D. R. Truax, Holstein–Primakoff/Bogoliubov transformations and the multiboson system, Fortschr. Phys. 45 (1997) 145–156. [54] Z. Pasternak-Winiarski and J. Wojcieszynski, Bergman spaces and kernel for holomorphic vector bundles, Demonstratio Math. 30 (1997) 199–214. [55] A. M. Perelomov, Generalized Coherent States and their Applications (Springer, Berlin, 1986). [56] I. I. Pyatetskii-Shapiro, Automorphic Functions and the Geometry of Classical Domains (Gordon & Breach, New York-London-Paris, 1969). [57] M. Rausch de Traubenberg, M. J. Slupinski and A. Tanas˘ a, Finite-dimensional Lie subalgebras of the Weyl algebra, arXiv:math.RT/0504224 v2. [58] J. H. Rawnsley, Coherent states and K¨ ahler manifolds, Quart. J. Math. Oxford 28 (1977) 403–415. [59] I. Satake, Algebraic Structures of Symmetric Domains, Publications of the Mathematical Society of Japan, Vol. 14 (Princeton Univ. Press, 1980). [60] R. Simon, E. C. G. Sudarshan and N. Mukunda, Gaussian pure states in quantum mechanics and the symplectic group, Phys. Rev. A. 37 (1988) 3028–3038.
[61] A. Simoni and F. Zaccaria, On realization of semi-simple Lie algebras with quantum canonical variables, Nuovo Cimento A (10) 59 (1969) 280–292. [62] S. Sivakumar, Studies on nonlinear coherent states, J. Opt. B. Quantum Semiclass. Opt. 2 (2000) R61–R75. [63] P. Stoler, Equivalence classes of minimum uncertainty packets, Phys. Rev. D 1 (1970) 3217–3219; —, II, Phys. Rev. D 4 (1971) 1925–1926. [64] D. F. Walls, Squeezed states of light, Nature 306 (1983) 141–146. [65] H. P. Yuen, Two-photon coherent states of the radiation field, Phys. Rev. A 13 (1976) 2226–2243. [66] W.-M. Zhang, D. H. Feng and R. Gilmore, Coherent states: Theory and some applications, Rev. Mod. Phys. 62 (1990) 867–927.
May 2, 2006 15:57 WSPC/148-RMP
J070-00262
Reviews in Mathematical Physics Vol. 18, No. 2 (2006) 201–232 © World Scientific Publishing Company
PRESENTATIONS OF WESS–ZUMINO–WITTEN FUSION RINGS
PETER BOUWKNEGT Department of Theoretical Physics, Research School of Physical Sciences and Engineering, and Department of Mathematics, Mathematical Sciences Institute, Australian National University, Canberra, 0200, Australia
[email protected] DAVID RIDOUT∗ Department of Physics and Mathematical Physics, University of Adelaide, Adelaide, 5005, Australia and Department of Mathematics La Trobe University, Bundoora, 3086, Australia
[email protected]
Received 7 February 2006
Revised 12 April 2006

The fusion rings of the Wess–Zumino–Witten models are re-examined. Attention is drawn to the difference between fusion rings over Z (which are often of greater importance in applications) and fusion algebras over C. Complete proofs are given characterizing the fusion algebras (over C) of the SU(r + 1) and Sp(2r) models in terms of the fusion potentials, and it is shown that the analogous potentials cannot describe the fusion algebras of the other models. This explains why no other representation-theoretic fusion potentials have been found. Instead, explicit generators are then constructed for general WZW fusion rings (over Z). The Jacobi–Trudy identity and its Sp(2r) analogue are used to derive the known fusion potentials. This formalism is then extended to the WZW models over the spin groups of odd rank, and explicit presentations of the corresponding fusion rings are given. The analogues of the Jacobi–Trudy identity for the spinor representations (for all ranks) are derived for this purpose, and may be of independent interest.

Keywords: Fusion ring; conformal field theory; character; D-brane; Jacobi–Trudy.
Mathematics Subject Classification 2000: 81T40, 13P10, 13F20, 81T30

∗ Current address: Département de Physique, Université Laval, Québec, G1S 7P4, Canada.
1. Introduction

The fusion process is a fundamental ingredient in the standard description of all rational conformal field theories. Roughly speaking, the fusion coefficient $N_{ab}^{c}$ counts the multiplicity with which the family of fields $\phi_c$ appears in the operator product expansion of a field from family $\phi_a$ with a field from family $\phi_b$. This is succinctly written as a fusion rule:
$$\phi_a \times \phi_b = \sum_c N_{ab}^{c}\, \phi_c. \tag{1.1}$$
This definition makes clear the fact that fusion coefficients are non-negative integers. Of course, one can define fusion in a more mathematically precise manner in terms of the Grothendieck ring of a certain abelian braided monoidal category that appears in the vertex operator algebra formulation of conformal field theory. However, we will not have the need for such sophistication in what follows.

For our purposes, a fusion ring is defined by Eq. (1.1), where the coefficients $N_{ab}^{c}$ are explicitly given. The standard assumptions and properties of the operator product expansion then translate into properties of the fusion coefficients. It is convenient to express these in terms of matrices $N_a$ defined by $[N_a]_{bc} = N_{ab}^{c}$. We assume that the identity field is in the theory; the corresponding family is denoted by $\phi_0$, and $N_0$ is therefore the identity matrix. Commutativity and associativity of the operator product expansion translate into
$$N_a N_b = N_b N_a \qquad \text{and} \qquad N_a N_b = \sum_c N_{ab}^{c} N_c,$$
respectively. Last, given a family $\phi_a$, there is a unique family $\phi_{a^+}$ such that their operator product expansions contain fields from the family $\phi_0$ with multiplicity one (this is effectively just the normalization of the two-point function). It follows$^{a}$ that $N_{a^+} = N_a^T$, where $T$ denotes transposition.

These matrices thus form a commuting set of normal matrices, and so may be simultaneously diagonalized by a unitary matrix $U$. The diagonalization $N_a U = U D_a$ ($D_a$ diagonal) is equivalent to $\sum_c N_{ab}^{c} U_{cd} = U_{bd} \lambda_d^{(a)}$, where $\lambda_d^{(a)}$ are the eigenvalues of $N_a$. Putting $b = 0$ then gives $U_{ad} = U_{0d} \lambda_d^{(a)}$, which determines the eigenvalues completely (if $U_{0d}$ were to vanish, $U_{ad}$ would vanish for all $a$, contradicting unitarity). The celebrated Verlinde conjecture [1] identifies the diagonalizing matrix $U$ with the S-matrix $S$ describing the transformations of the characters of the chiral symmetry algebra induced by the modular transformation $\tau \to -1/\tau$. This gives a closed expression for the fusion coefficients:
$$N_{ab}^{c} = \sum_d \frac{S_{ad} S_{bd} S^*_{cd}}{S_{0d}}. \tag{1.2}$$

$^{a}$ Here we are implicitly excluding logarithmic conformal field theories from our considerations.
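As an illustration of Eq. (1.2), the following sketch evaluates the Verlinde formula for the simplest Wess–Zumino–Witten models, those over SU(2) at level $k$. The S-matrix used, $S_{ab} = \sqrt{2/(k+2)}\,\sin\bigl(\pi(a+1)(b+1)/(k+2)\bigr)$ with Dynkin labels $a, b = 0, \ldots, k$, and the truncated Clebsch–Gordan rule it is compared with are standard facts quoted from the general literature rather than from the text above; the function names are ours.

```python
import math

def s_matrix(k):
    """Modular S-matrix of the level-k SU(2) model (real and symmetric)."""
    n = k + 2
    return [[math.sqrt(2 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
             for b in range(k + 1)] for a in range(k + 1)]

def verlinde(k):
    """Fusion coefficients N[a][b][c] from Eq. (1.2); S is real here, so S* = S."""
    S = s_matrix(k)
    r = range(k + 1)
    return [[[round(sum(S[a][d] * S[b][d] * S[c][d] / S[0][d] for d in r))
              for c in r] for b in r] for a in r]

def su2_fusion(a, b, c, k):
    """Known SU(2) level-k fusion rule (truncated Clebsch-Gordan series)."""
    return int(abs(a - b) <= c <= min(a + b, 2 * k - a - b)
               and (a + b + c) % 2 == 0)

k = 4
N = verlinde(k)
print(N[1][1])   # row c of phi_1 x phi_1
print(N[2][3])   # row c of phi_2 x phi_3
```

The rounding step only removes floating-point noise: the sums in (1.2) are integers, although this is far from obvious from the formula itself.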
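It is worthwhile noting that the Verlinde conjecture has recently been proved for a fairly wide class of conformal field theories (in the vertex operator algebra approach) [2].

Mathematically, these families with their fusion product define a finitely-generated, associative, commutative, unital ring. Moreover, this fusion ring is freely generated as a Z-module (abelian group), and possesses a distinguished "basis" in which the structure constants are all non-negative integers. The matrices $N_a$ introduced above correspond to this basis in the regular representation of the fusion ring. It is often convenient to generalize this structure to a fusion algebra (also known as a Verlinde algebra) by allowing coefficients in an algebraically closed field, say $\mathbb{C}$. We will denote the fusion ring by $F_{\mathbb{Z}}$, and the corresponding fusion algebra (over $\mathbb{C}$) by $F^{\mathbb{C}} = F_{\mathbb{Z}} \otimes_{\mathbb{Z}} \mathbb{C}$. It is important to note that the structure which arises naturally in applications is the fusion ring, and that the fusion algebra is just a useful mathematical construct.

One of the first advantages in considering $F^{\mathbb{C}}$ is that it contains the elements [3]
$$\pi_a = S_{0a} \sum_b S^*_{ab}\, \phi_b, \tag{1.3}$$
where the sum is over the distinguished basis of $F^{\mathbb{C}}$. A quick calculation shows that the $\pi_a$ then form a basis of orthogonal idempotents: $\pi_a \times \pi_b = \delta_{ab} \pi_b$. It follows that there are no non-zero nilpotent elements in $F^{\mathbb{C}}$, and hence the same is true for $F_{\mathbb{Z}}$.

Since the fusion algebra is finitely-generated, associative and commutative, it may be presented as a free polynomial ring (over $\mathbb{C}$) in its generators, modulo an ideal $I^{\mathbb{C}}$. The lack of nontrivial nilpotent elements implies that this ideal has the property that whenever some positive power of a polynomial belongs to the ideal, so does the polynomial itself. That is, the ideal is radical, hence completely determined by the variety of points (in $\mathbb{C}^n$) at which every polynomial in the ideal vanishes [4]. This variety will be referred to as the fusion variety. As $F^{\mathbb{C}}$ is a finite-dimensional vector space over $\mathbb{C}$, it follows that the fusion variety consists of a finite number of points, one for each basis element [4].

Since the $\pi_a$ of Eq. (1.3) form a basis of idempotents, they correspond to polynomials which take the values 0 and 1 on the fusion variety. Their supports (points of the fusion variety where the representing polynomials take value 1) cannot be empty, and their orthogonality ensures that their supports must be disjoint. This forces the supports to consist of a single point, different for each $\pi_a$. We denote this point of the fusion variety by $v^a$. It now follows from inverting Eq. (1.3) that the polynomial $p_a$ representing $\phi_a$ takes the value $\lambda_b^{(a)} = S_{ab}/S_{0b}$ at $v^b$. Suppose now that there is a subset $\{\phi_{a_i} : i = 1, \ldots, r\}$ of the $\phi_a$ which generates the entire fusion algebra. If we take the free polynomial ring to be $\mathbb{C}[\phi_{a_1}, \ldots, \phi_{a_r}]$, then the coordinates of the fusion variety are just
$$v_i^b = p_{a_i}(v^b) = \frac{S_{a_i b}}{S_{0b}}.$$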
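The "quick calculation" behind the idempotent property of the elements (1.3) can be mirrored numerically. The sketch below does this for the level-$k$ SU(2) models, again assuming the standard S-matrix $S_{ab} = \sqrt{2/(k+2)}\,\sin\bigl(\pi(a+1)(b+1)/(k+2)\bigr)$ and the standard truncated Clebsch–Gordan fusion rule, neither of which is taken from the text above; elements of the fusion algebra are stored as coefficient lists with respect to the basis $\phi_0, \ldots, \phi_k$, and all names are ours.

```python
import math

def s_matrix(k):
    """Modular S-matrix of the level-k SU(2) model (real and symmetric)."""
    n = k + 2
    return [[math.sqrt(2 / n) * math.sin(math.pi * (a + 1) * (b + 1) / n)
             for b in range(k + 1)] for a in range(k + 1)]

def fusion(a, b, c, k):
    """Standard SU(2) level-k fusion coefficient N_{ab}^c."""
    return int(abs(a - b) <= c <= min(a + b, 2 * k - a - b)
               and (a + b + c) % 2 == 0)

def product(x, y, k):
    """Fusion product of two elements given by their coefficient lists."""
    r = range(k + 1)
    return [sum(x[a] * y[b] * fusion(a, b, c, k) for a in r for b in r)
            for c in r]

def idempotent(a, k):
    """Coefficients of pi_a = S_{0a} sum_b S*_{ab} phi_b, Eq. (1.3); S is real here."""
    S = s_matrix(k)
    return [S[0][a] * S[a][b] for b in range(k + 1)]

k = 3
pi = [idempotent(a, k) for a in range(k + 1)]
print(product(pi[1], pi[1], k), pi[1])   # equal up to rounding
print(product(pi[1], pi[2], k))          # zero up to rounding
```

For $k = 1$ this reduces to $\pi_0 = \frac{1}{2}(\phi_0 + \phi_1)$ and $\pi_1 = \frac{1}{2}(\phi_0 - \phi_1)$, whose idempotency follows at once from $\phi_1 \times \phi_1 = \phi_0$.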
where the sum is over the distinguished basis of F C . A quick calculation shows that the πa then form a basis of orthogonal idempotents: πa × πb = δab πb . It follows that there are no non-zero nilpotent elements in F C , and hence the same is true for FZ . Since the fusion algebra is finitely-generated, associative and commutative, it may be presented as a free polynomial ring (over C) in its generators, modulo an ideal I C . The lack of nontrivial nilpotent elements implies that this ideal has the property that whenever some positive power of a polynomial belongs to the ideal, so does the polynomial itself. That is, the ideal is radical, hence completely determined by the variety of points (in Cn ) at which every polynomial in the ideal vanishes [4]. This variety will be referred to as the fusion variety. As F C is a finite-dimensional vector space over C, it follows that the fusion variety consists of a finite number of points, one for each basis element [4]. Since the πa of Eq. (1.3) form a basis of idempotents, they correspond to polynomials which take the values 0 and 1 on the fusion variety. Their supports (points of the fusion variety where the representing polynomials take value 1) cannot be empty, and their orthogonality ensures that their supports must be disjoint. This forces the supports to consist of a single point, different for each πa . We denote this point of the fusion variety by v a . It now follows from inverting Eq. (1.3) that the polynomial (a) pa representing φa takes the value λb = Sab /S0b at v b . Suppose now that there is a subset {φai : i = 1, . . . , r} of the φa which generates the entire fusion algebra. If we take the free polynomial ring to be C[φa1 , . . . , φar ], then the coordinates of the fusion variety are just Sa b vib = pai v b = i . S0b
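This proves the following result of Gepner [5]:

Proposition 1.1. $F^{\mathbb{C}} \cong \mathbb{C}[\phi_{a_1}, \ldots, \phi_{a_r}] / I^{\mathbb{C}}$, where $I^{\mathbb{C}}$ is the (radical) ideal of polynomials vanishing on the points
$$\left( \frac{S_{a_1 b}}{S_{0b}}, \ldots, \frac{S_{a_r b}}{S_{0b}} \right) \in \mathbb{C}^r.$$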
Notice that this result only characterizes the fusion algebra. The fusion ring may likewise be represented as a quotient of $\mathbb{Z}[\phi_{a_1}, \ldots, \phi_{a_r}]$, where the fusion ideal is given by $I^{\mathbb{Z}} = I^{\mathbb{C}} \cap \mathbb{Z}[\phi_{a_1}, \ldots, \phi_{a_r}]$ [3]. The fusion ideal over $\mathbb{Z}$ thus inherits the property from $I^{\mathbb{C}}$ that if any integral multiple of a polynomial is in the ideal, then so is the polynomial itself. This ensures that the quotient is a free Z-module, as required. By analogy with radical ideals (and for want of a better name), we will refer to ideals with this property as being dividing.

In this paper, we are interested in the fusion rings of Wess–Zumino–Witten models. These are conformal field theories defined on a group manifold $G$ (which we will take to be simply-connected, connected and compact), and parametrized by a positive integer $k$ called the level. Our motivation derives from the determination of the dynamical charge group of a certain class of D-brane in these theories. The brane charges [6, 7] can be computed explicitly, and the order of the charge group can be shown to be constrained by the fusion rules [8, 9]. A suitably detailed understanding of the structure of the fusion rules therefore makes the computation of the charge group possible. This was achieved for the models based on the groups $G = SU(r + 1)$ in [9], and the general case in [10]. However, the general charge group computations have only been rigorously proved for $G = SU(r + 1)$ and$^{b}$ $Sp(2r)$, essentially because the detailed structure of the fusion rules associated with the other groups is not well understood. The aim of this paper is to re-examine the cases which have been described, and try to elucidate a corresponding detailed structure in other cases.

The field families of a level-$k$ Wess–Zumino–Witten model on the group manifold $G$ are conveniently labeled by an integrable highest weight representation of the associated untwisted affine Lie algebra $\hat{\mathfrak{g}}$, hence by the projection of the highest weight onto the weight space of the horizontal subalgebra $\mathfrak{g}$ (which will be identified with the Lie algebra of $G$). In other words, the abstract elements naturally appearing in the fusion rules may be identified with the integral weights (of $\mathfrak{g}$) in the closed fundamental affine alcove. We denote this set of weights by $\hat{P}_k$. In what follows, it will usually prove more useful to regard these weights as the integral weights in the open, shifted fundamental alcove. Concretely,
$$\hat{P}_k = \{\lambda \in P : (\lambda + \rho, \alpha_i) > 0 \text{ for all } i, \text{ and } (\lambda + \rho, \theta) < k + h^\vee\},$$

$^{b}$ In this paper, we denote by $Sp(2r)$ the (unique up to isomorphism) connected, simply-connected and compact Lie group whose Lie algebra is $\mathfrak{sp}(2r)$.
where $P$ is the weight lattice, $\alpha_i$ are the simple roots, $\theta$ denotes the highest root, $\rho$ the Weyl vector and $h^\vee$ is the dual Coxeter number of $\mathfrak{g}$. The inner product on the weight space is normalized so that $(\theta, \theta) = 2$.

For these Wess–Zumino–Witten models, the Verlinde conjecture was proven in [11–13]. By combining this with the Kac–Peterson formula [14] for the Wess–Zumino–Witten S-matrix elements,
$$S_{\lambda\mu} = C(\hat{\mathfrak{g}}, k) \sum_{w \in W} \det w\; e^{-2\pi i (w(\lambda + \rho), \mu + \rho)/(k + h^\vee)} \tag{1.4}$$
(here $C(\hat{\mathfrak{g}}, k)$ is a constant and $W$ is the Weyl group of $G$), one can derive a very useful expression for the fusion coefficients, known as the Kac–Walton formula [15–19]:
$$N_{\lambda\mu}^{\nu} = \sum_{\hat{w} \in \hat{W}_k} \det \hat{w}\; N_{\lambda\mu}{}^{\hat{w} \cdot \nu}. \tag{1.5}$$
This formula relates the fusion coefficients to the tensor product multiplicities $N_{\lambda\mu}{}^{\nu}$ of the irreducible representations of the group $G$ (or its Lie algebra $\mathfrak{g}$), via the shifted action of the affine Weyl group $\hat{W}_k$ at level $k$, $\hat{w} \cdot \nu = \hat{w}(\nu + \rho) - \rho$.

The Kac–Walton formula suggests that for Wess–Zumino–Witten models, it may be advantageous to choose the free polynomial ring appearing in Proposition 1.1 to be the complexified representation ring (character ring) of $G$. The character of the irreducible representation of highest weight $\lambda$ is given by
$$\chi_\lambda = \sum_{\mu \in P_\lambda} e^{\mu} = \frac{\sum_{w \in W} \det w\; e^{w(\lambda + \rho)}}{\sum_{w \in W} \det w\; e^{w(\rho)}},$$
where $P_\lambda$ is the set of weights of the representation with multiplicity (and the second equality is the Weyl character formula). The character ring is freely generated by the characters $\chi_{\Lambda_i} \equiv \chi_i$ ($i = 1, \ldots, r = \operatorname{rank} G$) of the representations whose highest weights are the fundamental weights $\Lambda_i$ of $G$. Gepner's result for Wess–Zumino–Witten models may therefore be recast in the form:

Proposition 1.2. The fusion algebra of a level-$k$ Wess–Zumino–Witten model is given by $F_k^{\mathbb{C}} \cong \mathbb{C}[\chi_1, \ldots, \chi_r] / I_k^{\mathbb{C}}$, where $I_k^{\mathbb{C}}$ is the (radical) ideal of polynomials vanishing on the points
$$\left\{ \left( \frac{S_{\Lambda_1 \lambda}}{S_{0\lambda}}, \ldots, \frac{S_{\Lambda_r \lambda}}{S_{0\lambda}} \right) \in \mathbb{C}^r : \lambda \in \hat{P}_k \right\}.$$

We will likewise denote a level-$k$ Wess–Zumino–Witten fusion ring by $F_k^{\mathbb{Z}}$ and the corresponding fusion ideal of $\mathbb{Z}[\chi_1, \ldots, \chi_r]$ by $I_k^{\mathbb{Z}}$.

We are interested in explicit sets of generators for these fusion ideals (over $\mathbb{C}$ and $\mathbb{Z}$). Given a candidate set of elements in $I_k^{\mathbb{C}}$, the verification that this set is generating may be broken down into three parts: First, one checks that each element vanishes on the fusion variety. Second, one must show that these elements
do not collectively vanish anywhere else. Third, the ideal generated by this candidate set must be verified to be radical. This last step is always necessary because there is generically an infinite number of ideals corresponding to a given variety (consider the ideals xn ⊂ C[x] which all vanish precisely at the origin). It should be clear that verifying radicality does not consist of the trivial task of checking that generating set contains no powers of polynomials (consider
2 the2 candidate x + y , 2xy ⊂ C[x, y]). For the SU(r + 1) and Sp(2r) fusion algebras, generating sets for IkC have been postulated in [5, 20, 21] as the partial derivatives of a fusion potential. The first step of the verification process is well documented there, the second step appears somewhat sketchy, and the third does not seem to have appeared in the literature at all. We rectify this in Sec. 2. The methods we employ are then used to show why analogous potentials have not been found for the other groups, despite several attempts [22, 23]. However, we would like to repeat our claim that it is the fusion ring which is of physical interest in applications, and the above verification process does not allow us to conclude that a set of elements is generating over Z. In other words, a set of generators for IkC need not form a generating set for IkZ , even if the set consists of polynomials with integral coefficients (a simple example would be if IkC = x + y, x − y ⊂ C[x, y], then IkZ = x + y, x − y ⊂ Z[x, y] as this latter ideal is not dividing). This consideration also seems to have been overlooked in the literature, and is, in our opinion, quite a serious omission. We will rectify this situation in Sec. 3 by removing the need to postulate a candidate set of generators; instead, we shall derive generating sets ab initio. In the cases G = SU(r + 1) and Sp(2r), some simple manipulations will allow us to reduce the number of generators in these sets drastically. We will see that these manipulations reproduce the aforementioned fusion potentials. Our results therefore constitute the first complete derivation of this description from first principles, and we emphasize that this derivation holds over Z. The results to this point have already been detailed in [24]. We then detail the analagous manipulations for Spin(2r + 1) in Sec. 4, producing a relatively small set of explicit generators for the corresponding fusion ideal. 
It is not clear to us whether these generators are related to a description by fusion potentials. The manipulations essentially rely upon the application of a class of identities generalizing the classical Jacobi–Trudy identity (which we will collectively refer to as Jacobi–Trudy identities). Many of these are well known [25], but we were unable to find identities for spinor representations in the literature, so we include derivations in Appendix A. We also include the corresponding identities for Spin(2r), as they may be of independent interest.

2. Presentations of Fusion Algebras

In this section, we consider the description of the fusion ideals $I_k^{\mathbb{C}}$ by fusion potentials. We introduce the potentials for the Wess–Zumino–Witten models over the
May 2, 2006 15:57 WSPC/148-RMP
J070-00262
Presentations of WZW Fusion Rings
207
groups SU(r + 1) and Sp(2r), and verify that the induced ideals vanish precisely on the fusion variety, and are radical. We then investigate the obvious class of analogous potentials for Wess–Zumino–Witten models over other groups, and show that in these cases, no potential in this class correctly describes the fusion algebra. Readers who are only interested in fusion rings and presentations of the ideals $I_k^{\mathbb{Z}}$ should skip to Sec. 3.
2.1. Fusion potentials

For Wess–Zumino–Witten models over SU(r + 1) and Sp(2r), the fusion ideal is supposed to be generated by the partial derivatives (with respect to the characters $\chi_i$ of the fundamental representations) of a single polynomial, called the fusion potential. At level k, [5] gives the SU(r + 1)-potential as
$$V_{k+r+1}(\chi_1, \ldots, \chi_r) = \frac{1}{k+r+1} \sum_{i=1}^{r+1} q_i^{k+r+1}, \tag{2.1}$$
where the $q_i$ are the (formal) exponentials of the weights $\varepsilon_i$ of the defining representation (whose character is $\chi_1$). Note that $q_1 \cdots q_{r+1} = 1$. The $\varepsilon_i$ are permuted by the Weyl group $W = S_{r+1}$ of SU(r + 1), and W acts analogously on the $q_i$. Therefore, $V_{k+r+1}$ is clearly W-invariant, hence is indeed a polynomial in the $\chi_i$ [26]. The level-k Sp(2r)-potential is given in [20, 21] as
$$V_{k+r+1}(\chi_1, \ldots, \chi_r) = \frac{1}{k+r+1} \sum_{i=1}^{r} \left( q_i^{k+r+1} + q_i^{-(k+r+1)} \right), \tag{2.2}$$
where the $q_i$ and $q_i^{-1}$ refer to the (formal) exponentials of the weights $\pm\varepsilon_i$ of the defining representation of Sp(2r) (whose character is again $\chi_1$). The Weyl group $W = S_r \ltimes \mathbb{Z}_2^r$ acts on the $\varepsilon_i$ by permutation ($S_r$) and negation (each $\mathbb{Z}_2$ sends one $\varepsilon_i$ to $-\varepsilon_i$ whilst leaving the others invariant). We see again that the given potential is a W-invariant, hence a polynomial in the $\chi_i$. These potentials are obviously best handled with generating functions. We also note that these potentials may be unified as
$$V_{k+h^\vee}(\chi_1, \ldots, \chi_r) = \frac{1}{k+h^\vee} \sum_{\mu \in P_{\Lambda_1}} e^{(k+h^\vee)\mu}, \tag{2.3}$$
where $P_\lambda$ denotes the set of weights of the irreducible representation of highest weight λ. Putting this form into a generating function (and dropping the explicit $\chi_i$ dependence) gives
$$V(t) = \sum_{m=1}^{\infty} (-1)^{m-1} V_m t^m = \log \Bigl[ \prod_{\mu \in P_{\Lambda_1}} (1 + e^{\mu} t) \Bigr].$$
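As a numerical sanity check of this generating function, the following sketch compares the power-sum form of the SU(3) potential, $V_m = \frac{1}{m}\sum_i q_i^m$, with the series coefficients of $\log(1 + \chi_1 t + \chi_2 t^2 + t^3)$. The helper name `log_series` and the sample angles are illustrative assumptions, not part of the paper; the check is only a floating-point spot test at one point, not a proof.

```python
import cmath


def log_series(coeffs, order):
    """Coefficients l_1..l_order of log P(t), where P(t) = sum_n a_n t^n and a_0 = 1.

    Uses P * (log P)' = P', i.e. n*a_n = sum_{j=1}^{n} j*l_j*a_{n-j}.
    """
    a = list(coeffs) + [0] * (order + 1 - len(coeffs))
    l = [0j] * (order + 1)
    for n in range(1, order + 1):
        l[n] = a[n] - sum(j * l[j] * a[n - j] for j in range(1, n)) / n
    return l


# Hypothetical sample point: three phases whose angles sum to zero, so q1*q2*q3 = 1.
angles = [0.3, 1.1, -1.4]
q = [cmath.exp(1j * a) for a in angles]

chi1 = sum(q)                                  # character of the defining representation
chi2 = q[0] * q[1] + q[0] * q[2] + q[1] * q[2]  # character of its second exterior power

order = 8
l = log_series([1, chi1, chi2, 1], order)


def V(m):
    """Power-sum form of the potential, V_m = (1/m) * sum_i q_i^m."""
    return sum(x ** m for x in q) / m


# V(t) = sum_m (-1)^(m-1) V_m t^m, so V_m should equal (-1)^(m-1) * l_m.
max_err = max(abs((-1) ** (m - 1) * l[m] - V(m)) for m in range(1, order + 1))
```

With these inputs `max_err` should be at the level of floating-point rounding, which is consistent with the two forms of the generating function agreeing order by order.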
P. Bouwknegt & D. Ridout
This generating function may therefore be expressed in terms of the characters of the exterior powers of the defining representation. These exterior powers are well known [27], and give
$$\mathrm{SU}(r+1): \quad V(t) = \log \Bigl[ \sum_{n=0}^{r+1} \chi_n t^n \Bigr], \tag{2.4}$$
where $\chi_0 = \chi_{r+1} = 1$, and
$$\mathrm{Sp}(2r): \quad V(t) = \log \Bigl[ \sum_{n=0}^{r-1} E_n \left( t^n + t^{2r-n} \right) + E_r t^r \Bigr], \tag{2.5}$$
where $\chi_0 = 1$, $\chi_n = 0$ for all $n < 0$, and $E_n = \chi_n + \chi_{n-2} + \chi_{n-4} + \cdots$.

At this point it should be mentioned that there is an explicit construction for arbitrary rational conformal field theories [28], which determines a function whose derivatives vanish on the fusion variety. This construction, however, requires an explicit knowledge of the S-matrix elements, and is quite unwieldy (as compared with the above potentials). Indeed, it also seems to possess significant ambiguities, and it is not clear how to fix these so as to find a potential with a representation-theoretic interpretation. In any case, it also appears to be difficult to determine if the ideals thus obtained are radical or dividing, so we will not consider this construction any further. There is also a paper [22] postulating simple potentials for every Wess–Zumino–Witten model, similar in form to those of Eqs. (2.1) and (2.2). But, as pointed out in [23], the partial derivatives of the potentials given do not always vanish on the fusion variety, and so cannot generate the fusion ideal. In [23], fusion potentials are presented for rings related to the fusion rings of the Wess–Zumino–Witten models over the special orthogonal groups. Unfortunately, their method fails to give the fusion rings for the special orthogonal groups. We will see in Sec. 2.3 why this is the case.

2.2. Verification

Let us first establish that the ideals defined by the potentials given in Eqs. (2.1) and (2.2) vanish on their respective fusion varieties. From Proposition 1.2, the points of the fusion variety have coordinates
$$v_i^{\lambda} = \frac{S_{\Lambda_i \lambda}}{S_{0 \lambda}} = \chi_i \left( -2\pi \mathrm{i} \, \frac{\lambda + \rho}{k + h^\vee} \right),$$
where the second equality follows readily from Weyl’s character formula and Eq. (1.4). It follows that the fusion potentials should have critical points precisely when the characters are evaluated at $\xi_\lambda = -2\pi \mathrm{i} (\lambda + \rho)/(k + h^\vee)$, for $\lambda \in \widehat{P}_k$. In fact, the functions $\kappa_i$ defined by
$$\kappa_i(\lambda) = \chi_i \left( -2\pi \mathrm{i} \, \frac{\lambda + \rho}{k + h^\vee} \right) = \sum_{\mu \in P_{\Lambda_i}} e^{-2\pi \mathrm{i} (\mu, \lambda + \rho)/(k + h^\vee)}$$
are invariant under the shifted action of the affine Weyl group $\widehat{W}_k$. Thus, the potentials should have critical points when evaluated at $\chi_i = \kappa_i(\lambda)$, for any $\lambda \in P$ which is not on a shifted alcove boundary. We denote the gradient operations with respect to the fundamental characters $\chi_i$ and the Dynkin labels $\lambda_j$ by $\nabla_\chi$ and $\nabla_\lambda$, respectively, and the jacobian matrix of the functions $\kappa_i$ with respect to the $\lambda_j$ by J. From the chain rule, it follows that if the potential has a critical point with respect to λ at which J is non-singular, then this is also a critical point with respect to the fundamental characters. It is therefore necessary to determine when J becomes singular. Explicit calculation shows that the jacobian, as a function on the weight space, satisfies
$$J(w(\nu)) = J(\nu) \, w, \tag{2.6}$$
hence det J is anti-invariant under the Weyl group W (here, w on the right-hand side refers to the matrix representation of w with respect to the basis of fundamental weights). It is therefore a multiple of the primitive anti-invariant element [26], and by comparing leading terms, we arrive at
$$\det J = \left( \frac{-2\pi \mathrm{i}}{k + h^\vee} \right)^{r} \frac{1}{|P/Q^\vee|} \prod_{\alpha \in \Delta_+} \left( e^{\alpha/2} - e^{-\alpha/2} \right),$$
where $Q^\vee$ is the coroot lattice and $\Delta_+$ are the positive roots of g (explicit details may be found in [24]). Evaluating at $-2\pi \mathrm{i} (\lambda + \rho)/(k + h^\vee)$, it follows that the jacobian is singular precisely when
$$\prod_{\alpha \in \Delta_+} \sin \left[ \pi \, \frac{(\alpha, \lambda + \rho)}{k + h^\vee} \right] = 0.$$
That is, when λ is on the boundary of a shifted affine alcove. Therefore, these boundaries are the only places where a potential may have critical points with respect to λ which need not be critical points with respect to the $\chi_i$. Evaluating the potentials, Eq. (2.3), as above gives
$$V_{k+h^\vee}(\kappa_1(\lambda), \ldots, \kappa_r(\lambda)) = \frac{1}{k + h^\vee} \sum_{\mu \in P_{\Lambda_1}} e^{-2\pi \mathrm{i} (\mu, \lambda + \rho)} = \frac{1}{k + h^\vee} \, \chi_1(-2\pi \mathrm{i} (\lambda + \rho)).$$
Note that the level dependence becomes quite trivial. We now determine the critical points of these potentials with respect to the Dynkin labels $\lambda_j$.

Sp(2r): The 2r weights of the defining representation are the $\varepsilon_j$ and their negatives. The potentials therefore take the form
$$V_{k+h^\vee} \left( -2\pi \mathrm{i} \, \frac{\lambda + \rho}{k + h^\vee} \right) = \frac{2}{k + h^\vee} \sum_{j=1}^{r} \cos[2\pi (\varepsilon_j, \lambda + \rho)].$$
Critical points therefore occur when
$$\sum_{j=1}^{r} (\Lambda_i, \varepsilon_j) \sin[2\pi (\varepsilon_j, \lambda + \rho)] = 0,$$
for each i = 1, . . . , r. The $(\Lambda_i, \varepsilon_j)$ form the entries of a square matrix which is easily seen to be invertible, as $\varepsilon_j = \frac{1}{2} (\alpha_j^\vee + \cdots + \alpha_r^\vee)$ [26]. We therefore have critical points precisely when
$$\sin[2\pi (\varepsilon_j, \lambda + \rho)] = \sin[\pi (\lambda_j + \rho_j + \cdots + \lambda_r + \rho_r)] = 0,$$
for all j = 1, . . . , r. It follows that $\lambda_j + \cdots + \lambda_r \in \mathbb{Z}$ for each j = 1, . . . , r, hence λ ∈ P.

SU(r + 1): In this case, the r + 1 weights of the defining representation are the $\varepsilon_j$, but we have the constraint $\varepsilon_1 + \cdots + \varepsilon_{r+1} = 0$. Finding the critical points on the weight space is a constrained optimization problem in $\mathbb{R}^{r+1}$, so we add a Lagrange multiplier Ω to the potential:
$$V_{k+h^\vee} \left( -2\pi \mathrm{i} \, \frac{\lambda + \rho}{k + h^\vee} \right) = \frac{1}{k + h^\vee} \sum_{j=1}^{r+1} e^{-2\pi \mathrm{i} (\varepsilon_j, \lambda + \rho)} + \Omega \, (\lambda, \varepsilon_1 + \cdots + \varepsilon_{r+1}).$$
It is now straightforward to show that the critical points are again λ ∈ P, so we leave this as an exercise for the reader.

So, for both SU(r + 1) and Sp(2r), the critical points with respect to λ of the potentials of Eq. (2.3) coincide with the weight lattice P. Every integral weight which is not on a shifted affine alcove boundary therefore corresponds to a critical point with respect to the fundamental characters (since J is non-singular there). To conclude that the critical points of the potentials coincide with the points of the corresponding fusion varieties, we therefore need to exclude the possibility that an integral weight on a shifted affine alcove boundary can correspond to a critical point with respect to the fundamental characters. This follows readily from a study of the determinant of the hessian matrix $H_\lambda = \left( \partial^2 V_{k+h^\vee} / \partial\lambda_i \partial\lambda_j \right)$ of the potentials at these points, whose computation we now turn to.

SU(r + 1): Here (indeed, for any simply-laced group), P coincides with the dual of the root lattice. Thus, λ ∈ P implies that $(\mu, \lambda + \rho) = (\Lambda_1, \lambda + \rho) \pmod 1$ for all $\mu \in P_{\Lambda_1}$. It follows that
$$(H_\lambda)_{ij} = \frac{-4\pi^2}{k + h^\vee} \sum_{\mu \in P_{\Lambda_1}} (\mu, \Lambda_i)(\mu, \Lambda_j) \, e^{-2\pi \mathrm{i} (\mu, \lambda + \rho)} = \frac{-4\pi^2}{k + h^\vee} \, e^{-2\pi \mathrm{i} (\Lambda_1, \lambda + \rho)} \, I_{\Lambda_1} (\Lambda_i, \Lambda_j),$$
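where $I_{\Lambda_1}$ is the Dynkin index of the irreducible representation of highest weight $\Lambda_1$. Thus,
$$\det H_\lambda = \left( \frac{-4\pi^2 I_{\Lambda_1}}{k + h^\vee} \right)^{r} \frac{e^{-2\pi \mathrm{i} r (\Lambda_1, \lambda + \rho)}}{|P/Q^\vee|} \neq 0,$$
when λ ∈ P.

Sp(2r): The weights of $P_{\Lambda_1}$ take the form $\pm\varepsilon_\ell = \pm\frac{1}{2} (\alpha_\ell^\vee + \cdots + \alpha_r^\vee)$, for $\ell = 1, 2, \ldots, r$, so $(\varepsilon_\ell, \Lambda_i)(\varepsilon_\ell, \Lambda_j) = \frac{1}{4}$ if $i \geq \ell$ and $j \geq \ell$, and 0 otherwise. Computing the hessian as before gives
$$(H_\lambda)_{ij} = \frac{-2\pi^2}{k + h^\vee} \sum_{\ell=1}^{\min\{i,j\}} \cos[\pi (\lambda_\ell + \cdots + \lambda_r + r - \ell + 1)].$$
Elementary row operations now suffice to compute
$$\det H_\lambda = \left( \frac{-2\pi^2}{k + h^\vee} \right)^{r} \prod_{\ell=1}^{r} \cos[\pi (\lambda_\ell + \cdots + \lambda_r + r - \ell + 1)],$$
so again $\det H_\lambda \neq 0$ on the weight lattice.

Denote the hessian matrix with respect to the $\chi_i$ of the potentials by $H_\chi$. Then, from
$$\frac{\partial^2 V_{k+h^\vee}}{\partial\lambda_i \partial\lambda_j} = \sum_{s,t} \frac{\partial\chi_s}{\partial\lambda_i} \frac{\partial^2 V_{k+h^\vee}}{\partial\chi_s \partial\chi_t} \frac{\partial\chi_t}{\partial\lambda_j} + \sum_{\ell} \frac{\partial V_{k+h^\vee}}{\partial\chi_\ell} \frac{\partial^2 \chi_\ell}{\partial\lambda_i \partial\lambda_j},$$
we see that
$$H_\lambda = J^{T} H_\chi J \quad \text{when } \nabla_\chi V_{k+h^\vee} = 0.$$
It follows that at the critical points of the potential with respect to the $\chi_i$,
$$\det H_\lambda = (\det J)^2 \det H_\chi. \tag{2.7}$$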
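The closed form quoted above for the Sp(2r) hessian determinant can be spot-checked numerically. The sketch below builds the matrix with entries $(H_\lambda)_{ij} = \frac{-2\pi^2}{k+h^\vee}\sum_{\ell \le \min\{i,j\}} c_\ell$, where $c_\ell = \cos[\pi(\lambda_\ell + \cdots + \lambda_r + r - \ell + 1)]$, and compares its determinant with the product formula. The function names and the sample Dynkin labels are illustrative assumptions; the determinant routine is a naive Laplace expansion that is only suitable for very small r.

```python
import math


def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def sp_hessian(labels, kappa):
    """Return (H, closed) for Sp(2r): the hessian matrix and the product formula.

    labels: Dynkin labels (lambda_1, ..., lambda_r); kappa: k + h^vee.
    """
    r = len(labels)
    # c[l-1] = cos[pi * (lambda_l + ... + lambda_r + r - l + 1)]
    c = [math.cos(math.pi * (sum(labels[l - 1:]) + r - l + 1)) for l in range(1, r + 1)]
    pref = -2 * math.pi ** 2 / kappa
    H = [[pref * sum(c[:min(i, j)]) for j in range(1, r + 1)] for i in range(1, r + 1)]
    closed = pref ** r * math.prod(c)
    return H, closed


# Hypothetical samples: an integral weight and a non-integral one for Sp(6) at k = 3.
H_int, closed_int = sp_hessian((1, 0, 2), kappa=7)
H_gen, closed_gen = sp_hessian((0.3, 1.7, 0.25), kappa=7)
```

At the integral weight every cosine equals ±1, so the determinant cannot vanish there, in line with the statement in the text.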
Now, we have just demonstrated that $\det H_\lambda \neq 0$ on the weight lattice, but we know that $\det J = 0$ on the shifted affine alcove boundaries. As $\det H_\chi$ is a polynomial (hence finite-valued), this forces the conclusion that any integral weight lying on a shifted affine alcove boundary is not a critical point of the potential with respect to the $\chi_i$. Of course, this is exactly what we wanted to show. To summarize, we have shown that the ideal generated by the derivatives of the potentials given in Eqs. (2.1) and (2.2) vanishes precisely on the fusion variety. To complete the proof (over C) that these potentials describe the fusion ideal $I_k^{\mathbb{C}}$, we need to show that this ideal is radical. Happily, this follows immediately from Eq. (2.7) and some standard multiplicity theory, specifically the theory of Milnor numbers [29, 30]: The ideal generated by the derivatives of a potential is radical if and only if the hessian of the potential is non-singular at each point of the corresponding (zero-dimensional) variety. Since $H_\lambda$ and J are non-singular at the points of the fusion variety, $H_\chi$ is non-singular there by Eq. (2.7), and we are done.
The ideals are radical, so the potentials given by Eqs. (2.1) and (2.2) correctly describe the fusion algebras of SU(r + 1) and Sp(2r) (respectively).

2.3. A class of candidate potentials

In searching for fusion potentials appropriate for the Wess–Zumino–Witten models over the other (simply-connected) simple groups G, an obvious class of potentials to consider is those of the form (compare Eq. (2.3))
$$V^{\Gamma}_{k+h^\vee} = \frac{1}{k + h^\vee} \sum_{\mu \in \Gamma} e^{(k+h^\vee)\mu}. \tag{2.8}$$
Here, Γ is a finite W-invariant set of integral weights. This ensures that these potentials are polynomials in the fundamental characters with rational coefficients. Indeed, the derivatives of such polynomials have integral coefficients, as may be seen by differentiating the generating function
$$V^{\Gamma}(t) = \sum_{m=1}^{\infty} (-1)^{m-1} V^{\Gamma}_m t^m = \log \Bigl[ \prod_{\mu \in \Gamma} (1 + e^{\mu} t) \Bigr].$$
In this section, we will show (with the aid of an example) that the fusion algebra of these other Wess–Zumino–Witten models is not described by potentials from this class.^c For our example, we choose the exceptional group G2 because its weight space is easily visualized. Specifically, we consider the two potentials obtained from Eq. (2.8) by taking Γ to be the Weyl orbit $W(\Lambda_i)$ of a fundamental weight. One might prefer to take the potentials based on the weights of the fundamental representations, but this leads to more difficult computations. As in Sec. 2.2, we evaluate these potentials on the weight space (at $\xi_\lambda$). It is extremely important to realize that as functions on the weight space, the potentials are invariant under the shifted action of the affine Weyl groups $\widehat{W}_k$ for all k (because the level dependence is essentially trivial). We can therefore restrict to computing the critical points in a fundamental alcove at (effective) level $\kappa \equiv k + h^\vee = 1$ (a truly fundamental domain for the periodicity of the potentials). The results are shown in Fig. 1. It is immediately evident that in contrast with the SU(r + 1) and Sp(2r) fusion potentials, these G2 potentials have critical points (with respect to the Dynkin labels $\lambda_i$) which include, but are not limited to, the weight lattice. These non-integral critical points are the crux of the matter. When these critical points lie on a shifted (level-k) alcove boundary, we saw in Sec. 2.2 that they need not correspond to genuine critical points (with respect to the fundamental characters). However, any critical point in the interior of a shifted alcove is necessarily a critical point with respect to the fundamental characters, and Gepner’s characterization of the fusion variety requires these to be integral. Unfortunately, at any given level k > 0, the invariance of the critical points under $\widehat{W}_k$ for all k means that there will always be non-integral critical points in the interior of the alcoves (for k sufficiently large). This is illustrated in Fig. 2 for the potential $V_5^{W(\Lambda_1)}$ (corresponding to level k = 1). It follows that the potentials based on the Weyl orbits of the G2 fundamental weights do not describe the fusion variety.

^c To be precise, we will prove that the potential cannot take the form of Eq. (2.8) for all levels, unless G is SU(r + 1) or Sp(2r).

Fig. 1. The (shifted) critical points λ + ρ of the potentials $V^{W(\Lambda_1)}_{k+h^\vee}$ and $V^{W(\Lambda_2)}_{k+h^\vee}$ for G2 as a function of the weight space. (Our convention is that $\Lambda_1$ is the highest weight of the adjoint representation.)

Fig. 2. The critical points λ of the potential $V^{W(\Lambda_2)}_{k+h^\vee}$ for G2 in the shifted fundamental alcove at level k = 1. The white points denote those in the interior which do not belong to the weight lattice.

We can, of course, consider potentials $V^{\Gamma}_{k+h^\vee}$ based on more complicated W-invariant sets Γ. However, when evaluating on the weight space, any such potential is just a W-invariant linear combination of formal exponentials of integral
weights, and so is a polynomial in the potentials $V^{W(\Lambda_1)}_{k+h^\vee}$ and $V^{W(\Lambda_2)}_{k+h^\vee}$ considered before. It follows now from the chain rule for differentiation that if λ + ρ is a common critical point of all the $V^{W(\Lambda_i)}_{k+h^\vee}$, then it is also a critical point of $V^{\Gamma}_{k+h^\vee}$. From Fig. 1, we see that any potential $V^{\Gamma}_{k+h^\vee}$ for G2 will have critical points at non-integral weights, and so will not correctly describe the fusion variety.

The situation is similarly bleak for the other simple groups because any potential of the form $V^{\Gamma}_{k+h^\vee}$ will have (shifted) critical points at the vertices of the affine alcoves (at all levels). We will demonstrate this claim shortly. What it implies is that the only time a potential of this form stands the chance of describing the fusion variety is when the alcove vertices are integral (at all levels). This only happens when the comarks of the Lie group are all unity, which is only the case for G = SU(r + 1) and Sp(2r).

Let us finish with the promised demonstration. Our earlier remarks show that it is sufficient to consider the potentials $V_m^{P_{\Lambda_i}}$, i = 1, . . . , r. We will show that these always have critical points (with respect to λ) when λ + ρ is the vertex of an affine alcove. Identifying m with $k + h^\vee$, the condition for $V_m^{P_{\Lambda_i}}$ to have a critical point is just that $J_{ij}(-2\pi \mathrm{i} (\lambda + \rho)) = 0$ for each j. We therefore need to show that $J(-2\pi \mathrm{i} \nu) = 0$ whenever ν is an alcove vertex. We rewrite Eq. (2.6) in terms of the i-th row of J, $\nabla_\lambda \chi_i$:
$$\nabla_\lambda \chi_i(-2\pi \mathrm{i} \, w(\nu)) = \nabla_\lambda \chi_i(-2\pi \mathrm{i} \nu) \, w.$$
Here w (on the right-hand side) denotes the matrix representing w with respect to the basis of fundamental weights. We will treat the row vector $\nabla_\lambda \chi_i(-2\pi \mathrm{i} \nu)$ as an element of the dual of the weight space (the Cartan subalgebra). We can also restrict our attention to the fundamental alcove vertices, by $\widehat{W}$-invariance of the characters. If ν = 0, then ν is fixed by every w ∈ W, so $\nabla_\lambda \chi_i(-2\pi \mathrm{i} \nu)$ is a row vector fixed by every w ∈ W. Thus, $\nabla_\lambda \chi_i(0)$ is the zero vector (for each i), verifying our claim for this vertex (and its $\widehat{W}$-images).

The other fundamental alcove vertices have the form $\nu = \Lambda_j / a_j^\vee$, where $a_j^\vee$ is the j-th comark of g. As ν is invariant under all the simple Weyl reflections except $w_j$, $\nabla_\lambda \chi_i(-2\pi \mathrm{i} \nu)$ is also invariant under all these simple reflections, hence $\nabla_\lambda \chi_i(-2\pi \mathrm{i} \nu)$ is orthogonal to every simple root except $\alpha_j$. But, ν is fixed by the affine reflection about the hyperplane (µ, θ) = 1. This reflection has the form $\widehat{w}(\mu) = w_\theta(\mu) + \theta$, where $w_\theta \in W$ is the Weyl reflection associated with the highest root θ. Hence, using the invariance of the characters under translations in $Q^\vee$,
$$\nabla_\lambda \chi_i(-2\pi \mathrm{i} \nu) = \nabla_\lambda \chi_i(-2\pi \mathrm{i} (w_\theta(\nu) + \theta)) = \nabla_\lambda \chi_i(-2\pi \mathrm{i} \, w_\theta(\nu)) = \nabla_\lambda \chi_i(-2\pi \mathrm{i} \nu) \, w_\theta.$$
It follows now that $\nabla_\lambda \chi_i(-2\pi \mathrm{i} \nu)$ is also orthogonal to θ. But, θ and the simple roots, excepting $\alpha_j$, together constitute a basis of the weight space (as the mark $a_j$ never vanishes). Thus, $\nabla_\lambda \chi_i(-2\pi \mathrm{i} \nu)$ is again the zero vector, verifying our claim for all the vertices of the fundamental alcove.
3. Presentations of Fusion Rings

We now turn to the study of fusion rings over Z. Given the results of Sec. 2.3, we introduce a characterization of the fusion ideal $I_k^{\mathbb{Z}}$ for general Wess–Zumino–Witten models which makes no mention of potentials. We then analyze this characterization in the cases of SU(r + 1) and Sp(2r), and show that it can be reduced to recover the potentials of Eqs. (2.1) and (2.2). We would like to emphasize that this constitutes a derivation of these fusion potentials over Z, and not an a posteriori verification over C. In Sec. 4, we will apply this reduction to Spin(2r + 1).

3.1. A general characterization

We begin with the simple observation that given any weight λ and $\widehat{w} \in \widehat{W}_k$, we have
$$\chi_\lambda - \det \widehat{w} \; \chi_{\widehat{w} \cdot \lambda} \in I_k^{\mathbb{Z}}. \tag{3.1}$$
(The definition of character has been extended to non-dominant weights by Weyl’s character formula.) This follows easily from Gepner’s characterization of the fusion algebra, Proposition 1.2 (and the remarks which follow it). Since the fusion ideal is dividing (see Sec. 1), it follows that $\chi_\lambda \in I_k^{\mathbb{Z}}$ whenever λ is on a shifted affine alcove boundary.

Let $L_\lambda$ denote the irreducible representation of G of highest weight λ. Letting $\lambda_i$ denote the Dynkin labels of the weight λ, it follows from the familiar properties of the representation ring that λ is the highest weight of the representation $L_{\Lambda_1}^{\otimes \lambda_1} \otimes \cdots \otimes L_{\Lambda_r}^{\otimes \lambda_r}$. As a polynomial in the character ring, $\mathbb{Z}[\chi_1, \ldots, \chi_r]$, we see that the character $\chi_\lambda$ has the form
$$\chi_\lambda = \chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r} - \cdots,$$
where the omitted terms correspond, in a sense, to lower weights which we regard as being of lesser importance. Our strategy now is to make this lack of importance precise by introducing a monomial ordering on the character ring such that the leading term (lt) of $\chi_\lambda$ is precisely $\mathrm{lt}(\chi_\lambda) = \chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r}$. Of course, we are studying fusion, so we also want to assign (relative) importance to characters according to whether the associated weight is on a shifted affine alcove boundary or not. In particular, we should distinguish weights on the boundary (λ, θ) = k + 1 from those inside the fundamental alcove (λ, θ) ≤ k. Happily, these requirements can both be satisfied by defining a monomial ordering ≺ on the character ring, $\mathbb{Z}[\chi_1, \ldots, \chi_r]$, by
$$\chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r} \prec \chi_1^{\mu_1} \cdots \chi_r^{\mu_r}$$
if and only if
$$(\lambda, \theta) < (\mu, \theta),$$
or
$$(\lambda, \theta) = (\mu, \theta) \text{ and } (\lambda, \rho) < (\mu, \rho),$$
or
$$(\lambda, \theta) = (\mu, \theta) \text{ and } (\lambda, \rho) = (\mu, \rho) \text{ and } \chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r} \prec' \chi_1^{\mu_1} \cdots \chi_r^{\mu_r},$$
where $\prec'$ is any other monomial ordering, lexicographic for definiteness. This is an example of a weight order [4] (and is therefore a genuine monomial ordering). We demonstrate that $\mathrm{lt}(\chi_\lambda)$ is indeed $\chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r}$. This proceeds inductively on the height, as it is obvious when λ is zero or a fundamental weight. We decompose $L_{\Lambda_1}^{\otimes \lambda_1} \otimes \cdots \otimes L_{\Lambda_r}^{\otimes \lambda_r}$ into irreducible representations, so that
$$\chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r} = \chi_\lambda + \sum_{\mu} c_\mu \chi_\mu,$$
where the µ are all of lower height than λ: (µ, ρ) < (λ, ρ). By induction, $\mathrm{lt}(\chi_\lambda)$ is the greatest (under ≺) of $\chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r}$ and the monomials $-c_\mu \chi_1^{\mu_1} \cdots \chi_r^{\mu_r}$. Now, since each µ is a weight of $L_{\Lambda_1}^{\otimes \lambda_1} \otimes \cdots \otimes L_{\Lambda_r}^{\otimes \lambda_r}$, $\mu = \lambda - \sum_i m_i \alpha_i$, where the $m_i$ are non-negative integers. It follows that (µ, θ) ≤ (λ, θ) since the Dynkin labels of θ are never negative. But, in the definition of ≺, ties in (·, θ) are broken by height, hence $\chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r}$ is the greatest of the monomials (under ≺) as required.

Consider now the ideal $\langle \mathrm{lt}(I_k^{\mathbb{Z}}) \rangle$ generated by the leading terms (with respect to ≺) of the polynomials in the fusion ideal. Since the fusion ring is freely generated (as a Z-module) by (the cosets of) the characters of the weights in $\widehat{P}_k$, the leading terms $\chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r}$ with (λ, θ) ≤ k must be the only monomials not in $\langle \mathrm{lt}(I_k^{\mathbb{Z}}) \rangle$. That is, $\langle \mathrm{lt}(I_k^{\mathbb{Z}}) \rangle$ is freely generated as an abelian group by the set of monomials
$$M = \left\{ \chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r} : (\lambda, \theta) > k \right\}.$$
As an ideal, it is now easy to see that $\langle \mathrm{lt}(I_k^{\mathbb{Z}}) \rangle$ is generated by the atomic monomials of M, where the atomic monomials are defined to be those which cannot be expressed as the product of a fundamental character and a monomial from M. Equivalently, atomic monomials are those corresponding to weights from which one cannot subtract any fundamental weight and still remain in the set of weights corresponding to M. It should be clear that every weight λ with (λ, θ) = k + 1 corresponds to an atomic monomial. In fact, for SU(r + 1) and Sp(2r), these are all the atomic monomials, as the comarks are $a_i^\vee = 1$ (so if (µ, θ) > k + 1, one can always subtract a fundamental weight from µ yet remain in M). For other groups, it will generally be necessary to include other monomials. For example, $a_1^\vee = 2$ for G2, so it follows that when the level k is even, the monomial $\chi_1^{(k+2)/2}$ is also atomic (this is illustrated in Fig. 3).

Let $\chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r}$ be an atomic monomial of M. If the associated weight λ is on a shifted affine alcove boundary, we associate to this atomic monomial the polynomial $p_\lambda = \chi_\lambda \in I_k^{\mathbb{Z}}$. If not, we use Eq. (3.1) to reflect λ into the fundamental affine alcove, and take $p_\lambda = \chi_\lambda - \det \widehat{w} \, \chi_{\widehat{w} \cdot \lambda} \in I_k^{\mathbb{Z}}$. In either case, we have constructed a $p_\lambda$ in the fusion ideal whose leading term with respect to ≺ is $\chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r}$. Therefore,
$$\langle \mathrm{lt}(I_k^{\mathbb{Z}}) \rangle = \left\langle \text{atomic } \chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r} \text{ in } M \right\rangle = \left\langle \mathrm{lt}(p_\lambda) : \lambda \text{ is associated to an atomic monomial in } M \right\rangle.$$
But, this is exactly the definition of a Gröbner basis for $I_k^{\mathbb{Z}}$ [4, 29].
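The atomic monomials described above can be enumerated directly from the comarks. The sketch below assumes the standard identification $(\lambda, \theta) = \sum_i a_i^\vee \lambda_i$ in terms of Dynkin labels (an assumption not spelled out in the text), and the function name `atomic_weights` is hypothetical. A weight in M is atomic exactly when subtracting any fundamental weight $\Lambda_i$ with $\lambda_i \geq 1$ leaves M.

```python
from itertools import product


def atomic_weights(comarks, k):
    """Dynkin labels of the weights whose monomials are atomic in M = {(lambda, theta) > k}.

    comarks: tuple (a_1^vee, ..., a_r^vee); k: the level.
    Assumes (lambda, theta) = sum_i a_i^vee * lambda_i.
    """
    def level(lam):
        return sum(a * l for a, l in zip(comarks, lam))

    # An atomic weight satisfies k < level <= k + max(comarks), which bounds each label.
    bound = k + max(comarks)
    ranges = [range(bound // a + 1) for a in comarks]

    result = []
    for lam in product(*ranges):
        if level(lam) <= k:
            continue  # not in M
        # Atomic: removing any available fundamental weight drops out of M.
        if all(level(lam) - a <= k for a, l in zip(comarks, lam) if l >= 1):
            result.append(lam)
    return sorted(result)


su3_level2 = atomic_weights((1, 1), 2)       # all comarks equal to 1
g2_level2 = atomic_weights((2, 1), 2)        # G2 with Lambda_1 the adjoint weight, k even
g2_level3 = atomic_weights((2, 1), 3)        # G2, k odd
spin7_level2 = atomic_weights((1, 2, 1), 2)  # Spin(7) comarks as quoted in Sec. 4
```

For G2 at the even level k = 2 the list contains (2, 0), i.e. the monomial $\chi_1^{(k+2)/2}$, in addition to the weights with (λ, θ) = k + 1, while at the odd level k = 3 only weights with (λ, θ) = k + 1 appear.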
Fig. 3. The weights corresponding to the atomic monomials for the ideal $\langle \mathrm{lt}(I_k^{\mathbb{Z}}) \rangle$ associated with G2 at even and odd levels. Weights corresponding to monomials in the ideal are grey or black, the latter corresponding to atomic monomials. The arrows indicate the effect of multiplying by $\chi_1$ and $\chi_2$.
Proposition 3.1. The polynomials $p_\lambda$ constructed above for each weight λ associated to an atomic monomial of $M = \{ \chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r} : (\lambda, \theta) > k \}$ form a Gröbner basis for the fusion ideal $I_k^{\mathbb{Z}}$, with respect to the monomial ordering ≺. That is,
$$I_k^{\mathbb{Z}} = \left\langle p_\lambda : \lambda \text{ is associated to an atomic monomial in } M \right\rangle.$$
Note the crucial, but subtle, role played by the monomial ordering ≺. Note also that because the given Gröbner basis has elements whose leading coefficient is unity, this presentation shows explicitly that the fusion ideal is dividing. Whilst this presentation has a nice Lie-theoretic interpretation, it is rather more cumbersome than we would wish for. Indeed, a presentation in terms of a potential would give a set of r = rank G generators for the fusion ideal (at every level k), whereas Proposition 3.1 gives a set whose cardinality is of the order of $k^{r-1}$. We will therefore indicate in what follows how one can reduce the number of generators to something a bit more manageable (at least for the classical groups).
3.2. Deriving fusion potentials

We will begin with the case of SU(r + 1). As noted in Sec. 3.1, the atomic monomials of $M = \{ \chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r} : (\lambda, \theta) > k \}$ are precisely those corresponding to weights λ with (λ, θ) = k + 1. It follows from Proposition 3.1 that
$$I_k^{\mathbb{Z}} = \left\langle \chi_\lambda : (\lambda, \theta) = k + 1 \right\rangle.$$
The highest root has the form $\theta = \varepsilon_1 - \varepsilon_{r+1}$, so for these weights, $k + 1 = (\lambda, \theta) = \lambda^1 - \lambda^{r+1} = \lambda^1$. Here, we write $\lambda = \sum_{j=1}^{r+1} \lambda^j \varepsilon_j$, and fix the ambiguity corresponding to $\sum_{j=1}^{r+1} \varepsilon_j = 0$ by setting $\lambda^{r+1} = 0$. We emphasize that the $\lambda^j$ are not to be confused with the Dynkin labels $\lambda_j$. We now use the Jacobi–Trudy identity, Eq. (A.3), to decompose these generators of the fusion ideal into complete symmetric polynomials (denoted by $H_m$) in the $q_i$.
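We have
$$\chi_\lambda = \begin{vmatrix} H_{\lambda^1} & H_{\lambda^2 - 1} & \cdots & H_{\lambda^r - r + 1} \\ H_{\lambda^1 + 1} & H_{\lambda^2} & \cdots & H_{\lambda^r - r + 2} \\ \vdots & \vdots & \ddots & \vdots \\ H_{\lambda^1 + r - 1} & H_{\lambda^2 + r - 2} & \cdots & H_{\lambda^r} \end{vmatrix} = \begin{vmatrix} H_{k+1} & H_{\lambda^2 - 1} & \cdots & H_{\lambda^r - r + 1} \\ H_{k+2} & H_{\lambda^2} & \cdots & H_{\lambda^r - r + 2} \\ \vdots & \vdots & \ddots & \vdots \\ H_{k+r} & H_{\lambda^2 + r - 2} & \cdots & H_{\lambda^r} \end{vmatrix}.$$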
Since $H_m = \chi_{m\Lambda_1} \in \mathbb{Z}[\chi_1, \ldots, \chi_r]$, expanding this determinant down the first column gives $\chi_\lambda$ as a $\mathbb{Z}[\chi_1, \ldots, \chi_r]$-linear combination of the $H_{k+i} = \chi_{(k+i)\Lambda_1}$, where i = 1, . . . , r. Therefore,
$$I_k^{\mathbb{Z}} \subseteq \left\langle \chi_{(k+i)\Lambda_1} : i = 1, \ldots, r \right\rangle.$$
Conversely, we show that each $(k + i)\Lambda_1$, i = 1, . . . , r, is on a shifted affine alcove boundary, hence is fixed by an affine reflection $\widehat{w}$, and thus, $\chi_{(k+i)\Lambda_1}$ is in the fusion ideal. This amounts to verifying that $((k + i)\Lambda_1 + \rho, \alpha) \in (k + h^\vee)\mathbb{Z}$ for some root α, and the reader can easily check that $\alpha = \varepsilon_1 - \varepsilon_{r+2-i}$ works. We have therefore demonstrated that
$$I_k^{\mathbb{Z}} = \left\langle \chi_{(k+i)\Lambda_1} : i = 1, \ldots, r \right\rangle. \tag{3.2}$$
It is rather pleasing that such a simple device can reduce the number of generators from (the order of) $k^{r-1}$ to r.

Before turning to the integration of these generators to a potential, we would like to mention one further observation that may be of interest. We consider the characters $\chi_{k\Lambda_1 + \Lambda_i}$, where i = 1, . . . , r. Expanding with the Jacobi–Trudy identity, we find that
$$\begin{aligned} \chi_{k\Lambda_1 + \Lambda_1} &= H_{k+1}, \\ \chi_{k\Lambda_1 + \Lambda_2} &= H_1 H_{k+1} - H_{k+2}, \\ \chi_{k\Lambda_1 + \Lambda_3} &= \left( H_1^2 - H_2 \right) H_{k+1} - H_1 H_{k+2} + H_{k+3}, \\ \chi_{k\Lambda_1 + \Lambda_4} &= \left( H_1^3 - 2 H_1 H_2 + H_3 \right) H_{k+1} - \left( H_1^2 - H_2 \right) H_{k+2} + H_1 H_{k+3} - H_{k+4}, \\ &\;\;\vdots \end{aligned}$$
We call this the method of 1’s due to the line of 1’s which appear off-diagonal in the Jacobi–Trudy expansion of these characters. These equations show (inductively) that there is another simple generating set for the fusion ideal:
$$I_k^{\mathbb{Z}} = \left\langle \chi_{k\Lambda_1 + \Lambda_i} : i = 1, \ldots, r \right\rangle.$$
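The first few method-of-1’s expansions can be checked exactly at a sample point. The sketch below evaluates the SU(3) characters as Schur polynomials via the ratio-of-alternants form of Weyl’s character formula and compares them with the corresponding expressions in the complete symmetric polynomials $H_m$. The function names, the choice k = 4 and the rational sample point (chosen so that $q_1 q_2 q_3 = 1$) are illustrative assumptions; exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction
from itertools import combinations_with_replacement
from math import prod


def det(m):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def complete(m, xs):
    """Complete symmetric polynomial H_m evaluated at xs (H_m = 0 for m < 0)."""
    if m < 0:
        return Fraction(0)
    return sum(
        (prod(c, start=Fraction(1)) for c in combinations_with_replacement(xs, m)),
        Fraction(0),
    )


def schur(partition, xs):
    """Schur polynomial as a ratio of alternants, evaluated at distinct points xs."""
    n = len(xs)
    lam = list(partition) + [0] * (n - len(partition))
    num = det([[x ** (lam[j] + n - 1 - j) for j in range(n)] for x in xs])
    den = det([[x ** (n - 1 - j) for j in range(n)] for x in xs])
    return num / den


# Hypothetical sample point with q1*q2*q3 = 1, and a sample level.
xs = [Fraction(2), Fraction(3), Fraction(1, 6)]
k = 4


def H(m):
    return complete(m, xs)


lhs1, rhs1 = schur((k + 1,), xs), H(k + 1)
lhs2, rhs2 = schur((k + 1, 1), xs), H(1) * H(k + 1) - H(k + 2)
lhs3 = schur((k + 1, 1, 1), xs)
rhs3 = (H(1) ** 2 - H(2)) * H(k + 1) - H(1) * H(k + 2) + H(k + 3)
```

In three variables with unit product, the partition (k + 1, 1, 1) describes the same SU(3) character as (k, 0, 0), so `lhs3` also equals `H(k)` at this point.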
This generating set is suggested by the computations of [9] (though not explicitly stated there) on the corresponding brane charge groups.^d Note that this set has the nice property of consisting entirely of characters $\chi_\lambda$ with (λ, θ) = k + 1.

We now turn to the derivation of the fusion potential, Eq. (2.1). Let $E_n$ denote the n-th elementary symmetric polynomial in the $q_i$. From the identity $\sum_m H_m t^m = \left[ \sum_n (-1)^n E_n t^n \right]^{-1}$, we can derive
$$\frac{\partial H_m}{\partial E_j} = (-1)^{j+1} \sum_{n} H_n H_{m-j-n}. \tag{3.3}$$
(−1)
∂Hk+h∨ −i i+j = (−1) Hn Hk+h∨ −i−j−n ∂χj n
i−1 is symmetric in i and j. Therefore, i (−1) Hk+h∨ −i dχi is a closed 1-form, hence integrates to a potential Vk+h∨ (there is no topology). We can compute this potential using generating functions. If V (t) = m−1 Vm tm , then m (−1) ∂V (t) ti ti m+i = = (−1) Hm−i tm = n ∂χi (1 + q t) n En t m ti 1 + χ1 t + · · · + χr tr + tr+1
V (t) = log 1 + χ1 t + · · · + χr tr + tr+1 , =
⇒
up to a constant. This is of course Eq. (2.4), from which one can easily recover the fusion potential, Eq. (2.1). We would like to emphasize once again that not only have we given a complete derivation of the fusion potential for the SU(r + 1) Wess–Zumino–Witten models, but we have shown that this potential describes the fusion process over Z, rather than just over C. Consider now the fusion ring for Sp(2r). As before, Proposition 3.1 gives the characters χλ with (λ, θ) = k + 1 as a set of generators for the fusion ideal, IkZ . The highest root is θ = 2ε1 , so for these characters, k+1 = (λ, θ) = λ1 (note that εi 2 = 1 2 ). We expand the Sp(2r) Jacobi–Trudy identity, Eq. (A.4), down the first column. Noting that Hm = χmΛ1 , this shows that the generating characters can be expressed as Z[χ1 , . . . , χr ]-linear combinations of the r elements Hk+1 and Hk+1+i + Hk+1−i d To elaborate somewhat, the authors of [9] computed the brane charge group of the level k SU(r + 1) Wess–Zumino–Witten model from the greatest common divisor of the dimensions of the irreducible representations of highest weight kΛ1 + Λi , i = 1, . . . , r. In [10], the brane charge group was shown to be determined by the greatest common divisor of the dimensions of any set of generators of the ideal IkZ of the fusion ring. This suggests that the χkΛ1 +Λi are such a set of generators, and here we have given a simple proof of this fact.
(i = 1, . . . , r − 1). Here, the $H_m$ are complete symmetric polynomials in the $q_i$ and their inverses. It is obvious that these elements belong to $I_k^{\mathbb{Z}}$, hence
$$I_k^{\mathbb{Z}} = \left\langle \chi_{(k+1)\Lambda_1}, \; \chi_{(k+1+i)\Lambda_1} + \chi_{(k+1-i)\Lambda_1} : i = 1, \ldots, r - 1 \right\rangle. \tag{3.4}$$
Applying the method of 1’s to these elements gives an alternative set of generators:
$$I_k^{\mathbb{Z}} = \left\langle \chi_{k\Lambda_1 + \Lambda_i} : i = 1, \ldots, r \right\rangle.$$
Deriving a potential from these generators is somewhat more cumbersome than before. For this purpose, we use the set of generators
$$\left\{ \sum_{\ell=0}^{r-i} H_{k+h^\vee-i-2\ell} : i = 1, \ldots, r \right\},$$
which is easily derived from those given above. From Eq. (3.3) and the expressions for $E_n$ in terms of the $\chi_j$ [27], we compute that
$$(-1)^{i-1} \frac{\partial}{\partial \chi_j} \sum_{\ell=0}^{r-i} H_{k+h^\vee-i-2\ell} = (-1)^{i+j} \sum_{n} \sum_{m=0}^{r-i} \sum_{m'=0}^{r-j} H_n H_{k+h^\vee-n-i-j-2(m+m')},$$
which is symmetric in i and j (indeed, this symmetry is what suggests the above generating set, as it leads to a closed 1-form). These generators may therefore be integrated to a potential, and the derivation may be completed using generating functions as in the SU(r + 1) case. In this way, we recover Eq. (2.5) and therefore the fusion potential, Eq. (2.2).

4. Presentations for Spin(2r + 1)

We now apply the techniques of Sec. 3.1 to the fusion rings of the Wess–Zumino–Witten models over Spin(2r + 1). We are not aware of any concise, representation-theoretic presentations of these rings (nor of the corresponding algebras) in the literature.^e We will see that the appropriate Jacobi–Trudy identities may be employed to substantially simplify the presentations given by Proposition 3.1, though the simplification turns out to be not quite so drastic as that found for SU(r + 1) and Sp(2r). In particular, it seems rather doubtful that the presentations obtained are related to potentials.

^e In the course of preparing this section, we were made aware of a conjecture regarding the presentations of the fusion ideals of the Spin(2r + 1) (and Spin(2r)) Wess–Zumino–Witten models [31]. This elegant conjecture amounts to the statement that the fusion ideal at level k is the radical of the ideal generated by the $\chi_{(k+i)\Lambda_1}$, for $i = 1, 2, \ldots, h^\vee - 1$. This is a generalization of the SU(r + 1) result, Eq. (3.2). It is further conjectured that the radical of this ideal is generated by the above characters and $\chi_{k\Lambda_1 + \Lambda_r}$ ($\chi_{k\Lambda_1 + \Lambda_{r-1}}$ is also needed for Spin(2r)).

Recall from Sec. 3.1 that we can derive a generating set for the fusion ideal $I_k^{\mathbb{Z}}$ by computing the atomic monomials of the set $\{ \chi_1^{\lambda_1} \cdots \chi_r^{\lambda_r} : (\lambda, \theta) > k \}$. As shown there for G2, this computation depends upon the comarks $a_i^\vee$, which for
May 2, 2006 15:57 WSPC/148-RMP
J070-00262
Presentations of WZW Fusion Rings
221
Spin(2r + 1) are 1 for i = 1, r, and 2 otherwise (we will only consider r > 2). The atomic monomials therefore correspond to the weights k odd :
{λ : (λ, θ) = k + 1}
k even : {λ : (λ, θ) = k + 1} ∪ {λ : (λ, θ) = k + 2 and λ1 = λr = 0}. Finding elements of IkZ whose leading terms are these monomials is easy, and we deduce from Proposition 3.1 that the fusion ring is generated by: k odd :
{χλ : (λ, θ) = k + 1}
k even : {χλ : (λ, θ) = k + 1} ∪ {χλ + χλ−θ : (λ, θ) = k + 2 and λ1 = λr = 0} . (4.1) We note that if λ2 = 0, χλ−θ = 0. In order to reduce the size of this generating set, we again turn to the appropriate Jacobi–Trudy identities. As noted in Appendix A.3, these identities distinguish between tensor and spinor representations (whose highest weight λ has λr even and odd, respectively). We consider first the tensor representations. The appropriate Jacobi–Trudy identity, Eq. (A.7), gives the irreducible characters as a determinant of an r × r matrix: Hλ1 − Hλ1 −2 Hλ2 −1 − Hλ2 −3 · · · Hλr +1−r − Hλr −1−r Hλ1 +1 − Hλ1 −3 Hλ2 − Hλ2 −4 · · · Hλr +2−r − Hλr −2−r χλ = . .. .. .. .. . . . . Hλ1 +r−1 − Hλ1 −r−1 Hλ2 +r−2 − Hλ2 −r−2 · · · Hλr − Hλr −2r (4.2) Here, λj denotes the components of λ with respect to the usual orthonormal basis εj of the weight space, and Hm denotes the m-th complete symmetric polynomial in the qi = exp(εi ), their inverses, and 1. How this treatment differs from the analysis of Sec. 3.2, and is thereby significantly complicated, is that θ = ε1 + ε2 , so (λ, θ) = λ1 + λ2 . It follows that the elements in any single column of the Jacobi–Trudy determinant of a character χλ with (λ, θ) = k + 1 will not generally belong to the fusion ideal, so expanding the determinant down a single column is pointless. Instead, we notice that the top-left 2 × 2 subdeterminant is the character χλ1 ε1 +λ2 ε2 , and that (λ, θ) = k + 1 implies that this subdeterminant is in IkZ . This observation suggests that we must expand Eq. (4.2) down the first two columns. In this way, χλ is expressed as a Z[χ1 , . . . , χr ]-linear combination of the 2 × 2 determinants 1 2 Hλ1 +m1 −1 − Hλ1 −m1 −1 Hλ2 +m1 −2 − Hλ2 −m1 −2 ψm1 m2 λ , λ = . Hλ1 +m2 −1 − Hλ1 −m2 −1 Hλ2 +m2 −2 − Hλ2 −m2 −2
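Here, $1 \le m_1 < m_2 \le r$ counts the $\binom{r}{2}$ choices of rows used in these subdeterminants. We have already noted that $\psi_{12}(\lambda^1, \lambda^2) \in I_k^{\mathbb{Z}}$ when $\lambda^1 + \lambda^2 = k+1$, so it is natural to enquire if the same is true for general $m_1$ and $m_2$. To investigate this, we need to digress a little in order to derive a more amenable form for the $\psi_{m_1 m_2}(\lambda^1, \lambda^2)$ (see Eq. (4.4) below).

This derivation is an exercise in manipulating generating functions. Introducing parameters $t_1$ and $t_2$, we compute
$$\sum_{\lambda^1, \lambda^2 \in \mathbb{Z}} \psi_{m_1 m_2}(\lambda^1, \lambda^2)\, t_1^{\lambda^1} t_2^{\lambda^2} = \Bigl( \sum_{\lambda^1 \in \mathbb{Z}} H_{\lambda^1} t_1^{\lambda^1} \Bigr) \Bigl( \sum_{\lambda^2 \in \mathbb{Z}} H_{\lambda^2} t_2^{\lambda^2} \Bigr) \Bigl| t_j^{j-m_i} - t_j^{j+m_i} \Bigr|_{i,j=1}^{2}. \tag{4.3}$$
Denoting the determinant on the right by $A_{m_1 m_2}$, we form the generating function
$$\sum_{m_1, m_2 = 0}^{\infty} A_{m_1 m_2} z_1^{m_1} z_2^{m_2} = \left| \frac{\bigl(t_j^{j-1} - t_j^{j+1}\bigr) z_i}{(1 - t_j z_i)(1 - t_j^{-1} z_i)} \right|_{i,j=1}^{2}.$$
Applying Eq. (A.2) to this determinant gives
$$\begin{aligned} \sum_{m_1, m_2 = 0}^{\infty} A_{m_1 m_2} z_1^{m_1} z_2^{m_2} &= \bigl(1 - t_1^2\bigr)\bigl(t_2 - t_2^3\bigr) \frac{\Bigl| \bigl(t_j + t_j^{-1}\bigr)^{2-i} \Bigr|_{i,j=1}^{2} \Bigl| \bigl(z_i + z_i^{-1}\bigr)^{j-1} z_i^2 \Bigr|_{i,j=1}^{2}}{\prod_{i,j=1}^{2} (1 - t_j z_i)(1 - t_j^{-1} z_i)} \\ &= -A_{12} \begin{vmatrix} z_1^2 & z_1^3 + z_1 \\ z_2^2 & z_2^3 + z_2 \end{vmatrix} \prod_{i} \sum_{m_i \in \mathbb{Z}} h_{m_i}\bigl(t_1, t_1^{-1}, t_2, t_2^{-1}\bigr) z_i^{m_i}, \end{aligned}$$
where we recognize $A_{12} = (1 - t_1^2)(1 - t_2^2)(1 - t_1 t_2)(1 - t_1^{-1} t_2)$. Here, $h_m$ denotes the $m$-th complete symmetric polynomial in the $t_i$ and their inverses (to be distinguished from the $H_m$). It follows that $A_{12}$ is a factor of $A_{m_1 m_2}$:
$$A_{m_1 m_2} = A_{12} \begin{vmatrix} h_{m_2-2} & h_{m_2-1} + h_{m_2-3} \\ h_{m_1-2} & h_{m_1-1} + h_{m_1-3} \end{vmatrix}.$$
Fascinatingly, if we set $t_j = \exp(\eta_j)$, where $\eta_j$ denotes the usual orthogonal basis vectors for the weight space of Sp(4), then comparing with Eq. (A.4) gives
$$\frac{A_{m_1 m_2}}{A_{12}} = \chi^{\mathrm{Sp}(4)}_{(m_2-2)\eta_1+(m_1-1)\eta_2}.$$
This rather unexpected relation turns out to be extremely useful. For example, we can substitute it back into Eq. (4.3) to recover an expression for the original determinants:
$$\psi_{m_1 m_2}(\lambda^1, \lambda^2) = \sum_{\mu} \chi^{\mathrm{Spin}(2r+1)}_{(\lambda^1-\mu^1)\varepsilon_1+(\lambda^2-\mu^2)\varepsilon_2}. \tag{4.4}$$
Here, the sum is over the weights $\mu = \mu^1\eta_1 + \mu^2\eta_2$ of the irreducible Sp(4)-module of highest weight $(m_2-2)\eta_1 + (m_1-1)\eta_2$.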
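The factorization of $A_{m_1 m_2}$ derived above can be spot-checked with exact rational arithmetic. The following is only a sketch under stated assumptions: the function names (`complete_symmetric`, `a_det`, `sp4_det`) and the sample values $t_1 = 2$, $t_2 = 3$ are illustrative choices, not part of the paper.

```python
from fractions import Fraction


def complete_symmetric(values, max_degree):
    """Return [h_0, ..., h_max_degree] evaluated at the given values."""
    coeffs = [Fraction(1)] + [Fraction(0)] * max_degree
    for v in values:
        # Multiply the generating series by 1 / (1 - v z).
        for d in range(1, max_degree + 1):
            coeffs[d] += v * coeffs[d - 1]
    return coeffs


def h_at(h, m):
    """h_m with the convention h_m = 0 for m < 0."""
    return h[m] if m >= 0 else Fraction(0)


def a_det(m1, m2, t1, t2):
    """The 2 x 2 determinant |t_j^(j - m_i) - t_j^(j + m_i)| of Eq. (4.3)."""
    t, m = (t1, t2), (m1, m2)
    e = [[t[j] ** (j + 1 - m[i]) - t[j] ** (j + 1 + m[i]) for j in range(2)]
         for i in range(2)]
    return e[0][0] * e[1][1] - e[0][1] * e[1][0]


def sp4_det(m1, m2, h):
    """The 2 x 2 determinant in h_m that multiplies A_12."""
    return (h_at(h, m2 - 2) * (h_at(h, m1 - 1) + h_at(h, m1 - 3))
            - (h_at(h, m2 - 1) + h_at(h, m2 - 3)) * h_at(h, m1 - 2))


# Illustrative sample point (an assumption, any generic rationals would do).
t1, t2 = Fraction(2), Fraction(3)
h = complete_symmetric([t1, 1 / t1, t2, 1 / t2], 6)
a12 = a_det(1, 2, t1, t2)
closed_form = (1 - t1**2) * (1 - t2**2) * (1 - t1 * t2) * (1 - t2 / t1)
factorization_holds = all(
    a_det(m1, m2, t1, t2) == a12 * sp4_det(m1, m2, h)
    for m1 in range(1, 5) for m2 in range(m1 + 1, 6)
)
print(a12 == closed_form, factorization_holds)
```

A numerical check at one point is of course not a proof, but it does guard against sign and index slips in the determinants.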
Recall that the fusion ideal is generated by the characters $\chi_\lambda$ with $(\lambda, \theta) = k+1$ and, if $k$ is even, by the same set augmented by the $\chi_\lambda + \chi_{\lambda-\theta}$ with $(\lambda, \theta) = k+2$ and $\lambda_1 = \lambda_r = 0$. We have seen that when the characters correspond to tensor representations, the generators of the first type may be expressed as a $\mathbb{Z}[\chi_1, \ldots, \chi_r]$-linear combination of the $\psi_{m_1 m_2}(\lambda^1, \lambda^2)$, with $\lambda^1 + \lambda^2 = k+1$. Since $\theta = \varepsilon_1 + \varepsilon_2$, it follows that the Jacobi–Trudy determinants for $\chi_\lambda$ and $\chi_{\lambda-\theta}$ will be identical in columns $3, \ldots, r$. Therefore, the generators of the second type (which always correspond to tensor representations) may be expressed as a $\mathbb{Z}[\chi_1, \ldots, \chi_r]$-linear combination of the $\psi_{m_1 m_2}(\lambda^1, \lambda^2) + \psi_{m_1 m_2}(\lambda^1 - 1, \lambda^2 - 1)$, with $\lambda^1 + \lambda^2 = k+2$. Indeed, $\lambda_1 = 0$ implies that $\lambda^1 = \lambda^2$, so the generators of the second type can all be expressed in terms of the elements $\psi_{m_1 m_2}\bigl(\tfrac{k}{2}+1, \tfrac{k}{2}+1\bigr) + \psi_{m_1 m_2}\bigl(\tfrac{k}{2}, \tfrac{k}{2}\bigr)$.

Consider now a single Spin(2r + 1)-character in the sum of Eq. (4.4), labeled by the weight $(\lambda^1 - \mu^1)\varepsilon_1 + (\lambda^2 - \mu^2)\varepsilon_2$, with $\lambda^1 + \lambda^2 = k+1$. We can pair it with the character labeled by the weight $(\lambda^1 + \mu^2)\varepsilon_1 + (\lambda^2 + \mu^1)\varepsilon_2$, its image under the fundamental affine Weyl reflection $\hat{w}_0$. If this character is also (always) in the sum, then we can conclude that the right-hand side of Eq. (4.4) belongs to $I_k^{\mathbb{Z}}$, that is, $\psi_{m_1 m_2}(\lambda^1, \lambda^2) \in I_k^{\mathbb{Z}}$. But this follows immediately from the fact that the transformation $-\mu^1\eta_1 - \mu^2\eta_2 \to \mu^2\eta_1 + \mu^1\eta_2$ is precisely the action of the Sp(4)-Weyl reflection about the (short) root $\eta_1 + \eta_2$. Since the sum in Eq. (4.4) is over the weights of an Sp(4)-representation, which is invariant under this (indeed, any) Sp(4)-Weyl reflection, it is clear that $\psi_{m_1 m_2}(\lambda^1, \lambda^2) \in I_k^{\mathbb{Z}}$ (when $\lambda^1 + \lambda^2 = k+1$). More generally, an almost identical argument shows that $\psi_{m_1 m_2}\bigl(\tfrac{k}{2}+1, \tfrac{k}{2}+1\bigr) + \psi_{m_1 m_2}\bigl(\tfrac{k}{2}, \tfrac{k}{2}\bigr) \in I_k^{\mathbb{Z}}$.
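The pairing of weights used in this argument can be illustrated numerically. The sketch below assumes the standard shifted action of the fundamental affine reflection, $\nu \mapsto \nu - \bigl((\nu + \rho, \theta) - (k + h^\vee)\bigr)\theta$, with $h^\vee = 2r - 1$ and $\rho^j = r + \tfrac{1}{2} - j$ for Spin(2r + 1). The function name `affine_reflection` and the sample values of $r$, $k$, $\lambda$ and $\mu$ are hypothetical.

```python
from fractions import Fraction


def affine_reflection(weight, k):
    """Shifted action of the fundamental affine Weyl reflection (assumed form).

    `weight` lists components in the orthonormal basis; r = len(weight).
    """
    r = len(weight)
    dual_coxeter = 2 * r - 1
    # rho^j = r + 1/2 - j for j = 1, ..., r
    rho = [Fraction(2 * r + 1, 2) - j for j in range(1, r + 1)]
    # theta = e_1 + e_2, so (weight + rho, theta) only sees two components.
    pairing = weight[0] + rho[0] + weight[1] + rho[1]
    shift = pairing - (k + dual_coxeter)
    image = list(weight)
    image[0] -= shift
    image[1] -= shift
    return image


# Sample data (illustrative): r = 4, k = 7, lambda^1 + lambda^2 = k + 1.
k = 7
lam = (5, 3)
mu = (2, -1)
start = [lam[0] - mu[0], lam[1] - mu[1], 0, 0]
image = affine_reflection(start, k)
expected = [lam[0] + mu[1], lam[1] + mu[0], 0, 0]
print(image == expected, affine_reflection(image, k) == start)
```

The second comparison simply confirms that the map is an involution, as a reflection should be.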
It follows that the generators of $I_k^{\mathbb{Z}}$ that correspond to tensor representations can be replaced by
$$\psi_{m_1 m_2}(\lambda^1, \lambda^2), \quad \lambda^1 + \lambda^2 = k+1, \qquad \text{and} \qquad \psi_{m_1 m_2}\Bigl(\frac{k}{2}+1, \frac{k}{2}+1\Bigr) + \psi_{m_1 m_2}\Bigl(\frac{k}{2}, \frac{k}{2}\Bigr) \quad \text{if $k$ is even},$$
where $1 \le m_1 < m_2 \le r$.

The story for the spinor representations ($\lambda^j$ half-integral) is much the same. Using the appropriate Jacobi–Trudy identity, Eq. (A.6), we find that the $\chi_\lambda$ are $\mathbb{Z}[\chi_1, \ldots, \chi_r]$-linear combinations of the subdeterminants
$$\varphi_{m_1 m_2}(\lambda^1, \lambda^2) = \chi_r \begin{vmatrix} H_{\lambda^1+m_1-\frac{3}{2}} - H_{\lambda^1-m_1-\frac{1}{2}} & H_{\lambda^2+m_1-\frac{5}{2}} - H_{\lambda^2-m_1-\frac{3}{2}} \\ H_{\lambda^1+m_2-\frac{3}{2}} - H_{\lambda^1-m_2-\frac{1}{2}} & H_{\lambda^2+m_2-\frac{5}{2}} - H_{\lambda^2-m_2-\frac{3}{2}} \end{vmatrix}.$$
Constructing generating functions as before, one can prove that
$$\varphi_{m_1 m_2}(\lambda^1, \lambda^2) = \sum_{\nu} \chi^{\mathrm{Spin}(2r+1)}_{(\lambda^1-\nu^1-\frac{1}{2})\varepsilon_1+(\lambda^2-\nu^2-\frac{1}{2})\varepsilon_2+\Lambda_r}, \tag{4.5}$$
where this sum is over the weights $\nu = \nu^1\zeta_1 + \nu^2\zeta_2$ of the irreducible Spin(5)-module of highest weight $(m_2-2)\zeta_1 + (m_1-1)\zeta_2$ (and the $\zeta_i$ are the usual orthonormal basis vectors for this weight space). As before, it now follows quickly from the fact that $\zeta_1 + \zeta_2$ is a root of Spin(5) that $\varphi_{m_1 m_2}(\lambda^1, \lambda^2) \in I_k^{\mathbb{Z}}$.

These manipulations for the tensor and spinor representations finally prove that the fusion ideal has the following generators:
$$k \text{ odd}: \quad I_k^{\mathbb{Z}} = \bigl\langle \psi_{m_1 m_2}(\lambda^1, \lambda^2),\ \varphi_{m_1 m_2}(\lambda^1, \lambda^2) : \lambda^1 + \lambda^2 = k+1,\ 1 \le m_1 < m_2 \le r \bigr\rangle, \tag{4.6}$$
$$k \text{ even}: \quad I_k^{\mathbb{Z}} = \Bigl\langle \psi_{m_1 m_2}(\lambda^1, \lambda^2),\ \psi_{m_1 m_2}\Bigl(\frac{k}{2}+1, \frac{k}{2}+1\Bigr) + \psi_{m_1 m_2}\Bigl(\frac{k}{2}, \frac{k}{2}\Bigr),\ \varphi_{m_1 m_2}(\lambda^1, \lambda^2) : \lambda^1 + \lambda^2 = k+1,\ 1 \le m_1 < m_2 \le r \Bigr\rangle. \tag{4.7}$$
Since $\lambda^1 \ge \lambda^2$ are integers and half-integers in the $\psi_{m_1 m_2}$ and $\varphi_{m_1 m_2}$, respectively, it follows that the number of generators in this set is of the order of $k\binom{r}{2}$. This compares favorably with the set of generators given in Eq. (4.1), whose number is of the order $k^{r-1}$, though perhaps not with the expectation that we could reduce the number of generators to $r$. Finally, we note that other sets of generators can be deduced from this one, in particular by using the method of 1's. We leave this as an exercise for the enthusiastic reader.
5. Discussion and Conclusions

In this paper, we have attempted to give a complete account of our understanding regarding explicit, representation-theoretic presentations of the fusion rings and algebras associated to the Wess–Zumino–Witten models over the compact, connected and simply-connected (simple) Lie groups. We have discussed presentations in terms of fusion potentials, and have provided complete proofs of the fact that there are explicitly known potentials which correctly describe the fusion algebras of the models over SU(r + 1) and Sp(2r). These potentials appear to have been guessed in an educated manner. We hope that our proofs will complement what has already appeared in the literature, and will be useful for subsequent studies. We have also proven that the fusion algebras of the other groups cannot be described by potentials analogous to those known, which explains why attempts to guess these potentials have not been successful.

We recalled that it is the fusion ring, rather than the fusion algebra, which is of physical interest in applications. Despite the fact that the fusion ring is torsion-free, we noted that a presentation for the fusion algebra need not give a presentation of the fusion ring. To overcome this, we have stated and proved a fairly elementary result (Proposition 3.1) giving an explicit presentation (that is easily constructed) of the fusion ring in all cases. We believe that this is the first time such a presentation has been formulated. It is in terms of (linear combinations of) irreducible characters, and so should be regarded as representation-theoretic in the strongest possible sense.
These general presentations have one rather obvious disadvantage in that the number of characters appearing is quite large. Whilst easy to write down, these presentations nevertheless contain quite a bit of complexity. However, we have seen that it is sometimes possible to express the relevant characters in terms of simpler characters, and so reduce the number of characters that appear. In particular, we have used the well-known determinantal identities for the characters of SU(r + 1) and Sp(2r) to derive the fusion potentials from first principles. An important corollary to our results is then that these fusion potentials correctly describe the fusion rings of the SU(r + 1) and Sp(2r) models.

We then extended this result to the Spin(2r + 1) models. The corresponding determinantal identities for the characters did not lead to as nice a simplification as before (in particular, we did not end up with a potential description), but the result, Eq. (4.7), is still relatively concise. To the best of our knowledge, this is the first rigorous representation-theoretic presentation of the fusion ideal (over $\mathbb{C}$ or $\mathbb{Z}$) for these Wess–Zumino–Witten models. Nonetheless, this presentation is not as concise as we would like for the concrete applications we have in mind. Certainly, for our motivating application to D-brane charge groups, our result allows us to write down an explicit form for this group.$^{f}$ However, we have been unable to substantially simplify this formula, so as to rigorously prove the result conjectured in [10]. We have checked that this result is numerically consistent (to high level) with the generators presented here.

We expect that this result can also be extended to the Spin(2r) models. However, we have not done so for two reasons. First, as mentioned in Appendix A.4, the derivation of the appropriate determinantal identities requires a slightly more general approach than what we have been using. It follows that the methods we applied in analyzing the Spin(2r + 1) case will require an analogous generalization. However, we believe that this generalization should follow easily from the methods used in [27]. Our second reason is that, as with the Spin(2r + 1) case, we do not expect to get as simple a presentation as we would like.

We feel that the root of this is the observation that determinants are not particularly well suited to computations when the Weyl group is not a symmetric group. A far more elegant approach would be to generalize the algebra of determinants to the other Weyl groups, and then derive “generalized determinantal identities” for the Lie group characters in terms of Weyl-symmetric polynomials. It would be very interesting to see if such an approach can be constructed (if it has not already been), and we envisage that it may lead to more satisfactory fusion ring presentations. We hope to return to this in the future.

$^{f}$ The charge group has the form $\mathbb{Z}_x^{2^{r-2}}$ [32], and we can determine $x$ to be the greatest common divisor of the integers obtained by evaluating the fusion ideal generators at the origin of the weight space. With respect to Eq. (4.7), this amounts to replacing the complete symmetric polynomials $H_m(q, 1, q^{-1})$ by $\binom{m+2r}{2r}$ (and then finding the greatest common divisor).
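The substitution described in footnote f rests on the fact that a complete symmetric polynomial of degree $m$ in $2r + 1$ variables, evaluated with every variable set to 1, counts monomials and so equals $\binom{m+2r}{2r}$. A minimal sketch of this check follows; the helper name `complete_symmetric_at_ones` and the tested ranges of $r$ and $m$ are illustrative assumptions.

```python
from math import comb


def complete_symmetric_at_ones(num_variables, max_degree):
    """Return [H_0, ..., H_max_degree] with every variable set to 1."""
    coeffs = [1] + [0] * max_degree
    for _ in range(num_variables):
        # Multiply the generating series by 1 / (1 - z).
        for d in range(1, max_degree + 1):
            coeffs[d] += coeffs[d - 1]
    return coeffs


# Compare with the binomial coefficient for a few ranks r (illustrative range).
agreement = all(
    complete_symmetric_at_ones(2 * r + 1, 8)[m] == comb(m + 2 * r, 2 * r)
    for r in range(2, 6) for m in range(9)
)
print(agreement)
```

Computing $x$ itself would additionally require evaluating the determinants in Eq. (4.7) with these integers and taking a greatest common divisor, which is not attempted here.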
Acknowledgments

P.B. is financially supported by the Australian Research Council, and D.R. would like to thank the Australian National University for a visiting fellowship during this project. We would also like to thank Arzu Boysal, Volker Braun and Howard Schnitzer for helpful and stimulating correspondence.

Appendix A. Determinantal Identities of Jacobi–Trudy Type

In this section, some formulae are presented, expressing the irreducible characters of the classical groups in terms of determinants of matrices whose entries are relatively simple characters. These formulae, which we will call Jacobi–Trudy identities, are well known for the groups SU(r + 1), Sp(2r), SO(2r + 1), and O(2r), and may be found in [25, 27]. We are not aware of a reference for the corresponding formulae for the spinor representations of Spin(2r + 1) or Spin(2r), nor for the tensor representations of the latter which are not restrictions of O(2r) representations. We therefore indicate how Jacobi–Trudy identities for these cases may be derived, following the “transcendental” method of Weyl.

The transcendental method relies on Weyl's character formula [25]:
$$\chi_\lambda = \frac{A_{\lambda+\rho}}{A_\rho}, \qquad \text{where} \qquad A_\lambda = \sum_{w \in W} \det w \; e^{w(\lambda)},$$
and an identity of Cauchy [27]:
$$\left| \frac{1}{1 - x_i y_j} \right|_{i,j=1}^{k} = \frac{\bigl| x_i^{k-j} \bigr|_{i,j=1}^{k} \bigl| y_j^{k-i} \bigr|_{i,j=1}^{k}}{\prod_{i,j=1}^{k} (1 - x_i y_j)}. \tag{A.1}$$
Here, $|a_{ij}|_{i,j=1}^{k}$ denotes the determinant of the $k \times k$ matrix with entries $a_{ij}$. An alternative form of Cauchy's identity is obtained by replacing $y_j$ by $y_j^{-1}$ and multiplying through:
$$\left| \frac{1}{y_j - x_i} \right|_{i,j=1}^{k} = \frac{\bigl| x_i^{k-j} \bigr|_{i,j=1}^{k} \bigl| y_j^{i-1} \bigr|_{i,j=1}^{k}}{\prod_{i,j=1}^{k} (y_j - x_i)}.$$
We will often apply this in the form
$$\left| \frac{t_j}{(1 - q_i t_j)(1 - q_i^{-1} t_j)} \right|_{i,j=1}^{k} = \frac{\Bigl| \bigl(q_i + q_i^{-1}\bigr)^{k-j} \Bigr|_{i,j=1}^{k} \Bigl| \bigl(t_j + t_j^{-1}\bigr)^{i-1} t_j^{k} \Bigr|_{i,j=1}^{k}}{\prod_{i,j=1}^{k} (1 - q_i t_j)(1 - q_i^{-1} t_j)}, \tag{A.2}$$
obtained by putting $x_i = q_i + q_i^{-1}$ and $y_j = t_j + t_j^{-1}$.
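Both forms of Cauchy's identity can be checked at sample rational points. The sketch below uses exact `Fraction` arithmetic and a brute-force determinant; the function names (`det`, `cauchy_a1`, `cauchy_a2`) and the sample values are illustrative assumptions, and agreement at a few points is a sanity check rather than a proof.

```python
from fractions import Fraction
from itertools import permutations


def det(matrix):
    """Determinant via the Leibniz expansion (adequate for small k)."""
    k = len(matrix)
    total = Fraction(0)
    for perm in permutations(range(k)):
        # The sign of the permutation is fixed by its inversion count.
        inversions = sum(
            1 for a in range(k) for b in range(a + 1, k) if perm[a] > perm[b]
        )
        term = Fraction((-1) ** inversions)
        for row, col in enumerate(perm):
            term *= matrix[row][col]
        total += term
    return total


def cauchy_a1(x, y):
    """Return (lhs, rhs) of Eq. (A.1); indices are shifted to 0-based."""
    k = len(x)
    lhs = det([[1 / (1 - x[i] * y[j]) for j in range(k)] for i in range(k)])
    num_x = det([[x[i] ** (k - 1 - j) for j in range(k)] for i in range(k)])
    num_y = det([[y[j] ** (k - 1 - i) for j in range(k)] for i in range(k)])
    den = Fraction(1)
    for xi in x:
        for yj in y:
            den *= 1 - xi * yj
    return lhs, num_x * num_y / den


def cauchy_a2(q, t):
    """Return (lhs, rhs) of Eq. (A.2); indices are shifted to 0-based."""
    k = len(q)

    def factor(qi, tj):
        return (1 - qi * tj) * (1 - tj / qi)

    lhs = det([[t[j] / factor(q[i], t[j]) for j in range(k)] for i in range(k)])
    num_q = det([[(q[i] + 1 / q[i]) ** (k - 1 - j) for j in range(k)]
                 for i in range(k)])
    num_t = det([[(t[j] + 1 / t[j]) ** i * t[j] ** k for j in range(k)]
                 for i in range(k)])
    den = Fraction(1)
    for qi in q:
        for tj in t:
            den *= factor(qi, tj)
    return lhs, num_q * num_t / den


# Sample points (illustrative; chosen so that no denominator vanishes).
x = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)]
y = [Fraction(1, 7), Fraction(2, 3), Fraction(3, 4)]
q = [Fraction(2), Fraction(3), Fraction(5)]
t = [Fraction(1, 7), Fraction(1, 11), Fraction(2, 13)]
lhs1, rhs1 = cauchy_a1(x, y)
lhs2, rhs2 = cauchy_a2(q, t)
print(lhs1 == rhs1, lhs2 == rhs2)
```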
A.1. SU(r + 1)

The Weyl group is $S_{r+1}$, acting as permutations on the weights $\varepsilon_i$ of the defining representation. We put $q_i = e^{\varepsilon_i}$, so $q_1 \cdots q_{r+1} = 1$, and write $\lambda = \sum_{i=1}^{r} \lambda^i \varepsilon_i$, with $\lambda^{r+1} = 0$ (in particular, $\rho^j = r+1-j$). Then, $\lambda^1 \ge \lambda^2 \ge \cdots \ge \lambda^{r+1} = 0$ are all integers, and
$$A_\lambda = \Bigl| q_i^{\lambda^j} \Bigr|_{i,j=1}^{r+1}.$$
We would like to emphasize that the $\lambda^j$ are to be distinguished from the Dynkin labels, which we denote by $\lambda_j$. We form a generating function and apply Cauchy's identity, Eq. (A.1):
$$\sum_{\lambda^1, \ldots, \lambda^{r+1} = 0}^{\infty} A_\lambda\, t_1^{\lambda^1} \cdots t_{r+1}^{\lambda^{r+1}} = \Bigl| \sum_{\lambda^j = 0}^{\infty} q_i^{\lambda^j} t_j^{\lambda^j} \Bigr|_{i,j=1}^{r+1} = \left| \frac{1}{1 - q_i t_j} \right|_{i,j=1}^{r+1} = \frac{\bigl| q_i^{r+1-j} \bigr|_{i,j=1}^{r+1} \bigl| t_j^{r+1-i} \bigr|_{i,j=1}^{r+1}}{\prod_{i,j} (1 - q_i t_j)}.$$
We recognize $A_\rho$ in the numerator, and expand the denominator in terms of complete symmetric polynomials $H_m(q)$ in the $q_i$. We then get
$$\sum_{\lambda^1, \ldots, \lambda^{r+1} = 0}^{\infty} \frac{A_\lambda}{A_\rho}\, t_1^{\lambda^1} \cdots t_{r+1}^{\lambda^{r+1}} = \Bigl| t_j^{r+1-i} \Bigr|_{i,j=1}^{r+1} \prod_{j} \sum_{m_j \in \mathbb{Z}} H_{m_j}(q)\, t_j^{m_j}.$$
Bringing the symmetric polynomials into the determinant, changing the summation variables so that the power of $t_j$ is $\lambda^j + \rho^j$, and then bringing the $t_j$ out of the determinant finally gives the original Jacobi–Trudy identity:
$$\chi_\lambda = \bigl| H_{\lambda^j+i-j}(q) \bigr|_{i,j=1}^{r+1}. \tag{A.3}$$
Note that applying this formula to $\lambda = m\Lambda_1 = m\varepsilon_1$ gives $H_m(q) = \chi_{m\Lambda_1}$.

A.2. Sp(2r)

This time the Weyl group is $S_r \ltimes \mathbb{Z}_2^r$, acting on the weights $\pm\varepsilon_i$ of the defining representation by permutation ($S_r$) and sign flips (each $\mathbb{Z}_2$ negates one of the $\varepsilon_i$ whilst leaving the others invariant). With $\lambda = \sum_i \lambda^i \varepsilon_i$, so $\rho^j = r+1-j$, we find
$$A_\lambda = \Bigl| q_i^{\lambda^j} - q_i^{-\lambda^j} \Bigr|_{i,j=1}^{r}.$$
Here, $\lambda^1 \ge \lambda^2 \ge \cdots \ge \lambda^r \ge 0$ are all integers. What follows is very similar to Appendix A.1, so the details are left to the reader. The generating function this time gives the left-hand side of Eq. (A.2), up to a product $\prod_i (q_i - q_i^{-1})$. After applying the alternative form of Cauchy's identity, this product combines with the $q$-determinant so obtained to give $A_\rho$. From there, the story is as before, and we find that
$$\chi_\lambda = \begin{vmatrix} H_{\lambda^j+1-j}(q, q^{-1}) \\ H_{\lambda^j+i-j}(q, q^{-1}) + H_{\lambda^j+2-i-j}(q, q^{-1}) \end{vmatrix}_{i,j=1}^{r}. \tag{A.4}$$
In this equation, the top entry of the matrix should be understood to describe the elements of row $i = 1$, and the bottom entry describes the rows $i > 1$. The complete symmetric functions are in the $q_i$ and their inverses. Note that $H_m(q, q^{-1}) = \chi_{m\Lambda_1}$.

A.3. Spin(2r + 1)

The Weyl group is again $S_r \ltimes \mathbb{Z}_2^r$, acting on the non-zero weights $\pm\varepsilon_i$ of the defining representation as in the Sp(2r) case. Therefore, we again find that
$$A_\lambda = \Bigl| q_i^{\lambda^j} - q_i^{-\lambda^j} \Bigr|_{i,j=1}^{r},$$
where $\lambda = \sum_i \lambda^i \varepsilon_i$, and $\lambda^1 \ge \lambda^2 \ge \cdots \ge \lambda^r \ge 0$. In contrast to the Sp(2r) case, the $\lambda^i$ can either be all integers (corresponding to a representation of SO(2r + 1), also called a tensor representation) or all half-integers (a spinor representation). Indeed, $\rho^j = r + \tfrac{1}{2} - j$. If we form a generating function with $\lambda^j$ integral, Eq. (A.2) gives
$$\sum_{\lambda^1, \ldots, \lambda^r = 0}^{\infty} A_\lambda\, t_1^{\lambda^1} \cdots t_r^{\lambda^r} = \prod_i \bigl(q_i - q_i^{-1}\bigr) \cdot \frac{\Bigl| \bigl(q_i + q_i^{-1}\bigr)^{r-j} \Bigr| \Bigl| \bigl(t_j + t_j^{-1}\bigr)^{i-1} t_j^{r} \Bigr|}{\prod_{i,j} (1 - q_i t_j)(1 - q_i^{-1} t_j)}.$$
Recognizing that $A_\rho$ factors as $\prod_i \bigl(q_i^{1/2} - q_i^{-1/2}\bigr) \cdot \bigl| (q_i + q_i^{-1})^{r-j} \bigr|$, and proceeding as usual gives
$$\chi_\lambda = \prod_i \bigl(q_i^{1/2} + q_i^{-1/2}\bigr) \cdot \begin{vmatrix} H_{\lambda^j+\frac{1}{2}-j}(q, q^{-1}) \\ H_{\lambda^j-\frac{1}{2}+i-j}(q, q^{-1}) + H_{\lambda^j+\frac{3}{2}-i-j}(q, q^{-1}) \end{vmatrix}_{i,j=1}^{r}. \tag{A.5}$$
Note that because the $\rho^j$ are half-integers, this describes the characters of the spinor representations. Note also that $\chi_r \equiv \chi_{\Lambda_r} = \prod_i \bigl(q_i^{1/2} + q_i^{-1/2}\bigr)$. Finally, as the defining representation has a zero weight, it may be more convenient to express this result in terms of the complete symmetric polynomials in the $q_i$, their inverses, and 1. This gives the Jacobi–Trudy identity for the spinor representations of Spin(2r + 1):
$$\chi_\lambda = \chi_r \Bigl| H_{\lambda^j-\frac{1}{2}+i-j}(q, 1, q^{-1}) - H_{\lambda^j+\frac{1}{2}-i-j}(q, 1, q^{-1}) \Bigr|_{i,j=1}^{r}. \tag{A.6}$$
Forming the generating function with $\lambda^j$ half-integral then gives the Spin(2r + 1) Jacobi–Trudy identity for the tensor representations. The manipulations are straightforward, and give
$$\chi_\lambda = \Bigl| H_{\lambda^j+i-j}(q, 1, q^{-1}) - H_{\lambda^j-i-j}(q, 1, q^{-1}) \Bigr|_{i,j=1}^{r}. \tag{A.7}$$
Note that $\chi_{m\Lambda_1} = H_m(q, 1, q^{-1}) - H_{m-2}(q, 1, q^{-1})$, so $H_m(q, 1, q^{-1}) = \chi_{m\Lambda_1} + \chi_{(m-2)\Lambda_1} + \cdots$.
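The closing relation $\chi_{m\Lambda_1} = H_m(q, 1, q^{-1}) - H_{m-2}(q, 1, q^{-1})$ can be compared against Weyl's character formula in the smallest case $r = 2$. The sketch below is an illustration under stated assumptions: it writes $q_i = s_i^2$ with rational $s_i$ so that the half-integral exponents coming from $\rho$ stay rational, and the function names and sample point $s = (2, 3)$ are hypothetical.

```python
from fractions import Fraction


def complete_symmetric(values, max_degree):
    """Return [H_0, ..., H_max_degree] evaluated at the given values."""
    coeffs = [Fraction(1)] + [Fraction(0)] * max_degree
    for v in values:
        # Multiply the generating series by 1 / (1 - v z).
        for d in range(1, max_degree + 1):
            coeffs[d] += v * coeffs[d - 1]
    return coeffs


def alternant_b2(mu, s):
    """The determinant |q_i^{mu_j} - q_i^{-mu_j}| for r = 2, with q_i = s_i**2."""
    rows = [[si ** int(2 * m) - si ** int(-2 * m) for m in mu] for si in s]
    return rows[0][0] * rows[1][1] - rows[0][1] * rows[1][0]


def weyl_character_b2(lam, s):
    """Weyl's character formula A_{lam + rho} / A_rho for r = 2."""
    rho = (Fraction(3, 2), Fraction(1, 2))  # rho^j = r + 1/2 - j
    shifted = tuple(l + p for l, p in zip(lam, rho))
    return alternant_b2(shifted, s) / alternant_b2(rho, s)


# Sample point (illustrative): q = (4, 9).
s = (Fraction(2), Fraction(3))
q = [si ** 2 for si in s]
H = complete_symmetric(q + [Fraction(1)] + [1 / qi for qi in q], 6)
checks = [
    weyl_character_b2((m, 0), s) == H[m] - (H[m - 2] if m >= 2 else 0)
    for m in range(7)
]
print(all(checks))
```

Here the weight $m\Lambda_1$ is entered as $(m, 0)$ in the orthonormal basis, matching the conventions of this appendix.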
Finally, if we compare Eq. (A.5) with Eq. (A.4), we find that we have established a strange relationship between the characters of the spinor representations of Spin(2r + 1) and those of Sp(2r). This is perhaps best written in the following form, where $\lambda$ labels a tensor representation:
$$\chi^{\mathrm{Spin}(2r+1)}_{\lambda+\Lambda_r} = \chi^{\mathrm{Spin}(2r+1)}_{\Lambda_r}\, \chi^{\mathrm{Sp}(2r)}_{\lambda}.$$
(Of course, this has to be interpreted appropriately.) Evaluating the characters at 0 to get the dimensions of the corresponding representations gives an identity of [33]. Interestingly, it is claimed there that this identity cannot hold at the level of characters.

A.4. Spin(2r)

The Weyl group is $S_r \ltimes \mathbb{Z}_2^{r-1}$, acting on the weights $\pm\varepsilon_i$ of the defining representation as in the Sp(2r) case, except that the $\mathbb{Z}_2^{r-1}$ factor corresponds to transformations where an even number of the $\varepsilon_i$ are negated and the rest are left invariant. Therefore,
$$2A_\lambda = \Bigl| q_i^{\lambda^j} + q_i^{-\lambda^j} \Bigr|_{i,j=1}^{r} + \Bigl| q_i^{\lambda^j} - q_i^{-\lambda^j} \Bigr|_{i,j=1}^{r}, \tag{A.8}$$
where $\lambda = \sum_i \lambda^i \varepsilon_i$, and $\lambda^1 \ge \lambda^2 \ge \cdots \ge \lambda^{r-1} \ge |\lambda^r|$. As in the previous case, we have tensor representations ($\lambda^i \in \mathbb{Z}$) and spinor representations ($\lambda^i \in \mathbb{Z} + \tfrac{1}{2}$). A non-trivial Dynkin diagram symmetry (for $r > 4$ this is the only such symmetry) acts via $\lambda^r \to -\lambda^r$, so representations with $\lambda^r = 0$ will be referred to as symmetric.$^{g}$ Symmetric representations correspond to representations of O(2r), and it is clear that for these representations, the second term in the above formula for $A_\lambda$ vanishes. Note that $\rho^j = r - j$ defines a symmetric tensor representation:
$$A_\rho = \frac{1}{2} \Bigl| q_i^{r-j} + q_i^{-(r-j)} \Bigr|_{i,j=1}^{r} = \Bigl| \bigl(q_i + q_i^{-1}\bigr)^{r-j} \Bigr|_{i,j=1}^{r}.$$

$^{g}$ For $r$ odd, this symmetry is conjugation, so symmetric coincides with self-conjugate. However, for $r$ even, the conjugation automorphism is trivial.

Since $\rho$ is tensor, forming a generating function with each $\lambda^j$ half-integral and positive gives an identity for spinor representations. The derivation of this identity should by now be an easy exercise for the reader. It is:
$$\begin{aligned} 2\chi_\lambda &= \prod_i \bigl(q_i^{1/2} + q_i^{-1/2}\bigr) \cdot \Bigl| H_{\lambda^j-\frac{1}{2}+i-j}(q, q^{-1}) - H_{\lambda^j+\frac{1}{2}-i-j}(q, q^{-1}) \Bigr|_{i,j=1}^{r} \\ &\quad + \prod_i \bigl(q_i^{1/2} - q_i^{-1/2}\bigr) \cdot \Bigl| H_{\lambda^j-\frac{1}{2}+i-j}(q, q^{-1}) + H_{\lambda^j+\frac{1}{2}-i-j}(q, q^{-1}) \Bigr|_{i,j=1}^{r}. \end{aligned}$$
Setting all $\lambda^i = \tfrac{1}{2}$ gives $2\chi_{r-1} = \prod_i \bigl(q_i^{1/2} + q_i^{-1/2}\bigr) + \prod_i \bigl(q_i^{1/2} - q_i^{-1/2}\bigr)$. As we assumed $\lambda^i \ge \tfrac{1}{2}$ when computing the generating function, this formula cannot be applied to $\chi_r$ directly. Instead, it is determined from $\chi_{r-1}$ by applying the Dynkin symmetry $q_r \to q_r^{-1}$ (this symmetry has the effect of changing the sign of the second term in the above equation). Thus, $2\chi_r = \prod_i \bigl(q_i^{1/2} + q_i^{-1/2}\bigr) - \prod_i \bigl(q_i^{1/2} - q_i^{-1/2}\bigr)$, leading to the Spin(2r) Jacobi–Trudy identity for spinor representations:
$$\begin{aligned} \chi_\lambda &= \frac{1}{2} (\chi_{r-1} + \chi_r) \Bigl| H_{|\lambda^j|-\frac{1}{2}+i-j}(q, q^{-1}) - H_{|\lambda^j|+\frac{1}{2}-i-j}(q, q^{-1}) \Bigr|_{i,j=1}^{r} \\ &\quad \pm \frac{1}{2} (\chi_{r-1} - \chi_r) \Bigl| H_{|\lambda^j|-\frac{1}{2}+i-j}(q, q^{-1}) + H_{|\lambda^j|+\frac{1}{2}-i-j}(q, q^{-1}) \Bigr|_{i,j=1}^{r}. \end{aligned} \tag{A.9}$$
The $\pm$ appearing here reflects the sign of $\lambda^r$. Of course, the absolute values appearing in $|\lambda^j|$ are only necessary for $j = r$.

The corresponding derivation for tensor representations is somewhat unique in that Weyl's transcendental method does not seem to be directly applicable to the first term in Eq. (A.8). Instead, we have to resort to the algebraic method (see [27]). Weyl's method has no problem with the second term, so this hybrid gives the Spin(2r) Jacobi–Trudy identity for tensor representations:
$$\begin{aligned} \lambda^r = 0: \quad \chi_\lambda &= \Bigl| H_{\lambda^j+i-j}(q, q^{-1}) - H_{\lambda^j-i-j}(q, q^{-1}) \Bigr|_{i,j=1}^{r}, \\ \lambda^r \neq 0: \quad \chi_\lambda &= \frac{1}{2} \Bigl| H_{|\lambda^j|+i-j}(q, q^{-1}) - H_{|\lambda^j|-i-j}(q, q^{-1}) \Bigr|_{i,j=1}^{r} \\ &\quad \pm \frac{1}{2} \bigl(\chi_{r-1}^2 - \chi_r^2\bigr) \begin{vmatrix} H_{|\lambda^j|-j}(q, q^{-1}) \\ H_{|\lambda^j|-1+i-j}(q, q^{-1}) + H_{|\lambda^j|+1-i-j}(q, q^{-1}) \end{vmatrix}_{i,j=1}^{r}. \end{aligned} \tag{A.10}$$
Again, the $\pm$ reflects the sign of $\lambda^r$ and correlates with the application of the Dynkin symmetry $q_r \to q_r^{-1}$. Note that $\chi_{m\Lambda_1} = H_m(q, q^{-1}) - H_{m-2}(q, q^{-1})$. We also note that $\chi_{r-1}^2 - \chi_r^2 = \chi_{2\Lambda_{r-1}} - \chi_{2\Lambda_r}$.

A.5. Further remarks

Comparing these Spin(2r) identities to those derived for the other groups, we note two novelties. One is the fact that two determinants are generally required, and the second is that explicit factors of $\tfrac{1}{2}$ appear (in spite of the fact that the right-hand side must be a polynomial in the fundamental characters with integral coefficients). These novelties are direct consequences of the form of Eq. (A.8), which itself reflects the increasing complexity of the Weyl group of Spin(2r), as compared to the cases already treated. Roughly speaking, the Weyl group is sufficiently “non-symmetric” (where “symmetric” refers to the symmetric group) that the use of determinants in Weyl's transcendental method, in particular applying Cauchy's identity (Eqs. (A.1) and (A.2)), leads to annoyingly complicated Jacobi–Trudy identities.

The Weyl groups of the exceptional groups are even less “symmetric”, and so we expect that the above methods used to derive Jacobi–Trudy identities will be next to useless in these cases. Indeed, the simplest exceptional group $G_2$ has the dihedral group of order 12 for its Weyl group: $W = D_{12} = \mathbb{Z}_2 \times S_3$. Naively proceeding with Weyl's transcendental method leads to the evaluation of an unpleasant quotient. Forcing the evaluation with the aid of a computer suggests that the corresponding Jacobi–Trudy identity may require as many as sixty determinants!
The appropriate course of action seems therefore clear. Rather than try to force determinants unnaturally upon a Weyl group in order to apply Cauchy's identity, we should instead try to generalize Cauchy's identity in such a way that it applies to Weyl's alternants $A_\lambda = \sum_{w \in W} \det w \; e^{w(\lambda)}$ directly. We are not aware of any such generalization, but given the magic of Weyl groups, we would not be surprised if such a generalization could be found. We speculate that such a finding may lead to simple and useful identities of Jacobi–Trudy type for all simple Lie groups.
References

[1] E. Verlinde, Fusion rules and modular transformations in 2D conformal field theory, Nucl. Phys. B 300 (1988) 360–376.
[2] Y. Huang, Vertex operator algebras and the Verlinde conjecture (2004); arXiv:math.QA/0406291.
[3] J. Fuchs, Fusion rules in conformal field theory, Fortschr. Physik 42 (1994) 1–48; arXiv:hep-th/9306162.
[4] D. Cox, J. Little and D. O'Shea, Ideals, Varieties, and Algorithms, Undergraduate Texts in Mathematics (Springer-Verlag, New York, 1992).
[5] D. Gepner, Fusion rings and geometry, Comm. Math. Phys. 141 (1991) 381–411.
[6] J. Polchinski, Dirichlet-branes and Ramond–Ramond charges, Phys. Rev. Lett. 75 (1995) 4724–4727; arXiv:hep-th/9510017.
[7] R. Minasian and G. Moore, K-theory and Ramond–Ramond charge, JHEP 9711 (1997) 002; arXiv:hep-th/9710230.
[8] P. Bouwknegt and D. Ridout, A note on the equality of algebraic and geometric D-brane charges in WZW models, JHEP 0405 (2004) 029; arXiv:hep-th/0312259.
[9] S. Fredenhagen and V. Schomerus, Branes on group manifolds, gluon condensates, and twisted K-theory, JHEP 0104 (2001) 007; arXiv:hep-th/0012164.
[10] P. Bouwknegt, P. Dawson and D. Ridout, D-branes on group manifolds and fusion rings, JHEP 0212 (2002) 065; arXiv:hep-th/0210302.
[11] A. Tsuchiya, K. Ueno and Y. Yamada, Conformal field theories on universal families of stable curves with gauge symmetries, in Integrable Systems in Quantum Field Theory and Statistical Mechanics, eds. M. Jimbo, T. Miwa and A. Tsuchiya (Academic Press, Boston, 1989), pp. 459–566.
[12] G. Faltings, A proof for the Verlinde formula, J. Alg. Geom. 3 (1994) 347–374.
[13] A. Beauville, Conformal blocks, fusion rules and the Verlinde formula, Israel Math. Conf. Proc. 9 (1996) 75–96; arXiv:alg-geom/9405001.
[14] V. Kac and D. Peterson, Infinite-dimensional Lie algebras, theta functions and modular forms, Adv. Math. 53 (1984) 125–264.
[15] V. Kac, Infinite Dimensional Lie Algebras: An Introduction (Birkhäuser, Boston, 1983).
[16] M. Walton, Fusion rules in Wess–Zumino–Witten models, Nucl. Phys. B 340 (1990) 777–790.
[17] M. Walton, Algorithm for WZW fusion rules: A proof, Phys. Lett. B 241 (1990) 365–368.
[18] J. Fuchs and P. van Driel, WZW fusion rules, quantum groups, and the modular matrix S, Nucl. Phys. B 346 (1990) 632–648.
[19] P. Furlan, A. Ch. Ganchev and V. Petkova, Quantum groups and fusion rules multiplicities, Nucl. Phys. B 343 (1990) 205–227.
[20] M. Bourdeau, E. Mlawer, H. Riggs and H. Schnitzer, Topological Landau–Ginzburg matter from Sp(N)_K fusion rings, Mod. Phys. Lett. A 7 (1992) 689–700; arXiv:hep-th/9111020.
[21] D. Gepner and A. Schwimmer, Symplectic fusion rings and their metric, Nucl. Phys. B 380 (1992) 147–167; arXiv:hep-th/9204020.
[22] M. Crescimanno, Fusion potentials for G_k and handle squashing, Nucl. Phys. B 393 (1993) 361–376; arXiv:hep-th/9110063.
[23] E. Mlawer, H. Riggs and H. Schnitzer, Integrable N = 2 Landau–Ginzburg theories from quotients of fusion rings, Nucl. Phys. B 418 (1994) 603–636; arXiv:hep-th/9310082.
[24] D. Ridout, D-brane charge groups and fusion rings in Wess–Zumino–Witten models, Ph.D. thesis, University of Adelaide (2005).
[25] H. Weyl, The Classical Groups (Princeton University Press, Princeton, 1939).
[26] N. Bourbaki, Lie Groups and Lie Algebras, Chapters 4–6 (Springer-Verlag, Berlin, 2002).
[27] W. Fulton and J. Harris, Representation Theory: A First Course (Springer-Verlag, New York, 1991).
[28] O. Aharony, Generalized fusion potentials, Phys. Lett. B 306 (1993) 276–282; arXiv:hep-th/9301118.
[29] D. Cox, J. Little and D. O'Shea, Using Algebraic Geometry (Springer, New York, 1998).
[30] J. Milnor, Singular Points of Complex Hypersurfaces (Princeton University Press, Princeton, 1968).
[31] A. Boysal and S. Kumar, Private communication (2005).
[32] V. Braun, Twisted K-theory of Lie groups, JHEP 0403 (2004) 029; arXiv:hep-th/0305178.
[33] M. Gaberdiel and T. Gannon, The charges of a twisted brane, JHEP 0401 (2004) 018; arXiv:hep-th/0311242.
June 5, 2006 10:44 WSPC/148-RMP
J070-00263
Reviews in Mathematical Physics Vol. 18, No. 3 (2006) 233–253 © 2006 by Robert Seiringer
A CORRELATION ESTIMATE FOR QUANTUM MANY-BODY SYSTEMS AT POSITIVE TEMPERATURE∗
ROBERT SEIRINGER Department of Physics, Jadwin Hall, Princeton University, P.O. Box 708, Princeton NJ 08544, USA
[email protected]
Received 30 January 2006 Revised 17 April 2006 We present an inequality that gives a lower bound on the expectation value of certain two-body interaction potentials in a general state on Fock space in terms of the corresponding expectation value for thermal equilibrium states of non-interacting systems and the difference in the free energy. This bound can be viewed as a rigorous version of first-order perturbation theory for many-body systems at positive temperature. As an application, we give a proof of the first two terms in a high density (and high temperature) expansion of the free energy of jellium with Coulomb interactions, both in the fermionic and bosonic case. For bosons, our method works above the transition temperature (for the non-interacting gas) for Bose–Einstein condensation. Keywords: Quantum many-body system; thermodynamic limit; jellium; quasi-free states; correlation inequality; Bose–Einstein condensation. Mathematics Subject Classification 2000: 82B10, 46N50
∗ © 2006 by Robert Seiringer. This paper may be reproduced, in its entirety, for non-commercial purposes.

1. Introduction

Correlations play a crucial role in quantum-mechanical many-body systems. They result from interactions among the particles, and it is typically very difficult to obtain information about them in a mathematically rigorous fashion. Approximate theories are often arrived at by neglecting correlations, for instance in the Hartree–Fock theory for fermions. For the problem of estimating the validity of such approximations, it is necessary to estimate the magnitude of correlations present in the state of the interacting system.

In [6], Graf and Solovej present a correlation estimate which is applicable for the study of this problem at zero temperature, i.e. for systems in their ground states. The inequality presented there is motivated by earlier correlation estimates by Bach [1] and Bach et al. [2]. Roughly speaking, it estimates the difference of
the interaction energy in a general state and the ground state of a non-interacting system in terms of the difference of their one-particle density matrices. Moreover, at least in the case of fermions, the one-particle density matrix can be easily controlled in terms of the total kinetic energy. For bosonic systems, the situation is more complicated, and the correlation estimate in [6] is only applicable provided one can prove the existence of Bose–Einstein condensation — in general, a very difficult task for interacting systems. With the aid of the correlation estimate just mentioned, Graf and Solovej were able to derive the first two terms in a high density expansion of the ground state energy of fermionic jellium [6, Theorem 2] with Coulomb interactions. High density corresponds to small coupling, and hence the result can be viewed as the rigorous estimate of the validity of first-order perturbation theory for this system. In this paper, we present a method that is applicable to the aforementioned problem for systems at positive temperature. Unlike the situation for the ground state, the knowledge of the one-particle density matrix alone does not yield much information about correlations present in the state. As an additional input, one needs to know that the entropy of the state is close to the maximal value possible for the given one-particle density matrix; this maximum is attained by the corresponding quasi-free state. More precisely, we will estimate the difference of the interaction energy of a general state and the thermal equilibrium state of a non-interacting system in terms of the relative entropy of these two states. This relative entropy is related to the difference in free energy. Our result applies to fermions at any temperature, and to bosons above the critical temperature (for the non-interacting gas) for Bose–Einstein condensation. Our main correlation estimate is stated in Theorem 3.1 in Sec. 3. 
Before describing it in detail, we present an application of the inequality to (fermionic or bosonic) jellium with Coulomb interactions at positive temperature. We will derive the first two terms in a high density (and high temperature) expansion of the free energy. In the fermionic case, this result can be viewed as the positive temperature analogue of [6, Theorem 2].

Our estimate is general enough to be applicable to a wide range of possible interparticle interactions. The two-body potential is required to be positive definite and, in particular, to be decomposable into characteristic functions of balls. In the case of the Coulomb potential, such a decomposition was first used in [5]. The study in [7] provides a criterion for the possibility of such a decomposition for general radial functions, and thus provides many examples of interaction potentials to which our method applies.

2. Jellium

Jellium is a model of a charged gas of either fermions or bosons, moving in a uniformly charged background. We assume that the whole system is neutral (in a sense to be made precise below) and contained in a (three-dimensional) cubic box of side length $L$, which we denote by $\Lambda$. We work in the grand-canonical ensemble,
A Correlation Estimate at Positive Temperature
i.e. in the (anti-)symmetric Fock space over the one-particle space $\mathcal{H} = L^2(\Lambda; \mathbb{C}^n)$. Here, $n \geq 1$ denotes the number of internal degrees of freedom, corresponding to particles of spin $(n-1)/2$. We denote by $\Delta$ the Laplacian on $\Lambda$ with Dirichlet boundary conditions. We choose units such that $\hbar = 1$ and $2m = 1$, with $m$ denoting the particle mass. For $\rho > 0$ the background density and $\alpha > 0$ the square of the particle charge, the Hamiltonian on Fock space is
$$ H = H_0 + \alpha W, \tag{2.1} $$
where, in each $N$-particle sector,
$$ H_0 = -\sum_{i=1}^{N} \Delta_i \tag{2.2} $$
and
$$ W = -\rho \sum_{i=1}^{N} \int_{\Lambda} dy\, \frac{1}{|x_i - y|} + \sum_{i < j} \frac{1}{|x_i - x_j|} + \frac{\rho^2}{2} \int_{\Lambda\times\Lambda} dy_1\, dy_2\, \frac{1}{|y_1 - y_2|}. \tag{2.3} $$
The last constant corresponds to the electrostatic energy of the background charge and is added to ensure the existence of a proper thermodynamic limit. The quantity of interest is the free energy per unit volume at temperature $T = \beta^{-1}$, given by
$$ f^{F,B}(\beta, \rho, \alpha) = -\lim_{L\to\infty} \frac{1}{\beta|\Lambda|} \ln \mathrm{Tr}\, \exp[-\beta H]. \tag{2.4} $$
Here, Tr denotes the trace either over the fermionic (F) or bosonic (B) Fock space. Existence of the thermodynamic limit in (2.4) was shown by Lieb and Narnhofer in [10]. There it was also shown that one would obtain the same result in the canonical ensemble with charge neutrality, i.e. fixing $N$ to be $\rho|\Lambda|$. In particular, in our grand-canonical setting, it is not necessary to enforce the charge neutrality $N = \rho|\Lambda|$ explicitly, as it will be automatically satisfied (for the average particle number).

There are three length scales in this problem: the mean particle distance $\rho^{-1/3}$, the thermal wavelength $\beta^{1/2}$, and the inverse coupling constant $\alpha^{-1}$. Hence, by simple scaling,
$$ f^{F,B}(\beta, \rho, \alpha) = \rho^{5/3} f^{F,B}(\beta\rho^{2/3}, 1, \alpha\rho^{-1/3}). \tag{2.5} $$
We are interested in the high density (and high temperature) asymptotics; more precisely, in large $\rho$ for fixed $\beta\rho^{2/3}$ (and fixed $\alpha$). By the scaling property (2.5), this corresponds to a limit of small coupling. For the statement of our main results, we will distinguish between the fermionic and bosonic cases.

2.1. Fermions

Let $f_0^F(\beta, \rho)$ denote the free energy (per unit volume) of a non-interacting gas of spin $(n-1)/2$ fermions, at inverse temperature $\beta$ and average density $\rho$. It is
given by
$$ f_0^F(\beta, \rho) = \sup_{\mu\in\mathbb{R}} \left\{ \mu\rho - \frac{n}{(2\pi)^3\beta} \int_{\mathbb{R}^3} dp\, \ln\!\left(1 + e^{-\beta(p^2-\mu)}\right) \right\}. \tag{2.6} $$
The supremum in (2.6) is attained uniquely at some $\mu = \mu_0^F(\beta, \rho)$. We denote the fugacity by $z = e^{\beta\mu}$ for this value of $\mu$. Note that $z$ depends only on $\beta\rho^{2/3}$. Let
$$ \gamma_0^F(p) = \frac{1}{z^{-1} e^{\beta p^2} + 1}, \tag{2.7} $$
and let $\check\gamma_0^F(x) = (2\pi)^{-3} \int dp\, \gamma_0^F(p)\, e^{ipx}$ denote its inverse Fourier transform. Note that $n\,\check\gamma_0^F(0) = \rho$.

Theorem 2.1 (High Density Asymptotics for Fermions). As $\rho \to \infty$ and $\beta \to 0$,
$$ f^F(\beta, \rho, \alpha) = f_0^F(\beta, \rho) - \frac{\alpha n}{2} \int_{\mathbb{R}^3} dx\, \frac{|\check\gamma_0^F(x)|^2}{|x|} - o(\rho^{4/3}), \tag{2.8} $$
with $0 \leq o(\rho^{4/3}) \leq C(\beta\rho^{2/3})\, \alpha\rho^{4/3} (\alpha\rho^{-1/3})^{1/48}$. Moreover, the function $C(\beta\rho^{2/3})$ is uniformly bounded on compact intervals in $(0, \infty)$.

Note that, for fixed $\beta\rho^{2/3}$ (and fixed $\alpha$), the first term on the right-hand side of (2.8) is $O(\rho^{5/3})$, whereas the second term is $O(\rho^{4/3})$. Theorem 2.1 is the positive temperature analogue of [6, Theorem 2].

We remark that (2.8) actually holds uniformly in $\beta\rho^{2/3}$ for bounded $1/(\beta\rho^{2/3})$, with possibly a worse exponent in the error term than the one given in Theorem 2.1. That is, it is uniform as the ground state is approached. This can be proved by supplementing our lower bound with a bound obtained with the method in [6] at very low temperatures. We do not give the details here, but refer the reader to [15] where a similar argument was given in the case of a dilute Fermi gas with short-range interactions.

2.2. Bosons

For bosons, we have to restrict our attention to temperatures bigger than the critical temperature (for the non-interacting gas) or, equivalently, to $\rho < \rho_c(\beta) \equiv n (4\pi\beta)^{-3/2} \sum_{\ell \geq 1} \ell^{-3/2}$. Let $f_0^B(\beta, \rho)$ denote the free energy (per unit volume) of a non-interacting gas of spin $(n-1)/2$ bosons, given by
$$ f_0^B(\beta, \rho) = \sup_{\mu < 0} \left\{ \mu\rho + \frac{n}{(2\pi)^3\beta} \int_{\mathbb{R}^3} dp\, \ln\!\left(1 - e^{-\beta(p^2-\mu)}\right) \right\}. \tag{2.9} $$
For $\rho < \rho_c(\beta)$, the supremum in (2.9) is attained at $\mu = \mu_0^B(\beta, \rho) < 0$. Denote the fugacity by $z = e^{\beta\mu} < 1$ for this value of $\mu$. Again, $z$ depends only on the dimensionless quantity $\beta\rho^{2/3}$. Analogously to (2.7), let
$$ \gamma_0^B(p) = \frac{1}{z^{-1} e^{\beta p^2} - 1}, \tag{2.10} $$
and let $\check\gamma_0^B(x) = (2\pi)^{-3} \int dp\, \gamma_0^B(p)\, e^{ipx}$ denote its inverse Fourier transform.
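The statement that the fugacity depends only on $\beta\rho^{2/3}$ can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes the density condition $n(2\pi)^{-3}\int dp\, \gamma_0^{F,B}(p) = \rho$ and solves it for $z$ by bisection. The function names (`simpson`, `scaled_density`, `fugacity`), the quadrature cutoff and the bisection bracket are illustrative choices.

```python
import math

ZETA_3_2 = 2.612375348685488  # Riemann zeta(3/2), enters the critical density


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3.0


def scaled_density(log_z, sign):
    """Return rho * beta**1.5 / n of the ideal gas with fugacity z = exp(log_z).

    sign=+1 uses the Fermi function (2.7), sign=-1 the Bose function (2.10).
    Substituting u = sqrt(beta) * |p| in n (2 pi)^-3 int dp gamma_0(p) gives
    (2 pi^2)^-1 * int_0^inf u^2 / (exp(u^2 - log_z) + sign) du.
    """
    def integrand(u):
        return u * u / (math.exp(u * u - log_z) + sign)

    upper = math.sqrt(max(log_z, 0.0)) + 8.0  # integrand is negligible beyond this
    return simpson(integrand, 0.0, upper) / (2.0 * math.pi ** 2)


def fugacity(beta, rho, n=1, sign=+1):
    """Solve the density condition for z by bisection in log z (illustrative bracket)."""
    target = rho * beta ** 1.5 / n
    if sign < 0 and target >= ZETA_3_2 / (4.0 * math.pi) ** 1.5:
        raise ValueError("density at or above the critical density rho_c(beta)")
    lo, hi = -60.0, (60.0 if sign > 0 else 0.0)
    if sign > 0 and scaled_density(hi, sign) < target:
        raise ValueError("density outside the bracket used in this sketch")
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if scaled_density(mid, sign) < target:
            lo = mid
        else:
            hi = mid
    return math.exp(0.5 * (lo + hi))
```

Two parameter pairs with the same value of $\beta\rho^{2/3}$, for instance $(\beta, \rho) = (1, 1)$ and $(4, 1/8)$, give the same fugacity. At low density the result approaches the classical value $z \approx (4\pi\beta)^{3/2}\rho/n$.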
Theorem 2.2 (High Density Asymptotics for Bosons). As $\rho \to \infty$ and $\beta \to 0$ (with $\beta\rho^{2/3} < \beta\rho_c(\beta)^{2/3}$),
$$ f^B(\beta, \rho, \alpha) = f_0^B(\beta, \rho) + \frac{\alpha n}{2} \int_{\mathbb{R}^3} dx\, \frac{|\check\gamma_0^B(x)|^2}{|x|} - o(\rho^{4/3}), \tag{2.11} $$
with $0 \leq o(\rho^{4/3}) \leq C(\beta\rho^{2/3})\, \alpha\rho^{4/3} (\alpha\rho^{-1/3})^{1/48}$. Moreover, the function $C(\beta\rho^{2/3})$ is uniformly bounded on compact intervals in $(0, \beta\rho_c(\beta)^{2/3})$.

As in the fermionic case, the first term on the right-hand side of (2.11) is $O(\rho^{5/3})$, whereas the second term is $O(\rho^{4/3})$. Note that the second term diverges as $\rho \to \rho_c(\beta)$. This shows that (2.11) cannot hold uniformly as $\rho$ approaches the critical density, since $f^B(\beta, \rho, \alpha) \leq f^B(\infty, \rho, \alpha) \leq 0$ for any $\beta$ and $\rho$. At zero temperature, the leading term in the energy density as $\rho \to \infty$ is actually $O(\rho^{5/4})$ [4, 12]. In particular, first-order perturbation theory (in the grand canonical ensemble) is not applicable below the critical temperature, due to the large fluctuations in particle number. These large fluctuations cannot be present in the interacting system, for any non-zero value of the coupling parameter $\alpha$.

The key ingredient in the proof of Theorems 2.1 and 2.2 is a new correlation estimate, which we present next.
3. Correlation Estimate

In this section, we will describe our main correlation estimate, which will then be used in the proof of Theorems 2.1 and 2.2. For $\xi \in \mathbb{R}^3$ and $r > 0$, let $\chi_{r,\xi}$ denote the characteristic function of a ball of radius $r$ centered at $\xi$. The function $\chi_{r,\xi}$ defines a projection operator on $L^2(\mathbb{R}^3; \mathbb{C}^n)$ and also, in a natural way, on the subspace $L^2(\Lambda; \mathbb{C}^n)$. Let $n_{r,\xi}$ denote the operator on Fock space that counts the number of particles in this ball, i.e. the second quantization of the projection $\chi_{r,\xi}$ on $\mathcal{H} = L^2(\Lambda; \mathbb{C}^n)$. Our correlation estimate concerns a lower bound on the expectation value of the number of pairs of particles inside a ball of radius $r$ or, more precisely, on
$$ \int_{\mathbb{R}^3} d\xi\, \mathrm{Tr}\left[ n_{r,\xi} (n_{r,\xi} - 1)\, \Gamma \right]. \tag{3.1} $$
Here, $\Gamma$ is a density matrix, i.e. a positive operator on Fock space with trace equal to one, defining the state of the system.

Let $\gamma_0$ denote the one-particle density matrix of a (grand-canonical) non-interacting (Fermi or Bose) gas at temperature $T = \beta^{-1}$, with chemical potential $\mu = \mu_0^{F,B}(\beta, \rho)$, as defined after Eqs. (2.6) and (2.9), respectively. We choose periodic boundary conditions for $\gamma_0$, which has the advantage of $\gamma_0$ having a constant density. Note that the choice of $\mu$ implies that $|\Lambda|\bar\rho \equiv \mathrm{tr}\, \gamma_0 = \rho|\Lambda| + o(|\Lambda|)$ in the thermodynamic limit. Here and in the following, we denote the trace over
the one-particle space $\mathcal{H} = L^2(\Lambda; \mathbb{C}^n)$ by tr, whereas the trace over Fock space is denoted by Tr. The kernel of $\gamma_0$ is given by
$$ \gamma_0(x, \sigma; y, \tau) = \frac{1}{|\Lambda|} \sum_{p \in \frac{2\pi}{L}\mathbb{Z}^3} \gamma_0^{F,B}(p)\, e^{ip(x-y)}\, \delta_{\sigma,\tau}, \tag{3.2} $$
where $\gamma_0^{F,B}(p)$ is given in (2.7) and (2.10), respectively, and $\sigma$ and $\tau$ label the spin states.

Let $\Gamma_0$ denote the quasi-free state on Fock space with one-particle density matrix $\gamma_0$. It is the Gibbs state (at inverse temperature 1 and chemical potential 0) for a non-interacting system with one-particle Hamiltonian $\ln[(1 \mp \gamma_0)/\gamma_0]$. Here and in the following, $\mp$ means $-$ for fermions and $+$ for bosons (and vice versa for $\pm$). We note that for $\Gamma = \Gamma_0$, the expression in (3.1) can be easily calculated. Namely, for any $r$ and $\xi$,
$$ \mathrm{Tr}[n_{r,\xi}(n_{r,\xi} - 1)\Gamma_0] = (\mathrm{tr}[\chi_{r,\xi}\gamma_0])^2 \mp \mathrm{tr}(\chi_{r,\xi}\gamma_0)^2. \tag{3.3} $$
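Formula (3.3) can be illustrated in a finite toy model. The sketch below is an illustration under a stated assumption, not the computation used in the paper: in a quasi-free state, the modes that diagonalize $\chi_{r,\xi}\gamma_0\chi_{r,\xi}$ are taken to be independent, with occupation 0 or 1 for fermions and geometrically distributed occupation for bosons. The eigenvalues `occ` and the truncation `cutoff` are hypothetical inputs.

```python
import itertools


def pairs_fermions(occ):
    """Exact E[n(n-1)] for independent fermionic modes (occupation 0 or 1)."""
    total = 0.0
    for config in itertools.product((0, 1), repeat=len(occ)):
        prob = 1.0
        for bit, lam in zip(config, occ):
            prob *= lam if bit else 1.0 - lam
        n = sum(config)
        total += prob * n * (n - 1)
    return total


def pairs_bosons(occ, cutoff=400):
    """E[n(n-1)] for independent bosonic modes with geometric occupation laws."""
    first, second = [], []
    for lam in occ:
        q = lam / (1.0 + lam)  # P(n_k = m) = (1 - q) q^m has mean lam
        probs = [(1.0 - q) * q ** m for m in range(cutoff)]
        first.append(sum(m * p for m, p in enumerate(probs)))
        second.append(sum(m * (m - 1) * p for m, p in enumerate(probs)))
    mean_total = sum(first)
    # n = sum_k n_k with independent n_k:
    # E[n(n-1)] = sum_k E[n_k(n_k-1)] + sum_{k != l} E[n_k] E[n_l]
    return sum(second) + mean_total ** 2 - sum(m * m for m in first)


def quasi_free_pairs(occ, bosons):
    """Right-hand side of (3.3): (tr X)^2 -/+ tr X^2 for X with eigenvalues occ."""
    s1 = sum(occ)
    s2 = sum(lam * lam for lam in occ)
    return s1 * s1 + s2 if bosons else s1 * s1 - s2
```

For any list of occupations in $[0, 1]$ (fermions) or of moderate size (bosons, so that the truncation is harmless), the directly computed pair count agrees with `quasi_free_pairs`. The sign difference between the two cases reflects the exchange term.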
Hence, after integration over $\xi$,
$$ \int_{\mathbb{R}^3} d\xi\, \mathrm{Tr}[n_{r,\xi}(n_{r,\xi} - 1)\Gamma_0] = \int_{\Lambda\times\Lambda} dx\, dy\, J_r(x - y) \left( \bar\rho^2 \mp \sum_{\sigma} |\gamma_0(x, \sigma; y, \sigma)|^2 \right), \tag{3.4} $$
where we denoted $J_r(x) = \int dy\, \chi_{r,\xi}(y)\chi_{r,\xi}(x - y)$. (Note that $J_r$ is independent of $\xi$.) Here, we have also used that $\gamma_0$ has a constant density $\bar\rho = \sum_\sigma \gamma_0(x, \sigma; x, \sigma)$ for $x \in \Lambda$.

We want to show that for states $\Gamma$ that are in some sense close to the state $\Gamma_0$, the expectation value (3.1) is close to (3.4). A convenient way to characterize this “proximity” is the relative entropy: For two general states $\Gamma$ and $\Upsilon$ on Fock space, the relative entropy is given by
$$ S(\Gamma, \Upsilon) = \mathrm{Tr}\, \Gamma (\ln \Gamma - \ln \Upsilon). \tag{3.5} $$
Note that $0 \leq S(\Gamma, \Upsilon) \leq \infty$. Although $S$ does not define a metric, it measures the difference between two states in a certain sense. In particular, $S$ dominates the trace norm. More precisely, $S(\Gamma, \Upsilon) \geq \tfrac{1}{2} \|\Gamma - \Upsilon\|_1^2$ [14, Theorem 1.15].

Note that the relative entropy can also be interpreted as a difference in free energies. More precisely, if $\Upsilon = \exp(-\beta(H - F))$ for some $\beta > 0$, with $F = -\beta^{-1} \ln \mathrm{Tr}\, \exp(-\beta H)$ the corresponding “free energy”, then
$$ \beta^{-1} S(\Gamma, \Upsilon) = \mathrm{Tr}[H\Gamma] + \beta^{-1}\, \mathrm{Tr}\, \Gamma \ln \Gamma - F. \tag{3.6} $$
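Both properties of the relative entropy can be checked in the simplest commuting case. The sketch below is an illustration only: it assumes that $\Gamma$ and $\Upsilon$ are diagonal in a common basis, so that (3.5) reduces to the classical relative entropy of two probability vectors. The energy levels, the inverse temperature and the probability vector `gam` are hypothetical inputs.

```python
import math


def relative_entropy(p, q):
    """Classical analogue of (3.5): sum_i p_i (ln p_i - ln q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def gibbs(energies, beta):
    """Return the Gibbs weights exp(-beta (E_i - F)) and the free energy F."""
    weights = [math.exp(-beta * e) for e in energies]
    partition = sum(weights)
    free_energy = -math.log(partition) / beta
    return [w / partition for w in weights], free_energy


energies = [0.0, 0.7, 1.5, 3.0]   # hypothetical spectrum of H
beta = 2.0                        # hypothetical inverse temperature
ups, F = gibbs(energies, beta)    # reference state Upsilon
gam = [0.4, 0.3, 0.2, 0.1]        # some other state Gamma

S = relative_entropy(gam, ups)
energy = sum(g * e for g, e in zip(gam, energies))
neg_entropy = sum(g * math.log(g) for g in gam)

lhs = S / beta                              # left-hand side of (3.6)
rhs = energy + neg_entropy / beta - F       # right-hand side of (3.6)
trace_norm = sum(abs(g - u) for g, u in zip(gam, ups))
```

In this commuting setting `lhs` and `rhs` agree up to rounding, and `S` is bounded below by `0.5 * trace_norm ** 2`, in line with the trace norm bound quoted from [14, Theorem 1.15].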
Note that −Tr Γ ln Γ is just the von Neumann entropy of Γ. Hence, the first two terms on the right-hand side of (3.6) correspond to the free energy of Γ (with Hamiltonian and temperature determined by Υ), whereas F is the free energy of Υ. Our main result estimates the difference of the expectation value (3.1) for Γ0 and a general state Γ in terms of the relative entropy S(Γ, Γ0 ). More precisely, the following theorem, which is the main new result of this work, holds.
Theorem 3.1 (Main Correlation Estimate). Let $\Gamma_0$ be given as above, with one-particle density matrix $\gamma_0$ and density $\bar\rho$, and with $\mu \in \mathbb{R}$ for fermions and $\mu < 0$ for bosons. Let $\Gamma$ be any other state on (fermionic or bosonic) Fock space. For any $2r \leq d \leq L/2$, we have that
$$ \begin{aligned} &\int_{\mathbb{R}^3} d\xi\, \mathrm{Tr}[n_{r,\xi}(n_{r,\xi} - 1)\, \Gamma] \\ &\quad \geq \int_{\Lambda\times\Lambda} dx\, dy\, J_r(x - y) \left( \bar\rho^2 \mp \sum_\sigma |\gamma_0(x, \sigma; y, \sigma)|^2 \right) \\ &\qquad - C_z^{F,B}\, r^3\bar\rho \left( 1 + r^3\bar\rho \right) |\Lambda|^{3/4} \left[ d^3 (1 + \beta d^{-2})\, S(\Gamma, \Gamma_0) + \beta^{1/2} d^{-1} |\Lambda| \right]^{1/4}. \end{aligned} \tag{3.7} $$
Here, $C_z^{F,B}$ are constants depending only on $z = e^{\beta\mu}$, which are uniformly bounded on compact intervals in $(0, \infty)$ and $(0, 1)$, respectively.

We emphasize again that, according to (3.4), the second line in (3.7) equals the first in the case $\Gamma = \Gamma_0$. Although the inequality (3.7) is not sharp in this case, the parameter $d$ can be made very large to obtain an error which is, in the thermodynamic limit, of lower order than the volume. (The restriction $d \leq L/2$ in Theorem 3.1 is purely technical and could in principle be avoided by a slight modification of the proof. Since we are mainly concerned here with the application of (3.7) in the thermodynamic limit $L \to \infty$, we have refrained from doing so.)

Note that Theorem 3.1 gives an estimate on a “local” quantity, like the expectation value of the number of pairs of particles inside a small ball, in terms of a “global” quantity such as the relative entropy. The strong subadditivity of entropy plays a crucial role in this estimate.

Before we give the proof of Theorem 3.1, we show how it can be used to prove the applications to Coulomb systems stated in Theorems 2.1 and 2.2.

4. Proof of Theorems 2.1 and 2.2

We are going to treat the fermionic and bosonic case simultaneously, merely pointing out the differences if necessary. We start by deriving a lower bound on the free energy. Note that if $\Gamma$ denotes the Gibbs state of $H$ at temperature $\beta^{-1}$ (and zero chemical potential), then charge neutrality (as proved in [10]) implies that
$$ \lim_{L\to\infty} \frac{1}{|\Lambda|}\, \mathrm{Tr}\, N\Gamma = \rho \tag{4.1} $$
for any fixed $\beta$ and $\alpha > 0$. Here, $N$ denotes the number operator on Fock space. Application of the Peierls–Bogoliubov inequality then leads to the lower bound
$$ f^{F,B}(\beta, \rho, \alpha) \geq f_0^{F,B}(\beta, \rho) + \alpha \limsup_{L\to\infty} \frac{1}{|\Lambda|}\, \mathrm{Tr}\, W\Gamma. \tag{4.2} $$
To estimate the expectation value of $W$ in the Gibbs state $\Gamma$, we will split the Coulomb potential into long- and short-range parts.
4.1. Long-range part

We write the Coulomb potential as [5]
$$ \frac{1}{|x - y|} = \frac{1}{\pi} \int_0^\infty dr\, \frac{1}{r^5} \int_{\mathbb{R}^3} d\xi\, \chi_{r,\xi}(x)\chi_{r,\xi}(y). \tag{4.3} $$
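The decomposition (4.3) can be verified numerically. The sketch below is an illustration only. It uses the standard formula $\pi(4r+s)(2r-s)^2/12$ for the volume of the intersection of two balls of radius $r$ whose centers are a distance $s \leq 2r$ apart, and evaluates the $r$-integral after the substitution $t = 1/r$. The function names and the optional limits `r_min`, `r_max` are assumptions of this sketch.

```python
import math


def ball_overlap(r, s):
    """Volume of the intersection of two balls of radius r at center distance s."""
    if s >= 2.0 * r:
        return 0.0
    return math.pi * (4.0 * r + s) * (2.0 * r - s) ** 2 / 12.0


def coulomb_piece(s, r_min=0.0, r_max=math.inf, steps=20000):
    """(1/pi) * int_{r_min}^{r_max} dr r^-5 * overlap(r, s), midpoint rule in t = 1/r.

    With t = 1/r one has r^-5 dr = -t^3 dt, and t^3 * overlap(1/t, s) is a
    polynomial in t on the range where the balls intersect (t < 2/s).
    """
    t_lo = 0.0 if math.isinf(r_max) else 1.0 / r_max
    if s > 0.0:
        t_hi = 2.0 / s if r_min == 0.0 else min(1.0 / r_min, 2.0 / s)
    elif r_min > 0.0:
        t_hi = 1.0 / r_min
    else:
        raise ValueError("the full integral diverges at coinciding points")
    if t_hi <= t_lo:
        return 0.0
    h = (t_hi - t_lo) / steps
    total = 0.0
    for k in range(steps):
        t = t_lo + (k + 0.5) * h
        total += t ** 3 * ball_overlap(1.0 / t, s)
    return total * h / math.pi
```

With the default limits, `coulomb_piece(s)` reproduces $1/s$. Restricting to $r \leq R$ or $r \geq R$ gives two pieces that add up to $1/s$; the piece with $r \leq R$ vanishes for $s \geq 2R$, and the piece with $r \geq R$ takes the value $4/(3R)$ at $s = 0$.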
As in Sec. 3, $\chi_{r,\xi}$ denotes the characteristic function of a ball of radius $r$ centered at $\xi \in \mathbb{R}^3$. We split the $r$-integration into a part $r \leq R$ and a part $r \geq R$ and, correspondingly, write
$$ \frac{1}{|x - y|} = V_{< R}(x - y) + V_{> R}(x - y). \tag{4.4} $$
For the long-range part $V_{> R}$, we note that it has a positive Fourier transform, as follows immediately from the decomposition (4.3). Hence, we obtain the lower bound [17, 4.5.20]
$$ \sum_{1 \leq i < j \leq N} V_{> R}(x_i - x_j) \geq \rho \sum_{i=1}^{N} \int_\Lambda dy\, V_{> R}(x_i - y) - \frac{\rho^2}{2} \int_{\Lambda\times\Lambda} dy_1\, dy_2\, V_{> R}(y_1 - y_2) - \frac{N}{2} V_{> R}(0). \tag{4.5} $$
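This estimate actually holds for any $\rho > 0$. The last term equals $V_{> R}(0) = 4/(3R)$, and hence will be negligible if we choose $R \gg \rho^{-1/3}$.

4.2. Short-range part

As in Sec. 3, let $n_{r,\xi}$ denote the operator that counts the number of particles in a ball of radius $r$ centered at $\xi$, i.e. the second quantization of the projection $\chi_{r,\xi}$ on $\mathcal{H} = L^2(\mathbb{R}^3; \mathbb{C}^n)$. The expectation value of the short-range part $V_{< R}$ can be written, using the decomposition (4.3) restricted to $r \leq R$, as
$$ \mathrm{Tr}\Bigl[ \sum_{i < j} V_{< R}(x_i - x_j)\, \Gamma \Bigr] = \frac{1}{2\pi} \int_0^R dr\, \frac{1}{r^5} \int_{\mathbb{R}^3} d\xi\, \mathrm{Tr}[n_{r,\xi}(n_{r,\xi} - 1)\, \Gamma]. \tag{4.6} $$
By Theorem 3.1, for any $2r \leq d \leq L/2$,
$$ \begin{aligned} \int_{\mathbb{R}^3} d\xi\, \mathrm{Tr}[n_{r,\xi}(n_{r,\xi} - 1)\, \Gamma] &\geq \int_{\Lambda\times\Lambda} dx\, dy\, J_r(x - y) \left( \bar\rho^2 \mp \sum_\sigma |\gamma_0(x, \sigma; y, \sigma)|^2 \right) \\ &\quad - C_z^{F,B}\, r^3\bar\rho \left( 1 + r^3\bar\rho \right) |\Lambda|^{3/4} \left[ d^3 \left( 1 + \beta d^{-2} \right) S(\Gamma, \Gamma_0) + \beta^{1/2} d^{-1} |\Lambda| \right]^{1/4}. \end{aligned} \tag{4.7} $$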
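For $\Gamma$ the Gibbs state of $H$, an upper bound on $S(\Gamma, \Gamma_0)$ is, in fact, easy to obtain. Using the fact that the quadratic form domain of the Dirichlet Laplacian is contained in the quadratic form domain of the Laplacian on $\Lambda$ with periodic boundary conditions, we can write
$$ \begin{aligned} S(\Gamma, \Gamma_0) &= \beta\, \mathrm{Tr}(H_0 - \mu N)\Gamma + \mathrm{Tr}\, \Gamma \ln \Gamma \mp \mathrm{tr} \ln(1 \mp \gamma_0) \\ &= -\ln \mathrm{Tr}\, \exp[-\beta H] - \beta\mu\, \mathrm{Tr}\, N\Gamma - \beta\alpha\, \mathrm{Tr}\, W\Gamma \mp \mathrm{tr} \ln(1 \mp \gamma_0). \end{aligned} \tag{4.8} $$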
We now use the lower bound W ≥ −const. N |Λ|1/3 [10], as well as the fact that |Λ|−1 Tr N Γ → in the thermodynamic limit, as explained in the beginning of this section. This leads to the estimate S(Γ, Γ0 ) ≤ |Λ|β(f F,B (β, , α) − f0F,B (β, )) + const. β|Λ|α4/3 + o(|Λ|).
(4.9)
As the upper bound to the free energy in Sec. 4.4 shows, the first term is negative in the fermionic case and can thus be neglected for an upper bound. In the bosonic case, it is bounded above by $C_z \beta|\Lambda|\alpha\rho^{4/3}$ for some constant depending only on $z$. This follows immediately from the upper bound leading to (2.11), together with simple scaling. (Note that $C_z$ diverges as $z \to 1$.) Hence, in general,
$$ S(\Gamma, \Gamma_0) \leq C_z \beta|\Lambda|\alpha\rho^{4/3} + o(|\Lambda|). \tag{4.10} $$
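Here and in the following, we abuse the notation slightly and denote by $C_z$ any expression that depends only on $z$ (and is uniformly bounded on compact intervals in $(0, \infty)$ in the fermionic case and $(0, 1)$ in the bosonic case).

We insert the bound (4.10) into (4.7). Choosing $d = \beta^{-1/8}\alpha^{-1/4}\rho^{-1/3}$, we thus obtain that, as long as $d \geq 2r$ (and $\alpha\rho^{-1/3} \leq$ const.),
$$ \begin{aligned} \int_{\mathbb{R}^3} d\xi\, \mathrm{Tr}[n_{r,\xi}(n_{r,\xi} - 1)\, \Gamma] &\geq \int_{\Lambda\times\Lambda} dx\, dy\, J_r(x - y) \left( \rho^2 \mp \sum_\sigma |\gamma_0(x, \sigma; y, \sigma)|^2 \right) \\ &\quad - C_z\, r^3\rho \left( 1 + r^3\rho \right) |\Lambda| \left( \beta^{5/2}\alpha\rho^{4/3} \right)^{1/16} + o(|\Lambda|). \end{aligned} \tag{4.11} $$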
Here, we have also used that $\bar\rho = \rho + o(1)$ as $L \to \infty$. Note that we are going to use this estimate in (4.6) only for $r \leq R$. Below we will choose $R \ll \beta^{-1/8}\alpha^{-1/4}\rho^{-1/3}$, hence (4.11) will be applicable.

For a lower bound, we can restrict the $r$-integration in (4.6) to $R_0 \leq r \leq R$ for some $0 < R_0 < R$, and simply neglect the contribution from the $r \leq R_0$ part. A simple estimate, using that $\sum_\sigma |\gamma_0(x, \sigma; y, \sigma)|^2 \leq \bar\rho^2$, shows that in the state $\Gamma_0$ this $r \leq R_0$ contribution is bounded above by $(2\pi)^{-1} |\Lambda| (4\pi/3)^2 \bar\rho^2 R_0^2$. We
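then have
$$ \begin{aligned} \frac{1}{2\pi} \int_0^R dr\, \frac{1}{r^5} \int_{\mathbb{R}^3} d\xi\, \mathrm{Tr}[n_{r,\xi}(n_{r,\xi} - 1)\Gamma] &\geq \frac{1}{2} \int_{\Lambda\times\Lambda} dx\, dy\, V_{< R}(x - y) \left( \rho^2 \mp \sum_\sigma |\gamma_0(x, \sigma; y, \sigma)|^2 \right) - \frac{8\pi}{9} |\Lambda| \rho^2 R_0^2 \\ &\quad - C_z |\Lambda|\, \rho \left( \frac{1}{R_0} + \rho R^2 \right) \left( \alpha\rho^{-1/3} \right)^{1/16} + o(|\Lambda|). \end{aligned} \tag{4.12} $$
Here, we have used again that $z$ is a function of $\beta\rho^{2/3}$.

4.3. Final lower bound

For the one-body part containing $V_{< R}$, we use the simple bound
$$ \int_\Lambda dy\, V_{< R}(x - y) \leq \int_{\mathbb{R}^3} dy\, V_{< R}(y). \tag{4.13} $$
In combination, (4.5), (4.12) and (4.13) yield, in the thermodynamic limit,
$$ \begin{aligned} \liminf_{L\to\infty} \frac{1}{|\Lambda|}\, \mathrm{Tr}\, W\Gamma &\geq \mp \frac{n}{2} \int_{\mathbb{R}^3} dx\, V_{< R}(x)\, |\check\gamma_0^{F,B}(x)|^2 - \frac{8\pi}{9} \rho^2 R_0^2 - \frac{2\rho}{3R} \\ &\quad - C_z\, \rho \left( \frac{1}{R_0} + \rho R^2 \right) \left( \alpha\rho^{-1/3} \right)^{1/16}. \end{aligned} \tag{4.14} $$
In the fermionic case, we can simply use $V_{< R}(x) \leq 1/|x|$. In the bosonic case, we write $V_{< R}(x) = 1/|x| - V_{> R}(x)$ and estimate (using $V_{> R}(x) \leq V_{> R}(0) = 4/(3R)$)
$$ \frac{n}{2} \int_{\mathbb{R}^3} dx\, V_{> R}(x)\, |\check\gamma_0^B(x)|^2 \leq \frac{2n}{3R}\, (2\pi)^{-3} \int_{\mathbb{R}^3} dp\, |\gamma_0^B(p)|^2 \leq \frac{2\rho}{3R}\, \frac{z}{1 - z}. \tag{4.15} $$
With the choice $R = \rho^{-1/3} (\alpha\rho^{-1/3})^{-1/48}$ and $R_0 = 1/(\rho R^2)$, this yields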
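$$ \liminf_{L\to\infty} \frac{1}{|\Lambda|}\, \mathrm{Tr}\, W\Gamma \geq \mp \frac{n}{2} \int_{\mathbb{R}^3} dx\, \frac{|\check\gamma_0^{F,B}(x)|^2}{|x|} - C_z\, \rho^{4/3} \left( \alpha\rho^{-1/3} \right)^{1/48} \tag{4.16} $$
for some constant $C_z$ depending only on $z$. Inserting this bound into (4.2) finishes the proof of the lower bound.

4.4. Upper bound

For the upper bound to the free energy, we use the variational principle, which states that
$$ -\frac{1}{\beta} \ln \mathrm{Tr}\, \exp[-\beta H] \leq \mathrm{Tr}\, H\Gamma - \frac{1}{\beta} S(\Gamma) \tag{4.17} $$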
for any state $\Gamma$ on Fock space. Here, $S(\Gamma) = -\mathrm{Tr}\, \Gamma \ln \Gamma$ denotes the von Neumann entropy. We choose as a trial state $\Gamma$ a quasi-free state with one-particle density matrix $\gamma$ given by the kernel
$$ \gamma(x, \sigma; y, \tau) = g(x) g(y)\, \check\gamma_0^{F,B}(x - y)\, \delta_{\sigma,\tau}. \tag{4.18} $$
Here, $0 \leq g(x) \leq 1$ is a continuously differentiable function with the property that $g(x) = 0$ for $x \notin \Lambda$, $g(x) = 1$ if $x \in \Lambda$ and $\mathrm{dist}(x, \partial\Lambda) \geq R$, and $|\nabla g| \leq$ const. $R^{-1}$. We shall choose the variational parameter $R$ to satisfy $1/(L\rho^{2/3}) \ll R \ll (L\rho^2)^{-1/5}$ for large $L$.

The calculation of the energy of the state $\Gamma$ is similar to the corresponding calculation in [6]. It is in fact simpler since the particle number does not have to be fixed. Let $\varphi(x) = \sum_\sigma \gamma(x, \sigma; x, \sigma)$ denote the density of $\gamma$. A simple computation (compare with (3.3) and (3.4)), using the fact that $\Gamma$ is a quasi-free state, yields
$$ \mathrm{Tr}\, W\Gamma = \frac{1}{2} \int_{\Lambda\times\Lambda} dx\, dy\, \frac{1}{|x - y|} \left[ (\varphi(x) - \rho)(\varphi(y) - \rho) \mp \sum_\sigma |\gamma(x, \sigma; y, \sigma)|^2 \right]. \tag{4.19} $$
Note that $\varphi(x) = \rho\, g(x)^2$, i.e. $\rho - \varphi(x) = \rho(1 - g(x)^2)$, and hence, by definition, $\varphi(x) = \rho$ if $x \in \Lambda$ and $x$ is at least a distance $R$ away from the boundary of $\Lambda$. Using the Hardy–Littlewood–Sobolev inequality [9, Theorem 4.3], it is easy to see that the first term on the right-hand side of (4.19) is bounded from above by const. $\rho^2 (L^2 R)^{5/3}$ and is thus negligible in the thermodynamic limit, if $R \ll (L\rho^2)^{-1/5}$. In the fermionic case, the second term is bounded from above by
$$ -\frac{n}{2} \int_{\Lambda\times\Lambda} dx\, dy\, \frac{1}{|x - y|}\, |\check\gamma_0^F(x - y)|^2 \left[ 1 - 2(1 - g(x)^2) \right], \tag{4.20} $$
which yields the desired expression in the thermodynamic limit, provided $R \ll L$, which is amply satisfied for our choice of $R$. In the bosonic case, we can simply use $g \leq 1$ to obtain the desired bound.

The kinetic energy of $\Gamma$ is given by
$$ \begin{aligned} \mathrm{Tr}\, H_0\Gamma &= -n \Delta\check\gamma_0^{F,B}(0) \int_{\mathbb{R}^3} dx\, g(x)^2 + \rho \int_{\mathbb{R}^3} dx\, |\nabla g(x)|^2 \\ &\leq -n |\Lambda| \Delta\check\gamma_0^{F,B}(0) + \mathrm{const.}\, \frac{\rho L^2}{R}. \end{aligned} \tag{4.21} $$
Again, the first term is the desired expression, and the last term is negligible if $R \gg 1/(L\rho^{2/3})$.
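It remains to derive a lower bound on the entropy $S(\Gamma)$. We claim that
$$ S(\Gamma) \geq -\frac{n}{(2\pi)^3} \int_\Lambda dx\, g(x)^2 \times \int_{\mathbb{R}^3} dp \left[ \gamma_0^{F,B}(p) \ln \gamma_0^{F,B}(p) \pm (1 \mp \gamma_0^{F,B}(p)) \ln(1 \mp \gamma_0^{F,B}(p)) \right], \tag{4.22} $$
which gives the desired quantity as long as $R \ll L$. Inequality (4.22) follows from a variant of the Berezin–Lieb inequality [3, 8]. The one-particle density matrix (4.18) can be written as
$$ \gamma = \sum_\sigma \int_{\mathbb{R}^3} dp\, \gamma_0^{F,B}(p)\, g |p, \sigma\rangle\langle p, \sigma| g, \tag{4.23} $$
where $g$ denotes multiplication by $g(x)$, and $|p, \sigma\rangle$ denotes a plane wave with wave function $(2\pi)^{-3/2} \exp(ipx)$ and spin $\sigma$. Moreover, since $\Gamma$ is a quasi-free state,
$$ S(\Gamma) = \mathrm{tr}\, s(\gamma), \tag{4.24} $$
where we denoted $s(t) = -t \ln t \mp (1 \mp t) \ln(1 \mp t)$ for $t \geq 0$. Note that $s$ is a concave function, with $s(0) = 0$. Hence, we can apply the Berezin–Lieb inequality, in the form proved in [16, Theorem A1]. Noting that
$$ \sum_\sigma \int_{\mathbb{R}^3} dp\, g |p, \sigma\rangle\langle p, \sigma| g = g^2 \leq 1, \tag{4.25} $$
as well as $\langle p, \sigma| g^2 |p, \sigma\rangle = (2\pi)^{-3} \int dx\, g(x)^2$, this yields (4.22).

We conclude that
$$ \begin{aligned} -\lim_{L\to\infty} \frac{1}{\beta|\Lambda|} \ln \mathrm{Tr}\, \exp[-\beta H] &\leq \mp \frac{n\alpha}{2} \int_{\mathbb{R}^3} dx\, \frac{|\check\gamma_0^{F,B}(x)|^2}{|x|} - n \Delta\check\gamma_0^{F,B}(0) \\ &\quad + \frac{n}{(2\pi)^3\beta} \int_{\mathbb{R}^3} dp \left[ \gamma_0^{F,B}(p) \ln \gamma_0^{F,B}(p) \pm (1 \mp \gamma_0^{F,B}(p)) \ln(1 \mp \gamma_0^{F,B}(p)) \right]. \end{aligned} \tag{4.26} $$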
The last two terms together are just $f_0^{F,B}(\beta, \rho)$. We have thus established the desired upper bounds. This concludes the proof of Theorems 2.1 and 2.2.

5. Proof of Theorem 3.1

5.1. Localization of relative entropy

If $X$ denotes a projection on the one-particle space $\mathcal{H}$, then states on the Fock space can be restricted to the Fock space over the subspace $X\mathcal{H}$ of $\mathcal{H}$. We denote such a restriction of a state $\Gamma$ by $\Gamma_X$. Since $\chi_{r,\xi}$ defines a projection on $\mathcal{H} = L^2(\Lambda; \mathbb{C}^n)$, we can write
$$ \mathrm{Tr}[n_{r,\xi}(n_{r,\xi} - 1)\Gamma] = \mathrm{Tr}[n_{r,\xi}(n_{r,\xi} - 1)\Gamma_{\chi_{r,\xi}}], \tag{5.1} $$
the latter trace being over the Fock space over $\chi_{r,\xi}\mathcal{H}$.
It is well known [14] that the relative entropy decreases under restriction. More precisely, for any two states $\Gamma$ and $\Upsilon$ on Fock space,
$$ S(\Gamma, \Upsilon) \geq S(\Gamma_X, \Upsilon_X). \tag{5.2} $$
This property is closely related to the strong subadditivity of the von Neumann entropy [11, 13].

Let $\eta : \mathbb{R}^3 \to \mathbb{R}$ be a function with the following properties:
• $\eta \in C^4(\mathbb{R}^3)$,
• $\eta(0) = 1$, and $\eta(x) = 0$ for $|x| \geq 1$,
• $\hat\eta(p) = \int dx\, \eta(x) e^{-ipx} \geq 0$ for all $p \in \mathbb{R}^3$.

We note that such a function (with any degree of regularity) can, for instance, be obtained by taking a smooth function of compact support, and convolving it with itself. The resulting function is then smooth, has compact support and positive Fourier transform. In our application, we need the existence of the fourth derivatives at the origin (see Eq. (5.30) below).

Given such a function $\eta$, we define $\eta_d(x) = \eta(x/d)$ and
$$ \eta_d^{\rm per}(x) = \sum_{j \in \mathbb{Z}^3} \eta_d(x + jL). \tag{5.3} $$
Note that $\eta_d^{\rm per}$ is a periodic function with period $L$ and, since $L \geq 2d$ by assumption, we have that $\eta_d^{\rm per} \leq 1$. Moreover, we define a one-particle density matrix $\gamma_d$ on $\mathcal{H}$ by the kernel
$$ \gamma_d(x, \sigma; y, \tau) = \gamma_0(x, \sigma; y, \tau)\, \eta_d^{\rm per}(x - y), \tag{5.4} $$
with $\gamma_0$ defined in (3.2). This defines a positive operator, with plane waves as eigenfunctions, and eigenvalues determined by the convolution of $\hat\eta_d$ and $\gamma_0^{F,B}(p)$.

If $[L/2d]$ denotes the largest integer $\leq L/2d$, define $\bar d$ by $L/2\bar d = [L/2d]$. Then, $d \leq \bar d \leq 2d$. For $0 \leq r \leq d/2$, let $X_r$ denote the characteristic function of a collection of balls of radius $r$, separated by $2\bar d$:
$$ X_r(x) = \sum_{\xi \in 2\bar d\, \mathbb{Z}^3} \chi_{r,\xi}(x) = \sum_{\xi \in 2\bar d\, \mathbb{Z}^3 \cap [0,L)^3} \chi^{\rm per}_{r,\xi}(x), \tag{5.5} $$
where we denoted
$$ \chi^{\rm per}_{r,\xi}(x) = \sum_{j \in \mathbb{Z}^3} \chi_{r,\xi}(x + jL). \tag{5.6} $$
Note that the minimal distance between the balls is $2\bar d - 2r \geq d$. Hence,
$$ X_r \gamma_d X_r = \sum_{\xi \in 2\bar d\, \mathbb{Z}^3 \cap [0,L)^3} \chi^{\rm per}_{r,\xi}\, \gamma_d\, \chi^{\rm per}_{r,\xi}; \tag{5.7} $$
the off-diagonal terms vanish since $\eta_d(x) = 0$ for $|x| \geq d$. That is, $X_r \gamma_d X_r$ is a direct sum of one-particle density matrices on $\chi^{\rm per}_{r,\xi}\mathcal{H}$ for $\xi \in 2\bar d\, \mathbb{Z}^3 \cap [0, L)^3$.
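The construction of the cut-off function $\eta$ introduced above (a compactly supported function convolved with itself) can be illustrated in one dimension on a grid. The sketch below is a discrete analogue only, not the three-dimensional function used in the proof. It assumes an even bump function, so that its transform is real and the transform of the self-convolution is a square; the grid spacing `H`, the particular bump and the names `self_convolution` and `eta_hat` are illustrative choices.

```python
import cmath
import math

H = 1.0 / 32.0                               # grid spacing (exact in binary)
GRID = [k * H for k in range(-64, 65)]       # symmetric grid on [-2, 2]


def bump(x):
    """Smooth even function supported in |x| < 1/2."""
    return math.exp(-1.0 / (1.0 - 4.0 * x * x)) if abs(x) < 0.5 else 0.0


def self_convolution(x):
    """Riemann-sum approximation of (bump * bump)(x)."""
    return H * sum(bump(y) * bump(x - y) for y in GRID)


NORM = self_convolution(0.0)
# Normalized so that eta(0) = 1; the support is contained in |x| < 1.
ETA = [self_convolution(x) / NORM for x in GRID]


def eta_hat(p):
    """Discrete analogue of the Fourier transform int dx eta(x) exp(-i p x)."""
    return H * sum(e * cmath.exp(-1j * p * x) for e, x in zip(ETA, GRID))
```

On this grid `eta_hat(p)` is, up to rounding, the square of the real discrete transform of `bump` divided by `NORM`, hence non-negative for every `p`. This mirrors the three properties required of $\eta$: normalization at the origin, support in the unit ball, and positive Fourier transform.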
Let $\Gamma_d$ denote the quasi-free state on Fock space with one-particle density matrix $\gamma_d$, and let $\Gamma$ denote any other state on Fock space. The characteristic function $X_r$ defines a projection operator on the one-particle space $\mathcal{H} = L^2(\Lambda; \mathbb{C}^n)$. Hence, the monotonicity of the relative entropy implies
$$ S(\Gamma, \Gamma_d) \geq S(\Gamma_{X_r}, \Gamma_{d,X_r}), \tag{5.8} $$
where $\Gamma_{X_r}$ and $\Gamma_{d,X_r}$ denote the states restricted to the Fock space over $X_r\mathcal{H}$, respectively. Note that the one-particle density matrix of the quasi-free state $\Gamma_{d,X_r}$ is given by $X_r \gamma_d X_r$. Hence, (5.7) shows that $\Gamma_{d,X_r}$ can be written as a product of states on the Fock spaces over the one-particle spaces $\chi^{\rm per}_{r,\xi}\mathcal{H}$ for $\xi \in 2\bar d\, \mathbb{Z}^3 \cap [0, L)^3$. Under this condition $S$ is superadditive, as follows easily from subadditivity of the von Neumann entropy [14]. More precisely,
$$ S(\Gamma, \Gamma_d) \geq S(\Gamma_{X_r}, \Gamma_{d,X_r}) \geq \sum_{\xi \in 2\bar d\, \mathbb{Z}^3 \cap [0,L)^3} S\bigl( \Gamma_{\chi^{\rm per}_{r,\xi}}, \Gamma_{d,\chi^{\rm per}_{r,\xi}} \bigr). \tag{5.9} $$
We can repeat the argument above with a projector defined by the multiplication operator $X_r(x + a)$ for some vector $a \in [0, 2\bar d]^3$. Averaging over $a$ then yields
$$ \begin{aligned} S(\Gamma, \Gamma_d) &\geq \frac{1}{(2\bar d)^3} \int_{[0,2\bar d]^3} da \sum_{\xi \in 2\bar d\, \mathbb{Z}^3 \cap [0,L)^3} S\bigl( \Gamma_{\chi^{\rm per}_{r,\xi+a}}, \Gamma_{d,\chi^{\rm per}_{r,\xi+a}} \bigr) \\ &= \frac{1}{(2\bar d)^3} \int_\Lambda d\xi\, S\bigl( \Gamma_{\chi^{\rm per}_{r,\xi}}, \Gamma_{d,\chi^{\rm per}_{r,\xi}} \bigr). \end{aligned} \tag{5.10} $$

Remark. We emphasize that in order to obtain the superadditivity of the relative entropy leading to (5.9), we have used the fact that $\Gamma_{d,X_r} = \bigotimes_\xi \Gamma_{d,\chi^{\rm per}_{r,\xi}}$. Our estimate applies to any density matrix having this property. For a general state, however, it will be difficult to check this property; in the case of a quasi-free state considered here, it simply translates to the vanishing of off-diagonal terms in the one-particle density matrix (more precisely, the validity of (5.7)).

5.2. Upper bound on relative entropy with cut-off

In the previous subsection, we have shown how to localize relative entropy in the case when the second argument is a state that has been cut off in such a way as to avoid correlations between balls at a certain distance. In the following, we will quantify the effect of this cut-off on the relative entropy.

If $\Gamma_\gamma$ denotes the quasi-free state with one-particle density matrix $\gamma$, and $\Upsilon$ is any other state on Fock space, then $S(\Upsilon, \Gamma_\gamma)$ is convex in $\gamma$. This follows from operator-concavity of the logarithm and $S(\Upsilon, \Gamma_\gamma) = \mathrm{Tr}\, \Upsilon \ln \Upsilon - \mathrm{tr}\, \omega \ln \gamma \mp \mathrm{tr}(1 \mp \omega) \ln(1 \mp \gamma)$, where $\omega$ denotes the one-particle density matrix of $\Upsilon$. Note that $\gamma_d$ can be written as a convex combination of the form
$$ \gamma_d = \frac{1}{|\Lambda|} \sum_{q \in \frac{2\pi}{L}\mathbb{Z}^3} \hat\eta_d(q)\, \gamma_{0,q}, \tag{5.11} $$
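where $\gamma_{0,q}$ is defined by the kernel
$$ \gamma_{0,q}(x, \sigma; y, \tau) = \frac{1}{|\Lambda|} \sum_{p \in \frac{2\pi}{L}\mathbb{Z}^3} \frac{1}{2} \left[ \gamma_0^{F,B}(p + q) + \gamma_0^{F,B}(p - q) \right] e^{ip(x-y)}\, \delta_{\sigma,\tau}. \tag{5.12} $$
Hence, convexity implies that, for any state $\Gamma$,
$$ S(\Gamma, \Gamma_d) \leq \frac{1}{|\Lambda|} \sum_{q \in \frac{2\pi}{L}\mathbb{Z}^3} \hat\eta_d(q)\, S(\Gamma, \Gamma_{0,q}), \tag{5.13} $$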
where $\Gamma_{0,q}$ denotes the quasi-free state corresponding to the one-particle density matrix $\gamma_{0,q}$.

Recall that $\Gamma_0$ denotes the quasi-free state on Fock space with one-particle density matrix $\gamma_0$ given in (3.2), i.e. $\Gamma_0 \equiv \Gamma_{0,0}$. We claim that, for any $t > 0$,
$$ S(\Gamma, \Gamma_{0,q}) \leq \left( 1 + t^{-1} \right) S(\Gamma, \Gamma_0) + \mathrm{tr}\, (h_q - h_0) \left( \frac{1}{e^{(1+t)h_0 - th_q} \pm 1} - \frac{1}{e^{h_q} \pm 1} \right), \tag{5.14} $$
where $h_q = \ln[(1 \mp \gamma_{0,q})/\gamma_{0,q}]$. In the Bose case, we have to assume that $(1 + t)h_0 - th_q > 0$, which is satisfied for $t$ small enough, as our estimates in Lemma 5.2 below will show. Inequality (5.14) follows from the two inequalities (where $\gamma$ denotes the one-particle density matrix of $\Gamma$)
$$ \begin{aligned} \mathrm{tr}\, \gamma ((1 + t)h_0 - th_q) + \mathrm{Tr}\, \Gamma \ln \Gamma &\geq \mp \mathrm{tr} \ln\bigl( 1 \pm e^{-(1+t)h_0 + th_q} \bigr) \\ &\geq \mp \mathrm{tr} \ln\bigl( 1 \pm e^{-h_0} \bigr) + t\, \mathrm{tr}\, (h_0 - h_q) \bigl[ e^{(1+t)h_0 - th_q} \pm 1 \bigr]^{-1} \end{aligned} \tag{5.15} $$
and
$$ \mp \mathrm{tr} \ln\bigl( 1 \pm e^{-h_q} \bigr) \geq \mp \mathrm{tr} \ln\bigl( 1 \pm e^{-h_0} \bigr) + \mathrm{tr}\, (h_q - h_0) \bigl[ e^{h_q} \pm 1 \bigr]^{-1}. \tag{5.16} $$
Dividing (5.15) by $t$ and adding (5.16) yields (5.14).

To estimate the last term in (5.14), we need the following simple lemmas, estimating the expression
$$ h_q^{F,B}(p) = \ln \frac{2 \mp \gamma_0^{F,B}(p + q) \mp \gamma_0^{F,B}(p - q)}{\gamma_0^{F,B}(p + q) + \gamma_0^{F,B}(p - q)}. \tag{5.17} $$
Note that $h_0^{F,B}(p) = \beta(p^2 - \mu)$.

Lemma 5.1 (Fermions). Let $D_z = \sup_{u > 0} [zu/(e^u + z)]$. Then,
$$ -2\beta q^2 (3D_z + 2\beta p^2) \leq h_q^F(p) - h_0^F(p) \leq 2\beta q^2 (1 + 2D_z). \tag{5.18} $$
Moreover,
$$ \beta(q^2 - 2|pq|) \leq h_q^F(p) - h_0^F(p) \leq \beta(q^2 + 2|pq|) \tag{5.19} $$
independently of $z$.
Lemma 5.2 (Bosons). Let $D_z = \sup_{u > 0} [z^2 u e^u/(e^u - z)^2]$. Then,
$$ -2\beta q^2 \left( 3D_z + 2\beta p^2 \right) \leq h_q^B(p) - h_0^B(p) \leq \beta q^2. \tag{5.20} $$
Moreover,
$$ h_q^B(p) - h_0^B(p) \geq \beta(q^2 - 2|pq|) \tag{5.21} $$
independently of $z$.

We defer the proof of Lemmas 5.1 and 5.2 to the appendix. The last term in (5.14) is given by
$$ n \sum_{p \in \frac{2\pi}{L}\mathbb{Z}^3} \bigl( h_q^{F,B}(p) - h_0^{F,B}(p) \bigr) \left( \frac{1}{e^{(1+t)h_0^{F,B}(p) - th_q^{F,B}(p)} \pm 1} - \frac{1}{e^{h_q^{F,B}(p)} \pm 1} \right). \tag{5.22} $$
A simple estimate on the derivative of the last term in brackets with respect to $h_0^{F,B}(p) - h_q^{F,B}(p)$ shows that (5.22) is bounded above by
$$ n(1 + t)\, C_z \sum_{p \in \frac{2\pi}{L}\mathbb{Z}^3} \bigl( h_q^{F,B}(p) - h_0^{F,B}(p) \bigr)^2 \sup_{-1 \leq s \leq t} \frac{1}{e^{(1+s)h_0^{F,B}(p) - sh_q^{F,B}(p)} \pm 1}, \tag{5.23} $$
where $C_z = 1$ for fermions and $C_z = (1 - z)^{-1}$ for bosons. (Here we have used that $1 + \gamma_0^B(p) \leq (1 - z)^{-1}$.) The upper bounds in (5.18) and (5.20) show that, for $0 \leq s \leq t$,
$$ (1 + s)h_0^{F,B}(p) - sh_q^{F,B}(p) \geq \begin{cases} h_0^F(p) - 2t\beta q^2 (1 + 2D_z) & \text{for fermions,} \\ h_0^B(p) - t\beta q^2 & \text{for bosons.} \end{cases} \tag{5.24} $$
We choose $t = \min\{1, (2\beta q^2 (1 + 2D_z))^{-1}\}$ in the fermionic case, and $t = \min\{1, -\mu/(2q^2)\}$ in the bosonic case. With this choice, (5.24) becomes
$$ (1 + s)h_0^{F,B}(p) - sh_q^{F,B}(p) \geq \begin{cases} \beta(p^2 - \mu) - 1 & \text{for fermions,} \\ \beta(p^2 - \mu/2) & \text{for bosons,} \end{cases} \tag{5.25} $$
for $0 \leq s \leq t$. For $-1 \leq s \leq 0$ we use the lower bounds in (5.19) and (5.21), respectively. It is then easy to see that in this case
$$ (1 + s)h_0^{F,B}(p) - sh_q^{F,B}(p) \geq \beta \left( \min\{p^2, (p - q)^2, (p + q)^2\} - \mu \right). \tag{5.26} $$
Applying the bounds (5.26) and (5.25) to the denominator in (5.23) and using (5.18) and (5.20), respectively, to bound the expression $(h_0^{F,B}(p) - h_q^{F,B}(p))^2$ from above, we obtain that
$$ (5.22) \leq C_z |\Lambda| \beta^{1/2} q^4 \tag{5.27} $$
as long as $\beta q^2 \leq$ const. Here, we have also used that $t \leq 1$ by definition. (Again, as in Sec. 4, we abuse the notation slightly and denote by $C_z$ any expression that depends only on $z$.)

It remains to show that (5.27) holds also for large values of $\beta q^2$. To do this, we can go back to (5.22) and apply the bounds above directly to this term. In case
$h_q^{F,B}(p) \geq h_0^{F,B}(p)$, we use (5.25) (with $s = t$) as well as the upper bounds in (5.18) and (5.20). For the case $h_q^{F,B}(p) \leq h_0^{F,B}(p)$, we use (5.26) and the lower bounds in (5.19) and (5.21). We then split the sum into three regions according to where the minimum in (5.26) is attained, and change variables from $p$ to $p - q$ or $p + q$, respectively. In this way, we see that
$$ (5.22) \leq C_z |\Lambda| \beta^{-1} |q| (1 + \beta^{1/2} |q|) \tag{5.28} $$
for any value of $q$. Hence, in particular, (5.27) holds for all $q$. We have thus shown that
$$ S(\Gamma, \Gamma_{0,q}) \leq 2(1 + C_z' \beta q^2)\, S(\Gamma, \Gamma_0) + C_z |\Lambda| \beta^{1/2} q^4, \tag{5.29} $$
with $C_z' = 1 + 2D_z$ for fermions and $C_z' = -1/\ln z$ for bosons. We insert this bound into (5.13) and sum over $q$. We can use
$$ \frac{1}{|\Lambda|} \sum_{q \in \frac{2\pi}{L}\mathbb{Z}^3} \hat\eta_d(q)\, q^4 = \Delta^2\eta(0)\, d^{-4} \tag{5.30} $$
and similarly for $q^4$ replaced by $q^2$. This leads to the result that, irrespective of whether we consider Fermi or Bose symmetry,
$$ S(\Gamma, \Gamma_d) \leq C_z \left[ (1 + \beta d^{-2})\, S(\Gamma, \Gamma_0) + |\Lambda| \frac{\beta^{1/2}}{d^4} \right], \tag{5.31} $$
with $C_z$ a constant depending only on $z = e^{\beta\mu}$.

5.3. Final steps in the proof

If $n_{r,\xi}$ denotes the operator that counts the number of particles in a ball of radius $r$ centered at $\xi$, we want a lower bound on the expression
$$ \int_{\mathbb{R}^3} d\xi\, \mathrm{Tr}[n_{r,\xi}(n_{r,\xi} - 1)\Gamma_{\chi_{r,\xi}}]. \tag{5.32} $$
For a lower bound, we can replace the positive operator $n_{r,\xi}(n_{r,\xi} - 1)$ by $f_K(n_{r,\xi}(n_{r,\xi} - 1))$, where
$$ f_K(t) = \begin{cases} t & \text{for } t \leq K \\ K & \text{for } t > K \end{cases} \tag{5.33} $$
for some $K > 0$ to be determined. Then,
$$ \begin{aligned} \mathrm{Tr}[n_{r,\xi}(n_{r,\xi} - 1)\Gamma_{\chi_{r,\xi}}] &\geq \mathrm{Tr}[f_K(n_{r,\xi}(n_{r,\xi} - 1))\Gamma_{\chi_{r,\xi}}] \\ &\geq \mathrm{Tr}[f_K(n_{r,\xi}(n_{r,\xi} - 1))\Gamma_{d,\chi_{r,\xi}}] - K \|\Gamma_{\chi_{r,\xi}} - \Gamma_{d,\chi_{r,\xi}}\|_1. \end{aligned} \tag{5.34} $$
Next, we note that $t - f_K(t) = [t - K]_+ \leq t^2/(4K)$, and hence
$$ \mathrm{Tr}[f_K(n_{r,\xi}(n_{r,\xi} - 1))\Gamma_{d,\chi_{r,\xi}}] \geq \mathrm{Tr}[n_{r,\xi}(n_{r,\xi} - 1)\Gamma_{d,\chi_{r,\xi}}] - \frac{1}{4K}\, \mathrm{Tr}[n_{r,\xi}^2 (n_{r,\xi} - 1)^2\, \Gamma_{d,\chi_{r,\xi}}]. \tag{5.35} $$
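The elementary inequality behind (5.35) follows from $t^2/(4K) - (t - K) = (t - 2K)^2/(4K) \geq 0$. The short check below is an illustration only; the helper name `truncation_gap` and the grid of test values are arbitrary choices.

```python
def f_K(t, K):
    """Truncation (5.33): identity up to K, constant K above."""
    return t if t <= K else K


def truncation_gap(t, K):
    """t - f_K(t), which equals the positive part of t - K."""
    return t - f_K(t, K)


def gap_is_controlled(K, points=2000, t_max_factor=10.0):
    """Check 0 <= t - f_K(t) <= t^2 / (4K) on a grid of t in [0, t_max_factor * K]."""
    for i in range(points + 1):
        t = t_max_factor * K * i / points
        gap = truncation_gap(t, K)
        if gap < 0.0 or gap > t * t / (4.0 * K) + 1e-12:
            return False
    return True
```

Equality in the upper bound holds at $t = 2K$, where both sides equal $K$.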
Note that Γd,χr,ξ is a quasi-free state. Hence (compare with (3.3)–(3.4)), dξ Tr[nr,ξ (nr,ξ − 1)Γd,χr,ξ ] R3 = dx dy Jr (x − y) ¯2 ∓ |γd (x, σ; y, σ)|2 . Λ×Λ
(5.36)
σ
Moreover, the last term in (5.35) is easy to estimate. Since Γd,χr,ξ is quasi-free, it can be the explicitly expressed in terms of χr,ξ γd χr,ξ . A simple estimate then yields, in the fermionic case, Tr[n2r,ξ (nr,ξ − 1)2 Γd,χr,ξ ] ≤ (tr[χr,ξ γd ])2 (tr[χr,ξ γd ] + 2)2 .
(5.37)
In the bosonic case, we obtain Tr[n2r,ξ (nr,ξ
2 1 − 1) Γd,χr,ξ ] ≤ 24(tr[χr,ξ γd ]) tr[χr,ξ γd ] + . 2 2
2
(5.38)
Note that tr[χr,ξ γd ] = 4πr3 ¯/3 as long as Λ contains the ball of radius r centered at ξ, since γd has a constant density ¯. For any ξ and r, we have tr[χr,ξ γd ] ≤ 4πr3 ¯/3. Integrating over ξ thus yields dξ Tr[n2r,ξ (nr,ξ − 1)2 Γd,χr,ξ ] ≤ const. |Λ|(r3 ¯)2 (1 + r3 ¯)2 . (5.39) R3
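To estimate the last term in (5.34), we first note that $\|\Gamma_{\chi_{r,\xi}} - \Gamma_{d,\chi_{r,\xi}}\|_1^2 \le 2S(\Gamma_{\chi_{r,\xi}}, \Gamma_{d,\chi_{r,\xi}})$ [14, Theorem 1.15]. Using Schwarz's inequality for the $\xi$-integration yields
\[ \int_{\mathbb{R}^3} d\xi\, \|\Gamma_{\chi_{r,\xi}} - \Gamma_{d,\chi_{r,\xi}}\|_1 \le \sqrt{2}\,(L+2r)^{3/2}\left(\int_{\mathbb{R}^3} d\xi\, S(\Gamma_{\chi_{r,\xi}}, \Gamma_{d,\chi_{r,\xi}})\right)^{1/2}. \tag{5.40} \]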
Here, we have also used the fact that the integrand is zero if the distance between $\xi$ and $\Lambda$ is bigger than $r$, since there are no particles outside $\Lambda$ and hence, both restricted states are the Fock space vacuum in this case. To estimate the last term in (5.40), we would like to use (5.10). We note that, again by monotonicity of the relative entropy, $S(\Gamma_{\chi_{r,\xi}}, \Gamma_{d,\chi_{r,\xi}}) \le S(\Gamma_{\chi^{\rm per}_{r,\xi}}, \Gamma_{d,\chi^{\rm per}_{r,\xi}})$. The latter quantity is periodic in $\xi$, with period $L$. Moreover, since $r \le L/2$ by assumption, the cube of side length $L+2r$ is contained within $3^3$ copies of $\Lambda$, and hence
\[ \int_{\mathbb{R}^3} d\xi\, S(\Gamma_{\chi_{r,\xi}}, \Gamma_{d,\chi_{r,\xi}}) \le 3^3 \int_{\Lambda} d\xi\, S(\Gamma_{\chi^{\rm per}_{r,\xi}}, \Gamma_{d,\chi^{\rm per}_{r,\xi}}). \tag{5.41} \]
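Using (5.10) this yields
\[ \int_{\mathbb{R}^3} d\xi\, \|\Gamma_{\chi_{r,\xi}} - \Gamma_{d,\chi_{r,\xi}}\|_1 \le 4\,(L+2r)^{3/2}\,(3\bar d)^{3/2}\, S(\Gamma,\Gamma_d)^{1/2}. \tag{5.42} \]
Note that $(L+2r) \le (3/2)|\Lambda|^{1/3}$ since $2r \le L/2$ by assumption, as well as $\bar d \le 2d$.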
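Collecting all the terms and optimizing over $K$, we obtain the lower bound
\[ \int_{\mathbb{R}^3} d\xi\, \mathrm{Tr}[n_{r,\xi}(n_{r,\xi}-1)\,\Gamma] \ge \int_{\Lambda\times\Lambda} dx\,dy\, J_r(x-y)\Big(\bar\varrho^{\,2} \mp \sum_{\sigma}|\gamma_d(x,\sigma;y,\sigma)|^2\Big) - \mathrm{const.}\; r^3\bar\varrho\,(1+r^3\bar\varrho)\,|\Lambda|^{3/4}\, d^{3/4}\, S(\Gamma,\Gamma_d)^{1/4}. \tag{5.43} \]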
Note that $|\gamma_d(x,\sigma;y,\sigma)| \le |\gamma_0(x,\sigma;y,\sigma)|$ because of (5.4) and the fact that $|\eta_d^{\rm per}| \le 1$. Hence, (5.43), together with (5.31), proves the theorem in the fermionic case. In the bosonic case, we have to estimate, in addition, the term
\[ \sum_{\sigma}\int_{\Lambda\times\Lambda} dx\,dy\, J_r(x-y)\,|\gamma_0(x,\sigma;y,\sigma)|^2\,\big(1-\eta_d^{\rm per}(x-y)^2\big). \tag{5.44} \]
We use that $J_r(x) \le (4\pi/3)r^3$ and $|\gamma_0(x,\sigma;y,\sigma)| \le \bar\varrho/n$. Moreover, we can estimate $\eta_d^{\rm per}(x)^2 \ge 1 - \mathrm{const.}\,(|x|/d)^{\nu}$ for any $0 < \nu \le 2$. Choosing $\nu = 1/4$, we obtain the bound
\[ (5.44) \le \mathrm{const.}\; r^3\bar\varrho\, d^{-1/4}\int_{\Lambda\times\Lambda} dx\,dy\, |\gamma_0(x,\sigma;y,\sigma)|\,|x-y|^{1/4}. \tag{5.45} \]
By simple scaling, the integral is bounded above by $C_z|\Lambda|\beta^{1/8}$ for some $z$-dependent constant. Hence, the error term (5.44) can be absorbed into the error terms already present in (5.43), merely adjusting the constant. This finishes the proof of Theorem 3.1.

Acknowledgments

This work is partially supported by the U.S. National Science Foundation grant PHY-0353181 and by an Alfred P. Sloan Fellowship. It is a pleasure to thank Elliott Lieb and Jan Philip Solovej for stimulating and fruitful discussions.

Appendix A

Proof of Lemmas 5.1 and 5.2. We first prove (5.19) and (5.21). Since both $x \mapsto \ln[(2-x)/x]$ and $x \mapsto \ln[(2+x)/x]$ are monotone decreasing (for $0 < x < 2$ and $x > 0$, respectively), we can obtain upper and lower bounds on $h_q^{F,B}(p)$ by replacing $\gamma_0^{F,B}(p+q)$ and $\gamma_0^{F,B}(p-q)$ by the minimal and maximal values of these two expressions, respectively. This yields (5.19) and (5.21). The upper bound in (5.20) follows immediately from convexity of the map $x \mapsto \ln[(2+x)/x]$ for $x > 0$. The proof of (5.18) and the lower bound in (5.20) are a bit more tedious, but elementary. For convenience, we set $\beta = 1$; the correct $\beta$-dependence follows easily
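by scaling. For $0 \le \lambda \le 1$, we define $f(\lambda) = h_{\lambda q}^{F,B}(p)$. Note that $f'(0) = 0$ and hence
\[ h_q^{F,B}(p) - h_0^{F,B}(p) = f(1) - f(0) = \int_0^1 d\lambda\,(1-\lambda)\,f''(\lambda). \tag{A.1} \]
To calculate $f''(\lambda)$ it is useful to note that
\[ q\nabla\gamma_0^{F,B}(p) = -2pq\,\gamma_0^{F,B}(p)\big(1 \mp \gamma_0^{F,B}(p)\big) \tag{A.2} \]
and
\[ (q\nabla)^2\gamma_0^{F,B}(p) = -2q^2\,\gamma_0^{F,B}(p)\big(1 \mp \gamma_0^{F,B}(p)\big) + 8(pq)^2\,\gamma_0^{F,B}(p)\big(1 \mp \gamma_0^{F,B}(p)\big)\Big(\frac{1}{2} \mp \gamma_0^{F,B}(p)\Big). \tag{A.3} \]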
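The identities (A.2) and (A.3) can be sanity-checked with finite differences. The sketch below works in one dimension and assumes the standard one-particle densities $\gamma_0^{F,B}(p) = (z^{-1}e^{p^2} \pm 1)^{-1}$ at $\beta = 1$; this explicit form is an assumption made for the illustration (it is not restated in this appendix), and all function names are hypothetical.

```python
import math


def gamma0(p: float, z: float, sign: int) -> float:
    """Assumed density at beta = 1: sign = +1 (Fermi), sign = -1 (Bose, needs z < 1)."""
    return 1.0 / (math.exp(p * p) / z + sign)


def first_derivative_formula(p: float, q: float, z: float, sign: int) -> float:
    """Right-hand side of (A.2) in one dimension."""
    g = gamma0(p, z, sign)
    return -2.0 * p * q * g * (1.0 - sign * g)


def second_derivative_formula(p: float, q: float, z: float, sign: int) -> float:
    """Right-hand side of (A.3) in one dimension."""
    g = gamma0(p, z, sign)
    w = g * (1.0 - sign * g)
    return -2.0 * q * q * w + 8.0 * (p * q) ** 2 * w * (0.5 - sign * g)


def finite_differences(p: float, q: float, z: float, sign: int, h: float = 1e-4):
    """Central differences of lambda -> gamma0(p + lambda q) at lambda = 0."""
    plus = gamma0(p + h * q, z, sign)
    mid = gamma0(p, z, sign)
    minus = gamma0(p - h * q, z, sign)
    return (plus - minus) / (2.0 * h), (plus - 2.0 * mid + minus) / (h * h)
```

Comparing `finite_differences` with the two closed formulas for a few values of `p`, `q` and `z` reproduces (A.2) and (A.3) up to discretization error.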
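Denoting $p_\pm = p \pm \lambda q$ and $\gamma_\pm = \gamma_0^{F,B}(p_\pm)$, we therefore have
\[ \begin{aligned} f''(\lambda) ={}& \left[\frac{1}{(\gamma_++\gamma_-)^2} - \frac{1}{(2\mp\gamma_+\mp\gamma_-)^2}\right] \big(2p_+q\,\gamma_+(1\mp\gamma_+) - 2p_-q\,\gamma_-(1\mp\gamma_-)\big)^2 \\ &- \left[\pm\frac{1}{2\mp\gamma_+\mp\gamma_-} + \frac{1}{\gamma_++\gamma_-}\right] \Big[-2q^2\gamma_+(1\mp\gamma_+) - 2q^2\gamma_-(1\mp\gamma_-) \\ &\qquad + 8(p_+q)^2\gamma_+(1\mp\gamma_+)\Big(\frac12\mp\gamma_+\Big) + 8(p_-q)^2\gamma_-(1\mp\gamma_-)\Big(\frac12\mp\gamma_-\Big)\Big]. \end{aligned} \tag{A.4} \]
Rearranging the various terms we can write
\[ \begin{aligned} f''(\lambda) ={}& \pm\frac{4}{\gamma_++\gamma_-}\Big[(p_+q)^2\gamma_+^2(1\mp\gamma_+) + (p_-q)^2\gamma_-^2(1\mp\gamma_-)\Big] \\ &- \frac{4\,\gamma_+\gamma_-}{(\gamma_++\gamma_-)^2}\big(p_+q\,(1\mp\gamma_+) + p_-q\,(1\mp\gamma_-)\big)^2 \\ &\mp \frac{4}{2\mp\gamma_+\mp\gamma_-}\Big[(p_+q)^2\gamma_+(1\mp\gamma_+)^2 + (p_-q)^2\gamma_-(1\mp\gamma_-)^2\Big] \\ &+ \frac{4\,(1\mp\gamma_+)(1\mp\gamma_-)}{(2\mp\gamma_+\mp\gamma_-)^2}\big(p_+q\,\gamma_+ + p_-q\,\gamma_-\big)^2 \\ &+ \left[\pm\frac{1}{2\mp\gamma_+\mp\gamma_-} + \frac{1}{\gamma_++\gamma_-}\right]\big(2q^2\gamma_+(1\mp\gamma_+) + 2q^2\gamma_-(1\mp\gamma_-)\big). \end{aligned} \tag{A.5} \]
The term in the last line is positive and bounded above by $4q^2$, both in the fermionic and bosonic case. For an upper bound in the fermionic case, we use that $p_\pm^2\gamma_\pm \le D_z$,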
$|p_\pm|\gamma_\pm \le \sqrt{D_z}$ as well as $0 \le \gamma_\pm \le 1$ to get $f''(\lambda) \le 4q^2(1+2D_z)$. Similarly, we can obtain a lower bound. Using that $p_+q + p_-q = 2pq$ in the second line in (A.5), a simple estimate yields $f''(\lambda) \ge -12q^2D_z - 8p^2q^2$ in the fermionic case. Using these bounds in (A.1) proves (5.18). In the bosonic case, we only need to prove a lower bound on (A.5). Proceeding as above, this time using $p_\pm^2\gamma_\pm(1+\gamma_\pm) \le D_z$ and $|p_\pm|\gamma_\pm \le \sqrt{D_z}$, we obtain again $f''(\lambda) \ge -12q^2D_z - 8p^2q^2$. This finishes the proof of the lemmas.

References

[1] V. Bach, Error bound for the Hartree–Fock energy of atoms and molecules, Commun. Math. Phys. 147 (1992) 527–548.
[2] V. Bach, R. Lewis, E. H. Lieb and H. Siedentop, On the number of bound states of a (boltzonic and) bosonic $N$-particle system, Math. Z. 214 (1993) 441–460.
[3] F. A. Berezin, Covariant and contravariant symbols of operators, Izv. Akad. Nauk, Ser. Mat. 36 (1972) 1134–1167; USSR Izv. 6 (1973) 1117–1151 (English translation); General concept of quantization, Commun. Math. Phys. 40 (1975) 153–174.
[4] J. G. Conlon, E. H. Lieb and H.-T. Yau, The $N^{7/5}$ law for charged bosons, Commun. Math. Phys. 116 (1988) 417–448.
[5] C. L. Fefferman and R. de la Llave, Relativistic stability of matter I, Rev. Mat. Iber. 2 (1986) 119–161.
[6] G. M. Graf and J. P. Solovej, A correlation estimate with applications to quantum systems with Coulomb interactions, Rev. Math. Phys. 6 (1993) 977–997.
[7] C. Hainzl and R. Seiringer, General decomposition of radial functions on $\mathbb{R}^n$ and applications to $N$-body quantum systems, Lett. Math. Phys. 61 (2002) 75–84.
[8] E. H. Lieb, The classical limit of quantum spin systems, Commun. Math. Phys. 31 (1973) 327–340.
[9] E. H. Lieb and M. Loss, Analysis, 2nd edn. (American Mathematical Society, Providence, RI, 2001).
[10] E. H. Lieb and H. Narnhofer, The thermodynamic limit for jellium, J. Stat. Phys. 12 (1975) 291–310. Errata, ibid. 14 (1976) 465.
[11] E. H. Lieb and M. B. Ruskai, Proof of the strong subadditivity of quantum-mechanical entropy, J. Math. Phys. 14 (1973) 1938–1941; A fundamental property of quantum mechanical entropy, Phys. Rev. Lett. 30 (1973) 434–436.
[12] E. H. Lieb and J. P. Solovej, Ground state energy of the one-component charged Bose gas, Commun. Math. Phys. 217 (2001) 127–163.
[13] G. Lindblad, Completely positive maps and entropy inequalities, Commun. Math. Phys. 40 (1975) 147–151.
[14] M. Ohya and D. Petz, Quantum Entropy and Its Use, Texts and Monographs in Physics (Springer, 2004).
[15] R. Seiringer, The thermodynamic pressure of a dilute Fermi gas, Commun. Math. Phys. 261 (2006) 729–758.
[16] J. P. Solovej, Upper bounds to the ground state energies of the one- and two-component charged Bose gases, to appear in Commun. Math. Phys.; arXiv:math-ph/0406014.
[17] W. Thirring, Lehrbuch der Mathematischen Physik 3, 2nd edn. (Springer, 1994).
June 5, 2006 10:44 WSPC/148-RMP
J070-00264
Reviews in Mathematical Physics Vol. 18, No. 3 (2006) 255–283 © World Scientific Publishing Company
THE MANIFESTLY COVARIANT SOLITON SOLUTIONS ON NONCOMMUTATIVE ORBIFOLDS $T^2/Z_6$ AND $T^2/Z_3$
HUI DENG∗, BO-YU HOU†, KANG-JIE SHI‡, ZHAN-YING YANG§, RUI-HONG YUE¶ and LIU ZHAO
∗,Department of Physics, Nankai University, Tianjin 300071, P. R. China
∗,†,‡,§,¶Institute of Modern Physics, Northwest University, Xi’an, 710069, P. R. China
∗hdeng [email protected] †[email protected] ‡[email protected] §[email protected] ¶[email protected] [email protected]
Received 24 January 2006
Revised 2 May 2006
In this paper, we construct a closed form of projectors on the integral noncommutative orbifold $T^2/Z_6$ in terms of elliptic functions by the GHS (Gopakumar, Headrick and Spradlin) construction. Thereafter, we give a general solution of projectors on $T^2/Z_6$ and $T^2/Z_3$ with minimal trace and continuous reduced matrix $M(k, q_0)$. The projectors constructed by us possess symmetry and manifestly covariant forms under $Z_6$ rotation. Since projectors correspond to the soliton solutions of field theory on the noncommutative orbifold, we thus present a series of corresponding manifestly covariant soliton solutions.
Keywords: Noncommutative orbifold; soliton; projection operator.
Mathematics Subject Classification 2000: 81T75, 35Q99, 34L30
1. Introduction

The idea that the space-time coordinates do not commute is quite old [1]. Indeed, noncommutative geometry has arisen in at least three distinct but closely related contexts in string theory. Witten’s open string field theory formulates the interaction of bosonic open strings in the language of noncommutative geometry [2]. Compactification of matrix theory on noncommutative tori was argued to correspond to the supergravity with constant background three-form tensor field [3]. More generally, it has been realized that noncommutative gauge theory arises in the world-volume theory on D-brane in the presence of a constant background B
field in string theory [4]. Many authors have since contributed to the mathematics and physical applications of noncommutative geometry [5–7]. Naturally, one would like to know what new features arise from quantum field theories on noncommutative space. The UV/IR mixing caused by noncommutativity of space-time is one of the intriguing aspects of noncommutative field theory [8, 9]. Noncommutative field theory provides us with a powerful tool for studying the quantum Hall effect [10–12], and research on the quantum Hall effect has attracted a lot of interest [13–25]. As an important object associated with D-branes, soliton solutions have received a lot of attention from string theorists. Although Derrick’s theorem forbids solitons in ordinary scalar field theory in more than 1 + 1 dimensions [26], Gopakumar, Minwalla and Strominger pointed out that there exist soliton solutions in noncommutative scalar field theory [27]. The important issue of scattering of solitons in noncommutative scalar field theory was then investigated in [28]. It was soon realized that noncommutative solitons represent D-branes in string field theory with a background B field [29, 30], and many of Sen’s conjectures [31, 32] regarding tachyon condensation in string field theory have been beautifully confirmed using properties of noncommutative solitons. Soliton solutions in noncommutative gauge theory were introduced by Polychronakos in [33]. The papers listed in [34, 35] contributed a lot of essential work to the study of solitons in noncommutative gauge theory. The important finding of Gopakumar, Minwalla and Strominger in [27], that a projector may correspond to a soliton in noncommutative field theory, shows the significance of studying projection operators in various noncommutative spaces. Rieffel [36] constructed the complete set of projection operators on the noncommutative torus $T^2$.
On this basis, Boca studied the projection operators on noncommutative orbifolds [37], obtaining some elegant results and the well-known example of a projection operator for the case of $T^2/Z_4$ in terms of the theta function. Martinec and Moore in their important article studied in depth soliton solutions on a wide variety of orbifolds, and the relation between physics and mathematics in this area [38]. Gopakumar, Headrick and Spradlin established a rather clear method to construct the multi-soliton solution on the noncommutative integral torus with generic $\tau$ [39]. The stability and time-dependence of multi-soliton solutions were discussed by Hadasz, Lindstrom, Rocek and von Unge [40]. The approach in [39] can be used to construct the projection operators on the integral noncommutative orbifold $T^2/Z_N$ [41]. Some manifestly covariant projectors with $Z_4$ symmetry on the noncommutative orbifold $T^2/Z_4$ were given in [37, 42]. In [41], we have used the GHS construction to obtain a closed form for the projectors on the noncommutative orbifold $T^2/Z_6$ in terms of the theta function. However, its form is complicated and not explicitly covariant. In this paper, by the GHS construction, we give the projectors for integral $T^2/Z_3$ and $T^2/Z_6$, which are symmetric and manifestly covariant under $Z_6$ and $Z_3$ rotations. Also, the integration form of this expression includes all the projectors with minimal trace and continuous reduced matrices with respect to the variables $k$ and $q$, just as in [42].
This paper is organized as follows: In Sec. 2, we briefly review the operators on the noncommutative orbifold $T^2/Z_N$ and the GHS construction. In Sec. 3, we present the explicit and manifestly covariant form for the projectors on the noncommutative orbifold $T^2/Z_6$. In the last section, we provide the general covariant projection operators on the integral noncommutative orbifolds $T^2/Z_6$ and $T^2/Z_3$. We conclude this paper with some discussions.

2. Noncommutative Orbifold $T^2/Z_N$

In this section, we introduce operators on the noncommutative orbifold $T^2/Z_N$. Let two hermitian operators $\hat y_1$ and $\hat y_2$ satisfy the following commutation relation:
\[ [\hat y_1, \hat y_2] = i. \tag{2.1} \]
The operators constituted by the series of $\hat y_1$ and $\hat y_2$,
\[ \hat O = \sum_{m,n} C_{mn}\,\hat y_1^{\,m}\hat y_2^{\,n}, \qquad m, n \in \mathbb{Z} \text{ and } m, n \ge 0, \tag{2.2} \]
form a noncommutative plane $R^2$. All the operators in $R^2$ which commute with $U_1$ and $U_2$ defined by
\[ U_1 = e^{-il\hat y_2}, \qquad U_2 = e^{il(\tau_2\hat y_1 - \tau_1\hat y_2)}, \tag{2.3} \]
(where $l, \tau_1, \tau_2$ are all real numbers and $l, \tau_2 > 0$, $\tau = \tau_1 + i\tau_2$), constitute the noncommutative torus $T^2$. We have
\[ U_1^{-1}\hat y_1 U_1 = \hat y_1 + l, \quad U_2^{-1}\hat y_1 U_2 = \hat y_1 + l\tau_1, \quad U_1^{-1}\hat y_2 U_1 = \hat y_2, \quad U_2^{-1}\hat y_2 U_2 = \hat y_2 + l\tau_2. \tag{2.4} \]
The operators $U_1$ and $U_2$ are two different wrapping operators around the noncommutative torus and their commutation relation is $U_1U_2 = U_2U_1\,e^{-2\pi i\,\frac{l^2\tau_2}{2\pi}}$. When $A = \frac{l^2\tau_2}{2\pi}$ is an integer, we have $[U_1, U_2] = 0$ and call the noncommutative torus integral. Define two operators $u_1$ and $u_2$:
\[ u_1 = e^{-il\hat y_2/A}, \quad u_2 = e^{-il(\tau_2\hat y_1 - \tau_1\hat y_2)/A}, \quad u_1u_2 = u_2u_1\,e^{2\pi i/A}, \quad u_1^A = U_1, \quad u_2^A = U_2^{-1}. \tag{2.5} \]
The operators on the noncommutative torus $T^2$ are composed of the Laurent series of $u_1$ and $u_2$,
\[ \hat O_{T^2} = \sum_{m,n} C_{mn}\,u_1^m u_2^n, \tag{2.6} \]
where $m, n \in \mathbb{Z}$ and $C_{00}$ is called the trace of the operator. Equation (2.6) includes all the operators on the noncommutative torus $T^2$, satisfying the invariance relation under the action of $\{U_i\}$: $U_i^{-1}\hat O_{T^2}U_i = \hat O_{T^2}$. We may rewrite Eq. (2.6) as
\[ \hat O_{T^2} = \sum_{s,t=0}^{A-1} u_1^s u_2^t\,\Psi_{st}(u_1^A, u_2^A), \tag{2.7} \]
where $\Psi_{st}$ is the coefficient function of the Laurent series of the operators $u_1^A$ and $u_2^A$. We call this formula the standard expansion for the operator on the noncommutative torus $T^2$. The trace of the operator is the constant term’s coefficient of $\Psi_{00}$. Next, we introduce the rotation $R$ in the noncommutative space $R^2$,
\[ R(\theta) = e^{-i\theta\,\frac{\hat y_1^2 + \hat y_2^2}{2} + i\frac{\theta}{2}}, \tag{2.8} \]
with
\[ R^{-1}\hat y_1 R = \cos\theta\,\hat y_1 + \sin\theta\,\hat y_2, \tag{2.9} \]
\[ R^{-1}\hat y_2 R = \cos\theta\,\hat y_2 - \sin\theta\,\hat y_1. \tag{2.10} \]
Assume $\tau = \tau_1 + i\tau_2 = e^{2\pi i/N}$, $\theta = 2\pi/N$ ($N \in \mathbb{Z}$). Define $R_N \equiv R(2\pi/N)$. Then, $U_i' \equiv R_N^{-1}U_iR_N$ can be expressed by a monomial of $\{U_i\}$ and their inverses for $N = 2, 3, 4, 6$. For these cases, we may introduce the orbifold $T^2/Z_N$ [37, 38]. We call the operators on the noncommutative torus that are invariant under the rotation $R_N$ the operators on the noncommutative orbifold $T^2/Z_N$. We can also realize these operators in Fock space. Introduce
\[ a = \frac{\hat y_2 - i\hat y_1}{\sqrt{2}}, \qquad a^+ = \frac{\hat y_2 + i\hat y_1}{\sqrt{2}}, \tag{2.11} \]
then
\[ [a, a^+] = 1, \qquad R = e^{-i\theta a^+ a}. \tag{2.12} \]
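From the above discussion, we know that the operators $U_1$ and $U_2$ commute with each other on the integral torus $T^2$ when $A$ is an integer. So we can introduce a complete set of their common eigenstates, namely the $|k, q\rangle$ representation [39, 43, 44]
\[ |k, q\rangle = \sqrt{\frac{l}{2\pi}}\; e^{-i\tau_1\hat y_2^2/2\tau_2}\sum_{j} e^{ijkl}\,|q + jl\rangle, \tag{2.13} \]
where the ket on the right-hand side is the eigenstate of $\hat y_1$. We have
\[ U_1|k, q\rangle = e^{-ilk}|k, q\rangle, \qquad U_2|k, q\rangle = e^{il\tau_2 q}|k, q\rangle = e^{2\pi iqA/l}|k, q\rangle, \tag{2.14} \]
\[ \mathrm{id} = \int_0^{2\pi/l} dk\int_0^{l} dq\, |k, q\rangle\langle k, q|. \tag{2.15} \]
It also satisfies
\[ |k, q\rangle = \Big|k + \frac{2\pi}{l}, q\Big\rangle = e^{ilk}\,|k, q + l\rangle, \tag{2.16} \]
\[ u_1|k, q\rangle = \Big|k, q + \frac{l}{A}\Big\rangle, \qquad u_2|k, q\rangle = e^{-il\tau_2 q/A}|k, q\rangle = e^{-2\pi iq/l}|k, q\rangle. \tag{2.17} \]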
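Relations (2.16) and (2.17) imply that, at fixed $k$ and $q_0$, $u_1$ and $u_2$ act on the $A$ states $|k, q_0 + ln/A\rangle$ ($n = 0, \dots, A-1$) as a twisted shift matrix and a diagonal phase matrix. The following sketch builds these $A \times A$ matrices and can be used to confirm the algebra (2.5) together with the eigenvalues in (2.14). The function names and the numerical values used to exercise them are illustrative assumptions only.

```python
import cmath


def shift_u1(A: int, l: float, k: float):
    """u1 maps state n to n + 1; the wrap-around picks up e^{-ilk}, see (2.16)."""
    m = [[0j] * A for _ in range(A)]
    for n in range(A):
        if n + 1 < A:
            m[n + 1][n] = 1 + 0j
        else:
            m[0][n] = cmath.exp(-1j * l * k)
    return m


def clock_u2(A: int, l: float, q0: float):
    """u2 is diagonal with entries e^{-2 pi i (q0/l + n/A)}, see (2.17)."""
    m = [[0j] * A for _ in range(A)]
    for n in range(A):
        m[n][n] = cmath.exp(-2j * cmath.pi * (q0 / l + n / A))
    return m


def identity(size: int):
    return [[(1 + 0j) if i == j else 0j for j in range(size)] for i in range(size)]


def matmul(x, y):
    size = len(x)
    return [[sum(x[i][j] * y[j][c] for j in range(size)) for c in range(size)]
            for i in range(size)]


def scale(x, factor: complex):
    return [[factor * entry for entry in row] for row in x]


def matpow(x, power: int):
    result = identity(len(x))
    for _ in range(power):
        result = matmul(result, x)
    return result


def close(x, y, tol: float = 1e-12) -> bool:
    return all(abs(a - b) < tol for rx, ry in zip(x, y) for a, b in zip(rx, ry))
```

With these helpers one can compare `matmul(u1, u2)` against `scale(matmul(u2, u1), cmath.exp(2j * cmath.pi / A))`, and the $A$-th powers against multiples of the identity.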
Consider Eq. (2.7), namely the standard expansion of operators on $T^2$. We have
\[ \Psi_{st}(u_1^A, u_2^A)|k, q\rangle = \Psi_{st}(e^{-ilk}, e^{-2\pi iqA/l})|k, q\rangle \equiv \tilde\psi_{st}(k, q)|k, q\rangle, \tag{2.18} \]
where $\tilde\psi_{st}$ is a function of the independent variables $k$ and $q$, called the symbol function of $\Psi_{st}(u_1^A, u_2^A)$. From (2.18), we see that the function $\tilde\psi_{st}$ is invariant when $q \to q + l/A$ or $k \to k + 2\pi/l$,
\[ \tilde\psi_{st}\Big(k + \frac{2\pi m}{l}, q\Big) = \tilde\psi_{st}\Big(k, q + \frac{ln}{A}\Big) = \tilde\psi_{st}(k, q), \qquad m, n \in \mathbb{Z}. \tag{2.19} \]
As long as the symbol function is obtained, the operator on the noncommutative torus can be completely determined. Introducing a new basis $|k, q_0, n\rangle \equiv |k, q_0 + \frac{ln}{A}\rangle$, $k \in [0, \frac{2\pi}{l})$, $q_0 \in [0, \frac{l}{A})$, we have from (2.15)
\[ \sum_{n=0}^{A-1}\int_0^{2\pi/l} dk\int_0^{l/A} dq_0\, \Big|k, q_0 + \frac{ln}{A}\Big\rangle\Big\langle k, q_0 + \frac{ln}{A}\Big| = \mathrm{id}. \tag{2.20} \]
From the above equation and (2.17), (2.19), we see that when any power of the operators $u_1$ and $u_2$ acts on $|k, q_0 + \frac{ln}{A}\rangle$, the result can be expanded in the basis $|k, q_0 + \frac{ln'}{A}\rangle$ with the same $k$, $q_0$. So all the operators on the noncommutative torus do not change $k$ and $q_0$. We have
\[ \hat O_{T^2}\Big|k, q_0 + \frac{ln}{A}\Big\rangle = \sum_{n'} M^{O}(k, q_0)_{n'n}\Big|k, q_0 + \frac{ln'}{A}\Big\rangle. \tag{2.21} \]
Thus, for any $k$ and $q_0$ we get an $A \times A$ matrix, called the reduced matrix $M^{O}(k, q_0)$. We have
\[ \hat A\hat B\Big|k, q_0 + \frac{ln}{A}\Big\rangle = \sum_{n', n''} M^{A}(k, q_0)_{n''n'}\,M^{B}(k, q_0)_{n'n}\Big|k, q_0 + \frac{ln''}{A}\Big\rangle. \tag{2.22} \]
For the projection operator on the torus $T^2$, we have
\[ P\Big|k, q_0 + \frac{ln}{A}\Big\rangle = \sum_{n'} M(k, q_0)_{n'n}\Big|k, q_0 + \frac{ln'}{A}\Big\rangle. \tag{2.23} \]
It is easy to find from (2.22) that the sufficient and necessary condition for $P^2 = P$ [41] is
\[ M(k, q_0)^2 = M(k, q_0). \tag{2.24} \]
When $T^2$ satisfies $Z_N$ symmetry, since after the $R_N$ rotation $U_i$ can be expressed by a monomial of $\{U_i\}$ and their inverses, the state vector $R_N|k, q_0 + \frac{ln}{A}\rangle$ is still a common eigenstate of the operators $U_1$ and $U_2$. With the completeness of $|k, q + \frac{ls}{A}\rangle$, and considering the eigenvalues of $U_i$ in the $kq$ representation, this vector can be expanded in the basis $|k', q_0' + \frac{ls'}{A}\rangle$ as follows
\[ R_N\Big|k, q_0 + \frac{ln}{A}\Big\rangle = \sum_{n'} A(k, q_0)_{n'n}\Big|k', q_0' + \frac{ln'}{A}\Big\rangle, \tag{2.25} \]
where $k' \in [0, 2\pi/l)$, $q_0' \in [0, l/A)$ are definite [41]. Equation (2.25) gives
\[ R_N^{-1}\Big|k', q_0' + \frac{ln}{A}\Big\rangle = \sum_{n'} A^{-1}(k, q_0)_{n'n}\Big|k, q_0 + \frac{ln'}{A}\Big\rangle. \tag{2.26} \]
We can get the relation between $k', q_0'$ and $k, q_0$. The mapping $W : (k, q_0) \to (k', q_0')$, where $W^N = \mathrm{id}$, is essentially a linear relation, and area-preserving. By this fact and the unitarity of $R_N$, we conclude that the matrix $A$ is a unitary matrix [41], namely
\[ A^{*}(k, q_0)_{nn'} = A^{-1}(k, q_0)_{n'n}. \tag{2.27} \]
Since the projector on the noncommutative orbifold $T^2/Z_N$ satisfies $R_N^{-1}PR_N = P$, then from (2.23), (2.25) and (2.26), one obtains
\[ R_N^{-1}PR_N\Big|k, q_0 + \frac{ln}{A}\Big\rangle = \sum_{n'} [A^{-1}(k, q_0)M(k', q_0')A(k, q_0)]_{n'n}\Big|k, q_0 + \frac{ln'}{A}\Big\rangle, \tag{2.28} \]
which should be equal to:
\[ P\Big|k, q_0 + \frac{ln}{A}\Big\rangle = \sum_{n'} M(k, q_0)_{n'n}\Big|k, q_0 + \frac{ln'}{A}\Big\rangle. \tag{2.29} \]
So, we have
\[ M(k', q_0') = A(k, q_0)M(k, q_0)A^{-1}(k, q_0). \tag{2.30} \]
Thus, the sufficient and necessary conditions for the reduced matrix of a projector on the noncommutative orbifold $T^2/Z_N$ to satisfy are:
\[ M(k, q_0)^2 = M(k, q_0), \tag{2.31} \]
\[ M(k', q_0') = A(k, q_0)M(k, q_0)A^{-1}(k, q_0). \tag{2.32} \]
Next, we study the relation between the coefficient function $\tilde\psi_{st}(k, q)$ and the reduced matrix $M(k, q_0)$. Due to (2.17), (2.18), (2.19) and (2.23), we have
\[ \begin{aligned} P\Big|k, q_0 + \frac{ln}{A}\Big\rangle &= \sum_{s,t} u_1^s u_2^t\,\Psi_{st}(u_1^A, u_2^A)\Big|k, q_0 + \frac{ln}{A}\Big\rangle \\ &= \sum_{s,t} e^{-2\pi i(q_0/l + n/A)t}\,\tilde\psi_{st}(k, q_0)\Big|k, q_0 + \frac{l(n+s)}{A}\Big\rangle \\ &= \sum_{n'} M(k, q_0)_{n'n}\Big|k, q_0 + \frac{ln'}{A}\Big\rangle. \end{aligned} \tag{2.33} \]
From the periodic condition of $|k, q\rangle$ (see (2.16)), for the $n + s < A$ case, we have
\[ M(k, q_0)_{n+s,n} = \sum_{t=0}^{A-1} e^{-2\pi i(q_0/l + n/A)t}\,\tilde\psi_{st}(k, q_0), \tag{2.34} \]
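and for the $n + s \ge A$ case, we have
\[ M(k, q_0)_{n+s-A,n} = \sum_{t=0}^{A-1} e^{-2\pi i(q_0/l + n/A)t}\,\tilde\psi_{st}(k, q_0)\,e^{-ilk}. \tag{2.35} \]
Setting
\[ M(k, q_0)_{n+s,n} = M(k, q_0)_{n+s-A,n}\,e^{ilk}, \tag{2.36} \]
we can uniformly write:
\[ M(k, q_0)_{n+s,n} = \sum_{t=0}^{A-1} e^{-2\pi i(q_0/l + n/A)t}\,\tilde\psi_{st}(k, q_0) \tag{2.37} \]
and have
\[ \tilde\psi_{st}(k, q_0) = \frac{1}{A}\sum_{r=0}^{A-1} M(k, q_0)_{r+s,r}\,e^{2\pi i(q_0/l + r/A)t}. \tag{2.38} \]
Equations (2.37) and (2.38) are the relations between $\tilde\psi_{st}$ and the elements of the reduced matrix $M$. We set the elements of the reduced matrix $M$ to be (the GHS construction) [39]
\[ M(k, q_0)_{n'n} = \frac{\big\langle k, q_0 + \frac{ln'}{A}\big|\phi_1\big\rangle\big\langle\phi_2\big|k, q_0 + \frac{ln}{A}\big\rangle}{\sum_{n''}\big\langle k, q_0 + \frac{ln''}{A}\big|\phi_1\big\rangle\big\langle\phi_2\big|k, q_0 + \frac{ln''}{A}\big\rangle}. \tag{2.39} \]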
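The reduced matrix (2.39) is a normalized rank-one matrix, which is why it is idempotent. The sketch below illustrates this with arbitrary complex vectors standing in for the overlaps $\langle k, q_0 + ln/A|\phi_1\rangle$ and $\langle k, q_0 + ln/A|\phi_2\rangle$; the function names and the stand-in vectors are assumptions for illustration and are not taken from the paper.

```python
def ghs_reduced_matrix(a, b):
    """M[n'][n] = a[n'] * conj(b[n]) / sum_m a[m] * conj(b[m]), the pattern of (2.39).

    a[n] stands in for <k, q0 + l n / A | phi_1> and b[n] for <k, q0 + l n / A | phi_2>.
    """
    norm = sum(x * y.conjugate() for x, y in zip(a, b))
    if abs(norm) == 0.0:
        raise ValueError("the normalization sum vanishes; (2.39) is undefined")
    return [[a[i] * b[j].conjugate() / norm for j in range(len(b))]
            for i in range(len(a))]


def matmul(x, y):
    size = len(x)
    return [[sum(x[i][j] * y[j][c] for j in range(size)) for c in range(size)]
            for i in range(size)]


def is_idempotent(m, tol: float = 1e-12) -> bool:
    """Check the projector condition (2.31), M^2 = M, entry by entry."""
    squared = matmul(m, m)
    return all(abs(p - q) < tol for rp, rq in zip(squared, m) for p, q in zip(rp, rq))


def trace(m) -> complex:
    return sum(m[i][i] for i in range(len(m)))
```

For any such vectors the matrix squares to itself and has unit matrix trace, independently of the particular choice of $|\phi_1\rangle$ and $|\phi_2\rangle$.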
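It satisfies (2.31) and as long as $R|\phi_j\rangle = e^{i\alpha_j}|\phi_j\rangle$, it also satisfies (2.32) (see [42]). Notice that $M(k, q_0)$ in (2.39) possesses the property (2.36). We then have
\[ \begin{aligned} \tilde\psi_{st}(k, q_0) &= \frac{1}{A}\sum_{r=0}^{A-1} M(k, q_0)_{r+s,r}\,e^{2\pi i(q_0/l + r/A)t} \\ &= \frac{\frac{1}{A}\sum_{r=0}^{A-1}\big\langle k, q_0 + \frac{l(r+s)}{A}\big|\phi_1\big\rangle\big\langle\phi_2\big|k, q_0 + \frac{lr}{A}\big\rangle\,e^{2\pi i(q_0/l + r/A)t}}{\sum_{r'}\big\langle k, q_0 + \frac{lr'}{A}\big|\phi_1\big\rangle\big\langle\phi_2\big|k, q_0 + \frac{lr'}{A}\big\rangle} \\ &= \frac{\tilde f_{st}(k, q_0)}{A\tilde f_{00}(k, q_0)}, \end{aligned} \tag{2.40} \]
where
\[ \tilde f_{st}(k, q_0) \equiv \sum_{r=0}^{A-1}\Big\langle k, q_0 + \frac{l(r+s)}{A}\Big|\phi_1\Big\rangle\Big\langle\phi_2\Big|k, q_0 + \frac{lr}{A}\Big\rangle\,e^{2\pi i(q_0/l + r/A)t}, \tag{2.41} \]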
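with
\[ \tilde f_{st}(k, q_0) = \tilde f_{st}(k, q_0 + l/A) = \tilde f_{st}(k + 2\pi/l, q_0), \tag{2.42} \]
\[ \tilde f_{st}(k, q_0) = \tilde f_{s+A,t}(k, q_0)\,e^{-ilk} \tag{2.43} \]
\[ \phantom{\tilde f_{st}(k, q_0)} = \tilde f_{s,t+A}(k, q_0)\,e^{-2\pi iq_0A/l}. \tag{2.44} \]
Define
\[ u = \frac{lk}{2\pi}, \qquad v = \frac{q_0}{l}, \tag{2.45} \]
\[ f_{st}(u, Av) \equiv \tilde f_{st}(k, q_0). \tag{2.46} \]
So the function $f_{st}(u, Av)$ is a function of the independent variables $u$ and $Av$ with period 1. Similarly, define
\[ \psi_{st}(u, Av) \equiv \tilde\psi_{st}(k, q_0), \tag{2.47} \]
and we have
\[ \psi_{st}(u, Av) = \frac{f_{st}(u, Av)}{A f_{00}(u, Av)}. \tag{2.48} \]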
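Let $X \equiv e^{-ilk} = e^{-2\pi iu}$, $Y \equiv e^{-2\pi iq_0A/l} = e^{-2\pi iAv}$. If we change the variables $X$ and $Y$ into $u_1^A$ and $u_2^A$, respectively, in $\psi_{st}(u, Av)$, the standard form (2.7) of the projection operator can be easily obtained. So, the key question is how to find $\tilde f_{st}(k, q_0)$. For simplicity, we set
\[ |\phi_1\rangle = |\phi_2\rangle = |0\rangle, \qquad a|0\rangle = 0, \qquad R_N|0\rangle = |0\rangle. \tag{2.49} \]
After some derivation, we have [39]
\[ \langle k, q|0\rangle \equiv C_0(k, q) = \frac{1}{\sqrt{l}\,\pi^{1/4}}\,\theta\begin{bmatrix}0\\0\end{bmatrix}\Big(\frac{q}{l} + \frac{\tau k}{l\tau_2}, \frac{\tau}{A}\Big)\,e^{-\frac{\tau}{2i\tau_2}k^2 + ikq} \tag{2.50} \]
\[ \phantom{\langle k, q|0\rangle \equiv C_0(k, q)} = \sqrt{\frac{Ai}{l\tau\sqrt{\pi}}}\;\theta\begin{bmatrix}0\\0\end{bmatrix}\Big(\frac{lk}{2\pi} + \frac{Aq}{l\tau}, -\frac{A}{\tau}\Big)\,e^{-\pi i\frac{Aq^2}{\tau l^2}}, \tag{2.51} \]
where
\[ \theta(z, \tau) \equiv \theta\begin{bmatrix}0\\0\end{bmatrix}(z, \tau), \tag{2.52} \]
\[ \theta\begin{bmatrix}a\\b\end{bmatrix}(z, \tau) = \sum_{m} e^{\pi i\tau(m+a)^2}\,e^{2\pi i(m+a)(z+b)}. \tag{2.53} \]
Define
\[ g_{ss'}(u, v) \equiv \Big\langle k, q_0 + \frac{ls}{A}\Big|0\Big\rangle\Big\langle 0\Big|k, q_0 + \frac{ls'}{A}\Big\rangle. \tag{2.54} \]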
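The properties of the theta function (2.53) that are used repeatedly below (period 1, quasi-period $\tau$, and the zero at $z = \frac12 + \frac{\tau}{2}$) can be confirmed numerically from the defining series. This is a minimal sketch; the truncation `cutoff` and the function name are illustrative assumptions, and the series converges only for $\mathrm{Im}\,\tau > 0$.

```python
import cmath


def theta_char(z: complex, tau: complex, a: float = 0.0, b: float = 0.0,
               cutoff: int = 30) -> complex:
    """Truncated series (2.53): sum over m of e^{pi i tau (m+a)^2} e^{2 pi i (m+a)(z+b)}."""
    if tau.imag <= 0:
        raise ValueError("the series converges only for Im(tau) > 0")
    total = 0j
    for m in range(-cutoff, cutoff + 1):
        total += cmath.exp(1j * cmath.pi * tau * (m + a) ** 2
                           + 2j * cmath.pi * (m + a) * (z + b))
    return total
```

For $a = b = 0$ the series satisfies $\theta(z+1, \tau) = \theta(z, \tau)$ and $\theta(z+\tau, \tau) = e^{-\pi i\tau - 2\pi iz}\theta(z, \tau)$, and it vanishes at $z = \frac12 + \frac{\tau}{2}$, which is the zero used in Sec. 3.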
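Then, we get for real $u$ and $v$,
\[ \begin{aligned} f_{st}(u, Av) &= \sum_{r=0}^{A-1} g_{s+r,r}(u, v)\,e^{2\pi it(\frac{r}{A} + v)} \\ &= \frac{1}{l\sqrt{\pi}}\sum_{r}\theta\Big(v + \frac{u\tau + s + r}{A}, \frac{\tau}{A}\Big)\,\theta\Big(v + \frac{u\tau^{*} + r}{A}, -\frac{\tau^{*}}{A}\Big)\,e^{2\pi it(v + \frac{r}{A})}\,e^{\pi i\frac{\tau - \tau^{*}}{A}u^2 + 2\pi i\frac{s}{A}u} \\ &= \frac{A}{l|\tau|\sqrt{\pi}}\sum_{r}\theta\Big(u + \frac{A}{\tau}\Big(v + \frac{s+r}{A}\Big), -\frac{A}{\tau}\Big)\,\theta\Big(u + \frac{A}{\tau^{*}}\Big(v + \frac{r}{A}\Big), \frac{A}{\tau^{*}}\Big) \\ &\qquad\times e^{-\pi i\frac{A}{\tau}(v + \frac{s+r}{A})^2 + \pi i\frac{A}{\tau^{*}}(v + \frac{r}{A})^2}\,e^{2\pi it(v + \frac{r}{A})}. \end{aligned} \tag{2.55} \]
Then, from (2.55) and the properties of theta functions, we have
\[ f_{st}(u + 1, Av) = f_{st}(u, Av + 1) = f_{st}(u, Av), \tag{2.56} \]
\[ f_{st}(u + A\tau, Av) = e^{-2\pi i\left(2u + A(\tau + \tau^{*})v + \frac{A}{2}(\tau - \tau^{*}) + \frac{s}{\tau}\right)}\,f_{st}(u, Av), \tag{2.57} \]
\[ f_{st}(u, A\tau + Av) = e^{-2\pi i\left(2Av + (\tau + \tau^{*})u + A\frac{\tau - \tau^{*}}{2} - t\tau\right)}\,f_{st}(u, Av). \tag{2.58} \]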
This is the brief review of the GHS construction of projection operators on the noncommutative orbifold $T^2/Z_N$. In the next section, we will concretely discuss how to construct the manifestly covariant projectors on the noncommutative orbifold $T^2/Z_6$.

3. The Covariant Projectors on Noncommutative Orbifold $T^2/Z_6$

In the above section, we reviewed some results for projectors on the noncommutative orbifold $T^2/Z_N$. Boca and we presented some manifestly covariant projectors with $Z_4$ symmetry on the noncommutative integral orbifold $T^2/Z_4$ [37, 42]. In [41], we have presented a closed form for projectors on the noncommutative orbifold $T^2/Z_6$ in terms of the elliptic function. However, its form is not explicitly covariant. In this section, we develop the manifestly covariant form for projectors on the noncommutative orbifold $T^2/Z_6$ by the GHS construction. In the case that $\tau = \tau_6 = e^{\frac{\pi i}{3}}$, we have
\[ f_{st}(u + 1, Av) = f_{st}(u, Av + 1) = f_{st}(u, Av), \tag{3.1} \]
\[ f_{st}(u + A\tau, Av) = e^{-2\pi i\left(2u + Av + A\tau - \frac{A}{2} + \frac{s}{\tau}\right)}\,f_{st}(u, Av), \tag{3.2} \]
\[ f_{st}(u, A\tau + Av) = e^{-2\pi i\left(2Av + u + A\tau - \frac{A}{2} - t\tau\right)}\,f_{st}(u, Av). \tag{3.3} \]
From this, it can be proved that $f_{st}(u, Av)$ belongs to a three-dimensional linear space. We can define the basis of this space as
\[ \theta(Av + \alpha)\,\theta(Av + u + \beta)\,\theta(u + \gamma) \equiv e(u, Av). \tag{3.4} \]
Here the parameters $\alpha, \beta, \gamma$ will be determined later. Any function satisfying conditions (3.1)–(3.3) can be presented by the three linearly independent functions $\{e(u, Av)\}$. We denote
\[ \theta(z) \equiv \theta(z, A\tau) \equiv \theta\begin{bmatrix}0\\0\end{bmatrix}(z, A\tau) \]
for simplicity. (In the following, the theta function without a modular parameter means its modular parameter is $A\tau$.) We have from (3.4),
\[ e(u + 1, Av) = e(u, Av + 1) = e(u, Av), \tag{3.5} \]
\[ e(u + A\tau, Av) = e^{-2\pi i(2u + Av + A\tau + \beta + \gamma)}\,e(u, Av), \tag{3.6} \]
\[ e(u, A\tau + Av) = e^{-2\pi i(u + 2Av + A\tau + \alpha + \beta)}\,e(u, Av). \tag{3.7} \]
Thus, we require that
\[ \alpha + \beta = -\frac{A}{2} - t\tau, \qquad \beta + \gamma = -\frac{A}{2} + \frac{s}{\tau}, \tag{3.8} \]
where $\tau = e^{\frac{\pi i}{3}}$. Next, we will consider the covariant property for the projectors. From the definition, it is easy to get for $R = R_6$,
\[ u_1' = R^{-1}u_1R = u_2^{-1}, \qquad u_2' = R^{-1}u_2R = e^{-\pi i/A}\,u_1u_2, \tag{3.9} \]
\[ u_1'u_2' = e^{2\pi i/A}\,u_2'u_1'. \tag{3.10} \]
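Define $c = e^{-\pi i/A}$, then it follows that
\[ R^{-1}PR = \sum_{s,t} u_1^{t}\,u_2^{t-s}\,c^{-2st + t^2}\,\Psi_{st}\big(u_2^{-A}, c^{A^2}u_1^A u_2^A\big). \tag{3.11} \]
We have from (2.14), (2.17) and (2.45)
\[ u_1^A|k, q\rangle = e^{-2\pi iu}|k, q\rangle, \qquad u_2^A|k, q\rangle = e^{-2\pi iAv}|k, q\rangle, \tag{3.12} \]
\[ u_1'^{A}|k, q\rangle = e^{-2\pi i(-Av)}|k, q\rangle, \qquad u_2'^{A}|k, q\rangle = e^{-\pi iA}\,e^{-2\pi i(u + Av)}|k, q\rangle. \tag{3.13} \]
From (2.18) and (2.47), we have
\[ \Psi_{st}(u_1^A, u_2^A)|k, q\rangle \equiv \psi_{st}(u, Av)|k, q\rangle, \tag{3.14} \]
\[ R^{-1}\Psi_{st}(u_1^A, u_2^A)R|k, q\rangle = \Psi_{st}(u_1'^{A}, u_2'^{A})|k, q\rangle = \psi_{st}\Big(-Av, -\frac{A}{2} + u + Av\Big)|k, q\rangle. \tag{3.15} \]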
That is, the variables $u$ and $Av$ change as
\[ u \to -Av, \qquad Av \to -\frac{A}{2} + u + Av \tag{3.16} \]
under the rotation $R = R_6$. Therefore, when $P = R^{-1}PR$, the formulae (2.7) and (3.11) demand
\[ c^{-2st + t^2}\,\psi_{st}\Big(-Av, -\frac{A}{2} + u + Av\Big) = \psi_{t,t-s}(u, Av). \tag{3.17} \]
Notice that $\psi_{00}(u, Av)$ is invariant under the rotation $R$. From (2.40), we can get
\[ \tilde\psi_{st}(k, q_0) \equiv \psi_{st}(u, Av) = \frac{f_{st}(u, Av)}{A f_{00}(u, Av)}. \tag{3.18} \]
As long as we find the function $f_{st}(u, Av)$, which satisfies the relation similar to (3.17), we can obtain $\psi_{st}(u, Av)$ by (3.18). Set $\psi_{st}(u, Av)|k, q\rangle = \Psi_{st}(u_1^A, u_2^A)|k, q\rangle$ (see the text after Eq. (2.48)). Then, we can get the projector $P = \sum_{st} u_1^s u_2^t\,\Psi_{st}(u_1^A, u_2^A)$, which is invariant under the rotation $R$. In the following, we wish to find a set of covariant basis to construct such $f_{st}(u, Av)$. We write the basis as $e(u, Av) = \theta(Av + \alpha)\theta(Av + u + \beta)\theta(u + \gamma)$. After the rotation $R$, it is turned into
\[ \begin{aligned} e'(u, Av) &= \theta\Big(-\frac{A}{2} + u + Av + \alpha\Big)\,\theta\Big(-\frac{A}{2} + u + \beta\Big)\,\theta(Av - \gamma) \\ &= \theta(Av + u + \beta')\,\theta(u + \gamma')\,\theta(Av + \alpha'). \end{aligned} \tag{3.19} \]
Thus, the basis vector changes its parameters under the rotation to
\[ \alpha' = -\gamma, \qquad \beta' = \alpha - \frac{A}{2}, \qquad \gamma' = -\frac{A}{2} + \beta \pmod{\mathbb{Z}}. \tag{3.20} \]
Now, we take the transformation in the light of (3.17)
\[ s' = t, \qquad t' = t - s \;\Rightarrow\; t = s', \qquad s = s' - t'. \]
The covariant basis should satisfy
\[ e_{st}\Big(-Av, -\frac{A}{2} + u + Av\Big) = e_{t,t-s}(u, Av) = e_{s',t'}(u, Av). \tag{3.21} \]
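The index relabelling $(s, t) \to (s', t') = (t, t - s)$ that accompanies the $Z_6$ rotation has two elementary properties that are used in the covariance check: it has order six, and it preserves the quadratic form $s^2 + t^2 - st$ that appears in the phase $\phi_{st}$ later in this section. A minimal sketch with hypothetical helper names:

```python
def rotate_indices(s: int, t: int):
    """One Z_6 step on the labels: (s, t) -> (s', t') = (t, t - s)."""
    return t, t - s


def orbit(s: int, t: int, steps: int):
    """Successive images of (s, t) under repeated relabelling."""
    points = [(s, t)]
    for _ in range(steps):
        s, t = rotate_indices(s, t)
        points.append((s, t))
    return points


def quadratic_form(s: int, t: int) -> int:
    """The Z_6-invariant form s^2 + t^2 - s t."""
    return s * s + t * t - s * t


def phase_shift_matches(s: int, t: int) -> bool:
    """Integer identity behind the real parts of the phases: -st - (t^2 - 2st) = -t(t - s)."""
    return -s * t - (t * t - 2 * s * t) == -t * (t - s)
```

Three steps send $(s, t)$ to $(-s, -t)$ and six steps return to the starting labels.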
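We set
\[ \alpha = \alpha_{st} = \alpha_1 s + \alpha_2 t + \alpha_3, \qquad \beta = \beta_{st} = \beta_1 s + \beta_2 t + \beta_3, \qquad \gamma = \gamma_{st} = \gamma_1 s + \gamma_2 t + \gamma_3, \tag{3.22} \]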
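here
\[ \begin{aligned} \alpha_1 &= \frac{1}{2} + \frac{\sqrt{3}}{6}i, & \alpha_2 &= -\frac{\sqrt{3}}{3}i, & \alpha_3 &= \frac{B}{2}, \\ \beta_1 &= \frac{1}{2} - \frac{\sqrt{3}}{6}i, & \beta_2 &= -\frac{1}{2} - \frac{\sqrt{3}}{6}i, & \beta_3 &= -\frac{A}{2} - \frac{B}{2}, \\ \gamma_1 &= -\frac{\sqrt{3}}{3}i, & \gamma_2 &= -\frac{1}{2} + \frac{\sqrt{3}}{6}i, & \gamma_3 &= \frac{B}{2}, \qquad B \in \mathbb{Z}. \end{aligned} \tag{3.23} \]
Then, (3.8) is satisfied. From (3.20) and (3.23), the variables $\alpha, \beta, \gamma$ transform into
\[ \begin{aligned} \alpha' &= \alpha_1 s' + \alpha_2 t' + \alpha_3 - B = \alpha_{s't'}, \\ \beta' &= \beta_1 s' + \beta_2 t' + \beta_3 + A = \beta_{s't'}, \\ \gamma' &= \gamma_1 s' + \gamma_2 t' + \gamma_3 - B = \gamma_{s't'} \pmod{\mathbb{Z}}. \end{aligned} \tag{3.24} \]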
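The parameter choice (3.23) can be checked against the constraints (3.8) and the rotation rule (3.20) with a few lines of complex arithmetic. The sketch below tests every relation modulo integers, which is how (3.8), (3.20) and (3.24) are meant to hold; the helper names and the tolerance are illustrative assumptions.

```python
import cmath
import math

SQRT3 = math.sqrt(3.0)
TAU = cmath.exp(1j * math.pi / 3)  # tau = e^{pi i / 3}


def parameters(s: int, t: int, A: int, B: int):
    """alpha_st, beta_st, gamma_st with the coefficients of (3.23)."""
    alpha = (0.5 + SQRT3 / 6 * 1j) * s + (-SQRT3 / 3 * 1j) * t + B / 2
    beta = (0.5 - SQRT3 / 6 * 1j) * s + (-0.5 - SQRT3 / 6 * 1j) * t - A / 2 - B / 2
    gamma = (-SQRT3 / 3 * 1j) * s + (-0.5 + SQRT3 / 6 * 1j) * t + B / 2
    return alpha, beta, gamma


def is_integer(x: complex, tol: float = 1e-9) -> bool:
    """True if x is an ordinary integer up to rounding error."""
    return abs(x.imag) < tol and abs(x.real - round(x.real)) < tol


def satisfies_constraints(s: int, t: int, A: int, B: int) -> bool:
    """Check (3.8) modulo integers."""
    alpha, beta, gamma = parameters(s, t, A, B)
    first = alpha + beta - (-A / 2 - t * TAU)
    second = beta + gamma - (-A / 2 + s / TAU)
    return is_integer(first) and is_integer(second)


def is_covariant(s: int, t: int, A: int, B: int) -> bool:
    """Check that the rotated parameters (3.20) equal those at (s', t') = (t, t - s) mod Z."""
    alpha, beta, gamma = parameters(s, t, A, B)
    alpha_new, beta_new, gamma_new = parameters(t, t - s, A, B)
    return (is_integer(-gamma - alpha_new)
            and is_integer(alpha - A / 2 - beta_new)
            and is_integer(beta - A / 2 - gamma_new))
```

Both checks pass for integer $s$, $t$, $A$ and $B$, so the integer offsets written in (3.24) play no role modulo $\mathbb{Z}$.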
So, we have
\[ e_{st}\Big(-Av, -\frac{A}{2} + u + Av\Big) = e_{t,t-s}(u, Av) = e_{s',t'}(u, Av). \tag{3.25} \]
We see that it really satisfies the covariance condition. Then, respectively, take $B$ to be 0 and 1 in (3.24). We obtain two linearly independent bases, denoted by $e_0$ and $e_1$, which obey (3.25). We verify in the following that $f_{st}(u, Av)$ of (2.55) can be expanded by such two bases. We rewrite the $f_{st}(u, Av)$ in (2.55) by taking
\[ \tilde v_0 + \frac{\tilde s_0}{A} = v + \frac{\tau u}{A} + \frac{s}{A}, \qquad \tilde v_0 = v + \frac{\tau^{*}u}{A}. \tag{3.26} \]
When $\tau = \tau_6 = e^{\frac{2\pi}{6}i}$, it follows that
\[ \frac{\tilde s_0}{A} = \frac{(2\tau - 1)u}{A} + \frac{s}{A}. \tag{3.27} \]
We get $f_{st}(u, Av)$ as follows:
\[ f_{st}(u, Av) = \frac{1}{l\sqrt{\pi}}\sum_{r}\theta\Big(\tilde v_0 + \frac{\tilde s_0 + r}{A}, \frac{\tau}{A}\Big)\,\theta\Big(\tilde v_0 + \frac{r}{A}, -\frac{\tau^{*}}{A}\Big)\,e^{2\pi itr/A}\,e^{\pi i\frac{\tau - \tau^{*}}{A}u^2 + 2\pi i\frac{s}{A}u + 2\pi itv}. \tag{3.28} \]
Expanding the two theta functions involved in (3.28) by (2.53), we obtain the following form (see Appendix A for details):
\[ f_{st}(u, Av) = \frac{A}{l\sqrt{\pi}}\sum_{\delta = 0,1} e^{2\pi i\phi}\,\theta\Big(z, \frac{2\tau - 1}{A}\Big)\,\theta\big(w, A(2\tau - 1)\big), \tag{3.29} \]
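where
\[ \begin{aligned} z &= -\delta\tau - \frac{\tau}{A}t + \frac{s}{A} + \frac{2\tau - 1}{A}u, \\ w &= (\delta A - t)\tau + s + 2Av + u, \\ \phi &= \delta^2 A\,\frac{\tau - 1}{2} + \delta\big(Av - (\tau - 1)u\big) + \frac{2\tau - 1}{2A}u^2 + \frac{s}{A}u - \frac{\tau}{A}tu - \frac{st}{A} + \frac{\tau}{2A}t^2. \end{aligned} \tag{3.30} \]
From (2.55) and (3.1)–(3.3), the function $\tilde f_{st}(k, q) = f_{st}(u, Av)$ belongs to a three-dimensional space spanned by functions of $u$ and $Av$, and $f_{st}(u, Av)$ can be expanded by the following bases:
\[ \begin{aligned} e_0(u, Av) &= \theta(Av + \alpha)\,\theta(Av + u + \beta)\,\theta(u + \gamma), \\ e_1(u, Av) &= \theta\Big(Av + \alpha + \frac{1}{2}\Big)\,\theta\Big(Av + u + \beta - \frac{1}{2}\Big)\,\theta\Big(u + \gamma + \frac{1}{2}\Big), \end{aligned} \tag{3.31} \]
\[ e_2(u, Av) = \theta(Av + \alpha + x)\,\theta(Av + u + \beta - x)\,\theta(u + \gamma + x), \qquad 0 < x \ll 1, \tag{3.32} \]
where $\alpha, \beta, \gamma$ are given in (3.23) and (3.24) with $B = 0$. We have
\[ f_{st} = c_0e_0 + c_1e_1 + c_2e_2. \tag{3.33} \]
For the convenience of derivation, we change the arguments as follows:
\[ Av = \lambda - \alpha + a, \tag{3.34} \]
\[ u = -\gamma + b, \tag{3.35} \]
where $\lambda = \frac{1}{2}(A\tau + 1)$. Notice that $\beta - \alpha - \gamma = -\frac{A}{2}$ in the setting of (3.23) and (3.24) for $B = 0$. Then, we have
\[ \begin{aligned} e_0 &= \theta(\lambda + a)\,\theta\Big(\lambda - \frac{A}{2} + a + b\Big)\,\theta(b), \\ e_1 &= \theta\Big(\lambda + a + \frac{1}{2}\Big)\,\theta\Big(\lambda - \frac{A}{2} + a + b - \frac{1}{2}\Big)\,\theta\Big(b + \frac{1}{2}\Big), \\ e_2 &= \theta(\lambda + a + x)\,\theta\Big(\lambda - \frac{A}{2} + a + b - x\Big)\,\theta(b + x). \end{aligned} \tag{3.36} \]
Based on the replacement of arguments given by (3.34) and (3.35), we rewrite $f_{st}$ in (3.29) by the variables $a$ and $b$,
\[ f_{st}(u, Av)\Big|_{\substack{u = -\gamma + b \\ v = \frac{1}{A}(\lambda - \alpha + a)}} = \frac{A}{l\sqrt{\pi}}\,e^{2\pi i\phi_{st}}\sum_{\delta = 0,1} e^{2\pi i\phi_\delta(a, b)}\,\theta\Big(\delta(1 - \tau) + \frac{2\tau - 1}{A}b, \frac{2\tau - 1}{A}\Big)\,\theta\big(\delta A\tau + 2\lambda + 2a + b, A(2\tau - 1)\big), \tag{3.37} \]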
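where
\[ \phi_{st} = \frac{\sqrt{3}i}{6A}(s^2 + t^2 - st) - \frac{1}{2A}st, \qquad \phi_\delta(a, b) = \frac{\sqrt{3}i}{2A}b^2 + \delta\Big(\delta A\,\frac{\tau - 1}{2} + \lambda + a + (1 - \tau)b\Big). \]
In order to verify $c_2 = 0$ and determine the coefficients $c_0, c_1$ in (3.33), we will consider the case of $b_m = \frac{1 - A}{2} + \frac{mA}{2\tau - 1}$, $m \in \mathbb{Z}$, and $a = a_n = \frac{n}{2}$, $n \in \mathbb{Z}$. Since the theta function has the property $\theta\big(n_1 + \frac{1}{2} + (n_2 + \frac{1}{2})\tau, \tau\big) = 0$, $n_1, n_2 \in \mathbb{Z}$, when $\delta = 1$ the first theta function in (3.37), $\theta\big(\frac{(1 - 2A)(2\tau - 1)}{2A} + m + \frac{1}{2}, \frac{2\tau - 1}{A}\big)$, vanishes. Therefore, we have for $\tau = e^{\pi i/3}$
\[ \begin{aligned} f_{st}(u, Av)\Big|_{\substack{u = -\gamma + b_m \\ v = \frac{1}{A}(\lambda - \alpha + a_n)}} &\equiv f_{mst} \\ &= \frac{A}{l\sqrt{\pi}}\,e^{2\pi i\phi_{st} + 2\pi i\phi_\delta|_{\delta = 0}}\,\theta\Big(\frac{(1 - A)(2\tau - 1)}{2A} + m, \frac{2\tau - 1}{A}\Big) \\ &\qquad\times\theta\Big(A\tau + 1 + 2a_n + \frac{1 - A}{2} + \frac{mA}{2\tau - 1}, A(2\tau - 1)\Big) \\ &= \frac{A}{l\sqrt{\pi}}\,e^{2\pi i\phi_{st} + 2\pi i\phi_\delta|_{\delta = 0}}\,\theta\Big(\frac{(1 - A)(2\tau - 1)}{2A} + m, \frac{2\tau - 1}{A}\Big) \\ &\qquad\times\theta\Big(\frac{A(2\tau - 1)}{2} + \frac{1}{2} - \frac{m}{3}A(2\tau - 1) + n, A(2\tau - 1)\Big). \end{aligned} \tag{3.38} \]
For the case of $u = -\gamma_{st} + b_m$, $v = \frac{1}{A}(\lambda - \alpha_{st} + a_n)$, $n \in \mathbb{Z}$, (3.38) is independent of $n$. When $m = 3p$, $p \in \mathbb{Z}$, the second theta function on the right-hand side of (3.38) vanishes, namely
\[ f_{mst} = 0; \tag{3.39} \]
when $m \neq 3p$, define the function $f_{st}(u, Av)$, $f_{st}(u, Av) \equiv f_{mst} \neq 0$. Next, we check $e_0$, $e_1$ and $e_2$ in the various cases:
• If $a = 0$, one has $e_0 = 0$ and
\[ e_1^0(m) \equiv e_1|_{a = 0} = \theta\Big(\lambda + \frac{1}{2}\Big)\,\theta\Big(\lambda - \frac{m}{3}A(2\tau - 1)\Big)\,\theta\Big(b_m + \frac{1}{2}\Big); \]
• If $a = -\frac{1}{2}$, one has $e_1 = 0$ and
\[ e_0^0(m) \equiv e_0|_{a = -\frac{1}{2}} = \theta\Big(\lambda - \frac{1}{2}\Big)\,\theta\Big(\lambda - \frac{m}{3}A(2\tau - 1)\Big)\,\theta(b_m); \tag{3.40} \]
• If $m = 3p$, one has $e_0 = e_1 = 0$ and $e_2 \neq 0$.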
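From (3.33) and (3.39), we can obtain
\[ c_2 = 0. \tag{3.41} \]
So, we expand $f_{st}(u, Av)$ in the following manner:
\[ f_{st}(u, Av) = c_0e_0 + c_1e_1. \tag{3.42} \]
Next, taking into account the case of $m \neq 3p$, we have, respectively, $f_{mst} = c_1e_1^0(m)$, $f_{mst} = c_0e_0^0(m)$. It follows that
\[ c_1 = \frac{f_{mst}}{e_1^0(m)}, \tag{3.43} \]
\[ c_0 = \frac{f_{mst}}{e_0^0(m)}. \tag{3.44} \]
So we have
\[ f_{st} = f_{mst}\left(\frac{e_0}{e_0^0(m)} + \frac{e_1}{e_1^0(m)}\right). \tag{3.45} \]
In addition, we have $f_{mst} = e^{2\pi i\phi_{st}}f_{m00}$ and note that the ratio
\[ \frac{e_0^0(m)}{e_1^0(m)} = \frac{\theta(b_m)}{\theta\big(b_m + \frac{1}{2}\big)} \tag{3.46} \]
does not contain $s$ and $t$. Thus, from (2.48), (3.45) and (3.46), we have
\[ \psi_{st}(u, Av) = \frac{f_{st}}{Af_{00}} = \frac{e^{2\pi i\phi_{st}}}{A}\,\frac{\theta\big(b_m + \frac{1}{2}\big)e_0(st) + \theta(b_m)e_1(st)}{\theta\big(b_m + \frac{1}{2}\big)e_0(00) + \theta(b_m)e_1(00)}, \tag{3.47} \]
where
\[ e_j(st) = \theta\Big(Av + \alpha_{st} + \frac{1}{2}j\Big)\,\theta\Big(Av + u + \beta_{st} + \frac{1}{2}j\Big)\,\theta\Big(u + \gamma_{st} + \frac{1}{2}j\Big). \]
Let $\Theta(e^{-2\pi ix}) \equiv \theta(x, A\tau)$. From (2.7) and (2.18), we have
\[ P_{Z_6} = \sum_{s,t = 0}^{A-1} u_1^s u_2^t\,\frac{e^{2\pi i\phi_{st}}}{A}\,\frac{\theta\big(b_m + \frac{1}{2}\big)\varepsilon_0(st) + \theta(b_m)\varepsilon_1(st)}{\theta\big(b_m + \frac{1}{2}\big)\varepsilon_0(00) + \theta(b_m)\varepsilon_1(00)}, \tag{3.48} \]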
where
l j (τ2 yˆ1 − τ1 yˆ2 ) + αst + , Aτ 2π 2 l j+A ×θ (τ2 yˆ1 + (1 − τ1 ) yˆ2 ) + βst + , Aτ 2π 2 j × θ lˆ y2 + γst + , Aτ 2 A −2πi(αst + 1 j)
A −2πi(βst + 12 j) −2πi(γst + 12 j) 2 = Θ u2 e × Θ uA × Θ uA . 1 u2 e 1e
εj (st) = θ
(3.49) Note that relation
A 2
included in the second θ function in εj (st) comes from the commutation [l(τ2 yˆ1 − τ1 yˆ2 ), lˆ y2 ] ≡ [Aˆ v, u ˆ] = il2 τ2 = 2πiA
due to −A 2πi(Av+u) u−A |k, q 1 u2 |k, q = e
= ei(Aˆv+ˆu−πA) |k, q. In (3.48) and (3.49), the parameters αst , βst , γst are given by (3.23) and (3.24) with B = 0, √ πi 3 − πi 6 2 , (3.50) t] αst = [e s + e 3 πi A (3.51) βst = e− 3 αst − , 2 2πi
γst = e− 3 αst , √ 3i 2 st (s + t2 − st) − , φst = 6A 2A A 1−A m 1−A +m = − A(2τ − 1), bm = 2 2τ − 1 2 3 We take m = 3p + M , M = ±1 to obtain A A(2τ − 1) 1 θ bm + − , Aτ θ 2 2 3 , = A 1 A(2τ − 1) θ(bm ) − − , Aτ θ 2 2 3
(3.52) (3.53) m = 3p.
(3.54)
(3.55)
which is independent of the choice of M, p. Now, we check the covariance under the rotation transformation R. In the following, we find the expression (3.48) possesses manifest covariance. Actually, e0 and e1 are the covariant functions obtained from
(3.24) by taking $B = 0$ and $B = 1$. Therefore, they satisfy the covariant relation (3.25). We then check (3.17). From (3.47), the exponent of the phase factor related to $st$ on the left-hand side of (3.17) is proportional to
\[
\phi_{st} - \frac{1}{2A}(t^2 - 2st) = \frac{\sqrt{3}\,i}{6A}(s^2 + t^2 - st) - \frac{1}{2A}(t^2 - st).
\]
On the right-hand side the exponent of the phase factor is
\[
\frac{\sqrt{3}\,i}{6A}\bigl((t-s)^2 + t^2 - t(t-s)\bigr) - \frac{1}{2A}(t^2 - st)
= \frac{\sqrt{3}\,i}{6A}(s^2 + t^2 - st) - \frac{1}{2A}(t^2 - st).
\]
The two phase factors are equal. From (3.25) and (3.49), we know that the $\psi_{st}$ given by (3.47) really satisfies the covariance relation (3.17). So, (3.48) is the solution for the projector which possesses manifest rotational covariance. Now, we have obtained the explicit and manifestly covariant form for the projection operators on the noncommutative integral orbifold $T^2/Z_6$ with trace $\frac{1}{A}$.

4. The General Covariant Projection Operators

In this section, we construct the general projectors with the manifestly covariant property by the GHS construction. Instead of the vacuum $|0\rangle$, we take
\[
|\phi_j\rangle = \int d^2z\, F_j(z)|z\rangle, \quad j = 1, 2, \tag{4.1}
\]
where $|z\rangle$ is the coherent state satisfying the relation $a|z\rangle = \frac{l}{i\sqrt{2}}\, z|z\rangle$, and $F_j(z)$ is an arbitrary continuous function of the argument $z$. Then, for $R$ in (2.12), we have
\[
R|z\rangle = |e^{-i\theta} z\rangle. \tag{4.2}
\]
Now, take $\theta = \frac{\pi}{3}$, $R = R_6$. When $F_j(z)$ satisfies the $Z_6$ symmetry, namely
\[
F_j(e^{\frac{\pi i}{3}} z) = e^{i\alpha_j} F_j(z), \tag{4.3}
\]
we have
\[
R|\phi_j\rangle = e^{i\alpha_j}|\phi_j\rangle. \tag{4.4}
\]
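Then, we may obtain a projector in $T^2/Z_6$ from (2.40). In this case, we have
\[
f_{st}(u, Av) = \int d^2z_1\, d^2z_2 \sum_{r=0}^{A-1} e^{2\pi i\left(\frac{q_0}{l} + \frac{r}{A}\right)t}
\Bigl\langle k, q_0 + \frac{l(r+s)}{A}\Bigm| z_1\Bigr\rangle
\Bigl\langle z_2 \Bigm| k, q_0 + \frac{lr}{A}\Bigr\rangle
F_1(z_1) F_2(z_2)^*. \tag{4.5}
\]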
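Define
\[
G(u, v, z_1, z_2^*)_{ss'} \equiv \Bigl\langle k, q_0 + \frac{ls}{A}\Bigm| z_1\Bigr\rangle \Bigl\langle z_2 \Bigm| k, q_0 + \frac{ls'}{A}\Bigr\rangle.
\]
We have
\[
f_{st}(u, Av) \equiv \int d^2z_1\, d^2z_2\, F_1(z_1) F_2(z_2)^* f_{st}(u, Av, z_1, z_2^*),
\]
where
\[
\begin{aligned}
f_{st}(u, Av, z_1, z_2^*) &= \sum_{r=0}^{A-1} G(u, v, z_1, z_2^*)_{s+r,r}\, e^{2\pi i t\left(\frac{r}{A} + v\right)} \\
&= \Bigl[ c_0(s,t)\,\theta(Av + \alpha - Az_1\alpha_1 - Az_2^*\beta_1)\,\theta(Av + u + \beta - Az_1\beta_1 - Az_2^*\alpha_1) \\
&\qquad\times \theta(u + \gamma - Az_1\gamma_1 + Az_2^*\gamma_1) \\
&\quad + c_1(s,t)\,\theta\Bigl(Av + \alpha - Az_1\alpha_1 - Az_2^*\beta_1 + \frac12\Bigr) \\
&\qquad\times \theta\Bigl(Av + u + \beta - Az_1\beta_1 - Az_2^*\alpha_1 - \frac12\Bigr)\,\theta\Bigl(u + \gamma - Az_1\gamma_1 + Az_2^*\gamma_1 + \frac12\Bigr) \Bigr] \\
&\quad\times e^{2\pi i s(z_1 - z_2^*)\gamma_1 + 2\pi i t(z_1\alpha_1 + z_2^*\beta_1) + \frac{l^2}{4}(2z_1 z_2^* + |z_1|^2 + |z_2|^2)}.
\end{aligned}
\tag{4.6}
\]
The proof is given in Appendix B. Due to (2.40), one has
\[
\psi_{st}(u, Av) = \frac{\int d^2z_1\, d^2z_2\, f_{st}(u, Av, z_1, z_2^*) F_1(z_1) F_2(z_2)^*}
{A \int d^2z_1\, d^2z_2\, f_{00}(u, Av, z_1, z_2^*) F_1(z_1) F_2(z_2)^*}. \tag{4.7}
\]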
Therefore, we obtain the explicit form for the general projection operators as follows, 1 θ bm + ε0 (st) + θ(bm )ε1 (st) A−1 1 s t 2πiφst 2 P = u1 u2 e , (4.8) 1 A s,t=0 θ bm + ε0 (00) + θ(bm )ε1 (00) 2 where εj (s, t) = dz1 dz2 F1 (z1 )F2 (z2 )∗ Ej (s, t, z1 , z2∗ ), and
−2πi(αst + 2j ) 2πiA(z1 α1 +z2∗ β1 ) e Ej (s, t, z1 , z2∗ ) = Θ uA 2e
A 2πi(βst + 2j ) 2πiA(z1 β1 +z2∗ α1 ) × Θ uA e 1 u2 e
2πi(γst + j2 ) 2πiA(z1 −z2∗ )γ1 e × Θ uA 1e ∗
∗
l2
∗
× e2πis(z1 −z2 )γ1 +2πit(z1 α1 +z2 β1 )+ 4 (2z1 z2 +|z1 |
2
+|z2 |2 )
.
(4.9)
When $F_j(z_j)$ satisfies the $Z_6$ symmetry, namely, in the case $\theta = \frac{\pi}{3}$ in formula (4.2), (4.8) gives the projectors $P_{Z_6}$; obviously, the projector also belongs to $P_{Z_3}$. Just as proved in [42], we have obtained all the projectors with trace $\frac{1}{A}$ on the orbifold $T^2/Z_6$, including the case that $\tilde\psi_{st}(k, q_0)$ is an analytic function. When $F_j(z_j)$ satisfies the $Z_3$ symmetry but does not satisfy the $Z_6$ symmetry, namely,
\[
F_j(e^{\frac{2\pi i}{3}} z) = e^{i\alpha_j} F_j(z), \tag{4.10}
\]
and
\[
F_j(e^{\frac{\pi i}{3}} z) \neq \mathrm{const.}\, F_j(z), \tag{4.11}
\]
then (4.8) gives a projector of $T^2/Z_3$, but it is not a projector of $T^2/Z_6$. It is shown in Appendix B that the form of our solution possesses manifest covariance under rotation.
5. Discussion

We have found the complete set of projectors in analytic form with trace $\frac{1}{A}$ in all the cases of the integral orbifold $T^2/Z_N$ (for the case of $T^2/Z_4$, refer to [42]); of course the case with trace $\frac{A-1}{A}$ is naturally obtained from the case with trace $\frac{1}{A}$ by $P' = \mathrm{id} - P$. However, we have not obtained analytic solutions for projectors with an arbitrary trace $\frac{A-m}{A}$, $1 < m < A - 1$, which is an intriguing question that is closely related to the resolvent of the case that $A$ is a rational number but not an integer. Whether such an analytic solution exists, and whether there is something special in its framework if it does, is worthy of further study.

Acknowledgments

This work is supported by the National Natural Science Foundation of China under Grants No. 10575080 and No. 90403014.
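Appendix A

Now, we show briefly the proof of (3.29). Set $|\phi\rangle = |0\rangle$; then
\[
\begin{aligned}
f_{st}(u, Av) &= \sum_{r=0}^{A-1} \Bigl\langle k, q_0 + \frac{l(s+r)}{A}\Bigm| 0\Bigr\rangle \Bigl\langle 0 \Bigm| k, q_0 + \frac{lr}{A}\Bigr\rangle\, e^{\frac{2\pi i t r}{A}}\, e^{\frac{2\pi i t q_0}{l}} \\
&= \frac{1}{l\sqrt{\pi}} \sum_{r} \theta\Bigl(\tilde v_0 + \frac{\tilde s_0 + r}{A}, \frac{\tau}{A}\Bigr)\,\theta\Bigl(\tilde v_0 + \frac{r}{A}, -\frac{\tau^*}{A}\Bigr)\, e^{\frac{2\pi i t r}{A}}
\times e^{\pi i\frac{\tau - \tau^*}{A}u^2 + 2\pi i\frac{s}{A}u + 2\pi i t v}.
\end{aligned}
\tag{A.1}
\]
Note that $\tilde v_0 \neq v$, $\tilde s_0 \neq s$ due to (3.26), (3.27), and that $-\frac{\tau^*}{A} = \frac{\tau - 1}{A}$. We define $F_{st}$ as
\[
F_{st} = \sum_{r=0}^{A-1} \theta\Bigl(\tilde v_0 + \frac{\tilde s_0 + r}{A}, \frac{\tau}{A}\Bigr)\,\theta\Bigl(\tilde v_0 + \frac{r}{A}, \frac{\tau - 1}{A}\Bigr)\, e^{\frac{2\pi i t r}{A}}. \tag{A.2}
\]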
r
e
τ πi A m2
e
s ˜ +r 2πim(˜ v0 + 0A )
m
e
−1 πi τ A m2
e2πim
r (˜ v0 + A )
×e
2πitr A
.
m
After replacing variable m by n − m , we get = Fst
πi
e A {m
2
(2τ −1)+n2 (τ −1)−2mn(τ −1)}
s ˜0
× e2πin˜v0 × e2πi A m ×
A−1
m,n
e2πi
t+n A r
.
r=0
Using A−1
e
2πi t+n A r
r=0
A when n = LA − t, L ∈ Z, = 0 otherwise,
and substituting LA − t for n, where L runs over all integers, after some computation and arrangement, we have =A Fst
πi
1
e A (2τ −1){(m− 2 (LA−t))
2
+ 14 (LA−t)2 }
m,L πi
× e− 2A {(LA) πi
2
−2LAt}
πi
2
πi
1
× e− 2A t × e− A (m− 2 (LA−t))(t−2˜s0 )
1
× e− A ( 2 (LA−t))(t−2˜s0 ) × e2πi(LA−t)˜v0 × eπimL . Next, we set L = 2h + δ, where h ∈ Z, δ = 0, 1 and note the fact that eπimL = eπim(2h+δ) = eπimδ . We obtain Fst in the form of sum over the three variables h, m, δ as follows: =A Fst
eπi(
2τ −1 2 2 1 1 A ){(m− 2 ((2h+δ)A−t)) + 4 ((2h+δ)A−t) }
δ=0,1 m,h πi
× e− 2A {(2h+δ) πi
1
2
A2 −2(2h+δ)At}
πi
2
× e− 2A t × eπimδ × e2πi[(2h+δ)A−t]˜v0 πi
× e− A (m− 2 ((2h+δ)A−t))(t−2˜s0 ) × e− 2A ((2h+δ)A−t)(t−2˜s0 ) .
(A.3)
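The step from the double sum over $m, n$ to the sum over $m, L$ rests on the discrete orthogonality relation for the sum over $r$. A minimal numerical sketch of that relation follows; the function names are illustrative and the ranges of $A$, $t$, $n$ tested are arbitrary.

```python
import cmath
import math


def root_of_unity_sum(A, t, n):
    """Return sum_{r=0}^{A-1} exp(2*pi*i*(t+n)*r/A)."""
    return sum(cmath.exp(2j * math.pi * (t + n) * r / A) for r in range(A))


def orthogonality_holds(A, tol=1e-9):
    """Check that the sum equals A when n = L*A - t for an integer L, else 0."""
    for t in range(A):
        for n in range(-2 * A, 2 * A + 1):
            value = root_of_unity_sum(A, t, n)
            # n = L*A - t for some integer L  <=>  (t + n) is divisible by A
            expected = A if (t + n) % A == 0 else 0
            if abs(value - expected) > tol:
                return False
    return True
```

Only the terms with $n = LA - t$ survive, which is why $n$ can be eliminated in favour of $L$ in the expression leading to (A.3).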
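After the arrangement, we find that the sum over $m$ and $h$ can be separated into products of two theta functions, namely (note that $\delta^2 = \delta$)
\[
\begin{aligned}
F_{st} &= A \sum_{\delta} \sum_{h} e^{\pi i A(2\tau - 1)h^2} \times e^{2\pi i h\left[\frac{\delta A - t + 2\tilde s_0}{2} + 2A\tilde v_0 + (2\tau - 1)\frac{\delta A - t}{2}\right]} \\
&\quad\times \sum_{m} e^{\pi i\frac{2\tau - 1}{A}(m - hA)^2} \times e^{2\pi i(m - hA)\left[\frac{\delta A - t + 2\tilde s_0}{2A} - \left(\frac{2\tau - 1}{A}\right)\left(\frac{\delta A - t}{2}\right)\right]} \\
&\quad\times e^{2\pi i\frac{2\tau - 1}{A}\left(\frac{\delta A - t}{2}\right)^2} \times e^{-\frac{\pi i}{2A}(\delta A - t)^2} \times e^{2\pi i(\delta A - t)\tilde v_0} \\
&= A \sum_{\delta = 0,1} \theta\Bigl(\frac{\delta A - t}{A}(1 - \tau) + \frac{\tilde s_0}{A}, \frac{2\tau - 1}{A}\Bigr) \times \theta\bigl((\delta A - t)\tau + \tilde s_0 + 2A\tilde v_0, A(2\tau - 1)\bigr) \\
&\quad\times e^{2\pi i\frac{2\tau - 2}{A}\left(\frac{\delta A - t}{2}\right)^2} \times e^{2\pi i(\delta A - t)\tilde v_0} \\
&\equiv A \sum_{\delta} \theta\Bigl(z, \frac{2\tau - 1}{A}\Bigr)\,\theta(w, A(2\tau - 1))\, e^{2\pi i\phi},
\end{aligned}
\tag{A.4}
\]
where
\[
\phi = \frac{\tau - 1}{2A}(\delta A - t)^2 + (\delta A - t)\tilde v_0, \qquad
z = \frac{\delta A - t}{A}(1 - \tau) + \frac{\tilde s_0}{A}, \qquad
w = (\delta A - t)\tau + \tilde s_0 + 2A\tilde v_0.
\]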
τ −τ ∗ A
s u2 +2πi A u+2πitv
,
∗
from (A.1), (A.2) as well as v˜0 = v + τAu and s˜A0 = As + (2τ −1)u , we obtain A 2τ − 1 Aθ z , fst = θ(w, A(2τ − 1))e2πiφ , A δ
where τ s 2τ − 1 2τ − 1 = −δτ − t + + u, A A A A w = (δA − t)τ + s + 2Av + u,
z = z − t
2τ − 1 2 τ −1 s + δ(Av − (τ − 1)u) + u + u, 2 2A A τ st τ 2 − tu − + t . A A 2A
φ = δA
This is the formula (3.29).
(A.6) (A.7)
Having known that πi fst = Fst e
(A.5)
Appendix B Now, we would like to derive the general form of the projectors. We have the inner product of k, q| and the coherent state |(z) in paper [42], √ √ q + ττ2 k − i 2z τ 1 − τ k2 +ikq+ 2kz −(z 2+z z¯ )/2 , e 2iτ2 k, q|(z ) = √ θ , l A lπ 1/4 where a|(z ) = z |(z ). Let i√l 2 z = z , |(z ) = |z, we have a|z = i√l 2 z|z. (In [42], the coherent state is denoted by |z with a|z = z|z which is the same as |(z) in this paper, however, we have given another implication to |z in this paper.) Substituting the above formula into Gss (u, v) one has Gss (u, v, z1 , z2∗ ) ls ls z2 k, q0 + = k, q0 + z1 A A ∗ τu s τ ∗ u s τ 1 ∗ −τ + − z1 , + − z2 , θ v+ = √ θ v+ l π A A A A A A × eπi
τ −τ ∗ A
∗ u2 +2πiu( s−s A −z1 +z2 )
l2
2
∗2
× e 4 (z1 +z2
+z1 z1∗ +z2 z2∗ )
and fst (u, Av, z1 , z2∗ ) ≡
A−1
r
Gs+r,r (u, v, z1 , z2∗ ) × e2πit( A +v)
r=0
1 τ τu s + r √ + − z1 , = θ v+ A A A l π r uτ ∗ −τ ∗ r − z2∗ , ×θ v + + × e2πitr/A A A A × eπi
τ −τ ∗ A
s u2 +2πiu( A −z1 +z2∗ )+2πitv
l2
2
∗2
× e 4 (z1 +z2
+z1 z1∗ +z2 z2∗ )
.
Define v˜, s˜ by the following relation expressions v˜ +
τu s s˜ =v+ + − z1 , A A A τ ∗u + z2∗ . v˜ = v + A
(B.1) (B.2)
πi
From (B.1) and (B.2), we have for τ = e 3 , −i u = √ (˜ s − s + A(z1 − z2∗ )) 3 s˜ − s i i i Az1 Az2∗ Av = A˜ v+ 1+ √ 1+ √ 1− √ + + . 2 2 2 3 3 3
(B.3) (B.4)
In terms of the new variables v˜ and s˜, we have A s˜ 2τ − 1 ∗ √ fst (u, Av, z1 , z2 ) = θ (δA − t)(1 − τ ) + , A A l π δ
× θ((δA − t)τ + 2A˜ v + s˜, A(2τ − 1)) × e2πi × e2πiδA˜v × e ×e
−s 2πit( s˜2A (1+ √i3 )+
√π (˜ s2 −s2 +2sA(z1 −z2∗ )) 3A
z1 2 l2
(1+ √i3 )+
τ −1 2 2A (δA−t)
∗ z2 √i 2 (1− 3 ))
∗
∗
∗
× e 4 (2z1 z2 +z1 z1 +z2 z2 ) .
(B.5)
When taking z1 = z2 = 0 in Eq. (B.5) and denoting v˜0 = v˜(z1 = z2 = 0), s˜0 = s˜(z1 = z2 = 0), we get A s˜0 2τ − 1 fst (u, Av, z1 , z2∗ )z1 =z2 =0 = √ θ (δA − t)(1 − τ ) + , l π A A δ
× θ((δA − t)τ + 2A˜ v0 + s˜0 , A(2τ − 1)) × e2πi ×e
τ −1 2 2A (δA−t)
√π (˜ s20 −s2 ) 3A
× e2πiδA˜v0 × e
2πit
s ˜0 −s √i 2A (1+ 3 )
.
(B.6)
On the other hand, when z1 = z2 = 0, it is obvious that fst (u, Av, 0, 0) = fst (u, Av). From (3.26), (3.32) and (3.42), we have s˜0 − s i fst (u, Av) = c0 θ A˜ v0 + 1+ √ +α 2 3 −i s˜0 − s i s0 − s) + γ × θ A˜ v0 + 1− √ + β θ √ (˜ 2 3 3 s˜0 − s i 1 + c1 θ A˜ v0 + 1+ √ +α+ 2 2 3 −i s˜0 − s 1 i 1 s0 − s) + γ + × θ A˜ v0 + 1− √ θ √ (˜ , +β− 2 2 2 3 3 (B.7) where
A (1 − A) 2τ − 1 2τ − 1 , c0 = √ e2πi(φst +φδ=0 ) θ δ l π 2 A A 1 m 1 − ×θ A(2τ − 1) + , A(2τ − 1) 2 3 2 Aτ + 1 1 Aτ + 1 m ÷ θ + − A(2τ − 1), Aτ θ 2 2 2 3 1−A m ×θ − A(2τ − 1) , 2 3
(B.8)
A (1 − A) 2τ − 1 2τ − 1 , c1 = √ e2πi(φst +φδ=1 ) θ δ 2 A A l π 1 m 1 − ×θ A(2τ − 1) + , A(2τ − 1) 2 3 2 Aτ + 1 1 Aτ + 1 m ÷ θ + − A(2τ − 1), Aτ θ 2 2 2 3 1−A m 1 × θ − A(2τ − 1) + . 2 3 2
(B.9)
c0 and c1 can be derived from (3.38) and (3.43). Thus, for all u and v, it holds that as analytic functions of A˜ v0 and s˜0 , the right-hand side of (B.6) equals to the right-hand side of (B.7) (see Sec. 3). Namely, they form an identity. Through the v , s˜0 → s˜, we get transformation of A˜ v0 → A˜ A s˜ 2τ − 1 √ θ (δA − t)(1 − τ ) + , l π A A δ
× θ((δA − t)τ + 2A˜ v + s˜, A(2τ − 1) τ −1
2πit s˜−s (1+ √i )
2
√π
(˜ s2 −s2 )
2A 3 × e 3A × e2πi 2A (δA−t) × e2πiδA˜v × e s˜ − s i = c0 θ A˜ v+ 1+ √ +α 2 3 s˜ − s i i s˜ − s +γ × θ A˜ v+ 1− √ + β θ −√ 2 3 3 2 s˜ − s i 1 + c1 θ A˜ v+ 1+ √ +α+ 2 2 3 s˜ − s 1 i × θ A˜ v+ +β− 1− √ 2 2 3 1 i s˜ − s +γ+ . × θ −√ ≡ fst 2 3 2
(B.10)
Comparing the right-hand side of (B.5) with the left-hand side of (B.10), it is easy to find the following relation 1
× e2πit[(z1 ( 2 + fst (u, Av, z1 , z2∗ ) = fst l2
∗
∗
√ √ ∗ 1 3i 3i 6 )+z2 ( 2 − 6 )]
× e2πis(
−
√ 3
3i
z1 +
√
3i ∗ 3 z2 )
∗
× e 4 (2z1 z2 +z1 z1 +z2 z2 ) . Substituting (B.1) into (B.10), we get fst (u, Av, z1 , z2∗ )
√ √ 3i 3i 1 ∗ 1 + − Az2 − +α = c0 θ Av − Az1 2 6 2 6 √ √ 3i 3i 1 ∗ 1 × θ Av + u − Az1 − − Az2 + +β 2 6 2 6
i i × θ u − Az1 − √ − Az2∗ √ +γ 3 3 √ i 1 1 3i − Az2∗ − +α+ + c1 θ Av − Az1 √ 2 6 2 3 √ i 3i 1 1 ∗ − − Az2 √ × θ Av + u − Az1 +β− 2 6 2 3 i i 1 × θ u − Az1 − √ − Az2∗ √ +γ+ 2 3 3 1
× e2πit[(z1 ( 2 + l2
√
∗
∗ 1 3i 6 )+z2 ( 2 − ∗
√ 3i 6 )]
× e2πis(
−
√ √ 3i z1 + 33i z2∗ ) 3
∗
× e 4 (2z1 z2 +z1 z1 +z2 z2 ) .
(B.11)
Take the transformation u → −Av,
Av → −
A + u + Av 2
on (B.11). The function fst is transformed into f¯st f¯st (u, Av, z1 , z2∗ ) i i ∗ − Az2 − √ + αt,t−s = c0 (s, t)θ Av − Az1 √ 3 3 √ √ 3i 3i 1 ∗ 1 + − − Az2 + βt,t−s × θ Av + u − Az1 2 6 2 6 √ √ 3i 3i 1 ∗ 1 − + − Az2 + γt,t−s × θ u − Az1 2 6 2 6 i 1 i ∗ + c1 (s, t)θ Av − Az1 √ − Az2 − √ + αt,t−s + 2 3 3 √ √ 1 1 3i 3i ∗ 1 × θ Av + u − Az1 + − Az2 − + βt,t−s − 2 6 2 6 2 √ √ 3i 3i 1 1 ∗ 1 − − Az2 + + γt,t−s + × θ u − Az1 2 6 2 6 2 × e2πi(t−s)( l2
∗
√ √ 3i 3i ∗ 3 z1 − 3 z2 ) ∗
∗
× e 4 (2z1 z2 +z1 z1 +z2 z2 ) .
1
× e2πit[(z1 ( 2 −
√
∗ 1 3i 6 )+z2 ( 2 +
√ 3i 6 )]
(B.12) 2πi
In addition, we let s = t, t = t − s and change z1 and z2 into z1 = e 6 z1 2πi and z2 = e− 6 z2 . Rewrite cj (s, t) as e2πiφst cj (0, 0) (see (B.8) and (B.9) for the
definitions of cj (s, t)). We have f¯st (u, Av, z1 , z2∗ ) =e
2πiφst
c0 (0, 0)θ Av −
× θ Av + u −
Az1
Az1
√ √ 3i 3i 1 ∗ 1 + − − Az2 + αs ,t 2 6 2 6
√ √ 3i 3i 1 ∗ 1 − − Az2 + + βs ,t 2 6 2 6
i i ∗ × θ u − Az1 − √ − Az2 √ + γs ,t 3 3 √ √ 3i 3i 1 1 2πiφst ∗ 1 + − − Az2 + αs ,t + +e c1 (0, 0)θ Av − Az1 2 6 2 6 2
× θ Av + u −
Az1
√ √ 3i 3i 1 1 ∗ 1 − + − Az2 + βs ,t − 2 6 2 6 2
i 1 i ∗ × θ u − Az1 − √ − Az2 √ + γs ,t + 2 3 3
1
× e2πit [(z1 ( 2 + −
× e2πis (
√
∗ 1 3i 6 )+z2 ( 2 −
√ √ 3i z1 + 33i z2∗ ) 3
√
3i 6 )]
l2
∗
× e 4 (2z1 z2
+z1 z1∗ +z2 z2∗ )
,
(B.13)
where $\phi_{st}$ is defined in (3.37). It is easy to check that
\[
e^{2\pi i\phi_{st}} = c^{2st - t^2}\, e^{2\pi i\phi_{s',t'}},
\]
where
\[
\phi_{st} = \frac{\sqrt{3}\,i}{6A}(s^2 + t^2 - st) - \frac{1}{2A}st.
\]
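The behaviour of $\phi_{st}$ under the index rotation $(s,t) \to (s',t') = (t, t-s)$ can be verified with exact integer arithmetic. The sketch below is only an illustration: it stores $\phi_{st}$ as the pair of integer coefficients multiplying $\frac{\sqrt{3}\,i}{6A}$ and $\frac{1}{2A}$, so no value of $A$ or of the constant $c$ needs to be assumed, and all function names are hypothetical.

```python
def quad(s, t):
    """Quadratic form s^2 + t^2 - s*t appearing in phi_st."""
    return s * s + t * t - s * t


def phi_parts(s, t):
    """Return (q, r) with phi_st = (sqrt(3)*i/(6A))*q + (1/(2A))*r."""
    return quad(s, t), -s * t


def rotate(s, t):
    """Index map (s, t) -> (s', t') = (t, t - s) used in Appendix B."""
    return t, t - s


def check_phase_relation(limit=6):
    """Check, for all |s|, |t| <= limit, that the quadratic form is invariant
    and that phi_st - phi_{s',t'} = (t^2 - 2*s*t)/(2A)."""
    for s in range(-limit, limit + 1):
        for t in range(-limit, limit + 1):
            q, r = phi_parts(s, t)
            q_rot, r_rot = phi_parts(*rotate(s, t))
            if q != q_rot:
                return False
            if r - r_rot != t * t - 2 * s * t:
                return False
    return True


def rotate_six_times(s, t):
    """Apply the index map six times; it should return to (s, t)."""
    for _ in range(6):
        s, t = rotate(s, t)
    return s, t
```

The imaginary part of $\phi_{st}$ is therefore unchanged by the rotation and the real part shifts by $\frac{1}{2A}(t^2 - 2st)$, the same combination that appears in the phase check of (3.17). The index map has order six, as one would expect for a $Z_6$ action.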
Therefore, we have 2 fs t (u, Av, z1 , z2∗ ) = c−2st+t f¯st (u, Av, z1 , z2∗ ).
(B.14)
Finally, we check that the function ψs,t (u, Av) given by fst satisfies the covariance condition (3.17), namely, c
−2st+t2
A ψst −Av, − + u + Av = ψt,t−s (u, Av) = ψs ,t (u, Av). 2
(B.15)
Since we have fs t (u, v) Af00 (u, v) d2 z1 d2 z2 fs t (u, Av, z1 , z2∗ )F1 (z1 )F2 (z2 )∗ = A d2 z1 d2 z2 f00 (u, Av, z1 , z2∗ )F1 (z1 )F2 (z2 )∗
ψs ,t (u, Av) =
2 c−2st+t d2 z1 d2 z2 f¯st (u, Av, z1 , z2∗ )F1 (z1 )F2 (z2 )∗ = , A d2 z1 d2 z2 f¯00 (u, Av, z1 , z2∗ )F1 (z1 )F2 (z2 )∗ and, based on (4.3) and (4.7), we have A ψst −Av, − + u + Av 2 d2 z1 d2 z2 f¯st (u, Av, z1 , z2∗ )F1 (z1 )F2 (z2∗ ) = . A d2 z1 d2 z2 f¯00 (u, Av, z1 , z2∗ )F1 (z1 )F2 (z2∗ ) It is obvious that Eq. (B.15) holds.

References

[1] H. S. Snyder, Quantized space-time, Phys. Rev. 71 (1947) 38; The electromagnetic field in quantized space-time, Phys. Rev. 72 (1947) 68.
[2] E. Witten, Noncommutative geometry and string field theory, Nucl. Phys. B 268 (1986) 253.
[3] A. Connes, M. Douglas and A. Schwartz, Matrix theory compactification on tori, J. High Energy Phys. 9802 (1998) 003, hep-th/9711162; M. Douglas and C. Hull, D-branes and noncommutative torus, ibid. 9802 (1998) 008, hep-th/9711165.
[4] N. Seiberg and E. Witten, String theory and noncommutative geometry, J. High Energy Phys. 9909 (1999) 032, hep-th/9908142; V. Schomerus, D-branes and deformation quantization, ibid. 9906 (1999) 030.
[5] A. Connes, Noncommutative Geometry (Academic Press, 1994).
[6] G. Landi, An introduction to noncommutative space and their geometry, hep-th/9701078; J. Varilly, An introduction to noncommutative geometry, physics/9709045.
[7] J. Madore, An Introduction to Noncommutative Differential Geometry and its Physical Applications, 2nd edn. (Cambridge University Press, 1999).
[8] S. Minwalla, M. V. Raamsdonk and N. Seiberg, Noncommutative perturbative dynamics, J. High Energy Phys. 0002 (2000) 020, hep-th/9912072.
[9] A. Matusis, L. Susskind and N. Toumbas, The IR/UV connection in the noncommutative gauge theories, J. High Energy Phys. 0012 (2000) 002, hep-th/0002075.
[10] S. S. Gubser and M. Rangamani, D-brane dynamics and the quantum Hall effect, J. High Energy Phys. 0105 (2001) 041, hep-th/0012155.
[11] A. P. Polychronakos, Quantum Hall states as matrix Chern–Simons theory, J. High Energy Phys. 0104 (2001) 011, hep-th/0103013.
[12] S. Hellerman and M. V. Raamsdonk, Quantum Hall physics equals noncommutative field theory, J. High Energy Phys. 0110 (2001) 039, hep-th/0103179.
[13] L. Susskind, The quantum Hall fluid and non-commutative Chern–Simons theory, hep-th/0101029.
[14] M. Fabinger, Higher-dimensional quantum Hall effect in string theory, J. High Energy Phys. 0205 (2002) 037, hep-th/0201016.
[15] J. P. Hu and S. C. Zhang, Collective excitations at the boundary of a 4D quantum Hall droplet, cond-mat/0112432.
[16] D. Karabali and V. P. Nair, Quantum Hall effect in higher dimensions, Nucl. Phys. B 641 (2002) 533–546, hep-th/0203264.
[17] Y. X. Chen and B. Y. Hou, Non-commutative geometry of 4-dimensional quantum Hall droplet, Nucl. Phys. B 638 (2002) 220–242, hep-th/0203095.
[18] B. Freivogel, L. Susskind and N. Toumbas, A two-fluid description of the quantum Hall soliton, hep-th/0108076.
[19] S. Hellerman and L. Susskind, Realizing the quantum Hall system in string theory, hep-th/0107200.
[20] B. A. Bernevig, J. Brodie, L. Susskind and N. Toumbas, How Bob Laughlin tamed the giant graviton from Taub-NUT space, J. High Energy Phys. 0102 (2001) 003, hep-th/0010105.
[21] A. P. Polychronakos, Quantum Hall states as matrix Chern–Simons theory, J. High Energy Phys. 0104 (2001) 011, hep-th/0103013.
[22] A. P. Polychronakos, Quantum Hall states on the cylinder as unitary matrix Chern–Simons theory, J. High Energy Phys. 0106 (2001) 070, hep-th/0106011.
[23] B. Morariu and A. P. Polychronakos, Finite noncommutative Chern–Simons with a Wilson line and the quantum Hall effect, J. High Energy Phys. 0107 (2001) 006, hep-th/0106072.
[24] B. Y. Hou, D. T. Peng, K. J. Shi and R. H. Yue, Solitons on noncommutative torus as elliptic Calogero Gaudin models, Branes and Laughlin wave function, Int. J. Mod. Phys. A 18 (2003) 2477–2500, hep-th/0204163.
[25] B. Y. Hou and D. T. Peng, Elliptic algebra and integrable models for solitons on noncommutative torus, Int. J. Mod. Phys. B 16 (2002) 2079–2088.
[26] G. Derrick, Comments on nonlinear wave equations as models for elementary particles, J. Math. Phys. 5 (1965) 1252.
[27] R. Gopakumar, S. Minwalla and A. Strominger, Noncommutative soliton, J. High Energy Phys. 005 (2000) 048, hep-th/0003160.
[28] U. Lindstrom, M. Rocek and R. von Unge, Non-commutative soliton scattering, J. High Energy Phys. 0012 (2000) 004, hep-th/0008108.
[29] K. Dasgupta, S. Mukhi and G. Rajesh, Noncommutative tachyons, J. High Energy Phys. 0006 (2000) 022, hep-th/0005006.
[30] J. A. Harvey, P. Kraus, F. Larsen and E. J. Martinec, D-branes and strings as noncommutative solitons, J. High Energy Phys. 0007 (2000) 042, hep-th/0005031.
[31] A. Sen, Tachyon condensation on the brane antibrane system, J. High Energy Phys. 08 (1998) 012, hep-th/9805170.
[32] A. Sen, Tachyon condensation in string theory, J. High Energy Phys. 0003 (2000) 0002, hep-th/9912249.
[33] A. P. Polychronakos, Flux tube solutions in noncommutative gauge theories, Phys. Lett. B 495 (2000) 407–412, hep-th/0007043.
[34] J. Harvey, Komaba lectures on noncommutative solitons and D-branes, hep-th/0102076; J. A. Harvey, P. Kraus and F. Larsen, J. High Energy Phys. 0012 (2000) 024, hep-th/0010060; M. Hamanaka and S. Terashima, On exact noncommutative BPS solitons, J. High Energy Phys. 0103 (2001) 034.
[35] D. J. Gross and N. A. Nekrasov, Solitons in noncommutative gauge theory, J. High Energy Phys. 0103 (2001) 044, hep-th/0010090; M. R. R. Douglas and N. A. Nekrasov, Noncommutative field theory, Rev. Mod. Phys. 73 (2001) 977–1029, hep-th/0106048.
[36] M. Rieffel, Pacific J. Math. 93 (1981) 415.
[37] F. P. Boca, Comm. Math. Phys. 202 (1999) 325.
[38] E. J. Martinec and G. Moore, Noncommutative solitons on orbifolds, EFI-2000-55, RUNHETC-2000-58, hep-th/0101199.
[39] R. Gopakumar, M. Headrick and M. Spradin, On noncommutative multi-solitons, Commun. Math. Phys. 233 (2003) 355–381, hep-th/0103256.
[40] L. Hadasz, U. Lindstrom, M. Rocek and R. von Unge, Noncommutative solitons: Moduli spaces, quantization, finite theta effects and stability, J. High Energy Phys. 0106 (2001) 040, hep-th/0104017.
[41] B. Y. Hou, K. J. Shi and Z. Y. Yang, Solitons on noncommutative orbifold $T^2/Z_N$, Lett. Math. Phys. 61 (2002) 205–220, hep-th/0204102.
[42] H. Deng, B. Y. Hou, K. J. Shi, Z. Y. Yang and R. H. Yue, Soliton solutions on noncommutative orbifold $T^2/Z_4$, J. Math. Phys. 45 (2004) 978–995, hep-th/0305212.
[43] H. Bacry, A. Grassman and J. Zak, Proof of completeness of lattice states in the kq representation, Phys. Rev. B 12 (1975) 1118.
[44] J. Zak, in Solid State Physics, eds. H. Ehrenreich, F. Seitz and D. Turnbull, Vol. 27 (Academic, New York, 1972).
June 5, 2006 10:44 WSPC/148-RMP
J070-00268
Reviews in Mathematical Physics Vol. 18, No. 3 (2006) 285–310 © World Scientific Publishing Company
DYNAMICS AND UNIVERSALITY OF AN ISOTHERMAL COMBUSTION PROBLEM IN 2D
Y. W. QI
Department of Mathematics, University of Central Florida, Orlando, FL 32816, USA
[email protected]

Received 30 December 2005

In this paper, the Cauchy problem of the system
\[
u_{1,t} = \Delta u_1 - u_1 u_2^m, \qquad u_{2,t} = d\Delta u_2 + u_1 u_2^m
\]
is studied, where $x \in \mathbb{R}^2$, $m \geq 1$ and $d > 0$ is the Lewis number. This system models isothermal combustion (see [7]) and auto-catalytic chemical reactions. We show the global existence and regularity of solutions with non-negative initial values having mild decay as $|x| \to \infty$. More importantly, we establish the exact spatio-temporal profiles for such solutions. In particular, we prove that for $m = 1$, the exact large time behavior of solutions is characterized by a universal, non-Gaussian spatio-temporal profile, with anomalous exponents, due to the fact that quadratic nonlinearity is critical in 2D. Our approach is a combination of iteration using the Renormalization Group method, which has been developed into a very powerful tool in the study of nonlinear PDEs largely by the pioneering works of Bricmont, Kupiainen and Lin [6] and Bricmont, Kupiainen and Xin [7] (see also [9]), and key estimates using the PDE method.

Keywords: Auto-catalytic chemical reactions; critical nonlinearity; anomalous exponent; renormalization group; universal spatial-temporal profiles.

Mathematics Subject Classification 2000: 34C20, 34C25, 92E20
1. Introduction

In this paper, we study the initial value problem of the reaction-diffusion system
\[
u_{1,t} = \Delta u_1 - u_1 u_2^m, \tag{1.1}
\]
\[
u_{2,t} = d\Delta u_2 + u_1 u_2^m, \tag{1.2}
\]
in $\mathbb{R}^2$, where $m \geq 1$ and $d > 0$ is the Lewis number. We assume
\[
u_1(x, 0) = a_1(x) \geq 0, \quad u_2(x, 0) = a_2(x) \geq 0, \quad a_1(x), a_2(x) \in L^1(\mathbb{R}^2) \cap L^\infty(\mathbb{R}^2). \tag{1.3}
\]
We are concerned with (i) the global existence and regularity of solutions, and (ii) the large time dynamics.
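As a rough numerical illustration of the structure of (1.1), (1.2), the sketch below integrates the system with an explicit finite-difference scheme. It is not the method of the paper: it assumes one space dimension, a periodic grid and forward Euler time stepping, and every parameter value and function name is a hypothetical choice made for the example. The reaction term is removed from $u_1$ and added to $u_2$ in equal amounts, and the periodic discrete Laplacian sums to zero, so the discrete total mass of $u_1 + u_2$ is preserved up to round-off.

```python
import math


def laplacian_periodic(u, dx):
    """Second-order central difference Laplacian on a periodic 1D grid."""
    n = len(u)
    return [(u[(i - 1) % n] - 2.0 * u[i] + u[(i + 1) % n]) / (dx * dx)
            for i in range(n)]


def step(u1, u2, d, m, dt, dx):
    """One forward Euler step of u1_t = u1_xx - u1*u2^m, u2_t = d*u2_xx + u1*u2^m."""
    lap1 = laplacian_periodic(u1, dx)
    lap2 = laplacian_periodic(u2, dx)
    new1, new2 = [], []
    for a, b, la, lb in zip(u1, u2, lap1, lap2):
        reaction = a * b ** m          # consumed from u1, produced in u2
        new1.append(a + dt * (la - reaction))
        new2.append(b + dt * (d * lb + reaction))
    return new1, new2


def simulate(steps=400, n=64, dx=1.0, dt=0.2, d=0.5, m=1):
    """Run the scheme from Gaussian-shaped initial data (illustrative values).

    Returns the initial masses, the final masses and the minimum of u1.
    """
    u1 = [math.exp(-((i - n / 2) / 6.0) ** 2) for i in range(n)]
    u2 = [0.5 * math.exp(-((i - n / 2) / 4.0) ** 2) for i in range(n)]
    mass_initial = (sum(u1) * dx, sum(u2) * dx)
    for _ in range(steps):
        u1, u2 = step(u1, u2, d, m, dt, dx)
    mass_final = (sum(u1) * dx, sum(u2) * dx)
    return mass_initial, mass_final, min(u1)


mass_initial, mass_final, u1_min = simulate()
```

In this discrete setting the mass of $u_1$ can only decrease and that of $u_2$ can only increase, while their sum stays fixed; the time step is chosen small enough for the explicit scheme to remain stable and non-negative for these particular data.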
The system (1.1), (1.2) arises as an isothermal combustion model, see [7], and can also be understood as a pre-mixed, isothermal auto-catalytic chemical reaction of the type $A + mB \to (m + 1)B$, under the usual assumption that the isothermal reaction rate is proportional to $u_1 u_2^m$. Here $u_1$ is the concentration density of reactant $A$ and $u_2$ is the concentration density of auto-catalyst $B$. In particular, it contains the two most important auto-catalytic chemical reactions, $A + B \to 2B$ and $A + 2B \to 3B$, corresponding to $m = 1$ and $m = 2$, as special cases. The importance of this model in relation to thermal-diffusive combustion is well-documented in [7]. In chemical reaction theory and biology, it is well known that auto-catalytic chemical reactions play a very important role in complex chain reactions which fulfil many important functions in a living cell and are the focus of intensive research in cell biology. The study of the system in a bounded domain has been carried out by many authors; in particular, see Alikakos [1], Hollis, Martin and Pierre [10], Martin and Pierre [13] and Masuda [14]. Among other things, these authors established boundedness, global existence and large time behavior of solutions. For homogeneous Dirichlet or Neumann boundary conditions, the large time behavior is that $(u_1, u_2)$ converges to a constant vector $(c_1, c_2)$ such that $c_1 \cdot c_2 = 0$, see Masuda [14]. However, our approach is more in line with that of Bricmont, Kupiainen and Xin [7] and Berlyand and Xin [3], where the Cauchy problem in 1D is studied (see also [18]). As a matter of fact, our main motivation is to extend the pioneering work of Bricmont, Kupiainen and Xin [7] to 2D and to a wider class of initial values by combining the Renormalization Group (RG) method with key estimates obtained by the PDE method. The existence of traveling front solutions in 1D is established by Billingham and Needham in [4, 5] for $m = 1$ and $m = 2$. The authors also study the large-time behavior of solutions using formal methods and numerical computation. The most important difference in studying systems such as (1.1), (1.2) rigorously, as opposed to single equations, is that there is no maximum principle for the systems. We overcome this difficulty by using a combination of PDE analysis (including maximum principles applied to each single equation involved) and the Renormalization Group (RG) method. In particular, deriving a priori estimates plays a crucial role in our study. It allows us to obtain the exact large time dynamics for a wide class of initial values, demonstrating a universal behavior. But, equally important, it also makes the analysis in general, and the Renormalization Group iteration in particular, much simpler and more transparent, demonstrating the power of the RG method in
the study of critical behavior. Another significant aspect of this paper is a detailed analysis of the limiting linear eigenvalue problem for the $m = 1$ case. We consider the system with initial values $a_1, a_2 \in \mathcal{B}$, where $\mathcal{B}$ is the Banach space of continuous functions on $\mathbb{R}^2$ with norm
\[
\|f\| = \sup_{x \in \mathbb{R}^2} |f(x)|(1 + |x|)^q, \quad q > 2. \tag{1.4}
\]
Let $\phi$ be the Gaussian
\[
\phi(x) = \frac{1}{4\pi d} \exp(-|x|^2/4d).
\]
For $A > 0$, let $\psi_A$ be the normalized ($\int \psi_A(x)\, d\mu(x) = 1$, see below) principal eigenfunction and $\lambda_A > 0$ the principal eigenvalue of the differential operator
\[
L_A = -\Delta - \frac{x}{2} \cdot \nabla - 1 + A\phi_d(x)
\]
on $L^2(\mathbb{R}^2, d\mu)$, with $d\mu(x) = e^{|x|^2/4} dx$. Our main result is the following theorem.

Theorem 1.1. Suppose the initial values $a_1, a_2 \in \mathcal{B}$ and $a_i \geq 0$, $a_i \not\equiv 0$, $i = 1, 2$. Let $A = \int_{\mathbb{R}^2} (a_1(x) + a_2(x))\, dx > 0$ be the total mass, which is conserved in time. Then the system (1.1), (1.2) has a unique global classical solution $(u_1(x, t), u_2(x, t)) \in \mathcal{B} \times \mathcal{B}$ for all $t \geq 0$. Furthermore,
(i) if $m = 1$ and
\[
q > q(A) \equiv 2 + 2\lambda_A, \tag{1.5}
\]
there is a positive constant $B$ depending continuously on $(a_1, a_2)$ such that
\[
\|t^{1+\lambda_A} u_1(\sqrt{t}\,\cdot, t) - B\psi_A(\cdot)\| \to 0, \qquad \|t u_2(\sqrt{t}\,\cdot, t) - A\phi(\cdot)\| \to 0
\]
as $t \to \infty$;
(ii) if $m > 1$ and $q > 2$,
\[
\Bigl\|t u_1(\sqrt{t}\,x, t) - \frac{A_1}{4\pi} e^{-|x|^2/4}\Bigr\| \to 0 \quad \text{as } t \to \infty
\]
and
\[
\Bigl\|t u_2(\sqrt{t}\,x, t) - \frac{A_2}{4\pi d} e^{-|x|^2/4d}\Bigr\| \to 0 \quad \text{as } t \to \infty
\]
with $A_1, A_2 > 0$ and $A_1 + A_2 = A$.

Remark 1.2. Just as the cubic nonlinearity of $m = 2$ is critical in 1D, the same is true for the quadratic nonlinearity of $m = 1$ in 2D. This is the reason why we have the extra decay power $\lambda_A$ and the non-Gaussian profile $\psi_A$. In other words, the scaling law which works for $m > 1$ no longer works for $m = 1$, hence the appearance of the anomalous exponent $\lambda_A$.
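The Gaussian limits in part (ii) are the self-similar profiles of the pure heat equation: for $u(x,t) = \frac{M}{4\pi d t} e^{-|x|^2/4dt}$ the rescaled quantity $t\,u(\sqrt{t}\,x, t)$ equals $\frac{M}{4\pi d} e^{-|x|^2/4d}$ for every $t > 0$. The sketch below checks this, together with a finite-difference residual of $u_t = d\Delta u$; the mass $M$, the sample point, the step size and all function names are illustrative assumptions rather than quantities taken from the paper.

```python
import math


def heat_profile(x, y, t, d, mass):
    """2D heat kernel of total mass `mass` for u_t = d * Laplacian(u)."""
    return mass / (4.0 * math.pi * d * t) * math.exp(-(x * x + y * y) / (4.0 * d * t))


def heat_residual(x, y, t, d, mass, h=1e-3):
    """Central-difference estimate of u_t - d * Laplacian(u) at (x, y, t)."""
    def u(px, py, pt):
        return heat_profile(px, py, pt, d, mass)

    u_t = (u(x, y, t + h) - u(x, y, t - h)) / (2.0 * h)
    lap = (u(x + h, y, t) + u(x - h, y, t) + u(x, y + h, t) + u(x, y - h, t)
           - 4.0 * u(x, y, t)) / (h * h)
    return u_t - d * lap


def rescaled(x, y, t, d, mass):
    """The quantity t * u(sqrt(t) * x, t) that appears in Theorem 1.1 (ii)."""
    root = math.sqrt(t)
    return t * heat_profile(root * x, root * y, t, d, mass)
```

For example, `rescaled(0.7, -0.4, 50.0, 0.5, 3.0)` agrees with `heat_profile(0.7, -0.4, 1.0, 0.5, 3.0)`, so the rescaling removes the time dependence exactly for the heat kernel. Part (ii) of the theorem states that solutions of the nonlinear system approach profiles of this form when $m > 1$.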
Remark 1.3. A distinct feature of our result for $m = 1$ is that we can quantify how large the decay rate $q$ must be for initial values, in terms of the total mass $A$, to qualify for the universal non-Gaussian profile $\psi_A$. As a matter of fact, we think our result is optimal, see Remark 5.9 for further details.

To understand heuristically the result of Theorem 1.1, in particular the critical case of $m = 1$, let us suppose that the nonlinear term in the $u_1$ equation causes some extra decay in time on $u_1$ of the order of $t^{-\delta}$, where $\delta > 0$, on top of the pure diffusion decay of $t^{-1}$. This, in turn, results in $u_2$ having the pure diffusion behavior, since the nonlinear term $u_1 u_2$ is of the order of $t^{-(1+\delta)} u_2$. In the RG terminology, the nonlinear term is irrelevant in the $u_2$ equation. Then, it is a relatively simple matter to show that
\[
t u_2(\sqrt{t}\,x, t) \to A\phi(x) \quad \text{as } t \to \infty.
\]
That is,
\[
w_2(y, s) = t u_2(\sqrt{t}\,y, t), \quad s = \log t \ \text{ and } \ y = \frac{x}{\sqrt{t}},
\]
converges to a steady solution of $u_s = -Lu$, where
\[
L = -d\Delta - \frac{x}{2} \cdot \nabla - 1.
\]
If we substitute the limiting profile of $u_2$ into the $u_1$ equation, we have
\[
u_{1,t} = \Delta u_1 - t^{-1} A\phi\Bigl(\frac{x}{\sqrt{t}}\Bigr) u_1,
\]
which, after a self-similar change of variables
\[
u_1(y, s) = t u_1(x, t), \quad y = \frac{x}{\sqrt{t}}, \quad s = \log t,
\]
turns into
\[
u_{1,s} = -L_A u_1.
\]
Therefore, it is reasonable to guess that the large time behavior of $u_1$ is determined by the first eigenvalue $\lambda_A$ and the corresponding eigenfunction $\psi_A$ of $L_A$, so that
\[
u_1(y, s) e^{\lambda_A s} \to B\psi_A(y), \quad \text{as } s \to \infty
\]
with some constant $B > 0$, provided $u_1$ has moderate, but sufficient decay in $x$ as $|x| \to \infty$. But, since we are dealing with an initial value problem with strong nonlinear terms, it is a nontrivial matter to prove the result rigorously. This is particularly true if $u_1$ has just modest decay (not in $L^2(\mathbb{R}^2, d\mu)$) as $|x| \to \infty$, which is the case we are dealing with in this paper.

The organization of the paper is as follows. In Sec. 2, we show a priori estimates on the solution of the system and consequently establish the global existence. In Sec. 3, we derive the large time dynamics for $m > 1$. In Sec. 4, we consider the
limiting linear eigenvalue problem in relation to the $m = 1$ case and provide some detailed analysis. In Sec. 5, we establish refined bounds for the $m = 1$ case and use the RG method to prove the convergence of $u_1$ to a first eigenfunction of $L_A$, completing the proof of Theorem 1.1. Throughout the paper, $\|\cdot\|_2$ stands for the $L^2(\mathbb{R}^2, d\mu)$-norm of a function, unless otherwise stated, $\|\cdot\|$ for the norm defined in (1.4), and any other $L^p$-norm is the standard one. Also, for simplicity of notation, we shall not distinguish the generic constant $C$ from line to line.

2. Preliminary Estimates and Global Existence

In this section, we show that $u_1$ and $u_2$ have a uniform (in time) $L^p(\mathbb{R}^2)$-norm, $1 \leq p \leq \infty$. Then, it follows from the classical theory that the solutions $(u_1, u_2)$ are smooth and exist globally in time. In this section, all $L^p$-norms are the standard ones. First, we collect some simple facts.

Lemma 2.1. The solutions $(u_1, u_2)$ with $(a_1, a_2) \in (L^1(\mathbb{R}^2) \times L^\infty(\mathbb{R}^2))^2$ satisfy the following estimates:
(i) $\int_{\mathbb{R}^2} (u_1 + u_2)(x, t)\, dx = \int_{\mathbb{R}^2} (a_1 + a_2)(x)\, dx$, $\forall\, t \geq 0$;
(ii) $\int_{\mathbb{R}^2} u_1(x, t)\, dx \leq \int_{\mathbb{R}^2} a_1(x)\, dx$, $\int_{\mathbb{R}^2} u_2(x, t)\, dx \geq \int_{\mathbb{R}^2} a_2(x)\, dx$, $\forall\, t \geq 0$;
(iii) $\int_0^\infty \int_{\mathbb{R}^2} u_1 u_2^m(x, t)\, dx\, dt < +\infty$.
Furthermore,
(iv) $0 \leq u_1(x, t) \leq \|a_1\|_\infty$, $u_1(x, t) \leq \bar u_1(x, t)$ and $u_2(x, t) \geq \underline{u}_2(x, t)$, $\forall\, t \geq 0$, where $\underline{u}_2$ is a solution of the heat equation
\[
\underline{u}_{2,t} = d\Delta \underline{u}_2, \quad \underline{u}_2|_{t=0} = a_2(x),
\]
and $\bar u_1$ is a solution of
\[
\bar u_{1,t} = \Delta \bar u_1 - \bar u_1 \underline{u}_2^m, \quad \bar u_1|_{t=0} = a_1(x).
\]

Proof. Simple integration yields (i)–(iii). (iv) is obtained by direct use of the maximum principle.

The key estimates of this section are stated in the following lemma.

Lemma 2.2. The solutions $(u_1, u_2)$ of (1.1)–(1.2) are uniformly bounded in time in the $L^p(\mathbb{R}^2)$ norm:
\[
\|u_1(\cdot, t)\|_{L^p} + \|u_2(\cdot, t)\|_{L^p} \leq C(a_1, a_2, p) < \infty, \quad 1 \leq p < \infty, \tag{2.1}
\]
where $C(a_1, a_2, p)$ is a constant depending on the initial data and $p$ is a positive integer.
Proof. It is clear by Lemma 2.1 that the $u_1$ bound holds. We proceed to show the bound for $u_2$. By standard local existence theory, $(u_1, u_2)$ are classical solutions local in time, and all $L^p(\mathbb{R}^2)$-norms of $(u_1, u_2)$ and the $L^2(\mathbb{R}^2)$-norms of $(\nabla u_1, \nabla u_2)$ are finite and continuous in time. Therefore, we can freely perform integration by parts. Multiplying (1.2) by $p u_2^{p-1}$ with $p \ge 2$ and integrating over $\mathbb{R}^2 \times (0, t)$, we get
$$\int_{\mathbb{R}^2} u_2^p\,dx = \int_{\mathbb{R}^2} a_2^p\,dx - d \int_0^t\!\!\int_{\mathbb{R}^2} p(p-1)|\nabla u_2|^2 u_2^{p-2}\,dx\,d\tau + \int_0^t\!\!\int_{\mathbb{R}^2} p\,u_1 u_2^{m+p-1}\,dx\,d\tau. \tag{2.2}$$
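The identity (2.2) can be checked in a few lines. The sketch below assumes that (1.2) reads $u_{2,t} = d\,\Delta u_2 + u_1 u_2^m$, as the right-hand side of (2.3) suggests, and that the boundary terms at infinity vanish.

```latex
% Sketch only. Assumption: (1.2) is u_{2,t} = d\,\Delta u_2 + u_1 u_2^m,
% and all boundary terms at infinity vanish.
\begin{align*}
\frac{d}{dt}\int_{\mathbb{R}^2} u_2^p\,dx
  &= \int_{\mathbb{R}^2} p\,u_2^{p-1}\bigl(d\,\Delta u_2 + u_1 u_2^m\bigr)\,dx \\
  &= -d\int_{\mathbb{R}^2} \nabla\bigl(p\,u_2^{p-1}\bigr)\cdot\nabla u_2\,dx
     + \int_{\mathbb{R}^2} p\,u_1 u_2^{m+p-1}\,dx \\
  &= -d\int_{\mathbb{R}^2} p(p-1)\,u_2^{p-2}\,|\nabla u_2|^2\,dx
     + \int_{\mathbb{R}^2} p\,u_1 u_2^{m+p-1}\,dx .
\end{align*}
% Integrating in time over (0,t) and using u_2|_{t=0} = a_2 gives (2.2).
```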
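In addition, with the help of integration by parts, we derive the identity
$$\begin{aligned}
\frac{d}{dt}\int_{\mathbb{R}^2} (u_1^2 + u_1)u_2^p\,dx
&= \int_{\mathbb{R}^2} (1 + 2u_1)(\Delta u_1 - u_1 u_2^m)u_2^p\,dx + \int_{\mathbb{R}^2} (u_1^2 + u_1)\,p\,u_2^{p-1}(d\,\Delta u_2 + u_1 u_2^m)\,dx \\
&= -2\int_{\mathbb{R}^2} |\nabla u_1|^2 u_2^p\,dx - \int_{\mathbb{R}^2} (1 + 2u_1)\,p\,u_2^{p-1}\,\nabla u_1 \cdot \nabla u_2\,dx \\
&\quad - \int_{\mathbb{R}^2} (1 + 2u_1)u_1 u_2^{m+p}\,dx - d\int_{\mathbb{R}^2} (1 + 2u_1)\,p\,u_2^{p-1}\,\nabla u_1 \cdot \nabla u_2\,dx \\
&\quad - d\int_{\mathbb{R}^2} (u_1^2 + u_1)\,p(p-1)u_2^{p-2}|\nabla u_2|^2\,dx + \int_{\mathbb{R}^2} (u_1^2 + u_1)\,p\,u_2^{p+m-1}u_1\,dx \\
&= I + II + III + IV + V + VI.
\end{aligned} \tag{2.3}$$
It is clear that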
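$$\begin{aligned}
I + II + IV &\le (1 + 2\|a_1\|_\infty)(1 + d)\int_{\mathbb{R}^2} |\nabla u_1 \cdot \nabla u_2|\,p\,u_2^{p-1}\,dx - 2\int_{\mathbb{R}^2} |\nabla u_1|^2 u_2^p\,dx \\
&\le \int_{\mathbb{R}^2} |\nabla u_1|^2 u_2^p\,dx + C(\|a_1\|_\infty, p)\int_{\mathbb{R}^2} |\nabla u_2|^2 u_2^{p-2}\,dx - 2\int_{\mathbb{R}^2} |\nabla u_1|^2 u_2^p\,dx \\
&\le C(\|a_1\|_\infty, p)\int_{\mathbb{R}^2} |\nabla u_2|^2 u_2^{p-2}\,dx - \int_{\mathbb{R}^2} |\nabla u_1|^2 u_2^p\,dx.
\end{aligned}$$
Moreover,
$$III \le -\int_{\mathbb{R}^2} u_1 u_2^{m+p}\,dx, \qquad V \le 0$$
and
$$VI \le (\|a_1\|_\infty + \|a_1\|_\infty^2)\int_{\mathbb{R}^2} p\,u_1 u_2^{m+p-1}\,dx.$$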
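The middle step in the bound on $I + II + IV$ is an application of Young's inequality. A minimal sketch follows, where $K$ is a shorthand introduced here for $(1 + 2\|a_1\|_\infty)(1 + d)\,p$; with this choice the constant written $C(\|a_1\|_\infty, p)$ can be taken as $K^2/4$, which also depends on $d$.

```latex
% Sketch only. K := (1 + 2\|a_1\|_\infty)(1 + d)\,p is a shorthand introduced here.
% Split the weight u_2^{p-1} = u_2^{p/2}\cdot u_2^{p/2-1} and use
% K\,ab \le a^2 + \tfrac{K^2}{4}\,b^2, which is (a - \tfrac{K}{2}b)^2 \ge 0.
\begin{align*}
K\,|\nabla u_1\cdot\nabla u_2|\,u_2^{p-1}
  &\le K\,\bigl(|\nabla u_1|\,u_2^{p/2}\bigr)\bigl(|\nabla u_2|\,u_2^{p/2-1}\bigr) \\
  &\le |\nabla u_1|^2\,u_2^{p} + \frac{K^2}{4}\,|\nabla u_2|^2\,u_2^{p-2}.
\end{align*}
% Integrating over \mathbb{R}^2 gives the second line of the estimate.
```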
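An integration of (2.3) from $0$ to $t$ yields
$$\begin{aligned}
\int_{\mathbb{R}^2} (u_1^2 + u_1)u_2^p\,dx\,(t) &\le \int_{\mathbb{R}^2} (a_1^2 + a_1)a_2^p\,dx + C(p, a_1)\int_0^t\!\!\int_{\mathbb{R}^2} |\nabla u_2|^2 u_2^{p-2}\,dx\,d\tau \\
&\quad + (\|a_1\|_\infty + \|a_1\|_\infty^2)\int_0^t\!\!\int_{\mathbb{R}^2} p\,u_1 u_2^{m+p-1}\,dx\,d\tau \\
&\quad - \int_0^t\!\!\int_{\mathbb{R}^2} u_1 u_2^{m+p}\,dx\,d\tau - \int_0^t\!\!\int_{\mathbb{R}^2} |\nabla u_1|^2 u_2^p\,dx\,d\tau.
\end{aligned} \tag{2.4}$$
The combination of (2.2) and (2.4) then gives
$$\int_{\mathbb{R}^2} u_2^p\,dx + \int_0^t\!\!\int_{\mathbb{R}^2} \bigl(|\nabla u_2|^2 u_2^{p-2} + |\nabla u_1|^2 u_2^p + u_1 u_2^{m+p}\bigr)\,dx\,d\tau \le C(a_1, a_2, p)\left(1 + \int_0^t\!\!\int_{\mathbb{R}^2} u_1 u_2^{m+p-1}\,dx\,d\tau\right). \tag{2.5}$$
For $p = 1$, we want to show
$$\int_0^t\!\!\int_{\mathbb{R}^2} \bigl(u_1 u_2^{m+1} + |\nabla u_1|^2 u_2\bigr)\,dx\,d\tau \le C(a_1, a_2)\left(1 + \int_0^t\!\!\int_{\mathbb{R}^2} u_1 u_2^m\,dx\,d\tau\right). \tag{2.6}$$
Proceeding as in the case $p \ge 2$, we have
$$\begin{aligned}
\frac{d}{dt}\int_{\mathbb{R}^2} (u_1^2 + u_1)u_2\,dx
&= -2\int_{\mathbb{R}^2} |\nabla u_1|^2 u_2\,dx - \int_{\mathbb{R}^2} (1 + 2u_1)\nabla u_1 \cdot \nabla u_2\,dx - \int_{\mathbb{R}^2} (1 + 2u_1)u_1 u_2^{m+1}\,dx \\
&\quad - d\int_{\mathbb{R}^2} (1 + 2u_1)\nabla u_1 \cdot \nabla u_2\,dx + \int_{\mathbb{R}^2} (u_1^2 + u_1)u_1 u_2^m\,dx \\
&\le -2\int_{\mathbb{R}^2} |\nabla u_1|^2 u_2\,dx + C(a_1, a_2)\int_{\mathbb{R}^2} \bigl(|\nabla u_1 \cdot \nabla u_2| + u_1 u_2^m\bigr)\,dx - \int_{\mathbb{R}^2} u_1 u_2^{m+1}\,dx \\
&\le -2\int_{\mathbb{R}^2} |\nabla u_1|^2 u_2\,dx + C(a_1, a_2)\int_{\mathbb{R}^2} \bigl(\varepsilon^{-1}|\nabla u_1|^2 + \varepsilon|\nabla u_2|^2 + u_1 u_2^m\bigr)\,dx - \int_{\mathbb{R}^2} u_1 u_2^{m+1}\,dx
\end{aligned} \tag{2.7}$$
for any $\varepsilon > 0$. Now, integrating the above inequality from $0$ to $t$ gives
$$\int_{\mathbb{R}^2} (u_1^2 + u_1)u_2\,dx + \int_0^t\!\!\int_{\mathbb{R}^2} \bigl(2|\nabla u_1|^2 u_2 + u_1 u_2^{m+1}\bigr)\,dx\,d\tau \le C(a_1, a_2)\left(1 + \int_0^t\!\!\int_{\mathbb{R}^2} \bigl(\varepsilon^{-1}|\nabla u_1|^2 + \varepsilon|\nabla u_2|^2 + u_1 u_2^m\bigr)\,dx\,d\tau\right). \tag{2.8}$$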
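With the help of (2.2) when $p = 2$, we obtain
$$2d\int_0^t\!\!\int_{\mathbb{R}^2} |\nabla u_2|^2\,dx\,d\tau \le \int_{\mathbb{R}^2} a_2^2\,dx + 2\int_0^t\!\!\int_{\mathbb{R}^2} u_1 u_2^{m+1}\,dx\,d\tau. \tag{2.9}$$
Similarly, by multiplying (1.1) by $2u_1$ and integrating over $\mathbb{R}^2 \times (0, t)$, we have
$$\int_{\mathbb{R}^2} u_1^2\,dx = \int_{\mathbb{R}^2} a_1^2\,dx - 2\int_0^t\!\!\int_{\mathbb{R}^2} |\nabla u_1|^2\,dx\,d\tau - 2\int_0^t\!\!\int_{\mathbb{R}^2} u_1^2 u_2^m\,dx\,d\tau, \tag{2.10}$$
which implies that
$$2\int_0^t\!\!\int_{\mathbb{R}^2} |\nabla u_1|^2\,dx\,d\tau \le \int_{\mathbb{R}^2} a_1^2\,dx. \tag{2.11}$$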
Combining (2.8), (2.9) and (2.11) with sufficiently small $\varepsilon$ gives the desired inequality (2.6). It is easy to see that the lemma follows from a simple induction on $p$ using (2.5) and (2.6).

Lemma 2.3. The following estimates for the derivative of $(u_1, u_2)$ hold if $t \ge t_0 > 0$:
$$\|\nabla u_1(\cdot, t)\|_2 + \|\nabla u_2(\cdot, t)\|_2 \le C(a_1, a_2) < \infty, \tag{2.12}$$
where $t_0 > 0$ is arbitrary.

Proof. The above estimates follow by standard techniques once the $L^p$ estimate of $(u_1, u_2)$ is established. For completeness, we include the details here. First, we have from (2.10),
$$\int_{\mathbb{R}^2} u_1^2\,dx + \int_0^t\!\!\int_{\mathbb{R}^2} |\nabla u_1|^2\,dx\,d\tau \le \int_{\mathbb{R}^2} a_1^2\,dx.$$
Similarly, we have
$$\int_{\mathbb{R}^2} u_2^2\,dx + \int_0^t\!\!\int_{\mathbb{R}^2} |\nabla u_2|^2\,dx\,d\tau \le \int_0^t\!\!\int_{\mathbb{R}^2} u_1 u_2^{m+1}\,dx\,d\tau + \int_{\mathbb{R}^2} a_2^2\,dx \le C(a_1, a_2)$$
by Lemma 2.2. By Fubini's theorem, there exists $t_1 \in [0, t_0]$ such that
$$\int_{\mathbb{R}^2} \bigl(|\nabla u_1|^2 + |\nabla u_2|^2\bigr)\,dx\,(t_1) \le C(a_1, a_2)/t_0. \tag{2.13}$$
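The choice of $t_1$ in (2.13) is an averaging argument. A short sketch, where $f$ is a name introduced here for the spatial integral of the squared gradients:

```latex
% Sketch only. f(\tau) is a shorthand introduced here.
\[
f(\tau) := \int_{\mathbb{R}^2}\bigl(|\nabla u_1|^2 + |\nabla u_2|^2\bigr)(x,\tau)\,dx \;\ge\; 0,
\qquad
\int_0^{t_0} f(\tau)\,d\tau \;\le\; C(a_1,a_2).
\]
% If f(\tau) > C(a_1,a_2)/t_0 held for every \tau \in [0,t_0], integrating would give
% \int_0^{t_0} f\,d\tau > C(a_1,a_2), a contradiction. Hence
\[
\exists\, t_1 \in [0,t_0]: \qquad f(t_1) \;\le\; \frac{C(a_1,a_2)}{t_0}.
\]
```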
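Next, multiplying (1.1) by $\Delta u_1$ and integrating over $\mathbb{R}^2$ yields
$$\begin{aligned}
-\frac{1}{2}\frac{d}{dt}\|\nabla u_1\|_2^2 &= \int_{\mathbb{R}^2} (\Delta u_1)^2\,dx - \int_{\mathbb{R}^2} u_1 u_2^m\,\Delta u_1\,dx \\
&= \int_{\mathbb{R}^2} (\Delta u_1)^2\,dx + \int_{\mathbb{R}^2} u_2^m|\nabla u_1|^2\,dx + m\int_{\mathbb{R}^2} u_1 u_2^{m-1}\,\nabla u_1 \cdot \nabla u_2\,dx.
\end{aligned} \tag{2.14}$$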
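By similar means, we have
$$-\frac{1}{2}\frac{d}{dt}\|\nabla u_2\|_2^2 = d\int_{\mathbb{R}^2} (\Delta u_2)^2\,dx - m\int_{\mathbb{R}^2} u_1 u_2^{m-1}|\nabla u_2|^2\,dx - \int_{\mathbb{R}^2} u_2^m\,\nabla u_1 \cdot \nabla u_2\,dx. \tag{2.15}$$
Adding (2.14) and (2.15), and integrating from $t_1$ to $t$, then gives
$$\begin{aligned}
(\|\nabla u_1\|_2^2 + \|\nabla u_2\|_2^2)(t) &\le (\|\nabla u_1\|_2^2 + \|\nabla u_2\|_2^2)(t_1) + \int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_2^m|\nabla u_1 \cdot \nabla u_2|\,dx\,d\tau \\
&\quad + m\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_1 u_2^{m-1}|\nabla u_2|^2\,dx\,d\tau + m\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_1 u_2^{m-1}|\nabla u_1 \cdot \nabla u_2|\,dx\,d\tau.
\end{aligned} \tag{2.16}$$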
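We now derive bounds for each of the last three terms on the right-hand side of (2.16). Let $M$ be the smallest positive integer greater than or equal to $m$. Then
$$\begin{aligned}
\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_2^m|\nabla u_1 \cdot \nabla u_2|\,dx\,d\tau
&\le \left(\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_2^m|\nabla u_1|^2\right)^{1/2}\left(\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_2^m|\nabla u_2|^2\right)^{1/2} \\
&\le \left(\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_2^M|\nabla u_1|^2\right)^{m/2M}\left(\int_{t_1}^t\!\!\int_{\mathbb{R}^2} |\nabla u_1|^2\right)^{(M-m)/2M} \\
&\quad \times \left(\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_2^M|\nabla u_2|^2\right)^{m/2M}\left(\int_{t_1}^t\!\!\int_{\mathbb{R}^2} |\nabla u_2|^2\right)^{(M-m)/2M}
\end{aligned} \tag{2.17}$$
by Hölder's inequality. Similarly,
$$\begin{aligned}
\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_1 u_2^{m-1}|\nabla u_2|^2\,dx\,d\tau
&\le \|a_1\|_\infty\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_2^{m-1}|\nabla u_2|^2\,dx\,d\tau \\
&\le \|a_1\|_\infty\left(\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_2^{M-1}|\nabla u_2|^2\,dx\,d\tau\right)^{\sigma}\left(\int_{t_1}^t\!\!\int_{\mathbb{R}^2} |\nabla u_2|^2\,dx\,d\tau\right)^{1-\sigma},
\end{aligned} \tag{2.18}$$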
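where $\sigma = (m - 1)/(M - 1)$, and
$$\begin{aligned}
\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_1 u_2^{m-1}|\nabla u_1 \cdot \nabla u_2|\,dx\,d\tau
&\le \|a_1\|_\infty\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_2^{m-1}|\nabla u_1 \cdot \nabla u_2|\,dx\,d\tau \\
&\le \|a_1\|_\infty\left(\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_2^{M-1}|\nabla u_1|^2\,dx\,d\tau\right)^{\sigma/2}\left(\int_{t_1}^t\!\!\int_{\mathbb{R}^2} |\nabla u_1|^2\,dx\,d\tau\right)^{(1-\sigma)/2} \\
&\quad \times \left(\int_{t_1}^t\!\!\int_{\mathbb{R}^2} u_2^{M-1}|\nabla u_2|^2\,dx\,d\tau\right)^{\sigma/2}\left(\int_{t_1}^t\!\!\int_{\mathbb{R}^2} |\nabla u_2|^2\,dx\,d\tau\right)^{(1-\sigma)/2}.
\end{aligned} \tag{2.19}$$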
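The exponents in (2.17)–(2.19) come from interpolating between the weights $1$ and $u_2^M$ (respectively $u_2^{M-1}$). A sketch of the underlying step, where $g \ge 0$ is a placeholder introduced here for $|\nabla u_1|^2$ or $|\nabla u_2|^2$ and the case $m < M$ is assumed (for $m = M$ there is nothing to interpolate):

```latex
% Sketch only. g \ge 0 is a placeholder for |\nabla u_1|^2 or |\nabla u_2|^2; assume m < M.
% Hoelder's inequality with conjugate exponents M/m and M/(M-m):
\[
\iint u_2^{m}\,g
  = \iint \bigl(u_2^{M} g\bigr)^{m/M}\, g^{(M-m)/M}
  \;\le\; \Bigl(\iint u_2^{M} g\Bigr)^{m/M}\Bigl(\iint g\Bigr)^{(M-m)/M}.
\]
% Taking square roots gives the exponents m/2M and (M-m)/2M of (2.17).
% For M > 1, the same argument with \sigma = (m-1)/(M-1) gives
\[
\iint u_2^{m-1}\,g
  \;\le\; \Bigl(\iint u_2^{M-1} g\Bigr)^{\sigma}\Bigl(\iint g\Bigr)^{1-\sigma},
\]
% which is the step used in (2.18) and, after Cauchy--Schwarz, in (2.19).
```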
With the help of (2.16)–(2.19), and (2.5) and (2.6) in Lemma 2.2, we obtain
$$(\|\nabla u_1\|_2^2 + \|\nabla u_2\|_2^2)(t) \le (\|\nabla u_1\|_2^2 + \|\nabla u_2\|_2^2)(t_1) + C(a_1, a_2).$$
Hence, by (2.13), $(\|\nabla u_1\|_2^2 + \|\nabla u_2\|_2^2)(t) \le C(a_1, a_2)$ for all $t \ge t_0$.

Proposition 2.4. The system (1.1), (1.2) has a unique classical solution satisfying, for all $t > 0$,
$$\|u_1\|_{L^p} + \|u_2\|_{L^p} \le C(a_1, a_2), \qquad 1 \le p \le \infty, \tag{2.20}$$
where the constant depends only on the initial data $(a_1, a_2) \in (L^1(\mathbb{R}^2) \cap L^\infty(\mathbb{R}^2))^2$.

Proof. The bound for $u_1$ follows directly from Lemmas 2.1 and 2.2. For $u_2$, the classical theory of local existence yields the bound for small $t$, say $t \le t_0$, where $t_0 > 0$. For $t > t_0$, if we write Eq. (1.2) as $u_{2,t} - d\,\Delta u_2 = f$, the standard bootstrapping argument yields
$$\|u_2\|_{2,2} \le C(\|u_2\|_{1,2} + \|f\|_2),$$
where $\|\cdot\|_{2,2}$ is the norm of the Sobolev space $W^{2,2}(\mathbb{R}^2)$ and $\|\cdot\|_{1,2}$ that of $W^{1,2}(\mathbb{R}^2)$, with $C = C(n, d)$. The Sobolev embedding, (2.5) for $p = 2$ and Lemma 2.3 give us $\|u_2\|_\infty \le C(a_1, a_2)$. This completes the proof of the proposition.
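3. The Case of $m > 1$

In this section, we show that the solutions $(u_1, u_2)$, when $m > 1$, have the large-time dynamics of solutions of pure heat equations. First, it is clear by the maximum principle that
$$\|u_1\|_\infty \le C(a_1)(1 + t)^{-1}, \qquad \forall\, t \ge 0. \tag{3.1}$$
Second,
$$\begin{aligned}
u_2(x, t) &= \int_{\mathbb{R}^2} H(y, t)a_2(x - y)\,dy + \int_0^t\!\!\int_{\mathbb{R}^2} H(y, s)\,u_1 u_2^m(x - y, t - s)\,dy\,ds \\
&\le \frac{C(a_2)}{1 + t} + \left(\int_0^1 + \int_1^t\right)\int_{\mathbb{R}^2} H(y, s)\,u_1 u_2^m(x - y, t - s)\,dy\,ds,
\end{aligned}$$
where
$$H(x, t) = \frac{1}{4\pi d t}\,e^{-|x|^2/4dt}$$
is the heat kernel. For the first integral,
$$\int_0^1\!\!\int_{\mathbb{R}^2} H(y, s)\,u_1 u_2^m(x - y, t - s)\,dy\,ds \le \int_0^1 \|u_2\|_\infty^m\|u_1\|_\infty(t - s)\,ds \le C(a_1, a_2)\int_0^1 \frac{1}{1 + t - s}\,ds \le \frac{C(a_1, a_2)}{t},$$
using (3.1) and the $L^\infty$ bound $\|u_2\|_\infty \le C(a_1, a_2)$. For the second integral,
$$\begin{aligned}
\int_1^t\!\!\int_{\mathbb{R}^2} H(y, s)\,u_1 u_2^m(x - y, t - s)\,dy\,ds
&\le \int_1^t s^{-1}\|u_1 u_2^{m-1}\|_\infty(t - s)\,ds \\
&\le C(a_1, a_2)\int_1^t s^{-1} \cdot \int_{\mathbb{R}^2} u_2\,dy \cdot \frac{1}{1 + t - s}\,ds \\
&\le C(a_1, a_2)\,2\log t/(1 + t).
\end{aligned}$$
Therefore,
$$\|u_2\|_\infty \le C(a_1, a_2)\log t/(1 + t). \tag{3.2}$$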
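The factor $2\log t/(1 + t)$ in the last estimate is an exact value, which a partial-fraction decomposition makes explicit. The computation below is a sketch for $t > 1$ and uses nothing beyond elementary calculus.

```latex
% Sketch only: elementary evaluation of the time integral, for t > 1.
\begin{align*}
\frac{1}{s\,(1+t-s)} &= \frac{1}{1+t}\left(\frac{1}{s} + \frac{1}{1+t-s}\right), \\
\int_1^t \frac{ds}{s\,(1+t-s)}
  &= \frac{1}{1+t}\Bigl[\log s - \log(1+t-s)\Bigr]_{s=1}^{s=t}
   = \frac{1}{1+t}\bigl(\log t + \log t\bigr)
   = \frac{2\log t}{1+t}.
\end{align*}
```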
Substituting this inequality back into the estimate preceding (3.2), we have
$$\int_1^t\!\!\int_{\mathbb{R}^2} H(y, s)\,u_1 u_2^m(x - y, t - s)\,dy\,ds \le C(a_1, a_2)\int_1^t s^{-1}\,\frac{(\log(1 + t - s))^{m-1}}{(1 + t - s)^m}\,ds \le C/(1 + t).$$
The last inequality can be established, for instance, by breaking the integral into two parts on $[1, t^{2/(1+m)}]$ and $[t^{2/(1+m)}, t]$ and estimating each part. Hence,
$$\|u_2\|_\infty \le C(a_1, a_2)/(1 + t). \tag{3.3}$$
It is then clear that both $u_1$ and $u_2$ satisfy an equation of the form
$$u_t = D\,\Delta u + O((1 + t)^{-m})\,u,$$
with $D$ a positive constant. By the classical theory of the heat equation, the nonlinear term is irrelevant and the large-time dynamics for each of them is characterized by a pure heat equation. That is,
$$\left\|t\,u_1(\sqrt{t}\,x, t) - \frac{A_1}{4\pi}\,e^{-|x|^2/4}\right\| \to 0 \quad \text{as } t \to \infty$$
and
$$\left\|t\,u_2(\sqrt{t}\,x, t) - \frac{A_2}{4\pi d}\,e^{-|x|^2/4d}\right\| \to 0 \quad \text{as } t \to \infty$$
with $A_1, A_2 > 0$ and $A_1 + A_2 = A$. This is the exact statement of (ii) in Theorem 1.1.

4. The Linear Eigenvalue Problem

In this section, we study the linear eigenvalue problem $L_A w = \lambda_A w$, where $\lambda_A > 0$ is the first eigenvalue of $L_A$. First, we list some known facts from classical functional analysis and other sources; for details see [7, 8].

(i) The spectrum of $L_A$, as considered in $L^2(\mathbb{R}^2, d\mu)$, consists of eigenvalues only and the eigenfunctions form a complete orthogonal set in that space.

(ii) The first eigenvalue $\lambda_A > 0$ is non-degenerate and the corresponding eigenfunction is positive and radial, $w = w(r)$, where $r = |y|$ and $w'(0) = 0$.

(iii) $\lambda_A$ depends continuously on $A$, is an increasing function of $A$ and is strictly less than $A/4\pi d$.
The main purpose of this section is to show that the eigenfunction $w$ has the asymptotics, as $|y| \to \infty$,
$$w(r) = c\,e^{-r^2/4}\,|r|^{2\lambda_A} + \text{h.o.t.}$$
with $c$ a positive constant. As a matter of fact, we shall consider the more general case of
$$w'' + \frac{1}{r}w' + \frac{r}{2}w' + w - A\phi w = -\lambda w, \qquad w'(0) = 0, \quad w(r) > 0, \ \forall\, r \ge 0, \tag{4.1}$$
and prove that a positive solution with $\lambda > 0$ has the asymptotics as above when $\lambda_A$ is replaced by $\lambda$, provided that $w \in L^2(\mathbb{R}^2, d\mu)$. If we integrate Eq. (4.1), we have
$$r w'(r) + \frac{r^2}{2}w = \int_0^r \nu(A\phi - \lambda)w\,d\nu. \tag{4.2}$$
If there exists $r_0 > 0$ such that
$$w'(r_0) + r_0 w(r_0)/2 < 0 \quad \text{and} \quad A\phi(y) - \lambda < 0, \ \forall\, r \ge r_0,$$
then $g(r) = w'(r) + r w/2$ satisfies $g(r_0) \le 0$ and $r g'(r) + g(r) < 0$, $\forall\, r \ge r_0$. Hence, there exists $\eta > 0$ with the property $r g(r) < -\eta$, $\forall\, r \ge r_0 + 1 \equiv r_1$. An integration then yields
$$e^{r^2/4}w(r) - e^{r_1^2/4}w(r_1) < -\eta\int_{r_1}^r e^{\nu^2/4}\nu^{-1}\,d\nu,$$
which in turn implies
$$w(r) < -\frac{\eta}{2r^2}, \qquad \forall\, r \gg 1.$$
A contradiction! Thus, $g(r) > 0$ for all $r$ large. In addition, it is easy to see that $w'$ cannot change sign for $r$ large. This, together with $g(r) > 0$, shows $w'(r) < 0$ for all $r$ large. An inspection of (4.2) gives $g(r) > 0$ for all $r > 0$. In consequence, $g'(r) \to 0$ as $r \to \infty$.

To derive the exact asymptotics, we make a change of variables. Let
$$z = e^{r^2/4}w.$$
We know from the above analysis that $z, z' > 0$ $\forall\, r > 0$. The equation for $z$ is
$$z'' - \frac{r}{2}z' + \frac{1}{r}z' = (A\phi - \lambda)z.$$
If $z \to M > 0$ as $r \to \infty$, we would have, upon an integration,
$$z' > \lambda z/r, \qquad \forall\, r \gg 1.$$
Clearly, this is in contradiction with $z \to M > 0$ as $r \to \infty$. Hence, $z \to \infty$ as $r \to \infty$.
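Both (4.2) and the equation for $z$ follow from (4.1) by direct computation. The following sketch records the two steps, assuming (4.1) in the radial form $w'' + (1/r + r/2)w' + w - A\phi w = -\lambda w$.

```latex
% Sketch only. Assumes (4.1): w'' + (1/r + r/2) w' + w - A\phi w = -\lambda w.
% Step 1: multiply (4.1) by r and recognise an exact derivative.
\begin{align*}
\bigl(r w' + \tfrac{r^2}{2} w\bigr)' &= r w'' + w' + \tfrac{r^2}{2} w' + r w
   = r\,(A\phi - \lambda)\,w ,
\end{align*}
% and integrate from 0 to r, using w'(0) = 0, to get (4.2).
% Step 2: substitute w = e^{-r^2/4} z.
\begin{align*}
w'  &= e^{-r^2/4}\bigl(z' - \tfrac{r}{2} z\bigr), \qquad
w'' = e^{-r^2/4}\bigl(z'' - r z' - \tfrac12 z + \tfrac{r^2}{4} z\bigr), \\
w'' + \bigl(\tfrac1r + \tfrac r2\bigr) w' + w
  &= e^{-r^2/4}\bigl(z'' - \tfrac{r}{2} z' + \tfrac{1}{r} z'\bigr),
\end{align*}
% so that (4.1) becomes z'' - (r/2) z' + (1/r) z' = (A\phi - \lambda) z.
```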
Let $G(r) = r z'/2 - \lambda z$. Then
$$G' = \frac{1}{2}z' + \frac{r}{2}z'' - \lambda z' = \left(\frac{r}{2} - \frac{2\lambda}{r}\right)G - \frac{2\lambda^2}{r}z + \frac{r}{2}A\phi z.$$
We show that $G > 0$ for $r \gg 1$. Otherwise, $G < 0$ for all $r$ large. Furthermore,
$$G\,e^{-r^2/4}\,r^{2\lambda} < -C < 0, \qquad \forall\, r \gg 1.$$
In consequence,
$$z > \frac{C}{\lambda}\,e^{r^2/4}\,r^{-2\lambda} \iff w(r) > \frac{C}{\lambda}\,r^{-2\lambda}, \qquad \forall\, r \gg 1,$$
which is in clear contradiction with $w \in L^2(\mathbb{R}^2, d\mu)$. Hence, $G > 0$ for all $r \gg 1$. We now show $z r^{-2\lambda} \to C > 0$ as $r \to \infty$. For this purpose, we make another change of variables by letting
$$\sigma(t) = z r^{-2\lambda}, \qquad t = \log r.$$
(4.3)
First, we observe that for any δ > 0, σ < δσ if t 1. Otherwise, we would have, using (4.3), σ > 0, σ Hence, σ > σ
(σ − δσ) > 0
1 − e2t 2
an integration of which yields
∀ t 1.
for any > 0 if t 1,
1 1 − e2t . log σ > 2 2
Moreover, we see that σ − µσ > 0,
(σ − µσ) > 0,
for any µ > 0, if t 1. Consequently, we would have z ≥ µz if y 1. A contradiction! It is clear from the above argument that we must have 4λ2 σ > σ e2t /4 for all t 1. An integration of which yields σ is bounded from above. Hence, σ → C > 0 as t → ∞. The asymptotics of w is thus established.
5. The Case of $m = 1$

In this section, we use the Renormalization Group (RG) method, in combination with the a priori estimates derived in Sec. 2 and below, to derive the spatio-temporal profile of general solutions for the case $m = 1$, thereby completing the proof of Theorem 1.1. First, we summarize some basic facts on $u_1$ and $u_2$.

Proposition 5.1. (i) If $0 \le a_1, a_2 \in B$, then $u_1(\cdot, t), u_2(\cdot, t) \in B$ for any $t > 0$, and there exists $C = C(q, a_1, a_2) > 0$ such that
$$u_1(x, t) + u_2(x, t) \le e^{Ct}(1 + |x|)^{-q}.$$
(ii) For any $t_0 > 0$, there exists $\delta > 0$ such that
$$u_2(x, t) \ge \delta(1 + t)^{-1}\phi\!\left(\frac{x}{\sqrt{1 + t}}\right) \quad \text{for } t \ge t_0.$$
(iii) $u_1(\cdot, t), u_2(\cdot, t)$ are positive, classical solutions to (1.1), (1.2) with uniformly bounded $L^\infty$ norms.

Proof. (i) is an easy exercise, and we omit the proof. (ii) and (iii) are proved in Sec. 2.

Next, we observe that by making the change of variables
$$w_i(y, s) = (1 + t)u_i(x, t), \quad i = 1, 2, \qquad s = \log(1 + t), \qquad y = \frac{x}{\sqrt{1 + t}},$$
the system is changed to
$$\begin{aligned}
w_{1,s} &= \Delta w_1 + \frac{y}{2} \cdot \nabla w_1 + w_1 - w_1 w_2, \\
w_{2,s} &= d\,\Delta w_2 + \frac{y}{2} \cdot \nabla w_2 + w_2 + w_1 w_2.
\end{aligned} \tag{5.1}$$
It is a more convenient formulation to work with.
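The rescaled system can be verified by the chain rule. The sketch below assumes that (1.1), (1.2) with $m = 1$ read $u_{i,t} = d_i\,\Delta u_i \mp u_1 u_2$, where $d_1 = 1$ and $d_2 = d$ are labels introduced here for the two diffusion constants.

```latex
% Sketch only. Assumption: u_{i,t} = d_i \Delta u_i \mp u_1 u_2 with d_1 = 1, d_2 = d
% (the labels d_i are introduced here). Write u_i(x,t) = e^{-s} w_i(y,s) with
% s = \log(1+t), y = x/\sqrt{1+t}, so that
% ds/dt = e^{-s} and \partial_t y = -\tfrac12 e^{-s} y.
\begin{align*}
u_{i,t} &= e^{-2s}\Bigl(w_{i,s} - w_i - \tfrac{y}{2}\cdot\nabla_y w_i\Bigr), \\
\Delta_x u_i &= e^{-s}\cdot e^{-s}\,\Delta_y w_i = e^{-2s}\,\Delta_y w_i, \qquad
u_1 u_2 = e^{-2s}\,w_1 w_2 .
\end{align*}
% Cancelling the common factor e^{-2s} gives
\[
w_{i,s} = d_i\,\Delta w_i + \tfrac{y}{2}\cdot\nabla w_i + w_i \mp w_1 w_2 ,
\]
% which is the system displayed above for i = 1, 2.
```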
Lemma 5.2. Suppose $a_1 \in B$ and
$$u_2(x, t) \ge c_0\phi(y)(1 + t)^{-1}.$$
Then there exist $c_1 = c_1(c_0, q, d) > 0$ and $E = E(c_0, q, a_1)$ such that
$$u_1(x, t) \le E(1 + t)^{-(2 + c_1)/2}(1 + |y|)^{-q}.$$

Proof. We use the formulation (5.1). The conclusion is the same as showing
$$w_1(y, s) \le E\,e^{-sc_1/2}(1 + |y|)^{-q}.$$
Let $D > 0$ be such that $D^2 > 4q(q + 1)/(q - 2)$ and, for all $|y| \ge D$,
$$\exp(-|y|^2/4) \le |y|^{-q} \quad \text{and} \quad -|y|\exp(-|y|^2/4) \ge -2q|y|^{-q-1}.$$
Let $c = \min_{|y| \le D} c_0\phi(y)$. Without loss of generality, we assume $c \le (q - 2)/4$. Denote
$$I(w) = w_s - \Delta w - \frac{y}{2} \cdot \nabla w - w + c_0\phi(y)w.$$
Let
$$\bar w = \begin{cases}
E_1\exp(-|y|^2/4)\,e^{-c_1 s/2} + M e^{-c_1 s/2}, & |y| \le D, \\[4pt]
E_1\exp(-|y|^2/4)\,e^{-c_1 s/2} + M e^{-c_1 s/2}\,\dfrac{D^q}{|y|^q}, & |y| \ge D,
\end{cases} \tag{5.2}$$
where $c_1 = \delta c$ with $\delta > 0$ a small number such that $4c_1 < q - 2$, and $E_1$ and $M$ are positive numbers satisfying
$$c_1 E_1 < \frac{M}{4}(q - 2 - 2c_1)\,e^{D^2/4}.$$
It is clear that with a suitable choice of $M$, and consequently $E_1$, $\bar w(y, 0) \ge w_1(y, 0)$. We now demonstrate that $I(\bar w) \ge 0$, which, together with $I(w_1) \le 0$, yields the conclusion of the lemma. If $|y| \le D$, it is easy to compute
$$I(\bar w) \ge e^{-c_1 s/2}(1 - \delta)c\left(E_1 e^{-|y|^2/4} + M\right) > 0.$$
If $|y| \ge D$, a more detailed calculation gives
$$I(\bar w) \ge e^{-c_1 s/2}\left[-\frac{c_1}{2}E_1 e^{-|y|^2/4} + \frac{M}{2}(q - 2 - c_1)D^q|y|^{-q} - M q(q + 1)|y|^{-(q+2)}D^q\right] > 0$$
by our careful choice of $c_1$, $D$ and $E_1$. This completes the proof of Lemma 5.2.

Lemma 5.3. Suppose $a_1, a_2 \in B$. There exists $\delta > 0$ such that
$$\lim_{t \to \infty}\left|t\,u_2(x, t) - A\phi\!\left(\frac{x}{\sqrt{t}}\right)\right|t^\delta = 0 \quad \text{uniformly in } \{x : |x| \le C\sqrt{t}\}, \tag{5.3}$$
$$\lim_{t \to \infty}\left|\int_{\mathbb{R}^2} u_2(x, t)\,dx - A\right|t^\delta = 0. \tag{5.4}$$

Proof. It is easy to see that $\|u_1 + u_2\|_1 = \|a_1 + a_2\|_1$, that $\|u_1\|_1$ is decreasing in $t$ and that $\|u_2\|_1$ is increasing. By Lemma 5.2, $\|u_1\|_1 \to 0$
as t → ∞,
and therefore, $\|u_2\|_1 \to A$ as $t \to \infty$. Moreover, using the $u_1$ bound in Lemma 5.2, we deduce that $u_2$ satisfies, with some $\delta_0 > 0$,
$$u_{2,t} = d\,\Delta u_2 + O(t^{-1-\delta_0})\,u_2.$$
Clearly, the nonlinear term is irrelevant and the conclusion follows from the classical result for the heat equation.

With the limiting profile of $u_2$ settled, we can now derive a better estimate for $u_1$ than the one given in Lemma 5.2.

Lemma 5.4. Suppose $a_1, a_2 \in B$ and $q > q(A)$. For any $\varepsilon > 0$, there exists $t_0 > 0$ such that if $t \ge t_0$,
$$u_1(x, t) \le M(1 + |y|)^{-q}\,t^{-\lambda - 1},$$
where $\lambda = \lambda_{A-\varepsilon}$ is the first eigenvalue of $L_{A-\varepsilon}$ in $L^2(\mathbb{R}^2, d\mu)$, and $M$ is a positive constant.

Proof. Let $t_1 > 1$ be sufficiently large so that
$$w_2(y, s) \ge A\phi(y) - e^{-\delta s} \ge \frac{A}{2}\phi(y)$$
for $|y| \le D$ and $s \ge s_1 \equiv \log t_1$, where $D$ is a large positive number to be determined later. Set
$$\bar w = e^{-\lambda s}\left(M_1\psi(y) + (1 + |y|^2)^{-q/2}\right), \tag{5.5}$$
where ψ(y) is the eigenfunction of LA− corresponding to λ with maxy∈R2 ψ(y) = 1. Denote y J(w) = ws − ∆w − · ∇w − w + w2 w. 2 It is easy to see that J(e
−λs
ψ(y)) ≥
e−λs ψ(y)[Aφ(y) − e−δs ] |y| ≤ D, −e−λs Aψ(y)φ(y)
|y| ≥ D.
(5.6)
Similarly, J(e−λs (1 + |y|2 )−q/2 ) q−2 q(q + 2)|y|2 5 ≥ e−λs − λ (1 + |y|2 ) − q − (1 + |y|2 )−q/2−1 2 2 1 + |y|2 > e−λs (1 + |y|2 )−q/2 (q − 2 − 2λ) /4 > 0
if |y| ≥ D1 (λ, q),
where D1 is the first positive number such that q − 2 − 2λ 5 (1 + |y|2 )2 ≥ q(1 + |y|2 ) + q(q + 2)|y|2 4 2
for all |y| ≥ D1 .
Here, we assume D D1 . Clearly, J(e−λs ψ(y)) ≥
1 −λs e Aψφ > 0 2
if s ≥ s2 (A, D) and |y| ≤ D. Furthermore, J(M1 e−λs ψ(y)) ≥
1 −λs e M1 Aψφ > 2q(q + 2)e−λs (1 + |y|2 )−q/2−1 2
if M1 > M (A, q) and |y| ≤ D1 . This shows that J(w) ¯ ≥ 0 if |y| ≤ D and s ≥ s2 (A, D). Finally, if |y| ≥ D and D is sufficiently large,
J e−λs M1 ψ(y) + (1 + |y|2 )−q/2 q − 2 − 2λ ≥ (1 + |y|2 )−q/2 − M1 Aψ(y)φ(y) e−λs > 0. 4 This shows that w ¯ is a super-solution if s ≥ s0 ≡ max(s1 , s2 ). It is a simple ¯ s0 ), where M2 is a matter to see that w1 (y, s0 ) can be bounded above by M2 w(y, ¯ s), s > s0 . This big positive number. By maximum principle w1 (y, s) ≤ M2 w(y, completes the proof of the lemma. The success of the Renormalization Group method depends, crucially, on estimates of solutions to the linear operators ws = LA w and ws = Lw. We collect some relevant ones in the next lemma. In what follows, we use e−sLA f and e−sL f to denote their solutions with initial value f , respectively. Lemma 5.5. Suppose f ∈ B. (i) There exists δ = δ(A, q) and s0 < ∞ such that for s ≥ s0 , e−sLA f ≤ e−δs f . Moreover, if q > q(A) and (ψA , f ) = 0, e−sLA f ≤ e−(δ+λA )s f , where λA is the first eigenvalue of LA , and (· , ·) is the inner product in L2 (R2 , dµ). (ii) If g ∈ L2 (R2 , dµ), there exists c > 0 such that e−LA g ≤ Cg2 , where · 2 is the norm in L2 (R2 , dµ).
(iii) Let χu = χ(|x| ≥ σ), the characteristic function. There exists C > 0 such that χu e−σLA g ≤ Ce−(q−1)σ/2 g.
(iv) Suppose R2 f dx = 0. Then, there exists δ = δ(q) > 0 and s0 < ∞ such that for s ≥ s0 , e−sL f ≤ e−δs f . (v) The quantities |λA − λA |, |1 − (ψA , ψA )| and the operator norm PA − PA in B are all bounded by C(M )|A − A |, where PA is the orthogonal projection in L2 (R2 , dµ) on ψA and PA ≤ C(M ), for 0 ≤ A, A ≤ M. Proof. The first part of (i) follows immediately from Lemma 5.2. The second part is proved in the proposition below. To show (ii), observe that 2 2 e−LA (x, y)e−y /8 ey /8 g(y) dy. (e−LA g)(x) = R2
Then, an application of Cauchy–Schwartz inequality together with the fact that
−LA 2 2 sup(1 + |x|)q e (x, y) e−y /4 dy < ∞ (5.7) R2
x
yields the desired result. The validity of (5.7) can be verified using e−sLA (x, y) ≤ e−sL0 (x, y),
by Feynmann–Kac formula
and the Mehler’s formula, e
−sL0
|x − e−s/2 y|2 −s −1 (x, y) = 4π(1 − e ) exp − . 4(1 − e−s )
Part (iii) is a direct consequence of a property of the “transformed” heat kernel e−sL0 (x, y), see [7]. Part (iv) is essentially the same as the second part of (i) except the first eigen2 value is zero and the corresponding eigenfunction is φ = e−x /4d . The condition 2 2 −1 dx). R2 f dx = 0 means (f, φ) = 0, where (· , ·) is the inner product in L (R , φ Part (v) can be proved by using ) = 0, (ψA , ψA
λA ≤ (4πd)−1
and (LA − λA )−1 is a bounded operator on the subspace {f ∈ B | (f, ψA ) = 0}, which follows from the second part of (i). For more details, see the appendix of [7]. and λA are derivatives of ψA and λA to A, respectively. Here, ψA Remark 5.6. Some of the results in Lemma 5.5 is a direct extension to 2D of results in [7] for the 1D case, we collect them here for easy reference in the later proof.
Proposition 5.7. The second part of (i) is true. Proof. Suppose s1 > 0. Let δ > 0 be small so that and µA − λA > 4δ,
q > 2 + 2λA + 4δ
where µA is the second eigenvalue of LA . Take f ∈ B, (ψA , f ) = 0 and f = 1. We proceed to show inductively that for sn = ns1 , χb = χ(|x| ≤ s1 ) and χu = 1 − χb , v(sn ) = e−sn LA f , we have the inequalities χu v(sn ) ≤ e−βn , χb v(sn )2 + xb v(sn ) ≤ e
(5.8)
s21 /6 −βn
e
,
(5.9)
where β = (λA + 2δ)s1 . The conclusion of Proposition 5.1 follows immediately from (5.8) and (5.9) if n is large (nδ > s1 /6) and s = sn . For s ∈ (sn , sn+1 ), apply the first part of (i) in Lemma 5.5 and use the fact that (n − 1)δ > s1 /6 + λA if n is large. First, we note that if f ∈ B and (ψA , f ) = 0, then (e−sLA f, ψA ) = 0
∀ s > 0.
If n = 0, The bounds in (5.8) and (5.9) hold by the obvious inequality 2
χb f 2 ≤ es1 /8 f
(5.10)
and our assumption that f = 1. If n = 1, (5.8) is true by (iii) of Lemma 5.5, since q > 2 + 2λA + 4δ. Write f = χb f + χ u f = f b + f u . Since (ψA , f ) = 0, −(q−2−2λA )
|(fb , ψA )| = |(fu , ψA )| ≤ C(A)s1
−(q−2−2λA ) −λA s1
e−s1 LA fb 2 ≤ C(A)s1
e
,
+ e−µA s1 fb 2 ≤
1 −β−δs1 s21 /8 e e , 4 (5.11)
if s1 is reasonably large. In the first inequality, we used the asymptotic behavior of ψA derived in Sec. 4. Using part (ii) of Lemma 5.5, we get 2
e−s1 LA fb ≤ Ce−(s1 −1)LA fb 2 ≤ Ce−(s1 −1)(λA +3δ) es1 /8 <
1 −β s21 /8 e e , 4
(5.12)
if s1 is a reasonably large constant. By combining first part of (i) in Lemma 5.5 and (5.10), we have 2
2
χb e−s1 LA fu 2 ≤ es1 /8 e−s1 LA fu ≤ Ce−δs1 es1 /8 ≤ It is clear that (5.11)–(5.13) gives (5.9).
1 −β s21 /6 e e . 4
(5.13)
Suppose (5.8) and (5.9) hold for n, we show they hold for n + 1. Write v = v(sn ) = vb + vu . Since (v, ψA ) = 0, |(vb , ψA )| = |(vu , ψA )| ≤ C(A)vu s−(q−2−2λA ) ≤ e−βn , e−s1 LA vb 2 ≤ e−βn e−λA s1 + e−µA s1 vb 2 ≤
1 −(β+δs1 )(n+1) s21 /6 e e . (5.14) 4
Again, by (ii) of Lemma 5.5, we get 2
e−s1 LA vb ≤ Ce−(s1 −1)LA fb 2 ≤ Ce−(s1 −1)(β+δ)(n+1) es1 /6 <
1 −β(n+1) s21 /6 e e . 4 (5.15)
By combining first part of (i) in Lemma 5.5 and (5.10), we have 2
2
χb e−s1 LA vu 2 ≤ es1 /8 e−s1 LA vu ≤ Ces1 /8 e−δs1 vu ≤
1 −β(n+1) s21 /6 e e . 4 (5.16)
It follows immediately that (5.14)–(5.16) implies (5.9) for n + 1. At last, (5.8) follows from 2
χu e−s1 LA χb g ≤ e−s1 /5 χb g2 , Lemma 5.5(iii) applied to g = vu and the bounds (5.8) and (5.9) for vb and vu . This completes the proof of proposition. The RG method, as applied to our situation, is an iterative scheme. Let L > 1. We start by defining the RG map R: (a1 , a2 ) → (a1 , a2 )
in B × B
as ai (x) = L2 ui (LX, L2 ),
i = 1, 2,
where ai = ui (x, 1), i = 1, 2 and (u1 , u2 ) solve (1.1), (1.2). It is clear that
, an−1 ). (an1 , an2 ) ≡ Rn (a1 , a2 ) = L2n u1 (Ln x, L2n ), L2n u2 (Ln x, L2n ) = R(an−1 1 2 Our ultimate goal is to prove that (u1 , u2 ) behave asymptotically as the solution of the limiting linear problem. Accordingly, decompose the RG map as
where n1 (x, t) = −
dτ 1
n2 (x, t) = −
t
t
a1 = L−2LA a1 + L2 n1 (Lx, L2 ),
(5.17)
a2
(5.18)
2
2
a2 + L n2 (Lx, L ),
y HA (t, τ, x, y) u1 u2 (y, τ ) − u1 Aτ −1 φ √ dy, τ R2
dτ 1
=L
−2L
R2
H(t − τ, x − y)u1 u2 (y, τ ) dy,
where HA is the fundamental solution of ut = u − Aφu and A is to be defined in what follows. To track the evolution, we write (5.19) a1 (x) = BψA + b1 , a2 (x) = Aφ(x) + b2 (x), where B = (a1 , ψA ), A = R2 a2 dx. With the normalization (ψA , ψA ) = 1 and φ dx = 1, we have R2 b2 dx = 0. (ψA , b1 ) = 0, R2
It is easy to verify that with same decomposition as in (5.19) for a1 , a2 , L2 n2 (Lx, L2 ) dx, B = (ψA , a1 ), A = A + R2
b1
= (1 − PA )a1 ,
b2 = L−2L b2 + L2 n2 (Lx, L2 ) + (A − A )φ.
Since the system (1.1)–(1.2) is invariant under the scaling transform uiK = Kui (Kx, K 2 t),
i = 1, 2
and Lemma 5.4, we can assume, for 1 ≤ s ≤ L2 , a1 < ,
a1 · a2 < ,
u1 (s) < ,
u1 (s) · u2 (s) < .
We proceed to derive estimates for A , b2 , B and b1 , in that order. Since, H(t − τ, x − y)(1 + |y|)−q dy ≤ cec(t−τ ) (1 + |x|)−q ,
(5.20)
R2
for some c > 0, L2 n2 (Lx, L2 ) ≤ C(L)a2 and |n2 | = sup n2 (· , t) ≤ C(L)a2 . t∈[1,L2 ]
(5.21) It follows that |A − A| ≤ C(L)a2 ,
(5.22)
and consequently, b2 ≤ L−2δ b2 + C(L)a2 by (iv) of Lemma 5.5. Now, consider n1 , and write −1
y √ τ
w(y, τ ) = u2 (y, τ ) − Aτ φ .
d(τ −1)∆ b2 (y) + n2 (y, τ ) . = e
By part (iv) of Lemma 5.5 and (5.21), |w| ≤ C(L)(a2 + A)(b2 + a2 ),
and, since A ≤ Ca2 and R2 HA (t, τ, x, y)(1 + |y|)−q dy ≤ C 1 + follows from Lemma 5.2, we obtain
|x| √ t−τ
−q
, which
Ln1 (Lx, L2 ) ≤ C(L)(b2 + a2 ),
(5.23)
using u1 ∞ ≤ Ca1 and a1 · a2 ≤ . Next, we consider B . Since from (5.17), B = (ψA , a1 ) = (ψA , L−2LA a1 ) + (ψA , L2 n1 (Lx, L2 )) = (ψA , ψA )BL−2λA + (ψA , (PA − PA )L−2LA b1 ) + (ψA , L2 n1 (Lx, L2 )), (PA b1 = 0), we have, by (v) of Lemma 5.5 and (5.23), |B − BL−2λA | ≤ C|A − A|L−2λA (B + b1 ) + C(L)(b2 + a2 ) ≤ C(L)2 + C(L)2 a2 + C(L)b2 . In the last inequality, we used (5.22), a1 < , a1 · a2 ≤ and |B| + b1 ≤ Ca1 . Finally, we estimate b1 . b1 = (1 − PA )a1 = BL−2λA (PA − PA )ψA + L−2LA b1 + L−2LA b1 (PA − PA ), again by PA b1 = 0. It follows immediately from Lemma 5.5(i), (v), (5.23) and (5.22), b1 ≤ L−2(λA +δ) b1 + C(L)((1 + a2 ) + b2 ). To summarize our estimates, we have the following: Lemma 5.8. Suppose L ≥ L0 = e2s0 , where s0 satisfies the conditions in Lemma 5.4 and Lemma 5.5. There exists 0 (L) and C(L) such that if a1 · a2 ≤ ≤ 0 (L), we have (a) (b) (c) (d)
|A − A| ≤ C(L)a2 , b2 ≤ L−2δ b2 + C(L)a2 , |B − L−2λA B| ≤ C(L)((1 + a2 ) + b2 ), b1 ≤ L−2(λA +δ) b1 + C(L)((1 + a2 ) + b2 ).
Proof of Theorem 1.1. In light of results established in Secs. 2 and 3, we only need to show (i). We write ani as in (5.19) with An , Bn and bni in place of A, B and bi , and derive bounds for An , Bn and bni using Lemmas 5.4 and 5.8.
It is clear that An → A∗ =
(a1 + a2 ) dx.
R2
First, by Lemma 5.4 and Lemma 5.8(a), |An+1 − An | ≤ C(L)an1 · an2 · an2 ≤ C(L)e−2nη , with 0 < η < λ∗A , and hence, |An − A∗ | ≤ C(L)e−2nη . Set nλn =
n−1
λAm .
m=0
Since λA is a continuous and increasing function of A and An is a bounded and increasing sequence, λAn → λ∗ = λA∗ ,
λn → λA∗ .
If we take η < min(δ, λA∗ ), we get, by Lemma 5.8(b) and Lemma 5.4, bn2 ≤ e−2nη a2
and bn1 ≤ e−2n(λA∗ +η) (1 + a2 ).
This, together with Lemma 5.8(c) and Proposition 2.4, gives |Bn+1 − Bn L−2λAn | ≤ C(L)L−2n(λA∗ +η) (1 + a2 ), with a smaller η, if necessary. This shows there exists B ∗ such that Bn L+2nλn → B ∗
as n → ∞.
Now, since λA is a differentiable function of A and λA ≤ 1/4πd, n−1 n−1 n−1 λAm − λA∗ ≤ |λAm − λA∗ | ≤ C(L)e−2nη , |nλn − nλA∗ | = m=0
m=0
m=0
and therefore, nλn − nλA∗ converges to a finite limit as n → ∞. Hence, Bn L2nλA∗ → B ∗∗
as n → ∞.
With t = L2n , L > L0 , the results are directly translated to √ tu2 ( t ·, t) − A∗ φ ≤ Ct−η a2 , √ t1+λA∗ u1 ( t ·, t) − B ∗∗ ψA∗ ≤ Ct−η (1 + a2 ), This proves Theorem 1.1.
(5.24) (5.25)
Remark 5.9. To see why Theorem 1.1 cannot be true if $q \le q(A)$, we look at the linear problem
$$u_s = -L_A u. \tag{5.26}$$
Set
$$w_L(y, s) = \begin{cases}
e^{-\lambda_A s}\psi_A(y), & |y| \le D, \\[4pt]
C e^{-\lambda_A s}(1 + |y|^2)^{-q/2}, & |y| \ge D,
\end{cases}$$
where C is the constant which makes wL continuous at |y| = D. It is easy to verify (see the proof of Lemma 5.3) that the function wL is a sub-solution of (5.26) provided D is large enough such that q(q + 2) and
|y|2 ≥ A2 φ2 (y) ∀ |y| ≥ D (1 + |y|2 )2
(y) ≤ C (1 + |y|2 )q/2 ψA
at|y| = D.
It is clear that for a solution to (5.26) with continuous, positive initial value $u_0(y)$ satisfying
$$\lim_{|y| \to \infty} u_0(y)(1 + |y|)^q = M > 0,$$
we can find a $\delta > 0$ such that $u_0(y) > \delta w_L(y, 0)$. In consequence,
$$u(y, s) > \delta w_L(y, s), \qquad s > 0.$$
This shows Theorem 1.1 cannot be true for the linear problem (5.26) since $w_L(y, s)$, though it has the desired time decay, has a much slower decay in the space variable $y$ than $\psi_A(y)$. In summary, although we have not given a direct proof that Theorem 1.1 is false when $q \le q(A)$, the above serves as strong evidence that it should not be true for the nonlinear problem if it is false for the limiting linear problem.

References

[1] N. Alikakos, $L^p$ bounds of solutions of the reaction-diffusion systems, Comm. Partial Differential Equations 4 (1979) 827–868.
[2] J. D. Avrin, Qualitative theory for a model of laminar flames with arbitrary nonnegative initial data, J. Differential Equations 84 (1990) 290–308.
[3] L. Berlyand and J. Xin, Large time asymptotics of solutions to a model combustion system with critical nonlinearity, Nonlinearity 8 (1995) 161–178.
[4] J. Billingham and D. J. Needham, The development of traveling waves in quadratic and cubic autocatalysis with unequal diffusion rates, I, Permanent form travelling waves, Philos. Trans. Roy. Soc. London Ser. A 334 (1991) 1–24.
[5] J. Billingham and D. J. Needham, The development of traveling waves in quadratic and cubic autocatalysis with unequal diffusion rates, II, An initial-value problem with an immobilized or nearly immobilized autocatalyst, Philos. Trans. Roy. Soc. London Ser. A 336 (1991) 497–539.
[6] J. Bricmont, A. Kupiainen and G. Lin, Renormalization group and asymptotics of solutions of nonlinear parabolic equations, Comm. Pure Appl. Math. 47 (1994) 893–922.
[7] J. Bricmont, A. Kupiainen and J. Xin, Global large time self-similarity of a thermal-diffusive combustion system with critical nonlinearity, J. Differential Equations 130 (1996) 9–35.
[8] J. Glimm and A. Jaffe, Quantum Physics (Springer, Berlin, 1981).
[9] N. Goldenfeld, O. Martin, Y. Oono and F. Liu, Anomalous dimensions and the renormalization group in a nonlinear diffusion process, Phys. Rev. Lett. 64 (1990) 1361–1364.
[10] S. Hollis, R. Martin and M. Pierre, Global existence and boundedness in reaction-diffusion systems, SIAM J. Math. Anal. 18 (1987) 744–761.
[11] T. Kato, Strong $L^p$-solution of the Navier–Stokes equation in $\mathbb{R}^m$, with application to weak solutions, Math. Z. 187 (1984) 471–480.
[12] Y. Li and Y. W. Qi, The global dynamics of isothermal chemical systems with critical nonlinearity, Nonlinearity (2003) 1057–1074.
[13] R. Martin and M. Pierre, Nonlinear reaction-diffusion systems, in Nonlinear Equations in the Applied Sciences, eds. W. F. Ames and C. Rogers (Academic Press, Boston, 1992).
[14] K. Masuda, On the global existence and asymptotic behaviour of solutions of reaction-diffusion equations, Hokkaido Math. J. 12 (1983) 360–370.
[15] B. J. Matkowsky and G. I. Sivashinsky, An asymptotic derivation of two models in flame theory associated with the constant density approximation, SIAM J. Appl. Math. 37 (1979) 686–699.
[16] M. J. Metcalf, J. H. Merkin and S. K. Scott, Oscillating wave fronts in isothermal chemical systems with arbitrary powers of autocatalysis, Proc. Roy. Soc. London Ser. A 447 (1994) 155–174.
[17] Y. Nishiura, Kunimochi Sakamoto and Niky Kamran, Far-from-Equilibrium Dynamics (American Mathematical Society, Providence, RI, 2002).
[18] Y. W. Qi, The global self-similarity of a chemical reaction system with critical nonlinearity, preprint (2004).
[19] M. E. Schonbek, $L^2$ decay for weak solutions of the Navier–Stokes equation, Arch. Rational Mech. Anal. 88 (1985) 209–222.
[20] G. I. Sivashinsky, Instability, pattern formation and turbulence in flames, Ann. Rev. Fluid Mech. 15 (1983) 179–199.
June 5, 2006 10:44 WSPC/148-RMP
J070-00265
Reviews in Mathematical Physics Vol. 18, No. 3 (2006) 311–328 c World Scientific Publishing Company
THE BIEDENHARN APPROACH TO RELATIVISTIC COULOMB-TYPE PROBLEMS
P. A. HORVÁTHY
Laboratoire de Mathématiques et de Physique Théorique, Université de Tours, Parc de Grandmont, F-37200 Tours, France
[email protected]

Received 26 January 2006

The approach developed by Biedenharn in the 1960s for the relativistic Coulomb problem is reviewed and applied to various physical situations.

Keywords: Relativistic Coulomb problem; the Biedenharn approach.

Mathematics Subject Classification 2000: 81R05
1. Introduction

In a paper anticipating supersymmetric quantum mechanics [1], Biedenharn proposed a new approach to the Dirac–Coulomb problem. His idea was to iterate the Dirac equation. The resulting quadratic equation, written in a non-relativistic Coulomb form, is readily solved using the "Biedenharn–Temple" operator $\Gamma$, analogous to the angular momentum operator (but with fractional eigenvalues). Then, the solutions of the first-order equation can be recovered from those of the second-order equation by projection. In this review, we apply the approach of Biedenharn to various physical problems.

2. The Dirac Approach

Let us first summarize the original approach of Dirac in his classic book [2]. He starts with the first-order Hamiltonian
$$H = -eA_0 + \rho_1\,\vec\sigma \cdot \vec p + \rho_3 m, \tag{2.1}$$
where the "Dirac" matrices can be chosen as
$$\rho_1 = \begin{pmatrix} & 1_2 \\ 1_2 & \end{pmatrix}, \qquad \rho_2 = \begin{pmatrix} & -i1_2 \\ i1_2 & \end{pmatrix}, \qquad \rho_3 = \begin{pmatrix} 1_2 & \\ & -1_2 \end{pmatrix}, \tag{2.2}$$
where $1_2$ is the $2 \times 2$ unit matrix.
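For a spherically symmetric potential, $A_0 = A_0(r)$, Dirac proposes the following solution. First, he proves the vector identity
\[ (\vec\sigma\cdot\vec u)(\vec\sigma\cdot\vec v) = \vec u\cdot\vec v + i\,\vec\sigma\cdot(\vec u\times\vec v). \tag{2.3} \]
Then, applying it to the orbital angular momentum and the momentum,
\[ \vec u = \vec\ell \equiv \vec x\times\vec p \quad\text{and}\quad \vec v = \vec p, \]
respectively, and interchanging $\vec u$ and $\vec v$, he deduces that the two-component operator $z \equiv \vec\sigma\cdot\vec\ell + 1$ anticommutes with $\vec\sigma\cdot\vec p$,
\[ \{z,\ \vec\sigma\cdot\vec p\} = 0. \tag{2.4} \]
Therefore, the operator
\[ K = \rho_3 Z \quad\text{where}\quad Z = \vec\Sigma\cdot\vec\ell + 1, \qquad \vec\Sigma = \begin{pmatrix} \vec\sigma & 0 \\ 0 & \vec\sigma \end{pmatrix}, \tag{2.5} \]
commutes with all three terms in the Hamiltonian (2.1), and is hence a constant of the motion. Next, applying (2.3) to $\vec u = \vec v = \vec\ell$ allows him to infer, using the identity
\[ \vec\ell\times\vec\ell = i\vec\ell, \tag{2.6} \]
that
\[ Z^2 = \left(\vec\sigma\cdot\vec\ell + 1\right)^2 = \vec J^{\,2} + \frac14, \quad\text{where}\quad \vec J \equiv \vec\ell + \frac12\vec\Sigma. \tag{2.7} \]
$\vec J$ is here the total angular momentum operator. The eigenvalues of $K$ are therefore half-integers,
\[ \kappa = \pm\left(j + \frac12\right). \tag{2.8} \]
Further application of the identity (2.3) with $\vec u = \vec x$ and $\vec v = \vec p$ shows that
\[ (\vec\sigma\cdot\vec x)(\vec\sigma\cdot\vec p) = r p_r + i(z - 1), \tag{2.9} \]
where $p_r = -i\partial_r$. Note that $[p_r, K] = 0$. At this stage, Dirac introduces a second operator, namely
\[ \omega = \rho_1 W, \qquad W = \begin{pmatrix} w & 0 \\ 0 & w \end{pmatrix}, \qquad w = \frac{\vec\sigma\cdot\vec x}{r}, \tag{2.10} \]
which satisfies the relations
\[ \omega^2 = W^2 = w^2 = 1, \qquad [\omega, K] = 0. \tag{2.11} \]
Finally, Dirac rewrites the Hamiltonian (2.1) in the form
\[ H = -eA_0 + \omega\left(p_r + i\,\frac{Z - 1}{r}\right) + \rho_3\, m. \tag{2.12} \]
In the Coulomb case, $eA_0 = \alpha/r$, and the radial form (2.12) allows one to find the spectrum (3.15) of the relativistic hydrogen atom [2].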
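The identity (2.3) is easy to spot-check numerically. The following is a minimal sketch, assuming ordinary numeric 3-vectors whose components commute; for operator-valued vectors such as the angular momentum the ordering of factors matters, as in the text. The helper names are illustrative and do not come from the paper.

```python
import random

# Pauli matrices as nested tuples of (complex) numbers.
PAULI = (
    ((0, 1), (1, 0)),
    ((0, -1j), (1j, 0)),
    ((1, 0), (0, -1)),
)


def sigma_dot(v):
    """Return the 2x2 matrix sigma . v for a numeric 3-vector v."""
    return [[sum(v[n] * PAULI[n][a][b] for n in range(3)) for b in range(2)]
            for a in range(2)]


def matmul(m1, m2):
    """Product of two 2x2 matrices."""
    return [[sum(m1[a][c] * m2[c][b] for c in range(2)) for b in range(2)]
            for a in range(2)]


def cross(u, v):
    """Cross product of two 3-vectors."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def identity_residual(u, v):
    """Largest entry of (sigma.u)(sigma.v) - [(u.v) 1 + i sigma.(u x v)]."""
    lhs = matmul(sigma_dot(u), sigma_dot(v))
    dot = sum(a * b for a, b in zip(u, v))
    rot = sigma_dot(cross(u, v))
    return max(abs(lhs[a][b] - (dot * (a == b) + 1j * rot[a][b]))
               for a in range(2) for b in range(2))


rng = random.Random(0)
u = [rng.uniform(-1, 1) for _ in range(3)]
v = [rng.uniform(-1, 1) for _ in range(3)]
print(identity_residual(u, v))  # expected to be at rounding-error level
```

A residual at the level of floating-point rounding for random inputs is consistent with (2.3); it is a sanity check, not a proof.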
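3. The Biedenharn Approach to the Dirac–Coulomb Problem

Biedenharn [1] proposes instead to introduce the projection operators
\[ O_\pm = i\rho_2\,\vec\sigma\cdot\vec p \pm m - \rho_3\left(E + \frac{\alpha}{r}\right), \tag{3.1} \]
so that $H - E = \rho_3 O_+$, and observes that
\[ (H - E)\psi = 0 \;\Rightarrow\; O_-O_+\psi = 0, \qquad O_+\phi = O_+O_-\psi = O_-O_+\psi = 0, \]
since the $O_\pm$ commute, so that the solutions of the first-order equation $O_+\phi = 0$ can be obtained from those of the iterated equation by projecting,
\[ \phi = O_-\psi. \tag{3.2} \]
He then defines the "Biedenharn (Temple) operator"
\[ \Gamma = -\left(Z + i\alpha\omega\right) \equiv -\begin{pmatrix} z & i\alpha w \\ i\alpha w & z \end{pmatrix}. \tag{3.3} \]
$\Gamma$ is conserved for the iterated, but not for the first-order equation, and allows us to rewrite $O_-O_+\psi = 0$ in a form reminiscent of the non-relativistic Coulomb problem,
\[ \left[ -\left(\partial_r + \frac1r\right)^2 + \frac{\Gamma(\Gamma+1)}{r^2} + \frac{2\alpha E}{r} + m^2 - E^2 \right]\psi = 0. \tag{3.4} \]
The operator $\Gamma$ plays here the role of the angular momentum. However,
\[ \Gamma^2 = K^2 - \alpha^2 = \vec J^{\,2} + \frac14 - \alpha^2, \tag{3.5} \]
so that the eigenvalues of $\Gamma$ are
\[ \gamma = \pm\sqrt{\kappa^2 - \alpha^2} = \pm\sqrt{(j + 1/2)^2 - \alpha^2}, \qquad \operatorname{sign}\gamma = \operatorname{sign}\kappa. \tag{3.6} \]
For a $\Gamma$-eigenfunction,
\[ \Gamma(\Gamma+1) = \ell(\gamma)\big(\ell(\gamma)+1\big) \quad\text{with}\quad \ell(\gamma) = |\gamma| + \frac12\left[\operatorname{sign}(\gamma) - 1\right], \tag{3.7} \]
i.e. the "angular momentum" $\ell(\gamma)$ is irrational. The operator $\Gamma$ is hermitian as long as $\alpha \le 1$, i.e. for nuclei with less than 137 protons.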
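The relation between $\gamma$ and the effective angular momentum in (3.6) and (3.7) can be illustrated with a few lines of code. This is only a sketch: the function names are hypothetical, and the coupling value used in the demonstration is an assumed illustrative number (hydrogen, $Z = 1$), not one quoted in the text.

```python
import math


def gamma_eigenvalue(j, alpha, sign=+1):
    """Eigenvalue of Gamma from Eq. (3.6); `sign` is the sign of kappa."""
    radicand = (j + 0.5) ** 2 - alpha ** 2
    if radicand < 0:
        raise ValueError("gamma is imaginary: Gamma is not hermitian here")
    return sign * math.sqrt(radicand)


def effective_ell(gamma):
    """Effective 'angular momentum' ell(gamma) of Eq. (3.7)."""
    sign = 1.0 if gamma > 0 else -1.0
    return abs(gamma) + 0.5 * (sign - 1.0)


ALPHA = 1 / 137.036  # assumed illustrative coupling (hydrogen, Z = 1)
for sign in (+1, -1):
    g = gamma_eigenvalue(0.5, ALPHA, sign)
    ell = effective_ell(g)
    # The last column should vanish up to rounding: gamma(gamma+1) = ell(ell+1).
    print(sign, g, ell, g * (g + 1) - ell * (ell + 1))
```

For $\gamma > 0$ the sketch returns $\ell = \gamma$, and for $\gamma < 0$ it returns $\ell = |\gamma| - 1$; both choices reproduce the same centrifugal coefficient $\gamma(\gamma+1)$.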
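To get explicit formulæ, remember [3] that the angular spinors
\[ \chi^\mu_\pm = \sqrt{\frac{|\kappa| + \frac12 \pm \mu}{2|\kappa| + 1}}\; Y^{\mu-1/2}_{j\pm1/2} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \mp \sqrt{\frac{|\kappa| + \frac12 \mp \mu}{2|\kappa| + 1}}\; Y^{\mu+1/2}_{j\pm1/2} \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{3.8} \]
where the $\pm$ refers to the sign of $\kappa$, and the $Y$'s are the spherical harmonics, are not only eigenfunctions of $\vec J^{\,2}$ and of $J_3$ with eigenvalues $j(j+1)$, $j = \frac12, \frac32, \ldots$ and $\mu = -j, \ldots, j$, respectively, but also satisfy the crucial relations
\[ z\chi^\mu_\pm = \pm|\kappa|\chi^\mu_\pm \quad\text{and}\quad w\chi^\mu_\pm = \chi^\mu_\mp. \tag{3.9} \]
Put
\[ \Xi_+ = \begin{pmatrix} \chi^\mu_+ \\ 0 \end{pmatrix}, \qquad \Xi_- = \begin{pmatrix} 0 \\ \chi^\mu_- \end{pmatrix}, \qquad \Upsilon_+ = \begin{pmatrix} 0 \\ \chi^\mu_+ \end{pmatrix}, \qquad \Upsilon_- = \begin{pmatrix} \chi^\mu_- \\ 0 \end{pmatrix}. \tag{3.10} \]
Then
\[ \begin{aligned} \Phi_+ &= -i\alpha\,\Xi_+ + (|\kappa| - |\gamma|)\,\Xi_-, & \Phi_- &= (|\kappa| - |\gamma|)\,\Xi_+ + i\alpha\,\Xi_-, \\ \varphi_+ &= -i\alpha\,\Upsilon_+ + (|\kappa| - |\gamma|)\,\Upsilon_-, & \varphi_- &= (|\kappa| - |\gamma|)\,\Upsilon_+ + i\alpha\,\Upsilon_- \end{aligned} \tag{3.11} \]
are eigenfunctions of $\Gamma$ with eigenvalues $\pm|\gamma|$,
\[ \Gamma\Phi_\pm = \pm|\gamma|\Phi_\pm, \qquad \Gamma\varphi_\pm = \pm|\gamma|\varphi_\pm. \tag{3.12} \]
Then, setting $\psi_\pm = u_\pm\Phi_\pm$, the iterated equation indeed takes a non-relativistic Coulomb form with irrational angular momentum $\ell(\gamma)$,
\[ \left[ -\left(\partial_r + \frac1r\right)^2 + \frac{\ell(\gamma)\big(\ell(\gamma)+1\big)}{r^2} + \frac{2\alpha E}{r} + m^2 - E^2 \right] u_\pm = 0, \tag{3.13} \]
whose solutions are the well-known Coulomb eigenfunctions
\[ u_\pm(r) \propto r^{\ell(\gamma)}\, e^{ikr}\, F\!\left(\ell(\gamma) + 1 - i\,\frac{\alpha E}{k},\; 2\ell(\gamma) + 2,\; -2ikr\right), \tag{3.14} \]
where $k = \sqrt{E^2 - m^2}$ and $F$ denotes the confluent hypergeometric function. The energy levels are obtained from the poles of $F$,
\[ \ell(\gamma) + 1 - i\alpha E/k = -n, \qquad n = 0, 1, 2, \ldots, \]
yielding the familiar spectrum shown in Fig. 1,
\[ E_p = m\sqrt{1 - \frac{\alpha^2}{p^2 + \alpha^2}}, \qquad p = \ell(\gamma) + 1 + n = |\gamma| + \frac12\operatorname{sign}\gamma + \frac12 + n, \qquad n = 0, 1, \ldots. \tag{3.15} \]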
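The level formula (3.15) can be evaluated directly. The sketch below is an illustration under stated assumptions: the function names are hypothetical, the mass is set to $m = 1$ by default, and the coupling value is an assumed illustrative number. It compares the levels reached from the two signs of $\gamma$ within one $j$-sector.

```python
import math


def principal_number(j, alpha, sign, n):
    """p = |gamma| + (sign + 1)/2 + n, as in Eq. (3.15)."""
    gamma_abs = math.sqrt((j + 0.5) ** 2 - alpha ** 2)
    return gamma_abs + 0.5 * (sign + 1) + n


def energy(j, alpha, sign, n, m=1.0):
    """Bound-state energy E_p = m * sqrt(1 - alpha^2 / (p^2 + alpha^2))."""
    p = principal_number(j, alpha, sign, n)
    return m * math.sqrt(1.0 - alpha ** 2 / (p ** 2 + alpha ** 2))


ALPHA = 1 / 137.036  # assumed illustrative coupling, not quoted in the text
# Same j: the (gamma > 0, n = 0) level coincides with the (gamma < 0, n = 1) level.
print(energy(0.5, ALPHA, +1, 0), energy(0.5, ALPHA, -1, 1))
# The (gamma < 0, n = 0) level has no partner in the gamma > 0 tower.
print(energy(0.5, ALPHA, -1, 0))
```

Because $p$ depends on $j$ only through $|\gamma|$, the same routine also shows the small shift between neighbouring $j$-sectors at equal integer part of $p$.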
Since $\gamma$ and thus $\ell$ are irrational, $\ell + n = \ell' + n'$ is only possible for $\gamma' = \pm\gamma$, so different $j$-sectors yield different $E$-values. For each fixed $j$, the same energy is obtained in the $\gamma > 0$ sector for $n - 1$ as in the $\gamma < 0$ sector for $n$. These energy levels are hence doubly degenerate. In the $\gamma < 0$ sector the $n = 0$ state is unpaired: each $j$ sector admits a ground state
\[ u^j_0 \propto r^{|\gamma| - 1}\, e^{-\alpha m r/(j + 1/2)} \quad\text{with energy}\quad E^j_0 = m\sqrt{1 - \frac{\alpha^2}{(j + 1/2)^2}}. \tag{3.16} \]
Observe that Eq. (3.16) is consistent with (3.14) due to $F(a, a, z) = e^z$.

Fig. 1. The spectrum of a Dirac electron in the field of an H-atom. The ± signs refer to the sign of γ. In different j-sectors the energy levels are shifted by the fine structure.

4. Charged Dirac Particle in a Monopole Field

A Dirac particle in the field of a Dirac monopole,
\[ \vec B = -g\,\frac{\vec r}{r^3}, \tag{4.1} \]
can be treated along the same lines [4]. The Hamiltonian is now
\[ H = -\frac{\alpha}{r} + \rho_1\,\vec\sigma\cdot\vec\pi + \rho_3\, m, \qquad \vec\pi = \vec p - e\vec A, \tag{4.2} \]
where $\vec A$ is the vector potential of a Dirac monopole, $\vec\nabla\times\vec A = \vec B$. Introducing again the projection operators
\[ O_\pm = i\rho_2\,\vec\sigma\cdot\vec\pi \pm m - \rho_3\left(E + \frac{\alpha}{r}\right), \tag{4.3} \]
the solutions of the first-order equation can again be obtained from those of the iterated equation by projecting, cf. Eq. (3.2). Dirac's operator,
\[ K = -\rho_3\left(\vec\Sigma\cdot\vec\ell + 1\right), \tag{4.4} \]
is formally the same as in (2.5), except for the replacements
\[ \vec p \to \vec\pi \;\Rightarrow\; \vec\ell = \vec r\times\vec\pi. \tag{4.5} \]
Note that $\vec\ell$ is now only part of the orbital angular momentum,
\[ \vec L = \vec\ell - q\,\frac{\vec r}{r}, \]
where $q = eg$. The novelty is that, unlike in (2.8), the eigenvalues of $K$ are now irrational,
\[ \kappa = \sqrt{\left(j + \frac12\right)^2 - q^2}. \tag{4.6} \]
The iterated equation again formally reads as (3.4),
\[ \left[ -\left(\partial_r + \frac1r\right)^2 + \frac{\Gamma(\Gamma+1)}{r^2} + \frac{2\alpha E}{r} + m^2 - E^2 \right]\psi = 0, \]
with the Biedenharn operator $\Gamma = -(Z + i\alpha\omega)$, cf. (3.3). The square of $\Gamma$ is now
\[ \Gamma^2 = K^2 - \alpha^2 = \vec J^{\,2} + \frac14 - q^2 - \alpha^2, \tag{4.7} \]
where
\[ \vec J = \vec L + \frac12\vec\Sigma = \vec\ell - q\,\frac{\vec r}{r} + \frac12\vec\Sigma \tag{4.8} \]
is the total angular momentum. The eigenvalues of $\Gamma$ are, therefore, "even more irrational", since the monopole-charge term $q^2$ and the Coulomb-charge term $\alpha^2$ are both subtracted:
\[ \gamma = \pm\sqrt{\kappa^2 - \alpha^2} = \pm\sqrt{(j + 1/2)^2 - q^2 - \alpha^2}, \qquad \operatorname{sign}\gamma = \operatorname{sign}\kappa. \tag{4.9} \]
Observe that this now yields an imaginary $\gamma$ for the lowest angular momentum sector $j = q - 1/2$ for any positive $\alpha$, and the situation worsens when $\alpha$ is increased. These cases should be discarded. Let us assume that $\alpha$ is small, typically a few times $1/137$, so that $\gamma$ is real except for the lowest angular momentum sector.

Assuming $j \ge q + 1/2$, consider those angular 2-spinors $\chi_\pm$ in (3.8), i.e.
\[ \chi^\mu_\pm = \sqrt{\frac{|\kappa| + \frac12 \pm \mu}{2|\kappa| + 1}}\; Y^{\mu-1/2}_{j\pm1/2} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \mp \sqrt{\frac{|\kappa| + \frac12 \mp \mu}{2|\kappa| + 1}}\; Y^{\mu+1/2}_{j\pm1/2} \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{4.10} \]
but with the $Y$'s being now replaced by the "Wu–Yang" monopole harmonics [5]. These spinors are eigenfunctions of $\vec J^{\,2}$ and $J_3$, labelled by $j = q - \frac12, q + \frac12, \ldots$ and $-j \le \mu \le j$, respectively. Then, the $\Phi_\pm$ and $\varphi_\pm$ in Eq. (3.11) are eigenfunctions of $\Gamma$ with eigenvalues $\pm|\gamma|$, cf. (3.12). For the two signs,
\[ \Gamma(\Gamma+1) = \ell(\gamma)\big(\ell(\gamma)+1\big) \quad\text{with}\quad \ell(\gamma) = |\gamma| + \frac12\left[\operatorname{sign}(\gamma) - 1\right], \tag{4.11} \]
cf. (3.7). Setting $\psi_\pm = u_\pm\Phi_\pm$ (and $\psi_\pm = u_\pm\varphi_\pm$, respectively), the iterated Dirac equation $O_-O_+\psi = 0$ reduces to the non-relativistic Coulomb form (3.13) with solutions as in (3.14) and energy levels (3.15). The only difference is in the value of $\gamma$. The ground states of the $j = \text{const.}$ sectors are
\[ u^j_0 \propto r^{|\gamma| - 1}\, e^{-\alpha m r/\sqrt{\gamma^2 + \alpha^2}} \quad\text{with energy}\quad E^{(j)}_0 = m\sqrt{1 - \frac{\alpha^2}{\gamma^2 + \alpha^2}}. \tag{4.12} \]
The spectrum is shown in Fig. 2. For $q = 0$ (no monopole), we plainly recover Biedenharn's results in [1] on the Dirac–Coulomb problem.

Fig. 2. The bound-state spectrum of a Dirac particle in a charged monopole field. The ± refers to the sign of the Biedenharn operator Γ. In each j = const. sector, the energy levels are doubly degenerate except for the lowest-energy ground-state, which occurs in the γ < 0 sector. Different j-sectors are shifted by a modified fine structure. For j = 0, there are no γ > 0 states, and Γ is not hermitian. This critical case j = 0, γ < 0 is not discussed here.

For $\alpha = 0$ (no Coulomb potential), one has a pure Dirac monopole [6]. The Biedenharn operator $\Gamma$ reduces to $-Z$. No further diagonalization in $\rho$-space is thus necessary. Since $[Z, \rho_3] = 0$, $\rho_3$ is now conserved for the iterated equation (but not for the first-order equation). The iterated equation therefore splits into two (identical) Pauli equations, and we can work with 2-spinors. For $j \ge q + 1/2$, the angular eigenfunctions of $\Gamma = -Z$ are those $\Xi$'s in (3.10). [Footnote a]

Footnote a: The $\Xi_\pm$'s are proportional to those $\xi^i$'s ($i = 1, 2$) in Eqs. (11) and (19) of Kazama, Yang and Goldhaber [8]. Their $\phi^i$'s are just our $\varphi_\mp$'s in (3.11).
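The statement that $\gamma$ in (4.9) is imaginary only in the lowest sector $j = q - 1/2$ when $\alpha$ is small can be checked with a short script. This is a sketch with hypothetical function names; the values of $q$ and $\alpha$ in the demonstration are assumed for illustration.

```python
def gamma_squared(j, q, alpha):
    """Eigenvalue of Gamma^2, (j + 1/2)^2 - q^2 - alpha^2, cf. Eqs. (4.7), (4.9)."""
    return (j + 0.5) ** 2 - q ** 2 - alpha ** 2


def real_gamma_sectors(q, alpha, count=4):
    """List (j, gamma^2, is_real) for the first `count` allowed j values."""
    sectors = []
    for step in range(count):
        j = q - 0.5 + step  # j = q - 1/2, q + 1/2, ...
        g2 = gamma_squared(j, q, alpha)
        sectors.append((j, g2, g2 >= 0))
    return sectors


# Illustrative values: q = 1 and a coupling of a few times 1/137.
for j, g2, is_real in real_gamma_sectors(q=1.0, alpha=3 / 137.0):
    print(j, g2, "real" if is_real else "imaginary")
```

In the lowest sector the script returns $\gamma^2 = -\alpha^2$, so any positive $\alpha$ makes $\gamma$ imaginary there, while the next sector gives $2q + 1 - \alpha^2$, which stays positive for small $\alpha$.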
For $j \ge q + 1/2$, there are no bound states. The hypergeometric function reduces to a Bessel function and the radial eigenfunction becomes
\[ u_\pm \propto \frac{1}{\sqrt{kr}}\, J_{|\kappa| \pm 1/2}(kr), \tag{4.13} \]
the same as [11, Eq. (37)].

The $j = q - 1/2$ case should not be discarded: the eigenvalue of $Z \equiv \Gamma$ only vanishes, rather than becoming imaginary. The problem nevertheless requires special treatment. The Dirac Hamiltonian is indeed not self-adjoint [7] but admits a 1-parameter family of self-adjoint extensions, corresponding to different boundary conditions at $r = 0$. These yield different physics. The one constructed by Callias [7] has further significance for the theta-angle in QCD. Kazama et al. [8, 9] suggest curing the non-self-adjointness problem by adding an infinitesimal extra magnetic moment. For further discussion and details, the reader is invited to consult the literature [7–10].

5. Dyons

Let us consider a massless Dirac particle in the long-distance field of a (self-dual) Bogomolny–Prasad–Sommerfield monopole [16, 17],
\[ e\vec B = -q\,\frac{\vec r}{r^3} \quad\text{and}\quad \Phi = q\left(1 - \frac1r\right). \tag{5.1} \]
Identifying $\Phi$ with the fourth component of a gauge field, we get a static, self-dual Abelian gauge field in four euclidean dimensions,
\[ \vec A = q\vec A_D, \qquad A_4 = q\left(1 - \frac1r\right), \tag{5.2} \]
where $\vec A_D$ denotes the vector potential of a Dirac monopole of unit strength. The associated Dirac Hamiltonian is therefore [13, 17]
\[ D = \rho_1\,(\vec\sigma\cdot\vec\pi) - \rho_2\,\Phi = \begin{pmatrix} 0 & Q^\dagger \\ Q & 0 \end{pmatrix} = \begin{pmatrix} 0 & \vec\sigma\cdot\vec\pi - i\Phi \\ \vec\sigma\cdot\vec\pi + i\Phi & 0 \end{pmatrix}. \tag{5.3} \]
In contrast to the Coulomb case, the scalar term $\rho_2\Phi$ is now off-diagonal, because it comes from the fourth, euclidean, direction, rather than from the time coordinate. The total angular momentum $\vec J$ in Eq. (4.8) is conserved. Using the notations and formulæ introduced for the charged monopole, we observe that the counterpart of Dirac's operator (4.4),
\[ K = -\rho_2 Z = \begin{pmatrix} 0 & iz \\ -iz & 0 \end{pmatrix}, \tag{5.4} \]
commutes with $D$ and
\[ K^2 = z^2 = \vec J^{\,2} + \frac14 - q^2, \tag{5.5} \]
so that $z$ (and hence, $Z$ and $K$) have irrational eigenvalues,
\[ \kappa = \sqrt{\left(j + \frac12\right)^2 - q^2}, \tag{5.6} \]
cf. Eq. (4.6). Since $j \ge q - 1/2$, $K$ is hermitian, but for $j = q - 1/2$, its eigenvalue $\kappa$ vanishes and thus, $K$ is not invertible.

The Dirac operator (5.3) is, as in any even dimensional space, chiral-supersymmetric: $\{Q, Q^\dagger\}$ is a SUSY Hamiltonian and the SUSY sectors are the $\pm1$ eigenspaces of the chirality operator $\rho_3$. The supercharges $Q$ and $Q^\dagger$ can be written as
\[ Q = -iw\left(\partial_r + \frac1r - \frac{z + qw}{r} + qw\right) = -i\left(\partial_r + \frac1r + \frac{z - qw}{r} + qw\right)w, \tag{5.7} \]
\[ Q^\dagger = iw\left(-\left(\partial_r + \frac1r\right) + \frac{z - qw}{r} + qw\right) = i\left(-\left(\partial_r + \frac1r\right) - \frac{z + qw}{r} + qw\right)w. \tag{5.8} \]
The square of (5.3) is
\[ D^2 = \begin{pmatrix} H_1 & 0 \\ 0 & H_0 \end{pmatrix} = \begin{pmatrix} Q^\dagger Q & 0 \\ 0 & QQ^\dagger \end{pmatrix}, \tag{5.9} \]
where
\[ H_0 = \left[\vec\pi^{\,2} + q^2\left(1 - \frac1r\right)^2\right] 1_2 \quad\text{and}\quad H_1 = H_0 - 2q\,\frac{\vec\sigma\cdot\vec r}{r^3}. \tag{5.10} \]
In the "lower" (i.e. $\rho_3 = -1$) sector, the gyromagnetic ratio is $g = 0$, and $H_0$ can be viewed as describing two uncoupled spin 0 particles in the combined field of a Dirac monopole, of a Coulomb potential and of an inverse-square potential. This system was solved many years ago; it has a Coulomb-type spectrum, whose degeneracy is explained by its "accidental" $o(4)$ symmetry [15]. In the "upper" (i.e. $\rho_3 = 1$) sector $g = 4$; $H_1$ is the Hamiltonian of D'Hoker and Vinet in [14].

In terms of $Z$ and $w$, $D^2$ is also
\[ D^2 = -\left(\partial_r + \frac1r\right)^2 + \frac{Z^2 + q^2}{r^2} - \frac{1}{r^2}\begin{pmatrix} z + qw & 0 \\ 0 & z - qw \end{pmatrix} - \frac{2q^2}{r} + q^2. \tag{5.11} \]
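The Biedenharn operator,
\[ \Gamma = -\left(Z + q\rho_3 W\right) \equiv -\begin{pmatrix} z + qw & 0 \\ 0 & z - qw \end{pmatrix}, \tag{5.12} \]
does not commute with $D$, but it commutes with $D^2$; it is thus conserved for the quadratic dynamics $H_0$ and $H_1$ (but not for the Dirac Hamiltonian $D$). In terms of $\Gamma$, $D^2$ becomes
\[ D^2 = -\left(\partial_r + \frac1r\right)^2 + \frac{\Gamma(\Gamma+1)}{r^2} - \frac{2q^2}{r} + q^2. \tag{5.13} \]
Now,
\[ \Gamma^2 = z^2 + q^2 = \vec J^{\,2} + \frac14, \tag{5.14} \]
because, unlike in (4.7), the $q^2$ comes with a positive sign. The eigenvalues of $\Gamma$ are, therefore, (half-)integers,
\[ \gamma = \pm\left(j + \frac12\right), \qquad \operatorname{sign}\gamma = \operatorname{sign}\kappa. \tag{5.15} \]
Hence, for a $\Gamma$-eigenfunction,
\[ \Gamma(\Gamma+1) = L(\gamma)\big(L(\gamma)+1\big) \quad\text{where}\quad L(\gamma) = j \pm \frac12. \tag{5.16} \]
(The sign is plus or minus depending on the sign of $\gamma$.) $L(\gamma)$ is now a (half-)integer. Using the notations $x = z - qw$ and $y = z + qw$, the supercharges are written as
\[ Q = -iw\left(\partial_r + \frac1r - \frac{y}{r} + qw\right) = -i\left(\partial_r + \frac1r + \frac{x}{r} + qw\right)w, \tag{5.17} \]
\[ Q^\dagger = iw\left(-\left(\partial_r + \frac1r\right) + \frac{x}{r} + qw\right) = i\left(-\left(\partial_r + \frac1r\right) - \frac{y}{r} + qw\right)w. \tag{5.18} \]
Note that one can also write
\[ \Gamma = -\begin{pmatrix} \vec\sigma\cdot\vec L + 1 + 2qw & 0 \\ 0 & \vec\sigma\cdot\vec L + 1 \end{pmatrix} = \begin{pmatrix} -y & 0 \\ 0 & -x \end{pmatrix}, \tag{5.19} \]
where $x$ and $y$ are self-adjoint, $x = x^\dagger$, $y = y^\dagger$, $w = w^\dagger$.

To find an explicit solution, we construct, cf. (4.10), angular 2-spinors $\varphi^\mu_\pm$ and $\Phi^\mu_\pm$, which are both eigenfunctions of $\vec J^{\,2}$ and $J_3$ with eigenvalues $j(j+1)$ and $\mu$, and which diagonalize the operators $x$ and $y$:
\[ x\varphi^\mu_\pm = \mp|\gamma|\varphi^\mu_\pm \quad\text{and}\quad y\Phi^\mu_\pm = \mp|\gamma|\Phi^\mu_\pm. \tag{5.20} \]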
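In the "lower" sector, the coefficient of the $r^{-2}$ term here is the square of the orbital angular momentum,
\[ x(x - 1) = \vec L^{\,2} = L(\gamma)\big(L(\gamma)+1\big), \tag{5.21} \]
so that $L(\gamma)$ is just the orbital angular quantum number. Due to the addition theorem of the angular momentum, if $j \ge q + 1/2$, $L(\gamma) = j \pm 1/2$, but for $j = q - 1/2$, the only allowed value of $L(\gamma)$ is $L(\gamma) = j + 1/2$. For $j \ge q + 1/2$, consider therefore
\[ \varphi^\mu_\pm = \sqrt{\frac{L(\gamma) + \frac12 \pm \mu}{2L(\gamma) + 1}}\; Y^{\mu-1/2}_{L(\gamma)} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \pm \sqrt{\frac{L(\gamma) + \frac12 \mp \mu}{2L(\gamma) + 1}}\; Y^{\mu+1/2}_{L(\gamma)} \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{5.22} \]
where the $Y$'s are again the Wu–Yang [5] monopole harmonics, and the sign $\pm$ refers to the sign of $\gamma$. The $\varphi$'s satisfy [Footnote b]
\[ \vec J^{\,2}\varphi_\pm = j(j+1)\varphi_\pm, \tag{5.23} \]
\[ J_3\varphi_\pm = \mu\varphi_\pm, \qquad \mu = -j, \ldots, j, \tag{5.24} \]
\[ \vec L^{\,2}\varphi_\pm = L(\gamma)\big(L(\gamma)+1\big)\varphi_\pm. \tag{5.25} \]
Since $\vec L\cdot\vec\sigma = \vec J^{\,2} - \vec L^{\,2} - 3/4$, we have
\[ x\varphi_\pm = \left(\vec L\cdot\vec\sigma + 1\right)\varphi_\pm = \mp|\gamma|\varphi_\pm, \tag{5.26} \]
as wanted. For $j = q - 1/2$, no $\varphi_-$ (i.e. no $L(\gamma) = q - 1$) state is available, but Eq. (5.22) still yields $2(q - \frac12) + 1 = 2q$ states $\varphi^0_+$ with $L(\gamma) = q$, namely,
\[ \left(\varphi^0_+\right)^\mu = \sqrt{\frac{q + \frac12 + \mu}{2q + 1}}\; Y^{\mu-1/2}_q \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \sqrt{\frac{q + \frac12 - \mu}{2q + 1}}\; Y^{\mu+1/2}_q \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{5.27} \]
where $\mu = -(q - 1/2), \ldots, (q - 1/2)$. They are eigenstates of $x$ with eigenvalue $-q$.

The $y$-eigenspinors $\Phi$ of the "upper" (i.e. $\rho_3 = 1$) sector are constructed indirectly. Assume first that one can find angular spinors $\chi_\pm$ which diagonalize $z = \vec\sigma\cdot\vec\ell + 1$,
\[ z\chi_\pm = \pm|\kappa|\chi_\pm, \tag{5.28} \]
and also satisfy
\[ \vec J^{\,2}\chi^\mu_\pm = j(j+1)\chi^\mu_\pm, \qquad j = q - \frac12, q + \frac12, \ldots, \tag{5.29} \]
\[ J_3\chi^\mu_\pm = \mu\chi^\mu_\pm, \qquad \mu = -j, \ldots, j, \tag{5.30} \]
\[ w\chi^\mu_\pm = \chi^\mu_\mp. \tag{5.31} \]
In the subspace spanned by the $\chi_\pm$'s, $x = z - qw$ and $y = z + qw$ have the remarkably symmetric matrix representations
\[ [x] = \begin{pmatrix} |\kappa| & -q \\ -q & -|\kappa| \end{pmatrix} \quad\text{and}\quad [y] = \begin{pmatrix} |\kappa| & q \\ q & -|\kappa| \end{pmatrix}. \tag{5.32} \]
The eigenvectors $\varphi_\pm$ and $\Phi_\pm$ of $x$ and $y$ with eigenvalues $\pm|\gamma|$ are thus
\[ \begin{aligned} \varphi_+ &= (|\kappa| + |\gamma|)\chi_+ - q\chi_-, & \varphi_- &= q\chi_+ + (|\kappa| + |\gamma|)\chi_-, \\ \Phi_+ &= (|\kappa| + |\gamma|)\chi_+ + q\chi_-, & \Phi_- &= -q\chi_+ + (|\kappa| + |\gamma|)\chi_-. \end{aligned} \tag{5.33} \]
Expressing the $\chi$'s from the upper two equations in terms of the $x$-eigenspinors $\varphi$ yields the $z$-eigenspinors
\[ \chi_+ = \frac{1}{2|\gamma|}\left(\varphi_+ + \frac{q}{|\gamma| + |\kappa|}\,\varphi_-\right), \qquad \chi_- = \frac{1}{2|\gamma|}\left(-\frac{q}{|\gamma| + |\kappa|}\,\varphi_+ + \varphi_-\right), \tag{5.34} \]
which do indeed satisfy (5.28). For $j = q - 1/2$, $\chi_-$ is missing and $\chi_+$ is proportional to the lowest $\varphi^0_+$ in (5.27). Eliminating the $\chi$'s allows us to deduce the $y$-eigenspinors $\Phi$ from the $x$-eigenspinors $\varphi$ according to
\[ \Phi_+ = \frac{1}{|\gamma|}\left(|\kappa|\varphi_+ + q\varphi_-\right) \quad\text{and}\quad \Phi_- = \frac{1}{|\gamma|}\left(-q\varphi_+ + |\kappa|\varphi_-\right), \tag{5.35} \]
which, by construction, satisfy
\[ \vec J^{\,2}\Phi_\pm = j(j+1)\Phi_\pm, \tag{5.36} \]
\[ J_3\Phi_\pm = \mu\Phi_\pm, \qquad \mu = -j, \ldots, j, \tag{5.37} \]
\[ y\Phi_\pm = \mp|\gamma|\Phi_\pm. \tag{5.38} \]
Finally, $w = \vec\sigma\cdot\vec r/r$ interchanges the $x$ and $y$ eigenspinors,
\[ w\varphi^\mu_\pm = \Phi^\mu_\mp. \tag{5.39} \]

Footnote b: The superscript $\mu$ is dropped for the sake of simplicity.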
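The linear algebra behind (5.32) to (5.35) involves only 2×2 matrices, so it can be checked numerically. The sketch below works in the basis $(\chi_+, \chi_-)$, takes $|\gamma| = j + 1/2$ and $|\kappa| = \sqrt{(j+1/2)^2 - q^2}$ from (5.15) and (5.6), and uses the sign convention stated just before (5.33), i.e. eigenvalues $\pm|\gamma|$ in this matrix representation. The function names and the sample values of $j$ and $q$ are illustrative assumptions.

```python
import math


def apply(matrix, vec):
    """Apply a 2x2 matrix to a 2-component vector."""
    return [matrix[0][0] * vec[0] + matrix[0][1] * vec[1],
            matrix[1][0] * vec[0] + matrix[1][1] * vec[1]]


def residual(matrix, vec, eigenvalue):
    """Largest component of matrix.vec - eigenvalue * vec."""
    image = apply(matrix, vec)
    return max(abs(image[i] - eigenvalue * vec[i]) for i in range(2))


def dyon_spinors(j, q):
    """Matrices (5.32) and vectors (5.33) in the (chi_+, chi_-) basis."""
    kappa = math.sqrt((j + 0.5) ** 2 - q ** 2)  # |kappa|, Eq. (5.6)
    gamma = j + 0.5                             # |gamma|, Eq. (5.15)
    a = kappa + gamma
    return {
        "kappa": kappa,
        "gamma": gamma,
        "x": [[kappa, -q], [-q, -kappa]],
        "y": [[kappa, q], [q, -kappa]],
        "phi_plus": [a, -q],
        "phi_minus": [q, a],
        "Phi_plus": [a, q],
        "Phi_minus": [-q, a],
    }


s = dyon_spinors(j=1.5, q=1.0)  # illustrative values, not taken from the text
print(residual(s["x"], s["phi_plus"], +s["gamma"]))
print(residual(s["x"], s["phi_minus"], -s["gamma"]))
print(residual(s["y"], s["Phi_plus"], +s["gamma"]))
print(residual(s["y"], s["Phi_minus"], -s["gamma"]))
```

All four residuals should be at rounding-error level. The same dictionary can be used to confirm (5.35) component by component, since $\Phi_+ = (|\kappa|\varphi_+ + q\varphi_-)/|\gamma|$ is a relation between 2-component vectors in this basis.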
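In contrast to what happens in the "lower" (i.e. $\rho_3 = -1$) sector, in the "upper" (i.e. $\rho_3 = 1$) sector
\[ y(y - 1) = \vec L^{\,2} - 2q\,\frac{\vec\sigma\cdot\vec r}{r} \]
is not the square of an angular momentum and hence, we do have $L(\gamma) = q - 1$ states: $|\gamma| = q$, $\kappa = 0$ for the lowest value of total angular momentum, $j = q - 1/2$, and for $\gamma = -q$, Eq. (5.33) yields (5.27),
\[ \Phi^0\ (= \Phi_-) = \varphi^0_+, \tag{5.40} \]
while the entire $\Phi_+$-tower is missing. This is a $(-1)$-eigenstate of $w$,
\[ w\Phi^0 = -\Phi^0. \tag{5.41} \]
Since $\varphi^0_+$ is a $(-q)$-eigenstate of $x$, $\Phi^0$ is an eigenstate of $y = x + 2qw$ with eigenvalue $(+q)$. Since
\[ \Gamma(\Gamma+1)\Phi^\mu_\gamma = L(\gamma)\big(L(\gamma)+1\big)\Phi^\mu_\gamma, \qquad \Gamma(\Gamma+1)\varphi^\mu_\gamma = L(\gamma)\big(L(\gamma)+1\big)\varphi^\mu_\gamma \tag{5.42} \]
by construction, for $j \ge q + 1/2$ the eigenfunctions of $D^2$ are found as
\[ \Psi_{\pm|\gamma|} = u_\pm \begin{pmatrix} \Phi_\pm \\ 0 \end{pmatrix} \ \text{for } \rho_3 = 1, \qquad \psi_{\pm|\gamma|} = u_\pm \begin{pmatrix} 0 \\ \varphi_\pm \end{pmatrix} \ \text{for } \rho_3 = -1, \qquad \text{if } j \ge q + \frac12, \tag{5.43} \]
where the radial functions $u_\pm(r)$ solve the non-relativistic Coulomb-type equations
\[ \left[ -\left(\partial_r + \frac1r\right)^2 + \frac{L(\gamma)\big(L(\gamma)+1\big)}{r^2} - \frac{2q^2}{r} + q^2 \right] u_\pm = E^2 u_\pm. \tag{5.44} \]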
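By (5.16), these are just the upper (respectively, lower) equations of
\[ -\left(\partial_r + \frac1r\right)^2 + \frac{1}{r^2}\begin{pmatrix} \left(j - \frac12\right)\left(j + \frac12\right) & 0 \\ 0 & \left(j + \frac12\right)\left(j + \frac32\right) \end{pmatrix} - \frac{2q^2}{r} + q^2 \tag{5.45} \]
and hence,
\[ u_\pm(r) \propto r^{L(\gamma)}\, e^{ikr}\, F\!\left(L(\gamma) + 1 - i\,\frac{q^2}{k},\; 2L(\gamma) + 2,\; -2ikr\right), \tag{5.46} \]
where $k = \sqrt{E^2 - q^2}$. For $j = q - 1/2$ we get the $(2q)$ spinors
\[ \psi_+ = u_+ \begin{pmatrix} 0 \\ \varphi^0_+ \end{pmatrix}, \qquad \operatorname{sign}\gamma = +1, \tag{5.47} \]
in the $\rho_3 = -1$ sector with $L(\gamma) = q$ [Footnote c], with $u_+$ still as in (5.46). The energy levels are obtained from the poles of $F$,
\[ L(\gamma) + 1 - \frac{iq^2}{k} = -n, \qquad n = 0, 1, \ldots. \]
Introducing the principal quantum number $p = L(\gamma) + 1 + n \ge q + 1$ we conclude that, in both $\rho_3$ sectors,
\[ E_p = q\sqrt{1 - \left(\frac{q}{p}\right)^2}, \qquad p = q + 1, \ldots. \tag{5.48} \]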
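The step from the pole condition to (5.48) can be traced numerically. In the sketch below, which uses hypothetical function names and assumed sample values, a bound state has $k = i\kappa'$ with $\kappa' = \sqrt{q^2 - E^2}$, so that the pole condition $iq^2/k = p$ becomes $q^2/\kappa' = p$.

```python
import math


def dyon_energy(q, p):
    """E_p = q * sqrt(1 - (q/p)^2), Eq. (5.48); requires p >= q."""
    if p < q:
        raise ValueError("principal quantum number must satisfy p >= q")
    return q * math.sqrt(1.0 - (q / p) ** 2)


def pole_residual(q, p):
    """Deviation from the pole condition q^2 / sqrt(q^2 - E^2) = p."""
    e = dyon_energy(q, p)
    decay = math.sqrt(q * q - e * e)  # k = i * decay for a bound state
    return q * q / decay - p


q = 1.0  # illustrative monopole strength, not taken from the text
for p in (1.0, 2.0, 3.0, 4.0):
    print(p, dyon_energy(q, p), pole_residual(q, p))
```

For $p = q$ the routine returns zero energy, and the levels increase monotonically towards $q$ as $p$ grows.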
The same energy is obtained if $L + n = L' + n'$. The degeneracy of a $(p \ge q + 1)$-level is hence $2(p^2 - q^2)$. If $j = q - 1/2$, $(2q)$ extra states arise in the $\rho_3 = 1$ sector for $\gamma = -q$,
\[ \Psi^0 = u_0 \begin{pmatrix} \Phi^0 \\ 0 \end{pmatrix} \quad\text{for } \rho_3 = 1 \text{ and } \gamma = -q, \tag{5.49} \]
where $u_0$ solves (5.44) with $L(\gamma) = q - 1$. The principal quantum number is now $p = q$, yielding the $2q$-fold degenerate 0-energy ground states. Since $F(0, a, z) = 1$, and the lowest $k$-value is $iq$, $u_0$ is simply
\[ u_0 = r^{q-1}\, e^{-qr}, \tag{5.50} \]
cf. [13, 14]. The situation is shown in Figs. 3 and 4.

Footnote c: $L(\gamma) = q$-values arise in the $\rho_3 = 1$ sector for $\gamma = -(q + 1)$.

Fig. 3. The dyon spectrum in the g = 0 sector. The sign refers to that of (−x). Each j ≥ q + 1/2 sector is doubly degenerate. For j = q − 1/2, there are no (−x) = −q states. The energy only depends on the principal quantum number p = L(γ) + 1 + n.

Fig. 4. The dyon spectrum in the g = 4 sector. The sign refers to that of (−y). Each j ≥ q + 1/2 sector is doubly degenerate. For j = q − 1/2, there are no (−y) = +q states but E = 0 ground-states arise for (−y) = −q.

6. Further Applications

As yet another illustration, we consider a spin $\frac12$ particle described by the four-component Hamiltonian
\[ H = \frac12\left[\vec\pi^{\,2} - q\,\frac{\vec\sigma\cdot\hat r}{r^2} + \frac{\lambda^2}{r^2} - \lambda\gamma^5\,\frac{\vec\sigma\cdot\hat r}{r^2}\right] = \begin{pmatrix} H_1 & 0 \\ 0 & H_0 \end{pmatrix}, \tag{6.1} \]
where $\lambda$ is a real constant [18]. The Hamiltonian (6.1) can again be viewed as associated to a static gauge field on $\mathbb{R}^4$,
\[ \vec A = q\vec A_D, \qquad A_4 = \lambda/r, \tag{6.2} \]
cf. (5.2). The square of the associated Dirac operator
\[ D = \begin{pmatrix} 0 & Q^\dagger \\ Q & 0 \end{pmatrix} = \begin{pmatrix} 0 & \vec\sigma\cdot\vec\pi - i\,\dfrac{\lambda}{r} \\ \vec\sigma\cdot\vec\pi + i\,\dfrac{\lambda}{r} & 0 \end{pmatrix} \tag{6.3} \]
is precisely (6.1). The partner Hamiltonians of the chiral-supersymmetric Dirac operator have again the same spectra. Much of the theory developed before in Secs. 4 and 5 applies. The conserved total angular momentum is (4.8) and Dirac's $K$ is again (5.4). The supercharges $Q$ and $Q^\dagger$ can now be written as
\[ \begin{aligned} Q &= -iw\left(\partial_r + \frac1r - \frac{y}{r}\right) = -i\left(\partial_r + \frac1r + \frac{x}{r}\right)w, \\ Q^\dagger &= -iw\left(\partial_r + \frac1r - \frac{x}{r}\right) = -i\left(\partial_r + \frac1r + \frac{y}{r}\right)w, \end{aligned} \tag{6.4} \]
where
\[ x = z - \lambda w \quad\text{and}\quad y = z + \lambda w. \tag{6.5} \]
The Biedenharn operator, conserved for the quadratic dynamics, is now
\[ \Gamma = -\left(\vec\sigma\cdot\vec\ell + 1 + \gamma^5\lambda w\right), \quad\text{i.e.}\quad -\left(z + \gamma^5\lambda w\right) \equiv -\begin{pmatrix} y & 0 \\ 0 & x \end{pmatrix}. \tag{6.6} \]
Since $\Gamma^2 = z^2 + \lambda^2 = \vec J^{\,2} + 1/4 + \lambda^2 - q^2$, the eigenvalues of $\Gamma$,
\[ \gamma = \pm\sqrt{\left(j + \frac12\right)^2 + \lambda^2 - q^2}, \qquad \operatorname{sign}\gamma = \operatorname{sign}\kappa, \tag{6.7} \]
are in general again irrational. In terms of $\Gamma$, $D^2$ is written
\[ D^2 = \begin{pmatrix} Q^\dagger Q & 0 \\ 0 & QQ^\dagger \end{pmatrix} = -\left(\partial_r + \frac1r\right)^2 + \frac{\Gamma(\Gamma+1)}{r^2}. \tag{6.8} \]
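Formula (6.7) can be explored numerically for particular couplings. The sketch below uses hypothetical function names, and the sample values of $j$, $q$ and $\lambda$ are assumptions chosen for illustration; it evaluates $|\gamma|$ and the centrifugal coefficient $\gamma(\gamma+1)$ that enters (6.8).

```python
import math


def gamma_abs(j, q, lam):
    """|gamma| = sqrt((j + 1/2)^2 + lam^2 - q^2), cf. Eq. (6.7)."""
    radicand = (j + 0.5) ** 2 + lam ** 2 - q ** 2
    if radicand < 0:
        raise ValueError("gamma would be imaginary for these parameters")
    return math.sqrt(radicand)


def centrifugal(gamma):
    """Coefficient gamma * (gamma + 1) of the 1/r^2 term in Eq. (6.8)."""
    return gamma * (gamma + 1.0)


q = 1.0  # illustrative monopole strength
for lam in (0.0, 0.5, 1.0):
    g = gamma_abs(1.5, q, lam)
    # Compare the coefficients for gamma = -|gamma| and gamma = |gamma| - 1.
    print(lam, g, centrifugal(-g), centrifugal(g - 1.0))
```

In this sketch the choice $\lambda = q$ gives $|\gamma| = j + 1/2$, while $\lambda = 0$ in the lowest sector $j = q - 1/2$ gives $\gamma = 0$. The two printed coefficients agree for every $\gamma$, since $(-|\gamma|)(-|\gamma| + 1) = (|\gamma| - 1)|\gamma|$.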
The explicit solution. The operator $\Gamma$ can be diagonalized as in Sec. 5, cf. [8, 9]. The 2-spinors which diagonalize $z$ are
\[ \chi_+ = \frac{1}{2j + 1}\left(\varphi_+ + \frac{q}{j + \frac12 + |\kappa|}\,\varphi_-\right), \qquad \chi_- = \frac{1}{2j + 1}\left(-\frac{q}{j + \frac12 + |\kappa|}\,\varphi_+ + \varphi_-\right), \tag{6.9} \]
where the $\varphi_\pm$ are given in (5.22). Hence,
\[ \begin{aligned} \phi_+ &= \left(|\kappa| + j + \tfrac12\right)\chi_+ - \lambda\chi_-, & \phi_- &= \lambda\chi_+ + \left(|\kappa| + j + \tfrac12\right)\chi_-, \\ \Phi_+ &= \left(|\kappa| + j + \tfrac12\right)\chi_+ + \lambda\chi_-, & \Phi_- &= -\lambda\chi_+ + \left(|\kappa| + j + \tfrac12\right)\chi_-, \end{aligned} \tag{6.10} \]
diagonalize $x$ and $y$,
\[ x\phi^\mu_\pm = \mp|\gamma|\phi^\mu_\pm \quad\text{and}\quad y\Phi^\mu_\pm = \mp|\gamma|\Phi^\mu_\pm. \tag{6.11} \]
The operator $w = \vec\sigma\cdot\hat r$ interchanges the $x$ and $y$ eigenspinors,
\[ w\phi^\mu_\pm = \Phi^\mu_\mp. \tag{6.12} \]
For $j = q - 1/2$, no $\varphi_-$ is available and $\chi_-$ is hence missing. $\chi_+$ is proportional to the lowest $\varphi_+$ in (5.27). There are no $\phi_-$-states in the $\gamma^5 = -1$ sector and no $\Phi_+$-states in the $\gamma^5 = 1$ sector. However, in each $\gamma^5$ sector, (6.9) yields $(2q)$ $(+1)$-eigenstates of $w$, namely,
\[ \left(\phi^0_+\right)^\mu = \left(\Phi^0_-\right)^\mu \propto \sqrt{\frac{q + \frac12 + \mu}{2q + 1}}\; Y^{\mu-1/2}_q \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \sqrt{\frac{q + \frac12 - \mu}{2q + 1}}\; Y^{\mu+1/2}_q \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{6.13} \]
The eigenfunctions of $D^2$ are then found as
\[ \begin{aligned} &\Psi_{\pm|\gamma|} = u_\pm \begin{pmatrix} \Phi_\pm \\ 0 \end{pmatrix} \ \text{for } \gamma^5 = 1, \qquad \psi_{\pm|\gamma|} = u_\pm \begin{pmatrix} 0 \\ \phi_\pm \end{pmatrix} \ \text{for } \gamma^5 = -1, \qquad \text{for } j \ge q + \frac12, \\ &\Psi^0_- = u^0_- \begin{pmatrix} \Phi^0_- \\ 0 \end{pmatrix} \ \text{for } \gamma^5 = 1, \qquad \psi^0_+ = u^0_+ \begin{pmatrix} 0 \\ \phi^0_+ \end{pmatrix} \ \text{for } \gamma^5 = -1, \qquad \text{for } j = q - \frac12. \end{aligned} \tag{6.14} \]
Thus, the radial functions $u_\pm(r)$ solve
\[ \left[ -\left(\partial_r + \frac1r\right)^2 + \frac{\gamma(\gamma+1)}{r^2} - 2E \right] u_\pm = 0. \tag{6.15} \]
This is the wave equation for a free particle except for the fractional "angular momentum" $\gamma$. Its solutions are hence given by the Bessel functions,
\[ u_\pm(r) \propto r^{-1/2}\, J_{|\gamma| \mp \frac12}\!\left(\sqrt{2E}\, r\right). \tag{6.16} \]
• For $\lambda = 0$, we recover the formulae in [19]. The well-known self-adjointness problem in the $j = q - 1/2$ sector shows up in that the eigenvalue $\gamma$ vanishes in this case. (Self-adjointness of $D^2$ requires in fact $|\lambda| \ge 3/2$ [18].)
• Another interesting particular value is $\lambda = \pm q$, when the Biedenharn–Temple operator has half-integer eigenvalues,
\[ \gamma = \pm\left(j + \frac12\right). \tag{6.17} \]
In this case, $\gamma(\gamma+1)$ is the same for $-|\gamma|$ as for $|\gamma| - 1$, leading to identical solutions. Thus, the corresponding energy levels are two-fold degenerate. (This only happens for $|\gamma| \ge |\gamma|_{\min} + 1$, i.e. for $j \ge q + 1/2$.) This can also be understood by noting that, for $\lambda = \pm q$, the spin dependence drops out in one of the $\gamma^5$-sectors. For $\lambda = q$, e.g., the Hamiltonian (6.1) reduces to
\[ H = \begin{pmatrix} H_1 & 0 \\ 0 & H_0 \end{pmatrix} = \frac12\begin{pmatrix} \vec\pi^{\,2} + \dfrac{q^2}{r^2} - 2q\,\dfrac{\vec\sigma\cdot\vec r}{r^3} & 0 \\ 0 & \vec\pi^{\,2} + \dfrac{q^2}{r^2} \end{pmatrix}, \tag{6.18} \]
i.e. $H_0$ describes a spin 0 particle, while $H_1 = H_0 - 2q\,\vec\sigma\cdot\vec r/r^3$ corresponds to a particle with anomalous gyromagnetic ratio 4, cf. dyons in Sec. 5. Hence, the system admits an extra $o(3)$ symmetry, generated by the spin vectors
\[ \vec S_0 = \frac12\vec\sigma \ \text{ for } H_0, \qquad \vec S_1 = U^\dagger \vec S_0\, U \ \text{ for } H_1, \tag{6.19} \]
where $U = Q/\sqrt{H_1}$ and $U^{-1} = U^\dagger = (1/\sqrt{H_1})\, Q^\dagger$ are the unitary transformations which intertwine the non-zero-energy parts of the chiral sectors.

Each of the partner Hamiltonians $H_1$ and $H_0$ in (6.1) has a non-relativistic conformal $o(2, 1)$ symmetry [7] which combines, with $D$ and $-i\gamma^5 D$, into an $osp(1/2)$ superalgebra [18]. The symmetries of the problem are studied in detail in [18, 20].

Acknowledgment

The author is indebted to Roman Jackiw for their interesting correspondence.

References

[1] L. C. Biedenharn, Remarks on the relativistic Kepler problem, Phys. Rev. 126 (1962) 845; L. C. Biedenharn and N. V. V. Swamy, Remarks on the relativistic Kepler problem, II, Approximate Dirac–Coulomb Hamiltonian possessing two vector invariants, Phys. Rev. B 133 (1964) 1353.
[2] P. A. M. Dirac, The Principles of Quantum Mechanics (Clarendon, Oxford, 1958).
[3] J. D. Bjorken and S. D. Drell, Relativistic Quantum Mechanics (McGraw-Hill, New York, 1964).
[4] M. Berrondo and H. V. McIntosh, Degeneracy of the Dirac equation with electric and magnetic Coulomb potentials, J. Math. Phys. 11 (1970) 125.
[5] T. T. Wu and C. N. Yang, Dirac monopole without strings: Monopole harmonics, Nucl. Phys. B 107 (1976) 365.
[6] P. P. Banderet, Zur Theorie der singulären Magnetpole, Helv. Phys. Acta 19 (1946) 503; Harish-Chandra, Motion of an electron in the field of a magnetic pole, Phys. Rev. 74 (1948) 883.
[7] C. Callias, Spectra of fermions in monopole fields: Exactly soluble models, Phys. Rev. D 16 (1977) 3068.
[8] Y. Kazama, C. N. Yang and A. S. Goldhaber, Scattering of a Dirac particle with charge Ze by a fixed magnetic monopole, Phys. Rev. D 15 (1977) 2287.
[9] Y. Kazama and C. N. Yang, Existence of bound states for a charged spin 1/2 particle with an extra magnetic moment in the field of a fixed magnetic monopole, Phys. Rev. D 15 (1977) 2300.
[10] P. Rossi, Spin 1/2 particles in the field of monopoles, Nucl. Phys. B 127 (1977) 518; for a review see H. Yamagishi, The fermion-monopole system reexamined, Phys. Rev. D 27 (1983) 2383; The fermion-monopole system reexamined, 2, ibid. 28 (1983) 977.
[11] E. D'Hoker and L. Vinet, Supersymmetry of the Pauli equation in the presence of a magnetic monopole, Phys. Lett. 137 (1984) 72.
[12] I. S. Gradshtein and I. M. Ryzhik, Tables of Integrals, Sums, Series and Products (Nauka, Moscow, 1971).
[13] L. Gy. Fehér, P. A. Horváthy and L. O'Raifeartaigh, Applications of chiral supersymmetry for spin fields in self-dual backgrounds, Internat. J. Modern Phys. A 4 (1989) 5277; L. Gy. Fehér, P. A. Horváthy and L. O'Raifeartaigh, Separating the dyon system, Phys. Rev. D 40 (1989) 666.
[14] E. D'Hoker and L. Vinet, Constants of motion for a spin 1/2 particle in the field of a dyon, Phys. Rev. Lett. 55 (1986) 1043; Supersymmetries of the dyon, in Field Theory, Quantum Gravity and Strings, Meudon/Paris Seminars 85/86, Springer Lecture Notes in Physics, Vol. 280 (Springer, Berlin-Heidelberg, 1987), p. 156; Hidden symmetries and accidental degeneracy of spin 1/2 particle in the field of a dyon, Lett. Math. Phys. 12 (1986) 71; E. D'Hoker, V. A. Kostelecky and L. Vinet, Spectrum-generating superalgebras, in Dynamical Groups and Spectrum Generating Algebras (World Scientific, Singapore, 1988), pp. 339–367.
[15] H. V. McIntosh and A. Cisneros, Degeneracy in the presence of a magnetic monopole, J. Math. Phys. 11 (1970) 896; D. Zwanziger, Exactly soluble nonrelativistic model of particles with both electric and magnetic charges, Phys. Rev. 176 (1968) 1480; A. O. Barut and G. L. Bornzin, The o(4) symmetry has been extended into o(4,2), SO(4)-formulation of the symmetry breaking relativistic Kepler problems with or without magnetic charges, J. Math. Phys. 4 (1971) 141; J. Schönfeld, The physical interpretation of this system, Dynamical symmetry and magnetic charge, J. Math. Phys. 21 (1971) 2528.
[16] L. Gy. Fehér, Dynamical O(4) symmetry in the asymptotic field of the Prasad–Sommerfield monopole, J. Phys. A 19 (1986) 1259; L. Gy. Fehér and P. A. Horváthy, Nonrelativistic scattering of a spin 1/2 particle off a selfdual monopole, Mod. Phys. Lett. A 3 (1988) 1451.
[17] F. Bloore and P. A. Horváthy, Helicity-supersymmetry of dyons, J. Math. Phys. 33 (1992) 1869; hep-th/0512144.
[18] E. D'Hoker and L. Vinet, Dynamical supersymmetry of the magnetic monopole and the 1/r^2 potential, Commun. Math. Phys. 97 (1985) 391–427.
[19] P. A. Horváthy, New applications of the Biedenharn–Temple operator, in Festschrift in Honor of L. C. Biedenharn, ed. B. Gruber (Plenum, New York, 1994); hep-th/0410161; further developments include F. De Jonghe, A. J. Macfarlane, K. Peeters and J.-W. van Holten, New supersymmetry of the monopole, Phys. Lett. B 359 (1995) 114; M. Plyushchay, On the nature of fermion monopole supersymmetry, ibid. 485 (2000) 187; hep-th/0005122; C. Leiva and Mikhail S. Plyushchay, Nonlinear superconformal symmetry of a fermion in the field of a Dirac monopole, ibid. 582 (2004) 135; hep-th/0311150.
[20] P. A. Horváthy, A. J. Macfarlane and J.-W. van Holten, Monopole supersymmetries and the Biedenharn operator, Phys. Lett. B 486 (2000) 346–352; hep-th/0006118.
June 5, 2006 10:44 WSPC/148-RMP
J070-00266
Reviews in Mathematical Physics Vol. 18, No. 3 (2006) 329–347 © World Scientific Publishing Company
DYNAMICAL (SUPER)SYMMETRIES OF MONOPOLES AND VORTICES
P. A. HORVÁTHY
Département de Mathématiques, Université de Tours, Parc de Grandmont, F-37200 Tours, France
[email protected]
Received 26 January 2006

The dynamical (super)symmetries for various monopole systems are reviewed. For a Dirac monopole, no smooth Runge–Lenz vector can exist; there is, however, a spectrum-generating conformal o(2, 1) dynamical symmetry that extends into osp(1/1) or osp(1/2) for spin 1/2 particles. Self-dual 't Hooft–Polyakov-type monopoles admit an su(2/2) dynamical supersymmetry algebra, which allows us to reduce the fluctuation equation to the spin 0 case. For large r, the system reduces to a Dirac monopole plus a suitable inverse-square potential considered before by McIntosh and Cisneros, and by Zwanziger in the spin 0 case, and to the "dyon" of D'Hoker and Vinet for spin 1/2. The asymptotic system admits a Kepler-type dynamical symmetry as well as a "helicity-supersymmetry" analogous to the one Biedenharn found in the relativistic Kepler problem. Similar results hold for the Kaluza–Klein monopole of Gross–Perry–Sorkin. For the magnetic vortex, the N = 2 supersymmetry of the Pauli Hamiltonian in a static magnetic field in the plane combines with the o(2) × o(2, 1) bosonic symmetry into an o(2) × osp(1/2) dynamical superalgebra.

Keywords: Magnetic monopoles; vortices; dynamical symmetries; supersymmetry.
Mathematics Subject Classification 2000: 81R05
1. Introduction

The archetype of a dynamical symmetry is provided by the Runge–Lenz vector [1] in the Kepler problem,
\[ \vec A = \frac12\left\{\vec p\times\vec L - \vec L\times\vec p\right\} - M\hat r, \tag{1.1} \]
where $M$ is the mass of the sun, the planet's mass is taken to be 1, and $\vec L$ denotes the planet's (orbital) angular momentum, $\vec L = \vec r\times\vec p$. The vector $\vec A$ is directed from the sun's position towards the perihelion point. Under commutation (Poisson bracket), the Runge–Lenz vector and the angular momentum close into $o(4)$ for bound (elliptic) motions, into $o(3)\oplus_s\mathbb{R}^3$ for parabolic motions and into $o(3, 1)$ for hyperbolic motions. This makes it possible to calculate the spectrum and the S-matrix algebraically.
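Conservation of the Runge–Lenz vector (1.1) can be illustrated classically. The sketch below assumes the Newtonian equations of motion $\dot{\vec r} = \vec p$, $\dot{\vec p} = -M\vec r/r^3$ for a unit-mass planet, for which the symmetrized product reduces to $\vec p\times\vec L$. It evaluates the time derivative of $\vec A$ at a phase-space point; the function names and sample values are illustrative and not taken from the text.

```python
import math


def cross(a, b):
    """Cross product of two 3-vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dot(a, b):
    """Scalar product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def runge_lenz(r, p, mass_sun):
    """Classical Runge-Lenz vector A = p x L - M r/|r| (unit planet mass)."""
    ang = cross(r, p)
    radius = math.sqrt(dot(r, r))
    p_cross_l = cross(p, ang)
    return [p_cross_l[i] - mass_sun * r[i] / radius for i in range(3)]


def runge_lenz_rate(r, p, mass_sun):
    """dA/dt along the Kepler flow dr/dt = p, dp/dt = -M r/|r|^3."""
    radius = math.sqrt(dot(r, r))
    force = [-mass_sun * x / radius ** 3 for x in r]
    ang = cross(r, p)  # conserved, so only dp/dt contributes to d(p x L)/dt
    first = cross(force, ang)
    r_dot_p = dot(r, p)
    rhat_rate = [p[i] / radius - r[i] * r_dot_p / radius ** 3 for i in range(3)]
    return [first[i] - mass_sun * rhat_rate[i] for i in range(3)]


r0 = [1.0, 0.5, -0.3]  # illustrative phase-space point
p0 = [0.2, 0.9, 0.4]
print(runge_lenz(r0, p0, 1.0))
print(runge_lenz_rate(r0, p0, 1.0))  # expected to vanish up to rounding
```

The vanishing rate reflects the cancellation between $\dot{\vec p}\times\vec L$ and the derivative of $M\hat r$; the vector $\vec A$ itself is orthogonal to $\vec L$, i.e. it lies in the orbital plane.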
The Kepler problem also admits an $o(2, 1)$ "spectrum-generating symmetry", which combines with the $o(4)/o(3, 1)$ into an irreducible representation of the conformal group $o(4, 2)$ [1]. In this review, we examine how similar dynamical symmetries, as well as supersymmetries, arise for various magnetic monopole systems. In the last chapter, we examine what happens around a magnetic vortex.

2. The Dirac Monopole [2]

Let us consider a Dirac monopole, whose magnetic field is
\[ \vec B = g\,\frac{\vec r}{r^3}. \tag{2.1} \]
The conserved angular momentum of a charged, spinless particle is
\[ \vec L_0 = \vec r\times\vec\pi - q\hat r, \tag{2.2} \]
where $\vec\pi = \vec p - iq\vec A_D$, $\operatorname{rot}\vec A_D = \vec r/r^3$, $q = eg$, $e$ being the electric charge. Since
\[ \vec L_0\cdot\hat r = -q, \tag{2.3} \]
the particle moves classically on a cone whose opening angle $\alpha$ is given by $\cos\alpha = -q/L_0$. There are no bound motions. The question of whether a conserved Runge–Lenz-type vector exists arises naturally, and it has been claimed [3] that the vector $\vec A$ which points from the origin to the closest ("perihelion") point of the trajectory is such a conserved vector, which would generate, with the angular momentum, an $o(3, 1)$ dynamical symmetry. This statement is, however, false: a Dirac monopole cannot admit any time-independent, conserved Runge–Lenz-type vector [4]. This can be understood by considering the "umbrella" transformation of Boulware et al. [5],
\[ \vec r \to \vec R = \frac{\vec r - \hat L_0\,(\vec r\cdot\hat L_0)}{\sin\alpha}, \tag{2.4} \]
which rotates the monopole problem into a potential problem: the particle trajectories in the monopole field correspond to those in the plane perpendicular to the angular momentum, $\vec L_0$, in a $-q^2/2R^2$ potential (and makes the $o(2, 1)$ symmetry [6] manifest). The inverse-square potential problem is integrable. Golo's "Runge–Lenz" vector thereby goes into the vector pointing to the closest point, $\vec R_0$, of the rotated trajectory in the plane perpendicular to $\vec L_0$. This transformation is, however, singular when the motion is radial: when the cone's opening angle closes to zero, the direction of the umbrella transformation becomes undetermined. More precisely, the inverse transformation becomes the familiar Hopf fibering $U(1) \to SO(3) \to S^2$ [4].
June 5, 2006 10:44 WSPC/148-RMP
J070-00266
Dynamical (Super)Symmetries of Monopoles and Vortices
331
A spinless particle in the field of a Dirac monopole admits instead an o(2, 1) symmetry [6], generated by the “non-relativistic conformal transformations”
$$\begin{aligned}
H &= \tfrac{1}{2}\boldsymbol{\pi}^2 &&\text{time translations,}\\
D &= tH - \tfrac{1}{4}\{\boldsymbol{\pi}, \mathbf{r}\} &&\text{dilations,}\\
K &= -t^2H + 2tD + \tfrac{1}{2}mr^2 &&\text{expansions,}
\end{aligned} \tag{2.5}$$
which satisfy the o(2, 1) relations
$$[H, D] = iH, \qquad [H, K] = 2iD, \qquad [D, K] = iK, \tag{2.6}$$
allowing for a derivation of the spectrum from the group theory. This result can be explained from studying the non-relativistic structure of space-time. A free, non-relativistic particle admits in fact the so-called Schrödinger group as symmetry [7]. This latter is the extension of the Galilei group with dilations and expansions. It is best understood in the five-dimensional framework, where non-relativistic motions are light-like reductions of null geodesics in a five-dimensional Lorentz manifold [8]. Jackiw’s o(2, 1) is just the residual symmetry left over from the Schrödinger group after adding a Dirac monopole. The only potential which is consistent with the conformal algebra (2.5) is $\lambda^2/r^2$: for an arbitrary $\lambda$, the Hamiltonian
$$\frac{1}{2}\Big[\boldsymbol{\pi}^2 + \frac{\lambda^2}{r^2}\Big] \tag{2.7}$$
is o(2, 1) symmetric. Adding a Coulomb term would break this symmetry. However, as first noticed by Zwanziger, and by McIntosh and Cisneros (MCZ) [9], a slightly different system does have a Kepler-type dynamical symmetry. It consists of a Dirac monopole plus a fine-tuned inverse-square potential plus a Coulomb term,
$$H_{MCZ} = \frac{1}{2}\Big[\boldsymbol{\pi}^2 + \frac{\alpha}{r} + \frac{q^2}{r^2}\Big], \tag{2.8}$$
which admits a conserved Runge–Lenz vector, namely
$$\mathbf{A}_0 = \frac{1}{2}\{\boldsymbol{\pi}\times\mathbf{L}_0 - \mathbf{L}_0\times\boldsymbol{\pi}\} - q^2\,\hat{\mathbf{r}}. \tag{2.9}$$
This is understood by noting that, when the “umbrella transformation” (2.4) is applied, the $q^2/2r^2$ potentials cancel and we are left with an effective Kepler problem. The o(4)/o(3, 1) dynamical symmetry generated by $\mathbf{L}_0$ and $\mathbf{A}_0$ can be used to determine the spectrum and the scattering matrix [9], respectively. It extends into o(4, 2), but in a representation different from that of the Kepler problem [1, 10].
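The o(2, 1) relations (2.6) have a classical counterpart that is easy to test numerically. The sketch below is an illustration under stated assumptions, not the quantum computation of the text: it takes a free particle ($q = 0$) with $m = 1$, replaces commutators by Poisson brackets via $[A, B] \to i\{A, B\}_{PB}$, and uses the classical limit $D = tH - \tfrac12\,\mathbf r\cdot\mathbf p$ of the symmetrized dilation generator. The helper names (`grad`, `poisson`) are hypothetical.

```python
def H(r, p, t):
    """Free Hamiltonian H = p^2 / 2 (unit mass, no monopole: an assumption)."""
    return 0.5 * sum(x*x for x in p)

def D(r, p, t):
    """Classical dilation generator D = tH - (r . p)/2."""
    return t * H(r, p, t) - 0.5 * sum(x*y for x, y in zip(r, p))

def K(r, p, t):
    """Expansion generator K = -t^2 H + 2tD + r^2/2 with m = 1."""
    return -t*t*H(r, p, t) + 2*t*D(r, p, t) + 0.5*sum(x*x for x in r)

def grad(f, r, p, t, which, eps=1e-3):
    """Central-difference gradient of f w.r.t. r (which=0) or p (which=1).
    For quadratic f this is exact up to rounding."""
    g = []
    for i in range(3):
        hi = [list(r), list(p)]
        lo = [list(r), list(p)]
        hi[which][i] += eps
        lo[which][i] -= eps
        g.append((f(hi[0], hi[1], t) - f(lo[0], lo[1], t)) / (2*eps))
    return g

def poisson(f, g, r, p, t):
    """Poisson bracket {f, g} = df/dr . dg/dp - df/dp . dg/dr."""
    fr, fp = grad(f, r, p, t, 0), grad(f, r, p, t, 1)
    gr, gp = grad(g, r, p, t, 0), grad(g, r, p, t, 1)
    return sum(a*b for a, b in zip(fr, gp)) - sum(a*b for a, b in zip(fp, gr))
```

At any phase-space point and time one should then find $\{H, D\} = H$, $\{H, K\} = 2D$ and $\{D, K\} = K$, which is the classical image of (2.6).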
P. A. Horváthy
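Jackiw’s result was generalized [11] to a spin-$\tfrac12$ particle with gyromagnetic ratio 2, described by the two-component Pauli Hamiltonian
$$H_P = \frac{1}{2}\Big[\boldsymbol{\pi}^2 - q\,\frac{\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}}{r^2}\Big]. \tag{2.10}$$
This system has not only the bosonic o(2, 1) with $D$, $K$ in (2.5) as for spin 0, but also two conserved supercharges, namely
$$Q = \frac{1}{\sqrt{2}}\,\boldsymbol{\sigma}\cdot\boldsymbol{\pi} \quad\text{and}\quad S = \frac{1}{\sqrt{2}}\,\boldsymbol{\sigma}\cdot\mathbf{r} - tQ, \tag{2.11}$$
which close with the bosonic generators into an osp(1/1) superalgebra, i.e. (2.6), supplemented by
$$\begin{aligned}
[Q, D] &= \tfrac{i}{2}Q, & [Q, H] &= 0, & [K, Q] &= iS,\\
[S, D] &= -\tfrac{i}{2}S, & [S, H] &= iQ, & [S, K] &= 0,\\
\{Q, Q\} &= 2H, & \{Q, S\} &= -2D, & \{S, S\} &= 2K.
\end{aligned} \tag{2.12}$$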
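The first anticommutator in (2.12) can be made explicit in one line. The following worked step is a sketch under an assumed convention, $\boldsymbol\pi = -i\nabla - q\mathbf A_D$ with $\operatorname{rot}\mathbf A_D = \mathbf r/r^3$, so that $[\pi_i, \pi_j] = iq\,\epsilon_{ijk}\,r_k/r^3$; this choice of convention is made for the illustration only.

```latex
\begin{aligned}
\{Q, Q\} = 2Q^2 &= (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})^2
  = \boldsymbol{\pi}^2 + \tfrac{i}{2}\,\epsilon_{ijk}\,\sigma_k\,[\pi_i, \pi_j] \\
 &= \boldsymbol{\pi}^2 + \tfrac{i}{2}\,\epsilon_{ijk}\,\sigma_k\,\big(iq\,\epsilon_{ijl}\,r_l/r^3\big)
  = \boldsymbol{\pi}^2 - q\,\frac{\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}}{r^2}
  = 2H_P .
\end{aligned}
```

Here $\sigma_i\sigma_j = \delta_{ij} + i\epsilon_{ijk}\sigma_k$ and $\epsilon_{ijk}\epsilon_{ijl} = 2\delta_{kl}$ have been used; the Pauli term of (2.10), with gyromagnetic ratio 2, is thus exactly what is needed for $H_P$ to be a perfect square.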
The osp(1/1) symmetry, which allows one to derive the spectrum algebraically, can be seen to be the residual superalgebra of the “super-Schrödinger algebra”, obtained by adding the (fermionic) “helicity operator” $Q$ in (2.11) to the Schrödinger group [12].

3. Supersymmetric Quantum Mechanics

D’Hoker and Vinet [13, 14] have further generalized the problem. To explain their results, let us consider a four-dimensional, euclidean space and choose the representation
$$\gamma^k = \begin{pmatrix} 0 & \sigma^k\\ \sigma^k & 0\end{pmatrix}, \qquad
\gamma^4 = \begin{pmatrix} 0 & -i1_2\\ i1_2 & 0\end{pmatrix}, \qquad
\gamma^5 = \begin{pmatrix} 1_2 & 0\\ 0 & -1_2\end{pmatrix}, \tag{3.1}$$
for the Dirac matrices. Let $A_\mu$ denote a gauge field. The four-dimensional Dirac operator,
$$\not{\!D} \equiv \gamma^\mu(\partial_\mu - iA_\mu) \equiv \begin{pmatrix} & Q^\dagger\\ Q & \end{pmatrix} \tag{3.2}$$
is, as in any even dimension, chiral-supersymmetric. This means that the square of $\not{\!D}$,
$$\not{\!D}^2 = \begin{pmatrix} H_1 & \\ & H_0\end{pmatrix}, \tag{3.3}$$
is a supersymmetric Hamiltonian. Its $\pm1$ chirality sectors (eigensectors of $\gamma^5$) are related by the unitary transformations
$$U = Q\,\frac{1}{\sqrt{H_1}} \quad\text{and}\quad U^{-1} \equiv U^\dagger = \frac{1}{\sqrt{H_1}}\,Q^\dagger, \tag{3.4}$$
which intertwine $H_1 = Q^\dagger Q$ and $H_0 = QQ^\dagger$, $H_1 = U^\dagger H_0 U$. If $\Psi_0$ is an $H_0$-eigenfunction with eigenvalue $E > 0$, then
$$\begin{pmatrix} U^\dagger\Psi_0\\ \pm\Psi_0\end{pmatrix} \tag{3.5}$$
is a $\not{\!D}$-eigenfunction with eigenvalue $\pm\sqrt{E}$. Zero-energy ground states may arise; the difference of their multiplicities in the two sectors, called the Atiyah–Singer index, is calculated by topological formulae. Furthermore, if $A_0$ is conserved for $H_0$, $[A_0, H_0] = 0$, then
$$A_1 = U^\dagger A_0 U \tag{3.6}$$
is conserved for $H_1$, $[A_1, H_1] = 0$.

Let us first apply this framework to the gauge field
$$\mathbf{A} = q\mathbf{A}_D, \qquad A_4 = \frac{\lambda}{r}, \tag{3.7}$$
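where $\lambda$ is an arbitrary real constant. This gauge field represents a Dirac monopole plus a Coulomb potential in the fourth (euclidean) direction. Assuming that nothing depends on the fourth direction, $\partial_4(\,\cdot\,) = 0$, the associated Dirac operator becomes
$$\frac{1}{\sqrt{2}}\not{\!D} = \frac{1}{\sqrt{2}}\Big[\gamma^i\pi_i + \gamma^4\frac{\lambda}{r}\Big]
= \frac{1}{\sqrt{2}}\begin{pmatrix} & Q^\dagger\\ Q & \end{pmatrix}
= \frac{1}{\sqrt{2}}\begin{pmatrix} & \boldsymbol{\sigma}\cdot\boldsymbol{\pi} - i\dfrac{\lambda}{r}\\[2mm] \boldsymbol{\sigma}\cdot\boldsymbol{\pi} + i\dfrac{\lambda}{r} & \end{pmatrix}. \tag{3.8}$$
Its square is the four-component Hamiltonian
$$H = \frac{1}{2}\Big[\boldsymbol{\pi}^2 - q\,\frac{\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}}{r^2} + \frac{\lambda^2}{r^2} - \lambda\gamma^5\,\frac{\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}}{r^2}\Big]. \tag{3.9}$$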
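The chiral-supersymmetric structure of (3.2)–(3.6) is purely algebraic and can be illustrated in finite dimensions. The sketch below is a toy model, not the monopole system: $Q$ is replaced by an arbitrary fixed $3\times3$ complex matrix (a hypothetical choice), and it is checked that the square of the off-diagonal operator is block-diagonal with blocks $H_1 = Q^\dagger Q$ and $H_0 = QQ^\dagger$, that $Q$ intertwines them, and that the two blocks have the same power traces, hence the same spectrum.

```python
def mm(A, B):
    """Matrix product of nested-list matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dag(A):
    """Hermitian conjugate."""
    return [[A[j][i].conjugate() for j in range(len(A))]
            for i in range(len(A[0]))]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def dist(A, B):
    """Largest entrywise deviation between two matrices."""
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def dirac(Q):
    """Off-diagonal block matrix [[0, Q^dagger], [Q, 0]], cf. (3.2)."""
    n = len(Q)
    Qd = dag(Q)
    zero = [0] * n
    return [zero + Qd[i] for i in range(n)] + [Q[i] + zero for i in range(n)]

def block_diag(A, B):
    n = len(A)
    zero = [0] * n
    return [A[i] + zero for i in range(n)] + [zero + B[i] for i in range(n)]

def power_traces(H, kmax=3):
    """tr H, tr H^2, ..., tr H^kmax (these fix the spectrum of a 3x3 matrix)."""
    out, P = [], H
    for _ in range(kmax):
        out.append(trace(P))
        P = mm(P, H)
    return out

# Arbitrary toy "supercharge" (hypothetical numbers, chosen only for illustration)
Q = [[1 + 2j, 0.5, -1j],
     [0.3, 2 - 1j, 1.0],
     [-0.7j, 0.2, 1.5 + 0.5j]]

H1 = mm(dag(Q), Q)      # upper (chirality +1) sector
H0 = mm(Q, dag(Q))      # lower (chirality -1) sector
D = dirac(Q)
```

In this toy setting the intertwining relation $QH_1 = H_0Q$ plays the role of the unitary map $U$ in (3.4): any $H_0$-conserved matrix $A_0$ is carried to an $H_1$-conserved one by conjugation, as in (3.6).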
The Hamiltonian (3.9) is block-diagonal, and the $\pm1$ chirality components only differ in the sign of $\lambda$. They describe two uncoupled spin-$\tfrac12$ particles with anomalous gyromagnetic ratios. Interestingly, the Hamiltonian (3.9) is a perfect square in two different ways:
$$Q_1 = \frac{1}{\sqrt{2}}\,\gamma^5\Big[\gamma^i\pi_i + \gamma^4\frac{\lambda}{r}\Big] \quad\text{and}\quad Q_2 = -i\gamma^5 Q_1 \tag{3.10}$$
both satisfy $\{Q_a, Q_b\} = \delta_{ab}H$, and are hence conserved. They mix with the bosonic o(2, 1) symmetry, yielding two more supercharges, namely,
$$S_1 = -tQ_1 + \frac{1}{\sqrt{2}}\,\gamma^5\gamma^i r_i \quad\text{and}\quad S_2 = -i\gamma^5 S_1$$
which satisfy {Sa , Sb } = 2δab K. Finally, {Qa , Sb } = −2δab D + 2ab Y, where Y is the parity operator 1 σ · ˆr 3 Y = γ 5 σ · + − λγ 5 , 2 2 r
(3.11)
where = r × π.
(3.12)
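The four bosonic operators $H$, $D$, $K$, $Y$ close with the fermionic operators $Q_a$, $S_a$ ($a = 1, 2$) into the superalgebra osp(1/2). Since the field (3.7) is manifestly spherically symmetric, the total angular momentum,
$$\mathbf{J} = \mathbf{L}_0 + \frac{1}{2}\boldsymbol{\sigma} \tag{3.13}$$
is also conserved. For the special value $q = \pm\lambda$, the Pauli term drops out from one of the sectors while the gyromagnetic ratio becomes 4 in the other. Equation (3.9) hence reduces to
$$H = \begin{pmatrix} H_1 & \\ & H_0\end{pmatrix}
= \frac{1}{2}\left[\boldsymbol{\pi}^2 + \frac{\lambda^2}{r^2} + \begin{pmatrix} 2q\,\dfrac{\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}}{r^2} & \\ & 0\end{pmatrix}\right]. \tag{3.14}$$
Being spin-independent, the lower Hamiltonian clearly admits
$$\mathbf{S}_0 = \frac{1}{2}\boldsymbol{\sigma} \tag{3.15}$$
as symmetry. However, supersymmetry implies that its partner Hamiltonian also has a “spin” symmetry:
$$\mathbf{S}_1 = U^\dagger\mathbf{S}_0U \tag{3.16}$$
commutes with $H_1$. $\mathbf{S}_0$ and $\mathbf{L}_0 = \mathbf{J} - \mathbf{S}_0$ are hence both conserved for $H_0$. Thus, $\mathbf{S}_1$ and
$$\mathbf{L}_1 = U^\dagger\mathbf{L}_0U = \mathbf{J} - \mathbf{S}_1 = \mathbf{L}_0 + \frac{1}{2}\boldsymbol{\sigma} - \mathbf{S}_1 \tag{3.17}$$
are both conserved for $H_1$. The combined system $\begin{pmatrix} H_1 & \\ & H_0\end{pmatrix}$ has, therefore, two conserved “angular momenta”, namely,
$$\boldsymbol{\mathcal{S}} = \begin{pmatrix} \mathbf{S}_1 & \\ & \mathbf{S}_0\end{pmatrix} \quad\text{and}\quad \boldsymbol{\mathcal{L}} = \begin{pmatrix} \mathbf{L}_1 & \\ & \mathbf{L}_0\end{pmatrix}. \tag{3.18}$$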
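The action of the supercharges extends the o(3)$_{\rm spin}$ algebra into u(2/2). Let us indeed define the vector supercharges
$$\boldsymbol{\mathcal{Q}}_\alpha = 2i[\mathbf{S}_0, Q_\alpha] \qquad (\alpha = 1, 2), \tag{3.19}$$
i.e.
$$\boldsymbol{\mathcal{Q}}_1 = \begin{pmatrix} & -2iQ^\dagger\mathbf{S}_0\\ 2i\mathbf{S}_0Q & \end{pmatrix}, \qquad
\boldsymbol{\mathcal{Q}}_2 = \begin{pmatrix} & -2Q^\dagger\mathbf{S}_0\\ -2\mathbf{S}_0Q & \end{pmatrix}. \tag{3.20}$$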
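All these operators commute with the Hamiltonian $H$. One has furthermore
$$[\gamma^5, \text{bosonic}] = 0, \qquad \{\gamma^5, \text{fermionic}\} = 0.$$
To summarize, the bosonic operators $\mathbf{S}_0$, $\mathbf{S}_1$, $\gamma^5$, $H$ and the fermionic operators $Q_a$, $\boldsymbol{\mathcal{Q}}_a$ satisfy the (anti)commutation relations

bosonic sector:
$$[S_0^i, S_0^j] = i\epsilon_{ijk}S_0^k, \qquad [S_1^i, S_1^j] = i\epsilon_{ijk}S_1^k, \qquad [S_0^i, S_1^j] = 0, \qquad [\gamma^5, S_0^i] = [\gamma^5, S_1^i] = 0;$$

action of bosonic operators on fermionic sector ($a, b = 1, 2$; $i, j, k = 1, 2, 3$):
$$\begin{aligned}
[\gamma^5, Q_a] &= 2i\epsilon_{ab}Q_b, & [\gamma^5, \mathcal{Q}_a^k] &= 2i\epsilon_{ab}\mathcal{Q}_b^k,\\
[S_0^i, Q_a] &= -\tfrac{i}{2}\mathcal{Q}_a^i, & [S_0^i, \mathcal{Q}_a^j] &= \tfrac{i}{2}\big(\delta_{ij}Q_a + \epsilon_{ijk}\mathcal{Q}_a^k\big),\\
[S_1^i, Q_a] &= \tfrac{i}{2}\mathcal{Q}_a^i, & [S_1^i, \mathcal{Q}_a^j] &= -\tfrac{i}{2}\big(\delta_{ij}Q_a - \epsilon_{ijk}\mathcal{Q}_a^k\big);
\end{aligned}$$

fermionic sector:
$$\begin{aligned}
\{Q_a, Q_b\} &= 2\delta_{ab}H,\\
\{Q_a, \mathcal{Q}_b^i\} &= -4H\epsilon_{ab}\big(S_0^i + S_1^i\big),\\
\{\mathcal{Q}_a^i, \mathcal{Q}_b^j\} &= 2H\delta_{ij}\delta_{ab} - 4H\epsilon_{ijk}\epsilon_{ab}\big(S_1^k - S_0^k\big),
\end{aligned}$$
i.e. close into the su(2/2) SUSY algebra [14, 17, 18]. The osp(2, 1) found before mixes with the o(3)$_{\rm rotations}$ and the u(2/2)$_{\rm spin}$ to yield a supersymmetric version of o(4, 2). Its precise structure has not yet been determined.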
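The fermionic-sector relations can be tested in a small matrix model. The sketch below is a toy illustration under explicit assumptions and is not the monopole system: $Q = \sqrt{h}\,W$ with a constant unitary $2\times2$ matrix $W$ (so that $H_0 = H_1 = h$), $\mathbf S_0 = \boldsymbol\sigma/2$ acting on the lower sector, $\mathbf S_1 = W^\dagger\mathbf S_0W$ acting on the upper one, $Q_1$ the off-diagonal operator of (3.2), $Q_2 = -i\gamma^5Q_1$, vector supercharges defined as in (3.19), and $\epsilon_{12} = +1$. All names (`VQ`, `max_violation`, …) are hypothetical.

```python
import math

def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def lin(a, A, b=0, B=None):
    """Return a*A (+ b*B) entrywise."""
    if B is None:
        return [[a * x for x in row] for row in A]
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def dag(A):
    return [[A[j][i].conjugate() for j in range(len(A))] for i in range(len(A[0]))]

def comm(A, B):
    return lin(1, mm(A, B), -1, mm(B, A))

def anti(A, B):
    return lin(1, mm(A, B), 1, mm(B, A))

def blocks(a, b, c, d):
    """4x4 matrix [[a, b], [c, d]] built from 2x2 blocks."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def dist(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def eps3(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

def eps2(a, b):
    """Antisymmetric symbol with eps2(0, 1) = +1."""
    return b - a

Z2 = [[0, 0], [0, 0]]
I2 = [[1, 0], [0, 1]]
SIG = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
s = [lin(0.5, m) for m in SIG]                       # spin matrices sigma/2

h = 2.0                                               # toy "energy" (assumption)
W = lin(1 / math.sqrt(2), I2, 1j / math.sqrt(2), SIG[0])   # unitary 2x2 matrix
Q = lin(math.sqrt(h), W)

G5 = blocks(I2, Z2, Z2, lin(-1, I2))
Hm = lin(h, blocks(I2, Z2, Z2, I2))
S0 = [blocks(Z2, Z2, Z2, sk) for sk in s]
S1 = [blocks(mm(dag(W), mm(sk, W)), Z2, Z2, Z2) for sk in s]
Q1 = blocks(Z2, dag(Q), Q, Z2)
Q2 = lin(-1j, mm(G5, Q1))
Qs = [Q1, Q2]
VQ = [[lin(2j, comm(S0[k], Qa)) for k in range(3)] for Qa in Qs]   # VQ[a][k]

def max_violation():
    """Largest entrywise deviation from the fermionic-sector relations."""
    worst = dist(comm(S0[0], S1[1]), lin(0, Hm))
    for a in range(2):
        for b in range(2):
            worst = max(worst, dist(anti(Qs[a], Qs[b]), lin(2 * (a == b), Hm)))
            for i in range(3):
                rhs = lin(-4 * eps2(a, b), mm(Hm, lin(1, S0[i], 1, S1[i])))
                worst = max(worst, dist(anti(Qs[a], VQ[b][i]), rhs))
                for j in range(3):
                    rhs = lin(2 * (i == j) * (a == b), Hm)
                    for k in range(3):
                        term = mm(Hm, lin(1, S1[k], -1, S0[k]))
                        rhs = lin(1, rhs, -4 * eps3(i, j, k) * eps2(a, b), term)
                    worst = max(worst, dist(anti(VQ[a][i], VQ[b][j]), rhs))
    return worst
```

Because $H$ is proportional to the identity in this toy model, the check only probes the matrix structure of the relations; it says nothing about operator-ordering subtleties or zero modes in the actual monopole problem.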
4. Self-Dual ’t Hooft–Polyakov Monopoles

The Dirac monopole was generalized by ’t Hooft and Polyakov in non-abelian gauge theory [2]. It is a static, purely magnetic ($\partial_0 = 0$), everywhere-regular, finite-energy solution to the $SU(2)$ Yang–Mills–Higgs equations associated with the energy functional
$$E = \int d^3x\,\Big\{\frac{1}{4}\,{\rm Tr}\big(F_{ij}F^{ij}\big) + \frac{1}{2}\,{\rm Tr}\big(D_j\Phi D^j\Phi\big) + \frac{\lambda}{4}\big(1 - {\rm Tr}(\Phi^2)\big)^2\Big\}, \tag{4.1}$$
where $F_{ij} = \partial_iA_j - \partial_jA_i + [A_i, A_j]$ and $D_j\Phi = \partial_j\Phi + [A_j, \Phi]$. Finite energy requires $|\Phi| \to 1$ for large $r$, so that the asymptotic values of the Higgs field define a mapping from the “sphere at infinity” $S^2$ into the “vacuum
manifold” $M = \{|\Phi| = 1\}$. $M$ is again a two-sphere, so it provides us with the integer
$$n = [\Phi] \in \pi_2(S^2) \simeq \mathbb{Z}, \tag{4.2}$$
called the topological charge. For a non-vanishing Higgs potential (i.e. $\lambda \neq 0$), the system has the same o(2, 1) bosonic symmetry as the Dirac monopole. In the “Prasad–Sommerfield limit” of vanishing $\lambda$ the situation is different. The second-order field equations associated to (4.1) are solved by the “self-duality” or “Bogomolny” equations
$$\mathbf{B} = \mathbf{D}\Phi \quad\text{where}\quad B_i = \frac{1}{2}\epsilon_{ijk}F^{jk}. \tag{4.3}$$
For $n = 1$, for example, Prasad and Sommerfield found the solution
$$A_j^a = \epsilon_{ajk}\,\frac{x^k}{r^2}\Big(1 - \frac{r}{\sinh r}\Big), \qquad \Phi^a = -\frac{x^a}{r}\Big(\coth r - \frac{1}{r}\Big). \tag{4.4}$$
Setting $A_4 = \Phi$ and requiring $\partial_4 = 0$, a PS monopole can also be viewed as a self-dual Yang–Mills field in four euclidean dimensions. Let us now consider a massless Dirac particle in a BPS background, described by the four-dimensional Dirac operator
$$\not{\!D} = \begin{pmatrix} & Q^\dagger\\ Q & \end{pmatrix} = \begin{pmatrix} & \boldsymbol{\sigma}\cdot\boldsymbol{\pi} - i\Phi\\ \boldsymbol{\sigma}\cdot\boldsymbol{\pi} + i\Phi & \end{pmatrix}. \tag{4.5}$$
As explained in Sec. 3, $\not{\!D}$ is chiral-supersymmetric. Now, owing to
$$QQ^\dagger = \boldsymbol{\pi}^2 + \Phi^2 + \boldsymbol{\sigma}\cdot(\mathbf{B} - \mathbf{D}\Phi), \qquad Q^\dagger Q = \boldsymbol{\pi}^2 + \Phi^2 + \boldsymbol{\sigma}\cdot(\mathbf{B} + \mathbf{D}\Phi),$$
the spin drops out in the self-dual sector, while we get a factor 2 in the other one: $H_0$ describes two spin 0 particles (or a spin-$\tfrac12$ particle with gyromagnetic ratio 0), while $H_1$ describes a particle with anomalous gyromagnetic ratio 4. This is why the fluctuation equation in the BPS background can be reduced to the study of the spin 0 system [15, 16]. The spin operator $\mathbf{S}_0 = \boldsymbol{\sigma}/2$ is trivially conserved for $H_0$. Its superpartner,
$$\mathbf{S}_1 = U^\dagger\mathbf{S}_0U = \frac{1}{H_1}\Big\{\frac{1}{2}\big[\boldsymbol{\pi}^2 - \Phi^2\big]\boldsymbol{\sigma} + \Phi\,(\boldsymbol{\pi}\times\boldsymbol{\sigma}) - (\boldsymbol{\sigma}\cdot\boldsymbol{\pi})\,\boldsymbol{\pi}\Big\}, \tag{4.6}$$
is therefore conserved for $H_1$. Zero-energy ground states only arise for $H_1$ (but not for $H_0$) as solutions of $Q\Psi = 0$. The multiplicity of these states (the Atiyah–Singer index) was found to be $2n$, twice the topological charge [16]. Since BPS monopoles with topological charge $n \geq 2$ are not spherically symmetric, for a general BPS monopole, this is the end of the story. For the $n = 1$ BPS solution above, however, we also have spherical symmetry and hence the
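total angular momentum, $\mathbf{J} = \mathbf{L}_0 + \tfrac12\boldsymbol{\sigma}$, is conserved. The same argument as in Sec. 3 shows that
$$\mathbf{L}_0 = \mathbf{J} - \mathbf{S}_0 \quad\text{and}\quad \mathbf{L}_1 = U^\dagger\mathbf{L}_0U = \mathbf{J} - \mathbf{S}_1,$$
cf. (3.17), are conserved for $H_0$ and $H_1$, respectively; the commuting operators $\boldsymbol{\mathcal{L}}$ and $\boldsymbol{\mathcal{S}}$ in Eq. (3.18) generate o(3)$_{\rm rotations}$ $\oplus$ o(3)$_{\rm spin}$, and the spin part is extended into u(2/2) as in Eq. (3.21).

5. Dyons

For large $r$, the systems become even more symmetric. The BPS monopole becomes an imbedded Dirac monopole with an additional long-range scalar field $\Phi \sim 1 - 1/r$. For eigenstates of the electric charge operator $Q_{em} = \hat{\Phi}$, the $SU(2)$-covariant derivative reduces to the electromagnetic covariant derivative with coupling constant equal to $q$, the electric charge. Thus,
$$\begin{aligned}
H_0 &\to H_{MCZ} = \boldsymbol{\pi}^2 + q^2\Big(1 - \frac{1}{r}\Big)^2\\
H_1 &\to H_D = \boldsymbol{\pi}^2 + q^2\Big(1 - \frac{1}{r}\Big)^2 + 2q\,\frac{\boldsymbol{\sigma}\cdot\mathbf{r}}{r^3}
\end{aligned}
\qquad\text{when } r \to \infty. \tag{5.1}$$
Remarkably, the large-$r$ limit of $H_0$ is precisely $H_{MCZ}$, the McIntosh–Cisneros–Zwanziger Hamiltonian (2.8) (times the unit $2\times2$ matrix), while its partner $H_1$ becomes the “dyon” Hamiltonian $H_D$ of D’Hoker and Vinet [17, 18]. Supersymmetry then converts the Runge–Lenz vector $\mathbf{A}_0$ of MCZ into a spin-dependent Runge–Lenz vector,
$$\mathbf{A}_1 = \underbrace{\frac{1}{2}\big(\boldsymbol{\pi}\times\mathbf{L}_0 - \mathbf{L}_0\times\boldsymbol{\pi}\big) - q^2\,\hat{\mathbf{r}}}_{\mathbf{A}_0} + \boldsymbol{\pi}\times\boldsymbol{\sigma} + \frac{q}{r}\,\boldsymbol{\sigma} - q\,\frac{\mathbf{r}\cdot\boldsymbol{\sigma}}{r^3}\,\mathbf{r} - \frac{q}{2}\,\boldsymbol{\sigma}, \tag{5.2}$$
which is conserved for $H_D$. For the asymptotic system,
$$\begin{pmatrix} H_D & \\ & H_{MCZ}\end{pmatrix}, \tag{5.3}$$
the bosonic symmetry algebra o(3)$_{\rm rotations}$ $\oplus$ o(3)$_{\rm spin}$ extends therefore into
$$o(4) \oplus o(3)_{\rm spin} \tag{5.4}$$
for bound motions (and into o(3, 1) $\oplus$ o(3)$_{\rm spin}$ for scattered motions),$^a$ generated by
$$\boldsymbol{\mathcal{A}} = \begin{pmatrix} \mathbf{A}_1 & \\ & \mathbf{A}_0\end{pmatrix}, \tag{5.5}$$

$^a$ It is likely that this symmetry is further extended to o(4, 2) $\oplus$ o(3)$_{\rm spin}$.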
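and by $\boldsymbol{\mathcal{L}}$ and $\boldsymbol{\mathcal{S}}$ in Eq. (3.18), to which is added the supersymmetry algebra u(2/2) in Eq. (3.21). The dynamical symmetry (5.4) makes it possible to find the spectrum [14, 18, 19],
$$E = q^2\Big(1 - \frac{q^2}{p^2}\Big), \qquad p = \begin{cases} q,\ q + 1,\ \ldots\\ q + 1,\ \ldots\end{cases} \quad\text{for}\quad \begin{matrix} H_1\\ H_0\end{matrix}. \tag{5.6}$$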
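Chiral SUSY means that the spectra of $H_0$ and of $H_1$ are identical up to zero-energy ground states. Closer inspection shows, however, even more symmetry, namely, a two-fold degeneracy. Let us focus our attention on a fixed $j = {\rm const.}$ sector. The pattern is reminiscent of a supersymmetric system, except that the ground-state energy is non-zero. Generalizing Biedenharn’s approach to the relativistic Kepler problem [20], we can exhibit another conserved operator, namely,
$$\mathcal{R} = \begin{pmatrix} & R^\dagger\\ R & \end{pmatrix}
= \begin{pmatrix} & i\boldsymbol{\sigma}\cdot\boldsymbol{\pi} + \dfrac{q}{r} + \dfrac{q^2}{x}\,(\boldsymbol{\sigma}\cdot\hat{\mathbf{r}})\\[2mm]
i\boldsymbol{\sigma}\cdot\boldsymbol{\pi} - \dfrac{q}{r} + \dfrac{q^2}{y}\,(\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}) & \end{pmatrix}, \tag{5.7}$$
that we call the “dyon helicity” operator [19]. Here,
$$\begin{aligned}
x &= \boldsymbol{\sigma}\cdot\boldsymbol{\ell} + 1 - q\,\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}\\
y &= \boldsymbol{\sigma}\cdot\boldsymbol{\ell} + 1 + q\,\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}
\end{aligned}
\quad\text{is conserved for}\quad
\begin{matrix} H_0\\ H_1\end{matrix}
\qquad (\boldsymbol{\ell} = \mathbf{r}\times\mathbf{p}). \tag{5.8}$$
$x$ and $y$ both have the eigenvalues $\pm(j + 1/2)$ [18, 19]. They are just the components of the Biedenharn–Temple operator
$$\Gamma = -\big(\boldsymbol{\sigma}\cdot\boldsymbol{\ell} + 1 + q\gamma^5\,\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}\big) = \begin{pmatrix} -y & \\ & -x\end{pmatrix}. \tag{5.9}$$
The dyon helicity operator $\mathcal{R}$ satisfies
$$\mathcal{R}^2 = \not{\!D}^2 - E_0^{(j)}. \tag{5.10}$$
Subtracting the ground-state energy $E_0^{(j)}$,
$$\not{\!D}^2 - E_0^{(j)} = \begin{pmatrix} H_1 - q^2 + \dfrac{q^4}{(j + \frac12)^2} & \\ & H_0 - q^2 + \dfrac{q^4}{(j + \frac12)^2}\end{pmatrix} \tag{5.11}$$
hence becomes a supersymmetric Hamiltonian, with $\mathcal{R}$ as its square root. The new supersymmetry sectors are the $\pm1$ eigensectors of the normalized Biedenharn–Temple operator $\Gamma/|\Gamma|$.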
The dyon helicity operator has the nice property that it respects the angular decomposition. Explicit eigenfunctions are constructed in [19].

6. Particle in the Wu–Yang Monopole Field

The MCZ system has yet another symmetric generalization. Rather than considering spin-$\tfrac12$ particles with vanishing isospin, we can also study spin 0 particles with isospin, moving in a self-dual Wu–Yang [21] monopole field. This latter is obtained by imbedding the Dirac monopole into $SU(2)$ gauge theory and adding a suitable “hedgehog” scalar field,
$$\mathbf{A} = \frac{i}{2r}\,\boldsymbol{\sigma}\times\hat{\mathbf{r}}, \qquad \Phi = \frac{i}{2}\Big(1 - \frac{1}{r}\Big)\boldsymbol{\sigma}\cdot\hat{\mathbf{r}}. \tag{6.1}$$
The electric charge is defined [2] as the eigenvalue of
$$Q_{em} = \hat{\Phi} = \boldsymbol{\sigma}\cdot\hat{\mathbf{r}}. \tag{6.2}$$
The Hamiltonian is hence
$$H_{WY} = \frac{1}{2}\big(-i\nabla - Q_{em}\mathbf{A}_D\big)^2 + \frac{Q_{em}^2}{2}\Big(1 - \frac{1}{r}\Big)^2. \tag{6.3}$$
Since on the Qem = ±q, eigensectors HW Y reduces to the MCZ hamiltonian, such a particle admits the conserved Runge–Lenz vector [22] A=
1 {π × J − J × π} − q 2 ˆr 2
1 i + qσ − q(σ · ˆr)ˆr − − σ × ˆr − rσ × π + (σ · + 1)ˆr . 2r 2
(6.4)
A variation of the model can be obtained by considering “nucleon-type” particles [13, 23], whose electric charge operator is 1 (6.5) Qem = Qem − σ · ˆr. 2 The associated Hamiltonian is only slightly different from yet another one studied by D’Hoker and Vinet [13], namely 2 q 2 + 1 − σ · ˆr 1 1 4 2 +α . HN = −i∇ − Qem − σ.ˆr AD + (6.6) 2 2 2 r r This admits again a conserved Runge–Lenz vector, namely, [23] A=
1 {π × L0 − L0 × π} − q 2 ˆr 2
1 i + qσ − q(σ · ˆr)ˆr − − σ × ˆr − rσ × π + (σ · + 1)ˆr . 2r 2
(6.7)
D’Hoker and Vinet have also proved that $H_N$ is actually a partner Hamiltonian of a supersymmetric system, namely of
$$\begin{pmatrix} H_N & \\ & H_D\end{pmatrix}. \tag{6.8}$$

7. The Kaluza–Klein Monopole

The Kaluza–Klein monopole [24] is obtained by imbedding the Taub-NUT gravitational instanton as a static soliton in Kaluza–Klein theory. This latter is described by the 4-metric
$$V\{dr^2 + r^2(d\theta^2 + \sin^2\theta\,d\phi^2)\} + \frac{1}{V}\{d\psi + 4m\cos\theta\,d\phi\}^2 \quad\text{where}\quad V = 1 + \frac{4m}{r}. \tag{7.1}$$
The “vertical” variable $\psi$ describes an internal circle. The apparent singularity at $r = 0$ is unphysical if $\psi$ is periodic with period $16\pi m$. In the usual context, the Taub-NUT parameter $m$ is positive. We shall, however, also consider $m < 0$. Such a situation arises, e.g., in the long-range scattering of self-dual $SU(2)$ monopoles [25]. $\partial_\psi$ is a Killing vector, and the associated conserved quantity, $q$, is quantized in half-integers. It is identified with the electric charge. The curved-space gamma matrices $\hat{\gamma}^A$ and the spin connection $\Gamma_A$ in the KK monopole background are found to be
γˆj = i √ σ V
i −√ σ V , 0
√ i V + √ σ·A V (7.2) γˆ4 = √ i 0 V − √ σ·A V
0
and Γi =
−
1 1 (B × σ)i (σ · B)Ai + 2V 2 2V 0
0
,
−
0
Γ4 =
1 B·σ 2V 2 0
0
. (7.3)
0
Requiring that all fields be equivariant with respect to the vertical action ψ → ψ + α, i.e. have the form, eiqψ Ψ, the Dirac operator becomes [26]
D=
0 Q
†
Q 0
=
0 √ 1 q √ σ·π V +i V V 4m
q √ 1 √ σ·π−i V 4m V , 0
(7.4)
where $\boldsymbol{\pi} = -i\nabla - (q/4m)\mathbf{A}$, $\mathbf{A}$ being the vector potential of a Dirac monopole of unit strength. (It is easy to check that $Q$ and $Q^\dagger$ are each other’s adjoint with respect to the Taub-NUT volume element $V\,d^4x$, as they have to be.) Using the self-duality property
$$\nabla V = \mathbf{B}, \tag{7.5}$$
the square of D is readily found to be 1 q σ · L0 12m2 ˆ H0 + V − r2 V σ · r + 4m r3 V + r4 V 2
q 2 1 V2 π2 + V 4m
,
(7.6)
$\mathbf{L}_0$ being the spin-0 “monopole” orbital angular momentum, $\mathbf{L}_0 = \mathbf{r}\times\boldsymbol{\pi} - q\hat{\mathbf{r}}$ in Eq. (2.2). ($\mathbf{L}_0$ is conserved only for $H_0$ but not for $H_1$.) The partner Hamiltonians $H_1$ and $H_0$ hence differ by a complicated expression, and it is not at all obvious that they have the same spectra. Chiral SUSY implies, however, that this is nevertheless true. Let us first focus our attention on the $\gamma^5 = -1$ sector. Observe that the spin dependence has again dropped out, so it actually describes two uncoupled spin 0 particles. $H_0$ is in fact the same as the Hamiltonian for a spin 0 particle in the KK field [4] (times the unit matrix). Because the spin is uncoupled, the system again has two angular momenta, namely, the orbital angular momenta and the spin vectors,
$$\mathbf{L}_0, \quad \mathbf{L}_1 = U^\dagger\mathbf{L}_0U, \quad\text{and}\quad \mathbf{S}_0 = \frac{\boldsymbol{\sigma}}{2}, \quad \mathbf{S}_1 = U^\dagger\mathbf{S}_0U, \tag{7.7}$$
cf. (3.17) and (3.18). $H_0$ admits [25] a Runge–Lenz vector,
$$\mathbf{A}_0 = \frac{1}{2}\{\boldsymbol{\pi}\times\mathbf{L}_0 - \mathbf{L}_0\times\boldsymbol{\pi}\} - 4m\,\hat{\mathbf{r}}\Big(H_0 - \Big(\frac{q}{4m}\Big)^2\Big). \tag{7.8}$$
The vector operators $\mathbf{L}_0$ and $\mathbf{A}_0$ generate an o(3, 1) dynamical symmetry for scattered motions and o(4) for bound motions. Its superpartner, $\mathbf{A}_1 = U^\dagger\mathbf{A}_0U$ [cf. (5.5)], generates an analogous dynamical symmetry group for $H_1$ [26].
8. Supersymmetry of the Magnetic Vortex

The three-dimensional (super)symmetries studied above become even larger in the plane [27, 28], namely, for a magnetic vortex (an idealization for the Aharonov–Bohm experiment). Firstly, the o(2, 1) symmetry (2.5) is still present; on the other
hand, the $N = 2$ supersymmetry of the Pauli Hamiltonian of a spin-$\tfrac12$ particle, present for any magnetic field in the plane [29], combines, for a magnetic vortex, with Jackiw’s o(2) $\times$ o(2, 1) into an o(2) $\times$ osp(1/2) superalgebra.$^b$ This curious supersymmetry is realized with two- (rather than four-) component objects, and is only possible in two spatial dimensions [30]. It arises owing to the existence of two “scalar products” in the plane, namely, the ordinary (symmetric) scalar product, and the (antisymmetric) vector product.$^c$

In detail, let us first consider a spin-$\tfrac12$ particle in an arbitrary static magnetic field $\mathbf{B} = (0, 0, B)$, $B = B(x, y)$. Dropping the irrelevant $z$ variable, we work in the plane. Then, our model is described by the Pauli Hamiltonian
$$H = \frac{1}{2m}\big[\boldsymbol{\pi}^2 - eB\sigma_3\big], \tag{8.1}$$
where $B = \operatorname{rot}\mathbf{A}\ (\equiv \epsilon^{ij}\partial_iA_j)$. It is now easy to see that the Hamiltonian (8.1) is a perfect square in two different ways: both operators
$$Q = \frac{1}{\sqrt{2m}}\,\boldsymbol{\pi}\cdot\boldsymbol{\sigma} \quad\text{and}\quad Q^\star = \frac{1}{\sqrt{2m}}\,\boldsymbol{\pi}\times\boldsymbol{\sigma}, \tag{8.2}$$
where $\boldsymbol{\sigma} = (\sigma_1, \sigma_2)$, satisfy
$$\{Q, Q\} = \{Q^\star, Q^\star\} = 2H. \tag{8.3}$$
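The role of the two planar “scalar products” can be seen at the level of $2\times2$ matrices. The sketch below is a symbol-level illustration only: the components of $\boldsymbol\pi$ are replaced by commuting real numbers (i.e. $B = 0$ is assumed, so the Pauli term is absent), and the helper names are hypothetical. It checks that $\boldsymbol\pi\cdot\boldsymbol\sigma$ and $\boldsymbol\pi\times\boldsymbol\sigma = \epsilon_{ij}\pi^i\sigma^j$ both square to $\boldsymbol\pi^2$, that they anticommute, and that the twisted combination equals $-i\sigma_3$ times the ordinary one.

```python
def mm(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def lin(a, A, b, B):
    """Return a*A + b*B entrywise."""
    return [[a * x + b * y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def dist(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]

def dot_sigma(px, py):
    """pi . sigma = px*sigma_1 + py*sigma_2."""
    return lin(px, SX, py, SY)

def cross_sigma(px, py):
    """pi x sigma = eps_ij pi^i sigma^j = px*sigma_2 - py*sigma_1."""
    return lin(px, SY, -py, SX)

def check(px, py):
    """Deviations from: A^2 = p^2, B^2 = p^2, {A, B} = 0, B = -i sigma_3 A."""
    A, B = dot_sigma(px, py), cross_sigma(px, py)
    p2 = lin(px * px + py * py, I2, 0, I2)
    zero = lin(0, I2, 0, I2)
    return (dist(mm(A, A), p2),
            dist(mm(B, B), p2),
            dist(lin(1, mm(A, B), 1, mm(B, A)), zero),
            dist(B, lin(-1j, mm(SZ, A), 0, I2)))
```

The last identity mirrors the three-dimensional relation $Q_2 = -i\gamma^5Q_1$ of (3.10), with $\sigma_3$ playing the role of the chirality operator in the plane.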
Thus, for any static, purely magnetic field in the plane, $H$ is an $N = 2$ supersymmetric Hamiltonian. The supercharge $Q$ is a standard object used in supersymmetric quantum mechanics; the “twisted” charge $Q^\star$ was used, e.g., in [32], to describe the Landau states in a constant magnetic field [29, 31]. Let us assume henceforth that $B$ is the field of a point-like magnetic vortex directed along the $z$-axis, $B = \Phi\,\delta(\mathbf{r})$, where $\Phi$ is the total magnetic flux.$^d$ Inserting $A_i(\mathbf{r}) = -(\Phi/2\pi)\,\epsilon_{ij}\,r^j/r^2$ into the Pauli Hamiltonian $H$ in (8.1), it is straightforward to check that
$$D = tH - \frac{1}{4}\{\boldsymbol{\pi}, \mathbf{r}\} \quad\text{and}\quad K = -t^2H + 2tD + \frac{1}{2}mr^2,$$
cf. (2.5), generate, along with $H$, an o(2, 1) Lie algebra (2.6). The angular momentum, $J = \mathbf{r}\times\boldsymbol{\pi}$, adds to this o(2, 1) an extra o(2).$^e$

$^b$ This is to be compared with the Galilean supersymmetry [30] for non-relativistic Chern–Simons systems, and with the osp(1/2) found by Hughes et al. in a constant magnetic field [31].
$^c$ The vector or cross product of two planar vectors, $\mathbf{u}\times\mathbf{v} = \epsilon_{ij}u^iv^j$, is a scalar.
$^d$ Our setup can be thought of as an idealization of the spinning version of the Aharonov–Bohm experiment [33].
$^e$ The correct definition of angular momentum requires boundary conditions.
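Commuting $Q$ and $Q^\star$ with the expansion, $K$, yields two more generators, namely
$$\begin{aligned}
S &= i[Q, K] = \sqrt{\frac{m}{2}}\,\Big(\mathbf{r} - \frac{\boldsymbol{\pi}}{m}\,t\Big)\cdot\boldsymbol{\sigma},\\
S^\star &= i[Q^\star, K] = \sqrt{\frac{m}{2}}\,\Big(\mathbf{r} - \frac{\boldsymbol{\pi}}{m}\,t\Big)\times\boldsymbol{\sigma}.
\end{aligned} \tag{8.4}$$
It is now straightforward to see that both sets $Q$, $S$ and $Q^\star$, $S^\star$ extend the o(2, 1) $\cong$ osp(1/0) into an osp(1/1) superalgebra. These two algebras do not close yet, though: the “mixed” anticommutators $\{Q, S^\star\}$ and $\{Q^\star, S\}$ produce a new conserved charge, viz.
$$\{Q, S^\star\} = -\{Q^\star, S\} = J + 2\Sigma, \qquad\text{where}\quad \Sigma = \frac{1}{2}\sigma_3.$$
But $J$ now satisfies nontrivial commutation relations with the supercharges,
$$[J, Q] = -iQ^\star, \qquad [J, Q^\star] = iQ, \qquad [J, S] = -iS^\star, \qquad [J, S^\star] = iS.$$
Thus, setting $Y = J + 2\Sigma = \mathbf{r}\times\boldsymbol{\pi} + \sigma_3$, the generators $H$, $D$, $K$, $Y$ and $Q$, $Q^\star$, $S$, $S^\star$ satisfy
$$\begin{aligned}
[Q, D] &= \tfrac{i}{2}Q, & [Q^\star, D] &= \tfrac{i}{2}Q^\star,\\
[Q, K] &= -iS, & [Q^\star, K] &= -iS^\star,\\
[Q, H] &= 0, & [Q^\star, H] &= 0,\\
[Q, Y] &= -iQ^\star, & [Q^\star, Y] &= iQ,\\
[S, D] &= -\tfrac{i}{2}S, & [S^\star, D] &= -\tfrac{i}{2}S^\star,\\
[S, K] &= 0, & [S^\star, K] &= 0,\\
[S, H] &= iQ, & [S^\star, H] &= iQ^\star,\\
[S, Y] &= -iS^\star, & [S^\star, Y] &= iS,\\
\{Q, Q\} &= 2H, & \{Q^\star, Q^\star\} &= 2H,\\
\{S, S\} &= 2K, & \{S^\star, S^\star\} &= 2K,\\
\{Q, Q^\star\} &= 0, & \{S, S^\star\} &= 0,\\
\{Q, S\} &= -2D, & \{Q^\star, S^\star\} &= -2D,\\
\{Q, S^\star\} &= Y, & \{Q^\star, S\} &= -Y.
\end{aligned} \tag{8.5}$$
Added to the o(2, 1) relations, this means that our generators span the osp(1/2) superalgebra [11, 13]. On the other hand,
$$Z = J + \Sigma = \mathbf{r}\times\boldsymbol{\pi} + \frac{1}{2}\sigma_3$$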
commutes with all generators of osp(1/2), so that the full symmetry is the direct product osp(1/2) $\times$ o(2), generated by
$$\begin{aligned}
Y &= \mathbf{r}\times\boldsymbol{\pi} + \sigma_3, & Q &= \frac{1}{\sqrt{2m}}\,\boldsymbol{\pi}\cdot\boldsymbol{\sigma},\\
H &= \frac{1}{2m}\big[\boldsymbol{\pi}^2 - eB\sigma_3\big], & Q^\star &= \frac{1}{\sqrt{2m}}\,\boldsymbol{\pi}\times\boldsymbol{\sigma},\\
D &= -\frac{1}{4}\{\boldsymbol{\pi}, \mathbf{q}\} - t\,\frac{eB}{2m}\,\sigma_3, & S &= \sqrt{\frac{m}{2}}\;\mathbf{q}\cdot\boldsymbol{\sigma},\\
K &= \frac{1}{2}m\mathbf{q}^2, & S^\star &= \sqrt{\frac{m}{2}}\;\mathbf{q}\times\boldsymbol{\sigma},\\
Z &= \mathbf{r}\times\boldsymbol{\pi} + \frac{1}{2}\sigma_3,
\end{aligned} \tag{8.6}$$
where we have put $\mathbf{q} = \mathbf{r} - (\boldsymbol{\pi}/m)\,t$.

The supersymmetric Hamiltonian (8.1) is the square of Jackiw’s [32] two-dimensional Dirac operator $\boldsymbol{\pi}\times\boldsymbol{\sigma}$. However, the Dirac operator is supersymmetric in any even-dimensional space. The energy levels are therefore non-negative; eigenstates with non-zero energy are doubly degenerate; the system has ${\rm Ent}(e\Phi - 1)$ zero-modes [32, 33]. The superalgebra (8.6) allows for a complete group-theoretical solution of the Pauli equation, along the lines indicated by D’Hoker and Vinet [11, 13]. Notice that the two-dimensional Dirac operator $\boldsymbol{\pi}\times\boldsymbol{\sigma}$ of [32] (essentially, our $Q^\star$) is associated with the unusual choice of the two-dimensional “Dirac” (i.e. Pauli) matrices $\gamma_1 = -\sigma_2$, $\gamma_2 = \sigma_1$. Our helicity operator, $Q$, is again a “Dirac operator”, but one associated with the standard choice $\gamma_1 = \sigma_1$, $\gamma_2 = \sigma_2$.

Acknowledgments

This review is based on joint research with L. Fehér, B. Cordani, L. O’Raifeartaigh, F. Bloore, C. Duval, G. Gibbons and A. Comtet, to whom I express my indebtedness.
References [1] B. Cordani, The Kepler Problem (Birkh¨ auser, 2003); The o(4, 2) symmetry was first found by H. Kleinert, Colorado Lecture (1966) (unpublished); A. O. Barut and H. Kleinert, Transition probabilities of the hydrogen atom from noncompact dynamical groups, Phys. Rev. 156 (1967) 1541; G. Gy¨ orgyi, Kepler’s equation, Fock variables, Bacry’s generators and Dirac brackets, Il Nuovo Cimento A53 (1968) 717. [2] P. Goddard and D. Olive, Magnetic monopoles in gauge field theories, Rep. Prog. Phys. 41 (1978) 1357. [3] Golo, Dynamic SO(3,1) symmetry of a Dirac magnetic monopole, JETP Lett. 35 (1982) 535.
[4] L. Gy. Feh´er, The O(3,1) symmetry problem of the charge-monopole interaction, J. Math. Phys. 28 (1987) 234. [5] D. G. Boulware, L. S. Brown, R. N. Cahn, S. D. Ellis and C. Lee, Scattering on magnetic charge, Phys. Rev. D 14 (1976) 2708. [6] R. Jackiw, Dynamical symmetry of the magnetic monopole, Ann. Phys. (N.Y.) 129 (1980) 183. [7] R. Jackiw, Introducing scaling symmetry, Phys. Today 25 (1972) 23; U. Niederer, The maximal kinematical invariance group of the free Schr¨ odinger equation, Helv. Phys. Acta 45 (1972) 802; C. R. Hagen, Scale and conformal transformations in GalileanCovariant Field Theory, Phys. Rev. D 5 (1972) 377; C. Duval, Quelques proc´edures g´eometriques en dynamique des particules, Th`ese de Doctorat d’Etat, Marseille (1982) (unpublished). [8] C. Duval, G. Burdet, H. P. K¨ unzle and M. Perrin, Bargmann structures and Newton– Cartan theory, Phys. Rev. D 31 (1985) 1841; C. Duval, G. W. Gibbons and P. A. Horv´ athy, Celestial mechanics, conformal structures, and gravitational waves, Phys. Rev. D 43 (1991) 3907. [9] H. V. McIntosh and A. Cisneros, Degeneracy in the presence of a magnetic monopole, J. Math. Phys. 11 (1970) 896; D. Zwanziger, Exactly soluble nonrelativistic model of particles with both electric and magnetic charges, Phys. Rev. 176 (1968) 1480; J. Sch¨ onfeld, Dynamical symmetry and magnetic charge, J. Math. Phys. 21 (1971) 2528; L. Gy. Feh´er, Dynamical O(4) symmetry in the asymptotic field of the Prasad– Sommerfield monopole, J. Phys. A 19 (1986) 1259; Dynamical O(4) symmetry in long-range monopole-test particle and monopole-monopole interactions, in NonPerturbative Methods in Quantum Field Theory; (eds.) Z. Horv´ ath, L. Palla and A. Patk´ os (World Scientific, 1987). For the scattering, see L. Gy. Feh´er and P. A. Horv´ athy, Non-relativistic scattering of a spin 1/2 particle off a self-dual monopole, Mod. Phys. Lett. A 3 (1988) 1451. [10] A. O. Barut and G. L. 
Bornzin, SO(4)-formulation of the symmetry breaking relativistic Kepler problems with or without magnetic charges, J. Math. Phys. 4 (1971) 141; B. Cordani, L. G. Feh´er and P. A. Horv´ athy, Kepler-type dynamical symmetries of long-range monopole interactions, J. Math. Phys. 31 (1990) 202. [11] E. D’Hoker and L. Vinet, Supersymmetry of the Pauli equation in the presence of a magnetic monopole, Phys. Lett. B 137 (1984) 72. [12] J. P. Gauntlett, J. Gomis and P. K. Townsend, Supersymmetry and the physicalphase-space formulation of spinning particles, Phys. Lett. B 248 (1990) 288; C. Duval and P. A. Horv´ athy, On Schr¨ odinger superalgebras, J. Math. Phys. 35 (1994) 2516; [hep-th/0508079]. [13] E. D’Hoker and L. Vinet, Dynamical supersymmetry of the magnetic monopole and the 1/r 2 potential, Commun. Math. Phys. 97 (1985) 391–427. [14] E. D’Hoker and L. Vinet, Supersymmetries of the dyon, in Field Theory, Quantum Gravity and Strings, Meudon-Paris Seminars 85/86, Springer Lecture Notes in Physics, Vol. 280 (Springer-Verlag, 1987), p. 156; Hidden symmetries and accidental degeneracy of spin 1/2 particle in the field of a dyon, Lett. Math. Phys. 12 (1986) 71; E. D’Hoker, V. A. Kostelecky and L. Vinet, Spectrum-generating superalgebras, in Dynamical Groups and Spectrum Generating Algebras (World Scientific, Singapore, 1988), pp. 339–367. [15] E. Mottola, Zero modes of the ’t Hooft–Polyakov monopole, Phys. Lett. B 79 (1979) 242. [16] E. J. Weinberg, Parameter counting for multi-monopole solutions, Phys. Rev. D 20 (1979) 936.
[17] E. D’Hoker and L. Vinet, Constants of motion for a spin 1/2 particle in the field of a dyon, Phys. Rev. Lett. 55 (1986) 1043. [18] L. Gy. Feh´er, P. A. Horv´ athy and L. O’Raifeartaigh, Applications of chiral supersymmetry for spin fields in self-dual backgrounds, Int. J. Mod. Phys. A 4 (1989) 5277; Separating the dyon system, Phys. Rev. D 40 (1989) 666. [19] F. Bloore and P. A. Horv´ athy, Helicity-supersymmetry of dyons, J. Math. Phys. 33 (1992) 1869. [20] L. C. Biedenharn, Remarks on the relativistic Kepler problem, Phys. Rev. 126 (1962) 845; M. Berrondo and H. V. McIntosh, Degeneracy of the Dirac equation with electric and magnetic Coulomb potentials, J. Math. Phys. 11 (1970) 125. [21] T. T. Wu and C. N. Yang, Some solutions of the classical isotopic gauge field equations, in Properties of Matter under Unusual Conditions, eds. H. Mark and S. Fernbach (Interscience, 1969). [22] A. O. Barut and G. L. Bornzin, New relativistic Coulomb Hamiltonian with O(4) symmetry and a spinor realization of the dynamical group O(4, 2), Phys. Rev. D 7 (1973) 3018. [23] P. A. Horv´ athy, Isospin-dependent o(4,2) symmetry of self-dual Wu–Yang monopoles, Mod. Phys. Lett. A 6 (1991) 3613. [24] Gross and M. Perry, Magnetic monopoles in Kaluza–Klein theories, Nucl. Phys. B 226 (1983) 29; R. Sorkin, Kaluza–Klein monopole, Phys. Rev. Lett. 51 (1983) 87. [25] N. Manton and G. W. Gibbons, Classical and quantum dynamics of BPS monopoles, Nucl. Phys. 274 (1986) 183; L. Gy. Feh´er and P. A. Horv´ athy, Dynamical symmetry of monopole scattering, Phys. Lett. B 183 (1987) 182; B. Cordani, L. Gy. Feh´er and P. A. Horv´ athy, o(4,2) dynamical symmetry of the Kaluza–Klein monopole, Phys. Lett. 201 (1988) 481. [26] Z. F. Ezawa and A. Iwazaki, Monopole-fermion dynamics and the Rubakov effect in Kaluza–Klein theories, Phys. Lett. B 138 (1984) 81; M. B. Paranjape and G. W. Semenoff, Fractional fermion number in Kaluza–Klein theory, Phys. Rev. D 31 (1985) 1324. Later developments include: M. 
Visinescu, Generalized Runge–Lenz vector in Taub-NUT spinning space, Phys. Lett. B 339 (1994) 28; J. W. van Holten, Supersymmetry and geometry of Taub-NUT, ibid. 342 (1995) 47, A. Comtet and P. A. Horv´ athy, The Dirac equation in Taub-NUT Space, ibid. 349 (1995) 49, etc. [27] R. Jackiw, Dynamical symmetry of the magnetic vortex, Ann. Phys. (N.Y.) 201 (1990) 83. [28] C. J. Parks, The dynamical supersymmetry of the point magnetic vortex, Nucl. Phys. B 367 (1992) 99; J.-G. Demers, Dynamical supersymmetry and solutions for Pauli Hamiltonians, Mod. Phys. Lett. 8 (1993) 827; C. Duval and P. A. Horv´ athy, Exotic supersymmetry of the magnetic vortex, Tours Preprint N. 60/93 (1993) (unpublished). [29] E. Witten, Dynamical breaking of supersymmetry, Nucl. Phys. B 185 (1981) 513; P. Salomonson and J. W. Van Holten, Fermionic coordinates and supersymmetry in quantum mechanics, ibid. 169 (1982) 509; M. De Crombrugghe and V. Rittenberg, Supersymmetric quantum mechanics, Ann. Phys. (N.Y.) 151 (1983) 99. [30] M. Leblanc, G. Lozano and H. Min, Extended superconformal galilean symmetry in Chern–Simons matter systems, Ann. Phys. (N.Y.) 219 (1992) 328; C. Duval and P. A. Horv´ athy, in [12]. The bosonic galilean symmetry was pointed out by R. Jackiw and S.-Y. Pi, Classical and quantal nonrelativistic Chern–Simons theory, Phys. Rev. D 42 (1990) 3500. [31] R. J. Hughes, V. A. Kosteleck´ y and M. M. Nieto, Supersymmetric quantum mechanics in a first-order Dirac equation, Phys. Rev. D 34 (1986) 1100.
[32] R. Jackiw, Fractional charge and zero modes for planar systems in a magnetic field, Phys. Rev. D 29 (1984) 2375. [33] C. R. Hagen, Aharonov–Bohm scattering of particles with spin, Phys. Rev. Lett. 64 (1990) 503; R. Musto, L. O’Raifeartaigh and A. Wipf, The U(1) Anomaly, the noncompact index theorem and the (supersymmetric) BA effect, Phys. Lett. B 175 (1986) 433; P. Forg´ acs, L. O’Raifeartaigh and A. Wipf, Scattering theory, U(1) anomaly and index theorems for compact and noncompact manifolds, Nucl. Phys. B 293 (1987) 559.
June 29, 2006 16:15 WSPC/148-RMP
J070-00270
Reviews in Mathematical Physics Vol. 18, No. 4 (2006) 349–415 © World Scientific Publishing Company
RIGOROUS STEPS TOWARDS HOLOGRAPHY IN ASYMPTOTICALLY FLAT SPACETIMES
CLAUDIO DAPPIAGGI
Dipartimento di Fisica Nucleare e Teorica, Università di Pavia, Italy and Istituto Nazionale di Fisica Nucleare, Sezione di Pavia, via A. Bassi 6, I-27100 Pavia, Italy
[email protected]
VALTER MORETTI∗ and NICOLA PINAMONTI†
Dipartimento di Matematica, Università di Trento, Povo (TN), Italy and Istituto Nazionale di Alta Matematica “F. Severi”, unità locale di Trento, Povo (TN), Italy and Istituto Nazionale di Fisica Nucleare, Gruppo Collegato di Trento, via Sommarive 14, I-38050 Povo (TN), Italy
∗[email protected]
†[email protected]
Received 11 June 2005
Revised 11 April 2006
Scalar QFT on the boundary $\Im^+$ at future null infinity of a general asymptotically flat 4D spacetime is constructed using the algebraic approach based on the Weyl algebra associated with a BMS-invariant symplectic form. The constructed theory turns out to be invariant under a suitable strongly continuous unitary representation of the BMS group, with manifest meaning when the fields are interpreted as suitable extensions to $\Im^+$ of massless minimally coupled fields propagating in the bulk. The group-theoretical analysis of this unitary BMS representation proves that such a field on $\Im^+$ coincides with the natural wave function constructed out of the unitary BMS irreducible representation induced from the little group ∆, the semidirect product of SO(2) and the two-dimensional translation group. This wave function is massless with respect to the notion of mass for BMS representation theory. The result suggests a natural criterion to solve the long-standing problem of the topology of the BMS group. Indeed, the natural correspondence of quantum field theories found here holds only if the BMS group is equipped with the nuclear topology, rejecting instead the Hilbert one. Finally, some theorems towards a holographic description on $\Im^+$ of QFT in the bulk are established at the level of $C^*$-algebras of fields for spacetimes that are asymptotically flat at null infinity.
It is proved that preservation of a certain symplectic form implies the existence of an injective ∗-homomorphism from the Weyl algebra of fields of the bulk into that associated with the boundary $\Im^+$. Those results are, in particular, applied to 4D Minkowski spacetime where a nice interplay between Poincaré invariance in the bulk and BMS invariance on the boundary at null infinity is established at the level of QFT. It arises that, in this case, the ∗-homomorphism admits unitary implementation and Minkowski vacuum is mapped into the BMS invariant vacuum on $\Im^+$.
Keywords: Asymptotically flat spacetimes; BMS group; algebraic quantum field theory; Weyl algebra; $C^*$-algebra; unitary irreducible representations; Mackey machine; induced representations; holography.
Mathematics Subject Classification 2000: 81T20, 81T05, 83C30, 83C47, 81R10, 22D30
Contents
1. Introduction
1.1. Holography in asymptotically flat spacetimes
1.2. Basic definitions and notations
2. Scalar QFT on $\Im^+$
2.1. Asymptotic flatness, asymptotic Killing symmetries, BMS group and all that
2.2. Space of fields with BMS representations
2.3. BMS-invariant symplectic form
2.4. Weyl algebraic quantization and Fock representation
2.5. Unitary BMS invariance
2.6. Topology on $G_{BMS}$ in view of the analysis of irreducible unitary representations and strong continuity
3. BMS Theory of Representations in Nuclear Topology
3.1. General goals of the section
3.2. The group G BMS and some associated spaces
3.3. Main ingredients to study unitary representations of G BMS
3.4. Construction of unitary irreducible representations of G BMS
3.5. The scalar-induced wave function
3.6. The covariant scalar wave function and its bulk interpretation
4. A Few Holographic Issues
4.1. General goals of the section
4.2. Linear QFT in the bulk
4.3. General holographic tools
4.4. Holographic interplay of Minkowski space and $\Im^+$
5. Conclusions
Appendix A
A.1. GNS reconstruction
A.2. Proof of some propositions
1. Introduction
1.1. Holography in asymptotically flat spacetimes
One of the key obstacles in the current, apparently never-ending, quest to combine general relativity and quantum mechanics in a unique framework consists in
a deeply rooted lack of comprehension of the role and the number of quantum degrees of freedom of gravity. In this respect, a new insight has been gained from the work of ’t Hooft, who suggested addressing this problem from a completely new perspective, now referred to as the holographic principle [1]. This principle states, from the most general point of view, that physical information in spacetime is fully encoded on the boundary of the region under consideration. ’t Hooft’s paper was a cornerstone for innumerable research papers which led to an extension of the celebrated Bekenstein–Hawking results about black hole entropy to a wider class of spacetime regions (see, in particular, the covariant entropy conjecture in [2]). Furthermore, a broader version of the holographic principle arose from the above-cited developments, according to which any quantum field theory, gravity included, living on a D-dimensional spacetime can be fully described by means of a second theory living on a suitable submanifold, with codimension 1, which is not necessarily (part of) the boundary of the former. However, the holographic principle lacks any general prescription on how to concretely construct a holographic counterpart of a given quantum field theory. In high energy physics in the past years, the attempt to fill this gap succeeded in achieving some remarkable results. The most notable is the so-called AdS/CFT correspondence [3] or Maldacena conjecture, the key remark being the existence of the equivalence between the bulk and the boundary partition function once the asymptotically AdS boundary conditions have been imposed on the physical fields. Without entering into details (see [4] for a recent review), it suffices to say that in the low energy limit, a supergravity theory living on an $AdS_D \times X^{10-D}$ manifold is (dual to) an $SU(N)$ conformal super Yang–Mills field theory living on the boundary at spatial infinity of $AdS_D$.
Other remarkable versions of the holographic principle for AdS-like spacetimes are due to Rehren [5, 6], who rigorously proved several holographic results for local quantum fields in an AdS background, establishing a correspondence between bulk and boundary observables without employing string machinery. It is rather natural to ask whether similar holographic correspondences hold when a different class of spacetimes is considered. In this paper, we deal with the specific case of asymptotically flat spacetimes and we consider fields interacting, in the bulk, only with the gravitational field. The quest to construct a holographic correspondence in this scenario started only recently and a few different approaches have been proposed [7–9]. In particular, in [7], in order to implement the holographic principle in a four-dimensional asymptotically flat spacetime $(M, g)$, it has been proposed to construct a bulk-to-boundary correspondence between a theory living on $M$ and a quantum field theory living at future (or past) null infinity $\Im^+$ of $M$. A key point is that the theory on $\Im^+$ is further assumed to be invariant under the action of the asymptotic symmetry group of this class of spacetimes: the so-called Bondi–Metzner–Sachs (BMS) group. The analysis, performed along the lines of Wigner's approach to Poincaré invariant free quantum field theory, has led to the construction of the full spectrum, the equations of motion and the Hamiltonians for free quantum field theory enjoying BMS invariance [7, 10]. A first and apparently surprising conclusion which has been drawn from
these papers is that, in a BMS invariant field theory, there is a natural plethora of different kinds of admissible BMS-invariant fields. As a consequence, the one-to-one correspondence between the bulk and boundary particle spectrum, characteristic of the Maldacena conjecture, does not hold in this context or needs further information to be constructed. Nevertheless, such a conclusion should not be seen as a setback, since it is the symptom of a key feature specific to asymptotically flat spacetimes. This is the universality of the boundary data: as explained in more detail in the next section, the structure at future and past null infinity of any asymptotically flat spacetime is the same. Thus, from a holographic perspective, a BMS-invariant field theory on $\Im^+$ should encode the information from all possible asymptotically flat bulk manifolds. Consequently, it is not surprising that there is such a huge number of admissible BMS-invariant free fields. The main question now consists in finding a procedure allowing one to single out information on a specific bulk from the boundary theory. The aim of this paper is to develop part of this programme using the theory of unitary representations of the BMS group as well as tools of algebraic local quantum field theory. In particular, using the approach introduced in [15–17] and fully developed in [17], we define quantum field theory on the null surface $\Im^+$ using the algebraic framework based on a suitable representation of Weyl $C^*$-algebras of fields. Then, we investigate the interplay of that theory and quantum field theory of a free scalar field in the bulk, finding several interesting results. There is a GNS (Fock space) representation of the field theory on $\Im^+$, based on a certain algebraic quasifree state, which admits an irreducible strongly continuous unitary representation of the BMS group which leaves invariant the vacuum state.
The algebra of fields transforms covariantly with respect to that unitary representation. In other words, the fields on $\Im^+$ and the above-mentioned unitary action of the BMS group have manifest geometrical meaning when the fields on $\Im^+$ are interpreted as suitable extensions of massless minimally coupled fields propagating in the bulk. Furthermore, the group-theoretical analysis of the BMS representation proves that the bulk massless field “restricted” to $\Im^+$ coincides with the natural wave function constructed out of the unitary BMS irreducible representation induced from the little group ∆: the semidirect product of SO(2) and the two-dimensional translations. This wave function is massless with respect to a known notion of mass in BMS representation theory. In this context, this result provides the solution of a long-standing problem concerning the natural topology of the BMS group. In fact, the unitary representation of the BMS group found here exists only if the BMS group is equipped with the nuclear topology. In this sense, the widely considered Hilbert topology must be rejected. Finally, some theorems towards a holographic description on $\Im^+$ of QFT in the bulk are established at the level of Weyl $C^*$-algebras of fields for spacetimes which are asymptotically flat at null infinity. It is shown that, if a symplectic form is preserved passing from the bulk to the boundary, the algebra of fields in the bulk can be identified with a subalgebra of the algebra of field observables on $\Im^+$ by means of an injective
∗-homomorphism. Moreover, the BMS invariant state of quantum field theory on $\Im^+$ induces a corresponding reference state in the bulk. It could be used to give a definition of particle based only upon asymptotic symmetries, regardless of whether the bulk admits any isometry group (see also [12]). Those results are, in particular, applied to 4D Minkowski spacetime where a nice interplay between Poincaré invariance in the bulk and BMS invariance on the boundary $\Im^+$ is established at the level of quantum field theories. Among other results, it arises that the above-mentioned injective ∗-homomorphism has a unitary implementation such that the Minkowski vacuum is mapped into the BMS invariant vacuum on $\Im^+$.
The outline of the paper is the following. In Sec. 2, we review the notion of asymptotically flat spacetime and of the Bondi–Metzner–Sachs group. Starting from these premises, a field living at null infinity $\Im^+$ is defined as a suitable limit of a bulk scalar field and the set of fields on $\Im^+$ is endowed with a symplectic structure. Finally, the quantum field theory for an uncharged scalar field living on $\Im^+$ is built up within the Weyl algebra approach and a preferred Fock representation is selected which also admits a suitable unitary representation of the BMS group. In Sec. 3, the theory of unitary and irreducible representations of the BMS group is discussed and quantum field theory on $\Im^+$ is defined along the lines of Wigner's analysis for the Poincaré invariant counterpart. Furthermore, it is shown that, at least for scalar fields, the approaches discussed in this and in the previous section are essentially equivalent provided one adopts a nuclear topology on the BMS group. In Sec. 4, the issue of a holographic correspondence is discussed for spacetimes satisfying a requirement, given in Proposition 2.5, weaker than strong asymptotic predictability.
We show that preservation of a certain symplectic form implies the existence of an injective ∗-homomorphism from the Weyl algebra of the fields in the bulk into that on $\Im^+$. Particular attention is devoted to the specific scenario in which the bulk is four-dimensional Minkowski spacetime. It arises that, in this case, the ∗-homomorphism admits unitary implementation and the Minkowski vacuum is mapped into the BMS invariant vacuum on $\Im^+$. Moreover, the standard unitary representation of the Poincaré group in the bulk is transformed into a suitable unitary representation of a subgroup of the BMS group on $\Im^+$, and the correspondence has a clear geometric interpretation. In Sec. 5, we present our conclusions with some comments about possible future developments and investigations. The Appendix contains the proofs of most of the statements in the paper.
1.2. Basic definitions and notations
In this paper, smooth means $C^\infty$ and we adopt the signature $(-, +, +, +)$ for the Lorentzian metric. The symbol $B \ltimes A$ will be reserved for a semidirect product of a pair of groups $(B, \cdot)$, $(A, *)$. We remind the reader that $B \ltimes A$ is defined as the group obtained by
the assignment, on the set of pairs $B \times A$, of the group product $(b, a) \odot (b', a') = (b \cdot b', a * \beta_b(a'))$, where $B \ni b \mapsto \beta_b$ is a fixed group representation of $B$ (it determines $\ltimes$) in terms of group automorphisms of $A$. $A$ turns out to be naturally isomorphic to the normal subgroup of $B \ltimes A$ made of the pairs $(I, a)$ with $a \in A$, $I$ denoting the unit element of $B$. The proper orthochronous Lorentz group will be denoted by $SO(3, 1)\uparrow$, while $ISO(3, 1) = SO(3, 1)\uparrow \ltimes T^4$ is the proper orthochronous Poincaré group with semidirect product structure induced by $(\Lambda, t) \odot (\Lambda', t') = (\Lambda\Lambda', t + \Lambda t')$.
In a manifold equipped with a Lorentzian metric, $\Box := \nabla_a \nabla^a$ indicates the d’Alembert operator referred to the Levi-Civita connection $\nabla_a$, $\pounds_\xi$ denotes the Lie derivative with respect to the vector field $\xi$ and $f^*$ the push-forward associated with the diffeomorphism $f$ acting on tensor fields of any fixed order. $C^\infty(M; N)$ and $C_c^\infty(M; N)$ indicate, respectively, the class of smooth functions and of compactly supported smooth functions $f : M \to N$. We omit $N$ in the notation if $N = \mathbb{R}$. $\lim_{\Im^+} f$ indicates a function on $\Im^+$ which is the smooth extension to $\Im^+$ of the function $f$ defined in $M$.
A spacetime is a four-dimensional smooth (Hausdorff second countable) manifold $M$ equipped with a Lorentzian metric $g$ assumed to be everywhere smooth; moreover, $M$ is supposed to be time-orientable and time-oriented. A vacuum spacetime is a spacetime satisfying the vacuum Einstein equations. In this paper, we make use of several properties of globally hyperbolic spacetimes as defined in [19, Chap. 8], employing the standard notations of [19] concerning causal sets. We adopt the notion of asymptotically flat at future null infinity vacuum spacetime presented in [18, 19].
A smooth spacetime $(M, g)$ is called asymptotically flat vacuum spacetime at future null infinity [18, 19] if it is a solution of the vacuum Einstein equations and the following requirements are fulfilled. There is a second smooth spacetime $(\tilde{M}, \tilde{g})$ such that $M$ turns out to be an open submanifold of $\tilde{M}$ with boundary $\Im^+ \subset \tilde{M}$. $\Im^+$ is an embedded submanifold of $\tilde{M}$ satisfying $\Im^+ \cap \tilde{J}^-(M) = \emptyset$. $(\tilde{M}, \tilde{g})$ is required to be strongly causal in a neighborhood of $\Im^+$ and it must hold that $\tilde{g}|_M = \Omega^2|_M \, g|_M$, where $\Omega \in C^\infty(\tilde{M})$ is strictly positive on $M$. On $\Im^+$, one must have $\Omega = 0$ and $d\Omega \neq 0$. Moreover, defining $n^a := \tilde{g}^{ab}\partial_b \Omega$, there must be a smooth function $\omega$, defined in $\tilde{M}$ with $\omega > 0$ on $M \cup \Im^+$, such that $\tilde{\nabla}_a(\omega^4 n^a) = 0$ on $\Im^+$ and the integral lines of $\omega^{-1} n$ are complete on $\Im^+$. Finally, the topology of $\Im^+$ must be that of $S^2 \times \mathbb{R}$. $\Im^+$ is called future infinity of $M$.
It is possible to strengthen the definition of asymptotically flat spacetime by requiring asymptotic flatness at both null infinities (including the past null infinity $\Im^-$, defined analogously to $\Im^+$) and at spatial infinity, given by a special point in $\tilde{M}$ indicated by $i^0$. The complete definition is due to Ashtekar (see [19, Chap. 11] for a general discussion). We stress that the results presented in this work do not require such a stronger definition: for the spacetimes considered in this work, the existence of $\Im^+$ is fully sufficient. Hence, throughout this paper asymptotically flat spacetime means asymptotically flat vacuum spacetime at future null infinity.
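The semidirect product law recalled in Sec. 1.2 for the Poincaré group, $(\Lambda, t)(\Lambda', t') = (\Lambda\Lambda', t + \Lambda t')$, can be checked numerically. The following is only a minimal sketch: the helper names (`matmul`, `matvec`, `product`, `boost_x`, `rot_z`, `close`) and the sample group elements are illustrative assumptions, not notation from the paper, and coordinates are ordered as $(t, x, y, z)$.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(A, v):
    """Action of a matrix on a vector."""
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def product(g1, g2):
    """Semidirect product law (L1, t1)(L2, t2) = (L1 L2, t1 + L1 t2)."""
    L1, t1 = g1
    L2, t2 = g2
    return (matmul(L1, L2), [x + y for x, y in zip(t1, matvec(L1, t2))])

def boost_x(chi):
    """Lorentz boost along x with rapidity chi (illustrative helper)."""
    ch, sh = math.cosh(chi), math.sinh(chi)
    return [[ch, sh, 0, 0], [sh, ch, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

def rot_z(theta):
    """Spatial rotation about the z axis by the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def close(g1, g2, tol=1e-9):
    """Entrywise comparison of two pairs (matrix, translation)."""
    (L1, t1), (L2, t2) = g1, g2
    flat1 = [x for row in L1 for x in row] + list(t1)
    flat2 = [x for row in L2 for x in row] + list(t2)
    return all(abs(x - y) < tol for x, y in zip(flat1, flat2))

# Arbitrary sample elements (assumed values, for illustration only).
g1 = (boost_x(0.3), [1.0, 0.0, 2.0, -1.0])
g2 = (rot_z(0.7), [0.5, 1.5, 0.0, 0.0])
g3 = (boost_x(-1.1), [0.0, -2.0, 1.0, 3.0])
identity = ([[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)],
            [0.0] * 4)

# Associativity of the product on the sample elements.
print(close(product(product(g1, g2), g3), product(g1, product(g2, g3))))
```

Pure translations $(I, a)$ compose additively under `product`, which reflects the identification of $A$ with the normal subgroup of pairs $(I, a)$.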
2. Scalar QFT on $\Im^+$
2.1. Asymptotic flatness, asymptotic Killing symmetries, BMS group and all that
Considering an asymptotically flat spacetime, the metric structures of $\Im^+$ are affected by a gauge freedom due to the possibility of changing the metric $\tilde{g}$ in a neighborhood of $\Im^+$ by a smooth and strictly positive factor $\omega$. It corresponds to the freedom involved in transformations $\Omega \to \omega\Omega$ in a neighborhood of $\Im^+$. The topology of $\Im^+$ (which is that of $\mathbb{R} \times S^2$) as well as the differentiable structure are not affected by the gauge freedom. Let us stress some features of this structure. Fixing $\Omega$, $\Im^+$ turns out to be the union of future-oriented integral lines of the field $n^a := \tilde{g}^{ab}\tilde{\nabla}_b \Omega$. This property is, in fact, invariant under gauge transformations, but the field $n$ depends on the gauge. For a fixed asymptotically flat vacuum spacetime $(M, g)$, the manifold $\Im^+$ together with its degenerate metric $\tilde{h}$ induced by $\tilde{g}$ and the field $n$ on $\Im^+$ form a triple which, under gauge transformations $\Omega \to \omega\Omega$, transforms as
$$\Im^+ \to \Im^+, \qquad \tilde{h} \to \omega^2 \tilde{h}, \qquad n \to \omega^{-1} n. \tag{2.1}$$
If $C$ denotes the class containing all of the triples $(\Im^+, \tilde{h}, n)$ transforming as in (2.1) for a fixed asymptotically flat vacuum spacetime $(M, g)$, there is no general physical principle which allows one to select a preferred element in $C$. Conversely, $C$ is universal for all asymptotically flat vacuum spacetimes in the following sense. If $C_1$ and $C_2$ are the classes of triples associated respectively with $(M_1, g_1)$ and $(M_2, g_2)$, there is a diffeomorphism $\gamma : \Im^+_1 \to \Im^+_2$ such that, for suitable $(\Im^+_1, \tilde{h}_1, n_1) \in C_1$ and $(\Im^+_2, \tilde{h}_2, n_2) \in C_2$,
$$\gamma(\Im^+_1) = \Im^+_2, \qquad \gamma^* \tilde{h}_1 = \tilde{h}_2, \qquad \gamma^* n_1 = n_2.$$
The proof of this statement relies on the following nontrivial result [19]. For any asymptotically flat vacuum spacetime $(M, g)$ (either $(M_1, g_1)$ or $(M_2, g_2)$ in particular) and any initial choice of $\Omega_0$, varying the latter with a judicious choice of the gauge $\omega$, one can always fix $\Omega := \omega\Omega_0$ in order that the metric $\tilde{g}$ associated with $\Omega$ satisfies
$$\tilde{g}|_{\Im^+} = -2\, du\, d\Omega + d\Sigma_{S^2}(x_1, x_2). \tag{2.2}$$
This formula uses the fact that, in a neighborhood of $\Im^+$, $(u, \Omega, x_1, x_2)$ define a meaningful coordinate system. $d\Sigma_{S^2}(x_1, x_2)$ is the standard metric on a unit 2-sphere (referred to arbitrarily fixed coordinates $x_1, x_2$) and $u \in \mathbb{R}$ is nothing but an affine parameter along the complete null geodesics forming $\Im^+$ itself with tangent vector $n = \partial/\partial u$. In these coordinates, $\Im^+$ is just the set of the points with $u \in \mathbb{R}$, $(x_1, x_2) \in S^2$ and, no matter the initial spacetime $(M, g)$ (either $(M_1, g_1)$ or $(M_2, g_2)$ in particular), one has finally the triple $(\Im^+, \tilde{h}_B, n_B) := (\mathbb{R} \times S^2, d\Sigma_{S^2}, \partial/\partial u)$.
The Bondi–Metzner–Sachs (BMS) group, $G_{BMS}$ [20–23], is the group of diffeomorphisms $\gamma : \Im^+ \to \Im^+$ which preserve the universal structure of $\Im^+$, i.e. $(\gamma(\Im^+), \gamma^*\tilde{h}, \gamma^* n)$ differs from $(\Im^+, \tilde{h}, n)$ at most by a gauge transformation (2.1). The following proposition holds [19].
Proposition 2.1. The one-parameter group of diffeomorphisms generated by a smooth vector field $\xi'$ on $\Im^+$ is a subgroup of $G_{BMS}$ if and only if the following holds. $\xi'$ can be extended smoothly to a field $\xi$ (generally not unique) defined in $M$ in some neighborhood of $\Im^+$ such that $\Omega^2 \pounds_\xi g$ has a smooth extension to $\Im^+$ and $\Omega^2 \pounds_\xi g \to 0$ approaching $\Im^+$.
The requirement $\Omega^2 \pounds_\xi g \to 0$ approaching $\Im^+$ is the best approximation of the Killing requirement $\pounds_\xi g = 0$ for a generic asymptotically flat spacetime which does not admit proper Killing symmetries. In this sense, the BMS group describes asymptotic null Killing symmetries valid for all asymptotically flat vacuum spacetimes.
Remark 2.2. (1) Notice that the BMS group is smaller than the group of gauge transformations in Eq. (2.1) because not all those transformations can be induced by diffeomorphisms of $\Im^+$. On the other hand, the restriction of the gauge group to those transformations induced by diffeomorphisms allows one to view the BMS group as a group of asymptotic Killing symmetries. Henceforth, whenever it is not explicitly stated otherwise, we consider as admissible realizations of the unphysical metric on $\Im^+$ only those metrics $\tilde{h}$ which can be reached through transformations of the BMS group (i.e. through asymptotic symmetries) from a metric whose associated triple is $(\Im^+, \tilde{h}_B, n_B)$.
(2) Therefore, $\tilde{h}$ in general may not coincide with the initial metric induced by $\tilde{g}$ on $\Im^+$, but a further factor $\omega$, defined in a neighborhood of $\Im^+$ and strictly positive on $\Im^+$, may take place.$^a$ In this sense, the freedom allowed by rescaling with factors $\omega$ is larger than the freedom involved in re-defining the unphysical metric $\tilde{g}$ on the whole unphysical spacetime $\tilde{M}$.
To give an explicit representation of $G_{BMS}$, we need a suitable coordinate frame on $\Im^+$. Having fixed the triple $(\Im^+, \tilde{h}_B, n_B)$, one is still free to select an arbitrary coordinate frame on the sphere and, using the parameter $u$ of the integral curves of $n_B$ to complete the coordinate system, one is free to fix the origin of $u$ depending on $\zeta, \bar{\zeta}$ in general. Taking advantage of the stereographic projection, one may adopt complex coordinates $(\zeta, \bar{\zeta})$ on the (Riemann) sphere, $\zeta = e^{i\varphi}\cot(\vartheta/2)$, $\varphi, \vartheta$ being the usual spherical coordinates. Coordinates $(u, \zeta, \bar{\zeta})$ on $\Im^+$ define a Bondi frame when $(\zeta, \bar{\zeta}) \in \mathbb{C} \times \mathbb{C}$ are complex stereographic coordinates on $S^2$, $u \in \mathbb{R}$ (with the origin fixed arbitrarily) is the parameter of the integral curves of $n$ and $(\Im^+, \tilde{h}, n) = (\Im^+, \tilde{h}_B, n_B)$.
$^a$ In case the spacetime is, more strongly, asymptotically flat at future and past null infinity and spatial infinity [19], $\omega\Omega$ could have singular behavior at spatial infinity $i^0 \in \tilde{M}$, which does not belong to $\Im^+$ by definition; see the footnote on p. 279 of [19] for details.
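In this frame, the set $G_{BMS}$ is nothing but $SO(3, 1)\uparrow \times C^\infty(S^2)$, and $(\Lambda, f) \in SO(3, 1)\uparrow \times C^\infty(S^2)$ acts on $\Im^+$ as [24]
$$u \to u' := K_\Lambda(\zeta, \bar{\zeta})\,(u + f(\zeta, \bar{\zeta})), \tag{2.3}$$
$$\zeta \to \zeta' := \Lambda\zeta := \frac{a_\Lambda \zeta + b_\Lambda}{c_\Lambda \zeta + d_\Lambda}, \qquad \bar{\zeta} \to \bar{\zeta}' := \Lambda\bar{\zeta} := \frac{\bar{a}_\Lambda \bar{\zeta} + \bar{b}_\Lambda}{\bar{c}_\Lambda \bar{\zeta} + \bar{d}_\Lambda}, \tag{2.4}$$
where
$$K_\Lambda(\zeta, \bar{\zeta}) := \frac{1 + \zeta\bar{\zeta}}{(a_\Lambda \zeta + b_\Lambda)(\bar{a}_\Lambda \bar{\zeta} + \bar{b}_\Lambda) + (c_\Lambda \zeta + d_\Lambda)(\bar{c}_\Lambda \bar{\zeta} + \bar{d}_\Lambda)} \quad \text{and} \quad \begin{pmatrix} a_\Lambda & b_\Lambda \\ c_\Lambda & d_\Lambda \end{pmatrix} = \Pi^{-1}(\Lambda). \tag{2.5}$$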
$\Pi$ is the well-known surjective covering homomorphism $SL(2, \mathbb{C}) \to SO(3, 1)\uparrow$. Thus, the matrix of coefficients $a_\Lambda, b_\Lambda, c_\Lambda, d_\Lambda$ is an arbitrary element of $SL(2, \mathbb{C})$ determined by $\Lambda$ up to an overall sign. However, $K_\Lambda$ and the right-hand sides of (2.4) are manifestly independent of the choice of such a sign. It is clear from (2.4) and (2.5) that $G_{BMS}$ can be viewed as the semidirect product of $SO(3, 1)\uparrow$ and the abelian additive group $C^\infty(S^2)$, the group product depending on the Bondi frame used. The elements of the latter, viewed as a subgroup, are called supertranslations. In particular, if $\odot$ denotes the product in $G_{BMS}$, $\circ$ denotes the composition of functions, $\cdot$ denotes the pointwise product of scalar functions and $\Lambda$ acts on $(\zeta, \bar{\zeta})$ as in the right-hand sides of (2.4):
$$K_{\Lambda'}(\Lambda(\zeta, \bar{\zeta}))\, K_\Lambda(\zeta, \bar{\zeta}) = K_{\Lambda'\Lambda}(\zeta, \bar{\zeta}), \tag{2.6}$$
$$(\Lambda', f') \odot (\Lambda, f) = \left(\Lambda'\Lambda,\; f + (K_{\Lambda^{-1}} \circ \Lambda) \cdot (f' \circ \Lambda)\right). \tag{2.7}$$
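The identities (2.6) and (2.7) can be probed numerically at a sample point of a Bondi frame. The sketch below is hedged: it represents a Lorentz transformation by one of its two $SL(2, \mathbb{C})$ preimages, stored as a tuple `(a, b, c, d)`, and a supertranslation by a real-valued Python function of $\zeta$. All names (`sl2c`, `mat_mul`, `mat_inv`, `mobius`, `K`, `act`, `compose`) and all numerical values are illustrative assumptions, not notation or data from the paper.

```python
def sl2c(a, b, c):
    """Return (a, b, c, d) with d fixed by ad - bc = 1 (requires a != 0)."""
    return (a, b, c, (1 + b * c) / a)

def mat_mul(A, B):
    """Product of two 2x2 matrices stored as tuples (a, b, c, d)."""
    a, b, c, d = A
    e, f, g, h = B
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)

def mat_inv(A):
    """Inverse of a 2x2 matrix of unit determinant."""
    a, b, c, d = A
    return (d, -b, -c, a)

def mobius(A, z):
    """Action on the stereographic coordinate, as in (2.4)."""
    a, b, c, d = A
    return (a * z + b) / (c * z + d)

def K(A, z):
    """Conformal factor K_Lambda, as in (2.5)."""
    a, b, c, d = A
    return (1 + abs(z) ** 2) / (abs(a * z + b) ** 2 + abs(c * z + d) ** 2)

def act(g, point):
    """Action (2.3)-(2.4) of g = (A, f) on a point (u, zeta)."""
    A, f = g
    u, z = point
    return (K(A, z) * (u + f(z)), mobius(A, z))

def compose(g2, g1):
    """Product g2 (.) g1 according to (2.7)."""
    A2, f2 = g2
    A1, f1 = g1
    A1_inv = mat_inv(A1)

    def f(z):
        w = mobius(A1, z)
        return f1(z) + K(A1_inv, w) * f2(w)

    return (mat_mul(A2, A1), f)

# Sample data (assumed): two SL(2,C) matrices and two smooth real
# functions on the sphere written in the stereographic coordinate.
A1 = sl2c(1 + 1j, 0.5, 0.3j)
A2 = sl2c(0.8 - 0.2j, -1j, 0.4)
f1 = lambda z: (1 - abs(z) ** 2) / (1 + abs(z) ** 2)
f2 = lambda z: 2 * z.real / (1 + abs(z) ** 2)
g1, g2 = (A1, f1), (A2, f2)
u0, z0 = 1.25, 0.3 + 0.7j

# (2.6): cocycle property of the conformal factor.
cocycle_gap = abs(K(A2, mobius(A1, z0)) * K(A1, z0) - K(mat_mul(A2, A1), z0))

# (2.7): acting with g1 and then g2 agrees with acting with g2 (.) g1.
u_seq, z_seq = act(g2, act(g1, (u0, z0)))
u_prod, z_prod = act(compose(g2, g1), (u0, z0))
print(cocycle_gap < 1e-9, abs(u_seq - u_prod) < 1e-9, abs(z_seq - z_prod) < 1e-9)
```

The factor $K_{\Lambda^{-1}} \circ \Lambda$ in (2.7) enters `compose` through `K(A1_inv, w)`; by (2.6) it equals $1/K_\Lambda$, which is what makes the two ways of acting agree.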
Remark 2.3. We underline that in the literature the factor $K_\Lambda$ does not always have the same definition. In particular, in [25–29]
$$K_\Lambda(\zeta, \bar{\zeta}) := \frac{(a_\Lambda \zeta + b_\Lambda)(\bar{a}_\Lambda \bar{\zeta} + \bar{b}_\Lambda) + (c_\Lambda \zeta + d_\Lambda)(\bar{c}_\Lambda \bar{\zeta} + \bar{d}_\Lambda)}{1 + \zeta\bar{\zeta}},$$
but in this paper we stick to the definition (2.5) as in [24, 30], adapting accordingly the calculations and results from the above-mentioned references.
The following proposition arises from the definition of Bondi frame and the equations above.
Proposition 2.4. Let $(u, \zeta, \bar{\zeta})$ be a Bondi frame on $\Im^+$. The following holds.
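(a) A global coordinate frame $(u', \zeta', \bar{\zeta}')$ on $\Im^+$ is a Bondi frame if and only if
$$u = u' + g(\zeta', \bar{\zeta}'), \tag{2.8}$$
$$\zeta = \frac{a_R \zeta' + b_R}{c_R \zeta' + d_R}, \qquad \bar{\zeta} = \frac{\bar{a}_R \bar{\zeta}' + \bar{b}_R}{\bar{c}_R \bar{\zeta}' + \bar{d}_R}, \tag{2.9}$$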
for $g \in C^\infty(S^2)$ and $R \in SO(3)$, referring to the canonical inclusion $SO(3) \subset SO(3, 1)\uparrow$ (i.e. the canonical inclusion $SU(2) \subset SL(2, \mathbb{C})$ for matrices of coefficients $(a_\Lambda, b_\Lambda, c_\Lambda, d_\Lambda)$ in (2.5)).
(b) The functions $K_\Lambda$ are smooth on the Riemann sphere $S^2$. Furthermore, $K_\Lambda(\zeta, \bar{\zeta}) = 1$ for all $(\zeta, \bar{\zeta})$ if and only if $\Lambda \in SO(3)$.
(c) Let $(u', \zeta', \bar{\zeta}')$ be another Bondi frame as in (a). If $\gamma \in G_{BMS}$ is represented by $(\Lambda, f)$ in $(u, \zeta, \bar{\zeta})$, the same $\gamma$ is represented by $(\Lambda', f')$ in $(u', \zeta', \bar{\zeta}')$ with
$$(\Lambda', f') = (R, g)^{-1} \odot (\Lambda, f) \odot (R, g). \tag{2.10}$$
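Statement (b) of Proposition 2.4 can be illustrated numerically with the definition (2.5) of $K_\Lambda$. The following minimal sketch evaluates $K_\Lambda$ for an $SU(2)$ matrix, where it should equal 1 at every point, and for a diagonal $SL(2, \mathbb{C})$ matrix (a boost along the polar axis), where it does not. The function names `K`, `su2` and `boost`, and the sample values, are illustrative assumptions only.

```python
import math

def K(A, z):
    """Conformal factor K_Lambda of (2.5) for A = (a, b, c, d) in SL(2,C)."""
    a, b, c, d = A
    return (1 + abs(z) ** 2) / (abs(a * z + b) ** 2 + abs(c * z + d) ** 2)

def su2(alpha, beta):
    """SU(2) matrix built from two complex numbers, normalized so that
    |alpha|^2 + |beta|^2 = 1 (illustrative parametrization)."""
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    alpha, beta = complex(alpha) / norm, complex(beta) / norm
    return (alpha, beta, -beta.conjugate(), alpha.conjugate())

def boost(chi):
    """Diagonal SL(2,C) matrix diag(e^{chi/2}, e^{-chi/2})."""
    return (math.exp(chi / 2), 0.0, 0.0, math.exp(-chi / 2))

samples = [0.0, 0.3 + 0.7j, -2.0 + 0.1j, 5.0j]

# For a rotation the factor is identically 1 on the sample points.
rotation = su2(1 + 2j, 0.5 - 1j)
print(all(abs(K(rotation, z) - 1) < 1e-12 for z in samples))

# For a boost the factor depends on the point of the sphere.
print([K(boost(0.5), z) for z in samples])
```

This only samples a few points and one matrix of each kind; it illustrates the "if" direction of (b) and gives one counterexample outside $SO(3)$, not a proof of the equivalence.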
2.2. Space of fields with BMS representations
Let us consider QFT on $\Im^+$ developed in the way presented in [15–17], where QFT on null hypersurfaces was investigated in the case of Killing horizons. $\Im^+$ is not a Killing horizon but the theory can be adapted to this case with minor changes. The procedure we are going to introduce is similar to that sketched in [31] for the graviton field. First of all, we fix a relation between scalar fields $\phi$ in $(M, g)$ and scalar fields $\psi$ on $\Im^+$. The idea is to consider the fields $\psi$ as re-arranged smooth restrictions to $\Im^+$ of fields $\phi$. Simple restrictions make no sense because $\Im^+$ does not belong to $M$. We expect that a good definition of the fields $\psi$ is a suitable smooth limit to $\Im^+$ of products $\Omega^\alpha \phi$ for some fixed real exponent $\alpha$. A strong suggestion for the value of $\alpha$ is given by the following proposition. (Below, $\tilde{\Box}$ is the d’Alembert operator referred to $\tilde{g}$, and $R$ and $\tilde{R}$ are the scalar curvatures on $M$ and $\tilde{M}$, respectively.)
Proposition 2.5. Assume that $(M, g)$ is asymptotically flat with associated unphysical spacetime $(\tilde{M}, \tilde{g})$ with $\tilde{g}|_M = \Omega^2 g$. Suppose that there is an open set $\tilde{V} \subset \tilde{M}$ with $\overline{M \cap J^-(\Im^+)} \subset \tilde{V}$ (the closure being referred to $\tilde{M}$) such that $(\tilde{V}, \tilde{g})$ is globally hyperbolic, so that $(M \cap \tilde{V}, g)$ is globally hyperbolic, too. If $\phi : M \cap \tilde{V} \to \mathbb{C}$ has compactly supported Cauchy data on some Cauchy surface of $M \cap \tilde{V}$ and satisfies the massless conformal Klein–Gordon equation,
$$\Box\phi - \frac{1}{6} R \phi = 0, \tag{2.11}$$
the following facts hold:
(a) the field $\tilde{\phi} := \Omega^{-1}\phi$ can be extended uniquely into a smooth solution in $(\tilde{V}, \tilde{g})$ of
$$\tilde{\Box}\tilde{\phi} - \frac{1}{6}\tilde{R}\tilde{\phi} = 0; \tag{2.12}$$
(b) for every smooth positive factor $\omega$ defined in a neighborhood of $\Im^+$ used to rescale $\Omega \to \omega\Omega$ in such a neighborhood, $(\omega\Omega)^{-1}\phi$ extends to a smooth field $\psi$ on $\Im^+$ uniquely.
We have assumed the possibility of having $R \neq 0$ in $M$ because, as noticed in [19], all we said in Sec. 2.1 holds true dropping the hypothesis that the spacetime $(M, g)$ is a vacuum Einstein solution, but requiring that the stress-energy tensor $T$ is such that $\Omega^{-2} T$ is smooth on $\Im^+$. A simple and well-known example of the application of the theorem is given by Minkowski spacetime, but also Schwarzschild spacetime fulfills these hypotheses (more precisely, the hypotheses are satisfied for regions of the cited spacetimes in the future of a fixed suitable spacelike Cauchy surface).
Fig. 1. Manifolds $M$ and $\tilde{M}$ as in Proposition 2.5.
Proof. In this proof, we define $M_{\tilde{V}} := M \cap \tilde{V}$, and a tilde written on a causal set indicates that the metric $\tilde{g}$ is employed; otherwise the metric used is $g$. (In Fig. 1, for the sake of simplicity, it has been assumed that $\tilde{V} \supset M$ so that $M_{\tilde{V}} = M$.) Notice that $\tilde{J}^-(M) \cap \Im^+ = \emptyset$, so that $J^-(p; M_{\tilde{V}}) = \tilde{J}^-(p; \tilde{V})$ if $p \in M_{\tilde{V}}$. $(M_{\tilde{V}}, g)$ is globally hyperbolic because it is strongly causal and the sets $J^-(p; M_{\tilde{V}}) \cap J^+(q; M_{\tilde{V}})$ are compact for $p, q \in M_{\tilde{V}}$ (see [19, Sec. 8]). Indeed, $(\tilde{V}, \tilde{g})$ is strongly causal and thus $(M_{\tilde{V}}, g)$ is strongly causal; moreover, if $p, q \in M_{\tilde{V}}$, $J^-(p; M_{\tilde{V}}) \cap J^+(q; M_{\tilde{V}})$ is compact because $J^-(p; M_{\tilde{V}}) \cap J^+(q; M_{\tilde{V}}) = \tilde{J}^-(p; \tilde{V}) \cap \tilde{J}^+(q; \tilde{V})$ and $\tilde{J}^-(p; \tilde{V}) \cap \tilde{J}^+(q; \tilde{V})$ is compact since $(\tilde{V}, \tilde{g})$ is globally hyperbolic. As a consequence, we can use in $M_{\tilde{V}}$ (but also in $\tilde{V}$) standard results on solutions of the Klein–Gordon equation with compactly supported Cauchy data in globally hyperbolic spacetimes [19].
(a) Let $S$ be a spacelike Cauchy surface for $(M_{\tilde{V}}, g)$. It is known [19] that, in any open subset of $M$ and under the only hypothesis $\tilde{g} = \Omega^2 g$, (2.11) is valid for $\phi$ if and only if (2.12) is valid for $\tilde{\phi} := \Omega^{-1}\phi$. The main idea of the proof is to associate $\phi$ with Cauchy data for $\tilde{\phi}$ on a suitable Cauchy surface of the larger spacetime $(\tilde{V}, \tilde{g})$, so that the unique maximal solution $\tilde{\Phi}$ of (2.12), uniquely determined in $(\tilde{V}, \tilde{g})$ by those Cauchy data, on the one hand is well defined on $\Im^+ \subset \tilde{V}$ and on the other hand is a smooth extension of $\Omega^{-1}\phi$, initially defined in $M_{\tilde{V}}$ only. Let $K_S$ be the compact support of the Cauchy data of $\phi$ on $S$.
As $\tilde{V}$ is homeomorphic to the product manifold $\mathbb{R} \times \Sigma$, $\mathbb{R}$ denoting a global time coordinate on $\tilde{V}$ and $\Sigma$ being a spacelike Cauchy surface of $\tilde{V}$, one can fix $\Sigma$ in the past of the compact set $K_S$. Since $K_S$ is compact and the class of the open sets $\tilde{I}^-(p; \tilde{V}) \cap \tilde{I}^+(q; \tilde{V})$ with $p, q \in M_{\tilde{V}}$ is a basis of the topology of $M_{\tilde{V}}$, it is possible to determine a finite number of points $p_1, \ldots, p_n \in M_{\tilde{V}}$ in the future of $K_S$ in order that $\cup_i I^-(p_i; M_{\tilde{V}}) \supset K_S$. In this way, one also has $\cup_i J^-(p_i; M_{\tilde{V}}) = \cup_i \tilde{J}^-(p_i; \tilde{V}) \supset K_S$.
On the other hand, as is well known, $\cup_i \tilde{J}^-(p_i; \tilde{V}) \cap D^+(\Sigma)$ is compact and, in particular, $K_\Sigma := \cup_i \tilde{J}^-(p_i; \tilde{V}) \cap \Sigma = \cup_i J^-(p_i; M_{\tilde{V}}) \cap \Sigma$ is compact too, being a closed subset of a compact set. Notice that, outside $J^-(K_S; M_{\tilde{V}}) \cup J^+(K_S; M_{\tilde{V}})$, the field $\phi$ vanishes in $M_{\tilde{V}}$. Thus, we are naturally led to consider compactly supported (in $K_\Sigma$) Cauchy data on $\Sigma$ for Eq. (2.12), obtained by restriction of $\Omega^{-1}\phi$ and its derivatives to $\Sigma$. Let $\tilde{\Phi}$ be the unique solution of (2.12) in the whole globally hyperbolic spacetime $(\tilde{V}, \tilde{g})$ associated with those Cauchy data on $\Sigma$. By construction, $\tilde{\Phi}$ must be an extension to $(\tilde{V}, \tilde{g})$ of $\tilde{\phi}$ defined in $M_{\tilde{V}}$ (more precisely in $\tilde{D}^+(\Sigma; \tilde{V}) \cap M_{\tilde{V}} = D^+(\Sigma \cap M_{\tilde{V}}; M_{\tilde{V}})$), since they satisfy the same equation and have the same Cauchy data on $\Sigma$. The proof concludes by noticing that $\Im^+ \subset \tilde{V}$ and thus $\psi := \tilde{\Phi}|_{\Im^+}$ is, in fact, a smooth extension to $\Im^+$ of $\tilde{\phi}$.
(b) The case with $\omega \neq 1$ is now a trivial consequence of what was proved above, replacing $\Omega$ with $\omega\Omega$ in the considered neighborhood of $\Im^+$ where $\omega > 0$.
Remark 2.6. We remind the reader that an asymptotically flat spacetime at null and spacelike infinity [19] (M, g) is said to be strongly asymptotically predictable in the sense of [19] if, in the associated unphysical spacetime, there is an open set Ṽ ⊂ M̃ containing the closure of M ∩ J⁻(ℑ⁺) (the closure being referred to M̃, so that i⁰ ∈ Ṽ even though, by definition, i⁰ ∉ ℑ⁺) such that (Ṽ, g̃) is globally hyperbolic. Minkowski spacetime is such [19]. For those spacetimes in particular, the proposition above applies.

We now define a field theory on ℑ⁺ (thought of as a pure differentiable manifold) based on smooth scalar fields ψ and assuming G_BMS as the natural symmetry group. The latter assumption is made in order to give some physical interpretation of the theory, since physical information is invariant under G_BMS, as said above. In particular, we have to handle the presence of a metrical structure on ℑ⁺ which is not invariant under the BMS group. The field theory should be viewed, more appropriately, as QFT on the class of all the triples (ℑ⁺, h̃, n) connected with (ℑ⁺, h̃_B, n_B) by the transformations of G_BMS. In this way, one takes asymptotic Killing symmetries into account. Therefore, we need a representation G_BMS ∋ γ ↦ A_γ in terms of transformations A_γ : C∞(ℑ⁺; ℂ) → C∞(ℑ⁺; ℂ). The naive idea is to define such an action as the push-forward on scalar fields of diffeomorphisms γ ∈ G_BMS, i.e. A_γ := γ*. However, this is not a very satisfactory idea if one wants to maintain the possibility of interpreting some of the fields ψ as extensions to ℑ⁺ of fields (ωΩ)^α φ defined in the bulk. Proposition 2.1 shows that there are one-parameter (local) groups of diffeomorphisms {γ_t} in the physical spacetime (in general, not preserving (2.11)) which induce one-parameter subgroups {γ′_t} of G_BMS. A natural requirement on the wanted representation A^(α) is that, for a scalar field φ on M such that (ωΩ)^α φ admits a smooth extension ψ to ℑ⁺,

$$A^{(\alpha)}_{\gamma'_t}\psi = \lim_{\to\Im^+}(\omega\Omega)^{\alpha}\,\gamma_t^{*}(\phi) \tag{2.13}$$
for every (local) one-parameter group of diffeomorphisms {γ_t} generated by any vector field ξ as in Proposition 2.1, and for every value t of the associated (local) one-parameter group of diffeomorphisms. We have the following result, whose proof is in the Appendix.

Proposition 2.7. Assume that (M, g) is asymptotically flat with associated unphysical spacetime (M̃, g̃) (with g̃|_M = Ω²g). Fix ω > 0 in a neighborhood of ℑ⁺ such that ωg̃ is associated with the triple (ℑ⁺, h̃_B, n_B). Consider, for a fixed α ∈ ℝ, a representation G_BMS ∋ γ ↦ A^(α)_γ in terms of transformations A^(α)_γ : C∞(ℑ⁺; ℂ) → C∞(ℑ⁺; ℂ) such that t ↦ A^(α)_{γ_t}ψ₀ is smooth for every fixed ψ₀ and every fixed one-parameter subgroup {γ_t} of G_BMS. Finally, assume that (2.13) holds for any ψ obtained as a smooth extension to ℑ⁺ of (ωΩ)^α φ, φ ∈ C∞(M; ℂ). Then, in any Bondi frame,

$$\left(A^{(\alpha)}_{(\Lambda,f)}\psi\right)(u',\zeta',\bar\zeta') := K_\Lambda(\zeta,\bar\zeta)^{-\alpha}\,\psi(u,\zeta,\bar\zeta), \tag{2.14}$$

for any (Λ, f) ∈ G_BMS and referring to (2.3)–(2.5).

From (2.6), Eq. (2.14) defines, in fact, a representation of G_BMS when assumed valid on all the fields ψ ∈ C∞(ℑ⁺; ℂ) or on some BMS-invariant subspace of C∞(ℑ⁺; ℂ) such as C_c∞(ℑ⁺; ℂ) or similar. From now on, we assume that the action of G_BMS on scalar fields ψ ∈ C∞(ℑ⁺; ℂ) is given by a representation A^(α) : G_BMS ∋ γ ↦ A^(α)_γ defined in (2.14) with α fixed. Transformations (2.14) are well known and used in the literature [30]. We stress that our interpretation of A^(α)_{(Λ,f)} is active here; in particular, the fields ψ are scalar fields and thus they transform as usual scalar fields under changes of coordinates, related or not by a BMS transformation (passive transformations). Using Proposition 2.4, (2.5) in particular, the reader can easily prove the following result.

Proposition 2.8. Consider two Bondi frames B and B′ on ℑ⁺. Take γ ∈ G_BMS and represent it as (Λ, f) and (Λ′, f′) in B and B′ respectively (so that (2.10) holds). Acting on a scalar field ψ, A^(α)_{(Λ,f)} and A^(α)_{(Λ′,f′)} produce the same transformed scalar field.

The proposition says that the representation defined in Proposition 2.7 does not depend on the particular Bondi frame used to represent ℑ⁺, but depends only on the diffeomorphisms γ ∈ G_BMS identified by the pairs (Λ, f) in the Bondi frame used to make the representation explicit. In this way, we are given a unique representation G_BMS ∋ γ ↦ A^(α)_γ, not depending on the Bondi frame used, which can be represented as in (2.14) when a Bondi frame is selected.

2.3. BMS-invariant symplectic form

As a second step we introduce the space of (real) wave functions on ℑ⁺, S(ℑ⁺). In a fixed Bondi frame S(ℑ⁺) is the real linear space of the smooth functions
ψ : ℑ⁺ → ℝ such that ψ itself and all of its derivatives in any variable vanish as |u| → +∞, uniformly in ζ, ζ̄, faster than |u|⁻ᵏ for every natural k. It is easy to prove that S(ℑ⁺) actually does not depend on the Bondi frame used (use Proposition 2.4 and the fact that the functions f are continuous and thus bounded on the compact S²). Obviously, C_c∞(ℑ⁺) ⊂ S(ℑ⁺), and it is easy to prove that S(ℑ⁺) is invariant under the representation A^(1) of G_BMS defined in the previous section. The following result shows that S(ℑ⁺) can be equipped with a symplectic form invariant under the action of the BMS group. That symplectic form was also studied in [23, 17].

Theorem 2.9. Consider the representations A^(α) on C∞(ℑ⁺; ℂ) of G_BMS introduced above and the map σ : S(ℑ⁺) × S(ℑ⁺) → ℝ,

$$\sigma(\psi_1,\psi_2) := \int_{\mathbb{R}\times\mathbb{S}^2}\left(\psi_2\frac{\partial\psi_1}{\partial u} - \psi_1\frac{\partial\psi_2}{\partial u}\right) du\wedge\epsilon_{\mathbb{S}^2}(\zeta,\bar\zeta), \tag{2.15}$$

(u, ζ, ζ̄) being a Bondi frame on ℑ⁺ and ε_{S²} being the standard volume form of the unit 2-sphere,

$$\epsilon_{\mathbb{S}^2}(\zeta,\bar\zeta) := \frac{2\,d\zeta\wedge d\bar\zeta}{i(1+\zeta\bar\zeta)^2}. \tag{2.16}$$
The following holds:

(a) σ is a nondegenerate symplectic form on S(ℑ⁺) (i.e. it is bilinear, antisymmetric and σ(ψ₁, ψ₂) = 0 for all ψ₁ ∈ S(ℑ⁺) implies ψ₂ = 0), independently of the Bondi frame used.

(b) S(ℑ⁺) is invariant under every representation A^(α), whereas σ is invariant under A^(1).

Proof. (a) can be proved by direct inspection, using Proposition 2.4 to check the independence of the Bondi frame used and taking advantage of the fact that ε_{S²}(ζ, ζ̄) is invariant under three-dimensional rotations. Invariance of S(ℑ⁺) under A^(α) can be established immediately using the fact that the functions f in (2.3) and the functions K_Λ in (2.5) and (2.14) are bounded. Let us prove the nontrivial part of item (b). Writing ψ′_i for the field ψ_i transformed by A^(1)_{(Λ,f)}, one has

$$\sigma(\psi'_1,\psi'_2) = \int_{\mathbb{R}\times\mathbb{S}^2}\left(\psi'_2\frac{\partial\psi'_1}{\partial u'} - \psi'_1\frac{\partial\psi'_2}{\partial u'}\right) du'\wedge\epsilon_{\mathbb{S}^2}(\zeta',\bar\zeta').$$

Now, we can use (2.14) together with the known relation

$$\epsilon_{\mathbb{S}^2}(\zeta',\bar\zeta') = K_\Lambda(\zeta,\bar\zeta)^2\,\epsilon_{\mathbb{S}^2}(\zeta,\bar\zeta),$$

obtaining

$$\sigma(\psi'_1,\psi'_2) = \int_{\mathbb{R}\times\mathbb{S}^2}\left(\psi_2\frac{\partial\psi_1}{\partial u} - \psi_1\frac{\partial\psi_2}{\partial u}\right) du\wedge\epsilon_{\mathbb{S}^2}(\zeta,\bar\zeta),$$

which is the thesis.
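The relation between the two volume forms used in the proof can be checked numerically. The following is a minimal sketch, not part of the original argument, under two assumptions: that Λ acts on ζ through the Möbius map ζ′ = (aζ + b)/(cζ + d) with ad − bc = 1, and that K_Λ has the explicit form reported later in (3.3). The function names (`mobius`, `K`, `sphere_density`, `pulled_back_density`, `sl2c`) are hypothetical. Since dζ′/dζ = (cζ + d)⁻², the identity reduces to a relation between the densities 1/(1 + |ζ|²)².

```python
# Numerical sanity check of eps(zeta') = K(zeta)^2 * eps(zeta) under a Moebius map.
# Assumptions (not taken verbatim from the text): Lambda acts as
# zeta' = (a*zeta + b)/(c*zeta + d) with a*d - b*c = 1, and K has the form (3.3).

def mobius(A, z):
    """Image of z under the Moebius map associated with A = (a, b, c, d)."""
    a, b, c, d = A
    return (a * z + b) / (c * z + d)

def K(A, z):
    """Conformal factor K_A(z), written as in (3.3)."""
    a, b, c, d = A
    return (1 + abs(z) ** 2) / (abs(a * z + b) ** 2 + abs(c * z + d) ** 2)

def sphere_density(z):
    """Coefficient of (2/i) dz ^ dzbar in the unit-sphere volume form (2.16)."""
    return 1.0 / (1 + abs(z) ** 2) ** 2

def pulled_back_density(A, z):
    """Density of eps at z' = A z, expressed in the unprimed coordinate z.

    dz' ^ dzbar' = |dz'/dz|^2 dz ^ dzbar, with dz'/dz = 1/(c z + d)^2 when det A = 1.
    """
    a, b, c, d = A
    jacobian = 1.0 / abs(c * z + d) ** 4
    return jacobian * sphere_density(mobius(A, z))

def sl2c(a, b, c):
    """Complete (a, b, c) to a unit-determinant matrix (requires a != 0)."""
    return (a, b, c, (1 + b * c) / a)

if __name__ == "__main__":
    A = sl2c(1 + 0.5j, 0.3 - 0.2j, -0.4 + 0.1j)
    for z in (0.2 + 0.1j, -1.3 + 0.7j, 2.0 - 3.0j):
        gap = abs(pulled_back_density(A, z) - K(A, z) ** 2 * sphere_density(z))
        print(z, gap)  # the gap should be at the level of rounding errors
```

With this relation in hand, the invariance follows by counting powers of K_Λ: each transformed field contributes K_Λ⁻¹ by (2.14) with α = 1, the volume form contributes K_Λ², and the combination of ∂/∂u′ with du′ is unchanged by any rescaling of u at fixed (ζ, ζ̄).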
Remark 2.10. From now on, the restriction of A^(1)_γ to the invariant space S(ℑ⁺) is indicated by A_γ; similarly, A denotes the representation G_BMS ∋ γ ↦ A_γ.

2.4. Weyl algebraic quantization and Fock representation

As the third and last step, we define QFT on ℑ⁺ for uncharged scalar fields in the Weyl approach, also giving a preferred Fock space representation. The formulation of real scalar QFT on the degenerate manifold ℑ⁺ we present here is an almost straightforward adaptation of the theory presented in [17] (see Sec. 4.2 for the corresponding theory in general curved spacetime [32]). As S(ℑ⁺) is a real vector space equipped with a nondegenerate symplectic form σ, there exists a complex C*-algebra ([33, Theorem 5.2.8]) generated by nonvanishing elements W(ψ), with ψ ∈ S(ℑ⁺), satisfying, for all ψ, ψ′ ∈ S(ℑ⁺),

(W1) W(−ψ) = W(ψ)*,

(W2) W(ψ)W(ψ′) = e^{iσ(ψ,ψ′)/2} W(ψ + ψ′).
That C ∗ -algebra, indicated by W(+ ), is unique up to (isometric) ∗ -isomorphisms ([33, Theorem 5.2.8]). As consequences of (W1) and (W2), W(+ ) admits unit I = W (0), each W (ψ) is unitary and, from the nondegenerateness of σ, W (ψ) = W (ψ1 ) if and only if ψ = ψ1 . W(+ ) is called Weyl algebra associated with S(+ ) and σ whereas the W (ψ) are called (abstract) Weyl operators. The formal interpretation of elements W (ψ) is W (ψ) ≡ eiΨ(ψ) where Ψ(ψ) are symplectically smeared field operators as we shall see shortly. The definition of σ entails straightforward implementation of locality principle: [W (ψ1 ), W (ψ2 )] = 0 if
(supp ψ1 ) ∩ (supp ψ2 ) = ∅.
(2.17)
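The step from (W2) to (2.17) can be spelled out as follows. This is a short worked derivation added for convenience; it uses only (W2), the antisymmetry of σ and the explicit form (2.15).

```latex
\begin{align*}
W(\psi_1)W(\psi_2) &= e^{i\sigma(\psi_1,\psi_2)/2}\,W(\psi_1+\psi_2), \\
W(\psi_2)W(\psi_1) &= e^{i\sigma(\psi_2,\psi_1)/2}\,W(\psi_2+\psi_1)
                    = e^{-i\sigma(\psi_1,\psi_2)/2}\,W(\psi_1+\psi_2), \\
\Longrightarrow\quad
W(\psi_1)W(\psi_2) &= e^{i\sigma(\psi_1,\psi_2)}\,W(\psi_2)W(\psi_1).
\end{align*}
```

If the supports of ψ₁ and ψ₂ are disjoint, both terms of the integrand in (2.15) vanish pointwise, because the support of ∂ψ_i/∂u is contained in that of ψ_i. Hence σ(ψ₁, ψ₂) = 0 and the two Weyl operators commute.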
Differently from QFT in curved spacetime, but similarly to [17], here we do not impose any equation of motion. On the other hand, the space of wave functions, differently from what happens in the case of degenerate manifolds studied in [17], gives rise to a direct implementation of locality. No “causal propagator” has to be introduced in this case.

A Fock representation of W(ℑ⁺) based on a BMS-invariant vacuum state can be introduced as follows. From a physical point of view, the procedure resembles quantization with respect to Killing time in a static spacetime. Fix a Bondi frame (u, ζ, ζ̄) on ℑ⁺. Any ψ ∈ S(ℑ⁺) can be written as a Fourier integral in the parameter u and one may extract the positive-frequency part (with respect to u):

$$\psi_+(u,\zeta,\bar\zeta) := \int_{\mathbb{R}^+}\frac{dE}{\sqrt{4\pi E}}\,e^{-iEu}\,\widetilde{\psi_+}(E,\zeta,\bar\zeta), \tag{2.18}$$

where ℝ⁺ := [0, +∞) and

$$\widetilde{\psi_+}(E,\zeta,\bar\zeta) := \sqrt{2E}\int_{\mathbb{R}}\frac{du}{\sqrt{2\pi}}\,e^{+iEu}\,\psi(u,\zeta,\bar\zeta) \quad\text{for } E\in\mathbb{R}^+. \tag{2.19}$$
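As a consistency check on the normalizations in (2.18) and (2.19) (a worked step added here, with the auxiliary symbol ψ̂ introduced only for this purpose), inserting (2.19) into (2.18) gives

```latex
\begin{align*}
\psi_+(u,\zeta,\bar\zeta)
 &= \int_{\mathbb{R}^+}\frac{dE}{\sqrt{4\pi E}}\,\frac{\sqrt{2E}}{\sqrt{2\pi}}\,e^{-iEu}
    \int_{\mathbb{R}} du'\,e^{iEu'}\,\psi(u',\zeta,\bar\zeta) \\
 &= \frac{1}{2\pi}\int_{0}^{+\infty} dE\,e^{-iEu}\,\hat\psi(E,\zeta,\bar\zeta),
 \qquad
 \hat\psi(E,\zeta,\bar\zeta) := \int_{\mathbb{R}} du'\,e^{iEu'}\,\psi(u',\zeta,\bar\zeta).
\end{align*}
```

So ψ₊ is the E ≥ 0 half of the Fourier inversion formula for ψ. Since ψ is real, ψ̂(−E) is the complex conjugate of ψ̂(E), and the E ≤ 0 half of the inversion integral is the complex conjugate of ψ₊.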
Obviously, it also holds that $\psi = \psi_+ + \overline{\psi_+}$. It might seem that the definition of the positive-frequency part depends on the Bondi frame used, and on the coordinate u in particular; actually, by direct inspection based on Proposition 2.4, one finds the following.

Proposition 2.11. Positive-frequency parts do not depend on the Bondi frame and define scalar fields. In other words, if ψ ∈ S(ℑ⁺) has positive-frequency parts ψ₊ and ψ′₊, respectively, in the Bondi frames (u, ζ, ζ̄) and (u′, ζ′, ζ̄′), it holds that

$$\psi_+(u,\zeta,\bar\zeta) = \psi'_+\big(u'(u,\zeta,\bar\zeta),\,\zeta'(\zeta,\bar\zeta),\,\bar\zeta'(\zeta,\bar\zeta)\big), \quad\text{for all } u\in\mathbb{R},\ (\zeta,\bar\zeta)\in\mathbb{C}\times\mathbb{C}. \tag{2.20}$$
We are now able to give a definition of the one-particle Hilbert space and show that it is isomorphic to a suitable L² space. Let us denote by S(ℑ⁺)^ℂ_+ the space made of the complex finite linear combinations of positive-frequency parts of the elements of S(ℑ⁺). The proof of the following result is in the Appendix.

Theorem 2.12. With the given definition of S(ℑ⁺), σ and S(ℑ⁺)^ℂ_+, the following holds.

(a) The right-hand side of the definition (2.15) of σ is well-behaved if evaluated on functions in S(ℑ⁺)^ℂ_+ and it is independent of the Bondi frame used.

(b) Using (a) and extending the definition (2.15) of σ to S(ℑ⁺)^ℂ_+, consider the complex numbers

$$\langle\psi_{1+},\psi_{2+}\rangle := -i\sigma(\overline{\psi_{1+}},\psi_{2+}), \quad\text{for every pair } \psi_1,\psi_2\in S(\Im^+). \tag{2.21}$$

There is only one Hermitean scalar product ⟨·,·⟩ on S(ℑ⁺)^ℂ_+ which fulfils (2.21). ⟨·,·⟩ is independent of the Bondi frame used, whereas, referring $\widetilde{\psi_+}$ to a given Bondi frame (u, ζ, ζ̄),

$$\langle\psi_{1+},\psi_{2+}\rangle = \int_{\mathbb{R}^+\times\mathbb{S}^2}\overline{\widetilde{\psi_{1+}}(E,\zeta,\bar\zeta)}\;\widetilde{\psi_{2+}}(E,\zeta,\bar\zeta)\,dE\otimes\epsilon_{\mathbb{S}^2}(\zeta,\bar\zeta), \quad\text{for every pair } \psi_1,\psi_2\in S(\Im^+). \tag{2.22}$$
(c) Let H be the Hilbert completion of S(ℑ⁺)^ℂ_+ with respect to ⟨·,·⟩. The unique complex linear and continuous extension of the map $\psi_+ \mapsto \widetilde{\psi_+}$ (for ψ ∈ S(ℑ⁺)), with domain given by the whole H, is a unitary isomorphism onto L²(ℝ⁺ × S², dE ⊗ ε_{S²}).

(d) The map K : S(ℑ⁺) ∋ ψ ↦ ψ₊ ∈ H has range dense in H.

In the following, H will be called the one-particle space. Quantum field theory on ℑ⁺ relies on the bosonic (i.e. symmetric) Fock space F₊(H) built upon the vacuum state Υ (we assume ||Υ|| = 1 explicitly). The field operator symplectically smeared with ψ ∈ S(ℑ⁺) is now defined as [32]

$$\sigma(\psi,\Psi) := i\,a(\psi_+) - i\,a^\dagger(\psi_+), \tag{2.23}$$

where the operators a†(ψ₊) and a(ψ₊) (the latter anti-linear in ψ₊), respectively, create and annihilate the state ψ₊ ∈ H. The common invariant domain of all the involved operators
is the dense linear manifold F(H) spanned by the vectors with a finite number of particles. Ψ(ψ) is essentially self-adjoint on F(H) (it is symmetric and F(H) is dense and made of analytic vectors) and satisfies the bosonic commutation relations (CCR):

[σ(ψ, Ψ), σ(ψ′, Ψ)] = −iσ(ψ, ψ′)I.

Since there is no possibility of misunderstanding, because we will not introduce other, nonsymplectic, smearing procedures for field operators defined on ℑ⁺, from now on we use the simpler notation

$$\Psi(\psi) := \sigma(\psi,\Psi); \tag{2.24}$$

however, the reader should bear in mind that symplectic smearing is understood. Finally, the unitary operators

$$\widehat{W}(\psi) := e^{i\Psi(\psi)} \tag{2.25}$$

enjoy properties (W1), (W2), so that they define a unitary representation Ŵ(ℑ⁺) of W(ℑ⁺) which is also irreducible. A proof of these properties is contained in [33, Propositions 5.2.3 and 5.2.4], where the field operator used is Φ(f) with f ∈ h := H, and it holds that Ψ(ψ) = √2 Φ(iψ₊) for ψ ∈ S(ℑ⁺). In particular, irreducibility arises from (2.3) and (2.4) in [33, Proposition 5.2.4], using the fact that the real linear map K : S(ℑ⁺) ∋ ψ ↦ ψ₊ ∈ H has dense range, as stated in (d) of Theorem 2.12 (notice that this is not obvious in the general case since, by definition of H and (c) of the mentioned theorem, the complexified range of K is dense in H, but not necessarily the range itself).ᵇ

If Π : W(ℑ⁺) → Ŵ(ℑ⁺) denotes the unique (σ being nondegenerate) C*-algebra isomorphism between those two Weyl representations, (F₊(H), Π, Υ) coincides, up to unitary transformations, with the GNS triple associated with the algebraic pure state λ on W(ℑ⁺) uniquely defined by the requirement (see the Appendix)

$$\lambda(W(\psi)) := e^{-\langle\psi_+,\psi_+\rangle/2}. \tag{2.26}$$
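That the Fock vacuum reproduces (2.26) can be checked formally in a few lines. The following is only a sketch, assuming the standard relations [a(f), a†(g)] = ⟨f, g⟩I and a(ψ₊)Υ = 0 on F(H), and the Baker–Campbell–Hausdorff identity e^{X+Y} = e^{X}e^{Y}e^{−[X,Y]/2} for operators whose commutator is a multiple of I (domain questions are ignored here).

```latex
\begin{align*}
i\Psi(\psi) &= a^\dagger(\psi_+) - a(\psi_+)
   \qquad\text{(from (2.23) and (2.24))}, \\
\big[a^\dagger(\psi_+),\,-a(\psi_+)\big]
   &= \big[a(\psi_+),\,a^\dagger(\psi_+)\big] = \langle\psi_+,\psi_+\rangle\,I, \\
e^{i\Psi(\psi)}
   &= e^{a^\dagger(\psi_+)}\,e^{-a(\psi_+)}\,e^{-\langle\psi_+,\psi_+\rangle/2}, \\
\langle\Upsilon,\,e^{i\Psi(\psi)}\Upsilon\rangle
   &= e^{-\langle\psi_+,\psi_+\rangle/2}\,
      \big\langle e^{a(\psi_+)}\Upsilon,\;e^{-a(\psi_+)}\Upsilon\big\rangle
    = e^{-\langle\psi_+,\psi_+\rangle/2}.
\end{align*}
```

The last equality uses a(ψ₊)Υ = 0, so that both exponentials act trivially on Υ, together with ||Υ|| = 1.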
2.5. Unitary BMS invariance

Let us show that F₊(H) admits a unitary representation of G_BMS which is covariant with respect to an analogous representation of the group given in terms of ∗-automorphisms of Ŵ(ℑ⁺). Moreover, we show that the vacuum state Υ (or equivalently, the associated algebraic state λ on W(ℑ⁺)) is invariant under the representation. Consider the representation A of G_BMS in terms of transformations of fields in S(ℑ⁺) used in Secs. 2.2 and 2.3. As a consequence of the invariance of σ under the action of A_γ, by (2.4) in [33, Theorem 5.2.8], one has the following straightforward result concerning the C*-algebra W(ℑ⁺) constructed with σ.

Proposition 2.13. With the given definitions of A (Remark 2.10) and W(ℑ⁺), there is a unique representation of G_BMS, indicated by α : G_BMS ∋ γ ↦ α_γ and made of ∗-automorphisms of W(ℑ⁺), satisfying

$$\alpha_\gamma(W(\psi)) = W(A_\gamma\psi). \tag{2.27}$$

ᵇ With the formalism of [34], the irreducibility of the representation follows from [34, Lemma A.2, (ii)], making use of (2.26) and (d) in Theorem 2.12 again.
Let us come to the main result, given in the following theorem.

Theorem 2.14. Consider the representation of W(ℑ⁺) built upon Υ in the Fock space F₊(H), equipped with the representation α of G_BMS given above. The following holds.

(a) There is a unique unitary representation U : G_BMS ∋ γ ↦ U_γ such that both the requirements below are fulfilled.

(i) It is covariant with respect to the representation α, i.e.

$$U_\gamma\,\widehat{W}(\psi)\,U_\gamma^\dagger = \alpha_\gamma(\widehat{W}(\psi)), \quad\text{for all } \gamma\in G_{BMS} \text{ and } \psi\in S(\Im^+). \tag{2.28}$$

(ii) The vacuum vector Υ is invariant under U: U_γΥ = Υ.

(b) Any projective unitary representationᶜ V : G_BMS ∋ γ ↦ V_γ on F₊(H) which is covariant with respect to α can be made properly unitary, since it must satisfy

$$e^{ig(\gamma)}V_\gamma = U_\gamma, \quad\text{with } e^{-ig(\gamma)} = \langle\Upsilon,V_\gamma\Upsilon\rangle, \quad\text{for every } \gamma\in G_{BMS}. \tag{2.29}$$

(c) The subspaces of F₊(H) with fixed number of particles are invariant under U, and U itself is constructed canonically by tensorialization of U|_H. The latter satisfies, for every γ ∈ G_BMS and the positive-frequency part of any ψ ∈ S(ℑ⁺),

$$U_\gamma\psi_+ = A^{(1)}_\gamma(\psi_+) = \big(A^{(1)}_\gamma(\psi)\big)_+. \tag{2.30}$$

Equivalently, in a fixed Bondi frame, where G_BMS ∋ γ ≡ (Λ, f) ∈ SO(3,1)↑ ⋉ C∞(S²),

$$(U_{(\Lambda,f)}\varphi)(E,\zeta,\bar\zeta) = \frac{e^{iEK_\Lambda(\Lambda^{-1}(\zeta,\bar\zeta))\,f(\Lambda^{-1}(\zeta,\bar\zeta))}}{\sqrt{K_\Lambda(\Lambda^{-1}(\zeta,\bar\zeta))}}\;\varphi\big(EK_\Lambda(\Lambda^{-1}(\zeta,\bar\zeta)),\,\Lambda^{-1}(\zeta,\bar\zeta)\big) \tag{2.31}$$

is valid for every φ ∈ L²(ℝ⁺ × S²; dE ⊗ ε_{S²}), $\varphi = \widetilde{\psi_+}$ in particular.

ᶜ See also [35, 36] for an earlier discussion on this issue.

Proof. (a) and (c). Let us assume that there exists U which satisfies (i) and (ii). Then the uniqueness property is a straightforward consequence of (b) (whose proof is independent of (a) and (c)) since, from (2.29), VΥ = Υ implies e^{−ig(γ)} = ⟨Υ, V_γΥ⟩ = 1. Let us pass to prove the existence of U. Consider the positive-frequency part ψ₊ of ψ ∈ S(ℑ⁺). By Theorem 2.12 (see the Appendix), we have that ψ₊ ∈ C∞(ℑ⁺; ℂ), so that A^(1)_γψ₊ is well defined. Furthermore, ψ₊ and its derivatives decay as |u| → +∞ fast enough and uniformly in ζ, ζ̄, so that it makes sense to apply σ to a pair of functions ψ₊. Moreover, the proof of the invariance
of σ under the representation A^(1) given in Theorem 2.9 remains valid, with the relevant domains simply changed, when working on functions ψ₊ instead of functions in S(ℑ⁺). Collecting everything together, since $\langle\psi_{1+},\psi_{2+}\rangle := -i\sigma(\overline{\psi_{1+}},\psi_{2+})$, it turns out that the map ψ₊ ↦ A^(1)_γψ₊ preserves the values of the scalar product in H, provided that any function A^(1)_γψ₊ is the positive-frequency part of some ψ′ ∈ S(ℑ⁺) when ψ ∈ S(ℑ⁺). Now, by direct inspection using (2.18), (2.19) as well as (2.14) and (2.5), and taking the positivity of K_Λ into account, one finds, in fact, that A^(1)_γ(ψ₊) = (A^(1)_γ(ψ))₊.

The map L_γ : ψ₊ ↦ A^(1)_γψ₊ preserves the scalar product and thus it can be extended by ℂ-linearity and continuity to an isometric transformation S_γ from H, the closure of S(ℑ⁺)^ℂ_+, to H. That transformation is unitary, it being surjective because S_{γ⁻¹} is its inverse. γ ↦ S_γ gives rise, in fact, to a unitary representation of G_BMS on H. Let us define the unitary representation G_BMS ∋ γ ↦ U_γ on the whole space F₊(H) by assuming U_γΥ := Υ and using the standard tensorialization of S_γ on every subspace with a finite number of particles. To conclude the proofs of (a) and (c) it is now sufficient to establish the validity of (2.28). (Notice that, with the given definition of U, in proving the validity of the identity A^(1)_γ(ψ₊) = (A^(1)_γ(ψ))₊ one proves, in fact, also (2.30) and (2.31).) To prove (2.28), it is sufficient to note that, in general, whenever a unitary map V : F₊(H) → F₊(H) satisfies VΥ = Υ and is the standard tensorialization of some unitary map V₁ : H → H, then, for any φ ∈ H, V a†(φ) V† = a†(V₁φ) and V a(φ) V† = a(V₁φ). Since Ψ(ψ) = −ia†(ψ₊) + ia(ψ₊), one has U_γΨ(ψ)U_γ† = Ψ(A_γψ). Exponentiating this identity (using the fact that the vectors with a finite number of particles are analytic vectors for Ψ(ψ) [33]), (2.28) arises.

(b) By hypothesis, U_γŴ(ψ)U_γ† = α_γ(Ŵ(ψ)) = V_γŴ(ψ)V_γ†, so that [V_γ†U_γ, Ŵ(ψ)] = 0. On the other hand, the representation Ŵ(ℑ⁺) of the Weyl algebra is irreducible, as said above, and thus, by Schur’s lemma, V_γ†U_γ = α(γ)I. Since $(V_\gamma^\dagger U_\gamma)^{-1} = (V_\gamma^\dagger U_\gamma)^\dagger = \overline{\alpha(\gamma)}\,I$, it must be |α(γ)|² = 1 and so e^{ig(γ)}V_γ = U_γ. Finally, e^{ig(γ)}V_γ = U_γ and (ii) imply e^{−ig(γ)} = ⟨Υ, V_γΥ⟩.
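As a cross-check of the normalization in (2.31), one can verify directly that U_{(Λ,f)} is isometric on L²(ℝ⁺ × S²; dE ⊗ ε_{S²}). The computation below is a sketch added for illustration. It reads (2.31) with the prefactor K_Λ^{−1/2} and uses the relation between volume forms quoted in the proof of Theorem 2.9; the shorthand ζ₀ := Λ⁻¹(ζ, ζ̄) and E₀ := E K_Λ(ζ₀) is introduced here only for this check.

```latex
\begin{align*}
\|U_{(\Lambda,f)}\varphi\|^2
 &= \int_{\mathbb{S}^2}\epsilon_{\mathbb{S}^2}(\zeta,\bar\zeta)\int_{\mathbb{R}^+} dE\;
    \frac{|\varphi(E K_\Lambda(\zeta_0),\zeta_0)|^2}{K_\Lambda(\zeta_0)} \\
 &= \int_{\mathbb{S}^2}\epsilon_{\mathbb{S}^2}(\zeta,\bar\zeta)\,
    \frac{1}{K_\Lambda(\zeta_0)^2}\int_{\mathbb{R}^+} dE_0\;|\varphi(E_0,\zeta_0)|^2
    && (E_0 = E K_\Lambda(\zeta_0),\ K_\Lambda > 0) \\
 &= \int_{\mathbb{S}^2}\epsilon_{\mathbb{S}^2}(\zeta_0,\bar\zeta_0)
    \int_{\mathbb{R}^+} dE_0\;|\varphi(E_0,\zeta_0)|^2
    && \big(\epsilon_{\mathbb{S}^2}(\zeta,\bar\zeta)
        = K_\Lambda(\zeta_0)^2\,\epsilon_{\mathbb{S}^2}(\zeta_0,\bar\zeta_0)\big) \\
 &= \|\varphi\|^2 .
\end{align*}
```

The phase in (2.31) has unit modulus and drops out of the norm, so only the powers of K_Λ matter: one factor K_Λ⁻¹ from the squared prefactor, one from the change of variable in E, and K_Λ² from the volume form.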
2.6. Topology on G_BMS in view of the analysis of irreducible unitary representations and strong continuity

Up to now, we have assumed no topology on G_BMS. As the group is infinite dimensional and made of diffeomorphisms, a very natural topology is that induced by a suitable countable class of seminorms [37], yielding the so-called nuclear topology (see below), though other choices have been made in the literature. We spend some words on this interesting issue. Since its original definition in [24, 38], the BMS group has been recognized as a semidirect product of two groups, G_BMS = H ⋉ N, as can be directly inferred from (2.7). The group H stands for the proper orthochronous Lorentz group, whereas the abelian group N, the space of supertranslations, is a suitable set of sufficiently regular real functions on the two-sphere equipped with the abelian group structure induced by the pointwise sum of functions. Up to now,
we have chosen N = C∞(S²), but there are other possibilities, connected with the question of which topology to associate to N in order to have the most physically sensible characterization of the Bondi–Metzner–Sachs group. In the Penrose construction [20], where the BMS group arises as the group of exact conformal motions (preserving null angles) of the boundaries ℑ^± of conformally compactified asymptotically simple spacetimes, a specific degree of smoothness on the elements of N was never imposed. Nonetheless, historically, the first stringent request was proposed by Sachs in [24], i.e. each α ∈ N must be at least twice differentiable. This choice was abandoned by McCarthy in his study of the BMS theory of representations [25], where he widened the possible supertranslations to the set of real-valued square-integrable functions N = L²(S²; ε_{S²})_ℝ equipped with the Hilbert topology. The underlying reasons for this proposal are two: the former concerns the great simplification of the treatment of induced representations in this framework,ᵈ the latter is related to the conjecture that square-integrable supertranslations are more suited to describe bounded gravitational systems [27]. It is imperative to notice that, though such assertions may seem reasonable at first glance (barring a problem with the interpretation of the elements of the group in terms of diffeomorphisms), they have never been really justified beyond purely heuristic arguments.

ᵈ Originally it was also thought that, at the level of representation theory, the results were not affected by the choice of the topology of N, though this claim was subsequently falsified.

As a matter of fact, a natural choice for N and a corresponding topology is, according to the discussion in Sec. 2.2, N = C∞(S²) equipped with the nuclear topology, first proposed in [28]. We follow [29] (and references therein), according to which the nuclear topology on C∞(S²) is the topology such that a sequence {f_n}_{n∈ℕ} ⊂ C∞(S²) converges to f ∈ C∞(S²) iff, for every local chart on S², φ : U ∋ p ↦ (x(p), y(p)), and in any compact K ⊂ U,

$$\sup_K\left|\frac{\partial^{\alpha+\beta}}{\partial x^\alpha\partial y^\beta}\,f_n\circ\phi^{-1} - \frac{\partial^{\alpha+\beta}}{\partial x^\alpha\partial y^\beta}\,f\circ\phi^{-1}\right| \to 0, \quad\text{as } n\to+\infty,$$

for every choice of α, β = 0, 1, 2, . . . . As is well known, this topology can be induced by a suitable class of seminorms. Although it has been pointed out that this choice for N and its topology should describe more accurately unbounded gravitating sources [27], we nonetheless find this framework more natural than the Hilbert topology and thus we adopt the nuclear topology on N = C∞(S²) and equip G_BMS with the consequent product topology. In particular, we shall show in Proposition 3.35 that, with our choice, it is possible to identify a field on ℑ⁺, which transforms with respect to G_BMS as said in (2.31), with an intrinsic BMS field as introduced in the next section. After that proposition, we shall remark that the result cannot be achieved using the Hilbert topology. To conclude this section, we state a theorem about the strong continuity of the representation of G_BMS, U : G_BMS ∋ g ↦ U_g, defined in Theorem 2.14 on F₊(H). The relevance of strong continuity for a unitary representation is that, through Stone’s
theorem, it implies the existence of self-adjoint generators of the representation itself. The proof of the theorem is in the Appendix.

Theorem 2.15. Make G_BMS a topological group by adopting the product of the standard topology of SO(3,1)↑ and the nuclear topology of C∞(S²). The unitary representation of the topological group G_BMS defined in Theorem 2.14, U : G_BMS ∋ g ↦ U_g, on F₊(H) is strongly continuous.

3. BMS Theory of Representations in Nuclear Topology

3.1. General goals of the section

In the previous discussions, and in particular in Sec. 2.2, we have developed a scalar QFT on ℑ⁺ whose kinematical data are fields ψ which are suitable smooth extensions/restrictions to ℑ⁺ of fields φ living in (M, g). Nonetheless, a second candidate way to construct a consistent QFT at null infinity consists of considering as kinematical data the set of wave functions invariant under a unitary irreducible representation of the G_BMS group [7]. The support of such functions is not a priori the underlying spacetime (ℑ⁺ in our scenario), but a suitable manifold modelled on a subgroup of G_BMS. For this reason, we shall also refer to such fields as intrinsic G_BMS fields. The rationale underlying this section is to demonstrate that, at least for scalar fields, both approaches are fully equivalent. In particular, we shall establish that (2.31) is the transformation proper to an intrinsic scalar G_BMS field.

3.2. The group G̃_BMS and some associated spaces

To achieve our task, in the forthcoming discussion on representations of the BMS group, we shall study the unitary representations of the topological group G̃_BMS = SL(2, ℂ) ⋉ C∞(S²), where the product of the group is given by suitable re-interpretations of (2.6) and (2.7) and the topology is the product of the usual topology on SL(2, ℂ) and the nuclear topology on C∞(S²) introduced in Sec. 2.6.

In a fixed Bondi frame, the composition of two elements g = (A, α), g′ = (A′, α′) ∈ G̃_BMS is defined by

$$(A',\alpha')\odot(A,\alpha) = \big(A'A,\ \alpha + (K_{A^{-1}}\circ A)\cdot(\alpha'\circ A)\big), \tag{3.1}$$

$$A(\zeta,\bar\zeta) := \left(\frac{a\zeta+b}{c\zeta+d},\ \frac{\bar a\bar\zeta+\bar b}{\bar c\bar\zeta+\bar d}\right), \tag{3.2}$$

$$K_A(\zeta,\bar\zeta) := \frac{(1+\zeta\bar\zeta)}{(a\zeta+b)(\bar a\bar\zeta+\bar b)+(c\zeta+d)(\bar c\bar\zeta+\bar d)} \quad\text{and}\quad A := \begin{pmatrix} a & b\\ c & d\end{pmatrix}. \tag{3.3}$$

In a sense, noticing that SL(2, ℂ) is the universal covering of SO(3,1)↑, G̃_BMS could be considered as the universal covering of G_BMS. A discussion on this point would be necessary if one tries to interpret the term “universal covering” literally
since both G_BMS and G̃_BMS are infinite-dimensional topological groups. However, we limit ourselves to saying that, according to [25, 35], replacing in the structure of G_BMS the orthochronous proper Lorentz group SO(3,1)↑ᵉ with its universal covering SL(2, ℂ) introduces only further unitary irreducible representations, induced by the Z₂ subgroup of SL(2, ℂ), beyond the unitary irreducible representations of G_BMS. These represent nothing but the symptom that SL(2, ℂ) “covers twice” SO(3,1)↑ and they will not be considered in this paper: we shall pick out only representations of G̃_BMS which are representations of G_BMS as well. The next step consists of the following further definition [29, 39].

ᵉ The orthochronous proper Lorentz group is called homogeneous Lorentz group in [25, 35].

Definition 3.1. If n ∈ ℤ is fixed, we call D_(n,n) the space of real functions f of two complex variables ζ₁, ζ₂ and their conjugates ζ̄₁, ζ̄₂ such that:
• f is of class C∞ in its arguments except at most at the origin (0, 0, 0, 0);
• for any σ ∈ ℂ, f(σζ₁, σ̄ζ̄₁, σζ₂, σ̄ζ̄₂) = σ^(n−1) σ̄^(n−1) f(ζ₁, ζ̄₁, ζ₂, ζ̄₂) for all ζ₁, ζ₂, ζ̄₁, ζ̄₂.
Moreover, D_(n,n) is assumed to be endowed with the topology of uniform convergence on all compact sets not containing the origin for the functions and all their derivatives separately.

The relevance of the definition above arises from the following proposition which, first of all, allows one to identify C∞(S²) with the space D_(2,2) and with the subsequent space D₂ introduced below. These spaces will be used later. The relevance of the second statement will be clarified shortly after Proposition 3.6. The action Λα of Λ ∈ SL(2, ℂ) on an element α of C∞(S²), considered in Eq. (3.5) below, is that arising from the representation of SL(2, ℂ) in terms of C∞(S²) automorphisms used to define the semidirect product SL(2, ℂ) ⋉ C∞(S²). Notice that, by the natural normal subgroup identification C∞(S²) ∋ α ≡ (I, α) ∈ G̃_BMS, one also has

$$(I,\alpha)\mapsto g\odot(I,\alpha)\odot g^{-1} = (I,\Lambda\alpha) \quad\text{for any } g=(\Lambda,\alpha')\in\widetilde{G}_{BMS}, \tag{3.4}$$

I being the unit element of SL(2, ℂ). Since C∞(S²) is abelian, the dependence on α′ is immaterial, as the notation suggests.

Proposition 3.2. There is a one-to-one map T : C∞(S²) ∋ α ↦ f ∈ D_(2,2). In this way, the action of Λ ∈ SL(2, ℂ) on an element α of C∞(S²),

$$(\Lambda\alpha)(\zeta,\bar\zeta) = K_\Lambda(\Lambda^{-1}(\zeta,\bar\zeta))\,\alpha(\Lambda^{-1}(\zeta,\bar\zeta)), \tag{3.5}$$

is equivalent to the action (defined in [39]) of the same Λ on f,

$$f\circ\Lambda^{-1} := f(a\zeta_1+c\zeta_2,\ \bar a\bar\zeta_1+\bar c\bar\zeta_2,\ b\zeta_1+d\zeta_2,\ \bar b\bar\zeta_1+\bar d\bar\zeta_2), \quad \forall\,\Lambda = \begin{pmatrix} a & b\\ c & d\end{pmatrix}^{-1}\in SL(2,\mathbb{C}). \tag{3.6}$$
Finally, T is a homeomorphism, so that the topology of D_(2,2) coincides with that on C∞(S²).

The proof of this result may be found in the appendix of [41], though we review some of the details which will be important in the forthcoming discussion. The sketch of the argument is the following: the homogeneity condition for the functions f ∈ D_(n,n) allows us to associate to each such f a pair of C∞ functions ξ, ξ̂ such that

$$f(\zeta_1,\bar\zeta_1,\zeta_2,\bar\zeta_2) = |\zeta_1|^{2(n-1)}\,f\!\left(1,1,\frac{\zeta_2}{\zeta_1},\frac{\bar\zeta_2}{\bar\zeta_1}\right) = |\zeta_1|^{2(n-1)}\,\xi(\zeta^{-1},\bar\zeta^{-1}),$$

$$f(\zeta_1,\bar\zeta_1,\zeta_2,\bar\zeta_2) = |\zeta_2|^{2(n-1)}\,f\!\left(\frac{\zeta_1}{\zeta_2},\frac{\bar\zeta_1}{\bar\zeta_2},1,1\right) = |\zeta_2|^{2(n-1)}\,\hat\xi(\zeta,\bar\zeta),$$

where ζ = ζ₁/ζ₂ and

$$\hat\xi(\zeta,\bar\zeta) = |\zeta|^{2(n-1)}\,\xi(\zeta^{-1},\bar\zeta^{-1}) \tag{3.7}$$

whenever (ζ₁, ζ₂) ≠ (0, 0). If we call D_n the set of the functions ξ, ξ̂, the above discussion can be recast as the existence of a bijection between D_(n,n) and D_n, which thus inherits the same topology as D_(n,n) (or vice versa). Furthermore, (3.6) becomes, with obvious notation,

$$(\xi\circ\Lambda^{-1})(\zeta,\bar\zeta) = |a+c\zeta|^{2(n-1)}\,\xi\!\left(\frac{b+d\zeta}{a+c\zeta},\frac{\bar b+\bar d\bar\zeta}{\bar a+\bar c\bar\zeta}\right), \quad a+c\zeta\neq 0,$$

$$(\hat\xi\circ\Lambda^{-1})(\zeta,\bar\zeta) = |d+b\zeta|^{2(n-1)}\,\hat\xi\!\left(\frac{c+a\zeta}{d+b\zeta},\frac{\bar c+\bar a\bar\zeta}{\bar d+\bar b\bar\zeta}\right), \quad d+b\zeta\neq 0.$$

If we specialize to n = 2, it is now possible to show (see [29, 41]) that the above equations correspond to the canonical realization of the G̃_BMS group as SL(2, ℂ) ⋉ C∞(S²) if we associate the supertranslation α ∈ C∞(S²) with ξ̂ as

$$\hat\xi(\zeta,\bar\zeta) = (1+|\zeta|^2)\,\alpha(\zeta,\bar\zeta). \tag{3.8}$$

Within this framework, and for every Λ ∈ SL(2, ℂ) and α ∈ C∞(S²), (3.5) turns out to be equivalent to (3.6), as one can check by direct inspection.
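Part of such an inspection can be delegated to a quick numerical experiment. The sketch below is an illustration added here, not the authors' procedure: it takes K_A from (3.3) and the Möbius action (3.2), and checks at a few sample points that K satisfies the cocycle identity K_{A′A}(ζ) = K_{A′}(Aζ) K_A(ζ) and that, as a consequence, (3.5) composes as a left action, Λ′(Λα) = (Λ′Λ)α. All function names and the test function `alpha` are hypothetical choices.

```python
# Numerical check that (3.5) defines a left action of SL(2,C) on functions on the sphere.
# Matrices are stored as tuples (a, b, c, d) with a*d - b*c = 1; names are illustrative.

def mobius(A, z):
    """Moebius action (3.2) on the stereographic coordinate z."""
    a, b, c, d = A
    return (a * z + b) / (c * z + d)

def K(A, z):
    """Conformal factor K_A(z) of (3.3)."""
    a, b, c, d = A
    return (1 + abs(z) ** 2) / (abs(a * z + b) ** 2 + abs(c * z + d) ** 2)

def matmul(A2, A1):
    """Matrix product A2 * A1 (A1 acts first, then A2)."""
    a2, b2, c2, d2 = A2
    a1, b1, c1, d1 = A1
    return (a2 * a1 + b2 * c1, a2 * b1 + b2 * d1,
            c2 * a1 + d2 * c1, c2 * b1 + d2 * d1)

def inverse(A):
    """Inverse of a unit-determinant 2x2 matrix."""
    a, b, c, d = A
    return (d, -b, -c, a)

def act(A, alpha):
    """Return the function (A alpha)(z) = K_A(A^{-1} z) * alpha(A^{-1} z), cf. (3.5)."""
    A_inv = inverse(A)

    def transformed(z):
        w = mobius(A_inv, z)
        return K(A, w) * alpha(w)

    return transformed

def sl2c(a, b, c):
    """Complete (a, b, c) to a unit-determinant matrix (requires a != 0)."""
    return (a, b, c, (1 + b * c) / a)

def alpha(z):
    """A sample smooth real function on the sphere, in stereographic coordinates."""
    return (1 + 2 * z.real - 3 * z.imag) / (1 + abs(z) ** 2)

if __name__ == "__main__":
    A1 = sl2c(1 + 0.5j, 0.3 - 0.2j, -0.4 + 0.1j)
    A2 = sl2c(0.8 - 0.3j, -0.6 + 0.4j, 0.2 + 0.9j)
    A21 = matmul(A2, A1)
    for z in (0.2 + 0.1j, -1.3 + 0.7j, 2.0 - 3.0j):
        cocycle_gap = abs(K(A21, z) - K(A2, mobius(A1, z)) * K(A1, z))
        action_gap = abs(act(A2, act(A1, alpha))(z) - act(A21, alpha)(z))
        print(z, cocycle_gap, action_gap)  # both gaps should be rounding-level
```

The check concerns only (3.5) and (3.3). It does not test the identification with the homogeneous functions of Definition 3.1, for which the conventions of [39] would have to be fixed first.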
Remark 3.3. Identifying the topological vector space of supertranslations C∞(S²) with D₂, and equivalently with D_(2,2), the G̃_BMS group turns out to be locally homeomorphic to a nuclear spaceᶠ and thus it is a nuclear Lie group as defined by Gelfand and Vilenkin in [40]. In other words, there exists a neighborhood of the unit element of G̃_BMS which is homeomorphic to a neighborhood of zero in a (separable Hilbert) nuclear space.

If N is the real topological vector space of supertranslations C∞(S²), N* indicates its topological dual vector space, whose elements are called (real) distributions on N.

ᶠ We remind the reader that, given a separable Hilbert space H, E ⊂ H is called a nuclear space if it is the projective limit of a decreasing sequence of Hilbert spaces H_k such that the canonical imbedding of H_k in H_{k′} (k > k′) is a Hilbert–Schmidt operator.
Remark 3.4. Since N can be topologically identified as D(2,2) , N ∗ is fully equivalent to the set of continuous linear functionals D(−2,−2) which is obtained setting n = −2 in Definition 3.1 with the prescription that all the equations should be interpreted in a distributional sense [29, 39]. Consequently, each φ ∈ D(−2,−2) is a real distribution in two complex variables bijectively determined by a pair φ, φˆ ∈ D−2 of real distributions such that φˆ = |z|−6 φ, as in (3.8). The counterpart of (3.8) for N ∗ is the following: to each functional φ ∈ D(−2,−2) corresponds the distribution β ∈ N∗ β = (1 + |ζ|2 )3 φ.
(3.9)
Furthermore, if $L^2(S^2, \epsilon_{S^2})$ is the Hilbert completion of $N$ with respect to the scalar product associated with $\epsilon_{S^2}$, then $N \subset L^2(S^2, \epsilon_{S^2}) \subset N^*$ is a rigged Hilbert space.

3.3. Main ingredients to study unitary representations of $G_{BMS}$

The starting point to study unitary representations of the BMS group is the detailed analysis of McCarthy [25, 26, 29]. The theory of unitary and irreducible representations for $G_{BMS}$ with nuclear topology has been developed in [29] by means either of the Mackey theory of induced representations [43–45] applied to an infinite dimensional semidirect product [42] or of the Gelfand–Vilenkin work on nuclear groups [39, 40]. In the following, we briefly discuss some key points and introduce the main mathematical tools needed to construct the intrinsic wave functions. We refer to [7] for a detailed analysis in the Hilbert topology scenario.

Definition 3.5. If $A$ is an abelian topological group, a character (of $A$) is a continuous group homomorphism $\chi : A \to U(1)$, the latter being equipped with the natural topology induced by $\mathbb{C}$. The set of characters $A'$ is an abelian group, called the dual character group, if equipped with the group product
$$(\chi_1\chi_2)(\alpha) := \chi_1(\alpha)\chi_2(\alpha) \quad \text{for all } \alpha \in A.$$
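As a finite-dimensional illustration of Definition 3.5, one may take $A = \mathbb{R}^n$ under addition. This is an assumption made only for concreteness and it is not the infinite-dimensional group of supertranslations. In that toy setting every vector $\beta$ defines a character $\alpha \mapsto \exp[i\,\alpha\cdot\beta]$, and the product of two characters is the character labelled by the sum of the two vectors. The helper names `pairing` and `character` are hypothetical.

```python
import cmath

def pairing(alpha, beta):
    """Finite-dimensional stand-in for the evaluation (alpha, beta)."""
    return sum(a * b for a, b in zip(alpha, beta))

def character(beta):
    """Return the character chi_beta(alpha) = exp(i (alpha, beta)) on A = R^n."""
    return lambda alpha: cmath.exp(1j * pairing(alpha, beta))

alpha1, alpha2 = [0.3, -1.2, 2.0], [1.1, 0.4, -0.7]
beta1, beta2 = [0.5, 0.25, -1.0], [-2.0, 1.5, 0.75]
chi1, chi2 = character(beta1), character(beta2)

alpha_sum = [x + y for x, y in zip(alpha1, alpha2)]
beta_sum = [x + y for x, y in zip(beta1, beta2)]

# Homomorphism property: chi(alpha1 + alpha2) = chi(alpha1) * chi(alpha2).
hom_defect = abs(chi1(alpha_sum) - chi1(alpha1) * chi1(alpha2))
# Group product of characters: (chi1 chi2)(alpha) = chi1(alpha) * chi2(alpha),
# which here coincides with the character labelled by beta1 + beta2.
prod_defect = abs(character(beta_sum)(alpha1) - chi1(alpha1) * chi2(alpha1))
# Characters take values in U(1).
modulus = abs(chi1(alpha1))
```

Both defects vanish up to rounding, so the toy characters form an abelian group under pointwise multiplication, as stated in the definition.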
A central tool concerns an explicit representation of the characters in terms of distributions [29]. The proof of the following relevant proposition is in the Appendix.

Proposition 3.6. Viewing $N := C^\infty(S^2)$ as an additive continuous group, for every $\chi \in N'$ there is a distribution $\beta \in N^*$ such that
$$\chi(\alpha) = \exp[i(\alpha, \beta)] \quad \text{for every } \alpha \in N,$$
where $(\alpha, \beta)$ has to be interpreted as the evaluation of the distribution $\beta$ on the test function $\alpha$.

Remark 3.7. With characters, one can decompose any unitary representation of $N = C^\infty(S^2)$. Indeed, a positive finitely normalizable measure $\mu_{N^*}$ on $N^*$ exists, which is quasi invariant under group translations (i.e. for any measurable $X \subset N^*$, $\mu_{N^*}(X) = 0$ iff $\mu_{N^*}(N + X) = 0$), and a family of Hilbert spaces $\{\mathcal{H}_\beta\}_{\beta \in N^*}$
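such that, for any unitary representation of $N$, $U : \mathcal{H} \to \mathcal{H}$, $\mathcal{H}$ being any Hilbert space, the following direct-integral decomposition holds (cf. [40, Chaps. I and IV, Theorem 5 and subsequent discussion]):
$$\mathcal{H} = \int^{\oplus}_{N^*} \mathcal{H}_\beta \, d\mu_{N^*}(\beta).$$
Moreover, the spaces $\mathcal{H}_\beta$ are invariant under $U$ and, for every $\alpha \in N$ and $\psi_\beta \in \mathcal{H}_\beta$, one has
$$U(\alpha)\big|_{\mathcal{H}_\beta}\,\psi_\beta = e^{i(\alpha,\beta)}\,\psi_\beta.$$
Here, $(\alpha, \beta)$ denotes the action of the distribution $\beta$ on the test function $\alpha$.

For any $\Lambda \in SL(2,\mathbb{C})$, a natural action $\chi \to \Lambda\chi$ on $N'$, induced by duality from that on $\alpha \in N$ considered above, is [25, 29]:
$$(\Lambda\chi)(\alpha) := \chi(\Lambda^{-1}\alpha), \tag{3.10}$$
whereas an action $\beta \to \Lambda\beta$ on $N^*$ is intrinsically defined from the identity
$$(\Lambda\beta, \alpha) = (\beta, \Lambda^{-1}\alpha). \tag{3.11}$$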
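If we associate to the distribution $\beta$ the pair $(\phi, \hat\phi)$ as discussed in Remark 3.4, the latter $SL(2,\mathbb{C})$ action translates as follows: if $\Lambda^{-1} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\mathbb{C})$,
$$(\Lambda\phi)(\zeta, \bar\zeta) = |a + c\zeta|^{-6}\, \phi\!\left(\frac{b + d\zeta}{a + c\zeta}, \frac{\bar b + \bar d\bar\zeta}{\bar a + \bar c\bar\zeta}\right), \quad \text{with } a + c\zeta \neq 0, \tag{3.12}$$
$$(\Lambda\hat\phi)(\zeta, \bar\zeta) = |d + b\zeta|^{-6}\, \hat\phi\!\left(\frac{c + a\zeta}{d + b\zeta}, \frac{\bar c + \bar a\bar\zeta}{\bar d + \bar b\bar\zeta}\right), \quad \text{with } d + b\zeta \neq 0. \tag{3.13}$$

Definition 3.8. Consider a semidirect group product $G = B \ltimes A$ where $A$ is a topological abelian group and $B$ is any group, the product in $G$ being written by juxtaposition. With the identification of $A$ with the normal subgroup of $G$ containing the pairs $(I, \alpha)$, $\alpha \in A$, define the action$^{g}$ $g\alpha$ of $g \in G$ on $\alpha \in A$:
$$(I, g\alpha) := g\,(I, \alpha)\,g^{-1}, \quad \text{for all } \alpha \in A,\ g \in G,$$
then extend this action to characters, $\chi \in A'$, by duality:
$$(g\chi)(\alpha) := \chi(g^{-1}\alpha), \quad \text{for all } \chi \in A',\ \alpha \in A,\ g \in G.$$
For any $\chi \in A'$, the orbit of $\chi$ (with respect to $G$) is the subset of $A'$
$$G\chi := \{\chi' \in A' \mid \exists\, g \in G \text{ such that } \chi' = g\chi\}, \tag{3.14}$$
the isotropy group of $\chi$ (with respect to $G$) is the subgroup of $G$
$$H_\chi := \{g \in G \mid g\chi = \chi\}, \tag{3.15}$$
and the little group of $\chi$ (with respect to $G$) is the subgroup of $H_\chi$
$$L_\chi := \{g = (L, 0) \in G \mid g\chi = \chi\}. \tag{3.16}$$

$^{g}$ It coincides with the action of $B$ on $A$ in terms of $A$-group-automorphisms used in the definition of the semidirect product.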
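The notions of orbit and little group in Definition 3.8 can be made concrete on a finite toy model. The sketch below assumes $A = \mathbb{Z}_6$ and $B = \mathbb{Z}_2 = \{1, -1\}$ acting by $a \mapsto ba$; this choice, and the helper names `chi`, `act`, `orbit` and `little_group`, are hypothetical and unrelated to the BMS group itself. The characters of $\mathbb{Z}_6$ are $\chi_k(a) = e^{2\pi i k a/6}$, and $B$ acts on them by duality.

```python
import cmath

N = 6          # A = Z_6 (additive); a hypothetical toy choice
B = (1, -1)    # B = Z_2 acting on A by a -> b * a (mod N)

def chi(k):
    """Character chi_k(a) = exp(2 pi i k a / N) of A = Z_N."""
    return lambda a: cmath.exp(2j * cmath.pi * k * a / N)

def act(b, k):
    """Label m of the character b.chi_k, where (b.chi_k)(a) = chi_k(b^{-1} a)."""
    b_inv = b  # every element of Z_2 = {1, -1} is its own inverse
    target = [chi(k)((b_inv * a) % N) for a in range(N)]
    for m in range(N):
        if all(abs(chi(m)(a) - target[a]) < 1e-9 for a in range(N)):
            return m
    raise ValueError("the transformed function is not a character")

def orbit(k):
    """Orbit of chi_k under B, as a sorted list of labels."""
    return sorted({act(b, k) for b in B})

def little_group(k):
    """Elements of B leaving chi_k fixed."""
    return [b for b in B if act(b, k) == k]

orbits = sorted({tuple(orbit(k)) for k in range(N)})
little_groups = {k: little_group(k) for k in range(N)}
```

In this toy model the characters $\chi_0$ and $\chi_3$ are fixed by all of $B$, while the remaining characters have trivial little group and two-element orbits.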
Referring to $G_{BMS} = SL(2,\mathbb{C}) \ltimes C^\infty(S^2)$, to (3.10) and to (3.11), $L_\chi$ can equivalently be seen as the subgroup of $SL(2,\mathbb{C})$ whose elements $L$ satisfy
$$L\bar\beta = \bar\beta, \tag{3.17}$$
$\bar\beta \in N^*$ being associated to $\chi$ according to Proposition 3.6.

Remark 3.9. A direct inspection shows also that the $G$ action on a character is completely independent of $A$, because $A$ is abelian. Thus, the most general isotropy group has the form $H_\chi = L_\chi \ltimes A$. This applies in particular to $G_{BMS}$, where $A = C^\infty(S^2)$.
We now discuss a last key remark concerning the mass of a BMS field. First of all, define a basis of real spherical harmonics $\{S_{lk}\}_{l=0,1,\ldots,\ k=1,2,\ldots,2l+1}$ in the real vector space $C^\infty(S^2)$ as follows:
$$S_{lk} := Y_{l0} \quad \text{if } k = 2l + 1, \tag{3.18}$$
$$S_{lk} := \frac{Y_{l\,-k} - Y_{lk}}{\sqrt{2}} \quad \text{if } 1 \le k \le l, \tag{3.19}$$
$$S_{lk} := i\,\frac{Y_{l\,-k} + Y_{lk}}{\sqrt{2}} \quad \text{if } l < k \le 2l, \tag{3.20}$$
where $Y_{lm}$ are the usual (complex) spherical harmonics with $m \in \mathbb{Z}$ such that $-l \le m \le l$. Now, let us consider a generic supertranslation $\alpha \in C^\infty(S^2)$ and let us decompose it (in the sense of $L^2(S^2, \epsilon_{S^2})$) in real spherical harmonics:
$$\alpha(\zeta, \bar\zeta) = \sum_{l=0}^{1}\sum_{k=1}^{2l+1} a_{lk}\, S_{lk}(\zeta, \bar\zeta) + \sum_{l=2}^{\infty}\sum_{k=1}^{2l+1} a_{lk}\, S_{lk}(\zeta, \bar\zeta), \quad a_{lk} \in \mathbb{R}. \tag{3.21}$$
The former double sum defines the translational component of $\alpha$ and the latter the pure supertranslational component of $\alpha$. This relation allows one to split $C^\infty(S^2)$ into an orthogonal direct sum $T^4 \oplus \Sigma$, where $T^4$ is a four-dimensional real space invariant under $SL(2,\mathbb{C})$ viewed as the subgroup of $G_{BMS}$ made of elements $(A, 0)$. More precisely (see also Proposition 4.4 below):

Proposition 3.10. The subset $SL(2,\mathbb{C}) \ltimes T^4 \subset G_{BMS}$ made of the elements $(\Lambda, \alpha)$ with $\alpha \in T^4$ is a subgroup of $G_{BMS}$ itself which is invariant under $SL(2,\mathbb{C})$, i.e. if $g \in SL(2,\mathbb{C}) \ltimes T^4$,
$$g\,(A, 0) \ \text{ and } \ (A, 0)\,g \in SL(2,\mathbb{C}) \ltimes T^4, \quad \text{for all } A \in SL(2,\mathbb{C}).$$

Remark 3.11. Defining the analogous subset $SL(2,\mathbb{C}) \ltimes \Sigma$, one finds that $\Sigma$ is not $SL(2,\mathbb{C})$ invariant. More precisely, breaking of invariance happens when $A$ does not belong to $SU(2)$.
The decomposition (3.21) explicitly associates to each $\alpha \in C^\infty(S^2)$ the 4-vector
$$a^\mu \equiv -\frac{1}{2}\sqrt{\frac{3}{\pi}}\left(\frac{a_{01}}{\sqrt{3}},\ a_{11},\ a_{12},\ a_{13}\right). \tag{3.22}$$
One has the following very useful proposition, which can be proved by direct inspection and which will be used at several key points in the following.

Proposition 3.12. If $\alpha_a \in T^4$, where $a^\mu$ is made of the first four components of $\alpha_a$ as in (3.22), transforming $\alpha_a$ under the action of $A \in SL(2,\mathbb{C})$ as in (3.5) is equivalent to transforming the 4-vector $a^\mu$ under the action of the Lorentz transformation associated with $A$ itself. In other words:
$$K_A(\zeta, \bar\zeta)^{-1}\,\alpha_a\big(A(\zeta, \bar\zeta)\big) = \alpha_{\Pi(A)^{-1}a}(\zeta, \bar\zeta), \quad \text{for all } A \in SL(2,\mathbb{C}), \tag{3.23}$$
$\Pi : SL(2,\mathbb{C}) \to SO(3,1)\!\uparrow$ being the canonical covering projection.

According to the discussion in [29], (3.22) can be translated to the dual space $N^*$, where we shall define the annihilator of $T^4$ as
$$(T^4)^0 = \{\beta \in N^* \mid (\alpha, \beta) = 0,\ \forall\, \alpha \in T^4 \hookrightarrow C^\infty(S^2)\}. \tag{3.24}$$
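Proposition 3.12 rests on the covering projection $\Pi$ from $SL(2,\mathbb{C})$ onto the Lorentz group. A minimal numerical sketch of that covering is given below, under the common convention (an assumption here, which may differ from the paper's $\Pi$ by an inverse or an index placement) that a 4-vector $a$ is encoded in the Hermitian matrix $X = a^0 I + a^i\sigma_i$ and transformed as $X \mapsto A X A^\dagger$. All function names are hypothetical.

```python
def matmul(X, Y):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(X):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def to_hermitian(a):
    """4-vector (a0, a1, a2, a3) -> a0*I + a1*sigma1 + a2*sigma2 + a3*sigma3."""
    a0, a1, a2, a3 = a
    return [[a0 + a3, a1 - 1j * a2], [a1 + 1j * a2, a0 - a3]]

def from_hermitian(X):
    """Inverse of to_hermitian for a Hermitian 2x2 matrix X."""
    x00, x11, x10 = complex(X[0][0]), complex(X[1][1]), complex(X[1][0])
    return ((x00 + x11).real / 2, x10.real, x10.imag, (x00 - x11).real / 2)

def lorentz_action(A, a):
    """Transform a by the Lorentz map induced by A via X -> A X A^dagger."""
    return from_hermitian(matmul(matmul(A, to_hermitian(a)), dagger(A)))

def minkowski(a, b):
    """Bilinear form with eta = diag(-1, 1, 1, 1)."""
    return -a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]

A = [[2, 1j], [0, 0.5]]                      # det A = 1, so A lies in SL(2, C)
minus_A = [[-x for x in row] for row in A]
a = (1.0, 0.2, -0.3, 0.5)

b = lorentz_action(A, a)
norm_defect = abs(minkowski(a, a) - minkowski(b, b))
# A and -A induce the same Lorentz transformation (two-to-one covering).
cover_defect = max(abs(x - y) for x, y in zip(b, lorentz_action(minus_A, a)))
```

Since $\det X = -\eta_{\mu\nu}a^\mu a^\nu$ and $\det A = 1$, the Minkowski norm is preserved, which is what `norm_defect` checks numerically.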
The symbol $\hookrightarrow$ reminds the reader that $T^4$ above is seen as a subspace of $C^\infty(S^2)$ and not as the four-dimensional translation group of vectors $a^\mu$ acting in Minkowski space. From now on, $(T^4)^* \subset N^*$ denotes the subspace generated by the subset of $N^*$
$$\{S^*_{lk} \mid l = 0, 1,\ k = 1, \ldots, 2l+1\},$$
where each $S^*_{lk}$ is completely defined by the requirement
$$(\alpha, S^*_{lk}) := a_{lk} \quad \forall\, \alpha \in N, \text{ with } a_{lk} \text{ given in (3.21)},$$
taking into account that each map $N \ni \alpha \to a_{lk}$ is continuous in the nuclear topology and thus it belongs to $N^*$. It is easily proved that $(T^4)^*$ and $N^*/(T^4)^0$ are canonically isomorphic and that the isomorphism (first introduced in [29]) is invariant under $SL(2,\mathbb{C})$ transformations. As a consequence, there is a linear projection of $N^*$ onto $(T^4)^*$ (which is, in fact, the usual projection onto the quotient space composed with the cited isomorphism)
$$\pi : N^* \to (T^4)^* \sim \frac{N^*}{(T^4)^0}. \tag{3.25}$$
That projection enjoys the following remarkable properties [29, 39], which give the first step towards the notion of mass for BMS representations:

Proposition 3.13. Let $\beta \in N^*$ and let $\phi \in D_{(-2,-2)}$ and $\hat\phi = |\zeta|^{-6}\phi$ be the distributions associated with $\beta$ as in Remark 3.4. The function
$$\widehat{\pi(\beta)}(\zeta', \bar\zeta') = \frac{i}{2(1 + |\zeta'|^2)} \int_{|\zeta|<1} \Big[(\zeta - \zeta')(\bar\zeta - \bar\zeta')\,\phi(\zeta, \bar\zeta) + (1 - \zeta\zeta')(1 - \bar\zeta\bar\zeta')\,\hat\phi(\zeta, \bar\zeta)\Big]\, d\zeta\, d\bar\zeta, \tag{3.26}$$
is well defined for $\zeta', \bar\zeta' \in \mathbb{C}$ and, in fact, it belongs to $T^4$. Moreover, as the notation suggests, $\widehat{\pi(\beta)}$ depends on $\pi(\beta)$ and not on the whole distribution $\beta$. That is, $\widehat{\pi(\beta)} = \widehat{\pi(\beta')}$ if $\pi(\beta) = \pi(\beta')$ for whatever $\beta, \beta' \in N^*$.

The following final proposition [25, 26, 29] is, partially, a straightforward consequence of Proposition 3.12. It produces the preannounced notion of mass, similar to that used in the theory of Poincaré representations.

Proposition 3.14. The space $(T^4)^*$ is invariant under the $SL(2,\mathbb{C})$-action on $N^*$ and, according to (3.8), the supertranslation associated to $\pi(\beta)$ may be expanded in spherical harmonics, thus extracting as in (3.22) a 4-vector $\pi(\beta)_\mu$. Moreover, if one defines the real bilinear form on $N^* \ni \beta_1, \beta_2$ as
$$B(\beta_1, \beta_2) := \eta^{\mu\nu}\,\pi(\beta_1)_\mu\,\pi(\beta_2)_\nu, \tag{3.27}$$
with $\eta := \mathrm{diag}(-1, 1, 1, 1)$, then $B$ turns out to be $SL(2,\mathbb{C})$-invariant.

$-B(\beta, \beta) = m^2$ is the equation for the squared mass $m^2$ of an intrinsic BMS field. It is the analog of the invariant mass of a field in Wigner's approach to define Poincaré-invariant particles. Consequently, we shall refer to $N^*$ as the supermomentum space and to its elements as the supermomenta.

3.4. Construction of unitary irreducible representations of $G_{BMS}$

Consider a group $G = B \ltimes A$, where $A$ is a (possibly infinite dimensional) topological abelian group, $B$ is a locally compact topological group, and a suitable group operation is defined in order to make $G$ the semidirect product of $B$ and $A$, which is a topological group with respect to the product of topologies. Using the definitions and propositions given above, the procedure to build up unitary irreducible representations of $G_{BMS}$ goes as follows, starting from representations of the little groups of characters. The next proposition has a straightforward proof.

Proposition 3.15. Take a character $\chi \in A'$ and a closed subgroup $K$ of $B$ which leaves $\chi$ invariant. If $K \ni L \to \sigma_L$ is a unitary and irreducible representation of $K$ acting on a, not necessarily finite-dimensional, target Hilbert space $V$, an associated unitary and irreducible representation $K \ltimes A \ni g \to \chi\sigma_g$ of $K \ltimes A$ acting on $V$ is constructed as follows:
$$\chi\sigma_{(\Lambda,\alpha)}\,\psi := \chi(\alpha)\,\sigma_\Lambda(\psi), \quad \text{for all } \psi \in V. \tag{3.28}$$

Furthermore, let us define the following equivalence relation in $G \times V$ equipped with the product topology:
$$(g, v) \sim_K (g', v') \quad \text{iff there is } g_K \in K \text{ such that } (g', v') = (g\,g_K^{-1},\ \chi\sigma(g_K)\,v). \tag{3.29}$$
The quotient space, equipped with its natural topology, will be denoted by
$$G \times_K V := \frac{G \times V}{\sim_K}.$$
From now on, concerning the equivalence classes associated with the equivalence relation defined above, we use the notation $[g, v]$ instead of the more appropriate but more complicated $[(g, v)]$.

Remark 3.16. A natural projection map exists,
$$\tau : G \times_K V \to \frac{G}{K},$$
which associates $[g, v] \in G \times_K V$ with $gK$. Furthermore, the inverse image $\tau^{-1}(p)$, with $p = gK \in G/K$ for some $g \in G$, consists of the classes $[g, v]$ where, for a fixed representative $g$, $v \in V$ is uniquely determined. Thus, there exists a natural bijection from $\tau^{-1}(p)$ onto $V$ such that, automatically, the former acquires the structure of a Hilbert space, and this structure does not depend upon the choice of $g \in G$ with $p = gK$. As a matter of fact, if $p = gK = g_1K$ with $g \neq g_1$, then the following diagram commutes:
$$\begin{array}{ccc} \tau^{-1}(p) & \xrightarrow{\ [g,v]\,\mapsto\, v\ } & V \\ {\scriptstyle \mathrm{id.}}\downarrow & & \downarrow{\scriptstyle \sigma^{(k)}(g_1^{-1}g)} \\ \tau^{-1}(p) & \xrightarrow{\ [g_1,v]\,\mapsto\, v\ } & V \end{array}$$
Consequently, since the representation $\sigma^{(k)}(g_1^{-1}g)$ as in (3.29) is unitary, the above statement naturally follows [43].

According to the above remark, we can introduce the following definition:

Definition 3.17. A triple $(X, \tau, Y)$ is called a Hilbert bundle if $X$ and $Y$ are topological spaces, $\tau$ is a continuous surjection of $X$ onto $Y$ and $\tau^{-1}(p)$ (the fiber) has a Hilbert space structure for each $p \in Y$ (see [46, Chap. 7]). In the following, a Hilbert bundle $(X, \tau, Y)$ will also be denoted by $\tau : X \to Y$.

Definition 3.18. Let $(X_1, \tau_1, Y_1)$ and $(X_2, \tau_2, Y_2)$ be two Hilbert bundles. A Hilbert-bundle isomorphism is a pair of homeomorphisms $\lambda_1 : X_1 \to X_2$, $\lambda_2 : Y_1 \to Y_2$, such that
• $\tau_2\lambda_1 = \lambda_2\tau_1$,
• $\lambda_1$ isometrically maps the fiber $\tau_1^{-1}(p)$ onto $\tau_2^{-1}(\lambda_2 p)$ for each $p \in Y_1$.
Definition 3.19. Let $G$ be a topological group and $(X, \tau, Y)$ a Hilbert bundle. Then, $(X, \tau, Y)$ is called a $G$-Hilbert bundle if there are two continuous actions of $G$ on $X$ and $Y$ such that the pair $\lambda_{1,g} : X \to X$, with $x \to gx$, and $\lambda_{2,g} : Y \to Y$, with $y \to gy$, is a Hilbert bundle automorphism for each $g \in G$. Accordingly, an isomorphism between two different $G$-Hilbert bundles is an isomorphism between the two Hilbert bundles which commutes with the $G$-action (see [46, Chap. 9]).

Proposition 3.20. According to Definitions 3.17 and 3.19, take a representation (3.28) $\chi\sigma$ associated with a character $\chi \in N'$ and an irreducible representation $\sigma$ of $L_\chi$ on the finite dimensional Hilbert space $\mathcal{H}$. A $G_{BMS}$-Hilbert bundle can be built up as follows,
$$\tau^\sigma_\chi : G_{BMS} \times_{H_\chi} \mathcal{H} \to \frac{G_{BMS}}{H_\chi}, \tag{3.30}$$
where:
(a) $G_{BMS} \times_{H_\chi} \mathcal{H}$ consists of the equivalence classes $[g, \psi]$ associated with the equivalence relation $\sim_{H_\chi}$ in $(G_{BMS} \times \mathcal{H}) \times (G_{BMS} \times \mathcal{H})$:
$$(g, \psi) \sim_{H_\chi} (g', \psi') \quad \text{if and only if } (g', \psi') = (g\,k^{-1},\ \chi\sigma(k)\psi) \text{ for some } k \in H_\chi,$$
(b) the group actions, respectively on $G_{BMS} \times_{H_\chi} \mathcal{H}$ and $G_{BMS}/H_\chi$, are defined as
$$g'\,[g, \psi] = [g'g, \psi], \qquad g'\,(g H_\chi) = (g'g)\,H_\chi.$$

Finally, consider two representations, $\chi\sigma$ on the finite dimensional Hilbert space $\mathcal{H}$ and $\chi\eta$ on the finite dimensional Hilbert space $\mathcal{H}'$, which are unitarily equivalent by $U : \mathcal{H} \to \mathcal{H}'$. Then the Hilbert bundles
$$\tau^\sigma_\chi : G_{BMS} \times_{H_\chi} \mathcal{H} \to \frac{G_{BMS}}{H_\chi} \quad \text{and} \quad \tau^\eta_\chi : G_{BMS} \times_{H_\chi} \mathcal{H}' \to \frac{G_{BMS}}{H_\chi}$$
are $G_{BMS}$-isomorphic under the map $[g, \psi] \to [g, U\psi]$.
In order to fully control the theory of $G_{BMS}$ unitary representations, we also need some measure-theoretical notions which will allow us to impose integrability conditions on the set of $G_{BMS}$ wave functions. Consider a generic topological space $X$. Two Borel measures $\mu, \nu$ on $X$ are said to lie in the same measure class if they assume the value zero on the same Borel sets in $X$, so that $\mu$ admits a Radon–Nikodym derivative with respect to $\nu$ and vice versa. In particular, when we deal with locally compact groups such as $SL(2,\mathbb{C})$, the following theorem holds (see [47] for the proof and also [46, Sec. 4]), and it is of great importance for our later applications.

Theorem 3.21. For any closed subgroup $K$ of a locally compact group $G$, there is a unique nonvanishing measure class $M$ on $G/K$ such that if $\mu \in M$, then $\mu_g \in M$ for every $g \in G$, where $\mu_g(E) = \mu(g^{-1}E)$ for every Borel set $E \subset G/K$. $M$ is called the invariant measure class of $G/K$.
Furthermore, according to [43], consider the Borel-measurable sections of a $G_{BMS}$-Hilbert bundle (3.30), i.e. Borel measurable functions $\psi : G_{BMS}/H_\chi \to G_{BMS} \times_{H_\chi} \mathcal{H}$ such that $\tau^\sigma_\chi \circ \psi = \mathrm{id}_{G_{BMS}/H_\chi}$. Since the orbit
$$O_\chi = \frac{G_{BMS}}{H_\chi} = \frac{SL(2,\mathbb{C}) \ltimes C^\infty(S^2)}{L_\chi \ltimes C^\infty(S^2)}$$
is isomorphic to $SL(2,\mathbb{C})/L_\chi$, we can exploit Theorem 3.21, introducing for any orbit $O_\chi$ and for a $\mu \in M$ the following Hilbert space:
$$\mathcal{H}_\mu = \left\{\psi : O_\chi \to \mathcal{H} \;\middle|\; \int_{O_\chi} d\mu(p)\, \langle\psi(p), \psi(p)\rangle < \infty\right\}. \tag{3.31}$$
Above, $\langle\,\cdot\,,\cdot\,\rangle$ refers to the $G_{BMS}$-invariant Hermitean inner product of the fiber $(\tau^\sigma_\chi)^{-1}(p)$, where $p$ is an element of the orbit $O_\chi$.

Each element$^{h}$ $\psi$ in $\mathcal{H}_\mu$, usually called an "induced wave function"$^{i}$ (or BMS intrinsic free field), inherits a natural $G_{BMS}$ action as:
$$(g\psi)(p) = \sqrt{\frac{d\mu(gp)}{d\mu(p)}}\ g\big(\psi(g^{-1}(p))\big), \quad \forall\, g \in G_{BMS}, \tag{3.32}$$
where $\frac{d\mu(gp)}{d\mu(p)}$ is the Radon–Nikodym derivative. It is worth stressing that, by construction, the scalar product in $\mathcal{H}_\mu$ is invariant under the above action of $G_{BMS}$.

Let us fix a little group $H_\chi$ and consider the set of all possible $G_{BMS}$-Hilbert bundles (3.30), $\zeta^\sigma = \big(G_{BMS}/H_\chi,\ \tau^\sigma_\chi,\ G_{BMS} \times_{H_\chi} \mathcal{H}\big)$. We are entitled to directly apply Mackey's theorem (see [44, Chap. 16] and [42, 49]), which grants us that:
Proposition 3.22. (3.32) individuates a unitary strongly continuous irreducible $G_{BMS}$ representation $T_\mu(\zeta^\sigma)$, called the induced representation associated with the irreducible representation $\sigma$.

Remark 3.23. For a fixed little group $H_\chi$, if we consider two invariant measures $\mu, \nu \in M$, then the map which associates to each $\psi \in \mathcal{H}_\mu$ the element $\sqrt{\frac{d\mu}{d\nu}}\,\psi \in \mathcal{H}_\nu$ defines an isometry between $\mathcal{H}_\mu$ and $\mathcal{H}_\nu$ and, at the same time, an equivalence between $T_\mu(\zeta^\sigma)$ and $T_\nu(\zeta^\sigma)$. Since, according to Theorem 3.21, we have chosen the unique invariant measure class of the base space $G_{BMS}/H_\chi$ on each $G_{BMS}$-Hilbert bundle, we are entitled to drop the $\mu$-dependence in the induced representation, $T_\mu(\zeta^\sigma) \equiv T(\zeta^\sigma)$.

$^{h}$ We adopt the symbol $\psi$ both for the intrinsic $G_{BMS}$ field and for the bulk field suitably restricted to $\Im^+$, since they will ultimately be the same object, at least for a scalar $G_{BMS}$ representation.

$^{i}$ For the interested reader, we underline that we adopt the most common name for the wave functions constructed from induced representations. Nonetheless, a whole zoo of different names exists in the literature, the most notable being canonical wave function (as in [48]) or Mackey wave function (as in [44]).
Apparently the last discussion grants us that $T(\zeta^\sigma)$ depends only upon a selected representation of the little group $H_\chi$, but it is rather intuitive that the existence of Hilbert bundle isomorphisms could imply that a priori different representations of $H_\chi$ on different $G_{BMS}$-Hilbert bundles could actually induce equivalent full $G_{BMS}$ representations. In detail, the last assertion can be justified if we notice that (3.30) depends only on the orbit $G_{BMS}\chi$ and not on the specific choice of $\chi$. Let us thus choose two different bundles, namely
$$\tau^\sigma_\chi : G_{BMS} \times_{H_\chi} \mathcal{H} \to \frac{G_{BMS}}{H_\chi}, \qquad \tau^{\sigma_1}_{\chi_1} : G_{BMS} \times_{H_{\chi_1}} \mathcal{H} \to \frac{G_{BMS}}{H_{\chi_1}},$$
such that $G_{BMS}\chi = G_{BMS}\chi_1$ for $\chi_1 \neq \chi$. As a consequence, an element $g_1 \in G_{BMS}$ exists such that $\chi_1 = g_1\chi$ and, according to Definition 3.8, $L_{\chi_1} = g_1 L_\chi g_1^{-1}$. This identity translates at the level of representations as $\sigma_1(h) = \sigma(g_1^{-1} h g_1)$ for each $h \in H_{\chi_1}$. Furthermore, according to Definition 3.19, there is an isomorphism
$$(\lambda_1, \lambda_2) : \left(G_{BMS} \times_{H_\chi} \mathcal{H},\ \tau^\sigma_\chi,\ \frac{G_{BMS}}{H_\chi}\right) \to \left(G_{BMS} \times_{H_{\chi_1}} \mathcal{H},\ \tau^{\sigma_1}_{\chi_1},\ \frac{G_{BMS}}{H_{\chi_1}}\right),$$
induced by the maps
$$\lambda_1 : [g, \psi] \to [g_1\, g\, g_1^{-1}, \psi] \quad \text{and} \quad \lambda_2 : G_{BMS}\chi \ni p \to g_1 p \in G_{BMS}\chi_1,$$
where $p$ stands for a generic point on the orbit. Thus, the irreducible representations induced either from $\sigma$, i.e. $T(\zeta^\sigma)$, or from $\sigma_1$, i.e. $T(\zeta^{\sigma_1})$, are $G_{BMS}$-equivalent by construction and, consequently, they will be considered as the same. A summary of this discussion lies in the following remark:

Remark 3.24. The $\sigma$-dependence of $T(\zeta^\sigma)$ is determined up to $G_{BMS}$-equivalence.

Remark 3.25. According to the previous discussion and, in particular, according to Remarks 3.23 and 3.24, a generic $G_{BMS}$ (unitary) representation depends only upon the choice of the character $\chi$ and of the unitary representation $\sigma$ of $H_\chi$. Consequently it will be indicated as $T(\zeta^\sigma_\chi)$, making explicit the dependence on $\chi$.

The explicit action of a generic $T(\zeta^\sigma_\chi)$ should be defined on the induced wave function as in (3.32). However, it is more convenient to recast (3.32) as$^{j}$:
$$\psi(gh) = T(h^{-1})\,\psi(g), \quad \forall\, g \in SL(2,\mathbb{C}),\ h \in L_\chi, \tag{3.33}$$
where we write $T(h^{-1})\psi(g)$ instead of $T(\zeta^\sigma_\chi)(h^{-1})\psi(g)$ to stress that, for a fixed $G_{BMS}$-Hilbert bundle and for a fixed representation $\sigma$ of $L_\chi$, the dependence of the induced representation $T$ on such data is superfluous. The $G_{BMS}$ action explicitly reads, for $(\Lambda, \alpha) \in G_{BMS}$,
$$(\Lambda\psi)(g) = \psi(\Lambda^{-1}g), \tag{3.34}$$
$$(\alpha\psi)(g) = \chi(g^{-1}\alpha)\,\psi(g), \tag{3.35}$$

$^{j}$ In the literature such as [25, 26], the argument $(\zeta^\sigma_\chi)$ is considered a priori fixed and thus it is not even introduced.
which is a unitary representation induced from $T(\zeta^\sigma_\chi)$ as in (3.33) and thus, according to Mackey's theorem, it is also irreducible. From an operative point of view, an equivalent definition of an induced wave function can be constructed by dropping the condition (3.34). In this scenario, we introduce the set of $\mu$-square-integrable maps of $\mathcal{H}_\mu$ (see Theorem 3.21)
$$\psi : O_\chi = \frac{SL(2,\mathbb{C})}{L_\chi} \to \mathcal{H}. \tag{3.36}$$
However, the absence of (3.34) requires the introduction of an additional datum, namely an almost everywhere continuous section $\omega$ of the bundle $\tau : SL(2,\mathbb{C}) \to O_\chi$ which satisfies $\omega(p)\chi = p$, $\forall\, p \in O_\chi$. Thus, we can define, as an induced wave function, a map (3.36) which transforms under $(\Lambda, \alpha) \in G_{BMS}$ as
$$(\Lambda\psi_\omega)(p) = \sqrt{\frac{d\mu(\Lambda p)}{d\mu(p)}}\ \big[T(\zeta^\sigma_\chi)\big]\big(\omega(p)^{-1}\Lambda\,\omega(\Lambda^{-1}p)\big)\,\psi_\omega(\Lambda^{-1}(p)), \quad \Lambda \in SL(2,\mathbb{C}),\ p \in O_\chi, \tag{3.37}$$
$$(\alpha\psi_\omega)(p) = p(\alpha)\,\psi_\omega(p), \quad \alpha \in C^\infty(S^2). \tag{3.38}$$
Above, $p(\alpha)$ denotes the action of the character $p \in SL(2,\mathbb{C})\chi$ on $\alpha$, and the subscript $\omega$ reflects the strict dependence of the induced wave function upon the choice of the section itself.

3.5. The scalar-induced wave function

The long explicit construction of all the $G_{BMS}$ irreducible unitary representations has been completed and extensively discussed in the Hilbert topology in [25, 26] and in the nuclear topology in [29], thus it will not be reviewed here. It is anyway interesting for our purposes to stress some of the nontrivial points in McCarthy's analysis; in particular, whereas in the Hilbert topology all the unitary representations of the BMS group can be constructed as induced representations from compact little groups, in the nuclear topology the scenario is far more complicated and it can be summarized in the following proposition [29]:

Proposition 3.26. If a unitary representation of $G_{BMS}$ is irreducible, then it must arise either from a transitive $SL(2,\mathbb{C})$ action on $N^*$ or from a cylinder measure with respect to which the $SL(2,\mathbb{C})$ action is strictly ergodic.

That proposition is the reason why the current classification of unitary irreducible representations of the $G_{BMS}$ group is not complete. As a matter of fact, the construction
of representations arising from strictly ergodic measures is rather challenging and, up to now, it has been neither solved nor addressed in detail. Nonetheless, for our purposes, we are mainly interested in induced representations. These have been fully considered in [29] where, starting from the analysis in [50], a plethora of possible $G_{BMS}$ little groups has been identified. These can be classified in two different families, the connected subgroups of $SL(2,\mathbb{C})$ and the nonconnected compact subgroups of $SU(2)$. We shall now concentrate on$^{k}$ $SU(2)$, $SU(1,1)$, $\Gamma$ (the universal covering of $SO(2)$ made of all the matrices $\mathrm{diag}(e^{it/2}, e^{-it/2})$ with $t \in \mathbb{R}$) and on $\Delta = \Gamma \ltimes T^2$ (the double covering of the two-dimensional Euclidean group). The analysis for the $SU(2)$ scenario has already been developed in [7, 11] in the Hilbert topology, where the wave functions of the intrinsic BMS free fields and their kinematical and dynamical configurations have been thoroughly discussed. By contrast, we shall now focus on the $\Delta$ case (proper only to the nuclear topology), which will turn out to be in direct correspondence with scalar fields on $\Im^+$ induced from the bulk.

$\Delta$ Orbit Classification. This little group is the set of matrices
$$\Lambda_{t,\upsilon} = \begin{pmatrix} e^{it/2} & \upsilon \\ 0 & e^{-it/2} \end{pmatrix},$$
with $t \in \mathbb{R}$ and $\upsilon \in \mathbb{C}$. Thus, according to (3.17), a fixed point $\bar\beta$ (which thus admits $\Delta$ as little group) satisfies
$$\Delta\bar\beta = \bar\beta.$$
In order to solve this distributional equation, the rationale is to switch from $\bar\beta \in N^*$ to the associated pair $(\bar\phi, \hat{\bar\phi}) \in D_{-2}$ as in Remark 3.4 and to use (3.12) and (3.13), i.e.
$$\Delta\bar\phi = \bar\phi, \qquad \Delta\hat{\bar\phi} = \hat{\bar\phi}.$$
As discussed in [29], the general solution to these equations is:
$$\bar\phi = S, \tag{3.39}$$
$$\hat{\bar\phi} = S|\zeta|^{-6} + A\,\delta^{2,2} + C\,\delta, \tag{3.40}$$
where $S, A, C \in \mathbb{R}$ are constants and $\delta^{p,q}$ is the $p$th derivative in the variable $\zeta$ and $q$th derivative in the variable $\bar\zeta$ of $\delta = \delta(\zeta)\delta(\bar\zeta)$.

Proposition 3.27. The mass (3.27) associated to any orbit of the $\Delta_\chi$ little group is 0.

Proof. The demonstration follows directly from Proposition 3.14 and (3.26), according to which
$$\widehat{\pi(\bar\beta)}(\zeta', \bar\zeta') = \frac{C}{1 + |\zeta'|^2} \not\equiv 0;$$
thus (3.22) grants us that $\pi(\bar\beta)_\mu = C(1, 0, 0, 1)$; consequently we conclude from (3.27) that $m^2 = -\eta^{\mu\nu}\pi(\bar\beta)_\mu\pi(\bar\beta)_\nu = 0$.

$^{k}$ These compact little groups are also present in the Hilbert topology scenario.
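The last step of the proof is elementary arithmetic, which the following sketch spells out. It evaluates the bilinear form (3.27) directly on 4-vectors, assuming they have already been extracted as in (3.22); the value chosen for the constant $C$ and the function names are hypothetical.

```python
ETA = (-1.0, 1.0, 1.0, 1.0)  # eta = diag(-1, 1, 1, 1)

def bilinear_form(p1, p2):
    """B = eta^{mu nu} p1_mu p2_nu, evaluated on two given 4-vectors."""
    return sum(e * x * y for e, x, y in zip(ETA, p1, p2))

def squared_mass(p):
    """m^2 = -B(p, p)."""
    return -bilinear_form(p, p)

C = 2.5                                        # hypothetical value of the constant C
p_delta = tuple(C * x for x in (1, 0, 0, 1))   # 4-vector of the Delta orbit
m2_delta = squared_mass(p_delta)               # null vector: vanishing mass
m2_rest = squared_mass((3.0, 0.0, 0.0, 0.0))   # timelike vector, for comparison
```

The null vector $C(1,0,0,1)$ gives a vanishing squared mass for every value of $C$, whereas a timelike vector such as $(3,0,0,0)$ gives $m^2 = 9$.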
Only a three-dimensional orbit $\frac{SL(2,\mathbb{C})}{\Delta} = \mathbb{R} \times S^2$ with a vanishing mass can be associated to the $\Delta_\chi$ little group. Furthermore, from (3.40) we can infer that, besides the constant $C$ which plays the role of the energy, the orbit is fully determined only if we fix the values $A, S$, which from now on are set to 0. We can now explicitly construct the representation and we can choose an $SL(2,\mathbb{C})$-invariant measure on the orbit $\frac{SL(2,\mathbb{C})}{\Delta_\chi}$; these are the key data to construct the induced wave function as in Propositions 3.10 and 3.12. Leaving the detailed analysis for the other nonconnected little groups to [7, 26], we concentrate on:

$\Delta$-Induced Wave Functions. Unitary and irreducible representations of $\Delta$ are of two types [29, 44]. A representation $D^{\lambda,p,q}$ of the first type is individuated by a triple $p, q \in \mathbb{R}\setminus\{0\}$, $\lambda \in \mathbb{Z}_2$ and it is defined by:
$$D^{\lambda,p,q}\begin{pmatrix} e^{it/2} & \upsilon \\ 0 & e^{-it/2} \end{pmatrix} = e^{i\lambda t}\, e^{i(pb + qc)}, \quad \upsilon = b + ic.$$
This acts on an infinite dimensional complex target Hilbert space by multiplication and it induces an infinite dimensional $G_{BMS}$ representation. A representation $D^s$ of the second type is individuated by a number $s \in \mathbb{Z}/2$ and it is defined by:
$$D^s\begin{pmatrix} e^{it/2} & \upsilon \\ 0 & e^{-it/2} \end{pmatrix} = e^{ist}. \tag{3.41}$$
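That (3.41) defines a representation can be checked numerically. The sketch below, with hypothetical function names, multiplies two matrices $\Lambda_{t,\upsilon}$, verifies that the parameter $t$ is additive under the product, and verifies that $e^{ist}$ is multiplicative. It also checks that $D^s$ is unchanged under $t \to t + 4\pi$, which leaves the matrix itself unchanged; this is where the restriction to half-integer $s$ enters.

```python
import cmath

def delta_element(t, v):
    """Matrix Lambda_{t,v} = [[e^{it/2}, v], [0, e^{-it/2}]]."""
    return [[cmath.exp(0.5j * t), complex(v)], [0j, cmath.exp(-0.5j * t)]]

def matmul(X, Y):
    """Product of two 2x2 complex matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def composed_parameters(t1, v1, t2, v2):
    """Parameters (t, v) of the product Lambda_{t1,v1} Lambda_{t2,v2}."""
    return t1 + t2, cmath.exp(0.5j * t1) * v2 + v1 * cmath.exp(-0.5j * t2)

def D(s, t):
    """One-dimensional representation D^s of (3.41), as a function of t."""
    return cmath.exp(1j * s * t)

t1, v1 = 0.7, 1 + 2j
t2, v2 = -1.3, 0.5 - 0.25j

product = matmul(delta_element(t1, v1), delta_element(t2, v2))
expected = delta_element(*composed_parameters(t1, v1, t2, v2))
closure_defect = max(abs(product[i][j] - expected[i][j])
                     for i in range(2) for j in range(2))

s = 1.5  # a half-integer label
hom_defect = abs(D(s, t1 + t2) - D(s, t1) * D(s, t2))
# t and t + 4*pi give the same matrix, so D^s must agree on both.
period_defect = abs(D(s, t1 + 4 * cmath.pi) - D(s, t1))
```

All three defects vanish up to rounding, so $\Delta$ is closed under multiplication and $D^s$ is a well-defined homomorphism into $U(1)$.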
The representation $D^s$ acts on a one-dimensional complex target Hilbert space by multiplication.

Remark 3.28. The above representations are well known even for a Poincaré-invariant theory, where the second is faithful and the first unfaithful; in a $G_{BMS}$ scenario, however, they are both faithful. More generally, it has been shown in [29] that an induced representation built upon an orbit $O_\chi$ is faithful iff $\pi(\bar\phi) \neq 0$, $\bar\phi$ being the supermomentum associated to $O_\chi$ solving (3.17).

It is possible to reinforce the result presented in Theorem 3.21: we can exploit either Theorem 1 and Corollary 1 in [44, Sec. 3, Chap. 4] or the unimodularity$^{l}$ of $SL(2,\mathbb{C})$ and $\Delta$ to claim that the $SL(2,\mathbb{C})$-invariant measure class $M$ contains a measure $\mu$ which is $SL(2,\mathbb{C})$-invariant. Referring to this specific measure $\mu$, we can construct the Hilbert space of induced wave functions $\psi : O_\chi \to \mathbb{C}$
$$\mathcal{H}_\mu = \left\{\psi : O_\chi \to \mathbb{C} \;\middle|\; \int_{O_\chi} d\mu(p)\, \overline{\psi(p)}\,\psi(p) < +\infty\right\}. \tag{3.42}$$

$^{l}$ A locally compact group $G$ is called unimodular if its right-invariant and left-invariant Haar measures coincide (cf. [44, p. 69]).
Now, we can use Remark 3.23 and the formulas (3.37) and (3.38) to construct the explicit expression of the induced wave function (3.32).

Remark 3.29. The $\Delta$ little group is a rather special case since no global continuous section $\omega$ of the bundle $\tau : SL(2,\mathbb{C}) \to \frac{SL(2,\mathbb{C})}{\Delta}$ exists, such as for the $SU(2)$ or the $\Gamma$ little group. There are different choices commonly used and, since they are far from the aim of this paper, we refer to [51] for a complete discussion.

According to (3.37) and (3.38), an induced wave function transforms, for any $g = (\Lambda, \alpha) \in G_{BMS}$ and under the $\Delta$ representation (3.41), as
$$(g\psi)(p) = \sqrt{\frac{d\mu(\Lambda p)}{d\mu(p)}}\ p(\alpha)\, D^s\big(\omega(p)^{-1}\Lambda\,\omega(\Lambda^{-1}p)\big)\,\psi_\omega(\Lambda^{-1}p) = p(\alpha)\, e^{ist}\, \psi_\omega(\Lambda^{-1}p), \tag{3.43}$$
where, with the above-said choice of $\mu$, the Radon–Nikodym derivative is 1 and $e^{ist}$ is the action of the one-dimensional $\Delta$ representation associated to $D^s(\omega(p)^{-1}\Lambda\,\omega(\Lambda^{-1}p))$. Finally, we may write the induced scalar $\Delta$ $G_{BMS}$ wave function (i.e. $s = 0$ in (3.43)) as
$$\psi : O_\chi \to \mathbb{C}, \tag{3.44}$$
$$(g\psi_\omega)(\Lambda p) = p(\alpha)\,\psi_\omega(p), \quad \forall\, g \in G_{BMS}. \tag{3.45}$$
3.6. The covariant scalar wave function and its bulk interpretation

Although the induced wave function transforms under a unitary irreducible $G_{BMS}$ representation, thus containing all relevant physical information, from a physical perspective it is rather common to start from a different wave function. That is the covariant wave function(al) (or covariant free field) which, in a BMS setting, is [7, 52]:
$$\Psi : N^* \to \mathcal{H}, \tag{3.46}$$
where $\mathcal{H}$ is a suitable finite dimensional target Hilbert space, either real or complex, and $N^*$ is the space of distributions over $S^2$. Under the action of $g = (\Lambda, \alpha) \in G_{BMS}$, $\Psi$ in (3.46) transforms as
$$[U^\lambda(g)\Psi][\beta] = \chi_\beta(\Lambda\alpha)\, D^\lambda(\Lambda)\, \Psi[\Lambda^{-1}\beta], \tag{3.47}$$
where $D^\lambda(\Lambda)$ is a unitary, but not necessarily irreducible, representation of $SL(2,\mathbb{C})$ labeled by the superscript $\lambda$. $\chi_\beta$ is the character associated with $\beta$ as in Definition 3.5 and it acts according to Remark 3.7.
Remark 3.30. At first glance (3.46) and (3.32) are a priori unrelated, the most striking difference consisting in the existence of a different induced wave function for each isotropy subgroup $H_\chi$, whereas the covariant wave equation is unique up to the choice of a unitary $SL(2,\mathbb{C})$ representation. Nonetheless, it is possible to show that both kinds of wave functions are ultimately equivalent if suitable constraints are imposed on the covariant wave function$^{m}$ [44, 48]. In the BMS scenario [7], upon selecting a specific covariant wave function and a representation $U^\lambda$ as in (3.47), the restriction to the induced wave function associated to a fixed little group $H_\chi$ operatively corresponds to:
1. restrict the support of (3.47) from $N^*$ to the orbit $\frac{SL(2,\mathbb{C})}{L_\chi} \hookrightarrow N^*$ ($\hookrightarrow$ denoting an embedding);
2. act on $\Psi$ by the linear transformation $U^\lambda(\omega^{-1}(\beta))$, where $\beta$ is a point on the orbit and where $\omega$ is the section chosen in (3.37);
3. select in (3.47) the irreducible unitary representation $\sigma$ of (3.32) contained in $D^\lambda$.

We now discuss in detail the last step in the above construction, which is rather counterintuitive. Let us start both from a generic unitary, but fixed, representation $D^\lambda$ of $SL(2,\mathbb{C})$ and from a generic, but fixed, irreducible unitary representation $\sigma^j$ of a fixed little group $L_\chi$. Let us now consider the restriction of $D^\lambda$ to $L_\chi$, which decomposes as the finite sum $D^\lambda\big|_{L_\chi} = \bigoplus_j C_{\lambda,j}\,\sigma^j$, where $\sigma^j$ is a unitary irreducible representation of $L_\chi$ and $C_{\lambda,j}$ are suitable integers standing for the multiplicity of $\sigma^j$ in $D^\lambda$. According to [44, Theorem 16.2.1], the above decomposition translates both to the full $G_{BMS}$ representation and to the target space of (3.47), i.e.
$$\mathcal{H} = \bigoplus_j C_{\lambda,j}\,\mathcal{H}_j. \tag{3.48}$$
Let us now recognize a fixed $\mathcal{H}_j$ as the target space of an induced wave function (3.32) which transforms under the action of the unitary and irreducible representation $\sigma^j$. The selection of a fixed representation $\sigma^j$ in $D^\lambda$ is now equivalent to constraining $\mathcal{H}$ (the target space of (3.46)) to $\mathcal{H}_j$. This result can be operatively achieved by imposing the subsidiary condition on the covariant wave function (with support on $O_\chi$)
$$\rho\,\Psi[\bar\beta] = \Psi[\bar\beta], \tag{3.49}$$
where $\rho$ is the projector selecting $\mathcal{H}_j \subset \mathcal{H}$ and where $\bar\beta$ is the supermomentum associated to the fixed point on $O_\chi$.

$^{m}$ We shall also refer to the construction of the covariant wave equation and of the associated equations of motion as Wigner's program, in relation to Wigner's seminal paper [53] where he dealt with the Poincaré case.
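If we now remember that the following identity holds:
$$\Psi[\beta] = [D^\lambda(\Lambda)\Psi][\bar\beta], \tag{3.50}$$
where $\beta = \Lambda^{-1}\bar\beta \in \frac{SL(2,\mathbb{C})}{L_\chi}$ and where $\Lambda \in SL(2,\mathbb{C})$, (3.49) becomes
$$D^\lambda(\Lambda)\,\rho\,[D^\lambda(\Lambda)]^{-1}\,\Psi[\beta] = \Psi[\beta].$$
If we set$^{n}$ $\rho[\beta] = D^\lambda(\Lambda)\,\rho\,[D^\lambda(\Lambda)]^{-1}$, (3.50) becomes the so-called projection equation
$$\rho[\beta]\,\Psi[\beta] = \Psi[\beta]. \tag{3.51}$$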
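The passage from (3.49) to (3.51) is a conjugation of the projector, which can be checked on a small example. The sketch below uses a hypothetical $2\times 2$ unitary matrix `D` as a stand-in for $D^\lambda(\Lambda)$ and the projector onto the first component as $\rho$; neither is taken from the paper. It verifies that $\rho[\beta] = D\rho D^{-1}$ is again a projector and that it fixes $\Psi[\beta] = D\,\Psi[\bar\beta]$ whenever $\rho$ fixes $\Psi[\bar\beta]$.

```python
import cmath
import math

def matmul(X, Y):
    """Product of two square complex matrices stored as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matvec(X, v):
    """Matrix-vector product."""
    return [sum(X[i][k] * v[k] for k in range(len(v))) for i in range(len(X))]

def dagger(X):
    """Conjugate transpose of a square matrix."""
    n = len(X)
    return [[complex(X[j][i]).conjugate() for j in range(n)] for i in range(n)]

theta, phi = 0.4, 1.1
c, s = math.cos(theta), math.sin(theta)
# Hypothetical unitary matrix standing in for D^lambda(Lambda).
D = [[c, -s * cmath.exp(1j * phi)], [s * cmath.exp(-1j * phi), c]]
rho = [[1, 0], [0, 0]]                        # projector onto the first component

rho_beta = matmul(matmul(D, rho), dagger(D))  # D rho D^{-1}, with D^{-1} = D^dagger

psi_bar = [1.0 + 0.5j, 0.0]                   # satisfies rho psi_bar = psi_bar, cf. (3.49)
psi = matvec(D, psi_bar)                      # cf. (3.50)

idempotent_defect = max(abs(x - y)
                        for r1, r2 in zip(matmul(rho_beta, rho_beta), rho_beta)
                        for x, y in zip(r1, r2))
projection_defect = max(abs(x - y) for x, y in zip(matvec(rho_beta, psi), psi))
```

Both defects vanish up to rounding: the conjugated projector is idempotent and the transformed wave function satisfies the projection equation.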
Remark 3.31. According to the discussion of [44, Chap. 21, Sec. 1B], which easily generalizes to the $G_{BMS}$ scenario, (3.51) is also a covariant matrix operator and, since induced wave functions are in one-to-one correspondence with pairs $\{D^\lambda(\Lambda), \rho\}$, it is customary to claim that (3.51) represents the most general covariant wave equation.$^{o}$

Since our final goal is to show the equivalence between (2.31) and the $G_{BMS}$ $\Delta$ scalar induced wave function (3.45), let us restrict our attention to this specific case.

Definition 3.32. A covariant scalar wave function is a map
$$\Psi : N^* \to \mathbb{C}, \tag{3.52}$$
which transforms as
$$[U^\lambda(g)\Psi](\beta) = \chi_\beta(\Lambda\alpha)\,\Psi[\Lambda^{-1}\beta], \tag{3.53}$$
under a scalar $SL(2,\mathbb{C})$ unitary representation $U_g$, with $g = (\Lambda, \alpha) \in G_{BMS}$. In (3.53) $\Lambda^{-1}\beta$ is defined as in (3.10).

Proposition 3.33. Referring to the definition above, the constraint to impose to reduce (3.53) to (3.45) is
$$\left[\beta - \frac{SL(2,\mathbb{C})}{\Delta}\,\bar\beta\right]\Psi[\beta] = 0, \qquad \pi(\bar\beta) \neq 0, \tag{3.54}$$
where $\beta \in N^*$ and $\bar\beta$ is the fixed point of the $\Delta$ orbit constructed out of (3.39) and (3.40).

The proof of this proposition is a straightforward consequence of the analysis in [7] and of the coincidence between the scalar covariant $SL(2,\mathbb{C})$ representation and the scalar representation induced from the $\Delta$ little group (i.e. (3.51) is identically satisfied). Furthermore, the mass equation which usually appears in the Hilbert topology [7], i.e.
$$\big[\eta^{\mu\nu}\,\pi(\beta)_\mu\,\pi(\beta)_\nu\big]\,\Psi[\beta] = 0,$$
is automatically satisfied by (3.54), since the little group $\Delta$ is associated only to a vanishing mass whenever $\pi(\bar\beta) \neq 0$.

To conclude, we now want to establish the main result of this section, namely that a covariant massless scalar field which satisfies (3.54) is identical to (2.31). Let us remember that $N^* \sim D_{(-2,-2)} \sim D_{-2}$ as well as $N \sim D_{(2,2)} \sim D_2$. Thus, an element $\beta \in N^*$ is bijectively related with the pair $\phi, \hat\phi \in D_{-2}$ introduced in Proposition 3.2. Furthermore, let us recall that the fixed point of the $\Delta$ orbit is $\hat{\bar\phi} = S|z|^{-6} + A\,\delta^{2,2} + C\,\delta$ with $C \neq 0$; if we select the specific values $S = A = 0$, then $\hat{\bar\phi} = C\delta$ and, according to Proposition 3.2 and to Remark 3.4, the associated supermomentum is
$$\bar\beta = \frac{\hat{\bar\phi}}{(1 + |\zeta|^2)^3}.$$
We now need the following lemma:

Lemma 3.34. The supermomentum $\bar\beta$ lies in $(T^4)^*$.

Proof. Consider the isomorphism discussed about (3.25), first introduced in [29],
$$\frac{N^*}{(T^4)^0} \sim (T^4)^*, \tag{3.55}$$

$^{n}$ The $\beta$ dependence of $\rho$ should not be interpreted literally: it means that $\Lambda$ in (3.50) is the unique value such that $\Lambda^{-1}\bar\beta = \beta$.

$^{o}$ [44, Chap. 21] contains the specific discussion for the Poincaré scenario, where it is shown that (3.51) is simply a compact expression for the usual Dirac, Proca equations, etc., whereas, for the $G_{BMS}$ counterpart in the Hilbert topology, we refer to [7].
where both sides are preserved under $SL(2,\mathbb{C})$ transformations and the isomorphism commutes with the action of that group. It is straightforward that $\bar\beta \notin (T^4)^0$ since, if we consider any supertranslation $f \in T^4 \hookrightarrow C^\infty(S^2)$ such that $f(0) \neq 0$, then $(f,\bar\beta) = Cf(0) \neq 0$. As a consequence, we are free to choose $\bar\beta$ as the representative of a conjugacy class in $\frac{N^*}{(T^4)^0}$ and, according to (3.55), $\bar\beta$ also lies in $(T^4)^*$.

Furthermore, since the orbit $O_{\bar\beta}$ is generated as $\frac{SL(2,\mathbb{C})}{\Delta}\bar\beta$, (3.55) also grants us that $O_{\bar\beta} \in (T^4)^*$, i.e. according to Proposition 3.13, $\pi(\beta) = \beta$ for any $\beta \in O_{\bar\beta}$. This last remark entitles us to substitute in (3.53) $\beta$ with $\pi(\beta)$:
$$\Psi : (T^4)^* \to \mathbb{C},$$
$$[U(g)\Psi](\pi(\beta)) = \chi_{\pi(\beta)}(\Lambda\alpha)\Psi[\Lambda^{-1}\pi(\beta)], \qquad \forall\, g \in \widetilde{G}_{BMS},$$
which still satisfies the orbit constraint
$$\left[\pi(\beta) - \frac{SL(2,\mathbb{C})}{\Delta}\pi(\bar\beta)\right]\Psi[\pi(\beta)] = 0.$$
The next step consists in bearing in mind that $(T^4)^* \sim T^4$ [29], i.e. according to Proposition 3.13 and (3.26), any element $\pi(\beta) \in (T^4)^*$ is in one-to-one correspondence with the element $\widehat{\pi(\beta)} \in T^4$.
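Furthermore, according to (3.22) and to Proposition 3.27, we can identify each $\widehat{\pi(\beta)} \in O_{\widehat{\pi(\bar\beta)}}$ with a four-vector $p_\mu$ which satisfies the mass relation $\eta^{\mu\nu}p_\mu p_\nu = 0$. Thus, we can write $p_\mu \equiv (E, E\,n(\zeta',\bar\zeta'))$ where $n(\zeta',\bar\zeta')$ is a three-dimensional spatial versor spanning a two-sphere of unit radius whose coordinates are $\zeta',\bar\zeta'$. At this stage, the reader should bear in mind that we are ultimately dealing with a Gelfand triplet, i.e. $N \subset L^2(S^2) \subset N^*$; thus, according to these last remarks, we are entitled to switch from the covariant wave function living on $(T^4)^* \subset N^*$ to a second one living on $T^4 \subset N$ which reads$^{p}$:
$$\Psi : O_{\widehat{\pi(\bar\beta)}} \hookrightarrow T^4 \to \mathbb{C},$$
$$(U(g)\Psi)[\widehat{\pi(\beta)}] = \chi(\Lambda\alpha)\Psi[\Lambda^{-1}\widehat{\pi(\beta)}], \qquad \forall\, g = (\Lambda,\alpha) \in \widetilde{G}_{BMS}.$$
Since $\widehat{\pi(\beta)}$ now lies in $C^\infty(S^2)$, the net effect of an $SL(2,\mathbb{C})$ action is, according to (3.5) and to (3.11),
$$(\Lambda^{-1}\widehat{\pi(\beta)})(\zeta',\bar\zeta') = K_\Lambda(\Lambda^{-1}\zeta',\Lambda^{-1}\bar\zeta')\,\widehat{\pi(\beta)}(\Lambda^{-1}\zeta',\Lambda^{-1}\bar\zeta').$$
In terms of 4-vectors, $\Lambda^{-1}\widehat{\pi(\beta)}$ corresponds to the 4-vector whose components are
$$p^0 = K_\Lambda(\Lambda^{-1}\zeta',\Lambda^{-1}\bar\zeta')E, \qquad \mathbf{p} = K_\Lambda(\Lambda^{-1}\zeta',\Lambda^{-1}\bar\zeta')E\,n(\Lambda^{-1}\zeta',\Lambda^{-1}\bar\zeta').$$
The character can be directly evaluated as
$$\chi(\Lambda\alpha) = e^{iEK_{\Lambda^{-1}}(\zeta',\bar\zeta')\,\alpha(\Lambda^{-1}\zeta',\Lambda^{-1}\bar\zeta')}.$$
Substituting these results in the scalar covariant wave equation and taking into account that each $(\widehat{\pi(\beta)})(\zeta',\bar\zeta')$ is uniquely determined by its associated four-vector $p_\mu$ which, in turn, is determined by the coordinates $(E,\zeta,\bar\zeta)$, we can eventually recast (3.53) in terms of a field $\varphi(E,\zeta,\bar\zeta) := \Psi[\widehat{\pi(\beta)}]$ as:
$$[U(g)\varphi](E,\zeta',\bar\zeta') = \frac{e^{iEK_\Lambda(\Lambda^{-1}\zeta',\Lambda^{-1}\bar\zeta')\,\alpha(\Lambda^{-1}\zeta',\Lambda^{-1}\bar\zeta')}}{\sqrt{K_\Lambda(\Lambda^{-1}(\zeta,\bar\zeta))}}\,\varphi\big(K_{\Lambda^{-1}}(\zeta',\bar\zeta')E,\ \Lambda^{-1}\zeta',\ \Lambda^{-1}\bar\zeta'\big).$$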
The square root is due to the fact that we passed to the measure $dE \otimes \epsilon_{S^2}$ from the invariant one $d\mathbf{p}/E(\mathbf{p})$. We have found nothing but the unitary representation of $G_{BMS}$ given in (2.31). Therefore, this fact also shows that the representation of $\widetilde{G}_{BMS}$ obtained by (3.54) is a unitary representation of $G_{BMS}$ as well. We have eventually proved that:

Theorem 3.35. A field on $\Im^+$ satisfying (2.31) is identical to a $\widetilde{G}_{BMS}$-covariant massless scalar field which satisfies (3.54). Furthermore, the representation of $\widetilde{G}_{BMS}$ obtained by (3.54) is a unitary representation of $G_{BMS}$ as well.

$^{p}$ Alternatively, it is possible to interpret the covariant wave function on $T^4$ as the one on $(T^4)^*$ where the argument $\beta \in (T^4)^*$ has been evaluated with a fixed test function as in (3.26).

As a last remark, we wish to clarify why the above theorem holds only when a suitable nuclear topology is imposed on the set of supertranslations. If we choose $N = L^2(S^2,\epsilon_{S^2})$ (where the field of the Hilbert space is $\mathbb{R}$), it is still possible to construct a massless scalar wave function induced from the $\Gamma$ little group living on an orbit whose fixed point has a vanishing pure supertranslational component. Nonetheless, in this framework, according to the Riesz–Fischer theorem, a character $\chi(\alpha)$ can always be associated with an element $\beta \in L^2(S^2,\epsilon_{S^2})$ [25] such that
$$\chi(\alpha) = e^{i\int_{S^2}\epsilon_{S^2}\,\alpha\beta}, \qquad \forall\,\alpha \in L^2(S^2,\epsilon_{S^2}).$$
This formula represents the key obstruction to obtaining (2.31) in a Hilbert topology framework since, whenever we consider a scalar covariant wave function $\Psi : N^* = L^2(S^2,\epsilon_{S^2}) \to \mathbb{C}$ with a support restricted to $T^4 \subset N^*$, we are requiring that $\beta \in N^*$ can be written as $\beta(\zeta,\bar\zeta) = \sum_{k=0}^{1}\sum_{l=1}^{2k+1}\beta_{lk}S_{lk}(\zeta,\bar\zeta)$ where $S_{lk}$ are the real spherical harmonics. Accordingly, a character will always be written as:
$$\chi(\alpha) = e^{i\sum_{k=0}^{1}\sum_{l=1}^{2k+1}\beta_{lk}\alpha_{lk}},$$
which cannot produce a phase such as that in (2.31), $e^{iEK_\Lambda(\Lambda^{-1}\zeta,\Lambda^{-1}\bar\zeta)\,\alpha(\Lambda^{-1}\zeta,\Lambda^{-1}\bar\zeta)}$, whenever $\alpha$ includes components in the space of pure supertranslations. Thus, it is this expression which represents the symptom that a correspondence between (2.31) and an intrinsic BMS field could be achieved only if a distributional support for the covariant wave function is considered.

4. A Few Holographic Issues

4.1. General goals of the section

We want to start to investigate the issue of holographic correspondence between QFT formulated in the bulk for fields $\phi$ satisfying the Klein–Gordon equation (2.11) as in Proposition 2.5, and QFT formulated on the boundary $\Im^+$ as shown in the previous section. In this sense the bulk is the globally hyperbolic subregion near the null infinity $\Im^+$ of an asymptotically flat spacetime contained in a globally hyperbolic nonphysical spacetime in the sense of the hypotheses of Proposition 2.5 (in particular it could be strongly asymptotically predictable). We know from Proposition 2.5 that, at the level of classical fields, there is a correspondence between solutions of the field equation $(\Box - \frac{1}{6}R)\phi = 0$ and associated fields $\psi$ defined on $\Im^+$. We want to investigate whether or not such a correspondence can be implemented at the level of algebras of observables associated with the relevant fields. If the correspondence can be implemented in terms of an injective $*$-homomorphism, the algebra of the bulk can be realized as a (sub)algebra of the observables of the boundary. In this sense, it would realize a sort of holographic machinery which encodes the complete information of QFT defined in the bulk in QFT living on the boundary. To this end, we have to recall some features of linear QFT in globally hyperbolic spacetime [32].
4.2. Linear QFT in the bulk

Let us assume that the spacetime $(M,g)$ is globally hyperbolic, that $K := \Box + P$, $P$ being any smooth real-valued function on $M$, denotes a Klein–Gordon-like operator in that spacetime, and that $S_K(M)$ indicates the real space of solutions $\phi$ of $K\phi = 0$ with compactly supported Cauchy data on a (and thus, every) Cauchy surface of $(M,g)$. A natural nondegenerate symplectic form on $S_K(M)$ can be defined as
$$\sigma_M(\phi_1,\phi_2) := \int_S (\phi_2\nabla_N\phi_1 - \phi_1\nabla_N\phi_2)\,d\mu_g^{(S)}, \tag{4.1}$$
the choice of the Cauchy surface $S$ being immaterial because the right-hand side does not depend on such a choice. $N$ is the unit future directed normal vector to $S$ and $d\mu_g^{(S)}$ the measure associated with the metric induced on $S$ by $g$. Nondegenerateness implies that there is a unique $C^*$-algebra generated by (abstract) Weyl operators $W_M(\phi)$, with $\phi \in S_K(M)$, such that they are not vanishing and

(Wb1) $W_M(-\phi) = W_M(\phi)^*$,

(Wb2) $W_M(\phi_1)W_M(\phi_2) = e^{i\sigma(\phi_1,\phi_2)/2}\,W_M(\phi_1+\phi_2)$.
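The algebraic content of (Wb1) and (Wb2) can be illustrated with a small sketch. It is only a toy model under stated assumptions: the solution space $S_K(M)$ is replaced by $\mathbb{R}^2$ with the standard symplectic form, and the names `sigma`, `weyl`, `mul`, `adjoint` and `close` are hypothetical helpers, not part of the paper. Each element is stored as a pair (phase, vector) standing for phase times $W(\text{vector})$.

```python
import cmath

def sigma(x, y):
    """Toy symplectic form on R^2 (assumption): sigma(x, y) = x0*y1 - x1*y0."""
    return x[0] * y[1] - x[1] * y[0]

def weyl(phi):
    """Abstract generator W(phi), stored as (phase, vector)."""
    return (1 + 0j, tuple(phi))

def mul(a, b):
    """Product rule (Wb2): W(x)W(y) = exp(i*sigma(x, y)/2) W(x + y)."""
    (ca, x), (cb, y) = a, b
    phase = ca * cb * cmath.exp(0.5j * sigma(x, y))
    return (phase, (x[0] + y[0], x[1] + y[1]))

def adjoint(a):
    """Adjoint from (Wb1): (c W(x))^* = conj(c) W(-x)."""
    c, x = a
    return (c.conjugate(), (-x[0], -x[1]))

def close(a, b, tol=1e-12):
    """Compare two (phase, vector) pairs up to rounding."""
    same_phase = abs(a[0] - b[0]) < tol
    same_vector = all(abs(u - v) < tol for u, v in zip(a[1], b[1]))
    return same_phase and same_vector
```

In this model the product is associative because $\sigma$ is bilinear, $W(\phi)W(\phi)^* = W(0)$, and exchanging two generators costs the phase $e^{i\sigma(\phi_1,\phi_2)}$, so generators commute exactly when the symplectic form of their arguments vanishes.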
That $C^*$-algebra is the Weyl algebra, $\mathcal{W}_K(M)$, associated with $K$ in the spacetime $(M,g)$. The formal interpretation of elements $W(\phi)$ is $e^{i\sigma_M(\phi,\Phi)}$, $\sigma_M(\phi,\Phi)$ being the usual field operator symplectically smeared with smooth solutions of the field equation with compactly supported Cauchy data. There is an equivalent construction of $\mathcal{W}_K(M)$ which allows a straightforward representation of locality, based on the linear, real, formally anti self-adjoint operator $E_K : C_c^\infty(M) \to C^\infty(M)$ called the causal propagator of $K$. It is defined as the difference of the advanced and retarded fundamental solutions of $Kf = 0$, which are known to exist globally provided the spacetime is globally hyperbolic. Let us focus our attention on the remarkable features of $E_K$ which are listed below:

(i) $E_Kf \in S_K(M)$ for $f \in C_c^\infty(M)$.
(ii) $E_K$ is surjective onto $S_K(M)$.
(iii) $\mathrm{supp}(E_Kf) \subset J(\mathrm{supp}\,f)$.
(iv) $E_Kf = 0$ if and only if $f = Kg$ for some $g \in C_c^\infty(M)$.

As a consequence of those properties the identities hold [32]
$$\int_M \phi f\,d\mu_g = \sigma_M(E_Kf,\phi) \quad\text{and thus}\quad \int_M fE_Kg\,d\mu_g = \sigma_M(E_Kf,E_Kg), \tag{4.2}$$
where $d\mu_g$ is the volume form of $M$ induced by the metric $g$. To go on, it is convenient to define
$$V_M(f) := W_M(E_Kf), \qquad\text{for every } f \in C_c^\infty(M). \tag{4.3}$$
Taking the former of (4.2) into account, the formal interpretation of elements $V_M(f)$ is $e^{i\Phi(f)}$, $\Phi(f) = \int_M \Phi f\,d\mu_g$ being the usual field operator smeared with smooth compactly supported functions. The interpretation given above makes sense in terms of operators whenever a regular state is fixed, by applying the GNS theorem. It turns out, by (iv), that
$$V_M(f) = V_M(g), \qquad\text{if and only if } f - g = Kh \text{ for some } h \in C_c^\infty(M). \tag{4.4}$$
This is nothing but the constraint due to the field equation $K\Phi = 0$ given in a distributional-like fashion, using the fact that $K$ is formally self-adjoint and $KE_K = 0$ by definition of $E_K$. By construction, generators $V_M(f)$ generate the same $C^*$-algebra, $\mathcal{W}(M)$, as $W_M(\phi)$. The improvement is due to the fact that, now, property (iii) together with (Wb2) and the latter in (4.2) entail $[V_M(f),V_M(g)] = 0$ whenever the supports of $f$ and $g$ are causally separated. This is the natural formulation of locality in spacetime.

4.3. General holographic tools

All results and tools introduced above can be used in the globally hyperbolic spacetime $(\tilde V \cap M, g)$ whenever $(M,g)$ is asymptotically flat, in accordance with the hypotheses of Proposition 2.5, equipped with the Klein–Gordon operator for a conformally coupled massless scalar field $K := \Box - \frac{1}{6}R$. The main statement concerning holographic relations between $\mathcal{W}_K(\tilde V \cap M)$ and $\mathcal{W}(\Im^+)$ is the proposition below. We need a preliminary definition. If $(M,g)$ is an asymptotically flat spacetime, satisfying the hypotheses of Proposition 2.5 with respect to $\tilde V \subset \tilde M$, and $K := \Box - \frac{1}{6}R$, the projection map $\Gamma_{M_{\tilde V}} : S_K(M_{\tilde V}) \to S(\Im^+)$ is the real linear map which associates every $\phi \in S_K(M_{\tilde V})$ with the smooth extension to $\Im^+$ of $(\omega\Omega)^{-1}\phi$ as in Proposition 2.5, where $(\omega\Omega)^2g$ induces the triple $(\Im^+, \tilde h_B, n_B)$ on $\Im^+$.

Proposition 4.1. Let $(M,g)$ be a globally hyperbolic asymptotically flat spacetime satisfying the hypotheses of Proposition 2.5 with respect to $\tilde V \subset \tilde M$ and let $E_K$ denote the causal propagator in $M_{\tilde V} := \tilde V \cap M$ of $K := \Box - \frac{1}{6}R$. Assume that both conditions below hold true for the projection map $\Gamma_{M_{\tilde V}}$:

(a) $\Gamma_{M_{\tilde V}}(S_K(M_{\tilde V})) \subset S(\Im^+)$,
(b) symplectic forms are preserved by $\Gamma_{M_{\tilde V}}$, that is, for all $\phi_1,\phi_2 \in S(M_{\tilde V})$,
$$\sigma_{M_{\tilde V}}(\phi_1,\phi_2) = \sigma(\Gamma_{M_{\tilde V}}\phi_1,\Gamma_{M_{\tilde V}}\phi_2). \tag{4.5}$$

Then $\mathcal{W}(M_{\tilde V})$ can be identified with a sub $C^*$-algebra of $\mathcal{W}(\Im^+)$ by means of a $C^*$-algebra isomorphism $\imath$ uniquely determined by the requirement
$$\imath(W_{M_{\tilde V}}(\phi)) = W(\Gamma_{M_{\tilde V}}\phi), \qquad\text{for all } \phi \in S_K(M_{\tilde V}), \tag{4.6}$$
or, equivalently,
$$\imath(V_{M_{\tilde V}}(f)) = W(\Gamma_{M_{\tilde V}}E_Kf), \qquad\text{for all } f \in C_c^\infty(M_{\tilde V}). \tag{4.7}$$
Proof. By (4.3), the thesis can be proved referring to generators $V_{M_{\tilde V}}(\phi)$ only. We start by fixing the relevant sub $C^*$-algebra of $\mathcal{W}(\Im^+)$ as follows. As a consequence of (a), it makes sense to consider the $*$-algebra in $\mathcal{W}(\Im^+)$, $\mathcal{A}$, finitely generated by the elements $V_{M_{\tilde V}}(\Gamma_{M_{\tilde V}}\phi)$ for all $\phi \in S(M_{\tilde V})$. The closure (in $\mathcal{W}(\Im^+)$) of that $*$-algebra, $\overline{\mathcal{A}}$, is a sub $C^*$-algebra of $\mathcal{W}(\Im^+)$ by construction. On the other hand, by construction and using the uniqueness of the norm of a $C^*$-algebra, $\overline{\mathcal{A}}$ must coincide with the Weyl algebra associated with the real vector space $S_0 := \Gamma_{M_{\tilde V}}(S(M_{\tilde V}))$ and the nondegenerate symplectic form $\sigma$ restricted to that space. Whenever the real linear map $\Gamma : S(M_{\tilde V}) \to S_0$ is bijective, the validity of requirement (b) entails (as an immediate consequence of the main statement of [33, Theorem 5.2.8]) that there is a $*$-algebra isomorphism $\imath : \mathcal{W}(M) \to \mathcal{W}(S_0) \equiv \overline{\mathcal{A}}$ uniquely determined by the requirement $\imath(V_{M_{\tilde V}}(\phi)) = W(\Gamma_{M_{\tilde V}}\phi)$, which is nothing but (4.6). As is well known, $*$-algebra isomorphisms of $C^*$-algebras are $C^*$-algebra isomorphisms. Hence, the thesis holds true provided the map $\Gamma_{M_{\tilde V}} : S(M_{\tilde V}) \to S_0$ is bijective. $\Gamma_{M_{\tilde V}}$ is surjective by construction. Assume that $\phi \in \mathrm{Ker}(\Gamma)$; then, by condition (b) and using left-argument linearity of $\sigma$, one has $\sigma_{M_{\tilde V}}(\phi,\psi) = 0$ for all $\psi \in S(M_{\tilde V})$. Thus, it must hold that $\phi = 0$ because $\sigma_{M_{\tilde V}}$ is nondegenerate. This implies that $\Gamma_{M_{\tilde V}}$ is also injective, concluding the proof.

Remark 4.2. The hypotheses in the previous proposition are, to a certain extent, rather restrictive. In particular, condition (b) automatically excludes a large class of manifolds such as asymptotically flat spacetimes with a black hole, since part of the symplectic flow of data crosses the event horizon and does not reach $\Im^+$. Thus, in this framework, Eq. (4.5) is never satisfied and a different holographic mechanism must be considered.
Adopting an "Occam's razor" perspective, the simplest road to pursue would be to consider, as the screen where holographic data are encoded, both the event horizon and null infinity. Thus, within this perspective, the full content of the bulk theory can be reconstructed starting from two lower dimensional quantum field theories. Nonetheless, here, we will not deal with this issue in detail since it would need an extensive analysis far from the aims of this paper.

In case the hypotheses of Proposition 4.1 are fulfilled, another relevant consequence takes place. In that case any algebraic state $\nu : \mathcal{W}(\Im^+) \to \mathbb{C}$ can be pulled back on $S(M_{\tilde V})$ through $\imath$ to the state $\nu_\imath : \mathcal{W}(M_{\tilde V}) \to \mathbb{C}$, defined as $\nu_\imath(a) := \nu(\imath(a))$ for all $a \in \mathcal{W}(M_{\tilde V})$. In particular, this happens for the BMS-invariant state $\lambda$ (corresponding to $\Upsilon$ in its GNS representation) of Sec. 2.4: the state $\lambda_\imath$ could be used to build up QFT in the bulk. For instance, it may give a notion of particle even if the bulk spacetime does not admit any group of isometries (the Poincaré group in particular). From the fact that bulk isometries, barring pathological situations prepared ad hoc, give rise to asymptotic symmetries and $\lambda$ is invariant under all asymptotic symmetries, we expect that $\lambda_\imath$ is invariant under isometries of the bulk. The formal investigation of this fact in the general case will be performed elsewhere. Another relevant point which deserves investigation concerns the short distance behavior of $n$-point functions associated with $\lambda_\imath$. In fact it is a well-established result that physically meaningful states must have the Hadamard property (see [32] for a general discussion on this topic). There is no evidence, from our construction, that $\lambda_\imath$ is Hadamard. However, all those properties can be studied in the particular and relevant case of the Minkowski spacetime. This is the content of the next section.

4.4. Holographic interplay of Minkowski space and $\Im^+$

Let us consider the case of four-dimensional Minkowski space $(M^4,\eta)$. That spacetime is asymptotically flat. More precisely, it is asymptotically flat at past null, future null and spatial infinity and it is strongly asymptotically predictable in the sense of [19].

Starting from a fixed Minkowski frame referred to coordinates $(t,\mathbf{x})$, the unphysical spacetime $(\tilde M,\tilde g)$ can be fixed to be the Einstein static universe [19] as follows. One passes to spherical coordinates in the rest space of the initial Minkowski frame, obtaining coordinates $(t,r,\vartheta,\phi)$ on $M^4$, and finally, one adopts null coordinates $u := t - r \in \mathbb{R}$, $v := t + r \in \mathbb{R}$, obtaining global coordinates $(u,v,\vartheta,\phi)$ on $M^4$. Using these initial coordinates, define coordinates $\vartheta = \vartheta$, $\varphi = \phi$, $T = \tan^{-1}v + \tan^{-1}u$ and $R = \tan^{-1}v - \tan^{-1}u$ and assume $\Omega^2_{M^4} = 4[(1+v^2)(1+u^2)]^{-1}$. With these definitions $\tilde g := \Omega^2\eta$ reads
$$\tilde g = -dT^2 + dR^2 + \sin^2R\,(d\vartheta^2 + \sin^2\vartheta\,d\varphi^2). \tag{4.8}$$
This metric makes sense in a larger spacetime $(\tilde M,\tilde g)$ obtained by assuming $T \in \mathbb{R}$, $R \in (0,\pi)$ and $\vartheta,\varphi$ varying everywhere on $S^2$. That is the Einstein static spacetime. (The singularities for $R \to 0,\pi$ in $\tilde M$ are only apparent, being "origins of spherical coordinates", and the expression of the metric (4.8) is valid throughout all Einstein spacetime except for the two one-dimensional submanifolds corresponding to the values "$R = 0$" and "$R = \pi$". To cover the whole manifold $\tilde M$ at least two charts are necessary.)
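The coordinate change leading to (4.8) can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original argument: it restricts to the $(t,r)$ sector, uses central finite differences, and the function names `to_einstein`, `omega2` and `pulled_back_metric` are hypothetical. It verifies that $-dT^2 + dR^2$ pulls back to $\Omega^2(-dt^2 + dr^2)$ and that the angular coefficient satisfies $\sin^2R = \Omega^2r^2$.

```python
import math

def to_einstein(t, r):
    """Map Minkowski (t, r) to (T, R) through u = t - r, v = t + r."""
    u, v = t - r, t + r
    return math.atan(v) + math.atan(u), math.atan(v) - math.atan(u)

def omega2(t, r):
    """Conformal factor Omega^2 = 4 / ((1 + v^2)(1 + u^2))."""
    u, v = t - r, t + r
    return 4.0 / ((1.0 + v * v) * (1.0 + u * u))

def pulled_back_metric(t, r, h=1e-5):
    """Components (g_tt, g_tr, g_rr) of -dT^2 + dR^2 in (t, r) coordinates,
    computed with central finite differences of step h."""
    Tt = (to_einstein(t + h, r)[0] - to_einstein(t - h, r)[0]) / (2 * h)
    Rt = (to_einstein(t + h, r)[1] - to_einstein(t - h, r)[1]) / (2 * h)
    Tr = (to_einstein(t, r + h)[0] - to_einstein(t, r - h)[0]) / (2 * h)
    Rr = (to_einstein(t, r + h)[1] - to_einstein(t, r - h)[1]) / (2 * h)
    return (-Tt * Tt + Rt * Rt, -Tt * Tr + Rt * Rr, -Tr * Tr + Rr * Rr)
```

At any sample point with $r > 0$ one expects `pulled_back_metric(t, r)` to be close to `(-omega2(t, r), 0.0, omega2(t, r))`, up to the finite-difference error.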
With that procedure $(M^4,\eta)$ turns out to be embedded into $(\tilde M,\tilde g)$ as a globally hyperbolic submanifold and $\Im^+$ is completely represented by the set of points with $T + R = \pi$, $R \in (0,\pi)$. Actually, with the definition given at the beginning, the space $(M,g)$ which fulfills the very definition of asymptotic flatness at future null infinity is the portion of $M^4$ in the future of any fixed $t$-constant Cauchy surface. As a matter of fact, in the following, we consider only this region even if we shall not stress it explicitly. Rescaling $\tilde g$ on $\Im^+$ by the further regular factor $\omega^2 := (1+u^2)(1+v^2)/v^2$ and changing coordinates in the sector $T, R$, one gets a metric with associated triple $(\Im^+,\tilde h_B,n_B)$. A natural Bondi frame on $\Im^+$, which we say to be associated with the Minkowski frame $(t,\mathbf{x})$ in $M^4$, is finally obtained as $(u,\zeta,\bar\zeta)$, where $u$ is just the (limit to $\Im^+$ of the) null coordinate $u$ in the reference frame initially fixed in Minkowski spacetime and $\zeta := e^{i\phi}\cot\frac{\vartheta}{2}$, $(\vartheta,\phi)$ also being the angular spherical coordinates in the reference frame initially fixed in Minkowski spacetime.
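As a quick sanity check on the complex coordinate $\zeta = e^{i\phi}\cot\frac{\vartheta}{2}$, the following sketch (with hypothetical helper names `zeta` and `unit_vector`) evaluates the three rational combinations of $\zeta$ and $\bar\zeta$ that enter (4.9) and compares them with the components $\sin\vartheta\cos\phi$, $\sin\vartheta\sin\phi$, $\cos\vartheta$ of the unit radial vector. It assumes $0 < \vartheta < \pi$, so that the cotangent is finite.

```python
import cmath
import math

def zeta(theta, phi):
    """Complex coordinate zeta = e^{i phi} cot(theta / 2), for 0 < theta < pi."""
    return cmath.exp(1j * phi) / math.tan(theta / 2.0)

def unit_vector(z):
    """Unit radial vector expressed through zeta and its complex conjugate."""
    zb = z.conjugate()
    d = (z * zb).real + 1.0            # zeta * conj(zeta) + 1
    nx = (z + zb).real / d             # (zeta + conj(zeta)) / (zeta conj(zeta) + 1)
    ny = ((z - zb) / 1j).real / d      # (zeta - conj(zeta)) / (i (zeta conj(zeta) + 1))
    nz = ((z * zb).real - 1.0) / d     # (zeta conj(zeta) - 1) / (zeta conj(zeta) + 1)
    return nx, ny, nz
```

For example, `unit_vector(zeta(theta, phi))` should agree with `(sin(theta)*cos(phi), sin(theta)*sin(phi), cos(theta))` up to rounding, which follows from the half-angle identities.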
Remark 4.3. The metric of the 2-sphere determined by $\tilde h_B$ is nothing but that of the unit 2-sphere, $4\,d\zeta\,d\bar\zeta/(1+\zeta\bar\zeta)^2$, in the rest frame of the initial Minkowski coordinates $(t,\mathbf{x})$. Starting from a different initial Minkowski frame $(t',\mathbf{x}')$ connected with the initial one by means of an orthochronous proper Poincaré transformation $(\Lambda,a)$, one would determine the same asymptotic manifold $\Im^+$ but would find a different metric $\tilde h'_B$ on $\Im^+$ itself, $\tilde h'_B = 0\,du' + 4\,d\zeta'\,d\bar\zeta'/(1+\zeta'\bar\zeta')^2$. Notice that the nondegenerate part is again the standard metric of the unit 2-sphere but, as $\zeta' \neq \zeta$ and $\bar\zeta' \neq \bar\zeta$, it is not the standard metric of the unit 2-sphere determined in the former case: it is instead that of the unit 2-sphere in the rest frame of the Minkowski coordinates $(t',\mathbf{x}')$. However, a closer scrutiny shows that the triples $(\Im^+,\tilde h_B,n_B)$ and $(\Im^+,\tilde h'_B,n'_B)$ are connected by a transformation of the BMS group $(\Lambda,f_a)$. Indeed, one has the following result whose (simple) proof is left to the reader (see also [54, 55]).

Proposition 4.4. Let $(t,\mathbf{x}) = (x^0,x^1,x^2,x^3)$ and $(t',\mathbf{x}') = (x'^0,x'^1,x'^2,x'^3)$ be Minkowski frames in $(M^4,\eta)$ such that $x'^\mu = \Lambda^\mu{}_\nu(x^\nu + a^\nu)$ for some $(\Lambda,a) \in ISO(3,1)$ and let $(u,\zeta,\bar\zeta)$ and $(u',\zeta',\bar\zeta')$ be the respectively associated Bondi frames on $\Im^+$. The following holds:

(a) The Bondi frames are connected by means of the BMS transformation
$$u' := K_\Lambda(\zeta,\bar\zeta)\,(u + f_a(\zeta,\bar\zeta)), \qquad (\zeta',\bar\zeta') = \Lambda(\zeta,\bar\zeta),$$
where the action of $\Lambda$ on $(\zeta,\bar\zeta)$ is that in (2.4) and the function $f_a$ belongs to the space $T^4$ spanned by the first four real spherical harmonics as defined in Sec. 3.3, that is$^{q}$
$$f_a := a^0 - \frac{a^1(\zeta+\bar\zeta)}{\zeta\bar\zeta+1} - \frac{a^2(\zeta-\bar\zeta)}{i(\zeta\bar\zeta+1)} - \frac{a^3(\zeta\bar\zeta-1)}{\zeta\bar\zeta+1}. \tag{4.9}$$
(b) The set
$$R := \left\{(\Lambda,f_a) \in G_{BMS}\ \middle|\ f_a = a^0 - \frac{a^1(\zeta+\bar\zeta)}{\zeta\bar\zeta+1} - \frac{a^2(\zeta-\bar\zeta)}{i(\zeta\bar\zeta+1)} - \frac{a^3(\zeta\bar\zeta-1)}{\zeta\bar\zeta+1},\ a \in \mathbb{R}^4\right\}$$
is a subgroup of $G_{BMS}$, the map $ISO(3,1) \ni (\Lambda,a) \mapsto (\Lambda,f_a) \in R$ being a continuous-group isomorphism.

$^{q}$ In angular spherical coordinates, we recognize in the factors in front of $-a^1$, $-a^2$ and $-a^3$ the components of the radial versor, respectively, $\sin\vartheta\cos\phi$, $\sin\vartheta\sin\phi$ and $\cos\vartheta$.

The second statement is a straightforward consequence of Proposition 3.12. The quantum version of the proposition above will be established in Theorem 4.8 below. These results are due to the fact that Poincaré isometries are also asymptotic symmetries (see also [54]).

The Einstein static universe is globally hyperbolic because it is static and $T$-constant sections are compact (see [56, Chap. 6]). As a consequence $(M^4,\eta)$ (more precisely, the region in $(M^4,\eta)$ in the future of a fixed Minkowskian spacelike Cauchy surface) fulfills the hypotheses of Proposition 2.5 with respect to $\tilde V := \tilde M$ itself. The part of standard free QFT in Minkowski spacetime [57, 58] for a massless scalar field $\phi$ that we are interested in can be summarized as follows, in Weyl quantization referred to the Weyl algebra $\mathcal{W}(M^4)$ with $K := -\Box$. Standard free QFT can be viewed as the GNS realization of $\mathcal{W}(M^4)$ based on a preferred algebraic state $\lambda_{M^4}$ invariant under the Poincaré group and singled out as we now describe. Take a Minkowski frame with coordinates $(t,\mathbf{x}) \in \mathbb{R}^4$ and, for every $\phi \in S_K(M^4)$, define its positive frequency part, $\phi_+$,
$$\phi_+(t,\mathbf{x}) := \int_{\mathbb{R}^3} \frac{d\mathbf{p}\; e^{i(\mathbf{p}\cdot\mathbf{x} - t|\mathbf{p}|)}}{\sqrt{16\pi^3|\mathbf{p}|}}\,\widetilde{\phi_+}(\mathbf{p}), \qquad \widetilde{\phi_+}(\mathbf{p}) := \int_{\mathbb{R}^3} \frac{d\mathbf{x}}{\sqrt{16\pi^3}}\left(\sqrt{|\mathbf{p}|}\,\phi(0,\mathbf{x}) - i\,\frac{(\partial_t\phi)(0,\mathbf{x})}{\sqrt{|\mathbf{p}|}}\right)e^{-i\mathbf{p}\cdot\mathbf{x}}. \tag{4.10}$$
$\phi_+$ has no compactly supported Cauchy data and $\phi = \phi_+ + \overline{\phi_+}$. The sesquilinear form
$$\langle\phi_{1+},\phi_{2+}\rangle_{M^4} := -i\sigma_{M^4}(\overline{\phi_{1+}},\phi_{2+}), \qquad\text{for every pair } \phi_1,\phi_2 \in S_K(M^4) \tag{4.11}$$
is well defined and gives rise to a Hermitian scalar product on the space $S_K(M^4)^{\mathbb{C}}_+$ of complex linear combinations of positive frequency parts and
$$\langle\phi_{1+},\phi_{2+}\rangle_{M^4} = \int_{\mathbb{R}^3} d\mathbf{p}\;\overline{\widetilde{\phi_{1+}}(\mathbf{p})}\,\widetilde{\phi_{2+}}(\mathbf{p}), \qquad\text{for every pair } \phi_1,\phi_2 \in S_K(M^4). \tag{4.12}$$
As a consequence, $S_K(M^4)^{\mathbb{C}}_+$ is isomorphic to a subspace of $L^2(\mathbb{R}^3,d\mathbf{p})$. Since the former is also dense in the latter$^{r}$ by (4.10), one finds that the one-Minkowski-particle space $\mathcal{H}_{M^4}$, i.e. the Hilbert completion of $S_K(M^4)^{\mathbb{C}}_+$, is isomorphic to $L^2(\mathbb{R}^3,d\mathbf{p})$ itself. The orthochronous proper Poincaré group $ISO(3,1)$ acts naturally on wave functions via push-forward: $g^* : S_K(M^4) \ni \phi \mapsto \phi\circ g^{-1}$ for every $g \in ISO(3,1)$. The symplectic form $\sigma_{M^4}$ is invariant under such $g^*$, $g$ being an isometry. Furthermore, it turns out that there is an irreducible strongly-continuous unitary representation $L^{(1)} : ISO(3,1) \ni g \mapsto L^{(1)}_g$ with $L^{(1)}_g : \mathcal{H}_{M^4} \to \mathcal{H}_{M^4}$ such that $(g^*\phi)_+ = L_g\phi_+$ for every $g \in ISO(3,1)$ and every $\phi \in S_K(M^4)$. In particular, this implies that the decomposition into positive and negative frequency parts, as well as the scalar product, do not depend on the particular Minkowski frame used. An irreducible operator representation $\widehat{\mathcal{W}}(M^4)$ of the Weyl algebra $\mathcal{W}(M^4)$ is constructed on $\mathcal{F}_+(\mathcal{H}_{M^4})$ in terms of the usual symplectically-smeared field operators and their exponentials
$$\sigma_{M^4}(\phi,\Phi) := ia(\overline{\phi_+}) - ia^\dagger(\phi_+), \qquad \widehat{W}_{M^4}(\phi) := e^{i\sigma_{M^4}(\phi,\Phi)}. \tag{4.13}$$

$^{r}$ As is well known, the map (see (4.10)) $C_c^\infty(\mathbb{R}^3) \ni f \mapsto \int_{\mathbb{R}^3} d\mathbf{x}\,f(\mathbf{x})e^{-i\mathbf{p}\cdot\mathbf{x}}$ has range dense in $L^2(\mathbb{R}^3,d\mathbf{p})$ because the Fourier transform is a Hilbert-space isomorphism and $C_c^\infty(\mathbb{R}^3)$ is dense in $L^2(\mathbb{R}^3,d\mathbf{p})$; therefore the range is also $L^2$-dense in the space $B \subset L^2(\mathbb{R}^3,d\mathbf{p})$ of functions which are in $C_c^\infty(\mathbb{R}^3)$ and vanish in a neighborhood of $\mathbf{p} = 0$. Finally, $B$ is dense in $L^2(\mathbb{R}^3,d\mathbf{p})$ and it is invariant under multiplication of its elements by either $\sqrt{|\mathbf{p}|}$ or $1/\sqrt{|\mathbf{p}|}$. Thus, by the latter equation in (4.10), we find that, up to Hilbert-space isomorphisms, $\overline{S_K(M^4)^{\mathbb{C}}_+} = L^2(\mathbb{R}^3,d\mathbf{p})$.

The vacuum state $\Upsilon_{M^4}$ of $\mathcal{F}_+(\mathcal{H}_{M^4})$ is, by definition, invariant under the unitary representation $L$ of $ISO(3,1)$ obtained by tensorialization of $L^{(1)}$ and the following covariance relations hold
$$L_g\widehat{W}_{M^4}(\phi)L_g^\dagger = \widehat{W}_{M^4}(g^*\phi), \qquad\text{for every } \phi \in S_K(M^4) \text{ and } g \in ISO(3,1). \tag{4.14}$$
If $\Pi_{M^4} : \mathcal{W}(M^4) \to \widehat{\mathcal{W}}(M^4)$ denotes the unique ($\sigma_{M^4}$ being nondegenerate) $C^*$-algebra isomorphism between those two Weyl representations, $(\mathcal{F}_+(\mathcal{H}_{M^4}),\Pi_{M^4},\Upsilon_{M^4})$ coincides, up to unitary transformations, with the GNS triple associated with the algebraic state $\lambda_{M^4}$ on $\mathcal{W}(M^4)$ uniquely defined by the requirement (see the Appendix)
$$\lambda_{M^4}(W_{M^4}(\phi)) := e^{-\langle\phi_+,\phi_+\rangle_{M^4}/2}. \tag{4.15}$$
$\lambda_{M^4}$ is pure, as is well known; however, this is also a direct consequence of (b) in Theorem 4.5 below since $\lambda$ is pure. We can now state and prove the main results of this section.

Theorem 4.5. Consider free QFT for a real scalar field $\phi$ propagating in four-dimensional Minkowski spacetime $(M^4,\eta)$ and QFT for a real scalar field on $\Im^+$. Let $\mathcal{W}(M^4)$ be the Weyl algebra associated with the space $S_K(M^4)$ and the symplectic form $\sigma_{M^4}$ as defined in Sec. 4.2 (with $M_{\tilde V} := M^4$ and $K := -\Box$). The following holds.

(a) $\Gamma_{M^4}(S_K(M^4)) \subset S(\Im^+)$ because $\Gamma_{M^4}\phi$ has compact support for $\phi \in S_K(M^4)$; moreover, $\Gamma_{M^4}$ preserves symplectic forms. As a consequence, $\mathcal{W}(M^4)$ can be identified with a sub $C^*$-algebra of $\mathcal{W}(\Im^+)$ by means of a $C^*$-algebra isomorphism $\imath_{M^4}$ uniquely determined by the requirement
$$\imath_{M^4}(W_{M^4}(\phi)) = W(\Gamma_{M^4}\phi), \qquad\text{for all } \phi \in S_K(M^4). \tag{4.16}$$

(b) Consider the Minkowski vacuum $\lambda_{M^4}$ on $\mathcal{W}(M^4)$ and the BMS-invariant vacuum $\lambda$ on $\mathcal{W}(\Im^+)$ and focus on the respectively associated GNS realizations $(\mathcal{F}(\mathcal{H}_{M^4}),\Pi_{M^4},\Upsilon_{M^4})$ and $(\mathcal{F}(\mathcal{H}),\Pi,\Upsilon)$. The $C^*$-algebra isomorphism $\imath_{M^4}$ corresponds to a unitary (i.e. isometric surjective) operator $\mathcal{U} : \mathcal{F}(\mathcal{H}_{M^4}) \to \mathcal{F}(\mathcal{H})$ such that: (i) $\mathcal{U} : \Upsilon_{M^4} \mapsto \Upsilon$, and (ii) $\mathcal{U}\,\widehat{W}_{M^4}(\phi)\,\mathcal{U}^{-1} = \widehat{W}(\Gamma_{M^4}\phi)$. Therefore, the algebraic state induced by $\lambda$ on $\mathcal{W}(M^4)$ through $\imath_{M^4}$ is the Minkowski vacuum $\lambda_{M^4}$.

Proof. (a) Fix a Minkowski reference frame $(t,\mathbf{x})$ in $M^4$, pass to spherical coordinates in the rest frame obtaining coordinates $(t,r,\zeta,\bar\zeta)$, next pass to null coordinates in the sector $t, r$ and, finally, construct coordinates $(u,\zeta,\bar\zeta)$ on $\Im^+$ referred to a Bondi frame as described at the beginning of this section. In Minkowski spacetime
solutions of $K\phi = 0$ propagate along null geodesics [59]. In other words, if $\phi = Ef$, the support of $\phi$ is included in the union of null geodesics originating from every point $q \in \mathrm{supp}\,f$. On the other hand, the map $u = Z(q,\zeta,\bar\zeta)$ that associates the unique null geodesic starting from the point $q \in M^4$ with direction $(\zeta,\bar\zeta)$ with the coordinate $u$ where the geodesic reaches $\Im^+$ (the remaining coordinates being $(\zeta,\bar\zeta)$) is well defined and smooth [60, 61]. If $\phi \in S_K(M^4)$, then $\phi = Ef$ where $f$ is smooth with compact support; as a consequence, $\mathrm{supp}\,\Gamma_{M^4}\phi \subset \{Z(q,\zeta,\bar\zeta) \mid q \in \mathrm{supp}\,f,\ (\zeta,\bar\zeta) \in S^2\} \times S^2$ is compact because $Z$ is continuous and defined on a compact set. Since $\Gamma_{M^4}\phi$ is smooth by definition, we have proved that $\Gamma_{M^4}(S_K(M^4)) \subset S(\Im^+)$.

Now, we pass to prove that $\Gamma_{M^4}$ preserves the symplectic forms. To this end, we notice that, if $\phi,\phi' \in S_{M^4}$, then $\sigma_{M^4}(\phi,\phi') = i2\mathrm{Re}\langle\phi_+,\phi'_+\rangle_{M^4}$ and the analog holds for wave functions $\psi,\psi' \in S(\Im^+)$ referring to the corresponding symplectic form $\sigma$ and scalar product $\langle\cdot,\cdot\rangle$ as in Theorem 2.12. (The proof is immediate, taking into account the fact that positive frequency parts satisfy $\sigma_{M^4}(\phi_+,\phi'_+) = 0$ and the analog for the other case.) As a consequence, to show that $\sigma_{M^4}(\phi,\phi') = \sigma(\Gamma_{M^4}\phi,\Gamma_{M^4}\phi')$, it is completely equivalent to show that
$$\langle\phi_+,\phi'_+\rangle_{M^4} = \langle(\Gamma_{M^4}\phi)_+,(\Gamma_{M^4}\phi')_+\rangle, \qquad\text{for every pair of wave functions } \phi,\phi' \in S_{M^4}. \tag{4.17}$$
Notice that the positive frequency parts in the left-hand side are referred to the Minkowski time $t$ in $M^4$, whereas those in the right-hand side are referred to the coordinate $u$ in $\Im^+$. The proof of (4.17) is a consequence of the following lemma, whose proof is quite technical and is presented in the Appendix.

Lemma 4.6. In the hypotheses of Theorem 4.5, fix a Minkowski reference frame $(t,\mathbf{x})$ in $M^4$, and consider the associated Bondi frame $(u,\zeta,\bar\zeta)$ on $\Im^+$. If $(E,\zeta,\bar\zeta)$ are the spherical coordinates of $\mathbf{p}$ in the rest frame (where $E := |\mathbf{p}|$ in particular), it holds
$$\widetilde{(\Gamma_{M^4}\phi)_+}(E,\zeta,\bar\zeta) = -iE\,\widetilde{\phi_+}(\mathbf{p}(E,\zeta,\bar\zeta)), \qquad\text{for all } \phi \in S_{M^4}, \tag{4.18}$$
the function in the left-hand side being that of definition (2.19) with $S(\Im^+) \ni \psi = \Gamma_{M^4}\phi$.

From (4.18) one proves (4.17). Indeed, starting from (4.12), passing to spherical coordinates in the integral in the right-hand side and taking (2.22) into account, one gets (4.17):
$$\langle\phi_+,\phi'_+\rangle_{M^4} = \int_{\mathbb{R}^+\times S^2} dE\,E^2\,\epsilon_{S^2}\;\overline{\widetilde{\phi_+}(\mathbf{p}(E,\zeta,\bar\zeta))}\,\widetilde{\phi'_+}(\mathbf{p}(E,\zeta,\bar\zeta))$$
$$= \int_{\mathbb{R}^+\times S^2} dE\,\epsilon_{S^2}\;\overline{-iE\,\widetilde{\phi_+}(\mathbf{p}(E,\zeta,\bar\zeta))}\,(-i)E\,\widetilde{\phi'_+}(\mathbf{p}(E,\zeta,\bar\zeta)) = \langle(\Gamma_{M^4}\phi)_+,(\Gamma_{M^4}\phi')_+\rangle.$$

(b) Referring to Lemma 4.6, start from the $\mathbb{C}$-linear isometric map $h_0 : S_K(M^4)^{\mathbb{C}}_+ \to S(\Im^+)^{\mathbb{C}}_+$ which associates the function $\widetilde{\phi_+}(\mathbf{p})$ with the function $-iE\,\widetilde{\phi_+}(\mathbf{p}(E,\zeta,\bar\zeta)) = \widetilde{(\Gamma_{M^4}\phi)_+}(E,\zeta,\bar\zeta)$. The domain and the range of that map are dense in $\mathcal{H}_{M^4}$ and $\mathcal{H}$, respectively: in the first case, it has been proved previously; the proof for the latter case is immediate using the density property in the former and the measures in the relevant $L^2$ spaces corresponding to the two Hilbert spaces. As a consequence, $h_0$ extends to a unitary map $h : \mathcal{H}_{M^4} \to \mathcal{H}$. In turn, this second map extends to a unitary map $\mathcal{U} : \mathcal{F}(\mathcal{H}_{M^4}) \to \mathcal{F}(\mathcal{H})$ by tensorialization and assuming that (i) $\mathcal{U}\Upsilon_{M^4} = \Upsilon$. By construction, it also holds that $\mathcal{U}\sigma_{M^4}(\phi,\Phi)\mathcal{U}^{-1} = \Psi(\Gamma_{M^4}\phi)$ working in the dense space of analytic vectors containing a finite number of particles. Passing to exponentials one finds (ii).

Remark 4.7. The result established in (a) of Theorem 4.5 straightforwardly extends to the case of a spacetime $M$ obtained by switching on curvature in the past of an (arbitrarily far in the future) smooth spacelike Cauchy surface $\Sigma$ of $M^4$ contained in a Cauchy surface of $\tilde M$ passing through $i^0$. With obvious notation, $\Gamma_M(S_K(M)) \subset S(\Im^+)$ because $\Gamma_M\phi$ has again compact support for $\phi \in S_K(M)$ by construction. Moreover, $\Gamma_M$ preserves symplectic forms since, after $\Sigma$, the symplectic form associated with $M$ is the same as that of $M^4$ and symplectic forms are preserved under time evolution in the bulk spacetime. As a consequence, $\mathcal{W}(M)$ can be identified with a sub $C^*$-algebra of $\mathcal{W}(\Im^+)$ by means of a $C^*$-algebra isomorphism $\imath$ uniquely determined by the requirement
$$\imath(W_M(\phi)) = W(\Gamma_M\phi), \qquad\text{for all } \phi \in S_K(M).$$
A second theorem concerns the interplay of orthocronous proper Poincar´e group ISO(3, 1) and GBMS . We knows that in the bulk there is a strongly-continuous unitary irreducible representation ISO(3, 1) g → Lg satisfying (4.14). Referring to the Minkowski frame (t, x) used to build up the metric on + and all that, if g = (Λ, T ) with Λ ∈ SO(3, 1)↑ and a ∈ R4 , the action of Lg on a positive frequency part φ"+ reads EΛ−1 −i(p|Λa) " " e (4.19) φ+ (pΛ−1 ), (L(Λ,a) φ+ )(p) = E where p := (E, p), (EΛ , pΛ ) := Λp, whereas (a|b) denotes the standard product of 4-vectors a and b. The question is: what is the meaning of the representation ISO(3, 1) g → ULg U −1 acting on quantum states for QFT defined in + ? The following theorem gives an answer to the question which is the quantum version of Proposition 4.4. Theorem 4.8. With the same hypotheses as in Theorem 4.5, represent GBMS as the semidirect product of SO(3, 1)↑ and C ∞ (S2 ) in the Bondi frame on + associated with the Minkowski frame (t, x). Consider the natural unitary representation of ISO(3, 1) in QFT in (M4 , η) given in (4.19). The representation on F(H), induced on QFT on + by means of U, is ISO(3, 1) g → ULg U −1 and it coincides with the restriction of the representation of GBMS , U , defined in Theorem 2.14, to the
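subgroup isomorphic to $ISO(3,1)$ (see Proposition 4.4)
$$\left\{ (\Lambda, f_a) \in G_{BMS} \;\middle|\; f_a = a^0 - \frac{a^1(\zeta+\bar\zeta)}{\zeta\bar\zeta+1} - \frac{a^2(\zeta-\bar\zeta)}{i(\zeta\bar\zeta+1)} - \frac{a^3(\zeta\bar\zeta-1)}{\zeta\bar\zeta+1},\ (\Lambda,a) \in ISO(3,1) \right\}.$$

Proof. By Lemma 4.6 one has
$$(U\widetilde{\phi_+})(E,\zeta,\bar\zeta) = -iE\,\widetilde{\phi_+}(p(E,\zeta,\bar\zeta)). \tag{4.20}$$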
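As a consistency check on (4.20), one can verify directly that the map preserves norms on one-particle states. The sketch below assumes that $H_{M^4}$ is $L^2(\mathbb{R}^3, d^3p)$ and that $H$ is $L^2(\mathbb{R}^+\times S^2, dE\otimes\epsilon_{S^2})$ with $\epsilon_{S^2}$ the standard measure on the unit sphere; these identifications are inferred from the surrounding text rather than quoted from it.

```latex
\begin{align*}
\|U\widetilde{\phi_+}\|_H^2
  &= \int_{\mathbb{R}^+\times S^2} \big|-iE\,\widetilde{\phi_+}(p(E,\zeta,\bar\zeta))\big|^2\, dE\otimes\epsilon_{S^2}(\zeta,\bar\zeta)\\
  &= \int_{\mathbb{R}^+} E^2\,dE \int_{S^2} \epsilon_{S^2}(\zeta,\bar\zeta)\,\big|\widetilde{\phi_+}(p(E,\zeta,\bar\zeta))\big|^2\\
  &= \int_{\mathbb{R}^3} d^3p\,\big|\widetilde{\phi_+}(\mathbf{p})\big|^2
   = \|\widetilde{\phi_+}\|_{H_{M^4}}^2 ,
\end{align*}
% using d^3p = E^2 dE \epsilon_{S^2} in polar coordinates with |p| = E
% (massless case).
```

The factor $E$ in (4.20) is thus exactly what converts the flat measure $d^3p$ into $dE\otimes\epsilon_{S^2}$.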
Representing the right-hand side of (4.19) in complex spherical coordinates $\zeta,\bar\zeta$ and applying $U$ to the final result making use of (4.20), a straightforward, but tedious, computation based on (3.23) proves that, for every $\widetilde{\psi_+} \in U(H_{M^4})$, $U L_{(\Lambda,a)} U^{-1}\widetilde{\psi_+} = U_{(\Lambda,f_a)}\widetilde{\psi_+}$ holds true whenever $(\Lambda,a)$ is any pure translation, any pure rotation or any boost along $z$. Hence, the decomposition theorem of the Lorentz group and the structure of the group product in $ISO(3,1)$ and in $G_{BMS}$ imply that the identity holds for every element $(\Lambda,a) \in ISO(3,1)$. Since $U(H_{M^4}) \subset H$ is dense in $H$ (and $U$ preserves one-particle spaces), we have obtained that $U L_{(\Lambda,a)}\restriction_{H_{M^4}} U^{-1} = U_{(\Lambda,f_a)}\restriction_{H}$. Finally, since $U : F(H_{M^4}) \to F(H)$, $L_{(\Lambda,a)} : F(H_{M^4}) \to F(H_{M^4})$ and $U_{(\Lambda,f_a)} : F(H) \to F(H)$ are all obtained by a tensorialization procedure, it must hold $U L_{(\Lambda,a)} U^{-1} = U_{(\Lambda,f_a)}$.

5. Conclusions

The main purpose underlying this paper has been to show that, at least in the scalar case, it is possible to start from a scalar free field $\phi$ living in the bulk of an asymptotically flat four-dimensional spacetime $M$ and to relate it, by means of a suitable extension/restriction procedure, to a second field $\psi$ living on $\Im^+$, the boundary of $M$ at future null infinity. Under suitable hypotheses (preservation of a symplectic form), this relation preserves information at the level of quantum field theories when passing from the bulk to the boundary, thus implementing the holographic principle. However, it is worth stressing that the notion of bulk to boundary correspondence envisaged in this paper is rather different from the more common one typical of the AdS/CFT scenario, since we deal only with the reconstruction of test fields living in a bulk with a fixed background metric whereas, up to now, the reconstruction of geometric data is not addressed within this approach. The main statements of this paper have been proved at the level of the Weyl $C^*$-algebras associated with the fields.
Within this framework, $\psi$ is interpreted as a kinematical datum of a quantum field theory intrinsically defined on $\Im^+$ and invariant under the action of the BMS group as discussed in Sec. 1. We have shown that such a physically intuitive idea can be made rigorously precise by identifying $\psi$ with an intrinsic BMS field constructed out of the induced unitary irreducible representations. Such a
result has been achieved by means of a technology similar to Wigner's celebrated one, used to classify and construct explicitly all possible Poincaré-invariant wave functions. The universality of such an approach and the techniques handled in Secs. 2 and 3 suggest that our results, achieved for massless fields, may be extended far beyond the case of vanishing "spin". Furthermore, it would be interesting to investigate the interplay of these results with the asymptotic quantization procedure proposed by Ashtekar [31], where the role of the main variable is played by BMS-invariant gauge massless fields living on $\Im^\pm$. To this end, it is worth noticing that, in [31] and in most of the papers concerning applications of the BMS group, the peculiar role played by the unitary BMS irreducible representation induced from the subgroup $\Gamma$ (instead of our $\Delta$) suggests that a Hilbert topology has always been implicitly assigned to the set of supertranslations $N$. This is in apparent contrast with our results and the issue deserves future investigation. Indeed, the results presented in Sec. 3 indicate that, in order to "relate" a bulk field with the boundary BMS-invariant counterpart, it is necessary to adopt a nuclear topology on $N$. The relevance of this result does not only lie in the realm of a rigorous mathematical analysis of the BMS group, but it mainly affects the physical kinematical configuration of the field theory living on $\Im^+$ since, as discussed in [29] and partly in Sec. 3, in the "nuclear" scenario there arises a plethora of possible free fields (or equivalently little groups) which are not present in the Hilbert topology. A further key requirement within our approach consists of considering specifically four-dimensional spacetimes. Besides its natural physical relevance, this scenario is the only one where $SL(2,\mathbb{C}) \ltimes C^\infty(S^2)$ plays the role of the (asymptotic) symmetry group.
Nonetheless, it is natural to wonder whether a possible extension of our results to higher (lower) dimensional spacetimes could be envisaged. Unfortunately, a straightforward attempt in this direction runs into two serious obstructions, the first referring to the impossibility of coherently performing the Penrose construction in odd $d$-dimensional manifolds with $d > 4$. As proved by Ishibashi and Hollands in [62, 63], the very definition of null infinity adopted in this paper is at stake and this seems to force us to choose either a different codimension-one submanifold where to encode bulk data or a different projection map, since the one introduced in Sec. 2 for the massless scalar field conformally coupled to gravity strongly relies on the (geometry of the) Penrose compactification. Furthermore, even if we consider even $d$-dimensional asymptotically flat spacetimes with $d > 4$, we cannot slavishly transfer our results since, as proved in [62, 63], in these scenarios there is no notion of supertranslations and thus the asymptotic symmetry group at null infinity is neither the BMS nor a BMS-like group. Thus, although one could project bulk fields to $\Im^\pm$, in order to interpret them as intrinsic boundary fields one is forced to study, case by case, the theory of unitary and irreducible representations for the new asymptotic symmetry group and to repeat the Wigner programme. The final result of such an approach would consist of a full construction of the kinematical and the dynamical spectrum of the boundary free field theory
which should be compared with the projected bulk fields as it has been done in Sec. 3 for the four-dimensional scenario.^s A complete survey of the bulk to boundary relation for free fields should also comprise the rather elusive case of massive fields. Within this specific framework, the extension/restriction procedure proposed in Sec. 2 fails mainly due to the presence of an intrinsic length scale represented by the mass. Nonetheless, we believe that a "holographic investigation" along the lines proposed in [11] is still possible and it is currently under investigation. Other key results of this paper appear in Sec. 4, where the holographic interplay between a bulk theory living on a spacetime satisfying a weaker requirement as in Proposition 2.5 (in particular, a strongly asymptotically predictable spacetime in the sense of [19]) and the BMS boundary theory has been discussed within the framework of $C^*$-algebras of field-observables and their isomorphisms. In particular, in the specific scenario of Minkowski spacetime, a key achievement consists in establishing a unitary correspondence between the bulk vacuum and the BMS counterpart on $\Im^+$, though the uniqueness of the latter has not been proved and it should be analyzed in detail. The uniqueness problem of a BMS-invariant quasifree (algebraic) state $\lambda$ on $\Im^+$ is relevant to the issue of the notion of particle in the absence of the Poincaré group. If the BMS-invariant quasifree state is uniquely determined, it could be used to give a definition of particle for spacetimes which do not admit a group of isometries but are asymptotically flat, and such that the algebra of the field in the bulk can be identified with a subalgebra of the fields on $\Im^+$ by means of an injective $*$-homomorphism $\imath$ as in Proposition 4.1. In this case, $\lambda$ induces a quasifree state $\lambda_\imath$ for the algebra of fields in the bulk with an associated definition of particle.
We have established in Theorems 4.5 and 4.8 that such a notion of particle, whenever available, must agree with the usual one in four-dimensional Minkowski spacetime since $\lambda_\imath$, in that case, is just the Minkowski vacuum. Another relevant point which deserves investigation concerns the short-distance behavior of the $n$-point functions associated with $\lambda_\imath$. There is no evidence, from our construction, that $\lambda_\imath$ is Hadamard, even though this is trivially the case in Minkowski spacetime. To conclude, we wish also to point out that, within this paper, we have completely disregarded the role of interactions. Nonetheless, in order to construct a full holographic bulk to $\Im^\pm$ correspondence it is imperative to understand how to couple the boundary free field either with self/external interactions (barring the gravitational field) or with gauge degrees of freedom. A complete and concrete solution of this challenging issue would possibly settle whether or not it is really possible to define a full asymptotically flat/BMS correspondence and, thus, we believe it deserves a deep analysis.

^s A similar conclusion holds also for $d = 3$, where the counterpart of the BMS group is $\mathrm{Diff}(S^1) \ltimes C^\infty(S^1)$ [64]. Nonetheless, in this scenario, there is a further key difficulty since the role of $SL(2,\mathbb{C})$, a finite-dimensional Lie group, is taken by $\mathrm{Diff}(S^1)$, an infinite-dimensional group. Thus, the Mackey imprimitivity theorem may not hold and the inducing technique may not grant us an exhaustive reconstruction of unitary and irreducible representations.

Acknowledgments

The part of this work due to N.P. has been funded by Provincia Autonoma di Trento within the postdoctoral project FQLA, Rif. 2003-S116-00047 Reg. Delib. No. 3479 and allegato B.

Appendix A

A.1. GNS reconstruction

The interplay of the Fock representation presented in Sec. 2.4 and the GNS theorem [58, 65] is simply sketched. (The same holds for QFT in Minkowski spacetime presented in Sec. 4.4 upon replacing $\mathcal{W}(\Im^+)$ with $\mathcal{W}(M^4)$, $W(\psi)$ with $W_{M^4}(\phi)$, $\Pi$ with $\Pi_{M^4}$, $\Psi(\psi)$ with $\sigma_{M^4}(\phi,\Phi)$ and $\lambda$ with $\lambda_{M^4}$.) Using notation introduced in
Sec. 2.4, if $\Pi : \mathcal{W}(\Im^+) \to \widehat{\mathcal{W}}(\Im^+)$ denotes the unique ($\sigma$ being nondegenerate) $C^*$-algebra isomorphism between those two Weyl representations, it turns out that $(F_+(H), \Pi, \Upsilon)$ is the GNS triple associated with a particular pure algebraic state $\lambda$ (quasifree [65] and invariant under the automorphism group associated with $G_{BMS}$) on $\mathcal{W}(\Im^+)$ that we are going to introduce. Define $\lambda(W(\psi)) := e^{-\langle\psi_+,\psi_+\rangle/2}$, then extend $\lambda$ to the $*$-algebra finitely generated by all the elements $W(\psi)$ with $\psi \in S(\Im^+)$, by linearity and using (W1), (W2). It is easily proved that $\lambda(I) = 1$ and $\lambda(a^*a) \ge 0$ for every element $a$ of that $*$-algebra, so that $\lambda$ is a state. As the map $\mathbb{R} \ni t \mapsto \lambda(W(t\psi))$ is continuous, known theorems [66] imply that $\lambda$ extends uniquely to a state $\lambda$ on the complete Weyl algebra $\mathcal{W}(\Im^+)$. On the other hand, by direct computation, one finds that $\lambda(W(\psi)) = \langle\Upsilon, \widehat{W}(\psi)\Upsilon\rangle$. Since a state on a $C^*$-algebra is continuous, this relation can be extended to the whole algebras by linearity and continuity and using (W1), (W2), so that a general GNS relation is verified:
$$\lambda(a) = \langle\Upsilon, \Pi(a)\Upsilon\rangle \quad \text{for all } a \in \mathcal{W}(\Im^+). \tag{A.1}$$
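The "direct computation" behind the vacuum expectation value of a Weyl generator can be sketched as follows. The decomposition of the field operator used here is an assumption made for illustration (the precise conventions of Sec. 2.4 are not reproduced in this appendix): $\Psi(\psi) = A + A^\dagger$, with $A$ an annihilation operator such that $A\Upsilon = 0$ and $[A, A^\dagger] = \langle\psi_+,\psi_+\rangle I$, and $\widehat{W}(\psi) = e^{i\Psi(\psi)}$. The manipulation is formal, to be understood on the dense space of analytic vectors.

```latex
% Baker-Campbell-Hausdorff for a central commutator:
%   e^{X+Y} = e^{X} e^{Y} e^{-[X,Y]/2},  with X = iA^\dagger, Y = iA,
%   [X,Y] = -[A^\dagger, A] = [A, A^\dagger] = \langle\psi_+,\psi_+\rangle I.
\begin{align*}
\langle\Upsilon, e^{i(A+A^\dagger)}\Upsilon\rangle
  &= e^{-\langle\psi_+,\psi_+\rangle/2}\,
     \langle\Upsilon, e^{iA^\dagger} e^{iA}\Upsilon\rangle\\
  &= e^{-\langle\psi_+,\psi_+\rangle/2}\,
     \langle e^{-iA}\Upsilon, e^{iA}\Upsilon\rangle
   = e^{-\langle\psi_+,\psi_+\rangle/2},
\end{align*}
% since A\Upsilon = 0 gives e^{\pm iA}\Upsilon = \Upsilon and \|\Upsilon\| = 1.
```

With these assumed conventions the result reproduces the defining formula $\lambda(W(\psi)) = e^{-\langle\psi_+,\psi_+\rangle/2}$.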
To conclude, it is sufficient to show that $\Upsilon$ is cyclic with respect to $\Pi$. If $\widehat{F}(\Im^+)$ denotes the $*$-algebra generated by the field operators $\Psi(\psi)$, $\psi \in S(\Im^+)$, defined on $F_+(H)$, then $\widehat{F}(\Im^+)\Upsilon$ is dense in the Fock space (see [33, Proposition 5.2.3]). Let $\Phi \in F_+(H)$ be a vector orthogonal to $\Upsilon$ and to all the vectors $\widehat{W}(t_1\psi_1)\cdots\widehat{W}(t_n\psi_n)\Upsilon$ for $n = 1, 2, \ldots$, $t_i \in \mathbb{R}$ and $\psi_i \in S(\Im^+)$. Using Stone's theorem to differentiate in $t_i$ at $t_i = 0$, starting from $i = n$ and proceeding backwards up to $i = 1$, one finds that $\Phi$ must also be orthogonal to all of the vectors $\Psi(\psi_1)\cdots\Psi(\psi_n)\Upsilon$ and thus vanishes because $\widehat{F}(\Im^+)\Upsilon$ is dense. This result means that $\Pi(\mathcal{W}(\Im^+))\Upsilon$ is dense in the Fock space too, i.e. $\Upsilon$ is cyclic with respect to $\Pi$. Since $\Upsilon$ also satisfies (A.1), the uniqueness of the GNS triple proves that the triple $(F_+(H), \Pi, \Upsilon)$ is just (up to unitary transformations) the GNS triple associated with
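$\lambda$. Since the GNS representation is irreducible (see the discussion after Theorem 2.14), $\lambda$ is pure.

A.2. Proof of some propositions

Proof of Proposition 2.7. In the following we assume that $\Omega$ includes the further factor $\omega$. Referring to the expression of the BMS group in a Bondi frame, we prove the thesis for any element $(\Lambda, f)$ of the BMS group of the form either $(\Lambda(t), 0)$ or $(I, tf)$, where $f \in C^\infty(S^2)$ and $t \mapsto \Lambda(t)$ is a one-parameter subgroup of $SO(3,1)\uparrow$. Notice that the subgroups $t \mapsto (\Lambda(t), 0)$ and $t \mapsto (I, tf)$ are also one-parameter groups of diffeomorphisms of $\Im^+$ generated by smooth vector fields $\tilde\xi$ on $\Im^+$ as in Proposition 2.1, as one may check by direct inspection. From the decomposition theorem of the Lorentz group, it is easily proved that every element of $G_{BMS}$ is a finite product of elements of the form $(\Lambda(t), 0)$ and $(I, tf)$. Hence, using the property (2.6), the thesis turns out to be valid for a generic element of the BMS group. Assume that $(\Lambda, f)$ is an element of the one-parameter group of diffeomorphisms $\{\tilde\gamma_t\}$ of $\Im^+$ generated by $\tilde\xi$ and let $\xi$ be a smooth extension of $\tilde\xi$ to $M$ (i.e. $\tilde M$) as in Proposition 2.1, generating $\{\gamma_t\}$. In coordinates $(\Omega, u, \zeta, \bar\zeta)$ about $\Im^+$, (2.13) can be written down
$$(A_{\gamma_t}\psi)(u_t, \zeta_t, \bar\zeta_t) = \lim_{\Omega_t \to 0} \Omega_t^\alpha\, \phi_t(\Omega_t, u_t, \zeta_t, \bar\zeta_t), \tag{A.2}$$
where $\gamma_t : (\Omega, u, \zeta, \bar\zeta) \mapsto (\Omega_t, u_t, \zeta_t, \bar\zeta_t)$ and $\phi_t := \gamma_t^*\phi$, so that $\phi_t(\Omega_t, u_t, \zeta_t, \bar\zeta_t) = \phi(\Omega, u, \zeta, \bar\zeta)$. (A.2) can be re-written
$$(A_{\gamma_t}\psi)(u_t, \zeta_t, \bar\zeta_t) = \lim_{\Omega_t \to 0} \frac{\Omega_t^\alpha}{\Omega^\alpha}\, \Omega^\alpha \phi(\Omega, u, \zeta, \bar\zeta),$$
that is, since on $\Im^+$ $\gamma_t$ coincides with $\tilde\gamma_t$, which preserves $\Im^+$ itself,
$$(A_{\gamma_t}\psi)(u_t, \zeta_t, \bar\zeta_t) = \lim_{\Omega \to 0} \left(\frac{\Omega_t}{\Omega}\right)^{\alpha} \psi(u, \zeta, \bar\zeta). \tag{A.3}$$
Using de l'Hôpital's rule,
$$(A_{\gamma_t}\psi)(u_t, \zeta_t, \bar\zeta_t) = \left(\frac{\partial\Omega_t}{\partial\Omega}\bigg|_{\Omega=0}\right)^{\alpha} \psi(u, \zeta, \bar\zeta). \tag{A.4}$$
Our task is computing the derivative in the right-hand side of (A.4). By definition of $\xi$, one finds
$$\frac{d}{dt}\frac{\partial\Omega_t}{\partial\Omega}\bigg|_{(\Omega=0,u,\zeta,\bar\zeta)} = \frac{\partial \xi^\Omega(\gamma_t(\Omega, u, \zeta, \bar\zeta))}{\partial\Omega}\bigg|_{\Omega=0}. \tag{A.5}$$
Now, making explicit the condition that $(\Omega^2 \pounds_\xi g)_{\alpha\beta}$ extends smoothly to a vanishing field approaching $\Im^+$ (Proposition 2.1) in the considered coordinates, one easily finds for the components $\alpha = \Omega$, $\beta = u$:
$$\frac{\partial\xi^\Omega(\Omega, u, \zeta, \bar\zeta)}{\partial\Omega}\bigg|_{\Omega=0} = -\frac{\partial\tilde\xi^u(u, \zeta, \bar\zeta)}{\partial u},$$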
where we also used $\xi = \tilde\xi$ on $\Im^+$. Finally, from (A.5),
$$\frac{d}{dt}\ln\left(\frac{\partial\Omega_t}{\partial\Omega}\bigg|_{(\Omega=0,u,\zeta,\bar\zeta)}\right) = -\frac{\partial\tilde\xi^u(u_t, \zeta_t, \bar\zeta_t)}{\partial u_t}\bigg|_{(u_t,\zeta_t,\bar\zeta_t)=\tilde\gamma_t(u,\zeta,\bar\zeta)}. \tag{A.6}$$
Let us solve this equation in the relevant cases. By direct inspection, one finds that the right-hand side vanishes when the one-parameter subgroup $\{\tilde\gamma_t\}$ generated by $\tilde\xi$ has the form $t \mapsto (I, tf)$, and so $\frac{\partial\Omega_t}{\partial\Omega}\big|_{(\Omega=0,u,\zeta,\bar\zeta)}$ is constant in this case. Since $\frac{\partial\Omega_0}{\partial\Omega}\big|_{(\Omega=0,u,\zeta,\bar\zeta)} = 1$, (A.4) produces
$$(A_{\gamma_t}\psi)(u_t, \zeta_t, \bar\zeta_t) = \psi(u, \zeta, \bar\zeta),$$
which is just the thesis in the considered case. Let us consider the other case, with $\tilde\gamma_t$ having the form $t \mapsto (\Lambda_t, 0)$. In this case, one gets
$$\tilde\xi^u(u_t, \zeta_t, \bar\zeta_t) = \frac{u_t}{K_{\Lambda_t}(\Lambda_t^{-1}(\zeta_t, \bar\zeta_t))}\,\frac{dK_{\Lambda_t}(\zeta, \bar\zeta)}{dt}\bigg|_{(\zeta,\bar\zeta)=\Lambda_t^{-1}(\zeta_t,\bar\zeta_t)} = u_t\,\frac{d\ln|K_{\Lambda_t}(\zeta, \bar\zeta)|}{dt}\bigg|_{(\zeta,\bar\zeta)=\Lambda_t^{-1}(\zeta_t,\bar\zeta_t)}.$$
From (A.6), using the fact that $\frac{\partial\Omega_0}{\partial\Omega}\big|_{(\Omega=0,u,\zeta,\bar\zeta)} = 1$, one finds in the end via (A.4):
$$(A_{\gamma_t}\psi)(u_t, \zeta_t, \bar\zeta_t) = K_{\Lambda_t}(\zeta, \bar\zeta)^{-\alpha}\,\psi(u, \zeta, \bar\zeta),$$
which is the thesis in the considered case.

Proof of Theorem 2.12. (a) and (b). Take $\psi \in S(\Im^+)$. Using integration by parts in (2.19) and a standard theorem (Lebesgue's dominated convergence) to interchange the symbol of derivative with that of integral, it is easily proved that, if $\psi \in S(\Im^+)$, $(E, \zeta, \bar\zeta) \mapsto \widetilde{\psi_+}(E, \zeta, \bar\zeta)/\sqrt{E}$ belongs to $C^\infty(\mathbb{R}^+ \times S^2; \mathbb{C})$ and, as $E \to +\infty$, it decays, uniformly in $\zeta, \bar\zeta$ and with all derivatives in any variable, faster than any negative power of $E$. Using the same procedure in (2.18), one finds straightforwardly that $\zeta, \bar\zeta$-uniform estimates hold for $\psi_+$:
$$\left|\frac{\partial^k}{\partial u^k}\frac{\partial^c}{\partial\zeta^c}\frac{\partial^d}{\partial\bar\zeta^d}\psi_+(u, \zeta, \bar\zeta)\right| \le \frac{C_{k,c,d}}{1 + |u|^{k+1}} \tag{A.7}$$
for nonnegative constants $C_{k,c,d}$ depending on $k, c, d = 0, 1, 2, \ldots$. Therefore, it makes sense to apply $\sigma$ defined in (2.15) to a pair of positive frequency parts $\psi_{1+}, \psi_{2+}$ when $\psi_1, \psi_2 \in S(\Im^+)$. The independence from the used Bondi frame can be proved by direct inspection using (2.20) and Proposition 2.4, and taking advantage of the fact that $\epsilon_{S^2}(\zeta, \bar\zeta)$ is invariant under three-dimensional rotations. Let us prove item (b). In the following we use the notation $\psi'(u, \zeta, \bar\zeta) := (A_{(\Lambda,f)}\psi)(u, \zeta, \bar\zeta)$. Finally, (2.22) is a straightforward application of the Fubini–Tonelli theorem in the explicit expression for $\Omega(\psi_{1+}, \psi_{2+})$, the hypotheses being fulfilled due to the decay estimates stated above, using (2.18) (take into account that
actually the apparent singularity due to the factor $E^{-1/2}$ does not exist because of (2.19), where the integral produces a smooth function in $E$). The remaining part of (b) is an immediate consequence of (2.22). Let us prove (c). First of all notice that the map $\psi_+ \mapsto \widetilde{\psi_+}$ for $\psi \in S(\Im^+)$ is well defined because the map $\psi \mapsto \psi_+$ is injective. The proof follows straightforwardly from the injectivity of the Fourier transformation in Schwartz space, referring to the Fourier transform involved in (2.19) and using the fact that $\psi$ is real. By (2.21), the complex linear extension of $\psi_+ \mapsto \widetilde{\psi_+}$ is bounded and isometric and thus, it being defined in a dense subspace, it admits a unique bounded isometric extension $U$ from the completion of $S(\Im^+)^{\mathbb{C}}_+$, $H$, to a closed subspace of $L^2(\mathbb{R}^+ \times S^2, dE \otimes \epsilon_{S^2})$. To prove the thesis, that is that $U$ is a Hilbert space isomorphism, it is sufficient to show that the subspace includes $C_c^\infty((0, +\infty) \times S^2; \mathbb{C})$ because the latter is dense in $L^2(\mathbb{R}^+ \times S^2, dE \otimes \epsilon_{S^2})$. To this end, take $\phi \in C_c^\infty((0, +\infty) \times S^2; \mathbb{C})$ and define $\psi$ as:
$$\psi(u, \zeta, \bar\zeta) := \int_{\mathbb{R}^+} \frac{dE}{\sqrt{4\pi E}}\, e^{-iEu}\phi(E, \zeta, \bar\zeta) + \int_{\mathbb{R}^+} \frac{dE}{\sqrt{4\pi E}}\, \overline{e^{-iEu}\phi(E, \zeta, \bar\zeta)}.$$
Notice that the singularity of $E^{-1/2}$ at $E = 0$ is harmless since the support of $\phi$ does not include that point and thus the whole integrand is smooth and compactly supported. Finally, by direct inspection, one finds that $\psi \in S(\Im^+)$ and $\widetilde{\psi_+} = \phi$ so that, as wanted, $\phi = U\psi_+$ for some $\psi \in S(\Im^+)$. The last argument actually proves that the range of $K : S(\Im^+) \ni \psi \mapsto \psi_+$ includes the space $U^{-1}C_c^\infty((0, +\infty) \times S^2; \mathbb{C})$, which is dense in $H = U^{-1}L^2(\mathbb{R}^+ \times S^2, dE \otimes \epsilon_{S^2})$, and thus it proves also (d). This concludes the proof.

Proof of Theorem 2.15. As is well known when working with group representations, to prove the thesis it is sufficient to show that strong continuity holds for $g \to I$ (the unit element of $G_{BMS}$). Let us prove strong continuity as $g \to I$ for the restriction of the representation $U$ to $H$. To this end we prove, as the first step, the strong continuity of $U$ when it acts on one-particle states represented by smooth compactly supported functions $\tilde\phi(E, \zeta, \bar\zeta)$. (In the following, for the sake of simplicity, we write $\zeta, \bar\zeta$ for the coordinates on $S^2$, but actually one needs at least two charts to cover the compact smooth manifold $S^2$. The use of two charts removes the apparent singularity of the coordinates $\zeta, \bar\zeta$ at the point $\infty$ of the Riemann sphere.) Using the fact that every $U_g$ is unitary, one sees that $\|U_g\tilde\phi - \tilde\phi\| \to 0$ as $g \to I$ is equivalent to $(\tilde\phi, U_g\tilde\phi) \to (\tilde\phi, \tilde\phi)$ as $g \to I$. With an explicit representation (by means of (2.31)) we have to prove that, as $g \to I$ and for a smooth compactly supported $\tilde\phi$,
$$\lim_{(\Lambda,f)\to(I,0)} \int_{\mathbb{R}^+\times S^2} \sqrt{K_\Lambda(\zeta, \bar\zeta)}\; e^{iEf(\zeta,\bar\zeta)}\, \overline{\widetilde{\psi}\!\left(\frac{E}{K_\Lambda(\zeta, \bar\zeta)}, \Lambda(\zeta, \bar\zeta)\right)}\, \widetilde{\psi}(E, \zeta, \bar\zeta)\; dE \otimes \epsilon_{S^2}(\zeta, \bar\zeta) = \int_{\mathbb{R}^+\times S^2} \overline{\widetilde{\psi}(E, \zeta, \bar\zeta)}\, \widetilde{\psi}(E, \zeta, \bar\zeta)\; dE \otimes \epsilon_{S^2}(\zeta, \bar\zeta). \tag{A.8}$$
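The equivalence between norm convergence and convergence of the scalar product, used to arrive at (A.8), follows from a one-line expansion. The step below assumes only that each $U_g$ is unitary and that $(\cdot,\cdot)$ and $\|\cdot\|$ are the scalar product and norm of $H$.

```latex
\begin{align*}
\|U_g\tilde\phi - \tilde\phi\|^2
  &= (U_g\tilde\phi, U_g\tilde\phi) - (U_g\tilde\phi, \tilde\phi)
     - (\tilde\phi, U_g\tilde\phi) + (\tilde\phi, \tilde\phi)\\
  &= 2\|\tilde\phi\|^2 - 2\,\mathrm{Re}\,(\tilde\phi, U_g\tilde\phi),
  \qquad\text{since } \|U_g\tilde\phi\| = \|\tilde\phi\| .
\end{align*}
% Conversely, by the Cauchy-Schwarz inequality,
%   |(\tilde\phi, U_g\tilde\phi) - (\tilde\phi,\tilde\phi)|
%     \le \|\tilde\phi\|\,\|U_g\tilde\phi - \tilde\phi\| .
```

So $(\tilde\phi, U_g\tilde\phi) \to (\tilde\phi, \tilde\phi)$ forces the norm to vanish in the limit, and the Cauchy–Schwarz estimate gives the opposite implication.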
Taking $\Lambda$ in a relatively compact neighborhood $B$ of the unit element of $SO(3,1)\uparrow$, (for any fixed $f$) the smooth compactly supported map
$$(\Lambda, E, \zeta, \bar\zeta) \mapsto e^{iEf(\zeta,\bar\zeta)}\sqrt{K_\Lambda(\zeta, \bar\zeta)}\; \overline{\widetilde{\psi}\!\left(\frac{E}{K_\Lambda(\zeta, \bar\zeta)}, \Lambda(\zeta, \bar\zeta)\right)}\, \widetilde{\psi}(E, \zeta, \bar\zeta)$$
is bounded by construction by some constant $K$ not depending on $f$ (which gives no contribution to the absolute value of the considered functions since it is real valued). On the other hand, there is a compact set $C \subset \mathbb{R}^+ \times S^2$ containing all the supports of the maps
$$(E, \zeta, \bar\zeta) \mapsto e^{iEf(\zeta,\bar\zeta)}\sqrt{K_\Lambda(\zeta, \bar\zeta)}\; \overline{\widetilde{\psi}\!\left(\frac{E}{K_\Lambda(\zeta, \bar\zeta)}, \Lambda(\zeta, \bar\zeta)\right)}\, \widetilde{\psi}(E, \zeta, \bar\zeta),$$
for all $\Lambda \in B$ and all $f \in C^\infty(S^2)$. As a consequence, all those maps are $(\Lambda, f)$-uniformly bounded by a smooth compactly supported function on $\mathbb{R}^+ \times S^2$ which assumes the value $K$ in $C$. Thus, we can use Lebesgue's dominated convergence theorem in the right-hand side of (A.8), establishing the validity of (A.8) itself. We have proved strong continuity on smooth compactly supported functions in $H$. As the space of those functions is dense in $H$, this implies strong continuity on the whole $H$. Indeed, if $\phi \in H$, for any fixed smooth compactly supported $\phi_n \in H$ the triangle inequality entails
$$\|\phi - U_g\phi\| \le \|\phi - \phi_n\| + \|\phi_n - U_g\phi_n\| + \|U_g(\phi_n - \phi)\| = 2\|\phi - \phi_n\| + \|\phi_n - U_g\phi_n\|.$$
Let $\{V_m\}_{m\in\mathbb{N}}$ be a fundamental system of neighborhoods of $I$ (one can always choose $m \in \mathbb{N}$, the topology being induced by a countable class of seminorms) such that $V_m \supset V_{m+1}$ and $\cap_m V_m = \{I\}$. From the inequality above and using $\lim_{m\to+\infty}\sup_{g\in V_m}\|\phi_n - U_g\phi_n\| = 0$, which is a straightforward consequence of $\lim_{g\to I}\|\phi_n - U_g\phi_n\| = 0$, one has
$$0 \le \lim_{m\to+\infty}\sup_{g\in V_m}\|\phi - U_g\phi\| \le 2\|\phi - \phi_n\|.$$
Taking a sequence of $\phi_n$ with $\phi_n \to \phi$ for $n \to +\infty$, one obtains $\lim_{m\to+\infty}\sup_{g\in V_m}\|\phi - U_g\phi\| = 0$, which entails $\lim_{g\to I}\|\phi - U_g\phi\| = 0$, i.e. strong continuity holds for $U\restriction_H$. To conclude the proof we show that strong continuity in $H$ implies strong continuity in the whole Fock space. By construction, if $V_g := U_g\restriction_H$, on the invariant subspace $H^N \subset F_+(H)$ containing $N$ particles one has $V_g^{(N)} := U_g\restriction_{H^N} = V_g \otimes \cdots \otimes V_g$, where the number of factors is $N$. (Obviously $V_g^{(0)} := I$.) As a consequence $g \mapsto V_g^{(N)}$ is strongly continuous. Now, consider a generic element of $F_+(H)$, which can be viewed as a sequence $\Phi = \{\Psi^{(N)}\}_{N=0,1,\ldots}$ with $\Psi^{(N)} \in H^N$. Let us show that $(\Phi, V_g\Phi) \to \|\Phi\|^2$ as $g \to I$. (Using both the fact that $V_g$ is unitary and the group representation structure, this is equivalent to $\|V_g\Phi - V_h\Phi\|^2 \to 0$
as $g \to h$.) The spaces $H^N$ are invariant and pairwise orthogonal, and the $V_g^{(N)}$ are isometric; as a consequence one has
$$(\Phi, V_g\Phi) = \sum_{N=0}^{+\infty} (\Psi^{(N)}, V_g^{(N)}\Psi^{(N)}),$$
where $|(\Psi^{(N)}, V_g^{(N)}\Psi^{(N)})| \le \|\Psi^{(N)}\|\,\|V_g^{(N)}\Psi^{(N)}\| = \|\Psi^{(N)}\|^2$ and thus
$$\sum_{N=0}^{+\infty} |(\Psi^{(N)}, V_g^{(N)}\Psi^{(N)})| \le \sum_{N=0}^{+\infty} \|\Psi^{(N)}\|^2 = \|\Phi\|^2.$$
This $g$-uniform bound (essentially via Lebesgue's dominated convergence theorem) allows one to interchange the symbols of summation and limit:
$$\lim_{g\to I}(\Phi, V_g\Phi) = \sum_{N=0}^{+\infty}\lim_{g\to I}(\Psi^{(N)}, V_g^{(N)}\Psi^{(N)}) = \sum_{N=0}^{+\infty}(\Psi^{(N)}, V_I^{(N)}\Psi^{(N)}) = \|\Phi\|^2,$$
where we have used the strong continuity of each representation $V^{(N)}$. This is what we wanted to prove.

Proof of Proposition 3.6. It is sufficient to show that each character $\chi$ of $N$ admits a corresponding continuous linear function $\beta : N \to \mathbb{R}$ such that $\chi(\alpha) = e^{i\beta(\alpha)}$ for every $\alpha \in N$. (In fact, a continuous linear functional $\beta : N = C^\infty(S^2) \to \mathbb{R}$ is a distribution by definition and thus one can write $(\alpha, \beta)$ instead of $\beta(\alpha)$.) Let us prove it. Actually, the following proof holds true under the more general hypothesis that $N$ is a topological vector space. Fix a character $\chi$. First of all we identify $U(1)$ with $S^1$ and, in turn, we identify $S^1$ with $(-\pi, \pi]$ where $-\pi \equiv +\pi$. In this picture, for our fixed $\chi$, there is a continuous map $f : N \to (-\pi, \pi]$ such that $\chi(\alpha) = e^{if(\alpha)}$ for all $\alpha \in N$. From continuity, $B_0 := f^{-1}((-\pi, \pi))$ is an open subset of $N$. $B_0$ is a neighborhood of the zero vector of $N$. Indeed, $e^{if(0)} = \chi(0) = 1$ since $\chi$ is a homomorphism. We have found that $f(0) = 2k\pi$ for some $k \in \mathbb{Z}$. On the other hand, because $f(0) \in (-\pi, \pi]$ by hypothesis, it must be $f(0) = 0$. In particular, $f(0) \in (-\pi, \pi)$, hence $0 \in B_0$ and thus, as we said, $B_0$ is an open neighborhood of $0$. As $N$ is a topological vector space, there is an open balanced (also called star-shaped) neighborhood of $0$, $B \subset B_0$. In general, the function $f$ does not satisfy $f(u) + f(v) = f(u + v)$ because $f(u) + f(v)$ may not belong to $(-\pi, \pi)$ even if $f(u)$ and $f(v)$ do. Nevertheless, we define the map $\beta : N \to \mathbb{R}$ such that:
$$\beta(v) := n_v\, f\!\left(\frac{1}{n_v}v\right), \tag{A.9}$$
for all $v \in N$, $n_v > 0$ being the first natural number with $(1/n_v)v \in B$. We have the following results. (a) For every $\alpha \in N$ it holds $e^{i\beta(\alpha)} = \chi(\alpha)$.
Indeed, using $\chi(v)^m = \chi(mv)$, valid for every natural $m > 0$, and $e^{if(\alpha/n_\alpha)} = \chi(\alpha/n_\alpha)$, one has $e^{i\beta(\alpha)} = e^{in_\alpha f(\alpha/n_\alpha)} = (\chi(\alpha/n_\alpha))^{n_\alpha} = \chi(n_\alpha(\alpha/n_\alpha)) = \chi(\alpha)$.

(b) If $\beta$ is continuous it is additive as well, i.e.
$$\beta(u + v) = \beta(u) + \beta(v) \quad \text{for all } u, v \in N.$$
Indeed, from $\chi(u)\chi(v) = \chi(u + v)$ and (a), one obtains $e^{i(\beta(u)+\beta(v))} = e^{i\beta(u+v)}$. Fix $u, v \in N$ and let $t$ range in $[0, 1]$. The function $g : t \mapsto \beta(u) + \beta(tv) - \beta(u + tv)$ must be continuous because it is a composition of continuous functions. On the other hand, since $e^{i(\beta(u)+\beta(tv))} = e^{i\beta(u+tv)}$, $g$ must take values in the discrete, hence not connected, set $2\pi\mathbb{Z}$. Since continuous functions map connected sets to connected sets, $g$ must take a constant value in $2\pi\mathbb{Z}$. As $g(0) = 0$, we conclude that $\beta(u) + \beta(tv) - \beta(u + tv) = 0$ for $t \in [0, 1]$, in particular $\beta(u) + \beta(v) = \beta(u + v)$.

(c) If $\beta$ is continuous it is linear as well. Indeed from (b), one has $m\beta(v) = \beta(mv)$ for every natural $m > 0$ and $v \in N$. As a consequence, defining $u := nv$, one obtains $\beta(u/n) = (1/n)\beta(u)$, valid for every natural $n > 0$ and $u \in N$. Both these results entail that $r\beta(w) = \beta(rw)$ for every rational $r > 0$ and $w \in N$. By continuity, one finds $r\beta(w) = \beta(rw)$ for every real $r > 0$ and $w \in N$. Finally, (b) implies also that $\beta(0u) = 0\beta(u) = 0$ and $\beta(-u) = -\beta(u)$ for every $u \in N$. Putting all together one obtains that $r\beta(w) = \beta(rw)$ for every $r \in \mathbb{R}$ and $w \in N$. Taking additivity into account we have proved that $\beta$ is linear.

To conclude the proof, it is sufficient to show that $\beta$ defined in (A.9) is continuous. Let us demonstrate it by proving that $\beta$ is continuous at each point $\alpha \in N$. The difficult point to handle in the proof is that $n_\alpha$ in (A.9) is itself a function of $\alpha$, so that the continuity of $f$ alone is not enough. If $\alpha \in N$, by definition of $n_\alpha$ one has $\alpha/n_\alpha \in B$, but $\alpha/(n_\alpha - 1) \notin B$. If $B^c$ denotes $N \setminus B$, there are now two possibilities concerning the requirement $\alpha/(n_\alpha - 1) \notin B$: (1) $\alpha/(n_\alpha - 1) \in \mathrm{int}(B^c)$ or (2) $\alpha/(n_\alpha - 1) \in \partial B$. Suppose that (1) holds, i.e. $\alpha/(n_\alpha - 1) \in \mathrm{int}(B^c)$, together with $\alpha/n_\alpha \in B$. In this case $\alpha \in n_\alpha B$ as well as $\alpha \in \mathrm{int}((n_\alpha - 1)B^c)$. These sets are open by construction.
As a consequence, there is an open neighborhood $V$ of $\alpha$ such that, if $\alpha' \in V$, $\alpha'/(n_\alpha - 1) \in \mathrm{int}(B^c)$ (so $\alpha'/(n_\alpha - 1) \notin B$) and furthermore $\alpha'/n_\alpha \in B$. In other words, $n_{\alpha'} = n_\alpha$. In this case, there is a constant $C = n_\alpha > 0$ such that, if $\alpha'$ lies in a neighborhood $V$ of $\alpha$, $\beta(\alpha') = Cf(\alpha'/C)$. Since $f$ is continuous, $\beta$ is continuous in $V$ and thus at $\alpha$. To conclude, suppose that (2) is valid, that is $\alpha/(n_\alpha - 1) \in \partial B$, together with $\alpha/n_\alpha \in B$. Consider a sufficiently small open neighborhood $V$ of such an $\alpha$. If $\alpha' \in V$ there are two possibilities: $\alpha' \in (n_\alpha - 1)B^c$ or $\alpha' \in (n_\alpha - 1)B$. If $\alpha' \in (n_\alpha - 1)B^c$, one has $\alpha'/(n_\alpha - 1) \notin B$, but $\alpha'/n_\alpha \in B$, so that $n_{\alpha'} = n_\alpha$ and thus
$$\beta(\alpha') = n_\alpha f(\alpha'/n_\alpha). \tag{A.10}$$
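Conversely, if $\alpha' \in (n_\alpha - 1)B$, it must hold $\alpha'/(n_\alpha - 1) \in B$, so that $n_\alpha$ is not the first positive natural number $n_{\alpha'}$ such that $\alpha'/n_{\alpha'} \in B$. In this case, $n_{\alpha'} < n_\alpha$ and thus
$$\beta(\alpha') = n_{\alpha'} f(\alpha'/n_{\alpha'}), \quad \text{where } n_{\alpha'} < n_\alpha. \tag{A.11}$$
Let us prove that in this second case, actually,
$$n_{\alpha'} f(\alpha'/n_{\alpha'}) = n_\alpha f(\alpha'/n_\alpha) \tag{A.12}$$
holds true anyway, so that $\beta(\alpha') = n_\alpha f(\alpha'/n_\alpha)$ and (A.10) is valid in every case. Defining $\gamma := \alpha'/n_{\alpha'}$ (notice that $\gamma \in B$ by hypothesis) and $m := n_\alpha - n_{\alpha'}$ (notice that $0 < m < n_\alpha$ by construction), (A.12) is equivalent to
$$n_\alpha f(\gamma) - mf(\gamma) = n_\alpha f\!\left(\gamma - \frac{m}{n_\alpha}\gamma\right). \tag{A.13}$$
To prove (A.13) notice that, from $\chi(\alpha) = e^{if(\alpha)}$, one gets (use the fact that $\chi$ is a homomorphism and $\mathbb{N} \ni n_\alpha, m > 0$)
$$n_\alpha f(\gamma) - mf(\gamma) - n_\alpha f\!\left(\gamma - \frac{m}{n_\alpha}\gamma\right) \in 2\pi\mathbb{Z},$$
and the same holds with $t\gamma$ in place of $\gamma$ for every $t \in [0, 1]$. Finally consider the map, with $\gamma \in B$ fixed,
$$[0, 1] \ni t \mapsto h(t) := n_\alpha f(t\gamma) - mf(t\gamma) - n_\alpha f\!\left(t\gamma - \frac{m}{n_\alpha}t\gamma\right).$$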
This map is continuous because $f$ is continuous on $B$, $t\gamma \in B$ and $t\gamma - \frac{m}{n_\alpha}t\gamma \in B$ for $t \in [0, 1]$, since $\gamma \in B$, $B$ is balanced and $0 \le 1 - m/n_\alpha < 1$. As $2\pi\mathbb{Z}$ is discrete and not connected but $[0, 1]$ is connected, $h(t)$ must be constant. On the other hand $h(0) = 0$, thus $h(t) = 0$ for $t \in [0, 1]$ and (A.13) must hold true. We have proved once again that there is a constant $C = n_\alpha > 0$ such that, if $\alpha'$ lies in a neighborhood $V$ of $\alpha$, $\beta(\alpha') = Cf(\alpha'/C)$. Since $f$ is continuous, $\beta$ is continuous in $V$ and thus at $\alpha$.

Proof of Lemma 4.6. From the decomposition in the former formula in (4.10), passing to spherical coordinates one gets, for $\phi \in S_K(M^4)$ (recall that $\epsilon_{S^2} = \sin\vartheta\, d\vartheta \wedge d\varphi$ is the standard volume form of the unit 2-sphere),
$$\phi(t, r, \vartheta', \varphi') = \frac{1}{(2\pi)^{3/2}} \int_{\mathbb{R}^+} \frac{dE\, E^2}{\sqrt{2E}} \int_{S^2} \epsilon_{S^2}(\vartheta, \varphi)\, e^{iE(r\cos\alpha(\vartheta,\varphi,\vartheta',\varphi') - t)}\, \widetilde{\phi_+}(E, \vartheta, \varphi) + \text{c.c.},$$
where $\alpha = \alpha(\vartheta, \varphi, \vartheta', \varphi')$ is the angle between the vectors $\mathbf{x} = (r\sin\vartheta'\cos\varphi', r\sin\vartheta'\sin\varphi', r\cos\vartheta')$ and $\mathbf{p} = (E\sin\vartheta\cos\varphi, E\sin\vartheta\sin\varphi, E\cos\vartheta)$. Passing to null coordinates $u := t - r$, $v := t + r$ and using the function $\widehat{\phi_+}(E, \vartheta, \varphi) := \sqrt{E}\,\widetilde{\phi_+}(E, \vartheta, \varphi)$ which, by the second formula in (4.10), turns out to be
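bounded, smooth and $\vartheta, \varphi$-uniformly rapidly decaying as $E \to +\infty$ by construction (to prove it, use the latter formula in (4.10), taking into account that Cauchy data are smooth and compactly supported and that the Fourier transform maps such functions into Schwartz functions), the equation above can be rearranged as
$$\phi(t, r, \vartheta', \varphi') = \frac{1}{4\pi^{3/2}} \int_{\mathbb{R}^+} dE \int_{S^2} \epsilon_{S^2}(\vartheta, \varphi)\, e^{iEv(\cos\alpha - 1)/2}\, e^{-iEu(\cos\alpha + 1)/2}\, E\,\widehat{\phi_+}(E, \vartheta, \varphi) + \text{c.c.}$$
By definition of $\Gamma_{M^4}$ and using the fact that $\omega^2\Omega^2 = 4/v^2$ (see the beginning of Sec. 4.4), it must hold:
$$(\Gamma_{M^4}\phi)(u, \vartheta', \varphi') = \lim_{v\to+\infty} \frac{v}{2}\,\phi(u, v, \vartheta', \varphi').$$
In other words
$$(\Gamma_{M^4}\phi)(u, \vartheta', \varphi') = \lim_{v\to+\infty} \frac{1}{8\pi^{3/2}} \int_{\mathbb{R}^+} dE \int_{S^2} \epsilon_{S^2}(\vartheta, \varphi)\, e^{iEv(\cos\alpha - 1)/2}\, e^{-iEu(\cos\alpha + 1)/2}\, vE\,\widehat{\phi_+}(E, \vartheta, \varphi) + \text{c.c.} \tag{A.14}$$
Notice that the former exponential in the integrand, essentially due to the Riemann–Lebesgue lemma, makes the integral vanish except for the case $\cos\alpha - 1 = 0$, that is when $(\vartheta, \varphi) = (\vartheta', \varphi')$; on the other hand the factor $v$ blows up at this point, giving rise to a Dirac $\delta$. Indeed, the limit can be computed using standard Dirac-$\delta$-regularization procedures of distributional calculus, obtaining (see below)
$$(\Gamma_{M^4}\phi)(u, \vartheta, \varphi) = \frac{-i}{\sqrt{4\pi}} \int_{\mathbb{R}^+} dE\, \widehat{\phi_+}(E, \vartheta, \varphi)\, e^{-iEu} + \text{c.c.} \tag{A.15}$$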
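The origin of the prefactor $-i/\sqrt{4\pi}$ in (A.15) can be seen from an elementary integral. The sketch below is heuristic: it works in a frame where $\vartheta' = 0$ (so that $\cos\alpha = c := \cos\vartheta$) and treats the slowly varying factor $e^{-iEu(c+1)/2}\,\widehat{\phi_+}$ as frozen at its value at $c = 1$; the rigorous treatment is the integration by parts carried out in the proof.

```latex
\begin{align*}
\int_{-1}^{1} dc\; vE\, e^{iEv(c-1)/2}
  &= vE\,\frac{2}{iEv}\Big[e^{iEv(c-1)/2}\Big]_{c=-1}^{c=1}
   = -2i\,\big(1 - e^{-iEv}\big),\\[4pt]
\frac{1}{8\pi^{3/2}}\cdot 2\pi\cdot(-2i)
  &= \frac{-i}{2\sqrt{\pi}} = \frac{-i}{\sqrt{4\pi}} .
\end{align*}
% The factor 2\pi comes from the trivial integration in d\varphi at c = 1;
% the oscillating term e^{-iEv} is removed by the Riemann-Lebesgue lemma
% after integrating in E, and at c = 1 the remaining exponential equals
% e^{-iEu}.
```

This reproduces both the coefficient and the phase $e^{-iEu}$ appearing in (A.15).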
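We have found out that
$$(\Gamma_{M^4}\phi)(u, \vartheta, \varphi) = \int_{\mathbb{R}^+} dE\, \frac{(-i)E\,\widetilde{\phi_+}(E, \vartheta, \varphi)\, e^{-iEu}}{\sqrt{4\pi E}} + \int_{\mathbb{R}^+} dE\, \frac{\overline{(-i)E\,\widetilde{\phi_+}(E, \vartheta, \varphi)\, e^{-iEu}}}{\sqrt{4\pi E}}.$$
From that expression for $(\Gamma_{M^4}\phi)(u, \vartheta, \varphi)$, applying the definition (2.19) and standard properties of the Fourier transform for $L^1$ functions, one straightforwardly gets $\widetilde{(\Gamma_{M^4}\phi)_+}(E, \vartheta, \varphi) = (-i)E\,\widetilde{\phi_+}(E, \vartheta, \varphi)$, which is the thesis we wanted to prove. To conclude let us prove (A.15). Without loss of generality we can rotate the used Cartesian frame to have $\mathbf{x}$ with the direction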
of the positive axis z. In this case (A.14) reads, if c := cos ϑ, (ΓM4 φ)(u, 0, ϕ )
1 v→+∞ 8π 3/2
= lim
R+
0 2π
dE R+
1
dϕ
−2i v→+∞ 8π 3/2
= lim
2π
dE
dϕ
dc e
e
−iEu(c+1) 2
−1 1
dc −1
0
iEv(c−1) 2
(E, ϑ, ϕ) + c.c. vE φ +
d iEv(c−1) −iEu(c+1) 2 φ+ (E, ϑ, ϕ) + c.c. e e 2 dc
Integration by parts gives (noticing that the dependence from ϕ vanishes for ϑ = 0, π, i.e. c = 1, −1, and, thus, integration in dϕ trivially produces a factor 2π) (ΓM4 φ)(u, 0, ϕ ) −i4π v→+∞ 8π 3/2
(E, 0, ϕ) dEe−iEu φ +
= lim
R+
−i4π − lim v→+∞ 8π 3/2 −2i + lim v→+∞ 8π 3/2
R+
(E, 0, ϕ) + c.c. dEe−iEv φ +
dE R+
dϕ
2π
1
dc e
iEv(c−1) 2
e
−1
0
−iEu(c+1) 2
(E, ϑ, ϕ) + c.c. (−i)Euφ +
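In other words,
$$\begin{aligned}
(\Gamma_{M^4}\varphi)(u,0,\phi') &= \frac{-i}{\sqrt{4\pi}} \int_{\mathbb{R}^+} dE\, \widetilde{\varphi_+}(E,0,\phi)\, e^{-iEu} + \lim_{v\to+\infty} \frac{i}{\sqrt{4\pi}} \int_{\mathbb{R}^+} dE\, e^{-iEv}\, \widetilde{\varphi_+}(E,0,\phi) + \text{c.c.} \\
&\quad + \lim_{v\to+\infty} \frac{-1}{4\pi^{3/2}} \int_{\mathbb{R}^+} dE \int_0^{2\pi} d\phi \int_{-1}^{1} dc\; e^{\frac{iEv(c-1)}{2}}\, e^{\frac{-iEu(c+1)}{2}}\, Eu\,\widetilde{\varphi_+}(E,\vartheta,\phi) + \text{c.c.}
\end{aligned}$$
As the map $E \mapsto \widetilde{\varphi_+}(E,0,\phi)$ (with $\phi$ constant) is smooth and rapidly decaying, the Riemann–Lebesgue lemma implies that the limit in the first line vanishes. Let us focus on the last limit. As the integrand is $L^1$, we can interchange the order of integration via the Fubini–Tonelli theorem, so that the limit under consideration can be rewritten (up to an overall constant) as
$$\lim_{v\to+\infty} \int_{[0,2\pi]\times[-1,1]} d\phi\, dc \left[ \int_{\mathbb{R}^+} dE\; e^{\frac{iEv(c-1)}{2}}\, e^{\frac{-iEu(c+1)}{2}}\, Eu\,\widetilde{\varphi_+}(E,\vartheta,\phi) \right] + \text{c.c.} \tag{A.16}$$
By the Riemann–Lebesgue lemma, the integral in brackets vanishes, as $v \to +\infty$, almost everywhere in $(c,\phi)$. On the other hand, since the following $(c,\phi)$-uniform bound holds,
$$\left| \int_{\mathbb{R}^+} dE\; e^{\frac{iEv(c-1)}{2}}\, e^{\frac{-iEu(c+1)}{2}}\, Eu\,\widetilde{\varphi_+}(E,\vartheta,\phi) \right| \le \int_{\mathbb{R}^+} dE\, \bigl| Eu\,\widetilde{\varphi_+}(E,\vartheta,\phi) \bigr| \le M < +\infty,$$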
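and the domain of integration of the outer integral in (A.16) has finite measure, we can use Lebesgue's dominated convergence theorem, getting
$$\begin{aligned}
&\lim_{v\to+\infty} \int_{[0,2\pi]\times[-1,1]} d\phi\, dc \left[ \int_{\mathbb{R}^+} dE\; e^{\frac{iEv(c-1)}{2}}\, e^{\frac{-iEu(c+1)}{2}}\, Eu\,\widetilde{\varphi_+}(E,\vartheta,\phi) \right] + \text{c.c.} \\
&\qquad = \int_{[0,2\pi]\times[-1,1]} d\phi\, dc\, \lim_{v\to+\infty} \left[ \int_{\mathbb{R}^+} dE\; e^{\frac{iEv(c-1)}{2}}\, e^{\frac{-iEu(c+1)}{2}}\, Eu\,\widetilde{\varphi_+}(E,\vartheta,\phi) \right] + \text{c.c.} \\
&\qquad = \int_{[0,2\pi]\times[-1,1]} d\phi\, dc\; 0 + \text{c.c.} = 0.
\end{aligned}$$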
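The decay that the Riemann–Lebesgue lemma guarantees in this step can be made concrete with a toy integrand. The sketch below is only an illustration under stated assumptions: the function $f(E) = E\,e^{-E}$ is a hypothetical stand-in for a smooth, rapidly decaying $E \mapsto \widetilde{\varphi_+}(E,0,\phi)$, chosen because $\int_0^\infty e^{-iEv} f(E)\,dE = (1+iv)^{-2}$ is known in closed form, so the numerical value can be compared with the exact one.

```python
import cmath
import math

def oscillatory_integral(v, upper=40.0, n=100000):
    """Simpson approximation of int_0^upper exp(-i E v) * E * exp(-E) dE (n even)."""
    h = upper / n
    f = lambda E: cmath.exp(-1j * E * v) * E * math.exp(-E)
    total = f(0.0) + f(upper)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * f(j * h)
    return total * h / 3

def exact(v):
    """Closed form of int_0^infinity exp(-i E v) * E * exp(-E) dE = 1 / (1 + i v)^2."""
    return 1.0 / (1.0 + 1j * v) ** 2

for v in (0.0, 5.0, 50.0):
    print(v, abs(oscillatory_integral(v)), abs(exact(v)))
```

The modulus of the exact value is $1/(1+v^2)$, so the integral tends to zero as $v \to +\infty$, which is the behaviour used for the boundary term containing $e^{-iEv}$.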
We conclude that
$$(\Gamma_{M^4}\varphi)(u,0,\phi') = \frac{-i}{\sqrt{4\pi}} \int_{\mathbb{R}^+} dE\, \widetilde{\varphi_+}(E,0,\phi)\, e^{-iEu} + \text{c.c.}$$
Notice that the values $\phi'$ and $\phi$ are arbitrary only because of the singularity of spherical coordinates at $\vartheta = 0$ (the problem is harmless here because the singular set has measure zero). What is relevant in the expression above is that, barring the problem with coordinates, it says that the versor $n'$ on $S^2$ in the argument of the function on the left-hand side, $(\Gamma_{M^4}\varphi)(u,n')$, coincides with its analog, $n$, in the argument of the integrated function $\widetilde{\varphi_+}(E,n)$. Rotating the Cartesian frame back, to work with a generic value of $\vartheta$, the equation above transforms into
$$(\Gamma_{M^4}\varphi)(u,\vartheta,\phi) = \frac{-i}{\sqrt{4\pi}} \int_{\mathbb{R}^+} dE\, \widetilde{\varphi_+}(E,\vartheta,\phi)\, e^{-iEu} + \text{c.c.},$$
where we have identified the angles $\phi'$ and $\phi$, as is appropriate when working with $\vartheta \neq 0, \pi$, because $n' = n$. This equation is (A.15).

References
[1] G. 't Hooft, Dimensional reduction in quantum gravity, arXiv:gr-qc/9310026.
[2] R. Bousso, The holographic principle, Rev. Mod. Phys. 74 (2002) 825; arXiv:hep-th/0203101.
[3] O. Aharony, S. S. Gubser, J. M. Maldacena, H. Ooguri and Y. Oz, Large N field theories, string theory and gravity, Phys. Rept. 323 (2000) 183; arXiv:hep-th/9905111.
[4] J. de Boer, L. Maoz and A. Naqvi, Some aspects of the AdS/CFT correspondence, arXiv:hep-th/0407212.
[5] K. H. Rehren, Algebraic holography, Ann. Henri Poincaré 1 (2000) 607; arXiv:hep-th/9905179.
[6] K. H. Rehren, Local quantum observables in the anti-deSitter–conformal QFT correspondence, Phys. Lett. B 493 (2000) 383; arXiv:hep-th/0003120.
[7] G. Arcioni and C. Dappiaggi, Exploring the holographic principle in asymptotically flat spacetimes via the BMS group, Nucl. Phys. B 674 (2003) 553; arXiv:hep-th/0306142.
[8] J. de Boer and S. N. Solodukhin, A holographic reduction of Minkowski space-time, Nucl. Phys. B 665 (2003) 545; arXiv:hep-th/0303006.
[9] E. Alvarez, J. Conde and L. Hernandez, Goursat's problem and the holographic principle, Nucl. Phys. B 689 (2004) 257; arXiv:hep-th/0401220.
[10] G. Arcioni and C. Dappiaggi, Holography in asymptotically flat space-times and the BMS group, Class. Quant. Grav. 21 (2004) 5655; arXiv:hep-th/0312186.
[11] C. Dappiaggi, BMS field theory and holography in asymptotically flat space-times, J. High Energy Phys. 0411 (2004) 011; arXiv:hep-th/0410026.
[12] C. Dappiaggi, Elementary particles, holography and the BMS group, Phys. Lett. B 615 (2005) 291; arXiv:hep-th/0412142.
[13] S. Frittelli, C. Kozameh and E. T. Newman, GR via characteristic surfaces, J. Math. Phys. 36 (1995) 4984; arXiv:gr-qc/9502028.
[14] S. Frittelli, C. Kozameh and E. T. Newman, Lorentzian metrics from characteristic surfaces, J. Math. Phys. 36 (1995) 4975; arXiv:gr-qc/9502025.
[15] V. Moretti and N. Pinamonti, Holography, SL(2,R) symmetry, Virasoro algebra and all that in Rindler spacetime, J. Math. Phys. 45 (2004) 230; arXiv:hep-th/0304111.
[16] V. Moretti and N. Pinamonti, Quantum Virasoro algebra with central charge c = 1 on the event horizon of a 2D Rindler spacetime, J. Math. Phys. 45 (2004) 257; arXiv:hep-th/0307021.
[17] V. Moretti and N. Pinamonti, Bose–Einstein condensate and spontaneous breaking of conformal symmetry on Killing horizons, J. Math. Phys. 46 (2005) 062303; arXiv:hep-th/0407256.
[18] A. Ashtekar and B. C. Xanthopoulos, Isometries compatible with asymptotic flatness at null infinity: A complete description, J. Math. Phys. 19(10) (1978) 2216–2222.
[19] R. M. Wald, General Relativity (Chicago University Press, Chicago, 1984).
[20] R. Penrose, Asymptotic properties of space and time, Phys. Rev. Lett. 10 (1963) 66.
[21] R. Penrose, in Group Theory in Non-Linear Problems, ed. A. O. Barut (Reidel, Dordrecht, 1974), p. 97, Chap. 1.
[22] R. Geroch, in Asymptotic Structure of Spacetime, eds. P. Esposito and L. Witten (Plenum, New York, 1977).
[23] A. Ashtekar and M. Streubel, Symplectic geometry of radiative modes and conserved quantities at null infinity, Proc. R. Soc. Lond. A 376 (1981) 585.
[24] R. Sachs, Asymptotic symmetries in gravitational theory, Phys. Rev. 128 (1962) 2851.
[25] P. J. McCarthy, Representations of the Bondi–Metzner–Sachs group I, Proc. R. Soc. London A 330 (1972) 517.
[26] P. J. McCarthy, Representations of the Bondi–Metzner–Sachs group II, Proc. R. Soc. London A 333 (1973) 317.
[27] M. Crampin and P. J. McCarthy, Physical significance of the topology of the Bondi–Metzner–Sachs, Phys. Rev. Lett. 33 (1974) 547.
[28] L. Girardello and G. Parravicini, Continuous spins in the Bondi–Metzner–Sachs of asymptotically symmetry in general relativity, Phys. Rev. Lett. 32 (1974) 565.
[29] P. J. McCarthy, The Bondi–Metzner–Sachs in the nuclear topology, Proc. R. Soc. London A 343 (1975) 489.
[30] R. Penrose and W. Rindler, Spinor and Twistor Methods in Space-Time Geometry, Vol. 2, Spinors and Space-Time (Cambridge University Press, 1986).
[31] A. Ashtekar, Asymptotic quantization of the gravitational field, Phys. Rev. Lett. 46 (1981) 573.
[32] R. M. Wald, Quantum Field Theory in Curved Spacetime and Black Hole Thermodynamics (Chicago University Press, Chicago, 1994).
[33] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. 2 (Springer, Berlin, Germany, 1996).
[34] B. S. Kay and R. M. Wald, Theorems on the uniqueness and thermal properties of stationary, nonsingular, quasifree states on space-times with a bifurcate Killing horizon, Phys. Rept. 207 (1991) 49.
[35] P. J. McCarthy, Lifting of projective representations of the Bondi–Metzner–Sachs group, Proc. R. Soc. London A 358 (1978) 141.
[36] Y. K. Lau and X. N. Wu, On the projective representations of the Bondi–Metzner–Sachs group, Proc. R. Soc. Lond. A 457 (2001) 453.
[37] J. Milnor, Remarks on infinite dimensional Lie groups, in Relativity, Groups and Topology II, eds. B. S. DeWitt and R. Stora (Elsevier, Amsterdam/New York, 1984), pp. 1007–1057.
[38] H. Bondi, M. G. J. van der Burg and A. W. K. Metzner, Gravitational waves in general relativity VII. Waves from axi-symmetric isolated points, Proc. Roy. Soc. London Ser. A 269 (1962) 21.
[39] I. M. Gel'fand et al., Generalized Functions: Integral Geometry and Representation Theory, Vol. 5 (Academic Press, 1996).
[40] I. M. Gel'fand et al., Generalized Functions: Application of Harmonic Analysis, Vol. 4 (Academic Press, 1966).
[41] P. J. McCarthy, Structure of the Bondi–Metzner–Sachs group, J. Math. Phys. 13 (1972) 1837.
[42] A. Piard, Unitary representations of semidirect product groups with infinite dimensional abelian normal subgroup, Rept. Math. Phys. 11 (1977) 259.
[43] D. J. Simms, Lie Groups and Quantum Mechanics (Springer-Verlag, 1968).
[44] A. O. Barut and R. Raczka, Theory of Group Representation and Applications, 2nd edn. (World Scientific, 1986).
[45] F. Lledo, Massless relativistic wave equations and quantum field theory, Ann. Henri Poincaré 5 (2004) 607.
[46] G. W. Mackey, Unitary Group Representations in Physics, Probability and Number Theory (Addison-Wesley Publishing, 1989).
[47] N. Bourbaki, Mesure de Haar (Hermann, 1963).
[48] M. Asorey, L. J. Boya and J. F. Carinena, Covariant representations in a fiber bundle framework, Rept. Math. Phys. 21 (1986) 391.
[49] A. Piard, Representations of the Bondi–Metzner–Sachs group with the Hilbert topology, Rept. Math. Phys. 11 (1977) 279.
[50] R. Shaw, The subgroup structure of the homogeneous Lorentz group, Quart. J. Math. Oxford Ser. 21 (1970) 101.
[51] L. J. Boya, J. F. Carinena and M. Santander, On the continuity of the boost on each orbit, Comm. Math. Phys. 37 (1974) 331.
[52] G. Goldin, Nonrelativistic current algebras as unitary representations of groups, J. Math. Phys. 12 (1971) 462.
[53] E. Wigner, On unitary representations of the inhomogeneous Lorentz group, Ann. of Math. (2) 40 (1939) 149.
[54] S. Frittelli and E. T. Newman, Pseudo-Minkowskian coordinates in asymptotically flat space-times, Phys. Rev. D 55 (1997) 1971.
[55] S. Frittelli and E. T. Newman, Poincaré pseudosymmetries in asymptotically flat spacetimes, in On Einstein's Path (Springer, New York, 1999), p. 227.
[56] S. A. Fulling, Aspects of Quantum Field Theory in Curved Space-Time (Cambridge University Press, Cambridge, 1991).
[57] R. F. Streater and A. S. Wightman, PCT, Spin and Statistics, and All That (Princeton University Press, Princeton, 2000); Paperback printing with revised Preface and corrections.
[58] R. Haag, Local Quantum Physics: Fields, Particles, Algebras, 2nd revised and enlarged edn. (Springer, Berlin, Germany, 1992).
[59] F. G. Friedlander, The Wave Equation on a Curved Space-Time (Cambridge University Press, Cambridge, 1975).
[60] C. Kozameh and E. T. Newman, Theory of light cone cuts of null infinity, J. Math. Phys. 24 (1983) 2481.
[61] S. Frittelli, E. T. Newman and G. Silva-Ortigoza, The eikonal equation in asymptotically flat space-times, J. Math. Phys. 40 (1999) 1041.
[62] S. Hollands and A. Ishibashi, Asymptotic flatness and Bondi energy in higher dimensional gravity, J. Math. Phys. 46 (2005) 022503; arXiv:gr-qc/0304054.
[63] S. Hollands and A. Ishibashi, Asymptotic flatness at null infinity in higher dimensional gravity, arXiv:hep-th/0311178.
[64] A. Ashtekar, J. Bicak and B. G. Schmidt, Asymptotic structure of symmetry reduced general relativity, Phys. Rev. D 55 (1997) 669; arXiv:gr-qc/9608042.
[65] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics, Vol. 1 (Springer-Verlag, New York, 1979).
[66] P. J. M. Bongaarts, Linear fields according to I.E. Segal, in Mathematics of Contemporary Physics, ed. R. F. Streater (Academic Press, London, 1972); J. T. Lewis, The free boson gas, ibid.
June 29, 2006 16:14 WSPC/148-RMP
J070-00269
Reviews in Mathematical Physics Vol. 18, No. 4 (2006) 417–483 © World Scientific Publishing Company
LOCALIZATIONS AT INFINITY AND ESSENTIAL SPECTRUM OF QUANTUM HAMILTONIANS: I. GENERAL THEORY
VLADIMIR GEORGESCU∗ and ANDREI IFTIMOVICI†
CNRS (UMR 8088) and Department of Mathematics, University of Cergy-Pontoise, 2, Avenue Adolphe Chauvin, 95302 Cergy-Pontoise Cedex, France
∗[email protected]
†[email protected]

Received 2 November 2005
Revised 8 May 2006

We isolate a large class of self-adjoint operators $H$ whose essential spectrum is determined by their behavior at $x \sim \infty$ and we give a canonical representation of $\sigma_{\rm ess}(H)$ in terms of spectra of limits at infinity of translations of $H$.

Keywords: Essential spectrum; C*-algebras; crossed products; ultrafilters; Schrödinger operators; Dirac operators.

Mathematics Subject Classification 2000: 47A10, 81Q10, 46L60, 46N50, 47L65, 47L90, 47L80
Contents
1. Introduction
2. Preliminaries
3. Crossed Products
4. Affiliation to $\mathscr{C}(X)$
5. Localizations at Infinity
6. Applications
A. Appendix
1. Introduction

In this paper, we continue the investigation of the spectral properties of quantum Hamiltonians with C*-algebra methods along the lines of our previous work [22]. More precisely, our aim is to study the essential spectrum of general classes of (unbounded) operators in $L^2(X)$, where $X$ is a locally compact non-compact abelian group, by using crossed product techniques. For some historical remarks and comparison with other recently obtained results, see Secs. 1.2 and 1.4.
1.1. Essential spectrum: Main result

We set $B(X) = B(L^2(X))$ and we denote by $U_x$ the operator of translation by $x \in X$ and by $V_k$ the operator of multiplication by the character $k \in X^*$ (our notations, although rather standard, are summarized in Sec. 2). We define$^{a}$
$$\mathscr{C}(X) = \Bigl\{ T \in B(X) \;\Big|\; \lim_{k\to 0} \|[T, V_k]\| = 0 \ \text{and}\ \lim_{x\to 0} \|(U_x - 1)T^{(*)}\| = 0 \Bigr\}, \tag{1.1}$$
which is clearly a C*-algebra of operators on $L^2(X)$ (without unit if $X$ is not discrete). Besides the norm topology on $\mathscr{C}(X)$, we shall also consider on it the topology defined by the family of seminorms $\|S\|_\theta = \|S\theta(Q)\| + \|\theta(Q)S\|$ with $\theta \in C_0(X)$, and we shall denote by $\mathscr{C}_s(X)$ the corresponding topological space (see Remark 5.7). Here, $\theta(Q)$ is the operator of multiplication by $\theta$ in $L^2(X)$. Our main result is a description of the essential spectrum of the operators $T \in \mathscr{C}(X)$ in terms of their "localizations at infinity". We denote by $\delta X$ the set of all ultrafilters on $X$ finer than the Fréchet filter (cf. Sec. 2.4). If $A_i$ are subsets of a topological space, we denote by $\overline{\bigcup}_{i\in I} A_i$ the closure of their union.
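To make the roles of $U_x$ and $V_k$ concrete, the following sketch checks the commutation relation $V_k^* U_a V_k = k(a)\,U_a$ numerically. It is only an illustration under assumptions that are not in the text: the group is the finite cyclic group $\mathbb{Z}_N$ (which is compact, so it lies outside the non-compact setting of the paper and serves purely to test the algebraic identity), translations act as $(U_a f)(x) = f(x+a)$, and characters are $k_m(x) = e^{2\pi i m x/N}$. The relation gives $[U_a, V_k] = (k(a) - 1)\,V_k U_a$, whose norm is $|k(a) - 1|$.

```python
import cmath

N = 12  # order of the cyclic group Z_N (hypothetical choice for the illustration)

def character(m):
    """Character k_m of Z_N: k_m(x) = exp(2*pi*i*m*x/N)."""
    return lambda x: cmath.exp(2j * cmath.pi * m * x / N)

def translate(f, a):
    """(U_a f)(x) = f(x + a), with addition taken mod N."""
    return [f[(x + a) % N] for x in range(N)]

def multiply(f, k, conjugate=False):
    """(V_k f)(x) = k(x) f(x); with conjugate=True this applies V_k^* instead."""
    return [(k(x).conjugate() if conjugate else k(x)) * f[x] for x in range(N)]

def max_deviation(m, a):
    """Largest entry of (V_k^* U_a V_k - k(a) U_a) e_j over all basis vectors e_j."""
    k = character(m)
    worst = 0.0
    for j in range(N):
        e = [1.0 if x == j else 0.0 for x in range(N)]
        lhs = multiply(translate(multiply(e, k), a), k, conjugate=True)
        rhs = [k(a) * y for y in translate(e, a)]
        worst = max(worst, max(abs(p - q) for p, q in zip(lhs, rhs)))
    return worst

print(max(max_deviation(m, a) for m in range(N) for a in range(N)))
```

The printed value is zero up to rounding, which confirms the identity for every character and every translation of $\mathbb{Z}_N$ under the conventions assumed above.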
Theorem 1.1. If $T \in \mathscr{C}(X)$ is a normal operator, then for each $\kappa \in \delta X$, the limit $\lim_{x\to\kappa} U_x T U_x^* = \kappa.T$ exists in $\mathscr{C}_s(X)$ and
$$\sigma_{\rm ess}(T) = \overline{\bigcup}_{\kappa}\, \sigma(\kappa.T). \tag{1.2}$$

Note that $x \to \kappa$ should be read "$x$ tends to infinity along the filter $\kappa$". The limit operator $\kappa.T$ will be called the localization at $\kappa$ of $T$. Since an ultrafilter finer than the Fréchet filter can be thought of as a point on an ideal boundary at infinity of $X$, the operators $\kappa.T$ will also be called localizations at infinity of $T$. We are mainly interested in the essential spectrum of unbounded self-adjoint operators $H$ "affiliated" to $\mathscr{C}(X)$, but the corresponding result is an immediate consequence of Theorem 1.1. We say that $H$ is affiliated to some C*-algebra $\mathfrak{A}$ of operators on $L^2(X)$ if $\phi(H) \in \mathfrak{A}$ for all $\phi \in C_0(\mathbb{R})$ (for this, it suffices to have $(H - z)^{-1} \in \mathfrak{A}$ for one $z \in \rho(H)$). For technical reasons, we have to consider self-adjoint operators which are not necessarily densely defined and, in order to avoid confusion with the standard terminology, we shall call these more general objects observables. A more detailed presentation of this notion can be found in Sec. 2.2. For the moment we note only that an observable $H$ is affiliated to $\mathscr{C}(X)$ if and only if
$$\lim_{k\to 0} \|[V_k, (H - z)^{-1}]\| = 0 \quad\text{and}\quad \lim_{x\to 0} \|(U_x - 1)(H - z)^{-1}\| = 0 \tag{1.3}$$
for some $z \in \rho(H)$. This follows from the fact that if $T \in B(X)$ is normal, then
$$\lim_{x\to 0} \|(U_x - 1)T\| = 0 \;\Longrightarrow\; \lim_{x\to 0} \|(U_x - 1)T^*\| = 0.$$
$^{a}$ We make the following convention: if a symbol like $T^{(*)}$ appears in a relation, then the relation must hold for the operator $T$ and for its adjoint $T^*$.
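Theorem 1.2. Let $H$ be an observable on $L^2(X)$ affiliated to $\mathscr{C}(X)$. Then for each $\kappa \in \delta X$, the limit $\kappa.H := \lim_{x\to\kappa} x.H$ exists in the following sense: there is an observable $\kappa.H$ affiliated to $\mathscr{C}(X)$ such that $\lim_{x\to\kappa} U_x \phi(H) U_x^* = \phi(\kappa.H)$ in $\mathscr{C}_s(X)$ for all $\phi \in C_0(\mathbb{R})$. Moreover, we have
$$\sigma_{\rm ess}(H) = \overline{\bigcup}_{\kappa}\, \sigma(\kappa.H). \tag{1.4}$$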
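A simple instance of (1.4) can be explored numerically. The sketch below is a heuristic illustration, not part of the paper: it takes $X = \mathbb{Z}$ and the hypothetical operator $(Hf)(x) = f(x+1) + f(x-1) + V(x)f(x)$ with a step potential $V = v_-$ for $x < 0$ and $V = v_+$ for $x \ge 0$. Its translates converge to $U_1 + U_1^* + v_\pm$ along ultrafilters containing the positive or negative half-line, so (1.4) predicts $\sigma_{\rm ess}(H) = [v_- - 2, v_- + 2] \cup [v_+ - 2, v_+ + 2]$. Finite sections only approximate this, since the essential spectrum is an infinite-volume notion. The code counts eigenvalues of a finite section with a Sturm sequence and looks at the gap between the two bands.

```python
def count_below(diag, off, lam):
    """Number of eigenvalues < lam of the symmetric tridiagonal matrix with
    diagonal `diag` and off-diagonal `off` (signs of the LDL^T pivots)."""
    count = 0
    q = diag[0] - lam
    for i in range(len(diag)):
        if i > 0:
            q = diag[i] - lam - off[i - 1] ** 2 / q
        if q == 0.0:
            q = -1e-300  # nudge an exact zero pivot to avoid division by zero
        if q < 0:
            count += 1
    return count

def step_potential_section(n, v_minus, v_plus):
    """Finite section on sites -n..n-1 of (Hf)(x) = f(x+1) + f(x-1) + V(x) f(x)."""
    diag = [v_minus] * n + [v_plus] * n
    off = [1.0] * (2 * n - 1)
    return diag, off

n = 200
diag, off = step_potential_section(n, 0.0, 5.0)  # predicted bands: [-2, 2] and [3, 7]
below_gap = count_below(diag, off, 2.0)
in_gap = count_below(diag, off, 3.0) - below_gap
above_gap = 2 * n - count_below(diag, off, 3.0)
print(below_gap, in_gap, above_gap)
```

Decoupling the two half-lattices changes the matrix by a perturbation of rank two (one positive and one negative eigenvalue), so by eigenvalue interlacing at most two eigenvalues of the finite section can lie in the gap $[2, 3)$, while about $n$ eigenvalues fall in each predicted band, whatever the size $n$.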
Practically, we are interested only in the case when $H$ is a self-adjoint operator in the standard sense. However, even in this case, $\kappa.H$ may fail to be densely defined, and quite often we have $\kappa.H = \infty$ (i.e. the domain of $\kappa.H$ is $\{0\}$). For example, if $H$ has purely discrete spectrum, then $\phi(H)$ is a compact operator and we clearly get $\kappa.H = \infty$ for all $\kappa$. Since $\sigma(\infty) = \emptyset$, we then obtain $\sigma_{\rm ess}(H) = \emptyset$, as it should be.

Remark 1.3. The observable $H$ should be thought of as the Hamiltonian (energy observable) of a physical system. Thus, (1.4) says that the essential spectrum of the Hamiltonian $H$ can be computed in terms of the spectra of its localizations at infinity $\kappa.H$. We emphasize that this notion of infinity is determined by the position observable $Q$. In other terms, if $H$ satisfies (1.3), then $\sigma_{\rm ess}(H)$ is given by its localizations in the region $Q = \infty$. This property does not hold in many situations of physical interest (e.g., if magnetic fields which do not vanish at infinity are involved) because localizations at infinity with respect to other observables must be taken into account, see [21].

Remark 1.4. It will be clear from the proof of Theorem 1.2 (see Lemma 5.3 and Proposition 5.10) that (1.4) remains valid if $\kappa$ runs over sets much smaller than $\delta X$: it suffices to take $\kappa \in K$ if $K \subset \delta X$ has the following property: if $\phi$ is a bounded uniformly continuous function on $X$ and $\lim_{x\to\kappa} \phi(x + y) = 0$ for all $y \in X$, $\kappa \in K$, then $\phi \in C_0(X)$.

Remark 1.5. We mention the following immediate consequence of (1.4): if two observables affiliated to $\mathscr{C}(X)$ have the same localizations at infinity, then they have the same essential spectrum. If the difference of the resolvents is a compact operator, then clearly they have the same localizations at infinity, but the converse is far from being true (e.g., see the example from [22, p. 531], where the essential spectrum is independent of the details of the shape of the function $\omega$).
On the other hand, one may find in [18] criteria which ensure the compactness of the difference of the resolvents of two self-adjoint operators under rather weak conditions; e.g., an example from [18, p. 26] is a general version of [31, Proposition 4.1].

Remark 1.6. The following remark is useful in applications: if $H$ is an observable affiliated to $\mathscr{C}(X)$ and if $\theta : \sigma(H) \to \mathbb{R}$ is a proper continuous function, then $\theta(H)$ is affiliated to $\mathscr{C}(X)$ and we have $\kappa.\theta(H) = \theta(\kappa.H)$ for all $\kappa \in \delta X$ (see Sec. 2.2).

Remark 1.7. As explained in [22, p. 520], all our results extend trivially to the case when $L^2(X)$ is replaced with the space of $L^2$ functions with values in a Hilbert space $E$: it suffices to replace the algebra $\mathfrak{A}$ with $\mathfrak{A} \otimes K(E)$. For example, Theorem 1.2
remains valid without any change if $L^2(X)$ is replaced by $L^2(X; E)$, where $E$ is finite dimensional, and $\mathscr{C}(X)$ is defined exactly as before. Thus, in applications, we can consider differential operators with matrix valued coefficients, like Dirac operators.

1.2. Examples

We give here the simplest applications of Theorems 1.1 and 1.2; a more detailed study and more general examples can be found in Sec. 4. Assume first that $X$ is discrete. Note that in the particularly important case $X = \mathbb{Z}^n$, Theorem 1.1 has been proved in [39] (with a slightly different formulation and with quite different methods). Now, we have
$$\mathscr{C}(X) = \Bigl\{ T \in B(X) \;\Big|\; \lim_{k\to 0} \|[T, V_k]\| = 0 \Bigr\}. \tag{1.5}$$
Since $V_k^* U_x V_k = k(x) U_x$, we see that each operator of the form $T = \sum_{a\in X} \phi_a(Q) U_a$, with $\phi_a \in \ell^\infty(X)$ and $\phi_a \neq 0$ only for a finite number of $a$, belongs to $\mathscr{C}(X)$. Clearly, we have $\kappa.T = \sum_{a\in X} (\kappa.\phi_a)(Q) U_a$, where the function $\kappa.\phi_a \in \ell^\infty(X)$ is defined by $(\kappa.\phi_a)(y) = \lim_{x\to\kappa} \phi_a(x + y)$. The Jacobi and CMV operators considered in [31] are particular cases of such operators $T$.

Now, we give three examples in the case $X = \mathbb{R}^n$. We start with the Schrödinger operator. We denote by $\mathcal{H}^s$ the Sobolev space of order $s \in \mathbb{R}$ associated to $L^2(\mathbb{R}^n)$. Note that $\Delta$ is the positive Laplacian. From Proposition 4.12 we get:

Proposition 1.8. Let $W$ be a continuous symmetric sesquilinear form on $\mathcal{H}^1$ such that:
(1) $W \ge -\mu\Delta - \nu$ as forms on $\mathcal{H}^1$ for some numbers $\mu < 1$ and $\nu > 0$,
(2) $\lim_{k\to 0} \|[V_k, W]\|_{\mathcal{H}^1 \to \mathcal{H}^{-1}} = 0$.
Let $H_0$ be the self-adjoint operator associated to the form sum $\Delta + W$ and let $V$ be a real function in $L^1_{\rm loc}(\mathbb{R}^n)$ such that its negative part is relatively bounded with respect to $H_0$ with relative bound $< 1$. Then the self-adjoint operator $H = H_0 + V(Q)$ (form sum) is affiliated to $\mathscr{C}(\mathbb{R}^n)$, hence the conclusions of Theorem 1.2 hold for it.

This can be extended to a general class of hypoelliptic operators, cf. Proposition 4.16. We present below a very particular case.

Proposition 1.9. Let $h : \mathbb{R}^n \to \mathbb{R}$ be of class $C^m$ for some $m \ge 1$ and such that:
(1) $\lim_{k\to\infty} h(k) = +\infty$,
(2) the derivatives of order $m$ of $h$ are bounded,
(3) $\sum_{|\alpha|\le m} |h^{(\alpha)}(k)| \le C(1 + |h(k)|)$.
Let $\mathcal{G} = D(|h(P)|^{1/2})$ be the form domain of the operator $h(P)$ and assume that $W$ is a symmetric continuous form on $\mathcal{G}$ such that:
(4) $W \ge -\mu h(P) - \nu$ as forms on $\mathcal{G}$ for some numbers $\mu < 1$ and $\nu > 0$,
(5) $\lim_{k\to 0} \|[V_k, W]\|_{\mathcal{G} \to \mathcal{G}^*} = 0$.
Let $H_0 = h(P) + W$ (form sum) and let $V \in L^1_{\rm loc}(\mathbb{R}^n)$ be real and such that its negative part is relatively bounded with respect to $H_0$ with relative bound $< 1$. Then the self-adjoint operator $H = H_0 + V(Q)$ (form sum) is affiliated to $\mathscr{C}(\mathbb{R}^n)$, hence the conclusions of Theorem 1.2 hold for it.

Remark 1.10. If $X$ is an arbitrary group, $h : X^* \to \mathbb{R}$ is continuous and satisfies $|h(k)| \to \infty$ as $k \to \infty$, and if $V \in L^\infty(X)$, then obviously $h(P) + V(Q)$ is affiliated to $\mathscr{C}(X)$ and so we can apply Theorem 1.2. In order to cover unbounded $V$ without much effort, a quite weak regularity condition on $h$ is sufficient, see Proposition 4.16 and especially relation (4.8). We shall not try to optimize on this here.

Finally, we consider a Dirac operator $D$. Let $\mathscr{H} = L^2(\mathbb{R}^n; E)$ for some finite dimensional Hilbert space $E$. We only need to know that $D$ is a symmetric first-order differential operator with constant coefficients acting on $E$-valued functions and which is realized as a self-adjoint operator on $\mathscr{H}$ such that the domain of $|D|^{1/2}$ is the Sobolev space $\mathcal{H}^{1/2}$. Now, from Corollary 4.8 we get:

Proposition 1.11. Let $W$ be a continuous symmetric form on $\mathcal{H}^{1/2}$ such that:
(1) $\pm W \le \mu|D| + \nu$ as forms on $\mathcal{H}^{1/2}$ for some numbers $\mu < 1$ and $\nu > 0$,
(2) $\lim_{k\to 0} \|[V_k, W]\|_{\mathcal{H}^{1/2} \to \mathcal{H}^{-1/2}} = 0$.
Then the self-adjoint operator $H = D + W$, defined as explained after Definition 4.7, is affiliated to $\mathscr{C}(\mathbb{R}^n)$, hence the conclusions of Theorem 1.2 hold for it.

Observe that condition (2) is trivially satisfied if $W$ is the operator of multiplication by an operator valued function $W : \mathbb{R}^n \to B(E)$.

Remark 1.12. We emphasize that the conditions on the perturbation $W$ in Propositions 1.8, 1.9 and 1.11 are such that $W$ can contain terms of the same order as $\Delta$, $h(P)$ or $D$, respectively. For example, operators of the form
$$-\sum_{j,k} \partial_j a_{jk} \partial_k + \text{singular lower-order terms}$$
with $a_{jk} \in L^\infty$ such that the matrix $(a_{jk}(x))$ is bounded from below by a strictly positive constant are already covered by Proposition 1.8. See Example 4.12 for much more general results. These examples may be combined with Remark 1.6 to cover functions of operators, e.g., $\sqrt{H}$ if $H \ge 0$.

1.3. The role of crossed products

Crossed products of C*-algebras by the action of $X$ play a fundamental role in our proof of Theorem 1.1, but we have to stress that they are important for two distinct reasons. First, they are in a natural sense C*-algebras of energy$^{b}$ observables (or quantum Hamiltonians), and hence they allow one to organize the Hamiltonians in classes each having some specific properties, e.g., the essential spectrum of the operators in a class is given by a "canonical" formula specific to that class (see (1.6)). On the other hand, crossed products are very efficient at a technical level: their use allows one to solve a non-abelian problem by abelian means. The problem of computing the quotient of a non-commutative C*-algebra $\mathfrak{A} \subset B(X)$ with respect to the ideal $\mathscr{K}(X) \equiv K(L^2(X))$ is reduced to that of computing $\mathcal{A}/C_0(X)$, where $\mathcal{A}$ is a C*-algebra of bounded uniformly continuous functions on $X$.

$^{b}$ We emphasize "energy" because algebras of observables and crossed products were frequently used in various domains of the quantum theory in the last 50 years, but with different meanings and scopes than here.

The first reason mentioned above will be clarified by the later developments, but one may observe already now that the decomposition (1.2) is far from efficient. Indeed, its extreme redundancy becomes clear when we realize that many $\kappa$ give the same $\kappa.H$ (e.g., if the filters $\kappa$ and $\chi$ have the same envelope, then $\chi.T = \kappa.T$, see Sec. 2.6) and many more give the same $\sigma(\kappa.H)$ (e.g., $\chi.T = U_x\, \kappa.T\, U_x^*$ if $\chi$ is the translation by $x \in X$ of $\kappa$). Thus at a qualitative level (1.2) is not very significant: it does not say much about $\sigma_{\rm ess}(H)$, at least when compared with the $N$-body situation, where the HVZ theorem has such a nice physical interpretation that one can predict it and believe it without proof. In order to partially remedy this drawback, we consider smaller classes of Hamiltonians. The following framework, introduced in [22], gives us more specific information about $\sigma_{\rm ess}(H)$. Let $C(X)$ be the C*-algebra of all bounded uniformly continuous functions on $X$ and $C_\infty(X)$ that of continuous functions which have a limit at infinity (in the usual sense).

Definition 1.13. An algebra of interactions $\mathcal{A}$ on $X$ is a C*-subalgebra of $C(X)$ which is stable under translations and which contains $C_\infty(X)$.
The C*-algebra of quantum Hamiltonians of type $\mathcal{A}$ is the norm closed linear space $\mathfrak{A} \equiv \mathcal{A} \rtimes X \subset B(X)$ generated by the operators of the form $\phi(Q)\psi(P)$ with $\phi \in \mathcal{A}$ and $\psi \in C_0(X^*)$.

We have denoted by $\phi(Q)$ the operator of multiplication by $\phi$ in $L^2(X)$, and $\psi(P)$ becomes multiplication by $\psi$ after a Fourier transformation. Propositions 3.3 and 3.4 explain why we think of $\mathfrak{A}$ as a C*-algebra of Hamiltonians. For example, if $X = \mathbb{R}^n$, the self-adjoint operators of the form $\Delta + \sum_{k=1}^{n} a_k(x)\partial_k + a_0(x)$ with $a_j \in \mathcal{A}^\infty$ (functions in $\mathcal{A}$ with all derivatives in $\mathcal{A}$) generate $\mathfrak{A}$. It turns out that $\mathfrak{A}$ is canonically isomorphic with the crossed product of $\mathcal{A}$ by the natural action of $X$, which explains the notation $\mathcal{A} \rtimes X$ and the relevance of crossed products in our context.

Remark 1.14. Note that the definition and the quoted propositions tend to give the impression that the algebra $\mathfrak{A}$ is rather small. But this is wrong: $\mathfrak{A}$ is much larger than expected. For example, $C(X) \rtimes X = \mathscr{C}(X)$ and we shall see in Sec. 4 that the set of self-adjoint operators affiliated to $\mathscr{C}(X)$ is very large. Other examples
are the $N$-body algebra and the "bumps" algebras. In fact, we may summarize our approach as follows: we first isolate a class of elementary Hamiltonians, these being the simplest operators we would like to study, but our results concern all the operators affiliated to the C*-algebra they generate, which happens to be a crossed product and is very rich.

In order to state the next consequence of Theorem 1.1, we have to introduce some new notations. Let $\sigma(\mathcal{A})$ be the space of characters of the abelian C*-algebra $\mathcal{A}$. Then $\sigma(\mathcal{A})$ is a compact topological space which contains $X$ as an open dense subset, so $\delta(\mathcal{A}) = \sigma(\mathcal{A}) \setminus X$ is a compact space. We shall adopt the following abbreviation: $H \in \mathfrak{A}$ means that $H$ is either a normal element of the algebra $\mathfrak{A}$ or an observable affiliated to $\mathfrak{A}$. If $H$ is an observable affiliated to $\mathfrak{A}$, then $U_x H U_x^*$ is also an observable affiliated to $\mathfrak{A}$ and we have $\phi(U_x H U_x^*) = U_x \phi(H) U_x^*$ for $\phi \in C_0(\mathbb{R})$. By "continuity" of a map $\sigma(\mathcal{A}) \ni \kappa \mapsto \kappa.H \in \mathscr{C}_s(X)$ whose values are observables, we mean that $\sigma(\mathcal{A}) \ni \kappa \mapsto \phi(\kappa.H) \in \mathscr{C}_s(X)$ is continuous for all $\phi \in C_0(\mathbb{R})$.

Theorem 1.15. If $H \in \mathfrak{A}$, then the map $X \ni x \mapsto U_x H U_x^* \in \mathfrak{A}$ extends to a continuous map $\sigma(\mathcal{A}) \ni \kappa \mapsto \kappa.H \in \mathscr{C}_s(X)$ and we have
$$\sigma_{\rm ess}(H) = \overline{\bigcup}_{\kappa \in \delta(\mathcal{A})}\, \sigma(\kappa.H). \tag{1.6}$$

Remark 1.16. To see the connection between this and Theorems 1.1 and 1.2, we recall that an ultrafilter finer than the Fréchet filter is the same thing as a character $\kappa$ of the algebra of all bounded functions on $X$ such that $\kappa(\phi) = 0$ if $\phi \in C_0(X)$ (see Sec. 2.5). Moreover, if $\chi$ is a second such ultrafilter and $\kappa(\phi) = \chi(\phi)$ for all $\phi \in C(X)$, then $\kappa.H = \chi.H$ for all $H \in \mathscr{C}(X)$; thus the union in (1.2) and (1.4) may in fact be taken over $\kappa \in \delta(C(X))$. We emphasize that, although Theorem 1.15 seems stronger than Theorems 1.1 and 1.2, it is in fact an immediate consequence of Theorem 1.1 (just "abstract nonsense", see Sec. 5.3 for details). Note also that (1.6) is a canonical decomposition of the essential spectrum of $H$, all the objects in the formula being canonically associated to $\mathcal{A}$. The representation (1.6) is further discussed in Sec. 5.3 after Example 5.20.

Remark 1.17. We mention that, by using a more involved algebraic formalism as in [22], one can obtain partial, but often relevant, information concerning the essential spectrum of $H$ as follows. Let $\mathcal{J}$ be an $X$-ideal such that $C_0(X) \subset \mathcal{J} \subset \mathcal{A}$ and let $\mathfrak{J} = \mathcal{J} \rtimes X$ (we use here notations and results from [22]). Then $\mathscr{K}(X) \subset \mathfrak{J} \subset \mathfrak{A}$ and $\mathfrak{J}$ is an ideal in $\mathfrak{A}$, so the image $H_{\mathfrak{J}}$ of $H$ is well defined as an observable affiliated to the quotient algebra $\mathfrak{A}/\mathfrak{J}$. By using the natural surjection $\mathfrak{A}/\mathscr{K}(X) \to \mathfrak{A}/\mathfrak{J}$, we clearly get $\sigma(H_{\mathfrak{J}}) \subset \sigma_{\rm ess}(H)$. In this argument, $\mathfrak{J}$ need not be a crossed product, but if it is, we can use $\mathfrak{A}/\mathfrak{J} \cong (\mathcal{A}/\mathcal{J}) \rtimes X$ to get a concrete representation of $H_{\mathfrak{J}}$.
1.4. Historical comments

This subsection is devoted to some historical comments and a discussion of some related results from the literature. Theorem 1.1 was announced in the preprints [28, 20], see [28, Theorems 1.3 and 4.2] and [20, Theorem 4.1 and Corollary 4.2]. In fact, the theorem was stated in a stronger form, namely we asserted that the union in (1.2) is already closed. Moreover, some nontrivial applications are stated in [23, p. 149]. The closedness of the union in (1.2) as well as more explicit applications of Theorem 1.1 will be discussed in the second part of this paper. However, we show here that the union in (1.6) is closed for some special algebras $\mathcal{A}$ for which the result is far from obvious (Sec. 6). The main idea of the proof of Theorem 1.1 we had in mind at that moment is presented in [20, pp. 30–31] and it has to be combined with the two main points of the algebraic approach we used in that paper, namely:
(1) If $\mathscr{H}$ is a Hilbert space, then the quotient algebra $B(\mathscr{H})/K(\mathscr{H})$ is a C*-algebra and, if $\widehat{T}$ is the projection of $T \in B(\mathscr{H})$ in the quotient, then $\sigma_{\rm ess}(T) = \sigma(\widehat{T})$.
(2) We have $K(L^2(X)) = C_0(X) \rtimes X$ and if $\mathcal{A}$ is an algebra of interactions then
$$(\mathcal{A} \rtimes X)\,/\,(C_0(X) \rtimes X) \cong (\mathcal{A}/C_0(X)) \rtimes X. \tag{1.7}$$
If $T \in \mathcal{A} \rtimes X$, the isomorphism (1.7) allows us to reduce the computation of $\widehat{T}$ to an abelian problem and hence to deduce $\widehat{T} \cong (\kappa.T)_{\kappa \in \delta(\mathcal{A})} \in \prod_{\kappa \in \delta(\mathcal{A})} \mathscr{C}(X)$.

The preceding strategy requires a lot of abstract machinery and is not adapted to a purely Hilbert space setting. For example, the isomorphism (1.7) is a consequence of the fact that the functor $\mathcal{A} \mapsto \mathcal{A} \rtimes X$ transforms short exact sequences into short exact sequences, an assertion which does not even make sense if we fix the Hilbert space on which the algebras are realized. Instead, in the present paper we decided to avoid step (2) of this strategy and to base our arguments on a beautiful theorem due to M. B. Landstad [30] which gives an intrinsic characterization of crossed products. We feel that this makes the argument more elementary and gives deeper insight into the matters treated here. In fact, we could now completely avoid leaving the purely Hilbert space setting (in particular, forget about step (1) above), but this does not seem to us a natural attitude and we finally decided to adopt an intermediate approach.

It is remarkable that $\mathscr{C}(X)$ as defined in (1.1) is precisely the crossed product $C(X) \rtimes X$. Initially, this fact was proved by direct methods in the case $X = \mathbb{R}^n$ in [12] (because of this, [20, Corollary 4.2] was stated only for $X = \mathbb{R}^n$). The general case follows in fact immediately from Landstad's theorem.

We now make some comments concerning other papers with goals similar to ours. We note first that, in the particular case $X = \mathbb{Z}^n$, V. S. Rabinovich, S. Roch and B. Silbermann [39] discovered Theorem 1.1 before us and proved it with no C*-algebra techniques (in Remark 3.18 we explain why (1.5) is just their algebra of "band dominated operators"). It seems that they realized the fact that their algebra
in the case $X = \mathbb{Z}$ is a crossed product only in [38] (this fact is a particular case of [22, Theorem 4.1]). In [39] and in subsequent works [36, 39, 40] (see also [40] for references to earlier papers), these authors use a discretization technique in order to treat perturbations of pseudo-differential operators in $L^2(\mathbb{R}^n)$. They get relations like (1.4) and show that in some situations the union is already closed. Moreover, in [40, Chap. 7], they present an abstract version of their approach (in particular, they consider groups more general than $\mathbb{Z}^n$) which seems to us complementary to our approach and relevant in contexts like that of [24]. We learned about these works quite recently thanks to a correspondence with B. Simon who sent us a copy of the paper [36]; this explains why the above references were not included in our previous works on this topic.

We now discuss the relation between our paper and the article [47] (this reference was pointed out to us by one of the referees). We shall do it in some detail because $C^*$-algebra techniques are emphasized in [47]. The purpose of J. Roe is to extend the results of Rabinovich, Roch and Silbermann to non-abelian groups. He considers a finitely generated discrete (non-abelian) group $\Gamma$ and defines $\mathscr{A}$ as the $C^*$-algebra of operators on $\ell^2(\Gamma)$ generated by $\ell^\infty(\Gamma)$ and by the right translation operators $R_\gamma$ (this is a natural extension of the procedure introduced in [39]). Then, denoting by $L_\gamma$ the left translation operators, he shows that for each $T \in \mathscr{A}$, the map $\gamma \mapsto L_\gamma T L_\gamma^*$ extends to a $*$-strongly continuous map $\beta\Gamma \to \mathscr{A}$, where $\beta\Gamma$ is the Stone–Čech compactification of $\Gamma$ (the space of characters of $\ell^\infty(\Gamma)$). The restriction to $\delta\Gamma = \beta\Gamma \setminus \Gamma$ of this map is the symbol of $T$ and the main result of [47] is that for exact groups (in the $C^*$-algebra sense) an operator $T \in \mathscr{A}$ has symbol equal to zero if and only if $T$ is compact.

On [20, pp. 30–31], where we describe the main ideas of the proof of Theorem 1.2, we introduce the notion of regular operator on $L^2(X)$ for $X$ an abelian locally compact group (and in a more general context in the footnote on [20, p. 31]): we say that a bounded operator $T$ on $L^2(X)$ is regular if $\{U_x T^{(*)} U_x^* \mid x \in X\}$ are strongly relatively compact sets. Then we note that for such operators the map $x \mapsto U_x T U_x^*$ extends to a strongly continuous map $\beta X \ni \kappa \mapsto T_\kappa \in B(L^2(X))$ (this time the Stone–Čech compactification $\beta X$ involves the topology of $X$) and call the values $T_\kappa$ with $\kappa \in \delta X$ localizations at infinity of $T$. We show that the elements of $\mathcal{C}(X) \rtimes X$ are regular and from the arguments on page 31 it is rather obvious that their localizations at infinity belong to the same algebra $\mathcal{C}(X) \rtimes X$. This is more explicitly stated and proved in [23, Lemma 3.10] (which is Lemma 3.9 in the preprint version and Lemma 5.8 here). All this can also be done at the level of the algebra $\mathcal{C}(X)$ and at the bottom of [20, p. 31] we say that if $\varphi \in \mathcal{C}(X)$ and all its localizations at infinity are zero, then $\varphi \in \mathcal{C}_0(X)$ (this is easy to prove, cf. Lemma 5.3 here) and finish by saying that this remains true after taking crossed products (which is not obvious but can be deduced from [20, Theorem 3.4] or (1.7) here; as we said before, in this paper we prefer to use Landstad's theorem at this last step).

We emphasize that although the starting points of [20, 47] (in particular, the relevance of the Stone–Čech compactification) are similar, the proofs of the main
fact (that the kernel of the symbol map, in the terminology of [39], is just the compacts) are of a quite different nature. Indeed, Roe mentions that $\mathscr{A}$ is the reduced crossed product $\ell^\infty(\Gamma) \rtimes_r \Gamma$ but never uses this fact, cf. the proof of [47, Proposition 3.3]. On the other hand, the crossed product structure and relations like (1.7) are the heart of our approach (and we expect that (1.7) is also true under Roe's conditions).

The methods used by Roe also seem relevant for the solution of a problem left open (but not explicitly stated) in [24]. The space $\Gamma$ considered there is a tree, which is a finitely generated monoid. The natural object in this case is the $C^*$-algebra generated by the right translations and by $\ell^\infty(\Gamma)$, the localizations at infinity being given by left translations. Due to obvious technical difficulties the algebra considered in [24] is much smaller: it is generated by the Laplacian (which is a certain polynomial in the right translations) and by the functions in $\ell^\infty(\Gamma)$ which extend continuously to the hyperbolic compactification of $\Gamma$ (see Secs. 3.4 and 3.5 and [24, Theorem 5.1] and its proof; the references are to the preprint version). A larger algebra, associated to the analogue of the slowly oscillating functions on $\Gamma$, is considered in [17], where the problem is treated by very different techniques. It would be interesting to know if the techniques from [47, Sec. 3] can be adapted to solve the most general situation.

Y. Last and B. Simon obtained in [31] relations like (1.4) for large classes of Schrödinger operators on $\mathbb{R}^n$ and their discrete versions (Jacobi or CMV operators). Their proofs involve “classical” geometrical methods (localization with the help of a partition of unity).
We have to emphasize that many people working on pseudo-differential operators have been led to consider C ∗ -algebras generated by such operators and to describe their quotients with respect to the ideal of compact operators: in fact, this is one of the most efficient ways to define the symbol of an operator (see [8], for example). Much more specific and relevant with respect to our goals is the work of H. O. Cordes (see [9] for a review). For example, the C ∗ -algebra generated by a hypoelliptic operator and by the algebra of slowly oscillating functions and the computation of its quotient with respect to the compacts seem to have been considered for the first time in M. Taylor’s thesis (see [49, Theorem 1]). For more recent work on these lines, we refer to [34]. A rather different class of “C ∗ -algebras of Hamiltonians” appears in the work of J. Bellissard on solid state physics [3, 4]: he fixes a Hamiltonian H and considers the C ∗ -algebra generated by its translates. These algebras do not contain compact operators in general, so the techniques we use do not seem relevant in his setting. A more detailed discussion of the connection between the approach of Bellissard and ours can be found in [22]. The origin of our approach can be traced back to the algebraic treatment of the N -body problem from [6, 7] (where the HVZ theorem and the Mourre estimate are proved in an abstract graded C ∗ -algebra framework for a very general class of N -body Hamiltonians). The role of the crossed products was pointed out in [20–22]
and a treatment of the $N$-body problem along these lines is presented in [11–13]. Various applications and extensions of the crossed product technique can be found in [2, 32, 33, 43, 44] and references therein.

Our interest in localizations at infinity of a Hamiltonian was initially motivated by our desire to go beyond the $N$-body problem and to consider general (phase space) anisotropic systems [19, 28]. Indeed, in the $N$-body case, there is a lot of supplementary structure which makes the theory simple and beautiful (cf. Sec. 6.5), but this structure has no analogue in other types of anisotropy. We first found that the $C^*$-algebra techniques are quite well adapted to the study of Hamiltonians with Klaus type potentials, see [20, 22] and also Sec. 6.4 here for a treatment in the spirit of Theorem 1.1. We finally realized that the relation (1.7), which is the main point of the algebraic approach that we used, predicts in fact the description (1.4) of the essential spectrum of $H$ in terms of its localizations at infinity.

The paper [25] played an important role in our understanding of this fact. Indeed, B. Helffer and A. Mohamed prove there that the essential spectrum of a magnetic Hamiltonian $(P - A)^2 + V$ is the closure of the union of the spectra of some limit Schrödinger operators. Their proof is based on hypoellipticity techniques and the result is already interesting if the magnetic field is not present. The class of potentials they consider is quite large, but the function $V$ has to be bounded from below and to satisfy some regularity conditions. These assumptions imply that the limit operators have only polynomial electric and magnetic potentials, which is easily explained in our framework, see [23, Proposition 3.13].

1.5. Plan of the paper

Our purpose being to emphasize not only the power but also the simplicity of the $C^*$-algebra techniques, we made an effort to make the paper essentially self-contained and easy to read by people working in the spectral theory of quantum Hamiltonians and with little background in $C^*$-algebras. We could have written a much shorter paper, but it would have been accessible mostly to people with no interest in spectral theory. Instead, we have chosen to present in some detail most of the tools which are not standard among those interested in the subject. In particular, we give in an Appendix a simple and self-contained proof of Landstad's theorem (Theorem 3.6) which plays an important role in our arguments.

In Sec. 2, we introduce our notations and make a résumé of what we need concerning (ultra)filters and their relation with the characters of some abelian $C^*$-algebras.

In Sec. 3, we introduce crossed products in the version we need and we point out several useful consequences of Landstad's theorem. This replaces the much more abstract arguments from [20, 22], since we remain in a purely Hilbert space setting, but also gives stronger and more explicit results in applications.

Section 4 is devoted to criteria of affiliation to the algebra $\mathscr{C}(X)$; we show there that this algebra is much larger than one would think at first sight.
In Sec. 5, we prove our main result, Theorem 5.11.

Finally, in Sec. 6, we consider three algebras of quantum Hamiltonians, those which seem the most interesting to us. The first one, $\mathscr{V}(X)$, is generated by slowly oscillating potentials and is the simplest nontrivial algebra of Hamiltonians since it is defined by the property that if $H$ is affiliated to $\mathscr{V}(X)$, then all its localizations at infinity are free Hamiltonians (i.e., functions of the momentum). The second one is the algebra associated to a sparse set and it is remarkable because the localizations at infinity of the Hamiltonians affiliated to it are two-body Hamiltonians and thus their essential spectrum has a quite interesting structure. The third one is, of course, the $N$-body algebra, or rather a more general and geometrically natural algebra that we call Grassmann algebra, an object of a remarkable simplicity, richness and beauty. The final subsections are devoted to some remarks of a different nature on the localizations at infinity of Hamiltonians of the form $h(P) + v(Q)$ with $v(Q)$ relatively bounded with respect to $h(P)$.

2. Preliminaries

In this section we describe our notations and recall facts needed in the rest of the paper.

2.1. Notations

If $X$ is a locally compact topological space, then $\mathcal{C}_\infty(X)$ is the $C^*$-algebra of continuous functions which have a limit at infinity and $\mathcal{C}_0(X)$ is the subalgebra of functions which converge to zero at infinity; thus $\mathcal{C}_\infty(X) = \mathbb{C} + \mathcal{C}_0(X)$. Let $\mathcal{C}_c(X)$ be the subalgebra of functions with compact support. If $\mathscr{A}$ is a $C^*$-algebra, then we similarly define $\mathcal{C}_0(X; \mathscr{A})$ for example, which is also a $C^*$-algebra. If $X$ is compact we set $\mathcal{C}(X; \mathscr{A}) = \mathcal{C}_0(X; \mathscr{A})$ and $\mathcal{C}(X) = \mathcal{C}(X; \mathbb{C})$, which does not conflict with the notation (2.3) because the continuous functions on $X$ are uniformly continuous. The characteristic function of a set $S \subset X$ is denoted $1_S$.

In order to facilitate the reading of the paper, we tried to respect as much as possible the following notational conventions.
For abelian algebras (abstract as well as concrete ones, like function algebras) we use “mathcal” fonts, like $\mathcal{A}$, $\mathcal{C}$. For non-abelian algebras we use “mathscr” fonts, like $\mathscr{A}$, $\mathscr{C}$. Moreover, the crossed product of an abelian algebra $\mathcal{A}$ by the action of some group is denoted $\mathscr{A}$. For other mathematical objects, we use either greek letters or “mathcal” fonts with one exception: filters are often denoted by small gothic letters like $\mathfrak{f}$, $\mathfrak{g}$. However, ultrafilters are generally denoted $\kappa$ because we think of them as “points at infinity” of the space $X$ whose points are denoted $x$.

2.2. Observables

If $\mathcal{H}$ is a Hilbert space, then $B(\mathcal{H})$ and $K(\mathcal{H})$ are the $C^*$-algebras of bounded and compact operators on $\mathcal{H}$, respectively. The resolvent set, the spectrum and
the essential spectrum of an operator $S$ are denoted $\rho(S)$, $\sigma(S)$ and $\sigma_{\mathrm{ess}}(S)$, respectively. By morphism between two $C^*$-algebras, we understand $*$-homomorphism. An ideal in a $C^*$-algebra is assumed to be closed and two-sided.

An observable is a linear operator $H : D(H) \to \mathcal{H}$ such that $H D(H) \subset \mathcal{K}$, where $\mathcal{K}$ is the closure of $D(H)$ in $\mathcal{H}$, and such that $H$ when considered as operator in $\mathcal{K}$ is self-adjoint in the usual sense. A trivial observable which, however, is quite important, is the unique observable whose domain is equal to $\{0\}$; we shall denote it $\infty$. One has to think that $H$ is equal to $\infty$ on $\mathcal{K}^\perp$ and for this reason, we set $\varphi(H) = 0$ on $\mathcal{K}^\perp$ if $\varphi \in \mathcal{C}_0(\mathbb{R})$. Note that we keep the notation $(H - z)^{-1}$ for the resolvent of $H$ in $\mathcal{H}$ but $(H - z)^{-1} = 0$ on $\mathcal{K}^\perp$.

If $\mathscr{C} \subset B(\mathcal{H})$ is any $C^*$-subalgebra, then an observable $H$ is said to be affiliated to $\mathscr{C}$ if $(H - z)^{-1} \in \mathscr{C}$ for some $z \in \rho(H)$. Then $\varphi(H) \in \mathscr{C}$ for all $\varphi \in \mathcal{C}_0(\mathbb{R})$. It is theoretically much more convenient to define an observable affiliated to $\mathscr{C}$ as a morphism $H : \mathcal{C}_0(\mathbb{R}) \to \mathscr{C}$ and then to set $H(\varphi) \equiv \varphi(H)$. We refer to [22, pp. 522–523] for a résumé of what we need and also to [13] for comments on this notion which should not be confused with that introduced by S. Baaj and S. L. Woronowicz (in [1, Sec. 8.1] one can find a systematic presentation of this point of view).

We recall two definitions which make the transition from Theorem 1.1 to Theorem 1.2 trivial. The spectrum of the observable $H$ is the set
$$\sigma(H) = \{\lambda \in \mathbb{R} \mid \varphi \in \mathcal{C}_0(\mathbb{R}) \text{ and } \varphi(\lambda) \neq 0 \Rightarrow \varphi(H) \neq 0\}. \tag{2.1}$$
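Let $\mathscr{K} = \mathscr{C} \cap K(\mathcal{H})$; this is an ideal in $\mathscr{C}$. Then the essential spectrum of $H$ is the set
$$\sigma_{\mathrm{ess}}(H) = \{\lambda \in \mathbb{R} \mid \varphi \in \mathcal{C}_0(\mathbb{R}) \text{ and } \varphi(\lambda) \neq 0 \Rightarrow \varphi(H) \notin \mathscr{K}\}. \tag{2.2}$$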
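As a quick sanity check of the definitions (2.1) and (2.2), two extreme cases can be worked out by hand. The lines below are only an illustrative sketch: the first uses the convention $\varphi(\infty) = 0$ stated above, and the second assumes, as an example not taken from the text, an observable $H$ whose resolvent is compact.

```latex
% Case 1: the trivial observable H = \infty (domain \{0\}, so \mathcal{K} = \{0\}).
% Then \varphi(H) = 0 on \mathcal{K}^\perp = \mathcal{H} for every \varphi, and (2.1) gives
\varphi(\infty) = 0 \quad \forall\, \varphi \in \mathcal{C}_0(\mathbb{R})
\qquad\Longrightarrow\qquad
\sigma(\infty) = \emptyset .

% Case 2 (assumption): H is affiliated to \mathscr{C} and (H - z)^{-1} is compact.
% Since the functions \lambda \mapsto (\lambda - z)^{-1}, (\lambda - \bar z)^{-1}
% generate \mathcal{C}_0(\mathbb{R}), every \varphi(H) is compact, and (2.2) gives
\varphi(H) \in \mathscr{C} \cap K(\mathcal{H}) = \mathscr{K} \quad \forall\, \varphi \in \mathcal{C}_0(\mathbb{R})
\qquad\Longrightarrow\qquad
\sigma_{\mathrm{ess}}(H) = \emptyset .
```

In both cases no $\lambda$ can satisfy the implication in the definition, because its conclusion fails for every $\varphi$ that does not vanish at $\lambda$.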
We also note that any morphism $\pi : \mathscr{C} \to \mathscr{B}$ between two $C^*$-algebras extends in a trivial way to a map from observables affiliated to $\mathscr{C}$ to observables affiliated to $\mathscr{B}$. Indeed, it suffices to define $\pi(H)$ by the condition $\varphi(\pi(H)) = \pi(\varphi(H))$. For example, if $\pi : \mathscr{C} \to \mathscr{C}/\mathscr{K}$ is the canonical morphism of $\mathscr{C}$ onto the quotient algebra $\mathscr{C}/\mathscr{K}$, we have $\sigma_{\mathrm{ess}}(H) = \sigma(\pi(H))$.

Finally, we mention one more immediate consequence of the definition of an observable in terms of morphisms, cf. [1, p. 370]. We shall use the easily proven fact that $\varphi(H)$ depends only on the restriction of $\varphi$ to the closed real set $\sigma(H)$. Let $\theta : \sigma(H) \to \mathbb{R}$ be continuous and proper (i.e. $|\theta(\lambda)| \to \infty$ if $|\lambda| \to \infty$). Then the observable $\theta(H)$ is well defined by the rule $\varphi(\theta(H)) = (\varphi \circ \theta)(H)$ for $\varphi \in \mathcal{C}_0(\mathbb{R})$ (if $H$ is a self-adjoint operator, then $\theta(H)$ is just the operator defined by the usual functional calculus). Clearly, if $H$ is affiliated to $\mathscr{C}$, the observable $\theta(H)$ is also affiliated to $\mathscr{C}$.

2.3. Abelian groups: Notations

We now describe objects and notations from the harmonic analysis on groups. Everything we need can be found in [16] or [15]; see also [50].
Let $X$ be an abelian locally compact group (with the operation denoted additively) equipped with a Haar measure $dx$. We abbreviate $\mathscr{B}(X) = B(L^2(X))$, $\mathscr{K}(X) = K(L^2(X))$ and note that these are $C^*$-algebras depending on $X$ and not on the choice of the Haar measure. Other such $C^*$-algebras are $L^\infty(X)$, $\mathcal{C}_\infty(X)$, $\mathcal{C}_0(X)$ and the $C^*$-algebra of bounded uniformly continuous functions on $X$, which plays the most important role in what follows:
$$\mathcal{C}(X) = \{\varphi : X \to \mathbb{C} \mid \varphi \text{ is bounded and uniformly continuous}\}. \tag{2.3}$$
In order to avoid ambiguities, if $\varphi$ is a measurable function on $X$, then we denote by $\varphi(Q)$ the operator of multiplication by $\varphi$ in $L^2(X)$ (the symbol $Q$ has no operator meaning). By using this map, we identify the algebra $L^\infty(X)$ and its $C^*$-subalgebras with $C^*$-subalgebras of $\mathscr{B}(X)$; in particular, we always embed
$$\mathcal{C}_0(X) \subset \mathcal{C}_\infty(X) \subset \mathcal{C}(X) \subset \mathscr{B}(X). \tag{2.4}$$
Note that the $C^*$-algebra $\ell^\infty(X)$ of all bounded functions on $X$ cannot be embedded in $\mathscr{B}(X)$ (neither can the $C^*$-algebra $B(X)$ of bounded Borel functions).

Let $X^*$ be the set of characters of $X$ (continuous homomorphisms $k : X \to \mathbb{C}$ with $|k(x)| = 1$) equipped with the locally compact group structure defined by the operation of multiplication and the topology of uniform convergence on compact sets. We denote the operation in $X^*$ additively and its neutral element by $0$, as in [50, Chap. II, Sec. 5] (this convention looks rather strange if $X = \mathbb{Z}^n$, for example). If $X$ is a real finite dimensional vector space, then $X^*$ is identified with the vector space dual to $X$ as follows: let $\langle \cdot, \cdot \rangle : X \times X^* \to \mathbb{R}$ be the canonical bilinear map and take $k(x) = e^{i\langle x, k\rangle}$. In fact, the field of real numbers can be replaced here by an arbitrary non-discrete locally compact field, see [16, p. 91] and [50, Chap. II, Sec. 5]. We recall that the dual group $(X^*)^*$ of $X^*$ is identified with $X$, each $x \in X$ being seen as a character of $X^*$ through the formula $x(k) = k(x)$.

The Fourier transform of $u \in L^1(X)$ is the function $\mathcal{F}u \equiv \widehat{u} : X^* \to \mathbb{C}$ given by $\widehat{u}(k) = \int_X \overline{k(x)}\, u(x)\, dx$. We equip $X^*$ with the unique Haar measure $dk$ such that $\mathcal{F}$ induces a unitary map $\mathcal{F} : L^2(X) \to L^2(X^*)$. From $\mathcal{F}^{-1} = \mathcal{F}^*$, we get $(\mathcal{F}^{-1}v)(x) = \int_{X^*} k(x)\, v(k)\, dk$ for $v \in L^2(X^*)$. By taking into account the identification $X^{**} = X$, the Fourier transform of $\psi \in L^1(X^*)$ and the Fourier inversion formula are
$$\widehat{\psi}(x) = \int_{X^*} \overline{k(x)}\, \psi(k)\, dk \quad \text{and} \quad \psi(k) = \int_X k(x)\, \widehat{\psi}(x)\, dx. \tag{2.5}$$
For each measurable $\psi : X^* \to \mathbb{C}$ we define the operator $\psi(P)$ on $L^2(X)$ by $\psi(P) = \mathcal{F}^* M_\psi \mathcal{F}$, where $M_\psi$ is the operator of multiplication by $\psi$ in $L^2(X^*)$. In particular, the restriction to $L^\infty(X^*)$ of the map $\psi \mapsto \psi(P)$ is injective and gives us $C^*$-subalgebras
$$\mathcal{C}_0(X^*) \subset L^\infty(X^*) \subset \mathscr{B}(X). \tag{2.6}$$
Let $\{U_x\}_{x \in X}$ and $\{V_k\}_{k \in X^*}$ be the strongly continuous unitary representations of $X$ and $X^*$ in $L^2(X)$ defined by $(U_x f)(y) = f(x + y)$ and $(V_k f)(y) = k(y) f(y)$, respectively. Note that $U_x$ and $V_k$ satisfy the canonical commutation relations
$$U_x V_k = k(x)\, V_k U_x. \tag{2.7}$$
Observe that we have $U_x = x(P)$ if $x \in X$ is identified with the function $k \mapsto k(x)$ and similarly $V_k = k(Q)$. Also, we have, cf. (2.5):
$$\psi(P) = \int_X U_x\, \widehat{\psi}(x)\, dx \quad \text{if } \widehat{\psi} \in L^1(X). \tag{2.8}$$
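The relations (2.7) and $U_x = x(P)$ can be checked numerically in the simplest possible setting. The sketch below is only an illustration under assumptions that are not in the text: it takes $X$ to be the finite cyclic group $\mathbb{Z}_N$ with $N = 8$, labels the characters by integers $k$ with $k(x) = e^{2\pi i k x / N}$, and normalizes the Fourier transform by $N^{-1/2}$ so that it is unitary. All function names are hypothetical.

```python
import cmath

N = 8  # order of the cyclic group X = Z_N (an illustrative choice)

def character(k, x):
    """Value k(x) = exp(2*pi*i*k*x/N) of the character labelled by k."""
    return cmath.exp(2j * cmath.pi * k * x / N)

def U(x, f):
    """Translation: (U_x f)(y) = f(x + y), with addition modulo N."""
    return [f[(x + y) % N] for y in range(N)]

def V(k, f):
    """Multiplication by the character k: (V_k f)(y) = k(y) f(y)."""
    return [character(k, y) * f[y] for y in range(N)]

def fourier(f):
    """Unitary Fourier transform: (F f)(k) = N^{-1/2} sum_x conj(k(x)) f(x)."""
    return [sum(character(k, x).conjugate() * f[x] for x in range(N)) / N ** 0.5
            for k in range(N)]

def inverse_fourier(g):
    """Adjoint of F: (F^* g)(x) = N^{-1/2} sum_k k(x) g(k)."""
    return [sum(character(k, x) * g[k] for k in range(N)) / N ** 0.5
            for x in range(N)]

def close(f, g, tol=1e-9):
    """Componentwise comparison of two vectors up to a tolerance."""
    return all(abs(a - b) < tol for a, b in zip(f, g))

f = [complex(y * y - 3, y) for y in range(N)]  # arbitrary test vector

# Relation (2.7): U_x V_k = k(x) V_k U_x for all x in X and k in X^*.
ccr_holds = all(
    close(U(x, V(k, f)), [character(k, x) * v for v in V(k, U(x, f))])
    for x in range(N) for k in range(N)
)

def x_of_P(x, f):
    """x(P) = F^* M F, where M multiplies by the function k -> k(x)."""
    g = fourier(f)
    return inverse_fourier([character(k, x) * g[k] for k in range(N)])

# U_x = x(P): the translation is a function of the momentum.
translation_is_x_of_P = all(close(U(x, f), x_of_P(x, f)) for x in range(N))

print(ccr_holds, translation_is_x_of_P)
```

Both checks only exercise one test vector, so they illustrate the identities rather than prove them; the proofs are the one-line computations behind (2.7) and (2.8).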
2.4. Filters

We summarize here some facts we need concerning filters, cf. [5, 26, 48]. A filter on $X$ is a family $\mathfrak{f}$ of subsets of $X$ which does not contain the empty set, is stable under finite intersections, and has the property: $G \supset F \in \mathfrak{f} \Rightarrow G \in \mathfrak{f}$ (the empty family is a filter!). If $Y$ is a topological space and $\theta : X \to Y$ is any map, then $\lim_{\mathfrak{f}} \theta = y$ means that $\theta^{-1}(V) \in \mathfrak{f}$ if $V$ is a neighborhood of $y$. We shall often write $\lim_{x \to \mathfrak{f}} \theta(x)$ instead of $\lim_{\mathfrak{f}} \theta$ for reasons which will become clear later on.

If $\mathfrak{f}, \mathfrak{g}$ are filters and $\mathfrak{f} \subset \mathfrak{g}$, then $\mathfrak{g}$ is said to be finer than $\mathfrak{f}$. An ultrafilter is a maximal element in the set of all filters on $X$ for this order relation. If $x \in X$, then the family of sets which contain $x$ is the ultrafilter determined by $x$. A filter $\mathfrak{f}$ is an ultrafilter if and only if for each $A \subset X$ one has $A \in \mathfrak{f}$ or $A^c \equiv X \setminus A \in \mathfrak{f}$. Ultrafilters are important because of the following property: if $\mathfrak{f}$ is an ultrafilter and $\theta : X \to Y$ is an arbitrary map with values in a compact space $Y$, then $\lim_{\mathfrak{f}} \theta$ exists. This fact will become clear after the explanations in Sec. 6.3.

The space $\gamma X$ of all ultrafilters on $X$ is a compact space for the topology defined as follows: the map $\mathfrak{f} \mapsto \{\kappa \in \gamma X \mid \kappa \supset \mathfrak{f}\}$ is a bijection from the set of all filters on $X$ onto the set of all closed subsets of $\gamma X$. Thus, one should think that a filter is a closed subset of $\gamma X$. Another description of this topology will be given later on.

The compact topological space $\gamma X$ is the discrete Stone–Čech compactification of $X$ and it is characterized by the following universal property: if $Y$ is a compact space then each map $\theta : X \to Y$ has a unique extension to a continuous map $\theta : \gamma X \to Y$. Since this property is important for us, we shall further discuss it in Sec. 6.3. The set $X$ is identified with an open dense subset of $\gamma X$ (to $x \in X$ one associates the ultrafilter determined by $x$) and the topology induced by $\gamma X$ on $X$ is the discrete topology.
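The definitions of filter and ultrafilter given above can be tested by brute force on a small set. The following sketch is only an illustration and rests on an assumption not made in the text: $X$ is a three-element set, so every ultrafilter is determined by a point and $\gamma X = X$ (the interesting ultrafilters of the paper appear only for infinite $X$). The helper names are hypothetical.

```python
from itertools import combinations

X = frozenset({0, 1, 2})  # a small finite set, chosen only for illustration

# All non-empty subsets of X (a filter never contains the empty set).
subsets = [frozenset(c) for r in range(1, len(X) + 1)
           for c in combinations(sorted(X), r)]

def is_filter(family):
    """Check stability under intersections and under passing to supersets."""
    stable = all((a & b) in family for a in family for b in family)
    upward = all(g in family for f in family for g in subsets if f <= g)
    return stable and upward

# Enumerate every family of non-empty subsets and keep those that are filters.
families = [frozenset(s for i, s in enumerate(subsets) if (mask >> i) & 1)
            for mask in range(2 ** len(subsets))]
filters = [fam for fam in families if is_filter(fam)]

# Ultrafilters are the maximal filters for inclusion.
ultrafilters = [f for f in filters if not any(f < g for g in filters)]

# The ultrafilter determined by x consists of all sets containing x.
principal = [frozenset(s for s in subsets if x in s) for x in sorted(X)]

def decides_every_set(f):
    """Criterion from the text: for each A, either A or its complement is in f."""
    return all((a in f) or ((X - a) in f) for a in subsets + [frozenset()])

print(len(filters), len(ultrafilters))
```

With the convention of the text, the empty family counts as a filter, so the enumeration finds the seven filters generated by a non-empty subset plus the empty family; the maximal ones are exactly the three ultrafilters determined by points.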
However, the space $\gamma X \setminus X$ is much too large for our purposes; the only ultrafilters of interest to us belong to the compact subset of $\gamma X$ defined by
$$\delta X = \{\kappa \mid \kappa \text{ is an ultrafilter finer than the Fréchet filter}\}. \tag{2.9}$$
We call Fréchet filter the filter consisting of the sets with relatively compact complement (this is not quite standard). This filter depends on the locally compact non-compact topology given on $X$. In view of the standard meaning of the notation $\lim_{x \to \infty}$, it is natural to denote by $\infty$ the Fréchet filter. As explained above, one
should think of $\infty$ as a certain compact subset of $\gamma X$ and then we have in fact $\infty = \delta X$.

2.5. Characters

We shall explain now the relation between filters and characters of certain abelian $C^*$-algebras. If $\mathcal{A}$ is such an algebra we denote by $\sigma(\mathcal{A})$ the space of characters of $\mathcal{A}$ (a character is a non-zero morphism $\mathcal{A} \to \mathbb{C}$) equipped with the weak$^*$ topology. This is a locally compact topological space which is compact if and only if $\mathcal{A}$ is unital.

Let $\mathcal{B}$ be a unital abelian $C^*$-algebra and let $\mathcal{A} \subset \mathcal{B}$ be a $C^*$-subalgebra which contains the unit of $\mathcal{B}$. Then each character of $\mathcal{B}$ restricts to a character of $\mathcal{A}$ and each character of $\mathcal{A}$ is obtained in this way. This gives a canonical map $\pi : \sigma(\mathcal{B}) \to \sigma(\mathcal{A})$ which is continuous and surjective and if we define in $\sigma(\mathcal{B})$ an equivalence relation $\kappa \sim \chi$ by the condition $\kappa(S) = \chi(S)$ $\forall\, S \in \mathcal{A}$, the compact topological space $\sigma(\mathcal{A})$ is just the quotient of $\sigma(\mathcal{B})$ with respect to this relation. In particular, a map $f : \sigma(\mathcal{A}) \to Y$ is continuous if and only if $f \circ \pi : \sigma(\mathcal{B}) \to Y$ is continuous, where $Y$ is an arbitrary topological space.

Let $\ell^\infty(X)$ be the $C^*$-algebra of all bounded functions on $X$. Then the space of all characters of $\ell^\infty(X)$ can be identified with the space $\gamma X$ of all ultrafilters on $X$:
$$\gamma X = \sigma(\ell^\infty(X)). \tag{2.10}$$
Indeed, the map which associates to an ultrafilter $\mathfrak{f}$ the character $\varphi \mapsto \lim_{\mathfrak{f}} \varphi$ is a homeomorphism and the inverse map associates to the character $\kappa$ the ultrafilter $\mathfrak{f} = \{F \subset X \mid \kappa(1_F) = 1\}$. From now on, we shall identify $\mathfrak{f}$ and $\kappa$, so an ultrafilter is the same thing as a character of $\ell^\infty(X)$, and we shall work with the interpretation which is most suited to the context. We also set $\kappa(F) = \kappa(1_F)$ for $F \subset X$. Then
$$\delta X = \{\kappa \in \gamma X \mid \kappa(K) = 0 \ \forall\, K \subset X \text{ compact}\}. \tag{2.11}$$
The algebras $\mathcal{A}$ that we consider are unital subalgebras of $\ell^\infty(X)$, thus their character spaces $\sigma(\mathcal{A})$ are quotients of $\gamma X$. In other terms, we can view the characters of $\mathcal{A}$ as equivalence classes of ultrafilters: if $\kappa$ is a character of $\mathcal{A}$, then there is an ultrafilter $\mathfrak{f}$ such that $\kappa(\varphi) = \lim_{\mathfrak{f}} \varphi$ for all $\varphi \in \mathcal{A}$, and in fact, there are many such ultrafilters. For the algebras which are of interest for us, we always have
$$\mathcal{C}_\infty(X) \subset \mathcal{A} \subset \mathcal{C}(X) \subset \ell^\infty(X). \tag{2.12}$$
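Then $X$ is identified with an open dense subset of $\sigma(\mathcal{A})$ and the topology induced by $\sigma(\mathcal{A})$ on $X$ coincides with the initial topology, so $\sigma(\mathcal{A})$ is a compactification of the locally compact space $X$. Thus
$$\delta(\mathcal{A}) = \sigma(\mathcal{A}) \setminus X = \{\kappa \in \sigma(\mathcal{A}) \mid \kappa(\varphi) = 0 \ \forall\, \varphi \in \mathcal{C}_0(X)\} \tag{2.13}$$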
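is a compact subset of $\sigma(\mathcal{A})$, the boundary of $X$ in the compactification $\sigma(\mathcal{A})$. The uniform compactification $\beta_u X$ of $X$ is defined by the largest algebra $\mathcal{C}(X)$:
$$\beta_u X = \sigma(\mathcal{C}(X)), \qquad \delta_u X = \beta_u X \setminus X = \delta(\mathcal{C}(X)). \tag{2.14}$$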
Later on, we shall explicitly describe the equivalence relation in $\gamma X$ which defines $\beta_u X$. We are interested only in the boundary $\delta(\mathcal{A})$ of $X$ in $\sigma(\mathcal{A})$. We show now that this is a quotient of $\delta X$.

Lemma 2.1. Let $\mathfrak{f}$ be an ultrafilter on $X$ and let $\kappa$ be the character of $\mathcal{A}$ defined by $\kappa(\varphi) = \lim_{\mathfrak{f}} \varphi$. Then $\kappa \in \delta(\mathcal{A})$ if and only if $\mathfrak{f} \in \delta X$.

Proof. If $\mathfrak{f}$ is an ultrafilter and $Y \subset X$ then there are only two possibilities: either $Y \notin \mathfrak{f}$, and then $X \setminus Y \in \mathfrak{f}$, or $Y \in \mathfrak{f}$, hence $Y \cap Z \neq \emptyset$ for all $Z \in \mathfrak{f}$, and then the sets $Y \cap Z$ with $Z \in \mathfrak{f}$ form an ultrafilter on $Y$. If $\mathfrak{f}$ is not finer than the Fréchet filter, then there is a set, with compact complement $Y$, which does not belong to $\mathfrak{f}$, and so $Y \in \mathfrak{f}$. Since any ultrafilter on a compact set is convergent, we see that there is $y \in Y$ such that $\mathfrak{f}$ contains the filter of neighborhoods of $y$. But then clearly $\lim_{\mathfrak{f}} \varphi = \varphi(y)$ for any continuous function $\varphi$, hence the character $\kappa(\varphi) = \lim_{\mathfrak{f}} \varphi$ is just $y$ and does not belong to $\delta(\mathcal{A})$. On the other hand, if $\mathfrak{f} \in \delta X$ then clearly $\kappa \in \delta(\mathcal{A})$.

Thus the characters $\kappa \in \delta(\mathcal{A})$ are equivalence classes of ultrafilters $\mathfrak{f} \in \delta X$. In general, we do not distinguish between a character and the elements of the equivalence class of ultrafilters which define it. However, when needed for the clarity of the argument, we shall use the map $\delta$ which sends an element into its equivalence class. More precisely, from (2.12) we see that there are canonical surjections
$$\delta X \to \delta_u X \to \delta(\mathcal{A}) \to \{\infty\} \tag{2.15}$$
and all of them (and their compositions) will be denoted $\delta$. Here, $\infty$ is the Fréchet filter and we have $\delta(\mathcal{C}_\infty(X)) = \{\infty\}$.

2.6. On the uniform compactification

The space $\beta_u X$ is the quotient of $\gamma X$ given by an equivalence relation that we describe now (see [48, p. 121]). If $\mathfrak{f}$ is a filter, then its envelope is the filter $\mathfrak{f}^\circ$ generated by the sets $F + V$ where $F \in \mathfrak{f}$ and $V$ belongs to the filter of neighborhoods of the origin (observe that the sets $F + V$, with $V$ an open neighborhood of the origin, are open and form a basis of $\mathfrak{f}^\circ$). Note that $\mathfrak{f} \supset \mathfrak{f}^\circ$ and $(\mathfrak{f}^\circ)^\circ = \mathfrak{f}^\circ$. Two filters are u-equivalent (uniformly equivalent) if they have the same envelope. The quotient of $\gamma X$ with respect to this relation is $\beta_u X$. We shall give a complete proof of this assertion since in [48] the $C^*$-algebra point of view is not explicitly considered.

The following simple fact will be useful for other purposes too.
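Lemma 2.2. Let $\varphi : X \to \mathbb{C}$ be uniformly continuous and let $\mathfrak{f}$ be a filter on $X$.

(1) $\lim_{\mathfrak{f}} \varphi$ exists if and only if $\lim_{\mathfrak{f}^\circ} \varphi$ exists and in this case they are equal.

(2) If $\lim_{x \to \mathfrak{f}} \varphi(x + y) \equiv \xi(y)$ exists for each $y \in X$, then the limit exists locally uniformly in $y$ and $\xi$ is a uniformly continuous function.

Proof. To prove (1), it suffices to show that $\lim_{\mathfrak{f}^\circ} \varphi = 0$ if $\lim_{\mathfrak{f}} \varphi = 0$. For $\varepsilon > 0$, let $F_\varepsilon$ be the set of points where $|\varphi(x)| < \varepsilon$. We have $F_\varepsilon \in \mathfrak{f}$ and if we choose a neighborhood $V$ of the origin such that $|\varphi(x) - \varphi(y)| < \varepsilon$ if $x - y \in V$, then for $x \in F_\varepsilon + V$ we have $|\varphi(x)| < 2\varepsilon$, hence $F_{2\varepsilon} \in \mathfrak{f}^\circ$.

Now, we prove (2). Set $\omega_V(\varphi) = \sup_{y - z \in V} |\varphi(y) - \varphi(z)|$ if $V$ is a neighborhood of the origin. Then $\varphi$ is uniformly continuous if and only if for each $\varepsilon > 0$ there is $V$ such that $\omega_V(\varphi) < \varepsilon$. Clearly, $\omega_V(\xi) \leq \omega_V(\varphi)$, so $\xi$ is uniformly continuous. Now, let $V$ be open and let $K$ be a compact set. Then $K$ is covered by the open sets $x + V$, $x \in K$, hence there is $Z \subset K$ finite such that $K \subset \bigcup_{z \in Z} (z + V)$. Thus, for each $y \in K$ there is $z \in Z$ such that $y \in z + V$ and then $|\varphi(x + y) - \varphi(x + z)| \leq \omega_V(\varphi)$ for all $x \in X$. Then we have:
$$\begin{aligned}
|\varphi(x + y) - \xi(y)| &\leq |\varphi(x + y) - \varphi(x + z)| + |\varphi(x + z) - \xi(z)| + |\xi(z) - \xi(y)| \\
&\leq \omega_V(\varphi) + |\varphi(x + z) - \xi(z)| + \omega_V(\xi) \\
&\leq 2\omega_V(\varphi) + |\varphi(x + z) - \xi(z)|.
\end{aligned}$$
We choose $V$ such that $\omega_V(\varphi) \leq \varepsilon/3$ and then we fix $Z$ as above. Since $Z$ is finite, there is $F \in \mathfrak{f}$ such that $|\varphi(x + z) - \xi(z)| \leq \varepsilon/3$ for all $x \in F$ and $z \in Z$. Finally, we get $|\varphi(x + y) - \xi(y)| \leq \varepsilon$ for all $x \in F$ and $y \in K$.

Lemma 2.3. Assume $\mathfrak{f} = \mathfrak{f}^\circ$ where $\mathfrak{f}$ is a filter on $X$. Then for each $F \in \mathfrak{f}$, there is an open subset $G \in \mathfrak{f}$ of $F$ and a function $\theta \in \mathcal{C}(X)$ such that $\theta = 1$ on $G$ and $\theta = 0$ on $F^c \equiv X \setminus F$.

Proof. Note first that the open sets from $\mathfrak{f}$ form a basis of $\mathfrak{f}$. Clearly, there is an open $G \in \mathfrak{f}$ and an open, relatively compact neighborhood of the origin $U$ such that $G + (U - U) \subset F$, so denoting $A = G - U$ we shall have $A + U \subset F$. We then set $\theta = |U|^{-1} 1_A * 1_U$, so for each $x \in X$, we have $\theta(x) = |U|^{-1} |A \cap (x - U)|$. For $x \in G$, $x - U \subset A$, thus $\theta(x) = |U|^{-1} |x - U| = 1$, and for $x \notin A + U$, $A \cap (x - U) = \emptyset$, hence $\theta(x) = 0$. But $A + U \subset F$, thus $\theta = 0$ on $F^c$ too. Finally, from
$$\|u * v\|_{L^\infty} \leq \|u\|_{L^1} \|v\|_{L^\infty} \quad \text{and} \quad \|x.(u * v) - u * v\|_{L^\infty} \leq \|x.u - u\|_{L^1} \|v\|_{L^\infty},$$
where $(x.u)(y) = u(y + x)$, we get $L^1(X) * L^\infty(X) \subset \mathcal{C}(X)$, hence $\theta \in \mathcal{C}(X)$.

Proposition 2.4. Let $\kappa, \chi$ be ultrafilters on $X$. Then $\kappa^\circ = \chi^\circ$ if and only if $\kappa(\varphi) = \chi(\varphi)$ for each $\varphi \in \mathcal{C}(X)$.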
Proof. The “only if” part follows from $\kappa(\varphi) = \lim_\kappa \varphi = \lim_{\kappa^\circ} \varphi = \lim_{\chi^\circ} \varphi = \lim_\chi \varphi = \chi(\varphi)$, the second and the fourth equality being consequences of Lemma 2.2. Conversely, let $\kappa^\circ \neq \chi^\circ$. Then there is $F \in \kappa^\circ$ such that $F \notin \chi^\circ$; replacing $F$ by a smaller element of $\kappa^\circ$ if necessary, we may assume $F \notin \chi$, hence $F^c \in \chi$ because $\chi$ is an ultrafilter. Let now $G$ and $\theta$ be as in Lemma 2.3. Since $G \in \kappa^\circ \subset \kappa$ we have $\kappa(1_G) = 1$, thus $\kappa(1_{G^c}) = 0$. Hence, $\kappa(\theta) = \kappa(\theta 1_G) + \kappa(\theta 1_{G^c}) = 1 + \kappa(\theta)\kappa(1_{G^c}) = 1$. On the other hand, $F^c \in \chi$ implies $\chi(1_{F^c}) = 1$ and $\theta 1_{F^c} = 0$, thus $0 = \chi(\theta 1_{F^c}) = \chi(\theta)\chi(1_{F^c}) = \chi(\theta)$.

3. Crossed Products

In this section we recall some facts concerning crossed products and point out some properties important for our later arguments. A locally compact non-compact abelian group $X$ is fixed in what follows.

We shall say that a $C^*$-algebra $\mathcal{A}$ is an $X$-algebra if a homomorphism $\alpha : x \mapsto \alpha_x$ of $X$ into the group of automorphisms of $\mathcal{A}$ is given, such that for each $A \in \mathcal{A}$ the map $x \mapsto \alpha_x(A)$ is norm continuous.$^{c}$ An $X$-subalgebra of $\mathcal{A}$ is a $C^*$-subalgebra that is left invariant by all the automorphisms $\alpha_x$. An $X$-ideal is an ideal stable under the $\alpha_x$. If $(\mathcal{A}, \alpha)$ and $(\mathcal{B}, \beta)$ are two $X$-algebras, a morphism $\phi : \mathcal{A} \to \mathcal{B}$ is called $X$-morphism if $\phi[\alpha_x(A)] = \beta_x[\phi(A)]$ for all $x \in X$ and $A \in \mathcal{A}$.

We shall not need the abstract definition of the crossed product $\mathcal{A} \rtimes X$ of an $X$-algebra $\mathcal{A}$ by the action of $X$. We mention only that $\mathcal{A} \rtimes X$ is a $C^*$-algebra uniquely defined modulo a canonical isomorphism by a certain universal property (see [41], for example) and that the correspondence $\mathcal{A} \mapsto \mathcal{A} \rtimes X$ has certain functorial properties (see [23]) which play an important role in [22] but will not be used here. On the other hand, the following concrete realization of $\mathcal{A} \rtimes X$ for certain $\mathcal{A}$ will be important.

There is a natural action of $X$ on $L^\infty(X)$ by translations $(\tau_x \varphi)(y) = \varphi(y + x)$ and it is clear that $x \mapsto \tau_x \varphi \in L^\infty(X)$ is norm continuous if and only if $\varphi \in \mathcal{C}(X)$. Thus, $\mathcal{C}(X)$ becomes an $X$-algebra and we will be interested only in crossed products $\mathcal{A} \rtimes X$ with $\mathcal{A}$ an $X$-subalgebra of $\mathcal{C}(X)$, i.e. a $C^*$-subalgebra stable under translations. In many cases, we shall slightly simplify the writing and set $x.\varphi = \tau_x \varphi$. Note that if $\varphi \in \mathcal{C}(X) \cap L^2(X)$, we have $x.\varphi = U_x \varphi$ but $(x.\varphi)(Q) = U_x \varphi(Q) U_x^*$. More generally, we shall use the notations:
$$x \in X,\ T \in \mathscr{B}(X) \ \Rightarrow\ x.T \equiv \tau_x(T) = U_x T U_x^*. \tag{3.1}$$
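The identity $(x.\varphi)(Q) = U_x \varphi(Q) U_x^*$ used above follows from a one-line computation. The derivation below is a sketch that uses only the definitions $(U_x f)(y) = f(x + y)$ and $(\tau_x \varphi)(y) = \varphi(y + x)$ from the text, together with the elementary fact that $(U_x^* f)(y) = f(y - x)$.

```latex
% For f \in L^2(X) and y \in X:
\bigl(U_x\,\varphi(Q)\,U_x^* f\bigr)(y)
  = \bigl(\varphi(Q)\,U_x^* f\bigr)(x + y)
  = \varphi(x + y)\,\bigl(U_x^* f\bigr)(x + y)
  = \varphi(x + y)\,f(y)
  = \bigl((\tau_x \varphi)(Q)\,f\bigr)(y).
```

So the notation $x.T = U_x T U_x^*$ of (3.1) is consistent with $x.\varphi = \tau_x \varphi$ when $\varphi$ is identified with the multiplication operator $\varphi(Q)$.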
The next definition describes $\mathcal{A} \rtimes X$ in what we could call the pseudo-differential operator representation, or ΨDO-representation.

Definition 3.1. If $\mathcal{A}$ is an $X$-subalgebra of $\mathcal{C}(X)$, the crossed product $\mathcal{A} \rtimes X \equiv \mathscr{A}$ is the norm closed linear subspace of $\mathscr{B}(X)$ generated by the operators of the form $\varphi(Q)\psi(P)$ with $\varphi \in \mathcal{A}$ and $\psi \in \mathcal{C}_0(X^*)$.

$^{c}$ The terminology “$C^*$-dynamical system” used by some $C^*$-algebra theorists seems to us extremely confusing in our context, even if $X$ is $\mathbb{R}$ or $\mathbb{Z}$, so we shall not use it.
The fact that $\mathscr{A}$ is a $C^*$-algebra follows from:
Lemma 3.2. If $\varphi \in C(X)$ and $\psi \in C_0(X^*)$, then for each number $\varepsilon > 0$, there are elements $x_1, \dots, x_n \in X$ and functions $\psi_1, \dots, \psi_n \in C_0(X^*)$ such that:
$$\Big\| \psi(P)\varphi(Q) - \sum_k \varphi(Q + x_k)\psi_k(P) \Big\| < \varepsilon. \tag{3.2}$$
For the proof, first approximate $\psi$ by functions such that $\hat\psi \in L^1(X)$ and then adapt the proof of [11, Lemma 2.1]. We mention two results which explain why we think of $\mathscr{A}$ as a $C^*$-algebra of quantum Hamiltonians. The first one is [22, Proposition 4.1].
Proposition 3.3. Let $\mathcal{A}$ be an $X$-subalgebra of $C(X)$ which contains the constants. Let $h : X^* \to \mathbb{R}$ be a continuous non-constant function such that $\lim_{k\to\infty} |h(k)| = \infty$. Then $\mathcal{A} \rtimes X$ is the $C^*$-algebra generatedᵈ by the self-adjoint operators of the form $h(P + k) + v(Q)$, with $k \in X^*$ and $v \in \mathcal{A}$ real.
The second one is [11, Corollary 2.4]. Here we assume $X = \mathbb{R}^n$ and denote by $\mathcal{A}^\infty$ the set of functions in $\mathcal{A}$ such that all their derivatives exist and belong to $\mathcal{A}$.
Proposition 3.4. Let $h$ be a real elliptic polynomial of order $m$ on $X$ and let $\mathcal{A}$ be as in Proposition 3.3. Then $\mathcal{A} \rtimes X$ is the $C^*$-algebra generated by the self-adjoint operators of the form $h(P) + S$, where $S$ runs over the set of symmetric differential operators of order $< m$ with coefficients in $\mathcal{A}^\infty$.
Example 3.5. We shall point out now the simplest crossed products. The smallest crossed product $\{0\} = \{0\} \rtimes X$ is, of course, of no interest.
(1) The largest crossed product is $\mathscr{C}(X) = C(X) \rtimes X$, see Theorem 3.10.
(2) The $C_0$ functions of momentum: $C_0(X^*) = \mathbb{C} \rtimes X$.
(3) The algebra of compact operators: $\mathscr{K}(X) = C_0(X) \rtimes X$.
(4) The two-body algebra: $\mathscr{T}(X) := C_\infty(X) \rtimes X = C_0(X^*) + \mathscr{K}(X)$.
The name of the fourth algebra is justified by Propositions 3.3 and 3.4. Indeed, if $X = \mathbb{R}^n$, then $\mathscr{T}(X)$ is the $C^*$-algebra generated by the self-adjoint operators of the form $(P + k)^2 + v(Q)$ with $k \in X$ and $v \in C_c^\infty(X)$ real, or by those of the form $\Delta + \sum_{j=1}^n a_j\partial_j + a_0$ where the $a_j$ are $C^\infty$ functions constant outside a compact set.
Remark 3.6. Note that the only abelian crossed products are $\{0\}$ and $C_0(X^*)$.
We have defined a map $\mathcal{A} \mapsto \mathcal{A} \rtimes X$ from the set of all $X$-subalgebras of $C(X)$ into the set of $C^*$-subalgebras of $B(X)$ which is obviously increasing. The following theorem, which is an immediate consequence of a more general abstract result due to M. B. Landstad, cf. [30, Theorem 4] or [35], says that this map is injective and describes its range.
ᵈ If $\mathcal{S}$ is a family of self-adjoint operators, then the $C^*$-algebra generated by $\mathcal{S}$ is the smallest $C^*$-algebra of operators on $\mathscr{H}$ to which each $H \in \mathcal{S}$ is affiliated.
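Theorem 3.7. A $C^*$-subalgebra $\mathscr{A}$ of $B(X)$ is a crossed product if and only if for each $A \in \mathscr{A}$ the following two conditions are satisfied:
• If $k \in X^*$, then $V_k^* A V_k \in \mathscr{A}$ and $\lim_{k\to 0} \|V_k^* A V_k - A\| = 0$,
• If $x \in X$, then $U_x A \in \mathscr{A}$ and $\lim_{x\to 0} \|(U_x - 1)A\| = 0$.
In this case, there is a unique $X$-subalgebra $\mathcal{A} \subset C(X)$ such that $\mathscr{A} = \mathcal{A} \rtimes X$, and this algebra is given by
$$\mathcal{A} = \{\varphi \in C(X) \mid \varphi(Q)^{(*)}\psi(P) \in \mathscr{A},\ \forall\, \psi \in C_0(X^*)\}. \tag{3.3}$$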
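As a consistency check on Theorem 3.7, the following sketch verifies the two conditions for a generator $A = \varphi(Q)\psi(P)$ of $\mathcal{A} \rtimes X$ with $\varphi \in \mathcal{A}$ and $\psi \in C_0(X^*)$. It assumes the conventions used elsewhere in the text, namely $V_k^*\psi(P)V_k = \psi(P+k)$, $U_x\varphi(Q)U_x^* = \varphi(Q+x)$ and $U_x = e_x(P)$ with $e_x(k) = k(x)$; the symbol $e_x$ is introduced here only for illustration.

```latex
\begin{align*}
% First condition: V_k commutes with \varphi(Q), and \psi is uniformly continuous.
V_k^* A V_k &= \varphi(Q)\,\psi(P+k) \in \mathcal{A}\rtimes X, \\
\|V_k^* A V_k - A\| &\le \|\varphi\|\,\sup_{p\in X^*}|\psi(p+k)-\psi(p)|
   \xrightarrow[k\to 0]{} 0, \\
% Second condition: move U_x through \varphi(Q) and split into two terms.
U_x A &= \varphi(Q+x)\,(e_x\psi)(P) \in \mathcal{A}\rtimes X, \\
(U_x-1)A &= \varphi(Q+x)\,(U_x-1)\psi(P) + \big(\varphi(Q+x)-\varphi(Q)\big)\psi(P), \\
\|(U_x-1)A\| &\le \|\varphi\|\,\sup_{k\in X^*}|(k(x)-1)\psi(k)|
   + \|x.\varphi-\varphi\|\,\|\psi\| \xrightarrow[x\to 0]{} 0 .
\end{align*}
```

The first term in the last line is small because $\psi$ is small outside a compact subset of $X^*$ and $k(x) \to 1$ uniformly on compact sets as $x \to 0$; the second term is small because $\varphi \in C(X)$ means exactly that $x \mapsto x.\varphi$ is norm continuous.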
Note that, since $\mathscr{A}$ is stable under taking adjoints, if we replace $U_x A$ by $A U_x$ and $(U_x - 1)A$ by $A(U_x - 1)$ in the second condition above we get an equivalent condition. If each element $A$ of a $C^*$-subalgebra $\mathscr{A} \subset B(X)$ verifies the two conditions of the theorem, we shall say that $\mathscr{A}$ satisfies Landstad’s conditions. The following reformulation of the second Landstad condition is useful.
Lemma 3.8. If $T \in B(X)$ then the next three assertions are equivalent:
• $\lim_{x\to 0} \|(U_x - 1)T\| = 0$,
• $T = \psi(P)T_0$ for some $\psi \in C_0(X^*)$ and $T_0 \in B(X)$,
• $\forall\, \varepsilon > 0\ \exists\, F \subset X^*$ with $X^* \setminus F$ compact and $\|1_F(P)T\| < \varepsilon$.
Proof. It suffices to consider only the first two conditions. If $T = \psi(P)T_0$, then
$$\|(U_x - 1)T\| \le \|(U_x - 1)\psi(P)\|\,\|T_0\| \le \|T_0\| \sup_k |(k(x) - 1)\psi(k)| \to 0 \quad \text{as } x \to 0.$$
To prove the converse assertion, let $B_0 = \{T \in B \mid \lim_{x\to 0} \|(U_x - 1)T\| = 0\}$. This is clearly a closed subspace of $B$ such that $\psi(P)B_0 \subset B_0$ if $\psi \in C_0(X^*)$. By taking $\psi(k) = |K|^{-1}1_K$ in (2.8), where $K$ runs over the family of compact neighborhoods of the origin in $X^*$, we easily see that each $T \in B_0$ is a norm limit of operators of the form $\psi(P)T$. Now, the Cohen–Hewitt factorization theorem [15, Theorem V.9.2] shows that each $T \in B_0$ can be written as $T = \psi(P)T_0$ with $\psi \in C_0(X^*)$ and $T_0 \in B_0$.
Corollary 3.9. If $\mathscr{A}$ is a crossed product, then each $A \in \mathscr{A}$ can be factorized as $A = A_1\psi_1(P) = \psi_2(P)A_2$ with $A_i \in \mathscr{A}$ and $\psi_i \in C_0(X^*)$. In particular, if $A \in \mathscr{A}$ and $\psi$ is a bounded continuous function on $X^*$, then $A\psi(P)$ and $\psi(P)A$ are in $\mathscr{A}$.
Theorem 3.7 allows us to give an intrinsic description of some crossed products. By “intrinsic” we mean a description which makes no reference to the crossed product operation. Examples may be found in Sec. 6; here we give the description of the largest crossed product $\mathscr{C}(X)$ which makes the connection with the definition (1.1).
Theorem 3.10. The crossed product $\mathscr{C}(X) = C(X) \rtimes X$ is given by (1.1).
For the proof, it suffices to note that the right-hand side of (1.1) is a $C^*$-algebra and to apply Theorem 3.7. It is useful to view the last condition in (1.1) from the perspective of Lemma 3.8: this gives a precise meaning to the fact that the operators from $\mathscr{C}(X)$ tend to zero as $P \to \infty$.
Remark 3.11. If $X = \mathbb{R}^n$, we see that $\mathscr{C}(X)$ is the norm closed linear subspace of $B(X)$ generated by the operators $\varphi(Q)\psi(P)$ with $\varphi$ in the space of $C^\infty$ functions which are bounded together with all their derivatives and $\psi$ in the space of $C^\infty$ functions with compact support. So $\mathscr{C}(X)$ is generated by a rather restricted class of pseudo-differential operators. In particular, $\mathscr{C}(X)$ is the norm closure of the set of pseudo-differential operators with symbols of class $S^m$ if $m < 0$ (see [27, Definition 18.1.1] and use [27, Theorem 18.1.6]). From Proposition 3.4, it also follows that $\mathscr{C}(X)$ is generated by a rather small class of elliptic operators.
As a consequence, we get an intrinsic description of the algebras of quantum Hamiltonians, in the sense of Definition 1.13.
Proposition 3.12. A $C^*$-subalgebra $\mathscr{A} \subset B(X)$ is a $C^*$-algebra of quantum Hamiltonians if and only if $\mathscr{A} \supset \mathscr{T}(X)$ and
• $x \in X$, $k \in X^*$, $A \in \mathscr{A}$ $\Rightarrow$ $V_k^* A V_k$ and $U_x A$ belong to $\mathscr{A}$,
• $\lim_{k\to 0} \|[A, V_k]\| = \lim_{x\to 0} \|(U_x - 1)A\| = 0$.
Remark 3.13. Observe that the classical Riesz–Kolmogorov compactness criterion
$$\mathscr{K}(X) = \Big\{T \in B(X) \ \Big|\ \lim_{k\to 0} \|(V_k - 1)T\| = 0 \text{ and } \lim_{x\to 0} \|(U_x - 1)T\| = 0\Big\}$$
$$= \{T \in B(X) \mid T = \varphi(Q)S = \psi(P)R \text{ with } \varphi \in C_0(X),\ \psi \in C_0(X^*) \text{ and } S, R \in B(X)\}$$
is also an intrinsic characterization of a crossed product and follows easily from Theorem 3.7 and Lemma 3.8 together with a similar fact with the group $U_x$ replaced by $V_k$. In more intuitive terms, the compact operators are characterized by the fact that they vanish when $P \to \infty$ and $Q \to \infty$.
Now, we show that the set of $C^*$-subalgebras of $B(X)$ which are crossed products is stable under arbitrary intersections and that the $C^*$-algebra generated by an arbitrary family of crossed products is again a crossed product. We denote by $C^*\big(\bigcup_\lambda \mathcal{B}_\lambda\big)$ the $C^*$-subalgebra generated by a family of $C^*$-subalgebras $\mathcal{B}_\lambda$.
Theorem 3.14. If $(\mathcal{A}_\lambda)$ is an arbitrary family of $X$-subalgebras of $C(X)$, then $\bigcap_\lambda \mathcal{A}_\lambda$ and $C^*\big(\bigcup_\lambda \mathcal{A}_\lambda\big)$ are $X$-subalgebras and:
$$\bigcap_\lambda (\mathcal{A}_\lambda \rtimes X) = \Big(\bigcap_\lambda \mathcal{A}_\lambda\Big) \rtimes X, \tag{3.4}$$
$$C^*\Big(\bigcup_\lambda (\mathcal{A}_\lambda \rtimes X)\Big) = C^*\Big(\bigcup_\lambda \mathcal{A}_\lambda\Big) \rtimes X. \tag{3.5}$$
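To illustrate (3.4) in the simplest case, one can take the two $X$-subalgebras $\mathbb{C}$ and $C_0(X)$ from Example 3.5. The sketch below assumes that (3.4) is read as an intersection identity and uses only that $X$ is non-compact, so that the only constant function vanishing at infinity is zero.

```latex
\begin{align*}
C_0(X^*) \cap \mathscr{K}(X)
  &= (\mathbb{C}\rtimes X) \cap (C_0(X)\rtimes X) \\
  &= \big(\mathbb{C}\cap C_0(X)\big) \rtimes X \\
  &= \{0\}\rtimes X = \{0\}.
\end{align*}
```

This is consistent with the description of the two-body algebra in Example 3.5 as the sum $C_0(X^*) + \mathscr{K}(X)$: the two summands meet only in zero.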
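Proof. The fact that $\bigcap_\lambda \mathcal{A}_\lambda$ and $C^*\big(\bigcup_\lambda \mathcal{A}_\lambda\big)$ are $X$-subalgebras is easy to prove and the inclusions $\supset$ in (3.4) and $\subset$ in (3.5) are obvious. The proof of $\supset$ in (3.5) is elementary. Indeed, it suffices to show that $\varphi(Q)\psi(P)$ belongs to the left-hand side of (3.5) if $\varphi \in C^*\big(\bigcup_\lambda \mathcal{A}_\lambda\big)$. Then we may assume that $\varphi = \varphi_L = \prod_{\lambda\in L} \varphi_\lambda$ with $\varphi_\lambda \in \mathcal{A}_\lambda$ and $L$ a finite set. Let $\lambda \in L$ and $M = L \setminus \{\lambda\}$. Then Corollary 3.9 applied to $\varphi_\lambda(Q)\psi(P) \in \mathcal{A}_\lambda \rtimes X$ gives $\varphi_L(Q)\psi(P) = \varphi_M(Q)\varphi_\lambda(Q)\psi(P) = \varphi_M(Q)\psi_\lambda(P)A_\lambda$ for some $\psi_\lambda \in C_0(X^*)$ and $A_\lambda \in \mathcal{A}_\lambda \rtimes X$. Repeating the argument with $\varphi_L$ replaced by $\varphi_M$, we see that $\varphi_L(Q)\psi(P)$ can be written as a product of elements of $\mathcal{A}_\lambda \rtimes X$ with $\lambda \in L$. This proves (3.5).
The inclusion $\subset$ in (3.4) is deeper; it depends on Theorem 3.7. Let $\mathscr{A}_\lambda = \mathcal{A}_\lambda \rtimes X$ and $\mathscr{A} = \bigcap_\lambda \mathscr{A}_\lambda$. It is easy to check that $\mathscr{A}$ satisfies the two conditions of Theorem 3.7, so $\mathscr{A} = \mathcal{A} \rtimes X$ where $\mathcal{A}$ is defined by (3.3). If $\varphi \in C(X)$ has the property $\varphi(Q)^{(*)}\psi(P) \in \mathscr{A}$ for all $\psi \in C_0(X^*)$, then we also have $\varphi(Q)^{(*)}\psi(P) \in \mathscr{A}_\lambda$ for all such $\psi$, hence $\varphi$ belongs to the algebra associated with $\mathscr{A}_\lambda$ by (3.3), which is $\mathcal{A}_\lambda$, for each $\lambda$. Thus, $\varphi \in \bigcap_\lambda \mathcal{A}_\lambda$, hence $\mathcal{A} \subset \bigcap_\lambda \mathcal{A}_\lambda$.
Proposition 3.15. If $\mathcal{A}$, $\mathcal{J}$ are $X$-subalgebras then $\mathcal{J}$ is an ideal of $\mathcal{A}$ if and only if $\mathcal{J} \rtimes X$ is an ideal of $\mathcal{A} \rtimes X$.
Proof. The fact that “$\mathcal{J} \subset \mathcal{A}$ ideal $\Rightarrow$ $\mathcal{J} \rtimes X \subset \mathcal{A} \rtimes X$ ideal” follows easily from Lemma 3.2. For the converse it suffices to show that if $\mathscr{J} = \mathcal{J} \rtimes X$ and $\mathscr{A} = \mathcal{A} \rtimes X$ are crossed products and if $\mathscr{J}$ is an ideal of $\mathscr{A}$, then $\mathcal{J}$ is an ideal of $\mathcal{A}$. Let $\xi \in \mathcal{J}$ and $\varphi \in \mathcal{A}$; then by Corollary 3.9, for each $\psi \in C_0(X^*)$ we can factorize $\varphi(Q)\psi(P) = \psi_0(P)S$ for some $\psi_0 \in C_0(X^*)$ and $S \in \mathscr{A}$. Thus $(\xi\varphi)(Q)\psi(P) = \xi(Q)\psi_0(P)S \in \mathscr{J}$ because $\xi(Q)\psi_0(P) \in \mathscr{J}$ and $\mathscr{J}$ is an ideal of $\mathscr{A}$, hence $\xi\varphi \in \mathcal{J}$.
Proposition 3.16. Assume that $\mathcal{A}$, $\mathcal{B}$, $\mathcal{J}$ are $X$-subalgebras of $C(X)$ such that $\mathcal{A} = \mathcal{B} + \mathcal{J}$ and that $\mathcal{J}$ is an ideal in $\mathcal{A}$. Then $\mathcal{J} \rtimes X$ is an ideal in $\mathcal{A} \rtimes X$ and $\mathcal{A} \rtimes X = \mathcal{B} \rtimes X + \mathcal{J} \rtimes X$. If $\mathcal{A} = \mathcal{B} + \mathcal{J}$ is a linear direct sum, then $\mathcal{A} \rtimes X = \mathcal{B} \rtimes X + \mathcal{J} \rtimes X$ is a linear direct sum.
Proof. We know that $\mathcal{J} \rtimes X$ is an ideal in $\mathcal{A} \rtimes X$ and that $\mathcal{B} \rtimes X \subset \mathcal{A} \rtimes X$ is a $C^*$-subalgebra. From [14, Corollary 1.8.4], we see that $\mathcal{B} \rtimes X + \mathcal{J} \rtimes X$ is closed in $\mathcal{A} \rtimes X$, and since it is clearly dense in $\mathcal{A} \rtimes X$, we have $\mathcal{A} \rtimes X = \mathcal{B} \rtimes X + \mathcal{J} \rtimes X$. Finally, $(\mathcal{B} \rtimes X) \cap (\mathcal{J} \rtimes X) = (\mathcal{B} \cap \mathcal{J}) \rtimes X$ because of (3.4), and this is $\{0\}$ if $\mathcal{B} \cap \mathcal{J} = \{0\}$.
We mention a fact which is useful in the explicit computation of the algebra defined by (3.3).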
Remark 3.17. It is clear that in (3.3) it suffices to consider only $\psi \in C_c(X^*)$. Since, by Corollary 3.9, a crossed product is a $C_0(X^*)$-bimodule, we get the following simpler description of the algebra in (3.3): if there is $\xi \in C_0(X^*)$ such that $\xi(k) \neq 0$ for all $k \in X^*$, then
$$\mathcal{A} = \{\varphi \in C(X) \mid \varphi(Q)^{(*)}\xi(P) \in \mathscr{A}\}. \tag{3.6}$$
Such a $\xi$ exists if and only if $X^*$ is $\sigma$-compact (i.e. a countable union of compact sets).
Remark 3.18. The following comment on the first Landstad condition is of some interest, although it does not play any role in our arguments. Let $C^u(Q)$ be the set of $S \in B(X)$ which verify the first Landstad condition; this is clearly a $C^*$-algebra. Let us say that an operator $S \in B(X)$ is of finite range (not rank!) if there is a compact neighborhood $\Lambda$ of the origin such that $S1_K(Q) = 1_{K+\Lambda}(Q)S1_K(Q)$ for any Borel set $K$. Clearly, the set of finite range operators is a $*$-subalgebra of $B(X)$ and it can be shown that the set of finite range operators which belong to $C^u(Q)$ is dense in $C^u(Q)$. Moreover, under quite general conditions on $X$, it can be shown that a finite range operator belongs to $C^u(Q)$ (this is probably always true). Thus, if $X = \mathbb{R}^n$ or if $X$ is a discrete group for example, then $C^u(Q)$ is exactly the norm closure of the set of finite range operators. These questions are treated in [18, Propositions 4.11 and 4.12].
4. Affiliation to $\mathscr{C}(X)$
Theorem 1.2 shows that the essential spectrum of the operators affiliated to $\mathscr{C}(X)$ is determined by their localizations at infinity, so it is important to show that the class of operators affiliated to $\mathscr{C}(X)$ is large. We show in this section that this is indeed the case: singular perturbations of hypoelliptic self-adjoint pseudodifferential operators are affiliated to $\mathscr{C}(X)$. If one thinks of $\mathscr{C}(X)$ as the $C^*$-algebra generated by the operators of the form $\varphi(Q)\psi(P)$ with $\varphi \in C(X)$, $\psi \in C_0(X^*)$, this is far from obvious.
In the rest of the section, we fix a finite dimensional Hilbert space $E$, we set $\mathscr{H} = L^2(X; E)$ and define $\mathscr{C} = \mathscr{C}(X)$ as in (1.1). Since the adjoint spaceᵉ $\mathscr{H}^*$ is identified with $\mathscr{H}$ by using the Riesz isomorphism, if $\mathscr{G}$ is a Hilbert space with $\mathscr{G} \subset \mathscr{H}$ continuously and densely then we get a similar embedding $\mathscr{H} \subset \mathscr{G}^*$. Let $H$ be a self-adjoint operator on $\mathscr{H}$ and let $z \in \rho(H)$.
As we saw in (1.3), $H$ is affiliated to $\mathscr{C}$ if and only if
$$\lim_{x\to 0} \|(U_x - 1)(H - z)^{-1}\| = 0 \quad \text{and} \quad \lim_{k\to 0} \|[V_k, (H - z)^{-1}]\| = 0. \tag{4.1}$$
In the next subsection, we make an abstract analysis of these relations and in Sec. 4.2 we give concrete examples.
ᵉ The adjoint space (space of antilinear continuous forms) of a Hilbert space $\mathscr{G}$ is denoted $\mathscr{G}^*$ and if $u \in \mathscr{G}$ and $v \in \mathscr{G}^*$ then we set $v(u) = \langle u, v\rangle$.
4.1. Abstract affiliation criteria
A function $\theta : X^* \to \mathbb{R}$ such that $\lim_{k\to\infty} \theta(k) = +\infty$ will be called divergent. Lemma 3.8 and an interpolation argument give:
Lemma 4.1. The first condition in (4.1) is fulfilled if and only if there are $s > 0$ and a continuous divergent function $\theta$ such that $D(|H|^s) \subset D(\theta(P))$. And then this property holds for all real numbers $s > 0$.
Let $S(E)$ be the space of symmetric operators on $E$. If $h : X^* \to S(E)$ is Borel, then $h(P)$ is the self-adjoint operator on $\mathscr{H}$ such that $\mathcal{F}h(P)\mathcal{F}^*$ is the operator of multiplication by $h$ in $L^2(X^*; E)$. If $\lim_{k\to\infty} \mathrm{dist}(0, \sigma(h(k))) = \infty$, then we write $\lim_{k\to\infty} h(k) = \infty$. This property is equivalent to $\lim_{k\to\infty} \|(h(k) + i)^{-1}\| = 0$ and implies $\lim_{k\to\infty} \|\varphi(h(k))\| = 0$ for all $\varphi \in C_0(\mathbb{R})$. If $E = \mathbb{C}$, this means $\lim_{k\to\infty} |h(k)| = \infty$.
Corollary 4.2. If $h : X^* \to S(E)$ is a continuous function on $X^*$, then $h(P)$ is affiliated to $\mathscr{C}$ if and only if $\lim_{k\to\infty} h(k) = \infty$.
In particular, if $X = \mathbb{R}^2$ then the operator $H = \partial_1^2 - \partial_2^2$ is not affiliated to $\mathscr{C}$. A second interesting operator not affiliated to $\mathscr{C}$ is $H = (\partial_1 + ix_2)^2 + (\partial_2 + ix_1)^2$. We now give the simplest affiliation criterion.
Proposition 4.3. Assume that $H_0$ is a self-adjoint operator affiliated to $\mathscr{C}$ and that $V$ is a bounded symmetric operator such that $\lim_{k\to 0} \|[V_k, V]\| = 0$. Then $H = H_0 + V$ is a self-adjoint operator affiliated to $\mathscr{C}$.
Proof. Let $R = (H + i)^{-1}$ and $R_0 = (H_0 + i)^{-1}$. Since $H$ and $H_0$ have the same domain and $R[1 + VR_0] = R_0$, the operator $1 + VR_0$ is invertible. On the other hand, $1 + VR_0$ satisfies the second condition in (4.1), hence its inverse verifies it too. From $R = R_0[1 + VR_0]^{-1}$, we see that both conditions in (4.1) are satisfied.
From now on, we consider only situations when $V$ is not bounded.
Proposition 4.4. Let $H$ be a self-adjoint operator such that $V_k D(H) \subset D(H)$ for all $k$. Then $H$ is affiliated to $\mathscr{C}$ if and only if $D(H) \subset D(\theta(P))$ for some continuous divergent function $\theta$ and
$$\lim_{k\to 0} \|[V_k, H]\|_{D(H)\to D(H)^*} = 0. \tag{4.2}$$
Proof. It is clear that $V_k D(H) \subset D(H)$ for all $k$ if and only if $V_k$ extends to a continuous map $D(H)^* \to D(H)^*$ for each $k$, and then we have in $B(\mathscr{H})$:
$$[V_k, (H - z)^{-1}] = (H - z)^{-1}[H, V_k](H - z)^{-1}. \tag{4.3}$$
The operator $[H, V_k]$ belongs to $B(D(H), \mathscr{H})$ and so we can consider it as a map $D(H) \to D(H)^*$. But $(H - z)^{-1}$ is an isomorphism $\mathscr{H} \to D(H)$ and $D(H)^* \to \mathscr{H}$. To end the proof it suffices to use Lemma 4.1.
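The two algebraic identities used in the proofs of Propositions 4.3 and 4.4 can be checked directly. The sketch below writes $R_z = (H - z)^{-1}$, a shorthand introduced here for illustration, and works formally on $D(H)$, assuming as in Proposition 4.4 that $V_k$ preserves $D(H)$.

```latex
\begin{align*}
% Identity (4.3): insert R_z (H - z) = 1 on the left and (H - z) R_z = 1 on the right.
[V_k, R_z] &= V_k R_z - R_z V_k \\
  &= R_z (H - z) V_k R_z - R_z V_k (H - z) R_z \\
  &= R_z\,[H, V_k]\,R_z . \\[4pt]
% Second resolvent identity for H = H_0 + V,
% with R = (H + i)^{-1} and R_0 = (H_0 + i)^{-1}.
R_0 - R &= R\,(H - H_0)\,R_0 = R\,V R_0
  \quad\Longrightarrow\quad R_0 = R\,[\,1 + V R_0\,] .
\end{align*}
```

The second line is the relation $R[1 + VR_0] = R_0$ quoted in the proof of Proposition 4.3; the first is (4.3) with the constant $z$ dropping out of the commutator.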
We shall give below three perturbative criteria of affiliation: we add to an operator affiliated to $\mathscr{C}$ an operator which is not necessarily affiliated to it. Note that functions of $Q$ are never affiliated to $\mathscr{C}$. First, we consider operator bounded perturbations.
Corollary 4.5. Let $H_0$ be a self-adjoint operator affiliated to $\mathscr{C}$ such that $V_k D(H_0) \subset D(H_0)$ for all $k$. Let $V$ be a symmetric operator with domain $D(H_0)$ and such that $H = H_0 + V$ is self-adjoint. Then $H$ is affiliated to $\mathscr{C}$ if and only if
$$\lim_{k\to 0} \|[V_k, V]\|_{D(H_0)\to D(H_0)^*} = 0. \tag{4.4}$$
Now, we want to consider form bounded perturbations in a generalized sense (in order to cover operators that are not semibounded). Let $H$ be a self-adjoint operator on $\mathscr{H}$. We say that a Hilbert space $\mathscr{G}$ is adapted to $H$ if $D(H) \subset \mathscr{G} \subset \mathscr{H}$ continuously and densely and $H - z$ extends to an isomorphism $\mathscr{G} \to \mathscr{G}^*$ for some (hence for all) $z \in \mathbb{C}$ outside the spectrum of $H$. Then $H$ extends to a continuous operator $\mathscr{G} \to \mathscr{G}^*$ and we keep the notation $H$ for the extended map. It is not difficult to show that if $H$ is a semibounded operator then $\mathscr{G}$ is adapted to $H$ if and only if $\mathscr{G} = D(|H|^{1/2})$ as topological vector spaces, see [18, p. 47]. But, in general, for example in the case of Dirac operators, this is not the case.
Observe that $D(H) \subset \mathscr{G} \subset \mathscr{H} \subset \mathscr{G}^* \subset D(H)^*$ continuously and densely, in particular $B(\mathscr{G}, \mathscr{G}^*) \subset B(D(H), D(H)^*)$. It is then clear that one has $V_k\mathscr{G} \subset \mathscr{G}$ for all $k$ if and only if $V_k$ extends to a continuous map $\mathscr{G}^* \to \mathscr{G}^*$ for each $k$, and in this case the identity (4.3) is valid in $B(\mathscr{G}^*, \mathscr{G})$. The operator $[H, V_k]$ belongs to $B(\mathscr{G}, \mathscr{G}^*)$ and so we can consider it as a map $D(H) \to D(H)^*$. But $(H - z)^{-1}$ is an isomorphism $\mathscr{H} \to D(H)$ and $D(H)^* \to \mathscr{H}$. Thus:
Proposition 4.6. Let $H$ be a self-adjoint operator on $\mathscr{H}$ such that $D(H) \subset D(\theta(P))$ for some continuous divergent function $\theta$. Assume that $\mathscr{G}$ is a Hilbert space adapted to $H$ and that $V_k\mathscr{G} \subset \mathscr{G}$ for all $k$. Then $H$ is affiliated to $\mathscr{C}$ if and only if
$$\lim_{k\to 0} \|[V_k, H]\|_{D(H)\to D(H)^*} = 0. \tag{4.5}$$
In many situations of interest in quantum mechanics, the domain of the Hamiltonian is difficult to determine while its form domain is quite explicit. For this reason, the following condition, which is stronger than (4.5), is often more convenient:
$$\lim_{k\to 0} \|[V_k, H]\|_{\mathscr{G}\to\mathscr{G}^*} = 0. \tag{4.6}$$
We shall use this in the following context.
Definition 4.7. Let $H_0$ be a self-adjoint operator on $\mathscr{H}$ and let $\mathscr{G}$ be a Hilbert space adapted to it. We say that $V$ is a standard form perturbation of $H_0$ if $V$ is a continuous symmetric sesquilinear form on $\mathscr{G}$ and there are numbers $\mu \in [0, 1)$ and
$\nu \ge 0$ such that one of the following conditions is satisfied:
(1) $\pm V \le \mu|H_0| + \nu$ as forms on $\mathscr{G}$,
(2) $H_0$ is bounded from below and $V \ge -\mu H_0 - \nu$ as forms on $\mathscr{G}$.
Then, the operator $H = H_0 + V : \mathscr{G} \to \mathscr{G}^*$ is such that its restriction to $D(H) = \{u \in \mathscr{G} \mid Hu \in \mathscr{H}\}$ is a self-adjoint operator on $\mathscr{H}$ (and will also be denoted $H$) and $\mathscr{G}$ is adapted to $H$ too (see [13]). Note that $V$ is seen as a continuous operator $\mathscr{G} \to \mathscr{G}^*$.
Corollary 4.8. Let $H_0$ and $V$ be as above. We assume that $\mathscr{G} \subset D(\theta(P))$ for some continuous divergent function $\theta$, that $V_k\mathscr{G} \subset \mathscr{G}$ for all $k$, and that $\lim_{k\to 0} \|[V_k, H]\|_{\mathscr{G}\to\mathscr{G}^*} = 0$. Then $H$ is affiliated to $\mathscr{C}$.
The next result covers perturbations of $H_0$ which are not dominated by $H_0$.
Proposition 4.9. Let $H_1$, $H_2$ be bounded from below self-adjoint operators and let us denote $\mathscr{G}_i = D(|H_i|^{1/2})$. Assume that $\mathscr{G} \equiv \mathscr{G}_1 \cap \mathscr{G}_2$ is dense in $\mathscr{H}$ and let $H = H_1 + H_2$, the sum being defined in form sense. Let us suppose that $\mathscr{G} \subset D(\theta(P))$ for some continuous divergent function $\theta$ and that for $i = 1, 2$ we have $V_k\mathscr{G}_i \subset \mathscr{G}_i$ and $\lim_{k\to 0} \|[V_k, H_i]\|_{B(\mathscr{G}_i, \mathscr{G}_i^*)} = 0$. Then $H$ is affiliated to $\mathscr{C}$.
Proof. Let us recall that the form sum $H = H_1 + H_2$ is defined as the unique self-adjoint operator such that $D(|H|^{1/2}) = \mathscr{G}$ and $\langle u, Hu\rangle = \langle u, H_1u\rangle + \langle u, H_2u\rangle$ for all $u \in \mathscr{G}$. The topology of $\mathscr{G}$ is the intersection topology of $\mathscr{G}_1$ and $\mathscr{G}_2$, so thinking in terms of sesquilinear forms we see that
$$\|[V_k, H]\|_{B(\mathscr{G}, \mathscr{G}^*)} \le C\|[V_k, H_1]\|_{B(\mathscr{G}_1, \mathscr{G}_1^*)} + C\|[V_k, H_2]\|_{B(\mathscr{G}_2, \mathscr{G}_2^*)}$$
for some constant $C$. Hence, (4.6) is satisfied.
4.2. Examples of affiliated operators
If $w$ is a continuous divergent function on $X^*$ let $\mathscr{H}^w \equiv \mathscr{H}^w(X) = D(w(P))$ equipped with the graph norm. We saw in Lemma 4.1 that if $H$ is affiliated to $\mathscr{C}$ then $D(|H|^{1/2}) \subset \mathscr{H}^w$ for such a $w$. We consider now operators whose form domain is equal to some $\mathscr{H}^w$.
We say that $w$ is a weightᶠ if $w : X^* \to\ ]0, \infty[$ is continuous and $w(k + p) \le \omega(k)w(p)$ for some function $\omega$ and all $k, p \in X^*$. If $\omega$ is the smallest function satisfying such an estimate, then $\omega(k + p) \le \omega(k)\omega(p)$. From now on, we shall assume that $\omega$ satisfies this submultiplicativity condition. We also say $\omega$-weight if we need to be more specific. If $X = \mathbb{R}^n$, then a standard choice is $w(k) = \langle k\rangle^s$ for some real $s$.
ᶠ The terminology is suggested by that from [27, Sec. 10.1], cf. the remark after Theorem 10.1.5.
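For $X = \mathbb{R}^n$, the weight property of the standard power-type choice follows from Peetre's inequality. The sketch below assumes the usual convention $\langle k\rangle = (1 + |k|^2)^{1/2}$, which is not restated in this section, and exhibits one admissible $\omega$ rather than the smallest one.

```latex
\begin{align*}
\langle k+p\rangle^2 &= 1 + |k+p|^2 \le 1 + 2|k|^2 + 2|p|^2
   \le 2\,\langle k\rangle^2\,\langle p\rangle^2, \\
\langle k+p\rangle^{s} &\le 2^{s/2}\,\langle k\rangle^{s}\,\langle p\rangle^{s}
   && (s \ge 0), \\
\langle k+p\rangle^{s} &\le 2^{|s|/2}\,\langle k\rangle^{|s|}\,\langle p\rangle^{s}
   && (s < 0,\ \text{using } \langle p\rangle \le \sqrt{2}\,\langle k\rangle\,\langle k+p\rangle).
\end{align*}
```

In both cases $w(k+p) \le \omega(k)w(p)$ holds with $\omega(k) = 2^{|s|/2}\langle k\rangle^{|s|}$, and this $\omega$ is submultiplicative by the first line. The function $w$ is divergent only when $s > 0$.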
Lemma 4.10. A continuous divergent function $w$ on $X^*$ is an $\omega$-weight if and only if $V_k\mathscr{H}^w \subset \mathscr{H}^w$ and $\|V_k\|_{B(\mathscr{H}^w)} \le \omega(k)$ for all $k$.
Proof. We may take $\|w(P)u\|$ as norm on $\mathscr{H}^w$. From $V_k^* w(P)V_k = w(P + k)$, we see that we have $V_k\mathscr{H}^w \subset \mathscr{H}^w$ and $\|V_k\|_{B(\mathscr{H}^w)} \le \omega(k)$ if and only if $\|w(P + k)u\| \le \omega(k)\|w(P)u\|$ for all $u$, which is equivalent to $w(k + p) \le \omega(k)w(p)$ for all $k, p$.
Proposition 4.11. A self-adjoint operator $H$ on $\mathscr{H}$ with $D(|H|^{1/2}) = \mathscr{H}^w$ for some divergent weight $w$ and such that $\lim_{k\to 0} \|[V_k, H]\|_{B(\mathscr{H}^w, \mathscr{H}^{w*})} = 0$ is affiliated to $\mathscr{C}$.
This is an immediate consequence of Proposition 4.6.
Proposition 4.12. Let $H$ be as in Proposition 4.11 and bounded from below. Let $V \in L^1_{\mathrm{loc}}(X)$ be a real function whose negative part is form bounded with respect to $H$ with relative bound strictly less than 1. Then the self-adjoint operator $H + V(Q)$ (form sum) is affiliated to $\mathscr{C}$.
Proof. Let $V_+$, $V_-$ be the positive and negative parts of $V$; then we define the sum as $(H - V_-) + V_+$ and apply successively Propositions 4.6 and 4.9.
Example 4.13. The most common situation is $X = \mathbb{R}^n$ and $w(k) = \langle k\rangle^s$ for some real $s > 0$. Then $\mathscr{H}^w$ is the usual Sobolev space $\mathscr{H}^s$ and typical operators satisfying the conditions of Proposition 4.11 are the uniformly elliptic operators of order $2s$. For example, let $s = m \ge 1$ be an integer and
$$L = \sum_{|\alpha|, |\beta| \le m} P^\alpha a_{\alpha\beta} P^\beta$$
for some measurable functions $a_{\alpha\beta} : X \to B(E)$ such that the operator of multiplication by $a_{\alpha\beta}$ is a continuous map $\mathscr{H}^{m-|\beta|} \to \mathscr{H}^{|\alpha|-m}$ (this is a very general assumption which allows one to give a meaning to the differential expression $L$). Then $L : \mathscr{H}^m \to \mathscr{H}^{-m}$ is a continuous map and $V_k^* L V_k$ is a polynomial in $k$. If $\langle u, Lu\rangle \ge \mu\|u\|^2_{\mathscr{H}^m} - \nu\|u\|^2$ for some $\mu, \nu > 0$, then $L$ induces a self-adjoint operator in $\mathscr{H}$ which is affiliated to $\mathscr{C}$.
Example 4.14. We give an explicit example of physical interest in the case $s = 1$. Let
$$H = \sum_{i,j} (P_i - A_i)G_{ij}(P_j - A_j) + V \equiv (P - A)G(P - A) + V, \tag{4.7}$$
where $G_{ij}$, $A_i$, $V$ are (the operators of multiplication by) locally integrable real functions having the following properties ($\|\cdot\|_1$ is the norm of $\mathscr{H}^1$):
(1) $G_{ij} \in L^\infty(X)$, the matrix $G(x) = (G_{ij}(x))$ is symmetric and $G(x) \ge \nu > 0$,
(2) for each $\varepsilon > 0$, there is $\delta \in \mathbb{R}$ such that $\|A_j u\| \le \varepsilon\|u\|_1 + \delta\|u\|$ for all $u \in \mathscr{H}^1$,
(3) if $V_-$ is the negative part of $V$, then for each $\varepsilon > 0$, there is a real number $\delta$ such that $\langle u, V_- u\rangle \le \varepsilon\|u\|_1^2 + \delta\|u\|^2$ for all $u \in \mathscr{H}^1$.
Note that the conditions on $A_j$ and $V_-$ are satisfied if there is $s < 1$ such that $\|A_j u\| \le C\|u\|_s$ and $\langle u, V_- u\rangle \le C\|u\|_s^2$. Then $H$ is affiliated to $\mathscr{C}$. Indeed, observe first that $H_0 \equiv (P - A)G(P - A)$ is a self-adjoint operator with form domain equal to $\mathscr{H}^1$, because there is $\delta$ such that:
$$\langle u, H_0 u\rangle \ge \nu\|(P - A)u\|^2 \ge \frac{\nu}{2}\|Pu\|^2 - \nu\|Au\|^2 \ge \frac{\nu}{4}\|Pu\|^2 - \delta\|u\|^2.$$
Hence, according to Proposition 4.12, it suffices to prove that $H_0$ is affiliated to $\mathscr{C}$. But
$$V_k^* H_0 V_k = (P - A + k)G(P - A + k) = H_0 + kG(P - A) + (P - A)Gk + kGk.$$
Thus, $\|V_k^* H_0 V_k - H_0\|_{B(\mathscr{H}^1, \mathscr{H}^{-1})} \le C(|k| + |k|^2)$ so we can use Proposition 4.11.
Remark 4.15. Let us consider the operator $H_0$ under the more general condition $A_j \in L^2_{\mathrm{loc}}(X)$. More precisely, $H_0$ is the positive self-adjoint operator associated to the closed quadratic form $\|(P - A)u\|^2$ whose domain is the set $\mathscr{G}$ of $u \in \mathscr{H}$ such that the distributions $(P_j - A_j)u$ belong to $\mathscr{H}$. The preceding computation shows that $V_k\mathscr{G} \subset \mathscr{G}$ and that (4.6) is satisfied. Hence, $H_0$ is affiliated to $\mathscr{C}$ if and only if $\mathscr{G} \subset D(\theta(P))$ for some continuous divergent function $\theta$. But this cannot be true without some boundedness conditions on $A$ at infinity.
As a final example, we consider singular perturbations of $h(P)$, where $h$ is a continuous divergent function $X^* \to \mathbb{R}$ and $X$ is an arbitrary group. Let $\mathscr{G} = D(|h(P)|^{1/2})$. Two functions $u, v$ on a neighborhood of infinity will be called equivalent if they satisfy $c_1|u(k)| \le |v(k)| \le c_2|u(k)|$ for all large $k$ and some constants $c_1, c_2 > 0$. It is clear that $\mathscr{G} = \mathscr{H}^w$ if and only if $h$ is equivalent to $w^2$. Then Proposition 4.12 implies:
Proposition 4.16. Let $h : X^* \to \mathbb{R}$ be a divergent function equivalent to a weight and such that
$$\lim_{k\to 0} \sup_p \frac{|h(p + k) - h(p)|}{1 + |h(p)|} = 0. \tag{4.8}$$
Let $W$ be a standard form perturbation of $h(P)$ with $\lim_{k\to 0} \|[V_k, W]\|_{B(\mathscr{G}, \mathscr{G}^*)} = 0$ and define $H_0 = h(P) + W$ as a form sum. Let $V \in L^1_{\mathrm{loc}}(X)$ be real and such that $V_- \le \mu H_0 + \nu$ on $\mathscr{G}$ for some $\mu < 1$, $\nu > 0$. Then the form sum $H = H_0 + V(Q)$ is a self-adjoint operator affiliated to $\mathscr{C}$.
Example 4.17. Let $X = \mathbb{R}^n$ and assume that $h$ is of class $C^1$ and satisfies $|h'(k)| \le C(1 + |h(k)|)$. Then (4.8) is fulfilled because
$$|h(p + k) - h(p)| \le \sup_{0<\theta<1} |h'(p + \theta k)|\,|k| \le C\Big(1 + \sup_{0<\theta<1} |h(p + \theta k)|\Big)|k|$$
which is $\le C'(1 + |h(p)|)|k|$ if $|k| \le 1$ because $h$ is equivalent to a weight. On the other hand, assume that $h$ is of class $C^m$ for some integer $m \ge 1$ and that we have:
(1) $\lim_{k\to\infty} h(k) = +\infty$,
(2) the derivatives of order $m$ of $h$ are bounded,
(3) $\sum_{|\alpha| \le m} |h^{(\alpha)}(k)| \le C(1 + |h(k)|)$.
Then from [1] we get that $h$ is equivalent to a weight. Any real hypoelliptic polynomial satisfies all these conditions, see [27, Definition 11.1.2 and Theorem 11.1.3].
5. Localizations at Infinity
In this section, we prove our main result, Theorem 5.11, and some easy consequences.
5.1. Localizations at infinity of functions
We define first the localizations at infinity for functions in $C(X)$. We denote by $C_s(X)$ the space $C(X)$ equipped with the topology given by the seminorms $\|\varphi\|_\theta = \|\varphi\theta\|$ with $\theta \in C_0(X)$ (this is the strict topology associated to the essential ideal $C_0(X)$).
Lemma 5.1. If $\varphi \in C(X)$ and $\kappa \in \delta X$, then $\kappa.\varphi(y) := \lim_{x\to\kappa} \varphi(x + y)$ exists locally uniformly in $y \in X$. Equivalently, we have $x.\varphi \to \kappa.\varphi$ in $C_s(X)$ if $x \to \kappa$ in $\gamma X$. The function $\kappa.\varphi$ belongs to $C(X)$ and we have $(\kappa.\varphi)(y) = \kappa(y.\varphi)$.
Proof. Since $\varphi$ is a bounded function, we have
$$\lim_{x\to\kappa} \varphi(x + y) = \lim_{x\to\kappa} (y.\varphi)(x) = \kappa(y.\varphi)$$
by taking into account the two interpretations of $\kappa$. Then we use Lemma 2.2.
Thus, $\kappa.\varphi \in C(X)$ is well defined for all $\kappa \in \gamma X$ (if $\kappa = x \in X$, see Sec. 3) and all $\varphi \in C(X)$. The next lemma is a slight improvement of Lemma 5.1; it will allow us to give a completely elementary proof of Theorem 5.16 (see the remark after the proof of the theorem). Note that the relation $(\kappa.\varphi)(y) = \kappa(y.\varphi)$ remains true for all $\kappa \in \gamma X$ if we interpret $x \in X$ as a character of $\ell^\infty(X)$. Since $y.\varphi \in C(X)$, we see that $\kappa.\varphi$ depends in fact only on the class of $\kappa$ in $\beta_u X$, cf. Sec. 2.5. We shall keep the notation $\kappa.\varphi$ even if $\kappa \in \beta_u X$. Recall that $X \subset \beta_u X$ is an open dense subset.
Lemma 5.2. Let $\varphi \in C(X)$. Then $X \ni x \mapsto x.\varphi \in C(X)$ extends to a continuous function $\beta_u X \ni \kappa \mapsto \kappa.\varphi \in C_s(X)$. We have $\kappa.\varphi(y) = \kappa(y.\varphi)$ for all $y \in X$.
Proof. For $\kappa \in \beta_u X = \sigma(C(X))$, the function $\kappa.\varphi$ is given by $\kappa.\varphi(y) = \kappa(y.\varphi)$, $y \in X$. It is easy to check directly that $\kappa.\varphi$ so defined belongs to $C(X)$: we have $|\kappa(y.\varphi)| \le \|y.\varphi\| = \|\varphi\|$ and
$$|\kappa(y.\varphi) - \kappa(z.\varphi)| = |\kappa(y.\varphi - z.\varphi)| \le \|y.\varphi - z.\varphi\| = \|(y - z).\varphi - \varphi\|.$$
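It remains to prove that $\kappa \mapsto (\kappa.\varphi)\theta \in C_0(X)$ is continuous for any $\theta \in C_0(X)$, i.e. that for each $\chi \in \beta_u X$, each $\varepsilon > 0$ and each $\theta \in C_0(X)$, there is a neighborhood $V$ of $\chi$ in $\beta_u X$ such that $\|(\kappa.\varphi - \chi.\varphi)\theta\| < \varepsilon$ if $\kappa \in V$. Since $\theta$ is $C_0$, it will suffice to prove that for each $\chi$ and $\varepsilon$ as before and each compact set $K \subset X$ there is a neighborhood $V$ of $\chi$ such that $\kappa \in V$ implies $|\kappa(y.\varphi) - \chi(y.\varphi)| < \varepsilon$ for $y \in K$. But the map $y \mapsto y.\varphi \in C(X)$ is norm continuous, thus $\{y.\varphi \mid y \in K\}$ is a compact subset of $C(X)$. Hence, there is a finite subset $Z$ of $K$ such that for each $y \in K$ we have $\min_{z\in Z} \|y.\varphi - z.\varphi\| < \varepsilon$. Thus, for each $y \in K$ and a corresponding $z \in Z$, we have
$$|\kappa(y.\varphi) - \chi(y.\varphi)| = |\kappa(y.\varphi - z.\varphi) + \kappa(z.\varphi) - \chi(z.\varphi) + \chi(z.\varphi - y.\varphi)| < 2\varepsilon + |\kappa(z.\varphi) - \chi(z.\varphi)|.$$
Now, if we take $V = \{\kappa \in \beta_u X \mid \sup_{z\in Z} |\kappa(z.\varphi) - \chi(z.\varphi)| < \varepsilon\}$, then $V$ is a neighborhood of $\chi$ in $\beta_u X$ because $Z$ is a finite set, and for each $\kappa \in V$ and each $y \in K$ we have $|\kappa(y.\varphi) - \chi(y.\varphi)| < 3\varepsilon$.
Lemma 5.3. If $\varphi \in C(X)$, then $\kappa.\varphi = 0$ for all $\kappa \in \delta X$ if and only if $\varphi \in C_0(X)$.
Proof. If $\kappa.\varphi = 0$ for all $\kappa \in \delta X$, then $\kappa(\varphi) = (\kappa.\varphi)(0) = 0$ for such $\kappa$. If $\varphi \notin C_0(X)$, then there is a number $a > 0$ such that the set $U = \{x \mid |\varphi(x)| > a\}$ is not relatively compact. Since $U \cap V \neq \emptyset$ for each $V$ with relatively compact complement, we see that the family of sets $U \cap V$ is a filter basis and the filter $f$ it generates is finer than the Fréchet filter and contains $U$. Let $\kappa$ be any ultrafilter finer than $f$; then $\kappa \in \delta X$ and $\kappa(1_U) = 1$. Finally, from $|\varphi| \ge a1_U$, we get $|\kappa(\varphi)|^2 = \kappa(|\varphi|^2) \ge a^2\kappa(1_U) = a^2$, so we cannot have $\kappa(\varphi) = 0$.
Definition 5.4. If $\varphi \in C(X)$ and $\kappa \in \delta X$ then the function $\kappa.\varphi \in C(X)$ is the localization of $\varphi$ at $\kappa$. And $(\varphi) := \{\kappa.\varphi \mid \kappa \in \delta X\} \subset C(X)$ is the set of localizations of $\varphi$ at infinity.
For each $\kappa \in \delta X$, let $\tau_\kappa : C(X) \to C(X)$ be given by $\tau_\kappa(\varphi) = \kappa.\varphi$. Clearly, this is a unital morphism and, since the property $x.(\kappa.\varphi) = \kappa.(x.\varphi)$ is easy to check, $\tau_\kappa$ is in fact an $X$-morphism. By Lemma 5.3, we have
$$\bigcap_{\kappa\in\delta X} \ker \tau_\kappa = C_0(X). \tag{5.1}$$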
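Three elementary cases on $X = \mathbb{R}$ may help fix ideas about Definition 5.4. They are illustrations chosen here, not examples taken from the text, and they use only the formula $(\kappa.\varphi)(y) = \lim_{x\to\kappa} \varphi(x + y)$ for an ultrafilter $\kappa \in \delta X$, that is, an ultrafilter finer than the Fréchet filter.

```latex
\begin{align*}
\varphi(x) = \tanh x:\quad
  & \kappa.\varphi \equiv +1 \ \text{if } [0,\infty) \in \kappa,
    \qquad \kappa.\varphi \equiv -1 \ \text{if } (-\infty,0) \in \kappa; \\
\varphi(x) = \sin x:\quad
  & (\kappa.\varphi)(y) = a\cos y + b\sin y,
    \quad a = \lim_{x\to\kappa}\sin x,\ \ b = \lim_{x\to\kappa}\cos x,\ \ a^2 + b^2 = 1; \\
\varphi \in C_0(\mathbb{R}):\quad
  & \kappa.\varphi = 0 \ \text{for every } \kappa \in \delta X .
\end{align*}
```

In the first case the localizations are constants, in the second they are translates of $\varphi$ itself, and the third case is the easy direction of Lemma 5.3.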
Note that $\ker \tau_\kappa$ is the maximal $X$-ideal included in the maximal ideal $\ker \kappa$ of $C(X)$.
Remark 5.5. In general $\tau_\kappa\tau_\chi \neq \tau_\chi\tau_\kappa$.
5.2. Localizations at infinity of operators
In this subsection, we extend the notion of localization to operators in $\mathscr{C}(X)$.
Definition 5.6. Let $\mathscr{C}_s(X)$ be the space $\mathscr{C}(X)$ equipped with the topology defined by the family of seminorms $\|T\|_\theta = \|T\theta(Q)\| + \|\theta(Q)T\|$ with $\theta \in C_0(X)$.
Note that if $X$ is $\sigma$-compact then there is $\theta \in C_0(X)$ with $\theta(x) > 0$ for all $x \in X$ and then $\|\cdot\|_\theta$ is a norm on $\mathscr{C}(X)$ which induces on bounded subsets of $\mathscr{C}(X)$ the topology of $\mathscr{C}_s(X)$. In any case, the topology of $\mathscr{C}_s(X)$ is finer than the strong operator topology induced by $B(X)$. Note also that the topology of $\mathscr{C}_s(X)$ does not depend on any Hilbert space realization of $\mathscr{C}(X)$ because $\mathscr{C}(X)$ is a $C(X)$-bimodule and $C_0(X)$ is an ideal of $C(X)$. Finally, observe that we could consider on $\mathscr{C}(X)$ the (intrinsically defined) strict topology associated to the ideal $\mathscr{K}(X)$; this is weaker than that of $\mathscr{C}_s(X)$ and finer than the strong operator topology (but coincides with it on bounded sets).
Remark 5.7. That this is the natural topology in our context should have been clear to us a long time ago, since it is induced by the strict topology of $C(X)$, cf. [20, p. 31] and [23, p. 148]. However, we did not realize it until B. Simon, in a private communication, emphasized its importance, in relation with [31, Proposition 3.11 and Theorem 4.5]. We are indebted to him for this remark. On the other hand, note that this topology does not play any role in our paper; the strong operator topology on $\mathscr{C}(X)$ (used in [20, 23]) suffices.
We now describe some topological properties of $\mathscr{C}_s(X)$.
Lemma 5.8. The map $T \mapsto T^*$ is continuous on $\mathscr{C}_s(X)$ and the operation of multiplication is continuous on bounded sets. If $T \in \mathscr{C}(X)$ the map $x \mapsto U_x T U_x^* \in \mathscr{C}(X)$ is norm continuous and the set $\{U_x T U_x^* \mid x \in X\}$ is relatively compact in $\mathscr{C}_s(X)$.
Proof. The first assertion is obvious. To prove the second one, note first that if $S \in \mathscr{C}(X)$ and $\theta \in C_0(X)$, then the operators $S\theta(Q)$ and $\theta(Q)S$ are compact. Indeed, it suffices to show this for $S$ of the form $\varphi(Q)\psi(P)$ and then the assertion is obvious. In particular, from Remark 3.13, it follows that there are $K \in \mathscr{K}(X)$ and $\theta' \in C_0(X)$ such that $S\theta(Q) = \theta'(Q)K$, and similarly for $\theta(Q)S$. Thus, for $A, B, S, T \in \mathscr{C}(X)$, we have
$$\|(BA - TS)\theta(Q)\| \le \|B\|\,\|(A - S)\theta(Q)\| + \|(B - T)\theta'(Q)\|\,\|K\|$$
from which the continuity of multiplication follows. The norm continuity of $x \mapsto U_x T U_x^*$ is obvious by (1.1). Finally, the last assertion of the lemma says that $x \mapsto U_x T U_x^*\theta(Q)$ has relatively compact range and similarly when $\theta$ is on the left side. Clearly it suffices to take $T = \varphi(Q)\psi(P)$ and then $U_x T U_x^*\theta(Q) = \varphi(Q + x)\psi(P)\theta(Q)$ and $\psi(P)\theta(Q)$ is a compact operator. Now, the assertion follows from the Riesz–Kolmogorov criterion (Remark 3.13) which clearly implies: if $K$ is a compact operator and $\varphi \in C(X)$ then $\varphi(Q + x)K$ is a norm relatively compact family of operators.
Proposition 5.9. If $T \in \mathscr{C}(X)$ and $\kappa \in \delta X$, then $\kappa.T := \lim_{x\to\kappa} U_x T U_x^*$ exists in the topological space $\mathscr{C}_s(X)$. The map $\tau_\kappa : \mathscr{C}(X) \to \mathscr{C}(X)$ defined by $\tau_\kappa(T) = \kappa.T$
is a morphism uniquely determined by the property: for ϕ ∈ C(X) and ψ ∈ C0(X∗),
$$\tau_\kappa(\varphi(Q)\psi(P)) = (\kappa.\varphi)(Q)\,\psi(P). \tag{5.2}$$
If T ∈ 𝒞(X) and ψ : X∗ → ℂ is a bounded continuous function, then τκ(Tψ(P)) = τκ(T)ψ(P) and τκ(ψ(P)T) = ψ(P)τκ(T). For each k ∈ X∗, we have
$$\tau_\kappa(V_k^* T V_k) = V_k^*\,\tau_\kappa(T)\,V_k. \tag{5.3}$$
Proof. We must show that there is an operator κ.T ∈ 𝒞(X) such that
$$\lim_{x\to\kappa} \|(U_x T U_x^* - \kappa.T)\,\theta(Q)\| = \lim_{x\to\kappa} \|\theta(Q)\,(U_x T U_x^* - \kappa.T)\| = 0$$
for all θ ∈ C0(X). It is clearly sufficient to consider T = ϕ(Q)ψ(P) with ϕ ∈ C(X) and ψ ∈ C0(X∗). Then we have Ux T Ux∗ θ(Q) = ϕ(Q + x)ψ(P)θ(Q) = ϕ(Q + x)θ′(Q)K for some θ′ ∈ C0(X) and K ∈ 𝒦(X). Indeed, ψ(P)θ(Q) is a compact operator and so we can use Remark 3.13. Now it suffices to use Lemma 5.1. The argument for θ(Q)Ux T Ux∗ is even simpler. The other assertions are easy to prove; for example, the last assertion follows from Vk∗ Ux T Ux∗ Vk = Ux Vk∗ T Vk Ux∗.
Proposition 5.10. Let T ∈ 𝒞(X). Then κ.T = 0 for each κ ∈ δX if and only if T ∈ 𝒦(X).
Proof. In order to prove that κ.T = 0 if T ∈ 𝒦(X) it suffices to consider T = ϕ(Q)ψ(P) with ϕ ∈ C0(X). Then κ.T = (κ.ϕ)(Q)ψ(P) and κ.ϕ = 0 if ϕ ∈ C0(X). Reciprocally, let 𝒥 = {T ∈ 𝒞(X) | κ.T = 0, ∀ κ ∈ δX} and notice that 𝒥 is a C∗-algebra and, moreover, it is a crossed product because of the last assertions of Proposition 5.9. Also, for each S ∈ 𝒞(X) we have κ.(ST) = (κ.S)(κ.T) = 0, so 𝒥 is an ideal. Thus, by Proposition 3.15, there is an ideal J in C(X) such that 𝒥 = J ⋊ X. Let us show that J = C0(X). This will finish the proof, because then 𝒥 = C0(X) ⋊ X = 𝒦(X). From (3.3), we get J = {ϕ ∈ C(X) | κ.(ϕ(Q)ψ(P)) = 0 ∀ ψ ∈ C0(X∗) and ∀ κ ∈ δX}. But κ.(ϕ(Q)ψ(P)) = (κ.ϕ)(Q)ψ(P). On the other hand, if θ ∈ C(X) is such that θ(Q)ψ(P) = 0 ∀ ψ ∈ C0(X∗), then θ = 0. Indeed, Vk∗ θ(Q)ψ(P)Vk = θ(Q)ψ(P + k), so if ψ ∈ L1(X∗), then we have in the weak operator topology
$$0 = \int_{X^*} V_k^*\,\theta(Q)\psi(P)\,V_k\,dk = \theta(Q)\int_{X^*}\psi(P+k)\,dk = \theta(Q)\int_{X^*}\psi\,dk.$$
Thus, it suffices to take ψ such that ∫X∗ ψ dk ≠ 0. Finally, we see that J is the set of ϕ ∈ C(X) such that κ.ϕ = 0 for all κ ∈ δX, i.e. J = C0(X) by Lemma 5.3.
The next result follows easily from Propositions 5.9 and 5.10.
Theorem 5.11. The map T → (κ.T)κ∈δX is a morphism from 𝒞(X) to the product ∏κ∈δX 𝒞(X) with 𝒦(X) as kernel, so we have a canonical embedding
$$\mathscr{C}(X)/\mathscr{K}(X) \hookrightarrow \prod_{\kappa\in\delta X} \mathscr{C}(X). \tag{5.4}$$
Theorem 1.1 is an immediate consequence. As explained in Sec. 2.2, the morphism τκ extends to observables affiliated to A and Theorem 1.2 follows easily.
Remark 5.12. It has been brought to our attention by Steffen Roch that it is not possible to deduce Theorem 1.1 for non-normal operators from Theorem 5.11, as we stated in an earlier version of this paper, because the spectrum of a general element of an infinite product of C∗-algebras is not so simply related to the spectra of its components. We could have stated a version of Theorem 1.1 valid for non-normal operators in the spirit of [40, Theorem 2.2.1], but we did not do so because the only applications we have in mind refer to quantum Hamiltonians, which are self-adjoint operators. We mention, however, that for some algebras A Theorem 1.15 remains true (without closure) for non-normal operators; see [40, Secs. 2.4 and 2.5].
Definition 5.13. If H is an observable affiliated to 𝒞(X) and if κ ∈ δX, then the observable κ.H affiliated to 𝒞(X) is called the localization of H at κ. The set of operators (H) := {κ.H | κ ∈ δX} is the set of localizations of H at infinity. Then we can write the relation (1.4) as follows:
$$\sigma_{\mathrm{ess}}(H) = \overline{\bigcup_{\kappa\in\delta X} \sigma(\kappa.H)} = \overline{\bigcup_{K\in(H)} \sigma(K)}. \tag{5.5}$$
Remark 5.14. By using the universal property of the Stone–Čech compactification γX (cf. Sec. 2.4) we see that for T ∈ B(X) the following two assertions are equivalent: (1) the set {x.T | x ∈ X} is strongly relatively compact in B(X); (2) the map X ∋ x → x.T extends to a strongly continuous map γX ∋ κ → κ.T ∈ B(X). The set of operators having these properties is a norm closed subalgebra of B(X) (quite large: it contains 𝒞(X), L∞(X), L∞(X∗) and much more). It is easy to check that σ(κ.T) ⊂ σess(T) if κ ∈ δX, but in most cases the operators κ.T do not suffice to determine the essential spectrum of T. This fact extends to observables affiliated to this algebra. For example, if H is the Hamiltonian of a particle in 2 dimensions in a constant non-zero magnetic field, then T = ϕ(H) has the property (1) and κ.T = 0 if ϕ ∈ C0(ℝ), i.e. κ.H = ∞ for all κ ∈ δX. But σess(H) ≠ ∅.
5.3. Localizations at infinity of algebras
We fix now an algebra of interactions A on X, and set 𝒜 = A ⋊ X ⊂ 𝒞(X). Theorem 5.11 gives a description of 𝒜/𝒦(X) but we can make it more precise because many ultrafilters give the same character of A.
Definition 5.15. If κ ∈ δX the C∗-algebras Aκ = τκ(A) and 𝒜κ = τκ(𝒜) = Aκ ⋊ X are the localizations at κ of the algebras A and 𝒜, respectively.
As explained in Sec. 2.5, and taking into account the relation (κ.ϕ)(y) = κ(y.ϕ) (see Lemma 5.1) and Lemma 2.1, we see that Aκ and 𝒜κ depend only on the restriction to A of the character κ. In other terms, we have, for example, Aκ = Aχ if δ(κ) = δ(χ), where δ : δX → δ(A) is the canonical surjection, cf. (2.15). According to the convention made in Sec. 2.5 we shall use the same notations Aκ and 𝒜κ if κ ∈ δ(A). In the statement of the next theorem, we use the canonical identification of X (as topological space) with an open dense subset of σ(A).
Theorem 5.16. If T ∈ 𝒜, the norm continuous map X ∋ x → x.T ∈ 𝒜 ⊂ 𝒞(X) extends to a continuous map σ(A) ∋ κ → κ.T ∈ 𝒞s(X). For each κ ∈ δ(A), the map τκ : 𝒜 → 𝒞(X) defined by τκ(T) = κ.T is a morphism with 𝒜κ as range. One has κ.T = 0 for all κ ∈ δ(A) if and only if T ∈ 𝒦(X), which gives a canonical embedding
$$\mathscr{A}/\mathscr{K}(X) \hookrightarrow \prod_{\kappa\in\delta(A)} \mathscr{A}_\kappa. \tag{5.6}$$
Proof. Consider for each T ∈ A the map FT : γX → Cs (X) defined by FT (κ) = κ.T . From Lemma 5.2 it follows that FT is continuous: indeed, it suffices to assume that T = ϕ(Q)ψ(P ) and to argue as in the proof of Lemma 5.8. Notice that if the characters κ, χ ∈ γX are equal on A, then FT (κ) = FT (χ). Indeed, for T as above we have κ.T = (κ.ϕ)(Q)ψ(P ) = (χ.ϕ)(Q)ψ(P ) = χ.T . Thus, as explained in Sec. 2.5, if π : γX → σ(A ) is the canonical surjection, we shall have FT = fT ◦ π, where fT : σ(A ) → Cs (X) is continuous. If x ∈ X, then π(x) = x so fT (x) = FT (x) = x.T . We have both X ⊂ σ(A ) and X ⊂ γX and since the restriction of π to X is the identity mapping, π acts nontrivially only on the boundary. Let δ be the restriction of the map π to δX, hence δ : δX → δ(A ) is a canonical surjection. Thus, fT (κ) = 0 for all κ ∈ δ(A ) is equivalent to FT (κ) = 0 for all κ ∈ δX which means that T ∈ K (X). Remark 5.17. By using the last assertion of Lemma 5.8 and the universal property of the space γX, cf. Sec. 2.4, one may avoid the use of Lemma 5.2. Remark 5.18. In nice situations, the localization at infinity Aκ is simpler than A, and (Aκ )χ is still simpler, and so on, but this is not always the case. Note also that in general Aκ ⊂ A. If, however, this holds for each κ ∈ δ(A), then it is natural to ask whether we have τκ τχ ϕ = τχ τκ ϕ for all ϕ ∈ A and all κ, χ ∈ δ(A). Although this is not true if A = C(X), in several nontrivial and physically interesting situations this property is satisfied. See Examples 5.19 and 5.20 and Sec. 6. Example 5.19. We shall consider here the localizations at infinity of the simplest algebras. If A = C∞ (X) then σ(A) = X ∪ {∞} is the Alexandroff compactification of X, we have δ(A) = {∞}, and the localization of ϕ ∈ A at ∞ is the constant
function which takes the value ϕ(∞) = limx→∞ ϕ(x). If X = ℝ and A is the set of bounded continuous functions which have limits as x → ±∞, then σ(A) = [−∞, +∞], δ(A) = {−∞, +∞}, and the localization of ϕ ∈ A at +∞ is again the constant function which takes the value ϕ(+∞) = limx→+∞ ϕ(x), and similarly for the localization at −∞. Thus, in both examples, we have Aκ = ℂ for all κ ∈ δ(A). In Sec. 6.2, we shall describe explicitly the largest X-subalgebra A ⊂ C(X) such that Aκ = ℂ for all κ ∈ δ(A).
Example 5.20. The next example is due to Gilles Godefroy (we thank him for answering our questions) and is relevant in the context of Remark 5.18. Let X = ℤ × ℤ and let A be the set of ϕ ∈ ℓ∞(X) such that limk→∞ ϕ(j, k) = 0 for all j ∈ ℤ. Let θ ∈ ℓ∞(ℤ) and set ϕ(j, k) = θ(k) if |k| ≤ j and ϕ(j, k) = 0 otherwise. Then ϕ ∈ A and lima→+∞ ϕ(a + j, k) = θ(k) for each j, k. It is clear now that we may construct an ultrafilter κ ∈ δX such that κ.ϕ = 1 ⊗ θ, so κ.ϕ ∉ A in general.
Theorem 1.15 is a corollary of Theorem 5.16. Thus, if H is a normal element of 𝒜 or an observable affiliated to 𝒜 and if we set κ.H = τκ(H), then
$$\sigma_{\mathrm{ess}}(H) = \overline{\bigcup_{\kappa\in\delta(A)} \sigma(\kappa.H)}. \tag{5.7}$$
This representation of the essential spectrum of H, although more precise than (5.5), is still quite redundant, cf. Sec. 1.3, and can be improved in many situations (the most interesting one being the N-body case). To explain this, for κ ∈ δ(A) let us denote
$$J_\kappa = \ker\tau_\kappa = \{\varphi\in A \mid \kappa(x.\varphi) = 0\ \ \forall\, x\in X\}. \tag{5.8}$$
This is the maximal X-ideal included in the maximal ideal ker κ of A. Although the ideals ker κ for different κ are not comparable, it often happens that the Jκ are comparable, i.e. we may have Jκ ⊂ Jχ for κ ≠ χ.
Lemma 5.21. If Jκ ⊂ Jχ then σ(Hχ) ⊂ σ(Hκ). In particular, (5.7) remains true if we restrict the union to the κ such that the ideal Jκ is minimal in {Jκ | κ ∈ δ(A)}.
Proof. Here, we use more abstract algebraic tools, as in [20, 22]. The morphism τκ : A → Aκ is surjective and has Jκ as kernel, hence induces an isomorphism A/Jκ ≅ Aκ. If T ∈ A and if T/Jκ is its projection in the quotient A/Jκ, then T/Jκ is sent by this isomorphism into κ.T, hence σ(T/Jκ) = σ(κ.T). From Jκ ⊂ Jχ, we get a canonical surjective morphism A/Jκ → A/Jχ which sends T/Jκ into T/Jχ. Finally, recall that σ(Φ(S)) ⊂ σ(S) if Φ is a morphism.
Example 5.22. If, for x ∈ X and κ ∈ δ(A), we denote by x + κ the character κ ◦ τx, then clearly Jx+κ = Jκ, hence σ((x + κ).H) = σ(κ.H). However, this case is trivial because clearly (x + κ).H = Ux(κ.H)Ux∗.
One further simplification may be obtained as follows.
Lemma 5.23. Let K ⊂ δ(A) be such that: if ϕ ∈ A and κ(x.ϕ) = 0 for all κ ∈ K and x ∈ X, then ϕ ∈ C0(X). Then (5.7) remains valid if δ(A) is replaced by K.
Proof. This is a consequence of the proof of Theorem 5.16 but can also be proved directly as follows. One first notices that the condition on K is equivalent to the density in δ(A) = σ(A/C0(X)) of the set of characters of the form κ ◦ τx, with κ ∈ K and x ∈ X. Then one can use the following easily proven fact: if Sα is a net of operators such that $S_\alpha^{(*)} \to S^{(*)}$ strongly (that is, both Sα → S and Sα∗ → S∗ strongly), then σ(S) is included in the closure of ⋃α σ(Sα).
6. Applications
After some preliminaries, we describe here three classes of C∗-algebras of Hamiltonians which seem to us particularly relevant and treat some more explicit examples.
6.1. Algebras associated to translation invariant filters
In this preliminary subsection, we give an intrinsic description of a class of crossed products introduced in [20, 22]. Recall that a filter f is translation invariant if: x ∈ X, F ∈ f ⇒ x + F ∈ f. Note that f◦ will also be translation invariant. If f is a translation invariant filter let
$$\mathcal{J}(f) = \{\varphi\in C(X) \mid \lim_f \varphi = 0\}. \tag{6.1}$$
This is clearly an X-ideal in C(X) and from Lemma 2.2, we get:
$$\mathcal{J}(f) = \mathcal{J}(f^\circ). \tag{6.2}$$
Then C(f) = C + J (f) is the X-algebra consisting of the bounded uniformly continuous functions ϕ such that limf ϕ exists. Observe that if f is the Fr´echet filter then J (f) = C0 (X) and C(f) = C∞ (X). Below we shall consider nets indexed by the filter f equipped with the order relation F ≤ G ⇔ F ⊃ G. For example, limF ∈ f 1F (Q)T = 0 means that for each ε > 0 there is a Borel set F ∈ f such that 1F (Q)T < ε. Proposition 6.1. J (f) X = {T ∈ C (X) | limF ∈ f 1F (Q)T (∗) = 0}. Proof. Each T ∈ J (f) X has the property limF ∈ f 1F (Q)T = 0. Indeed, it suffices to consider operators of the form T = ϕ(Q)ψ(P ) with ϕ ∈ J (f), ψ ∈ C0 (X ∗ ). But then the set F of points x such that |ϕ(x)| < ε is open and belongs to f, and so we have 1F (Q)ϕ(Q) ≤ ε, which is more than needed. Conversely, let J be the set of T ∈ C (X) such that limF ∈ f 1F (Q)T (∗) = 0. This is clearly a C ∗ -subalgebra of C (X) which is stable under the morphisms T → Vk∗ T Vk . By Theorem 3.7, we have J = J X for a unique X-algebra J , namely the set of ϕ ∈ C(X) such that limF ∈ f 1F (Q)ϕ(Q)(∗) ψ(P ) = 0 for all ψ ∈ C0 (X ∗ ).
Thus, it remains to prove the following assertion: if ϕ ∈ C(X) has the property limF ∈ f 1F (Q)ϕ(Q)ψ(P ) = 0 for ψ ∈ C0 (X ∗ ), then limF ∈ f 1F (Q)ϕ(Q) = 0. Observe that, due to (6.2) we may assume f = f◦ . Fix f ∈ L2 (X), ψ ∈ C0 (X ∗ ) and let us set θ = ψ(P )f and θa (x) = (Ua∗ θ)(x) = θ(x − a). Clearly limF ∈ f 1F (Q)ϕ(Q)Ua∗ θ = 0 uniformly in a ∈ X. Thus, for any ε > 0, there is F ∈ f Borel such that 1F ϕθa < ε for all a, hence |ϕ(a)|1F θa ≤ 1F (ϕ(a) − ϕ)θa + 1F ϕθa ≤ 1F (ϕ(a) − ϕ)θa + ε. Since f = f◦ we may assume that F = G + V where G ∈ f and V is a a compact neighborhood of the origin. Moreover, since ϕ is uniformly continuous and since we may choose V as small as we wish, we may assume that |ϕ(x) − ϕ(a)| < ε if x − a ∈ V . It is possible to choose f, ψ such that supp θ ⊂ V and θ = 1. Indeed, θ is equal to the convolution product ψ ∗ f where ψ(x) = ψ(−x) and it suffices to choose f, ψ continuous, positive and not zero and such that supp f + supp ψ ⊂ V . Then for a ∈ G, we clearly have supp θa ⊂ F , hence |ϕ(a)| = |ϕ(a)| θa = |ϕ(a)| 1F θa ≤ (ϕ(a) − ϕ)θa + ε ≤ 2ε. This proves that limf ϕ = 0. From Proposition 6.1 we easily get: C(f) X = T ∈ C (X) | ∃ S ∈ C0 (X ∗ ) such that lim 1F (Q)(T − S)(∗) = 0 .
F ∈f
The X-algebras of the form λ C(fλ ) are of some physical interest [43]. Indeed, one should think of a filter finer than the Fr´echet filter as the set of traces on X of the filter of neighborhoods of some closed part of the boundary of X in a compactification of X. This explains the interest of the algebras λ C(fλ ) in the present context: they consist of “potentials” which have limits at infinity when going in certain directions. One may easily deduce from Theorem 3.14 and Proposition 6.1 an intrinsic description of the crossed products λ C(fλ ) X. 6.2. The V (X) algebra We shall consider now the simplest nontrivial functions in C(X), those all of whose localizations at infinity are constants. Our purpose is to give a simple characterization of the X-algebra A defined by the condition Aκ = C for all κ ∈ δX and of the associated crossed product. So we introduce the X-algebra: V(X) := {ϕ ∈ C(X) | κ.ϕ ∈ C, ∀κ ∈ δX}.
(6.3)
Observe that the relation κ.ϕ ∈ ℂ is equivalent to κ.ϕ = κ(ϕ).
Lemma 6.2. We have ϕ ∈ V(X) if and only if ϕ ∈ C(X) and
$$\lim_{x\to\infty}\,\bigl(\varphi(x+y) - \varphi(x)\bigr) = 0 \quad \forall\, y\in X. \tag{6.4}$$
Proof. The condition (6.4) is equivalent to y.ϕ − ϕ ∈ C0 (X) for all y ∈ X and, by (5.1), this is equivalent to κ(y.ϕ − ϕ) = 0 for all κ ∈ δX and all y, hence to κ.ϕ(y) = κ(ϕ) for all κ, y, which means ϕ ∈ V(X). It is easily shown that ϕ ∈ C(X) satisfies (6.4) if and only if ϕ is a bounded continuous function such that limx→∞ (ϕ(x + y) − ϕ(x)) = 0 uniformly in y when y runs over a compact neighborhood of the origin. Thus, the functions from V(X) are of vanishing oscillation at infinity or slowly oscillating, and their role in the theory of pseudo-differential operators was noticed a long time ago due especially to a well-known theorem of H. Cordes concerning the compactness of the commutators [ϕ(Q), ψ(P )] (see [1, pp. 176–177] for a short presentation of the main ideas). If X = Rn , then V(X) is just the norm closure of the set of bounded functions of class C 1 whose derivative tends to zero at infinity. Thus, results of the same nature as the embedding (6.6) may be found already in [49]. The algebra V(X) was systematically considered in the works [36, 38–40]; see especially [40] where one may find references to other earlier papers. Although the authors emphasize the case X = Zn , it is clear for us that their methods extend to many other groups. On the other hand, since they allow the functions ϕ to be Banach space valued, the applications of their theory cover directly the case of operators on L2 (Rn ), for example (this involves a certain discretization technique). In particular, [40, Theorems 2.4.2 and Corollary 2.4.28] are much stronger than our next Proposition 6.3 in the case X = Zn . Taking into account the wealth of informations and applications in connection to these question which may be found in [40, Chap. 2, 4, 5], we decided to keep this section to a minimum, just to point out the special role of the algebra V(X) in the crossed product formalism. 
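As a numerical illustration of condition (6.4) in the case X = ℝ, the following minimal sketch compares a slowly oscillating function with one that is not. The test function sin(log(1 + |x|)), the helper name `oscillation` and the sampling window are assumptions chosen for this illustration and do not come from the paper; a finite sample can only suggest, not prove, the limit behaviour.

```python
import math

def oscillation(phi, y, radius, window=10.0, step=0.01):
    """Largest |phi(x + y) - phi(x)| over sample points x in [radius, radius + window]."""
    n = int(round(window / step))
    return max(
        abs(phi(radius + i * step + y) - phi(radius + i * step))
        for i in range(n + 1)
    )

def slow(x):
    """sin(log(1 + |x|)): bounded, with |derivative| <= 1 / (1 + |x|) away from 0."""
    return math.sin(math.log1p(abs(x)))

if __name__ == "__main__":
    for radius in (1e2, 1e4, 1e6):
        # First column shrinks like 1 / radius; second column stays of order 1.
        print(radius, oscillation(slow, 1.0, radius), oscillation(math.sin, 1.0, radius))
```

By the mean value theorem the first function satisfies |ϕ(x + y) − ϕ(x)| ≤ |y|/(1 + x) for x, y > 0, which is consistent with (6.4), whereas for sin the difference sin(x + 1) − sin(x) = 2 sin(1/2) cos(x + 1/2) does not tend to zero.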
More recently, the relevance of V(X) in questions related to the computation of the essential spectrum has been independently noticed in [31, 32]. We mention that the compactification σ(V(X)) and the boundary υX = δ(V(X)) are called Higson compactification and Higson corona of X and play an important role in recent questions of topology, C∗-algebras, K-theory, etc. [45, 46]. Finally, we note that a non-abelian version of V(X) appears in a natural way in the spectral analysis of Schrödinger operators on a tree X, see [17].
We now give an intrinsic description of the crossed product 𝒱(X) = V(X) ⋊ X and a more specific decomposition of the essential spectrum.
Proposition 6.3. We have
$$\mathscr{V}(X) = \{T\in\mathscr{C}(X) \mid \kappa.T\in C_0(X^*)\ \ \forall\,\kappa\in\delta X\}. \tag{6.5}$$
If T ∈ 𝒱(X) then the map κ → κ.T ∈ C0(X∗) is norm continuous, hence (5.6) takes the more precise form
$$\mathscr{V}(X)/\mathscr{K}(X) \hookrightarrow C(\upsilon X;\, C_0(X^*)). \tag{6.6}$$
In particular, (T) = {κ.T | κ ∈ υX} ⊂ C0(X∗) is a compact set. If H is a normal element of 𝒱(X) or is an observable affiliated to 𝒱(X) then:
$$\sigma_{\mathrm{ess}}(H) = \bigcup_{\kappa\in\upsilon X} \sigma(\kappa.H) = \bigcup_{K\in(H)} \sigma(K). \tag{6.7}$$
Proof. To show the inclusion ⊂ in (6.5) and the norm continuity of the map κ → κ.T ∈ C0 (X ∗ ) it suffices to consider T = ϕ(Q)ψ(P ) with ϕ ∈ V(X) and ψ ∈ C0 (X ∗ ). But then κ.T = κ(ϕ)ψ(P ) and these facts become obvious. Note
that the compactness of the set (T ) implies that the union T ∈ (T ) σ(T ) is closed, hence (6.7) is true. It remains to show the inclusion ⊃ in (6.5). Since κ.(Vk∗ T Vk ) = Vk∗ (κ.T )Vk and (κ.T )Ux = κ.(T Ux ) and since C0 (X ∗ ) is stable under the automorphism generated by Vk and under multiplication by Ux , it is clear that the right-hand side of (6.5) satisfies Landstad’s conditions. Hence, Theorem 3.7 shows that it suffices to prove that if ϕ ∈ C(X) has the property (κ.ϕ)(Q)ψ(P ) ∈ C0 (X ∗ ) for all ψ ∈ C0 (X ∗ ) and all κ ∈ δX, then ϕ ∈ V(X). Thus, it suffices to show that if ξ ∈ C(X) and ξ(Q)ψ(P ) ∈ C0 (X ∗ ) for all ψ ∈ C0 (X ∗ ), then ξ is a constant. But we have ξ(Q)ψ(P ) = Ux ξ(Q)ψ(P )Ux∗ = ξ(x + Q)ψ(P ), hence (ξ(Q) − ξ(x + Q))ψ(P ) = 0 for all ψ, so ξ(Q) = ξ(x + Q) for all x. Remark 6.4. If the reader has any difficulty in proving that the union in (6.7) is closed, he should look at the proof of [12, Theorem 2.10]. Remark 6.5. V (X) is the largest crossed product A such that A /K (X) is
abelian. Indeed, A /K (X) → κ ∈ δ(A) Aκ by (5.6) and the Aκ are crossed products, so Aκ is abelian if and only if Aκ = {0} or Aκ = C0 (X ∗ ). Remark 6.6. The observables affiliated to C0 (X ∗ ) are functions of momentum, so that it is natural to call them free Hamiltonians. Then we may describe in physical terms V (X) as the largest C ∗ -algebra of energy observables such that if H is affiliated to it, then all its localizations at infinity are free Hamiltonians. Remark 6.7. We reconsider here the question of Remark 5.18 for A = V(X). If κ ∈ δX then τκ : V(X) → C is just the character associated to κ and so if χ ∈ δX then τχ τκ ϕ = τκ ϕ = τχ ϕ = τκ τχ ϕ in general. 6.3. More remarks on filters The following general remarks will be useful in the next subsections. Let Y be a closed subspace of X (thus K ∩ Y is compact for each compact K ⊂ X). If f is a filter on Y then f can be seen as a filter basis on X and we shall denote (just for a moment) by fX the filter on X that it generates (this is the set of subsets of X which contain a set from f). The map f → fX is an injective map from the
set of filters on Y onto the set of filters on X which contain Y. Indeed, we have f = {F ∩ Y | F ∈ fX}. It is also clear that if f is an ultrafilter on Y then fX is also an ultrafilter. Finally, if f is finer than Fréchet on Y, then fX is finer than Fréchet on X. Since Y ∈ fX, if T : X → Z, then limfX T exists if and only if limf T|Y exists and then they are equal. From now on, we shall not distinguish fX from f, so we use the same notation f for both. In particular, we get natural embeddings
$$\gamma Y \subset \gamma X \quad\text{and}\quad \delta Y \subset \delta X. \tag{6.8}$$
It is convenient to understand this when the ultrafilters are interpreted as characters. We have an obvious embedding ∞ (Y ) ⊂ ∞ (X) so each character of ∞ (X) gives a character of ∞ (Y ) by restriction, and reciprocally, each character of ∞ (Y ) has a canonical extension to a character of ∞ (X), namely κ(ϕ) := κ(ϕ1Y ). Thus: γY = {κ ∈ γX | κ(Y ) = 1} and δY = γY ∩ δX. It is easy to see now that γY is a clopen subset of γX, equal to the closure of Y in γX. One says that a filter on a topological space is convergent to some point x if it is finer than the filter of neighborhoods of x. Any ultrafilter on a compact space is convergent. This is easily seen to be equivalent to any of the usual definitions of compactness [5, Chap. 1, Sec. 9]. It is easy now to understand the universal property of γX, cf. Sec. 2.4. We first observe that γ should be considered as a functor from the category of sets into the category of compact spaces. Indeed, if X, Y are sets and θ : X → Y , then it is obvious how to define γθ : γX → γY if ultrafilters are thought as characters: note first that ϕ → ϕ ◦ θ is a morphism θ∗ : ∞ (Y ) → ∞ (X) and then if κ ∈ γX define γθ(κ) as the character of ∞ (Y ) given by γθ(κ) = κ ◦ θ∗ . The continuity of γθ is clear. Now, assume Y is a compact topological space. The only thing we need to accept is that σ(C(Y )) = Y , this is not difficult to prove directly. Then we have a natural continuous map γY χ → χ ∈ Y which associates to a character χ of ∞ (Y ) its restriction to C(Y ). In fact, the ultrafilter χ is convergent and χ is just its limit. Finally, κ → γθ(κ) is the unique extension of θ to a continuous map γX → Y . 6.4. Sparse sets From the point of view of the complexity of the interactions, the algebra of interactions that one should consider next is A = {ϕ ∈ C(X) | κ.ϕ ∈ C∞ (X), ∀ κ ∈ δX}.
(6.9)
The corresponding algebra of energy observables is
$$\mathscr{A} = A\rtimes X = \{T\in\mathscr{C}(X) \mid \kappa.T\in\mathscr{T}(X)\ \ \forall\,\kappa\in\delta X\}. \tag{6.10}$$
Thus 𝒜 is the largest C∗-algebra of energy observables such that all the localizations at infinity of a Hamiltonian H affiliated to it are two-body Hamiltonians. We shall leave for the second part of our work the study of the algebra (6.10) and we shall consider here only subalgebras corresponding to Klaus-type potentials.
Remark 6.8. The algebra A defined by (6.9) is characterized by Aκ = C∞(X) for each κ, hence contains C∞(X) and is stable under all the morphisms τκ. It is also clear that τχτκϕ ∈ ℂ and is distinct from τκτχϕ in general, cf. Remark 5.18.
M. Klaus discovered in [29] the following class of Hamiltonians with nontrivial essential spectrum. Let L ⊂ ℝ be a discrete set such that the distance between two successive points of L tends to infinity when we approach infinity. For each l ∈ L, let Vl ∈ L1(ℝ) be real and such that ‖Vl‖L1 ≤ A and supp Vl ⊂ [−A, A] for a fixed finite A. Denote H = P² + ∑l Vl(Q − l) and Hl = P² + Vl(Q). Then the description of σess(H) given in [29] is equivalent to:
$$\sigma_{\mathrm{ess}}(H) = \bigcap_{F\in\mathcal{F}(L)}\ \overline{\bigcup_{l\in F^c} \sigma(H_l)}, \tag{6.11}$$
where F(L) is the set of finite subsets of L and Fc = L \ F. One of the main examples in [20, 22] consisted in an algebraic treatment of this example, based on the construction of a C∗-algebra to which operators like H are affiliated. We recall below the definition of this type of algebras and then we shall give a description of σess(H) for the operators affiliated to them which is more in the spirit of Theorem 1.1 (a description which also appears in [20, 22] but which is deduced there by very different means).
If L, Λ are subsets of X we denote LΛ = L + Λ and LΛc = X \ LΛ. If L has the property LΛ ≠ X for each compact Λ, then we associate to it the filter
$$f_L = \{A\subset X \mid A\supset L_\Lambda^c \text{ for some compact } \Lambda\subset X\}. \tag{6.12}$$
This is clearly a translation invariant filter finer than the Fréchet filter and such that fL◦ = fL. Thus,
$$C_L(X) = \{\varphi\in C(X) \mid \lim_{f_L}\varphi \text{ exists}\} \tag{6.13}$$
is an algebra of interactions on X. An intrinsic description of the corresponding algebra of Hamiltonians 𝒞L(X) follows immediately from the results of Sec. 6.1. Let
$$\delta_L X = \delta(C_L(X)) = \sigma(C_L(X))\setminus X \tag{6.14}$$
be the boundary of X in the compactification associated to CL(X). We recall that δLX is a quotient of δX. We set
$$\infty_L = \{\kappa\in\gamma X \mid \kappa\supset f_L\} = \{\kappa\in\gamma X \mid L_\Lambda^c\in\kappa \text{ if } \Lambda\subset X \text{ is compact}\}. \tag{6.15}$$
This is a compact subset of δX and if κ ∈ ∞L, then κ(ϕ) = limfL ϕ, so that ∞L gives just a point in δLX. The problem that remains to be solved is the description of the other points of δLX.
In this subsection, we consider only the case when L is a sparse set, in the following sense: L is locally finite and for each compact set Λ there is a co-finite set M ⊂ L (i.e. such that L\M is finite) with the following property: if m ∈ M and l ∈ L, l ≠ m, then (m + Λ) ∩ (l + Λ) = ∅. With the conventions made in Sec. 6.3, we have δL ⊂ δX; more explicitly, for κ ∈ δL and ϕ ∈ ℓ∞(X) we set
$$\kappa(\varphi) \equiv \kappa(\varphi 1_L) = \lim_{l\to\kappa} \varphi(l).$$
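The sparseness condition can be illustrated on a finite sample of points of the real line. The sketch below is only an assumption-laden illustration: the helper name `exceptional_points`, the choice Λ = [−r, r] and the sample sets are not from the paper, and a finite truncation can suggest but never prove that an infinite set is sparse. For a given r, it lists the points that must be excluded from M, i.e. those having another point of L within distance 2r.

```python
def exceptional_points(points, r):
    """Return the points of a strictly increasing list that have another point
    of the list within distance 2*r.

    On the real line, (m + [-r, r]) and (l + [-r, r]) are disjoint exactly
    when |m - l| > 2*r, so only the two nearest neighbours need checking.
    """
    bad = []
    for i, m in enumerate(points):
        near_left = i > 0 and m - points[i - 1] <= 2 * r
        near_right = i + 1 < len(points) and points[i + 1] - m <= 2 * r
        if near_left or near_right:
            bad.append(m)
    return bad

if __name__ == "__main__":
    squares = [n * n for n in range(200)]   # gaps 2n + 1 grow without bound
    integers = list(range(50))              # gaps stay equal to 1
    print(len(exceptional_points(squares, 10)))   # finitely many exceptions
    print(len(exceptional_points(integers, 1)))   # every sampled point is exceptional
```

For the squares, the exceptional set L\M grows with r but stays finite, as the definition requires; for the integers no co-finite M works as soon as 2r ≥ 1.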
Below we use the symbol ⊔ to denote disjoint union of sets.
Lemma 6.9. Let θ : X × δL → δLX be defined by θ(x, κ) = κ ◦ τx. Then θ is injective and its range is δLX \ {∞L}, which gives us an identification
$$\delta_L X \cong (X\times\delta L)\sqcup\{\infty_L\}. \tag{6.16}$$
Proof. We set θ(x, κ) = θx,κ and note the more explicit formula
$$\theta_{x,\kappa}(\varphi) = \lim_{l\to\kappa} \varphi(l+x).$$
We first prove that θ is injective. It is clearly sufficient to show that if x ∈ X and κ, χ ∈ δL are such that κ(x.ϕ) = χ(ϕ) for all ϕ ∈ CL, then x = 0 and κ = χ. Let M ⊂ L be such that κ(M) = 1. Since κ is finer than the Fréchet filter, M is infinite and κ(N) = 1 if N is a co-finite subset of M. Let Λ ⊂ X be compact and such that 0, x ∈ Λ. Eliminating if needed a finite number of points from M, we may assume that (L\M) ∩ MΛ = ∅ and MΛ = ⊔l∈M (l + Λ). Choose ϕ ∈ C0(X) with supp ϕ ⊂ Λ and let us define ϕM = ∑l∈M τ−l ϕ. Then:
$$(1_L\, x.\varphi_M)(y) = \sum_{l\in M} 1_M(y)\,(\tau_{x-l}\varphi)(y) = \sum_{l\in M} 1_M(y)\,\varphi(x+y-l).$$
In the sum on the right-hand side, the terms are zero unless l, y ∈ M and x + y ∈ l + Λ; but this implies l = y because x ∈ Λ. We get 1L x.ϕM = 1M ϕ(x) and so, by choosing ϕ such that ϕ(x) ≠ 0, we see that κ(x.ϕM) = κ(1L x.ϕM) = κ(1M)ϕ(x) = ϕ(x) ≠ 0. Similarly 1L ϕM = 1M ϕ(0) and so χ(ϕM) = χ(1M ϕM) = χ(1M)ϕ(0). If x ≠ 0, we may choose ϕ such that ϕ(0) = 0 and we see that κ(x.ϕ) ≠ χ(ϕ) for some ϕ ∈ CL. If x = 0 but κ ≠ χ, then M can be chosen such that χ(M) = 0 (because κ and χ are distinct ultrafilters), hence again κ(x.ϕ) ≠ χ(ϕ) for some ϕ ∈ CL. This proves the injectivity of the map θ.
Now, we show that for any χ ∈ δX such that χ ∉ ∞L there is (x, κ) ∈ X × δL such that χ(ϕ) = κ(x.ϕ) for all ϕ ∈ CL. Since χ is not finer than fL, there is a compact set Λ ⊂ X such that LΛc ∉ χ. But χ is an ultrafilter, so LΛ ∈ χ. Since χ is finer than the Fréchet filter, there is M ⊂ L such that χ(MΛ) = 1 and MΛ = ⊔l∈M (l + Λ) ≡ M × Λ.
The sets F ⊂ MΛ with χ(F ) = 1 form a basis for χ and each such F can be uniquely written as a disjoint union F = l∈N (l + F (l)) with N ⊂ M and F (l) ⊂ Λ non-empty sets. We define surjective maps πM : MΛ → M and πΛ : MΛ → Λ with the help of the identification MΛ ≡ M × Λ. The image κ = πM (χ), i.e. the filter of subsets of M generated by the πM (F ) with F ∈ χ, is obviously an ultrafilter on M , hence on L, finer than the Fr´echet filter. Similarly, πΛ (χ) is an ultrafilter on Λ, which is a compact space, hence πΛ (χ) converges to some point x ∈ Λ. If F is
as above then πM (F ) = N and πΛ (F ) = l ∈ N F (l) and the families of these sets are bases for the filters κ and πΛ (χ), respectively. In particular, since πΛ (χ) is finer than the filter of neighborhoods of x, for each neighborhood V of x there is F such
that l ∈ N F (l) ⊂ V . We prove now that χ(ϕ) = κ(x.ϕ) if ϕ ∈ CL . We have χ(ϕ) = limχ ϕ, thus for each ε > 0 there is F ∈ χ as above such that |ϕ(y) − χ(ϕ)| < ε for all y ∈ F . Thus, |ϕ(l+λ)− χ(ϕ)| < ε for all l ∈ N and λ ∈ F (l). On the other hand, ϕ being uniformly continuous, there is a neighborhood V of x such that |ϕ(l + λ) − ϕ(l + x)| < ε for all l ∈ N and λ ∈ V . By what we said above, the preceding F may be chosen such that
χ l ∈ N F (l) ⊂ V . Hence, we get |ϕ(l + x) − (ϕ)| < 2ε for all l ∈ N . Since N ∈ κ and ε > 0 is arbitrary, this shows that liml→κ ϕ(l + x) = χ(ϕ). Remark 6.10. It is easy to show that δL X is homeomorphic with (X ×δL){∞L }, thought as the one point compactification of X × δL, but we do not need this. In the next lemma we use the notation of Definition 5.15. Let κ ∈ δX. / ∞L , then CL (X)κ = C∞ (X). Lemma 6.11. If κ ∈ ∞L , then CL (X)κ = C. If κ ∈ Proof. If κ ∈ ∞L , then κ.ϕ(x) = κ(x.ϕ) = limfL x.ϕ = limfL ϕ because fL is translation invariant. Thus κ.ϕ ∈ C in this case. Now, let χ ∈ / ∞L . It suffices then χ to show that .ϕ ∈ C0 (X) if limfL ϕ = 0 and by an easy density argument, we see that it suffices to assume that supp ϕ ⊂ LK for a compact subset K of X. If κ, x are such that θ(x, κ) = χ, then χ.ϕ(y) = χ(y.ϕ) = κ(x.(y.ϕ)) = κ((x + y).ϕ) = lim ϕ(l + x + y). l→κ
But if z ∈ / K, then there is M ⊂ L co-finite such that l + z ∈ / LK if l ∈ M , and then ϕ(l + z) = 0 for all such l, and so liml→κ ϕ(l + z) = 0. Hence, supp χ.ϕ ⊂ K − x. To finish the proof it remains to show that if χ ∈ / ∞L and ξ ∈ C∞ (X), then there is ϕ ∈ CL such that χ.ϕ = ξ. It suffices to show this under the assumption that ξ has compact support. Then it suffices to take ϕ = ξL = l∈L τ−l ξ. Lemma 6.12. If ϕ ∈ CL (X), the map δL κ → κ.ϕ ∈ C∞ (X) is norm continuous. Proof. By a density argument, it suffices to show this for supp ϕ ⊂ MΛ , where M ⊂ L is a co-finite set and Λ ⊂ X is a compact set such that MΛ = l∈M (l + Λ). If l ∈ M let ϕl be the function defined by ϕl (x) = ϕ(l + x) for x ∈ Λ and
ϕl(x) = 0 otherwise. Then clearly ϕl ∈ C0(X), supp ϕl ⊂ Λ, and the family {ϕl}l∈M is equicontinuous. Thus, the set {ϕl | l ∈ M} is relatively compact in C0(X). From the universal property of γM, cf. Sec. 2.4, there is a unique continuous map γM ∋ κ → ϕκ ∈ C0(X) such that ϕκ = ϕl when κ = l ∈ M. Since δM = δL, it suffices to show that ϕκ = κ.ϕ if κ ∈ δM. But we have
$$\kappa.\varphi(x) = \lim_{l\to\kappa} \varphi(l+x) = \lim_{l\to\kappa} \varphi_l(x) = \varphi_\kappa(x)$$
because γM ∋ κ → ϕκ(x) is continuous.
Putting all this together we obtain, for the algebra A = CL(X), an improvement of Theorem 5.16. If T ∈ 𝒞L(X) then, according to that theorem, we have a continuous map σ(CL(X)) ∋ κ → κ.T ∈ 𝒞s(X) which induces an embedding
$$\mathscr{C}_L(X)/\mathscr{K}(X) \hookrightarrow \prod_{\kappa\in\delta_L X} \mathscr{C}_L(X)_\kappa. \tag{6.17}$$
From Lemma 6.11, we see that the localization 𝒞L(X)κ = CL(X)κ ⋊ X at κ is
$$\mathscr{C}_L(X)_\kappa = \begin{cases} C_0(X^*) & \text{if } \kappa = \infty_L,\\ \mathscr{T}(X) & \text{if } \kappa \neq \infty_L. \end{cases} \tag{6.18}$$
Here, δLX is represented as in (6.16). We identify δL ≡ {0} × δL ⊂ X × δL and we simplify the relation (6.17) by taking into account the discussion made in Sec. 5.3. First, since (κ ◦ τx).T = Ux T Ux∗, it suffices to restrict the product to the set δL ∪ {∞L}. Secondly, we note that the contribution of the point ∞L is already covered by the other ones. Indeed, this follows from the easy to check relation (see footnote g)
$$\infty_L(\varphi) = \infty_L(\tau_\kappa(\varphi)) \quad \text{for all } \varphi\in C_L(X),$$
which implies Jκ ⊂ J∞L, and from Lemma 5.21. (Footnote g: Note that this is related to Remark 5.18: we have τκτχ = τχτκ = ∞L on CL.) Finally, we get:
Theorem 6.13. If T ∈ 𝒞L(X) and κ ∈ δL, then liml→κ Ul T Ul∗ = κ.T ≡ τκ(T) exists in 𝒞s(X) and belongs to 𝒯(X). The map δL ∋ κ → κ.T ∈ 𝒯(X) is norm continuous. The maps τκ : 𝒞L(X) → 𝒯(X) are surjective morphisms and the intersection of their kernels is 𝒦(X), which gives us a canonical embedding
$$\mathscr{C}_L(X)/\mathscr{K}(X) \hookrightarrow C(\delta L;\, \mathscr{T}(X)). \tag{6.19}$$
If H is a normal operator in 𝒞L(X) or an observable affiliated to 𝒞L(X), then
$$\sigma_{\mathrm{ess}}(H) = \bigcup_{\kappa\in\delta L} \sigma(\kappa.H). \tag{6.20}$$
The last assertion follows from the norm continuity of the map κ → κ.T.
Remark 6.14. Theorem 6.13 has been obtained by rather different methods in [20, 22], see, for example, [22, Theorems 5.5 and 5.6]. The point is that in these
references the quotient $\mathscr{C}_L(X)/\mathscr{K}(X)$ was computed directly and the notion of localization at infinity did not play a role. Our purpose here was only to show that Theorem 1.1 can be effectively used even in some rather complicated situations. Our arguments in this subsection may, in fact, serve as a model for other computations.

Remark 6.15. That the class of operators affiliated to $\mathscr{C}_L(X)$ is quite large can be seen from the following result [22, Theorem 6.1]. Let $X=\mathbb{R}^n$ and denote by $\mathcal{H}^t$ the Sobolev space of order $t$ and by $\|\cdot\|_t$ the norm in $B(\mathcal{H}^t,\mathcal{H}^{-t})$. Let $h:X\to\mathbb{R}$ be a continuous function such that $c^{-1}(1+|k|)^{2s}\le|h(k)|\le c(1+|k|)^{2s}$ for some constant $c$ and all large $k$, and denote $H_0=h(P)$. Let $0\le t<s$ be real numbers, choose a sparse set $L\subset X$ and let $\{W_l\}_{l\in L}$ be a family of symmetric operators $W_l:\mathcal{H}^t\to\mathcal{H}^{-t}$ with the property $\sup_{l\in L}\|(1+|Q|)^\lambda W_l\|_t<\infty$ for some $\lambda>2n$. Then the series $\sum_{l\in L}U_l^*W_lU_l\equiv W$ converges in the strong topology of $B(\mathcal{H}^t,\mathcal{H}^{-t})$. Let $H=H_0+W$, $H_l=H_0+W_l$ be the self-adjoint operators in $L^2(X)$ defined as form sums. Then $H$ is affiliated to $\mathscr{C}_L(X)$. If $\kappa\in\delta L$, then we also have $\kappa.H=\lim_{l\to\kappa}H_l$ in norm resolvent sense.

Remark 6.16. The preceding arguments can be simplified and everything becomes an elementary exercise for the subalgebras of $\mathscr{C}_L(X)$ corresponding to a finite number of types of bumps [20, p. 548]. The case of just one type is already interesting. More precisely, let $\mathcal{L}$ be a finite partition of $L$ consisting of $n$ infinite sets and let $C_{\mathcal{L}}$ be the set of $\varphi\in C_L$ such that $\lim_{a\ni l\to\infty}\varphi(l+x)\equiv a.\varphi(x)$ exists for each $x$ and for each $a\in\mathcal{L}$. Then $\delta L$ is replaced by the finite set $\mathcal{L}$ and the Hamiltonians affiliated to $\mathscr{C}_{\mathcal{L}}$ have (modulo translations) exactly $n+1$ localizations at infinity: a free one $H_0\in C_0(X^*)$ and a two-body one $a.H\in\mathscr{T}(X)$ for each $a\in\mathcal{L}$. And
\[ \sigma_{\rm ess}(H)=\bigcup_{a\in\mathcal{L}}\sigma(a.H). \]
We make a final comment on the algebra $\mathcal{A}$ defined in (6.9). We saw that for any sparse set $L$, we have $C_L(X)\subset\mathcal{A}$. On the other hand, if $M$ is a second sparse set, then $L\cup M$ is not sparse in general. However, the $C^*$-algebra $C_{L,M}$ generated by $C_L\cup C_M$ is still included in $\mathcal{A}$. Note that to each Hamiltonian affiliated to $\mathscr{C}_L$ one may associate in a canonical way a free Hamiltonian: this is the localization of $H$ at the point $\infty_L$. But this is not the case for Hamiltonians affiliated to $\mathscr{A}$.

6.5. Grassmann algebras

We shall construct here $C^*$-algebras canonically associated to finite dimensional vector spaces and which allow one to consider a very general version of $N$-body Hamiltonians. These algebras were first pointed out in [11], and the spectral theory of the operators affiliated to them (essential spectrum and the Mourre estimate) has been studied in detail in [12]. Our approach here is rather different: the graded algebra structure, so important in the quoted works, does not play a big role anymore.

If $Y$ is a closed subgroup of a locally compact abelian group $X$, then $X/Y$ is also a locally compact abelian group and we have a continuous surjective group
morphism $\pi_Y:X\to X/Y$. Then the map defined by $\varphi\mapsto\varphi\circ\pi_Y$ gives us a natural embedding $C(X/Y)\hookrightarrow C(X)$. In fact
\[ C(X/Y)=\{\varphi\in C(X)\mid y.\varphi=\varphi\ \ \forall\,y\in Y\}. \tag{6.21} \]
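As a concrete instance of (6.21), offered only as an illustration and not taken from the text, one may take $X=\mathbb{R}$ and $Y=\mathbb{Z}$; the translation convention $(y.\varphi)(x)=\varphi(x+y)$ used below is an assumption consistent with the formulas above.

```latex
% Illustration (assumed example): X = \mathbb{R}, Y = \mathbb{Z},
% with (y.\varphi)(x) = \varphi(x + y).
\[
  C(\mathbb{R}/\mathbb{Z})
  = \{\varphi \in C(\mathbb{R}) \mid \varphi(x+n) = \varphi(x)
      \ \ \forall\, x \in \mathbb{R},\ n \in \mathbb{Z}\},
  \qquad \text{e.g. } \varphi(x) = \cos(2\pi x).
\]
```

In this case the elements of $C(X/Y)$ are exactly the continuous $1$-periodic functions on the line.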
Note that we shall denote simply by $0$ the group $\{0\}$, and then $C(0)=C_0(0)=\mathbb{C}$, hence $C(X/X)=C_0(X/X)=\mathbb{C}$. On the other hand, if $0\subset Y\subset Z\subset X$ are closed subgroups then $X/Z\cong(X/Y)/(Z/Y)$ and we have natural maps
\[ X\to X/Y\to X/Z\to 0, \tag{6.22} \]
hence we get embeddings
\[ \mathbb{C}\subset C(X/Z)\subset C(X/Y)\subset C(X). \tag{6.23} \]
In the rest of this subsection, we shall consider only finite dimensional real vector spaces, although much of the theory can be extended to more general groups. We shall consider the algebra generated by the $C^*$-subalgebras $C_0(X/Y)\subset C(X/Y)\subset C(X)$. We recall that the Grassmannian $\mathbb{G}(X)$ is the set of all vector subspaces of $X$ and the projective space $\mathbb{P}(X)$ is the set of all one-dimensional subspaces of $X$.

Definition 6.17. The (classical) Grassmann algebra of the vector space $X$ is the $X$-subalgebra $G(X)\subset C(X)$ defined by
\[ G(X)=\text{norm closure of }\sum_{Y}C_0(X/Y), \tag{6.24} \]
where $Y$ runs over $\mathbb{G}(X)$. The quantum Grassmann algebra of $X$ is the $C^*$-algebra $\mathscr{G}(X)\subset B(X)$ defined by
\[ \mathscr{G}(X)=G(X)\rtimes X=\text{norm closure of }\sum_{Y}C_0(X/Y)\rtimes X. \tag{6.25} \]
The fact that $G(X)$ is a $C^*$-algebra follows from the obvious relation
\[ C_0(X/Y)\cdot C_0(X/Z)\subset C_0(X/(Y\cap Z)). \tag{6.26} \]
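The inclusion (6.26) can be seen at work in the plane. The following worked example is an illustration added here under assumed choices ($X=\mathbb{R}^2$, the two coordinate axes as $Y$ and $Z$, and the names $a,b$ for the one-variable profiles); it is not part of the original argument.

```latex
% Illustration (assumed example): X = \mathbb{R}^2,
% Y = \mathbb{R}\times\{0\}, Z = \{0\}\times\mathbb{R}, so that Y \cap Z = 0.
\[
  \varphi \in C_0(X/Y):\ \varphi(x_1,x_2) = a(x_2), \qquad
  \psi \in C_0(X/Z):\ \psi(x_1,x_2) = b(x_1), \qquad
  a, b \in C_0(\mathbb{R}),
\]
\[
  (\varphi\psi)(x_1,x_2) = a(x_2)\,b(x_1) \longrightarrow 0
  \quad \text{as } |(x_1,x_2)| \to \infty,
  \qquad \text{hence } \varphi\psi \in C_0(\mathbb{R}^2) = C_0(X/(Y\cap Z)).
\]
```

The product decays because at least one coordinate tends to infinity while both factors stay bounded.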
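The second equality from (6.25) follows from Theorem 3.14. Let $\mathfrak{G}(X)$ be the set of finite unions of strict vector subspaces of $X$:
\[ \mathfrak{G}(X)=\Big\{L\subset X\ \Big|\ \exists\,\mathcal{F}\subset\mathbb{G}(X)\setminus\{X\}\ \text{finite such that}\ L=\bigcup_{Y\in\mathcal{F}}Y\Big\}. \]
If $L$ is as above and $\Lambda\subset X$ is compact then $L_\Lambda=L+\Lambda=\bigcup_{Y\in\mathcal{F}}(Y+\Lambda)$. Thus, $L_\Lambda$ is a closed set, $L_\Lambda\neq X$, and we have $L_\Lambda\cup M_\Lambda=(L\cup M)_\Lambda$ and $L_\Lambda\subset L_{\Lambda'}$ if $\Lambda\subset\Lambda'$. This justifies the next definition.

Definition 6.18. The Grassmann filter $\mathfrak{g}=\mathfrak{g}_X$ on $X$ is the filter generated by the family of open sets $L_\Lambda^c=X\setminus L_\Lambda$ where $L$ runs over $\mathfrak{G}(X)$ and $\Lambda$ over the set of compact subsets of $X$. If $Y$ is a subspace of $X$, then we denote also by $\mathfrak{g}_Y$ the filter on $X$ generated by the Grassmann filter of $Y$.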
Clearly, $\mathfrak{g}_X$ is translation invariant, finer than the Fréchet filter, and $\mathfrak{g}_X^\circ=\mathfrak{g}_X$. If $X$ is one dimensional, then $\mathfrak{g}_X$ is just the Fréchet filter.

Remark 6.19. For $L\in\mathfrak{G}(X)$, we may consider the filter $\mathfrak{f}_L$ defined as in (6.12). Then $\mathfrak{g}_X$ is just the filter generated by $\bigcup_L\mathfrak{f}_L$. This can be expressed in other terms as follows: (1) $\mathfrak{g}_X$ is the upper bound of the set of filters $\mathfrak{f}_L$; (2) when seen as compact subsets of $\delta X$ (cf. Sec. 2.4), $\mathfrak{g}_X$ is the intersection of the compact sets $\mathfrak{f}_L$.

Remark 6.20. If we equip $X$ with a Euclidean norm $|\cdot|$ and denote by $\pi_Y$ the orthogonal projection onto $Y^\perp\cong X/Y$, then $\delta_L(x)\equiv\operatorname{dist}(x,L)=\min_{Y\in\mathcal{F}}|\pi_Yx|$ (with $L,\mathcal{F}$ as before). Then the sets $L_r^c=\{x\in X\mid\delta_L(x)>r\}$, with $L\in\mathfrak{G}(X)$ and $r>0$ real, form a basis of the filter $\mathfrak{g}_X$. Note that $L$ has empty interior and if $x$ is outside it then $\delta_L(tx)=|t|\delta_L(x)\to\infty$ as $t\to\infty$.

If $\mathfrak{f}$ is a filter on a set $S$ and $\pi$ is a map from $S$ to a locally compact space $T$, then $\lim_{\mathfrak{f}}\pi=\infty$ means: for each compact $K\subset T$ there is $F\in\mathfrak{f}$ such that $\pi(F)\cap K=\emptyset$.

Lemma 6.21. Let $Y,Z\in\mathbb{G}(X)$. If $Y\subset Z$, then $\lim_{\mathfrak{g}_Y}\pi_Z=0$. If $Y\not\subset Z$, then $\lim_{\mathfrak{g}_Y}\pi_Z=\infty$.

Proof. Since $Y\in\mathfrak{g}_Y$, the above limits involve only the restriction of $\pi_Z$ to $Y$. In the first case, if $y\in Y$ then $\pi_Z(y)=0$, so the assertion is clear. If $Y\not\subset Z$, then $E=Y\cap Z$ is a strict subspace of $Y$. Let $E'$ be a supplementary subspace for $E$ in $Y$. Then $\pi_Z:E'\to X/Z$ is injective, hence if $K\subset X/Z$ is compact, then the set $\Lambda$ of $y\in E'$ such that $\pi_Zy\in K$ is compact in $E'$ and thus in $Y$. If $y\in Y\setminus E_\Lambda\in\mathfrak{g}_Y$ then $y=e+e'$ with $e\in E$ and $e'\in E'\setminus\Lambda$, so $\pi_Zy=\pi_Ze'\notin K$.

Corollary 6.22. If $\varphi\in G(X)$ and $Y\in\mathbb{G}(X)$, then $\lim_{\mathfrak{g}_Y}\varphi$ exists. If $\varphi=\sum_Z\varphi_Z$ is a finite sum of $\varphi_Z\in C_0(X/Z)$, then $\lim_{\mathfrak{g}_Y}\varphi=\sum_{Z\supset Y}\varphi_Z(0)$.
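The distance function of Remark 6.20 is easy to experiment with numerically. The following Python sketch is only an illustration: the helper names `dist_to_subspace` and `delta_L`, the choice $X=\mathbb{R}^2$, and the three lines making up $L$ are assumptions made for the example, not part of the text. It evaluates $\delta_L(x)=\min_{Y\in\mathcal{F}}|\pi_Yx|$ and lets one check the homogeneity $\delta_L(tx)=|t|\delta_L(x)$ on sample points.

```python
import math


def dist_to_subspace(x, basis):
    """Distance from x to span(basis); `basis` is assumed orthonormal."""
    residual = list(x)
    for e in basis:
        coeff = sum(xi * ei for xi, ei in zip(x, e))
        residual = [ri - coeff * ei for ri, ei in zip(residual, e)]
    # |pi_Y x| is the norm of the component of x orthogonal to Y
    return math.sqrt(sum(r * r for r in residual))


def delta_L(x, subspaces):
    """dist(x, L) for L a finite union of subspaces (each given by a basis)."""
    return min(dist_to_subspace(x, basis) for basis in subspaces)


# Hypothetical L: the two coordinate axes and the diagonal of R^2.
s = 1.0 / math.sqrt(2.0)
L = [[(1.0, 0.0)], [(0.0, 1.0)], [(s, s)]]

x = (3.0, 1.0)
t = -2.0
scaled = tuple(t * xi for xi in x)
print(delta_L(x, L), delta_L(scaled, L))
```

Scaling a point outside $L$ by $t$ multiplies its distance to $L$ by $|t|$, which is why the sets $L_r^c$ are nonempty for every $r>0$.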
We see that each filter $\mathfrak{g}_Y$ defines a character of $G(X)$ and we could proceed as in the proof of Lemma 6.9 and describe $\delta_GX\equiv\delta(G(X))$ in terms of couples $(Y,y)$ with $y\in X/Y$. We shall not do it explicitly, but this is hidden in what follows. We only note that the role of $\infty_L$ is now played by $\mathfrak{g}_X$.

Proposition 6.23. For each $\varphi\in G(X)$, the limit
\[ \tau_Y\varphi=\lim_{y\to\mathfrak{g}_Y}y.\varphi \tag{6.27} \]
exists locally uniformly on $X$. If $\varphi$ is a finite sum $\varphi=\sum_Z\varphi_Z$ with $\varphi_Z\in C_0(X/Z)$, then $\tau_Y\varphi=\sum_{Z\supset Y}\varphi_Z$. If $Y\in\mathbb{P}(X)$, $y\in Y\setminus0$, then $\tau_Y\varphi(x)=\lim_{t\to\infty}\varphi(x+ty)$.

Proof. We have to show that $\lim_{y\to\mathfrak{g}_Y}\varphi(x+y)$ exists locally uniformly in $x$. But this is an immediate consequence of Corollary 6.22 and Lemma 2.2.

According to the conventions we made at the beginning of this subsection, we have $G(X/Y)\subset C(X/Y)\subset C(X)$ if $Y$ is a subspace of $X$.
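The formula $\tau_Y\varphi=\sum_{Z\supset Y}\varphi_Z$ of Proposition 6.23 can be observed numerically in a toy case. The sketch below is an illustration under assumed choices: $X=\mathbb{R}^2$, $Y$ the $x_2$-axis, and three hypothetical components `a`, `b`, `c`; only `a` is invariant under translations along $Y$, so only `a` should survive as $t\to\infty$ in $\varphi(x+ty)$.

```python
import math


def a(x1):
    """Component in C_0(X/Z1), Z1 = x2-axis: depends on x1 only."""
    return math.exp(-x1 ** 2)


def b(x2):
    """Component in C_0(X/Z2), Z2 = x1-axis: depends on x2 only."""
    return 1.0 / (1.0 + x2 ** 2)


def c(x1, x2):
    """Component in C_0(X) = C_0(X/0)."""
    return math.exp(-(x1 ** 2 + x2 ** 2))


def phi(x1, x2):
    return a(x1) + b(x2) + c(x1, x2)


def tau_Y_phi_approx(x1, x2, t=1.0e4):
    """Approximate tau_Y phi at x by translating far along Y = x2-axis."""
    return phi(x1, x2 + t)


x1, x2 = 0.5, -1.0
print(tau_Y_phi_approx(x1, x2), a(x1))
```

Here $Z_1=Y$ is the only subspace of the family containing $Y$, so the translated value approaches $a(x_1)$, while the terms attached to $Z_2$ and to $0$ vanish in the limit.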
Proposition 6.24. We have $G(X/Y)\subset G(X)$. Moreover, there is a unique morphism $\tau_Y:G(X)\to G(X/Y)$ such that $\tau_Y$ is a projection (in the sense of linear spaces). The map $\tau_Y$ is given by (6.27). If $Y\subset Z\subset X$, then
\[ \mathbb{C}=G(0)\subset G(X/Z)\subset G(X/Y)\subset G(X) \tag{6.28} \]
and $\tau_Y\tau_Z=\tau_Z\tau_Y=\tau_Z$. More generally, for any $Y,Z\in\mathbb{G}(X)$, we have
\[ G(X/Y)\cap G(X/Z)=G(X/(Y+Z)) \tag{6.29} \]
and $\tau_Y\tau_Z=\tau_Z\tau_Y=\tau_{Y+Z}$.

Proof. The algebra $G(X/Y)$ is generated by the algebras $C_0((X/Y)/E)$ with $E\subset X/Y$ a subspace. If $Z=\pi_Y^{-1}(E)$, then $Y\subset Z\subset X$, $E=Z/Y$ and $(X/Y)/E\cong X/Z$, which allows us to identify $C_0((X/Y)/E)=C_0(X/Z)$ and thus to get the first assertion of the proposition. Observe that
\[ G(X/Y)=\text{norm closure of }\sum_{Z\supset Y}C_0(X/Z)=\{\varphi\in G(X)\mid y.\varphi=\varphi\ \ \forall\,y\in Y\}. \tag{6.30} \]
The other assertions of the proposition are easy to check.

Proposition 6.25. If $\varphi\in G(X)$ and $\tau_Y\varphi=0$ for all $Y\in\mathbb{P}(X)$, then $\varphi\in C_0(X)$.

Proof. This follows from [11, Theorem 3.2 and Lemma 4.1], but we give a self-contained proof here. Consider first a finite set $\mathcal{F}\subset\mathbb{G}(X)$ which is stable under intersections and such that $0\in\mathcal{F}$, and let $\mathcal{A}=\sum_{Y\in\mathcal{F}}C_0(X/Y)$. Then $\mathcal{A}$ is a $*$-algebra because of (6.26) and $C_0(X)\subset\mathcal{A}$. Clearly $\|\tau_Y\varphi\|\le\|\varphi\|$ for all $Y\in\mathcal{F}$, $\varphi\in\mathcal{A}$. Let us write $\varphi=\sum_Y\varphi^Y$ with $\varphi^Y\in C_0(X/Y)$. From Proposition 6.23, we get $\tau_Y\varphi=\sum_{Z\supset Y}\varphi^Z$, so if $Y$ is a maximal element of $\mathcal{F}$ then $\tau_Y\varphi=\varphi^Y$, hence $\|\varphi^Y\|\le\|\varphi\|$. By induction, we easily see that there is a constant $c$ such that $\|\varphi^Y\|\le c\|\varphi\|$ for all $Y\in\mathcal{F}$, $\varphi\in\mathcal{A}$. This clearly implies that $\mathcal{A}$ is a $C^*$-algebra and that $\sum_{Y\in\mathcal{F}}C_0(X/Y)$ is a topological direct sum. If $\varphi$ is as above and $\tau_Y\varphi=0$ for all $Y\neq0$, then $\sum_{Z\supset Y}\varphi^Z=0$ if $Y\neq0$ hence, the sum being direct, we get $\varphi^Z=0$ for all $Z\neq0$, thus $\varphi\in C_0(X)$. It follows that the map $\varphi\mapsto(\tau_Y\varphi)_{Y\neq0}$ is a morphism from $\mathcal{A}$ into $\prod_{Y\neq0}G(X/Y)$ with kernel equal to $C_0(X)$. In particular, the induced map $\mathcal{A}/C_0(X)\to\prod_{Y\neq0}G(X/Y)$ is an isometry, so that if $\psi\in\mathcal{A}$ is such that $\|\tau_Y\psi\|\le\varepsilon$ for all $Y\neq0$, then there is $\psi_0\in C_0(X)$ such that $\|\psi-\psi_0\|\le2\varepsilon$ (just by definition of the quotient norm).

Let now $\varphi\in G(X)$ be such that $\tau_Y\varphi=0$ for all $Y\in\mathbb{P}(X)$. From Proposition 6.24 it follows that this property remains true for all $Y\in\mathbb{G}(X)$, $Y\neq0$. From the definition (6.24) it follows easily that for each $\varepsilon>0$ there is $\mathcal{A}$ as above and there is $\psi\in\mathcal{A}$ such that $\|\varphi-\psi\|\le\varepsilon$. Then clearly we have $\|\tau_Y\psi\|\le\varepsilon$ for all $Y\neq0$, so by
what we proved above there is $\psi_0\in C_0(X)$ such that $\|\psi-\psi_0\|\le2\varepsilon$, and hence $\|\varphi-\psi_0\|\le3\varepsilon$. This clearly implies $\varphi\in C_0(X)$.

The next theorem is now an immediate consequence of Theorem 5.16, Propositions 6.24 and 6.25, and of Lemma 5.23. We denote
\[ \mathscr{G}_Y(X)=G(X/Y)\rtimes X=\text{norm closure of }\sum_{Z\supset Y}C_0(X/Z)\rtimes X. \tag{6.31} \]
We mention that we have non-canonical isomorphisms $\mathscr{G}_Y(X)\cong\mathscr{G}(X/Y)\otimes C_0(Y^*)$.

Theorem 6.26. If $T\in\mathscr{G}(X)$ and $Y\in\mathbb{G}(X)$, then $\tau_YT=\lim_{y\to\mathfrak{g}_Y}U_yTU_y^*$ exists in $\mathscr{C}_s(X)$ and belongs to $\mathscr{G}_Y(X)$. The map $\tau_Y:\mathscr{G}(X)\to\mathscr{G}_Y(X)$ is a morphism and a linear projection and is uniquely characterized by these properties. We have $\tau_Y\tau_Z=\tau_Z\tau_Y=\tau_{Y+Z}$. If $Y\in\mathbb{P}(X)$ and $y\in Y$, $y\neq0$, then $\tau_YT=\lim_{t\to\infty}U_{ty}TU_{ty}^*$. We have $T\in\mathscr{K}(X)$ if and only if $\tau_YT=0$ for all $Y\in\mathbb{P}(X)$, which gives us
\[ \mathscr{G}(X)/\mathscr{K}(X)\hookrightarrow\prod_{Y\in\mathbb{P}(X)}\mathscr{G}_Y(X). \tag{6.32} \]
From (6.32), we get that the essential spectrum of an observable $H$ affiliated to $\mathscr{G}(X)$ is equal to the closure of the union of the sets $\sigma(\tau_YH)$ with $Y\in\mathbb{P}(X)$. But now we can prove more: as in the situations considered in Theorem 6.13 and Proposition 6.3, the union is already closed (although it is not finite, as in the usual $N$-body problem).

Theorem 6.27. If $T\in\mathscr{G}(X)$, then $\{\tau_YT\mid Y\in\mathbb{P}(X)\}$ is a compact set in $\mathscr{G}(X)$. In particular, if $H$ is a normal operator in $\mathscr{G}(X)$ or an observable affiliated to $\mathscr{G}(X)$, then
\[ \sigma_{\rm ess}(H)=\bigcup_{Y\in\mathbb{P}(X)}\sigma(\tau_YH). \tag{6.33} \]
One should note that the map $Y\mapsto\tau_YT$ is not continuous: if $T\in C_0(X/Z)\rtimes X$ then $\tau_YT=T$ if $Y\subset Z$ and $\tau_YT=0$ if $Y\not\subset Z$. Theorem 6.27 is a corollary of [12, Theorem 4.2 and Proposition 5.4]. We shall give below a slightly improved proof. Note that only some general properties of the lattice $\mathbb{G}(X)$ and of the graded algebra structure of $\mathscr{G}(X)$ are really needed. The next two lemmas imply the first assertion of Theorem 6.27 (hence, the second).

Lemma 6.28. If $T\in\mathscr{G}(X)$, then for each $Z\in\mathbb{G}(X)$, $Z\neq0$, there is $Y\in\mathbb{P}(X)$ such that $\tau_ZT=\tau_YT$.

Proof. Let $\mathcal{E}\subset\mathbb{G}(X)$ be countable. Then $\{E\cap Z\mid E\in\mathcal{E},\ E\cap Z\neq Z\}$ is a countable set of strict subspaces of $Z$, so its union is not $Z$. Let $Y\in\mathbb{P}(Z)$ be such that $Y\cap E\cap Z=0$ if $E\cap Z$ is in the preceding set. Then from $E\in\mathcal{E}$ and $E\supset Y$, we get $E\supset Z$. Now, if $T_{\mathcal{E}}$ is a finite sum $\sum_{E\in\mathcal{E}}T^E$ with $T^E\in C_0(X/E)\rtimes X$, then clearly $\tau_ZT_{\mathcal{E}}=\tau_YT_{\mathcal{E}}$. Finally, if $T$ is arbitrary, then there is $\mathcal{E}$ as above such that $T$ is a norm limit of operators of the form $T_{\mathcal{E}}$, so we have $\tau_ZT=\tau_YT$.
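Lemma 6.29. Let $\{Y_n\}_{n\ge0}$ be a sequence of linear subspaces of $X$ and let us define $Y=\bigcap_{n\ge0}\sum_{m\ge n}Y_m$. If $k$ is the dimension of $Y$, then there is $N$ such that for all $n\ge N$ and all $T\in\mathscr{G}(X)$:
\[ \|(\tau_{Y_n}-\tau_Y)T\|\le k\sup_{m\ge n}\|(\tau_{Y_n}-\tau_{Y_m})T\|. \]
Proof. Since a decreasing sequence of subspaces is eventually constant, there is $N$ such that $Y=\sum_{m\ge n}Y_m$ for all $n\ge N$. The dimension of $Y$ being $k$, for each $n\ge N$ there are $n<n_1<\dots<n_k$ such that $Y=Y_n+Y_{n_1}+\dots+Y_{n_k}$. From Theorem 6.26, we get $\tau_Y=\tau_{Y_n}\tau_{Y_{n_1}}\cdots\tau_{Y_{n_k}}$. Let $\mathcal{P}=\tau_{Y_n}$, $\mathcal{P}_i=\tau_{Y_{n_i}}$, and $\mathcal{P}_i'=1-\mathcal{P}_i$. Then:
\[ \mathcal{P}-\tau_Y=\mathcal{P}[1-\mathcal{P}_1\cdots\mathcal{P}_k]=\sum_{i=1}^{k-1}\mathcal{P}\mathcal{P}_i'\mathcal{P}_{i+1}\cdots\mathcal{P}_k+\mathcal{P}\mathcal{P}_k'. \]
Since the morphisms $\mathcal{P}_i$ commute, we get $\|(\mathcal{P}-\tau_Y)T\|\le\sum_{i=1}^k\|\mathcal{P}\mathcal{P}_i'T\|$. Now, it suffices to note that $\mathcal{P}\mathcal{P}_i'=\mathcal{P}(\mathcal{P}-\mathcal{P}_i)$.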
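The telescoping identity used in this proof can be sanity-checked in a toy model. The Python sketch below is an illustration only: it replaces the commuting idempotent morphisms by diagonal $0/1$ matrices (stored as tuples), which is an assumption made for the example and captures only their algebraic relations. The helper names (`mul`, `comp`, `lhs`, `telescoping_rhs`, `check_all`) are hypothetical.

```python
from functools import reduce
from itertools import product


def mul(p, q):
    """Product of two diagonal matrices stored as tuples."""
    return tuple(x * y for x, y in zip(p, q))


def add(p, q):
    return tuple(x + y for x, y in zip(p, q))


def sub(p, q):
    return tuple(x - y for x, y in zip(p, q))


def comp(p):
    """Complement 1 - p of a diagonal 0/1 matrix."""
    return tuple(1 - x for x in p)


def lhs(P, Ps):
    """P [1 - P_1 ... P_k]."""
    one = tuple(1 for _ in P)
    return mul(P, comp(reduce(mul, Ps, one)))


def telescoping_rhs(P, Ps):
    """Sum over i of P P_i' P_{i+1} ... P_k (empty tail product = 1)."""
    one = tuple(1 for _ in P)
    total = tuple(0 for _ in P)
    for i in range(len(Ps)):
        tail = reduce(mul, Ps[i + 1:], one)
        total = add(total, mul(P, mul(comp(Ps[i]), tail)))
    return total


def check_all(n, k):
    """Exhaustively test both identities on n x n diagonal idempotents."""
    diagonals = list(product((0, 1), repeat=n))
    for P in diagonals:
        for Ps in product(diagonals, repeat=k):
            if lhs(P, Ps) != telescoping_rhs(P, Ps):
                return False
            # P P_i' = P (P - P_i), because P is idempotent
            if any(mul(P, comp(Pi)) != mul(P, sub(P, Pi)) for Pi in Ps):
                return False
    return True


print(check_all(3, 2))
```

The second identity, $\mathcal{P}\mathcal{P}_i'=\mathcal{P}(\mathcal{P}-\mathcal{P}_i)$, is what turns each summand into a difference $\tau_{Y_n}-\tau_{Y_{n_i}}$ and so yields the factor $k$ in the lemma.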
Proof of Theorem 6.27. If $\{\tau_{Y_n}T\}$ is a norm Cauchy sequence and $Y$ is as in Lemma 6.29, then $\|(\tau_{Y_n}-\tau_Y)T\|\to0$. Observe that we do not have $k=0$ because this would imply $Y_n=0$ for large $n$. Thus, we can use Lemma 6.28 and find $E\in\mathbb{P}(X)$ such that $\tau_YT=\tau_ET$, which proves the first assertion of the theorem.

Remark 6.30. The usual form of the HVZ theorem for $N$-body Hamiltonians follows easily from Theorem 6.27. Indeed, in the Agmon–Froese–Herbst formalism [1] one is given a finite lattice $\mathcal{L}$ and an injective map $\mathcal{L}\ni a\mapsto X_a\in\mathbb{G}(X)$ such that $X_{a\wedge b}=X_a\cap X_b$, $X_{\max\mathcal{L}}=X$ and $X_{\min\mathcal{L}}=0$. The $N$-body Hamiltonians are observables $H$ affiliated to the $C^*$-algebra $\mathscr{C}=\sum_{a\in\mathcal{L}}C_0(X^a)\rtimes X\subset\mathscr{G}(X)$, where $X^a=X/X_a$. Let $\tau_a=\tau_{X_a}$; then $\tau_a$ is a morphism and a linear projection of $\mathscr{C}$ onto the $C^*$-subalgebra $\mathscr{C}_a=\sum_{b\ge a}C_0(X^b)\rtimes X$. Let us set $H_a=\tau_aH$. Then (a generalized version of) the HVZ theorem says that
\[ \sigma_{\rm ess}(H)=\bigcup_{a\in\mathcal{M}}\sigma(H_a), \tag{6.34} \]
where $\mathcal{M}$ is the set of atoms of $\mathcal{L}$. To get this from (6.33), note that for each $Y\in\mathbb{P}(X)$ there is a smallest $b$ in $\mathcal{L}$ such that $Y\subset X_b$, so we have $Y\subset X_c$ if and only if $b\le c$. Then for $T\in\mathscr{C}$ we have $\tau_YT=\tau_bT$. On the other hand, there is an atom $a$ such that $a\le b$, and then $\tau_bT=\tau_b\tau_aT$. Thus, $\sigma(\tau_YT)\subset\sigma(\tau_aT)$. Conversely, if $Z\in\mathbb{P}(X)$ and $Z\subset X_a$, then $\tau_aT=\tau_a\tau_ZT$ and so $\sigma(\tau_aT)\subset\sigma(\tau_ZT)$.

Example 6.31. The simplest application of Theorem 6.27 is obtained by taking $X=\mathbb{R}^n$ and $H=\Delta+V(x)$ where $V\in G(X)$. Although simple, this situation is not trivial, because the union in (6.33) contains an infinite number of distinct terms in general. For example, the construction of $V$ may involve an infinite number of subspaces $Y$ whose union is dense in $X$.
Example 6.32. We show here that in an $N$-body type situation (i.e. involving only a finite number of subspaces $Y$) the class of Hamiltonians for which (6.34) applies is very large. We use the setting of Remark 6.30 and, to simplify notations, we equip $X$ with a Euclidean structure, so that $X$ is identified with $X^*$ and $X^a=X_a^\perp$. For real $s$ let $\mathcal{H}^s$ be the usual Sobolev spaces, set $\mathcal{H}=\mathcal{H}^0=L^2(X)$, and embed as usual $\mathcal{H}^s\subset\mathcal{H}\subset\mathcal{H}^{-s}$ if $s>0$. Fix $s>0$ and denote by $\|\cdot\|_s$ the norm in $B(\mathcal{H}^s,\mathcal{H}^{-s})$. Let $h:X\to\mathbb{R}$ be continuous and such that $c'(1+|k|)^{2s}\le h(k)\le c''(1+|k|)^{2s}$ outside a compact set, for some constants $c',c''$. Then $H(\max\mathcal{L}):=h(P)$ is a self-adjoint operator with domain $\mathcal{H}^{2s}$ and form domain $\mathcal{H}^s$. Then for each $a\neq\max\mathcal{L}$ let $H(a):\mathcal{H}^s\to\mathcal{H}^{-s}$ be a symmetric continuous operator such that the following properties hold:
(1) $U_xH(a)U_x^*=H(a)$ if $x\in X_a$,
(2) $\|V_kH(a)V_k^*-H(a)\|_s\to0$ as $k\to0$ in $X$,
(3) $\|(V_k-1)H(a)\|_s\to0$ as $k\to0$ in $X^a$,
(4) $H_a:=\sum_{b\ge a}H(b)\ge\mu h(P)-\nu$ as forms on $\mathcal{H}^s$, for some $\mu,\nu>0$.
Then $H\equiv H(\min\mathcal{L})$ is affiliated to $\mathscr{C}$ and $\tau_aH=H_a$, so (6.34) holds. See [12, Theorem 4.6] for the details of the computation and for more general results.

6.6. On the operators $\kappa.H$

We observed after Theorem 1.2 that if $H$ is a self-adjoint operator affiliated to $\mathscr{C}(X)$, then its localizations at infinity $\kappa.H$ are not necessarily densely defined. We shall make in this subsection some comments on this question and we shall give conditions which allow one to compute $\kappa.H$ directly in terms of $x.H$ and so to avoid considering the resolvent of $H$. This is not possible for $H=h(P)+v(Q)$ if the operator $V=v(Q)$ is not relatively bounded with respect to $h(P)$, so we shall consider here only more elementary situations which are of some physical interest.

To fix the ideas we consider here only the case $X=\mathbb{R}^n$ and take $\mathcal{H}=L^2(X;E)$, where $E$ is a finite dimensional Hilbert space, cf. Sec. 4. For simplicity, we consider only operators whose form domain is a Sobolev space $\mathcal{H}^s$ with $s>0$ (everything extends with no difficulty to hypoelliptic operators). Set $\langle k\rangle=(1+|k|^2)^{1/2}$ and denote by $S(E)$ the space of symmetric operators on $E$ and by $|S|$ the absolute value of $S\in S(E)$. Let $h:X\to S(E)$ be locally Lipschitz and such that $c'\langle k\rangle^{2s}\le|h(k)|\le c''\langle k\rangle^{2s}$ and $|h'(k)|\le c\langle k\rangle^{2s}$ outside a compact set, where $c,c',c''$ are constants. We set $H_0=h(P)$ and observe that $D(|H_0|^{1/2})=\mathcal{H}^s$. Let $v:X\to S(E)$ be a locally integrable function such that the operator $V=v(Q)$ satisfies $V\mathcal{H}^s\subset\mathcal{H}^{-s}$ and $\pm V\le\mu|H_0|+\nu$ for some real numbers $\mu,\nu$ with $\mu<1$. Then the self-adjoint operator $H=H_0+V$ (form sum) is affiliated to $\mathscr{C}(X)$, cf. Corollary 4.8. Note that $x.V=U_xVU_x^*=v(x+Q)$ satisfies the same
estimates as $V$ and that $x.H=H_0+x.V$. We mention that the next lemma is valid under the much more general conditions of Definition 4.7.

Lemma 6.33. Let us assume that for each $C^\infty$ function $f$ with compact support the set $\{x.Vf\mid x\in X\}$ is relatively compact in $\mathcal{H}^{-s}$. Then for each $\kappa\in\delta X$, the limit $\lim_{x\to\kappa}x.V=\kappa.V$ exists in the strong operator topology in $B(\mathcal{H}^s,\mathcal{H}^{-s})$, we have $\pm\kappa.V\le\mu|H_0|+\nu$ as forms on $\mathcal{H}^s$, and we have $\kappa.H=H_0+\kappa.V$ if $\kappa.H$ is defined as in Theorem 1.2.

Proof. Let $z\in\rho(H)$ and $R=(H-z)^{-1}\in\mathscr{C}(X)$. Then $\kappa.H$ is defined by the operator $\kappa.R=\lim_{x\to\kappa}x.R$ where the limit exists in $\mathscr{C}_s(X)$. Note that we know that the limit exists but we do not yet know whether $\kappa.R$ is injective or not. On the other hand, the existence of $\kappa.V$ follows from the fact that the set of operators $x.V$ is bounded in $B(\mathcal{H}^s,\mathcal{H}^{-s})$: thus it suffices to show the existence of the limit $\lim_{x\to\kappa}x.Vf$ in $\mathcal{H}^{-s}$ for $f$ a $C^\infty$ function with compact support, and this is obvious by the universal property of the Stone–Čech compactification $\gamma X$ of the discrete space $X$ and our assumption. Note that $\kappa.V$ is the operator of multiplication by a distribution which need not be a function, but clearly the estimate verified by $V$ remains valid in the limit. Hence, if we define $\kappa.H=H_0+\kappa.V$ as form sum, we get a densely defined self-adjoint operator such that $\kappa.H-z$ extends to an isomorphism $\mathcal{H}^s\to\mathcal{H}^{-s}$. Now, it suffices to prove that $\kappa.R=(\kappa.H-z)^{-1}$. Since $x.H-z:\mathcal{H}^s\to\mathcal{H}^{-s}$ is also an isomorphism, one can easily justify the equality
\[ (x.H-z)^{-1}-(\kappa.H-z)^{-1}=(x.H-z)^{-1}(\kappa.V-x.V)(\kappa.H-z)^{-1} \]
in $B(\mathcal{H}^{-s},\mathcal{H}^s)$. Then for $f\in\mathcal{H}^{-s}$, we have
\[ \|[(x.H-z)^{-1}-(\kappa.H-z)^{-1}]f\|_{\mathcal{H}^s}\le C\|(\kappa.V-x.V)(\kappa.H-z)^{-1}f\|_{\mathcal{H}^{-s}}, \]
where $C=\|(H-z)^{-1}\|_{B(\mathcal{H}^{-s},\mathcal{H}^s)}=\|(x.H-z)^{-1}\|_{B(\mathcal{H}^{-s},\mathcal{H}^s)}$. This clearly finishes the proof.

This lemma gives a rather concrete method of computing $\kappa.H$ and also shows that this operator is densely defined.
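The equality used at the end of the proof of Lemma 6.33 is an instance of the second resolvent identity. The short derivation below is a sketch added for the reader's convenience; the shorthand $A_x$, $A_\kappa$ is introduced here and does not appear in the original text.

```latex
% Sketch: second resolvent identity behind the proof of Lemma 6.33.
% A_x and A_\kappa are shorthand introduced only for this illustration.
\begin{align*}
A_x &:= x.H - z = H_0 + x.V - z, \qquad
A_\kappa := \kappa.H - z = H_0 + \kappa.V - z,\\
A_x^{-1} - A_\kappa^{-1}
  &= A_x^{-1}\,(A_\kappa - A_x)\,A_\kappa^{-1}
   = A_x^{-1}\,(\kappa.V - x.V)\,A_\kappa^{-1},\\
\|(A_x^{-1} - A_\kappa^{-1})f\|_{\mathcal{H}^{s}}
  &\le \|A_x^{-1}\|_{B(\mathcal{H}^{-s},\mathcal{H}^{s})}\,
       \|(\kappa.V - x.V)\,A_\kappa^{-1}f\|_{\mathcal{H}^{-s}}.
\end{align*}
```

Since $A_\kappa^{-1}f$ is a fixed element of $\mathcal{H}^s$ and $x.V\to\kappa.V$ strongly in $B(\mathcal{H}^s,\mathcal{H}^{-s})$, the right-hand side tends to zero as $x\to\kappa$, provided the norm of $A_x^{-1}$ stays bounded in $x$, which is what the constant $C$ in the proof expresses.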
The most elementary way of checking the relative compactness assumption from the lemma is described below.

Proposition 6.34. Assume that for each $\mu>0$, there is $\nu$ such that $|V|\le\mu\langle P\rangle^{2s}+\nu$. Then $\lim_{x\to\kappa}x.V=\kappa.V$ exists strongly in $B(\mathcal{H}^s,\mathcal{H}^{-s})$ for each $\kappa\in\delta X$, for each $\mu>0$ there is $\nu$ such that $\pm\kappa.V\le\mu|H_0|+\nu$ as forms on $\mathcal{H}^s$, and $\kappa.H=H_0+\kappa.V$ if $\kappa.H$ is defined as in Theorem 1.2. In particular, we have
\[ \sigma_{\rm ess}(H)=\bigcup_{\kappa\in\delta X}\sigma(\kappa.H). \]
Proof. We only have to show that the set $\{x.Vf\mid x\in X\}$ is relatively compact in $\mathcal{H}^{-s}$ if $f\in C_c^\infty(X)$, i.e. if $f$ is a $C^\infty$ function with compact support. This is equivalent to the relative compactness in $\mathcal{H}$ of the set $\{\langle P\rangle^{-s}x.Vf\mid x\in X\}$. Let $\psi,\xi\in C_c^\infty(X)$ with $\xi(x)=1$ on $\operatorname{supp}f$ and let $S=\langle P\rangle^{-s}\xi(Q)\langle P\rangle^{s}$ and $T=\langle P\rangle^{-s}V\langle P\rangle^{-s}$. Then:
\[ \psi(P)\langle P\rangle^{-s}x.Vf=\psi(P)\langle P\rangle^{-s}\xi(Q)x.Vf=\psi(P)SU_xTU_x^*\langle P\rangle^{s}f\equiv\psi(P)Sf_x. \]
The set $\{f_x\mid x\in X\}$ is bounded in $\mathcal{H}$ and the operator $\psi(P)S$ is compact in $\mathcal{H}$, so the set $\{\psi(P)\langle P\rangle^{-s}x.Vf\mid x\in X\}$ is relatively compact in $\mathcal{H}$. Thus, it suffices to prove the following assertion: for each $\varepsilon>0$ there is $\psi\in C_c^\infty(X)$ such that $\|\psi(P)^\perp\langle P\rangle^{-s}x.Vf\|\le\varepsilon$ for all $x\in X$, where $\psi(P)^\perp=1-\psi(P)$. Let $V_\pm$ be the positive and negative parts of $V$, so that $V=V_+-V_-$ and $|V|=V_++V_-$; then it is clearly sufficient to prove this assertion with $V$ replaced by $V_\pm$. If $T_\pm=\langle P\rangle^{-s}V_\pm\langle P\rangle^{-s}$ then
\[ \|\psi(P)^\perp\langle P\rangle^{-s}x.V_\pm f\|=\|\psi(P)^\perp U_xT_\pm U_x^*\langle P\rangle^{s}f\|\le\|\psi(P)^\perp T_\pm\|\,\|\langle P\rangle^{s}f\| \]
and if we set $C_\pm=\|T_\pm^{1/2}\|$, then
\[ \|\psi(P)^\perp T_\pm\|\le C_\pm\|\psi(P)^\perp T_\pm^{1/2}\|=C_\pm\|\psi(P)^\perp T_\pm\psi(P)^\perp\|^{1/2}. \]
On the other hand, from $|V|\le\mu\langle P\rangle^{2s}+\nu$, we get $V_\pm\le\mu\langle P\rangle^{2s}+\nu$ and then $T_\pm\le\mu+\nu\langle P\rangle^{-2s}$ and so
\[ \psi(P)^\perp T_\pm\psi(P)^\perp\le[\mu+\nu\langle P\rangle^{-2s}](1-\psi(P))^2. \]
Since $\mu$ can be chosen as small as we wish, it is clear that the right-hand side above can be made $\le\varepsilon$ for any $\varepsilon>0$ by choosing $\psi$ conveniently. Since the left-hand side is positive we then get $\|\psi(P)^\perp T_\pm\psi(P)^\perp\|\le\varepsilon$.

Corollary 6.35. If the conditions of Proposition 6.34 are satisfied and if for each $C^\infty$ function $f$ with support in the unit ball we have $\lim_{x\to\infty}\|x.Vf\|_{\mathcal{H}^{-s}}=0$, then the essential spectrum of $H$ is given by $\sigma_{\rm ess}(H)=\sigma(H_0)$.

Example 6.36. There are at least three physically interesting situations covered by Proposition 6.34:
(1) The Schrödinger operator $H=P^2+V=\Delta+V(x)$. Then $s=1$ and the assumptions of the proposition are satisfied if $V$ is of Kato class, so we get [31, Theorem 4.5]. Corollary 6.35 is similar to [31, Theorem 4.3].
(2) The relativistic spin zero operator $H=(P^2+m^2)^{1/2}+V(x)$; then $s=1/2$. Here $m$ is any real number.
(3) The Dirac operator $H=D+V(x)$. Here $D$ is the free Dirac operator of mass $m\ge0$, $s=1/2$, $E=\mathbb{C}^N$ is not trivial, and $V(x)$ is matrix valued.
The last two situations are also considered in [36].
6.7. Cocompact subgroups

We consider now a situation similar to that from [31, Sec. 5]. In a $C^*$-algebra setting such examples and generalizations appear in [32]. Throughout this subsection, $X$ is an abelian locally compact non-compact group and $Y$ a closed subgroup such that $X/Y$ is compact. Since $Y$ is fixed, we shall abbreviate $\pi_Y=\pi$. We embed $C(X/Y)\subset C(X)$ as described in (6.21), so we think of $C(X/Y)$ as a translation invariant $C^*$-subalgebra of $C(X)$ containing the constants, explicitly described by (6.21). More generally, we identify any function $v$ defined on $X/Y$ with the function $v\circ\pi$ defined on $X$. The full justification of the class of functions introduced below will become clear later on, cf. Lemma 6.42.

Definition 6.37. If $\theta:X\to X$, then we write $\theta(a+x)\sim a+\theta(x)$ if
\[ \lim_{x\to\infty}[\theta(a+x)-a-\theta(x)]=0\quad\forall\,a\in X. \tag{6.35} \]
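Condition (6.35) can be probed numerically for a concrete map. The Python sketch below is an illustration under assumed choices: $X=\mathbb{R}$ and the hypothetical map $\theta(x)=x+\tfrac12\log(1+x^2)$, which is Lipschitz, has $\theta'(x)=1+x/(1+x^2)>0$, and differs from the identity by an unbounded but slowly varying term. The function names `theta` and `defect` are not from the text.

```python
import math


def theta(x):
    """Hypothetical example map on X = R: identity plus a slowly varying term."""
    return x + 0.5 * math.log(1.0 + x * x)


def defect(a, x):
    """The quantity theta(a + x) - a - theta(x) appearing in (6.35)."""
    return theta(a + x) - a - theta(x)


a = 5.0
for x in (1.0, 1.0e2, 1.0e4, 1.0e6, -1.0e6):
    print(x, defect(a, x))
```

For this $\theta$ the defect equals $\tfrac12\log\frac{1+(a+x)^2}{1+x^2}$, which tends to $0$ as $|x|\to\infty$ for each fixed $a$, although $\theta(x)-x$ itself is unbounded.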
If $\theta$ is uniformly continuous, then this is equivalent to having $\theta(x)=x+\xi(x)$ where $\xi:X\to X$ is uniformly continuous and slowly oscillating in a sense similar to (6.4). The next proposition is completely elementary but we state it separately because the main idea of the proof is very clear in this context.

Proposition 6.38. Let $h:X^*\to\mathbb{R}$ be a continuous function such that $|h(k)|\to\infty$ as $k\to\infty$. Assume that $\theta:X\to X$ is uniformly continuous and $\theta(a+x)\sim a+\theta(x)$. If $v:X/Y\to\mathbb{R}$ is continuous, $H=h(P)+v(Q)$ and $H_\theta=h(P)+v\circ\theta(Q)$, then $\sigma_{\rm ess}(H_\theta)=\sigma(H)$.

Proof. This will be a consequence of Theorem 1.2. The self-adjoint operator $H_\theta$ is affiliated to $\mathscr{C}(X)$ because of Proposition 4.3. It remains to compute the localizations $\kappa.H_\theta$ for $\kappa\in\delta X$. The image $\pi\circ\theta(\kappa)$ is an ultrafilter on the compact space $X/Y$, hence it converges to some unique point $\tilde\kappa\in X/Y$. Let $z\in X$ be such that $\tilde\kappa=\pi(z)$. Since $v\equiv v\circ\pi$, we may define the translated function $\tilde\kappa.v(s)=v(\tilde\kappa+s)=v\circ\pi(z+x)$ for $s=\pi(x)\in X/Y$. We shall prove that $\kappa.H_\theta=\tilde\kappa.H$ where $\tilde\kappa.H=h(P)+\tilde\kappa.v(Q)$. Note that $\tilde\kappa.H=U_zHU_z^*$, so $\sigma(\tilde\kappa.H)=\sigma(H)$, which finishes the proof. Observe that $D(H)=D(h(P))$ is stable under translations, so it suffices to prove $\text{s-}\lim_{x\to\kappa}x.H_\theta f=\tilde\kappa.Hf$ if $f\in D(H)$. But this follows from the much stronger fact: $\text{s-}\lim_{x\to\kappa}x.v\circ\theta(Q)=\tilde\kappa.v(Q)$. This means that for each $f\in L^2(X)$, we have
\[ \lim_{x\to\kappa}\int_X|v\circ\pi(\theta(x+y))-v\circ\pi(z+y)|^2|f(y)|^2\,dy=0, \tag{6.36} \]
where $z$ is as above. Now, for large $x$, we have $\pi(\theta(x+y))\sim\pi(\theta(x))+\pi(y)$ and $\lim_{x\to\kappa}\pi(\theta(x))=\tilde\kappa=\pi(z)$, and since $v$ is uniformly continuous we have
\[ \lim_{x\to\kappa}\sup_{y\in K}|v\circ\pi(\theta(x+y))-v\circ\pi(z+y)|=0 \]
for each compact $K\subset X$. This clearly implies (6.36).

The extension of Proposition 6.38 to bounded measurable functions $v$ seems to require further conditions. Indeed, one could say that it suffices to use the dominated convergence theorem in (6.36). But this requires some care because $\kappa$ is a filter, we did not assume $X$ separable, and the dominated convergence theorem is not true if sequences are replaced by nets. We indicate below two situations where these problems can be avoided.

Proposition 6.39. If $X/Y$ is separable or if $\theta$ is such that $\theta^{-1}(N)$ is of measure zero whenever $N\subset X$ is of measure zero, then Proposition 6.38 remains valid if $v$ is a bounded measurable function.

Proof. If $X/Y$ is separable, then the point $\tilde\kappa$ has a countable fundamental system of neighborhoods $\{G_n\}$. For each $n$, choose $F_n\in\kappa$ such that $\pi(\theta(F_n))\subset G_n$ and then choose points $x_n\in X$ such that $x_n\in F_n$. Clearly, we shall have $\pi(\theta(x_n))\to\tilde\kappa$ and $\text{s-}\lim_{n\to\infty}x_n.H_\theta=\kappa.H_\theta$ if the left-hand side exists. Now, the rest of the proof of Proposition 6.38 works after replacing $x$ by $x_n$ and $x\to\kappa$ by $n\to\infty$; this time we can use the dominated convergence theorem directly in (6.36).

If $\theta$ has the property $|N|=0\Rightarrow|\theta^{-1}(N)|=0$, we argue as follows. Since $v$ is bounded, it is sufficient to prove (6.36) for $f$ the characteristic function of a compact set. Then we approximate $v$ in $L^2(X/Y)$ by functions $w\in C(X/Y)$, for such $w$ the relation (6.36) being obvious. The only problem which appears is to estimate the term
\[ \int_K|v\circ\eta(x+y)-w\circ\eta(x+y)|^2\,dy, \]
where $K\subset X$ is a compact set and $\eta=\pi\circ\theta$. The map $\eta:X\to X/Y$ is continuous and has the property $|N|=0\Rightarrow|\eta^{-1}(N)|=0$, by hypothesis and [16, Theorem 2.9]. But this implies that there is an integrable function $g\ge0$ on $X/Y$ such that the preceding integral is $\le\int_{X/Y}|v-w|^2g\,ds$, and this can be made as small as we wish.

In the case $X=\mathbb{R}^n$ and under stronger assumptions on $\theta$ we may extend Proposition 6.38 to unbounded functions $v$; in particular we may recover Theorem 5.1 of the revised version of [31]. In order to be specific, we assume that $h$ is as in Sec. 6.6 and that $s\le1$. In particular, our assumptions below imply those of Proposition 6.34. Then we easily obtain:

Proposition 6.40. Let $Y$ be the additive subgroup of $X=\mathbb{R}^n$ generated by $n$ linearly independent vectors. Let $\theta:X\to X$ be a homeomorphism such that $\theta$ and
$\theta^{-1}$ are Lipschitz and such that $\theta(a+x)\sim a+\theta(x)$. Assume that $v:X\to B(E)$ is a locally integrable $Y$-periodic function and that $V=v(Q)$ has the property: for each $\mu>0$ there is $\nu$ such that $|V|\le\mu\langle P\rangle^{2s}+\nu$. Then the operator $V_\theta=v\circ\theta(Q)$ has the same property and if we set $H=h(P)+V$ and $H_\theta=h(P)+V_\theta$ then $\sigma_{\rm ess}(H_\theta)=\sigma(H)$.

Example 6.41. We give an elementary example. Let $X=\mathbb{R}$, $Y=\mathbb{Z}$ and let $v$ be a real periodic locally integrable function on $\mathbb{R}$. Then the form sum $H=-\frac{d^2}{dx^2}+v(x)$ is a self-adjoint operator on $L^2(\mathbb{R})$ and its spectrum is purely absolutely continuous. Let $\theta:\mathbb{R}\to\mathbb{R}$ be of class $C^1$ with $\theta'(x)>0$ for all $x$ and such that $\theta'(x)\to1$ as $|x|\to\infty$. Then the form sum $H_\theta=-\frac{d^2}{dx^2}+v(\theta(x))$ is a self-adjoint operator and its essential spectrum is equal to the spectrum of $H$.

We shall now consider the questions treated above in this subsection from the point of view of Theorem 1.15. As in the proof of Proposition 6.3, we get from (6.21) and from Theorem 3.7:
\[ C(X/Y)\rtimes X=\{T\in\mathscr{C}(X)\mid y.T=T\ \ \forall\,y\in Y\}. \]
Clearly $C_0(X)\cap C(X/Y)=\{0\}$, from which it follows easily that $\mathcal{A}=C_0(X)+C(X/Y)$ is an algebra of interactions and that we are in the conditions of Proposition 3.16, hence we have a topological direct sum
\[ \mathscr{A}\equiv\mathcal{A}\rtimes X=\mathscr{K}(X)+C(X/Y)\rtimes X=\{T\in\mathscr{C}(X)\mid y.T-T\in\mathscr{K}(X)\ \ \forall\,y\in Y\}. \tag{6.37} \]
This is a rather trivial algebra but things become less trivial when we look at the image of $\mathscr{A}$ under an automorphism of $\mathscr{C}(X)$. If $\theta:X\to X$ is a uniformly continuous homeomorphism then $\theta^*:C(X)\to C(X)$ is the injective morphism defined by $\theta^*\varphi=\varphi\circ\theta$. Clearly $\theta^*C_0(X)=C_0(X)$ but in nontrivial situations the image through $\theta^*$ of an $X$-algebra is not an $X$-algebra. However, we are interested only in algebras of interactions (which contain $C_0(X)$), and the property of $\theta$ isolated in Definition 6.37 will be sufficient. The next lemma and its corollary are obvious.

Lemma 6.42. If $\theta:X\to X$ is uniformly continuous and $\theta(a+x)\sim a+\theta(x)$, then for each $a\in X$ the map $\tau_a\theta^*-\theta^*\tau_a$ sends $C(X)$ into $C_0(X)$.

Corollary 6.43. Let $\theta:X\to X$ be a uniformly continuous homeomorphism such that $\theta(a+x)\sim a+\theta(x)$. Then, if $\mathcal{A}$ is an algebra of interactions, $\mathcal{A}_\theta:=\theta^*\mathcal{A}$ is also an algebra of interactions. Moreover, $\theta^*$ leaves $C_0(X)$ invariant and so it induces a canonical isomorphism of $X$-algebras $\mathcal{A}/C_0(X)\cong\mathcal{A}_\theta/C_0(X)$.

We apply this to the situation (6.37). Since $X/Y$ is compact, we have
\[ \delta(\mathcal{A})=\sigma(\mathcal{A}/C_0(X))=\sigma(C(X/Y))=X/Y \]
and thus we get $\delta(\mathcal{A}_\theta) \cong X/Y$. Since $X$ acts transitively on $X/Y$ we see that, modulo a unitary equivalence, we have only one localization at infinity for an observable affiliated to $\mathfrak{A}_\theta := \mathcal{A}_\theta \rtimes X$. This assertion can be made more precise as follows.

Proposition 6.44. If $\theta : X \to X$ is a uniformly continuous homeomorphism such that $\theta(a + x) \sim a + \theta(x)$, then there is a unique morphism $\mathcal{P} : \mathfrak{A}_\theta \to C(X/Y) \rtimes X$ such that
$$\mathcal{P}\big(\varphi \circ \theta(Q)\,\psi(P)\big) = \begin{cases} 0 & \text{if } \varphi \in C_0(X), \\ \varphi(Q)\psi(P) & \text{if } \varphi \in C(X/Y). \end{cases} \tag{6.38}$$
This morphism is surjective and has $\mathcal{K}(X)$ as kernel. If $\kappa$ is a filter on $X$ finer than the Fréchet filter and such that $\lim_\kappa \pi \circ \theta = 0$, then $\mathcal{P}(T) = \lim_{x \to \kappa} U_x T U_x^*$, where the limit exists in $\mathcal{C}_s(X)$.

Proof. The uniqueness of $\mathcal{P}$ follows from the fact that the operators $\varphi \circ \theta(Q)\psi(P)$ with $\varphi \in \mathcal{A}$ generate $\mathfrak{A}_\theta$, and the surjectivity holds for a similar reason. A filter $\kappa$ as in the statement of the proposition exists because $\pi \circ \theta : X \to X/Y$ is surjective. If $T = \varphi \circ \theta(Q)\psi(P)$, then $U_x T U_x^* = \varphi \circ \theta(x + Q)\psi(P)$ and $\varphi \circ \theta(x + Q)\psi(P)\xi(Q)$ converges strongly to zero or to $\varphi(Q)\psi(P)\xi(Q)$ as $x \to \kappa$ if $\varphi \in C_0(X)$ or $\varphi \in C(X/Y)$, respectively. Here $\xi \in C_0(X)$ and Remark 3.13 is used. Thus, we can define $\mathcal{P}(T)$ by the last assertion of the proposition. If $\mathcal{P}(T) = 0$, then the beginning of the proof of Proposition 6.38 gives $\tau_\chi T = 0$ for all $\chi \in \delta X$, so $T$ is compact.

Remark 6.45. Thus, if $H$ is an element of $\mathfrak{A}_\theta$ or an observable affiliated to $\mathfrak{A}_\theta$, then
$$\sigma_{\mathrm{ess}}(H) = \sigma(\mathcal{P}(H)). \tag{6.39}$$
One may get a large class of Hamiltonians $H$ affiliated to $\mathfrak{A}_\theta$ by using [13, Theorem 2.8 and Lemma 2.9]. For example, let $H_0 \ge 0$ be a self-adjoint operator strictly$^{h}$ affiliated to $\mathfrak{A}_\theta$. Let $V$ be a quadratic form with $-\mu H_0 - \nu \le V \le \nu(H_0 + 1)$ for some $0 < \mu < 1$ and $\nu > 0$ and such that $(H_0 + 1)^{-\alpha} V (H_0 + 1)^{-1/2} \in \mathfrak{A}_\theta$ for some $\alpha > 0$. Then the form sum $H = H_0 + V$ is a self-adjoint operator affiliated to $\mathfrak{A}_\theta$. For example, the last condition is satisfied if $V \in \mathcal{A}_\theta$, and then one gets singular functions $V$ as limits of sequences $V_n$ such that $(H_0 + 1)^{-\alpha} V_n (H_0 + 1)^{-1/2} \in \mathfrak{A}_\theta$ is norm convergent (this gives a class of $V$ larger than that from Proposition 6.40).

$^{h}$ Strictly means $\|(1 + \varepsilon H_0)^{-1} T - T\| \to 0$ as $\varepsilon \to 0$ for all $T \in \mathfrak{A}_\theta$. For example, it suffices that $H_0 = h(P)$ where $h$ is a positive continuous function on $X^*$ which diverges at infinity.

Acknowledgments

It is a pleasure to thank Barry Simon for the stimulating correspondence and for sending us a copy of the paper [36]. We are also indebted to Françoise Piquard and George Skandalis for helpful discussions and to Steffen Roch for pointing out to us an erroneous assertion in the first version of this paper, cf. Remark 5.12. Finally, we are grateful to the referees, whose comments and criticism allowed us to eliminate several errors from the first version of this paper and to significantly improve the general presentation.

A. Appendix

A.1. Heuristic comments

We give here a detailed proof of Theorem 3.7. We follow rather closely Landstad's arguments, but we use the characterization of $\mathcal{A} \rtimes X$ taken as Definition 3.1, which makes the proof more transparent. We mention that the space $\mathcal{B}_2^\circ(X)$ is suggested by Kato's theory of smooth operators, cf. [42]. We shall not discuss the uniqueness of $\mathcal{A}$ because the proof of [30, Lemma 3.1] can hardly be simplified (if $X$ is discrete we have $\mathcal{A} = \mathcal{I}(\mathfrak{A})$ so uniqueness is trivial, see Remark A.8).

We begin with some heuristic comments which will make the rigorous proof quite natural. The first question is, given $\mathfrak{A}$, how to determine $\mathcal{A}$. Observe that if we know $A\psi(P + k)$ for all $k$, then we can recover the operator $A$ by integrating over $k$, because this operation gives $A$ multiplied by the number $\int_{X^*} \psi(k)\,dk$. On the other hand, $\psi(P + k) = V_k^*\psi(P)V_k$, so that if $A$ commutes with $V_k$, then we get $A \int_{X^*}\psi(k)\,dk = \int_{X^*} V_k^* A\psi(P) V_k\,dk$. Thus, if $T = \sum_j \varphi_j(Q)\psi_j(P)$, then
$$\sum_j \varphi_j(Q) \int_{X^*} \psi_j(k)\,dk = \int_{X^*} V_k^* T V_k\,dk =: \mathcal{I}(T).$$
If the group $X$ is discrete, so that $X^*$ is compact, this formal argument can easily be made rigorous, the map $\mathcal{I}$ is well defined on all of $\mathcal{B}(X)$ and we have $\mathcal{A} = \mathcal{I}(\mathfrak{A})$ (we strongly advise the reader to first prove Landstad's theorem for discrete $X$; this is a really pleasant exercise). In general, one can give a meaning to $\mathcal{I}(T)$ for a sufficiently large class of $T$ for the rest of the proof to work. Anyway, the preceding formula shows how to extract the part in $\mathcal{A}$ of the operator $T \in \mathfrak{A}$.

The second point is that one can reconstruct $T$ from such quantities by using the formally obvious relation
$$T = \int_X \mathcal{I}(T U_x^*)\,U_x\,dx,$$
which is just the Fourier inversion formula, see (A.11). But the right-hand side here is, again formally, in $\mathcal{A} \rtimes X$.

Making all this rigorous demands some preliminary constructions that we present in Sec. A.2 in a more general context. We set $\mathcal{H} = L^2(X)$ and we abbreviate $\mathcal{B} = \mathcal{B}(X) = B(\mathcal{H})$. We recall that we have unitary representations $U_x$ and $V_k$ of $X$ and $X^*$ in $\mathcal{H}$ which satisfy the canonical commutation relations
$$U_x V_k = k(x)\,V_k U_x. \tag{A.1}$$
Most of the next arguments do not depend on the explicit form of the operators $U_x$, $V_k$.

A.2. Smooth operators

We first introduce the space of "smooth operators" with respect to the unitary representation $V_k$:
$$\mathcal{B}_2^\circ := \Big\{ T \in \mathcal{B} \ \Big|\ \int_{X^*} \|T V_k f\|^2\,dk < \infty, \ \forall f \in \mathcal{H} \Big\}. \tag{A.2}$$

Lemma A.1. $\mathcal{B}_2^\circ$ is a left ideal (not closed in general) in $\mathcal{B}$. If $T \in \mathcal{B}_2^\circ$, then
$$|||T||| := \sup_{\|f\| = 1} \Big( \int_{X^*} \|T V_k f\|^2\,dk \Big)^{1/2} < \infty \tag{A.3}$$
and $(\mathcal{B}_2^\circ, |||\cdot|||)$ is a Banach space such that $|||ST||| \le \|S\| \cdot |||T|||$ for all $S \in \mathcal{B}$. Finally, if $x \in X$ and $T \in \mathcal{B}_2^\circ$ then $T U_x \in \mathcal{B}_2^\circ$ and $|||T U_x||| = |||T|||$.

For the proof of (A.3), we have only to remark that the map which sends $f \in \mathcal{H}$ into $(T V_k f)_{k \in X^*} \in L^2(X^*; \mathcal{H})$ is clearly closed and linear, hence it is continuous. The last assertion of the lemma follows from (A.1).

The map $x \mapsto T U_x \in \mathcal{B}_2^\circ$ is not norm continuous in general. For this reason it will be convenient to consider the following left ideal in $\mathcal{B}$ and closed subspace of $\mathcal{B}_2^\circ$:
$$\mathcal{B}_2 := \Big\{ T \in \mathcal{B}_2^\circ \ \Big|\ \lim_{x \to 0} |||T U_x - T||| = 0 \Big\}. \tag{A.4}$$
The following property of $\mathcal{B}_2$ will be important in what follows: if $S^*S \le \sum_j T_j^* T_j$ for some $T_j \in \mathcal{B}_2$, then $S \in \mathcal{B}_2$.

Lemma A.2. If $\psi \in L^\infty(X^*)$, then $\psi(P) \in \mathcal{B}_2^\circ$ if and only if $\psi \in L^2(X^*)$. In this case, we have $\psi(P) \in \mathcal{B}_2$ and $|||\psi(P)||| = \|\psi\|_{L^2}$.

Proof. We have
$$\int_{X^*} \|\psi(P) V_k f\|^2\,dk = \int_{X^*} \|V_k^*\psi(P) V_k f\|^2\,dk = \int_{X^*} \|\psi(P + k) f\|^2\,dk = \int_{X^*} dk \int_{X^*} |\psi(p + k)|^2\,|\hat{f}(p)|^2\,dp = \|\psi\|^2_{L^2(X^*)}\,\|\hat{f}\|^2_{L^2(X^*)} = \|\psi\|^2_{L^2(X^*)}\,\|f\|^2_{L^2(X)}.$$
Since $|||\psi(P) U_x - \psi(P)||| = \|x\psi - \psi\|_{L^2}$, where $x$ is identified with the map $k \mapsto k(x)$ on $X^*$, we clearly get $\psi(P) \in \mathcal{B}_2$.

Definition A.3. $\mathcal{B}_1$ is the linear subspace of $\mathcal{B}$ generated by the operators of the form $S^*T$ with $S, T \in \mathcal{B}_2$.
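The identity in the proof of Lemma A.2 can be checked numerically in the simplest discrete case. The sketch below is only an illustration under stated assumptions: it takes $X = \mathbb{Z}_n$ with counting measure, gives each point of the dual $X^* \cong \mathbb{Z}_n$ mass $1/n$, and uses the characters $k(x) = e^{2\pi i kx/n}$; these normalisations and all function names are choices made here, not taken from the paper.

```python
import cmath


def character(k, x, n):
    """Value k(x) of the character of Z_n labelled by k (assumed convention)."""
    return cmath.exp(2j * cmath.pi * k * x / n)


def fourier(f):
    """(F f)(p) = sum_x conj(p(x)) f(x), with counting measure on X = Z_n."""
    n = len(f)
    return [sum(character(p, x, n).conjugate() * f[x] for x in range(n))
            for p in range(n)]


def inverse_fourier(g):
    """(F^{-1} g)(x) = (1/n) sum_p p(x) g(p), mass 1/n per point of the dual."""
    n = len(g)
    return [sum(character(p, x, n) * g[p] for p in range(n)) / n
            for x in range(n)]


def psi_of_p(psi, f):
    """psi(P) f = F^{-1}(psi * F f): psi acts as a Fourier multiplier."""
    fhat = fourier(f)
    return inverse_fourier([psi[p] * fhat[p] for p in range(len(f))])


def modulate(k, f):
    """(V_k f)(x) = k(x) f(x)."""
    n = len(f)
    return [character(k, x, n) * f[x] for x in range(n)]


def norm_sq(f):
    """Squared l^2 norm with counting measure."""
    return sum(abs(c) ** 2 for c in f)


def smooth_norm_sq(psi, f):
    """(1/n) sum_k ||psi(P) V_k f||^2, the finite analogue of the integral in (A.2)."""
    n = len(f)
    return sum(norm_sq(psi_of_p(psi, modulate(k, f))) for k in range(n)) / n


def dual_norm_sq(psi):
    """||psi||^2 in L^2(X^*), i.e. (1/n) sum_p |psi(p)|^2."""
    return norm_sq(psi) / len(psi)
```

For any vectors `psi` and `f` of the same length, `smooth_norm_sq(psi, f)` should agree with `dual_norm_sq(psi) * norm_sq(f)` up to rounding, which is the finite version of $|||\psi(P)||| = \|\psi\|_{L^2}$.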
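The polarization identity
$$4 S^* T = \sum_{m=0}^{3} i^m\,(i^m S + T)^*(i^m S + T) \tag{A.5}$$
shows that $\mathcal{B}_1$ is linearly generated by the operators of the form $S^*S$ with $S \in \mathcal{B}_2$. We recall that a subset $\mathcal{C} \subset \mathcal{B}$ is called hereditary if: $0 \le S \le T \in \mathcal{C} \Rightarrow S \in \mathcal{C}$.

Lemma A.4. $\mathcal{B}_1$ is an hereditary $*$-subalgebra of $\mathcal{B}$. If $S \ge 0$, then $S \in \mathcal{B}_1$ if and only if $\sqrt{S} \in \mathcal{B}_2$. If $S \in \mathcal{B}_1$, then $U_x S U_y \in \mathcal{B}_1$ for all $x, y \in X$.

Proof. The fact that $\mathcal{B}_1$ is a linear space and that $S^* \in \mathcal{B}_1$ if $S \in \mathcal{B}_1$ is obvious. The space $\mathcal{B}_1$ is stable under multiplication because for $S, T \in \mathcal{B}_2$ we have $S^*S\,T^*T = S^* \cdot S T^* T \in \mathcal{B}_1$, the space $\mathcal{B}_2$ being a left ideal.

We prove now that if $S \ge 0$ and $S \in \mathcal{B}_1$ then $\sqrt{S} \in \mathcal{B}_2$ (the reverse implication being obvious). Since $S \in \mathcal{B}_1$ we have $S = \sum_{j=1}^{n} \lambda_j S_j^* S_j$ with $\lambda_j \in \mathbb{C}$ and $S_j \in \mathcal{B}_2$. If $S = S^*$, then by taking the real parts we may assume that $\lambda_j \in \mathbb{R}$. Then
$$S = \sum_{\lambda_j > 0} \lambda_j S_j^* S_j + \sum_{\lambda_j < 0} \lambda_j S_j^* S_j \le \sum_{\lambda_j > 0} \lambda_j S_j^* S_j,$$
which implies $\sqrt{S} \in \mathcal{B}_2$ by the property mentioned after (A.4). Finally, if $0 \le S \le T \in \mathcal{B}_1$, then $\sqrt{T} \in \mathcal{B}_2$, so $\sqrt{S} \in \mathcal{B}_2$ and hence $S \in \mathcal{B}_1$, by the same property.

Let $T \in \mathcal{B}_1$ and let us write $T = \sum_{j=1}^{n} S_j^* T_j$ with $S_j, T_j \in \mathcal{B}_2$. Then if $f, g \in \mathcal{H}$:
$$\int_{X^*} |\langle V_k f, T V_k g \rangle|\,dk \le \sum_{j=1}^{n} \Big( \int_{X^*} \|S_j V_k f\|^2\,dk \Big)^{1/2} \Big( \int_{X^*} \|T_j V_k g\|^2\,dk \Big)^{1/2} \le \sum_{j=1}^{n} |||S_j|||\,|||T_j|||\,\|f\|\,\|g\| < \infty.$$
From the operator version of the Riesz lemma, it follows that there is a unique operator $\mathcal{I}(T) \in \mathcal{B}(X)$ such that
$$\langle f, \mathcal{I}(T) g \rangle = \int_{X^*} \langle V_k f, T V_k g \rangle\,dk \quad \text{for all } f, g \in \mathcal{H}.$$
In other terms, we see that the strongly continuous map $k \mapsto V_k^* T V_k$ is such that the integral
$$\mathcal{I}(T) = \int_{X^*} V_k^* T V_k\,dk \tag{A.6}$$
exists in the weak operator topology of $\mathcal{B}(X)$. It is clear that for all $T \in \mathcal{B}_2$, we have
$$\|\mathcal{I}(T^*T)\|^{1/2} = |||T|||. \tag{A.7}$$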
Moreover, the computation done above gives for $S, T \in \mathcal{B}_2$:
$$\|\mathcal{I}(S^*T)\| \le |||S|||\,|||T|||. \tag{A.8}$$

Example A.5. If $S \in \mathcal{B}(X)$ and $\xi, \eta \in L^\infty(X^*) \cap L^2(X^*)$, then $\xi(P) S \eta(P) \in \mathcal{B}_1$ and
$$\|\mathcal{I}(\xi(P) S \eta(P))\| \le \|S\|\,\|\xi\|_{L^2(X^*)}\,\|\eta\|_{L^2(X^*)}. \tag{A.9}$$
Indeed, we write $\xi(P) S \eta(P) = \big(S^*\bar{\xi}(P)\big)^* \eta(P)$ and use (A.8) and Lemma A.2.

Lemma A.6. If $T \in \mathcal{B}_1$, then $(x, y) \mapsto \mathcal{I}(U_x T U_y)$ is bounded and norm continuous.

Proof. Due to (A.8) it suffices to assume that $T = S^*S$ for some $S \in \mathcal{B}_2$ and to show continuity at $x = y = 0$ of the map $(x, y) \mapsto \mathcal{I}(S_x^* S_y)$ with $S_x = S U_x$. Then
$$\|\mathcal{I}(S_x^* S_y) - \mathcal{I}(S^*S)\| \le \|\mathcal{I}((S_x - S)^* S_y)\| + \|\mathcal{I}(S^*(S_y - S))\| \le |||S_x - S|||\,|||S||| + |||S|||\,|||S_y - S|||$$
because of the estimate (A.8).

Proposition A.7. If $T \in \mathcal{B}_1$, then $\mathcal{I}(T) \in C(X)$ and $U_x^*\,\mathcal{I}(T)\,U_x = \mathcal{I}(U_x^* T U_x)$ for all $x \in X$. The map $\mathcal{I} : \mathcal{B}_1 \to C(X)$ is linear and positive.

Proof. We clearly have $V_k\,\mathcal{I}(T)\,V_k^* = \mathcal{I}(T)$ for all $k \in X^*$. Since the von Neumann algebra generated by $\{V_k\}_{k \in X^*}$ is just $L^\infty(X)$, we get $\varphi(Q)\mathcal{I}(T) = \mathcal{I}(T)\varphi(Q)$ for all $\varphi \in L^\infty(X)$. But $L^\infty(X)$ is maximal abelian in $\mathcal{B}$, thus $\mathcal{I}(T) \in L^\infty(X)$. From (A.1), we get $U_x^* V_k^* T V_k U_x = V_k^* U_x^* T U_x V_k$, hence $\mathcal{I}(U_x^* T U_x) = U_x^*\,\mathcal{I}(T)\,U_x$. Since $\varphi \in C(X)$ if and only if $\varphi \in L^\infty(X)$ and $x \mapsto U_x^*\varphi(Q)U_x$ is norm continuous, we get $\mathcal{I}(T) \in C(X)$. The last assertion of the proposition is obvious.

Remark A.8. In a similar way, we can associate an hereditary $*$-subalgebra $\mathcal{B}_1^\circ$ to $\mathcal{B}_2^\circ$ and define an extension of $\mathcal{I}$ to it, but then we only have $\mathcal{I} : \mathcal{B}_1^\circ \to L^\infty(X)$. If $X$ is a discrete group, then $\mathcal{B}_1 = \mathcal{B}$ and $\mathcal{I}$ is a conditional expectation.

For $T \in \mathcal{B}_1$ and for $x \in X$, we set $\widetilde{T}(x) := \mathcal{I}(T U_x^*)$, so that we associate to $T$ a function $\widetilde{T} : X \to C(X)$ which, by Lemma A.6, is bounded and norm continuous. Let $\widehat{T}$ be the Fourier transform of the function $k \mapsto V_k^* T V_k$, more precisely,
$$\widehat{T}(x) = \int_{X^*} \overline{k(x)}\,V_k^* T V_k\,dk,$$
where the integral exists in the weak operator topology. From (A.1), we get
$$\widetilde{T}(x) = \widehat{T}(x)\,U_x^* \tag{A.10}$$
so that $\widetilde{T}$ is a kind of twisted Fourier transform. Now, the inversion formula for the Fourier transform gives us a formal relation
$$T = \int_X \widetilde{T}(x)\,U_x\,dx \tag{A.11}$$
whose rigorous meaning is given below.
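For a finite group the formal relations (A.6), (A.10) and (A.11) become exact matrix identities, which gives a quick sanity check of the signs. The following sketch assumes $X = \mathbb{Z}_n$ with counting measure, mass $1/n$ per point of the dual, $(U_x f)(y) = f(y + x)$ and $(V_k f)(y) = e^{2\pi i ky/n} f(y)$; these conventions and the helper names are illustrative assumptions. With them, $\mathcal{I}(T)$ is the diagonal part of the matrix $T$ (a conditional expectation onto multiplication operators), and summing $\mathcal{I}(T U_x^*) U_x$ over $x$ returns $T$.

```python
import cmath


def character(k, x, n):
    """Value k(x) of the character of Z_n labelled by k (assumed convention)."""
    return cmath.exp(2j * cmath.pi * k * x / n)


def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def adjoint(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def translation(x, n):
    """Matrix of U_x, where (U_x f)(y) = f(y + x)."""
    return [[1 + 0j if (z - y - x) % n == 0 else 0j for z in range(n)]
            for y in range(n)]


def modulation(k, n):
    """Matrix of V_k, multiplication by the character k."""
    return [[character(k, y, n) if y == z else 0j for z in range(n)]
            for y in range(n)]


def average(t):
    """I(T) = (1/n) sum_k V_k^* T V_k, the finite analogue of (A.6)."""
    n = len(t)
    total = [[0j] * n for _ in range(n)]
    for k in range(n):
        v = modulation(k, n)
        term = matmul(adjoint(v), matmul(t, v))
        total = [[total[i][j] + term[i][j] / n for j in range(n)]
                 for i in range(n)]
    return total


def reconstruct(t):
    """sum_x I(T U_x^*) U_x, the finite analogue of (A.11)."""
    n = len(t)
    total = [[0j] * n for _ in range(n)]
    for x in range(n):
        u = translation(x, n)
        term = matmul(average(matmul(t, adjoint(u))), u)
        total = [[total[i][j] + term[i][j] for j in range(n)]
                 for i in range(n)]
    return total


def distance(a, b):
    """Largest entrywise difference between two matrices."""
    n = len(a)
    return max(abs(a[i][j] - b[i][j]) for i in range(n) for j in range(n))
```

In this finite setting `average(t)` keeps only the diagonal of `t`, and `reconstruct(t)` agrees with `t` up to rounding; the commutation relation (A.1) can be checked with `translation` and `modulation` in the same way.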
Lemma A.9. For each $T \in \mathcal{B}_1$ and $\theta \in L^1(X)$, we have
$$\int_X \widetilde{T}(x)\,U_x\,\theta(x)\,dx = \int_{X^*} V_k^* T V_k\,\hat{\theta}(k)\,dk, \tag{A.12}$$
where both integrals exist in the weak operator topology.

Proof. For each $f \in \mathcal{H}$,
$$\int_X \langle f, \widetilde{T}(x) U_x f \rangle\,\theta(x)\,dx = \int_X \int_{X^*} \langle V_k f, T\,\overline{k(x)}\,V_k f \rangle\,dk\,\theta(x)\,dx = \int_X \int_{X^*} \overline{k(x)}\,\langle V_k f, T V_k f \rangle\,dk\,\theta(x)\,dx.$$
Since $\theta \in L^1(X)$ and the function $k \mapsto \langle V_k f, T V_k f \rangle$ is in $L^1(X^*)$, we can apply the Fubini theorem and thus get (A.12).

Let us remark that the left-hand side of the identity (A.12) always exists in the strong operator topology, and the same is true for the right-hand side if $\hat{\theta} \in L^1(X^*)$. We recall the following result (see, for example, [16, Lemma 4.19]).

Lemma A.10. Let $\Lambda \subset X^*$ be a neighborhood of the neutral element in $X^*$ and let $\varepsilon > 0$. Then there is $\theta \in C_c(X)$ such that $\hat{\theta} \ge 0$, $\hat{\theta} \in L^1(X^*)$, $\int_{X^*} \hat{\theta}(k)\,dk = 1$, and $\int_{X^* \setminus \Lambda} \hat{\theta}(k)\,dk \le \varepsilon$.

The next version of the Fourier inversion formula is an easy consequence of Lemmas A.9 and A.10:

Proposition A.11. If $T \in \mathcal{B}_1$ and $k \mapsto V_k^* T V_k$ is norm continuous, then $T$ belongs to the norm closure of the set of operators of the form $T_\theta = \int_X \widetilde{T}(x)\,U_x\,\theta(x)\,dx$ with $\theta \in C_c(X)$ and $\hat{\theta} \in L^1(X^*)$.

Lemma A.12. Let $\psi \in C_0(X^*)$ and $T \in \mathcal{B}_1$. Then, if $\theta \in L^1(X)$ and $\hat{\theta} \in L^1(X^*)$, the integral $\int_X \widetilde{T}(x)\,U_x\,\psi(P)\,\theta(x)\,dx$ exists in the norm operator topology and $\mathcal{I}(T)\psi(P)$ is a norm limit of such integrals.

Proof. The map $x \mapsto U_x\psi(P)$ is norm continuous if $\psi \in C_0(X^*)$, hence the integrand is norm continuous. The last assertion follows by choosing $\theta$ as in Lemma A.10 but with the roles of $X$ and $X^*$ inverted.

A.3. Proof of Landstad's theorem

We are now ready to prove Landstad's theorem (Theorem 3.7). From now on, $\mathcal{A}$ and $\mathfrak{A}$ are as in that theorem.

Lemma A.13. $\mathfrak{A}$ is a non-degenerate $C_0(X^*)$-bimodule. More precisely, if $A \in \mathfrak{A}$ and $\psi \in C_0(X^*)$, then $A\psi(P) \in \mathfrak{A}$, $\psi(P)A \in \mathfrak{A}$ and $A$ is a limit of operators of the form $A\psi(P)$ and of operators of the form $\psi(P)A$.
Proof. It is clearly sufficient to consider only the right action and, since each $\psi \in C_0(X^*)$ is a limit in the sup norm of functions whose Fourier transform is integrable, we may assume $\hat{\psi} \in L^1(X)$. Then $A\psi(P) = \int_X A U_x\,\hat{\psi}(x)\,dx$; we have $A U_x \in \mathfrak{A}$ and the integral converges in norm by the second assumption of Theorem 3.7, so $A\psi(P) \in \mathfrak{A}$. By taking $\hat{\psi} = |K|^{-1} 1_K$, where $K$ runs over the set of compact neighborhoods of the origin in $X$, and since the map $x \mapsto A U_x$ is norm continuous, we see that $A$ is a norm limit of operators of the form $A\psi(P)$.

Lemma A.14. $\mathcal{A}$ is an $X$-subalgebra of $C(X)$.

Proof. It is clear that $\mathcal{A}$ is a norm closed subspace of $C(X)$ stable under conjugation and stable under translations (note that $(\tau_x\varphi)(Q) = U_x\varphi(Q)U_x^*$). To show that it is stable under multiplication, let $\varphi_1, \varphi_2 \in \mathcal{A}$ and $\psi \in C_0(X^*)$. Since $\varphi_2(Q)\psi(P) \in \mathfrak{A}$ we can write it as a norm limit of operators of the form $\xi(P)A$ with $\xi \in C_0(X^*)$ and $A \in \mathfrak{A}$, so that $\varphi_1(Q)\varphi_2(Q)\psi(P)$ is a norm limit of operators of the form $\varphi_1(Q)\xi(P)A$ which belong to $\mathfrak{A}$.

Now, we may consider the crossed product $\mathcal{A} \rtimes X$; this is the norm closed subspace of $\mathcal{B}(X)$ generated by the operators $\varphi(Q)\psi(P)$ with $\varphi \in \mathcal{A}$ and $\psi \in C_0(X^*)$. We clearly have $\mathcal{A} \rtimes X \subset \mathfrak{A}$ and it remains to prove the reverse inclusion.

Lemma A.15. If $T \in \mathfrak{A} \cap \mathcal{B}_1$, then $\mathcal{I}(T) \in \mathcal{A}$.

Proof. Due to Proposition A.7 it suffices to show that $\mathcal{I}(T)\psi(P) \in \mathfrak{A}$ if $\psi \in C_0(X^*)$. Because of Lemma A.12, it is enough to prove that $\int_X \widetilde{T}(x)\,U_x\,\psi(P)\,\theta(x)\,dx \in \mathfrak{A}$ if $\theta \in L^1(X)$ and $\hat{\theta} \in L^1(X^*)$. But (A.12) implies:
$$\int_X \widetilde{T}(x)\,U_x\,\psi(P)\,\theta(x)\,dx = \int_{X^*} V_k^* T V_k\,\psi(P)\,\hat{\theta}(k)\,dk.$$
Since $V_k^* T V_k\,\psi(P) \in \mathfrak{A}$ and is a norm continuous function of $k$, the last integral belongs to $\mathfrak{A}$.

Lemma A.16. If $T \in \mathfrak{A} \cap \mathcal{B}_1$, then $T\psi(P) \in \mathcal{A} \rtimes X$ for all $\psi \in C_0(X^*)$.

Proof. We have $T U_x^* \in \mathfrak{A} \cap \mathcal{B}_1$, hence $\widetilde{T}(x) = \mathcal{I}(T U_x^*) \in \mathcal{A}$, and thus the map $\widetilde{T} : X \to \mathcal{A}$ is bounded and norm continuous. On the other hand, Proposition A.11 shows that for each $\psi \in C_0(X^*)$ the operator $T\psi(P)$ is a norm limit of integrals $\int_X \widetilde{T}(x)\,U_x\,\psi(P)\,\theta(x)\,dx$. But $U_x\psi(P) \in C_0(X^*)$ and the map $x \mapsto U_x\psi(P)$ is norm continuous, thus the preceding integral converges in norm. Also, we have $\widetilde{T}(x)\,U_x\psi(P) \in \mathcal{A} \rtimes X$ for each $x$, thus the integral belongs to $\mathcal{A} \rtimes X$.

Now, we prove $\mathfrak{A} \subset \mathcal{A} \rtimes X$. For this it suffices to find a dense subset of $\mathfrak{A}$ which is included in $\mathcal{A} \rtimes X$. Example A.5 and Lemma A.13 imply that $\mathfrak{A} \cap \mathcal{B}_1$ is a dense subspace of $\mathfrak{A}$. Thus, it suffices to show that $\mathfrak{A} \cap \mathcal{B}_1 \subset \mathcal{A} \rtimes X$. But this follows from Lemma A.16 because each $T \in \mathfrak{A}$ is a norm limit of operators of the form $T\psi(P)$ with $\psi \in C_0(X^*)$.
References

[1] W. Amrein, A. Boutet de Monvel and V. Georgescu, C_0-Groups, Commutator Methods and Spectral Theory of N-Body Hamiltonians (Birkhäuser-Verlag, 1996).
[2] W. Amrein, M. Măntoiu and R. Purice, Propagation properties for Schrödinger operators affiliated with certain C*-algebras, Ann. Henri Poincaré 3(6) (2002) 1215–1232.
[3] J. Bellissard, Gap labelling theorems for Schrödinger operators, in From Number Theory to Physics (Les Houches 1989), eds. J. M. Luck, P. Moussa and M. Waldschmidt (Springer-Verlag, 1993), pp. 538–630.
[4] J. Bellissard, Non Commutative Methods in Semiclassical Analysis, Lecture Notes in Mathematics, Vol. 1589 (Springer-Verlag, 1994).
[5] N. Bourbaki, Éléments de mathématique: Topologie générale, Ch. 1-4 (Paris, Hermann, 1971).
[6] A. Boutet de Monvel and V. Georgescu, Graded C*-algebras in the N-body problem, J. Math. Phys. 32 (1991) 3101–3110.
[7] A. Boutet de Monvel and V. Georgescu, Graded C*-algebras associated to symplectic spaces and spectral analysis of many channel Hamiltonians, in Dynamics of Complex and Irregular Systems (Bielefeld 1991), eds. Ph. Blanchard et al., Vol. 8 (World Scientific Publishing, River Edge, NJ, 1993), pp. 22–66.
[8] L. A. Coburn, R. D. Moyer and I. M. Singer, C*-algebras of almost periodic pseudodifferential operators, Acta Math. 130 (1973) 279–307.
[9] H. O. Cordes, Spectral Theory of Linear Differential Operators and Comparison Algebras (Cambridge University Press, 1987).
[10] M. Damak, On the spectral theory of dispersive N-body Hamiltonians, J. Math. Phys. 40 (1999) 35–48.
[11] M. Damak and V. Georgescu, C*-crossed products and a generalized quantum mechanical N-body problem, in Proc. Symp. Math. Phys. and Quantum Field Theory (Berkeley, 1999), pp. 51–69; Electron. J. Differ. Equ. Conf. 4 (2000) 51–69; preprint 99-481 at http://www.ma.utexas.edu/mp_arc/ is an improved version.
[12] M. Damak and V. Georgescu, C*-algebras related to the N-body problem and the self-adjoint operators affiliated to them, http://www.ma.utexas.edu/mp_arc/, preprint 99-482.
[13] M. Damak and V. Georgescu, Self-adjoint operators affiliated to C*-algebras, Rev. Math. Phys. 16 (2004) 257–280.
[14] J. Dixmier, Les C*-Algèbres et Leurs Représentations, 2nd edn. (Gauthier-Villars, 1969).
[15] J. M. G. Fell and R. S. Doran, Representations of *-Algebras, Locally Compact Groups, and Banach *-Algebraic Bundles, Vols. 1–2 (Academic Press, 1988).
[16] G. B. Folland, A Course in Abstract Harmonic Analysis (CRC Press, 1995).
[17] V. Georgescu and S. Golénia, Isometries, Fock spaces, and spectral analysis of Schrödinger operators on trees, J. Func. Anal. 227 (2005) 389–429; preprint 04-182 at http://www.ma.utexas.edu/mp_arc/.
[18] V. Georgescu and S. Golénia, Decay preserving operators and stability of the essential spectrum, preprint math.SP/0411489 at http://arXiv.org.
[19] V. Georgescu and A. Iftimovici, Spectral analysis of some anisotropic quantum systems, unpublished notes (1998).
[20] V. Georgescu and A. Iftimovici, C*-algebras of energy observables: I, General theory and bumps algebras, preprint 00-521 at http://www.ma.utexas.edu/mp_arc/.
[21] V. Georgescu and A. Iftimovici, C*-algebras of energy observables: II, Graded symplectic algebras and magnetic Hamiltonians, preprint 01-99 at http://www.ma.utexas.edu/mp_arc/.
[22] V. Georgescu and A. Iftimovici, Crossed products of C*-algebras and spectral analysis of quantum Hamiltonians, Comm. Math. Phys. 228 (2002) 519–560.
[23] V. Georgescu and A. Iftimovici, C*-algebras of quantum Hamiltonians, in Operator Algebras and Mathematical Physics, eds. J.-M. Combes, J. Cuntz, G. A. Elliot, G. Nenciu, H. Siedentop and S. Stratila (Theta Foundation, Bucarest, 2003), pp. 123–167; preprint 02-410 at http://www.ma.utexas.edu/mp_arc/.
[24] S. Golénia, C*-algebras of anisotropic Schrödinger operators on trees, Ann. Henri Poincaré 5 (2004) 1097–1115; preprint 02-439 at http://www.ma.utexas.edu/mp_arc/.
[25] B. Helffer and A. Mohamed, Caractérisation du spectre essentiel de l'opérateur de Schrödinger avec un champ magnétique, Ann. Inst. Fourier (Grenoble) 38 (1988) 95–112.
[26] N. Hindman and D. Strauss, Algebra in the Stone–Čech Compactification: Theory and Applications (Walter de Gruyter, 1998).
[27] L. Hörmander, The Analysis of Linear Partial Differential Operators (Springer, Berlin, 1983).
[28] A. Iftimovici, Nonperturbative techniques in the investigation of the spectral properties of many-channel systems, in Partial Differential Equations and Spectral Theory (Clausthal 2000), Oper. Theory Adv. Appl., eds. M. Demuth, B.-W. Schultze, Vol. 126 (Birkhäuser Verlag, 2001), pp. 155–163; preprint 00-433 at http://www.ma.utexas.edu/mp_arc/.
[29] M. Klaus, On −d²/dx² + V where V has infinitely many "bumps", Ann. Inst. H. Poincaré Phys. Théor. 38 (1983) 7–13.
[30] M. B. Landstad, Duality theory for covariant systems, Trans. Amer. Math. Soc. 248 (1979) 223–269.
[31] Y. Last and B. Simon, The essential spectrum of Schrödinger, Jacobi, and CMV operators, preprint 05-112 at http://www.ma.utexas.edu/mp_arc/; a revised version is preprint 304 at http://www.math.caltech.edu/people/biblio.html.
[32] M. Măntoiu, C*-algebras, dynamical systems at infinity and the essential spectrum of generalized Schrödinger operators, J. Reine Angew. Math. 550 (2002) 211–229.
[33] M. Măntoiu, R. Purice and S. Richard, Spectral and propagation results for magnetic Schrödinger operators; A C*-Algebraic Framework, preprint 05-84 at http://www.ma.utexas.edu/mp_arc/.
[34] V. Nistor, Pseudo-differential operators on non-compact manifolds and analysis on polyhedral domains, in Proc. Workshop on Spectral Geometry of Manifolds with Boundary and Decomposition of Manifolds, Contemporary Mathematics, Vol. 366 (American Mathematical Society, Providence, RI, 2005), pp. 307–328.
[35] G. K. Pedersen, C*-algebras and their automorphisms groups (Academic Press, 1979).
[36] V. S. Rabinovich, Essential spectrum of perturbed pseudo-differential operators. Applications to the Schrödinger, Klein–Gordon, and Dirac operators, Russian J. Math. Phys. 12 (2005) 62–80.
[38] V. S. Rabinovich, S. Roch and J. Roe, Fredholm indices of band-dominated operators, Integral Equations Operator Theory 49 (2004) 221–238.
[39] V. S. Rabinovich, S. Roch and B. Silbermann, Fredholm theory and finite section method for band-dominated operators, Integral Equations Operator Theory 30(4) (1998) 452–495.
[40] V. S. Rabinovich, S. Roch and B. Silbermann, Limit Operators and their Applications in Operator Theory, Vol. 150 (Birkhäuser, 2004).
[41] I. Raeburn, On crossed products and Takai duality, Proc. Edinburgh Math. Soc. 31 (1988) 321–330.
[42] M. Reed and B. Simon, Methods of Modern Mathematical Physics III: Scattering Theory (Academic Press, 1979).
[43] S. Richard, Spectral and scattering theory for Schrödinger operators with Cartesian anisotropy, Publ. Res. Inst. Math. Sci. 41(1) (2005) 73–111.
[44] O. Rodot, On a class of anisotropic asymptotically periodic Hamiltonians, C. R. Acad. Sci. Paris, Ser. I Math. 334 (2002) 575–579.
[45] J. Roe, Coarse cohomology and index theory on complete Riemannian manifolds, Mem. Amer. Math. Soc. 497 (1993).
[46] J. Roe, Lectures on Coarse Geometry, University Lecture Series, Vol. 31 (American Mathematical Society, Providence, RI, 2003).
[47] J. Roe, Band-dominated Fredholm operators on discrete groups, Integral Equations Operator Theory 51 (2005) 411–416.
[48] P. Samuel, Ultrafilters and compactification of uniform spaces, Trans. Amer. Math. Soc. 64 (1948) 100–132.
[49] M. Taylor, Gelfand theory of pseudo-differential operators and hypoelliptic operators, Trans. Amer. Math. Soc. 153 (1971) 495–510.
[50] A. Weil, Basic Number Theory (Springer-Verlag, 1973).
August 5, 2006 21:35 WSPC/148-RMP
J070-00267
Reviews in Mathematical Physics Vol. 18, No. 5 (2006) 485–517 © World Scientific Publishing Company
THE POLARON REVISITED
JACOB SCHACH MØLLER
Aarhus University, Department of Mathematical Sciences, 8000 Aarhus C, Denmark
[email protected]
Received 13 February 2006
Revised 4 May 2006

In recent years, the spectral properties of the translation invariant Nelson model have been studied. Some of the results obtained did not extend to the related polaron model for technical reasons related to the typical assumption of boundedness of the phonon dispersion relation in the polaron model. In this paper, we work with a large class of linearly coupled translation invariant models which includes both the Nelson model and H. Fröhlich's polaron model. The problems considered are chosen based on relevance for the polaron model. A key input is an analysis of the behavior of the bottom of the spectrum of the fiber Hamiltonians at large total momentum.

Keywords: Polaron; Nelson model; ground state; essential spectrum.
Mathematics Subject Classification 2000: 81Q10, 81T10
1. Introduction

In this paper, we study the operator on $L^2(\mathbb{R}^\nu) \otimes \mathcal{F}$ given by
$$H = H_0 + \Phi(e^{-ik \cdot x} v), \quad \text{where } H_0 = \Omega\big(\tfrac{1}{i}\nabla_x\big) \otimes \mathbf{1}_{\mathcal{F}} + \mathbf{1}_{L^2(\mathbb{R}^\nu)} \otimes d\Gamma(\omega), \tag{1.1}$$
and $\Omega$ and $\omega$ are dispersion relations for a non-relativistic particle and a scalar field, respectively. The particle position is denoted by $x$ and the phonon momentum by $k$. Throughout the paper, we use the $\Gamma$ functor to denote second quantization. We write $\mathcal{F} = \Gamma(L^2(\mathbb{R}^\nu))$ for the symmetric Fock space over $L^2(\mathbb{R}^\nu)$. The coupling function $v$ is assumed to be an $L^2(\mathbb{R}^\nu)$ function and the field operator $\Phi(g)$ is defined by
$$\Phi(g) = \int_{\mathbb{R}^\nu} \big( g(k)\,a^*(k) + g(k)^*\,a(k) \big)\,dk, \tag{1.2}$$
for $g \in L^2(\mathbb{R}^\nu; B(L^2(\mathbb{R}^\nu)))$. Here, $a^*(k)$ and $a(k)$ are phonon creation and annihilation operators. They satisfy the canonical commutation relations.
The operator $H$ has an important symmetry. It is translation invariant, in the sense that it commutes with the operator of total momentum
$$P = \tfrac{1}{i}\nabla_x \otimes \mathbf{1}_{\mathcal{F}} + \mathbf{1}_{L^2(\mathbb{R}^\nu)} \otimes d\Gamma(k). \tag{1.3}$$
Using a unitary transform which goes back to Lee–Low–Pines [27], one can bring $H$ to the form $\int^{\oplus}_{\mathbb{R}^\nu} H(\xi)\,d\xi$ on $L^2(\mathbb{R}^\nu; \mathcal{F})$. The fiber Hamiltonians $H(\xi)$ are given by
$$H(\xi) = H_0(\xi) + \Phi(v), \quad \text{where } H_0(\xi) = \Omega(\xi - d\Gamma(k)) + d\Gamma(\omega), \tag{1.4}$$
as operators on $\mathcal{F}$. The Lee–Low–Pines transform is given by
$$I_{\mathrm{LLP}} := (F \otimes \mathbf{1}_{\mathcal{F}}) \circ \Gamma(e^{-ik \cdot x}), \tag{1.5}$$
where $F$ denotes the Fourier transform in $L^2(\mathbb{R}^\nu)$. The field operator $\Phi(v)$ is defined as in (1.2). We refer to the set $\{(\xi, E) \mid \xi \in \mathbb{R}^\nu, E \in \sigma(H(\xi))\}$ as the energy-momentum spectrum of $H$. We are mainly interested in the bottom of this set, in particular the ground state energy as a function of total momentum
$$\Sigma_0(\xi) = \inf \sigma(H(\xi)), \tag{1.6}$$
and the bottom of the essential spectrum.

An important motivating example is H. Fröhlich's polaron model [13] of one electron (or hole) in an ionic crystal. Here $\Omega(\eta) = \eta^2/(2M_{\mathrm{eff}})$, $\omega(k) \equiv \hbar\omega_0 > 0$ is a constant, and $v$ is a coupling function which in 3 dimensions takes the form
$$v(k) = \sqrt{\alpha}\,\hbar\omega_0\,(4\pi)^{\frac12}(2\pi)^{-\frac32} \Big( \frac{\hbar}{2M_{\mathrm{eff}}\,\omega_0} \Big)^{\frac14} \frac{1}{|k|}. \tag{1.7}$$
We remark that in the literature, $v$ often comes with a factor $i$. This factor can be removed by the unitary transformation $\mathbf{1}_{L^2(\mathbb{R}^\nu)} \otimes \Gamma(i\,\mathbf{1}_{L^2(\mathbb{R}^\nu)})$ (see also (4.2)). For general $\nu \ge 1$, the coupling function is a multiple of $|k|^{-(\nu-1)/2}$ (a constant in dimension 1), see [34]. The mass $M_{\mathrm{eff}}$ is an effective mass which comes from approximating an electron in a static periodic background potential by a free electron with an effective mass. The frequency $\omega_0$ is that of long wavelength longitudinal optical phonons. The acoustic phonons as well as the transverse optical phonons are neglected. (The acoustic phonons do not contribute to the polarization of the crystal, since they model vibrations of neutral unit cells consisting of two oppositely charged ions.) The dimensionless coupling constant $\alpha$ is material dependent and is typically not a small number.

H. Fröhlich's model is often called the large polaron model, because in its derivation [12, 13, 26], it is assumed that the charged particle is smeared out over a region large compared to the lattice spacing. This permits a continuum approximation and explains that the entire momentum space is used, and not just a bounded Brillouin zone. (In [27], the charge distribution of the polaron, i.e. a ground state, is computed and the result is consistent with this assumption.) It should also be noted that a thermodynamic limit (infinite crystal) is implied by the choice of a
continuum for the phonon momentum space. The model is ultraviolet singular and infrared regular ($v$ is not square integrable at infinity in any dimension) and one has to add an ultraviolet cutoff to make $v$ square integrable.

We remark that it is common in the literature to take a finite size continuous crystal. Let $\Lambda \subset \mathbb{R}^\nu$ be a box of side length $L$ and write $\Delta$ for the Laplacian in $L^2(\Lambda)$, the electron (hole) Hilbert space, with periodic boundary conditions (to retain translation invariance). The momentum space for the phonons is now the dual lattice $\Lambda^* = [\mathbb{Z}/(2\pi L)]^\nu$ and the Fock space for the phonons is $\Gamma(\ell^2(\Lambda^*))$. The total momentum (with $\partial/\partial x_i$ defined again using periodic boundary conditions) has discrete spectrum equal to $\Lambda^*$. Note that the integral in (1.2) should be replaced by $(2\pi L)^{-\nu} \sum_{k \in \Lambda^*}$, and hence an infrared singularity appears at $k = 0$, which one should remove by redefining $v$ at 0. Our results, suitably reformulated, hold in this situation also. We make no further remarks on this.

Another model, the small polaron model, is used if the charged particle is localized on a length scale comparable to the lattice spacing. Here one can use the Holstein Hamiltonian [25] which we briefly explain. The crystal is kept as a cubic lattice, either infinite or periodic, with lattice spacing equal to 1, and the electron is confined to the lattice sites. The electron kinetic energy is modeled by a translation invariant hopping matrix, e.g. the discrete Laplacian, and the phonon momentum space is the dual lattice (the Brillouin zone $[-\pi, \pi]^\nu$ for an infinite crystal). The fiber Hamiltonians again take the form (1.4), where $\Omega$ is here the Fourier transform of the hopping matrix, e.g., $\sum_{j=1}^{\nu} (1 - \cos(k_j))$ for the discrete Laplacian. For more material, we refer the reader to [30]. For an overview of a number of polaron models and their properties, see the review by Devreese [8].
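The statement that $\Omega$ is the Fourier transform of the hopping matrix can be illustrated on a finite periodic chain. The sketch below assumes one particular normalisation in dimension one, namely the hopping matrix $h$ with $h(x, x) = 1$ and $h(x, x \pm 1) = -1/2$ (minus one half of the discrete Laplacian); this normalisation and the function names are illustrative choices, not taken from [25] or [30]. Under that assumption the plane waves $e^{ikx}$ with $k \in (2\pi/n)\mathbb{Z}$ are eigenvectors with eigenvalue $1 - \cos k$.

```python
import cmath
import math


def hopping_matrix(n):
    """Assumed hopping matrix on the periodic chain Z_n: 1 on the diagonal,
    -1/2 to each nearest neighbour (minus one half of the discrete Laplacian)."""
    h = [[0.0] * n for _ in range(n)]
    for x in range(n):
        h[x][x] += 1.0
        h[x][(x + 1) % n] += -0.5
        h[x][(x - 1) % n] += -0.5
    return h


def plane_wave(k, n):
    """The function x -> exp(i k x) on Z_n; periodic when k is in (2 pi / n) Z."""
    return [cmath.exp(1j * k * x) for x in range(n)]


def apply(h, f):
    """Matrix-vector product h f."""
    n = len(f)
    return [sum(h[x][y] * f[y] for y in range(n)) for x in range(n)]


def dispersion(k):
    """One-dimensional lattice dispersion 1 - cos(k)."""
    return 1.0 - math.cos(k)


def eigen_defect(k, n):
    """Largest entry of h e_k - dispersion(k) e_k; zero up to rounding
    when k is an allowed momentum of the periodic chain."""
    wave = plane_wave(k, n)
    image = apply(hopping_matrix(n), wave)
    return max(abs(image[x] - dispersion(k) * wave[x]) for x in range(n))
```

In $\nu$ dimensions the same computation applies coordinate by coordinate and produces the sum over $j$ of $1 - \cos(k_j)$ quoted above.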
Inspired by the Fröhlich polaron, Nelson introduced in [33] (see also [6]) a phenomenological model for non-relativistic charged particles interacting with a scalar field, often referred to as the Nelson model. In the one-particle sector, it has the form (1.1) but with $\Omega(\eta) = \eta^2/(2M)$, $\omega(k) = \sqrt{k^2 + m^2}$ and $v(k) = g\,\omega(k)^{-\frac12}$. Here $M$ is the (bare) mass of the charged particle, $m$ the mass of the field particle, and $g$ a coupling constant. As for the Fröhlich polaron, the model has an ultraviolet singularity. Nelson noted that an ultraviolet cutoff can be removed by subtraction of an infinite self energy, without leaving the physical Hilbert space, hence defining the model rigorously. The same holds true for the Fröhlich polaron [19].

In 1973–74, J. Fröhlich's PhD thesis appeared in the form of two papers [15, 16]. They were concerned with the spectral and scattering theory for the massless translation invariant Nelson model (in the one-particle sector), in particular, in the context of the infrared problem. These two papers are the starting point for most of the mathematical work on the operator (1.1) in the last 30 years. The methods employed are robust and the results obtained on the structure of the bottom of the energy-momentum spectrum extend to $\omega$ which are subadditive and satisfy $\omega(k) \to \infty$ for $|k| \to \infty$. The Nelson analogue of the polaron model is the massive translation invariant Nelson model.
The structure of the bottom of the energy-momentum spectrum of models of the form (1.1) has been studied by a number of authors [3, 19, 24, 31, 32, 39, 40]. Many of the results obtained were proved using the property ω(k) → ∞ for |k| → ∞ and, for a particular technical reason, do not extend to H. Fr¨ ohlich’s polaron model (with the exception of [3], where the Nelson model is not covered). This paper is devoted to overcoming this difficulty. In particular, we make an effort to work with minimal assumptions, cf. Sec. 2, on Ω, ω and v, such as to encompass models of the form (1.1) which typically appear in the literature. When presenting our main theorems, we have made an effort to focus on results pertinent to the Fr¨ohlich polaron. Recently, a relativistic electron, modeled by the Dirac operator, linearly coupled to a massless field was analyzed in [36]. For minimally coupled models, we refer the reader to the monograph [41] and the references therein. We list in telegraphic style the results obtained leaving comments on the literature to Sec. 2, where the precise statements are given. All results are concerned with the structure of the bottom of the energy-momentum spectrum and holds for all coupling strengths: An HVZ theorem determining the essential spectrum, uniqueness of ground states, (strict) monotonicity of Σ0 (ξ) and inf[σess (H(ξ))] in the coupling function, existence of isolated ground states in dimensions 1 and 2, and non-existence of embedded ground states in dimensions 3 and 4. A central theme in the paper is a study of the spectral gap inf[σess (H(ξ))] − Σ0 (ξ) in the limit of large total momentum |ξ| → ∞. We show that for a class of models including the Fr¨ohlich polaron (but not the Nelson model), the spectral gap closes in the limit of large total momentum. This is the new key observation which we use to conclude a number of our results. We repeat that these results are already known under the assumption of ω(k) → ∞. 
The contribution here is the extension to the polaron model, or more generally, models where $\omega(k) \not\to \infty$. Our central new results are not relevant for the Holstein model, because the phonon momentum space in that model is bounded.

We remark that the early literature on the Fröhlich polaron was focused on the ground state energy and the effective mass (inverse curvature of $\xi \mapsto \Sigma_0(\xi)$ at $\xi = 0$). This was a particularly challenging problem because the value of the coupling constant $\alpha$ for typical ionic crystals is not small (typically between 3 and 6, cf. [12, 27]), and perturbation theory is inadequate. In fact, the main thrust was towards large coupling results, cf. the key papers [27, 28]. In [11], Feynman applied his newly invented path integral technique to get bounds on the ground state energy valid for all $\alpha$, an idea pursued and made mathematically rigorous in [9], see also [39] and the more recent paper [29] where the authors use more elementary methods to derive the large coupling asymptotics together with an error bound.

In Sec. 2, we formulate precise assumptions and state our main theorems. In Sec. 3, we prove the HVZ theorem and in Sec. 4, we discuss the uniqueness of ground states and derive some consequences. The core of the paper is Sec. 5 where we analyze the bottom of the spectrum at large total momentum, and prove the main new results. We have included here for completeness some considerations which are
not so relevant for the particular case of the Fröhlich polaron. In Appendix A, we recall a partition of unity in Fock space, and in Appendix B, we discuss how to extend all of our main results to models with a weak ultraviolet singularity.

2. Model and Results

We begin with basic assumptions on the two dispersion relations.

Condition 2.1 (The Particle Dispersion Relation). Let $\Omega \in C^2(\mathbb{R}^\nu)$ satisfy that $\Omega \ge 0$ and
(i) There exists $C_\Omega$ such that for all $\eta \in \mathbb{R}^\nu$, we have $|\nabla\Omega(\eta)| \le C_\Omega\,\Omega(\eta) + C_\Omega$.
(ii) $\tilde C_\Omega := \sup_{\eta\in\mathbb{R}^\nu} \|\nabla^2\Omega(\eta)\| < \infty$.

Condition 2.2 (The Phonon Dispersion Relation). We have $\omega \in C^0(\mathbb{R}^\nu)$ and there exists $\omega_0 > 0$, the phonon mass, such that $\inf_{k\in\mathbb{R}^\nu} \omega(k) = \omega_0$.

Assuming Conditions 2.1(i) and 2.2, we can construct the Hamiltonians (1.1) and (1.4) as follows (for more details see, e.g., [16, 31]). Let
$$\mathcal{C}_0^\infty = \Gamma_{\mathrm{fin}}(C_0^\infty(\mathbb{R}^\nu)), \tag{2.1}$$
where $\Gamma_{\mathrm{fin}}(V)$ denotes the subspace of elements of the form $(u_0, \dots, u_n, 0, \dots)$, where the $u_\ell$'s are elements from $\ell$-fold algebraic tensor products of $V$. Clearly, $H_0$ is essentially self-adjoint on $C_0^\infty(\mathbb{R}^\nu) \otimes \mathcal{C}_0^\infty$ and the $H_0(\xi)$'s are essentially self-adjoint on $\mathcal{C}_0^\infty$. As for $H$, the field operator $\Phi(e^{-ik\cdot x}v)$ is $\mathbf{1}_{L^2(\mathbb{R}^\nu)} \otimes d\Gamma(\omega)$-bounded with relative bound zero. Hence, by Kato–Rellich, $H$ is self-adjoint on $D(H_0)$ and essentially self-adjoint on $C_0^\infty(\mathbb{R}^\nu) \otimes \mathcal{C}_0^\infty$. Another application of Kato–Rellich shows that $D(H_0(\xi))$ is independent of $\xi$. We denote the common domain by $\mathcal{D}$. Secondly, as above, $\Phi(v)$ is $d\Gamma(\omega)$-bounded with relative bound zero so $H(\xi)$ is self-adjoint on $\mathcal{D}$ and essentially self-adjoint on $\mathcal{C}_0^\infty$.

Condition 2.3. Let $\omega$ and $\Omega$ be continuous functions satisfying either that there exists $\{k_j\}_{j\in\mathbb{N}} \subset \mathbb{R}^\nu$ with $\lim_{j\to\infty} |k_j| = \infty$ such that
$$\lim_{j\to\infty} \omega(k_j)^{-1} = 0 \tag{2.2}$$
or
$$\sup_k \omega(k) < \infty \quad\text{and}\quad \lim_{|\eta|\to\infty} \Omega(\eta)^{-1} = 0. \tag{2.3}$$

For $n \ge 1$ and $k = (k_1, \dots, k_n) \in \mathbb{R}^{n\nu}$, we write
$$k^{(n)} := k_1 + \cdots + k_n. \tag{2.4}$$
We now introduce the bottom of the spectrum for a composite system at total momentum $\xi$, consisting of an interacting system in the ground state at total momentum $\xi - k^{(n)}$ and $n$ non-interacting phonons with momenta $k$:
$$\Sigma_0^{(n)}(\xi; k) := \Sigma_0\bigl(\xi - k^{(n)}\bigr) + \sum_{j=1}^{n} \omega(k_j). \tag{2.5}$$
The following functions are thresholds due to ground states dressed by $n$ free phonons, at critical momenta:
$$\Sigma_0^{(n)}(\xi) := \inf_{k\in\mathbb{R}^{n\nu}} \Sigma_0^{(n)}(\xi; k). \tag{2.6}$$
There may be other thresholds coming from other local extrema of $k \mapsto \Sigma_0^{(n)}(\xi; k)$ and likewise from excited bands of eigenvalues. The bottom of the essential spectrum (see Theorem 2.1 below) is
$$\Sigma_{\mathrm{ess}}(\xi) := \inf_{n\ge1} \Sigma_0^{(n)}(\xi). \tag{2.7}$$
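As a sanity check on the definitions (2.5)–(2.7), the following worked example evaluates them for an uncoupled model. It is a reader's sketch under assumptions chosen only for illustration: $v \equiv 0$, $\omega(k) \equiv \omega_0 > 0$ and $\Omega(\eta) = \eta^2$, so that the fiber Hamiltonian reduces to $d\Gamma(\omega) + \Omega(\xi - d\Gamma(k))$.

```latex
% Assumed for illustration: v = 0, \omega \equiv \omega_0 > 0, \Omega(\eta) = \eta^2.
% On the n-phonon sector, H(\xi) acts as multiplication by n\omega_0 + (\xi - k^{(n)})^2.
\begin{align*}
\Sigma_0(\xi) &= \min\Bigl\{\xi^2,\ \inf_{n\ge1}\,\inf_{k\in\mathbb{R}^{n\nu}}
                 \bigl[n\omega_0 + (\xi-k^{(n)})^2\bigr]\Bigr\}
               = \min\{\xi^2,\ \omega_0\},\\
\Sigma_0^{(n)}(\xi) &= \inf_{k\in\mathbb{R}^{n\nu}}\Bigl[\Sigma_0\bigl(\xi-k^{(n)}\bigr) + n\omega_0\Bigr]
               = n\omega_0,\\
\Sigma_{\mathrm{ess}}(\xi) &= \inf_{n\ge1} n\omega_0 = \omega_0 .
\end{align*}
```

Under these assumptions the gap $\Sigma_{\mathrm{ess}}(\xi) - \Sigma_0(\xi) = \max\{\omega_0 - \xi^2, 0\}$ is positive exactly when $\xi^2 < \omega_0$ and vanishes for larger $|\xi|$.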
The first result is the HVZ theorem, which goes back to Fröhlich [16] who proved Theorem 2.1(i) below for the massless translation invariant Nelson model. His proof, which relies on a method of Glimm and Jaffe [20], extends to models where $\omega$ is subadditive and $\omega(k) \to \infty$ for $|k| \to \infty$. Spohn established in [40] that $\Sigma_{\mathrm{ess}}(\xi)$ is the bottom of the essential spectrum, again assuming $\omega$ to be subadditive, a result which shows up again in [19]. However, in the latter two papers, the authors refer to J. Fröhlich for a crucial step where the property $\omega(k) \to \infty$ is used, and no mention is made of how to repair it. In [31], Theorem 2.1(i) and (ii) below were established (using the method of [7]) without the subadditivity assumption, but still with the assumption $\omega(k) \to \infty$. The proof we give here follows the Glimm–Jaffe approach and establishes Theorem 2.1(i) in full detail, and we argue using the proof given in [31] how to verify Theorem 2.1(ii). The remaining statement of Theorem 2.1(iii) is only nontrivial under the assumption (2.3), where the result is new. The proof relies on the vanishing of $\Sigma_{\mathrm{ess}}(\xi) - \Sigma_0(\xi)$ at large total momentum, cf. Theorem 2.4 below.

Theorem 2.1. Suppose $v \in L^2(\mathbb{R}^\nu)$ and Conditions 2.1 and 2.2. We have
(i) The spectrum of $H(\xi)$ below $\Sigma_{\mathrm{ess}}(\xi)$ consists at most of eigenvalues of finite multiplicity, with $\Sigma_{\mathrm{ess}}(\xi)$ as the only possible accumulation point.
(ii) We have $\{\Sigma_0^{(n)}(\xi; k) \mid n \in \mathbb{N} \text{ and } k \in \mathbb{R}^{n\nu}\} \subset \sigma_{\mathrm{ess}}(H(\xi))$.
(iii) If Condition 2.3 is also satisfied, then $\sigma_{\mathrm{ess}}(H(\xi)) = [\Sigma_{\mathrm{ess}}(\xi), \infty)$.

Remark 2.2. (1) From Theorem 2.1(i) and (ii), we find that $\inf \sigma_{\mathrm{ess}}(H(\xi)) = \Sigma_{\mathrm{ess}}(\xi)$. This is the result stated in [19, 40]. (2) Note that there are no gaps in the essential spectrum of the uncoupled model $v \equiv 0$ if $\omega(k) = \omega_0 > 0$ is a constant and $\Omega(\eta) \to \infty$ for $|\eta| \to \infty$. It should be noted that the choice $\Omega \equiv 0$ and $\omega(k) \equiv \omega_0 > 0$ yields $H(\xi) = \omega_0 N + \Phi(v)$
which is unitarily equivalent to $\omega_0 N - \omega_0^{-1}\|v\|_2^2$. Hence $H(\xi)$ has plenty of gaps in the essential spectrum in this case. Here $N = d\Gamma(\mathbf{1}_{L^2(\mathbb{R}^\nu)})$ is the phonon number operator. (3) In general, the inclusion in Theorem 2.1(ii) may be strict, if there are gaps in the set on the left-hand side and the model has excited bands of eigenvalues. These bands could give rise to extra contributions to the essential spectrum which may narrow the gaps. The only result on the existence/non-existence of excited bands of eigenvalues is due to [3]: For $\Omega(\eta) = \eta^2$, a class of $\omega$'s, which include the constant function (but exclude $\sqrt{k^2 + m^2}$, $m > 0$), and $v$'s which are smooth with all derivatives vanishing faster than a polynomial (implying a strong infrared regularization as well as an ultraviolet one), they prove that for small values of a coupling constant, there are no excited eigenstates for $H(\xi)$ below $\Sigma_2(\xi)$. (4) Note that $0 \le \Sigma_{\mathrm{ess}}(\xi) - \Sigma_0(\xi) \le \omega(0)$. We do not assume in this paper that $\omega(0) = \omega_0$, so the spectral gap may be bigger than $\omega_0$.

We introduce the notation $I_0 \subset \mathbb{R}^\nu$ for the set of total momenta where $H(\xi)$ admits an isolated ground state
$$I_0 = \{\xi \in \mathbb{R}^\nu \mid \Sigma_0(\xi) < \Sigma_{\mathrm{ess}}(\xi)\}. \tag{2.8}$$
Note that $\{\xi \in \mathbb{R}^\nu \mid \Sigma_0(\xi) < \inf_\eta \Sigma_0(\eta) + \omega_0\} \subset I_0$. In particular, $I_0 \neq \emptyset$.

The following theorem is concerned with uniqueness of ground states, a type of result which is usually proved by a suitable Perron–Frobenius theorem. It was established for $\xi = 0$ by L. Gross in [21] using the Schrödinger representation of the free field, but this method does not extend to $\xi \neq 0$ and assumes that
$$\eta \mapsto \exp(-t\Omega(\eta)) \ \text{is positive definite for all } t > 0. \tag{2.9}$$
(See also [19].) In [16] (cf. also [31] and [40]), an abstract Perron–Frobenius theorem of Faris was employed to establish uniqueness of ground states for all $\xi$ provided $v \neq 0$ a.e. and real-valued. Here, the property (2.9) is not needed. We improve this result in two directions. We avoid the assumption that $v$ be real-valued (a trivial extension of the proof in [31]) and we show that isolated ground states are unique without any assumption on $v$ apart from it being square integrable. This is particularly useful since one often employs sharp ultraviolet cutoffs.

Theorem 2.3. Let $\xi \in \mathbb{R}^\nu$. Assume $v \in L^2(\mathbb{R}^\nu)$ and Conditions 2.1 and 2.2. If $\Sigma_0(\xi)$ is an eigenvalue for $H(\xi)$ and either $v \neq 0$ a.e. or $\xi \in I_0$, then $\Sigma_0(\xi)$ is non-degenerate. Furthermore, the ground state $\psi_\xi = (\psi_\xi^{(0)}, \dots, \psi_\xi^{(n)}, \dots)$ can be chosen such that: $\psi_\xi^{(0)} > 0$ and for any $n \ge 1$,
$$(-1)^n v(k_1)\cdots v(k_n)\,\psi_\xi^{(n)}(k) > 0 \quad \text{a.e. in } \{k \in \mathbb{R}^{n\nu} \mid \forall j : v(k_j) \neq 0\}, \tag{2.10}$$
$$\psi_\xi^{(n)}(k) = 0 \quad \text{a.e. in } \{k \in \mathbb{R}^{n\nu} \mid \exists j \text{ s.t. } v(k_j) = 0\}. \tag{2.11}$$
The proof is a combination of the method of J. Fröhlich and an application of the HVZ theorem.
The central new observation used to prove Theorem 2.1(iii) in the case of bounded $\omega$'s is the following result, which is a special case of the more general Theorem 5.2, which is formulated and proved in Sec. 5. Before stating the result, we impose a condition which is slightly stronger than Condition 2.3.

Condition 2.4. Let $\omega$ and $\Omega$ be continuous functions satisfying either
$$\lim_{|k|\to\infty} \omega(k)^{-1} = 0 \tag{2.12}$$
or there exists a sequence $\{k_j\}_{j\in\mathbb{N}} \subset \mathbb{R}^\nu$, with $|k_j| \to \infty$ for $j \to \infty$, such that
$$\sup_j \omega(k_j) < \infty \quad\text{and}\quad \lim_{|\eta|\to\infty} \frac{\omega(\eta)}{\Omega(\eta)} = 0. \tag{2.13}$$

We have

Theorem 2.4. Suppose $v \in L^2(\mathbb{R}^\nu)$, Conditions 2.1, 2.2 and (2.13). Then, for any $\bar\Sigma \in \mathbb{R}$,
$$\lim_{|\xi|\to\infty,\ \Sigma_0(\xi)\le\bar\Sigma} \bigl[\Sigma_{\mathrm{ess}}(\xi) - \Sigma_0(\xi)\bigr] = 0.$$
The reader should of course keep in mind the Fröhlich polaron $\omega(k) \equiv \omega_0$, for which $\sup_\xi \Sigma_0(\xi) < \infty$. This theorem may seem trivial in light of the uncoupled model where the ground state disappears into the essential spectrum. It should be seen in connection with a surprising result of Spohn [40, Sec. 5], Theorem 2.6(i) below.

We pause to introduce some notation. Let $v_1, v_2 \in L^2(\mathbb{R}^\nu)$. We will distinguish between the interacting Hamiltonians $H_0(\xi) + \Phi(v_i)$ by adding an index, i.e. $H_1(\xi)$ and $H_2(\xi)$. Likewise, we will distinguish spectral functions by adding an index as in $\Sigma_{0,i}(\xi) = \inf \sigma(H_i(\xi))$ and $\Sigma_{\mathrm{ess},i}(\xi) = \inf \sigma_{\mathrm{ess}}(H_i(\xi))$. We derive the following monotonicity result from Theorems 2.3 and 2.4.

Corollary 2.5. Suppose Conditions 2.1 and 2.2. Let $v_1, v_2 \in L^2(\mathbb{R}^\nu)$ be coupling functions satisfying
$$v_1 \bar v_2 \ge 0 \ \text{a.e.} \quad\text{and}\quad |v_1| \ge |v_2| \ \text{a.e.} \tag{2.14}$$
Then for all $\xi$, we have
(i) $\Sigma_{0,1}(\xi) \le \Sigma_{0,2}(\xi)$ and $\Sigma_{\mathrm{ess},1}(\xi) \le \Sigma_{\mathrm{ess},2}(\xi)$.
(ii) If, in addition, Condition 2.4 is satisfied, $v_1 \neq 0$ a.e. and $v_1 \neq v_2$, then $\Sigma_{0,1}(\xi) < \Sigma_{0,2}(\xi)$ and $\Sigma_{\mathrm{ess},1}(\xi) < \Sigma_{\mathrm{ess},2}(\xi)$.
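One family of pairs covered by (2.14), offered here as an illustrative sketch rather than a statement from the source, consists of two sharp ultraviolet cutoffs of a single coupling function. The symbols $\Lambda$ and $\Lambda'$ are introduced only for this example.

```latex
% Assumed for illustration: 0 < \Lambda \le \Lambda' and v \in L^2(\mathbb{R}^\nu).
\begin{align*}
v_1 &:= \mathbf{1}(|k|\le\Lambda')\,v, \qquad v_2 := \mathbf{1}(|k|\le\Lambda)\,v,\\
v_1\bar v_2 &= \mathbf{1}(|k|\le\Lambda)\,|v|^2 \ \ge\ 0 \quad\text{a.e.},\\
|v_1| &= \mathbf{1}(|k|\le\Lambda')\,|v| \ \ge\ \mathbf{1}(|k|\le\Lambda)\,|v| = |v_2| \quad\text{a.e.}
\end{align*}
% Hence (2.14) holds, and Corollary 2.5(i) gives
% \Sigma_{0,1}(\xi) \le \Sigma_{0,2}(\xi) and \Sigma_{ess,1}(\xi) \le \Sigma_{ess,2}(\xi).
```

In words, under these assumptions raising a sharp cutoff can only lower, or leave unchanged, both the ground state energy and the bottom of the essential spectrum.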
For the remaining theorem, we impose the following condition:

Condition 2.5. The phonon dispersion relation $\omega$ is strictly subadditive, that is:
$$\forall\, k_1, k_2 \in \mathbb{R}^\nu : \ \omega(k_1 + k_2) < \omega(k_1) + \omega(k_2). \tag{2.15}$$
We furthermore require that either $\lim_{|k|\to\infty} \omega(k)^{-1} = 0$ or $\omega$ is bounded,
$$\lim_{|\eta|\to\infty} \Omega(\eta)^{-1} = 0 \quad\text{and}\quad 2\liminf_{|k|\to\infty} \omega(k) > \sup_k \omega(k). \tag{2.16}$$
See Remark 5.7 for a discussion of (2.16). Note that the last part of Condition 2.5 implies Condition 2.4. If $\omega$ is subadditive (inequality instead of strict inequality in (2.15)), then $\Sigma_{\mathrm{ess}}(\xi) = \Sigma_0^{(1)}(\xi)$. We have

Theorem 2.6. Suppose Conditions 2.1, 2.2 and 2.5, $v \in L^2(\mathbb{R}^\nu)$ and
$$\forall\, R > 0 : \ \operatorname*{ess\,inf}_{k:|k|\le R} |v(k)| > 0. \tag{2.17}$$
Then,
(i) For $\nu \in \{1, 2\}$ and all $\xi \in \mathbb{R}^\nu$, the bottom of the spectrum $\Sigma_0(\xi)$ is an isolated eigenvalue.
(ii) For $\nu \in \{3, 4\}$ and $\xi \notin I_0$, the bottom of the spectrum $\Sigma_0(\xi)$ is not an eigenvalue.

In dimension $\nu = 3$ a formal calculation of Feynman, cf. [12], indicates that $I_0$ should remain a bounded set when the coupling is turned on, but this has only been established for weak coupling in [3], under the assumptions outlined in Remark 2.2(3). Spohn proved Theorem 2.1(i) above, but under a technical condition which excluded the Nelson model, and under the implicit condition that the function $k \mapsto \Sigma_0^{(1)}(\xi; k)$ attains its infimum. This condition was verified by Spohn in the following situation: Under the assumption (2.9), an argument of L. Gross [21] shows that $\xi = 0$ is a global minimum of $\xi \mapsto \Sigma_0(\xi)$. If in addition $\omega$ attains its supremum at infinity (this is in particular true if $\omega$ is constant), then the implicit condition is clearly satisfied. Our condition (2.16) is more natural as is illustrated by Remark 5.7, and we do not require (2.9). We remark that it has been observed by Gerlach and Löwen [19] that $\xi = 0$ is a unique global minimum of $\Sigma_0$ under the assumption (2.9). In [31], the theorem above was established for the Nelson model and a proof of Theorem 2.6(i) simpler than Spohn's was given. We verify Spohn's implicit assumption on the mass-shell and hence prove the result for the wider class of polaron models considered here. The proof of Theorem 2.1(ii) in [31] reduces in the case of the polaron model to a similar problem as for Theorem 2.1(i). The property (2.10) and Theorem 2.4 are key ingredients in the proof.

In the presence of a weak ultraviolet singularity with $\omega^{-1/2} v \in L^2(\mathbb{R}^\nu)$, the Hamiltonian can be defined via the KLMN theorem. All the results in this section remain valid as formulated, except for Theorem 2.6(ii) for which we require the
extra assumption (B.7). We refer the reader to Appendix B where this problem is addressed. Note that this is only of interest if $\omega$ is unbounded.

The class of UV singular models just discussed is not large enough to include physical models. In the literature, three different renormalization schemes have been used. The simplest approach is due to Nelson [33] and has been implemented for $\Omega(\eta) = \eta^2$ only. Here a dressing transformation is applied and, in the new coordinate system, a limiting renormalized operator can be defined via the KLMN theorem after subtraction of an infinite self-energy. This idea has been pursued further by Cannon [6]. The second method, which goes back to lecture notes of Hepp [23], has only been applied for $\Omega(\eta) = \sqrt{\eta^2 + M^2}$ and consists of a systematic reordering of a perturbation expansion of the resolvent of the Hamiltonian (shifted by a self-energy). This method was implemented in the thesis of Eckmann [10], for an interaction resembling that of the Nelson model (see also [1, 2]), and adapted to the massless Nelson model by Fröhlich [16]. A full renormalization has not been achieved (in dimension 3) because of the condition $0 \le v(k) \le C(|k| + 1)^{-1}$ imposed on the coupling function. The third method is due to Gross [22] and Sloan [37], and as for the second method, it has also only been implemented for $\Omega(\eta) = \sqrt{\eta^2 + M^2}$. Here a renormalized resolvent is constructed pointwise using a compactness argument, which gives existence, but not uniqueness, of a limiting resolvent in the strong resolvent sense. This result is weaker than the two others which give norm-resolvent convergence. Gross implemented this method by renormalizing the mass whereas Sloan renormalized the self-energy as was also done in the other two approaches. (Gross constructed a completely renormalized model in three dimensions but this requires a change of Hilbert space. Sloan worked in dimension 2 where this is not necessary.)

Finally, we discuss the type of results presented above in the context of renormalized Hamiltonians. Here only Hamiltonians constructed via the first method have been analyzed. The HVZ theorem has been established in [19, 40] (for subadditive $\omega$'s) but the proofs need elaboration, just as is explained in Sec. 2.1 for the non-singular case. (This can be done but we have elected to omit the details here.) The extension of the Perron–Frobenius argument to the renormalized Hamiltonian is claimed in [16] but the details were left to a preprint [14]. The proofs of the remaining results in this paper rely on ground states having non-zero overlap with the vacuum, which is a consequence of the resolvent of the Hamiltonian being positivity improving. Without that information the arguments would not work. (This includes the proofs given in [19, 40].)

3. On the HVZ Theorem

We will give a proof of Theorem 2.1(i) following the Glimm–Jaffe approach [20], which was employed by Fröhlich in [16, Sec. 2.1]. As for Theorem 2.1(ii), we explain how to employ the proof given in [31] for the Nelson model. Throughout this section, we assume Conditions 2.1 and 2.2.
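The strategy of the proof is the same as the one outlined in [19], and consists of two steps. First, we prove the result for compactly supported $v$'s, and secondly, we extend the result to general $v$'s in $L^2(\mathbb{R}^\nu)$.

Let $\delta > 0$ and write as a disjoint union $\mathbb{R}^\nu = \cup_{\eta\in\delta\mathbb{Z}^\nu} K_\delta(\eta)$, where $K_\delta(\eta) := \times_{j=1}^{\nu} [\eta_j, \eta_j + \delta)$. We write $\eta_k$ for the unique $\eta \in \delta\mathbb{Z}^\nu$ with $k \in K_\delta(\eta)$. For $k \in \mathbb{R}^{n\nu}$, we write $\eta_k \in \delta\mathbb{Z}^{n\nu}$ for the vector $(\eta_{k_1}, \dots, \eta_{k_n})$. Furthermore, for $\eta \in \mathbb{R}^{n\nu}$, we write $K_\delta(\eta) := \times_{i=1}^{n} K_\delta(\eta_i)$.

For the following constructions, we assume that $v$ has compact support. For $\delta > 0$, let $n_\delta$ be the smallest integer such that $\mathrm{supp}(v) \subset [-\delta n_\delta, \delta n_\delta)^\nu =: \Lambda_\delta$. Note that $\delta n_\delta$ is bounded uniformly in $0 < \delta < 1$. We define a discretized dispersion relation and form factor,
$$\omega_\delta(k) := \begin{cases} \omega(\eta_k) & \text{if } k \in \Lambda_\delta, \\ \omega(k) & \text{if } k \notin \Lambda_\delta, \end{cases} \quad\text{and}\quad v_\delta = P_\delta v, \tag{3.1}$$
where $P_\delta : L^2(\mathbb{R}^\nu) \to L^2(\mathbb{R}^\nu)$ is an orthogonal projection defined by
$$(P_\delta v)(k) = \delta^{-\nu} \int_{K_\delta(\eta_k)} v(k')\, dk'. \tag{3.2}$$
We note that s-$\lim_{\delta\to0} P_\delta = \mathbf{1}_{L^2(\mathbb{R}^\nu)}$, as can be seen by computing the limit on the dense subspace of compactly supported continuous functions. This observation implies
$$\lim_{\delta\to0} \|v - v_\delta\| = 0. \tag{3.3}$$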
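The cell-averaging projection (3.2) and the convergence (3.3) can be illustrated numerically. The following is a minimal sketch in dimension $\nu = 1$, assuming a fine grid as a stand-in for the continuum; the helper names `cell_average` and `l2_norm`, the grid parameters and the test function are all hypothetical choices made for this illustration, not part of the source.

```python
import numpy as np

def cell_average(v, m):
    """Replace each sample of v by the mean over its block of m consecutive samples.

    Discrete stand-in for (P_delta v)(k) in (3.2) with nu = 1 and
    delta = m * h, where h is the spacing of the fine grid.
    """
    blocks = v.reshape(-1, m)                        # one row per cell K_delta(eta)
    means = blocks.mean(axis=1, keepdims=True)       # average of v over each cell
    return np.repeat(means, m, axis=1).reshape(-1)   # piecewise-constant function

def l2_norm(v, h):
    """Riemann-sum approximation of the L^2 norm on the fine grid."""
    return float(np.sqrt(np.sum(np.abs(v) ** 2) * h))

# Fine grid on [-2, 2) (arbitrary illustrative choice).
n = 4096
half_width = 2.0
h = 2 * half_width / n
k = -half_width + h * (np.arange(n) + 0.5)

# A continuous, compactly supported test function (assumption of this sketch).
v = np.maximum(0.0, 1.0 - k ** 2)

block_sizes = [256, 64, 16, 4]            # each divides n, so cells tile the grid
deltas = [m * h for m in block_sizes]
errors = [l2_norm(v - cell_average(v, m), h) for m in block_sizes]

for delta, err in zip(deltas, errors):
    print(f"delta = {delta:.5f}   ||v - v_delta|| ~ {err:.3e}")
```

On this grid the printed errors shrink as the cell size decreases, which is the discrete counterpart of (3.3); the averaged function is also left unchanged by a second averaging, reflecting that $P_\delta$ is a projection.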
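As for approximating Hamiltonians, we take
$$H_\delta(\xi) := H_{0,\delta}(\xi) + \Phi(v_\delta), \quad\text{where}\quad H_{0,\delta}(\xi) := d\Gamma(\omega_\delta) + \Omega(\xi - d\Gamma(\eta_k)).$$
We leave the proof of the following to the reader (cf. the proof of [31, Proposition 1.1]).

Lemma 3.1. Fix $\delta > 0$. Let $v \in L^2(\mathbb{R}^\nu)$ and assume $\Omega$ and $\omega$ satisfy Conditions 2.1 and 2.2, respectively. Then,
(i) $D(H_{0,\delta}(\xi))$ is independent of $\xi$ and we denote it by $\mathcal{D}_\delta$.
(ii) $\Phi(v_\delta)$ is $H_{0,\delta}(\xi)$-bounded with relative bound 0. $H_\delta(\xi)$ is bounded from below, self-adjoint on $D(H_\delta(\xi)) = \mathcal{D}_\delta$, and essentially self-adjoint on $\mathcal{C}_0^\infty$.
(iii) The bottom of the spectrum, $\xi \mapsto \Sigma_{0,\delta}(\xi) := \inf \sigma(H_\delta(\xi))$, is Lipschitz continuous.

We introduce
$$\Sigma_{0,\delta}^{(\ell)}(\xi; k) := \Sigma_{0,\delta}\bigl(\xi - k^{(\ell)}\bigr) + \sum_{j=1}^{\ell} \omega_\delta(k_j), \qquad \Sigma_{0,\delta}^{(\ell)}(\xi) := \inf_{k\in\mathbb{R}^{\ell\nu}} \Sigma_{0,\delta}^{(\ell)}(\xi; k).$$

Lemma 3.2. Fix $\delta > 0$. Then, the spectrum of $H_\delta(\xi)$ below $\min_{\ell\ge1} \Sigma_{0,\delta}^{(\ell)}(\xi)$ consists at most of isolated eigenvalues with finite multiplicity.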
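Proof. Let $h_\infty = \{f \in L^2(\mathbb{R}^\nu) \mid \forall\, \eta \in \delta\mathbb{Z}^\nu \cap \Lambda_\delta : \int_{K_\delta(\eta)} f(k)\, dk = 0\}$, and $h_0 = h_\infty^\perp$ is the finite dimensional subspace consisting of functions vanishing outside $\Lambda_\delta$ and constant on $K_\delta(\eta)$, $\eta \in \delta\mathbb{Z}^\nu$. Let $j = (j_0, j_\infty)$, where $j_\#$ are the orthogonal projections onto $h_\#$. Then, $\check\Gamma(j)$, the geometric partition of unity recalled in Appendix A, is a unitary map from $\Gamma(L^2(\mathbb{R}^\nu))$ to $\Gamma(h_0) \otimes \Gamma(h_\infty)$. Below, we will consider the operator $H_\delta(\xi)$ on the Fock space $\Gamma(h_0)$, on which it is naturally defined. We use the same notation for the restricted operator. As in [31, Sec. 2.5], we write
$$H_\delta(\xi) = \check\Gamma(j)^* H_\delta^{\mathrm{ext}}(\xi)\, \check\Gamma(j),$$
where we split
$$\Gamma(h_0) \otimes \Gamma(h_\infty) = \Gamma(h_0) \oplus \Bigl(\bigoplus_{\ell=1}^{\infty} \Gamma(h_0) \otimes \Gamma^{(\ell)}(h_\infty)\Bigr) \tag{3.4}$$
and identify $\Gamma(h_0) \otimes \Gamma^{(\ell)}(h_\infty)$ with a subspace of $L^2_{\mathrm{sym}}(\mathbb{R}^{\ell\nu}; h_0)$. Here, the subscript sym indicates that the functions are symmetric under interchange of the $\mathbb{R}^\nu$-valued variables. We get
$$H_\delta^{\mathrm{ext}}(\xi) = H_\delta(\xi) \oplus \Bigl(\bigoplus_{\ell=1}^{\infty} \int_{\mathbb{R}^{\ell\nu}}^{\oplus} H_\delta^{(\ell)}(\xi; k)\, dk\Bigr), \tag{3.5}$$
$$H_\delta^{(\ell)}(\xi; k) = H_\delta\bigl(\xi - k^{(\ell)}\bigr) + \Bigl(\sum_{j=1}^{\ell} \omega_\delta(k_j)\Bigr) \mathbf{1}_{\Gamma(h_0)}. \tag{3.6}$$
We thus find
$$H_\delta^{\mathrm{ext}}(\xi)\, \mathbf{1}_{\Gamma(h_0)} \otimes \mathbf{1}(N \ge 1) \ \ge\ \min_{\ell\ge1} \Sigma_{0,\delta}^{(\ell)}(\xi)\, \mathbf{1}_{\Gamma(h_0)} \otimes \mathbf{1}(N \ge 1). \tag{3.7}$$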
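Since $(H_\delta(\xi) + i)^{-1} \otimes \mathbf{1}(N = 0)$ is compact we find for any $f \in C_0^\infty(\mathbb{R})$ with $\mathrm{supp}(f) \subset (-\infty, \min_{\ell\ge1} \Sigma_{0,\delta}^{(\ell)}(\xi))$, that $f(H_\delta(\xi))$ is compact. This completes the proof.

Lemma 3.3. Let $v, \tilde v \in L^2(\mathbb{R}^\nu)$. Then for $\psi \in \mathcal{D}$, we have
$$|\langle\psi, \Phi(v)\psi\rangle| \le 2\omega_0^{-1/2} \|v\|\,\|\psi\| \Bigl( |\langle\psi, [H_0(\xi) + \Phi(\tilde v)]\psi\rangle|^{1/2} + |\langle\psi, \Phi(\tilde v)\psi\rangle|^{1/2} \Bigr), \tag{3.8}$$
$$|\langle\psi, \Phi(v)\psi\rangle| \le 4\omega_0^{-1/2} \|v\|\,\|\psi\|\, |\langle\psi, [H_0(\xi) + \Phi(v)]\psi\rangle|^{1/2} + 4\omega_0^{-1} \|v\|^2 \|\psi\|^2. \tag{3.9}$$
The estimates (3.8) and (3.9) hold also with $H_0(\xi)$ replaced by $H_{0,\delta}(\xi)$.

Proof. Recall the formula
$$(a(v)\psi)^{(n)}(k_1, \dots, k_n) = \sqrt{n+1} \int_{\mathbb{R}^\nu} \overline{v(k_{n+1})}\, \psi^{(n+1)}(k_1, \dots, k_{n+1})\, dk_{n+1} \quad \text{a.e.}, \tag{3.10}$$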
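which, for $\psi \in \mathcal{C}_0^\infty$, is meaningful for any distribution $v$. Using this identity, we get
$$\begin{aligned} |\langle\varphi, a(v)\psi\rangle| &\le \sum_{n=0}^{\infty} \int_{\mathbb{R}^{(n+1)\nu}} |\varphi^{(n)}(k_1, \dots, k_n)|\, \sqrt{n+1}\, |v(k_{n+1})\, \psi^{(n+1)}(k_1, \dots, k_{n+1})|\, dk \\ &\le \|\omega^{-1/2} v\| \sum_{n=0}^{\infty} \|\varphi^{(n)}\| \Bigl( (n+1) \int_{\mathbb{R}^{(n+1)\nu}} \omega(k_{n+1})\, |\psi^{(n+1)}(k_1, \dots, k_{n+1})|^2\, dk_1 \cdots dk_{n+1} \Bigr)^{1/2} \\ &\le \|\omega^{-1/2} v\|\, \|\varphi\|\, \langle\psi, d\Gamma(\omega)\psi\rangle^{1/2}. \end{aligned} \tag{3.11}$$
This implies, for $\psi \in \mathcal{D}$, the bound
$$|\langle\psi, \Phi(v)\psi\rangle| \le 2\omega_0^{-1/2} \|v\|\, \|\psi\|\, \langle\psi, d\Gamma(\omega)\psi\rangle^{1/2}. \tag{3.12}$$
To prove (3.8), we use (3.12) and estimate for $v, \tilde v \in L^2(\mathbb{R}^\nu)$ and $\psi \in \mathcal{D}$,
$$\begin{aligned} |\langle\psi, \Phi(v)\psi\rangle| &\le 2\omega_0^{-1/2} \|v\|\, \|\psi\|\, \langle\psi, H_0(\xi)\psi\rangle^{1/2} \\ &\le 2\omega_0^{-1/2} \|v\|\, \|\psi\| \Bigl( |\langle\psi, [H_0(\xi) + \Phi(\tilde v)]\psi\rangle|^{1/2} + |\langle\psi, \Phi(\tilde v)\psi\rangle|^{1/2} \Bigr). \end{aligned} \tag{3.13}$$
This proves (3.8). Taking $v = \tilde v$ yields (3.9) (after a small computation). Since we only used $\omega(k) \ge \omega_0 > 0$ and $\Omega(\eta) \ge 0$, we conclude the bounds also with $H_0(\xi)$ replaced by $H_{0,\delta}(\xi)$.

An application of (3.9) with $\tilde v = 0$ yields the following useful a priori lower bound, which is valid for all $\xi$:
$$H(\xi) \ge -\omega_0^{-1} \|v\|^2\, \mathbf{1}_{\mathcal{F}}. \tag{3.14}$$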
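The two short computations behind (3.9) and (3.14) can be spelled out as follows. This is a reader's sketch, not the author's text: the shorthand $a$, $b$, $x$ and $t$ is introduced only here, and the second part starts from (3.12) with a normalized $\psi$.

```latex
% (3.9) from (3.8) with \tilde v = v. Shorthand (not from the source):
%   a := 2\omega_0^{-1/2}\|v\|\,\|\psi\|,
%   b := |\langle\psi,[H_0(\xi)+\Phi(v)]\psi\rangle|^{1/2},
%   x := |\langle\psi,\Phi(v)\psi\rangle|^{1/2}.
\begin{align*}
x^2 \le a\,(b+x) \le ab + \tfrac12 a^2 + \tfrac12 x^2
\quad\Longrightarrow\quad x^2 \le 2ab + a^2 ,
\end{align*}
% which is (3.9), since 2ab = 4\omega_0^{-1/2}\|v\|\,\|\psi\|\,b
% and a^2 = 4\omega_0^{-1}\|v\|^2\|\psi\|^2.

% (3.14): for normalized \psi set t := \langle\psi,H_0(\xi)\psi\rangle^{1/2} \ge 0.
% By (3.12) and d\Gamma(\omega) \le H_0(\xi) (because \Omega \ge 0):
\begin{align*}
\langle\psi,H(\xi)\psi\rangle
 \ \ge\ t^2 - 2\omega_0^{-1/2}\|v\|\,t
 \ =\ \bigl(t-\omega_0^{-1/2}\|v\|\bigr)^2 - \omega_0^{-1}\|v\|^2
 \ \ge\ -\omega_0^{-1}\|v\|^2 .
\end{align*}
```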
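The same lower bound holds for $H_\delta(\xi)$ (note that $\|v_\delta\| = \|P_\delta v\| \le \|v\|$).

Lemma 3.4. Let $W_\delta(\xi) = H_\delta(\xi) - H(\xi)$. There exists a family of $\xi$-independent positive numbers $\{C_\delta\}_{\delta>0}$, with $\lim_{\delta\to0} C_\delta = 0$, such that for any $\varphi, \psi \in \mathcal{D}$, we have
$$|\langle\varphi, W_\delta(\xi)\psi\rangle| \le C_\delta \bigl(\|H(\xi)\varphi\| + \|\varphi\|\bigr)\bigl(\|H_\delta(\xi)\psi\| + \|\psi\|\bigr).$$

Proof. As an operator on $\mathcal{C}_0^\infty$, we have
$$W_\delta(\xi) = d\Gamma(\omega_\delta - \omega) + \Omega(\xi - d\Gamma(\eta_k)) - \Omega(\xi - d\Gamma(k)) + \Phi(v_\delta - v). \tag{3.15}$$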
We estimate the terms of Wδ (ξ) one by one beginning with dΓ(ωδ − ω). Since ωδ is only discretized in a compact set (uniformly in 0 < δ < δ0 , for any δ > 0 such that supk∈Rν |ω(k) − ωδ (k)| ≤ C δ δ0 ) we obtain a family of constants C δ = 0. This implies (cf. the operator bound [31, (2.11)]) and limδ→0 C δ N ϕ
ψ ≤ C δ ω −1 H0 (ξ)ϕ
ψ . |ϕ, dΓ(ωδ − ω)ψ| ≤ C 0
(3.16)
Let v˜ ∈ L2 (Rν ). From (3.11) we get −1
1
|ϕ, Φ(˜ v ) ψ| ≤ 2ω0 2 ˜ v
ϕ
dΓ(ω) 2 ψ 1 1 −1 ≤ 2ω0 2 ˜ v
ϕ min{ H0 (ξ) 2 ψ , H0,δ (ξ) 2 ψ } + ψ , (3.17) which implies the bound 1 −1 v
ϕ H0,δ (ξ) 2 ψ + ψ . |ϕ, Φ(˜ v ) ψ| ≤ 2ω0 2 ˜
(3.18)
Finally, we use Condition 2.1 to estimate, writing zt (k) = (1 − t)ηk + tk, |ϕ, (Ω(ξ − dΓ(ηk )) − Ω(ξ − dΓ(k))ψ| ≤ |ϕ, ∇Ω(ξ − dΓ(k)) · dΓ(ηk − k)ψ| 1 + |dΓ(ηk − k)ϕ, ∇2 Ω(ξ − dΓ(zt (k)))dΓ(ηk − k)ψ| dt 0
Ω N ϕ
N ψ ≤ δν ∇Ω(ξ − dΓ(k))ϕ
N ψ + δ 2 ν C 1 2
Ω ω −2 ]( H0 (ξ)ϕ + ϕ ) H0,δ (ξ)ψ . ≤ δ[ν 2 CΩ ω0−1 + δν C 0 1
(3.19)
Combining (3.3), (3.16), (3.18) applied with v˜ = vδ −v, and (3.19) we get, for δ > 0, |ϕ, Wδ (ξ)ψ| ≤ Cδ ( H0 (ξ)ϕ + ϕ )( H0,δ (ξ)ψ + ψ ),
(3.20)
where limδ→0 Cδ = 0 and Cδ does not depend on ξ. The bound extends by continuity to ψ ∈ D. −1 Abbreviate C = C(v, ω0 ) := 2 v ω0 2 . To pass to the interacting Hamiltonians, we note that taking supremum over normalized ϕ’s in (3.17), with v˜ = v, yields 1
H0 (ξ)ψ ≤ H(ξ)ψ + C ψ, H0 (ξ)ψ 2 + ψ
1 2 1 ≤ H(ξ)ψ + H0 (ξ)ψ + C + C ψ . 2 2 Rearrangement and a repetition of the argument with v and H0 (ξ) replaced by vδ and H0,δ (ξ) give for ψ ∈ D
H0 (ξ)ψ ≤ 2 H(ξ)ψ + [C 2 + 2C] ψ ,
H0,δ (ξ)ψ ≤ 2 Hδ (ξ)ψ + [C 2 + 2C] ψ . Recall that vδ ≤ v so that the C’s can be taken identical. Plugging these two bounds into (3.20) yields the lemma. We get in particular the following corollary. Corollary 3.5. Let v ∈ L2 (Rν ). The family Hδ (ξ) converges, as δ → 0, in norm resolvent sense to H(ξ).
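Proof of Theorem 2.1(i), for $v$'s with compact support. Let $\psi \in \mathcal{F}$ and compute for $z \in \mathbb{C} \setminus [\sigma(H(\xi)) \cup \sigma(H_\delta(\xi))]$,
$$\bigl\langle\psi, \bigl[(H(\xi) - z)^{-1} - (H_\delta(\xi) - z)^{-1}\bigr]\psi\bigr\rangle = \bigl\langle(H(\xi) - \bar z)^{-1}\psi,\ W_\delta(\xi)(H_\delta(\xi) - z)^{-1}\psi\bigr\rangle. \tag{3.21}$$
We abbreviate
$$\lambda_0 := -\omega_0^{-1} \|v\|^2 - 1. \tag{3.22}$$
Then, by (3.14), we find that for all $\xi \in \mathbb{R}^\nu$,
$$H(\xi) \ge (\lambda_0 + 1)\mathbf{1}_{\mathcal{F}} \quad\text{and}\quad H_\delta(\xi) \ge (\lambda_0 + 1)\mathbf{1}_{\mathcal{F}}. \tag{3.23}$$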
Let > 0. Putting together Lemma 3.4, applied with a normalized ψ ∈ 1l(Σ0 (ξ) ≤ H(ξ) ≤ Σ0 (ξ) + )F , (3.21), (3.23) and the Rayleigh–Ritz variational principle, we estimate (Σ0 (ξ) − λ0 )−1 − (Σ0,δ (ξ) − λ0 )−1 ≤ ψ , (H(ξ) − λ0 )−1 − (Hδ (ξ) − λ0 )−1 ψ + ≤ Cδ H(ξ)(H(ξ) − λ0 )−1 ψ
Hδ (ξ)(Hδ (ξ) − λ0 )−1 ψ + 1 + ≤ Cδ (1 + |λ0 |)2 + 1 + . Taking to zero and repeating the argument for (Σ0,δ (ξ) − λ0 )−1 − (Σ0 (ξ) − λ0 )−1 , we get (Σ0 (ξ) − λ0 )−1 − (Σ0,δ (ξ) − λ0 )−1 ≤ Cδ (1 + |λ0 |)2 + 1 . (3.24) Here it is important that Cδ → 0 and is independent of ξ. ¯ ∈ R. The bound above implies the following Fix ξ ∈ Rν , ∈ N and let Σ ¯ ¯ ¯ ¯ statement. There exists δ = δ(Σ) such that for 0 < δ < δ: () ¯ ⊂ k ∈ Rν | Σ() (ξ; k) ≤ 2Σ ¯ k ∈ Rν | Σ0 (ξ; k) ≤ Σ 0,δ () ¯ . ⊂ k ∈ Rν | Σ0 (ξ; k) ≤ 3Σ (3.25) ¯ ∈ R, From (3.24) and the first inclusion above, we find that for any Σ lim
()
inf
δ→0 k∈{k |Σ() (ξ;k )≤Σ} ¯
Σ0,δ (ξ; k) =
0
¯ = Let Σ bounds
() Σ0 (ξ; 0),
()
inf
() ¯ k∈{k |Σ0 (ξ;k )≤Σ}
Σ0 (ξ; k).
such that the first set in (3.25) is non-empty. We get the ()
lim sup Σ0,δ (ξ) ≤ lim
()
inf
δ→0 k∈{k |Σ() (ξ;k )≤Σ} ¯
δ→0
Σ0,δ (ξ; k)
0
=
inf
() ¯ k∈{k |Σ0 (ξ;k )≤Σ}
()
Σ0 (ξ; k)
()
= Σ0 (ξ) =
inf
() ¯ k∈{k |Σ0 (ξ;k )≤3Σ}
()
Σ0 (ξ; k)
= lim
()
inf
δ→0 k∈{k |Σ() (ξ;k )≤3Σ} ¯
Σ0,δ (ξ; k)
0
≤ lim inf δ→0
()
inf
() ¯ k∈{k |Σ0,δ (ξ;k )≤2Σ}
Σ0,δ (ξ; k)
()
= lim inf Σ0,δ (ξ). δ→0
Hence, ()
()
lim Σ0,δ (ξ) = Σ0 (ξ).
(3.26)
δ→0
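(Note that norm-resolvent convergence is not sufficient to get the above limit. In [16], the fact that $\omega(k) = |k| \to \infty$, $|k| \to \infty$, was used instead to ensure that only convergence for a compact subset of $\xi$'s was needed.)

Now, Theorem 2.1(i) is an easy consequence of (3.26), Lemma 3.2 and Corollary 3.5.

The following lemma will be used to extend the result just proven to general $v \in L^2(\mathbb{R}^\nu)$. We use the notation introduced in the paragraph leading into Corollary 2.5.

Lemma 3.6. Let $v_1, v_2 \in L^2(\mathbb{R}^\nu)$ and suppose Conditions 2.1 and 2.2. Then
$$\Sigma_{0,2}(\xi) - \Sigma_{0,1}(\xi) \le \|v_1 - v_2\| \bigl( 4\omega_0^{-1/2} |\Sigma_{0,1}(\xi)|^{1/2} + 6\omega_0^{-1} \|v_1\| \bigr).$$

Proof. Let $\epsilon > 0$. We apply (3.9) with $v = v_1$ and $\psi = \psi_{\epsilon,1} \in \mathbf{1}(\Sigma_{0,1}(\xi) \le H_1(\xi) \le \Sigma_{0,1}(\xi) + \epsilon)\mathcal{F}$, which we take to be normalized. We thus get, abbreviating $c = 2\omega_0^{-1/2}$,
$$|\langle\psi_{\epsilon,1}, \Phi(v_1)\psi_{\epsilon,1}\rangle| \le 2c\|v_1\| (|\Sigma_{0,1}(\xi)| + \epsilon)^{1/2} + c^2 \|v_1\|^2.$$
Next, we apply (3.9) with $v = v_2 - v_1$, $\tilde v = v_1$ and $\psi = \psi_{\epsilon,1} \in \mathbf{1}(\Sigma_{0,1}(\xi) \le H_1(\xi) \le \Sigma_{0,1}(\xi) + \epsilon)\mathcal{F}$ in conjunction with the Rayleigh–Ritz variational principle to obtain (with $c$ as above)
$$\begin{aligned} \Sigma_{0,2}(\xi) - \Sigma_{0,1}(\xi) - \epsilon &\le |\langle\psi_{\epsilon,1}, \Phi(v_2 - v_1)\psi_{\epsilon,1}\rangle| \\ &\le c\|v_1 - v_2\| \Bigl( (|\Sigma_{0,1}(\xi)| + \epsilon)^{1/2} + (2c)^{1/2} \|v_1\|^{1/2} (|\Sigma_{0,1}(\xi)| + \epsilon)^{1/4} + c\|v_1\| \Bigr) \\ &\le \|v_1 - v_2\| \Bigl( 2c(|\Sigma_{0,1}(\xi)| + \epsilon)^{1/2} + \tfrac{3}{2} c^2 \|v_1\| \Bigr). \end{aligned}$$
Taking $\epsilon$ to zero concludes the proof.

Proof of Theorem 2.1(i) for general $v$. Let $v \in L^2(\mathbb{R}^\nu)$ and define for $\Lambda > 0$ a cutoff coupling function $v_\Lambda := \mathbf{1}(|k| \le \Lambda)v$. Denote by $\Sigma_{0,\Lambda}(\xi)$ the bottom of the spectrum of $H_\Lambda(\xi) = H_0(\xi) + \Phi(v_\Lambda)$, and by $\Sigma_{\mathrm{ess},\Lambda}(\xi)$ the usual function constructed from $\Sigma_{0,\Lambda}$, cf. (2.7). Applying Lemma 3.6 twice, we find that
$$|\Sigma_0(\xi) - \Sigma_{0,\Lambda}(\xi)| \le C \|\mathbf{1}(|k| > \Lambda)v\| \Bigl( \max\bigl\{ |\Sigma_0(\xi)|^{1/2}, |\Sigma_{0,\Lambda}(\xi)|^{1/2} \bigr\} + \|v\| \Bigr),$$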
where $C$ does not depend on $\xi$. This estimate implies
$$\bigl| (\Sigma_0(\xi) - \lambda_0)^{-1} - (\Sigma_{0,\Lambda}(\xi) - \lambda_0)^{-1} \bigr| \le C_\Lambda,$$
where $\lim_{\Lambda\to\infty} C_\Lambda = 0$, and $C_\Lambda$ does not depend on $\xi$. Here, $\lambda_0$ is as in (3.22). This estimate replaces (3.24) in the proof for the compactly supported case, and the rest of the proof is identical, except for the last line where the compactness now comes from Theorem 2.1(i), with compactly supported $v$, and not from Lemma 3.2.

To end this section, we explain how to verify Theorem 2.1(ii). Expanding on the notation (2.8), we introduce for $n \ge 1$,
$$I_0^{(n)}(\xi) := \bigl\{ k \in \mathbb{R}^{n\nu} \mid \xi - k^{(n)} \in I_0 \bigr\}. \tag{3.27}$$
See (2.4) for the notation $k^{(n)}$.

Proof of Theorem 2.1(ii). First we note that the construction of Weyl sequences in the proof of the HVZ theorem in [31, Sec. 3.2] goes through in exactly the same way, and implies that for all $\xi \in \mathbb{R}^\nu$,
$$\bigl\{ \Sigma_0^{(n)}(\xi; k) \mid n \ge 1 \text{ and } k \in I_0^{(n)}(\xi) \bigr\} \subset \sigma_{\mathrm{ess}}(H(\xi)). \tag{3.28}$$
It remains to prove that $\Sigma_0^{(n)}(\xi; k) \in \sigma_{\mathrm{ess}}(H(\xi))$ if $k \notin I_0^{(n)}(\xi)$. For $k \notin I_0^{(n)}(\xi)$, we have, by (3.27), that $\Sigma_0(\xi - k^{(n)}) = \Sigma_{\mathrm{ess}}(\xi - k^{(n)})$. By definition of $\Sigma_{\mathrm{ess}}$, there exists $n'$ and a sequence $\{k'_j\}_{j\in\mathbb{N}} \subset I_0^{(n')}(\xi - k^{(n)})$ such that
$$\begin{aligned} \Sigma_0^{(n)}(\xi; k) &= \Sigma_0(\xi - k^{(n)}) + \sum_{i=1}^{n} \omega(k_i) \\ &= \Sigma_{\mathrm{ess}}(\xi - k^{(n)}) + \sum_{i=1}^{n} \omega(k_i) \\ &= \lim_{j\to\infty} \Sigma_0^{(n')}(\xi - k^{(n)}; k'_j) + \sum_{i=1}^{n} \omega(k_i) \\ &= \lim_{j\to\infty} \Sigma_0^{(n+n')}(\xi; (k, k'_j)). \end{aligned}$$
Since $(k, k'_j) \in I_0^{(n+n')}(\xi)$ and the essential spectrum is a closed set, this together with (3.28) concludes the proof.

In the last paragraph of the proof above, we demonstrated the following:

Corollary 3.7. Let $v \in L^2(\mathbb{R}^\nu)$. Assume Conditions 2.1 and 2.2. The closure of the set $\{\Sigma_0^{(n)}(\xi; k) \mid n \in \mathbb{N},\ k \in I_0^{(n)}(\xi)\}$ equals $\{\Sigma_0^{(n)}(\xi; k) \mid n \in \mathbb{N},\ k \in \mathbb{R}^{n\nu}\}$.
4. On Uniqueness of Ground States

In this section, we prove Theorem 2.3 and Corollary 2.5(i). As for Theorem 2.3, we can assume that $v$ is not the zero function, since the theorem is trivially satisfied if the system is uncoupled. Let $A \subset \mathbb{R}^\nu$ be the Lebesgue measurable set
$$A := \{k \mid v(k) \neq 0\}, \tag{4.1}$$
which has non-zero measure. We define $d\Gamma(\omega) + \Omega(\xi - d\Gamma(k))$ directly on $\Gamma(L^2(A))$ and denote it by $H_0(\xi; A)$. The operator $H(\xi; A) = H_0(\xi; A) + \int_A \{v(k)a^*(k) + \overline{v(k)}a(k)\}\, dk$ is defined on $\Gamma(L^2(A))$ by the Kato–Rellich Theorem. We abbreviate in the following $\Phi_A(v) = \int_A \{v(k)a^*(k) + \overline{v(k)}a(k)\}\, dk$.

The Perron–Frobenius argument given in [31, Sec. 3.3] (cf. also [16, Sec. 3.2]) yields that a ground state of $H(\xi; A)$, if it exists, is non-degenerate and the eigenfunction $\psi_{\xi;A} = (\psi_{\xi;A}^{(0)}, \dots, \psi_{\xi;A}^{(n)}, \dots)$ can be chosen such that
$$(-1)^n v(k_1) \cdots v(k_n)\, \psi_{\xi;A}^{(n)}(k_1, \dots, k_n) > 0,$$
for almost every $(k_1, \dots, k_n) \in A^n$. We note that in [16, 31] only real-valued $v$ was considered. The argument however works also for complex valued $v$. The important observation is that $-a^*(v)$ and $-a(v)$ preserve the Hilbert cone. Alternatively, one can reduce the problem to the case $v \ge 0$, which was treated in [16], using the following observation: Let $\phi : \mathbb{R}^\nu \to \mathbb{R}$ be measurable. Then, the unitary transform $\Gamma(e^{i\phi})$ satisfies $\Gamma(e^{i\phi})^* = \Gamma(e^{-i\phi})$ and
$$\Gamma(e^{i\phi})(H_0(\xi) + \Phi(v))\Gamma(e^{-i\phi}) = H_0(\xi) + \Phi(e^{i\phi}v). \tag{4.2}$$

Let $j_0$ and $j_\infty$ be the restriction maps from $L^2(\mathbb{R}^\nu)$ to $L^2(A)$ and $L^2(A^c)$, respectively. Then, cf. (3.4),
$$\check\Gamma(j) : \Gamma(L^2(\mathbb{R}^\nu)) \to \Gamma(L^2(A)) \oplus \Bigl( \bigoplus_{\ell=1}^{\infty} \Gamma(L^2(A)) \otimes \Gamma^{(\ell)}(L^2(A^c)) \Bigr)$$
is unitary and as in (3.5) we get
$$H(\xi) = \check\Gamma(j)^* \Bigl[ H(\xi; A) \oplus \Bigl( \bigoplus_{\ell=1}^{\infty} \int_{(A^c)^\ell}^{\oplus} H^{(\ell)}(\xi; k, A)\, dk \Bigr) \Bigr] \check\Gamma(j). \tag{4.3}$$
Here, $H^{(\ell)}(\xi; k, A)$ is defined as in (3.6), replacing $H_\delta(\cdot)$ by $H(\cdot\,; A)$, and $\omega_\delta$ by $\omega$. From the Rayleigh–Ritz variational principle, we find
$$\inf \sigma(H^{(\ell)}(\xi; k, A)) \ge \inf \sigma(H^{(\ell)}(\xi; k)).$$
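Here, $H^{(\ell)}(\xi; k) = H^{(\ell)}(\xi; k, \mathbb{R}^\nu)$, cf. also [31, Secs. 2.5 and 2.6]. From this estimate, we get
$$\begin{aligned} \inf \sigma\Bigl( \int_{(A^c)^\ell}^{\oplus} H^{(\ell)}(\xi; k, A)\, dk \Bigr) &\ge \inf_{k\in(A^c)^\ell} \inf \sigma(H^{(\ell)}(\xi; k, A)) \\ &\ge \inf_{k\in(A^c)^\ell} \inf \sigma(H^{(\ell)}(\xi; k)) \\ &= \inf_{k\in(A^c)^\ell} \Sigma_0^{(\ell)}(\xi; k) \ \ge\ \Sigma_0^{(\ell)}(\xi). \end{aligned}$$
We thus get
$$\inf \sigma\Bigl( \bigoplus_{\ell=1}^{\infty} \int_{(A^c)^\ell}^{\oplus} H^{(\ell)}(\xi - k^{(\ell)}; A)\, dk \Bigr) \ge \Sigma_{\mathrm{ess}}(\xi). \tag{4.4}$$
Hence, if $\Sigma_0(\xi)$ is an isolated eigenvalue, i.e. $\Sigma_0(\xi) < \Sigma_{\mathrm{ess}}(\xi)$, then a corresponding eigenfunction must have the form $\psi_\xi = (\psi_{\xi;A}, 0)$, where $\psi_{\xi;A}$ is a ground state of $H(\xi; A)$. Theorem 2.3 now follows from the earlier discussion.

Proof of Corollary 2.5(i). First, suppose $\Sigma_{0,2}(\xi)$ is an isolated ground state and let $\psi_{0,2}$ denote the unique ground state satisfying (2.10) and (2.11). Then,
$$\Sigma_{0,1}(\xi) \le \langle\psi_{0,2}, H_1(\xi)\psi_{0,2}\rangle = \Sigma_{0,2}(\xi) + \langle\psi_{0,2}, \Phi(v_1 - v_2)\psi_{0,2}\rangle. \tag{4.5}$$
The last term on the right-hand side is non-positive under the hypothesis (2.14).

It remains to treat the case $\Sigma_{0,2}(\xi) = \Sigma_{\mathrm{ess},2}(\xi)$. For this, we study the bottom of the essential spectrum. Use Theorem 2.1(ii), Corollary 3.7, and the result just proved to estimate
$$\begin{aligned} \Sigma_{\mathrm{ess},2}(\xi) &= \min_n \inf_{k\in\mathbb{R}^{n\nu}} \Sigma_{0,2}^{(n)}(\xi; k) \\ &= \min_n \inf_{k\in I_{0,2}^{(n)}(\xi)} \Bigl\{ \Sigma_{0,2}(\xi - k^{(n)}) + \sum_{j=1}^{n} \omega(k_j) \Bigr\} \\ &\ge \min_n \inf_{k\in I_{0,2}^{(n)}(\xi)} \Bigl\{ \Sigma_{0,1}(\xi - k^{(n)}) + \sum_{j=1}^{n} \omega(k_j) \Bigr\} \\ &\ge \Sigma_{\mathrm{ess},1}(\xi). \end{aligned} \tag{4.6}$$
Hence, $\Sigma_{0,1}(\xi) \le \Sigma_{\mathrm{ess},1}(\xi) \le \Sigma_{\mathrm{ess},2}(\xi) = \Sigma_{0,2}(\xi)$.

Another consequence of Theorem 2.3 is that if $\Omega$ is real analytic, then $\Sigma_0$ restricted to $I_0$ is a real analytic function, cf. [15, Lemma 1.6].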
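5. Large Total Momenta

We begin with

Proposition 5.1. Suppose $v \in L^2(\mathbb{R}^\nu)$ has compact support, Conditions 2.1 and 2.2, and
\[ \lim_{|k|\to\infty} \frac{\omega(k)}{|k|} = \lim_{|k|\to\infty} \frac{\omega(k)}{\Omega(k)+1} = 0 \quad\text{and}\quad \gamma := \inf_{\xi}\ \inf_{\eta:\,|\eta|\ge|\xi|/2} \frac{\omega(\eta)}{\omega(\xi)} > 0. \tag{5.1} \]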
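Then, $I_0$ is a bounded set.

Proof. Let $\Lambda$ be such that $v(k) = 0$ for $|k| > \Lambda$. Put $A_\Lambda = \{k \in \mathbb{R}^\nu \mid |k| \le \Lambda\}$. Then, $H(\xi)$ partitions as in (4.3), with $A$ replaced by $A_\Lambda$. (Note that as opposed to the $A$ in (4.1), $v$ may vanish in $A_\Lambda$.) Since 0 is the ground state energy at $\xi = 0$ for the uncoupled model, we have from Corollary 2.5(i) that
\[ \Sigma_0(0) \le 0. \tag{5.2} \]
We have the basic bound, which follows from Theorem 2.1(ii) and (5.2),
\[ \Sigma_{\mathrm{ess}}(\xi) \le \Sigma_0^{(1)}(\xi;\xi) = \Sigma_0(0) + \omega(\xi) \le \omega(\xi). \tag{5.3} \]
Secondly, we write
\[ \kappa(\xi) := \inf_{\eta:\,|\eta|\ge|\xi|/2} \frac{\Omega(\eta)}{\omega(\eta)}. \tag{5.4} \]
Abbreviate $c = \omega_0/(2\Lambda)$ and estimate from below
\[
\begin{aligned}
H_0(\xi;A_\Lambda) &= \mathbf{1}(N \le |\xi|/(2\Lambda))\,H_0(\xi;A_\Lambda) + \mathbf{1}(N > |\xi|/(2\Lambda))\,H_0(\xi;A_\Lambda) \\
&\ge \mathbf{1}(N \le |\xi|/(2\Lambda))\,\Omega(\xi - d\Gamma(k)) + \mathbf{1}(N > |\xi|/(2\Lambda))\,d\Gamma(\omega) \\
&\ge \mathbf{1}(N \le |\xi|/(2\Lambda))\,\kappa(\xi)\,\omega(\xi - d\Gamma(k)) + \mathbf{1}(N > |\xi|/(2\Lambda))\,c|\xi| \\
&\ge \min\{\gamma\kappa(\xi),\ c|\xi|/\omega(\xi)\}\,\omega(\xi).
\end{aligned}
\tag{5.5}
\]
Here, we used $\gamma$ from (5.1) and that $\bigl|\sum_{j=1}^{n} k_j\bigr| \le |\xi|/2$ for $n \le |\xi|/(2\Lambda)$, since $|k_j| \le \Lambda$ on $A_\Lambda$. Since the constant in front of $\omega(\xi)$ goes to $+\infty$ as $|\xi| \to \infty$, we find in conjunction with (5.3) that there exists $C = C(\Lambda) > 0$ such that for $|\xi| \ge C$:
\[ \Sigma_{\mathrm{ess}}(\xi) \le \inf\sigma(H_0(\xi;A_\Lambda)). \tag{5.6} \]
However, since $\Sigma_0(\xi) \le \Sigma_{\mathrm{ess}}(\xi)$, this implies by the decomposition (4.3) and the estimate (4.4) that
\[ \forall\,\xi \ \text{s.t.}\ |\xi| \ge C:\quad \Sigma_0(\xi) = \Sigma_{\mathrm{ess}}(\xi). \tag{5.7} \]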
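In the following proposition, we make use of a function $f : [0,\infty) \to (0,\infty)$ which is monotone non-decreasing and satisfies
\[ \lim_{r\to\infty} \frac{f(r)}{r} = 0 \quad\text{and}\quad \rho := \inf_{r\ge 0} \frac{f(\tfrac12 r)}{f(r)} > 0. \tag{5.8} \]
Furthermore, the function $g(r) := r/f(r)$ is assumed to be strictly monotone increasing. Below, we will use the inverse $g^{-1}$ of $g$, which is also strictly monotone increasing and satisfies: $s/g^{-1}(s) \to \lim_{r\to\infty} f(r)^{-1} \ge 0$ as $s \to \infty$.

Theorem 5.2. Let $U \subset \mathbb{R}^\nu$ be an unbounded set and $v \in L^2(\mathbb{R}^\nu)$. Suppose Conditions 2.1 and 2.2, and that there exists $C_\omega \ge 1$ such that
\[ \forall\,k \in \mathbb{R}^\nu:\ \omega(k) \ge C_\omega^{-1} f(|k|), \qquad \forall\,k \in U:\ \omega(k) \le C_\omega f(|k|), \tag{5.9} \]
\[ \lim_{|k|\to\infty} \frac{\omega(k)}{\Omega(k)+1} = 0, \tag{5.10} \]
\[ \|\mathbf{1}(|k| > \Lambda)v\| = o\bigl([\Lambda/g^{-1}(\Lambda)]^{1/2}\bigr). \tag{5.11} \]
Then,
\[ \lim_{|\xi|\to\infty,\ \xi\in U} \bigl(\Sigma_{\mathrm{ess}}(\xi) - \Sigma_0(\xi)\bigr) = 0. \]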
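Remark 5.3. A choice of $f$ above is $f(r) = (1+r)^t$ with $0 \le t < 1$. The particular choice $t = 0$ applies to the Fröhlich polaron, and in this case, the assumption on $v$ is automatically satisfied for any $v \in L^2(\mathbb{R}^\nu)$ since $g(s) = g^{-1}(s) = s$. For general $t$, the assumption on $v$ becomes $\|\mathbf{1}(|k| > \Lambda)v\| = o(\Lambda^{-t/(2(1-t))})$. One could also multiply $(1+r)^t$ by powers of logarithms or by iterated logarithms.

Proof. Abbreviate $A_\Lambda = \{k \in \mathbb{R}^\nu \mid |k| \le \Lambda\}$. We repeat the estimate (5.5), taking into account the bounds in (5.8), (5.9) and (5.10), and obtain for $\xi \in U$
\[
\begin{aligned}
H_0(\xi;A_\Lambda) &\ge \mathbf{1}(N \le |\xi|/(2\Lambda))\,\kappa(\xi)\,\omega(\xi - d\Gamma(k)) + \mathbf{1}(N > |\xi|/(2\Lambda))\,\frac{\omega_0}{2\Lambda}|\xi| \\
&\ge \mathbf{1}(N \le |\xi|/(2\Lambda))\,\kappa(\xi)\,C_\omega^{-1} f(|\xi|/2) + \mathbf{1}(N > |\xi|/(2\Lambda))\,\frac{\omega_0}{2\Lambda}\,g(|\xi|)f(|\xi|) \\
&\ge C_1 \min\{\kappa(\xi),\ \Lambda^{-1}g(|\xi|)\}\,\omega(\xi).
\end{aligned}
\]
Here, $\kappa(\xi)$ is defined in (5.4) and $C_1 = C_\omega^{-1}\min\{\rho C_\omega^{-1},\ \omega_0/2\}$. As for (5.2), we also have $\Sigma_{0,\Lambda}(0) \le 0$. If $\Lambda$ and $\xi$ are such that
\[ \xi \in U, \quad C_1\kappa(\xi) \ge 1 \quad\text{and}\quad C_1 g(|\xi|) \ge \Lambda, \tag{5.12} \]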
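then we get as in the previous proof that $\Sigma_{0,\Lambda}(\xi) = \Sigma_{\mathrm{ess},\Lambda}(\xi)$ (cf. (5.6) and (5.7)). From Lemma 3.6, Corollary 2.5(i) and (5.3) we find that
\[
\begin{aligned}
0 \le \Sigma_{0,\Lambda}(\xi) - \Sigma_0(\xi) &\le C\,\|\mathbf{1}(|k| > \Lambda)v\|\,\bigl(\Sigma_0(\xi)^{1/2} + \|v\|^{1/2}\bigr) \\
&\le C_2\,\|\mathbf{1}(|k| > \Lambda)v\|\,\omega(\xi)^{1/2}.
\end{aligned}
\]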
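Now, put $\Lambda = C_1^{-1} g(|\xi|)$, cf. (5.12). We can assume without loss that $C_1 \le 1$. Then, using Corollary 2.5(i), we have the large $\xi \in U$ asymptotic bound,
\[
\begin{aligned}
\Sigma_{\mathrm{ess}}(\xi) - \Sigma_0(\xi) &= \Sigma_{\mathrm{ess}}(\xi) - \Sigma_{\mathrm{ess},\Lambda}(\xi) + \Sigma_{0,\Lambda}(\xi) - \Sigma_0(\xi) \\
&\le \Sigma_{0,\Lambda}(\xi) - \Sigma_0(\xi) \\
&\le C_2\,\|\mathbf{1}(|k| > C_1^{-1}g(|\xi|))v\|\,\omega(\xi)^{1/2} \\
&\le C_2\,\|\mathbf{1}(|k| > g(|\xi|))v\|\,\omega(\xi)^{1/2} \\
&\le o\bigl([g(|\xi|)/g^{-1}(g(|\xi|))]^{1/2}\bigr)\,f(|\xi|)^{1/2} = o(1).
\end{aligned}
\]
This establishes the theorem.

Proof of Theorem 2.1(iii). If $\omega(k) \to \infty$, then (iii) follows from (ii). We can hence assume that (2.3) is satisfied. Let $\xi \in \mathbb{R}^\nu$ and $E \ge \Sigma_{\mathrm{ess}}(\xi)$. Let $n$ be the smallest integer such that $E \ge \Sigma_0^{(n)}(\xi)$ and $E < \Sigma_0^{(\tilde n)}(\xi)$ for $\tilde n > n$. Let $\underline{k} = (k,\dots,k)$ and consider a sequence with $|k| \to \infty$. Then, by Theorem 5.2, $\Sigma_0(\xi - nk) - \Sigma_{\mathrm{ess}}(\xi - nk) \to 0$, and we can thus, for any $\epsilon > 0$, pick $k$ such that $\Sigma_0(\xi - nk) \ge \Sigma_{\mathrm{ess}}(\xi - nk) - \epsilon$. For such a $k$, we get
\[
\begin{aligned}
\Sigma_0^{(n)}(\xi,\underline{k}) &\ge \Sigma_{\mathrm{ess}}(\xi - nk) + n\omega(k) - \epsilon \\
&\ge \Sigma_0^{(\tilde n)}(\xi - nk;\tilde k) + n\omega(k) - 2\epsilon \\
&= \Sigma_0^{(n+\tilde n)}(\xi;(\underline{k},\tilde k)) - 2\epsilon \\
&\ge \Sigma_0^{(n+\tilde n)}(\xi) - 2\epsilon,
\end{aligned}
\]
where $\tilde n$ and $\tilde k \in \mathbb{R}^{\tilde n\nu}$ are chosen such that $\Sigma_{\mathrm{ess}}(\xi - nk) \ge \Sigma_0^{(\tilde n)}(\xi - nk;\tilde k) - \epsilon$. By continuity, there exists $\underline{k}$ such that $E = \Sigma_0^{(n)}(\xi,\underline{k})$. The result now follows from (ii).

Lemma 5.4. Let $v \in L^2(\mathbb{R}^\nu)$ and assume Conditions 2.1, 2.2 and 2.4. Let $\xi \in \mathbb{R}^\nu$ and suppose $n \ge 1$ is such that $\Sigma_0^{(n)}(\xi) < \min_{n' > n} \Sigma_0^{(n')}(\xi)$. Then, there exists $R = R(\xi) > 0$ such that
\[ \inf_{k\in\mathbb{R}^{n\nu}:\ |k^{(n)}| \ge R} \Sigma_0^{(n)}(\xi;k) > \Sigma_0^{(n)}(\xi). \]

Proof. This is clearly true under the hypothesis (2.12). We can hence assume that (2.13) holds true. Assume the lemma is false for some $\xi$. Fix $U := \{\eta \mid \Sigma_0(\eta) \le \Sigma_0^{(n)}(\xi)\}$, which is not an empty set since $\xi \in U$. There exists a sequence $\{\underline{k}_j\}_{j\in\mathbb{N}} \subset \mathbb{R}^{n\nu}$ such that $\xi - k_j^{(n)} \in U$, $|k_j^{(n)}| \to \infty$ for $j \to \infty$, and $\Sigma_0^{(n)}(\xi) = \lim_{j\to\infty} \Sigma_0^{(n)}(\xi;\underline{k}_j)$. Let $\epsilon > 0$. There exists by Theorem 5.2 a $j_0$ such that for $j \ge j_0$
\[
\begin{aligned}
\Sigma_0^{(n)}(\xi) &\ge \Sigma_0^{(n)}(\xi;\underline{k}_j) - \epsilon \\
&= \Sigma_0\bigl(\xi - k_j^{(n)}\bigr) + \sum_{i=1}^{n} \omega(k_{j;i}) - \epsilon \\
&\ge \Sigma_{\mathrm{ess}}\bigl(\xi - k_j^{(n)}\bigr) + \sum_{i=1}^{n} \omega(k_{j;i}) - 2\epsilon.
\end{aligned}
\]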
Fix a $j \ge j_0$ and pick $n' \ge 1$ and $\underline{k} \in \mathbb{R}^{n'\nu}$ such that
\[ \Sigma_0^{(n)}(\xi) \ge \Sigma_0^{(n')}\bigl(\xi - k_j^{(n)};\underline{k}\bigr) + \sum_{i=1}^{n} \omega(k_{j;i}) - 3\epsilon = \Sigma_0^{(n+n')}(\xi;(\underline{k}_j,\underline{k})) - 3\epsilon. \]
This estimate contradicts the choice of $n$.

The following crucial lemma was proved in [31]. It is a statement about functions defined as in (2.5) and (2.6). See also (3.27).

Lemma 5.5. Let $v \in L^2(\mathbb{R}^\nu)$. Assume Conditions 2.1 and 2.2. Let $\xi \in \mathbb{R}^\nu$, $n \ge 1$ and $k \in \mathbb{R}^{n\nu}$. If $\Sigma_0^{(n)}(\xi;k) < \inf_{n' > n} \Sigma_0^{(n')}(\xi)$, then $k \in I_0^{(n)}(\xi)$.

Proof of Corollary 2.5(ii). Suppose first $\Sigma_{0,2}(\xi)$ is an isolated eigenvalue. Under the extra assumption $v_1 \ne 0$ a.e. and $v_1 \ne v_2$, we get from (2.10) and (4.5) that $\Sigma_{0,1}(\xi) < \Sigma_{0,2}(\xi)$. Now, suppose $\Sigma_{0,2}(\xi) = \Sigma_{\mathrm{ess},2}(\xi)$. By a compactness argument, we conclude from Lemmas 5.4 and 5.5 as well as (4.6) that $\Sigma_{\mathrm{ess},1}(\xi) < \Sigma_{\mathrm{ess},2}(\xi)$. This completes the proof.

Recall that subadditivity implies that $n \mapsto \Sigma_0^{(n)}(\xi)$ are non-decreasing functions. In the following lemma, we show that under Condition 2.5 the functions are strictly increasing.

Lemma 5.6. Assume Conditions 2.1, 2.2 and 2.5. Then, for any $\xi \in \mathbb{R}^\nu$ and $n \ge 1$, we have $\Sigma_0^{(n)}(\xi) < \Sigma_0^{(n+1)}(\xi)$.

Proof. If $\lim_{|k|\to\infty} \omega(k)^{-1} = 0$, the result follows from strict subadditivity (2.15). Hence we only have to deal with $\omega$'s satisfying (2.16). Suppose the conclusion of the lemma is false. That is, there exist $\xi$ and $n \ge 2$ such that $\Sigma_0^{(n-1)}(\xi) = \Sigma_0^{(n)}(\xi)$. We can without loss of generality suppose that $\min_{n' > n} \Sigma_0^{(n')}(\xi) > \Sigma_0^{(n)}(\xi)$. Let $\{\underline{k}_\ell\} \subset \mathbb{R}^{n\nu}$ be a sequence with $\lim_{\ell\to\infty} \Sigma_0^{(n)}(\xi;\underline{k}_\ell) = \Sigma_0^{(n)}(\xi)$. By Lemma 5.4, there exists $R > 0$ such that $|k_\ell^{(n)}| \le R$ (for $\ell$ large). Suppose first that there exists $R'$ such that for all $1 \le j \le n$ and $\ell \ge 1$, we have $|k_{\ell;j}| \le R'$. Then, by strict subadditivity (2.15), we obtain a contradiction. If all the $k_{\ell;j}$'s cannot be uniformly bounded, there exist $1 \le i < j \le n$ such that $|k_{\ell;i}|$ and $|k_{\ell;j}|$ diverge. This together with (2.16) also yields a contradiction.

Remark 5.7. The assumption (2.16) is not just technical. As a counterexample consider the borderline case $\omega(k) = m(1 + \exp(-k^2/2))$, which is strictly subadditive but just fails to satisfy (2.16). We take $v \equiv 0$ and for $\Omega$, we take $\eta^2/(2M)$ with $m$ and $M$ chosen such that $mM < 1$. This choice ensures that the function $k \mapsto \Omega(-k) + \omega(k)$ has a global minimum at $k = 0$. We prove that for this example we have $\Sigma_0^{(2)}(0) = \Sigma_0^{(1)}(0)$: Assume this equality is false, that is $\Sigma_0^{(1)}(0) < \Sigma_0^{(2)}(0)$. Then, the assumption of Lemma 5.5 is satisfied with $\xi = 0$ and $n = 1$, which implies that
\[ \Sigma_0^{(1)}(0) = \Sigma_{\mathrm{ess}}(0) = \min_k\,(\Omega(-k) + \omega(k)) = 2m. \]
On the other hand, we obtain using $\Sigma_0(0) = \Omega(0) = 0$ for the uncoupled model
\[ \Sigma_0^{(2)}(0) = \inf_{k_1,k_2}\,\bigl(\Sigma_0(-k_1 - k_2) + \omega(k_1) + \omega(k_2)\bigr) \le \inf_k\,\bigl(\Sigma_0(0) + 2\omega(k)\bigr) = 2m. \]
This is a contradiction.

Proof of Theorem 2.6. We begin with (i). As demonstrated by Spohn, cf. [40, (5.14)], it suffices to show that $k \mapsto \Sigma_0^{(1)}(\xi;k)$ has a minimizer for all $\xi$, with $\xi - k \in I_0$. Here, we refer to the simpler proof given in [31] which uses the assumption $\omega(k) \to \infty$ to obtain a minimizer. We remark that the proof given in [31] only relies on:
(1) The HVZ Theorem.
(2) That ground state eigenfunctions are unique and can be chosen strictly positive with respect to the cone
\[ \mathcal{C} := \bigl\{\psi \in \mathcal{F} \bigm| \psi^{(0)} \ge 0 \ \text{and}\ \forall\,n \ge 1:\ (-1)^n\,\bar v \otimes\cdots\otimes \bar v\,\psi^{(n)} \ge 0 \ \text{a.e.}\bigr\}. \]
There are $n$ copies of $\bar v$ in the tensor product.
(3) For $\xi \notin I_0$, the function $k \mapsto \Sigma_0^{(1)}(\xi;k)$ attains its infimum at a momentum $k \in I_0^{(1)}(\xi)$, cf. (3.27).
The proof given in [31], which is formulated for real-valued $v$, goes through for complex-valued $v$ provided (1)–(3) above are satisfied. Alternatively, we employ the transformation (4.2). We remark that $\psi \in \mathcal{C}$ is said to be strictly positive if $\langle\psi,\varphi\rangle > 0$ for all $\varphi \in \mathcal{C}\setminus\{0\}$. The property (2) follows from Theorem 2.3 under the assumption (2.17). To verify (3), we recall that subadditivity of $\omega$ implies $\Sigma_{\mathrm{ess}}(\xi) = \Sigma_0^{(1)}(\xi)$. For $\xi \notin I_0$, we have $\Sigma_0(\xi) = \Sigma_{\mathrm{ess}}(\xi) = \Sigma_0^{(1)}(\xi)$. The existence of a minimizer $k$ now follows from Lemmas 5.4 and 5.6, applied with $n = 1$. That $k \in I_0^{(1)}(\xi)$ is a consequence of Lemmas 5.5 and 5.6.

As for (ii), the proof given in [31] relies on (1) and (2) above together with:
(3') For $\xi \notin I_0$, the set of global minima of $k \mapsto \Sigma_0^{(1)}(\xi;k)$ is a bounded subset of $I_0^{(1)}(\xi)$.
As above, (3') follows from Lemmas 5.4, 5.5 and 5.6, and as above, the proof is formulated for real-valued $v$, but goes through also for complex-valued $v$.

Let $E_0(\xi) := \inf\sigma(H_0(\xi))$ be the bottom of the spectrum for the uncoupled system. Then, by Corollary 2.5(i) and Lemma 3.6 (applied with $v_2 = 0$), there exists $C > 0$ such that
\[ E_0(\xi) - C E_0(\xi)^{1/2} - C \le \Sigma_0(\xi) \le E_0(\xi). \tag{5.13} \]
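The counterexample in Remark 5.7 rests on the claim that, for $mM < 1$, the map $k \mapsto \Omega(-k) + \omega(k)$ with $\Omega(\eta) = \eta^2/(2M)$ and $\omega(k) = m(1 + \exp(-k^2/2))$ attains its global minimum at $k = 0$, with value $2m$. The following is a minimal numerical sketch of that claim. The restriction to one dimension, the grid, the function names and the parameter values are illustrative assumptions and are not taken from the text.

```python
import math


def one_boson_threshold(k, m, M):
    """Omega(-k) + omega(k) for Omega(eta) = eta^2/(2M) and
    omega(k) = m * (1 + exp(-k^2/2)), in one dimension (assumption)."""
    return k * k / (2.0 * M) + m * (1.0 + math.exp(-k * k / 2.0))


def grid_minimum(m, M, k_max=5.0, steps=5000):
    """Return (k, value) minimizing the threshold function on a symmetric
    grid that contains k = 0 exactly. The grid is a hypothetical choice."""
    grid = [i * k_max / steps for i in range(-steps, steps + 1)]
    return min(((k, one_boson_threshold(k, m, M)) for k in grid),
               key=lambda kv: kv[1])


if __name__ == "__main__":
    # m*M < 1: minimum expected at k = 0 with value 2m.
    print(grid_minimum(0.5, 1.0))
    # m*M > 1: the minimum moves away from k = 0 and drops below 2m.
    print(grid_minimum(2.0, 1.0))
```

For $mM > 1$ the derivative $k\,(1/M - m e^{-k^2/2})$ changes sign away from the origin, so the second call illustrates why the condition $mM < 1$ is imposed in the remark.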
The following proposition is concerned with the rate of growth at large total momentum of the bottom of the spectrum. It turns out that at linear growth a transition in behavior occurs. Proposition 5.8. Assume Conditions 2.1 and 2.2.
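(i) Suppose there exist $c > 0$ and $0 \le s \le 1$ such that
\[ \omega(k) \ge c\langle k\rangle^s \quad\text{and}\quad \Omega(\eta) \ge c\langle\eta\rangle^s - c. \tag{5.14} \]
Then, there exists $C > 0$ such that
\[ C^{-1}\langle\xi\rangle^s - C \le \Sigma_0(\xi) \le \omega(\xi). \tag{5.15} \]
(ii) Suppose we have at least linear growth
\[ \sup_{k\in\mathbb{R}^\nu} \frac{|k|}{\omega(k)} < \infty \quad\text{and}\quad \sup_{\eta\in\mathbb{R}^\nu} \frac{|\eta|}{\Omega(\eta)+1} < \infty. \tag{5.16} \]
Then, there exists $C \ge 1$ such that
\[ C^{-1}|\xi| - C \le \Sigma_0(\xi) \le C|\xi| + C. \tag{5.17} \]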
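Proof. By (5.13), it suffices to establish the result with $v = 0$. The upper bound in (5.15) follows from (5.3). As for the upper bound in (5.17), let $n_\xi = [|\xi|] + 1$, where $[|\xi|]$ is the largest integer less than or equal to $|\xi|$. Then, $\Sigma_0(\xi) \le \Sigma_{\mathrm{ess}}(\xi) \le \Sigma_0(0) + n_\xi\,\omega(\xi/n_\xi) \le (|\xi| + 1)\,\omega(\xi/n_\xi)$. For the second inequality in (5.17), we can thus take $C = \sup_{|k|\le 1} \omega(k)$. Let $0 < \rho < 1$. For the lower bound, we estimate
\[ H_0(\xi) \ge \mathbf{1}(|\xi - d\Gamma(k)| \le \rho|\xi|)\,d\Gamma(\omega) + \mathbf{1}(|\xi - d\Gamma(k)| \ge \rho|\xi|)\,\Omega(\xi - d\Gamma(k)). \tag{5.18} \]
Let $C_1 := \sup_{k\in\mathbb{R}^\nu} |k|/\omega(k)$. Then, for the first term we can estimate, using the constraint $|\xi - d\Gamma(k)| \le \rho|\xi|$ and the subadditivity of $t \mapsto t^s$ for $0 \le s \le 1$,
\[
\begin{aligned}
\text{Case (i)}\quad & d\Gamma(\omega) \ge c\,d\Gamma(\langle k\rangle^s) \ge c\,d\Gamma(|k|)^s \ge c(1-\rho)^s|\xi|^s, \\
\text{Case (ii)}\quad & d\Gamma(\omega) \ge C_1^{-1}\,d\Gamma(|k|) \ge C_1^{-1}\,|d\Gamma(k)| \ge (1-\rho)\,C_1^{-1}|\xi|.
\end{aligned}
\tag{5.19}
\]
Write $C_2 = \sup_{\eta\in\mathbb{R}^\nu} |\eta|/(\Omega(\eta)+1)$. Then, under the constraint $|\xi - d\Gamma(k)| \ge \rho|\xi|$ we have
\[
\begin{aligned}
\text{Case (i)}\quad & \Omega(\xi - d\Gamma(k)) \ge c\langle\xi - d\Gamma(k)\rangle^s - c \ge c\rho^s|\xi|^s - c, \\
\text{Case (ii)}\quad & \Omega(\xi - d\Gamma(k)) \ge C_2^{-1}|\xi - d\Gamma(k)| - 1 \ge \rho\,C_2^{-1}|\xi| - 1.
\end{aligned}
\tag{5.20}
\]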
Combining (5.18)–(5.20), we obtain the lower bounds in (5.15) and (5.17).
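Case (i) of the lower bound uses, on each $n$-boson configuration, the chain $\sum_j |k_j|^s \ge (\sum_j |k_j|)^s \ge |\sum_j k_j|^s$ for $0 \le s \le 1$. As a sanity check, the sketch below samples random momentum configurations and tests both inequalities. The sampling range, the number of bosons, the dimension and the function name are assumptions made for illustration only.

```python
import math
import random


def check_subadditivity(s, n_bosons=5, dim=3, trials=1000, seed=0):
    """Test sum|k_j|^s >= (sum|k_j|)^s >= |sum k_j|^s on random samples.

    Returns True if both inequalities hold (up to rounding) on every sample.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        ks = [[rng.uniform(-10.0, 10.0) for _ in range(dim)]
              for _ in range(n_bosons)]
        norms = [math.sqrt(sum(c * c for c in k)) for k in ks]
        total = [sum(k[d] for k in ks) for d in range(dim)]
        total_norm = math.sqrt(sum(c * c for c in total))

        sum_of_powers = sum(r ** s for r in norms)   # d-Gamma(|k|^s)
        power_of_sum = sum(norms) ** s               # d-Gamma(|k|)^s
        power_of_total = total_norm ** s             # |d-Gamma(k)|^s

        slack = 1.0 - 1e-12  # tolerate floating-point rounding
        if sum_of_powers < power_of_sum * slack:
            return False
        if power_of_sum < power_of_total * slack:
            return False
    return True


if __name__ == "__main__":
    for s in (0.0, 0.5, 1.0):
        print(s, check_subadditivity(s))
```

The first inequality is the subadditivity of $t \mapsto t^s$ and the second is the triangle inequality combined with monotonicity of $t \mapsto t^s$.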
Acknowledgment The author is supported by a Skou stipend from the Danish Research Agency.
Appendix A. Geometric Partition of Unity

We recall briefly here the construction of a partition of unity introduced by Dereziński and Gérard in [7, Sec. 2.13]. See also [31, Sec. 2.3]. We assume the reader is familiar with the second quantization functor $\Gamma$ (cf. [4] or the two references mentioned in the previous paragraph).

First, let $h_0$ and $h_\infty$ be Hilbert spaces. Then, $I$ is the canonical isomorphism $I : \Gamma(h_0 \oplus h_\infty) \to \Gamma(h_0) \otimes \Gamma(h_\infty)$ given by
\[ I\,a^{\#}((f,g)) = \bigl(a^{\#}(f) \otimes \mathbf{1}_{\Gamma(h_\infty)} + \mathbf{1}_{\Gamma(h_0)} \otimes a^{\#}(g)\bigr)\,I \quad\text{and}\quad I\Omega = \Omega \otimes \Omega. \]
Here, $a^{\#}(h)$ denotes either $a(h)$ or $a^*(h)$.

Let $j = (j_0, j_\infty) : h \to h_0 \oplus h_\infty$ satisfy $j_0^* j_0 + j_\infty^* j_\infty = \mathbf{1}_h$. Then, the geometric partition of unity $\check\Gamma(j) : \Gamma(h) \to \Gamma(h_0) \otimes \Gamma(h_\infty)$ is given by
\[ \check\Gamma(j) := I\,\Gamma(j) \tag{A.1} \]
and is an isometry
\[ \check\Gamma(j)^*\,\check\Gamma(j) = \mathbf{1}_{\Gamma(h)}. \tag{A.2} \]
If furthermore $j_0 j_0^* = \mathbf{1}_{h_0}$ and $j_\infty j_\infty^* = \mathbf{1}_{h_\infty}$, then $\check\Gamma(j)$ is unitary.
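The two conditions on $j_0$ and $j_\infty$ in Appendix A can be illustrated at the one-particle level with finite matrices. The sketch below is a toy model only: it replaces $h$ by $\mathbb{R}^n$, takes real matrices (so the adjoint is the transpose), and compares a sharp partition by restriction maps with a smooth partition built from cosines and sines. All names and the finite-dimensional setting are assumptions for illustration; the code says nothing about the Fock-space operator $\check\Gamma(j)$ itself.

```python
import math


def transpose(a):
    return [list(col) for col in zip(*a)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def close(a, b, tol=1e-12):
    """Entrywise comparison of two matrices of equal shape."""
    return all(abs(x - y) <= tol
               for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def gram_sum(j0, jinf):
    """j0* j0 + jinf* jinf for real matrices."""
    p = matmul(transpose(j0), j0)
    q = matmul(transpose(jinf), jinf)
    return [[x + y for x, y in zip(rp, rq)] for rp, rq in zip(p, q)]


def sharp_maps(n, m):
    """Restriction maps: j0 keeps the first m coordinates, jinf the rest."""
    j0 = [[1.0 if c == r else 0.0 for c in range(n)] for r in range(m)]
    jinf = [[1.0 if c == r + m else 0.0 for c in range(n)]
            for r in range(n - m)]
    return j0, jinf


def smooth_maps(n):
    """Smooth partition: j0 = diag(cos theta_i), jinf = diag(sin theta_i)."""
    thetas = [0.5 * math.pi * i / (n - 1) for i in range(n)]
    j0 = [[math.cos(t) if i == j else 0.0 for j in range(n)]
          for i, t in enumerate(thetas)]
    jinf = [[math.sin(t) if i == j else 0.0 for j in range(n)]
            for i, t in enumerate(thetas)]
    return j0, jinf


if __name__ == "__main__":
    j0, jinf = sharp_maps(6, 2)
    print(close(gram_sum(j0, jinf), identity(6)),
          close(matmul(j0, transpose(j0)), identity(2)))
    s0, sinf = smooth_maps(6)
    print(close(gram_sum(s0, sinf), identity(6)),
          close(matmul(s0, transpose(s0)), identity(6)))
```

In this toy setting both partitions satisfy $j_0^* j_0 + j_\infty^* j_\infty = \mathbf{1}$, but only the restriction maps also satisfy $j_0 j_0^* = \mathbf{1}$ and $j_\infty j_\infty^* = \mathbf{1}$, which is the situation used in (4.3).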
Appendix B. Weak Ultraviolet Singularities

In this appendix, we treat the following type of couplings
\[ V_I := \bigl\{ v \in L^2_{\mathrm{loc}}(\mathbb{R}^\nu) \bigm| \omega^{-1/2}v \in L^2(\mathbb{R}^\nu) \bigr\}. \tag{B.1} \]
We remark that if $\omega$ is bounded, then $V_I = L^2(\mathbb{R}^\nu)$, and there are no ultraviolet singular couplings in this class. We nevertheless include this appendix for two reasons. First, it is of relevance for the Nelson model; secondly, it may serve as a warm-up for the more involved renormalization procedures discussed at the end of Sec. 2. If $v \in V_I$, the Hamiltonian $H$ can be constructed using the KLMN theorem [35], via the following lemma:

Lemma B.1. Let $v \in V_I$. For $\psi \in L^2(\mathbb{R}^\nu) \otimes C_0^\infty$ (algebraic tensor product) and $\varphi \in C_0^\infty$:
\[ |\langle\psi, \Phi(e^{-ik\cdot x}v)\psi\rangle| \le 2\,\|\omega^{-1/2}v\|\,\|\psi\|\,\langle\psi, \mathbf{1}_{L^2(\mathbb{R}^\nu)} \otimes d\Gamma(\omega)\psi\rangle^{1/2}, \tag{B.2} \]
\[ |\langle\varphi, \Phi(v)\varphi\rangle| \le 2\,\|\omega^{-1/2}v\|\,\|\varphi\|\,\langle\varphi, d\Gamma(\omega)\varphi\rangle^{1/2}. \tag{B.3} \]

Proof. The bound (B.3) was proved in (B.11). The bound (B.2) follows from (B.3) by applying the Lee–Low–Pines transformation (1.5) and using the second bound fiber by fiber.

We can now define $H$ as the self-adjoint operator associated with the closure of the semi-bounded form
\[ D(\Delta) \otimes C_0^\infty \ni \psi, \varphi \mapsto \langle\psi, [H_0 + \Phi(e^{-ik\cdot x}v)]\varphi\rangle \]
and the form domain of $H$ equals $D(H_0^{1/2})$. Note that this is the same Hamiltonian one obtains by applying [5, Theorem 2.2] with K = C and K = 0. We can also construct $H(\xi)$ as the self-adjoint operator associated with the closure of the form
\[ C_0^\infty \ni \psi, \phi \mapsto \langle\psi, [H_0(\xi) + \Phi(v)]\phi\rangle \]
and the form domain of $H(\xi)$ is independent of $\xi$ and equals $D^{1/2} := D(H_0(\xi)^{1/2})$. Note that for the (UV-regular) Hamiltonian $H_\Lambda$ with coupling $v_\Lambda = \mathbf{1}(|k| \le \Lambda)v$ or $v_\Lambda = \exp(-|k|/\Lambda)v$, we have: $H_\Lambda \to H$ in norm-resolvent sense. Similarly, the fiber Hamiltonians $H_\Lambda(\xi)$ converge to $H(\xi)$ in norm-resolvent sense locally uniformly in $\xi$. From this observation, it follows easily that $H$ is translation invariant and the Lee–Low–Pines operator (1.5) transforms $H$ into $\int^{\oplus}_{\mathbb{R}^\nu} H(\xi)\,d\xi$ as usual. See [5, 17] for a more refined analysis of confined linearly coupled models defined as forms. We proceed to discuss how to establish our main results for this larger class of interactions.

The HVZ Theorem. We begin with (i). The idea is the same as when we passed from compactly supported to square integrable $v$'s in Sec. 3. We just need to replace Lemma 3.6 by

Lemma B.2. Let $v_1, v_2 \in V_I$. Suppose Conditions 2.1 and 2.2. Then,
\[ \Sigma_{0,2}(\xi) - \Sigma_{0,1}(\xi) \le \|\omega^{-1/2}(v_1 - v_2)\|\,\bigl(4|\Sigma_{0,1}(\xi)|^{1/2} + 6\|\omega^{-1/2}v_1\|\bigr). \]

Proof. In order to verify this lemma, one should first observe that the estimates (3.8) and (3.9) can, using Lemma B.1, be replaced by: For $\psi \in D^{1/2}$, we have
\[ |\langle\psi, \Phi(v)\psi\rangle| \le 2\,\|\omega^{-1/2}v\|\,\|\psi\|\,\bigl(|\langle\psi, [H_0(\xi) + \Phi(\tilde v)]\psi\rangle|^{1/2} + |\langle\psi, \Phi(\tilde v)\psi\rangle|^{1/2}\bigr), \tag{B.4} \]
\[ |\langle\psi, \Phi(v)\psi\rangle| \le 4\,\|\omega^{-1/2}v\|\,\|\psi\|\,|\langle\psi, [H_0(\xi) + \Phi(v)]\psi\rangle|^{1/2} + 4\,\|\omega^{-1/2}v\|^2\,\|\psi\|^2. \tag{B.5} \]
Following the same strategy as in the proof of Lemma 3.6, we arrive at the result.

As for Theorem 2.1(ii) and (iii), they follow from norm resolvent convergence of $H_\Lambda(\xi)$ to $H(\xi)$. (As usual, $H_\Lambda(\xi)$ is defined with $v$ replaced by $v_\Lambda = \mathbf{1}(|k| \le \Lambda)v$.)

Uniqueness of ground states. Assume $v \ne 0$ almost everywhere. That the resolvent $(H(\xi) - \mu)^{-1}$ is positivity improving follows by approximating $v$ by $v_\Lambda = \exp(-|k|/\Lambda)v \in L^2(\mathbb{R}^\nu)$, and noting that for $\psi, \varphi \in \mathcal{C}\setminus\{0\}$, the expectation value $\langle\psi, (H_\Lambda(\xi) - \mu)^{-1}\varphi\rangle$ is non-zero and strictly increasing (expand in a Neumann series). The same argument works if $v \ne 0$ a.e. in $A \subset \mathbb{R}^\nu$ (measurable) and we restrict the operators to $\Gamma(L^2(A))$. The argument that isolated ground states are unique, also when $v$ vanishes on a set of non-zero measure, goes through unaltered.

Monotonicity. Let $v_1, v_2 \in V_I$. Suppose (2.14) is satisfied. By norm resolvent convergence, $\Sigma_{0,1}(\xi) \le \Sigma_{0,2}(\xi)$ (approximate $v_i$ by $\mathbf{1}(|k| \le \Lambda)v_i$). As for $\Sigma_{\mathrm{ess},i}(\xi)$,
we observe that Corollary 3.7 remains valid for $v \in V_I$ (again by an approximation argument), and hence the proof given in Sec. 4 goes through.

Large total momentum. Theorem 5.2 remains true and in the proof, one should use Lemma B.2 instead of Lemma 3.6. Hence, Theorem 2.4 remains true under the assumption $v \in V_I$. Before continuing we remark that Lemmas 5.4, 5.5 and 5.6 also remain valid.

Strict monotonicity. From the KLMN theorem, we found that the form domains of $H_1(\xi)$ and $H_2(\xi)$ coincide and equal $D^{1/2}$. By Lemma B.1, we have that $D^{1/2}$ is contained in the form domains of the $\Phi(v_i)$'s. Hence, the computation (4.5) remains valid and the proof given in Sec. 5 goes through unchanged.

Existence of ground states. First, we need to address the proof given in [31, Sec. 3.3]. The key was the following inequality ([31, Lemma 3.7]), which holds for $v \in L^2(\mathbb{R}^\nu)$ and $z < \Sigma_0(\xi)$ and goes back to [40]:
\[ \langle\Omega, (H(\xi) - z)^{-1}\Omega\rangle^{-1} \le \Omega(\xi) - z - \int_{\mathbb{R}^\nu} |v(k)|^2\,\langle\Omega, (H(\xi - k) + \omega(k) - z)^{-1}\Omega\rangle\,dk. \tag{B.6} \]
Note that in [31], (B.6) was formulated as the right-hand side being strictly positive, but the above bound is what was actually proved. (It was also assumed that $v \ge 0$, which is superfluous, cf. the discussion in Sec. 4.) For $v \in V_I$, apply (B.6) to $H_\Lambda(\xi)$ (defined by replacing $v$ by $\mathbf{1}(|k| \le \Lambda)v$) and take the limit $\Lambda \to \infty$. The left-hand side and the integrand on the right-hand side in (B.6) converge. Hence, the inequality in (B.6) remains valid in the limit by Fatou's lemma. (Alternatively, one could apply both the dominated and the monotone convergence theorem to the right-hand side.) The rest of the proof goes through unchanged.

Non-existence of embedded ground states. Here, we need to impose an extra assumption, namely: There exist $0 < \rho \le 1$ and $C_\Omega > 0$ such that
\[ |\nabla\Omega(\eta)| \le C_\Omega\,\Omega(\eta)^{1-\rho} + C_\Omega. \tag{B.7} \]
The argument relies on a pull-through formula which we first need to establish for the renormalized Hamiltonian.

We equip the form domain $D^{1/2}$ of $H_0(\xi)$ with the norm $\|\psi\|_{D^{1/2}} = \|(H_0(0) + 1)^{1/2}\psi\|$. Consider the operator $N^{1/2}$ as a densely defined operator on $D^{1/2}$, with domain
\[ D_N^{1/2} = (N + 1)^{-1/2} D^{1/2} = (H_0(0) + 1)^{-1/2}(N + 1)^{-1/2}\mathcal{F}. \]
We equip $D_N^{1/2}$ with the norm $\|\psi\|_{D_N^{1/2}} = \|(N + 1)^{1/2}\psi\|_{D^{1/2}}$. By duality, this gives the sequence of spaces
\[ D_N^{1/2} \subset D^{1/2} \subset \mathcal{F} \subset D^{1/2\,*} \subset D_N^{1/2\,*}, \]
where $D^{1/2\,*}$ is the completion of $\mathcal{F}$ in the norm $\|\psi\|_{D^{1/2\,*}} = \|(H_0(0) + 1)^{-1/2}\psi\|$ and $D_N^{1/2\,*}$ is the completion of $\mathcal{F}$ with respect to the norm $\|\psi\|_{D_N^{1/2\,*}} = \|(N + 1)^{-1/2}\psi\|_{D^{1/2\,*}}$.
We wish to extend the annihilation operator $a(\cdot)$, viewed as a map from $D(N^{1/2}) \to L^2(\mathbb{R}^\nu;\mathcal{F})$, to $D^{1/2\,*}$. Below we will use two representation formulas for square roots. We have for $t > 0$:
\[ t^{1/2} = -\frac{1}{\pi}\int_0^\infty \bigl[(t + x)^{-1} - x^{-1}\bigr]\,x^{1/2}\,dx, \tag{B.8} \]
\[ t^{-1/2} = \frac{1}{\pi}\int_0^\infty (t + y)^{-1}\,y^{-1/2}\,dy. \tag{B.9} \]
Either one can be verified by direct computation, and either one follows from the other. We learned the first of these formulas from [38, Lemma B.3]. We have the following:

Lemma B.3. For any compact set $K \subset \mathbb{R}^\nu$, the map $D(N^{1/2}) \ni \psi \mapsto a(\cdot)\psi \in L^2(K;\mathcal{F})$ extends by continuity to a bounded operator from $D^{1/2\,*}$ to $L^2(K; D_N^{1/2\,*})$.

Proof. Let $\varphi \in L^2(K; D_N^{1/2})$, the space dual to $L^2(K; D_N^{1/2\,*})$. Write $\varphi = (N + 2)^{-1/2}(H_0(0) + 1)^{-1/2}\tilde\varphi$ with $\tilde\varphi \in L^2(K;\mathcal{F})$. Then, for $\psi \in C_0^\infty$,
\[ \langle\varphi, a(\cdot)\psi\rangle = \int_K \langle\tilde\varphi(k), (H_0(0) + 1)^{-1/2}\,a(k)\,(N + 1)^{-1/2}\psi\rangle\,dk. \]
In order to commute $(H_0(0) + 1)^{-1/2}$ with $a(k)$, we compute using the formula (B.9) and the pull-through formula (see, e.g., [31, Proposition 2.2]) with $v \equiv 0$
\[ \bigl[(H_0(0) + 1)^{-1/2}, a(k)\bigr] = -\frac{1}{\pi}\int_0^\infty (H_0(0) + 1 + y)^{-1}\,\bigl(H_0(0) - H_0(-k) - \omega(k)\bigr)\,a(k)\,(H_0(0) + 1 + y)^{-1}\,y^{-1/2}\,dy. \]
This identity is in the sense of operators on $C_0^\infty$. We compute the following difference locally uniformly in $k$,
\[ H_0(0) - H_0(-k) - \omega(k) = \nabla\Omega(-d\Gamma(k)) \cdot k + O(1). \]
The extra assumption (B.7) on $\Omega$ yields the estimate
\[ \bigl\|\bigl[(H_0(0) + 1)^{-1/2}, a(\cdot)\bigr]\psi\bigr\|_{L^2(K;\mathcal{F})} \le C\,\|(H_0(0) + 1)^{-1/2}\psi\|, \]
for some $C > 0$. We thus get
\[
\begin{aligned}
|\langle\varphi, a(\cdot)\psi\rangle| &\le \|\tilde\varphi\|_{L^2(K;\mathcal{F})}\,\Bigl(\|a(\cdot)(N + 1)^{-1/2}(H_0(0) + 1)^{-1/2}\psi\|_{L^2(K;\mathcal{F})} + C\,\|(H_0(0) + 1)^{-1/2}\psi\|_{\mathcal{F}}\Bigr) \\
&\le (1 + C)\,\|\varphi\|_{L^2(K; D_N^{1/2})}\,\|\psi\|_{D^{1/2\,*}}.
\end{aligned}
\]
This concludes the proof.
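The square-root representations (B.8) and (B.9) used in the proof above can be checked numerically for scalar $t > 0$. The following sketch does this with a plain midpoint rule after mapping $(0,\infty)$ to $(0,1)$. The substitution $y = (s/(1-s))^2$, the number of grid points and the function names are choices made here for illustration; they are not part of the original argument.

```python
import math


def integrate_half_line(fn, n=20000):
    """Midpoint rule for the integral of fn over (0, inf).

    Uses the substitution y = u^2 with u = s / (1 - s), s in (0, 1),
    which removes the inverse-square-root singularity at y = 0.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * h
        u = s / (1.0 - s)
        y = u * u
        jacobian = 2.0 * u / (1.0 - s) ** 2  # dy/ds
        total += fn(y) * jacobian
    return total * h


def sqrt_via_B8(t):
    """Right-hand side of (B.8): -(1/pi) * int [(t+x)^-1 - x^-1] x^(1/2) dx."""
    integrand = lambda x: (1.0 / (t + x) - 1.0 / x) * math.sqrt(x)
    return -integrate_half_line(integrand) / math.pi


def inv_sqrt_via_B9(t):
    """Right-hand side of (B.9): (1/pi) * int (t+y)^-1 y^(-1/2) dy."""
    integrand = lambda y: 1.0 / ((t + y) * math.sqrt(y))
    return integrate_half_line(integrand) / math.pi


if __name__ == "__main__":
    for t in (0.5, 2.0, 7.0):
        print(t, sqrt_via_B8(t), math.sqrt(t),
              inv_sqrt_via_B9(t), 1.0 / math.sqrt(t))
```

In the proofs of Lemmas B.3 and B.4 the same formulas are applied with $t$ replaced by a positive self-adjoint operator, which this scalar check does not address.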
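Lemma B.4. Let $z \in \mathbb{C}$, Re(z) = 0. For any compact set $K \subset \mathbb{R}^\nu$, there exists $C > 0$ such that for all $0 < \Lambda < \infty$ and $\psi \in \mathcal{F}$, we have $\|(H_\Lambda(\xi) - z)^{-1}\psi\|_{\mathcal{F}} \le C\,\|\psi\|_{D_N^{1/2\,*}}$ uniformly in $\xi \in K$.

Proof. Let $\mu < \inf_{\Lambda,\,\xi\in K} \Sigma_{0,\Lambda}(\xi) - 1$. The lemma reduces to the following commutator bound:
\[ T_\Lambda^\xi(\psi) := \bigl\|\bigl[(H_\Lambda(\xi) - \mu)^{-1/2}, (N + 1)^{1/2}\bigr]\psi\bigr\|_{\mathcal{F}} \le C\,\|\psi\|_{D^{1/2\,*}}, \tag{B.10} \]
for $\psi \in C_0^\infty$. Here $C$ should be independent of $\Lambda$ and $\xi \in K$. In order to obtain this estimate, we use the formulas (B.8) and (B.9):
\[ T_\Lambda^\xi(\psi) \le \frac{1}{\pi^2}\int_0^\infty\!\!\int_0^\infty \bigl\|\bigl[(H_\Lambda(\xi) - \mu + y)^{-1}, (N + 1 + x)^{-1}\bigr]\psi\bigr\|_{\mathcal{F}}\ y^{-1/2}\,x^{1/2}\,dx\,dy. \tag{B.11} \]
We have
\[
\begin{aligned}
\bigl\|\bigl[(H_\Lambda(\xi) - \mu + y)^{-1}, (N + 1 + x)^{-1}\bigr]\psi\bigr\|_{\mathcal{F}}
&= \bigl\|(N + 1 + x)^{-1}(H_\Lambda(\xi) - \mu + y)^{-1}\,\Phi(iv_\Lambda)\,(H_\Lambda(\xi) - \mu + y)^{-1}(N + 1 + x)^{-1}\psi\bigr\|_{\mathcal{F}} \\
&\le 2(1 + x)^{-2}(1 + y)^{-1}\,\bigl\|a(iv_\Lambda)(H_0(0) + 1)^{-1/2}\bigr\|\,\bigl\|(H_\Lambda(\xi) - \mu + y)^{-1/2}(H_0(0) + 1)^{1/2}\bigr\|^2\,\|\psi\|_{D^{1/2\,*}}.
\end{aligned}
\tag{B.12}
\]
The first norm on the right-hand side is bounded uniformly in $\Lambda$ by (3.11). The second norm on the right-hand side is finite for each $\Lambda$ and $\xi$. In order to get uniformity, we argue as follows. For $\lambda > 0$, we estimate
\[
\begin{aligned}
\bigl\|(H_\Lambda(\xi) - \mu + y)^{-1/2}(H_0(\xi) + \lambda)^{1/2}\bigr\|^2
&= \bigl\|(H_\Lambda(\xi) - \mu + y)^{-1/2}(H_0(\xi) + \lambda)(H_\Lambda(\xi) - \mu + y)^{-1/2}\bigr\| \\
&\le 1 + \lambda + \bigl\|(d\Gamma(\omega) + \lambda)^{-1/2}\,\Phi(v_\Lambda)\,(d\Gamma(\omega) + \lambda)^{-1/2}\bigr\|\,\bigl\|(H_\Lambda(\xi) - \mu + y)^{-1/2}(H_0(\xi) + \lambda)^{1/2}\bigr\|^2.
\end{aligned}
\]
By (B.3), we find that $\|(d\Gamma(\omega) + \lambda)^{-1/2}\,\Phi(v_\Lambda)\,(d\Gamma(\omega) + \lambda)^{-1/2}\| \le C\lambda^{-1/2}$, and hence for $\lambda \ge (C/2)^2$, we get uniformly in $\Lambda$ and $\xi$:
\[ \bigl\|(H_\Lambda(\xi) - \mu + y)^{-1/2}(H_0(\xi) + \lambda)^{1/2}\bigr\|^2 \le 2(1 + \lambda). \]
Together with (B.11), (B.12) and the fact that $(H_0(0) + 1)^{1/2}(H_0(\xi) + \lambda)^{-1/2}$ is bounded locally uniformly in $\xi$, we get the required bound (B.10).

As a consequence of this lemma we get: Each resolvent $(H_\Lambda(\xi) - z)^{-1}$, $0 < \Lambda \le \infty$, extends by continuity to $B(D_N^{1/2\,*};\mathcal{F})$ with norm bounded uniformly in $\Lambda$ and locally uniformly in $\xi$. Furthermore, $(H_\Lambda(\xi) - z)^{-1}$ converges strongly to $(H(\xi) - z)^{-1}$ in $B(D_N^{1/2\,*};\mathcal{F})$, locally uniformly in $\xi$.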
We finally get the following extension of the pull-through formula as presented in [18, Proposition 3.4] (and [31, Proposition 2.3]). Our formula is closely related to the one presented in [5] except for the presence of the dispersive term $\Omega(\xi - d\Gamma(k))$.

Proposition B.5 (Pull-through). Let $z \in \mathbb{C}$, $\mathrm{Re}(z) < \Sigma_0(\xi)$. For any $\psi \in D^{1/2}$, we have as an $L^2(\mathbb{R}^\nu;\mathcal{F})$ identity
\[ a(k)\psi = (H(\xi - k) + \omega(k) - z)^{-1}\,a(k)\,(H(\xi) - z)\psi + v(k)\,(H(\xi - k) + \omega(k) - z)^{-1}\psi. \]

Remark B.6. The first expression on the right-hand side should be understood in the sense of composition of operators between weighted spaces. That the result is in $L^2(\mathbb{R}^\nu;\mathcal{F})$ (and not just $L^2_{\mathrm{loc}}(\mathbb{R}^\nu;\mathcal{F})$ as we get from the lemma above), is a consequence of the remaining two terms in the equation being square integrable.

Proof. We have, by the usual pull-through formula [18], the result with $v$ replaced by $v_\Lambda = \mathbf{1}(|k| \le \Lambda)v$ and $\psi \in D$. That the formula extends to $\psi \in D^{1/2}$ follows from Lemmas B.3 and B.4. Finally, we appeal again to Lemmas B.3 and B.4 to remove the ultraviolet cutoff $\Lambda$.

Now, Theorem 2.6(ii), with the extra assumption (B.7), follows from the pull-through formula just as in [31, Sec. 3.3].

References
[1] S. Albeverio, Scattering theory in a model of quantum fields, I, J. Math. Phys. 14 (1972) 1800–1816.
[2] ——, Scattering theory in a model of quantum fields, II, Helv. Phys. Acta 45 (1972) 303–321.
[3] N. Angelescu, R. A. Minlos and V. A. Zagrebnov, Lower spectral branches of a particle coupled to a bose field, Rev. Math. Phys. 17 (2005) 1111–1142.
[4] F. A. Berezin, The Method of Second Quantization, 1st edn. (Academic Press, New York, San Francisco, London, 1966).
[5] L. Bruneau and J. Dereziński, Pauli–Fierz Hamiltonians defined as quadratic forms, Rep. Math. Phys. 54 (2004) 169–199.
[6] J. T. Cannon, Quantum field theoretic properties of a model of Nelson: Domain and eigenvector stability for perturbed linear operators, J. Funct. Anal. 8 (1971) 101–152.
[7] J. Dereziński and C. Gérard, Asymptotic completeness in quantum field theory. Massive Pauli–Fierz Hamiltonians, Rev. Math. Phys. 11 (1999) 383–450.
[8] J. T. Devreese, Polarons, Encyclopedia of Applied Physics, eds. G. L. Trigg and E. H. Immergut, Vol. 14 (Wiley-VCH, 1996), pp. 383–409.
[9] M. D. Donsker and S. R. Varadhan, Asymptotics for the polaron, Comm. Pure Appl. Math. (1983) 505–528.
[10] J.-P. Eckmann, A model with persistent vacuum, Comm. Math. Phys. 18 (1970) 247–264.
[11] R. P. Feynman, Slow electrons in a polar crystal, Phys. Rev. 97 (1955) 660–665.
[12] R. P. Feynman, Statistical Mechanics. A Set of Lectures, Frontiers in Physics (W. A. Benjamin, Inc., Reading, Massachusetts, 1972).
[13] H. Fröhlich, Electrons in lattice fields, Adv. in Phys. 3 (1954) 325–362.
[14] J. Fröhlich, Mathematical discussion of models with persistent vacuum, ETH preprint (1971).
[15] ——, On the infrared problem in a model of scalar electrons and massless scalar bosons, Ann. Inst. Henri Poincaré 19 (1973) 1–103.
[16] ——, Existence of dressed one-electron states in a class of persistent models, Fortschr. Phys. 22 (1974) 159–198.
[17] V. Georgescu, C. Gérard and J. S. Møller, Spectral theory of massless Pauli–Fierz models, Comm. Math. Phys. 249 (2004) 29–78.
[18] C. Gérard, On the existence of ground states for massless Pauli–Fierz Hamiltonians, Ann. Henri Poincaré 1 (2000) 443–459.
[19] B. Gerlach and H. Löwen, Analytical properties of polaron systems or: Do polaronic phase transitions exist or not?, Rev. Modern Phys. 63 (1991) 63–90.
[20] J. Glimm and A. Jaffe, The λ(ϕ⁴)₂ quantum field theory without cutoffs: II. The field operators and the approximate vacuum, Ann. Math. 91 (1970) 362–401.
[21] L. Gross, Existence and uniqueness of physical ground states, J. Funct. Anal. 10 (1972) 52–109.
[22] ——, The relativistic polaron without cutoffs, Comm. Math. Phys. 31 (1973) 25–73.
[23] K. Hepp, Théorie de la Renormalisation, Lecture Notes in Physics, Vol. 2 (Springer-Verlag, Berlin, 1969).
[24] F. Hiroshima and H. Spohn, Mass renormalization in nonrelativistic QED, J. Math. Phys. 46 (2005) 27 pp.
[25] T. Holstein, Studies of polaron motion: Part II. The “small” polaron, Ann. Phys. 8 (1959) 343–389.
[26] M. I. Klinger, Problems of Linear Electron (Polaron) Transport Theory in Semiconductors, International Series in Natural Philosophy, Vol. 87, 1st edn. (Pergamon Press, 1979).
[27] T. D. Lee, F. E. Low and D. Pines, The motion of slow electrons in a polar crystal, Phys. Rev. 90 (1953) 297–302.
[28] E. Lieb and K. Yamazaki, Ground-state energy and effective mass of the polaron, Phys. Rev. 111 (1958) 728–733.
[29] E. H. Lieb and L. E. Thomas, Exact ground state energy of the strong-coupling polaron, Comm. Math. Phys. 183 (1997) 511–519; Erratum 188 (1997) 499–500.
[30] H. Löwen, Absence of phase transitions in Holstein systems, Phys. Rev. B 37 (1988) 8661–8667.
[31] J. S. Møller, The translation invariant massive Nelson model: I. The bottom of the spectrum, Ann. Henri Poincaré 6 (2005) 1091–1135.
[32] ——, On the essential spectrum of the translation invariant Nelson model, in Mathematical Physics of Quantum Mechanics, eds. J. Asch and A. Joye, Lecture Notes in Physics, Vol. 690 (Springer, 2006), pp. 179–195.
[33] E. Nelson, Interaction of non-relativistic particles with a quantized scalar field, J. Math. Phys. 5 (1964) 1190–1197.
[34] F. M. Peeters, X. Wu and J. T. Devreese, Ground-state energy of a polaron in n dimensions, Phys. Rev. B 33 (1986) 3926–3934.
[35] M. Reed and B. Simon, Methods of Modern Mathematical Physics: II, Fourier Analysis and Self-Adjointness, 1st edn. (Academic Press, San Diego, 1975).
[36] I. Sasaki, Ground state of the polaron in the relativistic quantum electrodynamics, J. Math. Phys. 46 (2005) 102307.
[37] A. D. Sloan, The polaron without cutoffs in two space dimensions, J. Math. Phys. 15 (1974) 190–201.
[38] T. Ø. Sørensen, Towards a relativistic Scott correction, Ph.D. thesis, Aarhus University (1998).
[39] H. Spohn, Effective mass of the polaron: A functional integral approach, Ann. Phys. 175 (1987) 278–318.
[40] ——, The polaron at large total momentum, J. Phys. A 21 (1988) 1199–1211.
[41] ——, Dynamics of Charged Particles and their Radiation Field (Cambridge University Press, 2004).
August 5, 2006 21:35 WSPC/148-RMP
J070-00273
Reviews in Mathematical Physics Vol. 18, No. 5 (2006) 519–543 c 2006 by the authors
FERROMAGNETISM OF THE HUBBARD MODEL AT STRONG COUPLING IN THE HARTREE–FOCK APPROXIMATION∗
VOLKER BACH
Institut für Mathematik, Universität Mainz, D-55099 Mainz, Germany
[email protected]

ELLIOTT H. LIEB
Departments of Mathematics and Physics, Jadwin Hall, Princeton University, P. O. Box 709, Princeton NJ 08544, USA
[email protected]

MARCOS V. TRAVAGLIA
Institut für Mathematik, Universität Mainz, D-55099 Mainz, Germany
[email protected]

Received 14 March 2006
Revised 1 June 2006

As a contribution to the study of the Hartree–Fock theory, we prove rigorously that the Hartree–Fock approximation to the ground state of the d-dimensional Hubbard model leads to saturated ferromagnetism when the particle density (more precisely, the chemical potential µ) is small and the coupling constant U is large, but finite. This ferromagnetism contradicts the known fact that there is no magnetization at low density, for any U, and thus shows that HF theory is wrong in this case. As in the usual Hartree–Fock theory, we restrict attention to Slater determinants that are eigenvectors of the z-component of the total spin, $S_z = \sum_x n_{x,\uparrow} - n_{x,\downarrow}$, and we find that the choice $2S_z = N$ = particle number gives the lowest energy at fixed $0 < \mu < 4d$.

Keywords: Hubbard model; ferromagnetism; Hartree–Fock theory.
Mathematics Subject Classification 2000: 82D40
∗ © 2006 by the authors. This article may be reproduced in its entirety for non-commercial purposes.

1. Introduction

The (one-band) Hubbard model has become a standard model for correlated electrons in condensed matter physics since it is, perhaps, the simplest possible model of
itinerant interacting electrons. In spite of its simplicity, its zero temperature phase diagram is rich with different magnetic phases such as paramagnetic, ferromagnetic, and anti-ferromagnetic phases, depending on the details of the hopping amplitudes, the (relative) coupling constant U/t and the filling parameter ν = N/(2|Λ|). As the Hubbard model is a many-body fermion model, the computation of its ground state for large lattices is a difficult, if not impossible, task, except in one dimension [1, 2]. Thus various schemes have been developed during the past decades to derive an approximate ground state and then to study its magnetic phase diagram. In the present paper, we consider the Hartree–Fock approximation of the (repulsive, one-band, nearest-neighbor-hopping) Hubbard model with the intention of studying the validity of the Hartree–Fock approximation. We require the Slater determinants entering the Hartree–Fock energy functional to be eigenfunctions of the operator $S_z := \sum_{x\in\Lambda} \{n_{x,\uparrow} - n_{x,\downarrow}\}$ of total spin in the z-direction, and for this reason we refer to the model as the HFz approximation. Our requirement means that each orbital has the form $\varphi(x) \otimes |\!\uparrow\rangle$ or $\varphi(x) \otimes |\!\downarrow\rangle$. This is a restriction in the sense that general orbitals are of the form $\varphi(x,\sigma)$, in which the spin direction depends on position. No other restriction is imposed on the variational states; in particular, no assumption about translation invariance is made a priori. For the HFz model, at small chemical potential and for sufficiently strong repulsion, we give a mathematical proof of saturated ferromagnetism in the Hartree–Fock ground state. That is, the HF ground state has maximal total spin and maximal ferromagnetic long-range spatial order. The smallness of the chemical potential and the large strength of the repulsion also ensure that the HF ground state density is strictly below half-filling.
Before we come to a detailed description of our result and its proof, we discuss it in comparison to other works. The appearance of ferromagnetic behavior has been anticipated in many studies of the Hubbard model and approximations thereof. Among these are (restricted) Hartree–Fock approximations [3], DMFT models in the limit of infinite spatial dimension [4–7], exact diagonalizations on small lattices [8], variational calculations [9] and studies at low filling [10]. These studies support the conjecture that, for large coupling U/t ≫ 1 and away from half-filling, ν ≠ 1/2, the ground state of the Hubbard model is ferromagnetic. Ferromagnetism has been established for the (full) Hubbard model in case the dispersion relation leads to a very high density of states around the Fermi energy [11–13] and in case of the next-nearest-neighbor hopping [14, 15]. As mentioned before, the main purpose of the present paper is to prove ferromagnetic behavior with mathematical rigor. None of the papers [4–8] cited above match the standards of a mathematical proof: The orbitals in the Hartree–Fock approximation are a priori assumed to be composed of only a few Fourier modes; the error terms when taking the limit of infinite spatial dimension in DMFT are not under control; exact diagonalizations are restricted to very small lattices and
Ferromagnetism of the Hubbard Model at Strong Coupling
the implication of these to the thermodynamic limit remains unclear. The work by Mielke and Tasaki [11–13] is mathematically rigorous, but the assumptions made therein about the lattice structure are rather special. On the other hand, by adding the next-nearest-neighbor hopping (two-band Hubbard model), Tasaki [14, 15] has found a Hubbard model that displays ferromagnetism in all dimensions. Tasaki also reviews rigorous results on ferromagnetism in the Hubbard model in [16]. While the prediction of ferromagnetism in the Hubbard model and approximations thereof is supported by the above studies, we also know that the HF theory predicts anti-ferromagnetism (in the sense that the total spin is zero) at higher densities, notably at half-filling [17]. Furthermore, our proof shows saturated ferromagnetism at low density and sufficiently large coupling in HF theory, even in one dimension, but the actual ground state always has spin zero in one dimension as long as there is only the nearest-neighbor hopping (see [18]). Even more seriously, our conclusion is opposite to what actually occurs in the Hubbard model. Namely, at very low density (and independent of the value of U > 0), there is no magnetization in the ground state of this model. In the ground state $S_z$ is close to zero and converges to zero, as the particle density tends to zero. This has been pointed out in [19, 16], based on arguments similar to the following transcription to lattice systems of the recent work [20]. In [20], it was shown that fermions in the 3-dimensional continuum $\mathbb{R}^3$ (instead of the lattice $\mathbb{Z}^3$), and with a repulsive two-body potential, have a ground state energy density, e, given by
\[ e(\rho_\uparrow, \rho_\downarrow) = \frac{\hbar^2}{2m}\,\frac{3}{5}\,(6\pi^2)^{2/3}\left(\rho_\uparrow^{5/3} + \rho_\downarrow^{5/3}\right) + \frac{\hbar^2}{2m}\, 8\pi a\, \rho_\uparrow \rho_\downarrow + \text{higher order in } (\rho_\uparrow, \rho_\downarrow), \tag{1.1} \]
where ρ↑, ρ↓ are the densities of the “spin-up” and the “spin-down” fermions and a is the scattering length of the two-body potential. Because $\rho^{5/3}$ dominates $\rho^2$ for small ρ, it is clear from (1.1) that the minimum energy occurs approximately, if not exactly, when ρ↑ = ρ↓ = ρ/2. This answers the questions in [21, Problem 3]. To show that there is vanishing net magnetization as ρ → 0, one only needs an upper bound for e of the form (1.1). For the Hubbard model (where the two-body potential is a positive delta function, or even a hard core), this can conveniently be done by a variational wave function of the form $\Psi = F\Psi_0$, where $\Psi_0$ is a Slater determinant, and F is the projection onto the states with no double occupancy, in imitation of [19, 16, 20]. We omit the details, but we draw attention to the fact that $F\Psi_0$ is not a Slater determinant, reflecting the more complex structure of correlations in the actual ground state of the Hubbard model. The proof of an analog of (1.1) with precise constants is a more complicated matter which is now under investigation, but it is not needed for the present discussion. Our setting is the usual (repulsive) Hubbard model with the nearest-neighbor hopping on a d-dimensional cubic lattice Λ, with periodic boundary conditions and
linear size L, which we assume to be an even integer. It is defined by the second quantized Hamiltonian
\[ H_{\mu,U} = \sum_{x,y\in\Lambda,\ \sigma=\uparrow,\downarrow} \left(-\Delta_{x,y} - \mu\,\delta_{x,y}\right) c^*_{x,\sigma} c_{y,\sigma} + U \sum_{x\in\Lambda} n_{x,\uparrow} n_{x,\downarrow}. \tag{1.2} \]
We work at fixed chemical potential µ instead of fixed particle number. The only slightly unusual notation is $\Delta_{x,y} = T_{x,y} - 2d\,\delta_{x,y}$ for the matrix elements of the discrete Laplacian ∆ on Λ, with $T_{x,y} := \mathbb{1}[|x-y|_1 = 1]$ being the nearest-neighbor hopping matrix and $\delta_{x,y} = \mathbb{1}[x = y]$ the Kronecker delta. The operators $c^*_{x,\sigma}$, $c_{x,\sigma}$, and $n_{x,\sigma} := c^*_{x,\sigma} c_{x,\sigma}$ are the usual fermion creation, annihilation, and number operators, respectively, at site x ∈ Λ and of spin σ ∈ {↑, ↓}, obeying the canonical anti-commutation relations $\{c_{x,\sigma}, c_{y,\tau}\} = \{c^*_{x,\sigma}, c^*_{y,\tau}\} = 0$, $\{c_{x,\sigma}, c^*_{y,\tau}\} = \delta_{x,y}\delta_{\sigma,\tau}$, and $c_{x,\sigma}|0\rangle = 0$, for all x, y, σ, τ. Here |0⟩ is the vacuum vector in the usual Fock space $\mathcal{F}_\Lambda := \mathcal{F}_f(\mathbb{C}^\Lambda \otimes \mathbb{C}^2)$ of spin-1/2 fermions. The Hamiltonian $H_{\mu,U}$ depends parametrically on the chemical potential µ > 0 and the coupling constant U > 0. Note that the usual hopping parameter t equals 1 here and that the discrete Laplacian ∆ differs from the usual hopping matrix by the inclusion of the diagonal term, i.e., 2d times the identity matrix. This difference amounts to a convenient redefinition of the chemical potential µ, so that µ = 0 corresponds precisely to zero filling since the hopping matrix −∆ ≥ 0 is a positive semi-definite matrix. Moreover, the boundedness 0 < µ < 4d of µ together with the assumption that U ≫ 4d ensures that the corresponding electron density in the HF ground state is always at low filling, i.e., strictly below half-filling, 0 ≤ ρ < 1. Our definition of µ is convenient because in this paper, we are concerned with the Hubbard model at low filling, and our assumption of a bounded chemical potential 0 ≤ µ ≤ 2d. Apart from this, everything is standard. The Hamiltonian $H_{\mu,U}$ is a linear operator on the Fock space and the ground state energy $E^{(\mathrm{gs})}_{\mu,U}$ is its smallest eigenvalue,
\[ E^{(\mathrm{gs})}_{\mu,U} := \min\left\{ \langle \Psi \,|\, H\Psi \rangle \;\middle|\; \Psi \in \mathcal{F}_\Lambda,\ \|\Psi\| = 1 \right\}. \tag{1.3} \]
As the dimension $\dim(\mathcal{F}_\Lambda) = 2^{\dim(\mathbb{C}^\Lambda \otimes \mathbb{C}^2)} = 4^{(L^d)} < \infty$ is finite, the determination of $E^{(\mathrm{gs})}_{\mu,U}$ amounts to diagonalizing the finite-dimensional, selfadjoint matrix $H_{\mu,U}$. The fast growth of this dimension with the number $L^d$ of points in the lattice Λ, however, allows for an explicit diagonalization of $H_{\mu,U}$ by a modern computer only up to L = 4, in three spatial dimensions, d = 3. The Hartree–Fock (HF) approximation is an important method to reduce the high-dimensional many-particle problem given by the diagonalization of $H_{\mu,U}$ to a low-dimensional, but nonlinear variational problem. It is defined by restricting the minimization in (1.3) to Slater determinants $\varphi_1 \wedge \cdots \wedge \varphi_N$, where $\{\varphi_i\}_{i=1}^N \subseteq \mathbb{C}^\Lambda \otimes \mathbb{C}^2$ is an orthonormal family of N one-electron wave functions. The HF approximation to the Hubbard model was analyzed in [17] in the special situation when the number
of electrons equals the number of lattice sites, N = |Λ|, which is usually referred to as half-filling. Note that a priori no other condition but orthonormality is imposed on the orbitals $\{\varphi_i\}_{i=1}^N$ in the Slater determinants varied over in Hartree–Fock theory. This is sometimes stressed by calling it the unrestricted Hartree–Fock theory. Let us temporarily consider a general many-body Hamiltonian H which commutes with a certain symmetry operator S, i.e., [H, S] = 0. It is important to note that in this case, the HF ground state $\Phi_{\mathrm{hf}}$, i.e., the Slater determinant which minimizes the energy $\langle \Phi_{\mathrm{hf}} | H\Phi_{\mathrm{hf}} \rangle$, is not necessarily an eigenstate of S. Phrased differently, the unrestricted Hartree–Fock theory may (depending on the model) break the symmetry S. The following are examples that occur in physically relevant situations: unrestricted HF ground states of atoms are, in general, not eigenfunctions of the angular momentum operator (because in unrestricted HF theory, all shells are filled [22]), even though the atomic Hamiltonian is rotationally invariant; the ground state in the BCS theory of superconductors (which is a variant of the HF theory) is not an eigenfunction of the number operator, even though the BCS Hamiltonian preserves the particle number; a HF ground state for the Hubbard model with non-zero spin breaks the invariance of the Hubbard Hamiltonian under global spin rotations; charge density waves (CDW) and spin density waves (SDW) of the Hubbard model are translation invariant only by translation of an even number of lattice sites, breaking the (full) translation symmetry the Hubbard Hamiltonian $H_{\mu,U}$ possesses. As it is impossible to predict a priori whether a symmetry of the Hamiltonian is preserved or not, we call all variations of $\langle \Phi | H\Phi \rangle$ over Slater determinants Φ which fulfill an additional constraint restricted Hartree–Fock theory. In this paper, we consider a restricted Hartree–Fock theory, which we term the HFz approximation.
The further restriction imposed is that we minimize in (1.3) only over Slater determinants Φ that are eigenfunctions of the operator $S_z := \sum_{x\in\Lambda} \{n_{x,\uparrow} - n_{x,\downarrow}\}$ of total spin in the z-direction. One could rephrase our condition by saying that we do not allow for spiral spin density waves (SSDW; see, e.g. [3]) in (1.3). Once again, it is customary to employ this restriction in HF calculations without explicitly drawing attention to the fact that this is a restriction. (In [17] mentioned above, however, we dealt with truly unrestricted HF theory.) More concretely, our HF wave functions have the form
\[ \Phi = \prod_{i=1}^{N_\uparrow} c^*_\uparrow(f_i) \prod_{j=1}^{N_\downarrow} c^*_\downarrow(g_j)\, |0\rangle, \tag{1.4} \]
where $c^*_{\uparrow,\downarrow}(f) = \sum_{x\in\Lambda} f(x)\, c^*_{x,\uparrow,\downarrow}$, the integers $N_{\uparrow,\downarrow}$ are the particle numbers, and where the $f_i$ and $g_j$ are two families of orthonormal wave functions on the lattice Λ, i.e., $\langle f_i | f_j \rangle = \langle g_i | g_j \rangle = \delta_{i,j}$, with $\langle f | g \rangle := \sum_{x\in\Lambda} \overline{f(x)}\, g(x)$ denoting the usual hermitian scalar product for such functions.
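It is convenient to rephrase the HFz approximation in terms of one-particle density matrices, i.e., complex, selfadjoint Λ × Λ matrices whose eigenvalues lie between 0 and 1. To this end, we denote
\[ K_\mu := -\Delta - \mu \tag{1.5} \]
and observe that
\[ \langle \Phi | H\Phi \rangle = \sum_{i=1}^{N_\uparrow} \langle f_i | K_\mu f_i \rangle + \sum_{j=1}^{N_\downarrow} \langle g_j | K_\mu g_j \rangle + U \sum_{x\in\Lambda} \Big( \sum_{i=1}^{N_\uparrow} |f_i(x)|^2 \Big) \Big( \sum_{j=1}^{N_\downarrow} |g_j(x)|^2 \Big). \tag{1.6} \]
Introducing the one-particle density matrices $\gamma_{\uparrow,\downarrow}$ corresponding to Φ by
\[ \gamma_\uparrow := \sum_{i=1}^{N_\uparrow} |f_i\rangle\langle f_i| \quad\text{and}\quad \gamma_\downarrow := \sum_{i=1}^{N_\downarrow} |g_i\rangle\langle g_i|, \tag{1.7} \]
we observe that $\gamma_{\uparrow,\downarrow} = \gamma^*_{\uparrow,\downarrow} = \gamma^2_{\uparrow,\downarrow}$ are orthogonal projections of dimension $N_{\uparrow,\downarrow}$ and that the energy expectation value of the Slater determinant Φ is given by $\langle \Phi | H\Phi \rangle = \mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma_\uparrow, \gamma_\downarrow)$, where
\[ \mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma_\uparrow, \gamma_\downarrow) := \mathrm{Tr}\{K_\mu(\gamma_\uparrow + \gamma_\downarrow)\} + U \sum_{x\in\Lambda} \rho_\uparrow(x)\rho_\downarrow(x), \tag{1.8} \]
and the diagonal matrix elements $\rho_{\uparrow,\downarrow}(x) := (\gamma_{\uparrow,\downarrow})_{x,x}$ of $\gamma_{\uparrow,\downarrow}$ are the one-particle densities of the electron with spin up (“↑”) and spin down (“↓”), respectively. The symbol “Tr” denotes the usual trace $\mathrm{Tr}\{A\} = \sum_{x\in\Lambda} A_{x,x}$ of a complex Λ × Λ matrix $A = (A_{x,y})_{x,y\in\Lambda}$ with $A_{x,y} \in \mathbb{C}$. That is, “Tr” is the trace over the states in $\mathbb{C}^\Lambda$ of a single spinless particle on the lattice Λ. It does not include spin states, and it is not the trace over states in Fock space. Let us note that the particle numbers $N_{\uparrow,\downarrow}$ are not determined ab initio. We are in the grand canonical ensemble, so they are determined by the condition that the total energy (1.8) is minimized. These observations motivate us to define the HFz energy by the following variational principle over projections:
\[ E^{(\mathrm{hfz})}_{\mu,U} := \min\left\{ \mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma_\uparrow, \gamma_\downarrow) \;\middle|\; \gamma_{\uparrow,\downarrow} = \gamma^*_{\uparrow,\downarrow} = \gamma^2_{\uparrow,\downarrow} \right\}. \tag{1.9} \]
The two sets of orthogonal projections on $\mathbb{C}^\Lambda$ over which we minimize in (1.9) are not really well-suited for a variational analysis. In particular, they are not convex. An observation in [23], however, states that, because U ≥ 0, we will obtain the same value for the minimum if we vary over the larger set of all one-particle density matrices, $0 \le \gamma_{\uparrow,\downarrow} \le 1$, not only over projections. (Recall that a density matrix is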
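a hermitian Λ × Λ matrix γ whose eigenvalues lie between 0 and 1, i.e., 0 ≤ γ ≤ 1, as a matrix inequality.) Our extended $E^{(\mathrm{hfz})}_{\mu,U}$ is then
\[ E^{(\mathrm{hfz})}_{\mu,U} = \min\left\{ \mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma_\uparrow, \gamma_\downarrow) \;\middle|\; 0 \le \gamma_{\uparrow,\downarrow} \le 1 \right\}. \tag{1.10} \]
The evaluation of $E^{(\mathrm{hfz})}_{\mu,U}$ and the determination of those pairs $(\gamma_\uparrow, \gamma_\downarrow)$ of one-particle density matrices that minimize $\mathcal{E}^{(\mathrm{hfz})}_{\mu,U}$ is the objective of this paper. Our main result is that, for any 0 < µ < 4d, the minimal value of $\mathcal{E}^{(\mathrm{hfz})}_{\mu,U}$ is attained for the saturated ferromagnet, provided U < ∞ is sufficiently large.
Theorem 1.1 (Ferromagnetism). For any 0 < µ < 4d, there is a finite length $L_\#(\mu)$ and a finite coupling constant $U_\#(\mu) \ge 0$, such that, for all even $L \ge L_\#(\mu)$ and all $U \ge U_\#(\mu)$, the minimal HFz energy is given by the sum of the negative eigenvalues of −∆ − µ,
\[ E^{(\mathrm{hfz})}_{\mu,U} = \mathrm{Tr}\{[-\Delta - \mu]_-\}. \tag{1.11} \]
If µ is not an eigenvalue of −∆ and if $(\gamma_\uparrow, \gamma_\downarrow)$ is a minimizer of the HFz functional, i.e., $0 \le \gamma_{\uparrow,\downarrow} \le 1$, and $\mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma_\uparrow, \gamma_\downarrow) = E^{(\mathrm{hfz})}_{\mu,U}$, then either
\[ \gamma_\uparrow = \mathbb{1}[-\Delta < \mu], \qquad \gamma_\downarrow = 0 \tag{1.12} \]
or
\[ \gamma_\uparrow = 0, \qquad \gamma_\downarrow = \mathbb{1}[-\Delta < \mu], \tag{1.13} \]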
where $\mathbb{1}[-\Delta < \mu]$ is the spectral projection of −∆ onto (−∞, µ). [With reference to Eq. (1.11) and elsewhere, note that in our notation, $[X]_- = \min\{X, 0\}$ is negative, whereas elsewhere one often defines $[X]_-$ to be positive, i.e., $[X]_- := \max\{-X, 0\}$. If X is a selfadjoint operator, then $[X]_-$ denotes the negative part of X and $\mathrm{Tr}[X]_-$ is the sum of the negative eigenvalues of X.] Theorem 1.1 is not really as complicated as it looks. It is stated in terms of a length $L_\#$ and coupling constant $U_\#$ in order to make it clear that the state of saturated ferromagnetism is obtained not only asymptotically in the thermodynamic limit and asymptotically as U → ∞, but it holds for all systems with large, finite interaction and sufficiently large size. Theorem 1.1 states that, for any value of the chemical potential µ ∈ (0, 4d), the HFz variational principle yields a ferromagnetic minimizer, provided U and L are chosen sufficiently large (but still finite). A similar statement was proved in [17, Theorem 4.7] for U = ∞ (which amounts to requiring $\langle \Phi | n_{x,\uparrow} n_{x,\downarrow} \Phi \rangle = 0$, on every lattice site x ∈ Λ). At first sight, Theorem 1.1 seems to contradict another fact proved in [17] that the HF minimizer is anti-ferromagnetic at half-filling. But as the definition of the chemical potential µ in the present paper differs from its definition in [17] by 2d + U, the parameter ranges of the present paper and of [17] never overlap and, hence, there is no contradiction. As just mentioned, the minimal HF energy and the minimal HFz energy agree in the half-filling case, as shown in [17]. We conjecture that this is also the case for
the range of the chemical potential µ ∈ (0, 4d) and sufficiently large U, but we do not know how to prove this conjecture. This is a topic for future research. From Theorem 1.1 we conclude that at small filling, there is a phase transition (within the context of HFz theory) from paramagnetism for small U to saturated ferromagnetism for large U. This follows from continuity and the fact that when U = 0, we can find the ground state explicitly and, as is well known, it has S = 0 and is obtained from filling up the Fermi sea for both ↑ and ↓ states. If 0 < µ ≤ 1/2, then we can estimate $L_\#(\mu)$ and $U_\#(\mu)$ in Theorem 1.1 more explicitly. For the precise formulation of these estimates, we introduce the following constants,
\[ L_*(\mu) := 2M_*(\mu) := 24\,(4d)^2 \mu^{-2}, \tag{1.14} \]
\[ \kappa(\mu) := \frac{\mu^d}{4^{2d+1} e^d d^d} \left[ 1 + 2\ln(2)\,(d^{-1} + 1) + \ln(4d\mu^{-1}) \right]^{-2d}, \tag{1.15} \]
\[ \alpha_*(\mu) := \frac{|S^{d-1}|\, \mu^{(2+d)/2}}{2^{1+d/2}\, (2\pi)^d\, (4d)^5}, \tag{1.16} \]
\[ \delta_*(\mu, \alpha) := \min\left\{ \frac{\alpha^2}{(12d)^2},\ \frac{\alpha}{3\mu[4M_*(\mu) + 1]^d},\ \frac{\kappa(\mu)}{2} \right\}, \tag{1.17} \]
\[ U_*(\mu, \alpha) := \max\left\{ \frac{2\mu}{\delta_*(\mu, \alpha)},\ \frac{24d^2}{\alpha\,\delta_*(\mu, \alpha)} \right\}, \tag{1.18} \]
where $|S^{d-1}| = 2\pi^{d/2}/\Gamma(d/2)$ is the measure of the unit sphere in $\mathbb{R}^d$.
Theorem 1.2. For any 0 < µ ≤ 1/2, Theorem 1.1 holds true with $L_\#(\mu) := L_*(\mu)$ and $U_\#(\mu) := U_*(\mu, \alpha_*(\mu))$, as defined in (1.14), (1.16) and (1.18).
The explicit form of $L_*(\mu)$, $\alpha_*(\mu)$, and $U_*(\mu, \alpha_*(\mu))$, for a given 0 < µ ≤ 1/2, in Theorem 1.2 allows us to estimate the actual minimal size of L and U that guarantees saturated ferromagnetism. The distinction between µ ≤ 1/2 and µ > 1/2 is not a fundamental one. It is an artifact of the use in Lemma 3.6 of [24, 25], whose methods favored this technical distinction.
2. Proofs of Theorems 1.1 and 1.2
This section contains the proofs of our main results, Theorems 1.1 and 1.2, with the aid of several lemmas which will be proved later in Sec. 3. Here is a brief outline of the strategy of the proof.
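• We first reduce the minimization of $\mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma_\uparrow, \gamma_\downarrow)$ in (1.10) over two one-particle density matrices $\gamma_\uparrow$ and $\gamma_\downarrow$ to the minimization of an effective energy functional $\mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma)$ which depends on only one one-particle density matrix γ. It is given as a sum of two terms, $\mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma) = \mathrm{Tr}\{K_\mu \gamma\} + \mathrm{Tr}\{[K_\mu + U\rho]_-\}$, where we recall that $K_\mu = -\Delta - \mu$.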
• Given a trial one-particle density matrix γ and a small number $\delta > 2\mu U^{-1}$, we introduce the corresponding particle density $\rho(x) := \gamma_{x,x}$ and define the regions Ω := {x | ρ(x) < δ} and $\Omega^c$ := {x | ρ(x) ≥ δ} of low and high density onto which we project by $P_\Omega = \sum_{x\in\Omega} |x\rangle\langle x|$ and $P_\Omega^\perp = \mathbb{1} - P_\Omega$, respectively.
• We then use the fact that γ is mostly localized in the high density region $\Omega^c$. This leads us to estimate the kinetic energy $\mathrm{Tr}\{-\Delta P_\Omega \gamma P_\Omega\}$ in Ω by zero and $\mathrm{Tr}\{-\Delta P_\Omega^\perp \gamma P_\Omega^\perp\}$ in $\Omega^c$ by the kinetic energy of the free Fermi gas in $\Omega^c$. The localization error is of order of a small constant times the volume |∂Ω| of the boundary of Ω. In Lemma 3.1, we give the exact formulation of the bound which we use to estimate the term $\mathrm{Tr}\{K_\mu \gamma\}$ in $\mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma)$.
• For the analysis of the term $\mathrm{Tr}\{[K_\mu + U\rho]_-\}$ in $\mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma)$, we use the fact that $\Omega^c$ is a classically forbidden region, because −µ + Uρ ≥ −µ + Uδ ≥ µ in $\Omega^c$. So, as shown in Lemma 3.2, we can replace $\mathrm{Tr}\{[K_\mu + U\rho]_-\}$ by $\mathrm{Tr}\{[P_\Omega(K_\mu + U\rho)P_\Omega]_-\}$, up to localization errors of order of a small constant times |∂Ω|.
• We then pick a (large, but fixed) number M > 1 and further split up the low density region Ω into the subset $\Omega_1$ of those points in Ω that are at most at distance 2M away from the boundary ∂Ω and the bulk $\Omega_2 \subset \Omega$ of points of distance 2M or more to ∂Ω. The contribution of $\Omega_1$ turns out to be negligible because $\Omega_1$ contains at most $(4M + 1)^d |\partial\Omega|$ points, and the density is low in $\Omega_1 \subseteq \Omega$.
• The estimate of the region $\Omega_2 \ni x$ then uses the lower bound on the spatial density $\mathbb{1}[K_\mu + U\rho < 0](x, x)$ of the projection onto the negative eigenvalues of $K_\mu + U\rho$ (actually, $\tilde\rho$ instead of ρ), which we derive in Lemma 3.3.
• Adding up the estimates derived so far, we finally observe that $\mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma)$ is bounded below by $\mathrm{Tr}\{[P_\Omega K_\mu P_\Omega]_-\} + \mathrm{Tr}\{[P_\Omega^\perp K_\mu P_\Omega^\perp]_-\} - \eta|\partial\Omega| =: Y - \eta|\partial\Omega|$, where η > 0 becomes small when U ≫ 1 and δ > 0 is properly chosen. In Lemma 3.6, we reproduce the result from [24, 25] that Y can be estimated from below by $\mathrm{Tr}\{[K_\mu]_-\} + \alpha|\partial\Omega|$, where α > 0 depends only on µ. In other words, the introduction of a domain wall at ∂Ω drives up the energy by α|∂Ω|, which dominates η|∂Ω|, provided η is small. This establishes that $\mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma) \ge \mathrm{Tr}\{[K_\mu]_-\} + (\alpha - \eta)|\partial\Omega|$, which implies the claim.
To carry out the proof in detail, we start with the observation that the minimization over two one-particle density matrices in (1.10) can actually be reduced to the minimization over only one one-particle density matrix. To see this, we observe that
\[ \sum_{x\in\Lambda} \rho_\uparrow(x)\rho_\downarrow(x) = \mathrm{Tr}\{\rho_\uparrow \gamma_\downarrow\}, \tag{2.1} \]
where $\rho_\uparrow$ acts as a multiplication operator, $(\rho_\uparrow f)(x) := \rho_\uparrow(x) f(x)$. Thus we have
\[ E^{(\mathrm{hfz})}_{\mu,U} = \min_{0 \le \gamma_\uparrow \le 1} \Big\{ \mathrm{Tr}\{K_\mu \gamma_\uparrow\} + \min_{0 \le \gamma_\downarrow \le 1} \mathrm{Tr}\{(K_\mu + U\rho_\uparrow)\gamma_\downarrow\} \Big\} \tag{2.2} \]
\[ \phantom{E^{(\mathrm{hfz})}_{\mu,U}} = \min_{0 \le \gamma_\uparrow \le 1} \Big\{ \mathrm{Tr}\{K_\mu \gamma_\uparrow\} + \mathrm{Tr}\{[K_\mu + U\rho_\uparrow]_-\} \Big\}. \tag{2.3} \]
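The step from (2.2) to (2.3) rests on the elementary fact that minimizing Tr{Aγ} over 0 ≤ γ ≤ 1 gives the sum of the negative eigenvalues of A. The following is only a minimal sketch of that fact in the simplest commuting case, assuming the free operator −∆ − µ on a one-dimensional periodic chain, where the eigenvalues 2(1 − cos k) − µ are explicit. The function names and the values of L and µ are illustrative assumptions, not taken from the paper.

```python
import math
import random

def laplacian_eigenvalues(L):
    """Eigenvalues 2(1 - cos k), k = 2*pi*n/L, of -Delta on a periodic chain."""
    return [2.0 * (1.0 - math.cos(2.0 * math.pi * n / L)) for n in range(L)]

def negative_part_trace(L, mu):
    """Tr[-Delta - mu]_- : the sum of the negative eigenvalues of K_mu."""
    return sum(min(w - mu, 0.0) for w in laplacian_eigenvalues(L))

def energy_for_occupations(L, mu, occ):
    """Tr{K_mu gamma} for a gamma that is diagonal in the eigenbasis of K_mu,
    with occupation numbers occ[n] in [0, 1]."""
    return sum(n * (w - mu) for n, w in zip(occ, laplacian_eigenvalues(L)))

L, mu = 12, 0.5                      # illustrative values (assumed)
e_min = negative_part_trace(L, mu)

# Filling exactly the negative levels attains the minimum ...
fermi_sea = [1.0 if w < mu else 0.0 for w in laplacian_eigenvalues(L)]
e_sea = energy_for_occupations(L, mu, fermi_sea)

# ... and no other choice of occupation numbers in [0, 1] does better.
rng = random.Random(0)
trials = [energy_for_occupations(L, mu, [rng.random() for _ in range(L)])
          for _ in range(1000)]
print(e_min, e_sea, min(trials))
```

The sketch only samples density matrices that commute with the operator; the general statement for arbitrary 0 ≤ γ ≤ 1 follows from the variational principle and is not tested here.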
(Recall that $K_\mu = -\Delta - \mu$.) In other words, we have done the minimization over $\gamma_\downarrow$ in (2.2) by taking $\gamma_\downarrow$ to be the projection onto the negative eigenspaces of $K_\mu + U\rho_\uparrow$. Thus, as our minimization principle over only one γ, we obtain the following.
\[ E^{(\mathrm{hfz})}_{\mu,U} = \min\left\{ \mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma) \;\middle|\; 0 \le \gamma \le 1 \right\}, \tag{2.4} \]
\[ \mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma) := \mathrm{Tr}\{K_\mu \gamma\} + \mathrm{Tr}\{[K_\mu + U\rho]_-\}, \tag{2.5} \]
where $\rho(x) := \gamma_{x,x}$. From now on γ, with 0 ≤ γ ≤ 1, is an arbitrary, but fixed one-particle density matrix, for which we bound $\mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma)$ from below. (An upper bound that agrees with Theorem 1.1 is readily obtained simply by choosing the variational function consisting of the unperturbed Fermi sea with all particles spin-up or all spin-down.) For the next step of the proof, we introduce a small number $\delta > 2\mu U^{-1}$, whose precise value will be chosen in the final step of the proof. Given a one-particle density matrix 0 ≤ γ ≤ 1 with corresponding density $\rho(x) := \gamma_{x,x}$, we write the lattice $\Lambda = \Omega \cup \Omega^c$ as a union of two disjoint subsets of Λ in the following way:
\[ \Omega := \{x \in \Lambda \mid \rho(x) < \delta\}, \tag{2.6} \]
\[ \Omega^c := \{x \in \Lambda \mid \rho(x) \ge \delta\}. \tag{2.7} \]
These are the regions of low and high density, respectively. We define the boundary ∂Ω of Ω by
\[ \partial\Omega := \{x \in \Omega \mid \mathrm{dist}_1(x, \Omega^c) = 1\}, \tag{2.8} \]
where $\mathrm{dist}_1(x, A)$ is the length of (number of bonds in) a shortest path joining x and some point y ∈ A. Another useful notion of distance which we shall use is $\mathrm{dist}_\infty(x, A)$, which is defined by the condition that $2\,\mathrm{dist}_\infty(x, A) + 1$ is the sidelength of the smallest cube centered at x that intersects A. When A is a single point y, these distances are denoted by $|x - y|_1$ and $|x - y|_\infty$. We define $P_\Omega$, $P_{\Omega^c} = P_\Omega^\perp$, and $P_{\partial\Omega}$ to be the orthogonal projections onto Ω, $\Omega^c$, and ∂Ω, respectively, where the projection onto an arbitrary set A ⊆ Λ is given by
\[ (P_A f)(x) = \begin{cases} f(x) & \text{for } x \in A, \\ 0 & \text{for } x \notin A. \end{cases} \tag{2.9} \]
We further set
\[ \tilde\rho(x) := \begin{cases} \rho(x), & \text{for } x \in \Omega^c, \\ \min\left\{ \frac{\mu}{2U},\ \rho(x) \right\}, & \text{for } x \in \Omega, \end{cases} \tag{2.10} \]
and observe that $\tilde\rho(x) \le \rho(x)$, for all x ∈ Λ, which implies that
\[ \mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma) \ge \mathrm{Tr}\{K_\mu \gamma\} + \mathrm{Tr}\{[K_\mu + U\tilde\rho]_-\}. \tag{2.11} \]
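The decomposition (2.6)–(2.8) and the truncated density (2.10) can be sketched for a one-dimensional periodic chain as follows. This is only an illustration under stated assumptions: the function name `split_regions`, the sample density and the parameter values are hypothetical and chosen so that the condition δ > 2µ/U holds.

```python
def split_regions(rho, delta, mu, U):
    """Partition a periodic chain by density and build the truncated density.

    rho   : list of densities rho(x), x = 0..L-1 (periodic boundary conditions)
    delta : threshold separating Omega (rho < delta) from Omega^c (rho >= delta)
    Returns (omega, omega_c, boundary, rho_tilde).
    """
    L = len(rho)
    omega = [x for x in range(L) if rho[x] < delta]
    omega_c = [x for x in range(L) if rho[x] >= delta]
    high = set(omega_c)
    # Boundary (2.8): points of Omega with a nearest neighbour in Omega^c.
    boundary = [x for x in omega
                if (x - 1) % L in high or (x + 1) % L in high]
    # Truncated density (2.10): unchanged on Omega^c, capped at mu/(2U) on Omega.
    rho_tilde = [rho[x] if x in high else min(mu / (2.0 * U), rho[x])
                 for x in range(L)]
    return omega, omega_c, boundary, rho_tilde

# Illustrative (assumed) data: a droplet of high density on sites 3..5.
rho = [0.0, 0.01, 0.02, 0.9, 1.0, 0.8, 0.03, 0.0]
mu, U, delta = 0.5, 100.0, 0.05      # delta > 2*mu/U = 0.01
omega, omega_c, boundary, rho_tilde = split_regions(rho, delta, mu, U)
print(omega, omega_c, boundary, rho_tilde)
```

In this example the boundary consists of the two sites of Ω adjacent to the droplet, and the truncated density never exceeds the original one, which is the property used in (2.11).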
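For brevity, we define $M := M_*(\mu) := 12\left(\frac{4d}{\mu}\right)^2$ and note that, by assumption, L obeys L ≥ 2M. We further decompose Ω into two disjoint subsets $\Omega_1$ and $\Omega_2$ defined by
\[ \Omega_1 := \{x \in \Omega \mid \mathrm{dist}_\infty(x, \Omega^c) \le 2M\}, \tag{2.12} \]
\[ \Omega_2 := \{x \in \Omega \mid \mathrm{dist}_\infty(x, \Omega^c) > 2M\}. \tag{2.13} \]
We observe that the $\ell_\infty$-distance of the points in $\Omega_1$ to the boundary ∂Ω of Ω is less than or equal to 2M, so $\Omega_1 \subseteq \partial\Omega + Q(2M)$, where $Q(\ell) = \{-\ell, \ldots, \ell\}^d + L\mathbb{Z}^d$. Hence
\[ |\Omega_1| \le |\partial\Omega| \cdot |Q(2M)| = (4M + 1)^d \cdot |\partial\Omega|, \tag{2.14} \]
and therefore,
\[ \sum_{x\in\Omega} \rho(x) = \sum_{x\in\Omega_1} \rho(x) + \sum_{x\in\Omega_2} \rho(x) \le (4M + 1)^d\, \delta\, |\partial\Omega| + \sum_{x\in\Omega_2} \rho(x), \tag{2.15} \]
since ρ ≤ δ on Ω. Equation (2.15) and Lemma 3.1 yield
\[ \mathrm{Tr}\{K_\mu \gamma\} \ge \mathrm{Tr}\{[P_\Omega^\perp K_\mu P_\Omega^\perp]_-\} - \left( 4d\delta^{1/2} + \mu(4M + 1)^d \delta \right) |\partial\Omega| - \mu \sum_{x\in\Omega_2} \rho(x). \tag{2.16} \]
Next, we apply Lemma 3.2 which asserts
\[ \mathrm{Tr}\{[K_\mu + U\tilde\rho]_-\} \ge \mathrm{Tr}\{[P_\Omega(K_\mu + U\tilde\rho)P_\Omega]_-\} - \frac{8d^2}{U\delta} |\partial\Omega|. \tag{2.17} \]
Denoting by $\chi := \mathbb{1}[P_\Omega(K_\mu + U\tilde\rho)P_\Omega < 0]$ the orthogonal projection onto the subspace of negative eigenvalues of $P_\Omega(K_\mu + U\tilde\rho)P_\Omega$ and $\rho_\chi(x) := \chi_{x,x}$ its diagonal matrix element, we observe that
\[ \mathrm{Tr}\{[P_\Omega(K_\mu + U\tilde\rho)P_\Omega]_-\} = \mathrm{Tr}\{P_\Omega(K_\mu + U\tilde\rho)P_\Omega\, \chi\} = \mathrm{Tr}\{P_\Omega K_\mu P_\Omega\, \chi\} + U \sum_{x\in\Omega} \rho_\chi(x)\, \tilde\rho(x). \tag{2.18} \]
By Lemma 3.3, the density $\rho_\chi$ is bounded below on $\Omega_2$ by the universal constant κ(µ) > 0 defined in (3.19). Therefore,
\[ \mathrm{Tr}\{[K_\mu + U\tilde\rho]_-\} \ge \mathrm{Tr}\{[P_\Omega K_\mu P_\Omega]_-\} - \frac{8d^2}{U\delta} |\partial\Omega| + \kappa(\mu)\, U \sum_{x\in\Omega_2} \tilde\rho(x). \tag{2.19} \]
Adding up (2.16) and (2.19), we obtain
\[ \mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma) \ge \mathrm{Tr}\{[P_\Omega K_\mu P_\Omega]_-\} + \mathrm{Tr}\{[P_\Omega^\perp K_\mu P_\Omega^\perp]_-\} - \left( 4d\delta^{1/2} + \mu(4M + 1)^d \delta + \frac{8d^2}{U\delta} \right) |\partial\Omega| + \sum_{x\in\Omega_2} \left\{ \kappa(\mu) U \tilde\rho(x) - \mu\rho(x) \right\}, \tag{2.20} \]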
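and Lemma 3.6 further yields
\[ \mathcal{E}^{(\mathrm{hfz})}_{\mu,U}(\gamma) - \mathrm{Tr}\{[K_\mu]_-\} \ge \left( \alpha(\mu) - 4d\delta^{1/2} - \mu(4M + 1)^d \delta - \frac{8d^2}{U\delta} \right) |\partial\Omega| + \sum_{x\in\Omega_2} \left\{ \kappa(\mu) U \tilde\rho(x) - \mu\rho(x) \right\}. \tag{2.21} \]
We choose
\[ \delta := \delta_*(\mu) = \min\left\{ \frac{\alpha(\mu)^2}{(12d)^2},\ \frac{\alpha(\mu)}{3\mu(4M + 1)^d},\ \frac{\kappa(\mu)}{2} \right\}, \tag{2.22} \]
and we observe that if
\[ U \ge U_*\big(\mu, \alpha(\mu)\big) = \max\left\{ \frac{2\mu}{\delta_*(\mu, \alpha(\mu))},\ \frac{24 d^2}{\alpha(\mu)\, \delta_*(\mu, \alpha(\mu))} \right\} \tag{2.23} \]
then our choice for δ fulfills the requirement $\delta > 2\mu U^{-1}$. Moreover, Eqs. (2.22) and (2.23) imply that
\[ 4d\delta^{1/2} + \mu(4M + 1)^d \delta + \frac{8d^2}{U\delta} \le \frac{\alpha(\mu)}{3} + \frac{\alpha(\mu)}{3} + \frac{\alpha(\mu)}{3} \le \alpha(\mu). \tag{2.24} \]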
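To get a feeling for the size of the constants, the choices (2.22) and (2.23) can be evaluated numerically and the three error terms in (2.24) checked one by one. The sketch below does this under stated assumptions: α is treated as a free input (the value 1e-3 is a placeholder, not the constant from Lemma 3.6), the function names are hypothetical, and d = 1, µ = 1/2 are illustrative.

```python
import math

def M_star(mu, d):
    """M_*(mu) = 12 (4d/mu)^2."""
    return 12.0 * (4.0 * d / mu) ** 2

def kappa(mu, d):
    """kappa(mu) as in (3.19)."""
    bracket = 1.0 + 2.0 * math.log(2.0) * (1.0 / d + 1.0) + math.log(4.0 * d / mu)
    return mu ** d / (4.0 ** (2 * d + 1) * math.e ** d * d ** d) * bracket ** (-2 * d)

def delta_star(mu, alpha, d):
    """delta_* as in (2.22), with alpha supplied by the caller."""
    M = M_star(mu, d)
    return min(alpha ** 2 / (12.0 * d) ** 2,
               alpha / (3.0 * mu * (4.0 * M + 1.0) ** d),
               kappa(mu, d) / 2.0)

def U_star(mu, alpha, d):
    """U_* as in (2.23)."""
    dl = delta_star(mu, alpha, d)
    return max(2.0 * mu / dl, 24.0 * d ** 2 / (alpha * dl))

def error_terms(mu, alpha, d, U):
    """The three boundary error terms on the left-hand side of (2.24)."""
    dl = delta_star(mu, alpha, d)
    M = M_star(mu, d)
    return (4.0 * d * math.sqrt(dl),
            mu * (4.0 * M + 1.0) ** d * dl,
            8.0 * d ** 2 / (U * dl))

mu, d, alpha = 0.5, 1, 1e-3           # alpha is a placeholder value (assumed)
U = U_star(mu, alpha, d)
terms = error_terms(mu, alpha, d, U)
print(U, terms, sum(terms) <= alpha * (1 + 1e-12))
```

Each of the three terms is at most α/3 by construction, so their sum is at most α. For these inputs the resulting threshold U is very large, which suggests that the explicit bounds are meant as a proof of finiteness rather than as sharp estimates.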
We further set $\Omega_2' := \{x \in \Omega_2 \mid \rho(x) \le \frac{\mu}{2U}\}$ and $\Omega_2'' := \{x \in \Omega_2 \mid \frac{\mu}{2U} < \rho(x) \le \delta\}$, so $\Omega_2$ is the disjoint union of $\Omega_2'$ and $\Omega_2''$, and by the definition (2.10) of $\tilde\rho$, we have that
\[ \sum_{x\in\Omega_2} \left\{ \kappa(\mu) U \tilde\rho(x) - \mu\rho(x) \right\} \ge \sum_{x\in\Omega_2'} \{\kappa(\mu) U - \mu\}\, \rho(x) + \sum_{x\in\Omega_2''} \frac{\mu}{2} \{\kappa(\mu) - 2\delta\} \ge 0, \tag{2.25} \]
since $\delta \le \frac{1}{2}\kappa(\mu)$ and $U \ge 2\mu/\delta_*(\mu, \alpha(\mu)) \ge \mu/\kappa(\mu)$. Equations (2.24) and (2.25) ensure that the right-hand side of (2.21) is nonnegative, which immediately implies Theorem 1.1. Theorem 1.2 is obtained by substituting the explicit value of α(µ) from (3.60) into (2.23) and using $L_*(\mu)$ from (3.60).
3. Auxiliary Lemmas
In this section, we state and prove the lemmas used in the proof of Theorems 1.1 and 1.2 in Sec. 2.
3.1. The region $\Omega^c$ of high density
In this subsection, we estimate $\mathrm{Tr}\{K_\mu \gamma\}$ from below. We are guided by the intuition that γ is essentially localized on $\Omega^c$.
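Lemma 3.1.
\[ \mathrm{Tr}\{K_\mu \gamma\} \ge \mathrm{Tr}\{[P_\Omega^\perp K_\mu P_\Omega^\perp]_-\} - 4d\delta^{1/2} |\partial\Omega| - \mu \sum_{x\in\Omega} \rho(x). \tag{3.1} \]
Proof. Inserting $\mathbb{1} = P_\Omega + P_\Omega^\perp$ into $\mathrm{Tr}\{K_\mu \gamma\}$, we obtain
\[ \begin{aligned} \mathrm{Tr}\{K_\mu \gamma\} &= \mathrm{Tr}\{K_\mu P_\Omega^\perp \gamma P_\Omega^\perp\} - 2\,\mathrm{Re}\,\mathrm{Tr}\{P_\Omega^\perp \Delta P_\Omega \gamma\} + \mathrm{Tr}\{K_\mu P_\Omega \gamma P_\Omega\} \\ &\ge \mathrm{Tr}\{[P_\Omega^\perp K_\mu P_\Omega^\perp]_-\} - 2 \sum_{x\in\Omega,\, y\in\Omega^c} \Delta_{x,y} |\gamma_{y,x}| - \mu\, \mathrm{Tr}\{P_\Omega \gamma P_\Omega\} \\ &= \mathrm{Tr}\{[P_\Omega^\perp K_\mu P_\Omega^\perp]_-\} - 2 \sum_{x\in\partial\Omega,\, y\in\Omega^c} \Delta_{x,y} |\gamma_{y,x}| - \mu \sum_{x\in\Omega} \rho(x), \end{aligned} \tag{3.2} \]
where we use that −∆ ≥ 0, that $P_\Omega^\perp \Delta P_\Omega = P_\Omega^\perp \Delta P_{\partial\Omega}$, and that 0 ≤ γ ≤ 1. The latter also implies that $\rho(y) = \gamma_{y,y} \le 1$, for all y ∈ Λ. Thus, if x ∈ ∂Ω and $y \in \Omega^c$, the Cauchy–Schwarz inequality yields $|\gamma_{y,x}| \le \sqrt{\gamma_{y,y} \cdot \gamma_{x,x}} \le \delta^{1/2}$. Moreover, if x ∈ ∂Ω, $y \in \Omega^c$, and $\Delta_{x,y} \ne 0$, then y is a neighbor of x, and we obtain
\[ \sum_{x\in\partial\Omega,\, y\in\Omega^c} \Delta_{x,y} |\gamma_{y,x}| \le \sum_{x\in\partial\Omega}\ \sum_{y\in\Lambda:\, |x-y|=1} \delta^{1/2} = 2d\delta^{1/2} |\partial\Omega|, \tag{3.3} \]
which completes the proof of (3.1).
3.2. Decoupling the high and low density regions
This subsection is devoted to showing that $\mathrm{Tr}\{[K_\mu + U\tilde\rho]_-\}$ essentially agrees with the corresponding eigenvalue sum $\mathrm{Tr}\{[P_\Omega(K_\mu + U\tilde\rho)P_\Omega]_-\}$ for the operator localized on Ω, the reason being that $\Omega^c$ is a classically forbidden region since $-\mu + U\tilde\rho \ge \frac{1}{2} U\delta > 0$ on $\Omega^c$.
Lemma 3.2.
\[ \mathrm{Tr}\{[K_\mu + U\tilde\rho]_-\} \ge \mathrm{Tr}\{[P_\Omega(K_\mu + U\tilde\rho)P_\Omega]_-\} - \frac{8d^2}{U\delta} |\partial\Omega|. \tag{3.4} \]
Proof. We wish to apply the Feshbach projection method. To this end, we first observe the following quadratic form bound,
\[ P_\Omega^\perp (K_{\tilde\mu} + U\tilde\rho) P_\Omega^\perp \ge P_\Omega^\perp (U\tilde\rho - \tilde\mu) P_\Omega^\perp \ge \frac{1}{2} U\delta\, P_\Omega^\perp, \tag{3.5} \]
for any $\tilde\mu \in [0, \mu]$, since ρ ≥ δ on $\Omega^c$ and $\delta \ge 2\mu U^{-1}$. Thus, $P_\Omega^\perp (K_{\tilde\mu} + U\tilde\rho) P_\Omega^\perp$ is positive and invertible on $\mathrm{Ran}\, P_\Omega^\perp$, and moreover, we have that
\[ P_\Omega \Delta P_\Omega^\perp \left[ P_\Omega^\perp (K_{\tilde\mu} + U\tilde\rho) P_\Omega^\perp \right]^{-1} P_\Omega^\perp \Delta P_\Omega \le \frac{2}{U\delta}\, P_{\partial\Omega} \Delta P_\Omega^\perp \Delta P_{\partial\Omega}. \tag{3.6} \]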
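The Feshbach (Schur complement) reduction for which this bound prepares the ground can be illustrated on a 2 × 2 toy matrix. The sketch below is only an assumption-laden illustration and not the operator of the proof: H = [[a, b], [b, c]] stands in for the full operator, the projection P selects the first coordinate, and the numbers a, b, c are arbitrary.

```python
import math

# Toy matrix H = [[a, b], [b, c]] on C^2; P projects onto the first
# coordinate and P^perp onto the second. All numbers are illustrative.
a, b, c = -1.0, 0.5, 2.0

def feshbach(e):
    """F(e) = a - b^2 / (c + e): the Schur complement of H + e on Ran P,
    minus e. Requires c + e > 0, i.e. invertibility on Ran P^perp."""
    return a - b * b / (c + e)

# Eigenvalues of H from its characteristic polynomial.
tr, det = a + c, a * c - b * b
disc = math.sqrt(tr * tr - 4.0 * det)
lam_minus, lam_plus = (tr - disc) / 2.0, (tr + disc) / 2.0

# Isospectrality: -e is an eigenvalue of H exactly when F(e) + e = 0.
e = -lam_minus
residual = feshbach(e) + e
print(lam_minus, residual)
```

The residual vanishes up to rounding, and `feshbach` is increasing in e, which mirrors the two properties of the reduced operator that the reduction relies on: isospectrality and monotonicity.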
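For $y \in \Omega^c$ and $f \in \mathbb{C}^\Lambda$, the Cauchy–Schwarz inequality implies that
\[ \begin{aligned} \langle f \,|\, P_{\partial\Omega} \Delta \mathbb{1}_y \Delta P_{\partial\Omega} f \rangle = |(\Delta P_{\partial\Omega} f)[y]|^2 &= \Big| \sum_{x\in\partial\Omega,\, |x-y|_1=1} f(x) \Big|^2 \\ &\le \Big( \sum_{x\in\partial\Omega,\, |x-y|_1=1} |f(x)|^2 \Big) \Big( \sum_{x\in\Lambda,\, |x-y|_1=1} 1 \Big) \\ &= 2d \sum_{x\in\partial\Omega,\, |x-y|_1=1} |f(x)|^2, \end{aligned} \tag{3.7} \]
which, by summing over all $y \in \Omega^c$, yields
\[ \begin{aligned} \langle f \,|\, P_{\partial\Omega} \Delta P_\Omega^\perp \Delta P_{\partial\Omega} f \rangle &= \sum_{y\in\Omega^c} \langle f \,|\, P_{\partial\Omega} \Delta \mathbb{1}_y \Delta P_{\partial\Omega} f \rangle \\ &\le 2d \sum_{x\in\partial\Omega} |f(x)|^2 \cdot \Big( \sum_{y\in\Lambda,\, |x-y|_1=1} 1 \Big) \\ &\le 4d^2 \sum_{x\in\partial\Omega} |f(x)|^2 = 4d^2 \langle f \,|\, P_{\partial\Omega} f \rangle. \end{aligned} \tag{3.8} \]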
(We thank D. Ueltschi for pointing out (3.7) and (3.8) to us.) We conclude that
\[ P_\Omega \Delta P_\Omega^\perp \left[ P_\Omega^\perp (K_{\tilde\mu} + U\tilde\rho) P_\Omega^\perp \right]^{-1} P_\Omega^\perp \Delta P_\Omega \le \frac{8d^2}{U\delta}\, P_{\partial\Omega}. \tag{3.9} \]
The invertibility of $P_\Omega^\perp (K_{\tilde\mu} + U\tilde\rho + e) P_\Omega^\perp$ on $\mathrm{Ran}\, P_\Omega^\perp$ implies the applicability of the Feshbach map, for any e ∈ [0, µ]. That is, for any e ∈ [0, µ],
\[ F(e) := F_{P_\Omega}[K_\mu + e + U\tilde\rho] - eP_\Omega = P_\Omega (K_\mu + U\tilde\rho) P_\Omega - P_\Omega \Delta P_\Omega^\perp \left[ P_\Omega^\perp (K_\mu + e + U\tilde\rho) P_\Omega^\perp \right]^{-1} P_\Omega^\perp \Delta P_\Omega \tag{3.10} \]
is a well-defined matrix on $\mathrm{Ran}\, P_\Omega$, and the isospectrality of the Feshbach map guarantees that −e ∈ [−µ, 0) is a negative eigenvalue of $K_\mu + U\tilde\rho$ of multiplicity m(e) if and only if −e is a (nonlinear) eigenvalue of F(e), i.e., if the kernel of F(e) + e, as a subspace of $\mathrm{Ran}\, P_\Omega$, has dimension m(e). Note that F is monotonically increasing, as a quadratic form, in e > 0. In particular,
\[ F(e) \ge F(0) \ge P_\Omega (K_\mu + U\tilde\rho) P_\Omega - \frac{8d^2}{U\delta}\, P_{\partial\Omega}, \tag{3.11} \]
additionally taking (3.9) into account. We claim that, for all λ ∈ (0, ∞), the number of eigenvalues of $K_\mu + U\tilde\rho$ below −λ is at most the number of negative eigenvalues of F(λ) + λ,
\[ \mathrm{Tr}\{\mathbb{1}[K_\mu + U\tilde\rho < -\lambda]\} \le \mathrm{Tr}_\Omega\{\mathbb{1}[F(\lambda) + \lambda < 0]\}, \tag{3.12} \]
where $\mathrm{Tr}_\Omega$ denotes the trace on $\mathrm{Ran}\, P_\Omega$. Both sides of Eq. (3.12) are zero and thus fulfill the claimed inequality, for λ ≥ µ. Assume that (3.12) is violated, for some λ ∈ (0, ∞), i.e., that $\lambda_* := \inf\{\lambda \in (0, \infty) \mid \text{Eq. (3.12) holds true}\} > 0$. We show
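that this assumption leads to a contradiction. Obviously, $-\lambda_*$ must be an eigenvalue of $K_\mu + U\tilde\rho$, and hence also of $F(\lambda_*)$, of multiplicity $m(\lambda_*) \ge 1$, because only then the left-hand or the right-hand side of (3.12) changes (increases, in fact). Moreover, Eq. (3.12) holds true for $\lambda = \lambda_*$ itself, i.e., the infimum in the definition of $\lambda_*$ is a minimum. Hence, for all sufficiently small ε > 0, the definition of $\lambda_*$ and the monotonicity of F(e) in e yield
\[ \mathrm{Tr}\{\mathbb{1}[K_\mu + U\tilde\rho < -\lambda_*]\} \le \mathrm{Tr}_\Omega\{\mathbb{1}[F(\lambda_*) + \lambda_* < 0]\}, \tag{3.13} \]
\[ \mathrm{Tr}\{\mathbb{1}[K_\mu + U\tilde\rho < -\lambda_* + \varepsilon]\} > \mathrm{Tr}_\Omega\{\mathbb{1}[F(\lambda_* - \varepsilon) + \lambda_* - \varepsilon < 0]\} \ge \mathrm{Tr}_\Omega\{\mathbb{1}[F(\lambda_*) + \lambda_* - \varepsilon < 0]\}. \tag{3.14} \]
Choosing ε > 0 so small that $-\lambda_*$ is the only eigenvalue of $K_\mu + U\tilde\rho$ in the interval $[-\lambda_*, -\lambda_* + \varepsilon]$, we hence obtain
\[ \begin{aligned} m(\lambda_*) &= \mathrm{Tr}\{\mathbb{1}[0 \le K_\mu + U\tilde\rho + \lambda_* < \varepsilon]\} \\ &= \mathrm{Tr}\{\mathbb{1}[K_\mu + U\tilde\rho < -\lambda_* + \varepsilon]\} - \mathrm{Tr}\{\mathbb{1}[K_\mu + U\tilde\rho < -\lambda_*]\} \\ &> \mathrm{Tr}_\Omega\{\mathbb{1}[F(\lambda_*) + \lambda_* < \varepsilon]\} - \mathrm{Tr}_\Omega\{\mathbb{1}[F(\lambda_*) + \lambda_* < 0]\} \\ &= \mathrm{Tr}_\Omega\{\mathbb{1}[0 \le F(\lambda_*) + \lambda_* < \varepsilon]\} = m(\lambda_*), \end{aligned} \tag{3.15} \]
arriving at a contradiction, which proves (3.12), for all λ ∈ (0, ∞). From (3.12) and (3.11), we finally conclude
\[ \begin{aligned} \mathrm{Tr}\{[K_\mu + U\tilde\rho]_-\} &= -\int_0^\infty \mathrm{Tr}\{\mathbb{1}[K_\mu + U\tilde\rho < -\lambda]\}\, d\lambda \\ &\ge -\int_0^\infty \mathrm{Tr}_\Omega\{\mathbb{1}[F(\lambda) + \lambda < 0]\}\, d\lambda \\ &\ge -\int_0^\infty \mathrm{Tr}_\Omega\{\mathbb{1}[F(0) + \lambda < 0]\}\, d\lambda \\ &= \mathrm{Tr}\{[F(0)]_-\} = \mathrm{Tr}_\Omega\{[F(0)]_-\} \\ &\ge \mathrm{Tr}\{[P_\Omega (K_\mu + U\tilde\rho) P_\Omega]_-\} - \frac{8d^2}{U\delta}\, \mathrm{Tr}\{P_{\partial\Omega}\}, \end{aligned} \tag{3.16} \]
which is the assertion of Lemma 3.2.
3.3. The electron density in the bulk
In this subsection, we consider the spectral projection
\[ \chi := \mathbb{1}\left[ P_\Omega (K_\mu + U\tilde\rho) P_\Omega < 0 \right] = \mathbb{1}\left[ P_\Omega (-\Delta - \mu + U\tilde\rho) P_\Omega < 0 \right] \tag{3.17} \]
of $P_\Omega (-\Delta - \mu + U\tilde\rho) P_\Omega$ onto its negative eigenvalues. Writing $\Delta_\Omega := P_\Omega \Delta P_\Omega$, i.e., $(\Delta_\Omega)_{x,y} = \Delta_{x,y}$, for x, y ∈ Ω, and = 0, otherwise, and $V \equiv \sum_{x\in\Omega} V(x) \cdot \mathbb{1}_x := \mu P_\Omega - U\tilde\rho P_\Omega$, we have that
\[ \chi = \mathbb{1}[-\Delta_\Omega - V < 0] \quad\text{and}\quad \forall x \in \Omega:\ \ \frac{1}{2}\mu \le V(x) \le \mu, \tag{3.18} \]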
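due to the definition (2.10) of $\tilde\rho$. Naive semiclassical intuition tells us that, for x ∈ Ω, the particle density $\rho_\chi(x) := \chi_{x,x}$ corresponding to the one-particle density matrix χ should be bounded below by the particle density of the Fermi gas given by the one-particle density matrix $\mathbb{1}[-\Delta < \mu/2]$. The purpose of this subsection is to prove such a bound (up to a constant factor) where it can be expected to hold, namely, for those points x that are sufficiently far away from the boundary of Ω.
Lemma 3.3. Let 0 < µ ≤ 4d, define $M := M_* := 12\left(\frac{4d}{\mu}\right)^2$. Suppose that L obeys L ≥ 2M and that x ∈ Ω, with $\mathrm{dist}_\infty(x, \partial\Omega) > 2M$. Then
\[ \rho_\chi(x) \ge \kappa(\mu) := \frac{\mu^d}{4^{2d+1} e^d d^d} \left[ 1 + 2\ln(2)\left(d^{-1} + 1\right) + \ln\left(4d\mu^{-1}\right) \right]^{-2d}. \tag{3.19} \]
Proof. For any β > 0, we note that the map $\mathbb{R}^\Omega \to \mathbb{R}$, $W \mapsto (e^{-\beta(-\Delta_\Omega - W)})_{x,x}$ is monotonically increasing in W. Namely, as $T_\Omega = P_\Omega T P_\Omega$ has nonnegative matrix elements, so does $e^{\varepsilon\Delta_\Omega}$,
\[ \left( e^{\varepsilon\Delta_\Omega} \right)_{w,z} = e^{-2d\varepsilon} \left( e^{\varepsilon T_\Omega} \right)_{w,z} = e^{-2d\varepsilon} \sum_{k=0}^\infty \frac{\varepsilon^k}{k!} \left( T_\Omega^k \right)_{w,z} \ge 0, \tag{3.20} \]
for all w, z ∈ Ω. So, if n is an integer and $W, \widetilde{W} \in \mathbb{R}^\Omega$ with $W(z) \le \widetilde{W}(z)$, for all z ∈ Ω, then we have that
\[ \begin{aligned} \left( \left[ e^{\beta\Delta_\Omega/n} e^{\beta W/n} \right]^n \right)_{z_0,z_n} &= \sum_{z_1,\ldots,z_{n-1}\in\Omega}\ \prod_{j=1}^n \left( e^{\beta\Delta_\Omega/n} \right)_{z_{j-1},z_j} e^{\beta W(z_j)/n} \\ &\le \sum_{z_1,\ldots,z_{n-1}\in\Omega}\ \prod_{j=1}^n \left( e^{\beta\Delta_\Omega/n} \right)_{z_{j-1},z_j} e^{\beta \widetilde{W}(z_j)/n} \\ &= \left( \left[ e^{\beta\Delta_\Omega/n} e^{\beta \widetilde{W}/n} \right]^n \right)_{z_0,z_n}, \end{aligned} \tag{3.21} \]
for all $z_0, z_n \in \Omega$. Setting $z_0 := z_n := x \in \Omega$ and taking the limit n → ∞, the Lie–Trotter product formula and Eq. (3.21) imply that
\[ \left( e^{-\beta(-\Delta_\Omega - W)} \right)_{x,x} \le \left( e^{-\beta(-\Delta_\Omega - \widetilde{W})} \right)_{x,x}, \tag{3.22} \]
indeed. In particular,
\[ e^{\beta\mu/2} \left( e^{\beta\Delta_\Omega} \right)_{x,x} \le \left( e^{-\beta(-\Delta_\Omega - V)} \right)_{x,x}, \tag{3.23} \]
since $V \ge \frac{1}{2}\mu$ on Ω. On the other hand, $-\Delta_\Omega - V \ge -\mu$ and $\chi^\perp(-\Delta_\Omega - V)\chi^\perp \ge 0$, as quadratic forms. The spectral theorem thus implies that
\[ \chi\, e^{-\beta(-\Delta_\Omega - V)} \chi \le \chi\, e^{\beta\mu} \chi = e^{\beta\mu} \chi, \tag{3.24} \]
\[ \chi^\perp e^{-\beta(-\Delta_\Omega - V)} \chi^\perp \le \chi^\perp \le P_\Omega. \tag{3.25} \]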
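Putting together (3.23)–(3.25), using that $\chi$ and $-\Delta_\Omega - V$ commute, we arrive at
\[
e^{\beta\mu/2} \big(e^{\beta\Delta_\Omega}\big)_{x,x} \le \big(e^{-\beta(-\Delta_\Omega - V)}\big)_{x,x} = \big(\chi e^{-\beta(-\Delta_\Omega - V)} \chi\big)_{x,x} + \big(\chi^\perp e^{-\beta(-\Delta_\Omega - V)} \chi^\perp\big)_{x,x} \le e^{\beta\mu} \chi_{x,x} + 1.
\tag{3.26}
\]
Solving for $\rho_\chi(x) = \chi_{x,x}$, we therefore have
\[
\rho_\chi(x) \ge e^{-\beta\mu/2} \Big[ \big(e^{\beta\Delta_\Omega}\big)_{x,x} - e^{-\beta\mu/2} \Big],
\tag{3.27}
\]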
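for any $x \in \Omega$ and any $\beta > 0$. Next, recall that $Q(M) = \{-M, \ldots, M\}^d + L\mathbb{Z}^d = \{y \in \Lambda : |y|_\infty \le M\}$ is the box of sidelength $2M + 1$ centered at $0 \in \Lambda$. Since $\mathrm{dist}_\infty(x, \partial\Omega) > 2M$, by assumption, we have that
\[
Q(M) - z + x \subseteq \Omega,
\tag{3.28}
\]
for all $z \in Q(M)$. By Lemma 3.4, this inclusion implies that
\[
\big(\exp[\beta\Delta_\Omega]\big)_{x,x} \ge \big(\exp[\beta\Delta_{Q(M)-z+x}]\big)_{x,x} = \big(\exp[\beta\Delta_{Q(M)}]\big)_{z,z},
\tag{3.29}
\]
and by averaging this inequality over $z \in Q(M)$, we obtain
\[
\big(\exp[\beta\Delta_\Omega]\big)_{x,x} \ge \frac{1}{|Q(M)|} \sum_{z \in Q(M)} \big(\exp[\beta\Delta_{Q(M)}]\big)_{z,z}.
\tag{3.30}
\]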
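Now, we apply Lemma 3.5 and arrive at
\[
\frac{1}{|Q(M)|} \sum_{z \in Q(M)} \big(\exp[\beta\Delta_{Q(M)}]\big)_{z,z} \ge e^{-d\beta/M} \int_{[-\pi,\pi]^d} \exp[-\beta\,\omega(k)]\, \frac{d^dk}{(2\pi)^d} = \bigg( \frac{2 e^{-\beta/M}}{\pi} \int_0^{\pi/2} \exp[-4\beta \sin^2(t)]\, dt \bigg)^d,
\tag{3.31}
\]
where $\omega(k) = \omega(-k) = \sum_{\nu=1}^{d} 2\{1 - \cos(k_\nu)\} = \sum_{\nu=1}^{d} 4\sin^2(k_\nu/2)$. Choosing $\beta \ge 1$, we observe that $\frac{1}{\pi}\int_0^{\sqrt{\beta}\pi} e^{-t^2} dt \ge \frac{1}{\pi}\int_0^{\pi} e^{-t^2} dt = \frac{1}{2\sqrt{\pi}}\,\mathrm{erf}[\pi] \ge \frac{1}{4}$. Using this and $\sin^2(t) \le t^2$, we have the following estimate,
\[
\frac{2 e^{-\beta/M}}{\pi} \int_0^{\pi/2} \exp[-4\beta \sin^2(t)]\, dt \ge \frac{e^{-\beta/M}}{\beta^{1/2}} \cdot \frac{1}{\pi} \int_0^{\sqrt{\beta}\pi} e^{-t^2} dt \ge \frac{e^{-\beta/M}}{4\,\beta^{1/2}}.
\tag{3.32}
\]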
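The one-dimensional estimate (3.32) can be checked numerically. The following is a minimal sketch, not part of the original proof: the function names `lhs_332` and `rhs_332`, the midpoint quadrature, and the sample values of $\beta$ and $M$ are all illustrative assumptions.

```python
import math

def lhs_332(beta, M, steps=20000):
    """(2 e^{-beta/M} / pi) * int_0^{pi/2} exp(-4 beta sin^2 t) dt, by the midpoint rule."""
    h = (math.pi / 2.0) / steps
    total = sum(math.exp(-4.0 * beta * math.sin((i + 0.5) * h) ** 2) for i in range(steps))
    return 2.0 * math.exp(-beta / M) / math.pi * total * h

def rhs_332(beta, M):
    """Right-hand side of (3.32): e^{-beta/M} / (4 sqrt(beta))."""
    return math.exp(-beta / M) / (4.0 * math.sqrt(beta))

# The constant used in the text: erf(pi) / (2 sqrt(pi)) should be at least 1/4.
erf_constant = math.erf(math.pi) / (2.0 * math.sqrt(math.pi))

# Sample values (assumed for illustration); the estimate is claimed for every beta >= 1.
checks = [(beta, M, lhs_332(beta, M), rhs_332(beta, M))
          for beta in (1.0, 2.5, 10.0, 100.0)
          for M in (3.0, 50.0)]
```

Since the factor $e^{-\beta/M}$ appears on both sides, the comparison is really between the integral and $1/(4\sqrt{\beta})$; the parameter $M$ is kept only to mirror the formula.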
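Inserting this estimate into (3.31) and then the result in (3.30) and (3.27), we obtain, for any $\beta \ge 1$, that
\[
\rho_\chi(x) \ge e^{-\beta\mu/2} \bigg[ \frac{e^{-d\beta/M}}{4^d \beta^{d/2}} - e^{-\beta\mu/2} \bigg] = e^{-\tau d} \bigg[ \Big( e^{1 - 2d\tau/(M\mu)} \Big)^{d/2} \cdot \Big( \frac{\mu\, e^{\tau}}{16\, e\, d\, \tau} \Big)^{d/2} - 1 \bigg],
\tag{3.33}
\]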
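where $\tau := \beta\mu/d$. Note that if we require $\tau \ge 4$ then $\beta = \tau d/\mu \ge 1$, since $\mu \le 4d$. We may thus replace $\beta \in [1, \infty)$ by $\tau \in [4, \infty)$. Our goal is to choose $\tau$ such that
\[
\Big( \frac{\mu\, e^{\tau}}{16\, e\, d\, \tau} \Big)^{d/2} \ge 2
\tag{3.34}
\]
\[
\Longleftrightarrow \quad \tau - \ln(\tau) \ge Y := 1 + 2\ln(2)\Big(\frac{1}{d} + 1\Big) + \ln\Big(\frac{4d}{\mu}\Big).
\tag{3.35}
\]
Note that, due to $\mu \le 4d$,
\[
2.38 \le 1 + 2\ln(2) \le Y \le 3\ln\big(16 d \mu^{-1}\big).
\tag{3.36}
\]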
We choose $\tau := Y + 2\ln(Y)$ and observe that $Y \ge 2.38$ ensures $\tau \ge 4.11 \ge 4$, as required. Moreover, with this choice, we have
\[
\begin{aligned}
\tau - \ln(\tau) - Y &= 2\ln(Y) - \ln\big(Y + 2\ln(Y)\big) \\
&\ge \ln(Y) - \ln\big(1 + 2\ln(Y)\, Y^{-1}\big) \\
&\ge \ln(Y) - 2\ln(Y)\, Y^{-1} = 2\ln(Y) \Big( \frac{1}{2} - \frac{1}{Y} \Big) > 0,
\end{aligned}
\tag{3.37}
\]
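using that $\ln(1 + \varepsilon) \le \varepsilon$, for $\varepsilon \ge 0$, and $Y \ge 2.38 > 2$. Thus, (3.35) and (3.34) hold true. Additionally, we observe that $Y \le 3\ln(16d/\mu)$ and $\tau \le Y \cdot \max_{r>0}\big(1 + 2\,\frac{\ln r}{r}\big) = (1 + 2/e)\,Y \le 2Y$ ensure that
\[
\frac{2d\tau}{\mu} \le \frac{12d}{\mu} \ln\Big(\frac{16d}{\mu}\Big) \le 12 \Big(\frac{4d}{\mu}\Big)^2 \le M_* \le M.
\tag{3.38}
\]
This, in turn, yields
\[
\exp\Big[ 1 - \frac{2d\tau}{M\mu} \Big] \ge 1,
\tag{3.39}
\]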
and by inserting (3.39) and (3.34) into (3.33), we arrive at
\[
\rho_\chi(x) \ge e^{-\tau d} = \frac{\mu^d}{4^{2d+1} e^d d^d} \Big[ 1 + 2\ln(2)\big(d^{-1} + 1\big) + \ln\big(4d\mu^{-1}\big) \Big]^{-2d}.
\tag{3.40}
\]
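The bookkeeping in (3.34)–(3.40) is easy to get wrong by a factor, so a quick numerical cross-check may help. The sketch below is an illustration only: the helper names `Y_of`, `kappa` and `check`, and the sample pairs $(\mu, d)$, are assumptions. It evaluates $Y$, the choice $\tau = Y + 2\ln Y$, the left side of (3.34), and compares $e^{-\tau d}$ with the closed form $\kappa(\mu)$ of (3.19).

```python
import math

def Y_of(mu, d):
    """Y from (3.35): 1 + 2 ln(2) (1/d + 1) + ln(4d / mu)."""
    return 1.0 + 2.0 * math.log(2.0) * (1.0 / d + 1.0) + math.log(4.0 * d / mu)

def kappa(mu, d):
    """kappa(mu) from (3.19)."""
    prefactor = mu ** d / (4.0 ** (2 * d + 1) * math.e ** d * float(d) ** d)
    return prefactor * Y_of(mu, d) ** (-2 * d)

def check(mu, d):
    """Collect the quantities entering (3.34)-(3.40) for tau = Y + 2 ln(Y)."""
    Y = Y_of(mu, d)
    tau = Y + 2.0 * math.log(Y)
    gain = (mu * math.exp(tau) / (16.0 * math.e * d * tau)) ** (d / 2.0)  # left side of (3.34)
    return {"Y": Y, "tau": tau, "gain": gain,
            "exp_minus_tau_d": math.exp(-tau * d), "kappa": kappa(mu, d)}

# Sample parameters with 0 < mu <= 4d (assumed for illustration).
samples = [(0.5, 1), (1.0, 2), (4.0, 3), (12.0, 3)]
results = {s: check(*s) for s in samples}
```

For each sample one expects $\tau \ge 4$, a gain of at least 2, and agreement of $e^{-\tau d}$ with $\kappa(\mu)$ up to rounding.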
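Lemma 3.4. Let $A, B \subseteq \Lambda$, with $A \subseteq B$, and denote $\Delta_A := P_A \Delta P_A$ and $\Delta_B := P_B \Delta P_B$. For all $x \in A$ and all $\beta > 0$,
\[
\big(\exp[\beta\Delta_A]\big)_{x,x} \le \big(\exp[\beta\Delta_B]\big)_{x,x}.
\tag{3.41}
\]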
Proof. We first define the nearest-neighbor hopping matrix T on Λ by Tw,z := 1 if |w − z|1 = 1 and Tw,z := 0, otherwise. For a given subset C ⊂ Λ, the matrix
$T_C := P_C T P_C$ denotes the hopping matrix restricted to $C$. Note that $\Delta_C = T_C - 2dP_C$ is the difference of the two commuting matrices $T_C$ and $2dP_C$. Hence, for $x \in C$,
\[
\big(\exp[\beta\Delta_C]\big)_{x,x} = \big(\exp[\beta T_C] \exp[-2d\beta P_C]\big)_{x,x} = e^{-2d\beta} \big(\exp[\beta T_C]\big)_{x,x}.
\tag{3.42}
\]
Due to this identity and the fact that $x \in A \subseteq B$, Eq. (3.41) is equivalent to
\[
\big(\exp[\beta T_A]\big)_{x,x} \le \big(\exp[\beta T_B]\big)_{x,x}.
\tag{3.43}
\]
Now, $0 \le (T_A)_{w,z} \le (T_B)_{w,z}$, and hence $(T_A^n)_{x,x} \le (T_B^n)_{x,x}$, for all integers $n$. Thus, (3.43) follows from an expansion of the exponentials in Taylor series,
\[
\big(\exp[\beta T_A]\big)_{x,x} = \sum_{n=0}^{\infty} \frac{\beta^n}{n!} (T_A^n)_{x,x} \le \sum_{n=0}^{\infty} \frac{\beta^n}{n!} (T_B^n)_{x,x} = \big(\exp[\beta T_B]\big)_{x,x}.
\tag{3.44}
\]
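The Taylor-series argument of (3.42)–(3.44) translates directly into a small computation. The following sketch assumes the one-dimensional ring $\mathbb{Z}_L$ with $d = 1$; the helper names, the truncation at 60 terms, and the particular sets $A \subseteq B$ are illustrative choices, not taken from the paper.

```python
import math

def hopping_matrix(sites, L):
    """T_C = P_C T P_C on the ring Z_L (d = 1), written in the basis of `sites`."""
    n = len(sites)
    return [[1.0 if (sites[i] - sites[j]) % L in (1, L - 1) else 0.0 for j in range(n)]
            for i in range(n)]

def matmul(A, B):
    """Plain matrix product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def heat_kernel_diagonal(sites, L, x, beta, terms=60):
    """(exp[beta Delta_C])_{x,x} = e^{-2 beta} sum_n beta^n/n! (T_C^n)_{x,x}, cf. (3.42), (3.44)."""
    T = hopping_matrix(sites, L)
    i = sites.index(x)
    size = len(sites)
    power = [[1.0 if r == c else 0.0 for c in range(size)] for r in range(size)]  # T^0
    total, coeff = 0.0, 1.0
    for n in range(terms):
        total += coeff * power[i][i]      # add beta^n/n! * (T^n)_{x,x}
        power = matmul(power, T)          # advance to T^(n+1)
        coeff *= beta / (n + 1)           # advance to beta^(n+1)/(n+1)!
    return math.exp(-2.0 * beta) * total

# Illustrative choice: L = 12, A = {4,5,6} inside B = {2,...,8}, evaluated at x = 5.
L_RING, SET_A, SET_B, X = 12, [4, 5, 6], [2, 3, 4, 5, 6, 7, 8], 5
pairs = [(heat_kernel_diagonal(SET_A, L_RING, X, b), heat_kernel_diagonal(SET_B, L_RING, X, b))
         for b in (0.5, 2.0, 5.0)]
```

Each pair should be ordered as in (3.41), since every closed walk inside $A$ is also a closed walk inside $B$.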
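Lemma 3.5. Let $Q = \{-m, \ldots, m\}^d \subset \mathbb{Z}^d$ be a cube. Denote by $\Delta_Q$ the nearest-neighbor Laplacian on $Q$, i.e., $\Delta_Q = P_Q \Delta P_Q = -2dP_Q + T_Q$, $T_Q := P_Q T P_Q$, and $T_{x,y} = \mathbb{1}(|x - y|_1 = 1)$. Then, for all $\beta > 0$,
\[
\frac{1}{|Q|} \sum_{z \in Q} \big(\exp[\beta\Delta_Q]\big)_{z,z} \ge e^{-d\beta/m} \int_{[-\pi,\pi]^d} \exp[-\beta\,\omega(k)]\, \frac{d^dk}{(2\pi)^d},
\tag{3.45}
\]
where $\omega(k) := \sum_{\nu=1}^{d} 2\{1 - \cos(k_\nu)\}$.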
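Proof. We may pick an even integer $r$, choose $L := r \cdot (2m + 1)$, and identify $Q$ with $Q + L\mathbb{Z}^d \subseteq \Lambda$. (Note that the statement of the lemma makes no reference to the Hubbard model analyzed before, and for the purpose of the proof, $L$ can be taken an arbitrarily large integer multiple of $2m + 1$.) Given $s \in \mathbb{Z}_r^d$, we define $Q(s) := Q + (2m + 1)s$ and observe that the family $\{Q(s)\}_{s \in \mathbb{Z}_r^d}$ of cubes defines a disjoint partition of $\Lambda$, i.e.,
\[
\Lambda = \bigcup_{s \in \mathbb{Z}_r^d} Q(s) \quad \text{and} \quad \forall s \ne s' : \ Q(s) \cap Q(s') = \emptyset.
\tag{3.46}
\]
Hence
\[
\widehat{\Delta} := \sum_{s \in \mathbb{Z}_r^d} \Delta_{Q(s)}
\tag{3.47}
\]
is the sum of translated, but mutually disconnected copies of $\Delta_Q$. We observe that
\[
\begin{aligned}
\mathrm{Tr}\big\{\exp[\beta\widehat{\Delta}]\big\} &= \sum_{x \in \Lambda} \big(\exp[\beta\widehat{\Delta}]\big)_{x,x} = \sum_{z \in Q} \sum_{s \in \mathbb{Z}_r^d} \big(\exp[\beta\widehat{\Delta}]\big)_{z+(2m+1)s,\, z+(2m+1)s} \\
&= \sum_{z \in Q} \sum_{s \in \mathbb{Z}_r^d} \big(\exp[\beta\Delta_{Q(s)}]\big)_{z+(2m+1)s,\, z+(2m+1)s} = r^d \sum_{z \in Q} \big(\exp[\beta\Delta_Q]\big)_{z,z}.
\end{aligned}
\tag{3.48}
\]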
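As an intermediate result, we thus have
\[
\frac{1}{|Q|} \sum_{z \in Q} \big(\exp[\beta\Delta_Q]\big)_{z,z} = \frac{1}{|\Lambda|}\, \mathrm{Tr}\big\{\exp[\beta\widehat{\Delta}]\big\},
\tag{3.49}
\]
since $|\Lambda| = L^d = r^d |Q|$. Next, we translate $\widehat{\Delta}$ by the elements of $Q$, i.e., for $\eta \in Q$, we introduce $\widehat{\Delta}^{(\eta)}$ on $\mathbb{C}^\Lambda$ by
\[
\widehat{\Delta}^{(\eta)} := \sum_{q \in \mathbb{Z}_r^d} \Delta_{Q(q)+\eta} = \sum_{q \in \mathbb{Z}_r^d} \Delta_{Q+\eta+(2m+1)q}.
\tag{3.50}
\]
Of course, $\widehat{\Delta}^{(\eta)}$ is unitarily equivalent to $\widehat{\Delta}$. We observe that
\[
\frac{1}{|Q|} \sum_{\eta \in Q} \widehat{\Delta}^{(\eta)} = \frac{1}{|Q|} \sum_{y \in \Lambda} \Delta_{Q+y} = -2d \cdot \mathbb{1}_{\mathbb{C}^\Lambda} + \frac{1}{|Q|} \sum_{y \in \Lambda} T_{Q+y},
\tag{3.51}
\]
where, for $w, z \in \Lambda$,
\[
\Big( \sum_{y \in \Lambda} T_{Q+y} \Big)_{w,z} = \sum_{y \in \Lambda} \mathbb{1}_Q(w - y)\, \mathbb{1}_Q(z - y)\, T_{w,z} = \big|(Q + w) \cap (Q + z)\big| \cdot T_{w,z} = 2m(2m + 1)^{d-1}\, T_{w,z},
\tag{3.52}
\]
since $T_{w,z} \ne 0$ only if $w$ and $z$ are neighboring lattice sites. Hence,
\[
\begin{aligned}
\frac{1}{|Q|} \sum_{\eta \in Q} \widehat{\Delta}^{(\eta)} &= -2d \cdot \mathbb{1}_{\mathbb{C}^\Lambda} + \frac{2m}{2m + 1}\, T \\
&= -\frac{2d}{2m + 1} \cdot \mathbb{1}_{\mathbb{C}^\Lambda} + \frac{2m}{2m + 1}\, \Delta \\
&\ge -\frac{d}{m} \cdot \mathbb{1}_{\mathbb{C}^\Lambda} + \Delta,
\end{aligned}
\tag{3.53}
\]
where $\Delta \le 0$ is the nearest-neighbor Laplacian on $\Lambda$ (with periodic b.c.). This and the convexity of $A \mapsto \mathrm{Tr}\{e^{\beta A}\}$ therefore imply that
\[
\mathrm{Tr}\big\{\exp[\beta\widehat{\Delta}]\big\} = \frac{1}{|Q|} \sum_{\eta \in Q} \mathrm{Tr}\big\{\exp[\beta\widehat{\Delta}^{(\eta)}]\big\} \ge \mathrm{Tr}\bigg\{ \exp\bigg[ \frac{\beta}{|Q|} \sum_{\eta \in Q} \widehat{\Delta}^{(\eta)} \bigg] \bigg\} \ge e^{-\beta d/m}\, \mathrm{Tr}\big\{\exp[\beta\Delta]\big\}.
\tag{3.54}
\]
We diagonalize $\Delta$ by discrete Fourier transformation on $\mathbb{C}^\Lambda$. The eigenvalues of $-\Delta$ are given by $\omega(k)$, where $k \in \Lambda^* = \frac{2\pi}{L}\mathbb{Z}_L^d$ is the variable dual to $x \in \Lambda$. Since $|\Lambda^*| = L^d = |Q|\, r^d$, we therefore have
\[
\frac{1}{|Q|} \sum_{z \in Q} \big(\exp[\beta\Delta_Q]\big)_{z,z} = \frac{1}{|\Lambda|}\, \mathrm{Tr}\big\{\exp[\beta\widehat{\Delta}]\big\} \ge e^{-\beta d/m}\, \frac{1}{|\Lambda^*|} \sum_{k \in \Lambda^*} e^{-\beta\,\omega(k)}.
\tag{3.55}
\]
Inequality (3.55) holds for every $L = r(2m + 1)$, and hence also in the limit $L \to \infty$. Since the right side of (3.55) is a Riemann sum approximation to the integral in (3.45), this limit yields the asserted estimate (3.45).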
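Estimate (3.45) can be probed numerically in the simplest case $d = 1$. The sketch below is an illustration under stated assumptions: it uses the standard closed form for the eigenvalues of the $n \times n$ tridiagonal matrix with $-2$ on the diagonal and $1$ beside it (this closed form is not taken from the paper), a plain midpoint rule for the momentum integral, and hypothetical function names and sample values.

```python
import math

def box_average(m, beta):
    """(1/|Q|) Tr exp[beta Delta_Q] for Q = {-m, ..., m} in d = 1.

    Delta_Q is tridiagonal with -2 on the diagonal and 1 next to it; its eigenvalues
    are -2 + 2 cos(j pi / (n + 1)), j = 1, ..., n, with n = 2m + 1.
    """
    n = 2 * m + 1
    return sum(math.exp(beta * (-2.0 + 2.0 * math.cos(j * math.pi / (n + 1))))
               for j in range(1, n + 1)) / n

def free_average(beta, points=4096):
    """(2 pi)^{-1} int_{-pi}^{pi} exp[-beta omega(k)] dk with omega(k) = 2 (1 - cos k)."""
    return sum(math.exp(-2.0 * beta * (1.0 - math.cos(2.0 * math.pi * (i + 0.5) / points - math.pi)))
               for i in range(points)) / points

def lemma_35_gap(m, beta):
    """Left side minus right side of (3.45) for d = 1; the lemma asserts this is >= 0."""
    return box_average(m, beta) - math.exp(-beta / m) * free_average(beta)

# Sample cube sizes and inverse temperatures (assumed for illustration).
gaps = {(m, beta): lemma_35_gap(m, beta) for m in (1, 2, 5, 20) for beta in (0.1, 1.0, 5.0, 25.0)}
```

The factor $e^{-\beta/m}$ compensates the loss of hopping across the boundary of $Q$, so the gap shrinks but stays nonnegative as $m$ grows.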
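3.4. The discrete Laplacians on $\Omega$, $\Omega^c$, and their eigenvalue sums

In this final subsection, we compare the sum of the eigenvalues of
\[
-\widetilde{\Delta} := P_\Omega(-\Delta)P_\Omega + P_\Omega^\perp(-\Delta)P_\Omega^\perp
\tag{3.56}
\]
below $\mu$ to the sum of the eigenvalues of $-\Delta$ below $\mu$, where $\Omega \subseteq \Lambda$ is an arbitrary, but henceforth fixed, subset of $\Lambda$, and $\Omega^c := \Lambda \setminus \Omega$ is its complement. To this end, we introduce the difference of these eigenvalue sums,
\[
\begin{aligned}
\delta E(\mu, \Omega) &:= \mathrm{Tr}\{[-\widetilde{\Delta} - \mu]_-\} - \mathrm{Tr}\{[-\Delta - \mu]_-\} \\
&= \mathrm{Tr}\{(-\widetilde{\Delta} - \mu)\widetilde{P}_-\} - \mathrm{Tr}\{(-\Delta - \mu)P_-\},
\end{aligned}
\tag{3.57}
\]
where $\widetilde{P}_- := \mathbb{1}[-\widetilde{\Delta} \le \mu]$ and $P_- := \mathbb{1}[-\Delta \le \mu]$. We further set $P_+ := P_-^\perp$ and $\widetilde{P}_+ := \widetilde{P}_-^\perp$. Since $\widetilde{P}_-$ commutes with $P_\Omega$, we have that $\mathrm{Tr}\{(-\widetilde{\Delta} - \mu)\widetilde{P}_-\} = \mathrm{Tr}\{(-\Delta - \mu)\widetilde{P}_-\}$, and thus
\[
\begin{aligned}
\delta E(\mu, \Omega) &= \mathrm{Tr}\{(-\Delta - \mu)(\widetilde{P}_- - P_-)\} \\
&= \mathrm{Tr}\{[-\Delta - \mu]_- (\widetilde{P}_- - \mathbb{1})\} + \mathrm{Tr}\{[-\Delta - \mu]_+ \widetilde{P}_-\} \\
&= \mathrm{Tr}\{[\Delta + \mu]_+ \widetilde{P}_+\} + \mathrm{Tr}\{[-\Delta - \mu]_+ \widetilde{P}_-\} \ge 0
\end{aligned}
\tag{3.58}
\]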
is manifestly nonnegative. The derivation of a nontrivial lower bound on $\delta E(\mu, \Omega)$ of the form $\delta E(\mu, \Omega) \ge \alpha(\mu)|\partial\Omega|$, where $\alpha(\mu) > 0$ is a constant which depends only on $\mu$ and the spatial dimension $d \ge 1$ (but not on $\Omega$), is a task that was first addressed by Freericks, Lieb, and Ueltschi in [24]. Shortly thereafter, Goldbaum [25] improved the numerical value for $\alpha(\mu) > 0$, especially if $\mu$ is close to $2d$. As a consequence of the estimates in [24, 25], we have the following lemma.

Lemma 3.6 ([24, 25]). (i) Let $\tfrac{1}{2} < \mu < 4d$. There are $L_*(\mu) < \infty$ and $\alpha(\mu) > 0$ such that, for all $L \ge L_*(\mu)$ and all subsets $\Omega \subseteq \Lambda$,
\[
\delta E(\mu, \Omega) \ge \alpha(\mu)|\partial\Omega|.
\tag{3.59}
\]
(ii) Let $0 < \mu \le \tfrac{1}{2}$, and define
\[
\alpha(\mu) := \frac{|S^{d-1}|\, \mu^{(2+d)/2}}{2^{1+d/2}\, (2\pi)^d\, (4d)^5} \quad \text{and} \quad L_*(\mu) := \frac{4\pi d}{\mu},
\tag{3.60}
\]
where $|S^{d-1}|$ is the surface volume of the $d$-dimensional sphere. Then, for all $L \ge L_*(\mu)$ and all subsets $\Omega \subseteq \Lambda = \mathbb{Z}_L^d$, we have
\[
\delta E(\mu, \Omega) \ge \alpha(\mu)|\partial\Omega|.
\tag{3.61}
\]
Proof. We only give the proof of (ii), which amounts to reproducing the proof of Lemma 3.1 in [24]. By $\{\psi_k\}_{k \in \Lambda^*} \subseteq \mathbb{C}^\Lambda$, we denote the orthonormal basis (ONB) of eigenvectors of $\Delta$, i.e.,
\[
\psi_k(x) := |\Lambda|^{-1/2} e^{-ik \cdot x}, \qquad k \in \Lambda^* = \frac{2\pi}{L}\mathbb{Z}_L^d,
\tag{3.62}
\]
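and we have that $-\Delta\psi_k = \omega(k)\psi_k$, with $\omega(k) = \sum_{\nu=1}^{d} 2\{1 - \cos(k_\nu)\}$. Evaluating the traces in Eq. (3.58) by means of this ONB, we obtain
\[
\begin{aligned}
\delta E(\mu, \Omega) &= \sum_{k \in \Lambda^*} \Big\{ [\mu - \omega(k)]_+ \langle \psi_k | \widetilde{P}_+ \psi_k \rangle + [\omega(k) - \mu]_+ \langle \psi_k | \widetilde{P}_- \psi_k \rangle \Big\} \\
&\ge \sum_{k \in \Lambda^*} [\mu - \omega(k)]_+ \langle \psi_k | \widetilde{P}_+ \psi_k \rangle.
\end{aligned}
\tag{3.63}
\]
Let $\{\varphi_j\}_{j=1}^{|\Lambda|} \subseteq \mathbb{C}^\Lambda$ be an ONB of eigenvectors of $\widetilde{\Delta}$, i.e., $-\widetilde{\Delta}\varphi_j = e_j\varphi_j$. For any $k \in \Lambda^*$ and $1 \le j \le |\Lambda|$, we observe that
\[
\begin{aligned}
\big(e_j - \omega(k)\big)^2 |\langle \psi_k | \varphi_j \rangle|^2 &= |\langle \psi_k | (\Delta - \widetilde{\Delta})\varphi_j \rangle|^2 \\
&= |\langle \psi_k | (P_\Omega \Delta P_\Omega^\perp + P_\Omega^\perp \Delta P_\Omega)\varphi_j \rangle|^2 \\
&= |\langle P_\Omega \Delta P_\Omega^\perp \psi_k | \varphi_j \rangle|^2 + |\langle P_\Omega^\perp \Delta P_\Omega \psi_k | \varphi_j \rangle|^2 \\
&\ge |\langle P_{\partial\Omega} \Delta P_\Omega^\perp \psi_k | \varphi_j \rangle|^2,
\end{aligned}
\tag{3.64}
\]
using that either $P_\Omega\varphi_j = 0$ or $P_\Omega^\perp\varphi_j = 0$ and that $P_\Omega \Delta P_\Omega^\perp = P_{\partial\Omega} \Delta P_\Omega^\perp$. Since $|e_j - \omega(k)| \le 4d$, Eq. (3.64) implies that
\[
(4d)^2 |\langle \psi_k | \varphi_j \rangle|^2 \ge |\langle b_k | \varphi_j \rangle|^2,
\tag{3.65}
\]
where $b_k := P_{\partial\Omega} \Delta P_\Omega^\perp \psi_k$ is the boundary vector that plays a crucial role in [24]. By summation over all $j$ corresponding to eigenvalues $e_j > \mu$, we obtain
\[
\langle \psi_k | \widetilde{P}_+ \psi_k \rangle \ge (4d)^{-2} \langle b_k | \widetilde{P}_+ b_k \rangle,
\tag{3.66}
\]
for all $k \in \Lambda^*$. Next, the convexity of $\lambda \mapsto [\lambda]_+$ and the fact that $\widetilde{P}_+ = \mathbb{1}[-\widetilde{\Delta} > \mu] \ge (4d)^{-1}[-\widetilde{\Delta} - \mu]_+$ yield
\[
\langle b_k | \widetilde{P}_+ b_k \rangle \ge \frac{1}{4d} \langle b_k | [-\widetilde{\Delta} - \mu]_+ b_k \rangle \ge \frac{1}{4d} \big[ \langle b_k | (-\widetilde{\Delta} - \mu) b_k \rangle \big]_+ = \frac{1}{4d} \big[ \langle b_k | (-\Delta - \mu) b_k \rangle \big]_+.
\tag{3.67}
\]
Now, for any $x \in \partial\Omega$ there is, by definition, at least one point $x + e \in \Omega^c$, with $|e|_1 = 1$. Since $b_k$ is supported in $\partial\Omega$, we have $b_k(x + e) = 0$, and thus
\[
\langle b_k | (-\Delta - \mu) b_k \rangle = \sum_{x \in \partial\Omega} \bigg\{ \sum_{|e|_1 = 1} |b_k(x) - b_k(x + e)|^2 - \mu |b_k(x)|^2 \bigg\} \ge (1 - \mu) \sum_{x \in \partial\Omega} |b_k(x)|^2 = (1 - \mu) \|b_k\|^2.
\tag{3.68}
\]
Inserting (3.66)–(3.68) into (3.63), we arrive at
\[
\delta E(\mu, \Omega) \ge \frac{(1 - \mu)}{(4d)^3} \sum_{k \in \Lambda^*} [\mu - \omega(k)]_+ \|b_k\|^2.
\tag{3.69}
\]
Next, we use that in the sum in (3.69) only those $k \in \Lambda^*$ contribute for which $\omega(k) = \sum_{\nu=1}^{d} 2\{1 - \cos(k_\nu)\} \le \tfrac{1}{2}$, as $0 < \mu \le 1$. This implies that $\cos(k_\nu) \ge \tfrac{1}{2}$, for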
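all $\nu \in \{1, 2, \ldots, d\}$. Hence, for these $k$, we have that
\[
\begin{aligned}
\|b_k\|^2 &= \frac{1}{|\Lambda|} \sum_{x \in \partial\Omega} \bigg| \sum_{\sigma = \pm} \sum_{\nu=1}^{d} e^{i\sigma k_\nu}\, \mathbb{1}[x + \sigma e_\nu \in \Omega^c] \bigg|^2 \\
&\ge \frac{1}{|\Lambda|} \sum_{x \in \partial\Omega} \bigg( \sum_{\sigma = \pm} \sum_{\nu=1}^{d} \cos(k_\nu)\, \mathbb{1}[x + \sigma e_\nu \in \Omega^c] \bigg)^2 \\
&\ge \frac{1}{4|\Lambda|} \sum_{x \in \partial\Omega} 1 = \frac{|\partial\Omega|}{4|\Lambda|},
\end{aligned}
\tag{3.70}
\]
since there is at least one choice for $(\sigma, \nu)$ such that $x + \sigma e_\nu \in \Omega^c$. Inserting this estimate into (3.69), we obtain
\[
\delta E(\mu, \Omega) \ge \frac{1}{8\,(4d)^3}\, \frac{|\partial\Omega|}{|\Lambda^*|} \sum_{k \in \Lambda^*} [\mu - \omega(k)]_+.
\tag{3.71}
\]
Now define $q : \mathbb{T}^d \to \Lambda^*$ by the preimages
\[
q^{-1}(k) := k + \Big[ -\frac{\pi}{L}, \frac{\pi}{L} \Big)^d,
\tag{3.72}
\]
for $k \in \Lambda^*$. In other words, given $\xi \in \mathbb{T}^d$, the point $q(\xi) \in \Lambda^*$ is the closest point to $\xi$. In particular, $|\xi - q(\xi)|_\infty \le \frac{\pi}{L}$, which implies that $|\omega(q(\xi)) - \omega(\xi)| \le \frac{2\pi d}{L}$, by Taylor's theorem. Hence,
\[
\frac{1}{|\Lambda^*|} \sum_{k \in \Lambda^*} [\mu - \omega(k)]_+ = \int_{\mathbb{T}^d} \big[\mu - \omega(q(\xi))\big]_+ \frac{d^d\xi}{(2\pi)^d} \ge \int_{\mathbb{T}^d} \big[\mu - 2\pi d L^{-1} - \omega(\xi)\big]_+ \frac{d^d\xi}{(2\pi)^d}.
\tag{3.73}
\]
Since, by assumption, $\frac{2\pi d}{L} \le \frac{2\pi d}{L_*} = \frac{\mu}{2}$ and $\omega(\xi) \le \xi^2$, we have
\[
\int_{\mathbb{T}^d} \big[\mu - 2\pi d L^{-1} - \omega(\xi)\big]_+ d^d\xi \ge \int_{\mathbb{T}^d} \Big[\frac{\mu}{2} - \xi^2\Big]_+ d^d\xi = \frac{|S^{d-1}|\, \mu^{1+(d/2)}}{2^{d/2}\, d(d + 2)}.
\tag{3.74}
\]
Inserting (3.73) and (3.74) into (3.71), we arrive at the asserted estimate.

Acknowledgments

The authors are grateful to Alessandro Giuliani for very helpful discussions and comments about an earlier version of this paper. They also thank Manfred Salmhofer, Jürg Fröhlich and Daniel Ueltschi for useful discussions. M.T. thanks the German student exchange service DAAD for a generous stipend, which supported two-thirds of his graduate studies. V.B. and M.T. gratefully acknowledge financial support from grant no. HPRN-CT-2002-00277 of the European Union and grant no. Ba 1477/3-3 of the Deutsche Forschungsgemeinschaft. E.L. gratefully acknowledges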
support from the Alexander von Humboldt Foundation of a fellowship, the U.S. National Science Foundation, grant no. PHY-0133984, and the hospitality of the Mathematics Departments of the University of Mainz and the Technical University of Berlin. The authors appreciate the careful and helpful work of the referee.

References

[1] E. H. Lieb and F. Y. Wu, Absence of Mott transition in an exact solution of the short-range, one-band model in one dimension, Phys. Rev. Lett. 20 (1968) 1445–1448.
[2] E. H. Lieb and F. Y. Wu, The one-dimensional Hubbard model: A reminiscence, Physica A 321 (2003) 1–27.
[3] D. Penn, Stability theory of the magnetic phases for a simple model of the transition metals, Phys. Rev. 142(2) (1966) 350–365.
[4] P. van Dongen, Thermodynamics of the extended Hubbard model in high dimensions, Phys. Rev. Lett. 67(6) (1991) 757–760.
[5] P. van Dongen, Extended Hubbard model at weak coupling, Phys. Rev. B 50(19) (1994) 14016–14030.
[6] T. Obermeier, T. Pruschke and J. Keller, Ferromagnetism in the large-u Hubbard model, Phys. Rev. B 56(14) (1997) R8479–R8482.
[7] J. Wahle, N. Blümer, J. Schlipf, K. Held and D. Vollhardt, Microscopic conditions favoring itinerant ferromagnetism, Phys. Rev. B 58(19) (1997) 12749–12757.
[8] G. M. Pastor, R. Hirsch and B. Mühlschlegel, Magnetism and structure of small clusters: An exact treatment of electron correlations, Phys. Rev. B 53(15) (1996) 10382–10396.
[9] T. Hanisch, G. S. Uhrig and E. Müller-Hartmann, Lattice dependence of saturated ferromagnetism in the Hubbard model, Phys. Rev. B 56 (1997) 13960.
[10] E. Müller-Hartmann, Ferromagnetism in Hubbard models: Low density route, J. Low. Temp. Phys. 99 (1995) 349.
[11] H. Tasaki, Ferromagnetism in Hubbard models with degenerate single-electron ground states, Phys. Rev. Lett. 69 (1992) 1608–1611.
[12] A. Mielke, Ferromagnetism in the Hubbard model and Hund's rule, Phys. Lett. A 174 (1993) 443–448.
[13] A. Mielke and H.
Tasaki, Ferromagnetism in the Hubbard model — Examples from models with degenerate single-electron ground states, Commun. Math. Phys. 158 (1993) 341–371. [14] H. Tasaki, Ferromagnetism in Hubbard models, Phys. Rev. Lett. 75 (1995) 4678–4681. [15] H. Tasaki, Ferromagnetism in the Hubbard model: A constructive approach, Commun. Math. Phys. 242(3) (2003) 445–472. [16] H. Tasaki, From Nagaoka’s ferromagnetism to flat-band ferromagnetism and beyond — An introduction to ferromagnetism in the Hubbard model, Prog. Theor. Phys. 99(4) (1998) 489–548. [17] V. Bach, E. H. Lieb and J. P. Solovej, Generalized Hartree–Fock theory and the Hubbard model, J. Stat. Phys. 76 (1994) 3–90. [18] E. H. Lieb and D. C. Mattis, Theory of ferromagnetism and the ordering of electronic energy levels, Phys. Rev. 125 (1962) 164–172. [19] P. Pieri, S. Daul, D. Baeriswyl, M. Dzierzawa and P. Fazekas, Low density ferromagnetism in the Hubbard model, Phys. Rev. B 45 (1996) 9250. [20] E. H. Lieb, R. Seiringer and J. P. Solovej, Ground state energy of the low density Fermi gas, Phys. Rev. A 71 (2005) 053605–13.
[21] E. H. Lieb, The Hubbard model: Some rigorous results and open problems, in Proc. XIth Int. Cong. Mathematical Physics (Paris, 1994), ed. D. Iagolnitzer (International Press, 1995), pp. 392–412; arXiv cond-mat/9311033.
[22] V. Bach, E. H. Lieb, M. Loss and J. P. Solovej, There are no unfilled shells in Hartree–Fock theory, Phys. Rev. Lett. 72(19) (1994) 2981–2983.
[23] E. H. Lieb, Variational principle for many-fermion systems, Phys. Rev. Lett. 46(7) (1981) 457–459.
[24] J. K. Freericks, E. H. Lieb and D. Ueltschi, Segregation in the Falicov–Kimball model, Commun. Math. Phys. 227 (2002) 243–279.
[25] P. Goldbaum, Lower bound for the segregation energy in the Falicov–Kimball model, J. Phys. A 9 (2003) 2227–2234.
August 5, 2006 21:35 WSPC/148-RMP
J070-00271
Reviews in Mathematical Physics Vol. 18, No. 5 (2006) 545–564 © World Scientific Publishing Company
A FURTHER STUDY ON NON-ABELIAN PHASE SPACES: LEFT-SYMMETRIC ALGEBRAIC APPROACH AND RELATED GEOMETRY
CHENGMING BAI
Chern Institute of Mathematics & LPMC, Nankai University, Tianjin 300071, P. R. China
and Liu Hui Center for Applied Mathematics, Tianjin 300071, P. R. China
and Department of Mathematics, Rutgers, The State University of New Jersey, Piscataway, NJ 08854, USA
[email protected]

Received 1 November 2005
Revised 12 June 2006

The notion of non-abelian phase space of a Lie algebra was first formulated and then discussed by Kuperschmidt. In this paper, we further study the non-abelian phase spaces in terms of left-symmetric algebras. We interpret the natural appearance of left-symmetric algebras from the intrinsic algebraic properties and the close relations with the classical Yang–Baxter equation. Furthermore, using the theory of left-symmetric algebras, we study some interesting geometric structures related to phase spaces. Moreover, we also discuss the generalized phase spaces with certain non-trivial algebraic structures on the dual spaces.

Keywords: Phase space; left-symmetric algebra; Lie algebra; symplectic form.
Mathematics Subject Classification 2000: 17B, 53C, 81R
1. Introduction

The phase space $T^*V$ of a vector space $V$ over a field $\mathbb{F}$ is the direct sum of $V$ and its dual space $V^* = \mathrm{Hom}(V, \mathbb{F})$ endowed with the symplectic form
\[
\omega(x + a^*, y + b^*) = \langle a^*, y \rangle - \langle b^*, x \rangle, \qquad \forall x, y \in V,\ a^*, b^* \in V^*,
\tag{1.1}
\]
where $\langle\ ,\ \rangle$ is the ordinary pairing between $V$ and $V^*$. If $V$ is replaced by a Lie algebra (in particular, a non-abelian Lie algebra), the definition of phase space was naturally generalized and formulated by B. A. Kuperschmidt in [1] as follows. Let $G$ be a Lie algebra and $G^*$ be its dual space. A phase space of $G$ is the vector
space $T^*G = G \oplus G^*$ as the direct sum of vector spaces such that $T^*G$ is a Lie algebra, $G$ is its subalgebra and the symplectic form $\omega$ given by Eq. (1.1) is a 2-cocycle on $T^*G$, that is,
\[
\omega([x_1 + a_1^*, x_2 + a_2^*], x_3 + a_3^*) + \mathrm{CP} = 0, \qquad \forall x_i \in G,\ a_i^* \in G^*,
\tag{1.2}
\]
where "CP" stands for "cyclic permutation". As was pointed out in [1], not every Lie algebra has a phase space. A natural question is: Which kind of Lie algebras have phase spaces? Furthermore, unlike the (unique) phase space of a vector space (the abelian phase space, that is, the Lie algebra structure of the phase space is abelian), the phase space of a Lie algebra is not unique in general. Thus, the following question is: for such a Lie algebra, how do we construct and classify its phase spaces? In [1, 2], these two questions were partly solved: A large class of phase spaces of certain Lie algebras such as $gl(V)$, the Lie algebra of vector fields on $\mathbb{R}^n$, the current algebras and the Virasoro algebra, can be constructed from a kind of non-associative algebras, namely, left-symmetric algebras (in [1–3], the notion of quasi-associative algebras was used). A left-symmetric algebra $A$ is a vector space over a field $\mathbb{F}$ with a bilinear product $(x, y) \mapsto xy$ such that for any $x, y, z \in A$, the associator $(x, y, z) = (xy)z - x(yz)$ is symmetric in $x, y$, that is,
\[
(x, y, z) = (y, x, z), \quad \text{or equivalently,} \quad (xy)z - x(yz) = (yx)z - y(xz).
\tag{1.3}
\]
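Condition (1.3) is a finite check once the product is given on a basis. The sketch below is an illustration only: the encoding of a product as a table of structure constants and all function names are assumptions. As test data it uses the two two-dimensional products on the basis $\{e, e^*\}$ quoted from Example 2.9 of this paper, plus one made-up product that fails (1.3); it also checks that the commutator satisfies the Jacobi identity.

```python
from itertools import product

def multiply(table, x, y):
    """Product of coordinate vectors x, y; table[i][j] holds the coordinates of e_i e_j."""
    n = len(table)
    out = [0] * n
    for i, j in product(range(n), repeat=2):
        for k in range(n):
            out[k] += x[i] * y[j] * table[i][j][k]
    return out

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def basis(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def associator(table, x, y, z):
    """(x, y, z) = (xy)z - x(yz)."""
    return sub(multiply(table, multiply(table, x, y), z),
               multiply(table, x, multiply(table, y, z)))

def is_left_symmetric(table):
    """Eq. (1.3) on basis vectors, which suffices because the associator is trilinear."""
    es = basis(len(table))
    return all(associator(table, x, y, z) == associator(table, y, x, z)
               for x, y, z in product(es, repeat=3))

def bracket(table, x, y):
    """Commutator [x, y] = xy - yx."""
    return sub(multiply(table, x, y), multiply(table, y, x))

def satisfies_jacobi(table):
    n = len(table)
    for x, y, z in product(basis(n), repeat=3):
        cyc = [bracket(table, x, bracket(table, y, z)),
               bracket(table, y, bracket(table, z, x)),
               bracket(table, z, bracket(table, x, y))]
        if any(sum(t[k] for t in cyc) != 0 for k in range(n)):
            return False
    return True

# Index 0 is e, index 1 is e^*; the two products are quoted from Example 2.9.
TABLE_27 = [[[1, 0], [0, 0]], [[0, 1], [0, 0]]]    # ee = e, ee* = 0, e*e = e*, e*e* = 0
TABLE_28 = [[[1, 0], [0, -1]], [[0, 0], [0, 0]]]   # ee = e, ee* = -e*, e*e = 0, e*e* = 0
# Hypothetical product that violates (1.3), for contrast: e_0 e_0 = e_1, e_1 e_0 = e_0.
NOT_LEFT_SYMMETRIC = [[[0, 1], [0, 0]], [[1, 0], [0, 0]]]
```

Both quoted products give the same commutator $[e, e^*] = -e^*$, in line with the statement in Example 2.9 that they are compatible with the same two-dimensional non-abelian Lie algebra.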
In fact, left-symmetric algebras are a kind of natural algebraic system appearing in many fields of mathematics and mathematical physics, such as convex homogeneous cones [4], affine manifolds and affine structures on Lie groups [5–7], symplectic and Kähler structures on Lie groups [8–10], complex product structures on Lie algebras [11], integrable systems [12, 13], classical and quantum Yang–Baxter equations [14–18], Poisson brackets and infinite-dimensional Lie algebras [19–25], operads [26], quantum field theory [27] and so on. The aim of this paper is to study the phase spaces of Lie algebras in terms of left-symmetric algebras, extending the discussion in [1–3]. We would like to point out that the appearance of left-symmetric algebras in the study of those phase spaces is natural and necessary since a Lie algebra has a phase space if and only if it is sub-adjacent to a left-symmetric algebra. This can also be seen from the close relations with the classical Yang–Baxter equation. Moreover, in the sense of [1–3], every phase space of a Lie algebra $G$ can be constructed from a compatible left-symmetric algebra structure on $G$, and two such phase spaces are isomorphic if and only if their corresponding (compatible) left-symmetric algebras are isomorphic. Thus, we answer the above two questions completely. Therefore, the theory of left-symmetric algebras plays a key role in the study of phase spaces. Furthermore, besides the interpretation of their natural appearance, in this paper, we mainly study two other important topics using the theory of left-symmetric algebras. One is certain geometry related to phase spaces. Since left-symmetric algebras have close relations with many geometric structures [4–11], they are helpful for understanding the many interesting geometric properties of some
phase spaces. The other is to generalize the concept of phase space in the sense of [1–3] to a wider setting in which there are certain non-trivial algebraic structures on the dual space $G^*$, like the Drinfel'd double [28]. Such phase spaces can be constructed through the theory of left-symmetric algebras, too. Throughout this paper, for brevity, all algebras are of finite dimension, although many results also hold in infinite dimension.

2. The Natural Appearance of Left-Symmetric Algebras: Algebraic Interpretation

At first, we recall some basic properties of left-symmetric algebras [29–33].

Proposition 2.1. Let $A$ be a left-symmetric algebra. For any $x \in A$, let $L_x$ denote the left multiplication operator, that is, $L_x(y) = xy$ for any $y \in A$. Then, we have

(1) The commutator
\[
[x, y] = xy - yx, \qquad \forall x, y \in A,
\tag{2.1}
\]
defines a Lie algebra $G(A)$, which is called the sub-adjacent Lie algebra of $A$, and $A$ is also called the compatible left-symmetric algebra on the Lie algebra $G(A)$.

(2) $L : G(A) \to gl(G(A))$ with $x \mapsto L_x$ gives a regular representation of the Lie algebra $G(A)$, that is,
\[
[L_x, L_y] = L_{[x,y]}, \qquad \forall x, y \in A.
\tag{2.2}
\]
It is not true that there is a compatible left-symmetric algebra structure on every Lie algebra $G$. For example, a real or complex Lie algebra $G$ with a compatible left-symmetric algebra structure must satisfy the condition $[G, G] \ne G$ (cf. [7]), hence there does not exist a compatible left-symmetric algebra structure on any real or complex semisimple Lie algebra. Here, we briefly introduce a necessary and sufficient condition for a Lie algebra to admit a compatible left-symmetric algebra structure [6, 29]. Let $G$ be a Lie algebra and $\rho : G \to gl(V)$ be a representation of $G$. A 1-cocycle $q$ associated to $\rho$ (denoted by $(\rho, q)$) is defined as a linear map from $G$ to $V$ satisfying
\[
q[x, y] = \rho(x)q(y) - \rho(y)q(x), \qquad \forall x, y \in G.
\tag{2.3}
\]
Proposition 2.2. Let $G$ be a Lie algebra. Then, there is a compatible left-symmetric algebra structure on $G$ if and only if there exists a bijective 1-cocycle of $G$. In fact, let $(\rho, q)$ be a bijective 1-cocycle of $G$, then
\[
x * y = q^{-1}\rho(x)q(y), \qquad \forall x, y \in G,
\tag{2.4}
\]
defines a left-symmetric algebra structure on $G$. Conversely, for a left-symmetric algebra $A$, $(L, \mathrm{id})$ is a bijective 1-cocycle of $G(A)$, where id is the identity transformation on $G(A)$.

Next, we study the relation between phase spaces and left-symmetric algebras. Let $G$ be a Lie algebra and $\rho : G \to gl(V)$ be a representation. On the direct sum of
vector spaces $G \oplus V$, it is well known that there is a natural Lie algebra structure given as follows [34]:
\[
[x_1 + v_1, x_2 + v_2] = [x_1, x_2] + \rho(x_1)v_2 - \rho(x_2)v_1, \qquad \forall x_1, x_2 \in G,\ v_1, v_2 \in V.
\tag{2.5}
\]
This Lie algebra is denoted by $G \ltimes_\rho V$. The following construction was first given in [1]:

Theorem 2.3 [Kuperschmidt]. Let $A$ be a left-symmetric algebra. Then, $T^*G(A) = G(A) \ltimes_{L^*} G^*(A)$ is a phase space of the sub-adjacent Lie algebra $G(A)$, where $G^*(A)$ is the dual space of $G(A)$ and $L^*$ is the dual representation of the regular representation $L$. Conversely, if $T^*G$ is a phase space of a Lie algebra $G$ such that the Lie bracket on $T^*G$ is given by $G \ltimes_{\rho^*} G^*$, where $\rho : G \to gl(G)$ is a representation of $G$ and $\rho^* : G \to gl(G^*)$ is its dual representation, then for any $x, y \in G$, $xy = \rho(x)y$ defines a left-symmetric algebra structure on $G$.

In fact, the second half of the above theorem can be extended to the main theorem of this section, given as follows.

Theorem 2.4. Let $T^*G = G \oplus G^*$ be a phase space of a Lie algebra $G$. Then, there exists a compatible left-symmetric algebra structure "$*$" on $T^*G$ defined by
\[
\omega(x * y, z) = -\omega(y, [x, z]), \qquad \forall x, y, z \in T^*G.
\tag{2.6}
\]
Moreover, $G$ is a left-symmetric subalgebra with the above product.

Proof. The first half follows directly from [8] due to the symplectic form $\omega$, or can be obtained as follows: $\omega$ defines a linear isomorphism $q : T^*G \to (T^*G)^*$ given by
\[
\langle q(x), y \rangle = \omega(x, y), \qquad \forall x, y \in T^*G.
\]
Then, $q$ is a bijective 1-cocycle associated to the dual representation $\mathrm{ad}^*$ of the adjoint representation ad of $T^*G$ and the left-symmetric structure on $T^*G$ can be defined by
\[
x * y = q^{-1}(\mathrm{ad}^* x\, q(y)), \qquad \forall x, y \in T^*G,
\]
which exactly satisfies (for any $x, y, z \in T^*G$)
\[
\omega(x * y, z) = \langle q(x * y), z \rangle = \langle \mathrm{ad}^* x(q(y)), z \rangle = -\langle q(y), [x, z] \rangle = -\omega(y, [x, z]).
\]
Let $x, y \in G$. We need to prove that $x * y \in G$. In fact, for any $z \in G$, we have
\[
\omega(x * y, z) = -\omega(y, [x, z]) = 0.
\]
If $x * y \ne 0$, then let $x * y = u + v^*$, where $u \in G$, $v^* \in G^*$. Thus,
\[
0 = \omega(x * y, G) = \omega(v^*, G) = \langle v^*, G \rangle.
\]
Therefore, $v^* = 0$. Hence, $G$ is a left-symmetric subalgebra of $T^*G$ with the product $*$.

Remark 2.5. A Lie algebra with a non-degenerate 2-cocycle is called a symplectic Lie algebra [8–10, 13]. In fact, the above theorem and its proof only involve the following properties of $T^*G$: $T^*G$ is a symplectic Lie algebra and $G$ is a (Lie) subalgebra. On the other hand, if $T^*G$ is given by $G \ltimes_{\rho^*} G^*$, then for any $x, y \in G$, $a^* \in G^*$, we have
\[
\langle x * y, a^* \rangle = -\omega(x * y, a^*) = \omega(y, [x, a^*]) = \omega(y, \rho^*(x)a^*) = \langle \rho(x)y, a^* \rangle.
\]
Then, $x * y = \rho(x)y$, which is just the second half of Theorem 2.3.

Corollary 2.6. A Lie algebra has a phase space if and only if it is sub-adjacent to a left-symmetric algebra.

Therefore, there does not exist any phase space on a real or complex semisimple Lie algebra. So, the classical and mature theory of semisimple Lie algebras is almost useless here. It is necessary to search for new ideas and methods. We have seen that the theory of left-symmetric algebras is one choice, although it is still in development. On the other hand, according to Theorem 2.4, a phase space of a Lie algebra is also sub-adjacent to a left-symmetric algebra.

Corollary 2.7. Let $(A, \cdot)$ be a left-symmetric algebra. Then, there is a compatible left-symmetric algebra structure on the phase space $T^*G(A) = G(A) \ltimes_{L^*} G^*(A)$ given as follows: $\forall x, y \in G(A)$, $a^*, b^* \in G^*(A)$, we have $x * a^*,\ a^* * x \in G^*(A)$ and
\[
x * y = x \cdot y, \quad a^* * b^* = 0, \quad \langle x * a^*, y \rangle = -\langle a^*, x \cdot y - y \cdot x \rangle, \quad \langle a^* * x, y \rangle = \langle a^*, y \cdot x \rangle.
\tag{2.7}
\]
The above definition is natural (cf. [1]) in the sense that two isomorphic left-symmetric algebras $(A, \cdot)$ and $(A', \circ)$ induce isomorphic left-symmetric algebra structures on $T^*G(A)$ and $T^*G(A')$.

Proof. The first half follows directly from Theorem 2.4. Let $\varphi : A \to A'$ be an isomorphism of left-symmetric algebras. Then, the dual map $\varphi^* : (A')^* \to A^*$ is invertible. Extend $\varphi$ to $A^*$ by the map $(\varphi^*)^{-1}$, so that
\[
\langle \varphi(a^*), \varphi(x) \rangle = \langle a^*, x \rangle, \qquad \forall x \in A,\ a^* \in A^*.
\]
For any $x, y \in A$, $a^* \in A^*$, we have
\[
\begin{aligned}
\langle \varphi(x * a^*), \varphi(y) \rangle &= \langle x * a^*, y \rangle = -\langle a^*, x \cdot y - y \cdot x \rangle = -\langle \varphi(a^*), \varphi(x \cdot y - y \cdot x) \rangle \\
&= -\langle \varphi(a^*), \varphi(x) \circ \varphi(y) - \varphi(y) \circ \varphi(x) \rangle = \langle \varphi(x) * \varphi(a^*), \varphi(y) \rangle.
\end{aligned}
\]
Therefore, $\varphi(x * a^*) = \varphi(x) * \varphi(a^*)$. Similarly, $\varphi(a^* * x) = \varphi(a^*) * \varphi(x)$. So, the two left-symmetric algebra structures on the phase spaces are isomorphic.
Remark 2.8. In [1], the compatible left-symmetric algebra structure on the phase space $T^*G(A) = G(A) \ltimes_{L^*} G^*(A)$ was constructed in another way: $\forall x, y \in G(A)$, $a^*, b^* \in G^*(A)$, we have $x * a^*,\ a^* * x \in G^*(A)$ and
\[
x * y = x \cdot y, \quad a^* * b^* = 0, \quad \langle x * a^*, y \rangle = -\langle a^*, x \cdot y \rangle, \quad a^* * x = 0.
\tag{2.8}
\]
This definition is also "natural" in the sense of Corollary 2.7. In general, these two left-symmetric algebra structures on $T^*G(A)$ are not isomorphic although the left-symmetric algebra structures on $G(A)$ are the same.

Example 2.9. Let $\mathbb{F}$ be the 1-dimensional associative algebra with a basis $e$ satisfying $e \cdot e = e$. Then, $\mathbb{F} \ltimes_{L^*} \mathbb{F}^*$ is a natural phase space. Moreover, in this case, as a Lie algebra, it is isomorphic to $\langle e, e^* \mid [e, e^*] = -e^* \rangle$, which is the 2-dimensional non-abelian Lie algebra. From Eq. (2.7), the compatible left-symmetric algebra structure on $\mathbb{F} \ltimes_{L^*} \mathbb{F}^*$ is given by
\[
e * e = e, \quad e * e^* = 0, \quad e^* * e = e^*, \quad e^* * e^* = 0.
\]
From Eq. (2.8), the compatible left-symmetric algebra structure on $\mathbb{F} \ltimes_{L^*} \mathbb{F}^*$ is given by
\[
e * e = e, \quad e * e^* = -e^*, \quad e^* * e = 0, \quad e^* * e^* = 0.
\]
Obviously, these two left-symmetric algebras are not isomorphic [33].

Example 2.10. Let $A$ be a left-symmetric algebra. Since there is a left-symmetric algebra structure on the phase space $T^*G(A) = G(A) \ltimes_{L^*} G^*(A)$ (defined by Eq. (2.7) or (2.8)), one can construct a new phase space $T^*G(A) \ltimes_{L^*} (T^*G(A))^*$ of $T^*G(A)$. This process can be continued indefinitely. Hence, there exists a series of phase spaces $\{A^{(n)}\}_{n \ge 2}$:
\[
A^{(1)} = G(A), \quad A^{(2)} = T^*A^{(1)} = T^*G(A), \ \ldots, \quad A^{(n)} = T^*A^{(n-1)}, \ \ldots.
\]
$A^{(n)}$ $(n \ge 2)$ is called the symplectic double of $A^{(n-1)}$ in [3].

Definition 2.11. Let $T^*G_1$ be a phase space of a Lie algebra $G_1$ and $T^*G_2$ be a phase space of a Lie algebra $G_2$. $T^*G_1$ is said to be isomorphic to $T^*G_2$ if there exists a Lie algebra isomorphism $\varphi : T^*G_1 \to T^*G_2$ satisfying the following conditions:
\[
\varphi(G_1) = G_2, \quad \varphi(G_1^*) = G_2^*; \qquad \omega(x, y) = \omega(\varphi(x), \varphi(y)), \quad \forall x, y \in T^*G_1.
\tag{2.9}
\]
Proposition 2.12. Let $(A, \cdot)$ and $(A', \circ)$ be two left-symmetric algebras. Then, the phase space $T^*G(A) = G(A) \ltimes_{L^*} G^*(A)$ is isomorphic to the phase space $T^*G(A') = G(A') \ltimes_{L^*} G^*(A')$ if and only if $A$ is isomorphic to $A'$ as left-symmetric algebras.

Proof. Let $\varphi : A \to A'$ be an isomorphism of left-symmetric algebras. Then, from the proof of Corollary 2.7, $\varphi$ induces an isomorphism of phase spaces from $G(A) \ltimes_{L^*}$
G ∗ (A) to G(A ) L∗ G ∗ (A ). Conversely, let ϕ : G(A) L∗ G ∗ (A) → G(A ) L∗ G ∗ (A ) be an isomorphism of phase spaces. Then, for any x, y ∈ A, a∗ ∈ A∗ , ϕ(x · y), ϕ(a∗ ) = ω(ϕ(a∗ ), ϕ(x · y)) = ω(a∗ , x · y) = ω(y, [x, a∗ ]) = ω(ϕ(y), ϕ[x, a∗ ]) = ω(ϕ(y), [ϕ(x), ϕ(a∗ )]) = −ω(ϕ(x) ◦ ϕ(y), ϕ(a∗ )) = ϕ(x) ◦ ϕ(y), ϕ(a∗ ). Therefore, ϕ is an isomorphism of left-symmetric algebras. The complex left-symmetric algebras have been classified up to dimension 3 in the sense of isomorphism [29, 33]. 3. The Natural Appearance of Left-Symmetric Algebras: Classical Yang–Baxter Equation The classical Yang–Baxter equation plays an important role in the study of integrable systems [35–37]. Definition 3.1. Let G be a Lie algebra and r ∈ G ⊗ G. r is called a solution of classical Yang–Baxter equation (CYBE) on G if [r12 , r13 ] + [r12 , r23 ] + [r13 , r23 ] = 0
in U (G),
(3.1)
where U (G) is the universal enveloping algebra of G and for r = i ai ⊗ bi , ai ⊗ bi ⊗ 1; r13 = ai ⊗ 1 ⊗ bi ; r23 = 1 ⊗ ai ⊗ b i . r12 = i
i
(3.2)
i
r is also called a classical r-matrix. Moreover, r is said to be skew-symmetric if (ai ⊗ bi − bi ⊗ ai ). (3.3) r=
i
For $r = \sum_i a_i \otimes b_i \in G \otimes G$, we denote $r^{21} = \sum_i b_i \otimes a_i$. An interesting interpretation of CYBE is given by Drinfel’d as follows [28]:
Theorem 3.2 [Drinfel’d]. Let G be a finite-dimensional Lie algebra and r ∈ G ⊗ G be skew-symmetric and non-degenerate. r can be identified as a linear map (a linear isomorphism) from G∗ to G, that is, ⟨u, r(v)⟩ = ⟨u ⊗ v, r⟩ for any u, v ∈ G∗. Then, r is a solution of CYBE on G if and only if the bilinear form B on G given by
B(x, y) = ⟨r−1(x), y⟩, ∀x, y ∈ G, (3.4)
is a 2-cocycle on G. Hence, for the symplectic form ω given by Eq. (1.1) on a phase space of a Lie algebra, there should be a corresponding classical r-matrix. It is not difficult to get the exact form of this classical r-matrix by the above theorem [3]. However, let us see this relation from another point of view in the following. In fact, there is a correspondence between left-symmetric algebras and certain classical r-matrices [29].
Theorem 3.3 [29]. Let G be a Lie algebra, ρ : G → gl(V) be a representation of G and ρ∗ : G → gl(V∗) be its dual representation. Let T : V → G be a linear map which is identified as an element in G ⊗ V∗ ⊂ (G ⋉_{ρ∗} V∗) ⊗ (G ⋉_{ρ∗} V∗). Then,
$$r = T - T^{21} \tag{3.5}$$
is a skew-symmetric solution of CYBE on G ⋉_{ρ∗} V∗ if and only if T satisfies
[T(u), T(v)] = T(ρ(T(u))v − ρ(T(v))u), ∀u, v ∈ V. (3.6)
Furthermore, if T : V → G is a linear map satisfying Eq. (3.6), then
u ∗ v = ρ(T(u))v, ∀u, v ∈ V, (3.7)
defines a left-symmetric algebra on V. Conversely, let (A, ·) be a left-symmetric algebra. Then, on G(A) ⋉_{L∗} G∗(A),
$$r = \sum_{i=1}^{n} (-e_i \otimes e_i^* + e_i^* \otimes e_i) \tag{3.8}$$
is a solution of CYBE, where {e1, . . . , en} is a basis of A and {e∗1, . . . , e∗n} is its dual basis.
Remark 3.4. In [3], the map satisfying Eq. (3.6) is called an O-operator. If ρ is the adjoint representation, Eq. (3.6) is also called the operator form of CYBE ([17, 37, etc.]).
Remark 3.5. We would like to point out that, on the Lie algebra G(A) ⋉_{ad∗} G∗(A), the element r given by Eq. (3.8) is not a solution of CYBE, but a solution of the modified CYBE [12]. That is, on G(A) ⋉_{ad∗} G∗(A), r does not satisfy Eq. (3.1), but satisfies
[x ⊗ 1 ⊗ 1 + 1 ⊗ x ⊗ 1 + 1 ⊗ 1 ⊗ x, [r12, r13] + [r12, r23] + [r13, r23]] = 0, ∀x ∈ A. (3.9)
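The last statement of Theorem 3.3 can be sanity-checked numerically on a small example. The sketch below is only an illustration, not part of the original argument. It assumes the 3-dimensional left-symmetric algebra with non-zero products e3 e1 = e2, e3 e2 = −e1 (the algebra used later in Example 4.10), builds the structure constants of G(A) ⋉_{L∗} G∗(A) from ⟨L∗(x)a∗, y⟩ = −⟨a∗, x · y⟩, and evaluates the left-hand side of Eq. (3.1) for the r of Eq. (3.8) as a tensor in G ⊗ G ⊗ G. The names mult, c, r and cybe_residual are hypothetical.

```python
import numpy as np

n = 3
# Left-symmetric product (assumed example): e3 e1 = e2, e3 e2 = -e1.
# mult[i, j, k] is the coefficient of e_k in e_i . e_j (0-based indices).
mult = np.zeros((n, n, n))
mult[2, 0, 1] = 1.0
mult[2, 1, 0] = -1.0

# Structure constants of the semidirect product in the basis
# (e_1, ..., e_n, e_1^*, ..., e_n^*): [E_p, E_q] = sum_s c[p, q, s] E_s.
N = 2 * n
c = np.zeros((N, N, N))
for i in range(n):
    for j in range(n):
        for k in range(n):
            # [e_i, e_j] = e_i . e_j - e_j . e_i
            c[i, j, k] = mult[i, j, k] - mult[j, i, k]
            # [e_i, e_j^*] = L*(e_i) e_j^*, with <L*(x) a^*, y> = -<a^*, x . y>
            c[i, n + j, n + k] = -mult[i, k, j]
            c[n + j, i, n + k] = mult[i, k, j]   # antisymmetry of the bracket

# r = sum_i (-e_i (x) e_i^* + e_i^* (x) e_i), Eq. (3.8);
# r[p, q] is the coefficient of E_p (x) E_q.
r = np.zeros((N, N))
for i in range(n):
    r[i, n + i] = -1.0
    r[n + i, i] = 1.0

# [r12, r13] + [r12, r23] + [r13, r23] in the basis E_p (x) E_q (x) E_s.
cybe_residual = (
    np.einsum('ab,cd,ack->kbd', r, r, c)    # [a_i, a_j] (x) b_i (x) b_j
    + np.einsum('ab,cd,bck->akd', r, r, c)  # a_i (x) [b_i, a_j] (x) b_j
    + np.einsum('ab,cd,bdk->ack', r, r, c)  # a_i (x) a_j (x) [b_i, b_j]
)

print(np.allclose(cybe_residual, 0.0))
```

Since r lies in G ⊗ G, each commutator in Eq. (3.1) only involves Lie brackets of the tensor factors, so the check can be done with structure constants alone and no model of U(G) is needed.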
Corollary 3.6. Let A be a left-symmetric algebra. Then, the symplectic form ω on the phase space T∗G(A) = G(A) ⋉_{L∗} G∗(A) given by Eq. (1.1) satisfies
ω(x, y) = ⟨r−1(x), y⟩, ∀x, y ∈ T∗G(A), (3.10)
where r is given by Eq. (3.8). Therefore, ω is just the 2-cocycle on T∗G(A) induced by r−1.
Proof. By Eq. (3.8), r as a linear map from (T∗G(A))∗ to T∗G(A) satisfies the following equations:
r(e∗i) = e∗i, r(ei) = −ei, i = 1, . . . , n.
Therefore, for any i, j, k, l, we have
⟨r−1(ei + e∗j), ek + e∗l⟩ = ⟨−ei + e∗j, ek + e∗l⟩ = −⟨ei, e∗l⟩ + ⟨ek, e∗j⟩ = ω(ei + e∗j, ek + e∗l).
So, Eq. (3.10) holds.
In summary, there are the following correspondences, which interpret the natural appearance of left-symmetric algebras (in the sense of [1–3]):
{Phase spaces of Lie algebras} ⇐⇒ {Certain non-degenerate classical r-matrices} ⇐⇒ {Left-symmetric algebras}.
4. The Geometry Related to Phase Spaces
Since left-symmetric algebras have close relations with many geometric structures [4–11], in this section, we study some interesting geometric structures related to phase spaces. Throughout this section, the field is the real field R. On the other hand, because any phase space T∗G of a Lie algebra G is still a Lie algebra, the related geometric structures are in fact the corresponding structures on the connected Lie group Ĝ whose Lie algebra is T∗G. Therefore, we only give the notions of the algebraic structures on T∗G, since every such structure can be lifted to a (left-invariant) geometric structure on Ĝ. For example, left-symmetric algebras are the algebraic structures of affine structures on Lie groups.
Proposition 4.1 [7]. Let G be a Lie group whose Lie algebra is G. Then, there exists a left-invariant flat and torsion free connection ∇ (i.e. an affine structure) on G if and only if G is sub-adjacent to a left-symmetric algebra. The correspondence is given by
∇x y = xy, ∀x, y ∈ G. (4.1)
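The equivalence in Proposition 4.1 can be unpacked in two lines. The following is a sketch for left-invariant vector fields identified with elements of the Lie algebra. The symbols T and R (torsion and curvature of ∇) and the associator (x, y, z) = (xy)z − x(yz) follow the usual conventions and are introduced here only for illustration.

```latex
\begin{align*}
T(x,y)  &= \nabla_x y - \nabla_y x - [x,y] = xy - yx - [x,y], \\
R(x,y)z &= \nabla_x \nabla_y z - \nabla_y \nabla_x z - \nabla_{[x,y]} z
         = x(yz) - y(xz) - [x,y]z .
\end{align*}
% Torsion-freeness T = 0 says [x,y] = xy - yx, i.e. the Lie algebra is
% sub-adjacent to the product. Substituting this into R gives
\begin{align*}
R(x,y)z &= x(yz) - y(xz) - (xy)z + (yx)z \\
        &= -\bigl[(xy)z - x(yz)\bigr] + \bigl[(yx)z - y(xz)\bigr]
         = -(x,y,z) + (y,x,z).
\end{align*}
% Hence R = 0 if and only if (x,y,z) = (y,x,z), which is left-symmetry.
```

So flatness of ∇ is exactly the symmetry of the associator in its first two arguments, and torsion-freeness is exactly the statement that the commutator of the product recovers the Lie bracket.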
Corollary 4.2. There exists an affine structure on the connected Lie group whose Lie algebra is a phase space of a Lie algebra.
The following two structures directly come from the study in the previous sections.
(A) Paracomplex structures
Definition 4.3 [38]. Let G be a real Lie algebra. A paracomplex structure on G is a linear endomorphism E : G → G satisfying E² = 1 (1 is the identity map on G) and the integrability condition:
E[x, y] = [Ex, y] + [x, Ey] − E[Ex, Ey], ∀x, y ∈ G. (4.2)
Proposition 4.4. Let A be a left-symmetric algebra. Then, on the phase space T∗G(A) = G(A) ⋉_{L∗} G∗(A), there is a paracomplex structure E : T∗G(A) → T∗G(A) given by
E(x + a∗) = x − a∗, ∀x ∈ G, a∗ ∈ G∗. (4.3)
Proof. It is obvious that E² = 1. For any x, y ∈ A and a∗, b∗ ∈ A∗, we have
E([x + a∗, y + b∗]) = E([x, y] + [x, b∗] − [y, a∗]) = [x, y] − [x, b∗] + [y, a∗];
[E(x + a∗), y + b∗] = [x − a∗, y + b∗] = [x, y] + [x, b∗] + [y, a∗];
[x + a∗, E(y + b∗)] = [x + a∗, y − b∗] = [x, y] − [x, b∗] − [y, a∗];
E[E(x + a∗), E(y + b∗)] = E([x, y] − [x, b∗] + [y, a∗]) = [x, y] + [x, b∗] − [y, a∗].
Therefore,
E([x + a∗, y + b∗]) = [E(x + a∗), y + b∗] + [x + a∗, E(y + b∗)] − E[E(x + a∗), E(y + b∗)].
So E is a paracomplex structure on T∗G(A).
(B) Parakähler structures
Definition 4.5 [39]. Let M be a symplectic manifold with symplectic form ω. Let (F+, F−) be a pair of transversal foliations on M. The triple (M, ω, F±) is called a parakähler manifold if each leaf of F± is a Lagrangian submanifold of M. A parakähler manifold M is called homogeneous if Aut(M) acts transitively on M, where Aut(M) is a finite-dimensional Lie group which consists of all the diffeomorphisms of M preserving both the symplectic structure and the two foliations.
Theorem 4.6 [40]. Let G be a real connected Lie group with Lie algebra G. Then, there exists a G-invariant parakähler structure on G if and only if there exist two subalgebras G± of G and a skew-symmetric bilinear form ω on G such that the following conditions are satisfied:
(1) G = G+ ⊕ G− as the direct sum of vector spaces;
(2) ω is non-degenerate;
(3) ω(G+, G+) = ω(G−, G−) = 0;
(4) ω is a 2-cocycle on G.
Corollary 4.7. Let A be a left-symmetric algebra. Then, there exists a parakähler structure on the connected Lie group whose Lie algebra is the phase space T∗G(A) = G(A) ⋉_{L∗} G∗(A).
However, the following structures are not as direct as those above.
(C) Complex structures and complex product structures
Definition 4.8. Let G be a real Lie algebra. A complex structure on G is a linear endomorphism J : G → G satisfying J² = −1 and the integrability condition:
J[x, y] = [Jx, y] + [x, Jy] + J[Jx, Jy], ∀x, y ∈ G. (4.4)
Proposition 4.9. Let (A, ·) be a left-symmetric algebra with a non-degenerate bilinear form B(, ). Suppose the bilinear form B is invariant in the following sense:
B(x · y, z) + B(y, x · z) = 0, or equivalently, B(Lx(y), z) + B(y, Lx(z)) = 0, ∀x, y, z ∈ A. (4.5)
Let ϕ : A → A∗ be the linear isomorphism induced by B:
B(x, y) = ⟨ϕ(x), y⟩, ∀x, y ∈ A. (4.6)
Then, there exists a complex structure J on the phase space T∗G(A) = G(A) ⋉_{L∗} G∗(A) given as follows:
J(x + a∗) = −ϕ−1(a∗) + ϕ(x), ∀x ∈ A, a∗ ∈ A∗. (4.7)
In particular, if the bilinear form B is symmetric and positive definite, then the complex structure J is given as follows:
J(x + y∗) = −y + x∗, ∀x, y ∈ A, (4.8)
where for any $x = \sum_{i=1}^{n} \lambda_i e_i \in A$, we set $x^* = \sum_{i=1}^{n} \lambda_i e_i^* \in A^*$. Here, {e1, . . . , en} is a basis of A such that B(ei, ej) = δij and {e∗1, . . . , e∗n} is its dual basis.
Proof. It is obvious that J² = −1. Next, we prove the integrability. Note that for any x, y ∈ A, a∗ ∈ A∗, we have
⟨[x, a∗], y⟩ = ⟨L∗(x)a∗, y⟩ = −⟨a∗, x · y⟩; ϕ(x · y) = L∗(x)ϕ(y).
(1) For any x, y, z ∈ A, we have
⟨J([x, y]), z⟩ = ⟨J(x · y − y · x), z⟩ = ⟨ϕ(x · y − y · x), z⟩ = B(x · y − y · x, z);
⟨[Jx, y], z⟩ = ⟨[ϕ(x), y], z⟩ = ⟨ϕ(x), y · z⟩ = B(x, y · z);
⟨[x, Jy], z⟩ = ⟨[x, ϕ(y)], z⟩ = −⟨ϕ(y), x · z⟩ = −B(y, x · z);
⟨J([Jx, Jy]), z⟩ = ⟨J([ϕ(x), ϕ(y)]), z⟩ = 0.
Therefore, by Eq. (4.5), we have J[x, y] = [Jx, y] + [x, Jy] + J[Jx, Jy].
(2) For any a∗, b∗ ∈ A∗, with a similar computation to the above, we know
J([a∗, b∗]) = [Ja∗, b∗] + [a∗, Jb∗] + J([Ja∗, Jb∗]) = 0.
(3) For any x ∈ A, a∗, b∗ ∈ A∗, let â, b̂ ∈ A be such that ϕ(â) = a∗ and ϕ(b̂) = b∗. Then, we have
⟨J([x, a∗]), b∗⟩ = −⟨ϕ−1([x, a∗]), b∗⟩ = −⟨ϕ−1(L∗(x)ϕ(â)), b∗⟩ = −⟨x · â, b∗⟩ = −B(b̂, x · â);
⟨[Jx, a∗], b∗⟩ = ⟨[ϕ(x), a∗], b∗⟩ = 0;
⟨[x, Ja∗], b∗⟩ = ⟨[x, −ϕ−1(a∗)], b∗⟩ = ⟨â · x − x · â, b∗⟩ = B(b̂, â · x − x · â);
⟨J([Jx, Ja∗]), b∗⟩ = ⟨J([ϕ(x), −ϕ−1(a∗)]), b∗⟩ = ⟨J([â, ϕ(x)]), b∗⟩ = −B(b̂, â · x).
Therefore, J([x, a∗]) = [Jx, a∗] + [x, Ja∗] + J([Jx, Ja∗]).
So, J is a complex structure on T ∗ G(A). If the invariant bilinear form B is symmetric and positive definite, then ϕ is nothing but ϕ(ei ) = e∗i , where {e1 , . . . , en } is a basis of A such that B(ei , ej ) = δij and {e∗1 , . . . , e∗n } is its dual basis. Example 4.10. Let A be a 3-dimensional real left-symmetric algebra with a basis {e1 , e2 , e3 } whose non-zero products are given as follows [24]: e3 e1 = e2 ,
e3 e2 = −e1 .
A symmetric bilinear form B(, ) on A is invariant if and only if it satisfies the following condition: B(e1 , e1 ) = B(e2 , e2 ),
B(e1 , e2 ) = B(e1 , e3 ) = B(e2 , e3 ) = 0.
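The two claims used in this example, left-symmetry of the product and the invariance condition (4.5), are both multilinear, so they can be checked on basis vectors. The following is a minimal numerical sketch, assuming only the multiplication table stated above; the names mult, associator_tensor and is_invariant are hypothetical helpers introduced for illustration.

```python
import numpy as np

n = 3
# mult[i, j, k] is the coefficient of e_k in e_i . e_j (0-based indices).
mult = np.zeros((n, n, n))
mult[2, 0, 1] = 1.0    # e3 e1 = e2
mult[2, 1, 0] = -1.0   # e3 e2 = -e1

def associator_tensor(m):
    """assoc[i, j, l, :] holds the coordinates of (e_i e_j) e_l - e_i (e_j e_l)."""
    return (np.einsum('ijk,klm->ijlm', m, m)
            - np.einsum('jlk,ikm->ijlm', m, m))

def is_left_symmetric(m):
    """Left-symmetry: the associator is symmetric in its first two arguments."""
    assoc = associator_tensor(m)
    return bool(np.allclose(assoc, assoc.transpose(1, 0, 2, 3)))

def is_invariant(B, m):
    """Check B(e_i . e_j, e_l) + B(e_j, e_i . e_l) = 0 for all i, j, l (Eq. (4.5))."""
    lhs = (np.einsum('ijk,kl->ijl', m, B)
           + np.einsum('jk,ilk->ijl', B, m))
    return bool(np.allclose(lhs, 0.0))

print(is_left_symmetric(mult))
print(is_invariant(np.eye(n), mult))                 # B(e_i, e_j) = delta_ij
print(is_invariant(np.diag([1.0, 2.0, 1.0]), mult))  # violates B(e1,e1) = B(e2,e2)
```

The third call uses a symmetric form with B(e1, e1) ≠ B(e2, e2), which the stated condition excludes; the value of B(e3, e3) is not constrained by that condition.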
In particular, there exists a symmetric and positive definite invariant bilinear form on A such that B(ei, ej) = δij.
Example 4.11. Let A be a left-symmetric algebra with a symmetric and positive definite invariant bilinear form B. Let {e1, . . . , en} be a basis of A such that B(ei, ej) = δij and {e∗1, . . . , e∗n} be its dual basis. Then, it is easy to see that, on the phase space T∗G(A) = G(A) ⋉_{L∗} G∗(A), there is an invariant bilinear form B̄ given as follows:
B̄(ei, ej) = B̄(e∗i, e∗j) = δij, B̄(ei, e∗j) = B̄(e∗j, ei) = 0,
where the left-symmetric algebra structure on T∗G(A) is given by Eq. (2.8). Moreover, B̄ is symmetric and positive definite.
Next, we give a construction of the left-symmetric algebras with a non-degenerate invariant bilinear form. Recall that a Frobenius algebra A is a commutative associative algebra with a non-degenerate bilinear form B(, ) satisfying [19]
B(xy, z) = B(y, xz), ∀x, y, z ∈ A. (4.9)
The classification of low-dimensional Frobenius algebras is given in [41].
Proposition 4.12. Let (A, ·) be a left-symmetric algebra with a non-degenerate invariant bilinear form B(, ). Let (A′, ◦) be a Frobenius algebra with a non-degenerate bilinear form B′(, ) satisfying Eq. (4.9). Then, there is a left-symmetric algebra structure on the vector space A ⊗ A′ given by
(x ⊗ x′) ∗ (y ⊗ y′) = (x · y) ⊗ (x′ ◦ y′), ∀x, y ∈ A, x′, y′ ∈ A′, (4.10)
and the bilinear form B̄ on A ⊗ A′ given by
B̄(x ⊗ x′, y ⊗ y′) = B(x, y)B′(x′, y′), ∀x, y ∈ A, x′, y′ ∈ A′, (4.11)
is non-degenerate and invariant. In addition, if both B and B′ are symmetric and positive definite, then B̄ is a symmetric and positive definite bilinear form on the left-symmetric algebra A ⊗ A′.
Proof. For any x, y, z ∈ A and x′, y′, z′ ∈ A′, the associator satisfies
(x ⊗ x′, y ⊗ y′, z ⊗ z′) = (x, y, z) ⊗ (x′ ◦ y′ ◦ z′).
Then, A ⊗ A′ is a left-symmetric algebra with the product given by Eq. (4.10). Furthermore, we have
B̄((x ⊗ x′) ∗ (y ⊗ y′), z ⊗ z′) + B̄(y ⊗ y′, (x ⊗ x′) ∗ (z ⊗ z′))
= B(x · y, z)B′(x′ ◦ y′, z′) + B(y, x · z)B′(y′, x′ ◦ z′)
= [B(x · y, z) + B(y, x · z)]B′(x′ ◦ y′, z′) = 0.
Thus, B̄ is invariant on A ⊗ A′. If B and B′ are non-degenerate (or symmetric and positive definite), then it is easy to see that B̄ is also non-degenerate (or symmetric and positive definite).
Definition 4.13 [11]. Let G be a real Lie algebra. A complex product structure on the Lie algebra G is a pair {J, E} of a complex structure J and a paracomplex structure E satisfying JE = −EJ.
The complex product structures on Lie algebras have close relations with the study of hypercomplex and hypersymplectic manifolds [42]. Combining Propositions 4.4 and 4.9 together, we have
Corollary 4.14. Let (A, ·) be a left-symmetric algebra with a non-degenerate invariant bilinear form B(, ). Let ϕ : A → A∗ be the linear isomorphism given by Eq. (4.6). Then, there exists a complex product structure {J, E} on the phase space T∗G(A) = G(A) ⋉_{L∗} G∗(A), where J is given by Eq. (4.7) and E is given by Eq. (4.3).
(D) Kähler structures
Definition 4.15 [43]. Let G be a real Lie algebra. If there exist a complex structure J and a non-degenerate skew-symmetric bilinear form ω (symplectic form) such that the following conditions are satisfied:
(1) ω is a 2-cocycle on G;
(2) ω(J(x), J(y)) = ω(x, y) for any x, y ∈ G; (4.12)
(3) ω(x, J(x)) > 0 for any x ∈ G with x ≠ 0, (4.13)
then {J, ω} is called a Kähler structure on G.
The Kähler structures on Lie algebras are closely related to the study of Kähler Lie groups and Kähler manifolds [9, 10, 43, 44].
Theorem 4.16. Let (A, ·) be a left-symmetric algebra with a symmetric and positive definite invariant bilinear form B(, ). Then, there exists a Kähler structure {−J, ω} on the phase space T∗G(A) = G(A) ⋉_{L∗} G∗(A), where J is given by Eq. (4.8) and ω is given by Eq. (1.1).
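The statements collected in this section (Propositions 4.4 and 4.9, Corollary 4.14 and Theorem 4.16) can be tested numerically on the algebra of Example 4.10 with B(ei, ej) = δij. The sketch below is an illustration under stated assumptions, not the author's construction: it takes ω(x + a∗, y + b∗) = ⟨a∗, y⟩ − ⟨b∗, x⟩, which is the form consistent with the values ω(ei + e∗j, ek + e∗l) = −δil + δjk used in the paper, represents x + a∗ as a coordinate vector of length 2n, and uses L∗(x) = −L(x)ᵀ in the dual basis. All function names are hypothetical.

```python
import numpy as np

n = 3
mult = np.zeros((n, n, n))   # mult[i, j, k]: coefficient of e_k in e_i . e_j
mult[2, 0, 1] = 1.0          # e3 e1 = e2
mult[2, 1, 0] = -1.0         # e3 e2 = -e1

def left_mult(x):
    """Matrix of L(x): y -> x . y in the basis e_1, ..., e_n."""
    return np.einsum('i,ijk->kj', x, mult)

def bracket(u, v):
    """Lie bracket on the phase space; u = (x, a*), v = (y, b*) as coordinates."""
    x, a = u[:n], u[n:]
    y, b = v[:n], v[n:]
    Lx, Ly = left_mult(x), left_mult(y)
    lie_part = Lx @ y - Ly @ x          # [x, y] = x . y - y . x
    dual_part = -Lx.T @ b + Ly.T @ a    # L*(x) b* - L*(y) a*, L*(x) = -L(x)^T
    return np.concatenate([lie_part, dual_part])

def omega(u, v):
    """omega(x + a*, y + b*) = <a*, y> - <b*, x> (assumed form of Eq. (1.1))."""
    return u[n:] @ v[:n] - v[n:] @ u[:n]

def J(u):
    """J(x + y*) = -y + x*, Eq. (4.8)."""
    return np.concatenate([-u[n:], u[:n]])

def E(u):
    """E(x + a*) = x - a*, Eq. (4.3)."""
    return np.concatenate([u[:n], -u[n:]])

def check_structures(trials=100, seed=0):
    """Test each identity on random vectors; returns a dict of booleans."""
    rng = np.random.default_rng(seed)
    ok = dict.fromkeys(['J_integrable', 'E_integrable', 'JE_anticommute',
                        'omega_cocycle', 'omega_J_invariant', 'omega_positive'],
                       True)
    for _ in range(trials):
        u, v, w = rng.normal(size=(3, 2 * n))
        checks = {
            'J_integrable': np.allclose(
                J(bracket(u, v)),
                bracket(J(u), v) + bracket(u, J(v)) + J(bracket(J(u), J(v)))),
            'E_integrable': np.allclose(
                E(bracket(u, v)),
                bracket(E(u), v) + bracket(u, E(v)) - E(bracket(E(u), E(v)))),
            'JE_anticommute': np.allclose(J(E(u)), -E(J(u))),
            'omega_cocycle': np.isclose(
                omega(bracket(u, v), w) + omega(bracket(v, w), u)
                + omega(bracket(w, u), v), 0.0),
            'omega_J_invariant': np.isclose(omega(J(u), J(v)), omega(u, v)),
            'omega_positive': omega(u, -J(u)) > 0,
        }
        for name, passed in checks.items():
            ok[name] = ok[name] and bool(passed)
    return ok

print(check_structures())
```

Random testing of this kind does not replace the proofs. It only guards against sign or index slips when the formulas are transcribed for a concrete algebra.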
Proof. We have proved that ω is a 2-cocycle and J is a complex structure on T∗G(A). Obviously, −J is also a complex structure on T∗G(A). Let {e1, . . . , en} be a basis of A such that B(ei, ej) = δij and {e∗1, . . . , e∗n} be its dual basis. Therefore, for any i, j, k, l, we have
ω(ei + e∗j, ek + e∗l) = −δil + δjk;
ω(−J(ei + e∗j), −J(ek + e∗l)) = ω(ej − e∗i, el − e∗k) = −δil + δjk.
Hence, we have ω(−J(x + a∗), −J(y + b∗)) = ω(x + a∗, y + b∗) for any x, y ∈ A and a∗, b∗ ∈ A∗. Let x ∈ A, a∗ ∈ A∗ and x + a∗ ≠ 0. Set $x = \sum_{i=1}^{n} \lambda_i e_i$, $a^* = \sum_{i=1}^{n} \mu_i e_i^*$. Then, we have
$$\omega(x + a^*, -J(x + a^*)) = \omega\Big(\sum_{i=1}^{n} \lambda_i e_i + \sum_{j=1}^{n} \mu_j e_j^*, \ \sum_{k=1}^{n} \mu_k e_k - \sum_{l=1}^{n} \lambda_l e_l^*\Big) = \sum_{i=1}^{n} \lambda_i^2 + \sum_{j=1}^{n} \mu_j^2 > 0.$$
Therefore, {−J, ω} is a Kähler structure on the phase space T∗G(A).
Remark 4.17. The above conclusion coincides with a result in [44] (cf. [44, Proposition 2.10]).
Combining Corollaries 4.2, 4.7, and 4.14, and Theorem 4.16 together, we have:
Corollary 4.18. Let (A, ·) be a left-symmetric algebra with a symmetric and positive definite invariant bilinear form B(, ). Then, there exists an affine structure ∇, a parakähler structure {G(A), G∗(A), ω}, a complex product structure {J, E} and a Kähler structure {−J, ω} on the phase space T∗G(A) = G(A) ⋉_{L∗} G∗(A).
5. The Construction of the Generalized Phase Spaces
The construction of a phase space as G(A) ⋉_{L∗} G∗(A) from a left-symmetric algebra A is a symplectic double [3]. In this sense, the structure of a phase space T∗G should depend on the structure of G and there is only a “module” structure on the dual space G∗. Thus, the (Lie or left-symmetric) algebraic structure on G∗ is trivial. But, for the notion of a phase space, such a condition is not required a priori. Like the Drinfel’d double [28], it is natural to extend to the case that G∗ has “certain” algebraic structures (that is, the “generalized” phase spaces). The most natural choice is that G∗ is still a Lie subalgebra of the phase space T∗G. In this case, from Remark 2.5, we have:
Corollary 5.1. Let T∗G = G ⊕ G∗ be a phase space with the symplectic form ω. Then, G∗ is a Lie subalgebra if and only if G∗ is also a left-symmetric subalgebra under the product given by Eq. (2.6).
Notice here that the algebraic structure on G∗ (as a Lie algebra or a left-symmetric algebra) may not have a direct and obvious relation with G. Before we
give the general construction, we re-consider the construction in [1] in our present picture.
Theorem 5.2. Let T∗G = G ⊕ G∗ be a phase space of a Lie algebra G. If G∗ is an ideal of T∗G, then G∗ is abelian and hence, as Lie algebras, T∗G is isomorphic to the semidirect product G ⋉_{ad_{T∗G}} G∗. In this case, the phase space T∗G is isomorphic to the phase space G ⋉_{L∗} G∗ which is given by the induced compatible left-symmetric algebra structure on G from ω.
Proof. If G∗ is an ideal, then for any a∗, b∗ ∈ G∗, x ∈ G, we have
[a∗, x], [b∗, x] ∈ G∗ ⇒ ω([a∗, b∗], x) = ω([a∗, x], b∗) + ω(a∗, [b∗, x]) = 0.
Thus, ⟨[a∗, b∗], G⟩ = ω([a∗, b∗], G) = 0. So [a∗, b∗] = 0. Then, T∗G is isomorphic to the semidirect product G ⋉_ρ G∗, where the representation ρ = ad_{T∗G} : G → gl(G∗) is given by ρ(x)(a∗) = ad(x)(a∗) = [x, a∗] for any x ∈ G, a∗ ∈ G∗. In this case, by Theorem 2.4, the left-symmetric algebra structure on the Lie algebra G is given by
ω(x ∗ y, a∗) = −ω(y, [x, a∗]) = −ω(y, ρ(x)(a∗)), ∀x, y ∈ G, a∗ ∈ G∗.
On the other hand, for such a left-symmetric algebra, there is a phase space G ⋉_{L∗} G∗. Let id be the identity transformation on G ⊕ G∗. Then, it is easy to see that id is an isomorphism of phase spaces from T∗G = G ⋉_ρ G∗ to G ⋉_{L∗} G∗ and ρ = L∗.
Together with Proposition 2.12, we have:
Corollary 5.3. Let T∗G be a phase space of a Lie algebra G. If T∗G is an abelian Lie algebra, then it is unique in the sense of isomorphism. In this case, the induced left-symmetric algebra structure given by Eq. (2.6) is trivial, that is, all the products are zero.
Next, we discuss the general cases.
Lemma 5.4 [45]. Let G, H be two Lie algebras. If there exist two representations ρ : G → gl(H) and µ : H → gl(G) satisfying
ρ(x)[a, b] − [ρ(x)a, b] − [a, ρ(x)b] + ρ(µ(a)x)b − ρ(µ(b)x)a = 0, (5.1)
µ(a)[x, y] − [µ(a)x, y] − [x, µ(a)y] + µ(ρ(x)a)y − µ(ρ(y)a)x = 0, (5.2)
for any x, y ∈ G and a, b ∈ H, then there is a Lie bracket on the vector space G ⊕ H given by
[x + a, y + b] = [x, y] + µ(a)y − µ(b)x + [a, b] + ρ(x)b − ρ(y)a, ∀x, y ∈ G, a, b ∈ H. (5.3)
This new Lie algebra is denoted by G ⋈ H. On the other hand, if G and H are Lie subalgebras of a Lie algebra U such that U = G ⊕ H, then there exist representations
ρ : G → gl(H) and µ : H → gl(G) satisfying Eqs. (5.1) and (5.2) so that U = G ⋈ H, where ρ and µ are determined by
[x, a] = −µ(a)x + ρ(x)a, ∀x ∈ G, a ∈ H. (5.4)
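The matched-pair conditions of Lemma 5.4 are easy to test numerically once ρ and µ are fixed. The sketch below is an illustration only. It takes both G and H to be the sub-adjacent Lie algebra of the 3-dimensional left-symmetric algebra of Example 4.10 and chooses ρ(x)a = x · a and µ(a)x = a · x, which is the choice made later in Proposition 5.9 after identifying A∗ with A through the dual basis. The function names are hypothetical.

```python
import numpy as np

n = 3
mult = np.zeros((n, n, n))   # mult[i, j, k]: coefficient of e_k in e_i . e_j
mult[2, 0, 1] = 1.0          # e3 e1 = e2
mult[2, 1, 0] = -1.0         # e3 e2 = -e1

def dot(x, y):
    """Left-symmetric product x . y in coordinates."""
    return np.einsum('i,j,ijk->k', x, y, mult)

def lie(x, y):
    """Sub-adjacent Lie bracket [x, y] = x . y - y . x."""
    return dot(x, y) - dot(y, x)

rho = dot   # rho(x) a = x . a, an action of G on H (assumed choice)
mu = dot    # mu(a) x = a . x, an action of H on G (assumed choice)

def bracket(u, v):
    """Bracket (5.3) on G + H; u = (x, a), v = (y, b) as coordinate vectors."""
    x, a = u[:n], u[n:]
    y, b = v[:n], v[n:]
    g_part = lie(x, y) + mu(a, y) - mu(b, x)
    h_part = lie(a, b) + rho(x, b) - rho(y, a)
    return np.concatenate([g_part, h_part])

def residual_51(x, a, b):
    """Left-hand side of Eq. (5.1)."""
    return (rho(x, lie(a, b)) - lie(rho(x, a), b) - lie(a, rho(x, b))
            + rho(mu(a, x), b) - rho(mu(b, x), a))

def residual_52(a, x, y):
    """Left-hand side of Eq. (5.2)."""
    return (mu(a, lie(x, y)) - lie(mu(a, x), y) - lie(x, mu(a, y))
            + mu(rho(x, a), y) - mu(rho(y, a), x))

def jacobiator(u, v, w):
    """Cyclic sum [[u, v], w] + [[v, w], u] + [[w, u], v]."""
    return (bracket(bracket(u, v), w) + bracket(bracket(v, w), u)
            + bracket(bracket(w, u), v))

def max_violation(trials=100, seed=0):
    """Largest absolute residual of (5.1), (5.2) and the Jacobi identity."""
    rng = np.random.default_rng(seed)
    worst = 0.0
    for _ in range(trials):
        x, y, a, b = rng.normal(size=(4, n))
        u, v, w = rng.normal(size=(3, 2 * n))
        for res in (residual_51(x, a, b), residual_52(a, x, y),
                    jacobiator(u, v, w)):
            worst = max(worst, float(np.max(np.abs(res))))
    return worst

print(max_violation() < 1e-9)
```

Replacing rho or mu by an arbitrary bilinear map will in general break (5.1) or (5.2), and the Jacobi residual then becomes non-zero, which is a quick way to see that the two conditions are what makes (5.3) a Lie bracket.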
Theorem 5.5. Let (A, ·) be a left-symmetric algebra. Suppose there is another left-symmetric algebra structure “◦” on its dual space A∗. Let ω be the symplectic form given by Eq. (1.1). Let the maps ρ = L∗· : A → gl(A∗) and µ = L∗◦ : A∗ → gl(A) be the dual representations of the regular representations of the sub-adjacent Lie algebras G(A) and G(A∗) respectively, that is,
⟨ρ(x)a∗, y⟩ = −⟨x · y, a∗⟩, ⟨µ(a∗)x, b∗⟩ = −⟨a∗ ◦ b∗, x⟩, ∀x, y ∈ A, a∗, b∗ ∈ A∗. (5.5)
If ρ and µ satisfy Eqs. (5.1) and (5.2), then on the vector space G(A) ⊕ G(A∗), there is a Lie algebra structure (that is, G(A) ⋈ G(A∗)) given by Eq. (5.3) such that it is a phase space. On the other hand, every (generalized) phase space can be constructed in the above way.
Proof. If ρ and µ satisfy Eqs. (5.1) and (5.2), then by Lemma 5.4, on the vector space G(A) ⊕ G(A∗), there is a Lie algebra under the Lie bracket (5.3). For any x, y, z ∈ A, a∗, b∗, c∗ ∈ A∗, we have
ω([x + a∗, y + b∗], z + c∗) = ⟨[a∗, b∗] + ρ(x)b∗ − ρ(y)a∗, z⟩ − ⟨[x, y] + µ(a∗)y − µ(b∗)x, c∗⟩
= −⟨x · z, b∗⟩ + ⟨y · z, a∗⟩ + ⟨[a∗, b∗], z⟩ − ⟨[x, y], c∗⟩ + ⟨a∗ ◦ c∗, y⟩ − ⟨b∗ ◦ c∗, x⟩;
ω([y + b∗, z + c∗], x + a∗) = −⟨y · x, c∗⟩ + ⟨z · x, b∗⟩ + ⟨[b∗, c∗], x⟩ − ⟨[y, z], a∗⟩ + ⟨b∗ ◦ a∗, z⟩ − ⟨c∗ ◦ a∗, y⟩;
ω([z + c∗, x + a∗], y + b∗) = −⟨z · y, a∗⟩ + ⟨x · y, c∗⟩ + ⟨[c∗, a∗], y⟩ − ⟨[z, x], b∗⟩ + ⟨c∗ ◦ b∗, x⟩ − ⟨a∗ ◦ b∗, z⟩.
Adding the three expressions and using [x, y] = x · y − y · x and [a∗, b∗] = a∗ ◦ b∗ − b∗ ◦ a∗, all terms cancel. Then, ω is a 2-cocycle on the Lie algebra G(A) ⋈ G(A∗). Therefore, it is a phase space.
On the other hand, let T∗G be a phase space of G. By Theorem 2.4, there exists a left-symmetric algebra structure on T∗G given by Eq. (2.6) such that G and G∗ are left-symmetric subalgebras. Moreover, if we let
[x, a∗] = ρ(x)a∗ − µ(a∗)x, ∀x ∈ G, a∗ ∈ G∗,
then the maps ρ : G → gl(G∗) and µ : G∗ → gl(G) are representations. Moreover, we have
ω([x, a∗], y) = ⟨ρ(x)a∗, y⟩, ω([x, a∗], b∗) = −⟨µ(a∗)x, b∗⟩, ∀x, y ∈ G, a∗, b∗ ∈ G∗.
Therefore, ρ and µ satisfy Eq. (5.5). Furthermore, ρ and µ satisfy Eqs. (5.1) and (5.2), hence as Lie algebras, T∗G ≅ G ⋈ G∗.
Remark 5.6. In the above construction, if the left-symmetric algebra structure on G∗ is trivial, then G∗ is an abelian (Lie) ideal of G(A) ⋈ G(A∗) and in this case, G(A) ⋈ G(A∗) ≅ G(A) ⋉_{L∗} G∗(A).
Corollary 5.7. Let (A, ·) be a left-symmetric algebra. Suppose there is another left-symmetric algebra structure “◦” on its dual space A∗. If there is a phase space G(A) ⋈ G(A∗) given by Theorem 5.5, then there is a compatible left-symmetric algebra structure on G(A) ⋈ G(A∗) given as follows: for any x, y ∈ A, a∗, b∗ ∈ A∗,
$$x * y = x \cdot y, \quad a^* * b^* = a^* \circ b^*, \quad x * a^* = l_A(x)a^* + r_{A^*}(a^*)x, \quad a^* * x = l_{A^*}(a^*)x + r_A(x)a^*, \tag{5.6}$$
where $l_A, r_A : A \to gl(A^*)$ and $l_{A^*}, r_{A^*} : A^* \to gl(A)$ are linear maps defined by
$$\langle l_A(x)a^*, y \rangle = -\langle x \cdot y - y \cdot x, a^* \rangle, \quad \langle r_A(x)a^*, y \rangle = \langle a^*, y \cdot x \rangle; \tag{5.7}$$
$$\langle l_{A^*}(a^*)x, b^* \rangle = -\langle a^* \circ b^* - b^* \circ a^*, x \rangle, \quad \langle r_{A^*}(a^*)x, b^* \rangle = \langle x, b^* \circ a^* \rangle. \tag{5.8}$$
Moreover, $l_A$, $l_{A^*}$ are representations of G(A) and G(A∗), respectively, and
$$\rho = l_A - r_A, \quad \mu = l_{A^*} - r_{A^*}, \tag{5.9}$$
where ρ and µ are given by Eq. (5.5).
Example 5.8. We consider the case A = F and A∗ = F. Let e and e∗ denote the basis of A and A∗, respectively, with ⟨e, e∗⟩ = 1. The representations ρ and µ defined by Eq. (5.5) are given as follows:
ρ(e)(e∗) = −e∗, µ(e∗)(e) = −e.
It is easy to check that ρ and µ satisfy Eqs. (5.1) and (5.2). Therefore, F ⋈ F is a phase space and it is isomorphic to ⟨e, e∗ | [e, e∗] = e − e∗⟩ as Lie algebras. Obviously, it is not isomorphic to the phase space F ⋉_{L∗} F∗ given in Example 2.9, although they are isomorphic as Lie algebras.
However, in general, it is not easy to construct non-trivial examples by Theorem 5.5. In the following, we give a class of examples satisfying the conditions in Theorem 5.5.
Proposition 5.9. Let (A, ·) be a left-symmetric algebra with a symmetric and positive definite invariant bilinear form B(, ). Let {e1, . . . , en} be a basis of A such that B(ei, ej) = δij and {e∗1, . . . , e∗n} be its dual basis. Suppose that the left-symmetric algebra structure on A∗ is the same as that on A in the following sense:
x∗ · y∗ = (x · y)∗, ∀x, y ∈ A, (5.10)
where for any $x = \sum_{i=1}^{n} \lambda_i e_i \in A$, we set $x^* = \sum_{i=1}^{n} \lambda_i e_i^* \in A^*$. Then, there exists a Lie bracket on the vector space A ⊕ A∗ given by
[x + a∗, y + b∗] = x · y − y · x + a · y − b · x + (a · b − b · a + x · b − y · a)∗, ∀x, y, a, b ∈ A, (5.11)
which is G(A) ⋈ G(A∗) such that it is a phase space with the symplectic form ω given by Eq. (1.1).
Proof. Let ρ : A → gl(A∗) and µ : A∗ → gl(A) be given by the regular representation L, that is,
ρ(x)(a∗) = (x · a)∗, µ(a∗)(x) = a · x, ∀x, a ∈ A.
Then, by Eq. (4.5), it is easy to check that ρ and µ satisfy Eq. (5.5). Moreover, due to the left-symmetry of A, ρ and µ also satisfy Eqs. (5.1) and (5.2). Notice that in this case Eq. (5.11) is nothing but Eq. (5.3). Hence, by Theorem 5.5, G(A) ⋈ G(A∗) is a phase space.
At the end of this paper, we would like to point out that we can also study the corresponding geometric structures related to the generalized phase spaces, similar to the discussion in Sec. 4.
Acknowledgments
The author thanks Professors P. Etingof, I.M. Gel’fand, and B.A. Kupershmidt for important suggestions and great encouragement. The author also thanks Professors J. Lepowsky, Y.Z. Huang and H.S. Li for the hospitality extended to him during his stay at Rutgers, The State University of New Jersey and for valuable discussions. This work was supported in part by the S.S. Chern Foundation for Mathematical Research, the National Natural Science Foundation of China, Program for New Century Excellent Talents in University and the K.C. Wong Education Foundation.
References
[1] B. A. Kupershmidt, Non-abelian phase spaces, J. Phys. A 27 (1994) 2801–2810.
[2] B. A. Kupershmidt, On the nature of the Virasoro algebra, J. Nonlinear Math. Phys. 6 (1999) 222–245.
[3] B. A. Kupershmidt, What a classical r-matrix really is, J. Nonlinear Math. Phys. 6 (1999) 448–488.
[4] E. B. Vinberg, Convex homogeneous cones, Transl. Moscow Math. Soc. 12 (1963) 340–403.
[5] L. Auslander, Simply transitive groups of affine motions, Amer. J. Math. 99 (1977) 809–826.
[6] H. Kim, Complete left-invariant affine structures on nilpotent Lie groups, J. Differential Geom. 24 (1986) 373–394.
[7] A. Medina, Flat left-invariant connections adapted to the automorphism structure of a Lie group, J. Differential Geom. 16 (1981) 445–474.
[8] B. Y. Chu, Symplectic homogeneous spaces, Trans. Amer. Math. Soc. 197 (1974) 145–159.
[9] A. Lichnerowicz and A. Medina, On Lie groups with left invariant symplectic or Kählerian structures, Lett. Math. Phys. 16 (1988) 225–235.
[10] J. M. Dardié and A. Medina, Double extension symplectique d’un groupe de Lie symplectique, Adv. Math. 117 (1996) 208–227.
[11] A. Andrada and S. Salamon, Complex product structures on Lie algebras, arXiv:math.DG/0305102.
[12] M. Bordemann, Generalized Lax pairs, the modified classical Yang–Baxter equation, and affine geometry of Lie groups, Comm. Math. Phys. 135 (1990) 201–216.
[13] A. Winterhalder, Linear Nijenhuis-tensors and the construction of integrable systems, arXiv:physics/9709008.
[14] A. Diatta and A. Medina, Classical Yang–Baxter equation and left-invariant affine geometry on Lie groups, arXiv:math.DG/0203198.
[15] P. Etingof and A. Soloviev, Quantization of geometric classical r-matrices, Math. Res. Lett. 6 (1999) 223–228.
[16] P. Etingof, T. Schedler and A. Soloviev, Set-theoretical solutions to the quantum Yang–Baxter equation, Duke Math. J. 100 (1999) 169–209.
[17] I. Z. Golubschik and V. V. Sokolov, Generalized operator Yang–Baxter equations, integrable ODEs and nonassociative algebras, J. Nonlinear Math. Phys. 7 (2000) 184–197.
[18] S. I. Svinolupov and V. V. Sokolov, Vector-matrix generalizations of classical integrable equations, Theoret. and Math. Phys. 100 (1994) 959–962.
[19] A. A. Balinskii and S. P. Novikov, Poisson brackets of hydrodynamic type, Frobenius algebras and Lie algebras, Soviet Math. Dokl. 32 (1985) 228–231.
[20] B. A. Dubrovin and S. P. Novikov, On Poisson brackets of hydrodynamic type, Soviet Math. Dokl. 30 (1984) 651–654.
[21] I. M. Gel’fand and I. Ya. Dorfman, Hamiltonian operators and algebraic structures related to them, Funct. Anal. Appl. 13 (1979) 248–262.
[22] C. M. Bai and D. J. Meng, The classification of Novikov algebras in low dimensions, J. Phys. A 34 (2001) 1581–1594.
[23] C. M. Bai and D. J. Meng, On the realization of transitive Novikov algebras, J. Phys. A 34 (2001) 1581–1594.
[24] C. M. Bai and D. J. Meng, Bilinear forms on Novikov algebras, Int. J. Theor. Phys. 41 (2002) 495–502.
[25] C. M. Bai and D. J. Meng, A Lie algebraic approach to Novikov algebras, J. Geo. Phys. 45 (2003) 218–230.
[26] F. Chapoton and M. Livernet, Pre-Lie algebras and the rooted trees operad, Int. Math. Res. Not. 8 (2001) 395–408.
[27] A. Connes and D. Kreimer, Hopf algebras, renormalization and noncommutative geometry, Comm. Math. Phys. 199 (1998) 203–242.
[28] V. Drinfel’d, Hamiltonian structure on the Lie groups, Lie bialgebras and the geometric sense of the classical Yang–Baxter equations, Soviet Math. Dokl. 27 (1983) 68–71.
[29] C. M. Bai, Left-symmetric algebras, bijective 1-cocycles and classical Yang–Baxter equation, preprint (2003).
[30] C. M. Bai, Left-symmetric algebras from linear functions, J. Algebra 281 (2004) 651–665.
[31] O. Baues, Left-symmetric algebras for gl(n), Trans. Amer. Math. Soc. 351 (1999) 2979–2996.
[32] D. Burde, Left-invariant affine structures on reductive Lie groups, J. Algebra 181 (1996) 884–902.
[33] D. Burde, Simple left-symmetric algebras with solvable Lie algebra, Manuscripta Math. 95 (1998) 397–411.
[34] N. Jacobson, Lie Algebras (Interscience, New York, 1962).
[35] A. A. Belavin and V. G. Drinfel’d, Solutions of classical Yang–Baxter equation for simple Lie algebras, Funct. Anal. Appl. 16 (1982) 159–180.
[36] V. Chari and A. Pressley, A Guide to Quantum Groups (Cambridge University Press, Cambridge, 1994).
[37] M. A. Semenov-Tian-Shansky, What is a classical R-matrix? Funct. Anal. Appl. 17 (1983) 259–272.
[38] S. Kaneyuki and M. Kozai, Paracomplex structures and affine symmetric spaces, Tokyo J. Math. 8 (1985) 81–98.
[39] P. Libermann, Sur le problème d’équivalence de certaines structures infinitésimales, Ann. Mat. Pura Appl. 36 (1954) 27–120.
[40] S. Kaneyuki, Homogeneous symplectic manifolds and dipolarizations in Lie algebras, Tokyo J. Math. 15 (1992) 313–325.
[41] C. M. Bai and D. J. Meng, Addendum: The classification of Novikov algebras in low dimensions: Invariant bilinear forms, J. Phys. A 34 (2001) 8193–8197.
[42] M. Barberis, Hypercomplex structures on four-dimensional Lie groups, Proc. Amer. Math. Soc. 125 (1997) 1043–1054.
[43] D. McDuff and D. Salamon, Introduction to Symplectic Topology (Clarendon Press, Oxford, 1998).
[44] J. M. Dardié and A. Medina, Algèbres de Lie kählériennes et double extension, J. Algebra 185 (1995) 774–795.
[45] S. Majid, Matched pairs of Lie groups associated to solutions of the Yang–Baxter equations, Pacific J. Math. 141 (1990) 311–332.
August 5, 2006 21:35 WSPC/148-RMP
J070-00272
Reviews in Mathematical Physics Vol. 18, No. 5 (2006) 565–594
© World Scientific Publishing Company
SCALING ALGEBRAS AND SUPERSELECTION SECTORS: STUDY OF A CLASS OF MODELS
CLAUDIO D’ANTONI∗ and GERARDO MORSELLA†
∗Dipartimento di Matematica, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica, I-00133 Roma, Italy
[email protected]
†Istituto Nazionale d’Alta Matematica “Francesco Severi”, and Dipartimento di Matematica, Università di Roma “La Sapienza”, P.le Aldo Moro 2, I-00185 Roma, Italy
[email protected]
Received 14 February 2006
Revised 27 June 2006
We analyze a class of quantum field theory models illustrating some of the possibilities that have emerged in the general study of the short distance properties of superselection sectors, performed in a previous paper (together with R. Verch). In particular, we show that for each pair (G, N), with G a compact Lie group and N a closed normal subgroup, there is a net of observable algebras which has (a subset of) DHR sectors in 1-1 correspondence with classes of irreducible representations of G, and such that only the sectors corresponding to representations of G/N are preserved in the scaling limit. In the way of achieving this result, we derive sufficient conditions under which the scaling limit of a tensor product theory coincides with the product of the scaling limit theories.
Keywords: Scaling algebra; superselection sectors; nuclearity.
Mathematics Subject Classification 2000: 81T05, 46L60, 46N50
1. Introduction
The scaling algebra concept has been introduced in [1], in an attempt to make available, in the framework of the algebraic approach to quantum field theory [2], the methods of the renormalization group, which have proved very useful in analyzing the short distance behaviour of quantum field theory in the conventional approach. The elements of the scaling algebra are functions of a scaling parameter λ > 0 taking values in the algebra of local observables of the theory under consideration, each such function representing the orbit λ → Rλ(A) of an arbitrary observable A under a family (Rλ)λ>0 of renormalization group transformations, whose choice is only restricted by the requirement that such orbits have a “phase space occupation” which is independent of the scale λ, i.e. that the operators Rλ(A) are
localized in regions of radius proportional to $\lambda$ and have energy-momentum transfer proportional to $\lambda^{-1}$. The information about the short distance (or, equivalently, high energy) properties of the given theory (to which we will refer, from now on, as the underlying theory) is then obtained by studying the vacuum expectation values of such functions in the $\lambda \to 0$ limit, and is encoded in a new net of local observables, called the scaling limit of the underlying net.

One of the major achievements of these methods has been the formulation of an intrinsic notion of charge confinement [3], not suffering from the ambiguities of the conventional one, which relies on the assignment of a physical interpretation to the unobservable fields in terms of which the theory is described (whose choice is of course highly non-unique). According to this new confinement notion, the underlying theory describes confined charges if the corresponding scaling limit theory has superselection sectors$^{a}$ which are not, at the same time, sectors of the underlying theory itself. An example of such a situation is provided by the Schwinger model (massless QED in two spacetime dimensions), which has trivial superselection structure at finite scales, but whose scaling limit theory exhibits nontrivial sectors [3, 5].

In order for this concept to be applied to a general theory, one needs a canonical way of comparing the superselection structures of the underlying theory and of the scaling limit one. With this aim in mind, a general study of the short distance properties of charged fields and of superselection sectors (of both DHR and BF types) has been performed in [6] (see also [7]), where the scaling algebra and scaling limit concepts are extended to the nets of charge carrying fields localized in double cones or in spacelike cones (depending on the kind of sector with which these fields are associated), and are then used to formulate a notion of “charge preservation” in the scaling limit.
In such a way, the confined sectors of the underlying theory are identified with those sectors of the scaling limit theory which do not arise as limits of preserved sectors of the underlying theory [7]. For the convenience of the reader, we will give an account of some of the main results of this work in Sec. 2 below.

In the present paper, we study a class of quantum field theory models which exhibit both preserved and non-preserved DHR sectors, therefore providing an illustration of the general analysis of [6]. More precisely, for each pair $(G, N)$ consisting of a compact Lie group $G$ and of a closed normal subgroup $N \subset G$, we construct a local net $\mathcal{A}$, satisfying the standard assumptions, which has (a subset of) DHR sectors labeled by the equivalence classes of unitary irreducible representations of $G$, and such that precisely the sectors corresponding to representations which are trivial on $N$ (i.e. representations which factorize through $G/N$) are preserved according to [6], cf. Theorem 4.6. Similarly to [8], the net $\mathcal{A}$ is obtained as the fixed point net $\mathcal{A} = \mathcal{F}^G$ of a suitable field net $\mathcal{F}$ which carries an action of $G$, and in turn $\mathcal{F}$ is defined as a tensor product $\mathcal{F} = \mathcal{F}_1 \otimes \mathcal{F}_2$, where $\mathcal{F}_1$ is a net with trivial scaling limit constructed using results in [9] and generated by fields which carry the charges corresponding to representations of $G$ which are nontrivial on $N$, while $\mathcal{F}_2$ is a free field net which has $G/N$ as gauge group. The above mentioned result then amounts to showing that: (i) thanks to the fact that $\mathcal{F}_1$ has trivial scaling limit, the scaling limit of $\mathcal{F}$ coincides with the scaling limit $\mathcal{F}_{2,0}$ of $\mathcal{F}_2$, and that (ii) the scaling limit net $\mathcal{F}_{2,0}$ again has $G/N$ as its gauge group, and the corresponding sectors all comply with the preservation condition formulated in [6].

In order to establish point (i), we will derive, in Sec. 3, sufficient conditions under which the operations of scaling limit and of forming the tensor product of two theories can be interchanged. Not surprisingly, the main assumption which we employ is that of “asymptotic nuclearity”, Definition 3.1, which was formulated in [10], and which plays a role here in allowing one to approximate functions in the scaling algebra of the tensor product theory by finite sums of “simple tensors” of the form $\lambda \to \underline{F}^1_\lambda \otimes \underline{F}^2_\lambda$, with the $\underline{F}^i$ in the scaling algebras of the factor theories.

The proof of point (ii) above is then obtained in Sec. 4 by combining this result about the scaling limit of product theories with the computation of the scaling limit of the free scalar field in [5]. Together with a result in [6], this also implies that the equivalence of local and global intertwiners holds for any theory generated by a finite number of multiplets of free scalar fields of arbitrary masses transforming under irreducible representations of a compact gauge group, Corollary 4.5.

$^{a}$ We refer the reader to [2, 4] for a comprehensive account of the theory of superselection sectors.
2. Scaling Algebras for Charged Fields and Preservation of DHR Sectors

For the paper to be reasonably self-contained, and in order to establish our notation, we give in the present section an exposition of the main results of [6] concerning the short distance analysis of DHR superselection sectors. We refer the interested reader to the original paper for more details and discussions of definitions and results.

By a quantum field theory with gauge action (QFTGA in the following) we mean a quintuple $(\mathcal{F}, U, V, \Omega, k)$ such that:

(i) $O \to \mathcal{F}(O)$ is a net of von Neumann algebras on open double cones in $d$-dimensional Minkowski spacetime ($d = 3, 4$) acting irreducibly on a Hilbert space $\mathcal{H}$ with scalar product $\langle \cdot, \cdot \rangle$;
(ii) $U$ is a unitary strongly continuous representation on $\mathcal{H}$ of the translation group $\mathbb{R}^d$, satisfying the spectrum condition, i.e. the spectrum of $U$ is contained in the closed forward light cone, and with respect to which the net $\mathcal{F}$ is covariant,
\[ U(x)\mathcal{F}(O)U(x)^* = \mathcal{F}(O + x), \qquad x \in \mathbb{R}^d; \]
we set $\alpha_x := \operatorname{Ad} U(x)$;
(iii) $V$ is a unitary strongly continuous representation on $\mathcal{H}$ of a compact gauge group $G$, which acts locally on $\mathcal{F}$,
\[ V(g)\mathcal{F}(O)V(g)^* = \mathcal{F}(O), \qquad g \in G, \]
and which commutes with $U$; we set $\beta_g := \operatorname{Ad} V(g)$, and the subnet of $G$-fixed points
\[ \mathcal{A}(O) := \mathcal{F}(O)^G := \{F \in \mathcal{F}(O) : \beta_g(F) = F \ \forall\, g \in G\} \]
is the net of observables determined by $(\mathcal{F}, U, V, \Omega, k)$;
(iv) $\Omega \in \mathcal{H}$ is the vacuum vector, i.e. it is the unique translation invariant unit vector in $\mathcal{H}$, and it is cyclic for the quasi-local algebra $\mathfrak{F} := \overline{\bigcup_O \mathcal{F}(O)}$ (closure in the uniform topology on $B(\mathcal{H})$); $\Omega$ is also gauge invariant, and we denote by $\omega := \langle \Omega, (\cdot)\Omega \rangle$ the vacuum state;
(v) $k \in Z(G)$, $k^2 = e$, is the element defining the $\mathbb{Z}_2$ grading according to which elements in the quasi-local algebra $\mathfrak{F}$ satisfy normal commutation relations, i.e. with
\[ F_\pm := \tfrac{1}{2}\big(F \pm \beta_k(F)\big), \qquad F \in \mathfrak{F}, \]
and with $F_i \in \mathcal{F}(O_i)$, $i = 1, 2$, $O_1$ and $O_2$ spacelike separated, one has
\[ F_{1,+}F_{2,+} = F_{2,+}F_{1,+}, \qquad F_{1,+}F_{2,-} = F_{2,-}F_{1,+}, \qquad F_{1,-}F_{2,-} = -F_{2,-}F_{1,-}. \]
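As a consistency check of item (v), the following short computation (a sketch, using only $k^2 = e$ and the fact that $\beta_k$ is an automorphism) shows that $F_\pm$ are the even and odd parts of $F$ under $\beta_k$, and rewrites the three relations as one graded commutation rule. The parity symbol $|\sigma|$ is introduced here only for illustration and does not appear in the original text.

\[ F = F_+ + F_-, \qquad \beta_k(F_\pm) = \tfrac{1}{2}\big(\beta_k(F) \pm \beta_{k^2}(F)\big) = \tfrac{1}{2}\big(\beta_k(F) \pm F\big) = \pm F_\pm, \]
\[ F_{1,\sigma_1}F_{2,\sigma_2} = (-1)^{|\sigma_1||\sigma_2|}\,F_{2,\sigma_2}F_{1,\sigma_1}, \qquad \sigma_1, \sigma_2 \in \{+,-\}, \quad |{+}| := 0, \ |{-}| := 1. \]

The mixed case $\sigma_1 = -$, $\sigma_2 = +$ is not listed separately in (v); it follows from the second relation by exchanging the roles of $O_1$ and $O_2$.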
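When there is no risk of confusion, we will indicate the QFTGA $(\mathcal{F}, U, V, \Omega, k)$ simply by $\mathcal{F}$. For simplicity, we assumed here that we are dealing only with translation covariant nets, but most of the results in the present and following sections also hold for Poincaré covariant nets, i.e. QFTGAs $(\mathcal{F}, U, V, \Omega, k)$ for which $U$ is actually a unitary representation of the universal covering $\tilde{\mathcal{P}}_+^\uparrow$ of the proper orthochronous Poincaré group, such that $U(\Lambda, x)\mathcal{F}(O)U(\Lambda, x)^* = \mathcal{F}(\Lambda O + x)$. The notation $\alpha_{(\Lambda,x)} := \operatorname{Ad} U(\Lambda, x)$ will be used in this case also.

The scaling algebra associated to $\mathcal{F}$ is defined in the following way. On the C$^*$-algebra $B(\mathbb{R}_+, \mathfrak{F})$ of all norm bounded functions $\lambda \in \mathbb{R}_+ \to \underline{F}_\lambda \in \mathfrak{F}$, with the natural C$^*$-norm $\|\underline{F}\| = \sup_{\lambda>0}\|\underline{F}_\lambda\|$, we define automorphic actions $\underline{\alpha}$ of $\mathbb{R}^d$ and $\underline{\beta}$ of $G$ by
\[ \underline{\alpha}_x(\underline{F})_\lambda := \alpha_{\lambda x}(\underline{F}_\lambda), \qquad \underline{\beta}_g(\underline{F})_\lambda := \beta_g(\underline{F}_\lambda), \qquad x \in \mathbb{R}^d, \ g \in G, \ \lambda > 0. \]
The local scaling algebra of the double cone $O$ is then the C$^*$-algebra $\underline{\mathfrak{F}}(O)$ of all the functions $\underline{F} \in B(\mathbb{R}_+, \mathfrak{F})$ such that $\underline{F}_\lambda \in \mathcal{F}(\lambda O)$ for each $\lambda > 0$, and
\[ \lim_{x \to 0}\|\underline{\alpha}_x(\underline{F}) - \underline{F}\| = 0, \qquad \lim_{g \to e}\|\underline{\beta}_g(\underline{F}) - \underline{F}\| = 0. \tag{2.1} \]
We will denote by $\underline{\mathfrak{F}}$ both the net $O \to \underline{\mathfrak{F}}(O)$ and the associated quasi-local C$^*$-algebra.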
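To illustrate how the rescaled translations fit the localization requirement, here is a short check (a sketch, not part of the original argument) that they map the local scaling algebra of $O$ onto that of $O + x$. We write $\underline{F}$ for an element of the local scaling algebra $\underline{\mathfrak{F}}(O)$ and $\underline{\alpha}_x$ for the rescaled translation defined above; only the covariance of the net $\mathcal{F}$ is used.

\[ \underline{F} \in \underline{\mathfrak{F}}(O) \ \Longrightarrow \ \underline{\alpha}_x(\underline{F})_\lambda = \alpha_{\lambda x}(\underline{F}_\lambda) \in \alpha_{\lambda x}\big(\mathcal{F}(\lambda O)\big) = \mathcal{F}(\lambda O + \lambda x) = \mathcal{F}\big(\lambda(O + x)\big), \qquad \lambda > 0. \]

Since $\underline{\alpha}_x$ is isometric and commutes with every $\underline{\alpha}_y$ and $\underline{\beta}_g$ (recall that $V$ commutes with $U$), the continuity conditions (2.1) are preserved, so $\underline{\alpha}_x(\underline{\mathfrak{F}}(O)) \subseteq \underline{\mathfrak{F}}(O + x)$; applying the same argument to $\underline{\alpha}_{-x}$ gives equality.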
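Let $\varphi$ be a locally normal state of $\mathcal{F}$, and define a net of states $(\underline{\varphi}_\lambda)_{\lambda>0}$ on $\underline{\mathfrak{F}}$ by $\underline{\varphi}_\lambda(\underline{F}) := \varphi(\underline{F}_\lambda)$. We denote by $\mathrm{SL}^{\mathcal{F}}(\varphi)$ the set of weak$^*$ limit points of $(\underline{\varphi}_\lambda)_{\lambda>0}$ for $\lambda \to 0$. From an argument due to Roberts [11], it follows that for any pair $\varphi_1, \varphi_2$ of locally normal states on $\mathcal{F}$, there holds
\[ \lim_{\lambda \to 0}\big\|(\varphi_1 - \varphi_2)\upharpoonright\mathcal{F}(\lambda O)\big\| = 0, \tag{2.2} \]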
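and then $\mathrm{SL}^{\mathcal{F}}(\varphi)$ is actually independent of $\varphi$, and is called the set of scaling limit states of $\mathcal{F}$. It easily follows that any $\underline{\omega}_0 \in \mathrm{SL}^{\mathcal{F}}$ is $\underline{\alpha}$- and $\underline{\beta}$-invariant and then, if $(\pi_0, \mathcal{H}_0, \Omega_0)$ is the corresponding GNS representation, by defining the net of von Neumann algebras $\mathcal{F}_0(O) := \pi_0(\underline{\mathfrak{F}}(O))''$, and the representations $U_0$ of $\mathbb{R}^d$ and $V_0$ of $G_0 := G/N_0$ (where $N_0 := \{g \in G : \pi_0(\underline{\beta}_g(\underline{F}) - \underline{F})\Omega_0 = 0, \ \forall\, \underline{F} \in \underline{\mathfrak{F}}\}$) by
\[ U_0(x)\pi_0(\underline{F})\Omega_0 := \pi_0(\underline{\alpha}_x(\underline{F}))\Omega_0, \qquad V_0(gN_0)\pi_0(\underline{F})\Omega_0 := \pi_0(\underline{\beta}_g(\underline{F}))\Omega_0, \]
one gets that $(\mathcal{F}_0, U_0, V_0, \Omega_0, kN_0)$ is a QFTGA such that $\mathcal{A}_0(O) := \pi_0(\underline{\mathfrak{A}}(O))'' = \mathcal{F}_0(O)^{G_0}$, $\underline{\mathfrak{A}}$ being the scaling algebra for the observable net $\mathcal{A}$ defined in [1].

Remark. We note, for future reference, that if the net $\mathcal{F}$ is Poincaré covariant, then also the nets $\underline{\mathfrak{F}}$ and $\mathcal{F}_0$ can be made Poincaré covariant by extending $\underline{\alpha}$ and $U_0$ to $\tilde{\mathcal{P}}_+^\uparrow$ by
\[ \underline{\alpha}_{(\Lambda,x)}(\underline{F})_\lambda := \alpha_{(\Lambda,\lambda x)}(\underline{F}_\lambda), \qquad U_0(\Lambda, x)\pi_0(\underline{F})\Omega_0 := \pi_0(\underline{\alpha}_{(\Lambda,x)}(\underline{F}))\Omega_0, \]
but, in general, the function $\Lambda \to U_0(\Lambda, x)$ will not be strongly continuous, since, at variance with what is done in [6], we do not require here that condition (2.1) be satisfied for the extended $\underline{\alpha}$.

If we now assume that $\mathcal{F}$ is the covariant field net arising from a net of local observables $\mathcal{A}$ through the Doplicher–Roberts reconstruction theorem [12], we can define the notion of preservation of DHR sectors in the scaling limit. We first recall that to any (finite statistics, covariant) sector $\xi$ of $\mathcal{A}$ we can associate, for any double cone $O$, a multiplet of class $\xi$ of field operators, i.e. elements $\psi_j \in \mathcal{F}(O)$, $j = 1, \dots, d$, with $d$ the statistical dimension of $\xi$, such that
\[ \psi_i^*\psi_j = \delta_{ij}\mathbf{1}, \qquad \sum_{j=1}^{d}\psi_j\psi_j^* = \mathbf{1}, \qquad \beta_g(\psi_i) = \sum_{j=1}^{d}\psi_j\,v_\xi(g)_{ji}, \]
where $v_\xi$ is a unitary irreducible representation of $G$ in the class associated to the sector $\xi$. We will then say that a finite statistics, covariant sector $\xi$ of $\mathcal{A}$ is preserved in the scaling limit state $\underline{\omega}_0$ if for each double cone $O_1$ and each $\lambda > 0$, it is possible to find a multiplet of class $\xi$, $\psi_j(\lambda) \in \mathcal{F}(\lambda O_1)$, $j = 1, \dots, d$, such that for each $\varepsilon > 0$,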
each double cone $O$ containing the closure of $O_1$ and each $j = 1, \dots, d$, there exist scaling algebra elements $\underline{F}, \underline{F}' \in \underline{\mathfrak{F}}(O)$ for which
\[ \limsup_{\kappa}\Big(\big\|[\psi_j(\lambda_\kappa) - \underline{F}_{\lambda_\kappa}]\Omega\big\| + \big\|[\psi_j(\lambda_\kappa) - \underline{F}'_{\lambda_\kappa}]^*\Omega\big\|\Big) < \varepsilon, \tag{2.3} \]
where $(\lambda_\kappa)_{\kappa \in K} \subset \mathbb{R}_+$ is a net such that $\underline{\omega}_0 = \lim_\kappa \underline{\omega}_{\lambda_\kappa}$.

As discussed at length in [6], the restriction that the above condition imposes on the sector $\xi$ is essentially that the states $\psi_j(\lambda)\Omega$, which represent a charge $\xi$ roughly localized in the region $\lambda O$, should have energy-momentum scaling not faster than $\lambda^{-1}$, and this corresponds to the physical picture that a preserved charge should be “pointlike”, and therefore its phase space occupation should only be restricted by the Heisenberg principle, as opposed to a charge with some “internal structure” which requires a surplus of energy in order to be localized in small regions.

In order to state the consequences of such a notion of charge preservation, we introduce here a notation which will also be useful in the following. For a bounded function $\lambda \in \mathbb{R}_+ \to F_\lambda \in \mathcal{F}(\lambda O)$ and functions $h \in L^1(\mathbb{R}^d)$, $\psi \in L^1(G)$, we set
\[ (\underline{\alpha}_h F)_\lambda := \int_{\mathbb{R}^d} dx\, h(x)\,\alpha_{\lambda x}(F_\lambda), \qquad (\underline{\beta}_\psi F)_\lambda := \int_G dg\, \psi(g)\,\beta_g(F_\lambda), \tag{2.4} \]
where $dg$ is the normalized Haar measure on $G$ and the integrals are understood in a weak sense. It is easy to verify that $G_\lambda := (\underline{\beta}_\psi F)_\lambda$ is such that
\[ \lim_{g \to e}\,\sup_{\lambda > 0}\|\beta_g(G_\lambda) - G_\lambda\| = 0, \tag{2.5} \]
and for any function $\lambda \to G_\lambda$ satisfying this condition $\underline{\alpha}_h G \in \underline{\mathfrak{F}}$, and $\underline{\alpha}_h G \in \underline{\mathfrak{F}}(\hat{O})$ if $\hat{O} \supset O + \operatorname{supp} h$.

If the sector $\xi$ is preserved in the state $\underline{\omega}_0$ and $\psi_j(\lambda)$ is a multiplet satisfying (2.3), we obtain that for each $\delta$-sequence $(h_n)_{n \in \mathbb{N}}$, the limit
\[ \boldsymbol{\psi}_j := s^*\text{-}\lim_{n \to +\infty}\pi_0(\underline{\alpha}_{h_n}\psi_j) \]
exists in the strong$^*$ operator topology, is independent of the chosen $\delta$-sequence, and defines a multiplet of class $\xi$ in $\mathcal{F}_0(O)$ (in the sense that the representation $v_\xi$ is trivial on $N_0$, and $\boldsymbol{\psi}_j$ is a multiplet of the corresponding representation of $G_0$), and furthermore, by defining
\[ \rho(A) := \sum_{j=1}^{d}\boldsymbol{\psi}_j A\,\boldsymbol{\psi}_j^*, \qquad A \in \mathfrak{A}_0, \]
with $\mathfrak{A}_0$ the quasi-local algebra of the net $\mathcal{A}_0$, one gets a DHR endomorphism of $\mathfrak{A}_0$, whose sector is therefore identified with the scaling limit of the sector $\xi$.

The last result that we cite from [6] is the following generalization of a theorem proven by Roberts [11] for dilatation invariant theories: if all the sectors of the underlying theory are preserved in some scaling limit state, and if the local field algebras are factors, $\mathcal{F}(O) \cap \mathcal{F}(O)' = \mathbb{C}\mathbf{1}$, then local intertwiners between DHR endomorphisms of $\mathfrak{A}$ are also global intertwiners, i.e. if $\rho, \sigma$ are covariant, finite
statistics DHR endomorphisms of $\mathfrak{A}$ and $T \in \mathfrak{A}$ is such that $T\rho(A) = \sigma(A)T$ holds for each $A \in \mathcal{A}(O)$, then it also holds for each $A \in \mathfrak{A}$.

3. Scaling Limit of Tensor Product Theories

As mentioned in the introduction, nuclearity assumptions will play a fundamental role in the discussion of the scaling limit of tensor product theories. For the notion of a $p$-nuclear map between Banach spaces, see Definition A.1 in Appendix A.

Let $(\mathcal{F}, U, V, \Omega, k)$ be a QFTGA. For a non-negative function $\psi \in C(G)$ with $\int_G \psi = 1$, we introduce the notation
\[ \hat{V}(\psi) := \int_G dg\, \psi(g)V(g), \]
where the integral is defined in the strong sense. Of course $\|\hat{V}(\psi)\| = 1$. For a double cone $O$ and a function $f \in C_b(\mathbb{R}^d)$, consider the map $\Theta_{f,\psi,O} : \mathcal{F}(O) \to \mathcal{H}$ defined by
\[ \Theta_{f,\psi,O}(F) := f(P)\hat{V}(\psi)F\Omega, \qquad F \in \mathcal{F}(O), \tag{3.1} \]
where $P$ is the $d$-momentum operator of our theory, i.e. the generator of the translation group. An important particular case of such maps is the map $\Theta_{\beta,O}$ obtained when $\psi$ approaches a $\delta$-function at $e \in G$ and $f$ is such that $f(p) = e^{-\beta p_0}$ for $p \in \overline{V}_+$, for some $\beta > 0$, i.e.
\[ \Theta_{\beta,O}(F) := e^{-\beta H}F\Omega, \qquad F \in \mathcal{F}(O), \tag{3.2} \]
with $H = P_0$ the generator of time translations.

Definition 3.1. The QFTGA $(\mathcal{F}, U, V, \Omega, k)$ is said to be asymptotically (uniformly) $p$-nuclear if all the maps $\Theta_{\beta,O}$ are $p$-nuclear and
\[ \limsup_{\lambda \to 0}\|\Theta_{\lambda\beta,\lambda O}\|_p < +\infty. \tag{3.3} \]
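As an illustration of condition (3.3), consider the hypothetical situation (an assumption made here only for the sake of the example, not one used in the text) in which the theory is also dilation covariant: there are unitaries $D(\lambda)$, $\lambda > 0$, with $D(\lambda)\Omega = \Omega$, $D(\lambda)\mathcal{F}(O)D(\lambda)^* = \mathcal{F}(\lambda O)$ and $D(\lambda)U(x)D(\lambda)^* = U(\lambda x)$. The last relation gives $D(\lambda)e^{-\beta H}D(\lambda)^* = e^{-\lambda\beta H}$, so that for $F \in \mathcal{F}(O)$

\[ \Theta_{\lambda\beta,\lambda O}\big(D(\lambda)FD(\lambda)^*\big) = e^{-\lambda\beta H}D(\lambda)F\Omega = D(\lambda)e^{-\beta H}F\Omega = D(\lambda)\,\Theta_{\beta,O}(F). \]

Since $\operatorname{Ad} D(\lambda)$ is an isometric bijection of $\mathcal{F}(O)$ onto $\mathcal{F}(\lambda O)$ and $D(\lambda)$ is unitary, one expects, with the usual definition of the nuclear $p$-norm as an infimum over decompositions, that $\|\Theta_{\lambda\beta,\lambda O}\|_p = \|\Theta_{\beta,O}\|_p$ for every $\lambda$. In such a scale invariant setting, (3.3) would therefore reduce to $p$-nuclearity of the single map $\Theta_{\beta,O}$; the content of the definition lies in theories without this symmetry.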
From the estimates in [13, Proposition 3.1], it follows that the theory of $n$ free scalar fields of masses $m_i \ge 0$, $i = 1, \dots, n$, is asymptotically $p$-nuclear for any $p \in (0, 1]$.

The notion of asymptotic nuclearity was first introduced in [10], where the relations between the phase space properties of the underlying theory and the structure of its scaling limits were analyzed. Essentially all the results to be found there can be generalized to the present setting (the generalization consisting of the fact that here we allow a nontrivial gauge group $G$ acting on the net, as well as normal commutation relations). In particular, we will need the following results, whose proofs are obtained by a straightforward modification of the ones of [10, Theorems 4.5 and 4.6], combined with the remark [13, Lemma 3.1] that the nuclearity properties of the map $\Theta^{(0)}_{\beta,O}$, defined as the analogue of the map $\Theta_{\beta,O}$ for a given scaling limit theory $\mathcal{F}_0$, are the same as the ones of the map $F \in \pi_0(\underline{\mathfrak{F}}(O)) \to e^{-\beta H_0}F\Omega_0$ which is considered in [10] (we recall that $\mathcal{F}_0(O) = \pi_0(\underline{\mathfrak{F}}(O))^-$).
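Proposition 3.2. Assume that the theory $\mathcal{F}$ is asymptotically $p$-nuclear for $0 < p < 1/3$. Then:

(i) for each scaling limit theory $\mathcal{F}_0$, the corresponding maps $\Theta^{(0)}_{\beta,O}$ are $q$-nuclear for any $q > 2p/(2 - 3p)$, and there exists a $c > 0$, depending only on $p, q$, such that
\[ \|\Theta^{(0)}_{\beta,O}\|_q \le c\,\limsup_{\lambda \to 0}\|\Theta_{\lambda\beta,\lambda O}\|_p; \]
(ii) if there exists a constant $c$ such that
\[ \limsup_{\lambda \to 0}\|\Theta_{\lambda\beta,\lambda O}\|_p \le c, \]
uniformly for all double cones $O$, then $\mathcal{F}$ has a classical scaling limit.

We now state and prove some technical results that we will use later in the discussion of the scaling limit of a tensor product theory. We first introduce some notation. For $f \in \mathcal{S}(\mathbb{R}^d)$, we adopt the following conventions for its Fourier transform and anti-transform:
\[ \hat{f}(p) := \int_{\mathbb{R}^d} dx\, f(x)e^{ipx}, \qquad \check{f}(x) := \int_{\mathbb{R}^d}\frac{dp}{(2\pi)^d}\, f(p)e^{-ipx}, \]
where of course $px = p_\mu x^\mu$ is the Minkowski scalar product of $p, x \in \mathbb{R}^d$. Also, for a function $f$ on $\mathbb{R}^d$, and $\lambda > 0$, we set $f_\lambda(p) = f(\lambda p)$, $p \in \mathbb{R}^d$.

Lemma 3.3. Let the theory $\mathcal{F}$ be asymptotically $p$-nuclear for $0 < p < 1/3$, let $\underline{\omega}_0$ be a scaling limit state and $f \in \mathcal{S}(\mathbb{R}^d)$ be such that $\sup_{p \in \overline{V}_+}|f(p)e^{\beta p_0}| < \infty$ for some $\beta > 0$. Then if $2p/(1 - p) < q \le 1$, for each double cone $\hat{O}$ and for each $\varepsilon > 0$, there are elements $\underline{F}_1, \dots, \underline{F}_N \in \underline{\mathfrak{F}}(\hat{O})$ such that, if we let
\[ P_N := \sum_{n=1}^{N}\big\langle\Theta^{(0)}_{f,\psi,\hat{O}}(\pi_0(\underline{F}_n)), \,\cdot\,\big\rangle\,\Theta^{(0)}_{f,\psi,\hat{O}}(\pi_0(\underline{F}_n)), \tag{3.4} \]
then
\[ \big\|(\mathbf{1} - P_N)\Theta^{(0)}_{f,\psi,\hat{O}}\big\|_q < \varepsilon. \tag{3.5} \]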
Proof. From the conditions on $p, q$ in the statement, it follows that we can take a number $r$ such that $2p/(2 - 3p) < r < 2q/(4 - q)$, which implies that $q > 4r/(r + 2)$ and $r < 2/3$. This implies, according to the previous proposition, that $\Theta^{(0)}_{\beta,\hat{O}}$ is $r$-nuclear, and then, since it follows from the conditions on the function $f$ that $f(P)e^{\beta H}$ is a bounded operator on $\mathcal{H}_0$, $\Theta^{(0)}_{f,\psi,\hat{O}} = f(P)e^{\beta H}\hat{V}_0(\psi)\Theta^{(0)}_{\beta,\hat{O}}$ is $r$-nuclear too. Then, according to Lemma A.5 in Appendix A, there exist an orthonormal system $(\Phi_n)_{n \in \mathbb{N}} \subset \overline{\operatorname{ran}\Theta^{(0)}_{f,\psi,\hat{O}}}$ and a family $(\varphi_n)_{n \in \mathbb{N}} \subset \mathcal{F}_0(\hat{O})^*$ such that
\[ \Theta^{(0)}_{f,\psi,\hat{O}}(F) = \sum_{n=1}^{+\infty}\varphi_n(F)\Phi_n, \qquad \sum_{n=1}^{+\infty}\|\varphi_n\|^q < +\infty. \]
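It is therefore possible to find an integer $N$ such that if $Q_N$ is the orthogonal projection on the subspace spanned by $\Phi_1, \dots, \Phi_N$,
\[ \big\|(\mathbf{1} - Q_N)\Theta^{(0)}_{f,\psi,\hat{O}}\big\|_q \le \Big(\sum_{n=N+1}^{+\infty}\|\varphi_n\|^q\Big)^{1/q} < \frac{\varepsilon}{2}. \tag{3.6} \]
Furthermore, since, as it is easily checked, $\overline{\operatorname{ran}\Theta^{(0)}_{f,\psi,\hat{O}}} = \overline{\Theta^{(0)}_{f,\psi,\hat{O}}(\pi_0(\underline{\mathfrak{F}}(\hat{O})))}$, we can find elements $\underline{F}_n \in \underline{\mathfrak{F}}(\hat{O})$, $n = 1, \dots, N$, such that
\[ \big\|\Phi_n - \Theta^{(0)}_{f,\psi,\hat{O}}(\pi_0(\underline{F}_n))\big\| < \min\Big\{1, \frac{\varepsilon}{3 \cdot 2^{n+1}\,\|\Theta^{(0)}_{f,\psi,\hat{O}}\|_q}\Big\}, \]
so that $\|\Theta^{(0)}_{f,\psi,\hat{O}}(\pi_0(\underline{F}_n))\| \le 2$. Then, if $P_N$ is given by Eq. (3.4), we get, for each $\Phi \in \mathcal{H}_0$,
\[ \begin{aligned} \|(Q_N - P_N)\Phi\| &\le \sum_{n=1}^{N}\big\|\langle\Phi_n, \Phi\rangle\Phi_n - \big\langle\Theta^{(0)}_{f,\psi,\hat{O}}(\pi_0(\underline{F}_n)), \Phi\big\rangle\,\Theta^{(0)}_{f,\psi,\hat{O}}(\pi_0(\underline{F}_n))\big\| \\ &\le 3\sum_{n=1}^{N}\big\|\Phi_n - \Theta^{(0)}_{f,\psi,\hat{O}}(\pi_0(\underline{F}_n))\big\|\,\|\Phi\| < \frac{\varepsilon}{2\|\Theta^{(0)}_{f,\psi,\hat{O}}\|_q}\,\|\Phi\|, \end{aligned} \]
i.e. $\|Q_N - P_N\| \le \varepsilon/2\|\Theta^{(0)}_{f,\psi,\hat{O}}\|_q$, which, together with inequality (3.6), gives the statement.

In order not to burden the formulas too much, in the following lemma and in the proof of Lemma 3.6, we will make the following slight abuse of notation: given an element $\underline{F}_n \in \underline{\mathfrak{F}}$, we will denote its value at scale $\lambda$ as $\underline{F}_{n\lambda}$ (instead of $(\underline{F}_n)_\lambda$), which should not be confused with the value of an element $\underline{F}$ at scale $n\lambda$.

Lemma 3.4. Assume that the theory $\mathcal{F}$ is asymptotically $p$-nuclear for $p \in (0, 1/6)$, and let $f \in \mathcal{S}(\mathbb{R}^d)$ be as in the previous lemma, and $\underline{\omega}_0 = \lim_{\kappa \in K}\underline{\omega}_{\lambda_\kappa}$ be a scaling limit state of $\mathcal{F}$. Then if $2p/(1 - 4p) < q \le 1$, for each pair of double cones $O, \hat{O}$ with $\overline{O} \subset \hat{O}$ and for each $\varepsilon > 0$, there exist $\underline{F}_1, \dots, \underline{F}_N \in \underline{\mathfrak{F}}(\hat{O})$ such that, if we set
\[ P_N^{(\lambda)} := \sum_{n=1}^{N}\big\langle\Theta_{f_\lambda,\psi,\lambda\hat{O}}(\underline{F}_{n\lambda}), \,\cdot\,\big\rangle\,\Theta_{f_\lambda,\psi,\lambda\hat{O}}(\underline{F}_{n\lambda}), \tag{3.7} \]
we have
\[ \limsup_{\kappa \in K}\big\|\Theta_{f_{\lambda_\kappa},\psi,\lambda_\kappa O} - P_N^{(\lambda_\kappa)}\Theta_{f_{\lambda_\kappa},\psi,\lambda_\kappa O}\big\|_q \le \varepsilon, \tag{3.8} \]
where for each $\kappa \in K$ the $q$-norm appearing in the last equation is the $q$-norm of nuclear maps in $B(\mathcal{F}(\lambda_\kappa O), \mathcal{H})$.

Proof. We use a variation of the arguments in [10, Theorem 4.5]. We observe preliminarily that given bounded functions $\lambda \to F_\lambda \in \mathcal{F}(\lambda O_1)$, $\lambda \to G_\lambda \in \mathcal{F}(\lambda O_2)$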
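we have
\[ \begin{aligned} \lim_\kappa\big\langle\Theta_{f_{\lambda_\kappa},\psi,\lambda_\kappa O_1}(F_{\lambda_\kappa}), \Theta_{f_{\lambda_\kappa},\psi,\lambda_\kappa O_2}(G_{\lambda_\kappa})\big\rangle &= \lim_\kappa\big\langle(\underline{\alpha}_{\check{f}}\underline{\beta}_\psi F)_{\lambda_\kappa}\Omega, (\underline{\alpha}_{\check{f}}\underline{\beta}_\psi G)_{\lambda_\kappa}\Omega\big\rangle \\ &= \big\langle\pi_0(\underline{\alpha}_{\check{f}}\underline{\beta}_\psi F)\Omega_0, \pi_0(\underline{\alpha}_{\check{f}}\underline{\beta}_\psi G)\Omega_0\big\rangle. \end{aligned} \tag{3.9} \]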
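The first equality in (3.9) rests on an identity between the maps $\Theta$ and the smeared functions of (2.4). A sketch of the computation follows, assuming the convention $U(x) = e^{iPx}$ (our assumption; the sign convention for $P$ is not spelled out in this section) and using the invariance of $\Omega$ under $U$ and $V$, the commutativity of $U$ and $V$, and Fourier inversion with the conventions for $\hat{f}$, $\check{f}$ fixed above. Underlined symbols denote the smearing operations of (2.4).

\[ \begin{aligned} (\underline{\alpha}_{\check{f}}\underline{\beta}_\psi F)_\lambda\Omega &= \int_{\mathbb{R}^d} dx\,\check{f}(x)\,U(\lambda x)\int_G dg\,\psi(g)\,V(g)F_\lambda\Omega \\ &= \Big(\int_{\mathbb{R}^d} dx\,\check{f}(x)\,e^{i\lambda Px}\Big)\hat{V}(\psi)F_\lambda\Omega = f(\lambda P)\hat{V}(\psi)F_\lambda\Omega = \Theta_{f_\lambda,\psi,\lambda O}(F_\lambda). \end{aligned} \]

The second equality in (3.9) then expresses that the scaling limit state is the limit of the states $\underline{\omega}_{\lambda_\kappa}$ along the net, applied to the product of the adjoint of one smeared function with the other.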
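For simplicity, for any given family $\underline{F}_n \in \underline{\mathfrak{F}}(\hat{O})$, $n = 1, \dots, N$, and for the corresponding $P_N^{(\lambda)}$ defined as in (3.7), we set $T_N^{(\lambda)} := \Theta_{f_\lambda,\psi,\lambda O} - P_N^{(\lambda)}\Theta_{f_\lambda,\psi,\lambda O}$. Furthermore, we denote by $N_\kappa(\varepsilon)$ the $\varepsilon$-content of the map $T_N^{(\lambda_\kappa)}$, and by $N_0(\varepsilon)$ that of the map $(\mathbf{1} - P_N)\Theta^{(0)}_{f,\psi,\hat{O}}$, where $P_N$ is defined as in the previous lemma, Eq. (3.4). We begin by showing that the following inequality holds for each $\varepsilon > 0$:
\[ \limsup_\kappa N_\kappa(\varepsilon) \le N_0(\varepsilon/2). \tag{3.10} \]
If this is not true, there exists an $\varepsilon > 0$ such that, if we set $M := N_0(\varepsilon/2)$, for each $\nu \in K$, we can find a $\kappa(\nu) \in K$, $\kappa(\nu) \ge \nu$, and elements $G_\nu^{(n)} \in \mathcal{F}(\lambda_{\kappa(\nu)}O)$, $\|G_\nu^{(n)}\| \le 1$, $n = 1, \dots, M + 1$, such that
\[ \big\|T_N^{(\lambda_{\kappa(\nu)})}\big(G_\nu^{(n)} - G_\nu^{(m)}\big)\big\| > \varepsilon \]
if $n \ne m$. Define then, for each $\lambda > 0$, and $n = 1, \dots, M + 1$,
\[ G_\lambda^{(n)} := \begin{cases} G_\nu^{(n)} & \text{if } \lambda = \lambda_{\kappa(\nu)} \text{ for some } \nu \in K, \\ 0 & \text{otherwise.} \end{cases} \]
It is straightforward to check that the set $\tilde{K} := \{\kappa(\nu) : \nu \in K\} \subset K$ is, with the induced partial ordering, a subnet of $K$, and therefore it is easy to verify, using (3.9), that
\[ \begin{aligned} \big\|(\mathbf{1} - P_N)\pi_0\big(\underline{\alpha}_{\check{f}}\underline{\beta}_\psi G^{(n)} - \underline{\alpha}_{\check{f}}\underline{\beta}_\psi G^{(m)}\big)\Omega_0\big\| &= \lim_{\kappa \in K}\big\|T_N^{(\lambda_\kappa)}\big(G_{\lambda_\kappa}^{(n)} - G_{\lambda_\kappa}^{(m)}\big)\big\| \\ &= \lim_{\kappa \in \tilde{K}}\big\|T_N^{(\lambda_\kappa)}\big(G_{\lambda_\kappa}^{(n)} - G_{\lambda_\kappa}^{(m)}\big)\big\| \ge \varepsilon. \end{aligned} \tag{3.11} \]
Pick now non-negative functions $h \in C_c(\mathbb{R}^d)$, $\chi \in C(G)$ with $\int_{\mathbb{R}^d}h = 1 = \int_G\chi$ and $O + \operatorname{supp} h \subset \hat{O}$, and define $\underline{H}^{(n)} := \underline{\alpha}_h\underline{\beta}_\chi G^{(n)} \in \underline{\mathfrak{F}}(\hat{O})$. Taking into account that convolution on $\mathbb{R}^d$ is commutative and $G$ is unimodular, we see that we can take $\operatorname{supp} h$ and $\operatorname{supp}\chi$ so small that
\[ \begin{aligned} \big\|\Theta^{(0)}_{f,\psi,\hat{O}}(\pi_0(\underline{H}^{(n)})) - \pi_0(\underline{\alpha}_{\check{f}}\underline{\beta}_\psi G^{(n)})\Omega_0\big\| &\le \big\|\underline{\alpha}_h\underline{\alpha}_{\check{f}}\underline{\beta}_{\psi * \chi}G^{(n)} - \underline{\alpha}_{\check{f}}\underline{\beta}_\psi G^{(n)}\big\| \\ &\le \|\check{f}\|_1\sup_{g \in \operatorname{supp}\chi}\big\|\underline{\beta}_{\psi_{g^{-1}}}G^{(n)} - \underline{\beta}_\psi G^{(n)}\big\| + \sup_{x \in \operatorname{supp} h}\big\|\underline{\alpha}_{\check{f}_x}\underline{\beta}_\psi G^{(n)} - \underline{\alpha}_{\check{f}}\underline{\beta}_\psi G^{(n)}\big\| \\ &< \frac{\varepsilon}{4\|\mathbf{1} - P_N\|}, \end{aligned} \]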
where we used the standard notation $\check{f}_x(y) := \check{f}(y - x)$, $\psi_{g^{-1}}(h) := \psi(hg^{-1})$. Therefore, together with Eq. (3.11), we get
\[ \big\|(\mathbf{1} - P_N)\Theta^{(0)}_{f,\psi,\hat{O}}(\pi_0(\underline{H}^{(n)})) - (\mathbf{1} - P_N)\Theta^{(0)}_{f,\psi,\hat{O}}(\pi_0(\underline{H}^{(m)}))\big\| > \varepsilon/2, \]
which means that $N_0(\varepsilon/2) \ge M + 1 = N_0(\varepsilon/2) + 1$, and this contradiction proves (3.10).

Now, according to Lemma A.3(ii) in Appendix A, there holds
\[ \limsup_\kappa\big\|T_N^{(\lambda_\kappa)}\big\|_q \le \limsup_\kappa\, d_q\Big(\sum_{m=1}^{+\infty}\big(m^{1/2}\,\varepsilon_m N_\kappa(\varepsilon_m)^{1/m}\big)^q\Big)^{1/q}, \tag{3.12} \]
provided we can find a sequence of positive numbers $(\varepsilon_m)_{m \in \mathbb{N}}$ such that the series on the right-hand side of this equation is convergent. We then pick numbers $r, s$ with $2p/(1 - p) < r < 2q/(3q + 2)$ and $r/(1 - r) < s < 2q/(q + 2)$ (that this is possible follows from the conditions imposed on $p, q$) and set $\varepsilon_m := \|(\mathbf{1} - P_N)\Theta^{(0)}_{f,\psi,\hat{O}}\|_r\, m^{-1/s}$. It then follows from Lemma A.3(i) that
\[ \varepsilon_m N_\kappa(\varepsilon_m)^{1/m} \le \big\|(\mathbf{1} - P_N)\Theta^{(0)}_{f,\psi,\hat{O}}\big\|_r\exp\bigg(\frac{c\,\|T_N^{(\lambda_\kappa)}\|_r^s}{\|(\mathbf{1} - P_N)\Theta^{(0)}_{f,\psi,\hat{O}}\|_r^s}\bigg)m^{-1/s}, \]
and since, if $M := \sup_{p \in \overline{V}_+}|f(p)e^{\beta p_0}|$, it is easily checked that
\[ \big\|T_N^{(\lambda_\kappa)}\big\|_r \le M\Big(1 + M^2\sum_{n=1}^{N}\|\underline{F}_n\|^2\Big)\|\Theta_{\lambda_\kappa\beta,\lambda_\kappa O}\|_r, \]
we have, from the assumption of asymptotic $p$-nuclearity of the theory, and from the fact that $r > p$ for $p \in (0, 1/6)$, that there exists a constant $C > 0$ and some $\nu \in K$, independent of $m$, such that for all $\kappa > \nu$, there holds
\[ \varepsilon_m N_\kappa(\varepsilon_m)^{1/m} \le Cm^{-1/s}. \]
It follows from the conditions imposed on $q, s$ that the series $\sum_{m=1}^{+\infty}m^{\frac{q}{2} - \frac{q}{s}}$ is convergent, and we can then interchange the sum and the limit superior on the right-hand side of (3.12) obtaining a larger upper bound on the left-hand side, so that, using inequality (3.10) and Lemma A.3(i) once more, we conclude that there exists a constant $K_{q,s} > 0$ such that
\[ \limsup_\kappa\big\|T_N^{(\lambda_\kappa)}\big\|_q \le K_{q,s}\big\|(\mathbf{1} - P_N)\Theta^{(0)}_{f,\psi,\hat{O}}\big\|_r, \]
and the statement is finally obtained by appealing to the previous lemma.

We now pass to consider the situation in which we have two different QFTGAs $(\mathcal{F}^{(i)}, U^{(i)}, V^{(i)}, \Omega^{(i)}, k_i)$, $i = 1, 2$. For simplicity we will assume, in all that follows, that the $\mathcal{F}^{(i)}$ are purely bosonic, i.e. $k_i = e_i$ (identity of the group $G_i$). It is straightforward, if cumbersome, to generalize the following results to the case of two genuinely $\mathbb{Z}_2$-graded nets (see the remarks after Theorem 3.7), but, as in the
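rest of the paper we will need only the present special case, we refrain from giving details. Of course, by defining
\[ \mathcal{F}(O) := \mathcal{F}^{(1)}(O)\,\bar{\otimes}\,\mathcal{F}^{(2)}(O), \tag{3.13} \]
\[ U(x) := U^{(1)}(x) \otimes U^{(2)}(x), \qquad x \in \mathbb{R}^d, \tag{3.14} \]
\[ V(g_1, g_2) := V^{(1)}(g_1) \otimes V^{(2)}(g_2), \qquad (g_1, g_2) \in G_1 \times G_2, \tag{3.15} \]
\[ \Omega := \Omega^{(1)} \otimes \Omega^{(2)}, \tag{3.16} \]
we get a new QFTGA $(\mathcal{F}, U, V, \Omega, (e_1, e_2))$ on the Hilbert space $\mathcal{H} := \mathcal{H}^{(1)} \otimes \mathcal{H}^{(2)}$, which will be called the tensor product theory of $\mathcal{F}^{(1)}$ and $\mathcal{F}^{(2)}$, and denoted, for brevity, by $\mathcal{F}^{(1)}\,\bar{\otimes}\,\mathcal{F}^{(2)}$. Our purpose is to study the relationship between the scaling limit theory of $\mathcal{F}$ and the tensor product of the scaling limit theories of $\mathcal{F}^{(i)}$, $i = 1, 2$.

We recall that two QFTGAs $(\mathcal{F}, U, V, \Omega, k)$ and $(\tilde{\mathcal{F}}, \tilde{U}, \tilde{V}, \tilde{\Omega}, k)$ with the same gauge group are net-isomorphic if there is an isomorphism of the quasi-local algebras $\theta : \mathfrak{F} \to \tilde{\mathfrak{F}}$ such that $\theta(\mathcal{F}(O)) = \tilde{\mathcal{F}}(O)$, $\tilde{\alpha}_x\theta = \theta\alpha_x$, $\tilde{\beta}_g\theta = \theta\beta_g$ and $\tilde{\omega}\theta = \omega$, with obvious meaning of the symbols. It is then plain that the sets of scaling limit states of two net-isomorphic theories are in bijective correspondence, and that the scaling limit theories arising from two corresponding scaling limit states are net-isomorphic. Therefore, net-isomorphic theories can be identified when discussing properties of their scaling limit theories. In particular, in the following, we will always identify without further comment the nets $O \to \mathcal{F}^{(1)}(O)$ and $O \to \mathcal{F}^{(1)}(O)\,\bar{\otimes}\,\mathbb{C}\mathbf{1} \subset \mathcal{F}(O)$, and the nets $O \to \mathcal{F}^{(2)}(O)$ and $O \to \mathbb{C}\mathbf{1}\,\bar{\otimes}\,\mathcal{F}^{(2)}(O) \subset \mathcal{F}(O)$ (with the obvious definitions of translations and gauge transformations).

We will then denote by $\underline{\mathfrak{F}}^{(i)}$ the scaling algebra associated to $\mathcal{F}^{(i)}$, $i = 1, 2$. For $\underline{F}^{(i)} \in \underline{\mathfrak{F}}^{(i)}(O)$, $i = 1, 2$, we define, by a slight abuse of notation, $(\underline{F}^{(1)} \otimes \underline{F}^{(2)})_\lambda := \underline{F}^{(1)}_\lambda \otimes \underline{F}^{(2)}_\lambda \in \mathcal{F}(\lambda O)$, and it is clear that $\underline{F}^{(1)} \otimes \underline{F}^{(2)} \in \underline{\mathfrak{F}}(O)$. We will denote by $\underline{\tilde{\mathfrak{F}}}(O)$ the C$^*$-subalgebra of $\underline{\mathfrak{F}}(O)$ generated by such elements, and by $\underline{\tilde{\mathfrak{F}}}$ the corresponding quasi-local C$^*$-algebra. We also define $\underline{\mathbf{1}}_\lambda := \mathbf{1}$ for all $\lambda > 0$.

Proposition 3.5. The sets of scaling limit states of the three theories $\mathcal{F}$, $\mathcal{F}^{(1)}$, $\mathcal{F}^{(2)}$ are in bijective correspondence, in such a way that $\underline{\omega}_0 \in \mathrm{SL}^{\mathcal{F}}(\omega)$ corresponds to the states $\underline{F} \in \underline{\mathfrak{F}}^{(1)} \to \underline{\omega}_0(\underline{F} \otimes \underline{\mathbf{1}})$ in $\mathrm{SL}^{\mathcal{F}^{(1)}}(\omega^{(1)})$ and $\underline{F} \in \underline{\mathfrak{F}}^{(2)} \to \underline{\omega}_0(\underline{\mathbf{1}} \otimes \underline{F})$ in $\mathrm{SL}^{\mathcal{F}^{(2)}}(\omega^{(2)})$.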
Proof. It is well known that there exist conditional expectations $E^{(i)} : \mathfrak{F} \to \mathfrak{F}^{(i)}$ such that $E^{(i)}(\mathcal{F}(O)) = \mathcal{F}^{(i)}(O)$, defined by the fact that, say, $E^{(2)}(F)$, $F \in \mathcal{F}(O)$, is the unique element of $\mathcal{F}^{(2)}(O)$ such that $\phi(E^{(2)}(F)) = \omega^{(1)} \otimes \phi(F)$ for each $\phi \in \mathcal{F}^{(2)}(O)_*$, so that $E^{(2)}(F_1 \otimes F_2) = \omega^{(1)}(F_1)F_2$ (see, for instance, the proof of [14, Theorem 2.6.4]). It is then straightforward to check that $\alpha^{(i)}_x E^{(i)} = E^{(i)}\alpha_x$, $\beta^{(i)}_{g_i}E^{(i)} = E^{(i)}\beta_{(g_1,g_2)}$ and $\omega^{(i)}E^{(i)} = \omega$. It can then be shown [15] that given a conditional expectation between two nets with the above properties, the respective sets of scaling limit states are in bijective correspondence, such correspondence being given by the restriction of scaling limit states.
Remark. The above proposition implies, in particular, that the cardinality of the set of scaling limit states is independent of the theory under consideration. Although this may seem surprising at first sight, it must be kept in mind that this does not mean that the physical interpretation of these states is the same for all theories: if, for instance, two states of a theory give rise to isomorphic scaling limit nets, in general, this will not happen for the corresponding states of another theory. Therefore, upon identifying isomorphic scaling limit theories, we see that the number of the physically distinguishable scaling limits will be different for different theories.

In view of the above result, given a state $\underline{\omega}_0 \in \mathrm{SL}^{\mathcal{F}}(\omega)$, we will denote by $(\mathcal{F}_0^{(i)}, U_0^{(i)}, V_0^{(i)}, \Omega_0^{(i)})$, $i = 1, 2$, the scaling limit (bosonic) QFTGAs arising from the corresponding states in $\mathrm{SL}^{\mathcal{F}^{(i)}}(\omega^{(i)})$, without further specifications. We will also denote by $\Theta^{(i)}_{f,\psi,O}$ the nuclear maps associated to the theory $\mathcal{F}^{(i)}$, $i = 1, 2$.

Lemma 3.6. Assume that both theories $\mathcal{F}^{(i)}$, $i = 1, 2$, are asymptotically $p$-nuclear for $p \in (0, 1/6)$ and let $\underline{\omega}_0$ be a scaling limit state of $\mathcal{F}$. For each $\underline{F} \in \underline{\mathfrak{F}}(O)$ and each $\varepsilon > 0$, there exists a $\underline{G} \in \underline{\tilde{\mathfrak{F}}}$ such that $\|\pi_0(\underline{F})\Omega_0 - \pi_0(\underline{G})\Omega_0\| < \varepsilon$.

Proof. Without restriction to generality, we can assume $\|\underline{F}\| \le 1$. To begin with, we choose $\beta > 0$ and non-negative functions $\psi_i \in C(G_i)$, $i = 1, 2$, which integrate to one, such that, if $\psi(g_1, g_2) := \psi_1(g_1)\psi_2(g_2)$,
\[ \big\|\big(\mathbf{1} - e^{-\beta H}\hat{V}_0(\psi)\big)\pi_0(\underline{F})\Omega_0\big\| < \frac{\varepsilon}{2}, \]
and we pick $f \in \mathcal{S}(\mathbb{R}^d)$ such that $f(p) = e^{-\beta p_0}$ for $p \in \overline{V}_+$ (it is straightforward to explicitly construct such a function). Let also $M_i := \limsup_{\lambda \to 0}\|\Theta^{(i)}_{\lambda\beta,\lambda O}\|_q$ and let $\delta > 0$ be such that $\delta(M_1 + M_2 + \delta) < \varepsilon/2$, and, according to Lemma 3.4, let $\underline{F}^{(i)}_n \in \underline{\mathfrak{F}}^{(i)}(\hat{O})$, $n = 1, \dots, N$, $i = 1, 2$, be such that
\[ \limsup_{\kappa \in K}\big\|\Theta^{(i)}_{f_{\lambda_\kappa},\psi_i,\lambda_\kappa O} - P_N^{(i,\lambda_\kappa)}\Theta^{(i)}_{f_{\lambda_\kappa},\psi_i,\lambda_\kappa O}\big\|_q \le \delta, \]
for $1 \ge q > 2p/(1 - 4p)$, with obvious meaning of the symbols. If we define then
\[ R_N^{(\lambda)} := \sum_{n,m=1}^{N}\big\langle\Theta_{f_\lambda,\psi,\lambda\hat{O}}\big(\underline{F}^{(1)}_{n\lambda} \otimes \underline{F}^{(2)}_{m\lambda}\big), \,\cdot\,\big\rangle\,\Theta_{f_\lambda,\psi,\lambda\hat{O}}\big(\underline{F}^{(1)}_{n\lambda} \otimes \underline{F}^{(2)}_{m\lambda}\big), \]
we have $\Theta_{f,\psi,O}(F^{(1)} \otimes F^{(2)}) = \Theta^{(1)}_{f,\psi_1,O}(F^{(1)}) \otimes \Theta^{(2)}_{f,\psi_2,O}(F^{(2)})$ and $R_N^{(\lambda)} = P_N^{(1,\lambda)} \otimes P_N^{(2,\lambda)}$, and therefore, observing that, by the arguments in [13], the nuclear $q$-norm of $\Theta_{f_\lambda,\psi,\lambda O} - R_N^{(\lambda)}\Theta_{f_\lambda,\psi,\lambda O}$ agrees with that of its restriction to the minimal tensor product $\mathcal{F}^{(1)}(\lambda O) \otimes_{\min} \mathcal{F}^{(2)}(\lambda O)$, we can apply Lemma A.4 in Appendix A,
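obtaining
\[ \begin{aligned} \big\|\Theta_{f_\lambda,\psi,\lambda O} - R_N^{(\lambda)}\Theta_{f_\lambda,\psi,\lambda O}\big\|_q &= \big\|\Theta^{(1)}_{f_\lambda,\psi_1,\lambda O} \otimes \Theta^{(2)}_{f_\lambda,\psi_2,\lambda O} - P_N^{(1,\lambda)}\Theta^{(1)}_{f_\lambda,\psi_1,\lambda O} \otimes P_N^{(2,\lambda)}\Theta^{(2)}_{f_\lambda,\psi_2,\lambda O}\big\|_q \\ &\le \big\|\Theta^{(1)}_{f_\lambda,\psi_1,\lambda O} - P_N^{(1,\lambda)}\Theta^{(1)}_{f_\lambda,\psi_1,\lambda O}\big\|_q\,\big\|\Theta^{(2)}_{\lambda\beta,\lambda O}\big\|_q \\ &\quad + \big\|P_N^{(1,\lambda)}\Theta^{(1)}_{f_\lambda,\psi_1,\lambda O}\big\|_q\,\big\|\Theta^{(2)}_{f_\lambda,\psi_2,\lambda O} - P_N^{(2,\lambda)}\Theta^{(2)}_{f_\lambda,\psi_2,\lambda O}\big\|_q, \end{aligned} \]
and then
\[ \limsup_\kappa\big\|\Theta_{f_{\lambda_\kappa},\psi,\lambda_\kappa O} - R_N^{(\lambda_\kappa)}\Theta_{f_{\lambda_\kappa},\psi,\lambda_\kappa O}\big\|_q \le M_2\delta + (M_1 + \delta)\delta < \frac{\varepsilon}{2}. \]
We define then the bounded functions
\[ c_{nm}(\lambda) := \big\langle\Theta_{f_\lambda,\psi,\lambda\hat{O}}\big(\underline{F}^{(1)}_{n\lambda} \otimes \underline{F}^{(2)}_{m\lambda}\big), \Theta_{f_\lambda,\psi,\lambda O}(\underline{F}_\lambda)\big\rangle, \qquad n, m = 1, \dots, N, \]
and we set $\underline{H}_\lambda := \sum_{n,m=1}^{N}c_{nm}(\lambda)\,\underline{F}^{(1)}_{n\lambda} \otimes \underline{F}^{(2)}_{m\lambda}$, $\underline{H} \in \underline{\tilde{\mathfrak{F}}}(\hat{O})$. Since $\underline{\tilde{\mathfrak{F}}}$ is a C$^*$-algebra on which translations $\underline{\alpha}$ and gauge transformations $\underline{\beta}$ act norm continuously, we have $\underline{G} := \underline{\alpha}_{\check{f}}\underline{\beta}_\psi\underline{H} \in \underline{\tilde{\mathfrak{F}}}$, and
\[ \begin{aligned} \big\|[\pi_0(\underline{F}) - \pi_0(\underline{G})]\Omega_0\big\| &\le \big\|\big(\mathbf{1} - f(P)\hat{V}_0(\psi)\big)\pi_0(\underline{F})\Omega_0\big\| + \big\|\Theta^{(0)}_{f,\psi,O}(\pi_0(\underline{F})) - \Theta^{(0)}_{f,\psi,\hat{O}}(\pi_0(\underline{H}))\big\| \\ &\le \frac{\varepsilon}{2} + \lim_\kappa\Big\|\Theta_{f_{\lambda_\kappa},\psi,\lambda_\kappa O}(\underline{F}_{\lambda_\kappa}) - \sum_{n,m=1}^{N}c_{nm}(\lambda_\kappa)\,\Theta_{f_{\lambda_\kappa},\psi,\lambda_\kappa\hat{O}}\big(\underline{F}^{(1)}_{n\lambda_\kappa} \otimes \underline{F}^{(2)}_{m\lambda_\kappa}\big)\Big\| \\ &= \frac{\varepsilon}{2} + \lim_\kappa\big\|\big(\Theta_{f_{\lambda_\kappa},\psi,\lambda_\kappa O} - R_N^{(\lambda_\kappa)}\Theta_{f_{\lambda_\kappa},\psi,\lambda_\kappa O}\big)(\underline{F}_{\lambda_\kappa})\big\| \\ &\le \frac{\varepsilon}{2} + \limsup_\kappa\big\|\Theta_{f_{\lambda_\kappa},\psi,\lambda_\kappa O} - R_N^{(\lambda_\kappa)}\Theta_{f_{\lambda_\kappa},\psi,\lambda_\kappa O}\big\|_q < \varepsilon, \end{aligned} \]
where in the last inequality we have used the fact that the operator norm is majorized by any nuclear $q$-norm with $0 < q \le 1$.

Theorem 3.7. Assume that the theories $\mathcal{F}^{(i)}$, $i = 1, 2$, are asymptotically $p$-nuclear for $0 < p < 1/6$, and that, for a given scaling limit state $\underline{\omega}_0$ of the tensor product theory $\mathcal{F}$, the scaling limit theories $\mathcal{F}_0^{(i)}$, $i = 1, 2$, satisfy Haag duality. Then, there is a unitary equivalence
\[ \mathcal{F}_0^{(1)}(O)\,\bar{\otimes}\,\mathcal{F}_0^{(2)}(O) \cong \mathcal{F}_0(O) \]
which implements a net-isomorphism between $\mathcal{F}_0^{(1)}\,\bar{\otimes}\,\mathcal{F}_0^{(2)}$ and $\mathcal{F}_0$.

Proof. In view of the last lemma, the vectors $\pi_0(\underline{G})\Omega_0$ with $\underline{G} \in \underline{\tilde{\mathfrak{F}}}$ span a dense subspace of the scaling limit Hilbert space $\mathcal{H}_0$. Therefore, the operator $W : \mathcal{H}_0 \to \mathcal{H}_0^{(1)} \otimes \mathcal{H}_0^{(2)}$ defined by
\[ W\pi_0\big(\underline{F}^{(1)} \otimes \underline{F}^{(2)}\big)\Omega_0 := \pi_0^{(1)}(\underline{F}^{(1)})\Omega_0^{(1)} \otimes \pi_0^{(2)}(\underline{F}^{(2)})\Omega_0^{(2)}, \qquad \underline{F}^{(i)} \in \underline{\mathfrak{F}}^{(i)}, \]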
is unitary, and it is obviously such that
\[ W\pi_0\big(\underline{\tilde{\mathfrak{F}}}(O)\big)^- W^* = \mathcal{F}_0^{(1)}(O)\,\bar{\otimes}\,\mathcal{F}_0^{(2)}(O). \]
Therefore, identifying unitarily equivalent nets, we get
\[ \mathcal{F}_0^{(1)}(O)\,\bar{\otimes}\,\mathcal{F}_0^{(2)}(O) \subseteq \mathcal{F}_0(O), \]
but $\mathcal{F}_0^{(1)}\,\bar{\otimes}\,\mathcal{F}_0^{(2)}$ satisfies Haag duality, and $\mathcal{F}_0$ satisfies locality by the results of [6], so that, since a net satisfying Haag duality is a maximal local net, the two nets coincide. It is then straightforward to verify that $\operatorname{Ad} W$ defines a net-isomorphism from $\mathcal{F}_0$ to $\mathcal{F}_0^{(1)}\,\bar{\otimes}\,\mathcal{F}_0^{(2)}$.

Remarks. (i) According to [5, Theorem 3.4] and to Theorem 4.3 in the following section, examples in which the nets $\mathcal{F}_0^{(i)}$ satisfy Haag duality are obtained by taking for $\mathcal{F}^{(i)}$, $i = 1, 2$, nets generated by free fields. In this case, in fact, the corresponding scaling limit theories are also free fields.

(ii) Another class of examples is obtained, as in the following section, by taking for $\mathcal{F}^{(1)}$, say, a net with a classical scaling limit $\mathcal{F}_0^{(1)} = \mathbb{C}\mathbf{1}$, and for $\mathcal{F}^{(2)}$ a net with a scaling limit satisfying Haag duality (a free field, for instance). In such cases, we have $\mathcal{F}_0^{(1)}(O)\,\bar{\otimes}\,\mathcal{F}_0^{(2)}(O) \cong \mathcal{F}_0^{(2)}(O)$, so the scaling limit of the tensor product theory coincides, by Theorem 3.7, with the scaling limit of the second factor.

(iii) It is fairly straightforward to modify the proofs of the above results in order to treat the case of Poincaré covariant nets, with associated scaling algebras defined by requiring the continuity condition (2.1) to hold also with respect to the Lorentz transformations (these scaling algebras are therefore smaller than those considered above). In particular, under the assumption of asymptotic $p$-nuclearity, the generalized version of Lemma 3.6 will imply that $\mathcal{H}_0 = \mathcal{H}_0^{(1)} \otimes \mathcal{H}_0^{(2)}$ in this case also. If we assume then geometric modular action for the theories $\mathcal{F}^{(i)}$, this will also hold for the scaling limit theories $\mathcal{F}_0^{(i)}$, $\mathcal{F}_0$ [6, Proposition 3.1], and this will imply, without further assumptions (in particular, without assuming Haag duality in the scaling limit), $\mathcal{F}_0(W) = \mathcal{F}_0^{(1)}(W)\,\bar{\otimes}\,\mathcal{F}_0^{(2)}(W)$ for any wedge $W$, and, as a consequence, equality of the dual theories of $\mathcal{F}_0$ and of $\mathcal{F}_0^{(1)}\,\bar{\otimes}\,\mathcal{F}_0^{(2)}$.

(iv) Another possible generalization of the results discussed in this section is obtained by dropping the hypothesis that the nets $\mathcal{F}^{(i)}$ are purely bosonic. In this case, one defines the bosonic and fermionic parts of $\mathcal{F}^{(i)}(O)$ as $\mathcal{F}^{(i)}(O)_\pm := \{\tfrac{1}{2}(F \pm \beta^{(i)}_{k_i}(F)) : F \in \mathcal{F}^{(i)}(O)\}$ and, in order to get a $\mathbb{Z}_2$-graded theory, the definition of the tensor product theory given above must be altered by replacing Eq. (3.13) with
\[ \mathcal{F}(O) := \mathcal{F}^{(1)}(O)\,\hat{\otimes}\,\mathcal{F}^{(2)}(O) := \mathcal{F}^{(1)}(O)\,\bar{\otimes}\,\mathcal{F}^{(2)}(O)_+ + V^{(1)}(k_1)\mathcal{F}^{(1)}(O)\,\bar{\otimes}\,\mathcal{F}^{(2)}(O)_-. \]
The analysis proceeds then along the same lines as the above one, by studying the $\lambda \to 0$ behaviour of the nuclearity properties of the restrictions of the maps
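$\Theta_{f_\lambda,\psi,\lambda O}$ to the bosonic and fermionic subnets, and at the end, one obtains that if the theories $\mathcal{F}^{(i)}$ are asymptotically $p$-nuclear and their scaling limit theories $\mathcal{F}_0^{(i)}$ satisfy twisted Haag duality, then $\mathcal{F}_0$ and $\mathcal{F}_0^{(1)}\,\hat{\otimes}\,\mathcal{F}_0^{(2)}$ are net-isomorphic.

We will need a version of Theorem 3.7 which deals with outer regularized scaling limit nets:
$$\mathcal{F}_{0,r}(O) := \bigcap_{O_1 \supset O} \pi_0(\underline{\mathfrak{F}}(O_1))^{-}.$$

Theorem 3.8. Assume that the purely bosonic theories $\mathcal{F}^{(i)}$, $i = 1, \dots, n$, are asymptotically $p$-nuclear for $0 < p < 1/6$, and that the outer regularized scaling limit theories $\mathcal{F}_{0,r}^{(i)}$, $i = 1, \dots, n$, satisfy Haag duality. Then,
$$\mathcal{F}_{0,r}^{(1)}(O)\,\bar{\otimes}\cdots\bar{\otimes}\,\mathcal{F}_{0,r}^{(n)}(O) \cong \mathcal{F}_{0,r}(O)$$
and the theories $\mathcal{F}_{0,r}^{(1)}\,\bar{\otimes}\cdots\bar{\otimes}\,\mathcal{F}_{0,r}^{(n)}$ and $\mathcal{F}_{0,r}$ are net-isomorphic.

Proof. According to Lemma A.4, the tensor product of two asymptotically $p$-nuclear theories is again asymptotically $p$-nuclear, and therefore it is sufficient, by induction, to prove the theorem for $n = 2$. We begin by observing that, if $(\mathcal{M}_\alpha)_\alpha$, $(\mathcal{N}_\alpha)_\alpha$ are families of von Neumann algebras (on the same Hilbert space), then
$$\Bigl(\bigcap_\alpha \mathcal{M}_\alpha\Bigr)\,\bar{\otimes}\,\Bigl(\bigcap_\alpha \mathcal{N}_\alpha\Bigr) = \bigcap_\alpha \bigl(\mathcal{M}_\alpha\,\bar{\otimes}\,\mathcal{N}_\alpha\bigr).$$
Through the unitary equivalence induced by the operator $W : \mathcal{H}_0 \to \mathcal{H}_0^{(1)} \otimes \mathcal{H}_0^{(2)}$ defined in the proof of Theorem 3.7, we have
$$\pi_0^{(1)}(\underline{\mathfrak{F}}^{(1)}(O)) \otimes_{\min} \pi_0^{(2)}(\underline{\mathfrak{F}}^{(2)}(O)) \subseteq \pi_0(\underline{\mathfrak{F}}(O)),$$
and then, by what we have just observed,
$$\begin{aligned}
\mathcal{F}_{0,r}^{(1)}(O)\,\bar{\otimes}\,\mathcal{F}_{0,r}^{(2)}(O)
&= \Bigl(\bigcap_{O_1 \supset O} \pi_0^{(1)}\bigl(\underline{\mathfrak{F}}^{(1)}(O_1)\bigr)^{-}\Bigr)\,\bar{\otimes}\,\Bigl(\bigcap_{O_1 \supset O} \pi_0^{(2)}\bigl(\underline{\mathfrak{F}}^{(2)}(O_1)\bigr)^{-}\Bigr)\\
&= \bigcap_{O_1 \supset O} \pi_0^{(1)}\bigl(\underline{\mathfrak{F}}^{(1)}(O_1)\bigr)^{-}\,\bar{\otimes}\,\pi_0^{(2)}\bigl(\underline{\mathfrak{F}}^{(2)}(O_1)\bigr)^{-}
\subseteq \bigcap_{O_1 \supset O} \pi_0\bigl(\underline{\mathfrak{F}}(O_1)\bigr)^{-}\\
&= \mathcal{F}_{0,r}(O),
\end{aligned}$$
and we conclude, as in the proof of Theorem 3.7, using the maximality of Haag dual nets.

4. A Class of Models with Non-Preserved DHR Sectors

In this section, we will construct quantum field theory models which possess DHR sectors which are non-preserved in the scaling limit. As already said in the introduction, the observable net of such models is obtained as the fixed point net of a field net which in turn is a tensor product of two theories, one being a theory with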
trivial scaling limit, and the other being a free field theory, whose DHR sectors are all preserved. We start then by briefly recalling some facts about theories with trivial scaling limit and the scaling limit of free field theories from [9, 5].

Let $\phi$ be the generalized free scalar field with mass measure $d\rho(m) = dm$ in $d = s + 1 = 3, 4$ spacetime dimensions, i.e.
$$\phi(f) = \frac{i}{\sqrt{2}}\,\bigl[a(T\bar{f}) - a(Tf)^{*}\bigr], \qquad f \in \mathcal{S}(\mathbb{R}^d), \tag{4.1}$$
with $a(\psi)$, $a(\psi)^{*}$ being the annihilation and creation operators on the symmetric Fock space over $L^2(\mathbb{R}^s \times \mathbb{R}_+, d^s p\, dm)$ and
$$Tf(p, m) := (2\omega_m(p))^{-1/2}\,\hat{f}(\omega_m(p), p), \qquad \omega_m(p) := \sqrt{m^2 + |p|^2}.$$
Let also $\lambda \in \mathbb{R}_+ \to n(\lambda) \in \mathbb{N}_0$ be a non-increasing function which diverges as $\lambda \to 0$, and for a double cone $O = (x + V_+) \cap (y - V_+)$ define $n(O) := n\bigl(\sqrt{(x - y)^2}\bigr)$, and consider the net of local algebras
$$\mathcal{A}_L(O) := \bigl\{e^{i\phi(f)} : f \in \Box^{n(O)} \mathcal{D}_{\mathbb{R}}(O)\bigr\}'', \tag{4.2}$$
with the obvious definition of the action of translations. In [9], developing an idea presented in [10], it is shown that the net $\mathcal{A}_L$ satisfies the standard assumptions, including weak additivity and essential Haag duality, and that the corresponding operators $\Theta^L_{\beta,O}$, defined as in Eq. (3.2), satisfy
$$\limsup_{\lambda \to 0} \|\Theta^L_{\lambda\beta,\lambda O}\|_p \le 1, \tag{4.3}$$
for all $\beta$, $O$ and all $0 < p \le 1$, and therefore, according to Proposition 3.2(ii), the net $\mathcal{A}_L$ has classical scaling limit.

We now consider the scaling limit of the free scalar field. Following [5], we use a non-standard, locally Fock representation of the local algebras in the Cauchy-data formulation of the free field. Denote by $\tilde{\mathfrak{W}}$ the (abstract) Weyl algebra over $(\mathcal{D}(\mathbb{R}^s), \tilde{\sigma})$, where
$$\tilde{\sigma}(f, g) = \mathrm{Im} \int_{\mathbb{R}^s} d^s x\, \overline{f(x)}\, g(x),$$
equipped with (mass dependent) actions $\alpha^{(m)}_{(\Lambda,x)}$ of the Poincaré group and $\tilde{\delta}_\lambda$ of dilations, and let $\omega^{(m)}$ denote the vacuum state with mass $m$ (see [5] for the explicit formulas, which will not be needed in the following).

The non-standard free field representation used in [5] is obtained in the following way. For an open ball $B \subset \mathbb{R}^s$, denote by $\tilde{\mathfrak{W}}(B)$ the Weyl algebra over $(\mathcal{D}(B), \tilde{\sigma})$. Then the states $\omega^{(m)} \restriction \tilde{\mathfrak{W}}(B)$ and $\omega^{(0)} \restriction \tilde{\mathfrak{W}}(B)$ are known to be normal to each other [16], and then, if $\pi^{(m)}$, $\pi^{(0)}$ are the GNS representations of $\tilde{\mathfrak{W}}$ induced by $\omega^{(m)}$, $\omega^{(0)}$, the von Neumann algebras $\pi^{(m)}(\tilde{\mathfrak{W}}(B))''$ and $\pi^{(0)}(\tilde{\mathfrak{W}}(B))''$ are isomorphic through an isomorphism connecting $\pi^{(m)}(W)$ and $\pi^{(0)}(W)$. It is also straightforward to verify that, using these isomorphisms, the action $\alpha^{(m)}$ of $\tilde{\mathcal{P}}_+^{\uparrow}$ on $\tilde{\mathfrak{W}}$ can be transported to an action on $\pi^{(0)}(\tilde{\mathfrak{W}})''$, still denoted by $\alpha^{(m)}$. Therefore, by defining
$$\mathcal{A}^{(m)}(\Lambda O_B + x) := \bigl\{\alpha^{(m)}_{(\Lambda,x)}\bigl(\pi^{(0)}(\tilde{W}(g))\bigr) : g \in \mathcal{D}(B)\bigr\}'',$$
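with $O_B = B''$ the causal completion of the $s$-dimensional ball $B$, one gets a net of local von Neumann algebras for the scalar field of mass $m$ which is represented on the Fock space $\mathcal{H}^{(0)}$ of the field of mass zero, but which is isomorphic to the mass $m$ net in the usual Fock representation. We also note that, $\omega^{(0)}$ being $\tilde{\delta}_\lambda$-invariant, $\tilde{\delta}_\lambda$ is unitarily implemented in this representation (but, of course, it is not an automorphism of the net $\mathcal{A}^{(m)}$, unless $m = 0$). From now on, we will drop the indication of the representation $\pi^{(0)}$.

Let $\mathfrak{A}^{(m)}(O)$ be the C*-algebra of the elements $A \in \mathcal{A}^{(m)}(O)$ such that $x \to \alpha_x^{(m)}(A)$ is norm-continuous. Due to the outer continuity of the net $\mathcal{A}^{(m)}$, $\mathcal{A}^{(m)}(O) = \bigcap_{O_1 \supset O} \mathcal{A}^{(m)}(O_1)$ [17], we have $\mathfrak{A}^{(m)}(O)^{-} = \mathcal{A}^{(m)}(O)$. Note that $\mathfrak{A}^{(m)}$ is also outer continuous. Let then $\underline{\mathfrak{A}}^{(m)}$ be the net of scaling algebras associated to $\mathfrak{A}^{(m)}$, and, with $\underline{\omega}_0$ any scaling limit state, and $\pi_0$ the corresponding GNS representation, we define $\mathfrak{A}_0^{(m)}$ as the outer regularized scaling limit net
$$\mathfrak{A}_0^{(m)}(O) := \bigcap_{O_1 \supset O} \pi_0(\underline{\mathfrak{A}}^{(m)}(O_1)). \tag{4.4}$$
One of the main results of [5] is that the formula
$$\tilde{\theta}_m(\pi_0(\underline{A})) = \text{w-}\lim_\kappa \tilde{\delta}_{\lambda_\kappa}^{-1}(\underline{A}_{\lambda_\kappa}), \tag{4.5}$$
with $(\lambda_\kappa)_\kappa \subset \mathbb{R}_+$ a net such that $\underline{\omega}_0 = \lim_\kappa \underline{\omega}_{\lambda_\kappa}$, defines a net isomorphism between $\mathfrak{A}_0^{(m)}$ and $\mathfrak{A}^{(0)}$, which is implemented by a unitary $\tilde{V}_m : \mathcal{H}_0 \to \mathcal{H}^{(0)}$.

Proposition 4.1. With the above notations, $\tilde{\theta}_m = \mathrm{Ad}\,\tilde{V}_m$ extends to a net isomorphism between the outer regularized net of von Neumann algebras
$$\mathcal{A}_0^{(m)}(O) := \bigcap_{O_1 \supset O} \pi_0(\underline{\mathfrak{A}}^{(m)}(O_1))^{-}, \tag{4.6}$$
and $\mathcal{A}^{(0)}$.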
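Proof. We have, by the outer regularity of $\mathcal{A}^{(0)}$,
$$\tilde{V}_m\,\mathfrak{A}_0^{(m)}(O)^{-}\,\tilde{V}_m^{*} = \mathcal{A}^{(0)}(O) = \bigcap_{O_1 \supset O} \mathcal{A}^{(0)}(O_1) = \tilde{V}_m \Bigl(\bigcap_{O_1 \supset O} \mathfrak{A}_0^{(m)}(O_1)^{-}\Bigr) \tilde{V}_m^{*},$$
and, therefore,
$$\mathfrak{A}_0^{(m)}(O)^{-} = \bigcap_{O_1 \supset O} \mathfrak{A}_0^{(m)}(O_1)^{-} \subseteq \bigcap_{O_1 \supset O} \pi_0(\underline{\mathfrak{A}}^{(m)}(O_1))^{-} \subseteq \bigcap_{O_1 \supset O} \mathfrak{A}_0^{(m)}(O_1)^{-},$$
so that, finally,
$$\tilde{\theta}_m(\mathcal{A}_0^{(m)}(O)) = \tilde{\theta}_m\Bigl(\bigcap_{O_1 \supset O} \pi_0(\underline{\mathfrak{A}}^{(m)}(O_1))^{-}\Bigr) = \mathcal{A}^{(0)}(O).$$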
We now pass to the construction of the class of models mentioned above. Let $G$ be a compact Lie group and $N \subset G$ a (proper and nontrivial) normal closed subgroup. It is then possible [18, Theorem 6.1.1] to find a finite set $\Delta$ of irreducible representations of $G$ which is symmetric and generating, i.e. such that for each representation belonging to $\Delta$ also its conjugate representation is in $\Delta$, and every irreducible representation of $G$ is a subrepresentation of a tensor product of representations from $\Delta$. Denote now by $\Delta_2$ the subset of $\Delta$ of those representations which are trivial on $N_2 := N$, and define another closed normal subgroup $N_1$ of $G$ as the annihilator of $\Delta_1 := \Delta \setminus \Delta_2$:
$$N_1 := \bigcap_{v \in \Delta_1} \ker v.$$
It follows then easily that $G$ is isomorphic to $G_1 \times G_2$, with $G_i := G/N_i$, $i = 1, 2$, and that $\Delta_i$ is a symmetric generating set of irreducible representations for $G_i$. We introduce the representations $v_i := \bigoplus_{v \in \Delta_i} v$ of $G_i$, and set $n_i := \dim v_i$, $i = 1, 2$. Since $\Delta_i$ is symmetric, there exists a unitary involution $J_i$ on $\mathbb{C}^{n_i}$ such that $J_i v_i(\cdot) J_i$ is the complex conjugate representation of $v_i$, and it is possible to find $m_i, p_i \in \mathbb{N}$ with $n_i = 2m_i + p_i$, and vectors $(e_k^{(i)})_{k=1,\dots,m_i+p_i}$ of $\mathbb{C}^{n_i}$ such that $\{e_1^{(i)}, \dots, e_{m_i+p_i}^{(i)}, J_i e_1^{(i)}, \dots, J_i e_{m_i}^{(i)}\}$ is an orthonormal basis and $J_i e_k^{(i)} = e_k^{(i)}$ for $k = m_i + 1, \dots, m_i + p_i$ (i.e. $e_{m_i+1}^{(i)}, \dots, e_{m_i+p_i}^{(i)}$ span the subspace of the real representations in $\Delta_i$).

The first factor $\mathcal{F}^{(1)}$ of our class of models is defined as the field net generated by a $v$-multiplet of generalized free fields (4.1) for each $v \in \Delta_1$, to which a suitable power of the d'Alembertian has been applied, as in (4.2). More specifically, on the symmetric Fock space $\mathcal{H}^{(1)}$ over $L^2(\mathbb{R}^s \times \mathbb{R}_+, d^s p\, dm) \otimes \mathbb{C}^{n_1}$, with vacuum vector $\Omega^{(1)}$, we consider the fields
$$\phi_k(f) = \frac{i}{\sqrt{2}}\,\bigl[a(T\bar{f} \otimes e_k^{(1)}) - a(Tf \otimes J_1 e_k^{(1)})^{*}\bigr], \qquad f \in \mathcal{S}(\mathbb{R}^d),\ k = 1, \dots, m_1 + p_1,$$
($\phi_1, \dots, \phi_{m_1}$ are complex generalized free fields, while $\phi_{m_1+1}, \dots, \phi_{m_1+p_1}$ are real generalized free fields), and, assuming that there is a $\lambda_0$ such that $n(\lambda) = 0$ for $\lambda \ge \lambda_0$, define the net of von Neumann algebras
$$\mathcal{F}^{(1)}(O) := \bigl\{e^{i[\phi_k(f) + \phi_k(f)^{*}]^{-}} : f \in \Box^{n(O)} \mathcal{D}_{\mathbb{R}}(O),\ k = 1, \dots, m_1 + p_1\bigr\}'',$$
with the obvious action $\alpha_x^{(1)} := \mathrm{Ad}\,U^{(1)}(x)$ of the translations and with the action $\beta_g^{(1)} := \mathrm{Ad}\,V^{(1)}(g)$ of $G_1$, where $V^{(1)}(g)$ is the second quantization of $\mathbb{1} \otimes v_1(g)$.

Proposition 4.2. The bosonic theory $(\mathcal{F}^{(1)}, U^{(1)}, V^{(1)}, \Omega^{(1)})$ is asymptotically $p$-nuclear for each $p \in (0, 1]$ and has a classical scaling limit. The associated net of observables $\mathcal{A}^{(1)} := (\mathcal{F}^{(1)})^{G_1}$ has DHR sectors in one-to-one correspondence with the unitary equivalence classes of irreducible representations of $G_1$.

Proof. It is a straightforward consequence of the fact that $\mathcal{F}^{(1)} \cong \mathcal{A}_L^{\otimes n_1}$ and of the, by now, common argument involving Lemma A.4, that for the nuclear operator $\Theta^{(1)}_{\beta,O}$ of $\mathcal{F}^{(1)}$ there holds $\Theta^{(1)}_{\beta,O} = (\Theta^L_{\beta,O})^{\otimes n_1}$ and $\|\Theta^{(1)}_{\beta,O}\|_p \le \|\Theta^L_{\beta,O}\|_p^{n_1}$ for each $p \in (0, 1]$, and therefore it follows from Eq. (4.3) that $\mathcal{F}^{(1)}$ is asymptotically $p$-nuclear and from Proposition 3.2(ii) that it has classical scaling limit.
As for the second part of the statement, it follows easily from the results in [9] that the theory under consideration satisfies the assumptions (1)–(7) in [19], and this entails that the DHR sectors of $\mathcal{A}^{(1)}$ which appear in $\mathcal{H}^{(1)}$ are in one-to-one correspondence with the classes of irreducible representations of $G_1$ [19, Theorem 3.6].

Remark. It may seem natural to conjecture that the net $\mathcal{A}^{(1)}$ has no other DHR sector apart from those described by the group $G_1$. The standard way to prove this would be to show that the net $\mathcal{F}^{(1)}$ satisfies the split property and a certain cohomological condition [4, Sec. 3.4.5]. While it is straightforward to verify that this is actually the case, as $\mathcal{A}^{(1)}$ does not satisfy Haag duality this can only be used to conclude that classes of irreducible representations of $G_1$ label classes of local 1-cocycles of $\mathcal{A}^{(1)}$ [4]. On the other hand, it follows from standard arguments that the dual net $\mathcal{A}^{(1)d}$ coincides with the $G_1$-fixed point net of the dual of the field net generated by the generalized free fields $\phi_k(f)$ (without further conditions on the test functions $f$) and such a net does not even satisfy the split property [20]. It is therefore an open problem to determine the complete superselection structure of $\mathcal{A}^{(1)}$.

The factor $\mathcal{F}^{(2)}$ in our model is the net generated by multiplets of free scalar fields belonging to the representations in $\Delta_2$. We consider therefore the Weyl algebra $\mathfrak{W}$ over $(\mathcal{D}(\mathbb{R}^s) \otimes \mathbb{C}^{n_2}, \sigma)$ with
$$\sigma(f \otimes \xi, g \otimes \eta) = \mathrm{Im}\Bigl(\int_{\mathbb{R}^s} d^s x\, \overline{f(x)}\, g(x)\,(\xi \cdot \eta)\Bigr),$$
where $(\xi \cdot \eta)$ is the standard scalar product of $\xi, \eta \in \mathbb{C}^{n_2}$, and pick a mass function $\mu : \Delta_2 \to \mathbb{R}_+$ such that $\mu(v) = \mu(\bar{v})$. As $\mathfrak{W}$ is isomorphic with $\tilde{\mathfrak{W}}^{\otimes_{\min} n_2}$, we define an action $\alpha^{(\mu)}$ of $\tilde{\mathcal{P}}_+^{\uparrow}$, an action $\delta$ of dilations and a vacuum state $\omega^{(\mu)}$ through
$$\alpha^{(\mu)}_{(\Lambda,x)} := \alpha^{(\mu_1)}_{(\Lambda,x)} \otimes \cdots \otimes \alpha^{(\mu_{n_2})}_{(\Lambda,x)}, \qquad \delta_\lambda := \tilde{\delta}_\lambda^{\otimes n_2}, \qquad \omega^{(\mu)} := \omega^{(\mu_1)} \otimes \cdots \otimes \omega^{(\mu_{n_2})},$$
where $(\mu_1, \dots, \mu_{n_2})$ is a vector obtained by repeating each value $\mu(v)$ exactly $\dim v$ times, for each $v \in \Delta_2$. Furthermore, an action $\beta^{(2)}$ of $G_2$ is defined by
$$\beta_g^{(2)}(W(f)) := W((\mathbb{1} \otimes v_2(g)) f), \qquad f \in \mathcal{D}(\mathbb{R}^s) \otimes \mathbb{C}^{n_2},\ g \in G_2.$$
If $0 : \Delta_2 \to \mathbb{R}_+$ is the identically vanishing function, it is easy to verify that the states $\omega^{(\mu)}$ and $\omega^{(0)}$ are locally normal to each other, and therefore, in analogy to the $n_2 = 1$ case, we define, in the GNS representation of $\omega^{(0)}$, the net
$$\mathcal{F}^{(\mu)}(\Lambda O_B + x) := \bigl\{\alpha^{(\mu)}_{(\Lambda,x)}(W(f)) : f \in \mathcal{D}(B) \otimes \mathbb{C}^{n_2}\bigr\}'',$$
where we suppressed the explicit indication of the GNS representation. The corresponding unitary representations of $\tilde{\mathcal{P}}_+^{\uparrow}$ and $G_2$ will be denoted by $U^{(\mu)}$, $V^{(2)}$, and the vacuum vector by $\Omega^{(\mu)}$.
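To make the bookkeeping of the mass vector concrete, here is a purely hypothetical example. The set $\Delta_2$, the dimensions and the block ordering below are assumptions chosen only for illustration and are not taken from the construction above: suppose $\Delta_2 = \{v, \bar{v}, w\}$ with $\dim v = \dim \bar{v} = 2$ and $w$ self-conjugate with $\dim w = 3$, and that $v_2 = v \oplus \bar{v} \oplus w$ is written in this block order.

```latex
% Hypothetical data: \Delta_2 = \{v, \bar v, w\}, \dim v = \dim \bar v = 2, \dim w = 3.
% The ordering of the entries assumes the block decomposition v_2 = v \oplus \bar v \oplus w.
\begin{align*}
n_2 &= \dim v + \dim \bar v + \dim w = 2 + 2 + 3 = 7,\\
(\mu_1,\dots,\mu_7) &= \bigl(\mu(v),\,\mu(v),\,\mu(\bar v),\,\mu(\bar v),\,\mu(w),\,\mu(w),\,\mu(w)\bigr),
\qquad \mu(v) = \mu(\bar v),\\
\omega^{(\mu)} &= \omega^{(\mu_1)} \otimes \cdots \otimes \omega^{(\mu_7)}.
\end{align*}
```

In this sketch only two distinct mass values occur, $\mu(v)$ and $\mu(w)$, and the constraint $\mu(v) = \mu(\bar{v})$ forces the first four tensor factors to carry the same mass.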
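Consider now a scaling limit state $\underline{\omega}_0^{(\mu)}$ of the scaling algebra $\underline{\mathfrak{F}}^{(\mu)}$ associated to $\mathcal{F}^{(\mu)}$, and the corresponding GNS representation $\pi_0$. In analogy to the scalar field case, we define the corresponding (outer regularized) scaling limit net $\mathcal{F}_0^{(\mu)}$ by
$$\mathcal{F}_0^{(\mu)}(O) := \bigcap_{O_1 \supset O} \pi_0(\underline{\mathfrak{F}}^{(\mu)}(O_1))^{-}. \tag{4.7}$$

Theorem 4.3. Let $s = 2, 3$, and $\underline{\omega}_0^{(\mu)}$ be a scaling limit state of $\underline{\mathfrak{F}}^{(\mu)}$. There is a net-isomorphism $\theta$ between $(\mathcal{F}_0^{(\mu)}, U_0^{(\mu)}, V_0^{(2)}, \Omega_0^{(\mu)})$ and $(\mathcal{F}^{(0)}, U^{(0)}, V^{(2)}, \Omega^{(0)})$, which is unitarily implemented and is such that
$$\theta(\pi_0(\underline{F})) = \text{w-}\lim_\kappa \delta_{\lambda_\kappa}^{-1}(\underline{F}_{\lambda_\kappa}), \tag{4.8}$$
with $(\lambda_\kappa)_\kappa \subset \mathbb{R}_+$ a net such that $\underline{\omega}_0^{(\mu)} = \lim_\kappa \underline{\omega}^{(\mu)}_{\lambda_\kappa}$.

Proof. We begin by introducing an auxiliary scaling algebra $\underline{\mathfrak{G}}^{(\mu)}$, which is defined as the scaling algebra associated to the net $(\mathcal{F}^{(\mu)}, U^{(\mu)}, \Omega^{(\mu)})$, i.e. which disregards the action of the gauge group. Of course $\underline{\mathfrak{F}}^{(\mu)}(O) \subseteq \underline{\mathfrak{G}}^{(\mu)}(O)$ for each $O$, and there exists a scaling limit state of $\underline{\mathfrak{G}}^{(\mu)}$ whose restriction to $\underline{\mathfrak{F}}^{(\mu)}$ coincides with $\underline{\omega}_0^{(\mu)}$. By a slight abuse of notation, we denote by $\pi_0$ the scaling limit representation of $\underline{\mathfrak{G}}^{(\mu)}$ thus obtained, and we define
$$\mathcal{G}_0^{(\mu)}(O) := \bigcap_{O_1 \supset O} \pi_0(\underline{\mathfrak{G}}^{(\mu)}(O_1))^{-}. \tag{4.9}$$
From the net isomorphism
$$(\mathcal{F}^{(\mu)}, U^{(\mu)}, \Omega^{(\mu)}) \simeq (\mathcal{A}^{(\mu_1)}\,\bar{\otimes}\cdots\bar{\otimes}\,\mathcal{A}^{(\mu_{n_2})},\ U^{(\mu_1)} \otimes \cdots \otimes U^{(\mu_{n_2})},\ \Omega^{(\mu_1)} \otimes \cdots \otimes \Omega^{(\mu_{n_2})}),$$
taking into account the fact that, according to [13, Proposition 3.1], each theory $\mathcal{A}^{(\mu_k)}$ is asymptotically $p$-nuclear for each $p \in (0, 1]$, and the fact that, according to Proposition 4.1, each scaling limit net $\mathcal{A}_0^{(\mu_k)}$ satisfies Haag duality, it follows, applying Theorem 3.8, that we have a net isomorphism $\theta$
$$(\mathcal{G}_0^{(\mu)}, U_0^{(\mu)}, \Omega_0^{(\mu)}) \simeq (\mathcal{A}^{(0)}\,\bar{\otimes}\cdots\bar{\otimes}\,\mathcal{A}^{(0)},\ U^{(0)} \otimes \cdots \otimes U^{(0)},\ \Omega^{(0)} \otimes \cdots \otimes \Omega^{(0)}) \simeq (\mathcal{F}^{(0)}, U^{(0)}, \Omega^{(0)}),$$
such that, for each $\underline{G}$ of the form $\underline{G}_\lambda = \underline{A}_{1\lambda} \otimes \cdots \otimes \underline{A}_{n_2\lambda}$, $\underline{A}_k \in \underline{\mathfrak{A}}^{(\mu_k)}$,
$$\begin{aligned}
\theta(\pi_0(\underline{G})) &= (\tilde{\theta}_{\mu_1} \otimes \cdots \otimes \tilde{\theta}_{\mu_{n_2}})\bigl((\pi_0^{(\mu_1)} \otimes \cdots \otimes \pi_0^{(\mu_{n_2})})(\underline{G})\bigr)\\
&= \text{w-}\lim_\kappa\,(\tilde{\delta}_{\lambda_\kappa}^{-1} \otimes \cdots \otimes \tilde{\delta}_{\lambda_\kappa}^{-1})(\underline{G}_{\lambda_\kappa})\\
&= \text{w-}\lim_\kappa\,\delta_{\lambda_\kappa}^{-1}(\underline{G}_{\lambda_\kappa}),
\end{aligned} \tag{4.10}$$
where $\pi_0^{(\mu_k)}$ is the scaling limit representation of $\underline{\mathfrak{A}}^{(\mu_k)}$ induced by the scaling limit state $\underline{A}_k \to \underline{\omega}_0^{(\mu)}(\mathbb{1} \otimes \cdots \otimes \underline{A}_k \otimes \cdots \otimes \mathbb{1}) = \underline{\omega}_0^{(\mu_k)}(\underline{A}_k)$. The last relation is then extended by linearity and continuity to the C*-subalgebra $\tilde{\underline{\mathfrak{G}}}^{(\mu)}$ of $\underline{\mathfrak{G}}^{(\mu)}$ generated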
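by all such functions. Furthermore, $\theta = \mathrm{Ad}(\tilde{V}_{\mu_1} \otimes \cdots \otimes \tilde{V}_{\mu_{n_2}})$. We now extend Eq. (4.10) to arbitrary elements of $\underline{\mathfrak{G}}^{(\mu)}$. From the proof of Theorem 3.8, it follows that $\pi_0(\underline{\mathfrak{G}}^{(\mu)}(O)) \subset \bigcap_{O_1 \supset O} \pi_0(\tilde{\underline{\mathfrak{G}}}^{(\mu)}(O_1))^{-}$, and, therefore, for $\underline{F} \in \underline{\mathfrak{G}}^{(\mu)}(O)$ and for each $O_B \supset O$ and $\varepsilon > 0$, we can find $\underline{G} \in \tilde{\underline{\mathfrak{G}}}^{(\mu)}(O_B)$ such that
$$\|[\pi_0(\underline{F}) - \pi_0(\underline{G})]\Omega_0\| < \varepsilon. \tag{4.11}$$
Furthermore, if $\underline{H} \in \underline{\mathfrak{G}}^{(\mu)}(O_B)$, we have
$$\begin{aligned}
\bigl\langle \theta(\pi_0(\underline{H}))\Omega^{(0)},\ [\theta(\pi_0(\underline{F})) - \delta_{\lambda_\kappa}^{-1}(\underline{F}_{\lambda_\kappa})]\,\theta(\pi_0(\underline{H}))\Omega^{(0)} \bigr\rangle
&= \bigl\langle \pi_0(\underline{H}^{*}\underline{H})\Omega_0,\ [\pi_0(\underline{F}) - \pi_0(\underline{G})]\Omega_0 \bigr\rangle\\
&\quad + \bigl\langle \theta(\pi_0(\underline{H}))\Omega^{(0)},\ [\theta(\pi_0(\underline{G})) - \delta_{\lambda_\kappa}^{-1}(\underline{G}_{\lambda_\kappa})]\,\theta(\pi_0(\underline{H}))\Omega^{(0)} \bigr\rangle\\
&\quad + \bigl\langle \delta_{\lambda_\kappa}\bigl(\theta(\pi_0(\underline{H}^{*}\underline{H}))\bigr)\Omega^{(0)},\ [\underline{F}_{\lambda_\kappa} - \underline{G}_{\lambda_\kappa}]\Omega^{(0)} \bigr\rangle,
\end{aligned} \tag{4.12}$$
and the three terms on the right-hand side of this equation can be made arbitrarily small, for sufficiently large $\kappa$, thanks, respectively, to (4.11), to (4.10) and to the fact that, since $\omega^{(\mu)}$ and $\omega^{(0)}$ are locally normal to each other, we have, by (2.2),
$$\begin{aligned}
\lim_\kappa \|(\underline{F}_{\lambda_\kappa} - \underline{G}_{\lambda_\kappa})\Omega^{(0)}\|^2
&= \lim_\kappa \omega^{(0)}\bigl((\underline{F}_{\lambda_\kappa} - \underline{G}_{\lambda_\kappa})^{*}(\underline{F}_{\lambda_\kappa} - \underline{G}_{\lambda_\kappa})\bigr)\\
&= \lim_\kappa \omega^{(\mu)}\bigl((\underline{F}_{\lambda_\kappa} - \underline{G}_{\lambda_\kappa})^{*}(\underline{F}_{\lambda_\kappa} - \underline{G}_{\lambda_\kappa})\bigr)\\
&= \|[\pi_0(\underline{F}) - \pi_0(\underline{G})]\Omega_0\|^2.
\end{aligned}$$
Then, since $\theta(\pi_0(\underline{\mathfrak{G}}^{(\mu)}(O_B)))\Omega^{(0)}$ is dense in $\mathcal{H}^{(0)}$ (its closure contains $\mathcal{F}^{(0)}(O_1)\Omega^{(0)}$ for each $O_1 \supset O_B$) and $(\delta_{\lambda_\kappa}^{-1}(\underline{F}_{\lambda_\kappa}))_\kappa$ is a bounded net, we conclude that (4.10) holds for each element of $\underline{\mathfrak{G}}^{(\mu)}$.

In order to complete the proof, we need only show that $\mathcal{G}_0^{(\mu)}(O) = \mathcal{F}_0^{(\mu)}(O)$. To this end, let $\underline{F} \in \underline{\mathfrak{G}}^{(\mu)}(O)$, and, for $\psi \in L^1(G)$, consider $\underline{\beta}^{(2)}_\psi \underline{F} \in \underline{\mathfrak{F}}^{(\mu)}(O)$. Thanks to (4.10) and to the fact that $\delta_\lambda$ and $\beta_g^{(2)}$ are commuting and unitarily implemented on $\mathcal{H}^{(0)}$, we have that, for each $\Phi \in \mathcal{H}^{(0)}$, we can find a sequence $(\lambda_n)_{n \in \mathbb{N}}$ converging to 0 such that
$$\langle \Phi, \theta(\pi_0(\underline{\beta}^{(2)}_\psi \underline{F}))\Phi \rangle = \lim_{n \to +\infty} \int_G dg\, \psi(g)\, \langle \Phi, \beta_g^{(2)} \delta_{\lambda_n}^{-1}(\underline{F}_{\lambda_n})\Phi \rangle.$$
But, using again (4.10), we have
$$\lim_{n \to +\infty} \langle \Phi, \beta_g^{(2)} \delta_{\lambda_n}^{-1}(\underline{F}_{\lambda_n})\Phi \rangle = \langle \Phi, \beta_g^{(2)}(\theta(\pi_0(\underline{F})))\Phi \rangle,$$
and therefore, applying the dominated convergence theorem,
$$\theta(\pi_0(\underline{\beta}^{(2)}_\psi \underline{F})) = \int_G dg\, \psi(g)\, \beta_g^{(2)}(\theta(\pi_0(\underline{F}))).$$
This last equation entails that, if $(\psi_n)_{n \in \mathbb{N}}$ is a $\delta$-sequence in $L^1(G)$, then $\theta(\pi_0(\underline{\beta}^{(2)}_{\psi_n} \underline{F}))$ converges strongly to $\theta(\pi_0(\underline{F}))$, i.e. $\theta(\pi_0(\underline{\mathfrak{G}}^{(\mu)}(O))) \subseteq$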
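$\theta(\pi_0(\underline{\mathfrak{F}}^{(\mu)}(O)))^{-} \subseteq \theta(\pi_0(\underline{\mathfrak{G}}^{(\mu)}(O)))^{-}$, and the cyclic Hilbert spaces $\overline{\pi_0(\underline{\mathfrak{G}}^{(\mu)})\Omega_0}$ and $\overline{\pi_0(\underline{\mathfrak{F}}^{(\mu)})\Omega_0}$ coincide (recall that $\theta$ is unitarily implemented), from which the equality of the nets $\mathcal{F}_0^{(\mu)}$ and $\mathcal{G}_0^{(\mu)}$ readily follows.

Knowing explicitly the scaling limit of $\mathcal{F}^{(\mu)}$, it is not difficult to show that the superselection structure of the observable net $\mathcal{A}^{(\mu)}(O) = \mathcal{F}^{(\mu)}(O)^{G_2}$ is entirely preserved.

Theorem 4.4. Each covariant, finite statistics sector of the net $\mathcal{A}^{(\mu)}$ is preserved in each scaling limit state $\underline{\omega}_0^{(\mu)}$.

Proof. As recalled in Sec. 2, we must find, for each covariant, finite statistics sector $\xi$ of $\mathcal{A}^{(\mu)}$ and for each double cone $O$, a scaled multiplet $\psi_j(\lambda) \in \mathcal{F}^{(\mu)}(\lambda O)$, $j = 1, \dots, d$, $d$ the statistical dimension of $\xi$, and for each $j$ an $\underline{F} \in \underline{\mathfrak{F}}^{(\mu)}(O_1)$ with $O_1 \supset \bar{O}$, for which Eq. (2.3) holds.

To this end, let $O = \Lambda O_B + x$, take a multiplet $\psi_j \in \mathcal{F}^{(\mu)}(O_B) = \mathcal{F}^{(0)}(O_B)$ associated to the sector $\xi$, and define $\psi_j(\lambda) := \alpha^{(\mu)}_{(\Lambda,\lambda x)} \delta_\lambda(\psi_j) \in \mathcal{F}^{(\mu)}(\lambda O)$. We claim that for this multiplet, Eq. (2.3) is satisfied.

Since $\beta_g^{(2)}$ commutes with $\alpha^{(\mu)}_{(\Lambda,\lambda x)} \delta_\lambda$, it is obvious that $\psi_j(\lambda)$, $j = 1, \dots, d$, is a multiplet of class $\xi$. Now, since $\theta^{-1}(\psi_j) \in \mathcal{F}_0^{(\mu)}(O_B) = \bigcap_{O_2 \supset O_B} \pi_0(\underline{\mathfrak{F}}^{(\mu)}(O_2))^{-}$, for each $\varepsilon > 0$ and $O_2 \supset \bar{O}_B$, we can find $\underline{G} \in \underline{\mathfrak{F}}^{(\mu)}(O_2)$ such that
$$\|[\theta^{-1}(\psi_j) - \pi_0(\underline{G})]\Omega_0\| + \|[\theta^{-1}(\psi_j) - \pi_0(\underline{G})]^{*}\Omega_0\| < \varepsilon.$$
We have then the following chain of equalities, where $[\cdots]^{\sharp}$ stands for either $[\cdots]$ or $[\cdots]^{*}$:
$$\begin{aligned}
\|[\theta^{-1}(\psi_j) - \pi_0(\underline{G})]^{\sharp}\Omega_0\|
&= \|[\psi_j - \theta(\pi_0(\underline{G}))]^{\sharp}\Omega^{(0)}\|\\
&= \lim_\kappa \|[\psi_j - \delta_{\lambda_\kappa}^{-1}(\underline{G}_{\lambda_\kappa})]^{\sharp}\Omega^{(0)}\|\\
&= \lim_\kappa \|[\delta_{\lambda_\kappa}(\psi_j) - \underline{G}_{\lambda_\kappa}]^{\sharp}\Omega^{(0)}\|\\
&= \lim_\kappa \|[\delta_{\lambda_\kappa}(\psi_j) - \underline{G}_{\lambda_\kappa}]^{\sharp}\Omega^{(\mu)}\|\\
&= \lim_\kappa \bigl\|\bigl[\psi_j(\lambda_\kappa) - \bigl(\underline{\alpha}^{(\mu)}_{(\Lambda,x)}\underline{G}\bigr)_{\lambda_\kappa}\bigr]^{\sharp}\Omega^{(\mu)}\bigr\|,
\end{aligned}$$
where in the fourth equality we have again applied Eq. (2.2) to $\omega^{(\mu)}$ and $\omega^{(0)}$. Therefore, taking into account the remark about Poincaré covariance in Sec. 2, we get (2.3) with $\underline{F} := \underline{\alpha}^{(\mu)}_{(\Lambda,x)}\underline{G}$.

From the above theorem and from [6, Corollary 6.2] the following result readily follows.

Corollary 4.5. All local intertwiners between DHR endomorphisms of $\mathcal{A}^{(\mu)}$ are also global intertwiners.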
We remark that a property closely related to the equivalence of local and global intertwiners has been recently proven, under quite general assumptions, in the context of locally covariant theories [21].

The above results may be summarized in the following theorem, which expresses the existence of a rather vast class of decent quantum field theory models possessing non-preserved DHR sectors.

Theorem 4.6. For each pair $(G, N)$ with $G$ a compact Lie group and $N \subset G$ a normal closed subgroup, there exists a bosonic QFTGA $(\mathcal{F}, U, V, \Omega)$ such that the associated observable net $\mathcal{A} := \mathcal{F}^{G}$ fulfills the following properties:
(i) $\mathcal{A}$ has a subset of DHR sectors which are in 1-1 correspondence with the unitary equivalence classes of irreducible representations of $G$;
(ii) $\mathcal{A}$ has a unique quantum scaling limit according to the classification of [1];
(iii) among the sectors of $\mathcal{A}$, only those which correspond to representations of $G$ which are trivial on $N$ are preserved in any scaling limit state;
(iv) the set of DHR sectors of each (outer regularized) scaling limit net $\mathcal{A}_0$ of $\mathcal{A}$ is in 1-1 correspondence with the unitary equivalence classes of irreducible representations of $G/N$.

Proof. For a given choice of a finite symmetric generating set $\Delta = \Delta_1 \cup \Delta_2$ of irreducible representations of $G$ and a mass function $\mu : \Delta_2 \to \mathbb{R}_+$ as above, we form the tensor product theory $\mathcal{F} := \mathcal{F}^{(1)} \otimes \mathcal{F}^{(\mu)}$. It is easy to check, as in the proof of Proposition 4.2, that the sectors of $\mathcal{A} = \mathcal{F}^{G}$ which appear in the Hilbert space $\mathcal{H}$ on which $\mathcal{F}$ acts are in 1-1 correspondence with classes of irreducible representations of $G$, thereby proving (i). It then follows from Theorem 3.8, Proposition 4.2 and Theorem 4.3 that each outer regularized scaling limit net $\mathcal{F}_0$ of $\mathcal{F}$ is unitarily equivalent to the net $\mathcal{F}^{(0)}$ and then, since, according to the results in [15], to each scaling limit state $\underline{\omega}_0$ of $\mathcal{A}$ there corresponds uniquely a scaling limit state of $\mathcal{F}$ whose restriction to $\mathcal{A}$ coincides with $\underline{\omega}_0$, statement (ii) follows.

Property (iii) is the content of Theorem 4.4, and finally property (iv) follows from the fact that $\mathcal{F}^{(0)}$, being a finite tensor product of free scalar field nets, satisfies the split property and Roberts' cohomological condition, and it is therefore a complete field net, i.e. all DHR sectors of $\mathcal{A}_0 \cong \mathcal{F}^{(0)\,G/N}$ are implemented by $G/N$-multiplets in $\mathcal{F}^{(0)}$.

As noted in the remark following Proposition 4.2, it may also happen that $\mathcal{A}$ has more DHR sectors than those described by $\mathcal{F}$, but, since $\mathcal{A}_0$ has precisely the sectors described by $\mathcal{F}^{(0)}$, these additional sectors would not be preserved under the scaling limit either.

5. Conclusions and Outlook

In this work, we have presented very simple quantum field theory models possessing DHR superselection sectors which are not preserved under the scaling limit
operation, i.e. sectors of the underlying theory which are not also sectors of the scaling limit. The way in which these sectors are obtained provides a simple illustration of the physical mechanism which may be expected to lead to the appearance of non-preserved sectors in more realistic, interacting theories. As already discussed in [7], charges will disappear in the scaling limit if they have some kind of "internal structure" which, in order for it to be "squeezed" in a region of radius $\lambda$, requires an amount of energy growing faster than $\lambda^{-1}$. Therefore, one can expect that the fields carrying such a charge will have rather bad ultraviolet properties, and this is actually the case for the fields $\Box^{n(\lambda)}\phi_k(x)$ employed in constructing our models at scale $\lambda$.

As we mentioned above, our examples, since they are built by making use of generalized free fields with constant mass measure, do not satisfy Haag duality, but only essential duality. This leaves open the possibility that requiring Haag duality rules out the existence of non-preserved sectors, but, in view of the physical picture just discussed, one may be tempted to exclude that this is actually the case.

As a tool for building the models, we have derived sufficient conditions under which the scaling limit of a tensor product theory coincides with the tensor product of the scaling limits of the factor theories, the main such condition being a requirement of asymptotic nuclearity for the factor theories. While we do not have any example of theories not satisfying such a hypothesis for which the scaling limit and tensor product operations do not commute, it seems to us quite natural that some kind of phase space condition has to play a role in such questions, particularly in view of the fact that, if a specific such condition holds, scaling limits are limits with respect to a suitable metric in a suitable space of nets of C*-algebras [22], and therefore they should enjoy good functorial properties.

This is connected with the fact that we required $G$ to be not just a compact group, but a Lie one, which is due to the following technical reason: according to the results in [13], an infinite tensor product of free scalar field theories is not asymptotically $p$-nuclear for $0 < p < 1/3$, and therefore we have to use, in our construction, pairs $(G, N)$ for which the set $\Delta_2$ which generates the representations which are trivial on $N$ is finite, i.e. even if we just assume that $G$ is a compact group, we have in any case to require $G/N$ to be a Lie group. It is therefore not clear if it is possible to construct, along the lines presented above, examples of theories having, as in [8], an arbitrary compact group $G$ as a gauge group, and such that the sectors associated to an arbitrary normal closed subgroup $N \subset G$ are non-preserved.

Acknowledgments

The authors would like to thank D. Buchholz and L. Zsido for numerous helpful discussions and suggestions. G.M. acknowledges the kind hospitality of the Institute of Theoretical Physics of Göttingen University during some stages of this work. The authors are supported by MIUR, INdAM-GNAMPA, and the Network "Quantum Spaces–Noncommutative Geometry" HPRN-CT-2002-00280.
A. Some Results on Nuclear Maps

For the interested reader, in this appendix we collect some elementary results about nuclear maps between Banach spaces which are used in the main text, but whose proofs are not easily found in the existing literature. We begin by recalling the definition of a nuclear map between Banach spaces.

Definition A.1. Let $X$, $Y$ be Banach spaces and $p \in (0, 1]$. A bounded linear map $T \in B(X, Y)$ is said to be $p$-nuclear if there exist sequences $(f_n)_{n \in \mathbb{N}} \subset X^{*}$ and $(y_n)_{n \in \mathbb{N}} \subset Y$ such that
$$Tx = \sum_{n=1}^{+\infty} f_n(x)\, y_n, \quad \forall\, x \in X, \qquad \sum_{n=1}^{+\infty} \|f_n\|_{X^{*}}^{p}\, \|y_n\|_{Y}^{p} < +\infty.$$
The nuclear $p$-norm of $T$ is defined as
$$\|T\|_p := \inf \Bigl(\sum_{n=1}^{+\infty} \|f_n\|_{X^{*}}^{p}\, \|y_n\|_{Y}^{p}\Bigr)^{1/p},$$
where the infimum is taken over all possible decompositions of $T$ as above.

The $p$-nuclear maps form a vector space equipped with the quasi-norm $\|\cdot\|_p$ [23, Sec. 19.7]. A closely related concept is the one of $\varepsilon$-content of a compact map.

Definition A.2. Let $X$, $Y$ be Banach spaces, and $T \in B(X, Y)$. The $\varepsilon$-content of $T$, denoted by $N_T(\varepsilon)$, is the maximal number of elements $x_i \in X$, $\|x_i\| \le 1$, $i = 1, \dots, N_T(\varepsilon)$, such that $\|T(x_i - x_j)\| > \varepsilon$ for $i \ne j$.

It is easy to check that $N_T(\varepsilon) < +\infty$ for all $\varepsilon > 0$ if and only if $T$ is a compact map. For the convenience of the reader, we summarize in the next lemma some results, which are used in the main body of the paper, about the relationships between $\varepsilon$-content and nuclearity for maps with values in a Hilbert space. For their proof, we refer the reader to [10, Lemma 2.1] and to the references cited there.

Lemma A.3. Let $X$ be a Banach space, $H$ a Hilbert space and $T \in B(X, H)$.
(i) If $0 < p < 1$ and $q > p/(1 - p)$, there exists a constant $c = c_{p,q} > 0$ such that, for each $p$-nuclear $T$, there holds
$$N_T(\varepsilon) \le \exp\Bigl(\frac{c\,\|T\|_p^{q}}{\varepsilon^{q}}\Bigr). \tag{A.1}$$
(ii) If $0 < p \le 1$, there exists a constant $d = d_p > 0$ such that if there exists a sequence of positive numbers $(\varepsilon_m)_{m \in \mathbb{N}}$ with
$$\sum_{m=1}^{+\infty} \bigl(m^{1/2}\, \varepsilon_m\, N_T(\varepsilon_m)^{1/m}\bigr)^{p} < +\infty,$$
then the map $T$ is $p$-nuclear and
$$\|T\|_p \le d\, \Bigl(\sum_{m=1}^{+\infty} \bigl(m^{1/2}\, \varepsilon_m\, N_T(\varepsilon_m)^{1/m}\bigr)^{p}\Bigr)^{1/p}. \tag{A.2}$$
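As an illustration of how the summability criterion of part (ii), $\sum_m (m^{1/2}\varepsilon_m N_T(\varepsilon_m)^{1/m})^p < +\infty$, is typically checked, the following sketch works under a hypothetical bound that is not taken from the text: we assume that $N_T(\varepsilon) \le \exp(c\,\varepsilon^{-q})$ for some constants $c > 0$ and $0 < q < 2/3$, and we pick (again as an assumption) the sequence $\varepsilon_m := m^{-1/q}$.

```latex
% Assumption (illustrative only): N_T(\varepsilon) \le e^{c\,\varepsilon^{-q}}, with c > 0 and 0 < q < 2/3.
% Choice of sequence (also an assumption): \varepsilon_m := m^{-1/q}, so that \varepsilon_m^{-q} = m.
\begin{align*}
N_T(\varepsilon_m)^{1/m} &\le \bigl(e^{c\,m}\bigr)^{1/m} = e^{c},\\
\bigl(m^{1/2}\,\varepsilon_m\,N_T(\varepsilon_m)^{1/m}\bigr)^{p}
  &\le e^{cp}\, m^{-p\left(\frac{1}{q}-\frac{1}{2}\right)},\\
\sum_{m=1}^{+\infty} m^{-p\left(\frac{1}{q}-\frac{1}{2}\right)} < +\infty
  &\iff p\Bigl(\frac{1}{q}-\frac{1}{2}\Bigr) > 1
  \iff p > \frac{2q}{2-q}.
\end{align*}
```

Under these assumptions $T$ would be $p$-nuclear for every $p$ with $2q/(2-q) < p \le 1$, a range which is nonempty precisely because $q < 2/3$.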
We will have to deal with tensor products of nuclear maps, and the following lemma will be useful in several instances.

Lemma A.4. Let $T_i : X_i \to Y_i$ be $p$-nuclear maps, $i = 1, 2$, and let $\|\cdot\|_\alpha$, $\|\cdot\|_\beta$ be cross-norms on the algebraic tensor products $X_1 \otimes X_2$, $Y_1 \otimes Y_2$, respectively. Assume further that $\|\cdot\|_\alpha$ majorizes the injective cross-norm on $X_1 \otimes X_2$. Then there exists a unique bounded operator $T_1 \otimes T_2 : X_1 \otimes_\alpha X_2 \to Y_1 \otimes_\beta Y_2$ such that $T_1 \otimes T_2(x_1 \otimes x_2) = T_1(x_1) \otimes T_2(x_2)$, and there holds
$$\|T_1 \otimes T_2\|_p \le \|T_1\|_p\, \|T_2\|_p.$$

Proof. Uniqueness of $T_1 \otimes T_2$ is immediate. We prove existence. For a given $\varepsilon > 0$, we can find sequences $(f_{i,n})_{n \in \mathbb{N}} \subset X_i^{*}$, $(y_{i,n})_{n \in \mathbb{N}} \subset Y_i$, $i = 1, 2$, such that
$$T_i(x_i) = \sum_{n=1}^{+\infty} f_{i,n}(x_i)\, y_{i,n}, \qquad \sum_{n=1}^{+\infty} \|f_{i,n}\|^{p}\, \|y_{i,n}\|^{p} < (\|T_i\|_p + \varepsilon)^{p}.$$
Since $\|\cdot\|_\alpha$ majorizes the injective cross-norm, the induced norm $\|\cdot\|_{\alpha^{*}}$ on $X_1^{*} \otimes X_2^{*}$ is a cross-norm [24, Proposition IV.2.2] and therefore the algebraic tensor product $f_{1,n} \otimes f_{2,m}$ extends to an element of $X_1^{*} \otimes_{\alpha^{*}} X_2^{*} = (X_1 \otimes_\alpha X_2)^{*}$, denoted by the same symbol. Furthermore, since $p \le 1$, it is easy to check that
$$T_1 \otimes T_2(x) = \sum_{n,m=1}^{+\infty} f_{1,n} \otimes f_{2,m}(x)\; y_{1,n} \otimes y_{2,m}$$
defines a bounded $T_1 \otimes T_2 : X_1 \otimes_\alpha X_2 \to Y_1 \otimes_\beta Y_2$ such that $T_1 \otimes T_2(x_1 \otimes x_2) = T_1(x_1) \otimes T_2(x_2)$, and from
$$\sum_{n,m=1}^{+\infty} \|f_{1,n} \otimes f_{2,m}\|_{\alpha^{*}}^{p}\, \|y_{1,n} \otimes y_{2,m}\|_{\beta}^{p} < (\|T_1\|_p + \varepsilon)^{p}\, (\|T_2\|_p + \varepsilon)^{p},$$
and the arbitrariness of $\varepsilon$, we get the estimate $\|T_1 \otimes T_2\|_p \le \|T_1\|_p\, \|T_2\|_p$.

We will apply this result to the case in which the $X_i$ are C*-algebras and $\|\cdot\|_\alpha$ is the minimal C*-cross-norm, for which the above hypotheses are satisfied [24, Sec. IV.4].

Lemma A.5. Let $X$ be a Banach space, $H$ be a Hilbert space and $T : X \to H$ a $p$-nuclear map, $0 < p < 2/3$. There exist an orthonormal system $(\xi_n)_{n \in \mathbb{N}}$ in $\operatorname{ran} T$ and a sequence $(f_n)_{n \in \mathbb{N}} \subset X^{*}$ such that for each $q$, $4p/(p + 2) < q \le 1$, there holds
$$Tx = \sum_{n=1}^{+\infty} f_n(x)\, \xi_n, \qquad \sum_{n=1}^{+\infty} \|f_n\|^{q} < +\infty.$$
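The restriction $p < 2/3$ and the lower bound $4p/(p+2)$ on $q$ can be read off from two elementary equivalences. The short computation below is only a sketch: it takes for granted that the proof reduces the claim to the convergence of a power series of the form $\sum_k k^{1 - q/2 - q/p}$, which is the estimate appearing in the argument that follows.

```latex
\begin{align*}
\frac{4p}{p+2} < 1 &\iff 4p < p + 2 \iff p < \frac{2}{3},\\
\sum_{k=1}^{+\infty} k^{\,1-\frac{q}{2}-\frac{q}{p}} < +\infty
  &\iff \frac{q}{2}+\frac{q}{p}-1 > 1
  \iff q\,\frac{p+2}{2p} > 2
  \iff q > \frac{4p}{p+2}.
\end{align*}
```

For instance, $p = 1/2$ gives the admissible range $4/5 < q \le 1$.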
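Proof. The proposition is trivial if $T$ is of finite rank, so we assume that this is not the case. Let $(\zeta_k)_{k \in \mathbb{N}} \subset H$, $(g_k)_{k \in \mathbb{N}} \subset X^{*}$ be such that
$$Tx = \sum_{k=1}^{+\infty} g_k(x)\, \zeta_k, \qquad \sum_{k=1}^{+\infty} \|g_k\|^{p}\, \|\zeta_k\|^{p} < +\infty,$$
and we can assume that the sequence $a_k := \|g_k\|\, \|\zeta_k\|$ is non-increasing. If $E$ is the projection on $\operatorname{ran} T$, we have $Tx = ETx = \sum_{k=1}^{+\infty} g_k(x)\, E\zeta_k$, so that we can also assume that $\zeta_k \in \operatorname{ran} T$, and of course that $\zeta_k \ne 0$. We now define inductively a subsequence $(\eta_m)_{m \in \mathbb{N}} \subset (\zeta_k)_{k \in \mathbb{N}}$ of linearly independent vectors in the following way: $\eta_1 := \zeta_1$, and having defined linearly independent vectors $\{\eta_1, \dots, \eta_m\} \subset (\zeta_k)_{k \in \mathbb{N}}$, it will be $\eta_m = \zeta_k$ for some $k$, and $\eta_{m+1}$ will be defined to be the first vector in $(\zeta_h)_{h \ge k+1}$ which is linearly independent from $\{\eta_1, \dots, \eta_m\}$. It is clear from this construction that each vector $\zeta_k$ will be a linear combination of the vectors $\eta_1, \dots, \eta_k$ at most. Let now $(\xi_n)_{n \in \mathbb{N}} \subset \operatorname{ran} T$ be the orthonormal system obtained by applying the Gram–Schmidt procedure to $(\eta_m)_{m \in \mathbb{N}}$. It also holds that $\eta_m$ is in the subspace spanned by $\xi_1, \dots, \xi_m$, so that we will have
$$\zeta_k = \sum_{n=1}^{k} \alpha_{kn}\, \xi_n, \qquad \|\zeta_k\|^2 = \sum_{n=1}^{k} |\alpha_{kn}|^2,$$
for suitable scalars $(\alpha_{kn})_{1 \le n \le k < +\infty}$, and, if $q \le 1$, it follows, thanks to the convexity properties of the function $t \to t^{q/2}$, that
$$\|\zeta_k\|^{q} = k^{q/2}\, \Bigl(\frac{1}{k} \sum_{n=1}^{k} |\alpha_{kn}|^2\Bigr)^{q/2} \ge k^{\frac{q}{2} - 1} \sum_{n=1}^{k} |\alpha_{kn}|^{q}. \tag{A.3}$$
Since the sequence $a_k = \|g_k\|\, \|\zeta_k\|$ is non-increasing, from the inequalities
$$k\, a_k^{p} \le \sum_{h=1}^{k} a_h^{p} \le \sum_{h=1}^{+\infty} a_h^{p} < +\infty,$$
we get that there exists a constant $c > 0$ such that $a_k < c/k^{1/p}$. It follows then from this observation and from Eq. (A.3) that, if we set $\alpha_{kn} = 0$ for $n > k$,
$$\sum_{k=1}^{+\infty} \sum_{n=1}^{+\infty} |\alpha_{kn}|^{q}\, \|g_k\|^{q} \le \sum_{k=1}^{+\infty} k^{1 - \frac{q}{2}}\, \|\zeta_k\|^{q}\, \|g_k\|^{q} \le c^{q} \sum_{k=1}^{+\infty} k^{1 - \frac{q}{2} - \frac{q}{p}}$$
is convergent if $q > 4p/(p + 2)$, which entails that $f_n := \sum_{k=n}^{+\infty} \alpha_{kn}\, g_k$ is a bounded functional on $X$ with
$$\sum_{n=1}^{+\infty} \|f_n\|^{q} \le \sum_{n=1}^{+\infty} \sum_{k=n}^{+\infty} |\alpha_{kn}|^{q}\, \|g_k\|^{q} = \sum_{k=1}^{+\infty} \sum_{n=1}^{+\infty} |\alpha_{kn}|^{q}\, \|g_k\|^{q} < +\infty,$$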
where we have used the fact that if $0 < q \leq 1$ and $a, b > 0$ then $(a+b)^q \leq a^q + b^q$, and that it is possible to interchange the sums since the double sum is absolutely convergent. This also implies that for each $x \in X$ the double series $\sum_{k=1}^{+\infty}\sum_{n=1}^{+\infty} \alpha_{kn} g_k(x)\xi_n$ is absolutely convergent in $\mathcal{H}$, and therefore it is allowed to interchange sums in
\[
Tx = \sum_{k=1}^{+\infty} g_k(x)\zeta_k = \sum_{k=1}^{+\infty} g_k(x)\sum_{n=1}^{k}\alpha_{kn}\xi_n = \sum_{n=1}^{+\infty}\xi_n\sum_{k=n}^{+\infty}\alpha_{kn}g_k(x) = \sum_{n=1}^{+\infty} f_n(x)\xi_n,
\]
which concludes the proof.

References

[1] D. Buchholz and R. Verch, Scaling algebras and renormalization group in algebraic quantum field theory, Rev. Math. Phys. 7 (1995) 1195–1239.
[2] R. Haag, Local Quantum Physics, 2nd edn. (Springer, 1996).
[3] D. Buchholz, Quarks, gluons, colour: Facts or fiction?, Nucl. Phys. B 469 (1996) 333–356.
[4] J. E. Roberts, Lectures on algebraic quantum field theory, in The Algebraic Theory of Superselection Sectors. Introduction and Recent Results (Palermo, 1989), ed. D. Kastler (World Scientific, 1990), pp. 1–112.
[5] D. Buchholz and R. Verch, Scaling algebras and renormalization group in algebraic quantum field theory. II: Instructive examples, Rev. Math. Phys. 10 (1998) 775–800.
[6] C. D'Antoni, G. Morsella and R. Verch, Scaling algebras for charged fields and short-distance analysis for localizable and topological charges, Ann. Henri Poincaré 5 (2004) 809–871.
[7] C. D'Antoni, G. Morsella and R. Verch, Scaling algebras for charge carrying quantum fields and superselection structure at short distances, to appear in Proc. Young Researchers Symposium XIV Int. Cong. Math. Phys. (Lisbon, 2003).
[8] S. Doplicher and G. Piacitelli, Any compact group is a gauge group, Rev. Math. Phys. 14 (2002) 873–886.
[9] M. Lutz, Ein lokales Netz ohne Ultraviolettfixpunkte der Renormierungsgruppe, diploma thesis, Hamburg University (1997).
[10] D. Buchholz, Phase space properties of local observables and structure of scaling limits, Ann. Inst. Henri Poincaré 64 (1996) 433–460.
[11] J. E. Roberts, Some applications of dilatation invariance to structural questions in the theory of local observables, Commun. Math. Phys. 37 (1974) 273–286.
[12] S. Doplicher and J. E. Roberts, Why there is a field algebra with a compact gauge group describing the superselection structure in particle physics, Commun. Math. Phys. 131 (1990) 51–107.
[13] S. Mohrdieck, Phase space structure and short distance behaviour of local quantum field theories, J. Math. Phys. 43 (2002) 3565–3574.
[14] S. Sakai, C*-Algebras and W*-Algebras, Ergebnisse der Mathematik, No. 40 (Springer-Verlag, 1971).
[15] R. Conti and G. Morsella, work in progress.
[16] J.-P. Eckmann and J. Fröhlich, Unitary equivalence of local algebras in the quasifree representation, Ann. Inst. Henri Poincaré Sect. A (N.S.) 20 (1974) 201–209.
[17] H. Araki, A lattice of von Neumann algebras associated with the quantum theory of a free boson field, J. Math. Phys. 4 (1963) 1343–1362.
[18] J. F. Price, Lie Groups and Compact Groups, London Mathematical Society Lecture Notes Series, No. 25 (Cambridge University Press, 1977).
[19] S. Doplicher, R. Haag and J. E. Roberts, Fields, observables and gauge transformations I, Commun. Math. Phys. 13 (1969) 1–23.
[20] S. Doplicher and R. Longo, Standard and split inclusions of von Neumann algebras, Invent. Math. 75 (1984) 493–536.
[21] R. Brunetti and G. Ruzzi, Superselection sectors and general covariance. I, gr-qc/0511118.
[22] D. Guido and R. Verch, Quantum Gromov–Hausdorff convergence and scaling limit theories, seminar at the Oberwolfach workshop on Quantum Field Theory and Noncommutative Geometry (October 23–29, 2005).
[23] H. Jarchow, Locally Convex Spaces (B. G. Teubner, 1981).
[24] M. Takesaki, Theory of Operator Algebras, Vol. I (Springer-Verlag, 1979).
September 12, 2006 14:40 WSPC/148-RMP
J070-00275
Reviews in Mathematical Physics Vol. 18, No. 6 (2006) 595–617 c World Scientific Publishing Company
QUANTUM DYNAMICAL SEMIGROUPS GENERATED BY NONCOMMUTATIVE UNBOUNDED ELLIPTIC OPERATORS
CHANGSOO BAHN
Department of Mathematics, Korean Minjok Leadership Academy, Gangwon-do 225-823, Korea
[email protected]

CHUL KI KO
Natural Science Research Institute, Yonsei University, Seoul 120-749, Korea
[email protected]

YONG MOON PARK
Department of Mathematics, Yonsei University, Seoul 120-749, Korea
[email protected]

Received 29 June 2005
Revised 26 April 2006

We study quantum dynamical semigroups generated by noncommutative unbounded elliptic operators which can be written as Lindblad-type unbounded generators. Under appropriate conditions, we first construct the minimal quantum dynamical semigroups for the generators and then use Chebotarev and Fagnola's sufficient conditions for conservativity [1] to show that the semigroups are conservative. We then apply our results to a quantum mechanical system.

Keywords: Quantum dynamical semigroups; noncommutative elliptic operators; conservativity; quantum mechanical system.

Mathematics Subject Classification 2000: 47D07, 47N50, 81S25
1. Introduction

The purpose of this work is to study quantum dynamical semigroups (q.d.s.) generated by noncommutative unbounded elliptic operators which can be expressed as Lindblad-type (unbounded) generators. Under appropriate conditions on coefficients, we first construct the minimal q.d.s. for the generators and then use Chebotarev and Fagnola's sufficient conditions for conservativity [1] to show that the semigroups are conservative. For details, see Sec. 3.
Let us first describe briefly the background of this study. In [2], using a quantum version of the Feynman–Kac formula, the authors constructed the Markovian semigroup generated by the following noncommutative elliptic operator $\mathcal{L}$ on a von Neumann algebra $\mathcal{M}$ acting on a separable Hilbert space $h$: $D(\mathcal{L}) = D(\delta^2)$,
\[
\mathcal{L}(X) = \frac{1}{2}\delta^2(X) + A\delta(X) + \delta(X)A - \frac{1}{2}[A,[A,X]], \qquad X \in D(\mathcal{L}), \tag{1.1}
\]
where $A$ is a self-adjoint element of $\mathcal{M}$, $\delta$ is the generator of a weak*-continuous group of *-automorphisms $(\alpha_t)_{t\in\mathbb{R}}$ of $\mathcal{M}$ and $[X,Y] = XY - YX$ for any $X, Y \in \mathcal{M}$. Let $\mathcal{M} = B(h)$ be the class of bounded operators on $h$ and $B$ be a self-adjoint operator on $h$. Let $\alpha_t(X) = e^{itB}Xe^{-itB}$, $X \in \mathcal{M}$, be the corresponding one-parameter group of automorphisms of $\mathcal{M}$. Then
\[
\delta(X) = i[B,X], \qquad X \in D(\delta).
\]
Put
\[
L := A - iB, \qquad H := \frac{1}{2}(AB + BA).
\]
The generator $\mathcal{L}$ in (1.1) can be represented by the following Lindblad-type generator:
\[
\mathcal{L}(X) = i[H,X] - \frac{1}{2}L^*LX + L^*XL - \frac{1}{2}XL^*L, \qquad X \in D(\mathcal{L}). \tag{1.2}
\]
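The equivalence of (1.1) and (1.2) is a purely algebraic identity once $\delta(X) = i[B,X]$, so it can be sanity-checked with finite matrices. The sketch below is illustrative only: it assumes bounded (matrix) stand-ins for $A$ and $B$, which sidesteps every domain question treated in the paper, and all helper names are hypothetical.

```python
import random

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def lincomb(*terms):
    """Sum of coefficient * matrix over the given (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * m[i][j] for c, m in terms) for j in range(n)] for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]

def comm(a, b):
    """Commutator [a, b] = ab - ba."""
    return lincomb((1, matmul(a, b)), (-1, matmul(b, a)))

def random_matrix(rng, n):
    return [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)] for _ in range(n)]

def random_hermitian(rng, n):
    m = random_matrix(rng, n)
    return lincomb((0.5, m), (0.5, dagger(m)))

def elliptic_form(a, b, x):
    """Right-hand side of (1.1) with delta(X) = i[B, X]."""
    dx = lincomb((1j, comm(b, x)))
    ddx = lincomb((1j, comm(b, dx)))
    return lincomb((0.5, ddx), (1, matmul(a, dx)), (1, matmul(dx, a)),
                   (-0.5, comm(a, comm(a, x))))

def lindblad_form(a, b, x):
    """Right-hand side of (1.2) with L = A - iB and H = (AB + BA)/2."""
    lop = lincomb((1, a), (-1j, b))
    lop_star = dagger(lop)
    h = lincomb((0.5, matmul(a, b)), (0.5, matmul(b, a)))
    lsl = matmul(lop_star, lop)
    return lincomb((1j, comm(h, x)), (-0.5, matmul(lsl, x)),
                   (1, matmul(matmul(lop_star, x), lop)), (-0.5, matmul(x, lsl)))

def max_gap(p, q):
    """Largest entrywise modulus of p - q."""
    n = len(p)
    return max(abs(p[i][j] - q[i][j]) for i in range(n) for j in range(n))

rng = random.Random(0)
A, B = random_hermitian(rng, 4), random_hermitian(rng, 4)
X = random_matrix(rng, 4)
gap = max_gap(elliptic_form(A, B, X), lindblad_form(A, B, X))
print(f"max entrywise difference between (1.1) and (1.2): {gap:.2e}")
```

The difference should vanish up to rounding error; the same helpers also show that the Lindblad form annihilates the identity matrix, the finite-dimensional counterpart of conservativity.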
In this paper, we consider self-adjoint operators $A_l$ and $B_l$, $l = 1, 2, \dots, n$, on $h$ satisfying appropriate conditions (Assumptions 3.1 and 3.5). Let
\[
L_l = A_l - iB_l \quad\text{and}\quad H_l = \frac{1}{2}(A_lB_l + B_lA_l), \qquad l = 1, 2, \dots, n.
\]
We are interested in the following (formal) generator $\mathcal{L}$:
\[
\mathcal{L}(X) = \sum_{l=1}^{n}\left( i[H_l,X] - \frac{1}{2}L_l^*L_lX + L_l^*XL_l - \frac{1}{2}XL_l^*L_l \right). \tag{1.3}
\]
The aim of this paper is to construct the conservative minimal q.d.s. with generator $\mathcal{L}$ given in (1.3) for unbounded operators $A_l$, $B_l$, $l = 1, 2, \dots, n$. Because of the unboundedness, the method of the quantum Feynman–Kac formula in [2, 3] cannot be applied. In [4], the authors employed the theory of the minimal q.d.s. to construct the Markovian semigroup with generator $\mathcal{L}$ in (1.2) under the condition that $[B, A]$ is bounded on $h$. In our case, this condition means that $[B_l, A_l]$ is bounded for any $l = 1, 2, \dots, n$. In this paper, we give the improved condition: For any $\varepsilon \in (0,1)$, there exists a positive constant $c(\varepsilon)$, depending on $\varepsilon$, such that
\[
[B_k,A_l]^*[B_k,A_l] \leq \varepsilon\sum_{l=1}^{n}(A_l^2 + B_l^2) + c(\varepsilon), \qquad k, l = 1, 2, \dots, n, \tag{1.4}
\]
as bilinear forms on a suitable domain $D$ (see (3.3)). In order to construct the minimal q.d.s. generated by $\mathcal{L}$ given in (1.3), we first prove a useful proposition (Proposition 3.3) on perturbations of the generator of a strongly continuous contraction semigroup. Under a suitable condition (Assumption 3.1), we apply Proposition 3.3 to construct the minimal q.d.s. with generator $\mathcal{L}$ given in (1.3), and then use the results of Chebotarev and Fagnola ([1, Theorem 4.4]) to show that the minimal q.d.s. is conservative under an additional condition: There exists a constant $c_3 \in \mathbb{R}$ such that for any $\{u_l\}_{l=1}^{n} \subset D$,
\[
\sum_{k,l=1}^{n}\langle u_k, i[B_k,A_l]u_l\rangle \leq c_3\sum_{l=1}^{n}\|u_l\|^2.
\]
See also (3.13) and part (iv) in Assumption 5.1.

As an application of our main results, we consider the following quantum mechanical system. Let $h = L^2(\mathbb{R}^n)$. Let $W_l(x_1, x_2, \dots, x_n)$, $l = 1, 2, \dots, n$, denoted by $W_l(x)$, be real-valued twice continuously differentiable functions on $\mathbb{R}^n$. For each $l = 1, 2, \dots, n$, let $\partial_l$ be the differential operator $\frac{\partial}{\partial x_l}$ with respect to the $l$th coordinate and $\partial_{lk} = \frac{\partial^2}{\partial x_k\partial x_l}$ ($l, k = 1, 2, \dots, n$). For each $l = 1, 2, \dots, n$, we choose
\[
A_l = -W_l \quad\text{and}\quad B_l = -i\partial_l. \tag{1.5}
\]
If $X$ is a smooth function with compact support on $\mathbb{R}^n$ (a multiplication operator on $L^2(\mathbb{R}^n)$), then $[W_l, X] = 0$, $l = 1, 2, \dots, n$, and the generator given in (1.3) can be written as
\[
\mathcal{L}(X) = \frac{1}{2}\Delta X - 2W\cdot\nabla X, \tag{1.6}
\]
where $W = (W_1, W_2, \dots, W_n)$, $\nabla X = (\partial_1X, \partial_2X, \dots, \partial_nX)$ and $\Delta X = \sum_{l=1}^{n}\partial_{ll}X$. Thus the operator $\mathcal{L}$ given in (1.3) is a noncommutative generalization of the elliptic operator given in (1.6). In the case of $n = 1$, this has been studied in [5, Example 4.2]. We would like to mention that Chebotarev [6] and Fagnola [7, 8] studied the case where $W$ in (1.6) has bounded partial derivatives. See also [8, Eq. (5.2)].

The paper is organized as follows: In Sec. 2, we review the theory of the minimal q.d.s. and give Chebotarev and Fagnola's sufficient conditions for conservativity [1]. In Sec. 3, we give the assumptions and state the main results. First, we introduce a proposition related to the perturbation of the generator of a strongly continuous contraction semigroup, and then construct the minimal q.d.s. with (formal) generator $\mathcal{L}$. Under an additional condition, we show that the q.d.s. is conservative. Section 4 is devoted to the proofs of the main results. In Sec. 5, we apply our results to the example mentioned above.
2. Review on the Minimal Quantum Dynamical Semigroups

Let $h$ be a separable Hilbert space with the scalar product $\langle\cdot,\cdot\rangle$ and norm $\|\cdot\|$. Let $B(h)$ denote the Banach space of bounded linear operators on $h$. The uniform norm in $B(h)$ is denoted by $\|\cdot\|_\infty$ and the identity in $h$ is denoted by $I$. We denote by $D(G)$ the domain of an operator $G$ in $h$.

Definition 2.1. A quantum dynamical semigroup (q.d.s.) on $B(h)$ is a family $T = (T_t)_{t\geq0}$ of bounded operators in $B(h)$ with the following properties:

(i) $T_0(X) = X$, for all $X \in B(h)$,
(ii) $T_{t+s}(X) = T_t(T_s(X))$, for all $s, t \geq 0$ and all $X \in B(h)$,
(iii) $T_t(I) \leq I$, for all $t \geq 0$,
(iv) (complete positivity) for all $t \geq 0$, all integers $n$ and all finite sequences $(X_j)_{j=1}^{n}$, $(Y_l)_{l=1}^{n}$ of elements of $B(h)$, we have
\[
\sum_{j,l=1}^{n} Y_l^*\,T_t(X_l^*X_j)\,Y_j \geq 0,
\]
(v) (normality or σ-weak continuity) for every sequence $(X_n)_{n\geq1}$ of elements of $B(h)$ converging weakly to an element $X$ of $B(h)$ the sequence $(T_t(X_n))_{n\geq1}$ converges weakly to $T_t(X)$ for all $t \geq 0$,
(vi) (ultraweak or weak* continuity) for all trace class operators $\rho$ on $h$ and all $X \in B(h)$, we have
\[
\lim_{t\to0^+}\operatorname{Tr}(\rho\,T_t(X)) = \operatorname{Tr}(\rho X).
\]

We recall that as a consequence of properties (iii) and (iv), for each $t \geq 0$ and $X \in B(h)$, $T_t$ is a contraction, i.e.
\[
\|T_t(X)\|_\infty \leq \|X\|_\infty, \tag{2.1}
\]
and as a consequence of properties (iv) and (vi), for all $X \in B(h)$, the map $t \mapsto T_t(X)$ is strongly continuous.

Definition 2.2. A q.d.s. $T = (T_t)_{t\geq0}$ is called conservative or Markovian if $T_t(I) = I$ for all $t \geq 0$.

The natural generator of a q.d.s. would be the Lindblad-type generator [9, 10]
\[
\mathcal{L}(X) = i[H,X] - \frac{1}{2}XM + \sum_{l=1}^{\infty}L_l^*XL_l - \frac{1}{2}MX, \qquad X \in B(h),
\]
where $M = \sum_{l=1}^{\infty}L_l^*L_l$, each $L_l$ is densely defined and $H$ is a symmetric operator on $h$. The generator can be formally written as
\[
\mathcal{L}(X) = XG + G^*X + \sum_{l=1}^{\infty}L_l^*XL_l,
\]
where $G = -iH - \frac{1}{2}M$. A very large class of q.d.s. was constructed by Davies [11] under the following assumption, which basically corresponds to the condition $\mathcal{L}(I) = 0$.

Assumption 2.3. The operator $G$ is the infinitesimal generator of a strongly continuous contraction semigroup $P = (P(t))_{t\geq0}$ in $h$. The domain of each of the operators $(L_l)_{l=1}^{\infty}$ contains the domain $D(G)$ of $G$. For all $v, u \in D(G)$, we have
\[
\langle v, Gu\rangle + \langle Gv, u\rangle + \sum_{l=1}^{\infty}\langle L_lv, L_lu\rangle = 0. \tag{2.2}
\]
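In finite dimensions, condition (2.2) holds automatically for $G = -iH - \frac{1}{2}\sum_l L_l^*L_l$ with $H$ Hermitian, which is why it is read as the form version of $\mathcal{L}(I) = 0$. The sketch below checks this with random matrices; it is an illustration under the assumption of bounded operators and finitely many $L_l$, and the helper names are hypothetical.

```python
import random

def matvec(m, v):
    """Apply a matrix (nested lists) to a vector (list)."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def dagger(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def inner(v, u):
    """Scalar product, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(v, u))

def random_vector(rng, n):
    return [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

def random_matrix(rng, n):
    return [random_vector(rng, n) for _ in range(n)]

def random_hermitian(rng, n):
    m = random_matrix(rng, n)
    md = dagger(m)
    return [[0.5 * (m[i][j] + md[i][j]) for j in range(n)] for i in range(n)]

def apply_g(h, ls, u):
    """G u = -i H u - (1/2) sum_l L_l^* L_l u."""
    out = [-1j * c for c in matvec(h, u)]
    for lop in ls:
        w = matvec(dagger(lop), matvec(lop, u))
        out = [o - 0.5 * c for o, c in zip(out, w)]
    return out

def dissipation_defect(h, ls, v, u):
    """Left-hand side of (2.2); it should vanish up to rounding error."""
    total = inner(v, apply_g(h, ls, u)) + inner(apply_g(h, ls, v), u)
    total += sum(inner(matvec(lop, v), matvec(lop, u)) for lop in ls)
    return total

rng = random.Random(0)
H = random_hermitian(rng, 4)
Ls = [random_matrix(rng, 4) for _ in range(3)]
v, u = random_vector(rng, 4), random_vector(rng, 4)
print(abs(dissipation_defect(H, Ls, v, u)))
```

For unbounded operators the identity has to be imposed on $D(G)$ (or on a core), which is exactly the content of Assumption 2.3.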
As a result of [12, Proposition 2.5], we can assume only that the domain of the operators $L_l$ contains a subspace $D$ which is a core for $G$ and (2.2) holds for all $v, u \in D$.

For all $X \in B(h)$, consider the sesquilinear form $\mathcal{L}(X)$ on $h$ with domain $D(G) \times D(G)$ given by
\[
\langle v, \mathcal{L}(X)u\rangle = \langle v, XGu\rangle + \langle Gv, Xu\rangle + \sum_{l=1}^{\infty}\langle L_lv, XL_lu\rangle. \tag{2.3}
\]
Under Assumption 2.3, one can construct a q.d.s. $T = (T_t)_{t\geq0}$ satisfying the equation
\[
\langle v, T_t(X)u\rangle = \langle v, Xu\rangle + \int_0^t \langle v, \mathcal{L}(T_s(X))u\rangle\,ds \tag{2.4}
\]
for all $v, u \in D(G)$ and all $X \in B(h)$. Indeed, for a strongly continuous family $(T_t(X))_{t\geq0}$ of elements of $B(h)$ satisfying (2.1), the following are equivalent:

(i) Equation (2.4) holds for all $v, u \in D(G)$,
(ii) For all $v, u \in D(G)$, we have
\[
\langle v, T_t(X)u\rangle = \langle P(t)v, XP(t)u\rangle + \sum_{l=1}^{\infty}\int_0^t \langle L_lP(t-s)v, T_s(X)L_lP(t-s)u\rangle\,ds. \tag{2.5}
\]
We refer to the proof of [1, Proposition 2.3]. A solution of Eq. (2.5) is obtained by the iterations
\[
\begin{aligned}
\langle u, T_t^{(0)}(X)u\rangle &= \langle P(t)u, XP(t)u\rangle, \\
\langle u, T_t^{(n+1)}(X)u\rangle &= \langle P(t)u, XP(t)u\rangle + \sum_{l=1}^{\infty}\int_0^t \langle L_lP(t-s)u, T_s^{(n)}(X)L_lP(t-s)u\rangle\,ds
\end{aligned} \tag{2.6}
\]
for all $u \in D(G)$. In fact, for all positive elements $X \in B(h)$ and all $t \geq 0$, the sequence of operators $(T_t^{(n)}(X))_{n\geq0}$ is non-decreasing. Therefore, it is strongly convergent and its limits for $X \in B(h)$ and $t \geq 0$ define the minimal solution $(T_t)_{t\geq0}$
of (2.5) in the sense that, given another solution $(T'_t)_{t\geq0}$ of (2.4), one can easily check that
\[
T_t(X) \leq T'_t(X) \leq \|X\|_\infty I
\]
for any positive element $X$ and all $t \geq 0$. For details, we refer to [13, 14]. From now on, the minimal solution $(T_t)_{t\geq0}$ is called the minimal q.d.s.

Chebotarev and Fagnola gave a criterion to verify the conservativity of the minimal q.d.s. $(T_t)_{t\geq0}$ obtained under Assumption 2.3. Here we give their result.

Theorem 2.4 [1, Theorem 4.4]. Suppose that there exists a positive self-adjoint operator $C$ in $h$ with the following properties:

(a) The domain of the positive square root $C^{1/2}$ contains the domain $D(G)$ of $G$ and $D(G)$ is a core for $C^{1/2}$,
(b) the linear manifolds $L_l(D(G^2))$, $l \geq 1$, are contained in the domain of $C^{1/2}$,
(c) there exists a positive self-adjoint operator $\Phi$, with $D(G) \subset D(\Phi^{1/2})$ such that, for all $u \in D(G)$, we have
\[
-2\operatorname{Re}\langle u, Gu\rangle = \sum_{l=1}^{\infty}\|L_lu\|^2 = \|\Phi^{1/2}u\|^2,
\]
(d) $D(C) \subset D(\Phi)$, and for all $u \in D(C)$, we have $\|\Phi^{1/2}u\| \leq \|C^{1/2}u\|$,
(e) there exists a positive constant $k$ such that
\[
2\operatorname{Re}\langle C^{1/2}u, C^{1/2}Gu\rangle + \sum_{l=1}^{\infty}\|C^{1/2}L_lu\|^2 \leq k\|C^{1/2}u\|^2, \tag{2.7}
\]
for all $u \in D(G^2)$.

Then the minimal q.d.s. $(T_t)_{t\geq0}$ is conservative.

3. Conservativity of Minimal Quantum Dynamical Semigroups: Main Results

Let $A_l$ and $B_l$, $l = 1, 2, \dots, n$, be self-adjoint operators on the Hilbert space $h$ with a common core $D$ satisfying a suitable condition (Assumption 3.1(i)). Let $H$ and $L_l$, $l = 1, 2, \dots, n$, be the operators defined by
\[
Hu = \frac{1}{2}\sum_{l=1}^{n}(A_lB_lu + B_lA_lu), \qquad L_lu = A_lu - iB_lu, \tag{3.1}
\]
for any $u \in D$. Under Assumption 3.1(i) listed below, $H$ is a densely defined, symmetric operator. We denote again by $H$ its closure. For each $l = 1, 2, \dots, n$, the adjoint operator $L_l^*$ of $L_l$ is given by
\[
L_l^*u = A_lu + iB_lu, \qquad u \in D.
\]
Since $D(L_l^*)$ is dense, $L_l$ is closable. Denote again by $L_l$ its closure.
In the rest of this paper, we assume that the operators $A_l$ and $B_l$, $l = 1, 2, \dots, n$, satisfy the following properties:

Assumption 3.1. Suppose that $A_l$ and $B_l$, $l = 1, 2, \dots, n$, are self-adjoint operators on $h$ with the common core $D$ satisfying the following:

(i) Let $C_l$, $C'_l$ and $C''_l$ be either $A_l$ or $B_l$, $l = 1, 2, \dots, n$. For any $u \in D$,
\[
C_ku \in D(C'_l), \qquad C'_lC_ku \in D(C''_m), \qquad k, l, m = 1, 2, \dots, n, \tag{3.2}
\]
(ii) $[A_k, A_l] = 0$, $[B_k, B_l] = 0$ on $D$, $k, l = 1, 2, \dots, n$,
(iii) $\sum_{l=1}^{n}(A_l^2 + B_l^2)$ is essentially self-adjoint on $D$,
(iv) for any $\varepsilon > 0$, there exists a positive constant $c(\varepsilon)$, depending on $\varepsilon$, such that for any $u \in D$ and $k, l = 1, 2, \dots, n$,
\[
\|[B_k,A_l]u\|^2 \leq \varepsilon\Big\langle u, \sum_{l=1}^{n}(A_l^2 + B_l^2)u\Big\rangle + c(\varepsilon)\|u\|^2, \tag{3.3}
\]
(v) there exist positive constants $c_1$ and $c_2$ such that for any $u \in D$ and $j, k, l = 1, 2, \dots, n$,
\[
\|[A_j,[B_k,A_l]]u\|^2 + \|[B_j,[B_k,A_l]]u\|^2 \leq c_1\Big\langle u, \sum_{l=1}^{n}(A_l^2 + B_l^2)u\Big\rangle + c_2\|u\|^2. \tag{3.4}
\]
Let the operators $G_0$ and $G$ be defined by
\[
G_0u = -\frac{1}{2}\sum_{l=1}^{n}L_l^*L_lu = -\frac{1}{2}\sum_{l=1}^{n}(A_l + iB_l)(A_l - iB_l)u = -\frac{1}{2}\sum_{l=1}^{n}(A_l^2 + B_l^2 + i[B_l,A_l])u, \tag{3.5}
\]
\[
Gu = -iHu + G_0u, \tag{3.6}
\]
for any $u \in D$. We will denote
\[
A = \sum_{l=1}^{n}A_l^2, \qquad B = \sum_{l=1}^{n}B_l^2, \qquad K_{kl} = i[B_k,A_l], \qquad k, l = 1, 2, \dots, n. \tag{3.7}
\]
By (3.3), $\sum_{l=1}^{n}K_{ll}$ is infinitesimally small with respect to $A + B$. Clearly, $\sum_{l=1}^{n}K_{ll}$ is symmetric on $D$. By [15, Theorem X.12], the operator $G_0$ is nonpositive and essentially self-adjoint on $D$. Denote again by $G_0$ the self-adjoint extension of $G_0$. The operator $G_0$ generates a strongly continuous contraction semigroup on $h$. Since the adjoint operator $G^*$ of $G$ is given by $G^* = iH + G_0$ on $D$, $G$ is closable. Denote by $G$ again its closure.
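We consider the elliptic operator $\mathcal{L}$ on $B(h)$ formally given by
\[
\begin{aligned}
\mathcal{L}(X) &= i[H,X] - \frac{1}{2}\sum_{l=1}^{n}L_l^*L_lX + \sum_{l=1}^{n}L_l^*XL_l - \frac{1}{2}\sum_{l=1}^{n}XL_l^*L_l \\
&= G^*X + XG + \sum_{l=1}^{n}L_l^*XL_l, \qquad X \in D(\mathcal{L}).
\end{aligned} \tag{3.8}
\]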
Remark 3.2. In the case that $n = 1$ and $[B_l, A_l]$ is bounded, the elliptic operator $\mathcal{L}$ in (3.8) was studied in [4]. In this paper, we will remove the boundedness (see (3.3)).

As mentioned in the Introduction, we will construct the minimal q.d.s. with the formal generator (3.8) under Assumption 3.1, and under an appropriate additional condition (Assumption 3.5), we show the conservativity of the semigroup.

We state our main results. We first introduce a useful proposition on perturbations of the generator of a strongly continuous contraction semigroup.

Proposition 3.3. Let $(Q, D(Q))$ be the generator of a strongly continuous contraction semigroup on a Hilbert space $h$ and let $(S, D(S))$ be a symmetric operator on $h$. Assume that the following properties hold:

(a) There is a dense set $D$ such that $D \subset D(Q) \cap D(S)$ and $D$ is a core for $Q$,
(b) (relative boundedness) there are positive constants $\alpha$, $\beta$ such that the bound
\[
\|Su\|^2 \leq \alpha^2\|Qu\|^2 + \beta^2\|u\|^2 \tag{3.9}
\]
holds for any $u \in D$,
(c) (commutator estimate) for any $\varepsilon > 0$ there is a constant $\tilde{c}(\varepsilon) > 0$, depending on $\varepsilon$, such that the bound
\[
\pm i(\langle Qu, Su\rangle - \langle Su, Qu\rangle) \leq \varepsilon\|Qu\|^2 + \tilde{c}(\varepsilon)\|u\|^2 \tag{3.10}
\]
holds for any $u \in D$.

Then for any $\lambda \in \mathbb{R}$, the operator $(Q + i\lambda S, D(Q))$ generates a strongly continuous contraction semigroup on $h$. Moreover, $D$ is a core for $Q + i\lambda S$.

Now consider the sesquilinear form $\mathcal{L}(X)$ on $h$ with domain $D \times D$ given by
\[
\langle v, \mathcal{L}(X)u\rangle = \langle v, XGu\rangle + \langle Gv, Xu\rangle + \sum_{l=1}^{n}\langle L_lv, XL_lu\rangle \tag{3.11}
\]
and the semigroup $T = (T_t)_{t\geq0}$ satisfying the equation
\[
\langle v, T_t(X)u\rangle = \langle v, Xu\rangle + \int_0^t \langle v, \mathcal{L}(T_s(X))u\rangle\,ds \tag{3.12}
\]
for all $u, v \in D$ and for all $X \in B(h)$.
Theorem 3.4. Suppose that $A_l$, $B_l$, $l = 1, 2, \dots, n$, satisfy Assumption 3.1.

(a) The operator $G$ defined as in (3.5) and (3.6) generates a strongly continuous contraction semigroup on $h$. Moreover, $D$ is a core for $G$.
(b) There exists the minimal q.d.s. $T = (T_t)_{t\geq0}$ satisfying (3.12).

Next, in order to show that the minimal q.d.s. $T = (T_t)_{t\geq0}$ is conservative, let us introduce another assumption for $A_l$, $B_l$, $l = 1, 2, \dots, n$.

Assumption 3.5. There exists a constant $c_3 \in \mathbb{R}$ such that for any $\{u_l\}_{l=1}^{n} \subset D$,
\[
\sum_{k,l=1}^{n}\langle u_k, K_{kl}u_l\rangle \leq c_3\sum_{l=1}^{n}\|u_l\|^2. \tag{3.13}
\]
Theorem 3.6. Suppose that $A_l$, $B_l$, $l = 1, 2, \dots, n$, satisfy Assumptions 3.1 and 3.5. Then the minimal q.d.s. $T = (T_t)_{t\geq0}$ obtained in Theorem 3.4(b) is conservative.

Remark 3.7. Let $M_l$, $l = 1, 2, \dots, n$, be self-adjoint operators on the Hilbert space $h$ with a common core $D$ satisfying the conditions corresponding to Assumption 3.1(i), (iv) and (v) for the self-adjoint operators $A_l$. Assume that they satisfy
\[
[M_k, M_l] = 0, \quad [M_k, A_l] = 0 \ \text{on } D, \qquad k, l = 1, 2, \dots, n,
\]
and for some $d > 0$,
\[
\|M_lu\|^2 \leq d\|A_lu\|^2, \qquad u \in D, \quad l = 1, 2, \dots, n.
\]
We consider $H$, $L_l$ defined for any $u \in D$ by
\[
Hu = \frac{1}{2}\sum_{l=1}^{n}(M_lB_lu + B_lM_lu), \qquad L_lu = A_lu - iB_lu
\]
instead of (3.1). We can also construct the minimal q.d.s. $T = (T_t)_{t\geq0}$ with the (formal) generator $\mathcal{L}$ in (3.8). However, we have not found a simple condition for conservativity of the minimal q.d.s. in this case.

4. Proofs of Main Results

In this section, we give the proofs of Proposition 3.3 and Theorems 3.4 and 3.6. We first introduce an elementary fact. Recall that $A = \sum_{l=1}^{n}A_l^2$, $B = \sum_{l=1}^{n}B_l^2$, $K_{kl} = i[B_k,A_l]$, $k, l = 1, 2, \dots, n$.

Lemma 4.1. (a) The inequalities
\[
\sum_{l=1}^{n}\|A_l^2u\|^2 \leq \|Au\|^2, \qquad \sum_{l=1}^{n}\|B_l^2u\|^2 \leq \|Bu\|^2 \tag{4.1}
\]
hold for $u \in D$.
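(b) There exist positive constants $k_1$, $k_2$ such that
\[
\|Au\|^2 + \|Bu\|^2 \leq k_1\|(A+B)u\|^2 + k_2\|u\|^2 \tag{4.2}
\]
holds for $u \in D$.

Proof. (a) Notice that by Assumption 3.1(ii), for any $u \in D$, $\langle A_k^2u, A_l^2u\rangle = \langle A_kA_lu, A_kA_lu\rangle \geq 0$. Thus we have
\[
\|Au\|^2 = \sum_{k,l=1}^{n}\langle A_k^2u, A_l^2u\rangle \geq \sum_{l=1}^{n}\langle A_l^2u, A_l^2u\rangle = \sum_{l=1}^{n}\|A_l^2u\|^2.
\]
Similarly, one can check the other inequality.

(b) We compute that for $u \in D$
\[
\begin{aligned}
\|(A+B)u\|^2 &= \|Au\|^2 + \|Bu\|^2 + 2\operatorname{Re}\langle Au, Bu\rangle \\
&= \|Au\|^2 + \|Bu\|^2 + 2\sum_{l=1}^{n}\operatorname{Re}\langle Au, B_l^2u\rangle \\
&= \|Au\|^2 + \|Bu\|^2 + 2\sum_{l=1}^{n}\big(\langle AB_lu, B_lu\rangle + \operatorname{Re}\langle [B_l,A]u, B_lu\rangle\big) \\
&\geq \|Au\|^2 + \|Bu\|^2 + 2\sum_{l=1}^{n}\operatorname{Re}\langle [B_l,A]u, B_lu\rangle,
\end{aligned} \tag{4.3}
\]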
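and
\[
\begin{aligned}
2\sum_{l=1}^{n}\operatorname{Re}\langle [B_l,A]u, B_lu\rangle &= 2\sum_{l,k=1}^{n}\operatorname{Re}\langle [B_l,A_k^2]u, B_lu\rangle \\
&= \sum_{l,k=1}^{n}\langle u, [B_l,[B_l,A_k^2]]u\rangle \\
&= -i\sum_{l,k=1}^{n}\langle u, ([B_l,A_kK_{lk}] + [B_l,K_{lk}A_k])u\rangle \\
&= -\sum_{l,k=1}^{n}\langle u, (iA_k[B_l,K_{lk}] + 2K_{lk}^2 + i[B_l,K_{lk}]A_k)u\rangle.
\end{aligned}
\]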
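Applying Schwarz's inequality to the above, it follows from (3.3) and (3.4) that there exist positive constants $k_3$ and $k_4$ such that
\[
\begin{aligned}
2\sum_{l=1}^{n}\operatorname{Re}\langle [B_l,A]u, B_lu\rangle &\geq -2\sum_{l,k=1}^{n}\big(\|K_{lk}u\|^2 + \|A_ku\|\,\|[B_l,K_{lk}]u\|\big) \\
&\geq -\sum_{l,k=1}^{n}\big(2\|K_{lk}u\|^2 + \|A_ku\|^2 + \|[B_l,K_{lk}]u\|^2\big) \\
&\geq -k_3\langle u, (A+B)u\rangle - k_4\|u\|^2 \\
&\geq -\frac{1}{2}k_3\|(A+B)u\|^2 - \Big(\frac{1}{2}k_3 + k_4\Big)\|u\|^2.
\end{aligned} \tag{4.4}
\]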
Combining (4.3) and (4.4), we have the inequality (4.2).

Proof of Proposition 3.3. Replacing $\alpha^{-1}S$ by $S$, we may assume that $\alpha = 1$. It follows from (3.10) that for any $\gamma > 0$ and $u \in D$,
\[
\|(Q + i\gamma S)u\|^2 - \gamma^2\|Su\|^2 = \|Qu\|^2 + i\gamma(\langle Qu, Su\rangle - \langle Su, Qu\rangle) \geq (1 - \gamma\varepsilon)\|Qu\|^2 - \gamma\tilde{c}(\varepsilon)\|u\|^2.
\]
By choosing $\varepsilon < \gamma^{-1}$, we conclude that for any $\gamma > 0$ and $u \in D$, the bound
\[
\gamma^2\|Su\|^2 \leq \|(Q + i\gamma S)u\|^2 + \gamma\tilde{c}(\varepsilon)\|u\|^2 \tag{4.5}
\]
holds. Since $D$ is a core for $Q$, the bound (3.9) (with $\alpha = 1$) holds for all $u \in D(Q)$. Thus, for any $0 < \nu < 1$, $\nu S$ is relatively $Q$-bounded with relative bound less than 1. Since $(S, D(S))$ is symmetric, the operator $iS$ is dissipative. Therefore, the operator $(Q + i\nu S, D(Q))$ generates a strongly continuous contraction semigroup on $h$ (see [16, Corollary 3.3, Chap. 3]). Moreover, $D$ is a core for $Q + i\nu S$ by (3.9). The bound (4.5) with $\gamma = \nu$ implies that for $0 < \tau < 1$, $\tau\nu S$ is relatively $(Q + i\nu S)$-bounded with relative bound less than 1 and so $(Q + i(1+\tau)\nu S, D(Q))$ generates a strongly continuous contraction semigroup and $D$ is a core for the operator. Since $\tau\nu < \gamma = (1+\tau)\nu$, the bound (4.5) implies that $(Q + i(1+2\tau)\nu S, D(Q))$ generates a strongly continuous contraction semigroup. By using an induction argument, we conclude that for any $\tau, \nu \in (0,1)$ and $n = 1, 2, \dots$, the operator $(Q + i(1+n\tau)\nu S, D(Q))$ generates a strongly continuous contraction semigroup and $D$ is a core for the generator. For given $\lambda > 0$, one can choose $\tau, \nu \in (0,1)$ and $n$ such that $\lambda = (1+n\tau)\nu$, and for given $\lambda < 0$, one replaces $S$ by $-S$. This completes the proof of the proposition.

In order to show that the operator $G$ is a generator of a strongly continuous contraction semigroup on $h$, we only need to check the conditions in Proposition 3.3.
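The last step of the proof of Proposition 3.3 uses that every $\lambda > 0$ can be written as $(1+n\tau)\nu$ with $\tau, \nu \in (0,1)$ and an integer $n \geq 1$. One possible explicit choice is sketched below; the function name and the particular rule $\nu = \min(\lambda, 1)/2$ are illustrative assumptions, not taken from the paper.

```python
import math

def induction_parameters(lam):
    """Return (tau, nu, n) with tau, nu in (0, 1), n >= 1 and lam == (1 + n*tau)*nu.

    One admissible choice among many; for lam < 0 the proof replaces S by -S.
    """
    if lam <= 0:
        raise ValueError("lam must be positive; for lam < 0 replace S by -S")
    nu = min(lam, 1.0) / 2      # nu lies in (0, 1) and nu < lam
    ratio = (lam - nu) / nu     # this must equal n * tau and is strictly positive
    n = math.floor(ratio) + 1   # smallest integer strictly larger than ratio
    tau = ratio / n             # hence 0 < tau < 1
    return tau, nu, n

for lam in (0.3, 1.0, 3.0, 7.25):
    tau, nu, n = induction_parameters(lam)
    print(lam, tau, nu, n, (1 + n * tau) * nu)
```

The integer $n$ counts how many times the relative-boundedness step with bound (4.5) has to be repeated before the coupling constant $\lambda$ is reached.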
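Proof of Theorem 3.4. (a) To prove part (a) of the theorem, we apply Proposition 3.3 for $Q = G_0$, $S = H$ and $D = D$. Clearly, $H$ is a symmetric operator on $D$. As mentioned below (3.7), $G_0$ is nonpositive and essentially self-adjoint on $D$, and so it generates a strongly continuous contraction semigroup. Thus, condition (a) of Proposition 3.3 is satisfied.

Let us verify condition (b) of Proposition 3.3. Notice that for $u \in D$,
\[
\begin{aligned}
\|Hu\|^2 &= \frac{1}{4}\Big\|\sum_{l=1}^{n}(A_lB_l + B_lA_l)u\Big\|^2 = \frac{1}{4}\Big\|\sum_{l=1}^{n}(2A_lB_l - iK_{ll})u\Big\|^2 \\
&\leq \frac{n}{4}\sum_{l=1}^{n}\|(2A_lB_l - iK_{ll})u\|^2 \leq \frac{n}{2}\sum_{l=1}^{n}\big(4\|A_lB_lu\|^2 + \|K_{ll}u\|^2\big),
\end{aligned} \tag{4.6}
\]
and
\[
\begin{aligned}
\|A_lB_lu\|^2 &= \langle u, B_lA_l(B_lA_l + iK_{ll})u\rangle \\
&= \langle u, (B_l(B_lA_l + iK_{ll})A_l + iB_lA_lK_{ll})u\rangle \\
&= \langle u, (B_l^2A_l^2 + i[B_l,K_{ll}]A_l + iK_{ll}B_lA_l + iB_lA_lK_{ll})u\rangle \\
&= \langle u, (B_l^2A_l^2 + i[B_l,K_{ll}]A_l + K_{ll}^2 + iK_{ll}A_lB_l + iB_lA_lK_{ll})u\rangle \\
&\leq \frac{1}{2}\big(\|B_l^2u\|^2 + \|A_l^2u\|^2 + \|[B_l,K_{ll}]u\|^2 + \|A_lu\|^2\big) + \|K_{ll}u\|^2 + 2\|A_lB_lu\|\,\|K_{ll}u\| \\
&\leq \frac{1}{2}\big(\|B_l^2u\|^2 + \|A_l^2u\|^2 + \|[B_l,K_{ll}]u\|^2 + \|A_lu\|^2\big) + \|K_{ll}u\|^2 + \frac{1}{4}\|A_lB_lu\|^2 + 4\|K_{ll}u\|^2,
\end{aligned}
\]
which implies
\[
\|A_lB_lu\|^2 \leq \frac{2}{3}\big(\|B_l^2u\|^2 + \|A_l^2u\|^2 + \|[B_l,K_{ll}]u\|^2 + \|A_lu\|^2\big) + \frac{20}{3}\|K_{ll}u\|^2. \tag{4.7}
\]
Substituting (4.7) into (4.6), and applying (4.1), (3.3) and (3.4), we obtain that
\[
\begin{aligned}
\|Hu\|^2 &\leq d_1(\|Au\|^2 + \|Bu\|^2) + d_2\langle u, (A+B)u\rangle + d_3\|u\|^2 \\
&\leq d_4\|(A+B)u\|^2 + d_5\|u\|^2
\end{aligned} \tag{4.8}
\]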
for some constants $d_1, \dots, d_5 > 0$, where we have used (4.2) and Schwarz's inequality to get the second inequality. Since $\sum_{l=1}^{n}K_{ll}$ is infinitesimally small with respect to $A + B$, there exist positive constants $d_6$ and $d_7$ such that
\[
\|(A+B)u\|^2 \leq d_6\|G_0u\|^2 + d_7\|u\|^2. \tag{4.9}
\]
Combining (4.8) and (4.9), we obtain that
\[
\|Hu\|^2 \leq d_8\|G_0u\|^2 + d_9\|u\|^2, \qquad u \in D, \tag{4.10}
\]
for $d_8 = d_4d_6$, $d_9 = d_4d_7 + d_5 > 0$. This proves condition (b) of Proposition 3.3.

Next, we consider the commutator estimate in (3.10). Recall that
\[
Q = G_0 = -\frac{1}{2}\sum_{l=1}^{n}L_l^*L_l = -\frac{1}{2}\sum_{l=1}^{n}(A_l^2 + B_l^2 + i[B_l,A_l]), \qquad S = H = \frac{1}{2}\sum_{l=1}^{n}(A_lB_l + B_lA_l).
\]
We can write that
\[
\begin{aligned}
\pm i\Big[-\frac{1}{2}\sum_{l=1}^{n}L_l^*L_l, H\Big] &= \mp\frac{i}{4}\sum_{l,k=1}^{n}[A_l^2 + B_l^2 + K_{ll}, A_kB_k + B_kA_k] \\
&= \mp\frac{i}{4}\sum_{l,k=1}^{n}\big([A_l^2, A_kB_k + B_kA_k] + [B_l^2, A_kB_k + B_kA_k]\big) \\
&\quad \mp\frac{i}{4}\sum_{l,k=1}^{n}[K_{ll}, A_kB_k + B_kA_k].
\end{aligned} \tag{4.11}
\]
Notice that by Assumption 3.1(ii)
\[
\begin{aligned}
i[A_l^2, A_kB_k + B_kA_k] &= i(A_k[A_l^2,B_k] + [A_l^2,B_k]A_k) \\
&= -A_kA_lK_{kl} - A_kK_{kl}A_l - A_lK_{kl}A_k - K_{kl}A_lA_k \\
&= A_k[A_l,K_{kl}] - 2A_kA_lK_{kl} - 2K_{kl}A_lA_k - [A_l,K_{kl}]A_k,
\end{aligned} \tag{4.12}
\]
and
\[
\begin{aligned}
i[B_l^2, A_kB_k + B_kA_k] &= i([B_l^2,A_k]B_k + B_k[B_l^2,A_k]) \\
&= B_lK_{lk}B_k + K_{lk}B_lB_k + B_kB_lK_{lk} + B_kK_{lk}B_l \\
&= [B_l,K_{lk}]B_k + 2K_{lk}B_lB_k + 2B_kB_lK_{lk} - B_k[B_l,K_{lk}],
\end{aligned} \tag{4.13}
\]
\[
\begin{aligned}
i[K_{ll}, A_kB_k + B_kA_k] &= -i[A_kB_k + B_kA_k, K_{ll}] \\
&= -iA_k[B_k,K_{ll}] - i[A_k,K_{ll}]B_k - iB_k[A_k,K_{ll}] - i[B_k,K_{ll}]A_k
\end{aligned} \tag{4.14}
\]
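as bilinear forms on $D$. Substituting (4.12)–(4.14) into (4.11), we obtain that for $u \in D$,
\[
\begin{aligned}
\pm\Big\langle u, i\Big[-\frac{1}{2}\sum_{l=1}^{n}L_l^*L_l, H\Big]u\Big\rangle &= \pm\frac{1}{2}\sum_{l,k=1}^{n}\big(-\operatorname{Re}\langle A_ku, [A_l,K_{kl}]u\rangle + 2\operatorname{Re}\langle A_lA_ku, K_{kl}u\rangle\big) \\
&\quad \pm\frac{1}{2}\sum_{l,k=1}^{n}\big(\operatorname{Re}\langle B_ku, [B_l,K_{lk}]u\rangle - 2\operatorname{Re}\langle B_lB_ku, K_{lk}u\rangle\big) \\
&\quad \mp\frac{1}{2}\sum_{l,k=1}^{n}\big(\operatorname{Im}\langle A_ku, [B_k,K_{ll}]u\rangle + \operatorname{Im}\langle B_ku, [A_k,K_{ll}]u\rangle\big).
\end{aligned} \tag{4.15}
\]
Let $C_l$, $C'_l$ be either $A_l$ or $B_l$, $l = 1, 2, \dots, n$. We get from (3.4) that
\[
\begin{aligned}
|\langle C_ku, [C'_j,K_{lm}]u\rangle| &\leq \frac{1}{2}\big(\|C_ku\|^2 + \|[C'_j,K_{lm}]u\|^2\big) \\
&\leq \frac{1}{2}\big((1 + c_1)\langle u, (A+B)u\rangle + c_2\|u\|^2\big),
\end{aligned} \tag{4.16}
\]
and for $\varepsilon \in (0,1)$ by (3.3) and Lemma 4.1,
\[
\begin{aligned}
|\langle C_lC'_ku, K_{jm}u\rangle| &\leq \frac{1}{2}\big(\varepsilon\|C_lC'_ku\|^2 + \varepsilon^{-1}\|K_{jm}u\|^2\big) \\
&\leq \frac{1}{4}\big\{\varepsilon(\|C_l^2u\|^2 + \|{C'_k}^2u\|^2) + 2(\langle u, (A+B)u\rangle + \varepsilon^{-1}c(\varepsilon)\|u\|^2)\big\} \\
&\leq \frac{1}{2}\big\{\varepsilon(\|Au\|^2 + \|Bu\|^2) + \langle u, (A+B)u\rangle + \varepsilon^{-1}c(\varepsilon)\|u\|^2\big\} \\
&\leq \frac{1}{2}\big\{\varepsilon k_1\|(A+B)u\|^2 + \langle u, (A+B)u\rangle + (\varepsilon k_2 + \varepsilon^{-1}c(\varepsilon))\|u\|^2\big\}.
\end{aligned} \tag{4.17}
\]
Notice that for $\varepsilon \in (0,1)$
\[
\langle u, (A+B)u\rangle \leq \frac{1}{2}\big(\varepsilon\|(A+B)u\|^2 + \varepsilon^{-1}\|u\|^2\big). \tag{4.18}
\]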
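The bound (4.18) is the Schwarz inequality followed by $ab \leq \frac{1}{2}(\varepsilon a^2 + \varepsilon^{-1}b^2)$. A small numerical illustration follows, in which a positive semidefinite matrix $R^*R$ stands in for $A + B$; the matrix model and all names are assumptions made only for this sketch.

```python
import math
import random

def matvec(m, v):
    """Apply a matrix (nested lists) to a vector (list)."""
    return [sum(m[i][k] * v[k] for k in range(len(v))) for i in range(len(m))]

def dagger(m):
    """Conjugate transpose of a square matrix."""
    n = len(m)
    return [[m[j][i].conjugate() for j in range(n)] for i in range(n)]

def inner(v, u):
    """Scalar product, conjugate-linear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(v, u))

def norm(v):
    return math.sqrt(sum(abs(c) ** 2 for c in v))

def young_gap(r, u, eps):
    """Right-hand side minus left-hand side of (4.18) with A + B modelled by R^* R."""
    tu = matvec(dagger(r), matvec(r, u))   # (R^* R) u
    lhs = inner(u, tu).real
    rhs = 0.5 * (eps * norm(tu) ** 2 + norm(u) ** 2 / eps)
    return rhs - lhs

rng = random.Random(0)
R = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)] for _ in range(4)]
u = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(4)]
for eps in (0.1, 0.5, 0.9):
    print(eps, young_gap(R, u, eps) >= 0)
```

The same two-step estimate is what turns the form bounds in (4.16) and (4.17) into the operator bound (4.19).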
Applying Schwarz's inequality to (4.15), and using (4.16), (4.17) and (4.18), one has that for any $\tilde{\varepsilon} \in (0,1)$, there exists a positive constant $c_3(\tilde{\varepsilon})$ depending on $\tilde{\varepsilon}$ such that
\[
\pm\Big\langle u, i\Big[-\frac{1}{2}\sum_{l=1}^{n}L_l^*L_l, H\Big]u\Big\rangle \leq \tilde{\varepsilon}\|(A+B)u\|^2 + c_3(\tilde{\varepsilon})\|u\|^2. \tag{4.19}
\]
The two inequalities (4.9) and (4.19) yield the bound
\[
\pm\Big\langle u, i\Big[-\frac{1}{2}\sum_{l=1}^{n}L_l^*L_l, H\Big]u\Big\rangle \leq \tilde{\varepsilon}d_6\|G_0u\|^2 + c_4(\tilde{\varepsilon})\|u\|^2 \tag{4.20}
\]
for $c_4(\tilde{\varepsilon}) = c_3(\tilde{\varepsilon}) + \tilde{\varepsilon}d_7$. By resetting $\varepsilon = \tilde{\varepsilon}d_6$, the inequality (3.10) holds. The proof of part (a) of the theorem is completed.

(b) By (a), $G$ generates a strongly continuous contraction semigroup on $h$ and $D$ is a core for $G$. It follows from (3.5) and (3.6) that we have
\[
\langle v, Gu\rangle + \langle Gv, u\rangle + \sum_{l=1}^{n}\langle L_lv, L_lu\rangle = 0 \tag{4.21}
\]
for all $u, v \in D$. Thus $G$ and $L_l$, $l = 1, 2, \dots, n$, satisfy the condition (2.2) on a core $D$ for $G$, and so Assumption 2.3 is satisfied. Therefore, as mentioned in Sec. 2, by the iterations, we can construct a minimal q.d.s. $T = (T_t)_{t\geq0}$ satisfying Eq. (3.12).

Proof of Theorem 3.6. Applying Theorem 2.4, we show that the minimal q.d.s. is conservative. Let us choose the operator $C$
\[
C = -2G_0 = \sum_{l=1}^{n}L_l^*L_l = \sum_{l=1}^{n}(A_l^2 + B_l^2 + i[B_l,A_l]). \tag{4.22}
\]
Recall that $D$ is a core for $C$. We have that as bilinear forms on $D$
\[
G^*G = (iH + G_0)(-iH + G_0) = H^2 + G_0^2 + i[H,G_0] \geq G_0^2 + i[H,G_0]. \tag{4.23}
\]
It follows from (4.23) and (4.20) that we have
\[
\|G_0u\|^2 \leq a\|Gu\|^2 + b\|u\|^2, \qquad u \in D \tag{4.24}
\]
for some constants $a, b > 0$. Using the relations (4.22), (4.24) and the fact that $-iH$ is a relatively bounded perturbation of $G_0$, we obtain that $G$ and $C$ are relatively bounded with respect to each other and so $D(G) = D(C)$. We will check that the operator $C$ satisfies the conditions in Theorem 2.4. Conditions (a) and (b) of Theorem 2.4 are trivially fulfilled. To check condition (e) of Theorem 2.4, we estimate
\[
CG + G^*C + \sum_{l=1}^{n}L_l^*CL_l = i[H,C] + \frac{1}{2}\sum_{l=1}^{n}\big(L_l^*[C,L_l] + [L_l^*,C]L_l\big) \tag{4.25}
\]
as bilinear forms on $D$.
We obtain from (4.15) (with interchanging of $l$ and $k$) and $C = \sum_{l=1}^{n}L_l^*L_l$ that
\[
\begin{aligned}
i[H,C] &= \frac{1}{2}\sum_{l,k=1}^{n}\big(-A_l[A_k,K_{lk}] + [A_k,K_{lk}]A_l + 2A_lA_kK_{lk} + 2K_{lk}A_kA_l\big) \\
&\quad + \frac{1}{2}\sum_{l,k=1}^{n}\big(B_l[B_k,K_{kl}] - [B_k,K_{kl}]B_l - 2B_lB_kK_{kl} - 2K_{kl}B_kB_l\big) \\
&\quad + \frac{1}{2}\sum_{l,k=1}^{n}\big(iA_l[B_l,K_{kk}] + i[B_l,K_{kk}]A_l + iB_l[A_l,K_{kk}] + i[A_l,K_{kk}]B_l\big)
\end{aligned}
\]
as bilinear forms on $D$. Notice that by Assumption 3.1(ii),
\[
\begin{aligned}
2A_lA_kK_{lk} + 2K_{lk}A_kA_l &= 2A_l[A_k,K_{lk}] - 2[A_l,K_{lk}]A_k + 4A_lK_{lk}A_k, \\
-2B_lB_kK_{kl} - 2K_{kl}B_kB_l &= -2B_k[B_l,K_{kl}] + 2[B_k,K_{kl}]B_l - 4B_kK_{kl}B_l,
\end{aligned}
\]
as bilinear forms on $D$. Thus we have
\[
i[H,C] = \sum_{l,k=1}^{n}\big(2A_lK_{lk}A_k - 2B_kK_{kl}B_l\big) + \mathrm{I}, \tag{4.26}
\]
where
\[
\begin{aligned}
\mathrm{I} &= \frac{1}{2}\sum_{l,k=1}^{n}\big(A_l[A_k,K_{lk}] + [A_k,K_{lk}]A_l - 2[A_l,K_{lk}]A_k\big) \\
&\quad + \frac{1}{2}\sum_{l,k=1}^{n}\big(B_l[B_k,K_{kl}] + [B_k,K_{kl}]B_l - 2B_k[B_l,K_{kl}]\big) \\
&\quad + \frac{1}{2}\sum_{l,k=1}^{n}\big(iA_l[B_l,K_{kk}] + i[B_l,K_{kk}]A_l + iB_l[A_l,K_{kk}] + i[A_l,K_{kk}]B_l\big),
\end{aligned}
\]
as bilinear forms on $D$. On the other hand, we have
\[
\begin{aligned}
[C,L_l] &= \sum_{k=1}^{n}[A_k^2 + B_k^2 + K_{kk}, A_l - iB_l] \\
&= \sum_{k=1}^{n}\big(i[B_l,A_k^2] + [B_k^2,A_l] + [K_{kk},L_l]\big) \\
&= \sum_{k=1}^{n}\big(A_kK_{lk} + K_{lk}A_k - iB_kK_{kl} - iK_{kl}B_k - [L_l,K_{kk}]\big),
\end{aligned}
\]
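and so
\[
\begin{aligned}
L_l^*[C,L_l] &= \sum_{k=1}^{n}(A_l + iB_l)\big(A_kK_{lk} + K_{lk}A_k - iB_kK_{kl} - iK_{kl}B_k - [L_l,K_{kk}]\big) \\
&= \sum_{k=1}^{n}\big(A_l[A_k,K_{lk}] + 2A_lK_{lk}A_k + B_l[B_k,K_{kl}] + 2B_lK_{kl}B_k\big) - \sum_{k=1}^{n}L_l^*[L_l,K_{kk}] \\
&\quad + \sum_{k=1}^{n}\big(-iA_lB_kK_{kl} - iA_lK_{kl}B_k + iB_lA_kK_{lk} + iB_lK_{lk}A_k\big),
\end{aligned}
\]
which implies
\[
\frac{1}{2}\sum_{l=1}^{n}\big(L_l^*[C,L_l] + (L_l^*[C,L_l])^*\big) = \mathrm{II} + \mathrm{III} + \mathrm{IV} + \mathrm{V}, \tag{4.27}
\]
where
\[
\begin{aligned}
\mathrm{II} &= \sum_{l,k=1}^{n}\big(A_lK_{lk}A_k + A_kK_{lk}A_l + B_lK_{kl}B_k + B_kK_{kl}B_l\big), \\
\mathrm{III} &= \frac{1}{2}\sum_{l,k=1}^{n}\big(A_l[A_k,K_{lk}] - [A_k,K_{lk}]A_l + B_l[B_k,K_{kl}] - [B_k,K_{kl}]B_l\big), \\
\mathrm{IV} &= \frac{1}{2}\sum_{l,k=1}^{n}\big(-L_l^*[L_l,K_{kk}] + [L_l^*,K_{kk}]L_l\big), \\
\mathrm{V} &= \frac{1}{2}\sum_{l,k=1}^{n}\big(-iA_lB_kK_{kl} - iA_lK_{kl}B_k + iB_lA_kK_{lk} + iB_lK_{lk}A_k\big) \\
&\quad + \frac{1}{2}\sum_{l,k=1}^{n}\big(iK_{kl}B_kA_l + iB_kK_{kl}A_l - iK_{lk}A_kB_l - iA_kK_{lk}B_l\big) \\
&= \frac{1}{2}\sum_{l,k=1}^{n}\big(-i[A_l,B_kK_{kl}] - i[A_l,K_{kl}B_k] + i[B_l,A_kK_{lk}] + i[B_l,K_{lk}A_k]\big) \\
&= \frac{1}{2}\sum_{l,k=1}^{n}\big(2K_{kl}^2 - iB_k[A_l,K_{kl}] - i[A_l,K_{kl}]B_k\big) + \frac{1}{2}\sum_{l,k=1}^{n}\big(2K_{lk}^2 + iA_k[B_l,K_{lk}] + i[B_l,K_{lk}]A_k\big)
\end{aligned}
\]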
as bilinear forms on $D$. Combining (4.25)–(4.27), one has
\[
CG + G^*C + \sum_{l=1}^{n}L_l^*CL_l = \sum_{l,k=1}^{n}\big(3A_lK_{lk}A_k + A_kK_{lk}A_l - B_kK_{kl}B_l + B_lK_{kl}B_k\big) + \mathrm{I} + \mathrm{III} + \mathrm{IV} + \mathrm{V} \tag{4.28}
\]
as bilinear forms on $D$. Notice that as bilinear forms on $D$
\[
A_kK_{lk}A_l = [A_k,K_{lk}]A_l + K_{lk}A_kA_l = [A_k,K_{lk}]A_l - [A_l,K_{lk}]A_k + A_lK_{lk}A_k, \tag{4.29}
\]
\[
B_lK_{kl}B_k = [B_l,K_{kl}]B_k + K_{kl}B_lB_k = [B_l,K_{kl}]B_k - [B_k,K_{kl}]B_l + B_kK_{kl}B_l. \tag{4.30}
\]
Substituting (4.29) and (4.30) into (4.28), we have
\[
CG + G^*C + \sum_{l=1}^{n}L_l^*CL_l = 4\sum_{l,k=1}^{n}A_lK_{lk}A_k + \mathrm{I} + \mathrm{III} + \mathrm{IV} + \mathrm{V} + \mathrm{VI} \tag{4.31}
\]
as bilinear forms on $D$, where
\[
\mathrm{VI} = \sum_{l,k=1}^{n}\big([A_k,K_{lk}]A_l - [A_l,K_{lk}]A_k + [B_l,K_{kl}]B_k - [B_k,K_{kl}]B_l\big).
\]
By Assumption 3.5, we get that for u ∈ D
\[
4\Bigl\langle u, \sum_{l,k=1}^{n} A_l K_{lk} A_k u\Bigr\rangle = 4\sum_{l,k=1}^{n} \langle A_l u, K_{lk} A_k u\rangle \le 4c_3 \sum_{l=1}^{n} \|A_l u\|^2 = 4c_3 \langle u, Au\rangle. \tag{4.32}
\]
Since the terms in (4.31) other than the first one are of the types $K_{lk}^2$, $C_j[C_k, K_{lm}]$ and $[C_k, K_{lm}]C_j$, where the $C_j$ are either $A_l$ or $B_l$, l = 1, 2, . . . , n, by applying (3.3) and the estimate used in (4.16), we have the bound
\[
|\langle u, (I + III + IV + V + VI)u\rangle| \le c_4 \langle u, (A + B)u\rangle + c_5 \|u\|^2, \quad u \in D, \tag{4.33}
\]
for some constants $c_4, c_5 > 0$. Thus, for u ∈ D,
\[
2\,\mathrm{Re}\langle Cu, Gu\rangle + \sum_{l=1}^{n} \langle L_l u, C L_l u\rangle \le (4c_3 + c_4)\langle u, (A + B)u\rangle + c_5 \langle u, u\rangle. \tag{4.34}
\]
Choosing ε = 1/n in (3.3), there exists a constant $c_6 > 0$ such that
\[
\sum_{l=1}^{n} \|K_{ll} u\|^2 \le \langle u, (A + B)u\rangle + c_6 \|u\|^2, \quad u \in D,
\]
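and so we have
\[
\begin{aligned}
\langle u, Cu\rangle &= \langle u, (A + B)u\rangle + \sum_{l=1}^{n} \langle u, K_{ll} u\rangle \\
&\ge \langle u, (A + B)u\rangle - \frac{1}{2}\sum_{l=1}^{n} \bigl(\|u\|^2 + \|K_{ll} u\|^2\bigr) \\
&\ge \frac{1}{2}\langle u, (A + B)u\rangle - \frac{1}{2}(c_6 + n)\|u\|^2.
\end{aligned} \tag{4.35}
\]
It follows from (4.34) and (4.35) that for u ∈ D,
\[
2\,\mathrm{Re}\langle Cu, Gu\rangle + \sum_{l=1}^{n} \langle L_l u, C L_l u\rangle \le c_7 \langle u, Cu\rangle + c_8 \langle u, u\rangle
\]
for $c_7 = 2(4c_3 + c_4)$ and some $c_8 > 0$. Redefine $C = \sum_{l=1}^{n} L_l^* L_l + \frac{c_8}{c_7}$; then by (4.21) we have
\[
2\,\mathrm{Re}\langle Cu, Gu\rangle + \sum_{l=1}^{n} \langle L_l u, C L_l u\rangle \le c_7 \langle u, Cu\rangle, \quad u \in D. \tag{4.36}
\]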
This proves the inequality (2.7) for u ∈ D. We want to extend the inequality (4.36) to the domain D(G). Since G and C are relatively bounded with respect to each other and D(G) = D(C), for every u ∈ D(G) there exists a sequence $\{u_n\} \subset D$ such that
\[
\lim_{n\to\infty} u_n = u, \quad \lim_{n\to\infty} Cu_n = Cu, \quad \lim_{n\to\infty} Gu_n = Gu.
\]
Then the relation (4.36) implies that $\{C^{1/2} L_l u_n\}_{n\ge 1}$ is a Cauchy sequence. Therefore, it is convergent and it is easy to deduce that (4.36) holds for u ∈ D(G). Note that $\Phi = \sum_{l=1}^{n} L_l^* L_l \le C\ (= \Phi + \frac{c_8}{c_7})$ as bilinear forms on D. Hence the conditions (c) and (d) of Theorem 2.4 also hold and the minimal q.d.s. is conservative.
5. Application: Quantum Mechanical System

In this section, we apply our results to construct a conservative minimal q.d.s. in a quantum mechanical system. Let $h = L^2(\mathbb{R}^n)$ and $D = C_0^\infty(\mathbb{R}^n)$, the space of $C^\infty$-functions with compact support on $\mathbb{R}^n$. We denote by $\partial_l = \frac{\partial}{\partial x_l}$ (l = 1, 2, . . . , n) the differential operators with respect to the lth coordinate and $\partial_{lk} = \frac{\partial^2}{\partial x_k \partial x_l}$ (l, k = 1, 2, . . . , n). For any measurable function T, we denote the (distributional) derivatives $\frac{\partial T}{\partial x_l}$, $\frac{\partial^2 T}{\partial x_l \partial x_k}$ by $(T)_l$, $(T)_{lk}$, l, k = 1, 2, . . . , n, respectively. The Laplacian and the gradient operators are denoted by ∆ and ∇, respectively.
Let a function (vector field) $W : \mathbb{R}^n \to \mathbb{R}^n$, $W = (W_1, W_2, \ldots, W_n)$, be given, where each component function $W_l(x)$, l = 1, 2, . . . , n, is a real valued twice differentiable function on $\mathbb{R}^n$. We will denote
\[
W^2 = \sum_{l=1}^{n} W_l^2, \quad x^2 = \sum_{l=1}^{n} x_l^2, \quad |x| = \Bigl(\sum_{l=1}^{n} x_l^2\Bigr)^{1/2}.
\]
We suppose that W satisfies the following assumption.

Assumption 5.1. The function $W = (W_1, W_2, \ldots, W_n)$ satisfies the following properties:

(i) $W_l \in C^2(\mathbb{R}^n)$, l = 1, 2, . . . , n,

(ii) for any ε > 0, there exists a positive constant c(ε), depending on ε, such that
\[
|(W_l)_k| \le \varepsilon |W| + c(\varepsilon) \tag{5.1}
\]
for any l, k = 1, 2, . . . , n,

(iii) there exist positive constants $c_1, c_2$ such that
\[
|(W_l)_{jk}| \le c_1 |W| + c_2, \quad l, j, k = 1, 2, \ldots, n, \tag{5.2}
\]

(iv) there exists a constant $c_3 \in \mathbb{R}$ such that $((W_l)_k) \ge -c_3$ in the sense that for any complex numbers $\xi_1, \xi_2, \ldots, \xi_n$,
\[
\sum_{l,k=1}^{n} \bar{\xi}_k (W_l)_k \xi_l \ge -c_3 \sum_{l=1}^{n} |\xi_l|^2.
\]
Example 5.2. Let $V : \mathbb{R}^n \to \mathbb{R}$ be the function (potential) given by
\[
V(x) = \sum_{l=1}^{n} a_l x_l^{2m} + Q(x),
\]
where $a_l > 0$, l = 1, 2, . . . , n, m is a positive integer and Q(x) is a polynomial with degree less than or equal to 2m − 1. Choose $W = (W_1, W_2, \ldots, W_n)$, $W_l = \frac{1}{4}(V)_l$, l = 1, 2, . . . , n, that is, $W = \frac{1}{4}\nabla V$. Then there exist positive constants $\alpha_1, \alpha_2, \beta_1$ and $\beta_2$ such that for any l, k = 1, 2, . . . , n,
\[
|(W_l)_k(x)| \le \alpha_1 |x|^{2m-2} + \beta_1, \quad |W(x)| \ge \alpha_2 |x|^{2m-1} - \beta_2. \tag{5.3}
\]
Notice that for any ε > 0,
\[
|x|^{2m-2} \le \varepsilon |x|^{2m-1} \ \text{ if } |x| \ge \varepsilon^{-1}, \qquad |x|^{2m-2} \le \varepsilon^{-(2m-2)} \ \text{ if } |x| \le \varepsilon^{-1}. \tag{5.4}
\]
Combining (5.3) and (5.4), we get that the inequality (5.1) holds. The inequality (5.2) can be checked similarly. Thus W satisfies Assumption 5.1(i)–(iii).
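The two cases in (5.4) combine into the single estimate $|x|^{2m-2} \le \varepsilon |x|^{2m-1} + \varepsilon^{-(2m-2)}$, valid for all x, which is what is used together with (5.3). As a rough numerical sanity check (not part of the original argument), the following Python sketch tests this combined estimate on a grid of radii r = |x|; the function name `bound_holds`, the grid size and the cutoff `r_max` are assumptions made for illustration.

```python
def bound_holds(m: int, eps: float, r_max: float = 50.0, steps: int = 5000) -> bool:
    """Check r**(2m-2) <= eps * r**(2m-1) + eps**(-(2m-2)) on a grid of radii r = |x|."""
    for i in range(steps + 1):
        r = r_max * i / steps
        lhs = r ** (2 * m - 2)
        rhs = eps * r ** (2 * m - 1) + eps ** (-(2 * m - 2))
        # A small relative tolerance guards against floating-point rounding.
        if lhs > rhs * (1 + 1e-12):
            return False
    return True
```

A grid check of this kind only supports the inequality; the proof is the case distinction in (5.4).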
Remark 5.3. (a) By Assumption 5.1(i), $W_l^2 \in L^2_{\mathrm{loc}}(\mathbb{R}^n)$, l = 1, 2, . . . , n. Due to [15, Theorem X.28], $-\Delta + W^2$ is essentially self-adjoint on D.

(b) Let $W = (W_1, W_2, \ldots, W_n)$ be given as in Example 5.2. Then Assumption 5.1(iv) means that Hess V ≥ $-c_3$, where Hess V is the Hessian of V. This is the conservativity condition corresponding to the Hasminskii condition. We can check the no-explosion criterion for (1.6). See [17, Sec. 4.5, pp. 102–103].

Choose
\[
A_l = -W_l, \quad B_l = -i\partial_l, \quad l = 1, 2, \ldots, n,
\]
and consider the operators $L_l$, H, $G_0$ and G given by
\[
L_l u = -(W_l + \partial_l)u, \quad l = 1, 2, \ldots, n,
\]
\[
Hu = \frac{i}{2}\sum_{l=1}^{n} (W_l \partial_l + \partial_l W_l)u,
\]
\[
G_0 u = -\frac{1}{2}\sum_{l=1}^{n} L_l^* L_l u = -\frac{1}{2}\sum_{l=1}^{n} (W_l - \partial_l)(W_l + \partial_l)u = -\frac{1}{2}\Bigl(-\Delta + W^2 - \sum_{l=1}^{n} (W_l)_l\Bigr)u, \tag{5.5}
\]
\[
Gu = -iHu + G_0 u, \tag{5.6}
\]
for u ∈ D. We consider the sesquilinear form $\mathcal{L}(X)$ on h with domain D × D given by
\[
\langle v, \mathcal{L}(X)u\rangle = \langle v, XGu\rangle + \langle Gv, Xu\rangle + \sum_{l=1}^{n} \langle L_l v, X L_l u\rangle
\]
and the semigroup $T = (T_t)_{t\ge 0}$ satisfying the equation
\[
\langle v, T_t(X)u\rangle = \langle v, Xu\rangle + \int_0^t \langle v, \mathcal{L}(T_s(X))u\rangle\, ds \tag{5.7}
\]
for all u, v ∈ D and for all X ∈ B(h).

Theorem 5.4. Suppose that $W = (W_1, W_2, \ldots, W_n)$ satisfies Assumption 5.1(i)–(iii).

(a) The operator G defined as in (5.5) and (5.6) generates a strongly continuous contraction semigroup on h. Moreover, $D = C_0^\infty(\mathbb{R}^n)$ is a core for G.

(b) There exists the minimal q.d.s. $T = (T_t)_{t\ge 0}$ satisfying (5.7).

(c) If, in addition, Assumption 5.1(iv) holds, the minimal q.d.s. $T = (T_t)_{t\ge 0}$ is conservative.
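The second equality in (5.5) rests on the pointwise identity $(W_l - \partial_l)(W_l + \partial_l)u = -\partial_l^2 u + W_l^2 u - (W_l)_l u$. The Python sketch below checks this identity exactly in one variable on polynomial test data, using integer coefficient lists (constant term first). It is only an illustration under stated assumptions: polynomials are not compactly supported, so they are not elements of D, but the identity is local and algebraic; all helper names (`poly_add`, `check_identity`, and so on) are hypothetical.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists (constant term first)."""
    n = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0) for i in range(n)]

def poly_scale(c, p):
    """Multiply a polynomial by the scalar c."""
    return [c * a for a in p]

def poly_mul(p, q):
    """Multiply two polynomials."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Differentiate a polynomial; the derivative of a constant is [0]."""
    d = [i * a for i, a in enumerate(p)][1:]
    return d if d else [0]

def trim(p):
    """Drop trailing zero coefficients so that equal polynomials compare equal."""
    p = list(p)
    while len(p) > 1 and p[-1] == 0:
        p.pop()
    return p

def check_identity(W, u):
    """Return True if (W - d/dx)(W + d/dx)u == -u'' + W^2 u - W' u."""
    v = poly_add(poly_mul(W, u), poly_diff(u))                     # v = (W + d/dx) u
    lhs = poly_add(poly_mul(W, v), poly_scale(-1, poly_diff(v)))   # (W - d/dx) v
    rhs = poly_add(
        poly_scale(-1, poly_diff(poly_diff(u))),
        poly_add(poly_mul(poly_mul(W, W), u),
                 poly_scale(-1, poly_mul(poly_diff(W), u))),
    )
    return trim(lhs) == trim(rhs)
```

For instance, `check_identity([0, 0, 0, 1], [1, 2, 0, 5])` tests $W(x) = x^3$ against $u(x) = 1 + 2x + 5x^3$. Summing the one-variable identity over l gives the expression $-\Delta + W^2 - \sum_l (W_l)_l$ in (5.5).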
Proof. In order to prove the theorem, we apply Theorems 3.4 and 3.6. Thus we only need to check that $-W_l$, $-i\partial_l$, l = 1, 2, . . . , n, satisfy Assumptions 3.1 and 3.5. Recall that
\[
A_l = -W_l, \quad B_l = -i\partial_l, \quad l = 1, 2, \ldots, n,
\]
and $D = C_0^\infty(\mathbb{R}^n)$.

Clearly $-W_l$ and $-i\partial_l$, l = 1, 2, . . . , n, are essentially self-adjoint on D. Since $W_l \in C^2(\mathbb{R}^n)$, l = 1, 2, . . . , n, u ∈ D satisfies condition (3.2). Assumption 3.1(ii) obviously holds because $[W_k, W_l] = 0$, $[\partial_k, \partial_l] = 0$, k, l = 1, 2, . . . , n, on D. As mentioned in Remark 5.3(a), $\sum_{l=1}^{n} (W_l^2 - \partial_l^2) = W^2 - \Delta$ is essentially self-adjoint on D. Thus, Assumption 3.1(iii) is satisfied. We get from Assumption 5.1(ii) that for any u ∈ D and k, l = 1, 2, . . . , n,
\[
\begin{aligned}
\|[i\partial_k, W_l]u\|^2 = \|i(W_l)_k u\|^2 = \langle u, (W_l)_k^2 u\rangle &\le 2\varepsilon^2 \langle u, W^2 u\rangle + 2c(\varepsilon)^2 \|u\|^2 \\
&\le 2\varepsilon^2 \langle u, (-\Delta + W^2)u\rangle + 2c(\varepsilon)^2 \|u\|^2.
\end{aligned} \tag{5.8}
\]
Replacing $2\varepsilon^2$ by ε, (5.8) yields Assumption 3.1(iv). Notice that $[W_j, [i\partial_k, W_l]] = 0$, $[i\partial_j, [i\partial_k, W_l]] = -(W_l)_{kj}$ on D, j, k, l = 1, 2, . . . , n. By a similar calculation, we can check that Assumption 3.1(v) is satisfied. Finally, we check that $-W_l$, $-i\partial_l$, l = 1, 2, . . . , n, satisfy Assumption 3.5. It follows from Assumption 5.1(iv) that for any $\{u_l\}_{l=1}^{n} \subset D$,
\[
\begin{aligned}
\sum_{l,k=1}^{n} \langle u_k, i[i\partial_k, W_l]u_l\rangle &= -\sum_{l,k=1}^{n} \langle u_k, (W_l)_k u_l\rangle \\
&= -\int_{\mathbb{R}^n} \sum_{l,k=1}^{n} \overline{u_k(x)}\,(W_l)_k(x)\,u_l(x)\, dx \\
&\le c_3 \int_{\mathbb{R}^n} \sum_{l=1}^{n} |u_l(x)|^2\, dx = c_3 \sum_{l=1}^{n} \|u_l\|^2.
\end{aligned}
\]
Therefore Assumption 3.5 also holds and the proof of the theorem is complete.

Acknowledgment

The authors would like to thank their anonymous referees for suggestions to improve the paper. This work was supported by Korea Research Foundation Grant (KRF2003-005-00010, KRF-2003-005-C00011).

References

[1] A. M. Chebotarev and F. Fagnola, Sufficient conditions for conservativity of minimal quantum dynamical semigroups, J. Funct. Anal. 153 (1998) 382–404.
[2] C. Bahn and Y. M. Park, Feynman–Kac representation and Markov property of semigroups generated by noncommutative elliptic operators, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 6 (2003) 103–121.
[3] J. M. Lindsay and K. B. Sinha, Feynman–Kac representation of some noncommutative elliptic operators, J. Funct. Anal. 147 (1997) 400–419.
[4] C. Bahn and C. K. Ko, Conservative minimal quantum dynamical semigroups generated by noncommutative elliptic operators, J. Korean Math. Soc. 42 (2005) 1231–1249.
[5] C. Bahn, C. K. Ko and Y. M. Park, Remarks on sufficient conditions for conservativity of minimal quantum dynamical semigroups, Rev. Math. Phys. 17 (2005) 745–768.
[6] A. M. Chebotarev, Lectures on Quantum Probability, Mathematical Contributions: Text 14 (Sociedad Matemática Mexicana, México, 2000), pp. 164–166.
[7] F. Fagnola, Quantum Markov semigroups and quantum flows, Proyecciones 18(3) (1999) 1–144.
[8] F. Fagnola, Diffusion processes in Fock space, Quantum Probab. Related Topics IX (1994) 189–214.
[9] G. Lindblad, On the generators of quantum dynamical semigroups, Comm. Math. Phys. 48 (1976) 119–130.
[10] K. R. Parthasarathy, An Introduction to Quantum Stochastic Calculus, Monographs in Mathematics (Birkhäuser, Basel, 1992).
[11] E. B. Davies, Quantum dynamical semigroups and the neutron diffusion equation, Rep. Math. Phys. 11 (1977) 169–188.
[12] A. M. Chebotarev and F. Fagnola, Sufficient conditions for conservativity of quantum dynamical semigroups, J. Funct. Anal. 118 (1993) 131–153.
[13] A. M. Chebotarev, Sufficient conditions for conservativity of dynamical semigroups, Theor. Math. Phys. 80(2) (1989).
[14] F. Fagnola, Chebotarev’s sufficient conditions for conservativity of quantum dynamical semigroups, Quantum Probab. Related Topics VIII (1993) 123–142.
[15] M. Reed and B. Simon, Methods of Modern Mathematical Physics I, II (Academic Press, 1980).
[16] A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations (Springer Verlag, 1983).
[17] H. P. McKean, Stochastic Integrals (Academic Press, 1969).
September 12, 2006 14:40 WSPC/148-RMP
J070-00274
Reviews in Mathematical Physics Vol. 18, No. 6 (2006) 619–653
© World Scientific Publishing Company
STEADY STATE FLUCTUATIONS OF THE DISSIPATED HEAT FOR A QUANTUM STOCHASTIC MODEL
WOJCIECH DE ROECK∗ and CHRISTIAN MAES† Instituut voor Theoretische Fysica, K. U. Leuven, Belgium ∗[email protected] †[email protected]
Received 16 March 2006
Revised 6 June 2006

We introduce a quantum stochastic dynamics for heat conduction. A multi-level subsystem is coupled to reservoirs at different temperatures. Energy quanta are detected in the reservoirs allowing the study of steady state fluctuations of the entropy dissipation. Our main result states a symmetry in its large deviation rate function.

Keywords: Entropy production; fluctuation theorem; quantum stochastic calculus.

Mathematics Subject Classification 2000: 82C10, 82C31
∗Aspirant of the Flemish Research Fund (FWO), University of Antwerp.

1. Introduction

Steady state statistical mechanics wants to construct and to characterize the stationary distribution of a subsystem in contact with several reservoirs. By nature, the required scenario is an idealization as some essential specifications of the reservoirs must be kept constant. For example, intensive quantities such as temperature or (electro-)chemical potential of the different reservoirs are defined and unchanged for an extensive amount of time, ideally ad infinitum. Reservoirs do not interact directly with each other but only via the subsystem; they remain at their same spatial location and can be identified at all times. That does not mean that nothing happens to the reservoirs; flows of energy or matter reach them and they are like sinks and sources of currents that flow through the subsystem.

Concrete realizations and models of steady states vary widely depending on the type of substances and on the nature of the driving mechanism. An old and standard problem takes the subsystem as a solid in contact at its ends with two heat reservoirs and investigates properties of the energy flow. Beloved by many is a classical model consisting of a chain or an array of coupled anharmonic
oscillators connected to thermal noises at the boundaries. The reservoirs are there effectively modeled by Langevin forces while the bulk of the subsystem undergoes a Hamiltonian dynamics, see e.g. [19, 41, 45].

Our model to be specified below is a quantum analogue of that scenario in the sense that we also consider a combination of Hamiltonian dynamics and Markovian thermal noises. We imagine a chain of coupled two (or multi)-level systems. The dynamics of the isolated subsystem is unitary with Hamiltonian $H_S$. Quanta of energy ω are associated to the elementary transitions between energy levels. Two physical reservoirs at inverse temperatures $\beta_k$, k = 1, 2, are now attached to the subsystem. The total dynamics is described by a quantum stochastic differential equation through which we can observe the number $N_{\omega,k}$ of quanta with energy ω that are piled up in the kth reservoir. The total energy $N := H_S + \sum_{\omega,k} \omega N_{\omega,k}$ is conserved under the dynamics (Proposition 2.13). The change in the second term corresponds to the flow of energy quanta in and out of the reservoirs and specifies the dissipated heat. Our main result consists in obtaining a symmetry in the fluctuations of that dissipated heat that extends the so-called steady state fluctuation theorem for the entropy production to a quantum regime (Proposition 2.14).

The quantum stochastic evolution that defines the model is a particular dilation of a semigroup dynamics that describes the weak coupling regime of our subsystem coupled to quasi-free boson fields. The dilation, a sort of quantum Langevin equation, is much richer and enables the introduction of a natural path space measure. One should remember here that a major conceptual difficulty in coming to terms with the notion of a variable entropy production for quantum steady states is to understand its path-dependence. One option is to interrupt the unitary dynamics with collapses, see e.g. [12].
Others have proposed an entropy production operator, avoiding the problem of path-dependence. Our set-up follows a procedure that is well known in quantum optics with thermal noises formally replacing photon detectors, see [7, 8]. In the resulting picture, we record each energy quantum that is transferred between subsystem and reservoirs. It induces a stochastic process on quanta transferrals and there remains no problem to interpret the fluctuations of the entropy production. From the mathematical point of view, the model can be analyzed via standard probabilistic techniques.
1.1. Related results

In the past decade, a lot of interest has been going to the Gallavotti–Cohen fluctuation relation [16, 21, 22]; see [38] for more recent references. In its simplest form, that relation states that the steady state probability (Prob) of observing a total entropy decrease $\mathbf{w}_T = -wT$ in a time T is exponentially damped with respect to the probability of observing an increase of $wT$ as
\[
\mathrm{Prob}(\mathbf{w}_T = wT) \approx e^{+wT}\, \mathrm{Prob}(\mathbf{w}_T = -wT) \tag{1.1}
\]
at least for very large time spans T. The relation (1.1) is known as the steady state fluctuation theorem (SSFT) and states a symmetry in the fluctuations of the entropy dissipation in a stationary nonequilibrium state. The symmetry was first discovered in the context of dynamical systems and was applied to the phase space contraction rate in strongly chaotic dynamical systems, see [16, 21, 22, 47]. It was first further developed for stochastic dynamics in [32, 35, 37].

In the present paper, we deal with the SSFT for a quantum system. The steady state condition must however be understood in a physical sense; it is about heat conduction for fixed reservoirs in a long time limit. The small system is treated in the steady state of an approximate dynamics (the weak coupling limit) while the reservoirs are kept at a fixed temperature. Yet, mathematically, we are not quite dealing with a steady or stationary state. The true total dynamics for system plus reservoirs is in fact much more complicated. We still speak about the steady state (and the SSFT) also to contrast it with transient versions of the fluctuation symmetry (1.1), see also [23]. Transient fluctuation theorems (TFT) start typically from a change of variables at a finite time t, reversing, so to say, the evolution (see [50]), and can be obtained equally well for classical as for quantum systems. That is not at all what we are doing here. We truly concentrate on the stationary heat dissipation in the reservoirs, but from a technical point of view, one could argue that our set-up is actually a “transient model of a steady state”.

The basic underlying mechanism and general unifying principles connecting SSFT and TFT with statistical mechanical entropy have been explained in [38, 40]. Monnai and Tasaki [43] have investigated an exactly solvable harmonic system and found quantum corrections to both SSFT and TFT. Matsui and Tasaki [42] proved a quantum TFT in a general $C^*$-algebraic setting. It is however unclear what the meaning of their entropy production operator is. A related quantum Jarzynski relation was studied in [46].

Besides the fluctuation theorem, we also describe a new approach to the study of heat conduction in the quantum weak coupling limit. In [34], Lebowitz and Spohn studied the thermodynamics of the weak-coupling generator. They identified the mean currents, and they proved a Green–Kubo relation. At that time, it was however not yet possible to conclude that these expressions are the first non-zero contributions to their counterparts at finite coupling λ. That has recently been shown in a series of papers by Jakšić and Pillet [28–31], who used spectral techniques to study the system at finite coupling λ. It was also shown that the stationary state of the weak-coupling generator is the zeroth order contribution to the system part of the so-called NESS, the nonequilibrium steady state. The current fluctuations we define in our model agree with the expressions of [34] as far as the mean currents and the Green–Kubo formula are concerned. Our entropy production operator is however new; it differs, for example, from the proposal of [42].

The approach taken here also differs from the more standard route that has been followed and that was outlined by Ruelle in [48]. Recently and within that approach and context of heat conduction, new results have been obtained in [5, 30, 31]. To us, it remains however
very much unclear how to define and study in that scenario a fluctuating entropy; in contrast, that is exactly one of the things we can easily achieve via our approach but we remain in the weak-coupling limit.

1.2. Basic strategies

1.2.1. Microscopic approach

In general, one would like to start from microscopic quantum dynamics. The system is then represented by a finite-dimensional Hilbert space H and system Hamiltonian $H_S$. The environment is made from thermal reservoirs, indexed by k ∈ K, infinitely extended quantum mechanical systems, with formal Hamiltonian
\[
H_R := \sum_{k\in K} H_{R_k}.
\]
The coupling between system and reservoirs is local and via some bounded interaction term $\lambda H_{S-R}$ so that the total Hamiltonian takes the form
\[
H_\lambda := H_S \otimes 1 + 1 \otimes H_R + \lambda \sum_{k\in K} V_k \otimes R_k,
\]
where we have already inserted a specific form for the coupling $H_{S-R}$ using self-adjoint reservoir operators $R_k$ and $V_k$ acting on, respectively, $H_{R_k}$ and H. On the same formal level, which can however easily be made precise, the total quantum dynamics is then just
\[
U_t^\lambda := e^{-iH_\lambda t}.
\]
We will not follow the beautiful spectral or scattering approach that has recently been exploited for that nonequilibrium problem. We refer the reader to the specialized references such as [31, 48] and we only outline the main steps, totally ignoring essential assumptions and technicalities: One starts the dynamics from an initial state
\[
\mu := \rho_S \otimes \rho_{R_1} \otimes \cdots \otimes \rho_{R_{|K|}},
\]
where $\rho_S$ stands for an initial state in the system and the $\rho_{R_k}$ are equilibrium KMS states at inverse temperature $\beta_k$ for the kth reservoir. The quantum dynamics takes that initial state to the new (now coupled) state $\mu_t$ at time t > 0. The NESS is obtained via an ergodic average
\[
\mu_{\mathrm{NESS}} := \lim_{T\uparrow+\infty} \frac{1}{T}\int_0^T dt\, \mu_t. \tag{1.2}
\]
One of the first questions (and partially solved elsewhere, see, e.g., [4, 31, 48]) is then to derive the natural conditions under which the mean entropy production rate
\[
\dot{S} := i\sum_{k=1}^{m} \beta_k\, \mu_{\mathrm{NESS}}([H_\lambda, H_{R_k}])
\]
is strictly positive. While that mean entropy production certainly coincides with conventional wisdom, we do not however believe that the operator $i[H_\lambda, H_{R_k}]$, or equivalent expressions, is the physically correct candidate for the study of current fluctuations which would obey the SSFT. That is not even the case for the simplest (classical) stochastic dynamics; one needs to go to path space and study current fluctuations in terms of (fluctuating) trajectories.

1.2.2. Weak coupling approach

Starting from the microscopic dynamics above, we can of course always look at the reduced dynamics $\Lambda_t^\lambda$ on the system
\[
\Lambda_t^\lambda \rho_S = \mathrm{Tr}_R\bigl[U_t^\lambda (\rho_S \otimes \rho_{R_1} \otimes \cdots \otimes \rho_{R_{|K|}}) U_{-t}^\lambda\bigr]
\]
for a density matrix $\rho_S$ on the system. Obviously, the microscopic evolution couples the system with the environment and the product form of the state will, in general, not be preserved. One can however attempt a Boltzmann-type Ansatz or projection technique to enforce a repeated randomization. That can be made rigorous in the so-called weak-coupling limit. For that, one needs the interaction picture and one keeps $\lambda^2 t = \tau$ fixed. That is the Van Hove–Davies limit [13, 25]
\[
\lim_{\lambda\to 0} \Lambda_{-t}^0 \Lambda_t^\lambda \rho_S := e^{\tau L^*} \rho_S,
\]
where $L^*$ is a linear operator acting on density matrices for the system. The generator will be written out more explicitly in Sec. 2.1 but its dual L acting on B(H) is of the form, see (2.10),
\[
L(\cdot) = i[H_f, \cdot] + \sum_{k\in K} L_k(\cdot),
\]
where the $L_k$ can be identified with the contribution to the dissipation from the kth reservoir. $H_f$ is an effective, renormalized Hamiltonian depending on details of the reservoirs and the coupling. From now on, we write ρ for the (assumed) unique invariant state (see also Remark 2.1):
\[
e^{\tau L^*} \rho = \rho, \quad \tau \ge 0.
\]
Again, one can study here the mean entropy production, as, for example, done in [34], and argue that
\[
\mathrm{Tr}[\rho L_k H_S]
\]
represents the stationary heat flow into the kth reservoir, at least in the weak-coupling regime. Nothing tells us here however about the physical fluctuations in the heat current, for which higher moments should be considered. In fact, the reservoirs are no longer visible as the weak-coupling dynamics is really a jump process on
the energy levels of the system Hamiltonian, see further in Sec. 3.1. The heat flow and the energy changes in the individual reservoirs cannot be reconstructed from the changes in the system. The present paper uses a new idea for the study of the fluctuations of the heat dissipation in a reservoir.

1.2.3. Dilation

While the weak coupling dynamics is very useful for problems of thermal relaxation (one reservoir) and for identifying the conditions of microscopic reversibility (detailed balance) characterizing an equilibrium dynamics, not sufficient information is left in the weak-coupling limit to identify the variable heat dissipated in the various reservoirs. Heat is path-dependent and we need at least a notion of energy-trajectories.a The good news is that we can obtain such a representation at the same time as we obtain a particular dilation of the weak-coupling dynamics. The representation is basically achieved via an unraveling of the weak-coupling generator L and the corresponding Dyson expansion of the semigroup dynamics. That will be explained in Sec. 2.2. There are many possible dilations of a quantum dissipation. It turns out that there is a dilation whose restriction to the system coincides with the Dyson representation in terms of energy-trajectories of the weak-coupling dynamics. That dilation is well studied and goes under the name of quantum stochastic dynamics. The associated quantum stochastic calculus was invented by Hudson and Parthasarathy [26]. It has been extensively employed for the purpose of quantum counting processes, see, e.g., [7, 8]. Various representations and simplifications have been added, such as in [1] where a (classical) Brownian motion extends the quantum dissipation. Unravelings of generators were first employed in quantum optics in [49]; they are further discussed in [10].

a At least, if one has a stochastic or effective description of the system dynamics, as is the case in the weak-coupling limit. We do not claim at all that the trajectory-picture is microscopically fundamental.

1.2.4. Results

We prove a symmetry in the large deviation generating function of the dissipated heat (Proposition 2.8). This function is analytic and this implies the large deviation principle. The symmetry is recognized as the fluctuation theorem for the entropy production. The precise form of the fluctuation theorem depends on whether the model has been derived from a reversible or an irreversible (e.g., because of the presence of magnetic fields) dynamics. This point was clarified in [39]. By a theorem of Bryc [9], analyticity of the generating function implies the central limit theorem for the currents. We do not stress this point but it is implicitly used in deriving a Green–Kubo relation and Onsager reciprocity (Proposition 2.10), or modifications of these, again depending on the reversibility of the original model. In all cases, the
fluctuation symmetry helps to establish strict positivity of the entropy production (Proposition 2.9). Let us stress that our main result, Proposition 2.8, depends on an interpretation, as described above under Sec. 1.2.3. However, the consequences of our main result, Propositions 2.9 and 2.10 do not depend on this interpretation. This will be further discussed in Sec. 3.
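A purely classical toy example may help to fix ideas about what a symmetry of the generating function looks like. The Python sketch below is not the quantum model of this paper: it assumes a biased random walk in which each step transfers one quantum forward with probability p or backward with probability q = 1 − p, and it assigns the entropy change ±log(p/q) to each step. The function names and the sign convention for the generating function are assumptions made for illustration. Under these assumptions, the probability ratio of opposite net transfers is exactly exponential in the entropy production, and the per-step cumulant generating function e(κ) = log E[exp(−κw)] satisfies e(κ) = e(1 − κ).

```python
import math

def prob_net(n: int, j: int, p: float) -> float:
    """Probability that n independent +/-1 steps (up with probability p) sum to j."""
    if abs(j) > n or (n + j) % 2:
        return 0.0
    ups = (n + j) // 2
    return math.comb(n, ups) * p ** ups * (1 - p) ** (n - ups)

def entropy_production(j: int, p: float) -> float:
    """Entropy production assigned to a net displacement j in this toy model."""
    return j * math.log(p / (1 - p))

def cgf(kappa: float, p: float) -> float:
    """Per-step cumulant generating function e(kappa) = log E[exp(-kappa * w)]."""
    q = 1 - p
    return math.log(p ** (1 - kappa) * q ** kappa + q ** (1 - kappa) * p ** kappa)
```

With p = 0.7, for example, `prob_net(10, 4, 0.7) / prob_net(10, -4, 0.7)` agrees with `math.exp(entropy_production(4, 0.7))`, and `cgf(0.3, 0.7)` agrees with `cgf(0.7, 0.7)`, up to rounding.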
1.2.5. Comparison with earlier results

Technically, our fluctuation theorem is very close to the results obtained in [32] or [35]. The Green–Kubo relations and Onsager reciprocity have been established recently in, e.g., [27] for the spin-fermion model. In the weak-coupling limit, they were discussed already in [34]; however, there the authors did not distinguish between reversible and irreversible models (this is commented upon in Remark 2.6). The strict positivity in the weak-coupling limit was proven in [31]b (for the spin-fermion model) and in [4] (under general conditions). Our theorem on strict positivity is however slightly more general: Assuming the existence of a unique, faithful stationary state, we formulate a necessary and sufficient condition for strict positivity.
1.3. Outline of the paper

In Sec. 2, we introduce the quantum stochastic model and state the result. A discussion follows in Sec. 3, where the main points and novelties are emphasized. Proofs are postponed to Sec. 4.
2. The Model

2.1. Weak coupling

We briefly introduce here the weak-coupling dynamics without speaking about its derivation, which is not relevant for the discussion here. Some of that was briefly addressed in Secs. 1.2.1 and 1.2.2 and it is covered in detail in [13] and [34]. Let H be a finite-dimensional Hilbert space assigned to a small subsystem, called system in what follows. Let $H_S$ be a self-adjoint Hamiltonian on H. Introduce the set of Bohr frequencies
\[
F := \{\omega \in \mathbb{R} \mid \exists\, e, e' \in \mathrm{sp}\, H_S : \omega = e - e'\}. \tag{2.1}
\]
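For a finite spectrum, the set (2.1) is simply the set of all pairwise differences of eigenvalues. The following Python sketch makes that concrete; the function name `bohr_frequencies` and the rounding parameter `digits` (used only to merge floating-point duplicates) are assumptions for illustration.

```python
def bohr_frequencies(spectrum, digits: int = 12):
    """Return the sorted set {e - e' : e, e' in spectrum} of Bohr frequencies."""
    return sorted({round(e - ep, digits) for e in spectrum for ep in spectrum})
```

For a two-level system with spectrum {0, 1}, this gives the frequencies −1, 0 and 1. The set always contains 0 and is symmetric under ω → −ω.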
Remark that F is the set of eigenvalues of the derivation $-i[H_S, \cdot]$. We label by k ∈ K (a finite number of) different heat reservoirs at inverse temperatures $\beta_k < \infty$.

b Besides, from [31], it follows that the strict positivity remains true at small non-zero coupling, without taking the weak-coupling limit.
To each reservoir k is assigned a self-adjoint operator $V_k \in B(H)$ and for each k ∈ K, ω ∈ F, we put
\[
V_{\omega,k} = \sum_{\substack{e, e' \in \mathrm{sp}\, H_S \\ \omega = e - e'}} 1_e(H_S)\, V_k\, 1_{e'}(H_S), \tag{2.2}
\]
where $1_e(H_S)$ for $e \in \mathrm{sp}(H_S)$ is the spectral projection on e associated to $H_S$. Fix for k ∈ K nonnegative functions $\eta_k \in L^1(\mathbb{R})$ and assume them to be Hölder continuous in F ⊂ R and satisfying the condition
\[
\eta_k(x) = e^{-\beta_k x}\, \eta_k(-x) \ge 0, \quad x \in \mathbb{R}, \tag{2.3}
\]
which is related to the KMS equilibrium conditions in the reservoir k ∈ K, see further under Remark 2.5. Write also for ω ∈ F, k ∈ K,
\[
s_k(\omega) = \lim_{\epsilon\downarrow 0} \int_{\mathbb{R}\setminus[\omega-\epsilon,\,\omega+\epsilon]} dx\, \frac{\eta_k(x)}{\omega - x}, \tag{2.4}
\]
which is well defined by the assumption of Hölder continuity for $\eta_{k\in K}$. From now on, we simply write the indices ω, k for ω ∈ F, k ∈ K. We consider the self-adjoint Hamiltonian
\[
H_f := \sum_{\omega,k} s_k(\omega)\, V_{\omega,k}^* V_{\omega,k} \tag{2.5}
\]
satisfying, by construction,
\[
[H_f, H_S] = 0. \tag{2.6}
\]
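To see what the condition (2.3) asks of the functions $\eta_k$, it can help to test it on an explicit example. The Python sketch below uses the family η(x) = exp(−βx/2 − x²), which is an assumption chosen only because it is integrable, smooth and easy to check by hand; it is not taken from this paper or from any particular reservoir model.

```python
import math

def eta(x: float, beta: float) -> float:
    """Illustrative rate function exp(-beta*x/2 - x**2) (an assumed example)."""
    return math.exp(-0.5 * beta * x - x * x)

def satisfies_kms(x: float, beta: float) -> bool:
    """Check the relation eta(x) == exp(-beta * x) * eta(-x) of (2.3) at one point."""
    return math.isclose(eta(x, beta), math.exp(-beta * x) * eta(-x, beta), rel_tol=1e-9)
```

In particular, η(ω)/η(−ω) = exp(−βω) for this family, so transitions with Bohr frequency ω > 0 are weighted by a Boltzmann factor relative to those with frequency −ω.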
We work with the following generator L on B(H)
\[
L(\cdot) = i[H_f, \cdot] + \sum_{\omega,k} \eta_k(\omega)\Bigl(V_{\omega,k}^* \cdot V_{\omega,k} - \frac{1}{2}\{V_{\omega,k}^* V_{\omega,k}, \cdot\}\Bigr). \tag{2.7}
\]
Putting T(H) ⊂ B(H) the set of all density matrices on H, i.e.
\[
\mu \in T(H) \Leftrightarrow \mathrm{Tr}[\mu] = 1, \quad \mu \ge 0, \tag{2.8}
\]
one introduces the dual generator $L^*$ on T(H), defined through
\[
\mathrm{Tr}[A L^* \mu] = \mathrm{Tr}[\mu L A], \quad A \in B(H),\ \mu \in T(H). \tag{2.9}
\]
By grouping all terms with the same k in (2.7), we can also write
\[
L(\cdot) = i[H_f, \cdot] + \sum_{k\in K} L_k(\cdot). \tag{2.10}
\]
Both L and $L_{k\in K}$ are of the Lindblad form [33] and hence they generate completely positive semigroups $e^{tL}$ and $e^{tL_k}$. A ρ ∈ T(H) is a stationary state for the semigroup $e^{tL}$ iff
\[
L^* \rho = 0 \quad \text{or, equivalently,} \quad e^{tL^*} \rho = \rho. \tag{2.11}
\]
We fix an anti-unitary operator T on H, which has to be thought of as playing the role of time reversal. Let
\[
H_S^\theta := T H_S T, \quad V_k^\theta := T V_k T. \tag{2.12}
\]
That defines a new model, satisfying all necessary requirements. This model can be thought of as the time-reversal of the original one. We will need the following assumptions:

Assumption A1. We ask triviality of the commutant
\[
\{\eta_k^{1/2}(\omega)\, V_{\omega,k} \mid k \in K,\ \omega \in F\}' = \mathbb{C}1, \tag{2.13}
\]
where for $\mathcal{A} \subset B(H)$,
\[
B \in \mathcal{A}' \Leftrightarrow \forall A \in \mathcal{A} : [A, B] = 0. \tag{2.14}
\]
That ensures the existence of a unique stationary state, as stated in Remark 2.1.

Assumption A2. We ask that the system can complete a closed cycle in which the entropy production is non-zero. More precisely, there are sequences $\omega_1, \ldots, \omega_n$ in F and $k_1, \ldots, k_n$ in K such that

1.
\[
\sum_{i=1}^{n} \beta_{k_i} \omega_i \ne 0. \tag{2.15}
\]

2. There is a one-dimensional projection P ∈ B(H) such that
\[
\mathrm{Tr}[P V_{\omega_n,k_n} \cdots V_{\omega_2,k_2} V_{\omega_1,k_1} P] \ne 0. \tag{2.16}
\]

Assumption A3. This assumption expresses that our model is time-reversal invariant. It will be used in deriving the full fluctuation theorem, the Green–Kubo relations and Onsager reciprocity.
\[
H_S^\theta = H_S, \quad \forall k \in K : V_k^\theta = V_k. \tag{2.17}
\]

Remark 2.1. If Assumption A1 holds, then, by a theorem of Frigerio (Theorem 3.2 in [20]) and the fact that $\beta_{k\in K} < \infty$, the semigroup $e^{tL}$ has a unique stationary state ρ. This state is faithful, i.e. for all non-zero projections P ∈ B(H):
\[
\mathrm{Tr}[\rho P] > 0. \tag{2.18}
\]
Assumption A1 is actually a necessary condition for the existence of a unique stationary state.

Remark 2.2. Assumption A2 comprises the intuitive assumption that the system does not break up in independent subsystems which are coupled separately to the reservoirs. If that were the case, then most of our results would still hold but they would become trivial. For example, the rate function e from Proposition 2.8 would satisfy ∀ κ ∈ C : e(κ) = 0.
Remark 2.3. If for all k ∈ K, βk = β for some β, then ρβ := exp(−βHS )/Tr[exp(−βHS )]
(2.19)
is a stationary state for etL , as follows from the condition (2.3) and the explicit form (2.7). Remark 2.4. If A1 holds (assuring the uniqueness of the stationary state), then one easily checks ∀ A ∈ B(H),
∀ e = e ∈ sp HS : lim etL (1e (HS )A1e (HS )) = 0 t↑+∞
(2.20)
which is usually called “decoherence”. As a consequence of (2.20), the stationary state ρ ∈ T (H) of etL satisfies, 1e (HS )ρ1e (HS ) = ρ. (2.21) e∈sp HS
Remark 2.5. If one derives the model from a microscopic set-up, then one can be more specific. Let $\mathcal{H}_{R_k}$ be the Hilbert space of the $k$th reservoir and $\rho_k$ a thermal equilibrium state at $\beta_k$ on (a subalgebra of) $\mathcal{B}(\mathcal{H}_{R_k})$. Assume the coupling is given by
$$\sum_{k \in K} V_k \otimes R_k, \qquad R_k = R_k^* \in \mathcal{B}(\mathcal{H}_{R_k}). \tag{2.22}$$
Then the functions $\eta_k$ are Fourier transforms of the autocorrelation function of $R_k$ and the KMS conditions imply (2.3). All this is discussed at length in [35]. The restriction to couplings of the form (2.22), where each term is self-adjoint by itself, is not necessary. Besides, one can also have multiple couplings per reservoir. Since this complicates our notation without introducing any novelty, we adhere to the simple form (2.22).

Remark 2.6. If $H_S$ is nondegenerate, one can choose $T$ as follows: Let $\psi_e$, $e \in \mathrm{sp}\,H_S$, be a complete set of eigenvectors for $H_S$ and put
$$T\Bigl(\sum_{e \in \mathrm{sp}\,H_S} c_e \psi_e\Bigr) = \sum_{e \in \mathrm{sp}\,H_S} \bar{c}_e \psi_e, \qquad c_e \in \mathbb{C} \text{ for } e \in \mathrm{sp}\,H_S. \tag{2.23}$$
Although this does not necessarily imply Assumption A3, it does imply HSθ = HS ,
∗ θ θ ∗ Vω,k AVω,k = Vω,k A(Vω,k ) ,
A ∈ B(H),
(2.24)
which, as one can check from the proofs, can replace A3 for all purposes of this paper. Hence, a nondegenerate model is automatically time-reversal invariant. This explains why in [34] the Green–Kubo relations were derived for nondegenerate Hamiltonians without speaking about microscopic time-reversal. It also explains why time-reversal does not appear naturally in the framework of classical Markov jump processes.
September 12, 2006 14:40 WSPC/148-RMP
J070-00274
Steady State Fluctuations of the Dissipated Heat
629
2.2. Unraveling the generator

We associate to that semigroup dynamics, generated by (2.7), a pathspace measure by a procedure which is known as "unraveling the generator". Basically, we will introduce $|F| \times |K|$ Poissonian clocks, one for each reservoir and each Bohr frequency. Whenever the clock $(\omega, k)$ ticks, our system will make a transition with Bohr frequency $\omega$, induced by reservoir $k$. This will be our "a priori" measure $d\sigma$ (see further). If $H_S$ is nondegenerate, then it is very easy to upgrade $d\sigma$ to the appropriate pathspace measure: one multiplies $d\sigma$ by a certain factor for each jump and by factors for the waiting times, obtaining something of the form
$$dP_{\rho_0}(\sigma) = e^{-(t-t_n)r_{n+1}}\, c_n \cdots c_2\, e^{-(t_2-t_1)r_2}\, c_1\, e^{-t_1 r_1}\, d\sigma \tag{2.25}$$
for some positive numbers $c_1, \ldots, c_n$ and $r_1, \ldots, r_{n+1}$ and initial state $\rho_0$. When $H_S$ is degenerate, one has to do things more carefully, leading to the expression (2.42) in Lemma 2.7. The technical difference between degenerate and nondegenerate $H_S$ is further discussed in Sec. 3.

2.2.1. Preliminaries

Put
$$\Omega^1_t := \{\sigma \subset [0, t] \,\big|\, |\sigma| < \infty\}, \qquad \Omega^1 := \{\sigma \subset \mathbb{R}^+ \,\big|\, |\sigma| < \infty\}, \tag{2.26}$$
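where $|\sigma|$ is the cardinality of the set $\sigma \subset \mathbb{R}$. Let $(\Omega^1_t)_{\omega,k}$, $\Omega^1_{\omega,k}$ stand for identical copies of $\Omega^1_t$, $\Omega^1$ and put
$$\Omega := \mathop{\times}_{\omega,k} \Omega^1_{\omega,k}, \qquad \Omega_t := \mathop{\times}_{\omega,k} (\Omega^1_t)_{\omega,k}, \tag{2.27}$$
where $\Omega$ and $\Omega_t$ are called Guichardet spaces, see [24]. An element $\sigma \in \Omega$ looks like
$$\sigma = (\omega_1, k_1, t_1; \ldots; \omega_n, k_n, t_n) \quad \text{with } 0 < t_1 < t_2 < \cdots < t_n < +\infty. \tag{2.28}$$
Alternatively, and corresponding to the product in (2.27):
$$\sigma = (\sigma_{\omega,k})_{\omega,k} \ \text{ with } \sigma_{\omega,k} \in \Omega^1_{\omega,k}, \qquad |\sigma| := \sum_{\omega,k} |\sigma_{\omega,k}|. \tag{2.29}$$
We define integration on $\Omega_t$ and $\Omega$ by putting, for any sequence of functions $g = (g_n)_{n \in \mathbb{N}}$ with $g_n$ a measurable function on $F^n \times K^n \times (\mathbb{R}^+)^n$ for all $n \in \mathbb{N}$,
$$\int_{\Omega_t} d\sigma\, g(\sigma) := \sum_{n=0}^{\infty}\ \sum_{(k_1,\ldots,k_n) \in K^n}\ \sum_{(\omega_1,\ldots,\omega_n) \in F^n} \int_{\Delta^n_t} dt_1 \cdots dt_n\, g_n((\omega_1, k_1, t_1; \ldots; \omega_n, k_n, t_n)), \tag{2.30}$$
where $\Delta^n_t \subset \mathbb{R}^n$ is the simplex
$$(t_1, \ldots, t_n) \in \Delta^n_t \iff 0 < t_1 < \cdots < t_n < t. \tag{2.31}$$
The equality (2.30) defines the symbol "$d\sigma$" and the notion of measurable sets in $\Omega_t$ or $\Omega$ (for the latter, take $t = \infty$ in the above definitions).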
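As a quick sanity check of the definition (2.30) (our own illustration, not part of the original argument), take $g(\sigma) = c^{|\sigma|}$ for a constant $c \in \mathbb{C}$, i.e. $g_n \equiv c^n$. The simplex of (2.31) has volume $t^n/n!$ and there are $(|F||K|)^n$ label sequences of length $n$, so

```latex
\int_{\Omega_t} d\sigma\, c^{|\sigma|}
  = \sum_{n=0}^{\infty} \bigl(|F|\,|K|\bigr)^n c^n \,\frac{t^n}{n!}
  = e^{|F|\,|K|\, c\, t}.
```

In particular, for $c = 1$ the a priori measure $d\sigma$ has finite total mass $e^{|F||K|t}$ on $\Omega_t$; normalizing by this mass gives the law of $|F| \times |K|$ independent unit-rate Poisson clocks on $[0, t]$.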
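For future use, we introduce "number functions" $n^t_{\omega,k}$, defined as
$$n^t_{\omega,k}(\sigma) := |\sigma_{\omega,k} \cap [0, t]|, \qquad n_{\omega,k}(\sigma) := |\sigma_{\omega,k}|, \tag{2.32}$$
and the abbreviations $\sigma \cup \tau$ and $\xi \setminus \tau$ for elements of $\Omega$, defined by
$$(\sigma \cup \tau)_{\omega,k} := \sigma_{\omega,k} \cup \tau_{\omega,k}, \qquad \sigma = \xi \setminus \tau \iff \sigma \cup \tau = \xi. \tag{2.33}$$
If $\sigma \in \Omega_s$ and $\tau \in \Omega_u$, we also need $\sigma\tau \in \Omega_{s+u}$, defined by
$$(\sigma\tau)_{\omega,k} := \sigma_{\omega,k} \cup (s + \tau_{\omega,k}), \quad \text{where } q \in s + \tau_{\omega,k} \iff q - s \in \tau_{\omega,k}. \tag{2.34}$$
Note that a function $g$ on $\Omega_s$ is naturally made into a function on $\Omega_{s+u}$ by putting, in the notation of (2.34),
$$g(\sigma\tau) := g(\sigma), \qquad \sigma \in \Omega_s,\ \tau \in \Omega_u. \tag{2.35}$$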
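The bookkeeping of (2.28) and (2.32)–(2.34) can be mirrored in a few lines of Python. This is only an illustrative sketch: the representation of $\sigma$ as a time-ordered list of `(omega, k, t)` tuples and the names `number_function`, `concatenate` and `entropy_current` are our own choices, not notation from the text. The last helper evaluates $-\sum_{\omega,k} \beta_k\, \omega\, n^t_{\omega,k}(\sigma)$, the quantity that is later called the integrated entropy current.

```python
from typing import Dict, List, Tuple

# One event = (Bohr frequency omega, reservoir label k, jump time t), cf. (2.28).
Event = Tuple[float, int, float]
Path = List[Event]


def number_function(sigma: Path, omega: float, k: int,
                    t: float = float("inf")) -> int:
    """n^t_{omega,k}(sigma): number of (omega, k) events with time in [0, t]."""
    return sum(1 for (w, r, s) in sigma if w == omega and r == k and 0 <= s <= t)


def concatenate(sigma: Path, s: float, tau: Path) -> Path:
    """The path 'sigma tau' of (2.34): keep sigma, shift all times of tau by s."""
    if any(time > s for (_, _, time) in sigma):
        raise ValueError("sigma must only contain times in [0, s]")
    return sigma + [(w, r, s + time) for (w, r, time) in tau]


def entropy_current(sigma: Path, betas: Dict[int, float], t: float) -> float:
    """- sum over events up to time t of beta_k * omega."""
    return -sum(betas[r] * w for (w, r, s) in sigma if 0 <= s <= t)
```

For example, concatenating a two-event path on $[0, 1]$ with a one-event path simply appends the shifted event, and the number functions add up accordingly.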
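2.2.2. Constructing a pathspace measure

Write the weak-coupling generator (2.7) as
$$L = L_0 + \sum_{\omega,k} J_{\omega,k} \tag{2.36}$$
with
$$J_{\omega,k}(\cdot) := \eta_k(\omega)\, V^*_{\omega,k} \cdot V_{\omega,k} \quad \text{and} \quad L_0(\cdot) := i[H_f, \cdot] - \frac{1}{2} \sum_{\omega,k} \eta_k(\omega) \{V^*_{\omega,k} V_{\omega,k}, \cdot\}. \tag{2.37}$$
Consider $W_t(\sigma) : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ as the completely positive map depending on $\sigma \in \Omega$,
$$W_t(\sigma) := I_{\Omega_t}(\sigma)\, e^{t_1 L_0} J_{\omega_1,k_1} e^{(t_2-t_1)L_0} \cdots e^{(t_{|\sigma|}-t_{|\sigma|-1})L_0} J_{\omega_{|\sigma|},k_{|\sigma|}} e^{(t-t_{|\sigma|})L_0} \tag{2.38}$$
with $I_{\Omega_t}$ the indicator function of $\Omega_t \subset \Omega$ and with the indices $(\omega_i, k_i)$, $i = 1, \ldots, |\sigma|$, referring to the representation (2.28) of $\sigma$. To verify the complete positivity of (2.38), rewrite $L_0$ as
$$L_0(\cdot) = S \cdot + \cdot\, S^*, \qquad S = iH_f - \frac{1}{2} \sum_{\omega,k} \eta_k(\omega) V^*_{\omega,k} V_{\omega,k}, \tag{2.39}$$
which yields
$$e^{tL_0}(\cdot) = e^{tS} \cdot (e^{tS})^*. \tag{2.40}$$
Complete positivity of (2.37) is obvious from its definition. The Dyson expansion of $e^{tL}$, corresponding to the splitting (2.36), reads
$$e^{tL} = \int_{\Omega_t} d\sigma\, W_t(\sigma). \tag{2.41}$$
That expression induces a "path space measure", or a notion of "quantum trajectories", on $\Omega$.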
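To make the clock picture concrete, here is a minimal sampling sketch for the simplest nondegenerate situation, where the trajectory weights reduce to classical jump rates as in (2.25). Everything specific in it is an assumption made for illustration only: a two-level system with gap `delta`, the convention that $\omega > 0$ means the system absorbs energy $\omega$, and hypothetical rates $\gamma_k e^{-\beta_k \omega/2}$ for a jump with frequency $\omega$ induced by reservoir $k$ (chosen so that the ratio of the rates for $\omega$ and $-\omega$ is $e^{-\beta_k \omega}$). It is not the construction used in the degenerate case.

```python
import math
import random


def sample_path(t_max, betas, gammas, delta, rng, state=0):
    """Sample one trajectory sigma = [(omega, k, t), ...] on [0, t_max).

    state 0 is the ground level, state 1 the excited level (hypothetical model).
    """
    sigma = []
    t = 0.0
    while True:
        # From the ground level the system can only absorb (omega = +delta),
        # from the excited level it can only emit (omega = -delta).
        omega = delta if state == 0 else -delta
        rates = [g * math.exp(-b * omega / 2.0) for g, b in zip(gammas, betas)]
        total = sum(rates)
        t += rng.expovariate(total)  # waiting time until the next clock tick
        if t >= t_max:
            return sigma
        # Pick the reservoir whose clock ticked, with probability rate_k / total.
        u = rng.random() * total
        k = 0
        while k < len(rates) - 1 and u >= rates[k]:
            u -= rates[k]
            k += 1
        sigma.append((omega, k, t))
        state = 1 - state
```

A call such as `sample_path(50.0, [1.0, 2.0], [1.0, 1.0], 1.0, random.Random(0))` returns a time-ordered list of events in which absorptions and emissions alternate; averaging functions of such samples approximates expectations with respect to the pathspace measure of this toy model.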
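Lemma 2.7. Choose $\mu \in \mathcal{T}(\mathcal{H})$. Let $E \subset \Omega_t$ be measurable and define
$$P_{\mu,t}(E) := \int_E d\sigma\, \mathrm{Tr}[\mu W_t(\sigma) 1]. \tag{2.42}$$
Then $(P_{\mu,t})_{t \in \mathbb{R}^+}$ is a consistent family of probability measures on $(\Omega_t)_{t \in \mathbb{R}^+}$, i.e. for a measurable function $g$ on $\Omega_t$,
$$\int_{\Omega_t} dP_{\mu,t}(\sigma)\, g(\sigma) = \int_{\Omega_s} dP_{\mu,s}(\sigma)\, g(\sigma), \qquad s \geq t, \tag{2.43}$$
where $g$ is extended to $\Omega_s$ as in (2.35). Thus we obtain a new probability measure $P_\mu$ on $\Omega$ by the Kolmogorov extension theorem: for $t > 0$ and a function $g$ on $\Omega_t$,
$$\int_{\Omega} dP_\mu(\sigma)\, g(\sigma) = \int_{\Omega_t} dP_{\mu,t}(\sigma)\, g(\sigma), \tag{2.44}$$
where we used again the extension as in (2.35). The expectation with respect to these measures is denoted
$$\mathbb{E}_{\mu,t} \text{ on } \Omega_t, \qquad \mathbb{E}_\mu \text{ on } \Omega. \tag{2.45}$$
These probability measures are often called "quantum counting processes", see [7, 8].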
2.3. Results

We define the integrated entropy current $w^t$ up to time $t$ as a function on $\Omega$:
$$w^t(\sigma) = -\sum_{\omega,k} \beta_k\, \omega\, n^t_{\omega,k}(\sigma) \tag{2.46}$$
with $n^t_{\omega,k}$ as in (2.32). In what follows, we denote by $\rho$ the stationary state for $e^{tL}$, which is unique by Assumption A1. For $\kappa \in \mathbb{C}$, we write
$$e(\kappa) := \lim_{t\uparrow+\infty} \frac{1}{t} \log \mathbb{E}_\rho[e^{-\kappa w^t}] \tag{2.47}$$
if it exists. Then, $e(\kappa)$ of course depends on all model parameters, i.e. on $H_S$, $V_k$, $\eta_k$. We introduce $e^\theta(\kappa)$, which is derived from the model with new parameters $H_S^\theta$, $V_k^\theta$, $\eta_k^\theta = \eta_k$, see (2.12).
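Now, we can already formulate the main result of the paper:

Proposition 2.8 (Fluctuation Theorem). Assume A1. Let $w^t$ be defined by (2.46). There is an open set $\mathcal{U} \subset \mathbb{C}$ containing the real line, $\mathbb{R} \subset \mathcal{U}$, such that for all $\kappa \in \mathcal{U}$, the limit
$$e(\kappa) := \lim_{t\uparrow+\infty} \frac{1}{t} \log \mathbb{E}_\rho[e^{-\kappa w^t}] \tag{2.48}$$
exists and the function $\kappa \mapsto e(\kappa)$ is analytic on $\mathcal{U}$. Moreover,
$$e(\kappa) = e^\theta(1 - \kappa). \tag{2.49}$$
If also A3 holds, then $e(\kappa) = e^\theta(\kappa)$ and
$$e(\kappa) = e(1 - \kappa). \tag{2.50}$$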
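The symmetry (2.50) can be checked numerically in a toy example. The sketch below is an illustration under our own assumptions, not the general proof: a two-level system with gap `delta`, hypothetical jump rates $\gamma_k e^{\mp\beta_k \delta/2}$ for absorption/emission via reservoir $k$, and the convention that an absorption via reservoir $k$ changes $w^t$ by $-\beta_k\delta$. For such a classical jump process, the limit in (2.48) is the largest eigenvalue of the rate matrix whose off-diagonal entries are multiplied by $e^{-\kappa \Delta w}$; for a $2 \times 2$ matrix this eigenvalue is available in closed form.

```python
import math


def scgf(kappa, betas, gammas, delta):
    """Largest eigenvalue of the tilted 2x2 rate matrix (toy version of e(kappa))."""
    pairs = list(zip(gammas, betas))
    # Absorption via reservoir k changes w by -beta_k*delta, so exp(-kappa*w)
    # contributes a factor exp(+kappa*beta_k*delta); emission gives the inverse.
    up = sum(g * math.exp(-b * delta / 2.0) * math.exp(kappa * b * delta)
             for g, b in pairs)
    down = sum(g * math.exp(b * delta / 2.0) * math.exp(-kappa * b * delta)
               for g, b in pairs)
    # Escape rates on the diagonal are not tilted.
    out0 = sum(g * math.exp(-b * delta / 2.0) for g, b in pairs)
    out1 = sum(g * math.exp(b * delta / 2.0) for g, b in pairs)
    # Largest eigenvalue of [[-out0, down], [up, -out1]].
    trace = -(out0 + out1)
    det = out0 * out1 - up * down
    return 0.5 * (trace + math.sqrt(trace * trace - 4.0 * det))
```

Replacing $\kappa$ by $1 - \kappa$ swaps the two off-diagonal entries, i.e. transposes the matrix, which is why the eigenvalue is unchanged. With equal inverse temperatures the function vanishes identically in this toy model, in line with the triviality discussed in Remark 2.2.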
We list some consequences of the fluctuation relations (2.49) and (2.50).

Proposition 2.9 (Strict Positivity of the Entropy Production). Assume A1, then
$$\text{A2 holds} \iff \lim_{t\uparrow+\infty} \frac{1}{t}\, \mathbb{E}_\rho[w^t] > 0. \tag{2.51}$$
For the next proposition, we introduce energy functions $n^t_k$ on $\Omega$:
$$n^t_k := -\sum_{\omega \in F} \omega\, n^t_{\omega,k}. \tag{2.52}$$
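Proposition 2.10 (Green–Kubo Relations). Assume A1 and fix some $\beta > 0$. Let for $k, k' \in K$:
$$L_{k,k'}(\beta) := \frac{\partial}{\partial \beta_{k'}} \lim_{t\uparrow+\infty} \frac{1}{t}\, \mathbb{E}_\rho[n^t_k] \Big|_{\beta_1 = \cdots = \beta_{|K|} = \beta} \tag{2.53}$$
and similarly the time-reversed coefficient $L^\theta_{k,k'}$, obtained by starting with $H_S^\theta$ and $V_k^\theta$. Then,
$$L_{k,k'}(\beta) + L^\theta_{k,k'}(\beta) = \beta \lim_{t\uparrow+\infty} \frac{1}{t}\, \mathbb{E}_\rho[n^t_k n^t_{k'}]. \tag{2.54}$$
If also A3 holds, then
$$L_{k,k'} = \frac{1}{2}\, \beta \lim_{t\uparrow+\infty} \frac{1}{t}\, \mathbb{E}_\rho[n^t_k n^t_{k'}] \tag{2.55}$$
with Onsager reciprocity
$$L_{k,k'} = L_{k',k}. \tag{2.56}$$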
2.4. The quantum model: A dilation of the semigroup $e^{tL}$

2.4.1. Heuristics

In the next section, we construct a unitary evolution, which is our basic quantum model. This type of unitary evolution is generally known as a solution of a quantum stochastic differential equation, introduced in [26]. For readers who are familiar with stochastic calculus, we briefly state how our evolution would look in traditional notation. Recommended references are [6, 44] for quantum stochastic calculus and [14] for the formalism of second quantization. For all $\omega \in F$ and $k \in K$, let $(L^2(\mathbb{R}^+))_{\omega,k}$ be a copy of $L^2(\mathbb{R}^+)$. We consider the bosonic Fock space ($\Gamma_s$ denotes symmetrized second quantization)
$$\mathcal{R} = \Gamma_s\Bigl[\bigoplus_{\omega,k} (L^2(\mathbb{R}^+))_{\omega,k}\Bigr] = \bigotimes_{\omega,k} \Gamma_s[(L^2(\mathbb{R}^+))_{\omega,k}] \tag{2.57}$$
and think of $dA^*_{\omega,k,t}$ with $t \in \mathbb{R}^+$ as the creation operator on $\Gamma_s[(L^2(\mathbb{R}^+))_{\omega,k}]$ creating the "wavefunction" $\chi_{[t,t+dt]}$ (the indicator function of the interval $[t, t + dt]$). We now write a Quantum Stochastic Differential Equation (QSDE) on $\mathcal{B}(\mathcal{H} \otimes \mathcal{R})$:
$$dU_t = \sum_{\omega,k} \eta_k(\omega)^{1/2} \bigl(V_{\omega,k}\, dA^*_{\omega,k,t} - V^*_{\omega,k}\, dA_{\omega,k,t}\bigr) U_t + \Bigl(-iH_f\, dt - \frac{1}{2} \sum_{\omega,k} \eta_k(\omega) V^*_{\omega,k} V_{\omega,k}\, dt\Bigr) U_t, \qquad U_0 = 1 \otimes 1. \tag{2.58}$$
Of course, the intuitive definitions given here do not suffice to give meaning to this expression. We content ourselves with stating that (2.58) defines a unitary evolution $U_t$, which we will now rigorously construct by using Maassen's approach of integral kernels [36].

2.4.2. Construction of the unitary evolution $U_t$

Recall the Guichardet spaces $\Omega_t$ and $\Omega$, introduced in Sec. 2.2, and define for $(\sigma, \tau) \in \Omega \times \Omega$ the ordered sequence of times $(t_1, \ldots, t_n)$ as
$$\{t_1, \ldots, t_n\} = \bigcup_{\omega,k} (\sigma_{\omega,k} \cup \tau_{\omega,k}) \ \text{ and } \ 0 < t_1 < \cdots < t_n, \qquad n = |\sigma| + |\tau|. \tag{2.59}$$
We define the integral kernel $u_t : \Omega \times \Omega \to \mathcal{B}(\mathcal{H})$:
$$u_t(\sigma, \tau) = I_{\Omega_t \times \Omega_t}(\sigma, \tau)\, e^{(t-t_n)K} Z_n e^{(t_n-t_{n-1})K} Z_{n-1} \cdots Z_2 e^{(t_2-t_1)K} Z_1 e^{t_1 K} \tag{2.60}$$
with $I_{\Omega_t \times \Omega_t}$ the indicator function of $\Omega_t \times \Omega_t$, $S \in \mathcal{B}(\mathcal{H})$ as in (2.39) and, for $j = 1, \ldots, n$,
$$Z_j = \begin{cases} \eta_k(\omega)^{1/2}\, V_{\omega,k} & \text{if } t_j \in \sigma_{\omega,k}, \\ -\eta_k(\omega)^{1/2}\, V^*_{\omega,k} & \text{if } t_j \in \tau_{\omega,k}. \end{cases} \tag{2.61}$$
Finally, let
$$\mathcal{F} := L^2(\Omega, \mathcal{H}, d\sigma) \cong \mathcal{H} \otimes L^2(\Omega, d\sigma). \tag{2.62}$$
Note that, in a natural way, we have $\mathcal{F} \cong \mathcal{H} \otimes \mathcal{R}$ with $\mathcal{R}$ as defined in (2.57). Take $f \in L^2(\Omega, \mathcal{H}, d\sigma)$ and define
$$(U_t f)(\xi) = \sum_{\sigma \subset \xi} \int_{\Omega} d\tau\, u_t(\sigma, \tau)\, f((\xi \setminus \sigma) \cup \tau). \tag{2.63}$$
In [36], one proves that this $U_t$ is unitary and that it solves the QSDE (2.58). The unitary family $U_t$, thus defined, is not a group, but a so-called cocycle; physically, this corresponds to an interaction picture and it can be made into a group by multiplying it with a well-chosen "free evolution". Note that by taking each $V_{\omega,k} = 0$ or $V_k = 0$ in (2.2) the subsystem decouples from the reservoir and (2.63) reduces to
$$U_t = 1 \otimes 1. \tag{2.64}$$
This follows since the kernel $u_t(\sigma, \tau)$ in (2.60) vanishes except for $\sigma = \tau = \emptyset$ and $S$ reduces to 0. Note that in (2.58) or (2.63), the reservoirs are now not only labeled by $k \in K$, as in the original physical picture, but also by $\omega \in F$; each transition has its own mathematical reservoir. To formulate our results, we also need to specify the state. Define the one-dimensional vacuum projection $1_\emptyset \in \mathcal{B}(\mathcal{F})$,
$$(1_\emptyset f)(\sigma) = \begin{cases} f(\emptyset) & \text{when } \sigma = \emptyset, \\ 0 & \text{when } \sigma \neq \emptyset. \end{cases} \tag{2.65}$$
Our reference state is
$$\rho \otimes 1_\emptyset \quad \text{on } \mathcal{H} \otimes \mathcal{F}, \tag{2.66}$$
where $\rho$ is the unique stationary state of $e^{tL}$, see Remark 2.1. Note that the state $\rho \otimes 1_\emptyset$ is not invariant under the dynamics; only its restriction to $\mathcal{H}$ is invariant (see also (2.74)). Hence, technically, it is quite different from a NESS as in (1.2). We will abbreviate the Heisenberg dynamics as
$$j_t(G) := U_t^* G U_t, \qquad G \in \mathcal{B}(\mathcal{H} \otimes \mathcal{F}), \tag{2.67}$$
with $U_t$ as in (2.63). Let for each $k \in K$, $t \geq 0$, $N^k_t \in \mathcal{B}(\mathcal{F})$ be the energy operators
$$(N^k_t f)(\sigma) = n^t_k(\sigma) f(\sigma) \tag{2.68}$$
with $n^t_k$ as defined in (2.52). We also define a quantity which we interpret as the total energy of subsystem plus reservoirs,
$$N_t := H_S + \sum_{k \in K} N^k_t, \qquad N_t \in \mathcal{B}(\mathcal{H} \otimes \mathcal{F}). \tag{2.69}$$
This interpretation is backed by Proposition 2.13. These "energies" should be understood as renormalized quantities, from which the (infinite) equilibrium energy of the reservoirs was subtracted. This interpretation
is confirmed by remarking that at time $s = 0$, these "energies" equal 0: for all continuous functions $g$,
$$\mathrm{Tr}[1_\emptyset\, g(j_{s=0}(N^k_t))] = \mathrm{Tr}[1_\emptyset\, g(N^k_t)] = g(0) \quad \text{for all } k \in K,\ t \geq 0. \tag{2.70}$$
2.4.3. Connection of the QSDE with the counting process

The connection of the QSDE with the "quantum trajectories" is provided by the following lemma, which we will not prove. It can be found, for example, in [7, 8] and it is easy to derive starting from (2.63) and remarking that
$$W_t(\sigma)(\cdot) = u_t^*(\sigma, \emptyset) \cdot u_t(\sigma, \emptyset). \tag{2.71}$$
Lemma 2.11. Let $E \subset \Omega$ be measurable (as for (2.30)). Denote by $1_E$ the orthogonal projection
$$1_E : L^2(\Omega) \to L^2(E) \tag{2.72}$$
and recall $1_\emptyset$ from (2.65). Then, for all $A \in \mathcal{B}(\mathcal{H})$,
$$\mathrm{Tr}_{\mathcal{F}}[1_\emptyset\, j_t(A \otimes 1_E)] = \int_E d\sigma\, W_t(\sigma) A, \tag{2.73}$$
where $\mathrm{Tr}_{\mathcal{F}}$ denotes the partial trace over $\mathcal{F}$.

The formula (2.63) actually defines a dilation of the semigroup $e^{tL}$. To see this, take $E = \Omega$; then (2.73) reads
$$\mathrm{Tr}_{\mathcal{F}}[1_\emptyset\, j_t(A \otimes 1)] = \int_{\Omega} d\sigma\, W_t(\sigma) A = e^{tL} A. \tag{2.74}$$
Another useful consequence of Lemma 2.11 is the connection between the energy operators in (2.68) and the functions (2.52).

Proposition 2.12. Let $k_1, \ldots, k_\ell$, $t_1, \ldots, t_\ell$ and $g_1, \ldots, g_\ell$ be finite ($\ell < \infty$) sequences of, respectively, elements of $K$, $\mathbb{R}^+$ and continuous functions, and let $\mu \in \mathcal{T}(\mathcal{H})$; then
$$\mathrm{Tr}\Bigl[(\mu \otimes 1_\emptyset) \prod_{i=1}^{\ell} g_i\bigl(j_{t_i}(N^{k_i}_{t_i})\bigr)\Bigr] = \mathbb{E}_\mu\Bigl[\prod_{i=1}^{\ell} g_i\bigl(n^{t_i}_{k_i}\bigr)\Bigr]. \tag{2.75}$$
Again, we do not give a complete proof and we refer to [7, 8]. Proposition 2.12 follows from Lemma 2.11 by using that for all $t \geq s$ and $k \in K$,
$$j_t(N^k_s) = j_s(N^k_s) \tag{2.76}$$
and that the family $\{N^k_t \mid t > 0,\ k \in K\}$ is commutative.
2.5. Results within the quantum picture

First, we show that the energy (see (2.69)) is conserved.

Proposition 2.13. Let $N_t$ be as in (2.69). For all continuous functions $g$:
$$\mathrm{Tr}[(\rho \otimes 1_\emptyset)\, g(j_t(N_t))] = \mathrm{Tr}[(\rho \otimes 1_\emptyset)\, g(N_t)] = \mathrm{Tr}[\rho\, g(H_S)]. \tag{2.77}$$
The change of entropy in the environment up to time $t$ is
$$W_t := \sum_{k \in K} \beta_k\, j_t(N^k_t) \tag{2.78}$$
and its "steady state expectation" is the entropy production. Our main result is a fluctuation theorem for $W_t$.

Proposition 2.14. Assume A1. Let $W_t$ be defined as in (2.78). There is an open set $\mathcal{U} \subset \mathbb{C}$ containing the real line, $\mathbb{R} \subset \mathcal{U}$, such that for all $\kappa \in \mathcal{U}$, the limit
$$\hat{e}(\kappa) := \lim_{t\uparrow+\infty} \frac{1}{t} \log \mathrm{Tr}[(\rho \otimes 1_\emptyset)\, e^{-\kappa W_t}] \tag{2.79}$$
exists and the function $\kappa \mapsto \hat{e}(\kappa)$ is analytic on $\mathcal{U}$. Let $e(\kappa)$ be defined as in (2.48). Then,
$$\hat{e}(\kappa) = e(\kappa) \tag{2.80}$$
on $\mathcal{U}$ and thus all statements in Proposition 2.8 carry over to $\hat{e}(\kappa)$.

From $\rho \otimes 1_\emptyset$, we deduce probability measures $T_t$ on $\mathbb{R}$. Let $A \subset \mathbb{R}$ be measurable, then
$$T_t(A) = \mathrm{Tr}[(\rho \otimes 1_\emptyset)\, 1_A(W_t)], \tag{2.81}$$
where $1_A(W_t)$ is the spectral projection on $A$ associated to $W_t$. Via Legendre transformation, (2.50) implies
$$-\lim_{t\uparrow+\infty} \frac{1}{t} \log \frac{dT_t(-a)}{dT_t(a)} = a \tag{2.82}$$
which is (1.1). In the same way as in Proposition 2.14, Propositions 2.10 and 2.9 carry over to the quantum picture; for concreteness, we give the analogue of Proposition 2.10.
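The Legendre-transform step behind (2.82) can be sketched heuristically as follows. The rate function $I$ and the reading of $a$ as a value of $W_t/t$ are our own assumptions for this illustration, and the large-deviation step itself is not justified here. Suppose $T_t$ satisfies a large deviation principle of the form $dT_t(ta) \approx e^{-tI(a)}\,da$, so that $\hat{e}(\kappa) = \sup_a[-\kappa a - I(a)]$ and, for convex $I$, $I(a) = \sup_{\kappa \in \mathbb{R}}[-\kappa a - \hat{e}(\kappa)]$. Substituting $\kappa \to 1 - \kappa$ and using $\hat{e}(\kappa) = \hat{e}(1-\kappa)$ gives

```latex
I(-a) = \sup_{\kappa}\bigl[\kappa a - \hat{e}(\kappa)\bigr]
      = \sup_{\kappa}\bigl[(1-\kappa)a - \hat{e}(1-\kappa)\bigr]
      = a + \sup_{\kappa}\bigl[-\kappa a - \hat{e}(\kappa)\bigr]
      = a + I(a),
\qquad\text{hence}\qquad
-\frac{1}{t}\log\frac{dT_t(-ta)}{dT_t(ta)} \;\longrightarrow\; I(-a) - I(a) = a .
```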
Proposition 2.15. Assume A1 and fix some β > 0. Let for k, k ∈ K:
∂ 1 t
˜ Lk,k (β) := lim Tr (ρ ⊗ 1∅ )Nk )
∂βk t↑+∞ t β1 =···=β|K| =β
(2.83)
˜ θ , obtained by starting with H θ and similarly the time-reversed coefficient L S k,k θ and Vk . ˜ k,k (β) + L ˜ θ (β) = β lim 1 Tr (ρ ⊗ 1∅ )Nt Nt . L k,k k k t↑+∞ t
(2.84)
If also A3 holds, then ˜ k,k = 1 β lim 1 Tr (ρ ⊗ 1∅ )Nt Nt L k k 2 t↑+∞ t
(2.85)
with Onsager reciprocity ˜ k,k = L ˜ k ,k . L
(2.86)
Recall Lk,k from Proposition 2.10. Then, for all k ∈ K, ˜ k,k = Lk,k . L
(2.87)
Note that Propositions 2.14 and 2.15 follow immediately from Propositions 2.8 and 2.10 by application of Proposition 2.12.

3. Discussion

3.1. Entropy production for Markov processes

It is well known that the weak-coupling generator is "classical" in the sense that the commutant algebra $\mathcal{A}_{cl} := \{A \in \mathcal{B}(\mathcal{H}) \mid [A, H_S] = 0\}$ is invariant. In case the Hamiltonian $H_S$ is nondegenerate and only then, $\mathcal{A}_{cl}$ is a commutative algebra. Then, we can construct a Markov process with state space $\Lambda$ which is the restriction of (the dual of) the semigroup $e^{tL}$ to $\mathcal{A}_{cl} \cong C(\Lambda)$. Loosely speaking, let $\rho$ be the stationary state, $\Omega_t := \Lambda^{[0,t]}$ the pathspace up to time $t$, and $P^t_\rho$ the pathspace measure (starting from $\rho$) of this Markov process. The time reversal operation $\Theta$ acts on $\Omega_t$ as $(\Theta\xi)(u) = \xi(t - u)$ for $\xi \in \Omega_t$ and $0 \leq u \leq t$. For such Markov processes describing a nonequilibrium dynamics, we have at our disposal a general strategy for identifying the entropy production. It turns out in a lot of interesting cases [39, 41, 37] that
$$\log \frac{dP^t_\rho}{d\Theta P^t_\rho}(\xi) = S_t(\xi) + O(1), \tag{3.1}$$
where $S_t(\xi)$ is the random variable that one physically identifies as the entropy production. The second term on the right-hand side is non-extensive in time. The algorithm allows one to derive (1.1) from (3.1).
Since we also have a Markov generator, we can apply the same scheme to our setup.^c To evaluate the result, however, we need a physical notion of entropy production in our model. As mentioned earlier, such a notion is rather unambiguous here, see also [31, 34]:
$$\text{current into the } k\text{th reservoir} = \mathrm{Tr}[\rho L_k H_S]. \tag{3.2}$$
But the mean entropy production based on these currents is not equal to the expectation value of (3.1):
$$\sum_{k \in K} \beta_k\, \mathrm{Tr}[\rho L_k H_S] \neq \lim_{t\uparrow\infty} \frac{1}{t}\, \mathbb{E}_{P^t_\rho}\Bigl[\ln \frac{dP^t_\rho}{d\Theta P^t_\rho}\Bigr]. \tag{3.3}$$
For example, take two reservoirs ($k = L$ (left), $R$ (right)) and let $\mathrm{Refl} : \mathcal{B}(\mathcal{H}) \to \mathcal{B}(\mathcal{H})$ stand for the involution which models left-right reflection. Assume that $H_S$ is nondegenerate and that for all $x \in \mathbb{R}$,
$$\mathrm{Refl}\, H_S = H_S, \qquad \mathrm{Refl}\, V_L = V_R, \qquad e^{\beta_L x/2} \eta_L(x) = e^{\beta_R x/2} \eta_R(x). \tag{3.4}$$
Hence all parameters are left-right symmetric, except the inverse temperatures $\beta_L, \beta_R$. (Actually, these assumptions are inconsistent; if $H_S$ is left-right symmetric, then it must be degenerate. However, one can introduce an arbitrarily small symmetry breaking which will generically lift the degeneracy, such that our reasoning still applies.) One checks that
$$\log \frac{dP^t_\rho}{d\Theta P^t_\rho}(\xi) = \log \frac{\sum_{e_0 \in A_0 \subset \mathrm{sp}\,H_S} \rho[1_{e_0}(H_S)]}{\sum_{e_m \in A_m \subset \mathrm{sp}\,H_S} \rho[1_{e_m}(H_S)]} + \sum_{i=1}^{m} \log \frac{e^{-\beta_L \omega_i/2} + e^{-\beta_R \omega_i/2}}{e^{\beta_L \omega_i/2} + e^{\beta_R \omega_i/2}} = O(1) - \frac{\beta_L + \beta_R}{2} \sum_{i=1}^{m} \omega_i, \tag{3.5}$$
where the sets $A_0$, $A_m$ and the sequence $(\omega_i)_{i=1}^{m}$ of energy jumps are derived from $\xi$, and moreover $|\sum_{i=1}^{m} \omega_i| \leq \|H_S\|$. This means that in this particular left-right symmetric case, (3.5) is bounded, independently of $t$, for every $\xi$, and hence the right-hand side of (3.3) vanishes, which disqualifies it as "entropy production".

This trivial remark shows that it is not enough to look at the semigroup $e^{tL}$ to identify the entropy production. Instead, we use more input; we certainly use the fact that $L = \sum_{k \in K} L_k$, where the index $k$ runs over the different reservoirs, but moreover, with the unraveling of the generator, Sec. 2.2, comes an intuitive interpretation of the various terms. That can be contrasted with results by V. Jakšić and C. A. Pillet, where one actually proves that quantities like $\mathrm{Tr}[\rho L_k H_S]$, cf. (3.2), are limits of currents in the original microscopic Hamiltonian model. Of course, we take care that our choices are consistent with that result. However, for the higher-order fluctuations, we do not know; we just make a choice which looks very natural.

^c Very recently, a paper [18] appeared where exactly this is done: one derives a fluctuation theorem for $\log \frac{dP^t_\rho}{d\Theta P^t_\rho}$ as in (3.1). Since the authors consider mainly examples involving one reservoir, they do not run into the difficulty described here.

At present, we do not give arguments that for a class of reasonable functions $g$
$$\mathbb{E}_\rho[g(w^t)] \tag{3.6}$$
is indeed the limit of some fluctuation of dissipated heat in the microscopic model. (Although [15] points in that direction, see also point 2 in Sec. 3.3.) Another choice for the higher-order fluctuations is discussed in Sec. 3.2. It is exactly here that the role of the dilation with quantum stochastic evolutions lies. If one takes that quantum model as a starting point, then one can derive that (3.6) is a fluctuation of the dissipated heat. To our knowledge, that is the only quantum model in which one can study the fluctuations of the dissipated heat. On the other hand, one can also make a classical dilation of the semigroup and in fact, this is exactly what we do in Sec. 2.2. Yet, there is a technical difference between the cases of degenerate and nondegenerate system Hamiltonians $H_S$. If $1_e(H_S)$ is one-dimensional for $e \in \mathrm{sp}\,H_S$, and in addition, for a non-zero $\omega \in F$, $e$ is the unique element of $\mathrm{sp}\,H_S$ such that $e - \omega \in \mathrm{sp}\,H_S$, then we have the following form of Markovianness: If a $\sigma \in \Omega$ contains $\omega$, i.e.
$$\sigma = \sigma_0 \tau \sigma_1, \qquad \sigma_0 \in \Omega_{t_0},\ \sigma_1 \in \Omega_{t_1},\ \tau = (t_0, \omega, k) \text{ for some } k \in K,\ t_0, t_1 \geq 0, \tag{3.7}$$
then
$$dP_\rho(\sigma) = \mathrm{Tr}[\rho W_{t_0+t_1}(\sigma) 1]\, d\sigma = \mathrm{Tr}[\rho W_{t_0}(\sigma_0 \tau) 1]\, d\sigma_0\, d\tau \times \mathrm{Tr}[1_e(H_S) W_{t_1}(\sigma_1) 1]\, d\sigma_1 = dP_\rho(\sigma_0 \tau)\, dP_{1_e(H_S)}(\sigma_1). \tag{3.8}$$
In words, a one-dimensional spectral subspace erases memory. That does not work in the degenerate case.

3.2. Integrated currents within the semigroup approach

Starting from (3.2), one could define the integrated currents $\hat{J}_{k,t} \in \mathcal{B}(\mathcal{H})$ as
$$\hat{J}_{k,t} = \int_0^t du\, e^{uL}(L_k(H_S)) \tag{3.9}$$
and study their fluctuations. One can ask whether these fluctuations coincide with those in our model. The answer is partially positive because
Proposition 3.1. Take for all $k \in K$: $\beta_k = \beta$ for a certain $\beta$ and let $\rho_\beta$ be the stationary state for $e^{tL}$ as in Remark 2.3. For all $k, k' \in K$ and all $u \geq 0$,
$$\frac{\partial^2}{\partial v_1 \partial v_2}\, \mathbb{E}_{\rho_\beta}[n^{v_1}_k n^{v_2}_{k'}] \Big|_{v_1=0,\, v_2=u} = -\mathrm{Tr}[\rho_\beta\, L_k(H_S)\, e^{uL}(L_{k'}(H_S))] \tag{3.10}$$
which gives a relation between Proposition 2.10 and the Green–Kubo relation in [35]. Also the averages coincide, leading to (3.2). However, it is not true that for a reasonable class of functions $g$,
$$\mathbb{E}_\rho[g(n^t_k)] = \mathrm{Tr}[\rho\, g(\hat{J}_{k,t})]. \tag{3.11}$$
So the mean entropy production and the Green–Kubo formula can correctly be expressed in terms of the operators $\hat{J}_{k,t}$, but higher-order fluctuations of the dissipated heat cannot.

3.3. Connection to microscopic dynamics

We know of three derivations in the literature of the stochastic evolution (2.63) or (2.58) from a microscopic setup:

1. Stochastic Limit. Accardi et al. prove in [2] that the weak-coupling limit can be extended to the total evolution of subsystem observables. Let $U^\lambda_t$ be the evolution (in the interaction picture) on the total system with $\lambda$ the coupling between subsystem and reservoirs. Then, in a certain sense,
$$U^\lambda_{-t/\lambda^2} (S \otimes 1)\, U^\lambda_{t/\lambda^2} \xrightarrow[\lambda \downarrow 0]{} U_t^* (S \otimes 1)\, U_t, \tag{3.12}$$
whereas the traditional weak-coupling limit only speaks about convergence in expectation of the left-hand side. The unitary $U_t$ is the solution of (2.58).

2. Stochastic Limit Revisited. In [15], the approach of [2] (mentioned above) was simplified. By introducing a unitary map $J_\lambda$ acting on the reservoirs, we get for all continuous functions $g$
$$\operatorname*{s-lim}_{\lambda \downarrow 0} J_\lambda^* U^\lambda_{\lambda^{-2} t} J_\lambda = U_t, \qquad \operatorname*{s-lim}_{\lambda \downarrow 0} J_\lambda^*\, g(H_{R_k})\, J_\lambda = g(N_k), \tag{3.13}$$
where s-lim denotes strong operator convergence and $H_{R_k}$ is the generator of the dynamics in the uncoupled $k$th reservoir. This suggests that one can study the fluctuations of the reservoir energies by looking at the number operators $N_k$ in the model reservoirs, exactly as we do in the present paper.

3. Repeated Interactions. In [3], Attal and Pautrat describe a subsystem with Hilbert space $\mathcal{H}$ interacting repeatedly for a time $h$ with a small reservoir with Hilbert space $\mathcal{R}$. After each time $h$, $\mathcal{R}$ is replaced by an identical copy. This procedure ensures that at any time, the subsystem sees a "fresh" reservoir. In the limit $h \to 0$ the dynamics (in the interaction picture) converges in a certain sense to the solution of a QSDE. One
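can choose a particular QSDE by tuning the parameters of the interaction. Assume that
$$\mathcal{R} = \bigotimes_{\omega,k} \mathcal{R}_{\omega,k}. \tag{3.14}$$
Each $\mathcal{R}_{\omega,k}$ is 2-dimensional with base vectors $(\theta, \omega)$. Define $a_{\omega,k}$ on $\mathcal{R}_{\omega,k}$ by
$$a_{\omega,k}(\omega) = \theta, \qquad a_{\omega,k}\theta = 0. \tag{3.15}$$
Choose the dynamics on $\mathcal{H} \otimes \mathcal{R}$ as $e^{-itH(h)}$ for $0 \leq t \leq h$ with
$$H(h) = H_f + \frac{1}{\sqrt{h}} \sum_{\omega,k} \bigl(V_{\omega,k}\, a^*_{\omega,k} + V^*_{\omega,k}\, a_{\omega,k}\bigr). \tag{3.16}$$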
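Then, through the limiting procedure of [3], Eq. (2.58) is obtained.

4. Proofs

4.1. Proof of Lemma 2.7

From
$$\int_{\Omega_t} d\sigma\, W_t(\sigma) 1 = e^{tL} 1 = 1, \qquad \forall \sigma \in \Omega : W_t(\sigma) 1 \geq 0 \tag{4.1}$$
for all $t \geq 0$, it follows that $(P_{\mu,t})_{t \in \mathbb{R}^+}$ is indeed a family of probability measures for all $\mu \in \mathcal{T}(\mathcal{H})$. Further, for $s, u \geq 0$, we have
$$W_s(\sigma) W_u(\tau) = W_{s+u}(\sigma\tau), \qquad \sigma \in \Omega_s,\ \tau \in \Omega_u. \tag{4.2}$$
Together with (4.1), this yields consistency of the family $(P_{\mu,t})_{t \in \mathbb{R}^+}$.

4.2. Proof of Proposition 2.8

Define for $\kappa \in \mathbb{C}^{|K|}$ and $t > 0$,
$$M_{t,\kappa} : \Omega \to \mathbb{C}, \qquad M_{t,\kappa}(\sigma) = \sum_{k \in K} \kappa_k \beta_k\, n^t_k(\sigma). \tag{4.3}$$
Our results rely on the following lemma:

Lemma 4.1. Assume A1 and let $\mu \in \mathcal{T}(\mathcal{H})$. There is an open set $\mathcal{U} \subset \mathbb{C}^{|K|}$, with $\mathbb{R}^{|K|} \subset \mathcal{U}$, such that
$$e(\kappa) := \lim_{t\uparrow+\infty} \frac{1}{t} \log \mathbb{E}_\mu[e^{M_{t,\kappa}}] \tag{4.4}$$
is an analytic function on $\mathcal{U}$ which does not depend on $\mu$. Moreover, for any sequence $k_1, \ldots, k_\ell \in K$,
$$\lim_{t\uparrow+\infty} \frac{\partial}{\partial \kappa_1} \cdots \frac{\partial}{\partial \kappa_\ell}\, \frac{1}{t} \log \mathbb{E}_\mu[e^{M_{t,\kappa}}] = \frac{\partial}{\partial \kappa_1} \cdots \frac{\partial}{\partial \kappa_\ell}\, e(\kappa) \tag{4.5}$$
uniformly on compacts.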
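Proof. We apply the generalized Perron–Frobenius Theorem A.1 of the Appendix with
$$\Lambda = \int_{\Omega} d\sigma\, W_{t'}(\sigma)\, e^{M_{t',r}(\sigma)} \tag{4.6}$$
for well chosen $t'$ and $r \in \mathbb{R}^{|K|}$. Since for $r \in \mathbb{R}^{|K|}$, $M_{t',r}$ is a real function, the map $\Lambda$ is completely positive as a linear combination of completely positive maps with positive coefficients. Below we choose $t'$ so as to satisfy the nondegeneracy requirement (A.2) of the Appendix. By faithfulness of the stationary state $\rho$,
$$\epsilon := \inf_{P \neq 0} \mathrm{Tr}[\rho P] > 0, \tag{4.7}$$
where the infimum is over non-zero projections $P$. Since the semigroup is ergodic, it follows that there is $t'$ such that for all $t > t'$,
$$\sup_{\mu \in \mathcal{T}(\mathcal{H})} \|\rho - e^{tL^*}\mu\| \leq \frac{\epsilon}{3(\dim \mathcal{H})^2}. \tag{4.8}$$
Since $\|L\|, \|L_0\| < +\infty$, with $\|\cdot\|$ being the operator norm in $\mathcal{B}(\mathcal{B}(\mathcal{H}))$, the Dyson expansion (2.41) is absolutely convergent. Hence, we can find $n \in \mathbb{N}$ such that
$$\Bigl\| \int_{|\sigma| \leq n} d\sigma\, W_{t'}(\sigma) - e^{t'L} \Bigr\| \leq \frac{\epsilon}{3 \dim \mathcal{H}}. \tag{4.9}$$
Let $m := \inf_{|\sigma| \leq n} M_{t',r}(\sigma)$. For each $r \in \mathbb{R}^{|K|}$, decompose
$$\int_{\Omega} d\sigma\, W_{t'}(\sigma)\, e^{M_{t',r}(\sigma)} = e^m \int_{|\sigma| \leq n} d\sigma\, W_{t'}(\sigma) + \int_{|\sigma| \leq n} d\sigma\, W_{t'}(\sigma)\bigl(e^{M_{t',r}(\sigma)} - e^m\bigr) + \int_{|\sigma| > n} d\sigma\, W_{t'}(\sigma)\, e^{M_{t',r}(\sigma)} \tag{4.10}$$
and for each pair of non-zero projections $P \neq 0$, $P' \neq 0$ in $\mathcal{B}(\mathcal{H})$, we have
$$\mathrm{Tr}\Bigl[P \int_{\Omega} d\sigma\, W_{t'}(\sigma)\, e^{M_{t',r}(\sigma)} P'\Bigr] \geq e^m \int_{|\sigma| \leq n} d\sigma\, \mathrm{Tr}[P W_{t'}(\sigma) P'] \geq e^m \Bigl(\mathrm{Tr}[P e^{t'L} P'] - \frac{\epsilon}{3}\Bigr) \geq e^m \Bigl(\mathrm{Tr}[P'\rho] - \frac{\epsilon}{3} - \frac{\epsilon}{3}\Bigr) \geq e^m\, \frac{\epsilon}{3}. \tag{4.11}$$
This shows that one can apply Theorem A.1 with $\Lambda$ as in (4.6). Call the dominant eigenvalue of $\Lambda$ $\lambda(r, t')$ and the corresponding strictly positive eigenvector $v(r)$. Note that for each $\kappa \in \mathbb{C}^{|K|}$ and $t \in \mathbb{R}^+$,
$$\int_{\Omega} d\sigma\, W_t(\sigma)\, e^{M_{t,\kappa}(\sigma)} = e^{tL_\kappa}, \tag{4.12}$$
where
$$L_\kappa(\cdot) = L_0(\cdot) + \sum_{\omega,k} \eta_k(\omega)\, e^{-\kappa_k \beta_k \omega}\, V^*_{\omega,k} \cdot V_{\omega,k}. \tag{4.13}$$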
This follows by comparing the Dyson expansions (in the same sense as for (2.41)) corresponding to the left-hand and the right-hand side of (4.12). As a consequence, for all $r \in \mathbb{R}^{|K|}$, $L_r$ has a nondegenerate maximal eigenvalue $\lambda(r) = \frac{1}{t'} \ln \lambda(r, t')$ corresponding to the eigenvector $v(r)$. Since $v(r)$ is strictly positive, we have $\mathrm{Tr}[v(r)] > 0$, and, for any $\mu \in \mathcal{T}(\mathcal{H})$, $\mathrm{Tr}[v(r)\mu] > 0$. This implies
$$\lim_{t\uparrow+\infty} \frac{1}{t} \log \mathrm{Tr}[\mu\, e^{tL_r} 1] = \lambda(r) \tag{4.14}$$
and hence, again by (4.12),
$$e(r) = \lambda(r). \tag{4.15}$$
Since for all κ ∈ C|K| , Lκ depends analytically on κ, perturbation theory for isolated eigenvalues gives us for all r ∈ R|K| an open set Ur r such that for all κ ∈ Ur : 1. There is a unique λ(κ) ∈ spLκ such that
inf{λ(κ) − |p| p ∈ sp Lκ \λ(κ)} > 0.
(4.16)
2. The eigenvector v(κ), corresponding to λ(κ) satisfies inf ( Tr[µv(κ)]) > 0.
µ∈T (H)
It follows that (4.14) holds for all κ ∈ e(κ) = lim
t→∞
∪
r∈R|K|
(4.17)
Ur ,
1 log Tr[µetLκ 1] = λ(κ). t
(4.18)
Summarizing, we have for all $r \in \mathbb{R}^{|K|}$ and $\mu \in \mathcal{T}(\mathcal{H})$ a family of analytic functions
$$F(t, \kappa) := \frac{1}{t} \log \mathrm{Tr}[\mu\, e^{tL_\kappa} 1] \tag{4.19}$$
converging pointwise in $\mathcal{U}_r$ to the function $e(\kappa)$ as $t \uparrow +\infty$. We recall Montel's Theorem, see, e.g., [11, p. 153]:

Theorem 4.2. Let $G \subset \mathbb{C}$ be open and let $(f_n)_{n \in \mathbb{N}}$ be a sequence of analytic functions $G \to \mathbb{C}$. Then $(f_n)_{n \in \mathbb{N}}$ contains a subsequence that converges uniformly on compacts iff the set $(f_n)_{n \in \mathbb{N}}$ is locally bounded, i.e. for each $z \in G$ there are $r > 0$ and $M > 0$ such that
$$|z' - z| \leq r \ \Rightarrow\ \forall n \in \mathbb{N} : |f_n(z')| \leq M. \tag{4.20}$$
For all $r \in \mathbb{R}^{|K|}$, the family $(F(t, \kappa))_{t \geq t_0}$ is locally bounded on $\mathcal{U}_r$ for large enough $t_0 \geq 0$. This follows from analyticity of $L_\kappa$ and from the condition (4.17). Consequently, one can apply Theorem 4.2 for each component of $\kappa$ separately. A standard result, e.g., [11, Theorem 2.1, p. 151], states that the uniform limit of a sequence of analytic functions is analytic and that all derivatives converge. Since this generalizes to the multi-dimensional variable $\kappa$, e.g., by Hartogs' Theorem, Lemma 4.1 is proven.
Referring again to the representation (2.28), we introduce for $\sigma \in \Omega$ the factor
$$\eta(\sigma) := \prod_{i=1}^{|\sigma|} \eta_{k_i}(\omega_i). \tag{4.21}$$
Recall the definition of S in (2.40), introduce the time-reversed maps Lθ0 and, for t ≥ 0, Wtθ (i.e. these maps are derived from HSθ and Vkθ ) and remark, (see also (2.40)) ∗
etL0 (·) = etS · etS ,
θ
etL0 (·) = etT S
∗
T
· etT ST .
(4.22)
Define the operation θt on Ωt as θt (ω1 , k1 , t1 ; · · · ; ωn , kn , tn ) := (−ωn , kn , t − tn ; · · · ; −ω1 , k1 , t − t1 ).
(4.23)
Calculate η −1 (σ) Tr[Wt (σ)1] ∗
= Tr[· · · Vω∗i ,ki e(ti+1 −ti )S Vω∗i+1 ,ki+1 · · · Vωi+1 ,ki+1 e(ti+1 −ti )S Vωi ,ki · · ·] ∗
= Tr[· · · T Vωi+1 ,ki+1 T T e(ti+1 −ti )S T T Vωi,ki T · · · T Vω∗i ,ki T T e(ti+1−ti )S T T Vω∗i+1,ki+1 T · · ·] ∗ T e(ti+1 −ti )T S = Tr[· · · T V−ω i+1 ,ki+1
∗
T
∗ T V−ω T i ,ki
· · · T V−ωi ,ki T e(ti+1 −ti )T ST T V−ωi+1 ,ki+1 T · · ·] = η −1 (θt σ) Tr[Wtθ (θt σ)1] = η −1 (σ)ewt (σ) Tr[Wtθ (θt σ)1].
(4.24)
In the last equality the KMS-condition (2.3) was used. The previous equalities ∗ = V−ω,k and (4.22). Using (4.24), follow from cyclicity of the trace, T T = 1, Vω,k we calculate by change of integration variables (putting I := dim1 H ∈ T (H), Mt,κ EI [e ]= dσ Tr[IWt (σ)1]eMt,κ (σ) Ω
=
Ω
dσ Tr[IWtθ (σ)1]e−w
t
(σ) −Mt,κ (σ)
e
.
(4.25)
Since in the limit $t \uparrow \infty$ one can replace the initial state $I$ by $\rho$, as in (4.18), the formula (4.25) yields, for all $\kappa \in \mathcal{U}$ as in Lemma 4.1,
$$e(\kappa) = e^\theta(1 - \kappa) \quad \text{with } 1 - \kappa := (1 - \kappa_1, \ldots, 1 - \kappa_{|K|}). \tag{4.26}$$
Finally, Proposition 2.8 follows from (4.26) by putting, for some $\kappa \in \mathbb{C}$,
$$\kappa_i := \kappa, \qquad i = 1, \ldots, |K|, \tag{4.27}$$
thus obtaining $M_{t,\kappa} = \kappa w^t$.
4.3. Proof of Proposition 2.9

The nonnegativity of the entropy production follows from Proposition 2.8 by Jensen's inequality. To get the strict positivity from A2, we first need to introduce more notation. Let ρ be the unique stationary state of $e^{tL}$. We decompose the states ρ and TρT in one-dimensional unnormalized states as
$$\rho' = \sum_{i\in D} \rho'_i, \qquad \rho'_i \rho'_j = \delta_{i,j}\, \rho'_i \rho'_i, \qquad \rho'_i > 0, \qquad i, j \in D, \tag{4.28}$$
where ρ′ can stand for ρ or TρT and D := {1, . . . , dim H}. The decomposition (4.28) differs from the spectral decomposition when ρ′ is degenerate. Remark that there is an arbitrariness in labeling the unnormalized states, as well as a possible arbitrariness stemming from degeneracies in ρ′. We partially fix this arbitrariness by asking that
$$T\,(T\rho' T)_j\,T = \rho'_j. \tag{4.29}$$
This is always possible because the set $T(T\rho' T)_j T$, j ∈ D, satisfies all the requirements of (4.28) as a decomposition of ρ′. Let $\tilde\Omega_t = \Omega_t \times D \times D$ for a t ≥ 0 and define the measure $\tilde P_t$ by (letting g be a measurable function):
$$\int_{\tilde\Omega_t} d\tilde P_t(\tilde\sigma)\, g(\tilde\sigma) = \sum_{i,j} \int_{\Omega_t} d\sigma\, \mathrm{Tr}\!\left[\rho_i\, W_t(\sigma)\, \frac{(T\rho T)_j}{\|(T\rho T)_j\|}\right] g(\sigma, i, j), \qquad \tilde\sigma = (\sigma, i, j) \in \tilde\Omega_t, \tag{4.30}$$
where it is understood that σ ∈ Ω_t and i, j ∈ D. In the rest of this section, we will use this notation without further comments. Positivity of $\tilde P_t$ is obvious and normalization follows by
$$\sum_{i,j} \int_{\Omega_t} d\sigma\, \mathrm{Tr}\!\left[\rho_i\, W_t(\sigma)\, \frac{(T\rho T)_j}{\|(T\rho T)_j\|}\right] = \sum_{i} \int_{\Omega_t} d\sigma\, \mathrm{Tr}[\rho_i\, W_t(\sigma) 1] = \int dP_{\rho,t}(\sigma) = 1. \tag{4.31}$$
We call $\tilde P^\theta_t$ the measure, constructed as above, but with $W^\theta_t$ replacing $W_t$. Remark that this is not the measure one would obtain by starting from $H_S^\theta, V_k^\theta$ instead of $H_S, V_k$, because then one would also replace ρ in (4.30) by $\rho^\theta$, the stationary state of $L^\theta$.

Define again the operation $\theta_t$ on $\tilde\Omega_t$ as
$$\theta_t(\sigma, i, j) = (\theta_t \sigma, j, i), \tag{4.32}$$
where the action of $\theta_t$ on $\Omega_t$ was defined in (4.23).
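Consider the function
$$S^t : \tilde\Omega_t \to \mathbb{R}, \qquad S^t(\tilde\sigma) = -\log \frac{d\tilde P^\theta(\theta\tilde\sigma)}{d\tilde P(\tilde\sigma)}. \tag{4.33}$$
We upgrade the function $w^t$ on $\Omega_t$ to a function on $\tilde\Omega_t$ as
$$w^t(\tilde\sigma) = w^t(\sigma), \qquad \tilde\sigma = (\sigma, i, j). \tag{4.34}$$
Our strategy will be to prove (Sec. 4.3.1) that for some u > 0,
$$\int_{\tilde\Omega_u} d\tilde P_u(\tilde\sigma)\, S^u(\tilde\sigma) > 0 \tag{4.35}$$
and then (Sec. 4.3.2) that for all t ≥ 0,
$$\int_{\tilde\Omega_t} d\tilde P_t(\tilde\sigma)\, \big(S^t(\tilde\sigma) - w^t(\tilde\sigma)\big) \le 0, \tag{4.36}$$
which will lead to the conclusion that for a certain u ∈ R+,
$$\int_{\Omega} dP_\rho(\sigma)\, w^u(\sigma) = \int_{\Omega} d\tilde P_u(\tilde\sigma)\, w^u(\tilde\sigma) > 0, \tag{4.37}$$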
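where the first equality is checked by arguing as in (4.31). The converse statement is proven in Sec. 4.3.3.

4.3.1. Positivity of $S^t$

Looking back at the calculation (4.24), one immediately checks that for t ≥ 0 and σ ∈ Ω,
$$\mathrm{Tr}\!\left[\frac{\rho_i}{\|\rho_i\|}\, W_t(\sigma)\, \frac{(T\rho T)_j}{\|(T\rho T)_j\|}\right] = e^{w^t(\sigma)}\, \mathrm{Tr}\!\left[\frac{(T\rho T)_i}{\|(T\rho T)_i\|}\, W^\theta_t(\theta\sigma)\, \frac{\rho_j}{\|\rho_j\|}\right] \tag{4.38}$$
and hence
$$S^t(\tilde\sigma) = w^t(\tilde\sigma) - \log(\|\rho_j\|) + \log(\|\rho_i\|), \qquad \tilde\sigma = (\sigma, i, j). \tag{4.39}$$
Note, using (4.33) and $S^t(\tilde\sigma) = -S^t(\theta_t\tilde\sigma)$, that $S^t$ satisfies an exact fluctuation symmetry, for t ≥ 0 and κ ∈ C:
$$\int_{\tilde\Omega_t} d\tilde P_t(\tilde\sigma)\, e^{-\kappa S^t(\tilde\sigma)} = \int_{\tilde\Omega_t} d\tilde P^\theta_t(\tilde\sigma)\, e^{-(1-\kappa) S^t(\tilde\sigma)}. \tag{4.40}$$
Remark that $f : \mathbb{R} \to \mathbb{R} : x \mapsto e^{-x} + x - 1$ is positive for all x, increasing for x ≥ 0 and decreasing for x ≤ 0. A Chebyshev inequality with δ > 0 yields
$$\int_{\tilde\Omega_t} d\tilde P_t(\tilde\sigma)\, S^t(\tilde\sigma) = \int_{\tilde\Omega_t} d\tilde P_t(\tilde\sigma)\, (e^{-S^t} + S^t - 1)(\tilde\sigma) \ge (e^{-\delta} + \delta - 1)\, \tilde P_t(|S^t| \ge \delta). \tag{4.41}$$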
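The Chebyshev step in (4.41) can be illustrated on a toy example. The following is only a minimal sketch under stated assumptions: the probability vectors `p` and `q` are hypothetical stand-ins for the measure and its reversed counterpart, and `S` is the corresponding minus log-ratio, so that the average of $e^{-S}$ equals 1 and the average of S coincides with the average of f(S). Nothing here is taken from the model of the paper.

```python
import math


def f(x):
    """f(x) = exp(-x) + x - 1, the auxiliary function of the Chebyshev step."""
    return math.exp(-x) + x - 1.0


# Hypothetical toy data (an assumption of this sketch): p plays the role of the
# measure, q the role of the reversed measure on the same three-point space.
p = [0.5, 0.3, 0.2]
q = [0.2, 0.3, 0.5]

# Toy analogue of S: minus the logarithm of the density ratio q/p.
S = [-math.log(qi / pi) for pi, qi in zip(p, q)]


def expectation(values):
    """Average of `values` with respect to p."""
    return sum(pi * v for pi, v in zip(p, values))


# Because sum(q) = 1, the average of exp(-S) is 1, hence E[S] = E[f(S)].
mean_exp_minus_S = expectation([math.exp(-s) for s in S])
mean_S = expectation(S)
mean_f = expectation([f(s) for s in S])


def chebyshev_bound(delta):
    """Lower bound f(delta) * P(|S| >= delta) on the average of S."""
    tail = sum(pi for pi, s in zip(p, S) if abs(s) >= delta)
    return f(delta) * tail
```

The bound uses only that f is nonnegative and that f(s) ≥ f(δ) whenever |s| ≥ δ; the latter holds because f(−δ) ≥ f(δ) for δ ≥ 0.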
Rephrasing (2.15) and (2.16), there is, for u > 0, a set E ⊂ Ω_u and a one-dimensional projection P ∈ B(H) such that
$$\int_E d\sigma\, \mathrm{Tr}[P\, W_u(\sigma)\, P] > 0, \qquad w^u(E) =: w \neq 0. \tag{4.42}$$
For any k ∈ N, we construct
$$\Omega_{ku} \supset E^k := \{\sigma_1 \sigma_2 \cdots \sigma_k \mid \sigma_1, \dots, \sigma_k \in E\}, \tag{4.43}$$
where the notation σ1σ2, and consequently also σ1σ2 · · · σk, was defined in (2.34). We have
$$w^{ku}(E^k) = kw, \qquad \int_{E^k} d\sigma\, \mathrm{Tr}[P\, W_{ku}(\sigma)\, P] > 0. \tag{4.44}$$
Since ρ is faithful, there are i, j ∈ {1, . . . , dim H} such that
$$\int_{E^k} d\sigma\, \mathrm{Tr}[\rho_i\, W_{ku}(\sigma)\, \rho_j] > 0. \tag{4.45}$$
Since the function S u − wu is bounded uniformly in u ∈ R+ (this follows, e.g., from (4.39)), one can choose k ∈ N and i, j ∈ {1, . . . , dim H} such that Tr[ρi Wkt (σ)ρj ] = kw + logρi − logρj > 0,
σ ∈ Ek.
(4.46)
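This proves that the last expression in (4.41) is not zero (after replacing t by ku). Hence, (4.35) is proven.

4.3.2. Difference between $S^t$ and $w^t$

Calculate for t ≥ 0
$$\int_{\tilde\Omega_t} d\tilde P_t(\tilde\sigma)\, \log\|\rho_j\| = \sum_{i,j\in D} \int_{\Omega_t} d\sigma\, \mathrm{Tr}\!\left[\rho_i\, W_t(\sigma)\, \frac{(T\rho T)_j}{\|\rho_j\|}\right] \log\|\rho_j\| = \mathrm{Tr}[\rho\, e^{tL} \log T\rho T] = \mathrm{Tr}[\rho \log T\rho T] \tag{4.47}$$
and
$$\int_{\tilde\Omega_t} d\tilde P(\tilde\sigma)\, \log\|\rho_i\| = \sum_{i,j\in D} \int_{\Omega_t} d\sigma\, \mathrm{Tr}\!\left[\rho_i\, W_t(\sigma)\, \frac{T\rho_j T}{\|\rho_j\|}\right] \log\|\rho_i\| = \sum_{i\in D} \mathrm{Tr}[\rho_i\, e^{tL} 1]\, \log\|\rho_i\| = \mathrm{Tr}[\rho \log \rho], \tag{4.48}$$
where we used $\rho\, e^{tL} = \rho$ and $e^{tL} 1 = 1$. Hence, one gets
$$\int_{\tilde\Omega_t} d\tilde P_t(\tilde\sigma)\, \big(\log(\|(T\rho T)_i\|) - \log(\|\rho_j\|)\big) = \mathrm{Tr}[\rho(\log \rho - \log T\rho T)] \le 0, \tag{4.49}$$
where the last inequality follows from the nonnegativity of the relative entropy.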
4.3.3. Strict positivity implies Assumption A2

We prove that A2 is a necessary condition for a non-zero entropy production. First, remark that
$$\int_{\Omega_t} dP_\rho(\sigma)\, w^t(\sigma) \tag{4.50}$$
is extensive in t > 0. This follows from translation invariance (in t) of $w^t$ and stationarity of $P_\rho$. Hence, we can fix t > 0 such that
$$\int_{\Omega_t} dP_\rho(\sigma)\, w^t(\sigma) > 2 \dim H\, \max_{\omega,k} |\beta_k \omega|. \tag{4.51}$$
Take σ ∈ Ω_t satisfying $W_t(\sigma) \neq 0$. It follows that one can split $t = \sum_{i=1}^{3} t_i$ and
$$\sigma = \tau_3 \tau_2 \tau_1, \qquad \tau_i \in \Omega_{t_i}, \quad i = 1, 2, 3 \tag{4.52}$$
(again the notation (2.34) was used) such that:
1. There is a one-dimensional projection P such that $\mathrm{Tr}[P\, W_{t_2}(\tau_2)\, P] > 0$.
2.
$$|\tau_1| \le \dim H, \tag{4.53}$$
$$|\tau_3| \le \dim H. \tag{4.54}$$
Assume that A2 does not hold. It follows that $w^{t_2}(\tau_2) = 0$. Hence, by (4.54),
$$|w^t(\sigma)| = |w^{t_1}(\tau_1) + w^{t_3}(\tau_3)| \le 2 \dim H\, \max_{\omega,k} |\beta_k \omega| \tag{4.55}$$
which is in obvious contradiction with (4.51).
4.4. Proof of Proposition 2.10

This proof is by now quite standard; it can be found, e.g., in [34]. We recall from (4.5) in Lemma 4.1 that we can interchange the limit t ↑ ∞ and differentiation of κ → e(κ). By differentiating relation (4.26) with respect to κ_k and to β_k in κ = 0 and β_{k∈K} = β, and interchanging limits and derivatives, we arrive at the modified Green–Kubo relation:
$$L_{k,l} + L^\theta_{k,l} = \beta \lim_{t\uparrow+\infty} \frac{1}{t}\, E_\rho[n^t_k\, n^t_l], \tag{4.56}$$
from which the other statements in Proposition 2.10 easily follow.
4.5. Proof of Proposition 2.13 Choose v ∈ H and φ ∈ L2 (Ωt ) such that ∀k ∈ K :
HS v = mS v, By the definition of Ut ,
Ut (v ⊗ φ)(ξ) =
Ntk φ = mk φ,
mS , mk∈K ∈ R.
dτ ut (σ, τ )vφ((ξ\σ) ∪ τ ),
(4.57)
ξ ∈ Ωt .
(4.58)
Using [S, HS ] = 0, one checks that ut (σ, τ )v either vanishes or
ω(|σω,k | − |τω,k |) ut (σ, τ )v − v . HS (ut (σ, τ )v − v) =
(4.59)
Ω
σ⊂ξ
ω,k
By (4.57), it follows that φ((ξ\σ) ∪ τ ) = 0 in (4.58), unless for all k ∈ K, ω(|ξω,k | − |σω,k | + |τω,k |) = mk .
(4.60)
ω
Together with (4.59), this implies t
t
N Ut (v ⊗ φ) = N (v ⊗ φ) =
mS +
mk (v ⊗ φ).
(4.61)
k∈K
Since the operators HS , Ntk∈K mutually commute, vectors like v ⊗ φ as in (4.57) furnish a complete set of eigenvectors. This proves the proposition. 4.6. Proof of Proposition 3.1 By expanding the left-hand side of (3.10) in a Dyson expansion, as in (2.41), one can evaluate the derivatives, leading to
∂2 Eρ [nvk1 nvk2 ] v1 =0,v2 =u = ω1 ω2 Tr[ρJω1 ,k1 euL Jω2 ,k2 (1)]. (4.62) ∂v1 ∂v2 ω ,ω 1
2
Putting ρ = ρβ , yields ρβ Vω,k = Vω,k ρβ e−βk ω . Now, (3.10) follows after some reshuffling, using ∗ = V−ω,k , ηk (ω) = e−βk ω ηk (−ω), Vω,k ∗ ωηk (ω)Vω,k Vω,k = Lk (HS ).
(4.63)
ω
Appendix

Let A be the matrix algebra $M_n(\mathbb{C})$ for some n ∈ N, and denote by $A^+$ its positive cone, i.e.
$$A^+ = \{x^* x \mid x \in A\}. \tag{A.1}$$
An element $x \in A^+$ is called strictly positive (notation: x > 0) if it is invertible.
Theorem A.1. Let Λ : A → A be a completely positive linear map, satisfying
$$\mathrm{Tr}[x \Lambda y] > 0, \qquad x, y \in A^+, \quad x \neq 0, \quad y \neq 0. \tag{A.2}$$
Then, Λ has a positive eigenvalue λ, such that if µ is another eigenvalue, then |µ| < λ. The eigenvector v ∈ A corresponding to λ can be chosen strictly positive. The eigenvalue λ is simple, i.e. as a root of the characteristic equation of Λ it has multiplicity 1.

The theorem was proven almost in the above form in [17] (see Theorem 4.2 therein). We state (a simplified version of) that theorem and we show that the above statement follows from it. We call a positive map φ on A irreducible if
$$\forall\, x \neq 0,\ y \neq 0 \in A^+, \quad \exists\, k \in \mathbb{N} : \ \mathrm{Tr}[x \phi^k y] > 0. \tag{A.3}$$

Theorem A.2. Let φ be a positive map such that
1. φ preserves the unit 1 ∈ A: φ(1) = 1,
2. φ satisfies the two-positivity inequality:
$$\phi(x^* x) \ge \phi(x)^* \phi(x) \quad \text{for all } x \in A, \tag{A.4}$$
3. For all k = 1, 2, . . . , $\phi^k$ is irreducible.
Then, φ has a positive, simple eigenvalue λ, such that if µ is another eigenvalue, then |µ| < λ. The eigenvector v ∈ A corresponding to λ can be chosen strictly positive.

Another theorem in [17] is (Theorem 2.4, combined with the sentences following it):

Theorem A.3. Let φ be an irreducible positive linear map on A and let r be the spectral radius
$$r := \sup\{|c| \mid c \in \mathrm{sp}\, \phi\}, \tag{A.5}$$
then there is a unique eigenvector $v \in A^+$ with eigenvalue r.

To prove Theorem A.1, we remark that Λ has the same spectral properties as a well-chosen map φ that satisfies the conditions of Theorem A.2: Since Λ is irreducible, one can apply Theorem A.3 to find an eigenvector v. Because of (A.2), we conclude that v > 0. Let now the map φ be defined as
$$\phi(x) = \frac{1}{r}\, v^{-1/2}\, \Lambda(v^{1/2} x v^{1/2})\, v^{-1/2}, \qquad x \in A. \tag{A.6}$$
It is clear that
1. φ is completely positive and φ still satisfies (A.2),
2. φ(1) = 1,
3. sp φ = (1/r) sp Λ and also the multiplicities of the eigenvalues are equal.
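The construction (A.6) can be checked numerically on a small example. The sketch below is illustrative only and rests on assumptions that are not in the text: the algebra is $M_2(\mathbb{C})$, the map is the hypothetical choice Λ(x) = K x K* + c Tr(x) 1 (completely positive, and satisfying (A.2) because of the trace term), and the dominant eigenvector is found by plain power iteration. All names (`K`, `C`, `Lambda`, `phi`) are introduced here for illustration.

```python
import math

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical data for the sketch: one Kraus-type operator K and a weight C.
K = [[1.0, 0.5j], [0.0, 0.3]]
C = 0.2


def matmul(a, b):
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def Lambda(x):
    """Lambda(x) = K x K* + C Tr(x) 1 on 2x2 matrices."""
    kxk = matmul(matmul(K, x), dagger(K))
    t = trace(x)
    return [[kxk[i][j] + (C * t if i == j else 0.0) for j in range(2)]
            for i in range(2)]


def dominant_eigenpair(steps=200):
    """Power iteration, normalised to unit trace; returns (r, v)."""
    v = IDENTITY
    r = 0.0
    for _ in range(steps):
        w = Lambda(v)
        t = trace(w)
        r = (t / trace(v)).real
        v = [[w[i][j] / t for j in range(2)] for i in range(2)]
    return r, v


def det(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]


def sqrt_pd(a):
    """Square root of a positive definite 2x2 matrix via Cayley-Hamilton."""
    s = math.sqrt(det(a).real)
    t = math.sqrt(trace(a).real + 2.0 * s)
    return [[(a[i][j] + (s if i == j else 0.0)) / t for j in range(2)]
            for i in range(2)]


def inverse(a):
    d = det(a)
    return [[a[1][1] / d, -a[0][1] / d], [-a[1][0] / d, a[0][0] / d]]


r, v = dominant_eigenpair()
v_half = sqrt_pd(v)
v_minus_half = inverse(v_half)


def phi(x):
    """phi(x) = (1/r) v^(-1/2) Lambda(v^(1/2) x v^(1/2)) v^(-1/2), as in (A.6)."""
    inner = Lambda(matmul(matmul(v_half, x), v_half))
    outer = matmul(matmul(v_minus_half, inner), v_minus_half)
    return [[outer[i][j] / r for j in range(2)] for i in range(2)]


phi_of_one = phi(IDENTITY)
```

In this example `v` should come out positive definite and `phi_of_one` should be the identity up to rounding, which is property 2 in the list above.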
Hence, φ satisfies the conditions of Theorem A.2, since unity-preserving completely positive maps satisfy the two-positivity inequality (A.4). Theorem A.1 follows.

Acknowledgments

We thank Luc Bouten, Hans Maassen, André Verbeure, Frank Redig and Karel Netočný for stimulating discussions. We are grateful to an unknown referee for suggesting numerous improvements.

References

[1] R. Alicki and M. Fannes, Dilations of quantum dynamical semigroups with classical Brownian motion, Commun. Math. Phys. 108 (1987) 353–361.
[2] L. Accardi, A. Frigerio and Y. G. Lu, Weak coupling limit as a quantum functional central limit theorem, Commun. Math. Phys. 131 (1990) 537–570.
[3] S. Attal and Y. Pautrat, From repeated to continuous quantum interactions, to appear in Ann. Henri Poincaré (2003).
[4] W. Aschbacher and H. Spohn, A remark on the strict positivity of the entropy production, Lett. Math. Phys. 75 (2006) 17–23.
[5] W. K. Abou-Salem and J. Fröhlich, Adiabatic theorems and reversible isothermal processes, Lett. Math. Phys. 72(2) (2005) 153–163.
[6] S. Attal, Quantum Open Systems. Vol II: The Markovian approach, in Lecture notes Grenoble Summer School on Open Quantum Systems, eds. S. Attal, A. Joye and C.-A. Pillet, Lecture Notes in Mathematics (Springer, 2003).
[7] L. Bouten, M. Guta and H. Maassen, Stochastic Schrödinger equations, J. Phys. A 37 (2004) 3189–3209.
[8] L. Bouten, H. Maassen and B. Kümmerer, Constructing the Davies process of resonance fluorescence with quantum stochastic calculus, Opt. Spectrosc. 94 (2003) 911–919.
[9] W. Bryc, A remark on the connection between the large deviation principle and the central limit theorem, Stat. Prob. Lett. 18 (1993) 253–256.
[10] H. J. Carmichael, An Open Systems Approach to Quantum Optics (Springer, Berlin, 1993).
[11] J. B. Conway, Functions of One Complex Variable: I (Springer, New York, 1978).
[12] I. Callens, W. De Roeck, T. Jacobs, C. Maes and K. Netočný, Quantum entropy production as a measure for irreversibility, Physica D 187 (2004) 383–391.
[13] E. B. Davies, Markovian master equations, Commun. Math. Phys. 39 (1974) 91–110.
[14] J. Dereziński, Introduction to representations of canonical commutation and anticommutation relations, in Large Coulomb Systems – QED, eds. J. Dereziński and H. Siedentop (Springer, 2003).
[15] J. Dereziński and W. De Roeck, Stochastic limit for Pauli–Fierz operators, in preparation (2006).
[16] D. J. Evans, E. G. D. Cohen and G. P. Morriss, Probability of second law violations in steady flows, Phys. Rev. Lett. 71 (1993) 2401–2404.
[17] D. E. Evans and R. Høegh-Krohn, Spectral properties of positive maps on C*-algebras, J. London Math. Soc. 17(2) (1978) 345–355.
[18] M. Esposito and S. Mukamel, Fluctuation theorems for quantum master equations, e-print: cond-mat/0602679.
[19] J.-P. Eckmann, C.-A. Pillet and L. Rey-Bellet, Nonequilibrium statistical mechanics of anharmonic chains coupled to two heat baths at different temperatures, Commun. Math. Phys. 201 (1999) 657–697.
[20] A. Frigerio, Stationary states of quantum dynamical semigroups, Commun. Math. Phys. 63(3) (1978) 269–276.
[21] G. Gallavotti and E. G. D. Cohen, Dynamical ensembles in nonequilibrium statistical mechanics, Phys. Rev. Lett. 74 (1995) 2694–2697.
[22] G. Gallavotti and E. G. D. Cohen, Dynamical ensembles in stationary states, J. Stat. Phys. 80 (1995) 931–970.
[23] G. Gallavotti and E. G. D. Cohen, Note on two theorems in nonequilibrium statistical mechanics, J. Stat. Phys. 96 (1999) 1343–1349.
[24] A. Guichardet, Symmetric Hilbert Spaces and Related Topics, Vol. 231 (Springer, Berlin, 1972).
[25] L. Van Hove, Quantum-mechanical perturbations giving rise to a statistical transport equation, Physica 21 (1955) 517–540.
[26] R. L. Hudson and K. R. Parthasarathy, Quantum Ito's formula and stochastic evolutions, Commun. Math. Phys. 93(3) (1984) 301–323.
[27] V. Jakšić, Y. Ogata and C.-A. Pillet, The Green–Kubo formula and the Onsager reciprocity relations in quantum statistical mechanics, to appear in Commun. Math. Phys. (2005).
[28] V. Jakšić and C.-A. Pillet, On a model for quantum friction. II: Fermi's golden rule and dynamics at positive temperature, Commun. Math. Phys. 176 (1996) 619–644.
[29] V. Jakšić and C.-A. Pillet, On a model for quantum friction. III: Ergodic properties of the spin-boson system, Commun. Math. Phys. 178 (1996) 627–651.
[30] V. Jakšić and C.-A. Pillet, Mathematical theory of non-equilibrium quantum statistical mechanics, J. Stat. Phys. 108 (2002) 787–829.
[31] V. Jakšić and C.-A. Pillet, Non-equilibrium steady states of finite quantum systems coupled to thermal reservoirs, Commun. Math. Phys. 226 (2002) 131–162.
[32] J. Kurchan, Fluctuation theorem for stochastic dynamics, J. Phys. A 31(16) (1998) 3719–3729.
[33] G. Lindblad, Completely positive maps and entropy inequalities, Commun. Math. Phys. 40 (1975) 147–151.
[34] J. Lebowitz and H. Spohn, Irreversible thermodynamics for quantum systems weakly coupled to thermal reservoirs, Adv. Chem. Phys. 39 (1978) 109–142.
[35] J. Lebowitz and H. Spohn, A Gallavotti–Cohen type symmetry in the large deviations functional of stochastic dynamics, J. Stat. Phys. 95 (1999) 333–365.
[36] H. Maassen, Quantum Markov processes on Fock spaces described by integral kernels, in Quantum Probability and Applications II, eds. L. Accardi and W. von Waldenfels, Lecture Notes in Mathematics, Vol. 1136 (Springer, Berlin, 1984), pp. 361–374.
[37] C. Maes, The fluctuation theorem as a Gibbs property, J. Stat. Phys. 95 (1999) 367–392.
[38] C. Maes, On the origin and the use of fluctuation relations for the entropy, in Poincaré Seminar, eds. J. Dalibard, B. Duplantier and V. Rivasseau (Birkhäuser, Basel, 2003), pp. 145–191.
[39] C. Maes, Fluctuation relations and positivity of the entropy production in irreversible dynamical systems, Nonlinearity 17 (2004) 1305–1316.
[40] C. Maes and K. Netočný, Time-reversal and Entropy, J. Stat. Phys. 111 (2003) 1219–1244.
[41] C. Maes, K. Netočný and M. Verschuere, Heat conduction networks, J. Stat. Phys. 111 (2003) 1219–1244.
[42] T. Matsui and S. Tasaki, Fluctuation theorem, nonequilibrium steady states and MacLennan–Zubarev ensembles of L¹-asymptotic abelian C*-dynamical systems, Quantum Prob. White Noise Anal. 17 (2003) 100–119.
[43] T. Monnai and S. Tasaki, Quantum Correction of Fluctuation Theorem, e-print: cond-mat/0308337.
[44] K. R. Parthasarathy, An Introduction to Quantum Stochastic Calculus (Birkhäuser, Basel, 1992).
[45] L. Rey-Bellet and L. E. Thomas, Fluctuations of the entropy production in anharmonic chains, Ann. Henri Poincaré 3 (2002) 483–502.
[46] W. De Roeck and C. Maes, A quantum version of free energy–irreversible work relations, Phys. Rev. E 69(2) (2004) 026115.
[47] D. Ruelle, Smooth dynamics and new theoretical ideas in nonequilibrium statistical mechanics, J. Stat. Phys. 95 (1999) 393–468.
[48] D. Ruelle, Natural nonequilibrium states in quantum statistical mechanics, J. Stat. Phys. 98 (2000) 57–75.
[49] M. D. Srinivas and E. B. Davies, Photon counting probabilities in quantum optics, Opt. Acta 28 (1981) 981–996.
[50] S. Sarman, D. J. Evans and P. T. Cummings, Recent developments in non-Newtonian molecular dynamics, Phys. Reports 305(1–2) (1998) 1–92.
September 12, 2006 14:40 WSPC/148-RMP
J070-00276
Reviews in Mathematical Physics Vol. 18, No. 6 (2006) 655–711 © World Scientific Publishing Company
THE GENERAL STRUCTURE OF G-GRADED CONTRACTIONS OF LIE ALGEBRAS, II: THE CONTRACTED LIE ALGEBRA
EVELYN WEIMAR-WOODS
Fachbereich für Mathematik und Informatik, Freie Universität Berlin, Arnimallee 2–6, D–14195 Berlin, Germany
[email protected]

Received 27 March 2006

We continue our study of G-graded contractions γ of Lie algebras where G is an arbitrary finite Abelian group. We compare them with contractions, especially with respect to their usefulness in physics. (Note that the unfortunate terminology “graded contraction” is confusing since they are, by definition, not contractions.) We give a complete characterization of continuous G-graded contractions and note that they are equivalent to a proper subset of contractions. We study how the structure of the contracted Lie algebra Lγ depends on γ, and show that, for discrete graded contractions, applications in physics seem unlikely. Finally, with respect to applications to representations and invariants of Lie algebras, a comparison of graded contractions with contractions reveals the insurmountable defects of the graded contraction approach. In summary, our detailed analysis shows that graded contractions are clearly not useful in physics.

Keywords: Graded Lie algebra; graded contractions.

Mathematics Subject Classification 2000: 17B05, 17B70, 17B81
1. Introduction

Let G be a finite Abelian group. A G-graded Lie algebra L = (V, µ) has the structure V = ⊕j∈G Vj where µ(Vj, Vk) ⊂ Vj+k. The notion of a graded contraction L → Lγ of a graded Lie algebra L was introduced in 1991 [1, 2]. It transforms a G-graded Lie algebra L = (V, µ) into a G-graded Lie algebra Lγ = (V, µγ) in a purely algebraic way by defining, with the obvious meaning, µγ(Vj, Vk) = γjk µ(Vj, Vk) where γ is a matrix that is symmetric (so that µγ is antisymmetric) and satisfies non-linear “defining equations” (cf. Eq. (2.1)) which enforce the Jacobi identity for µγ. By a graded contraction, we mean the matrix γ, whose definition depends only on the grading group G and not on L. The process L → Lγ is called the graded contraction of the Lie algebra L by γ. The notion of a contraction of Lie algebras was introduced 40 years earlier where, motivated by physics, it is defined by a limiting process [3, 4]. The reader should note that graded contractions should never have been called that since this terminology
violates normal grammatical and mathematical usage. Namely, a graded contraction is not a contraction which is graded (since it is defined algebraically and not by a limiting process). Indeed, a graded contraction is, in general, not even equivalent to a contraction (cf. Sec. 6). Needless to say, this unfortunate terminology has led to some confusion.

In Part I, we studied the general structure of complex (resp. real) G-graded contractions γ. We found a complete set of invariants (support, higher-order identities, and, in the real case, sign invariants) which allowed us to give a complete classification of G-graded contractions.

In this paper, we continue our investigation by studying the effect of γ on Lγ. We find subalgebras and ideals for Lγ and we recognize substructures of L which survive for Lγ. We check if Lγ is semisimple, solvable, or nilpotent. By generalizing our earlier result for non-negative Z_N-graded contractions, we give a complete characterization of continuous G-graded contractions. We note that the continuous graded contractions are equivalent to a proper subset of contractions (so that they are nothing new). Now, any contraction can be realized by a generalized İnönü–Wigner contraction, which is given by a diagonal matrix T(ε)_{ij} = δ_{ij} ε^{n_j}; n_j ∈ R. So contractions are at least as easy to deal with as the continuous graded contractions. For discrete graded contractions, we give a detailed study of Lγ, which shows that applications in physics seem unlikely. In any case, if two Lie algebras are related by a discrete G-graded contraction, the question “So what?” has not yet been satisfactorily answered.

We carefully compare graded contractions with contractions with respect to applications to representations and invariants of Lie algebras. Here the insurmountable defects of the graded contraction approach are clearly revealed.
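To make the generalized İnönü–Wigner contraction with diagonal T(ε) concrete, here is a minimal sketch for a standard textbook case that is not taken from this paper: the contraction of so(3) to the Euclidean algebra e(2) with exponents n = (1, 1, 0). In the rescaled basis e′_j = ε^{n_j} e_j the structure constants pick up the factor ε^{n_i + n_j − n_k}, and the limit ε → 0 keeps exactly those with vanishing exponent. The function names are hypothetical.

```python
def so3_structure_constants():
    """c[i][j][k] with [e_i, e_j] = sum_k c[i][j][k] e_k for so(3)."""
    c = [[[0] * 3 for _ in range(3)] for _ in range(3)]
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        c[i][j][k] = 1
        c[j][i][k] = -1
    return c


def rescaled(c, n, eps):
    """Structure constants in the basis e'_j = eps**n_j * e_j (eps != 0)."""
    d = len(c)
    return [[[c[i][j][k] * eps ** (n[i] + n[j] - n[k]) for k in range(d)]
             for j in range(d)] for i in range(d)]


def contraction_limit(c, n):
    """Limit eps -> 0 of the rescaled constants; fails if the limit diverges."""
    d = len(c)
    out = [[[0] * d for _ in range(d)] for _ in range(d)]
    for i in range(d):
        for j in range(d):
            for k in range(d):
                if c[i][j][k] == 0:
                    continue
                exponent = n[i] + n[j] - n[k]
                if exponent < 0:
                    raise ValueError("limit does not exist for these exponents")
                if exponent == 0:
                    out[i][j][k] = c[i][j][k]
    return out


def jacobi_holds(c):
    """Check [e_i,[e_j,e_k]] + cyclic = 0 on all basis triples."""
    d = len(c)
    for i in range(d):
        for j in range(d):
            for k in range(d):
                for l in range(d):
                    total = sum(c[j][k][m] * c[i][m][l]
                                + c[k][i][m] * c[j][m][l]
                                + c[i][j][m] * c[k][m][l] for m in range(d))
                    if total != 0:
                        return False
    return True


so3 = so3_structure_constants()
e2 = contraction_limit(so3, (1, 1, 0))
```

In the result the two rescaled generators commute (the translations of e(2)), while their brackets with the third generator (the rotation) are unchanged.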
Namely, while contractions can successfully treat a wide variety of interesting representations, we prove in Theorem 7.5 that the graded contraction method can never relate two physically interesting (i.e. faithful self-adjoint) representations. As for invariants, contractions can be easily applied to not only all polynomial invariants, but also rational and even some formal ones. However, graded contractions can only deal in a limited way, and with great difficulty, with polynomial invariants. The problem with the rational and formal invariants is that graded contractions cannot deal at all with situations where a grading label cannot be assigned to the objects under consideration. This is also the case for exponentials of generators, and hence they cannot treat BCH formulas. Nor can they handle special functions (where both the lack of a limiting procedure and the necessity of assigning grading labels are the problem). It is well established that contractions can successfully deal with BCH formulas and special functions [5]. Summary. For G-graded contractions γ, the interplay between the Jacobi identity and the grading group leads to an interesting mathematical structure, as our classification in Part I illustrates. The original motivation for introducing graded contractions was claimed to be their usefulness in physics. However, our detailed analysis shows that in fact they cannot be usefully applied there. This is in
complete contrast to the situation for contractions. Under these circumstances, how could it happen that such a vast literature, with constant claims of its superiority and importance for mathematical physics, came to exist? One reason is that this literature tends to consist of endless tables, and the authors never ask any detailed questions about their content or significance. In particular, they never compare their results with the contraction method (in one paper, they even made the totally false claim that contractions cannot deal with representations) (cf. Sec. 7). In fact, the method was fatally flawed from the beginning because of the two following reasons. First, if one wants to relate two Lie algebras in physics, it is invariably because they arise from a situation where one theory is the limit of another. But, unlike the contraction procedure, the graded contraction method completely ignores the limiting process (which, for example, makes their treatment of polynomial invariants so difficult). Secondly, while many Lie algebras in physics are naturally graded, the grading is not a natural part of all aspects of the problem. As a result, the graded contraction method cannot treat (physically interesting) representations, rational or formal invariants, BCH formulas, or special functions.

We now outline the contents of the paper in detail. In Sec. 2, we study the distribution of zeroes of γ, since they determine much of the structure of Lγ. We concentrate on the elements γ0j and γj,−j (j ∈ G) since only they enter the Killing form of Lγ (cf. Eq. (3.9)). We discuss especially the implications of γ00 ≠ 0 (Lemma 2.4 shows which of the elements γ0j and γj,−j can also be different from zero) and γ00 = 0 (Lemma 2.10 gives the minimal number of additional zeroes).

In Sec. 3, we present our general structural results for Lγ. The Killing form tells us immediately that Lγ cannot be semisimple if γ has zeroes and that Lγ is nilpotent whenever γ00 = 0 (cf. Lemma 3.1). In the case γ00 ≠ 0, Lγ splits into the direct sum of two smaller Lie algebras Lγ^(1) and Lγ^(2) where Lγ^(2) is nilpotent. Lγ^(1) is the semidirect sum of a subalgebra and a nilpotent ideal (cf. Lemma 3.4). If γ00 = 0, the class of nilpotency of Lγ is at most N = |G| (cf. Lemma 3.6).

In Sec. 4, we show that a G-graded contraction γ is continuous if and only if it has no violations (cf. Theorem 4.2). The proof yields a test (cf. Corollary 4.3) if a given support defines a continuous projection and, if not, tells us how to find its (weak) violation of a higher-order identity (cf. Remark 4.4 and Examples 4.5).

In Sec. 5, we study discrete G-graded contractions γ according to their violations (cf. Definition 5.1, Remark 5.2). If γ violates “γ00 = γ0k” (cf. part A) or some higher-order identity weakly or strongly (cf. part B) the link between L and Lγ becomes too loose to suggest any useful applications. The same is true if a real γ has a negative sign invariant of the first kind (cf. part C). If only sign invariants of the second kind are negative for a real γ, then it contains db (cf. part C) which can at most turn L into one other real form.

In Sec. 6, we first summarize the relevant results for contractions (cf. Theorem 6.4). Then we show that continuous graded contractions are equivalent
to a proper subset of contractions (cf. Theorem 6.9). In contrast, discrete graded contractions are in general not equivalent to any contraction.

In Sec. 7, we compare the applicability of contractions and graded contractions to representations and invariants. An Appendix illustrates the result in Sec. 7 for three typical examples from physics.

In Part III, we will deal with our Conjecture I.2.15 that

γ ∼ γ′ ⇔ Lγ ≅ Lγ′ for all L

and consequences thereof. This will complete our study.

1.1. Notation and results from Part I

For a given group G, N = |G| denotes the order of G, Nj the order of j ∈ G.

1.1.1. Special γ's (cf. Sec. I.2)

(i) The identity 1 with 1jk = 1.
(ii) The coboundary da with
$$(da)_{jk} = \frac{a_j a_k}{a_{j+k}}; \qquad 0 \neq a_j \in \mathbb{C} \ (\text{resp. } \mathbb{R})$$
which corresponds to the change of basis Vj → aj Vj so that Lda ≅ L.
(iii) The real Z_N-graded contraction db where
$$b_j = e^{i \frac{j\pi}{N}}; \qquad j \in \mathbb{Z}_N.$$
(iv) Projections π with π · π = π, i.e. πjk ∈ {0, 1}.
(v) The projection π(γ) where
$$(\pi(\gamma))_{jk} = \begin{cases} 1 & \text{if } \gamma_{jk} \neq 0, \\ 0 & \text{if } \gamma_{jk} = 0. \end{cases}$$

1.1.2. Some definitions

The product of two γ's is defined elementwise by (cf. Definition I.2.5(ii))
$$(\gamma \cdot \gamma')_{jk} = \gamma_{jk}\, \gamma'_{jk}.$$
The equivalence γ ∼ γ′ means γ′ = da · γ (i.e. they differ only by a change of basis compatible with the grading; cf. Definition I.2.14).

1.1.3. Elements of γ

We consider the elements γjk and γkj to be identical, especially for counting arguments. Two elements are called incompatible if their product does not occur in any non-trivial defining equation (cf. Definition I.2.11). For arbitrary values of pairwise incompatible elements, a γ always exists (cf. Remark I.2.13).
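The special γ's listed above can be tried out directly for G = Z_N. The sketch below is illustrative only: it represents Z_N by the integers 0, …, N−1 (an assumption needed to make b_j = e^{ijπ/N} well defined), takes the defining equations in the form of Eq. (2.1) of Sec. 2, and uses hypothetical helper names. The closed form used in `db` (+1 if j + k < N, −1 otherwise) is a consequence of that choice of representatives, derived here and not quoted from the text.

```python
import cmath
import math
from fractions import Fraction
from itertools import product


def coboundary(a):
    """(da)_{jk} = a_j a_k / a_{j+k} on Z_N with N = len(a), all a_j nonzero."""
    n = len(a)
    return [[a[j] * a[k] / a[(j + k) % n] for k in range(n)] for j in range(n)]


def satisfies_defining_equations(gamma, tol=0):
    """Check gamma_jk gamma_{l,j+k} = gamma_jl gamma_{k,j+l} = gamma_kl gamma_{j,k+l}."""
    n = len(gamma)
    for j, k, l in product(range(n), repeat=3):
        p1 = gamma[j][k] * gamma[l][(j + k) % n]
        p2 = gamma[j][l] * gamma[k][(j + l) % n]
        p3 = gamma[k][l] * gamma[j][(k + l) % n]
        if abs(p1 - p2) > tol or abs(p2 - p3) > tol:
            return False
    return True


def elementwise_product(g1, g2):
    """(g1 . g2)_{jk} = (g1)_{jk} (g2)_{jk}."""
    n = len(g1)
    return [[g1[j][k] * g2[j][k] for k in range(n)] for j in range(n)]


def projection(gamma):
    """pi(gamma): 1 where gamma_jk is nonzero, 0 where it vanishes."""
    return [[1 if x != 0 else 0 for x in row] for row in gamma]


def db(n):
    """db for b_j = exp(i j pi / n) with representatives j = 0, ..., n-1."""
    return [[1 if j + k < n else -1 for k in range(n)] for j in range(n)]


# Example with exact rational arithmetic on Z_4 (values chosen arbitrarily).
a = [Fraction(2), Fraction(3), Fraction(-5), Fraction(7)]
da = coboundary(a)

# The same db, computed numerically from its definition as a coboundary.
b = [cmath.exp(1j * math.pi * j / 4) for j in range(4)]
db_numeric = coboundary(b)
```

A coboundary has no zeroes, so `projection(da)` is the identity 1; a projection with zeroes, such as the Z_2 matrix with γ00 = γ01 = 1 and γ11 = 0, can be fed to `satisfies_defining_equations` in the same way.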
1.1.4. Independence, basis

A set of γ-elements is quasi-independent (resp. independent) if for arbitrary non-vanishing complex values of these elements, a complex γ (resp. a complex γ without zeroes) exists (cf. Definition I.5.1, Definition I.6.3 and Remark I.6.4). A set of elements {γ_{j_i k_i}; i = 1, 2, . . . , r} is independent if and only if the ansatz
$$\gamma_{j_i k_i} = \frac{a_{j_i} a_{k_i}}{a_{j_i + k_i}}$$
can be solved for r different aj's (cf. Lemma I.5.5). A pseudobasis is a maximal set of independent elements. It is a basis if the resulting γ is unique (cf. Definition I.5.1). For positive γ's, any pseudobasis is a basis. The natural bases for G are given in Appendix I.A.

1.1.5. Invariants

The following are invariants for γ.
(i) The support S(γ) = {(j, k) ∈ G × G | γjk ≠ 0}.
(ii) Higher-order identities (cf. Sec. I.4 A). These are identities of the form “P1 = P2” where P1, P2 denote products of γ-elements such that P1(da) = P2(da) for all da (note that any γ without zeroes is a da; aj ∈ C (cf. Theorem I.3.1)), but P1(γ) ≠ P2(γ) for some γ with zeroes. If 0 ≠ P1(γ) ≠ P2(γ) ≠ 0, we have a strong violation, otherwise a weak violation. A projection π can only have weak violations. The relation “γ00 = γ0k” can only be weakly violated (cf. Remark I.6.1).
(iii) Sign invariants (cf. Sec. I.4 B). These are invariants of the form sgn P(γ) where P is a product of γ-elements. They are of the first kind if sgn P(γ) = +1 for all γ's without zeroes, otherwise they are of the second kind. All sign invariants of the second kind are negative for the Z_{2M}-graded contraction db; M = 1, 2, . . . .

1.1.6. Structural results

γ ∼ π(γ) if and only if γ has no strong violations and, in the real case, no negative sign invariants (cf. Theorem I.6.7 and Lemma I.6.10). γ ∼ γ′ if and only if they agree on all the above invariants, i.e. our invariants are complete (cf. Theorem I.7.1). This leads to the following classification.

1.1.7. Classification

We give a straightforward algorithm which, for a given G, determines all possible supports S (cf. Remark I.6.2). For each S, let N(S) (resp. N′(S)) be the maximal number of independent (resp. quasi-independent) γ-elements in S. The construction of such maximal sets yields Q(S) = N′(S) − N(S) ≥ 0 higher-order identities which
can be arbitrarily strongly violated by γ's with support S, and which determine all other strong violations for these γ's (cf. Theorem I.6.5). In the complex case, this yields a Q(S)-parameter family of equivalence classes (cf. Sec. I.7). For the real case, we show how to get a maximal set of independent sign invariants (cf. Sec. 5, part C) which completes the classification (cf. Sec. I.7).

2. The Zeroes of γ

Let γ be a G-graded contraction matrix. In this section, we study the possible distribution of zeroes for γ since they determine much of the structure of Lγ. In fact, if γ is not strongly discrete (cf. Sec. 5), we have γ ∼ π(γ) so that Lγ is completely determined by the zeroes of γ. We will see in Sec. 3 that γ enters the Killing form of Lγ only through the elements γ0j and γj,−j (j ∈ G; cf. Eq. (3.9)). Therefore, we pay particular attention to these elements. We split V = ⊕j∈G Vj in two different ways as V = V(1) ⊕ V(2) (cf. Definition 2.1) resp. as V = VS ⊕ VI (cf. Definition 2.3) according to the zeroes in the set {γ0j} resp. {γj,−j}. These splittings will play a useful role in our study of Lγ in Sec. 3 (in particular, the choice of the subscripts S (as subalgebra) and I (as ideal) will become clear). We study the cases γ00 ≠ 0 and γ00 = 0 separately since they lead to very different types of Lγ (cf. Sec. 3). In the case γ00 ≠ 0, the zeroes in the sets {γ0j} and {γj,−j} are intricately related to each other (cf. Lemmas 2.4, 2.5 and Remark 2.6). In the case γ00 = 0, we show that this zero alone forces at least half of the elements {γjk | j, k, j + k ≠ 0} of γ to vanish (cf. Lemma 2.10). In Lemma 2.12, we prove that this is exactly the minimal number of additional zeroes in the case G = Z_N. The field C (resp. R) does not play any role in this section.

We split the elements γjk; j, k ∈ G; of a G-graded contraction γ into the following subsets
$$\gamma_{00}, \qquad \gamma_{0k};\ k \neq 0; \qquad \gamma_{j,-j};\ j \neq 0 \qquad \text{and} \qquad \gamma_{jk};\ j, k, j + k \neq 0.$$
We will study the implications of the defining equations (cf. Eq. I.(2.3))
$$\gamma_{jk}\, \gamma_{l,j+k} = \gamma_{jl}\, \gamma_{k,j+l} = \gamma_{kl}\, \gamma_{j,k+l}; \qquad j, k, l \in G; \tag{2.1}$$
separately for these subsets. We begin with the first subset. For j = l = 0, k ≠ 0, Eq. (2.1) yields
$$\gamma_{00}\, \gamma_{0k} = \gamma_{0k}^2; \qquad k \neq 0; \tag{2.2}$$
so that γ00 = 0 ⇒ γ0k = γ00 = 0, γ00 = 0 ⇒ either γ0k = γ00 = 0 or γ0k = 0. Definition 2.1. Given γ, we define I (1) = {k ∈ G | γ0k = 0},
(2.3)
I (2) = {k ∈ G | γ0k = 0}
(2.4)
and V (i) =
⊕ Vk ;
k∈I (i)
i = 1, 2.
(2.5)
We obviously have I (1) ∩ I (2) = φ;
I (1) ∪ I (2) = G;
I (1) = φ ⇒ 0 ∈ I (1)
and V = V (1) ⊕ V (2) . For l = 0 and j, k = 0, Eq. (2.1) gives γ0j γjk = γ0k γjk = γ0,j+k γjk . It follows that
γjk = 0 ⇒
{j, k, j + k} ⊂ I (1) , {j, k, j + k} ⊂ I
(2)
(2.6) or
.
(2.7)
To get all remaining defining equations which contain at least one subscript 0 ∈ G, we start with (j, l ∈ G; j = 0; r ∈ N) γj,−j γ0,rj+l = γj,rj+l γ−j,(r+1)j+l = γj,−j γ0,(r+1)j+l = γj,(r+1)j+l γ−j,(r+2)j+l where we used in the second equation rj + l = −j + [(r + 1)j + l]. By now taking r = 1, 2, . . . , Nj , we get (j, l ∈ G; j = 0), γ0l γj,−j = γ0,j+l γj,−j = γ0,2j+l γj,−j = · · · = γ0,(Nj −1)j+l γj,−j = γjl γ−j,j+l = γj,j+l γ−j,2j+l = γj,2j+l γ−j,3j+l = · · · = γj,(Nj −1)j+l γ−j,l . (2.8) (Part of this chain for l = 0 agrees with Eq. (2.6) for k = −j.) Equation (2.8) yields together with Definition 2.1 for 0 = j ∈ G γj,−j = 0 ⇒ {l, j + l, 2j + l, . . . , (Nj − 1)j + l} ⊂ I (i) ; for all l ∈ I (i) where i = 1, 2; l ∈ G.
(2.9)
We want to analyze Eq. (2.8) a little further.

(i) Assume $\gamma_{0l} = \gamma_{00}$ for all $l \in G$ (i.e. we have either $I^{(2)} = \phi$ if $\gamma_{00} \neq 0$ or $I^{(1)} = \phi$ if $\gamma_{00} = 0$). Then we can link Eqs. (2.8) together for all $l \in G$ to get
\[ \gamma_{00}\gamma_{j,-j} = \gamma_{jl}\gamma_{-j,j+l}; \quad j \neq 0; \quad l \in G. \tag{2.10} \]

(ii) Assume $\gamma_{00} \neq 0$, so that $0 \in I^{(1)}$, and assume $I^{(2)} \neq \phi$. If $\gamma_{j,-j} = 0$ for some $0 \neq j \in G$, all Eqs. (2.8) for different l can again be linked together to yield
\[ 0 = \gamma_{jl}\gamma_{-j,j+l}; \quad l \in G. \tag{2.11} \]
If $\gamma_{j,-j} \neq 0$ for some $0 \neq j \in G$, we can only link all Eqs. (2.8) for all $l \in I^{(1)}$ together to get
\[ 0 \neq \gamma_{00}\gamma_{j,-j} = \gamma_{jl}\gamma_{-j,j+l}; \quad l \in I^{(1)}. \tag{2.12} \]
The remaining equations
\[ 0 = \gamma_{jl}\gamma_{-j,j+l}; \quad l \in I^{(2)}; \]
are trivially satisfied since
\[ \gamma_{jl} = 0 \ \text{ and } \ \gamma_{-j,j+l} = 0; \quad l \in I^{(2)}; \tag{2.13} \]
due to $\pm j \in I^{(1)}$ (since $\gamma_{j,-j} \neq 0$ and $0 \in I^{(1)}$; cf. Eq. (2.7)) and $l, j+l \in I^{(2)}$ (cf. Eq. (2.9)). [In some special cases Eq. (2.12) can be ignored, too. Consider, e.g., $G = \mathbb{Z}_{2M}$ ($M = 1, 2, \ldots$); $I^{(1)} = \{0, M\}$ and $\gamma_{MM} \neq 0$. Then Eq. (2.12) simply reads $0 \neq \gamma_{00}\gamma_{MM} = \gamma_{0M}\gamma_{MM}$, which is trivially satisfied.]

Equation (2.8) already suffices to prove the following lemma.

Lemma 2.2. A G-graded contraction γ with
\[ \gamma_{j,-j} \neq 0 \quad \text{for all } j \in G \]
is without zeroes.

Proof. Since $\gamma_{j,-j} \neq 0$ for all $j \in G$, we have (cf. Eq. (2.7)) $I^{(2)} = \phi$ since $0 \in I^{(1)}$. Therefore Eq. (2.10) is valid, which reads
\[ 0 \neq \gamma_{00}\gamma_{j,-j} = \gamma_{jl}\gamma_{-j,j+l}; \quad j, l \in G; \]
so that $\gamma_{jl} \neq 0$.

Apart from Eq. (2.8), a specific element $\gamma_{j,-j}$ ($j \neq 0$) only occurs in those defining equations which relate the three elements
\[ \gamma_{j,-j}; \quad \gamma_{k,-k} \quad \text{and} \quad \gamma_{j+k,-j-k} \qquad \text{for all } k \neq 0, \pm j; \]
namely
\[ \gamma_{j+k,-k}\gamma_{j,-j} = \gamma_{-j,j+k}\gamma_{k,-k} = \gamma_{-j,-k}\gamma_{j+k,-j-k} \tag{2.14} \]
and
\[ \gamma_{-j-k,k}\gamma_{j,-j} = \gamma_{j,-j-k}\gamma_{k,-k} = \gamma_{jk}\gamma_{j+k,-j-k}. \tag{2.15} \]
All remaining defining equations only relate the elements $\{\gamma_{jk} \mid j, k, j+k \neq 0\}$.

Definition 2.3. Given γ, we define
\[ V_S = \bigoplus_{\gamma_{j,-j} \neq 0} V_j \tag{2.16} \]
and
\[ V_I = \bigoplus_{\gamma_{j,-j} = 0} V_j \tag{2.17} \]
so that $V = \oplus_{j \in G} V_j = V_S \oplus V_I$. Furthermore, we define
\[ V_S^{(i)} = V_S \cap V^{(i)}; \qquad V_I^{(i)} = V_I \cap V^{(i)}; \qquad i = 1, 2; \tag{2.18} \]
so that
\[ V^{(i)} = V_S^{(i)} \oplus V_I^{(i)}. \]

The case $\gamma_{00} \neq 0$. The following lemma shows how strongly the split into $I^{(1)}$ and $I^{(2)}$ already determines $V_S$.

Lemma 2.4. Given a G-graded contraction γ with $\gamma_{00} \neq 0$, i.e. $V_0 \subset V_S^{(1)}$. Then we have

(i)
\[ V_S = V_S^{(1)}, \tag{2.19} \]
(ii)
\[ V_j \subset V_S \Rightarrow \{l + nj \mid n = 1, 2, \ldots, N_j\} \subset I^{(i)} \ \text{ for all } l \in I^{(i)} \ (i = 1, 2), \tag{2.20} \]
especially
\[ \{nj \mid n = 1, 2, \ldots, N_j\} \subset I^{(1)}, \tag{2.21} \]
(iii)
\[ V_j, V_k \subset V_S \Rightarrow V_{nj+mk} \subset V_S; \quad n = 1, 2, \ldots, N_j;\ m = 1, 2, \ldots, N_k. \tag{2.22} \]

Proof. (i) Assume $V_j \subset V_S$. This means $\gamma_{j,-j} \neq 0$ and therefore $\pm j \in I^{(1)}$ (cf. Eq. (2.7), since $0 \in I^{(1)}$). Thus $V_S = V_S^{(1)}$.

(ii) Equation (2.20) is a direct consequence of Eq. (2.9). The special case $l = 0 \in I^{(1)}$ yields Eq. (2.21).
(iii) Assume $V_j, V_k \subset V_S$. This means $\gamma_{j,-j} \neq 0$; $\gamma_{k,-k} \neq 0$ and (cf. (i)) $\pm j, \pm k \in I^{(1)}$. Therefore the defining equation
\[ \gamma_{j,-j}\gamma_{0k} = \gamma_{jk}\gamma_{-j,j+k} \]
yields, since the left side is different from zero (once directly, once by replacing j by $(-k)$ and k by $(-j)$),
\[ \gamma_{-j,j+k} \neq 0 \quad \text{and} \quad \gamma_{k,-j-k} \neq 0 \]
so that the defining equation
\[ \gamma_{j+k,-j-k}\gamma_{0,-j} = \gamma_{j+k,-j}\gamma_{k,-j-k} \]
gives
\[ \gamma_{j+k,-j-k} \neq 0 \Leftrightarrow V_{j+k} \subset V_S. \]
By using this argument repeatedly for $k = j$, we get
\[ V_j \subset V_S \Rightarrow V_{nj} \subset V_S; \quad n = 1, 2, \ldots, N_j; \]
and if we then replace j by nj and k by mk we get
\[ V_j, V_k \subset V_S \Rightarrow V_{nj+mk} \subset V_S; \quad n = 1, 2, \ldots, N_j;\ m = 1, 2, \ldots, N_k. \]

When $\gamma_{00} \neq 0$, Lemma 2.4 tells us how the zeroes in the two sets $\{\gamma_{0j} \mid j \neq 0\}$ and $\{\gamma_{j,-j} \mid j \neq 0\}$ are related. The following lemma proves that all choices for these zeroes which are not excluded by Lemma 2.4 are indeed realized by some γ.

Lemma 2.5. A G-graded contraction γ exists for which $I^{(1)} \subset G$ is any subset with $0 \in I^{(1)}$ and for which $V_S \subset V$ is any subset which satisfies all conditions listed in Lemma 2.4 (i.e. $V_0 \subset V_S = V_S^{(1)}$ and Eqs. (2.20) and (2.22) hold).

Proof. We remark first that since the choice $V_S = V_0$ satisfies all conditions, the lemma is not vacuous. We define γ by
\[ \gamma_{jk} = \gamma_{kj} = 1 \quad \text{if} \quad \begin{cases} V_j \subset V_S,\ V_k \subset V^{(1)}, \text{ or} \\ V_k \subset V_S,\ V_j \subset V^{(1)} \end{cases} \tag{2.23} \]
and $\gamma_{jk} = 0$ otherwise. Equation (2.23) yields indeed
\[ \gamma_{0j} = 1 \text{ if } V_j \subset V^{(1)}, \quad \gamma_{0j} = 0 \text{ if } V_j \subset V^{(2)}, \quad \gamma_{j,-j} = 1 \text{ if } V_{\pm j} \subset V_S, \quad \gamma_{j,-j} = 0 \text{ if } V_{\pm j} \subset V_I. \]
Now we show that this γ satisfies all defining equations, i.e.
\[ \gamma_{jk}\gamma_{l,j+k} = \gamma_{jl}\gamma_{k,j+l} = \gamma_{kl}\gamma_{j,k+l}; \quad j, k, l \in G; \tag{2.24} \]
by going through all possible cases.
If $V_j, V_k, V_l \subset V_S = V_S^{(1)}$, we have $V_{j+k}, V_{j+l}, V_{k+l} \subset V_S$ (cf. Eq. (2.22)) so that all factors in Eq. (2.24) are equal to 1.

If $V_j, V_k \subset V_S = V_S^{(1)}$ and $V_l \subset V_I^{(i)}$ ($i = 1, 2$), we have $V_{j+k} \subset V_S$ (cf. Eq. (2.22)) and $V_{j+l}, V_{k+l} \subset V^{(i)}$ (cf. Eq. (2.20)). Therefore, in the case $i = 1$, all factors in Eq. (2.24) are again equal to 1, whereas in the case $i = 2$, all factors apart from $\gamma_{jk}$ vanish.

If $V_j \subset V_S$ and $V_k, V_l \subset V_I$, we must have $V_{j+k}, V_{j+l} \subset V_I$ since, e.g., $V_{j+k} \subset V_S$ would lead to $V_k \subset V_S$ (cf. Eq. (2.22)). Therefore, $\gamma_{l,j+k} = \gamma_{k,j+l} = \gamma_{kl} = 0$.

If finally $V_j, V_k, V_l \subset V_I$, we have $\gamma_{jk} = \gamma_{jl} = \gamma_{kl} = 0$.

Remark 2.6. The possible choices for $V_S$ in Lemma 2.5 are rather limited because of Lemma 2.4. We can always have $V_S = V_0$ independently of $I^{(1)}$. On the other hand, the choice $V_S = V$ is only possible if $I^{(1)} = G$, in fact if γ is without zeroes (cf. Lemma 2.2). All remaining choices for $V_S$ lie somewhere in between. Consider, e.g., the case $G = \mathbb{Z}_6$. The remaining choices for $V_S$ are
\[ V_S = V_0 \oplus V_3 \quad \text{if } \{0, 3\} \subset I^{(1)} \text{ and } \{1, 4\} \subset I^{(i)}\ (i = 1, 2) \text{ and } \{2, 5\} \subset I^{(i')}\ (i' = 1, 2); \]
\[ V_S = V_0 \oplus V_2 \oplus V_4 \quad \text{if } \{0, 2, 4\} \subset I^{(1)} \text{ and } \{1, 3, 5\} \subset I^{(i)}\ (i = 1, 2). \]
In the second case, we have $\gamma_{15} = \gamma_{33} = 0$ and $\gamma_{24} \neq 0$, which illustrates the fact that we can have
\[ V_j, V_k \subset V_I \quad \text{and} \quad V_{j+k} \subset V_S. \]
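The construction in the proof of Lemma 2.5 can be tried out by brute force for $G = \mathbb{Z}_6$. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, group elements are represented by the integers 0, ..., 5, and the second example takes $i = 1$ for $\{1, 3, 5\}$ (one of the two options allowed in Remark 2.6).

```python
from itertools import product

def lemma_2_5_gamma(n, i1, s):
    """Eq. (2.23) for G = Z_n: gamma_jk = 1 if (j in s and k in i1)
    or (k in s and j in i1), and 0 otherwise."""
    return [[1 if (j in s and k in i1) or (k in s and j in i1) else 0
             for k in range(n)] for j in range(n)]

def satisfies_defining_equations(gamma):
    """Brute-force check of Eq. (2.1) for all j, k, l in Z_n."""
    n = len(gamma)
    for j, k, l in product(range(n), repeat=3):
        p1 = gamma[j][k] * gamma[l][(j + k) % n]
        p2 = gamma[j][l] * gamma[k][(j + l) % n]
        p3 = gamma[k][l] * gamma[j][(k + l) % n]
        if not (p1 == p2 == p3):
            return False
    return True

def index_sets(gamma):
    """Return (I1, S): I1 = {k : gamma_0k != 0} as in Eq. (2.3), and
    S = {j : gamma_{j,-j} != 0}, the index set of V_S in Eq. (2.16)."""
    n = len(gamma)
    i1 = {k for k in range(n) if gamma[0][k] != 0}
    s = {j for j in range(n) if gamma[j][(-j) % n] != 0}
    return i1, s

# The two intermediate choices of V_S for G = Z_6 from Remark 2.6.
gamma_a = lemma_2_5_gamma(6, i1={0, 1, 3, 4}, s={0, 3})        # {2, 5} in I^(2)
gamma_b = lemma_2_5_gamma(6, i1={0, 1, 2, 3, 4, 5}, s={0, 2, 4})

for g in (gamma_a, gamma_b):
    print(satisfies_defining_equations(g), index_sets(g))
print(gamma_b[1][5], gamma_b[3][3], gamma_b[2][4])
```

The last line prints the three elements singled out in Remark 2.6 for the second case, so the statement $\gamma_{15} = \gamma_{33} = 0$, $\gamma_{24} \neq 0$ can be read off directly.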
Remark 2.7. An application of Lemma 2.4 is the following. Consider a $\mathbb{Z}_N$-graded contraction γ with
\[ \gamma_{00} \neq 0 \quad \text{and} \quad \gamma_{1,-1} \neq 0. \]
Then, we have $V_0, V_{\pm 1} \subset V_S$ and therefore (cf. Eq. (2.22)) $V_{\pm j} \subset V_S$ for all $j \in G$. This means (cf. Lemma 2.2) that γ is without zeroes. Since one non-vanishing element does not say anything about the remaining elements, two is therefore the minimal number of non-vanishing elements which force γ to be without zeroes. For $G = \mathbb{Z}_{N_1} \times \mathbb{Z}_{N_2} \not\cong \mathbb{Z}_{N_1 \cdot N_2}$, we can argue similarly to get
\[ \gamma_{00,00} \neq 0; \quad \gamma_{10;-1,0} \neq 0; \quad \gamma_{01;0,-1} \neq 0 \ \Rightarrow\ \gamma \text{ without zeroes}. \]
But three is not necessarily the minimal number. For example, the defining equations for $\mathbb{Z}_2 \times \mathbb{Z}_2$ show that two non-vanishing elements (like $\gamma_{10,11}$ and $\gamma_{01,01}$) already force γ to be without zeroes.

The case $\gamma_{00} = 0$. If $\gamma_{00} = 0$ we have $I^{(1)} = \phi$ (cf. Eq. (2.2)) and $V_0 \subset V_I^{(2)}$. Since all elements $\{\gamma_{j,-j} \mid j \neq 0\}$ are pairwise incompatible, a γ exists for arbitrary values of these elements. Therefore, γ’s exist where a specific couple $V_{\pm j}$ ($j \neq 0$) belongs either to $V_S^{(2)}$ or to $V_I^{(2)}$, in complete contrast to the case $\gamma_{00} \neq 0$ (cf. Lemma 2.4).

We will show that $\gamma_{00} = 0$ forces many elements of γ to vanish (besides $\gamma_{0k}$). To do this, it is convenient to first divide the set $\{\gamma_{jk} \mid j, k, j+k \neq 0\}$ into “triplets”.

Definition 2.8. Consider the set $\{\gamma_{jk} \mid j, k, j+k \neq 0\}$. We call the subset
\[ \{\gamma_{jk};\ \gamma_{j,-j-k};\ \gamma_{k,-j-k}\} \]
the triplet of $\gamma_{jk}$.

Remark 2.9. Note that each element of this triplet defines the same triplet, so that two triplets are either identical or disjoint. Only in the case $j = k$ does a triplet contain fewer than three elements, namely two if $3j \neq 0$ and one if $3j = 0$. The triplets of $\gamma_{jk}$ and $\gamma_{-j,-k}$ always have the same number of elements. These two triplets agree if and only if
\[ 2j = 2k = 0 \quad \text{where } j \neq k \]
(in the case $j = k$, we must have $j \neq -j$ since otherwise $j + k = 0$). For $G = \mathbb{Z}_N$, this case cannot occur. An example for $G = \mathbb{Z}_2 \times \mathbb{Z}_2$ is the triplet $\{\gamma_{01,10};\ \gamma_{01,11};\ \gamma_{10,11}\}$.

Lemma 2.10. Let $\gamma_{00} = 0$. Then $\gamma_{0k} = 0$; $k \in G$; and at least half of the elements of the set $\{\gamma_{jk} \mid j, k, j+k \neq 0\}$ must vanish. If $G = \mathbb{Z}_N$, this result gives precisely the minimal number of additional zeroes which are required.

Proof. Since $\gamma_{00} = 0$, Eq. (2.10) yields for all $j, k, j+k \neq 0$,
\[ 0 = \gamma_{jk}\gamma_{-j,j+k}. \tag{2.25} \]
This means especially for the triplet of $\gamma_{jk} = \gamma_{kj}$,
\[ \begin{aligned} 0 &= \gamma_{jk}\gamma_{-j,j+k} = \gamma_{kj}\gamma_{-k,j+k}, \\ 0 &= \gamma_{j,-j-k}\gamma_{-j,-k} = \gamma_{-j-k,j}\gamma_{j+k,-k}, \\ 0 &= \gamma_{k,-j-k}\gamma_{-k,-j} = \gamma_{-j-k,k}\gamma_{j+k,-j}, \end{aligned} \tag{2.26} \]
i.e. the triplet of $\gamma_{jk}$ gets exactly multiplied by the triplet of $\gamma_{-j,-k}$ (and vice versa).
Assume first that both triplets are different. Then it is easy to check that we can solve Eq. (2.26) with a minimal number of zeroes by setting one of these two triplets to zero. If both triplets agree (i.e. if $2j = 2k = 0$ where $j \neq k$ (cf. Remark 2.9)), Eq. (2.26) looks like
\[ 0 = \gamma_{jk}\gamma_{j,j+k} = \gamma_{jk}\gamma_{k,j+k} = \gamma_{j,j+k}\gamma_{k,j+k}. \]
This forces at least two elements of such a triplet to vanish. Altogether we see that in order to satisfy Eq. (2.25) at least half of the elements of the set $\{\gamma_{jk} \mid j, k, j+k \neq 0\}$ have to vanish. Lemma 2.12 provides us with a $\mathbb{Z}_N$-graded contraction γ which has exactly the zeroes required here.

Remark 2.11. We only used Eq. (2.10) to get the lower bound for the number of zeroes in Lemma 2.10. Now, we add Eqs. (2.14) and (2.15) into our consideration. In Eq. (2.14), the triplet of $\gamma_{-j,-k}$ occurs and in Eq. (2.15), the triplet of $\gamma_{jk}$. If these two triplets are different and one of them vanishes, Eqs. (2.14) and (2.15) can be satisfied with $\gamma_{j,-j} \neq 0$; $j \neq 0$. Therefore, the lower bound for zeroes we get in Lemma 2.10 does not change. If both triplets agree, Eq. (2.10) enforces only two elements of this triplet to vanish. But now Eqs. (2.14) and (2.15) enforce a third element to vanish too. Whenever G allows such a case (for $G = \mathbb{Z}_N$ it does not) the greatest lower bound for zeroes is in fact higher than the one stated in our lemma. Since we still have to take into account all defining equations which only relate the elements from the set $\{\gamma_{jk} \mid j, k, j+k \neq 0\}$, the question arises if one could improve this lower bound. As Lemma 2.12 shows, this is not possible for $G = \mathbb{Z}_N$.

Lemma 2.12. Let $G = \mathbb{Z}_N$. The equations ($j, k = 1, 2, \ldots, N-1$)
\[ \gamma_{00} = \gamma_{0k} = 0 \tag{2.27} \]
\[ \gamma_{j,-j} = 1 \tag{2.28} \]
\[ \gamma_{jk} = 1; \quad 0 < j, k < j+k < N; \tag{2.29} \]
\[ \gamma_{jk} = 0; \quad 0 < j+k < j, k < N; \tag{2.30} \]
define a $\mathbb{Z}_N$-graded γ for which the number of zeroes is precisely the number required by Lemma 2.10.

Proof. We note first that all elements of γ are uniquely defined. Since
\[ 0 < j, k < j+k < N \quad \text{means that} \quad 0 < -j-k < -j, -k < N \tag{2.31} \]
and
\[ 0 < j+k < j, k < N \quad \text{means that} \quad 0 < -j, -k < -j-k < N \tag{2.32} \]
the three elements
\[ \gamma_{jk}; \quad \gamma_{j,-j-k}; \quad \gamma_{k,-j-k} \]
which constitute the triplet of $\gamma_{jk}$ (cf. Definition 2.8) either all belong to Eq. (2.29) or all to Eq. (2.30), and the triplet of $\gamma_{-j,-k}$ then belongs to the other equation. Hence our γ satisfies the defining Eq. (2.10) (cf. proof of Lemma 2.10) and Eqs. (2.14) and (2.15) (see Remark 2.11). It remains only to check the remaining defining equations, namely
\[ \gamma_{jk}\gamma_{l,j+k} = \gamma_{jl}\gamma_{k,j+l} = \gamma_{kl}\gamma_{j,k+l} \tag{2.33} \]
for $j, k, l, j+k, j+l, k+l, j+k+l \neq 0$. We assume (without loss of generality)
\[ 0 < j \le k \le l < N. \]
Then we have exactly the following six possible order relations in $\mathbb{Z}$ (not $\mathbb{Z}_N$) for
\[ j+k; \quad j+l; \quad k+l \quad \text{and} \quad j+k+l. \]
(i) If $0 < j+k \le j+l \le k+l < j+k+l$
the three elements $\gamma_{l,j+k}$; $\gamma_{k,j+l}$; $\gamma_{j,k+l}$ belong to Eq. (2.29), the remaining three elements $\gamma_{jk}$; $\gamma_{jl}$; $\gamma_{kl}$ to Eq. (2.30).

(vi) If $0 < N < j+k \le j+l \le k+l < 2N < j+k+l < 3N$, all six elements in Eq. (2.33) belong to Eq. (2.30).

It is obvious that Eq. (2.33) is satisfied in all six cases.

This same $\mathbb{Z}_N$-graded contraction γ as defined in Lemma 2.12 will serve later as an example of a $L_\gamma$ whose class of nilpotency has the maximal value N (cf. Remark 3.7).

3. General Structure of $L_\gamma$

Let $L_\gamma = (V, \mu_\gamma)$ be a G-graded contraction of a G-graded Lie algebra $L = (V, \mu)$. We first calculate the Killing form of $L_\gamma$, which depends only on the elements $\gamma_{0j}$ and $\gamma_{j,-j}$; $j \in G$ (cf. Eq. (3.9)). This implies immediately that if γ has zeroes then $L_\gamma$ cannot be semisimple, and if $\gamma_{00} = 0$ then $L_\gamma$ is nilpotent (cf. Lemma 3.1).

We then study the effect on $L_\gamma$ of the two splittings of the vector space V which we introduced in Sec. 2 (cf. Definitions 2.1 and 2.3). $L_\gamma$ splits under $V = V^{(1)} \oplus V^{(2)}$ into the direct sum $L_\gamma^{(1)} \oplus L_\gamma^{(2)}$ of two smaller Lie algebras (cf. Lemma 3.2). The decomposition $V = V_S \oplus V_I$ produces the nilpotent ideal $V_I$ of $L_\gamma$ (cf. Lemma 3.3): If $\gamma_{00} \neq 0$, then $V_S^{(1)}$ is a subalgebra of $L_\gamma^{(1)}$, so that $L_\gamma^{(1)}$ becomes the semidirect sum of $V_S^{(1)}$ and $V_I^{(1)}$ (cf. Lemma 3.4). If $\gamma_{00} = 0$, we prove that the class (or degree) of nilpotency of $L_\gamma$ is at most $N = |G|$ (cf. Lemma 3.6). Again the field $\mathbb{C}$ (resp. $\mathbb{R}$) does not play any role in this section.

3.1. The Killing form

Recall that the Killing form g of a Lie algebra $L = (V, \mu)$ with structure constants $C_{jk}^{l}$ in the basis $e_j$; $j = 1, 2, \ldots, \dim V$; i.e.
\[ \mu(e_j, e_k) = C_{jk}^{l} e_l \]
is the symmetric matrix
\[ g_{jk} = C_{jl}^{r} C_{kr}^{l}; \quad j, k, l, r = 1, 2, \ldots, \dim V. \]
We recall also that $L = (V, \mu)$ is
semisimple ⇔ $\det g \neq 0$;
solvable ⇔ $(\mu(V, V), \mu)$ nilpotent;
nilpotent ⇔ $g = 0$.
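The formula $g_{jk} = C_{jl}^{r} C_{kr}^{l}$ is easy to evaluate mechanically. The sketch below is an illustration only: it uses $sl(2)$ with the basis $(h, e, f)$ and brackets $[h, e] = 2e$, $[h, f] = -2f$, $[e, f] = h$, which is an assumed example and does not appear in the text, and the helper names are hypothetical.

```python
def killing_form(c):
    """g_jk = sum over l, r of C_{jl}^r * C_{kr}^l, with c[j][k][l] = C_{jk}^l."""
    n = len(c)
    return [[sum(c[j][l][r] * c[k][r][l] for l in range(n) for r in range(n))
             for k in range(n)] for j in range(n)]

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for col, entry in enumerate(m[0]):
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        total += (-1) ** col * entry * det(minor)
    return total

def sl2_structure_constants():
    """Structure constants of sl(2) in the basis e_0 = h, e_1 = e, e_2 = f."""
    c = [[[0] * 3 for _ in range(3)] for _ in range(3)]

    def bracket(j, k, l, value):
        # Record mu(e_j, e_k) = value * e_l together with antisymmetry.
        c[j][k][l] = value
        c[k][j][l] = -value

    bracket(0, 1, 1, 2)    # [h, e] = 2e
    bracket(0, 2, 2, -2)   # [h, f] = -2f
    bracket(1, 2, 0, 1)    # [e, f] = h
    return c

g = killing_form(sl2_structure_constants())
print(g)        # Killing form in the basis (h, e, f)
print(det(g))   # non-zero, consistent with sl(2) being semisimple
```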
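Now consider the G-graded Lie algebra $L = (V, \mu)$ where
\[ V = \bigoplus_{j \in G} V_j \quad \text{and} \quad \mu(V_j, V_k) \subset V_{j+k}. \tag{3.1} \]
Let $e_{j\alpha}$; $\alpha = 1, 2, \ldots, \dim V_j$; be a basis of $V_j$. Then the structure constants of L are given by
\[ \mu(e_{j\alpha}, e_{k\beta}) = C_{j\alpha,k\beta}^{(j+k)\delta}\, e_{(j+k)\delta}; \quad j, k \in G; \tag{3.2} \]
and the Killing form g consists of submatrices
\[ g_{jk} = (g_{j\alpha,k\beta}) \tag{3.3} \]
where
\[ g_{j\alpha,k\beta} = C_{j\alpha,l\delta}^{r\varepsilon}\, C_{k\beta,r\varepsilon}^{l\delta}; \quad j, k, l, r \in G. \tag{3.4} \]
Because of the grading (cf. Eq. (3.1)) the above structure constants vanish unless $r = j + l$ and $l = k + r$, and hence
\[ g_{jk} = 0 \quad \text{if } k \neq -j \tag{3.5} \]
and
\[ g_{j\alpha;(-j)\beta} = C_{j\alpha,l\delta}^{(j+l)\varepsilon}\, C_{(-j)\beta;(j+l)\varepsilon}^{l\delta}. \tag{3.6} \]
Now consider the graded contraction
\[ L = (V, \mu) \xrightarrow{\ \gamma\ } L_\gamma = (V, \mu_\gamma). \]
Then we have
\[ \mu_\gamma(e_{j\alpha}, e_{k\beta}) = \gamma_{jk}\, \mu(e_{j\alpha}, e_{k\beta}) \tag{3.7} \]
so that the Killing form $g_\gamma$ of $L_\gamma$ is
\[ (g_\gamma)_{jk} = 0 \quad \text{if } k \neq -j \tag{3.8} \]
and, because of the defining equation $\gamma_{jl}\gamma_{-j,j+l} = \gamma_{j,-j}\gamma_{0l}$,
\[ (g_\gamma)_{j\alpha;(-j)\beta} = \gamma_{j,-j} \sum_{l \in G} \gamma_{0l} \sum_{\delta,\varepsilon} C_{j\alpha,l\delta}^{(j+l)\varepsilon}\, C_{(-j)\beta,(j+l)\varepsilon}^{l\delta}. \tag{3.9} \]
From this equation, we get

Lemma 3.1. Given a graded contraction $L \xrightarrow{\ \gamma\ } L_\gamma$.
(i) If $\gamma_{00} = 0$, then $g_\gamma = 0$ so that $L_\gamma$ is nilpotent.
(ii) If γ has zeroes, then $\det g_\gamma = 0$ so that $L_\gamma$ is not semisimple.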
(iii) If γ is without zeroes, then
\[ \det g_\gamma = c \det g \quad \text{for some } 0 \neq c \in \mathbb{C} \text{ (resp. } \mathbb{R}). \]

Proof. (i) If $\gamma_{00} = 0$ then Eq. (2.2) yields $\gamma_{0l} = 0$ for all $l \in G$ so that Eq. (3.9) gives $g_\gamma = 0$.
(ii) In this case, it follows from Lemma 2.2 that we must have $\gamma_{k,-k} = 0$ for some $k \in G$. Then Eq. (3.9) yields $(g_\gamma)_{k,-k} = 0$, and hence $\det g_\gamma = 0$.
(iii) If γ is without zeroes, we have $\gamma_{00} = \gamma_{0l} \neq 0$; $l \in G$; (cf. Eq. (2.2)) so that Eq. (3.9) yields
\[ (g_\gamma)_{j,-j} = \gamma_{00}\gamma_{j,-j}\, g_{j,-j} \]
which implies $\det g_\gamma = c \det g$ for some $c \neq 0$.

This means, e.g., that if L is semisimple so is $L_\gamma$. [Note that it is only possible in the real case to produce a semisimple $L_\gamma \not\cong L$ (cf. Example 6.11(i)) since we have in the complex case for a γ without zeroes $\gamma \sim 1$ (cf. Theorem I.3.1).]

3.2. The splittings and the structure of $L_\gamma$

In Definition 2.1 we split V as $V^{(1)} \oplus V^{(2)}$ according to whether or not $\gamma_{0k} = 0$.

Lemma 3.2.
\[ L_\gamma = L_\gamma^{(1)} \oplus L_\gamma^{(2)} \]
where
\[ L_\gamma^{(i)} = (V^{(i)}, \mu_\gamma); \quad i = 1, 2. \]

Proof. Since
\[ \mu_\gamma(V_j, V_k) = \gamma_{jk}\, \mu(V_j, V_k) \subset \gamma_{jk} V_{j+k}; \quad j, k \in G; \]
Eq. (2.7) yields
\[ \mu_\gamma(V^{(i)}, V^{(i)}) \subset V^{(i)}; \qquad \mu_\gamma(V^{(1)}, V^{(2)}) = 0. \]
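The three statements of Lemma 3.1 can be observed on a small example. The sketch below is an illustration under assumptions that are not taken from the text: it uses $sl(2)$ with basis $(h, e, f)$, the $\mathbb{Z}_2$-grading $V_0 = \mathrm{span}\{h\}$, $V_1 = \mathrm{span}\{e, f\}$, and three hand-picked solutions $(\gamma_{00}, \gamma_{01}, \gamma_{11})$ of the $\mathbb{Z}_2$ defining equations; all function names are hypothetical.

```python
def killing_form(c):
    """g_jk = sum over l, r of C_{jl}^r * C_{kr}^l, with c[j][k][l] = C_{jk}^l."""
    n = len(c)
    return [[sum(c[j][l][r] * c[k][r][l] for l in range(n) for r in range(n))
             for k in range(n)] for j in range(n)]

def sl2_structure_constants():
    """sl(2) in the basis e_0 = h, e_1 = e, e_2 = f."""
    c = [[[0] * 3 for _ in range(3)] for _ in range(3)]
    for (j, k, l, value) in [(0, 1, 1, 2), (0, 2, 2, -2), (1, 2, 0, 1)]:
        c[j][k][l] = value      # [h,e] = 2e, [h,f] = -2f, [e,f] = h
        c[k][j][l] = -value     # antisymmetry
    return c

DEGREE = [0, 1, 1]  # assumed Z_2-degrees of h, e, f

def contract(c, gamma):
    """Eq. (3.7): scale mu(e_j, e_k) by gamma[deg j][deg k]."""
    n = len(c)
    return [[[gamma[DEGREE[j]][DEGREE[k]] * c[j][k][l] for l in range(n)]
             for k in range(n)] for j in range(n)]

def z2_gamma(g00, g01, g11):
    """Symmetric 2x2 contraction matrix for G = Z_2."""
    return [[g00, g01], [g01, g11]]

c = sl2_structure_constants()
g_trivial = killing_form(contract(c, z2_gamma(1, 1, 1)))  # gamma without zeroes
g_zero11 = killing_form(contract(c, z2_gamma(1, 1, 0)))   # gamma_11 = 0
g_nil = killing_form(contract(c, z2_gamma(0, 0, 1)))      # gamma_00 = 0

for name, g in [("no zeroes", g_trivial), ("gamma_11 = 0", g_zero11),
                ("gamma_00 = 0", g_nil)]:
    print(name, g)
```

In this sketch the second choice leaves only the block $(g_\gamma)_{00}$ non-zero, so the Killing form is degenerate, and the third choice gives the Heisenberg-type bracket $[e, f] = h$ with vanishing Killing form.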
We now study the consequences of the split $V = V_S \oplus V_I$, which is induced by whether or not $\gamma_{j,-j} = 0$ (cf. Definition 2.3), on the structure of $L_\gamma$.

Lemma 3.3. $(V_I, \mu_\gamma)$ is a nilpotent ideal of $L_\gamma = (V, \mu_\gamma)$. Furthermore, $(V_I^{(i)}, \mu_\gamma)$ is a nilpotent ideal of $L_\gamma^{(i)}$; $i = 1, 2$.

Proof. Note that $V_I \neq \phi$ unless γ has no zeroes (cf. Lemma 2.2). Let $V_j \subset V_I$, i.e. $\gamma_{j,-j} = 0$ for some $j \in G$. Then the left side of the defining equation
\[ \gamma_{k,-j-k}\gamma_{j,-j} = \gamma_{jk}\gamma_{j+k,-j-k}; \quad k \in G; \]
vanishes. Therefore we must have either $\gamma_{jk} = 0$ and hence $\mu_\gamma(V_j, V_k) = 0$, or $\gamma_{j+k,-j-k} = 0$ and hence $V_{j+k} \subset V_I$. Thus $\mu_\gamma(V_I, V) \subset V_I$, which means that $(V_I, \mu_\gamma)$ is an ideal. $(V_I, \mu_\gamma)$ is a nilpotent subalgebra of $L_\gamma$ since its Killing form vanishes identically (cf. Eq. (3.9)). Together with Lemma 3.2 this yields immediately that $(V_I^{(i)}, \mu_\gamma)$ is a nilpotent ideal of $L_\gamma^{(i)}$; $i = 1, 2$.

The case $\gamma_{00} \neq 0$. The following lemma gives the main structural result for $\gamma_{00} \neq 0$.

Lemma 3.4. Assume $\gamma_{00} \neq 0$.
(i) Then $L_\gamma^{(1)} = (V_S \oplus V_I^{(1)}, \mu_\gamma)$ is the semidirect sum of its subalgebra $(V_S, \mu_\gamma)$ and its nilpotent ideal $(V_I^{(1)}, \mu_\gamma)$. $L_\gamma^{(2)} = (V_I^{(2)}, \mu_\gamma)$ is nilpotent.
(ii) Furthermore,
\[ \gamma_{jk} \neq 0 \quad \text{if} \quad \begin{cases} V_j \subset V_S,\ V_k \subset V^{(1)}, \text{ or} \\ V_k \subset V_S,\ V_j \subset V^{(1)} \end{cases} \]
so that no zeroes of γ occur in the Lie products $\mu_\gamma(V_S, V_S) \subset V_S$ and $\mu_\gamma(V_S, V_I^{(1)}) \subset V_I^{(1)}$. In contrast, if $V_I^{(1)} \neq \phi$ (resp. $V_I^{(2)} \neq \phi$) then at least one zero of γ must occur in the Lie products $\mu_\gamma(V_I^{(1)}, V_I^{(1)}) \subset V_I^{(1)}$ (resp. $\mu_\gamma(V_I^{(2)}, V_I^{(2)}) \subset V_I^{(2)}$).

Proof. (i) Since $\gamma_{00} \neq 0$ we have $V_0 \subset V_S = V_S^{(1)}$ (cf. Eq. (2.19)). Equation (2.22) shows explicitly that $(V_S, \mu_\gamma)$ is a subalgebra of $L_\gamma$. Together with Lemmas 3.2 and 3.3, we therefore get that $L_\gamma^{(1)}$ is a semidirect sum of its subalgebra $V_S$ and its ideal $V_I^{(1)}$, while $L_\gamma^{(2)} = (V_I^{(2)}, \mu_\gamma)$ is nilpotent.

(ii) The left side of the defining equation
\[ \gamma_{j,-j}\gamma_{0k} = \gamma_{jk}\gamma_{-j,j+k} \]
is different from zero whenever $\gamma_{j,-j} \neq 0$ and $\gamma_{0k} \neq 0$. Therefore,
\[ \gamma_{jk} = \gamma_{kj} \neq 0 \quad \text{if} \quad \begin{cases} V_j \subset V_S,\ V_k \subset V^{(1)}, \text{ or} \\ V_k \subset V_S,\ V_j \subset V^{(1)}. \end{cases} \]
Now we show that $\mu_\gamma(V_I^{(1)}, V_I^{(1)}) \subset V_I^{(1)}$ must contain zeroes of γ unless $V_I^{(1)} = \phi$. If $V_I^{(1)} \neq \phi$ we have $V_l \subset V_I^{(1)}$ for some $l \neq 0$, i.e. $\gamma_{l,-l} = 0$ and $\gamma_{0l} = \gamma_{00} \neq 0$. If $V_{-l} \subset V_I^{(1)}$, the desired zero is already given by $\gamma_{l,-l} = 0$. If $V_{-l} \subset V_I^{(2)}$ we will show that some $r = 1, 2, \ldots, N_l - 2$ exists with
\[ V_{rl} \subset V_I^{(1)} \quad \text{and} \quad \gamma_{l,rl} = 0. \tag{3.10} \]
If $\gamma_{ll} = 0$, we have $r = 1$. If $\gamma_{ll} \neq 0$, we must have $V_{2l} \subset V_I^{(1)}$ because of
\[ \mu_\gamma(V_l, V_l) = \gamma_{ll}\, \mu(V_l, V_l) \subset V_{2l} \quad \text{and} \quad \mu_\gamma(V_I^{(1)}, V_I^{(1)}) \subset V_I^{(1)}. \]
If $\gamma_{ll} \neq 0$, but $\gamma_{l,2l} = 0$, we have $r = 2$ in Eq. (3.10). If $\gamma_{ll} \neq 0$ and $\gamma_{l,2l} \neq 0$, we get with a similar argument as above that $V_{3l} \subset V_I^{(1)}$. Continuing in this way we must find a zero of γ since we cannot have
\[ \gamma_{l,rl} \neq 0 \quad \text{for all } r = 1, 2, \ldots, N_l - 2; \]
since this would yield $V_{-l} \subset V_I^{(1)}$, in contradiction to our assumption that $V_{-l} \subset V_I^{(2)}$.

If $V_I^{(2)} \neq \phi$, we can repeat the same argument; the fact that now $\gamma_{0l} = 0$ only exchanges everywhere $V_I^{(1)}$ with $V_I^{(2)}$.

The case $\gamma_{00} = 0$.

Remark 3.5. When $\gamma_{00} = 0$, we have $I^{(1)} = \phi$ and $V_0 \subset V_I^{(2)}$. Therefore, Lemmas 3.2 and 3.3 yield
\[ L_\gamma = L_\gamma^{(2)} = \bigl(V_S^{(2)} \oplus V_I^{(2)}, \mu_\gamma\bigr). \]
In contrast to the case $\gamma_{00} \neq 0$, we do not have further structural results. For example, $(V_S^{(2)}, \mu_\gamma)$ will, in general, not be a subalgebra of $L_\gamma$ since $\gamma_{j,-j} \neq 0$ for some $j \neq 0$ yields
\[ \mu_\gamma(V_j, V_{-j}) = \gamma_{j,-j}\, \mu(V_j, V_{-j}) \subset V_0 \subset V_I^{(2)}. \]
And we can always have $\gamma_{j,-j} \neq 0$ for some $j \neq 0$ (cf. Sec. 2). Lemma 2.12 even gives an example where $V_S^{(2)}$ is maximal, namely $V_S^{(2)} = V \setminus V_0$.

Lemma 3.1 tells us that $L_\gamma$ is nilpotent whenever $\gamma_{00} = 0$. This means that its lower central series
\[ \mu_\gamma(V, V); \quad \mu_\gamma(V, \mu_\gamma(V, V)); \quad \mu_\gamma(V, \mu_\gamma(V, \mu_\gamma(V, V))); \ \cdots \]
terminates after a finite number of terms. The first trivial term in this series gives the class (or degree) of nilpotency of $L_\gamma$. For example, an Abelian Lie algebra is nilpotent of class 1. For a nilpotent Lie algebra $L = (V, \mu)$ it is easy to see that the class of nilpotency is at most $(\dim V - 1)$. In our case of the nilpotent graded Lie algebra $L_\gamma = (V, \mu_\gamma)$ with $\gamma_{00} = 0$ the upper bound for the class of nilpotency is given by $|G|$.

Lemma 3.6. Given a G-graded contraction γ with $\gamma_{00} = 0$. The class (or degree) of nilpotency of $L_\gamma$ is at most $N = |G|$.

Proof. (i) We show first that by applying the Lie product $\mu_\gamma$ repeatedly you cannot get from any $V_j$ ($j \in G$) back to the same $V_j$, i.e. all closed loops vanish (otherwise $L_\gamma$ could not be nilpotent). Because of $\gamma_{0j} = \gamma_{00} = 0$, we have
\[ \mu_\gamma(V_j, V_0) = 0 \quad \text{for all } j \in G \]
so that all closed loops with one step vanish. Furthermore, we see that all closed loops vanish whenever $V_0$ is involved. Since (cf. Eq. (2.25) in the case $j, k, j+k \neq 0$)
\[ \gamma_{jk}\gamma_{-k,j+k} = 0; \quad j, k \in G; \tag{3.11} \]
we have
\[ \mu_\gamma(V_{-k}, \mu_\gamma(V_k, V_j)) = 0, \]
i.e. any closed loop $V_j \to V_{j+k} \to V_j$ with two steps vanishes. Since
\[ \gamma_{jk}\gamma_{l,j+k}\gamma_{-k-l,j+k+l} = 0; \quad j, k, l \in G; \]
(use first Eq. (2.1) to get $\gamma_{l,j+k}\gamma_{-k-l,j+k+l} = \gamma_{l,-k-l}\gamma_{j+k,-k}$ and then Eq. (3.11)) we have
\[ \mu_\gamma(V_{-k-l}, \mu_\gamma(V_l, \mu_\gamma(V_k, V_j))) = 0, \]
i.e. any closed loop $V_j \to V_{j+k} \to V_{j+k+l} \to V_j$ with three steps vanishes. To show that a closed loop with $(n+1)$ steps ($n \in \mathbb{N}$) vanishes, we use that
\[ \gamma_{jk_1}\gamma_{k_2,j+k_1}\gamma_{k_3,j+k_1+k_2} \cdots \gamma_{k_n,j+k_1+\cdots+k_{n-1}}\gamma_{-k_1-\cdots-k_n,j+k_1+\cdots+k_n} = 0. \]
(To see this use the defining equation
\[ \gamma_{k_r,j+k_1+\cdots+k_{r-1}}\gamma_{-k_1-\cdots-k_r,j+k_1+\cdots+k_r} = \gamma_{k_r,-k_1-\cdots-k_r}\gamma_{-k_1-\cdots-k_{r-1},j+k_1+\cdots+k_{r-1}} \]
repeatedly for $r = n, n-1, \ldots, 2$ and finally again Eq. (3.11), namely $\gamma_{jk_1}\gamma_{-k_1,j+k_1} = 0$.)

(ii) Because of (i), multiple Lie products which do not vanish have to go through different subspaces $V_k$ only. Such a chain can have at most $(N-1)$ steps. $V_0$ can obviously only occur at the end of such a chain.

Remark 3.7. In Lemma 2.12, we give a $\mathbb{Z}_N$-graded γ with $\gamma_{00} = 0$ and (cf. Eqs. (2.28) and (2.29))
\[ \gamma_{jk} \neq 0 \quad \text{if} \quad 0 < j, k < j+k \le N. \]
Therefore, the $(N-1)$-fold Lie product
\[ \mu_\gamma(V_1, \mu_\gamma(V_1, \mu_\gamma(V_1, \ldots, \mu_\gamma(V_1, \mu_\gamma(V_1, V_1)) \cdots))) = \gamma_{11}\gamma_{12}\gamma_{13} \cdots \gamma_{1,N-1}\, \mu(V_1, \mu(V_1, \mu(V_1, \ldots, \mu(V_1, \mu(V_1, V_1)) \cdots))) \]
is different from zero whenever this expression does not vanish for $L = (V, \mu)$. For such an L (which is easy to construct), the class of nilpotency of $L_\gamma$ has the maximal value N (cf. Lemma 3.6).
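The γ of Lemma 2.12 is simple enough to test numerically. The following sketch (function names are hypothetical) represents $\mathbb{Z}_N$ by the integers $0, \ldots, N-1$; with these representatives, Eqs. (2.27)-(2.30) collapse to the single rule "$\gamma_{jk} = 1$ if $j, k \geq 1$ and $j + k \leq N$ as integers, else 0", where $j + k = N$ is the case $\gamma_{j,-j}$. It checks the defining equations, counts zeroes among ordered pairs $(j, k)$ with $j, k, j+k \neq 0$, and evaluates the coefficient $\gamma_{11}\gamma_{12} \cdots \gamma_{1,N-1}$ from Remark 3.7.

```python
from itertools import product

def lemma_2_12_gamma(n):
    """Eqs. (2.27)-(2.30) with integer representatives 0..n-1 of Z_n."""
    return [[1 if j >= 1 and k >= 1 and j + k <= n else 0
             for k in range(n)] for j in range(n)]

def satisfies_defining_equations(gamma):
    """Brute-force check of Eq. (2.1) for all j, k, l in Z_n."""
    n = len(gamma)
    for j, k, l in product(range(n), repeat=3):
        p1 = gamma[j][k] * gamma[l][(j + k) % n]
        p2 = gamma[j][l] * gamma[k][(j + l) % n]
        p3 = gamma[k][l] * gamma[j][(k + l) % n]
        if not (p1 == p2 == p3):
            return False
    return True

def zero_count(gamma):
    """Return (zeroes, total) over ordered pairs (j, k) with j, k, j+k != 0."""
    n = len(gamma)
    pairs = [(j, k) for j in range(1, n) for k in range(1, n)
             if (j + k) % n != 0]
    zeroes = sum(1 for j, k in pairs if gamma[j][k] == 0)
    return zeroes, len(pairs)

def chain_coefficient(gamma):
    """gamma_11 * gamma_12 * ... * gamma_{1,n-1}, the factor in Remark 3.7."""
    result = 1
    for k in range(1, len(gamma)):
        result *= gamma[1][k]
    return result

for n in range(2, 9):
    g = lemma_2_12_gamma(n)
    print(n, satisfies_defining_equations(g), zero_count(g), chain_coefficient(g))
```

Counting ordered pairs is a convention of this sketch; since the map $(j, k) \mapsto (-j, -k)$ exchanges Eqs. (2.29) and (2.30), the proportion of zeroes is the same for unordered pairs.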
$\gamma_{00} = 0$ is not the only way to produce a nilpotent $L_\gamma$. In Sec. 5 (cf. Example 5.5) we use the following obvious result.

Lemma 3.8. Given a complex (resp. real) G-graded contraction γ where all nonvanishing elements are pairwise incompatible. Then $L_\gamma$ is nilpotent of class 2 (at most).

To end this section, we mention that there is no simple criterion for $L_\gamma$ to be solvable. The reason is the following.

Remark 3.9. If $\gamma_{00} = 0$, $L_\gamma$ is nilpotent (cf. Lemmas 3.1 and 3.6) whatever L is. If $\gamma_{00} \neq 0$, the subalgebra $(V_0, \mu)$ of L survives. If this subalgebra is not solvable, $L_\gamma$ is not solvable.

4. Continuous G-Graded Contractions

The main result of this section is the statement that γ is continuous if and only if it is free of violations (cf. Theorem 4.2). Strong violations (and, in the real case, negative sign invariants) have been treated extensively in Part I, leaving only the problem of weak violations. Fortunately, the proof of Theorem 4.2 provides a test for recognizing weak violations which even produces the higher-order identities which are weakly violated (cf. Corollary 4.3 and Remark 4.4).

Definition 4.1 (cf. Definition I.2.17). A complex (resp. real) G-graded contraction γ is continuous if a family $da(\varepsilon)$; $0 \neq a_j(\varepsilon) \in \mathbb{C}$ (resp. $\mathbb{R}$); $\varepsilon \in (0, 1]$; $j \in G$; exists so that
\[ \gamma = \lim_{\varepsilon \to 0} da(\varepsilon), \tag{4.1} \]
otherwise γ is called discrete.

In [7], we characterized non-negative $\mathbb{Z}_N$-graded contractions (cf. Theorems 6.1 and 6.2). Our results in Part I now allow us to extend to the general case.

Theorem 4.2. A complex (resp. real) G-graded contraction γ is continuous if and only if
(i) $\gamma_{00} = \gamma_{0k}$; $k \in G$; and $P_1(\gamma) = P_2(\gamma)$ for all higher-order identities “$P_1 = P_2$” and, in the real case,
(ii) no negative sign invariants (of either kind) exist for the set of non-vanishing elements of γ.

Proof. We note first that any coboundary da has no violations. Namely, we have
\[ (da)_{00} = a_0 = (da)_{0k}; \quad k \in G. \]
By their definition, all higher-order identities are satisfied by any da. Finally, in the real case, Lemma I.4.8 states that for all da with $0 \neq a_j \in \mathbb{R}$ all sign invariants are positive. The necessity of the conditions follows trivially from Definition 4.1.
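The first step of the proof can be checked with exact arithmetic. The sketch below assumes the coboundary formula $(da)_{jk} = a_j a_k / a_{j+k}$; this formula is defined in Part I and is inferred here from the expressions $(da)_{00} = a_0$ above and $\gamma_{15} = a_1 a_5 / a_0$ in Examples 4.5. The group $\mathbb{Z}_6$ and the sample values $a_j$ are arbitrary illustrative choices.

```python
from fractions import Fraction
from itertools import product

def coboundary(a):
    """(da)_jk = a_j * a_k / a_{j+k} over Z_n (assumed form of the coboundary)."""
    n = len(a)
    return [[a[j] * a[k] / a[(j + k) % n] for k in range(n)] for j in range(n)]

def satisfies_defining_equations(gamma):
    """Exact check of Eq. (2.1) for all j, k, l in Z_n."""
    n = len(gamma)
    for j, k, l in product(range(n), repeat=3):
        p1 = gamma[j][k] * gamma[l][(j + k) % n]
        p2 = gamma[j][l] * gamma[k][(j + l) % n]
        p3 = gamma[k][l] * gamma[j][(k + l) % n]
        if not (p1 == p2 == p3):
            return False
    return True

# Arbitrary non-zero rational sample values a_0, ..., a_5 for G = Z_6.
a = [Fraction(2), Fraction(3), Fraction(5, 7),
     Fraction(-1, 2), Fraction(11), Fraction(4, 3)]
da = coboundary(a)

print(satisfies_defining_equations(da))
print(all(da[0][k] == a[0] for k in range(len(a))))  # (da)_00 = a_0 = (da)_0k
```

Each product in Eq. (2.1) reduces to $a_j a_k a_l / a_{j+k+l}$ for a coboundary, which is why the check succeeds for any non-zero $a_j$.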
Now assume that γ satisfies the conditions. Since γ has no strong violations (and, in the real case, no negative sign invariants) it follows from our classification (cf. Theorem I.7.1) that $\gamma \sim \pi(\gamma)$. Now the definition of equivalence trivially implies (i) that continuity is an invariant, and (ii) that equivalent γ’s have identical violations (since da has no violations). It is therefore sufficient to consider the case of a projection π which has no (necessarily weak) violations.

Since $\pi = 1$ is clearly continuous (just choose $a_j(\varepsilon) = 1$; $j \in G$; in Eq. (4.1)), we can assume that π has some zeroes. In [7] (cf. Theorem 6.2), we constructed for any $\mathbb{Z}_N$-graded projection π with zeroes and without violations, numbers $0 \le n_j \in \mathbb{Q}$; $j \in \mathbb{Z}_N$; such that
\[ \pi = \lim_{\varepsilon \to 0} da(\varepsilon) \quad \text{where} \quad a_j(\varepsilon) = \varepsilon^{n_j}; \quad j \in \mathbb{Z}_N. \tag{4.2} \]
We now sketch the proof of this result in sufficient detail that it is clear that it remains valid for G. This proof consists of the following three steps.

Step 1. We write $\pi = \pi_1 \cdot \pi_2 \cdots \pi_r$ where each projection $\pi_i$ ($i = 1, 2, \ldots, r \in \mathbb{N}$) has no violations, and its set of zeroes is minimal in the following sense. To define $\pi_1$ we choose an arbitrary zero $\pi_{s_1}$ ($s_i \in \mathbb{Z}_N \times \mathbb{Z}_N$) of π and turn as many zeroes of π as possible, other than $\pi_{s_1}$, into “ones” without violating either the defining equations, or $(\pi_1)_{00} = (\pi_1)_{0k}$, or any higher-order identities. If $\pi = \pi_1$ we are done. If not, we define $\pi_2$ by choosing some zero $\pi_{s_2}$ of π where $(\pi_1)_{s_2} = 1$, and then proceed as above. If $\pi = \pi_1 \cdot \pi_2$, we are done. If not, we continue this procedure until all zeroes of π occur in at least one $\pi_i$. This construction is trivially valid for a general G.

Step 2. We now show that each $\pi_i$ is continuous. It is not difficult to show that there exists a $\gamma_i(\varepsilon)$ such that $(\gamma_i(\varepsilon))_s = 1$ if $(\pi_i)_s = 1$ and $(\gamma_i(\varepsilon))_{s_i} = \varepsilon$ (where we use the fact that $\gamma_{s_i}$ is independent of all elements $\gamma_s$ in the support of $\pi_i$, since otherwise $\pi_i$ would have a weak violation (cf. Definition I.5.1)). At this point one now needs the rather subtle argument in [7, Theorems 7.2 and 7.3] to
show that the fact that $\pi_i$ has a minimal set of zeroes implies that $\gamma_i(\varepsilon)$ is already unique, and furthermore that when $(\pi_i)_t = 0$ we have
\[ (\gamma_i(\varepsilon))_t = \varepsilon^{n_{it}} \quad \text{where} \quad 0 < n_{it} \in \mathbb{Q}. \tag{4.3} \]
It follows that
\[ \pi_i = \lim_{\varepsilon \to 0} \gamma_i(\varepsilon). \]
Since only the general structure of the defining equations has been used, step 2 is also valid for a general G.

Step 3. We have
\[ \pi = \pi_1 \cdot \pi_2 \cdots \pi_r = \lim_{\varepsilon \to 0} \gamma(\varepsilon) \]
where $\gamma(\varepsilon) = \gamma_1(\varepsilon) \cdot \gamma_2(\varepsilon) \cdots \gamma_r(\varepsilon)$ is positive for $\varepsilon > 0$. From Theorem I.3.1 we have $\gamma(\varepsilon) = da(\varepsilon)$, and hence π is continuous.

We will make some use (cf. Corollary 4.3) of the fact that
\[ a_j(\varepsilon) = \varepsilon^{n_j}; \quad 0 \le n_j \in \mathbb{Q}; \quad j \in G. \tag{4.4} \]
This follows from Eq. (4.3) and the formula
\[ a_j^N = \prod_{q=0}^{N-1} \gamma_{j,qj}; \quad j \in G; \quad N = |G|; \tag{4.5} \]
which is valid for $\gamma = da$, and which can be easily verified.

To apply Theorem 4.2 we need to be able to recognize the existence of violations. First of all, it is a triviality to check “$\gamma_{00} = \gamma_{0k}$”. Next, one would look for strong violations of higher-order identities and, in the real case, for negative sign invariants. These violations are thoroughly treated in Part I (cf. Theorem I.6.5 and Lemma I.6.17, Algorithm I.6.18). If these violations do not occur, we have $\gamma \sim \pi(\gamma)$, so it remains only to check if the projection $\pi(\gamma)$ violates weakly some higher-order identity. Since a complete list of all higher-order identities does not exist, the following trivial corollary of the proof of Theorem 4.2 (see especially Eq. (4.4)) provides a useful test.

Corollary 4.3. A G-graded projection π with zeroes is continuous if and only if a family of positive G-graded contractions $\gamma(\varepsilon)$; $\varepsilon > 0$; exists such that
\[ \pi = \lim_{\varepsilon \to 0} \gamma(\varepsilon) \]
where
\[ \gamma_t(\varepsilon) = \varepsilon^{n_t}; \quad 0 \le n_t \in \mathbb{Q}; \quad \text{where } n_t = 0 \text{ if } \pi_t = 1 \text{ and } n_t > 0 \text{ if } \pi_t = 0. \tag{4.6} \]

Remark 4.4. This test is, at least in principle, easy to apply. To construct such a $\gamma(\varepsilon)$ we select first from the support $S(\pi)$ of π a maximal set of $N' < N = |G|$ independent elements
\[ \{\gamma_{s_i} \mid i = 1, 2, \ldots, N';\ s_i \in S(\pi)\}. \]
We complete this set into a pseudobasis (cf. Sec. I.5 for details) by adding $(N - N')$ elements (which necessarily vanish for π)
\[ \{\gamma_{s_i} \mid i = N'+1, \ldots, N;\ s_i \notin S(\pi)\}. \]
Then we can express all $\gamma_t$ in terms of this pseudobasis to get
\[ \gamma_t = \prod_{i=1}^{N} \gamma_{s_i}^{r_{ti}}; \quad r_{ti} \in \mathbb{Q}; \tag{4.7} \]
with unique exponents $r_{ti}$. We now define our desired $\gamma(\varepsilon)$ by
\[ \gamma_{s_i}(\varepsilon) = 1; \quad i = 1, 2, \ldots, N'; \qquad \text{and} \qquad \gamma_{s_i}(\varepsilon) = \varepsilon^{n_i}; \quad 0 < n_i \in \mathbb{Q}; \quad i = N'+1, \ldots, N; \tag{4.8} \]
where the $n_i$ are arbitrary positive rational numbers. Equation (4.7) then yields
\[ \gamma_t(\varepsilon) = 1; \quad t \in S(\pi); \qquad \text{and} \qquad \gamma_t(\varepsilon) = \varepsilon^{m_t} \ \text{ where } \ m_t = \sum_{i=N'+1}^{N} r_{ti} n_i; \quad t \notin S(\pi). \tag{4.9} \]
If exponents $n_i > 0$; $i = N'+1, \ldots, N$; exist such that $m_t > 0$ for all $t \notin S(\pi)$, then π is continuous. If no such $n_i$ exist, then π is discrete.

Now how can we use this test to show that a given π is discrete? We treat first a special case, and then the general case.

Case (i). Let $r_{ti} \le 0$; $i = N'+1, \ldots, N$; for some $t \notin S(\pi)$. Then we have $m_t \le 0$ for any $n_i > 0$, so that π is discrete. Furthermore, Eq. (4.7) becomes
\[ \gamma_t \prod_{i=N'+1}^{N} \gamma_{s_i}^{|r_{ti}|} = \prod_{i=1}^{N'} \gamma_{s_i}^{r_{ti}} \tag{4.10} \]
which can be brought into the form $P_1 = P_2$ where $P_1$ contains some elements not in the support of π, whereas $P_2$ contains only elements in the support of π. Hence we get a higher-order identity “$P_1 = P_2$” which π clearly weakly violates.
Case (ii). To arrive at the general case, we note first that the structure in Eq. (4.10), which yields directly the violated higher-order identity, is not too far removed from the general case. Namely, any higher-order identity which is weakly violated by the given π must have the form
\[ \prod_{t \notin S(\pi)} \gamma_t^{p_t} = \prod_{i=1}^{N'} \gamma_{s_i}^{q_i}; \quad p_t \ge 0; \quad p_t, q_i \in \mathbb{Q}; \tag{4.11} \]
where at least one of the non-negative $p_t$ does not vanish. It follows from Eqs. (4.8), (4.9) and (4.11) that
\[ \prod_{t \notin S(\pi)} \gamma_t(\varepsilon)^{p_t} = \varepsilon^{\sum p_t m_t} = 1. \tag{4.12} \]
Since this is independent of the $n_i$, it follows from Eq. (4.9) that the exponents $p_t$ must satisfy
\[ \sum_{t \notin S(\pi)} p_t r_{ti} = 0; \quad i = N'+1, \ldots, N. \tag{4.13} \]
Conversely, if there exist $p_t \ge 0$ (with some $p_t > 0$) satisfying Eq. (4.13), it then follows from Eq. (4.7) that Eq. (4.11) holds, so that we indeed have a higher-order identity which is weakly violated by π.

Examples 4.5. We illustrate Remark 4.4 with two examples for $G = \mathbb{Z}_6$ where we know that exactly one higher-order identity “$P_1 = P_2$” exists, namely
\[ P_1(\gamma) = \gamma_{11}\gamma_{33}\gamma_{55}; \qquad P_2(\gamma) = \gamma_{13}\gamma_{15}\gamma_{35}. \]
Nevertheless, we show how this “$P_1 = P_2$” comes out naturally as described in Remark 4.4.

(i) Consider the $\mathbb{Z}_6$-graded projection π with
\[ \pi_{11} = \pi_{12} = \pi_{13} = \pi_{22} = \pi_{55} = 1; \quad \pi_t = 0 \text{ otherwise}. \]
Since the only surviving defining equation is $\gamma_{11}\gamma_{22} = \gamma_{12}\gamma_{13}$, the support $S(\pi)$ contains $N' = 4$ independent elements, which we can choose to be
\[ \{\gamma_{11}, \gamma_{12}, \gamma_{22}, \gamma_{55}\} \]
and which we complete into a pseudobasis by adding two appropriate elements, e.g. $\{\gamma_{00}, \gamma_{14}\}$.
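We now construct $\gamma(\varepsilon)$ as in Remark 4.4 by defining
\[ \gamma_{11}(\varepsilon) = \gamma_{12}(\varepsilon) = \gamma_{22}(\varepsilon) = \gamma_{55}(\varepsilon) = 1; \qquad \gamma_{00}(\varepsilon) = \varepsilon^{n_{00}} \quad \text{and} \quad \gamma_{14}(\varepsilon) = \varepsilon^{n_{14}}, \]
where $n_{00}, n_{14} > 0$. The easiest way to compute the remaining $\gamma_{jk}(\varepsilon)$ is to write $\gamma(\varepsilon) = da(\varepsilon)$. By a straightforward calculation one gets
\[ a_0(\varepsilon) = \varepsilon^{n_{00}}; \quad a_1^3(\varepsilon) = \varepsilon^{n_{14}}; \quad a_3^3(\varepsilon) = \varepsilon^{3n_{14}}; \quad a_2^3(\varepsilon) = a_5^3(\varepsilon) = \varepsilon^{2n_{14}}; \quad a_4^3(\varepsilon) = \varepsilon^{4n_{14}} \]
from which we can now trivially compute all $\gamma_{jk}(\varepsilon)$. As can be easily checked, non-positive powers of ε can only arise for elements $\gamma_{jk}$ with $j + k = 0$. They are
\[ \gamma_{15}(\varepsilon) = \frac{a_1(\varepsilon)a_5(\varepsilon)}{a_0(\varepsilon)} = \varepsilon^{n_{14}-n_{00}}, \quad \gamma_{24}(\varepsilon) = \frac{a_2(\varepsilon)a_4(\varepsilon)}{a_0(\varepsilon)} = \varepsilon^{2n_{14}-n_{00}}, \quad \gamma_{33}(\varepsilon) = \frac{a_3^2(\varepsilon)}{a_0(\varepsilon)} = \varepsilon^{2n_{14}-n_{00}}. \]
Since for $n_{14} > n_{00} > 0$ all exponents of ε which occur are positive, π is continuous.

(ii) Consider the $\mathbb{Z}_6$-graded projection π with
\[ \pi_{11} = \pi_{33} = \pi_{55} = 1; \quad \pi_t = 0 \text{ otherwise}. \]
We get a pseudobasis by adding to these $N' = 3$ independent elements $\{\gamma_{11}, \gamma_{33}, \gamma_{55}\}$ e.g. the three elements $\{\gamma_{00}, \gamma_{12}, \gamma_{23}\}$. Choosing ($n_{00}, n_{12}, n_{23} > 0$)
\[ \gamma_{11}(\varepsilon) = \gamma_{33}(\varepsilon) = \gamma_{55}(\varepsilon) = 1; \qquad \gamma_{00}(\varepsilon) = \varepsilon^{n_{00}}; \quad \gamma_{12}(\varepsilon) = \varepsilon^{n_{12}} \quad \text{and} \quad \gamma_{23}(\varepsilon) = \varepsilon^{n_{23}} \]
yields $\gamma(\varepsilon) = da(\varepsilon)$ where
\[ a_0(\varepsilon) = \varepsilon^{n_{00}}; \quad a_1^6(\varepsilon) = \varepsilon^{n_{00}+2n_{12}}; \quad a_2^3(\varepsilon) = \varepsilon^{n_{00}+2n_{12}}; \quad a_3^2(\varepsilon) = \varepsilon^{n_{00}}; \quad a_4^3(\varepsilon) = \varepsilon^{5n_{00}+4n_{12}-6n_{23}} \quad \text{and} \quad a_5^6(\varepsilon) = \varepsilon^{5n_{00}+4n_{12}-6n_{23}}. \]
This yields
\[ \gamma_{13}(\varepsilon) = \frac{a_1(\varepsilon)a_3(\varepsilon)}{a_4(\varepsilon)} = \varepsilon^{-n_{00}-n_{12}+2n_{23}}, \quad \gamma_{15}(\varepsilon) = \frac{a_1(\varepsilon)a_5(\varepsilon)}{a_0(\varepsilon)} = \varepsilon^{n_{12}-n_{23}}, \quad \gamma_{35}(\varepsilon) = \frac{a_3(\varepsilon)a_5(\varepsilon)}{a_2(\varepsilon)} = \varepsilon^{n_{00}-n_{23}}. \]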
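It is now easy to see that if we take $p_{13} = p_{15} = p_{35} = 1$ and all other $p_t = 0$, Eq. (4.12) becomes
$$\gamma_{13}(\varepsilon)\gamma_{15}(\varepsilon)\gamma_{35}(\varepsilon) = \varepsilon^0 = 1$$
and hence $\pi$ is discrete. Equation (4.11), which gives the higher-order identity violated by $\pi$, now becomes
$$\gamma_{13}\gamma_{15}\gamma_{35} = \gamma_{11}^{q_{11}}\gamma_{33}^{q_{33}}\gamma_{55}^{q_{55}}.$$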
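The exponent bookkeeping in Examples 4.5 can be checked mechanically. The following is a minimal sketch, assuming the convention $\gamma_{jk} = a_j a_k / a_{j+k}$ that is read off from the quotients displayed above; the function name `gamma_exponents` and the numerical sample values for $n_{00}$, $n_{12}$, $n_{14}$, $n_{23}$ are illustrative choices, not part of the original text.

```python
from fractions import Fraction as F

N = 6  # grading group Z_6

def gamma_exponents(e):
    """Map (j, k), j <= k, to the power of eps in gamma_jk = a_j a_k / a_{j+k},
    where a_j = eps**e[j] (assumed convention for gamma = da)."""
    return {(j, k): e[j] + e[k] - e[(j + k) % N]
            for j in range(N) for k in range(j, N)}

# Example (i): sample values with n14 > n00 > 0 (illustrative assumption).
n00, n14 = F(1), F(3)
g_i = gamma_exponents([n00, n14 / 3, 2 * n14 / 3, n14, 4 * n14 / 3, 2 * n14 / 3])
support_i = {p for p, x in g_i.items() if x == 0}   # elements with gamma_jk(eps) = 1

# Example (ii): sample values n00, n12, n23 > 0 (illustrative assumption).
m00, m12, m23 = F(2), F(3), F(5)
u, w = m00 + 2 * m12, 5 * m00 + 4 * m12 - 6 * m23
g_ii = gamma_exponents([m00, u / 6, u / 3, m00 / 2, w / 3, w / 6])
# Exponent of eps in the product gamma_13 gamma_15 gamma_35.
total = g_ii[(1, 3)] + g_ii[(1, 5)] + g_ii[(3, 5)]
```

In example (i) the elements with vanishing exponent are exactly the support of $\pi$ and all other exponents are positive, while in example (ii) the three exponents of $\gamma_{13}$, $\gamma_{15}$, $\gamma_{35}$ always sum to zero, so they cannot all be positive.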
Using $\gamma = da$, one easily solves to find $q_{11} = q_{33} = q_{55} = 1$, so that our test does indeed yield the expected “$P_1 = P_2$”.

5. Discrete Graded Contractions

In this section, we study discrete $G$-graded contractions $\gamma$, especially with respect to the implications for $L_\gamma$. Our results make it rather unlikely that discrete graded contractions are a useful tool. Recall that a graded contraction is discrete if and only if it has some violations (cf. Theorem 4.2).

The easiest way to produce a discrete $\gamma$ is by violating “$\gamma_{00} = \gamma_{0k}$”. In part A, we show how to construct all such $\gamma$’s. For this violation, which occurs for all $G$, we have $L_\gamma = L_\gamma^{(1)} \oplus L_\gamma^{(2)}$ where (in contrast to the continuous case) both $L_\gamma^{(1)}$ and $L_\gamma^{(2)}$ are non-zero (cf. Theorem 3.4). This severe “cutting” of $L$ into two independent substructures means that the link between $L$ and $L_\gamma$ is rather loose, which makes useful applications questionable. To show this, we look at four typical examples for $G = \mathbb{Z}_2$, $\mathbb{Z}_3$, $\mathbb{Z}_2 \times \mathbb{Z}_2$ and $\mathbb{Z}_6$ (cf. Examples 5.3). We know that $L_\gamma^{(2)}$ is nilpotent. In our examples $L_\gamma^{(2)}$ is Abelian in three cases and nilpotent of class 2 in one case. We know that $L_\gamma^{(1)}$ is the semidirect sum of its subalgebra $V_S$, where $V_0 \subset V_S$, and its nilpotent ideal $V_I^{(1)}$. In our examples, we have twice $V_S = V_0$ and $V_I^{(1)} = \emptyset$, so that $\gamma$ essentially restricts $L$ to this subalgebra (an operation which does not require any additional formalism). In the remaining two cases $V_S$ is larger than $V_0$ (we explicitly looked for such cases) and $V_I^{(1)}$ is either $\emptyset$ or Abelian.

The real $\mathbb{Z}_{2M}$-graded contraction $db$ with its negative sign invariants of the second kind also offers an easy way to produce real discrete $\gamma$’s. Namely, just multiply any real continuous $\mathbb{Z}_{2M}$-graded contraction by $db$ (cf. part C for details and the general case). We know (cf. Remark I.3.6 and Example I.3.7) that $db$ can (at most) separate different real forms of a complex Lie algebra. But contractions are not helpful in studying real forms.
It is less straightforward to produce discrete γ’s which violate some higher-order identity “P1 = P2 ” weakly or strongly (cf. part B), or to find real γ’s with some negative sign invariant of the first kind (cf. part C). Both cases only occur for
$|G| \ge 6$. For $G = \mathbb{Z}_6$, the only sign invariant of the first kind which can be negative stems directly from the only higher-order identity which exists (cf. Example 5.6(i)). In both cases, we find the same effect which already appears in part A. Namely, such a violation enforces on $\gamma$ so many zeroes that the link between $L$ and $L_\gamma$ is again rather loose, and hence the usefulness of these discrete graded contractions for applications is at best doubtful.

Definition 5.1. We call a complex (resp. real) $G$-graded contraction $\gamma$ strongly discrete if (i) $\gamma$ has a strong violation of some higher-order identity, and/or, in the real case, (ii) $\gamma$ has a negative sign invariant. We call $\gamma$ weakly discrete if it is discrete but not strongly discrete (i.e. if it has no strong violations and, in the real case, no negative sign invariants, but it does have a weak violation of “$\gamma_{00} = \gamma_{0k}$” and/or of some higher-order identity).

Remark 5.2. It follows immediately from our classification in Part I that $\gamma$ is strongly discrete if and only if $\gamma \not\sim \pi(\gamma)$. Hence, $\gamma$ is weakly discrete if and only if $\gamma \sim \pi(\gamma)$ and $\pi(\gamma)$ is (weakly) discrete.

Part A. G-graded contractions which violate “$\gamma_{00} = \gamma_{0k}$”

Let $\gamma$ be a $G$-graded contraction violating “$\gamma_{00} = \gamma_{0k}$”. If $\gamma$ is weakly discrete, we have $\gamma \sim \pi(\gamma)$. Since “$\gamma_{00} = \gamma_{0k}$” can only be weakly violated, $\gamma$ being strongly discrete just means that there are also strong violations in addition to the fact that $\pi(\gamma)$ violates “$\gamma_{00} = \gamma_{0k}$”. The effect of these additional violations will be treated in Parts B and C. In the following we therefore assume that we have some $\gamma = \pi(\gamma)$ violating “$\gamma_{00} = \gamma_{0k}$”. For such a $\gamma$, we have necessarily $0 \in I^{(1)}$ and $I^{(2)} \neq \emptyset$, so that Lemmas 2.4, 2.5 and Remark 2.6 apply. Lemma 3.4 describes the structure of $L_\gamma$.

We now show how to construct all these $\gamma$’s. In a first step we choose $I^{(1)}$ (cf. Definition 2.2) arbitrarily under the sole condition that $0 \in I^{(1)}$ and $I^{(2)} \neq \emptyset$. Then we have (cf. Eq. (2.7))
$$\gamma_{jk} = 0 \quad \text{whenever} \quad \{j, k, j + k\} \not\subset I^{(i)} \quad \text{for both } i = 1 \text{ and } i = 2.$$
This tears $L = (V, \mu)$ into two separate pieces (cf. Lemma 3.2). In a second step, we choose the subspace $V_S$ (cf. Definition 2.3) either as $V_0$ or, if possible (cf. Lemma 2.4 and Remark 2.6), larger than $V_0$. Then we know (cf. Lemma 3.4)
$$\gamma_{jk} = 1 \quad \text{if} \quad V_j \subset V_S;\ V_k \subset V^{(1)} \quad \text{or} \quad V_k \subset V_S;\ V_j \subset V^{(1)}.$$
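The first step can be sketched for a cyclic grading group $G = \mathbb{Z}_n$. The sketch below rests on two assumptions that should be checked against Definition 2.2 and Eq. (2.7): that $I^{(2)}$ is the complement of $I^{(1)}$ in $G$, and that $\gamma_{jk}$ is forced to vanish exactly when the triple $\{j, k, j+k\}$ lies in neither index set. The function name `forced_zeros` is hypothetical.

```python
from itertools import combinations_with_replacement

def forced_zeros(n, i1):
    """Index pairs (j, k), j <= k, of Z_n whose triple {j, k, j+k} is contained
    in neither I1 nor its complement I2 (assumed reading of Eq. (2.7))."""
    i1 = set(i1)
    i2 = set(range(n)) - i1          # assumption: I2 is the complement of I1
    zeros = []
    for j, k in combinations_with_replacement(range(n), 2):
        triple = {j, k, (j + k) % n}
        if not (triple <= i1 or triple <= i2):
            zeros.append((j, k))
    return zeros
```

For instance, `forced_zeros(6, {0, 1, 3, 4})` returns the fourteen pairs 02, 05, 11, 12, 14, 15, 22, 23, 24, 25, 35, 44, 45, 55, which is the list of zeroes quoted for $G = \mathbb{Z}_6$ in Examples 5.3(iv); the remaining elements are those fixed or left free by the later steps.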
In the third and fourth steps, we have to put enough zeroes into the set of elements (cf. Lemma 3.4)
$$\{\gamma_{jk} \mid V_j, V_k, V_{j+k} \subset V_I^{(i)}\}; \qquad \text{for } i = 1 \text{ and } i = 2;$$
i.e. we have to solve Eq. (2.11) plus all further defining equations in which one or more of these elements occur. Then $L_\gamma^{(i)} = (V^{(i)}, \mu_\gamma)$; $i = 1, 2$; are indeed Lie algebras. We already know (cf. Lemma 2.5) that
$$\gamma_{jk} = 0; \qquad V_j, V_k, V_{j+k} \subset V_I^{(i)}; \qquad i = 1 \text{ or } i = 2;$$
is always possible. Note that the last two steps are completely independent of each other. We give four examples for this construction.

Examples 5.3. Weakly discrete $G$-graded contractions of the form $\gamma = \pi(\gamma)$ which violate “$\gamma_{00} = \gamma_{0k}$”.

(i) Let $G = \mathbb{Z}_2$. Then we must have $I^{(1)} = \{0\}$ and $I^{(2)} = \{1\}$, which yields
$$\gamma_{00} = 1; \qquad \gamma_{01} = \gamma_{11} = 0$$
so that $V_S = V_0$, $V_I = V_I^{(2)} = V_1$. This means that $L_\gamma^{(1)} = (V_0, \mu)$ while $L_\gamma^{(2)}$ is Abelian. Therefore $\gamma$ “projects” any $\mathbb{Z}_2$-graded Lie algebra $L = (V_0 \oplus V_1, \mu)$ onto its subalgebra $(V_0, \mu)$. (This possibility, namely $\gamma_{00} = 1$, $\gamma_{jk} = 0$ otherwise, exists for all $G$.)

(ii) Let $G = \mathbb{Z}_3$. Choose $I^{(1)} = \{0\}$, $I^{(2)} = \{1, 2\}$, which yields
$$\gamma_{00} = 1; \qquad \gamma_{01} = \gamma_{02} = \gamma_{12} = 0$$
so that $V_S = V_0$ and $V_I = V_I^{(2)} = V_1 \oplus V_2$. For $(V_I^{(2)}, \mu_\gamma)$, we have to satisfy (cf. Eq. (2.11), for $j = l = 1$)
$$\gamma_{11}\gamma_{22} = 0.$$
Choose e.g. $\gamma_{11} = 1$, $\gamma_{22} = 0$. This means that $\gamma$ “projects” any $\mathbb{Z}_3$-graded Lie algebra $L = (V_0 \oplus V_1 \oplus V_2, \mu)$ onto the direct sum of its subalgebra $(V_0, \mu)$ and the substructure $(V_1 \oplus V_2, \mu_\gamma)$ where all Lie products vanish except for
$$\mu_\gamma(V_1, V_1) = \mu(V_1, V_1) \subset V_2,$$
so that $L_\gamma^{(2)}$ is nilpotent of class 2.

(iii) Let $G = \mathbb{Z}_2 \times \mathbb{Z}_2$. Choose $I^{(1)} = \{(0, 0), (0, 1)\}$, $I^{(2)} = \{(1, 0), (1, 1)\}$, which yields
$$\gamma_{00,00} = \gamma_{00,01} = 1; \qquad \gamma_{00,10} = \gamma_{00,11} = \gamma_{01,10} = \gamma_{01,11} = \gamma_{10,10} = \gamma_{10,11} = \gamma_{11,11} = 0.$$
Therefore, $\gamma_{j,-j} = 0$ for $j = (1, 0)$ and $j = (1, 1)$. But we are free to choose
$$\gamma_{01,01} = 1$$
(Lemma 2.4 is satisfied) so that
$$V_S = V_{00} \oplus V_{01}, \qquad V_I = V_I^{(2)} = V_{10} \oplus V_{11}.$$
This means that $L_\gamma^{(1)} = (V_S, \mu)$ while $L_\gamma^{(2)}$ is Abelian, so that $\gamma$ “projects” any $\mathbb{Z}_2 \times \mathbb{Z}_2$-graded Lie algebra $L = (V, \mu)$ onto its subalgebra $(V_S, \mu)$.

(iv) Let $G = \mathbb{Z}_6$. Choose $I^{(1)} = \{0, 1, 3, 4\}$, $I^{(2)} = \{2, 5\}$, which yields
$$\gamma_{00} = \gamma_{01} = \gamma_{03} = \gamma_{04} = 1$$
and
$$\gamma_{02} = \gamma_{05} = \gamma_{11} = \gamma_{12} = \gamma_{14} = \gamma_{15} = \gamma_{22} = \gamma_{23} = \gamma_{24} = \gamma_{25} = \gamma_{35} = \gamma_{44} = \gamma_{45} = \gamma_{55} = 0.$$
Therefore $\gamma_{j,-j} = 0$ for $j = 1$ and $j = 2$. But we are free to choose (cf. Remark 2.6) $\gamma_{33} = 1$, which yields
$$V_S = V_0 \oplus V_3; \qquad V_I^{(1)} = V_1 \oplus V_4; \qquad V_I^{(2)} = V_2 \oplus V_5.$$
Since the remaining two elements $\gamma_{13}$ and $\gamma_{34}$ operate in $\mu_\gamma(V_S, V_I^{(1)})$, where no zeroes occur, we have
$$\gamma_{13} = \gamma_{34} = 1.$$
Therefore we get $L_\gamma^{(1)} = (V_S \oplus V_I^{(1)}, \mu_\gamma)$ where
$$\mu_\gamma(V_S, V_S) = \mu(V_S, V_S) \subset V_S; \qquad \mu_\gamma(V_S, V_I^{(1)}) = \mu(V_S, V_I^{(1)}) \subset V_I^{(1)} \qquad \text{and} \qquad \mu_\gamma(V_I^{(1)}, V_I^{(1)}) = 0,$$
while $L_\gamma^{(2)}$ is Abelian. (If we choose $\gamma_{33} = 0$, we get $V_S = V_0$, $V_I^{(1)} = V_1 \oplus V_3 \oplus V_4$, $V_I^{(2)} = V_2 \oplus V_5$ and we need (cf. Eq. (2.11) for $j = 3$, $l = 1$) $\gamma_{13}\gamma_{34} = 0$, so that $L_\gamma^{(1)}$ becomes even “more Abelian”.)

Part B. G-graded contractions which violate a higher-order identity “$P_1 = P_2$”

We will see that a violation of a higher-order identity “$P_1 = P_2$”, weakly or strongly, forces $\gamma$ to have a large number of zeroes. If the defining equations yield an identity of the form
$$P(\gamma)P_1(\gamma) = P(\gamma)P_2(\gamma)$$
where $P(\gamma)$ is some product of elements of $\gamma$, we will call $P(\gamma)$ a “zipper product” for “$P_1 = P_2$”. If $P$ has only one element, we will call this element a “zipper element”. Thus a violation of “$P_1 = P_2$” requires that all zipper products vanish. This, together with the defining equations, will enforce more zeroes still. [That such zipper products always exist can be seen as follows. For a given “$P_1 = P_2$” we can rewrite both sides for $\gamma$’s without zeroes by using for all elements which occur their unique basis expansion (cf. e.g. Lemma I.A.1 for the natural basis for $G = \mathbb{Z}_N$ and Lemma I.A.4 for the natural basis for a general $G$). If we perform this calculation without using any denominators, it remains valid for $\gamma$’s with zeroes as well. But in this case we only get an identity of the form $P(\gamma)P_1(\gamma) = P(\gamma)P_2(\gamma)$.]

Example 5.4. We have shown (cf. Examples I.4.6) that for the smallest higher-order identity “$P_1 = P_2$”, $P_1$ and $P_2$ have the general structure ($j_i, k_i \in G$; $i = 1, 2, 3$)
$$P_1(\gamma) = \gamma_{j_1k_1}\gamma_{j_2k_2}\gamma_{j_3k_3}; \qquad P_2(\gamma) = \gamma_{j_2k_1}\gamma_{j_3k_2}\gamma_{j_1k_3}$$
where all elements which occur are pairwise incompatible with
$$s_1 = j_1 + k_1 = j_3 + k_2; \qquad s_2 = j_2 + k_2 = j_1 + k_3; \qquad s_3 = j_3 + k_3 = j_2 + k_1.$$
Then the elements
$$\{\gamma_{s_1j_2};\ \gamma_{s_2j_3};\ \gamma_{s_3j_1};\ \gamma_{s_1k_3};\ \gamma_{s_2k_1};\ \gamma_{s_3k_2}\}$$
are zipper elements. [The proof for $\gamma_{s_1j_2}$, e.g., goes like this. Combining the three defining equations
$$\gamma_{s_1j_2}\gamma_{j_1k_1} = \gamma_{j_2k_1}\gamma_{s_3j_1}, \qquad \gamma_{s_3j_1}\gamma_{j_3k_3} = \gamma_{j_1k_3}\gamma_{s_2j_3}, \qquad \gamma_{s_2j_3}\gamma_{j_2k_2} = \gamma_{j_3k_2}\gamma_{s_1j_2}$$
yields $\gamma_{s_1j_2}P_1(\gamma) = \gamma_{s_1j_2}P_2(\gamma)$.] If $\gamma$ violates “$P_1 = P_2$”, we therefore know that
$$\gamma_{s_1j_2} = \gamma_{s_2j_3} = \gamma_{s_3j_1} = \gamma_{s_1k_3} = \gamma_{s_2k_1} = \gamma_{s_3k_2} = 0. \tag{5.1}$$
These zeroes enforce further zeroes. If we assume, e.g., $P_1(\gamma) \neq 0$, we get
$$\gamma_{s_1s_2} = \gamma_{s_1s_3} = \gamma_{s_2s_3} = 0 \tag{5.2}$$
[since, e.g., $\gamma_{j_1k_1}\gamma_{s_1s_2} = \gamma_{s_2k_1}\gamma_{j_1,s_2+k_1}$ together with $\gamma_{j_1k_1} \neq 0$ and $\gamma_{s_2k_1} = 0$ yields $\gamma_{s_1s_2} = 0$]. Furthermore, we must have
$$\gamma_{s_1,-j_3} = \gamma_{s_2,-j_1} = \gamma_{s_3,-j_2} = \gamma_{s_1,-k_2} = \gamma_{s_2,-k_3} = \gamma_{s_3,-k_1} = 0 \tag{5.3}$$
[since, e.g., ($s_1 - j_3 = k_2$) $\gamma_{s_1,-j_3}\gamma_{j_2k_2} = \gamma_{s_1j_2}\gamma_{-j_3,s_1+j_2}$ together with $\gamma_{j_2k_2} \neq 0$ and $\gamma_{s_1j_2} = 0$ yields $\gamma_{s_1,-j_3} = 0$]. Now we look at a concrete example of this type.

Example 5.5. Such a higher-order identity “$P_1 = P_2$” with three factors first occurs for $G = \mathbb{Z}_6$ (cf. Examples I.4.6) and it looks like
$$P_1(\gamma) = \gamma_{11}\gamma_{33}\gamma_{55}; \qquad P_2(\gamma) = \gamma_{13}\gamma_{15}\gamma_{35}.$$
We will show that $\gamma$ can violate “$P_1 = P_2$” in exactly four ways (cf. (i)–(iv) below). If a $\mathbb{Z}_6$-graded contraction $\gamma$ violates “$P_1 = P_2$”, all zipper elements (cf. Eq. (5.1)) have to vanish, i.e.
$$\gamma_{01} = \gamma_{05} = \gamma_{14} = \gamma_{23} = \gamma_{25} = \gamma_{34} = 0.$$
Assume first $P_1(\gamma) \neq 0$. Then we must have the following additional zeroes (cf. Eqs. (5.2) and (5.3))
$$\gamma_{02} = \gamma_{04} = \gamma_{12} = \gamma_{24} = \gamma_{45} = 0.$$
Since $\gamma_{33} \neq 0$ we have (cf. Eq. (2.7)) $\gamma_{00} = \gamma_{03}$. In the case of $\gamma_{00} = \gamma_{03} \neq 0$, we must have (cf. Eq. (2.7))
$$\gamma_{13} = \gamma_{15} = \gamma_{35} = \gamma_{44} = 0.$$
The defining equation $\gamma_{11}\gamma_{22} = \gamma_{12}\gamma_{13}$ finally yields $\gamma_{22} = 0$. In the case of $\gamma_{00} = \gamma_{03} = 0$, the defining equations
$$\gamma_{12}\gamma_{13} = \gamma_{11}\gamma_{22}; \qquad \gamma_{45}\gamma_{35} = \gamma_{55}\gamma_{44}$$
yield $\gamma_{22} = \gamma_{44} = 0$,
while the elements $\gamma_{13}$, $\gamma_{15}$, $\gamma_{35}$ can be chosen arbitrarily. The case $P_2(\gamma) \neq 0$ can be treated similarly. Altogether $\gamma$ can violate “$P_1 = P_2$” exactly in the following four ways.
(i) $P_1(\gamma) \neq 0$; $P_2(\gamma) = 0$; $\gamma_{jk} = 0$ otherwise,
(ii) $P_1(\gamma) = 0$; $P_2(\gamma) \neq 0$; $\gamma_{jk} = 0$ otherwise,
(iii) $P_1(\gamma) \neq 0$; $\gamma_{00} = \gamma_{03} \neq 0$; $\gamma_{jk} = 0$ otherwise,
(iv) $0 \neq P_1(\gamma) \neq P_2(\gamma) \neq 0$; $\gamma_{jk} = 0$ otherwise.
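The zero pattern derived in Examples 5.4 and 5.5 can be reproduced by direct enumeration for $G = \mathbb{Z}_6$. The sketch below is illustrative only: it assumes that the defining equations have the form $\gamma_{jk}\gamma_{j+k,l} = \gamma_{kl}\gamma_{j,k+l}$ with symmetric $\gamma$ (inferred from the instances quoted in the text, such as $\gamma_{11}\gamma_{22} = \gamma_{12}\gamma_{13}$), and it uses the sample value $\alpha = 2$ for $\gamma_{11}$ in case (iv); the helper names are hypothetical.

```python
N = 6
j1, j2, j3 = 1, 3, 5          # P1 = g11 g33 g55
k1, k2, k3 = 1, 3, 5          # P2 = g13 g15 g35

def pair(j, k):
    """Unordered index pair labelling gamma_jk in Z_6 (gamma is symmetric)."""
    j, k = j % N, k % N
    return (min(j, k), max(j, k))

s1, s2, s3 = (j1 + k1) % N, (j2 + k2) % N, (j3 + k3) % N

zippers = {pair(s1, j2), pair(s2, j3), pair(s3, j1),
           pair(s1, k3), pair(s2, k1), pair(s3, k2)}            # Eq. (5.1)
eq52 = {pair(s1, s2), pair(s1, s3), pair(s2, s3)}               # Eq. (5.2)
eq53 = {pair(s1, -j3), pair(s2, -j1), pair(s3, -j2),
        pair(s1, -k2), pair(s2, -k3), pair(s3, -k1)}            # Eq. (5.3)
extra = (eq52 | eq53) - zippers                                 # additional zeroes

def satisfies_defining_equations(g):
    """Check g_jk g_{j+k,l} == g_kl g_{j,k+l} for all j, k, l (assumed form)."""
    return all(g[pair(j, k)] * g[pair(j + k, l)] == g[pair(k, l)] * g[pair(j, k + l)]
               for j in range(N) for k in range(N) for l in range(N))

# Case (iv): all six factors of P1 and P2 non-zero, everything else zero.
all_pairs = [(j, k) for j in range(N) for k in range(j, N)]
case_iv = {p: 0 for p in all_pairs}
case_iv.update({(1, 1): 2, (3, 3): 1, (5, 5): 1, (1, 3): 1, (1, 5): 1, (3, 5): 1})
n_zeros_iv = sum(1 for v in case_iv.values() if v == 0)
```

Under these assumptions `zippers` and `extra` are the two lists of zeroes quoted in Example 5.5, and `case_iv` satisfies every defining equation although $P_1 = 2 \neq 1 = P_2$, with 15 of the 21 elements equal to zero.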
Cases (i) and (ii) have a weak violation of “$P_1 = P_2$”, case (iii) violates weakly “$P_1 = P_2$” and “$\gamma_{00} = \gamma_{0k}$”, and case (iv) finally violates “$P_1 = P_2$” strongly. A strong violation therefore forces 15 of the 21 elements of $\gamma$ to vanish, a weak violation between 16 and 18. In contrast, a continuous $\mathbb{Z}_6$-graded contraction with zeroes has a minimal number of 6 zeroes (namely, $\gamma_{11} = \gamma_{13} = \gamma_{15} = \gamma_{33} = \gamma_{35} = \gamma_{55} = 0$; $\gamma_{jk} \neq 0$ otherwise) if $\gamma_{00} \neq 0$ and of 12 zeroes if $\gamma_{00} = 0$ (cf. Lemma 2.10). In the three cases (i), (ii) and (iv), $L_\gamma$ is nilpotent of class 2 (cf. Lemma 3.8). In case (iii), we have
$$L_\gamma = (V = \oplus_{j=0}^{5} V_j, \mu_\gamma) = L_\gamma^{(1)} \oplus L_\gamma^{(2)} \oplus L_\gamma^{(3)}$$
where $L_\gamma^{(1)} = (V_0 \oplus V_3, \mu_\gamma)$ is a $\mathbb{Z}_2$-graded surviving subalgebra of $L$ since
$$\mu_\gamma(V_0, V_0) = \gamma_{00}\,\mu(V_0, V_0) \subset V_0; \qquad \mu_\gamma(V_0, V_3) = \gamma_{00}\,\mu(V_0, V_3) \subset V_3; \qquad \mu_\gamma(V_3, V_3) = \gamma_{33}\,\mu(V_3, V_3) \subset V_0$$
and where
$$L_\gamma^{(2)} = (V_1 \oplus V_2, \mu_\gamma) \quad \text{and} \quad L_\gamma^{(3)} = (V_4 \oplus V_5, \mu_\gamma)$$
are both nilpotent of class 2.

Part C. Real G-graded contractions with a negative sign invariant

The only remaining case of a discrete graded contraction $\gamma$ occurs if $\gamma$ is real and some sign invariant is negative for $\gamma$. In the following, we study first such $\gamma$’s without zeroes and then those with zeroes.

Real γ’s without zeroes

In Lemma I.3.4 and Theorem I.3.5, all equivalence classes for real $\gamma$’s without zeroes are determined. All sign invariants of the first kind have to be positive (cf. Lemma I.4.11 and Definition I.4.12). Sign invariants $\operatorname{sgn} P(\gamma)$ of the second kind only exist for each factor $\mathbb{Z}_{2M}$ ($M = 1, 2, \ldots$) of $G$, e.g., $P(\gamma) = \gamma_{00}\gamma_{MM}$, and they all have to agree (cf. Lemma I.4.14).
Consider first G = Z2M . We have two equivalence classes with representatives 1 and db where (db)00 (db)MM = −1 (cf. Lemma I.3.4 and Example I.2.7(iii)). Therefore all real γ’s with negative sign invariants must be of the form γ ∼ db. The generalization to an arbitrary G with more than one factor Z2M is straightforward. For G = Z2M1 × Z2M2 (Mi = 1, 2, . . . ; i = 1, 2), e.g., γ’s with negative sign invariants must have one of the following forms γ ∼ db ⊗ 1;
γ ∼ 1 ⊗ db;
γ ∼ db ⊗ db.
Real γ’s with zeroes Now we study real γ’s with zeroes. Recall first that we can find all independent sign invariants which survive for a given support in the following way. First choose N quasi-independent elements and then from these N elements N < N independent elements according to Lemma I.6.9. Then we know (cf. Lemma I.6.17) that exactly J = Q + J
where Q = N − N
independent sign invariants exist for this support which can take on arbitrarily the values ±1 (cf. Lemma I.6.16). Q of them stem from higher-order identities and are of the first kind. The remaining J sign invariants belong to the N elements alone, they follow from Algorithm I.6.18. They are of the first kind if they satisfy Lemma I.4.11 (resp. Eq. (5.4) below), otherwise of the second kind. We know that sign invariants of the second kind only occur for each factor Z2M (M = 1, 2, . . .) of G and that they can be constructed in general (cf. Examples I.4.17). To produce a sign invariant of the first kind we can take the product of two sign invariants of the second kind which belong to the same subgroups of G (cf. Remark I.4.16(ii) resp. Example 5.6(ii) below) or the product P1 (γ)P2 (γ) of some higher-order identity “P1 = P2 ” (cf. Remark I.4.16(iii)). Furthermore, any dependence relation between elements of a γ without zeroes where at least one element occurs with an even power (different from zero) and one with an odd power yields a sign invariant of the first kind (cf. Example I.4.9(i)). Case 1. Assume first that all sign invariants of the first kind which survive for γ are positive. Then some sign invariant of the second kind must be negative for γ. Consider first the case G = Z2M . All surviving sign invariants of the second kind must be negative for γ since otherwise the product of two “contradicting” ones would yield a sign invariant of the first kind (cf. Remark I.4.16(ii)) which is
negative. Since all these sign invariants are negative for $db$ too (cf. above), we know that $\gamma \cdot db = \gamma'$ no longer has any negative sign invariants at all. Therefore $\gamma = db \cdot \gamma'$, where $\gamma'$ has at most violations of “$\gamma_{00} = \gamma_{0k}$” and/or of some “$P_1 = P_2$”, but where $\operatorname{sgn}(P_1(\gamma')P_2(\gamma')) \ge 0$. Therefore it is enough to consider $db$ in addition to our results in Parts A and B. The generalization to an arbitrary $G$ with more than one factor $\mathbb{Z}_{2M}$ is straightforward.

Case 2. Finally we assume that one sign invariant $\operatorname{sgn} P(\gamma)$ of the first kind is negative for $\gamma$. Then we know (cf. Lemma I.4.11) that, with respect to an arbitrary basis $\{\gamma_{s_i} \mid s_i \in G \times G;\ i = 1, 2, \ldots, N = |G|\}$, evaluating $P(\gamma)$ for a $\gamma$ without zeroes gives
$$P(\gamma) = \prod_{i=1}^{N} \gamma_{s_i}^{m_i} \quad \text{where all } m_i \text{ are even.} \tag{5.4}$$
Since we must have $\operatorname{sgn} P(\gamma) = +1$ for all $\gamma$’s without zeroes, we certainly need zeroes to allow $\operatorname{sgn} P(\gamma) = -1$. Rewrite Eq. (5.4) without negative powers as (after renumbering if necessary)
$$P(\gamma) \prod_{i=1}^{r} \gamma_{s_i}^{|m_i|} = \prod_{i=r+1}^{N} \gamma_{s_i}^{|m_i|}; \qquad 0 \le r < N. \tag{5.5}$$
For a real $\gamma$ with zeroes we can obviously only have $\operatorname{sgn} P(\gamma) = -1$ if either

(α) Eq. (5.5) does not hold for $\gamma$. Then Eq. (5.5) must represent a higher-order identity (cf. Definition I.4.1) which is weakly or strongly violated by $\gamma$, or

(β) Eq. (5.5) does hold for $\gamma$. Then we must have
$$\gamma_{s_i} = 0 \quad \text{for some } i \in \{1, 2, \ldots, r\}, \ \text{i.e. } r \ge 1,$$
and
$$\gamma_{s_i} = 0 \quad \text{for some } i \in \{r + 1, \ldots, N\}.$$
(Even in this case, Eq. (5.5) may still represent a higher-order identity which is just not violated by the given $\gamma$ (cf. Example 5.6(ii)).)

And condition (α) or (β) must hold for all bases! More generally still, whenever we can express $P(\gamma)$ by squares of elements only (no matter if these elements belong to a basis or not) we can draw exactly the same conclusion as above, i.e. that $P(\gamma)$ either stems from some “$P_1 = P_2$” (which always requires a lot of zeroes (cf. Part B)) or that we can
deduce the existence of at least two zeroes. And the collection of all these zeroes together will in turn enforce further zeroes via the defining equations. We give two examples.

Examples 5.6. (i) For all $G$ with $|G| \le 6$ there is only one sign invariant of the first kind which can be negative, namely the one which stems from the only higher-order identity which exists for all these $G$. It is
$$P(\gamma) = \gamma_{11}\gamma_{13}\gamma_{15}\gamma_{33}\gamma_{35}\gamma_{55}$$
for the real $\mathbb{Z}_6$-graded $\gamma$ with
$$\operatorname{sgn} P(\gamma) = -1; \qquad \gamma_{jk} = 0 \ \text{otherwise}.$$

(ii) Consider the sign invariant of the first kind for $G = \mathbb{Z}_{2M}$ ($M = 3, 4, \ldots$)
$$P(\gamma) = \gamma_{jj}\gamma_{M+j,M+j}\gamma_{kk}\gamma_{M+k,M+k}; \qquad 0 < j < k < M;$$
which is the product of two sign invariants of the second kind (cf. Examples I.4.17). If all four elements are pairwise incompatible (i.e. if $k \neq 2j$ and $2k \neq M + j$) we can have $\operatorname{sgn} P(\gamma) = -1$. In this case $P(\gamma)$ stems from a higher-order identity “$P_1 = P_2$” where
$$P_1(\gamma) = \gamma_{jj}\gamma_{kk}\gamma_{M+j,M+k}^2; \qquad P_2(\gamma) = \gamma_{M+j,M+j}\gamma_{M+k,M+k}\gamma_{jk}^2$$
which can be arbitrarily violated. If $\gamma_{jk} = \gamma_{M+j,M+k} = 0$ we have case (β) above, otherwise case (α). In contrast, if $k = 2j$ or $2k = M + j$ the surviving defining equations enforce $\operatorname{sgn} P(\gamma) = +1$.

6. Graded Contractions Versus Contractions I

In this section we start our comparison of the two notions “graded contractions” and “contractions” of a finite-dimensional complex (resp. real) Lie algebra $L = (V, \mu)$. We prove that continuous graded contractions are equivalent to a proper subset of contractions (cf. Theorem 6.9), where equivalence is defined in Definition 6.7. Then we show that discrete graded contractions are in general not equivalent to any contraction.
6.1. Contractions

We start with a few basic facts on contractions. For further details cf. [3, 4, 6, 8].

Definition 6.1 (cf. [8, Definitions 2.2 and 2.3]). Let $T(\varepsilon) \in \operatorname{Aut}(V)$, $0 < \varepsilon \le 1$, be a family of non-singular linear maps. Then the Lie algebras
$$L_{T(\varepsilon)} = (V, \mu_{T(\varepsilon)}); \qquad \varepsilon > 0;$$
where ($x, y \in V$)
$$\mu_{T(\varepsilon)}(x, y) = T^{-1}(\varepsilon)\,\mu(T(\varepsilon)x, T(\varepsilon)y) \tag{6.1}$$
are equivalent to $L = (V, \mu)$. If the limit
$$\mu_T(x, y) = \lim_{\varepsilon \to 0} \mu_{T(\varepsilon)}(x, y) \tag{6.2}$$
exists for all $x, y \in V$, then $\mu_T$ is a Lie product and the Lie algebra $L_T = (V, \mu_T)$ is called the contraction of $L$ by $T(\varepsilon)$, in short,
$$L \xrightarrow{T(\varepsilon)} L_T.$$
Similarly one can define a sequential contraction.

Definition 6.2 (cf. [8, Definition 2.6]). A contraction $L \xrightarrow{T(\varepsilon)} L_T$ is called a generalized Inönü–Wigner contraction (gen. IW-contraction) if the matrix of $T(\varepsilon)$ has the form, with respect to some basis $e_1, e_2, \ldots, e_{\dim V}$ of $V$,
$$T(\varepsilon)_{ij} = \delta_{ij}\,\varepsilon^{n_j}; \qquad n_j \in \mathbb{R}; \qquad \varepsilon > 0; \qquad i, j = 1, 2, \ldots, \dim V. \tag{6.3}$$
If some powers $n_j$ are 0 and all others are 1, we speak of a simple IW-contraction. The necessary and sufficient condition for $T(\varepsilon)_{ij} = \delta_{ij}\,\varepsilon^{n_j}$ to define a contraction of $L = (V, \mu)$ is (cf. [8, Remark 2.6])
$$\mu(e_i, e_j) = \sum_{n_k \le n_i + n_j} C_{ij}^k e_k. \tag{6.4}$$
We then get for $L_T = (V, \mu_T)$
$$\mu_T(e_i, e_j) = \sum_{n_k = n_i + n_j} C_{ij}^k e_k. \tag{6.5}$$

Definition 6.3 (cf. [8, Definition 2.4]). Two contractions $L \xrightarrow{T(\varepsilon)} L_T$ and $L' \xrightarrow{S(\varepsilon)} L'_S$ with $L \cong L'$ are called equivalent if $L_T \cong L'_S$.

Theorem 6.4 (cf. [8, Theorem 3.1]). Any contraction $L \xrightarrow{T(\varepsilon)} L_T$ (resp. sequential contraction) is equivalent to a gen. IW-contraction with integer exponents.
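Conditions (6.4) and (6.5) translate directly into a filter on structure constants. The following is a minimal sketch, not taken from the paper: the function name `gen_iw_contraction`, the dictionary encoding of the structure constants, and the choice of the standard $so(3)$ basis with $[e_1, e_2] = e_3$, $[e_2, e_3] = e_1$, $[e_3, e_1] = e_2$ are all assumptions made for illustration. Only one ordering of each bracket is stored; the antisymmetric partner is implied.

```python
def gen_iw_contraction(C, n):
    """Limit structure constants under T(eps) = diag(eps**n_j).

    C maps (i, j) -> {k: C^k_ij}; n maps a basis index to its exponent n_j.
    Raises ValueError if condition (6.4) fails, i.e. the limit would diverge.
    """
    limit = {}
    for (i, j), terms in C.items():
        kept = {}
        for k, c in terms.items():
            if c == 0:
                continue
            if n[k] > n[i] + n[j]:
                raise ValueError(f"mu(e{i}, e{j}) has an e{k} term with n_k > n_i + n_j")
            if n[k] == n[i] + n[j]:
                kept[k] = c          # term survives the limit, Eq. (6.5)
        if kept:
            limit[(i, j)] = kept
    return limit

# so(3) in a standard basis (assumed): [e1,e2]=e3, [e2,e3]=e1, [e3,e1]=e2.
so3 = {(1, 2): {3: 1}, (2, 3): {1: 1}, (3, 1): {2: 1}}

# Simple IW-contraction with respect to the subalgebra spanned by e3.
iso2 = gen_iw_contraction(so3, {1: 1, 2: 1, 3: 0})

# A gen. IW-contraction with exponents (1, 1, 2).
heisenberg = gen_iw_contraction(so3, {1: 1, 2: 1, 3: 2})
```

With these exponents the first call keeps only $[e_2, e_3] = e_1$ and $[e_3, e_1] = e_2$, and the second keeps only $[e_1, e_2] = e_3$; a choice such as $n = (0, 0, 1)$ violates (6.4) and raises an error.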
Since $L_T$ is not semisimple whenever $L_T \not\cong L$ (cf. [8, Corollary 4.2]), contractions are especially useful as a link between semisimple and non-semisimple Lie algebras (cf. also Sec. 7 and Appendix). Since gen. IW-contractions are extremely easy to apply, dealing with contractions has been considerably simplified, and quite often only made possible, by Theorem 6.4. Furthermore, Theorem 6.4 states at the same time that any contraction is equivalent to an analytic deformation (cf. [8] for further details).

Since a gen. IW-contraction either leaves the structure constants alone or sends them to zero (cf. Eqs. (6.4) and (6.5)), $L_T$ is either isomorphic to $L$ or “more Abelian” than $L$. The property of being “more Abelian” can be measured directly by the drop in the dimension of the orbit.

Remark 6.5 (cf. [9, pp. 215 and 221]). The orbit $O(L)$ of a complex (resp. real) finite-dimensional Lie algebra $L = (V, \mu)$ under the action of the group $\operatorname{Aut}(V)$ is isomorphic to
$$O(L) \cong \operatorname{Aut}(V)/\operatorname{Aut}(L). \tag{6.6}$$
$O(L)$ is a smooth submanifold of $\mathbb{C}^{(\dim V)^3}$ (resp. $\mathbb{R}^{(\dim V)^3}$). $\operatorname{Aut}(L)$ consists of all $U \in \operatorname{Aut}(V)$ with
$$U\mu(x, y) = \mu(Ux, Uy); \qquad x, y \in V.$$
Its Lie algebra is given by all derivations $D \in L(V)$ where
$$D\mu(x, y) = \mu(Dx, y) + \mu(x, Dy).$$
Therefore, $\dim O(L)$ can, e.g., be determined by calculating all derivations (cf. Eq. (6.6)). Assume $L_T \not\cong L$. Due to the definition of $L_T$ (cf. Definition 6.1), its orbit $O(L_T)$ lies in the closure of $O(L)$ (relative to the Euclidean topology). Since the boundary of an orbit consists of orbits of lower dimension we have immediately
$$L_T \not\cong L \ \Leftrightarrow\ \dim O(L_T) < \dim O(L). \tag{6.7}$$
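The recipe of Remark 6.5, computing $\dim O(L)$ from the derivations, can be sketched in exact arithmetic. The code below is an illustration under stated assumptions: it uses $\dim \operatorname{Aut}(V) = (\dim V)^2$ and $\dim \operatorname{Aut}(L) = \dim \operatorname{Der}(L)$, so that $\dim O(L)$ equals the rank of the linear system defining the derivations; the function names and the basis conventions for the three sample algebras are hypothetical choices.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of lists) by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in rows]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        piv = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue
        rows[r], rows[piv] = rows[piv], rows[r]
        for i in range(r + 1, len(rows)):
            if rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def orbit_dimension(n, brackets):
    """dim O(L) = n^2 - dim Der(L) for an n-dimensional Lie algebra.

    brackets maps (a, b), 0-based, to {k: C^k_ab}; antisymmetry is filled in.
    """
    C = [[[0] * n for _ in range(n)] for _ in range(n)]
    for (a, b), terms in brackets.items():
        for k, c in terms.items():
            C[a][b][k] += c
            C[b][a][k] -= c
    # Unknowns D[i][j] (D e_j = sum_i D[i][j] e_i), flattened as i * n + j.
    eqs = []
    for a in range(n):
        for b in range(n):
            for m in range(n):
                row = [0] * (n * n)
                for k in range(n):
                    row[m * n + k] += C[a][b][k]     # D mu(e_a, e_b)
                for i in range(n):
                    row[i * n + a] -= C[i][b][m]     # mu(D e_a, e_b)
                    row[i * n + b] -= C[a][i][m]     # mu(e_a, D e_b)
                eqs.append(row)
    # dim Der(L) = n^2 - rank, hence dim O(L) = rank.
    return rank(eqs)

so3 = {(0, 1): {2: 1}, (1, 2): {0: 1}, (2, 0): {1: 1}}
iso2 = {(1, 2): {0: 1}, (2, 0): {1: 1}}
heisenberg = {(0, 1): {2: 1}}
dims = [orbit_dimension(3, L) for L in (heisenberg, iso2, so3)]
```

For these three algebras the sketch gives the orbit dimensions 3, 5 and 6, in line with the strict drop in dimension expressed by Eq. (6.7) along a chain of contractions.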
Example 6.6. The real three-dimensional Lie algebra $L = so(3) = A_{3,9}$ can be contracted into $L_T = iso(2) = A_{3,6}$ or into the Heisenberg algebra $L_T = A_{3,1}$ (cf. [10, 11]). We have
$$\dim O(A_{3,1}) = 3 < \dim O(A_{3,6}) = 5 < \dim O(A_{3,9}) = 6.$$

6.2. Graded contractions

In order to study the relation between contractions and graded contractions of finite-dimensional Lie algebras, we first define equivalence between these two procedures along the lines of Definition 6.3.

Definition 6.7. A contraction $L \xrightarrow{T(\varepsilon)} L_T$ and a graded contraction $L' \xrightarrow{\gamma} L'_\gamma$ with $L \cong L'$ are called equivalent if $L_T \cong L'_\gamma$.
Note that $L'$ and $L'_\gamma$ are graded with respect to the same grading group, whereas gradings do not play any role for $L$ and $L_T$.

(a) Continuous graded contractions

Because of their definition (cf. Definition 4.1), continuous graded contractions can be immediately interpreted as contractions. Namely, we have $\gamma = \lim_{\varepsilon \to 0} da(\varepsilon)$, where $da(\varepsilon)$ corresponds to the change of basis $V_j \to a_j(\varepsilon)V_j$; $j \in G$; (cf. Examples I.2.7(ii)), which we identify with $T(\varepsilon)$. For a Lie algebra $L$ which is graded with respect to $V = \oplus_{j \in G} V_j$, it follows from Eq. (6.1) that $\lim_{\varepsilon \to 0} \mu_{T(\varepsilon)}(x, y)$ exists and is given by $\mu_\gamma$.

It is easy to show that this contraction $T(\varepsilon)$ can always be chosen in such a way that $T(0)$ exists. We know from Eq. (4.5) that
$$\lim_{\varepsilon \to 0} a_j^N(\varepsilon)$$
exists for all $j \in G$, so that the $a_j(\varepsilon)$; $\varepsilon \in (0, 1]$; are bounded. Therefore there exists $\varepsilon_n \to 0$ such that $\lim_{n \to \infty} a_j(\varepsilon_n)$ exists for all $j \in G$. This defines a sequential contraction where $T(0)$ exists. [Another proof is given by Theorem 4.2 (cf. Eq. (4.4)).]

In [8], we proved the existence of a contraction $L \to L_0$ such that there is no contraction $L \xrightarrow{T(\varepsilon)} L_0$ for which $T(0) = \lim_{\varepsilon \to 0} T(\varepsilon)$ exists. Furthermore, not all contractions where $T(0)$ does exist can be realized by a graded contraction (continuous or discrete), as the following three-dimensional example shows.

Example 6.8. Consider the real three-dimensional Lie algebra $L = (V, \mu) = A_{3,2}$ (cf. [11]) with non-vanishing Lie products
$$\mu(e_1, e_3) = e_1; \qquad \mu(e_2, e_3) = e_1 + e_2.$$
Non-trivial graded contractions (continuous or discrete) of $L$ do not exist, since $L$ only admits the $\mathbb{Z}_2$-grading $V = V_0 \oplus V_1$ where $e_3 \in V_0$; $e_1, e_2 \in V_1$. We write
$$V = V_1 \oplus V_2 \quad \text{with} \quad e_1, e_3 \in V_1; \quad e_2 \in V_2.$$
Then we have (no grading!)
$$\mu(V_1, V_1) \subset V_1; \qquad \mu(V_1, V_2) \subset V_1 \oplus V_2; \qquad \mu(V_2, V_2) = 0.$$
This leads to two non-trivial inequivalent contractions ($i, j = 1, 2$)
$$L \xrightarrow{T(\varepsilon)} L_T \quad \text{with} \quad T(\varepsilon)_{ij} = \delta_{ij}\,\varepsilon^{n_j},$$
namely

(i) $n_2 > n_1 = 0$,
$$\mu_T(V_1, V_1) \subset V_1; \qquad \mu_T(V_1, V_2) \subset V_2; \qquad \mu_T(V_2, V_2) = 0$$
i.e.
$$\mu_T(e_1, e_3) = e_1; \qquad \mu_T(e_2, e_3) = e_2$$
so that $L_T = A_{3,3}$, and

(ii) $n_1 > n_2 = 0$,
$$\mu_T(V_1, V_1) = 0; \qquad \mu_T(V_1, V_2) \subset V_1; \qquad \mu_T(V_2, V_2) = 0$$
i.e.
$$\mu_T(e_1, e_3) = 0; \qquad \mu_T(e_2, e_3) = e_1$$
so that $L_T = A_{3,1}$. Further non-trivial contractions of $L$ do not exist (cf. [10]). Thus we have established:

Theorem 6.9. Continuous graded contractions are equivalent to a proper subset of contractions $T(\varepsilon)$ where $T(0)$ exists.

(b) Discrete graded contractions

As is to be expected from the purely algebraic definition, a discrete graded contraction is in general not equivalent to a contraction, as the following example shows.

Example 6.10. Consider the real weakly discrete $\mathbb{Z}_2$-graded contraction $\gamma$ with
$$\gamma_{00} = 1 \quad \text{and} \quad \gamma_{01} = \gamma_{11} = 0$$
so that “$\gamma_{00} = \gamma_{0k}$” is violated. Consider the three-dimensional real $\mathbb{Z}_2$-graded Lie algebra $L = (V, \mu) = A_{3,3}$ (cf. [11]) where $V = V_0 \oplus V_1$ with basis vectors $e_1, e_3 \in V_0$; $e_2 \in V_1$ and non-vanishing Lie products
$$\mu(e_1, e_3) = e_1; \qquad \mu(e_2, e_3) = e_2.$$
Then $L_\gamma = (V, \mu_\gamma) = A_{2,1} \oplus A_{1,1}$ since
$$\mu_\gamma(e_1, e_3) = e_1; \qquad \mu_\gamma(e_2, e_3) = 0;$$
i.e. $L_\gamma$ is simply a subalgebra of $L$. We have for the orbits
$$\dim O(L_\gamma) = 2 < \dim O(L) = 3.$$
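The passage from $\mu$ to $\mu_\gamma$ in Example 6.10 is a purely mechanical rescaling of brackets. The following minimal sketch assumes the rule $\mu_\gamma(x, y) = \gamma_{jk}\,\mu(x, y)$ for $x \in V_j$, $y \in V_k$, as used in the text for $\mu_\gamma(V_0, V_0) = \gamma_{00}\,\mu(V_0, V_0)$; the function name `graded_contraction` and the dictionary encoding are hypothetical.

```python
def graded_contraction(brackets, grade, gamma, n):
    """Rescale each bracket mu(e_a, e_b) by gamma_jk, where j and k are the
    Z_n grades of e_a and e_b; brackets that become zero are dropped."""
    out = {}
    for (a, b), terms in brackets.items():
        j, k = sorted((grade[a] % n, grade[b] % n))
        factor = gamma[(j, k)]
        scaled = {c: factor * v for c, v in terms.items() if factor * v != 0}
        if scaled:
            out[(a, b)] = scaled
    return out

# A_{3,3}: mu(e1, e3) = e1, mu(e2, e3) = e2, with e1, e3 in V_0 and e2 in V_1.
a33 = {(1, 3): {1: 1}, (2, 3): {2: 1}}
grade = {1: 0, 2: 1, 3: 0}
gamma = {(0, 0): 1, (0, 1): 0, (1, 1): 0}
contracted = graded_contraction(a33, grade, gamma, 2)
```

Only the bracket $\mu_\gamma(e_1, e_3) = e_1$ survives, as stated in the example; with the trivial contraction (all $\gamma_{jk} = 1$) the brackets are returned unchanged.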
However (although $L_\gamma$ is “more Abelian” than $L$ and although Eq. (6.7) is satisfied), non-trivial contractions of $A_{3,3}$ do not exist (cf. [10]).

For a non-trivial contraction $L \xrightarrow{T(\varepsilon)} L_T$ we must have (see above)
$L_T$ not semisimple,
$L_T$ “more Abelian” than $L$,
$\dim O(L_T) < \dim O(L)$.
But for discrete graded contractions $\gamma$ none of these correlations have to hold. The following two examples of strongly discrete $\gamma$’s show that the dimension of the orbit can stay the same or even grow!

Examples 6.11. (i) The $\mathbb{Z}_2$-graded contraction
$$L = so(3) \xrightarrow{db} L_{db} = so(2, 1)$$
provides an example where $\dim O(L_\gamma) = \dim O(L)$. Furthermore $L_{db}$ is simple although $L_{db} \not\cong L$. $L_{db}$ is not “more Abelian” than $L$.

(ii) Consider the strongly discrete $\mathbb{Z}_6$-graded contraction $\gamma$ with (cf. Example 5.5, case (iv))
$$\gamma_{11} = \alpha \neq 0, 1; \qquad \gamma_{13} = \gamma_{15} = \gamma_{33} = \gamma_{35} = \gamma_{55} = 1; \qquad \gamma_{jk} = 0 \ \text{otherwise}.$$
Consider the $\mathbb{Z}_6$-graded Lie algebra $L = (V = \oplus_{j=0}^{5} V_j, \mu)$ with basis vectors $e_j, e_{j1}, e_{j2} \in V_j$ and non-vanishing Lie products
$$\mu(e_{11}, e_{12}) = e_2 = \mu(e_{32}, e_{51}),$$
$$\mu(e_{31}, e_{32}) = e_0 = \mu(e_{52}, e_{11}),$$
$$\mu(e_{51}, e_{52}) = e_4 = \mu(e_{12}, e_{31}).$$
Then one can show (details will be presented in Part III) that $\dim O(L_\gamma) > \dim O(L)$, so that $L_\gamma \not\cong L$ and $L_\gamma$ is not “more Abelian” than $L$.

7. Graded Contractions Versus Contractions II

We continue our comparison of both notions, especially with respect to their use in physics. Here we discuss the motivation for their introduction, and their applicability to representations and invariants. We note that contractions can successfully treat a wide variety of interesting cases for representations. In contrast to this, Theorem 7.5 proves that the graded contraction method can never relate two physically interesting (i.e. faithful self-adjoint) representations. As for invariants, contractions
can be easily applied not only to all polynomial invariants, but also to rational and even some formal ones. However, graded contractions can only deal in a limited way, and with great difficulty, with polynomial invariants.

7.1. Motivation

(a) Contractions

If two physical theories (e.g. relativistic and non-relativistic mechanics) are related by a limiting process (e.g. the velocity of light goes to infinity), the same should be true for their invariance groups (e.g. Poincaré and Galilean group). This idea led to the concept of contractions [3, 4]. A contraction (cf. Definition 6.1) is a path (resp. a sequence) which runs within the orbit of one Lie algebra and ends in its boundary. It follows easily from Theorem 6.4 that every point in the orbit closure can be so obtained. Contractions quickly became a standard tool in mathematical physics, although mostly simple IW-contractions (cf. Definition 6.2) were used.

(b) Graded contractions

Graded contractions were claimed to be, in the context of mathematical physics, a generalization of contractions (“Graded contractions . . . allow many more contraction parameters to be introduced and consequently a much larger variety of contraction ‘limits’ to be studied”) in the following three ways (cf. [1, Introduction]).

(i) The grading group $G$ is no longer $\mathbb{Z}_2$ only. (“Traditional WI-contractions are a particular case of $\mathbb{Z}_2$-graded contractions” (cf. [13, Introduction]).)
(ii) Discrete solutions exist besides continuous ones.
(iii) The procedure is identical for all $G$-graded Lie algebras and for all superalgebras independent of the (finite or infinite) dimensions of the subspaces.

Since graded contractions are not generalizations of contractions, this statement is quite puzzling. Therefore, we will comment on all three individual points.

(i) This statement is completely misleading, since gradings do not play any role for a contraction! The misunderstanding expressed in (i) is probably based on the following observation. When we expose a Lie algebra $L = (V = V_0 \oplus V_1, \mu)$ (no grading!) to a simple IW-contraction with respect to its subalgebra $(V_0, \mu)$, the contracted Lie algebra $L_T = (V = V_0 \oplus V_1, \mu_T)$ exhibits a $\mathbb{Z}_2$-grading since
$$\mu_T(V_0, V_0) = \mu(V_0, V_0) \subset V_0; \qquad \mu_T(V_0, V_1) \subset V_1; \qquad \mu_T(V_1, V_1) = 0.$$
But this is in general not a $\mathbb{Z}_2$-graded contraction, since in general
$$\mu(V_0, V_1) \not\subset V_1 \quad \text{and} \quad \mu(V_1, V_1) \not\subset V_0.$$
For a three-dimensional example of this type (which is not equivalent to any graded contraction), see Example 6.8.
(ii) This is not surprising since we compare an analytic and a purely algebraic concept. In Sec. 5, we show that for discrete graded contractions, L and Lγ are in general too loosely connected to suggest any interesting applications in physics. In any case, we could not find any interesting application of a discrete graded contraction in the extensive literature.
(iii) This is trivially true since a graded contraction treats all vectors in a subspace Vj (j ∈ G) identically. Contractions could of course be restricted in the same way, but this is neither necessary nor desirable. However, it should be added here that most publications on graded contractions ignore this point by producing results which are only valid for one specific Lie algebra.
7.2. Representations
We start with a quotation (cf. [14, Introduction]): “A major handicap of contractions is that they do not extend . . . to the theory of representations . . . Graded contractions . . . extend naturally to all representations . . .”. It is clear even from the title of the original Inönü–Wigner paper [4] that the first statement is wrong. As for the second statement, we will show that graded contractions cannot be used at all to study physically interesting representations.
The representations of real Lie groups which are of interest in physics are faithful, unitary representations, possibly up to a factor. They define faithful, self-adjoint representations of the corresponding real Lie algebra, or, in the case of a representation up to a factor, possibly of a central extension of it [15, 16].
(a) Contractions
The original Inönü–Wigner paper [4] focuses on the question of how the physically interesting representations of L relate to those of LT. Consider a contraction
$L \xrightarrow{T(\varepsilon)} L_T$
and a representation D of L = (V, µ) on a Hilbert space H, i.e.
$D(\mu(x, y)) = [D(x), D(y)]; \quad x, y \in V.$ (7.1)
The first idea to produce a representation of LT would be to consider the representations Dε on H given by (cf. Definition 6.1)
$D_\varepsilon(x) = D(T(\varepsilon)x); \quad 0 < \varepsilon \le 1;$ (7.2)
of $L_{T(\varepsilon)} = (V, \mu_{T(\varepsilon)})$. If $T(0) = \lim_{\varepsilon \to 0} T(\varepsilon)$ exists, then one could try to define a representation DT of LT by
$D_T(x) = D(T(0)x); \quad x \in V.$ (7.3)
But $L_T \not\cong L$ implies that T(0) is singular, so that such a representation cannot be faithful. This approach was therefore immediately rejected by Inönü and Wigner.
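To make the singular limit concrete, the following is a minimal numerical sketch, not taken from the paper. It assumes the usual form of the transformed product, µ_T(ε)(x, y) = T(ε)⁻¹ µ(T(ε)x, T(ε)y) (Definition 6.1 is not reproduced in this section), a diagonal T(ε)e_i = ε^{n_i} e_i, and a real basis of so(3) with [e_i, e_j] = ε_ijk e_k instead of the self-adjoint generators used in the text. The function names are hypothetical.

```python
def so3_constants():
    """c[i][j][k] with [e_i, e_j] = sum_k c[i][j][k] e_k (real so(3) basis, an assumption)."""
    c = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        c[i][j][k] = 1.0
        c[j][i][k] = -1.0
    return c


def contracted_constants(c, exponents, eps):
    """Structure constants of mu_T(x, y) = T^{-1} mu(T x, T y) for diagonal T e_i = eps**n_i e_i."""
    t = [eps ** n for n in exponents]
    dim = len(t)
    return [[[t[i] * t[j] * c[i][j][k] / t[k] for k in range(dim)]
             for j in range(dim)] for i in range(dim)]


if __name__ == "__main__":
    c = so3_constants()
    # Scale e_1, e_2 by eps and leave e_3 fixed (the so(3) -> iso(2) pattern).
    for eps in (1.0, 0.1, 0.001):
        c_eps = contracted_constants(c, exponents=(1, 1, 0), eps=eps)
        # [e_1, e_2] -> eps**2 * e_3, while [e_2, e_3] = e_1 and [e_3, e_1] = e_2 stay fixed.
        print(eps, c_eps[0][1][2], c_eps[1][2][0], c_eps[2][0][1])
```

In this sketch the bracket [e_1, e_2] is suppressed like ε² while the other two brackets are unchanged, and T(0) = diag(0, 0, 1) is visibly singular, so D(T(0)x) vanishes on e_1 and e_2.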
As a remedy, they proposed instead to consider the representations
$D_\varepsilon(x) = D^{(\varepsilon)}(T(\varepsilon)x); \quad 0 < \varepsilon \le 1;$ (7.4)
where $D^{(\varepsilon)}$ is a representation of L on a Hilbert space $H^{(\varepsilon)}$. The idea is to choose $D^{(\varepsilon)}$ and $H^{(\varepsilon)}$ together with a limiting procedure which yields interesting representations of LT. The necessity of such a limiting procedure is to be expected. Namely, when one theory is a limit of another, the contraction parameter will be a physical quantity. Since a physically interesting representation describes some physical situation, this representation should change too. Furthermore, whenever L is compact and LT non-compact, the irreducible self-adjoint representations of L are finite-dimensional while those of LT are infinite-dimensional. Therefore a limiting procedure which changes the representation space is unavoidable in such a case (hence the graded contraction method cannot be used at all here, cf. Examples A.1 and A.2).
Inönü and Wigner illustrate this limiting procedure for the simple IW-contraction $so(3) \xrightarrow{T(\varepsilon)} iso(2)$ (cf. also Example A.1). If you consider gen. IW-contractions of compact simple Lie algebras in general [12], this limiting procedure has to become more involved (cf. also Example A.2). This approach also works, e.g., for the non-compact Lorentz group (cf. Example A.3). But there are still a lot of open questions (e.g. do you get all interesting representations of LT in this way?).
The power of this procedure was well demonstrated in the original Inönü–Wigner paper. Bargmann [15] had shown that the non-relativistic Schrödinger equation transforms under the Galilean group by a representation up to a factor which contains the mass and cannot be eliminated. A main motivation of the Inönü–Wigner paper was the question: why do the true representations of the Galilean group not occur in physics? They answered this question by showing that they are contractions of spacelike representations of the Poincaré group. Furthermore, they contracted the timelike representations of the Poincaré group to obtain the representations for the Schrödinger equation. Here the procedure is necessarily more involved, since the representation of the original Lie algebra is contracted to a representation of a central extension of the contracted Lie algebra (a procedure which the graded contraction method cannot, even in principle, deal with). Here the matrix elements of the generator of time translations necessarily diverge, and removing this c-number divergence automatically produces the desired representation.
(b) Graded contractions
Moody and Patera [2] have defined the graded contraction of compatibly graded representations as follows.
A representation D of a G-graded Lie algebra L = (V = ⊕j∈G Vj, µ) on a vector space H = ⊕k∈G Hk is said to be compatibly graded if
$D(V_j)H_k \subset H_{j+k}.$ (7.5)
The graded contracted representation Dψ of Lγ = (V, µγ) on the same vector space H is defined by
$D_\psi(V_j)H_k = \psi_{jk}\, D(V_j)H_k \subset H_{j+k}$ (7.6)
where the numbers ψjk satisfy their defining equation (j, k, l ∈ G)
$\gamma_{jk}\,\psi_{j+k,l} = \psi_{jl}\,\psi_{k,j+l} = \psi_{kl}\,\psi_{j,k+l}.$ (7.7)
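The defining equation (7.7) is easy to test by brute force for small cyclic grading groups. The sketch below is an illustration only: it assumes G = Z_n with the labels 0, ..., n−1 added modulo n, stores γ and ψ as nested lists, and uses the two Z2-graded contractions that appear in Examples A.1 and A.2 of the Appendix. The helper names are hypothetical and not taken from [2].

```python
from itertools import product


def satisfies_eq_7_7(gamma, psi, n, tol=1e-12):
    """Check gamma_jk psi_{j+k,l} = psi_jl psi_{k,j+l} = psi_kl psi_{j,k+l} for all j, k, l in Z_n."""
    for j, k, l in product(range(n), repeat=3):
        lhs = gamma[j][k] * psi[(j + k) % n][l]
        mid = psi[j][l] * psi[k][(j + l) % n]
        rhs = psi[k][l] * psi[j][(k + l) % n]
        if abs(lhs - mid) > tol or abs(lhs - rhs) > tol:
            return False
    return True


def zero_one_solutions(gamma, n):
    """List all psi with entries in {0, 1} that satisfy Eq. (7.7) for the given gamma."""
    solutions = []
    for entries in product((0, 1), repeat=n * n):
        psi = [list(entries[r * n:(r + 1) * n]) for r in range(n)]
        if satisfies_eq_7_7(gamma, psi, n):
            solutions.append(psi)
    return solutions


# Z2-graded contractions of Examples A.1 and A.2 (rows and columns indexed by 0, 1).
GAMMA_ISO2 = [[1, 1], [1, 0]]        # gamma_00 = gamma_01 = 1, gamma_11 = 0
GAMMA_HEISENBERG = [[0, 0], [0, 1]]  # gamma_00 = gamma_01 = 0, gamma_11 = 1

if __name__ == "__main__":
    for gamma in (GAMMA_ISO2, GAMMA_HEISENBERG):
        print(gamma, "psi = gamma solves (7.7):", satisfies_eq_7_7(gamma, gamma, n=2))
    for psi in zero_one_solutions(GAMMA_ISO2, n=2):
        print("0/1-valued solution for the iso(2) contraction:", psi)
```

Restricting ψ to the values 0 and 1 is a simplification made for the enumeration; real or complex values would need a different search.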
One sees immediately from the defining equation (2.1) for γ that ψ = γ is always a solution. But there are others, and ψ need not be symmetric.
Remark 7.1. The questions which representations D of L are compatibly graded, and which representations Dψ of Lγ you get with this approach, are not discussed in [2]. Furthermore, no clear motivation for this concept is given. There is no obvious reason why a representation (other than the adjoint representation) should be compatibly graded. Indeed, the well-known representations of the Lorentz group are not (cf. Example A.3). All you get in [2] are two tables for G = Z2 and Z3 containing some solutions for ψ for all different projections γ = π(γ) with zeroes, without any discussion. For example, Table 1 lists a total of 12 solutions for the 3 different Z2-graded contractions γ = π(γ) with zeroes. The authors did not notice that 5 of them differ from others only by the exchange of H0 and H1 (which is trivially possible). For 6 of the remaining 7, Dψ is not faithful so that they should also have been discarded from the start (cf., e.g., Example A.2). For the one remaining solution, namely ψ = γ where γ00 = γ01 = 1, γ11 = 0, Dψ is faithful, but not self-adjoint (cf., e.g., Example A.1). The question which ψ′ belong to γ′ ∼ γ was not discussed either. If γ′ = da · γ, then $\psi'_{jk} = \psi_{jk}\, a_j a_k / a_{j+k}$ satisfies Eq. (7.7). If you replace Vj by aj Vj and Hk by ak Hk you get a similar representation. Finally, the obvious question whether this procedure can relate two physically interesting (i.e. faithful, self-adjoint) representations seems to have been completely ignored in the graded contraction literature. We now show that it can never do this. The first problem is the compatible grading of D. We prove in Lemma 7.3 that this restricts the grading group to a product of Z2 factors (i.e. Z2, Z2 ×
Z2, Z2 × Z2 × Z2 etc.). (However, even for Z2, the standard representations from physics need not be compatibly graded, cf. Remark 7.1.) Finally, Theorem 7.5 completes the argument. We now prove these results.
Lemma 7.2. Let G be a grading group, let H = ⊕k∈G Hk be a Hilbert space. Let j ∈ G, and let T ≠ 0 be a self-adjoint operator such that T Hk ⊂ Hj+k. Then 2j = 0 and
$T H_k \neq \{0\} \;\Leftrightarrow\; T H_{j+k} \neq \{0\}.$ (7.8)
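The block-pattern argument behind Lemma 7.2 can be illustrated with a small combinatorial check. The sketch below is only an illustration under simplifying assumptions: G = Z_n, and every H_k is non-zero so that each block position is available. A block (a, b) of T, mapping H_b into H_a, may be non-zero only if a = b + j, and self-adjointness pairs block (a, b) with block (b, a). The function name is hypothetical.

```python
def admits_selfadjoint_shift(n, j):
    """Can a non-zero self-adjoint T with T H_k in H_{j+k} exist for the grading group Z_n?"""
    # Block (a, b) of T, mapping H_b into H_a, may be non-zero only if a = b + j (mod n).
    allowed = {(a, b) for a in range(n) for b in range(n) if a == (b + j) % n}
    # A non-zero block (a, b) of a self-adjoint T forces a non-zero block (b, a),
    # so the transposed position must be allowed as well.
    return any((b, a) in allowed for (a, b) in allowed)


if __name__ == "__main__":
    for n in (2, 3, 4):
        for j in range(n):
            print(f"Z_{n}, j = {j}: 2j = 0 is {(2 * j) % n == 0},",
                  "self-adjoint shift possible:", admits_selfadjoint_shift(n, j))
```

For these cyclic groups the check agrees with the condition 2j = 0 of the lemma: shifts by j = 1 are possible for Z2 but not for Z3 or Z4, while j = 2 is possible for Z4.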
Proof. Let $e_{k\alpha}$ be an orthonormal basis for Hk. Then a non-zero matrix element of T is necessarily of the form $T_{(j+k,\beta),(k\alpha)}$, and Eq. (7.8) now follows from the self-adjointness of T. Since T ≠ 0, we must have T Hk ≠ {0} for some k, and hence for some α, β we have $T_{(j+k,\beta),(k\alpha)} = \bar{T}_{(k\alpha),(j+k,\beta)} \neq 0$, which implies k = 2j + k.
Lemma 7.3. Let L = (V = ⊕j∈G Vj, µ) be a G-graded Lie algebra. Let D be a compatibly graded, faithful, self-adjoint representation of L on a Hilbert space H. Then Vj ≠ {0} implies that 2j = 0 (i.e. the grading group is, in effect, Z2 or Z2 × Z2 or Z2 × Z2 × Z2 etc.).
Proof. We have H = ⊕k∈G Hk and D(Vj)Hk ⊂ Hj+k. Choose some T ∈ Vj, T ≠ 0. Then D(T) ≠ 0 satisfies the assumptions in Lemma 7.2.
Lemma 7.4. Let γ be a G-graded contraction, let L = (V = ⊕j∈G Vj, µ) be a G-graded Lie algebra, and let D be a compatibly graded, faithful and self-adjoint representation of L. If a representation Dψ of Lγ as defined by Eqs. (7.5)–(7.7) exists which is faithful and self-adjoint, then γjk ≠ 0 for all j, k in the subgroup G′ generated by all j with dim Vj > 0.
Proof. Let J = {j ∈ G | dim Vj > 0}. Let j ∈ J. Since Dψ is faithful, there exists l ∈ G such that
$D_\psi(V_j)H_l \neq 0.$ (7.9)
Using self-adjointness and Lemma 7.2, we get
$D_\psi(V_j)H_{j+l} \neq 0.$ (7.10)
Equations (7.6), (7.9) and (7.10) give
$\psi_{jl} \neq 0$ and $\psi_{j,j+l} \neq 0.$
By Lemma 7.3, we have 2j = 0, and Eq. (7.7) with k = j then gives
$\gamma_{jj}\,\psi_{0l} = \psi_{jl}\,\psi_{j,j+l} \neq 0.$ (7.11)
(If J = G′, then $\gamma_{jj} = \gamma_{j,-j} \neq 0$ for all j ∈ G′, and Lemma 2.2 already implies the result.) Equation (7.11) implies
$\gamma_{jj},\ \psi_{0l},\ \psi_{jl},\ \psi_{j,j+l} \neq 0.$ (7.12)
Equation (7.7) with k = 0 gives
$\gamma_{0j}\,\psi_{jl} = \psi_{0l}\,\psi_{jl}$ (7.13)
so that $\gamma_{0j} \neq 0$, which means (cf. Definition 2.1 and Eq. (2.2))
$0,\ j \in I^{(1)}.$ (7.14)
Since 2j = 0, Eqs. (2.12) and (7.12) give
$0 \neq \gamma_{00}\,\gamma_{jj} = \gamma_{jm}\,\gamma_{j,j+m}$ for all $m \in I^{(1)}$, (7.15)
and hence
$\gamma_{jm} \neq 0$ for all $m \in I^{(1)}$. (7.16)
We now use the defining equations for γ to show that γjk ≠ 0 for all j ∈ J, k ∈ G′. Note that any k ∈ G′ is a finite sum of elements in J. Let j1, j2 ∈ J. Then Eqs. (7.14) and (7.16) give
$\gamma_{j_1 j_2} \neq 0$ (7.17)
and Eqs. (2.7) and (7.14) imply that $j_1 + j_2 \in I^{(1)}$. By repeating this argument we get for all k ∈ G′, and all j ∈ J,
$k \in I^{(1)}$ and $\gamma_{jk} \neq 0.$
Now let j1, j2 ∈ J and k ∈ G′. The defining equation
$\gamma_{j_1 j_2}\,\gamma_{j_1+j_2,k} = \gamma_{j_1 k}\,\gamma_{j_2,j_1+k}$ (7.18)
gives, because of Eq. (7.18),
$\gamma_{j_1+j_2,k} \neq 0.$ (7.19)
Repeating this cycle we get
$\gamma_{kk'} \neq 0; \quad k, k' \in G'.$
Theorem 7.5. Let L = (V = ⊕j∈G Vj, µ) be a G-graded Lie algebra, and let D be a compatibly graded, faithful, self-adjoint representation of L. Let γ be a G-graded contraction such that $L_\gamma \not\cong L$. Then a graded contracted representation Dψ of Lγ cannot be faithful and self-adjoint.
Proof. From Lemmas 7.3 and 7.4, we can assume that G = Z2 × Z2 × · · · × Z2, that G is generated by J = {j ∈ G | dim Vj > 0}, and that γ has no zeroes. Since in the complex case this would mean that $L_\gamma \cong L$ (cf. Theorem I.3.1), Lγ and L must be inequivalent real forms, and hence γ must have a negative sign invariant of the second kind (cf. Theorem I.3.5 and Lemma I.4.15). Since J generates G it follows that any sign invariant for γ is a product of the sign invariants sgn Pj(γ) where
$P_j(\gamma) = \gamma_{00}\,\gamma_{jj}; \quad 0 \neq j \in J;$ (7.20)
(cf. Lemma I.4.15 and Example I.6.15). Hence there exists some $\hat{j} \in J$, $\hat{j} \neq 0$, such that
$\operatorname{sgn} P_{\hat{j}}(\gamma) = -1.$ (7.21)
Now consider ψ. Equation (7.7) for j = k = l = 0 gives
$\gamma_{00}\,\psi_{00} = \psi_{00}^2.$
Thus we must have either
Case (i): $\psi_{00} = \gamma_{00} \neq 0$; or Case (ii): $\psi_{00} = 0.$ (7.22)
Equation (7.7) for j = k ≠ 0, l = 0 yields (since 2j = 0)
$\gamma_{jj}\,\psi_{00} = \psi_{j0}\,\psi_{jj}.$ (7.23)
It follows from Eqs. (7.20)–(7.23) that we must have either
Case (i): $\operatorname{sgn}(\psi_{\hat{j}0}\,\psi_{\hat{j}\hat{j}}) = \operatorname{sgn}(\gamma_{00}\,\gamma_{\hat{j}\hat{j}}) = -1$, or Case (ii): $\psi_{j0}\,\psi_{jj} = 0, \quad j \in G.$ (7.24)
Now, for j ∈ J, we have
$D_\psi(V_j)H_0 = \psi_{j0}\, D(V_j)H_0 \subset H_j$, and $D_\psi(V_j)H_j = \psi_{jj}\, D(V_j)H_j \subset H_0.$ (7.25)
We consider first case (i). Then in Eq. (7.25) with $j = \hat{j}$, both expressions are non-zero, and the minus sign then implies that Dψ is not even similar to a self-adjoint representation.
We now treat case (ii). Equation (7.7) for j = k = l yields $\gamma_{jj}\,\psi_{0j} = \psi_{jj}\,\psi_{j0}$ and hence
$\psi_{0j} = 0; \quad j \in G.$ (7.26)
Equation (7.7) for j = k yields $\gamma_{jj}\,\psi_{0l} = \psi_{jl}\,\psi_{j,j+l}$ and hence
$\psi_{jl}\,\psi_{j,j+l} = 0; \quad j, l \in G.$ (7.27)
If we assume that Dψ is faithful and self-adjoint, then the argument at the start of the proof of Lemma 7.4 implies that for j ∈ J and some l ∈ G we have $\psi_{jl} \neq 0$ and $\psi_{j,j+l} \neq 0$, which contradicts Eq. (7.27).
Note that once the problem has been reduced to the case where Lγ and L are inequivalent real forms, we have essentially the same situation as Weyl’s unitary trick which necessarily destroys self-adjointness, so that the theorem is to be expected (the main problem for Weyl was to change the representation space).
7.3. Invariants
(a) Contractions
There are no conceptual difficulties at all in applying gen. IW-contractions to invariants [17]. We start with polynomial invariants (the so-called Casimir operators). Consider a contraction (cf. Definition 6.1)
$L = (V, \mu) \xrightarrow{T(\varepsilon)} L_T = (V, \mu_T)$ with $L_T \not\cong L$
and a polynomial invariant
$C = c_{i_1 i_2 \cdots i_m}\, e_{i_1} e_{i_2} \cdots e_{i_m}$
of L where $e_i$ are basis vectors of V and
$c_{i_1 i_2 \cdots i_m} \in \mathbb{C}$ (resp. $\mathbb{R}$). (7.28)
Then
$C_{T(\varepsilon)} = c_{i_1 i_2 \cdots i_m}\, (T^{-1}(\varepsilon)e_{i_1})(T^{-1}(\varepsilon)e_{i_2}) \cdots (T^{-1}(\varepsilon)e_{i_m}) = c_{i_1 i_2 \cdots i_m}(\varepsilon)\, e_{i_1} e_{i_2} \cdots e_{i_m}$ (7.29)
is the transformed invariant of $L_{T(\varepsilon)} = (V, \mu_{T(\varepsilon)})$. In the case of a gen. IW-contraction with respect to this basis, we have
$T(\varepsilon)e_i = \varepsilon^{n_i} e_i; \quad n_i \in \mathbb{R};$
which means
$c_{i_1 i_2 \cdots i_m}(\varepsilon) = \varepsilon^{-(n_{i_1} + n_{i_2} + \cdots + n_{i_m})}\, c_{i_1 i_2 \cdots i_m}.$ (7.30)
Let
$M = \max(n_{i_1} + n_{i_2} + \cdots + n_{i_m})$ for all $c_{i_1 i_2 \cdots i_m} \neq 0.$ (7.31)
Then
$C_T = \lim_{\varepsilon \to 0} \varepsilon^{M} C_{T(\varepsilon)} = \sum_{n_{i_1} + n_{i_2} + \cdots + n_{i_m} = M} c_{i_1 i_2 \cdots i_m}\, e_{i_1} e_{i_2} \cdots e_{i_m}$ (7.32)
is a non-trivial polynomial invariant of LT with the same degree as C. If you contract several polynomial invariants of L, it can happen that a contracted invariant CT with a higher degree is simply a product of those with lower degrees, since its contribution occurred in $C_{T(\varepsilon)}$ with the largest negative exponent. In this case, we have to subtract this expression from $C_{T(\varepsilon)}$ in order to get an interesting result. In this way it is, e.g., possible to contract a complete set of algebraically independent invariants for so(p, q + 1) with the simple IW-contraction L = so(p, q + 1) → LT = iso(p, q), into a complete set of algebraically independent invariants for iso(p, q) [18]. But since LT is “more Abelian” than L, it can have more invariants than L, so that in general we cannot expect to get all invariants of LT from those of L. Rational invariants can be successfully treated in a similar way. But for formal invariants, the existence of a non-trivial limit has only been established in certain cases [17].
(b) Graded contractions
For G-graded contractions $L \xrightarrow{\gamma} L_\gamma$ only polynomial invariants C of L = (V = ⊕j∈G Vj, µ) have been considered, and only in their standard form as symmetric homogeneous polynomials [19]. The starting point is the observation that all summands of C belong to one and the same total grading label. If C is a polynomial of degree r with total grading label k ∈ G, then each summand of C looks like
$e_{i_1} e_{i_2} \cdots e_{i_r}; \quad e_{i_s} \in V_{j_s} \subset V; \quad j_s \in G; \quad s = 1, 2, \ldots, r;$
where the individual grading labels add up to k, i.e. $j_1 + j_2 + \cdots + j_r = k$. Following the general graded contraction philosophy, Bincer and Patera multiply each summand by a number
$\mu_{j_1 j_2 \cdots j_r} \in \mathbb{C}$ (resp. $\mathbb{R}$)
which only depends on which elements $j_s$ occur. The idea now is to choose these numbers in such a way that this modified expression Cµ becomes a polynomial invariant of Lγ. In [19], the equation which $\mu_{j_1 j_2 \cdots j_r}$ has to satisfy to achieve this goal is derived explicitly only for G = ZN and a quadratic Casimir operator with total grading label 0. For example, for G = Z2 it looks like
$\mu_{00}\,\gamma_{01} = \mu_{11}\,\gamma_{11}.$ (7.33)
The general case quickly becomes rather involved and obscure, so that the authors only sketch the procedure for a Casimir operator of degree 3 and arbitrary total grading label. The question which invariants of Lγ you get in this way and what you do if Cµ = 0 is not discussed. Rational and formal invariants cannot be treated at all. Altogether this approach can only be used in the simplest cases (cf. examples in the Appendix).
Appendix. Three Physical Examples
In this appendix, we illustrate the contraction resp. graded contraction of representations and invariants for the following three standard examples from physics.
Example A.1. L = so(3) → LT = Lγ = iso(2)
Example A.2. L = so(3) → LT = Lγ = Heisenberg algebra
Example A.3. L = so(3, 1) → LT = Lγ = iso(3)
Since all invariants which occur are quadratic polynomials, both methods work. However, the second invariant in Example A.3 has total grading label 1 so that it is not explicitly treated in [19] (cf. also Sec. 7.3(b)). The situation is completely different for representations. The contraction method handles all three cases very successfully. The graded contraction method produces no result of any mathematical or physical interest. In Example A.1, the adjoint representation is the only faithful irreducible representation we get. In Example A.2, we get no faithful representation at all. Finally, Example A.3 (the Lorentz group) cannot be treated at all since the starting representation is not compatibly graded.
Example A.1. Consider the simple IW-contraction resp. Z2-graded contraction L = so(3) → LT = Lγ = iso(2). L is compact simple. LT is non-compact, non-semi-simple.
Consider the (2j+1)-dimensional faithful, irreducible, self-adjoint representation $D^{(j)}$ of L (j = 0, 1, 2, . . .) on the Hilbert space $H^{(j)}$ with orthonormal basis
$|j, m\rangle; \quad m = 0, \pm 1, \ldots, \pm j;$
where the non-vanishing matrix elements of the generators $\vec{J}$ with
$\mu(J_i, J_j) = i\varepsilon_{ijk} J_k; \quad i, j, k = 1, 2, 3;$
are
$\langle j, m |\, D^{(j)}(J_3)\, | j, m \rangle = m$, and $\langle j, m \pm 1 |\, D^{(j)}(J_\pm)\, | j, m \rangle = \sqrt{(j \mp m)(j \pm m + 1)}$
where $J_\pm = J_1 \pm iJ_2$. The Casimir operator
$C = J_1^2 + J_2^2 + J_3^2 = J_3^2 + \tfrac{1}{2}(J_+ J_- + J_- J_+)$
takes on the value $j(j + 1)\,\mathbf{1}$.
(a) Contractions
We take the contraction in the form
$T(\varepsilon)J_3 = J_3$ and $T(\varepsilon)J_\pm = \varepsilon J_\pm.$
The basic idea of the contraction of the representation is to choose an appropriate path through the different representations $D^{(j)}$ as follows. Choose first j(ε) ∈ N with j(ε) → ∞ so that
$\lim_{\varepsilon \to 0} \varepsilon\, j(\varepsilon) = M > 0.$
The contracted representation $D_T^{(M)}$ of LT is defined on the Hilbert space HT with orthonormal basis $|m\rangle$; m ∈ Z; by the non-vanishing matrix elements
$\langle m |\, D_T^{(M)}(J_3)\, | m \rangle = \lim_{\varepsilon \to 0} \langle j(\varepsilon), m |\, D^{(j(\varepsilon))}(J_3)\, | j(\varepsilon), m \rangle = m$,
and
$\langle m \pm 1 |\, D_T^{(M)}(J_\pm)\, | m \rangle = \lim_{\varepsilon \to 0} \langle j(\varepsilon), m \pm 1 |\, \varepsilon D^{(j(\varepsilon))}(J_\pm)\, | j(\varepsilon), m \rangle = \lim_{\varepsilon \to 0} \varepsilon \sqrt{(j(\varepsilon) \mp m)(j(\varepsilon) \pm m + 1)} = M.$
Note that for $D_T^{(M)}$ the contracted Casimir operator (cf. Eq. (7.32))
$C_T = \tfrac{1}{2}(J_+ J_- + J_- J_+)$
takes on the value $M^2\,\mathbf{1}$, and that we get all faithful, irreducible, self-adjoint representations of LT.
(b) Graded contractions
We have
$J_3 \in V_0, \quad J_\pm \in V_1$, and $\gamma_{00} = \gamma_{01} = 1, \quad \gamma_{11} = 0.$
The first problem is to find all possibilities $H^{(j)} = H_0^{(j)} \oplus H_1^{(j)}$ so that $D^{(j)}$ is compatibly graded. Now $H_0^{(j)}$ must be an invariant subspace under $D^{(j)}(J_3)$, i.e. the projection P on $H_0^{(j)}$ must commute with $D^{(j)}(J_3)$. Since the spectral multiplicity of the eigenvalues of $D^{(j)}(J_3)$ is one, it follows that each $|j, m\rangle$ must belong to either $H_0^{(j)}$ or $H_1^{(j)}$. Since the shift operators $D^{(j)}(J_\pm)$ send $H_0^{(j)}$ to $H_1^{(j)}$ and vice versa, it follows that the only possibilities are that $H_0^{(j)}$ is spanned by either all $|j, m\rangle$ with m even, or all $|j, m\rangle$ with m odd.
For ψ = γ we get the representation $D_\psi^{(j)}$ of Lγ on $H^{(j)}$ with non-vanishing matrix elements
$\langle m |\, D_\psi^{(j)}(J_3)\, | m \rangle = m; \quad m = 0, \pm 1, \ldots, \pm j;$
$\langle m \pm 1 |\, D_\psi^{(j)}(J_\pm)\, | m \rangle = \sqrt{(j \mp m)(j \pm m + 1)}$ for m even (resp. odd), and $= 0$ for m odd (resp. even).
We note that $D_\psi^{(j=1)}$ (with grading label 0 for even m) is the adjoint representation of Lγ. All other $D_\psi^{(j)}$ have the property that each invariant subspace gives either the adjoint representation, or a two-dimensional representation which is irreducible but not faithful. Finally, the graded contracted Casimir operator (cf. Eq. (7.33), µ00 = 0)
$C_\gamma = \tfrac{1}{2}(J_+ J_- + J_- J_+)$
vanishes for all $D_\psi^{(j)}$.
Example A.2. Consider the gen. IW-contraction resp. Z2-graded contraction L = so(3) → LT = Lγ = Heisenberg algebra. L is compact simple. LT is non-compact, non-semi-simple. We consider the representations $D^{(j)}$ of L as in Example A.1.
(a) Contractions
We take the contraction in the form
$T(\varepsilon)J_3 = \varepsilon^2 J_3$ and $T(\varepsilon)J_\pm = \varepsilon J_\pm.$
Here we choose j(ε) ∈ N with j(ε) → ∞ so that
$\lim_{\varepsilon \to 0} \varepsilon^2 j(\varepsilon) = q > 0.$
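The contracted representation $D_T^{(q)}$ of LT is defined on the Hilbert space HT with orthonormal basis $|m\rangle$; m = 0, −1, −2, . . . ; by the non-vanishing matrix elements
$\langle m |\, D_T^{(q)}(J_3)\, | m \rangle = \lim_{\varepsilon \to 0} \langle j(\varepsilon), j(\varepsilon) + m |\, \varepsilon^2 D^{(j(\varepsilon))}(J_3)\, | j(\varepsilon), j(\varepsilon) + m \rangle = \lim_{\varepsilon \to 0} \varepsilon^2 (j(\varepsilon) + m) = q,$
$\langle m + 1 |\, D_T^{(q)}(J_+)\, | m \rangle = \lim_{\varepsilon \to 0} \langle j(\varepsilon), j(\varepsilon) + m + 1 |\, \varepsilon D^{(j(\varepsilon))}(J_+)\, | j(\varepsilon), j(\varepsilon) + m \rangle = \lim_{\varepsilon \to 0} \varepsilon \sqrt{(j(\varepsilon) - j(\varepsilon) - m)(j(\varepsilon) + j(\varepsilon) + m + 1)} = \sqrt{-2qm}; \quad m < 0;$
and
$\langle m - 1 |\, D_T^{(q)}(J_-)\, | m \rangle = \lim_{\varepsilon \to 0} \langle j(\varepsilon), j(\varepsilon) + m - 1 |\, \varepsilon D^{(j(\varepsilon))}(J_-)\, | j(\varepsilon), j(\varepsilon) + m \rangle = \lim_{\varepsilon \to 0} \varepsilon \sqrt{(j(\varepsilon) + j(\varepsilon) + m)(j(\varepsilon) - j(\varepsilon) - m + 1)} = \sqrt{-2q(m - 1)}.$
We note that for $D_T^{(q)}$ the contracted Casimir operator (cf. Eq. (7.32)) $C_T = J_3^2$ takes on the value $q^2\,\mathbf{1}$, and that again we get all faithful, irreducible, self-adjoint representations of LT.
(b) Graded contractions
We have
$J_3 \in V_0, \quad J_\pm \in V_1$, and $\gamma_{00} = \gamma_{01} = 0, \quad \gamma_{11} = 1.$
We use the same grading of $H^{(j)}$ as in Example A.1. For ψ = γ we get the representation $D_\psi^{(j)}$ of Lγ on $H^{(j)}$ with non-zero matrix elements
$\langle m |\, D_\psi^{(j)}(J_3)\, | m \rangle = 0; \quad m = 0, \pm 1, \ldots, \pm j;$
$\langle m \pm 1 |\, D_\psi^{(j)}(J_\pm)\, | m \rangle = 0$ for m even (resp. odd), and $= \sqrt{(j \mp m)(j \pm m + 1)}$ for m odd (resp. even).
We note that $D_\psi^{(j)}$ is not faithful, and that the graded contracted Casimir operator (cf. Eq. (7.33); µ11 = 0) $C_\gamma = J_3^2$ vanishes for $D_\psi^{(j)}$.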
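The two limiting procedures of Examples A.1(a) and A.2(a) can be checked numerically. The sketch below is an illustration under stated assumptions: it takes the so(3) shift matrix elements in the standard form √((j ∓ m)(j ± m + 1)), and it realizes the paths j(ε) by rounding M/ε and q/ε² to the nearest integer, which is one admissible choice among many. The function names are hypothetical.

```python
import math


def so3_shift_element(j, m, sign):
    """<j, m + sign| D(J_sign) |j, m> = sqrt((j - sign*m)(j + sign*m + 1)), sign = +1 or -1."""
    return math.sqrt((j - sign * m) * (j + sign * m + 1))


def iso2_element(M, m, sign, eps):
    """Example A.1: eps * <j, m + sign| J_sign |j, m> along the path j(eps) = round(M / eps)."""
    j = round(M / eps)
    return eps * so3_shift_element(j, m, sign)


def heisenberg_element(q, m, eps):
    """Example A.2: eps * <j, j+m+1| J_+ |j, j+m> along j(eps) = round(q / eps**2), with m <= 0."""
    j = round(q / eps ** 2)
    return eps * so3_shift_element(j, j + m, +1)


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps,
              iso2_element(M=2.0, m=3, sign=+1, eps=eps),    # expected to approach M
              heisenberg_element(q=0.5, m=-2, eps=eps))      # expected to approach sqrt(-2*q*m)
```

As ε decreases, the first column approaches M and the second approaches √(−2qm), in line with the contracted matrix elements stated in the two examples; for m = 0 the Heisenberg element vanishes identically, as expected for the highest state.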
Example A.3. Consider the simple IW-contraction resp. Z2-graded contraction L = so(3, 1) → LT = Lγ = iso(3). L is non-compact simple, LT is non-compact, non-semi-simple. The Lie products of the generators $\vec{J}$ and $\vec{K}$ in so(3, 1) are
$\mu(J_i, J_j) = i\varepsilon_{ijk} J_k; \quad i, j, k = 1, 2, 3;$
$\mu(J_i, K_j) = i\varepsilon_{ijk} K_k$, and $\mu(K_i, K_j) = -i\varepsilon_{ijk} J_k.$
Consider the following faithful, irreducible, self-adjoint representation $D = D^{(\lambda, j_0)}$; λ ∈ R; j0 = 1, 2, 3, . . . ; of L from the principal series. Its infinite-dimensional representation space $H^{(j_0)}$ is spanned by the eigenvectors
$|j, m\rangle; \quad m = 0, \pm 1, \ldots, \pm j; \quad j = j_0, j_0 + 1, \ldots;$
of $\vec{J}^{\,2}$ and $J_3$. The non-vanishing matrix elements of $\vec{J}$ are (as in Example A.1)
$\langle j, m |\, D(J_3)\, | j, m \rangle = m$, and $\langle j, m \pm 1 |\, D(J_\pm)\, | j, m \rangle = \sqrt{(j \mp m)(j \pm m + 1)}.$
Since the boosts $\vec{K}$ form an irreducible vector operator under rotations, their matrix elements $\langle j', m' |\, D(\vec{K})\, | j, m \rangle$ are (according to the Wigner–Eckart Theorem) products of the appropriate Clebsch–Gordan coefficient (which only depends on j and m) and a reduced matrix element $\langle j' \| D(\vec{K}) \| j \rangle$. We have
$\sqrt{\frac{j(j + 1)}{2j + 1}}\; \langle j \| D(\vec{K}) \| j \rangle = j_0 \lambda$ (A.1)
and
$\frac{1}{j}\, |\langle j \| D(\vec{K}) \| j - 1 \rangle|^2 = \left(1 + \frac{\lambda^2}{j^2}\right)(j^2 - j_0^2).$
The two Casimir operators
$C_1 = \vec{K}^2 - \vec{J}^{\,2}$ and $C_2 = \vec{J} \cdot \vec{K}$
take on the values
$(1 + \lambda^2 - j_0^2)\,\mathbf{1}$ and $\lambda j_0\,\mathbf{1}.$
(a) Contractions
We use the contraction $T(\varepsilon)\vec{J} = \vec{J}$ and $T(\varepsilon)\vec{K} = \varepsilon\vec{K}$. To contract the representation $D^{(\lambda, j_0)}$ we first choose λ(ε) → ∞ so that
$\lim_{\varepsilon \to 0} \varepsilon \lambda(\varepsilon) = M > 0.$
The contracted representation $D_T = D_T^{(M, j_0)}$ of LT is defined on the original Hilbert space $H^{(j_0)}$. The matrix elements of $\vec{J}$ remain unchanged, and for the reduced matrix elements of $\vec{K}$ we have
$\sqrt{\frac{j(j + 1)}{2j + 1}}\; \langle j \| D_T(\vec{K}) \| j \rangle = \lim_{\varepsilon \to 0} \sqrt{\frac{j(j + 1)}{2j + 1}}\; \langle j \| \varepsilon D(\vec{K}) \| j \rangle = \lim_{\varepsilon \to 0} \varepsilon j_0 \lambda = j_0 M.$
Similarly, we get
$\frac{1}{j}\, |\langle j \| D_T(\vec{K}) \| j - 1 \rangle|^2 = \frac{M^2}{j^2}\,(j^2 - j_0^2).$
This produces a faithful, irreducible, self-adjoint representation $D_T^{(M, j_0)}$ of LT. The contracted Casimir operators of LT are (cf. Eq. (7.32))
$C_{T1} = \vec{K}^2$ and $C_{T2} = \vec{J} \cdot \vec{K}$,
and for $D_T^{(M, j_0)}$ they take on the values
$M^2\,\mathbf{1}$ and $j_0 M\,\mathbf{1}.$
(b) Graded contractions
$\vec{J}$ must have the grading label 0 and $\vec{K}$ the grading label 1. The graded contraction of the Casimir operators yields
$C_{\gamma 1} = \vec{K}^2$ (like in Example A.1) and $C_{\gamma 2} = \vec{J} \cdot \vec{K}.$
We now show that there is no decomposition $H^{(j_0)} = H_0^{(j_0)} \oplus H_1^{(j_0)}$ which produces a compatible grading. We note first that $\vec{J}^{\,2}$ and $J_3$ belong to V0 and are diagonal in the $|j, m\rangle$ basis of $H^{(j_0)}$. By the same argument as in the so(3) case, it follows that each $|j, m\rangle$ must be in either $H_0^{(j_0)}$ or $H_1^{(j_0)}$. Then $K_3 \in V_1$ implies that
$\langle j, m |\, D(K_3)\, | j, m \rangle = 0.$
Since $K_3$ does not change the m-value of $J_3$, this contradicts Eq. (A.1) for all λ ≠ 0.
[In the case λ = 0, $\vec{K}$ changes the j-value by ±1 (otherwise by 0, ±1) and the second Casimir operator C2 vanishes. In this special case we can give all $|j, m\rangle$ with j − j0 even (resp. odd) the grading label 0 (resp. 1). If we take ψ = γ, then both Casimir operators for LT vanish for the contracted representation.]
Acknowledgment
Without the constant interest, criticism, and nagging (but penetrating) questions of my husband, Jim Woods, this paper would not have attained its present form. Thank you.
References
We will quote from Part I of this paper: E. Weimar-Woods, The General Structure of G-graded Contractions of Lie Algebras, I. The Classification (Preprint 04-04 Freie Universität Berlin, to be published in 2006 in the Canadian Journal of Mathematics, accepted on September 22, 2004) in the form cf. Eq. I.(2.5) or cf. Lemma I.3.7 etc.
[1] M. de Montigny and J. Patera, Discrete and continuous graded contractions of Lie algebras and superalgebras, J. Phys. A 24 (1991) 525–547.
[2] R. V. Moody and J. Patera, Discrete and continuous graded contractions of representations of Lie algebras, J. Phys. A 24 (1991) 2227–2257.
[3] I. E. Segal, A class of operator algebras which are determined by groups, Duke Math. J. 18 (1951) 221–265.
[4] E. Inönü and E. P. Wigner, On the contraction of groups and their representations, Proc. Nat. Acad. Sci. U.S. 39 (1953) 510–524.
[5] R. Gilmore, Lie Groups, Lie Algebras, and Some of Their Applications (Wiley & Sons, 1974).
[6] E. J. Saletan, Contractions of Lie groups, J. Math. Phys. 2 (1961) 1–21.
[7] E. Weimar-Woods, Contractions of Lie algebras. Generalized Inönü–Wigner contractions versus graded contractions, J. Math. Phys. 36 (1995) 4519–4548.
[8] E. Weimar-Woods, Contractions, generalized Inönü–Wigner contractions and deformations of finite-dimensional Lie algebras, Rev. Math. Phys. 12 (2000) 1505–1529.
[9] A. L. Onishchik and E. B. Vinberg (eds.), Lie Groups and Lie Algebras III, Encyclopaedia of Mathematical Sciences, Vol. 41 (Springer-Verlag, 1994), Chapter 7, §2.
[10] E. Weimar-Woods, The three-dimensional real Lie algebras and their contractions, J. Math. Phys. 32 (1991) 2028–2033.
[11] J. Patera, R. T. Sharp and P. Winternitz, Invariants of real low dimension Lie algebras, J. Math. Phys. 17 (1976) 986–994.
[12] E. Weimar-Woods, Contraction of Lie algebra representations, J. Math. Phys. 32 (1991) 2660–2665.
[13] M. de Montigny, J. Patera and J. Tolar, Graded contractions and kinematical groups of space-time, J. Math. Phys. 35 (1994) 405–425.
[14] A. Hussin, R. C. King, X. Leng and J. Patera, Graded contractions of the affine Lie algebra $A_1^{(1)}$, its representations and tensor products, and an application to the branching rule $A_1^{(1)} \supset A_1^{(1)}$, J. Phys. A 27 (1994) 4125–4152.
[15] V. Bargmann, On unitary ray representations of continuous groups, Ann. Math. 59 (1954) 1–46.
[16] A. L. Onishchik and E. B. Vinberg (eds.), Lie Groups and Lie Algebras II, Encyclopaedia of Mathematical Sciences, Vol. 21 (Springer-Verlag, 1991), Part II, Chapter 2, §2.
[17] E. Weimar-Woods, Contractions of invariants of Lie algebras, in Proc. XXI Int. Colloq. Group Theoretical Methods in Physics (Group XXI), Vol. 1 (World Scientific Publishing Co., 1996), pp. 132–136.
[18] E. Weimar-Woods, published.
[19] A. M. Bincer and J. Patera, Graded contractions of Casimir operators, J. Phys. A 26 (1993) 5621–5628.
October 11, 2006 13:24 WSPC/148-RMP
J070-00277
Reviews in Mathematical Physics Vol. 18, No. 7 (2006) 713–745 © World Scientific Publishing Company
EFFECTIVE EQUATIONS OF MOTION FOR QUANTUM SYSTEMS
MARTIN BOJOWALD∗,†,‡ and AURELIANO SKIRZEWSKI†,§
∗Institute for Gravitational Physics and Geometry, The Pennsylvania State University, University Park, Pennsylvania 16802, USA
†Max-Planck-Institut für Gravitationsphysik, Albert-Einstein-Institut, Am Mühlenberg 1, D-14476 Potsdam, Germany
‡[email protected] §[email protected]
Received 21 February 2006 Revised 21 June 2006
In many situations, one can approximate the behavior of a quantum system, i.e. a wave function subject to a partial differential equation, by effective classical equations which are ordinary differential equations. A general method and geometrical picture are developed and shown to agree with effective action results, commonly derived through path integration, for perturbations around a harmonic oscillator ground state. The same methods are used to describe dynamical coherent states, which in turn provide means to compute quantum corrections to the symplectic structure of an effective system.
Keywords: Effective theory; low energy effective action; dynamical coherent states.
Mathematics Subject Classification 2000: 81Q15, 81Q20, 81S30
1. Introduction
Many applications of quantum systems are placed in a realm close to classical behavior, where nevertheless quantum properties need to be taken into account. In view of the more complicated structure of quantum systems, both of conceptual and technical nature, it is then often helpful to work with equations of classical type, i.e. systems of ordinary differential equations for mechanical systems, which are amended by correction terms resulting from quantum theory. From a mathematical point of view, the question arises of how well the behavior of a (wave) function subject to a partial differential equation can be approximated by finitely many variables subject to a system of coupled but ordinary differential equations.
One very powerful method is that of low energy effective actions [1, 2] which have been developed and are widely used for quantum field theories. The effective action of a free field theory is identical to the classical action, while interacting theories
receive quantum corrections “from integrating out irrelevant degrees of freedom”. The language is suggestive for the physical intuition behind the formalism, but the technical details and the mathematical relation between classical and quantum theories remain less clear.
In this article we develop, building on earlier work [3–7], a geometrical picture of effective equations of motion for a quantum mechanical system with a clear-cut relation between the classical and quantum system: as a manifold, a classical phase space of the form $\mathbb{R}^{2n}$ can literally be embedded into the quantum system.ᵃ Also the Schrödinger equation can be formulated as Hamiltonian equations of motion for quantum phase space variables, and self-adjoint operators as observables in quantum theory can select special functions on the quantum phase space which can be considered as observables of the classical type. We discuss several examples and show that, in the regime where effective action techniques can be used, they coincide with our method.
ᵃ Using the geometrical picture [3, 4] for this purpose and the idea of horizontality as well as the appearance of additional quantum degrees of freedom in this context were suggested to us by Abhay Ashtekar.
2. Effective Actions
For any system with classical action S[q] as a functional of the classical coordinates q, thus satisfying
$\frac{\delta S}{\delta q} = -J$ (2.1)
in the presence of an external source J, one can formally define the effective action Γ[q] satisfying the same relation
$\frac{\delta \Gamma}{\delta q} = -J$ (2.2)
but containing ħ-dependent quantum corrections. If the generating functional Z[J] of Green's functions is known, Γ is obtained as the Legendre transform [8] of −iħ log Z[J]. This procedure is well-motivated from particle physics where additional contributions to Γ can be understood as resulting from perturbative quantum interactions (“exchange of virtual particles”). Indeed, effective actions are mostly used in perturbative settings where the generating functional Z can be computed by perturbing around free theories, using, e.g., Gaussian path integrations. For other systems, or quantum mechanical applications, Eq. (2.2) can, however, be seen at best as a formal justification. The effective action can rarely be derived in general, but its properties can make an interpretation very complicated. First, Γ is in general complex and so are the effective equations (2.2) as well as their solutions. In fact, q in (2.2) is not the classical q and not even the expectation value of q̂ in a suitable state of the quantum system. Instead, in general, it is related to
non-diagonal matrix elements [9] of qˆ. Secondly, Γ is, in general, a non-local functional of q which cannot be written as the time integral of a function of q and its derivatives. In most applications, one employs a derivative expansion assuming that higher derivatives of q are small. In this case, each new derivative order introduces additional degrees of freedom into the effective action which are not classical, but whose relation to quantum properties of, e.g., the wave function is not clear either. Indeed, in this perturbative scheme, not all solutions of the higher derivative effective action are consistent perturbatively [10] as many depend non-analytically on the perturbation parameter . For those solutions, it is then not guaranteed that they capture the correct perturbative behavior considering that next order corrections, non-analytical in the perturbation parameter, can dominate the leading order. Such non-analytical solutions have to be excluded in a perturbative treatment, which usually brings down the number of solutions to the classical value even if perturbative corrections are of higher derivative form [10]. The description, even in a local approximation, can thus be quite complicated, given by higher derivative equations with many general solutions subject to the additional condition that only solutions analytic in the perturbation parameter are to be retained. The formulation is thus very redundant if higher derivative terms are used. Moreover, where there seem to be additional (quantum) degrees of freedom associated with higher derivative corrections, their role remains dubious given that many solutions have to be excluded. There are other technical difficulties if one tries to generalize beyond the usual realm of perturbing around the ground state of a free field theory or, in quantum mechanics, the ground state of a harmonic oscillator. 
In the latter case, for a system with classical action
$$ S[q(t)] = \int dt \left( \frac{1}{2} m\dot q^2 - \frac{1}{2} m\omega^2 q^2 - U(q) \right), \tag{2.3} $$
one can derive the effective action [11] (see also [12] for the effective potential)
$$ \Gamma_{\rm eff}[q(t)] = \int dt \left[ \left( m + \frac{\hbar\, U'''(q)^2}{2^5 m^2 \left(\omega^2 + m^{-1}U''(q)\right)^{5/2}} \right) \frac{\dot q^2}{2} - \frac{1}{2} m\omega^2 q^2 - U(q) - \frac{\hbar\omega}{2} \left( 1 + \frac{U''(q)}{m\omega^2} \right)^{1/2} \right] \tag{2.4} $$
to first order in $\hbar$ and in the derivative expansion, using path integral techniques. The quantum system is here described effectively in an expansion around the ground state of the harmonic oscillator. On the other hand, a quantum system allows more freedom and one could, e.g., want to find an effective formulation for a quantum system which is prepared to be initially close to a squeezed state, or a state of non-minimal uncertainty. This freedom is not allowed by the usual definition of an effective action. Other problems include the presence of "infrared problems": In the free particle limit, corresponding to a massless field theory, one has $U(q) = -\frac{1}{2} m\omega^2 q^2$
for which (2.4) becomes meaningless. Still, at least for some time, it should be possible to describe the free particle in an effective classical manner. Other generalizations, such as systems perturbed around a Hamiltonian non-quadratic in momenta as they occur, e.g., in quantum cosmology, look even more complicated since one could not rely on Gaussian path integrations. For all these reasons, it is of interest to develop a scheme for deriving effective equations of a quantum system based on a geometrical formulation of quantum mechanics. For semiclassical issues, this has been used already in the context of quantum cosmology [6, 7] where usual techniques fail. As we show here, it also allows a general development of effective systems which reduce to the effective action result (2.4) in the common range of applicability, but is much more general. Moreover, it provides a clear, geometrical picture for the relation between the dynamics of classical and quantum systems, the role of quantum degrees of freedom and the effective approximation.
3. A Geometrical Formulation of Quantum Mechanics

The formalism of quantum theory has been studied for almost a century already and a thorough understanding of its structure, based mainly on functional analysis, has been achieved. From this perspective, quantum mechanics appears very different from classical mechanics not only conceptually but also mathematically. While in classical physics the viewpoint is geometrical, employing symplectic or Poisson structures on a phase space, quantum theory is analytical and based on Hilbert space structures and operator algebras. There are, however, some contributions which develop and pursue a purely geometrical picture of quantum mechanics, in which the process of quantization and kinematical as well as dynamical considerations are generalizations of classical structures. The process of quantization is described in a geometrical, though not always constructive, manner in geometric quantization [13], employing line bundles with connections, but the picture of the resulting theory remains analytical, based on function spaces and operators thereon. Independently, a geometrical formulation of quantum mechanics has been developed which, irrespective of the quantization procedure, provides a geometrical viewpoint for all the ingredients necessary for the basic formulation of quantum physics [3, 4]. It is the latter which will be crucial for our purposes of developing a geometrical theory of effective equations of motion and the classical limit.

Let us assume that we are given a quantum system, specified by a Hilbert space $\mathcal H = (\mathcal V, \langle\cdot,\cdot\rangle)$ with underlying vector space $\mathcal V$ equipped with inner product $\langle\cdot,\cdot\rangle$, together with an algebra of basic operators and a Hamiltonian $\hat H$. The Hamiltonian defines a flow on $\mathcal H$ by $\frac{d}{dt}\Psi = -i\hbar^{-1}\hat H\Psi$.

Lemma 3.1. Let $(\mathcal V, \langle\cdot,\cdot\rangle)$ be a Hilbert space. The inner product $\langle\cdot,\cdot\rangle$ on $\mathcal H$ defines a Kähler structure on $\mathcal V$.
Proof. To start with, we note that the inner product can be decomposed as
$$ \langle\Phi,\Psi\rangle = \frac{1}{2\hbar} G(\Phi,\Psi) + \frac{i}{2\hbar} \Omega(\Phi,\Psi), \tag{3.1} $$
where $G(\Phi,\Psi)$ and $\Omega(\Phi,\Psi)$ denote the real and imaginary parts of $2\hbar\langle\Phi,\Psi\rangle$, respectively. It follows from the properties of an inner product that $G$ is a metric and $\Omega$ a symplectic structure on the vector space $\mathcal V$, identified with its tangent space in any of its points. Also by definition, the metric and symplectic structure are related to each other by $G(\Phi,\Psi) = \Omega(\Phi, i\Psi)$. With the obvious complex structure, $(\mathcal V, G, \Omega)$ is thus Kähler.

As used in the proof, points and tangent vectors of the Kähler manifold $\mathcal K = (\mathcal V, G, \Omega)$ correspond to states in the Hilbert space, and functions densely defined on $\mathcal V$ can be associated to mean values of operators acting on $\mathcal H$: Any operator $\hat F$ on $\mathcal H$ defines a function $F := \langle\hat F\rangle$ on $\mathcal K$ taking values $F(\Psi) = \langle\Psi, \hat F\Psi\rangle$ in points $\Psi$ of its domain of definition. Any state $\eta \in \mathcal H$ defines a constant vector field on $\mathcal K$, which can be used to compute the Lie derivative
$$ \pounds_\eta F(\Psi) := \frac{d}{dt} F(\Psi + t\eta)\Big|_{t=0}. \tag{3.2} $$
This allows us to show

Lemma 3.2. Let $F = \langle\hat F\rangle$ be a function on $\mathcal K$ associated with a self-adjoint operator $\hat F$ on $\mathcal H$. Its Hamiltonian vector field is given by
$$ X_F(\Psi) := \frac{1}{i\hbar} \hat F\Psi. $$

Proof. Using the definition of a Lie derivative and self-adjointness of $\hat F$ we have
$$ \pounds_\eta F(\Psi) = \frac{d}{dt} \langle\Psi + t\eta, \hat F(\Psi + t\eta)\rangle\Big|_{t=0} = \langle\eta, \hat F\Psi\rangle + \langle\Psi, \hat F\eta\rangle = -i\hbar\left( \langle -i\hbar^{-1}\hat F\Psi, \eta\rangle - \langle\eta, -i\hbar^{-1}\hat F\Psi\rangle \right) = \Omega(-i\hbar^{-1}\hat F\Psi, \eta) \tag{3.3} $$
for any vector $\eta$, from which $X_F$ can immediately be read off.

Remark. Such vector fields are also known as Schrödinger vector fields, as their flow is generated on $\mathcal H$ by a Schrödinger equation
$$ \frac{d}{dt}|\Psi\rangle = \frac{1}{i\hbar} \hat F|\Psi\rangle. \tag{3.4} $$
The flow is a family of unitary transformations, i.e. automorphisms of the Hilbert space which preserve the Hilbert space structure. Therefore, the flow preserves not only the symplectic structure of $\mathcal K$, as any Hamiltonian vector field does, but also the metric. Hamiltonian vector fields thus are Killing vector fields, and since
each tangent space has a basis of Killing vectors, the Kähler space is maximally symmetric. For two functions $F = \langle\hat F\rangle$ and $K = \langle\hat K\rangle$ the symplectic structure defines the Poisson bracket
$$ \{F, K\} := \Omega(X_F, X_K) = \frac{1}{i\hbar} \langle[\hat F, \hat K]\rangle. \tag{3.5} $$
For, e.g., $q := \langle\hat q\rangle$ and $p := \langle\hat p\rangle$, we have $\{q, p\} = 1$ from $[\hat q, \hat p] = i\hbar$. Of physical significance in quantum theory are only vectors of the Hilbert space up to multiplication with a non-zero complex number. Physical information is then not contained in the vector space $\mathcal V$ but in the projective space $\mathcal V/\mathbb C^*$. From now on, we will take this into account by working only with norm one states and norm-preserving vector fields.

4. Classical and Quantum Variables

For any quantum system, the algebra of basic operators, which is a representation of the classical algebra of basic phase space variables defined by Poisson brackets, plays an important role. We will assume mainly, for simplicity, that this basic algebra is given by a set of position and momentum operators, $\hat q^i$ and $\hat p_i$ for $1 \le i \le N$, with canonical commutation relations. This distinguished set of operators leads to further structure on $\mathcal K$:

Definition 4.1. The set of fundamental operators $(\hat q^i, \hat p_i)$ on $\mathcal H$ defines a fiber bundle structure on $\mathcal V$ where the bundle projection identifies all points $\Phi$, $\Psi$ for which $\langle\Psi, \hat q^i\Psi\rangle = \langle\Phi, \hat q^i\Phi\rangle$ and $\langle\Psi, \hat p_i\Psi\rangle = \langle\Phi, \hat p_i\Phi\rangle$ for all $i$. The base manifold can be identified with the classical phase space as a manifold.

Remark. The Hilbert space used for the quantization of a classical system is always infinite dimensional, which implies that the fibers of the bundle are infinite dimensional. For instance, for an analytic wave function one can consider the collection of numbers associated to the mean values of products of the fundamental operators, $a_n = \langle\Psi, \hat q^n\Psi\rangle$ and $b_n = \langle\Psi, \hat q^n\hat p\Psi\rangle$ for all $n \ge 0$. Usually called Hamburger moments [14], the $(a_n, b_n)$ are a complete set in the sense that they uniquely determine the wave function. Indeed, from linear combinations $c_n$ of the Hamburger moments with coefficients corresponding to some orthogonal polynomials, taking Hermite polynomials $\{H_n(q) = \sum_l h_{n,l} q^l\}$ for definiteness, we have
$$ c_n = \sum_l h_{n,l} a_l = \int dq\, |\Psi(q)|^2 H_n(q), \tag{4.1} $$
giving the absolute value of the wave function as
$$ |\Psi(q)|^2 = e^{-q^2} \sum_n \frac{c_n H_n(q)}{2^n \sqrt{\pi}\, n!}. \tag{4.2} $$
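The reconstruction in (4.1) and (4.2) can be checked numerically. The following is a minimal sketch, not part of the original construction: it assumes units with $\hbar = m\omega = 1$, picks the first excited oscillator state as an illustrative density, and truncates the sum at a hypothetical cutoff `NMAX`. All function names are invented for this example.

```python
import math

def hermite_coeffs(nmax):
    """Return h with H_n(q) = sum_l h[n][l] * q**l (physicists' Hermite polynomials)."""
    h = [[1.0], [0.0, 2.0]]
    for n in range(1, nmax):
        nxt = [0.0] * (n + 2)
        for l, coeff in enumerate(h[n]):        # contribution of 2 q H_n
            nxt[l + 1] += 2.0 * coeff
        for l, coeff in enumerate(h[n - 1]):    # contribution of -2 n H_{n-1}
            nxt[l] -= 2.0 * n * coeff
        h.append(nxt)
    return h[: nmax + 1]

def density(q):
    """|Psi(q)|^2 of the first excited oscillator state (illustrative choice)."""
    return 2.0 * q * q * math.exp(-q * q) / math.sqrt(math.pi)

def moment(l, lo=-10.0, hi=10.0, steps=4000):
    """Hamburger moment a_l = integral of q**l |Psi|^2, by the trapezoidal rule."""
    dq = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        q = lo + k * dq
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * q**l * density(q)
    return total * dq

NMAX = 6  # hypothetical truncation order
h = hermite_coeffs(NMAX)
a = [moment(l) for l in range(NMAX + 1)]
# Eq. (4.1): c_n as a linear combination of the moments a_l
c = [sum(h[n][l] * a[l] for l in range(n + 1)) for n in range(NMAX + 1)]

def reconstructed_density(q):
    """Right-hand side of Eq. (4.2), truncated at n = NMAX."""
    total = 0.0
    for n in range(NMAX + 1):
        hn = sum(h[n][l] * q**l for l in range(n + 1))
        total += c[n] * hn / (2.0**n * math.sqrt(math.pi) * math.factorial(n))
    return math.exp(-q * q) * total
```

For this particular state only $c_0$ and $c_2$ are non-zero, so the truncated sum already reproduces the density; a generic state would need a larger cutoff and converge only approximately.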
The $b_n$, on the other hand, provide information about the phase $\alpha(q)$ of the wave function up to a constant. Writing $\Psi = |\Psi| e^{i\alpha}$, we have
$$ b_n = -i\hbar \int dq\, \Psi(q)^* q^n \partial_q \Psi(q) = -i\hbar \int dq\, |\Psi(q)|\, q^n \partial_q |\Psi(q)| + \hbar \int dq\, |\Psi(q)|^2 q^n \partial_q \alpha(q), \tag{4.3} $$
from which $\partial_q\alpha(q)$ is determined as before, using the already known norm of $\Psi$. One could thus use Hamburger moments as coordinates on the fiber bundle, but for practical purposes, it is more helpful to choose coordinates which are not only adapted to the bundle structure but also to the symplectic structure. We thus require that, in addition to the classical variables $q^i$ and $p_i$, coordinates of the fibers generate Hamiltonian vector fields symplectically orthogonal to $\partial/\partial q^i$ and $\partial/\partial p_i$.

Definition 4.2. The quantum variables of a Hilbert space $\mathcal H$ are defined as
$$ G^{i_1\cdots i_n} := \langle(\hat x^{(i_1} - x^{(i_1}) \cdots (\hat x^{i_n)} - x^{i_n)})\rangle = \sum_{k=0}^{n} (-)^k \binom{n}{k} x^{(i_1} \cdots x^{i_k} \langle\hat x^{i_{k+1}} \cdots \hat x^{i_n)}\rangle \tag{4.4} $$
with respect to fundamental operators $\{\hat x^i\}_{1\le i\le 2N} := \{\hat q^k, \hat p_k\}_{1\le k\le N}$, where round brackets on indices denote symmetrization.

Variables of this type have been considered in quantum field theories; see, e.g., [15]. Together with the classical variables, they provide in particular local trivializations of the quantum phase space as a fiber bundle.

Lemma 4.3. The fiber coordinates $G^{i_1\cdots i_n}$ on $\mathcal K$ are symplectically orthogonal to the classical coordinates $x^i$.

Proof. We compute the Poisson bracket with $x^j$ to obtain
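$$ \begin{aligned} \{x^j, G^{i_1\cdots i_n}\} &= \sum_{k=0}^{n} (-)^k \binom{n}{k} \left[ \{x^j, x^{(i_1} \cdots x^{i_k}\} \langle\hat x^{i_{k+1}} \cdots \hat x^{i_n)}\rangle + x^{(i_1} \cdots x^{i_k} \{x^j, \langle\hat x^{i_{k+1}} \cdots \hat x^{i_n)}\rangle\} \right] \\ &= \sum_{k=0}^{n} (-)^k \binom{n}{k} \left[ k\, \epsilon^{j(i_n} x^{i_1} \cdots x^{i_{k-1}} \langle\hat x^{i_k} \cdots \hat x^{i_{n-1})}\rangle + (n-k)\, \epsilon^{j(i_n} x^{i_1} \cdots x^{i_k} \langle\hat x^{i_{k+1}} \cdots \hat x^{i_{n-1})}\rangle \right] \\ &= \sum_{l=0}^{n-1} (-)^{l+1} \binom{n}{l} (n-l)\, \epsilon^{j(i_n} x^{i_1} \cdots x^{i_l} \langle\hat x^{i_{l+1}} \cdots \hat x^{i_{n-1})}\rangle + \sum_{k=0}^{n} (-)^k \binom{n}{k} (n-k)\, \epsilon^{j(i_n} x^{i_1} \cdots x^{i_k} \langle\hat x^{i_{k+1}} \cdots \hat x^{i_{n-1})}\rangle = 0, \end{aligned} \tag{4.5} $$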
where we repeatedly used the Leibniz rule and introduced $\epsilon^{ij} = \{x^i, x^j\}$.

Remark. An alternative proof proceeds by computing the Poisson bracket between the function $\langle e^{\alpha_i(\hat x^i - x^i)}\rangle$ and $x^j$, restricting to the dense subspace in which such functions are analytic in $\{\alpha_i\}$, and expanding.

Since the fibers are symplectic, $\Omega$ defines a natural decomposition of tangent spaces of $\mathcal K$ as a direct sum of a vertical space tangent to the fibers and a horizontal space ${\rm Hor}^\Omega \mathcal K$ as the symplectic complement:

Corollary 4.4. $(\mathcal K, \pi, \mathcal B)$ is a fiber bundle with connection over the classical phase space $\mathcal B$ as base manifold.

We now know the Poisson relation between the classical variables $x^i$ and between $x^i$ and the $G^{j_1,\ldots,j_m}$. In order to compute the remaining Poisson brackets $\{G^{i_1,\ldots,i_n}, G^{j_1,\ldots,j_m}\}$ for $N$ canonical degrees of freedom, we introduce a new notation
$$ G^{a_{k_1},\ldots,a_{k_N}}_{b_{k_1},\ldots,b_{k_N}} = \left\langle (\hat q^{k_1} - q^{k_1})^{a_{k_1}} \cdots (\hat q^{k_N} - q^{k_N})^{a_{k_N}} (\hat p_{k_1} - p_{k_1})^{b_{k_1}} \cdots (\hat p_{k_N} - p_{k_N})^{b_{k_N}} \right\rangle_{\rm Weyl}, $$
the label "Weyl" meaning that the product of operators is Weyl ordered, i.e. fully symmetrically ordered. The notation allows us to drop indices whose values are zero, so whenever we are dealing with a single pair of degrees of freedom, we use the notation $G^{a,n} := G^{n-a}_{a}$.

Lemma 4.5. The Poisson brackets for the variables above are
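$$ \begin{aligned} \left\{ G^{a_{k_1},\ldots,a_{k_N}}_{b_{k_1},\ldots,b_{k_N}}, G^{c_{k_1},\ldots,c_{k_N}}_{d_{k_1},\ldots,d_{k_N}} \right\} = {} & - \sum_{r,s,e_1,\ldots,e_N} (-)^{r+s} \left(\frac{\hbar}{2}\right)^{2r} \delta_{e_1+\cdots+e_N,\,2r+1}\; K^{\{a\}\{b\}\{c\}\{d\}}_{r,s,\{e\}}\; G^{a_{k_1}+c_{k_1}-e_1,\ldots,a_{k_N}+c_{k_N}-e_N}_{b_{k_1}+d_{k_1}-e_1,\ldots,b_{k_N}+d_{k_N}-e_N} \\ & - \sum_{f=1}^{N} \left( a_{k_f} d_{k_f}\, G^{a_{k_1},\ldots,a_{k_f}-1,\ldots,a_{k_N}}_{b_{k_1},\ldots,b_{k_N}}\, G^{c_{k_1},\ldots,c_{k_N}}_{d_{k_1},\ldots,d_{k_f}-1,\ldots,d_{k_N}} - b_{k_f} c_{k_f}\, G^{a_{k_1},\ldots,a_{k_N}}_{b_{k_1},\ldots,b_{k_f}-1,\ldots,b_{k_N}}\, G^{c_{k_1},\ldots,c_{k_f}-1,\ldots,c_{k_N}}_{d_{k_1},\ldots,d_{k_N}} \right) \end{aligned} $$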
with indices running as
$$ 1 \le 2r+1 \le \sum_{f=1}^{N} \left( \min(a_f, d_f) + \min(b_f, c_f) \right), \qquad 0 \le s \le \min\left( r, \sum_{f=1}^{N} \min(b_f, c_f) \right), \qquad 0 \le e_f \le \min(a_f, d_f, s) + \min(b_f, c_f, 2r+1-s), $$
and coefficients given by
{a}{b}{c}{d}
Kr,s,{e}
=
δg +···+g ,2r+1−s 1 N s!(2r + 1 − s)! ,...,g
g1
f
n
af bf cf df ef − gf gf gf ef − gf 2r + 1 − s s gf ef − gf (4.6)
where $\max(e_f - s, e_f - a_f, e_f - d_f, 0) \le g_f \le \min(b_f, c_f, 2r+1-s, e_f)$.

Proof. Consider first the Poisson bracket between functions of the form $D(\alpha) = \langle e^{\alpha_i(\hat x^i - x^i)}\rangle$. For wave functions analytic in the mean values, $D(\alpha)$ is an analytic function and so is the Poisson bracket between two such functions $D(\alpha)$ and $D(\beta)$. We can therefore take the coefficients in a Taylor expansion for all orders in $\alpha_i$ and $\beta_j$. Using the relation $[e^{\alpha_i\hat x^i}, e^{\beta_j\hat x^j}] = 2i \sin\left(\frac{\hbar}{2}\alpha_j\beta_k\epsilon^{jk}\right) e^{(\alpha+\beta)_i\hat x^i}$, which follows from the Baker–Campbell–Hausdorff formula and the commutator $[\alpha_i\hat x^i, \beta_j\hat x^j] = i\hbar\,\epsilon^{ij}\alpha_i\beta_j$, we find that
$$ \{D(\alpha), D(\beta)\} = \frac{2}{\hbar} \sin\left( \frac{\hbar}{2}\alpha_j\beta_k\epsilon^{jk} \right) D(\alpha+\beta) - \alpha_j\beta_k\epsilon^{jk} D(\alpha) D(\beta). \tag{4.7} $$
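Now, we use $D(\alpha) = \langle e^{\alpha_i(\hat x^i - x^i)}\rangle = \sum_{\{a\},\{b\}} G^{a_1\cdots a_N}_{b_1\cdots b_N} \prod_{i=1}^{N} \alpha_{q_i}^{a_i} \alpha_{p_i}^{b_i} (a_i!\, b_i!)^{-1}$, and substitute
$$ \frac{2}{\hbar} \sin\left( \frac{\hbar}{2}\alpha_j\beta_k\epsilon^{jk} \right) D(\alpha+\beta) = - \sum (-)^{r+s} \left(\frac{\hbar}{2}\right)^{2r} G^{a_1+c_1,\ldots,a_N+c_N}_{b_1+d_1,\ldots,b_N+d_N} \times \prod_{f=1}^{N} \frac{ \alpha_{q_f}^{a_f+g_f}\, \alpha_{p_f}^{b_f+e_f}\, \beta_{q_f}^{c_f+e_f}\, \beta_{p_f}^{d_f+g_f} }{ a_f!\, b_f!\, c_f!\, d_f!\, e_f!\, g_f!\, (2r+1-s-e_f)!\, (s-g_f)! }, \tag{4.8} $$
where we sum over all collections of numbers $a_f$, $b_f$, $c_f$, $d_f$, $e_f$, $g_f$, $r$ and $s$ such that $\sum_f g_f = s$, $\sum_f e_f = 2r - s$ and $s \le 2r+1$. Since the equality (4.7) holds for any $\alpha$ and $\beta$, coefficients in the expansion have to fulfill the equality.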
5. Uncertainty Principle

The fibers of $\mathcal K$ as a fiber bundle over the classical phase space are not vector spaces, and the quantum variables $G^{i_1,\ldots,i_n}$ are not allowed to take arbitrary values. Similarly, not every collection of numbers is a collection of Hamburger moments. With $\mathcal K$ being a Kähler space, the fibers are bounded by relations following from Schwarz inequalities. A special case of this fact is well known and commonly written as the uncertainty relation
$$ (\Delta q)^2 (\Delta p)^2 \ge \frac{\hbar^2}{4} + \left( \langle(\hat q\hat p + \hat p\hat q)/2\rangle - qp \right)^2 \ge \frac{\hbar^2}{4}, \tag{5.1} $$
where $(\Delta a)^2 = \langle(\hat a - a)^2\rangle$, or in our notation
$$ G^{0,2} G^{2,2} \ge \frac{\hbar^2}{4} + (G^{1,2})^2. \tag{5.2} $$
More generally, the Schwarz inequality for a Kähler manifold with metric $g$ and symplectic structure $\omega$ is
$$ g(u,u)\, g(v,v) \ge |g(u,v)|^2 + |\omega(u,v)|^2 \tag{5.3} $$
for all tangent vectors $u$ and $v$. This results in bounds to be imposed on the quantum variables.

Lemma 5.1. The function $D(\alpha) = \langle e^{\alpha_i(\hat x^i - x^i)}\rangle$ is subject to
$$ \left( D(2\alpha) - D(\alpha)^2 \right) \left( D(2\beta) - D(\beta)^2 \right) \ge D(\alpha+\beta)^2 - 2\cos\left( \frac{1}{2}\hbar\,\alpha\times\beta \right) D(\alpha+\beta) D(\alpha) D(\beta) + D(\alpha)^2 D(\beta)^2. \tag{5.4} $$
Proof. For the Schwarz inequality, we need to know the metric and pre-symplectic structure on the space of states of unit norm, which we compute by evaluating them on vector fields that generate transformations only along the submanifold of unit vectors in the Hilbert space. To an arbitrary vector $X_F = \frac{1}{i\hbar}\hat F\Psi$, we associate the vector given by $\tilde X_F = (1 - |\Psi\rangle\langle\Psi|) X_F = \frac{1}{i\hbar}(\hat F - F)\Psi$. This ensures that the transformation generated by $\tilde X_F$ maps normalized states to normalized states, which is most easily seen infinitesimally using $|(1 + \epsilon\tilde X_F)\Psi|^2 = |\Psi|^2 - 2i\epsilon\hbar^{-1}\langle\Psi, (\hat F - F)\Psi\rangle + O(\epsilon^2) = |\Psi|^2 + O(\epsilon^2)$. The metric on the space of physical states evaluated in Hamiltonian vector fields induces a symmetric bracket
$$ (F, K) = g(X_F, X_K) = G\left( (1 - |\Psi\rangle\langle\Psi|) X_F, (1 - |\Psi\rangle\langle\Psi|) X_K \right). \tag{5.5} $$
The symplectic structure is as before, $\omega(X_F, X_K) = \Omega(X_F, X_K)$. For the corresponding operators, $g$ and $\omega$ result in the anticommutator $[\cdot,\cdot]_+$ and commutator $[\cdot,\cdot]$, respectively.
For functions $\langle e^{\alpha\cdot\hat x}\rangle$ and $\langle e^{\beta\cdot\hat x}\rangle$ (parameterized by $\alpha_i$ and $\beta_i$), the Schwarz inequality implies
$$ \left( \langle e^{2\alpha\cdot\hat x}\rangle - \langle e^{\alpha\cdot\hat x}\rangle^2 \right) \left( \langle e^{2\beta\cdot\hat x}\rangle - \langle e^{\beta\cdot\hat x}\rangle^2 \right) \ge \left| \frac{1}{2}\langle[e^{\alpha\cdot\hat x}, e^{\beta\cdot\hat x}]_+\rangle - \langle e^{\alpha\cdot\hat x}\rangle\langle e^{\beta\cdot\hat x}\rangle \right|^2 + \frac{1}{4}\left| \langle[e^{\alpha\cdot\hat x}, e^{\beta\cdot\hat x}]\rangle \right|^2 \tag{5.6} $$
which upon using, as before, the Baker–Campbell–Hausdorff formula for the commutator and anticommutator and multiplying both sides with $e^{-2(\alpha+\beta)\cdot x}$ proves the lemma.

This gives us a large class of inequalities, thus specifying bounds on the variables $G^{i_1,\ldots,i_n}$. The boundary, obtained through saturation of the inequalities, is characterized by relations which result from the lemma order by order in $\alpha$ and $\beta$.

6. Quantum Evolution

The dynamical flow of the quantum system is given as the unitary Schrödinger flow on $\mathcal H$ of a self-adjoint Hamiltonian operator $\hat H$. As before, this flow is also Hamiltonian when viewed on the Kähler space $\mathcal K$. It is generated by the Hamiltonian function obtained as the mean value of the Hamiltonian operator. In terms of coordinates on the manifold, the Hamiltonian function is obtained by Taylor expanding the mean value of the Hamiltonian operator, which in our convention is taken to be Weyl ordered:

Definition 6.1. The quantum Hamiltonian [footnote b] on $\mathcal K$ is the function
$$ H_Q := \langle H(\hat x^i)\rangle_{\rm Weyl} = \langle H(x^i + (\hat x^i - x^i))\rangle = \sum_{n=0}^{\infty} \sum_{a=0}^{n} \frac{1}{n!} \binom{n}{a} \frac{\partial^n H(q,p)}{\partial p^a\, \partial q^{n-a}}\, G^{a,n} \tag{6.1} $$
generating Hamiltonian equations of motion
$$ \dot x^i = \{x^i, H_Q\}, \qquad \dot G^{a,n} = \{G^{a,n}, H_Q\}. \tag{6.2} $$
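As a worked illustration of the expansion (6.1), consider a quartic oscillator. This choice of Hamiltonian, and the coupling $\lambda$, are assumptions made only for this example and do not appear in the examples of the text. Because $H$ is polynomial, the sum terminates at $n = 4$:

```latex
% Illustrative Hamiltonian (assumption): H = p^2/(2m) + \lambda q^4.
% Non-vanishing derivatives:
%   \partial_p^2 H = 1/m, \quad \partial_q^2 H = 12\lambda q^2, \quad
%   \partial_q^3 H = 24\lambda q, \quad \partial_q^4 H = 24\lambda .
\begin{align*}
H_Q &= \frac{p^2}{2m} + \lambda q^4
      + \frac{1}{2!}\left[ \binom{2}{0}\, 12\lambda q^2\, G^{0,2}
      + \binom{2}{2}\, \frac{1}{m}\, G^{2,2} \right]
      + \frac{1}{3!}\binom{3}{0}\, 24\lambda q\, G^{0,3}
      + \frac{1}{4!}\binom{4}{0}\, 24\lambda\, G^{0,4} \\
    &= \frac{p^2}{2m} + \lambda q^4 + \frac{1}{2m} G^{2,2}
      + 6\lambda q^2 G^{0,2} + 4\lambda q\, G^{0,3} + \lambda G^{0,4}.
\end{align*}
```

The same result follows directly from $\langle(q + (\hat q - q))^4\rangle$ using $\langle\hat q - q\rangle = 0$, which provides a simple consistency check of the binomial factors in (6.1). Unlike the quadratic case, the classical variables here couple to $G^{0,2}$ and $G^{0,3}$ through $\dot p = \{p, H_Q\}$.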
This Hamiltonian flow is equivalent to the Schrödinger equation of the Hamiltonian operator. As such, it is an equivalent description of the quantum dynamics and only superficially takes a classical form, albeit for infinitely many variables, in its mathematical structure. Nevertheless, the reformulation makes it possible to analyze the classical limit in a direct manner, and to derive effective equations in appropriate regimes. Classical dynamics is to arise in the limit of "small" quantum fluctuations which, when the fluctuations are completely ignored or switched off by $\hbar \to 0$, should give rise to classical equations of motion. In practice, this limit is not easy to define, and the most direct way is to derive first effective equations of motion, which still contain $\hbar$, and then take the limit $\hbar \to 0$. In this procedure, the main problem is to reduce the infinite set of coupled quantum equations of motion to a set of differential equations for only a finite set of variables. Additional degrees of freedom without classical analogs carry information about, e.g., the spreading of the wave function around the peak, which itself is captured by expectation values. For a formulation of classical type, taking into account only a finite number of degrees of freedom, a system has to allow a finite-dimensional submanifold of the quantum space $\mathcal K$ which is preserved by the quantum flow. We start by generalizing the situation encountered in [5]:

Footnote b: This is the basic object for an effective theory, playing a similar role in the effective potential [16].

Definition 6.2. A strong effective classical system $(\mathcal P, H_{\rm eff})$ for a quantum system $(\mathcal H, \hat H)$ is given by a finite dimensional pre-symplectic subspace $\mathcal P$ of the Kähler space $\mathcal K$ associated with $\mathcal H$ satisfying the following two conditions:
(1) For each $p \in \mathcal P \subset \mathcal K$, the tangent space $T_p\mathcal P$ contains the horizontal subspace ${\rm Hor}^\Omega_p \mathcal K$ of $p$ in $\mathcal K$ defined by the symplectic structure: ${\rm Hor}^\Omega_p \mathcal K \subset T_p\mathcal P$ for all $p \in \mathcal P$ (base horizontality).
(2) $\mathcal P$ is fixed under the Schrödinger flow of $\hat H$ and, if $\mathcal P$ is symplectic, the restriction of the flow to $\mathcal P$ agrees with the Hamiltonian flow generated by the effective Hamiltonian $H_{\rm eff}$.

Remark. A strong effective classical system agrees with the quantum system both at the kinematical and dynamical level since its symplectic structure as well as the Hamiltonian flow are induced by the embedding. As such, the conditions are very strong since they require a quantum system to be described exactly in terms of a finite dimensional system $\mathcal P$. In addition to agreement between the strong effective and the quantum dynamics, the first condition ensures that the classical variables are contained in $\mathcal P$ and fulfill the classical Poisson relations.
In the simplest case, we require the effective system to have the same dimension as the classical system, such that potentially only correction terms will appear in $H_{\rm eff}$ (to be discussed further in Theorem 8.2 below) but no additional degrees of freedom. Quantum variables, in general, cannot simply be ignored since they evolve and back react on the classical variables. Sometimes one may be forced to keep an odd number of quantum variables, such as the three $G^{a,2}$, in the system, which we allow by requiring the effective phase space $\mathcal P$ to be only pre-symplectic. For a strong effective system of the classical dimension, however, the dynamics of the quantum variables in the embedding space occurs only as a functional dependence through the classical coordinates:
$$ \dot G^{a,n} = \dot x^i\, \partial_{x^i} G^{a,n}(x^j). \tag{6.3} $$
The effective equations of motion, generated by $H_{\rm eff}$, are then obtained by inserting solutions $G^{a,n}(x)$ in the equations for $x^i$:
$$ \dot x^i = \{x^i, H_Q\}\big|_{G^{a,n}(x)} = \sum_{n=0}^{\infty} \frac{1}{n!} \{x^i, H(x^i)_{,i_1\cdots i_n}\}\, G^{i_1,\ldots,i_n}(x). \tag{6.4} $$
7. Examples

We now demonstrate the applicability of the general procedure by presenting examples, which will then lead the way to a weakened definition and, in the following section, a proof that the results coincide with standard effective action techniques when both can be applied.

Example 1: Harmonic oscillator

The quantum Hamiltonian (6.1) for a harmonic oscillator is
$$ H_Q = \frac{1}{2m} p^2 + \frac{1}{2} m\omega^2 q^2 + \frac{1}{2} m\omega^2 G^{0,2} + \frac{1}{2m} G^{2,2}, \tag{7.1} $$
giving equations of motion
$$ \dot p = \{p, H_Q\} = -m\omega^2 q, \qquad \dot q = \{q, H_Q\} = \frac{1}{m} p, \qquad \dot G^{a,n} = \{G^{a,n}, H_Q\} = \frac{1}{m}(n-a)\, G^{a+1,n} - m\omega^2 a\, G^{a-1,n}. \tag{7.2} $$
In this case, the set of infinitely many coupled equations splits into an infinite number of sets, for each $n$ as well as the classical variables, each having a finite number of coupled equations. Independently of the solutions for the $G^{a,n}$, we obtain the same set of effective equations for $q$ and $p$, agreeing with the classical ones. Therefore, the effective Hamiltonian for a system of the classical dimension is here identical to the classical one (up to a constant which can be added freely). We can also define higher dimensional (but non-symplectic) systems by including the variables $G^{a,n}$ for a finite set of values for $n$. Along the classical evolution, the evolution of the additional parameters is then given by linear differential equations which we write down in a dimensionless form, defining
$$ \tilde G^{a,n} = \hbar^{-n/2} (m\omega)^{n/2-a}\, G^{a,n}. \tag{7.3} $$
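The decoupling of the $n = 2$ sector in (7.2) can be explored numerically. The sketch below integrates the three equations for $(G^{0,2}, G^{1,2}, G^{2,2})$ with a standard fourth-order Runge–Kutta step. It is an illustration only: the units $\hbar = m = \omega = 1$, the step count, and the squeezed initial data are assumptions of this example, and the function names are hypothetical.

```python
import math

def moment_rhs(g, m, omega):
    """Right-hand side of Eq. (7.2) for n = 2, with g = (G^{0,2}, G^{1,2}, G^{2,2})."""
    g02, g12, g22 = g
    return (2.0 * g12 / m,
            g22 / m - m * omega**2 * g02,
            -2.0 * m * omega**2 * g12)

def rk4_step(g, dt, m, omega):
    """One classical Runge-Kutta step for the linear moment equations."""
    def shifted(base, k, factor):
        return tuple(b + factor * ki for b, ki in zip(base, k))
    k1 = moment_rhs(g, m, omega)
    k2 = moment_rhs(shifted(g, k1, dt / 2.0), m, omega)
    k3 = moment_rhs(shifted(g, k2, dt / 2.0), m, omega)
    k4 = moment_rhs(shifted(g, k3, dt), m, omega)
    return tuple(gi + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                 for gi, a, b, c, d in zip(g, k1, k2, k3, k4))

def evolve(g, t_final, m=1.0, omega=1.0, steps=5000):
    dt = t_final / steps
    for _ in range(steps):
        g = rk4_step(g, dt, m, omega)
    return g

def invariant(g):
    """G^{0,2} G^{2,2} - (G^{1,2})^2, which is conserved by the flow of (7.2)."""
    return g[0] * g[2] - g[1] ** 2

HBAR = 1.0
coherent = (HBAR / 2.0, 0.0, HBAR / 2.0)   # hbar/(2 m omega), 0, hbar m omega/2 with m = omega = 1
squeezed = (1.0, 0.0, 0.25)                # hypothetical squeezed data with the same invariant

after_half_period = evolve(squeezed, math.pi)   # moments oscillate with frequency 2*omega
```

With these inputs the coherent-state values stay fixed, while the squeezed data oscillate but return to their initial values after the time $\pi/\omega$, with the combination entering the uncertainty relation (5.2) unchanged along the way.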
The requirement that dynamics be restricted to the classical subspace parametrized by $q$ and $p$ implies
$$ \frac{1}{\omega}\left( \frac{1}{m} p\,\partial_q - m\omega^2 q\,\partial_p \right) \tilde G^{a,n} = (n-a)\,\tilde G^{a+1,n} - a\,\tilde G^{a-1,n} =: {}^{(n)}M^a{}_b\, \tilde G^{b,n}, \tag{7.4} $$
whose solution is
$$ \tilde G^{a,n}(r,\theta) = \left( \exp\theta\, {}^{(n)}M \right)^a{}_b\, A^{b,n}(r), \tag{7.5} $$
where $r = \frac{1}{m} p^2 + m\omega^2 q^2$, $\tan(\theta) = m\omega q/p$ and $A^{a,n}(r)$ are $n+1$ arbitrary functions of $r$. For, e.g., $n = 2$, we have
$$ \tilde G^{0,2}(r,\theta) = A^{0,2}(r) - e^{2i\theta} A^{2,2}(r) - e^{-2i\theta} A^{-2,2}(r), \tag{7.6} $$
$$ \tilde G^{1,2}(r,\theta) = -i e^{2i\theta} A^{2,2}(r) + i e^{-2i\theta} A^{-2,2}(r), \tag{7.7} $$
$$ \tilde G^{2,2}(r,\theta) = A^{0,2}(r) + e^{2i\theta} A^{2,2}(r) + e^{-2i\theta} A^{-2,2}(r). \tag{7.8} $$
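The combination of second-order variables that enters the uncertainty relation (5.2) can be evaluated directly from (7.6)–(7.8). The short computation below is a sketch of that step; the abbreviations $X$ and $Y$ are introduced here only for bookkeeping.

```latex
% Abbreviations (introduced for this computation only):
%   X = e^{2i\theta} A^{2,2} + e^{-2i\theta} A^{-2,2}, \qquad
%   Y = -i e^{2i\theta} A^{2,2} + i e^{-2i\theta} A^{-2,2},
% so that \tilde G^{0,2} = A^{0,2} - X, \ \tilde G^{1,2} = Y, \ \tilde G^{2,2} = A^{0,2} + X.
\begin{align*}
X^2 &= e^{4i\theta} (A^{2,2})^2 + 2 A^{2,2} A^{-2,2} + e^{-4i\theta} (A^{-2,2})^2, \\
Y^2 &= -e^{4i\theta} (A^{2,2})^2 + 2 A^{2,2} A^{-2,2} - e^{-4i\theta} (A^{-2,2})^2, \\
\tilde G^{0,2} \tilde G^{2,2} - (\tilde G^{1,2})^2
    &= (A^{0,2})^2 - X^2 - Y^2 = (A^{0,2})^2 - 4 A^{2,2} A^{-2,2}.
\end{align*}
% By (7.3), \tilde G^{0,2}\tilde G^{2,2} = \hbar^{-2} G^{0,2} G^{2,2} and
% (\tilde G^{1,2})^2 = \hbar^{-2} (G^{1,2})^2, so (5.2) becomes
% \tilde G^{0,2}\tilde G^{2,2} - (\tilde G^{1,2})^2 \ge 1/4 .
```

The $\theta$-dependence drops out, so the bound constrains only the functions $A^{a,2}(r)$ and is preserved along the classical flow.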
In terms of the constants $A^{a,n}$, the uncertainty relation (5.2) reads
$$ (A^{0,2}(r))^2 - 4 A^{2,2}(r) A^{-2,2}(r) \ge \frac{1}{4}. \tag{7.9} $$
We are thus allowed to choose $A^{2,2} = 0 = A^{-2,2}$ and $A^{0,2} = \frac{1}{2}$, which saturates the uncertainty bound and makes the $G^{a,2}$ constant. In fact, these values arise from quantum evolution given by coherent states $|\alpha\rangle = e^{\alpha\hat a^\dagger - \bar\alpha\hat a}|0\rangle$, which correspond to trajectories of constant quantum variables
$$ \tilde G^{a,n} = \frac{1}{2^n} \frac{a!\,(n-a)!}{(a/2)!\,((n-a)/2)!} \tag{7.10} $$
for even $a$ and $n$, and $\tilde G^{a,n} = 0$ otherwise. This implies that any truncation of the system by including only a finite set of values for $n$, which as already seen is consistent with the dynamical equations, and choosing initial conditions to be that of a coherent state gives a base horizontal subspace as required by Definition 6.2. In other words, the harmonic oscillator allows an infinite set of strong effective classical systems, including one of the classical dimension. The last case is symplectic, with effective Hamiltonian $H_{\rm eff} = H + {\rm const}$. In particular, for $n = 2$, we see that the uncertainty relations are saturated. For other states, the quantum variables will, in general, vary during evolution, which means that the spreading of states changes in time. Nevertheless, the variables remain bounded and the system will stay in a semiclassical regime of small uncertainties if it starts there. With varying $G$, we will not obtain a strong effective system as horizontality will be violated. Nevertheless, such states are often of interest and suitable for an effective description, which we will provide in a weakened form later on.

Example 2: Linear systems

The harmonic oscillator is a special case of systems where a complete set of functions on the classical phase space exists such that they form a Lie algebra with the Hamiltonian. For such systems, which we call linear, semiclassical aspects can be analyzed in an elegant manner using generalized coherent states: a family of states (of the dimension of the algebra minus the dimension of its subalgebra that generates the stability subgroup of a given, so-called extremal state) with respect to which the mean values of operators can be approximated very well by their classical expressions [17].
In this example, we assume that basic variables of the quantum system are not necessarily canonical but given by the Lie algebra elements $\hat L^i$ of a linear quantum system. Thus, our classical variables are mean values $L^i := \langle\hat L^i\rangle$, and quantum variables are
$$ G_L^{i_1,\ldots,i_n} = \langle(\hat L^{(i_1} - L^{(i_1}) \cdots (\hat L^{i_n)} - L^{i_n)})\rangle. $$
Poisson brackets between these functions on the infinite dimensional Kähler manifold $\mathcal K$ can easily be found to be $\{L^i, L^j\} = f^{ij}{}_k L^k$ and
$$ \{L^i, G_L^{i_1,\ldots,i_n}\} = \sum_{r,j} f^{i i_r}{}_j\, G_L^{i_1,\ldots,i_{r-1}\, j\, i_{r+1},\ldots,i_n}. $$
It is then immediately seen that the Hamiltonian dynamics of all degrees of freedom is linear, the $L^i$ decouple from the quantum variables, and that the dynamics of any $G_L^{i_1,\ldots,i_n}$ depends only on other $G_L^{j_1,\ldots,j_n}$ with the same $n$. As in the harmonic oscillator case, the dynamics of infinitely many degrees of freedom thus decouples into infinitely many sectors containing only finitely many variables. This shows

Corollary 7.1. Any linear quantum system admits a class of finite dimensional subspaces preserved by the quantum flow, including one of the classical dimension.

This is not sufficient for the existence of a strong effective system, for which we also have to discuss base horizontality. As in the harmonic oscillator example, one can try to use coherent states which have been widely analyzed in this context. Nevertheless, the issue of base horizontality, i.e. finding coherent states for which all $G$ are constant, in general, is more complicated. A special family of states is generated by acting with the Lie algebra on an extremal state, i.e. a lowest weight of a module representation, which can thus be seen to be in one-to-one correspondence with the factor space of the Lie algebra by the stabilizer of the state. More explicitly, those states are of the form
$$ |\eta\rangle_{\Lambda,\Omega} = e^{\sum_\alpha \eta_\alpha E_\alpha - {\rm H.c.}}\, |{\rm ext}\rangle = N(\tau(\eta), \tau(\eta)^*)^{-1}\, e^{\sum_\alpha \tau_\alpha(\eta) E_\alpha}\, |{\rm ext}\rangle, $$
where $\Lambda$ is a representation of the Lie algebra, $\Omega$ is the quotient of the group manifold by its stabilizer, $|{\rm ext}\rangle$ is an extremal state, $E_{-\alpha}|{\rm ext}\rangle = 0$ for all positive roots $\alpha$, and $\eta_\alpha$ or $\tau_\alpha$ are coordinate charts of the homogeneous space. Since the flow is generated by an element of the Lie algebra, generalized coherent states define a preserved manifold according to the Baker–Campbell–Hausdorff formula. In this situation, one can compute the mean values of elements $L^i$ of the Lie algebra and the quantum variables $G_L^{i_1,\ldots,i_n}$ as functions over the classical phase space. With this construction of coherent states, the semiclassical phase space associated to the Lie algebra and the dimension of the classical theories would differ depending on the choice of the extremal state and each of these would provide us
with diffeomorphisms from the set of $L^i$ to the $\tau_\alpha$, these last ones being the only dynamical variables of this subspace (when all conditions are satisfied, we have by definition dynamical coherent states). We notice as well that a natural emergence of a Kähler structure for this submanifold of the space of states, as observed within the context of the geometrical formulation of quantum mechanics, is also justified in Gilmore's construction. We are not aware of general expressions for the $G$ or special choices of constant values as they exist for the harmonic oscillator. It is, however, clear that such constant choices are not possible in general for a linear system, as the counterexample of the free particle demonstrates.
Example 3: Free particle

The free particle is an example of a linear system and can be obtained as the limit of a harmonic oscillator for $\omega \to 0$. However, the limit is non-trivial and the semiclassical behavior changes significantly. If we re-instate units into the uncertainty formulas of the harmonic oscillator, we obtain in the case of constant $G^{a,2}$:
$$ G^{0,2} = \frac{\hbar}{2m\omega}, \qquad G^{1,2} = 0, \qquad G^{2,2} = \frac{\hbar m\omega}{2}. $$
The fixed point of the evolution of quantum variables which exists for the harmonic oscillator thus moves out to infinity in the free particle limit and disappears. Moreover, the closed classical orbits break open and become unbounded. Even nonconstant bounded solutions for the $G$ then cease to exist, a fact well known from quantum mechanics where the wave function of a free particle has a strictly growing spread, while harmonic oscillator states always have bounded spread as follows from (7.6)–(7.8). For a free particle, one can thus not expect to have a valid semiclassical approximation for all times. One can see this explicitly by computing eigenvalues of the matrices ${}^{(n)}M$ in (7.4) for arbitrary $n$, which in the limit of vanishing frequency become degenerate. More precisely, the solutions of
$$ \frac{p}{m}\,\partial_q G^{a,n} = \frac{n-a}{m}\, G^{a+1,n} \tag{7.11} $$
are given by
$$ G^{a,n}(q,p) = p^a \sum_{i=0}^{n-a} c_{i,n}\, \frac{(n-a)!}{(n-a-i)!}\, q^{n-a-i} \tag{7.12} $$
with integration constants $c_{i,n}$, $i = 0, \ldots, n$. Minimal uncertainty requires for $n = 2$ that $2c_0 c_2 - c_1^2 = \frac{\hbar^2}{4p^2}$. Initial conditions could be chosen by requiring the initial state to be a harmonic oscillator coherent state at the point $(q_0, p_0)$. Since, due to the degeneracy of eigenvalues, solutions for the $G$ are now polynomials in $q$ and the classical trajectories are unbounded, the spread is unbounded when the whole evolution is considered. In particular, no constant choice and so no strong
effective system exists. With unbounded quantum variables, the system cannot be considered semiclassical for all times, but for limited amounts of time, this can be reasonable. If this is done, the equations of motion for the classical variables $q$ and $p$ are unmodified, such that there is no need for introducing an effective Hamiltonian different from the classical one if one is interested only in an effective system of the classical dimension.

Example 4: Quantum cosmology

So far, we have mainly reproduced known results in a different language. To illustrate the generality of the procedure, we now compute effective equations for an unbounded Hamiltonian which generally occurs in quantum cosmology. Here, one considers the quantized metric of a homogeneous and isotropic space-time whose sole dynamical parameter is the scale factor $a$ determining the change of size of space in time. The canonical structure as well as the Hamiltonian follow from the Einstein–Hilbert action specialized to such an isotropic metric. The momentum is then given by $p_a = 3a\dot a/\kappa$ with the gravitational constant $\kappa$, and the Hamiltonian is equivalent to the Friedmann equation. There are different sets of canonical variables, all related to the spatial metric and extrinsic curvature of spatial slices, some of which are better adapted to quantization. Here, we use the example of isotropic quantum cosmology coupled to matter in the form of dust (constant matter energy $E$) in Ashtekar variables [18], which in the isotropic case are $(c, p)$ with $\{c, p\} = \frac{1}{3}\gamma\kappa$, where $\gamma$ is a real constant, the so-called Barbero–Immirzi parameter [19, 20], and give a Hamiltonian $H = -3\gamma^{-2}\kappa^{-1} c^2\sqrt{p} + E$. (This is formally similar to a system with varying mass as discussed in [21].) For details of the variables $(c, p)$ used, we refer to [22, 23]. The geometrical meaning can be seen from $|p| = a^2$ and $c = \frac{1}{2}\gamma\dot a$ in terms of the scale factor $a$. For a semiclassical universe, we thus have $c \ll 1$ and $p \gg \ell_P^2 = \kappa\hbar$.
In contrast to a, p can also be negative in general with the sign corresponding to spatial orientation, but we will assume p > 0 in this example. The Hamiltonian H is actually a constraint in this case, but we will not discuss aspects of constrained systems in the geometric formulation here. To simplify calculations, we have already weaken the notion of a strong effective system and require agreement between quantum and effective dynamics only up to corrections of the order . Performing the expansion of the mean value of the Hamiltonian, we obtain 1 ˜ ij + O( 32 ) HQ = H + κH,ij G 2 3 3 √ ˜ 0,2 c ˜ 1,2 c2 ˜ 2,2 = H − 2 pG +√ G − G (7.13) + O( 2 ) γ p 8 p3 ˜ a,n = −n Ga,n . These variables are motivated by the uncertainty in terms of G P relations, with for the symplectic structure in this example read G0,2 G2,2 −(G1,2 )2 ≥ 1 2 4 ˜ 36 γ P . Thus, one can expect that for minimal uncertainty the G (which are not
dimensionless) do not contribute further factors of $\hbar$. We will now perform a more detailed analysis. From the commutation relation $[c, p] = \frac{1}{3}i\gamma\ell_P^2$, we obtain
$$\dot G^{a,n} = (\dot c\,\partial_c + \dot p\,\partial_p)G^{a,n} = -\frac{1}{\gamma\sqrt{p^3}}\left(-2ap^2G^{a-1,n} + (n-2a)cp\,G^{a,n} - \frac{(n-a)c^2}{4}G^{a+1,n}\right).$$
At this point, it is useful to define $G^{a,n} =: c^{n-a}p^a g^{a,n}$ with dimensionless g, leading to
$$\left(\frac{1}{2}c\partial_c - 2p\partial_p\right)g^{a,n} = -a\,g^{a-1,n} + \frac{1}{4}(n+a)\,g^{a,n} - \frac{1}{8}(n-a)\,g^{a+1,n}.$$
This system of partial differential equations can be simplified by introducing coordinates (x, y) by $e^{2x} = \ell c^2/\sqrt{p}$ and $y := c^2\sqrt{p}/\ell$ with a constant $\ell$ of dimension length, e.g., $\ell = \kappa E$ as the only classically available length scale independent of the canonical variables, such that $\frac{1}{2}c\partial_c - 2p\partial_p = \partial_x$ and $(\frac{1}{2}c\partial_c - 2p\partial_p)f(y) = 0$ for any function f independent of x. The general solution for n = 2 then is
$$g^{0,2} = g_0(y) + g_{3/2}(y)\,e^{\frac{3}{2}x} + g_3(y)\,e^{3x},$$
$$g^{1,2} = 2g_0(y) - g_{3/2}(y)\,e^{\frac{3}{2}x} - 4g_3(y)\,e^{3x},$$
$$g^{2,2} = 4g_0(y) - 8g_{3/2}(y)\,e^{\frac{3}{2}x} + 16g_3(y)\,e^{3x},$$
subject to the uncertainty relation
$$4g_0g_3 - g_{3/2}^2 \ge \frac{\gamma^2\ell_P^4}{2^2 3^4\,\ell^{3/2}\,(c^2\sqrt{p})^{5/2}}. \tag{7.14}$$
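The step from the uncertainty relation for the $G^{a,2}$ to (7.14) rests on the identity $g^{0,2}g^{2,2} - (g^{1,2})^2 = 9e^{3x}(4g_0g_3 - g_{3/2}^2)$ for the n = 2 solutions quoted above; together with $G^{a,2} = c^{2-a}p^a g^{a,2}$ this produces the factor $36\cdot 9 = 2^23^4$. A minimal numerical sketch of this identity follows. The function names and sample values are illustrative assumptions, not part of the original derivation.

```python
import math

def g_n2(g0, g32, g3, x):
    """n = 2 solutions (g^{0,2}, g^{1,2}, g^{2,2}) as quoted in the text."""
    e32 = math.exp(1.5 * x)   # e^{3x/2}
    e3 = math.exp(3.0 * x)    # e^{3x}
    g02 = g0 + g32 * e32 + g3 * e3
    g12 = 2 * g0 - g32 * e32 - 4 * g3 * e3
    g22 = 4 * g0 - 8 * g32 * e32 + 16 * g3 * e3
    return g02, g12, g22

def uncertainty_lhs(g0, g32, g3, x):
    """g^{0,2} g^{2,2} - (g^{1,2})^2 evaluated directly."""
    g02, g12, g22 = g_n2(g0, g32, g3, x)
    return g02 * g22 - g12 ** 2

def closed_form(g0, g32, g3, x):
    """Claimed closed form 9 e^{3x} (4 g0 g3 - g32^2)."""
    return 9.0 * math.exp(3.0 * x) * (4 * g0 * g3 - g32 ** 2)

# arbitrary sample point (hypothetical values)
sample = (0.3, -0.2, 1.7, 0.4)
print(uncertainty_lhs(*sample), closed_form(*sample))
```

All x-dependence factors out as $e^{3x} = \ell^{3/2}c^3p^{-3/4}$, which is why the right-hand side of (7.14) depends on the canonical variables only through $c^2\sqrt{p}$.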
Since H is a constraint, y will be constant physically such that we can also consider $g_0$, $g_{3/2}$ and $g_3$ as constants. On the constraint surface, the right-hand side of the uncertainty relation is then of the order $(\ell_P/\kappa E)^4$ for the above choice of $\ell$ and thus very small. Note first that, unlike in the free particle and harmonic oscillator examples, solutions for the $G^{a,n}$ do not leave the effective system unaffected. In this example, provided that it allows an effective Hamiltonian description, we would thus encounter an effective Hamiltonian different from the classical one. Spreading backreacts on the dynamics according to the effective equations
$$\gamma\dot c = -c^2p^{-\frac{1}{2}}\left(1 + \frac{1}{2}g_0 - g_{3/2}\,(\ell c^2p^{-\frac{1}{2}})^{3/4} + 11g_3\,(\ell c^2p^{-\frac{1}{2}})^{3/2} + \cdots\right), \tag{7.15}$$
$$\gamma\dot p = c\sqrt{p}\left(4 + 2g_0 + 2g_{3/2}\,(\ell c^2p^{-\frac{1}{2}})^{3/4} - 16g_3\,(\ell c^2p^{-\frac{1}{2}})^{3/2} + \cdots\right). \tag{7.16}$$
There is no explicit $\hbar$ in the correction terms because we use dimensionless variables, but the uncertainty relation shows that for constants close to minimal uncertainty the corrections are of higher order in the Planck length.
Moreover, as in the free particle case, no constant solutions for the $G^{a,n}$ exist. We thus have to weaken not only the condition of a preserved embedding, but also its horizontality. Since we are interested in effective equations only up to a certain order in $\hbar$, which we already used in the dynamics of this example, it is reasonable to require constant G also only up to terms of some order in $\hbar$. This means that the quantum variables do not need to be strictly constant, but change only slowly. In this example, we have
$$\dot G^{0,2} = -\gamma^{-1}c^3p^{-1/2}\left(g_0 + \frac{5}{2}g_{3/2}\,e^{\frac{3}{2}x} + 4g_3\,e^{3x}\right),$$
$$\dot G^{1,2} = 3\gamma^{-1}c^2p^{1/2}\left(g_0 + 2g_3\,e^{3x}\right),$$
$$\dot G^{2,2} = 4\gamma^{-1}cp^{3/2}\left(2g_0 + 5g_{3/2}\,e^{\frac{3}{2}x} + 2g_3\,e^{3x}\right),$$
where $e^x$ is small for a large, semiclassical universe and the dominant terms are given by $g_0$. For large p, $\dot G^{2,2}$ grows most strongly, but we can ensure that it is small by using small $g_0$. It is easy to see that the uncertainty relation allows $g_0$ to be small enough such that the $\dot G^{a,2}$ are small and at most of the order $\hbar$. For instance, $g_{3/2} = 0$, $g_3 = 1$ and $g_0 \sim \ell_P^4\,\ell^{-3/2}(c^2\sqrt{p})^{-5/2}$ is a suitable choice where correction terms to the classical equations are small and the strongest growth of the second order quantum variables, given by $\dot G^{2,2} \sim \ell_P^4\,\ell^{-3/2}c^{-4}p^{1/4}$, is small on the constraint surface and, using $\ell \sim \kappa E$: $\dot G^{2,2} \sim \ell_P^4(\kappa E)^{-7/2}p^{5/4}$. To the $\hbar$-order of the equations derived here, the system is thus almost preserved, and quantum variables do not grow strongly for some time of the evolution provided that the integration constants $g_a$ are chosen appropriately. (Similar results, without using explicit quantum variables G, have been obtained in [7, 6].) In the following section, we will formalize the weakened conditions on an effective system and show that this allows one to reproduce standard effective action results.

8. Anharmonic Oscillator

We now come to the main part of this paper. As motivated by the preceding examples, we first weaken the effective equation scheme developed so far and then show that it reproduces the standard effective action results when quantum dynamics is expanded around the ground state of a harmonic oscillator. From what we discussed so far, one can already see that basic properties are the same: First, the harmonic oscillator ground state (or any coherent state) gives a quantum dynamics with constant quantum variables such that the quantum Hamiltonian differs from the classical one only by a constant. Effective equations of motion are then identical to the classical ones, which agrees with the usual result.
If there is an anharmonic contribution to the potential, however, the evolution of classical variables depends on the quantum variables, and moreover there is no finite set of decoupled quantum variables. Thus, for an exact solution all infinitely many quantum variables have to be taken into account, and in general no strong effective system exists. This is the
analog of the non-locality of the standard effective action which in general cannot be written as a time integral of a functional of the $q^i$ and finitely many of their time derivatives. In standard effective actions, a derivative expansion is an important approximation, and similarly we have to weaken our definition of effective systems by introducing approximate notions.
The classical Hamiltonian is now given by $H = \frac{1}{2m}p^2 + \frac{1}{2}m\omega^2q^2 + U(q)$, and the quantum Hamiltonian in terms of dimensionless quantum variables (7.3), dropping the tilde from now on, is
$$H_Q = \frac{1}{2m}p^2 + \frac{1}{2}m\omega^2q^2 + U(q) + \frac{\hbar\omega}{2}(G^{0,2} + G^{2,2}) + \sum_n\frac{1}{n!}(\hbar/m\omega)^{n/2}\,U^{(n)}(q)\,G^{0,n}. \tag{8.1}$$
This generates equations of motion
$$\dot q = m^{-1}p, \qquad \dot p = -m\omega^2q - U'(q) - \sum_n\frac{1}{n!}(m^{-1}\omega^{-1}\hbar)^{n/2}\,U^{(n+1)}(q)\,G^{0,n}, \tag{8.2}$$
$$\dot G^{a,n} = -a\omega G^{a-1,n} + (n-a)\omega G^{a+1,n} - \frac{aU''}{m\omega}G^{a-1,n} + \frac{\sqrt{\hbar}\,aU'''(q)}{2(m\omega)^{3/2}}G^{a-1,n-1}G^{0,2} + \frac{\hbar\,aU''''(q)}{3!(m\omega)^2}G^{a-1,n-1}G^{0,3}$$
$$\qquad - \frac{a}{2}\left(\frac{\sqrt{\hbar}\,U'''(q)}{(m\omega)^{3/2}}G^{a-1,n+1} + \frac{\hbar\,U''''(q)}{3(m\omega)^2}G^{a-1,n+2}\right) + \frac{a(a-1)(a-2)}{3\cdot 2^3}\left(\frac{\sqrt{\hbar}\,U'''(q)}{(m\omega)^{3/2}}G^{a-3,n-3} + \frac{\hbar\,U''''(q)}{(m\omega)^2}G^{a-3,n-2}\right) + \cdots$$
showing explicitly that a potential of order higher than two makes the equations of motion for the $G^{a,n}$ involve $G^{a,n+1}$, $G^{a,n+2}$ and so on, therefore requiring one to solve an infinite set of coupled non-linear equations. However, for semiclassical dynamics, the $G^{a,n}$ should be small as they are related to the spreading of the wave function. This allows the implementation of a perturbative expansion in powers of $\hbar^{1/2}$ to solve the equations for G, where the number of degrees of freedom involved to calculate the equations of motion for the classical variables up to a given order is finite. We emphasize that corrections appear at half-integer powers in $\hbar$, except for the linear order. This is in contrast to what is often intuitively expected for quantum theories, where only corrections in powers of $\hbar$ are supposed to appear. (Correction terms of half-integer order do not appear only if the classical Hamiltonian is even in all canonical variables.) However, this is much more natural from a quantum gravity point of view where not $\hbar$ but the Planck length $\ell_P = \sqrt{\kappa\hbar}$ is the basic parameter, which is a fractional power of $\hbar$ (see the quantum cosmology example).
To solve the equations, we expand $G^{a,n} = \sum_e \hbar^{e/2}G_e^{a,n}$. If we want to find a solution up to kth order, we have to calculate the solutions to (8.2) for $G^{0,2}$ up to the order k − 2 and $G^{0,3}$ to the order k − 3. At the same time, these will be functions of the $G^{a,n}$ to all orders up to $G_l^{a,3+2(k-3)-l}$ for all positive integers $l \le 2k - 3$.
Example. For $U(q) = \frac{\delta}{4!}q^4$, we have equations of motion
$$\dot G_0^{a,n} = -a\omega G_0^{a-1,n} + (n-a)\omega G_0^{a+1,n} - \frac{\delta q^2a}{2m\omega}G_0^{a-1,n},$$
$$\dot G_1^{a,n} = -a\omega G_1^{a-1,n} + (n-a)\omega G_1^{a+1,n} - \frac{\delta q^2a}{2m\omega}G_1^{a-1,n} + \frac{\delta aq}{2(m\omega)^{3/2}}G_0^{0,2}G_0^{a-1,n-1} - \frac{\delta aq}{2(m\omega)^{3/2}}\left(G_0^{a-1,n+1} - \frac{(a-1)(a-2)}{12}G_0^{a-3,n-3}\right),$$
$$\dot G_2^{a,n} = -a\omega G_2^{a-1,n} + (n-a)\omega G_2^{a+1,n} - \frac{\delta q^2a}{2m\omega}G_2^{a-1,n} + \frac{\delta aq}{2(m\omega)^{3/2}}\left(G_1^{0,2}G_0^{a-1,n-1} + G_0^{0,2}G_1^{a-1,n-1}\right) - \frac{\delta aq}{2(m\omega)^{3/2}}\left(G_1^{a-1,n+1} - \frac{(a-1)(a-2)}{12}G_1^{a-3,n-3}\right)$$
$$\qquad + \frac{\delta a}{3!(m\omega)^2}G_0^{0,3}G_0^{a-1,n-1} - \frac{\delta a}{6(m\omega)^2}\left(G_0^{a-1,n+2} - \frac{(a-1)(a-2)}{4}G_0^{a-3,n-2}\right)$$
up to second order. Now, in order to construct a strong effective theory of the system, we would again have to find a submanifold which is invariant under the action of the Hamiltonian. The only dynamics contained in our quantum degrees of freedom then comes via the submanifold: $\dot G^{a,n} = \dot x^i\partial_iG^{a,n}$, e.g., for a potential $U(q) = \frac{\delta}{4!}q^4$,
$$\dot G^{a,n} = \left[\frac{1}{m}p\,\partial_q - \left(m\omega^2q + \frac{\delta}{3!}q^3 + \frac{\hbar\,\delta q}{2m\omega}G^{0,2} + \frac{\hbar^{3/2}\delta}{3!(m\omega)^{3/2}}G^{0,3}\right)\partial_p\right]G^{a,n}. \tag{8.3}$$
It seems convenient to perform an expansion in $\delta$ in addition to $\hbar$ in order to solve the system of equations. However, solutions of these equations, perturbative or exact, are in general not single valued functions of the classical variables and therefore an exactly preserved semiclassical submanifold does not exist. In fact, we have
Lemma 8.1. Let $(\mathcal{H}, \hat H)$ be a quantum mechanical system such that $\hat H = \frac{1}{2m}\hat p^2 + V(\hat q)$. If $(\mathcal{H}, \hat H)$ admits a strong effective system of the classical dimension then $(\mathcal{H}, \hat H)$ is linear.
Proof. By assumption, we have an embedding of the classical phase space into the quantum phase space such that the quantum flow is everywhere tangential
to the embedding and the classical symplectic structure is induced. We can thus take the quantum Hamiltonian vector field and choose additional horizontal vector fields generated by functions $L_i$ on K such that they span the tangent space to P in each point p ∈ P. Since, by construction, the collection of all those vector fields can be integrated to a manifold, they are in involution. Vector fields on the bundle, finally, correspond to linear operators on the Hilbert space having the same commutation relations as the Poisson relations of the generating functions. There is thus a complete set of operators of the quantum system which includes the Hamiltonian and is in involution.
The notion of a strong effective system then does not allow enough freedom to include many physically interesting systems. Indeed, the dynamics of a strong effective system does not significantly differ from the classical one:
Theorem 8.2. For any strong effective system of classical dimension, $H_{\rm eff} = H + {\rm const}$ differs from the classical Hamiltonian only by a constant of order $\hbar$.
Proof. From the preceding lemma, it follows that a strong effective system can exist only when the Hamiltonian is at most quadratic in the complete classical phase space functions $L_i$. In an expansion as in (8.1), we then have only the linear order in $\hbar$ containing $G^{a,2}$. Since by assumption the strong effective system is of the classical dimension, horizontality implies that the $G^{a,2}$ are constant. Thus, $H_Q - H = \hbar c$ with a constant c, and $H_Q$ directly gives the effective Hamiltonian.
If quantum degrees of freedom are included in a strong effective system of dimension higher than the classical one, they are then only added onto the classical system without interactions, which is not of much interest.
On the other hand, for effective equations one is not necessarily interested in precisely describing whole orbits of the system, for which single valued solutions G(q, p) would be required, but foremost in understanding the local behavior compared to the classical one, i.e. modifications of time derivatives of the classical variables. The conditions for a strong effective system, however, are requirements on the whole set of orbits of the system. Thus, as noted before, we have to weaken our definition of effective systems. We first do so in a manner which focuses on the finite dimensionality of classical systems but ignores more refined notions of semiclassicality:
Definition 8.3. An effective system of order k for a quantum system $(\mathcal{H}, \hat H)$ is a dynamical system $(M, X_{\rm eff})$, i.e. a finite-dimensional manifold M together with an effective flow defined by the vector field $X_{\rm eff}$, which can locally be embedded in the Kähler manifold K associated with $\mathcal{H}$ such that it is almost preserved: for any p ∈ M there is an embedding $\iota_p$ of a neighborhood of p in K such that $X_H(p) - \iota_{p*}X_{\rm eff}(p)$ is of the order $\hbar^{k+1}$ with the vector field $X_H$ generated by the quantum Hamiltonian.
An effective system in this sense allows one to describe a quantum system by a set of finitely many equations of motion, as we encountered it before in the examples. The only concept of classicality is the finite dimensionality, while otherwise the quantum variables included in the effective system can change rapidly and grow large even if an initial state has small fluctuations. Moreover, the finite dimensional space of an effective system is not required to be of even dimension or, even if it is of even dimension, to be a symplectic space. In general, it is only equipped locally with a pre-symplectic form through the pull-back of Ω on K. A stronger notion, taking these issues into account, is
Definition 8.4. A Hamiltonian effective system $(P, H_{\rm eff})$ of order k for a quantum system $(\mathcal{H}, \hat H)$ is a finite-dimensional subspace P of the Kähler manifold K associated with $\mathcal{H}$ which is
(1) symplectic, i.e. equipped with a symplectic structure $\Omega_P = \iota^*\Omega_K + O(\hbar^{k+1})$ agreeing up to order $\hbar^{k+1}$ with the pull-back of the full symplectic structure, and
(2) almost preserved and Hamiltonian, i.e. there is a Hamiltonian vector field $X_{\rm eff}$ generated by the effective Hamiltonian $H_{\rm eff}$ on P such that for any p ∈ P the vector $X_H(p) - X_{\rm eff}(p)$ is of the order $\hbar^{k+1}$ with the vector field $X_H$ generated by the quantum Hamiltonian.
By using a symplectic subspace, we ensure that the commutator algebra of the quantum system, which determines the symplectic structure on K, is reflected in the symplectic structure of the effective system. Moreover, as in the previous definition the dynamics of the effective system is close to the quantum dynamics. Still, the effective Hamiltonian is not directly related to the quantum Hamiltonian: one generally expands the quantum Hamiltonian in powers of $\hbar$, solves some of the equations of motion for $G^{a,n}$ and reinserts solutions into the expansion.
Nevertheless, to low orders in $\hbar$, most fluctuations can be ignored and it is often possible to work directly with the quantum Hamiltonian as the expectation value in suitably peaked states. This is the case for effective equations of quantum cosmology [24, 6, 7] where this procedure has been suggested first. In this definition, we still do not include any reference to the corresponding classical system. In general, its dynamics will not be close to the effective dynamics, but there are usually regimes where this can be ensured for at least some time starting with appropriate initial states. Also the symplectic structure $\Omega_P$ can differ from the classical one. This is realized also for effective actions such as (2.4), where the symplectic structure also receives correction terms of the same order in $\hbar$ as the Hamiltonian. The effective and classical symplectic structures are close if the embedding of P in K is “almost horizontal” which can be formalized by requiring that for any p ∈ P and $v \in {\rm Hor}^{\Omega}_pK$ there is a $w \in T_pP$ such that $w - v \in T_pK$ is of some appropriate order in $\hbar$.
We do not make this definition of almost horizontality more precise since it turns out not to be needed to reproduce usual effective action results. Moreover, its practical implementation can be rather complicated: The quantum cosmology example showed that the order to which one can ensure almost horizontality is not directly related to the order in $\hbar$ to which equations of motion are expanded. If one has an almost horizontal embedding, ignored quantum degrees of freedom remain almost constant such that they do not much influence the evolution for an appropriately prepared initial state. However, not any system can be approximated in this manner, and so the condition of almost horizontality implies that for some systems only higher dimensional Hamiltonian effective systems exist. In such a case, there are some quantum degrees of freedom which can by no means be ignored for the effective dynamics. On the other hand, in such a case, it may be difficult to guarantee the existence of a symplectic structure. This happens, for instance, if the $G^{a,2}$ change too rapidly, but not higher G. One can then use a 5-dimensional effective system with variables $(q, p, G^{0,2}, G^{1,2}, G^{2,2})$ which can only be pre-symplectic and thus not Hamiltonian. Alternatively, one can drop the condition of almost horizontality, but then has to accept a new (pre-)symplectic structure which is not necessarily related to the classical one by only correction terms. These constraints show that a discussion of quantum variables in higher-dimensional effective systems can be complicated if one insists on the presence of a canonical structure. Moreover, computing the symplectic structure on the Kähler space and its pull-back to the effective manifold in an explicit manner is usually complicated (see, however, Sec. 9 for a brief discussion).
We thus present a final definition which does not require an explicit form of the quantum symplectic structure but is sufficient for the usual setting of effective actions:
Definition 8.5. An adiabatic effective system of order (e, k) for a quantum system $(\mathcal{H}, \hat H)$ is an effective system $(M, X_{\rm eff})$ of order k in the sense of Definition 8.3 such that the local embeddings are given by solutions up to order e in an adiabatic expansion of those quantum variables not included as variables of the effective system.
Here, adiabaticity intuitively captures the physical property of a weak influence of quantum degrees of freedom on the classical ones: in the adiabatic approximation, they change only slowly compared to the classical variables. Provided that a semiclassical initial state is chosen, it is then guaranteed that the system remains semiclassical for some time. This viewpoint is still much more general than the usual definition of an effective action, and it allows much more freedom by choosing different finite-dimensional subspaces. For an explicit derivation of effective equations, of course, one has to find solutions $G^{a,n}(x^i)$ as they appear in the quantum Hamiltonian, which requires one to solve an infinite set of coupled differential equations for infinitely many variables. Only in exceptional cases, such as integrable systems, can this be done
without approximations. Moreover, general solutions for $G^{a,n}(x^i)$ contain infinitely many constants of integration which then also appear in the effective equations after inserting the $G^{a,n}(x^i)$. On the one hand, this allows much more freedom in choosing the states, such as squeezed or of non-minimal uncertainty, to perturb around. However, it also means that one needs criteria to fix the integration constants in situations of interest. One such situation is that of
Theorem 8.6. A system with classical Hamiltonian $H = \frac{1}{2m}p^2 + \frac{1}{2}m\omega^2q^2 + U(q)$ admits an adiabatic effective system of order (2, 1) whose dynamics is governed by the effective action (2.4).
Proof. In order to find the subspace P and the dynamics on it, we expand the quantum Hamiltonian in powers of $\hbar$ and solve the equations of motion for $G^{a,n}$ in an adiabatic approximation. The adiabatic approximation of slowly varying fields in the equations of motion is an expansion in a parameter $\lambda$ introduced for the sake of the calculation, but in the end set to $\lambda = 1$. Derivatives with respect to time are scaled as $\frac{d}{dt} \to \lambda\frac{d}{dt}$ and, expanding $G^{a,n} = \sum_e G_e^{a,n}\lambda^e$, the equations of motion $\dot x^i\partial_iG^{a,n} = \{G^{a,n}, H_Q\}_Q$ imply
$$\dot x^i\partial_iG_{e-1}^{a,n} = \{G_e^{a,n}, H_Q\}_Q.$$
In addition to the adiabatic approximation, we also perform a semiclassical expansion in powers of $\hbar$. In what follows, we will calculate the first order in $\hbar$ and go to second order in $\lambda$ for $G^{a,2}$. To zeroth order in $\lambda$, the equations to solve are
$$0 = \{G_0^{a,n}, H_Q\}_Q = \omega\left((n-a)\,G_0^{a+1,n} - a\left(1 + \frac{U''}{m\omega^2}\right)G_0^{a-1,n}\right)$$
with general solution
$$G_0^{a,n} = \binom{n/2}{a/2}\binom{n}{a}^{-1}\left(1 + \frac{U''}{m\omega^2}\right)^{a/2}G_0^{0,n}$$
for even a and n, and $G_0^{a,n} = 0$ whenever a or n are odd. This still leaves the value of $G_0^{0,n}$ free, which will be fixed shortly. To first order in $\lambda$,
$$(n-a)\,G_1^{a+1,n} - a\left(1 + \frac{U''}{m\omega^2}\right)G_1^{a-1,n} = \frac{1}{\omega}\dot G_0^{a,n}$$
implies
Lemma 8.7.
$$\sum_{a\ {\rm even}}\binom{n/2}{a/2}\left(1 + \frac{U''}{m\omega^2}\right)^{\frac{n-a}{2}}\dot G_0^{a,n} = 0.$$
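Proof. From the equation above,
$$\sum_a\binom{n/2}{a/2}\left(1 + \frac{U''}{m\omega^2}\right)^{\frac{n-a}{2}}\dot G_0^{a,n} = \omega\sum_a\binom{n/2}{a/2}\left(1 + \frac{U''}{m\omega^2}\right)^{\frac{n-a}{2}}\left((n-a)\,G_1^{a+1,n} - a\left(1 + \frac{U''}{m\omega^2}\right)G_1^{a-1,n}\right).$$
Manipulating the first term of the right-hand side expression, we shift a → a − 2, leaving the limits for a unaffected in the summation, to obtain
$$\sum_a\frac{(n/2)!}{((a-2)/2)!\,((n-a+2)/2)!}\left(1 + \frac{U''}{m\omega^2}\right)^{\frac{n-a+2}{2}}(n-a+2)\,G_1^{a-1,n} = \sum_a\binom{n/2}{a/2}\,a\left(1 + \frac{U''}{m\omega^2}\right)^{\frac{n-a+2}{2}}G_1^{a-1,n},$$
which then cancels the second term to finish the proof.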
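Two purely combinatorial facts enter here: the zeroth-order solution must satisfy the recurrence $(n-a)G_0^{a+1,n} = a(1 + U''/m\omega^2)G_0^{a-1,n}$, and the index shift in the proof of Lemma 8.7 uses $\frac{(n/2)!}{((a-2)/2)!\,((n-a+2)/2)!}(n-a+2) = a\binom{n/2}{a/2}$. Both can be checked with exact rational arithmetic. The following is only a sketch; the helper names are assumptions introduced for illustration.

```python
from fractions import Fraction
from math import comb, factorial

def coeff(a, n):
    """binom(n/2, a/2) / binom(n, a): the part of G_0^{a,n} / G_0^{0,n}
    that multiplies (1 + U''/(m w^2))^{a/2}, for even a and n."""
    return Fraction(comb(n // 2, a // 2), comb(n, a))

def recurrence_holds(n):
    """Zeroth-order recurrence written for neighbouring even indices:
    (n - a - 1) G^{a+2,n} = (a + 1) (1 + U''/(m w^2)) G^{a,n}."""
    return all((n - a - 1) * coeff(a + 2, n) == (a + 1) * coeff(a, n)
               for a in range(0, n - 1, 2))

def shift_identity_holds(n):
    """Index-shift identity used in the proof of Lemma 8.7, even a = 2..n."""
    for a in range(2, n + 1, 2):
        lhs = Fraction(factorial(n // 2),
                       factorial((a - 2) // 2) * factorial((n - a + 2) // 2)) * (n - a + 2)
        if lhs != a * comb(n // 2, a // 2):
            return False
    return True

print([recurrence_holds(n) and shift_identity_holds(n) for n in (2, 4, 6, 8)])
```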
This imposes a constraint on $G_0^{0,n}$ solved by setting $G_0^{0,n} = C_n\left(1 + \frac{U''}{m\omega^2}\right)^{-n/4}$. The remaining constants $C_n$ are fixed to $C_n = \frac{n!}{2^n(n/2)!}$ by requiring that the limit U → 0 reproduces the quantum variables of coherent states of the free theory (7.10) or equivalently by requiring the perturbative vacuum of the quantum theory to be associated to the vacuum of the effective system. Therefore,
$$G_0^{a,n} = \frac{(n-a)!\,a!}{2^n\,((n-a)/2)!\,(a/2)!}\left(1 + \frac{U''}{m\omega^2}\right)^{\frac{2a-n}{4}}.$$
We will need only the n = 2 corrections to first order in $\hbar$, and the solution to the first order equations becomes trivial: $G_1^{1,2} = \frac{1}{2\omega}\dot G_0^{0,2}$, the rest being zero. To second order, we have
$$G_2^{2,2} - \left(1 + \frac{U''}{m\omega^2}\right)G_2^{0,2} = \frac{1}{\omega}\dot G_1^{1,2} = \frac{1}{2\omega^2}\ddot G_0^{0,2},$$
again leaving free parameters in the general solution to be fixed by the next, third order from which we obtain
$$\left(1 + \frac{U''}{m\omega^2}\right)\dot G_2^{0,2} + \dot G_2^{2,2} = 0$$
as in the lemma before. The previous two equations can be combined to a first order differential equation for $G_2^{0,2}$ in terms of known solutions at lower orders:
$$\dot G_2^{0,2} - \frac{\dot G_0^{0,2}}{G_0^{0,2}}\,G_2^{0,2} + \frac{1}{\omega^2}(G_0^{0,2})^2\,\dddot G_0^{0,2} = 0.$$
Its general solution is
$$G_2^{0,2} = \left(c - 2\omega^{-2}(G_0^{0,2})^{3/2}\frac{d^2}{dt^2}(G_0^{0,2})^{1/2}\right)G_0^{0,2},$$
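where the integration constant c can be fixed to c = 0 by requiring the correct free limit U = 0 (for which the original two differential equations imply $G_2^{2,2} = -G_2^{0,2} = 0$). From this, the solution to the system is
$$G_2^{0,2} = -\frac{2}{\omega^2}(G_0^{0,2})^{5/2}\frac{d^2}{dt^2}(G_0^{0,2})^{1/2} = \frac{\left(1 + \frac{U''}{m\omega^2}\right)^{-7/2}}{4\omega^2}\left[\left(1 + \frac{U''}{m\omega^2}\right)\frac{U'''\ddot q + U''''\dot q^2}{4m\omega^2} - 5\left(\frac{U'''\dot q}{4m\omega^2}\right)^2\right].$$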
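That the c = 0 expression solves the first order equation $\dot G_2^{0,2} - (\dot G_0^{0,2}/G_0^{0,2})G_2^{0,2} + \omega^{-2}(G_0^{0,2})^2\dddot G_0^{0,2} = 0$ can be checked numerically for any smooth positive profile. The sketch below assumes, purely for illustration, the test profile $G_0^{0,2}(t) = 2 + \sin t$ and an arbitrary frequency; neither is taken from the text. The time derivative of $G_2^{0,2}$ is approximated by a central difference, so the residual is small but not exactly zero.

```python
import math

OMEGA = 1.3  # arbitrary test frequency (assumption)

def G0(t):
    """Hypothetical smooth, positive test profile standing in for G_0^{0,2}(t)."""
    return 2.0 + math.sin(t)

def G2(t):
    """G_2^{0,2} = -(2/omega^2) G0^{5/2} d^2/dt^2 sqrt(G0), the c = 0 solution."""
    g, g1, g2 = G0(t), math.cos(t), -math.sin(t)
    # second time derivative of sqrt(G0), written out analytically
    sqrt_dd = g2 / (2.0 * math.sqrt(g)) - g1 ** 2 / (4.0 * g ** 1.5)
    return -2.0 / OMEGA ** 2 * g ** 2.5 * sqrt_dd

def ode_residual(t, h=1e-5):
    """dG2/dt - (dG0/dt / G0) G2 + G0^2 (d^3 G0/dt^3) / omega^2."""
    g, g1, g3 = G0(t), math.cos(t), -math.cos(t)
    dG2 = (G2(t + h) - G2(t - h)) / (2.0 * h)  # central difference
    return dG2 - g1 / g * G2(t) + g ** 2 * g3 / OMEGA ** 2

print([ode_residual(t) for t in (0.0, 0.7, 2.5)])
```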
Finally, putting our approximate expressions for the quantum variables back into the equations of the classical variables (8.2), we obtain
$$\left(m + \frac{\hbar\lambda^2(U''')^2}{2^5m^2\omega^5\left(1 + \frac{U''}{m\omega^2}\right)^{5/2}}\right)\ddot q + \frac{\hbar\lambda^2\dot q^2\left(4m\omega^2\,U'''U''''\left(1 + \frac{U''}{m\omega^2}\right) - 5(U''')^3\right)}{2^7m^3\omega^7\left(1 + \frac{U''}{m\omega^2}\right)^{7/2}} + m\omega^2q + U' + \frac{\hbar\,U'''}{4m\omega\left(1 + \frac{U''}{m\omega^2}\right)^{1/2}} = 0 \tag{8.4}$$
as it also follows from the effective action (2.4) after setting $\lambda = 1$.
The proof demonstrates the role of the harmonic oscillator ground state and its importance for fixing constants in the effective equations. The role of adiabaticity here is the same as in the derivative expansion of low energy effective actions, but even for an anharmonic oscillator the effective systems defined here are more general: we are not forced to expand around a vacuum state but can make other choices depending on the physical situation at hand. The vacuum state was used here in order to fix the constants $C_n$ which appear when integrating equations of motion for quantum variables. One can just as well choose different constants, for instance those corresponding to a squeezed state, and obtain the corresponding effective equations. Note, however, that not every choice is consistent with the adiabatic approximation. For instance, the proof showed that $G_0^{a,2}$ had to be zero for odd a to leading order in $\hbar$. Thus, one cannot allow arbitrary squeezing since the parameter $G_0^{1,2}$ is restricted. This can become non-zero only at higher orders in the expansion. Or, while one would always include the classical variables in the effective system, they can be accompanied by some of the quantum variables which are not treated as adiabatic. One can include such quantum degrees of freedom directly as defined on the quantum phase space, or introduce them by perturbing quantum variables around the adiabatic solution, $G^{a,n} = G^{a,n}_{\rm adiabatic} + g^{a,n}(t)$. New degrees of freedom given by $g^{a,n}(t)$ are then independent of the classical variables and describe quantum corrections on top of the adiabatic one.
There are also situations where no distinguished state such as the vacuum is known, as it happens in the example of quantum cosmology discussed earlier.
General effective equations can then still be formulated but contain free parameters incorporating the freedom of choosing an initial state in which the system is prepared. The constants $C_n$ in the above proof, for instance, would then remain unspecified and appear in effective equations. To the same order as considered here, only the constant $C_2$ enters which will appear in general equations of motion. Following the lines of the proof above without fixing $C_2$ is easily seen to lead to an effective action of the form (2.4) with mass term
$$m + C_2^3\,\frac{\hbar\,U'''(q)^2}{2^5m^2\left(\omega^2 + m^{-1}U''(q)\right)^{5/2}}$$
and effective potential
$$-\frac{1}{2}m\omega^2q^2 - U(q) - C_2\,\frac{\hbar\omega}{2}\left(1 + \frac{U''(q)}{m\omega^2}\right)^{1/2}.$$
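A quick consistency check on these expressions: an action with position-dependent mass M(q) and potential V(q) has the equation of motion $M\ddot q + \frac{1}{2}M'(q)\dot q^2 + V'(q) = 0$, so half the q-derivative of the mass correction should reproduce the $\dot q^2$ coefficient of (8.4), and the q-derivative of the quantum potential term should reproduce the last term of (8.4). The sketch below does this by finite differences for the quartic example $U = \frac{\delta}{4!}q^4$ at $\lambda = 1$, with the vacuum normalization in which the two expressions reduce to the coefficients of (8.4). All numerical parameter values are arbitrary assumptions.

```python
import math

# illustrative parameter values only (assumptions, not from the text)
HBAR, M, OMEGA, DELTA = 0.1, 1.0, 1.3, 0.7

def U2(q): return DELTA * q ** 2 / 2.0   # U''   for U = delta q^4 / 4!
def U3(q): return DELTA * q              # U'''
def U4(q): return DELTA                  # U''''

def s(q):
    """1 + U''/(m omega^2)."""
    return 1.0 + U2(q) / (M * OMEGA ** 2)

def mass_correction(q):
    """hbar U'''^2 / (2^5 m^2 omega^5 s^{5/2}), the quantum part of the mass term."""
    return HBAR * U3(q) ** 2 / (2 ** 5 * M ** 2 * OMEGA ** 5 * s(q) ** 2.5)

def v_quantum(q):
    """(hbar omega / 2) sqrt(s), the quantum part of the effective potential."""
    return 0.5 * HBAR * OMEGA * math.sqrt(s(q))

def qdot2_coeff(q):
    """Coefficient of qdot^2 in (8.4) at lambda = 1."""
    num = 4 * M * OMEGA ** 2 * U3(q) * U4(q) * s(q) - 5 * U3(q) ** 3
    return HBAR * num / (2 ** 7 * M ** 3 * OMEGA ** 7 * s(q) ** 3.5)

def force_quantum(q):
    """Last term of (8.4): hbar U''' / (4 m omega sqrt(s))."""
    return HBAR * U3(q) / (4 * M * OMEGA * math.sqrt(s(q)))

def ddq(f, q, h=1e-4):
    """Central finite-difference derivative."""
    return (f(q + h) - f(q - h)) / (2 * h)

q = 0.8
print(0.5 * ddq(mass_correction, q), qdot2_coeff(q))
print(ddq(v_quantum, q), force_quantum(q))
```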
Remark. Knowing the effective action, one can derive the corresponding momentum and compute the effective symplectic structure. Corrections to the canonical symplectic structure can then occur if one uses a momentum variable p that matches the dynamics of the mean value of $\hat p$. Still, this does not necessarily imply that the system is a Hamiltonian effective system of first order as per Definition 8.4 because we did not relate this symplectic structure to that following from pull-back from the quantum symplectic structure.

9. Dynamical Coherent States

In addition to the effective dynamical behavior of classical and quantum degrees of freedom it is also of interest to know approximate states whose dynamics corresponds to the effective evolution. Under the name of dynamical coherent states [17], they can be obtained by collecting the information contained in the mean values of the fundamental operators and the spreading as well as higher order distortions of the state of the system. In this section, we only collect results related to the previous discussion without going into further details. As we already stated, the task could be achieved by summing up the Hermite polynomial modes obtained through the Hamburger moments, but a short cut to the answer is possible using Moyal’s formula [25] by which four arbitrary normalizable vectors $|\Psi_1\rangle$, $|\Psi_2\rangle$, $|\Psi_3\rangle$ and $|\Psi_4\rangle$ satisfy
$$\int\frac{d^2z}{2\pi}\,\langle\Psi_1|e^{z\hat a^\dagger - \bar z\hat a}|\Psi_2\rangle\,\langle\Psi_3|e^{-z\hat a^\dagger + \bar z\hat a}|\Psi_4\rangle = \langle\Psi_1|\Psi_4\rangle\,\langle\Psi_3|\Psi_2\rangle, \tag{9.1}$$
where $z = \frac{1}{\sqrt{2}}(z^q + iz^p)$ and $\hat a = \frac{1}{\sqrt{2\hbar}}(\hat q + i\hat p)$. For a bounded operator $\hat F$, (9.1) can be rewritten as
$$\int\frac{d^2z}{2\pi}\,\langle\Psi_1|e^{z\hat a^\dagger - \bar z\hat a}|\Psi_2\rangle\,{\rm Tr}\{\hat F e^{-z\hat a^\dagger + \bar z\hat a}\} = \langle\Psi_1|\hat F|\Psi_2\rangle. \tag{9.2}$$
For given solutions $G^{a,n}$, the reconstruction of a dynamical coherent state is completed by performing the integral with arbitrary $|\Psi_1\rangle$, $|\Psi_2\rangle$ after inserting for
$\hat F$ the probability density operator $\hat\rho(q, p)$ and assuming that the state is analytical such that
$${\rm Tr}\{\hat\rho(q, p)\,e^{-z\hat a^\dagger + \bar z\hat a}\} = e^{\frac{i}{\sqrt{\hbar}}(z^qp - z^pq)}\sum_{n=0}^{\infty}\sum_{a=0}^{n}\frac{(-)^{n-a}\,i^n}{n!}\binom{n}{a}(z^q)^a(z^p)^{n-a}\,G^{a,n}(q, p)$$
produce the matrix elements of $\hat\rho(q, p)$ in a basis of operators $e^{z\hat a^\dagger - \bar z\hat a}$. For the anharmonic oscillator to 0th order in $\hbar$ we have $G^{i_1,\ldots,i_n} = \frac{n!}{2^{n/2}(n/2)!}\,G^{(i_1i_2}\cdots G^{i_{n-1}i_n)}$ for n even, implying
$${\rm Tr}\{\hat\rho_U(q, p)\,e^{-z\hat a^\dagger + \bar z\hat a}\} = \exp\left(\frac{i}{\sqrt{\hbar}}\,z^i\epsilon_{ij}x^j - \frac{1}{2}z^iz^j\epsilon_{ik}\epsilon_{jl}\,G^{kl}(q, p)\right).$$
In order to perform the integral above, we choose to work with harmonic oscillator coherent states $|\alpha\rangle = e^{\alpha\hat a^\dagger - \bar\alpha\hat a}|0\rangle$ for which the matrix elements of the exponential operator are
$$\langle\alpha|e^{z\hat a^\dagger - \bar z\hat a}|\alpha'\rangle = \exp\left(-\frac{1}{4}(\alpha^i - \alpha'^i)\,\delta_{ij}\,(\alpha^j - \alpha'^j) + \frac{i}{4}(\alpha^i + \alpha'^i)\,\epsilon_{ij}\,(\alpha^j - \alpha'^j + 2z^j)\right).$$
Finally, defining $S_i = \delta_{ij}(\alpha^j - \alpha'^j) + i\epsilon_{ij}(\alpha^j + \alpha'^j - 2x^j)$, the matrix elements of the probability density operator are
$$\langle\alpha|\hat\rho_U(q, p)|\alpha'\rangle = \frac{1}{\det\left(\frac{1}{2}\delta^{ij} + G^{ij}\right)}\exp\left(-\frac{1}{4}S_{i_1}\epsilon^{i_1j_1}\left(2G^{ij} + \delta^{ij}\right)^{-1}_{j_1j_2}\epsilon^{j_2i_2}S_{i_2}\right)$$
$$\qquad\times\exp\left(-\frac{i}{4}(\alpha^i - \alpha'^i)\,\epsilon_{ij}\,(\alpha^j + \alpha'^j) - \frac{1}{4}(\alpha^i - \alpha'^i)\,\delta_{ij}\,(\alpha^j - \alpha'^j)\right). \tag{9.3}$$
The trace of the operator above can now be computed to equal one whenever $G^{ij}$ is a non-degenerate matrix. In order to be sure that $\rho$ is a density matrix, we need to show its positivity. We do not have a complete proof for arbitrary systems, but using the fact that the assumption of the state being semi-classical requires the mean values of operators to be given by their classical expressions up to $\hbar$ corrections, a case by case study leads to the conclusion that the positive mean values above lead to positivity of the operator. Furthermore, the state of the quantum system as given above is not in general a pure state, but if $G^{ij} = \frac{\hbar}{2}(e^g)^i{}_k(e^g)^j{}_l\,\delta^{kl}$, also $\hat\rho_U(x)^2$ has trace one and thus gives a pure state which can be realized as a squeezed coherent state labeled by the symmetric matrix $g_{ij}$ through
$$|x, g\rangle = \exp\left(\frac{i}{2\hbar}\,g_{ij}(\hat x^i - x^i)(\hat x^j - x^j)\right)\exp\left(-\frac{i}{\hbar}\,x^i\epsilon_{ij}\hat x^j\right)|0\rangle. \tag{9.4}$$
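With the help of $e^{-\frac{i}{2\hbar}g_{ij}\hat x^i\hat x^j}\,\hat x^k\,e^{\frac{i}{2\hbar}g_{ij}\hat x^i\hat x^j} = (e^g)^k{}_l\,\hat x^l$, the remaining fiber coordinates become
$$G^{i_1,\ldots,i_n}(g_{ij}) = \frac{\hbar^{n/2}\,n!}{2^n(n/2)!}\,(e^g)^{i_1}{}_{j_1}\cdots(e^g)^{i_n}{}_{j_n}\,\delta^{(j_1j_2}\cdots\delta^{j_{n-1}j_n)}. \tag{9.5}$$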
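For g = 0 and all indices equal, the prefactor in (9.5) reduces in units with $\hbar = 1$ to $n!/(2^n(n/2)!)$, the same constants $C_n$ that fix the vacuum in the proof of Theorem 8.6. These are the even moments of a centred Gaussian with variance 1/2, i.e. with $G^{0,2} = C_2 = 1/2$. The sketch below checks this by a simple midpoint-rule quadrature. The integration window and step count are arbitrary choices, and a unit-normalized symmetrization in (9.5) is assumed.

```python
import math

def gaussian_moment(n, variance=0.5, half_width=12.0, steps=20_000):
    """Midpoint-rule estimate of <x^n> for a centred Gaussian of given variance."""
    dx = 2.0 * half_width / steps
    norm = math.sqrt(2.0 * math.pi * variance)
    total = 0.0
    for k in range(steps):
        x = -half_width + (k + 0.5) * dx
        total += x ** n * math.exp(-x * x / (2.0 * variance))
    return total * dx / norm

def c_n(n):
    """C_n = n! / (2^n (n/2)!) for even n."""
    return math.factorial(n) / (2 ** n * math.factorial(n // 2))

for n in (2, 4, 6):
    print(n, gaussian_moment(n), c_n(n))
```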
Reconstructing a dynamical coherent state from the quantum variables $G^{a,n}$ also provides means to compute the symplectic structure on the effective space, as
needed for a Hamiltonian effective system as per Definition 8.4. For the evaluation of the symplectic structure on the vector fields we obtain the pull-back $\Omega(Y, Z) = 2\hbar\,{\rm Im}\langle Y, Z\rangle$ where Y and Z are tangent vectors to the embedded effective manifold. Given a dynamical coherent state $|\psi(f^i)\rangle$ as a function of classical variables $f^i$, we can define a basis of the tangent space spanned by $|i\rangle := \partial|\psi\rangle/\partial f^i$. Expanding $Y = \sum_i Y^i|i\rangle$ and $Z = \sum_i Z^i|i\rangle$, we have
$$\langle Y|Z\rangle = \sum_{i,j}\bar Y^iZ^j\,\frac{\partial\langle\psi|}{\partial f^i}\frac{\partial|\psi\rangle}{\partial f^j}$$
such that we can formally write
$$\Omega = -2i\hbar\,d(\langle x^1, \ldots, x^n|)\wedge d(|x^1, \ldots, x^n\rangle). \tag{9.6}$$
Thus, the pull-back of the symplectic structure to the subspace of squeezed states is
$$\Omega|_{x,g} = 2\epsilon_{ij}\,dx^i\wedge dx^j + 2^{-5}\hbar\,\delta^{i_1i_2}\epsilon^{i_3i_4}\left(\delta^{j_1}_{i_1} + (e^g)^{j_1}_{i_1}\right)\cdots\left(\delta^{j_4}_{i_4} + (e^g)^{j_4}_{i_4}\right)dg_{j_1j_3}\wedge dg_{j_2j_4}. \tag{9.7}$$
For an effective system of the classical dimension, corresponding to a set of solutions $g_{ij}(x^k)$, we can further pull back (9.7) to the classical manifold and obtain the quantum symplectic structure there. This shows that the classical symplectic structure is reproduced up to corrections of order $\hbar$ if the $g$ do not change strongly (adiabaticity or almost horizontality), and provides means to compute those correction terms.
10. Conclusions
Comparison with common effective action techniques applicable to anharmonic oscillators demonstrates how effective systems can be formulated more generally for any quantum system. We have extracted several definitions which have different strengths and use different mathematical structures:
Here, the strengths of each of our definitions are compared in a condensed diagram by use of implication arrows and abbreviations in which the initial S holds for strong, H for Hamiltonian, A for adiabatic and ES for effective system. The only definition not provided before is that of a strong Hamiltonian effective system which is a Hamiltonian effective system which is exactly preserved and whose symplectic structure is exactly the pull-back of the quantum symplectic structure. It is clear from
the discussions before that any strong effective system is also strong Hamiltonian, and examples lead to the conjecture that the converse is also true. Still, since we are not aware of a proof, we include strong Hamiltonian effective systems in this diagram. While the definition of Hamiltonian effective systems is the most geometrical, adiabatic effective systems turn out to be more practical and are more directly related to path integral techniques. The weakest notion of an effective system can be applied to any system but does not incorporate many classical aspects except for finite dimensionality for mechanical systems. As the examples showed, in particular that of quantum cosmology, the general definitions provided here are more widely applicable and also present a more intuitive understanding of possible quantum degrees of freedom. Moreover, they are always switched on perturbatively, and no non-analyticity in perturbation parameters, as with higher derivative effective actions, arises. The expansion of the quantum Hamiltonian also showed that in general half-integer powers of $\hbar$ have to be expected in correction terms and not just integer powers as often stated. The only exception is the first order in $\hbar^{1/2}$, which does not appear because the expectation value of variables $G^1$ would be zero by definition. Half-integer powers do not appear only if one has a system with a Hamiltonian even in all canonical variables, such as an anharmonic oscillator with an even potential, as often occurs in quantum field theories. These observations are relevant for quantum gravity phenomenology because an expansion in the Planck length $\ell_P = \sqrt{\kappa\hbar}$ naturally involves half-integer powers in $\hbar$. From the perspective provided here, one can expect all integer powers of the Planck length except for the linear one. Other advantages are that the effective equations have a geometrical interpretation where only real variables, unlike q(t) in the usual definition, occur.
We are dealing directly with equations of motion displaying only the relevant degrees of freedom, which are automatically provided with an interpretation as properties of the wave function, and can directly deal with canonical formulations in which the scheme indeed arises most naturally. The techniques are general enough for arbitrary initial states and systems with unbounded Hamiltonians, as demonstrated by our quantum cosmology example. The infrared problem of (2.4) for m → 0 is seen to arise only in the adiabatic approximation, but can easily be treated by using more general notions of effectivity such as by including the spreading parameters Ga,2 in a pre-symplectic effective system. As discussed briefly in the preceding section, techniques introduced here can also be used directly at the quantum level and not just for effective semiclassical approximations. In this context, we have presented only first steps, but this already shows that the techniques can give information on dynamical coherent states. This will then also have helpful implications for the effective equation scheme itself from which such states arise, as they can give a handle on computing the pull-back of the full symplectic structure.
Acknowledgments
We thank Abhay Ashtekar for several discussions and suggestions in the early stages of this work. We are grateful to Emil Akhmedov, Benjamin Bahr, Oscar Castillo, Héctor Hernández, Mikolaj Korzynski, Angel Muñoz, Hanno Sahlmann and Thomas Thiemann for fruitful discussions on different aspects of this work. MB is grateful to the Isaac Newton Institute for Mathematical Sciences, Cambridge for its hospitality during the workshop “Global Problems in Mathematical Relativity”, where this paper was completed, and thanks the organizers Piotr Chrusciel and Helmut Friedrich for the invitation.
October 7, 2006 17:42 WSPC/148-RMP
J070-00278
Reviews in Mathematical Physics Vol. 18, No. 7 (2006) 747–779 © World Scientific Publishing Company
EXISTENCE AND STABILITY OF SOLITARY WAVES IN NON-LINEAR KLEIN–GORDON–MAXWELL EQUATIONS
EAMONN LONG
University of Cambridge, CMS, Wilberforce Road, Cambridge, CB3 0WA, U.K.
[email protected]
Received 15 March 2006
Revised 28 July 2006
We prove the existence and stability of non-topological solitons in a class of weakly coupled non-linear Klein–Gordon–Maxwell equations. These equations arise from coupling non-linear Klein–Gordon equations to Maxwell’s equations for electromagnetism.
Keywords: Solitons; stability; Klein–Gordon–Maxwell.
Mathematics Subject Classification 2000: 22E46, 53C35, 57S20
1. Statement of Results
1.1. Introduction
In this article, we are interested in the existence and stability of a class of solitary wave solutions to the following system of equations in four-dimensional space-time:
$$\Box\phi = 2ieA_0\dot\phi + ie\dot A_0\phi + e^2A_0^2\phi - 2ieA\cdot\nabla\phi - e^2|A|^2\phi + G'(|\phi|), \tag{1.1}$$
$$\Box A = \langle ie\phi, (\nabla - ieA)\phi\rangle - \nabla\dot A_0, \tag{1.2}$$
$$-\Delta A_0 = \langle ie\phi, \dot\phi\rangle - e^2|\phi|^2A_0, \tag{1.3}$$
where $\Box = \frac{d^2}{dt^2} - \Delta$ is the wave operator, $\dot\phi = \frac{d}{dt}\phi$, $\dot A_0 = \frac{d}{dt}A_0$, and $\Delta = \nabla\cdot\nabla$ is the Laplacian in three-dimensional space. We refer to $e$ as the (electromagnetic) coupling constant. The spatial part of the electromagnetic gauge field is given by the real valued function $A$, while the real valued function $A_0$ is the temporal part of the gauge field. The potential function $G$ is subject to a number of associated hypotheses which we detail in Appendix A.1. A good paradigm for $G$ is the function $G(|\phi|) = |\phi|^p - \frac{m^2}{2}|\phi|^2$, where $m$ is a fixed number and $p \in (2, 6)$; the significance of 6 is that it is the critical Sobolev exponent in three-dimensional space for the embedding $H^1 \hookrightarrow L^p$. Collectively, we call Eqs. (1.1)–(1.3) the non-linear Klein–Gordon–Maxwell equations in the Coulomb gauge. The equations admit a Hamiltonian structure (1.8). The solitary wave solutions in which we are interested
are finite energy solutions of the form $e^{i\omega t}f_\omega(x)$; they decay exponentially at infinity and are called non-topological solitons. Our main results concern the existence and stability of non-topological soliton solutions to the non-linear Klein–Gordon–Maxwell equations. Physically, $\phi$ is a self-attracting scalar field which carries an electric charge given by (1.7) and which experiences the electromagnetic force that is communicated via $A$ and $A_0$. For the purposes of this article, non-topological solitons are defined in Sec. 1.2. The existence of these solutions is stated precisely in Theorem 1.1, and their stability in Theorem 1.2. In a forthcoming article, we derive an equation of motion of the soliton field in the presence of a background electromagnetic field. Our non-topological soliton solutions are localized in that they decay exponentially (Lemma A.16) and are stable. Thus, the non-topological soliton may be considered as a reasonable model for a particle. The goal is to compare the equation of motion derived for the soliton with that of an electron in the presence of a background electromagnetic field. Indeed, the “true” classical equation of motion for a point charge has been the subject of some controversy and research since the ill-posed Lorentz–Dirac equation for a point charge was derived; see, for example, Spohn’s book [1] and the references therein.
1.1.1. Context
Let us relate, somewhat cursorily, the work in this article to previous research. We may view the existence and stability results herein as a natural extension to the non-linear Klein–Gordon–Maxwell system (1.1)–(1.3) of similar results (found by Berestycki and Lions [2], Coleman [3], Grillakis, Shatah and Strauss [4], and Stuart [5]) applicable to the non-linear Klein–Gordon equation
$$\Box\phi = G'(\phi), \tag{1.4}$$
which we refer to as the “e = 0 case”. Indeed, in the statement of our results, we demand that the coupling constant $e$ be sufficiently close to 0. However, the usual technique of proving the existence of energy-minimizing non-topological soliton solutions, Schwarz symmetrization, does not appear to be applicable to the non-linear Klein–Gordon–Maxwell system, as the electrostatic energy given by
$$e^2\iint\frac{\langle i\phi,\psi\rangle(x)\,\langle i\phi,\psi\rangle(y)}{|x-y|}\,dx\,dy$$
is increased by concentrating the charge density $\langle i\phi,\psi\rangle$ as per Schwarz symmetrization. On the other hand, there have been (variational) studies on the question of existence of non-topological solitary waves for the case of $G(f) = f^p - \frac{m^2}{2}f^2$ by Benci and Fortunato [6] for 4 < p < 6, and by d’Aprile and Mugnai [7] for 2 < p < 4. But the solutions in [6, 7] are found via a mountain-pass type method. It is therefore not clear if these solutions are stable. Indeed, for 3 ≤ p < 6 and
e = 0 those solutions in [6, 7] are not stable, cf. [5, 4]. However, it is possible to adapt the argument found in [3] to deduce the existence of a non-topological soliton which minimizes the Hamiltonian energy within a given charge sector, provided that one can demonstrate that a minimizer of the Hamiltonian energy within a given charge sector may be taken a priori to be radial. Unfortunately, we have been unable to show that a minimizer of (1.5), under conditions (1.6) and (1.7), can be assumed to be radial.
1.1.2. Hamiltonian formalism
We may consider the non-linear Klein–Gordon–Maxwell equations as arising from the Hamiltonian
$$H(\phi,\psi,A,E) = \frac12\int\left(|E|^2 + |\nabla\times A|^2 + |\psi|^2 + |\nabla_A\phi|^2 - G(|\phi|)\right), \tag{1.5}$$
subject to the constraints:
$$C_0 := \nabla\cdot E - \langle ie\phi,\psi\rangle = 0; \tag{1.6}$$
$$C_1 := \int\langle i\phi,\psi\rangle = Q. \tag{1.7}$$
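Here $\nabla_A\phi$ is the covariant derivative of $\phi$ given by $\nabla_A\phi = \nabla\phi - ieA\phi$, and $A$ is the vector part of the contravariant gauge field. The equations of motion for the augmented Hamiltonian $H_1 = H - A_0C_0$ are:
$$\frac{d}{dt}\begin{pmatrix}\phi\\ \psi\\ A_i\\ E_i\end{pmatrix} = \begin{pmatrix}\psi + ieA_0\phi\\ \Delta_A\phi - G'(\phi) + ieA_0\psi\\ E_i + \nabla_iA_0\\ \Delta A_i - \nabla_i(\nabla\cdot A) + \langle ie\phi,\nabla_A\phi\rangle\end{pmatrix}, \tag{1.8}$$
where $A_0$ is identifiable with the temporal part of the gauge field, $\Delta_A\phi = \Delta\phi - 2ieA\cdot\nabla\phi - ie\phi\nabla\cdot A - e^2|A|^2\phi$, $i = 1, 2, 3$, and we have not yet chosen any gauge (see Sec. 1.2.1).
1.2. Non-topological solitons
The class of solitary wave solutions of interest is that of so-called non-topological solitons. Our basic soliton is given by
$$\begin{pmatrix}\phi\\ \psi\\ A\\ E\end{pmatrix} = \begin{pmatrix}\mathrm{Exp}[i\omega t]f_{\omega,e}\\ \mathrm{Exp}[i\omega t]\,i(\omega - e\alpha_{\omega,e})f_{\omega,e}\\ 0\\ -\nabla\alpha_{\omega,e}\end{pmatrix}, \tag{1.9}$$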
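where we have emphasized the dependence on the parameters $\omega$ and $e$. In this instance, the functions $f_{\omega,e}$ and $\alpha_{\omega,e}$ are radial and solve the simultaneous equations:
$$-\Delta\alpha_{\omega,e} + e^2f_{\omega,e}^2\alpha_{\omega,e} = e\omega f_{\omega,e}^2; \tag{1.10}$$
$$-\Delta f_{\omega,e} - G'(f_{\omega,e}) + (m^2 - (\omega - e\alpha_{\omega,e})^2)f_{\omega,e} = 0, \tag{1.11}$$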
where we have accounted for $C_0 = 0$. In the language of (1.8), we identify $\alpha_{\omega,e}$ as being $A_0$ for the soliton. In other words, $\alpha_{\omega,e}$ is the electric potential for a static soliton. The Hamiltonian equations (1.8) are Poincaré covariant. Let us present the full action of the Poincaré group on the radial soliton (1.9):
$$\begin{pmatrix}\phi(x;\lambda,e)\\ \psi(x;\lambda,e)\\ A(x;\lambda,e)\\ E(x;\lambda,e)\end{pmatrix} = \begin{pmatrix}\mathrm{Exp}[i\Theta](f_{\omega,e}(Z))\\ \mathrm{Exp}[i\Theta](i\gamma(\omega - e\alpha_{\omega,e}(Z))f_{\omega,e}(Z) - \gamma u\cdot\nabla_Zf_{\omega,e}(Z))\\ -\gamma u\,\alpha_{\omega,e}(Z)\\ -\left(\frac{1}{\gamma}P_u + \gamma Q_u\right)\nabla_Z\alpha_{\omega,e}(Z)\end{pmatrix}. \tag{1.12}$$
The projection operators $P_u : \mathbb{R}^3 \to \mathbb{R}^3$ and $Q_u : \mathbb{R}^3 \to \mathbb{R}^3$ are defined by $(P_u)_{ij} = \frac{u_iu_j}{|u|^2}$ and $Q_u = 1 - P_u$. We define $Z = \gamma P_u(x-\xi) + Q_u(x-\xi)$, $\Theta = \theta - \omega u\cdot Z$, $\gamma(u) = \frac{1}{\sqrt{1-|u|^2}}$ and $\lambda = (\theta, \omega, u, \xi)$. Since the equations of motion (1.8) are Poincaré covariant, the solitons given by (1.12) form an eight-parameter family of solutions of the equations of motion (1.8) as long as $\frac{d}{dt}\lambda = \left(\frac{\omega}{\gamma}, 0, 0, u\right)$. Indeed, it is useful to introduce the parameter
$$\Lambda = \lambda - \left(\int_0^t\frac{\omega(s)}{\gamma[u(s)]}\,ds,\ 0,\ 0,\ \int_0^tu(s)\,ds\right). \tag{1.13}$$
1.2.1. Choice of gauge
The Hamiltonian equations of motion (1.8) are covariant under gauge transformations. That is to say that, if $(\phi, \psi, A, E, A_0)$ is a solution to (1.8), the gauge transformed version
$$\left(\mathrm{Exp}[ie\chi]\phi,\ \mathrm{Exp}[ie\chi]\psi,\ A + \nabla\chi,\ E,\ A_0 + \frac{d}{dt}\chi\right)$$
is also a solution for any twice differentiable function $\chi$. For the purposes of proving stability, this gauge covariance can be a nuisance. We can effectively eliminate this nuisance by imposing either the Coulomb condition (the Coulomb gauge) $\nabla\cdot A = 0$
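or the Lorentz condition (the Lorentz gauge)
$$\frac{d}{dt}A_0 - \nabla\cdot A = 0.$$
The Lorentz condition is covariant under Lorentz boosts. The radial (and therefore, static) soliton (1.9) is trivially in both the Lorentz and the Coulomb gauge. Therefore, the Lorentz boosted solitons are in the Lorentz gauge. However, for the purposes of demonstrating stability of the solitons, it seems here to be more useful to consider everything in the Coulomb gauge, principally because, when $\nabla\cdot A = 0$, it follows that $\|\nabla A\|_{L^2} = \|\nabla\times A\|_{L^2}$ wherever the norms are defined. In any event, the Lorentz boosted solitons in the Coulomb gauge have the form
$$\begin{pmatrix}\phi_{S,e}(x)\\ \psi_{S,e}(x)\\ A_{S,e}(x)\\ E_{S,e}(x)\end{pmatrix} = \begin{pmatrix}\mathrm{Exp}[i\tilde\Theta](f_{\omega,e}(Z))\\ \mathrm{Exp}[i\tilde\Theta](i\gamma(\omega - e\alpha_{\omega,e}(Z))f_{\omega,e}(Z) - \gamma u\cdot\nabla_Zf_{\omega,e}(Z))\\ -\gamma u\,\alpha_{\omega,e}(Z) + \nabla\chi\\ -\left(\frac{1}{\gamma}P_u + \gamma Q_u\right)\nabla_Z\alpha_{\omega,e}(Z)\end{pmatrix}, \tag{1.14}$$
where $\tilde\Theta = \Theta + e\chi$, and $\chi$ satisfies $-\Delta\chi = -\gamma u\cdot\nabla\alpha_{\omega,e}(Z)$. In this scheme, the temporal part of the gauge field, $A_{0S}$, is given by $A_{0S,e}(x) = \gamma\alpha_{\omega,e}(Z) + \frac{1}{e}\dot\chi(x)$.
The following function spaces will be used:
$$L^p = \left\{f : \left(\int_{\mathbb{R}^3}|f|^p\right)^{1/p} = \|f\|_{L^p} < \infty\right\}, \tag{1.15}$$
$$H^k = \left\{f : \sum_{|\alpha|=0}^{k}\|D^\alpha f\|_{L^2} < \infty\right\}, \tag{1.16}$$
$$\dot H^1 = \{f \in L^6 \mid \|\nabla f\|_{L^2} = \|f\|_{\dot H^1} < \infty\}. \tag{1.17}$$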
Define $H_r^k$ to be the intersection of $H^k$ and the space of radial functions, and similarly define $L_r^p$. We shall make frequent use of the $L^2$ inner product defined by
$$\langle a, b\rangle_{L^2} = \int_{\mathbb{R}^3}\langle a, b\rangle, \tag{1.18}$$
where $\langle a, b\rangle = \frac12(a\bar b + b\bar a)$.
1.3. Soliton existence
Crucial to our analysis is the e = 0 soliton. Indeed, we make the following hypothesis:
(SOL) For $\omega^2 < m^2$, there exists a unique positive radial function $f_{\omega,0} \in H^4(\mathbb{R}^3)$ which solves
$$(-\Delta + m^2 - \omega^2)f_{\omega,0} = \beta(f_{\omega,0})f_{\omega,0},$$
where $G(f) = U(f) - \frac{m^2}{2}f^2$ with $U(f) = \int_0^f t\beta(t)\,dt$. Conditions on $G$ sufficient to ensure that this occurs are given in Appendix A.1. We also need a hypothesis to apply an implicit-function-type argument:
(KER) The kernel of $L_+(\omega)$ is trivial in $H_r^2(\mathbb{R}^3)$,
where $L_+(\omega)$ is given by
$$L_+(\omega) = -\Delta + m^2 - \omega^2 - \beta(f_{\omega,0}) - \beta'(f_{\omega,0})f_{\omega,0}. \tag{1.19}$$
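To see where the operators in (SOL) and (KER) come from, the following short computation may help. It is only a sketch: it assumes that, for a complex field, $G'(\phi)$ is read as $\beta(|\phi|)\phi - m^2\phi$ (our interpretation of $G(f) = U(f) - \frac{m^2}{2}f^2$ with $U'(f) = f\beta(f)$), and the perturbation $h$ is a symbol introduced here for illustration.

```latex
% Standing-wave ansatz in the e = 0 equation (1.4), \Box\phi = G'(\phi):
\phi(t,x) = e^{i\omega t} f(x), \quad f \text{ real}
\;\Longrightarrow\;
\Box\phi = e^{i\omega t}\,(-\omega^2 - \Delta) f .
% With G'(f) = f\beta(f) - m^2 f this is the equation in (SOL):
(-\Delta + m^2 - \omega^2)\, f = \beta(f)\, f .
% Linearising about f_{\omega,0} along a real radial perturbation h:
\frac{d}{d\epsilon}\Big|_{\epsilon=0}
\Big[(-\Delta + m^2 - \omega^2)(f_{\omega,0}+\epsilon h)
   - \beta(f_{\omega,0}+\epsilon h)\,(f_{\omega,0}+\epsilon h)\Big]
= \big(-\Delta + m^2 - \omega^2 - \beta(f_{\omega,0})
   - \beta'(f_{\omega,0})\, f_{\omega,0}\big)\, h
= L_+(\omega)\, h .
```

In this reading, $L_+(\omega)$ is the linearisation of the (SOL) equation in the real direction, and (KER) asserts that it has no non-zero radial null vector in $H_r^2$.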
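This hypothesis is valid under the conditions imposed on $G$ in [8]. We are now in a position to state the first main result of this paper.
Theorem 1.1. Suppose that $\tilde\omega^2 < m^2$ and the hypotheses (SOL) and (KER) hold. Then, there exists $e(\tilde\omega) > 0$ such that, if $|e| < e(\tilde\omega)$, there exists $f_{\tilde\omega,e} \in H_r^2(\mathbb{R}^3)$ such that
$$-\Delta f_{\tilde\omega,e} + m^2f_{\tilde\omega,e} - (\tilde\omega - e\alpha_{\tilde\omega,e})^2f_{\tilde\omega,e} = \beta(f_{\tilde\omega,e})f_{\tilde\omega,e}, \tag{1.20}$$
where $\alpha_{\tilde\omega,e} \in \dot H_r^1(\mathbb{R}^3)$ is a non-local function of $f_{\tilde\omega,e}$ uniquely determined by
$$-\Delta\alpha_{\tilde\omega,e} + e^2f_{\tilde\omega,e}^2\alpha_{\tilde\omega,e} = \tilde\omega ef_{\tilde\omega,e}^2. \tag{1.21}$$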
In addition, there exists $e(\tilde\omega) > 0$ and a neighborhood $U$ of $\tilde\omega$ such that, if $|e| < e(\tilde\omega)$, the map $\omega \mapsto f_{\omega,e}$ is $C^2$ from $U$ to $H_r^2$.
We refer to this theorem as the existence theorem. It is proved in Sec. 2.
1.4. Stability
In order to state the stability theorem, it will be helpful to define $\Phi_{S,e}(\lambda)$ by
$$\Phi_{S,e}(\lambda) = (\phi_{S,e}(\lambda), \psi_{S,e}(\lambda), A_{S,e}(\lambda), E_{S,e}(\lambda)), \tag{1.22}$$
while we shall abbreviate a general solution by making use of the following definition:
$$\Psi = (\phi, \psi, A, E). \tag{1.23}$$
Following the notation in [5], we observe that $\lambda \in O$, where $O \subset \mathbb{R}^8$ and
$$O := \{(\theta, \omega, u, \xi) \in \mathbb{R}^8 : |u| < 1 \text{ and } \omega^2 < m^2\}. \tag{1.24}$$
An important set is $O_{\mathrm{Stability},e} \subset O$, which is defined by
$$O_{\mathrm{Stability},e} := \left\{(\theta, \omega, u, \xi) \in O : f_{\omega,e}\ \text{exists and}\ \frac{d}{d\omega}\left(\omega\|f_{\omega,0}\|_{L^2}^2\right) < 0\right\}. \tag{1.25}$$
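To get a feeling for the condition in (1.25), consider the model case $e = 0$ with a pure power $U(f) = f^p$, so that $\beta(f) = pf^{p-2}$. This is an illustrative assumption, not one of the hypotheses of the paper. Writing $\kappa^2 = m^2 - \omega^2$, the rescaling $f_{\omega,0}(x) = \kappa^{2/(p-2)}g(\kappa x)$ removes $\omega$ from the equation in (SOL), hence $\|f_{\omega,0}\|_{L^2}^2 = \kappa^{4/(p-2)-3}\|g\|_{L^2}^2$ and the sign of $\frac{d}{d\omega}(\omega\|f_{\omega,0}\|_{L^2}^2)$ can be read off in closed form. The Python sketch below (function names are hypothetical) evaluates that sign for $\omega > 0$.

```python
def charge_slope(omega, m, p):
    """d/d(omega) of omega * ||f_{omega,0}||^2, up to a positive constant.

    Assumes U(f) = f**p, so that ||f_{omega,0}||^2 = C * (m^2 - omega^2)**a
    with a = 2/(p-2) - 3/2 and C > 0; the constant C is dropped.
    """
    if not 2.0 < p < 6.0:
        raise ValueError("p must lie in (2, 6)")
    kappa2 = m * m - omega * omega
    if kappa2 <= 0.0:
        raise ValueError("need omega^2 < m^2")
    a = 2.0 / (p - 2.0) - 1.5
    # d/dw [w * k2^a] = k2^(a-1) * (k2 - 2*a*w^2), with k2 = m^2 - w^2
    return kappa2 ** (a - 1.0) * (kappa2 - 2.0 * a * omega * omega)


def stable_band_start(p):
    """Smallest omega^2/m^2 above which the slope is negative, or None."""
    if not 2.0 < p < 6.0:
        raise ValueError("p must lie in (2, 6)")
    if p >= 4.0:
        return None  # the slope is positive for every admissible omega
    ratio = (p - 2.0) / (8.0 - 2.0 * p)
    return ratio if ratio < 1.0 else None


if __name__ == "__main__":
    for p in (2.5, 3.0, 3.5, 5.0):
        print(p, stable_band_start(p))
```

Under these assumptions a band of frequencies satisfying the inequality in (1.25) exists only for $p < 10/3$; for $p = 3$, for instance, it is $m^2/2 < \omega^2 < m^2$.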
1.4.1. Local well-posedness
We also need the hypothesis that the Cauchy problem for (1.1)–(1.3) is locally well-posed, that is to say, given initial data $(\phi(0), \dot\phi(0), A(0), \dot A(0)) \in H^1 \oplus L^2 \oplus \dot H^1 \oplus L^2$ in the Coulomb gauge (i.e. $\nabla\cdot A(0) = 0$, $\nabla\cdot\dot A(0) = 0$), there exist
$$T_* = T_*\left(\|(\phi(0), \dot\phi(0), A(0), \dot A(0))\|_{H^1\oplus L^2\oplus\dot H^1\oplus L^2}\right) \tag{1.26}$$
and a unique solution $(\phi(t), \dot\phi(t), A(t), \dot A(t))$ with the property that
$$(\phi(t), \dot\phi(t)) \in C([0, T_*); H^1\oplus L^2) \cap C^1([0, T_*); L^2\oplus H^{-1}), \qquad (A(t), \dot A(t)) \in C([0, T_*); \dot H^1\oplus L^2).$$
Furthermore, it is assumed that the solution is continuous with respect to the initial data inasmuch as, for initial data $(\phi(0), \dot\phi(0), A(0), \dot A(0))$ and $(\phi_1(0), \dot\phi_1(0), A_1(0), \dot A_1(0))$ in the Coulomb gauge which are close in $H^1\oplus L^2\oplus\dot H^1\oplus L^2$, the following holds on the common domain of definition $[0, T_*]$:
$$\max_{[0,T_*]}\|(\phi - \phi_1, \dot\phi - \dot\phi_1, A - A_1, \dot A - \dot A_1)\|_{H^1\oplus L^2\oplus\dot H^1\oplus L^2} \le c\,\|(\phi(0) - \phi_1(0), \dot\phi(0) - \dot\phi_1(0), A(0) - A_1(0), \dot A(0) - \dot A_1(0))\|_{H^1\oplus L^2\oplus\dot H^1\oplus L^2},$$
for some constant $c > 0$. Conditions on the non-linearity sufficient to ensure local well-posedness are given in Appendix A.1. A precise statement of this fact is the subject of Theorem 3.1. A proof of the theorem appears in Appendix A.2.
1.4.2. The Stability Theorem
Our solitons can be seen to be stationary points of the Hamiltonian given by (1.5), subject to the constraints that
$$Q(\phi, \psi, A, E) = q, \tag{1.27}$$
ρ(φ, ψ, A, E) ≡ 0,
(1.28)
Π(φ, ψ, A, E) = p,
(1.29)
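where $Q(\phi,\psi,A,E) = \int\langle i\phi,\psi\rangle$, $\rho(\phi,\psi,A,E) = \nabla\cdot E - \langle i\phi,\psi\rangle$ and $\Pi(\phi,\psi,A,E) = \int\left(\langle\psi,(\nabla - ieA)\phi\rangle + E\times\nabla\times A\right)$, inasmuch as the solitons satisfy
$$H' - \frac{\omega}{\gamma[u]}Q' - \frac{1}{\gamma[u]}\alpha_{\omega,e}\,\rho' + u\cdot\Pi' = 0, \tag{1.30}$$
where $\omega$, $u$ and $\alpha$ can be interpreted as Lagrange multipliers. An important quantity in the stability analysis is the enlarged functional $J_t$, which is given by
$$J_t(\Psi) = H(\Psi) - \frac{\omega(t)}{\gamma[u(0)]}Q(\Psi) - \frac{1}{\gamma[u(0)]}\alpha_{\omega,e}\,\rho(\Psi) + u(0)\cdot\Pi(\Psi). \tag{1.31}$$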
In order for the Hessian of $J_t$ to control the norm of any perturbation of the soliton solution, we make the following assumption:
(Stability) $J_t''(\Phi_{S,e}(\lambda))[\Psi]$ is uniformly equivalent to $\|\Psi\|^2_{H^1\oplus L^2\oplus\dot H^1\oplus L^2}$ on compact sets of $\lambda \in O_{\mathrm{Stability},e}$.
It is proved in Theorem 3.2 that this assumption is valid if we assume property (S1) found in Appendix A.1 and that the solution is of the form found in (3.3) satisfying the constraints (3.14). The next theorem is a precise statement of the soliton being stable.
Theorem 1.2. Suppose that the potential $G$ satisfies (A.2)–(A.5), U(1), U(2), S(1), (WP1) and (WP2). Suppose further that, for $\tilde\lambda = (\tilde\theta, \tilde\omega, \tilde u, \tilde\xi)$, we have $\frac{\partial}{\partial\omega}\left(\omega\|f_{\omega,0}\|_{L^2}^2\right)\big|_{\omega=\tilde\omega} < 0$. It follows that there exist $\varepsilon^*(\tilde\lambda, e), e(\tilde\lambda) > 0$ such that, if $|e| < e(\tilde\lambda)$ and $\varepsilon = \|\Psi(0) - \Phi_{S,e}(\tilde\lambda)\|_{H^1\oplus L^2\oplus\dot H^1\oplus L^2} < \varepsilon^*(\tilde\lambda, e)$, there exist $c_1 > 0$, $\lambda(t) \in C^1(\mathbb{R}, O_{\mathrm{Stability},e})$ and $\Psi(t) \in C(\mathbb{R}, H^1\oplus L^2\oplus\dot H^1\oplus L^2)$ solving equations (1.1)–(1.3) with
$$\sup_{t\in\mathbb{R}}\|\Psi(t) - \Phi_{S,e}(\lambda(t))\|_{H^1\oplus L^2\oplus\dot H^1\oplus L^2} < c_1\varepsilon. \tag{1.32}$$
Furthermore, $\lambda(t)$ satisfies a system of ordinary differential equations given by (4.24) with
$$\left|\frac{d}{dt}\Lambda\right| < c_2(\varepsilon + |e|), \tag{1.33}$$
where $\Lambda$ is defined by (1.13).
2. Existence: The Proof of Theorem 1.1
Proof. Analogously to the existence proof in [9], we consider the mapping $F : H_r^2(\mathbb{R}^3)\times\mathbb{R}\times\mathbb{R} \to L_r^2(\mathbb{R}^3)$ defined by
$$F(\phi, e, \nu) = -\Delta\phi + (m^2 - (\omega - e\alpha_{\omega,e}(\phi))^2)\phi - \beta(|\phi|)\phi + i\nu\phi, \tag{2.1}$$
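where $\alpha_{\omega,e} = \alpha_{\omega,e}(\phi)$ solves $-\Delta\alpha_{\omega,e} + e^2|\phi|^2\alpha_{\omega,e} = e\omega|\phi|^2$. From Lemma A.9 in the Appendix, we have $\alpha_{\omega,e} \in C^1[H_r^2(\mathbb{R}^3); L_r^{2^*}]$. It may be demonstrated from a maximum principle (see Lemma A.10 in the Appendix) that, for each $\phi$, $\alpha_{\omega,e} \in L_r^\infty$. These two facts together imply that $F$ is continuously differentiable everywhere. From (SOL), there exists $R \in H_r^2(\mathbb{R}^3)$ such that $F(R, 0, 0) = 0$. Consider the partial derivative of $F$ with respect to $\phi$ at the point $(R, 0, 0)$. This can be written as
$$F_\phi = \begin{pmatrix}-\Delta + m^2 - \omega^2 - \beta(R) - \beta'(R)R & 0\\ 0 & -\Delta + m^2 - \omega^2 - \beta(R)\end{pmatrix}, \tag{2.2}$$
where we think of $H_r^2(\mathbb{R}^3; \mathbb{C})$ as $H_r^2(\mathbb{R}^3; \mathbb{R})\times H_r^2(\mathbb{R}^3; \mathbb{R})$, and $F_\phi$ as a real matrix operator with $F_\phi : H_r^2(\mathbb{R}^3; \mathbb{R})\times H_r^2(\mathbb{R}^3; \mathbb{R}) \to L_r^2(\mathbb{R}^3; \mathbb{R})\times L_r^2(\mathbb{R}^3; \mathbb{R})$, and G (φ) = β (|φ|)φ. We wish to determine the cokernel of $F_\phi$. Since the finite dimensional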
kernel of $F_\phi$ is known [5, 10], we wish to demonstrate that $F_\phi$ is self-adjoint, for, in that case, the kernel and cokernel coincide precisely. It is obvious that $F_\phi$ is a symmetric operator, and that it is densely defined on $L_r^2$. Now, $R \in H_r^2(\mathbb{R}^3)$ and so $R \in L_r^\infty$. Thus, since $\beta$ is continuous, there exists $k \in \mathbb{R}$ such that $F_\phi + ki$ is invertible, i.e. the range of $F_\phi + ki$ is $L_r^2$. It follows, therefore (see, for example, [11]), that $F_\phi$ is self-adjoint. The kernel, and thus the cokernel, lying in $H_r^2$ is given [5, 10] by the span of $\{iR\}$. (The kernel may be thought of as arising from phase covariance.) It is clear that the cokernel is “filled out” by the addition of the term involving $\nu$. The range of $(F_\phi, F_\nu)$, evaluated at $(R, 0, 0)$, is, therefore, $L_r^2(\mathbb{R}^3)$. This implies that $(F_\phi, F_\nu)$ is surjective, and so we can apply the implicit function theorem to obtain the existence, for each $e$ in some neighborhood (possibly dependent on $\omega$) of $e = 0$, of some $(\phi, e, \nu)$ such that $F(\phi, e, \nu) = 0$. For example, consider the function $G$ where $G : H_r^2(\mathbb{R}^3)\cap(\mathrm{span}\{iR\})^\perp\times\mathbb{R}\times\mathbb{R} \to L_r^2(\mathbb{R}^3)\times\mathbb{R}$, and $G(\phi, e, \nu) = (F(\phi, e, \nu), \nu)$. We now claim that $F(\phi, e, \nu) = 0$ forces $\nu = 0$. Consider the inner product $\langle F(\phi, e, \nu), iR\rangle_{L^2}$. Since $\|R\|_{L^2} \neq 0$, we must have $\nu = 0$. Next, define $F : \mathbb{R}\oplus H_r^2 \to L_r^2$ by
$$F(\omega, f) = (-\Delta + m^2 - (\omega - e\alpha[f])^2)f - \beta(f)f. \tag{2.3}$$
Now, there exists $e_1(\tilde\omega) > 0$ such that, if $|e| < e_1(\tilde\omega)$, there exists $f_{\tilde\omega,e} \in H_r^2$ such that
$$(-\Delta + m^2 - (\tilde\omega - e\alpha[f_{\tilde\omega,e}])^2)f_{\tilde\omega,e} = \beta(f_{\tilde\omega,e})f_{\tilde\omega,e}. \tag{2.4}$$
Next, from (KER), we know that in $H_r^2$ the kernel of $L_+$ is trivial, where $L_+$ is given by
$$-\Delta + m^2 - \tilde\omega^2 - \beta(f_{\tilde\omega,0}) - \beta'(f_{\tilde\omega,0})f_{\tilde\omega,0}. \tag{2.5}$$
It follows that $L_+$ defines a continuous isomorphism from $H_r^2$ to $L_r^2$. Therefore, by continuity in $e$, it follows that there exists $e(\tilde\omega) > 0$ such that, if $|e| < e(\tilde\omega)$, $\frac{d}{df}F(\tilde\omega, f_{\tilde\omega,e})$ is invertible. It follows from the implicit function theorem and from [5, Theorem 1.4] that $\omega \mapsto f_{\omega,e}$ is $C^2$ from $U$ to $H_r^2$.
3. Stability: The Proof of Theorem 1.2
We shall need the following four subsidiary theorems, the proofs of which we shall defer till later.
3.1. Local well-posedness
Our first theorem is concerned with local well-posedness (in the sense of Sec. 1.4.1) of the Cauchy problem.
Theorem 3.1. Suppose that the potential $G$ satisfies (A.2)–(A.5), (WP1) and (WP2). Let $a_0, a_1, \phi_0, \phi_1$ be initial data satisfying the following:
$$\|\nabla a_0\|_{L^2} + \|a_1\|_{L^2} + \|\phi_0\|_{H^1} + \|\phi_1\|_{L^2} < k_0 < \infty, \tag{3.1}$$
$$\nabla\cdot a_0 = 0 = \nabla\cdot a_1. \tag{3.2}$$
Then, for any $e_0 > 0$, the system of equations (1.1)–(1.3), where $\phi(t=0) = \phi_0$, $\dot\phi(t=0) = \phi_1$, $A(t=0) = a_0$, and $\dot A(t=0) = a_1$, is locally well-posed in the sense of Sec. 1.4.1 on some non-empty time interval $[0, T]$ provided that $|e| < e_0$. The time of existence $T$ depends only on $e_0$ and $k_0$. Furthermore, the solution $(\phi, \dot\phi, \nabla A, \dot A)$ satisfies $(\phi, \dot\phi) \in C([0, T); H^1\oplus L^2)\cap C^1([0, T); L^2\oplus H^{-1})$ while $\nabla A, \dot A \in C([0, T); L^2)$.
Proof. See Appendix A.2.
3.2. Hessian positivity
3.2.1. Ansatz for a nearby solution
We make an ansatz for what we think a solution corresponding to nearby initial data should look like. Our idea is that a solution initially nearby to a soliton will at each time be close to a soliton that is close to the original soliton, i.e. $\Delta(t)$ (defined in Eq. (3.4) below) is small for all $t \in \mathbb{R}^+$. In this case, our ansatz will be
$$\begin{pmatrix}\phi(x)\\ \psi(x)\\ A(x)\\ E(x)\end{pmatrix} = \begin{pmatrix}\phi_{S,e}(\lambda(t)) + \mathrm{Exp}[i(\Theta + e\chi)]v\\ \psi_{S,e}(\lambda(t)) + \mathrm{Exp}[i(\Theta + e\chi)]w\\ A_{S,e}(\lambda(t)) + q\\ E_{S,e}(\lambda(t)) + s\end{pmatrix} \tag{3.3}$$
with the temporal part of the gauge satisfying $A_0 = A_{0S,e} + r$. We also impose the Coulomb gauge so that $\nabla\cdot q = 0$. To quantify how far the nearby solution is from the soliton solution, we introduce the quantity $\Delta(T)$, which is defined by
$$\Delta(T) = \sup_{t\in[0,T]}\left(|\omega - \tilde\omega|^2 + |u - \tilde u|^2 + \|s\|_{L^2}^2 + \|\nabla\times q\|_{L^2}^2 + \|w\|_{L^2}^2 + \|v\|_{H^1}^2\right). \tag{3.4}$$
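The quantity $\Delta(T)$ in (3.4) contains only $\|\nabla\times q\|_{L^2}$, which suffices because $q$ is divergence-free: as recalled in Sec. 1.2.1, $\|\nabla A\|_{L^2} = \|\nabla\times A\|_{L^2}$ whenever $\nabla\cdot A = 0$. In Fourier variables this reduces to the pointwise identity $|k|^2|a|^2 = |k\times a|^2 + (k\cdot a)^2$. The sketch below checks the identity numerically; the helper names are ours, and a real amplitude $a$ is used only to keep the illustration short.

```python
import random


def dot(a, b):
    """Euclidean inner product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def lagrange_defect(k, a):
    """|k|^2 |a|^2 - |k x a|^2 - (k.a)^2, which vanishes identically."""
    kxa = cross(k, a)
    return dot(k, k) * dot(a, a) - dot(kxa, kxa) - dot(k, a) ** 2


def project_transverse(k, a):
    """Remove the component of a along k (Fourier-space Coulomb condition k.a = 0)."""
    scale = dot(k, a) / dot(k, k)
    return tuple(ai - scale * ki for ai, ki in zip(a, k))


if __name__ == "__main__":
    rng = random.Random(0)
    k = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
    a = tuple(rng.uniform(-1.0, 1.0) for _ in range(3))
    a_t = project_transverse(k, a)
    curl_sq = dot(cross(k, a_t), cross(k, a_t))
    grad_sq = dot(k, k) * dot(a_t, a_t)
    # For a transverse amplitude the curl and the full gradient carry the same weight.
    print(abs(lagrange_defect(k, a)) < 1e-12, abs(curl_sq - grad_sq) < 1e-12)
```

Integrating the identity over $k$ with $k\cdot\hat A(k) = 0$ gives the norm equality by Plancherel's theorem; for a complex amplitude the same identity applies to real and imaginary parts separately.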
To demonstrate stability, we shall show that ∆(t) is small for all t provided we make the appropriate choice for λ at each time t. There is clearly a redundancy in our ansatz. The appropriate choice of λ and the elimination of the redundancy is the subject of the following subsection. 3.2.2. Constraints for Hessian positivity The functional given in Eq. (1.31) shall be crucial to our stability analysis. It would be preferable if the Hessian controlled the norm of our perturbation. Thus, we want in some sense our perturbations to be perpendicular to the null directions and to
any negative directions. In particular, we should like the Hessian of $J_t$ to be positive for sufficiently small values of the coupling constant $e$. In order to achieve this, we proceed to impose the constraints imposed in [5] for the corresponding problem for the non-linear Klein–Gordon equation (i.e. the “e = 0 case”). Before stating what these equations are, we note that there are some global symmetries of the functional $J_t$: it is invariant under translation and a global phase change. We expect therefore that the Hessian of $J_t$ evaluated at a solution will be zero in the direction of the generators of these symmetries. For the e = 0 case, a description of these generators is given by Eqs. (3.7), (3.8), (3.11) and (3.12). In addition, since the e = 0 soliton solves a constrained minimization problem, a negative part of the spectrum of the Hessian of $J_t$ is expected and found [4]. Indeed, changing the momentum $\Pi$ or the charge $Q$ will raise or lower the energy. Thus, since the momentum is related to the velocity $u$ of the soliton while the charge is related to $\omega$, we take $(a_A, b_A)$ (defined below) for $A \in \{-1, 4, 5, 6\}$ as a representation of the generators of a change in momentum and charge. As a result of [5, Theorem 2.7], it is sufficient to require that, for all $t$ in the interval of existence of the solution,
$$\int_{\mathbb{R}^3}\left(\langle w, b_A(Z)\rangle + \langle v, a_A(Z)\rangle\right) = 0 \tag{3.5}$$
for A = −1, 0, . . . , 6 where as in [5] (aA , bA ) are given by b−1 (Z; λ) = gω,0 − iu.Zfω,0 ,
(3.6)
b0 (Z; λ) = ifω,0 ,
(3.7)
bi (Z; λ) = ∇iZ fω,0 (Z),
(3.8)
b3+i (Z; λ) = ζji ∇jZ fω,0 (Z) − iωγ((γPu + Qu )Z)i fω,0 (Z),
(3.9)
while a−1 (Z; λ) = −γ −1 b0 + (γu.∇Z − iγω)b−1 ,
(3.10)
a0 (Z; λ) = (γu.∇Z − iγω)b0
(3.11)
ai (Z; λ) = (γu.∇Z − iγω)bi ,
(3.12)
a3+i (Z; λ) = (γPu + Qu )Z)ij bj + (γu.∇Z − iγω)b3+i , j
(3.13)
where $i, j = 1, 2, 3$, $g_{\omega,e} = \frac{d}{d\omega}f_{\omega,e}$, and $\zeta_j^i = \frac{dZ_j}{du_i} + t(\gamma P_u + Q_u)_{ij}$. An equivalent, more compact representation of the constraints is given in Eq. (3.14). Of course, we should like it to be possible to impose these constraints for the lifetime of the supposed nearby solution. In order to show that this imposition is possible, we first show that it is possible, in some sense, to do this at time $t = 0$; this is the subject of Lemma 3.4. We then prove in Lemma 3.5 that the time derivative of the left-hand side of Eq. (3.5) is zero. That the Hessian of $J_t$ is positive given the imposition of the constraints is the subject of Sec. 4.1 and the following theorem.
3.2.3. Norm equivalence of the Hessian
Theorem 3.2. Suppose that the potential $G$ satisfies (A.2)–(A.5), U(1), U(2), S(1), (WP1) and (WP2). Suppose further that $\lambda$ lies in a compact subset, $C$, of $O_{\mathrm{Stability},e}$. Then, there exists $e_C > 0$ such that, if $|e| < e_C$, the quadratic form $E_e(v, w, q, s)$ given by
$$E_e(v, w, q, s) = \|s\|_{L^2}^2 + \|\nabla\times q\|_{L^2}^2 + 2\int u\cdot s\times\nabla\times q + \|w - i\gamma\omega v + u\cdot\nabla v\|_{L^2}^2 + \langle v, (-\Delta + m^2 - \omega^2 - \beta(f_{\omega,e}))v\rangle_{L^2} + \langle v, -\beta'(f_{\omega,e})f_{\omega,e}\,\mathrm{Re}[v]\rangle_{L^2}$$
is equivalent uniformly on $C$ to $\|(v, w, q, s)\|^2_{H^1\oplus L^2\oplus\dot H^1\oplus L^2}$ provided that $(v, w)$ satisfy the constraints
$$\left\langle v, \frac{d}{d\Lambda}\psi_{S,0}\right\rangle_{L^2} - \left\langle w, \frac{d}{d\Lambda}\phi_{S,0}\right\rangle_{L^2} = 0, \tag{3.14}$$
where we define
$$\frac{d}{d\Lambda}\psi_{S,e} = \exp[-i\Theta]\,\frac{d}{d\Lambda}\left(\exp(-ie\chi)\psi_{S,e}\right), \tag{3.15}$$
and likewise for $\frac{d}{d\Lambda}\phi_{S,e}$.
Proof. See Sec. 4.1.
3.3. Solubility of the constraints
We must, of course, show that, at each time, it is possible to write the solution in the form (3.3) in such a way that the constraints (3.14) hold. This is the content of Theorem 3.6 below. For ease of reading, it is helpful to state this theorem in the form of two lemmas. The first lemma, Lemma 3.4, will show that we can impose the constraints (3.14) at time $t = 0$ provided that we make the correct choice of $\lambda(0)$. Indeed, if $\|\Psi(0) - \Phi_{S,e}(\tilde\lambda)\|_{H^1\oplus L^2\oplus\dot H^1\oplus L^2}$ (where $\Psi(0)$ is the initial data) is small, then so is $\|\Psi(0) - \Phi_{S,e}(\lambda)\|_{H^1\oplus L^2\oplus\dot H^1\oplus L^2}$ for a continuum of $\lambda$ near to $\tilde\lambda$ in $O_{\mathrm{Stability},e}$. However, for only one $\lambda$ near to $\tilde\lambda$ will it be possible to write $\Psi(0)$ in such a way that the constraints (3.14) hold. The second lemma, Lemma 3.5, completes the argument by showing that
$$\frac{d}{dt}\left(\left\langle v, \frac{d}{d\Lambda}\psi_{S,0}\right\rangle_{L^2} - \left\langle w, \frac{d}{d\Lambda}\phi_{S,0}\right\rangle_{L^2}\right) = 0. \tag{3.16}$$
In order to state these lemmas precisely, we need to introduce the following set.
Definition 3.3. Let $\tilde\lambda = (\tilde\theta, \tilde\omega, \tilde u, \tilde\xi) \in O_{\mathrm{Stability},e}$. Then define $K_l^e(\tilde\lambda)$ by
$$K_l^e(\tilde\lambda) = \{\lambda \in O_{\mathrm{Stability},e} : |\omega - \tilde\omega| + |u - \tilde u| \le l\}, \tag{3.17}$$
with the proviso that $l$ satisfies $\{\lambda : |\omega - \tilde\omega| + |u - \tilde u| \le 2l\} \subset O_{\mathrm{Stability},e}$.
Stability Charged Solitary Waves
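We now state the initial data preparation lemma showing that we may impose the constraints at $t = 0$.

Lemma 3.4. Suppose that there exists $\tilde\lambda = (\tilde\Theta, \tilde\omega, \tilde u, \tilde\xi)$ such that
\[
\frac{d}{d\omega}\big(\omega\|f_{\omega,0}\|^2_{L^2}\big)\Big|_{\omega=\tilde\omega} < 0.
\]
Then, there exist $e(\tilde\lambda)$, $\delta(\tilde\lambda, e)$, $c_1 > 0$ such that, if $|e| < e(\tilde\lambda)$ and
\[
\|\phi(0) - \phi_{S,e}(\tilde\lambda)\|_{H^1} + \|\psi(0) - \psi_{S,e}(\tilde\lambda)\|_{L^2} < \delta, \tag{3.18}
\]
there exists $\lambda(0) \in O_{\mathrm{Stability},e}$ depending differentiably upon $(\phi(0), \psi(0))$ such that $(v_e(0), w_e(0))$, defined by
\[
v_e(0) = \mathrm{Exp}[-i(\Theta + e\chi)](\phi(0) - \phi_{S,e}(\lambda(0))) \tag{3.19}
\]
and
\[
w_e(0) = \mathrm{Exp}[-i(\Theta + e\chi)](\psi(0) - \psi_{S,e}(\lambda(0))), \tag{3.20}
\]
satisfy
\[
\langle v_e(0), a_A\rangle_{L^2} + \langle w_e(0), b_A\rangle_{L^2} = 0 \tag{3.21}
\]
for each $A = -1, 0, 1, \ldots, 6$. Furthermore,
\[
\|\phi(0) - \phi_{S,e}(\lambda(0))\|_{H^1} + \|\psi(0) - \psi_{S,e}(\lambda(0))\|_{L^2} < c_1\delta. \tag{3.22}
\]

Proof. See Sec. 4.2.1.

Next, we show that the value of the constraints does not change in time.

Lemma 3.5. Assume that the hypotheses of Lemma 3.4 hold. Let $\lambda(0) \in O_{\mathrm{Stability},e}$ and $(v_e(0), w_e(0))$ be as given in the conclusions of Lemma 3.4. Let $\Psi$ be a solution to the Cauchy problem for (1.1)–(1.3) on the time interval $[0, T^\#]$ with
\[
\sup_{[0,T^\#]}\|\Psi(t)\|^2_{H^1\oplus L^2\oplus\dot H^1\oplus L^2} < N_0.
\]
Then, there exist $\delta_2 > 0$ and $T_1 = T_1(\tilde\lambda, \delta_2, N_0, e(\tilde\lambda)) > 0$ such that, if $\|(v_e(0), w_e(0))\|_{H^1\oplus L^2} < \delta_2$ and $\lambda(0) \in K^e_{l/4}(\tilde\lambda)$ where $l < \delta_2$, on $[0, T_1]$ there exists $\lambda(t) \in C^1([0, T_1]; K^e_{2l}(\tilde\lambda))$ such that the constraints (3.14) are satisfied, i.e.
\[
\Big\langle v_e, \frac{d}{d\Lambda}\psi_{S,0}(\lambda)\Big\rangle_{L^2} - \Big\langle w_e, \frac{d}{d\Lambda}\phi_{S,0}(\lambda)\Big\rangle_{L^2} = 0, \tag{3.23}
\]
where
\[
v_e = \mathrm{Exp}[-i(\Theta + e\chi)](\phi(0) - \phi_{S,e}(\lambda)), \tag{3.24}
\]
\[
w_e = \mathrm{Exp}[-i(\Theta + e\chi)](\psi(0) - \psi_{S,e}(\lambda)). \tag{3.25}
\]

Proof. See Sec. 4.2.3.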
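Combining these lemmas, we have the following theorem.

Theorem 3.6. Let $\Psi$ be a solution to the Cauchy problem for (1.1)–(1.3) on the time interval $[0, T^\#]$ with
\[
\|\Psi(t)\|^2_{H^1\oplus L^2\oplus\dot H^1\oplus L^2} < N_0 \tag{3.26}
\]
at each time $t$. Let $\tilde\lambda = (\tilde\theta, \tilde\omega, \tilde u, \tilde\xi)$ be given with $\frac{\partial}{\partial\omega}\big(\omega\|f_{\omega,0}\|^2_{L^2}\big)\big|_{\omega=\tilde\omega} < 0$. Then, there exist $e(\tilde\lambda) > 0$, $\delta(\tilde\lambda) > 0$, and $c_1 > 0$ such that, if $|e| < e(\tilde\lambda)$ and $\|\Psi(0) - \Phi_{S,e}(\tilde\lambda)\|_{H^1\oplus L^2\oplus\dot H^1\oplus L^2} < \delta(\tilde\lambda)$, there exists $\lambda(0) \in O_{\mathrm{Stability},e}$ depending differentiably upon $\Psi(0)$ such that
\[
\Big\langle v_e(0), \frac{d}{d\Lambda}\psi_{S,0}(\lambda(0))\Big\rangle_{L^2} - \Big\langle w_e(0), \frac{d}{d\Lambda}\phi_{S,0}(\lambda(0))\Big\rangle_{L^2} = 0 \tag{3.27}
\]
with $\|\phi(0) - \phi_{S,e}(\lambda(0))\|_{H^1} + \|\psi(0) - \psi_{S,e}(\lambda(0))\|_{L^2} < c_1\delta$. Furthermore, $\tilde\lambda = \lambda(0)$ if $\Psi(0) = \Phi_{S,e}(\tilde\lambda)$. In addition, if $\lambda(0) \in K^e_{l/4}$, there exists $\delta_2 > 0$ such that, if $\|(v_e(0), w_e(0))\|_{H^1\oplus L^2} < \delta_2$ and $l < \delta_2$, we have the existence of $T_1 = T_1(\tilde\lambda, l, \delta_2, N_0, e(\tilde\lambda)) \in (0, T^\#]$ and of $\lambda \in C^1([0, T_1]; K_{2l})$ with the property that, on $[0, T_1]$, the constraints (3.14) are satisfied, i.e.
\[
\Big\langle v_e, \frac{d}{d\Lambda}\psi_{S,0}(\lambda)\Big\rangle_{L^2} - \Big\langle w_e, \frac{d}{d\Lambda}\phi_{S,0}(\lambda)\Big\rangle_{L^2} = 0, \tag{3.28}
\]
where
\[
v_e = \mathrm{Exp}[-i(\Theta + e\chi)](\phi(0) - \phi_{S,e}(\lambda)), \tag{3.29}
\]
\[
w_e = \mathrm{Exp}[-i(\Theta + e\chi)](\psi(0) - \psi_{S,e}(\lambda)). \tag{3.30}
\]

Proof. See Sec. 4.2.

3.4. Taylor expansion of $J_t$

Our final theorem is used to bind these last three theorems to prove the stability theorem, Theorem 1.2.

Theorem 3.7. Suppose that on $[0, T]$ the constraints (3.14) are satisfied. Suppose that $G$ satisfies hypothesis (N) given in Appendix A.1. Define
\[
\Delta(t) = \sup_{[0,t]}\big(|\omega - \tilde\omega|^2 + |u - \tilde u|^2 + \|s\|^2_{L^2} + \|\nabla\times q\|^2_{L^2} + \|w\|^2_{L^2} + \|v\|^2_{H^1}\big).
\]
Let $\tilde\Psi(t) = \Psi(t) - \Phi_{S,e}(\lambda)$. Then,
\[
J_t''(\Phi_{S,e}(\lambda))[\tilde\Psi(t)]^2 + \gamma[u(0)]\,h_\omega\big(\gamma[u(0)]^2P_{u(0)} + Q_{u(0)}\big)_{ij}\,\delta u_i\,\delta u_j = J_{t=0}''(\Phi_{S,e}(\lambda(0)))[\tilde\Psi(0)]^2 + O(e)O(\Delta) + o(\Delta), \tag{3.31}
\]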
where $J_t''(\Phi_{S,e}(\lambda))[\tilde\Psi]^2$ is the second derivative of $J_t$ evaluated at $\Phi_{S,e}(\lambda)$ with double input $[\tilde\Psi]$.

Proof. See Sec. 4.1.1.

3.5. Completion of the proof of Theorem 1.2

By the local existence theorem, Theorem 3.1, and by Theorem 3.6, we may assume that the solution $\Psi$ exists on $[0, T_1]$ where $T_1 = T_1(\tilde\lambda, l, \delta_2, N_0, e(\tilde\lambda))$ is such that there exists $\lambda \in C^1([0, T_1]; K^e_{2l}(\tilde\lambda))$ with the property that, on $[0, T_1]$, the constraints (3.14) are satisfied. Assume further that $e$ is bounded so that the conclusions of Theorems 3.2, 3.6, and 3.7 hold. From Theorem 3.2, it follows that $E_e(v, w, q, s)$ is equivalent to $\|(v, w, q, s)\|^2_{H^1\oplus L^2\oplus\dot H^1\oplus L^2}$. Therefore, it follows that there exists $\Delta_* > 0$ such that, if $e$ is sufficiently small, and if $\Delta < \Delta_*$ on some time interval $[0, T_2]$, we have that for each $t \in [0, T_2]$
\[
\lambda(t) \in K^e_{l/2}(\tilde\lambda), \tag{3.32}
\]
\[
\Delta(t) \le c_4\,J_{t=0}''(\Phi_{S,e}(\lambda(0)))[\tilde\Psi(0)]^2 \le c_5\,\Delta(0), \tag{3.33}
\]
where $\Delta_*$, $c_4$, and $c_5$ are positive and dependent only upon $\tilde\lambda$ and $\delta_2$ given by Lemma 3.5. Hence, there exists $\delta_3 > 0$ such that, if $\Delta(0) < \delta_3$,
\[
\Delta(T_2) \le \frac{\Delta_*}{2}. \tag{3.34}
\]
Therefore, if $\Delta(0) < \delta_3$, the set of times for which $\Delta \le \frac{\Delta_*}{2}$ is non-empty. By continuity, this set is closed. We shall show that this set is open to finish the proof of the stability theorem, Theorem 1.2. Since $\Delta(T_2) \le \frac{\Delta_*}{2}$, we may assume that $\|\Psi(T_2)\|^2_{H^1\oplus L^2\oplus\dot H^1\oplus L^2} < k_0$. Hence, as before, we may assume that the solution $\Psi$ exists on $[0, T_2 + T_1]$ where $T_1 = T_1(\tilde\lambda, l, \delta_2, N_0, e(\tilde\lambda))$ is such that there exists $\lambda \in C^1([0, T_2 + T_1]; K_{2l})$ with the property that, on $[0, T_1]$, the constraints (3.14) are satisfied. By continuity, we may assume that for some $T_3 > 0$, $\Delta(t) < \Delta_*$ for $t \in [0, T_2 + T_3]$. However, since $\Delta(0) < \delta_3$, we conclude that $\Delta(T_2 + T_3) \le \frac{\Delta_*}{2}$ using the same reasoning as before. We have thus proven Theorem 1.2.
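The continuity argument of Sec. 3.5 can be summarised schematically as follows. This is only a sketch under stated assumptions: the name $\mathcal{T}$ for the set of good times and the explicit choice $\delta_3 = \Delta_*/(2c_5)$ are illustrative and are not taken from the text; $c_5$ and $\Delta_*$ are the constants of (3.33) and (3.34).

```latex
% Bootstrap (continuity) argument, schematically.
% Assumptions (illustrative, not from the paper):
%   \delta_3 := \Delta_*/(2 c_5), and c_5 \ge 1 (take t = 0 in (3.33)).
\begin{align*}
\mathcal{T} &= \{\, T \ge 0 : \Delta(T) \le \tfrac12 \Delta_* \,\}, \\
\text{(i)}\quad & 0 \in \mathcal{T},
   \quad\text{since } \Delta(0) < \delta_3 \le \tfrac12\Delta_* ; \\
\text{(ii)}\quad & \mathcal{T} \text{ is closed, by continuity of } t \mapsto \Delta(t) ; \\
\text{(iii)}\quad & T_2 \in \mathcal{T}
   \;\Longrightarrow\; \Delta < \Delta_* \text{ on } [0, T_2 + T_3]
   \;\Longrightarrow\; \Delta(T_2 + T_3) \le c_5\,\Delta(0) < c_5\,\delta_3 = \tfrac12 \Delta_* .
\end{align*}
% By (i)-(iii), \mathcal{T} is non-empty, closed and relatively open in
% [0,\infty), hence \mathcal{T} = [0,\infty): \Delta(t) \le \Delta_*/2 for all t.
```

Step (iii) is where the a priori bound (3.33) is used: the estimate holds on any interval on which $\Delta < \Delta_*$, and its right-hand side depends only on the initial data.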
4. Proof of Subsidiary Theorems

4.1. Norm equivalence of the Hessian

Since we constrain $|u| \le \delta < 1$ for some $\delta > 0$ and we have from Lemma A.14 that $\|f_{\omega,e} - f_{\omega,0}\|_{H^2} = O(e^2)$, Theorem 3.2 follows as a corollary from the following theorem, which is essentially [5, Theorem 2.7].

Theorem 4.1. Suppose that the potential $G$ is such that (A.2)–(A.5), U(1), U(2), and S(1) hold. Suppose further that $\lambda$ lies in a compact subset, $K^{e=0}_l$, of $O_{\mathrm{Stability},0}$.
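Then, the quadratic form $E_{e=0}(v, w)$ given by
\[
E_{e=0}(v,w) = \|w - i\gamma\omega v + u\cdot\nabla v\|^2_{L^2} + \langle v, (-\Delta + m^2 - \omega^2 - \beta(f_{\omega,0}))v\rangle_{L^2} + \langle v, -\beta'(f_{\omega,0})f_{\omega,0}\,\mathrm{Re}[v]\rangle_{L^2}
\]
is equivalent uniformly on $K^{e=0}_l$ to $\|(v, w)\|^2_{H^1\oplus L^2}$ provided that $(v, w)$ satisfies the constraint
\[
\Big\langle v, \frac{d}{d\Lambda_A}\psi_{S,0}\Big\rangle_{L^2} - \Big\langle w, \frac{d}{d\Lambda_A}\phi_{S,0}\Big\rangle_{L^2} = 0 \tag{4.1}
\]
for each $A \in \{-1, 0, 1, \ldots, 6\}$.

We now relate this theorem to the Hessian of the functional $J_t$ evaluated at $\Phi_{S,e}(\lambda)$ twice in the direction of $\tilde\Psi$, i.e. $J_t''(\Phi_{S,e}(\lambda))[\tilde\Psi]^2$. We have
\begin{align*}
J_t''(\Phi_{S,e}(\lambda))[\tilde\Psi]^2 ={}& \|s\|^2_{L^2} + \|\nabla\times q\|^2_{L^2} + 2\int u(0)\cdot s\times\nabla\times q + \|w - i\gamma\omega v + u(0)\cdot\nabla v\|^2_{L^2} \\
&+ \langle v, (-\Delta + m^2 - \omega^2 - \beta(f_{\omega,e}))v\rangle_{L^2} + \langle v, -\beta'(f_{\omega,e})f_{\omega,e}\,\mathrm{Re}[v]\rangle_{L^2} \\
&+ \big\langle v, -2ie(\gamma\alpha_{\omega,e}u + \nabla\chi + q)\cdot\nabla v + e^2|\gamma\alpha_{\omega,e}u + \nabla\chi|^2 v\big\rangle_{L^2} \\
&+ \Big\langle w, e\alpha_{\omega,e}\Big(\gamma[u]\,u\cdot u(0) + \frac{1}{\gamma[u(0)]}\Big)v\Big\rangle_{L^2} - 2\big\langle (i\gamma(\omega - e\alpha_{\omega,e}) - u\cdot\nabla)f_{\omega,e},\; ie\,u(0)\cdot q\,v\big\rangle_{L^2},
\end{align*}
and thus,
\[
J_t''(\Phi_{S,e}(\lambda))[\tilde\Psi]^2 = E_e(v, w, q, s) + O(e)O(\Delta). \tag{4.2}
\]

4.1.1. Taylor expansion of $J_t$: Proof of Theorem 3.7

We need the following few lemmas.

Lemma 4.2. Suppose that the constraints (3.14) are satisfied. Then,
\begin{align*}
\Pi'(\Phi_{S,e}(\lambda))[\tilde\Psi] &= O(e)O(\Delta^{1/2}), \tag{4.3} \\
Q'(\Phi_{S,e}(\lambda))[\tilde\Psi] &= O(e^2)O(\Delta^{1/2}), \tag{4.4} \\
\int\alpha_{\omega,e}\,\rho'(\Phi_{S,e}(\lambda))[\tilde\Psi] &= O(e^2)O(\Delta^{1/2}), \tag{4.5}
\end{align*}
where $f(x, y) = O(x)O(y)$ means that $f = gh$ with $\lim_{x\to0} g = 0$ and $\lim_{y\to0} h = 0$.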
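Proof. We have
\begin{align*}
\Pi'(\Phi_{S,e}(\lambda))[\tilde\Psi] ={}& \langle w, (-i\gamma(\omega - e\alpha_{\omega,e})u + \nabla)f_{\omega,e}\rangle_{L^2} \\
&+ \langle i\gamma(\omega - e\alpha_{\omega,e})f_{\omega,e} - u\cdot\nabla f_{\omega,e},\; (-i\gamma(\omega - e\alpha_{\omega,e})u + \nabla)v\rangle_{L^2} \\
&+ \int s\times\nabla\times(-\gamma u\alpha_{\omega,e}) - \int\Big(\frac{1}{\gamma}P_u + \gamma Q_u\Big)\nabla\alpha_{\omega,e}\times\nabla\times q.
\end{align*}
However, by Lemma A.14 in the appendix, $\|f_{\omega,e} - f_{\omega,0}\|_{H^2} = O(e^2)$. Thus,
\[
\Pi'(\Phi_{S,e}(\lambda))[\tilde\Psi] = \langle w, (-i\gamma\omega u + \nabla)f_{\omega,0}\rangle_{L^2} + \langle i\gamma\omega f_{\omega,e} - u\cdot\nabla f_{\omega,0},\; (-i\gamma\omega u + \nabla)v\rangle_{L^2} + O(e)O(\Delta^{1/2}),
\]
and the result now follows from the constraints. A similar proof works for (4.4) and (4.5).

Lemma 4.3. $H(\Phi_{S,e}(\lambda)) = \gamma[u]h_\omega$ and $\Pi(\Phi_{S,e}(\lambda)) = -\gamma[u]\,u\,h_\omega$, where
\[
h_\omega = \int \tfrac23|\nabla\alpha_{\omega,e}|^2 + (\omega - e\alpha_{\omega,e})^2f_{\omega,e}^2 + \tfrac13|\nabla f_{\omega,e}|^2.
\]

Proof. Simple calculation gives
\[
H(\Phi_{S,e}(\lambda)) = \gamma[u]|u|^2e_0 + \frac{1}{\gamma[u]}\tilde e_0, \qquad \Pi(\Phi_{S,e}(\lambda)) = -\gamma[u]\,u\,e_0,
\]
where
\[
e_0 = \int |\nabla\alpha_{\omega,e}|^2 + (\omega - e\alpha_{\omega,e})^2f_{\omega,e}^2 - \frac{(u\cdot\nabla\alpha_{\omega,e})^2}{|u|^2} + \frac{(u\cdot\nabla f_{\omega,e})^2}{|u|^2}, \tag{4.6}
\]
which, since $\alpha_{\omega,e}$ and $f$ are radial, simplifies to
\[
e_0 = \int \tfrac23|\nabla\alpha_{\omega,e}|^2 + (\omega - e\alpha_{\omega,e})^2f_{\omega,e}^2 + \tfrac13|\nabla f_{\omega,e}|^2. \tag{4.7}
\]
Meanwhile,
\[
\tilde e_0 = \frac12\int |\nabla f_{\omega,e}|^2 + |\nabla\alpha_{\omega,e}|^2 + (\omega - e\alpha_{\omega,e})^2f_{\omega,e}^2 - 2G(f_{\omega,e}). \tag{4.8}
\]
The lemma is thus equivalent to the Pohozaev type identity $e_0 = \tilde e_0$, i.e.
\[
\int|\nabla f_{\omega,e}|^2 = \int |\nabla\alpha_{\omega,e}|^2 + 3(\omega - e\alpha_{\omega,e})^2f_{\omega,e}^2 + 6G(f_{\omega,e}). \tag{4.9}
\]
The proof of the Pohozaev type identity follows in the standard fashion. We note that $f_{\omega,e}$ and $(\omega - e\alpha_{\omega,e})f_{\omega,e}$ are exponentially decaying (see Lemma A.17) in $H^1_r$ so that, from multiplying Eq. (1.20) for $f_{\omega,e}$ across by $x\cdot\nabla f_{\omega,e}$ and integrating by parts, we have
\[
\int|\nabla f_{\omega,e}|^2 = \int -(x\cdot\nabla f_{\omega,e}^2)(\omega - e\alpha_{\omega,e})^2 + 6G(f_{\omega,e}), \tag{4.10}
\]
so that we are done if we can show
\[
\int f_{\omega,e}^2\,x\cdot\nabla(\omega - e\alpha_{\omega,e})^2 = \int|\nabla\alpha_{\omega,e}|^2. \tag{4.11}
\]
This last follows from multiplying Eq. (1.21) for $\alpha_{\omega,e}$ across by $x\cdot\nabla\alpha_{\omega,e}$ and integrating by parts.

Lemma 4.4.
\begin{align*}
J_t(\Phi_{S,e}(\lambda)) ={}& J_{t=0}(\Phi_{S,e}(\lambda(0))) + \frac12h_\omega\gamma[u(0)]\big((\gamma[u(0)])^2P_{u(0)} + Q_{u(0)}\big)_{ij}(\delta u)_i(\delta u)_j \\
&- \frac{1}{\gamma[u(0)]}(\omega - \omega(0))\,q_\omega - \frac12\frac{1}{\gamma[u(0)]}\frac{\partial q_\omega}{\partial\omega}(\omega - \omega(0))^2 + o(\Delta),
\end{align*}
where $q_\omega = \int(\omega - e\alpha_{\omega,e})f_{\omega,e}^2$ and $\delta u = u - u(0)$.

Proof. From the previous lemma, it follows that
\[
J_t(\Phi_{S,e}(\lambda)) = h_\omega\gamma[u](1 - u(0)\cdot u) - \frac{\omega}{\gamma[u(0)]}q_\omega. \tag{4.12}
\]
Now,
\[
\gamma[u](1 - u(0)\cdot u) = \frac{1}{\gamma[u(0)]} + \frac12\gamma[u(0)]\big((\gamma[u(0)])^2P_{u(0)} + Q_{u(0)}\big)_{ij}(\delta u)_i(\delta u)_j + o(|u - u(0)|^2).
\]
Define $\tilde h_\omega = h_\omega - \omega q_\omega$. From Lemma A.15,
\[
\frac{\partial}{\partial\omega}\tilde h_\omega = -q_\omega \tag{4.13}
\]
and the result follows from simple algebra.

The proof of Theorem 3.7 follows from the previous three lemmas once we note that, from conservation of $H$, $\Pi$, and $Q$,
\[
J_t(\Psi(t)) = J_{t=0}(\Psi(0)) - \frac{\omega - \omega(0)}{\gamma[u(0)]}Q(\Psi(0)), \tag{4.14}
\]
and that
\[
H'(\Phi_{S,e}(\lambda)) + u\cdot\Pi'(\Phi_{S,e}(\lambda)) - \frac{\omega}{\gamma}Q'(\Phi_{S,e}(\lambda)) = \frac{1}{\gamma}\int\alpha_{\omega,e}\,\rho'(\Phi_{S,e}(\lambda)). \tag{4.15}
\]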
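Two computations used in Sec. 4.1.1 can be written out explicitly. The sketch below assumes $\gamma[u] = (1 - |u|^2)^{-1/2}$, with $P_u$ the orthogonal projection onto the direction of $u$ and $Q_u = 1 - P_u$; it also assumes that the profile equation can be written as $-\Delta f = (\omega - e\alpha)^2 f + G'(f)$ with $G'(f) = \beta(f)f - m^2 f$, which is how (A.1) and $U'(f) = \beta(f)f$ are read here. Subscripts $\omega, e$ are dropped for brevity.

```latex
% (a) Second-order expansion of g(u) = \gamma[u](1 - u(0)\cdot u) at u = u(0).
%     Uses \partial_{u_i}\gamma = \gamma^3 u_i and \gamma^2 = 1 + \gamma^2|u|^2.
\begin{align*}
g(u(0)) &= \gamma[u(0)]\,(1 - |u(0)|^2) = \frac{1}{\gamma[u(0)]}, \\
\partial_{u_i} g &= \gamma^3 u_i\,(1 - u(0)\cdot u) - \gamma\,u(0)_i
   \;=\; 0 \quad\text{at } u = u(0), \\
\partial_{u_i}\partial_{u_j} g\,\big|_{u = u(0)}
  &= (3\gamma^5 u_i u_j + \gamma^3\delta_{ij})\,\gamma^{-2} - 2\gamma^3 u_i u_j
   = \gamma\,\delta_{ij} + \gamma^3 u_i u_j
   = \gamma\,(\gamma^2 P_u + Q_u)_{ij} ,
\end{align*}
% so that g(u) = 1/\gamma[u(0)]
%   + (1/2)\gamma[u(0)](\gamma[u(0)]^2 P_{u(0)} + Q_{u(0)})_{ij}\,\delta u_i\,\delta u_j
%   + o(|\delta u|^2).

% (b) Pohozaev-type computation behind (4.10): multiply the profile equation
%     by x\cdot\nabla f and integrate over R^3 (boundary terms vanish by decay).
\begin{align*}
-\int \Delta f\,(x\cdot\nabla f) &= -\tfrac12 \int |\nabla f|^2, \qquad
\int G'(f)\,(x\cdot\nabla f) = -3\int G(f), \\
\int (\omega - e\alpha)^2 f\,(x\cdot\nabla f)
  &= \tfrac12 \int (\omega - e\alpha)^2\, x\cdot\nabla f^2 ,
\end{align*}
% and combining the three terms gives
%   \int |\nabla f|^2 = \int -(x\cdot\nabla f^2)(\omega - e\alpha)^2 + 6 G(f).
```

Part (a) also shows why $H = \gamma[u]|u|^2 e_0 + \tilde e_0/\gamma[u]$ collapses to $\gamma[u]e_0$ once $e_0 = \tilde e_0$: the identity $\gamma|u|^2 + 1/\gamma = \gamma$ is the same relation $\gamma^2 = 1 + \gamma^2|u|^2$.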
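4.2. Modulational equations and the solubility of the constraints

4.2.1. Initial data preparation; Proof of Lemma 3.4

Proof. Define the operator $D = (D_A)$, $D : R \oplus H^1(R^3) \oplus L^2(R^3) \oplus R^8 \to R^8$, by
\[
D_A(e, \phi, \psi, \lambda) = \Omega\left(\begin{pmatrix}\phi(0) - \phi_{S,e}(\lambda) \\ \psi(0) - \psi_{S,e}(\lambda)\end{pmatrix}, \begin{pmatrix}\frac{d}{d\Lambda_A}\phi_{S,0}(\lambda) \\ \frac{d}{d\Lambda_A}\psi_{S,0}(\lambda)\end{pmatrix}\right), \tag{4.16}
\]
where the symplectic form $\Omega : (L^2(R^3) \oplus L^2(R^3))^2 \to R$ is given by
\[
\Omega\left(\begin{pmatrix}a \\ b\end{pmatrix}, \begin{pmatrix}c \\ d\end{pmatrix}\right) = \langle a, d\rangle_{L^2} - \langle b, c\rangle_{L^2}. \tag{4.17}
\]
By [5, Lemma 2.3], the matrix $\frac{\partial}{\partial\lambda_B}D_A$ evaluated at $(0, \phi_{S,0}(\tilde\lambda), \psi_{S,0}(\tilde\lambda), \lambda = \tilde\lambda)$ is invertible. Hence, by Lemma A.14, there exists $e(\tilde\lambda) > 0$ such that, if $|e| < e(\tilde\lambda)$, the matrix $\frac{\partial}{\partial\lambda_B}D_A$ evaluated at $(e, \phi_{S,e}(\tilde\lambda), \psi_{S,e}(\tilde\lambda), \lambda = \tilde\lambda)$ is invertible. Now, assume that $|e| < e(\tilde\lambda)$ and define $R : H^1(R^3) \oplus L^2(R^3) \oplus U \to R^8$, where $U$ is a neighborhood of $O_{\mathrm{Stability},e}$, by
\[
R_A(\phi, \psi, \lambda) = \Omega\left(\begin{pmatrix}\phi(0) - \phi_{S,e}(\lambda) \\ \psi(0) - \psi_{S,e}(\lambda)\end{pmatrix}, \begin{pmatrix}\frac{d}{d\Lambda_A}\phi_{S,0}(\lambda) \\ \frac{d}{d\Lambda_A}\psi_{S,0}(\lambda)\end{pmatrix}\right). \tag{4.18}
\]
We have it that the matrix $\frac{\partial}{\partial\lambda_B}R_A$ evaluated at $(\phi_{S,e}(\tilde\lambda), \psi_{S,e}(\tilde\lambda), \lambda = \tilde\lambda)$ is invertible. The result now follows from the implicit function theorem.

4.2.2. Modulational equations

We now demonstrate that it is possible to impose the constraints (3.14) for some non-empty time interval. We work out the time evolution equations for $v$ and $w$:
\[
\frac{d}{dt}v + i(\mu_0 + \gamma\omega - e(\gamma\alpha_{\omega,e} + r))v = w + j_1(\lambda, \dot\lambda, e), \tag{4.19}
\]
\[
\frac{d}{dt}w + i(\mu_0 + \gamma\omega - e(\gamma\alpha_{\omega,e} + r))w = -M_\lambda v + j_2(\lambda, \dot\lambda, e) + N(f_{\omega,0}, f_{\omega,e}, v) + L(\alpha_{\omega,e}, q, \lambda, \dot\lambda, Z, v),
\]
where $\mu_0 = \dot\Theta - \gamma\omega$, $g_{\omega,e} = \frac{d}{d\omega}f_{\omega,e}$, $\mu = \dot Z - \gamma u$,
\begin{align*}
j_1(\lambda, \dot\lambda, e) ={}& -\dot\omega g_{\omega,e} + erf_{\omega,e} - \mu_0f_{\omega,e} - \mu\cdot\nabla_Zf_{\omega,e}, \\
j_2(\lambda, \dot\lambda, e) ={}& (\gamma u)_t\cdot\nabla_Zf_{\omega,e} - i(\gamma\omega)_tf_{\omega,e} - \dot\omega(i\gamma\omega g_{\omega,e} - \gamma u\cdot\nabla_Zg_{\omega,e}) \\
&- (i\gamma\omega - \gamma u\cdot\nabla_Z)\mu\cdot\nabla_Zf_{\omega,e} - i\mu_0(i\gamma\omega - \gamma u\cdot\nabla_Z)f_{\omega,e}, \tag{4.20}
\end{align*}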
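and
\[
-M_\lambda v = (\Delta_x - (m^2 + \gamma^2\omega^2u^2))v + 2i\gamma u\cdot\nabla_xv + \beta(f_{\omega,0})v + f_{\omega,0}\beta'(f_{\omega,0})v, \tag{4.21}
\]
while
\[
N(f_{\omega,0}, f_{\omega,e}, v) = \beta(|f_{\omega,e} + v|)(f_{\omega,e} + v) - \beta(f_{\omega,e})f_{\omega,e} - \beta(f_{\omega,0})v - f_{\omega,0}\beta'(f_{\omega,0})\,\mathrm{Re}(v), \tag{4.22}
\]
with
\begin{align*}
L(\alpha_{\omega,e}, q, \lambda, \dot\lambda, Z, v) ={}& e\gamma\mu_0\alpha_{\omega,e}f_{\omega,e} - i\gamma e\dot\alpha_{\omega,e}f_{\omega,e} - i\gamma\dot\omega\frac{d}{d\omega}(e\alpha_{\omega,e}f) \\
&+ i\gamma^2e(u\cdot\nabla_Z\alpha_{\omega,e})f_{\omega,e} - i\gamma e(\mu\cdot\nabla_Z\alpha_{\omega,e})f_{\omega,e} + i\gamma^2e\alpha_{\omega,e}u\cdot\nabla_Zf - i\gamma e\alpha_{\omega,e}\mu\cdot\nabla_Zf_{\omega,e} \\
&+ 2\omega e\gamma^2u^2\alpha_{\omega,e}v + 2ie\alpha_{\omega,e}\gamma u\cdot\nabla_xv + ie\gamma^2(u\cdot\nabla_Z\alpha_{\omega,e})v + (\gamma e\alpha_{\omega,e}u)^2v \\
&- 2\gamma\omega e\,u\cdot q\,f_{\omega,e} - 2ie\,q\cdot\nabla_xf_{\omega,e} + ier\big(ie\gamma(\omega - e\alpha_{\omega,e})f_{\omega,e} - \gamma u\cdot\nabla_Zf\big) \\
&- e^2|q|^2f_{\omega,e} + 2e\gamma\,u\cdot q\,\alpha_{\omega,e}f_{\omega,e}.
\end{align*}
As in [5], we remark that the constraints (3.14) are satisfied if at each time the following holds:
\[
\int_{R^3}\langle a_A, j_1\rangle + \langle b_A, j_2 + N + L\rangle + \langle I^1_A + i\mu_0a_A, v\rangle + \langle I^2_A + i\mu_0b_A, w\rangle = 0, \tag{4.23}
\]
where $I^\beta_A = \frac{d}{dt}(a_A, b_A)$. In fact, using exactly the same procedure as in [5], we have the modulational equations:
\[
\frac{d}{d\omega}\big(\omega\|f_{\omega,0}\|_2^2\big)\,\dot\omega = F_0(v, w, \lambda, \dot\lambda), \tag{4.24}
\]
\[
-\frac{d}{d\omega}\big(\omega\|f_{\omega,0}\|_2^2\big)\,(\dot\Theta - \omega) = F_{-1}(v, w, \lambda, \dot\lambda), \tag{4.25}
\]
\[
\frac{\|\nabla f_{\omega,0}\|_2^2}{3}\frac{d}{dt}(\gamma u^i) + \omega\|f_{\omega,0}\|_2^2\frac{d}{dt}(\omega\gamma u_i) = F_i(v, w, \lambda, \dot\lambda), \tag{4.26}
\]
\[
\Big(\frac{\|\nabla f_{\omega,0}\|_2^2}{3} + \omega^2\|f_{\omega,0}\|_2^2\Big)\frac{d}{dt}\xi_{0i} = F_{n+i}(v, w, \lambda, \dot\lambda), \tag{4.27}
\]
where for $A = 0, -1$,
\[
F_A(v, w, \lambda, \dot\lambda) = \int_{R^3}\langle b_A, N + L\rangle + \langle I^1_A + i\mu_0a_A, v\rangle + \langle I^2_A + i\mu_0b_A, w\rangle + E_A(e, \lambda, \dot\lambda), \tag{4.28}
\]
while for $i = 1, \ldots, 3$,
\[
F_i(v, w, \lambda, \dot\lambda) = -(P_u + \gamma^{-1}Q_u)_{ij}\Big[\int_{R^3}\big(\langle b_j, N + L\rangle + \langle I^1_j + i\mu_0a_j, v\rangle + \langle I^2_j + i\mu_0b_j, w\rangle\big)\,dZ + E_j(e, \lambda, \dot\lambda)\Big], \tag{4.29}
\]
and
\[
F_{3+i}(v, w, \lambda, \dot\lambda) = -\gamma^{-2}(\gamma^{-2}P_u + Q_u)_{ij}\Big[\int_{R^3}\langle b_{3+j}, N + L\rangle + \langle I^1_{3+j} + i\mu_0a_{3+j}, v\rangle + \langle I^2_{3+j} + i\mu_0b_{3+j}, w\rangle + E_{3+j}(e, \lambda, \dot\lambda)\Big], \tag{4.30}
\]
where we define
\[
E_A(e, \lambda, \dot\lambda) = \int_{R^3}\langle a_A, j_1(e, \lambda, \dot\lambda) - j_1(e = 0, \lambda, \dot\lambda)\rangle + \int_{R^3}\langle b_A, j_2(e, \lambda, \dot\lambda) - j_2(e = 0, \lambda, \dot\lambda)\rangle. \tag{4.31}
\]
Again, as per [5], we can solve for $\dot\lambda$ provided $e$, $v$, and $w$ are small compared to $\frac{d}{d\omega}\big(\omega\|f_{\omega,e}\|_2^2\big)$. The proof of Theorem 3.6 is completed by the proof of Lemma 3.5.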
4.2.3. Proof of Lemma 3.5

Proof. Arguing as in the proof of [5, Theorem 2.6], we note that for $T \le T^\#$ it follows from Theorem 3.1 above that $\frac{d}{dt}(\phi, \psi)$ is bounded in $L^2 \oplus H^{-1}$ in terms of $N_0$. Thus, if $\lambda \in K^e_{2l}(\tilde\lambda)$, and $\delta$, $l$ are small, it follows from an elementary application of the triangle inequality that on some time interval $0 \le t < T$ (where $T(N_0) \le T^\#$) the pair $(v, w)$ is small in $L^2 \oplus H^{-1}$. It is also to be observed that $\|\nabla q\|_{L^2}$ is likewise small. Keep in mind that, if it were not for the terms involving $L$ and $E_A$, the system of evolution equations could be manipulated, as in the proof of [5, Theorem 2.6], to form a system of ordinary differential equations
\[
\frac{d\lambda}{dt} = V(\lambda, e, \phi(t), \psi(t)),
\]
where $V$ is a bounded continuous function of $\lambda(t) \in K^e_{2l}(\tilde\lambda)$ and $0 \le t < T$. Since, by Hölder's and Sobolev's inequalities, $\|L\|_{L^2}$ and $|E_A|$ are small with small $|e|$, $\|v\|_{H^1}$ and $\|w\|_{L^2}$, we can still manipulate the equations to form such a system of ordinary differential equations. The result now follows from the standard local existence theory for ordinary differential equations.

Acknowledgments

I wish to thank my doctoral supervisor, Dr. D.M.A. Stuart, for the suggestion of this problem and for many helpful conversations. This work was in part supported by the Engineering and Physical Sciences Research Council of the United Kingdom. I am grateful for financial assistance from the Robert Gardiner Memorial fund and the Isaac Newton Trust.

Appendix A.

A.1. Conditions on the non-linearity G

Let
\[
U(f) = G(f) + \frac{m^2}{2}f^2 \tag{A.1}
\]
and define $\beta : R \to R$ by $U(f) = \int_0^f t\beta(t)\,dt$. To ensure existence and regularity [5] of non-topological soliton solutions in the $e = 0$ case, the following conditions are imposed on $U(f)$:
\[
U'(f) = -U'(-f) \text{ and } U' \in C^1(R) \cap C^2((0, \infty)), \tag{A.2}
\]
\[
U'(0) = U''(0) = 0 \text{ and } \exists\, s \in (0, 1) : \lim_{f\to0}f^sU'''(f) = 0, \tag{A.3}
\]
\[
\exists\, \zeta > 0 : U(\zeta) > \frac{m^2 - \omega^2}{2}\zeta^2, \tag{A.4}
\]
\[
\lim_{f\to\infty}\frac{U'(f)}{f^5} = 0. \tag{A.5}
\]
To ensure uniqueness of the $e = 0$ solution, following [5, 8], we impose

U(1) $\exists\, \alpha > 0 : 0 < f < \alpha \Rightarrow U'(f) < (m^2 - \omega^2)f$ and $\alpha < f < \infty \Rightarrow U'(f) > (m^2 - \omega^2)f$ and $U''(\alpha) - (m^2 - \omega^2) > 0$,

and that

U(2) For $\beta > \alpha$, $\exists\, \lambda = \lambda(\beta) \in C[(\alpha, \infty), R^+]$ such that $2(m^2 - \omega^2)f + \lambda fU''(f) - (\lambda + 2)U'(f)$ is non-negative on $(0, \beta)$ and non-positive on $(\beta, \infty)$.

For proving stability, we rely on [10, Lemma E.1] and the following spectral assumption:

S(1) The subspace in which $L_+$ is strictly negative is one-dimensional,

where $L_+ = -\Delta + (m^2 - \omega^2) - \beta(f_{\omega,0}) - f_{\omega,0}\beta'(f_{\omega,0})$ and $U'(f_{\omega,0}) = \beta(f_{\omega,0})f_{\omega,0}$. The spectral assumption is valid [5] when $f_{\omega,0}$ is obtained by the constrained minimization technique of [2]. The following assumption is necessary [5] for the purposes of making a Taylor expansion of the functional $J_t$:

(N) The second derivative of $U$, given by $U''$, has the property that the map $\phi \mapsto U''(\phi)$ is continuous as a map $H^1(R^3) \to L^p(R^3)$ for some $p \ge \frac32$.

For example, $U$ has the property (N) if $|U''(\phi) - U''(\varphi)| \le C|\phi - \varphi|(1 + |\phi|^{3-\delta} + |\varphi|^{3-\delta})$. In order to have local well-posedness in the sense of Sec. 1.4.1, the following must hold: For all $\phi$, $\varphi$, if
\[
\|\phi(0)\|_{H^1} + \|\dot\phi(0)\|_{L^2} + \|\varphi(0)\|_{H^1} + \|\dot\varphi(0)\|_{L^2} < k_0, \tag{A.6}
\]
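then

(WP1)
\begin{align*}
\int_0^T\|U'(\phi) - U'(\varphi)\|^r_{L^2} \le{}& c\Big(k_0, \int_0^T\|\Box\varphi\|_{L^2}, \int_0^T\|\Box\phi\|_{L^2}\Big)\times\big(\|\phi(0) - \varphi(0)\|_{H^1} + \|\dot\phi(0) - \dot\varphi(0)\|_{L^2}\big)^s \\
&+ c\Big(k_0, \int_0^T\|\Box\varphi\|_{L^2}, \int_0^T\|\Box\phi\|_{L^2}\Big)\times T^r\Big(\int_0^T\|\Box(\phi - \varphi)\|_{L^2}\Big)^s
\end{align*}
and

(WP2)
\[
\int_0^T\|U'(\phi)\|^r_{L^2} < c\,T^r\Big(1 + k_0 + \int_0^T\|\Box\phi\|_{L^2}\Big)^s,
\]
where $s > 0$ and $r \ge 1$. In (WP1), $c > 0$ may depend on $k_0$, $\int_0^T\|\Box\varphi\|_{L^2}$, and $\int_0^T\|\Box\phi\|_{L^2}$, whereas, in (WP2), the universal constant has no such dependence. The following proposition gives sufficient criteria on $U$ for (WP1) and (WP2) to hold.

Proposition A.1. Suppose that, for all $\phi$, $\varphi$,
\[
|U'(\phi) - U'(\varphi)| \le C|\phi - \varphi|(1 + |\phi|^{4-\delta} + |\varphi|^{4-\delta}) \tag{A.7}
\]
for some $0 < \delta \le 4$. Suppose also that $U'(0) = 0$ and that (A.6) holds. Then, it follows that (WP1) and (WP2) hold for $U$.

Proof. (WP2) follows if we show that
\[
\int_0^T\|U'(\phi^n)\|_{L^2} \le CT^{\frac{\delta}{2}}\Big(1 + k_0 + \int_0^T\|\Box\phi\|_{L^2}\Big)^{5-\delta}. \tag{A.8}
\]
The condition (A.7) implies that $|U'(\phi^n)| \le c|\phi^n| + d|\phi^n|^{5-\delta}$ since $U'(0) = 0$. Thus, it suffices to show that
\[
\int_0^T\Big(\int_{R^3}|\phi|^{10-2\delta}\Big)^{\frac12}dt \le CT^{\frac{\delta}{2}}\Big(1 + k_0 + \int_0^T\|\Box\phi\|_{L^2}\Big)^{5-\delta}. \tag{A.9}
\]
To prove this last inequality, we use the following Strichartz type estimate of Grillakis [12]:
\[
\Big(\int_0^T\Big(\int_{R^3}|\phi|^q\Big)^{\frac{r}{q}}dt\Big)^{\frac1r} \le C(\varepsilon)\Big(\int_0^T\|\Box\phi\|_{L^2}\,dt + k_0\Big), \tag{A.10}
\]
where $r = \frac{2}{1-\varepsilon}$ and $q = \frac{6}{\varepsilon}$. But, by Hölder's inequality,
\[
\int_0^T\Big(\int_{R^3}|\phi|^{10-2\delta}\Big)^{\frac12}dt \le T^{\frac{\delta}{2}}\Big(\int_0^T\Big(\int_{R^3}|\phi|^{10-2\delta}\Big)^{\frac{1}{2-\delta}}dt\Big)^{1-\frac{\delta}{2}}. \tag{A.11}
\]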
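Hence, applying Grillakis' Strichartz estimate, [12, Theorem 1.4], gives us
\[
\Big(\int_0^T\Big(\int_{R^3}|\phi|^{10-2\delta}\Big)^{\frac{1}{2-\delta}}dt\Big)^{1-\frac{\delta}{2}} \le \Big(k_0 + \int_0^T\|\Box\phi\|_{L^2}\Big)^{5-\delta}. \tag{A.12}
\]
In order to show that (WP1) holds, we note that, if $\delta < 2$,
\[
\int_0^t\big\||\phi|^{4-\delta}(|\phi - \varphi|)\big\|_{L^2} \le \int_0^t\Big(\int|\phi|^{12-3\delta}\Big)^{\frac13}\|\phi - \varphi\|_{L^6}\,ds. \tag{A.13}
\]
Therefore, using Sobolev's inequality and Hölder's inequality, we have
\[
\int_0^t\big\||\phi|^{4-\delta}(|\phi - \varphi|)\big\|_{L^2} < c\Big(\int_0^t\|\nabla(\varphi - \phi)\|^2_{L^2}\Big)^{\frac12}\Big(\int_0^t\Big(\int|\phi|^{12-3\delta}\Big)^{\frac23}\Big)^{\frac12},
\]
where $c > 0$. Now,
\[
\int_0^t\Big(\int|\phi|^{12-3\delta}\Big)^{\frac13}ds < t^{\frac{\delta}{2}}\Big(\int_0^t\Big(\int|\phi|^{12-3\delta}\Big)^{\frac{1}{3-\delta}}ds\Big)^{1-\frac{\delta}{2}}, \tag{A.14}
\]
whence, by Grillakis' Strichartz-type inequality, [12, Theorem 1.4],
\[
\int_0^t\Big(\int|\phi|^{12-3\delta}\Big)^{\frac13}ds < Ct^{\frac{\delta}{2}}\Big(k_0 + \int_0^t\|\Box\phi\|_{L^2}\Big)^{4-\delta}, \tag{A.15}
\]
where $C > 0$. For the case of $\delta < 2$, in order to show (WP1), it remains to observe that
\[
\|\nabla\phi(t)\|_{L^2} \le c\Big(\|\nabla\phi(0)\|_{L^2} + \|\dot\phi(0)\|_{L^2} + \int_0^t\|\Box\phi\|_{L^2}\Big).
\]
On the other hand, if $\delta \ge 2$, then let $\varepsilon = \delta - 2$, so that $2 > \varepsilon \ge 0$, and
\[
\int_0^t\big\||\phi|^{4-\delta}(|\phi - \varphi|)\big\|_{L^2} = \int_0^t\big\||\phi|^{2-\varepsilon}(|\phi - \varphi|)\big\|_{L^2}. \tag{A.16}
\]
But, by Hölder's inequality,
\[
\int_0^t\Big(\int_{R^3}|\phi|^{4-2\varepsilon}|\phi - \varphi|^2\Big)^{\frac12}ds \le \int_0^t\|\phi - \varphi\|_{L^6}^{\frac{6-3\varepsilon}{8}}\,\|\phi - \varphi\|_{L^2}^{\frac{2+3\varepsilon}{8}}\,\|\phi\|_{L^8}^{2-\varepsilon}\,ds.
\]
Observe that
\[
\|\phi(t)\|_{L^2} \le c\Big(\|\phi(0)\|_{L^2} + t\Big(\|\dot\phi(0)\|_{L^2} + \|\nabla\phi(0)\|_{L^2} + \int_0^t\|\Box\phi\|_{L^2}\Big)\Big). \tag{A.17}
\]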
(WP1) follows from using Sobolev's inequality and Strichartz's inequality [13]:
\[
\Big(\int_0^t\int_{R^3}|\phi|^8\,ds\Big)^{\frac18} < c\Big(\|\nabla\phi(0)\|_{L^2} + \|\dot\phi(0)\|_{L^2} + \int_0^t\|\Box\phi\|_{L^2}\Big).
\]
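The exponents appearing in the Hölder step (A.11) and the Strichartz step (A.12) can be checked directly against the admissibility relation implied by (A.10), namely $\frac1r + \frac3q = \frac12$ for $r = \frac{2}{1-\varepsilon}$, $q = \frac{6}{\varepsilon}$. The following is a bookkeeping sketch only; it assumes $0 < \delta < 2$ so that the corresponding $\varepsilon$ lies in $(0,1)$, a restriction introduced here for the check rather than taken from the text.

```latex
% Hoelder in time with conjugate exponents 2/\delta and 2/(2-\delta):
\[
\int_0^T g\,dt \;\le\; T^{\delta/2}\Big(\int_0^T g^{\frac{2}{2-\delta}}\,dt\Big)^{1-\frac{\delta}{2}},
\qquad
g(t) = \Big(\int_{R^3}|\phi|^{10-2\delta}\Big)^{\frac12}
\;\Longrightarrow\;
g^{\frac{2}{2-\delta}} = \Big(\int_{R^3}|\phi|^{10-2\delta}\Big)^{\frac{1}{2-\delta}} .
\]
% Matching with (A.10): q = 10 - 2\delta and r/q = 1/(2-\delta), i.e.
\[
r = \frac{10-2\delta}{2-\delta},
\qquad
\frac1r + \frac3q = \frac{(2-\delta) + 3}{10-2\delta} = \frac12,
\qquad
\varepsilon = \frac{6}{q} = \frac{3}{5-\delta} \in (0,1) .
\]
% The outer power in (A.12) is then r(1 - \delta/2) = 5 - \delta,
% which is the exponent on the right-hand side of (A.8) and (A.9).
```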
A.2. The local existence theorem; Theorem 3.1

This theorem will be proved by use of an iteration scheme and a device of Klainerman and Machedon appearing in [14]. The structure of this section will be firstly to define the iteration scheme, next to show that all the iterates exist and are uniformly bounded on some fixed non-empty time interval, then to prove that the iterates satisfy the contraction mapping property in an appropriate norm, and finally to show uniqueness and local well-posedness of the solutions. In what follows, we assume the hypotheses of Theorem 3.1 hold.

A.2.1. The iteration scheme

Initial Conditions. For $n \ge 1$, we define $\phi^n(t = 0) = \phi_0$, $\dot\phi^n(t = 0) = \phi_1$; $\mathbf{A}^n(t = 0) = \mathbf{a}_0$, $\dot{\mathbf{A}}^n(t = 0) = \mathbf{a}_1$; $A^n_0(t = 0) = a_0$, where
\[
-\Delta a_0 = \langle ie\phi_0, \phi_1 - ie\,a_0\phi_0\rangle. \tag{A.18}
\]
Recall that $\|\nabla\mathbf{a}_0\|_{L^2} + \|\mathbf{a}_1\|_{L^2} + \|\phi_0\|_{H^1} + \|\phi_1\|_{L^2} < k_0$.

The nth iterate. Now, on the time interval $[0, 1]$, define
\[
\Box\phi^1 = 0, \tag{A.19}
\]
\[
\Box\mathbf{A}^1 = 0, \tag{A.20}
\]
\[
\Delta\dot A^1_0 = -\nabla\cdot\langle ie\phi^1, (\nabla - ie\mathbf{A}^1)\phi^1\rangle, \tag{A.21}
\]
where $\dot A^1_0$ vanishes at infinity. It is well known that these equations are soluble with the desired regularity $(\phi^1, \dot\phi^1) \in C([0, 1); H^1 \oplus L^2) \cap C^1([0, 1); L^2 \oplus H^{-1})$ while $\nabla\mathbf{A}^1, \dot{\mathbf{A}}^1 \in C([0, 1); L^2)$. Next, for $n \ge 1$, define on the time interval $[0, 1]$
\[
\Box\phi^{n+1} = 2ieA^n_0\dot\phi^n + ie\dot A^n_0\phi^n + e^2|A^n_0|^2\phi^n - 2ie\mathbf{A}^n\cdot\nabla\phi^n - e^2|\mathbf{A}^n|^2\phi^n + G'(|\phi^n|), \tag{A.22}
\]
\[
\Box\mathbf{A}^{n+1} = P\langle ie\phi^n, (\nabla - ie\mathbf{A}^n)\phi^n\rangle, \tag{A.23}
\]
\[
-\Delta\dot A^{n+1}_0 = \nabla\cdot\langle ie\phi^{n+1}, (\nabla - ie\mathbf{A}^{n+1})\phi^{n+1}\rangle. \tag{A.24}
\]
Here, $P\mathbf{B} = \Delta^{-1}(\nabla\times\nabla\times\mathbf{B})$, i.e. $P$ isolates the divergence-free part of $\mathbf{B}$.

Remark A.2. For a fixed point $(\phi, \mathbf{A})$ of the above iteration scheme, $\nabla\cdot\mathbf{A} = 0$.

In the sequel, we need the following definition:
\[
X^n(T) = \begin{cases} 0 & \text{if } n = 1, \\ \int_0^T\|\Box\phi^n\|_{L^2} + \|\Box\mathbf{A}^n\|_{L^2} & \text{if } n > 1. \end{cases} \tag{A.25}
\]
A.2.2. Existence and boundedness of the iterates

In this section, where not explicitly stated, we shall assume that $t \in [0, 1]$. We need the following lemma:

Lemma A.3. Let the iteration scheme be defined as above. Suppose that the potential $G$ satisfies (A.2)–(A.5), (WP1) and (WP2). Suppose further that the $n$th iterate exists on $[0, 1]$ with $(\phi^n, \dot\phi^n) \in C([0, 1); H^1 \oplus L^2) \cap C^1([0, 1); L^2 \oplus H^{-1})$ while $\nabla\mathbf{A}^n, \dot{\mathbf{A}}^n \in C([0, 1); L^2)$. Then,
\[
\|A^n_0\|_{L^p} \le c(1 + k_0 + X^n(t))^7, \quad\text{for } p \in [1, \tfrac32], \tag{A.26}
\]
and
\[
\|A^n_0\|_{L^\infty} \le c(1 + k_0 + X^{n-1}(t) + X^n(t))^{16+s} + c(1 + k_0 + X^n(t))(1 + \|\phi^n\|_{L^8}), \tag{A.27}
\]
where $s > 0$ is as per (WP2).

Proof. See [17, Chap. 3].

We have the following corollary, also proved in [17, Chap. 3].

Corollary A.4. Let the iteration scheme be defined as above. Suppose that the potential $G$ satisfies (A.2)–(A.5), (WP1) and (WP2). Suppose further that the $n$th iterate exists on $[0, 1]$ with $(\phi^n, \dot\phi^n) \in C([0, 1); H^1 \oplus L^2) \cap C^1([0, 1); L^2 \oplus H^{-1})$ while $\nabla\mathbf{A}^n, \dot{\mathbf{A}}^n \in C([0, 1); L^2)$. Define
\[
\Delta^n(T) = \int_0^T\|\Box(\phi^n - \phi^{n-1})\|_{L^2} + \|\Box(\mathbf{A}^n - \mathbf{A}^{n-1})\|_{L^2}
\]
for $n > 1$. Then, if $X^n(t)$ is bounded on $[0, T]$ uniformly in $n$ and $t$ for some $T < 1$,
\[
\|A^n_0 - A^{n-1}_0\|_{L^\infty} \le c(1 + \|\phi^n\|_{L^8} + \|\phi^{n-1}\|_{L^8})(\Delta^n(T) + \Delta^{n-1}(T)) + c\|\phi^n - \phi^{n-1}\|_{L^q}, \tag{A.28}
\]
for some $q \ge 8$.

Given existence and desired regularity of the $n$th iterate on $[0, 1]$, existence and the desired regularity of the $(n + 1)$th iterate is a consequence of the following theorem.

Theorem A.5. Let the iteration scheme be defined as above. Suppose that the potential $G$ satisfies (A.2)–(A.5), (WP1) and (WP2). Suppose further that the $n$th iterate exists on $[0, 1]$ with $(\phi^n, \dot\phi^n) \in C([0, 1); H^1 \oplus L^2) \cap C^1([0, 1); L^2 \oplus H^{-1})$ while $\nabla\mathbf{A}^n, \dot{\mathbf{A}}^n \in C([0, 1); L^2)$. Then, for any $e_0 > 0$, there exists $c_5 > 0$ dependent only upon $e_0$ such that, if $|e| < e_0$ and $T \in [0, 1]$,
\[
\int_0^T\|\psi^n_1\|_{L^2} + \|\psi^n_2\|_{L^2}\,dt \le c_5T^{r^*}(1 + k_0 + X^n(T))^{s^*}, \tag{A.29}
\]
where
\[
\psi^n_1 = 2ieA^n_0\dot\phi^n + ie\dot A^n_0\phi^n + e^2|A^n_0|^2\phi^n - 2ie\mathbf{A}^n\cdot\nabla\phi^n - e^2|\mathbf{A}^n|^2\phi^n + G'(|\phi^n|), \qquad \psi^n_2 = P\langle ie\phi^n, (\nabla - ie\mathbf{A}^n)\phi^n\rangle, \tag{A.30}
\]
the universal constants satisfy $r^* > 0$, $s^* > 1$, and
\[
X^n(T) = \begin{cases} 0 & \text{if } n = 1, \\ \int_0^T\|\psi^{n-1}_1\|_{L^2} + \|\psi^{n-1}_2\|_{L^2} & \text{if } n > 1. \end{cases} \tag{A.31}
\]

Proof. From [14, Theorem 4.1],
\[
\int_0^T\|\psi^n_1 - 2ieA^n_0\dot\phi^n - G'(\phi^n)\|_{L^2} + \|\psi^n_2\|_{L^2} \le CT^{\frac12}(1 + k_0 + X^n(T))^4, \tag{A.32}
\]
where $C > 0$ depends only upon $e_0$. By Lemma A.3, we are done if we show that
\[
\int_0^T\|G'(\phi^n)\|_{L^2} \le CT^{r^*}(1 + k_0 + X^n(T))^{s^*} \tag{A.33}
\]
for some $r^* > 0$ and $s^* > 1$. This follows from (WP2) and (A.17).

Corollary A.6. There exist $T_0 \in [0, 1]$ and $c_6 > 0$ such that, for all $n$, $X^n(T_0) < c_6$. Furthermore, $c_6$ and $T_0$ depend only upon $e_0$ and $k_0$.

Thus, we have existence and uniform boundedness of the iterates on $[0, T_0]$. Our next task is to demonstrate the contraction mapping property.

A.2.3. The contraction mapping property

We wish to show that there exist $T_1 > 0$ and $0 < s < 1$, dependent only upon $e_0$ and $k_0$, such that $\Delta^n(T_1) \le s\,\Delta^{n-1}(T_1)$ for $n \ge 2$, where
\[
\Delta^n(T) = \int_0^T\|\Box(\phi^n - \phi^{n-1})\|_{L^2} + \|\Box(\mathbf{A}^n - \mathbf{A}^{n-1})\|_{L^2}. \tag{A.34}
\]
This follows as a corollary to the following theorem.

Theorem A.7. Let $(\phi^n, \mathbf{A}^n)$ solve the iteration scheme (A.19)–(A.24) on $[0, T_0]$. It follows that for $n \ge 3$ on $[0, T_0]$
\[
\Delta^n(T) \le c_7T^q(\Delta^{n-1}(T) + \Delta^{n-2}(T)), \tag{A.35}
\]
where $c_7 > 0$ depends only upon $e_0$ and $k_0$, and $q > 0$ is some universal constant.
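To see how a two-step bound of the type (A.35) yields summability of the differences, one can argue as in the sketch below. It rests on assumptions introduced here: the second term on the right of (A.35) is read as $\Delta^{n-2}(T)$, the recursion is used for $n \ge 4$ (so that both terms are defined), and $T$ is taken small enough that $\kappa := c_7T^q < \frac12$. The symbols $a_n$, $\kappa$, $\rho$ and $M$ are illustrative and do not appear in the text.

```latex
% Two-step recursion: a_n := \Delta^n(T), a_n \le \kappa (a_{n-1} + a_{n-2}), n \ge 4,
% with \kappa = c_7 T^q < 1/2, i.e. T < (2 c_7)^{-1/q}.
\begin{align*}
&\text{Let } \rho = \tfrac12\big(\kappa + \sqrt{\kappa^2 + 4\kappa}\,\big)
   \text{ be the positive root of } \rho^2 = \kappa(\rho + 1), \\
&\text{and set } M = \max\{a_2\,\rho^{-2},\; a_3\,\rho^{-3}\}. \\
&\text{If } a_{n-1} \le M\rho^{\,n-1} \text{ and } a_{n-2} \le M\rho^{\,n-2}, \text{ then }
   a_n \le \kappa M \rho^{\,n-2}(\rho + 1) = M\rho^{\,n} . \\
&\text{By induction } a_n \le M\rho^{\,n} \text{ for all } n \ge 2,
   \text{ and } \textstyle\sum_{n \ge 2} a_n < \infty \text{ since } \rho < 1 .
\end{align*}
% \rho < 1 because p(x) = x^2 - \kappa x - \kappa satisfies p(0) < 0 and
% p(1) = 1 - 2\kappa > 0 when \kappa < 1/2.
```

Summability of $\Delta^n(T)$ is what makes $(\phi^n, \mathbf{A}^n)$ a Cauchy sequence in the norm defined by (A.34), which is the sense in which the scheme contracts.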
Proof. Inequality 4.5 in [14] implies that
\[
\int_0^T\big\|\psi^n_1 - \psi^{n-1}_1 - (2ieA^n_0\dot\phi^n - 2ieA^{n-1}_0\dot\phi^{n-1}) - (G'(\phi^n) - G'(\phi^{n-1}))\big\|_{L^2} + \|\psi^n_2 - \psi^{n-1}_2\|_{L^2}\,dt \le c_7T^{\frac12}\Delta^{n-1}(T).
\]
Thus, by Lemma A.3 and its Corollary A.4, we are done if we can show
\[
\int_0^t\|G'(\phi^n) - G'(\phi^{n-1})\|_{L^2} < c_7t^q(\Delta^{n-1}(t)). \tag{A.36}
\]
However, (A.36) follows from (WP1) and (A.17).

Thus, we have proven the existence and regularity of Theorem 3.1. Local well-posedness in the sense of Sec. 1.4.1 now follows from [14, Theorem 4.2]. The following is an important corollary.

Corollary A.8. The solutions $(\mathbf{A}, \phi)$ given by our local existence Theorem 3.1 satisfy $\phi \in C([0, t]; H^1) \cap C^1([0, t]; L^2)$ and $\mathbf{A} \in C([0, t]; H^1) \cap C^1([0, t]; L^2)$, where $L^2$ is the space of all functions $f$ satisfying $\int|f|^2 < \infty$, and $H^1$ is the space of those functions $g$ satisfying $g, \nabla g \in L^2$. It follows that the Hamiltonian energy, the Noether charge due to the phase symmetry, and the momentum are preserved by the equations of motion on the interval $[0, t]$.

A.3. Some estimates of the soliton electromagnetic potential $\alpha$

Lemma A.9. For each $f \in H^2_r(R^3)$, there exists a unique $\alpha \in \dot H^1_r(R^3)$ such that
\[
-\Delta\alpha + e^2f^2\alpha = \omega ef^2. \tag{A.37}
\]
Furthermore, the map $A : H^2(R^3) \to \dot H^1(R^3)$ defined by $A(f) = \alpha$ is continuously Fréchet-differentiable.

Proof. This follows from standard arguments.

Lemma A.10. Suppose that $f \in H^1(R^3)$. Suppose further that $\alpha$ solves
\[
-\Delta\alpha + e^2f^2\alpha = e\omega f^2. \tag{A.38}
\]
It follows that $\nabla\alpha, \nabla_i\nabla_j\alpha \in L^2(R^3)$ for any $i, j \in (1, 2, 3)$. Furthermore, $\|\nabla_i\nabla_j\alpha\|_{L^2}, \|\nabla\alpha\|_{L^2}, \|\alpha\|_{L^\infty} = O(e)$.

Proof.
\[
\int|\nabla\alpha|^2 + e^2f^2\alpha^2 = e\omega\int f^2\alpha, \tag{A.39}
\]
from which it easily follows via Sobolev's inequality that
\[
\|\nabla\alpha\|_{L^2} \le ce\|f\|_{L^2}\|f\|_{L^3}. \tag{A.40}
\]
Next, since $-\Delta\alpha = e(\omega - e\alpha)f^2$, we have
\[
\|\Delta\alpha\|_{L^2} \le e\big(\omega\|f\|^2_{L^4} + e\|\alpha\|_{L^6}\|f\|^2_{L^6}\big). \tag{A.41}
\]
By the Calderon–Zygmund inequality, we have that for any $i, j \in (1, 2, 3)$,
\[
\|\nabla_i\nabla_j\alpha\|_{L^2} = O(e). \tag{A.42}
\]
By Sobolev's inequality, we have thus shown that $\alpha \in W^{1,6}$ and hence by Morrey's inequality, $\|\alpha\|_{L^\infty} = O(e)$.

Corollary A.11. Suppose that $f_{\omega,e} \in H^2(R^3)$ solves
\[
-\Delta f_{\omega,e} + m^2f_{\omega,e} - (\omega - e\alpha_{\omega,e})^2f_{\omega,e} = \beta(f_{\omega,e})f_{\omega,e}, \tag{A.43}
\]
where $\alpha_{\omega,e} \in \dot H^1_r(R^3)$ is a non-local function of $f_{\omega,e}$ uniquely determined by
\[
-\Delta\alpha_{\omega,e} + e^2f^2_{\omega,e}\alpha_{\omega,e} = \omega ef^2_{\omega,e}. \tag{A.44}
\]
Then, $f_{\omega,e} \in H^4(R^3)$.

Proof. Differentiate the equation for $f_{\omega,e}$ and apply the Calderon–Zygmund inequality.

This leads naturally to the following lemma.

Lemma A.12. Suppose that $f \in H^4(R^3)$ and that $\alpha$ solves
\[
-\Delta\alpha + e^2f^2\alpha = e\omega f^2. \tag{A.45}
\]
It follows that $\nabla\alpha \in W^{3,p}(R^3)$ for any $p \in (\frac32, \infty)$.

Proof. Differentiate (A.45), and apply the Calderon–Zygmund inequality (using the Hölder and Sobolev inequalities, if necessary) to get the result.

Lemma A.13. Suppose that $f \in H^2(R^3)$ and that $\alpha$ solves
\[
-\Delta\alpha + e^2f^2\alpha = e\omega f^2. \tag{A.46}
\]
It follows that $\|\alpha\|_{L^\infty} \le |\frac{\omega}{e}|$.

Proof. Assume that $f \in C^\infty_c(R^3)$. Define $\alpha_+ = \max(\alpha, 0)$ and $\alpha_- = \max(-\alpha, 0)$. Suppose $\frac{\omega}{e} > 0$; then by a weak maximum principle ([15, Theorem 8.1]), $\alpha \ge 0$. Now, $A_0 = \alpha - \frac{\omega}{e}$ solves $-\Delta A_0 + e^2|f|^2A_0 = 0$, therefore $A_0 \le 0$ by the same weak maximum principle. Hence, $0 \le \alpha \le \frac{\omega}{e}$. Similarly, if $-\frac{\omega}{e} > 0$, then $0 \ge \alpha \ge -\frac{\omega}{e}$ so that $\|\alpha\|_{L^\infty} \le |\frac{\omega}{e}|$. The lemma follows by approximation.
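Two of the short computations in Sec. A.3 can be made explicit. The sketch below assumes that $\alpha$ decays fast enough for the integrations by parts to hold without boundary terms, and writes $C$ for the constant in the Sobolev embedding $\dot H^1(R^3) \subset L^6(R^3)$; both are assumptions stated here for the check, not details taken from the text.

```latex
% (a) Energy identity (A.39) and the bound (A.40): multiply
%     -\Delta\alpha + e^2 f^2 \alpha = e\omega f^2 by \alpha and integrate.
\[
\int |\nabla\alpha|^2 + e^2 \int f^2\alpha^2
  = e\omega \int f^2\alpha
  \;\le\; |e\omega|\,\|f\|_{L^2}\,\|f\|_{L^3}\,\|\alpha\|_{L^6}
  \;\le\; C\,|e\omega|\,\|f\|_{L^2}\,\|f\|_{L^3}\,\|\nabla\alpha\|_{L^2} ,
\]
% by Hoelder with 1/2 + 1/3 + 1/6 = 1 followed by Sobolev; dropping the
% non-negative term e^2 \int f^2\alpha^2 and dividing by \|\nabla\alpha\|_{L^2}
% gives \|\nabla\alpha\|_{L^2} \le c\,e\,\|f\|_{L^2}\|f\|_{L^3}.

% (b) Shift used in Lemma A.13: with A_0 = \alpha - \omega/e,
\[
-\Delta A_0 + e^2 f^2 A_0
  = \big(-\Delta\alpha + e^2 f^2\alpha\big) - e^2 f^2\,\frac{\omega}{e}
  = e\omega f^2 - e\omega f^2 = 0 ,
\]
% so A_0 solves the homogeneous equation and the weak maximum principle applies.
```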
A.4. Differentiability

Lemma A.14. Let $f_{\omega,e}$ be given by Theorem 1.1. Then,
\[
\|f_{\omega,e} - f_{\omega,0}\|_{H^2} = O(e^2). \tag{A.47}
\]

Proof. By the implicit function theorem,
\[
\frac{df_{\omega,e}}{de} = -(F_\phi(f_{\omega,e}, e, 0))^{-1}\circ\frac{\partial}{\partial e}F(f_{\omega,e}, e, 0), \tag{A.48}
\]
where $F$ is given by (2.1) in the proof of Theorem 1.1. It follows therefore that $\frac{df_{\omega,e}}{de}(e = 0) = 0$ and that $\frac{df_{\omega,e}}{de}$ is continuously differentiable with respect to $e$.

Lemma A.15. Let $\tilde h_\omega = h_\omega - \omega q_\omega$, where $h_\omega = H(\Phi_{S,e}(0, \omega, 0, 0))$ while $q_\omega = Q(\Phi_{S,e}(0, \omega, 0, 0))$. Then
\[
\frac{d}{d\omega}\tilde h_\omega = -q_\omega. \tag{A.49}
\]

Proof. Following the argument given in [4], we note that
\[
\frac{d}{d\omega}\tilde h_\omega = -q_\omega + \Big\langle H'(\Phi_{S,e}(\lambda_0)) - \omega Q'(\Phi_{S,e}(\lambda_0)), \frac{d}{d\omega}\Phi_{S,e}(\lambda_0)\Big\rangle, \tag{A.50}
\]
where $\lambda_0 = (0, \omega, 0, 0)$. The result follows from the fact that $H'(\Phi_{S,e}(\lambda_0)) - \omega Q'(\Phi_{S,e}(\lambda_0)) = 0$.

A.5. Taylor's formula for the potential

Lemma A.16. Suppose that $G$ obeys the condition (N). Suppose also that $\phi, v \in H^1$. Then,
\[
\int G(|\phi + v|) = \int G(|\phi|) + G'(|\phi|)[v] + \frac12G''(|\phi|)[v]^2 + o(\|v\|^2_{H^1}). \tag{A.51}
\]

Proof. Using a standard Taylor expansion at each $x \in R^3$, we have
\[
\int_{R^3}G(|\phi + v|) = \int_{R^3}G(|\phi|) + G'(|\phi|)[v] + \frac12G''(|\phi|)[v]^2 + \int_{R^3}\int_0^1(1 - s)\big(G''(|\phi + sv|) - G''(|\phi|)\big)[v]^2.
\]
The result follows from condition (N) by Hölder's and Sobolev's inequalities.

A.6. Exponential decay of the profile function $f_{\omega,e}$

Lemma A.17. Suppose that $|e| < e_1$, for some $e_1 > 0$. Under conditions (A.2)–(A.5) on $U$,
\[
|D^\kappa f_{\omega,e}(x)| \le C\,\mathrm{Exp}[-\delta|x|] \tag{A.52}
\]
for positive constants $C$ and $\delta$, and where $|\kappa| \le 2$. Furthermore, the constants $C$ and $\delta$ are independent of the coupling constant $e$.
Proof. We adapt an argument used to prove [2, Lemma 2]. Recall that $f_{\omega,e}$ solves
\[
-\Delta f_{\omega,e} + (m^2 - (\omega - e\alpha)^2)f_{\omega,e} = \beta(f_{\omega,e})f_{\omega,e}. \tag{A.53}
\]
Now, let $h_e = rf_{\omega,e}$, where $r = |x|$. Then, for $r \in (0, \infty)$, $\frac{d^2}{dr^2}h_e = r\Delta f_{\omega,e}$, and so
\[
\frac{d^2}{dr^2}h_e = (m^2 - (\omega - e\alpha)^2 - \beta(f_{\omega,e}))h_e. \tag{A.54}
\]
It follows from a lemma by Strauss [16] that, since $f_{\omega,e} \in H^1$, there exists $r_1 > 0$ such that
\[
|f_{\omega,e}| < C|x|^{-1} \tag{A.55}
\]
for $|x| > r_1$, where by continuity of $\|f_{\omega,e}\|_{H^1}$ as a function of $e$, we may assume that the positive constant $C$ is independent of $e$. Recall that from Lemma A.13 $(\omega - e\alpha)^2 < \omega^2$, whence $m^2 - (\omega - e\alpha)^2 - \beta(f_{\omega,e}) > m^2 - \omega^2 - \beta(f_{\omega,e})$. Therefore, there exists $r_0 > 0$ such that $r > r_0$ implies that
\[
m^2 - \omega^2 - \beta(f_{\omega,e}[r]) > \frac{m^2 - \omega^2}{2}, \tag{A.56}
\]
for example. Next, let $q_e = h_e^2$ so that we have
\[
\frac{d^2}{dr^2}q_e = 2\Big(\frac{dh_e}{dr}\Big)^2 + 2(m^2 - (\omega - e\alpha)^2 - \beta(f_{\omega,e}))q_e, \tag{A.57}
\]
from which it follows that
\[
\frac{d^2}{dr^2}q_e - (m^2 - \omega^2)q_e \ge 0 \tag{A.58}
\]
for $r > r_0$. Factorizing the left side of the above, let us define $z_e$ by $z_e = e^{-\sqrt{m^2 - \omega^2}\,r}\big(\frac{d}{dr}q_e + \sqrt{m^2 - \omega^2}\,q_e\big)$ so that
\[
\frac{dz_e}{dr} > 0. \tag{A.59}
\]
For the sake of obtaining a contradiction, let us suppose that there exists $r_1 > r_0$ such that $z_e(r_1) > 0$. This would then imply that $z_e(r) \ge z_e(r_1) > 0$ for all $r > r_1$. However, we should then have that $\frac{d}{dr}q_e + \sqrt{m^2 - \omega^2}\,q_e \ge z(r_1)e^{\sqrt{m^2 - \omega^2}\,r}$ for all $r > r_1$. We now have a contradiction since $\frac{d}{dr}q_e + \sqrt{m^2 - \omega^2}\,q_e$ is an integrable function. It follows that for all $r > r_0$ we have $z_e(r) < 0$. Thus, for $r > r_0$, we have
\[
\frac{d}{dr}\big(e^{\sqrt{m^2 - \omega^2}\,r}q_e\big) = e^{2\sqrt{m^2 - \omega^2}\,r}z_e \le 0, \tag{A.60}
\]
whence
\[
q_e(r) \le \big(q_e(r_1)e^{\sqrt{m^2 - \omega^2}\,r_1}\big)e^{-\sqrt{m^2 - \omega^2}\,r}, \tag{A.61}
\]
and finally
\[
|f_{\omega,e}| \le \frac{C}{r}e^{-\frac{\sqrt{m^2 - \omega^2}\,r}{2}}. \tag{A.62}
\]
To control the other derivatives, observe that
\[
\frac{d}{dr}\Big(r^2\frac{d}{dr}f_{\omega,e}\Big) = r^2(m^2 - (\omega - e\alpha)^2 - \beta(f_{\omega,e}))f_{\omega,e}. \tag{A.63}
\]
Thus, there exists $r_2 > 0$ such that, if $r > r_2$, then $\frac{m^2 - \omega^2}{2} < m^2 - (\omega - e\alpha)^2 - \beta(f_{\omega,e}) < m^2 - \omega^2$. Upon integrating (A.63) on $(R_1, R_2)$, and using the decay of $f_{\omega,e}$ (A.62), and letting both $R_1$ and $R_2$ go to infinity, we conclude that $r^2\frac{d}{dr}f_{\omega,e}$ has a limit as $r$ tends to infinity. Integrating (A.63), this time on $(R_1, \infty)$, it follows that $\frac{d}{dr}f_{\omega,e}$ has exponential decay. The exponential decay of $\frac{d^2}{dr^2}f_{\omega,e}$ follows from (A.52).

References

[1] H. Spohn, Dynamics of Charged Particles and Their Radiation Field (Cambridge University Press, Cambridge, 2004).
[2] H. Berestycki and P. L. Lions, Nonlinear scalar field equations. I. Existence of a ground state, Arch. Rational Mech. Anal. 82 (1983) 313–345.
[3] S. Coleman, Q balls, Nuclear Physics B 262 (1985) 263–283.
[4] M. Grillakis, J. Shatah and W. Strauss, Stability theory of solitary waves in the presence of symmetry, I, J. Funct. Anal. 74 (1987) 160–197.
[5] D. M. A. Stuart, Modulational approach to stability of non-topological solitons in semilinear wave equations, J. Math. Pures Appl. 80(1) (2001) 51–83.
[6] V. Benci and D. Fortunato, Solitary waves of the nonlinear Klein–Gordon equation coupled with the Maxwell equations, Rev. Math. Phys. 14(4) (2002) 409–420.
[7] T. D'Aprile and D. Mugnai, Solitary waves for nonlinear Klein–Gordon–Maxwell and Schrödinger–Maxwell equations, Proc. Roy. Soc. Edinburgh Sect. A 134(5) (2004) 893–906.
[8] K. McLeod, Uniqueness of positive radial solutions of $\Delta u + f(u) = 0$ in $R^n$, Trans. Amer. Math. Soc. 339(3) (1993) 495–505.
[9] D. M. A. Stuart, Periodic solutions of the abelian Higgs model and rigid rotation of vortices, Geom. Funct. Anal. 9(3) (1999) 568–595.
[10] M. Weinstein, Modulational stability of ground states of nonlinear Schrödinger equations, SIAM J. Math. Anal. 16(3) (1985) 472–491.
[11] M. Reed and B. Simon, Methods of Modern Mathematical Physics, Vol. 1 (Academic Press, New York, 1972).
[12] M. Grillakis, Regularity for the wave equation with a critical non-linearity, Comm. Pure Appl. Math. 45(6) (1992) 749–774.
[13] R. S. Strichartz, Restrictions of Fourier transforms to quadratic surfaces and decay of solutions of wave equations, Duke Math. J. 44 (1977) 705–714.
[14] S. Klainerman and M. Machedon, On the Maxwell–Klein–Gordon equation with finite energy, Duke Math. J. 74(1) 19–44 (1994). [15] D. Gilbarg and N. S. Trudinger, Elliptic Partial Differential Equations of Second Order (Springer-Verlag, Berlin, 1998). [16] W. Strauss, Existence of solitary waves in higher dimensions, Comm. Math. Phys. 55 (1977) 149–162. [17] E. Long, On charged solitons and electromagnetism, Ph.D. thesis, University of Cambridge (2006).
October 7, 2006 17:43 WSPC/148-RMP
J070-00279
Reviews in Mathematical Physics Vol. 18, No. 7 (2006) 781–821 © World Scientific Publishing Company
ON MOMENT MAPS ASSOCIATED TO A TWISTED HEISENBERG DOUBLE
C. KLIMČÍK
Institute de Mathématiques de Luminy, 163, Avenue de Luminy, 13288 Marseille, France
[email protected]
Received 21 February 2006
Revised 24 August 2006
We review the concept of the (anomalous) Poisson–Lie symmetry in a way that emphasizes the notion of Poisson–Lie Hamiltonian. The language that we develop turns out to be very useful for several applications: we prove that the left and the right actions of a group G on its twisted Heisenberg double (D, κ) realize (anomalous) Poisson–Lie symmetries, and we explain in a transparent way the concept of Poisson–Lie subsymmetry and that of Poisson–Lie symplectic reduction. Under some additional conditions, we also construct a non-anomalous moment map corresponding to a sort of quasi-adjoint action of G on (D, κ). The absence of the anomaly of this "quasi-adjoint" moment map makes it possible to gauge deformed WZW models.
Keywords: Gauged WZW model; twisted Heisenberg double; Poisson–Lie symmetry.
Mathematics Subject Classification 2000: 81T40
1. Introduction
Poisson–Lie symmetry [15] is the generalization of the ordinary Hamiltonian symmetry of a dynamical system and, upon quantization, it becomes the quantum group symmetry. Many dynamical systems can be deformed in such a way that their ordinary symmetries become Poisson–Lie. Among such systems is the standard WZW model [17], whose loop group symmetry gets deformed [9]. The principal goal of the present work is to develop the theory of gauging the deformed WZW model. From the mathematical point of view, the problem amounts to identifying non-anomalous Poisson–Lie subsymmetries of the deformed WZW model which would make the gauging possible. In order to describe the Poisson–Lie analogue of the WZW vanishing anomaly condition [18], we shall first have to develop appropriate mathematical tools. In particular, it turns out that the standard definition of the Poisson–Lie symmetry (i.e. the action map G × M → M is Poisson) is too rough, since it is unable to distinguish between non-anomalous and anomalous
symmetries. For this reason, we shall refine the standard concept of the Poisson–Lie symmetry and propose a new definition based on the Poisson–Lie structure on the cosymmetry (or dual) group B rather than on the symmetry group G. We are fully aware that the language that we develop is not quite standard in Poisson(–Lie) geometry, but we find it well adapted for our discussion of anomalies and we also believe that it may constitute an insightful alternative in treating Poisson–Lie symmetric systems in general. The central object of our investigations will be a class of Poisson manifolds introduced by Semenov-Tian-Shansky under the name of twisted Heisenberg doubles [16]. As was conjectured in [9] and shown in [11], particular elements of this class play the role of the phase spaces of the deformed WZW models. This also means that results obtained in full generality for any twisted Heisenberg double will also hold for any deformed WZW model. In order to present in this introduction the principal ideas and results of our work, we first state the two main definitions and the three main theorems proved later in the body of the paper.

Definition 2.2. Let M be a symplectic manifold whose algebra of smooth functions $Fun(M)$ is equipped with a Poisson bracket $\{.,.\}$. Let B be a Poisson–Lie group and let $\mu : M \to B$ be a smooth map. To every function $y \in Fun(B)$, we can associate a vector field $w_\mu(y) \in Vect(M)$ as follows:
\[ w_\mu(y)f = \{f, \mu^*(y')\}\,\mu^*(S(y'')), \qquad y \in Fun(B), \quad f \in Fun(M). \]
We say that $\mu$ realizes the Poisson–Lie symmetry of M if the map $w_\mu$ is an anti-homomorphism of the Lie algebras $Fun(B)$ and $Vect(M)$. If, moreover, the map $\mu$ is Poisson, we say that the symmetry is equivariant or non-anomalous.

Definition 2.4. Let D be an even-dimensional Lie group equipped with a maximally Lorentzian bi-invariant metric. If $\mathrm{Lie}(D) = \mathrm{Lie}(G) \dot{+} \mathrm{Lie}(B)$, where G and B are maximally isotropic subgroups, D is called the Drinfeld double of G or the Drinfeld double of B. Let $\kappa$ be a metric preserving automorphism of D and suppose that there are respective bases $T^i$ and $t_i$ $(i = 1, \ldots, n)$ of $\mathcal{G} = \mathrm{Lie}(G)$ and $\mathcal{B} = \mathrm{Lie}(B)$ such that $(T^i, t_j)_D = \delta^i_j$. Then the (basis independent) expression
\[ \{f_1, f_2\}_D \equiv \nabla^R_{T^i} f_1 \nabla^R_{t_i} f_2 - \nabla^L_{\kappa(t_i)} f_1 \nabla^L_{\kappa(T^i)} f_2, \qquad f_1, f_2 \in Fun(D), \]
is a Poisson bracket and the Poisson manifold $(D, \{.,.\}_D)$ is called the twisted Heisenberg double.

Theorem 2.5. Let D be a twisted Heisenberg double which is also decomposable, i.e. such that two global unambiguous decompositions hold: $D = \kappa(B)G$ and $D = \kappa(G)B$. Consider (smooth) maps $\Lambda_L, \Lambda_R : D \to B$, $\Xi_R, \Xi_L : D \to G$ respectively
induced by these two decompositions. Then the following holds: (a) The Poisson manifold $(D, \{.,.\}_D)$ is symplectic. (b) Both maps $\Lambda_L$ and $\Lambda_R$ realize (anomalous) Poisson–Lie symmetries of the symplectic manifold $(D, \{.,.\}_D)$. The corresponding symmetry group is G acting as
\[ h \triangleright K = \kappa(h)K, \qquad h \in G, \quad K \in D, \]
or, respectively, as
\[ h \triangleright K = Kh^{-1}, \qquad h \in G, \quad K \in D. \]
Theorem 3.1. Let D be a decomposable twisted Heisenberg double such that the twisting automorphism $\kappa$ preserves the subgroup B. Construct two new maps $B_L : D \to B$ and $B_R : D \to B$ as follows:
\[ B_L(K) = \kappa(\Lambda_L(K))\Lambda_R(K), \qquad B_R(K) = \kappa^{-1}(\Lambda_R(K))\Lambda_L(K), \qquad K \in D. \]
Then the following holds: both maps $B_L$ and $B_R$ are Poisson and they realize (non-anomalous) Poisson–Lie symmetries of $(D, \{.,.\}_D)$. The corresponding symmetry group is G acting as
\[ h \triangleright K = \kappa(h)\,K\,\Xi_R(\kappa[h\Lambda_L(K)]), \qquad h \in G, \quad K \in D, \]
or, respectively, as
\[ h \triangleright K = \kappa[\Xi_L^{-1}(\Lambda_R^{-1}(K)h^{-1})]\,K h^{-1}, \qquad h \in G, \quad K \in D. \]
Theorem 3.2. Let D be a decomposable twisted Heisenberg double, κ an automorphism of D preserving B and N a normal subgroup of B. Denote by C the factor group B/N, by ρ the natural homomorphism B → C and by Pκ : Lie(D) → Lie(B) a projector on Lie(B) with kernel κ(Lie(G)). Suppose that the Hopf subalgebra ρ∗ (F un(C)) of F un(B) is also a Poisson subalgebra. Then it holds: The composed map νR ≡ ρ ◦ ΛR realizes the Poisson–Lie symmetry of D and the corresponding symmetry group H is the subgroup of G. If, moreover, Pκ (Lie(H)) ⊂ Lie(N ) then the moment map νR is non-anomalous. Apart from these three theorems, we prove two more propositions (Lemmas 3.3 and 3.4) enlarging the story to the non-decomposable twisted Heisenberg doubles. The formulations of those additional lemmas require introduction of several new concepts therefore, for the sake of conciseness of this introduction, we shall expose them only in Sec. 3.3. The principal field of applications of our results is the theory of non-linear σ-models which are two-dimensional field theories describing the propagation of closed strings on a Riemannian manifold T . The manifold T is often referred to as
the target space and it also comes equipped with a closed 3-form H. The classical action for a closed string configuration $x^\mu(\sigma, \tau)$ reads
\[ S[x^\mu(\sigma, \tau)] = \frac{1}{2} \int d\sigma\, d\tau\, G_{\mu\nu}(x)\, \partial_+ x^\mu \partial_- x^\nu + \int_V x^* H, \]
where $\sigma$ is a periodic loop parameter, $\tau$ the evolution parameter, $x^\mu$ are coordinates on T, $G_{\mu\nu}$ are the components of the Riemannian metric and $\partial_\pm \equiv \partial_\tau \pm \partial_\sigma$. It should be noted that the configuration $x^\mu(\sigma, \tau)$ is extended to a configuration defined in the volume V whose boundary is the surface of the propagating closed string, and $x^* H$ is the pull-back of the H-potential to this volume V. A detailed explanation of why the variational principle based on the action S does not depend on the ambiguity of the extension of x is given, e.g., in [17, 6, 12]. The prominent example of the non-linear σ-model is the WZW model, for which the target space is the compact group manifold K equipped with the standard Killing–Cartan metric $(.,.)_K$. Its action reads
\[ S_{WZW}[g(\sigma, \tau)] = \frac{1}{2} \int d\sigma\, d\tau\, (\partial_+ g g^{-1}, \partial_- g g^{-1})_K + \frac{1}{12} \int_V ([dg g^{-1}, dg g^{-1}], dg g^{-1})_K. \]
Let S be a subgroup of K and let $A_\pm(\sigma, \tau)$ be two $\mathrm{Lie}(S)$-valued fields. The gauged K/S WZW model is then a dynamical system described by the following classical action:
\[ S_{GWZW}[g(\sigma, \tau), A_\pm(\sigma, \tau)] = S_{WZW}[g(\sigma, \tau)] + \int d\sigma\, d\tau\, \bigl( -(\partial_+ g g^{-1}, A_-)_K + (\partial_- g g^{-1}, A_+)_K - (g^{-1} A_- g, A_+)_K + (A_-, A_+)_K \bigr). \]
The action $S_{GWZW}$ is invariant with respect to the gauge transformations
\[ g(\sigma, \tau) \to s^{-1}(\sigma, \tau)\, g(\sigma, \tau)\, s(\sigma, \tau), \qquad A_\pm(\sigma, \tau) \to s^{-1}(\sigma, \tau) A_\pm(\sigma, \tau) s(\sigma, \tau) - s^{-1}(\sigma, \tau) \partial_\pm s(\sigma, \tau), \]
where $s(\sigma, \tau)$ takes values in the subgroup S. (Gauged) WZW models are dynamical systems whose phase spaces are symplectic manifolds. We shall show in Sec. 4 that their symplectic structures coincide with those of (gauged) twisted Heisenberg doubles. Actually, the twisted Heisenberg doubles underlying the ordinary WZW models are very special in the sense that the symmetry group G is the loop group LK and the cosymmetry group B is Abelian.
If we consider also doubles with non-Abelian B, we are very naturally
led to more general theories which we call the deformed WZW models. Let us now explain the meaning of Theorems 2.5, 3.1 and 3.2 in the WZW context. If B is Abelian, Theorem 2.5 says that the ordinary WZW models enjoy two anomalous chiral symmetries, respectively given by the (twisted) left and ordinary right multiplications by elements of the loop group LK. If B is non-Abelian, the deformed WZW models still have two anomalous chiral Poisson–Lie symmetries. Theorem 3.1 says that the left and right moment maps $\Lambda_L$, $\Lambda_R$ can be combined into the non-anomalous moment maps $B_L$, $B_R$. For B Abelian, these new moment maps are equal to each other and they generate the adjoint action of G on the target space of the σ-model. This adjoint action is non-anomalous and serves as the basis of the standard vector gauging of the WZW model, leading to the gauged K/S WZW model described above. However, if B is non-Abelian, the moment maps $B_L$ and $B_R$ do not coincide and we have the two different non-anomalous quasi-adjoint actions of Theorem 3.1, which can be consistently gauged. Finally, Theorem 3.2 explains under which conditions the chiral subsymmetries may become non-anomalous and can be consistently gauged. As an illustration, we devote the whole of Sec. 4 to a very explicit construction of a particular new deformation of the ordinary WZW model (which we call the u-deformation) and work out in detail its deformed vector gauging. The paper is organized as follows: In Sec. 2, we discuss the concept of the Poisson–Lie symmetry, we explain the motivations for Definition 2.2 and we prove Theorem 2.5. Then in Secs. 3.1 and 3.2, we respectively prove Theorems 3.1 and 3.2 and, in Sec. 3.3, we present the theory of non-decomposable doubles. In Sec. 4, we construct the u-deformed WZW model and perform its Poisson–Lie gauging. We finish with short conclusions and an outlook.

2. Twisted Heisenberg Double
The presentation of this section extends that of [11].
In particular, we give full proofs of the statements listed in [11] and, moreover, we are more general concerning the properties of the twist $\kappa$ of a double D.

2.1. Lie groups in a dual language
Let B be a Lie group and $Fun(B)$ the algebra of functions on it. It is well known that the group structure on B gives rise to a so-called coproduct $\Delta : Fun(B) \to Fun(B) \otimes Fun(B)$, the antipode $S : Fun(B) \to Fun(B)$ and the counit $\varepsilon : Fun(B) \to \mathbb{R}$ given, respectively, by the formulae
\[ \Delta x(b_1, b_2) = x'(b_1)x''(b_2) = x(b_1 b_2), \qquad S(x)(b) = x(b^{-1}), \qquad \varepsilon(x) = x(e_B). \]
Here $x \in Fun(B)$, $b, b_1, b_2 \in B$, $e_B$ is the unit element of B and we use the Sweedler notation for the coproduct:
\[ \Delta x = \sum_\alpha x'_\alpha \otimes x''_\alpha \equiv x' \otimes x''. \]
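As an illustration that is not part of the original text, one may take B to be a matrix group with coordinate functions $x_{ij}(b) = b_{ij}$. The choice of a matrix group and the names $x_{ij}$ are assumptions made only for this sketch; under them the three structure maps take the following form:

```latex
\begin{align*}
\Delta x_{ij} &= \sum_{k} x_{ik} \otimes x_{kj}
  && \text{since } x_{ij}(b_1 b_2) = \sum_k (b_1)_{ik}\,(b_2)_{kj}, \\
S(x_{ij})(b) &= (b^{-1})_{ij}, \\
\varepsilon(x_{ij}) &= \delta_{ij}
  && \text{since } x_{ij}(e_B) = \delta_{ij}.
\end{align*}
```

In this example the Sweedler shorthand $x' \otimes x''$ for $x = x_{ij}$ simply stands for the sum $\sum_k x_{ik} \otimes x_{kj}$.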
The Lie algebra $\mathcal{B}$ of B is defined as the set of ε-derivations of $Fun(B)$, i.e.
\[ \mathcal{B} = \{\delta : Fun(B) \to \mathbb{R}, \ \delta(xy) = \varepsilon(x)\delta(y) + \varepsilon(y)\delta(x)\}. \]
The Lie bracket on $\mathcal{B}$ is defined as follows:
\[ [\delta_1, \delta_2](x) = \delta_1(x')\delta_2(x'') - \delta_1(x'')\delta_2(x'). \]
This definition of the Lie algebra $\mathcal{B}$ is of course equivalent to the more standard one presenting $\mathcal{B}$ as the set of right-invariant vector fields. In order to connect the two definitions, consider a map $\phi_B : Fun(B) \to \Omega^1(B)$ (the map $\phi_B$ thus goes from functions into 1-forms on B) defined by
\[ \phi_B(x) = dx'\, S(x''). \]
Note that the 1-form $\phi_B(x)$ is automatically right-invariant, therefore the canonical pairing of a right-invariant vector field v with $\phi_B(x)$ defines a map $\delta_v : Fun(B) \to \mathbb{R}$:
\[ \delta_v(x) = \langle v, \phi_B(x) \rangle. \tag{2.1} \]
The map $\delta_v$ is indeed an ε-derivation due to the following property of the map $\phi_B$:
\[ \phi_B(xy) = \varepsilon(x)\phi_B(y) + \varepsilon(y)\phi_B(x). \]
On the other hand, every ε-derivation δ defines a right-invariant vector field $\nabla^L_\delta$ which acts on $x \in Fun(B)$ as follows:
\[ \nabla^L_\delta x = \delta(x')x''. \]
Consider now a Poisson–Lie group B, i.e. a Lie group equipped with a Poisson bracket $\{.,.\}_B$ satisfying
\[ \Delta\{x, y\}_B = \{x', y'\}_B \otimes x''y'' + x'y' \otimes \{x'', y''\}_B, \qquad x, y \in Fun(B). \tag{2.2} \]
It is not difficult to prove that the property (2.2) implies
\[ S(\{x, y\}_B) = -\{S(x), S(y)\}_B, \qquad x, y \in Fun(B), \tag{2.3a} \]
\[ \varepsilon(\{x, y\}_B) = 0, \qquad x, y \in Fun(B). \tag{2.3b} \]
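For the reader's convenience, here is one way to see the counit identity (2.3b); this short derivation is our own sketch and is not taken from the original text. It uses only (2.2), the multiplicativity of ε, the identity $\varepsilon(z) = (\varepsilon \otimes \varepsilon)\Delta z$ (which expresses $e_B e_B = e_B$) and the counit axiom $x'\varepsilon(x'') = x = \varepsilon(x')x''$:

```latex
\begin{align*}
\varepsilon(\{x,y\}_B)
 &= (\varepsilon\otimes\varepsilon)\,\Delta\{x,y\}_B \\
 &= \varepsilon(\{x',y'\}_B)\,\varepsilon(x'')\,\varepsilon(y'')
  + \varepsilon(x')\,\varepsilon(y')\,\varepsilon(\{x'',y''\}_B) \\
 &= \varepsilon\bigl(\{x'\varepsilon(x''),\,y'\varepsilon(y'')\}_B\bigr)
  + \varepsilon\bigl(\{\varepsilon(x')x'',\,\varepsilon(y')y''\}_B\bigr) \\
 &= 2\,\varepsilon(\{x,y\}_B).
\end{align*}
```

Comparing the first and the last expression gives $\varepsilon(\{x,y\}_B) = 0$.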
Denote by $\mathcal{B}^*$ the linear dual of the Lie algebra $\mathcal{B} = \mathrm{Lie}(B)$. The Poisson–Lie bracket $\{.,.\}_B$ induces a natural Lie algebra structure $[.,.]^*$ on $\mathcal{B}^*$. Let us explain this fact in more detail. First of all, recall that $\mathcal{B}^*$ can be identified with the space of right-invariant 1-forms on the group manifold B and we have the natural (surjective) map $\phi_B : Fun(B) \to \mathcal{B}^*$ defined by
\[ \phi_B(y) = dy'\, S(y''), \qquad y \in Fun(B). \]
Note that the 1-form $\phi_B(y)$ is right-invariant, therefore it is indeed in $\mathcal{B}^*$. Let $U, V \in \mathcal{B}^*$ and $x, y \in Fun(B)$ be such that $U = \phi_B(x)$ and $V = \phi_B(y)$. Then, we define
\[ [U, V]^* = \phi_B(\{x, y\}_B). \tag{2.3c} \]
It is the Poisson–Lie property (2.2) of $\{.,.\}_B$ which ensures the independence of $[U, V]^*$ of the choice of the representatives x, y. In what follows, the Lie algebra $(\mathcal{B}^*, [.,.]^*)$ will be denoted by the symbol $\mathcal{G}$ and G will be a (connected, simply connected) Lie group such that $\mathcal{G} = \mathrm{Lie}(G)$. We note that G is often referred to as the dual group of B. It can itself be equipped with a Poisson–Lie bracket $\{.,.\}_G$ inducing on $\mathcal{G}^* \equiv \mathcal{B}$ the correct Lie algebra structure $\mathrm{Lie}(B)$.
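To make the map $\phi_B$ more tangible, the following sketch evaluates it on the coordinate functions of a matrix group. The matrix-group setting and the functions $x_{ij}(b) = b_{ij}$ are assumptions introduced only for this illustration and do not appear in the original text:

```latex
\begin{align*}
\phi_B(x_{ij}) &= \sum_k dx_{ik}\, S(x_{kj}) = (db\, b^{-1})_{ij}, \\
b \mapsto bc:\quad d(bc)\,(bc)^{-1} &= db\, c\, c^{-1} b^{-1} = db\, b^{-1}
  \qquad (c \in B \text{ fixed}).
\end{align*}
```

In this setting $\phi_B(x_{ij})$ is therefore a component of the right-invariant Maurer–Cartan form, which is consistent with the statement that $\phi_B$ takes values in right-invariant 1-forms.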
2.2. Poisson–Lie symmetry
The concept of the Poisson–Lie symmetry of a symplectic manifold M was introduced by Semenov-Tian-Shansky [15]. Traditionally, it concerns the action of a Poisson–Lie group G on M such that the smooth map G × M → M is Poisson. Certain Poisson–Lie symmetries have moment maps $\mu : M \to B$, where B is the dual Poisson–Lie group. Let $\Pi_M$ be the Poisson bivector corresponding to the symplectic structure on M, let $\rho_B$ be the right-invariant Maurer–Cartan form on B and let $\langle ., . \rangle$ denote the canonical pairing between $\mathrm{Lie}(B)$ and $\mathrm{Lie}(G)$. Then the moment map µ is characterized by the property that the vector field $\Pi_M(., \mu^*\langle \rho_B, U \rangle) \in Vect(M)$ generates the infinitesimal action of the element $U \in \mathrm{Lie}(G)$ on M. We have the following lemma:

Lemma 2.1. Let the action G × M → M be a Poisson–Lie symmetry with the moment map $\mu : M \to B$ and let $w_\mu : Fun(B) \to Vect(M)$ be the map defined as
\[ w_\mu(y) = \Pi_M(., \mu^*\phi_B(y)). \]
Then $w_\mu$ is an anti-homomorphism of the Lie algebras $Fun(B)$ and $Vect(M)$.

Proof. Let x, y be in $Fun(B)$. We know that the right-invariant 1-forms $\phi_B(x)$ and $\phi_B(y)$ can be seen as elements of $\mathrm{Lie}(G)$, and we denote them as U and V, respectively. Then the statement of the lemma follows from Eq. (2.3c) and from the property of the moment map stated above.

In this paper, we shall advocate a different approach to Poisson–Lie symmetry and we take the statement of Lemma 2.1 as a definition. Thus we propose:

Definition 2.2. Let M be a symplectic manifold whose algebra of smooth functions $Fun(M)$ is equipped with a Poisson bracket $\{.,.\}$. Let B be a Poisson–Lie group
and let $\mu : M \to B$ be a smooth map. To every function $y \in Fun(B)$, we can associate a vector field $w_\mu(y) \in Vect(M)$ as follows:
\[ w_\mu(y)f = \{f, \mu^*(y')\}\,\mu^*(S(y'')), \qquad y \in Fun(B), \quad f \in Fun(M). \tag{2.4} \]
We say that µ realizes the Poisson–Lie symmetry of M if the map $w_\mu$ is an anti-homomorphism of the Lie algebras $Fun(B)$ and $Vect(M)$. If, moreover, the map µ is Poisson, we say that the symmetry is equivariant or non-anomalous.

Explanations. If µ realizes the Poisson–Lie symmetry of M, the opposite Lie algebra of the image $\mathrm{Im}(w_\mu)$ of the map $w_\mu$ is a Lie algebra that will be denoted as $\mathcal{G}$. If the action of the Lie algebra $\mathcal{G}$ on M can be lifted to the action of a connected Lie group G (such that $\mathrm{Lie}(G) = \mathcal{G}$), we speak about a global Poisson–Lie symmetry. G will then be referred to as the symmetry group of (M, µ) and B as the cosymmetry group. Note that G acts on M, while B underlies the way in which this action is expressed via the Poisson brackets. If there is a distinguished (evolution) vector field $v \in Vect(M)$ leaving $\mathrm{Im}(\mu^*)$ invariant, we say that the dynamical system $(M, \{.,.\}, v)$ is (G, B)-Poisson–Lie symmetric (cf. [11]). We also note that $y \in Fun(B)$ can be interpreted as a non-Abelian (or Poisson–Lie) Hamiltonian of the vector field $w_\mu(y)$. The fact that $w_\mu$ is an anti-homomorphism just implies the nice formula
\[ [w_\mu(x), w_\mu(y)] = -w_\mu(\{x, y\}_B). \]
If the group B is Abelian then $\Delta(x) = 1 \otimes x + x \otimes 1$ and (2.4) is nothing but the standard Hamiltonian formula $w_\mu(y)f = \{f, \mu^*(y)\}$. Thus the Poisson–Lie symmetry becomes the standard Hamiltonian symmetry if the cosymmetry group B is Abelian. Let us note also that Definition 2.2 can be reformulated by using the Maurer–Cartan form $\rho_B$, thus avoiding any reference to the coproduct on $Fun(B)$ (this essentially amounts to replacing $dy'\, S(y'')$ by $\langle \rho_B, V \rangle$). There are two reasons why we choose the formulation that uses the coproduct and the antipode. The first one is not directly related to this paper, but is important in general from the perspective of quantization.
Indeed, for the definition of the Hopf symmetry the notions of coproduct and antipode are indispensable already at the level of the basic definition, and the close relationship between the Poisson–Lie and Hopf symmetry thus becomes more transparent. The second reason is more practical. In fact, the notation using the coproduct and the antipode is technically more convenient in elaborating and formulating the proofs of the theorems presented in the paper.

Remark. Our definition of the Poisson–Lie symmetry and the traditional one are close cousins but they are not quite identical. For example, a traditional symmetry must admit a moment map in order to be a symmetry in the new sense, and a newly defined symmetry must be global in order to be traditional. The main reason why we shall use the new definition is its usefulness for the treatment of anomalies, which cause obstructions to gauging the Poisson–Lie symmetries. The traditional definition does not see the difference between anomalous and non-anomalous cases
while the new definition gives a very simple criterion to distinguish them. In what follows, we shall work exclusively with the new definition and we hope to convince the reader of its naturalness and usefulness.

Lemma 2.3. Every Poisson map $\mu : M \to B$ realizes the Poisson–Lie symmetry of M.

Proof. First recall that the map $\mu : M \to B$ is a Poisson morphism iff the dual map $\mu^* : Fun(B) \to Fun(M)$ satisfies
\[ \{\mu^*(x), \mu^*(y)\} = \mu^*(\{x, y\}_B), \qquad x, y \in Fun(B). \tag{2.5} \]
Now we take $x, y \in Fun(B)$ and calculate
\begin{align*}
[w_\mu(y), w_\mu(x)]f &= \{\{f, \mu^*(x')\}\mu^*(S(x'')), \mu^*(y')\}\,\mu^*(S(y'')) - \{\{f, \mu^*(y')\}\mu^*(S(y'')), \mu^*(x')\}\,\mu^*(S(x'')) \\
&= \{f, \{\mu^*(x'), \mu^*(y')\}\}\,\mu^*(S(x''))\,\mu^*(S(y'')) - \{f, \mu^*(x'y')\}\,\{\mu^*(S(x'')), \mu^*(S(y''))\} \\
&= \{f, \mu^*(\{x, y\}_B')\}\,\mu^*(S(\{x, y\}_B'')) \\
&= w_\mu(\{x, y\}_B)f.
\end{align*}
Going from the first to the second line we have used the Jacobi identity and the fact that $x'S(x'')$ is a number (the counit $\varepsilon(x)$ of x). We have passed from the second to the third line by using (2.2), (2.3a,b) and (2.5).
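As a consistency check of Definition 2.2 and Lemma 2.3 in the simplest setting, consider an Abelian cosymmetry group with linear coordinate functions $y_a$. The following sketch is ours, not the author's; the coordinates $y_a$ and the Kirillov–Kostant-type bracket with structure constants $f_{ab}{}^{c}$ are assumptions chosen for illustration:

```latex
\begin{align*}
\Delta y_a &= y_a \otimes 1 + 1 \otimes y_a, \qquad S(y_a) = -y_a, \qquad S(1) = 1, \\
w_\mu(y_a)f &= \{f, \mu^*(y_a)\}\,\mu^*(S(1)) + \{f, \mu^*(1)\}\,\mu^*(S(y_a))
             = \{f, \mu^*(y_a)\}, \\
[w_\mu(y_a), w_\mu(y_b)]f
  &= \{\{f, \mu^*(y_b)\}, \mu^*(y_a)\} - \{\{f, \mu^*(y_a)\}, \mu^*(y_b)\} \\
  &= -\{f, \{\mu^*(y_a), \mu^*(y_b)\}\}
   = -\{f, \mu^*(\{y_a, y_b\}_B)\}
   = -w_\mu(\{y_a, y_b\}_B)f ,
\end{align*}
```

where the Jacobi identity was used in the third line, the Poisson property (2.5) in the last one, and $\{y_a, y_b\}_B = f_{ab}{}^{c}\, y_c$ is assumed to be again linear. This reproduces the familiar fact that Hamiltonian vector fields of an ordinary moment map form an anti-homomorphic image of the Poisson algebra.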
2.3. Anomalous realizations
The Poisson–Lie symmetry can also be realized by a map $\mu : M \to B$ which is not a Poisson morphism. If this happens, we speak about an anomalous Poisson–Lie symmetry and we call µ an anomalous moment map. Anomalous moment maps naturally arise by twisting the Heisenberg doubles. The detailed exposition of this fact will be our next subject.

Definition 2.4. Let D be an even-dimensional Lie group equipped with a maximally Lorentzian bi-invariant metric. If $\mathrm{Lie}(D) = \mathrm{Lie}(G) \dot{+} \mathrm{Lie}(B)$, where G and B are maximally isotropic subgroups, D is called the Drinfeld double of G or the Drinfeld double of B. Let $\kappa$ be a metric preserving automorphism of D and suppose that there are respective bases $T^i$ and $t_i$ $(i = 1, \ldots, n)$ of $\mathcal{G} = \mathrm{Lie}(G)$ and
$\mathcal{B} = \mathrm{Lie}(B)$ such that
\[ (T^i, t_j)_D = \delta^i_j. \tag{2.6} \]
Then the (basis independent) expression
\[ \{f_1, f_2\}_D \equiv \nabla^R_{T^i} f_1 \nabla^R_{t_i} f_2 - \nabla^L_{\kappa(t_i)} f_1 \nabla^L_{\kappa(T^i)} f_2, \qquad f_1, f_2 \in Fun(D), \tag{2.7} \]
is a Poisson bracket and the Poisson manifold $(D, \{.,.\}_D)$ is called the twisted Heisenberg double.

Theorem 2.5. Let D be a twisted Heisenberg double which is also decomposable, i.e. such that two global unambiguous decompositions hold: $D = \kappa(B)G$ and $D = \kappa(G)B$. Consider (smooth) maps $\Lambda_L, \Lambda_R : D \to B$, $\Xi_R, \Xi_L : D \to G$ respectively induced by these two decompositions. Then the following holds: (a) The Poisson manifold $(D, \{.,.\}_D)$ is symplectic. (b) Both maps $\Lambda_L$ and $\Lambda_R$ realize global (anomalous) Poisson–Lie symmetries of the symplectic manifold $(D, \{.,.\}_D)$. The corresponding symmetry group is G acting as
\[ h \triangleright K = \kappa(h)K, \qquad h \in G, \quad K \in D, \tag{2.8a} \]
or, respectively, as
\[ h \triangleright K = Kh^{-1}, \qquad h \in G, \quad K \in D. \tag{2.8b} \]
Explanations. The symbol $\dot{+}$ stands for the direct sum of vector spaces only and not of Lie algebras. Bi-invariant means both left- and right-invariant. The non-degenerate bi-invariant metric on D obviously induces an Ad-invariant non-degenerate bilinear form $(.,.)_D$ on $\mathcal{D} = \mathrm{Lie}(D)$. An isotropic submanifold of D is one on which the induced metric vanishes. Maximally isotropic means that it is not contained in any bigger isotropic submanifold. The vector fields $\nabla^{L,R}_T$ are defined as
\[ \nabla^L_T f(K) \equiv \delta_T(f')f''(K) = \frac{d}{ds}\Big|_{s=0} f(e^{sT}K), \qquad \nabla^R_T f(K) \equiv \delta_T(f'')f'(K) = \frac{d}{ds}\Big|_{s=0} f(Ke^{sT}), \]
where $f \in Fun(D)$, $K \in D$, $T \in \mathrm{Lie}(D)$. Global unambiguous decomposition $D = \kappa(B)G$ means that for every element $K \in D$ there exist a unique $g = \Xi_R(K) \in G$ and a unique $b = \Lambda_L(K) \in B$ such that $K = \kappa(b)g^{-1}$. Similarly for $D = \kappa(G)B$: there exist a unique $\tilde{g} = \Xi_L(K) \in G$ and a unique $\tilde{b} = \Lambda_R(K) \in B$ such that $K = \kappa(\tilde{g})\tilde{b}^{-1}$. The fact that the formula (2.7) defines the Poisson bracket was
proved by Semenov-Tian-Shansky in [16] and, for completeness, we outline his argument here. Consider the (basis independent) element $c \in \mathcal{D} \otimes \mathcal{D}$ given by
\[ c = T^i \otimes t_i + t_i \otimes T^i. \]
It is easy to see that the Ad-invariance and κ-invariance of the bilinear form $(.,.)_D$ imply the Ad-invariance and κ-invariance of c. Thus the bracket (2.7) can be rewritten as
\[ \{f_1, f_2\}_D = \frac{1}{2} \nabla^R_{T^i} f_1 \nabla^R_{t_i} f_2 - \frac{1}{2} \nabla^R_{t_i} f_1 \nabla^R_{T^i} f_2 + \frac{1}{2} \nabla^L_{\kappa(T^i)} f_1 \nabla^L_{\kappa(t_i)} f_2 - \frac{1}{2} \nabla^L_{\kappa(t_i)} f_1 \nabla^L_{\kappa(T^i)} f_2. \]
Note that two elements of $\mathcal{D} \wedge \mathcal{D}$ appear in this bracket:
\[ r_D = \frac{1}{2} T^i \otimes t_i - \frac{1}{2} t_i \otimes T^i, \qquad r_D^\kappa = \frac{1}{2} \kappa(T^i) \otimes \kappa(t_i) - \frac{1}{2} \kappa(t_i) \otimes \kappa(T^i). \]
It can be shown by direct calculation that the algebraic Schouten bracket $[r_D, r_D]_S$ (cf. [9, Eqs. (4.36)–(4.39)]) gives an invariant element of $\wedge^3 \mathcal{D}$ and, moreover, $[r_D^\kappa, r_D^\kappa]_S = [r_D, r_D]_S$. These facts imply that the Semenov-Tian-Shansky bracket (2.7) satisfies the Jacobi identity. Let us finish the Explanations by saying that the list of decomposable doubles is not very long. The typical examples are the cotangent bundle $T^*G$ of any Lie group G, the complexification $G^{\mathbb{C}}$ of a compact (loop) group G and certain Drinfeld twists of the first two items. Nevertheless, a separate theorem dealing with decomposable doubles is useful for two reasons. The first is the range of applicability: many solvable quantum theories have a compact (quantum) group symmetry and are, in one way or another, based on the short list of decomposable doubles. The other reason is that the notion of the Poisson–Lie symmetry is traditionally globally defined and the decomposable doubles lead to global Poisson–Lie symmetries. Let us stress, however, that local Poisson–Lie symmetries must be considered equally seriously (for instance, the conformal symmetry in field theory is only local but physically relevant). This is the reason why we devote Sec. 3.3 to non-decomposable doubles, where the number of examples is much larger.

Proof of Theorem 2.5. (a) Consider a point $K \in D$ and four linear subspaces of the tangent space $T_K D$ defined as $S_L = L_{K*}\mathcal{G}$, $S_R = R_{K*}\kappa(\mathcal{G})$, $\tilde{S}_L = L_{K*}\mathcal{B}$ and $\tilde{S}_R = R_{K*}\kappa(\mathcal{B})$. (The symbols $L_{K*}$ and $R_{K*}$ stand for left and right transport on the group D, respectively.) The existence of the global decompositions $D = \kappa(B)G$ and $D = \kappa(G)B$ means that at every $K \in D$ the tangent space $T_K D$ can be decomposed as $T_K D = S_L \dot{+} \tilde{S}_R$ and $T_K D = \tilde{S}_L \dot{+} S_R$, respectively. This fact makes it
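possible to introduce a projector $\Pi_{L\tilde{R}}$ on $\tilde{S}_R$ with kernel $S_L$ and a projector $\Pi_{\tilde{L}R}$ on $S_R$ with kernel $\tilde{S}_L$. At every point $K \in D$ we can therefore define the following 2-form ω:
\[ \omega(t, u) = (t, (\Pi_{\tilde{L}R} - \Pi_{L\tilde{R}})u)_D, \tag{2.9} \]
where t, u are arbitrary vectors in $T_K D$ and $(.,.)_D$ is the bi-invariant metric at the point K (it is obtained by the left or right transport of the Ad-invariant bilinear form $(.,.)_D$ defined at the unit element $E \in D$). Let us show that ω is the symplectic form corresponding to the Poisson structure $\{.,.\}_D$. First of all we remark that the Poisson bivector (= contravariant antisymmetric tensor) corresponding to the Poisson bracket $\{.,.\}_D$ reads
\[ \alpha = L_{K*}(T^i \otimes t_i) - R_{K*}(\kappa(t_i) \otimes \kappa(T^i)). \tag{2.10} \]
Introduce two more projectors $\Pi_{R\tilde{R}}$, $\Pi_{\tilde{L}L}$, where the first subscript stands for the kernel and the second for the image. Then we conclude
\begin{align*}
\alpha(., \omega(., u)) &= L_{K*}T^i\,(L_{K*}t_i, (\Pi_{\tilde{L}R} - \Pi_{L\tilde{R}})u)_D - R_{K*}\kappa(t_i)\,(R_{K*}\kappa(T^i), (\Pi_{\tilde{L}R} - \Pi_{L\tilde{R}})u)_D \\
&= (\Pi_{\tilde{L}L} - \Pi_{R\tilde{R}})(\Pi_{\tilde{L}R} - \Pi_{L\tilde{R}})u = u. \tag{2.11}
\end{align*}
The last equality uses the identities $\Pi_{\tilde{L}L}\Pi_{\tilde{L}R} = \Pi_{\tilde{L}L}$, $\Pi_{\tilde{L}L}\Pi_{L\tilde{R}} = \Pi_{\tilde{L}L} - 1 + \Pi_{L\tilde{R}}$, $\Pi_{R\tilde{R}}\Pi_{\tilde{L}R} = 0$ and $\Pi_{R\tilde{R}}\Pi_{L\tilde{R}} = \Pi_{L\tilde{R}}$, which follow from the kernels and images listed above.

Proof of (b). Consider a bracket $\{.,.\}_B$ on the cosymmetry group B given by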
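\[ \{x, y\}_B(b) = -(T^i, \mathrm{Ad}_b T^k)_D\, (\nabla^L_{t_i} x)(b)\, (\nabla^R_{t_k} y)(b), \qquad b \in B, \quad x, y \in Fun(B). \tag{2.12} \]
It was shown in [9, Proposition 4.5] that $\{.,.\}_B$ is a Poisson–Lie bracket on B. We shall prove that
\[ \{\Lambda_L^*(x), \Lambda_L^*(y)\}_D = \Lambda_L^*(\{x, y\}_B - M_\kappa^{ij} \nabla^R_{t_i} x \nabla^R_{t_j} y), \qquad x, y \in Fun(B), \tag{2.13a} \]
\[ \{\Lambda_R^*(x), \Lambda_R^*(y)\}_D = \Lambda_R^*(\{x, y\}_B - M_{\kappa^{-1}}^{ij} \nabla^R_{t_i} x \nabla^R_{t_j} y), \qquad x, y \in Fun(B), \tag{2.13b} \]
where the constant antisymmetric matrix $M_\kappa^{ij}$ is given by
\[ M_\kappa = Q_\kappa P_\kappa^{-1}, \qquad (P_\kappa)_i{}^j = (\kappa(t_i), T^j)_D, \qquad Q_\kappa^{ij} = (\kappa(T^i), T^j)_D. \tag{2.14} \]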
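A quick sanity check of (2.14), added here as our own illustration rather than as part of the original argument, is the case of the trivial twist, where the anomaly should be absent. It uses only the isotropy of $\mathcal{G}$, the duality (2.6) and the metric-preserving property of κ:

```latex
\begin{align*}
\kappa = \mathrm{id}: \qquad
  Q^{ij}_{\mathrm{id}} &= (T^i, T^j)_D = 0, \qquad
  (P_{\mathrm{id}})_i{}^{j} = (t_i, T^j)_D = \delta_i^j, \qquad
  M_{\mathrm{id}} = Q_{\mathrm{id}} P_{\mathrm{id}}^{-1} = 0, \\
\kappa(\mathcal{G}) \subset \mathcal{G}: \qquad
  Q^{ij}_{\kappa} &= (\kappa(T^i), T^j)_D = 0
  \quad\Longrightarrow\quad M_\kappa = 0, \\
\text{in general}: \qquad
  Q^{ij}_{\kappa^{-1}} &= (\kappa^{-1}(T^i), T^j)_D = (T^i, \kappa(T^j))_D = Q^{ji}_{\kappa}.
\end{align*}
```

So $Q_\kappa$ and $Q_{\kappa^{-1}}$ vanish together, and both anomaly terms in (2.13a) and (2.13b) drop out whenever κ preserves the isotropic subalgebra $\mathcal{G}$.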
We note that the non-degeneracy of $(.,.)_D$ and also the global decomposabilities $D = \kappa(B)G = \kappa(G)B$ guarantee that both matrices $P_\kappa$ and $P_{\kappa^{-1}}$ are invertible. In order to calculate the bracket $\{\Lambda_L^*(x), \Lambda_L^*(y)\}_D$, we use the defining formula (2.7). We first realize that
\[ \nabla^R_{T^i} \Lambda_L^*(x) = \frac{d}{ds}\Big|_{s=0} x(\Lambda_L(Ke^{sT^i})) = 0 \tag{2.15} \]
and then we write
\begin{align*}
\{\Lambda_L^*(x), \Lambda_L^*(y)\}_D &= -\nabla^L_{\kappa(t_i)} \Lambda_L^*(x)\, \nabla^L_{\kappa(T^i)} \Lambda_L^*(y) \\
&= -\frac{d}{ds_1}\Big|_{s_1=0} x(\Lambda_L(e^{s_1\kappa(t_i)}K))\, \frac{d}{ds_2}\Big|_{s_2=0} y(\Lambda_L(e^{s_2\kappa(T^i)}K)) \\
&= -\Lambda_L^*({}^B\nabla^L_{t_i} x)\, \frac{d}{ds_2}\Big|_{s_2=0} y(\Lambda_L(e^{s_2\kappa(T^i)}\kappa(\Lambda_L(K)))) \\
&= -\Lambda_L^*({}^B\nabla^L_{t_i} x)\, \frac{d}{ds}\Big|_{s=0} y(\Lambda_L(\kappa[\Lambda_L(K)\exp(s\Lambda_L^{-1}(K)T^i\Lambda_L(K))])). \tag{2.16}
\end{align*}
We note that
\[ \Lambda_L^{-1}(K)T^i\Lambda_L(K) = (\Lambda_L^{-1}(K)T^i\Lambda_L(K), t_k)_D\, T^k + (\Lambda_L^{-1}(K)T^i\Lambda_L(K), T^k)_D\, t_k. \]
This identity permits us to rewrite the right-hand side of (2.16) as the sum of two terms, $\{\Lambda_L^*(x), \Lambda_L^*(y)\}_D = V_1 + V_2$, where
\[ V_1 = -(\Lambda_L^{-1}(K)T^i\Lambda_L(K), T^k)_D\, \Lambda_L^*({}^B\nabla^L_{t_i} x)\, \Lambda_L^*({}^B\nabla^R_{t_k} y) = \Lambda_L^*(\{x, y\}_B) \]
and
\begin{align*}
V_2 &= -(\Lambda_L^{-1}(K)T^i\Lambda_L(K), t_k)_D\, \Lambda_L^*({}^B\nabla^L_{t_i} x)\, \frac{d}{ds}\Big|_{s=0} y(\Lambda_L(\kappa[\Lambda_L(K)\exp(sT^k)])) \\
&= -\Lambda_L^*({}^B\nabla^R_{t_k} x)\, \frac{d}{ds}\Big|_{s=0} y(\Lambda_L(\kappa[\Lambda_L(K)\exp(s\tau^k)])) \\
&= -\Lambda_L^*({}^B\nabla^R_{t_k} x)\, \Lambda_L^*({}^B\nabla^R_{\tau^k} y).
\end{align*}
The element $\tau^k \in \mathcal{B}$ is defined by the $D = \kappa(B)G$ decomposition
\[ \kappa(T^k) = \kappa(\tau^k) + c^k, \qquad c^k \in \mathcal{G}. \]
B
∇R ti x
B
∇R tj y),
which is nothing but (2.13a). The identity (2.13b) can be proved in a similar way. We note also that our notation has distinguished the invariant derivatives on $Fun(D)$ and on $Fun(B)$ (the derivatives on $Fun(B)$ were denoted as ${}^B\nabla^{R,L}$). We shall not make this distinction in what follows and we leave it to the reader to understand from the context on which space $\nabla^{R,L}$ act. In the case where the twisting automorphism is trivial (i.e. κ is the identity), the anomaly matrices $M_\kappa$, $M_{\kappa^{-1}}$ vanish and $\Lambda_{L,R} : D \to B$ are Poisson maps. From Lemma 2.3 it then follows that $\Lambda_{L,R} : D \to B$ realize Poisson–Lie symmetries of D. Let us show now that in the case of non-trivial twisting the maps $\Lambda_{L,R} : D \to B$ also realize Poisson–Lie symmetries although they are not Poisson morphisms. For this, we first recall the definition (2.4) of the map $w_{\Lambda_L} : Fun(B) \to Vect(D)$:
\[ w_{\Lambda_L}(x)f = \{f, \Lambda_L^*(x')\}_D\, \Lambda_L^*(S(x'')), \qquad x \in Fun(B), \quad f \in Fun(D). \]
We calculate
\begin{align*}
[w_{\Lambda_L}(y), w_{\Lambda_L}(x)]f &\equiv (w_{\Lambda_L}(y)w_{\Lambda_L}(x) - w_{\Lambda_L}(x)w_{\Lambda_L}(y))f \\
&= \{\{f, \Lambda_L^*(x')\}_D\, \Lambda_L^*(S(x'')), \Lambda_L^*(y')\}_D\, \Lambda_L^*(S(y'')) - (x \leftrightarrow y) \\
&= \{\{f, \Lambda_L^*(x')\}_D, \Lambda_L^*(y')\}_D\, \Lambda_L^*(S(x''y'')) + \{f, \Lambda_L^*(x')\}_D\, \{\Lambda_L^*(S(x'')), \Lambda_L^*(y')\}_D\, \Lambda_L^*(S(y'')) - (x \leftrightarrow y) \\
&= \{f, \{\Lambda_L^*(x'), \Lambda_L^*(y')\}_D\}_D\, \Lambda_L^*(S(x''y'')) - \{f, \Lambda_L^*(x'y')\}_D\, \{\Lambda_L^*(S(x'')), \Lambda_L^*(S(y''))\}_D.
\end{align*}
Now we use the formula (2.13a) and the Poisson–Lie property (2.2) of the bracket $\{.,.\}_B$ to obtain
\begin{align*}
[w_{\Lambda_L}(y), w_{\Lambda_L}(x)]f &= \{f, \Lambda_L^*(\{x', y'\}_B)\}_D\, \Lambda_L^*(S(x''y'')) - \{f, \Lambda_L^*(x'y')\}_D\, \Lambda_L^*(\{S(x''), S(y'')\}_B) \\
&\quad - M_\kappa^{ij} \bigl( \{f, \Lambda_L^*(\nabla^R_{t_i} x' \nabla^R_{t_j} y')\}_D\, \Lambda_L^*(S(x''y'')) - \{f, \Lambda_L^*(x'y')\}_D\, \Lambda_L^*(\nabla^R_{t_i} S(x'') \nabla^R_{t_j} S(y'')) \bigr).
\end{align*}
The last line of this expression vanishes due to the following identities:
\[ (\nabla^R_{t_l} y')S(y'') + y' \nabla^R_{t_l} S(y'') = \nabla^R_{t_l}(y' S(y'')) = 0, \]
\[ (\nabla^R_{t_l} \nabla^L_{t_i} x')S(x'') + \nabla^L_{t_i} x'\, \nabla^R_{t_l} S(x'') = \nabla^R_{t_l}(\nabla^L_{t_i} x'\, S(x'')) = 0 \]
and (using (2.7))
\[ \{f, \Lambda_L^*(\nabla^R_{t_l} x')\}_D\, \Lambda_L^*(S(x'')) + \{f, \Lambda_L^*(x')\}_D\, \Lambda_L^*(\nabla^R_{t_l} S(x'')) = \nabla^L_{\kappa(T^i)} f\, \Lambda_L^*\bigl((\nabla^R_{t_l} \nabla^L_{t_i} x')S(x'') + \nabla^L_{t_i} x'\, \nabla^R_{t_l} S(x'')\bigr) = 0. \]
Now we use the Poisson–Lie properties (2.2), and (2.3) to arrive at [wΛL (y), wΛL (x)]f = {f, Λ∗L({x , y }B )}D Λ∗L (S(x y )) + {f, Λ∗L (x y )}D Λ∗L (S({x , y }B )) = wΛL ({x, y}B )f. According to the Definition 2.2, the map ΛL thus realizes the Poisson–Lie symmetry of D. Much in the same way, we obtain also [wΛR (x), wΛR (y)]f = wΛR ({x, y}B )f, where wΛR (x)f = {f, Λ∗R (x )}D Λ∗R (S(x )),
x ∈ F un(B),
f ∈ F un(D).
Having established that both maps $w_{\Lambda_L}, w_{\Lambda_R}: \mathrm{Fun}(B) \to \mathrm{Vect}(D)$ are Lie algebra homomorphisms (i.e. that both $\Lambda_L, \Lambda_R: D \to B$ realize Poisson–Lie symmetries), it remains to find the corresponding symmetry groups. We use (2.7) and (2.1) to obtain
\[
w_{\Lambda_L}(y)f = \{f, \Lambda_L^*(y')\}_D\, \Lambda_L^*(S(y'')) = \nabla^L_{\kappa(T^i)}f\, \Lambda_L^*((\nabla^L_{t_i}y')S(y'')) = \delta_{t_i}(y)\, \nabla^L_{\kappa(T^i)}f. \tag{2.17a}
\]
We recall that $\delta_{t_i}$ is the $\varepsilon$-derivative (cf. Sec. 2.1), hence $\delta_{t_i}(y)$ is a real number for every $i$. It therefore follows that $\mathrm{Im}(w_{\Lambda_L}) = \kappa(\mathcal{G})$ and we have proved (2.8a). Similarly, we obtain
\[
w_{\Lambda_R}(y)f = -\delta_{t_i}(y)\, \nabla^R_{T^i}f, \tag{2.17b}
\]
which proves (2.8b).

3. Non-Anomalous Moment Maps

Non-anomalous Poisson–Lie symmetries play a very important role in symplectic geometry since they make it possible to perform the so-called symplectic reduction (or “gauging” in the terminology of physicists). However, given a decomposable twisted Heisenberg double $(D, \kappa)$, the basic moment maps $\Lambda_L$, $\Lambda_R$ are generically anomalous and cannot be gauged. Indeed, the anomaly matrices $M_\kappa^{ij}$, $M_{\kappa^{-1}}^{ij}$ vanish only in the case where the twisting automorphism $\kappa$ preserves the symmetry group $G$ (cf. (2.14)). In this section, we shall look for other moment maps (distinct from $\Lambda_L$, $\Lambda_R$) which would allow us to gauge $(D, \kappa)$. It turns out that the existence of non-anomalous Poisson–Lie moment maps associated to the twisted Heisenberg double depends heavily on the details of the structure of $(D, \kappa)$. In the three following subsections, we shall discuss three interesting cases where non-anomalous moment maps can be constructed. We shall keep the exposition of the first two cases (a quasi-adjoint action and a proper subsymmetry) at an abstract level since
the concrete examples will be discussed in the subsequent Sec. 4. However, we shall illustrate the third case (an improper subsymmetry) already in this Sec. 3, since we shall not consider it later.

3.1. Quasi-adjoint action

In this subsection, we shall consider the decomposable twisted Heisenberg doubles for which the twisting automorphism $\kappa$ preserves the cosymmetry group $B$. We have the following theorem:

Theorem 3.1. Let $D$ be a decomposable twisted Heisenberg double such that the twisting automorphism $\kappa$ preserves the subgroup $B$. Consider the anomalous moment maps $\Lambda_L$, $\Lambda_R$ and construct two new maps $B_L: D \to B$ and $B_R: D \to B$ as follows
\[
B_L(K) = \kappa(\Lambda_L(K))\Lambda_R(K), \qquad B_R(K) = \kappa^{-1}(\Lambda_R(K))\Lambda_L(K), \qquad K \in D.
\]
Then it holds: Both maps $B_L$ and $B_R$ are Poisson and they realize global non-anomalous Poisson–Lie symmetries of $(D, \{.,.\}_D)$. The corresponding symmetry group is $G$ acting as
\[
h \triangleright K = \kappa(h)K\,\Xi_R(\kappa[h\Lambda_L(K)]), \qquad h \in G, \quad K \in D,
\]
or, respectively, as
\[
h \triangleright K = \kappa[\Xi_L^{-1}(\Lambda_R^{-1}(K)h^{-1})]Kh^{-1}, \qquad h \in G, \quad K \in D.
\]
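Proof. Consider two functions $x, y \in \mathrm{Fun}(B)$. We know already that it holds
\[
\{\Lambda_L^*(x), \Lambda_L^*(y)\}_D = \Lambda_L^*(\{x, y\}_B - M_\kappa^{ij}\, \nabla^R_{t_i}x\, \nabla^R_{t_j}y), \qquad x, y \in \mathrm{Fun}(B), \tag{2.13a}
\]
\[
\{\Lambda_R^*(x), \Lambda_R^*(y)\}_D = \Lambda_R^*(\{x, y\}_B - M_{\kappa^{-1}}^{ij}\, \nabla^R_{t_i}x\, \nabla^R_{t_j}y), \qquad x, y \in \mathrm{Fun}(B), \tag{2.13b}
\]
where the Poisson–Lie bracket $\{.,.\}_B$ and the matrices $M_\kappa$, $M_{\kappa^{-1}}$ were defined in (2.12) and in (2.14), respectively. Introduce maps $\Gamma_L: D \to B$, $\Gamma_R: D \to B$ by
\[
\Gamma_L(K) = \kappa(\Lambda_L(K)), \qquad \Gamma_R(K) = \kappa^{-1}(\Lambda_R(K)), \qquad K \in D,
\]
hence $B_L = \Gamma_L\Lambda_R$ and $B_R = \Gamma_R\Lambda_L$. We shall now prove that
\[
\{\Gamma_L^*(x), \Gamma_L^*(y)\}_D = \Gamma_L^*(\{x, y\}_B + M_{\kappa^{-1}}^{ij}\, \nabla^L_{t_i}x\, \nabla^L_{t_j}y), \qquad x, y \in \mathrm{Fun}(B), \tag{3.1a}
\]
\[
\{\Gamma_R^*(x), \Gamma_R^*(y)\}_D = \Gamma_R^*(\{x, y\}_B + M_\kappa^{ij}\, \nabla^L_{t_i}x\, \nabla^L_{t_j}y), \qquad x, y \in \mathrm{Fun}(B). \tag{3.1b}
\]
First we remark that
\[
(\nabla^R_{T^i}\Lambda_L^*(x))(K) = \left.\frac{d}{ds}\, x(\Lambda_L(Ke^{sT^i}))\right|_{s=0} = 0, \qquad K \in D,
\]
\[
(\nabla^L_{\kappa(T^i)}\Gamma_R^*(y))(K) = \left.\frac{d}{ds}\, y(\kappa^{-1}(\Lambda_R(e^{s\kappa(T^i)}K)))\right|_{s=0} = 0, \qquad K \in D.
\]
Thus, using the fundamental definition (2.7), we obtain
\[
\{\Lambda_L^*(x), \Gamma_R^*(y)\}_D = 0
\]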
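and
\[
\begin{aligned}
\{\Gamma_R^*(x), \Gamma_R^*(y)\}_D
&= \left.\frac{d}{ds_1}\, x(\kappa^{-1}(\Lambda_R(Ke^{s_1T^i})))\right|_{s_1=0}\, \left.\frac{d}{ds_2}\, y(\kappa^{-1}(\Lambda_R(Ke^{s_2t_i})))\right|_{s_2=0} \\
&= -\left.\frac{d}{ds}\, x(\kappa^{-1}(\Lambda_R(Ke^{sT^i})))\right|_{s=0}\, \Gamma_R^*(\nabla^L_{\kappa^{-1}(t_i)}y) \\
&= \Gamma_R^*\bigl((b^{-1}\kappa^{-1}(T^i)b, T^j)_D\, \nabla^R_{t_j}x\, \nabla^L_{\kappa^{-1}(t_i)}y\bigr) \\
&= \Gamma_R^*\bigl([(b^{-1}T^ib, T^j)_D - (T^i, \kappa^{-1}(T^m))_D\,(bT^jb^{-1}, t_l)_D\,(T^l, \kappa^{-1}(t_m))_D]\, \nabla^R_{t_j}x\, \nabla^L_{t_i}y\bigr) \\
&= \Gamma_R^*\bigl(\{x, y\}_B + (T^i, \kappa^{-1}(T^m))_D\,(\kappa^{-1}(t_m), T^j)_D\, \nabla^L_{t_i}x\, \nabla^L_{t_j}y\bigr) \\
&= \Gamma_R^*(\{x, y\}_B + M_\kappa^{ij}\, \nabla^L_{t_i}x\, \nabla^L_{t_j}y).
\end{aligned} \tag{3.1b}
\]
We note that $b \in B$ in this formula denotes the argument of functions in $\mathrm{Fun}(B)$. Similarly, we can prove that $\{\Lambda_R^*(x), \Gamma_L^*(y)\}_D = 0$ and
\[
\{\Gamma_L^*(x), \Gamma_L^*(y)\}_D = \Gamma_L^*(\{x, y\}_B + M_{\kappa^{-1}}^{ij}\, \nabla^L_{t_i}x\, \nabla^L_{t_j}y), \qquad x, y \in \mathrm{Fun}(B). \tag{3.1a}
\]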
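Now we calculate
\[
\begin{aligned}
\{B_L^*(x), B_L^*(y)\}_D &= \{\Gamma_L^*(x')\Lambda_R^*(x''), \Gamma_L^*(y')\Lambda_R^*(y'')\}_D \\
&= \{\Gamma_L^*(x'), \Gamma_L^*(y')\}_D\, \Lambda_R^*(x'')\Lambda_R^*(y'') + \Gamma_L^*(x')\Gamma_L^*(y')\, \{\Lambda_R^*(x''), \Lambda_R^*(y'')\}_D \\
&= \Gamma_L^*(\{x', y'\}_B + M_{\kappa^{-1}}^{ij}\, \nabla^L_{t_i}x'\, \nabla^L_{t_j}y')\, \Lambda_R^*(x'')\Lambda_R^*(y'') \\
&\quad + \Gamma_L^*(x')\Gamma_L^*(y')\, \Lambda_R^*(\{x'', y''\}_B - M_{\kappa^{-1}}^{ij}\, \nabla^R_{t_i}x''\, \nabla^R_{t_j}y'') \\
&= B_L^*(\{x, y\}_B + M_{\kappa^{-1}}^{ij}\, \nabla^L_{t_i}x\, \nabla^L_{t_j}y - M_{\kappa^{-1}}^{ij}\, \nabla^R_{t_i}x\, \nabla^R_{t_j}y).
\end{aligned} \tag{3.2a}
\]
Similarly, we obtain
\[
\{B_R^*(x), B_R^*(y)\}_D = B_R^*(\{x, y\}_B + M_\kappa^{ij}\, \nabla^L_{t_i}x\, \nabla^L_{t_j}y - M_\kappa^{ij}\, \nabla^R_{t_i}x\, \nabla^R_{t_j}y). \tag{3.2b}
\]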
The reader may be surprised by the presence of the anomaly matrices $M_\kappa$, $M_{\kappa^{-1}}$ in the resulting formulas (3.2a) and (3.2b). Did we not promise that the moment maps $B_L$, $B_R$ realize non-anomalous Poisson–Lie symmetries? The point is the following: if the twisting automorphism $\kappa$ preserves the cosymmetry group $B$, then there are three natural Poisson–Lie brackets on $\mathrm{Fun}(B)$. The first one is evident; it is given by the formula (2.12) of Sec. 2.3:
\[
\{x, y\}_B(b) = -(T^i, \mathrm{Ad}_bT^k)_D\, (\nabla^L_{t_i}x)(b)\, (\nabla^R_{t_k}y)(b), \qquad b \in B, \quad x, y \in \mathrm{Fun}(B).
\]
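The second and the third bracket are defined by
\[
\{x, y\}^{\kappa}_B(b) = -(\kappa(T^i), \mathrm{Ad}_b\kappa(T^k))_D\, (\nabla^L_{\kappa(t_i)}x)(b)\, (\nabla^R_{\kappa(t_k)}y)(b), \tag{3.3a}
\]
\[
\{x, y\}^{\kappa^{-1}}_B(b) = -(\kappa^{-1}(T^i), \mathrm{Ad}_b\kappa^{-1}(T^k))_D\, (\nabla^L_{\kappa^{-1}(t_i)}x)(b)\, (\nabla^R_{\kappa^{-1}(t_k)}y)(b). \tag{3.3b}
\]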
It is easy to understand why the brackets (3.3a) and (3.3b) satisfy the Jacobi identity and the Poisson–Lie property (2.2): they appear on the same footing as the original bracket (2.12). Indeed, the double $D$ is not only the double of the pair of groups $G$ and $B$, but it is also the double of the pair $\kappa(G)$ and $\kappa(B) = B$ and of the pair $\kappa^{-1}(G)$ and $\kappa^{-1}(B) = B$. Each of the three pairs generates its respective basis $T^i, t_i$; $\kappa(T^i), \kappa(t_i)$ and $\kappa^{-1}(T^i), \kappa^{-1}(t_i)$, all three bases sharing the crucial duality property (2.6). The brackets (3.3a) and (3.3b) can be worked out in the basis $t_i$ instead of $\kappa(t_i)$ or $\kappa^{-1}(t_i)$. We use the obvious identities
\[
\kappa(t_i) = (\kappa(t_i), T^m)_D\, t_m, \qquad \kappa^{-1}(t_i) = (\kappa^{-1}(t_i), T^m)_D\, t_m
\]
and we find
\[
\{x, y\}^{\kappa}_B = \{x, y\}_B + M_{\kappa^{-1}}^{ij}\, \nabla^L_{t_i}x\, \nabla^L_{t_j}y - M_{\kappa^{-1}}^{ij}\, \nabla^R_{t_i}x\, \nabla^R_{t_j}y,
\]
\[
\{x, y\}^{\kappa^{-1}}_B = \{x, y\}_B + M_\kappa^{ij}\, \nabla^L_{t_i}x\, \nabla^L_{t_j}y - M_\kappa^{ij}\, \nabla^R_{t_i}x\, \nabla^R_{t_j}y.
\]
This permits us to rewrite (3.2a) and (3.2b) as
\[
\{B_L^*(x), B_L^*(y)\}_D = B_L^*(\{x, y\}^{\kappa}_B), \qquad \{B_R^*(x), B_R^*(y)\}_D = B_R^*(\{x, y\}^{\kappa^{-1}}_B).
\]
We thus conclude that the moment maps $B_L$ and $B_R$ are indeed non-anomalous with respect to the Poisson–Lie brackets (3.3a) and (3.3b), respectively.

Every Poisson–Lie moment map $\mu$ generates an action of the Lie algebra $\mathcal{G}$ and, in good cases, this $\mathcal{G}$-action can be lifted to an action of the symmetry group $G$. Let us now show that the moment maps $B_L$, $B_R$ are such “good” cases, yielding global non-anomalous Poisson–Lie symmetries. The following exposition uses some standard conventions concerning Hopf algebra calculations (see [8]); namely, the repeated application of the coproduct is written as
\[
(\Delta \otimes \mathrm{Id} \otimes \mathrm{Id})(\Delta \otimes \mathrm{Id})\Delta(x) \equiv x' \otimes x'' \otimes x''' \otimes x'''', \qquad x \in \mathrm{Fun}(B).
\]
The reader has certainly noticed that this is the generalization of the Sweedler notation introduced in Sec. 2.1. Consider first a set of functions $x^i \in \mathrm{Fun}(B)$ which is dual to the basis $t_i$ of $\mathcal{B} = \mathrm{Lie}(B)$, i.e. it holds $\delta_{t_j}(x^i) = \delta^i_j$, where $\delta_{t_j}$ are the $\varepsilon$-derivatives. We denote by $\kappa(x^i)$ the functions on $B$ of the form
\[
\kappa(x^i)(b) = x^i(\kappa(b)), \qquad b \in B.
\]
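We are going to make explicit the basic map $w_{B_L}: \mathrm{Fun}(B) \to \mathrm{Vect}(D)$ expressing the action of $\mathrm{Lie}(G)$ on $f \in \mathrm{Fun}(D)$ (cf. (2.4)):
\[
\begin{aligned}
w_{B_L}(\kappa^{-1}(x^i))f &= \{f, B_L^*(\kappa^{-1}((x^i)'))\}_D\, B_L^*(S(\kappa^{-1}((x^i)''))) \\
&= \{f, \Gamma_L^*(\kappa^{-1}((x^i)'))\Lambda_R^*(\kappa^{-1}((x^i)''))\}_D\, \Gamma_L^*(S(\kappa^{-1}((x^i)''''))) \times \Lambda_R^*(S(\kappa^{-1}((x^i)'''))) \\
&= \nabla^L_{\kappa(T^i)}f - \delta_{t_k}(\kappa^{-1}((x^i)''))\, \Gamma_L^*(\kappa^{-1}((x^i)')) \times \Gamma_L^*(S(\kappa^{-1}((x^i)''')))\, \nabla^R_{T^k}f \\
&= \nabla^L_{\kappa(T^i)}f - (\Gamma_L(K)\,t_k\,\Gamma_L^{-1}(K), \kappa(T^i))_D\, \nabla^R_{T^k}f \\
&= \nabla^L_{\kappa(T^i)}f - (\Lambda_L(K)\,\kappa^{-1}(t_k)\,\Lambda_L^{-1}(K), T^i)_D\, \nabla^R_{T^k}f.
\end{aligned}
\]
Similarly, we obtain
\[
\begin{aligned}
w_{B_R}(\kappa(x^i))f &= \{f, B_R^*(\kappa((x^i)'))\}_D\, B_R^*(S(\kappa((x^i)''))) \\
&= -\nabla^R_{T^i}f + (\Lambda_R(K)\,\kappa(t_k)\,\Lambda_R^{-1}(K), T^i)_D\, \nabla^L_{\kappa(T^k)}f.
\end{aligned}
\]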
Note that $K \in D$ stands for the argument of the functions from $\mathrm{Fun}(D)$. The actions of the Lie algebra $\mathcal{G}$ can be lifted to actions of the group $G$. The corresponding formulae can be written in a compact form by using the maps defined by the global decompositions $D = \kappa(G)B$ and $D = \kappa(B)G$. In addition to the maps $\Lambda_L, \Lambda_R: D \to B$ we have also the maps $\Xi_L, \Xi_R: D \to G$, respectively, defined by $K = \kappa(\Xi_L(K))\Lambda_R^{-1}(K)$ and $K = \kappa(\Lambda_L(K))\Xi_R^{-1}(K)$, $K \in D$. The actions of $\mathcal{G}$ on $D$ via the vector fields $w_{B_L}(\kappa^{-1}(x^i))$ and $w_{B_R}(\kappa(x^i))$ are then, respectively, lifted to the $G$-actions as follows:
\[
h \triangleright K = \kappa(h)K\,\Xi_R(\kappa[h\Lambda_L(K)]), \qquad h \in G, \ K \in D, \tag{3.4a}
\]
\[
h \triangleright K = \kappa[\Xi_L^{-1}(\Lambda_R^{-1}(K)h^{-1})]Kh^{-1}, \qquad h \in G, \ K \in D. \tag{3.4b}
\]
It is easy to verify that, in both cases, it holds: $(h_1h_2) \triangleright K = h_1 \triangleright (h_2 \triangleright K)$. In particular, when the cosymmetry group $B$ is Abelian, the $G$-actions induced by the moment maps $B_L$ and $B_R$ coincide and give nothing but the twisted adjoint action of $G$ on $D$ (i.e., $h \triangleright K = \kappa(h)Kh^{-1}$, $h \in G$, $K \in D$). This fact, which will be proved in Sec. 4, justifies our terminology “quasi-adjoint” action for the case of non-Abelian cosymmetry groups.

3.2. Proper subsymmetry

In the case of the standard Hamiltonian symmetry, every subgroup $H$ of the symmetry group $G$ also realizes a Hamiltonian symmetry. In the general Poisson–Lie
context (anomalous or not), such a statement is generically false. A natural question then arises: which subgroups of $G$ are themselves Poisson–Lie symmetry groups? We are going to answer this question and we also determine the corresponding moment maps.

Theorem 3.2. Let $D$ be a decomposable twisted Heisenberg double, $\kappa$ an automorphism of $D$ preserving $B$ and $N$ a normal subgroup of $B$. Denote by $C$ the factor group $B/N$, by $\rho$ the natural homomorphism $B \to C$ and by $P_\kappa: \mathrm{Lie}(D) \to \mathrm{Lie}(B)$ the projector on $\mathrm{Lie}(B)$ with kernel $\kappa(\mathrm{Lie}(G))$. Suppose that the Hopf subalgebra $\rho^*(\mathrm{Fun}(C))$ of $\mathrm{Fun}(B)$ is also a Poisson subalgebra. Then it holds: The composed map $\nu_R \equiv \rho \circ \Lambda_R$ realizes a Poisson–Lie symmetry of $D$ and the corresponding symmetry group $H$ is a subgroup of $G$. If, moreover, $P_\kappa(\mathrm{Lie}(H)) \subset \mathrm{Lie}(N)$, then the moment map $\nu_R$ is non-anomalous.

Proof. The Poisson–Lie bracket on $\mathrm{Fun}(B)$ naturally induces a Poisson–Lie bracket on $\mathrm{Fun}(C)$ because $\rho^*(\mathrm{Fun}(C))$ is a Poisson subalgebra of $\mathrm{Fun}(B)$. Thus
\[
\{\rho^*(u), \rho^*(v)\}_B = \rho^*(\{u, v\}_C), \qquad u, v \in \mathrm{Fun}(C).
\]
Now define
\[
w_{\nu_R}(u)f \equiv \{f, \nu_R^*(u')\}_D\, \nu_R^*(S_C(u'')), \qquad u \in \mathrm{Fun}(C), \quad f \in \mathrm{Fun}(D)
\]
and calculate
\[
\begin{aligned}
w_{\nu_R}(\{u, v\}_C)f &= \{f, \nu_R^*((\{u, v\}_C)')\}_D\, \nu_R^*(S_C((\{u, v\}_C)'')) \\
&= \{f, \Lambda_R^*((\{\rho^*(u), \rho^*(v)\}_B)')\}_D\, \Lambda_R^*(S_B((\{\rho^*(u), \rho^*(v)\}_B)'')) \\
&= w_{\Lambda_R}(\{\rho^*(u), \rho^*(v)\}_B)f \\
&= [w_{\Lambda_R}(\rho^*(u)), w_{\Lambda_R}(\rho^*(v))]f \\
&= [w_{\nu_R}(u), w_{\nu_R}(v)]f.
\end{aligned}
\]
Here we have used the obvious fact that $w_{\nu_R}(u) = w_{\Lambda_R}(\rho^*(u))$. This fact also directly implies that $H$ is a subgroup of $G$. Let us see how the Lie algebra $\mathrm{Lie}(H)$ of $H$ is located in the Lie algebra $\mathrm{Lie}(D)$ of the double $D$. Choose a vector subspace $V \subset \mathrm{Lie}(B)$ that is complementary to $\mathrm{Lie}(N)$ (i.e., $\mathrm{Lie}(B) = \mathrm{Lie}(N) \dotplus V$). We can certainly pick a basis $t_i = (t_\iota, t_I)$ such that $t_\iota \in \mathrm{Lie}(N)$ and $t_I \in V$ and complete $(t_\iota, t_I)$ by the dual basis $(T^\iota, T^I)$ of $\mathrm{Lie}(G)$. From the duality property (2.6), it follows that the $T^\iota$ span $V^\perp$ and the $T^I$ span $\mathrm{Lie}(N)^\perp$ (the superscript $\perp$ means “perpendicular” in the sense of the bilinear form $(.,.)_D$). We recall the formula (2.17b)
\[
w_{\Lambda_R}(y)f = -\delta_{t_i}(y)\, \nabla^R_{T^i}f = -\delta_{t_\iota}(y)\, \nabla^R_{T^\iota}f - \delta_{t_I}(y)\, \nabla^R_{T^I}f.
\]
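If $y$ is in $\rho^*(\mathrm{Fun}(C))$, then $\delta_{t_\iota}(y) = 0$ and we thus obtain
\[
w_{\Lambda_R}(y)f = -\delta_{t_I}(y)\, \nabla^R_{T^I}f.
\]
This means that $\mathrm{Lie}(H)$ is spanned by the $T^I$ only or, in other words, $\mathrm{Lie}(H) = \mathrm{Lie}(N)^\perp$. Since the twisting automorphism $\kappa$ preserves the cosymmetry group $B$, the anomaly matrix $M_{\kappa^{-1}}^{ij}$ (cf. (2.14)) can be rewritten as
\[
M_{\kappa^{-1}}^{ij} = (T^i, \kappa(T^m))_D\, (\kappa(t_m), T^j)_D = (P_\kappa T^i, T^j)_D. \tag{3.5}
\]
Now we pick $u, v \in \mathrm{Fun}(C)$ and, by using (2.13b) and (3.5), we calculate
\[
\begin{aligned}
\{\nu_R^*(u), \nu_R^*(v)\}_D &= \{\Lambda_R^*(\rho^*(u)), \Lambda_R^*(\rho^*(v))\}_D \\
&= \Lambda_R^*(\{\rho^*(u), \rho^*(v)\}_B - M_{\kappa^{-1}}^{ab}\, \nabla^R_{t_a}\rho^*(u)\, \nabla^R_{t_b}\rho^*(v)) \\
&= \Lambda_R^*(\rho^*(\{u, v\}_C) - (P_\kappa T^A, T^B)_D\, \nabla^R_{t_A}\rho^*(u)\, \nabla^R_{t_B}\rho^*(v)).
\end{aligned}
\]
The transition from the second to the third line is justified by the fact that $\nabla^R_{t_\alpha}\rho^*(u) = \nabla^R_{t_\beta}\rho^*(v) = 0$ (note that $a = (\alpha, A)$, $b = (\beta, B)$). Since both the $T^A$ and the $T^B$ are in $\mathrm{Lie}(H) = \mathrm{Lie}(N)^\perp$ and, by assumption, $P_\kappa(\mathrm{Lie}(H)) \subset \mathrm{Lie}(N)$, we have $(P_\kappa T^A, T^B)_D = 0$. Hence we conclude that the moment map $\nu_R$ is non-anomalous:
\[
\{\nu_R^*(u), \nu_R^*(v)\}_D = \nu_R^*(\{u, v\}_C).
\]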
Remark. We have worked out the subsymmetry story for the right moment map $\Lambda_R$. Obviously, there is an analogous “left story” for which the conclusions are the same: a subgroup $H \subset G$ acting from the left (in the $\kappa$-twisted way) is the subsymmetry subgroup if $\mathrm{Lie}(H) = \mathrm{Lie}(N)^\perp$, where $\mathrm{Lie}(N)$ is an ideal in the cosymmetry Lie algebra $\mathrm{Lie}(B)$. If, moreover, $P_\kappa(\mathrm{Lie}(H)) \subset \mathrm{Lie}(N)$, then the $H$-subsymmetry is non-anomalous. We should also remark that of the two conditions $[\mathrm{Lie}(B), \mathrm{Lie}(N)] \subset \mathrm{Lie}(N)$ and $P_\kappa(\mathrm{Lie}(H)) \subset \mathrm{Lie}(N)$ only the second one is our original result. The first one was already identified in [15, 3] for the non-twisted Heisenberg doubles.

3.3. Improper subsymmetry

In this subsection, we partially relax the condition of the decomposability of twisted Heisenberg doubles, in the sense that we shall keep the uniqueness of the decomposition but not its globality. Thus denote by $O_L$ the set of elements $K \in D$ for which there exist a $g \in G$ and a $b \in B$ such that $K = \kappa(b)g^{-1}$. In the same way, we denote by $O_R$ the set of elements $K \in D$ for which there exist a $\tilde{g} \in G$ and a $\tilde{b} \in B$ such that $K = \kappa(\tilde{g})\tilde{b}^{-1}$. Suppose, moreover, that the respective decompositions $\kappa(B)G$ and $\kappa(G)B$ on $O_L$ and $O_R$ are unique. In the non-twisted case $\kappa = \mathrm{Id}$, it was shown in [1] that the lack of global decomposability has unpleasant consequences. Namely, the fundamental Semenov-Tian-Shansky Poisson structure (2.7) is no longer symplectic and, therefore, the
Poisson manifold $(D, \{.,.\}_D)$ cannot play the role of the phase space of any dynamical system. It turns out, however, that from the Poisson structure $\{.,.\}_D$ one can construct symplectic submanifolds of $D$ (called the symplectic leaves) which have the same dimension as $D$. In particular, Alekseev and Malkin have proved in [1] that the intersection $O_L \cap O_R$ is such a symplectic leaf of $(D, \{.,.\}_D)$. The result of Alekseev and Malkin can be generalized to the twisted case as the following lemma states:

Lemma 3.3. Let $(D, \kappa)$ be a twisted Heisenberg double and $M$ its submanifold defined as $M = O_L \cap O_R$. Consider maps $\Lambda_L: M \to B$, $\Xi_R: M \to G$ induced by the unambiguous decomposition $M = \kappa(B)G$ and maps $\Xi_L: M \to G$, $\Lambda_R: M \to B$ induced by $M = \kappa(G)B$ (thus $K = \kappa(\Lambda_L(K))\Xi_R^{-1}(K)$ and $K = \kappa(\Xi_L(K))\Lambda_R^{-1}(K)$ for each $K$ in $M$). Denote by $r_G$ and $r_B$ the right-invariant Maurer–Cartan forms on $G$ and $B$, respectively (e.g. if $G$ is a matrix group, $r_G = dg\,g^{-1}$). Then the two-form $\omega_M$ on $M$ defined as
\[
\omega_M = \frac{1}{2}(\Lambda_L^*(r_B) \overset{\wedge}{,} \Xi_L^*(r_G))_D + \frac{1}{2}(\Lambda_R^*(r_B) \overset{\wedge}{,} \Xi_R^*(r_G))_D \tag{3.6}
\]
is symplectic and its inverse is the fundamental Poisson bivector (2.10) restricted to $M$.

Proof. Choose a basis $t_i$ of $\mathcal{B}$ and $T^i$ of $\mathcal{G}$ fulfilling the duality relation $(T^i, t_j)_D = \delta^i_j$. The form $\omega_M$ can then be rewritten as
\[
\omega_M = \frac{1}{2}(\Lambda_L^*(r_B), T^i)_D \wedge (\Xi_L^*(r_G), t_i)_D + \frac{1}{2}(\Lambda_R^*(r_B), T^i)_D \wedge (\Xi_R^*(r_G), t_i)_D.
\]
Denote by $\langle .,. \rangle$ the pairing between forms and vectors and recall the definition of the projectors $\Pi_{L\tilde{R}}$, $\Pi_{\tilde{L}R}$, $\Pi_{R\tilde{L}}$, $\Pi_{\tilde{R}L}$ from the proof of Theorem 2.5. Then we have
\[
\langle (\Lambda_L^*(r_B), T^i)_D, t \rangle = (R_{K*}\kappa(T^i), \Pi_{L\tilde{R}}t)_D, \tag{3.7a}
\]
\[
\langle (\Xi_L^*(r_G), t_i)_D, t \rangle = (R_{K*}\kappa(t_i), \Pi_{\tilde{L}R}t)_D, \tag{3.7b}
\]
\[
\langle (\Lambda_R^*(r_B), T^i)_D, t \rangle = -(L_{K*}T^i, \Pi_{R\tilde{L}}t)_D, \tag{3.7c}
\]
\[
\langle (\Xi_R^*(r_G), t_i)_D, t \rangle = -(L_{K*}t_i, \Pi_{\tilde{R}L}t)_D, \tag{3.7d}
\]
where $t$ is a vector at a point $K$ of $M \subset D$. Let us show how to demonstrate (3.7a–d) on the example (3.7a). Due to the decomposability $M = \kappa(B)G$, the vectors $L_{K*}T^i$, $R_{K*}\kappa(t_i)$ form a basis of the tangent space $T_KM$. Thus it is sufficient to prove (3.7a) for $t$ being one of the elements of this basis of $T_KM$. For $t = L_{K*}T^j$, it is obvious that the right-hand side of (3.7a) vanishes. On the other hand, knowing that $\Lambda_L(Ke^{sT^j}) = \Lambda_L(K)$, we can evaluate the left-hand side:
\[
\langle (\Lambda_L^*(r_B), T^i)_D, L_{K*}T^j \rangle = \langle (r_B, T^i)_D, \Lambda_{L*}(L_{K*}T^j) \rangle = 0.
\]
For $t = R_{K*}\kappa(t_j)$, the right-hand side of (3.7a) gives
\[
(R_{K*}\kappa(T^i), \Pi_{L\tilde{R}}R_{K*}\kappa(t_j))_D = (R_{K*}\kappa(T^i), R_{K*}\kappa(t_j))_D = \delta^i_j.
\]
On the other hand, knowing that $\Lambda_L(e^{s\kappa(t_j)}K) = e^{st_j}\Lambda_L(K)$, we can evaluate the left-hand side:
\[
\begin{aligned}
\langle (\Lambda_L^*(r_B), T^i)_D, R_{K*}\kappa(t_j) \rangle &= \langle (r_B, T^i)_D, \Lambda_{L*}(R_{K*}\kappa(t_j)) \rangle = \langle (r_B, T^i)_D, R_{\Lambda_L(K)*}t_j \rangle \\
&= (R_{\Lambda_L^{-1}(K)*}R_{\Lambda_L(K)*}t_j, T^i)_D = (t_j, T^i)_D = \delta^i_j.
\end{aligned}
\]
By using the relations (3.7a–d), we can evaluate the form $\omega_M$ on any two vectors $t, u \in T_KM$ in terms of the projectors:
\[
\begin{aligned}
\omega_M(t, u) &= \frac{1}{2}(R_{K*}\kappa(T^i), \Pi_{L\tilde{R}}t)_D\, (R_{K*}\kappa(t_i), \Pi_{\tilde{L}R}u)_D - \frac{1}{2}(R_{K*}\kappa(T^i), \Pi_{L\tilde{R}}u)_D\, (R_{K*}\kappa(t_i), \Pi_{\tilde{L}R}t)_D \\
&\quad + \frac{1}{2}(L_{K*}T^i, \Pi_{R\tilde{L}}t)_D\, (L_{K*}t_i, \Pi_{\tilde{R}L}u)_D - \frac{1}{2}(L_{K*}T^i, \Pi_{R\tilde{L}}u)_D\, (L_{K*}t_i, \Pi_{\tilde{R}L}t)_D \\
&= \frac{1}{2}(\Pi_{L\tilde{R}}t, \Pi_{\tilde{L}R}u)_D - \frac{1}{2}(\Pi_{L\tilde{R}}u, \Pi_{\tilde{L}R}t)_D + \frac{1}{2}(\Pi_{R\tilde{L}}t, \Pi_{\tilde{R}L}u)_D - \frac{1}{2}(\Pi_{R\tilde{L}}u, \Pi_{\tilde{R}L}t)_D.
\end{aligned}
\]
By realizing that it holds
\[
(t, \Pi_{L\tilde{R}}u)_D = (\Pi_{\tilde{R}L}t, \Pi_{L\tilde{R}}u)_D = (\Pi_{\tilde{R}L}t, u)_D, \qquad \Pi_{L\tilde{R}} + \Pi_{\tilde{R}L} = \mathrm{Id},
\]
together with the analogous identities for the pair $\Pi_{\tilde{L}R}$, $\Pi_{R\tilde{L}}$, we finally arrive at
\[
\omega_M(t, u) = (t, (\Pi_{\tilde{L}R} - \Pi_{L\tilde{R}})u)_D.
\]
From Eq. (2.11), we know that the form $\omega_M$ is invertible and its inverse is nothing but the Semenov-Tian-Shansky Poisson tensor (2.10) restricted to $M$. From this it also follows that $\omega_M$ is closed, hence symplectic.

It is certainly good news to have the symplectic submanifold $M$ of $D$, since it allows us to construct dynamical systems also for globally non-decomposable twisted Heisenberg doubles. On the other hand, it is much less welcome that nothing guarantees that the group $G$ still acts on $M$. In fact, it turns out that, generically, the submanifold $M$ of $D$ is not invariant under the left or right action of $G$ on $D$; therefore $G$ cannot play the role of the symmetry group. It may happen, however, that there is a subgroup $H$ of $G$ which does preserve the submanifold $M$ and which has the property that $\mathcal{H} = \mathcal{N}^\perp$, where $\mathcal{N}$ is an ideal
in $\mathcal{B}$. We then have the following lemma:

Lemma 3.4. Let $H$ be a subgroup of $G$ preserving the submanifold $M = O_L \cap O_R$. We suppose moreover that $\mathcal{H} = \mathcal{N}^\perp$, where $\mathcal{N}$ is an ideal of $\mathcal{B}$. Then there exists a moment map $\nu : M \to B$ realizing the global $(H, C)$-Poisson–Lie symmetry of $M$.

Proof. For concreteness, we speak about the right action of $G$ on $D$. Sitting on $M$, we construct the map $w_{\Lambda_R}: \mathrm{Fun}(B) \to \mathrm{Vect}(M)$ by using the formula (2.4):
\[
w_{\Lambda_R}(y)f = \{f, \Lambda_R^*(y')\}_M\, \Lambda_R^*(S(y'')), \qquad y \in \mathrm{Fun}(B), \quad f \in \mathrm{Fun}(M).
\]
For every $y \in \mathrm{Fun}(B)$, we have obviously
\[
\nabla^L_{\kappa(T^i)}\Lambda_R^*(y) = 0.
\]
Since the Poisson bivector on $M$ is given by Eq. (2.10), we thus obtain
\[
\begin{aligned}
w_{\Lambda_R}(y)f &= \nabla^R_{T^i}f\, \nabla^R_{t_i}\Lambda_R^*(y')\, \Lambda_R^*(S(y'')) \\
&= -\nabla^R_{T^i}f\, \Lambda_R^*((\nabla^L_{t_i}y')S(y'')) \\
&= -\delta_{t_i}(y)\, \nabla^R_{T^i}f.
\end{aligned}
\]
It follows that the Lie algebra $\mathcal{G}$ of $G$ does act on $M$; however, as we have supposed, this action cannot be lifted to an action of $G$ itself. Similarly as in the demonstration of Theorem 3.2, we thus observe that for $\nu_R \equiv \rho \circ \Lambda_R$ the following is true
\[
\{f, \nu_R^*(u')\}_M\, \nu_R^*(S(u'')) = -\delta_{t_I}(\rho^*(u))\, \nabla^R_{T^I}f, \qquad u \in \mathrm{Fun}(C), \quad f \in \mathrm{Fun}(M).
\]
Recall that the $T^I$ span the Lie algebra $\mathcal{H} = \mathcal{N}^\perp$; therefore $\nu_R$ is indeed the moment map realizing the action of $\mathcal{H}$ on $M$. This action can obviously be lifted to an action of the group $H$ on $M$, since we have supposed that $M$ is $H$-invariant.

Remark. In the case of the non-decomposable Heisenberg doubles of the type just described we cannot speak about a proper subsymmetry since $G$ does not act on $M$; therefore we speak about an improper subsymmetry.

Now it is time for an example. Consider the group $SL(3, \mathbb{R})$ (consisting of real $3 \times 3$ matrices of unit determinant) and denote by $sl(3, \mathbb{R})$ its Lie algebra (consisting of real traceless $3 \times 3$ matrices). The direct product $D = sl(3, \mathbb{R}) \times SL(3, \mathbb{R})$ can be equipped with a group structure as follows:
\[
(\chi, g)(\tilde{\chi}, \tilde{g}) = (\chi + \mathrm{Ad}_g\tilde{\chi}, g\tilde{g}), \qquad (\chi, g)^{-1} = (-\mathrm{Ad}_{g^{-1}}\chi, g^{-1}), \qquad \chi, \tilde{\chi} \in sl(3, \mathbb{R}), \quad g, \tilde{g} \in SL(3, \mathbb{R}).
\]
The Lie algebra $\mathcal{D}$ of $D$ is formed by pairs of elements of $sl(3, \mathbb{R})$ written as $\phi \oplus \alpha$ with the commutator
\[
[\phi \oplus \alpha, \psi \oplus \beta] = ([\phi, \beta] + [\alpha, \psi]) \oplus [\alpha, \beta].
\]
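The semidirect-product group law above can be checked numerically. The following is a minimal sketch, assuming exact rational arithmetic and a handful of arbitrarily chosen sample elements; the helper names (`mat`, `gmul`, `ginv`, and so on) are illustrative and do not come from the paper.

```python
from fractions import Fraction


def mat(rows):
    """Build an exact 3x3 matrix as a tuple of tuples of Fractions."""
    return tuple(tuple(Fraction(x) for x in row) for row in rows)


def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))


def add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(3)) for i in range(3))


def neg(a):
    return tuple(tuple(-x for x in row) for row in a)


def inv(g):
    """Exact inverse of an invertible 3x3 matrix via cyclic cofactors."""
    cof = [[g[(i + 1) % 3][(j + 1) % 3] * g[(i + 2) % 3][(j + 2) % 3]
            - g[(i + 1) % 3][(j + 2) % 3] * g[(i + 2) % 3][(j + 1) % 3]
            for j in range(3)] for i in range(3)]
    det = sum(g[0][j] * cof[0][j] for j in range(3))
    return tuple(tuple(cof[j][i] / det for j in range(3)) for i in range(3))


def Ad(g, chi):
    """Adjoint action g chi g^{-1}."""
    return mul(mul(g, chi), inv(g))


def gmul(a, b):
    """(chi, g)(chi~, g~) = (chi + Ad_g chi~, g g~)."""
    (chi, g), (chi2, g2) = a, b
    return (add(chi, Ad(g, chi2)), mul(g, g2))


def ginv(a):
    """(chi, g)^{-1} = (-Ad_{g^{-1}} chi, g^{-1})."""
    chi, g = a
    return (neg(Ad(inv(g), chi)), inv(g))


ZERO = mat([[0, 0, 0], [0, 0, 0], [0, 0, 0]])
ONE = mat([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

# Sample elements: traceless chi, unit-determinant g (arbitrary choices).
a = (mat([[1, 2, 3], [4, -3, 5], [6, 7, 2]]), mat([[1, 2, 0], [0, 1, 3], [0, 0, 1]]))
b = (mat([[0, 1, 0], [2, 1, 0], [5, 0, -1]]), mat([[1, 0, 0], [4, 1, 0], [0, 5, 1]]))
c = (mat([[2, 0, 1], [0, -2, 0], [3, 1, 0]]), mat([[2, 1, 0], [1, 1, 0], [0, 0, 1]]))

print(gmul(gmul(a, b), c) == gmul(a, gmul(b, c)))   # associativity
print(gmul(a, ginv(a)) == (ZERO, ONE))              # right inverse
```

Exact fractions are used so that the equalities are tested without any floating-point tolerance.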
There is a natural bi-invariant metric on D induced from an invariant bilinear form (., .)D on D = Lie(D): (φ ⊕ α, ψ ⊕ β)D = T r(φβ) + T r(ψα),
α, β, φ, ψ ∈ sl(3, R).
The twisting automorphism κ is defined by κ(χ, g) = (−χT , (g −1 )T ), where T stands for matrix transposition. In order to establish that (D, κ) is indeed a twisted Heisenberg double, we have to identify two maximally isotropic subgroups. Here they are G = {(χ, g) ∈ D; χ = 0}, χ1+ χ3+ χ + χ χ1− −2χ χ2+ , B = (χ, g) ∈ D; χ = 1 (1 − e−εs ) χ2− −χ + χ ε 1 εs 1 0 −εe 2 εs χ e2 g= 0 1 0 , 1 0 0 e− 2 εs where s, χ , χ , χj+ , χ1− , χ2− ∈ R are coordinates on B and ε is a parameter. For the basis of D, we may choose T = 0 ⊕ H, T j+ = 0 ⊕ E j+ , T 3+ = 0 ⊕ E 3+ , where
K , t = 2H ⊕ (−εE 3+ ), t = 2K ⊕ 0, 3 T j− = 0 ⊕ E j− , tj+ = E j− ⊕ 0, tj− = E j+ ⊕ 0, j = 1, 2, T 3− = 0 ⊕ E 3− , t3+ = E 3− ⊕ εH, t3− = E 3+ ⊕ 0,
T = 0 ⊕
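\[
E^{1+} = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
E^{2+} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
E^{3+} = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix},
\]
\[
E^{1-} = \begin{pmatrix} 0 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
E^{2-} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix}, \qquad
E^{3-} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix},
\]
\[
H = \begin{pmatrix} \frac{1}{2} & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & -\frac{1}{2} \end{pmatrix}, \qquad
K = \begin{pmatrix} \frac{1}{2} & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & \frac{1}{2} \end{pmatrix}.
\]
It is easy to verify that it holds
\[
(t_i, t_j)_D = 0, \qquad (T^i, T^j)_D = 0, \qquad (T^i, t_j)_D = \delta^i_j,
\]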
i, j = , , 1±, 2±, 3 ± .
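The duality relations can be confirmed by a direct computation. The sketch below assumes the basis exactly as listed above, with the bilinear form $(\phi \oplus \alpha, \psi \oplus \beta)_D = \mathrm{Tr}(\phi\beta) + \mathrm{Tr}(\psi\alpha)$. The dictionary keys `'h'` and `'k'` are hypothetical labels for the two basis elements built from $H$ and $K$, and `EPS` is an arbitrary sample value of the parameter $\varepsilon$.

```python
from fractions import Fraction as Fr

EPS = Fr(1, 3)  # arbitrary sample value of the parameter epsilon


def unit(i, j):
    """3x3 matrix unit with a single 1 in row i, column j (0-based)."""
    return tuple(tuple(Fr(int((r, c) == (i, j))) for c in range(3)) for r in range(3))


def diag(x, y, z):
    d = (Fr(x), Fr(y), Fr(z))
    return tuple(tuple(d[r] if r == c else Fr(0) for c in range(3)) for r in range(3))


def scale(s, m):
    return tuple(tuple(s * x for x in row) for row in m)


def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))


def tr(m):
    return sum(m[i][i] for i in range(3))


O = diag(0, 0, 0)
E = {'1+': unit(0, 1), '2+': unit(1, 2), '3+': unit(0, 2),
     '1-': unit(1, 0), '2-': unit(2, 1), '3-': unit(2, 0)}
H = diag(Fr(1, 2), 0, Fr(-1, 2))
K = diag(Fr(1, 2), -1, Fr(1, 2))

# Elements of Lie(D) are stored as pairs (phi, alpha) standing for phi (+) alpha.
T = {'h': (O, H), 'k': (O, scale(Fr(1, 3), K))}
T.update({s: (O, E[s]) for s in E})
t = {'h': (scale(2, H), scale(-EPS, E['3+'])), 'k': (scale(2, K), O),
     '1+': (E['1-'], O), '2+': (E['2-'], O),
     '1-': (E['1+'], O), '2-': (E['2+'], O),
     '3+': (E['3-'], scale(EPS, H)), '3-': (E['3+'], O)}


def form(x, y):
    """(phi (+) alpha, psi (+) beta)_D = Tr(phi beta) + Tr(psi alpha)."""
    (phi, alpha), (psi, beta) = x, y
    return tr(mul(phi, beta)) + tr(mul(psi, alpha))


labels = sorted(T)
isotropic_G = all(form(T[i], T[j]) == 0 for i in labels for j in labels)
isotropic_B = all(form(t[i], t[j]) == 0 for i in labels for j in labels)
dual = all(form(T[i], t[j]) == (1 if i == j else 0) for i in labels for j in labels)
print(isotropic_G, isotropic_B, dual)
```

The two isotropy checks correspond to the maximal isotropy of $\mathcal{G}$ and $\mathcal{B}$ (each has dimension 8 inside the 16-dimensional $\mathcal{D}$), and the last check is the duality of the two bases.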
The commutation relations of G = Span(T i ) are evidently those of the Lie algebra sl(3, R). It is important for us to give the complete list of (non-zero) commutators of B = Span(ti ). Thus we have [t , t1+ ] = εt2− ,
[t , t2+ ] = −εt1− ,
$[t_{3+}, t_{j\pm}] = \mp\frac{1}{2}\varepsilon\, t_{j\pm}$,
[t3+ , t3− ] = εt3− ,
[t3+ , t ] = εt ,
j = 1, 2.
Let us choose a (nilpotent) subalgebra H of G = sl(3, R) spanned by T j+ . Thus the only non-zero commutator is [T 1+ , T 2+ ] = T 3+ . It is easy to find N ⊂ B such that H = N ⊥ : We have N = Span(t , t , tj− ),
j = 1, 2, 3.
It is a matter of direct check to verify that $\mathcal{N}$ is indeed an ideal in $\mathcal{B}$. Therefore the (Heisenberg) group $H$ consisting of upper-triangular real matrices with units on the diagonal is a good candidate for the Poisson–Lie subsymmetry. The corresponding cosymmetry group $C$ has Lie algebra $\mathcal{C} = \mathcal{B}/\mathcal{N}$ and, by slightly abusing the notation, we can denote its basis by $t_{j+}$, $j = 1, 2, 3$. The non-zero commutators of $\mathcal{C}$ read
\[
[t_{3+}, t_{j+}] = -\frac{1}{2}\varepsilon\, t_{j+}, \qquad j = 1, 2.
\]
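The commutator table of $\mathcal{B}$, the ideal property of $\mathcal{N}$ and the induced commutators of $\mathcal{C}$ can be verified mechanically. This is a sketch under the same assumptions as the basis listed earlier: elements of $\mathcal{D}$ are pairs $(\phi, \alpha)$ with the bracket $[\phi \oplus \alpha, \psi \oplus \beta] = ([\phi, \beta] + [\alpha, \psi]) \oplus [\alpha, \beta]$, the labels `'h'` and `'k'` are hypothetical names for the basis elements built from $H$ and $K$, and `EPS` is a sample value of $\varepsilon$.

```python
from fractions import Fraction as Fr

EPS = Fr(1, 3)  # arbitrary sample value of the parameter epsilon


def unit(i, j):
    return tuple(tuple(Fr(int((r, c) == (i, j))) for c in range(3)) for r in range(3))


def diag(x, y, z):
    d = (Fr(x), Fr(y), Fr(z))
    return tuple(tuple(d[r] if r == c else Fr(0) for c in range(3)) for r in range(3))


def scale(s, m):
    return tuple(tuple(s * x for x in row) for row in m)


def add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(3)) for i in range(3))


def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))


def comm(a, b):
    return add(mul(a, b), scale(-1, mul(b, a)))


def bracket(x, y):
    """[phi (+) alpha, psi (+) beta] = ([phi, beta] + [alpha, psi]) (+) [alpha, beta]."""
    (phi, alpha), (psi, beta) = x, y
    return (add(comm(phi, beta), comm(alpha, psi)), comm(alpha, beta))


def smul(s, x):
    return (scale(s, x[0]), scale(s, x[1]))


O = diag(0, 0, 0)
H = diag(Fr(1, 2), 0, Fr(-1, 2))
K = diag(Fr(1, 2), -1, Fr(1, 2))
t = {'h': (scale(2, H), scale(-EPS, unit(0, 2))), 'k': (scale(2, K), O),
     '1+': (unit(1, 0), O), '2+': (unit(2, 1), O),
     '1-': (unit(0, 1), O), '2-': (unit(1, 2), O),
     '3+': (unit(2, 0), scale(EPS, H)), '3-': (unit(0, 2), O)}

# Claimed complete list of non-zero commutators: [t_a, t_b] = coeff * t_c.
NONZERO = {('h', '1+'): (EPS, '2-'), ('h', '2+'): (-EPS, '1-'),
           ('3+', '1+'): (-EPS / 2, '1+'), ('3+', '2+'): (-EPS / 2, '2+'),
           ('3+', '1-'): (EPS / 2, '1-'), ('3+', '2-'): (EPS / 2, '2-'),
           ('3+', '3-'): (EPS, '3-'), ('3+', 'h'): (EPS, 'h')}


def expected(a, b):
    if (a, b) in NONZERO:
        s, c = NONZERO[(a, b)]
        return smul(s, t[c])
    if (b, a) in NONZERO:
        s, c = NONZERO[(b, a)]
        return smul(-s, t[c])
    return (O, O)


table_ok = all(bracket(t[a], t[b]) == expected(a, b) for a in t for b in t)

# N = Span(t_h, t_k, t_{j-}) is an ideal: any bracket involving N lands in N.
N = {'h', 'k', '1-', '2-', '3-'}
ideal_ok = all(c in N for (a, b), (_, c) in NONZERO.items() if a in N or b in N)
print(table_ok, ideal_ok)
```

Since every bracket with an entry in $\mathcal{N}$ lands in $\mathcal{N}$, the only commutators surviving in the quotient are $[t_{3+}, t_{j+}] = -\frac{1}{2}\varepsilon\, t_{j+}$ for $j = 1, 2$, in agreement with the text.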
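The cosymmetry group $C$ can be most easily described in the dual way. Denote the coordinate functions by $\xi^j$, $j = 1, 2, 3$. The coproduct reads
\[
\Delta\xi^3 = \xi^3 \otimes 1 + 1 \otimes \xi^3, \qquad \Delta\xi^j = \xi^j \otimes 1 + e^{-\frac{\varepsilon}{2}\xi^3} \otimes \xi^j, \qquad j = 1, 2,
\]
the antipode
\[
S(\xi^3) = -\xi^3, \qquad S(\xi^j) = -e^{\frac{\varepsilon}{2}\xi^3}\xi^j, \qquad j = 1, 2
\]
and the counit
\[
\varepsilon(\xi^j) = 0, \qquad j = 1, 2, 3.
\]
The dual map $\rho^*: \mathrm{Fun}(C) \to \mathrm{Fun}(B)$ reads
\[
\rho^*(\xi^3) = s, \qquad \rho^*(\xi^j) = \chi^{j-}, \qquad j = 1, 2.
\]
The Poisson–Lie bracket on $\mathrm{Fun}(C)$ comes from that on $\mathrm{Fun}(B)$, which, in turn, is given by (2.12). The result of the computation reads
\[
\{\xi^1, \xi^2\}_C = \frac{1}{\varepsilon}(1 - e^{-\varepsilon\xi^3}), \qquad \{\xi^3, \xi^j\}_C = 0, \qquad j = 1, 2.
\]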
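The Hopf-algebra data above can be translated back into a group law on $C$ and tested numerically. The sketch assumes the usual convention $(\Delta\xi)(a, b) = \xi(ab)$ and $(S\xi)(a) = \xi(a^{-1})$, represents a group element by its coordinate triple $(\xi^1, \xi^2, \xi^3)$, and uses an arbitrary sample value `EPS`; the function names are illustrative only.

```python
import math

EPS = 0.7  # arbitrary sample value of the parameter epsilon


def mult(a, b):
    """Group law read off from the coproduct: xi(ab) = (Delta xi)(a, b)."""
    q = math.exp(-0.5 * EPS * a[2])
    return (a[0] + q * b[0], a[1] + q * b[1], a[2] + b[2])


def inverse(a):
    """Inverse read off from the antipode: xi(a^{-1}) = (S xi)(a)."""
    q = math.exp(0.5 * EPS * a[2])
    return (-q * a[0], -q * a[1], -a[2])


def bracket12(a):
    """Value of {xi^1, xi^2}_C at the point a."""
    return (1.0 - math.exp(-EPS * a[2])) / EPS


def close(x, y):
    return all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(x, y))


UNIT = (0.0, 0.0, 0.0)  # the counit vanishes on all coordinates
a, b, c = (0.3, -1.2, 0.5), (1.1, 0.4, -0.8), (-0.7, 2.0, 0.25)

assoc_ok = close(mult(mult(a, b), c), mult(a, mult(b, c)))
inv_ok = close(mult(a, inverse(a)), UNIT) and close(mult(inverse(a), a), UNIT)

# Poisson-Lie (multiplicativity) property of the bracket {xi^1, xi^2}_C:
# Delta{xi^1, xi^2} = {xi^1, xi^2} (x) 1 + exp(-EPS xi^3) (x) {xi^1, xi^2},
# using {xi^3, xi^j}_C = 0.
lhs = bracket12(mult(a, b))
rhs = bracket12(a) + math.exp(-EPS * a[2]) * bracket12(b)
poisson_lie_ok = math.isclose(lhs, rhs, abs_tol=1e-12)
print(assoc_ok, inv_ok, poisson_lie_ok)
```

The product is visibly non-commutative for generic elements, in line with the observation that the cosymmetry group $C$ is non-Abelian.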
We observe that both symmetry group H and the cosymmetry group C are non-Abelian. Let us now show that the (H, C)-Poisson–Lie subsymmetry is in fact improper. In order to see this, we first notice that the Heisenberg double D is nondecomposable since, e.g., the element 1 1 0 0 − ε , (χ, g) = 0 1 0 0 cannot be written as κ(b)g −1 for some b ∈ B and g ∈ G. It is easy to identify the manifold M = OL ∩ OR . We find 1 1 3− 3+ M = (χ, g) ∈ D; T r(JL E ) > − , T r(JR E ) < , ε ε
(3.8)
where we have defined the $sl(3, \mathbb{R})$-valued functions $J_L$, $J_R$ on $D$ as
\[
J_L(\chi, g) = \chi, \qquad J_R(\chi, g) = -\mathrm{Ad}_{g^{-1}}\chi.
\]
The symplectic form on $M$ can be computed from the explicit expression (3.6). The result of the calculation is as follows
\[
\omega_M = -\frac{1}{2}\mathrm{Tr}(dJ_R \wedge l_G) + \frac{1}{2}\mathrm{Tr}(dJ_L \wedge r_G) - \frac{\varepsilon}{2}\, \frac{\mathrm{Tr}(dJ_LH) \wedge \mathrm{Tr}(dJ_LE^{3-})}{1 + \varepsilon\,\mathrm{Tr}(J_LE^{3-})} - \frac{\varepsilon}{2}\, \frac{\mathrm{Tr}(dJ_RH) \wedge \mathrm{Tr}(dJ_RE^{3+})}{1 - \varepsilon\,\mathrm{Tr}(J_RE^{3+})}.
\]
Note that the left- and right-invariant Maurer–Cartan forms $l_G$, $r_G$ can be written also as $g^{-1}dg$, $dg\,g^{-1}$ since $G = SL(3, \mathbb{R})$ is a matrix group. The explicit expression of the symplectic form $\omega_M$ is quite illuminating in the sense that it explains why the constraints $\mathrm{Tr}(J_LE^{3-}) > -\frac{1}{\varepsilon}$, $\mathrm{Tr}(J_RE^{3+}) < \frac{1}{\varepsilon}$ in (3.8) had to be imposed: they keep the denominators in $\omega_M$ positive. It is now a matter of direct inspection to find that the right action of the group $H$ on $D$ and the left action of $\kappa(H)$ on $D$ preserve, respectively, the symplectic manifold $M = O_L \cap O_R$. The $(H, C)$-Poisson–Lie symmetry of $(M, \omega_M)$ is therefore established.

4. u-Deformed WZW Model and Its Gauging

We begin this section by introducing a particular example of the deformation of the WZW model which was not discussed in [9–11]. Then we shall perform the symplectic reduction of this $u$-deformed WZW model with respect to a non-anomalous quasi-adjoint action submoment map which is a sort of combination of the moment maps constructed in Secs. 3.1 and 3.2. Finally, we shall argue why this quasi-adjoint symplectic reduction can be interpreted as the gauging of the deformed WZW model.
4.1. The u-deformation of the WZW model

It was conjectured in [9] and explained in detail in [11] that the standard WZW model [17] on a compact Lie group $K$ is a dynamical system whose phase space can be identified with a certain (decomposable) twisted Heisenberg double of the loop group $LK$. Moreover, the symplectic form of the WZW model is just the inverse of the fundamental Semenov-Tian-Shansky Poisson bivector (2.10). The basic idea of the article [9] can be rephrased as follows: since the loop group $LK$ may possess several different twisted Heisenberg doubles $(D, \kappa)$, it makes sense to consider the dynamical system based on each of the $(D, \kappa)$ as a sort of generalized WZW model. The (twisted Heisenberg) double of the standard WZW model is distinguished among all other doubles of the loop group $LK$ by the fact that the cosymmetry group $B$ is Abelian. This circumstance is reflected by the fact that the standard WZW model has the ordinary Hamiltonian symmetry structure. On the other hand, the generalized WZW models necessarily have non-Abelian cosymmetry groups, therefore their symmetry structure must be genuinely Poisson–Lie.

Some generalized WZW models naturally form families parametrized by one or several parameters. Suppose we investigate such a family. If for a particular value of the parameters the corresponding generalized WZW model becomes the standard WZW model, we call the other members of this family the deformed WZW models.

Let us now describe a particular family of the deformed WZW models, which was not discussed in [9–11]. Thus let $K$ be a connected simple compact Lie group whose Lie algebra $\mathcal{K}$ is equipped with a non-degenerate Ad-invariant bilinear form $(.,.)_{\mathcal{K}}$. Let $LK$ be the group of smooth maps from a circle $S^1$ into $K$ (the group law is given by pointwise multiplication) and define a natural non-degenerate Ad-invariant bilinear form $(.|.)$ on $L\mathcal{K} \equiv \mathrm{Lie}(LK)$ by the following formula
\[
(\alpha|\beta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} d\sigma\, (\alpha(\sigma), \beta(\sigma))_{\mathcal{K}}. \tag{4.1}
\]
As the twisted Heisenberg double $D$ we take the semidirect product of the loop group $LK$ with its Lie algebra $L\mathcal{K}$. Thus the group multiplication law on $D$ reads
\[
(\chi, g).(\tilde{\chi}, \tilde{g}) = (\chi + \mathrm{Ad}_g\tilde{\chi}, g\tilde{g}), \qquad g \in LK, \quad \chi \in L\mathcal{K}, \tag{4.2a}
\]
\[
(\chi, g)^{-1} = (-\mathrm{Ad}_{g^{-1}}\chi, g^{-1}), \tag{4.2b}
\]
and the Lie algebra $\mathcal{D}$ of $D$ has the structure of the semidirect sum $\mathcal{D} = L\mathcal{K} \overset{\leftarrow}{\oplus} L\mathcal{K}$:
\[
[\phi \oplus \alpha, \psi \oplus \beta] = ([\phi, \beta] + [\alpha, \psi], [\alpha, \beta]).
\]
Here $\phi, \psi \in L\mathcal{K}$ are in the first and $\alpha, \beta \in L\mathcal{K}$ in the second component of the semidirect sum. The bi-invariant metric on $D$ comes from the Ad-invariant bilinear form $(.,.)_{\mathcal{D}}$ on $\mathrm{Lie}(D) = \mathcal{D}$ defined with the help of (4.1):
\[
(\phi \oplus \alpha, \psi \oplus \beta)_{\mathcal{D}} = (\phi|\beta) + (\psi|\alpha).
\]
The metric-preserving automorphism $\kappa$ of the group $D$ reads
\[
\kappa(\chi, g) = (\chi + k\,\partial_\sigma g\,g^{-1}, g), \tag{4.2c}
\]
where $k$ is an (integer) parameter. The maximally isotropic subgroups are
\[
G = \{(\chi, g) \in D;\ \chi = 0\}, \tag{4.3a}
\]
\[
B = \{(\chi, g) \in D;\ g = e^{u(\chi)}\}, \tag{4.3b}
\]
where $u$ is a certain map from $L\mathcal{K}$ to the Cartan subalgebra $\mathcal{T}$ of $L\mathcal{K}$. Let us now explain the construction of the map $u$: The group $K$ is naturally embedded in $LK$ as the subgroup consisting of constant loops. The maximal torus $T$ of $K$ is therefore an (Abelian) subgroup of $LK$ and we call $\mathcal{T} = \mathrm{Lie}(T)$ the Cartan subalgebra of $L\mathcal{K}$. Since we have the inner product (4.1) on $L\mathcal{K}$ we can define the orthogonal projector $P_0: L\mathcal{K} \to \mathcal{T}$. Let $U: \mathcal{T} \to \mathcal{T}$ be a skew-symmetric linear operator, i.e. it holds
\[
(Ua, b)_{\mathcal{K}} = -(a, Ub)_{\mathcal{K}}, \qquad a, b \in \mathcal{T}. \tag{4.4}
\]
We then define $u = U \circ P_0$. It is easy to see that
\[
u(\chi) + u(\tilde{\chi}) = u(\chi + e^{u(\chi)}\tilde{\chi}e^{-u(\chi)}), \qquad \chi, \tilde{\chi} \in L\mathcal{K},
\]
hence the set $B$ defined by (4.3b) is indeed a subgroup of $D$. Moreover, the condition (4.4) implies the isotropy of $B$ in $D$. It is a simple task to establish the decompositions $D = \kappa(G)B$ and $D = \kappa(B)G$. Indeed, we have for every $g \in LK$, $\chi \in L\mathcal{K}$
\[
(\chi, g) = (k\,\partial_\sigma g\,g^{-1}, ge^{u(J_R)})(-e^{-u(J_R)}J_Re^{u(J_R)}, e^{-u(J_R)}) = (J_L, e^{u(J_L)}).(0, e^{-u(J_L)}g),
\]
where the $L\mathcal{K}$-valued functions $J_L$, $J_R$ on $D$ are defined as
\[
J_L(\chi, g) \equiv \chi, \qquad J_R(\chi, g) = -\mathrm{Ad}_{g^{-1}}\chi + k\,g^{-1}\partial_\sigma g. \tag{4.5a}
\]
Thus we can identify the moment maps $\Lambda_{L,R}: D \to B$, $\Xi_{L,R}: D \to G$:
\[
\Lambda_L(\chi, g) = (J_L, e^{u(J_L)}), \qquad \Lambda_R(\chi, g) = (J_R, e^{u(J_R)}), \qquad \Xi_L(\chi, g) = ge^{u(J_R)}, \qquad \Xi_R(\chi, g) = g^{-1}e^{u(J_L)}. \tag{4.5b}
\]
Now we use the formula (3.6) and write down the symplectic form $\omega_u$ of the $u$-deformed WZW model:
\[
\omega_u = \frac{1}{2}(dJ_L \overset{\wedge}{|} r_{LK}) - \frac{1}{2}(dJ_R \overset{\wedge}{|} l_{LK}) + \frac{1}{2}(u(dJ_L) \overset{\wedge}{|} dJ_L) + \frac{1}{2}(u(dJ_R) \overset{\wedge}{|} dJ_R). \tag{4.6}
\]
Here $r_{LK} = dg\,g^{-1}$ and $l_{LK} = g^{-1}dg$ stand for the right- and left-invariant Maurer–Cartan forms on the group manifold $LK$.
The role of the deformation parameter is played by the linear operator U . Indeed, if U → 0 the form ωu can be rewritten as 1 ωu=0 = d(JL |rLK ) + k(rLG ∧ |∂σ rLG ). 2 In the expression ωu=0 , we can recognize the symplectic form of the standard WZW model (cf. [9, 5, 2]). We now complete the definition of the u-deformed WZW model by saying that it is a dynamical system with the phase space D, with the symplectic form ωu and with the following Hamiltonian 1 1 (JL |JL ) − (JR |JR ). (4.7) 2k 2k We note without giving proof that, in distinction to the q-deformation of the WZW model introduced in [9], the u-deformation does preserve the conformal symmetry. Let us study the symmetry structure of the u-WZW model. The group G = LK acts from the left as H=−
h (χ, g) = κ((0, h)).(χ, g) = (k∂σ hh−1 + hχh−1 , hg),
h, g ∈ LK,
χ ∈ LK
and also from the right (χ, g) h = (χ, g)(0, h−1 ) = (χ, gh−1 ). We know (by construction) that both these actions are Poisson–Lie symmetries with the moment maps ΛL,R given by (4.5b). Now we are going to evaluate the (anomalous) Poisson brackets (2.13a,b) of the moment maps. First of all we have to describe the structure of the cosymmetry group B in the dual language. The complexified algebra F unC (B) is generated by (linear) functions F α,n , F µ,n defined as F α,n (χ) = (E α,n |χ), α,n
α inσ
F µ,n (χ) = (H µ,n |χ).
(4.8)
α
Here E =E e and E are the step generators of the complexified Lie algebra C K . On the other hand, H µ,n = H µ einσ where H µ are the (orthonormalized) Cartan generators fulfilling the relations [H µ , E α ] = α, H µ E α , (H µ , H ν )K = δ µν ,
[E α , E −α ] = α∨ ,
(E α , E −α )KC =
2 , |α|2
[E α , E β ] = cαβ E α+β , (E α )† = E −α ,
(H µ )† = H µ ,
where the coroot α∨ is defined as α∨ =
2 α, H µ H µ . |α|2
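Obviously, $E^{\alpha,n}$, $H^{\mu,n}$, $n \in \mathbb{Z}$ is the basis of $L\mathcal{K}^{\mathbb{C}}$. The (non-Abelian) group law on $B$ is encoded in the coproduct, the antipode and the counit on $\mathrm{Fun}^{\mathbb{C}}(B)$. From Eqs. (4.2), (4.3b) and (4.8), it is not difficult to find out:
\[
\Delta F^{\mu,n} = F^{\mu,n} \otimes 1 + 1 \otimes F^{\mu,n}, \qquad S(F^{\mu,n}) = -F^{\mu,n}, \qquad \varepsilon(F^{\mu,n}) = 0,
\]
\[
\Delta F^{\alpha,n} = F^{\alpha,n} \otimes 1 + e^{-\langle \alpha, U(H^{\mu}) \rangle F^{\mu,0}} \otimes F^{\alpha,n}, \qquad S(F^{\alpha,n}) = -e^{\langle \alpha, U(H^{\mu}) \rangle F^{\mu,0}} F^{\alpha,n}, \qquad \varepsilon(F^{\alpha,n}) = 0.
\]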
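The antipode quoted above can be cross-checked against the Hopf algebra axiom $m \circ (S \otimes \mathrm{id}) \circ \Delta = \varepsilon(\cdot)\,1$. The following sketch assumes only that $S$ is an algebra homomorphism of the commutative algebra $\mathrm{Fun}^{\mathbb{C}}(B)$ with $S(1) = 1$, so that $S(e^{-cF^{\mu,0}}) = e^{-cS(F^{\mu,0})} = e^{cF^{\mu,0}}$ for a number $c$; here $m$ denotes the pointwise multiplication.

```latex
\begin{align*}
m \circ (S \otimes \mathrm{id}) \circ \Delta \, F^{\mu,n}
  &= S(F^{\mu,n}) + F^{\mu,n} = 0 = \varepsilon(F^{\mu,n}), \\
m \circ (S \otimes \mathrm{id}) \circ \Delta \, F^{\alpha,n}
  &= S(F^{\alpha,n})
   + S\!\left( e^{-\langle \alpha, U(H^{\mu}) \rangle F^{\mu,0}} \right) F^{\alpha,n} \\
  &= -e^{\langle \alpha, U(H^{\mu}) \rangle F^{\mu,0}} F^{\alpha,n}
   + e^{\langle \alpha, U(H^{\mu}) \rangle F^{\mu,0}} F^{\alpha,n}
   = 0 = \varepsilon(F^{\alpha,n}).
\end{align*}
```

The exponential prefactor in $S(F^{\alpha,n})$ is thus exactly what is needed to compensate the non-cocommutative term in $\Delta F^{\alpha,n}$; for $U \to 0$ both reduce to the Abelian expressions.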
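Because of the fact that $\chi^{\dagger} = -\chi$, the operation of the complex conjugation $\dagger$ on $\mathrm{Fun}^{\mathbb{C}}(B)$ is given by
\[
(F^{\alpha,n})^{\dagger} = -F^{-\alpha,-n}, \qquad (F^{\mu,n})^{\dagger} = -F^{\mu,-n}.
\]
It can be then easily verified that
\[
\Delta \circ \dagger = (\dagger \otimes \dagger) \circ \Delta, \qquad S \circ \dagger = \dagger \circ S, \qquad \varepsilon \circ \dagger = \dagger \circ \varepsilon.
\]
This means that $\Delta$, $S$, $\varepsilon$ descend from $\mathrm{Fun}^{\mathbb{C}}(B)$ to $\mathrm{Fun}^{\mathbb{R}}(B)$ making the latter the real commutative Hopf algebra dual to the real group $B$. The Poisson–Lie bracket on $\mathrm{Fun}^{\mathbb{C}}(B)$ can be obtained from the general formula (2.12):
\[
\{F^{\mu,m}, F^{\nu,n}\}_B = 0, \qquad \{F^{\mu,m}, F^{\alpha,n}\}_B = \langle \alpha, H^{\mu} \rangle F^{\alpha,m+n},
\]
\[
\{F^{\alpha,m}, F^{-\alpha,n}\}_B = \frac{2}{|\alpha|^2} \langle \alpha, H^{\mu} \rangle F^{\mu,m+n},
\]
\[
\{F^{\alpha,m}, F^{\beta,n}\}_B = c_{\alpha\beta} F^{\alpha+\beta,m+n} - \langle \alpha, U(H^{\mu}) \rangle \langle \beta, H^{\mu} \rangle F^{\alpha,m} F^{\beta,n}.
\]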
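It is easy to verify that the Poisson–Lie bracket on $\mathrm{Fun}^{\mathbb{C}}(B)$ satisfies $\{f_1^{\dagger}, f_2^{\dagger}\}_B = \{f_1, f_2\}_B^{\dagger}$, hence it defines also the Poisson–Lie bracket on the real group $B$. Now we are ready to evaluate the anomalous Poisson brackets (2.13a,b). We start with
\[
\Lambda_L^*(F^{\alpha,n}) = (J_L | E^{\alpha} e^{in\sigma}) \equiv J_L^{\alpha,n}, \qquad \Lambda_R^*(F^{\alpha,n}) = (J_R | E^{\alpha} e^{in\sigma}) \equiv J_R^{\alpha,n},
\]
\[
\Lambda_L^*(F^{\mu,n}) = (J_L | H^{\mu} e^{in\sigma}) \equiv J_L^{\mu,n}, \qquad \Lambda_R^*(F^{\mu,n}) = (J_R | H^{\mu} e^{in\sigma}) \equiv J_R^{\mu,n}
\]
and find
\[
\begin{aligned}
\{J_L^{\mu,m}, J_L^{\nu,n}\}_D &= k \delta^{\mu\nu} i n \delta_{m+n,0}, \\
\{J_L^{\mu,m}, J_L^{\alpha,n}\}_D &= \langle \alpha, H^{\mu} \rangle J_L^{\alpha,n+m}, \\
\{J_L^{\alpha,m}, J_L^{-\alpha,n}\}_D &= \frac{2}{|\alpha|^2} \left( \langle \alpha, H^{\mu} \rangle J_L^{\mu,n+m} + i k n \delta_{m+n,0} \right), \\
\{J_L^{\alpha,m}, J_L^{\beta,n}\}_D &= c_{\alpha\beta} J_L^{\alpha+\beta,m+n} - \underline{\langle \alpha, U(H^{\mu}) \rangle \langle \beta, H^{\mu} \rangle J_L^{\alpha,m} J_L^{\beta,n}};
\end{aligned}
\tag{4.9a}
\]
\[
\begin{aligned}
\{J_R^{\mu,m}, J_R^{\nu,n}\}_D &= -k \delta^{\mu\nu} i n \delta_{m+n,0}, \\
\{J_R^{\mu,m}, J_R^{\alpha,n}\}_D &= \langle \alpha, H^{\mu} \rangle J_R^{\alpha,n+m}, \\
\{J_R^{\alpha,m}, J_R^{-\alpha,n}\}_D &= \frac{2}{|\alpha|^2} \left( \langle \alpha, H^{\mu} \rangle J_R^{\mu,n+m} - i k n \delta_{m+n,0} \right), \\
\{J_R^{\alpha,m}, J_R^{\beta,n}\}_D &= c_{\alpha\beta} J_R^{\alpha+\beta,m+n} - \underline{\langle \alpha, U(H^{\mu}) \rangle \langle \beta, H^{\mu} \rangle J_R^{\alpha,m} J_R^{\beta,n}};
\end{aligned}
\tag{4.9b}
\]
\[
\{J_L, J_R\}_D = 0. \tag{4.9c}
\]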
In the formulae above, we note the anomalous terms proportional to $k$. They correspond to the matrices $M_{\kappa}^{ij}$ and $M_{\kappa^{-1}}^{ij}$ in (2.13a) and (2.13b), respectively. We remark that the left and right brackets differ by the sign in front of $k$. This fact will be crucial for gauging the $u$-deformed WZW model in Sec. 4.3. We have also underlined the deformation terms containing $U$. Thus the relations (4.9a) or (4.9b) can be referred to as those of the $u$-deformed Kac–Moody algebra.

Knowing the symplectic structure of the $u$-deformed WZW models, we can compute other interesting Poisson brackets. The observables on $D$ are functions of $\chi \in L\mathcal{K}$ and $g \in LK$. Let us consider two functions $\phi(g)$, $\psi(g)$, which do not depend on $\chi$. Then we find directly from (2.7):
\[
\{\phi(g), \psi(g)\}_D = \underline{\nabla^R_{T^{\mu}} \phi(g) \nabla^R_{U(T^{\mu})} \psi(g) - \nabla^L_{U(T^{\mu})} \phi(g) \nabla^L_{T^{\mu}} \psi(g)},
\]
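where $T^{\mu} \equiv iH^{\mu} \in \mathcal{T} \subset \mathcal{K}$. Note that we have again underlined the $u$-deformation term (the corresponding bracket of the standard WZW model vanishes). Finally, we have
\[
\begin{aligned}
\{\phi(g), J_L^{\mu,m}\}_D &= \nabla^L_{H^{\mu,m}} \phi(g), \\
\{\phi(g), J_L^{\alpha,n}\}_D &= \nabla^L_{E^{\alpha,n}} \phi(g) - i \langle \alpha, U(H^{\mu}) \rangle J_L^{\alpha,n} \nabla^L_{T^{\mu}} \phi(g), \\
\{\phi(g), J_R^{\mu,m}\}_D &= -\nabla^R_{H^{\mu,m}} \phi(g), \\
\{\phi(g), J_R^{\alpha,n}\}_D &= -\nabla^R_{E^{\alpha,n}} \phi(g) + i \langle \alpha, U(H^{\mu}) \rangle J_R^{\alpha,n} \nabla^R_{T^{\mu}} \phi(g).
\end{aligned}
\]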
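A simple consistency check on the current brackets (4.9a) is their antisymmetry, which ties the central term $ikn\,\delta_{m+n,0}$ to the mode index. The sketch below uses only the brackets as quoted in (4.9a), together with $|{-\alpha}|^2 = |\alpha|^2$ and the linearity of $\langle \cdot, H^{\mu} \rangle$ in the root; the second bracket in each sum is obtained from (4.9a) by exchanging $(\mu,m) \leftrightarrow (\nu,n)$, respectively $(\alpha,m) \leftrightarrow (-\alpha,n)$.

```latex
\begin{align*}
\{J_L^{\mu,m}, J_L^{\nu,n}\}_D + \{J_L^{\nu,n}, J_L^{\mu,m}\}_D
  &= i k \, \delta^{\mu\nu} (n + m) \, \delta_{m+n,0} = 0, \\
\{J_L^{\alpha,m}, J_L^{-\alpha,n}\}_D + \{J_L^{-\alpha,n}, J_L^{\alpha,m}\}_D
  &= \frac{2}{|\alpha|^2} \Big( \big( \langle \alpha, H^{\mu} \rangle + \langle -\alpha, H^{\mu} \rangle \big) J_L^{\mu,m+n}
     + i k (n + m) \, \delta_{m+n,0} \Big) = 0 .
\end{align*}
```

The same computation goes through for (4.9b) with $k \to -k$. Setting $U = 0$ removes the quadratic term in the last bracket of (4.9a), leaving the bracket $\{J_L^{\alpha,m}, J_L^{\beta,n}\}_D = c_{\alpha\beta} J_L^{\alpha+\beta,m+n}$ linear in the currents.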
4.2. Symplectic reduction: Generalities

The symplectic reduction is the method of construction of new symplectic manifolds out of old ones. The simplest way of explaining the method relies on the dual language which uses the algebra of functions $\mathrm{Fun}(M)$ on a symplectic manifold $M$ rather than the manifold $M$ itself. We note that the space $\mathrm{Fun}(M)$ is the Poisson algebra, i.e. the Lie algebra compatible with the structure of the (standard commutative point-wise) multiplication on $\mathrm{Fun}(M)$. The Lie commutator is nothing but the Poisson bracket $\{.,.\}_M$ corresponding to a symplectic structure $\omega_M$ on $M$ and the compatibility condition is given by the Leibniz rule:
\[
\{f, gh\}_M = \{f, g\}_M h + \{f, h\}_M g, \qquad f, g, h \in \mathrm{Fun}(M).
\]
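For orientation, here is a minimal worked instance of the Leibniz rule. It assumes the simplest possible example, $M = \mathbb{R}^2$ with coordinates $(q, p)$ and the canonical bracket $\{f, g\}_M = \partial_q f \, \partial_p g - \partial_p f \, \partial_q g$ (this toy model is our choice for illustration and does not appear in the text), with $f = q$, $g = qp$ and $h = p$.

```latex
\begin{align*}
\{q, (qp)\,p\}_M &= \partial_p (q p^2) = 2qp, \\
\{q, qp\}_M \, p + \{q, p\}_M \, qp &= (q)\,p + (1)\,qp = 2qp .
\end{align*}
```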
Let $J$ be an ideal of the algebra $\mathrm{Fun}(M)$ with respect to the ordinary commutative multiplication on $\mathrm{Fun}(M)$ (typically, $J$ is the ideal of functions vanishing on a submanifold $N \subset M$). Let $J$ be also the Poisson subalgebra of $\mathrm{Fun}(M)$, i.e. $\{J, J\} \subset J$. We can now construct a new Poisson algebra $\tilde{A}$ defined as follows
\[
\tilde{A} = \{ f \in \mathrm{Fun}(M); \ \{f, J\}_M \subset J \}.
\]
Note that the property $\{J, J\} \subset J$ implies that $J \subset \tilde{A}$. By construction, $J$ is not only the ordinary ideal of $\tilde{A}$ but it is also the Poisson ideal, i.e. $\{\tilde{A}, J\}_M \subset J$. Obviously, the factor algebra $A_r \equiv \tilde{A}/J$ inherits the Poisson bracket from $\tilde{A}$, hence it becomes itself the Poisson algebra. If $J$ is the ideal of functions vanishing on a submanifold $N \subset M$, then the algebra $A_r$ is nothing but the Poisson algebra of functions corresponding to some symplectic manifold $M_r$. The manifold $M_r$ together with its corresponding Poisson bracket $\{.,.\}_r$ (or, equivalently, with its symplectic form $\omega_r$) is called the reduced symplectic manifold. If there is a Hamiltonian $H$ on $M$ such that $H \in \tilde{A}$, its class in $\tilde{A}/J$ is denoted as $H_r$ and it is referred to as the reduced Hamiltonian.

The symplectic reduction is often put in relation with the actions of Lie groups on the non-reduced manifold $M$. It may even happen that a reader used to the group approach to the symplectic reduction does not recognize at first reading that this way of thinking about the reduction is just a particular case of the general algebraic definition presented above. We believe that it is worth elucidating this point, and not only for pedagogical reasons. In fact, the group-based symplectic reduction will turn out to be at the core of our gauging of the $u$-WZW model. We shall work in the general Poisson–Lie setting; the standard Hamiltonian symplectic reduction (cf. [14] and references therein) will be the special case of our discussion when the cosymmetry group $B$ is Abelian.

Suppose that there is a non-anomalous moment map $\mu : M \to B$ realizing the $(G, B)$-Poisson–Lie symmetry of $M$ (cf. Definition 2.2 of Sec. 2.2). Due to the property (2.3b) of the Poisson–Lie bracket on $\mathrm{Fun}(B)$, we know that the kernel of the counit $\mathrm{Ker}(\varepsilon)$ is the Poisson subalgebra of $(\mathrm{Fun}(B), \{.,.\}_B)$. Since the moment map $\mu$ is non-anomalous, the pull-back $\mu^*(\mathrm{Ker}(\varepsilon))$ is also the Poisson subalgebra of $(\mathrm{Fun}(M), \{.,.\}_M)$. Thus the role of the ideal $J$ from the general definition above is played by the ideal of $\mathrm{Fun}(M)$ generated by $\mu^*(\mathrm{Ker}(\varepsilon))$. We denote it also by the letter $J$. In the situation just described, the resulting reduced symplectic manifold $M_r$ (corresponding to the reduced Poisson algebra $\tilde{A}/J$) can be easily “visualized”. For this, let us suppose that the set $P$ of points of $M$ mapped by $\mu$ to the unit element $e$ of the cosymmetry group $B$ forms a smooth submanifold of $M$.
It is not difficult to verify that the action of the symmetry group $G$ (which is itself locally induced by the moment map $\mu$) leaves $P$ invariant. Let us moreover suppose that the $G$-action on $P$ is free, or, in other words, that $P$ is isomorphic to a principal $G$-bundle. Then the base $P/G$ of this $G$-fibration can be identified with the reduced symplectic manifold $M_r$. The restriction of the symplectic form $\omega$ to $P$ becomes degenerate and the degeneracy directions of $\omega$ turn out to be nothing but the orbits of the gauge group $G$. Thus the symplectic form $\omega_r$ is naturally induced from $\omega$. Indeed, on each local trivialization of the $G$-bundle $P$ we can choose a slice. The restriction of $\omega$ to the slice is the reduced symplectic form $\omega_r$. A particularly good situation occurs when the $G$-fibration of $P$ is topologically trivial. In this case, one can visualize the reduced symplectic manifold as a submanifold of $P$ (and, hence, as a submanifold of the original symplectic manifold $M$). This can be done by choosing a global slice $Q_i = 0$, where the functions $Q_i$ are in $\mathrm{Fun}(M)$. In the usual terminology, the functions $J_i \in \mu^*(\mathrm{Ker}(\varepsilon)) \subset \mathrm{Fun}(M)$ are called the first class constraints and the functions $Q_i$ their complementary second class constraints. The reduced symplectic manifold $M_r$ is now the common locus of all constraints $J_i = 0$ and $Q_i = 0$ and the reduced symplectic form $\omega_r$ is the pull-back of the non-reduced form $\omega$ to the submanifold $M_r$.
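The algebraic construction and its group-theoretical visualization can be followed step by step in a toy example. This is a sketch under assumptions of our own choosing, not taken from the text: $M = \mathbb{R}^4$ with coordinates $(q_1, p_1, q_2, p_2)$, the canonical bracket $\{f, g\}_M = \sum_{a} (\partial_{q_a} f \, \partial_{p_a} g - \partial_{p_a} f \, \partial_{q_a} g)$, the Abelian groups $G = B = \mathbb{R}$, and the moment map $\mu = p_1$ generating the translations $q_1 \to q_1 + t$.

```latex
\begin{align*}
J &= \{ a \, p_1 ; \ a \in \mathrm{Fun}(M) \}, \qquad P = \{ p_1 = 0 \}, \\
\{f, a \, p_1\}_M &= \{f, a\}_M \, p_1 + a \, \partial_{q_1} f
  \quad \Longrightarrow \quad
  \tilde{A} = \{ f \in \mathrm{Fun}(M) ; \ \partial_{q_1} f \big|_{p_1 = 0} = 0 \}, \\
\{f, g\}_M \big|_{p_1 = 0}
  &= \big( \partial_{q_2} f \, \partial_{p_2} g - \partial_{p_2} f \, \partial_{q_2} g \big) \big|_{p_1 = 0}
  \qquad \text{for } f, g \in \tilde{A}, \\
A_r &= \tilde{A}/J \cong \mathrm{Fun}(\mathbb{R}^2_{(q_2, p_2)}), \qquad M_r = P/G \cong \mathbb{R}^2 .
\end{align*}
```

In this example the first class constraint is $J_1 = p_1$ and a complementary global slice is $Q_1 = q_1 = 0$, with $\{Q_1, J_1\}_M = 1 \neq 0$; the common locus $p_1 = q_1 = 0$ is the $(q_2, p_2)$-plane with its canonical bracket, in agreement with the quotient $P/G$.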
It is sometimes convenient to fix the gauge only partially. This means that there exists a slice $Q_\gamma = 0$ (the subscript $\gamma$ runs over a smaller set than the subscript $i$) which restricts the gauge freedom to some subgroup $H \subset G$. If we denote by the letter $L$ the common locus $J_i = 0$, $Q_\gamma = 0$ in $M$, the reduced symplectic manifold $M_r$ can be identified with the coset space $L/H$. The interest in such partial gauge fixing will be evident in the studies of the symplectic structure of the standard gauged WZW model and of its deformations. Indeed, as we shall see in the following section, there exists a partial gauge fixing for which the manifold $L$ has a very simple left-right chiral symmetric description and the residual gauge group $H$ is finite-dimensional, compact and Abelian.

4.3. Symplectic reduction of the u-WZW model

We start this section by remarking that the twisting automorphism $\kappa$ given by (4.2c) not only preserves the cosymmetry group $B$ described in (4.3b) but it leaves invariant every element of $B$. This means that we can safely apply Theorem 3.1 of Sec. 3.1 which now states that the products $\Lambda_L \Lambda_R \equiv B_L$ and $\Lambda_R \Lambda_L \equiv B_R$ are both non-anomalous moment maps. We already know from the general theory that both $B_L$ and $B_R$ realize the global Poisson–Lie symmetries of the twisted Heisenberg double $(D, \kappa)$; therefore, via their corresponding maps $w_{B_L}$, $w_{B_R}$ (cf. (2.4)), they induce the respective actions (3.4a), (3.4b) of the loop group $G = LK$ on $(D, \kappa)$. Let us work, for concreteness, with the moment map $B_L = \Lambda_L \Lambda_R$. Recall the group multiplication law in $B$:
\[
(\chi_1, e^{u(\chi_1)}).(\chi_2, e^{u(\chi_2)}) = (\chi_1 + e^{u(\chi_1)} \chi_2 e^{-u(\chi_1)}, e^{u(\chi_1) + u(\chi_2)}), \qquad \chi_1, \chi_2 \in L\mathcal{K}. \tag{4.10}
\]
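The formula (4.10) together with Eq. (4.5b) allow us to calculate the $B_{L,R}^*$-pull-backs of the basic functions from $\mathrm{Fun}^{\mathbb{C}}(B)$:
\[
B_L^*(F^{\alpha,n}) = (\Lambda_L \Lambda_R)^*(F^{\alpha,n}) = J_L^{\alpha,n} + e^{-\langle \alpha, U(H^{\mu}) \rangle J_L^{\mu,0}} J_R^{\alpha,n},
\]
\[
B_R^*(F^{\alpha,n}) = (\Lambda_R \Lambda_L)^*(F^{\alpha,n}) = J_R^{\alpha,n} + e^{-\langle \alpha, U(H^{\mu}) \rangle J_R^{\mu,0}} J_L^{\alpha,n},
\]
\[
B_L^*(F^{\mu,n}) = B_R^*(F^{\mu,n}) = J_L^{\mu,n} + J_R^{\mu,n}.
\]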
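These pull-backs can be recovered in one line from the coproduct of Sec. 4.1. The sketch below assumes the standard dual description of a pointwise product of maps into a group, $(\Lambda_L \Lambda_R)^* F = m \circ (\Lambda_L^* \otimes \Lambda_R^*) \circ \Delta F$ with $m$ the pointwise multiplication in $\mathrm{Fun}(D)$, and uses $\Lambda_{L,R}^*(F^{\alpha,n}) = J_{L,R}^{\alpha,n}$, $\Lambda_{L,R}^*(F^{\mu,n}) = J_{L,R}^{\mu,n}$ and $\Lambda_{L,R}^*(1) = 1$.

```latex
\begin{align*}
B_L^*(F^{\alpha,n})
  &= m \circ (\Lambda_L^* \otimes \Lambda_R^*)
     \Big( F^{\alpha,n} \otimes 1 + e^{-\langle \alpha, U(H^{\mu}) \rangle F^{\mu,0}} \otimes F^{\alpha,n} \Big) \\
  &= J_L^{\alpha,n} + e^{-\langle \alpha, U(H^{\mu}) \rangle J_L^{\mu,0}} J_R^{\alpha,n}, \\
B_L^*(F^{\mu,n})
  &= m \circ (\Lambda_L^* \otimes \Lambda_R^*)
     \big( F^{\mu,n} \otimes 1 + 1 \otimes F^{\mu,n} \big)
   = J_L^{\mu,n} + J_R^{\mu,n} .
\end{align*}
```

Exchanging the roles of $\Lambda_L$ and $\Lambda_R$ gives $B_R^*$; the Cartan components coincide because $\Delta F^{\mu,n}$ is cocommutative.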
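Now we are ready to make explicit the map $w_{B_L} : \mathrm{Fun}(B) \to \mathrm{Vect}(D)$:
\[
\begin{aligned}
w_{B_L}(F^{\alpha,n}) f &\equiv \{f, B_L^*((F^{\alpha,n})')\}_D \, B_L^*(S((F^{\alpha,n})'')) \\
&= \nabla^L_{\kappa(E^{\alpha,n})} f - e^{-\langle \alpha, U(H^{\mu}) \rangle J_L^{\mu,0}} \nabla^R_{E^{\alpha,n}} f - \langle \alpha, U(H^{\mu}) \rangle J_L^{\alpha,n} \nabla^R_{H^{\mu}} f, \\
w_{B_L}(F^{\mu,n}) f &\equiv \{f, B_L^*((F^{\mu,n})')\}_D \, B_L^*(S((F^{\mu,n})'')) \\
&= \nabla^L_{\kappa(H^{\mu,n})} f - \nabla^R_{H^{\mu,n}} f, \qquad f \in \mathrm{Fun}^{\mathbb{C}}(D).
\end{aligned}
\]
Recall that the symbol $w_{B_L}(F^{\alpha,n})$ denotes the (complex) vector field on $D$ corresponding to the Poisson–Lie Hamiltonian $F^{\alpha,n} \in \mathrm{Fun}^{\mathbb{C}}(B)$. Similarly, we find
\[
\begin{aligned}
w_{B_R}(F^{\alpha,n}) f &\equiv \{f, B_R^*((F^{\alpha,n})')\}_D \, B_R^*(S((F^{\alpha,n})'')) \\
&= -\nabla^R_{E^{\alpha,n}} f + e^{-\langle \alpha, U(H^{\mu}) \rangle J_R^{\mu,0}} \nabla^L_{\kappa(E^{\alpha,n})} f + \langle \alpha, U(H^{\mu}) \rangle J_R^{\alpha,n} \nabla^L_{H^{\mu}} f, \\
w_{B_R}(F^{\mu,n}) f &\equiv \{f, B_R^*((F^{\mu,n})')\}_D \, B_R^*(S((F^{\mu,n})'')) \\
&= \nabla^L_{\kappa(H^{\mu,n})} f - \nabla^R_{H^{\mu,n}} f, \qquad f \in \mathrm{Fun}^{\mathbb{C}}(D).
\end{aligned}
\]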
It is easy to check that the vector fields $w_{B_L}(F^{\alpha,n})$, $w_{B_L}(F^{\mu,n})$ and also $w_{B_R}(F^{\alpha,n})$, $w_{B_R}(F^{\mu,n})$ generate the actions of the Lie algebra $L\mathcal{K}^{\mathbb{C}}$ on $\mathrm{Fun}^{\mathbb{C}}(D)$. Moreover, it can also be seen that, by considering only the Poisson–Lie Hamiltonians from $\mathrm{Fun}^{\mathbb{R}}(B)$, these actions get restricted to the actions of $L\mathcal{K}$ on $\mathrm{Fun}^{\mathbb{R}}(D)$. It is not difficult to lift the $L\mathcal{K}$ actions just described to the $LK$ actions. The resulting formulae are the special cases of the general formulae (3.4a) and (3.4b):
\[
h \triangleright (\chi, g) = \kappa(h)(\chi, g) h_L^{-1}, \qquad h_L = e^{-u(h J_L h^{-1} + \kappa \partial h h^{-1})} \, h \, e^{u(J_L)}, \qquad h \in LK, \tag{4.11a}
\]
\[
h \triangleright (\chi, g) = \kappa(h_R)(\chi, g) h^{-1}, \qquad h_R = e^{-u(h J_R h^{-1} - \kappa \partial h h^{-1})} \, h \, e^{u(J_R)}, \qquad h \in LK. \tag{4.11b}
\]
We notice that for $U \to 0$ the cosymmetry group $B$ becomes Abelian and the $LK$-actions (4.11a) and (4.11b) coincide and (as we have promised to show in Sec. 3.1) they become identical to the twisted adjoint action $h \triangleright (\chi, g) = \kappa(h)(\chi, g) h^{-1}$.

Let $\Upsilon$ be a subset of the set of all positive roots of the Lie algebra $\mathcal{K}^{\mathbb{C}}$. Consider a complex vector space $\mathcal{S}^{\mathbb{C}}$ defined as
\[
\mathcal{S}^{\mathbb{C}} = \mathrm{Span}\{E^{\gamma}, E^{-\gamma}, [E^{\gamma}, E^{-\gamma}]\}, \qquad \gamma \in \Upsilon.
\]
In the rest of this paper, we shall suppose that the subset $\Upsilon$ was chosen in such a way that the vector space $\mathcal{S}^{\mathbb{C}}$ is the Lie subalgebra of $\mathcal{K}^{\mathbb{C}}$ (as an example take the block diagonal embedding of $sl_3$ in $sl_4$). Obviously, the vector space
\[
\mathcal{T}_S^{\mathbb{C}} = \mathrm{Span}\{[E^{\gamma}, E^{-\gamma}]\}, \qquad \gamma \in \Upsilon
\]
is the Cartan subalgebra of $\mathcal{S}^{\mathbb{C}}$. The complex Lie algebra $\mathcal{S}^{\mathbb{C}}$ has a natural compact real form $\mathcal{S}$ consisting of the anti-Hermitean elements of $\mathcal{S}^{\mathbb{C}}$. Consider the corresponding compact semi-simple group $S$ and view it as the subgroup of $K$. We are now going to establish the conditions on the operator $U$ which will guarantee that the action of the loop group $LS$ on $D$ via (4.11a) or (4.11b) is the Poisson–Lie subsymmetry.
Suppose that for all $\gamma \in \Upsilon$, the operator $U : \mathcal{T} \to \mathcal{T}$ fulfils the following condition
\[
(\gamma \circ U)(\mathcal{T}_S^{\perp}) = 0, \tag{4.12}
\]
where the superscript $\perp$ stands for the orthogonal complement with respect to the restriction of the Killing–Cartan form $(.,.)_{\mathcal{K}}$ to $\mathcal{T}$. It is then easy to verify that the set
\[
N = \{(\chi, g) \in D; \ g = e^{u(\chi)}, \ \chi \in \mathcal{S}^{\perp}\}
\]
is the normal subgroup of $B$. Consider the algebra of complex functions on the group $C = B/N$. As we have learned in Sec. 3.2, $\mathrm{Fun}^{\mathbb{C}}(C)$ can be injected by the map $\rho^*$ into $\mathrm{Fun}^{\mathbb{C}}(B)$. (Note that $\rho^*$ is the dual map to the projection homomorphism $\rho : B \to B/N = C$.) It is easy to see that $\rho^*(\mathrm{Fun}^{\mathbb{C}}(C))$ is spanned by the functions $F^{\gamma,n}$, $F^{\nu,n}$ where $\gamma \in \Upsilon$ and $H^{\nu} \in \mathcal{T}_S$. The normality of the subgroup $N$ implies that the vector space $\rho^*(\mathrm{Fun}^{\mathbb{C}}(C))$ is in fact the Hopf subalgebra of $\mathrm{Fun}^{\mathbb{C}}(B)$. By using the explicit form of the Poisson–Lie brackets on $\mathrm{Fun}^{\mathbb{C}}(B)$, it is straightforward to check that $\rho^*(\mathrm{Fun}^{\mathbb{C}}(C))$ is also the Poisson subalgebra of $\mathrm{Fun}^{\mathbb{C}}(B)$. It is moreover true that $\rho^*(\mathrm{Fun}^{\mathbb{C}}(C))$ is $\dagger$-invariant, hence we conclude that $\rho^*(\mathrm{Fun}(C))$ is the Poisson subalgebra of $\mathrm{Fun}(B)$. All that means that we can use Theorem 3.2 of Sec. 3.2 to conclude that the action of the loop group $LS$ on $D$ via (4.11a,b) is the Poisson–Lie subsymmetry.

Our next goal is to gauge this (non-anomalous) subsymmetry, or, in other words, to perform the symplectic reduction with respect to it. Consider the $LS$-subsymmetry moment map $C_L = \rho \circ B_L$, where $\rho$ is the projection homomorphism from $B$ to $C = B/N$. The first step of the reduction procedure consists in identification of the submanifold $P_L \subset D$ such that every point $p \in P_L$ is mapped by $C_L$ to the unit element of the group $C$. It is easy to see that
\[
P_L = \{p \in D; \ J_L^{\gamma,n}(p) + e^{-\langle \gamma, U(H^{\nu}) \rangle J_L^{\nu,0}(p)} J_R^{\gamma,n}(p) = 0, \ J_L^{\nu,n}(p) + J_R^{\nu,n}(p) = 0\},
\]
where $\gamma \in \pm\Upsilon$ and $\nu$ is such that $H^{\nu} \in \mathcal{T}_S$. In physicists' terminology, the expressions
\[
J_L^{\gamma,n} + e^{-\langle \gamma, U(H^{\nu}) \rangle J_L^{\nu,0}} J_R^{\gamma,n} = 0, \qquad J_L^{\nu,n} + J_R^{\nu,n} = 0 \tag{4.13}
\]
are the first class constraints since it is not difficult to verify that the Poisson brackets of the constraints among themselves as well as those of the Hamiltonian (4.7) with the constraints vanish on the constrained surface $P_L$. Now the $u$-deformed WZW symplectic form $\omega_u$ restricted to $P_L$ becomes degenerate in the directions of the action of $LS$ on $P_L$. As we already know from Sec. 4.2, the reduced symplectic manifold $M_r$ can be identified with the coset space $P_L/LS$.

We now perform a partial gauge fixing (cf. the general discussion in Sec. 4.2) which will lead to a very elegant left-right symmetric chiral description of the symplectic structure of the reduced symplectic manifold $M_r$. For this, we first study the action of $LS$ on $D$ given by the formula (4.11a). By using the formula (2.8a), we rewrite it as follows
\[
s \triangleright (\chi, g) = (s \chi s^{-1} + k \partial_\sigma s s^{-1}, \ s g s_L^{-1}), \qquad s_L = e^{-u(s J_L s^{-1} + \kappa \partial s s^{-1})} \, s \, e^{u(J_L)}, \qquad s \in LK. \tag{4.14}
\]
It is convenient to decompose $\chi$ as $\chi_s + \chi_p$, where $\chi_s \in L\mathcal{S}$ and $\chi_p \in L\mathcal{S}^{\perp}$. We thus see from Eq. (4.14) that $\chi_s$ and $\chi_p$ do not mix under the action of $s$. We know that every $\chi_s$ can be brought by some $s$ to an element of the finite-dimensional Cartan subalgebra $\mathcal{T}_S$ (cf. [9, Theorem 3.6]). Having in mind the definition (4.5a) of $J_L$, this leads to the following natural slice on $D$:
\[
J_L^{\gamma,n} = 0, \qquad \gamma \in \pm\Upsilon, \quad n \in \mathbb{Z}, \tag{4.15a}
\]
\[
J_L^{\nu,n} = 0, \qquad n \in \mathbb{Z}, \quad n \neq 0, \tag{4.15b}
\]
where $\nu$ is such that $H^{\nu} \in \mathcal{T}_S$. This slice is partial (it corresponds to the slice $Q_\gamma = 0$ in the general discussion of Sec. 4.2). Indeed, the residual gauge group $H$ is the normalizer of the Cartan subalgebra $\mathcal{T}_S$ and, as the discussion before [9, Theorem 3.6] implies, the finite-dimensional Cartan torus $T_S$ is the normal subgroup of $H$. (In fact $H/T_S$ is nothing but the affine Weyl group of $LS$.) The constraints (4.13) and (4.15) can now be rewritten in a $U$-independent way as
\[
J_L^{\gamma,n} = 0, \qquad J_R^{\gamma,n} = 0, \qquad \gamma \in \pm\Upsilon, \quad n \in \mathbb{Z}, \tag{4.16a}
\]
\[
J_L^{\nu,n} = 0, \qquad J_R^{\nu,n} = 0, \qquad n \in \mathbb{Z}, \quad n \neq 0, \tag{4.16b}
\]
\[
J_L^{\nu,0} + J_R^{\nu,0} = 0, \tag{4.16c}
\]
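where $\nu$ is such that $H^{\nu} \in \mathcal{T}_S$. The constraints (4.16) define the submanifold $L \subset D$ and the reduced symplectic manifold $M_r$ can be identified with the space of cosets $L/H$.

A similar discussion can be performed also with the moment map $C_R = \rho \circ B_R$. The first class constrained manifold $P_R$ is
\[
P_R = \{p \in D; \ J_R^{\gamma,n}(p) + e^{-\langle \gamma, U(H^{\nu}) \rangle J_R^{\nu,0}(p)} J_L^{\gamma,n}(p) = 0, \ J_L^{\nu,n}(p) + J_R^{\nu,n}(p) = 0\}, \tag{4.17}
\]
where $n \in \mathbb{Z}$, $\gamma \in \pm\Upsilon$ and $\nu$ is such that $H^{\nu} \in \mathcal{T}_S$. The partial slice on $D$ is
\[
J_R^{\gamma,n} = 0, \qquad \gamma \in \pm\Upsilon, \quad n \in \mathbb{Z}, \tag{4.18a}
\]
\[
J_R^{\nu,n} = 0, \qquad n \in \mathbb{Z}, \quad n \neq 0, \tag{4.18b}
\]
where $\nu$ is such that $H^{\nu} \in \mathcal{T}_S$. The constraints (4.17) and (4.18) can also be rewritten in the $U$-independent way as
\[
J_L^{\gamma,n} = 0, \qquad J_R^{\gamma,n} = 0, \qquad \gamma \in \pm\Upsilon, \quad n \in \mathbb{Z}, \tag{4.19a}
\]
\[
J_L^{\nu,n} = 0, \qquad J_R^{\nu,n} = 0, \qquad n \in \mathbb{Z}, \quad n \neq 0, \tag{4.19b}
\]
\[
J_L^{\nu,0} + J_R^{\nu,0} = 0. \tag{4.19c}
\]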
We thus see that the symplectic reduction based on the moment map $B_R$ gives the same result as the one based on $B_L$. This happens in spite of the fact that $w_{C_L}$ and $w_{C_R}$ induce different actions of the gauge group $LS$ on $D$.

Our next task will be the description of the symplectic form $\omega_r$ on $M_r$. Actually, we shall describe the pull-back of the original Semenov-Tian-Shansky form $\omega_u$ on $D$ to the submanifold $L \subset D$. We again use [9, Theorem 3.6] which permits us to parametrize the Heisenberg double $D$ by means of two elements $g_L$, $g_R$ of $LK$ and one element $\mu$ of the Weyl alcove $A_K$ in the Cartan subalgebra $\mathcal{T}_K \subset \mathcal{K}$:
\[
(\chi, g) = \kappa(0, g_L)(\mu, e_{LK})(0, g_R)^{-1} = (g_L \mu g_L^{-1} + k \partial_\sigma g_L g_L^{-1}, \ g_L g_R^{-1}). \tag{4.20}
\]
Here $e_{LK}$ is the unit element in $LK$. The Semenov-Tian-Shansky form $\omega_u$ given by (4.6) gets rewritten in the new variables as follows
\[
\begin{aligned}
\tilde{\omega}_u = {} & -d(\mu | g_R^{-1} dg_R) + \frac{k}{2} (g_R^{-1} dg_R \wedge | \partial (g_R^{-1} dg_R)) + \frac{1}{2} (u(dJ_R) \wedge | dJ_R) \\
& + d(\mu | g_L^{-1} dg_L) - \frac{k}{2} (g_L^{-1} dg_L \wedge | \partial (g_L^{-1} dg_L)) + \frac{1}{2} (u(dJ_L) \wedge | dJ_L),
\end{aligned}
\tag{4.21}
\]
where
\[
J_L = g_L \mu g_L^{-1} + k \partial_\sigma g_L g_L^{-1}, \qquad J_R = -g_R \mu g_R^{-1} - k \partial_\sigma g_R g_R^{-1}.
\]
Before giving the interpretation of the reduced symplectic manifold in terms of the deformed gauged WZW model, let us first study the residual gauge symmetries of the form $\tilde{\omega}_u$. We recall that the residual gauge group $H$ is the normalizer of the Cartan algebra $\mathcal{T}_S$. We can make it smaller by further gauge fixing. Thus we suppose that the variable $J_L^{\nu,0}$ $(= -J_R^{\nu,0})$ takes values only in the Weyl alcove of $\mathcal{T}_S$. (We recall that the Weyl alcove is the fundamental domain of the action of the affine Weyl group of $LS$ on $\mathcal{T}_S$.) With this restriction the residual gauge group becomes just the Cartan torus $T_S$ acting as
\[
t_S \triangleright (g_L, g_R) = (t_S g_L, t_S g_R), \qquad t_S \in T_S. \tag{4.22}
\]
Indeed, replacing $g_{L,R}$ by $t_S g_{L,R}$ in (4.21), the form $\tilde{\omega}_u$ transforms as
\[
\tilde{\omega}_u \to \tilde{\omega}_u + d(J_L + J_R | t_S^{-1} dt_S) = \tilde{\omega}_u,
\]
since the term $d(J_L + J_R | t_S^{-1} dt_S)$ vanishes due to the constraint $J_L^{\nu,0} + J_R^{\nu,0} = 0$. It is important to stress that the parametrization (4.20) of the double $D$ via the variables $\mu$, $g_L$, $g_R$ gave rise to another gauge symmetry of the form $\tilde{\omega}_u$ which is related to the ambiguity of the chiral decomposition (4.20). Indeed, if we pick an arbitrary element $t_K$ from the Cartan torus $T_K$ then it holds
\[
(\chi, g) = \kappa(0, g_L)(\mu, e_{LK})(0, g_R)^{-1} = \kappa(0, g_L t_K)(\mu, e_{LK})(0, g_R t_K)^{-1}.
\]
This means that the full residual gauge group of the form $\tilde{\omega}_u$ is $T_S \times T_K$ acting as
\[
(t_S, t_K) \triangleright (g_L, g_R) = (t_S g_L t_K, t_S g_R t_K), \qquad t_S \in T_S, \quad t_K \in T_K.
\]
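The ambiguity of the chiral decomposition can be checked directly on the right-hand side of (4.20). The sketch below assumes that $t_K$ is a $\sigma$-independent element of the finite-dimensional Cartan torus $T_K$, so that $\partial_\sigma t_K = 0$, and that it commutes with $\mu \in \mathcal{T}_K$.

```latex
\begin{align*}
(g_L t_K) \, \mu \, (g_L t_K)^{-1} + k \, \partial_\sigma (g_L t_K) \, (g_L t_K)^{-1}
  &= g_L \, (t_K \mu \, t_K^{-1}) \, g_L^{-1} + k \, \partial_\sigma g_L \, t_K t_K^{-1} g_L^{-1} \\
  &= g_L \mu \, g_L^{-1} + k \, \partial_\sigma g_L \, g_L^{-1}, \\
(g_L t_K)(g_R t_K)^{-1} &= g_L \, t_K t_K^{-1} \, g_R^{-1} = g_L g_R^{-1} .
\end{align*}
```

Both components of $(\chi, g)$ are therefore unchanged, which is the statement that the simultaneous right multiplication of $g_L$ and $g_R$ by $t_K$ is invisible on $D$.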
The reader may find it strange that we have somewhat artificially augmented the residual gauge symmetry of the Semenov-Tian-Shansky form $\omega_u$ by expressing it in the new ambiguous variables $\mu$, $g_L$, $g_R$. However, the benefit of this parametrization consists in the fact that in the form $\tilde{\omega}_u$ the variables $g_L$ and $g_R$ get disentangled. The form $\tilde{\omega}_u$ is defined on the manifold $LK \times A_K \times LK$ and it is the pull-back of the Semenov-Tian-Shansky form $\omega_u$ on $D$ via the map (4.20). Obviously, it holds $D = (LK \times A_K \times LK)/T_K$. We conclude this section by the observation that the Hamiltonian (4.7) of the $u$-WZW model descends to the reduced Hamiltonian $H_r$ (cf. the general discussion in Sec. 4.2). Thus our symplectic reduction has produced a new dynamical system $(M_r, \omega_r, H_r)$ that will be interpreted in the next subsection as the deformed gauged WZW model.

4.4. Interpretation

The gauged WZW model is a dynamical system and its symplectic structure has been thoroughly investigated, e.g., in [7, Sec. 3.2 and Appendix A]. We report here Gawędzki's results in the language of the left-right movers, by considering maps $m_L, m_R : \mathbb{R} \to K$ fulfilling
\[
(\partial_\xi m_{L,R} m_{L,R}^{-1}, \mathcal{S})_{\mathcal{K}} = 0, \tag{4.23a}
\]
\[
m_{L,R}(\xi + 2\pi) = e^{-\frac{2\pi\nu}{k}} m_{L,R}(\xi) e^{\frac{2\pi\mu}{k}}, \tag{4.23b}
\]
where $\mu$ is in the Weyl alcove of $\mathcal{T}_K$ and $\nu$ in the Weyl alcove of $\mathcal{T}_S$. The symplectic form of the gauged WZW model is then given by the following expression (cf. [7, Eq. (A.1)])
\[
\begin{aligned}
\omega^{K/S} = {} & -\frac{k}{2} (m_L^{-1} dm_L \wedge | \partial_\xi (m_L^{-1} dm_L)) + \frac{k}{2} (m_R^{-1} dm_R \wedge | \partial_\xi (m_R^{-1} dm_R)) \\
& - \frac{1}{2} \Big( (m_L^{-1} dm_L)(0) - m_L^{-1}(0) \frac{2\pi d\nu}{k} m_L(0), \wedge d\mu \Big)_{\mathcal{K}} - \frac{1}{2} ((dm_L m_L^{-1})(0), \wedge d\nu)_{\mathcal{K}} \\
& + \frac{1}{2} \Big( (m_R^{-1} dm_R)(0) - m_R^{-1}(0) \frac{2\pi d\nu}{k} m_R(0), \wedge d\mu \Big)_{\mathcal{K}} + \frac{1}{2} ((dm_R m_R^{-1})(0), \wedge d\nu)_{\mathcal{K}}.
\end{aligned}
\]
In writing the form $\omega^{K/S}$, we have switched from Gawędzki's notations to ours (e.g., we have used $(.,.)_{\mathcal{K}}$ instead of $\mathrm{Tr}(.,.)$ etc.), nevertheless $\omega^{K/S}$ still does not quite resemble our reduced form $\tilde{\omega}_{u=0}$. In fact, we should note that Gawędzki's chiral movers are quasiperiodic (cf. (4.23b)) while we use the periodic fields $g_{L,R}(\sigma)$. Indeed, if we perform a transformation
\[
m_{L,R}(\xi) = e^{-\frac{\nu\xi}{k}} g_{L,R}(\xi) e^{\frac{\mu\xi}{k}},
\]
the conditions (4.23) become
\[
(g_{L,R} \mu g_{L,R}^{-1} + k \partial_\sigma g_{L,R} g_{L,R}^{-1} - \nu, \mathcal{S})_{\mathcal{K}} = 0, \tag{4.24a}
\]
\[
g_{L,R}(\xi + 2\pi) = g_{L,R}(\xi) \tag{4.24b}
\]
and the form $\omega^{K/S}$ transforms to
\[
\omega^{K/S} = d(\mu | g_L^{-1} dg_L - g_R^{-1} dg_R) - \frac{k}{2} (g_L^{-1} dg_L \wedge | \partial (g_L^{-1} dg_L)) + \frac{k}{2} (g_R^{-1} dg_R \wedge | \partial (g_R^{-1} dg_R)). \tag{4.25}
\]
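The passage from the quasiperiodic conditions (4.23) to the periodic ones (4.24) can be spelled out as follows. This is a sketch under the assumptions that $\mu$ and $\nu$ do not depend on $\xi$, that $\nu \in \mathcal{T}_S$ so that conjugation by $e^{\nu\xi/k}$ preserves $\mathcal{S}$, and that the form $(.,.)_{\mathcal{K}}$ is invariant under conjugation; we write $m$, $g$ for either chirality and identify the coordinate $\xi$ with $\sigma$.

```latex
\begin{align*}
m(\xi + 2\pi)
  &= e^{-\frac{2\pi\nu}{k}} \Big( e^{-\frac{\nu\xi}{k}} \, g(\xi + 2\pi) \, e^{\frac{\mu\xi}{k}} \Big) e^{\frac{2\pi\mu}{k}}, \\
k \, \partial_\xi m \, m^{-1}
  &= e^{-\frac{\nu\xi}{k}} \Big( g \mu g^{-1} + k \, \partial_\xi g \, g^{-1} - \nu \Big) e^{\frac{\nu\xi}{k}} .
\end{align*}
```

The first line reproduces (4.23b) exactly when $g(\xi + 2\pi) = g(\xi)$, which is (4.24b). In the second line the conjugation by $e^{-\nu\xi/k}$ can be moved onto $\mathcal{S}$ inside the invariant form, where it acts trivially on the subspace, so that (4.23a) is equivalent to (4.24a).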
It is not difficult to find out that the form (4.25) coincides with the form $\tilde{\omega}_{u=0}$ given by (4.21) and that the constraints (4.24a) are precisely the constraints (4.16). We observe that the symplectic reduction of the $u$-WZW model for $U = 0$ gives the standard gauged WZW model. Therefore, if we switch on a non-trivial $U$, we interpret the reduced theory as the $u$-deformed gauged WZW model.

5. Conclusions and Outlook

In the present paper, we have presented a thorough discussion of the gauging of the deformed WZW models. After the general derivation of the quasi-adjoint actions (3.4a) and (3.4b), which are to be gauged in the general case, we have worked out the $u$-deformed WZW model as an example. Moreover, in Secs. 3.2 and 3.3, we have also introduced the moment maps $\rho \circ \Lambda_{L,R}$ which can be used for deforming the procedure of the null gauging of the WZW models [4, 13].

The main open issue concerning the deformed WZW models is their quantization. Since we have at our disposal a rather explicit description of the Poisson brackets of the deformed WZW models (cf. Sec. 4.1), it seems feasible to identify the operator algebra of the quantum deformed model and also the unitary representations of this algebra. What seems to be more difficult, however, is to extract from the deformed WZW theories general axioms of the deformed vertex algebras. We find this problem exciting and we wish to deal with it in the future.

References

[1] A. Yu. Alekseev and A. Z. Malkin, Symplectic structures associated to Lie–Poisson groups, Commun. Math. Phys. 162 (1994) 147–174; hep-th/9303038.
[2] J. Balog, L. Fehér and L. Palla, Chiral extensions of the WZNW phase space, Poisson–Lie symmetries and groupoids, Nucl. Phys. B 568 (2000) 503–542; hep-th/9910046.
[3] H. Flaschka and T. Ratiu, Convexity theorem for Poisson actions of compact Lie groups, Ann. Sci. École Norm. Sup. 29 (1996) 787–809.
[4] P. Forgács, A. Wipf, J. Balog, L. Fehér and L. O'Raifeartaigh, Liouville and Toda theories as conformally reduced WZNW theories, Phys. Lett. B 227 (1989) 214–220.
[5] K. Gawędzki, Classical origin of quantum group symmetries in WZW conformal field theory, Commun. Math. Phys. 139 (1991) 201–213.
[6] K. Gawędzki, Topological actions in two-dimensional quantum field theories, in Nonperturbative Quantum Field Theory, eds. G. 't Hooft, A. Jaffe, G. Mack, P. K. Mitter and R. Stora (Plenum Press, New York, 1988), pp. 101–141.
[7] K. Gawędzki, Boundary WZW, G/H, G/G and CS theories, Ann. Henri Poincaré 3 (2002) 847–881; hep-th/0108044.
[8] C. Kassel, Quantum Groups (Springer-Verlag, 1995).
[9] C. Klimčík, Quasitriangular WZW model, Rev. Math. Phys. 16 (2004) 679–808; hep-th/0103118.
[10] C. Klimčík, Quasitriangular chiral WZW model in a nutshell, Prog. Theor. Phys. Suppl. 144 (2001) 119–124; hep-th/0108148.
[11] C. Klimčík, Poisson–Lie symmetry and q-WZW model, to appear in Proc. 4th Int. Sympos. Quantum Theory and Symmetries (QTS-4), Varna Free University, Bulgaria (15–21 August, 2005); hep-th/0511003.
[12] C. Klimčík and P. Ševera, Open strings and D-branes in WZNW model, Nucl. Phys. B 488 (1997) 653–676; hep-th/9609112.
[13] C. Klimčík and A. A. Tseytlin, Exact four-dimensional string solutions and Toda-like sigma models from ‘null-gauged’ WZNW theories, Nucl. Phys. B 424 (1994) 71–96; hep-th/9402120.
[14] J.-P. Ortega and T. Ratiu, Momentum Maps and Hamiltonian Reduction (Birkhäuser, Boston, 2004).
[15] M. Semenov-Tian-Shansky, Dressing transformations and Poisson groups actions, Publ. Res. Inst. Math. Sci. 21 (1985) 1237–1260.
[16] M. Semenov-Tian-Shansky, Poisson–Lie groups, quantum duality principle and the twisted quantum double, Theor. Math. Phys. 93 (1992) 1292–1307; hep-th/9304042.
[17] E. Witten, Non-Abelian bosonisation in two dimensions, Commun. Math. Phys. 92 (1984) 455–472.
[18] E. Witten, On holomorphic factorization of WZW and coset models, Commun. Math. Phys. 144 (1992) 189–212.
November 1, 2006 11:8 WSPC/148-RMP J070-00281

Reviews in Mathematical Physics Vol. 18, No. 8 (2006) 823–886 © World Scientific Publishing Company

UNFOLDED FORM OF CONFORMAL EQUATIONS IN M DIMENSIONS AND o(M + 2)-MODULES

O. V. SHAYNKMAN∗, I. YU. TIPUNIN† and M. A. VASILIEV‡
I.E.Tamm Theory Department, Lebedev Physics Institute, Leninski prospect 53, 119991, Moscow, Russia
∗[email protected] †[email protected] ‡[email protected]

Received 11 July 2005
Revised 3 May 2006

A constructive procedure is proposed for formulation of linear differential equations invariant under global symmetry transformations forming a semi-simple Lie algebra f. Under certain conditions, f-invariant systems of differential equations are shown to be associated with f-modules that are integrable with respect to some parabolic subalgebra of f. The suggested construction is motivated by the unfolded formulation of dynamical equations developed in the higher spin gauge theory and provides a starting point for generalization to the nonlinear case. It is applied to the conformal algebra o(M, 2) to classify all linear conformally invariant differential equations in the Minkowski space. Numerous examples of conformal equations are discussed from this perspective.

Keywords: Conformal equations; higher spin fields; representation theory.

Mathematics Subject Classification 2000: 81R20, 81R25, 32L81
Contents

1. Background and Introduction
2. The Simplest Conformal Systems
2.1. Conformal scalar
2.2. Conformal spinor
2.3. Conformal p-forms
2.4. M = 4 electrodynamics
3. General Construction
4. Conformal Systems of Equations
4.1. Irreducible tensors and spinor-tensors
4.2. Generalized Verma modules
4.3. Contragredient modules
4.4. Structure of o(M + 2) generalized Verma modules
4.4.1. M = 2q + 1
4.4.2. M = 2q
4.5. Cohomology of irreducible o(M + 2)-modules
4.6. Examples of calculating cohomology of reducible o(M + 2)-modules
4.7. Conformal equations
4.7.1. Conformal Klein–Gordon and Dirac-like equations for a block
4.7.2. Conformal higher spins in even dimensions
4.7.3. Fradkin–Tseytlin conformal higher spins in even dimensions
5. Conclusions
Appendix A. Relevant Facts from Representation Theory
Appendix B. Homomorphism Diagrams
1. Background and Introduction

In this paper, we apply a method of the analysis of dynamical systems called unfolded formulation to classify all conformally invariant linear differential equations in any space-time dimension $M > 2$. This method, suggested originally for the analysis of higher spin dynamical systems [1–6], proved to be useful for the analysis of problems of deformation quantization [7, 8].

Unfolded formulation of a system of partial differential equations in a space-time with coordinates $x^m$ ($m = 0, \ldots, M - 1$) consists of its reformulation in the first-order form with respect to all coordinates. As such, it is a generalization of the first-order form of ordinary (i.e. $M = 1$) differential equations $\dot{q}^i = G^i(q)$. More precisely, unfolded equations have the form
\[
dU^{\Omega}(x) = G^{\Omega}(U(x)). \tag{1.1}
\]
Here, $d = \xi^m \frac{\partial}{\partial x^m}$ is the exterior differential.$^a$ $U^{\Omega}(x)$ denotes a set of variables being differential forms (i.e. polynomials in $\xi^m$). The condition
\[
G^{\Omega}(U(x)) \wedge \frac{\delta G^{\Lambda}(U(x))}{\delta U^{\Omega}(x)} = 0 \tag{1.2}
\]
is imposed on $G^{\Lambda}(U(x))$ to guarantee that the system is formally consistent. (It is assumed that only wedge products of differential forms appear in (1.1) and (1.2), i.e. $G^{\Omega}(U(x))$ is a polynomial of $U^{\Omega}(x)$ containing no derivatives in $\xi^m$ and $x^m$.) In the case of ordinary differential equations, the variables $q^i(t)$ taken at any $t = t_0$ provide a full set of initial data. For an $M > 1$ unfolded field-theoretical system, the knowledge of the fields $U(x)$ at any $x^m = x_0^m$ also reconstructs $U(x)$ in some neighborhood of $x_0^m$. Therefore, to unfold a field-theoretical system with infinitely many degrees of freedom, it is necessary to introduce infinitely many auxiliary fields. The latter identify with all derivatives of dynamical fields (i.e. with infinitely many generalized momenta).

$^a$ Throughout this paper, we use the notation $\xi^m$ for the basis 1-forms conventionally denoted $dx^m$.
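The origin of the consistency condition (1.2) is the nilpotency of the exterior differential. The one-line sketch below assumes that $G^{\Lambda}$ depends on $x$ only through $U(x)$ and that the variational derivative is taken with the ordering convention in which the chain rule reads $dG^{\Lambda} = dU^{\Omega} \wedge \delta G^{\Lambda}/\delta U^{\Omega}$; with another ordering convention the intermediate signs change but the resulting condition is the same.

```latex
\begin{align*}
0 = d^2 U^{\Lambda}(x)
  = d \, G^{\Lambda}(U(x))
  = dU^{\Omega}(x) \wedge \frac{\delta G^{\Lambda}(U(x))}{\delta U^{\Omega}(x)}
  = G^{\Omega}(U(x)) \wedge \frac{\delta G^{\Lambda}(U(x))}{\delta U^{\Omega}(x)} .
\end{align*}
```

In the last step the unfolded equation (1.1) itself has been used, so (1.2) is exactly the requirement that (1.1) does not generate further algebraic constraints on $U(x)$ when $d$ is applied to it.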
The unfolded formulation, which is available for any dynamical system, has a number of properties that have proved useful for the analysis of various aspects of linear and nonlinear dynamics (see [9] for a recent review). The property of the unfolded formulation which is of particular importance for the analysis of this paper is that it makes symmetries of a model manifest. In particular, the unfolded formulation of any dynamical system possessing one or another linearly realized global symmetry $\mathfrak{g}$ is formulated in terms of some $\mathfrak{g}$-module. This simple observation makes it trivial to list unfolded dynamical systems of a given symmetry. The nontrivial part of the problem is to single out nontrivial dynamical systems in this list that result from unfolding of certain differential equations. (Note that, generally, unfolded equations may describe an infinite set of constraints with no differential equations among them.) As we show in this paper, nontrivial $\mathfrak{g}$-invariant differential equations are associated with the unfolded equations based on $\mathfrak{g}$-modules resulting from factorization of generalized Verma $\mathfrak{g}$-modules over singular submodules. Our scheme is quite general and can be applied to the analysis of various dynamical systems. In this paper, we apply this analysis to classification of conformally invariant linear differential equations.
Let us now analyze relevant properties of unfolded equations more carefully. Due to (1.2), the system (1.1) is invariant under the gauge transformations
\[ \delta U^\Omega(x) = d\epsilon^\Omega(x) + \epsilon^\Lambda(x) \wedge \frac{\delta G^\Omega(U(x))}{\delta U^\Lambda(x)}, \tag{1.3} \]
where the gauge parameters $\epsilon^\Omega(x)$ are arbitrary functions of the coordinates $x^m$. Let $\omega^A(x) = \xi^m \omega^A_m(x)$ be the set of 1-forms in $U^\Omega(x)$. The requirement that the restriction $G^A(U(x))|_{\omega(x)} = -G^A_{BC}\,\omega^B(x) \wedge \omega^C(x)$ to the sector of 1-forms $\omega^A(x)$ is compatible with (1.2) implies that $G^A_{BC}$ satisfy (super)Jacobi identities, thus being structure coefficients of some Lie (super)algebra$^b$ $\mathfrak{h}$. As a result, the restriction of Eq. (1.1) to the sector of 1-forms amounts to the flatness condition on $\omega^A(x)$. In higher spin theories, $\mathfrak{h}$ is some infinite dimensional higher spin symmetry algebra [5, 6, 10–15], which contains one or another finite dimensional space-time symmetry subalgebra $\mathfrak{f}$. For example, $\mathfrak{f} = o(n, 2)$ appears either as anti-de Sitter ($n = M - 1$) or as conformal ($n = M$) algebra in $M$ dimensions. Let $\omega_0^\Omega(x)$ be a fixed 1-form taking values in $\mathfrak{f}$, i.e. $\omega_0(x) = \omega_0^i(x) e_i$, where $e_i$ is a basis in $\mathfrak{f}$. Equation (1.1) for $U^\Omega(x) = \omega_0^\Omega(x)$ is equivalent to the zero curvature condition
\[ d\omega_0^i(x) + \omega_0^j(x) \wedge \omega_0^k(x)\, f^i_{jk} = 0, \tag{1.4} \]
where $f^i_{jk}$ are structure coefficients of $\mathfrak{f}$. For $\mathfrak{f}$ isomorphic to the Poincaré algebra, $\omega_0^i(x)$ is usually identified with the flat space gravitational field with co-frame and Lorentz connection corresponding to generators of translations $P_n$ and Lorentz rotations $L_{mn}$, respectively. The components of the co-frame part of the connection are required to form a non-degenerate $M \times M$ matrix, in which case we call the connection non-degenerate. For example, Minkowski space-time in Cartesian coordinates is described by zero Lorentz connection and co-frame $\xi^n P_n$ so that the components of the co-frame 1-form $e_n{}^m = \delta_n^m$ form a non-degenerate matrix. The freedom in the choice of a non-degenerate $\omega_0^i(x)$ encodes the coordinate choice ambiguity.
b To introduce superalgebraic structure it is enough to let the 1-forms $\omega^A(x)$, which correspond to the even (odd) elements of superalgebra $\mathfrak{h}$, be Grassmann even (odd).
One can analyze Eq. (1.1) perturbatively by setting
\[ U^\Omega(x) = \omega_0^\Omega(x) + U_1^\Omega(x), \tag{1.5} \]
where $U_1^\Omega(x)$ describes first-order fields (fluctuations), while $\omega_0^\Omega(x)$ is zero-order. Let $|\Phi^p(x)\rangle^\lambda$ be the subset of $p$-forms contained in $U_1^\Lambda(x)$ (we use Dirac ket notation for future convenience). The linearized part of Eq. (1.1) associated with the $p$-forms reduces to some equations of the form
\[ D|\Phi^p(x)\rangle^\lambda = 0, \tag{1.6} \]
with
\[ D|\Phi^p(x)\rangle^\eta = (d\,\delta^\eta_\lambda + \omega_0^i(x)\, t_i{}^\eta{}_\lambda)|\Phi^p(x)\rangle^\lambda . \tag{1.7} \]
The identity (1.2) implies that the matrices $t_i{}^\eta{}_\lambda$ form a representation of $\mathfrak{f}$ (i.e. $[t_j, t_k] = 2 f^i_{jk} t_i$). Let $\mathcal{M}$ be the $\mathfrak{f}$-module associated with $|\Phi^p(x)\rangle^\lambda$, i.e. let $|\Phi^p(x)\rangle^\lambda$ be a section of the trivial bundle $\mathcal{B} = \mathcal{M} \times \mathbb{R}^M$ with the fiber $\mathcal{M}$ and the $M$-dimensional Minkowski base space $\mathbb{R}^M$. The covariant derivative $D$ (1.7) in $\mathcal{B}$ is flat,
\[ DD = 0 \tag{1.8} \]
as a consequence of (1.4). Let the associative algebra $A_{\mathcal{M}}$ be the quotient of the universal enveloping algebra of $\mathfrak{f}$ over the ideal $\mathrm{Ann}(\mathcal{M})$ that annihilates the representation $\mathcal{M}$, i.e. $A_{\mathcal{M}} = U(\mathfrak{f})/\mathrm{Ann}(\mathcal{M})$. Let $E_I$ be a basis of $A_{\mathcal{M}}$ and $T_I{}^\eta{}_\lambda$ be the representation of $A_{\mathcal{M}}$ induced from the representation $t_i{}^\eta{}_\lambda$. If $\omega_0^i(x) e_i$ satisfying Eq. (1.4) is (locally) represented in a pure gauge form
\[ \omega_0^i(x)\, e_i = g(x)\, dg^{-1}(x) \tag{1.9} \]
with an invertible element $g(x) = g^I(x) E_I \in A_{\mathcal{M}}$, the generic local solution of Eq. (1.6) gets the form
\[ |\Phi^p(x)\rangle^\eta = g^I(x)\, T_I{}^\eta{}_\lambda\, |\Phi^p(x_0)\rangle^\lambda . \tag{1.10} \]
We see that $|\Phi^p(x_0)\rangle^\lambda$ plays the role of initial data for Eq. (1.6), fixing $|\Phi^p(x)\rangle^\eta|_{x \in \varepsilon(x_0)}$ in a neighborhood $\varepsilon(x_0)$ of a point $x_0$ such that $g(x_0) = 1$. As a result, solutions of Eq. (1.6) are parametrized by elements of the $\mathfrak{f}$-module $\mathcal{M}$. If the $\mathfrak{f}$-module $\mathcal{M}$ is finite dimensional, we will call the corresponding Eq. (1.6) topological because it describes at most $\dim(\mathcal{M})$ degrees of freedom.
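The flatness (1.8) and the solution formula (1.10) can be checked in a few lines. The sketch below is illustrative only: representation indices are suppressed, the $t_i$ are multiplied as ordinary matrices, the 1-forms $\omega_0^i$ are taken to be Grassmann even, and the second computation assumes $p = 0$ so that $|\Phi(x_0)\rangle$ is a constant vector. The shorthand $\omega_0 = \omega_0^i t_i$ is introduced here for convenience.

```latex
% Flatness of D = d + \omega_0^i t_i, using [t_j, t_k] = 2 f^i_{jk} t_i and (1.4)
\begin{align*}
DD &= d\omega_0^i\, t_i + \omega_0^j \wedge \omega_0^k\, t_j t_k
    = d\omega_0^i\, t_i + \tfrac12\, \omega_0^j \wedge \omega_0^k\, [t_j, t_k] \\
   &= \bigl( d\omega_0^i + \omega_0^j \wedge \omega_0^k\, f^i_{jk} \bigr)\, t_i = 0 .
\end{align*}
% Solution (1.10) for p = 0, using \omega_0 = g\,dg^{-1} = -dg\,g^{-1},
% which follows from d(g g^{-1}) = 0
\begin{align*}
D\bigl( g\,|\Phi(x_0)\rangle \bigr)
  &= dg\,|\Phi(x_0)\rangle + (g\,dg^{-1})\, g\,|\Phi(x_0)\rangle \\
  &= dg\,|\Phi(x_0)\rangle - dg\,|\Phi(x_0)\rangle = 0 .
\end{align*}
```

The factor $\tfrac12$ in the first computation is what fixes the normalization $[t_j, t_k] = 2 f^i_{jk} t_i$ quoted above.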
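The system (1.4), (1.6) is invariant under the gauge transformations (1.3)
\[ \delta\omega_0^i(x) = d\epsilon^i(x) - 2\epsilon^j(x)\, \omega_0^k(x)\, f^i_{jk}, \tag{1.11} \]
\[ \delta|\Phi^p(x)\rangle^\eta = d|\varepsilon(x)\rangle^\eta - (-1)^p \omega_0^i(x)\, t_i{}^\eta{}_\lambda |\varepsilon(x)\rangle^\lambda - \epsilon^i(x)\, t_i{}^\eta{}_\lambda |\Phi^p(x)\rangle^\lambda, \tag{1.12} \]
where the $(p-1)$-form $|\varepsilon(x)\rangle^\eta$ and 0-form $\epsilon^i(x)$ are infinitesimal gauge symmetry parameters. (Note that if $p = 0$ then $|\varepsilon(x)\rangle^\eta \equiv 0$.) Any fixed solution $\omega_0^i(x)$ of Eq. (1.4) (called vacuum solution) breaks the local $\mathfrak{f}$ (super)symmetry associated with $\epsilon^i(x)$ to its stability subalgebra with the infinitesimal parameter $\epsilon_0^i(x)$ satisfying the equation
\[ d\epsilon_0^i(x) - 2\epsilon_0^j(x)\, \omega_0^k(x)\, f^i_{jk} = 0 . \tag{1.13} \]
This equation is consistent due to the zero curvature equation (1.4), and its generic (local) solution is parametrized by the values of $\epsilon_0^i(x_0)$,
\[ \epsilon_0^i(x)\, e_i = \epsilon_0^i(x_0)\, g(x)\, e_i\, g^{-1}(x) . \tag{1.14} \]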
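That (1.14) solves (1.13) can be verified directly. The sketch below assumes the purely bosonic case and the normalization $[e_j, e_k] = 2 f^i_{jk} e_i$, which is the one implied by (1.4) together with the pure gauge form (1.9). The shorthands $\epsilon_0 = \epsilon_0^i e_i$, $\omega_0 = \omega_0^i e_i$ and $c = \epsilon_0^i(x_0) e_i$ are introduced here for the illustration.

```latex
\begin{align*}
&[\epsilon_0, \omega_0] = \epsilon_0^j\, \omega_0^k\, [e_j, e_k]
   = 2\,\epsilon_0^j\, \omega_0^k\, f^i_{jk}\, e_i
   \quad\Longrightarrow\quad
   \text{(1.13)} \;\Longleftrightarrow\; d\epsilon_0 - [\epsilon_0, \omega_0] = 0 , \\
&\epsilon_0 = g\, c\, g^{-1}: \qquad
 d\epsilon_0 = dg\, c\, g^{-1} + g\, c\, dg^{-1}
   = (dg\, g^{-1})\,\epsilon_0 + \epsilon_0\,(g\, dg^{-1})
   = -\omega_0\,\epsilon_0 + \epsilon_0\,\omega_0
   = [\epsilon_0, \omega_0] .
\end{align*}
```

Here $dg\, g^{-1} = -g\, dg^{-1} = -\omega_0$ was used, so the number of independent solutions equals the number of constants $\epsilon_0^i(x_0)$.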
The leftover global symmetry
\[ \delta\omega_0^i(x) = 0, \qquad \delta|\Phi^p(x)\rangle^\eta = \epsilon_0^i(x_0)\, \bigl(g(x)\, t_i\, g^{-1}(x)\bigr)^\eta{}_\lambda\, |\Phi^p(x)\rangle^\lambda, \tag{1.15} \]
with the symmetry parameters $\epsilon_0^i(x_0)$ forms the Lie (super)algebra $\mathfrak{f}$. From the Poincaré lemma, it follows that the gauge symmetries (1.12) of $|\Phi^p(x)\rangle^\eta$ associated with the parameters $|\varepsilon(x)\rangle^\eta$, which are $p - 1 > 0$ forms, do not give rise to additional global symmetries of (1.4) and (1.6) in the topologically trivial situation. In fact, Eqs. (1.4) and (1.6) have a larger symmetry $\mathfrak{g}_{\mathcal{M}} \supset \mathfrak{f}$ manifest. Let $\mathfrak{g}_{\mathcal{M}}$ be the Lie (super)algebra built from $A_{\mathcal{M}}$ via (super)commutators. One can extend (1.4) and (1.6) to
\[ dw^I(x) + w^J(x)\, w^K(x)\, h^I_{JK} = 0, \tag{1.16} \]
\[ D|\Phi^p(x)\rangle^\eta = (d\,\delta^\eta_\lambda + w^I(x)\, T_I{}^\eta{}_\lambda)|\Phi^p(x)\rangle^\lambda = 0, \tag{1.17} \]
where $\xi^m w^I_m(x)$ are the gauge fields of $\mathfrak{g}_{\mathcal{M}}$, and $h^I_{JK}$ are the structure coefficients of $\mathfrak{g}_{\mathcal{M}}$. The system (1.16), (1.17) is consistent in the sense of (1.2) and has global symmetry $\mathfrak{g}_{\mathcal{M}}$ for any $w^I(x)$ which solves (1.16). Since $\mathfrak{f}$ is canonically embedded into $\mathfrak{g}_{\mathcal{M}}$, setting $w^I(x) E_I = \omega_0^i(x) e_i$ one recovers the system (1.4), (1.6), thus proving invariance of the system (1.4), (1.6) under the infinite dimensional global symmetry $\mathfrak{g}_{\mathcal{M}}$. Infinite dimensional symmetries of this class appear in the field-theoretical models as higher spin symmetries.
This approach is universal: any system of $\mathfrak{f}$-invariant linear differential equations can be reformulated in the form (1.4), (1.6) by introducing auxiliary variables associated with the appropriate (usually infinite dimensional) $\mathfrak{f}$-module $\mathcal{M}$ [16] (also see examples below). As a result, classification of $\mathfrak{f}$-invariant linear systems of differential equations is equivalent to classification of $\mathfrak{f}$-modules $\mathcal{M}$ of an appropriate class. More precisely, let $\mathfrak{f}$, $\mathfrak{p}_\Pi \subset \mathfrak{f}$ and $\mathcal{M}$ be, respectively, some semi-simple Lie algebra, its parabolic subalgebra and an $\mathfrak{f}$-module integrable with respect to $\mathfrak{p}_\Pi$ (for necessary
definitions see Sec. 3). We show that, for a non-degenerate flat connection 1-form $\omega_0^i(x)$, the covariant constancy equation (1.6) on a $p$-form $|\Phi^p(x)\rangle^\lambda$ taking values in $\mathcal{M}$ encodes an $\mathfrak{f}$-invariant system of differential equations $\mathcal{R}_{\mathcal{M}}|\phi^p(x)\rangle^\lambda = 0$ on a $p$-form $|\phi^p(x)\rangle^\lambda$ from the $p$th cohomology $H^p(\mathfrak{r}_\Pi, \mathcal{M})$ of the radical $\mathfrak{r}_\Pi \subset \mathfrak{p}_\Pi$ with coefficients in $\mathcal{M}$. For Abelian radical $\mathfrak{r}_\Pi$, we prove that each differential operator from $\mathcal{R}_{\mathcal{M}}$ corresponds to an element of $H^{p+1}(\mathfrak{r}_\Pi, \mathcal{M})$ and vice versa. We introduce a classification of $\mathfrak{f}$-invariant systems of equations $\mathcal{R}_{\mathcal{M}}$ by reducibility of $\mathfrak{f}$-modules $\mathcal{M}$: $\mathfrak{f}$-invariant systems that correspond to (reducible) irreducible $\mathfrak{f}$-modules $\mathcal{M}$ are called (non-)primitive. Non-primitive systems contain nontrivial subsystems and can be described as extensions of the primitive ones.
This general construction is applied to classification of linear homogeneous conformally invariant equations on $|\phi^0(x)\rangle \in H^0(\mathfrak{r}_\Pi, \mathcal{M})$, where we set $\mathfrak{f} = o(M, 2)$,$^c$ $\mathfrak{r}_\Pi = t(M)$ (the algebra of translations) and $\mathfrak{p}_\Pi = iso(M) \oplus o(2)$ (i.e. the direct sum of the Poincaré algebra and the algebra of dilatations). Conformally invariant equations are determined by $H^1(t(M), \mathcal{M})$. Examples of primitive equations include Klein–Gordon and Dirac equations and their conformal generalizations to higher (spinor-)tensor fields, conformal equations on $p$-forms and, in particular, (anti)selfduality equations. Examples of non-primitive equations correspond to reducible $\mathcal{M}$ and include $M = 4$ electrodynamics with and without external current and its higher spin generalization to higher tensors in the flat space of any even dimension. Note that our construction allows us to write these systems both in gauge invariant and in gauge fixed form. In the latter case, we automatically obtain conformally invariant gauge conditions. A number of examples of conformal systems are considered in Secs. 2 and 4.7.
To find $H^1(t(M), \mathcal{I})$ with coefficients in an irreducible conformal module $\mathcal{I}$ integrable with respect to $iso(M) \oplus o(2)$, we consider a generalized Verma module $\mathcal{V}$ of $o(M + 2)$ such that $\mathcal{I}$ is its irreducible quotient. We calculate $H^1(t(M), \mathcal{I})$ for any $\mathcal{I}$. As an $iso(M) \oplus o(2)$-module, $H^1(t(M), \mathcal{I})$ is shown to be isomorphic to the space of certain systems of singular and subsingular vectors in $\mathcal{V}$. As a result, the form of a primitive system of conformal differential equations $\mathcal{R}_{\mathcal{I}}$ encoded by the covariant constancy equation (1.6) is completely determined by these systems of singular and subsingular vectors in $\mathcal{V}$. Since any reducible module $\mathcal{M}$ integrable with respect to $iso(M) \oplus o(2)$ is an extension of some irreducible modules $\mathcal{I}$, $H^1(t(M), \mathcal{M})$ can be easily calculated in terms of $H^1(t(M), \mathcal{I})$, thus allowing classification of all possible conformal differential equations. Practical calculation of $H^p(\mathfrak{r}_\Pi, \mathcal{M})$ may be difficult for a general pair $\mathfrak{p}_\Pi \subset \mathfrak{f}$ because the structure of generalized Verma modules is not known in the general case. In the relatively simple case where $\mathfrak{p}_\Pi = iso(M) \oplus o(2)$ and $\mathfrak{f} = o(M + 2)$, we calculate the structure of generalized Verma modules using the results of [17, 18]. This allows us to calculate $H^p(t(M), \mathcal{M})$ for any module $\mathcal{M}$ integrable with respect to $iso(M) \oplus o(2)$.
c In fact, we consider only the complex case. Thus, $o(M, 2) \sim o(M + 2)$.
Let us note that our approach has significant parallels with important earlier works. In particular, the relation between conformally quasi-invariant$^d$ differential operators and singular vectors in the generalized Verma modules of the conformal algebra was originally pointed out in [19] for a particular case. For any semi-simple Lie algebra $\mathfrak{f}$ and a parabolic subalgebra $\mathfrak{p}_\Pi$ of it, a correspondence between homogeneous $\mathfrak{f}$-(quasi-)invariant linear differential operators acting on a finite set of $\mathfrak{p}_\Pi$-covariant fields and jet bundle $\mathfrak{p}_\Pi$-homomorphisms was studied in [20]. Namely, let the Lie groups $A$ and $P \subset A$ correspond to the Lie algebras $\mathfrak{f}$ and $\mathfrak{p}_\Pi$, respectively, and $E$ and $F$ be homogeneous vector bundles with the base $A/P$ and, respectively, the fibers $E$ and $F$ being some finite dimensional $\mathfrak{p}_\Pi$-modules. $J^k E$ is the $k$th associated jet bundle of $E$. By taking the projective limit
\[ J^\infty E \to \cdots \to J^{k+1} E \to J^k E \to \cdots \to J^1 E \to E, \tag{1.18} \]
one finds [20] that there exists a class of f-(quasi-)invariant linear differential operators corresponding to f-homomorphisms J ∞ E → J ∞ F . To establish relation with our approach, one observes that the f-module dual to the module J ∞ E identifies with the generalized Verma module induced from the pΠ -module E, i.e. V = (J ∞ E) , where (J ∞ E) is the contragredient module to J ∞ E. The image of the highest-weight subspace of (J ∞ F ) in (J ∞ E) under the dual mapping (J ∞ F ) → (J ∞ E) is spanned by singular vectors. We expect that RM in our construction corresponds to the big cell of A/P and the sections of the bundle V × RM satisfying (1.6) along with appropriate boundary conditions coincide with sections of the bundle J ∞ E over A/P. The approach developed in this paper allows one to classify all f-invariant homogeneous differential equations on a finite number of fields that form finite dimensional modules of a parabolic subalgebra pΠ ⊂ f with the Abelian radical rΠ ⊂ pΠ . Equations of this class are referred to as fpΠ -invariant equations for the rest of this paper. In particular, we give the full list of conformally invariant equations in Minkowski space. In the case of even space-time dimension, this list is broader than that of [20] because we are taking into account the equations resulting from subsingular vectors. Apart from giving a universal tool for classification of various f-invariant linear equations, the unfolded formulation is particularly useful for the study of their nonlinear deformations [1]. Once some set of linear equations is formulated in the unfolded form (1.4), (1.6), the problem is to check if there exists a nonlinear unfolded system (1.1), which gives rise to the linear equations in question in the free field limit. In particular, nonlinear dynamics of higher spin gauge fields in various dimensions was formulated this way in [2, 6]. 
d An operator $g$ is called $\mathfrak{f}$-quasi-invariant for a Lie algebra $\mathfrak{f}$ if for any $f \in \mathfrak{f}$ there exists an operator $h$ such that $[g, f] = hg$.
This paper is the first step towards the realization of a full scale program of the study of nonlinear deformations of $\mathfrak{f}$-invariant equations. In fact, the analysis of this paper clarifies some ways towards nonlinear
deformation. In particular, one can consider extensions of the modules M associated with the free fields of the model by the “current” modules contained in the tensor products of M. Let us note that the unfolded equations (1.1) can be thought of as a particular L∞ algebra [21, 22] (and references therein). The specific property of the system (1.1), extensively used in the analysis of higher spin models [1, 2, 6], is that it is invariant under diffeomorphisms and, therefore, is ideally suited for the description of theories which contain gravity. It is important to note that in this case a nonlinear deformation within the system (1.1) may deform the f-symmetry transformations by some field-dependent terms originating from (1.3), that may complicate the description of this class of deformations within the manifestly f-symmetric schemes. For example, this happens when gravity or (conformal gravity) is described in this formalism with the Weyl tensor 0-form interpreted as a particular dynamical field of the system, added to the right-hand side of (1.16) [1, 23]. Note that such a deformation is inevitable in any theory of gravitation because no global symmetry f is expected away from a particular f-symmetric vacuum. Within unfolded formulation deformations of this class also admit a natural module extension interpretation. The content of the rest of the paper is as follows. In Sec. 2, we consider unfolded formulation of some simple conformal systems. In particular, conformal scalar is considered in Sec. 2.1, conformal spinor is considered in Sec. 2.2, conformal p-forms are considered in Sec. 2.3 and M = 4 electrodynamics is considered in Sec. 2.4. The general construction, which allows us to classify fpΠ -invariant linear differential equations for any semi-simple Lie algebra f and pΠ ⊂ f with Abelian radical rΠ is given in Sec. 3. In Sec. 4, we apply this construction to the conformal algebra o(M, 2). 
Irreducible finite dimensional representations of the Lorentz algebra are considered in Sec. 4.1. Conformal modules (in particular, generalized Verma modules and contragredient to generalized Verma modules) are discussed in Secs. 4.2 and 4.3, respectively. In Sec. 4.4, we collect relevant facts about submodule structure of conformal generalized Verma modules for the cases of odd (Sec. 4.4.1) and even (Sec. 4.4.2) space-time dimensions. Cohomology with coefficients in irreducible conformal modules is calculated in Sec. 4.5. Examples of calculating cohomology with coefficients in reducible conformal modules are given in Sec. 4.6. In Sec. 4.7, we formulate an algorithm that permits us to obtain explicit form of any conformal equation thus completing the analysis of conformally invariant equations. Conformal generalizations of the Klein–Gordon and the Dirac equations to the fields with block-type (rectangular) Young symmetries are given in Sec. 4.7.1. Generalization of M = 4 equations for massless higher spin fields to a broad class of tensor fields in the flat space of arbitrary even dimension is given in Sec. 4.7.2. Fradkin–Tseytlin conformal higher spin equations in even dimensions are considered in Sec. 4.7.3. In Sec. 5, we conclude our results. In Appendix A, we sketch the analysis of submodule structure of generalized Verma modules for odd and even dimensions. Corresponding homomorphism diagrams are given in Appendix B.
2. The Simplest Conformal Systems
The nonzero commutation relations of the conformal algebra $o(M, 2)$ are
\[
\begin{aligned}
{}[L^{mn}, L^{rs}] &= \eta^{mr} L^{ns} + \eta^{ms} L^{rn} - \eta^{nr} L^{ms} - \eta^{ns} L^{rm}, \\
[L^{mn}, P^s] &= \eta^{ms} P^n - \eta^{ns} P^m, \\
[L^{mn}, K^s] &= \eta^{ms} K^n - \eta^{ns} K^m, \\
[D, P^n] &= -P^n, \qquad [D, K^n] = K^n, \\
[P^n, K^m] &= 2\eta^{nm} D + 2L^{nm},
\end{aligned}
\tag{2.1}
\]
where $\eta^{mn}$ is an invariant metric of the Lorentz algebra $o(M - 1, 1)$ and $L^{nm}$, $P^n$, $K^n$, and $D$ are generators of $o(M - 1, 1)$ Lorentz rotations, translations, special conformal transformations and dilatation, respectively. Minkowski metric $\eta^{mn}$ and its inverse $\eta_{mn}$ are used to raise and lower Lorentz indices. Let $|\Phi(x)\rangle = e_\eta |\Phi(x)\rangle^\eta$ be a 0-form section of the trivial bundle $\mathbb{R}^M \times \mathcal{M}$. Here $\mathcal{M}$ is some $o(M, 2)$-module. In most examples in this section, we consider the case with an irreducible module $\mathcal{M} \sim \mathcal{I}_\Delta$ where $\mathcal{I}_\Delta$ is a quotient of the generalized Verma module $\mathcal{V}_\Delta$ freely generated by $K^n$ from a vacuum Lorentz representation $|\Delta\rangle^A$ having a definite conformal weight $\Delta \in \mathbb{C}$
\[ D|\Delta\rangle^A = \Delta|\Delta\rangle^A \tag{2.2} \]
and annihilated by $P^n$
\[ P^n|\Delta\rangle^A = 0 . \tag{2.3} \]
To describe Minkowski space in Cartesian coordinates, we choose the flat connection
\[ D = \xi^n(\partial_n + P_n) . \tag{2.4} \]
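For the connection (2.4), the pure gauge representation (1.9) and the solution (1.10) can be made explicit. The choice $g(x) = \exp(-x^n P_n)$ below is an assumption made for this sketch rather than a formula quoted from the text. It relies on $[P_n, P_m] = 0$ (translations do not appear among the nonzero commutators (2.1)) and on $\xi^n = dx^n$; the base point is taken to be $x_0 = 0$.

```latex
\begin{align*}
g(x) &= \exp(-x^n P_n), \qquad g(0) = 1, \\
g(x)\, dg^{-1}(x) &= \exp(-x^m P_m)\; \xi^n P_n\, \exp(x^m P_m) = \xi^n P_n , \\
(\partial_n + P_n)\, g(x)\,|\Phi(0)\rangle
   &= (-P_n + P_n)\, \exp(-x^m P_m)\,|\Phi(0)\rangle = 0 .
\end{align*}
```

On sections with infinitely many components the exponential is understood as a formal power series in $x$.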
2.1. Conformal scalar
In order to describe a conformal scalar field, let us consider the generalized Verma module $\mathcal{V}_{\Delta,0}$ induced from the trivial Lorentz representation with the basis vector $|\Delta, 0\rangle$ satisfying $L^{nm}|\Delta, 0\rangle = 0$. The generic element of $\mathcal{V}_{\Delta,0}$ is
\[ \sum_{l=0} \frac{1}{l!}\, C_{n_1 \cdots n_l}\, K^{n_1} \cdots K^{n_l} |\Delta, 0\rangle, \tag{2.5} \]
where $C_{n_1 \cdots n_l} \in \mathbb{C}$ are totally symmetric tensor coefficients. Let $|\Phi_{\Delta,0}(x)\rangle$ be a section of the trivial bundle $\mathbb{R}^M \times \mathcal{V}_{\Delta,0}$, i.e.
\[ |\Phi_{\Delta,0}(x)\rangle = \sum_{l=0} \frac{1}{l!}\, C_{n_1 \cdots n_l}(x)\, K^{n_1} \cdots K^{n_l} |\Delta, 0\rangle, \tag{2.6} \]
where $C_{n_1 \cdots n_l}(x)$ are some functions on $\mathbb{R}^M$. The covariant constancy condition (1.6) for the field $|\Phi_{\Delta,0}(x)\rangle$
\[ D|\Phi_{\Delta,0}(x)\rangle = 0 \tag{2.7} \]
is equivalent to the infinite system of equations
\[ \partial_n |\Phi_{\Delta,0,l-1}(x)\rangle + P_n |\Phi_{\Delta,0,l}(x)\rangle = 0, \qquad l \geq 1, \tag{2.8} \]
where
\[ |\Phi_{\Delta,0,l}(x)\rangle = \frac{1}{l!}\, C_{n_1 \cdots n_l}(x)\, K^{n_1} \cdots K^{n_l} |\Delta, 0\rangle . \tag{2.9} \]
With the definition
\[ \partial_n |\Delta, 0\rangle = 0, \tag{2.10} \]
(2.8) amounts to the system of equations
\[ \partial_n C_{m_1 \cdots m_{l-1}}(x) + 2(\Delta + l - 1)\, C_{n m_1 \cdots m_{l-1}}(x) - (l - 1)\, \eta_{n(m_1} C_k{}^k{}_{m_2 \cdots m_{l-1})}(x) = 0 \tag{2.11} \]
for $l \geq 1$, where parentheses imply symmetrization over the indices denoted by the same letter, i.e.
\[ \eta_{n(m_1} C_k{}^k{}_{m_2 \cdots m_{l-1})} = \frac{1}{l-1}\, \underbrace{\bigl( \eta_{n m_1} C_k{}^k{}_{m_2 \cdots m_{l-1}} + \eta_{n m_2} C_k{}^k{}_{m_1 m_3 \cdots m_{l-1}} + \cdots \bigr)}_{l-1 \text{ terms}} . \tag{2.12} \]
For $\Delta \notin \frac12 \mathbb{Z}$, (2.11) expresses all tensors $C_{m_1 \cdots m_l}(x)$ via the derivatives of $C(x)$, imposing no differential conditions on the latter. For half-integer $\Delta$, the situation is more interesting. For example, for $\Delta = \frac12 M - 1$ system (2.11) imposes the Klein–Gordon equation on $C(x)$ and expresses all higher rank tensors in terms of the higher derivatives of $C(x)$ and $C^{mn}(x)\eta_{mn}$. Indeed, the first two equations in (2.11) are
\[ \partial_n C(x) + 2\Delta\, C_n(x) = 0, \tag{2.13} \]
\[ \partial_n C_m(x) + 2(\Delta + 1)\, C_{nm}(x) - \eta_{nm} C_k{}^k(x) = 0 . \tag{2.14} \]
Contracting (2.14) with $\eta^{nm}$ and substituting $C_n(x)$ from (2.13) we obtain
\[ -\frac{1}{2\Delta}\, \Box C(x) + (2\Delta + 2 - M)\, C_k{}^k(x) = 0 . \tag{2.15} \]
Thus, for $\Delta = \frac12 M - 1$, $\Delta \neq 0$ (i.e. $M \neq 2$), (2.15) is equivalent to the Klein–Gordon equation for $C(x)$
\[ \Box C(x) = 0 . \tag{2.16} \]
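The step from (2.13), (2.14) to (2.15) is short enough to spell out. The sketch below uses only $\eta^{nm}\eta_{nm} = M$ and the notation $\Box = \partial^n \partial_n$ (the latter is the standard d'Alembertian, stated here explicitly as an assumption about the notation).

```latex
\begin{align*}
\eta^{nm} \times \text{(2.14)}: \quad
  & \partial^n C_n + 2(\Delta + 1)\, C_k{}^k - M\, C_k{}^k = 0
    \;\Longrightarrow\;
    \partial^n C_n + (2\Delta + 2 - M)\, C_k{}^k = 0 , \\
\text{(2.13)},\ \Delta \neq 0: \quad
  & C_n = -\frac{1}{2\Delta}\, \partial_n C
    \;\Longrightarrow\;
    -\frac{1}{2\Delta}\, \Box C + (2\Delta + 2 - M)\, C_k{}^k = 0 .
\end{align*}
```

The trace $C_k{}^k$ drops out exactly when $2\Delta + 2 - M = 0$, which is the weight $\Delta = \frac12 M - 1$ singled out in the text.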
Algebraically, the situation is as follows. Whenever $\Delta$ is not half-integer, $P_n |\Phi_{\Delta,0,l}(x)\rangle \neq 0$ for any $|\Phi_{\Delta,0,l}(x)\rangle$ with $l \geq 1$ and the module $\mathcal{V}_{\Delta,0}$ is irreducible. This means that it is possible to solve the chain (2.11) by expressing each $|\Phi_{\Delta,0,l}(x)\rangle$ via derivatives of $|\Phi_{\Delta,0,l-1}(x)\rangle$ for $l \geq 1$. Abusing notation, $|\Phi_{\Delta,0,l}(x)\rangle = -(P^{-1})^n \partial_n |\Phi_{\Delta,0,l-1}(x)\rangle$, $l \geq 1$. For $\Delta = \frac12 M - 1$, the module $\mathcal{V}_{\Delta,0}$ is reducible because the identity
\[ P_n |s\rangle = 0, \qquad |s\rangle = K^m K_m |\Delta, 0\rangle \tag{2.17} \]
implies that $|s\rangle$ is a singular vector, i.e. it is a vacuum vector of the submodule $\mathcal{P}_{\Delta,0} \subset \mathcal{V}_{\Delta,0}$ generated from $|s\rangle$ by $K^n$. Effectively, the algebraic condition (2.17) imposes the Klein–Gordon equation on $|\Phi_{\Delta,0,0}(x)\rangle = C(x)|\Delta, 0\rangle$. At the same time, since the coefficient in front of $C^n{}_n K^m K_m |\Delta, 0\rangle \in |\Phi_{\Delta,0,2}(x)\rangle$ in Eq. (2.11) with $l = 2$ vanishes, $C^n{}_n(x)$ cannot be expressed in terms of derivatives of $|\Phi_{\Delta,0}(x)\rangle$, thus becoming an independent field. Setting $C^n{}_n(x) = 0$ is equivalent to restriction of $\mathbb{R}^M \times \mathcal{V}_{\Delta,0}$ to the bundle $\mathbb{R}^M \times \mathcal{I}_{\Delta,0}$ with the irreducible fiber $\mathcal{I}_{\Delta,0} = \mathcal{V}_{\Delta,0}/\mathcal{P}_{\Delta,0}$. As a result, the conformally invariant equation (2.16) corresponds to the irreducible $o(M, 2)$-module $\mathcal{I}_{\Delta,0}$, thus being primitive. More generally, the generalized Verma module $\mathcal{V}_{\Delta,0}$ is reducible for $\Delta = \frac12 M - n$. Starting from $\mathcal{V}_{\frac{M}{2}-n,0}$, one obtains the conformal equation $\Box^n C(x) = 0$ associated with $\mathcal{I}_{\frac{M}{2}-n} = \mathcal{V}_{\frac{M}{2}-n}/\mathcal{P}_{\frac{M}{2}-n}$.
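The singular vector condition (2.17) can be checked from the commutation relations (2.1) alone. The following sketch uses $P^n|\Delta,0\rangle = 0$, $L^{nm}|\Delta,0\rangle = 0$, $D|\Delta,0\rangle = \Delta|\Delta,0\rangle$, and the two auxiliary identities $D K^n|\Delta,0\rangle = (\Delta + 1) K^n|\Delta,0\rangle$ and $[L^{nm}, K_m] = (1 - M) K^n$, both of which follow from (2.1). The arrangement of the intermediate steps is ours.

```latex
\begin{align*}
P^n K^m K_m |\Delta,0\rangle
 &= [P^n, K^m]\, K_m |\Delta,0\rangle + K^m\, [P^n, K_m]\, |\Delta,0\rangle \\
 &= \bigl( 2\eta^{nm} D + 2 L^{nm} \bigr) K_m |\Delta,0\rangle
    + K^m \bigl( 2\delta^n_m D + 2 L^n{}_m \bigr) |\Delta,0\rangle \\
 &= \bigl( 2(\Delta + 1) + 2(1 - M) + 2\Delta \bigr) K^n |\Delta,0\rangle \\
 &= 2\,(2\Delta + 2 - M)\, K^n |\Delta,0\rangle .
\end{align*}
```

The right-hand side vanishes precisely for $\Delta = \frac12 M - 1$, in agreement with the coefficient of $C_k{}^k$ in (2.15).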
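2.2. Conformal spinor
The massless Dirac equation admits an analogous reformulation. Let the module $\mathcal{V}_{\Delta,1/2}$ be generated by $K^n$ from the spinor module of the $o(M - 1, 1)$ subalgebra with the basis elements $|\Delta, 1/2\rangle^\alpha$ ($\alpha = 1, \ldots, 2^{[M/2]}$ is the spinor index)
\[ L^{nm} |\Delta, 1/2\rangle^\alpha = \frac14\, (\gamma^m \gamma^n - \gamma^n \gamma^m)^\alpha{}_\beta\, |\Delta, 1/2\rangle^\beta . \tag{2.18} \]
Here $\gamma^{n\alpha}{}_\beta$ are gamma matrices
\[ \gamma^{n\gamma}{}_\beta\, \gamma^{m\alpha}{}_\gamma + \gamma^{m\gamma}{}_\beta\, \gamma^{n\alpha}{}_\gamma = (\gamma^n \gamma^m + \gamma^m \gamma^n)^\alpha{}_\beta = 2\eta^{nm} \delta^\alpha_\beta . \tag{2.19} \]
The covariant constancy condition (1.6) imposed on the field
\[ |\Phi_{\Delta,1/2}(x)\rangle = \sum_{l=0} \frac{1}{l!}\, C_{m_1 \cdots m_l, \alpha}(x)\, K^{m_1} \cdots K^{m_l} |\Delta, 1/2\rangle^\alpha \tag{2.20} \]
(i.e. on the section of the bundle $\mathbb{R}^M \times \mathcal{V}_{\Delta,1/2}$) is equivalent to the system of equations
\[ \partial_n C_{m_1 \cdots m_{l-1}, \alpha}(x) + 2(\Delta + l - 1)\, C_{n m_1 \cdots m_{l-1}, \alpha}(x) - (l - 1)\, \eta_{n(m_1} C_k{}^k{}_{m_2 \cdots m_{l-1}), \alpha}(x) + \frac12\, (\gamma^q \gamma_n - \gamma_n \gamma^q)^\beta{}_\alpha\, C_{q m_1 \cdots m_{l-1}, \beta}(x) = 0, \qquad l \geq 1 . \tag{2.21} \]
Whenever $\Delta$ is not half-integer, the system (2.21) just expresses all higher rank spinor-tensors in terms of higher derivatives of $C_\alpha(x)$. For example, from (2.21) it follows that ($l = 1$)
\[ \gamma^{n\alpha}{}_\beta\, \bigl( \partial_n C_\alpha(x) + (2\Delta - M + 1)\, C_{n,\alpha}(x) \bigr) = 0 . \tag{2.22} \]
For $\Delta = (M - 1)/2$ the coefficient in front of $C_{n,\alpha}(x)$ vanishes and we arrive at the massless Dirac equation for $C_\alpha(x)$
\[ \gamma^{n\alpha}{}_\beta\, \partial_n C_\alpha(x) = 0 . \tag{2.23} \]
Other equations of the system (2.21) with $\Delta = (M - 1)/2$ express higher rank spinor-tensors in terms of higher derivatives of $C_\alpha(x)$ and $\gamma^{n\alpha}{}_\beta C_{n,\alpha}(x)$.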
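Algebraically, the situation is analogous to the case of the Klein–Gordon equation. For $\Delta = (M - 1)/2$, the module $\mathcal{V}_{\Delta,1/2}$ is reducible. It contains the submodule $\mathcal{P}_{(M-1)/2,1/2} \subset \mathcal{V}_{(M-1)/2,1/2}$ generated by $K^n$ from the singular vectors
\[ |s\rangle^\alpha = \gamma_m{}^\alpha{}_\beta\, K^m |(M - 1)/2, 1/2\rangle^\beta . \tag{2.24} \]
Setting $\gamma^{n\alpha}{}_\beta C_{n,\alpha}(x) = 0$ is equivalent to the restriction to the subbundle $\mathbb{R}^M \times \mathcal{I}_{(M-1)/2,1/2}$, where the irreducible module $\mathcal{I}_{(M-1)/2,1/2} = \mathcal{V}_{(M-1)/2,1/2}/\mathcal{P}_{(M-1)/2,1/2}$ corresponds to the primitive conformal equation (2.23).
2.3. Conformal p-forms
Consider a trivial bundle $\mathbb{R}^M \times \mathcal{V}_{\Delta,p}$, where the module $\mathcal{V}_{\Delta,p}$ is induced from the rank $p$ ($p \leq M$) totally antisymmetric tensor module of $o(M - 1, 1)$ with the basis $|\Delta, p\rangle^{k_1 \cdots k_p}$
\[ L_{nm} |\Delta, p\rangle^{k_1 \cdots k_p} = p\, \delta_n^{[k_1} |\Delta, p\rangle_m{}^{k_2 \cdots k_p]} - p\, \delta_m^{[k_1} |\Delta, p\rangle_n{}^{k_2 \cdots k_p]} . \tag{2.25} \]
Here square brackets imply antisymmetrization over indices denoted by the same letter
\[ \delta_n^{[k_1} |\Delta\rangle_m{}^{k_2 \cdots k_p]} = \frac{1}{p}\, \underbrace{\bigl( \delta_n^{k_1} |\Delta\rangle_m{}^{k_2 \cdots k_p} - \delta_n^{k_2} |\Delta\rangle_m{}^{k_1 k_3 \cdots k_p} + \cdots \bigr)}_{p \text{ terms}} . \tag{2.26} \]
Consider a section $|\Phi_{\Delta,p}(x)\rangle$ of the bundle $\mathbb{R}^M \times \mathcal{V}_{\Delta,p}$
\[ |\Phi_{\Delta,p}(x)\rangle = \sum_{l=0} \frac{1}{l!}\, C_{m_1 \cdots m_l; k_1 \cdots k_p}(x)\, K^{m_1} \cdots K^{m_l} |\Delta, p\rangle^{k_1 \cdots k_p}, \tag{2.27} \]
where the tensor $C_{m_1 \cdots m_l; k_1 \cdots k_p}(x)$ is totally symmetric in the indices $m$ and totally antisymmetric in the indices $k$. (The semicolon separates the groups of totally symmetric and antisymmetric indices.) Equation (1.6) for the field $|\Phi_{\Delta,p}(x)\rangle$ amounts to
\[ \partial_n C_{m_1 \cdots m_{l-1}; k_1 \cdots k_p}(x) + 2(\Delta + l - 1)\, C_{n m_1 \cdots m_{l-1}; k_1 \cdots k_p}(x) - (l - 1)\, \eta_{n(m_1} C_q{}^q{}_{m_2 \cdots m_{l-1}); k_1 \cdots k_p}(x) + 2p\, C_{m_1 \cdots m_{l-1} [k_1; n k_2 \cdots k_p]}(x) - 2p\, \eta_{n[k_1} C_{m_1 \cdots m_{l-1} q;}{}^q{}_{k_2 \cdots k_p]}(x) = 0, \qquad l \geq 1 . \tag{2.28} \]
The differential equations imposed by the system (2.28) depend on the conformal weight $\Delta$.
1. $\Delta \notin \frac12 \mathbb{Z}$.
(2.28) imposes no differential restrictions, just expressing all higher rank tensor fields in terms of derivatives of the field $C_{;k_1 \cdots k_p}(x)$.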
2. $M$ is odd, $\Delta = p = 0, 1, \ldots, M - 1$ or $M$ is even, $\Delta = p = 0, \frac{M}{2} + 1, \frac{M}{2} + 2, \ldots, M - 1$.
In this case, (2.28) imposes the closedness condition on the $\Delta$-form $C_{;k_1 \cdots k_\Delta}(x)$
\[ \partial_{[k_{\Delta+1}} C_{;k_1 \cdots k_\Delta]}(x) = 0 \tag{2.29} \]
and expresses all higher rank tensor fields in terms of derivatives of $C_{;k_1 \cdots k_\Delta}(x)$ and $C_{[k_{\Delta+1}; k_1 \cdots k_\Delta]}(x)$. Actually, consider (2.28) at $l = 1$. We have:
\[ \partial_n C_{;k_1 \cdots k_p}(x) + 2\Delta\, C_{n; k_1 \cdots k_p}(x) + 2p\, C_{[k_1; n k_2 \cdots k_p]}(x) - 2p\, \eta_{n[k_1} C_{q;}{}^q{}_{k_2 \cdots k_p]}(x) = 0 . \tag{2.30} \]
Total antisymmetrization of indices in (2.30) gives
\[ \partial_{[k_{p+1}} C_{;k_1 \cdots k_p]}(x) + 2(\Delta - p)\, C_{[k_{p+1}; k_1 \cdots k_p]}(x) = 0 . \tag{2.31} \]
For $\Delta = p$, we obtain (2.29).
3. $M$ is odd, $\Delta = M - p = 0, 1, \ldots, M - 1$ or $M$ is even, $\Delta = M - p = 0, \frac{M}{2} + 1, \frac{M}{2} + 2, \ldots, M - 1$.
In this case (2.28) imposes the dual form of Eq. (2.29), implying that the polyvector $C^{;k_1 \cdots k_{M-\Delta}}(x)$ is conserved
\[ \partial_n C^{;n k_2 \cdots k_{M-\Delta}}(x) = 0 . \tag{2.32} \]
Also (2.28) expresses all higher rank tensor fields in terms of derivatives of the fields $C^{;k_1 \cdots k_{M-\Delta}}(x)$ and $C_q{}^{;q k_2 \cdots k_{M-\Delta}}(x)$. Indeed, contracting indices in (2.30) with $\eta^{n k_1}$, one obtains (2.32) from
\[ \partial^n C_{;n k_2 \cdots k_p}(x) + 2(\Delta + p - M)\, C^n{}_{;n k_2 \cdots k_p}(x) = 0 . \tag{2.33} \]
4. $M$ is even, $\Delta = p = 1, 2, \ldots, \frac{M}{2} - 1$.
In this case, (2.28) imposes on $C_{;k_1 \cdots k_\Delta}(x)$ Eq. (2.29) along with the equation
\[ \Box^{M/2 - \Delta}\, \partial^n C_{;n k_2 \cdots k_\Delta}(x) = 0 \tag{2.34} \]
and expresses all higher rank tensor fields in terms of derivatives of the fields $C_{;k_1 \cdots k_\Delta}(x)$, $C_{[k_{\Delta+1}; k_1 \cdots k_\Delta]}(x)$, and $C_{n_1 \cdots n_{M/2-\Delta}}{}^{n_1 \cdots n_{M/2-\Delta}\, q}{}_{;q k_2 \cdots k_\Delta}(x)$.
5. $M$ is even, $\Delta = M - p = 1, 2, \ldots, \frac{M}{2} - 1$.
Now, (2.28) imposes on $C_{;k_1 \cdots k_{M-\Delta}}(x)$ Eq. (2.32) along with
\[ \Box^{M/2 - \Delta}\, \partial_{[k_{M-\Delta+1}} C_{;k_1 \cdots k_{M-\Delta}]}(x) = 0 \tag{2.35} \]
and expresses all higher rank tensor fields in terms of derivatives of the fields $C_{;k_1 \cdots k_{M-\Delta}}(x)$, $C_{n_1 \cdots n_{M/2-\Delta}}{}^{n_1 \cdots n_{M/2-\Delta}}{}_{[k_{M-\Delta+1}; k_1 \cdots k_{M-\Delta}]}(x)$, and $C^q{}_{;q k_2 \cdots k_{M-\Delta}}(x)$. Note that system (2.32), (2.35) is dual to system (2.29), (2.34).
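6. $M$ is even, $\Delta = p = \frac{M}{2}$.
In this case, the vacuum vectors $|M/2, M/2\rangle^{k_1 \cdots k_{M/2}}$ form a reducible $o(M, 2)$-module. The irreducible parts are singled out by the additional (anti)selfduality conditions
\[ |M/2, M/2_\pm\rangle^{k_1 \cdots k_{M/2}} = \pm \frac{i^{M^2/4}}{(M/2)!}\, \epsilon_{p_1 \cdots p_{M/2}}{}^{k_1 \cdots k_{M/2}}\, |M/2, M/2_\pm\rangle^{p_1 \cdots p_{M/2}}, \tag{2.36} \]
which in the complex case can be imposed for any even space-time dimension. Equation (2.28) imposes the primitive equation
\[ \partial^n C_{;n k_2 \cdots k_{M/2}}(x) = 0 \tag{2.37} \]
on the (anti)selfdual field $C_{;k_1 \cdots k_{M/2}}(x)$
\[ C_{;k_1 \cdots k_{M/2}}(x) = \pm \frac{i^{M^2/4}}{(M/2)!}\, \epsilon^{p_1 \cdots p_{M/2}}{}_{k_1 \cdots k_{M/2}}\, C_{;p_1 \cdots p_{M/2}}(x) \tag{2.38} \]
and expresses all higher rank tensor fields in terms of derivatives of the fields $C_{;k_1 \cdots k_{M/2}}(x)$ and $C^q{}_{;q k_2 \cdots k_{M/2}}(x)$.
Vanishing coefficients in front of higher tensors in (2.31) and (2.33) imply the appearance of the singular vectors
\[ |s\rangle^{k_1 \cdots k_{\Delta+1}} = K^{[k_1} |\Delta, \Delta\rangle^{k_2 \cdots k_{\Delta+1}]}, \tag{2.39} \]
\[ |s\rangle^{k_1 \cdots k_{M-\Delta-1}} = K_n |\Delta, M - \Delta\rangle^{n k_1 \cdots k_{M-\Delta-1}} \tag{2.40} \]
in $\mathcal{V}_{\Delta,p}$ for $\Delta = p = 0, \ldots, M - 1$ and $\Delta = M - p = 0, \ldots, M - 1$, respectively. These singular vectors induce proper submodules $\mathcal{P}_{\Delta,\Delta} \subset \mathcal{V}_{\Delta,\Delta}$ and $\mathcal{P}_{\Delta,M-\Delta} \subset \mathcal{V}_{\Delta,M-\Delta}$. In the cases 2 and 3, the quotients $\mathcal{Q}_{\Delta,\Delta} = \mathcal{V}_{\Delta,\Delta}/\mathcal{P}_{\Delta,\Delta}$ and $\mathcal{Q}_{\Delta,M-\Delta} = \mathcal{V}_{\Delta,M-\Delta}/\mathcal{P}_{\Delta,M-\Delta}$ are irreducible and, therefore, Eqs. (2.29) and (2.32) are primitive. In the cases 4 and 5, the modules $\mathcal{Q}_{\Delta,\Delta}$ and $\mathcal{Q}_{\Delta,M-\Delta}$ are reducible. They contain submodules $\mathcal{P}'_{\Delta,\Delta} \subset \mathcal{Q}_{\Delta,\Delta}$ and $\mathcal{P}'_{\Delta,M-\Delta} \subset \mathcal{Q}_{\Delta,M-\Delta}$ generated from the subsingular vectors
\[ |s'\rangle^{k_1 \cdots k_{\Delta-1}} = (K_n K^n)^{M/2-\Delta}\, K_m |\Delta, \Delta\rangle^{m k_1 \cdots k_{\Delta-1}}, \tag{2.41} \]
\[ |s'\rangle^{k_1 \cdots k_{M-\Delta+1}} = (K_n K^n)^{M/2-\Delta}\, K^{[k_1} |\Delta, M - \Delta\rangle^{k_2 \cdots k_{M-\Delta+1}]}, \tag{2.42} \]
respectively. The quotients $\mathcal{Q}'_{\Delta,\Delta} = \mathcal{Q}_{\Delta,\Delta}/\mathcal{P}'_{\Delta,\Delta}$ and $\mathcal{Q}'_{\Delta,M-\Delta} = \mathcal{Q}_{\Delta,M-\Delta}/\mathcal{P}'_{\Delta,M-\Delta}$ are irreducible and systems (2.29), (2.34) and (2.32), (2.35) are primitive. Note that in the cases 4 and 5, the systems (2.29) and (2.32) alone are also conformally invariant but non-primitive. In case 6, the singular vector (2.39) coincides (up to a sign) with the singular vector (2.40). This vector is contained in both generalized Verma modules $\mathcal{V}_{M/2,M/2_+}$ and $\mathcal{V}_{M/2,M/2_-}$ generated from the selfdual and the antiselfdual vacuum Lorentz representations, respectively. The quotients $\mathcal{Q}_{M/2,M/2_\pm} = \mathcal{V}_{M/2,M/2_\pm}/\mathcal{P}_{M/2,M/2}$ are irreducible and, therefore, system (2.37), (2.38) is primitive.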
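The coefficients $2(\Delta - p)$ in (2.31) and $2(\Delta + p - M)$ in (2.33) can be traced term by term from (2.30). The sketch below assumes the unit-weight antisymmetrization convention of (2.26), the antisymmetry of $C_{n;k_1 \cdots k_p}$ in the indices $k$, and $\eta^{nk}\eta_{nk} = M$. The grouping of the intermediate terms is ours.

```latex
% Total antisymmetrization of (2.30) in n, k_1, ..., k_p
\begin{align*}
2\Delta\, C_{n;k_1 \cdots k_p}
  &\;\to\; 2\Delta\, C_{[n;k_1 \cdots k_p]}, \\
2p\, C_{[k_1;n k_2 \cdots k_p]}
  &\;\to\; -2p\, C_{[n;k_1 \cdots k_p]}
  && \text{(one transposition } n \leftrightarrow k_1 \text{)}, \\
-2p\, \eta_{n[k_1} C_{q;}{}^{q}{}_{k_2 \cdots k_p]}
  &\;\to\; 0
  && (\eta_{n k_1} \text{ is symmetric}),
\end{align*}
% so the algebraic terms add up to 2(\Delta - p) C_{[n;k_1...k_p]}, as in (2.31).
% Contraction of (2.30) with \eta^{n k_1}
\begin{align*}
2\Delta\, C_{n;k_1 \cdots k_p}
  &\;\to\; 2\Delta\, C^{n}{}_{;n k_2 \cdots k_p}, \\
2p\, C_{[k_1;n k_2 \cdots k_p]}
  &\;\to\; 2\, C^{n}{}_{;n k_2 \cdots k_p}
  && \text{(only the first of the } p \text{ terms survives)}, \\
-2p\, \eta_{n[k_1} C_{q;}{}^{q}{}_{k_2 \cdots k_p]}
  &\;\to\; -2\,(M - p + 1)\, C^{n}{}_{;n k_2 \cdots k_p},
\end{align*}
% so the algebraic terms add up to 2(\Delta + p - M) C^n_{;n k_2...k_p}, as in (2.33).
```

In the last line the factor $M - p + 1$ arises as $M$ from the trace $\eta^{n k_1}\eta_{n k_1}$ minus one unit for each of the remaining $p - 1$ terms of the antisymmetrization.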
2.4. M = 4 electrodynamics
Primitive conformally invariant equations constructed with the use of irreducible conformal modules are the simplest ones in the sense that it is impossible to impose any stronger conformally invariant equations that admit nontrivial solutions. As follows from Examples 4 and 5 in Sec. 2.3, non-primitive equations do not necessarily reduce to a set of independent primitive subsystems. A somewhat trivial example of a non-primitive system is provided by case 6 in Sec. 2.3 with the relaxed (anti)selfduality condition (2.36). Namely, consider the module $\mathcal{V}_{M/2,M/2}$ induced from the reducible vacuum $|M/2, M/2\rangle^{k_1, \ldots, k_{M/2}}$. It contains both singular vectors (2.39) and (2.40). Thus Eq. (2.28) imposes the system (2.29), (2.32) on the field $C_{;k_1 \cdots k_{M/2}}(x)$ and expresses all higher rank tensor fields in terms of derivatives of the fields $C_{;k_1 \cdots k_{M/2}}(x)$, $C_{[k_{M/2+1}; k_1 \cdots k_{M/2}]}(x)$, and $C^q{}_{;q k_2 \cdots k_{M/2}}(x)$. This system is non-primitive because it reduces to the combination of the independent subsystems for selfdual and antiselfdual parts. For $M = 4$, it coincides with the free Maxwell equations formulated in terms of field strengths. A less trivial and important example of a non-primitive system, which allows us to illustrate the idea of the general construction, is provided by the potential formulation of the $M = 4$ electrodynamics. Consider the $M = 4$ irreducible module $\mathcal{I}_A = \mathcal{Q}_{1,1}/\mathcal{P}'_{1,1}$ (see explanation to case 4 at the end of Sec. 2.3). The covariant constancy condition (1.6) for the section
\[ |\Phi_A(x)\rangle = \sum_{l=0} \frac{1}{l!}\, A_{m_1 \cdots m_l; k}(x)\, K^{m_1} \cdots K^{m_l} |A\rangle^k, \qquad m, k = 1, \ldots, 4 \tag{2.43} \]
encodes the following differential equations on $A_{;k}(x)$:
\[ \partial_{[n} A_{;k]}(x) = 0, \tag{2.44} \]
\[ \Box\, \partial^k A_{;k}(x) = 0 . \tag{2.45} \]
Let us extend the irreducible module $\mathcal{I}_A$ to a module $\mathcal{E}_{A,F}$ by “gluing” the module $\mathcal{K}_F = \mathcal{Q}_{2,2_+} \oplus \mathcal{Q}_{2,2_-}$ (see explanation to case 6 at the end of Sec. 2.3) to $\mathcal{I}_A$ as follows. The module $\mathcal{E}_{A,F}$ is generated from the vacuum vectors $|A\rangle^k$ and $|F\rangle^{k_1 k_2}$ of the modules $\mathcal{V}_A = \mathcal{V}_{1,1}$ and $\mathcal{V}_F = \mathcal{V}_{2,2}$, respectively, with the following additional relations imposed
\[ K^{[n} |A\rangle^{k]} = 0, \qquad K^m K_m K_k |A\rangle^k = 0, \tag{2.46} \]
\[ K^{[n} |F\rangle^{k_1 k_2]} = 0, \qquad K_n |F\rangle^{nk} = 0, \tag{2.47} \]
\[ P^n |F\rangle^{k_1 k_2} = -\eta^{n[k_1} |A\rangle^{k_2]} . \tag{2.48} \]
Here, the conditions (2.46) and (2.47) single out $\mathcal{I}_A$ and $\mathcal{K}_F$ from the generalized Verma modules $\mathcal{V}_A$ and $\mathcal{V}_F$, respectively. The condition (2.48) “glues” the modules $\mathcal{I}_A$ and $\mathcal{K}_F$ into $\mathcal{E}_{A,F}$.
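Consider the section

$$|\Phi_{A,F}(x)\rangle = \sum_{l=0}^{\infty}\frac{1}{l!}\,A_{m_1\cdots m_l;k}(x)\,K^{m_1}\cdots K^{m_l}\,|A^k\rangle + \sum_{l=0}^{\infty}\frac{1}{l!}\,F_{m_1\cdots m_l;k_1k_2}(x)\,K^{m_1}\cdots K^{m_l}\,|F^{k_1k_2}\rangle \tag{2.49}$$

of the bundle $\mathbb{R}^4\times E_{A,F}$. The covariant constancy condition $D|\Phi_{A,F}(x)\rangle = 0$ amounts to the infinite differential system

$$\partial_n A_{m_1\cdots m_{l-1};k}(x) + 2l\,A_{nm_1\cdots m_{l-1};k}(x) - (l-1)\,\eta_{n(m_1}A_{q}{}^{q}{}_{m_2\cdots m_{l-1});k}(x) + 2A_{m_1\cdots m_{l-1}k;n}(x) - 2\eta_{nk}A_{m_1\cdots m_{l-1}q;}{}^{q}(x) - F_{m_1\cdots m_{l-1};nk}(x) = 0, \tag{2.50}$$

$$\partial_n F_{m_1\cdots m_{l-1};k_1k_2}(x) + 2(l+1)\,F_{nm_1\cdots m_{l-1};k_1k_2}(x) - (l-1)\,\eta_{n(m_1}F_{q}{}^{q}{}_{m_2\cdots m_{l-1});k_1k_2}(x) + 4F_{m_1\cdots m_{l-1}[k_1;nk_2]}(x) - 4\eta_{n[k_1}F_{m_1\cdots m_{l-1}q;}{}^{q}{}_{k_2]}(x) = 0, \tag{2.51}$$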
for l = 1, 2, .... The subsystem (2.51) coincides with the system (2.28) for M = 4 and ∆ = p = 2. It expresses all higher components $F_{m_1\cdots m_l;k_1k_2}(x)$ via the higher derivatives of the field $F_{;k_1k_2}(x)$ (note that the components $F^{q}{}_{;qk_2}(x)$ and $F_{[k_3;k_1k_2]}(x)$ are set to zero in the bundle $\mathbb{R}^4\times E_{A,F}$ due to the relation (2.47)) and imposes the Maxwell equations on the field strength 2-form $F_{;k_1k_2}(x)$:

$$\partial_{[n}F_{;k_1k_2]}(x) = 0, \tag{2.52}$$
$$\partial^{n}F_{;nk}(x) = 0. \tag{2.53}$$
The subsystem (2.50) is a deformation of the system (2.28) for $I_A$ by the additional terms containing the fields $F_{m_1\cdots m_l;k_1k_2}(x)$ resulting from the “gluing” condition (2.48), which links the vacua $|A^k\rangle$ and $|F^{k_1k_2}\rangle$. The system (2.50) expresses all higher fields $A_{m_1\cdots m_l;k}(x)$ ($l\ge 1$) via the higher derivatives of $A_{;k}(x)$ (in $\mathbb{R}^4\times E_{A,F}$ the components $A_{[k_2;k_1]}(x) = 0$ and $A_{n}{}^{n\,q}{}_{;q}(x) = 0$ due to (2.46)) and also imposes the differential equation (2.45) on $A_{;k}(x)$ and the constraint

$$\partial_{[k_1}A_{;k_2]}(x) = F_{;k_1k_2}(x) \tag{2.54}$$

on $F_{;k_1k_2}(x)$. The constraint (2.54) replaces the closedness condition (2.44) for the potential 1-form $A_{;k}(x)$. The point is that the singular vector $|s^{k_1k_2}\rangle = K^{[k_1}|1,1\rangle^{k_2]}$ from the module $V_A$ responsible for (2.44) is “glued” in the module $E_{A,F}$ by the field $F_{;k_1k_2}(x)$ in (2.48). As a result, the field $F_{;k_1k_2}(x)$ replaces zero on the right-hand side of (2.44), giving rise to the constraint (2.54), which identifies $A_{;k}(x)$ with the potential for the field strength $F_{;k_1k_2}(x)$. Thus the infinite system (2.50) and (2.51) provides the potential formulation of M = 4 electrodynamics (2.52)–(2.54) along with infinitely many constraints on the auxiliary fields $A_{m_1\cdots m_l;k}(x)$ and $F_{m_1\cdots m_l;k_1k_2}(x)$ for $l\ge 1$. Equation (2.45) is the conformally invariant gauge condition, considered originally in [24, 25]. The system
Unfolded Form of Conformal Equations
(2.52)–(2.45) is non-primitive. Its primitive reduction results from the condition $F_{;k_1k_2}(x) = 0$. The module $E_{A,F}$ can be further extended by the module $I_J = V_{3,1}/P_{3,1}$ (see explanation to case 3 at the end of Sec. 2.3) to a module $E_{A,F,J}$ as follows. $E_{A,F,J}$ is generated from the totally antisymmetric vacua $|A^k\rangle$, $|F^{k_1k_2}\rangle$ and $|J^k\rangle$ with the properties (2.46)–(2.48) along with

$$K_k|J^k\rangle = 0, \tag{2.55}$$
$$P^n|J^k\rangle = -\frac{2}{3}\,|F^{nk}\rangle. \tag{2.56}$$

The covariant constancy condition for the section

$$|\Phi_{A,F,J}(x)\rangle = \sum_{l=0}^{\infty}\frac{1}{l!}\,A_{m_1\cdots m_l;k}(x)\,K^{m_1}\cdots K^{m_l}\,|A^k\rangle + \sum_{l=0}^{\infty}\frac{1}{l!}\,F_{m_1\cdots m_l;k_1k_2}(x)\,K^{m_1}\cdots K^{m_l}\,|F^{k_1k_2}\rangle + \sum_{l=0}^{\infty}\frac{1}{l!}\,J_{m_1\cdots m_l;k}(x)\,K^{m_1}\cdots K^{m_l}\,|J^k\rangle \tag{2.57}$$
of the trivial bundle $\mathbb{R}^4\times E_{A,F,J}$ contains several parts. The first one is the system (2.50), which gives rise to Eqs. (2.54) and (2.45). The second one is the system for the fields $J_{m_1\cdots m_l;k}(x)$ of the form (2.28) with M = 4 and ∆ = M − p = 3. This system encodes the equation

$$\partial^{k}J_{;k}(x) = 0 \tag{2.58}$$

on the field $J_{;k}(x)$ and expresses all the higher fields $J_{m_1\cdots m_l;k}(x)$ ($l\ge 1$) in terms of higher derivatives of $J_{;k}(x)$ (in $\mathbb{R}^4\times E_{A,F,J}$ the component $J^{q}{}_{;q}(x) = 0$ due to (2.55)). The third part reads

$$\partial_n F_{m_1\cdots m_{l-1};k_1k_2}(x) + 2(l+1)\,F_{nm_1\cdots m_{l-1};k_1k_2}(x) - (l-1)\,\eta_{n(m_1}F_{q}{}^{q}{}_{m_2\cdots m_{l-1});k_1k_2}(x) + 4F_{m_1\cdots m_{l-1}[k_1;nk_2]}(x) - 4\eta_{n[k_1}F_{m_1\cdots m_{l-1}q;}{}^{q}{}_{k_2]}(x) - \frac{2}{3}\,\eta_{n[k_1}J_{m_1\cdots m_{l-1};k_2]}(x) = 0 \tag{2.59}$$

for l = 1, 2, .... It is a deformation of the system (2.51) with the additional terms containing $J_{m_1\cdots m_l;k}(x)$, which result from the “gluing” condition (2.56). This system encodes the Bianchi identities (2.52) along with the second pair of Maxwell equations with external current

$$\partial^{n}F_{;nk}(x) = J_{;k}(x) \tag{2.60}$$

and expresses $F_{m_1\cdots m_l;k_1k_2}(x)$ for $l\ge 1$ via the derivatives of $F_{;k_1k_2}(x)$. Thus the covariant constancy condition (1.6) for the bundle $\mathbb{R}^4\times E_{A,F,J}$ encodes the non-primitive system of differential equations (2.52), (2.54), (2.45), (2.60) and (2.58). Note that an analogous differential system was derived in [26] in terms of a 5-potential
that transforms according to a non-decomposable representation of SU(2, 2) (see also [27] and references therein).

This system admits two interpretations. The first one, with $J_{m_1\cdots m_l;k}(x)$ treated as independent fields restricted only by Eqs. (1.6), is that it provides the off-mass-shell version of the Maxwell electrodynamics, which accounts for all differential consequences of the Bianchi identities. Another interpretation arises when the field $J_{;k}(x)$ is a nonlinear combination of some other “matter” fields. In that case, Eqs. (1.6) should be treated as Maxwell equations describing electromagnetic interactions of the matter fields. Clearly, for this to be possible it is necessary to single out the module $I_J$ from the tensor product of some other “matter modules”, which leads to a nonlinear system describing electromagnetic interactions of the matter fields from which the current $J_{;k}(x)$ is built. Equation (2.58) imposes the conservation condition on this current.

Finally, let us note that to have a gauge invariant form of the Maxwell equations (i.e. to relax the gauge condition (2.45)) one has to consider the further extension $E_{A,F,J,G}$ of the module $E_{A,F,J}$ with the module $I_G = V_{4,0}$. The module $E_{A,F,J,G}$ is defined by the relations (2.46)–(2.48), (2.55), (2.56) along with

$$P^n|G\rangle = -\frac{1}{16}\,K^m K_m|A^n\rangle, \tag{2.61}$$

where $|G\rangle$ is the vacuum of the module $I_G$. Consider a section

$$|\Phi_{A,F,J,G}(x)\rangle = \sum_{l=0}^{\infty}\frac{1}{l!}\,A_{m_1\cdots m_l;k}(x)\,K^{m_1}\cdots K^{m_l}\,|A^k\rangle + \sum_{l=0}^{\infty}\frac{1}{l!}\,F_{m_1\cdots m_l;k_1k_2}(x)\,K^{m_1}\cdots K^{m_l}\,|F^{k_1k_2}\rangle + \sum_{l=0}^{\infty}\frac{1}{l!}\,J_{m_1\cdots m_l;k}(x)\,K^{m_1}\cdots K^{m_l}\,|J^k\rangle + \sum_{l=0}^{\infty}\frac{1}{l!}\,G_{m_1\cdots m_l}(x)\,K^{m_1}\cdots K^{m_l}\,|G\rangle \tag{2.62}$$

of the bundle $\mathbb{R}^4\times E_{A,F,J,G}$. The consequences of the covariant constancy condition imposed on (2.62) are analogous to those for the section $|\Phi_{A,F,J}(x)\rangle$ but with the subsystem (2.50) replaced with

$$\partial_n A_{m_1\cdots m_{l-1};k}(x) + 2l\,A_{nm_1\cdots m_{l-1};k}(x) - (l-1)\,\eta_{n(m_1}A_{q}{}^{q}{}_{m_2\cdots m_{l-1});k}(x) + 2A_{m_1\cdots m_{l-1}k;n}(x) - 2\eta_{nk}A_{m_1\cdots m_{l-1}q;}{}^{q}(x) - F_{m_1\cdots m_{l-1};nk}(x) - \frac{1}{16}(l-1)(l-2)\,\eta_{nk}\,\eta_{(m_1m_2}G_{m_3\cdots m_{l-1})}(x) = 0, \tag{2.63}$$

and an additional subsystem of the form (2.28) with M = 4, ∆ = M − p = 4 for the fields $G_{m_1\cdots m_l}(x)$. The G-dependent terms in (2.63) modify Eq. (2.45) to

$$\partial^{k}A_{;k}(x) = G(x). \tag{2.64}$$
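As a quick consistency check of the system assembled in this subsection (a sketch that uses only the equations quoted above and the fact that partial derivatives commute), the Bianchi identity (2.52) follows from the constraint (2.54), and the conservation law (2.58) follows from (2.60):

```latex
\begin{aligned}
\partial_{[n}F_{;k_1k_2]}(x) &= \partial_{[n}\partial_{k_1}A_{;k_2]}(x) = 0
  && \text{by (2.54), since } \partial_n\partial_{k_1} \text{ is symmetric in } n, k_1,\\
\partial^{k}J_{;k}(x) &= \partial^{k}\partial^{n}F_{;nk}(x) = 0
  && \text{by (2.60), since } F_{;nk}(x) = -F_{;kn}(x).
\end{aligned}
```

Neither step uses the gauge condition, so the same check applies whether (2.45) or its relaxed form (2.64) is imposed.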
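The subsystem for the fields $G_{m_1\cdots m_l}(x)$ expresses the higher components $G_{m_1\cdots m_l}(x)$ ($l\ge 1$) in terms of derivatives of $G(x)$. In Sec. 4.7.2, we consider a generalization of this construction to the case of an almost arbitrary tensor structure of the field strength in any even space-time dimension M > 2.

3. General Construction

Let $\mathfrak{f}$ be a complex semi-simple Lie algebra (in fact, the following consideration remains essentially the same for any Kac–Moody algebra) with simple roots $\Pi = (\alpha_0, \alpha_1, \ldots, \alpha_q)$. Then $\mathfrak{f}$ is generated by elements $H_i$, $E_i$ and $F_i$, $0\le i\le q$, with the relations

$$[H_i, E_j] = A_{ij}E_j, \qquad [H_i, F_j] = -A_{ij}F_j, \tag{3.1}$$
$$[E_i, F_j] = \delta_{ij}H_j, \qquad (\mathrm{ad}\,E_i)^{1-A_{ij}}E_j = 0, \tag{3.2}$$
$$(\mathrm{ad}\,F_i)^{1-A_{ij}}F_j = 0, \qquad i\ne j, \tag{3.3}$$

where no summation over repeated indices is assumed and

$$A_{ij} = \alpha_j(H_i), \qquad A_{i,j\ne i}\le 0, \qquad A_{ii} = 2 \tag{3.4}$$

is the Cartan matrix. The transformation $\tau$,

$$\tau(E_i) = F_i, \qquad \tau(F_i) = E_i, \qquad \tau(H_i) = H_i, \tag{3.5}$$

generates the involutive antilinear antiautomorphism of $\mathfrak{f}$ called the Chevalley involution.

Choose a subset of the set of simple roots, $\Pi'\subset\Pi$. Let $\mathfrak{a}_{\Pi'}\subset\mathfrak{f}$ denote the semisimple subalgebra generated by the elements $E_i$, $F_i$, $H_i$ such that $\alpha_i\in\Pi'$; $\mathfrak{h}_{\Pi'}$ is the Cartan subalgebra of $\mathfrak{a}_{\Pi'}$. Let $\mathfrak{p}_{\Pi'}$ be the parabolic subalgebra with respect to $\Pi'$, i.e. $\mathfrak{p}_{\Pi'}$ is generated by $H_i$, $E_i$ with $0\le i\le q$ and by the $F_i$ corresponding to simple roots in $\Pi'$. Evidently, $\mathfrak{a}_{\Pi'}\subset\mathfrak{p}_{\Pi'}\subset\mathfrak{f}$ for any $\Pi'$. The parabolic subalgebra $\mathfrak{p}_{\Pi'}$ admits the Levi–Maltsev decomposition $\mathfrak{p}_{\Pi'} = \mathfrak{l}_{\Pi'}\mathbin{\subset\!\!\!\!\!+}\mathfrak{r}_{\Pi'}$, where $\mathfrak{l}_{\Pi'} = \mathfrak{h}_{\Pi\setminus\Pi'}\mathbin{\subset\!\!\!\!\!+}\mathfrak{a}_{\Pi'}$ is the Levi factor of $\mathfrak{p}_{\Pi'}$ and $\mathfrak{r}_{\Pi'}$ is the radical of $\mathfrak{p}_{\Pi'}$. The linear space $\mathfrak{f}$ can thus be decomposed into the direct sum $\mathfrak{f} = \mathfrak{a}_{\Pi'}\oplus\mathfrak{h}_{\Pi\setminus\Pi'}\oplus\mathfrak{r}_{\Pi'}\oplus\mathfrak{f}/\mathfrak{p}_{\Pi'}$. Let us choose a basis $(L_\beta, D_I, P_a, K^a)$ of $\mathfrak{f}$ such that the elements $L_\beta$, $D_I$, $P_a$ and $K^a$ form some bases in $\mathfrak{a}_{\Pi'}$, $\mathfrak{h}_{\Pi\setminus\Pi'}$, $\mathfrak{r}_{\Pi'}$ and $\mathfrak{f}/\mathfrak{p}_{\Pi'}$, respectively. Note that the involution $\tau$ maps $\mathfrak{r}_{\Pi'}$ to $\mathfrak{f}/\mathfrak{p}_{\Pi'}$ and vice versa. Therefore, both for $P_a$ and for $K^a$ the index a takes values a = 0, ..., M − 1, where $M = \dim(\mathfrak{r}_{\Pi'}) = \dim(\mathfrak{f}/\mathfrak{p}_{\Pi'})$. Note that the commutation relations of $\mathfrak{f}$ in the basis $(L_\beta, D_I, P_a, K^a)$ have the following structure:

$$\begin{aligned}
&[L, L]\sim L, \quad [P, P]\sim P, \quad [K, K]\sim K, \quad [D, L]\sim L, \quad [L, P]\sim P, \quad [L, K]\sim K,\\
&[P, K]\sim L + D + P + K, \quad [D, P]\sim P, \quad [D, K]\sim K, \quad [D, D] = 0,
\end{aligned} \tag{3.6}$$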
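where $L_\beta$, $D_I$, $P_a$, $K^a$ are operators of generalized Lorentz transformations, dilatations, translations and special conformal transformations, respectively.

Let $\mathcal{M}$ be some (usually infinite dimensional) $\mathfrak{f}$-module with the following properties. $\mathcal{M}$ decomposes into the direct sum of irreducible finite dimensional modules of $\mathfrak{l}_{\Pi'}$. The action of the Cartan subalgebra $\mathfrak{h}_{\Pi}\subset\mathfrak{f}$ is diagonalizable in $\mathcal{M}$. The action of the radical $\mathfrak{r}_{\Pi'}$ is locally nilpotent in $\mathcal{M}$, i.e. $\mathcal{M}$ admits a filtration by $\mathfrak{l}_{\Pi'}$-modules

$$\mathcal{M}_{(0)}\subset\mathcal{M}_{(1)}\subset\cdots\subset\mathcal{M}_{(f)}\subset\cdots\subset\mathcal{M}, \qquad \mathcal{M} = \bigcup_{f=0}^{\infty}\mathcal{M}_{(f)}, \tag{3.7}$$

where an $\mathfrak{l}_{\Pi'}$-module $\mathcal{M}_{(f)}$ is such that

$$(\mathfrak{r}_{\Pi'})^{f+1}\mathcal{M}_{(f)}\equiv 0, \tag{3.8}$$

i.e. a product of any f + 1 elements from $\mathfrak{r}_{\Pi'}$ annihilates any vector from $\mathcal{M}_{(f)}$. The filtration (3.7) gives rise to the grading on $\mathcal{M}$

$$\mathcal{M} = \bigoplus_{l=0}^{\infty}\mathcal{M}_{[l]}. \tag{3.9}$$

Here $\mathcal{M}_{[0]} = \mathcal{M}_{(0)}$ and $\mathcal{M}_{[l]}$ ($l\ge 1$) is the preimage of the quotient morphism

$$q : \mathcal{M}_{(l)}\to\mathcal{M}_{(l)}/\mathcal{M}_{(l-1)}, \qquad \mathcal{M}_{[l]} = q^{-1}\big(\mathcal{M}_{(l)}/\mathcal{M}_{(l-1)}\big), \tag{3.10}$$

where $q^{-1}$ is a homomorphism of $\mathfrak{l}_{\Pi'}$-modules satisfying $qq^{-1} = 1$. $q^{-1}$ is fixed uniquely provided that $\mathcal{M}_{(l-1)}$ does not contain $\mathfrak{l}_{\Pi'}$-irreducible submodules isomorphic to some of the $\mathfrak{l}_{\Pi'}$-irreducible submodules of $\mathcal{M}_{(l)}/\mathcal{M}_{(l-1)}$. Otherwise, to fix the arbitrariness in $q^{-1}$, an appropriate additional prescription is needed. We demand every $\mathcal{M}_{[l]}$, which is called the level-l submodule of $\mathcal{M}$, to form a finite dimensional module of $\mathfrak{l}_{\Pi'}$. An element $r\in\mathfrak{r}_{\Pi'}$ decreases the grading,

$$r : \mathcal{M}_{[l]}\to\mathcal{M}_{[l-n(r)]}, \tag{3.11}$$

where $n(r)\ge 1$ is an integer. Note that if $\mathfrak{r}_{\Pi'}$ is Abelian, then n(r) = 1 for any $r\in\mathfrak{r}_{\Pi'}$.

Let $\Xi$ be the Grassmann algebra on $\xi^n$, n = 0, 1, ..., M − 1, $\xi^n\xi^m = -\xi^m\xi^n$, where the $\xi^n$ are identified with the space-time basis 1-forms. Consider the tensor product $\mathcal{F} = \mathcal{M}\otimes\Xi$. $\mathcal{F}$ is bi-graded by the level of $\mathcal{M}$ (3.9) and by the exterior form degree of $\Xi$:

$$\mathcal{F} = \bigoplus_{p=0}^{M}\bigoplus_{l=0}^{\infty}\mathcal{F}^p_{[l]} = \bigoplus_{p=0}^{M}\mathcal{F}^p, \tag{3.12}$$

where $\mathcal{F}^p_{[l]}$ is the space of p-forms taking values in $\mathcal{M}_{[l]}$ and $\mathcal{F}^p$ is the space of p-forms taking values in the whole module $\mathcal{M}$.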
Consider the trivial vector bundle $B = \mathbb{R}^M\times\mathcal{F}$ over $\mathbb{R}^M$,

$$\begin{array}{ccc}\mathcal{F} & \longrightarrow & B\\ & & \downarrow\\ & & \mathbb{R}^M\end{array} \tag{3.13}$$

with the fiber $\mathcal{F}$. Let $\Gamma(B)$ denote the space of sections of B. We define the covariant derivative in B

$$D = \xi^n\partial_n + \xi^n\omega_n{}^{\beta}(x)L_\beta + \xi^n\omega_n{}^{a}(x)P_a + \xi^n\omega_n{}^{I}(x)D_I, \tag{3.14}$$

where $x^n$, n = 0, 1, ..., M − 1, are the space-time coordinates in $\mathbb{R}^M$. The connection 1-forms $\omega_n{}^{\beta}(x)$, $\omega_n{}^{a}(x)$ and $\omega_n{}^{I}(x)$ are chosen to satisfy the zero curvature equation (1.8). We require $\omega_n{}^{a}(x)$ to be non-degenerate,

$$\det|\omega_n{}^{a}(x)|\ne 0. \tag{3.15}$$

In the rest of this paper, we focus on the case of Abelian $\mathfrak{r}_{\Pi'}$,

$$[P_a, P_b] = 0. \tag{3.16}$$

In this case, (1.8) and (3.15) admit the simple solution

$$D = \xi^n\partial_n + \xi^n\delta_n^a P_a, \tag{3.17}$$

with $\omega_n{}^{\beta}(x) = \omega_n{}^{I}(x) = 0$ and $\omega_n{}^{a}(x) = \delta_n^a$, where $\delta_n^a$ is identified with the flat space co-frame in Cartesian coordinates. Choosing different solutions of (1.8) allows one to analyze the problem in any other coordinates. Having fixed the flat frame in the form of the Kronecker delta, in what follows we will not distinguish between the base and the fiber indices. Let us introduce the exterior differential

$$d = \xi^n\partial_n : \mathcal{F}^p_{[l]}\to\mathcal{F}^{p+1}_{[l]} \tag{3.18}$$

and the operator

$$\sigma_- = \xi^n P_n : \mathcal{F}^p_{[l]}\to\mathcal{F}^{p+1}_{[l-1]}. \tag{3.19}$$

We have

$$D = d + \sigma_-. \tag{3.20}$$

From (1.8), (3.18) and (3.19) it follows that the operators d and $\sigma_-$ are nilpotent and anticommutative:

$$dd = 0, \qquad \sigma_-\sigma_- = 0, \qquad d\sigma_- + \sigma_- d = 0. \tag{3.21}$$
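The three relations in (3.21) can be read off from the bi-grading. The following sketch assumes that the zero curvature condition (1.8) amounts to $D^2 = 0$ for the flat connection (3.17); the calligraphic $\mathcal{F}^p_{[l]}$ denotes the bi-graded components of (3.12).

```latex
\begin{aligned}
0 = D^2 = (d+\sigma_-)^2
  &= \underbrace{dd}_{\mathcal{F}^p_{[l]}\to\mathcal{F}^{p+2}_{[l]}}
   + \underbrace{(d\sigma_- + \sigma_- d)}_{\mathcal{F}^p_{[l]}\to\mathcal{F}^{p+2}_{[l-1]}}
   + \underbrace{\sigma_-\sigma_-}_{\mathcal{F}^p_{[l]}\to\mathcal{F}^{p+2}_{[l-2]}},\\
dd &= \xi^n\xi^m\partial_n\partial_m = 0, \qquad
\sigma_-\sigma_- = \xi^n\xi^m P_nP_m = \tfrac{1}{2}\,\xi^n\xi^m[P_n,P_m] = 0 .
\end{aligned}
```

The three terms shift the level by 0, −1 and −2, respectively, so they land in different components and must vanish separately; the last equality uses the Abelian condition (3.16).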
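Let $\mathfrak{c}\subset\mathcal{F}$ and $\mathfrak{e}\subset\mathfrak{c}\subset\mathcal{F}$ be the spaces of $\sigma_-$-closed and $\sigma_-$-exact forms, respectively,

$$\sigma_-\mathfrak{c} = 0, \qquad \mathfrak{e} = \sigma_-\mathcal{F}. \tag{3.22}$$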
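The cohomology $H(\mathfrak{r}_{\Pi'},\mathcal{M})$ of $\mathfrak{r}_{\Pi'}$ is the quotient $\mathfrak{c}/\mathfrak{e}$. Let p be the quotient mapping

$$p : \mathfrak{c}\to H(\mathfrak{r}_{\Pi'},\mathcal{M}). \tag{3.23}$$

This mapping is an $\mathfrak{l}_{\Pi'}$-homomorphism. We define the mapping

$$p^{-1} : H(\mathfrak{r}_{\Pi'},\mathcal{M})\to\mathfrak{c} \tag{3.24}$$

such that $pp^{-1} = 1$ and $p^{-1}$ is an $\mathfrak{l}_{\Pi'}$-homomorphism. These requirements fix $p^{-1}$ uniquely provided that $\mathfrak{e}$ does not contain $\mathfrak{l}_{\Pi'}$-irreducible submodules isomorphic to some of the $\mathfrak{l}_{\Pi'}$-irreducible submodules of $\mathfrak{c}/\mathfrak{e}$. Otherwise, to fix the arbitrariness in $p^{-1}$, an appropriate additional prescription is needed. The space $\mathcal{F}$ decomposes into the direct sum of $\mathfrak{l}_{\Pi'}$-modules

$$\mathcal{F} = \mathcal{H}\oplus\mathfrak{e}\oplus\mathfrak{F}. \tag{3.25}$$

Here $\mathcal{H}$ denotes $p^{-1}(H(\mathfrak{r}_{\Pi'},\mathcal{M}))$, $\mathfrak{e}$ complements $\mathcal{H}$ to $\mathfrak{c}$ and $\mathfrak{F}$ complements $\mathfrak{c}$ to $\mathcal{F}$. The gradings (3.12) of $\mathcal{F}$ induce the gradings of $\mathcal{H}$, $\mathfrak{e}$ and $\mathfrak{F}$. Let $\mathcal{H}^p_{[l]}$, $\mathfrak{e}^p_{[l]}$ and $\mathfrak{F}^p_{[l]}$ denote the corresponding homogeneous subspaces. Note that $\mathcal{H}^0 = \mathfrak{c}^0 = \mathcal{F}^0_{[0]}$ and thus $p^{-1}$ is the identity in the sector of 0-forms. Introduce the subbundle $b = \mathbb{R}^M\times\mathcal{H}$ of the bundle B,

$$\begin{array}{ccc}\mathcal{H} & \longrightarrow & b\\ & & \downarrow\\ & & \mathbb{R}^M\end{array} \tag{3.26}$$

with the fiber $\mathcal{H}\subset\mathcal{F}$. Let $\Gamma(b)$ denote the space of sections of b. Let a p-form $|\phi^p(x)\rangle\in\Gamma(b)$ be a section of b. Now we are in a position to formulate $\mathfrak{f}_{\mathfrak{p}_{\Pi'}}$-invariant differential equations on $|\phi^p(x)\rangle$ as the conditions for $|\phi^p(x)\rangle$ to admit a lift to a p-form $|\Phi^p(x)\rangle\in\Gamma(B)$ such that

$$D|\Phi^p(x)\rangle = 0, \qquad |\Phi^p(x)\rangle\big|_b = |\phi^p(x)\rangle. \tag{3.27}$$

Here $|\Phi^p(x)\rangle|_b$ is the projection of $\mathcal{F}$ to $\mathcal{H}$ in the decomposition (3.25). Call a section $|\Phi^p(x)\rangle\in\Gamma(B)$ D-horizontal if $D|\Phi^p(x)\rangle = 0$. Call a section $|\Phi^p(x)\rangle\in\Gamma(B)$ a D-horizontal lift of $|\phi^p(x)\rangle\in\Gamma(b)$ if it satisfies (3.27).

Taking into account (1.8), the equation $D|\Phi^p(x)\rangle = 0$ is invariant under the gauge transformation

$$\delta|\Phi^p(x)\rangle = D|\epsilon^{p-1}(x)\rangle, \tag{3.28}$$

where $|\epsilon^{p-1}(x)\rangle\in\Gamma(B)$ is an arbitrary (p − 1)-form. Note that for $p\ge 2$ (3.28) is invariant under the second order gauge transformation

$$\delta|\epsilon^{p-1}(x)\rangle = D|\chi^{p-2}(x)\rangle, \tag{3.29}$$

where $|\chi^{p-2}(x)\rangle$ is an arbitrary (p − 2)-form. For $p\ge 3$, (3.29) is invariant under the third order gauge transformation, and so on. We will distinguish between T (trivial), D (differential) and A (algebraic) classes of gauge transformations with the gauge parameters $|\epsilon^{p-1}_T(x)\rangle = |\psi^{p-1}_T(x)\rangle\ +$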
$D|\chi^{p-2}_T(x)\rangle$, $|\epsilon^{p-1}_D(x)\rangle = |\psi^{p-1}_D(x)\rangle + D|\chi^{p-2}_D(x)\rangle$ and $|\epsilon^{p-1}_A(x)\rangle = |\psi^{p-1}_A(x)\rangle + D|\chi^{p-2}_A(x)\rangle$, respectively, with some (p − 1)-forms $|\psi^{p-1}_T(x)\rangle\in\mathfrak{e}$, $|\psi^{p-1}_D(x)\rangle\in\mathcal{H}$, $|\psi^{p-1}_A(x)\rangle\in\mathfrak{F}$. The ambiguity in the second-order gauge parameters $|\chi^{p-2}_T(x)\rangle$, $|\chi^{p-2}_D(x)\rangle$ and $|\chi^{p-2}_A(x)\rangle$ manifests the fact that the decomposition into the T, D, and A gauge transformations is not unique. One can see, in particular, that any T-transformation reduces to a linear combination of some A-transformation and D-transformation and can therefore be discarded. Indeed, let $|\epsilon^{p-1}_{T[l]}(x)\rangle = \sigma_-|\chi^{p-2}_{T[l+1]}(x)\rangle$ be a level-l T-transformation parameter. Taking into account (3.21), one gets

$$\delta_T|\Phi^p(x)\rangle = d|\epsilon^{p-1}_{T[l]}(x)\rangle = -\sigma_- d|\chi^{p-2}_{T[l+1]}(x)\rangle = -D\,d|\chi^{p-2}_{T[l+1]}(x)\rangle. \tag{3.30}$$

Decompose $-d|\chi^{p-2}_{T[l+1]}(x)\rangle$ into a combination of level-(l + 1) D, A and T gauge parameters. If the resulting level-(l + 1) T-parameter is nonzero, one applies the same procedure, and so on.

The roles of the D and A gauge transformations are as follows. The variation of $|\Phi^p(x)\rangle$ under D-transformations is purely differential,

$$\delta_D|\Phi^p(x)\rangle = d|\epsilon^{p-1}_D(x)\rangle. \tag{3.31}$$

D-transformations generalize the gradient transformations in electrodynamics and linearized diffeomorphisms in gravity. A-transformations are gauge transformations of the form

$$\delta_A|\Phi^p(x)\rangle = d|\epsilon^{p-1}_A(x)\rangle + \sigma_-|\epsilon^{p-1}_A(x)\rangle \tag{3.32}$$

with a nonzero second term. These are analogous to the linearized local Lorentz transformations in gravity.

Now, following [13], we prove that the existence of a D-horizontal lift (see (3.27)) is governed by $H^{p+1}(\mathfrak{r}_{\Pi'},\mathcal{M})$.

Theorem 3.1
(1) Let $|\phi^p(x)\rangle\in\Gamma(b)$ and let there exist $|\Phi^p(x)\rangle_1$ and $|\Phi^p(x)\rangle_2\in\Gamma(B)$ that are D-horizontal lifts of $|\phi^p(x)\rangle$. Then $|\Phi^p(x)\rangle_1 - |\Phi^p(x)\rangle_2 = \delta_A|\chi^{p-1}(x)\rangle$ for some $|\chi^{p-1}(x)\rangle\in\Gamma(B)$ (see (3.32)).
(2) The following two statements are equivalent: (a) any section $|\phi^p(x)\rangle\in\Gamma(b)$ has a D-horizontal lift to a $|\Phi^p(x)\rangle\in\Gamma(B)$; (b) $H^{p+1}(\mathfrak{r}_{\Pi'},\mathcal{M}) = 0$.
(3) If $H^{p+1}(\mathfrak{r}_{\Pi'},\mathcal{M})\ne 0$, there exists a system of differential equations

$$R|\phi^p(x)\rangle = 0 \tag{3.33}$$

such that any solution of (3.33) admits a D-horizontal lift to a $|\Phi^p(x)\rangle\in\Gamma(B)$ and all $|\phi^p(x)\rangle\in\Gamma(b)$ admitting such a lift satisfy (3.33).
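A toy illustration of Theorem 3.1 may help fix ideas. It rests on assumptions that are not in the text: a one-dimensional base (M = 1), 0-forms (p = 0), and a hypothetical truncated module with basis e_0, ..., e_L on which the single translation generator acts as P e_l = e_{l-1}, P e_0 = 0 (only the action of the Abelian radical is modelled, not a full conformal algebra). The level-l part of D|Φ⟩ = 0 then reads ∂φ_l + φ_{l+1} = 0 for l < L, while at the top level nothing can absorb ∂φ_L, which plays the role of a nontrivial cohomology class and yields the equation ∂^{L+1} φ_0 = 0. Fields are polynomials stored as coefficient lists.

```python
def derivative(poly):
    """Derivative of a polynomial given by coefficients [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(poly)][1:] or [0]


def horizontal_lift(phi0, top_level):
    """Solve d(phi_l) + phi_{l+1} = 0 level by level for l = 0..top_level-1.

    Returns (components, obstruction), where components[l] is phi_l and
    obstruction = d(phi_top) is the part that no higher component can cancel
    in this toy (hypothetical) truncated module.
    """
    components = [list(phi0)]
    for _ in range(top_level):
        # auxiliary field: phi_{l+1} = -d(phi_l)
        components.append([-c for c in derivative(components[-1])])
    obstruction = derivative(components[-1])
    return components, obstruction


def admits_lift(phi0, top_level):
    """True if the dynamical field phi0 satisfies the resulting equation."""
    _, obstruction = horizontal_lift(phi0, top_level)
    return all(c == 0 for c in obstruction)
```

For the quadratic polynomial 1 + 2x + 3x², `admits_lift([1, 2, 3], 2)` holds because its third derivative vanishes, whereas `admits_lift([1, 2, 3], 1)` fails because its second derivative does not. In the untruncated version of this toy module every level can be solved and no equation arises, in line with part (2) of the theorem.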
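Proof. Let us look for a lift $|\Phi^p(x)\rangle$ in the form

$$|\Phi^p(x)\rangle = |\varphi_{[0]}(x)\rangle + |\varphi_{[1]}(x)\rangle + |\varphi_{[2]}(x)\rangle + \cdots, \tag{3.34}$$

where $|\varphi_{[l]}(x)\rangle\in\mathcal{F}^p_{[l]}$. The condition $|\Phi^p(x)\rangle|_b = |\phi^p(x)\rangle$ fixes the first term in this decomposition, $|\varphi_{[0]}(x)\rangle|_b = |\phi^p(x)\rangle\cap\mathcal{F}^p_{[0]}$, modulo a $\sigma_-$-exact form $|\varphi_{[0]}(x)\rangle_{\mathfrak{e}}\in\mathfrak{e}^p_{[0]}$. The freedom in $|\varphi_{[0]}(x)\rangle_{\mathfrak{e}}\in\mathfrak{e}^p_{[0]}$ is a consequence of A-gauge symmetry, i.e. $|\varphi_{[0]}(x)\rangle$ is reconstructed modulo an A-gauge part (which, of course, also contributes to $|\varphi_{[1]}(x)\rangle_{\mathcal{H}}$).

Suppose that $H^{p+1}(\mathfrak{r}_{\Pi'},\mathcal{M})$ is trivial, i.e. $\mathfrak{c}^{p+1} = \mathfrak{e}^{p+1}$. To reconstruct $|\Phi^p(x)\rangle$, we use the following step-by-step procedure. The zero level part of (3.27) reads

$$d|\varphi_{[0]}(x)\rangle + \sigma_-|\varphi_{[1]}(x)\rangle = 0, \tag{3.35}$$
$$|\varphi_{[1]}(x)\rangle\big|_b = |\phi^p(x)\rangle\cap\mathcal{F}^p_{[1]}. \tag{3.36}$$

Since $|\varphi_{[0]}(x)\rangle$ has the lowest grading, it is $\sigma_-$-closed. $d|\varphi_{[0]}(x)\rangle$ is also $\sigma_-$-closed because $d\sigma_- + \sigma_- d = 0$. Since $H^{p+1}(\mathfrak{r}_{\Pi'},\mathcal{M})$ is trivial, $d|\varphi_{[0]}(x)\rangle$ is $\sigma_-$-exact,

$$d|\varphi_{[0]}(x)\rangle = \sigma_-|\chi_{[1]}(x)\rangle \tag{3.37}$$

for some $|\chi_{[1]}(x)\rangle$. Setting $|\varphi_{[1]}(x)\rangle = -|\chi_{[1]}(x)\rangle$ we solve Eq. (3.35) modulo an arbitrary $\sigma_-$-closed form $|\varphi_{[1]}(x)\rangle_{\mathfrak{c}}\in\mathfrak{c}^p_{[1]}$. The condition (3.36) fixes $|\varphi_{[1]}(x)\rangle_{\mathfrak{c}}$ modulo an arbitrary $\sigma_-$-exact form $|\varphi_{[1]}(x)\rangle_{\mathfrak{e}}\in\mathfrak{e}^p_{[1]}$, which parametrizes the level-1 restriction of some A-gauge part with a level-2 gauge parameter. As a result, $|\varphi_{[1]}(x)\rangle\in\mathcal{F}^p_{[1]}$ is expressed via the first derivatives of $|\phi^p(x)\rangle\cap\mathcal{F}^p_{[0]}$ and via $|\phi^p(x)\rangle\cap\mathcal{F}^p_{[1]}$ modulo an arbitrary A-gauge part. The first level part of (3.27),

$$d|\varphi_{[1]}(x)\rangle + \sigma_-|\varphi_{[2]}(x)\rangle = 0, \tag{3.38}$$
$$|\varphi_{[2]}(x)\rangle\big|_b = |\phi^p(x)\rangle\cap\mathcal{F}^p_{[2]}, \tag{3.39}$$

is considered analogously. $d|\varphi_{[1]}(x)\rangle$ is $\sigma_-$-closed because $\sigma_- d|\varphi_{[1]}(x)\rangle = -d\sigma_-|\varphi_{[1]}(x)\rangle = d^2|\varphi_{[0]}(x)\rangle = 0$. Introducing $|\chi_{[2]}(x)\rangle\in\mathcal{F}^p_{[2]}$ such that $d|\varphi_{[1]}(x)\rangle = \sigma_-|\chi_{[2]}(x)\rangle$ and setting $|\varphi_{[2]}(x)\rangle = -|\chi_{[2]}(x)\rangle$ we solve Eq. (3.38) modulo an arbitrary $\sigma_-$-closed form $|\varphi_{[2]}(x)\rangle_{\mathfrak{c}}\in\mathfrak{c}^p_{[2]}$. The condition (3.39) fixes $|\varphi_{[2]}(x)\rangle_{\mathfrak{c}}$ modulo an arbitrary $\sigma_-$-exact form $|\varphi_{[2]}(x)\rangle_{\mathfrak{e}}\in\mathfrak{e}^p_{[2]}$, which parametrizes the level-2 restriction of some A-gauge part with a level-3 gauge parameter. As a result, $|\varphi_{[2]}(x)\rangle$ is expressed via the second derivatives of $|\phi^p(x)\rangle\cap\mathcal{F}^p_{[0]}$, via the first derivatives of $|\phi^p(x)\rangle\cap\mathcal{F}^p_{[1]}$ and via $|\phi^p(x)\rangle\cap\mathcal{F}^p_{[2]}$ modulo some A-gauge terms. Repetition of this procedure reconstructs the lift $|\Phi^p(x)\rangle$ in the form (3.34) with $|\varphi_{[l]}(x)\rangle$ expressed in terms of derivatives of $|\phi^p(x)\rangle$ modulo an A-gauge part.

Suppose now that $H^{p+1}(\mathfrak{r}_{\Pi'},\mathcal{M})$ is nontrivial. Then it decomposes into a sum of some definite grade nonzero subspaces

$$H^{p+1}(\mathfrak{r}_{\Pi'},\mathcal{M}) = H^{p+1}_{[l_1]}(\mathfrak{r}_{\Pi'},\mathcal{M})\oplus H^{p+1}_{[l_2]}(\mathfrak{r}_{\Pi'},\mathcal{M})\oplus\cdots, \tag{3.40}$$

where $0\le l_1 < l_2 < \cdots$. Carrying out the first $l_1$ steps of the described procedure, we solve (3.27) up to the $(l_1-1)$-th level, expressing all $|\varphi_{[l]}(x)\rangle$ with $1\le l\le l_1$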
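via derivatives of $|\phi^p(x)\rangle$ modulo some A-gauge part. The level-$l_1$ sector of (3.27) reads

$$d|\varphi_{[l_1]}(x)\rangle + \sigma_-|\varphi_{[l_1+1]}(x)\rangle = 0, \tag{3.41}$$
$$|\varphi_{[l_1+1]}(x)\rangle\big|_b = |\phi^p(x)\rangle\cap\mathcal{F}^p_{[l_1+1]}. \tag{3.42}$$

From the level-$(l_1-1)$ sector of (3.27) it follows that the (p + 1)-form $d|\varphi_{[l_1]}(x)\rangle$ is $\sigma_-$-closed. However, Eq. (3.41) imposes the stronger condition that $d|\varphi_{[l_1]}(x)\rangle$ is $\sigma_-$-exact, thus requiring those combinations of $d|\varphi_{[l_1]}(x)\rangle$ that belong to the cohomology class $H^{p+1}(\mathfrak{r}_{\Pi'},\mathcal{M})$ to vanish. This imposes some differential equations on $|\phi^p(x)\rangle$ of orders not higher than $l_1$,

$$R_{[l_1]}|\phi^p(x)\rangle = 0. \tag{3.43}$$

In addition, Eqs. (3.41) and (3.42) express $|\varphi_{[l_1+1]}(x)\rangle$ via derivatives of $|\phi^p(x)\rangle$ modulo an arbitrary A-gauge part $|\varphi_{[l_1+1]}(x)\rangle_{\mathfrak{e}}\in\mathfrak{e}^p_{[l_1+1]}$. Solving (3.27) further level by level, we fix $|\varphi_{[l_1+1]}(x)\rangle, \ldots, |\varphi_{[l_2]}(x)\rangle$ modulo an arbitrary A-gauge part. At level $l_2$, the equations

$$d|\varphi_{[l_2]}(x)\rangle + \sigma_-|\varphi_{[l_2+1]}(x)\rangle = 0, \tag{3.44}$$
$$|\varphi_{[l_2+1]}(x)\rangle\big|_b = |\phi^p(x)\rangle\cap\mathcal{F}^p_{[l_2+1]} \tag{3.45}$$

fix $|\varphi_{[l_2+1]}(x)\rangle$ in terms of derivatives of $|\phi^p(x)\rangle$ modulo an A-gauge part $|\varphi_{[l_2+1]}(x)\rangle_{\mathfrak{e}}\in\mathfrak{e}^p_{[l_2+1]}$ and impose some additional differential equations of orders not higher than $l_2$,

$$R_{[l_2]}|\phi^p(x)\rangle = 0. \tag{3.46}$$

Repetition of this procedure reconstructs, modulo an A-gauge part, a lift $|\Phi^p(x)\rangle$ in the form (3.34) for $|\phi^p(x)\rangle$ satisfying the system of differential equations

$$R_{[l_1]}|\phi^p(x)\rangle = 0, \qquad R_{[l_2]}|\phi^p(x)\rangle = 0, \qquad \ldots \tag{3.47}$$

To show that the system (3.47) is necessarily nontrivial if $H^{p+1}(\mathfrak{r}_{\Pi'},\mathcal{M})$ is nonzero, let us construct a section $|\phi^p(x)\rangle\in\Gamma(b)$ that does not satisfy (3.47). Let us choose some $|\psi_{[l_1]}(x)\rangle\in H^{p+1}_{[l_1]}$ such that $d|\psi_{[l_1]}(x)\rangle = 0$ (for example, one can choose $|\psi_{[l_1]}(x)\rangle\in H^{p+1}_{[l_1]}$ to be x-independent). Then $|\psi_{[l_1]}(x)\rangle = d|\tilde\varphi_{[l_1]}(x)\rangle$ for some $|\tilde\varphi_{[l_1]}(x)\rangle$. Decompose $|\tilde\varphi_{[l_1]}(x)\rangle = |\tilde\varphi_{[l_1]}(x)\rangle_{\mathcal{H}} + |\tilde\varphi_{[l_1]}(x)\rangle_{\mathfrak{e}} + |\tilde\varphi_{[l_1]}(x)\rangle_{\mathfrak{F}}$ in accordance with (3.25). Consider now the $(l_1-1)$-th level part of Eq. (3.27),

$$d|\tilde\varphi_{[l_1-1]}(x)\rangle + \sigma_-|\tilde\varphi_{[l_1]}(x)\rangle = 0. \tag{3.48}$$

Because $\sigma_-|\tilde\varphi_{[l_1]}(x)\rangle$ is d-closed ($d\sigma_-|\tilde\varphi_{[l_1]}(x)\rangle = -\sigma_- d|\tilde\varphi_{[l_1]}(x)\rangle = 0$), we can solve it for $|\tilde\varphi_{[l_1-1]}(x)\rangle = |\tilde\varphi_{[l_1-1]}(x)\rangle_{\mathcal{H}} + |\tilde\varphi_{[l_1-1]}(x)\rangle_{\mathfrak{e}} + |\tilde\varphi_{[l_1-1]}(x)\rangle_{\mathfrak{F}}$. Repeating this “inverse” procedure, we find $|\tilde\varphi_{[l_1]}(x)\rangle_{\mathcal{H}}, \ldots, |\tilde\varphi_{[0]}(x)\rangle_{\mathcal{H}}$, arriving at the field $|\phi^p(x)\rangle = |\tilde\varphi_{[0]}(x)\rangle_{\mathcal{H}} + \cdots + |\tilde\varphi_{[l_1]}(x)\rangle_{\mathcal{H}}$ that solves (3.27) for the levels $0, 1, \ldots, l_1-1$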
but satisfies the modified Eq. (3.27) with the nonzero right-hand side proportional to $|\psi_{[l_1]}(x)\rangle\in H^{p+1}_{[l_1]}$ at the level $l_1$, thus violating (3.47).

Remark 3.2. If there exists a D-horizontal lift of a 0-form $|\phi^0(x)\rangle$ to a 0-form $|\Phi^0(x)\rangle\in\Gamma(B)$, then it is unique.

Proof. Gauge symmetries (3.28) trivialize in the sector of 0-forms.

Remark 3.3. Consider the subbundle $b' = \mathbb{R}^M\times(\mathcal{H}\oplus\mathfrak{e})$ of the bundle B,

$$\begin{array}{ccc}\mathcal{H}\oplus\mathfrak{e} & \longrightarrow & b'\\ & & \downarrow\\ & & \mathbb{R}^M\end{array} \tag{3.49}$$

with the fiber $\mathcal{H}\oplus\mathfrak{e} = \mathfrak{c}\subset\mathcal{F}$. If there exists a D-horizontal lift of $|\phi^p(x)\rangle\in\Gamma(b')$ to a p-form $|\Phi^p(x)\rangle\in\Gamma(B)$, then it is unique.

Proof. Restriction to $b'$ fixes some A-gauge.

Remark 3.4. Theorem 3.1 allows the following interpretation. Given Eq. (3.27), a section $|\Phi^p(x)\rangle$ decomposes into $|\Phi^p(x)\rangle = |\Phi^p(x)\rangle_{\mathcal{H}} + |\Phi^p(x)\rangle_{\mathfrak{e}} + |\Phi^p(x)\rangle_{\mathfrak{F}}$. The subsection $|\Phi^p(x)\rangle_{\mathcal{H}}$ describes dynamical fields subject to some differential equations (3.33). Solutions of these differential equations are moduli of solutions of Eq. (3.27). The part $|\Phi^p(x)\rangle_{\mathfrak{F}}$ describes a (usually infinite) set of fields expressed by Eq. (3.27) via derivatives of the dynamical fields. The fields of this class are called auxiliary fields and the equations that express them are called constraints. The A-gauge symmetry (3.32) (generalized local Lorentz symmetry) allows one to get rid of the $\sigma_-$-exact terms $|\Phi^p(x)\rangle_{\mathfrak{e}}$. The D-gauge symmetry (3.31) with the parameters in $H^{p-1}(\mathfrak{r}_{\Pi'},\mathcal{M})$ acts on the dynamical fields $|\Phi^p(x)\rangle_{\mathcal{H}}$ and is the gauge symmetry of equations (3.33).

Remark 3.5. According to (1.10), solutions of (3.27) are parametrized by the values of $|\Phi^p(x)\rangle|_{x\in\epsilon(x_0)}$ in a neighborhood $\epsilon(x_0)$ of any point $x_0$. This is because Eq. (3.27) expresses all higher level ($l\ge 1$) components of $|\Phi^p(x)\rangle$ via higher derivatives of $|\phi^p(x)\rangle$. As a result, the fields $|\phi^p(x)\rangle$ can be expressed modulo gauge symmetries in terms of $|\Phi^p(x)\rangle|_{x\in\epsilon(x_0)}$ by virtue of the Taylor expansion.

For the rest of this paper, we mostly confine ourselves to the sector of 0-forms, which turns out to be rich enough to reformulate any $\mathfrak{f}_{\mathfrak{p}_{\Pi'}}$-invariant linear differential system in the unfolded form by introducing appropriate auxiliary fields. In other words, for any $\mathfrak{f}_{\mathfrak{p}_{\Pi'}}$-invariant linear differential system $R|\phi^0(x)\rangle = 0$, there exists some $\mathfrak{f}$-module $\mathcal{M}_R$ which gives rise to $R|\phi^0(x)\rangle = 0$ by virtue of the procedure described above. (Note that any equation $R|\phi^p(x)\rangle = 0$ can be rewritten in terms of 0-forms by converting indices of forms into tangent indices with the aid of the frame field. The formulation in terms of higher forms may, however, be useful for the analysis of nonlinear dynamics and will be discussed elsewhere.) Thus the problem of listing all linear $\mathfrak{f}_{\mathfrak{p}_{\Pi'}}$-invariant
differential systems is equivalent to the problem of calculating the cohomology $H^0(\mathfrak{r}_{\Pi'},\mathcal{M})$ and $H^1(\mathfrak{r}_{\Pi'},\mathcal{M})$ for any $\mathfrak{f}$-module $\mathcal{M}$. An important subclass of such systems is formed by those associated with irreducible $\mathcal{M}$.

Definition 3.6. A system of $\mathfrak{f}_{\mathfrak{p}_{\Pi'}}$-invariant linear differential equations

$$R|\phi^0(x)\rangle = 0 \tag{3.50}$$

is called primitive if the $\mathfrak{f}$-module $\mathcal{M}_R$ corresponding to (3.50) as in Theorem 3.1 is irreducible.

Reducible modules can be treated as extensions of the irreducible ones. Let $I_1$ and $I_2$ be some irreducible $\mathfrak{f}$-modules. Consider a module $\mathcal{M}$ defined by the exact sequence

$$0\to I_1\to\mathcal{M}\to I_2\to 0. \tag{3.51}$$

A trivial possibility is $\mathcal{M} = I_1\oplus I_2$. The non-primitive system corresponding to $I_1\oplus I_2$ decomposes into two independent primitive subsystems

$$R_{I_1}|\phi^0_{I_1}(x)\rangle = 0, \tag{3.52}$$
$$R_{I_2}|\phi^0_{I_2}(x)\rangle = 0, \tag{3.53}$$

where $R_{I_1}$, $|\phi^0_{I_1}(x)\rangle$ and $R_{I_2}$, $|\phi^0_{I_2}(x)\rangle$ correspond to $I_1$ and $I_2$, respectively. For some particular irreducible $I_1$ and $I_2$, however, a module $\mathcal{M} = E_{I_1,I_2}$ non-isomorphic to $I_1\oplus I_2$ may also exist. The non-primitive system corresponding to $E_{I_1,I_2}$,

$$R_{E_{I_1,I_2}}|\phi^0_{E_{I_1,I_2}}(x)\rangle = 0, \tag{3.54}$$

contains the system (3.53) for the dynamical fields $|\phi^0_{I_2}(x)\rangle$ associated with $\mathcal{M} = I_2$. The system (3.52) results from (3.54) at $|\phi^0_{I_2}(x)\rangle = 0$, which means that the space of solutions of the non-primitive system (3.54) contains the invariant subspace of solutions of the system (3.52). In other words, the equations that contain $d|\phi^0_{I_2}(x)\rangle$ are independent of $|\phi^0_{I_1}(x)\rangle$, while those that contain $d|\phi^0_{I_1}(x)\rangle$ contain some terms with $|\phi^0_{I_2}(x)\rangle$. Further extensions of the types

$$0\to I_3\to\mathcal{M}'\to E_{I_1,I_2}\to 0 \tag{3.55}$$

or

$$0\to E_{I_1,I_2}\to\mathcal{M}''\to I_3\to 0 \tag{3.56}$$

with indecomposable modules $\mathcal{M}'$ and $\mathcal{M}''$ can also be considered. As a result, all possible $\mathfrak{f}_{\mathfrak{p}_{\Pi'}}$-invariant linear differential equations can be classified in terms of extensions of the primitive equations. Some examples of nontrivial extensions are considered in Secs. 2.4, 4.7.2 and 4.7.3.

To summarize, the construction is as follows. To write down all $\mathfrak{f}_{\mathfrak{p}_{\Pi'}}$-invariant homogeneous equations on a finite number of fields for a semi-simple Lie algebra $\mathfrak{f}$
one has to classify all $\mathfrak{f}$-modules that are integrable with respect to the parabolic subalgebra $\mathfrak{p}_{\Pi'}\subset\mathfrak{f}$ with the Abelian radical $\mathfrak{r}_{\Pi'}$. These consist of irreducible $\mathfrak{f}$-modules of this class and all their extensions. The unfolded form of the $\mathfrak{f}_{\mathfrak{p}_{\Pi'}}$-invariant homogeneous equations is the covariant constancy equation (1.6) for the 0-form section $|\Phi^0(x)\rangle$ of the bundle B. Dynamical fields form the 0-form section $|\phi^0(x)\rangle$ of b. Differential field equations on the dynamical fields are characterized by the cohomology $H^1(\mathfrak{r}_{\Pi'},\mathcal{M})$, which is the linear space where the nontrivial left-hand sides of the equations $R|\phi^0(x)\rangle = 0$ take their values. Since Eq. (1.6) is $\mathfrak{f}$-invariant, the equation $R|\phi^0(x)\rangle = 0$ is $\mathfrak{f}$-invariant as well, i.e. $\mathfrak{f}$ maps its solutions to solutions. The construction is universal because any differential equations can be “unfolded” to some covariant constancy equation by adding enough (usually infinitely many) auxiliary fields expressed by virtue of the unfolded equations through derivatives of the dynamical fields $|\phi^0(x)\rangle$. If the original system of differential equations is $\mathfrak{f}$-invariant, the corresponding unfolded equation is also $\mathfrak{f}$-invariant, and the auxiliary fields together with the dynamical fields span the space of sections of B. Now we are in a position to give the full list of conformally invariant systems of differential equations in $\mathbb{R}^M$ ($M\ge 3$).

4. Conformal Systems of Equations

We set $\mathfrak{f} = o(M+2)$ with the commutation relations (2.1) ($o(M+2)\sim o(M,2)$ for the complex case we focus on). The structure of simple roots $\Pi$ for $o(M+2)$ depends on whether M is odd or even. For M = 2q, $o(M+2) = D_{q+1}$ and $\Pi$ is described by the Dynkin diagram

                                      ◦ α_{q−1}
                                     /
    ◦ ——— ◦ ——— ◦ ——— ... ——— ◦
    α_0   α_1   α_2          α_{q−2} \
                                      ◦ α_q            (4.1)

For odd M = 2q + 1, $o(M+2) = B_{q+1}$ and $\Pi$ is described by the Dynkin diagram

    ◦ ——— ◦ ——— ◦ ——— ... ——— ◦ ==>== ◦
    α_0   α_1   α_2          α_{q−1}   α_q             (4.2)
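The Cartan matrices behind the diagrams (4.1) and (4.2) can be generated explicitly. The sketch below assumes the standard realization of the simple roots in an orthonormal basis e_1, ..., e_{q+1} (α_i = e_{i+1} − e_{i+2} for i < q, with α_q = e_q + e_{q+1} for D_{q+1} and α_q = e_{q+1} for B_{q+1}); this realization and the function names are illustrative choices, not taken from the text. The matrix follows the convention (3.4), A_ij = α_j(H_i) = 2(α_i, α_j)/(α_i, α_i).

```python
def simple_roots(M):
    """Simple roots alpha_0..alpha_q of o(M+2) as integer vectors (assumed realization)."""
    n = M // 2 + 1                      # rank q + 1
    roots = []
    for i in range(n - 1):
        v = [0] * n
        v[i], v[i + 1] = 1, -1          # e_{i+1} - e_{i+2}
        roots.append(v)
    last = [0] * n
    if M % 2 == 0:
        last[n - 2], last[n - 1] = 1, 1  # D_{q+1}: e_q + e_{q+1}
    else:
        last[n - 1] = 1                  # B_{q+1}: short root e_{q+1}
    roots.append(last)
    return roots


def cartan_matrix(M):
    """A[i][j] = 2 (alpha_i, alpha_j) / (alpha_i, alpha_i), as in (3.4)."""
    roots = simple_roots(M)

    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    # The quotient is always an integer for these root systems.
    return [[2 * dot(ai, aj) // dot(ai, ai) for aj in roots] for ai in roots]
```

For M = 4 this gives the D_3 matrix with α_0 joined to both α_1 and α_2, and for M = 3 the B_2 matrix with the asymmetric pair of entries −1 and −2 that encodes the arrow in (4.2).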
In both cases, we choose $\Pi' = (\alpha_1, \ldots, \alpha_q)$ and hence $\mathfrak{p}_{\Pi'} = iso(M)\oplus o(2) = o(M)\oplus o(2)\mathbin{\subset\!\!\!\!\!+}t(M)$, where $\mathfrak{l}_{\Pi'} = o(M)\oplus o(2)$ is the direct sum of the Lorentz algebra and the dilatation while $\mathfrak{r}_{\Pi'} = t(M)$ is the algebra of momenta. Since the algebra t(M) is Abelian (cf. (2.1)), we can apply the results of Sec. 3 to classify all linear conformally invariant systems of differential equations in terms of the cohomology $H^0(t(M),\mathcal{M})$ and $H^1(t(M),\mathcal{M})$ of t(M) with coefficients in various integrable $o(M+2)$-modules $\mathcal{M}$. For the conformal algebra $o(M+2)$ and its parabolic subalgebra $iso(M)\oplus o(2)$, we calculate the cohomology $H^p(t(M), I)$ for any p and any irreducible module I using the information on the structure of the generalized Verma modules obtained by the methods developed in [17, 18, 28]. Once the cohomology $H^p(t(M), I)$ for any
irreducible module I is known, the cohomology $H^p(t(M), E)$ for any extension E of the irreducible modules can also be easily found.

4.1. Irreducible tensors and spinor-tensors

Consider an irreducible finite dimensional module $N_{(\lambda)}$ of $iso(M)\oplus o(2)$ with some basis elements $|(\lambda)\rangle_A$ of the carrier space, labelled by A, B:

$$L^{nm}|(\lambda)\rangle_A = L^{nm}{}_A{}^{B}\,|(\lambda)\rangle_B, \qquad D|(\lambda)\rangle_A = \Delta|(\lambda)\rangle_A, \qquad P^n|(\lambda)\rangle_A = 0. \tag{4.3}$$

We choose the highest weight of $N_{(\lambda)}$ in the form $(\lambda) = (\lambda_0, \lambda_1, \ldots, \lambda_q)$, where $\lambda_0 = -\Delta$ is the highest weight of o(2) and $(\lambda_1, \ldots, \lambda_q)$ is the highest weight of o(M). The condition that $N_{(\lambda)}$ is finite dimensional demands

$$2\lambda_1\equiv\cdots\equiv 2\lambda_q \mod 2, \tag{4.4}$$
$$\lambda_1\ge\lambda_2\ge\cdots\ge|\lambda_q|\ge 0, \qquad M \text{ is even}, \tag{4.5}$$
$$\lambda_1\ge\lambda_2\ge\cdots\ge\lambda_q\ge 0, \qquad M \text{ is odd}. \tag{4.6}$$
It is customary in physics to describe finite dimensional representations of the Lorentz algebra as appropriate irreducible spaces of traceless tensors or γ-transversal spinor-tensors. One possible realization is as follows. Let $2\lambda_1\equiv\cdots\equiv 2\lambda_q\equiv 0 \mod 2$. Consider the space of traceless tensors

$$T^{n^1(\lambda_1),n^2(\lambda_2),\ldots,n^q(|\lambda_q|)}, \qquad \eta_{n^in^j}\,T^{n^1(\lambda_1),n^2(\lambda_2),\ldots,n^q(|\lambda_q|)} = 0, \qquad 1\le i, j\le q, \tag{4.7}$$

where, following [29], we write $n^i(\lambda_i)$ instead of writing a set of $\lambda_i$ totally symmetrized indices $n^i_1, n^i_2, \ldots, n^i_{\lambda_i}$, i.e. we indicate in parentheses how many indices are subject to total symmetrization. For example, we write $T^{n(\lambda)}$ instead of the rank-λ symmetric tensor $T^{n_1\cdots n_\lambda}$. We use the convention that upper (lower) indices denoted by the same letter inside parentheses are symmetrized. For example, $T^{(n_1}P^{n_2)}$ is equivalent to $\frac{1}{2}(T^{n_1}P^{n_2} + T^{n_2}P^{n_1})$. The tensor $T^{n^1(\lambda_1),n^2(\lambda_2),\ldots,n^q(|\lambda_q|)}$ is totally symmetric within each group of $\lambda_i$ indices $n^i$. We impose the condition that the total symmetrization of indices $n^i(\lambda_i)$ with any index from some set $n^j(\lambda_j)$ with j > i gives zero. Such symmetry properties are described by the Young tableau Λ composed of rows of length $\lambda_1, \lambda_2, \ldots, |\lambda_q|$. Such tensors span the irreducible representation $N_{(\lambda)}$ whenever M is odd or $\lambda_q = 0$. For even M and $\lambda_q\ne 0$, this space is $N_{(\lambda_0,\lambda_1,\ldots,\lambda_q)}\oplus N_{(\lambda_0,\lambda_1,\ldots,-\lambda_q)}$, where the direct summands are the selfdual and antiselfdual parts of the tensors (see below).

Let $\sigma_1, \ldots, \sigma_p$ be the heights of the columns of Λ. Another basis in $N_{(\lambda)}$ with explicit antisymmetrizations consists of the traceless tensors

$$T^{m^1[\sigma_1],m^2[\sigma_2],\ldots,m^p[\sigma_p]}, \qquad \eta_{m^im^j}\,T^{m^1[\sigma_1],m^2[\sigma_2],\ldots,m^p[\sigma_p]} = 0, \qquad 1\le i, j\le p, \tag{4.8}$$

where $m^i[\sigma_i]$ denotes a set of totally antisymmetrized indices $m^i_1, m^i_2, \ldots, m^i_{\sigma_i}$. We use the convention that upper (lower) indices denoted by the same letter inside square brackets are antisymmetrized [29]. For example, $T^{[n_1}P^{n_2]}$ is equivalent to
November 1, 2006 11:8 WSPC/148-RMP
852
J070-00281
O. V. Shaynkman, I. Yu. Tipunin & M. A. Vasiliev
1 n1 n2 P 2 (T
− T n2 P n1 ). For a tensor associated with the Young tableau Λ, the condition is imposed that the total antisymmetrization of the indices mi [σi ] with any index from some set mj [σj ] with j > i gives zero. From the formula n1 ···nM m1 ···mM =
(−1)π(p) ηn1 mp(1) · · · ηnM mp(M ) ,
(4.9)
p
where summation is over all permutations p of indices mi , and π(p) = 0 or 1 is the oddness of the permutation p, it follows for traceless tensors that T ...,m
i
[σi ],...,mj [σj ],...
=0
(4.10)
if σi + σj > M for some i = j. From (4.10) along with the property that T ...,m
i
[σ],...,mj [σ],...
= T ...,m
j
[σ],...,mi [σ],...
,
(4.11)
it follows that there is essentially one way to define the Hodge conjugation operation ∗ for such tensors, (∗ T )k[M−σ1 ],m
2
[σ2 ],...,mp [σp ]
=
(i)σ1 (M−σ1 ) m1 [σ1 ],m2 [σ2 ],...,mp [σp ] T m1 [σ1 ] k[M−σ1 ] , σ1 ! (4.12)
where the normalization factor is fixed such that (∗∗ T )m
1
[σ1 ],...,mp [σp ]
= Tm
1
[σ1 ],...,mp [σp ]
.
(4.13)
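For $M=2q$ and $\lambda_q\neq0$, to single out the irreducible part of the $o(2q)$ tensor representation $T^{m^1[q],m^2[\sigma_2],\ldots,m^p[\sigma_p]}$, we impose the (anti)selfduality condition
$$*T^{m^1[q],m^2[\sigma_2],\ldots,m^p[\sigma_p]} = \pm T^{m^1[q],m^2[\sigma_2],\ldots,m^p[\sigma_p]}. \tag{4.14}$$
When $2\lambda_1\equiv\cdots\equiv2\lambda_q\equiv1 \bmod 2$, the basis $|(\lambda)\rangle^A$ of the module $N_{(\lambda)}$ can be realized by spinor-tensors
$$T^{n^1(\lambda_1-\frac12),n^2(\lambda_2-\frac12),\ldots,n^q(|\lambda_q|-\frac12),\alpha}\quad\text{or}\quad T^{m^1[\sigma_1],m^2[\sigma_2],\ldots,m^p[\sigma_p],\alpha}, \tag{4.15}$$
where $\alpha=1,\ldots,2^{[M/2]}$ is the spinor index. They satisfy analogous (anti)symmetry conditions and are γ-transversal, i.e.
$$\begin{aligned}
\gamma_{n^i}{}^{\beta}{}_{\alpha}\,T^{n^1(\lambda_1-\frac12),n^2(\lambda_2-\frac12),\ldots,n^q(|\lambda_q|-\frac12),\alpha} &= 0, & 1\le i\le q,\\
\gamma_{m^j}{}^{\beta}{}_{\alpha}\,T^{m^1[\sigma_1],m^2[\sigma_2],\ldots,m^p[\sigma_p],\alpha} &= 0, & 1\le j\le p,
\end{aligned}\tag{4.16}$$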
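where the γ matrices satisfy (2.19). From (4.16), it follows that $T^{n^1(\lambda_1-\frac12),n^2(\lambda_2-\frac12),\ldots,n^q(|\lambda_q|-\frac12),\alpha}$ and $T^{m^1[\sigma_1],m^2[\sigma_2],\ldots,m^p[\sigma_p],\alpha}$ are traceless. A counterpart of the identity (4.10) for γ-transversal spinor-tensors is
$$T^{m^1[\sigma_1],m^2[\sigma_2],\ldots,m^p[\sigma_p],\alpha} = 0 \tag{4.17}$$
if $2\sigma_i>M$ for some $i$. For $M=2q$, to single out the irreducible part of a spinor-tensor $o(2q)$ module, one imposes the additional chirality condition
$$\begin{aligned}
\Gamma^{\beta}{}_{\alpha}\,T^{n^1(\lambda_1-\frac12),n^2(\lambda_2-\frac12),\ldots,n^q(|\lambda_q|-\frac12),\alpha} &= \pm T^{n^1(\lambda_1-\frac12),n^2(\lambda_2-\frac12),\ldots,n^q(|\lambda_q|-\frac12),\beta},\\
\Gamma^{\beta}{}_{\alpha}\,T^{m^1[\sigma_1],m^2[\sigma_2],\ldots,m^p[\sigma_p],\alpha} &= \pm T^{m^1[\sigma_1],m^2[\sigma_2],\ldots,m^p[\sigma_p],\beta},
\end{aligned}\tag{4.18}$$
where
$$\Gamma^{\beta}{}_{\alpha} = (-i)^q\,(\gamma^1\cdots\gamma^{2q})^{\beta}{}_{\alpha} \tag{4.19}$$
is normalized to have unit square
$$\Gamma^{\beta}{}_{\gamma}\,\Gamma^{\alpha}{}_{\beta} = \delta^{\alpha}_{\gamma}. \tag{4.20}$$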
(Note that for odd $M$, Γ is the central element, which is required to be $\pm1$ in a chosen spinor representation and hence (4.18) is automatically satisfied.) For even $M$, a γ-transversal chiral spinor-tensor $T^{m^1[q],m^2[\sigma_2],\ldots,m^p[\sigma_p],\alpha}$ that has definite Young properties is automatically (anti)selfdual because
$$*T^{m^1[q],m^2[\sigma_2],\ldots,m^p[\sigma_p],\beta} = \Gamma^{\beta}{}_{\alpha}\,T^{m^1[q],m^2[\sigma_2],\ldots,m^p[\sigma_p],\alpha}. \tag{4.21}$$
4.2. Generalized Verma modules

The generalized Verma $o(M+2)$-module $V_{(\lambda)}$ is freely generated from a vacuum module $N_{(\lambda)}$ (see Sec. 4.1) by the operators $K^n$. Recall that $(\lambda)=(\lambda_0,\ldots,\lambda_q)$ satisfy (4.4)–(4.6). It is convenient to represent the action of $K^n$ as a multiplication by an independent variable $y^n$. Basis elements of $V_{(\lambda)}$ are formed by homogeneous polynomials
$$|l\rangle^{n(l);A} = \underbrace{y^{(n}\cdots y^{n)}}_{l}\,|(\lambda)\rangle^A,\qquad l=0,1,2,\ldots. \tag{4.22}$$
A special universality property of generalized Verma modules that makes them important for our analysis is that any irreducible $o(M+2)$-module $J_{(\lambda)}$ with the highest weight $(\lambda)$ integrable with respect to the parabolic subalgebra $iso(M)\oplus o(2)$ is a quotient of $V_{(\lambda)}$. The subspace $V_{(\lambda)l}\subset V_{(\lambda)}$ spanned by the degree $l$ monomials (4.22) is called the $l$th level of $V_{(\lambda)}$. The associated grading in $V_{(\lambda)}$ is
$$V_{(\lambda)} = \bigoplus_{l=0}^{\infty} V_{(\lambda)l}. \tag{4.23}$$
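The representation of the conformal algebra in $V_{(\lambda)}$ is
\begin{align}
L^{mk}|v\rangle &= \Big(y^k\frac{\partial}{\partial y_m} - y^m\frac{\partial}{\partial y_k} + L_0^{mk}\Big)|v\rangle, \tag{4.24}\\
D|v\rangle &= \Big(-\lambda_0 + y^j\frac{\partial}{\partial y^j}\Big)|v\rangle, \tag{4.25}\\
K^m|v\rangle &= y^m|v\rangle, \tag{4.26}\\
P^m|v\rangle &= \Big(2\Big(-\lambda_0 + y^j\frac{\partial}{\partial y^j}\Big)\frac{\partial}{\partial y_m} - y^m\frac{\partial}{\partial y^j}\frac{\partial}{\partial y_j} + 2L_0^{mj}\frac{\partial}{\partial y^j}\Big)|v\rangle, \tag{4.27}
\end{align}
where $|v\rangle\in V_{(\lambda)}$ and $L_0^{nm}$ acts in the vacuum module (4.3). $L^{mk}$ and $D$ preserve the level $l$. $D$ is the grading operator, i.e. $V_{(\lambda)l}$ is the eigenspace of $D$ with the eigenvalue $-\lambda_0+l$. $K^m$ and $P^m$ increase and decrease the level by one unit, respectively. Every level $V_{(\lambda)l}$ decomposes into a direct sum of $o(M)\oplus o(2)$ irreducible modules,
$$V_{(\lambda)l} = \bigoplus_{i=0}^{[l/2]} N_{(\lambda)}\otimes N_{(-l,\,l-2i,\,0,\ldots,0)} = \bigoplus_{(\mu)\in\Lambda_{(\lambda),l}} N_{(\mu)}, \tag{4.28}$$
where $\Lambda_{(\lambda),l}$ is the set of highest weights in this decomposition. An $o(M)\oplus o(2)$-module $S_{(\mu)}$ in the decomposition (4.28) with $l\ge1$ is called a singular module if
$$P^n S_{(\mu)} = 0. \tag{4.29}$$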
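The decomposition (4.28) can be sanity-checked by counting dimensions in the simplest case. The Python sketch below assumes, purely for illustration, a trivial (one-dimensional) vacuum module, so that level $l$ is just the space of degree-$l$ polynomials in $y^n$ and the summand with index $i$ is the space of traceless symmetric $o(M)$ tensors of rank $l-2i$. The function names are hypothetical and not part of the paper.

```python
from math import comb


def dim_sym(M, l):
    """Dimension of the degree-l homogeneous polynomials in M variables y^n."""
    return comb(M + l - 1, l)


def dim_traceless(M, k):
    """Dimension of the rank-k traceless symmetric o(M) tensors."""
    if k < 2:
        return dim_sym(M, k)  # nothing to subtract: no trace can be taken
    return dim_sym(M, k) - dim_sym(M, k - 2)


def level_dims(M, l):
    """Dimensions of the summands i = 0, ..., [l/2] of (4.28), i.e. ranks l - 2i."""
    return [dim_traceless(M, l - 2 * i) for i in range(l // 2 + 1)]


# The summands must add up to the full level (trivial vacuum module assumed).
for M in (3, 4, 5):
    for l in range(6):
        assert sum(level_dims(M, l)) == dim_sym(M, l)
```

For a nontrivial vacuum module each summand is additionally tensored with $N_{(\lambda)}$, so all dimensions are multiplied by $\dim N_{(\lambda)}$ and the check is unchanged.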
Any vector from $S_{(\mu)}$ is called a singular vector. Let the singular vectors $|s\rangle^A$ form a basis of $S_{(\mu)}$. Any singular module $S_{(\mu)}\subset V_{(\lambda)l}$ induces the proper submodule $P_{(\lambda),(\mu)}$ of $V_{(\lambda)}$ with the homogeneous elements of the form
$$|m\rangle^{n(m);A} = \underbrace{y^{(n}\cdots y^{n)}}_{m}\,|s\rangle^A,\qquad m\ge0. \tag{4.30}$$
Note that P(λ),(µ) is not freely generated from S(µ) , i.e. the elements |mn(m);A are not necessarily linearly independent. Also note that the grading (4.23) defined for generalized Verma modules differs from the grading (3.9) defined in Sec. 3 for arbitrary pΠ -integrable modules. Namely, V(λ)[0] consists of V(λ)0 along with all singular subspaces of V(λ) . In what follows, we use the grading (4.23). If V(λ) is irreducible, it does not contain singular modules. For reducible V(λ) , let S(µ1 ) , S(µ2 ) , . . . list all singular modules of V(λ) . Let P(λ) be the image in V(λ) of the module induced from S(µ1 ) ⊕ S(µ2 ) ⊕ · · · . Consider the quotient O(λ) = V(λ) /P(λ) . A singular module S(µ) of O(λ) is called a subsingular module of V(λ) . Its elements are called subsingular vectors. A singular module of the quotient O(λ) = O(λ) /P(λ) is called a subsubsingular module S(µ) of V(λ) and so on. For generalized Verma modules V(λ) of the conformal algebra the situation is relatively simple because V(λ) can have only singular and subsingular modules for M even and only singular modules for M odd (see Sec. 4.4 and Appendix A for more details).
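4.3. Contragredient modules

Let $\mathcal M$ be an $\mathfrak f$-module. The module $\mathcal M'$ contragredient to the module $\mathcal M$ is the graded dual to $\mathcal M$ vector space$^g$ with the action of the algebra $\mathfrak f$ defined as
$$f\alpha(v) = \alpha(\tau(f)v), \tag{4.31}$$
where $f\in\mathfrak f$, $v\in\mathcal M$, $\alpha\in\mathcal M'$ and τ is the Chevalley involution (3.5). Note that for any irreducible module $J_{(\lambda)}$ with the highest weight $(\lambda)$, the contragredient module $J'_{(\lambda)}$ is also irreducible with the same highest weight and, thus, $J_{(\lambda)}\sim J'_{(\lambda)}$. The module $V'_{(\lambda)}$ contragredient to the generalized Verma module $V_{(\lambda)}$ can be realized as follows. Consider $N'_{(\lambda)}\sim N_{(\lambda)}$ with the basis ${}_A\langle(\lambda)|$ dual to $|(\lambda)\rangle^A$,
$${}_B\langle(\lambda)||(\lambda)\rangle^A = \delta^A_B, \tag{4.32}$$
and the following action of the $iso(M)\oplus o(2)$ algebra
$${}_A\langle(\lambda)|L^{nm} = {}_B\langle(\lambda)|\,L_0^{nm}{}^B{}_A,\qquad {}_A\langle(\lambda)|D = -{}_A\langle(\lambda)|\lambda_0,\qquad {}_A\langle(\lambda)|P^n = 0. \tag{4.33}$$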
The vector space $V'_{(\lambda)}$ can be realized as the space of polynomials of $y^n$ with coefficients in $N'_{(\lambda)}$. It is convenient to extend the definition of the Chevalley involution to this realization as follows:
$$\tau(y^n) = \frac{\partial}{\partial y_n},\qquad \tau\Big(\frac{\partial}{\partial y_n}\Big) = y^n. \tag{4.34}$$
The $l$th level $V'_{(\lambda)l}$ of $V'_{(\lambda)}$ is spanned by the monomials
$${}_{n(l);A}\langle l| = \frac{1}{l!}\,{}_A\langle(\lambda)|\,\underbrace{y_{(n}\cdots y_{n)}}_{l}. \tag{4.35}$$
From
$$\tau(L^{nm}) = -L^{nm},\qquad \tau(D) = D,\qquad \tau(K^n) = P^n,\qquad \tau(P^n) = K^n, \tag{4.36}$$
it follows that the action (4.31) of $o(M+2)$ on $V'_{(\lambda)}$ is
\begin{align}
\langle\alpha|L^{mk} &= \langle\alpha|\Big(y^k\frac{\overleftarrow{\partial}}{\partial y_m} - y^m\frac{\overleftarrow{\partial}}{\partial y_k} + L_0^{mk}\Big), \tag{4.37}\\
\langle\alpha|D &= \langle\alpha|\Big(-\lambda_0 + \frac{\overleftarrow{\partial}}{\partial y_j}\,y_j\Big), \tag{4.38}\\
\langle\alpha|K^m &= \langle\alpha|\Big(2\Big(-\lambda_0 + \frac{\overleftarrow{\partial}}{\partial y^j}\,y_j\Big)y^m - y^j y_j\frac{\overleftarrow{\partial}}{\partial y_m} + 2L_0^{mj}\,y_j\Big), \tag{4.39}\\
\langle\alpha|P^m &= \langle\alpha|\frac{\overleftarrow{\partial}}{\partial y_m}, \tag{4.40}
\end{align}
for $\langle\alpha|\in V'_{(\lambda)}$. Note that the elements $P^n$ act co-freely in $V'_{(\lambda)}$, i.e. any vector in $V'_{(\lambda)l}$ has a preimage under the action of $P^n$ for every $n$.

$^g$ The graded dual vector space to the graded space $V=\oplus_i V_i$ with finite dimensional homogeneous components $V_i$ is defined as $V^*=\oplus_i V_i^*$, where each $V_i^*$ is dual to the corresponding $V_i$.

4.4. Structure of o(M + 2) generalized Verma modules

In this section, we describe the structure of $o(M+2)$ generalized Verma modules. Singular modules in $o(M+2)$ generalized Verma modules were completely investigated in [30, 31, 28]. To find subsingular modules, we use general results from [17, 18]. This analysis is sketched in Appendix A.

4.4.1. M = 2q + 1

It turns out that for odd $M=2q+1$, $M\ge3$, no $o(M+2)$ generalized Verma module $V_{(\lambda)}$ has subsingular modules.$^h$ This means that the maximal submodule $P_{(\lambda)}\subset V_{(\lambda)}$, such that the quotient $O_{(\lambda)}=V_{(\lambda)}/P_{(\lambda)}$ is irreducible, is induced from singular modules. For generic $(\lambda)$, $V_{(\lambda)}$ is irreducible. There are two series of reducible generalized Verma modules. Let $(\lambda)_0$ be an arbitrary dominant integral weight, i.e. $\lambda_0\ge\lambda_1\ge\cdots\ge\lambda_q$ and $2\lambda_0\equiv\cdots\equiv2\lambda_q \bmod 2$. The first series consists of the modules with the following highest weights:
$$\begin{aligned}
(\lambda)_0 &= (\lambda_0,\lambda_1,\ldots,\lambda_q),\\
(\lambda)_1 &= (\lambda_1-1,\lambda_0+1,\lambda_2,\ldots,\lambda_q),\\
&\;\;\vdots\\
(\lambda)_N &= (\lambda_N-N,\lambda_0+1,\ldots,\lambda_{N-1}+1,\lambda_{N+1},\ldots,\lambda_q),\qquad N=0,\ldots,q,\\
&\;\;\vdots\\
(\lambda)_q &= (\lambda_q-q,\lambda_0+1,\ldots,\lambda_{q-1}+1),\\
(\lambda)_{q+1} &= (-\lambda_q-q-1,\lambda_0+1,\ldots,\lambda_{q-1}+1),\\
&\;\;\vdots\\
(\lambda)_{q+K} &= (-\lambda_{q+1-K}-q-K,\lambda_0+1,\ldots,\lambda_{q-K}+1,\lambda_{q-K+2},\ldots,\lambda_q),\qquad K=1,\ldots,q,\\
&\;\;\vdots\\
(\lambda)_{2q-1} &= (-\lambda_2-2q+1,\lambda_0+1,\lambda_1+1,\lambda_3,\ldots,\lambda_q),\\
(\lambda)_{2q} &= (-\lambda_1-2q,\lambda_0+1,\lambda_2,\ldots,\lambda_q).
\end{aligned}\tag{4.41}$$

$^h$ The fact that the homogeneous space $SO(M+2)/ISO(M)\times SO(2)$ does not contain two cells of equal dimension forbids the appearance of subsingular modules [17].
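The series (4.41) is easy to tabulate for a concrete dominant weight. The following Python sketch transcribes the two general formulas of (4.41) and also includes $(\lambda)_{2q+1}=(-\lambda_0-2q-1,\lambda_1,\ldots,\lambda_q)$, which the text quotes after (4.45) and which coincides with the $K=q+1$ instance of the second formula. The function name and the tuple representation of weights are illustrative assumptions, not notation of the paper.

```python
from fractions import Fraction


def odd_series(lam):
    """Weights (lambda)_0, ..., (lambda)_{2q+1} for M = 2q + 1.

    lam = (lambda_0, ..., lambda_q) is assumed to be a dominant integral weight.
    """
    lam = [Fraction(x) for x in lam]
    q = len(lam) - 1
    series = []
    # (lambda)_N = (l_N - N, l_0 + 1, ..., l_{N-1} + 1, l_{N+1}, ..., l_q), N = 0..q
    for N in range(q + 1):
        series.append([lam[N] - N] + [x + 1 for x in lam[:N]] + lam[N + 1:])
    # (lambda)_{q+K} = (-l_{q+1-K} - q - K, l_0 + 1, ..., l_{q-K} + 1, l_{q-K+2}, ..., l_q)
    # K = 1..q is (4.41); K = q + 1 reproduces (lambda)_{2q+1}.
    for K in range(1, q + 2):
        j = q + 1 - K
        series.append([-lam[j] - q - K] + [x + 1 for x in lam[:j]] + lam[j + 1:])
    return [tuple(w) for w in series]


# Example: M = 5 (q = 2), starting from the trivial weight.
for N, w in enumerate(odd_series((0, 0, 0))):
    print(N, tuple(str(x) for x in w))
```

Half-integer weights can be passed as `Fraction(1, 2)` so that no floating-point rounding enters the comparison of weights.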
The generalized Verma modules with highest weights from (4.41) have the structure described by the following short exact sequences
\begin{align}
&0\to J_{(\lambda)_1}\to V_{(\lambda)_0}\to J_{(\lambda)_0}\to0; \tag{4.42}\\
&\qquad\vdots\notag\\
&0\to J_{(\lambda)_{N+1}}\to V_{(\lambda)_N}\to J_{(\lambda)_N}\to0,\qquad N=0,\ldots,2q; \tag{4.43}\\
&\qquad\vdots\notag\\
&0\to J_{(\lambda)_{2q+1}}\to V_{(\lambda)_{2q}}\to J_{(\lambda)_{2q}}\to0; \tag{4.44}\\
&0\to V_{(\lambda)_{2q+1}}\to J_{(\lambda)_{2q+1}}\to0, \tag{4.45}
\end{align}
where $(\lambda)_{2q+1}=(-\lambda_0-2q-1,\lambda_1,\ldots,\lambda_q)$ and all $J_{(\lambda)}$ are irreducible. Equation (4.45) means that $V_{(\lambda)_{2q+1}}=J_{(\lambda)_{2q+1}}$ is irreducible. Equation (4.44) means that $J_{(\lambda)_{2q+1}}$ is the maximal submodule of $V_{(\lambda)_{2q}}$ and the quotient $J_{(\lambda)_{2q}}=V_{(\lambda)_{2q}}/V_{(\lambda)_{2q+1}}$ is irreducible. The maximal submodule of $V_{(\lambda)_{2q-1}}$ is $J_{(\lambda)_{2q}}$ and the quotient $J_{(\lambda)_{2q-1}}=V_{(\lambda)_{2q-1}}/J_{(\lambda)_{2q}}$ is irreducible, and so on.

The second series consists of reducible generalized Verma modules with nonintegral highest weights. Let $\mu_1\ge\cdots\ge\mu_q$ and $2\mu_1\equiv\cdots\equiv2\mu_q \bmod 2$. Consider the highest weight $(\mu)=(\mu_0,\mu_1,\ldots,\mu_q)$,
$$\begin{aligned}
\mu_0 &= -q+\tfrac12+\mathbb N_0 &&\text{if } 2\mu_1\equiv2\mu_q\equiv0 \bmod 2,\\
\mu_0 &= -q+\mathbb N_0 &&\text{if } 2\mu_1\equiv2\mu_q\equiv1 \bmod 2.
\end{aligned}\tag{4.46}$$
We have
$$0\to V_{(\mu')}\to V_{(\mu)}\to J_{(\mu)}\to0, \tag{4.47}$$
where
$$(\mu') = (-\mu_0-2q-1,\mu_1,\ldots,\mu_q). \tag{4.48}$$
The modules $J_{(\mu)}=V_{(\mu)}/V_{(\mu')}$ and $V_{(\mu')}$ are irreducible. The described two series give the full list of reducible $o(M+2)$ generalized Verma modules for odd $M$.

4.4.2. M = 2q

The structure of $o(M+2)$ generalized Verma modules $V_{(\lambda)}$ for even $M$ is more complicated because in the even dimensional case some $V_{(\lambda)}$ have subsingular modules (no subsubsingular modules, however$^i$). Again, there are two series of reducible generalized Verma modules.

$^i$ The fact that the homogeneous space $SO(M+2)/ISO(M)\otimes SO(2)$ does not contain three cells of equal dimension forbids the appearance of subsubsingular modules [17].
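Let $(\lambda)_{-q}$ be an arbitrary dominant integral weight, i.e. $\lambda_0\ge\lambda_1\ge\cdots\ge|\lambda_q|$ and $2\lambda_0\equiv\cdots\equiv2\lambda_q \bmod 2$. Consider the set of highest weights
$$\begin{aligned}
(\lambda)_{-q} &= (\lambda_0,\lambda_1,\ldots,\lambda_q),\\
(\lambda)_{-q+1} &= (\lambda_1-1,\lambda_0+1,\lambda_2,\ldots,\lambda_q),\\
&\;\;\vdots\\
(\lambda)_{-q+N} &= (\lambda_N-N,\lambda_0+1,\ldots,\lambda_{N-1}+1,\lambda_{N+1},\ldots,\lambda_q),\qquad N=0,\ldots,q-1,\\
&\;\;\vdots\\
(\lambda)_{-1} &= (\lambda_{q-1}-q+1,\lambda_0+1,\ldots,\lambda_{q-2}+1,\lambda_q),\\
(\lambda)_0 &= (\lambda_q-q,\lambda_0+1,\ldots,\lambda_{q-1}+1),\\
(\lambda)_{0'} &= (-\lambda_q-q,\lambda_0+1,\ldots,\lambda_{q-2}+1,-\lambda_{q-1}-1),\\
(\lambda)_1 &= (-\lambda_{q-1}-q-1,\lambda_0+1,\ldots,\lambda_{q-2}+1,-\lambda_q),\\
&\;\;\vdots\\
(\lambda)_K &= (-\lambda_{q-K}-q-K,\lambda_0+1,\ldots,\lambda_{q-K-1}+1,\lambda_{q-K+1},\ldots,\lambda_{q-1},-\lambda_q),\qquad K=1,\ldots,q-1,\\
&\;\;\vdots\\
(\lambda)_{q-2} &= (-\lambda_2-2q+2,\lambda_0+1,\lambda_1+1,\lambda_3,\ldots,\lambda_{q-1},-\lambda_q),\\
(\lambda)_{q-1} &= (-\lambda_1-2q+1,\lambda_0+1,\lambda_2,\ldots,\lambda_{q-1},-\lambda_q).
\end{aligned}\tag{4.49}$$
The structure of the generalized Verma modules with the highest weights (4.49) is described by the following short exact sequences
\begin{align}
&0\to O_{(\lambda)_{-q+1}}\to V_{(\lambda)_{-q}}\to J_{(\lambda)_{-q}}\to0; \tag{4.50}\\
&0\to V'_{(\lambda)_q}\to O_{(\lambda)_{-q+1}}\to J_{(\lambda)_{-q+1}}\to0; \tag{4.51}\\
&0\to O_{(\lambda)_{-q+2}}\to V_{(\lambda)_{-q+1}}\to J_{(\lambda)_{-q+1}}\to0; \tag{4.52}\\
&0\to V'_{(\lambda)_{q-1}}\to O_{(\lambda)_{-q+2}}\to J_{(\lambda)_{-q+2}}\to0; \tag{4.53}\\
&\qquad\vdots\notag\\
&0\to O_{(\lambda)_{N+1}}\to V_{(\lambda)_N}\to J_{(\lambda)_N}\to0,\qquad N=-q,-q+1,\ldots,-2, \tag{4.54}\\
&0\to V'_{(\lambda)_{-N}}\to O_{(\lambda)_{N+1}}\to J_{(\lambda)_{N+1}}\to0; \tag{4.55}\\
&\qquad\vdots\notag\\
&0\to O_{(\lambda)_{-1}}\to V_{(\lambda)_{-2}}\to J_{(\lambda)_{-2}}\to0; \tag{4.56}\\
&0\to V'_{(\lambda)_2}\to O_{(\lambda)_{-1}}\to J_{(\lambda)_{-1}}\to0; \tag{4.57}\\
&0\to O_{(\lambda)_0}\to V_{(\lambda)_{-1}}\to J_{(\lambda)_{-1}}\to0, \tag{4.58}\\
&0\to V'_{(\lambda)_1}\to O_{(\lambda)_0}\to J_{(\lambda)_0}\oplus J_{(\lambda)_{0'}}\to0; \tag{4.59}\\
&0\to J_{(\lambda)_1}\to V_{(\lambda)_0}\to J_{(\lambda)_0}\to0; \tag{4.60}\\
&0\to J_{(\lambda)_1}\to V_{(\lambda)_{0'}}\to J_{(\lambda)_{0'}}\to0; \tag{4.61}\\
&0\to J_{(\lambda)_2}\to V_{(\lambda)_1}\to J_{(\lambda)_1}\to0; \tag{4.62}\\
&\qquad\vdots\notag\\
&0\to J_{(\lambda)_{N+1}}\to V_{(\lambda)_N}\to J_{(\lambda)_N}\to0,\qquad N=1,\ldots,q-1; \tag{4.63}\\
&\qquad\vdots\notag\\
&0\to J_{(\lambda)_q}\to V_{(\lambda)_{q-1}}\to J_{(\lambda)_{q-1}}\to0; \tag{4.64}\\
&0\to V_{(\lambda)_q}\to J_{(\lambda)_q}\to0. \tag{4.65}
\end{align}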
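The weights of the series (4.49) that label the modules in the sequences above can be generated mechanically. The Python sketch below transcribes (4.49) together with $(\lambda)_q=(-\lambda_0-2q,\lambda_1,\ldots,\lambda_{q-1},-\lambda_q)$, which is the $K=q$ instance of the general formula for $(\lambda)_K$. The function name, the dictionary layout and the string key `"0'"` for the second weight at label zero are illustrative assumptions.

```python
from fractions import Fraction


def even_series(lam):
    """Weights of the series (4.49) for M = 2q, keyed by -q, ..., -1, 0, "0'", 1, ..., q.

    lam = (lambda_0, ..., lambda_q) is assumed to be a dominant integral weight, q >= 1.
    """
    lam = [Fraction(x) for x in lam]
    q = len(lam) - 1
    out = {}
    # (lambda)_{-q+N} = (l_N - N, l_0 + 1, ..., l_{N-1} + 1, l_{N+1}, ..., l_q), N = 0..q-1
    for N in range(q):
        out[-q + N] = tuple([lam[N] - N] + [x + 1 for x in lam[:N]] + lam[N + 1:])
    # The two weights at label zero.
    out[0] = tuple([lam[q] - q] + [x + 1 for x in lam[:q]])
    out["0'"] = tuple([-lam[q] - q] + [x + 1 for x in lam[:q - 1]] + [-lam[q - 1] - 1])
    # (lambda)_K = (-l_{q-K} - q - K, l_0 + 1, ..., l_{q-K-1} + 1, l_{q-K+1}, ..., l_{q-1}, -l_q)
    # K = 1..q-1 is (4.49); K = q gives (lambda)_q.
    for K in range(1, q + 1):
        j = q - K
        out[K] = tuple([-lam[j] - q - K] + [x + 1 for x in lam[:j]] + lam[j + 1:q] + [-lam[q]])
    return out


# Example: M = 4 (q = 2), starting from the trivial weight (0, 0, 0).
for label, w in even_series((0, 0, 0)).items():
    print(label, tuple(str(x) for x in w))
```

For $M=4$ and the trivial starting weight this reproduces the weights $(-1,1,0)$, $(-2,1,\pm1)$ and $(-3,1,0)$ that appear in the electrodynamics example of Sec. 4.6.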
Here $(\lambda)_q=(-\lambda_0-2q,\lambda_1,\ldots,\lambda_{q-1},-\lambda_q)$, and all $J_{(\lambda)}$ are irreducible. Analogously to the odd dimensional case, (4.65) means that $V_{(\lambda)_q}=J_{(\lambda)_q}$ is irreducible. From (4.64), it follows that $J_{(\lambda)_q}$ is the maximal submodule of $V_{(\lambda)_{q-1}}$ and the quotient $J_{(\lambda)_{q-1}}$ is irreducible, which in its turn is the maximal submodule of $V_{(\lambda)_{q-2}}$, and so on. Continuing the same way, one finally arrives at $J_{(\lambda)_1}=V_{(\lambda)_1}/J_{(\lambda)_2}$ (4.62). The structure of the modules $V_{(\lambda)_1},\ldots,V_{(\lambda)_{q-1}}$ is analogous to that of the odd dimensional case. The modules $V_{(\lambda)_0}$ and $V_{(\lambda)_{0'}}$ have the common maximal submodule $J_{(\lambda)_1}$ (see (4.60) and (4.61)) and the quotients $J_{(\lambda)_0}=V_{(\lambda)_0}/J_{(\lambda)_1}$ and $J_{(\lambda)_{0'}}=V_{(\lambda)_{0'}}/J_{(\lambda)_1}$ are irreducible. The module $V_{(\lambda)_{-1}}$ has the most complicated structure of submodules. Equation (4.59) describes the structure of the maximal submodule $O_{(\lambda)_0}$ of $V_{(\lambda)_{-1}}$. The appearance of the contragredient module $V'_{(\lambda)_1}$ in (4.59) means that the maximal submodule of $V_{(\lambda)_{-1}}$ cannot be generated from singular modules, because the module contragredient to a generalized Verma module is not (unless it is irreducible) a highest-weight module, and therefore $V_{(\lambda)_{-1}}$ contains a subsingular module. Analogously, the modules $V_{(\lambda)_{-2}},\ldots,V_{(\lambda)_{-q+1}}$ contain singular and subsingular modules as described by (4.56), (4.57) and (4.52), (4.53). Finally, the module $V_{(\lambda)_{-q}}$ contains the submodule $V'_{(\lambda)_q}$, but in this case subsingular modules do not appear because $V'_{(\lambda)_q}$ is isomorphic to $V_{(\lambda)_q}=J_{(\lambda)_q}$, and therefore the maximal submodule of $V_{(\lambda)_{-q}}$ is generated from singular modules.

Let $\mu_1\ge\cdots\ge\mu_{q-1}\ge|\mu_q|$ and $2\mu_1\equiv\cdots\equiv2\mu_q \bmod 2$. The second series of reducible generalized Verma $o(M+2)$ modules with even $M$ contains the modules with the singular highest weights $(\mu)=(\mu_0,\mu_1,\ldots,\mu_q)$ such that
$$\mu_0=\mu_N-N\ \ \text{for some } N=1,\ldots,q,\qquad \mu_0\neq-q,\qquad \mu_0+\mu_q+q\in\mathbb N_0,\qquad \mu_0-\mu_q+q\in\mathbb N_0. \tag{4.66}$$
The structure of $V_{(\mu)}$ is described by the short exact sequence
$$0\to V_{(\mu')}\to V_{(\mu)}\to J_{(\mu)}\to0, \tag{4.67}$$
where $(\mu')=(-\mu_0-2q,\mu_1,\ldots,\mu_{q-1},-\mu_q)$ and $J_{(\mu)}=V_{(\mu)}/V_{(\mu')}$ is irreducible.

4.5. Cohomology of irreducible o(M + 2)-modules

Any irreducible $o(M+2)$-module $J_{(\lambda)}$ with the highest weight $(\lambda)$ integrable with respect to the parabolic subalgebra $iso(M)\oplus o(2)$ is a quotient of an appropriate generalized Verma $o(M+2)$-module $V_{(\lambda)}$. (Recall that $(\lambda)$ is required to satisfy (4.4)–(4.6).) In this section, we show that once the structure of all generalized Verma modules is known, one can calculate $H^p(t(M),J_{(\lambda)})$ (i.e. the cohomology of $t(M)$ with coefficients in $J_{(\lambda)}$) for any $p$ and irreducible $J_{(\lambda)}$. Recall that $t(M)$ is the subalgebra of $o(M+2)$ generated by the momenta $P^n$. Let us start with the following Lemma.

Lemma 4.1. Let $V_{(\lambda)}$ be the generalized Verma $o(M+2)$-module induced from $N_{(\lambda)}$. Then
\begin{align}
H^0(t(M),V'_{(\lambda)}) &= N_{(\lambda)}, \tag{4.68}\\
H^p(t(M),V'_{(\lambda)}) &= 0\qquad\text{for } p=1,\ldots. \tag{4.69}
\end{align}

Proof. From (4.40), it follows that $\sigma_-=\xi^n\frac{\partial}{\partial y^n}$ (see (3.19)) for any $V'_{(\lambda)}$. Equations (4.68) and (4.69) follow from the standard Poincaré Lemma.

The following two Theorems describe the $\sigma_-$ cohomology $H^p(t(M),J_{(\lambda)})$ with coefficients in $J_{(\lambda)}$. Recall that any $J_{(\lambda)}$ is a quotient of the generalized Verma module $V_{(\lambda)}$ induced from $N_{(\lambda)}$ as described in Sec. 4.2.

Theorem 4.2. Let M be odd.
1. If $V_{(\lambda)}$ is irreducible, then
\begin{align}
H^0(t(M),J_{(\lambda)}) &= N_{(\lambda)}, \tag{4.70}\\
H^p(t(M),J_{(\lambda)}) &= 0,\qquad p=1,2,\ldots. \tag{4.71}
\end{align}
2. If $V_{(\lambda)}$ is reducible and $(\lambda)=(\lambda)_N$ ($N=0,\ldots,2q$) belongs to the series (4.41), then
\begin{align}
H^p(t(M),J_{(\lambda)_N}) &= N_{(\lambda)_{p+N}},\qquad p=0,\ldots,2q+1-N, \tag{4.72}\\
H^p(t(M),J_{(\lambda)_N}) &= 0,\qquad p=2q+2-N,\ldots. \tag{4.73}
\end{align}
3. If $V_{(\lambda)}$ is reducible and $(\lambda)=(\mu)$ belongs to the series (4.46), then
\begin{align}
H^0(t(M),J_{(\mu)}) &= N_{(\mu)}, \tag{4.74}\\
H^1(t(M),J_{(\mu)}) &= N_{(\mu')}, \tag{4.75}\\
H^p(t(M),J_{(\mu)}) &= 0,\qquad p=2,\ldots. \tag{4.76}
\end{align}
Proof. Item 1 follows from Lemma 4.1 and the observation that $V_{(\lambda)}$ is isomorphic to $V'_{(\lambda)}$ whenever $V_{(\lambda)}$ is irreducible. Items 2 and 3 follow from Lemma 4.1 and the long cohomological sequences corresponding to the short exact sequences contragredient to (4.42)–(4.45) and (4.47).

Theorem 4.3. Let M be even.
1. If $V_{(\lambda)}$ is irreducible, then
\begin{align}
H^0(t(M),J_{(\lambda)}) &= N_{(\lambda)}, \tag{4.77}\\
H^p(t(M),J_{(\lambda)}) &= 0,\qquad p=1,2,\ldots. \tag{4.78}
\end{align}
2. If $V_{(\lambda)}$ is reducible and $(\lambda)=(\lambda)_N$ ($N=-q,-q+1,\ldots,-1,0,0',\ldots,q$) belongs to the series (4.49), then
\begin{align}
H^0(t(M),J_{(\lambda)_N}) &= N_{(\lambda)_N},\qquad N=-q,-q+1,\ldots,-1,0,0',1,\ldots,q, \tag{4.79}\\
H^p(t(M),J_{(\lambda)_N}) &= \tilde N_{(\lambda)_{p+N}}\oplus\tilde N_{(\lambda)_{p-N}},\qquad p=1,\ldots,\quad N=-q+1,-q+2,\ldots,-1, \tag{4.80}\\
H^p(t(M),J_{(\lambda)_N}) &= \tilde N_{(\lambda)_{p+N}},\qquad p=1,\ldots,\quad N=-q,0,0',1,2,\ldots,q, \tag{4.81}
\end{align}
where
$$\begin{aligned}
\tilde N_{(\lambda)_N} &= N_{(\lambda)_N} &&\text{for } N=-q,-q+1,\ldots,q \text{ and } N\neq0,\\
\tilde N_{(\lambda)_0} &= N_{(\lambda)_0}\oplus N_{(\lambda)_{0'}},\\
\tilde N_{(\lambda)_N} &= 0 &&\text{for } N=q+1,\ldots,
\end{aligned}\tag{4.82}$$
and $p+0=p+0'=p$.
3. If $V_{(\lambda)}$ is reducible and $(\lambda)=(\mu)$ belongs to the series (4.67), then
\begin{align}
H^0(t(M),J_{(\mu)}) &= N_{(\mu)}, \tag{4.83}\\
H^1(t(M),J_{(\mu)}) &= N_{(\mu')}, \tag{4.84}\\
H^p(t(M),J_{(\mu)}) &= 0,\qquad p=2,\ldots. \tag{4.85}
\end{align}
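Item 2 of Theorem 4.3 is a bookkeeping rule on the labels of the series (4.49), which the following Python sketch makes explicit. It returns the labels $k$ of the modules $N_{(\lambda)_k}$ that make up $H^p(t(M),J_{(\lambda)_N})$ according to (4.79)–(4.82). The function names and the string `"0'"` used for the second label at zero are illustrative assumptions, and no weights are computed here, only labels.

```python
def n_tilde(q, k):
    """Labels of the modules N_{(lambda)_k} contained in the module of (4.82) with index k."""
    if k == 0:
        return [0, "0'"]      # direct sum of the two modules at label zero
    if -q <= k <= q:
        return [k]
    return []                 # vanishes for k >= q + 1


def cohomology_labels(q, N, p):
    """Labels k such that N_{(lambda)_k} appears in H^p(t(M), J_{(lambda)_N}), M = 2q."""
    n = 0 if N == "0'" else N  # convention p + 0 = p + 0' = p
    if p == 0:
        return [N]                                     # (4.79)
    if isinstance(N, int) and -q + 1 <= N <= -1:
        return n_tilde(q, p + n) + n_tilde(q, p - n)   # (4.80)
    return n_tilde(q, p + n)                           # (4.81)


# Example: M = 4 (q = 2); the first cohomology of J_{(lambda)_{-1}}.
print(cohomology_labels(2, -1, 1))
```

For $q=2$, $N=-1$, $p=1$ the rule gives the labels $0$, $0'$ and $2$, in agreement with the sum of three modules quoted for this case in the summary of first cohomologies later in this section.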
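Proof. Item 1 is analogous to that of Theorem 4.2. Let us prove item 2. For the module $J_{(\lambda)_{-q}}$ there exists the BGG resolution [28]
$$0\to J_{(\lambda)_{-q}}\to V'_{(\lambda)_{-q}}\to V'_{(\lambda)_{-q+1}}\to\cdots\to V'_{(\lambda)_{-1}}\to V'_{(\lambda)_0}\oplus V'_{(\lambda)_{0'}}\to V'_{(\lambda)_1}\to\cdots\to V'_{(\lambda)_q}\to0 \tag{4.86}$$
and for the modules $J_{(\lambda)_N}$ for $N=0,0',1,\ldots,q$, there exist the resolutions
$$0\to J_{(\lambda)_N}\to V'_{(\lambda)_N}\to V'_{(\lambda)_{N+1}}\to\cdots\to V'_{(\lambda)_q}\to0,\qquad N=0,0',1,2,\ldots,q. \tag{4.87}$$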
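The standard spectral sequence technique together with the definition of $H^0(t(M),\bullet)$ as invariants of $t(M)$ allows us to calculate the cohomology of the irreducible modules for $N=-q,0,0',1,2,\ldots,q$. Using these we have
\begin{align}
H^0(t(M),J_{(\lambda)_N}) &= N_{(\lambda)_N}\qquad\text{for } N=-q,-q+1,\ldots,-1,0,0',1,2,\ldots,q, \tag{4.88}\\
H^p(t(M),J_{(\lambda)_N}) &= \tilde N_{(\lambda)_{N+p}}\qquad\text{for } p=1,\ldots,\quad N=-q,0,0',1,2,\ldots,q. \tag{4.89}
\end{align}
This proves (4.79) and (4.81). In order to prove (4.80), we consider the short exact sequences contragredient to (4.50)–(4.59)
\begin{align}
&0\to\tilde J_{(\lambda)_N}\to V'_{(\lambda)_N}\to O'_{(\lambda)_{N+1}}\to0, \tag{4.90}\\
&0\to\tilde J_{(\lambda)_{N+1}}\to O'_{(\lambda)_{N+1}}\to V_{(\lambda)_{-N}}\to0, \tag{4.91}
\end{align}
where $N=-q,-q+1,\ldots,-1$ and $\tilde J_{(\lambda)_N}=J_{(\lambda)_N}$ for $N\neq0$ and $\tilde J_{(\lambda)_0}=J_{(\lambda)_0}\oplus J_{(\lambda)_{0'}}$. The long cohomological exact sequence corresponding to (4.90) gives
$$H^p(t(M),O'_{(\lambda)_{N+1}}) = H^{p+1}(t(M),\tilde J_{(\lambda)_N}). \tag{4.92}$$
Then substituting this into the long cohomological exact sequence corresponding to (4.91)
$$\begin{aligned}
\cdots&\xrightarrow{\,g_N^{p-1}\,}H^p(t(M),\tilde J_{(\lambda)_{N+1}})\to H^p(t(M),O'_{(\lambda)_{N+1}})\xrightarrow{\,f_N^{p}\,}H^p(t(M),V_{(\lambda)_{-N}})\\
&\xrightarrow{\,g_N^{p}\,}H^{p+1}(t(M),\tilde J_{(\lambda)_{N+1}})\to H^{p+1}(t(M),O'_{(\lambda)_{N+1}})\xrightarrow{\,f_N^{p+1}\,}H^{p+1}(t(M),V_{(\lambda)_{-N}})\xrightarrow{\,g_N^{p+1}\,}\cdots
\end{aligned}\tag{4.93}$$
we obtain the long exact sequence
$$\begin{aligned}
\cdots&\xrightarrow{\,g_N^{p-1}\,}H^p(t(M),\tilde J_{(\lambda)_{N+1}})\to H^{p+1}(t(M),\tilde J_{(\lambda)_N})\xrightarrow{\,f_N^{p}\,}H^p(t(M),V_{(\lambda)_{-N}})\\
&\xrightarrow{\,g_N^{p}\,}H^{p+1}(t(M),\tilde J_{(\lambda)_{N+1}})\to H^{p+2}(t(M),\tilde J_{(\lambda)_N})\xrightarrow{\,f_N^{p+1}\,}H^{p+1}(t(M),V_{(\lambda)_{-N}})\xrightarrow{\,g_N^{p+1}\,}\cdots.
\end{aligned}\tag{4.94}$$
Using (4.89), (4.88) and the short exact sequences (4.60)–(4.65), we calculate the cohomology of the generalized Verma modules $V_{(\lambda)_N}$ for $N=0,0',1,2,\ldots,q$,
\begin{align}
H^0(t(M),V_{(\lambda)_N}) &= N_{(\lambda)_N}\oplus\tilde N_{(\lambda)_{N+1}}\qquad\text{for } N=0,0',1,2,\ldots,q, \tag{4.95}\\
H^p(t(M),V_{(\lambda)_N}) &= \tilde N_{(\lambda)_{N+p}}\oplus\tilde N_{(\lambda)_{N+p+1}}\qquad\text{for } p=1,\ldots,\quad N=0,0',1,2,\ldots,q. \tag{4.96}
\end{align}
Substituting this into (4.94), we have
$$\begin{aligned}
\cdots&\xrightarrow{\,g_N^{p-1}\,}H^p(t(M),\tilde J_{(\lambda)_{N+1}})\to H^{p+1}(t(M),\tilde J_{(\lambda)_N})\xrightarrow{\,f_N^{p}\,}N_{(\lambda)_{-N+p}}\oplus\tilde N_{(\lambda)_{-N+p+1}}\\
&\xrightarrow{\,g_N^{p}\,}H^{p+1}(t(M),\tilde J_{(\lambda)_{N+1}})\to H^{p+2}(t(M),\tilde J_{(\lambda)_N})\xrightarrow{\,f_N^{p+1}\,}\tilde N_{(\lambda)_{-N+p+1}}\oplus\tilde N_{(\lambda)_{-N+p+2}}\xrightarrow{\,g_N^{p+1}\,}\cdots
\end{aligned}\tag{4.97}$$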
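whence we can obtain the following recurrent relation between cohomology
\begin{align}
H^1(t(M),\tilde J_{(\lambda)_N}) &= H^0(t(M),\tilde J_{(\lambda)_{N+1}})\oplus\operatorname{Im} f_N^0\qquad\text{for } N=-q,-q+1,\ldots,-1, \tag{4.98}\\
H^p(t(M),\tilde J_{(\lambda)_N}) &= H^{p-1}(t(M),\tilde J_{(\lambda)_{N+1}})/\operatorname{Im} g_N^{p-2}\oplus\operatorname{Im} f_N^{p-1}\qquad\text{for } N=-q,-q+1,\ldots,-1 \text{ and } p\ge2. \tag{4.99}
\end{align}
These relations interpolate between $H^p(t(M),J_{(\lambda)_{-q}})$ and $H^p(t(M),\tilde J_{(\lambda)_0})$ calculated above. This allows us to calculate
\begin{align}
\operatorname{Im} f_N^p &= \tilde N_{(\lambda)_{-N+p+1}}, \tag{4.100}\\
\operatorname{Im} g_N^p &= \tilde N_{(\lambda)_{-N+p}}. \tag{4.101}
\end{align}
Then we have
\begin{align}
H^1(t(M),\tilde J_{(\lambda)_N}) &= H^0(t(M),\tilde J_{(\lambda)_{N+1}})\oplus\tilde N_{(\lambda)_{-N+1}}\qquad\text{for } N=-q,-q+1,\ldots,-1, \tag{4.102}\\
H^p(t(M),\tilde J_{(\lambda)_N}) &= H^{p-1}(t(M),\tilde J_{(\lambda)_{N+1}})/\tilde N_{(\lambda)_{-N+p-2}}\oplus\tilde N_{(\lambda)_{-N+p}}\qquad\text{for } N=-q,-q+1,\ldots,-1 \text{ and } p\ge2. \tag{4.103}
\end{align}
Finally these recurrent relations give (4.80). Item 3 is analogous to that of Theorem 4.2.

According to Sec. 4.4.2, items 2 and 3 in Theorems 4.2 and 4.3 describe all reducible $V_{(\lambda)}$. Let us summarize the results for $H^0(t(M),J_{(\lambda)})$ and $H^1(t(M),J_{(\lambda)})$, which are most important for this paper:
\begin{align}
H^0(t(M),J_{(\lambda)}) &= N_{(\lambda)}, \tag{4.104}\\
H^1(t(M),J_{(\lambda)}) &= 0\qquad\text{if } J_{(\lambda)}\sim V_{(\lambda)}, \tag{4.105}\\
H^1(t(M),J_{(\lambda)}) &= N_{(\mu')}\qquad\text{if } (\lambda)=(\mu) \text{ from (4.46) or (4.66)}, \tag{4.106}\\
H^1(t(M),J_{(\lambda)_N}) &= N_{(\lambda)_{N+1}} \tag{4.107}
\end{align}
if $M=2q+1$, $N=0,\ldots,2q$ and $(\lambda)_N$ belongs to (4.41), or if $M=2q$, $N=-q,1,\ldots,q-1$ and $(\lambda)_N$ belongs to (4.49). In addition, for $M=2q$
\begin{align}
H^1(t(M),J_{(\lambda)_0}) &= H^1(t(M),J_{(\lambda)_{0'}}) = N_{(\lambda)_1}, \tag{4.108}\\
H^1(t(M),J_{(\lambda)_{-1}}) &= N_{(\lambda)_0}\oplus N_{(\lambda)_{0'}}\oplus N_{(\lambda)_2}, \tag{4.109}\\
H^1(t(M),J_{(\lambda)_N}) &= N_{(\lambda)_{N+1}}\oplus N_{(\lambda)_{-N+1}}\qquad\text{if } (\lambda)_N \text{ with } N=-2,\ldots,-q+1 \text{ belongs to (4.49)}. \tag{4.110}
\end{align}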
Remark 4.4. For any irreducible module J(λ) , H 1 (t(M ), J(λ) ) is equal to the direct sum of those singular and subsingular modules of the generalized Verma module V(λ) , that are not descendants of some other singular module in V(λ) . This property is expected because, as one can see from the examples in Secs. 2, 4.7, both H 1 (t(M ), J(λ) ) and singular and subsingular modules determine the structure of differential equations on the dynamical fields.
4.6. Examples of calculating cohomology of reducible o(M + 2)-modules

Using Theorems 4.2 and 4.3 one can easily calculate $H^p(t(M),\mathcal M)$ for any integrable module $\mathcal M$. Let $E_{I_1,I_2}$ be the first extension of the irreducible modules $I_1$, $I_2$ given by the nonsplittable short exact sequence
$$0\to I_1\to E_{I_1,I_2}\to I_2\to0. \tag{4.111}$$
From the long exact sequence for cohomology
$$0\to H^0(t(M),I_1)\to H^0(t(M),E_{I_1,I_2})\to H^0(t(M),I_2)\to H^1(t(M),I_1)\to\cdots, \tag{4.112}$$
where $H^p(t(M),I_1)$ and $H^p(t(M),I_2)$ are given by Theorems 4.2 and 4.3, one obtains $H^p(t(M),E_{I_1,I_2})$. Using Theorem 4.2, it is not hard to see that in the case $M=2q+1$, any extension of an irreducible conformal module is isomorphic to a contragredient generalized Verma module. This means that any odd dimensional conformal system of equations is either primitive or decomposes into independent primitive subsystems. We therefore focus on the even dimensional case. As an example, let us calculate the cohomology of the module $E_{A,F}$, which corresponds to the case of $M=4$ electrodynamics considered in Sec. 2.4. The module $E_{A,F}$ is defined by the short exact sequence
$$0\to I_A\to E_{A,F}\to K_F\to0, \tag{4.113}$$
where $I_A=J_{(\lambda)_{-1}}$ and $K_F=J_{(\lambda)_0}\oplus J_{(\lambda)_{0'}}$ belong to the series (4.49) that starts from the dominant highest weight $(\lambda)_{-2}=(0,0,0)$, $M=2q=4$. From Theorem 4.3, we obtain the long exact cohomology sequence
$$\begin{aligned}
0&\to N_{(\lambda)_{-1}}\to H^0(t(M),E_{A,F})\to N_{(\lambda)_0}\oplus N_{(\lambda)_{0'}}\to N_{(\lambda)_0}\oplus N_{(\lambda)_{0'}}\oplus N_{(\lambda)_2}\\
&\to H^1(t(M),E_{A,F})\to N_{(\lambda)_1}\oplus N_{(\lambda)_1}\to N_{(\lambda)_1}\oplus N_{(\lambda)_3}\to\cdots
\end{aligned}\tag{4.114}$$
whence
$$H^0(t(M),E_{A,F}) = N_{(\lambda)_{-1}},\qquad H^1(t(M),E_{A,F}) = N_{(\lambda)_1}\oplus N_{(\lambda)_2}. \tag{4.115}$$
As a generalization of (4.113), let us consider the module $E_{I_{(\lambda)_{-N}},I_{(\lambda)_{-N+1}}}$ defined by the short exact sequence
$$0\to I_{(\lambda)_{-N}}\to E_{I_{(\lambda)_{-N}},I_{(\lambda)_{-N+1}}}\to\tilde I_{(\lambda)_{-N+1}}\to0, \tag{4.116}$$
where $(\lambda)_{-N}$ and $(\lambda)_{-N+1}$ with $N=1,\ldots,q-1$ belong to (4.49), $\tilde I_{(\lambda)_N}=I_{(\lambda)_N}$ for $N\neq0$ and $\tilde I_{(\lambda)_0}=I_{(\lambda)_0}\oplus I_{(\lambda)_{0'}}$. The cohomology of $E_{I_{(\lambda)_{-N}},I_{(\lambda)_{-N+1}}}$ is calculated from
$$\begin{aligned}
0&\to N_{(\lambda)_{-N}}\to H^0(t(M),E_{I_{(\lambda)_{-N}},I_{(\lambda)_{-N+1}}})\to\tilde N_{(\lambda)_{-N+1}}\to\tilde N_{(\lambda)_{-N+1}}\oplus N_{(\lambda)_{N+1}}\\
&\to H^1(t(M),E_{I_{(\lambda)_{-N}},I_{(\lambda)_{-N+1}}})\to\tilde N_{(\lambda)_{-N+2}}\oplus N_{(\lambda)_N}\to\tilde N_{(\lambda)_{-N+2}}\oplus N_{(\lambda)_{N+2}}\to\cdots,
\end{aligned}\tag{4.117}$$
where $\tilde N_{(\lambda)_N}$ is defined in (4.82). From (4.117), we have that
$$H^0(t(M),E_{I_{(\lambda)_{-N}},I_{(\lambda)_{-N+1}}}) = N_{(\lambda)_{-N}},\qquad H^1(t(M),E_{I_{(\lambda)_{-N}},I_{(\lambda)_{-N+1}}}) = N_{(\lambda)_N}\oplus N_{(\lambda)_{N+1}}. \tag{4.118}$$
Equations corresponding to $E_{I_{(\lambda)_{-N}},I_{(\lambda)_{-N+1}}}$ are considered for $N=1$ in Sec. 4.7.2 and for $N=q-1$ in Sec. 4.7.3. An important general property of the dynamical systems associated with the module $E_{I_{(\lambda)_{-N}},I_{(\lambda)_{-N+1}}}$ in (4.116) is that the Lorentz algebra representations of the dynamical fields and dynamical equations are isomorphic, while the sum of their conformal dimensions is $2q$, which is the canonical dimension of a Lagrangian density. We therefore expect all these dynamical systems to be Lagrangian.

4.7. Conformal equations

Now it is straightforward to write down the conformal equations $R_{\mathcal M}|\phi_0(x)\rangle=0$ corresponding to any conformal module $\mathcal M$. First, one represents $\mathcal M$ as an extension of irreducible conformal modules. Then (as explained in Sec. 4.6), the results of Theorem 4.2 (for odd $M$) and Theorem 4.3 (for even $M$) are used to calculate $H^0(t(M),\mathcal M)$ and $H^1(t(M),\mathcal M)$. Finally, along the lines of the proof of Theorem 3.1, one expresses the auxiliary fields contained in $|\Phi_0(x)\rangle$ (see Remark 3.4) in terms of derivatives of the dynamical field $|\phi_0(x)\rangle$ and reconstructs the nontrivial equations $R_{\mathcal M}|\phi_0(x)\rangle=0$ on the latter. These equations are associated with $H^1(t(M),\mathcal M)$. In practice, it is most useful to use Remark 4.4, which identifies the left-hand sides of the field equations with the singular and subsingular modules of $V_{(\lambda)}$. In those cases where $V_{(\lambda)}$ does not contain modules of the Levi factor $\mathfrak l_\Pi$ equivalent to (but different from) the singular and subsingular modules, the explicit form of the conformal equations corresponding to the irreducible conformal module $J_{(\lambda)}$ can be obtained
by replacing $K^n$ by $\frac{\partial}{\partial x_n}$ in the expressions for a basis of the singular and subsingular modules. The examples given in Sec. 2 and in the rest of this section result from the application of this general scheme to the following modules (here $I$ denotes an irreducible module and $E$ denotes an extension).
1. $I_{((2-M)/2,0,\ldots,0)}$ corresponds to the Klein–Gordon equation (2.16) (primitive).
2. $I_{((1-M)/2,1/2,\ldots,1/2)}$ corresponds to the Dirac equation (2.23) (primitive).
3. $I_{(-p,1,\ldots,1,0,\ldots,0)}$ (with $p$ entries equal to 1) for odd $M$ or for even $M$ and $p=0$ corresponds to the closedness equation (2.29) on a $p$-form or the equivalent conservation equation (2.32) on a $(M-p)$-polyvector (primitive); for even $M$ and $p>0$, $I_{(-p,1,\ldots,1,0,\ldots,0)}$ (with $p$ entries equal to 1) corresponds to the system (2.29), (2.34) on a $p$-form or the equivalent system (2.32), (2.35) on a $(M-p)$-polyvector (primitive).
4. $I_{(p-M,1,\ldots,1,0,\ldots,0)}$ (with $p$ entries equal to 1), $p>0$, corresponds to the conservation equation (2.32) on a $p$-polyvector or the equivalent closedness equation (2.29) on a $(M-p)$-form (primitive).
5. $I_{(M/2,1,\ldots,1,\pm1)}$ for even $M$ corresponds to the (anti)selfduality equation (2.37), (2.38) (primitive).
6. $I_{(-2,1,1)}\oplus I_{(-2,1,-1)}$ corresponds to the field strength form of the Maxwell equations (2.52), (2.53) (non-primitive).
7. $E_{A,F}$ corresponds to the potential form of the Maxwell equations (2.52)–(2.54) in the conformal gauge (2.45) (non-primitive).
8. $E_{A,F,J}$ corresponds to the off-mass-shell version of Maxwell electrodynamics (2.52), (2.54), (2.60), (2.58) in the conformal gauge (2.45) (non-primitive).
9. $E_{A,F,J,G}$ corresponds to the off-mass-shell gauge invariant version of Maxwell electrodynamics (2.52), (2.54), (2.60), (2.58), (2.64) (non-primitive).
10. $I_{((2-M)/2,\lambda,\ldots,\lambda,0,\ldots,0)}$ (with ν entries equal to λ) for odd $M$ or for even $M$ with either $\nu\le q-2$ or $\nu=q$, $\lambda=1$ corresponds to the Klein–Gordon-like equation (4.122) on a tensor field described by the $(\lambda\times\nu)$-rectangular Young tableau (primitive).
11. $I_{((1-M)/2,\lambda+1/2,\ldots,\lambda+1/2,1/2,\ldots,\pm1/2)}$ (with ν entries equal to $\lambda+1/2$) for odd $M$ or for even $M$ with $\nu\le q-1$ corresponds to the Dirac-like equation (4.129) on a spinor-tensor field described by the $(\lambda\times\nu)$-rectangular Young tableau (primitive).
12. $K_{(\lambda)F}=I_{(\lambda)+}\oplus I_{(\lambda)-}$ for even $M$ corresponds to the field strength form of the conformal higher spin equations (4.135), (4.136) (non-primitive).
13. $E_{I_{(\lambda)A},K_{(\lambda)F}}$ for even $M$ corresponds to the gauge fixed potential form of the conformal higher spin equations (4.135), (4.136), (4.144), (4.145) (non-primitive).
14. $E_{I_{(\lambda)A},K_{(\lambda)F},I_{(\lambda)J}}$ for even $M$ corresponds to the gauge fixed off-mass-shell version of the conformal higher spin equations (4.135), (4.144), (4.145), (4.150), (4.151) (non-primitive).
15. $E_{I_{(\lambda)A},K_{(\lambda)F},I_{(\lambda)J},I_{(\lambda)G}}$ for even $M$ corresponds to the gauge invariant off-mass-shell version of the conformal higher spin equations (4.135), (4.144), (4.150), (4.151), (4.154), (4.155) (non-primitive).
16. $I_C$ for even $M$ corresponds to the condition that the generalized Weyl tensor for a spin $\lambda\ge1/2$ symmetric tensor field equals zero (4.156), supplemented with the gauge fixing condition (4.157) (primitive).
17. $E_{I_C,I_W}$ for even $M$ corresponds to the gauge fixed spin $\lambda\ge1/2$ Fradkin–Tseytlin conformal higher spin equation (4.157), (4.159), (4.162) (non-primitive).
18. $E_{I_C,I_W,I_G}$ for even $M$ corresponds to the gauge invariant spin $\lambda\ge1/2$ Fradkin–Tseytlin conformal higher spin equation (4.159), (4.162), (4.164) (non-primitive).

Note that the flat limits of most of the non-flat conformal equations considered in [20, 32–39] belong to the case 10. The system of conformal equations considered in [26] corresponds to the case 8.

4.7.1. Conformal Klein–Gordon and Dirac-like equations for a block

Let $(\lambda)=(-(M-2)/2,\underbrace{\lambda,\ldots,\lambda}_{\nu},0,\ldots,0)$, $\lambda\in\mathbb N$, and let $J_{(\lambda)}$ be the irreducible conformal module with the highest weight $(\lambda)$. It is represented by the short exact sequence (4.47) for odd $M$ and by (4.67) for even $M$. Let us consider the bundle $B_{(\lambda)}=\mathbb R^M\times J_{(\lambda)}$ and its subbundle $b_{(\lambda)}=\mathbb R^M\times N_{(\lambda)}\subset B_{(\lambda)}$. Consider a section
$$|\phi(x)\rangle = C_{n^1(\lambda),n^2(\lambda),\ldots,n^\nu(\lambda)}(x)\,|(\lambda)\rangle^{n^1(\lambda),n^2(\lambda),\ldots,n^\nu(\lambda)} \tag{4.119}$$
of $b_{(\lambda)}$ and a section $|\Phi(x)\rangle$ of $B_{(\lambda)}$ such that
$$|\Phi(x)\rangle\big|_{b_{(\lambda)}} = |\phi(x)\rangle,\qquad |\Phi(x)\rangle = \sum_{l=0}^{\infty}\frac{1}{l!}\,C_{n^1(\lambda),n^2(\lambda),\ldots,n^\nu(\lambda);m(l)}(x)\,\underbrace{y^{(m}\cdots y^{m)}}_{l}\,|(\lambda)\rangle^{n^1(\lambda),n^2(\lambda),\ldots,n^\nu(\lambda)}. \tag{4.120}$$
Here $|(\lambda)\rangle^{n^1(\lambda),n^2(\lambda),\ldots,n^\nu(\lambda)}$ form a basis of $N_{(\lambda)}$. The symmetry properties of $|(\lambda)\rangle^{n^1(\lambda),n^2(\lambda),\ldots,n^\nu(\lambda)}$ imply that the symmetrization over any $\lambda+1$ indices gives zero. The corresponding Young tableau is a rectangle of length λ and height ν and is referred to as a block. Note that the fields that appear in most physical applications belong to this class. As shown in Sec. 3, the covariant constancy equation (3.27) encodes the differential equations on the dynamical variables that take values in $H^0(t(M),J_{(\lambda)})$. The form of these differential equations is determined by $H^1(t(M),J_{(\lambda)})$. These cohomology groups are determined in (4.104) and (4.106). Using the symmetry properties of the block Young tableau, it can be easily seen that $H^1(t(M),J_{(\lambda)})$ corresponds to the singular module $S_{(\lambda')}$ of $V_{(\lambda)}$ described by the block tableau with the conformal
weight $M/2+1$, i.e. it has the weights $(\lambda) = (-(M+2)/2, \underbrace{\lambda,\dots,\lambda}_{\nu}, 0,\dots,0)$. It is easy to see that $|s\rangle\in S_{(\lambda)}$ has the form
\[
|s\rangle = \psi^{n^1(\lambda),\dots,n^\nu(\lambda)}\left( y^m y_m\,\delta^k_{n^\nu} - \frac{4\lambda\nu}{2\lambda-2\nu+M}\, y_{n^\nu}\, y^k \right)|(\lambda)\rangle_{n^1(\lambda),\dots,n^\nu(\lambda-1)k}\,, \tag{4.121}
\]
where $\psi^{n^1(\lambda),\dots,n^\nu(\lambda)}$ is an arbitrary parameter taking values in the $\lambda\times\nu$ traceless block tableau. In fact, $\psi^{n^1(\lambda),\dots,n^\nu(\lambda)}$ can be thought of as an arbitrary element of the dual space of $H^1(t(M), J_{(\lambda)})$. The conformal equation associated with $H^1(t(M), J_{(\lambda)})$ is
\[
\psi^{n^1(\lambda),\dots,n^\nu(\lambda)}\left( \Box\, C_{n^1(\lambda),\dots,n^\nu(\lambda)}(x) - \frac{4\lambda\nu}{2\lambda-2\nu+M}\,\partial_{n^\nu}\partial^m C_{n^1(\lambda),\dots,n^\nu(\lambda-1)m}(x) \right) = 0 . \tag{4.122}
\]
This is the Klein–Gordon type conformal equation for a field with the block symmetry properties and conformal weight $M/2-1$. Note that for even $M$, (4.66) requires either $\nu\le q-2$ or $\nu=q$, $\lambda=1$. This is in accordance with our analysis because, although conformally invariant, the equations (4.122) with $\nu=q-1$, $M=2q$ are non-primitive (see Sec. 4.7.2). One can also see that for even $M$ and $\nu=q$, $\lambda\ge2$ the singular vector (4.121) is zero$^{j}$ and Eq. (4.122) becomes the identity 0 = 0. For the particular cases of $\nu=1$, $\lambda=0,1,2$, Eq. (4.122) reads
\[
\Box\, C(x) = 0, \tag{4.123}
\]
\[
\psi^{n}\left( \Box\, C_n(x) - \frac{4}{M}\,\partial_n\partial^m C_m(x) \right) = 0, \tag{4.124}
\]
\[
\psi^{n_1 n_2}\left( \Box\, C_{n_1 n_2}(x) - \frac{8}{2+M}\,\partial_{n_1}\partial^m C_{n_2 m}(x) \right) = 0. \tag{4.125}
\]
Equation (4.123) is the usual Klein–Gordon equation. Equation (4.124) for $M=4$ corresponds to Maxwell electrodynamics formulated in terms of the potential. Equation (4.124) for $M\neq4$ and Eq. (4.125) correspond to non-unitary field-theoretical models.

$^{j}$ One way to see this is to observe that for the case of $\nu=q$ the tensor contracted with $\psi^{n^1(\lambda),\dots,n^q(\lambda)}$ on the left-hand side of (4.121) has opposite (anti)selfduality properties for the first and last columns of the corresponding rectangular Young tableau, which is only possible when it is zero.

The Dirac-like equations are associated with the bundles $b_{(\lambda)}$ and $B_{(\lambda)}$ with
\[
(\lambda) = \Big(-(M-1)/2,\ \underbrace{\lambda+\tfrac12,\dots,\lambda+\tfrac12}_{\nu},\ \tfrac12,\dots,\pm\tfrac12\Big), \qquad \lambda\in\mathbb{N}
\]
and their sections
\[
|\phi(x)\rangle = C_{n^1(\lambda),n^2(\lambda),\dots,n^\nu(\lambda),\alpha}(x)\,|(\lambda)\rangle^{n^1(\lambda),n^2(\lambda),\dots,n^\nu(\lambda),\alpha} \tag{4.126}
\]
and
\[
|\Phi(x)\rangle = \sum_{l=0}^{\infty}\frac{1}{l!}\,C_{n^1(\lambda),n^2(\lambda),\dots,n^\nu(\lambda),\alpha;m(l)}(x)\,\underbrace{y^{(m}\cdots y^{m)}}_{l}\,|(\lambda)\rangle^{n^1(\lambda),n^2(\lambda),\dots,n^\nu(\lambda),\alpha}, \tag{4.127}
\]
where $|\Phi(x)\rangle|_{b} = |\phi(x)\rangle$. Here $\alpha = 1,\dots,2^{[M/2]}$ is a spinorial index. $C_{n^1(\lambda),\dots,n^\nu(\lambda),\alpha}(x)$ is a γ-transversal block spinor-tensor with definite chirality. The cohomology groups $H^0(t(M), J_{(\lambda)})$ and $H^1(t(M), J_{(\lambda)})$ are given in (4.104) and (4.106), respectively. $H^1(t(M), J_{(\lambda)})$ corresponds to the singular module $S_{(\lambda)}$ in $V_{(\lambda)}$ with the general element
\[
|s\rangle = \psi^{n^1(\lambda),\dots,n^\nu(\lambda),}{}_{\alpha}\left( y^m \gamma_m{}^{\alpha}{}_{\beta}\,\delta^k_{n^\nu} - \frac{2\lambda\nu}{2\lambda-2\nu+M}\,\gamma_{n^\nu}{}^{\alpha}{}_{\beta}\, y^k \right)|(\lambda)\rangle_{n^1(\lambda),\dots,n^\nu(\lambda-1)k,}{}^{\beta} . \tag{4.128}
\]
Here $\psi^{n^1(\lambda),\dots,n^\nu(\lambda),}{}_{\alpha}$ is an arbitrary γ-transversal chiral spinor-tensor parameter taking values in the $(\lambda\times\nu)$-block tableau. The conformal equation encoded by the covariant constancy equation (3.27) is
\[
\psi^{n^1(\lambda),\dots,n^\nu(\lambda),}{}_{\alpha}\left( \partial^m\gamma_m{}^{\alpha}{}_{\beta}\, C_{n^1(\lambda),\dots,n^\nu(\lambda),}{}^{\beta}(x) - \frac{2\lambda\nu}{2\lambda-2\nu+M}\,\gamma_{n^\nu}{}^{\alpha}{}_{\beta}\,\partial^m C_{n^1(\lambda),\dots,n^\nu(\lambda-1)m,}{}^{\beta}(x) \right) = 0. \tag{4.129}
\]
This is the conformally invariant generalization of the Dirac equation to a block spinor-tensor with conformal weight $(M-1)/2$. For the particular cases of $\nu=1$, $\lambda=0,1$ we get
\[
\partial^m\gamma_m{}^{\alpha}{}_{\beta}\, C_{,}{}^{\beta}(x) = 0, \tag{4.130}
\]
\[
\psi^{n,}{}_{\alpha}\left( \partial^m\gamma_m{}^{\alpha}{}_{\beta}\, C_{n,}{}^{\beta}(x) - \frac{2}{M}\,\gamma_n{}^{\alpha}{}_{\beta}\,\partial^m C_{m,}{}^{\beta}(x) \right) = 0. \tag{4.131}
\]
Equation (4.130) is the usual Dirac equation. Note that conditions (4.66) require $\nu\le q-1$ for even $M$. Analogously to the case of Klein–Gordon type equations, one can prove that the singular vector (4.128) is zero for even $M$, $\nu=q$, and the corresponding equation (4.129) becomes the identity 0 = 0.

Analogous conformally invariant generalizations of the Klein–Gordon and Dirac equations exist for tensor fields of other symmetry types. They correspond to other irreducible modules $J_{(\lambda)}$ from the series (4.47) for odd $M$ and (4.67) for even $M$. All these systems, however, are not expected to correspond to unitary field-theoretical models, in accordance with the general fact [40–42] that conformal field equations compatible with unitarity are exhausted by the massless equations for a scalar, a spinor and blocks of the height $[(M-1)/2]$.
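As a quick arithmetic check of the special cases quoted in this subsection, the following sketch evaluates the coefficient 4λν/(2λ − 2ν + M) of (4.122) and the coefficient 2λν/(2λ − 2ν + M) of (4.129) with exact rational arithmetic. The function names `kg_coefficient` and `dirac_coefficient` are illustrative only and are not taken from the paper.

```python
from fractions import Fraction


def kg_coefficient(lam, nu, M):
    """Coefficient of the double-divergence term in (4.122)."""
    return Fraction(4 * lam * nu, 2 * lam - 2 * nu + M)


def dirac_coefficient(lam, nu, M):
    """Coefficient of the gamma-trace term in (4.129)."""
    return Fraction(2 * lam * nu, 2 * lam - 2 * nu + M)


if __name__ == "__main__":
    M = 4  # sample space-time dimension
    for lam in (0, 1, 2):
        # nu = 1 reproduces the coefficients in (4.123)-(4.125) and (4.130)-(4.131)
        print(lam, kg_coefficient(lam, 1, M), dirac_coefficient(lam, 1, M))
```

For ν = 1 the values reduce to 0, 4/M and 8/(2 + M) in the Klein–Gordon case and to 0 and 2/M in the Dirac case, in agreement with the equations displayed above.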
4.7.2. Conformal higher spins in even dimensions

Here we describe a generalization of the equations for $M=4$ massless higher spin fields to a broad class of conformal field equations for tensor fields in $M=2q$ dimensions (the following construction can easily be formulated for spinor-tensor fields as well). Let $(\lambda)_\pm = (-q,\lambda_1,\lambda_2,\dots,\lambda_{q-1},\pm1)$, where $\lambda_i\in\mathbb{N}$ and $\lambda_1\ge\lambda_2\ge\dots\ge\lambda_{q-1}\ge1$. Let $q=\mu_1>\mu_2\ge\mu_3\ge\dots\ge\mu_p$ be the heights of the columns in the Young tableau corresponding to $N_{(\lambda)_\pm}$. (Note that the first column is required to have the maximal height $q$, while the second one is required to be smaller.) Let us denote $K_{(\lambda)_F} = I_{(\lambda)_+}\oplus I_{(\lambda)_-}$. Consider the bundle $B_F = \mathbb{R}^M\times K_{(\lambda)_F}$ and its subbundle $b_F = \mathbb{R}^M\times(N_{(\lambda)_+}\oplus N_{(\lambda)_-})$. Irreducible modules $J_{(\lambda)_+}$ and $J_{(\lambda)_-}$ are defined by the short exact sequences (4.60) and (4.61), respectively. Choose a section of $b_F$
\[
|\phi_F(x)\rangle = F_{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]}(x)\,|(\lambda)_F\rangle^{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]}, \tag{4.132}
\]
where $|(\lambda)_F\rangle^{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]}$ is a basis in $N_{(\lambda)_+}\oplus N_{(\lambda)_-}$, i.e. it contains both selfdual and antiselfdual parts. We treat $|\phi_F(x)\rangle$ as a higher spin field strength. Let $|\Phi_F(x)\rangle$ be a section of $B_F$ such that $|\Phi_F(x)\rangle|_{b_F} = |\phi_F(x)\rangle$,
\[
|\Phi_F(x)\rangle = \sum_{l=0}^{\infty}\frac{1}{l!}\,F_{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p];m(l)}(x)\,\underbrace{y^{(m}\cdots y^{m)}}_{l}\,|(\lambda)_F\rangle^{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]} . \tag{4.133}
\]
As follows from (4.104) and (4.108), the condition
\[
D|\Phi_F(x)\rangle = 0 \tag{4.134}
\]
implies the equations
\[
\psi^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}\,\partial^m({}^{*}F)_{m n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x) = 0, \tag{4.135}
\]
\[
\psi^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}\,\partial^m F_{m n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x) = 0, \tag{4.136}
\]
where an arbitrary element $\psi^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}$ of the irreducible $o(M)$-module associated with the Young tableau with columns of heights $q-1,\mu_2,\dots,\mu_p$ is introduced to avoid complicated projection operators. For the particular case of the block with $\mu_2=\mu_3=\dots=q-1$ these are equations of motion (formulated in terms of field strengths) for the conformal fields that respect unitarity [40–42]. For $q=2$, one recovers the usual equations of motion for massless fields in four dimensions formulated in terms of field strengths. For $q=3$, the conformal massless higher spins of this type were discussed in [43].

The system (4.135), (4.136) admits extensions analogous to that of the system (2.52), (2.53). In particular, one can introduce potentials for the field strength $F_{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]}(x)$ in both gauge invariant and conformal gauge fixed forms. To this end, we consider the nontrivial extension $E_{I_{(\lambda)_A},K_{(\lambda)_F}}$ of the module $K_{(\lambda)_F}$ by
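the module $J_{(\lambda)_A}$, where $(\lambda)_A = (-q+1,\lambda_1,\dots,\lambda_{q-1},0)$. $E_{I_{(\lambda)_A},K_{(\lambda)_F}}$ is defined by the short exact sequence
\[
0 \to I_{(\lambda)_A} \to E_{I_{(\lambda)_A},K_{(\lambda)_F}} \to K_{(\lambda)_F} \to 0 . \tag{4.137}
\]
The module $E_{I_{(\lambda)_A},K_{(\lambda)_F}}$ can be described as follows. Let $|(\lambda)_A\rangle^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}$ be the basis in $N_{(\lambda)_A}$. Impose the following relations
\[
\psi_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}\; y_m\,|(\lambda)_F\rangle^{m n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]} = 0, \tag{4.138}
\]
\[
\psi_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}\; y_m\,({}^{*}|(\lambda)_F\rangle)^{m n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]} = 0, \tag{4.139}
\]
\[
\psi_{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]}\; y^{n^1}\,|(\lambda)_A\rangle^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]} = 0, \tag{4.140}
\]
\[
\psi_{n^1[q-2],n^2[\mu_2-1],\dots,n^{\lambda_{q-1}}[\mu_{\lambda_{q-1}}-1],n^{\lambda_{q-1}+1}[\mu_{\lambda_{q-1}+1}],\dots,n^p[\mu_p]}\; y^n y_n\, y_{n^1}\cdots y_{n^{\lambda_{q-1}}}\,|(\lambda)_A\rangle^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]} = 0, \tag{4.141}
\]
which single out the modules $K_{(\lambda)_F}$ and $I_{(\lambda)_A}$, respectively. The nontrivial extension is defined by the condition
\[
\psi_{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]}\; P^m\,|(\lambda)_F\rangle^{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]} = -\,\psi_{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]}\;\eta^{m n^1}\,|(\lambda)_A\rangle^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]} . \tag{4.142}
\]
The module $E_{I_{(\lambda)_A},K_{(\lambda)_F}}$ is generated by $y^n$ from $|(\lambda)_F\rangle^{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]}$ and $|(\lambda)_A\rangle^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}$. Consider the bundle $B_{A,F} = \mathbb{R}^M\times E_{I_{(\lambda)_A},K_{(\lambda)_F}}$. $b_F$ and $b_A = \mathbb{R}^M\times N_{(\lambda)_A}$ are its subbundles. Consider a section $|\Phi_{A,F}(x)\rangle$ of $B_{A,F}$,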
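\[
\begin{aligned}
|\Phi_{A,F}(x)\rangle ={}& \sum_{l=0}^{\infty}\frac{1}{l!}\,F_{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p];m(l)}(x)\,\underbrace{y^{(m}\cdots y^{m)}}_{l}\,|(\lambda)_F\rangle^{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]} \\
&+ \sum_{l=0}^{\infty}\frac{1}{l!}\,A_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p];m(l)}(x)\,\underbrace{y^{(m}\cdots y^{m)}}_{l}\,|(\lambda)_A\rangle^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]} .
\end{aligned} \tag{4.143}
\]
The cohomology $H^0(t(M), E_{I_{(\lambda)_A},K_{(\lambda)_F}})$, $H^1(t(M), E_{I_{(\lambda)_A},K_{(\lambda)_F}})$ is given in (4.118) for $N=1$. The condition $D|\Phi_{A,F}(x)\rangle = 0$ implies
\[
\psi^{m n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}\left( \partial_m A_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x) - F_{m n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x) \right) = 0, \tag{4.144}
\]
\[
\psi^{n^1[q-2],n^2[\mu_2-1],\dots,n^{\lambda_{q-1}}[\mu_{\lambda_{q-1}}-1],n^{\lambda_{q-1}+1}[\mu_{\lambda_{q-1}+1}],\dots,n^p[\mu_p]}\;\Box\,\partial^{n^1}\cdots\partial^{n^{\lambda_{q-1}}}\,A_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x) = 0 \tag{4.145}
\]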
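together with (4.135) and (4.136). This extension introduces gauge potentials $A_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x)$ for the field strength, along with the conformally invariant gauge condition (4.145).

Now we introduce the module $E_{I_{(\lambda)_A},K_{(\lambda)_F},I_{(\lambda)_J}}$ that extends $E_{I_{(\lambda)_A},K_{(\lambda)_F}}$ by the module $I_{(\lambda)_J}$, where $(\lambda)_J = (-q-1,\lambda_1,\dots,\lambda_{q-1},0)$. $E_{I_{(\lambda)_A},K_{(\lambda)_F},I_{(\lambda)_J}}$ is described by the short exact sequence
\[
0 \to E_{I_{(\lambda)_A},K_{(\lambda)_F}} \to E_{I_{(\lambda)_A},K_{(\lambda)_F},I_{(\lambda)_J}} \to I_{(\lambda)_J} \to 0 . \tag{4.146}
\]
The module $E_{I_{(\lambda)_A},K_{(\lambda)_F},I_{(\lambda)_J}}$ is generated by $y^n$ from $|(\lambda)_F\rangle^{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]}$, $|(\lambda)_A\rangle^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}$ and $|(\lambda)_J\rangle^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}$ satisfying conditions (4.138)–(4.142) along with
\[
\psi_{n^1[q-2],n^2[\mu_2-1],\dots,n^{\lambda_{q-1}}[\mu_{\lambda_{q-1}}-1],n^{\lambda_{q-1}+1}[\mu_{\lambda_{q-1}+1}],\dots,n^p[\mu_p]}\; y_{n^1}\cdots y_{n^{\lambda_{q-1}}}\,|(\lambda)_J\rangle^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]} = 0 \tag{4.147}
\]
and
\[
\psi_{m n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}\; P^m\,|(\lambda)_J\rangle^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]} = -\frac{q}{3}\,\psi_{m n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}\,|(\lambda)_F\rangle^{m n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]} . \tag{4.148}
\]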
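Consider a section $|\Phi_{A,F,J}(x)\rangle$ of the bundle $\mathbb{R}^M\times E_{I_{(\lambda)_A},K_{(\lambda)_F},I_{(\lambda)_J}}$,
\[
\begin{aligned}
|\Phi_{A,F,J}(x)\rangle ={}& \sum_{l=0}^{\infty}\frac{1}{l!}\,F_{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p];m(l)}(x)\,\underbrace{y^{(m}\cdots y^{m)}}_{l}\,|(\lambda)_F\rangle^{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]} \\
&+ \sum_{l=0}^{\infty}\frac{1}{l!}\,A_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p];m(l)}(x)\,\underbrace{y^{(m}\cdots y^{m)}}_{l}\,|(\lambda)_A\rangle^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]} \\
&+ \sum_{l=0}^{\infty}\frac{1}{l!}\,J_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p];m(l)}(x)\,\underbrace{y^{(m}\cdots y^{m)}}_{l}\,|(\lambda)_J\rangle^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]} .
\end{aligned} \tag{4.149}
\]
Calculating the cohomology $H^p(t(M), E_{I_{(\lambda)_A},K_{(\lambda)_F},I_{(\lambda)_J}})$ from (4.146), one obtains that the condition $D|\Phi_{A,F,J}(x)\rangle = 0$ implies Eqs. (4.135), (4.144) and (4.145) along with the equations
\[
\psi^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}\left( \partial^{n^1} F_{n^1[q],n^2[\mu_2],\dots,n^p[\mu_p]}(x) - J_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x) \right) = 0, \tag{4.150}
\]
\[
\psi^{n^1[q-2],n^2[\mu_2-1],\dots,n^{\lambda_{q-1}}[\mu_{\lambda_{q-1}}-1],n^{\lambda_{q-1}+1}[\mu_{\lambda_{q-1}+1}],\dots,n^p[\mu_p]}\;\partial^{n^1}\cdots\partial^{n^{\lambda_{q-1}}}\,J_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x) = 0 . \tag{4.151}
\]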
For $\lambda_{q-1}=1$ (equivalently $\mu_2\le q-2$), the system (4.135), (4.144), (4.150), (4.151) generalizes the ordinary $M=4$ electrodynamics to any even space-time dimension and arbitrary tensor structure of fields. Here, (4.144) defines the generalized field strength $F_{m n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x)$ via the generalized potential $A_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x)$. Equation (4.135) is the Bianchi identity for the generalized field strength. Equation (4.150) describes "interaction" with the "current" (see Sec. 2.4) $J_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x)$, which is conserved by virtue of Eq. (4.151). The system (4.135), (4.144), (4.150), (4.151) is gauge invariant under the generalized gradient transformations
\[
\psi^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}\,\delta A_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x) = \psi^{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}\,\partial_{n^1}\epsilon_{n^1[q-2],n^2[\mu_2],\dots,n^p[\mu_p]}(x) \tag{4.152}
\]
with an arbitrary parameter $\epsilon_{n^1[q-2],n^2[\mu_2],\dots,n^p[\mu_p]}(x)$. Equation (4.145) fixes the conformal gauge, generalizing Eq. (2.45).

Analogously to the example in Sec. 2.4, one can relax the gauge fixing condition (4.145) by considering the module $E_{I_{(\lambda)_A},K_{(\lambda)_F},I_{(\lambda)_J},I_{(\lambda)_G}}$ defined by the short exact sequence
\[
0 \to E_{I_{(\lambda)_A},K_{(\lambda)_F},I_{(\lambda)_J}} \to E_{I_{(\lambda)_A},K_{(\lambda)_F},I_{(\lambda)_J},I_{(\lambda)_G}} \to I_{(\lambda)_G} \to 0, \tag{4.153}
\]
where $(\lambda)_G = (-\lambda_{q-1}-q-1,\lambda_1,\dots,\lambda_{q-2},0,0)$. The covariant constancy condition for the section $|\Phi_{A,F,J,G}\rangle$ implies Eqs. (4.135), (4.144), (4.150), (4.151) along with the equation
\[
\psi^{n^1[q-2],n^2[\mu_2],\dots,n^p[\mu_p]}\left( \Box\,\partial^{n^1} A_{n^1[q-1],n^2[\mu_2],\dots,n^p[\mu_p]}(x) - G_{n^1[q-2],n^2[\mu_2],\dots,n^p[\mu_p]}(x) \right) = 0 \tag{4.154}
\]
instead of (4.145). The field $G_{n^1[q-2],n^2[\mu_2],\dots,n^p[\mu_p]}(x)$ satisfies the equation
\[
\psi^{n^1[q-3],n^2[\mu_2-1],\dots,n^{\lambda_{q-2}}[\mu_{\lambda_{q-2}}-1],n^{\lambda_{q-2}+1}[\mu_{\lambda_{q-2}+1}],\dots,n^p[\mu_p]}\;\partial^{n^1}\cdots\partial^{n^{\lambda_{q-2}}}\,G_{n^1[q-2],n^2[\mu_2],\dots,n^p[\mu_p]}(x) = 0 . \tag{4.155}
\]
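For orientation, here is a sketch of how the system reads in the simplest case $q=2$ ($M=4$) with a single column ($p=1$). It assumes that contraction with an arbitrary $\psi$ merely projects onto the indicated symmetry type, that $[mn]$ denotes antisymmetrization and that ${}^{*}F$ is the usual Hodge dual; normalizations and overall signs are therefore a matter of convention rather than statements of the text.

```latex
\begin{align*}
\text{(4.144):}\quad & F_{mn} = \partial_{[m}A_{n]}, \\
\text{(4.135):}\quad & \partial^{m}({}^{*}F)_{mn} = 0, \qquad ({}^{*}F)_{mn} = \tfrac12\,\epsilon_{mnkl}F^{kl}, \\
\text{(4.150):}\quad & \partial^{m}F_{mn} = J_{n}, \\
\text{(4.151):}\quad & \partial^{n}J_{n} = \partial^{n}\partial^{m}F_{mn} = 0 .
\end{align*}
```

The second line holds identically once $F_{mn}$ is expressed through $A_n$, and the last line is compatible with the third because $F_{mn}$ is antisymmetric while $\partial^{n}\partial^{m}$ is symmetric. This is the familiar structure of Maxwell electrodynamics with a conserved current.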
4.7.3. Fradkin–Tseytlin conformal higher spins in even dimensions

Consider the highest weight $(\lambda)_C = (\lambda-2,\lambda,0,\dots,0)$, $\lambda_i\in\mathbb{N}$ (the case of half-integer $\lambda_i$ can be considered analogously). Let $I_{(\lambda)_C}$ be the irreducible conformal module with the highest weight $(\lambda)_C$. Using Theorems 3.1 and 4.3 we obtain the primitive conformal system corresponding to the module $I_{(\lambda)_C}$. It has the form
\[
\psi^{n(\lambda),m(\lambda)}\,\underbrace{\partial_m\cdots\partial_m}_{\lambda}\,C_{n(\lambda)}(x) = 0, \tag{4.156}
\]
\[
\psi^{n(\lambda-1)}\,(\partial\cdot\partial)^{\lambda+q-1}\,\partial^n C_{n(\lambda)}(x) = 0. \tag{4.157}
\]
Here $C_{n(\lambda)}(x)$ is a symmetric traceless tensor field and $\psi^{n(\lambda),m(\lambda)}$ is an arbitrary traceless tensor parameter corresponding to the $(\lambda\times2)$-block Young tableau. $(\partial\cdot\partial)^{\lambda+q-1}$ is a differential operator of order $2(\lambda+q-1)$,
\[
(\partial\cdot\partial)^{\lambda+q-1}\,C_{n(\lambda)}(x) = \sum_{p+r=\lambda+q-1} a(p,r)\,\Box^{p}\,\underbrace{\partial_{(n}\cdots\partial_{n}}_{r}\,\underbrace{\partial^{m}\cdots\partial^{m}}_{r}\,C_{n(\lambda-r))m(r)} \tag{4.158}
\]
for some $a(p,r)$. The left-hand side of Eq. (4.156) can be interpreted as the generalized Weyl tensor for the field $C_{n(\lambda)}(x)$,
\[
\psi^{n(\lambda),m(\lambda)}\,\underbrace{\partial_m\cdots\partial_m}_{\lambda}\,C_{n(\lambda)}(x) = \psi^{n(\lambda),m(\lambda)}\,W_{n(\lambda),m(\lambda)}(x). \tag{4.159}
\]
It is gauge invariant under the gauge transformations
\[
\psi^{n(\lambda)}\,\delta C_{n(\lambda)}(x) = \psi^{n(\lambda)}\,\partial_n\epsilon_{n(\lambda-1)}(x), \tag{4.160}
\]
where $\epsilon_{n(\lambda-1)}(x)$ is a gauge parameter. Equation (4.156) sets $W_{n(\lambda),m(\lambda)}(x)$ to zero and is dynamically trivial (i.e. it describes pure gauge degrees of freedom). Equation (4.157) is the conformal gauge condition for $C_{n(\lambda)}(x)$. (Note that, like any covariant gauge condition, it is incomplete.)

A nontrivial dynamical system with nonzero Weyl tensor is non-primitive and results from the reducible module $E_{I_{(\lambda)_C},I_{(\lambda)_W}}$ defined by the short exact sequence
\[
0 \to I_{(\lambda)_C} \to E_{I_{(\lambda)_C},I_{(\lambda)_W}} \to I_{(\lambda)_W} \to 0, \tag{4.161}
\]
where I(λ)W is the irreducible conformal module with the highest weight (λ)W = (−2, λ, λ, 0, . . . , 0) corresponding to the Weyl tensor Wn(λ),m(λ) (x). Cohomology of EI(λ)C ,I(λ)W is given in (4.118) for N = q − 1. The module EI(λ)C ,I(λ)W gives rise to the gauge fixing equation (4.157) along with the definition of the Weyl tensor (4.159) and the equation m · · ∂ m Wn(λ),m(λ) = 0 . ψ n(λ) 2q−4 ∂ ·
(4.162)
λ
This class of conformal equations was found by Fradkin and Tseytlin in [44] along with the analogous equations for spinor-tensors for M = 4 and generalized to arbitrary even M = 2q in [45].
The gauge invariant form of the same system (i.e. without Eq. (4.157)) results from our construction applied to the module $E_{I_{(\lambda)_C},I_{(\lambda)_W},I_{(\lambda)_G}}$ defined by the short exact sequence
\[
0 \to E_{I_{(\lambda)_C},I_{(\lambda)_W}} \to E_{I_{(\lambda)_C},I_{(\lambda)_W},I_{(\lambda)_G}} \to I_{(\lambda)_G} \to 0 . \tag{4.163}
\]
Here $I_{(\lambda)_G}$ is the irreducible conformal module with the highest weight $(\lambda)_G = (-\lambda-2q+1,\lambda-1,0,\dots,0)$. The module $E_{I_{(\lambda)_C},I_{(\lambda)_W},I_{(\lambda)_G}}$ gives rise to the system containing equations (4.159), (4.162) and the equation
\[
\psi^{n(\lambda-1)}\,(\partial\cdot\partial)^{\lambda+q-1}\,\partial^n C_{n(\lambda)}(x) = \psi^{n(\lambda-1)}\,G_{n(\lambda-1)}, \tag{4.164}
\]
which relaxes the gauge fixing equation (4.157).

5. Conclusions

In this paper, we study a general framework which allows us to classify and obtain the explicit form of all linear homogeneous fpΠ-invariant $M$-dimensional equations for an arbitrary semi-simple Lie algebra f which has a parabolic subalgebra $p_\Pi$ with an $M$-dimensional Abelian radical $r_\Pi$. These equations are written in the form of the covariant constancy conditions
\[
D|\Phi^p(x)\rangle = (d+\omega_0(x))|\Phi^p(x)\rangle = 0 . \tag{5.1}
\]
Here the connection 1-form $\omega_0(x)$ takes values in f and is flat, i.e. $(d+\omega_0(x))^2 = 0$. A particularly useful choice of the connection is $\omega_0(x) = \sigma_-$, where $\sigma_-$ takes values in the radical $r_\Pi$ and is $x$-independent, i.e. $d\sigma_- + \sigma_- d = 0$, $\sigma_-^2 = 0$. The $p$-forms $|\Phi^p(x)\rangle$ take values in an f-module M that is required to be $p_\Pi$-integrable. We prove that (5.1) leads to a linear homogeneous f-invariant equation
\[
R_M|\phi^p(x)\rangle = 0 \tag{5.2}
\]
on the set of dynamical fields $|\phi^p(x)\rangle$ that are elements of the $p$th cohomology of $\sigma_-$ (see Remark 3.4). All other fields from the set $|\Phi^p(x)\rangle$ are either pure gauge or auxiliary fields expressed in terms of derivatives of the dynamical fields. The form of equations (5.2) is determined by the $(p+1)$-th cohomology of $\sigma_-$. The fpΠ-invariant equations (5.2) are classified by the modules M. This classification is complete because any equation can be unfolded to the form (5.1) by introducing auxiliary fields. A constructive procedure is described which allows one to obtain the explicit form of the fpΠ-invariant equation associated with M.

In this paper, the proposed general construction is applied to obtain the complete classification of conformally invariant differential equations in terms of singular and subsingular modules of generalized Verma modules of the conformal algebra in $M$ dimensions.

The approach proposed in this paper can be further applied to several problems. The most straightforward application is to study free (i.e. linear) equations invariant under symmetries different from the usual conformal symmetry. A particularly interesting example is that of the symplectic algebra sp(m) which was
shown [5] to be a proper extension of the usual conformal algebra, acting on the infinite systems of fields of higher spins. More examples of sp(m)-invariant equations were obtained recently in [46]. It is also tempting to apply our approach to the study of M = 2 conformal systems starting with the related infinite-dimensional symmetries. Another interesting generalization to be studied consists in relaxing the requirement that the radical rΠ is Abelian. In this case, one can still formulate invariant equations in the form (5.1). The resulting equations are not translationally invariant because ω0 (x) is necessarily x-dependent. Also it is not clear how to implement the analysis of the dynamical content of the invariant equations in terms of cohomology. Let us note that this case is not of a purely “academic” interest. An important class of equations of this type is provided by superfield equations for supersymmetric systems, which are known to contain an explicit dependence on anticommuting variables through the supercovariant derivatives. It is well known that it is sometimes difficult to distinguish between constraints and “true” field equations in superspace. As mentioned in [5], the origin of this difficulty can be traced back to the absence of a distinct σ− cohomology description. One of the most important problems is to go beyond the class of linear equations. A suggestive feature of our approach mentioned in Sec. 2.4 is that it allows a natural definition of current modules. As a result, the interaction problem admits a reformulation in terms of the realization of current modules as tensor products (i.e. nonlinear combinations) of modules associated with matter fields. By analogy with higher spin theory, to put interacting theory in the framework of gravity with the gravitational field being one of the dynamical fields (i.e. 
not just a background one as in this paper) it is important to extend the formalism to (extensions of) field equations formulated in terms of differential p-forms with p > 0. Among other things, this requires clarifying the relationship between the dynamical equations formulated in terms of 0-forms as in this paper and those formulated in terms of higher differential forms (in particular, 1-forms) as in higher spin gauge theory [6, 3]. In this respect Theorems 4.2 and 4.3 in this paper and their generalizations to other Lie algebras to be worked out are likely to play the key role because they link together cohomology groups which determine dynamical fields and field equations in terms of various differential forms. Finally, it would be very instructive to make contact with other cohomological approaches such as developed, e.g., in [21, 47, 48]. Acknowledgments We are grateful to A. Semikhatov for useful discussions and numerous useful comments on the manuscript. We are grateful to R. Metsaev, B. Feigin and M. Finkelberg for valuable discussions. This work was supported by INTAS, Grant No. 00-01-254, the RFBR, Grant No. 02-02-17067 and Russian Federation President
Grant No. LSS-1578.2003.2. TIY is partially supported by the RFBR Grant No. 02-02-16944, RFBR Grant No. 03-01-06135 and the Russian Science Support Foundation. SOV is partially supported by the RFBR Grant No. 03-02-06465 and the Landau Scholarship Foundation, Forschungszentrum Jülich.

Appendix A. Relevant Facts from Representation Theory

The structure of generalized Verma modules can be investigated using methods developed in [17, 18, 30, 31, 49, 50]. Let us first recall some notation. Let h be the Cartan subalgebra and h* its dual space. Let the simple roots be denoted $\alpha_0,\alpha_1,\dots,\alpha_q$ and let Π consist of $\alpha_1,\dots,\alpha_q$ (see Sec. 3). The Weyl group $W^{q+1}$ is generated by the reflections $r_{\alpha_i}\equiv r_i$ ($0\le i\le q$) of h* over the hyperplane orthogonal to the simple root $\alpha_i$,
\[
r_i\lambda = \lambda - 2\,\frac{(\lambda,\alpha_i)}{(\alpha_i,\alpha_i)}\,\alpha_i, \qquad \lambda\in h^* . \tag{A.1}
\]
The action $r_\alpha\cdot\lambda$ (a nonlinear representation) of $W^{q+1}$ in h* is defined by the formula
\[
r_\alpha\cdot\lambda = \lambda - 2\,\frac{(\lambda+\rho,\alpha)}{(\alpha,\alpha)}\,\alpha \tag{A.2}
\]
for any $\alpha,\lambda\in h^*$. Here ρ is half of the sum of positive roots.$^{k}$ Let $W^q$ be the subgroup of the Weyl group generated by the simple reflections $r_i$ with $1\le i\le q$. Denote by $Q$ the root lattice $\{\mathbb{Z}\alpha_0+\mathbb{Z}\alpha_1+\dots+\mathbb{Z}\alpha_q\}$. For any highest weight λ, let $W^{q+1}_\lambda$ be the subgroup constituted by those elements $w\in W^{q+1}$ for which
\[
w\cdot\lambda \in \lambda + Q . \tag{A.3}
\]
Let $S_\lambda\subset W^{q+1}_\lambda$ be the stability subgroup of λ,
\[
s\cdot\lambda = \lambda, \qquad s\in S_\lambda . \tag{A.4}
\]
Consider the quotient
\[
T_\lambda = (W^q\cap W^{q+1}_\lambda)\backslash W^{q+1}_\lambda/(W^q\cap S_\lambda) . \tag{A.5}
\]
Denote by $L$ the set of highest weights of the form $\lambda = (\lambda_0,\lambda_1,\lambda_2,\dots,\lambda_q)$ where $(\lambda_1,\lambda_2,\dots,\lambda_q)$ is a dominant integral highest weight of $B_q$ ($D_q$) (i.e. $\lambda_1\ge\lambda_2\ge\dots\ge\lambda_q$ ($\lambda_1\ge\lambda_2\ge\dots\ge|\lambda_q|$) and $2\lambda_i$ are all even or all odd simultaneously). For any equivalence class from $T_\lambda$, one can choose a representative $t$ such that $t\cdot\lambda\in L$ whenever $\lambda\in L$. Let Tλ ⊂ Tλ denote the set of all such representatives. For any weight $\nu\in L$, the set of elements $T_\lambda$ generates the set of highest weights $\{t\cdot\nu\}_{t\in T_\lambda}$.

$^{k}$ Note that this formula is universal: given a linear representation of a group $G$ in a linear space $V$ and a fixed vector $\rho\in V$, the transformations $A\cdot\lambda = A\lambda + (A-\mathbf{1})\rho$ for $A\in G$ and $\lambda\in V$ define a (nonlinear) action of $G$ in $V$.
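The two actions (A.1) and (A.2) can be sketched in a few lines of code, with weights and roots given as coordinate tuples in an orthogonal basis. The function names and the sample data (a rank-two example with ρ = (3/2, 1/2)) are illustrative assumptions, not notation of the paper.

```python
from fractions import Fraction


def inner(u, v):
    """Standard inner product of two coordinate tuples."""
    return sum(a * b for a, b in zip(u, v))


def reflect(alpha, lam):
    """Linear reflection (A.1): lam - 2 (lam, alpha)/(alpha, alpha) alpha."""
    c = Fraction(2) * inner(lam, alpha) / inner(alpha, alpha)
    return tuple(l - c * a for l, a in zip(lam, alpha))


def dot_reflect(alpha, lam, rho):
    """Shifted action (A.2): lam - 2 (lam + rho, alpha)/(alpha, alpha) alpha."""
    shifted = tuple(l + r for l, r in zip(lam, rho))
    c = Fraction(2) * inner(shifted, alpha) / inner(alpha, alpha)
    return tuple(l - c * a for l, a in zip(lam, alpha))


if __name__ == "__main__":
    rho = (Fraction(3, 2), Fraction(1, 2))   # sample value of rho
    lam = (2, 1)                             # sample weight
    alpha = (1, -1)                          # sample root
    image = dot_reflect(alpha, lam, rho)
    # Footnote k: A . lam = A lam + (A - 1) rho
    universal = tuple(
        a + b - r
        for a, b, r in zip(reflect(alpha, lam), reflect(alpha, rho), rho)
    )
    print(image == universal)
    # Applying the same shifted reflection twice returns the original weight.
    print(dot_reflect(alpha, image, rho) == lam)
```

Both comparisons print `True` for the sample data, illustrating that the shifted action is the linear action conjugated by the translation by ρ.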
Elements $t\in T_\lambda$ are ordered with respect to their length(t), where length(t) is the number of factors in the reduced (i.e. minimal) decomposition of $t$ into a product of the elementary reflections generated by the simple roots. The reduced decomposition is unique. We write $t_1\prec t_2$ whenever length($t_1$) < length($t_2$). Note that the order so defined is partial because two elements with the same length cannot be compared. The main point is that the generalized Verma module $V_{t_2\cdot\nu}$ admits a nontrivial homomorphism into the generalized Verma module $V_{t_1\cdot\nu}$ whenever $t_1\prec t_2$ [50].

Applying this general method to the conformal algebra one obtains the structure of singular modules in $V_\lambda$ in the cases $B_{q+1}$ and $D_{q+1}$, which was completely studied in [30, 31] (see also the textbook [28]). This exhausts the case of $B_{q+1}$. In the case of $D_{q+1}$, subsingular modules exist and their structure should be investigated separately. Let us sketch the final results separately for the cases $B_{q+1}$ (i.e. $M=2q+1$) and $D_{q+1}$ (i.e. $M=2q$).

Let $M=2q+1$. The Dynkin diagram of the algebra $B_{q+1}$ is (4.2). Choose an orthogonal basis $\epsilon_i$, $0\le i\le q$, in h*. Then
\[
\alpha_i = \epsilon_i-\epsilon_{i+1}, \quad 0\le i\le q-1, \qquad \alpha_q = \epsilon_q . \tag{A.6}
\]
Introduce the basis $\epsilon^i$ in h dual to $\epsilon_i$ (i.e. $\epsilon_i(\epsilon^j) = \delta_i^j$),
\[
\epsilon^0 = -D, \quad \epsilon^1 = L_{12}, \quad \epsilon^i = \sqrt{-1}\,L_{2i-1,2i}, \quad 1<i\le q-1, \qquad \epsilon^q = \sqrt{-1}\,L_{2q,2q+1} . \tag{A.7}
\]
Then
\[
H_i = \epsilon^i-\epsilon^{i+1}, \quad 0\le i\le q-1, \qquad H_q = 2\epsilon^q . \tag{A.8}
\]
Half the sum of all positive roots is in this case
\[
\rho = \sum_{i=0}^{q}\Big(q-i+\frac12\Big)\epsilon_i . \tag{A.9}
\]
Recall that $r_i$ denote the simple reflections $r_i = r_{\alpha_i} = r_{\epsilon_i-\epsilon_{i+1}}$ for $0\le i\le q-1$ and $r_q = r_{\alpha_q} = r_{\epsilon_q}$. In the case of dominant integral λ the stability subgroup is trivial and the set $T_\lambda$ consists of the following elements [28],
\[
\begin{aligned}
e &\prec r_0 \prec r_1 r_{\epsilon_0-\epsilon_2} \prec r_1 r_2 r_{\epsilon_0-\epsilon_3} \prec \cdots \prec r_1 r_2\cdots r_{q-1} r_{\epsilon_0-\epsilon_q} \\
&\prec r_1 r_2\cdots r_{q-1} r_q r_{\epsilon_0+\epsilon_q} \prec r_1 r_2\cdots r_{q-2} r_{\epsilon_{q-1}} r_{\epsilon_0+\epsilon_{q-1}} \prec \cdots \\
&\prec r_1 r_{\epsilon_2} r_{\epsilon_0+\epsilon_2} \prec r_{\epsilon_1} r_{\epsilon_0+\epsilon_1} \prec r_{\epsilon_0} .
\end{aligned} \tag{A.10}
\]
Note that these elements are written in the non-reduced form, which, however, is more convenient for calculations. This gives rise to the diagram (B.1) (see the end of the paper) of homomorphisms of modules $V_\lambda$, where $\lambda_0\ge\lambda_1\ge\dots\ge\lambda_q\ge0$ and $2\lambda_i$, $0\le i\le q$, are either all even or all odd. The composition of any two homomorphisms (arrows) in the diagram is zero.
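Formula (A.9) can be checked directly by summing the positive roots. The sketch below assumes the standard realization of the positive roots of $B_{q+1}$ in an orthogonal basis $\epsilon_0,\dots,\epsilon_q$, namely $\epsilon_i\pm\epsilon_j$ ($i<j$) and $\epsilon_i$; the helper names are illustrative.

```python
from fractions import Fraction


def positive_roots_B(rank):
    """Positive roots of B_rank in an orthogonal basis e_0..e_{rank-1}:
    the short roots e_i and the long roots e_i + e_j, e_i - e_j with i < j."""
    roots = []
    for i in range(rank):
        short = [0] * rank
        short[i] = 1
        roots.append(tuple(short))
        for j in range(i + 1, rank):
            for sign in (1, -1):
                root = [0] * rank
                root[i], root[j] = 1, sign
                roots.append(tuple(root))
    return roots


def half_sum(roots):
    """Half of the sum of the given roots, coordinate by coordinate."""
    rank = len(roots[0])
    return tuple(Fraction(sum(r[k] for r in roots), 2) for k in range(rank))


if __name__ == "__main__":
    q = 3
    rho = half_sum(positive_roots_B(q + 1))
    expected = tuple(Fraction(2 * (q - i) + 1, 2) for i in range(q + 1))  # (A.9)
    print(rho == expected)
```

The coefficient of $\epsilon_i$ comes out as $q-i+\tfrac12$ because the contributions of $\epsilon_j\pm\epsilon_i$ with $j<i$ cancel, while each $j>i$ contributes twice and the short root $\epsilon_i$ contributes once.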
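For non-integral λ, homomorphisms are associated with $T_\lambda = \{e\prec r_{\epsilon_0}\}$. In this case the parameters of the highest weight should satisfy
\[
\lambda_0 = -q-\frac12+n, \quad n\in\mathbb{N}, \qquad \lambda_i\in\mathbb{N}, \quad 1\le i\le q, \tag{A.11}
\]
or
\[
\lambda_0 = -q+n, \quad n\in\mathbb{N}_0, \qquad \lambda_i\in\frac12+\mathbb{N}_0, \quad 1\le i\le q. \tag{A.12}
\]
This leads to the following diagram of homomorphisms
\[
V_{(\lambda_0,\lambda_1,\dots,\lambda_q)} \xleftarrow{\ (r_{\epsilon_0},\,l)\ } V_{(-\lambda_0-2q-1,\lambda_1,\dots,\lambda_q)} . \tag{A.13}
\]
For the case (A.11) $l=2n$. For the case (A.12) $l=2n+1$.

Let $M=2q$. The Dynkin diagram of the algebra $D_{q+1}$ is (4.1). Choose an orthogonal basis $\epsilon_i$ in h*. Then
\[
\alpha_i = \epsilon_i-\epsilon_{i+1}, \quad 0\le i\le q-1, \qquad \alpha_q = \epsilon_{q-1}+\epsilon_q . \tag{A.14}
\]
Half the sum of all positive roots is
\[
\rho = \sum_{i=0}^{q-1}(q-i)\,\epsilon_i . \tag{A.15}
\]
An analysis analogous to that of the odd dimensional case shows that $T_\lambda$ with a dominant integral λ consists of the following elements [28],
\[
\begin{aligned}
e &\prec r_0 \prec r_1 r_{\epsilon_0-\epsilon_2} \prec r_1 r_2 r_{\epsilon_0-\epsilon_3} \prec \cdots \prec r_1 r_2\cdots r_{q-2} r_{\epsilon_0-\epsilon_{q-1}}
\prec \begin{matrix} r_1 r_2\cdots r_{q-1} r_{\epsilon_0-\epsilon_q} \\ r_1 r_2\cdots r_{q-2} r_q r_{\epsilon_0+\epsilon_q} \end{matrix} \\
&\prec r_1 r_2\cdots r_{q-1} r_q r_{\epsilon_0+\epsilon_{q-1}} \prec r_1 r_2\cdots r_{q-3} r_{\epsilon_{q-2}-\epsilon_q} r_{\epsilon_{q-2}+\epsilon_q} r_{\epsilon_0+\epsilon_{q-2}} \\
&\prec r_1 r_2\cdots r_{q-4} r_{\epsilon_{q-3}-\epsilon_q} r_{\epsilon_{q-3}+\epsilon_q} r_{\epsilon_0+\epsilon_{q-3}} \prec \cdots \\
&\prec r_1 r_{\epsilon_2-\epsilon_q} r_{\epsilon_2+\epsilon_q} r_{\epsilon_0+\epsilon_2} \prec r_{\epsilon_1-\epsilon_q} r_{\epsilon_1+\epsilon_q} r_{\epsilon_0+\epsilon_1} \prec r_{\epsilon_0-\epsilon_q} r_{\epsilon_0+\epsilon_q} .
\end{aligned} \tag{A.16}
\]
The diagram of $V_\lambda$-homomorphisms is (B.2) (see the end of this paper), where $\lambda_0\ge\lambda_1\ge\dots\ge|\lambda_q|$ and $2\lambda_i$, $0\le i\le q$, are either all even or all odd. Here, the composition of any two homomorphisms, except for those in the central rhombus and those that are labeled by NS, is zero. There also exist $q-1$ nonstandard homomorphisms [30] (they are labeled by the symbol NS in the diagram (B.2)) between modules in this diagram that correspond to the element $r_{\epsilon_0-\epsilon_q} r_{\epsilon_0+\epsilon_q}$ from $T_\lambda$,
\[
V_{(\lambda_N-N,\lambda_0+1,\lambda_1+1,\dots,\lambda_{N-1}+1,\lambda_{N+1},\dots,\lambda_q)} \xleftarrow{\ (r_{\epsilon_0-\epsilon_q} r_{\epsilon_0+\epsilon_q},\ 2\lambda_N-2N+2q)\ } V_{(-\lambda_N+N-2q,\lambda_0+1,\lambda_1+1,\dots,\lambda_{N-1}+1,\lambda_{N+1},\dots,-\lambda_q)} \tag{A.17}
\]
for$^{l}$ $0\le N<q-1$.

$^{l}$ For $N=q-1$, this homomorphism amounts to the composition of the homomorphisms that constitute the rhombus.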
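There are also nonstandard homomorphisms in the case when λ is singular, i.e. $\lambda+\rho$ lies on a wall of the Weyl chamber. Then $T_\lambda = \{e\prec r_{\epsilon_0-\epsilon_q} r_{\epsilon_0+\epsilon_q}\}$ and the parameters of the highest weight satisfy the following relations
\[
\lambda_0-\lambda_N+N = 0 \quad \text{for some } N = 1,2,\dots,q, \tag{A.18}
\]
\[
\lambda_0+\lambda_q+q = n\in\mathbb{N}_0, \qquad \lambda_0-\lambda_q+q = m\in\mathbb{N}_0 \tag{A.19}
\]
and
\[
m+n \neq 0. \tag{A.20}
\]
Here (A.18) is the condition that the highest weight is singular and (A.19), (A.20) are the conditions that $r_{\epsilon_0-\epsilon_q} r_{\epsilon_0+\epsilon_q}\lambda$ belongs to the weight lattice. These homomorphisms are
\[
V_{(\lambda_0,\lambda_1,\lambda_2,\dots,\lambda_q)} \xleftarrow{\ (r_{\epsilon_0-\epsilon_q} r_{\epsilon_0+\epsilon_q},\ 2\lambda_0+2q)\ } V_{(-\lambda_0-2q,\lambda_1,\lambda_2,\dots,-\lambda_q)} . \tag{A.21}
\]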
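As a consistency check, the labels in (A.21) can be reproduced from (A.2), (A.15) and (A.19). The short computation below assumes only that the reflections act on the coordinates of $\lambda+\rho$ in the orthogonal basis in the standard way (the product of the two reflections changes the signs of the coordinates number $0$ and $q$).

```latex
\begin{align*}
(\lambda+\rho)_0 &= \lambda_0+q, \qquad (\lambda+\rho)_q = \lambda_q
  && \text{by (A.15)},\\
r_{\epsilon_0-\epsilon_q} r_{\epsilon_0+\epsilon_q}:\quad
  &\big((\lambda+\rho)_0,\,(\lambda+\rho)_q\big)\mapsto\big(-(\lambda+\rho)_0,\,-(\lambda+\rho)_q\big),\\
\big(r_{\epsilon_0-\epsilon_q} r_{\epsilon_0+\epsilon_q}\cdot\lambda\big)_0 &= -(\lambda_0+q)-q = -\lambda_0-2q,
  \qquad
  \big(r_{\epsilon_0-\epsilon_q} r_{\epsilon_0+\epsilon_q}\cdot\lambda\big)_q = -\lambda_q,\\
n+m &= (\lambda_0+\lambda_q+q)+(\lambda_0-\lambda_q+q) = 2\lambda_0+2q .
\end{align*}
```

The resulting highest weight $(-\lambda_0-2q,\lambda_1,\dots,\lambda_{q-1},-\lambda_q)$ and the value $n+m = 2\lambda_0+2q$ agree with the source module and with the second entry of the arrow label in (A.21).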
The quotient of an arbitrary generalized Verma module $V_\lambda$ over the submodule $P_{(\lambda)}$ generated from all singular submodules of $V_\lambda$ is not necessarily irreducible. In fact, the module $V_\lambda$ can have subsingular submodules (those that are singular in $V_\lambda/P_{(\lambda)}$), subsubsingular submodules, etc. In the conformal algebra case, subsubsingular submodules do not appear.

To describe the structure of $V_\lambda$ for the highest weight $(\lambda)$ belonging to the series (4.49), we start with the case of $(\lambda)_{-q} = (0,0,\dots,0)$. All other cases can be obtained from this one by application of the shift functor [17] to modules belonging to the case $(\lambda)_{-q} = (0,0,\dots,0)$. So let us consider the case
\[
\begin{aligned}
(\lambda)_{-q} &= (0,0,\dots,0),\\
(\lambda)_{-q+1} &= (-1,1,0,\dots,0),\\
&\;\;\vdots\\
(\lambda)_{-q+N} &= (-N,\underbrace{1,\dots,1}_{N},0,\dots,0), \qquad N = 0,\dots,q-1,\\
&\;\;\vdots\\
(\lambda)_{-1} &= (-q+1,1,\dots,1,0),\\
(\lambda)_{0} &= (-q,1,\dots,1), \qquad (\lambda)_{0'} = (-q,1,\dots,1,-1),\\
(\lambda)_{1} &= (-q-1,1,\dots,1,0),\\
&\;\;\vdots\\
(\lambda)_{K} &= (-q-K,\underbrace{1,\dots,1}_{q-K},0,\dots,0), \qquad K = 1,\dots,q-1,\\
&\;\;\vdots\\
(\lambda)_{q-2} &= (-2q+2,1,1,0,\dots,0),\\
(\lambda)_{q-1} &= (-2q+1,1,0,\dots,0),\\
(\lambda)_{q} &= (-2q,0,\dots,0).
\end{aligned} \tag{A.22}
\]
The structure of generalized Verma modules with these highest weights can be elaborated by direct calculation. Solving explicitly the system of equations
\[
P_n F_A(y^m)\,|(\lambda)_N\rangle^A = 0 \tag{A.23}
\]
for the polynomials $F_A(y^m)$, where $P_n$ are the differential operators (4.27), we obtain that the module $V_{(\lambda)_{-q}}$ contains the singular vectors
\[
|s^1_{(\lambda)_{-q}}\rangle^m = y^m\,|(\lambda)_{-q}\rangle, \tag{A.24}
\]
\[
|s^2_{(\lambda)_{-q}}\rangle = (y^2)^q\,|(\lambda)_{-q}\rangle . \tag{A.25}
\]
The modules $V_{(\lambda)_N}$ for $N = -q+1,\dots,-1$ contain the singular vectors
\[
|s^1_{(\lambda)_N}\rangle^{m[N+q+1]} = y^{[m}\,|(\lambda)_N\rangle^{m[N+q]]}, \tag{A.26}
\]
\[
|s^2_{(\lambda)_N}\rangle^{m[N+q]} = (y^2)^{-N}\,|(\lambda)_N\rangle^{m[N+q]} - (N+q)\,(y^2)^{-N-1}\,y_n\,y^{[m}\,|(\lambda)_N\rangle^{n m[N+q-1]]} \tag{A.27}
\]
and the subsingular vectors
\[
|\mathrm{subs}_{(\lambda)_N}\rangle^{m[N+q-1]} = (y^2)^{-N}\,y_n\,|(\lambda)_N\rangle^{n m[N+q-1]} . \tag{A.28}
\]
The modules $V_{(\lambda)_N}$ for $N = 0,0',1,\dots,q-1$ contain the singular vectors
\[
|s_{(\lambda)_N}\rangle^{m[q-N-1]} = y_m\,|(\lambda)_N\rangle^{m[q-N]} . \tag{A.29}
\]
The completeness of this list of singular and subsingular modules follows from the theory of intersection cohomology sheaves [18].
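Appendix B. Homomorphism Diagrams

\[
\begin{array}{l}
V_{(\lambda)_0} = V_{(\lambda_0,\lambda_1,\lambda_2,\dots,\lambda_q)} \\
\quad\uparrow\ (r_{\epsilon_0-\epsilon_1},\ \lambda_0-\lambda_1+1) \\
V_{(\lambda)_1} = V_{(\lambda_1-1,\lambda_0+1,\lambda_2,\dots,\lambda_q)} \\
\quad\uparrow\ (r_{\epsilon_0-\epsilon_2},\ \lambda_1-\lambda_2+1) \\
\quad\vdots \\
\quad\uparrow\ (r_{\epsilon_0-\epsilon_N},\ \lambda_{N-1}-\lambda_N+1) \\
V_{(\lambda)_N} = V_{(\lambda_N-N,\lambda_0+1,\lambda_1+1,\dots,\lambda_{N-1}+1,\lambda_{N+1},\dots,\lambda_q)} \\
\quad\uparrow\ (r_{\epsilon_0-\epsilon_{N+1}},\ \lambda_N-\lambda_{N+1}+1) \\
V_{(\lambda)_{N+1}} = V_{(\lambda_{N+1}-N-1,\lambda_0+1,\lambda_1+1,\dots,\lambda_N+1,\lambda_{N+2},\dots,\lambda_q)} \\
\quad\uparrow\ (r_{\epsilon_0-\epsilon_{N+2}},\ \lambda_{N+1}-\lambda_{N+2}+1) \\
\quad\vdots \\
\quad\uparrow\ (r_{\epsilon_0-\epsilon_q},\ \lambda_{q-1}-\lambda_q+1) \\
V_{(\lambda)_q} = V_{(\lambda_q-q,\lambda_0+1,\lambda_1+1,\dots,\lambda_{q-1}+1)} \\
\quad\uparrow\ (r_{\epsilon_0},\ 2\lambda_q+1) \\
V_{(\lambda)_{q+1}} = V_{(-\lambda_q-q-1,\lambda_0+1,\lambda_1+1,\dots,\lambda_{q-1}+1)} \\
\quad\uparrow\ (r_{\epsilon_0+\epsilon_q},\ \lambda_{q-1}-\lambda_q+1) \\
\quad\vdots \\
\quad\uparrow\ (r_{\epsilon_0+\epsilon_{N+1}},\ \lambda_N-\lambda_{N+1}+1) \\
V_{(\lambda)_{2q+1-N}} = V_{(-\lambda_N+N-2q-1,\lambda_0+1,\lambda_1+1,\dots,\lambda_{N-1}+1,\lambda_{N+1},\dots,\lambda_q)} \\
\quad\uparrow\ (r_{\epsilon_0+\epsilon_N},\ \lambda_{N-1}-\lambda_N+1) \\
V_{(\lambda)_{2q+2-N}} = V_{(-\lambda_{N-1}+N-2q-2,\lambda_0+1,\lambda_1+1,\dots,\lambda_{N-2}+1,\lambda_N,\dots,\lambda_q)} \\
\quad\uparrow\ (r_{\epsilon_0+\epsilon_{N-1}},\ \lambda_{N-2}-\lambda_{N-1}+1) \\
\quad\vdots \\
\quad\uparrow\ (r_{\epsilon_0+\epsilon_1},\ \lambda_0-\lambda_1+1) \\
V_{(\lambda)_{2q+1}} = V_{(-\lambda_0-2q-1,\lambda_1,\lambda_2,\dots,\lambda_q)}
\end{array} \tag{B.1}
\]
The label $(r,l)$ at a homomorphism arrow has the following meaning: $r$ denotes the reflection that connects the highest weights of the two modules, and $l$ is the level at which the singular module resulting from the arrow homomorphism is situated.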
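The highest weights listed in (B.1) can be regenerated from the elements (A.10) and the shifted action (A.2). The sketch below makes the following assumptions, none of which is spelled out in the text: weights are coordinate tuples in the basis $\epsilon_0,\dots,\epsilon_q$, the Weyl group of $B_{q+1}$ acts on $\lambda+\rho$ by signed permutations of coordinates, and a product of reflections acts with its rightmost factor first. All function names are hypothetical.

```python
from fractions import Fraction


def rho_B(q):
    """Half-sum of positive roots (A.9) in the basis e_0..e_q."""
    return [Fraction(2 * (q - i) + 1, 2) for i in range(q + 1)]


def swap(i, j):
    """Reflection in e_i - e_j: exchange coordinates i and j."""
    def act(a):
        b = list(a)
        b[i], b[j] = b[j], b[i]
        return b
    return act


def swap_neg(i, j):
    """Reflection in e_i + e_j: exchange coordinates i and j and flip both signs."""
    def act(a):
        b = list(a)
        b[i], b[j] = -b[j], -b[i]
        return b
    return act


def neg(i):
    """Reflection in e_i: flip the sign of coordinate i."""
    def act(a):
        b = list(a)
        b[i] = -b[i]
        return b
    return act


def dot_action(word, lam, rho):
    """w . lam = w(lam + rho) - rho for w = word[0] word[1] ... (rightmost first)."""
    a = [l + r for l, r in zip(lam, rho)]
    for reflection in reversed(word):
        a = reflection(a)
    return tuple(x - r for x, r in zip(a, rho))


def chain_B(lam):
    """Highest weights t . lam for the elements t listed in (A.10)."""
    q = len(lam) - 1
    rho = rho_B(q)
    simple = [swap(k, k + 1) for k in range(q)] + [neg(q)]   # r_0, ..., r_q
    words = [[]]                                             # identity e
    for N in range(1, q + 1):                                # r_1..r_{N-1} r_{e0-eN}
        words.append(simple[1:N] + [swap(0, N)])
    for N in range(q, 0, -1):                                # r_1..r_{N-1} r_{eN} r_{e0+eN}
        words.append(simple[1:N] + [neg(N), swap_neg(0, N)])
    words.append([neg(0)])                                   # r_{e0}
    return [dot_action(w, lam, rho) for w in words]


if __name__ == "__main__":
    for weight in chain_B((3, 2, 1)):    # q = 2, sample dominant weight
        print(tuple(str(x) for x in weight))
```

For the sample weight (3, 2, 1) with q = 2 the six weights produced follow the pattern of (B.1), starting from $(\lambda_0,\lambda_1,\lambda_2)$ and ending with $(-\lambda_0-2q-1,\lambda_1,\lambda_2)$.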
(B.2) [Diagram of $V_\lambda$-homomorphisms for $M=2q$; the figure is not reproduced here.]
The label (r, l) at a homomorphism arrow has the following meaning. r denotes the reflection that connects highest weights of the two modules. l is the level at which a singular module resulting from the arrow homomorphism is situated.
References

[1] M. A. Vasiliev, Consistent equations for interacting massless fields of all spins in the first order in curvatures, Ann. Phys. (N.Y.) 190 (1989) 59–106.
[2] M. A. Vasiliev, Consistent equation for interacting gauge fields of all spins in (3 + 1)-dimensions, Phys. Lett. B 243 (1990) 378–382; Properties of equations of motion of interacting gauge fields of all spins in (3 + 1)-dimensions, Class. Quant. Grav. 8 (1991) 1387–1417; More on equations of motion for interacting massless fields of all spins in (3 + 1)-dimensions, Phys. Lett. B 285 (1992) 225–234.
[3] M. A. Vasiliev, Higher spin gauge theories: Star product and AdS space, in The Many Faces of the Superworld; Golfand's Memorial Volume, ed. M. Shifman (World Scientific, 2000); arXiv: hep-th/9910096.
[4] O. V. Shaynkman and M. A. Vasiliev, Scalar field in any dimension from the higher spin gauge theory perspective, Theor. Math. Phys. 123 (2000) 683–700; arXiv: hep-th/0003123.
[5] M. A. Vasiliev, Conformal higher spin symmetries of 4-d massless supermultiplets and osp(L,2M) invariant equations in generalized (super)space, Phys. Rev. D 66 (2002) 066006; arXiv: hep-th/0106149.
[6] M. A. Vasiliev, Nonlinear equations for symmetric massless higher spin fields in (A)dS(d), Phys. Lett. B 567 (2003) 139–151; arXiv: hep-th/0304049.
[7] E. S. Fradkin and V. Ya. Linetsky, BFV approach to geometric quantization, Nucl. Phys. B 431 (1994) 569–621.
[8] B. V. Fedosov, Deformation Quantization and Index Theory (Akademie-Verlag, Berlin, 1996).
[9] M. A. Vasiliev, Higher-spin theories and Sp(2M) invariant space-time; arXiv: hep-th/0301235.
[10] E. S. Fradkin and M. A. Vasiliev, Candidate to the role of higher spin symmetry, Ann. Phys. 177 (1987) 63–126.
[11] S. E. Konstein and M. A. Vasiliev, Extended higher spin superalgebras and their massless representations, Nucl. Phys. B 331 (1990) 475–499.
[12] E. S. Fradkin and V. Ya. Linetsky, A superconformal theory of massless higher spin fields in D = (2 + 1), Ann. Phys. 198 (1990) 293–320.
[13] O. V. Shaynkman and M. A. Vasiliev, Higher spin conformal symmetry for matter fields in (2 + 1)-dimensions, Theor. Math. Phys. 128 (2001) 1155–1168; arXiv: hep-th/0103208.
[14] E. Sezgin and P. Sundell, 7-D bosonic higher spin theory: Symmetry algebra and linearized constraints, Nucl. Phys. B 634 (2002) 120–140; arXiv: hep-th/0112100.
[15] M. G. Eastwood, Higher symmetries of the Laplacian, Ann. of Math. 161(3) (2005) 1645–1665; arXiv: hep-th/0206233.
[16] M. A. Vasiliev, Unfolded representation for relativistic equations in (2 + 1) anti-De Sitter space, Class. Quant. Grav. 11 (1994) 649–664.
[17] D. A. Vogan, Representations of Real Reductive Lie Groups, Progress in Mathematics, Vol. 15 (Birkhauser, 1981).
[18] A. Beilinson and J. Bernstein, A proof of the Jantzen conjectures, Advances in Soviet Mathematics, Vol. 16, Part 1 (AMS, 1993), pp. 1–50.
[19] B. Kostant, Verma modules and the existence of quasi-invariant differential operators, Lect. Notes. Math. 466 (1975) 101–128.
[20] M. G. Eastwood and J. W. Rice, Conformally invariant differential operators on Minkowski space and their curved analogues, Comm. Math. Phys. 109 (1987) 207–228; Erratum, Comm. Math. Phys. 144 (1992) 213.
[21] T. Lada and J. Stasheff, Introduction to SH Lie algebras for physicists, Internat. J. Theoret. Phys. 32 (1993) 1087–1104; arXiv: hep-th/9209099.
[22] H. Kajiura, Homotopy algebra morphism and geometry of classical string field theory, Nucl. Phys. B 630 (2002) 361–432; arXiv: hep-th/0112228.
[23] M. A. Vasiliev, Triangle identity and free differential algebra of massless higher spins, Nucl. Phys. B 324 (1989) 503–522.
[24] D. H. Mayer, Vector and tensor fields in conformal space, J. Math. Phys. 16(4) (1975) 884.
[25] F. Bayen and M. Flato, Remarks on conformal space, J. Math. Phys. 17(7) (1976) 1112–1114.
[26] V. B. Petkova, G. M. Sotkov and I. T. Todorov, Conformal gauges and renormalized equations of motion in massless quantum, Comm. Math. Phys. 97 (1985) 227–256.
[27] E. S. Fradkin and M. Ya. Palchik, Conformal Quantum Field Theory in D-Dimensions (Kluwer Academic Publishers, 1996).
[28] R. J. Baston and M. G. Eastwood, The Penrose Transform. Its Interaction with Representation Theory (Clarendon Press, Oxford, 1989).
[29] M. A. Vasiliev, Extended higher spin superalgebras and their realizations in terms of quantum operators, Fortschr. Phys. 36 (1988) 33–62.
[30] B. D. Boe and D. H. Collingwood, A comparison theory for the structure of induced representations, J. Algebra 54 (1985) 511–545.
[31] B. D. Boe and D. H. Collingwood, A comparison theory for the structure of induced representations II, Math. Z. 190 (1985) 1–11.
[32] T. P. Branson, An anomaly associated with 4-dimensional quantum gravity, Comm. Math. Phys. 178 (1996) 301–309.
[33] T. Parker and S. Rosenberg, Invariants of conformal Laplacians, J. Diff. Geom. 25 (1987) 199–222.
[34] R. J. Riegert, A nonlocal action for the trace anomaly, Phys. Lett. B 134 (1984) 56–60.
[35] S. Paneitz, A quartic conformally covariant differential operator for arbitrary pseudo-Riemannian manifolds, MIT preprint (1983).
[36] A. Iorio, L. O'Raifeartaigh, I. Sachs and C. Wiesendanger, Weyl gauging and conformal invariance, Nucl. Phys. B 495 (1997) 433–450; arXiv: hep-th/9607110.
[37] J. Erdmenger, Conformally covariant differential operators: Properties and applications, Class. Quant. Grav. 14 (1997) 2061–2084; arXiv: hep-th/9704108.
[38] J. Erdmenger and H. Osborn, Conformally covariant differential operators: Symmetric tensor fields, Class. Quant. Grav. 15 (1998) 273–280; arXiv: gr-qc/9708040.
[39] L. Dolan, C. R. Nappi and E. Witten, Conformal operators for partially massless states, JHEP 0110 (2001) 016; arXiv: hep-th/0109096.
[40] W. Siegel, All free conformal representations in all dimensions, Int. J. Mod. Phys. A 4 (1989) 2015–2020.
[41] R. R. Metsaev, All conformal invariant representations of d-dimensional anti-de Sitter group, Mod. Phys. Lett. A 10 (1995) 1719–1731.
[42] S. Ferrara and C. Fronsdal, Conformal fields in higher dimensions, in Ninth Marcel Grossman Meeting, eds. V. G. Gurzadyan, R. T. Jantzen and R. Ruffini (World Scientific, 2002), pp. 508–527; arXiv: hep-th/0006009.
[43] C. M. Hull, Symmetries and compactifications of (4,0) conformal gravity, JHEP (2000) 0012:007; arXiv: hep-th/0011215.
[44] E. S. Fradkin and A. A. Tseytlin, Conformal supergravity, Phys. Rep. 119 (1985) 233–362.
[45] A. Y. Segal, Conformal higher spin theory, Nucl. Phys. B 664 (2003) 59–130; arXiv: hep-th/0207212.
[46] O. A. Gelfond and M. A. Vasiliev, Higher rank conformal fields in the sp(2m) symmetric generalized space-time, arXiv: hep-th/0304020.
[47] G. Barnich, F. Brandt and M. Henneaux, Local BRST cohomology in the antifield formalism. 1. General theorems, Commun. Math. Phys. 174 (1995) 57–92; arXiv: hep-th/9405109.
[48] M. Dubois-Violette and M. Henneaux, Tensor fields of mixed Young symmetry type and N complexes, Commun. Math. Phys. 226 (2002) 393–418; arXiv: math.QA/0110088.
[49] V. G. Kac and D. A. Kazhdan, Structure of representations with highest weight of infinite-dimensional Lie algebras, Adv. Math. 34 (1984) 97–108.
[50] D. P. Zhelobenko, Representations of Reductive Lie Algebras (Nauka, Fizmatlit Publishing Company, Moscow, 1993).
November 1, 2006 11:8 WSPC/148-RMP
J070-00280
Reviews in Mathematical Physics Vol. 18, No. 8 (2006) 887–912 © World Scientific Publishing Company
THE SCHWINGER REPRESENTATION OF A GROUP: CONCEPT AND APPLICATIONS
S. CHATURVEDI
School of Physics, University of Hyderabad, Hyderabad 500 046, India
[email protected]

G. MARMO
Dipartimento di Scienze Fisiche, Università di Napoli Federico II and INFN, Via Cintia, 80126 Napoli, Italy
[email protected]

N. MUKUNDA
Centre for High Energy Physics, Indian Institute of Science, Bangalore 560 012, India
[email protected]

R. SIMON
The Institute of Mathematical Sciences, C. I. T. Campus, Chennai 600 113, India
[email protected]

A. ZAMPINI
SISSA, Mathematical Physics Sector, via Beirut 2, 4, 34014 Trieste, Italy
[email protected]

Received 4 April 2006
Revised 3 August 2006

The concept of the Schwinger Representation of a finite or compact simple Lie group is set up as a multiplicity-free direct sum of all the unitary irreducible representations of the group. This is abstracted from the properties of the Schwinger oscillator construction for SU(2), and its relevance in several quantum mechanical contexts is highlighted. The Schwinger representations for SU(2), SO(3) and SU(n) for all n are constructed via specific carrier spaces and group actions. In the SU(2) case, connections to the oscillator construction and to Majorana's theorem on pure states for any spin are worked out. The role of the Schwinger Representation in setting up the Wigner–Weyl isomorphism for quantum mechanics on a compact simple Lie group is brought out.

Keywords: Schwinger representation; Schwinger oscillator construction; compact semisimple Lie groups; Majorana representation for spin; Wigner distribution; Wigner–Weyl isomorphism.

Mathematics Subject Classification: 22E70, 22E46, 81S30, 81R99
1. Introduction

The Schwinger construction of the Lie algebra of SU(2) in terms of the annihilation and creation operators of two independent quantum mechanical harmonic oscillators has been used in a wide variety of contexts [1]. These include the physics of strongly correlated systems [2], quantum optics of two mode radiation fields [3], analysis of partially coherent classical Gaussian Schell model beams [4], extension to all three-dimensional Lie algebras and analysis of both classical and q-deformed versions [5], q-Boson calculus [6], connection between the hydrogen atom and the harmonic oscillator [7], SU(2) unit tensors [8], applications in the context of quantum computing [9], and a new approach to the spin-statistics theorem [10], to mention only a few. This is in addition to the elegance and relative ease with which many results belonging to the body of the quantum theory of angular momentum can be derived.

Two important features of the Schwinger construction are economy and completeness. By these we mean that the unitary representation (UR) of SU(2) that is obtained by exponentiating the generators contains, upon reduction, every unitary irreducible representation (UIR) of SU(2) exactly once, omitting none. The feature of economy, i.e. simple reducibility, is lost when one considers the natural generalization of the Schwinger construction from SU(2) to SU(3): indeed in a minimal oscillator construction that ensures completeness, every SU(3) UIR occurs with infinite multiplicity [11]. An explicit construction of a complete and multiplicity-free representation of SU(3), via harmonic functions on the sphere $S^5$, and an oscillator construction of the same representation are given in [12].

In the present work, we abstract the two special features of the Schwinger SU(2) construction mentioned above, and make them the basis of the definition of what we shall call the Schwinger Representation (Schwinger rep) for an interesting class of groups.
The groups we shall mainly consider are compact Lie groups with simple Lie algebras, while our considerations remain meaningful for finite groups as well. Both of these are of considerable importance in the general framework of quantum mechanics. The precise definition of the Schwinger rep is given in the next section. Here we may stress that on account of the two properties of economy and completeness it may be regarded as a “generating representation” of the group concerned. While these two features are retained, what is given up in general is any elementary construction in terms of oscillator operators.
A related concept of “model representations” has been introduced and studied by Gelfand et al. [13]. However, the focus there has been on the families of classical noncompact simple Lie groups, and moreover on the nonunitary finite dimensional representations of these groups. As mentioned above, our motivations lie in possible applications of our concept in problems arising within the framework of quantum mechanics, where unitarity of group representations has a special significance. The material of this paper is arranged as follows. In Sec. 2, we introduce the notion of the Schwinger rep of a group and discuss its consequences for compact Lie groups and non compact Abelian groups Rn . Further, we show that while the original Schwinger SU (2) representation, and that for SU (3) permit interpretation in terms of particular induced representations, this ceases to be the case for SU (n) beyond n = 3. In Sec. 3, we discuss the SU (2) Schwinger rep in a manner that anticipates generalization later and bring out the salient features of the carrier space thus obtained. Section 4 contains application of the construction developed in Sec. 3 to recover the Schwinger oscillator construction for SU (2) and Majorana’s representation for a spin j system by sets of points on S 2 . In Sec. 5, we develop the SO(3) Schwinger rep and contrast it with the way this is done conventionally. In Sec. 6, we show how the formalism developed in Sec. 3 for the SU (2) case naturally leads to the SU (n) Schwinger rep for any n. The significance of the Schwinger rep in the context of the Wigner–Weyl isomorphism for Lie groups developed by the present authors is brought out in Sec. 7. Section 8 contains concluding remarks and some open questions which merit further investigation. Throughout this paper, we shall adopt the usual quantum mechanical usage and denote unitary Lie group representation generators by hermitian operators.
2. The Schwinger Representation of a Group

We consider a compact Lie group $G$ with simple Lie algebra $\underline{G}$. (However, many of the ideas developed below are meaningful also for finite groups.) Then, as is well known, every representation of $G$, and, in particular, every irreducible representation, may be assumed to be unitary. We shall use a notation for the UIR's which generalizes the notation familiar for SU(2) and SO(3) in quantum angular momentum theory. We label the various mutually inequivalent UIR's of $G$ by a symbol or index $j$, standing in general for a collection of independent quantum numbers. (For SU(2), $j$ is a single numerical label taking values $0, 1/2, 1, 3/2, \ldots$.) Within the $j$th UIR, realized on a Hilbert space $\mathcal{H}^{(j)}$ of finite dimension $N_j$, we shall write $(D^j_{m'm}(g))$ for the unitary matrices representing elements $g \in G$ in a suitable orthonormal basis. The row and column indices $m', m$ are generalizations of the magnetic quantum number in angular momentum theory; like $j$, they too, in general, stand for collections of independent quantum numbers. (For SU(2), $N_j = 2j+1$ and $m = j, j-1, \ldots, -j$.) In terms of a normalized translation invariant volume element $dg$ and associated invariant delta function $\delta(g)$ on $G$, these matrices obey
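the orthogonality and completeness conditions
$$
\begin{aligned}
\int_G dg\, D^{j}_{mn}(g)\, D^{j'}_{m'n'}(g)^* &= \delta_{jj'}\,\delta_{mm'}\,\delta_{nn'}/N_j, \\
\sum_{jmn} N_j\, D^{j}_{mn}(g)\, D^{j}_{mn}(g')^* &= \delta\!\left(g^{-1}g'\right).
\end{aligned}
\tag{2.1}
$$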
We now define the Schwinger rep of $G$ to be the simply reducible UR
$$D_0 = \bigoplus_j D^j \tag{2.2}$$
acting on the direct sum Hilbert space
$$\mathcal{H}_0 = \bigoplus_j \mathcal{H}^{(j)}, \tag{2.3}$$
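the $j$th UIR $D^j$ acting on the subspace $\mathcal{H}^{(j)}$ of $\mathcal{H}_0$. Thus every UIR $D^j$ of $G$ occurs exactly once in this UR. For the Lie group case, $\mathcal{H}_0$ is of infinite dimension; while if $G$ is a finite group, $\mathcal{H}_0$ is of finite dimension. We can set up orthonormal bases within each $\mathcal{H}^{(j)}$, constituting all together an orthonormal basis for $\mathcal{H}_0$, as follows:
$$
\begin{aligned}
\mathcal{H}^{(j)} &= \mathrm{Sp}\{\,|jm\rangle \mid j \text{ fixed},\ m \text{ varying}\,\}, \\
\mathcal{H}_0 &= \mathrm{Sp}\{\,|jm\rangle \mid j, m \text{ varying}\,\}, \\
\langle j'm'|jm\rangle &= \delta_{j'j}\,\delta_{m'm},
\end{aligned}
\tag{2.4}
$$
so that we have
$$\langle j'm'|D_0(g)|jm\rangle = \delta_{j'j}\, D^{j}_{m'm}(g). \tag{2.5}$$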
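As a concrete illustration of the definition (2.2)–(2.3) and of the orthogonality relation in (2.1), one can take a small finite group. The sketch below is an assumption-laden example, not part of the original construction: it uses the symmetric group $S_3$ with a hand-written real form of its three UIR's (labelled here "trivial", "sign" and "standard"; these names and all helper functions are introduced only for this example), replaces $\int_G dg$ by the group average, and compares the dimension of the Schwinger rep with that of the regular representation.

```python
import math
from itertools import product

# Elements of S3 written as r^k f^l: r is a 3-cycle (k = 0, 1, 2), f a transposition (l = 0, 1).
C, S = -0.5, math.sqrt(3) / 2
R = [[C, -S], [S, C]]          # 2D rotation by 120 degrees, represents r
F = [[1.0, 0.0], [0.0, -1.0]]  # reflection, represents f

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def power(a, k):
    out = [[1.0 if i == j else 0.0 for j in range(len(a))] for i in range(len(a))]
    for _ in range(k):
        out = matmul(out, a)
    return out

elements = [(k, l) for k in range(3) for l in range(2)]

# The three inequivalent UIR's of S3 (labels are illustrative, chosen for this sketch).
irreps = {
    "trivial": lambda k, l: [[1.0]],
    "sign": lambda k, l: [[(-1.0) ** l]],
    "standard": lambda k, l: matmul(power(R, k), power(F, l)),
}
dims = {"trivial": 1, "sign": 1, "standard": 2}

def inner(a, b):
    """Group average of D^j_{mn}(g) * D^{j'}_{m'n'}(g); the matrices used here are real."""
    (j1, m1, n1), (j2, m2, n2) = a, b
    total = sum(irreps[j1](k, l)[m1][n1] * irreps[j2](k, l)[m2][n2] for k, l in elements)
    return total / len(elements)

labels = [(j, m, n) for j in irreps for m in range(dims[j]) for n in range(dims[j])]

# Largest deviation from delta_{jj'} delta_{mm'} delta_{nn'} / N_j over all pairs of labels.
max_error = 0.0
for a, b in product(labels, repeat=2):
    expected = 1.0 / dims[a[0]] if a == b else 0.0
    max_error = max(max_error, abs(inner(a, b) - expected))

schwinger_dim = sum(dims.values())               # each UIR once
regular_dim = sum(d * d for d in dims.values())  # each UIR N_j times, equals the group order

print(max_error, schwinger_dim, regular_dim)
```

In this toy case the Schwinger rep has dimension 4, while the regular representation has dimension 6, the order of the group.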
We give now some immediate consequences of this definition, as well as some familiar examples:

(i) If $G$ is abelian, each UIR is one-dimensional, $N_j = 1$, and the Schwinger rep is the same as the regular representation acting in the usual way (by left or by right translations, which coincide) on square integrable functions on $G$. For nonabelian $G$, the Schwinger rep is always "leaner" than the regular representation since there are always some UIR's with $N_j > 1$. From this point of view, the case of simple $G$ is the exact opposite of abelian $G$: no subgroup is normal in the former, every one is normal in the latter. Thus for simple $G$ we expect qualitatively that the Schwinger rep will be "much smaller" than the regular representation.

(ii) When $G$ is a compact simple Lie group, we can characterize the Schwinger rep in an interesting way. In every UR of $G$, the generators are hermitian operators obeying the commutation relations corresponding to the Lie algebra $\underline{G}$ of $G$. In any individual UIR, apart from the commutation relations, the generators also obey some algebraic (symmetric polynomial) relations characteristic of
that UIR. In $D_0$ however, no such algebraic relations are obeyed since every UIR is present. In other words, the generators of the Schwinger rep $D_0$ on $\mathcal{H}_0$ provide in a sense a minimal faithful representation of the enveloping algebra of $\underline{G}$: they are not subject to any algebraic relations beyond the commutation relations.

(iii) The simple reducibility of $D_0$ implies that the commutant of $D_0$ is particularly simple: any operator $\hat{A}$ on $\mathcal{H}_0$ commuting with $D_0(g)$ for all $g$ is necessarily block diagonal, with each entry being some numerical multiple of the unit operator:
$$
\begin{aligned}
\hat{A} D_0(g) = D_0(g)\hat{A}, \ \text{all } g \in G \ &\Rightarrow\ \hat{A} = \bigoplus_j \hat{A}_j, \\
\hat{A}_j = c_j\, 1_j, \quad 1_j &= \text{unit operator on } \mathcal{H}^{(j)}.
\end{aligned}
\tag{2.6}
$$
This follows from Schur's Lemma and the Wigner–Eckart theorem. Thus this commutant is commutative.

(iv) The Schwinger rep concept can be extended heuristically to the noncompact case $G = \mathbb{R}^n$, leading to an interesting perspective relevant to quantum mechanics. For a quantum system with Cartesian configuration space $Q = \mathbb{R}^n$, corresponding to $n$ canonical Heisenberg pairs of hermitian operators $\hat{q}_r, \hat{p}_r$, $r = 1, 2, \ldots, n$, among whom the only nonzero commutators are
$$[\hat{q}_r, \hat{p}_s] = i\,\delta_{rs}, \tag{2.7}$$
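the Stone–von Neumann theorem tells us that up to unitary equivalence there is only one irreducible representation of these relations. The Hilbert space can be described via coordinate space wave functions $\psi(q)$ or via momentum space wave functions $\phi(p)$:
$$
\begin{aligned}
\mathcal{H} = L^2(\mathbb{R}^n) &= \Big\{\psi(q) \in \mathbb{C} \;\Big|\; \|\psi\|^2 = \int_{\mathbb{R}^n} d^nq\, |\psi(q)|^2 < \infty \Big\} \\
&= \Big\{\phi(p) \in \mathbb{C} \;\Big|\; \|\phi\|^2 = \int_{\mathbb{R}^n} d^np\, |\phi(p)|^2 < \infty \Big\}, \\
\phi(p) &= (2\pi)^{-n/2} \int_{\mathbb{R}^n} d^nq\; e^{-iq\cdot p}\, \psi(q), \qquad \|\phi\| = \|\psi\|; \\
(\hat{q}_r\psi)(q) &= q_r\,\psi(q), \qquad (\hat{p}_r\psi)(q) = -i\,\frac{\partial}{\partial q_r}\psi(q); \\
(\hat{q}_r\phi)(p) &= i\,\frac{\partial}{\partial p_r}\phi(p), \qquad (\hat{p}_r\phi)(p) = p_r\,\phi(p).
\end{aligned}
\tag{2.8}
$$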
In this context, these operator actions are usually viewed as providing us after exponentiation with the (unique) Stone–von Neumann UIR of the $(2n+1)$-dimensional nonabelian Heisenberg–Weyl group of phase space displacements, the generators being $\hat{q}_r$, $\hat{p}_r$ and the unit operator on $\mathcal{H}$. However, the situation can now be viewed in an alternative manner: each real numerical $n$-dimensional momentum vector $p$ corresponds to a one-dimensional UIR of the abelian group of configuration space translations $G = \mathbb{R}^n$: $q \to q + a$; as $p$ ranges over all of momentum space $\mathbb{R}^n$, each such UIR is present in $\mathcal{H}$ exactly once. (Another way of expressing this is the statement that the Cartesian momenta $\hat{p}_r$ form a complete commuting set.) Thus we can view the kinematics of $n$-dimensional Cartesian quantum mechanics in two ways: we have the unique Stone–von Neumann UIR of the $(2n+1)$-dimensional nonabelian Heisenberg–Weyl group, or equally well, we have the Schwinger rep of the abelian group $G = \mathbb{R}^n$ of configuration space displacements.

(v) The original Schwinger oscillator construction of SU(2) leads upon exponentiation to the Schwinger rep of SU(2) in the sense defined above. (The SU(2) notational details will be taken up in Sec. 3.) Each UIR of SU(2) for $j = 0, 1/2, 1, \ldots$ appears exactly once. In the case of $SO(3) = SU(2)/Z_2$, the distinct UIR's are usually labeled by $\ell = 0, 1, 2, \ldots$; these are the integer $j$ UIR's of SU(2). The familiar UR of SO(3) on square integrable functions on $S^2$, with the simple geometric action of SO(3) elements, is a realization of the Schwinger rep of SO(3). The reduction into UIR's in a multiplicity-free manner is achieved, as is familiar, by using the orthonormal basis provided by the spherical harmonics on $S^2$. In Secs. 3 and 5 we describe other ways of constructing the Schwinger rep's of SU(2) and SO(3), respectively.

After these immediate properties and examples, we make some general remarks.
Purely from the representation theory point of view, the Schwinger rep $D_0$ of $G$ is completely defined by the statement in (2.2) and (2.3) of its UIR content. However, from the point of view of possible applications in the framework of quantum mechanics, considerable interest attaches to various ways in which this UR may be realized, with corresponding carrier spaces and group actions.

A general way to construct UR's of a group $G$ is by the process of induction starting from UIR's of some subgroup [14]. Let $H \subset G$ be some subgroup, and $D_0$ be a UIR of $H$. Then by an elegantly simple construction, one arrives at an induced UR $D_H^{(\mathrm{ind},D_0)}$ of $G$: the notation indicates the roles of $H$, $D_0$ and the inducing procedure. Once this UR of $G$ has been obtained, one can ask for its UIR content. Here the main result is the reciprocity theorem. The UR $D_H^{(\mathrm{ind},D_0)}$ of $G$ contains the UIR $D^j$ of $G$ as many times as $D^j$ contains $D_0$ upon restriction from $G$ to $H$. One can now ask whether the Schwinger rep of $G$ arises as a particular induced UR corresponding to some carefully chosen $H$ and $D_0$.
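The counting behind the reciprocity theorem can be sketched for the simplest case. The example below assumes $G$ = SU(2) and $H$ = U(1) generated by $J_3$, whose one-dimensional UIR's are labelled by the weight $m$; it uses only the familiar fact that the spin-$j$ UIR contains the weight $m$ once if $m \in \{-j, \ldots, j\}$ and not at all otherwise. The function names are hypothetical helpers introduced for this illustration.

```python
from fractions import Fraction

def weight_multiplicity(j, m):
    """Number of times the U(1) weight m occurs in the spin-j UIR of SU(2)."""
    return 1 if abs(m) <= j and (j - m).denominator == 1 else 0

def induced_content(m, j_max):
    """By reciprocity: multiplicity of each spin j (up to j_max) in the UR induced from weight m."""
    spins = [Fraction(k, 2) for k in range(int(2 * j_max) + 1)]
    return {j: weight_multiplicity(j, m) for j in spins}

j_max = Fraction(5)
from_zero = induced_content(Fraction(0), j_max)     # picks up the integer spins
from_half = induced_content(Fraction(1, 2), j_max)  # picks up the half-odd-integer spins
from_one = induced_content(Fraction(1), j_max)      # misses j = 0

# Direct sum of the m = 0 and m = 1/2 induced UR's: every spin occurs exactly once.
combined = {j: from_zero[j] + from_half[j] for j in from_zero}
print(sorted(j for j, mult in combined.items() if mult != 1))
```

Inducing from a single weight never yields every spin exactly once; in this truncated count the combination of $m = 0$ and $m = 1/2$ does.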
In the case of SU(2), a natural subgroup choice is $H = U(1)$ generated by $J_3$ in the usual notation, with eigenvalues being the magnetic quantum number $m$. However, as a quick analysis using the reciprocity theorem shows, we find the result:
$$D_0 \ \text{for } SU(2) = D^{(\mathrm{ind},0)}_{U(1)} \oplus D^{(\mathrm{ind},1/2)}_{U(1)}. \tag{2.9}$$
(Here the superscripts 0 and 1/2 on the right-hand side indicate the $m$ values determining the U(1) UIR's used in the inducing process.) The first term on the right accounts for all the integer $j$ UIR's of SU(2), while the second term accounts for the remaining half odd integer $j$ UIR's. In the case of SO(3), we may choose $H = SO(2)$ and then we have
$$D_0 \ \text{for } SO(3) = D^{(\mathrm{ind},0)}_{SO(2)}. \tag{2.10}$$
So in this case the Schwinger rep is indeed a particular induced representation. For SU(3), this situation continues to hold [15]. Each UIR of SU(3) is labeled by a pair of independent nonnegative integers, as $(p, q)$. It is a fact that every UIR $(p, q)$ contains the trivial (one-dimensional) UIR of the canonical SU(2) subgroup exactly once. Thus, from the reciprocity theorem, we see that
$$D_0 \ \text{for } SU(3) = D^{(\mathrm{ind},0)}_{SU(2)}, \tag{2.11}$$
where the zero in the superscript on the right stands for the trivial $j = 0$ UIR of SU(2). However, this trend does not continue for SU(n) beyond $n = 3$.$^a$ In fact, we show in Sec. 6 that the Schwinger rep of SU(n) for $n \geq 4$ is not an induced UR corresponding to any choice of UIR of the canonical SU(n − 1) subgroup of SU(n). There is thus a need to develop an alternative method to construct the Schwinger rep of SU(n) which works uniformly for all $n \geq 2$. This will be done for SU(2) in the next section, for SO(3) in Sec. 5, and for SU(n) in Sec. 6.

$^a$ In the work on "model representations" [13], the inducing construction does lead to all such representations for the noncompact groups considered.

3. The SU(2) Schwinger Representation

To set notations we begin by recalling the defining UIR and Euler angle parametrization of SU(2) [16]. An element $g \in SU(2)$ is a $2 \times 2$ unitary unimodular matrix
$$g = \begin{pmatrix} \xi & -\eta^* \\ \eta & \xi^* \end{pmatrix}, \qquad \xi, \eta \in \mathbb{C}, \qquad |\xi|^2 + |\eta|^2 = 1. \tag{3.1}$$
The hermitian generators are $\tfrac{1}{2}\sigma_r$, where $\sigma_r$ for $r = 1, 2, 3$ are the Pauli matrices. The commutation relations are
$$\left[\tfrac{1}{2}\sigma_r, \tfrac{1}{2}\sigma_s\right] = i\,\epsilon_{rst}\,\tfrac{1}{2}\sigma_t. \tag{3.2}$$
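The commutation relations (3.2) are easy to confirm by direct matrix multiplication. The following is a minimal check using plain nested lists, assuming the standard form of the Pauli matrices; the helper names (`matmul`, `epsilon`, `max_deviation`) are introduced only for this sketch.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def epsilon(r, s, t):
    """Levi-Civita symbol for indices taken from {0, 1, 2}."""
    return (r - s) * (s - t) * (t - r) // 2

# Pauli matrices sigma_1, sigma_2, sigma_3 (stored at positions 0, 1, 2).
sigma = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
half = [[[entry / 2 for entry in row] for row in s] for s in sigma]

# Largest entry of [sigma_r/2, sigma_s/2] - i * eps_{rst} * sigma_t/2 over all r, s.
max_deviation = 0.0
for r in range(3):
    for s in range(3):
        left, right = matmul(half[r], half[s]), matmul(half[s], half[r])
        for i in range(2):
            for j in range(2):
                expected = sum(1j * epsilon(r, s, t) * half[t][i][j] for t in range(3))
                max_deviation = max(max_deviation, abs(left[i][j] - right[i][j] - expected))

print(max_deviation)
```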
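In the Euler angle parametrization, we express $g$ as a product of three factors:
$$
\begin{aligned}
g(\alpha, \beta, \gamma) &= e^{-i\alpha\sigma_3/2}\, e^{-i\beta\sigma_2/2}\, e^{-i\gamma\sigma_3/2} \\
&= \begin{pmatrix} e^{-i(\alpha+\gamma)/2}\cos\beta/2 & -e^{-i(\alpha-\gamma)/2}\sin\beta/2 \\ e^{i(\alpha-\gamma)/2}\sin\beta/2 & e^{i(\alpha+\gamma)/2}\cos\beta/2 \end{pmatrix}, \\
\text{i.e.}\quad \xi &= e^{-i(\alpha+\gamma)/2}\cos\beta/2, \qquad \eta = e^{i(\alpha-\gamma)/2}\sin\beta/2.
\end{aligned}
\tag{3.3}
$$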
The ranges for $\alpha, \beta, \gamma$ are determined by the condition that (except possibly on a set of measure zero) each element (3.1) must occur just once. Then one finds$^b$:
$$
\begin{aligned}
0 \leq |\xi| \leq 1 \ &\Leftrightarrow\ 0 \leq \beta \leq \pi; \\
0 \leq \arg\xi, \arg\eta \leq 2\pi \ &\Leftrightarrow\ 0 \leq \alpha \leq 2\pi, \quad 0 \leq \gamma \leq 4\pi.
\end{aligned}
\tag{3.4}
$$
The elements $g(0, 0, \gamma)$ for $0 \leq \gamma \leq 4\pi$ constitute the diagonal U(1) subgroup of SU(2). Since $\alpha$ and $\beta$ can be interpreted as azimuthal and polar angles on $S^2$, the form for $g(\alpha, \beta, \gamma)$ in (3.3) is in manifest agreement with the statement $SU(2)/U(1) = S^2$. The normalized invariant volume element is
$$dg = d\alpha\, \sin\beta\, d\beta\, d\gamma / 16\pi^2. \tag{3.5}$$
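The factorized and closed forms in (3.3), and the normalization of (3.5) over the ranges (3.4), can be cross-checked numerically. The sketch below is only an illustration: it writes the three exponential factors as explicit $2 \times 2$ matrices (using $e^{-i\beta\sigma_2/2} = \cos(\beta/2) - i\sigma_2\sin(\beta/2)$), multiplies them, and compares with the matrix built from $\xi$ and $\eta$. The function names and the sample angles are arbitrary choices made for this example.

```python
import cmath
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def euler_product(alpha, beta, gamma):
    """exp(-i alpha sigma_3/2) exp(-i beta sigma_2/2) exp(-i gamma sigma_3/2) as explicit matrices."""
    a = [[cmath.exp(-1j * alpha / 2), 0], [0, cmath.exp(1j * alpha / 2)]]
    b = [[math.cos(beta / 2), -math.sin(beta / 2)], [math.sin(beta / 2), math.cos(beta / 2)]]
    c = [[cmath.exp(-1j * gamma / 2), 0], [0, cmath.exp(1j * gamma / 2)]]
    return matmul(matmul(a, b), c)

def closed_form(alpha, beta, gamma):
    """The matrix (3.1) with xi and eta taken from (3.3)."""
    xi = cmath.exp(-1j * (alpha + gamma) / 2) * math.cos(beta / 2)
    eta = cmath.exp(1j * (alpha - gamma) / 2) * math.sin(beta / 2)
    return [[xi, -eta.conjugate()], [eta, xi.conjugate()]]

alpha, beta, gamma = 0.9, 2.1, 7.3  # arbitrary sample point inside the ranges (3.4)
g1, g2 = euler_product(alpha, beta, gamma), closed_form(alpha, beta, gamma)

mismatch = max(abs(g1[i][j] - g2[i][j]) for i in range(2) for j in range(2))
determinant = g2[0][0] * g2[1][1] - g2[0][1] * g2[1][0]

# Total volume with (3.5): alpha range 2*pi, integral of sin(beta) over [0, pi] is 2, gamma range 4*pi.
total_volume = (2 * math.pi) * 2.0 * (4 * math.pi) / (16 * math.pi ** 2)

print(mismatch, determinant, total_volume)
```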
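The unitary representation matrices in the $j$th UIR are, as is familiar [17]:
$$\langle jm|D^j(\alpha, \beta, \gamma)|jn\rangle \equiv D^j_{mn}(\alpha, \beta, \gamma) = e^{-im\alpha - in\gamma}\, d^j_{mn}(\beta) \tag{3.6}$$
with $d^j_{mn}(\beta)$ real. In verifying the orthogonality relation
$$\int_{SU(2)} dg\; D^{j}_{mn}(\alpha, \beta, \gamma)\, D^{j'}_{m'n'}(\alpha, \beta, \gamma)^* = \delta_{jj'}\,\delta_{mm'}\,\delta_{nn'}/(2j+1), \tag{3.7}$$
it is necessary to keep in mind the asymmetry between $\alpha$ and $\gamma$ in (3.4). Thus it is simplest to first carry out the $\gamma$ integration producing the factor $\delta_{nn'}$. This implies that $j - j'$ and $m - m'$ are both integral. Then doing the $\alpha$ integration second leads to $\delta_{mm'}$; and finally the $\beta$ integration produces $\delta_{jj'}$.

The two regular representations of SU(2) act on the Hilbert space $\mathcal{H}$ of square integrable functions on SU(2) [18]:
$$\mathcal{H} = \Big\{\psi(\alpha, \beta, \gamma) \in \mathbb{C} \;\Big|\; \|\psi\|^2 = \frac{1}{16\pi^2} \int_0^{4\pi} d\gamma \int_0^{2\pi} d\alpha \int_0^{\pi} \sin\beta\, d\beta\; |\psi(\alpha, \beta, \gamma)|^2 < \infty \Big\}. \tag{3.8}$$

$^b$ It is to be noted that in J. Schwinger [1], Eq. (2.61), the ranges chosen are $0 \leq \alpha, \gamma \leq 4\pi$, $0 \leq \beta \leq \pi$, which amounts to covering SU(2) twice. In [16], the ranges chosen are $0 \leq \alpha, \gamma \leq 2\pi$, $0 \leq \beta \leq \pi$ and $2\pi \leq \beta \leq 3\pi$.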
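When convenient we write $\psi(g)$ instead of $\psi(\alpha, \beta, \gamma)$. The left regular representation of SU(2) is given by unitary operators $U(g')$, $g' \in SU(2)$, acting on $\psi$ as
$$(U(g')\psi)(g) = \psi\!\left(g'^{-1}g\right). \tag{3.9}$$
Similarly the right regular representation is given by unitary operators $\tilde{U}(g')$:
$$(\tilde{U}(g')\psi)(g) = \psi(gg'). \tag{3.10}$$
They obey
$$
\begin{aligned}
U(g')U(g) &= U(g'g), \\
\tilde{U}(g')\tilde{U}(g) &= \tilde{U}(g'g), \\
\tilde{U}(g')U(g) &= U(g)\tilde{U}(g').
\end{aligned}
\tag{3.11}
$$
The generators $J_r$ of $U(g)$ such that
$$U(g(\alpha, \beta, \gamma)) = e^{-i\alpha J_3}\, e^{-i\beta J_2}\, e^{-i\gamma J_3} \tag{3.12}$$
are
$$
\begin{aligned}
J_1 &= i\left(\cos\alpha\,\cot\beta\,\frac{\partial}{\partial\alpha} + \sin\alpha\,\frac{\partial}{\partial\beta} - \frac{\cos\alpha}{\sin\beta}\,\frac{\partial}{\partial\gamma}\right), \\
J_2 &= i\left(\sin\alpha\,\cot\beta\,\frac{\partial}{\partial\alpha} - \cos\alpha\,\frac{\partial}{\partial\beta} - \frac{\sin\alpha}{\sin\beta}\,\frac{\partial}{\partial\gamma}\right), \\
J_3 &= -i\,\frac{\partial}{\partial\alpha}.
\end{aligned}
\tag{3.13}
$$
Similarly, the generators $\tilde{J}_r$ of $\tilde{U}(g)$ are
$$
\begin{aligned}
\tilde{J}_1 &= i\left(-\frac{\cos\gamma}{\sin\beta}\,\frac{\partial}{\partial\alpha} + \sin\gamma\,\frac{\partial}{\partial\beta} + \cos\gamma\,\cot\beta\,\frac{\partial}{\partial\gamma}\right), \\
\tilde{J}_2 &= i\left(\frac{\sin\gamma}{\sin\beta}\,\frac{\partial}{\partial\alpha} + \cos\gamma\,\frac{\partial}{\partial\beta} - \sin\gamma\,\cot\beta\,\frac{\partial}{\partial\gamma}\right), \\
\tilde{J}_3 &= i\,\frac{\partial}{\partial\gamma}.
\end{aligned}
\tag{3.14}
$$
The complete set of commutation relations among them is
$$[J_r, J_s] = i\,\epsilon_{rst}J_t, \qquad [\tilde{J}_r, \tilde{J}_s] = i\,\epsilon_{rst}\tilde{J}_t, \qquad [J_r, \tilde{J}_s] = 0. \tag{3.15}$$
Thus the left representation generators are right translation invariant and vice versa. As is well known, these two sets of generators share a common Casimir invariant,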
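and are related by the adjoint UIR of SU(2), namely the defining representation of SO(3):
$$J^2 = J_rJ_r = \tilde{J}_r\tilde{J}_r, \qquad \tilde{J}_r = -R_{sr}(\alpha, \beta, \gamma)\,J_s. \tag{3.16}$$
Acting on $D^j_{mn}(\alpha, \beta, \gamma)$, we have:
$$
\begin{aligned}
J_3\, D^j_{mn}(\alpha, \beta, \gamma) &= -m\, D^j_{mn}(\alpha, \beta, \gamma), \\
\tilde{J}_3\, D^j_{mn}(\alpha, \beta, \gamma) &= n\, D^j_{mn}(\alpha, \beta, \gamma), \\
J^2\, D^j_{mn}(\alpha, \beta, \gamma) &= j(j+1)\, D^j_{mn}(\alpha, \beta, \gamma).
\end{aligned}
\tag{3.17}
$$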
We now develop a method to extract the Schwinger rep of SU(2) from the (left) regular representation, in a way which generalizes to all SU(n). The functions $(2j+1)^{1/2}D^j_{mn}(\alpha, \beta, \gamma)$ for all $j, m, n$ form an orthonormal basis for $\mathcal{H}$ in which the two commuting UR's $U(g)$, $\tilde{U}(g)$ are simultaneously reduced into UIR's. In the UR $U(g)$, each UIR $j$ of SU(2) occurs $(2j+1)$ times, and the quantum number $n$, eigenvalue of $\tilde{J}_3$, acts as a multiplicity index. (Conversely, $m$ plays this role for the reduction of $\tilde{U}(g)$.) We can then see that if we restrict ourselves to the subset of basis functions $D^j_{mj}(\alpha, \beta, \gamma)$ with maximum possible value $j$ for the eigenvalue $n$ of $\tilde{J}_3$, and to the subspace of $\mathcal{H}$ spanned by these functions, we pick up each UIR of SU(2) exactly once from the reduction of $U(g)$. This leads to the identification of a subspace $\mathcal{H}_0 \subset \mathcal{H}$ by the definition
$$\mathcal{H}_0 = \{\psi(\alpha, \beta, \gamma) \in \mathcal{H} \mid (\tilde{J}_1 + i\tilde{J}_2)\,\psi(\alpha, \beta, \gamma) = 0\}. \tag{3.18}$$
(Strictly speaking, wave functions in the domain of and annihilated by $\tilde{J}_1 + i\tilde{J}_2$ form a dense set in $\mathcal{H}_0$, which upon completion gives $\mathcal{H}_0$.) On the other hand, we know in advance that
$$\mathcal{H}_0 = \mathrm{Sp}\left\{(2j+1)^{1/2}D^j_{mj}(\alpha, \beta, \gamma),\ j = 0, \tfrac{1}{2}, 1, \ldots,\ m = j, j-1, \ldots, -j\right\}. \tag{3.19}$$
The equivalence of (3.18) and (3.19) can be directly established as follows. The condition defining wave functions in $\mathcal{H}_0$ reads
$$\left(i\,\frac{\partial}{\partial\gamma} - \tan\beta\,\frac{\partial}{\partial\beta} - \frac{i}{\cos\beta}\,\frac{\partial}{\partial\alpha}\right)\psi(\alpha, \beta, \gamma) = 0. \tag{3.20}$$
This is a complex first order partial differential equation whereas $\alpha, \beta, \gamma$ are all real. Therefore, we cannot conclude that $\psi(\alpha, \beta, \gamma)$ is effectively reduced to a function of two independent real combinations of $\alpha, \beta, \gamma$. Essentially, this is like imposing the Cauchy–Riemann equations, $\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)f(x, y) = 0$, on a complex function of two real variables. The result is that $f(x, y)$ has to be an analytic function of the complex combination $z = x + iy$. Considering first combinations of $\alpha$ and $\beta$, and then of $\gamma$ and $\beta$, which obey (3.20), we find that $\psi(\alpha, \beta, \gamma)$ can be any analytic function of $e^{i\alpha}\tan\beta/2$ and $e^{-i\gamma}\sin\beta$. (The analyticity condition arises because the complex
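conjugate combinations $e^{-i\alpha}\tan\beta/2$, $e^{i\gamma}\sin\beta$ do not obey (3.20).) However, this is equivalent to the statement that $\psi(\alpha, \beta, \gamma)$ must be an analytic function of $\xi, \eta$ of (3.3):
$$\psi \in \mathcal{H}_0 \ \Leftrightarrow\ \psi(\alpha, \beta, \gamma) = f(\xi, \eta). \tag{3.21}$$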
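The statement (3.21) can be probed numerically. The sketch below assumes the constraint in the form $\left(i\,\partial_\gamma - \tan\beta\,\partial_\beta - \frac{i}{\cos\beta}\,\partial_\alpha\right)\psi = 0$, which is what $(\tilde{J}_1 + i\tilde{J}_2)\psi = 0$ reduces to with the generators (3.14). It evaluates the left-hand side by central finite differences for a sample polynomial in $\xi, \eta$ and for the non-analytic function $\xi^*$. The test functions, the step size and the sample point are arbitrary choices made for this illustration.

```python
import cmath
import math

def xi(alpha, beta, gamma):
    return cmath.exp(-1j * (alpha + gamma) / 2) * math.cos(beta / 2)

def eta(alpha, beta, gamma):
    return cmath.exp(1j * (alpha - gamma) / 2) * math.sin(beta / 2)

def constraint(psi, alpha, beta, gamma, h=1e-5):
    """Left-hand side of the constraint, with partial derivatives from central differences."""
    d_alpha = (psi(alpha + h, beta, gamma) - psi(alpha - h, beta, gamma)) / (2 * h)
    d_beta = (psi(alpha, beta + h, gamma) - psi(alpha, beta - h, gamma)) / (2 * h)
    d_gamma = (psi(alpha, beta, gamma + h) - psi(alpha, beta, gamma - h)) / (2 * h)
    return 1j * d_gamma - math.tan(beta) * d_beta - 1j / math.cos(beta) * d_alpha

def analytic(alpha, beta, gamma):
    """A polynomial in xi and eta only."""
    x, e = xi(alpha, beta, gamma), eta(alpha, beta, gamma)
    return x ** 3 * e ** 2 + 0.5 * x

def non_analytic(alpha, beta, gamma):
    """Depends on the complex conjugate of xi."""
    return xi(alpha, beta, gamma).conjugate()

point = (0.7, 1.1, 2.3)  # sample angles with cos(beta) well away from zero
residual_analytic = abs(constraint(analytic, *point))
residual_conjugate = abs(constraint(non_analytic, *point))

print(residual_analytic, residual_conjugate)
```

The first residual vanishes up to discretization error, while the second is of order one, in line with the remark that the complex conjugate combinations do not obey the constraint.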
On the other hand the functions $D^j_{mj}(\alpha, \beta, \gamma)$ are known to be given by [17]:
$$
\begin{aligned}
D^j_{mj}(\alpha, \beta, \gamma) &= \sqrt{(2j)!}\; u_{jm}(\xi, \eta), \\
u_{jm}(\xi, \eta) &= \xi^{j+m}\,\eta^{j-m} / \sqrt{(j+m)!\,(j-m)!},
\end{aligned}
\tag{3.22}
$$
so the equivalence of (3.18) with (3.19) follows. To cast the UIR's present in $\mathcal{H}_0$ into the standard forms of quantum angular momentum theory, we notice from (3.17) that the eigenvalue of $J_3$ is $-m$, and as a short calculation shows:
$$(J_1 + iJ_2)\,D^j_{mj}(\alpha, \beta, \gamma) = -\sqrt{(j+m)(j-m+1)}\; D^j_{m-1,j}(\alpha, \beta, \gamma). \tag{3.23}$$
If we therefore define the family of wave functions
$$
\begin{aligned}
Y_{jm}(\alpha, \beta, \gamma) &= (-1)^{j-m}(2j+1)^{1/2}\, D^j_{-m,j}(\alpha, \beta, \gamma) \\
&= \sqrt{\frac{(2j+1)!}{(j+m)!\,(j-m)!}}\;\eta^{j+m}(-\xi)^{j-m} \\
&= \sqrt{(2j+1)!}\; u_{jm}(\eta, -\xi), \qquad j = 0, 1/2, 1, \ldots, \quad m = j, j-1, \ldots, -j,
\end{aligned}
\tag{3.24}
$$
they form an orthonormal basis for $\mathcal{H}_0$,
$$\frac{1}{16\pi^2}\int_0^{\pi}\sin\beta\, d\beta \int_0^{2\pi} d\alpha \int_0^{4\pi} d\gamma\; Y_{jm}(\alpha, \beta, \gamma)\, Y_{j'm'}(\alpha, \beta, \gamma)^* = \delta_{jj'}\,\delta_{mm'}; \tag{3.25}$$
and moreover for each fixed $j$, the $Y_{jm}(\alpha, \beta, \gamma)$ transform under the left regular representation according to the standard form of the $j$th UIR of SU(2). The restriction of the left regular representation from $\mathcal{H}$ to $\mathcal{H}_0$ may be denoted by $D_0$, and it is a realization of the Schwinger rep of SU(2).

The following comments may be made concerning the specific way in which the carrier space above has been obtained. It is important to notice that each basis function $Y_{jm}(\alpha, \beta, \gamma)$ retains a dependence on each of the three real independent arguments. This can be easily seen when verifying the orthonormality condition (3.25): doing the $\gamma$ integration first produces $\delta_{jj'}$, the $\alpha$ integration next produces $\delta_{mm'}$, while the final $\beta$ integration produces the correct normalization. This is similar to the comments made earlier in connection with Eq. (3.7). This means that the extraction of the subspace $\mathcal{H}_0$ within the space $\mathcal{H} = L^2(SU(2))$ carrying the regular representations, since it involves limiting oneself to solutions of a complex differential equation, does not amount to limiting oneself to functions defined on a lower dimensional submanifold of the full "configuration space" SU(2). In other words, the limitation to a subspace at the vector space level is not achieved by a limitation to any submanifold of the group manifold. This is similar to the relationships
among the position, momentum and Bargmann representations of the Heisenberg canonical commutation relations in quantum mechanics. While the first two can be handled in the real realm via the concept of polarization of a symplectic structure, the third brings in complex quantities in a novel manner.

Moreover, to further clarify the meaning of the functions $Y_{jm}(\alpha,\beta,\gamma)$, namely that they essentially depend on the three variables, and that obtaining the Schwinger rep from the left regular representation does not require taking a quotient of the group manifold, it is possible to study their relation to the generalized coherent states for the group $SU(2)$. As is well known [19], if the fiducial vector in each finite dimensional UIR of $SU(2)$ is chosen to be the highest weight in the Cartan–Weyl setting, then the coherent states are in correspondence with points of a 2-sphere $S^2 \simeq SU(2)/U(1)$, where, with the standard identification, $\gamma$ has been quotiented away:
$$\langle j, m \,|\, \alpha\beta\rangle = D^j_{mj}(\alpha,\beta,\gamma = 0). \tag{3.26}$$
A direct check then shows that the functions $Y_{jm}(\alpha,\beta,\gamma)$ are
$$Y_{jm}(\alpha,\beta,\gamma) = e^{-i\gamma j}\,\langle j, m \,|\, \alpha,\beta\rangle. \tag{3.27}$$
This shows, once more, that the $Y_{jm}$ functions do depend on the three variables, so obtaining the Schwinger rep from the left regular representation does not require taking a quotient of the group manifold of $SU(2)$. Secondly, in this carrier space each basis function is a single term expression, a monomial, rather than a sum of several distinct terms, which is the case for a general $D^j_{mn}(\alpha,\beta,\gamma)$ and for the usual spherical harmonics on $S^2$. In the next section, we exploit these features to connect this form of the $SU(2)$ Schwinger rep to other known results.

4. Applications of SU(2) Schwinger Representation

In this section, we use the construction of the previous section to link up to the original Schwinger oscillator operator construction for $SU(2)$, and to the Majorana theorem on the geometrical representation of pure states for a spin $j$ system for any $j$.

4.1. The Schwinger oscillator construction

The orthonormality relation (3.25) for the basis functions $Y_{jm}(\alpha,\beta,\gamma)$ of $\mathcal{H}_0$ can be exhibited in an alternative form suggesting an interesting generalization. Introduce two independent complex variables $z_1, z_2$ proportional to $\eta, -\xi$:
$$z_1 = \rho\eta = \rho\, e^{i(\alpha-\gamma)/2}\sin\beta/2, \qquad z_2 = -\rho\xi = -\rho\, e^{-i(\alpha+\gamma)/2}\cos\beta/2,$$
$$|z_1|^2 + |z_2|^2 = \rho^2, \qquad 0 \le \rho < \infty. \tag{4.1}$$
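The uniform integration measure over the two complex planes is
$$d^2z_1\, d^2z_2 \equiv |z_1|\,|z_2|\; d|z_1|\, d|z_2|\; d\arg z_1\; d\arg z_2 = \pi^2\, dg \cdot \rho^2\, d\rho^2, \tag{4.2}$$
where $dg$ is given in (3.5). Then (3.25) takes the form
$$(2j+1)! \int \frac{d^2z_1}{\pi}\,\frac{d^2z_2}{\pi}\; \delta(\rho^2 - 1)\; u_{jm}\!\left(\frac{z_1}{\rho}, \frac{z_2}{\rho}\right) u_{j'm'}\!\left(\frac{z_1}{\rho}, \frac{z_2}{\rho}\right)^{*} = \delta_{jj'}\delta_{mm'}. \tag{4.3}$$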
Remembering that the last two factors of the integrand are actually $\rho$-independent, and that the result on the right-hand side really arises from the integration over $SU(2)$ with measure $dg$, we see that we can replace $\delta(\rho^2 - 1)$ by any (real positive) function $f_j(\rho^2)$ subject to
$$\int_0^\infty d\rho^2 \cdot \rho^2 f_j(\rho^2) = 1, \tag{4.4}$$
and then (4.3) will remain valid in the form
$$(2j+1)! \int \frac{d^2z_1}{\pi}\,\frac{d^2z_2}{\pi}\; f_j(\rho^2)\,(\rho^2)^{-2j}\; u_{jm}(z_1, z_2)\, u_{j'm'}(z_1, z_2)^{*} = \delta_{jj'}\delta_{mm'}. \tag{4.5}$$
An easy and suggestive choice consistent with (4.4) is
$$f_j(\rho^2) = (\rho^2)^{2j}\, e^{-\rho^2}/(2j+1)!, \tag{4.6}$$
which leads to
$$\int \frac{d^2z_1}{\pi}\,\frac{d^2z_2}{\pi}\; e^{-|z_1|^2 - |z_2|^2}\; u_{jm}(z_1, z_2)\, u_{j'm'}(z_1, z_2)^{*} = \delta_{jj'}\delta_{mm'}. \tag{4.7}$$
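The normalization condition (4.4) for the choice (4.6), and the unit norm of the monomials in (4.7), can be checked numerically. The following is a minimal sketch, not part of the original construction: the function names, the trapezoidal rule, the cutoff `t_max` and the step count are all illustrative assumptions. It uses the substitution $t = \rho^2$ (respectively $t = |z|^2$), under which $\int (d^2z/\pi)\, e^{-|z|^2} |z|^{2a} = \int_0^\infty dt\, t^a e^{-t}$.

```python
import math


def integrate(func, t_max=60.0, steps=100_000):
    """Composite trapezoidal rule on [0, t_max] (cutoff chosen for small j)."""
    h = t_max / steps
    total = 0.5 * (func(0.0) + func(t_max))
    for k in range(1, steps):
        total += func(k * h)
    return total * h


def f_j(t, two_j):
    """Weight f_j of (4.6) as a function of t = rho^2, with two_j = 2j."""
    return t ** two_j * math.exp(-t) / math.factorial(two_j + 1)


def normalization_integral(two_j):
    """Left-hand side of (4.4): integral of t * f_j(t) over t = rho^2."""
    return integrate(lambda t: t * f_j(t, two_j))


def bargmann_norm_squared(two_j, two_m):
    """Squared norm of u_jm under the Gaussian measure of (4.7)."""
    a = (two_j + two_m) // 2  # power of z1, i.e. j + m
    b = (two_j - two_m) // 2  # power of z2, i.e. j - m
    moment_a = integrate(lambda t: t ** a * math.exp(-t))
    moment_b = integrate(lambda t: t ** b * math.exp(-t))
    return moment_a * moment_b / (math.factorial(a) * math.factorial(b))


if __name__ == "__main__":
    for two_j in range(5):
        print(two_j, normalization_integral(two_j))
    print(bargmann_norm_squared(3, 1))
```

Both quantities should come out close to 1 within the accuracy of the quadrature; the orthogonality of distinct monomials follows from the angular integrations and is not tested here.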
This is recognized to be just the Bargmann entire function realization of the Schwinger oscillator operator construction for $SU(2)$, with the familiar complete system of basis functions
$$u_{jm}(z_1, z_2) = \frac{z_1^{\,j+m}\, z_2^{\,j-m}}{\sqrt{(j+m)!\,(j-m)!}} \tag{4.8}$$
forming an orthonormal basis in the Bargmann Hilbert space [20]. The oscillator operators $a_1^\dagger, a_2^\dagger$ correspond to multiplication by $z_1, z_2$, while the measure in (4.7) is such that $a_1$ and $a_2$ act as $\partial/\partial z_1$, $\partial/\partial z_2$, respectively. It is in this way that the original Schwinger oscillator operator construction for $SU(2)$ can be recovered from the Schwinger rep of $SU(2)$ in the form realized in the previous section.

4.2. The Majorana representation for spin j

It is very well known from the theory of the Poincaré–Bloch sphere that each pure state of a spin 1/2 system (two level quantum system) can be represented in a unique fashion by a point on $S^2$. Majorana's theorem generalizes this to pure states of a
spin $j$ system for any $j$ [21]. We show how this result can be obtained immediately and transparently from the work of the previous section. The orthonormal basis functions for the spin $j$ UIR contained within the Schwinger rep $D_0$ of $SU(2)$, given in (3.24), are expressible in the form
$$Y_{jm}(\alpha,\beta,\gamma) = (-1)^{j-m}(2j+1)^{1/2}\, D^j_{-m,j}(\alpha,\beta,\gamma)$$
$$= (-1)^{j-m}\sqrt{\frac{(2j+1)!}{(j+m)!\,(j-m)!}}\;\bigl(e^{-i(\alpha+\gamma)/2}\cos\beta/2\bigr)^{j-m}\bigl(e^{i(\alpha-\gamma)/2}\sin\beta/2\bigr)^{j+m}$$
$$= \sqrt{\frac{(2j+1)!}{(j+m)!\,(j-m)!}}\cdot \xi^{2j}\cdot(-1)^{j-m}\,\zeta^{j+m}, \qquad \zeta = \frac{\eta}{\xi} = e^{i\alpha}\tan\beta/2. \tag{4.9}$$
The variable $\zeta$, which can take any value in the complex plane since $0 \le \alpha \le 2\pi$, $0 \le \beta \le \pi$, is the result of stereographic projection applied to the sphere $S^2$, with the south pole as vertex, and onto the plane tangent to $S^2$ at the north pole. Thus each $\zeta$ corresponds to a unique point on $S^2$, the north and south poles being mapped onto $\zeta = 0$ and $\infty$, respectively. A general vector $\psi$ within the spin $j$ UIR in $D_0$ is thus of the form
$$\psi = \sum_{m=-j}^{+j} C_m\, Y_{jm}(\alpha,\beta,\gamma) = \sqrt{(2j+1)!}\;\xi^{2j}\cdot \sum_{m=-j}^{j} \frac{(-1)^{j-m}\, C_m}{\sqrt{(j+m)!\,(j-m)!}}\;\zeta^{j+m}. \tag{4.10}$$
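The correspondence between the coefficients $C_m$ in (4.10) and a set of points on $S^2$ can be made concrete with a short numerical sketch. Everything below is an illustrative assumption rather than part of the original text: the function names, the storage convention `C[k] = C_m` with `k = j + m`, and the use of the Durand-Kerner iteration as a root finder. It treats only the generic case in which the leading coefficient $C_j$ is nonzero, and inverts $\zeta = e^{i\alpha}\tan\beta/2$ to obtain the point with polar angle $\beta$ and azimuth $\alpha$.

```python
import cmath
import math


def majorana_coefficients(C):
    """Coefficients a_k of zeta**k (k = j + m) in the polynomial of (4.10).

    C[k] holds C_m for m = -j + k, so len(C) = 2j + 1.
    """
    two_j = len(C) - 1
    return [
        (-1) ** (two_j - k) * C[k]
        / math.sqrt(math.factorial(k) * math.factorial(two_j - k))
        for k in range(two_j + 1)
    ]


def polynomial_roots(coeffs, iterations=200):
    """Durand-Kerner iteration; assumes a nonzero leading coefficient."""
    degree = len(coeffs) - 1
    monic = [c / coeffs[-1] for c in coeffs]
    roots = [complex(0.4, 0.9) ** k for k in range(degree)]
    for _ in range(iterations):
        for i in range(degree):
            value = sum(c * roots[i] ** k for k, c in enumerate(monic))
            denom = 1.0
            for l in range(degree):
                if l != i:
                    denom *= roots[i] - roots[l]
            roots[i] -= value / denom
    return roots


def sphere_point(zeta):
    """Invert zeta = exp(i*alpha) * tan(beta/2) to a unit vector on S^2."""
    beta = 2.0 * math.atan(abs(zeta))
    alpha = cmath.phase(zeta)
    return (
        math.sin(beta) * math.cos(alpha),
        math.sin(beta) * math.sin(alpha),
        math.cos(beta),
    )


def majorana_constellation(C):
    """Points on S^2 attached to the state with coefficients C (generic case)."""
    return [sphere_point(z) for z in polynomial_roots(majorana_coefficients(C))]


if __name__ == "__main__":
    # Spin 1 with C_{-1} = C_{+1} = 1 and C_0 = 0: two points on the equator.
    for point in majorana_constellation([1, 0, 1]):
        print(point)
```

For coincident roots the Durand-Kerner iteration converges slowly, so a production calculation would more likely rely on an established polynomial root finder.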
As it stands, the wave function (4.10) is a common standard factor times a polynomial of degree $\le 2j$ in the complex variable $\zeta$. In the generic case with all $C_m \neq 0$, we have a polynomial of degree $2j$, so $\psi$ can be uniquely factored into the form
$$\psi = \sqrt{2j+1}\cdot \xi^{2j}\cdot C_j\cdot(\zeta - \zeta_1)(\zeta - \zeta_2)\cdots(\zeta - \zeta_{2j}). \tag{4.11}$$
The (unordered) set of points $\zeta_1, \zeta_2, \ldots, \zeta_{2j}$ (some of which may coincide) corresponds to an (unordered) set of points on $S^2$, which set determines $\psi$ uniquely and vice versa (up to overall normalization of $\psi$). This is the celebrated Majorana result, obtained transparently from the way the Schwinger rep of $SU(2)$ was constructed in Sec. 3. In particular, the importance of each $Y_{jm}(\alpha,\beta,\gamma)$ being a single term expression should be appreciated.

In the generic case above with all $C_m \neq 0$, none of the points $\zeta_1, \zeta_2, \ldots, \zeta_{2j}$ can either vanish or be infinite. In the most general case, if $m_1 \ge m_2$ are the largest and smallest $m$ values for which $C_m \neq 0$, i.e., $C_j = C_{j-1} = \cdots = C_{m_1+1} = 0$,
$C_{m_1} \neq 0, \ldots, C_{m_2} \neq 0$, $C_{m_2-1} = C_{m_2-2} = \cdots = C_{-j} = 0$, the wave function $\psi$ has the form
$$\psi = \sqrt{(2j+1)!}\cdot \xi^{2j}\cdot(-1)^{j-m_1}\cdot\left[\frac{C_{m_1}\,\zeta^{m_1-m_2}}{\sqrt{(j+m_1)!\,(j-m_1)!}} - \frac{C_{m_1-1}\,\zeta^{m_1-m_2-1}}{\sqrt{(j+m_1-1)!\,(j-m_1+1)!}} + \cdots + \frac{(-1)^{m_1-m_2}\,C_{m_2}}{\sqrt{(j+m_2)!\,(j-m_2)!}}\right]\cdot \zeta^{j+m_2}.$$
Then in the Majorana representation of this $\psi$ by a constellation of points on $S^2$, we have $j - m_1$ points at the south pole ($\zeta = \infty$), $j + m_2$ points at the north pole ($\zeta = 0$), and the remaining $m_1 - m_2$ points away from both poles (but with coincidences permitted).

5. The SO(3) Schwinger Representation

This case can be handled by making suitable changes in the $SU(2)$ treatment in Sec. 3. The rotation matrix $R(\alpha,\beta,\gamma)$ in the defining (real orthogonal) UIR of $SO(3)$ is
$$R(\alpha,\beta,\gamma) = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} \cos\beta & 0 & \sin\beta \\ 0 & 1 & 0 \\ -\sin\beta & 0 & \cos\beta \end{pmatrix} \begin{pmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}. \tag{5.1}$$
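As a quick sanity check of the Euler-angle parametrization (5.1), the product of the three factors can be formed numerically and tested for orthogonality and unit determinant. This is a minimal sketch; the helper names are illustrative and plain nested lists are used instead of a matrix library.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_y(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rotation(alpha, beta, gamma):
    """R(alpha, beta, gamma) as the three-factor product in (5.1)."""
    return matmul(rot_z(alpha), matmul(rot_y(beta), rot_z(gamma)))


def transpose(a):
    return [[a[j][i] for j in range(3)] for i in range(3)]


def det(a):
    return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
            - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
            + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))


if __name__ == "__main__":
    R = rotation(0.3, 1.1, 2.0)
    print(matmul(R, transpose(R)))  # close to the identity matrix
    print(det(R))                   # close to 1
```

The third column of $R(\alpha,\beta,\gamma)$ is $(\sin\beta\cos\alpha, \sin\beta\sin\alpha, \cos\beta)$, independent of $\gamma$, which is the point of $S^2$ labelled by $(\beta,\alpha)$.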
The Euler angles now have the ranges $0 \le \alpha, \gamma \le 2\pi$, $0 \le \beta \le \pi$, so the normalized volume element is
$$dR = \frac{1}{8\pi^2}\, d\alpha\, \sin\beta\, d\beta\, d\gamma. \tag{5.2}$$
The Hilbert space carrying the left and right regular representations of $SO(3)$, denoted again by $\mathcal{H}$, is
$$\mathcal{H} = \left\{\psi(\alpha,\beta,\gamma) \in \mathbb{C} \;\Big|\; \|\psi\|^2 = \frac{1}{8\pi^2}\int_0^{2\pi} d\gamma \int_0^{2\pi} d\alpha \int_0^{\pi} \sin\beta\, d\beta\; |\psi(\alpha,\beta,\gamma)|^2 < \infty\right\}. \tag{5.3}$$
The left and right regular representations of $SO(3)$ are defined in ways analogous to (3.9) and (3.10) and need not be repeated. The expressions for their generators, $L_r$ and $\tilde{L}_r$ say, are the same as in (3.14) and (3.15), and the commutation relations too are repetitions of (3.16). The complete set of orthonormal basis functions, realising the complete reductions of both regular representations, are $(2\ell+1)^{1/2} D^{\ell}_{mn}(\alpha,\beta,\gamma)$: $\ell = 0, 1, 2, \ldots$; $m$ and $n = \ell, \ell-1, \ldots, -\ell$; and $-m$, $n$ are eigenvalues of $L_3$, $\tilde{L}_3$, respectively.
Following the same procedure as with $SU(2)$, we can isolate a subspace $\mathcal{H}_0 \subset \mathcal{H}$ carrying a realization of the Schwinger rep $D_0$ of $SO(3)$ by
$$\mathcal{H}_0 = \{\psi(\alpha,\beta,\gamma) \in \mathcal{H} \;|\; (\tilde{L}_1 + i\tilde{L}_2)\,\psi(\alpha,\beta,\gamma) = 0\} = \mathrm{Sp}\{(2\ell+1)^{1/2} D^{\ell}_{m\ell}(\alpha,\beta,\gamma),\; \ell = 0, 1, 2, \ldots,\; m = \ell, \ell-1, \ldots, -\ell\}. \tag{5.4}$$
The identification of orthonormal basis functions transforming in the standard way under the left regular action by $SO(3)$ is (compare (4.9)):
$$Y_{\ell m}(\alpha,\beta,\gamma) = (-1)^{\ell-m}(2\ell+1)^{1/2}\, D^{\ell}_{-m,\ell}(\alpha,\beta,\gamma) = \sqrt{\frac{(2\ell+1)!}{(\ell+m)!\,(\ell-m)!}}\;\bigl(e^{-i(\alpha+\gamma)}\cos^2\beta/2\bigr)^{\ell}\,\bigl(-e^{i\alpha}\tan\beta/2\bigr)^{\ell+m}. \tag{5.5}$$
The single term structure of these basis functions and the dependences on all three Euler angles should again be noted.

We have pointed out in Sec. 2 that the more familiar way of realizing the Schwinger rep of $SO(3)$ is via the usual kinematical action of rotations on square integrable functions on $S^2$, namely on functions $\psi(\alpha,\beta)$ with spherical harmonics $Y_{\ell m}(\beta,\alpha)$ as basis functions; and that this is the induced UR $D^{(\mathrm{ind},0)}_{SO(2)}$. While this realization is fully equivalent in the sense of representation theory to the realization given above, one sees that the actual carrier spaces and basis functions are quite different in the two cases. The realization on $L^2(S^2)$ is appropriate for discussing the orbital angular momentum of a spinless quantum mechanical particle; that developed in this section is appropriate for describing the subset of states of a rigid body in quantum mechanics in which the third component of the angular momentum referred to body axes always has maximal value.

It is important to note that the Schwinger oscillator operator construction for the group $SO(3)$ can be obtained from that of $SU(2)$ outlined in the previous section. Restricting the basis system in (4.8) to the set of even functions,
$$u_{jm}(-z_1, -z_2) = u_{jm}(z_1, z_2), \tag{5.6}$$
is equivalent to allowing only integer values of $j$, since $u_{jm}(-z_1,-z_2) = (-1)^{2j}u_{jm}(z_1,z_2)$, and so defines a space supporting a realization of the $SO(3)$ Lie algebra in terms of oscillators. This means that the Schwinger oscillator construction for $SU(2)$ goes through for $SO(3)$.

6. The Schwinger Representation for SU(n)

We now show how the $SU(2)$ procedure developed in Sec. 3 can be extended to the entire family of unitary unimodular groups $SU(n)$. (In the specific context of the Schwinger oscillator construction, Mathur and Mani [22] have shown how the original $SU(2)$ construction may be extended to $SU(n)$.) We begin with preliminaries
about $SU(n)$, then prove that for $n \ge 4$ the Schwinger rep of $SU(n)$ cannot be obtained by the inducing construction from any UIR of the canonical $SU(n-1)$ subgroup. We then sketch the generalization of the $SU(2)$ procedure to general $SU(n)$, and give details in the $SU(3)$ case.

In the so-called tensor notation the Lie algebra of $SU(n)$ consists of operators $A^{\lambda}_{\mu}$, $\lambda, \mu = 1, 2, \ldots, n$, obeying the commutation, conjugation and algebraic relations [23]:
$$[A^{\lambda}_{\mu}, A^{\rho}_{\sigma}] = \delta^{\rho}_{\mu} A^{\lambda}_{\sigma} - \delta^{\lambda}_{\sigma} A^{\rho}_{\mu}, \qquad (A^{\lambda}_{\mu})^{\dagger} = A^{\mu}_{\lambda}, \qquad A^{\lambda}_{\lambda} = 0. \tag{6.1}$$
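The relations (6.1) can be verified explicitly in the defining $n$-dimensional representation. The sketch below rests on an assumed convention that is not fixed by the text: $A^{\lambda}_{\mu}$ is taken to be the matrix unit with a 1 in row $\lambda$, column $\mu$, minus the multiple of the identity that makes the sum over $\lambda$ of $A^{\lambda}_{\lambda}$ vanish. Function names are illustrative, indices run from 0, and exact rational arithmetic is used so the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import product


def generator(lam, mu, n):
    """A^lam_mu in the defining representation (assumed row/column convention)."""
    return [[Fraction(int(r == lam and c == mu))
             - Fraction(int(lam == mu) * int(r == c), n)
             for c in range(n)] for r in range(n)]


def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def satisfies_commutation_relations(n):
    """Check the first relation of (6.1) for every index combination."""
    for lam, mu, rho, sig in product(range(n), repeat=4):
        lhs = commutator(generator(lam, mu, n), generator(rho, sig, n))
        first = generator(lam, sig, n)
        second = generator(rho, mu, n)
        rhs = [[int(mu == rho) * first[r][c] - int(sig == lam) * second[r][c]
                for c in range(n)] for r in range(n)]
        if lhs != rhs:
            return False
    return True


def trace_condition_holds(n):
    """Check that the sum over lam of A^lam_lam is the zero matrix."""
    total = [[Fraction(0)] * n for _ in range(n)]
    for lam in range(n):
        g = generator(lam, lam, n)
        total = [[total[r][c] + g[r][c] for c in range(n)] for r in range(n)]
    return all(x == 0 for row in total for x in row)


if __name__ == "__main__":
    print(satisfies_commutation_relations(3), trace_condition_holds(3))
```

In this real representation the conjugation relation in (6.1) reduces to the statement that the transpose of $A^{\lambda}_{\mu}$ is $A^{\mu}_{\lambda}$, which is evident from the definition.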
The subset of commuting hermitian generators which can be assumed to be simultaneously diagonal in any UR of $SU(n)$ may be taken to be (up to overall multiplicative factors):
$$A^1_1 - A^2_2, \quad A^1_1 + A^2_2 - 2A^3_3, \quad \ldots, \quad A^1_1 + A^2_2 + \cdots + A^{n-1}_{n-1} - (n-1)A^n_n = -nA^n_n. \tag{6.2}$$
Since $SU(n)$ has rank $(n-1)$, there are $(n-1)$ fundamental UIR's; a general UIR is obtained by forming the direct product of several copies of each fundamental UIR and then isolating the "largest" irreducible piece. The fundamental UIR's are the defining $n$-dimensional UIR consisting of $n \times n$ unitary unimodular matrices, followed by antisymmetric tensor representations of successive ranks $2, 3, \ldots, (n-1)$ over the defining UIR. For brevity, denote the fundamental UIR of $SU(n)$ given by antisymmetric tensors of rank $p$ by $p^{(n)}$, for $p = 1, 2, \ldots, n-1$. Under complex conjugation, we have
$$\bigl(p^{(n)}\bigr)^{*} = (n-p)^{(n)}. \tag{6.3}$$
Then the reduction of each fundamental UIR under the canonical $SU(n-1)$ subgroup is easily seen to have the two-term structure
$$p^{(n)} = p^{(n-1)} \oplus (p-1)^{(n-1)}, \qquad p = 1, 2, \ldots, n-1. \tag{6.4}$$
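The two-term reduction (6.4) lends itself to a small bookkeeping check. The sketch below is illustrative only: it labels the UIR's of $SU(n-1)$ occurring in (6.4) by their antisymmetric rank, identifies ranks $0$ and $n-1$ with the trivial representation (an assumption made explicit in the code), confirms the dimension count via binomial coefficients, and lists which $SU(n-1)$ UIR's occur in every fundamental UIR of $SU(n)$.

```python
from math import comb


def su_n_minus_1_content(p, n):
    """Rank labels of the SU(n-1) UIR's inside p^(n) according to (6.4).

    A label q denotes the rank-q antisymmetric tensor of SU(n-1); ranks 0 and
    n-1 are both the trivial representation, hence the reduction mod (n-1).
    """
    return {p % (n - 1), (p - 1) % (n - 1)}


def common_content(n):
    """SU(n-1) UIR's that occur in every fundamental UIR of SU(n)."""
    sets = [su_n_minus_1_content(p, n) for p in range(1, n)]
    return set.intersection(*sets)


def dimensions_match(n):
    """dim p^(n) = C(n, p) must equal C(n-1, p) + C(n-1, p-1)."""
    return all(comb(n, p) == comb(n - 1, p) + comb(n - 1, p - 1)
               for p in range(1, n))


if __name__ == "__main__":
    for n in range(3, 8):
        print(n, dimensions_match(n), common_content(n))
```

For $n = 3$ both the singlet and the doublet of $SU(2)$ survive in the intersection, while for $n \ge 4$ the intersection is empty.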
One sees from this that for $n \ge 4$, there is no single UIR of $SU(n-1)$ which occurs exactly once in each fundamental UIR of $SU(n)$, hence also none which appears exactly once in each UIR of $SU(n)$. For example, when $n = 4$, we have in terms of dimensionalities $1^{(4)} = 4$, $2^{(4)} = 6$, $3^{(4)} = 4^*$; their $SU(3)$ contents are
$$4 = 3 \oplus 1, \qquad 6 = 3^* \oplus 3, \qquad 4^* = 1 \oplus 3^*, \tag{6.5}$$
where $1^{(3)} = 3$, $2^{(3)} = 3^*$; and the statement made above is seen to be true. For the $SU(3) \to SU(2)$ case, we have in contrast
$$3 = 2 \oplus 1, \qquad 3^* = 2 \oplus 1, \tag{6.6}$$
and in fact, as mentioned in Sec. 2, each UIR of $SU(3)$ does contain exactly one $SU(2)$ invariant state. From the reciprocity theorem we conclude that for $n \ge 4$, the Schwinger rep of $SU(n)$ cannot be obtained by the inducing construction starting from any UIR of $SU(n-1)$.

The method used for $SU(2)$ in Sec. 3, however, does work for all $SU(n)$. The Hilbert space carrying the two commuting regular representations of $SU(n)$ is $\mathcal{H} = L^2(SU(n))$:
$$\mathcal{H} = \left\{\psi(g) \in \mathbb{C} \;\Big|\; g \in SU(n),\; \|\psi\|^2 = \int dg\, |\psi(g)|^2 < \infty\right\}. \tag{6.7}$$
Here $dg$ is the normalized invariant volume element on $SU(n)$, and the left and right regular representation operators $U(g)$, $\tilde{U}(g)$ are defined exactly as in (3.9) and (3.10). Let us denote their generators by $A^{\lambda}_{\mu}$, $\tilde{A}^{\lambda}_{\mu}$: each set obeys Eq. (6.1), and they mutually commute. Then the subspace $\mathcal{H}_0$ supporting a Schwinger rep $D_0$ of $SU(n)$ is identified by
$$\mathcal{H}_0 = \{\psi(g) \in \mathcal{H} \;|\; \tilde{A}^{\lambda}_{\mu}\psi = 0,\; \lambda < \mu\} = \{\psi(g) \in \mathcal{H} \;|\; \tilde{A}^{\lambda}_{\lambda+1}\psi = 0,\; \lambda = 1, 2, \ldots, n-1\}. \tag{6.8}$$
Here we use the fact that the $\tfrac{1}{2}n(n-1)$ nonhermitian operators $\tilde{A}^{\lambda}_{\mu}$ for $\lambda < \mu$ close under commutation, so we can consistently look for their common null space. (In the defining UIR of $SU(n)$, these are lower triangular matrices.) Since $[\tilde{A}^{\lambda}_{\lambda+1}, \tilde{A}^{\lambda+1}_{\lambda+2}] = \tilde{A}^{\lambda}_{\lambda+2}$ etc., we can adopt the more economical definition in the second line of (6.8). These conditions have the following effect: out of the many appearances of each $SU(n)$ UIR in the reduction of the left regular representation $U(g)$ on $\mathcal{H}$, exactly one is picked up, corresponding to the highest weight with respect to the right regular representation $\tilde{U}(g)$. Then the UR $U(g)$ on $\mathcal{H}$, when restricted to $\mathcal{H}_0$, gives a realization of the Schwinger rep $D_0$ of $SU(n)$.

We spell out the details in the $SU(3)$ case [24]. The $SU(2)$ subgroup is taken to be generated by $A^1_2$, $A^2_1$, $A^1_1 - A^2_2$. In the standard isospin notation, we have:
$$I_3 = A^1_1 - A^2_2, \qquad I_+ = \sqrt{2}\,A^1_2, \qquad I_- = \sqrt{2}\,A^2_1. \tag{6.9}$$
A general $SU(3)$ UIR is denoted by $(p, q)$, with $p$ and $q$ independent nonnegative integers. ($(1, 0) = 3 =$ defining representation, $(0, 1) = 3^*$.) Within this UIR, whose dimension is $N_{p,q} = \tfrac{1}{2}(p+1)(q+1)(p+q+2)$, an orthonormal basis is written as
$$|p, q;\, I, I_3, Y\rangle, \tag{6.10}$$
where I, I3 are the usual SU (2) UIR quantum numbers, and the hypercharge Y is the eigenvalue of −A33 . The “I − Y multiplets” contained in the UIR (p, q) are given
by the rules:
$$I = \tfrac{1}{2}(r+s), \qquad I_3 = I, I-1, \ldots, -I, \qquad Y = r - s + \tfrac{2}{3}(q-p),$$
$$r = 0, 1, 2, \ldots, p, \qquad s = 0, 1, 2, \ldots, q. \tag{6.11}$$
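The rules (6.11) can be cross-checked against the dimension formula $N_{p,q} = \tfrac{1}{2}(p+1)(q+1)(p+q+2)$ by counting states multiplet by multiplet. The following is a minimal sketch with illustrative function names; exact fractions are used for $I$ and $Y$.

```python
from fractions import Fraction


def multiplets(p, q):
    """(I, Y) labels of the isospin multiplets in the UIR (p, q), from (6.11)."""
    return [
        (Fraction(r + s, 2), r - s + Fraction(2 * (q - p), 3))
        for r in range(p + 1)
        for s in range(q + 1)
    ]


def dimension_from_multiplets(p, q):
    """Sum of the 2I + 1 values of I3 over all multiplets."""
    return sum(int(2 * isospin + 1) for isospin, _ in multiplets(p, q))


def dimension_formula(p, q):
    """N_{p,q} = (p + 1)(q + 1)(p + q + 2) / 2."""
    return (p + 1) * (q + 1) * (p + q + 2) // 2


if __name__ == "__main__":
    for p, q in [(1, 0), (0, 1), (1, 1), (3, 0), (2, 2)]:
        print((p, q), dimension_from_multiplets(p, q), dimension_formula(p, q))
    print(multiplets(1, 0))
```

For $(p, q) = (1, 0)$ the rules give an isosinglet with $Y = -2/3$ and an isodoublet with $Y = 1/3$, and the choice $r = p$, $s = 0$ always yields $I = \tfrac{1}{2}p$, $Y = \tfrac{1}{3}(p + 2q)$.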
(Thus by taking $r = s = 0$ we see that an $SU(2)$ singlet state with $I = I_3 = 0$ is always present once.) The nonhermitian generators $A^1_2$, $A^1_3$, $A^2_3$ cause the following changes in the "magnetic quantum numbers" $I, I_3, Y$ of the basis states (6.10):
$$A^1_2:\; I, I_3, Y \to I,\; I_3 + 1,\; Y,$$
$$A^1_3:\; I, I_3, Y \to I \pm \tfrac{1}{2},\; I_3 + \tfrac{1}{2},\; Y + 1,$$
$$A^2_3:\; I, I_3, Y \to I \pm \tfrac{1}{2},\; I_3 - \tfrac{1}{2},\; Y + 1. \tag{6.12}$$
Thus either $Y$ is increased by unity, or $Y$ is unchanged but $I_3$ is increased by unity. The unique basis state within $(p, q)$ annihilated by $A^1_2$ and $A^2_3$ (hence also by $A^1_3$) is then seen to be for $r = p$, $s = 0$:
$$\bigl|p, q;\; \tfrac{1}{2}p,\; \tfrac{1}{2}p,\; \tfrac{1}{3}(p + 2q)\bigr\rangle. \tag{6.13}$$
With appropriate conventions this is the highest weight state in the UIR: it has the highest possible hypercharge value, and for this hypercharge it has the highest possible eigenvalue for $I_3$.

Now we use this information about UIR's of $SU(3)$ to analyze the regular representations. These UR's are realized on $L^2(SU(3))$, and an orthonormal basis is given in an obvious notation by the collection of all unitary representation matrices:
$$\sqrt{N_{p,q}}\; D^{(p,q)}_{I I_3 Y;\, \tilde{I} \tilde{I}_3 \tilde{Y}}(g). \tag{6.14}$$
The subspace $\mathcal{H}_0$ identified in (6.8) is thus seen to be spanned by those basis functions for which $\tilde{I} = \tilde{I}_3 = \tfrac{1}{2}p$, $\tilde{Y} = \tfrac{1}{3}(p + 2q)$:
$$\mathcal{H}_0 = \text{null space of } \tilde{A}^1_2,\, \tilde{A}^2_3 \;(\text{and } \tilde{A}^1_3) = \mathrm{Sp}\Bigl\{\sqrt{N_{p,q}}\; D^{(p,q)}_{I I_3 Y;\, \frac{1}{2}p,\, \frac{1}{2}p,\, \frac{1}{3}(p+2q)}(g)\Bigr\}, \tag{6.15}$$
and we see explicitly that with respect to the left action each UIR of SU (3) occurs exactly once. Thus the Schwinger rep D0 of SU (3) is realized on H0 . To exhibit a basis Yp,q;II3 Y (g) for H0 which is orthonormal and transforms in the standard “Biedenharn” manner under SU (3) action, [24] equations analogous to (3.24) have to be set up, but we omit the details.
7. Application to the Wigner–Weyl isomorphism

The Wigner–Weyl isomorphism (WW isomorphism) is a method to express states and operators in the traditional Hilbert space formulation of quantum mechanics in a classical phase space language [25]. Thus density matrices and general dynamical variables are represented by corresponding c-number functions on phase space, their Weyl symbols, while quantum mechanical expectation values are calculated as integrals of products of Weyl symbols over phase space in the manner of classical statistical mechanics. The WW isomorphism has been studied most extensively in the case of Cartesian quantum mechanics when, as mentioned in Sec. 2, the configuration space is $Q = \mathbb{R}^n$ and phase space is $\mathbb{R}^{2n}$. It has been shown elsewhere that if we consider the configuration space to be a (compact simple) Lie group $G$, the kinematic structure of quantum mechanics shows striking new features absent in the Cartesian case, so the WW isomorphism also exhibits unexpected features [26]. Interestingly the Schwinger rep of $G$ plays a role in this context, and this will be outlined here.

The Hilbert space of wave functions is in an obvious notation
$$\mathcal{H} = \left\{\psi(g) \in \mathbb{C} \;\Big|\; g \in G,\; \|\psi\|^2 = \int_G dg\, |\psi(g)|^2 < \infty\right\}. \tag{7.1}$$
The left and right regular UR's act as in (3.9)–(3.11) reinterpreted as referring to $G$. A density operator $\hat{\rho}$ and a general dynamical variable $\hat{A}$ are represented by their integral kernels
$$\hat{\rho} \to \langle g'|\hat{\rho}|g\rangle, \qquad \hat{A} \to \langle g'|\hat{A}|g\rangle, \tag{7.2}$$
where the ideal kets $|g\rangle$ for $g \in G$ are introduced such that
$$\psi(g) = \langle g|\psi\rangle, \qquad \langle g'|g\rangle = \delta(g^{-1}g'), \qquad \int dg\, |g\rangle\langle g| = 1 \text{ on } \mathcal{H}. \tag{7.3}$$
This allows us to express the actions of $U(g)$, $\tilde{U}(g)$ in the succinct forms
$$U(g)|g'\rangle = |gg'\rangle, \qquad \tilde{U}(g)|g'\rangle = |g'g^{-1}\rangle. \tag{7.4}$$
The trace orthonormality of these unitary operators is then immediate:
$$\mathrm{Tr}(U(g')U(g)) = \mathrm{Tr}(\tilde{U}(g')\tilde{U}(g)) = \delta(g'g). \tag{7.5}$$
The complementary "momentum" basis for $\mathcal{H}$ in which both regular representations are simultaneously completely reduced into UIR's is determined by the $D$-functions as
$$|jmn\rangle = N_j^{1/2}\int dg\; D^j_{mn}(g)\,|g\rangle \tag{7.6}$$
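with the basic properties
$$\langle j'm'n'|jmn\rangle = \delta_{j'j}\,\delta_{m'm}\,\delta_{n'n},$$
$$U(g)|jmn\rangle = \sum_{m'} D^j_{mm'}(g^{-1})\,|jm'n\rangle, \tag{7.7}$$
$$\tilde{U}(g)|jmn\rangle = \sum_{n'} D^j_{n'n}(g)\,|jmn'\rangle.$$
In the reduction of either regular representation each UIR $j$ of $G$ occurs $N_j$ times. In this basis $\hat{\rho}$ and $\hat{A}$ are represented by "matrices"
$$\hat{\rho} \to \langle j'm'n'|\hat{\rho}|jmn\rangle, \qquad \hat{A} \to \langle j'm'n'|\hat{A}|jmn\rangle. \tag{7.8}$$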
In this scheme the WW isomorphism can be set up in two equally good ways. We describe both at this point even though only the second one will be used later.

Option I

With an operator $\hat{A}$ described by kernel (7.2) or matrix (7.8) we associate the Weyl symbol
$$W_{\hat{A}}(g; jmm') = \int dg' \int dg''\; \langle g''|\hat{A}|g'\rangle\, D^j_{mm'}(g'g''^{-1})\,\delta(g^{-1}s(g', g'')) = \int dg' \int dg''\; \langle g''|\tilde{U}(g)\hat{A}\tilde{U}(g)^{-1}|g'\rangle\, D^j_{mm'}(g'g''^{-1})\,\delta(s(g', g'')). \tag{7.9}$$
This symbol depends on a group element $g$ (coordinate variable) and on the discrete UIR labels $jmm'$ (momentum variable). It involves the function $s(g', g'') \in G$ dependent on two arguments, having the properties
$$s(g', g'') = s(g'', g'), \qquad s(g', g') = g', \qquad s(g_1g'g_2,\, g_1g''g_2) = g_1\, s(g', g'')\, g_2. \tag{7.10}$$
A possible choice for $s(g', g'')$ is the "midpoint" of the geodesic in $G$ from $g'$ to $g''$. Using (7.10), this solution can be written as
$$s(g', g'') = g'\, s_0(g'^{-1}g''), \tag{7.11}$$
where $s_0(g)$ is the "midpoint" of the one-parameter subgroup connecting the identity $e \in G$ to $g$.

With this option, we have under conjugation of $\hat{A}$ by $\tilde{U}$, $U$:
$$\hat{A}' = \tilde{U}(g_1)\hat{A}\tilde{U}(g_1)^{-1} \;\Rightarrow\; W_{\hat{A}'}(g; jmm') = W_{\hat{A}}(gg_1; jmm');$$
$$\hat{A}' = U(g_2)^{-1}\hat{A}U(g_2) \;\Rightarrow\; W_{\hat{A}'}(g; jmm') = \sum_{m_1, m_1'} D^j_{mm_1}(g_2^{-1})\, W_{\hat{A}}(g_2g; jm_1m_1')\, D^j_{m_1'm'}(g_2). \tag{7.12}$$
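Finally for two operators $\hat{A}$, $\hat{B}$ on $\mathcal{H}$ we find:
$$\mathrm{Tr}(\hat{A}\hat{B}) = \int dg \sum_{jmm'} N_j\, W_{\hat{A}}(g; jmm')\, W_{\hat{B}}(g; jm'm). \tag{7.13}$$

Option II

To save on symbols, we use the same notations as in Option I; in any case we later make use only of Option II. With $\hat{A}$ we now associate the Weyl symbol
$$W_{\hat{A}}(g; jnn') = \int dg' \int dg''\; \langle g''|\hat{A}|g'\rangle\, D^j_{n'n}(g''^{-1}g')\,\delta(g^{-1}s(g', g'')) = \int dg' \int dg''\; \langle g''|U(g)^{-1}\hat{A}U(g)|g'\rangle\, D^j_{n'n}(g''^{-1}g')\,\delta(s(g', g'')). \tag{7.14}$$
Under conjugation of $\hat{A}$ we now have:
$$\hat{A}' = \tilde{U}(g_1)\hat{A}\tilde{U}(g_1)^{-1} \;\Rightarrow\; W_{\hat{A}'}(g; jnn') = \sum_{n_1, n_1'} D^j_{n_1n}(g_1^{-1})\, W_{\hat{A}}(gg_1; jn_1n_1')\, D^j_{n'n_1'}(g_1);$$
$$\hat{A}' = U(g_2)^{-1}\hat{A}U(g_2) \;\Rightarrow\; W_{\hat{A}'}(g; jnn') = W_{\hat{A}}(g_2g; jnn'). \tag{7.15}$$
For the trace over $\mathcal{H}$,
$$\mathrm{Tr}(\hat{A}\hat{B}) = \int dg \sum_{jnn'} N_j\, W_{\hat{A}}(g; jnn')\, W_{\hat{B}}(g; jn'n). \tag{7.16}$$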
We stress that (7.9), (7.12) and (7.13) hold with Option I, while (7.14)–(7.16) hold with Option II. The major differences are in the behaviors under conjugation of $\hat{A}$.

Let us hereafter choose to work with Option II. The structure of the "momentum variables" in $W_{\hat{A}}(g; jnn')$ suggests that we bring in the Schwinger rep $D_0(g)$ of $G$ acting on $\mathcal{H}_0$, as set up in (2.2)–(2.5). We can then represent the Weyl symbol of $\hat{A}$ more compactly as simultaneously a function of $g$ and a block diagonal operator on $\mathcal{H}_0$:
$$\hat{A} \to W_{\hat{A}}(g; jnn') \to \tilde{A}(g) = \bigoplus_j \tilde{A}_j(g), \qquad \tilde{A}_j(g) = \sum_{n, n'} W_{\hat{A}}(g; jnn')\,|jn')(jn|. \tag{7.17}$$
Each $\tilde{A}_j(g)$ acts on the subspace $\mathcal{H}^{(j)} \subset \mathcal{H}_0$, and $\tilde{A}(g)$ acts in a block diagonal manner on $\mathcal{H}_0$. For two operators $\hat{A}$ and $\hat{B}$, traces within $\mathcal{H}^{(j)}$ give
$$\mathrm{tr}(\tilde{A}_j(g)\tilde{B}_j(g)) = \sum_{n, n'} W_{\hat{A}}(g; jnn')\, W_{\hat{B}}(g; jn'n), \tag{7.18}$$
so the general trace formula (7.16) has the form
$$\mathrm{Tr}(\hat{A}\hat{B}) = \int dg \sum_j N_j\, \mathrm{tr}(\tilde{A}_j(g)\tilde{B}_j(g)). \tag{7.19}$$
It is important to recognize that the trace operation on the right-hand side is not over $\mathcal{H}_0$, because of the presence of the dimensionality factors $N_j$. We come back to this point later.

We can now ask for the conditions on $\hat{A}$ which make its Weyl symbol $W_{\hat{A}}(g; jnn')$ independent of "coordinate" $g$ and dependent only on "momenta" $jnn'$ [footnote c]. From (7.14), we see that $\hat{A}$ must belong to the commutant of the operators $U(g)$ of the left regular representation. This means that it should be built up exclusively from the operators $\tilde{U}(g)$ of the right regular representation. After elementary calculations we can state this as a series of two-way implications:
$$W_{\hat{A}}(g; jnn') = \text{independent of } g \;\Leftrightarrow\; U(g')\hat{A} = \hat{A}U(g'), \text{ all } g' \;\Leftrightarrow\; \langle g'|\hat{A}|g\rangle = f(g'^{-1}g), \text{ some } f \;\Leftrightarrow\; \hat{A} = \int dg\, f(g)\,\tilde{U}(g) \;\Leftrightarrow\; \langle j'm'n'|\hat{A}|jmn\rangle = N_j^{-1/2}\,\delta_{j'j}\,\delta_{m'm}\, f^j_{n'n},$$
$$f^j_{n'n} = N_j^{1/2}\int dg\, f(g)\, D^j_{n'n}(g), \qquad f(g) = \sum_{jnn'} N_j^{1/2}\, f^j_{n'n}\, D^j_{nn'}(g^{-1}). \tag{7.20}$$
For such special operators $\hat{A}$, we in fact find:
$$W_{\hat{A}}(g; jnn') = N_j^{-1/2}\, f^j_{n'n}, \qquad \langle j'm'n'|\hat{A}|jmn\rangle = \delta_{j'j}\,\delta_{m'm}\, W_{\hat{A}}(\cdot\,; jnn'). \tag{7.21}$$
When the Weyl symbol of such an $\hat{A}$ is represented as a block diagonal operator on $\mathcal{H}_0$ according to (7.17), we have:
$$\hat{A} = \int dg\, f(g)\,\tilde{U}(g) \;\Leftrightarrow\; \tilde{A}(g) = g\text{-independent} = \int dg'\, f(g')\, D_0(g'). \tag{7.22}$$
[Footnote c: This leads to interesting consequences and structures which are completely absent in the Cartesian case.]

Therefore, when $\hat{A}$ on $\mathcal{H}$ is built up exclusively from the operators of the right regular representation $\tilde{U}(g)$, its Weyl symbol is the corresponding operator, in the
sense of (7.22), in the Schwinger rep of G, stripping away the degeneracy of the regular representation. At the generator level, we can say that if Aˆ is a function ˜ then A˜ is identically the same function of the only of the generators J˜r of U(g), generators of the Schwinger rep D0 on H0 . The block diagonality of A˜ is of course assured. This shows the important role of the Schwinger rep in the WW isomorphism for quantum mechanics on a (compact simple) Lie group. We return to the comment made after (7.19) and ask whether the definition of A˜j (g) for given Aˆ could have been altered so as to absorb the factors Nj appearing on the right in that equation. In that case, that right-hand side would be expressible in terms of a trace over H0 , which would make that relation more attractive. However, a careful analysis shows that in that case the simplicity of the correspondence (7.22) would be lost, and therewith the direct relevance of the Schwinger rep. Therefore, to secure (7.22), we have to retain (7.19) as it stands. Ultimately, this situation can be traced to the following source. While the way in which the delta function in the trace relation (7.5) appears is extremely elementary, when we express it as in (2.1) in terms of the irreducible representation matrices of G the dimensionality factors Nj are essential.
8. Concluding Comments

The method by which the Schwinger rep has been isolated within the regular representation in the case of the group $SU(n)$ readily generalizes to all the other compact simple Lie group families, namely $SO(2n)$, $SO(2n+1)$, $USp(2n)$ and even the five exceptional groups. This is because in each case the concept of highest weight in each UIR is unambiguously defined, and moreover the Lie algebra can be exhibited in the Cartan form, made up of "shift" or "raising" and "lowering" generators in the directions of the distinct root vectors.

An interesting question is how to effect a similar extraction of the Schwinger rep from the regular representation in the case of finite groups, say the permutation groups $S_N$. This presents interesting algebraic problems as generators, shifts along root vectors etc. are no longer available. The construction of the Schwinger representation for the permutation groups $S_N$ has attracted attention in the mathematical literature: see, for instance, [27].

Two other general questions suggest themselves, bearing in mind the basic properties of the Schwinger rep, simple reducibility and completeness. How are these properties reflected in the "classical limit"? Can one give some differential-geometric or manifold-theoretic characterizations at the level of the coadjoint orbit space of the Lie group? If one next takes the direct product of the Schwinger rep with itself, the simple reducibility aspect is likely to change, yet one can ask if any simplifying features remain. We hope to return to some of these questions elsewhere.
References

[1] J. Schwinger, On angular momentum, USAEC Report NYO-3071 (1952); reprinted in Quantum Theory of Angular Momentum, ed. K. A. Milton (Academic Press, New York, 1965), p. 229; A Quantum Legacy: Seminal Papers of Julian Schwinger, eds. L. C. Biedenharn and H. Van Dam (World Scientific Publishing Company, Singapore, 2000), p. 173.
[2] D. P. Arovas and A. Auerbach, Phys. Rev. B 38 (1988) 316; A. Auerbach and D. P. Arovas, Phys. Rev. Lett. 61 (1988) 617; A. Auerbach, Interacting Electrons and Quantum Magnetism (Springer, New York, 1994).
[3] Arvind, B. Dutta, N. Mukunda and R. Simon, Phys. Rev. A 52 (1993) 1609.
[4] K. Sundar, N. Mukunda and R. Simon, J. Opt. Soc. Am. A 12 (1995) 560; R. Simon, K. Sundar and N. Mukunda, J. Opt. Soc. Am. A 10 (1993) 2008.
[5] V. I. Man'ko, G. Marmo, P. Vitale and F. Zaccaria, Int. J. Mod. Phys. A 9 (1994) 5541.
[6] Yu. F. Smirnov and M. R. Kibler, in Symmetries in Science VI: From the Rotation Group to Quantum Algebras, ed. B. Gruber (Plenum Press, New York, 1993), p. 691; M. R. Kibler, R. M. Asherova and Yu. F. Smirnov, Symmetries in Science VIII, ed. B. Gruber (Plenum Press, New York, 1995), p. 241.
[7] M. Kibler and T. Négadi, Lett. Nuovo Cimento 37 (1983) 225; ibid., J. Phys. A 16 (1983) 4265; ibid., Phys. Rev. A 29 (1984) 2891; M. Kibler, Molec. Phys. 102 (2004) 1221.
[8] M. Kibler and G. Grenet, J. Math. Phys. 21 (1980) 422.
[9] P. Aniello and R. Coen Cagli, arxiv:quantum-ph/0504108 (2005).
[10] M. V. Berry and J. M. Robbins, Proc. Roy. Soc. London A 453 (1997) 1771.
[11] For the SU(3) Schwinger construction see, for instance: M. Moshinsky, Rev. Mod. Phys. 34 (1962) 813; M. Mathur and D. Sen, J. Math. Phys. 42 (2001) 4181; S. Chaturvedi and N. Mukunda, J. Math. Phys. 43 (2002) 5262, 5278.
[12] M. A. B. Beg and H. Ruegg, J. Math. Phys. 6 (1965) 677; A. J. Bracken, Comm. Math. Phys. 94 (1984) 371.
[13] I. N. Bernstein, I. M. Gelfand, and S. I. Gelfand, Funct. Anal. Appl. 9 (1975) 322; I. M. Gelfand and A. V. Zelevinskii, Funct. Anal. Appl. 18 (1984) 183.
[14] G. W. Mackey, Group Representations in Hilbert Space (American Mathematical Society, Providence, RI, 1963); see also N. Mukunda, Arvind, S. Chaturvedi and R. Simon, J. Math. Phys. 44 (2003) 2479, Appendix B.
[15] S. Chaturvedi and N. Mukunda, J. Math. Phys. 43 (2002) 5262.
[16] For an exhaustive treatment see L. C. Biedenharn and J. D. Louck, Angular Momentum in Quantum Physics: Theory and Applications, Encyclopedia of Mathematics and its Applications, ed. Gian-Carlo Rota, Vol. 8 (Addison-Wesley Publishing Company, 1981).
[17] See [16], pp. 45–47.
[18] For the following details, see [16], pp. 57–65.
[19] A. Perelomov, Generalized Coherent States and Their Applications (Springer-Verlag, Berlin, 1986).
[20] V. Bargmann, Rev. Mod. Phys. 34 (1962) 829.
[21] E. Majorana, Nuovo Cimento 9 (1932) 43; J. Schwinger, Trans. NY Acad. Sc. 38 (1977) 170; reprinted in [1], p. 224; L. C. Biedenharn and J. D. Louck, [16], p. 463.
[22] M. Mathur and H. S. Mani, J. Math. Phys. 43 (2002) 5351.
[23] S. Okubo, Prog. Theoret. Phys. 27 (1962) 949; see also R. E. Behrends, J. Dreitlein, C. Fronsdal and B. W. Lee, Rev. Mod. Phys. 34 (1962) 1; B. G. Wybourne, Classical Groups for Physicists (Wiley, New York, 1974); R. Gilmore, Lie Groups, Lie Algebras and Some of Their Applications (Wiley, New York, 1974).
[24] For relevant details on the UIR's of SU(3) see: J. J. de Swart, Rev. Mod. Phys. 35 (1963) 916; L. C. Biedenharn, Phys. Lett. 3 (1962) 69, 254; N. Mukunda and L. K. Pandit, J. Math. Phys. 6 (1965) 746.
[25] H. Weyl, Z. Phys. 46 (1927) 1; ibid., The Theory of Groups and Quantum Mechanics (Dover, New York, 1931), p. 274; E. P. Wigner, Phys. Rev. 40 (1932) 749; M. Hillery, R. F. O'Connell, M. O. Scully and E. P. Wigner, Phys. Rep. 106 (1984) 121.
[26] N. Mukunda, G. Marmo, A. Zampini, S. Chaturvedi and R. Simon, Wigner–Weyl isomorphism for quantum mechanics on Lie groups, J. Math. Phys. 46 (2005) 012106; quant-ph/0407257.
[27] N. F. J. Inglis, R. W. Richardson and J. Saxl, Arch. Math. 54 (1990) 258.
November 1, 2006 11:8 WSPC/148-RMP
J070-00282
Reviews in Mathematical Physics Vol. 18, No. 8 (2006) 913–934 © World Scientific Publishing Company
PERIODIC AHARONOV–BOHM SOLENOIDS IN A CONSTANT MAGNETIC FIELD
TAKUYA MINE
Department of Comprehensive Sciences, Kyoto Institute of Technology, Matsugasaki, Sakyo-ku, Kyoto 606-8585, Japan
[email protected]

YUJI NOMURA
Department of Mathematics, Graduate School of Science and Engineering, Tokyo Institute of Technology, 2-12-1 Oh-okayama, Meguro-ku, Tokyo 152-8551, Japan
[email protected]

Received 28 May 2006
Revised 8 September 2006

We consider the magnetic Schrödinger operator on $\mathbb{R}^2$. The magnetic field is the sum of a homogeneous magnetic field and periodically varying pointlike magnetic fields on a lattice. We shall give a sufficient condition for each Landau level to be an infinitely degenerated eigenvalue. This condition is also necessary for the lowest Landau level. In the threshold case, we see that the spectrum near the lowest Landau level is purely absolutely continuous. Moreover, we shall give an estimate for the density of states for Landau levels and their gaps. The proof is based on the method of Geyler and Šťovíček, the magnetic Bloch theory, and canonical commutation relations.

Keywords: Schrödinger operator; periodic magnetic field; Aharonov–Bohm effect; delta magnetic field; Landau level; singular perturbation; canonical commutation relation.

Mathematics Subject Classification 2000: 81Q10, 35P15, 35Q40, 47F05, 47N50
1. Introduction

1.1. Definition of operators and history

We consider a magnetic Schrödinger operator on the Euclidean plane $\mathbb{R}^2$,
$$\mathcal{L} = \left(\frac{1}{i}\nabla + a\right)^2,$$
where $a = (a_x, a_y)$ is the vector potential. We assume that $a \in L^1_{\mathrm{loc}}(\mathbb{R}^2;\mathbb{R}^2) \cap C^\infty(\mathbb{R}^2\setminus\Gamma;\mathbb{R}^2)$ and that the magnetic field $\operatorname{rot} a(z) = (\partial_x a_y - \partial_y a_x)(z)$ satisfies
$$\operatorname{rot} a(z) = B + \sum_{\gamma\in\Gamma} 2\pi\alpha_\gamma\,\delta(z-\gamma) \tag{1.1}$$
in the distribution sense, where $B$ is a positive constant, $\delta$ is the Dirac measure concentrated at the origin, $\Gamma$ is a lattice of rank 2 in $\mathbb{R}^2$ (a discrete subgroup of $\mathbb{R}^2$ with $\operatorname{rank}_{\mathbb{Z}}\Gamma = 2$), and $\{\alpha_\gamma\}_{\gamma\in\Gamma}$ is a sequence of real numbers satisfying $0 < \alpha_\gamma < 1$ for any $\gamma\in\Gamma$. We assume the periodicity of $\{\alpha_\gamma\}_{\gamma\in\Gamma}$, i.e. there exists a rank-2 sublattice $\Gamma'$ of $\Gamma$ such that $\alpha_{\gamma+\gamma'} = \alpha_\gamma$ holds for any $\gamma\in\Gamma$ and $\gamma'\in\Gamma'$.

As is noted in [1–4], an example of the vector potential $a$ satisfying (1.1) is given by the following (we identify a vector $z = (x, y)$ with a complex number $z = x + iy$ in the sequel):
$$a(z) = (\operatorname{Im}\phi(z), \operatorname{Re}\phi(z)), \qquad \phi(z) = \frac{B}{2}\bar z + \sum_{k=1}^{K}\alpha_{\gamma_k}\,\zeta_{\Gamma'}(z-\gamma_k), \tag{1.2}$$
where $\{\gamma_k\}_{k=1}^{K}$ is a complete system of representatives of the quotient group $\Gamma/\Gamma'$, and the function $\zeta_{\Gamma'}$ is the Weierstrass $\zeta$ function corresponding to the lattice $\Gamma'$ (see Sec. 2 below). Using a gauge transformation technique, we see that the choice of the above gauge and the assumption $0 < \alpha_\gamma < 1$ lose no generality (see [3, Secs. 3 and 6]).

Define a linear operator $L$ by
$$Lu = \mathcal{L}u, \qquad D(L) = C_0^\infty(\mathbb{R}^2\setminus\Gamma),$$
where $C_0^\infty(U)$ denotes the compactly supported smooth functions whose supports are contained in an open set $U$, and $D(A)$ denotes the operator domain of the linear operator $A$. Then, $L$ is a positive symmetric operator. We denote the Friedrichs extension of $L$ by $H$. More explicitly,
$$Hu = \mathcal{L}u, \qquad D(H) = \Bigl\{u \in L^2(\mathbb{R}^2)\cap H^2_{\mathrm{loc}}(\mathbb{R}^2\setminus\Gamma) \Bigm| \mathcal{L}u \in L^2(\mathbb{R}^2),\ \lim_{z\to\gamma}|u(z)| = 0 \text{ for any } \gamma\in\Gamma\Bigr\}. \tag{1.3}$$
Sometimes the operator $H$ is called the standard Aharonov–Bohm Hamiltonian (see [5]). The Hamiltonian $H$ describes the motion of a non-relativistic charged quantum particle moving in the Euclidean plane in the presence of a homogeneous magnetic field $B$ plus magnetic fields created by periodically placed infinitesimally thin solenoids, provided that the mass $m = 1/2$, the Planck constant (divided by $2\pi$) $\hbar = 1$ and the charge of an electron $e = 1$. A similar situation occurs experimentally in GaAs/AlGaAs heterostructures coated with a film of type-II superconductors (see [6, 7]). The boundary conditions $\lim_{z\to\gamma}|u(z)| = 0$ are interpreted as the
repulsive conditions, i.e. the solenoids are electrically shielded and no electron can penetrate inside them. The model of infinitesimally thin solenoids is known to be a physical model which explains the Aharonov–Bohm effect [8], and has been extensively studied by many authors (see, e.g., [1–5, 9–15] and references therein). In particular, the model of periodic solenoids has been studied by the following authors: Geyler–Grishanov [2] studied the zero modes in the absence or the presence of a homogeneous magnetic field, and Geyler–Šťovíček [3] studied the same subject in more detail (moreover, Geyler–Šťovíček [13] studied the same subject on the Lobachevsky plane); Melgaard–Ouhabaz–Rozenblum [4] obtained the diamagnetic inequality, the Lieb–Thirring inequality and Hardy type inequalities; one of the authors [12] studied the spectrum in a gap between two consecutive Landau levels; Rozenblum–Shirokov [14] studied the zero modes of the Pauli operator when the magnetic field is a signed Borel measure (including the point measure case); Iwai–Yabu [15] studied the operator from the viewpoint of the flat connection on a punctured two-dimensional torus. We also note that there are some results about Schrödinger operators with a constant magnetic field plus point interactions (not point magnetic fields) on a lattice; see [16–20].

The zero modes of the operator $H^+_{\max}(a)$ defined in [3, Sec. 8.5] correspond to the lowest Landau level. But the boundary conditions of the operator $H^+_{\max}(a)$ and those of our operator $H$ are different; in fact, the operator $H^+_{\max}(a)$ admits functions singular at points in $\Gamma$ (our operator $H$ corresponds to $H^+_{\min}(a) + b_0$ in [3]). However, their method is applicable to our operator $H$, and gives us the condition for the Landau levels to be infinitely degenerated eigenvalues. Our aim is to develop the methods of [3, 12] by combining them with the magnetic Bloch theory.
Consequently, we obtain (i) more detailed information about the spectrum, particularly around the Landau levels, and (ii) an estimate for the density of states for the Landau levels and their gaps.

1.2. Notations

Before stating our results, we introduce some notation used in the present paper. For any positive integer $n$, the number $E_n$ denotes the $n$th Landau level, i.e. $E_n = (2n-1)B$. The pair of vectors $\{\omega_1, \omega_2\}$ denotes a basis of $\Gamma$, i.e. $\Gamma = \omega_1\mathbb{Z}\oplus\omega_2\mathbb{Z}$. We always assume $\operatorname{Im}(\omega_2/\omega_1) > 0$. The set $\Omega$ denotes a fundamental domain of $\Gamma$ defined by
$$\Omega = \Bigl\{ z = s\omega_1 + t\omega_2 \Bigm| -\tfrac12 \le s < \tfrac12,\ -\tfrac12 \le t < \tfrac12 \Bigr\}.$$
The set $\Omega'$ denotes a fundamental domain of $\Gamma'$ defined similarly. For a measurable set $E$ in $\mathbb{R}^2$, the number $|E|$ denotes the Lebesgue measure of $E$. The number $R$ denotes the minimal distance between two different lattice points, i.e.
$$R = \min_{\gamma\in\Gamma,\ \gamma\neq 0}|\gamma|.$$
The system of vectors $\{\gamma_1, \ldots, \gamma_K\}$ denotes a complete system of representatives of the quotient group $\Gamma/\Gamma'$, where $K = \#(\Gamma/\Gamma')$. We always assume $\gamma_k\in\Omega'$ ($k = 1, \ldots, K$). The number $\bar\alpha$ denotes the average of $\{\alpha_\gamma\}$, i.e.
$$\bar\alpha = \frac{1}{K}\sum_{k=1}^{K}\alpha_{\gamma_k}.$$
The density of states measure $\rho$ is a Borel measure on $\mathbb{R}$ satisfying
$$\int_{\mathbb{R}} f(\lambda)\,d\rho(\lambda) = \frac{\operatorname{tr}(\chi_{\Omega'} f(H)\chi_{\Omega'})}{|\Omega'|} \tag{1.4}$$
for every $f\in C_0(\mathbb{R})$ (the compactly supported continuous functions on $\mathbb{R}$), where $\chi_{\Omega'}$ is the characteristic function of $\Omega'$. The existence of the measure $\rho$ is guaranteed by the Riesz representation theorem. Notice that the equality
$$\int_{\mathbb{R}} f(\lambda)\,d\rho(\lambda) = \lim_{U\to\mathbb{R}^2}\frac{\operatorname{tr}(\chi_U f(H)\chi_U)}{|U|}$$
holds in an appropriate sense (e.g., $U = n\Omega$ and $n\to\infty$), because of the periodicity of the magnetic field. For a Borel measurable set $I$ in $\mathbb{R}$, we denote
$$\rho(I) = \int_I d\rho(\lambda) = \frac{\operatorname{tr}(\chi_{\Omega'} P_I(H)\chi_{\Omega'})}{|\Omega'|},$$
where $P_I(H)$ denotes the spectral projection of $H$ corresponding to $I$. The condition
$$\frac{B|\Omega|}{2\pi} + \bar\alpha \in \mathbb{Q} \tag{1.5}$$
is called the rational flux condition. The number on the left-hand side of (1.5) is the average of the magnetic flux in a fundamental domain divided by $2\pi$.

1.3. Results

Our first result is the following.

Theorem 1.1. The following holds:

(i) Assume $\frac{B|\Omega|}{2\pi} + \bar\alpha > n$ for some positive integer $n$. Then, $E_n$ is an infinitely degenerated eigenvalue of $H$.

(ii) Assume $\frac{B|\Omega|}{2\pi} + \bar\alpha < 1$. Then, $E_1$ ($= B$) is not an eigenvalue of $H$. If we additionally assume the rational flux condition (1.5), then there exists a positive number $\epsilon$ such that $\sigma(H)\subset[B+\epsilon,\infty)$.

(iii) Assume $\frac{B|\Omega|}{2\pi} + \bar\alpha = 1$. Then, $E_1$ is not an eigenvalue of $H$, and $E_1$ is the edge of the purely absolutely continuous spectrum, i.e. there exists a constant $E$ such that $B < E \le 3B$, $[B, E]\subset\sigma(H)$ and $\operatorname{Ran} P_{[B,E)}(H)\subset\mathcal{H}_{\mathrm{ac}}$, where $\mathcal{H}_{\mathrm{ac}}$ denotes the absolutely continuous subspace for the operator $H$.
(iv) Assume $\frac{B|\Omega|}{2\pi} + \bar\alpha = 1$ and $\Gamma = \Gamma'$, i.e. $\{\alpha_\gamma\}$ is a constant sequence. Then, there is only one band of absolutely continuous spectrum below the second Landau level $E_2$ ($= 3B$), i.e. the number $E$ given in (iii) can be taken so that
$$\sigma(H)\cap[B,3B) = [B,E]\cap[B,3B), \qquad \operatorname{Ran} P_{[B,E]\cap[B,3B)}(H)\subset\mathcal{H}_{\mathrm{ac}}.$$
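The hypotheses of Theorem 1.1 depend only on the flux quantity on the left-hand side of (1.5). As a minimal illustrative sketch (the function names `landau_level`, `average_flux` and `levels_covered_by_part_i` are hypothetical helpers, not notation from the paper), the following Python code evaluates that quantity and lists the indices $n$ for which part (i) applies:

```python
import math


def landau_level(n, B):
    """n-th Landau level E_n = (2n - 1) B, n = 1, 2, ..."""
    return (2 * n - 1) * B


def average_flux(B, cell_area, alphas):
    """Left-hand side of (1.5): B|Omega|/(2 pi) + average of the alpha values.

    cell_area is |Omega|; alphas lists alpha_{gamma_k} for k = 1..K.
    """
    return B * cell_area / (2 * math.pi) + sum(alphas) / len(alphas)


def levels_covered_by_part_i(B, cell_area, alphas):
    """Indices n with average_flux > n, i.e. the levels E_n to which
    Theorem 1.1(i) applies (strict inequality, as in the statement)."""
    flux = average_flux(B, cell_area, alphas)
    n_max = math.ceil(flux) - 1  # largest integer strictly below flux
    return list(range(1, n_max + 1))


# Hypothetical sample data: unit cell area, two solenoids per period cell.
B = 2 * math.pi * 2.25
print(average_flux(B, 1.0, [0.25, 0.5]))
print(levels_covered_by_part_i(B, 1.0, [0.25, 0.5]))
```

The threshold case of parts (iii) and (iv) is an exact equality and is therefore not tested in floating point here.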
Notice that the inequality $H\ge B$ (see [12, Proposition 3.3(iii)]) implies that $\sigma(H)\subset[B,\infty)$. The assertion (ii) is quite different from the corresponding result in [3]; the operator $H^+_{\max}(a)$ always has zero modes (see [3, Theorem 8.16]). This fact reflects the difference between our boundary conditions and theirs; ours are the repulsive conditions, while theirs are the attractive conditions. We remark that a similar situation also occurs when $B = 0$ (see the remark after [4, Proposition 7.7]). It is natural to ask whether the sufficient condition given in (i) is also necessary even when $n\ge 2$, but we do not know the answer at present. The assertions (iii) and (iv) are remarkable from the viewpoint of solid state physics; they mean that, if the threshold condition holds and if the Fermi energy is close to the lowest Landau level, then the system has a non-zero conductance caused by the Aharonov–Bohm effect.

Our second result is the following:

Theorem 1.2. (i) Assume the rational flux condition (1.5). Then, we have
$$\frac{B}{2\pi} - \frac{n}{|\Omega|} \le \rho(\{E_n\}) \le \frac{B}{2\pi} + \frac{1}{|\Omega|}, \tag{1.6}$$
$$\rho((E_n, E_{n+1})) \le \frac{n}{|\Omega|} \tag{1.7}$$
for any positive integer $n$.

(ii) For any positive integer $n_0$, there exist positive constants $R_0$ and $c$, dependent only on $n_0$, $B$ and $\{\alpha_\gamma\}$, satisfying the following conditions: If $R\ge R_0$, then there exist closed sets $S_1, \ldots, S_{n_0}$ satisfying
$$\sigma(H)\cap(-\infty, E_{n_0+1}) = \bigcup_{n=1}^{n_0}\bigl(\{E_n\}\cup S_n\bigr), \tag{1.8}$$
and
$$S_n \subset \bigcup_{k=1}^{K}\bigl[E_n + 2\alpha_{\gamma_k}B - e^{-cR^2},\ E_n + 2\alpha_{\gamma_k}B + e^{-cR^2}\bigr], \tag{1.9}$$
$$\rho(S_n) = \frac{n}{|\Omega|}, \tag{1.10}$$
$$\frac{B}{2\pi} - \frac{n}{|\Omega|} \le \rho(\{E_n\}) \le \frac{B}{2\pi} - \frac{n-1}{|\Omega|}, \tag{1.11}$$
for $n = 1, \ldots, n_0$. In particular, the infinitely degenerated eigenvalues $E_1, \ldots, E_{n_0}$ are isolated, if $R$ is sufficiently large.
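To make the bounds of Theorem 1.2 concrete, here is a small sketch that evaluates (1.6) and (1.11) and the centres $E_n + 2\alpha_{\gamma_k}B$ of the intervals in (1.9). The helper names and sample numbers are illustrative assumptions only. The constant $c$ in (1.9) is not given explicitly by the theorem, so interval widths are not computed. The lower bound is clamped at zero because $\rho$ is a non-negative measure.

```python
import math


def dos_bounds(n, B, cell_area, large_separation=False):
    """Lower and upper bounds on rho({E_n}).

    Uses (1.6) by default, or the sharper upper bound of (1.11) when the
    caller assumes R >= R_0 (large_separation=True). cell_area is |Omega|.
    """
    lower = B / (2 * math.pi) - n / cell_area
    if large_separation:
        upper = B / (2 * math.pi) - (n - 1) / cell_area
    else:
        upper = B / (2 * math.pi) + 1 / cell_area
    # rho is a measure, so a negative lower bound carries no information.
    return max(lower, 0.0), upper


def gap_cluster_centres(n, B, alphas):
    """Centres E_n + 2 alpha_k B of the intervals appearing in (1.9)."""
    E_n = (2 * n - 1) * B
    return sorted(E_n + 2 * a * B for a in alphas)


# Hypothetical sample data: B = 2 pi, |Omega| = 4.
print(dos_bounds(1, 2 * math.pi, 4.0))
print(dos_bounds(1, 2 * math.pi, 4.0, large_separation=True))
print(gap_cluster_centres(1, 1.0, [0.5, 0.25]))
```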
Notice that $\rho((-\infty, B)) = 0$ since $H\ge B$. Notice also that the rational flux condition is not necessary for the second assertion. The value $E_n + 2\alpha B$ is the unique eigenvalue of the single solenoid operator $H_1^\alpha$ in the $n$th Landau gap $(E_n, E_{n+1})$ (for the definition of $H_1^\alpha$, see the proof of Lemma 3.2 below).

A physical interpretation of the above theorem is as follows. In a homogeneous magnetic field, a classical electron makes a cyclotron motion. It is suggested in [9, 12] that the energy of an electron turning around a solenoid is shifted by the Aharonov–Bohm effect, and thus eigenvalues in Landau gaps appear. According to the intuitive computation in [12], there are about $n$ electrons with energy $E_n$ turning around a solenoid. Since the density of states for each Landau level is $\frac{B}{2\pi}$, we conclude that there are $n$ “trapped” electrons and $\frac{B}{2\pi}|\Omega| - n$ “non-trapped” electrons with energy $E_n$ in a fundamental domain $\Omega$. This explanation is roughly consistent with (1.10) and (1.11).

The last statement in the second assertion seems peculiar in some sense; in general, the Landau levels are believed to be broadened by a periodic perturbation (e.g., [21]), or a random perturbation (e.g., [22]). We think the isolation of an infinitely degenerated eigenvalue is characteristic of widely separated periodic pointlike perturbations; we also think a similar situation occurs in the case of the periodic point interaction treated in [19].

The present paper is organized as follows. In Sec. 2, we review some properties of the Weierstrass functions and an estimate for the growth rate of the Weierstrass $\sigma$ function by Perelomov [23]. In Sec. 3, we review the magnetic Bloch theory, and apply the commutation method used in [12] to operators on fiber spaces. In Sec. 4, we prove Theorem 1.1. In Sec. 5, we prove Theorem 1.2.
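2. Weierstrass Functions

Let $\Gamma = \omega_1\mathbb{Z}\oplus\omega_2\mathbb{Z}$ be a lattice of rank 2 with $\operatorname{Im}(\omega_2/\omega_1) > 0$. Define a meromorphic function $\zeta$ and an entire function $\sigma$ by
$$\zeta(z) = \frac{1}{z} + \sum_{\gamma\in\Gamma\setminus\{0\}}\left(\frac{1}{z-\gamma} + \frac{1}{\gamma} + \frac{z}{\gamma^2}\right),$$
$$\sigma(z) = z\prod_{\gamma\in\Gamma\setminus\{0\}}\left(1 - \frac{z}{\gamma}\right)e^{\frac{z}{\gamma} + \frac{z^2}{2\gamma^2}}.$$
The function $\zeta(z)$ is a meromorphic function on $\mathbb{C}$ having only simple poles, whose set coincides with $\Gamma$. The function $\sigma(z)$ is an entire function having only simple zeros, whose set also coincides with $\Gamma$. When we would like to indicate the dependence on the lattice $\Gamma$ explicitly, we write $\zeta_\Gamma(z)$ and $\sigma_\Gamma(z)$ for $\zeta(z)$ and $\sigma(z)$, respectively. We denote $\eta_j = 2\zeta(\frac{\omega_j}{2})$ for $j = 1, 2$. We quote some formulas for later use (see, e.g., [24]).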
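Lemma 2.1. (i) We have
$$\frac{\sigma'(z)}{\sigma(z)} = \zeta(z). \tag{2.1}$$
(ii) We have the Legendre relation
$$\eta_1\omega_2 - \eta_2\omega_1 = 2\pi i. \tag{2.2}$$
(iii) For integers $m$ and $n$, put $\gamma = m\omega_1 + n\omega_2$ and $\eta = m\eta_1 + n\eta_2$. Then, we have
$$\zeta(z+\gamma) = \zeta(z) + \eta, \tag{2.3}$$
$$\sigma(z+\gamma) = (-1)^{m+n+mn}e^{\eta(z+\frac{\gamma}{2})}\sigma(z). \tag{2.4}$$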
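The Legendre relation (2.2) can be checked numerically from the defining series of $\zeta$. The sketch below is only an illustration: it truncates the lattice sum to the symmetric index square $|m|, |n| \le N$ (a choice made here, not taken from the paper), and the basis $\omega_1 = 1$, $\omega_2 = 0.3 + 1.2i$ is a hypothetical example with $\operatorname{Im}(\omega_2/\omega_1) > 0$. Because the truncation is symmetric under $\gamma\mapsto-\gamma$, the paired terms decay like $|\gamma|^{-4}$ and the truncation error is of order $N^{-2}$.

```python
import math


def zeta_truncated(z, w1, w2, N=60):
    """Truncated lattice sum for the Weierstrass zeta function.

    Sums 1/(z - g) + 1/g + z/g^2 over g = m*w1 + n*w2 with
    |m|, |n| <= N and g != 0, then adds the 1/z term.
    """
    total = 1 / z
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if m == 0 and n == 0:
                continue
            g = m * w1 + n * w2
            total += 1 / (z - g) + 1 / g + z / (g * g)
    return total


# Hypothetical lattice basis with Im(w2 / w1) > 0.
w1, w2 = 1 + 0j, 0.3 + 1.2j
eta1 = 2 * zeta_truncated(w1 / 2, w1, w2)
eta2 = 2 * zeta_truncated(w2 / 2, w1, w2)

# Left-hand side of (2.2); it should be close to 2*pi*i.
legendre = eta1 * w2 - eta2 * w1
print(legendre, 2j * math.pi)
```

Increasing `N` tightens the agreement at quadratic cost in running time.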
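Put
$$\mu = \frac{i}{4|\Omega|}\bigl(\eta_1\bar\omega_2 - \eta_2\bar\omega_1\bigr), \tag{2.5}$$
and put
$$\tilde\sigma(z) = e^{-\mu z^2}\sigma(z), \qquad \Phi(z) = e^{-\frac{\pi}{2|\Omega|}|z|^2}\tilde\sigma(z).$$
The function $\tilde\sigma$ is introduced in [23], and called the modified Weierstrass $\sigma$ function in [2]. We summarize some properties of $\tilde\sigma$ in the following:

Lemma 2.2. (i) We have
$$\Phi(z+\omega_j) = -e^{\frac{i\pi}{|\Omega|}\operatorname{Im}(\bar\omega_j z)}\Phi(z) \tag{2.6}$$
for $j = 1, 2$.

(ii) For $z = w + \gamma$ ($w\in\Omega$, $\gamma\in\Gamma$), we have
$$|\tilde\sigma(z)| \le e^{\frac{\pi}{2|\Omega|}|z|^2}|\tilde\sigma(w)|, \tag{2.7}$$
$$|\tilde\sigma(z)| \ge Ce^{\frac{\pi}{2|\Omega|}|z|^2}|\tilde\sigma(w)|, \tag{2.8}$$
where $C = \inf_{w\in\Omega}e^{-\frac{\pi}{2|\Omega|}|w|^2}$.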
Proof. (i) We can prove (2.6) by direct computation using (2.2), (2.3) and the equality $|\Omega| = -\frac{i}{2}(\omega_2\bar\omega_1 - \omega_1\bar\omega_2)$.

(ii) This assertion follows immediately from the periodicity of $|\Phi(z)|$.

3. Magnetic Bloch Theory and CCR

In this section, we assume the rational flux condition (1.5) and review the magnetic Bloch theory briefly. Moreover, we investigate some properties of the operators $A$, $A^\dagger$ defined by
$$A = 2\partial_z + \phi(z), \qquad A^\dagger = -2\partial_{\bar z} + \overline{\phi(z)},$$
where $\partial_z = \frac{\partial_x - i\partial_y}{2}$, $\partial_{\bar z} = \frac{\partial_x + i\partial_y}{2}$, and $\phi(z)$ is the function given by (1.2). These operators satisfy the canonical commutation relations:
$$\mathcal{L} = A^\dagger A + B = AA^\dagger - B. \tag{3.1}$$
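3.1. Magnetic Bloch theory

Replacing a period lattice $\Gamma'$ by its sublattice, we can assume
$$\frac{B|\Omega'|}{2\pi} + K\bar\alpha \in \mathbb{Z}. \tag{3.2}$$
Let $\{\nu_1, \nu_2\}$ be a basis of $\Gamma'$ satisfying $\operatorname{Im}(\nu_2/\nu_1) > 0$ and put $\eta_j = 2\zeta_{\Gamma'}(\frac{\nu_j}{2})$ ($j = 1, 2$). Define two operators $\{t_{\nu_j}\}_{j=1,2}$ by
$$t_{\nu_j}u(z) = e^{-i\operatorname{Im}\left(\frac{B}{2}\bar\nu_j z + K\bar\alpha\eta_j z\right)}u(z-\nu_j). \tag{3.3}$$
Then we can prove by (2.3) that
$$At_{\nu_j} = t_{\nu_j}A, \qquad A^\dagger t_{\nu_j} = t_{\nu_j}A^\dagger, \qquad \mathcal{L}t_{\nu_j} = t_{\nu_j}\mathcal{L} \tag{3.4}$$
for $j = 1, 2$. Moreover, we can prove by (2.2) that
$$t_{\nu_1}t_{\nu_2} = e^{-i(|\Omega'|B + 2\pi K\bar\alpha)}t_{\nu_2}t_{\nu_1}. \tag{3.5}$$
Thus the two operators $\{t_{\nu_j}\}_{j=1,2}$ commute with each other under the condition (3.2). For $\nu = m\nu_1 + n\nu_2\in\Gamma'$ ($m, n\in\mathbb{Z}$), define
$$t_\nu = t_{\nu_1}^m t_{\nu_2}^n. \tag{3.6}$$
Then, the operator $t_\nu$ commutes with $A$, $A^\dagger$ or $\mathcal{L}$, and the equality $t_{\nu_1+\nu_2} = t_{\nu_1}t_{\nu_2}$ holds for any $\nu_1, \nu_2\in\Gamma'$. The operator $t_\nu$ is called the magnetic translation operator.

In the sequel, we denote the real inner product of two complex numbers $z = x + iy$ and $z' = x' + iy'$ by $z\cdot z' = \operatorname{Re}(\bar z z') = xx' + yy'$. Let $\{\nu_j^*\}_{j=1,2}$ be complex numbers satisfying $\nu_j\cdot\nu_k^* = 2\pi\delta_{jk}$, where $\delta_{jk}$ is the Kronecker delta. Then the lattice $\Gamma^* = \nu_1^*\mathbb{Z}\oplus\nu_2^*\mathbb{Z}$ is called the dual lattice of $\Gamma'$. Let $\Omega^*$ be a fundamental domain of $\Gamma^*$ defined by
$$\Omega^* = \Bigl\{ s\nu_1^* + t\nu_2^* \Bigm| -\tfrac12 \le s < \tfrac12,\ -\tfrac12 \le t < \tfrac12 \Bigr\}.$$
For $\theta\in\Gamma^*$, define a Hilbert space $\mathcal{H}_\theta$ by
$$\mathcal{H}_\theta = \{u\in L^2_{\mathrm{loc}}(\mathbb{R}^2) \mid t_\nu u = e^{i\theta\cdot\nu}u \text{ for any } \nu\in\Gamma'\}, \qquad \|u\|^2_{\mathcal{H}_\theta} = \int_{\Omega'}|u|^2\,dxdy.$$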
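The dual basis $\{\nu_1^*, \nu_2^*\}$ can be written down explicitly in complex notation. The following sketch is illustrative only: the closed-form expressions in `dual_basis` are one convenient way to satisfy $\nu_j\cdot\nu_k^* = 2\pi\delta_{jk}$, and the basis used at the end is a hypothetical example. It also confirms the area identity $|\Omega'||\Omega^*| = (2\pi)^2$ used in Sec. 5.

```python
import math


def real_dot(z, w):
    """Real inner product z . w = Re(conj(z) * w) = x x' + y y'."""
    return (z.conjugate() * w).real


def cell_area(v1, v2):
    """Area of the parallelogram spanned by v1, v2 (as complex numbers)."""
    return abs((v1.conjugate() * v2).imag)


def dual_basis(nu1, nu2):
    """Return (nu1*, nu2*) with nu_j . nu_k* = 2 pi delta_jk."""
    signed_area = (nu1.conjugate() * nu2).imag  # positive if Im(nu2/nu1) > 0
    nu1_star = -2j * math.pi * nu2 / signed_area
    nu2_star = 2j * math.pi * nu1 / signed_area
    return nu1_star, nu2_star


# Hypothetical basis of the period lattice.
nu1, nu2 = 2 + 0j, 0.5 + 1.5j
s1, s2 = dual_basis(nu1, nu2)
print(real_dot(nu1, s1), real_dot(nu1, s2), real_dot(nu2, s1), real_dot(nu2, s2))
print(cell_area(nu1, nu2) * cell_area(s1, s2), (2 * math.pi) ** 2)
```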
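Define three linear operators $L_\theta$, $A_\theta$, $A^\dagger_\theta$ on $\mathcal{H}_\theta$ by
$$L_\theta u = \mathcal{L}u, \qquad A_\theta u = Au, \qquad A^\dagger_\theta u = A^\dagger u,$$
$$D(L_\theta) = D(A_\theta) = D(A^\dagger_\theta) = D_\theta, \qquad D_\theta = \{u\in C^\infty(\mathbb{R}^2)\cap\mathcal{H}_\theta;\ \operatorname{supp}u\cap\Gamma = \emptyset\}.$$
The above operators are well-defined by virtue of (3.4). We denote the Friedrichs extension of $L_\theta$ by $H_\theta$. The following lemma can be proved by a standard technique, so we omit the proof (see, e.g., [25]).

Lemma 3.1. (i) The Hilbert space $L^2(\mathbb{R}^2)$ is represented as
$$L^2(\mathbb{R}^2) = \int^{\oplus}_{\Omega^*}\mathcal{H}_\theta\,\frac{d\theta}{|\Omega^*|}, \tag{3.7}$$
where the right-hand side is the direct integral of Hilbert spaces $\{\mathcal{H}_\theta\}$. Correspondingly, the operator $H$ is represented as
$$H = \int^{\oplus}_{\Omega^*}H_\theta\,\frac{d\theta}{|\Omega^*|}. \tag{3.8}$$
Moreover, we have
$$\sigma(H) = \bigcup_{\theta\in\Omega^*}\sigma(H_\theta). \tag{3.9}$$
(ii) The operator $H_\theta$ has compact resolvent for any $\theta\in\Omega^*$. If we denote the $j$th eigenvalue (counting multiplicity) of $H_\theta$ by $\lambda_j(\theta)$, then $\lambda_j$ is continuous on $\Omega^*$, analytic with respect to the two variables $\theta = (\theta_1, \theta_2)$ in the region where $\lambda_j(\theta)$ is different from the other $\lambda_k(\theta)$, and we have
$$\sigma(H) = \bigcup_{j=1}^{\infty}I_j, \tag{3.10}$$
where $I_j = \bigcup_{\theta\in\Omega^*}\{\lambda_j(\theta)\}$.

(iii) For any compactly supported, bounded and Borel measurable function $f$ on $\mathbb{R}$, we have
$$\int_{\mathbb{R}}f(\lambda)\,d\rho(\lambda) = \frac{1}{(2\pi)^2}\int_{\Omega^*}\operatorname{tr}f(H_\theta)\,d\theta, \tag{3.11}$$
where $\rho$ is the density of states measure defined by (1.4).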
3.2. CCR on fiber spaces

Let us summarize some properties of the operators $A_\theta$ and $A^\dagger_\theta$. In the sequel, we denote $N(I; H) = \dim\operatorname{Ran}P_I(H)$.

Lemma 3.2. (i) The deficiency indices of $L_\theta$ are $(2K, 2K)$.

(ii) There exists a self-adjoint extension $H_\theta^-$ of $L_\theta$ satisfying
$$H_\theta = A_\theta^*\,\overline{A_\theta} + B = {A^\dagger_\theta}^*\,\overline{A^\dagger_\theta} - B, \tag{3.12}$$
$$H_\theta^- = \overline{A^\dagger_\theta}\,{A^\dagger_\theta}^* + B. \tag{3.13}$$
(iii) We have
$$\dim D(H_\theta)/(D(H_\theta)\cap D(H_\theta^-)) = K. \tag{3.14}$$
(iv) We have a unitary equivalence relation
$$H_\theta^-\big|_{\operatorname{Ker}(H_\theta^- - B)^\perp} \simeq H_\theta + 2B. \tag{3.15}$$
In particular, we have
$$N(I + 2B; H_\theta^-) = N(I; H_\theta) \tag{3.16}$$
for any Borel measurable set $I$ not including the point $-B$.

(v) We have
$$N((E_n, E_{n+1}); H_\theta) \le nK \tag{3.17}$$
for any positive integer $n$.

Proof. One of the authors proved in [12] that similar assertions hold for a Schrödinger operator with a constant magnetic field plus $K$ pointlike magnetic fields. The above assertions can be proved in the same way, so we give only an outline of the proof. For $0 < \alpha < 1$ and $B > 0$, define a linear operator $L_1^\alpha$ by
$$L_1^\alpha u = \left(\frac{1}{i}\nabla + a\right)^2 u, \qquad a(z) = (\operatorname{Im}\phi(z), \operatorname{Re}\phi(z)), \qquad \phi(z) = \frac{B}{2}\bar z + \frac{\alpha}{z},$$
$$D(L_1^\alpha) = C_0^\infty(\mathbb{R}^2\setminus\{0\}).$$
We denote the Friedrichs extension of $L_1^\alpha$ by $H_1^\alpha$.
(i) Since the operator $L_\theta$ is positive, the deficiency indices $m_\pm$ are equal by [26, Corollary of Theorem X.1]. We can prove there exists a vector space isomorphism
$$D(L_\theta^*)/D(L_\theta) \simeq \bigoplus_{k=1}^{K}D\bigl((L_1^{\alpha_{\gamma_k}})^*\bigr)/D\bigl(L_1^{\alpha_{\gamma_k}}\bigr), \tag{3.18}$$
whose definition is similar to (42) in [12] (notice that there are $K$ solenoids in $\Omega'$). The dimension of the left-hand side of (3.18) is equal to $m_+ + m_-$, and that of the right-hand side is equal to $4K$ by the result of [5].

(ii) The equalities (3.12) hold since the form domains of the three operators are equal. The equality (3.13) follows from the operator inclusion $\overline{A^\dagger_\theta}\,{A^\dagger_\theta}^* + B\supset L_\theta$ and [12, Lemma 3.2(i)].

(iii) By determining $D(H_\theta)$ and $D(H_\theta^-)$ explicitly, as (49) and (53) in [12].

(iv) By applying [12, Lemma 3.2(ii)] to $A^\dagger$, combining with (ii) of this lemma.

(v) By (iii) of this lemma and [12, Lemma 3.5(ii)], we have
$$N((E_n, E_{n+1}); H_\theta) \le N((E_n, E_{n+1}); H_\theta^-) + K$$
for any positive integer $n$. By (iv) of this lemma, we have
$$N((E_n, E_{n+1}); H_\theta) = N((E_{n+1}, E_{n+2}); H_\theta^-)$$
for any nonnegative integer $n$, where $E_0 = -B$. Thus the assertion follows from an inductive argument using the above expressions and the fact $N((-B, B); H_\theta) = 0$.

4. Proof of Theorem 1.1

For simplicity, we prove Theorem 1.1 in the case $\Gamma = \Gamma'$, that is, $\alpha_\gamma = \alpha$ for any $\gamma\in\Gamma$. In this case, the function $\phi(z)$ defined by (1.2) is written as
$$\phi(z) = \frac{B}{2}\bar z + \alpha\zeta(z).$$

Lemma 4.1. Assume $\Gamma = \Gamma'$. Then the following holds:

(i) For any positive integer $n$ and an entire function $f$, put
$$u(z) = (A^\dagger)^{n-1}\Bigl(e^{-\frac{B}{4}|z|^2}|\sigma(z)|^{-\alpha}\,\overline{\sigma(z)^n f(z)}\Bigr). \tag{4.1}$$
If $u\in L^2(\mathbb{R}^2)$, then we have $u\in D(H)$ and $Hu = E_n u$.

(ii) If $u\in D(H)$ and $Hu = E_1 u$, then there exists an entire function $f$ satisfying (4.1) with $n = 1$.

Remark. When $n = 1$, the solution (4.1) is different from the solution (5) in [2] or (66) in [3] by the factor $\overline{\sigma(z)}$, because of the difference between the boundary conditions.

Proof. In the sequel, we denote the inner product on $L^2(\mathbb{R}^2)$ by $(u, v) = \int_{\mathbb{R}^2}\bar u v\,dxdy$, and the $L^2$-norm by $\|u\|^2 = (u, u)$.
(i) By (2.1), we have
$$A = e^{-\frac{B}{4}|z|^2}\sigma(z)^{-\frac{\alpha}{2}}\,(2\partial_z)\,e^{\frac{B}{4}|z|^2}\sigma(z)^{\frac{\alpha}{2}}. \tag{4.2}$$
Put
$$v(z) = e^{-\frac{B}{4}|z|^2}\sigma(z)^{-\frac{\alpha}{2}}\,\overline{\sigma(z)}^{\,n-\frac{\alpha}{2}}\,\overline{f(z)}.$$
By (3.1) and (4.2), we have $(\mathcal{L} - B)v = A^\dagger Av = 0$. Then we can prove $\mathcal{L}(A^\dagger)^j v = E_{j+1}(A^\dagger)^j v$ for any nonnegative integer $j$, by an inductive argument using (3.1). Thus we have $\mathcal{L}u = E_n u$. If $u\in L^2(\mathbb{R}^2)$, then we have $\mathcal{L}u = E_n u\in L^2(\mathbb{R}^2)$. Using (2.1), we can check that the right-hand side of (4.1) satisfies the boundary conditions $\lim_{z\to\gamma}|u(z)| = 0$ for every $\gamma\in\Gamma$. By (1.3), we have $u\in D(H)$.

(ii) Let $u\in D(H)$ and $Hu = Bu$. Since $H$ is the Friedrichs extension of $L$, (3.1) implies that $((H - B)u, u) = (A^\dagger Au, u) = \|Au\|^2$. Thus, we have
$$Au = 0 \quad\text{in } \mathbb{R}^2\setminus\Gamma. \tag{4.3}$$
By (4.2), any solution to (4.3) is (at least locally) written as
$$u(z) = e^{-\frac{B}{4}|z|^2}\sigma(z)^{-\frac{\alpha}{2}}\,\overline{g(z)}, \tag{4.4}$$
where $g(z)$ is a (possibly multi-valued) holomorphic function on $\mathbb{C}\setminus\Gamma$. Since the left-hand side of (4.4) is single-valued and satisfies the boundary conditions $\lim_{z\to\gamma}|u(z)| = 0$ ($\gamma\in\Gamma$), we see that the function $g$ has to be factorized as $g(z) = \sigma(z)^{1-\frac{\alpha}{2}}f(z)$, where $f(z)$ is an entire function on $\mathbb{C}$. Thus the assertion holds.

Remark. It is natural to ask whether all the solutions of $Hu = E_n u$ ($u\in D(H)$) are written as (4.1); (ii) of the above lemma asserts that this is true when $n = 1$. However, it may be false when $n\ge 2$, because there may be a solution $u$ satisfying $A^n u\neq 0$ and $A^\dagger A^n u = 0$; neither the existence nor the nonexistence of a solution of this type has been proved so far.

Proof of Theorem 1.1 assuming $\Gamma = \Gamma'$. (i) Let $\mu$ be the constant given by (2.5) and let
$$f(z) = P(z)e^{(\alpha-n)\mu z^2},$$
where $P(z)$ is an arbitrary polynomial. Let $u$ be the function given by (4.1) with the above $f$. By the Leibniz rule, (2.1), (2.3) and (2.7), we see that the absolute value of $u$ is bounded by $Q(z)e^{d|z|^2}$, where $Q(z)$ is some function of polynomial order and $d = -\frac{B}{4} + \frac{\pi(n-\alpha)}{2|\Omega|}$. Since $d$ is negative by assumption, the solution $u$ belongs to $L^2(\mathbb{R}^2)$ for any choice of the polynomial $P(z)$. Thus the assertion holds.
(ii) Let $f(z)$ be an arbitrary entire function which is not identically equal to 0, and let
$$u(z) = e^{-\frac{B}{4}|z|^2}|\sigma(z)|^{-\alpha}\,\overline{\sigma(z)f(z)}. \tag{4.5}$$
We write $f(z) = e^{(\alpha-1)\mu z^2}g(z)$. Let $\epsilon$ be a positive number satisfying $\epsilon < \frac{R}{4}$. By (2.8), we have
$$|u(z)| = e^{-\frac{B}{4}|z|^2}|\tilde\sigma(z)|^{1-\alpha}|g(z)| \ge Ce^{d|z|^2}|g(z)| \tag{4.6}$$
for a complex number $z$ satisfying $\operatorname{dist}(z, \Gamma)\ge\epsilon$, where $C$ is a positive constant independent of $z$ and $d = -\frac{B}{4} + \frac{\pi(1-\alpha)}{2|\Omega|}$. Notice that $d$ is positive by assumption. Since $g$ is entire and not identically equal to 0, we see that $g$ is not square integrable on $\mathbb{R}^2$. Moreover, we can prove
$$\int_{\operatorname{dist}(z,\Gamma)\ge\epsilon}|g(z)|^2\,dxdy = \infty, \tag{4.7}$$
with the help of the mean value theorem. By (4.6), we see that $u$ is not square integrable on $\mathbb{R}^2$.

Assume additionally the rational flux condition (1.5). The proof of (ii) of Lemma 4.1 also implies that any solution to $H_\theta u = Bu$ can be written as (4.5). By (4.6), such a solution $u$ cannot belong to $\mathcal{H}_\theta$ for any $\theta\in\Omega^*$. Thus we have $\lambda_1(\theta) > B$ for any $\theta\in\Omega^*$, and therefore the conclusion follows from (3.10).

(iii) Under the assumption $\frac{B|\Omega|}{2\pi} + \alpha = 1$, we can apply the magnetic Bloch theory. Put
$$u(z) = e^{-\frac{B}{4}|z|^2}|\sigma(z)|^{-\alpha}\,\overline{\sigma(z)e^{(\alpha-1)\mu z^2}}.$$
Then $u$ satisfies $\mathcal{L}u = Bu$. Using the equality
$$u(z) = \overline{\Phi(z)}\,|\Phi(z)|^{-\alpha}e^{-i\alpha\operatorname{Im}(\mu z^2)},$$
(2.2) and (2.6), one can check that
$$t_{\omega_j}u(z) = -e^{-i\alpha\operatorname{Im}(\mu\omega_j^2)}u(z)$$
for $j = 1, 2$. Thus we have $u\in\mathcal{H}_{\theta_0}$ for some $\theta_0\in\Omega^*$ and
$$\lambda_1(\theta_0) = B. \tag{4.8}$$
In particular $u\notin L^2(\mathbb{R}^2)$. Any solution of $H_\theta v = Bv$ linearly independent of $u$ is written as $v = u\bar f$, where $f$ is a non-constant entire function. We can prove that the solution $v$ cannot belong to $L^2(\mathbb{R}^2)$ by the same argument used in the proof of (ii). Thus $B$ is not an eigenvalue of $H$. Moreover, since the solution $v$ cannot be bounded, we have
$$\lambda_1(\theta) > B \tag{4.9}$$
for $\theta\neq\theta_0$. By (4.8) and (4.9), we see that the function $\lambda_1(\theta)$ is not constant in a neighborhood of $\theta_0$. This fact implies the spectrum near $B$ is purely
absolutely continuous (see, e.g., the proof of [27, Theorem XIII.100] or that of [28, Theorem 2]).

(iv) If $\Gamma = \Gamma'$ and $\frac{B|\Omega|}{2\pi} + \alpha = 1$, then we can apply (v) of Lemma 3.2 with $K = 1$. Then we have $\lambda_2(\theta)\ge 3B$ for any $\theta\in\Omega^*$. Thus there is only one band $I_1$ in the interval $[B, 3B)$.

Remark. In the general case $\Gamma\neq\Gamma'$, the solution (4.1) is replaced by
$$u(z) = (A^\dagger)^{n-1}\Bigl(e^{-\frac{B}{4}|z|^2}\,\overline{f(z)}\prod_{k=1}^{K}|\sigma_{\Gamma'}(z-\gamma_k)|^{-\alpha_k}\,\overline{\sigma_{\Gamma'}(z-\gamma_k)}^{\,n}\Bigr),$$
where $f(z)$ is an entire function. Using this solution, we can prove (i), (ii) and (iii) of Theorem 1.1 in the general case similarly.

5. Proof of Theorem 1.2

5.1. Rational flux case

To prove (i) of Theorem 1.2, we use the Weyl asymptotics for the operator $H_\theta$. Of course, it is well known when the vector potential $a$ is smooth.

Lemma 5.1. For any $\theta\in\Gamma^*$, we have
$$\lim_{\lambda\to\infty}\frac{N((-\infty,\lambda]; H_\theta)}{\lambda} = \frac{|\Omega'|}{4\pi}. \tag{5.1}$$
Proof. Take open disjoint parallelograms $O_1, \ldots, O_n$ satisfying
$$\bigcup_{j=1}^{n}O_j\subset\Omega'\subset\bigcup_{j=1}^{n}\overline{O_j}, \qquad \gamma_k\in\bigcup_{j=1}^{n}\partial O_j \quad (k = 1, \ldots, K).$$
Since $O_j$ contains no points of $\Gamma$ and $O_j$ is simply connected, the singular part of the vector potential $a$ can be gauged out in each $O_j$. By Dirichlet–Neumann bracketing (see, e.g., [27]), we have
$$\bigoplus_{j=1}^{n}U_j^*H^N_{0,O_j}U_j \le H_\theta \le \bigoplus_{j=1}^{n}U_j^*H^D_{0,O_j}U_j$$
in the form sense, where $H^D_{0,O_j}$ (resp. $H^N_{0,O_j}$) is the Dirichlet (resp. Neumann) realization of the operator $(\frac{1}{i}\nabla + a_0)^2$, $a_0 = (-\frac{B}{2}y, \frac{B}{2}x)$, and $U_j$ is the gauge transformation operator defined on $O_j$. By the min-max principle, the equality (5.1) is reduced to the Weyl asymptotics for Schrödinger operators with smooth vector potentials.

Proof of (i) of Theorem 1.2. Put
$$a_n = N(\{E_n\}; H_\theta), \qquad b_n = N((E_n, E_{n+1}); H_\theta)$$
for any positive integer $n$. By (v) of Lemma 3.2, we have
$$b_n \le nK \tag{5.2}$$
for any positive integer $n$. By (3.11) and (5.2), we have
$$\rho((E_n, E_{n+1})) \le \frac{|\Omega^*|}{(2\pi)^2}\,nK = \frac{n}{|\Omega|},$$
where we used the equalities $|\Omega'||\Omega^*| = (2\pi)^2$ and $|\Omega'| = K|\Omega|$. Thus (1.7) holds. By (iv) of Lemma 3.2, we have
$$a_n = N(\{E_{n+1}\}; H_\theta^-), \qquad b_n = N((E_{n+1}, E_{n+2}); H_\theta^-)$$
for any positive integer $n$, and $N((B, 3B); H_\theta^-) = 0$. Put $a_0 = N(\{B\}; H_\theta^-)$ and $b_0 = 0$ for convenience. Let $n$ be a positive integer. Applying [12, Lemma 3.5(ii)] (notice that this assertion also holds for a closed interval $I$) to the interval $I = [E_1, E_n]$, combined with (iii) of Lemma 3.2, we have
$$a_1 + b_1 + \cdots + b_{n-1} + a_n \ge a_0 + b_0 + \cdots + b_{n-2} + a_{n-1} - K,$$
which is equivalent to
$$b_{n-1} + a_n \ge a_0 - K. \tag{5.3}$$
Since $H_\theta$ is the Friedrichs extension of $L_\theta$, we have $H_\theta\ge H_\theta^-$ in the form sense. Comparing the number of eigenvalues less than $E_{n+1}$ by the min-max principle, we have
$$a_1 + b_1 + \cdots + a_n + b_n \le a_0 + b_0 + \cdots + a_{n-1} + b_{n-1},$$
which is equivalent to
$$a_n + b_n \le a_0. \tag{5.4}$$
The Weyl asymptotics (5.1) implies
$$\lim_{n\to\infty}\frac{a_1 + b_1 + \cdots + b_{n-1} + a_n}{n} = \frac{B|\Omega'|}{2\pi}. \tag{5.5}$$
By (5.3)–(5.5), we have
$$a_0 - K \le \frac{B|\Omega'|}{2\pi} \le a_0. \tag{5.6}$$
By (5.2), (5.3) and (5.6), we have
$$a_n \ge \frac{B|\Omega'|}{2\pi} - nK. \tag{5.7}$$
By (3.11) and (5.7), we have
$$\rho(\{E_n\}) \ge \frac{|\Omega^*|}{(2\pi)^2}\left(\frac{B|\Omega'|}{2\pi} - nK\right) = \frac{B}{2\pi} - \frac{n}{|\Omega|}. \tag{5.8}$$
Moreover, we have by (5.4) and (5.6)
$$a_n \le a_0 \le \frac{B|\Omega'|}{2\pi} + K. \tag{5.9}$$
By (3.11) and (5.9), we have
$$\rho(\{E_n\}) \le \frac{B}{2\pi} + \frac{1}{|\Omega|}. \tag{5.10}$$
Thus we obtain (1.6).

5.2. Large separation and rational flux case

First we prove (ii) of Theorem 1.2 in the rational flux case.

Lemma 5.2. Let $B_0$ be a positive constant, $n$ be a positive integer and $(\alpha_\gamma)_{\gamma\in\Gamma}$ be a periodic sequence with $0 < \alpha_\gamma < 1$ for any $\gamma\in\Gamma$. Then, there exist positive constants $\epsilon_0$, $R_0$ and $c$ dependent only on $B_0$, $(\alpha_\gamma)_{\gamma\in\Gamma}$, $n$ satisfying the following conditions: If $R\ge R_0$, $|B - B_0|\le\epsilon_0$ and the rational flux condition (1.5) holds, then, for any $\theta\in\Omega^*$, there exist subspaces $\{V_k\}_{k=1}^{K}$ of $D(H_\theta)$ such that:

(i) $\dim V_k = n$,

(ii) $\operatorname{supp}v\cap\Omega'\subset\{|z-\gamma_k|\le\frac{R}{3}\}$ for any $v\in V_k$,

(iii) $\|(H_\theta - (E_n + 2\alpha_{\gamma_k}B))v\| \le e^{-cR^2}\|v\|$ for any $v\in V_k$.

Proof. Let $H_1^\alpha$ be the operator defined in the proof of Lemma 3.2. According to [5, 9], the operator $H_1^\alpha$ has an $n$-fold eigenvalue $E_n + 2\alpha B$ and the eigenfunctions corresponding to the eigenvalue $E_n + 2\alpha B$ are given by
$$f^\alpha_{m,n}(z) = C_{m,n}\,r^{m+\alpha}L_n^{m+\alpha}\!\left(\frac{Br^2}{2}\right)e^{-\frac{Br^2}{4}}e^{im\theta}, \qquad C_{m,n} = \left(\left(\frac{B}{2}\right)^{m+\alpha+1}\frac{n!}{\pi\,\Gamma(n+m+\alpha+1)}\right)^{\frac12},$$
$$m = 0, \ldots, n-1,$$
where $z = re^{i\theta}$ is the polar coordinate and $L_n^\sigma$ is the Laguerre polynomial of order $n$. For $k = 1, \ldots, K$, let $t_{\gamma_k}$ be the magnetic translation operator from $\{|z| < \frac{R}{2}\}$ to $\{|z-\gamma_k| < \frac{R}{2}\}$ intertwining $H_1^{\alpha_k}$ with $H_\theta$ (see [12, Definition 1.1]). Take a function $\chi\in C^\infty(\mathbb{R})$ satisfying $0\le\chi\le 1$ and
$$\chi(x) = \begin{cases}1 & (x\le\frac14),\\ 0 & (x\ge\frac13).\end{cases}$$
Put $\chi_R(z) = \chi(\frac{|z|}{R})$ and put $f^k_{m,n,R} = t_{\gamma_k}(\chi_R f^{\alpha_k}_{m,n})$. Let $V_k$ be the linear hull of the functions $\{f^k_{m,n,R}\}_{m=0,\ldots,n-1}$. We can naturally regard $V_k$ as a subspace of $\mathcal{H}_\theta$. One can easily check that the subspace $V_k$ has all the desired properties.
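As a sanity check on the normalization constant $C_{m,n}$ in the proof of Lemma 5.2, the sketch below integrates $|f^\alpha_{m,n}|^2$ over the plane numerically. It assumes the formula exactly as written above (radial factor $r^{m+\alpha}L_n^{m+\alpha}(Br^2/2)e^{-Br^2/4}$); the midpoint rule, the cutoff radius and the function names are choices made for this illustration. It verifies only that $\|f^\alpha_{m,n}\|_{L^2(\mathbb{R}^2)} = 1$ and does not check the eigenvalue equation.

```python
import math


def laguerre(n, a, x):
    """Generalized Laguerre polynomial L_n^a(x) via the three-term recurrence
    (k + 1) L_{k+1} = (2k + 1 + a - x) L_k - (k + a) L_{k-1}."""
    prev, cur = 0.0, 1.0
    for k in range(n):
        prev, cur = cur, ((2 * k + 1 + a - x) * cur - (k + a) * prev) / (k + 1)
    return cur


def norm_squared(m, n, alpha, B, steps=100_000):
    """Midpoint-rule approximation of the L^2(R^2) norm squared of f_{m,n}^alpha."""
    a = m + alpha
    c_squared = (B / 2) ** (a + 1) * math.factorial(n) / (
        math.pi * math.gamma(n + a + 1)
    )
    r_max = 12 / math.sqrt(B)  # the Gaussian factor is negligible beyond this
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        radial = r ** a * laguerre(n, a, B * r * r / 2) * math.exp(-B * r * r / 4)
        total += radial * radial * r
    # The angular integral of |e^{i m theta}|^2 contributes the factor 2 pi.
    return 2 * math.pi * c_squared * total * h


# Hypothetical sample parameters.
print(norm_squared(0, 1, 0.3, 2.0))
print(norm_squared(1, 2, 0.7, 1.0))
```

Both printed values should be close to 1, which is consistent with the standard orthogonality relation for generalized Laguerre polynomials.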
Proof of (ii) of Theorem 1.2 in the rational flux case. Assume the rational flux condition (1.5) holds. Let $c$, $R_0$, $\epsilon_0$ be the constants given by Lemma 5.2 with $B_0 = B$, and assume $R\ge R_0$. Put
$$I_R = \bigcup_{k=1}^{K}\bigl[E_n + 2\alpha_k B - e^{-cR^2},\ E_n + 2\alpha_k B + e^{-cR^2}\bigr].$$
Taking $R_0$ sufficiently large, we can assume $I_{R_0}\subset(E_n, E_{n+1})$. By Lemma 5.2 and the min-max principle, we conclude that
$$N(I_R; H_\theta) \ge nK. \tag{5.11}$$
The inequality (5.11) and (v) of Lemma 3.2 imply
$$\sigma(H_\theta)\cap(E_n, E_{n+1})\subset I_R, \tag{5.12}$$
$$N(I_R; H_\theta) = nK. \tag{5.13}$$
Thus we have (1.9) and (1.8) by (3.9) and (5.12). We also have (1.10) by (3.11) and (5.13). Now let us use the notation in the proof of (i) of Theorem 1.2 again. The equality (5.13) implies
$$b_n = nK. \tag{5.14}$$
By (5.4), (5.6) and (5.14), we have
$$a_n \le a_0 - b_n \le \frac{B|\Omega'|}{2\pi} - (n-1)K. \tag{5.15}$$
Thus we have (1.11) by (3.11), (5.7) and (5.15).
5.3. Approximating lemmas

To prove (ii) of Theorem 1.2 in the general case, it is sufficient to prove the following two approximating lemmas. In the sequel, we fix a sequence $(\alpha_\gamma)_{\gamma\in\Gamma}$, and write $a_B$, $\mathcal{L}_B$, $H_B$ and $\rho_B$ for $a$, $\mathcal{L}$, $H$ and $\rho$ respectively, in order to indicate the value $B$ explicitly.

Lemma 5.3. Let $B$ be a positive number and $\{B_n\}$ be a sequence of positive numbers convergent to $B$. Then, we have
$$H_{B_n}\to H_B \tag{5.16}$$
in the strong resolvent sense.

Combining Lemma 5.3 with [29, Theorem VIII.24], we can deduce (1.9) and (1.8) in the general case from those in the rational flux case.
Proof. Define a subspace $D$ of $D(H_B)$ by
$$D = \Bigl\{u\in L^2(\mathbb{R}^2)\cap H^2_{\mathrm{loc}}(\mathbb{R}^2\setminus\Gamma) \Bigm| \mathcal{L}_B u\in L^2(\mathbb{R}^2),\ \operatorname{supp}u \text{ is bounded},\ \lim_{z\to\gamma}u(z) = 0 \text{ for any } \gamma\in\Gamma\Bigr\}.$$
We can prove that the right-hand side of the above definition is independent of $B$. We can also prove that $D$ is an operator core of $H_B$ for any $B > 0$, by a cut-off argument. Moreover, we can check that
$$H_{B_n}u\to H_B u \tag{5.17}$$
in $L^2(\mathbb{R}^2)$ as $n\to\infty$, for any $u\in D$. Thus the conclusion follows from [29, Theorem VIII.25].

Lemma 5.4. Suppose that there exist a real number $\lambda$ and positive constants $\epsilon_0$, $B_0$ and $\delta$ such that $\epsilon_0 < B_0$ and
$$\inf_{|B-B_0|\le\epsilon_0}\operatorname{dist}(\lambda, \sigma(H_B)) > 0. \tag{5.18}$$
Then, the function $B\mapsto\rho_B((-\infty,\lambda])$ is continuous in $\{|B-B_0|\le\epsilon_0\}$.

Using Lemma 5.4, we can also deduce (1.10) and (1.11) in the general case from those in the rational flux case.

Proof. By definition, we have
$$\int_{(-\infty,\lambda]}d\rho_B = \frac{\operatorname{tr}\bigl(\chi_{\Omega'}P_{(-\infty,\lambda]}(H_B)\chi_{\Omega'}\bigr)}{|\Omega'|} = \frac{\|\chi_{\Omega'}P_{(-\infty,\lambda]}(H_B)\|_2^2}{|\Omega'|},$$
where $\|\cdot\|_2$ denotes the Hilbert–Schmidt norm. Let $C$ be the counterclockwise circular path in the complex plane whose diameter is the interval $[0, \lambda]$. Since
$$P_{(-\infty,\lambda]}(H_B) = \frac{-1}{2\pi i}\oint_C(H_B - z)^{-1}\,dz,$$
it is sufficient to show that the map
$$B\mapsto\chi_{\Omega'}(H_B - z)^{-1}\in I_2 \tag{5.19}$$
is continuous uniformly in $z\in C$, where $I_2$ denotes the Hilbert–Schmidt class.
We shall divide the rest of the proof into three steps:

Step 1. There exists a positive constant $C_1$ independent of $B$ and $\alpha$ such that
\[ \|\chi_\Omega (H_B + 1)^{-1}\|_2 \le C_1. \tag{5.20} \]

Proof. We shall use the diamagnetic inequality for multivortex Aharonov–Bohm Hamiltonians, that is,
\[ |(H_B + \lambda)^{-1} u|(z) \le (-\Delta + \lambda)^{-1} |u|(z) \quad \text{a.e.} \tag{5.21} \]
for any $\lambda > 0$, which is obtained by Melgaard–Ouhabaz–Rozenblum [4]. More exactly, they obtain the semigroup form of the diamagnetic inequality
\[ |e^{-tH_B} u|(z) \le e^{t\Delta} |u|(z) \quad \text{a.e.} \tag{5.22} \]
for any $t > 0$, under the assumption that $\operatorname{rot} a$ is the (possibly infinite) sum of point measures. However, their proof can be applied to our case, and the resolvent form (5.21) can be deduced from the semigroup form (5.22) by taking the Laplace transform. By (5.21), we have a domination between integral kernels, that is,
\[ |(H_B + 1)^{-1}(z, z')| \le (-\Delta + 1)^{-1}(z, z') \quad \text{a.e.} \]
Hence (5.20) holds with $C_1 = \|\chi_\Omega (-\Delta + 1)^{-1}\|_2$.

Step 2. There exists a positive constant $C_2$ such that
\[ \|\chi_\Omega (H_B - z)^{-1}\|_2 \le C_2 \tag{5.23} \]
for any $z \in C$ and any $B$ with $|B - B_0| \le \epsilon_0$.

Proof. By the resolvent identity, we have
\[
\begin{aligned}
\|\chi_\Omega (H_B - z)^{-1}\|_2
&\le \|\chi_\Omega (H_B + 1)^{-1}\|_2 + \|\chi_\Omega (H_B + 1)^{-1}\|_2 \,\|(1 + z)(H_B - z)^{-1}\| \\
&\le C_1 \Bigl(1 + \sup_{z\in C} |1 + z| \operatorname{dist}(z, \sigma(H_B))^{-1}\Bigr).
\end{aligned}
\]
By assumption, the supremum on the right-hand side is bounded by some constant independent of $z \in C$ and $B$ in $\{|B - B_0| \le \epsilon_0\}$ (in the sequel, we use the term “uniformly bounded” in this sense).

Step 3. The map (5.19) is continuous uniformly in $z \in C$.

Proof. Put
\[ z^\perp = (-y, x), \qquad a_s = (\operatorname{Im}\psi, \operatorname{Re}\psi), \qquad \psi(z) = \sum_{k=1}^{K} \alpha_k \zeta(z - \gamma_k). \]
Then $a_B$ is written as
\[ a_B = \frac{B}{2} z^\perp + a_s. \]
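Then, for $B$ and $B'$ in $\{|B - B_0| \le \epsilon_0\}$, we have
\[
\begin{aligned}
\chi_\Omega (H_{B'} - z)^{-1} - \chi_\Omega (H_B - z)^{-1}
&= \chi_\Omega (H_{B'} - z)^{-1} (H_B - H_{B'}) (H_B - z)^{-1} \\
&= (B - B')\,\chi_\Omega (H_{B'} - z)^{-1}\, z^\perp \cdot \Bigl(\frac{1}{i}\nabla + a_{B'}\Bigr) (H_B - z)^{-1} \\
&\quad + \frac{1}{4}(B - B')^2\, \chi_\Omega (H_{B'} - z)^{-1} |z|^2 (H_B - z)^{-1}.
\end{aligned}
\tag{5.24}
\]
Put
\[ T_1 = \chi_\Omega (H_{B'} - z)^{-1}\, z^\perp \cdot \Bigl(\frac{1}{i}\nabla + a_{B'}\Bigr) (H_B - z)^{-1}, \qquad T_2 = \chi_\Omega (H_{B'} - z)^{-1} |z|^2 (H_B - z)^{-1}. \]
It is sufficient to show that the Hilbert–Schmidt norm of $T_j$ ($j = 1, 2$) is uniformly bounded. We have
\[ T_1 = -\chi_\Omega (H_{B'} - z)^{-1}\, y\, \Pi_{x,B'} (H_B - z)^{-1} + \chi_\Omega (H_{B'} - z)^{-1}\, x\, \Pi_{y,B'} (H_B - z)^{-1}, \tag{5.25} \]
where
\[ \Pi_{x,B} = \frac{1}{i}\partial_x + a_{x,B}, \qquad \Pi_{y,B} = \frac{1}{i}\partial_y + a_{y,B}. \]
Since $H_B$ is the Friedrichs extension, we have
\[ (H_B u, u) = \|\Pi_{x,B} u\|^2 + \|\Pi_{y,B} u\|^2 \]
for any $u \in D(H_B)$. By this equality, we can prove that the operators $\Pi_{x,B}(H_B - z)^{-1}$ and $\Pi_{y,B}(H_B - z)^{-1}$ are uniformly bounded. Moreover, since
\[ \chi_\Omega [(H_B - z)^{-1}, y] = 2i\,\chi_\Omega (H_B - z)^{-1} \Pi_{y,B} (H_B - z)^{-1}, \]
we see that the Hilbert–Schmidt norm of the operator $\chi_\Omega [(H_B - z)^{-1}, y]$ is uniformly bounded by Step 2. Since the first term of (5.25) is written as
\[ -y\,\chi_\Omega (H_{B'} - z)^{-1} \Pi_{x,B'} (H_B - z)^{-1} - \chi_\Omega [(H_{B'} - z)^{-1}, y]\, \Pi_{x,B'} (H_B - z)^{-1}, \]
we see that the first term of (5.25) is uniformly bounded by Step 2, and so is the second term. Therefore $T_1$ is uniformly bounded. We can prove that $T_2$ is uniformly bounded in a similar way. Therefore Lemma 5.4 is proved.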
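The passage from the semigroup form (5.22) to the resolvent form (5.21) used in Step 1 can be spelled out as follows. This is only a sketch, under the assumptions that $H_B$ is nonnegative and self-adjoint (so that its resolvent is the Laplace transform of its semigroup) and that the pointwise bound may be applied under the integral sign; neither point is detailed in the text.

```latex
% Resolvent as the Laplace transform of the semigroup (\lambda > 0):
\[
  (H_B+\lambda)^{-1}u = \int_0^\infty e^{-\lambda t}\, e^{-tH_B}u \,dt ,
  \qquad
  (-\Delta+\lambda)^{-1}v = \int_0^\infty e^{-\lambda t}\, e^{t\Delta}v \,dt .
\]
% Apply the semigroup bound (5.22) under the integral sign:
\[
  |(H_B+\lambda)^{-1}u|(z)
  \le \int_0^\infty e^{-\lambda t}\,|e^{-tH_B}u|(z)\,dt
  \le \int_0^\infty e^{-\lambda t}\,\bigl(e^{t\Delta}|u|\bigr)(z)\,dt
  = \bigl((-\Delta+\lambda)^{-1}|u|\bigr)(z)
  \quad \text{a.e.}
\]
```

Taking $\lambda = 1$ gives the kernel domination from which (5.20) is obtained.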
Acknowledgments

We thank the referee for introducing us to the references [16–18] and for giving us helpful comments. The work of T. M. is partially supported by JSPS grant Kiban B-18340049, JSPS grant Kiban C-18540215 and JSPS grant Kiban C-18540218. The work of Y. N. is partially supported by JSPS grant Kiban C-16540097 and JSPS grant Kiban C-17540148.
References

[1] A. Arai, Canonical commutation relations, the Weierstrass zeta function, and infinite-dimensional Hilbert space representations of the quantum group $U_q(sl_2)$, J. Math. Phys. 37(9) (1996) 4203–4218.
[2] V. A. Geyler and E. N. Grishanov, Zero modes in a periodic system of Aharonov–Bohm solenoids, JETP Letters 75(7) (2002) 354–356.
[3] V. A. Geyler and P. Šťovíček, Zero modes in a system of Aharonov–Bohm fluxes, Rev. Math. Phys. 16(7) (2004) 851–907.
[4] M. Melgaard, E.-M. Ouhabaz and G. Rozenblum, Negative discrete spectrum of perturbed multivortex Aharonov–Bohm Hamiltonians, Ann. Henri Poincaré 5(5) (2004) 979–1012; Errata, ibid. 6(2) (2005) 397–398.
[5] P. Exner, P. Šťovíček and P. Vytřas, Generalized boundary conditions for the Aharonov–Bohm effect combined with a homogeneous magnetic field, J. Math. Phys. 43(5) (2002) 2151–2168.
[6] S. J. Bending, K. von Klitzing and K. Ploog, Weak localization in a distribution of magnetic flux tubes, Phys. Rev. Lett. 65 (1990) 1060–1063.
[7] A. K. Geim, V. I. Falko, S. V. Dubonos and I. V. Grigorieva, Single magnetic flux tube in a mesoscopic two-dimensional electron gas conductor, Solid State Commun. 82(10) (1992) 831–836.
[8] Y. Aharonov and D. Bohm, Significance of electromagnetic potentials in the quantum theory, Phys. Rev. 115 (1959) 485–491.
[9] Y. Nambu, The Aharonov–Bohm problem revisited, Nuclear Phys. B 579(3) (2000) 590–616.
[10] H. Tamura, Norm resolvent convergence to magnetic Schrödinger operators with point interactions, Rev. Math. Phys. 13(4) (2001) 465–511.
[11] J. F. Brasche and M. Melgaard, The Friedrichs extension of the Aharonov–Bohm Hamiltonian on a disc, Integral Equations Operator Theory 52(3) (2005) 419–436.
[12] T. Mine, The Aharonov–Bohm solenoids in a constant magnetic field, Ann. Henri Poincaré 6(1) (2005) 125–154.
[13] V. A. Geyler and P. Šťovíček, Zero modes in a system of Aharonov–Bohm solenoids on the Lobachevsky plane, J. Phys. A 39(6) (2006) 1375–1384.
[14] G. Rozenblum and N. Shirokov, Infiniteness of zero modes for the Pauli operator with singular magnetic field, J. Funct. Anal. 233(1) (2006) 135–172.
[15] T. Iwai and Y. Yabu, Aharonov–Bohm quantum systems on a punctured 2-torus, J. Phys. A 39(4) (2006) 739–777.
[16] Y. Avishai, R. M. Redheffer and Y. B. Band, Electron states in a magnetic field and random impurity potential: Use of the theory of entire functions, J. Phys. A 25 (1992) 3883–3889.
[17] Y. Avishai and R. M. Redheffer, Two dimensional disordered electronic systems in a strong magnetic field, Phys. Rev. B 47(4) (1993) 2089–2100.
[18] Y. Avishai, M. Ya. Azbel and S. A. Gredeskul, Electron in a magnetic field interacting with point impurities, Phys. Rev. B 48(23) (1993) 17280–17295.
[19] V. A. Geĭler, The two-dimensional Schrödinger operator with a homogeneous magnetic field and its perturbations by periodic zero-range potentials, St. Petersburg Math. J. 3(3) (1992) 489–532.
[20] T. C. Dorlas, N. Macris and J. V. Pulé, Characterization of the spectrum of the Landau Hamiltonian with delta impurities, Comm. Math. Phys. 204(2) (1999) 367–396.
[21] J. Zak, Group-theoretical consideration of Landau level broadening in crystals, Phys. Rev. A 136(3) (1964) A776–A780.
[22] E. I. Dinaburg, Y. G. Sinai and A. B. Soshnikov, Splitting of the low Landau levels into a set of positive Lebesgue measure under small periodic perturbations, Comm. Math. Phys. 189(2) (1997) 559–575.
[23] A. M. Perelomov, Remark on the completeness of the coherent state system, Teoret. Mat. Fiz. 6(2) (1971) 213–224 (in Russian); ibid. Theoret. and Math. Phys. 6(2) (1971) 156–164 (in English).
[24] M. Abramowitz and I. A. Stegun (eds.), Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, reprint (Dover Publications, Inc., New York, 1992).
[25] A. Mohamed and G. D. Raĭkov, On the spectral theory of the Schrödinger operator with electromagnetic potential, in Pseudo-differential Calculus and Mathematical Physics, Math. Top., Vol. 5 (Akademie Verlag, Berlin, 1994), pp. 298–390.
[26] M. Reed and B. Simon, Methods of Modern Mathematical Physics. II. Fourier Analysis, Self-Adjointness (Academic Press, 1975).
[27] M. Reed and B. Simon, Methods of Modern Mathematical Physics. IV. Analysis of Operators (Academic Press, 1978).
[28] L. E. Thomas, Time dependent approach to scattering from impurities in a crystal, Comm. Math. Phys. 33 (1973) 335–343.
[29] M. Reed and B. Simon, Methods of Modern Mathematical Physics. I. Functional Analysis, 2nd edn. (Academic Press, 1980).
November 28, 2006 11:15 WSPC/148-RMP
J070-00284
Reviews in Mathematical Physics Vol. 18, No. 9 (2006) 935–970 © World Scientific Publishing Company
ENTANGLEMENT, HAAG-DUALITY AND TYPE PROPERTIES OF INFINITE QUANTUM SPIN CHAINS
M. KEYL
Istituto Nazionale di Fisica della Materia, Unità di Pavia, Dipartimento di Fisica “A. Volta”, via Bassi 6, I-27100 Pavia, Italy
[email protected]

T. MATSUI
Graduate School of Mathematics, Kyushu University, 1-10-6 Hakozaki, Fukuoka 812-8581, Japan
[email protected]

D. SCHLINGEMANN∗ and R. F. WERNER†
Institut für Mathematische Physik, TU Braunschweig, Mendelssohnstr. 3, 38106 Braunschweig, Germany
∗ [email protected]
† [email protected]

Received 15 May 2006

We consider an infinite spin chain as a bipartite system consisting of the left and right half-chains and analyze entanglement properties of pure states with respect to this splitting. In this context, we show that the amount of entanglement contained in a given state is deeply related to the von Neumann type of the observable algebras associated to the half-chains. Only the type I case belongs to the usual entanglement theory which deals with density operators on tensor product Hilbert spaces, and only in this situation do separable normal states exist. In all other cases, the corresponding state is infinitely entangled in the sense that one copy of the system in such a state is sufficient to distill an infinite amount of maximally entangled qubit pairs. We apply these results to the critical XY model and show that its unique ground state $\varphi_S$ provides a particular example for this type of entanglement.

Keywords: Entanglement; von Neumann algebras; quantum spin chains.

Mathematics Subject Classification 2000: 81P68, 82B10, 82B20, 46L60, 47L90
1. Introduction

Entanglement theory is not only at the heart of quantum information theory, it has also produced a lot of very deep (and, in particular, quantitative) insights into the structure of quantum correlations. Quantum correlations also play a paramount role in condensed matter physics, in particular, in the study of phase transitions and
critical phenomena. It is therefore an interesting and promising task to analyze how both fields can benefit from each other, or in other words: to apply entanglement theory to models of quantum statistical mechanics.

A lot of research has recently been done on this subject, concentrating, in particular, on one-dimensional systems (cf. [1–15] and the references therein for a still incomplete list). Many of these papers study a ground state of a spin chain model and calculate the von Neumann entropy $S$ of its restriction to a finite, contiguous block. It turns out that the scaling behavior of $S$ with respect to the length $L$ of the block is intimately related to criticality: For critical models, the entropy $S(L)$ tends to diverge logarithmically (in the limit $L \to \infty$), while $\lim_{L\to\infty} S(L)$ remains finite in the non-critical case.

The relation of these results to entanglement theory is given by the fact that $S$ (the entropy of entanglement) measures the rate of maximally entangled qubit pairs (“singlets”) which can be distilled from an infinite supply of systems, if only local operations and classical communication (LOCC) are allowed. To be more precise, consider a spin chain as a bipartite system consisting of a finite block of length $L$ (given to Alice) and the rest (given to Bob), and assume that an infinite number of chains is available. The entropy of entanglement $S(L)$ then describes the number of singlets Alice and Bob can produce per chain, if they are only allowed to communicate classically with each other and to operate on their parts of the chains. While this is a natural concept for finite dimensional systems, it seems to be odd for systems with infinitely many degrees of freedom, because we already have infinitely many systems. Hence it is more natural to ask how many singlets Alice and Bob can produce (in terms of LOCC) if only one chain is available.

This question is discussed in [10, 14], and it turns out that in the critical case, this “one-copy entanglement” diverges logarithmically as well (but with a smaller factor in front of the logarithm).

Let us change our point of view now slightly and consider a splitting of the chain into a left and right half, rather than into a finite part and the rest. The results just discussed indicate that the one-copy entanglement of a critical chain becomes infinite in this case. As shown in [16], states of such a type cannot be described within the usual setup of entanglement theory (density operators on tensor product Hilbert spaces) but require instead the application of operator algebraic methods.

The purpose of the present paper is to take this point of view seriously and to rediscuss entanglement properties of infinite quantum spin chains in an appropriate (i.e. algebraic) mathematical context. The basic idea is to associate to each set $\Lambda$ of spins in the chain the C*-algebra $\mathcal{A}_\Lambda$ of observables localized in $\Lambda$, and to describe the systems in terms of this net of algebras, rather than in terms of a fixed Hilbert space. This is a well-known mathematical approach to quantum spin systems, and it has produced a lot of deep and powerful methods and results (cf. the corresponding section of [17] and the references therein). Of special importance for us are the algebras $\mathcal{A}_L$ and $\mathcal{A}_R$ associated to the left (L) and right (R) half-chains. They represent the corresponding splitting of the spin chain into a bipartite system. In the following we can think of $\mathcal{A}_L$ (respectively, $\mathcal{A}_R$) as the algebra which is generated
by the observables available only to Alice (respectively, Bob). The main message of this paper is now that the degree of entanglement contained in a pure state of the chain is deeply related to properties (in particular, the von Neumann type) of the weak closure of $\mathcal{A}_L$ and $\mathcal{A}_R$ in the corresponding GNS representation. We can show in particular that under mild technical assumptions (most notably Haag-duality) two different cases arise:

• The low entangled case, where the half-chain algebras are of type I, separable normal states exist, but no normal state can have infinite one-copy entanglement. This covers the traditional setup of entanglement theory.
• The infinitely entangled case. Here the half-chain algebras are not of type I, all normal states have infinite one-copy entanglement, and consequently, no separable normal state exists.

The previous results mentioned above indicate that critical models usually belong to the second case. Using the method developed in [18, 19] we prove this conjecture explicitly for the critical XY model. In this context, we show in particular that the (unique) ground state of a critical XY chain satisfies Haag-duality.

The outline of the paper is as follows: After presenting some notations and mathematical preliminaries in Sec. 2, we will discuss (in Sec. 3) the generalizations of the usual setup for entanglement theory which are necessary in a C*-algebraic context. This is mostly a review of material presented elsewhere [20, 16, 21], adapted to the special needs of this paper. In Sec. 4, we analyze the relations between the von Neumann type of half-chain algebras and the amount of entanglement in a given state (cf. the discussion in the last paragraph). These results are then applied to spin chains. In Sec. 5, we treat kinematical properties like translational invariance, localization of entanglement and cluster properties, while Sec. 6 is devoted to a detailed study of the critical XY model.

2. Preliminaries

A quantum spin chain consists of infinitely many qubits (more generally $d$-level systems, but we are only interested in the spin 1/2 case) arranged on a one-dimensional regular lattice (i.e. $\mathbb{Z}$). We describe it in terms of the UHF C*-algebra $2^\infty$ (the infinite tensor product of 2 by 2 matrix algebras):
\[ \mathcal{A} = \overline{\bigotimes_{\mathbb{Z}} M_2(\mathbb{C})}^{\,C^*}. \tag{2.1} \]
Each component of the tensor product above is labeled by a lattice site $j \in \mathbb{Z}$. By $Q^{(j)}$, we denote the element of $\mathcal{A}$ with $Q$ in the $j$th component of the tensor product and the identity in any other component. For a subset $\Lambda$ of $\mathbb{Z}$, $\mathcal{A}_\Lambda$ is defined as the C*-subalgebra of $\mathcal{A}$ generated by elements supported in $\Lambda$. We set
\[ \mathcal{A}_{\mathrm{loc}} = \bigcup_{\Lambda \subset \mathbb{Z},\, |\Lambda| < \infty} \mathcal{A}_\Lambda, \tag{2.2} \]
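where the cardinality of $\Lambda$ is denoted by $|\Lambda|$. We call an element of $\mathcal{A}_\Lambda$ a local observable or a strictly local observable. $\mathcal{A}_{\mathrm{loc}}$ is a dense subalgebra of $\mathcal{A}$.

In this paper we will look at spin chains as bipartite systems. Hence we have to consider observables and operations which are located on the right, respectively left, part of the chain. They are described in terms of the two half-chain algebras^a
\[ \mathcal{A}_R = \mathcal{A}_{[0,\infty)}, \qquad \mathcal{A}_L = \mathcal{A}_{(-\infty,0)}. \tag{2.3} \]
^a We will frequently use interval notation like $(a, b]$ for subsets of $\mathbb{Z}$ rather than $\mathbb{R}$.

For each state $\omega$ of $\mathcal{A}$, we can introduce the restricted states
\[ \omega_R = \omega|_{\mathcal{A}_R}, \qquad \omega_L = \omega|_{\mathcal{A}_L} \tag{2.4} \]
and the von Neumann algebras
\[ \mathcal{R}_{R,\omega} = \pi_\omega(\mathcal{A}_R)'', \qquad \mathcal{R}_{L,\omega} = \pi_\omega(\mathcal{A}_L)'', \tag{2.5} \]
where $(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$ denotes the GNS representation associated with $\omega$. For arbitrary $\omega$, the two von Neumann algebras $\mathcal{R}_{L/R,\omega}$ have the following properties:

• Since $\mathcal{A}_{L/R}$ are generated (as C*-algebras) by an increasing sequence of finite dimensional matrix algebras, the same holds for $\mathcal{R}_{L/R,\omega}$ in the weak topology. Hence the $\mathcal{R}_{L/R,\omega}$ are hyperfinite.
• For the same reason the GNS Hilbert space $\mathcal{H}_\omega$ is separable, hence $\mathcal{R}_{L,\omega}$ and $\mathcal{R}_{R,\omega}$ are σ-finite.
• $\mathcal{R}_{L,\omega}$ and $\mathcal{R}_{R,\omega}$ are mutually commuting, i.e. $[A, B] = 0$ for all $A \in \mathcal{R}_{L,\omega}$ and all $B \in \mathcal{R}_{R,\omega}$.

If $\omega$ is pure, the GNS representation $\pi_\omega$ is irreducible and we get in addition:

• $\mathcal{R}_{L,\omega}$ and $\mathcal{R}_{R,\omega}$ together generate $\pi_\omega(\mathcal{A})'' = \mathcal{B}(\mathcal{H}_\omega)$, i.e.
\[ \mathcal{R}_{L,\omega} \vee \mathcal{R}_{R,\omega} = (\mathcal{R}_{L,\omega} \cup \mathcal{R}_{R,\omega})'' = \mathcal{B}(\mathcal{H}_\omega). \tag{2.6} \]
• The $\mathcal{R}_{L/R,\omega}$ are factors. This can be seen as follows: The center $\mathcal{Z}_\omega$ of $\mathcal{R}_{L,\omega}$ satisfies
\[ \mathcal{Z}_\omega' = (\mathcal{R}_{L,\omega} \cap \mathcal{R}_{L,\omega}')' = \mathcal{R}_{L,\omega} \vee \mathcal{R}_{L,\omega}'. \tag{2.7} \]
But $\mathcal{R}_{R,\omega} \subset \mathcal{R}_{L,\omega}'$ and from Eq. (2.6), we therefore get $\mathcal{Z}_\omega' = \mathcal{B}(\mathcal{H}_\omega)$, i.e. $\mathcal{Z}_\omega = \mathbb{C}\mathbb{1}$. Hence $\mathcal{R}_{L,\omega}$ is a factor, and $\mathcal{R}_{R,\omega}$ can be treated similarly.

A special class of states we will consider frequently are translationally invariant states. A state $\omega$ is translationally invariant if $\omega \circ \tau_1 = \omega$ holds, where $\tau_1$ denotes the automorphism which shifts the whole chain one step to the right. More precisely, we define for each $j \in \mathbb{Z}$ an automorphism $\tau_j$ of $\mathcal{A}$ by $\tau_j(Q^{(k)}) = Q^{(j+k)}$ for any $k \in \mathbb{Z}$ and any $2 \times 2$ matrix $Q$.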
A particular example of a translationally invariant state is the ground state of the critical XY model. To give its definition, note first that a state $\varphi$ is a ground state with respect to a one parameter group $\alpha_t$ of automorphisms of $\mathcal{A}$, if
\[ \varphi(Q^* \delta(Q)) \ge 0 \tag{2.8} \]
holds for any $Q$ in the domain of the generator $\delta$ of $\alpha_t$, where
\[ \delta(Q) = -i \frac{d}{dt} \alpha_t(Q)\Big|_{t=0}. \tag{2.9} \]
The dynamics of the XY model is given formally by
\[ \alpha_t(Q) = e^{itH_{XY}}\, Q\, e^{-itH_{XY}}, \qquad Q \in \mathcal{A} \tag{2.10} \]
with the Hamiltonian
\[ H_{XY} = -\sum_{j \in \mathbb{Z}} \bigl\{ (1+\gamma)\sigma_x^{(j)}\sigma_x^{(j+1)} + (1-\gamma)\sigma_y^{(j)}\sigma_y^{(j+1)} + 2\lambda\sigma_z^{(j)} \bigr\}, \tag{2.11} \]
where $\sigma_x^{(j)}$, $\sigma_y^{(j)}$, and $\sigma_z^{(j)}$ are Pauli spin matrices at the site $j$ and $\gamma$ and $\lambda$ are real parameters (anisotropy and magnetic field). The precise mathematical definition of $\alpha_t$ is obtained via the thermodynamic limit: If we set
\[ H_{XY}([a, b]) = -\sum_{j=a}^{b-1} \bigl\{ (1+\gamma)\sigma_x^{(j)}\sigma_x^{(j+1)} + (1-\gamma)\sigma_y^{(j)}\sigma_y^{(j+1)} + 2\lambda\sigma_z^{(j)} \bigr\}, \tag{2.12} \]
the limit
\[ \alpha_t(Q) = \lim_{N \to \infty} e^{itH_{XY}([-N,N])}\, Q\, e^{-itH_{XY}([-N,N])}, \qquad Q \in \mathcal{A} \tag{2.13} \]
exists in the norm topology of $\mathcal{A}$ and defines the time evolution $\alpha_t$. The local algebra $\mathcal{A}_{\mathrm{loc}}$ is a core for the generator $\delta(Q) = [H_{XY}, Q]$. The critical XY model arises if the parameters $\lambda$, $\gamma$ satisfy $|\lambda| = 1$, $\gamma \neq 0$ or $|\lambda| < 1$, $\gamma = 0$. In this case it is known (cf. [19, Theorem 1]) that there is a unique ground state $\varphi_S$. We will refer to it throughout this paper as “the (unique) ground state of the critical XY model”.

3. Entanglement and C*-Algebras

Our aim is to look at an infinite quantum spin chain as a bipartite system which consists of the left and right half-chains and to analyze entanglement properties which are related to this splitting. However, in our model the two halves of the chain are not described by different tensor factors of a tensor product Hilbert space, but by different subalgebras of the quasi-local algebra $\mathcal{A}$. Therefore we have to generalize some concepts of entanglement theory accordingly (cf. also [20, 16, 21]).

Definition 3.1. A bipartite system is a pair of unital C*-algebras $\mathcal{A}$, $\mathcal{B}$ which are both subalgebras of the same “ambient algebra” $\mathcal{M}$, commute elementwise ($[A, B] = 0$ for all $A \in \mathcal{A}$, $B \in \mathcal{B}$) and satisfy $\mathcal{A} \cap \mathcal{B} = \mathbb{C}\mathbb{1}$.
For the spin chain we have $\mathcal{A} = \mathcal{A}_L$, $\mathcal{B} = \mathcal{A}_R$ and $\mathcal{M} = \mathcal{A}$. The usual setup in terms of a tensor product Hilbert space $\mathcal{H}_1 \otimes \mathcal{H}_2$ arises with $\mathcal{A} = \mathcal{B}(\mathcal{H}_1) \otimes \mathbb{1}$, $\mathcal{B} = \mathbb{1} \otimes \mathcal{B}(\mathcal{H}_2)$ and $\mathcal{M} = \mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2)$; we will refer to this situation as the “type I case” (since $\mathcal{A}$ and $\mathcal{B}$ are type I von Neumann algebras in this case). If $\mathcal{M}$ is finite dimensional, the latter is the only possible realization of bipartite systems, in full compliance with ordinary (i.e. finite dimensional) entanglement theory.

Definition 3.2. A state $\omega$ on the ambient algebra is called a product state if $\omega(AB) = \omega(A)\omega(B)$ for all $A \in \mathcal{A}$, $B \in \mathcal{B}$; i.e. if $\omega$ does not contain any correlations. $\omega$ is separable if it is an element of the weakly closed convex hull of the set of product states. If $\omega$ is not separable, it is called entangled.

If $\omega$ is a normal state of a type I system, i.e. $\omega(A) = \operatorname{tr}(\rho A)$ with a density operator $\rho$ on $\mathcal{H}_1 \otimes \mathcal{H}_2$, we see immediately that $\omega$ is a product state iff $\rho = \rho_1 \otimes \rho_2$ holds. Hence we recover the usual definitions.

Given a bipartite system in an entangled state, our aim is to extract maximally entangled qubit pairs using an operation $T$ which does not generate entanglement itself (i.e. $T$ should map separable states to separable states; such a map is called separable itself). For the purpose of this paper, it is sufficient to look only at the simplest class of such maps: local operations (for LOCC maps cf. [21]).

Definition 3.3. A local operation between two bipartite systems $\mathcal{A}_1, \mathcal{B}_1 \subset \mathcal{M}_1$ and $\mathcal{A}_2, \mathcal{B}_2 \subset \mathcal{M}_2$ is a unital completely positive (cp) map $T : \mathcal{M}_1 \to \mathcal{M}_2$ such that
(1) $T(\mathcal{A}_1) \subset \mathcal{A}_2$ and $T(\mathcal{B}_1) \subset \mathcal{B}_2$, and
(2) $T(AB) = T(A)T(B)$ holds for all $A \in \mathcal{A}_1$, $B \in \mathcal{B}_1$.

If we consider in the type I case an operation $T : \mathcal{A}_1 \otimes \mathcal{B}_1 \to \mathcal{A}_2 \otimes \mathcal{B}_2$ which is local and normal, it must have the form $T = T_1 \otimes T_2$ with two unital (and normal) cp maps $T_1$, $T_2$. To see this, expand an element $Q \in \mathcal{M}_1 = \mathcal{A}_1 \otimes \mathcal{B}_1 = \mathcal{B}(\mathcal{H}_1 \otimes \mathcal{H}_2)$ in terms of matrix units $e_{ij}$. By normality we get
\[ Q = \sum_{ijkl} c_{ijkl}\, e_{ij} \otimes e_{kl}, \qquad T(Q) = \sum_{ijkl} c_{ijkl}\, T(e_{ij} \otimes \mathbb{1})\, T(\mathbb{1} \otimes e_{kl}) = (T_1 \otimes T_2)(Q), \tag{3.1} \]
hence $T = T_1 \otimes T_2$ as stated. Note that $T$ would not factorize if we consider only item 1 of this definition: If $\omega$ is a state on $\mathcal{M}_1$ the map $T(A) = \mathbb{1}\,\omega(A)$ satisfies condition 1 even if $\omega$ is entangled. To fulfill condition 2 as well, however, $\omega$ has to be a product state.

Usual distillation protocols describe procedures to extract entanglement from a large (possibly infinite) number of equally prepared systems. However, if we study an infinite quantum spin chain, we have already a system consisting of infinitely many particles. Hence one copy of the chain could be sufficient for distillation purposes, and if the total amount of entanglement contained in the system is infinite,
it might even be possible to extract infinitely many singlets from it. This idea is the motivation for the following definition^b [16, 10].

Definition 3.4. Consider a state $\omega$ of a bipartite system $\mathcal{A}, \mathcal{B} \subset \mathcal{M}$. The quantity $E_1(\omega) = \log_2(d)$ is called the one-copy entanglement of $\omega$ (with respect to $\mathcal{A}$, $\mathcal{B}$), if $d$ is the biggest integer $d \ge 2$ which admits for each $\epsilon > 0$ a local operation $T : \mathcal{B}(\mathbb{C}^d) \otimes \mathcal{B}(\mathbb{C}^d) \to \mathcal{M}$ such that
\[ \omega(T(|\chi_d\rangle\langle\chi_d|)) > 1 - \epsilon, \qquad \chi_d = \frac{1}{\sqrt{d}} \sum_{j=1}^{d} |jj\rangle \tag{3.2} \]
holds. If no such $d$ exists, we set $E_1(\omega) = 0$ and if (3.2) holds for all $d \ge 2$, we say that $\omega$ contains infinite one-copy entanglement (i.e. $E_1(\omega) = \infty$).

^b Note that the definition given in [10] is slightly different from ours, because the condition $T^*(\omega) = |\chi_d\rangle\langle\chi_d|$ is used instead of Eq. (3.2). The advantage of our approach (following [16]) lies in the fact that topological questions concerning the limit $\epsilon \to 0$ can be avoided.

The next result is a technical lemma which we will need later on (cf. [21] for a proof). It allows us to transfer results obtained for C*-algebras $\mathcal{A}$, $\mathcal{B}$ to the enveloping von Neumann algebras $\mathcal{A}''$, $\mathcal{B}''$ and vice versa.

Lemma 3.5. Consider a bipartite system $\mathcal{A}, \mathcal{B} \subset \mathcal{M} \subset \mathcal{B}(\mathcal{H})$ with irreducible $\mathcal{M}$ and a density operator $\rho$ on $\mathcal{H}$. The state $\operatorname{tr}(\rho\,\cdot\,)$ has infinite one-copy entanglement with respect to $\mathcal{A}$, $\mathcal{B}$ iff the same is true with respect to $\mathcal{A}''$, $\mathcal{B}''$.

Finally, we will consider the violations of Bell inequalities. This subject is studied within an algebraic context in [20]. Following this work, let us define:

Definition 3.6. Consider a bipartite system $\mathcal{A}, \mathcal{B} \subset \mathcal{M}$. The Bell correlations in a state $\omega : \mathcal{M} \to \mathbb{C}$ are defined by
\[ \beta(\omega) = \frac{1}{2} \sup\, \omega\bigl(A_1(B_1 + B_2) + A_2(B_1 - B_2)\bigr), \tag{3.3} \]
where the supremum is taken over all selfadjoint $A_i \in \mathcal{A}$, $B_j \in \mathcal{B}$ satisfying $-\mathbb{1} \le A_i \le \mathbb{1}$, $-\mathbb{1} \le B_j \le \mathbb{1}$, for $i, j = 1, 2$. In other words $A_1$, $A_2$ and $B_1$, $B_2$ are (appropriately bounded) observables measurable by Alice, respectively Bob.

Of course, a classically correlated (separable) state, or any other state consistent with a local hidden variable model [22], satisfies the Bell-CHSH inequality $\beta(\omega) \le 1$, while any $\omega$ has to satisfy Cirelson’s inequality [23–25]
\[ \beta(\omega) \le \sqrt{2}. \tag{3.4} \]
If the upper bound $\sqrt{2}$ is attained we speak of a maximal violation of Bell’s inequality.
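As an illustration of Definition 3.6, the following computation checks that Cirelson's bound (3.4) is attained in the simplest type I case. It is a sketch under assumptions not made in the text: $\mathcal{A} = \mathcal{B}(\mathbb{C}^2) \otimes \mathbb{1}$, $\mathcal{B} = \mathbb{1} \otimes \mathcal{B}(\mathbb{C}^2)$, $\omega = \langle \chi_2, \,\cdot\, \chi_2 \rangle$ with $\chi_2$ from Eq. (3.2), and the particular observables chosen below, which are one standard choice among many.

```latex
% Four observables, each self-adjoint with spectrum \{-1, +1\}:
\[
  A_1 = \sigma_x \otimes \mathbb{1}, \quad
  A_2 = \sigma_z \otimes \mathbb{1}, \quad
  B_1 = \mathbb{1} \otimes \tfrac{1}{\sqrt{2}}(\sigma_x + \sigma_z), \quad
  B_2 = \mathbb{1} \otimes \tfrac{1}{\sqrt{2}}(\sigma_x - \sigma_z).
\]
% Hence B_1 + B_2 = \sqrt{2}\,(1 \otimes \sigma_x) and B_1 - B_2 = \sqrt{2}\,(1 \otimes \sigma_z).
% For \chi_2 = (|11> + |22>)/\sqrt{2} both \sigma_x \otimes \sigma_x and
% \sigma_z \otimes \sigma_z have expectation value 1:
\[
  \omega\bigl(A_1(B_1 + B_2) + A_2(B_1 - B_2)\bigr)
  = \sqrt{2}\,\bigl(
      \langle \chi_2, (\sigma_x \otimes \sigma_x)\chi_2 \rangle
    + \langle \chi_2, (\sigma_z \otimes \sigma_z)\chi_2 \rangle
    \bigr)
  = 2\sqrt{2},
\]
\[
  \beta(\omega) \ge \tfrac{1}{2} \cdot 2\sqrt{2} = \sqrt{2}.
\]
```

Together with (3.4) this gives $\beta(\omega) = \sqrt{2}$ for this state, i.e. a maximal violation in the sense just defined.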
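4. Entanglement and von Neumann Type

In this section, we want to consider the special case that $\mathcal{A}$ and $\mathcal{B}$ are von Neumann algebras acting on a Hilbert space $\mathcal{H}$ and having all the properties mentioned in Sec. 2. In other words: $\mathcal{A}$ and $\mathcal{B}$ are hyperfinite and σ-finite factors, and they generate together $\mathcal{B}(\mathcal{H})$, i.e.
\[ \mathcal{A} \vee \mathcal{B} = \mathcal{B}(\mathcal{H}). \tag{4.1} \]
As the ambient algebra we choose $\mathcal{M} = \mathcal{B}(\mathcal{H})$, and in the following we will call a bipartite system with these properties simple. If in addition $\mathcal{A}' = \mathcal{B}$ holds, we say that Haag-duality holds. We will see that these conditions are already quite restrictive (in particular, Eq. (4.1)) and lead to a close relation between entanglement and the type of the factors $\mathcal{A}$ and $\mathcal{B}$.

4.1. Split property

Let us consider first the low entangled case. It is best characterized by the split property, i.e. there is a type I factor $\mathcal{N}$ such that
\[ \mathcal{A} \subset \mathcal{N} \subset \mathcal{B}' \tag{4.2} \]
holds. In this case, normal states with infinite one-copy entanglement do not exist. More precisely, we have the following theorem.

Theorem 4.1. Consider a simple bipartite system $\mathcal{A}, \mathcal{B} \subset \mathcal{B}(\mathcal{H})$ satisfying the split property (4.2). Then there is no normal state on $\mathcal{B}(\mathcal{H})$ with infinite one-copy entanglement.

The proof of this theorem can be divided into two steps. The first one shows that the split property forces the algebras $\mathcal{A}$, $\mathcal{B}$ to be of type I.

Proposition 4.2. A simple bipartite system $\mathcal{A}, \mathcal{B} \subset \mathcal{B}(\mathcal{H})$ satisfies the split property iff it is (up to unitary equivalence) of the form $\mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2$, $\mathcal{A} = \mathcal{B}(\mathcal{H}_1) \otimes \mathbb{1}$ and $\mathcal{B} = \mathbb{1} \otimes \mathcal{B}(\mathcal{H}_2)$. This shows in particular that the split property implies Haag-duality.

Proof. If $\mathcal{A}$, $\mathcal{B}$ are of the given form, the split property holds trivially with $\mathcal{N} = \mathcal{A}$. Hence only the other implication has to be proved. To this end, consider the relative commutant $\mathcal{M} = \mathcal{A}' \cap \mathcal{N}$ of $\mathcal{A}$ in $\mathcal{N}$. Since $\mathcal{N} \subset \mathcal{B}'$, we have $\mathcal{M} \subset \mathcal{A}'$ and $\mathcal{M} \subset \mathcal{B}'$. Hence with Eq. (4.1),
\[ \mathcal{M} \subset (\mathcal{A} \vee \mathcal{B})' = \mathbb{C}\mathbb{1}. \tag{4.3} \]
Since $\mathcal{N}$ is of type I, there are Hilbert spaces $\mathcal{H}_1$, $\mathcal{H}_2$ and a unitary $U : \mathcal{H} \to \mathcal{H}_1 \otimes \mathcal{H}_2$ such that $U \mathcal{N} U^* = \mathcal{B}(\mathcal{H}_1) \otimes \mathbb{1}$ holds [26, Theorem V.1.31]. Hence $\mathcal{A} \subset \mathcal{N}$ implies $U \mathcal{A} U^* = \tilde{\mathcal{A}} \otimes \mathbb{1}$, with a subalgebra $\tilde{\mathcal{A}}$ of $\mathcal{B}(\mathcal{H}_1)$. Equation (4.3) therefore leads to $\tilde{\mathcal{A}}' = \mathbb{C}\mathbb{1}$; hence $\tilde{\mathcal{A}} = \mathcal{B}(\mathcal{H}_1)$ and $U \mathcal{A} U^* = \mathcal{B}(\mathcal{H}_1) \otimes \mathbb{1}$ as stated. In a similar way, we can show that $U \mathcal{B} U^* = \mathbb{1} \otimes \mathcal{B}(\mathcal{H}_2)$, which concludes the proof.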
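The two commutant computations used in the proof of Proposition 4.2 can be written out as follows. This is a sketch; reading $\tilde{\mathcal{A}}'$ as the commutant of $\tilde{\mathcal{A}}$ inside $\mathcal{B}(\mathcal{H}_1)$ and invoking the bicommutant theorem in the last step are assumptions of this sketch rather than statements made in the text.

```latex
% Step 1: the relative commutant is trivial, using N \subset B' and Eq. (4.1).
\[
  \mathcal{M} = \mathcal{A}' \cap \mathcal{N}
  \subset \mathcal{A}' \cap \mathcal{B}'
  = (\mathcal{A} \cup \mathcal{B})'
  = (\mathcal{A} \vee \mathcal{B})'
  = \mathcal{B}(\mathcal{H})'
  = \mathbb{C}\mathbb{1}.
\]
% Step 2: transport to H_1 \otimes H_2 with U, where
% U N U^* = B(H_1) \otimes 1 and U A U^* = \tilde{A} \otimes 1.
\[
  U \mathcal{M} U^*
  = (\tilde{\mathcal{A}} \otimes \mathbb{1})' \cap (\mathcal{B}(\mathcal{H}_1) \otimes \mathbb{1})
  = \tilde{\mathcal{A}}' \otimes \mathbb{1}
  \quad \Longrightarrow \quad
  \tilde{\mathcal{A}}' = \mathbb{C}\mathbb{1},
  \qquad
  \tilde{\mathcal{A}} = \tilde{\mathcal{A}}'' = \mathcal{B}(\mathcal{H}_1).
\]
```

The second step uses that $\tilde{\mathcal{A}}$ is itself a von Neumann algebra, which holds here because $\mathcal{A}$ is one.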
Roughly speaking, we can say that there is not enough room between $\mathcal{A}$ and $\mathcal{B}'$ to allow non-trivial splits with $\mathcal{A} \neq \mathcal{N}$. This is exactly the converse of a standard split inclusion, where $\mathcal{A}' \cap \mathcal{B}'$ is big enough to admit a cyclic vector [27, 28]. With this proposition Theorem 4.1 follows immediately from a recent result about the type I case [16]:

Proposition 4.3. Consider a normal state $\omega$ of a type I bipartite system ($\mathcal{A} = \mathcal{B}(\mathcal{H}_A) \otimes \mathbb{1}$, $\mathcal{B} = \mathbb{1} \otimes \mathcal{B}(\mathcal{H}_B) \subset \mathcal{M} = \mathcal{B}(\mathcal{H}_A \otimes \mathcal{H}_B)$). For each sequence of unital cp-maps $T_d : \mathcal{B}(\mathbb{C}^d \otimes \mathbb{C}^d) \to \mathcal{M}$ such that $T_d^* \varphi$ is ppt^c for each pure product state $\varphi$, we have
\[ \lim_{d \to \infty} \omega(T_d(|\chi_d\rangle\langle\chi_d|)) = 0, \qquad \chi_d = \frac{1}{\sqrt{d}} \sum_{j=1}^{d} |jj\rangle. \]

^c That is, the density operator associated to $T^* \varphi$ has positive partial transpose.

The operations $T_d$ considered here map pure product states to ppt-states. This is a much weaker condition than separability (and therefore much weaker than LOCC). Hence this result covers all physically relevant variations of Definition 3.4. Note in addition that the possibility of normal states with infinite distillable entanglement is not excluded, because the usual entanglement distillation allows the usage of an infinite supply of systems, not just one copy. It is in fact easy to see that in type I systems with $\dim \mathcal{H}_A = \dim \mathcal{H}_B = \infty$ normal states with infinite distillable entanglement are in a certain sense generic (cf. [29, 30] for details).

The result of this subsection shows that the split property (4.2) characterizes exactly the traditional setup of entanglement theory. Hence there are normal states which are separable but no normal state has infinite one-copy entanglement. This is the reason why we have called this case the “low entangled” one.

4.2. The maximally entangled case

The prototype of a state with infinite one-copy entanglement is a system consisting of infinitely many qubit pairs, each in a maximally entangled state. It can be realized on a spin chain as follows: Consider the algebra $\mathcal{A}_{\{-j,j-1\}}$ containing all observables localized at lattice sites $-j$ and $j - 1$. It is naturally isomorphic to $\mathcal{B}(\mathbb{C}^2) \otimes \mathcal{B}(\mathbb{C}^2)$. Therefore we can define the state
\[ \omega_1^{\{-j,j-1\}}(A) = \operatorname{tr}(|\chi_2\rangle\langle\chi_2| A) \tag{4.4} \]
with $\chi_2$ from Eq. (3.2). It represents a maximally entangled state between the qubits at sites $-j$ and $j - 1$. Now we can consider the infinite tensor product
\[ \omega_1 = \bigotimes_{j \in \mathbb{N}} \omega_1^{\{-j,j-1\}}, \tag{4.5} \]
which obviously has infinite one-copy entanglement. In [16], it is argued that this state is the natural analog of a maximally entangled state in infinite dimensions. The left and right half-chain von Neumann algebras^d $\mathcal{R}_{L,1}$ and $\mathcal{R}_{R,1}$ have the following properties [16]:

• $\mathcal{R}_{L,1}, \mathcal{R}_{R,1} \subset \mathcal{B}(\mathcal{H}_1)$ form a simple bipartite system.
• Haag-duality holds: $\mathcal{R}_{R,1} = \mathcal{R}_{L,1}'$.
• $\mathcal{R}_{L,1}$ and $\mathcal{R}_{R,1}$ are hyperfinite type II$_1$ factors.

^d To avoid clumsy notations, we will occasionally write $\mathcal{H}_1$ etc. instead of $\mathcal{H}_{\omega_1}$, i.e. we will replace double indices $\omega_j$ by an index $j$.

Note that the last property can be seen very easily, because the construction shown in the last paragraph is exactly the Araki–Woods construction of the hyperfinite type II$_1$ factor ([31], cf. also [16, Theorem 2] for a direct proof of the type II$_1$ property). Since all hyperfinite type II$_1$ factors are mutually isomorphic the maximally entangled case can be characterized as follows:

Proposition 4.4. Consider a hyperfinite type II$_1$ factor $\mathcal{M} \subset \mathcal{B}(\mathcal{H})$ admitting a cyclic and separating vector. Then the following statements hold:
(1) The pair $\mathcal{M}, \mathcal{M}' \subset \mathcal{B}(\mathcal{H})$ defines a simple bipartite system which is unitarily equivalent to $\mathcal{R}_{L,1}, \mathcal{R}_{R,1} \subset \mathcal{B}(\mathcal{H}_1)$.
(2) Each normal state on $\mathcal{B}(\mathcal{H})$ has infinite one-copy entanglement (with respect to $\mathcal{M}$, $\mathcal{M}'$).

Proof. Since $\mathcal{M}$ and $\mathcal{R}_{L,1}$ are hyperfinite type II$_1$ factors, they are isomorphic [32, Theorem XIV.2.4] and since both have a cyclic and separating vector this isomorphism is implemented by a unitary $U$. Hence $U \mathcal{M} U^* = \mathcal{R}_{L,1}$ and due to $\mathcal{R}_{R,1} = \mathcal{R}_{L,1}'$ [16] we also have $U \mathcal{M}' U^* = \mathcal{R}_{R,1}$. This already proves item (1).

To prove item (2) it is sufficient to show the statement for $\mathcal{R}_{L,1}$, $\mathcal{R}_{R,1}$ rather than a general pair $\mathcal{M}$, $\mathcal{M}'$. Hence consider a density matrix $\rho$ on $\mathcal{H}_1$ and the corresponding state $\omega(A) = \operatorname{tr}(\rho\, \pi_1(A))$ on the quasi-local algebra $\mathcal{A}$. According to Lemma 3.5, $\rho$ has infinite one-copy entanglement with respect to $\mathcal{R}_{L,1}$, $\mathcal{R}_{R,1}$ iff $\omega$ has infinite one-copy entanglement with respect to $\mathcal{A}_L$, $\mathcal{A}_R$. Therefore, it is sufficient to prove the latter. To this end, note first that $\omega_1$ is pure and $\pi_1$ therefore irreducible. If $\rho = |\psi\rangle\langle\psi|$ with a normalized $\psi \in \mathcal{H}_1$ this implies that $\omega(A) = \langle \psi, \pi_1(A)\psi \rangle$ is pure (in particular factorial) and unitarily equivalent to $\omega_1$. Hence we can apply Corollary 2.6.11 of [33] which shows that quasi-equivalence of $\omega$ and $\omega_1$ implies that for each $\epsilon > 0$ there is an $N \in \mathbb{N}$ with
\[ |\omega(A) - \omega_1(A)| < \epsilon \|A\| \quad \forall\, A \in \mathcal{A}_{\{|n| > N\}}. \tag{4.6} \]
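Now assume that $\rho$ is a general density matrix and $\omega$ therefore a mixed normal state on $\mathcal{A}$. If the spectral decomposition of $\rho$ is $\rho = \sum_j \lambda_j |\psi_j\rangle\langle\psi_j|$, we have for each $\epsilon > 0$ a $J \in \mathbb{N}$ with
$$\|\omega - \omega_J\| < \frac{\epsilon}{3} \quad\text{and}\quad \omega_J(A) = \sum_{j=1}^{J} \lambda_j \omega_j(A) = \sum_{j=1}^{J} \lambda_j \langle\psi_j, \pi_1(A)\psi_j\rangle. \tag{4.7}$$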
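The $\omega_j$ are pure states. Hence we find as in Eq. (4.6) an $N \in \mathbb{N}$ such that
$$|\omega_j(A) - \omega_1(A)| < \frac{\epsilon}{3J} \|A\| \quad \forall A \in \mathcal{A}_{\{|n|>N\}} \ \ \forall j = 1, \ldots, J \tag{4.8}$$
holds. By construction, we have in addition $1 - \sum_{j=1}^{J} \lambda_j < \epsilon/3$. Therefore we get for all $A \in \mathcal{A}_{\{|n|>N\}}$ with $\|A\| = 1$:
$$|\omega(A) - \omega_1(A)| \le |\omega(A) - \omega_J(A)| + |\omega_J(A) - \omega_1(A)| \tag{4.9}$$
$$\le \frac{\epsilon}{3} + \sum_{j=1}^{J} \lambda_j |\omega_j(A) - \omega_1(A)| + \Bigl(1 - \sum_{j=1}^{J} \lambda_j\Bigr) |\omega_1(A)| \le \epsilon. \tag{4.10}$$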
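The choice of $J$ in this $\epsilon/3$ argument can be made concrete. The following Python sketch is only an illustration: the function name `truncation_index` and the geometric sample spectrum are assumptions introduced here, not part of the original proof. It returns the smallest $J$ for which the discarded spectral weight $1 - \sum_{j \le J} \lambda_j$ drops below $\epsilon/3$.

```python
import numpy as np

def truncation_index(eigenvalues, eps):
    """Smallest J such that 1 - sum_{j<=J} lambda_j < eps / 3.

    `eigenvalues` are the (approximately normalized) eigenvalues of a
    density matrix; they are sorted in decreasing order before truncation.
    """
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    tail = 1.0 - np.cumsum(lam)              # weight discarded after J terms
    hits = np.nonzero(tail < eps / 3.0)[0]
    if hits.size == 0:
        raise ValueError("spectrum does not reach the requested accuracy")
    return int(hits[0]) + 1                  # convert 0-based index to J

# Hypothetical example: lambda_j = 2^{-j}, j = 1, ..., 40 (sum is 1 - 2^{-40}).
sample_spectrum = [2.0 ** (-j) for j in range(1, 41)]
J = truncation_index(sample_spectrum, eps=0.3)
```

For the sample spectrum the discarded weight after $J$ terms is $2^{-J}$, so the bound $\epsilon/3 = 0.1$ is first met at $J = 4$.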
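Now consider the natural isomorphism $T_{NM} : \mathcal{B}(\mathbb{C}^{2^M} \otimes \mathbb{C}^{2^M}) \to \mathcal{A}_{[-N-M,-N] \cup [N-1,N+M-1]} \subset \mathcal{A}$. It satisfies by construction
$$\omega_1(T_{NM}\,\chi_2^{\otimes M}) = 1. \tag{4.11}$$
Together with Eq. (4.10), this implies (with $\|T_{NM}\,\chi_2^{\otimes M}\| = 1$ since $\chi_2^{\otimes M}$ is a projector)
$$|\omega(T_{NM}\,\chi_2^{\otimes M})| \ge |\omega_1(T_{NM}\,\chi_2^{\otimes M})| - |\omega_1(T_{NM}\,\chi_2^{\otimes M}) - \omega(T_{NM}\,\chi_2^{\otimes M})| \tag{4.12}$$
$$\ge 1 - \epsilon \|T_{NM}\,\chi_2^{\otimes M}\| = 1 - \epsilon, \tag{4.13}$$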
which shows that $\omega$ has infinite one-copy entanglement.
The bipartite systems described in this proposition admit only normal states which have infinite one-copy entanglement. Hence there are, in particular, no normal, separable states. This is exactly the converse of the split situation described in the last subsection, and we can call it “the maximally entangled case”.
4.3. Haag-duality
Let us consider now simple bipartite systems which are not split but satisfy Haag-duality. Then we can always extract a maximally entangled system (as described in the last subsection) in terms of a local operation.
Proposition 4.5. Consider a simple bipartite system $\mathcal{A}, \mathcal{B} = \mathcal{A}' \subset \mathcal{B}(\mathcal{H})$ such that $\mathcal{A}$ is not of type I. Then there is an operation $\gamma : \mathcal{B}(\mathcal{H}_1) \to \mathcal{B}(\mathcal{H})$ which is local with respect to $\mathcal{R}_{L/R,1}$ and $\mathcal{A}, \mathcal{B}$.
Proof. By assumption, $\mathcal{A}$ is a factor, not of type I, and $\mathcal{B} = \mathcal{A}'$. Hence $\mathcal{A}, \mathcal{B}$ are either both of type II or both of type III.
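If $\mathcal{A}$ and $\mathcal{B}$ are of type II$_\infty$, let us define the additional von Neumann algebras
$$\mathcal{M}_L = \mathcal{B}(\mathcal{H}_L) \otimes \mathcal{R}_{L,1} \otimes \mathbb{1}_R, \qquad \mathcal{M}_L' = \mathcal{M}_R = \mathbb{1}_L \otimes \mathcal{R}_{R,1} \otimes \mathcal{B}(\mathcal{H}_R), \tag{4.14}$$
where $\mathcal{H}_{L/R}$ are two infinite dimensional, separable Hilbert spaces and $\mathbb{1}_{L/R}$ are the unit operators on them. Since $\mathcal{R}_{L/R,1}$ are hyperfinite type II$_1$ factors, the $\mathcal{M}_{L/R}$ are hyperfinite type II$_\infty$ factors satisfying $\mathcal{M}_L' = \mathcal{M}_R$. By assumption the same is true for $\mathcal{A}, \mathcal{B}$. Hence there is a *-isomorphism $\gamma : \mathcal{M}_L \to \mathcal{A}$ (since the hyperfinite type II$_\infty$ factor is unique up to isomorphism [32]). Since $\mathcal{A}$, $\mathcal{M}_L$ and their commutants are σ-finite, properly infinite factors, both admit a cyclic and separating vector [34, Proposition 9.1.6]. Hence the isomorphism $\gamma$ is unitarily implemented [34, Theorem 7.2.9], i.e. $\gamma(A) = U A U^*$ with a unitary $U : \mathcal{H}_L \otimes \mathcal{H}_1 \otimes \mathcal{H}_R \to \mathcal{H}$. Since
$$U \mathcal{M}_L U^* = \mathcal{A} \quad\text{and}\quad U \mathcal{M}_R U^* = U \mathcal{M}_L' U^* = \mathcal{A}' = \mathcal{B} \tag{4.15}$$
we get a local operation (even a local *-homomorphism) by
$$\mathcal{B}(\mathcal{H}_1) \ni A \mapsto U(\mathbb{1}_L \otimes A \otimes \mathbb{1}_R)U^* \in \mathcal{B}(\mathcal{H}), \tag{4.16}$$
which proves the statement in the type II$_\infty$ case (note that Haag-duality entered in Eq. (4.15)).
If $\mathcal{A}$ and $\mathcal{B}$ are both of type II$_1$, we can define in analogy to Eq. (4.14) the hyperfinite II$_\infty$ factors
$$\mathcal{A}_1 = \mathcal{B}(\mathcal{H}_L) \otimes \mathcal{A} \otimes \mathbb{1}_R, \qquad \mathcal{B}_1 = \mathbb{1}_L \otimes \mathcal{B} \otimes \mathcal{B}(\mathcal{H}_R). \tag{4.17}$$
As in the previous paragraph, there exists a unitary $U : \mathcal{H}_L \otimes \mathcal{H}_1 \otimes \mathcal{H}_R \to \mathcal{H}_L \otimes \mathcal{H} \otimes \mathcal{H}_R$ such that Eq. (4.15) holds with $\mathcal{A}, \mathcal{B}$ replaced by $\mathcal{A}_1, \mathcal{B}_1$. Hence with density matrices $\rho_L$ on $\mathcal{H}_L$ and $\rho_R$ on $\mathcal{H}_R$ we can define a local operation $\mathcal{B}(\mathcal{H}_1) \to \mathcal{B}(\mathcal{H})$ by
$$\mathcal{B}(\mathcal{H}_1) \ni A \mapsto \mathrm{tr}_{LR}\bigl[(\rho_L \otimes \mathbb{1} \otimes \rho_R)\, U(\mathbb{1}_L \otimes A \otimes \mathbb{1}_R)U^*\bigr] \in \mathcal{B}(\mathcal{H}), \tag{4.18}$$
where $\mathrm{tr}_{LR}$ denotes the partial trace over $\mathcal{H}_L \otimes \mathcal{H}_R$. If one algebra is of type II$_\infty$ and the other of type II$_1$ we can proceed in the same way, if we adjoin only one type I factor to $\mathcal{B}(\mathcal{H})$, i.e. either $\mathcal{B}(\mathcal{H}_L)$ or $\mathcal{B}(\mathcal{H}_R)$.
Hence only the type III case remains. If $\mathcal{A}$ is a hyperfinite type III factor it is strongly stable (cf. Appendix A), i.e.
$$\mathcal{A} \cong \mathcal{A} \otimes \mathcal{R}_{L,1} \tag{4.19}$$
holds. By the same argument which leads to Eq. (4.15), this implies the existence of a unitary $U : \mathcal{H} \otimes \mathcal{H}_1 \to \mathcal{H}$ such that
$$U (\mathcal{A} \otimes \mathcal{R}_{L,1}) U^* = \mathcal{A} \quad\text{and}\quad U (\mathcal{B} \otimes \mathcal{R}_{R,1}) U^* = \mathcal{B}. \tag{4.20}$$
Therefore the map $\mathcal{B}(\mathcal{H}_1) \ni A \mapsto U(\mathbb{1} \otimes A)U^* \in \mathcal{B}(\mathcal{H})$ is an operation with the required properties.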
As an immediate corollary, we can show that “not type I” together with Haag-duality implies infinite one-copy entanglement.
Corollary 4.6. Consider a simple bipartite system $\mathcal{A}, \mathcal{B} \subset \mathcal{B}(\mathcal{H})$ which is not split, but satisfies Haag-duality. Each normal state $\omega$ of $\mathcal{B}(\mathcal{H})$ has infinite one-copy entanglement with respect to $\mathcal{A}, \mathcal{B}$.
Proof. Since the split property does not hold, the two algebras $\mathcal{A}, \mathcal{B}$ are not of type I (Proposition 4.2). Hence we can apply Proposition 4.5 to get a local, normal operation $\gamma : \mathcal{B}(\mathcal{H}_1) \to \mathcal{B}(\mathcal{H})$. Since $\omega$ is normal, the state $\omega \circ \gamma$ of $\mathcal{B}(\mathcal{H}_1)$ is normal as well, and according to Proposition 4.4 it has infinite one-copy entanglement. Hence, by definition we can find for all $\epsilon > 0$ and all $d \in \mathbb{N}$ a local operation $T : \mathcal{B}(\mathbb{C}^d \otimes \mathbb{C}^d) \to \mathcal{B}(\mathcal{H}_1)$ such that
$$\omega\bigl(\gamma \circ T[|\chi_d\rangle\langle\chi_d|]\bigr) \ge 1 - \epsilon. \tag{4.21}$$
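Since $\gamma$ is local by assumption, this implies that $\omega$ has infinite one-copy entanglement, as stated.
A second consequence of Proposition 4.5 concerns Bell inequalities. To state it we need the following result from [20].
Proposition 4.7. Consider a (not necessarily simple) bipartite system, consisting of the von Neumann algebras $\mathcal{A}, \mathcal{B} \subset \mathcal{B}(\mathcal{H})$. The following two statements are equivalent:
(1) For every normal state $\omega$, we have $\beta(\omega) = \sqrt{2}$.
(2) There is a unitary isomorphism under which
$$\mathcal{H} \cong \mathcal{H}_1 \otimes \tilde{\mathcal{H}}, \qquad \mathcal{A} \cong \mathcal{R}_{L,1} \otimes \tilde{\mathcal{A}}, \qquad \mathcal{B} \cong \mathcal{R}_{R,1} \otimes \tilde{\mathcal{B}} \tag{4.22}$$
holds with appropriate von Neumann algebras $\tilde{\mathcal{A}}, \tilde{\mathcal{B}} \subset \mathcal{B}(\tilde{\mathcal{H}})$.
From this, we get with Proposition 4.5:
Corollary 4.8. Consider again the assumptions from Corollary 4.6. Then each normal state $\omega$ of $\mathcal{B}(\mathcal{H})$ satisfies $\beta(\omega) = \sqrt{2}$.
Proof. According to Proposition 4.5, we have a local, normal operation $\gamma : \mathcal{B}(\mathcal{H}_1) \to \mathcal{B}(\mathcal{H})$, and $\sigma = \omega \circ \gamma$ becomes a normal state of $\mathcal{B}(\mathcal{H}_1)$. Proposition 4.7 implies that $\beta(\sigma) = \sqrt{2}$ holds. Hence for each $\epsilon > 0$ there are operators $A_i \in \mathcal{R}_{L,1}$, $B_j \in \mathcal{R}_{R,1}$, $i, j = 1, 2$, satisfying $-\mathbb{1} \le A_i \le \mathbb{1}$, $-\mathbb{1} \le B_j \le \mathbb{1}$ and
$$\omega \circ \gamma\bigl(A_1(B_1 + B_2) + A_2(B_1 - B_2)\bigr) > \sqrt{2} - \epsilon. \tag{4.23}$$
Since $\gamma$ is local and $\epsilon > 0$ is arbitrary, this equation immediately implies that $\beta(\omega) = \sqrt{2}$ holds as stated.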
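For orientation, the value $\sqrt{2}$ can be checked numerically in the simplest finite-dimensional case. The sketch below is an illustration under stated assumptions, not the construction used in the proof: it takes a single maximally entangled qubit pair, a standard choice of observables, and the normalisation $\tfrac12\,\omega(A_1(B_1+B_2)+A_2(B_1-B_2))$, under which the classical bound is 1 and the quantum maximum is $\sqrt{2}$. The prefactor $\tfrac12$ is an assumption here, since the definition of $\beta$ lies outside this section.

```python
import numpy as np

# Pauli matrices used to build the observables.
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# Maximally entangled two-qubit vector (|00> + |11>) / sqrt(2).
chi = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)

# Alice's and Bob's observables, all with spectrum in [-1, 1].
A1, A2 = sz, sx
B1 = (sz + sx) / np.sqrt(2)
B2 = (sz - sx) / np.sqrt(2)

# Bell operator A1 (B1 + B2) + A2 (B1 - B2) on the tensor product.
bell_operator = np.kron(A1, B1 + B2) + np.kron(A2, B1 - B2)

# Assumed normalisation: one half of the expectation value.
beta = 0.5 * np.real(np.vdot(chi, bell_operator @ chi))
```

With these choices `beta` agrees with $\sqrt{2}$ up to rounding, which is the maximal violation referred to in Corollary 4.8.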
Now we can summarize all our results to get the main theorem of this section:
Theorem 4.9. Consider a simple bipartite system $\mathcal{A}, \mathcal{B} \subset \mathcal{B}(\mathcal{H})$ satisfying Haag-duality ($\mathcal{B} = \mathcal{A}'$). Then the following statements are equivalent:
(1) Each normal state on $\mathcal{B}(\mathcal{H})$ has infinite one-copy entanglement.
(2) Each separable state is singular.
(3) The algebras $\mathcal{A}, \mathcal{B}$ are not of type I.
(4) The split property does not hold.
(5) Each normal state on $\mathcal{B}(\mathcal{H})$ leads to a maximal violation of Bell inequalities.
(6) There is a von Neumann algebra $\mathcal{M} \subset \mathcal{B}(\mathcal{K})$ and a unitary $U : \mathcal{H} \to \mathcal{H}_1 \otimes \mathcal{K}$ with $U \mathcal{A} U^* = \mathcal{R}_{L,1} \otimes \mathcal{M}$ and $U \mathcal{B} U^* = \mathcal{R}_{R,1} \otimes \mathcal{M}'$.
(7) There is a normal state on $\mathcal{B}(\mathcal{H})$ with infinite one-copy entanglement.
Proof. The implications (1) ⇒ (2) and (2) ⇒ (3) are trivial, while (3) ⇒ (1) and (3) ⇔ (4) are shown in Corollary 4.6 and Proposition 4.2. Hence we get (1) ⇔ (2) ⇔ (3) ⇔ (4). To handle the remaining conditions note first that (3) ⇒ (5) and (7) ⇒ (3) follow from Corollary 4.8 and Theorem 4.1 respectively, while (5) ⇒ (6) is a consequence of Proposition 4.7 and the fact that Haag-duality holds by assumption. Hence it remains to show that (7) follows from (6). To this end assume that condition (6) holds and consider a normal state $\omega = \sigma_1 \otimes \sigma_2$ of $\mathcal{B}(\mathcal{H}_1) \otimes \mathcal{B}(\mathcal{K})$. According to Proposition 4.4, $\sigma_1$ (and therefore $\omega$ as well) has infinite one-copy entanglement. Since the operation $\mathcal{B}(\mathcal{H}) \ni A \mapsto U A U^* = \gamma(A) \in \mathcal{B}(\mathcal{H}_1) \otimes \mathcal{B}(\mathcal{K})$ is local and normal, the pull-back $\omega \circ \gamma$ of $\omega$ with $\gamma$ is normal and has infinite one-copy entanglement, which implies condition (7). Therefore we get the chain of equivalences (3) ⇔ (5) ⇔ (6) ⇔ (7), which concludes the proof.
Hence, under the assumption of Haag-duality, entanglement theory divides into two different cases: on the one hand, weakly entangled systems which can be described as usual in terms of tensor-product Hilbert spaces and, on the other, infinitely entangled ones, which always arise if the observable algebras $\mathcal{A}, \mathcal{B}$ of Alice and Bob are not of type I. This implies, in particular, that there are many systems which can be distinguished in terms of the type of the algebras $\mathcal{A}$ and $\mathcal{B}$, but not in terms of ordinary entanglement measures (because all normal states of these systems are infinitely entangled). Nevertheless, it seems likely that there are relations between the type of $\mathcal{A}, \mathcal{B}$ and entanglement which go beyond the result of Theorem 4.9. In this context it is of particular interest to look for entanglement properties which can be associated to a whole bipartite system instead of individual states. We come back to this discussion at the end of Sec. 5.2.
For now, let us conclude this section with the remark that item (6) of Theorem 4.9 admits an interpretation in terms of distillation and dilution processes, respectively, which fits nicely into the point of view just outlined: If we take the maximally entangled system $\mathcal{R}_{L/R,1}$ and add a second non-maximally entangled one $(\mathcal{M}, \mathcal{M}')$ the result $(\mathcal{A}, \mathcal{B})$
is again non-maximally entangled. Hence we have “diluted” the entanglement originally contained in $\mathcal{R}_{L/R,1}$. If we start on the other hand with a non-maximally entangled system $\mathcal{A}, \mathcal{B}$ and discard a lower one $(\mathcal{M}, \mathcal{M}')$ we can concentrate (or distill) the entanglement originally contained in $\mathcal{A}, \mathcal{B}$ and get a maximally entangled system $\mathcal{R}_{L/R,1}$.
5. Entangled Spin Chains
Let us return now to spin chains and to the C*-algebras $\mathcal{A}_L, \mathcal{A}_R \subset \mathcal{A}$ defined in Sec. 2. If $\omega$ is a pure state on the quasi-local algebra $\mathcal{A}$, the pair of von Neumann algebras $\mathcal{R}_{L,\omega}, \mathcal{R}_{R,\omega}$ forms a simple bipartite system (cf. Sec. 2). According to Lemma 3.5, $\omega$ has infinite one-copy entanglement with respect to $\mathcal{A}_L, \mathcal{A}_R$ iff the GNS vacuum has the same property with respect to $\mathcal{R}_{L,\omega}, \mathcal{R}_{R,\omega}$. Hence we get the following simple corollary of Theorem 4.9.
Corollary 5.1. Consider a pure state $\omega \in \mathcal{A}^*$ which satisfies Haag-duality, i.e. $\mathcal{R}_{R,\omega} = \mathcal{R}_{L,\omega}'$. It has infinite one-copy entanglement iff the von Neumann algebras $\mathcal{R}_{L/R,\omega}$ are not of type I.
Applying again Theorem 4.9 and Lemma 3.5 we see in addition that (under the same assumption as in Corollary 5.1) each $\pi_\omega$-normal state $\sigma$ has infinite one-copy entanglement as well. This fact has a simple but interesting consequence for the stability of infinite entanglement under time evolution. To explain the argument, consider a completely positive map $T : \mathcal{A} \to \mathcal{A}$ which is $\pi_\omega$-normal, i.e. there is a normal cp-map $T_\omega : \mathcal{B}(\mathcal{H}_\omega) \to \mathcal{B}(\mathcal{H}_\omega)$ such that $\pi_\omega(T(A)) = T_\omega(\pi_\omega(A))$. Obviously, this $T$ maps $\pi_\omega$-normal states to $\pi_\omega$-normal states. Hence we get
Corollary 5.2. Consider again a pure state $\omega \in \mathcal{A}^*$ which satisfies Haag-duality, and a $\pi_\omega$-normal cp map $T : \mathcal{A} \to \mathcal{A}$. The image $T^*(\omega)$ of $\omega$ under $T$ has infinite one-copy entanglement iff $\omega$ has.
We can interpret this corollary in terms of decoherence: Infinite one-copy entanglement of a state $\omega$ is stable under each decoherence process which can be described by a $\pi_\omega$-normal, completely positive time evolution.
By the same reasoning, it is impossible to reach a state with infinite one-copy entanglement by a normal operation, if we start from a (normal) separable state. This might look surprising at first glance. However, the result should not be overestimated: it does not mean that infinite one-copy entanglement cannot be destroyed. Instead, the message is that operations which are normal with respect to the GNS-representation of the initial state are too tame to describe physically realistic decoherence processes.
5.1. Translational invariance
After these general remarks, let us now have a closer look at those properties which explicitly use the net structure $\mathbb{Z} \supset \Lambda \mapsto \mathcal{A}_\Lambda \subset \mathcal{A}$, which defines the
kinematics of a spin chain. One of the most important properties derived from this structure is translational invariance. If a state $\omega$ is translationally invariant, we can restrict the possible types for the algebras $\mathcal{R}_{R/L,\omega}$ significantly, as the following proposition shows.
Proposition 5.3. If $\omega$ is a translationally invariant pure state, the half-chain algebra $\mathcal{R}_{L,\omega}$ (respectively, $\mathcal{R}_{R,\omega}$) is infinite, i.e. not of type II$_1$ or I$_n$ with $n < \infty$.
Proof. We only consider $\mathcal{R}_{L,\omega}$ because $\mathcal{R}_{R,\omega}$ can be treated similarly. Assume that $\mathcal{R}_{L,\omega}$ is a finite factor. Then there is a (unique) faithful, normal, tracial state $\tilde\psi$ on $\mathcal{R}_{L,\omega}$, which gives rise to a state $\psi = \tilde\psi \circ \pi_\omega$ on $\mathcal{A}_L$. Obviously $\psi$ is factorial and quasi-equivalent to the restriction of $\omega$ to $\mathcal{A}_L$. Hence by Corollary 2.6.11 of [33] we find for each $\epsilon > 0$ an $n \in -\mathbb{N}$ such that $|\omega(Q) - \psi(Q)| < (\epsilon/2)\|Q\|$ holds for all $Q \in \mathcal{A}$ which are located in the region $(-\infty, n]$. Now consider $A, B \in \mathcal{A}_{[0,k]}$ for some $k \in \mathbb{N}$ with $\|A\| = \|B\| = 1$. Then we get with $j > n + k$ and due to translational invariance
$$|\omega(AB) - \psi(\tau_{-j}(AB))| = |\omega(\tau_{-j}(AB)) - \psi(\tau_{-j}(AB))| < \epsilon/2. \tag{5.1}$$
Hence, using that $\psi$ is tracial,
$$|\omega(AB) - \omega(BA)| \le |\omega(AB) - \psi(\tau_{-j}(AB))| + |\psi(\tau_{-j}(AB)) - \omega(BA)| < \epsilon. \tag{5.2}$$
Since $\epsilon$ and $k$ were arbitrary we get $\omega(AB) = \omega(BA)$ for all $A, B \in \mathcal{A}_{\rm loc}$ and by continuity for all $A, B \in \mathcal{A}$. Hence $\omega$ is a tracial state on $\mathcal{A}$, which contradicts the assumption that $\omega$ is pure.
We do not yet know whether even more types can be excluded. However, the only cases where concrete examples exist are I$_\infty$ (completely separable states of the form $\phi^{\otimes \mathbb{Z}}$) and III$_1$ (the critical XY model with γ = 0; cf. Sec. 6.3). Our conjecture is that these are the only possibilities.
Another potential simplification arising from translational invariance concerns Haag-duality. We expect that each translationally invariant pure state automatically satisfies Haag-duality. However, we are not yet able to prove this conjecture. If it is true we could replace Haag-duality in Corollary 5.1 by translational invariance, which is usually easier to test (in particular, if $\omega$ is the ground state of a translationally invariant Hamiltonian).
Finally, note that we can discuss all these questions on a more abstract level, because we only need the unitary $V : \mathcal{H}_\omega \to \mathcal{H}_\omega$ which implements the shift $\tau$, in addition to the bipartite system $\mathcal{R}_{L/R,\omega}$. All other (local) algebras can be reconstructed by
$$\mathcal{A}_0 = V \mathcal{R}_{L,\omega} V^* \cap \mathcal{R}_{R,\omega}, \qquad \mathcal{A}_j = V^j \mathcal{A}_0 V^{-j}, \tag{5.3}$$
and appropriate products of the $\mathcal{A}_j$.
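5.2. Localization properties
The message of Theorem 4.9 and Corollary 5.1 is that whenever we have a spin chain in a pure state $\omega$ satisfying Haag-duality (or a state quasi-equivalent to such an $\omega$) we can generate as many singlets as we want by operations which are located somewhere in the left and right half-chains, respectively. However, these localization properties can be described somewhat more precisely. To this end, let us introduce the following definition:
Definition 5.4. Consider two regions $\Lambda_1, \Lambda_2 \subset \mathbb{Z}$ with $\Lambda_1 \cap \Lambda_2 = \emptyset$. An operation $T : \mathcal{B}(\mathbb{C}^d \otimes \mathbb{C}^d) \to \mathcal{A}$ is localized in $\Lambda_1$ and $\Lambda_2$ if $T$ is local in the sense of Definition 3.3 and if $T(\mathcal{B}(\mathbb{C}^d) \otimes \mathbb{1}) \subset \mathcal{A}_{\Lambda_1}$ and $T(\mathbb{1} \otimes \mathcal{B}(\mathbb{C}^d)) \subset \mathcal{A}_{\Lambda_2}$ hold.
Theorem 5.5. Consider a pure state $\omega$ on $\mathcal{A}$ which satisfies Haag-duality and which has infinite one-copy entanglement. Then the following statement holds: For all $\epsilon > 0$, $M \in -\mathbb{N}$, $N \in [-M, \infty)$ and $d \in \mathbb{N}$ we can find an operation $T$ which is localized in $(-\infty, M)$ and $[M + N, \infty)$ and which satisfies $\omega(T(|\chi_d\rangle\langle\chi_d|)) > 1 - \epsilon$.
Proof. Without loss of generality we can assume $M = 0$, because the proof is easily adapted to general $M$ (by translating $\omega$ appropriately). In addition let us denote the region $[0, N)$ by $\Lambda$ and set $\Lambda^c = \mathbb{Z} \setminus \Lambda$. Since $\mathcal{R}_{\Lambda,\omega} = \pi_\omega(\mathcal{A}_\Lambda)''$ is finite dimensional, it must be of type I. Hence there are Hilbert spaces $\mathcal{H}_{\Lambda,\omega}$ and $\mathcal{H}_{\Lambda^c,\omega}$ with
$$\mathcal{H}_\omega = \mathcal{H}_{\Lambda,\omega} \otimes \mathcal{H}_{\Lambda^c,\omega}, \qquad \mathcal{R}_{\Lambda,\omega} = \mathcal{B}(\mathcal{H}_{\Lambda,\omega}) \otimes \mathbb{1}, \qquad \mathcal{R}_{\Lambda^c,\omega} = \mathbb{1} \otimes \mathcal{B}(\mathcal{H}_{\Lambda^c,\omega}). \tag{5.4}$$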
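Since $\mathcal{R}_{L,\omega}$ and $\mathcal{R}_{[N,\infty),\omega}$ are subalgebras of $\mathcal{R}_{\Lambda^c,\omega}$ they can be written as
$$\mathcal{R}_{L,\omega} = \mathbb{1} \otimes \tilde{\mathcal{R}}_{L,\omega}, \qquad \mathcal{R}_{[N,\infty),\omega} = \mathbb{1} \otimes \tilde{\mathcal{R}}_{R,\omega} \tag{5.5}$$
with two von Neumann algebras $\tilde{\mathcal{R}}_{L/R,\omega}$ which act on $\mathcal{H}_{\Lambda^c,\omega}$ and which are isomorphic to $\mathcal{R}_{L,\omega}$ and $\mathcal{R}_{[N,\infty),\omega}$ respectively. We see immediately that $\tilde{\mathcal{R}}_{L,\omega} \vee \tilde{\mathcal{R}}_{R,\omega} = \mathcal{B}(\mathcal{H}_{\Lambda^c,\omega})$ follows from the corresponding property of $\mathcal{R}_{L/R,\omega}$. In addition $\tilde{\mathcal{R}}_{L,\omega}$ and $\tilde{\mathcal{R}}_{R,\omega}$ are mutually commuting, hyperfinite and σ-finite. Hence they form a simple bipartite system, as defined at the beginning of Sec. 4. To finish the proof we only have to show that $\tilde{\mathcal{R}}_{L/R,\omega}$ are not of type I and satisfy Haag-duality. The statement then follows from Theorem 4.9.
Since $\omega$ has infinite one-copy entanglement, $\mathcal{R}_{L/R,\omega}$ are, according to Theorem 4.9, not of type I. Hence Eq. (5.5) implies immediately that $\tilde{\mathcal{R}}_{L,\omega}$ cannot be of type I either. A similar statement about $\tilde{\mathcal{R}}_{R,\omega}$ follows from $\mathcal{R}_{R,\omega} = \mathcal{B}(\mathcal{H}_{\Lambda,\omega}) \otimes \tilde{\mathcal{R}}_{R,\omega}$. To show Haag-duality consider $A \in \tilde{\mathcal{R}}_{L,\omega}'$. Then we have $\mathbb{1} \otimes A \in \mathcal{R}_{L,\omega}' = \mathcal{R}_{R,\omega}$. Since $\mathcal{R}_{R,\omega} = \mathcal{B}(\mathcal{H}_{\Lambda,\omega}) \otimes \tilde{\mathcal{R}}_{R,\omega}$ this implies $A \in \tilde{\mathcal{R}}_{R,\omega}$ as required. Together with the previous remark this concludes the proof.
It is interesting to compare this result with the behavior of other models: If we consider a quantum field and two tangent, wedge-shaped subsets of spacetime as localization regions, the vacuum state has infinite one-copy entanglement under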
quite general conditions [20]. If the regions do not touch, however, the entanglement is finite and decays quite fast as a function of the (space-like) distance of the wedges (but entanglement never vanishes completely [21]). In a harmonic oscillator chain the entanglement is always finite even if we consider two adjacent half-chains, and it (almost) vanishes if we tear the half-chains apart [1]. In both examples the entanglement is mainly located at the place where the localization regions meet and is basically negligible at large distances. For a spin chain in a state with infinite one-copy entanglement, it is exactly the other way round.
At first glance the result from Theorem 5.5 seems to be quite obvious: A finite number of qubits can carry only a finite amount of entanglement. Subtracting a finite number from infinity remains infinite. This argument is, however, incomplete, because it assumes implicitly that entanglement is localized along the chain, such that ignoring a finite part in the middle cannot disturb the entanglement of the rest. The following corollary shows that this type of localization is indeed possible.
Corollary 5.6. Consider the same assumptions as in Theorem 5.5. For all $\epsilon > 0$, $M \in -\mathbb{N}$, $N \in [M, \infty)$ and $d \in \mathbb{N}$ there is an $L \in \mathbb{N}$ (depending in general on $N$, $\epsilon$ and $d$) and an operation $T$ localized in $\Lambda_1 = [M - L, M)$ and $\Lambda_2 = [M + N, M + N + L)$ (cf. Fig. 1) such that $\omega(T(|\chi_d\rangle\langle\chi_d|)) > 1 - \epsilon$ holds.
Proof. As above we can assume without loss of generality that $M = 0$ holds. From Theorem 5.5 we know that an operation $S : \mathcal{B}(\mathbb{C}^d \otimes \mathbb{C}^d) \to \mathcal{A}$ exists, which is localized in $(-\infty, 0)$ and $[N, \infty)$ and which satisfies
$$\omega(A) > 1 - \epsilon/2 \quad\text{with}\quad A = S(|\chi_d\rangle\langle\chi_d|). \tag{5.6}$$
The operator $A$ can be written as a limit over a net $A_\Lambda \in \mathcal{A}_\Lambda$ ($\Lambda \subset \mathbb{Z}$, finite), i.e. for each $\epsilon > 0$ there is a $\Lambda_\epsilon$ such that $\Lambda \supset \Lambda_\epsilon$ implies $\|A - A_\Lambda\| < \epsilon/4$. Now consider $\Lambda = [-L, N + L)$ such that $\Lambda_\epsilon \subset \Lambda$ and $\Lambda^c = \mathbb{Z} \setminus \Lambda$. On $\mathcal{A}_{\Lambda^c}$ we can define the state $\sigma = \bigotimes_{j \in \Lambda^c} \sigma^{(j)}$ with $\sigma^{(j)}(B) = \mathrm{tr}(B)/2$ and this leads to the operation (where $\mathrm{Id}_\Lambda$ denotes the identity map on $\mathcal{A}_\Lambda$, and we have denoted the map $\mathcal{A}_{\Lambda^c} \ni A \mapsto \sigma(A)\mathbb{1} \in \mathcal{A}_{\Lambda^c}$ again with $\sigma$)
$$\mathcal{B}(\mathbb{C}^d \otimes \mathbb{C}^d) \ni B \mapsto (\sigma \otimes \mathrm{Id}_\Lambda)\bigl(S(B)\bigr) \in \mathcal{A}_\Lambda, \tag{5.7}$$
Fig. 1. Localization regions $\Lambda_1, \Lambda_2$ from Corollary 5.6.
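which is localized in $[-L, 0)$ and $[N, N + L)$. Now note that the map $\sigma \otimes \mathrm{Id}_\Lambda$ is idempotent with $\|\sigma \otimes \mathrm{Id}_\Lambda\| = 1$ (since $\sigma$ is a state and therefore completely positive and unital). Since $(\sigma \otimes \mathrm{Id}_\Lambda)(A_\Lambda) = A_\Lambda$, we get
$$\|A - (\sigma \otimes \mathrm{Id}_\Lambda)(A)\| \le \|A - A_\Lambda\| + \|A_\Lambda - (\sigma \otimes \mathrm{Id}_\Lambda)(A)\| \le \frac{\epsilon}{4} + \|(\sigma \otimes \mathrm{Id}_\Lambda)(A_\Lambda - A)\| \le \frac{\epsilon}{2}, \tag{5.8}$$
therefore $|\omega(A - (\sigma \otimes \mathrm{Id}_\Lambda)(A))| \le \epsilon/2$ and this implies with (5.6)
$$\omega\bigl((\sigma \otimes \mathrm{Id}_\Lambda) S(|\chi_d\rangle\langle\chi_d|)\bigr) = \omega\bigl((\sigma \otimes \mathrm{Id}_\Lambda)(A)\bigr) \ge \omega(A) - \frac{\epsilon}{2} \ge 1 - \epsilon. \tag{5.9}$$
Hence the statement follows with $T = (\sigma \otimes \mathrm{Id}_\Lambda) S$.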
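The two properties of $\sigma \otimes \mathrm{Id}_\Lambda$ used in this proof (idempotency and norm one) can be checked in a toy model. The following Python sketch is an illustration under simplifying assumptions: the chain is replaced by two qubits, the first playing the role of $\Lambda^c$ and the second that of $\Lambda$, and the name `conditional_expectation` is introduced here for convenience.

```python
import numpy as np

def conditional_expectation(X):
    """Apply sigma ⊗ Id to a two-qubit operator X, with sigma = tr/2 on qubit 1.

    The result is returned as 1 ⊗ (reduced operator), again a 4x4 matrix.
    """
    X4 = np.asarray(X, dtype=complex).reshape(2, 2, 2, 2)  # (row1, row2, col1, col2)
    reduced = np.einsum("ikil->kl", X4) / 2.0              # normalized partial trace over qubit 1
    return np.kron(np.eye(2), reduced)

# Random test operator with a fixed seed for reproducibility.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
EX = conditional_expectation(X)

# Idempotency: applying the map twice changes nothing.
is_idempotent = np.allclose(conditional_expectation(EX), EX)
# Contractivity in operator norm, as expected for a unital cp map.
is_contractive = np.linalg.norm(EX, 2) <= np.linalg.norm(X, 2) + 1e-12
```

Operators of the form $\mathbb{1} \otimes B$, which correspond to elements already localized in $\Lambda$, are left fixed by the map; this is the step $(\sigma \otimes \mathrm{Id}_\Lambda)(A_\Lambda) = A_\Lambda$ in the estimate (5.8).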
This corollary strongly suggests the introduction of a function $L_\omega(M, N, \epsilon)$ which associates to a position $M$ and a distance $N$ the minimal length $L_\omega$ of the localization regions which is needed to extract a maximally entangled qubit pair with accuracy $0 < \epsilon < 1$ from a chain in the state $\omega$. For a state with infinite one-copy entanglement, $L_\omega$ is well defined and always finite. Hence it provides a method to distinguish between different states with infinite one-copy entanglement. To get an idea what $L_\omega$ can possibly tell us about $\omega$, consider first its dependence on $\epsilon$. We can get rid of it by defining $L_\omega(M, N) = \sup_\epsilon L_\omega(M, N, \epsilon)$. However, this quantity can become infinite if the entanglement contained in $\omega$ is not perfectly localized (i.e. we can never extract a perfect singlet at position $M$ and distance $N$). In this case the dependence of $L_\omega$ on $\epsilon$ is a measure of the degree of localization of the entanglement contained in $\omega$. To discuss the parameters $M$ and $N$ note that two quasi-equivalent factor states $\omega, \sigma$ become indistinguishable “far outside”, i.e. for each $\delta > 0$ there is a $K \in \mathbb{N}$ such that
$$A \in \mathcal{A}_{\{|j|>K\}} \ \Rightarrow\ |\omega(A) - \sigma(A)| < \delta \|A\| \tag{5.10}$$
holds [33, Corollary 2.6.11]. This indicates that the asymptotic behavior of Lω for M → ±∞, respectively N → ∞ characterizes the folium of ω (i.e. the equivalence class under quasi-equivalence) while the behavior for finite M, N distinguishes different states in the same folium. (This observation matches the discussion from the end of Sec. 4.3.) In both cases the dependence of Lω on M and N describes how entanglement is distributed along the chain (M ) and how it decays if the distance N of the localization regions grows. Closely related to Lω is the one-copy entanglement E1 (ωΛ ) of the restriction ωΛ of ω to AΛ = AΛ1 ⊗ AΛ2 , Λ = Λ1 ∪Λ2 , with respect to the splitting AΛ1 , AΛ2 ⊂ AΛ : For each L ≥ Lω (M, N ) we get E1 (ωΛ ) ≥ 1, if Λ1 , Λ2 are disjoint regions of length L, at position M and with distance N (cf. Fig. 1). This fact can be used to calculate Lω (M, N ) if we have a method to compute E1 (ωΛ ). Another closely related quantity is the one-copy entanglement E1 (ω) of ω with respect to the splitting of the whole chain into a finite contiguous block of length L and the rest.
Explicit calculations of this type are available in [10, 14], where it is shown that E1 diverges for critical chains logarithmically in L. Unfortunately the methods used there are restricted to pure states, and cannot be applied directly to the computation of the one-copy entanglement of ωΛ with respect to the bipartite system AΛ1 , AΛ2 ⊂ AΛ just mentioned (since ωΛ is in general mixed, even if ω is pure).
5.3. Cluster properties
The function $L_\omega$ just introduced provides a special way to analyze the decay of correlations as a function of the distance (of the localization regions). A different approach with the same goal is the study of cluster properties. In this subsection we will give a (very) brief review together with a discussion of the relations to the material presented in this paper. In its simplest form, the cluster property just says that correlations vanish at infinite distances, i.e.
$$\lim_{k \to \infty} |\omega(A\tau_k(B)) - \omega(A)\omega(B)| = 0 \tag{5.11}$$
should hold for all $A, B \in \mathcal{A}$ (this is known as the weak cluster property). This condition, however, is too weak for our purposes, because it always holds if $\omega$ is a translationally invariant factor state (cf. [33, Theorem 2.6.10]). Hence we have to control the decrease of correlations more carefully. One possibility is to consider exponential clustering, i.e. exponential decay of correlations. It is in particular conjectured that a translationally invariant state $\omega$ satisfies the split property (cf. Sec. 4.1) if
$$|\omega(A\tau_k(B)) - \omega(A)\omega(B)| \le C(A, B)\, e^{-Mk} \quad \forall A \in \mathcal{A}_L,\ B \in \mathcal{A}_R \tag{5.12}$$
holds, where $C(A, B)$ is a constant depending on $A$ and $B$, $M$ is a positive constant (independent of $A$ and $B$) and $k$ is any positive integer. A complete proof of this conjecture is not yet available. If it is true, however, it would imply according to [35] that any ground state with a spectral gap (for a Hamiltonian with finite range interaction) has the split property.
A different approach is to assume that the limit (5.11) holds (roughly speaking) uniformly in $A$. It can be shown that this uniform cluster property is indeed equivalent to the split property. More precisely, the following proposition holds [36, Proposition 2.2]:
Proposition 5.7. For each translationally invariant pure state $\omega$ on $\mathcal{A}$ the following two statements are equivalent.
(1) $\omega$ satisfies the split property, i.e. $\mathcal{R}_{L,\omega} \subset \mathcal{N} \subset \mathcal{R}_{R,\omega}'$ holds with a type I factor $\mathcal{N}$.
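(2) $\omega$ satisfies
$$\lim_{k \to \infty} \sup_{A} \Bigl| \sum_{j} \bigl(\omega(A_j \tau_k(B_j)) - \omega(A_j)\omega(B_j)\bigr) \Bigr| = 0, \tag{5.13}$$
where the supremum is taken over all $A \in \mathcal{A}_{\rm loc}$ with $\|A\| \le 1$ and
$$A = \sum_{j=1}^{n} A_j B_j, \qquad A_j \in \mathcal{A}_R,\ B_j \in \mathcal{A}_L \tag{5.14}$$
for some $n \in \mathbb{N}$.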
6. Case Study: The Critical XY Model
To illustrate the abstract discussion from the last two sections, let us now discuss the critical XY model and its unique ground state $\varphi_S$. To this end let us denote the GNS representation associated to $\varphi_S$ by $(\pi_S, \mathcal{H}_S, \Omega_S)$ and the corresponding half-chain von Neumann algebras by $\mathcal{R}_{L,S}$ and $\mathcal{R}_{R,S}$. The main result of this section is the following theorem which shows that the $\mathcal{R}_{L/R,S}$ are not of type I and that Haag-duality holds. The proof will be given in Sec. 6.3. In addition, we will provide a short review of several technical details of this model.
Theorem 6.1. Consider the critical XY model (i.e. $\alpha_t$ from Eq. (2.13) with $|\lambda| = 1$, $\gamma \ne 0$ or $|\lambda| < 1$, $\gamma = 0$).
(1) The unique ground state $\varphi_S$ is not split, i.e. $\mathcal{R}_{L,S}, \mathcal{R}_{R,S}$ are not of type I.
(2) $\varphi_S$ satisfies Haag-duality
$$\mathcal{R}_{L,S} = \mathcal{R}_{R,S}'. \tag{6.1}$$
According to Theorems 4.9 and 5.5 this result implies immediately that each πS -normal state (in particular ϕS itself) has infinite one-copy entanglement.
Corollary 6.2. Each πS -normal state ω on A has infinite one-copy entanglement with respect to the bipartite system AL , AR ⊂ A.
6.1. The selfdual CAR algebra
To prove Theorem 6.1 we will use the method introduced in [18] by Araki. The idea is, basically, to trace statements about spin chains back to statements about Fermionic systems (cf. Sec. 6.2). To prepare this step we will give a short review of some material about CAR algebras which will be used in this context. More detailed and complete presentations of this subject can be found in [37–39, 17].
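Hence, let us consider a complex Hilbert space $\mathcal{K}$ equipped with an antiunitary involution $\Gamma$. To this pair we can associate a C*-algebra $\mathcal{A}^{\rm CAR}(\mathcal{K}, \Gamma)$ which is generated by elements $B(h) \in \mathcal{A}^{\rm CAR}(\mathcal{K}, \Gamma)$ where $h \in \mathcal{K}$ and $h \mapsto B(h)$ is a linear map satisfying
$$\{B(h_1)^*, B(h_2)\} = (h_1, h_2)_{\mathcal{K}}\, \mathbb{1}, \qquad B(\Gamma h)^* = B(h). \tag{6.2}$$
$\mathcal{A}^{\rm CAR}(\mathcal{K}, \Gamma)$ is uniquely determined up to isomorphisms and called the selfdual CAR algebra over $(\mathcal{K}, \Gamma)$. If there is no risk of confusion we denote $\mathcal{A}^{\rm CAR}(\mathcal{K}, \Gamma)$ by $\mathcal{A}^{\rm CAR}$. Any unitary $u$ on $\mathcal{K}$ satisfying $\Gamma u \Gamma = u$ gives rise to the automorphism $\beta_u$ of $\mathcal{A}^{\rm CAR}$ determined by
$$\beta_u(B(h)) = B(uh). \tag{6.3}$$
$\beta_u$ is called the Bogoliubov automorphism associated with $u$. Of particular importance is the case $u = -\mathbb{1}$ and we write
$$\Theta = \beta_{-1}. \tag{6.4}$$
$\Theta$ is an automorphism of $\mathcal{A}^{\rm CAR}(\mathcal{K}, \Gamma)$ specified by the following equation:
$$\Theta(B(h)) = -B(h). \tag{6.5}$$
As the automorphism $\Theta$ is involutive, $\Theta^2(Q) = Q$, we introduce the $\mathbb{Z}_2$ grading with respect to $\Theta$:
$$\mathcal{A}^{\rm CAR}_\pm = \{Q \in \mathcal{A}^{\rm CAR} \mid \Theta(Q) = \pm Q\}, \qquad \mathcal{A}^{\rm CAR} = \mathcal{A}^{\rm CAR}_+ \cup \mathcal{A}^{\rm CAR}_-. \tag{6.6}$$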
Next we introduce quasi-free states of $\mathcal{A}^{\rm CAR}(\mathcal{K}, \Gamma)$. To this end note that for each state $\psi$ of $\mathcal{A}^{\rm CAR}$ there exists a bounded selfadjoint operator $A$ on the test function space $\mathcal{K}$ such that
$$\psi(B(h_1)B(h_2)) = (\Gamma h_1, A h_2)_{\mathcal{K}} \tag{6.7}$$
and
$$0 \le A \le \mathbb{1}, \qquad \Gamma A \Gamma = \mathbb{1} - A \tag{6.8}$$
hold. $A$ is called the covariance operator for $\psi$.
Definition 6.3. Let $A$ be a selfadjoint operator on $\mathcal{K}$ satisfying (6.8), and $\psi_A$ the state of $\mathcal{A}^{\rm CAR}(\mathcal{K}, \Gamma)$ determined by
$$\psi_A(B(h_1)B(h_2) \cdots B(h_{2n+1})) = 0 \tag{6.9}$$
and
$$\psi_A(B(h_1)B(h_2) \cdots B(h_{2n})) = \sum_{p} \mathrm{sign}(p) \prod_{j=1}^{n} (\Gamma h_{p(2j-1)}, A h_{p(2j)})_{\mathcal{K}}, \tag{6.10}$$
where the sum is taken over all permutations $p$ satisfying
$$p(1) < p(3) < \cdots < p(2n-1), \qquad p(2j-1) < p(2j) \tag{6.11}$$
and $\mathrm{sign}(p)$ is the signature of $p$. $\psi_A$ is called the quasi-free state associated with the covariance operator $A$.
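The combinatorics of Eq. (6.10) can be made explicit with a short enumeration. The following Python sketch is a hedged illustration: the names `pairings` and `quasi_free_moment` are introduced here, and the matrix `G` stands for hypothetical two-point data `G[a][b]` $= (\Gamma h_a, A h_b)_{\mathcal{K}}$ with zero-based indices. It sums over exactly the permutations allowed by (6.11), with their signs.

```python
def pairings(indices):
    """Yield (sign, pairs) for all pairings obeying the ordering rule (6.11).

    The first index is always paired with a later one; moving its partner
    next to it costs (pos - 1) transpositions, which fixes the sign.
    """
    if not indices:
        yield 1, []
        return
    first = indices[0]
    for pos in range(1, len(indices)):
        partner = indices[pos]
        rest = indices[1:pos] + indices[pos + 1:]
        sign = -1 if (pos - 1) % 2 else 1
        for sub_sign, sub_pairs in pairings(rest):
            yield sign * sub_sign, [(first, partner)] + sub_pairs

def quasi_free_moment(G):
    """Evaluate the right-hand side of (6.9)/(6.10) from two-point data G."""
    size = len(G)
    if size % 2 == 1:
        return 0                      # odd moments vanish, Eq. (6.9)
    total = 0
    for sign, pairs in pairings(list(range(size))):
        term = sign
        for a, b in pairs:
            term *= G[a][b]
        total += term
    return total

# Hypothetical two-point data for a four-point function.
G = [[0, 1, 2, 3],
     [0, 0, 4, 5],
     [0, 0, 0, 6],
     [0, 0, 0, 0]]
four_point = quasi_free_moment(G)
```

For four operators the sum has three terms, `G[0][1]*G[2][3] - G[0][2]*G[1][3] + G[0][3]*G[1][2]`, which for the sample data evaluates to 8.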
A projection $E$ on $\mathcal{K}$ satisfying $\Gamma E \Gamma = \mathbb{1} - E$ is called a basis projection and the corresponding quasi-free state $\psi_E$ is called a Fock state. A quasi-free state is pure iff it is a Fock state. The GNS representation $(\mathcal{H}_E, \pi_E, \Omega_E)$ of $\psi_E$ can be easily given in terms of the antisymmetric Fock space $\mathcal{F}_a(E\mathcal{K})$ over $E\mathcal{K}$:
$$\mathcal{H}_E = \mathcal{F}_a(E\mathcal{K}), \qquad \pi_E(B(h)) = C(E\Gamma h) + C^*(Eh), \qquad \Omega_E = \Omega, \tag{6.12}$$
where $C(f)$, $C^*(f)$ denote annihilation and creation operators on $\mathcal{F}_a(E\mathcal{K})$ and $\Omega \in \mathcal{F}_a(E\mathcal{K})$ is the usual Fock vacuum.
If two quasi-free states are given, we need a criterion to decide whether they are quasi-equivalent or not. This is done by the following proposition.
Proposition 6.4. Two quasi-free states $\psi_{A_1}, \psi_{A_2}$ of $\mathcal{A}^{\rm CAR}(\mathcal{K}, \Gamma)$ are quasi-equivalent iff the operator $\sqrt{A_1} - \sqrt{A_2}$ is Hilbert–Schmidt.
For two Fock states $\psi_{E_1}, \psi_{E_2}$ this condition obviously reduces to: $E_1 - E_2$ is Hilbert–Schmidt, and since $\psi_{E_1}$ and $\psi_{E_2}$ are pure, they are quasi-equivalent iff they are unitarily equivalent. Hence in this case we get the statement: $\psi_{E_1}$ and $\psi_{E_2}$ are unitarily equivalent iff $E_1 - E_2$ is Hilbert–Schmidt. If only one of the two operators is a projection, Proposition 6.4 can be easily reduced to the following statement (cf. [37] for a proof):
Proposition 6.5. Consider a Fock state $\psi_E$ and a quasi-free state $\psi_A$ of $\mathcal{A}^{\rm CAR}(\mathcal{K}, \Gamma)$. They are quasi-equivalent iff $E - A$ and $A(\mathbb{1} - A)$ are both Hilbert–Schmidt.
Now consider a second projection $P$ on $\mathcal{K}$ and assume that $P$ commutes with $\Gamma$. Then we can define $\mathcal{A}^{\rm CAR}(P\mathcal{K}, P\Gamma P)$ which is a subalgebra of $\mathcal{A}^{\rm CAR}(\mathcal{K}, \Gamma)$. To state our next result (known as “twisted duality”) concerning the commutant of the algebra
$$\mathcal{M}(P) = \pi_E\bigl(\mathcal{A}^{\rm CAR}(P\mathcal{K}, P\Gamma P)\bigr)'', \tag{6.13}$$
note that $\psi_E$ is invariant under the automorphism $\Theta$ defined in (6.4). Hence there is a unitary $Z$ on $\mathcal{H}_E$ such that $\pi_E(\Theta(A)) = Z \pi_E(A) Z^*$ holds. Now we have (cf. [37, 40] for a proof)
Proposition 6.6 (Twisted Duality). The von Neumann algebra
$$\mathcal{N}(\mathbb{1} - P) = \bigl\{ Z \pi_E(B(h)) \mid h \in (\mathbb{1} - P)\mathcal{K} \bigr\}'' \tag{6.14}$$
coincides with the commutant of $\mathcal{M}(P)$, i.e. $\mathcal{M}(P)' = \mathcal{N}(\mathbb{1} - P)$ holds.
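6.2. The Jordan–Wigner transformation
Now we will use the arguments in [18] to relate spin chains to Fermionic systems. The first step is to enlarge the algebra $\mathcal{A}$ to another algebra $\tilde{\mathcal{A}}$ by adding a new selfadjoint unitary element $T$ which has the following property:
$$T^* = T, \qquad T^2 = \mathbb{1}, \qquad TQT = \Theta_-(Q) \ \text{ for } Q \in \mathcal{A}, \tag{6.15}$$
where $\Theta_-$ is an automorphism of $\mathcal{A}$ defined by
$$\Theta_-(Q) = \lim_{N \to \infty} \Bigl(\prod_{j=-N}^{-1} \sigma_z^{(j)}\Bigr) Q \Bigl(\prod_{j=-N}^{-1} \sigma_z^{(j)}\Bigr). \tag{6.16}$$
$\tilde{\mathcal{A}}$ is the crossed product by the $\mathbb{Z}_2$ action via $\Theta_-$. Obviously
$$\tilde{\mathcal{A}} = \mathcal{A} \cup \mathcal{A}T \tag{6.17}$$
and we extend $\Theta_-$ to $\tilde{\mathcal{A}}$ by $\Theta_-(T) = T$. We introduce another automorphism $\Theta$ via the formula
$$\Theta(Q) = \lim_{N \to \infty} \Bigl(\prod_{j=-N}^{N} \sigma_z^{(j)}\Bigr) Q \Bigl(\prod_{j=-N}^{N} \sigma_z^{(j)}\Bigr). \tag{6.18}$$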
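Thus
$$\Theta(\sigma_x^{(j)}) = -\sigma_x^{(j)}, \qquad \Theta(\sigma_y^{(j)}) = -\sigma_y^{(j)}, \qquad \Theta(T) = T, \tag{6.19}$$
and we set
$$\mathcal{A}_\pm = \{Q \in \mathcal{A} \mid \Theta(Q) = \pm Q\}. \tag{6.20}$$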
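On a finite chain the limit in Eq. (6.18) is not needed and the grading can be computed directly. The sketch below is a finite-size illustration only: the chain length `n`, the zero-based site labels and the helper names `site_op`, `theta` and `graded_parts` are assumptions made here. It implements $\Theta(Q) = PQP$ with $P = \prod_j \sigma_z^{(j)}$ and splits an operator into its even and odd parts.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def site_op(op, j, n):
    """Embed the single-site operator `op` at site j of an n-site chain."""
    factors = [I2] * n
    factors[j] = op
    return reduce(np.kron, factors)

def theta(Q, n):
    """Finite-chain analogue of Eq. (6.18): conjugation with the product of all sigma_z."""
    P = reduce(np.kron, [SZ] * n)
    return P @ Q @ P

def graded_parts(Q, n):
    """Return the even part (Q + Theta(Q))/2 and the odd part (Q - Theta(Q))/2."""
    TQ = theta(Q, n)
    return (Q + TQ) / 2, (Q - TQ) / 2

n = 3
X1 = site_op(SX, 1, n)        # odd element: Theta(X1) = -X1
Z0 = site_op(SZ, 0, n)        # even element: Theta(Z0) = Z0
even, odd = graded_parts(X1 + Z0, n)
```

Here `even` reproduces `Z0` and `odd` reproduces `X1`, in line with Eqs. (6.19) and (6.20).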
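Now we can realize the creation and annihilation operators of fermions in $\tilde{\mathcal{A}}$ as follows:
$$c_j^* = T S_j (\sigma_x^{(j)} + i\sigma_y^{(j)})/2, \qquad c_j = T S_j (\sigma_x^{(j)} - i\sigma_y^{(j)})/2, \tag{6.21}$$
where
$$S_j = \begin{cases} \sigma_z^{(0)} \cdots \sigma_z^{(j-1)} & \text{for } j \ge 1, \\ \mathbb{1} & \text{for } j = 0, \\ \sigma_z^{(j)} \cdots \sigma_z^{(-1)} & \text{for } j \le -1. \end{cases} \tag{6.22}$$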
The operators $c_j^*$ and $c_j$ satisfy the canonical anticommutation relations
$$\{c_j, c_k\} = \{c_j^*, c_k^*\} = 0, \qquad \{c_j, c_k^*\} = \delta_{j,k}\, \mathbb{1} \tag{6.23}$$
for any integers $j$ and $k$.
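The anticommutation relations (6.23) can be verified numerically on a finite chain. The following Python sketch makes simplifying assumptions that should be kept in mind: the chain consists of sites $0, \ldots, n-1$ only, so the element $T$ (which accounts for the infinite left tail) is dropped and only the string $S_j$ for $j \ge 0$ is used; the helper names are introduced here.

```python
import numpy as np
from functools import reduce

I2 = np.eye(2, dtype=complex)
SX = np.array([[0, 1], [1, 0]], dtype=complex)
SY = np.array([[0, -1j], [1j, 0]], dtype=complex)
SZ = np.array([[1, 0], [0, -1]], dtype=complex)

def site_op(op, j, n):
    """Embed the single-site operator `op` at site j of an n-site chain."""
    factors = [I2] * n
    factors[j] = op
    return reduce(np.kron, factors)

def annihilator(j, n):
    """c_j = S_j (sigma_x^(j) - i sigma_y^(j)) / 2 with S_j = sigma_z^(0) ... sigma_z^(j-1)."""
    string = np.eye(2 ** n, dtype=complex)
    for k in range(j):
        string = string @ site_op(SZ, k, n)
    lowering = (site_op(SX, j, n) - 1j * site_op(SY, j, n)) / 2
    return string @ lowering

def anticommutator(a, b):
    return a @ b + b @ a

n = 3
c = [annihilator(j, n) for j in range(n)]
identity = np.eye(2 ** n)

# Check {c_j, c_k} = 0 and {c_j, c_k^*} = delta_jk * 1 for all pairs of sites.
car_holds = all(
    np.allclose(anticommutator(c[j], c[k]), 0)
    and np.allclose(anticommutator(c[j], c[k].conj().T), identity * (j == k))
    for j in range(n) for k in range(n)
)
```

The same operators also reproduce the identity $\sigma_z^{(0)} = 2c_0^* c_0 - \mathbb{1}$, which appears in the extension of the shift to $T$.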
For a vector $f = (f_j) \in l^2(\mathbb{Z})$, we set
$$c^*(f) = \sum_{j \in \mathbb{Z}} c_j^* f_j, \qquad c(f) = \sum_{j \in \mathbb{Z}} c_j f_j, \tag{6.24}$$
˜ Furthermore, let where the sum converges in norm topology of A. B(h) = c∗ (f1 ) + c(f2 ),
(6.25)
where h = (f1 ⊕ f2 ) is a vector in the test function space K = l2 (Z) ⊕ l2 (Z) . By f¯ we denote the complex conjugate f¯ = (f¯j ) of f ∈ l2 (Z) and we introduce an antiunitary involution Γ on the test function space K = l2 (Z)⊕ l2 (Z) determined by Γ(f1 ⊕ f2 ) = (f¯2 ⊕ f¯1 ).
(6.26)
It is easy to see that {B(h1 )∗ , B(h2 )} = (h1 , h2 )K 1,
B(Γh)∗ = B(h)
(6.27)
holds. Hence the elements B(h) just defined generate a subalgebra of A˜ which is isomorphic to the CAR algebra ACAR (K, Γ), and which is therefore identified with the latter. In this context note that the two definitions of the automorphism Θ in Eqs. (6.18) and (6.4) are compatible. The relation between the CAR algebra ACAR and the spin chain algebra A is now given by the following equation: A+ = ACAR , +
A− = ACAR T, −
(6.28)
i.e. the even parts of both algebras coincide. Note that this implies in particular that $\mathcal{A}$ is generated by elements $B(h)T$ with $h \in K$. Furthermore, the automorphisms $\tau$ and $\Theta_-$ can be implemented as well in terms of Bogolubov transformations, provided the shift $\tau$ is extended to $\tilde{\mathcal{A}}$ by
$$\tau_1(c_j) = c_{j+1}, \qquad \tau_1(c_j^*) = c_{j+1}^*, \qquad \tau_1(T) = T\sigma_z^{(0)} = T(2c_0^* c_0 - \mathbb{1}). \tag{6.29}$$
Now we define for $f = (f_j) \in l^2(\mathbb{Z})$ the operators
$$(uf)_j = f_{j-1} \tag{6.30}$$
and
$$(\theta_- f)_j = \begin{cases} f_j & \text{for } j \geq 0, \\ -f_j & \text{for } j \leq -1. \end{cases} \tag{6.31}$$
By abuse of notation, we denote the operators $\theta_-$ and $u$ on $K = l^2(\mathbb{Z}) \oplus l^2(\mathbb{Z})$ by the same symbols:
$$u(f_1 \oplus f_2) = (uf_1 \oplus uf_2), \qquad \theta_-(f_1 \oplus f_2) = (\theta_- f_1 \oplus \theta_- f_2). \tag{6.32}$$
Then we have
$$\tau_1(B(h)) = B(uh), \qquad \Theta_-(B(h)) = B(\theta_- h) \tag{6.33}$$
for all $h \in K$.
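Now we are interested in states $\omega$ on $\mathcal{A}$ which are $\Theta$-invariant. Since $\Theta(A) = -A$ for each $A \in \mathcal{A}_-$ this implies that $\omega$ is uniquely determined by its restriction to $\mathcal{A}_+$. Due to Eq. (6.28) this restriction can arise in particular from a Fock state $\psi_E$ of $\mathcal{A}^{\mathrm{CAR}}$, i.e.
$$\omega(A) = \omega(A_+ + A_-) = \psi_E(A_+), \qquad A_+ \in \mathcal{A}_+ = \mathcal{A}^{\mathrm{CAR}}_+, \quad A_- \in \mathcal{A}_-. \tag{6.34}$$
For this special class of states we can trace Haag-duality back to twisted duality (Proposition 6.6). To this end let us introduce the projection $p$ on $l^2(\mathbb{Z})$ by
$$p = \frac{\theta_- + \mathbb{1}}{2} \tag{6.35}$$
or more explicitly, for $f$ in $l^2(\mathbb{Z})$,
$$(pf)_j = \begin{cases} f_j & \text{for } j \geq 0, \\ 0 & \text{for } j \leq -1. \end{cases} \tag{6.36}$$
On $K$ we then set
$$P(f_1 \oplus f_2) = (pf_1 \oplus pf_2). \tag{6.37}$$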
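The operator $P$ defines the localization to the right half-chain. With this notation we can state the following result:

Proposition 6.7. Consider a $\Theta$-invariant state $\omega$ which coincides on $\mathcal{A}_+ = \mathcal{A}^{\mathrm{CAR}}_+$ with the Fock state $\psi_E$. Then Haag-duality holds, i.e.
$$\mathcal{R}_{L,\omega}' = \mathcal{R}_{R,\omega} \tag{6.38}$$
is satisfied.

Proof. The idea of the proof is to relate the GNS representation $(\mathcal{H}_\omega, \pi_\omega, \Omega_\omega)$ of $\omega$ to the GNS representation $(\mathcal{H}_E, \pi_E, \Omega_E)$ of $\psi_E$ (i.e. the Fock representation), and to apply twisted duality (Proposition 6.6). Hence, let us consider the restriction $\psi_E^+$ of $\psi_E$ to $\mathcal{A}_+ = \mathcal{A}^{\mathrm{CAR}}_+$. Its GNS representation is given by $(\mathcal{H}_E^+, \pi_E^+, \Omega_E)$ with
$$\mathcal{H}_E^+ = [\pi_E(\mathcal{A}_+)\,\Omega_E], \qquad \pi_E^+(A) = \pi_E(A)\upharpoonright \mathcal{H}_E^+, \qquad A \in \mathcal{A}_+. \tag{6.39}$$
In addition, note that $\mathcal{A}$ can be written as the crossed product of $\mathcal{A}_+$ with respect to the $\mathbb{Z}_2$ action given by $\mathrm{Ad}(\sigma_x^{(0)})$. In other words each $A \in \mathcal{A}$ can be written in a unique way as $A = A_0 + A_1\sigma_x^{(0)}$ with $A_0, A_1 \in \mathcal{A}_+$. This implies that $\pi_\omega$ is uniquely determined by its action on $\mathcal{A}_+$ and $\sigma_x^{(0)}$. It is therefore straightforward to see that $\pi_\omega$ can be written as
$$\mathcal{H}_\omega = \mathcal{H}_E^+ \oplus \mathcal{H}_E^+, \qquad \pi_\omega(A) = \pi_E^+(A) \oplus \pi_E^+(\sigma_x^{(0)} A \sigma_x^{(0)}), \qquad A \in \mathcal{A}_+, \tag{6.40}$$
$$\Omega_\omega = \Omega_E \oplus 0, \qquad \pi_\omega(\sigma_x^{(0)})\, \xi \oplus \eta = \eta \oplus \xi. \tag{6.41}$$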
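Alternatively, recall that $\mathcal{A}$ is generated by elements $B(h)T \in \mathcal{A}_-$ with $h \in K$. Hence it is sufficient to calculate $\pi_\omega(B(h)T)$. To this end, denote the orthocomplement of $\mathcal{H}_E^+$ by $\mathcal{H}_E^-$ and introduce the operators
$$B_E^\pm(h) = \pi_E(B(h)) \upharpoonright \mathcal{H}_E^\mp, \qquad h \in K. \tag{6.42}$$
From Eqs. (6.9) and (6.10), it follows immediately that the range of $B_E^\pm(h)$ is $\mathcal{H}_E^\pm$, hence
$$\pi_E(B(h))\, \xi \oplus \eta = B_E^+(h)\eta \oplus B_E^-(h)\xi, \qquad \xi \in \mathcal{H}_E^+, \quad \eta \in \mathcal{H}_E^-. \tag{6.43}$$
With $B(h)T = B(h)T\sigma_x^{(0)}\sigma_x^{(0)}$, we get from (6.40) and (6.41),
$$\pi_\omega(B(h)T)\, \xi \oplus \eta = \pi_E^+\bigl(B(h)T\sigma_x^{(0)}\bigr)\eta \oplus \pi_E^+\bigl(\sigma_x^{(0)}B(h)T\bigr)\xi. \tag{6.44}$$
Now note that $\sigma_x^{(0)} = TB(h_0)$ holds with $(h_0)_j = (\delta_{j0}, \delta_{j0})$; this can be derived immediately from the definitions of $B(h)$ and $c_j$, $c_j^*$ in Eqs. (6.21) and (6.25). Hence we get from (6.44),
$$\pi_\omega(B(h)T)\, \xi \oplus \eta = \pi_E^+\bigl(B(h)B(h_0)\bigr)\eta \oplus \pi_E^+\bigl(B(h_0)TB(h)T\bigr)\xi \tag{6.45}$$
$$= B_E^+(h)B_E^-(h_0)\eta \oplus B_E^+(h_0)B_E^-(\theta_- h)\xi, \tag{6.46}$$
where we have used $T^2 = \mathbb{1}$, $TB(h)T = \Theta_-(B(h)) = B(\theta_- h)$ and the fact that $T$ commutes with $B(h_0)$; cf. the definition of $T$ and $\Theta_-$ in (6.15) and (6.16). This implies
$$U\pi_\omega(B(h)T)U^*\, \xi \oplus \kappa = B_E^+(h)\kappa \oplus B_E^-(\theta_- h)\xi, \qquad \xi \in \mathcal{H}_E^+, \quad \kappa \in \mathcal{H}_E^-, \tag{6.47}$$
where $U : \mathcal{H}_E^+ \oplus \mathcal{H}_E^+ \to \mathcal{H}_E^+ \oplus \mathcal{H}_E^-$ denotes the unitary given by
$$U\, \xi \oplus \eta = \xi \oplus B_E^-(h_0)\eta, \qquad U^*\, \xi \oplus \kappa = \xi \oplus B_E^+(h_0)\kappa, \tag{6.48}$$
for each $\xi, \eta \in \mathcal{H}_E^+$ and $\kappa \in \mathcal{H}_E^-$.

To continue the proof recall that $Z$ is the unitary on $\mathcal{H}_E$ which implements the automorphism $\Theta$ of $\mathcal{A}^{\mathrm{CAR}}$. Hence $ZA_+Z^* = A_+$ for $A_+ \in \mathcal{A}^{\mathrm{CAR}}_+$ and $ZA_-Z^* = -A_-$ for $A_- \in \mathcal{A}^{\mathrm{CAR}}_-$. Since the even algebra $\mathcal{A}^{\mathrm{CAR}}_+$ is generated by monomials $B(h_1) \cdots B(h_{2n})$ with an even number of factors, we see that $A_+\mathcal{H}_E^+ \subset \mathcal{H}_E^+$ and $A_+\mathcal{H}_E^- \subset \mathcal{H}_E^-$ hold for each $A_+ \in \mathcal{A}^{\mathrm{CAR}}_+$. Similarly we have $A_-\mathcal{H}_E^+ \subset \mathcal{H}_E^-$ and vice versa if $A_- \in \mathcal{A}^{\mathrm{CAR}}_-$. This implies immediately that $Z$ is given (up to a global phase) by $Z\xi = \xi$ and $Z\kappa = -\kappa$ for $\xi \in \mathcal{H}_E^+$ and $\kappa \in \mathcal{H}_E^-$. Since $\theta_-(Ph) = Ph$ and $\theta_-([\mathbb{1} - P]h) = -[\mathbb{1} - P]h$ hold, we get from (6.47)
$$U\pi_\omega(B(Ph)T)U^* = \pi_E(B(Ph)), \tag{6.49}$$
$$U\pi_\omega(B([\mathbb{1} - P]h)T)U^* = Z\pi_E(B([\mathbb{1} - P]h)). \tag{6.50}$$
In addition, we have
$$\mathcal{R}_{L,\omega} = \bigl\{\pi_\omega(B([\mathbb{1} - P]h)T) \mid h \in K\bigr\}'', \tag{6.51}$$
$$\mathcal{R}_{R,\omega} = \bigl\{\pi_\omega(B(Ph)T) \mid h \in K\bigr\}''. \tag{6.52}$$
Hence we get (6.38) from Proposition 6.6.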
6.3. The ground state

Now let us return to the XY model and its ground state (cf. [19] for details). Recall that the shift is defined on $\mathcal{A}^{\mathrm{CAR}}$ by a Bogolubov transformation with respect to the unitary $u$ given in Eq. (6.30). A quasi-free state $\psi_A$ is translationally invariant if and only if the covariance operator $A$ commutes with this $u$. It turns out that for a translationally invariant quasi-free state $\psi_A$, the Fourier transform $\mathcal{F}A\mathcal{F}^{-1}$ of the covariance operator $A$ is a ($2 \times 2$ matrix valued) multiplication operator $\tilde{A}(x)$ on $\mathcal{F}K = L^2([0, 2\pi]) \oplus L^2([0, 2\pi])$. We use the following normalization for the Fourier transform:
$$\mathcal{F}(f)(x) = \sum_{n=-\infty}^{\infty} e^{inx} f_n, \qquad f_n = (2\pi)^{-1} \int_0^{2\pi} e^{-inx}\, \mathcal{F}(f)(x)\, dx \tag{6.53}$$
for $f = (f_n) \in l^2(\mathbb{Z})$ and $\mathcal{F}(f)(x) \in L^2([0, 2\pi])$. The $\Theta$-invariant ground state $\varphi_S$ of the XY model is described by
$$\varphi_S(Q) = \varphi_S(Q_+ + Q_-) = \psi_E(Q_+), \tag{6.54}$$
where $Q = Q_+ + Q_-$, $Q_\pm \in \mathcal{A}_\pm$, and $E$ is the basis projection defined by the multiplication operator on $\mathcal{F}K$:
$$\mathcal{F}E\mathcal{F}^{-1} = \hat{E}(x) = \frac{1}{2}\Bigl(\mathbb{1} + \frac{1}{k(x)} K(x)\Bigr) \tag{6.55}$$
with
$$K(x) = \begin{pmatrix} \cos x - \lambda & -i\gamma \sin x \\ i\gamma \sin x & -(\cos x - \lambda) \end{pmatrix} \tag{6.56}$$
and
$$k(x) = \bigl[(\cos x - \lambda)^2 + \gamma^2 \sin^2 x\bigr]^{1/2}. \tag{6.57}$$
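Since $K(x)^2 = k(x)^2\,\mathbb{1}$, the matrix $\hat{E}(x)$ in (6.55) is a rank-one orthogonal projection wherever $k(x) \neq 0$. The sketch below checks this pointwise on a grid; the function names and the grid (which avoids the endpoints $x = 0, 2\pi$, where $k$ vanishes for $\lambda = 1$) are illustrative assumptions, not part of the text.

```python
import numpy as np


def k_scalar(x, lam, gamma):
    """k(x) from (6.57)."""
    return np.sqrt((np.cos(x) - lam) ** 2 + (gamma * np.sin(x)) ** 2)


def K_matrix(x, lam, gamma):
    """K(x) from (6.56)."""
    a = np.cos(x) - lam
    b = gamma * np.sin(x)
    return np.array([[a, -1j * b], [1j * b, -a]])


def E_hat(x, lam, gamma):
    """E^(x) from (6.55); only defined where k(x) != 0."""
    return 0.5 * (np.eye(2) + K_matrix(x, lam, gamma) / k_scalar(x, lam, gamma))


def projection_defect(lam, gamma, num_points=200):
    """Largest violation of E^2 = E, E = E^*, tr E = 1 on an interior grid."""
    xs = np.linspace(0.05, 2 * np.pi - 0.05, num_points)
    worst = 0.0
    for x in xs:
        E = E_hat(x, lam, gamma)
        worst = max(
            worst,
            np.abs(E @ E - E).max(),
            np.abs(E - E.conj().T).max(),
            abs(np.trace(E).real - 1.0),
        )
    return worst


if __name__ == "__main__":
    print(projection_defect(lam=1.0, gamma=0.5))
    print(projection_defect(lam=0.3, gamma=0.7))
```

The parameter values in the usage lines are arbitrary; for choices where $k$ has a zero inside the grid (for instance $\gamma = 0$, $|\lambda| < 1$) the grid would have to avoid that point as well.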
We will denote the GNS representation of $\varphi_S$ by $(\mathcal{H}_S, \pi_S, \Omega_S)$ and the left/right half-chain algebras by $\mathcal{R}_{L/R,S}$. From Proposition 6.7 we immediately get:

Corollary 6.8. The unique ground state $\varphi_S$ of the critical XY model satisfies Haag-duality, i.e.
$$\mathcal{R}_{L,S}' = \mathcal{R}_{R,S} \tag{6.58}$$
holds.

The next step is to analyze the type of the half-chain algebras $\mathcal{R}_{L/R,S}$. For an isotropic chain ($\gamma = 0$) with magnetic field $|\lambda| < 1$ this is done in [36, Theorem 4.3] using methods from [41].

Proposition 6.9. Consider the ground state $\varphi_S$ in the special case $\gamma = 0$, $|\lambda| < 1$. Then the von Neumann algebras $\mathcal{R}_{R/L,S}$ are of type III$_1$.

In the general case we are not yet able to prove such a strong result. We can only show that the $\mathcal{R}_{L/R,S}$ are not of type I (as stated in Theorem 6.1). This is
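done in a series of steps, which traces the problem back to a statement about quasi-inequivalence of quasi-free states.

Lemma 6.10. Consider a pure state $\omega$ on $\mathcal{A}$ and its restrictions $\omega_{L/R}$ to $\mathcal{A}_{L/R}$. Assume that the von Neumann algebras $\mathcal{R}_{L/R,\omega}$ are of type I. Then $\omega$ and $\sigma = \omega_L \otimes \omega_R$ are quasi-equivalent and factorial.

Proof. Since $\mathcal{R}_{R,\omega}$ and $\mathcal{R}_{L,\omega}$ are of type I, we can decompose the GNS Hilbert space into a tensor product $\mathcal{H}_\omega = \mathcal{H}_{L,\omega} \otimes \mathcal{H}_{R,\omega}$ with $\mathcal{R}_{R,\omega} = \mathbb{1} \otimes \mathcal{B}(\mathcal{H}_{R,\omega})$ and $\mathcal{R}_{L,\omega} = \mathcal{B}(\mathcal{H}_{L,\omega}) \otimes \mathbb{1}$. The state $\sigma = \omega_L \otimes \omega_R$ is $\omega$-normal and it can be written as $\sigma(A) = \mathrm{tr}(\pi_\omega(A)\, \rho_L \otimes \rho_R)$ where $\rho_{L/R}$ are partial traces of $|\Omega_\omega\rangle\langle\Omega_\omega|$ over $\mathcal{H}_{R/L,\omega}$. The GNS representation of $\sigma$ is therefore given by $\mathcal{H}_\sigma = \mathcal{H}_\omega \otimes \mathcal{K}$ and $\pi_\sigma(A) = \pi_\omega(A) \otimes \mathbb{1}$ with an auxiliary Hilbert space $\mathcal{K}$. Hence $\pi_\sigma(\mathcal{A})'' = \mathcal{B}(\mathcal{H}_\omega) \otimes \mathbb{1}$, which shows that $\sigma$ is factorial. Since $\omega$ is factorial as well, the two states are either quasi-equivalent or disjoint, and since $\sigma$ is $\omega$-normal they are quasi-equivalent.

Hence, to prove that $\mathcal{R}_{L/R,S}$ are not of type I, we have to show that $\varphi_S$ and $\varphi_{L,S} \otimes \varphi_{R,S}$ are quasi-inequivalent. The following lemmas help us to translate this to a statement about states on $\mathcal{A}^{\mathrm{CAR}}$.

Lemma 6.11. Consider two $\Theta$-invariant states $\omega_1$, $\omega_2$ on $\mathcal{A}$ and their restrictions $\omega_1^+$, $\omega_2^+$ to the even algebra $\mathcal{A}_+$. Assume in addition that $\omega_1$ is pure and $\omega_2^+$ factorial. If $\omega_1$ and $\omega_2$ are quasi-equivalent, one of the following is valid:
(1) The restriction to the even part $\omega_1^+$ is quasi-equivalent to $\omega_2^+$.
(2) The restriction to the even part $\omega_1^+$ is quasi-equivalent to $\omega_2^+ \circ \mathrm{Ad}(\sigma_x^{(0)})$, where $\mathrm{Ad}(\sigma_x^{(0)})(Q) = \sigma_x^{(0)} Q \sigma_x^{(0)}$.

Proof. Let us denote the GNS representation of $\omega_j^+$ by $(\mathcal{H}_j^+, \pi_j^+, \Omega_j^+)$ and of $\omega_j$ by $(\mathcal{H}_j, \pi_j, \Omega_j)$. Then we have with $A \in \mathcal{A}_+$
$$\mathcal{H}_j^+ = \overline{\pi_j(\mathcal{A}_+)\Omega_j}, \qquad \Omega_j^+ = \Omega_j, \qquad P_j^+\pi_j(A)P_j^+ = \pi_j^+(A)$$
and
$$P_j^-\pi_j(A)P_j^- = \pi_j^-(A) = \pi_j^+(\sigma_x^{(0)} A \sigma_x^{(0)}), \tag{6.59}$$
where $P_j^\pm$ denote the projections onto $\mathcal{H}_j^+$ and its orthocomplement $\mathcal{H}_j^-$. Since $P_j^\pm \in \pi_j(\mathcal{A}_+)'$ the maps
$$\pi_j(\mathcal{A}_+)'' \ni A \mapsto P_j^\pm A P_j^\pm \in \pi_j^\pm(\mathcal{A}_+)'' \tag{6.60}$$
define *-homomorphisms onto $\pi_j^\pm(\mathcal{A}_+)''$. Now note that $\omega_1$ and $\omega_2$ are factorial. For $\omega_1$ this follows from purity (hence $\pi_1(\mathcal{A})'' = \mathcal{B}(\mathcal{H}_1)$) and for $\omega_2$ from quasi-equivalence with $\omega_1$, since the latter implies the existence of a *-isomorphism
$$\beta : \pi_1(\mathcal{A})'' \to \pi_2(\mathcal{A})'' \quad \text{with } \beta(\pi_1(A)) = \pi_2(A). \tag{6.61}$$
Due to factoriality of $\omega_j$ the center $\mathcal{Z}_j$ of $\pi_j(\mathcal{A}_+)''$ is either trivial or two-dimensional. To see this, note that any operator in $\mathcal{Z}_j$ which commutes with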
$V_j = \pi_j(\sigma_x^{(0)})$ is in the center of $\pi_j(\mathcal{A})''$. Since $\omega_j$ is factorial, this implies that the automorphism $\pi_j(\mathcal{A}_+)'' \ni Q \mapsto \alpha_j(Q) = V_j Q V_j \in \pi_j(\mathcal{A}_+)''$ acts ergodically on $\mathcal{Z}_j$ (i.e. the fixed point algebra is trivial). But $\alpha_j$ is an involution ($\alpha_j^2 = \mathrm{id}$), so that each $\alpha_j(Q)Q$, $Q \in \mathcal{Z}_j$, is a fixed point of $\alpha_j$. If $Q$ is a non-trivial projection this implies $\alpha_j(Q) = \mathbb{1} - Q$. By linearity of $\alpha_j$ this cannot hold simultaneously for two orthogonal projections $Q_1$, $Q_2 \neq \mathbb{1} - Q_1$ in $\mathcal{Z}_j$. Hence $\mathcal{Z}_j$ is at most two-dimensional as stated.

To proceed, we have to use purity of $\omega_1$. According to [19, Lemmas 4.1 and 8.1] the representations $\pi_1^+$ and $\pi_1^- = \pi_1^+ \circ \mathrm{Ad}(\sigma_x^{(0)})$ of $\mathcal{A}_+$ are irreducible and disjoint. Since $\pi_1^\pm(A) = P_1^\pm \pi_1(A) P_1^\pm$ holds for each $A \in \mathcal{A}_+$ the latter implies that the central supports $c(P_1^\pm)$ of $P_1^+$ and $P_1^- = \mathbb{1} - P_1^+$ (i.e. the smallest central projections in $\pi_1(\mathcal{A}_+)''$ containing $P_1^\pm$) are orthogonal. But this is only possible if $c(P_1^\pm) = P_1^\pm$. Hence $P_1^\pm$ are in the center of $\pi_1(\mathcal{A}_+)''$ and according to the discussion of the last paragraph these are the only non-trivial central projections. Applying the *-isomorphism $\beta$ we see likewise that $Q = \beta(P_1^+)$ and $\mathbb{1} - Q = \beta(P_1^-)$ are the only non-trivial central projections in $\pi_2(\mathcal{A}_+)''$. Since $A \mapsto P_2^+ A P_2^+$ is a *-homomorphism from $\pi_2(\mathcal{A}_+)''$ onto $\pi_2^+(\mathcal{A}_+)''$ the center of $\pi_2(\mathcal{A}_+)''$ is mapped into the center of $\pi_2^+(\mathcal{A}_+)''$. Since $\omega_2^+$ is factorial by assumption we get $P_2^+ Q P_2^+ = P_2^+$ and $P_2^+(\mathbb{1} - Q)P_2^+ = 0$ or vice versa. This implies either $Q = P_2^+$ or $Q = P_2^-$. Hence $\beta$ maps $\pi_1^+(\mathcal{A}_+)''$ in the first case to $\pi_2^+(\mathcal{A}_+)''$ and in the second to $\pi_2^-(\mathcal{A}_+)''$. Therefore $\omega_1^+$ is quasi-equivalent to $\omega_2^+$ or $\omega_2^+ \circ \mathrm{Ad}(\sigma_x^{(0)})$ as stated.
We will apply this lemma to states coinciding with quasi-free states on the even part of the algebra. The following lemmas (partly taken from [42, 43]) help us to discuss the corresponding restrictions to $\mathcal{A}^{\mathrm{CAR}}_+$.

Lemma 6.12. Let $\omega_1$ and $\omega_2$ be quasi-free states of $\mathcal{A}^{\mathrm{CAR}}$. The restrictions to the even part $\omega_1^+$ and $\omega_2^+$ are not quasi-equivalent, if $\omega_1$ and $\omega_2$ are not quasi-equivalent.

Proof. cf. [42, Proposition 1].

Lemma 6.13. Consider a basis-projection $E$, the covariance operator
$$F = PEP + (\mathbb{1} - P)E(\mathbb{1} - P), \tag{6.62}$$
and the restrictions $\psi_E^+$, $\psi_F^+$ of the quasi-free states $\psi_E$, $\psi_F$ to the even algebra $\mathcal{A}_+$. If $\psi_E$ and $\psi_F$ are quasi-inequivalent, $\psi_E^+$ is quasi-inequivalent to $\psi_F^+$ and to $\psi_F^+ \circ \mathrm{Ad}(\sigma_x^{(0)})$.

Proof. Quasi-inequivalence of $\psi_E^+$ and $\psi_F^+$ follows directly from Lemma 6.12. Hence assume $\psi_E^+$ and $\psi_F^+ \circ \mathrm{Ad}(\sigma_x^{(0)})$ are quasi-equivalent. From the proof of Proposition 6.7, recall that $\sigma_x^{(0)} = TB(h_0) = B(h_0)T$ holds with $h_0 \in K$, $(h_0)_j = (\delta_{j0}, \delta_{j0})$. Therefore
$$\sigma_x^{(0)} B(h) \sigma_x^{(0)} = B(h_0)TB(h)TB(h_0) = B(h_0)B(\theta_- h)B(h_0). \tag{6.63}$$
With the anti-commutation relations (6.2) we get $\sigma_x^{(0)} B(h) \sigma_x^{(0)} = B(\vartheta h)$ with $\vartheta(h) = \langle h_0, \theta_- h\rangle h_0 - \theta_- h$. The operator $\vartheta$ is selfadjoint and unitary and commutes with $\Gamma$. This implies that $\vartheta F \vartheta$ is a valid covariance operator and $\psi_F \circ \mathrm{Ad}(\sigma_x^{(0)}) = \psi_{\vartheta F \vartheta}$ is therefore quasi-free. Hence by Lemma 6.12, quasi-equivalence of $\psi_E^+$ and $\psi_F^+ \circ \mathrm{Ad}(\sigma_x^{(0)})$ implies quasi-equivalence of $\psi_E$ and $\psi_F \circ \mathrm{Ad}(\sigma_x^{(0)})$. To proceed note that $\psi_F \circ \mathrm{Ad}(\sigma_x^{(0)})$ and $\psi_F \circ \Theta_-$ are unitarily equivalent. This follows immediately from $\mathrm{Ad}(\sigma_x^{(0)}) = \Theta_- \circ \mathrm{Ad}(B(h_0))$ and the fact that $\mathrm{Ad}(B(h_0))$ is an inner automorphism of $\mathcal{A}^{\mathrm{CAR}}$. Therefore $\psi_E$ is quasi-equivalent to $\psi_F \circ \Theta_- = \psi_{\theta_- F \theta_-}$. But $\theta_- = 2P - \mathbb{1}$ and therefore $P\theta_- = P$ and $(\mathbb{1} - P)\theta_- = (P - \mathbb{1})$, which implies $\theta_- F \theta_- = F$. But this would imply that $\psi_E$ and $\psi_F$ are quasi-equivalent, in contradiction to our assumption. Hence $\psi_E^+$ cannot be quasi-equivalent to $\psi_F^+ \circ \mathrm{Ad}(\sigma_x^{(0)})$.

Lemma 6.14. Consider a quasi-free state $\psi_A$ of $\mathcal{A}^{\mathrm{CAR}}$ with covariance operator $A$. Its restriction $\psi_A^+$ to the even algebra $\mathcal{A}^{\mathrm{CAR}}_+$ is factorial if $A(\mathbb{1} - A)$ is not of trace-class.

Proof. cf. [42, Proposition 2].

Now consider again the ground state $\varphi_S$ and the corresponding product state $\sigma = \varphi_{S,L} \otimes \varphi_{S,R}$. On the even algebra $\mathcal{A}^{\mathrm{CAR}}_+$ they coincide with the Fock state $\psi_E$ and the quasi-free state $\psi_F$, where $E$ is the basis projection from Eq. (6.55) and $F$ is given by Eq. (6.62). To check quasi-equivalence, we have to calculate the Hilbert–Schmidt norm of $E - F$ (cf. Propositions 6.4 and 6.5). Such calculations are already done in [19], and we easily get the following lemma.

Lemma 6.15. The operator
$$X = PEP - PEPEP + (\mathbb{1} - P)E(\mathbb{1} - P) - (\mathbb{1} - P)E(\mathbb{1} - P)E(\mathbb{1} - P) \tag{6.64}$$
with $E$ from Eq. (6.55) is not trace-class.

Proof. According to [19, Lemma 4.5], we have
$$\|E - \theta_- E \theta_-\|_{HS}^2 = \mathrm{tr}(E + \theta_- E \theta_- - E\theta_- E\theta_- - \theta_- E \theta_- E) = \infty. \tag{6.65}$$
Inserting $\theta_- = P - (\mathbb{1} - P)$ and using the fact that $\mathrm{tr}(Y) = \mathrm{tr}(PYP) + \mathrm{tr}\bigl((\mathbb{1} - P)Y(\mathbb{1} - P)\bigr)$ holds for any positive operator $Y$, it is straightforward to see that $\|E - \theta_- E \theta_-\|_{HS}^2 = 4\,\mathrm{tr}(X)$ holds. Hence the statement follows.

Now we are ready to combine all the steps to prove that $\mathcal{R}_{L/R,S}$ are not of type I. The following proposition concludes the proof of Theorem 6.1.

Proposition 6.16. Consider the unique ground state $\varphi_S$ of the critical XY model and its GNS representation $(\mathcal{H}_S, \pi_S, \Omega_S)$. The half-chain algebras $\mathcal{R}_{R,S} = \pi_S(\mathcal{A}_R)''$, $\mathcal{R}_{L,S} = \pi_S(\mathcal{A}_L)''$ are not of type I.
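The algebraic identities relating $E$, $F$, $\theta_-$ and $X$, which enter both the proof of Lemma 6.15 and the proof below, hold for any pair of projections and can be checked in finite dimensions. The following sketch uses a random projection `E` and a diagonal projection `P` on $\mathbb{C}^8$ as stand-ins; these matrices, and all function names, are illustrative assumptions. In finite dimensions every trace is finite, so the sketch tests only the algebra, not the divergence in (6.65).

```python
import numpy as np


def random_projection(dim, rank, seed=0):
    """Orthogonal projection onto a random `rank`-dimensional subspace."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=(dim, rank)) + 1j * rng.normal(size=(dim, rank))
    q, _ = np.linalg.qr(z)  # reduced QR: q has orthonormal columns
    return q @ q.conj().T


def hs_norm_sq(a):
    """Squared Hilbert-Schmidt norm tr(a^* a)."""
    return np.trace(a.conj().T @ a).real


def check_identities(dim=8, rank=3, cut=4, seed=0):
    E = random_projection(dim, rank, seed)
    one = np.eye(dim, dtype=complex)
    P = np.diag([1.0] * cut + [0.0] * (dim - cut)).astype(complex)
    Pc = one - P
    theta = P - Pc                                # theta_- = P - (1 - P)
    F = P @ E @ P + Pc @ E @ Pc                   # Eq. (6.62)
    X = (P @ E @ P - P @ E @ P @ E @ P
         + Pc @ E @ Pc - Pc @ E @ Pc @ E @ Pc)    # Eq. (6.64)
    tr_x = np.trace(X).real
    return {
        "hs(E - theta E theta) - 4 tr X": hs_norm_sq(E - theta @ E @ theta) - 4 * tr_x,
        "hs(E - F) - tr X": hs_norm_sq(E - F) - tr_x,
        "max |F(1 - F) - X|": np.abs(F @ (one - F) - X).max(),
    }


if __name__ == "__main__":
    for name, value in check_identities().items():
        print(f"{name}: {value:.2e}")
```

All three quantities vanish up to rounding, reflecting $E - \theta_- E \theta_- = 2(E - F)$ and $E^2 = E$.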
Proof. Consider the operators $E$, $F$ and $X$ from Eqs. (6.55), (6.62) and (6.64). It is easy to see that $\|E - F\|_{HS}^2 = \mathrm{tr}(X)$. Hence $E - F$ is not Hilbert–Schmidt by Lemma 6.15 and $\psi_E$ is not quasi-equivalent to $\psi_F$ by Proposition 6.5. Lemma 6.13 therefore implies that $\psi_E^+$ is neither quasi-equivalent to $\psi_F^+$ nor to $\psi_F^+ \circ \mathrm{Ad}(\sigma_x^{(0)})$. The quasi-free states $\psi_E$, $\psi_F$ coincide on $\mathcal{A}^{\mathrm{CAR}}_+ = \mathcal{A}_+$ with $\varphi_S$ and $\sigma = \varphi_{S,L} \otimes \varphi_{S,R}$. In addition we know that $\varphi_S$ and $\sigma$ are $\Theta$-invariant, $\varphi_S$ is pure and $\sigma^+ = \psi_F^+$ is factorial. The latter follows from Lemmas 6.14 and 6.15 and the fact that $F(\mathbb{1} - F) = X$ holds. Hence we can apply Lemma 6.11 to see that $\varphi_S$ and $\sigma$ are quasi-inequivalent. The statement then follows from Lemma 6.10.

7. Conclusions

We have seen that the amount of entanglement contained in a pure state $\omega$ of an infinite quantum spin chain is deeply related to the type of the von Neumann algebras $\mathcal{R}_{L/R,\omega}$. If they are of type I, the usual setup of entanglement theory can be applied, including in particular the calculation of entanglement measures. However, if $\mathcal{R}_{L/R,\omega}$ are not of type I all normal states have infinite one-copy entanglement and all known entanglement measures become meaningless. The discussion of Sec. 6 clearly shows that the critical XY model belongs to this class and it is very likely that the same holds for other critical models.

An interesting topic for future research is the question of how different states (respectively inequivalent bipartite systems) can be physically distinguished in the infinitely entangled case. One possible approach is to look again at the von Neumann type. However, it is very likely that additional information about the physical context is needed. A promising variant of this idea is to look for physical conditions which exclude particular cases. Proposition 5.3 is already a result of this type and it is interesting to ask whether more types can be excluded by translational invariance.

Another possibility is to analyze localization behavior along the lines outlined at the end of Sec. 5.2. In particular, the asymptotics of $L_\omega$ in the limit $N \to \infty$ for a translationally invariant state (such that $L_\omega$ does not depend on the position parameter $M$) seems to be very interesting, because it should provide a way to characterize the folium of $\omega$ in terms of entanglement properties (cf. the discussion in Sec. 5.2). A first step in this direction would be the calculation of $L_\omega$ for particular examples such as the critical XY model.

Acknowledgment

The research of M. K. is partially supported by the Ministero Italiano dell'Università e della Ricerca (MIUR) through FIRB (bando 2001) and PRIN 2005, and that of T. M. by the Center of Excellence Program, Graduate School Mathematics, Kyushu University, Japan.

Appendix A. Strong Stability of Hyperfinite Type III Factors

The discussion in Sec. 4.3 relies heavily on the strong stability of hyperfinite type III factors. While this is basically a known fact, we have not found an easily accessible
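reference. Therefore, we will provide in the following a complete proof, which is based on the classification of hyperfinite factors (cf. [32, Chap. XIII] for a detailed survey). Hence, let us start with a type III factor $\mathcal{R}$ and its continuous decomposition [44, Theorem XII.1.1]
$$\mathcal{R} \cong \mathcal{N} \rtimes_\theta \mathbb{R}, \tag{A.1}$$
i.e. $\mathcal{N}$ is a type II$_\infty$ von Neumann algebra (acting on a Hilbert space $\mathcal{H}$), admitting a faithful, semifinite, normal trace $\tau$, and $\theta$ is a centrally ergodic flow on $\mathcal{N}$ which scales $\tau$ (i.e. $\tau \circ \theta_s = e^{-s}\tau$). The covariant system $(\mathcal{N}, \mathbb{R}, \theta)$ is uniquely determined (up to conjugation) by the isomorphism class of $\mathcal{R}$. Therefore the central system $(\mathcal{Z}(\mathcal{N}), \mathbb{R}, \theta)$, i.e. the flow of weights, is unique as well. Now, consider a (hyperfinite) type II$_1$ factor $\mathcal{M}$ (acting on $\mathcal{K}$). The tensor product $\mathcal{R} \otimes \mathcal{M}$ is type III again and satisfies
$$\mathcal{R} \otimes \mathcal{M} \cong (\mathcal{N} \otimes \mathcal{M}) \rtimes_{\theta \otimes \mathrm{Id}} \mathbb{R}. \tag{A.2}$$
To prove this equation, note that the crossed product on the right-hand side is a von Neumann algebra acting on the Hilbert space $L^2(\mathcal{H} \otimes \mathcal{K}, \mathbb{R}, dx) = L^2(\mathcal{H}, \mathbb{R}, dx) \otimes \mathcal{K}$ and generated by $\pi_0(\mathcal{N} \otimes \mathcal{M})$ and $\lambda(\mathbb{R})$, where $\pi_0$ and $\lambda$ are representations of $\mathcal{N} \otimes \mathcal{M}$ and $\mathbb{R}$ respectively. They are given by
$$(\pi_0(A \otimes B)\xi)(s) = (\theta_s^{-1}(A) \otimes B)\,\xi(s), \qquad (\lambda(t)\xi)(s) = \xi(t - s), \tag{A.3}$$
where $A \in \mathcal{N}$, $B \in \mathcal{M}$ and $\xi \in L^2(\mathcal{H} \otimes \mathcal{K}, \mathbb{R}, dx)$. If we set $\xi = \eta \otimes \zeta$ with $\eta \in L^2(\mathcal{H}, \mathbb{R}, dx)$ and $\zeta \in \mathcal{K}$ this leads to
$$\pi_0(A \otimes B)\, \eta \otimes \zeta = \tilde{\pi}_0(A)\eta \otimes B\zeta, \qquad \lambda(t)\, \eta \otimes \zeta = \tilde{\lambda}(t)\eta \otimes \zeta, \tag{A.4}$$
where $\tilde{\pi}_0$ and $\tilde{\lambda}$ are the representations of $\mathcal{N}$ and $\mathbb{R}$ given by
$$(\tilde{\pi}_0(A)\eta)(s) = \theta_s^{-1}(A)\,\eta(s), \qquad (\tilde{\lambda}(t)\eta)(s) = \eta(t - s). \tag{A.5}$$
But $\tilde{\pi}_0(\mathcal{N})$ and $\tilde{\lambda}(\mathbb{R})$ generate $\mathcal{N} \rtimes_\theta \mathbb{R} \cong \mathcal{R}$. Hence Eq. (A.2) follows from (A.4). Since $\mathcal{R}$ is a type III and $\mathcal{M}$ a type II factor, the tensor product $\mathcal{R} \otimes \mathcal{M}$ is again a type III factor. If we consider in addition the (unique) tracial state $\tau_0$ on $\mathcal{M}$ we see that $\theta \otimes \mathrm{Id}$ scales $\tau \otimes \tau_0$. Therefore Eq. (A.2) is the continuous decomposition of $\mathcal{R} \otimes \mathcal{M}$.

Now, let us have a look at the flow of weights associated to $\mathcal{R} \otimes \mathcal{M}$. Since $\mathcal{M}$ is a factor the center of $\mathcal{N} \otimes \mathcal{M}$ coincides with $\mathcal{Z}(\mathcal{N}) \otimes \mathbb{1}$. Hence the central covariant systems $(\mathcal{Z}(\mathcal{N}), \mathbb{R}, \theta)$ and $(\mathcal{Z}(\mathcal{N} \otimes \mathcal{M}), \mathbb{R}, \theta \otimes \mathrm{Id})$ are mutually conjugate. If $\mathcal{R}$ is hyperfinite, this fact can be used to show strong stability. To this end note first that $\mathcal{R} \otimes \mathcal{M}$ is hyperfinite as well, because $\mathcal{M}$ is hyperfinite by assumption. Therefore we can use classification theory and get three different cases:
• $\mathcal{R}$ is of type III$_\lambda$ with $0 < \lambda < 1$. In this case the flow of weights of $\mathcal{R}$ is periodic with period $-\ln \lambda$. Since $(\mathcal{Z}(\mathcal{N}), \mathbb{R}, \theta)$ and $(\mathcal{Z}(\mathcal{N} \otimes \mathcal{M}), \mathbb{R}, \theta \otimes \mathrm{Id})$ are conjugate the same holds for $\mathcal{R} \otimes \mathcal{M}$, i.e. $\mathcal{R} \otimes \mathcal{M}$ is type III$_\lambda$ with the same $\lambda$ (cf. [44,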
Definition XII.1.5, Theorem XII.1.6]). Strong stability ($\mathcal{R} \otimes \mathcal{M} \cong \mathcal{R}$) therefore follows from the uniqueness of hyperfinite III$_\lambda$ factors with $0 < \lambda < 1$ (cf. [32, Theorem XVIII.1.1]).
• $\mathcal{R}$ is of type III$_1$. Hence the center of $\mathcal{N}$ is trivial and since $\mathcal{M}$ is a factor the same holds for $\mathcal{Z}(\mathcal{N} \otimes \mathcal{M})$; in other words $\mathcal{R} \otimes \mathcal{M}$ is type III$_1$ again (cf. [44, Definition XII.1.5, Theorem XII.1.6]). Now we can proceed as above, if we use the uniqueness of the hyperfinite type III$_1$ factor [32, Theorem XVIII.4.16].
• $\mathcal{R}$ is of type III$_0$. In this case strong stability follows directly from the fact that two hyperfinite III$_0$ factors are isomorphic iff the corresponding flows of weights are conjugate [32, Theorem XVIII.2.1].

This list covers all possibilities and therefore the strong stability property used in the proof of Proposition 4.5 is shown.

References

[1] K. Audenaert, J. Eisert, M. B. Plenio and R. F. Werner, Entanglement properties of the harmonic chain, Phys. Rev. A 66 (2002) 042327.
[2] M. Fannes, B. Haegeman and M. Mosonyi, Entropy growth of shift-invariant states on a quantum spin chain, J. Math. Phys. 44(12) (2003) 6005–6019.
[3] A. Botero and B. Reznik, Spatial structures and localization of vacuum entanglement in the linear harmonic chain, Phys. Rev. A 70 (2004) 052329.
[4] P. Calabrese and J. Cardy, Entanglement entropy and quantum field theory, J. Stat. Mech. Theory Exp. 2004(6) (2004) 002, 27 pp. (electronic).
[5] B.-Q. Jin and V. E. Korepin, Quantum spin chain, Toeplitz determinants and the Fisher–Hartwig conjecture, J. Statist. Phys. 116(1–4) (2004) 79–95.
[6] J. P. Keating and F. Mezzadri, Random matrix theory and entanglement in quantum spin chains, Comm. Math. Phys. 252(1–3) (2004) 543–579.
[7] V. E. Korepin, Universality of entropy scaling in one dimensional gapless models, Phys. Rev. Lett. 92 (2004) 096402.
[8] J. I. Latorre, E. Rico and G. Vidal, Ground state entanglement in quantum spin chains, Quantum Inf. Comput. 4(1) (2004) 48–92.
[9] I. Peschel, On the entanglement entropy for an XY spin chain, J. Stat. Mech. Theory Exp. 2004(12) (2004) 005, 6 pp. (electronic).
[10] J. Eisert and M. Cramer, Single-copy entanglement in critical spin chains, Phys. Rev. A 72 (2005) 042112.
[11] S. Farkas and Z. Zimborás, On the sharpness of the zero-entropy-density conjecture, J. Math. Phys. 46(12) (2005) 123301.
[12] A. R. Its, B.-Q. Jin and V. E. Korepin, Entanglement in the XY spin chain, J. Phys. A 38(13) (2005) 2975–2990.
[13] J. P. Keating and F. Mezzadri, Entanglement in quantum spin chains, symmetry classes of random matrices, and conformal field theory, Phys. Rev. Lett. 94(5) (2005) 050501.
[14] R. Orus, J. I. Latorre, J. Eisert and M. Cramer, Half the entanglement in critical systems is distillable from a single specimen, quant-ph/0509023 (2005).
[15] M. M. Wolf, G. Ortiz, F. Verstraete and J. I. Cirac, Quantum phase transitions in matrix product systems, cond-mat/0512180 (2005).
[16] M. Keyl, D. Schlingemann and R. F. Werner, Infinitely entangled states, Quant. Inf. Comput. 3(4) (2003) 281–306.
[17] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics II (Springer, Berlin, 1997).
[18] H. Araki, On the XY-model on two-sided infinite chain, Publ. Res. Inst. Math. Sci. 20(2) (1984) 277–296.
[19] H. Araki and T. Matsui, Ground states of the XY-model, Comm. Math. Phys. 101(2) (1985) 213–245.
[20] S. J. Summers and R. Werner, Maximal violation of Bell's inequalities is generic in quantum field theory, Comm. Math. Phys. 110(2) (1987) 247–259.
[21] R. Verch and R. F. Werner, Distillability and positivity of partial transposes in general quantum field systems, Rev. Math. Phys. 17(5) (2005) 545–576.
[22] R. F. Werner, Quantum states with Einstein–Podolsky–Rosen correlations admitting a hidden-variable model, Phys. Rev. A 40(8) (1989) 4277–4281.
[23] B. S. Cirel'son, Quantum generalizations of Bell's inequalities, Lett. Math. Phys. 4 (1980) 93–100.
[24] S. J. Summers and R. F. Werner, On Bell's inequalities and algebraic invariants, Lett. Math. Phys. 33 (1995) 321–334.
[25] R. F. Werner and M. M. Wolf, Bound entangled Gaussian states, Phys. Rev. Lett. 86(16) (2001) 3658–3661.
[26] M. Takesaki, Theory of Operator Algebras I (Springer-Verlag, New York, 1979).
[27] S. Doplicher and R. Longo, Standard and split inclusions of von Neumann algebras, Invent. Math. 75(3) (1984) 493–536.
[28] R. Longo, Solution of the factorial Stone–Weierstrass conjecture. An application of the theory of standard split $W^*$-inclusions, Invent. Math. 76(1) (1984) 145–155.
[29] R. Clifton and H. Halvorson, Bipartite mixed states of infinite dimensional systems are generically nonseparable, Phys. Rev. A 61 (2000) 012108.
[30] P. Horodecki, J. I. Cirac and M. Lewenstein, Bound entanglement for continuous variables is a rare phenomenon, quant-ph/0103076 (2001).
[31] H. Araki and E. J. Woods, A classification of factors, Publ. Res. Inst. Math. Sci. 4 (1968) 51–130.
[32] M. Takesaki, Theory of Operator Algebras III, Operator Algebras and Noncommutative Geometry, 8, Encyclopaedia of Mathematical Sciences, Vol. 127 (Springer-Verlag, Berlin, 2003).
[33] O. Bratteli and D. W. Robinson, Operator Algebras and Quantum Statistical Mechanics I (Springer, New York, 1979).
[34] R. V. Kadison and J. R. Ringrose, Fundamentals of the Theory of Operator Algebras. Vol. II: Advanced Theory, Graduate Studies in Mathematics, Vol. 16 (American Mathematical Society, Providence, RI, 1997); corrected reprint of the 1986 original.
[35] B. Nachtergaele and R. Sims, Lieb–Robinson bound and the exponential clustering theorem, math-ph/0506030 (2005).
[36] T. Matsui, The split property and the symmetry breaking of the quantum spin chain, Comm. Math. Phys. 218(2) (2001) 393–416.
[37] H. Araki, On quasifree states of CAR and Bogoliubov automorphisms, Publ. Res. Inst. Math. Sci. 6 (1970/71) 385–442.
[38] H. Araki, Bogoliubov automorphisms and Fock representations of canonical anticommutation relations, in Operator Algebras and Mathematical Physics, Contemporary Mathematics, Vol. 62 (Amer. Math. Soc., Providence, RI, 1987), pp. 23–41.
[39] H. Baumgärtel and M. Wollenberg, Causal Nets of Operator Algebras (Akademie Verlag, Berlin, 1992).
[40] H. Baumgärtel, M. Jurke and F. Lledó, Twisted duality of the CAR-algebra, J. Math. Phys. 43(8) (2002) 4158–4179.
[41] A. Wassermann, Operator algebras and conformal field theory. III. Fusion of positive energy representations of LSU(N) using bounded operators, Invent. Math. 133(3) (1998) 467–538.
[42] T. Matsui, Factoriality and quasi-equivalence of quasifree states for $\mathbb{Z}_2$ and U(1) invariant CAR algebras, Rev. Roumaine Math. Pures Appl. 32(8) (1987) 693–700.
[43] T. Matsui, On quasi-equivalence of quasifree states of gauge invariant CAR algebras, J. Operator Theory 17(2) (1987) 281–290.
[44] M. Takesaki, Theory of Operator Algebras II, Operator Algebras and Non-Commutative Geometry, 6, Encyclopaedia of Mathematical Sciences, Vol. 125 (Springer-Verlag, Berlin, 2003).
November 28, 2006 11:15 WSPC/148-RMP
J070-00285
Reviews in Mathematical Physics Vol. 18, No. 9 (2006) 971–1053 © World Scientific Publishing Company
LINEAR SUPERPOSITION IN NONLINEAR WAVE DYNAMICS
A. BABIN∗ and A. FIGOTIN† Department of Mathematics, University of California at Irvine, CA 92697, USA ∗[email protected] †[email protected] Received 24 April 2006 Revised 27 August 2006
We study nonlinear dispersive wave systems described by hyperbolic PDEs in $\mathbb{R}^d$ and difference equations on the lattice $\mathbb{Z}^d$. The systems involve two small parameters: one is the ratio of the slow and the fast time scales, and the other is the ratio of the small and the large space scales. We show that a wide class of such systems, including nonlinear Schrödinger and Maxwell equations, the Fermi–Pasta–Ulam model and many other not completely integrable systems, satisfies a superposition principle. The principle essentially states that if a nonlinear evolution of a wave starts initially as a sum of generic wavepackets (defined as almost monochromatic waves), then this wave with a high accuracy remains a sum of separate wavepacket waves undergoing independent nonlinear evolution. The time intervals for which the evolution is considered are long enough to observe fully-developed nonlinear phenomena for the involved wavepackets. In particular, our approach provides a simple justification for the numerically observed effect of almost non-interaction of solitons passing through each other without any recourse to complete integrability. Our analysis does not rely on any ansatz or common asymptotic expansions with respect to the two small parameters; rather, it uses an explicit and constructive representation for solutions as functions of the initial data in the form of functional analytic series.

Keywords: Nonlinear waves; wave packets; quasiparticles; nonlinear hyperbolic PDE; nonlinear Schrödinger equation; Fermi–Pasta–Ulam system; dispersive media; small parameters; implicit function theorem.

Mathematics Subject Classification 2000: 35L70, 35L75, 35L90, 35G55, 35Q60, 34C15, 37K60, 39A12
1. Introduction The principal object of our studies here is a general nonlinear evolutionary system which describes wave propagation in homogeneous media governed either by a hyperbolic PDE’s in Rd or by a difference equation on the lattice Zd , where 971
November 28, 2006 11:15 WSPC/148-RMP
972
J070-00285
A. Babin & A. Figotin
d = 1, 2, 3, . . . is the space dimension. We assume the evolution to be governed by the following equation with constant coefficients i ∂τ U = − L(−i∇)U + F(U),
U(r, τ )|τ =0 = h(r),
r ∈ Rd ,
(1.1)
where (i) $\mathbf{U} = \mathbf{U}(\mathbf{r}, \tau)$, $\mathbf{r} \in \mathbb{R}^d$, $\mathbf{U} \in \mathbb{C}^{2J}$ is a $2J$-dimensional vector; (ii) $\mathbf{L}(-i\nabla)$ is a linear self-adjoint differential (pseudodifferential) operator with constant coefficients with the symbol $\mathbf{L}(\mathbf{k})$, which is a Hermitian $2J \times 2J$ matrix; (iii) $\mathbf{F}$ is a general polynomial nonlinearity; (iv) $\varrho > 0$ is a small parameter. The form of the equation suggests that the processes described by it involve two time scales. Since the nonlinearity $\mathbf{F}(\mathbf{U})$ is of order one, nonlinear effects occur at times $\tau$ of order one, whereas the natural time scale of linear effects, governed by the operator $\mathbf{L}$ with the coefficient $1/\varrho$, is of order $\varrho$. Consequently, the small parameter $\varrho$ measures the ratio of the slow (nonlinear effects) time scale and the fast (linear effects) time scale. A typical example of an equation of the form (1.1) is the nonlinear Schrödinger equation (NLS) or a system of NLS. Another one is the Maxwell equation in a periodic medium when truncated to a finite number of bands, and more examples are discussed below. We assume further that the initial data $\mathbf{h}$ for the evolution equation (1.1) are the sum of a finite number of wavepackets $\mathbf{h}_l$, $l = 1, \ldots, N$, i.e.
\[ \mathbf{h} = \mathbf{h}_1 + \cdots + \mathbf{h}_N, \tag{1.2} \]
where the monochromaticity of every wavepacket $\mathbf{h}_l$ is characterized by another small parameter $\beta$. The well-known superposition principle is a fundamental property of every linear evolutionary system, stating that the solution $\mathbf{U}$ corresponding to the initial data $\mathbf{h}$ as in (1.2) equals
\[ \mathbf{U} = \mathbf{U}_1 + \cdots + \mathbf{U}_N \quad \text{for } \mathbf{h} = \mathbf{h}_1 + \cdots + \mathbf{h}_N, \tag{1.3} \]
where $\mathbf{U}_l$ is the solution to the same linear problem with the initial data $\mathbf{h}_l$. Evidently the standard superposition principle cannot hold exactly as a general principle in the presence of a nonlinearity and, at first glance, there is no expectation for it to hold even approximately. We have discovered though that the superposition principle does hold with a high accuracy for general dispersive nonlinear wave systems provided that the initial data are a sum of generic wavepackets, and this constitutes the subject of this paper. Namely, the superposition principle for nonlinear wave systems states that the solution $\mathbf{U}$ corresponding to the multi-wavepacket initial data $\mathbf{h}$ as in (1.2) equals
\[ \mathbf{U} = \mathbf{U}_1 + \cdots + \mathbf{U}_N + \mathbf{D} \quad \text{for } \mathbf{h} = \mathbf{h}_1 + \cdots + \mathbf{h}_N, \]
where $\mathbf{D}$ is small.
As to the particular form (1.1) we chose to be our primary one, we would like to point out that many important classes of problems involving small parameters can be readily reduced to the framework of (1.1) by a simple rescaling. This can be seen from the following examples. The first example is a system with a small factor before
Linear Superposition in Nonlinear Wave Dynamics
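the nonlinearity
\[ \partial_t \mathbf{v} = -i\mathbf{L}\mathbf{v} + \alpha \mathbf{f}(\mathbf{v}), \quad 0 < \alpha \ll 1, \quad \mathbf{v}|_{t=0} = \mathbf{h}, \tag{1.4} \]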
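where the initial data are bounded uniformly in $\alpha$. Such problems are reduced to (1.1) by the time rescaling $\tau = t\alpha$. Note that now $\varrho = \alpha$ and the finite time interval $0 \le \tau \le \tau_*$ corresponds to the long time interval $0 \le t \le \tau_*/\alpha$. The second example is a system with small initial data on a long time interval. The system here is given and has no small parameters but the initial data are small, namely
\[ \partial_t \mathbf{v} = -i\mathbf{L}\mathbf{v} + \mathbf{f}_0(\mathbf{v}), \quad \mathbf{v}|_{t=0} = \alpha_0 \mathbf{h}, \quad 0 < \alpha_0 \ll 1, \quad \text{where } \mathbf{f}_0(\mathbf{v}) = \mathbf{f}_0^{(m)}(\mathbf{v}) + \mathbf{f}_0^{(m+1)}(\mathbf{v}) + \cdots, \tag{1.5} \]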
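where $\alpha_0$ is a small parameter and $\mathbf{f}_0^{(m)}(\mathbf{v})$ is a homogeneous polynomial of degree $m \ge 2$. After the rescaling $\mathbf{v} = \alpha_0 \mathbf{V}$, we obtain the following equation with a small nonlinearity
\[ \partial_t \mathbf{V} = -i\mathbf{L}\mathbf{V} + \alpha_0^{m-1}\big[\mathbf{f}_0^{(m)}(\mathbf{V}) + \alpha_0 \mathbf{f}_0^{(m+1)}(\mathbf{V}) + \cdots\big], \quad \mathbf{V}|_{t=0} = \mathbf{h}, \tag{1.6} \]
which is of the form of (1.4) with $\alpha = \alpha_0^{m-1}$. Introducing the slow time variable $\tau = t\alpha_0^{m-1}$ we get from the above an equation of the form (1.1), namely
\[ \partial_\tau \mathbf{V} = -\frac{i}{\alpha_0^{m-1}}\,\mathbf{L}\mathbf{V} + \big[\mathbf{f}_0^{(m)}(\mathbf{V}) + \alpha_0 \mathbf{f}_0^{(m+1)}(\mathbf{V}) + \cdots\big], \quad \mathbf{V}|_{t=0} = \mathbf{h}, \tag{1.7} \]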
where the nonlinearity does not vanish as $\alpha_0 \to 0$. In this case $\varrho = \alpha_0^{m-1}$ and the finite time interval $0 \le \tau \le \tau_*$ corresponds to the long time interval $0 \le t \le \tau_*/\alpha_0^{m-1}$ with small $\alpha_0 \ll 1$. Very often in theoretical studies of equations of the form (1.1) or ones reducible to it, a functional dependence between $\varrho$ and $\beta$ is imposed, resulting in a single small parameter. The most common scaling is $\varrho = \beta^2$. The nonlinear evolution of wavepackets for a variety of equations which can be reduced to the form (1.1) was studied in numerous physical and mathematical papers, mostly by asymptotic expansions of solutions with respect to a single small parameter similar to $\beta$, see [11, 14, 18, 20, 23, 28, 29, 34, 38–40] and references therein. Often the asymptotic expansions are based on a specific ansatz prescribing a certain form to the solution. In our studies here we do not use asymptotic expansions with respect to a small parameter and do not prescribe a specific form to the solution, but we impose conditions on the initial data requiring it to be a wavepacket or a linear combination of wavepackets. Since we want to establish a general property of a wide class of systems, we apply a general enough dynamical approach. There are a number of general approaches developed for the studies of high-dimensional and infinite-dimensional nonlinear evolutionary systems of hyperbolic type, see [10, 13, 19, 22, 27, 31, 35, 39, 41, 43, 45] and references therein. We develop here
an approach which allows us to exploit specific properties of a certain class of initial data, namely wavepackets and their linear combinations, which comply with the symmetries of the equations. Such a class of initial data is obviously smaller than the class of all possible initial data. One of the key mathematical tools developed here for the nonlinear studies is a refined implicit function theorem (Theorem 4.25). This theorem provides a constructive and rather explicit representation of the solution to an abstract nonlinear equation in a Banach space as a certain functional series. The representation is explicit enough to prove the superposition principle and is general enough to carry out the studies of the problem without imposing restrictions on the dimension of the problem, structural restrictions on nonlinearities or a functional dependence between the two small parameters $\varrho$, $\beta$. As we have already stated, the superposition principle holds with high accuracy for linear combinations of wavepackets. A wavepacket $\mathbf{h}(\beta, \mathbf{r})$ can be most easily described in terms of its Fourier transform $\tilde{\mathbf{h}}(\beta, \mathbf{k})$. Simply speaking, a wavepacket $\tilde{\mathbf{h}}(\beta, \mathbf{k})$ is a function which is localized in a $\beta$-neighborhood of a given wavevector $\mathbf{k}_*$ (the wavepacket center) and as a vector is an eigenfunction of the matrix $\mathbf{L}(\mathbf{k})$; details of the definition of the wavepacket can be found in the following Sec. 2. The simplest example of a wavepacket is a function of the form
\[ \tilde{\mathbf{h}}(\beta, \mathbf{k}) = \beta^{-d}\,\hat{h}\!\left(\frac{\mathbf{k} - \mathbf{k}_*}{\beta}\right)\mathbf{g}_n(\mathbf{k}_*), \quad \mathbf{k} \in \mathbb{R}^d, \tag{1.8} \]
where $\mathbf{g}_n(\mathbf{k}_*)$ is an eigenvector of the matrix $\mathbf{L}(\mathbf{k}_*)$ and $\hat{h}(\mathbf{k})$ is a Schwartz function (i.e. it is infinitely smooth and rapidly decaying). Note that the inverse Fourier transform $\mathbf{h}(\beta, \mathbf{r})$ of $\tilde{\mathbf{h}}(\beta, \mathbf{k})$ has the form
\[ \mathbf{h}(\beta, \mathbf{r}) = h(\beta\mathbf{r})\,e^{i\mathbf{k}_*\cdot\mathbf{r}}\,\mathbf{g}_n(\mathbf{k}_*), \quad \mathbf{r} \in \mathbb{R}^d, \tag{1.9} \]
where $h(\mathbf{r})$ is a Schwartz function, and obviously has a large spatial extension of order $\beta^{-1}$. We study the nonlinear evolution equation (1.1) on a finite time interval
\[ 0 \le \tau \le \tau_*, \quad \text{where } \tau_* > 0 \text{ is a fixed number} \tag{1.10} \]
which may depend on the $L^\infty$ norm of the initial data $\mathbf{h}$ but, importantly, $\tau_*$ does not depend on $\varrho$. We consider classes of initial data such that wave evolution governed by (1.1) is significantly nonlinear on the time interval $[0, \tau_*]$ and the effect of the nonlinearity $\mathbf{F}(\mathbf{U})$ does not vanish as $\varrho \to 0$. We assume that $\beta$, $\varrho$ satisfy
\[ 0 < \beta \le 1, \quad 0 < \varrho \le 1, \quad \frac{\beta^2}{\varrho} \le C_1 \quad \text{with some } C_1 > 0. \tag{1.11} \]
The above condition on the dispersion parameter $\beta^2/\varrho$ ensures that the dispersive effects are not dominant and do not suppress nonlinear effects, see [7] for a discussion. To formulate the superposition principle more precisely, we first introduce the solution operator $\mathcal{S}(\mathbf{h})(\tau) : \mathbf{h} \to \mathbf{U}(\tau)$, which maps the initial data $\mathbf{h}$ of the nonlinear evolution equation (1.1) to the solution $\mathbf{U}(\tau)$ of this equation. Suppose that the
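initial state is a multi-wavepacket, namely $\mathbf{h} = \sum \mathbf{h}_l$, with $\mathbf{h}_l$, $l = 1, \ldots, N$, being “generic” wavepackets. Then for all times $0 \le \tau \le \tau_*$ the following superposition principle holds:
\[ \mathcal{S}\Big(\sum_{l=1}^{N} \mathbf{h}_l\Big)(\tau) = \sum_{l=1}^{N} \mathcal{S}(\mathbf{h}_l)(\tau) + \mathbf{D}(\tau), \tag{1.12} \]
\[ \|\mathbf{D}(\tau)\|_E = \sup_{0 \le \tau \le \tau_*} \|\mathbf{D}(\tau)\|_{L^\infty} \le C_\delta\,\frac{\varrho}{\beta^{1+\delta}} \quad \text{for any small } \delta > 0. \tag{1.13} \]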
Obviously, the right-hand side of (1.13) may be small only if $\varrho \le C_1 \beta$. There are examples (see [7]) in which $\mathbf{D}(\tau)$ is not small for $\varrho = C_1 \beta$. In what follows we refer to a linear combination of wavepackets as a multi-wavepacket, and to the wavepackets which constitute the multi-wavepacket as component wavepackets. The superposition principle implies, in particular, that in the process of nonlinear evolution every single wavepacket propagates almost independently of other wavepackets, even though they may “collide” in physical space for a certain period of time, and the exact solution equals the sum of particular single-wavepacket solutions with a high precision. In particular, the dynamics of a solution with multi-wavepacket initial data is reduced to the dynamics of separate solutions with single-wavepacket data. Note that the nonlinear evolution of a single-wavepacket solution for many problems is studied in detail, namely it is well approximated by its own nonlinear Schrödinger equation (NLS), see [7, 18, 23, 29, 30, 39–41] and references therein. The superposition principle (1.12), (1.13) can also be looked at as a form of separation of variables. Such a form of separation of variables is different from usual complete integrability, and its important factor is the continuity of the spectrum of the linear component of the system. The approximate superposition principle imposes certain restrictions on dynamics which differ from the usual constraints imposed by the conserved quantities as in completely integrable systems, as well as from topological constraints related to invariant tori as in KAM theory. Now we present an elementary physical argument justifying the superposition principle. If nonlinearity is absent, the superposition principle holds exactly and any deviation from it is due to the nonlinear interactions between wavepackets, so we need to estimate their impact.
Suppose that initially at time $\tau = 0$ the spatial extension $s$ of every component wavepacket is characterized by the parameter $\beta^{-1}$ as in (1.9). Assume also (and it is quite an assumption) that the component wavepackets during the nonlinear evolution somehow maintain their wavepacket identity, group velocities and spatial extension. Then, consequently, the spatial extension of every component wavepacket is proportional to $\beta^{-1}$ and its group velocity $v_j$ is proportional to $\varrho^{-1}$. The difference $\Delta v$ between any two different component group velocities is also proportional to $\varrho^{-1}$. The time when two different component wavepackets overlap in space is proportional to $s/|\Delta v|$ and, hence, to $\varrho/\beta$.
Since the nonlinear term is of order one, the magnitude of the impact of the nonlinearity during this time interval should be proportional to $\varrho/\beta$, which results in the same order of magnitude of $\mathbf{D}$. This conclusion is in agreement with the estimate of the magnitude of $\mathbf{D}$ in (1.13) (if we set $\delta = 0$). The rigorous proof of the superposition principle we present in this paper is not based on the above argument, since the argument implicitly relies on a superposition principle in the form of an assumption that component wavepackets can somehow maintain their identity, group velocities and spatial extension during nonlinear evolution, which is by no means obvious. In fact, the question whether a wavepacket or a multi-wavepacket structure can be preserved during nonlinear evolution is an important and interesting question in its own right. The answer to it under natural conditions is affirmative, as we have shown in [7]. Namely, if the solution was initially a multi-wavepacket at $\tau = 0$, it remains a multi-wavepacket at $\tau > 0$, and every component wavepacket maintains its identity. Therefore a wavepacket can be interpreted as a quasi-particle which maintains its identity and can interact with other quasi-particles. This property holds also in the situation when there are stronger nonlinear interactions between wavepacket components which do not allow the superposition principle to hold, see [7] for details. The proof we present here is based on general algebraic-functional considerations. The strategy of our proof is as follows. First, we prove that the operator $\mathcal{S}(\mathbf{h})$ in (1.12) is analytic, i.e. it can be written in the form of a convergent series
\[ \mathcal{S}(\mathbf{h}) = \sum_{j=1}^{\infty} \mathcal{S}^{(j)}(\mathbf{h}^j), \quad \mathbf{h}^j = \mathbf{h}, \ldots, \mathbf{h} \quad (j \text{ copies of } \mathbf{h}), \]
where $\mathcal{S}^{(j)}(\mathbf{h}^j)$ is a $j$-linear operator applied to $\mathbf{h}$. Now we substitute for $\mathbf{h}$ in $\mathcal{S}^{(j)}$ the sum of $\mathbf{h}_l$ as in (1.2). Considering for simplicity the case $N = 2$ and using the polylinearity of $\mathcal{S}^{(j)}$ we get
\[ \mathcal{S}^{(2)}((\mathbf{h}_1 + \mathbf{h}_2)^2) = \mathcal{S}^{(2)}((\mathbf{h}_1)^2) + 2\mathcal{S}^{(2)}(\mathbf{h}_1\mathbf{h}_2) + \mathcal{S}^{(2)}((\mathbf{h}_2)^2), \ \ldots, \]
implying after the summation
\[ \mathcal{S}(\mathbf{h}) = \mathcal{S}^{(2)}((\mathbf{h}_1)^2) + \mathcal{S}^{(3)}((\mathbf{h}_1)^3) + \cdots + \mathcal{S}^{(2)}((\mathbf{h}_2)^2) + \mathcal{S}^{(3)}((\mathbf{h}_2)^3) + \cdots + \mathcal{S}_{\mathrm{cr}} = \mathcal{S}(\mathbf{h}_1) + \mathcal{S}(\mathbf{h}_2) + \mathcal{S}_{\mathrm{cr}}, \]
where $\mathcal{S}_{\mathrm{cr}}$ is a sum of all cross terms such as $\mathcal{S}^{(2)}(\mathbf{h}_1\mathbf{h}_2)$ etc. The main part of the proof is to show that every term in $\mathcal{S}_{\mathrm{cr}}$ is small. An important step for that is based on the refined implicit function theorem (Theorem 4.25), which allows us to represent the operators $\mathcal{S}^{(j)}$ in the form of a sum of certain composition monomials, which, in turn, have a relatively simple oscillatory integral representation. Importantly, the relevant oscillatory integrals involve the known initial data $\mathbf{h}_l$ rather than the unknown solution $\mathbf{U}$. The analysis of the oscillatory integrals shows that there are two mechanisms responsible for the smallness of the integrals. The first one is time averaging, and the second one is based on large group velocities (in the slow time scale) of
wavepackets. Remarkably, if wavepackets satisfy proper genericity conditions, every cross term is small due to one of the two above-mentioned mechanisms. Importantly, both mechanisms are instrumental for the smallness of terms in $\mathcal{S}_{\mathrm{cr}}$, and the time averaging alone is not sufficient. We obtain estimates on terms in $\mathcal{S}_{\mathrm{cr}}$ which ultimately yield the estimate (1.13). Since the smallness of interactions between waves under nonlinear evolution stems from high-frequency oscillations in time and space of functions involved in the interaction integrals, we can interpret it as a result of destructive wave interference. The above sketch shows that the mathematical tools we use in our studies are (i) the theory of analytic functions and corresponding series of an infinite-dimensional (Banach) variable, and (ii) the theory of oscillatory integrals. We would like to point out that the estimate (1.13) for the remainder in the superposition principle is quite accurate. For example, when the estimate is applied to the sine-Gordon equation with bimodal initial data, it yields essentially optimal estimates for the magnitude of the interaction of counterpropagating waves. These estimates are more accurate than ones obtained by the well-known ansatz method as in [38], and the comparative analysis is provided below in Example 1 of Sec. 2.2. To summarize the above analysis, we list important ingredients of our approach.
• The spectrum of the underlying linear problem is continuous.
• The nonlinear wave evolution is analyzed based on the modal decomposition with respect to the linear component of the system, because there is no exchange of energy between modes by linear mechanisms. The wavepacket definition is based on the modal expansion determining, in particular, its spatial extension and group velocity.
• The problem involves two small parameters $\beta$ and $\varrho$, respectively in the initial data and in the coefficients of the equations. These parameters scale respectively (i) the range of wavevectors involved in the modal composition of a wavepacket, with $\beta^{-1}$ scaling its spatial extension, and (ii) the ratio of the slow and the fast time scales. We make no assumption on the functional dependence between $\beta$ and $\varrho$, which are essentially independent and are subject only to inequalities.
• The nonlinear evolution is studied for a finite time $\tau_*$ which may depend on, say, the amplitude of the initial excitation, and, importantly, $\tau_*$ is long enough to observe appreciable nonlinear phenomena which are not vanishingly small. The superposition principle can be extended to longer time intervals up to blow-up time or even infinity if relevant estimates of solutions in appropriate norms, uniform in $\beta$ and $\varrho$, are available.
• Two fast wave processes (in the chosen slow time scale) attributed to the linear operator $\mathbf{L}$ and having a typical time scale of order $\varrho$ can be identified as responsible for the essential independence of wavepackets: (i) fast time oscillations which lead to time averaging; (ii) fast wavepacket propagation with large group velocities, which produces effective weakening of interactions which are not subjected to time averaging.
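The scaling behind the heuristic overlap argument and the estimate (1.13) can be tabulated with a short script. This is only a sketch: the function names, the unit constants `dv` and `C`, and the sample values of $\beta$ and $\varrho$ are assumptions made for illustration and do not come from the paper.

```python
def overlap_time(beta, rho, dv=1.0):
    """Heuristic slow-time duration of the overlap of two wavepackets.

    beta : wavepacket width in k-space (spatial extension ~ 1/beta)
    rho  : ratio of fast to slow time scales (group velocities ~ 1/rho)
    dv   : hypothetical O(1) constant in the velocity difference
    """
    extension = 1.0 / beta          # s ~ beta^{-1}
    rel_velocity = dv / rho         # |Delta v| ~ rho^{-1}
    return extension / rel_velocity  # ~ rho / beta


def remainder_bound(beta, rho, delta=0.0, C=1.0):
    """Right-hand side of (1.13) with a hypothetical constant C."""
    return C * rho / beta ** (1.0 + delta)


if __name__ == "__main__":
    for beta in (0.1, 0.01):
        rho = beta ** 2  # the common scaling rho = beta^2
        print(beta, rho, overlap_time(beta, rho), remainder_bound(beta, rho, delta=0.1))
```

Under the scaling $\varrho = \beta^2$ both quantities behave like a positive power of $\beta$, which is why the remainder can be small while the condition (1.11) still holds.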
The rest of the paper is organized as follows. In the following Sec. 2, we formulate exact conditions and theorems for lattice equations and partial differential equations and give examples. In Sec. 3, we recast the original evolution equation in a convenient reduced form allowing us, in particular, to construct a representation of the solution in the form of a convergent functional operator series explicitly involving the nonlinear term of the equation. In Sec. 4, we provide the detailed analysis of the function-analytic series used to get a constructive representation of the solution. Section 5 is devoted to the analysis of certain oscillatory integrals which are terms of the series representing the solution. Note that when making estimates we use the same letter $C$ for different constants in different statements. Finally, the proofs of Theorems 2.15 and 2.19 are provided in Sec. 6. More examples and generalizations are given in Sec. 7. For the reader’s convenience, we provide a list of notations at the end of the paper.
2. Statement of Results
In this section, we consider two classes of problems: lattice equations and partial differential equations. After Fourier transform they can be written in the modal form, which is essentially the same in both cases. We formulate the exact conditions on the modal equations and present the main theorems on the superposition principle. We also give examples of equations to which the general theorems apply, in particular the Fermi–Pasta–Ulam system and the nonlinear Schrödinger equation.
2.1. Main definitions, statements and examples for the lattice equation
The first class of evolutionary systems we consider involves systems of equations describing coupled nonlinear oscillators on a lattice $\mathbb{Z}^d$, namely the following lattice system of ordinary differential equations (ODEs) with respect to time:
\[ \partial_\tau \mathbf{U}(\mathbf{m}, \tau) = -\frac{i}{\varrho}\,\mathbf{L}\mathbf{U}(\mathbf{m}, \tau) + F(\mathbf{U})(\mathbf{m}, \tau), \quad \mathbf{U}(\mathbf{m}, 0) = \mathbf{h}(\mathbf{m}), \quad \mathbf{m} \in \mathbb{Z}^d, \tag{2.1} \]
where $\mathbf{L}$ is a linear operator, $F$ is a nonlinear operator and $\varrho > 0$ is a small parameter (see [6]). To analyze the evolution equation (2.1) it is instrumental to recast it in the modal form (the wavevector domain), in other words, to apply to it the lattice Fourier transform as defined by the formula
\[ \tilde{\mathbf{U}}(\mathbf{k}) = \sum_{\mathbf{m} \in \mathbb{Z}^d} \mathbf{U}(\mathbf{m})\,e^{-i\mathbf{m}\cdot\mathbf{k}}, \quad \text{where } \mathbf{k} \in [-\pi, \pi]^d, \tag{2.2} \]
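and $\mathbf{k}$ is called a wave vector. We assume that the Fourier transformation of the original lattice evolutionary equation (2.1) is of the form
\[ \partial_\tau \tilde{\mathbf{U}}(\mathbf{k}, \tau) = -\frac{i}{\varrho}\,\mathbf{L}(\mathbf{k})\tilde{\mathbf{U}}(\mathbf{k}, \tau) + \tilde{F}(\tilde{\mathbf{U}})(\mathbf{k}, \tau); \quad \tilde{\mathbf{U}}(\mathbf{k}, 0) = \tilde{\mathbf{h}}(\mathbf{k}) \text{ for } \tau = 0. \tag{2.3} \]
Here, $\tilde{\mathbf{U}}(\mathbf{k}, \tau)$ is a $2J$-component vector, $\mathbf{L}(\mathbf{k})$ is a $\mathbf{k}$-dependent $2J \times 2J$ matrix that corresponds to the linear operator $\mathbf{L}$ and $\tilde{F}(\tilde{\mathbf{U}})$ is a nonlinear operator, which we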
describe later. The matrix $\mathbf{L}(\mathbf{k})$ and the coefficients of the nonlinear operator $\tilde{F}(\tilde{\mathbf{U}})$ in (2.3) are $2\pi$-periodic functions of $\mathbf{k}$, and for that reason we assume that $\mathbf{k}$ belongs to the torus $\mathbb{R}^d/(2\pi\mathbb{Z})^d$, which we denote by $[-\pi, \pi]^d$. The $\mathbf{k}$-dependent matrix $\mathbf{L}(\mathbf{k})$ determines the linear operator $\mathbf{L}$ and plays an important role in the analysis. We refer to $\mathbf{L}(\mathbf{k})$ as the linear symbol. Since (2.3) describes the evolution of the Fourier modes of the solution, we call (2.3) the modal evolution equation. We study the modal evolution equation (2.3) on a finite time interval
\[ 0 \le \tau \le \tau_*, \tag{2.4} \]
where $\tau_* > 0$ is a fixed number which, as we will see, may depend on the magnitude of the initial data. The time $\tau_*$ does not depend on small parameters; it is of order one and is determined by norms of operators and initial data. It is almost optimal for general $F$, since there are examples when $\tau_*$ is of the same order as the blow-up time of solutions. To make formulas and estimates simpler, we assume without loss of generality that
\[ \tau_* \le 1. \tag{2.5} \]
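The lattice Fourier transform (2.2) that produces the modal form can be sketched numerically. The snippet below is a minimal illustration for $d = 1$ and a scalar, finitely supported sequence; the function name `lattice_fourier` and the sample data are assumptions introduced here, not part of the paper.

```python
import numpy as np


def lattice_fourier(values, sites, k):
    """Evaluate U~(k) = sum_m U(m) exp(-i m k) for a finitely supported U on Z.

    values : samples U(m) at the lattice sites listed in `sites`
    sites  : integer lattice points m where U is non-zero
    k      : wave number (the result is 2*pi-periodic in k)
    """
    values = np.asarray(values, dtype=complex)
    sites = np.asarray(sites)
    return np.sum(values * np.exp(-1j * sites * k))


if __name__ == "__main__":
    sites = np.array([-1, 0, 1, 2])
    values = np.array([0.5, 1.0, -0.25, 2.0])
    k = 0.7
    print(lattice_fourier(values, sites, k))
    # Shifting the sequence by one site multiplies the transform by exp(-i k);
    # this is how differences such as y_n - y_{n-1} become the factor 1 - exp(-i k).
    print(lattice_fourier(values, sites + 1, k) / lattice_fourier(values, sites, k))
```

The shift property shown in the last line is the only ingredient needed to pass from a nearest-neighbour lattice system to a matrix symbol in the modal equation (2.3).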
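For a number of reasons the modal form (2.3) of the evolution equation is much more suitable for nonlinear analysis than the original evolution equation (2.1). This is why from now on we consider the modal form of the evolution equation (2.3) for the modal components $\tilde{\mathbf{U}}(\mathbf{k}, \tau)$ as our primary evolution equation. First, as an illustration, let us look at the simplest nontrivial example of (2.3) with $J = 1$, corresponding to two-component vector fields on the lattice $\mathbb{Z}^d$. A two-component vector function $\mathbf{U}(\mathbf{m})$ of a discrete argument $\mathbf{m} \in \mathbb{Z}^d$ has the form
\[ \mathbf{U}(\mathbf{m}) = \begin{pmatrix} U_+(\mathbf{m}) \\ U_-(\mathbf{m}) \end{pmatrix}, \quad \mathbf{m} \in \mathbb{Z}^d. \tag{2.6} \]
In this example $\mathbf{L}(\mathbf{k})$ in (2.3) is a $2 \times 2$ matrix, and we assume that for almost all $\mathbf{k}$ it has two different real eigenvalues $\omega_-(\mathbf{k})$ and $\omega_+(\mathbf{k})$ (the dependence of $\omega_\pm(\mathbf{k})$ on $\mathbf{k}$ is called the dispersion relation) satisfying the relation $\omega_-(\mathbf{k}) = -\omega_+(\mathbf{k})$, namely,
\[ \mathbf{L}(\mathbf{k})\mathbf{g}_\zeta(\mathbf{k}) = \omega_\zeta(\mathbf{k})\mathbf{g}_\zeta(\mathbf{k}), \quad \omega_\zeta(\mathbf{k}) = \zeta\omega(\mathbf{k}), \quad \zeta = \pm, \tag{2.7} \]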
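where, evidently, $\mathbf{g}_\zeta(\mathbf{k})$ are the eigenvectors of $\mathbf{L}(\mathbf{k})$. These eigenvalues $\omega_\zeta(\mathbf{k})$, $\zeta = \pm$, are $2\pi$-periodic real-valued functions:
\[ \omega_\zeta(k_1 + 2\pi, k_2, \ldots, k_d) = \cdots = \omega_\zeta(k_1, k_2, \ldots, k_d + 2\pi) = \omega_\zeta(k_1, k_2, \ldots, k_d). \tag{2.8} \]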
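The simplest nonlinearity in (2.3) is a quadratic nonlinear operator $\tilde{F}(\tilde{\mathbf{U}}) = \tilde{F}^{(2)}(\tilde{\mathbf{U}}^2)$, which is given by the following convolution integral:
\[ \tilde{F}^{(2)}(\tilde{\mathbf{U}}_1\tilde{\mathbf{U}}_2)(\mathbf{k}) = \frac{1}{(2\pi)^d}\int_{\mathbf{k}' \in [-\pi,\pi]^d;\ \mathbf{k}' + \mathbf{k}'' = \mathbf{k}} \chi^{(2)}(\mathbf{k}, \vec{k})\big(\tilde{\mathbf{U}}_1(\mathbf{k}')\,\tilde{\mathbf{U}}_2(\mathbf{k}'')\big)\,d\mathbf{k}', \tag{2.9} \]
where $\vec{k} = (\mathbf{k}', \mathbf{k}'')$ and $\chi^{(2)}(\mathbf{k}, \vec{k})$ is a quadratic tensor (susceptibility) which acts on vectors $\tilde{\mathbf{U}}_1$, $\tilde{\mathbf{U}}_2$. We refer to the case $J = 1$ as the one-band case, since the corresponding linear operator is described by a single function $\omega(\mathbf{k})$.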
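A particular example of (2.3) is obtained as a Fourier transform of the following Fermi–Pasta–Ulam equation (FPU) (see [12, 37, 44]) describing a nonlinear system of coupled oscillators:
\[ \partial_\tau x_n = \frac{1}{\varrho}(y_n - y_{n-1}), \quad \partial_\tau y_n = \frac{1}{\varrho}(x_{n+1} - x_n) + \alpha_2(x_{n+1} - x_n)^2 + \alpha_3(x_{n+1} - x_n)^3, \quad n \in \mathbb{Z}. \tag{2.10} \]
Note that an equivalent form of (2.10) (with $\alpha_2 = 0$) is the second-order equation
\[ \partial_\tau^2 x_n = \frac{1}{\varrho^2}(x_{n-1} - 2x_n + x_{n+1}) + \frac{\alpha_3}{\varrho}\big((x_{n+1} - x_n)^3 - (x_n - x_{n-1})^3\big). \tag{2.11} \]
In this example $d = 1$, $\mathbf{k} = k$ and elementary computations show that the Fourier transform of the FPU equation (2.10) has the form of the modal evolution equation (2.3), (2.9), where
\[ \tilde{\mathbf{U}} = \begin{pmatrix} \tilde{x} \\ \tilde{y} \end{pmatrix}, \quad i\mathbf{L}(k) = \begin{pmatrix} 0 & -(1 - e^{-ik})^* \\ (1 - e^{-ik}) & 0 \end{pmatrix}, \quad \omega_\zeta(k) = 2\zeta\sin\frac{k}{2}, \]
\[ \chi^{(2)}(k, k', k'')\tilde{\mathbf{U}}_1(k')\tilde{\mathbf{U}}_2(k'') = \alpha_2(1 - e^{-ik'})(1 - e^{-ik''})\begin{pmatrix} 0 \\ \tilde{x}_1(k')\tilde{x}_2(k'') \end{pmatrix}, \tag{2.12} \]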
and a similar formula for $\chi^{(3)}$ (see (7.5)). Now let us consider the general multi-component vector case with $J > 1$, which we refer to as the $J$-band case, for which the system (2.3) has $2J$ components, and instead of (2.7) we assume that $\mathbf{L}(\mathbf{k})$ has eigenvalues and eigenvectors as follows:
\[ \mathbf{L}(\mathbf{k})\mathbf{g}_{n,\zeta}(\mathbf{k}) = \omega_{n,\zeta}(\mathbf{k})\mathbf{g}_{n,\zeta}(\mathbf{k}), \quad \omega_{n,\zeta}(\mathbf{k}) = \zeta\omega_n(\mathbf{k}), \quad \zeta = \pm, \quad n = 1, \ldots, J, \tag{2.13} \]
where $\omega_n(\mathbf{k})$ are real-valued functions, continuous for all $\mathbf{k}$, and the eigenvectors $\mathbf{g}_{n,\zeta}(\mathbf{k}) \in \mathbb{C}^{2J}$ have unit length in the standard Euclidean norm. We also suppose that the eigenvalues are numbered so that
\[ \omega_{n+1}(\mathbf{k}) \ge \omega_n(\mathbf{k}) \ge 0, \quad n = 1, \ldots, J - 1, \tag{2.14} \]
and we call $n$ the band index. Note that the presence of $\zeta = \pm$ reflects a symmetry of the system allowing it, in particular, to have real-valued solutions. Such a symmetry of the dispersion relation $\omega_n(\mathbf{k})$ occurs in photonic crystals and many other physical problems. Note that (2.13) implies that the following symmetry relation holds:
\[ \omega_{n,-\zeta}(\mathbf{k}) = -\omega_{n,\zeta}(\mathbf{k}), \quad n = 1, \ldots, J. \tag{2.15} \]
We also always assume that the following inversion symmetry holds:
\[ \omega_{n,\zeta}(-\mathbf{k}) = \omega_{n,\zeta}(\mathbf{k}). \tag{2.16} \]
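For the FPU example the relations (2.15) and (2.16) can be checked numerically from the symbol in (2.12). The sketch below rebuilds $\mathbf{L}(k)$ from the matrix $i\mathbf{L}(k)$ and computes its eigenvalues; the function names are illustrative assumptions. The computed pair is $\pm 2|\sin(k/2)|$, which as a set coincides with the two values $\omega_\zeta(k)$ listed in (2.12).

```python
import numpy as np


def fpu_symbol(k):
    """Return the 2x2 matrix L(k) of the FPU example, recovered from iL(k) in (2.12)."""
    a = 1.0 - np.exp(-1j * k)
    i_times_L = np.array([[0.0, -np.conj(a)], [a, 0.0]])
    return i_times_L / 1j  # L(k) = (iL(k)) / i, a Hermitian matrix


def fpu_frequencies(k):
    """Eigenvalues of L(k) in ascending order: (omega_minus, omega_plus)."""
    return np.linalg.eigvalsh(fpu_symbol(k))


if __name__ == "__main__":
    for k in (0.3, 1.0, 2.5):
        omega_minus, omega_plus = fpu_frequencies(k)
        print(k, omega_minus, omega_plus, 2 * abs(np.sin(k / 2)))
```

Because the matrix is Hermitian for every $k$, `eigvalsh` applies and the eigenvalues are real; the only point where the two branches meet is $k = 0$.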
Remark 2.1. Assuming (2.15) and (2.16) we suppose that the dispersion relations ωζ (k) have the same symmetry properties as the dispersion relations of Maxwell
equations in periodic media, see [1–3, 5]. We would like to stress that these symmetry conditions are not imposed for technical reasons but because they are consequences of fundamental symmetries of physical media. Such symmetries arise in many problems including, for instance, the Fermi–Pasta–Ulam equation, or when $\mathbf{L}(\mathbf{k})$ originates from a Hamiltonian $H(p, q) = \frac{1}{2}H_1(p^2) + \frac{1}{2}H_2(q^2)$. In the opposite case, if it is assumed that (2.15) and (2.16) never hold, the results of this paper hold and the proofs, in fact, are simpler. The case with the symmetry is more difficult and delicate because of the possibility of resonant nonlinear interactions. There are values of $\mathbf{k}$ for which inequalities (2.14) turn into equalities; these points require special treatment.
Definition 2.2 (Band-Crossing Points). We call $\mathbf{k}_0$ a band-crossing point if $\omega_{n+1}(\mathbf{k}_0) = \omega_n(\mathbf{k}_0)$ for some $n$ or $\omega_1(\mathbf{k}_0) = 0$, and denote the set of band-crossing points by $\sigma$.
Everywhere in this paper we assume that the following condition is satisfied.
Condition 2.3. The set $\sigma$ of band-crossing points is a closed nowhere dense set in $\mathbb{R}^d$ with zero Lebesgue measure, the entries of the matrix $\mathbf{L}(\mathbf{k})$ are infinitely smooth functions of $\mathbf{k} \notin \sigma$, and $\omega_n(\mathbf{k})$ are continuous functions of $\mathbf{k}$ for all $\mathbf{k}$ and are infinitely smooth when $\mathbf{k} \notin \sigma$.
Observe that for $\mathbf{k} \notin \sigma$ all the eigenvalues of the matrix $\mathbf{L}(\mathbf{k})$ are different and the corresponding eigenvectors $\mathbf{g}_{n,\zeta}(\mathbf{k})$ of $\mathbf{L}(\mathbf{k})$ can be locally defined as smooth functions of $\mathbf{k} \notin \sigma$ as long as $\mathbf{L}(\mathbf{k})$ is smooth.
Remark 2.4. The band-crossing points are discussed in more detail in [1, 2]. Here we only note that generically the singular set $\sigma$ is a manifold of dimension $d - 2$, see [1, 2]. A simple example of a band-crossing point is $k = 0$ in (2.12).
Since we do not assume the matrix $\mathbf{L}(\mathbf{k})$ to be Hermitian, we impose the following condition on its eigenfunctions which guarantees its uniform diagonalization.
Condition 2.5. We assume that the $2J \times 2J$ matrix formed by the eigenvectors $\mathbf{g}_{n,\zeta}(\mathbf{k})$ of $\mathbf{L}(\mathbf{k})$, namely $\Xi(\mathbf{k}) = [\mathbf{g}_{1,+}(\mathbf{k}), \mathbf{g}_{1,-}(\mathbf{k}), \ldots, \mathbf{g}_{J,+}(\mathbf{k}), \mathbf{g}_{J,-}(\mathbf{k})]$, is uniformly bounded together with its inverse:
\[ \sup_{\mathbf{k} \notin \sigma} \|\Xi(\mathbf{k})\|, \ \sup_{\mathbf{k} \notin \sigma} \|\Xi^{-1}(\mathbf{k})\| \le C_\Xi \quad \text{for some constant } C_\Xi. \tag{2.17} \]
Here and everywhere we use the standard Euclidean norm in $\mathbb{C}^{2J}$. Note that if the matrix $\mathbf{L}(\mathbf{k})$ is Hermitian for every $\mathbf{k}$, the eigenvectors form an orthonormal system. Then the matrix $\Xi$, which diagonalizes $\mathbf{L}$, is unitary and (2.17) is satisfied with $C_\Xi = 1$. Everywhere throughout the paper we assume that Condition 2.5 is satisfied.
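We introduce for vectors $\tilde{\mathbf{u}} \in \mathbb{C}^{2J}$ their expansion with respect to the basis $\mathbf{g}_{n,\zeta}$:
\[ \tilde{\mathbf{u}}(\mathbf{k}) = \sum_{n=1}^{J}\sum_{\zeta=\pm} \tilde{u}_{n,\zeta}(\mathbf{k})\,\mathbf{g}_{n,\zeta}(\mathbf{k}) = \sum_{n=1}^{J}\sum_{\zeta=\pm} \tilde{\mathbf{u}}_{n,\zeta}(\mathbf{k}), \tag{2.18} \]
and we refer to it as the modal decomposition of $\tilde{\mathbf{u}}(\mathbf{k})$, and call the coefficients $\tilde{u}_{n,\zeta}(\mathbf{k})$ the modal coefficients of $\tilde{\mathbf{u}}(\mathbf{k})$. In this expansion we assign to every $n, \zeta$ a linear projection $\Pi_{n,\zeta}(\mathbf{k})$ in $\mathbb{C}^{2J}$ corresponding to $\mathbf{g}_{n,\zeta}(\mathbf{k})$, namely
\[ \Pi_{n,\zeta}(\mathbf{k})\tilde{\mathbf{u}}(\mathbf{k}) = \tilde{u}_{n,\zeta}(\mathbf{k})\,\mathbf{g}_{n,\zeta}(\mathbf{k}) = \tilde{\mathbf{u}}_{n,\zeta}(\mathbf{k}), \quad n = 1, \ldots, J, \quad \zeta = \pm. \tag{2.19} \]
Note that these projections may be not orthogonal if $\mathbf{L}(\mathbf{k})$ is not Hermitian. Evidently the projections $\Pi_{n,\zeta}(\mathbf{k})$ are determined by the matrix $\mathbf{L}(\mathbf{k})$ and therefore do not depend on the choice of the basis $\mathbf{g}_{n,\zeta}(\mathbf{k})$. Projections $\Pi_{n,\zeta}(\mathbf{k})$ depend smoothly on $\mathbf{k} \notin \sigma$ (note that we do not assume that the basis elements $\mathbf{g}_{n,\zeta}(\mathbf{k})$ are defined globally as smooth functions for all $\mathbf{k} \notin \sigma$; in fact, band-crossing points may be branching points for eigenfunctions, see, for example, [1]). They are also uniformly bounded thanks to Condition 2.5:
\[ C_\Xi^{-1}|\mathbf{V}| \le \Big(\sum_{n,\zeta} |\Pi_{n,\zeta}(\mathbf{k})\mathbf{V}|^2\Big)^{1/2} \le C_\Xi|\mathbf{V}|, \quad \mathbf{V} \in \mathbb{C}^{2J}, \quad \mathbf{k} \notin \sigma. \tag{2.20} \]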
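The modal decomposition (2.18) and the projections (2.19) can be illustrated for a fixed $\mathbf{k}$ by a small linear-algebra sketch. It assumes a diagonalizable matrix with distinct eigenvalues and uses a flat index $j$ in place of the pair $(n, \zeta)$; the function name `modal_projections` and the sample matrix are hypothetical choices for illustration.

```python
import numpy as np


def modal_projections(L):
    """Return eigenvalues and (possibly non-orthogonal) spectral projections of L.

    The columns of Xi are eigenvectors g_j; the projection onto g_j along the
    other eigenvectors is the outer product of column j of Xi with row j of Xi^{-1}.
    """
    eigvals, Xi = np.linalg.eig(L)
    Xi_inv = np.linalg.inv(Xi)
    projections = [np.outer(Xi[:, j], Xi_inv[j, :]) for j in range(L.shape[0])]
    return eigvals, projections


# A non-Hermitian sample matrix with eigenvalues +1 and -1.
L = np.array([[1.0, 2.0], [0.0, -1.0]])
eigvals, projections = modal_projections(L)

V = np.array([3.0, -1.0])
components = [P @ V for P in projections]  # modal components of V
print(eigvals)
print(sum(components))  # the components add up to V, as in (2.18)
```

Rescaling an eigenvector rescales the corresponding row of $\Xi^{-1}$ inversely, so the outer products are unchanged; this is the sense in which the projections do not depend on the choice of basis.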
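We would like to point out that most of the quantities are defined outside of the singular set $\sigma$ of band-crossing points. It is sufficient since we consider $\tilde{\mathbf{U}}(\mathbf{k})$ as an element of the space $L^1$ of Lebesgue integrable functions and the set $\sigma$ has zero Lebesgue measure. The class of nonlinearities $\tilde{F}$ in (2.3) which we consider can be described as follows. $\tilde{F}$ is a general polynomial nonlinearity of the form
\[ \tilde{F}(\tilde{\mathbf{U}}) = \sum_{m=2}^{m_F} \tilde{F}^{(m)}(\tilde{\mathbf{U}}^m), \quad \text{with } m_F \ge 2, \tag{2.21} \]
where the $m$-linear operators $\tilde{F}^{(m)}$ are represented by integral convolution formulas similar to (2.9), namely
\[ \tilde{F}^{(m)}(\tilde{\mathbf{U}}_1, \ldots, \tilde{\mathbf{U}}_m)(\mathbf{k}, \tau) = \int_{D_m} \chi^{(m)}(\mathbf{k}, \vec{k})\,\tilde{\mathbf{U}}_1(\mathbf{k}') \cdots \tilde{\mathbf{U}}_m(\mathbf{k}^{(m)}(\mathbf{k}, \vec{k}))\,\tilde{d}^{(m-1)d}\vec{k}, \tag{2.22} \]
where the domain
\[ D_m = [-\pi, \pi]^{(m-1)d}, \tag{2.23} \]
and we use the notation
\[ \tilde{d}^{(m-1)d}\vec{k} = \frac{1}{(2\pi)^{(m-1)d}}\,d\mathbf{k}' \cdots d\mathbf{k}^{(m-1)} \tag{2.24} \]
and
\[ \mathbf{k}^{(m)}(\mathbf{k}, \vec{k}) = \mathbf{k} - \mathbf{k}' - \cdots - \mathbf{k}^{(m-1)}, \quad \vec{k} = (\mathbf{k}', \ldots, \mathbf{k}^{(m)}). \tag{2.25} \]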
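Condition 2.6 (Nonlinearity Regularity). The nonlinear operator $\tilde{F}(\tilde{\mathbf{U}})$ defined by (2.21) satisfies
\[ \|\chi^{(m)}\| = \frac{1}{(2\pi)^{(m-1)d}} \sup_{\mathbf{k}, \mathbf{k}', \ldots, \mathbf{k}^{(m)}} |\chi^{(m)}(\mathbf{k}, \mathbf{k}', \ldots, \mathbf{k}^{(m)})| \le C_\chi, \quad m = 2, 3, \ldots, \tag{2.26} \]
where, without loss of generality, we can assume that $C_\chi \ge 1$. The norm $|\chi^{(m)}(\mathbf{k}, \vec{k})|$ of the tensor $\chi^{(m)}$ for fixed $\mathbf{k}$, $\vec{k}$ as an $m$-linear operator from $(\mathbb{C}^{2J})^m$ into $\mathbb{C}^{2J}$ is defined by
\[ |\chi^{(m)}(\mathbf{k}, \vec{k})| = \sup_{|\mathbf{x}_j| \le 1} |\chi^{(m)}(\mathbf{k}, \vec{k})(\mathbf{x}_1, \ldots, \mathbf{x}_m)|, \tag{2.27} \]
where, as always, $|\cdot|$ stands for the standard Euclidean norm. The tensors $\chi^{(m)}(\mathbf{k}, \vec{k})$ are assumed to be smooth functions of $\mathbf{k}, \mathbf{k}', \ldots, \mathbf{k}^{(m)} \notin \sigma$, namely for every compact $K \subset \mathbb{R}^d \setminus \sigma$ and for all $m = 2, 3, \ldots$
\[ |\nabla^l \chi^{(m)}(\mathbf{k}, \mathbf{k}', \ldots, \mathbf{k}^{(m)})| \le C_{K,l} \quad \text{if } \mathbf{k}, \mathbf{k}', \ldots, \mathbf{k}^{(m)} \in K, \quad l = 1, 2, \ldots, \tag{2.28} \]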
where $\nabla^l \chi^{(m)}$ is the vector composed of all partial derivatives of order $l$ of all components of the tensor $\chi^{(m)}$ with respect to the variables $\mathbf{k}, \mathbf{k}', \ldots, \mathbf{k}^{(m)}$. From now on all the nonlinear operators we consider are assumed to satisfy the nonlinearity regularity Condition 2.6.
Remark 2.7. At first sight, since $\varrho$ is a small parameter, one might think that the linear term in (2.1) with the factor $1/\varrho$ is dominant. But it is not that simple. Indeed, since all eigenvalues of $i\mathbf{L}(\mathbf{k})$ are purely imaginary, the magnitude of $e^{-\frac{i}{\varrho}\tau\mathbf{L}(\mathbf{k})}\tilde{\mathbf{h}}(\mathbf{k})$, which represents the solution of the linear equation (with $\tilde{F} = 0$), is bounded uniformly in $\varrho$. A nonlinearity $\tilde{F}$ alters the solution for a bounded time $\tau_*$ which is not small for small $\varrho$. Therefore the influence of the nonlinearity can be significant. This phenomenon can be illustrated by the following toy model. Let us consider the partial differential equation for a scalar function $y(x, \tau)$:
\[ \partial_\tau y = -\frac{1}{\varrho}\,\partial_x y + y^2, \quad y(x, 0) = h(x). \]
Its solution is of the form
\[ y(x, \tau) = \frac{h\left(x - \frac{\tau}{\varrho}\right)}{1 - \tau h\left(x - \frac{\tau}{\varrho}\right)}, \tag{2.29} \]
and in general it exists only for a finite time. The solution (2.29) shows that the large coefficient $1/\varrho$ enters it so that the corresponding wave moves faster, with the velocity $1/\varrho$ along the $x$-axis, but the wave’s shape does not depend on $\varrho$ at all. For the NLS with the initial data $\tilde{h}(\mathbf{k}) = \tilde{h}(\mathbf{k}, \beta)$, $\varrho = \beta^2$, and the coefficient $1/\varrho$ at the
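linear part, the nonlinearity balances the effect of dispersion, leading to the emergence of solitons, see [6] for a discussion.
To formulate our results we introduce a Banach space $E = C([0, \tau_*], L^1)$ of functions $\tilde{\mathbf{v}}(\mathbf{k}, \tau)$, $0 \le \tau \le \tau_*$, with the norm
\[ \|\tilde{\mathbf{v}}(\mathbf{k}, \tau)\|_E = \|\tilde{\mathbf{v}}(\mathbf{k}, \tau)\|_{C([0,\tau_*], L^1)} = \sup_{0 \le \tau \le \tau_*} \int_{[-\pi,\pi]^d} |\tilde{\mathbf{v}}(\mathbf{k}, \tau)|\,d\mathbf{k}. \tag{2.30} \]
Here $L^1$ is the Lebesgue function space with the standard norm defined by the formula
\[ \|\tilde{\mathbf{v}}(\cdot)\|_{L^1} = \int_{[-\pi,\pi]^d} |\tilde{\mathbf{v}}(\mathbf{k})|\,d\mathbf{k}. \tag{2.31} \]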
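Returning briefly to the toy model of Remark 2.7, the closed-form solution (2.29) can be checked numerically by evaluating the residual of the equation with central finite differences. The snippet is a sketch under stated assumptions: the Gaussian initial profile `h`, the step size and the sample values of $\tau$ and $\varrho$ are chosen here for illustration only.

```python
import numpy as np


def h(x):
    """Hypothetical smooth initial profile with max value 0.5."""
    return 0.5 * np.exp(-x ** 2)


def y(x, tau, rho):
    """Closed-form solution (2.29) of d_tau y = -(1/rho) d_x y + y^2, y(x, 0) = h(x)."""
    s = h(x - tau / rho)
    return s / (1.0 - tau * s)  # finite as long as tau * max(h) < 1


def pde_residual(x, tau, rho, step=1e-5):
    """Central-difference estimate of d_tau y + (1/rho) d_x y - y^2."""
    dy_dtau = (y(x, tau + step, rho) - y(x, tau - step, rho)) / (2 * step)
    dy_dx = (y(x + step, tau, rho) - y(x - step, tau, rho)) / (2 * step)
    return dy_dtau + dy_dx / rho - y(x, tau, rho) ** 2


if __name__ == "__main__":
    x = np.linspace(5.0, 9.0, 41)  # window around the wave position tau / rho = 7
    print(np.max(np.abs(pde_residual(x, tau=0.7, rho=0.1))))
```

In the moving frame $x - \tau/\varrho$ the profile is $h/(1 - \tau h)$ for every $\varrho$, which is the statement that the wave's shape does not depend on $\varrho$.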
The following theorem guarantees the existence and the uniqueness of a solution to the modal evolution equation (2.3) on a time interval which does not depend on $\varrho$ (see Theorem 5.4 for details).

Theorem 2.8 (Existence and Uniqueness). Let the modal evolution equation (2.3) satisfy Condition 2.5, and let $\tilde h \in L_1$, $\|\tilde h\|_{L_1} \le R$. Then there exists a unique solution $\tilde U = G(\tilde h)$ of (2.3) which belongs to $C^1([0, \tau_*], L_1)$. The number $\tau_* > 0$ depends on $R$, $C_\chi$ and $C_\Xi$ and it does not depend on $\varrho$.

Now we would like to formulate the main result of this paper, a theorem on the superposition principle, showing that the generic wavepackets evolve almost independently for the case of lattice equations. To do that, we first define the important concept of a wavepacket.

Definition 2.9 (Wavepacket). A function $\tilde h(\beta, k)$ which depends on a parameter $0 < \beta < 1$ is called a wavepacket with a center $k_*$ if it satisfies the following conditions:

(i) It is bounded in $L_1$ uniformly in $\beta$, i.e.
$$\|\tilde h(\beta, \cdot)\|_{L_1} \le C_h. \qquad (2.32)$$

(ii) It is composed of modes from essentially a single band $n$, namely for any $0 < \epsilon < 1$ there is a constant $C_\epsilon > 0$ such that
$$\|\tilde h(k) - \tilde h_-(k) - \tilde h_+(k)\|_{L_1} \le C_\epsilon \beta, \qquad \tilde h_\zeta(k) = \Pi_{n,\zeta}\tilde h(k), \qquad \zeta = \pm, \qquad (2.33)$$
and $\tilde h_\zeta(\beta, k)$ is essentially supported in a small vicinity of $\zeta k_*$, where $k_*$ is the wavepacket center, namely
$$\int_{|k - \zeta k_*| \ge \beta^{1-\epsilon}} |\tilde h_\zeta(\beta, k)|\,dk \le C_\epsilon \beta. \qquad (2.34)$$

(iii) The wavepacket center $k_*$ is not a band-crossing point, that is $k_* \notin \sigma$, and the following regularity condition holds:
$$\int_{|k - \zeta k_*| \le \beta^{1-\epsilon}} |\nabla_k \tilde h_\zeta(\beta, k)|\,dk \le C_\epsilon \beta^{-1-\epsilon}. \qquad (2.35)$$

In the above conditions (ii) and (iii), $C_\epsilon$ does not depend on $\beta$, $0 < \beta < 1$.
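As a rough numerical illustration of conditions (2.32) and (2.34), the sketch below uses a one-dimensional scalar profile of the scaled form $\beta^{-1} g((k - k_*)/\beta)$ with a Gaussian $g$. The profile, the center `k_star = 1.0`, the exponent `eps = 0.5` and the integration window are all assumptions made for the illustration, and the integral is taken over an interval of the real line rather than the torus.

```python
import math

def packet(k, beta, k_star=1.0):
    # Hypothetical 1D scalar profile beta^{-1} * g((k - k_star) / beta) with Gaussian g.
    s = (k - k_star) / beta
    return math.exp(-s * s) / beta

def l1_mass(beta, inner_radius=0.0, k_star=1.0, half_width=3.0, n=60000):
    # Midpoint rule for the integral of |packet| over
    # inner_radius <= |k - k_star| <= half_width.
    dk = 2.0 * half_width / n
    total = 0.0
    for j in range(n):
        k = k_star - half_width + (j + 0.5) * dk
        if abs(k - k_star) >= inner_radius:
            total += abs(packet(k, beta, k_star)) * dk
    return total

eps = 0.5
for beta in (0.2, 0.05, 0.01):
    full = l1_mass(beta)                       # compare with the uniform bound (2.32)
    tail = l1_mass(beta, beta ** (1.0 - eps))  # compare with the tail bound (2.34)
    print(beta, full, tail)
```

The full mass is the same for every `beta` (it equals the integral of the Gaussian), while the mass outside the ball of radius $\beta^{1-\epsilon}$ decays much faster than $\beta$ for this rapidly decaying profile.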
The simplest example of a wavepacket in the sense of Definition 2.9 is a function of the form
$$\tilde h_\zeta(\beta, k) = \beta^{-d}\,\hat h_\zeta\!\left(\frac{k - \zeta k_*}{\beta}\right) g_{n,\zeta}(k), \qquad \zeta = \pm, \qquad (2.36)$$
where $\hat h_\zeta(k)$ is a Schwartz function, that is an infinitely smooth, rapidly decaying function. Another typical and natural example of a wavepacket $\tilde h$ centered at $k_*$ is readily provided by
$$\tilde h(\beta, k) = \Pi_{n,+}(k)\tilde h_{0,+}(\beta, k) + \Pi_{n,-}(k)\tilde h_{0,-}(\beta, k), \qquad (2.37)$$
where $\tilde h_{0,\zeta}(\beta, k)$ is the lattice Fourier transform of the following function
$$h_{0,\zeta}(m, \beta) = e^{\mathrm{i}\zeta k_*\cdot m}\,\Phi_\zeta(\beta m - r_0)\,g, \qquad \zeta = \pm, \qquad (2.38)$$
where $g$ is a vector in $\mathbb{C}^{2J}$, the projection $\Pi_{n,\zeta}$ is as in (2.19) with some $n$, the vectors $m, r_0 \in \mathbb{R}^d$, and $\Phi_\zeta(r)$ is an arbitrary Schwartz function (see Lemma 7.2). Our special interest is in the waves that are finite sums of wavepackets, and we refer to them as multi-wavepackets.

Definition 2.10 (Multi-Wavepacket). A function $\tilde h(\beta, k)$, $0 < \beta < 1$, is called a multi-wavepacket if it is a finite sum of wavepackets $\tilde h_l$ as defined in Definition 2.9, namely
$$\tilde h(\beta, k) = \sum_{l=1}^{N_h}\tilde h_l(\beta, k), \qquad (2.39)$$
and we call the set $\{k_{*l}\}$ of all the centers $k_{*l}$ of the involved wavepackets the center set of $\tilde h$.

In what follows we will be interested in generic multi-wavepackets, that is, multi-wavepackets whose centers are generic. The exact meaning of this is provided in the following conditions.

Condition 2.11 (Non-Zero Frequency). We assume that every center $k_{*l}$ of a wavepacket satisfies the following condition
$$\omega_{n_l}(k_{*l}) \ne 0, \qquad l = 1, \ldots, N_h. \qquad (2.40)$$

Condition 2.12 (Group Velocity). We assume that all centers $k_{*l}$, $l = 1, \ldots, N_h$, of the multi-wavepacket $\tilde h$ as defined in Definition 2.10 are not band-crossing points, and the gradients $\nabla_k\omega_{n_{l_j}}(k_{*l_j})$ (called group velocities) at these points satisfy the following condition
$$|\nabla_k\omega_{n_{l_1}}(k_{*l_1}) - \nabla_k\omega_{n_{l_2}}(k_{*l_2})| \ne 0 \quad \text{when } l_1 \ne l_2, \qquad (2.41)$$
indicating that the group velocities are different.

We also want the functions (dispersion relations) $\omega_{n_l}(k)$ to be non-degenerate in the sense that they are not exactly linear; the exact conditions are given below.
Consider the following equation for $n$ and $\theta$:
$$\theta\,\omega_{n_l}(k_*) - \zeta\,\omega_n(\theta k_*) = 0, \qquad \zeta = \pm 1, \qquad (2.42)$$
where the admissible $\theta$ have the form
$$\theta = \sum_{j=1}^{m}\zeta^{(j)}, \qquad \zeta^{(j)} = \pm 1, \qquad m \le m_F, \qquad (2.43)$$
and $m_F$ is the same as in (2.21). In the case when some terms $\tilde F^{(m)}$ in the series (2.21) vanish, we take in (2.43) only $m$ corresponding to non-zero $\tilde F^{(m)}$.

Condition 2.13 (Non-Degeneracy). Given a point $k_* = k_{*l}$ and band $n_l$, we assume that the dispersion relations $\omega_n(k)$ are such that all solutions $n$, $\theta$ of (2.42) are necessarily of the form
$$n = n_l, \qquad \theta = \zeta. \qquad (2.44)$$
Definition 2.14 (Generic Multi-Wavepackets). A multi-wavepacket $\tilde h$ as defined in Definition 2.10 is called generic if the centers $k_{*l}$, $l = 1, \ldots, N_h$, of all wavepackets satisfy Conditions 2.11 and 2.12, and the dispersion relations $\omega_n(k)$ at every $k_{*l}$ and band $n_l$ satisfy Condition 2.13.

We introduce now the solution operator $G$ mapping the initial data $\tilde h$ into the solution $\tilde U = G(\tilde h)$ of the modal evolution equation (2.3); this operator is defined for $\|\tilde h\|_{L_1} \le R$ according to Theorem 2.8. The main result of this paper for the lattice case is the following statement.

Theorem 2.15 (Superposition Principle for Lattice Equations). Suppose that the initial data $\tilde h$ of (2.3) is a multi-wavepacket of the form
$$\tilde h = \sum_{l=1}^{N_h}\tilde h_l, \qquad N_h \max_l \|\tilde h_l\|_{L_1} \le R, \qquad (2.45)$$
satisfying Definition 2.10, where $\tilde h$ is generic in the sense of Definition 2.14. Let us assume that
$$\frac{\beta^2}{\varrho} \le C \ \text{ with some } C, \qquad 0 < \beta \le \frac{1}{2}, \qquad 0 < \varrho \le \frac{1}{2}. \qquad (2.46)$$
Then the solution $\tilde U = G(\tilde h)$ to the evolution equation (2.3) satisfies the following approximate superposition principle
$$G\left(\sum_{l=1}^{N_h}\tilde h_l\right) = \sum_{l=1}^{N_h} G(\tilde h_l) + \tilde D, \qquad (2.47)$$
with a small remainder $\tilde D(\tau)$ satisfying the following estimate
$$\sup_{0\le\tau\le\tau_*}\|\tilde D(\tau)\|_{L_1} \le C_\epsilon\,\frac{\varrho}{\beta^{1+\epsilon}}\,|\ln\beta|, \qquad (2.48)$$
where $\epsilon$ is the same as in Definition 2.9 and can be arbitrarily small, and $\tau_*$ does not depend on $\beta$, $\varrho$ and $\epsilon$.
The most common case when (2.46) holds is $\varrho = \beta^2$; a discussion of different scalings is provided in [6, 7]. Observe that solutions to the original evolution equation (2.1) with the initial data (2.39), (2.38) satisfy the superposition principle if the wave vectors $k_{*l}$ in (2.38) satisfy (2.41), (2.42) and $\Phi_l$ are Schwartz functions. It turns out that the evolution of every coefficient $\tilde u_{n,\zeta}(k)$ of the solution as defined by (2.18) can be accurately approximated by a solution of a relevant nonlinear Schrödinger equation (NLS), see [23]. Therefore Theorem 2.15 provides a reduction of the multi-wavepacket problem to several single-wavepacket problems. We also would like to stress that though $\beta$ is small, the nonlinear effects are not small. Namely, there can be a significant difference between solutions of a nonlinear and the corresponding linear (with $F(U)$ set to zero) equations with the same initial data for times $\tau = \tau_*$.

Recall that up to now we analyzed the nonlinear evolution in the modal form (2.3) for $\tilde U(k, \tau)$. To make a statement on the nonlinear evolution for the original evolution equation (2.1), i.e. in terms of the quantities $U(m, \tau)$, we introduce $U(h)(m)$ as the inverse Fourier transform of the solution $G(\tilde h)(k)$ of the modal evolution equation (2.3). Recall that the inverse Fourier transform corresponding to (2.2) is given by the formula
$$U(m) = (2\pi)^{-d}\int_{[-\pi,\pi]^d} e^{\mathrm{i}m\cdot k}\,\tilde U(k)\,dk, \qquad (2.49)$$
and when applying the inverse Fourier transform we get back the original lattice system (2.1) from its modal form (2.3). The convolution form of the nonlinearity makes the lattice system invariant with respect to translations on the lattice $\mathbb{Z}^d$. Using Theorem 2.15 and applying the inverse Fourier transform together with the inequality
$$\|U\|_{L_\infty} \le (2\pi)^{-d}\,\|\tilde U\|_{L_1} \qquad (2.50)$$
we obtain the following statement.

Corollary 2.16. Let the evolution equation (2.1) be obtained as the lattice Fourier transform of (2.3). If $h$ is given by (2.38) where every $\Phi_{l,\zeta}(r)$ is a Schwartz function (that is an infinitely smooth, rapidly decaying function) then $U(h)$ is a solution to the evolution equation (2.1). If $h = h_1 + \cdots + h_{N_h}$ and every $h_l$ is given by (2.38) then the approximate superposition principle holds:
$$U(h) = U(h_1) + \cdots + U(h_{N_h}) + D, \qquad (2.51)$$
with a small coupling remainder $D(\tau)$ satisfying
$$\sup_{0\le\tau\le\tau_*}\|D(\tau)\|_{L_\infty} \le C_\delta\,\frac{\varrho}{\beta^{1+\delta}}, \qquad (2.52)$$
where $\delta > 0$ can be taken arbitrarily small.
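The step from the modal estimate to the lattice estimate rests on inequality (2.50). A minimal sketch of that inequality for $d = 1$ follows; the modal profile `u_tilde`, its parameters and the grid size are hypothetical choices, and the integral in (2.49) is replaced by a Riemann sum, for which the bound follows from the triangle inequality.

```python
import cmath
import math

def u_tilde(k, beta=0.2, k_star=1.0):
    # Hypothetical modal profile (an assumption): Gaussian bump near k_star
    # multiplied by the phase exp(3ik).
    s = (k - k_star) / beta
    return cmath.exp(3j * k) * math.exp(-s * s) / beta

n = 2048
dk = 2 * math.pi / n
grid = [-math.pi + (j + 0.5) * dk for j in range(n)]
values = [u_tilde(k) for k in grid]

# Discrete L1 norm of the modal profile on [-pi, pi].
l1_norm = sum(abs(v) for v in values) * dk

def inverse_transform(m):
    # Riemann-sum version of (2.49) for d = 1.
    total = sum(cmath.exp(1j * m * k) * v for k, v in zip(grid, values))
    return total * dk / (2 * math.pi)

sup_norm = max(abs(inverse_transform(m)) for m in range(-40, 41))
bound = l1_norm / (2 * math.pi)
print("sup over m of |U(m)|:", sup_norm)
print("(2 pi)^{-1} * L1 norm:", bound)
```

With this particular phase the bound is attained (up to rounding) at $m = -3$, where the oscillating factors cancel; for other $m$ the values are strictly smaller.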
As an application of Theorem 2.15 let us consider the Fermi–Pasta–Ulam equation (2.10). We impose the initial condition for (2.10)
$$x_n(0) = \sum_{l=1}^{n_h}\Psi_{0l}(\beta n - r_l)\,e^{\mathrm{i}k_{*l}n} + cc, \qquad y_n(0) = \sum_{l=1}^{n_h}\Psi_{1l}(\beta n - r_l)\,e^{\mathrm{i}k_{*l}n} + cc, \qquad n \in \mathbb{Z}, \qquad (2.53)$$
where Ψ0l (r), Ψ1l (r) are arbitrary Schwartz functions, and rl are arbitrary real numbers, cc means complex conjugate to the preceding terms and assume that , β satisfy (2.46). For any given k∗l there are two eigenvectors g± (k∗l ) of the matrix L(k∗l ) in (2.12) given by (7.3) and corresponding terms in (2.53) can be written as Ψ0l ik∗l n = [Φ−,l g− (k∗l ) + Φ+,l g+ (k∗l )]eik∗l n . e Ψ1l In this case all requirements of Definition 2.10 are fulfilled, and (2.53) defines a multi-wavepacket. Note that the multi-wavepacket (2.53) involves Nh = 2nh wavepackets with 2nh wavepacket centers ϑk∗l , ϑ = ±. To satisfy Condition 2.12 the wavepacket centers k∗l must satisfy k∗l cos 2 = sin k∗l 2
k∗j cos 2 sin k∗j 2
if l = j.
(2.54)
To check if the centers $k_{*l}$ satisfy Condition 2.13 we consider the equation
$$z\left|\sin\frac{k_{*l}}{2}\right| - \zeta\left|\sin\left(z\,\frac{k_{*l}}{2}\right)\right| = 0, \qquad z = \sum_{j=1}^{3}\zeta^{(j)}, \qquad \zeta^{(j)} = \pm 1. \qquad (2.55)$$
Evidently the possible values of $z$ are $-3, -1, 1, 3$. Since the equation $3|\sin\varphi| = |\sin(3\varphi)|$ has the only solution $\varphi = 0$ on $[0, \pi/2]$, Eq. (2.55) has the only solution $z = \zeta$. Consequently, all points $k_{*l} \ne 0$ satisfy Condition 2.13, and Theorem 2.15 applies. The initial data for a single wavepacket solution have the form
$$\begin{pmatrix} x_{\vartheta,n,l}(0) \\ y_{\vartheta,n,l}(0) \end{pmatrix} = \Phi_{\vartheta,l}(\beta n - r_l)\,g_\vartheta(k_{*l}) + cc, \qquad n \in \mathbb{Z}, \qquad \vartheta = \pm. \qquad (2.56)$$
According to this theorem and Corollary 2.16 the solution to (2.10), (2.53) equals the sum of solutions of (2.10) with single wavepacket initial data, that is
$$x_n(\tau) = \sum_{\vartheta=\pm}\sum_{l=1}^{n_h} x_{\vartheta,n,l}(\tau) + D_{1,n}(\tau), \qquad y_n(\tau) = \sum_{\vartheta=\pm}\sum_{l=1}^{n_h} y_{\vartheta,n,l}(\tau) + D_{2,n}(\tau), \qquad (2.57)$$
Fig. 1. In this picture, two wavepackets are shown with different "centers" $k_{*1}$ and $k_{*2}$ (horizontal axis $x$, vertical axis $y$). The values of $k_{*1}$ and $k_{*2}$ are proportional to the frequencies of spatial oscillations. Though the wavepackets overlap in physical space, they pass one through another in the process of nonlinear evolution almost without interaction if their group velocities are different.
where $D_n$ is a small remainder satisfying
$$\sup_{0\le\tau\le\tau_*}\,\sup_n\,[|D_{1,n}(\tau)| + |D_{2,n}(\tau)|] \le C_\delta\,\frac{\varrho}{\beta^{1+\delta}} \qquad (2.58)$$
with arbitrarily small positive $\delta$. Hence, the following statement holds.

Theorem 2.17 (Superposition for Fermi–Pasta–Ulam Equation). If every $\Phi_{l,\zeta}(r)$ is a Schwartz function, and the wavevectors $k_{*l} \ne 0$ satisfy (2.54), then the solution $x_n(\tau)$, $y_n(\tau)$ of the initial value problem for the Fermi–Pasta–Ulam equation (2.10) with multi-wavepacket initial condition (2.53) is a linear superposition of solutions $x_{n,l}(\tau)$, $y_{n,l}(\tau)$ of the same equation with single-wavepacket initial condition (2.56) up to a small coupling term $D_{1,n}(\tau)$, $D_{2,n}(\tau)$ satisfying (2.57), (2.58) with arbitrarily small $\delta > 0$ and $\tau_*$ which do not depend on $\beta$, $\varrho$, $\delta$.

Note that solutions $x_{\vartheta,n,l}(\tau)$ with different $\vartheta$, $l$ resemble $2n_h$ solitons which originate at different points $r_l$ and propagate with different group velocities. According to (2.57), (2.58) all these soliton-like wavepackets pass through one another with very little interaction, see Fig. 1. Note that Theorem 2.15 shows that this phenomenon is robust in the class of general difference equations on the lattice $\mathbb{Z}$, and that it persists under polynomial perturbations of the nonlinearity as well as perturbations of the linear part of Eq. (2.11) as long as they leave the linear difference operator nonpositive and self-adjoint. Observe also that the evolution of every single wavepacket is nonlinear, and it is well-approximated by a properly constructed NLS (we intend to write a proof of this statement for general lattice systems in another article; see [23] for a particular case). For example, for a special choice of $\Psi_{jl}$ the solution $x_{n,l}(\tau)$ can be well-approximated by a soliton solution of a corresponding NLS.
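The non-degeneracy check used above for the Fermi–Pasta–Ulam centers can be sanity-checked numerically. The sketch below is an illustration only: it assumes the dispersion relation is proportional to $|\sin(k/2)|$ (the function `omega` is a hypothetical stand-in, not quoted from (2.12)), scans $3|\sin\varphi| - |\sin 3\varphi|$ on $(0, \pi/2]$, and enumerates the pairs $(z, \zeta)$ that solve the resonance equation at a sample center.

```python
import math

def gap(phi):
    # 3|sin(phi)| - |sin(3 phi)|; since sin(3 phi) = 3 sin(phi) - 4 sin(phi)^3,
    # this quantity is nonnegative and vanishes only where sin(phi) = 0.
    return 3 * abs(math.sin(phi)) - abs(math.sin(3 * phi))

# Scan (0, pi/2]: the gap should be strictly positive away from phi = 0.
samples = [0.5 * math.pi * j / 1000 for j in range(1, 1001)]
min_gap = min(gap(phi) for phi in samples)

def omega(k):
    # Assumed dispersion relation for this check: |sin(k/2)| up to a constant factor.
    return abs(math.sin(k / 2))

def solutions(k_star, tol=1e-12):
    # All pairs (z, zeta), z in {-3, -1, 1, 3}, zeta = +-1, with
    # z * omega(k_star) - zeta * omega(z * k_star) = 0.
    return [
        (z, zeta)
        for z in (-3, -1, 1, 3)
        for zeta in (-1, 1)
        if abs(z * omega(k_star) - zeta * omega(z * k_star)) < tol
    ]

print("minimum of the gap on (0, pi/2]:", min_gap)
print("resonant pairs at k_star = 1.0:", solutions(1.0))
```

For the sample center the only resonant pairs are those with $z = \zeta$, in agreement with the conclusion drawn from (2.55).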
2.2. Main statements and examples for semilinear systems of hyperbolic PDE

In this subsection, we consider nonlinear evolution equations involving partial differential (and pseudodifferential) operators with respect to spatial variables with constant coefficients in the entire space $\mathbb{R}^d$. There is a great deal of similarity between such nonlinear evolution PDE and the lattice nonlinear evolution equations considered in the previous subsection. In particular, we first study not the original PDE but its Fourier transform, the modal evolution equation, and the results concerning the original PDE are obtained by applying the inverse Fourier transform. Recall that for functions $U(r)$ from $L_1(\mathbb{R}^d)$ the Fourier transform and its inverse are defined by the formulas
$$\hat U(k) = \int_{\mathbb{R}^d} U(r)\,e^{-\mathrm{i}r\cdot k}\,dr, \quad \text{where } k \in \mathbb{R}^d, \qquad (2.59)$$
$$U(r) = \frac{1}{(2\pi)^d}\int_{\mathbb{R}^d} \hat U(k)\,e^{\mathrm{i}r\cdot k}\,dk, \quad \text{where } r \in \mathbb{R}^d. \qquad (2.60)$$
Similarly to (2.3) we introduce the following modal evolution equation
$$\partial_\tau \hat U(k, \tau) = -\frac{\mathrm{i}}{\varrho}\,L(k)\hat U(k, \tau) + \hat F(\hat U)(k, \tau), \qquad \hat U(k, 0) = \hat h(k), \qquad k \in \mathbb{R}^d, \qquad (2.61)$$
where (i) $\hat U(k, \tau)$ is a $2J$-component vector-function of $k$, $\tau$, (ii) $L(k)$ is a $2J \times 2J$ matrix function of $k$, and (iii) $\hat F(\hat U)$ is the nonlinearity. We assume that the $2J \times 2J$ matrix $L(k)$, $k \in \mathbb{R}^d$, has exactly $2J$ eigenvectors $g_{n,\zeta}(k)$ with corresponding $2J$ real eigenvalues $\omega_{n,\zeta}(k)$ satisfying the relations (2.13)–(2.17). We also assume the matrix $L(k)$, $k \in \mathbb{R}^d$, to satisfy the polynomial bound
$$|L(k)| \le C(1 + |k|^p). \qquad (2.62)$$
The singular set $\sigma$ for $L(k)$ is as in Definition 2.3 with the only difference that the functions $\omega_{n,\zeta}(k)$ are defined over $\mathbb{R}^d$ rather than the torus $[-\pi, \pi]^d$, and, consequently, they are not periodic. The nonlinearity $\hat F(\hat U)$ has a form entirely similar to (2.21):
$$\hat F(\hat U) = \sum_{m=2}^{m_F}\hat F^{(m)}(\hat U^m), \qquad (2.63)$$
with $\hat F^{(m)}$ being $m$-linear operators with the following representation similar to (2.22):
$$\hat F^{(m)}(\hat U_1, \ldots, \hat U_m)(k) = \int_{D_m}\chi^{(m)}(k, \vec k)\,\hat U_1(k')\cdots\hat U_m(k^{(m)}(k, \vec k))\,\tilde d^{(m-1)d}\vec k, \qquad (2.64)$$
where $k^{(m)}(k, \vec k)$ is defined by the convolution equation (2.25), $\tilde d$ is defined by (2.24), and $D_m$ in (2.64) is now defined not by (2.23) but by
$$D_m = \mathbb{R}^{(m-1)d}. \qquad (2.65)$$
The difference with (2.3) now is that the involved functions of $k$, $k'$ etc. are not $2\pi$-periodic, $D_m$ in (2.64) is defined by (2.65) instead of (2.23), and the tensors $\chi^{(m)}(k, \vec k)$ satisfy the nonlinear regularity Condition 2.6 without the periodicity assumption. The functions $\hat U_l(k^{(l)})$ in (2.64) are assumed to be from the space $L_1 = L_1(\mathbb{R}^d)$ with the norm
$$\|\hat U(\cdot)\|_{L_1} = \int_{\mathbb{R}^d}|\hat U(k)|\,dk. \qquad (2.66)$$
We seek solutions to (2.61) in the space $C^1([0, \tau_*], L_1)$ with $0 < \tau_* \le 1$. Applying the inverse Fourier transform to the modal evolution equation (2.61) we obtain a hyperbolic $2J$-component system in $\mathbb{R}^d$ of the form
$$\partial_\tau U(r, \tau) = -\frac{\mathrm{i}}{\varrho}\,L(-\mathrm{i}\nabla_r)U(r, \tau) + F(U)(r, \tau), \qquad U(r, 0) = h(r). \qquad (2.67)$$
Note that since $L(k)$ satisfies the polynomial bound (2.62) we can define the action of the operator $L(-\mathrm{i}\nabla_r)$ on any Schwartz function $Y(r)$ by the formula
$$\widehat{L(-\mathrm{i}\nabla_r)Y}(k) = L(k)\hat Y(k), \qquad (2.68)$$
where, in view of (2.62), the order of $L$ does not exceed $p$. If all the entries of $L(k)$ are polynomials, such a definition coincides with the common definition of the action of a differential operator $L(-\mathrm{i}\nabla_r)$. In this case $L(-\mathrm{i}\nabla_r)$ defined by (2.68) is a differential operator with constant coefficients of order not greater than $p$.

The properties of the modal evolution equation (2.61) are completely similar to those of its lattice counterpart and are as follows. The existence and uniqueness theorem is similar to Theorem 2.8.

Theorem 2.18 (Existence and Uniqueness). Let Eq. (2.61) satisfy conditions (2.17) and (2.26) and let $\hat h \in L_1 = L_1(\mathbb{R}^d)$, $\|\hat h\|_{L_1} \le R$. Then there exists a unique solution to the modal evolution equation (2.61) in the functional space $C^1([0, \tau_*], L_1)$. The number $\tau_*$ depends on $R$, $C_\chi$ and $C_\Xi$.

Here is the main result for the semilinear hyperbolic systems of PDE, which is completely similar to Theorem 2.15.

Theorem 2.19 (Principle of Superposition for PDE Systems). Let the initial data of the modal evolution equation (2.61) be a multi-wavepacket, i.e. the sum of $N_h$ wavepackets $\hat h_l$ as in (2.45) satisfying Definitions 2.9 and 2.10. Suppose that $\varrho$, $\beta$ satisfy condition (2.46). Assume also that $\hat h$ is generic in the sense of
Definition 2.14. Then the solution $\hat U = G(\hat h)$ to the modal evolution equation (2.61) satisfies the approximate linear superposition principle, namely
$$G\left(\sum_{l=1}^{N_h}\hat h_l\right) = \sum_{l=1}^{N_h} G(\hat h_l) + \hat D, \qquad (2.69)$$
with a small remainder $\hat D(\tau)$
$$\sup_{0\le\tau\le\tau_*}\|\hat D(\tau)\|_{L_1} \le C_\epsilon\,\frac{\varrho}{\beta^{1+\epsilon}}\,|\ln\beta|, \qquad (2.70)$$
where $\epsilon$ is the same as in Definition 2.9, and $\tau_*$ does not depend on $\beta$, $\varrho$ and $\epsilon$. The solutions $U(h)(r, \tau)$ of the space evolution equation (2.67) are obtained as the inverse Fourier transform of $G(\hat h)$ and they satisfy the approximate linear superposition principle, namely
$$U(h) = U(h_1) + \cdots + U(h_{N_h}) + D, \qquad (2.71)$$
with a small coupling remainder $D(\tau)$ satisfying
$$\sup_{0\le\tau\le\tau_*}\|D(\tau)\|_{L_\infty} \le C_\epsilon\,\frac{\varrho}{\beta^{1+\epsilon}}\,|\ln\beta|, \qquad (2.72)$$
where $\epsilon > 0$ is the same as in Definition 2.9 and can be arbitrarily small.

Example 1. Sine-Gordon and Klein–Gordon Equations with Small Initial Data. Let us consider the sine-Gordon equation (see [26])
$$\partial_t^2 u = \partial_r^2 u - \sin u \qquad (2.73)$$
with small initial data
$$u(r, 0) = \beta b_0, \qquad \partial_t u(r, 0) = \beta b_1, \qquad \beta \ll 1. \qquad (2.74)$$
First, we recast this equation into our framework by rescaling the variables
$$u = \beta U_1, \qquad \beta^2 t = \tau. \qquad (2.75)$$
Since $\sin\beta U_1 = \beta U_1 - \frac{1}{6}\beta^3 U_1^3 + \beta^5 f(U_1)$, where evidently $f(U_1)$ is an entire function, we can recast Eq. (2.73) into the following form
$$\partial_\tau^2 U_1 = \frac{1}{\beta^4}\,[\partial_x^2 U_1 - U_1] + \frac{1}{\beta^2}\,[qU_1^3 + \beta^2 f(U_1)]. \qquad (2.76)$$
We introduce then a linear pseudodifferential operator $A = (I - \partial_x^2)^{1/2}$ with the symbol $(1 + k^2)^{1/2}$ and rewrite Eq. (2.76) as the following system
$$\partial_\tau U_1 = \frac{1}{\beta^2}\,AU_2, \qquad \partial_\tau U_2 = -\frac{1}{\beta^2}\,AU_1 + A^{-1}[qU_1^3 + \beta^2 f(U_1)], \qquad (2.77)$$
with the initial data
$$U_1(0) = h_0, \qquad U_2(0) = h_1, \qquad (2.78)$$
where $h_0$ and $h_1$ are assumed to be of the form
$$z(r, 0) = h_0, \qquad p(r, 0) = h_1, \qquad h_j = \sum_{l=1}^{n_h}\Psi_{jl}(\beta r - r_l)\,e^{\mathrm{i}k_{*l}\cdot r} + cc, \qquad j = 0, 1, \qquad (2.79)$$
in the one-dimensional case with $\mathbf{r} = r$, $\mathbf{k} = k$. Evidently, the relations with the initial data of (2.73) are
$$b_0 = h_0, \qquad b_1 = Ah_1.$$
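Notice that the system (2.77) is of the form (2.67) with
$$\varrho = \beta^2, \qquad LU = \begin{pmatrix} AU_2 \\ -AU_1 \end{pmatrix}, \qquad F(U) = F_0(U) + \beta^2 F_1(U),$$
$$F_0(U) = A^{-1}\begin{pmatrix} 0 \\ qU_1^3 \end{pmatrix}, \qquad F_1(U) = A^{-1}\begin{pmatrix} 0 \\ f(U_1) \end{pmatrix}. \qquad (2.80)$$
Observe now that $L$ has only one spectral band with the dispersion relation and eigenvectors given by
$$\omega(k) = (1 + k^2)^{1/2}, \qquad g_\vartheta(k) = g_\vartheta = 2^{-1/2}\begin{pmatrix} -\mathrm{i}\vartheta \\ 1 \end{pmatrix}, \qquad \vartheta = \pm 1,$$
and there are no band-crossing points. We use the expansion in the basis $g_\pm$
$$\begin{pmatrix} \Psi_{0l} \\ \Psi_{1l} \end{pmatrix} e^{\mathrm{i}k_{*l}\cdot r} = [\Phi_{+,l}\,g_+ + \Phi_{-,l}\,g_-]\,e^{\mathrm{i}k_{*l}\cdot r} \qquad (2.81)$$
to represent the initial data (2.78) and (2.79). Here Eq. (2.42) takes the form
$$(1 + k_{*l}^2)^{1/2}\,\lambda = \zeta\,(1 + \lambda^2 k_{*l}^2)^{1/2}, \qquad \zeta = \pm 1.$$
Obviously, this equation has only solutions $\lambda = \zeta$ and Condition 2.13 is fulfilled. Condition 2.12 holds if
$$\frac{\vartheta k_{*l}}{(1 + k_{*l}^2)^{1/2}} \ne \frac{\vartheta' k_{*l'}}{(1 + k_{*l'}^2)^{1/2}} \quad \text{for } l \ne l' \text{ or } \vartheta \ne \vartheta', \qquad (2.82)$$
which is equivalent to
$$k_{*l'} \ne k_{*l} \ \text{ for } l' \ne l, \quad \text{and } k_{*l} \ne 0 \text{ for all } l. \qquad (2.83)$$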
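The two checks just made for the dispersion relation $\omega(k) = (1 + k^2)^{1/2}$ can be reproduced with a few lines of code. This is a minimal sketch with hypothetical helper names; it restricts $\lambda$ to a finite range of integers (an assumption made only to keep the search finite) and tests pairwise distinctness of the velocities appearing in (2.82) for a given list of centers.

```python
import math

def omega(k):
    # Dispersion relation of Example 1: (1 + k^2)^(1/2).
    return math.sqrt(1.0 + k * k)

def group_velocity(k, theta):
    # theta * omega'(k) = theta * k / (1 + k^2)^(1/2), theta = +-1.
    return theta * k / omega(k)

def resonant_pairs(k_star, max_abs_lambda=7, tol=1e-12):
    # Integer lambda and zeta = +-1 with omega(k_star) * lambda = zeta * omega(lambda * k_star).
    return [
        (lam, zeta)
        for lam in range(-max_abs_lambda, max_abs_lambda + 1)
        for zeta in (-1, 1)
        if abs(omega(k_star) * lam - zeta * omega(lam * k_star)) < tol
    ]

def velocities_distinct(centers, tol=1e-12):
    # Pairwise distinctness of the velocities theta * k / (1 + k^2)^(1/2), as in (2.82).
    v = [group_velocity(k, theta) for k in centers for theta in (1, -1)]
    return all(abs(a - b) > tol for i, a in enumerate(v) for b in v[i + 1:])

print(resonant_pairs(0.7))               # only lambda = zeta survives
print(velocities_distinct([0.5, 1.5]))   # distinct nonzero centers
print(velocities_distinct([0.0, 1.0]))   # a zero center gives coinciding velocities
```

The first call returns only the pairs with $\lambda = \zeta$, since squaring the equation gives $\lambda^2 = 1$; the last call fails because at $k = 0$ the two velocities $\pm k/(1+k^2)^{1/2}$ coincide.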
Equation (2.77) can be written in the integral form (3.3) with $m_F = \infty$ and, by Theorem 5.4, it has a unique solution $U$ for $\tau \le \tau_*$. If we replace $F(U)$ in (2.80) by $F_0(U)$, we obtain
$$\partial_\tau V_1 = \frac{1}{\beta^2}\,AV_2, \qquad \partial_\tau V_2 = -\frac{1}{\beta^2}\,AV_1 + A^{-1}qV_1^3, \qquad (2.84)$$
where we take the initial data to be as in (2.78), namely
$$V_1(0) = h_0, \qquad V_2(0) = h_1. \qquad (2.85)$$
Equations (2.84) can be obtained by replacing $\sin u$ in (2.73) by the cubic polynomial $u - u^3/6$, producing the quasilinear Klein–Gordon equation (see [36]). Observe
that the solutions to the sine-Gordon and the Klein–Gordon equations with small initial data are very close. To see that, note that the operator $\widehat{f(U)}(k)$ is bounded for $\hat U$ which are bounded in $L_1$. Therefore the norm in $L_1$ of the neglected term for $\hat U(k)$ is small, namely $\|\beta^2\widehat{f(U)}\|_{L_1} \le C\beta^2$. Thus, by Remark 4.8, the solutions of (2.77) and (2.84) are close, namely
$$\|U_1 - V_1\|_{L_\infty} + \|U_2 - V_2\|_{L_\infty} \le C\beta^2, \qquad 0 \le \tau \le \tau_*. \qquad (2.86)$$
According to Theorem 2.19 the superposition principle is applicable to Eq. (2.84) with initial data as in (2.85), and the following statements hold.

Theorem 2.20 (Superposition for Klein–Gordon). Assume that the initial data $h_0$, $h_1$ in (2.85) are as in (2.79). Then the solution $\{V_1, V_2\}$ to the system (2.84) satisfies the linear superposition principle, namely
$$V_1(r, \tau) = \sum_{\vartheta=\pm}\sum_{l=1}^{n_h} V_{1,\vartheta,l}(r, \tau) + D_1(r, \tau), \qquad V_2(r, \tau) = \sum_{\vartheta=\pm}\sum_{l=1}^{n_h} V_{2,\vartheta,l}(r, \tau) + D_2(r, \tau), \qquad (2.87)$$
where $\{V_{1,\vartheta,l}(r, \tau), V_{2,\vartheta,l}(r, \tau)\}$ is a solution to (2.84) with the one-wavepacket initial condition
$$\begin{pmatrix} V_{1,\vartheta,l}(r, 0) \\ V_{2,\vartheta,l}(r, 0) \end{pmatrix} = \Phi_{\vartheta,l}(\beta r - r_l)\,g_\vartheta\,e^{\mathrm{i}k_{*l}\cdot r} + cc, \qquad (2.88)$$
where $\Phi_{\vartheta,l}(r)$ are arbitrary Schwartz functions. If (2.83) holds, the coupling terms $D_1$, $D_2$ satisfy the bound
$$\sup_{0\le\tau\le\tau_*}\,[\|D_1(\tau)\|_{L_\infty} + \|D_2(\tau)\|_{L_\infty}] \le C_\delta\,\frac{\varrho}{\beta^{1+\delta}} = C_\delta\,\beta^{1-\delta}, \qquad (2.89)$$
where $\tau_*$ and $C_\delta$ do not depend on $\beta$, and $\delta$ can be taken arbitrarily small.

Using (2.86) we obtain a similar superposition theorem for the sine-Gordon equation.

Theorem 2.21 (Superposition for Sine-Gordon). Assume that the initial data $h_0$, $h_1$ in (2.78) are as in (2.79). Then the solution $\{U_1, U_2\}$ to (2.77), (2.78) satisfies the linear superposition principle, namely
$$U_1(r, \tau) = \sum_{\vartheta=\pm}\sum_{l=1}^{n_h} U_{1,\vartheta,l}(r, \tau) + D_1(r, \tau), \qquad U_2(r, \tau) = \sum_{\vartheta=\pm}\sum_{l=1}^{n_h} U_{2,\vartheta,l}(r, \tau) + D_2(r, \tau),$$
where $\{U_{1,\vartheta,l}(r, \tau), U_{2,\vartheta,l}(r, \tau)\}$ is a solution of (2.77) with the one-wavepacket initial condition
$$\begin{pmatrix} U_{1,\vartheta,l}(r, 0) \\ U_{2,\vartheta,l}(r, 0) \end{pmatrix} = \Phi_{\vartheta,l}(\beta r - r_l)\,g_\vartheta\,e^{\mathrm{i}k_{*l}\cdot r} + cc, \qquad \vartheta = \pm,$$
where $\Phi_{\vartheta,l}(r)$ are arbitrary Schwartz functions. If (2.83) holds, the coupling terms $D_1$, $D_2$ satisfy the bound (2.89).

Note that a theorem completely similar to Theorem 2.20 holds also for a generalized Klein–Gordon equation where $qV_1^3$ is replaced by an arbitrary polynomial $P(V_1)$. Hence, the superposition principle holds for the sine-Gordon equation (2.73) with small initial data and a strongly perturbed nonlinearity as, for example, when $\sin u$ is replaced by $\sin u + \beta^{-1}u^4 + \beta^{-2}u^5$.

We would like to compare now our results and methods with those of [38], where the interaction of counterpropagating waves is studied by the ansatz method. Pierce and Wayne considered in [38] the sine-Gordon equation in the case of small initial data which have the form of a bimodal wavepacket. In our notation it corresponds to the case when $\varrho = \beta^2$, $n_h = 1$ in (2.79), when two wavepackets, corresponding to $\vartheta = +$ and $\vartheta = -$, have exactly opposite group velocities. They proved that the bimodal wavepacket data generate two waves which are described by two uncoupled nonlinear Schrödinger equations with a small error. The magnitude of the error given in [38] (which we formulate here for the solution $U_1$ of the rescaled equation (2.76)) is estimated by $C\beta^{1/2}$ on the time interval $0 \le \tau \le \tau_0$ (or $0 \le t \le \tau_0\beta^{-2}$). Note that our general Theorem 2.19, when applied to the special case of the sine-Gordon equation (2.76), provides a better estimate of the coupling error, namely $C\varrho/\beta^{1+\delta} = C\beta^{1-\delta}$ in (2.89) with arbitrarily small $\delta$, for the same time interval. Notice that the estimate (2.72) given in Theorem 2.19 is almost optimal, since it is possible to construct examples when the coupling error is greater than $c\varrho/\beta^{1+\delta}$ with arbitrarily small $\delta$.
We would like to point out that the general mechanism responsible for the wavepacket decoupling is the destructive wave interference, this mechanism is subtle though general. We treat the destructive wave interference by taking into account explicitly all nonlinear interactions of high-frequency waves. In our approach, we use the exact representation of a general solution in the form of a functional-analytic operator monomial series, every term of the series is explicitly given as a multilinear oscillatory integral operator applied to the initial data. A key advantage of such an approach is that it allows to estimate wavepacket coupling as a sum of contributions of highly oscillatory terms and to get a precise estimate of magnitude of every term. In contrast, the well-known “ansatz” approach as, for instance, in [38, 32], requires to find a clever ansatz with consequent estimations of the “residuum” in an appropriate norm. Our approach can naturally treat general tensorial polynomial nonlinearities F of arbitrary large degree NF and any number of wavepackets, whereas finding a good ansatz which allows to estimate the residuum in such a
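general situation would be difficult. For readers interested in detailed features of one-wavepacket solutions to the sine-Gordon equations, we refer to [32, 38, 39].

Example 2. Nonlinear Schrödinger Equation. The nonlinear Schrödinger equation (NLS) with $d$ spatial variables [42, 16, 15] has the form
$$\partial_\tau z(r, \tau) = \mathrm{i}\,\frac{1}{\varrho}\,\gamma(-\mathrm{i}\nabla)z(r, \tau) + \alpha|z|^2 z(r, \tau), \qquad z(r, 0) = h(r), \qquad r \in \mathbb{R}^d, \qquad (2.90)$$
where $\alpha$ is a complex constant, $\gamma(-\mathrm{i}\nabla)$ is a second-order differential operator, and its symbol $\gamma(k)$ is a real, symmetric quadratic form
$$\gamma(k) = \gamma(k, k) = \sum_{i,j}\gamma_{ij}k_i k_j, \qquad \gamma(-\mathrm{i}\nabla)z = -\sum_{i,j}\gamma_{ij}\,\partial_{r_i}\partial_{r_j}z.$$
To put the NLS into the framework of this paper, we introduce the following two-component system
$$\partial_\tau z_+(r, \tau) = \mathrm{i}\,\frac{1}{\varrho}\,\gamma(-\mathrm{i}\nabla)z_+(r, \tau) + \alpha z_- z_+^2(r, \tau),$$
$$\partial_\tau z_-(r, \tau) = -\mathrm{i}\,\frac{1}{\varrho}\,\gamma(\mathrm{i}\nabla)z_-(r, \tau) + \alpha^* z_+ z_-^2(r, \tau), \qquad (2.91)$$
$$z_+(r, 0) = h(r), \qquad z_-(r, 0) = h^*(r), \qquad r \in \mathbb{R}^d,$$
where $\alpha^*$ denotes the complex conjugate of $\alpha$. Obviously, if $z(r, \tau)$ is a solution of (2.90) then $z_+(r, \tau) = z(r, \tau)$, $z_-(r, \tau) = z^*(r, \tau)$ gives a solution of (2.91). Using the Fourier transform we get from (2.90)
$$\partial_\tau \hat z(k, \tau) = \mathrm{i}\,\frac{1}{\varrho}\,\gamma(k)\hat z(k, \tau) + \alpha\,\widehat{(z^* z^2)}(k, \tau), \qquad k \in \mathbb{R}^d. \qquad (2.92)$$
Now the band-crossing set is $\sigma = \{k \in \mathbb{R}^d : \gamma(k) = 0\}$. We assume that the quadratic form $\gamma$ is not identically zero. The Fourier transform of (2.91) takes the form of (2.67) with
$$\hat U = \begin{pmatrix} \hat U_+ \\ \hat U_- \end{pmatrix}, \qquad L(k)\hat U = \begin{pmatrix} \gamma(k) & 0 \\ 0 & -\gamma(-k) \end{pmatrix}\begin{pmatrix} \hat U_+ \\ \hat U_- \end{pmatrix}, \qquad \omega(k) = |\gamma(k)|,$$
$$\hat F^{(3)}(\hat U^3) = \begin{pmatrix} \alpha\,\widehat{\left(z_+(\hat U)\,z_+(\hat U)\,z_-(\hat U)\right)} \\ \alpha^*\,\widehat{\left(z_-(\hat U)\,z_-(\hat U)\,z_+(\hat U)\right)} \end{pmatrix}.$$
To satisfy the requirements of Definition 2.14 we have to take the wave vectors $k_{*l} \notin \sigma$ so that
$$\nabla|\gamma(k_{*l})| = \frac{2\gamma(k_{*l})}{|\gamma(k_{*l})|}\,\gamma(k_{*l}, \cdot) \ne \frac{2\gamma(k_{*l'})}{|\gamma(k_{*l'})|}\,\gamma(k_{*l'}, \cdot) \quad \text{if } l \ne l', \qquad (2.93)$$
which provides (2.41). Since
$$|\gamma(k_{*l})|\lambda - \zeta|\gamma(\lambda k_{*l})| = |\gamma(k_{*l})|\,[\lambda - \zeta|\lambda|^2],$$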
and $\lambda$ is odd, every point $k_{*l} \notin \sigma$ satisfies Condition 2.13. If the quadratic form $\gamma$ is not singular, that is $\det\gamma \ne 0$, then condition (2.93), which ensures that group velocities of wavepackets are different, holds when
$$\frac{\gamma(k_{*l})}{|\gamma(k_{*l})|}\,k_{*l} \ne \frac{\gamma(k_{*l'})}{|\gamma(k_{*l'})|}\,k_{*l'} \quad \text{if } l \ne l'.$$
In this case Theorem 2.19 is applicable, and generic wavepacket solutions of the NLS are linearly superposed and propagate almost independently with coupling $O(\beta)$. More precisely, as a corollary of Theorem 2.19 we obtain the following statement.

Theorem 2.22 (Superposition for NLS). Assume that the initial data of the NLS (2.90) have the form $h = h_1 + \cdots + h_{N_h}$,
$$h_l(r) = e^{\mathrm{i}k_{*l}\cdot r}\,\Phi_{l,+}(\beta r - r_0) + e^{-\mathrm{i}k_{*l}\cdot r}\,\Phi_{l,-}(\beta r - r_0), \qquad l = 1, \ldots, N_h,$$
where $\Phi_{l,\zeta}(r)$ are arbitrary Schwartz functions. Assume also that $\det\gamma \ne 0$ and the vectors $k_{*l}$ satisfy the conditions
$$\gamma(k_{*l}) \ne 0, \quad l = 1, \ldots, N_h; \qquad k_{*l} \ne k_{*l'} \quad \text{if } l \ne l'.$$
Then the solution $z = z(h)$ is a linear superposition
$$z(h) = z(h_1) + \cdots + z(h_{N_h}) + D$$
with a small coupling term $D$,
$$\sup_{0\le\tau\le\tau_*}\|D(\tau)\|_{L_\infty(\mathbb{R}^d)} \le C_\delta\,\frac{\varrho}{\beta^{1+\delta}},$$
where $\delta > 0$ can be taken arbitrarily small.

We note in conclusion that the superposition principle reduces the dynamics of multi-wavepacket solutions to the dynamics of single-wavepacket solutions; we do not study the dynamics of single-wavepacket solutions in this paper. Note that the theory of NLS-type approximations of one-wavepacket solutions of hyperbolic PDE is well-developed, see [29, 30, 18, 40, 41, 5] and references therein. The relevance of different group velocities of wavepackets for the smallness of their interaction was noted in [29].
2.3. Generalizations

Note that in a degenerate case when the function $\omega_{n_l}(k)$ is linear in the direction of $k_*$, Eq. (2.42) for $\zeta = 1$ has many solutions for which $\theta \ne \pm 1$, and Condition 2.13 does not hold. It turns out that if Condition 2.13 for the dispersion relations $\omega_n(k)$ at $k_*$ is not satisfied, we can still prove our results under the following alternative condition. We consider here the case of PDE in the entire space $\mathbb{R}^d$ and $k \in \mathbb{R}^d$.
Condition 2.23 (Complete Degeneracy). The series (2.21) has only $\tilde F^{(m)}$ with odd $m$. The wavevectors $k_{*l}$ and functions $\omega_{n_l}(k)$, $l = 1, \ldots, N_h$, have the following three properties:

(i) There exists $\delta > 0$ such that for every $l_1 \ne l_2$, the following inequality holds:
$$|\nabla_k\omega_{n_{l_1}}(\nu_1 k_{*l_1}) - \nabla_k\omega_{n_{l_2}}(\nu_2 k_{*l_2})| \ge \delta, \qquad (2.94)$$
for any odd integers $\nu_1, \nu_2 = 1, 3, \ldots$.

(ii) There exists $\delta > 0$ such that $\nu k_{*l}$ does not get in a $\delta$-neighborhood of $\sigma$ for any odd integer $\nu$ and any $l = 1, \ldots, N_h$.

(iii) For any positive odd integer $\theta$, any $k_{*l}$ and any $n$, the following identities hold:
$$\nabla_k\omega_n(\theta k_{*l}) = \nabla_k\omega_n(k_{*l}), \qquad (2.95)$$
$$\omega_n(\theta k_{*l}) = \theta\,\omega_n(k_{*l}). \qquad (2.96)$$
A nontrivial example, where the above Condition 2.23 is satisfied, is given below. We give here a generalization of Definition 2.14.

Definition 2.24 (Generic Multi-Wavepackets). A multi-wavepacket $\hat h$ as defined in Definition 2.10 is called generic if (i) the centers $k_{*l}$, $l = 1, \ldots, N_h$, of all wavepackets satisfy Conditions 2.11 and 2.12; (ii) the dispersion relations $\omega_n(k)$ at every $k_{*l}$ and band $n_l$ satisfy either Condition 2.13 or Condition 2.23.

The statement of Theorem 2.19 remains true if Definition 2.14 is replaced by the less restrictive Definition 2.24, namely the following theorem holds.

Theorem 2.25. Let the initial data of the modal evolution equation (2.61) be a multi-wavepacket, i.e. the sum of $N_h$ wavepackets $\hat h_l$ as in (2.45) satisfying Definitions 2.9 and 2.10. Suppose that (2.46) holds. Assume also that $\hat h$ is generic in the sense of Definition 2.24. Then the solution $\hat U = G(\hat h)$ to the modal evolution equation (2.61) satisfies the approximate linear superposition principle, namely (2.69)–(2.72) hold.

The proofs we give in this paper apply directly to the more general Theorem 2.25.

Another generalization concerns the possibility of shifting the initial wavepackets independently. If the initial data involve parameters $r_l$ as in (2.79), it is possible to prove that $C_\epsilon$ in (2.48), (2.70) and (2.72) does not depend on $r_l \in \mathbb{R}^d$ if the functions $\Psi_{jl}$ are Schwartz functions. Most of the proofs remain the same, but several statements have to be modified, and we present the proofs in a subsequent paper.
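One more generalization concerns the smoothness of the initial data. It is possible to take initial data $h_l(r)$ with a finite smoothness rather than from the Schwartz class. Namely, consider weighted spaces $L_{1,a}$ with the norm
$$\|\hat v\|_{L_{1,a}} = \int_{\mathbb{R}^d}(1 + |k|)^a\,|\hat v(k)|\,dk, \qquad a \ge 0. \qquad (2.97)$$
Obviously, large $a$ corresponds to high smoothness of the inverse Fourier transform $v(r)$. Then if the functions $\hat h_{l,\zeta}(k)$ have the form (2.36) with $\hat h_\zeta(k) = \hat h_{l,\zeta}(k)$ from the class $L_{1,a}$, the inequality (2.70) can be replaced by
$$\sup_{0\le\tau\le\tau_*}\|\hat D(\tau)\|_{L_1} \le C_\epsilon\,\frac{\varrho}{\beta^{1+\epsilon}}\,|\ln\beta| + C\beta^s, \qquad (2.98)$$
where $s > 0$ and $\epsilon > 0$ have to satisfy the restriction $s < a\epsilon$. This generalization requires minor modifications in the proofs, and in conditions (2.33) and (2.34) $C_\epsilon\beta$ has to be replaced by $C\beta^s$. In particular, if $a = 1$, $\varrho = \beta^2$ and $s = 1/2$, the right-hand side of (2.98) can be estimated by $C_{\epsilon_1}\beta^{1/2-\epsilon_1}$ with arbitrarily small $\epsilon_1$. More generalizations which involve the structure of equations are discussed in Secs. 7.3 and 7.4.

Now we give an example where Condition 2.23 is applicable.

Example 3. Semilinear Wave Equation. Let us consider a semilinear wave equation with $d$ spatial variables
$$\partial_\tau^2 z(r, \tau) = \frac{1}{\varrho^2}\,\Delta z(r, \tau) + \frac{\alpha}{\varrho}\,\partial_{x_1}z^3(r, \tau), \qquad r \in \mathbb{R}^d, \qquad (2.99)$$
where $\Delta$ is the Laplace operator, $\alpha$ is an arbitrary complex constant, and $\varrho = \beta^2$. We introduce the operator $A = \sqrt{-\Delta}$, which is defined in terms of the Fourier transform and has the symbol $|k|$. We rewrite (2.99) in the form of a first-order system
$$\partial_\tau z(r, \tau) = \frac{1}{\varrho}\,Ap(r, \tau), \qquad \partial_\tau p(r, \tau) = -\frac{1}{\varrho}\,Az(r, \tau) + \alpha A^{-1}\partial_{x_1}z^3(r, \tau), \qquad r \in \mathbb{R}^d. \qquad (2.100)$$
The linear operator $A^{-1}\partial_{x_1}$ has the symbol $\frac{-\mathrm{i}k_1}{|k|}$; it is a zero-order operator. We rewrite (2.100) in the form of (2.67) where
$$U = \begin{pmatrix} z \\ p \end{pmatrix}, \qquad -\mathrm{i}L(-\mathrm{i}\nabla_r)U = \begin{pmatrix} 0 & A \\ -A & 0 \end{pmatrix}\begin{pmatrix} z \\ p \end{pmatrix}, \qquad F\begin{pmatrix} z \\ p \end{pmatrix} = \alpha\begin{pmatrix} 0 \\ A^{-1}\partial_{x_1}z^3 \end{pmatrix}.$$
Using the Fourier transform, we get (2.61) with
$$\hat U = \begin{pmatrix} \hat z \\ \hat p \end{pmatrix}, \qquad -\mathrm{i}L(k)\hat U = \begin{pmatrix} 0 & |k| \\ -|k| & 0 \end{pmatrix}\begin{pmatrix} \hat z \\ \hat p \end{pmatrix}, \qquad \hat F^{(3)}(\hat U^3) = \frac{-\mathrm{i}\alpha k_1}{|k|}\begin{pmatrix} 0 \\ \widehat{(z^3)} \end{pmatrix},$$
$$\widehat{(z^3)}(k) = \frac{1}{(2\pi)^{2d}}\int_{k', k'' \in \mathbb{R}^{2d};\ k' + k'' + k''' = k}\hat z(k')\,\hat z(k'')\,\hat z(k''')\,dk'\,dk''.$$
Since the factor $\frac{k_1}{|k|}$ is uniformly bounded and smooth for $|k| \ne 0$, conditions (2.26) and (2.28) are satisfied. The eigenvalues and corresponding eigenvectors of $L$ are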
given explicitly:
\[ \omega_+(k) = |k|, \quad \omega_-(k) = -|k|, \quad g_+(k) = 2^{-1/2} \begin{pmatrix} -i \\ 1 \end{pmatrix}, \quad g_-(k) = 2^{-1/2} \begin{pmatrix} i \\ 1 \end{pmatrix}. \tag{2.101} \]
Since the matrix $L(k)$ is Hermitian, Condition 2.5 is satisfied. The singular set $\sigma$ consists of the single point $k = 0$. Note that the conclusions of Theorem 2.19 are applicable to Eq. (2.100) and consequently to (2.99). For instance, we take the initial data for (2.100) in the form (2.79):
\[ z(r, 0) = h_0, \quad p(r, 0) = h_1, \quad h_j = \sum_{l=1}^{n_h} \Psi_{jl}(\beta r - r_l) e^{i k_{*l} \cdot r} + cc, \quad j = 0, 1, \tag{2.102} \]
where $\Psi_{0l}(r)$, $\Psi_{1l}(r)$ are arbitrary Schwartz functions, and cc means complex conjugate to the preceding terms. The points $r_l$ are arbitrary. Note that the terms corresponding to $k_{*l}$ can be written using the basis (2.101) as
\[ \begin{pmatrix} \Psi_{0l} \\ \Psi_{1l} \end{pmatrix} e^{i k_{*l} \cdot r} = [\Phi_{+,l} g_+ + \Phi_{-,l} g_-] e^{i k_{*l} \cdot r}. \tag{2.103} \]
In this case all requirements of Definition 2.9 are fulfilled. The number of initial wavepackets for the first-order system (2.100) corresponding to initial data (2.102) equals $N_h = 2 n_h$ and there are $2 N_h$ wavepacket centers $\vartheta k_{*l}$, $\vartheta = \pm$. To satisfy the requirements of Condition 2.14 we have to take the wave vectors $k_{*l} \ne 0$ so that
\[ \vartheta' \frac{k_{*l'}}{|k_{*l'}|} \ne \vartheta \frac{k_{*l}}{|k_{*l}|} \quad \text{if } l' \ne l \text{ or } \vartheta \ne \vartheta', \]
which provides (2.41). Since $|k_{*l}| \lambda - \zeta |\lambda k_{*l}| = |k_{*l}| (\lambda - \zeta |\lambda|)$, Eq. (2.42) has solutions $\lambda = \zeta$, and hence no point $k_{*l}$ satisfies Condition 2.13. This is a property of the very special, purely homogeneous $\omega(k) = |k|$. Checking the second alternative, namely Condition 2.23, we observe that
\[ \nabla_k |\nu k_{*l}| = \frac{\nu k_{*l}}{|\nu k_{*l}|} = \frac{\nu}{|\nu|} \frac{k_{*l}}{|k_{*l}|}. \]
Hence, if
\[ \vartheta \frac{k_{*l}}{|k_{*l}|} \ne \vartheta' \frac{k_{*l'}}{|k_{*l'}|} \quad \text{for } l \ne l' \text{ or } \vartheta \ne \vartheta', \quad \text{and if } k_{*l} \ne 0, \tag{2.104} \]
then Condition 2.23 is satisfied and the Superposition Theorem 2.19 is applicable. As a corollary of Theorem 2.19 applied to (2.99), we obtain that if the initial data for (2.99) equal the sum of wavepackets, then the solution equals the sum of the separate solutions plus a small remainder. More precisely, we have the following theorem.
Theorem 2.26 (Superposition Principle for Wave Equation). Assume that the initial data for (2.100) are a multi-wavepacket of the form (2.102) and that (2.46) holds. Then the solution $z(r, \tau)$, $p(r, \tau)$ to (2.100), (2.102) satisfies the superposition principle, namely
\[ z(r, \tau) = \sum_{\vartheta = \pm} \sum_{l=1}^{n_h} z_{\vartheta,l}(r, \tau) + D_1(r, \tau), \quad p(r, \tau) = \sum_{\vartheta = \pm} \sum_{l=1}^{n_h} p_{\vartheta,l}(r, \tau) + D_2(r, \tau), \]
where $z_{\vartheta,l}(r, \tau)$, $p_{\vartheta,l}(r, \tau)$ is a solution of (2.100) with the initial condition
\[ \begin{pmatrix} z_{\vartheta,l}(r, 0) \\ p_{\vartheta,l}(r, 0) \end{pmatrix} = \Phi_{\vartheta,l}(\beta r - r_l) g_\vartheta e^{i k_{*l} \cdot r} + cc, \tag{2.105} \]
with $\Phi_{\vartheta,l}(r)$ being arbitrary Schwartz functions. If (2.104) holds, the coupling terms $D_1$ and $D_2$ satisfy the bound
\[ \sup_{0 \le \tau \le \tau_*} [\|D_1(\tau)\|_{L_\infty} + \|D_2(\tau)\|_{L_\infty}] \le C_\delta \frac{\varrho}{\beta^{1+\delta}}, \tag{2.106} \]
where $\tau_*$ and $C_\delta$ do not depend on $\beta$, and $\delta$ can be taken arbitrarily small.

In the following sections, we introduce concepts and develop analytic tools that allow us to prove the approximate linear superposition principle as stated in Theorems 2.15, 2.19 and 2.25.

3. Reduced Evolution Equation

Since the properties of the evolution equations (2.3) and (2.61) are very similar, we consider here in detail the lattice evolution equation (2.3), with the understanding that all the statements apply to the PDE (2.61) if we replace $\tilde U$ with $\hat U$, $[-\pi, \pi]^d$ with $\mathbb{R}^d$, the function space $L_1 = L_1([-\pi, \pi]^d)$ with $L_1 = L_1(\mathbb{R}^d)$ and so on. First, using the variation of constants formula, we recast the modal evolution equation (2.3) into the following equivalent integral form:
\[ \tilde U(k, \tau) = \int_0^\tau e^{-\frac{i(\tau - \tau')}{\varrho} L(k)} \tilde F(\tilde U)(k, \tau') \, d\tau' + e^{-\frac{i\tau}{\varrho} L(k)} \tilde h(k), \quad \tau \ge 0. \tag{3.1} \]
Then we introduce for $\tilde U(k, \tau)$ its two-time-scale representation (with respectively slow and fast times $\tau$ and $t = \tau / \varrho$)
\[ \tilde U(k, \tau) = e^{-\frac{i\tau}{\varrho} L(k)} \tilde u(k, \tau), \quad \tilde U_{n,\zeta}(k, \tau) = \tilde u_{n,\zeta}(k, \tau) e^{-\frac{i\tau}{\varrho} \zeta \omega_n(k)}, \tag{3.2} \]
where $\tilde u_{n,\zeta}(k, \tau)$ are the modal coefficients of $\tilde u(k, \tau)$ (see (2.18)); note that $\tilde u_{n,\zeta}(k, \tau)$ may depend on $\varrho$, therefore (3.2) is just a change of variables. Consequently we obtain the following reduced evolution equation for $\tilde u = \tilde u(k, \tau)$, $\tau \ge 0$:
\[ \tilde u(k, \tau) = F(\tilde u)(k, \tau) + \tilde h(k), \quad F(\tilde u) = \sum_{m=2}^{m_F} F^{(m)}(\tilde u^m(k, \tau)), \tag{3.3} \]
\[ F^{(m)}(\tilde u^m)(k, \tau) = \int_0^\tau e^{\frac{i\tau'}{\varrho} L(k)} \tilde F^{(m)}\big((e^{-\frac{i\tau'}{\varrho} L(\cdot)} \tilde u)^m\big)(k, \tau') \, d\tau', \tag{3.4} \]
where the quantities $\tilde F^{(m)}$ are defined by (2.21) and (2.22) in terms of the susceptibilities $\chi^{(m)}$. The norm of the oscillatory integral $F^{(m)}$ in (3.4) is estimated in terms of the norm of the tensor $\chi^{(m)}(k, \vec k)$ defined in (2.26) and (2.27). The operator $F^{(m)}$ is shown to be a bounded one from $(E)^m$ into $E$; see Lemma 5.1 for details. The proof of this property is based on the following Young inequality for the convolution:
\[ \|\tilde u * \tilde v\|_{L_1} \le \|\tilde u\|_{L_1} \|\tilde v\|_{L_1}. \tag{3.5} \]
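The inequality (3.5) can be checked numerically in a discrete setting. The following is only an illustrative sketch: it replaces the integral convolution by a finite discrete convolution of complex sequences, and the helper names `convolve` and `l1_norm` are introduced here for illustration and are not part of the paper.

```python
import random

def convolve(u, v):
    """Full discrete convolution of two finite complex sequences."""
    out = [0j] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            out[i + j] += a * b
    return out

def l1_norm(u):
    """Discrete analogue of the L_1 norm: sum of absolute values."""
    return sum(abs(a) for a in u)

# Random complex test sequences (an assumption made for this sketch).
random.seed(0)
u = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(16)]
v = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(16)]

lhs = l1_norm(convolve(u, v))   # ||u * v||_1
rhs = l1_norm(u) * l1_norm(v)   # ||u||_1 ||v||_1
print(lhs <= rhs)
```

The bound follows from the triangle inequality applied to every entry of the convolution, which is the same mechanism that makes the operators $F^{(m)}$ bounded in the $L_1$-based spaces.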
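For a detailed analysis of solutions of (3.3) we recast Eq. (3.3) for $\tilde u(k, \tau)$ using projections (2.19) as the following expanded reduced evolution equation
\[ \tilde u_{n,\zeta}(k, \tau) = \sum_{m=2}^{\infty} \sum_{\vec n, \vec \zeta} F^{(m)}_{n,\zeta,\vec n,\vec \zeta}(\tilde u^m)(k, \tau) + \tilde h_{n,\zeta}(k), \quad \tau \ge 0, \tag{3.6} \]
for the modal coefficient $\tilde u_{n,\zeta}(k, \tau)$. In the above formula and elsewhere, we use the notations
\[ \vec n = (n', \ldots, n^{(m)}), \quad \vec \zeta = (\zeta', \ldots, \zeta^{(m)}), \quad \vec k = (k', \ldots, k^{(m)}). \tag{3.7} \]
The operators $F^{(m)}_{n,\zeta,\vec n,\vec \zeta}$ are $m$-linear oscillatory integral operators defined by the formulas
\[ F^{(m)}_{n,\zeta,\vec n,\vec \zeta}(\tilde u_1 \cdots \tilde u_m)(k, \tau) = \int_0^\tau \int_{D_m} \exp\Big\{ i \phi_{n,\zeta,\vec n,\vec \zeta}(k, \vec k) \frac{\tau_1}{\varrho} \Big\} \chi^{(m)}_{n,\zeta,\vec n,\vec \zeta}(k, \vec k) [\tilde u_1(k', \tau_1), \ldots, \tilde u_m(k^{(m)}(k, \vec k), \tau_1)] \, \tilde d^{(m-1)d} \vec k \, d\tau_1, \tag{3.8} \]
where we use notations (2.23)–(2.25). In (3.8), the interaction phase function $\phi$ is defined by
\[ \phi_{n,\zeta,\vec n,\vec \zeta}(k, \vec k) = \zeta \omega_n(k) - \zeta' \omega_{n'}(k') - \cdots - \zeta^{(m)} \omega_{n^{(m)}}(k^{(m)}), \quad k^{(m)} = k^{(m)}(k, \vec k), \tag{3.9} \]
and the susceptibilities $\chi^{(m)}_{n,\zeta,\vec n,\vec \zeta}(k, \vec k)$ are $m$-linear symmetric tensors (i.e. mappings from $(\mathbb{C}^{2J})^m$ into $\mathbb{C}^{2J}$) defined for almost all $k$, $\vec k$ by the following formula:
\[ \chi^{(m)}_{n,\zeta,\vec n,\vec \zeta}(k, \vec k) [\tilde u_1(k'), \ldots, \tilde u_m(k^{(m)})] = \Pi_{n,\zeta}(k) \chi^{(m)}(k, \vec k) [\Pi_{n',\zeta'}(k') \tilde u_1(k'), \ldots, \Pi_{n^{(m)},\zeta^{(m)}}(k^{(m)}(k, \vec k)) \tilde u_m(k^{(m)}(k, \vec k))]. \tag{3.10} \]
For the lattice equation, $\chi^{(m)}_{n,\zeta,\vec n,\vec \zeta}(k, \vec k)$ is $2\pi$-periodic with respect to every variable $k, k', \ldots, k^{(m)}$. Note that the operators $F^{(m)}(\tilde u^m)$ in (3.3) can be rewritten using (3.8) as
\[ F^{(m)}(\tilde u^m) = \sum_{n,\zeta} \sum_{\vec n, \vec \zeta} F^{(m)}_{n,\zeta,\vec n,\vec \zeta}(\tilde u^m). \tag{3.11} \]
We also call the operators $F^{(m)}_{n,\zeta,\vec n,\vec \zeta}$ decorated operators.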
Remark 3.1. The expanded reduced evolution equation (3.6) is instrumental to the nonlinear analysis. Its very form, a convergent series of multilinear forms which are oscillatory integrals (3.8), is already a significant step in the analysis of the solution, accomplishing several tasks: (i) it suggests a constructive representation for the solution; (ii) every term $F^{(m)}_{n,\zeta,\vec n,\vec \zeta}$ can be naturally interpreted as a nonlinear interaction of the underlying linear modes; (iii) the representation of $F^{(m)}_{n,\zeta,\vec n,\vec \zeta}$ as the oscillatory integral (3.8) involving the interaction phase $\phi_{n,\zeta,\vec n,\vec \zeta}$ and the susceptibilities $\chi^{(m)}_{n,\zeta,\vec n,\vec \zeta}(k, \vec k)$ directly relates $F^{(m)}_{n,\zeta,\vec n,\vec \zeta}$ to the terms of the original evolution equation as well as to physically significant quantities. We can also add that since we consider $\varrho \to 0$, the interaction phase function $\phi_{n,\zeta,\vec n,\vec \zeta}(k, \vec k)$ plays the decisive role in the analysis of nonlinear interactions of different modes.

The analysis of fundamental properties of the reduced evolution equation (3.6), including, in particular, the linear modal superposition principle, involves and combines the following three components: (i) the linear spectral theory component in the form of the modal decomposition of the solution and the introduction of wavepackets as elementary waves; (ii) the function-analytic component, which deals with the structure of series similar to the one in (3.6) and its dependence on the nonlinearity of the original evolution equation; (iii) the asymptotic analysis of the oscillatory integrals (3.8), which allows one to estimate the magnitude of nonlinear interactions between different modes and, in particular, to show that generically different modes almost do not interact, leading to the superposition principle.

Sometimes it is convenient to rewrite (3.8) in a slightly different form. The convolution integral (3.8) according to (2.25) involves the following phase matching condition:
\[ k' + \cdots + k^{(m)} = k. \tag{3.12} \]
Using the following notation for the integral over the plane (3.12)
\[ \int_{k', \ldots, k^{(m-1)} \in [-\pi,\pi]^{(m-1)d};\; k' + \cdots + k^{(m)} = k} f(k, \vec k) \, dk' \cdots dk^{(m-1)} = \int_{[-\pi,\pi]^{md}} f(k, \vec k) \delta(k - k' - \cdots - k^{(m)}) \, dk' \cdots dk^{(m)} \tag{3.13} \]
in terms of a delta-function, we can rewrite (3.8) in the form
\[ F^{(m)}_{n,\zeta,\vec n,\vec \zeta}(\tilde u_1 \cdots \tilde u_m)(k, \tau) = \frac{1}{(2\pi)^{(m-1)d}} \int_0^\tau \int_{[-\pi,\pi]^{md}} \exp\Big\{ i \phi_{n,\zeta,\vec n,\vec \zeta}(k, \vec k) \frac{\tau_1}{\varrho} \Big\} \delta(k - k' - \cdots - k^{(m)}) \chi^{(m)}_{n,\zeta,\vec n,\vec \zeta}(k, \vec k) \tilde u_{1,\zeta'}(k') \cdots \tilde u_{m,\zeta^{(m)}}(k^{(m)}) \, dk' \cdots dk^{(m)} \, d\tau_1. \tag{3.14} \]
4. Function-Analytic Operator Series

In this section the algebraic concepts required for the analysis are introduced. We study the reduced evolution equation (3.3) as a particular case of the following abstract nonlinear equation in a Banach space:
\[ u = F(u) + x, \quad F(u) = \sum_{s=2}^{\infty} F^{(s)}(u^s), \tag{4.1} \]
where the nonlinearity $F(u)$ is an analytic operator represented by a convergent operator series. It is well known (see [25]) that the solution $u = G(x)$ of such an equation can be represented as a convergent series in terms of $m$-linear operators $G^{(m)}$ which are constructed based on $F$:
\[ G(x) = G(F, x) = \sum_{m=1}^{\infty} G^{(m)}(x^m), \quad G^{(m)}(x^m) = G^{(m)}(F, x^m), \quad \text{where } x^m = \underbrace{x \cdots x}_{m \text{ times}}. \]
Using the multilinearity of $G^{(m)}$ we readily obtain the formula
\[ G(x_1 + \cdots + x_N) = \sum_{m=1}^{\infty} G^{(m)}((x_1 + \cdots + x_N)^m) = \sum_{m=1}^{\infty} G^{(m)}((x_1)^m) + \cdots + \sum_{m=1}^{\infty} G^{(m)}((x_N)^m) + G_{CI}(x_1, \ldots, x_N), \tag{4.2} \]
where $x = x_1 + \cdots + x_N$ represents a multi-wavepacket and $G_{CI}(x_1, \ldots, x_N)$ collects all "cross terms" and describes the "cross interaction" (CI) of the involved wavepackets $x_1, \ldots, x_N$. We will find in sufficient detail the dependence of the solution operators $G^{(m)}$ on the nonlinearity $F$ and prepare a basis for the subsequent estimation of nonlinear interactions between different modes and wavepackets. Then, combining the facts about the structure of the solution operators $G^{(m)}$ with asymptotic estimates of the relevant oscillatory integrals, we show that for a multi-wavepacket $x = x_1 + \cdots + x_N$ the cross interaction term satisfies the following estimate:
\[ G_{CI}(x_1, \ldots, x_N) = O(\beta) + O(\varrho |\ln \beta| / \beta^{1+\epsilon}), \quad \beta, \varrho \to 0, \]
implying the modal superposition principle.

4.1. Multilinear forms and polynomial operators

The analysis of nonlinear equations of the form (3.3) requires the use of appropriate Banach spaces of time dependent fields, as well as multilinear and analytic functions in those spaces. It also uses an appropriate version of the implicit function theorem. For the reader's convenience we collect in this section the known concepts and statements on the above-mentioned subjects needed for our analysis. In this section, we consider functional-analytic operators which are defined in a ball in a Banach
space $X$ with the norm $\|x\|_X$. In our treatment of analytic functions in infinite-dimensional Banach spaces we follow [25, Sec. 3] and [21].

Definition 4.1 (Polylinear Operator). Suppose that $x_1, x_2, \ldots, x_n$ are vectors in a Banach space $X$. Let a function $F^{(n)}(\vec x)$, $\vec x = (x_1, \ldots, x_n)$, take values in $X$ and be defined for all $\vec x \in X^n$. Such a function $F^{(n)}$ is called an $n$-linear operator if it is linear in each variable, and it is said to be bounded if the following norm is finite:
\[ \|F^{(n)}\| = \sup_{\|x_1\|_X = \cdots = \|x_n\|_X = 1} \|F^{(n)}(x_1 x_2 \cdots x_n)\|_X < \infty. \tag{4.3} \]

Definition 4.2 (Polynomial). A function $P(x)$ from $X$ to $X$ defined for all $x \in X$ is called a polynomial in $x$ of degree $n$ if for all $a, h \in X$ and all complex $\alpha$
\[ P(a + \alpha h) = \sum_{\nu=0}^{n} P_\nu(a, h) \alpha^\nu, \]
where $P_\nu(a, h) \in X$ are independent of $\alpha$. The degree of $P$ is exactly $n$ if $P_n(a, h)$ is not identically zero. A polynomial $F(x)$ is a homogeneous polynomial of degree $n$ if for all $c \in \mathbb{C}$
\[ F(cx) = c^n F(x). \]
Then $n$ is also called the homogeneity index of $F(x)$. A homogeneous polynomial $F$ is called bounded if its norm
\[ \|F\|_* = \sup_{\|x\|_X = 1} \{\|F(x)\|_X\} \tag{4.4} \]
is finite. For a given $n$-linear operator $F^{(n)}(\vec x) = F^{(n)}(x_1 x_2 \cdots x_n)$ we denote by $F^{(n)}(x^n)$ a homogeneous polynomial of degree $n$ from $X$ to $X$:
\[ F^{(n)}(x^n) = F^{(n)}(x \cdots x). \tag{4.5} \]
Note that the norm definitions (4.3)–(4.5) readily imply
\[ \|F^{(n)}\|_* \le \|F^{(n)}\|. \tag{4.6} \]

Definition 4.3 (Analyticity Class 1). Let a function $F$ be defined by the following convergent series:
\[ F(x) = \sum_{m=2}^{\infty} F^{(m)}(x^m) \quad \text{for } \|x\|_X < R_{*F}, \tag{4.7} \]
where $F^{(m)}(x^m)$, $m = 2, 3, \ldots$, is a sequence of bounded $m$-homogeneous polynomials satisfying
\[ \|F^{(m)}\|_* \le C_{*F} R_{*F}^{-m}, \quad m = 2, 3, \ldots. \tag{4.8} \]
Then we say that $F(x)$ belongs to the analyticity class $A_*(C_{*F}, R_{*F})$ and write $F \in A_*(C_{*F}, R_{*F})$.
Notice that for $\|x\|_X < R_{*F}$, we have
\[ \|F(x)\|_X \le C_{*F} \sum_{n=2}^{\infty} \|x\|_X^n R_{*F}^{-n} \le C_{*F} \frac{\|x\|_X^{2} R_{*F}^{-2}}{1 - \|x\|_X R_{*F}^{-1}}, \tag{4.9} \]
implying, in particular, the convergence of the series (4.7).

Definition 4.4 (Analyticity Class 2). If $F^{(m)}(\vec x)$, $m = 2, 3, \ldots$, is a sequence of bounded $m$-linear operators from $X^m$ to $X$ and
\[ \|F^{(m)}\| \le C_F R_F^{-m}, \quad m = 2, 3, \ldots, \tag{4.10} \]
we say that a function $F$ defined by the series (4.7) for $\|x\|_X < R_F$ belongs to the analyticity class $A(C_F, R_F)$ and write $F \in A(C_F, R_F)$.

In this paper we will use operators from the classes $A(C_F, R_F)$ based on multilinear operators. Note that evidently $A(C_F, R_F) \subset A_*(C_F, R_F)$. One can construct a polynomial based on a multilinear operator according to the formula (4.5). Conversely, the construction of a multilinear operator, called the polar form, based on a given homogeneous polynomial is described by the following statement; see [21, Secs. 1.1 and 1.3] and [25, Sec. 26.2].

Proposition 4.5 (Polar Form). For any homogeneous polynomial $P^{(n)}(x)$ of degree $n$, there is a unique symmetric $n$-linear operator $\tilde P^{(n)}(x_1 x_2 \cdots x_n)$, called the polar form of $P^{(n)}(x)$, such that $P^{(n)}(x) = \tilde P^{(n)}(x \cdots x)$. It is defined by the following polarization formula:
\[ \tilde P^{(n)}(x_1 x_2 \cdots x_n) = \frac{1}{2^n n!} \sum_{\xi_j = \pm 1} \xi_1 \cdots \xi_n \, P^{(n)}\Big( \sum_{j=1}^{n} \xi_j x_j \Big). \tag{4.11} \]
In addition to that, the following estimate holds:
\[ \|P^{(n)}\|_* \le \|\tilde P^{(n)}\| \le \frac{n^n}{n!} \|P^{(n)}\|_* \le e^n \|P^{(n)}\|_*. \tag{4.12} \]
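The polarization formula can be tested on a small finite-dimensional example. The sketch below assumes $X = \mathbb{R}^2$ and a hypothetical non-symmetric 3-linear operator `F3` chosen only for illustration; the sign-weighted sum over $\xi_j = \pm 1$ is the standard polarization identity, and the result is compared with the explicit symmetrization of `F3` over all permutations of its arguments.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

def F3(x, y, z):
    """A hypothetical non-symmetric 3-linear operator on R^2 (illustration only)."""
    return (x[0] * y[1] * z[1], x[1] * y[0] * z[0] + 2 * x[0] * y[0] * z[1])

def P(x):
    """Homogeneous polynomial of degree 3 built from F3 as in (4.5)."""
    return F3(x, x, x)

def polar_form(poly, xs):
    """Polar form via the sign-weighted sum over xi_j = +1 or -1."""
    n, dim = len(xs), len(xs[0])
    total = [Fraction(0)] * dim
    for signs in product((1, -1), repeat=n):
        point = tuple(sum(s * x[c] for s, x in zip(signs, xs)) for c in range(dim))
        weight = 1
        for s in signs:
            weight *= s                      # the factor xi_1 * ... * xi_n
        total = [t + weight * v for t, v in zip(total, poly(point))]
    scale = Fraction(1, 2 ** n * factorial(n))
    return tuple(scale * t for t in total)

def symmetrized(op, xs):
    """Average of op over all orderings of its arguments."""
    perms = list(permutations(xs))
    total = [Fraction(0)] * len(xs[0])
    for p in perms:
        total = [t + v for t, v in zip(total, op(*p))]
    return tuple(t / len(perms) for t in total)

xs = [(Fraction(1), Fraction(2)), (Fraction(-3), Fraction(5)), (Fraction(7), Fraction(-1))]
print(polar_form(P, xs) == symmetrized(F3, xs))
```

Exact rational arithmetic is used so that the comparison is an equality rather than a tolerance check; evaluating the polar form on the diagonal $x_1 = x_2 = x_3$ recovers $P$ itself.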
Since by Definition 4.4 functions from $A(C, R)$ have a zero of the second order at zero, their Lipschitz constant is small in a vicinity of zero. More exactly, the following statement holds.

Lemma 4.6 (Lipschitz Estimate). If $F \in A(C_F, R_F)$, then
\[ \|F(x) - F(y)\| \le C_F C \|x - y\| (\|x\| + \|y\|) \quad \text{for } \|x\|, \|y\| \le R'_F < R_F, \tag{4.13} \]
where $C > 0$ depends on $R'_F$ and $R_F$.
4.2. Implicit Function Theorem and expansion of operators into composition monomials

Here we provide a version of the Implicit Function Theorem. First we formulate the classical implicit function theorem for equations $u = F(u) + x$ with an analytic function $F$, and then we present a refined implicit function theorem. The refined implicit function theorem we prove here produces an expansion of the solution $u$ into a sum of terms which are multilinear not only with respect to $x$ but also with respect to $F$. The formulation of the theorem and the proof involve a convenient labeling of the terms of the expansion (called composition monomials), and we use properly introduced trees to this end. The explicit expansion produced by the refined implicit function theorem is required to take into account the rather subtle mechanisms which lead to the superposition principle.

Let us consider the abstract nonlinear equation (4.1) and its solution $u = u(x)$ for small $\|x\|$ when the nonlinear operator $F$ belongs to the class $A(C_F, R_F)$. We seek the solution $u$ in the following form:
\[ u = G(F, x) = \sum_{m=1}^{\infty} G^{(m)}(x^m) \quad \text{for sufficiently small } \|x\|, \tag{4.14} \]
and we call $G$ the solution operator for (4.1). It readily follows from (4.1) that
\[ G(F, x) = x + F(G(F, x)) \tag{4.15} \]
and
\[ \sum_{m=1}^{\infty} G^{(m)}(x^m) = x + \sum_{s=2}^{\infty} F^{(s)}\Big( \Big( \sum_{m=1}^{\infty} G^{(m)}(x^m) \Big)^s \Big). \tag{4.16} \]
From the above equation we can deduce recurrent formulas for the multilinear operators $G^{(m)}$. Indeed, for $m = 1$ the linear term is the identity operator
\[ G^{(1)}(x) = F^{(1)}(x) \equiv x. \tag{4.17} \]
For $m \ge 2$, we write the following recurrent formula:
\[ G^{(m)}(x_1 \cdots x_m) = \sum_{s=2}^{m} \sum_{i_1 + \cdots + i_s = m} F^{(s)}(G^{(i_1)}(x_1 \cdots x_{i_1}) \cdots G^{(i_s)}(x_{m - i_s + 1} \cdots x_m)). \tag{4.18} \]
By construction, if the multilinear operators $G^{(i)}$ are defined by (4.18), then (4.16) is satisfied. Namely, expanding the right-hand side of (4.16) using the multilinearity of $F^{(s)}$ we obtain a sum of expressions as in the right-hand side of (4.18), and since (4.18) holds, the terms in the left-hand side of (4.16) with a given homogeneity index $p$ cancel with the terms in the right-hand side with the same homogeneity. Note that in (4.18) we do not assume that the operators $F^{(s)}$ and $G^{(i)}$ are symmetrized, and the order of variables is important; we prefer to treat $F^{(s)}$ and $G^{(m)}$ as multilinear operators of $s$ and $m$ variables, respectively. However, when we apply the constructed $G^{(i)}$ to solve (4.1), we set $x_1 = \cdots = x_m$.
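The recursion (4.18) can be run explicitly in the scalar case, which is a simplification assumed here for illustration: $X = \mathbb{C}$ and $F^{(s)}(u_1 \cdots u_s) = f_s u_1 \cdots u_s$ with numbers $f_s$, so that each $G^{(m)}$ reduces to a single coefficient $g_m$. The function names below are introduced for this sketch only. With $f_s = 1$ for all $s \ge 2$, the coefficient $g_m$ simply counts the terms produced by fully expanding (4.18).

```python
def compositions(m, s):
    """Yield all ordered tuples (i_1, ..., i_s) of positive integers with sum m."""
    if s == 1:
        if m >= 1:
            yield (m,)
        return
    for first in range(1, m - s + 2):
        for rest in compositions(m - first, s - 1):
            yield (first,) + rest

def solution_coefficients(f, max_m):
    """Scalar version of (4.17)-(4.18): g_1 = 1 and
    g_m = sum_{s=2}^{m} f(s) * sum_{i_1+...+i_s=m} g_{i_1} * ... * g_{i_s}."""
    g = {1: 1}
    for m in range(2, max_m + 1):
        total = 0
        for s in range(2, m + 1):
            for comp in compositions(m, s):
                term = f(s)
                for i in comp:
                    term *= g[i]
                total += term
        g[m] = total
    return [g[m] for m in range(1, max_m + 1)]

# Purely quadratic nonlinearity u = x + u^2: only f_2 = 1 is non-zero.
quadratic = solution_coefficients(lambda s: 1 if s == 2 else 0, 5)

# All f_s = 1: g_m counts the terms in the full expansion of G^(m).
term_counts = solution_coefficients(lambda s: 1, 5)

print(quadratic)
print(term_counts)
```

For the quadratic case the coefficients are the Catalan numbers, in agreement with the explicit solution $u = (1 - \sqrt{1 - 4x})/2$ of $u = x + u^2$. The second list grows quickly with $m$, which is one reason a systematic labeling of the terms is useful.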
The following implicit function theorem holds (see [4] and Theorem 4.25 below with a similar proof).

Theorem 4.7 (Implicit Function Theorem). Let $F \in A(C_F, R_F)$. Then there exists a solution $u = x + G(F, x)$ of Eq. (4.1), $u = x + F(u)$, given by the solution operator $G \in A(C_G, R_G)$, where we can take
\[ C_G = \frac{R_F^2}{2(C_F + R_F)}, \quad R_G = \frac{R_F^2}{4(C_F + R_F)}, \tag{4.19} \]
and the series (4.14) converges for $\|x\|_X < R_G$. The multilinear operators $G^{(m)}(\vec x)$ satisfy the recursive relations (4.17) and (4.18).

Note that the uniqueness of the solution and the continuous dependence on parameters follow from Lemma 4.6 and from a standard observation which we formulate in the following remark.

Remark 4.8. If $u_1$, $u_2$ are two solutions of Eq. (4.1) with $x = x_1, x_2$ respectively, $\|u_1\|, \|u_2\| \le R$, and $F(u)$ is Lipschitz continuous for $\|u\| \le R$ with a Lipschitz constant $q < 1$, then $\|u_1 - u_2\| \le (1 - q)^{-1} \|x_1 - x_2\|$. If $u_1$, $u_2$ are two solutions of Eq. (4.1) with $F = F_0$ and $F = F_0 + F_1$ respectively, $\|u_1\|, \|u_2\| \le R$, $F(u)$ is Lipschitz continuous for $\|u\| \le R$ with a Lipschitz constant $q < 1$, and $\|F_1(u)\| \le \epsilon$ when $\|u\| \le R$, then $\|u_1 - u_2\| \le (1 - q)^{-1} \epsilon$.

Observe that every term $G^{(i_l)}$ in (4.18), in turn, can be recast as a sum (4.18) with $m$ replaced by $i_l < m$. Evidently, applying the recurrent representation (4.18) and the multilinearity of $F^{(s)}$, we can get a formula for $G^{(m)}$ as a sum of terms involving exclusively (i) the symbols $F^{(m)}$, (ii) the variables $x_j$ and (iii) parentheses. We will refer to the terms of such a formula as composition monomials. To be precise, we give below a formal recursive definition of composition monomials. The monomials are expressions which involve variables $u_j$, $j = 1, 2, \ldots$, and $m$-linear operators $F^{(m)}$, $m = 2, 3, \ldots$, and are constructed by induction as follows.

Definition 4.9 (Composition Monomials). Let $\{F^{(s)}\}_{s=2}^{\infty}$ be a sequence of $s$-linear operators which act on variables $u_j$, $j = 1, 2, \ldots$. A composition monomial $M$ of rank 0 is the identity operator, namely $M(u_j) = u_j$, and its homogeneity index is 1. A composition monomial $M$ of a non-zero rank $r \ge 1$ has the form
\[ M(u_{i_0} \cdots u_{i_s}) = F^{(s)}(M_1(u_{i_0} \cdots u_{i_1}) \cdots M_s(u_{i_{s-1}+1} \cdots u_{i_s})), \tag{4.20} \]
where $M_1(u_{i_0} \cdots u_{i_1})$, $M_2(u_{i_1+1} \cdots u_{i_2})$, $\ldots$, $M_s(u_{i_{s-1}+1} \cdots u_{i_s})$, with $1 \le i_0 < i_1 < \cdots < i_s$, are composition monomials of ranks not exceeding $r - 1$ (submonomials), at least one of them being of rank $r - 1$; the homogeneity index of $M_j$ equals $i_j - i_{j-1}$. For a composition monomial $M$ the operator $F^{(s)}$ in its representation (4.20) is called its root operator. The index of homogeneity of $M$ defined by (4.20) equals $i_s - i_0 + 1$. We call the labeling of the arguments of a composition monomial $M$ defined by (4.20) by consecutive integers the standard labeling if $i_0 = 1$.
If the monomials $M_1, \ldots, M_s$ have the respective homogeneity indices $\nu(M_i)$, then we readily get that the homogeneity index of the monomial $M$ satisfies the identity
\[ \nu(M) = \nu(M_1) + \cdots + \nu(M_s). \tag{4.21} \]
Using the formula (4.20) inductively we find that any composition monomial $M$ is given by a formula which involves symbols from the set $\{F^{(s)}\}_{s=2}^{\infty}$, arguments $u_i$ and parentheses, and if $s$-linear operators are substituted for $F^{(s)}$ we obtain the terms contained in the expansion of $G^{(m)}$.

Definition 4.10 (Incidence Number). The total number of symbols $F^{(q)}$ involved in $M$ is called the incidence number for $M$.

For instance, the expression of the form
\[ M = F^{(4)}(u_1 u_2 u_3 F^{(3)}(u_4 F^{(2)}(u_5 u_6) F^{(3)}(u_7 u_8 u_9))) \tag{4.22} \]
is an example of a composition monomial $M$ of rank 3, incidence number 4 and homogeneity index 9. It has three submonomials of non-zero rank. Namely, the first one is $F^{(3)}(u_4 F^{(2)}(u_5 u_6) F^{(3)}(u_7 u_8 u_9))$ of rank 2 and incidence number 3. The second submonomial $F^{(2)}(u_5 u_6)$ has rank 1 and incidence number 1, and the third one is $F^{(3)}(u_7 u_8 u_9)$ of rank 1 and incidence number 1. When analyzing the structure of composition monomials we use basic concepts and notation from graph theory, namely, nodes, trees and subtrees.

Definition 4.11 (Nodes, Tree, Subtree). A (finite) directed graph $T$ consists of nodes $N_i \in N_T$, where $N_T$ is the (finite) set of nodes of $T$, and a set of edges $N_i N_j \in N_T \times N_T$. An edge $N_i N_j$ connects $N_i$ with $N_j$; it is an outgoing edge of $N_i$ and an incoming edge of $N_j$. A tree (more precisely a rooted tree; we only consider rooted trees) is a directed connected graph which is cycle-free and has a selected root node, that is, a node $N_*$ which has no incoming edges. If a node $N$ has an outgoing edge $N N_j$, the node $N_j$ is called a child node of $N$; if a node $N$ has an incoming edge $N_j N$, the node $N_j$ is called the parent node of $N$. We denote the parent node of $N$ by $p(N)$. If a node does not have children, it is called an end node (or a leaf). For every node $N$, we denote by $\mu(N)$ the number of child nodes of the node $N$. If a path connects two nodes, we call the number of edges in the path its length. We denote by $l(N)$ the length of the path which connects $N_*$ with $N$. Every node $N$ of the tree $T$ can be taken as the root node of a subtree which involves all descendant nodes of $N$ and the connecting edges; we denote this maximal subtree by $T(N)$. Since we consider only maximal subtrees, we simply call them subtrees. The rank of a tree is the maximal length of a path from its root node to an end node; we denote it by $r(T)$. The rank of a node $N$ of the tree $T$ is the rank of the subtree $T(N)$.

Definition 4.12 (Tree Incidence Number and Homogeneity Index). For a tree $T$ we call the number of non-end nodes the incidence number $i = i(T)$. We denote the number of end nodes of the tree by $\nu(T)$ and call it the homogeneity index.
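The quantities in Definitions 4.11 and 4.12 are easy to compute recursively. The sketch below uses a representation chosen here for illustration only: an end node is the marker `LEAF`, and a non-end node is the list of its children, in order. The tree `tree_4_22` encodes the monomial (4.22).

```python
LEAF = "u"  # marker for an end node (a representation assumed for this sketch)

def rank(tree):
    """Maximal length of a path from the root to an end node."""
    if tree == LEAF:
        return 0
    return 1 + max(rank(child) for child in tree)

def incidence_number(tree):
    """Number of non-end nodes, i(T)."""
    if tree == LEAF:
        return 0
    return 1 + sum(incidence_number(child) for child in tree)

def homogeneity_index(tree):
    """Number of end nodes, nu(T)."""
    if tree == LEAF:
        return 1
    return sum(homogeneity_index(child) for child in tree)

# Tree of F(4)(u1 u2 u3 F(3)(u4 F(2)(u5 u6) F(3)(u7 u8 u9))), see (4.22).
tree_4_22 = [LEAF, LEAF, LEAF, [LEAF, [LEAF, LEAF], [LEAF, LEAF, LEAF]]]

print(rank(tree_4_22), incidence_number(tree_4_22), homogeneity_index(tree_4_22))
```

For this tree the three functions return the rank, incidence number and homogeneity index stated after (4.22), and the recursion for `homogeneity_index` is exactly the additivity identity (4.21).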
Elementary Properties of Trees. Since a tree does not have cycles, the path connecting two nodes on a tree is unique. The root node $N_*$ does not have a parent node, and since it is connected with every other node, every non-root node has a parent node. The end nodes have zero rank. The only node with rank $r(T)$ is the root node. The total number of nodes of a tree $T$ equals $\nu(T) + i(T)$.

Definition 4.13 (Ordered Tree). A tree is called an ordered tree if for every node $N$ all child nodes of $N$ are labeled by consecutive positive integers (which need not start from 1). Hence, for any node $N \ne N_*$ there is the parent node $N' = p(N)$ and the order number (label) $o(N)$, $i_1 \le o(N) \le i_1 + \mu(N') - 1$. Two trees are equal if there is a one-to-one mapping $\Theta$ between the nodes which preserves edges, maps the root node into the root node and preserves the order of children of every node up to a shift: if $p(N_1) = p(N_2)$, then $o(N_1) - o(N_2) = o(\Theta(N_1)) - o(\Theta(N_2))$. Since we use in this paper only ordered trees, we simply call them trees.

Standard Node Labeling and Ordering. We use the following way of labeling and ordering the end nodes of a given ordered tree $T$. Let $\hat r$ be the rank of $T$. For any end node $N$ we take the unique path $N_* N_1 \cdots N_{l(N)-1} N$ of length $l(N) \le \hat r$ connecting it to the root. Since the tree is ordered, every node $N_j$ in the path has an order number $o(N_j)$. These order numbers form a word $w(N)$ of length $l(N)$. If $l(N) < \hat r$ we complete $w(N)$ to the length $\hat r$ by adding several symbols $\infty$, assuming that $\infty > n$ for $n = 1, 2, \ldots$. After that we order the words $w(N)$ in the lexicographic order. We obtain the ordered list $w(N_1), \ldots, w(N_{\nu(T)})$. We take this ordering and labeling of the end nodes $N_1, \ldots, N_{\nu(T)}$ as the standard ordering and denote by $o_0(N)$ the consecutive number with respect to this labeling: $j = o_0(N_j)$.

To label the nodes with rank $r$ we delete all the nodes of rank less than $r$ together with their incoming edges, so that the nodes of rank $r$ become end nodes. We apply to them the described labeling and denote the indices obtained by $o_r(N)$. Hence, every node $N$ of the tree $T$ has two integer numbers assigned: $r(N)$ and $o_{r(N)}(N)$. We introduce the standard labeling of all nodes of $T$ by applying the lexicographic ordering to the pairs $(r(N), o_{r(N)}(N))$, and denote the corresponding number by $o(N)$, $1 \le o(N) \le \nu(T) + i(T)$. The following statement follows straightforwardly from the definition of the standard ordering.

Proposition 4.14. If a tree $T$ has a subtree $T'$ and the standard labeling of end nodes is used, then all the end nodes of the subtree $T'$ fill an interval $j_1 \le o_0(N) \le j_2$ for some $j_1$ and $j_2$.

Theorem 4.15. Let $T_2$ be the set of ordered trees such that each node of a tree which is not an end node has at least two child nodes. The set of composition monomials based on $\{F^{(s)}, s = 2, 3, \ldots\}$ is in one-to-one correspondence with the set $T_2$. The correspondence has the following properties. The monomials of rank $r$
correspond to trees of rank $r$. The root node of the tree $T$ corresponds to the root operator of the composition monomial. The end nodes correspond to the variables $u_j$, $j = 1, \ldots, \nu(T)$. The standard labeling of end nodes coincides with the consecutive labeling of the variables $u_j$ of the monomial from left to right. The homogeneity index of a monomial equals the homogeneity index of the corresponding tree. The incidence number of a monomial equals the incidence number of the tree, and the rank of a monomial equals the rank of the tree.

Proof. For a given $\{F^{(s)}\}$ the set of monomials with rank $r$ is finite, and the set of trees with rank $r$ is finite too. Therefore, to prove the one-to-one correspondence of the two sets it is sufficient to construct two one-to-one mappings, from the first set into the second and from the second into the first. First of all, using induction with respect to $r$, we construct for every monomial the corresponding tree. Let $r = 0$. A monomial of rank 0 has the form $u_1$, and it corresponds to a tree involving one node. The tree has no edges and the node is both the root and the end node; its incidence number is zero and its homogeneity index is one. Assume now that we have defined a tree for any monomial of rank not greater than $r - 1$. A monomial of rank $r$ has the form $F^{(m)}(M_1 \cdots M_m)$ where the monomials $M_1, \ldots, M_m$ have rank not greater than $r - 1$. The monomials $M_1, \ldots, M_m$ correspond to ordered trees $T_1, \ldots, T_m$ with the root nodes $N_{*1}, \ldots, N_{*m}$. We form the tree $T$ as a union of the nodes of $T_1, \ldots, T_m$ and add one more node $N_*$ which corresponds to the root operator $F^{(m)}$ and becomes the root node of $T$. We take the union of the edges from $T_1, \ldots, T_m$ and add $m$ more edges connecting $N_*$ with the nodes $N_{*1}, \ldots, N_{*m}$; the order of the nodes corresponds to the ordering of $M_1, \ldots, M_m$ from left to right. The first mapping is constructed.

Now let us define for every ordered tree $T$ the corresponding monomial $M(F, T)$. If we have a tree $T$ of rank zero we set $M(F, T) = u_j$, with $j = 1$ if we use the standard labeling. Now we do the induction step from $r - 1$ to $r$. If we have a tree of rank $r$ we take the root node $N_*$ and its children $N_{*1}, \ldots, N_{*s}$, $s = \mu(N_*)$. The subtrees $T(N_{*1}), \ldots, T(N_{*s})$ have rank not greater than $r - 1$ and the monomials $M(F, T(N_{*1})), \ldots, M(F, T(N_{*s}))$ are defined according to the induction assumption; let $m(T(N_{*1})), \ldots, m(T(N_{*s}))$ be their homogeneity indices. We set $m(T) = m(T(N_{*1})) + \cdots + m(T(N_{*s}))$. We denote the variables of every monomial $M(F, T(N_{*j}))$ by $u_{j,1}, \ldots, u_{j,m(T(N_{*j}))}$ counting from left to right, and then, labeling all the variables $u_{j,l}$ using the lexicographic ordering of the pairs $j, l$, we obtain variables $u_1, \ldots, u_{m(T)}$ and monomials
\[ M(F, T(N_{*1}))(u_1, \ldots, u_{m(T(N_{*1}))}), \quad M(F, T(N_{*2}))(u_{m_1+1}, \ldots, u_{m_1+m_2}), \]
etc., where $m_j = m(T(N_{*j}))$. After that we set
\[ M(F, T)(u_1, \ldots, u_{m(T)}) = F^{(s)}(M(F, T(N_{*1}))(u_1, \ldots, u_{m(T(N_{*1}))}), \ldots, M(F, T(N_{*s}))(u_{m(T)-m_s+1}, \ldots, u_{m(T)})). \]
Note that the homogeneity index for the monomial M equals the sum of the indices for submonomials M1 · · · Mm , the homogeneity index for the tree T equals the sum of the indices for subtrees T1 , . . . , Tm , this implies their equality by induction. The incidence number for the monomial M equals the sum of the numbers for submonomials M1 · · · Mm plus one; the incidence number for the tree T equals the sum of the numbers for submonomials T1 , . . . , Tm plus one. Therefore, these quantities for monomials and trees are equal by induction. Induction is completed. Therefore we constructed the two mappings, one can easily check that they are one-to-one and have all required properties. Definition 4.16 (Monomial to a Tree). For a tree T ∈ T2 , we denote by M (F , T ) the monomial which is constructed in Theorem 4.15. Conclusion 4.17. The above construction shows that the structure of every composition monomial is completely described by an (ordered) tree T with nodes Ni corresponding to the operators F (mi ) . At such a node Ni (i) the number mi of outcoming edges equals the homogeneity index of F (mi ) ; (ii) the outcoming edges are in one-to-one correspondence with the arguments of F (mi ) , and the ordering of the child nodes coincides with the ordering of arguments of F (mi ) from left to right. The value of mi may be different for different nodes. A node corresponding to F (m) is connected by edges with m child nodes corresponding to the arguments of F (m) . Every node N of the tree T can be taken as a root node of a subtree T (N ) which correspond to a submonomial M (F , T (N )). Conversely, every submonomial of M (F , T ) equals M (F , T (N )) for some mode N . If m > 1 the submonomial has a non-zero rank. The number of non-end nodes equals to the number of symbols F (m) used in F -represenation of the monomial which is the incidence number of the monomial. The total number of end nodes of an m-homogeneous operator equals to m = ν(T ). 
The rank of a node $N$ equals the rank of the corresponding submonomial $M(F, T(N))$. The arguments $u_1, \ldots, u_s$ of a monomial correspond to the end nodes of the tree. The standard labeling of the nodes of $T$ agrees with the standard labeling (from left to right) of the arguments of the composition monomial $M(F, T)$. The number of end nodes of the tree $T$ equals the homogeneity index of the corresponding monomial. If the root node of the tree $T$ of a monomial $M$ has $\mu(N_*) = m$ edges which are connected to child nodes $N_1, \ldots, N_m$, then there is a node $F^{(m_j)}$, $j = 1, \ldots, m$, at the end of every edge such that $M$ has the form
\[ F^{(m)}\big(F^{(\mu(N_1))}(\cdots), \ldots, F^{(\mu(N_m))}(\cdots)\big). \tag{4.23} \]
Example 4.18. The tree corresponding to F (3) (u1 u2 F (u1 u2 u3 )) has two nodes of non-zero rank, the root node of rank 2, one non-end node of rank 1 and five end nodes of rank 0. Another example, the monomial (4.22) has the root node corresponding to F (4) , four edges lead respectively to nodes corresponding to the end nodes with u1 , u2 , u3 and to the non-end node with F (3) , see Fig. 2.
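The tree statistics used in Conclusion 4.17 and Example 4.18 can be sketched in a few lines of Python. This is only an illustration: the class `Node` and the helper names are hypothetical, and the rank is assumed to be 0 for an end node and one plus the largest rank among the children otherwise, which agrees with the ranks quoted in Example 4.18.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A node of an ordered tree; a node without children is an end node."""
    children: List["Node"] = field(default_factory=list)


def leaf() -> Node:
    return Node()


def end_nodes(t: Node) -> int:
    """Number of end nodes nu(T), i.e. the homogeneity index of the monomial."""
    return 1 if not t.children else sum(end_nodes(c) for c in t.children)


def incidence(t: Node) -> int:
    """Number of non-end nodes, i.e. the incidence number of the monomial."""
    return 0 if not t.children else 1 + sum(incidence(c) for c in t.children)


def rank(t: Node) -> int:
    """Assumed rank: 0 for an end node, else 1 + the largest rank of a child."""
    return 0 if not t.children else 1 + max(rank(c) for c in t.children)


# Tree of F^(3)(u1, u2, F^(3)(u1, u2, u3)), read off from Example 4.18.
inner = Node([leaf(), leaf(), leaf()])
example = Node([leaf(), leaf(), inner])
```

For this tree the helpers return five end nodes, two non-end nodes, and ranks 2 (root) and 1 (inner node), as stated in the example.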
Fig. 2. A tree corresponding to a monomial.
Remark 4.19. Since all operators in the set $\{F^{(s)}\}_{s=2}^{\infty}$ in (4.18) have homogeneity index at least two, the trees of monomials generated by the recurrent relations (4.18) have a special property: every non-end node has at least two children.

Sometimes it is convenient to use monomials involving several types of operators. To describe such a situation we introduce, for a given tree, a decorated monomial.

Definition 4.20 (Decorated Monomial of a Tree). Assume that we have several formal series $\{F_1, \ldots, F_l\}$, where $F_i$ is represented by a formal series $F_i = \sum_m F_i^{(m)}$, $i = 1, \ldots, l$. We call the set $\{F\} = \{F_j,\ j = 1, \ldots, l\}$ the operator alphabet, and $j$ is called the decoration index. We consider a function $\Gamma(N)$, $N \in T$, defined on the nodes of the tree $T$ and taking values in the set $\{1, \ldots, l\}$ of the decoration indices, and call such a function a decoration function on the tree $T$. Then for a decoration function $\Gamma(N)$ we define the decorated monomial $M(\{F\}, \Gamma, T)$ by picking, for every node $N$ of the tree $T$, the operator $F_j^{(m)}$ with $j$ defined by $\Gamma$: the homogeneity index $m = \mu(N)$ of the operator $F_j^{(m)}$ equals the number of children of $N$, and $j = \Gamma(N)$. Hence, a decorated monomial $M(\{F\}, \Gamma, T)$ has, instead of (4.23), the following form
\[ F^{(m)}_{\Gamma(N_*)}\big(F^{(\mu(N_1))}_{\Gamma(N_1)}(\cdots), \ldots, F^{(\mu(N_m))}_{\Gamma(N_m)}(\cdots)\big). \tag{4.24} \]

When $F_i^{(m)}$ are multilinear operators, a monomial $M(\{F\}, T, \Gamma)$ is also a multilinear operator; its homogeneity index $m$ equals $\nu(T)$ and we denote its arguments by $(x_1 \cdots x_m)$. Respectively, if $x_1 \cdots x_\nu$ are the arguments of a monomial $M(\{F\}, T, \Gamma)$ and we use the standard labeling of the nodes, then according to Proposition 4.14
a submonomial $M(\{F\}, T', \Gamma)$ has arguments $x_{\kappa(T')}, \ldots, x_{\kappa(T')+\nu(T')-1}$ which are labeled constructively.

Now we would like to describe elementary properties of composition monomials and the related trees. Note that for every $N \in T$ a composition monomial is a linear function of the operator $F^{(\mu(N))}_{\Gamma(N)}$. Consequently, the concept of the decorated composition monomial can be naturally extended to monomials associated with the following family of operators: $\{F\} = \{F : F = c_1 F_1 + \cdots + c_l F_l,\ c_i \in \mathbb{C}\}$. For a given tree $T$ the submonomial $M(\{F\}, \Gamma, T)$ is represented as a function on the tree $T$ with values in $\{F\}$; this is an $i$-linear function of $F$, where $i$ is the incidence number of $T$.

There are elementary relations between the incidence number $i(T)$, the rank $r(T)$, the number $e^0(T)$ of edges of a tree $T$ which do not end at an end node, and the homogeneity index $m$ of a tree $T$ and of the corresponding monomial $M(\{F\}, \Gamma, T)$. For example, $e^0(T) = i(T) - 1$. Some useful relations expressed by inequalities are given in the following lemma.

Lemma 4.21. Let us consider trees $T$ for which every non-end node has at least two children, $\mu(N) \ge 2$ for all $N \in T$. For any $i$, let $m(i)$ be the minimum number of end nodes $\nu(T)$ over all trees $T$ with the given incidence number $i$. Then
\[ m(i) \ge i + 1. \tag{4.25} \]
Similarly, for any given $r$ let $m(r)$ be the minimum number of end nodes of trees with the given rank $r$. Then
\[ m(r) \ge r + 1. \tag{4.26} \]
Let $e^0(T)$ be the number of edges of a tree $T$ which do not end at end nodes. For any given $e_0$, let $m(e_0)$ be the minimum number of end nodes of trees with $e^0(T) = e_0$. Then
\[ m(e_0) > e_0 + 1. \tag{4.27} \]
Proof. For $i = 1$, (4.25) is true. Let the statement be true for $i < i_0$. Let $T$ be a tree with incidence number $i_0$ and the minimum number of end nodes $m(i_0) = m$. We delete one of the end nodes together with the edge leading to it from its parent, obtaining a tree with $m(i_0) - 1$ end nodes. If the tree remained in the same class, then $m(i_0)$ would be reduced by one, contradicting the minimality. Hence, the deletion of the edge created a node with only one child. Such a node can be replaced by an edge leading from its parent to its child, which reduces the incidence number by one. Using the induction assumption we get
\[ m(i_0) - 1 \ge m(i_0 - 1) \ge (i_0 - 1) + 1, \tag{4.28} \]
which completes the induction and proves (4.25) for all $i$. A similar induction proves (4.26). For $r = 1$, (4.26) is true. Let $T$ be a tree with the minimum number of end
nodes $m(r_0) = m$. As above, by deleting an end node and using the minimality we reduce the tree $T$ to a tree $T'$ with a smaller rank. Since only one non-end node is eliminated, the rank of $T'$ is $r_0 - 1$ and we get (4.26). Inequality (4.27) holds for $e_0 = 0$ since $m(0) \ge 2$. Let $T$ be a tree with the minimum number of end nodes $m(e_0) = m$. We again delete one of the end nodes together with the edge joining it to its parent and obtain a tree with $m(e_0) - 1$ end nodes and the same number of edges which do not end at an end node. The minimality implies that the parent node has only one other child, and removing it we get either $e_0$ or $e_0 - 1$ edges which do not go to end nodes. We use induction as in (4.28), obtaining (4.27).
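As a sanity check of Lemma 4.21, one can enumerate all small ordered trees in which every non-end node has at least two children and test the relations directly. The sketch below is not part of the original argument; the tuple encoding of trees and all function names are illustrative assumptions.

```python
from functools import lru_cache
from itertools import product


def compositions(n, k):
    """All ordered ways to write n as a sum of k positive integers."""
    if k == 1:
        yield (n,)
        return
    for first in range(1, n - k + 2):
        for rest in compositions(n - first, k - 1):
            yield (first,) + rest


@lru_cache(maxsize=None)
def trees(n):
    """All ordered trees with n end nodes whose non-end nodes have >= 2 children.

    A tree is encoded as the tuple of its child subtrees; () is an end node.
    """
    if n == 1:
        return ((),)
    result = []
    for k in range(2, n + 1):              # number of children of the root
        for parts in compositions(n, k):   # end nodes assigned to each child
            for kids in product(*(trees(p) for p in parts)):
                result.append(tuple(kids))
    return tuple(result)


def incidence(t):
    """Number of non-end nodes."""
    return 0 if not t else 1 + sum(incidence(c) for c in t)


def inner_edges(t):
    """Number of edges that do not end at an end node."""
    return sum((1 if c else 0) + inner_edges(c) for c in t)


def check(max_leaves=6):
    """Verify e0 = i - 1, (4.25) and (4.27) on all trees with <= max_leaves end nodes."""
    for n in range(2, max_leaves + 1):
        for t in trees(n):
            i, e0 = incidence(t), inner_edges(t)
            assert e0 == i - 1
            assert n >= i + 1
            assert n > e0 + 1
    return [len(trees(n)) for n in range(1, max_leaves + 1)]
```

Calling `check()` raises an `AssertionError` if any enumerated tree violates one of the relations, and otherwise returns the number of trees for each number of end nodes.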
Monomial Expansion in the Implicit Function Theorem. If the operators $G^{(m)}(x_1 \cdots x_m)$ are determined by the recurrent formulas (4.18), it is obvious that every $G^{(m)}$ can be represented in terms of $F = \{F^{(s)}\}$ using the recurrence and the multilinearity of $F^{(s)}$. More precisely, the following representation holds:
\[ G^{(m)}(F, x_1 \cdots x_m) = \sum_{T \in \mathcal{T}_m} c_T\, M(F, T)(x_1 \cdots x_m), \tag{4.29} \]
where (i) $M(F, T)$ is a composition monomial corresponding to a tree $T \in \mathcal{T}_2$, and $\mathcal{T}_m \subset \mathcal{T}_2$ stands for the set of trees with $m$ end nodes; (ii) the integer-valued multiplicity coefficient $c_T \ge 0$ counts the multiplicity of the related monomial $M(F, T)$ in the expansion of (4.18); for some trees $T$ the multiplicity coefficient $c_T$ may be zero. The expansion (4.29) is obtained by an inductive process with respect to $m$, since (4.18) expresses $G^{(m)}$ in terms of $G^{(i_j)}$ with $2 \le i_j < m$. Notice that for a given operator $F = \{F^{(s)}\}$ the monomial $M(F, T)$ considered as an operator can be the same for different $T$; the monomials and the multiplicity coefficients are determined purely algebraically.

Remark 4.22. The expression (4.29) for $G^{(m)}$ as a linear combination of composition monomials $M(F, T)$, in particular the multiplicity coefficients $c_T$, does not depend on a specific form of the operator $F$. It is the same for a solution $z = x + G(F, x)$ of the general functional equation (4.1) and for an elementary algebraic equation $u = F(u) + x$ with $u, x \in \mathbb{C}$ and with a scalar analytic function $F$ of one complex variable.

If all $F_i^{(m)}$ are bounded multilinear operators, then a decorated monomial $M(F, T, \Gamma)$ is also a bounded multilinear operator, as follows from the following statement.

Lemma 4.23. Let $M(\{F\}, T, \Gamma)$ be a decorated monomial of homogeneity index $\nu(T) = m$ and let all $F_i^{(s)}$ be bounded operators from $E^s$ into $E$ for a Banach space $E$. Then the following estimate holds:
\[ \|M(\{F\}, T, \Gamma)(x_1 \cdots x_m)\|_E \le \prod_{N \in T,\ r(N) > 0} \big\|F^{(\mu(N))}_{\Gamma(N)}\big\| \prod_{j=1}^{m} \|x_j\|_E. \tag{4.30} \]
Proof. Notice that
\[ \|F^{(m)}(M_1 \cdots M_m)\|_E \le \|F^{(m)}\|\, \|M_1\|_E \cdots \|M_m\|_E, \tag{4.31} \]
where $M_j$ are submonomials. Applying the above inequality repeatedly we obtain (4.30).

The next statement provides a bound for the norm of a decorated monomial which involves as a factor the norm of a submonomial.

Lemma 4.24. Let $M(\{F\}, T, \Gamma)$ be a decorated monomial evaluated at $x_1 \cdots x_m$. Let all $F^{(s)}$ be bounded operators from $E^s$ into a Banach space $E$. Then for every evaluated submonomial $M(\{F\}, T(N_0), \Gamma)$ we have the estimate
\[ \|M(\{F\}, T, \Gamma)(x_1 \cdots x_m)\|_E \le \big\|M(\{F\}, T(N_0), \Gamma)(x_\kappa, \ldots, x_{\kappa+\nu(T(N_0))-1})\big\|_E \times \prod_{N \in T \setminus T(N_0),\ r(N) > 0} \big\|F^{(\mu(N))}_{\Gamma(N)}\big\| \prod_{j < \kappa} \|x_j\| \prod_{j \ge \kappa + \nu(T(N_0))} \|x_j\|, \tag{4.32} \]
where $x_\kappa, \ldots, x_{\kappa+\nu(T(N_0))-1}$ are the arguments of the submonomial $M(\{F\}, T(N_0), \Gamma)$.
Proof. The proof uses induction with respect to the length $l(N_0)$. For $l(N_0) = 0$ the statement is obvious. Assuming that the statement is true for $l(N) < l_0$, we consider the case when $l(N_0) = l_0$. Notice that
\[ \big\|F^{(\mu(N_*))}_{\Gamma(N_*)}(M_1 \cdots M_{\mu(N_*)})\big\|_E \le \big\|F^{(\mu(N_*))}_{\Gamma(N_*)}\big\|\, \|M_1\|_E \cdots \|M_{\mu(N_*)}\|_E, \]
where $M_j = M(\{F\}, T(N_{*j}), \Gamma)$ and $N_{*j}$ are the child nodes of $N_*$. One of the submonomials $M_1 \cdots M_{\mu(N_*)}$ contains $M(\{F\}, T(N_0), \Gamma)$ as a submonomial; let it be $M(\{F\}, T(N_{*j_0}), \Gamma)$. The length of the path from $N_0$ to $N_{*j_0}$ is less than $l_0$ and we can use the induction hypothesis to estimate the norm of $M(\{F\}, T(N_{*j_0}), \Gamma)$. The norms of $M_j$ with $j \ne j_0$ are estimated using (4.30). The labels of the arguments of the submonomial fill an interval according to Proposition 4.14.

The following theorem gives a needed refinement of the Implicit Function Theorem 4.7.

Theorem 4.25 (Refined Implicit Function Theorem). Let $F \in A(C_F, R_F)$. Let $G \in A(C_G, R_G)$ be the analytic solution operator constructed in Theorem 4.7 which solves (4.1). Then the expansion of $G(F, x)$ into composition monomials
\[ G(F, x) = \sum_{m=1}^{\infty} \sum_{T \in \mathcal{T}_m} c_T\, M(F, T)(x^m) \tag{4.33} \]
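converges for $\|x\| < R_G$, and the following estimates hold:
\[ \sum_{T \in \mathcal{T}_m} c_T \|M(F, T)(x^m)\| \le C_G R_G^{-m} \|x\|^m, \quad m = 2, \ldots, \tag{4.34} \]
\[ \sum_{m=2}^{\infty} \sum_{T \in \mathcal{T}_m} c_T \|M(F, T)(x^m)\| \le C_G \frac{\|x\|_X^2 R_G^{-2}}{1 - \|x\|_X R_G^{-1}}, \]
where $C_G$ and $R_G$ depend only on $C_F$ and $R_F$ and satisfy
\[ C_G = \frac{R_F^2}{2(C_F + R_F)}, \qquad R_G = \frac{R_F^2}{4(C_F + R_F)}. \]
The multiplicity coefficients $c_T \ge 0$ satisfy the inequality
\[ \sum_{T \in \mathcal{T}_m} c_T \le \frac{1}{4}\, 8^m. \tag{4.35} \]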
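Remark 4.22 suggests a cheap numerical illustration of the bound on the multiplicity coefficients. The sketch below assumes the scalar model equation $u = x + u^2 + u^3 + \cdots$, that is, $F^{(m)}(u) = u^m$ for every $m \ge 2$; under that assumption every composition monomial evaluates to $x^m$, so the $m$-th Taylor coefficient of the solution equals the sum of $c_T$ over trees with $m$ end nodes. Taking $C_F = R_F = 1$ in the formulas above gives $C_G = 1/4$ and $R_G = 1/8$, which is the source of the comparison value $8^m/4$. The function name and the fixed-point scheme are illustrative choices, not the construction of Appendix B.

```python
def solution_coefficients(max_degree):
    """Taylor coefficients of u(x) solving u = x + u^2 + u^3 + ...,
    truncated at degree max_degree (>= 1), via fixed-point iteration."""

    def mul(a, b):
        # Product of two truncated power series with integer coefficients.
        out = [0] * (max_degree + 1)
        for p, ap in enumerate(a):
            if ap == 0:
                continue
            for q, bq in enumerate(b):
                if p + q > max_degree:
                    break
                out[p + q] += ap * bq
        return out

    x = [0, 1] + [0] * (max_degree - 1)
    u = x[:]
    # Each sweep makes at least one more Taylor coefficient exact.
    for _ in range(max_degree):
        total = x[:]
        power = u[:]
        for _ in range(2, max_degree + 1):
            power = mul(power, u)          # next power of u
            total = [s + t for s, t in zip(total, power)]
        u = total
    return u


coeffs = solution_coefficients(8)
bound_holds = all(4 * coeffs[m] <= 8 ** m for m in range(2, 9))
```

In this toy model the coefficients grow noticeably slower than $8^m/4$, so the comparison holds with room to spare for the degrees tested.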
The proof of this statement is given in Appendix B.

4.3. Decorated expansions

In this section we develop a formalism for treating linear operators with several invariant subspaces which span the entire space as, for example, in the case of the projections (2.19). The decomposition into related invariant subspaces is very important for the analysis. The general setting is as follows. Suppose that a Banach space $E$ has several projection operators $\Pi_\lambda$, $\lambda \in \Lambda$, where $\Lambda$ is a finite set of indices; we call this set the decoration set. We assume that the sum of the projections equals the identity operator, i.e.
\[ \sum_{\lambda \in \Lambda} \Pi_\lambda = \mathrm{Id}, \quad \text{where Id is the identity operator,} \tag{4.36} \]
and
\[ \Pi_\lambda \Pi_\lambda = \Pi_\lambda, \qquad \Pi_\lambda \Pi_{\lambda'} = 0 \ \text{ if } \lambda' \ne \lambda, \qquad \lambda, \lambda' \in \Lambda. \tag{4.37} \]
We call such projections decoration projections. For example, let us look at the projections $\Pi_{n,\zeta}(k)$, $n = 1, \ldots, J$, $\zeta = \pm$, defined by (2.19). These projections define bounded operators $\Pi_{n,\zeta}$ acting on (i) functions of $k$ in the space $L_1$; (ii) functions of $k, \tau$ in the space $E = C([0, \tau_*], L_1)$. In another example based on (2.19) we fix $n_0$ and define
\[ \Pi_\zeta(k) = \Pi_{n_0,\zeta}(k), \quad \zeta = \pm, \qquad \Pi_\infty(k) = \sum_{n \ne n_0,\ \zeta = \pm} \Pi_{n,\zeta}(k). \tag{4.38} \]
Using (4.36) we expand vectors $x \in E$ as follows:
\[ x = \sum_{\lambda \in \Lambda} \Pi_\lambda x = \sum_{\lambda \in \Lambda} x_\lambda, \qquad x_\lambda = \Pi_\lambda(x). \tag{4.39} \]
We also use the notation
\[ F_\lambda^{(n)} = \Pi_\lambda F^{(n)}. \tag{4.40} \]
Often in applications the number of elements in $\Lambda$ is either 2 or 3. In the case when $\Lambda$ has three elements we set
\[ \Lambda = \{+, -, \infty\}, \qquad \Pi_+ + \Pi_- + \Pi_\infty = \mathrm{Id}, \tag{4.41} \]
and
\[ x = x_+ + x_- + x_\infty, \qquad F(x) = F_+(x) + F_-(x) + F_\infty(x). \tag{4.42} \]
Using the decomposition (4.36) we introduce for $n$-linear operators $F^{(n)}(x_1 \cdots x_n)$ the corresponding decorated operators $F^{(n)}_{\lambda,\vec\zeta}$ as follows:
\[ F^{(n)}_{\lambda,\vec\zeta}(x_1 \cdots x_n) = \Pi_\lambda F^{(n)}(\Pi_{\zeta'} x_1 \cdots \Pi_{\zeta^{(n)}} x_n) = F^{(n)}_\lambda(\Pi_{\zeta'} x_1 \cdots \Pi_{\zeta^{(n)}} x_n), \tag{4.43} \]
where $\vec\zeta$ is defined in (3.7). Obviously, we have
\[ F^{(n)}(x_1 \cdots x_n) = \sum_{\lambda \in \Lambda,\ \vec\zeta \in \Lambda^n} F^{(n)}_{\lambda,\vec\zeta}(x_1 \cdots x_n). \tag{4.44} \]
An example of the expansion (4.44) is given by (3.11).
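The identity (4.44) is a direct consequence of multilinearity and of the resolution of the identity (4.36). A toy check in Python, assuming a three-dimensional space in which each decoration projection keeps a single coordinate and an arbitrary bilinear map `F2` (both are illustrative choices, not objects from the paper):

```python
from itertools import product

LABELS = ("+", "-", "inf")             # toy decoration set Lambda
AXIS = {"+": 0, "-": 1, "inf": 2}      # hypothetical: each projection keeps one coordinate


def proj(label, x):
    """Decoration projection Pi_label on R^3."""
    out = [0.0, 0.0, 0.0]
    out[AXIS[label]] = x[AXIS[label]]
    return out


def F2(x, y):
    """A toy bilinear operator on R^3 (not taken from the paper)."""
    return [
        x[0] * y[1] + x[2] * y[2],
        x[1] * y[0] - x[0] * y[2],
        x[2] * y[1] + x[1] * y[1],
    ]


def decorated_F2(lam, zeta, x, y):
    """F_{lam, zeta}(x, y) = Pi_lam F2(Pi_{zeta'} x, Pi_{zeta''} y), as in (4.43)."""
    return proj(lam, F2(proj(zeta[0], x), proj(zeta[1], y)))


def decorated_sum(x, y):
    """Sum of all decorated operators, the right-hand side of (4.44)."""
    total = [0.0, 0.0, 0.0]
    for lam in LABELS:
        for zeta in product(LABELS, repeat=2):
            term = decorated_F2(lam, zeta, x, y)
            total = [t + s for t, s in zip(total, term)]
    return total
```

Here the sum runs over $3 \cdot 3^2 = 27$ decorated terms and reproduces `F2(x, y)` up to rounding.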
4.4. Decorated composition monomials

We assume that the operators $F^{(n)}$ act in a space allowing a decomposition into three components as in (4.41). Let $M(F, T)$ be a composition monomial of homogeneity index $m$, and assume that the corresponding tree $T$ has incidence number $i$, rank $r$, and $e$ edges. Suppose also that every operator $F^{(n)}$ is expanded into a sum of decorated operators as in (4.44) and (4.43). Using the linearity of $M(F, T)$ with respect to the operators $F^{(n)}$ we get
\[ M(F, T) = F^{(n)}\big(F^{(m_1)}(\cdots) \cdots F^{(m_n)}(\cdots)\big) = \sum_{\lambda \in \Lambda,\ \vec\lambda \in \Lambda^{i-1},\ \vec\zeta_j,\ j = 1, \ldots, e} F^{(n)}_\lambda\Big(F^{(m_1)}_{\lambda_{j_1},\vec\zeta_{j_1}}(\cdots) \cdots F^{(m_n)}_{\lambda_{j_n},\vec\zeta_{j_n}}(\cdots)\Big), \tag{4.45} \]
where the submonomials $F^{(m_1)}_{\lambda_1,\vec\zeta_1}(\cdots), \ldots, F^{(m_n)}_{\lambda_n,\vec\zeta_n}(\cdots)$ have ranks not exceeding $r - 1$. We expanded the expression in the left-hand side of (4.45) repeatedly as long as submonomials of non-zero rank were present, resulting in an expansion involving only decorated operators $F^{(n)}_{\lambda,\vec\zeta}$.

Remark 4.26. Note that
\[ F^{(n)}_\lambda\Big(F^{(m_1)}_{\lambda_1,\vec\zeta_1}(\cdots) \cdots F^{(m_n)}_{\lambda_n,\vec\zeta_n}(\cdots)\Big) = F^{(n)}_\lambda\Big(\Pi_{\lambda_1} F^{(m_1)}_{\lambda_1,\vec\zeta_1}(\cdots) \cdots \Pi_{\lambda_n} F^{(m_n)}_{\lambda_n,\vec\zeta_n}(\cdots)\Big). \tag{4.46} \]
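Since the projections $\Pi_\zeta$ satisfy the identities (4.37), if a vector $\vec\zeta = (\zeta', \ldots, \zeta^{(n)})$ and indices $\lambda_1, \ldots, \lambda_n$ are given, then we have the identity
\[ F^{(n)}_{\lambda,\vec\zeta}\Big(F^{(m_1)}_{\lambda_1,\vec\zeta_1} \cdots F^{(m_n)}_{\lambda_n,\vec\zeta_n}\Big) = 0 \quad \text{when } \lambda_i \ne \zeta^{(i)} \text{ for some } i. \tag{4.47} \]
Hence, for non-zero terms in the expansion (4.45), if the indices $\lambda_1, \ldots, \lambda_n$ for $F^{(m_1)}_{\lambda_1,\vec\zeta_1}, \ldots, F^{(m_n)}_{\lambda_n,\vec\zeta_n}$ are given, the vector $\vec\zeta$ in $F^{(n)}_{\lambda,\vec\zeta}$ is determined by them:
\[ \zeta^{(i)} = \lambda_i, \quad i = 1, \ldots, n. \tag{4.48} \]
Note that according to (4.47) and (4.48) we have
\[ F^{(n)}_{\lambda,\vec\lambda}\Big(F^{(m_1)}_{\lambda_1}(\cdots) \cdots F^{(m_n)}_{\lambda_n}(\cdots)\Big) = F^{(n)}_\lambda\Big(F^{(m_1)}_{\lambda_1}(\cdots) \cdots F^{(m_n)}_{\lambda_n}(\cdots)\Big). \tag{4.49} \]
According to (4.45) and (4.49), for every tree $T$ of homogeneity index $m$ and incidence number $i$ we get an expansion into a sum of monomials of the form
\[ M(F, T, \vec\lambda, \vec\zeta)(x_1 x_2 \cdots x_m) = M(\{F\}, \Gamma, T)(x_1 x_2 \cdots x_m), \qquad \{F\} = \{F^{(n)}_{\lambda,\vec\zeta} : \lambda \in \Lambda,\ \vec\zeta \in \Lambda^n,\ n = 2, 3, \ldots\}. \tag{4.50} \]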
Namely, if a monomial $M(F, T)$ has at a node $N$ the operator $F^{(m(N))}$, then $M(\{F\}, \Gamma, T)$ at this node has the operator $F^{(m(N))}_{\Gamma(N)}$. We call a composition monomial of the form (4.50), where (4.48) is assumed, a decorated composition monomial. Using the standard labeling of nodes, for a given function $\Gamma$ on the tree $T$ with values in $\Lambda$ we find the vectors $\vec\lambda \in \Lambda^i$, $\vec\zeta \in \Lambda^m$, with $i$ being the incidence number of the tree $T$, and using (4.48) we rewrite (4.45) in the form
\[ M(F, T)(x_1 x_2 \cdots x_m) = \sum_{\vec\lambda \in \Lambda^i,\ \vec\zeta \in \Lambda^m} M(F, T, \vec\lambda, \vec\zeta)(x_1 x_2 \cdots x_m), \tag{4.51} \]
where $\vec\zeta$ is determined by the values of $\Gamma$ on the end nodes. The sum (4.51) contains at most $3^{i+m}$ non-zero terms, where 3 is the number of elements in $\Lambda$. Combining (4.51) with (4.33) we obtain
\[ G^{(m)}(x^m) = \sum_{T \in \mathcal{T}_m} c_T \sum_{\vec\lambda \in \Lambda^{i(T)},\ \vec\zeta \in \Lambda^m} M(F, T, \vec\lambda, \vec\zeta)(x^m). \tag{4.52} \]

5. Expansions of Solutions for Oscillatory Integral Equation

In this section we apply the general concepts introduced in the previous sections to oscillatory integrals involving operators $F$ as in (3.3) and (3.4). Based on the projections $\Pi_{n,\zeta}(k)$ in (2.19), for a given $n = n_0$ we define as in (4.38) decoration projections in $L_1$ which satisfy (4.41):
\[ \Pi_\zeta \tilde u(k) = \Pi_{n_0,\zeta}(k)\tilde u(k), \quad \zeta = \pm, \qquad \Pi_\infty = \sum_{n \ne n_0} \sum_{\zeta = \pm} \Pi_{n,\zeta}. \tag{5.1} \]
5.1. Boundedness of oscillatory integral operators

In this subsection we estimate the norms of the multilinear operators $F = F^{(m)}$ defined by (3.4) and of the related composition monomials. The operators $F^{(m)}$ have the form (3.4), where $D_m = \mathbb{R}^{d(m-1)}$ as in (2.65) or $D_m = [-\pi, \pi]^{d(m-1)}$ as in (2.23). Both cases are completely similar since we use the same properties of the spaces $L_1 = L_1([-\pi, \pi]^d)$ or $L_1 = L_1(\mathbb{R}^d)$, and we do not use in our proofs the boundedness and compactness of the domain $[-\pi, \pi]^d$. Hence, we will consider everywhere the periodic case $[-\pi, \pi]^d$, which corresponds to lattice equations, and without further comment apply the results to the case $\mathbb{R}^d$.

Lemma 5.1. The operator $F^{(m)}$ defined by (3.4) and (2.22) is bounded from $E = C([0, \tau_*], L_1)$ into $C^1([0, \tau_*], L_1)$ and its norm is estimated as follows:
\[ \|F^{(m)}(\tilde u_1 \cdots \tilde u_m)\|_E \le \tau_* C_\Xi^{2m+1} \|\chi^{(m)}\| \prod_{j=1}^{m} \|\tilde u_j\|_E, \tag{5.2} \]
\[ \|\partial_\tau F^{(m)}(\tilde u_1 \cdots \tilde u_m)\|_E \le C_\Xi^{2m+1} \|\chi^{(m)}\| \prod_{j} \|\tilde u_j\|_E. \tag{5.3} \]

Proof. According to Condition 2.5 we can diagonalize the matrix $\exp\{-iL(k)\tau_1/\varrho\}$, and its norm is bounded uniformly in $k$, $\tau_1$ and $\varrho$:
\[ \|\exp\{-iL(k)\tau_1/\varrho\}\| \le C_\Xi^2 \quad \forall\, k \in \mathbb{R}^d,\ \varrho > 0,\ \tau_1 \ge 0. \tag{5.4} \]
By (3.4), (3.5) and (2.22),
\[ \|F^{(m)}(\tilde u_1 \cdots \tilde u_m)(\cdot, \tau)\|_{L_1} \le C_\Xi^{2m+1} \sup_{k, \vec k} |\chi^{(m)}(k, \vec k)| \int_0^\tau \int_{D_m} |\tilde u_1(k')| \cdots |\tilde u_m(k^{(m)}(k, \vec k))|\, dk' \cdots dk^{(m-1)}\, d\tau_1\, dk \]
\[ \le C_\Xi^{2m+1} \|\chi^{(m)}\| \int_0^\tau \|\tilde u_1(\tau_1)\|_{L_1} \cdots \|\tilde u_m(\tau_1)\|_{L_1}\, d\tau_1 \le \tau_* C_\Xi^{2m+1} \|\chi^{(m)}\|\, \|\tilde u_1\|_E \cdots \|\tilde u_m\|_E. \]
Similarly,
\[ \|\partial_\tau F^{(m)}(\tilde u_1 \cdots \tilde u_m)(\cdot, \tau)\|_{L_1} \le C_\Xi^{2m+1} \|\chi^{(m)}\| \int_{D_m} |\tilde u_1(k')| \cdots |\tilde u_m(k^{(m)}(k, \vec k))|\, dk' \cdots dk^{(m-1)}\, dk \le C_\Xi^{2m+1} \|\chi^{(m)}\|\, \|\tilde u_1\|_E \cdots \|\tilde u_m\|_E. \]
Corollary 5.2. If $M(F, T, \vec\lambda, \vec\zeta)(x_1 \cdots x_m)$ is a decorated composition monomial defined by (4.18) and $F$ is defined by (3.3) and (3.4), then
\[ \|M(F, T, \vec\lambda, \vec\zeta)(x_1 \cdots x_m)\|_E \le C_\Xi^{2e+i} \tau_*^{i} \prod_{N \in T} \|\chi^{(\mu(N))}\| \prod_{l=1}^{m} \|x_l\|_E, \tag{5.5} \]
\[ \|\partial_\tau M(F, T, \vec\lambda, \vec\zeta)(x_1 \cdots x_m)\|_E \le C_\Xi^{2e+i} \tau_*^{i-1} \prod_{N \in T} \|\chi^{(\mu(N))}\| \prod_{l=1}^{m} \|x_l\|_E, \tag{5.6} \]
where $i$ is the incidence number of the tree $T$, and $e$ is the number of edges of $T$.

Proof. We estimate the norm of the monomial $M = F^{(m)}(M_1 \cdots M_m)$ and its time derivative by applying Lemma 5.1. Then we use (5.2) to estimate $\|M_j\|_{C([0,\tau_*],L_1)}$. The formal proof is straightforward and uses induction with respect to the incidence number of a monomial.

Using the boundedness of the operators $F^{(m)}$ we obtain in a standard way the uniqueness of the solution of (3.3).

Lemma 5.3. If $\tilde u_1, \tilde u_2 \in C([0, \tau_0], L_1)$ with $\tau_0 > 0$ are two solutions of (3.3) with the same $\tilde h$, then $\tilde u_1 = \tilde u_2$.

Proof. Applying Lemma 4.6, we conclude that
\[ \|F(\tilde u_1) - F(\tilde u_2)\|_{C([0,\tau_1],L_1)} \le C\tau_1 \|F(\tilde u_1) - F(\tilde u_2)\|_{C([0,\tau_1],L_1)}, \quad 0 < \tau_1 \le \tau_0. \]
Deriving the above inequality we use that, since $N_F < \infty$, the radius $R_F$ in Lemma 4.6 is arbitrarily large and $C_F$ in (4.13), according to (5.2), is proportional to $\tau_1$. When the Lipschitz constant $C\tau_1 < 1$, we obtain in a standard way that $\tilde u_1(\tau) = \tilde u_2(\tau)$ for $0 \le \tau \le \tau_1$. Since this statement can be applied to $\tilde u_1(\tau - \tau_1)$ and $\tilde u_2(\tau - \tau_1)$, we obtain that the solutions coincide for $0 \le \tau \le \tau_0$.

5.2. Function-analytic expansion of solutions for modal integral evolution equation

The reduced evolution equation (3.3) has the form
\[ \tilde u = F(\tilde u) + \tilde x, \tag{5.7} \]
where $\tilde u$, $\tilde x$ are functions of $(k, \tau)$. The nonlinear operator $F$ in the right-hand side of (5.7) is determined by (3.4), and $\tilde x(k, \tau) = \tilde h(k)$ as in (3.3). We look for the solution operator $G$ in the form of an operator series
\[ \tilde u = G(\tilde x) = \sum_{m=1}^{\infty} G^{(m)}(\tilde x^m). \tag{5.8} \]
The questions related to the existence and the convergence of such series are addressed in Theorem 4.7. As a direct corollary of Theorem 4.7 and Lemma 5.3, applied to the reduced evolution equation (3.3), we obtain the following theorem.
Theorem 5.4. Let
\[ \|\tilde x\|_E < R_G = \big(\tau_* C_\chi C_\Xi^{2m_F+1}\big)^{-1/(m_F-1)}/8, \qquad \tau_* \le C_\Xi^{-3} C_\chi^{-1}, \tag{5.9} \]
with $C_\chi$ as in (2.26) and $C_\Xi$ as in (2.17). Then the series (5.8) converges in $E = C([0, \tau_*], L_1)$. The solution operator $G(\tilde x) = \tilde u$ determines the solution to (5.7), and the operators $G^{(m)}$ in the series (5.8) satisfy the recursive relations (4.18).

Proof. From (2.26) and (5.2), we infer that $F$ defined by (2.21) belongs to the class $A(C_F, R_F)$ if
\[ \tau_* C_\chi C_\Xi^{2m+1} \le C_F R_F^{-m}, \quad m = 2, \ldots, m_F. \]
If $C_\Xi^{-2} R_F^{-1} \le 1$, the left-hand side multiplied by $R_F^m$ is non-decreasing in $m$, so it is sufficient to verify the above condition at $m = m_F$ only. After this we apply Theorem 4.7, where according to (4.19) we can take
\[ C_G = \frac{R_F^2}{2(C_F + R_F)}, \qquad R_G = \frac{R_F^2}{4(C_F + R_F)}. \tag{5.10} \]
We take
\[ C_F = R_F = \big(\tau_* C_\chi C_\Xi^{2m_F+1}\big)^{-1/(m_F-1)}, \qquad C_G = 2R_G = R_F/4 \tag{5.11} \]
and apply Theorem 4.7. Note that $C_\Xi^{-2} R_F^{-1} \le 1$ if $\tau_* \le C_\Xi^{-3} C_\chi^{-1}$.
From Theorem 5.4 (observing that by (5.11) $R_F \to \infty$ when $\tau_* \to 0$) we obtain Theorems 2.8 and 2.18. To prove Theorem 2.15 on the superposition principle we apply the solution operator $G$ to a sum of wavepackets $\tilde h_l(k, \beta)$ as in Definition 2.9. For technical reasons we have to modify the wavepackets using the cutoff functions described below.

Cutoff Functions. We often use an infinitely smooth cutoff function $\Psi(\eta)$, $\eta \in \mathbb{R}^d$, satisfying the following relations:
\[ 0 \le \Psi(\eta) \le 1, \quad \Psi(-\eta) = \Psi(\eta), \quad \Psi(\eta) = 1 \ \text{for } |\eta| \le \pi_0/2, \quad \Psi(\eta) = 0 \ \text{for } |\eta| \ge \pi_0, \tag{5.12} \]
where $\pi_0 \le 1$ is a sufficiently small number which satisfies the inequality
\[ 0 < \pi_0 < \frac{1}{2} \min_l \operatorname{dist}\{k_{*l}, \sigma\}. \tag{5.13} \]
Using $\Psi$ we introduce cutoff functions $\Psi_{l,\zeta}(k, \beta)$ with support near $\zeta k_{*l}$, defined as follows:
\[ \Psi_{l,\zeta}(k, \beta) = \Psi\!\left(\frac{k - \zeta k_{*l}}{\beta^{1-\epsilon}}\right), \quad l = 1, \ldots, N_h. \tag{5.14} \]
Here $\epsilon$ is a small number, $1/2 > \epsilon > 0$; we take the same $\epsilon$ as in Definition 2.9.
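One concrete way to realise a cutoff with the properties (5.12) is the standard construction from $\exp(-1/t)$. The sketch below is only an example of such a function, not the specific $\Psi$ used by the authors; the names `smooth_step`, `cutoff` and `cutoff_l`, and the parameter name `eps` for the small exponent in (5.14), are assumptions made for illustration.

```python
import math


def _g(t):
    """exp(-1/t) for t > 0 and 0 otherwise; infinitely smooth on the real line."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def smooth_step(t):
    """Infinitely smooth step: 0 for t <= 0, 1 for t >= 1, monotone in between."""
    return _g(t) / (_g(t) + _g(1.0 - t))


def cutoff(eta, pi0):
    """Radial cutoff: 1 for |eta| <= pi0/2, 0 for |eta| >= pi0, values in [0, 1]."""
    r = math.sqrt(sum(c * c for c in eta))
    return 1.0 - smooth_step((r - pi0 / 2.0) / (pi0 / 2.0))


def cutoff_l(k, zeta, k_star, beta, eps, pi0):
    """Rescaled cutoff Psi((k - zeta * k_star) / beta**(1 - eps)), cf. (5.14)."""
    scale = beta ** (1.0 - eps)
    shifted = [(kc - zeta * sc) / scale for kc, sc in zip(k, k_star)]
    return cutoff(shifted, pi0)
```

Because the cutoff depends only on the Euclidean norm of its argument, the symmetry $\Psi(-\eta) = \Psi(\eta)$ holds automatically, and `cutoff_l` vanishes once $|k - \zeta k_{*l}| \ge \pi_0 \beta^{1-\epsilon}$.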
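Given a wavepacket $\tilde h_l(k, \beta)$ we introduce a modified wavepacket
\[ \tilde h_l^\Psi(k, \beta) = \tilde h_{l,+}^\Psi(k, \beta) + \tilde h_{l,-}^\Psi(k, \beta), \qquad \tilde h_{l,\zeta}^\Psi(k, \beta) = \Psi_{l,\zeta}(k, \beta)\, \tilde h_{l,\zeta}(k, \beta), \tag{5.15} \]
where $\Psi_{l,\zeta}$ are defined by (5.14).

Proposition 5.5. If $\tilde h_l(k, \beta)$ is a wavepacket in the sense of Definition 2.9, then $\tilde h_l^\Psi(k, \beta)$ defined by (5.15) and (5.14) is also a wavepacket in the sense of Definition 2.9 and, in addition to that,
\[ \tilde h_{l,\zeta}^\Psi(k, \beta) = 0 \quad \text{if } |k - \zeta k_{*l}| \ge \pi_0 \beta^{1-\epsilon}, \tag{5.16} \]
\[ \|\tilde h_l - \tilde h_l^\Psi\|_{L_1} \le C\beta. \tag{5.17} \]

Proof. To obtain (5.17) we note that (2.34) and (5.12) imply
\[ \|(1 - \Psi_{l,\zeta})\tilde h_{l,\zeta}\|_{L_1} = \int |(1 - \Psi_{l,\zeta}(k, \beta))\tilde h_{l,\zeta}(k)|\, dk \le C\beta, \tag{5.18} \]
and (5.17) follows. The remaining statements are obtained by a straightforward verification.

The following lemma shows that we can replace $\tilde h_l$ by $\tilde h_l^\Psi$ in the statement of Theorem 2.15, in particular in (2.47) and (2.48).

Lemma 5.6. Let $\tilde h_{l,\zeta}$ satisfy (2.34) and let $\tilde h_l^\Psi(k, \beta)$ be defined by (5.15). Let
\[ \|\tilde h_l\| \le R, \quad l = 1, \ldots, N_h, \quad \text{where } N_h R < R_G. \tag{5.19} \]
Then the difference
\[ \left[G\Big(\sum_{l=1}^{N_h} \tilde h_l\Big) - \sum_{l=1}^{N_h} G(\tilde h_l)\right] - \left[G\Big(\sum_{l=1}^{N_h} \tilde h_l^\Psi\Big) - \sum_{l=1}^{N_h} G(\tilde h_l^\Psi)\right] = B_\Psi \tag{5.20} \]
is small, namely
\[ \|B_\Psi\|_E \le C(R)\beta. \tag{5.21} \]

Proof. Note that since $0 \le \Psi_l \le 1$ we have
\[ \|\Psi_{l,\zeta}\tilde h_{l,\zeta}\|_{L_1} \le \|\tilde h_{l,\zeta}\|_{L_1}, \qquad \|(1 - \Psi_{l,\zeta})\tilde h_{l,\zeta}\|_{L_1} \le \|\tilde h_{l,\zeta}\|_{L_1}, \tag{5.22} \]
and (5.18). Using the Lipschitz continuity of the solution operator $G$ (see (4.6)) and (5.17) we obtain (5.21).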
Truncation. We will truncate the infinite series (5.8). To this end we define an integer $m = m(\beta^q)$ as a solution of the inequality
\[ \frac{2|\ln \beta^q|}{|\ln R_G|} < m(\beta^q) \le \frac{2|\ln \beta^q|}{|\ln R_G|} + 1, \tag{5.23} \]
where $R_G$ is the same as in (5.9). We then consider the following partial sum of the expansion (5.8):
\[ G_{m(\beta^q)}(\tilde h) = \sum_{m=1}^{m(\beta^q)} G^{(m)}(\tilde h^m), \tag{5.24} \]
and readily conclude that the following statement holds.

Lemma 5.7. Let $G$ be defined by (5.8); then
\[ \|G(\tilde h) - G_{m(\beta)}(\tilde h)\|_E \le C(R)\beta \quad \text{when } \|\tilde h\|_E \le R < R_G. \tag{5.25} \]
5.2.1. SI-CI splitting for evaluated monomials

We consider a function $\tilde h$ which is a sum of the form (2.39) and the solution $G(F, \tilde h)$. Expanding $G^{(m)}(\tilde h^m)$ into composition monomials as in (4.33) we obtain a sum of composition monomials $M(F, T)(\tilde h^m)$. Then we look at the $m$-linear monomial $M(F, T)(\tilde h^m)$, where $\tilde h$ equals a sum of $N_h$ one-band wavepackets $\tilde h_l$ as in (2.39). Using the linearity with respect to each argument we expand the monomial into a sum of $N_h^m$ expressions (evaluated monomials):
\[ M(F, T)\bigg(\Big(\sum_{l=1}^{N_h} \tilde h_l\Big)^{m}\bigg) = \sum_{l_1, \ldots, l_m} M(F, T)(\tilde h_{l_1} \cdots \tilde h_{l_m}) = \sum_{l_1, \ldots, l_m} M(F, T)\Big(\prod_i \tilde h_{l_i}\Big). \tag{5.26} \]
The sum contains evaluated monomials of two kinds: (i) ones which involve the same wavepacket; and (ii) ones corresponding to the cross terms (terms involving different wavepackets). To be precise, we introduce the following definition.

Definition 5.8 (SI and CI). We say that an evaluated monomial $M(F, T)(\tilde h_{l_1} \cdots \tilde h_{l_m})$ with the argument multiindex $(l_1, \ldots, l_m) \in \{1, \ldots, N_h\}^m$ in the expansion (5.26) is self-interacting (SI) if
\[ l_1 = l_2 = \cdots = l_m. \tag{5.27} \]
Otherwise we say that $M(F, T)(\tilde h_{l_1} \cdots \tilde h_{l_m})$ is cross-interacting (CI).

Using this notation we rewrite (5.26):
\[ M(F, T)\bigg(\Big(\sum_{l=1}^{N_h} \tilde h_l\Big)^{m}\bigg) = \sum_{l=1}^{N_h} M(F, T)((\tilde h_l)^m) + \sum_{l_1, \ldots, l_m \text{ is CI}} M(F, T)(\tilde h_{l_1} \cdots \tilde h_{l_m}). \tag{5.28} \]
Substituting this expression into (4.33) we obtain the expansion
\[ G(\tilde h_1 + \cdots + \tilde h_{N_h}) = \sum_{m=1}^{\infty} G^{(m)}((\tilde h_1 + \cdots + \tilde h_{N_h})^m) = \sum_{m=1}^{\infty} G^{(m)}((\tilde h_1)^m) + \cdots + \sum_{m=1}^{\infty} G^{(m)}((\tilde h_{N_h})^m) + G_{CI}(\tilde h_1, \ldots, \tilde h_{N_h}), \tag{5.29} \]
where $G_{CI}$ contains only CI monomials with cross terms.

Proposition 5.9. Every evaluated CI monomial $M(F, T)(\tilde h_1, \ldots, \tilde h_{N_h})$ has a submonomial of the form
\[ F^{(s)}\big(M(F, T_1)(\tilde h_{l_1} \cdots \tilde h_{l_1}) \cdots M(F, T_s)(\tilde h_{l_s} \cdots \tilde h_{l_s})\big), \tag{5.30} \]
where all $M(F, T_1)(\tilde h_{l_1} \cdots \tilde h_{l_1}), \ldots, M(F, T_s)(\tilde h_{l_s} \cdots \tilde h_{l_s})$ are SI, and there are at least two indices $i$ and $j$ such that $\tilde h_{l_i} \ne \tilde h_{l_j}$. We call such a monomial a minimal CI monomial.

Proof. The set of CI submonomials of $M(F, T)$ is finite, and it is non-empty since $M(F, T)$ itself is a CI monomial. We take a CI submonomial of $M(F, T)$ with minimal rank. Its rank is non-zero since every zero-rank submonomial is SI. Since the rank is minimal, all of its proper submonomials are SI. Hence it has the form (5.30).
5.3. Properties of SI monomials

According to Definition 5.8, for an SI evaluated monomial we have $\tilde h_{l_1} = \cdots = \tilde h_{l_m}$. Observe also that in view of Definition 2.9 every single-band wavepacket $\tilde h_l$ has its band number, and $n' = n'' = \cdots = n^{(m)}$, that is, the band $n_l = n_0$ is the same for all $\tilde h_l$. Similarly, $k_{*l_1} = \cdots = k_{*l_m}$. Having these properties we often omit in this section the indices $n_i$, $l_i$ and skip $n$ for notational brevity, writing, for example,
\[ \omega_{n,\zeta}(k) = \omega_\zeta(k), \qquad \tilde u_{n,\zeta}(k) = \tilde u_\zeta(k), \qquad \chi^{(m)}_{n,\zeta,\vec n,\vec\zeta} = \chi^{(m)}_{\zeta,\vec\zeta}. \]

5.3.1. Monomials applied to a single-band wavepacket

Here we consider monomials which are based on oscillatory integral operators and are applied to a single-band wavepacket. We recall that according to (2.33) a single-band wavepacket $\tilde h$ involves two components $\tilde h_+$ and $\tilde h_-$ and a small complement component $\tilde h_\infty$.

Definition 5.10 (Frequency Matching). We call a decorated composition monomial $M(F, T, \vec\lambda, \vec\zeta)$ frequency matched (FM) if for every non-end node $N \in T$ the corresponding decorated submonomial $M' = F^{(m')}_{\lambda}(M_{1,\zeta'} \cdots M_{m',\zeta^{(m')}})$ satisfies the following conditions:
\[ \lambda \ne \infty, \qquad \zeta^{(j)} \ne \infty, \quad j = 1, \ldots, m', \tag{5.31} \]
and
\[ \sum_{j=1}^{m'} \zeta^{(j)} = \lambda, \tag{5.32} \]
where $\lambda, \zeta^{(j)} \in \Lambda$ with $\Lambda$ defined by (4.41), and we identify $\pm$ with $\pm 1$. A decorated composition monomial which does not satisfy the above conditions is called a not frequency matched (NFM) monomial.

Collecting separately the FM and NFM terms in the expression (4.51) we obtain
\[ M(F, T)(x_1 x_2 \cdots x_m) = \sum_{\text{FM } \vec\lambda, \vec\zeta} M(F, T, \vec\lambda, \vec\zeta)(x_1 x_2 \cdots x_m) + \sum_{\text{NFM } \vec\lambda, \vec\zeta} M(F, T, \vec\lambda, \vec\zeta)(x_1 x_2 \cdots x_m). \tag{5.33} \]
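The node-level conditions (5.31) and (5.32) are easy to test mechanically. The sketch below checks them for a single node and lists the frequency-matched sign vectors of a given length; the string `"inf"` standing for the index $\infty$ and both function names are illustrative assumptions, and a full FM test would apply the same check at every non-end node of the tree.

```python
from itertools import product

INF = "inf"  # hypothetical encoding of the decoration index "infinity"


def is_fm_node(lam, zetas):
    """Conditions (5.31)-(5.32) at one node: no index is infinite and sum(zetas) == lam."""
    if lam == INF or INF in zetas:
        return False
    return sum(zetas) == lam


def fm_vectors(m, lam):
    """All sign vectors (zeta', ..., zeta^(m)) in {+1, -1}^m frequency matched with lam."""
    return [z for z in product((1, -1), repeat=m) if is_fm_node(lam, z)]


cubic_fm = fm_vectors(3, 1)      # two indices +1 and one index -1
quadratic_fm = fm_vectors(2, 1)  # a sum of two signs is never +1 or -1
```

In particular, with $\lambda = \pm 1$ the matching condition can only be met when the number of arguments $m'$ is odd.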
Remark 5.11. Any SI evaluated monomial is either FM or NFM. We do not define for CI evaluated monomials whether they are FM or NFM.

Below we show that FM decorated monomials have the following properties, which can be briefly stated as follows.

Property 1. If $\tilde h(k)$ is a wavepacket in the sense of Definition 2.9 centered around $\pm k_*$, then the FM monomial $M(F, T, \vec\lambda, \vec\zeta)(\tilde h^m)(k)$ is also localized about $\pm k_*$. This property is proved below in Corollary 5.13.

Property 2. The most important property concerning the FM-NFM splitting is that the result of applying an NFM monomial to a wavepacket has magnitude $O(\varrho)$, that is, $O(\beta^2)$ for the scaling (2.46). Consequently, all NFM terms in (5.33) are small (see Lemma 5.16 below) and they contribute only to the remainder $\tilde D$ in (2.47).

Now we formulate exact statements clarifying the above properties. The following two statements show, in particular, that an FM monomial transforms a function supported in a vicinity of $k_*$ into a similar function.

Lemma 5.12 (Operator Support). If $\tilde u_{1,\zeta'} \cdots \tilde u_{m,\zeta^{(m)}}$ are such that
\[ \tilde u_{l,\zeta^{(l)}}(k^{(l)}) = 0 \quad \text{when } |k^{(l)} - \zeta^{(l)} k_*| > \delta_l, \quad l = 1, \ldots, m, \]
and
\[ k_{\vec\zeta} = (\zeta' + \cdots + \zeta^{(m)}) k_*, \tag{5.34} \]
then $F^{(m)}(\tilde u_{1,\zeta'} \cdots \tilde u_{m,\zeta^{(m)}})(k, \tau)$ given by (3.4) satisfies
\[ F^{(m)}_\zeta(\tilde u_{1,\zeta'} \cdots \tilde u_{m,\zeta^{(m)}})(k, \tau) = 0 \quad \text{if } |k - k_{\vec\zeta}| > \delta_1 + \cdots + \delta_m. \tag{5.35} \]
In particular, if the binary indices $\zeta, \vec\zeta^{(m)}$ are frequency matched (FM), that is,
\[ \zeta = \zeta' + \cdots + \zeta^{(m)}, \quad \text{where } \zeta^{(j)}, \zeta = \pm 1, \tag{5.36} \]
then (5.35) holds with $k_{\vec\zeta} = \zeta k_*$.

Proof. From (3.8) and (5.36) we obtain the equality $k - \zeta k_* = (k' - \zeta' k_*) + \cdots + (k^{(m)} - \zeta^{(m)} k_*)$, which implies the lemma's statement.

Corollary 5.13 (Support of a Monomial). If $M(F, T, \vec\lambda, \vec\zeta)(\tilde h_1 \cdots \tilde h_m)$ is a decorated composition monomial and
\[ \tilde h_{l,\zeta^{(l)}} = 0 \quad \text{when } |k^{(l)} - \zeta^{(l)} k_*| > \delta_0, \quad l = 1, \ldots, m, \tag{5.37} \]
then
\[ M(F, T, \vec\lambda, \vec\zeta)(\tilde h_1 \cdots \tilde h_m)(k) = 0 \quad \text{if } |k - k_{\vec\zeta}| > m\delta_0, \tag{5.38} \]
where $k_{\vec\zeta}$ is defined by (5.34). In particular, if $M(F, T, \vec\lambda, \vec\zeta)(\tilde h_1 \cdots \tilde h_m)$ is an FM decorated composition monomial, then
\[ M(F, T, \vec\lambda, \vec\zeta)(\tilde h_1 \cdots \tilde h_m)(k) = 0 \quad \text{if } |k - \zeta k_*| > m\delta_0, \tag{5.39} \]
where $\zeta$ satisfies (5.36). In particular, if $\delta_0 = \beta^{1-\epsilon}$ and $m \le C \ln \beta$, then for any $\delta_1 > 0$ there exists $\beta_0$ such that for $\beta < \beta_0$ we have $C\pi_0 \beta^{1-\epsilon} \ln \beta < \delta_1$ and
\[ M(F, T, \vec\lambda, \vec\zeta)(\tilde h_1 \cdots \tilde h_m)(k) = 0 \quad \text{when } |k - \zeta k_*| > C\pi_0 \beta^{1-\epsilon} \ln \beta. \tag{5.40} \]

Proof. To obtain (5.38) we apply Lemma 5.12 and use induction with respect to the rank of a monomial.

Remark 5.14. If $M(F, T, \vec\lambda, \vec\zeta)$ is NFM and $\tilde h(k)$ is a wavepacket localized near $\pm k_*$, then $M(F, T, \vec\lambda, \vec\zeta)(\tilde h^m)(k)$ is localized near the point $k_{\vec\zeta}$. As $\vec\zeta$ varies over $\{-1, 1\}^m$, such points $k_{\vec\zeta}$ lie on a straight line parallel to $k_*$. For $m \to \infty$ the closure of the set of such $k_{\vec\zeta}$ with a generic $k_*$ can be the entire torus $[-\pi, \pi]^d$, whereas for the case of $\vec\zeta$ corresponding to an FM monomial the closure is just the two points $\pm k_*$. Hence Property 1 is very useful and, in particular, allows one to avoid small denominators in coupling terms.

The following lemma shows that the FM interaction phase function of a single wavepacket has a critical point at its center or, in other words, FM monomials satisfy the group velocity matching condition (see [3, 6]).
Lemma 5.15. If a decorated operator $F^{(m)}_{\zeta,\vec\zeta}$ is FM, then the interaction phase function $\phi$ in (3.8) has a critical point:
\[ \nabla_{\vec k}\, \phi^{(m)}_{n,\zeta,\vec n,\vec\zeta}(\zeta k_*, \vec k_*) = 0 \quad \text{at } \vec k_* = (\zeta' k_*, \ldots, \zeta^{(m)} k_*). \tag{5.41} \]

Proof. For an FM decorated operator all indices $\zeta^{(j)} = \pm$ and
\[ n = n' = \cdots = n^{(m)} \quad \text{and} \quad \zeta = \zeta' + \cdots + \zeta^{(m)}. \tag{5.42} \]
Hence we obtain from (3.9) that
\[ \nabla_{k'}\, \phi_{n,\zeta,\vec n,\vec\zeta}(k, \vec k) = \zeta' \nabla_k \omega(k') - \zeta^{(m)} \nabla_k \omega(k - k' - \cdots - k^{(m-1)}). \]
Since $\zeta k_* - \zeta' k_* - \cdots - \zeta^{(m-1)} k_* = \zeta^{(m)} k_*$ and (2.16) implies
\[ \zeta \nabla_k \omega(\zeta k_*) = \zeta^{(m)} \nabla_k \omega(\zeta^{(m)} \zeta k_*) \quad \text{for } \zeta = \pm,\ \zeta^{(m)} = \pm, \tag{5.43} \]
we obtain the desired (5.41).

Now we consider NFM monomials and prove Property 2. First we note that (2.40) implies
\[ \omega_{n_l}(k_{*l}) \ge \omega_* > 0, \quad l = 1, \ldots, N_h. \tag{5.44} \]
If $k_{*l} = k_*$, $n_l = n_0$ satisfy Condition 2.13, then if (2.44) does not hold, (2.42) does not hold either; hence for $m \le m_F$
\[ \Big|\sum_{j=1}^{m} \zeta^{(j)} \omega_{n_0}(k_*) - \zeta \omega_n(k_{\vec\zeta})\Big| \ge \omega_* > 0, \qquad k_{\vec\zeta} = \sum_{j=1}^{m} \zeta^{(j)} k_*, \tag{5.45} \]
where $\omega_* > 0$ is a positive number (for notational simplicity we take the same small enough constant in (5.44) and (5.45)).

The following lemma, which is a version of the standard statement of the stationary phase method, shows that the action of an NFM monomial on a wavepacket produces a wave of small amplitude.

Lemma 5.16. Let the decoration projections be defined by (5.1). Assume that Condition 2.13 holds. Let the indices $\zeta, \zeta', \ldots, \zeta^{(m)}$ be NFM, that is, either one of them is $\infty$ or
\[ \zeta \ne \zeta' + \cdots + \zeta^{(m)}, \qquad \zeta^{(j)} = \pm 1, \quad \zeta = \pm 1. \tag{5.46} \]
Let $\delta_{NFM} > 0$ be small enough to satisfy
\[ \delta_{NFM} \max_{|k_{*l} - k| \le \delta_{NFM}} |\nabla \omega_l(k)| \le \frac{1}{4}\omega_*, \quad l = 1, \ldots, N_h, \tag{5.47} \]
where $\omega_*$ is given in (5.45). Let $k, k^{(j)}$ satisfy (3.12) and be such that
\[ \sum_{j=1}^{m} |k^{(j)} - \zeta^{(j)} k_*| \le \delta_{NFM}, \qquad |k - k_{\vec\zeta}| \le \delta_{NFM}, \tag{5.48} \]
November 28, 2006 11:15 WSPC/148-RMP
J070-00285
Linear Superposition in Nonlinear Wave Dynamics
1029
where kζ is defined by (5.34) and k∗ = k∗l satisfy the conditions (5.44) and (5.45). Let the functions u ˜j,ζ (j) (k, τ ) satisfy the condition u˜j,ζ (j) (k, τ ) = 0
when
ζ (j) = ∞
u˜j,ζ (j) (ζ (j) k∗ + s, τ ) = 0
when
|s| ≥ δNFM .
and
Then (m)
Fζ,ζ ,...,ζ (m) (˜ u1,ζ · · · u ˜ m,ζ (m) )E ≤
(5.49)
4 (m) 2m+1 χ CΞ ˜ uj E ω∗ j +
2τ∗ 2m+1 (m) CΞ χ ∂τ u ˜ i E ˜ uj E . ω∗ i j=i
(5.50) (m)
Proof. If one of the indices $\zeta',\ldots,\zeta^{(m)}$ equals $\infty$, then by (5.49) $\mathcal{F}^{(m)}_{\zeta,\zeta',\ldots,\zeta^{(m)}}=0$ and (5.50) is satisfied. Now we consider the case when all $\zeta,\zeta',\ldots,\zeta^{(m)}$ are finite. We denote for brevity $\omega_{n_0}=\omega$, $k_{*l}=k_*$ and $\phi_{n,\zeta,\vec n,\vec\zeta}=\phi$. Since (5.48) holds we get from (3.9) that
$$|\phi(k,\vec k)-\phi(k,\vec k_*)|\le|\omega(k')-\omega(\zeta'k_*)|+\cdots+|\omega(k^{(m)})-\omega(\zeta^{(m)}k_*)|\le\max_{|k_*-k|\le\delta_{\rm NFM}}|\nabla\omega(k)|\sum_{j=1}^{m}|k^{(j)}-\zeta^{(j)}k_*|\le\delta_{\rm NFM}\max_{|k_*-k|\le\delta_{\rm NFM}}|\nabla\omega(k)|.$$
Using (5.47), we conclude that
$$|\phi(k,\vec k)|\ge|\phi(k,\vec k_*)|-\frac{1}{4}|\omega_*|.\tag{5.51}$$
By (5.46), the condition (2.44) is not satisfied, therefore (5.45) holds and implies that
$$|\phi(k_{\vec\zeta},\vec k_*)|\ge\omega_*.\tag{5.52}$$
Using (5.52), (5.48) and (5.47) we conclude that
$$|\phi(k,\vec k_*)|\ge\omega_*-|\omega(k)-\omega(k_{\vec\zeta})|\ge\omega_*-\delta_{\rm NFM}\max_{|k_*-k|\le\delta_{\rm NFM}}|\nabla\omega(k)|\ge\frac{3}{4}\omega_*.\tag{5.53}$$
Together with (5.51) this inequality implies that when (5.48) holds we have the estimate
$$|\phi(k,\vec k)|\ge\frac{1}{2}\omega_*.\tag{5.54}$$
Now we note that the oscillatory factor in (3.8) satisfies
$$\exp\Big\{i\phi(k,\vec k)\frac{\tau_1}{\varrho}\Big\}=\frac{\varrho}{i\phi(k,\vec k)}\,\partial_{\tau_1}\exp\Big\{i\phi(k,\vec k)\frac{\tau_1}{\varrho}\Big\}.$$
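Integrating (3.8) by parts with respect to $\tau_1$ we obtain
$$\begin{aligned}
\mathcal{F}^{(m)}_{\zeta,\vec\zeta}(\tilde u_1\cdots\tilde u_m)(k,\tau)
&=\int_{D_m}\frac{\varrho\exp\{i\phi(k,\vec k)\frac{\tau}{\varrho}\}}{i\phi(k,\vec k)}\,\chi^{(m)}_{\zeta,\vec\zeta}(k,\vec k)\,\tilde u_{1,\zeta'}(k',\tau)\cdots\tilde u_{m,\zeta^{(m)}}(k^{(m)}(k,\vec k),\tau)\,\tilde d^{(m-1)d}\vec k\\
&\quad-\int_{D_m}\frac{\varrho}{i\phi(k,\vec k)}\,\chi^{(m)}_{\zeta,\vec\zeta}(k,\vec k)\,\tilde u_{1,\zeta'}(k',0)\cdots\tilde u_{m,\zeta^{(m)}}(k^{(m)}(k,\vec k),0)\,\tilde d^{(m-1)d}\vec k\\
&\quad-\int_0^\tau\int_{D_m}\frac{\varrho\exp\{i\phi(k,\vec k)\frac{\tau_1}{\varrho}\}}{i\phi(k,\vec k)}\,\chi^{(m)}_{\zeta,\vec\zeta}(k,\vec k)\,\partial_{\tau_1}\big[\tilde u_{1,\zeta'}(k')\cdots\tilde u_{m,\zeta^{(m)}}(k^{(m)}(k,\vec k))\big]\,\tilde d^{(m-1)d}\vec k\,d\tau_1.
\end{aligned}\tag{5.55}$$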
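The mechanism behind (5.55) can be seen in a scalar toy model, where the integral over $D_m$ is dropped and only the time integration by parts is kept. The sketch below is an illustration under stated assumptions, not the paper's operator: the names `oscillatory_time_integral` and `integration_by_parts_bound`, the test amplitude `u`, and the numerical values of `phi` and `rho` are all hypothetical. It compares $J=\int_0^\tau e^{i\phi\tau_1/\varrho}u(\tau_1)\,d\tau_1$ with the bound $(\varrho/|\phi|)\,(|u(\tau)|+|u(0)|+\int_0^\tau|u'|\,d\tau_1)$ that follows from one integration by parts.

```python
import numpy as np


def trapezoid(values, grid):
    """Composite trapezoid rule for a sampled (possibly complex) integrand."""
    return np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid))


def oscillatory_time_integral(phi, rho, u, tau, n=200_001):
    """Approximate J = int_0^tau exp(i*phi*t/rho) * u(t) dt by quadrature."""
    t = np.linspace(0.0, tau, n)
    return trapezoid(np.exp(1j * phi * t / rho) * u(t), t)


def integration_by_parts_bound(phi, rho, u, du, tau, n=200_001):
    """Bound (rho/|phi|) * (|u(tau)| + |u(0)| + int_0^tau |u'(t)| dt)."""
    t = np.linspace(0.0, tau, n)
    boundary = abs(u(tau)) + abs(u(0.0))
    return rho / abs(phi) * (boundary + trapezoid(np.abs(du(t)), t))


if __name__ == "__main__":
    # Hypothetical smooth amplitude and a phase bounded away from zero.
    u = lambda t: 1.0 + 0.5 * np.sin(3.0 * t)
    du = lambda t: 1.5 * np.cos(3.0 * t)
    for rho in (0.1, 0.01, 0.001):
        J = oscillatory_time_integral(0.5, rho, u, 1.0)
        B = integration_by_parts_bound(0.5, rho, u, du, 1.0)
        print(f"rho={rho:g}  |J|={abs(J):.3e}  bound={B:.3e}")
```

In this toy setting the lower bound $|\phi|\ge\omega_*/2$ of (5.54) plays the role of the fixed value `phi`, and the printed values shrink proportionally to `rho`.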
Estimating the denominator by (5.54) and using (3.5) we obtain (5.50).

Finally, we consider the case when $\zeta=\infty$ and all remaining indices $\zeta^{(j)}$ equal $\pm$. We expand $\Pi_\infty$ into a sum of $\Pi_{n,\zeta}$ as in (4.38). In this case $\chi^{(m)}_{\zeta,\vec\zeta}(k,\vec k)$ involves a projection $\Pi_{n,\zeta}$ with $n\ne n_0$ (the oscillatory integral may involve $N_h-1$ terms with such $n$). For a fixed $n$ the corresponding phase function $\phi(k,\vec k)$ takes the form
$$\phi(k,\vec k)=\phi_{n,\zeta,\vec n,\vec\zeta}(k,\vec k)=\zeta\omega_n(k)-\zeta'\omega_{n_0}(k')-\cdots-\zeta^{(m)}\omega_{n_0}(k^{(m)}).$$
Using again (5.45) (now with $n\ne n_0$) we obtain that (5.52) holds. This implies (5.54) as above provided $\delta_{\rm NFM}$ is small enough. Hence, the relation (5.55) holds, implying readily the desired bound (5.50).

5.3.2. FM and NFM monomials for SI oscillatory integrals

The theorem below shows that NFM monomials are of order $O(\varrho)$ as $\varrho\to0$. We begin with the following statement.

Lemma 5.17. Assume that Condition 2.13 holds. Let a monomial $S=\mathcal{F}^{(s)}_{\zeta}(M_{1,\zeta^{(1)}}\cdots M_{s,\zeta^{(s)}})$ have all submonomials $M_{1,\zeta^{(1)}},\ldots,M_{s,\zeta^{(s)}}$ satisfying the FM condition (5.36), while $S$ itself is not FM. Assume that $S$ is applied to wavepackets $\tilde h_l$ which satisfy Definition 2.9 and
$$\tilde h_{l,\zeta}(k,\beta)=0\quad\text{if }|k-\zeta k_{*l}|\ge\pi_0\beta^{1-\epsilon},\quad\zeta=\pm.\tag{5.56}$$
Then
$$\|S\|_E\le\frac{4\varrho\|\chi^{(s)}\|}{|\omega(k_*)|}C_\Xi^{2s+1}\prod_j\|M_{j,\zeta^{(j)}}\|_E+\frac{4\varrho\tau_*\|\chi^{(s)}\|}{|\omega(k_*)|}C_\Xi^{2s+1}\sum_{i=1}^{s}\|\partial_\tau M_{i,\zeta^{(i)}}\|_E\prod_{j\ne i}\|M_{j,\zeta^{(j)}}\|_E,\quad E=C([0,\tau_*],L_1).\tag{5.57}$$
Proof. Since $M_{1,\zeta^{(1)}},\ldots,M_{s,\zeta^{(s)}}$ are decorated FM submonomials, we can use Lemma 5.12 and Corollary 5.13. Applying Corollary 5.13 and using (5.12) we obtain that
$$M_{l,\zeta^{(l)}}(k^{(l)},\tau_1)=0\quad\text{when }|k^{(l)}-\zeta^{(l)}k_*|>\nu(M_{l,\zeta^{(l)}})\beta^{1-\epsilon}\pi_0,\quad l=1,\ldots,s,\tag{5.58}$$
where $\nu(M)$ is the homogeneity index of $M$. Consider now the oscillatory integral (3.8) which determines $S$, namely
$$\mathcal{F}^{(s)}_{\zeta,\vec\zeta}(M_{1,\zeta^{(1)}}\cdots M_{s,\zeta^{(s)}})(k,\tau)=\int_0^\tau\int_{D_s}\exp\Big\{i\phi_{\zeta,\vec\zeta}(k,\vec k)\frac{\tau_1}{\varrho}\Big\}\chi^{(s)}_{\zeta,\vec\zeta}(k,\vec k)\,M_{1,\zeta^{(1)}}(k',\tau_1)\cdots M_{s,\zeta^{(s)}}(k^{(s)}(k,\vec k),\tau_1)\,\tilde d^{(s-1)d}\vec k\,d\tau_1.\tag{5.59}$$
We apply Lemma 5.16 where, according to (5.58) and (5.62), $\delta_{\rm NFM}=m\beta^{1-\epsilon}\pi_0$. According to (5.50)
$$\|S\|_E=\|\mathcal{F}^{(s)}_{\zeta,\vec\zeta}(M_{1,\zeta^{(1)}}\cdots M_{s,\zeta^{(s)}})(k,\tau)\|_E\le\frac{4\varrho\|\chi^{(s)}\|}{|\omega(k_*)|}C_\Xi^{2s+1}\prod_j\|M_{j,\zeta^{(j)}}\|_E+\frac{4\varrho\tau_*\|\chi^{(s)}\|}{|\omega(k_*)|}C_\Xi^{2s+1}\sum_{i=1}^{s}\|\partial_\tau M_{i,\zeta^{(i)}}\|_E\prod_{j\ne i}\|M_{j,\zeta^{(j)}}\|_E,\quad E=C([0,\tau_*],L_1),\tag{5.60}$$
which implies (5.57).

Theorem 5.18. Suppose that (i) the inequalities (5.44) hold; (ii) $\tilde h_l$ are wavepackets in the sense of Definition 2.9; (iii) the relations (5.56) hold; (iv) the projections are defined by (5.1); (v) Condition 2.13 holds. Then an NFM decorated monomial based on oscillatory integrals $\mathcal{F}$ defined by (3.4) satisfies the estimate
$$\|M(\mathcal{F},T,\vec\lambda,\vec\zeta)(\tilde h_1\cdots\tilde h_m)\|_{C([0,\tau_*],L_1)}\le\frac{4\varrho\tau_*^{i-1}[1+m]}{|\omega(k_*)|}C_\Xi^{2i+e}\prod_{N\in T,\,r(N)>0}\|\chi^{(\mu(N))}\|\prod_{l=1}^{m}\|\tilde h_{l,\zeta^{(l)}}\|_{C([0,\tau_*],L_1)},\tag{5.61}$$
where $i$, $m$ and $e$ are respectively the incidence number, the homogeneity index and the number of edges of $T$.

Proof. Let $M(\mathcal{F},T,\vec\lambda^{(q)},\vec\zeta^{(m)})(\tilde h_1\cdots\tilde h_m)$ be an NFM decorated $m$-homogeneous monomial. We find a decorated submonomial $S=M(\mathcal{F},T(N_0),\vec\lambda^{(q)},\vec\zeta^{(m)})$ of $M(\mathcal{F},T,\vec\lambda^{(q)},\vec\zeta^{(m)})$ with such $N_0$ that $S$ is NFM and has minimal rank among all NFM submonomials. We denote by $r_0$ the rank of $S$, by $i$ its incidence number and by $s=\nu(S)=\nu(T(N_0))$ its homogeneity index. This monomial has the form $S=\mathcal{F}^{(s)}_{\zeta}(M_{1,\zeta^{(1)}}\cdots M_{s,\zeta^{(s)}})$. Since the rank is minimal, all decorated submonomials $M_{l,\zeta^{(l)}}$ are FM and their ranks do not exceed $r_0-1$. Then according to (4.21) their homogeneity indices satisfy
$$\nu(M_{1,\zeta^{(1)}})+\cdots+\nu(M_{s,\zeta^{(s)}})=s\le m.\tag{5.62}$$
Applying Lemma 5.17 we obtain (5.57). Now we use Lemmas 5.1 and 5.2. Applying Lemma 4.24 we obtain
$$\|M(\{\mathcal{F}\},T,\Gamma)(\tilde h_1\cdots\tilde h_m)\|_E\le\|S\|_E\prod_{N\in T\setminus T(N_0),\,r(N)>0}\|\mathcal{F}^{(\mu(N))}_{\Gamma(N)}\|\prod_{l<\kappa}\|\tilde h_{l,\zeta^{(l)}}\|_E\prod_{l\ge\kappa+\nu(T(N_0))}\|\tilde h_{l,\zeta^{(l)}}\|_E.$$
Note that the norm of $\mathcal{F}^{(\mu(N))}_{\Gamma(N)}$ is estimated by (5.2) and the norm of $S$ by (5.57). In turn, we estimate the right-hand side of (5.57) using (5.2) and (5.3). Taking into account that $s\le m$ in the sum in (5.60), we get the estimate (5.61).

We also consider the case when Condition 2.13 does not hold and Condition 2.23 holds. In this case we give an alternative definition of FM and NFM decorated monomials.

Definition 5.19 (Alternative Frequency Matching). We call a decorated composition monomial $M(\mathcal{F},T,\vec\lambda,\vec\zeta)$ alternatively frequency matched (AFM) if (i) every node of $T$ has an odd number of child nodes (at least three); (ii) for every non-end node $N\in T$ the corresponding decorated submonomial $M(\mathcal{F},T(N),\vec\lambda,\vec\zeta)=\mathcal{F}^{(m')}_{\lambda}(M_{1,\zeta'}\cdots M_{m',\zeta^{(m')}})$ satisfies (5.31) and
$$\operatorname{sign}\Big(\sum_{j=1}^{m'}\zeta^{(j)}\Big)=\lambda,\tag{5.63}$$
where $\lambda,\zeta^{(j)}\in\Lambda$ are defined by (4.41) and we identify $\pm$ with $\pm1$. A decorated composition monomial which is not AFM is called an alternatively not frequency matched (ANFM) monomial.

Now we prove a statement analogous to Theorem 5.18 when Condition 2.23 holds.

Theorem 5.20. Assume that the assumptions of Theorem 5.18 hold with Condition 2.13 replaced by Condition 2.23. Then (5.61) holds.

Proof. According to Corollary 5.13, if $\tilde h_{l_1}=\cdots=\tilde h_{l_m}=\tilde h_l$ satisfy Definition 2.9 and (5.56), then $M(\mathcal{F},T,\vec\lambda,\vec\zeta)(\tilde h_{l_1}\cdots\tilde h_{l_m})=M(\mathcal{F},T,\vec\lambda,\vec\zeta)(\tilde h_{l_1,\zeta'}\cdots\tilde h_{l_m,\zeta^{(m)}})$ has support in an $m\beta^{1-\epsilon}$ vicinity of $k_{\vec\zeta}=\nu k_*$ defined by (5.34), where $\nu$ and $m$ are odd integers, $m\ge3$, $\nu=\zeta'+\cdots+\zeta^{(m)}$. Let $S=M(\mathcal{F},T',\vec\lambda,\vec\zeta)$ be a minimal ANFM submonomial of $M(\mathcal{F},T,\vec\lambda,\vec\zeta)$, that is, if $T''\subset T'$ then $M(\mathcal{F},T'',\vec\lambda,\vec\zeta)$ is an AFM submonomial of $S$.
The monomial $S$ has the form of (5.59) with the interaction phase function
$$\phi_{\zeta,\vec\zeta}(k,\vec k)=\zeta\omega_n(k)-\zeta'\omega_{n_0}(k')-\cdots-\zeta^{(s)}\omega_{n_0}(k^{(s)}).\tag{5.64}$$
The integrand is non-zero near $k^{(j)}=\nu_jk_*$, and applying (5.63) to every AFM submonomial we get
$$\zeta^{(l)}=\operatorname{sign}(\nu_l).\tag{5.65}$$
Using (2.96) and (2.16) we obtain
$$\begin{aligned}
\phi_{\zeta,\vec\zeta}(\nu k_*,\vec k_*)&=\zeta\omega_n(\nu k_*)-\zeta'\omega_{n_0}(\nu_1k_*)-\cdots-\zeta^{(s)}\omega_{n_0}(\nu_sk_*)\\
&=\zeta\omega_n(\nu k_*)-\operatorname{sign}(\nu_1)|\nu_1|\omega_{n_0}(k_*)-\cdots-\operatorname{sign}(\nu_s)|\nu_s|\omega_{n_0}(k_*)\\
&=\zeta|\nu|\omega_n(k_*)-(\nu_1+\cdots+\nu_s)\omega_{n_0}(k_*),\quad\nu=\nu_1+\cdots+\nu_s.
\end{aligned}\tag{5.66}$$
Therefore, since $S$ is ANFM, $\zeta\ne\operatorname{sign}(\nu)$, and since $\nu$ is odd,
$$\phi_{\zeta,\vec\zeta}(\nu k_*,\vec k_*)=-2\nu\omega_{n_0}(k_*)\ne0,\tag{5.67}$$
therefore (5.52) holds. We can repeat the proofs of Lemmas 5.16 and 5.17 and obtain (5.57). From (5.57), we obtain (5.61) as in the proof of Theorem 5.18.

Below we give estimates for the derivatives with respect to $k$ of a composition monomial applied to a wavepacket. Note that (2.35) admits a singular dependence on $\beta$ of wavepackets $\tilde h_\zeta(\beta,k)$. This type of dependence also comes naturally from explicit formulas such as (2.36), which yield that the first derivative with respect to $k$ has a factor $\beta^{-1}$. Below we estimate the dependence on $\beta$ of monomials applied to wavepackets and show that they have the same type of singularity. Observe that by (5.13) all the points $k_{*l}$ are at distance at least $2\pi_0$ from $\sigma$. Hence, according to Definition 2.3 and (2.28),
$$\max_{|k\pm k_{*l}|\le\pi_0,\ l=1,\ldots,N_h}\big(|\nabla_k^2\omega|+|\nabla_k\omega|\big)\le C_{\omega,2},\tag{5.68}$$
$$\max_{|k\pm k_{*l}|\le\pi_0,\ l=1,\ldots,N_h}\big|\nabla\chi^{(m)}_{\zeta,\vec\zeta}(k,k',\ldots,k^{(m)})\big|\le C_\chi C_\Xi^{m+1}.\tag{5.69}$$
The following seemingly technical lemma describes a very important property of solutions. It shows that the $k$-gradient of solutions behaves, roughly speaking, as the gradient of the initial data. The corresponding estimates play a crucial role in the control of smallness of the interaction of different wavepackets.

Lemma 5.21. Let $M(\mathcal{F},T,\vec\lambda,\vec\zeta)(\tilde h_{l_1}\cdots\tilde h_{l_m})$ be a decorated monomial which is SI. Assume that $\tilde h_{l_j}=\tilde h_l$ are wavepackets satisfying Definition 2.9, (5.56) and (5.19), that (2.46) holds and
$$\beta^{1-\epsilon}m\le\pi_0.\tag{5.70}$$
Assume that either Condition 2.13 holds and the monomial is FM or Condition 2.23 holds and the monomial is AFM. Then
$$\|\nabla_kM(\mathcal{F},T,\vec\lambda,\vec\zeta)(\tilde h_{l_1}\cdots\tilde h_{l_m})\|_E\le CC_\chi\tau_*^{i}C_\Xi^{2i+e}C_\chi^{i-1}R^{m-1}\beta^{-1-\epsilon}m^2,\tag{5.71}$$
where $E=C([0,\tau_*],L_1)$, $\tau_*\le1$, with $i=i(T)$ and $e=e(T)$ being respectively the incidence number and the number of edges of $T$.

Proof. We use induction with respect to the incidence number $i$ of a tree $T$. First, we consider the case when Condition 2.13 holds and $M(\mathcal{F},T,\vec\lambda,\vec\zeta)$ is FM. For $i=0$, (5.71) follows from (2.35). Now we assume that (5.71) holds for incidence numbers less than $i$ and prove it when the incidence number equals $i$. Since the arguments of $M(\mathcal{F},T,\vec\lambda,\vec\zeta)$ are SI, according to Definition 5.8, $l_1=\cdots=l_m=l$. It is sufficient to prove the boundedness of
$$\|\nabla_kM(\mathcal{F},T,\vec\lambda,\vec\zeta)(\tilde h_l^m)\|_E=\|\nabla_k\mathcal{F}^{(s)}_{\lambda}(M_{1,\zeta'}\cdots M_{s,\zeta^{(s)}})\|_E,$$
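where $M_1,\ldots,M_s$ are decorated submonomials, $M_{j,\zeta}=\Pi_\zeta M_j$. Let the submonomials have incidence numbers $i_1,\ldots,i_s$ and homogeneities $m_1,\ldots,m_s$ respectively, satisfying
$$i_1+\cdots+i_s=i-1,\quad m_1+\cdots+m_s=m.\tag{5.72}$$
We have by (3.8)
$$\nabla_k\mathcal{F}^{(s)}_{\lambda}(M_{1,\zeta'}\cdots M_{s,\zeta^{(s)}})(k,\tau)=\nabla_k\int_0^\tau\int_{[-\pi,\pi]^{(s-1)d}}\exp\Big\{i\phi_{\lambda,\vec\zeta}(k,\vec k)\frac{\tau_1}{\varrho}\Big\}\chi^{(s)}_{\lambda,\vec\zeta}(k,\vec k)\,M_{1,\zeta'}(k')\cdots M_{s,\zeta^{(s)}}(k^{(s)}(k,\vec k))\,\tilde d^{(s-1)d}\vec k\,d\tau_1.\tag{5.73}$$
By the Leibniz formula,
$$\nabla_k\mathcal{F}^{(s)}_{\lambda}(M_{1,\zeta'}\cdots M_{s,\zeta^{(s)}})(k,\tau)=I_1+I_2+I_3,$$
where
$$\begin{aligned}
I_1&=\int_0^\tau\int_{[-\pi,\pi]^{(s-1)d}}\Big[\nabla_k\exp\Big\{i\phi_{\lambda,\vec\zeta}(k,\vec k)\frac{\tau_1}{\varrho}\Big\}\Big]\chi^{(s)}_{\lambda,\vec\zeta}(k,\vec k)\,M_{1,\zeta'}(k')\cdots M_{s,\zeta^{(s)}}(k^{(s)}(k,\vec k))\,\tilde d^{(s-1)d}\vec k\,d\tau_1,\\
I_2&=\int_0^\tau\int_{[-\pi,\pi]^{(s-1)d}}\exp\Big\{i\phi_{\lambda,\vec\zeta}(k,\vec k)\frac{\tau_1}{\varrho}\Big\}\big[\nabla_k\chi^{(s)}_{\lambda,\vec\zeta}(k,\vec k)\big]M_{1,\zeta'}(k')\cdots M_{s,\zeta^{(s)}}(k^{(s)}(k,\vec k))\,\tilde d^{(s-1)d}\vec k\,d\tau_1,\\
I_3&=\int_0^\tau\int_{[-\pi,\pi]^{(s-1)d}}\exp\Big\{i\phi_{\lambda,\vec\zeta}(k,\vec k)\frac{\tau_1}{\varrho}\Big\}\chi^{(s)}_{\lambda,\vec\zeta}(k,\vec k)\,M_{1,\zeta'}(k')\cdots\nabla_kM_{s,\zeta^{(s)}}(k^{(s)}(k,\vec k))\,\tilde d^{(s-1)d}\vec k\,d\tau_1.
\end{aligned}\tag{5.74}$$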
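By (5.5),
$$\|M_{j,\zeta^{(j)}}(k^{(j)})\|_{L_1}\le C\tau^{i^{(j)}}C_\Xi^{2i^{(j)}+e^{(j)}}C_\chi^{i^{(j)}}R^{m_j},\quad j=1,\ldots,s.\tag{5.75}$$
Using (3.5), (5.75), (5.72) and the induction assumption we get
$$|I_3|\le\|\chi^{(s)}\|\prod_{j=1}^{s-1}\|M_{j,\zeta^{(j)}}(k^{(j)})\|_E\int_0^\tau\|\nabla_kM_{s,\zeta^{(s)}}\|_E\,d\tau_1\le CC_1^mR^{m-1}\tau^{i}C_\Xi^{2i+e}C_\chi^{i}\beta^{-1-\epsilon}.\tag{5.76}$$
From (5.75) and the smoothness of $\chi^{(s)}(k,\vec k)$ we get
$$|I_2|\le C\beta^{-1-\epsilon}\tau^{i}C_1^mC_\Xi^{2i+e}C_\chi^{i}R^m.\tag{5.77}$$
Now we estimate $I_1$. Using (3.9) we obtain
$$I_1=\int_0^\tau\int_{[-\pi,\pi]^{(s-1)d}}\exp\Big\{i\phi_{\lambda,\vec\zeta}(k,\vec k)\frac{\tau_1}{\varrho}\Big\}\frac{\tau_1}{\varrho}\big[-\lambda\nabla_k\omega(k)+\zeta^{(s)}\nabla_k\omega(k^{(s)}(k,\vec k))\big]\chi^{(s)}_{\lambda,\vec\zeta}(k,\vec k)\,M_{1,\zeta'}(k')\cdots M_{s,\zeta^{(s)}}(k^{(s)}(k,\vec k))\,\tilde d^{(s-1)d}\vec k\,d\tau_1.\tag{5.78}$$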
The difficulty in the estimation of the integral $I_1$ comes from the factor $\tau_1/\varrho$, since $\varrho$ is small. Note that according to (2.46) $\beta^2/\varrho\le C$. Since $M(\mathcal{F},T,\vec\lambda,\vec\zeta)$ is FM, every submonomial of it is FM too and we can apply Corollary 5.13 to them, which yields
$$M_{j,\zeta^{(j)}}(k^{(j)})=0\quad\text{for }|k^{(j)}-\zeta^{(j)}k_*|>m_j\pi_0\beta^{1-\epsilon},\quad j=1,\ldots,s.$$
Hence, it is sufficient to estimate $I_1$ for
$$|k^{(j)}-\zeta^{(j)}k_*|\le\delta_1=m\pi_0\beta^{1-\epsilon}\quad\text{for all }j.\tag{5.79}$$
According to Lemma 5.15, since $\lambda,\vec\zeta$ are FM,
$$\nabla_k\phi_{\lambda,\vec\zeta}(\lambda k_*,\vec k_*)=\big[-\lambda\nabla_k\omega(k_*)+\zeta^{(s)}\nabla_k\omega(k^{(s)}(k_*,\vec k_*))\big]=0.\tag{5.80}$$
Using (5.68) we conclude that in a vicinity of $\vec k_*$ defined by (5.79) we have
$$\big|\big[-\lambda\nabla_k\omega(k)+\zeta^{(s)}\nabla_k\omega(k^{(s)}(k,\vec k))\big]\big|\le2(s+1)C_{\omega,2}\delta_1.$$
This yields the estimate
$$|I_1|\le CC_\Xi^{2i+e}\tau^{i}C_\chi^{i}C_1^m\beta^{-1-\epsilon}m^2R^m.\tag{5.81}$$
Combining (5.81), (5.77) and (5.76) we obtain (5.71) and the induction is completed.

Now we consider the case when Condition 2.23 holds and the monomial is AFM. Note that according to Corollary 5.13, the submonomials $M_{j,\zeta^{(j)}}$ have supports near $\nu_jk_*$, with odd $\nu_j$. By Lemma 5.12 the monomial itself is non-zero near $\nu k_*$,
$\nu=\nu_1+\cdots+\nu_s$; since $s$ is odd, $\nu$ is odd too. Obviously, one of the $\nu_j$ has the same sign as $\nu$; we assume that $j=s$, that is
$$\operatorname{sign}(\nu_s)=\operatorname{sign}(\nu_1+\cdots+\nu_s)=\operatorname{sign}(\nu);\tag{5.82}$$
the general case can be reduced to this by a relabeling of variables. The interaction phase function is given by (5.64) and, since the submonomials are AFM, (5.65) holds. According to (2.16), $(\nabla_k\omega)(-k)=-(\nabla_k\omega)(k)$. Therefore, using (2.95) we obtain
$$\begin{aligned}
\nabla_k\phi_{\lambda,\vec\zeta}(\nu k_*,\vec k_*)&=\lambda\nabla_k\omega(\nu k_*)-\zeta^{(s)}\nabla_k\omega(\nu_sk_*)\\
&=\lambda(\nabla_k\omega)(\operatorname{sign}(\nu)|\nu|k_*)-\zeta^{(s)}\nabla_k\omega(\operatorname{sign}(\nu_s)|\nu_s|k_*)\\
&=\lambda(\nabla_k\omega)(\operatorname{sign}(\nu)k_*)-\zeta^{(s)}\nabla_k\omega(\operatorname{sign}(\nu_s)k_*)\\
&=\big(\lambda\operatorname{sign}(\nu)-\zeta^{(s)}\operatorname{sign}(\nu_s)\big)(\nabla_k\omega)(k_*).
\end{aligned}$$
Using (5.65) we conclude that
$$\nabla_k\phi_{\lambda,\vec\zeta}(\nu k_*,\vec k_*)=0,\quad\vec k_*=(\nu_1k_*,\ldots,\nu_sk_*).\tag{5.83}$$
Using (5.83) instead of (5.80), we conclude as in the first half of the proof that (5.71) holds in the AFM case too.

5.4. Properties of minimal CI monomials

Here we consider CI evaluated monomials with arguments involving different wavepackets $\tilde h_l$. Since the group velocities of wavepackets are different, namely (2.41) is satisfied, there exists $p_0>0$ such that
$$|\nabla\omega(k_{*l_1})-\nabla\omega(k_{*l_2})|\ge p_0>0\quad\text{if }l_1\ne l_2.\tag{5.84}$$
The next lemma is a standard implication of the stationary phase method in the case when the phase function has no critical points in the domain of integration, namely when (2.41) holds.

Lemma 5.22. Let $k_{*l}$ and $\omega_n$ be generic in the sense of Definition 2.24. Let $\mathcal{F}^{(m)}$ be defined by (3.4) and $m(\beta)$ be as in (5.23). We assume that (2.28) and (2.41) hold. We also assume that (5.19), (5.56), (2.34), (2.35) and (2.46) hold. We assume that $M(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})$ is a monomial with homogeneity index $m$ evaluated at arguments with a CI multiindex $l_1,\ldots,l_m$, but every evaluated submonomial of $M(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})$ is SI. Then for $m\le m(\beta)$ and small $\beta$
$$\|M(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})\|_E\le\tau_*^{i-1}C_\Xi^{2i+e}3^{2m}C_\chi^{i}\Big[\frac{C\varrho|\ln\beta|}{p_0\beta^{1+\epsilon}}+\beta\Big]m^2R^{m-1},\tag{5.85}$$
where $i$ and $e$ are respectively the incidence number and the number of edges of $T$, and $R$ is as in (5.19).

Proof. Since $k_{*l}$ are not band-crossing points, the relations (5.69) and (5.68) hold. We expand $M(\mathcal{F},T)$ into a sum of decorated monomials $M(\mathcal{F},T,\vec\lambda,\vec\zeta)$ as in (4.51), which contains no more than $3^{i(T)+m}$ terms, and $i(T)+m\le2m$. The submonomials of every decorated monomial are SI by the assumption of the lemma. If Condition 2.13 holds, the submonomials are either FM or NFM; if Condition 2.23 holds, the submonomials are either AFM or ANFM. If a decorated submonomial $M(\mathcal{F},T',\vec\lambda',\vec\zeta')$ is NFM we use Theorem 5.18 and obtain from (5.61) the inequality
$$\|M(\mathcal{F},T',\vec\lambda',\vec\zeta')(\tilde h_{l_{j'+1}}\cdots\tilde h_{l_{j'+m'}})\|_E\le C\varrho\tau_*^{i'-1}[1+m]C_\Xi^{2i'+e'}C_\chi^{i'}R^{m'},\tag{5.86}$$
where $i'$ and $e'$ are the incidence number and number of edges of the subtree $T'$. Alternatively, if Condition 2.23 holds and a decorated monomial $M(\mathcal{F},T',\vec\lambda',\vec\zeta')$ is ANFM, we use Theorem 5.20 and obtain from (5.61) the inequality (5.86). Using (5.86) in both cases we obtain
$$\|M(\mathcal{F},T,\vec\lambda,\vec\zeta)(\tilde h_{l_1}\cdots\tilde h_{l_m})\|_E\le C\varrho\tau_*^{i-1}C_\Xi^{2i+e}C_\chi^{i}mR^m.\tag{5.87}$$
Now we consider the case when Condition 2.13 holds and every submonomial of $M(\mathcal{F},T,\vec\lambda,\vec\zeta)$ is FM. We write the integral with respect to $\tau_1$ in (5.59) as a sum of two integrals, from $0$ to $\beta$ and from $\beta$ to $\tau$, namely
$$\mathcal{F}^{(s)}_{\zeta,\vec\zeta}(M_1\cdots M_s)(k,\tau)=F_1+F_2,\quad F_1=\int_\beta^\tau\int_{D_m}\exp\Big\{i\phi_{\zeta,\vec\zeta}(k,\vec k)\frac{\tau_1}{\varrho}\Big\}A^{(s)}_{\zeta,\vec\zeta}(k,\vec k)\,\tilde d^{(s-1)d}\vec k\,d\tau_1,\quad F_2=\int_0^\beta\cdots\,d\tau_1,\tag{5.88}$$
where
$$A^{(s)}_{\zeta,\vec\zeta}(k,\vec k)=\chi^{(s)}_{\zeta,\vec\zeta}(k,\vec k)\,M_1(k')\cdots M_s(k^{(s)}),\tag{5.89}$$
and $M_j$ are submonomials of $M$. According to Corollary 5.2 with $\tau_*=\beta$
$$\|F_2\|_{L_1}\le2C_\Xi^{1+2s}C_\chi\beta\prod_{j=1}^{s}\|M_j\|_E\le\beta C_\Xi^{e+2i}\tau_*^{i-1}C_\chi\prod_{j=1}^{m}\|\tilde h_{l_j}\|_E\le\beta C_\chi C_\Xi^{e+2i}\tau_*^{i-1}R^m.\tag{5.90}$$
Now we estimate $F_1$. Since $M(\mathcal{F},T)$ is CI, there are two SI submonomials $M_{j_1}$ and $M_{j_2}$ applied to $(\tilde h_{l_{j_1}})^{m_1}$ and $(\tilde h_{l_{j_2}})^{m_2}$ with $l_{j_1}\ne l_{j_2}$. Let us assume that $l_{j_1}=l_1$, $l_{j_2}=l_s$ (the general case can be easily reduced to it by a relabeling of variables). We denote
$$\phi'=\nabla_{k'}\phi_{\zeta,\vec\zeta}(k,\vec k_*)=\nabla_{k'}\omega(k_{*l_1})-\nabla_{k^{(s)}}\omega(k_{*l_s})\ne0,\quad p=\phi'/|\phi'|.\tag{5.91}$$
By (5.84) and (5.43) we obtain
$$|p\cdot\nabla_{k'}\phi_{\zeta,\vec\zeta}(k,\vec k_*)|\ge p_0>0\quad\text{for }\vec k=\vec k_*=(k_{*l_1},\ldots,k_{*l_s}).\tag{5.92}$$
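Note that
$$\exp\Big\{i\phi_{\zeta,\vec\zeta}(k,\vec k)\frac{\tau_1}{\varrho}\Big\}=\frac{\varrho\,p\cdot\nabla_{k'}\exp\{i\phi_{\zeta,\vec\zeta}(k,\vec k)\frac{\tau_1}{\varrho}\}}{i\,p\cdot\nabla_{k'}\phi_{\zeta,\vec\zeta}(k,\vec k)\,\tau_1}.$$
Using this identity, (2.25) and integrating by parts the integral which defines $F_1$ in (5.88) we obtain
$$F_1=\int_\beta^\tau I(k,\tau_1)\,d\tau_1,\quad I(k,\tau_1)=\int_{D_m}\exp\Big\{i\phi_{\zeta,\vec\zeta}(k,\vec k)\frac{\tau_1}{\varrho}\Big\}A^{(s)}_{\zeta,\vec\zeta}(k,\vec k)\,\tilde d^{(s-1)d}\vec k=-\varrho\int_{D_s}\exp\Big\{i\phi_{\zeta,\vec\zeta}(k,\vec k)\frac{\tau_1}{\varrho}\Big\}\,p\cdot\nabla_{k'}\Bigg[\frac{A^{(s)}_{\zeta,\vec\zeta}(k,\vec k)}{i\tau_1\nabla_{k'}\phi_{\zeta,\vec\zeta}(k,\vec k)\cdot p}\Bigg]\tilde d^{(s-1)d}\vec k.\tag{5.93}$$
From (5.56), Lemma 5.12 and Corollary 5.13 we see that in the integral $I(k,\tau_1)$ the integrands are non-zero only if
$$|k^{(j)}-\zeta^{(j)}k_*^{(j)}|\le m_j\pi_0\beta^{1-\epsilon},\quad|k-\zeta k_*|\le m\pi_0\beta^{1-\epsilon},\quad m_1+\cdots+m_s\le m,\tag{5.94}$$
where $\pi_0\le1$. Using the Taylor remainder estimate for $\phi_{\zeta,\vec\zeta}$ at $\vec k_*$ we obtain the inequality
$$|\nabla_{k'}\phi_{\zeta,\vec\zeta}(k,\vec k)-\phi'|\le3m\beta^{1-\epsilon}C_{\omega,2}\quad\text{if (5.94) holds}.\tag{5.95}$$
Suppose that $\beta$ is small and satisfies
$$3m\beta^{1-\epsilon}C_{\omega,2}\le\frac{p_0}{2}.\tag{5.96}$$
Condition (5.96) is satisfied for small $\beta$ if $m\le m(\beta)$ as in (5.23). Using (5.95) we derive from (5.92), (5.96) and (5.56) that
$$|p\cdot\nabla_{k'}\phi_{\zeta,\vec\zeta}(k,\vec k)|\ge\frac{p_0}{2}>0\quad\text{if (5.94) holds}.\tag{5.97}$$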
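Now we use (5.97) to estimate denominators, (5.68) to estimate second $k$ derivatives of $\omega$ and (5.69) to estimate $\nabla_k\chi$. We conclude that
$$\begin{aligned}
|I(k,\tau_1)|&\le C_\Xi^{2s+1}\int_{D_s}\Big[\frac{\varrho}{\tau_1p_0}\big|\nabla_{k'}A^{(s)}_{\zeta,\vec\zeta}(k,\vec k)\big|+\frac{8C_{\omega,2}\varrho}{\tau_1p_0^2}\big|A^{(s)}_{\zeta,\vec\zeta}(k,\vec k)\big|\Big]\tilde d^{(s-1)d}\vec k\\
&\le\frac{\varrho C_\Xi^{2s+1}}{\tau_1p_0}\Big[\big\|(\nabla_{k'}-\nabla_{k^{(s)}})\chi^{(s)}(k,\cdot)\big\|+\frac{8C_{\omega,2}}{p_0}\big\|\chi^{(s)}(k,\cdot)\big\|\Big]\prod_{j=1}^{s}\|M_j\|_{L_1}\\
&\quad+\frac{\varrho C_\Xi^{2s+1}\|\chi^{(s)}(k,\cdot)\|}{\tau_1p_0}\Big[\prod_{j=2}^{s}\|M_j\|_{L_1}\|\nabla_{k'}M_1\|_{L_1}+\prod_{j=1}^{s-1}\|M_j\|_{L_1}\|\nabla_{k^{(s)}}M_s\|_{L_1}\Big].
\end{aligned}\tag{5.98}$$
To estimate $\nabla M_i$ we use Lemma 5.21. We also use (5.2) and (5.5) to estimate $\|M_j\|_{L_1}$. Therefore, using (5.72), we obtain
$$|I(k,\tau_1)|\le\frac{C}{\tau_1}\tau_*^{i-1}C_\Xi^{2i+e}C_\chi^{i}\frac{\varrho}{\beta^{1+\epsilon}p_0}m^2R^{m-1}.\tag{5.99}$$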
Finally, we consider the case when the alternative Condition 2.23 holds. In this case $M_1$ and $M_s$, according to Lemma 5.12, are localized near $\nu_1k_{*l_1}$ and $\nu_2k_{*l_s}$ with some $\nu_1$ and $\nu_2$; we use (2.94) to obtain (5.92) both for AFM and ANFM submonomials. Therefore (5.97) holds and we again get (5.98) and (5.99). So we have proved (5.99) in all cases. Integrating (5.99) in $\tau_1$ we obtain
$$\|F_1\|_E\le C\tau_*^{i-1}C_\Xi^{2i+e}C_\chi^{i}\frac{\varrho}{\beta^{1+\epsilon}p_0}m^2|\ln\beta|R^{m-1}.\tag{5.100}$$
Using summation over all $\vec\lambda,\vec\zeta$ (the sum involves no more than $3^{2m}$ terms) we obtain (5.85) from (5.87) and (5.100).

6. Proof of the Superposition Theorems

In this section we prove Theorems 2.15 and 2.19 on the approximate modal superposition principle.

6.1. Proof of the superposition principle for lattice equations

Here we prove Theorem 2.15. First we note that according to Lemma 5.6 we can replace $\tilde h_l$ by $\tilde h_l^\Psi$ in the statement of Theorem 2.15, in particular in (2.47) and (2.48). Hence we can assume that (5.56) holds. Based on Theorem 5.4, we expand the solution of (2.3) into the series (5.8) and then into the sum of composition monomials $M(\mathcal{F},T)$ as in (4.33):
$$\mathcal{G}(\mathcal{F},\tilde h)=\tilde h+\sum_{m=2}^{\infty}\sum_{T\in T_m}c_TM(\mathcal{F},T)(\tilde h^m),\tag{6.1}$$
where
$$\tilde h=\sum_{l=1}^{N_h}\tilde h_l,\quad\|\tilde h_l\|_E\le R,\quad l=1,\ldots,N_h,\tag{6.2}$$
and the relation (5.19) (that is, $N_hR<R_G$) holds, where $R_G$ is the radius of convergence from Theorem 5.4; $R$ will be specified below. Using Lemma 5.7 we conclude that
$$\mathcal{G}(\mathcal{F},\tilde h)=\tilde h+\sum_{m=2}^{m(\beta)}\sum_{T\in T_m}c_TM(\mathcal{F},T)(\tilde h^m)+g,\quad\|g\|_E\le\beta,\tag{6.3}$$
where $m(\beta)$ is defined by (5.23). Then we expand every monomial $M(\mathcal{F},T)(\tilde h^m)$ according to (5.28) into the sum of the terms $M(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})$. Note that since $m(\beta)\le C|\ln\beta|$, conditions (5.96), (5.70) and (5.47) are satisfied if $\beta$ is small enough
for every $m\le m(\beta)$. The monomials $M(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})$ belong to two classes, SI and CI (according to Definition 5.8), and the class is determined by the multiindex $(l_1,\ldots,l_m)=\bar l$. Using (6.3) we conclude that
$$\mathcal{G}\Big(\mathcal{F},\sum_{l=1}^{N_h}\tilde h_l\Big)=\sum_{l=1}^{N_h}\mathcal{G}(\mathcal{F},\tilde h_l)+\tilde D,\tag{6.4}$$
$$\tilde D=\sum_{m=2}^{m(\beta)}\sum_{T\in T_m}\sum_{\mathrm{CI}\ l_1,\ldots,l_m}c_TM(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})+g_1,\quad\|g_1\|_E\le C\beta.$$
To obtain (2.48), we have to estimate the sum in $\tilde D$ and show that it is small. It follows from (4.35) that
$$\begin{aligned}
\Big\|\sum_{m=2}^{m(\beta)}\sum_{T\in T_m}\sum_{\mathrm{CI}\ l_1,\ldots,l_m}c_TM(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})\Big\|_E
&\le\sum_{m=2}^{m(\beta)}N_h^m\sum_{T\in T_m}c_T\sup_{T\in T_m,\ \mathrm{CI}\ \bar l}\|M(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})\|_E\\
&\le\sum_{m=2}^{m(\beta)}N_h^mc_0c_1^m\sup_{T\in T_m,\ \mathrm{CI}\ \bar l}\|M(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})\|_E.
\end{aligned}$$
Now we consider an evaluated monomial $M(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})$ with arguments given by a CI multiindex $\bar l=(l_1,\ldots,l_m)$. To prove that this monomial has a small norm, according to Lemma 4.24 it is sufficient to show that one of its submonomials is small and the relevant operators are bounded. According to Proposition 5.9 the monomial $M(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})$ contains a submonomial $M(\mathcal{F},T')(\tilde h_{l_{s'}}\cdots\tilde h_{l_{s''}})$ with homogeneity index $s=s''-s'+1$, incidence number $i'$ and rank $r'$ which is minimal in the following sense. The monomial $M(\mathcal{F},T')(\tilde h_{l_{s'}}\cdots\tilde h_{l_{s''}})$ is CI, but every submonomial $M(\mathcal{F},T'')(\tilde h_{l_{s'''}}\cdots\tilde h_{l_{s''''}})$ of it is SI. Now we use the space decomposition (5.1) and expand $M(\mathcal{F},T')$ as in (4.44) into a sum of no more than $3^{2m}$ decorated monomials $M(\mathcal{F},T',\vec\lambda,\vec\zeta)(\tilde h_{l_{s'}}\cdots\tilde h_{l_{s''}})$. The decorated submonomials of every decorated monomial are SI. We apply Lemma 5.22 and conclude that
$$\|M(\mathcal{F},T',\vec\lambda,\vec\zeta)(\tilde h_{l_{s'}}\cdots\tilde h_{l_{s''}})\|_E\le C\Big[\frac{\varrho}{\beta^{1+\epsilon}p_0}|\ln\beta|+\beta\Big]s^2\tau_*^{i'-1}C_\Xi^{e'+2i'}C_\chi^{i'}R^{s''-s'}.\tag{6.5}$$
Hence, there is a submonomial of $M(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})$ with a small norm. Namely, since (2.46) and (2.5) are assumed, this small submonomial provides the smallness of the norm of the whole monomial $M(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})$ according to Lemma 4.24.
We also use Corollary 5.2 and (2.26) to estimate the norms of the remaining submonomials of rank $r$ and apply (4.32) and (5.72) to obtain
$$\|M(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})\|_E\le3^{2m}\Big[\frac{\varrho}{\beta^{1+\epsilon}}|\ln\beta|+\beta\Big]C_1m^2\tau_*^{i-1}C_\Xi^{e+2i}C_\chi^{i}R^{m-1}.\tag{6.6}$$
Since $e=i+m-1$, using (4.25) and the inequalities $i(T)=i\ge m/m_F$, $i\le m-1$ we get
$$\Big\|\sum_{m=2}^{m(\beta)}\sum_{T\in T_m}\sum_{\mathrm{CI}\ l_1,\ldots,l_m}c_TM(\mathcal{F},T)(\tilde h_{l_1}\cdots\tilde h_{l_m})\Big\|_E\le C_2\Big[\frac{\varrho}{\beta^{1+\epsilon}}|\ln\beta|+\beta\Big]\sum_{m=2}^{\infty}\tau_*^{m/m_F-1}m^2N_h^mc_1^mR^{m-1},\tag{6.7}$$
with $c_1=9C_\Xi^5C_\chi$. The series converges if, in addition to (5.19), $R$ satisfies the inequality
$$RN_hc_1\tau_*^{1/m_F}<1.$$
For such $R$ and $\tau_*$, combining (6.7) with (6.3) and using (2.46) we obtain (2.48), and Theorem 2.15 is proved.
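The counting relation $e=i+m-1$ and the inequality $i\le m-1$ used above are purely combinatorial facts about trees. The following sketch checks them on a small example, assuming (as the relation itself suggests) that the incidence number counts non-end nodes and the homogeneity index counts end nodes. The nested-list encoding of a tree and the function names are hypothetical conveniences, not notation from the paper.

```python
from typing import List, Union

# A tree is either the marker LEAF (an end node) or a list of child trees.
LEAF = "leaf"
Tree = Union[str, List["Tree"]]


def homogeneity_index(tree: Tree) -> int:
    """Number of end nodes (leaves), denoted m in the text."""
    if tree == LEAF:
        return 1
    return sum(homogeneity_index(child) for child in tree)


def incidence_number(tree: Tree) -> int:
    """Number of non-end nodes, denoted i in the text (assumed meaning)."""
    if tree == LEAF:
        return 0
    return 1 + sum(incidence_number(child) for child in tree)


def edge_count(tree: Tree) -> int:
    """Number of edges e: every non-end node contributes one edge per child."""
    if tree == LEAF:
        return 0
    return len(tree) + sum(edge_count(child) for child in tree)


if __name__ == "__main__":
    # Root with three children: a binary node, a ternary node and a leaf.
    example = [[LEAF, LEAF], [LEAF, LEAF, LEAF], LEAF]
    m = homogeneity_index(example)
    i = incidence_number(example)
    e = edge_count(example)
    print(f"m={m}, i={i}, e={e}, i+m-1={i + m - 1}")
```

The inequality $i\le m-1$ relies on every non-end node having at least two children, which is the case for the example tree.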
6.2. Proof of the superposition principle for PDE

Here we prove Theorem 2.25 (and its particular case, Theorem 2.19). The proof is completely similar to the above proof of Theorem 2.15. One only has to replace $D_m$ given by (2.23) by $D_m$ given by (2.65), and the space $L_1$ is now defined by (2.66) instead of (2.31).

Remark 6.1. Note that the smallness of CI terms is essential and is based on the different group velocities of single-band wavepackets. Note that separation of different wavepackets based only on FM and NFM arguments as in Lemma 5.17 is impossible, since there are always FM monomials with different $l$ because of the symmetry conditions (2.15) and (2.16). For example, the FM condition
$$\zeta\omega_n(\zeta k_*)-\zeta'\omega_{n'}(\zeta'k_{*1})-\zeta''\omega_{n''}(\zeta''k_{*2})-\zeta'''\omega_{n'''}(\zeta'''k_{*3})=0$$
is fulfilled if
$$n=n',\quad\zeta=\zeta',\quad k_*=k_{*1},\quad n''=n''',\quad\zeta''=-\zeta''',\quad k_{*2}=k_{*3}$$
independently of the values of $k_*$, $k_{*3}$ and independently of the particular form of the functions $\omega_n(k)$.
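The role of distinct group velocities can be illustrated with a one-dimensional toy integral. The sketch below is not the paper's operator: the phase $\phi(q)=\omega(k)-\omega(q)-\omega(k-q)$, the Gaussian amplitude, the centers `K1`, `K2`, the width `BETA` and all function names are assumptions chosen for illustration, with the FPU-type dispersion $\omega(k)=2|\sin(k/2)|$ used as a test case. Because the phase gradient $-\omega'(q)+\omega'(k-q)$ stays away from zero near $q=k_1$, one integration by parts in $q$ bounds the integral by a constant divided by the large parameter $t=\tau_1/\varrho$.

```python
import numpy as np

K1, K2 = 0.6, 2.0          # hypothetical wavepacket centers
BETA = 0.05                # hypothetical wavepacket width
K_OUT = K1 + K2            # output wavevector fixed by phase matching


def trapezoid(values, grid):
    """Composite trapezoid rule for a sampled (possibly complex) integrand."""
    return np.sum(0.5 * (values[1:] + values[:-1]) * np.diff(grid))


def omega(k):
    """Test dispersion relation 2|sin(k/2)|."""
    return 2.0 * np.abs(np.sin(k / 2.0))


def group_velocity(k):
    """Derivative of omega away from k = 0."""
    return np.sign(np.sin(k / 2.0)) * np.cos(k / 2.0)


def _grid(n=200_001):
    return np.linspace(K1 - 6.0 * BETA, K1 + 6.0 * BETA, n)


def phase(q):
    return omega(K_OUT) - omega(q) - omega(K_OUT - q)


def phase_gradient(q):
    return -group_velocity(q) + group_velocity(K_OUT - q)


def amplitude(q):
    """Product of two Gaussian packets, centered at K1 in q and K2 in K_OUT - q."""
    return np.exp(-(((q - K1) / BETA) ** 2))


def oscillatory_integral(t):
    """I(t) = int exp(i*phase(q)*t) * amplitude(q) dq over the packet window."""
    q = _grid()
    return trapezoid(np.exp(1j * phase(q) * t) * amplitude(q), q)


def nonstationary_phase_bound(t):
    """(1/t) * (boundary terms + int |d/dq (amplitude/phase_gradient)| dq)."""
    q = _grid()
    ratio = amplitude(q) / phase_gradient(q)
    boundary = abs(ratio[0]) + abs(ratio[-1])
    return (boundary + trapezoid(np.abs(np.gradient(ratio, q)), q)) / t


if __name__ == "__main__":
    for t in (10.0, 100.0, 1000.0):
        print(t, abs(oscillatory_integral(t)), nonstationary_phase_bound(t))
```

If the two centers are chosen with equal group velocities, `phase_gradient` vanishes inside the window and the bound is no longer available, which is the situation Remark 6.1 excludes.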
7. Examples and Possible Generalizations

7.1. Fermi–Pasta–Ulam equation

The FPU equation on the infinite lattice has the form
$$\partial_t^2x_n=(x_{n-1}-2x_n+x_{n+1})+\alpha_3\big((x_{n+1}-x_n)^3-(x_n-x_{n-1})^3\big)+\alpha_2\big((x_{n+1}-x_n)^2-(x_n-x_{n-1})^2\big).\tag{7.1}$$
It can be reduced to the following first-order system:
$$\partial_tx_n=y_n-y_{n-1},\quad\partial_ty_n=x_{n+1}-x_n+\alpha_3(x_{n+1}-x_n)^3+\alpha_2(x_{n+1}-x_n)^2.\tag{7.2}$$
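The reduction from (7.1) to (7.2) can be checked in two lines. In the sketch below the shorthand $f$ is introduced only for this check and is not notation from the paper.

```latex
\begin{align*}
f(d) &:= d + \alpha_2 d^2 + \alpha_3 d^3,
  \qquad \partial_t y_n = f(x_{n+1}-x_n) \quad\text{by (7.2)},\\
\partial_t^2 x_n &= \partial_t y_n - \partial_t y_{n-1}
  = f(x_{n+1}-x_n) - f(x_n-x_{n-1})\\
 &= (x_{n-1}-2x_n+x_{n+1})
  + \alpha_3\big((x_{n+1}-x_n)^3-(x_n-x_{n-1})^3\big)
  + \alpha_2\big((x_{n+1}-x_n)^2-(x_n-x_{n-1})^2\big),
\end{align*}
```

which is exactly (7.1).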
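We introduce lattice Fourier transforms $\tilde x(k)$ and $\tilde y(k)$ by (2.2), namely
$$\tilde x(k)=\sum_nx_ne^{-ink},\quad k\in[-\pi,\pi].$$
First we write the Fourier transform of the linear part of (7.2) (that is, with $\alpha_3=\alpha_2=0$). Multiplying by $e^{-ink}$ and summing over $n$ we obtain
$$\partial_t\tilde x(k)=\tilde y(k)-e^{-ik}\tilde y(k),\quad\partial_t\tilde y(k)=e^{ik}\tilde x(k)-\tilde x(k),$$
which can be recast in matrix form as follows:
$$\partial_t\begin{bmatrix}\tilde x\\\tilde y\end{bmatrix}=\begin{bmatrix}0&-(e^{ik}-1)^*\\e^{ik}-1&0\end{bmatrix}\begin{bmatrix}\tilde x\\\tilde y\end{bmatrix}.$$
The eigenvalues of the matrix are purely imaginary and equal $i\omega_\zeta(k)$ with
$$\omega_\zeta(k)=\zeta|e^{ik}-1|=2\zeta\Big|\sin\frac{k}{2}\Big|,\quad\zeta=\pm,\quad-\pi\le k\le\pi.$$
The eigenvectors are orthogonal and are given explicitly by
$$g_\zeta(k)=\frac{1}{\sqrt2\,|e^{ik}-1|}\begin{bmatrix}i\zeta|e^{ik}-1|\\e^{ik}-1\end{bmatrix}=\frac{1}{\sqrt2}\begin{bmatrix}i\zeta\\\dfrac{e^{ik}-1}{|e^{ik}-1|}\end{bmatrix},\quad\zeta=\pm,\quad k\ne0.\tag{7.3}$$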
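The spectral data of the linear FPU symbol are easy to confirm numerically. The following sketch, with hypothetical function names, builds the $2\times2$ matrix above for a given $k$ and checks that the vectors of (7.3) are eigenvectors with eigenvalues $i\omega_\zeta(k)$; it uses $|e^{ik}-1|=2|\sin(k/2)|$, which holds for negative $k$ as well.

```python
import numpy as np


def linear_fpu_symbol(k):
    """Matrix of the linearized system (7.2) in lattice Fourier variables."""
    c = np.exp(1j * k) - 1.0
    return np.array([[0.0, -np.conj(c)], [c, 0.0]], dtype=complex)


def omega(k, zeta):
    """Dispersion relation omega_zeta(k) = zeta * |exp(ik) - 1|."""
    return zeta * 2.0 * abs(np.sin(k / 2.0))


def eigenvector(k, zeta):
    """Normalized eigenvector from (7.3); requires k != 0."""
    c = np.exp(1j * k) - 1.0
    return np.array([1j * zeta, c / abs(c)]) / np.sqrt(2.0)


if __name__ == "__main__":
    k = 1.3
    A = linear_fpu_symbol(k)
    for zeta in (+1, -1):
        g = eigenvector(k, zeta)
        residual = np.linalg.norm(A @ g - 1j * omega(k, zeta) * g)
        print(f"zeta={zeta:+d}  omega={omega(k, zeta):+.6f}  residual={residual:.2e}")
```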
Now let us consider the nonlinear terms. Note that the lattice Fourier transform of the product $x(n)z(n)$, $n\in\mathbb{Z}^d$, is given by the following convolution formula
$$\widetilde{xz}(k)=\frac{1}{(2\pi)^d}\int_{[-\pi,\pi]^d}\tilde x(s)\tilde z(k-s)\,ds\tag{7.4}$$
as in the case of the continuous Fourier transform. Note that
$$\widetilde{(x_{n+1}-x_n)}(k)=(e^{ik}-1)\tilde x(k),$$
and, hence, the Fourier transform of the cubic term of the nonlinearity in (7.2) is
$$\widetilde{(x_{n+1}-x_n)^3}(k)=\frac{1}{(2\pi)^2}\int_{k'+k''+k'''=k;\ (k',k'')\in[-\pi,\pi]^2}(e^{ik'}-1)(e^{ik''}-1)(e^{ik'''}-1)\,\tilde x(k')\tilde x(k'')\tilde x(k''')\,dk'\,dk'',\tag{7.5}$$
with a similar convolution for the quadratic term.
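Both the convolution formula (7.4) and the shift identity for $x_{n+1}-x_n$ can be verified numerically in one dimension for finitely supported sequences. The sketch below uses hypothetical helper names; the integral over $[-\pi,\pi]$ is replaced by an equispaced rectangle rule, which is exact here because the integrand is a trigonometric polynomial of degree lower than the number of quadrature nodes.

```python
import numpy as np


def lattice_ft(values, sites, k):
    """Lattice Fourier transform sum_n x_n exp(-i n k) at the wavenumbers k."""
    k = np.atleast_1d(k)
    return np.exp(-1j * np.outer(k, sites)) @ values


def convolution_rhs(x, z, sites, k, n_quad=64):
    """Right-hand side of (7.4) for d = 1 via an equispaced rule in s."""
    s = -np.pi + 2.0 * np.pi * np.arange(n_quad) / n_quad
    x_hat = lattice_ft(x, sites, s)
    result = []
    for kk in np.atleast_1d(k):
        z_hat = lattice_ft(z, sites, kk - s)
        # The mean over the nodes equals (1/2pi) times the rectangle-rule integral.
        result.append(np.mean(x_hat * z_hat))
    return np.array(result)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sites = np.arange(-8, 9)
    support = np.abs(sites) <= 5
    x = np.where(support, rng.normal(size=sites.size), 0.0)
    z = np.where(support, rng.normal(size=sites.size), 0.0)
    k = np.linspace(-np.pi, np.pi, 7)
    lhs = lattice_ft(x * z, sites, k)
    print(np.max(np.abs(lhs - convolution_rhs(x, z, sites, k))))
```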
7.2. Examples of wavepacket data

Here we give examples of initial data for PDE in $\mathbb{R}^d$ and on the lattice $\mathbb{Z}^d$ which are wavepackets in the sense of Definition 2.9. We define a wavepacket by (2.33) where $h_\zeta$ is chosen to satisfy (2.35) and (2.34). Recall that a Schwartz function is an infinitely smooth function $\Phi(r)$, $r\in\mathbb{R}^d$, which decays rapidly and satisfies for every $s\ge0$ the inequality
$$\sup_r\sum_{|\alpha|+p\le s}|r|^p|\partial_r^\alpha\Phi(r)|\le C_1(s),\tag{7.6}$$
where
$$\partial_r^\alpha\Phi(r)=\partial_{r_1}^{\alpha_1}\cdots\partial_{r_d}^{\alpha_d}\Phi(r),\quad\alpha=(\alpha_1,\ldots,\alpha_d),\quad|\alpha|=\alpha_1+\cdots+\alpha_d.$$
It is well known that the Fourier transform of a Schwartz function is again a Schwartz function, and that its derivatives satisfy the inequality
$$\sup_k\sum_{|\alpha|+p\le s}\big||k|^p\partial_k^\alpha\hat\Phi(k)\big|\le C_2(s).\tag{7.7}$$
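Example 1. We consider an equation in $\mathbb{R}^d$ as in Sec. 1.2. The simplest example of a wavepacket in the sense of Definition 2.9 is a function of the form (2.36) where
$$\int_{\mathbb{R}^d}\big(|\hat h_\zeta(k)|+|k|^{1/\epsilon}|\hat h_\zeta(k)|+|\nabla_k\hat h_\zeta(k)|\big)\,dk<\infty,\tag{7.8}$$
and $g_{n,\zeta}(k)$ is an eigenvector from (2.13). Note that $\beta^{-d}\hat h_\zeta(k/\beta)$ is the Fourier transform of the function $h_\zeta(\beta r)$.

Lemma 7.1. Let $\hat h(\beta,k)$, $k\in\mathbb{R}^d$, be defined by (2.36) and (7.8). Then $\hat h_{l,\zeta}(\beta,k)$ is a wavepacket with wavepacket center $k_*$ in the sense of Definition 2.9 with $L_1=L_1(\mathbb{R}^d)$.

Proof. First, condition (2.32) holds since
$$\|\hat h_\zeta(\beta,\cdot)\|_{L_1}=\int_{\mathbb{R}^d}\beta^{-d}\Big|\hat h_\zeta\Big(\frac{k-\zeta k_*}{\beta}\Big)g_{n,\zeta}(k_*)\Big|\,dk=|g_{n,\zeta}(k_*)|\int_{\mathbb{R}^d}|\hat h_\zeta(k)|\,dk.$$
Condition (2.33) is obviously fulfilled since $\hat h_\zeta(\beta,k)=\Pi_{n,\zeta}(k)\tilde h_\zeta(\beta,k)$. Inequality (2.34) follows from the estimate
$$\int_{|k-\zeta k_*|\ge\beta^{1-\epsilon}}\beta^{-d}\Big|\hat h_\zeta\Big(\frac{k-\zeta k_*}{\beta}\Big)\Big|\,dk\le\beta\int_{|k|\ge\beta^{-\epsilon}}|k|^{1/\epsilon}|\hat h_\zeta(k)|\,dk\le C\beta.\tag{7.9}$$
To verify (2.35) we note that since $\Pi_{n,\zeta}(k)$ depends smoothly on $k$ near $\zeta k_*$ we have
$$\int_{|k-\zeta k_*|\le\beta^{1-\epsilon}}|\nabla_k\hat h_\zeta(\beta,k)|\,dk\le C\int_{|k-\zeta k_*|\le\beta^{1-\epsilon}}\Big[\beta^{-d-1}\Big|\nabla_k\hat h_l\Big(\frac{k-\zeta k_*}{\beta}\Big)\Big|+\beta^{-d}\Big|\hat h_l\Big(\frac{k-\zeta k_*}{\beta}\Big)\Big|\Big]dk\le C\beta^{-1}\int_{\mathbb{R}^d}|\nabla_k\hat h_\zeta(k)|\,dk+C$$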
and (7.8) implies (2.35).

Example 2. Let us consider a lattice equation in $\mathbb{Z}^d$ as in Sec. 1.1. We would like to give a sufficient condition for functions defined on the lattice which ensures that their Fourier transforms satisfy all requirements of Definition 2.9. We pick a Schwartz function $\Phi(r)$ (see (7.6)), a vector $k_*\in[-\pi,\pi]^d$ and introduce
$$h(\beta,r)=e^{-ir\cdot k_*}\Phi(\beta r),\quad r\in\mathbb{R}^d.\tag{7.10}$$
Then we restrict the above function to the lattice $\mathbb{Z}^d$ by setting $r=m$. The following lemma is similar to Lemma 7.1.
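The construction (7.10) can be explored numerically in one dimension. The sketch below is an illustration under stated assumptions: $\Phi$ is taken to be a Gaussian, the values of `beta`, `k_star` and `eps` are arbitrary, and the function names are hypothetical. With the transform convention $\tilde x(k)=\sum_nx_ne^{-ink}$ and the carrier $e^{-imk_*}$ exactly as written in (7.10), the lattice transform of the samples peaks at $k=-k_*$, and the fraction of its $L_1$ mass farther than $\beta^{1-\epsilon}$ from the peak is small, in the spirit of (2.34).

```python
import numpy as np


def lattice_wavepacket_ft(beta, k_star, k, cutoff=8.0):
    """Lattice Fourier transform of h_m = exp(-i m k_star) * Phi(beta m), Gaussian Phi."""
    half_width = int(cutoff / beta)
    m = np.arange(-half_width, half_width + 1)
    h = np.exp(-1j * m * k_star) * np.exp(-0.5 * (beta * m) ** 2)
    return np.exp(-1j * np.outer(k, m)) @ h


def tail_fraction(beta, k_star, eps, n_k=4001):
    """Share of the L1 mass of the transform at torus distance >= beta**(1-eps) from -k_star."""
    k = np.linspace(-np.pi, np.pi, n_k)
    magnitude = np.abs(lattice_wavepacket_ft(beta, k_star, k))
    distance = np.abs(np.angle(np.exp(1j * (k + k_star))))
    outside = distance >= beta ** (1.0 - eps)
    return magnitude[outside].sum() / magnitude.sum()


if __name__ == "__main__":
    beta, k_star, eps = 0.05, 1.0, 0.5
    k = np.linspace(-np.pi, np.pi, 4001)
    peak = k[np.argmax(np.abs(lattice_wavepacket_ft(beta, k_star, k)))]
    print("peak location:", peak)
    print("tail fraction:", tail_fraction(beta, k_star, eps))
```

The value `eps = 0.5` is chosen only to make the tail visibly small for a single `beta`; Definition 2.9 concerns the behavior as $\beta\to0$ with a fixed constant.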
Lemma 7.2. Let $\Phi(r)$ be a Schwartz function, $h_\zeta(\beta,r)$ be defined by (7.10), and $\tilde h_\zeta(\beta,k)$ be its lattice Fourier transform. Then the function $\tilde h_\zeta(\beta,k)$, extended to $\mathbb{R}^d$ as a periodic function with period $2\pi$, satisfies all requirements of Definition 2.9 with $L_1=L_1([-\pi,\pi]^d)$.

Proof. The lattice Fourier transform of $h(\beta,r)$ equals
$$\tilde h(\beta,k)=\sum_{m\in\mathbb{Z}^d}e^{-im\cdot k_*}\Phi(\beta m)e^{-im\cdot k}=\sum_{m\in\mathbb{Z}^d}\Phi(\beta m)e^{-im\cdot(k+k_*)}.\tag{7.11}$$
Since the above expression naturally defines $\tilde h(\beta,k)$ as a function of $k+k_*$, it is sufficient to take $k_*=0$. To get (2.34), we use the representation of $\Phi(r)$ in terms of the inverse Fourier transform (2.60):
$$\Phi(r)=\frac{1}{(2\pi)^d}\int_{\mathbb{R}^d}\hat\Phi(k)e^{ir\cdot k}\,dk,\quad\Phi(\beta m)=\frac{1}{(2\pi\beta)^d}\int_{\mathbb{R}^d}\hat\Phi\Big(\frac{1}{\beta}k\Big)e^{im\cdot k}\,dk.\tag{7.12}$$
We split $\Phi(\beta m)$ into two terms:
$$\Phi(\beta m)=\frac{1}{(2\pi\beta)^d}\int_{\mathbb{R}^d}\Psi\Big(\frac{1}{\beta^{1-\epsilon}}\xi\Big)\hat\Phi\Big(\frac{1}{\beta}\xi\Big)e^{im\cdot\xi}\,d\xi+\Phi_1(m),\quad\Phi_1(m)=\frac{1}{(2\pi\beta)^d}\int_{\mathbb{R}^d}\Big(1-\Psi\Big(\frac{1}{\beta^{1-\epsilon}}\xi\Big)\Big)\hat\Phi\Big(\frac{1}{\beta}\xi\Big)e^{im\cdot\xi}\,d\xi,\tag{7.13}$$
with $\Psi(\xi)$ defined by (5.12). The first term in (7.13) coincides with the inverse lattice Fourier transform; its lattice Fourier transform is given explicitly and can be treated as in Lemma 7.1. The second term gives $O(\beta^N)$ with large $N$ for Schwartz functions $\hat\Phi$. Using these observations we check all points of Definition 2.9 as in Lemma 7.1.

7.3. The nonlinear Maxwell equation

We expect that the approximate superposition principle can be generalized to the Nonlinear Maxwell equations (NLM) in periodic media studied in [4]. A concise operator form of the NLM is
$$\partial_\tau U=-\frac{i}{\varrho}MU+F_{\rm NL}(U)-J_0,\quad U(\tau)=0\ \text{ for }\tau\le0,$$
where the excitation current
$$J(\tau)=0\ \text{ for }\tau\le0.$$
We studied the properties of nonlinear wave interactions as described by the Nonlinear Maxwell equations in a series of papers [1–6]. Our analysis of the solutions to the NLM uses an expansion in terms of the orthonormal Floquet–Bloch basis $\tilde G_{n,\zeta}(r,k)$, $n=1,\ldots$, namely
$$\tilde U(k,r,\tau)=\sum_{\zeta=\pm1}\sum_{n=1}^{\infty}\tilde U_{n,\zeta}(k,\tau)\tilde G_{n,\zeta}(r,k),\quad k\in[-\pi,\pi]^d.\tag{7.14}$$
This expansion is similar to (2.18) with $J$ replaced by $\infty$, since the linear Maxwell operator in a periodic medium has infinitely many bands. The excitation currents take a form similar to the forcing term in (3.1), namely
$$\tilde J(r,k,\tau)=\tilde j_{n,+}(k,\tau)\tilde G_{n,+}(r,k)e^{-\frac{i}{\varrho}\omega_n(k)\tau}+\tilde j_{n,-}(k,\tau)\tilde G_{n,-}(r,k)e^{\frac{i}{\varrho}\omega_n(k)\tau},\quad\tilde J_n(r,k,\tau)=0,\quad n\ne n_0,$$
with a fixed $n=n_0$. The difference from (3.1) is that the time-independent $h_{n,\zeta}(k)$ is replaced by $\tilde j_{n,\zeta}(k,\tau)$. The functions $\tilde j_{n,\zeta}(k,\tau)$ for every $\tau$ have the form of wavepackets in the sense of Definition 2.9, or in particular a form similar to (2.36) with fixed $k_*$. The Existence and Uniqueness Theorem for the NLM is proven in [4], in particular the function-analytic representation of the solution as a function of the excitation current. The results of this paper can be extended to the NLM equations provided that certain technical difficulties are addressed. In particular, the classical NLM equation allows for time dispersion with consequent time-convolution integration in the nonlinear term. This complication can be addressed by approximating it with a nonlinearity of the form (2.22) with an error $O(\varrho)=O(\beta^2)$, see [6]. Then the derivation of the approximate linear superposition principle for wavepackets can be done as in this paper. Another complication with the NLM is that it has an infinite number of bands.
7.4. Dissipative terms in the linear part

Equations (2.3) and (2.61) involve linear operators $iL(k)$ with purely imaginary spectrum. Quite similarly we can consider equations of the form
\[ \partial_\tau \hat U(k,\tau) = \Big[-G(k) - \frac{i}{\varrho}L(k)\Big]\hat U(k,\tau) + \hat F(\hat U)(k,\tau), \tag{7.15} \]
where a Hermitian matrix $G(k)$ commutes with the Hermitian matrix $L(k)$ and $G(k)$ is non-negative. In this case the approximate superposition principle also holds. The proofs are quite similar. In the case (2.61), which corresponds to a PDE, $G(k)$ determines a dissipative term; for example, $G(k) = |k|^2 I$, $k \in R^d$, where $I$ is the identity matrix, corresponds to the Laplace operator $\Delta$. When such a dissipative term is introduced, we can consider nonlinearities $\hat F$ which involve derivatives, see [8, 9] for a similar situation. For such nonlinearities our framework remains the same, but some statements and proofs have to be modified. We will consider this case in a separate paper.

Appendix A. Structure of a Composition Monomial Based on Oscillatory Integrals

Every composition monomial $M(\mathcal F, T, \lambda_{(\hat s)}, \zeta_{(m)})(\tilde h_1 \cdots \tilde h_m)$ based on oscillatory integral operators $\mathcal F^{(m)}$ as defined by (3.14) and the space decomposition as defined by (5.1) has the following structure. Let $T$ be the tree corresponding to the monomial $M$. The monomial involves integration with respect to time variables $\tau_{(N)}$, where $N \in T$ are the nodes of the tree $T$. The monomial also involves integration with respect to variables $k_N$, $N \in T$. The argument of the integral operator $M(\mathcal F, T, \lambda_{(\hat s)}, \zeta_{(m)})$ involves only end nodes (of zero rank) and has the form
\[ \prod_{\operatorname{rank}(N)=0} \tilde h_N(k_N). \]
The kernel of the integral operator involves the composition monomial $M(\chi, T, \lambda_{(\hat s)}, \zeta_{(m)})$ based on the susceptibility tensors $\chi^{(m)}_{\zeta,\zeta_{(m)}}(k, k_{(m)})$ with the same tree $T$. Note that the phase matching condition (3.12) takes the form
\[ k_N = k_N' + \cdots + k_N^{(\mu(N))} = \sum_{i=1}^{\mu(N)} k_{c_i(N)}. \]
Recall that if $c_i(N)$, $i = 1, \ldots, \mu(N)$, is the $i$th child node of $N$, then the arguments in (3.14) are determined by the formula
\[ k_{c_i(N)} = k_N^{(c_i)}. \]
Hence, the kernel of the integral operator $M(\mathcal F, T, \lambda_{(\hat s)}, \zeta_{(m)})(\tilde h_1 \cdots \tilde h_m)$ involves the product of normalized delta functions
\[ \prod_{\operatorname{rank}(N)>0} \delta\big(k_N - k_{c_1(N)} - \cdots - k_{c_{\mu(N)}(N)}\big), \]
and the integration with respect to $k_N$ is over the torus,
\[ \prod_{N \ne N_*} \int_{[-\pi,\pi]^{\mu(N)d}} [\cdots] \prod_{N \ne N_*} dk_N, \]
and, obviously, the variable $k_{N_*}$ corresponding to the root node $N_*$ is not involved in the integration. Since every operator $\mathcal F^{(m)}$ at a node $N$ of the monomial $M(\mathcal F, T, \lambda_{(\hat s)}, \zeta_{(m)})$ contains the oscillatory factor
\[ \exp\Big\{ i\phi_{\zeta,\zeta_{(m)},N}(k,k_{(m)})\frac{\tau_{(N)}}{\varrho} \Big\} = \exp\Big\{ i\big[\zeta_N\omega(k_N) - \zeta_N'\omega(k_N') - \cdots - \zeta_N^{(m)}\omega(k_N^{(m)})\big]\frac{\tau_{(N)}}{\varrho} \Big\}, \]
we obtain the following total oscillatory factor
\[ \exp\Big\{ \frac{i}{\varrho}\Phi_{\zeta,\zeta_{(m)},T}(k,k_{(m)}) \Big\}, \tag{A.1} \]
where the phase function $\Phi_{T,\zeta}(k)$ of the monomial is defined by the formula
\[ \Phi_{T,\zeta}(k,\tau) = \sum_{N \in T} \Big[ \zeta_N\omega(k) - \sum_{i=1}^{\mu(N)} \zeta_N^{(c_i(N))}\omega(k_{c_i(N)}) \Big]\tau_{(N)}. \tag{A.2} \]
The vectors $k$, $\tau$ and $\zeta$ are composed of $k_N$, $\tau_N$ and $\zeta_N$ using the standard labeling of the nodes. Notice then that the oscillatory exponent (A.1) is the only expression in the composition monomial which involves the parameter $\varrho$. Observe also that the FM condition takes here the form
\[ \zeta_N = \sum_{i=1}^{\mu(N)} \zeta_N^{(c_i(N))}. \]
The domain of integration with respect to time variables is given in terms of the tree $T$ by the following inequalities
\[ D_T = \{\tau_{(N)} : 0 \le \tau_{(N)} \le \tau_{(p(N))},\ N \in T \setminus N_*\}, \tag{A.3} \]
where $p(N)$ is the parent node of the node $N$. Using the introduced notation we can write the action of the monomial $M(\mathcal F, T, \lambda_{(\hat s)}, \zeta_{(m)})$ in the form
\[ M(\mathcal F, T, \lambda, \zeta)\Big(\prod_{\operatorname{rank}(N)=0} \tilde h_N\Big)(k_{N_*}, \tau_{N_*}) = \int_{D_T} \prod_{N \ne N_*} \int_{[-\pi,\pi]^{\mu(N)d}} \exp\Big\{ \frac{i}{\varrho}\Phi_{T,\zeta}(k,\tau) \Big\} M(\chi, T, \lambda, \zeta, k) \times \prod_{\operatorname{rank}(N)=0} \tilde h_N(k_N) \times \prod_{\operatorname{rank}(N)>0} \delta\big(k_N - k_{c_1(N)} - \cdots - k_{c_{\mu(N)}(N)}\big) \prod_{N \ne N_*} dk_N \prod_{N \ne N_*} d\tau_{(N)}. \tag{A.4} \]
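The time domain $D_T$ in (A.3) is a set of nested inequalities along the edges of the tree: every non-root node carries a time that lies between zero and the time of its parent. The following minimal sketch illustrates that membership test. The dictionary-based tree encoding, the node labels and the function name `in_time_domain` are assumptions made for illustration only; they do not come from the paper.

```python
def in_time_domain(tau, parent, root):
    """Check the inequalities 0 <= tau_N <= tau_{p(N)} for all N != root.

    tau    : dict mapping node label -> time variable tau_(N)
    parent : dict mapping node label -> label of its parent node p(N)
    root   : label of the root node N_* (it has no parent constraint)
    """
    return all(
        0 <= tau[node] <= tau[parent[node]]
        for node in tau
        if node != root
    )


# Hypothetical tree: root "N*" with children "a" and "b"; "a" has children "a1", "a2".
parent = {"a": "N*", "b": "N*", "a1": "a", "a2": "a"}

inside = {"N*": 1.0, "a": 0.6, "b": 0.9, "a1": 0.2, "a2": 0.6}
outside = {"N*": 1.0, "a": 0.6, "b": 0.9, "a1": 0.7, "a2": 0.6}  # a1 exceeds its parent

inside_ok = in_time_domain(inside, parent, "N*")
outside_ok = in_time_domain(outside, parent, "N*")
```

The check is purely local (node against parent), which mirrors the way (A.3) is written; no claim is made here about how the integrals in (A.4) are evaluated.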
Note that $m$ equals the number of end nodes, that is, nodes with zero rank, and they are enumerated using the standard labeling of the nodes, that is
\[ \prod_{\operatorname{rank}(N)=0} \tilde h_N(k_N) = \tilde h_1(k_1) \cdots \tilde h_m(k_m). \]
The formula (A.4) gives a closed form of a composition monomial based on oscillatory integral operators $\mathcal F^{(m)}$ with an arbitrarily large rank.

Appendix B. Proof of the Refined Implicit Function Theorem

Here we give the proof of Theorem 4.25. First, we consider the following elementary problem which provides majorants for the problem of interest. Let a function of one complex variable be defined by the formula
\[ \check F(u) = C_F \sum_{m=2}^{\infty} u^m R_F^{-m} = C_F \frac{u^2/R_F^2}{1 - u/R_F}, \qquad C_F > 0, \ R_F > 0. \tag{B.1} \]
In this case $\check F^{(m)}(x_1 \cdots x_m) = C_F R_F^{-m} x_1 \cdots x_m$. Let us introduce the equation
\[ u = \check F(u) + x, \qquad u, x \in C, \tag{B.2} \]
which is a particular case of (4.1). A small solution $u(x)$ of this equation such that $u(0) = 0$ is given by the series
\[ u = \check G(x) = \sum_{m=1}^{\infty} \check G^{(m)} x^m, \]
which is a particular case of formula (4.14). The terms $\check G^{(m)} x^m$ of this problem are determined from (4.18) and can be written in the form (4.29)
\[ \check G^{(m)} x^m = \sum_{T \in T_m} c_T M(\check F, T) x^m. \tag{B.3} \]
Obviously,
\[ M(\check F, T) x^m = C_F^{i(T)} R_F^{-e(T)} x^m, \tag{B.4} \]
where $i(T)$ is the incidence number of the tree $T$ and $e(T)$ is the number of edges of $T$. Now we compare with the solution of the general equation (4.1). It is given by the formula (4.14) with operators $\mathcal G^{(m)}(u^m)$ admitting expansion (4.29). Since
\[ \|\mathcal F^{(m)}\| \le C_F R_F^{-m}, \]
where the constants are the same as in (B.1), we have
\[ \|M(\mathcal F, T)(x_1 \cdots x_\nu)\| \le M(\check F, T)\,\|x_1\| \cdots \|x_\nu\|, \]
implying
\[ \Big\| \sum_{T \in T_m} c_T M(\mathcal F, T)(x_1 \cdots x_m) \Big\| \le \sum_{T \in T_m} c_T M(\check F, T)\,\|x_1\| \cdots \|x_m\| = \check G^{(m)} \|x_1\| \cdots \|x_m\|. \tag{B.5} \]
Solving (B.2) we get explicitly
\[ u = \frac{R_F}{2c}\Big[1 - \sqrt{1 - 4c\frac{x}{R_F}}\Big] = \check G(x), \qquad c = \frac{C_F}{R_F} + 1. \]
We have the following estimate of the coefficients
\[ \check G^{(m)} \le \frac{R_F^2}{2(C_F + R_F)} \Big( 4\,\frac{C_F + R_F}{R_F^2} \Big)^m, \qquad m = 1, 2, \ldots, \tag{B.6} \]
(see [4] for details in a similar situation). From (B.4) and (B.6) we infer the following inequality
\[ \sum_{T \in T_m} c_T C_F^{i(T)} R_F^{-e(T)} \le \frac{R_F^2}{2(C_F + R_F)} \Big( 4\,\frac{C_F + R_F}{R_F^2} \Big)^m, \]
which holds for all $C_F, R_F > 0$. We set $C_F = R_F = 1$ and obtain the desired bound (4.35).
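As a plausibility check of the estimate (B.6), one can expand the closed-form expression $\frac{R_F}{2c}[1 - \sqrt{1 - 4cx/R_F}]$ in powers of $x$. Its $m$th Taylor coefficient is $\mathrm{Cat}_{m-1}\,(c/R_F)^{m-1}$, where $\mathrm{Cat}_k$ is the $k$th Catalan number (a standard expansion of the square root, not stated in the text). The sketch below compares these coefficients with the right-hand side of (B.6) in exact rational arithmetic. The function names and the sample parameter values are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import comb


def catalan(k):
    """k-th Catalan number: 1, 1, 2, 5, 14, ..."""
    return comb(2 * k, k) // (k + 1)


def majorant_coefficient(m, C_F, R_F):
    """m-th Taylor coefficient of (R_F / 2c) * (1 - sqrt(1 - 4 c x / R_F)),
    with c = C_F / R_F + 1, written via Catalan numbers."""
    C_F, R_F = Fraction(C_F), Fraction(R_F)
    c = C_F / R_F + 1
    return catalan(m - 1) * (c / R_F) ** (m - 1)


def bound_B6(m, C_F, R_F):
    """Right-hand side of the estimate (B.6)."""
    C_F, R_F = Fraction(C_F), Fraction(R_F)
    return R_F**2 / (2 * (C_F + R_F)) * (4 * (C_F + R_F) / R_F**2) ** m


def estimate_holds(max_m=12, params=((1, 1), (2, 3), (5, 1))):
    """Check coefficient <= bound for several (C_F, R_F) pairs and m = 1..max_m."""
    return all(
        majorant_coefficient(m, C, R) <= bound_B6(m, C, R)
        for C, R in params
        for m in range(1, max_m + 1)
    )


all_ok = estimate_holds()
```

The comparison reduces to $\mathrm{Cat}_{m-1} \le 2 \cdot 4^{m-1}$, which is why the bound has room to spare; the check only covers the scalar majorant, not the operator inequality (B.5).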
Notations and Abbreviations

For the reader's convenience, we provide below a list of notations and abbreviations used in this paper.

AFM — alternatively frequency matched, see Definition 5.19
ANFM — alternatively non-frequency-matched, see Definition 5.19
band-crossing points — see Definition 2.3
cc — complex conjugate to the preceding terms in the formula
composition monomial — see Definition 4.9
decoration projections — see (4.36) and (4.37)
decorated monomial — see Definition 4.20
CI monomials — cross-interacting monomials, see Definition 5.8
FPU, Fermi–Pasta–Ulam equation — see (2.10), (2.11) and (7.1)
Floquet–Bloch modal decomposition — see (7.14)
Fourier transform — see (2.59)
FM — frequency matched, see Definition 5.10 and also (5.42)
homogeneity index of a monomial — Definition 4.9
homogeneity index of a tree — Definition 4.11
incidence number of a monomial — number of occurrences of operators $\mathcal F^{(l)}$ in the composition monomial
incidence number of a monomial — see Definition 4.10
incidence number of a tree — Definition 4.12
lattice Fourier transform — see (2.2)
monomial — Definition 4.9
NFM — non-frequency-matched, see Definition 5.10 and also (5.46)
oscillatory integral operator — see (3.8) and (3.3)
rank of monomial — see Definition 4.9
root operator — (4.20)
SI monomials — self-interacting monomials, see Definition 5.8
Schwartz functions — infinitely smooth functions on $R^d$ which decay faster than any power, see (7.6)
single-mode wavepacket — see Definition 2.9
submonomial — (4.10)
wavepacket — see Definition 2.9
$\tilde d^{(m-1)d}\vec k = \frac{1}{(2\pi)^{(m-1)d}}\, dk' \cdots dk^{(m-1)}$ — see (2.24)
$D_m = [-\pi,\pi]^{(m-1)d}$ — see (2.23), or $D_m = R^{(m-1)d}$ — see (2.65)
$E = C([0,\tau_*], L_1)$ — see (2.30)
$\hat F^{(m)}$ — m-linear operator in $L_1$, see (2.22) and (2.64)
$\mathcal F^{(m)}_{n,\zeta,\vec n,\vec\zeta}$ — basis element of the m-linear operator $\mathcal F^{(m)}$ in $E$, see (3.8)
$\mathcal F^{(n)}_{\lambda,\zeta}$ — see (4.43)
$\hat h_\zeta(\beta,k)$, $\zeta = \pm$ — Fourier transform of the wavepacket initial data $h_\zeta(\beta,r)$, see Definition 2.9
$\hat h_\zeta\big(\frac{1}{\beta}\xi\big)$, $\zeta = \pm$ — Fourier transform of the wavepacket initial data $h_\zeta(\beta r)$, see Definition 2.9
$\tilde h^\Psi_l(k,\beta)$ — a function nullified outside a $\beta^{1-\epsilon}$ vicinity of $\pm k_*$, see (5.15)
$k = (k_1, \ldots, k_d) \in [-\pi,\pi]^d$ — quasimomentum (wave vector) variable, see (2.2) and (2.25)
$k = (k_1, \ldots, k_d) \in R^d$ — Fourier wave vector variable, see (2.59) and (2.25)
$k_* = (k_{*1}, \ldots, k_{*d})$ — center of the wavepacket, see Definition 2.9
$k_{*l}$ — center of the $l$th wavepacket
$\vec k = (k', \ldots, k^{(m)})$ — interaction multiwave vector, see (2.25) and (3.7)
$k^{(s)}(k,\vec k) = k - k' - \cdots - k^{(s-1)}$ — see (2.25)
$L_1$ — Lebesgue space $L_1([-\pi,\pi]^d)$ or $L_1(R^d)$, see (2.31) and (2.66)
$n$ — band number
$\vec n = (n', \ldots, n^{(m)})$ — band interaction index, (3.7)
$\nabla_r = \big(\frac{\partial}{\partial r_1}, \frac{\partial}{\partial r_2}, \ldots, \frac{\partial}{\partial r_d}\big)$ — spatial gradient
$O(\mu)$ — any quantity having the property that $\frac{O(\mu)}{\mu}$ is bounded as $\mu \to 0$
$\omega_{\bar n}(k) = \zeta\omega_n(k)$ — dispersion relation of the band $(\zeta, n)$, see (2.13)
$\omega'_{n_0}(k) = \nabla_k\omega_{n_0}(k)$ — group velocity vector
$\omega_n(k)$ — $n$th eigenvalue of $L(k)$, see (2.13); dispersion relation of the $n$th band
$\Psi$ — cutoff function in quasimomentum domain, see (5.12)
$\phi_{\vec n}(k,\vec k) = \zeta\omega_n(k) - \zeta'\omega_{n'}(k') - \cdots - \zeta^{(m)}\omega_{n^{(m)}}(k^{(m)})$ — interaction phase function, (3.9)
$\pi_0$ — see (5.13)
$\Pi_{n,\zeta}(k)$ — projection in $C^{2J}$ onto direction of $g_{n,\zeta}(k)$, see (2.19)
$r = (r_1, \ldots, r_d)$ — spatial variable
$\varrho = \beta^2$ — (2.46)
$\sigma$ — the set of band-crossing points, see Definition 2.3
$\hat U(k)$ — Fourier transform of $U(r)$, see (2.59)
$\tilde U_{n,\zeta}(k,\tau) = \tilde u_{n,\zeta}(k,\tau)\,e^{-\frac{i\tau}{\varrho}\zeta\omega_n(k)}$ — amplitudes, see (3.2)
$\zeta = \pm$ or $\zeta = \pm1$ — band binary index
$\vec\zeta = (\zeta', \ldots, \zeta^{(m)})$ — binary band index vector, see (3.7)
$Z^*$ — complex conjugate to $Z$
Acknowledgment

The effort of A. Babin and A. Figotin is sponsored by the Air Force Office of Scientific Research, Air Force Materials Command, USAF, under grant number FA9550-04-1-0359.

References

[1] A. Babin and A. Figotin, Nonlinear photonic crystals: I. Quadratic nonlinearity, Waves Random Media 11 (2001) R31–R102.
[2] A. Babin and A. Figotin, Nonlinear photonic crystals: II. Interaction classification for quadratic nonlinearities, Waves Random Media 12 (2002) R25–R52.
[3] A. Babin and A. Figotin, Nonlinear photonic crystals: III. Cubic nonlinearity, Waves Random Media 13 (2003) R41–R69.
[4] A. Babin and A. Figotin, Nonlinear Maxwell equations in inhomogeneous media, Commun. Math. Phys. 241 (2003) 519–581.
[5] A. Babin and A. Figotin, Polylinear spectral decomposition for nonlinear Maxwell equations, in Partial Differential Equations, eds. M. S. Agranovich and M. A. Shubin, Advances in Mathematical Sciences, American Mathematical Society Translations Series 2, Vol. 206 (American Mathematical Society, Providence, RI, 2002), pp. 1–28.
[6] A. Babin and A. Figotin, Nonlinear photonic crystals: IV. Nonlinear Schrödinger equation regime, Waves Random and Complex Media 15(2) (2005) 145–228.
[7] A. Babin and A. Figotin, Wavepacket preservation under nonlinear evolution, submitted; arXiv:math.AP/0607723.
[8] A. Babin, A. Mahalov and B. Nicolaenko, Global regularity of 3D rotating Navier–Stokes equations for resonant domains, Indiana Univ. Math. J. 48(3) (1999) 1133–1176.
[9] A. Babin, A. Mahalov and B. Nicolaenko, Fast singular oscillating limits and global regularity for the 3D primitive equations of geophysics, M2AN Math. Model. Numer. Anal. 34(2) (2000) 201–222.
[10] D. Bambusi, Birkhoff normal form for some nonlinear PDEs, Comm. Math. Phys. 234(2) (2003) 253–285.
[11] W. Ben Youssef and D. Lannes, The long wave limit for a general class of 2D quasilinear hyperbolic problems, Comm. Partial Differential Equations 27(5–6) (2002) 979–1020.
[12] G. P. Berman and F. M. Izrailev, The Fermi–Pasta–Ulam Problem: 50 Years of Progress, arXiv:nlin.CD.
[13] N. N. Bogoliubov and Y. A. Mitropolsky, Asymptotic Methods in the Theory of Non-Linear Oscillations (Hindustan Pub. Corp., Delhi, 1961).
[14] J. L. Bona, T. Colin and D. Lannes, Long wave approximations for water waves, Arch. Ration. Mech. Anal. 178(3) (2005) 373–410.
[15] J. Bourgain, Global Solutions of Nonlinear Schrödinger Equations, American Mathematical Society Colloquium Publications, Vol. 46 (American Mathematical Society, Providence, RI, 1999).
[16] T. Cazenave, Semilinear Schrödinger Equations, Courant Lecture Notes in Mathematics, Vol. 10 (American Mathematical Society, Providence, RI, 2003).
[17] T. Colin, Rigorous derivation of the nonlinear Schrödinger equation and Davey–Stewartson systems from quadratic hyperbolic systems, Asymptot. Anal. 31(1) (2002) 69–91.
[18] T. Colin and D. Lannes, Justification of and long-wave correction to Davey–Stewartson systems from quadratic hyperbolic systems, Discrete Contin. Dyn. Syst. 11(1) (2004) 83–100.
[19] W. Craig and M. D. Groves, Normal forms for wave motion in fluid interfaces, Wave Motion 31(1) (2000) 21–41.
[20] W. Craig, C. Sulem and P.-L. Sulem, Nonlinear modulation of gravity waves: A rigorous approach, Nonlinearity 5(2) (1992) 497–522.
[21] S. Dineen, Complex Analysis on Infinite Dimensional Spaces (Springer, 1999).
[22] T. Gallay and C. E. Wayne, Invariant manifolds and the long-time asymptotics of the Navier–Stokes and vorticity equations on $R^2$, Arch. Ration. Mech. Anal. 163(3) (2002) 209–258.
[23] J. Giannoulis and A. Mielke, The nonlinear Schrödinger equation as a macroscopic limit for an oscillator chain with cubic nonlinearities, Nonlinearity 17(2) (2004) 551–565.
[24] N. Hayashi and P. Naumkin, Asymptotics of small solutions to nonlinear Schrödinger equations with cubic nonlinearities, Int. J. Pure Appl. Math. 3(3) (2002) 255–273.
[25] E. Hille and R. S. Phillips, Functional Analysis and Semigroups (American Mathematical Society, Providence, RI, 1991).
[26] E. Infeld and G. Rowlands, Nonlinear Waves, Solitons, and Chaos, 2nd edn. (Cambridge University Press, 2000).
[27] G. Iooss and E. Lombardi, Polynomial normal forms with exponentially small remainder for analytic vector fields, J. Differential Equations 212(1) (2005) 1–61.
[28] J.-L. Joly, G. Metivier and J. Rauch, Diffractive nonlinear geometric optics with rectification, Indiana Univ. Math. J. 47(4) (1998) 1167–1241.
[29] L. A. Kalyakin, Long-wave asymptotics. Integrable equations as the asymptotic limit of nonlinear systems, Uspekhi Mat. Nauk 44(1) (1989) 5–34, 247; Russian Math. Surveys 44(1) (1989) 3–42 (translations).
[30] L. A. Kalyakin, Asymptotic decay of a one-dimensional wave packet in a nonlinear dispersive medium, Math. USSR Sb. Surveys 60(2) (1988) 457–483.
[31] S. B. Kuksin, Fifteen years of KAM for PDE, in Geometry, Topology, and Mathematical Physics, Amer. Math. Soc. Transl. Ser. 2, Vol. 212 (Amer. Math. Soc., Providence, RI, 2004), pp. 237–258.
[32] P. Kirrmann, G. Schneider and A. Mielke, The validity of modulation equations for extended systems with cubic nonlinearities, Proc. Roy. Soc. Edinburgh Sect. A 122(1–2) (1992) 85–91.
[33] P. D. Lax, Integrals of nonlinear equations of evolution and solitary waves, Comm. Pure Appl. Math. 21 (1968) 467–490.
[34] V. P. Maslov, Non-standard characteristics in asymptotic problems, Uspekhi Mat. Nauk 38(6) (1983) 3–36; Russian Math. Surveys 38(6) (1983) 1–42 (translations).
[35] A. Mielke, G. Schneider and A. Ziegra, Comparison of inertial manifolds and application to modulated systems, Math. Nachr. 214 (2000) 53–69.
[36] A. H. Nayfeh, Perturbation Methods (Wiley, New York, 1973).
[37] A. Pankov, Travelling Waves and Periodic Oscillations in Fermi–Pasta–Ulam Lattices (Imperial College Press, 2005).
[38] R. D. Pierce and C. E. Wayne, On the validity of mean-field amplitude equations for counterpropagating wavetrains, Nonlinearity 8(5) (1995) 769–779.
[39] G. Schneider, Justification of modulation equations for hyperbolic systems via normal forms, NoDEA Nonlinear Differential Equations Appl. 5(1) (1998) 69–82.
[40] G. Schneider, Justification and failure of the nonlinear Schrödinger equation in case of non-trivial quadratic resonances, J. Differential Equations 216(2) (2005) 354–386.
[41] G. Schneider and H. Uecker, Existence and stability of modulating pulse solutions in Maxwell's equations describing nonlinear optics, Z. Angew. Math. Phys. 54(4) (2003) 677–712.
[42] C. Sulem and P.-L. Sulem, The Nonlinear Schrödinger Equation (Springer, 1999).
[43] A. Soffer and M. I. Weinstein, Resonances, radiation damping and instability in Hamiltonian nonlinear wave equations, Invent. Math. 136(1) (1999) 9–74.
[44] T. P. Weissert, The Genesis of Simulation in Dynamics: Pursuing the Fermi–Pasta–Ulam Problem (Springer-Verlag, New York, 1997).
[45] G. Whitham, Linear and Nonlinear Waves (John Wiley & Sons, 1974).
December 15, 2006 16:52 WSPC/148-RMP
J070-00283
Reviews in Mathematical Physics Vol. 18, No. 10 (2006) 1055–1073 © World Scientific Publishing Company
FAY-LIKE IDENTITIES OF THE TODA LATTICE HIERARCHY AND ITS DISPERSIONLESS LIMIT
LEE-PENG TEO
Faculty of Information Technology, Multimedia University, Jalan Multimedia, Cyberjaya, 63100, Selangor Darul Ehsan, Malaysia
[email protected]

Received 27 June 2006

In this paper, we derive the Fay-like identities of the tau function for the Toda lattice hierarchy from the bilinear identity. We prove that the Fay-like identities are equivalent to the hierarchy. We also show that the dispersionless limit of the Fay-like identities gives the dispersionless Hirota equations of the dispersionless Toda hierarchy.

Keywords: Toda lattice hierarchy; tau function; Fay-like identities; dispersionless limit.

Mathematics Subject Classification 2000: 37K10, 37K20
1. Introduction

The Toda lattice hierarchy was introduced in [13] as a generalization of the Toda lattice (see, e.g., [12]). In the paper [13], Ueno and Takasaki developed the theory along the line of the work of Date, Jimbo, Kashiwara and Miwa [3] on KP hierarchy. In particular, they proved that there exists a tau function for the Toda lattice hierarchy that satisfies a bilinear identity, which implies that one can consider KP hierarchy as a special case of Toda lattice hierarchy. In [9], Takasaki and Takebe considered the dispersionless (quasi-classical) limit of the Toda lattice hierarchy. Since then, the dispersionless Toda (dToda) hierarchy has been found to appear in many other areas of mathematics and physics, such as the evolution of conformal mappings (see, e.g., [15, 5]), the solution of the Dirichlet boundary problem (see, e.g., [6]), WDVV equations (see, e.g., [1]), two-dimensional string theory (see, e.g., [8]) and the normal random matrix model (see, e.g., [14]). One of the ingredients that appears in some of these works is the dispersionless Hirota equations of the tau function of the dToda hierarchy, first written down in [15], as analogues of the dispersionless Hirota equation for dispersionless KP (dKP) hierarchy derived by Takasaki and Takebe in [10] (see also [2]). In the Appendix of this seminal paper [10], Takasaki and Takebe derived the differential Fay identity from the bilinear identity satisfied by the tau function of KP hierarchy. They showed that the differential Fay identity is equivalent to KP hierarchy, and its dispersionless limit is what we call
dispersionless Hirota equation of dKP hierarchy nowadays. However, to date, we have not found any derivation of the dispersionless Hirota equation for dToda hierarchy directly as dispersionless limits of equations satisfied by the tau function of the Toda lattice hierarchy. The goal of the present paper is to solve this problem. In Sec. 2, we review some basic facts about the Toda lattice hierarchy. In Sec. 3, we re-derive the existence of a tau function for Toda lattice hierarchy along the same line as the proof of existence of a tau function for KP hierarchy in [3]. This section serves as a warm-up for later sections. In Sec. 4, we derive what we call the Fay-like identities for Toda lattice hierarchy from the bilinear identity satisfied by the tau function. In Sec. 5, we prove that the Fay-like identities are equivalent to the Toda lattice hierarchy. More specifically, a function satisfies the Fay-like identities if and only if it is a tau function of the Toda lattice hierarchy. Finally, in Sec. 6, we show that the dispersionless limit of the Fay-like identities gives the dispersionless Hirota equations of dToda hierarchy.

2. Toda Lattice Hierarchy

In this section, we quickly review the facts we need about the Toda lattice hierarchy [13]. We closely follow the exposition in [10]. Let $x = (x_1, x_2, \ldots)$ and $y = (y_1, y_2, \ldots)$ be two sets of continuous variables. We denote by $s$ a continuous variable with spacing unit $\hbar$. The Lax formalism of Toda lattice hierarchy is
\[ \hbar\frac{\partial L}{\partial x_n} = [B_n, L], \qquad \hbar\frac{\partial L}{\partial y_n} = [C_n, L], \qquad \hbar\frac{\partial K}{\partial x_n} = [B_n, K], \qquad \hbar\frac{\partial K}{\partial y_n} = [C_n, K], \tag{2.1} \]
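where $L$, $K$, $B_n$, $C_n$ are difference operators. $L$ and $K^{-1}$ have the form
\[ L = e^{\hbar\partial_s} + \sum_{n=0}^{\infty} u^+_{n+1}(\hbar,s;x,y)\,e^{-n\hbar\partial_s}, \qquad K^{-1} = u^-_0(\hbar,s;x,y)\,e^{-\hbar\partial_s} + \sum_{n=0}^{\infty} u^-_{n+1}(\hbar,s;x,y)\,e^{n\hbar\partial_s}, \tag{2.2} \]
where the functions $u^\pm_n(\hbar,s;x,y)$ are assumed to be regular in $\hbar$, i.e. $u^\pm_n(\hbar,s;x,y) = u^\pm_{n,0}(s;x,y) + O(\hbar)$ as $\hbar \to 0$. $B_n$, $C_n$ are defined by
\[ B_n = (L^n)_{\ge 0}, \qquad C_n = (K^{-n})_{<0}, \]
where for $A = \sum_{n\in Z} A_n e^{n\hbar\partial_s}$ a difference operator and $S$ a subset of $Z$, we let $(A)_S = \sum_{n\in S} A_n e^{n\hbar\partial_s}$. There exist two dressing operators $\hat W^\pm(\hbar,s;x,y)$,
\[ \hat W^\pm(\hbar,s;x,y) = \sum_{n=0}^{\infty} w^\pm_n(\hbar,s;x,y)\,e^{\mp n\hbar\partial_s}, \qquad w^+_0(\hbar,s;x,y) \equiv 1, \]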
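such that
\[ L\hat W^+ = \hat W^+ e^{\hbar\partial_s}, \qquad K\hat W^- = \hat W^- e^{\hbar\partial_s}, \]
\[ \hbar\frac{\partial\hat W^+}{\partial x_n} = -(L^n)_{<0}\hat W^+, \qquad \hbar\frac{\partial\hat W^+}{\partial y_n} = C_n\hat W^+, \qquad \hbar\frac{\partial\hat W^-}{\partial x_n} = B_n\hat W^-, \qquad \hbar\frac{\partial\hat W^-}{\partial y_n} = -(K^{-n})_{\ge 0}\hat W^-. \tag{2.3} \]
Let $\xi(t,a) = \sum_{n=1}^{\infty} t_n a^n$. Define
\[ W^+(\hbar,s;x,y) = \hat W^+(\hbar,s;x,y)\,e^{\xi(x,e^{\hbar\partial_s})/\hbar} = \sum_{n=0}^{\infty} w^+_n(\hbar,s;x,y)\,e^{-n\hbar\partial_s}e^{\xi(x,e^{\hbar\partial_s})/\hbar}, \]
\[ W^-(\hbar,s;x,y) = \hat W^-(\hbar,s;x,y)\,e^{\xi(y,e^{-\hbar\partial_s})/\hbar} = \sum_{n=0}^{\infty} w^-_n(\hbar,s;x,y)\,e^{n\hbar\partial_s}e^{\xi(y,e^{-\hbar\partial_s})/\hbar}. \tag{2.4} \]
Then the system (2.3) is equivalent to
\[ LW^+ = W^+e^{\hbar\partial_s}, \qquad KW^- = W^-e^{\hbar\partial_s}, \]
\[ \hbar\frac{\partial W^+}{\partial x_n} = B_nW^+, \qquad \hbar\frac{\partial W^+}{\partial y_n} = C_nW^+, \qquad \hbar\frac{\partial W^-}{\partial x_n} = B_nW^-, \qquad \hbar\frac{\partial W^-}{\partial y_n} = C_nW^-. \]
Therefore
\[ \hbar\frac{\partial W^+}{\partial x_n}\cdot(W^+)^{-1} = \hbar\frac{\partial W^-}{\partial x_n}\cdot(W^-)^{-1}, \qquad \hbar\frac{\partial W^+}{\partial y_n}\cdot(W^+)^{-1} = \hbar\frac{\partial W^-}{\partial y_n}\cdot(W^-)^{-1}. \tag{2.5} \]
This gives the bilinear identity
\[ W^+(\hbar,s;x,y)\cdot(W^+)^{-1}(\hbar,s;x',y') = W^-(\hbar,s;x,y)\cdot(W^-)^{-1}(\hbar,s;x',y') \tag{2.6} \]
for all $s, x, y, x', y'$. Let
\[ (\hat W^+)^{-1}(\hbar,s;x,y) = \sum_{n=0}^{\infty} e^{-n\hbar\partial_s}\,w^{+,*}_n(\hbar,s+\hbar;x,y), \qquad (\hat W^-)^{-1}(\hbar,s;x,y) = \sum_{n=0}^{\infty} e^{n\hbar\partial_s}\,w^{-,*}_n(\hbar,s+\hbar;x,y), \]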
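and introduce the wave functions $w^\pm$ and the dual wave functions $w^{\pm,*}$ by
\[ w^+(\hbar,s;x,y;\lambda) = \Big(\sum_{n=0}^{\infty} w^+_n(\hbar,s;x,y)\lambda^{-n}\Big)\lambda^{s/\hbar}e^{\xi(x,\lambda)/\hbar} = \hat w^+(\hbar,s;x,y;\lambda)\,\lambda^{s/\hbar}e^{\xi(x,\lambda)/\hbar}, \]
\[ w^{+,*}(\hbar,s;x,y;\lambda) = \Big(\sum_{n=0}^{\infty} w^{+,*}_n(\hbar,s;x,y)\lambda^{-n}\Big)\lambda^{-s/\hbar}e^{-\xi(x,\lambda)/\hbar} = \hat w^{+,*}(\hbar,s;x,y;\lambda)\,\lambda^{-s/\hbar}e^{-\xi(x,\lambda)/\hbar}, \]
\[ w^-(\hbar,s;x,y;\lambda) = \Big(\sum_{n=0}^{\infty} w^-_n(\hbar,s;x,y)\lambda^{n}\Big)\lambda^{s/\hbar}e^{\xi(y,\lambda^{-1})/\hbar} = \hat w^-(\hbar,s;x,y;\lambda)\,\lambda^{s/\hbar}e^{\xi(y,\lambda^{-1})/\hbar}, \tag{2.7} \]
\[ w^{-,*}(\hbar,s;x,y;\lambda) = \Big(\sum_{n=0}^{\infty} w^{-,*}_n(\hbar,s;x,y)\lambda^{n}\Big)\lambda^{-s/\hbar}e^{-\xi(y,\lambda^{-1})/\hbar} = \hat w^{-,*}(\hbar,s;x,y;\lambda)\,\lambda^{-s/\hbar}e^{-\xi(y,\lambda^{-1})/\hbar}. \]
In terms of the wave functions and dual wave functions, the bilinear identity (2.6) can be written in the residual form
\[ \mathrm{Res}_\lambda\big(w^+(\hbar,s;x,y;\lambda)\,w^{+,*}(\hbar,s';x',y';\lambda)\big) = \mathrm{Res}_\lambda\big(w^-(\hbar,s;x,y;\lambda^{-1})\,w^{-,*}(\hbar,s';x',y';\lambda^{-1})\,\lambda^{-2}\big) \tag{2.8} \]
for all $s, s', x, x', y, y'$. Here for a power series $\beta(\lambda) = \sum_{n\in Z}\beta_n\lambda^n$, $\mathrm{Res}_\lambda(\beta(\lambda)) = \beta_{-1}$. On the other hand, we have: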
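Proposition 2.1. If $W^+$ and $W^-$ of the form (2.4) satisfy the system of equations (2.5), then
\[ \hbar\frac{\partial W^+}{\partial x_n} = B_nW^+, \qquad \hbar\frac{\partial W^+}{\partial y_n} = C_nW^+, \qquad \hbar\frac{\partial W^-}{\partial x_n} = B_nW^-, \qquad \hbar\frac{\partial W^-}{\partial y_n} = C_nW^-; \]
and $(L, K)$ defined by
\[ L = W^+e^{\hbar\partial_s}(W^+)^{-1}, \qquad K = W^-e^{\hbar\partial_s}(W^-)^{-1}, \]
is a solution of the Toda lattice hierarchy.

Proof. See the proof of [13, Theorem 1.5].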
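3. Existence of Tau Function

Using the bilinear identity (2.6), Ueno and Takasaki [10] proved that there exists a tau function $\tau(\hbar,s;x,y)$ such that
\[ \hat w^+(\hbar,s;x,y;\lambda) = \frac{\tau(\hbar,s;x-\hbar[\lambda^{-1}],y)}{\tau(\hbar,s;x,y)}, \qquad \hat w^{+,*}(\hbar,s;x,y;\lambda) = \frac{\tau(\hbar,s;x+\hbar[\lambda^{-1}],y)}{\tau(\hbar,s;x,y)}, \]
\[ \hat w^-(\hbar,s;x,y;\lambda) = \frac{\tau(\hbar,s+\hbar;x,y-\hbar[\lambda])}{\tau(\hbar,s;x,y)}, \qquad \hat w^{-,*}(\hbar,s;x,y;\lambda) = \frac{\tau(\hbar,s-\hbar;x,y+\hbar[\lambda])}{\tau(\hbar,s;x,y)}. \tag{3.1} \]
Here $[\lambda] = (\lambda, \frac{1}{2}\lambda^2, \frac{1}{3}\lambda^3, \ldots)$ and $\hat w^\pm$, $\hat w^{\pm,*}$ are defined in (2.7). As $\hbar \to 0$, the function $\log\tau(\hbar,s;x,y)$ behaves as (see [10])
\[ \log\tau(\hbar,s;x,y) = \hbar^{-2}F(s;x,y) + O(\hbar^{-1}) \tag{3.2} \]
for some function $F(s;x,y)$. In this section, we recapitulate the proof of [13] along the line of the proof of existence of a tau function for KP hierarchy given by [3]. We define the operators
\[ G^+(\mu) = \exp\Big(-\hbar\sum_{n=1}^{\infty}\frac{\mu^{-n}}{n}\frac{\partial}{\partial x_n}\Big), \qquad G^-(\nu) = \exp\Big(-\hbar\sum_{n=1}^{\infty}\frac{\nu^{n}}{n}\frac{\partial}{\partial y_n}\Big) \tag{3.3} \]
and
\[ d^+ = \sum_{n=1}^{\infty} dx_n\frac{\partial}{\partial x_n}, \qquad d^- = \sum_{n=1}^{\infty} dy_n\frac{\partial}{\partial y_n}, \qquad d = d^+ + d^-. \]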
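The operators in (3.3) are exponentials of first-order differential operators, so by Taylor's theorem they act as shifts of the arguments; for instance $G^+(\mu)f(x) = f(x - \hbar[\mu^{-1}])$, assuming the factor $\hbar$ in the exponent as in the Takasaki–Takebe conventions. The following minimal sketch checks the one-variable version of this fact, $\exp(-a\,d/dx)f(x) = f(x-a)$, exactly on a polynomial. The helper names and the sample polynomial are assumptions chosen for illustration.

```python
from fractions import Fraction
from math import factorial


def derivative(coeffs):
    """Derivative of a polynomial given by coefficients (lowest degree first)."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def apply_exp_shift(coeffs, a):
    """Apply exp(-a d/dx) = sum_k (-a)^k / k! * d^k/dx^k to a polynomial.

    The series terminates because high derivatives of a polynomial vanish.
    """
    result = [Fraction(0)] * len(coeffs)
    term = [Fraction(c) for c in coeffs]
    k = 0
    while term:
        weight = (-a) ** k / factorial(k)
        for i, c in enumerate(term):
            result[i] += weight * c
        term = derivative(term)
        k += 1
    return result


def evaluate(coeffs, x):
    """Evaluate the polynomial at x."""
    return sum(c * x**i for i, c in enumerate(coeffs))


# Sample data: f(x) = 1 + 2x + 3x^3, shift a = hbar * mu^{-1} restricted to x_1.
f = [1, 2, 0, 3]
hbar, mu = Fraction(1, 2), Fraction(3)
a = hbar / mu
x0 = Fraction(5, 7)

shifted_value = evaluate(apply_exp_shift(f, a), x0)
direct_value = evaluate(f, x0 - a)
```

Here only the first component of $[\mu^{-1}]$ is modelled; the full operator shifts every $x_n$ by $\hbar\mu^{-n}/n$ in the same way.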
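We are going to make use of the following identities:
\[ \frac{1}{(1-\lambda\eta_1)(1-\lambda\eta_2)} = \frac{\eta_1}{\eta_1-\eta_2}\,\frac{1}{1-\lambda\eta_1} - \frac{\eta_2}{\eta_1-\eta_2}\,\frac{1}{1-\lambda\eta_2}, \qquad \mathrm{Res}_\lambda\Big(\sum_{n=0}^{\infty}\alpha_n\lambda^{-n}\,\frac{1}{1-\lambda\zeta^{-1}}\Big) = \zeta\sum_{n=1}^{\infty}\alpha_n\zeta^{-n}. \tag{3.4} \]
First, we define
\[ A^+_n(\hbar;s;x,y) = \mathrm{Res}_\lambda\,\lambda^{n}\Big(-\sum_{j=1}^{\infty}\lambda^{-j-1}\hbar\frac{\partial}{\partial x_j} + \frac{\partial}{\partial\lambda}\Big)\log\hat w^+(\hbar,s;x,y;\lambda), \]
\[ A^-_n(\hbar;s;x,y) = -\mathrm{Res}_\lambda\,\lambda^{-n}\Big(\sum_{j=1}^{\infty}\lambda^{j-1}\hbar\frac{\partial}{\partial y_j} + \frac{\partial}{\partial\lambda}\Big)\log\hat w^-(\hbar,s;x,y;\lambda), \]
\[ \omega = \hbar^{-1}\sum_{n=1}^{\infty}A^+_n\,dx_n + \hbar^{-1}\sum_{n=1}^{\infty}A^-_n\,dy_n = \omega^+ + \omega^-, \tag{3.5} \]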
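and rewrite the bilinear identity (2.8) as
\[ \mathrm{Res}_\lambda\big(\hat w^+(\hbar,s;x,y;\lambda)\,\hat w^{+,*}(\hbar,s';x',y';\lambda)\,\lambda^{(s-s')/\hbar}e^{(\xi(x,\lambda)-\xi(x',\lambda))/\hbar}\big) = \mathrm{Res}_\lambda\big(\hat w^-(\hbar,s;x,y;\lambda^{-1})\,\hat w^{-,*}(\hbar,s';x',y';\lambda^{-1})\,\lambda^{-2+(s'-s)/\hbar}e^{(\xi(y,\lambda)-\xi(y',\lambda))/\hbar}\big). \tag{3.6} \]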
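The two identities in (3.4) are what turns each residue in the cases considered next into an evaluation at a point. As a sanity check, the sketch below verifies both in exact rational arithmetic: the partial-fraction identity at sample values, and the residue identity by explicitly multiplying a finite series in $\lambda^{-1}$ with the geometric series for $1/(1-\lambda\zeta^{-1})$ and reading off the coefficient of $\lambda^{-1}$. Function names and sample numbers are assumptions chosen for illustration.

```python
from fractions import Fraction


def partial_fraction_gap(lam, eta1, eta2):
    """Difference between the two sides of the first identity in (3.4)."""
    lhs = 1 / ((1 - lam * eta1) * (1 - lam * eta2))
    rhs = (eta1 / (eta1 - eta2)) / (1 - lam * eta1) \
        - (eta2 / (eta1 - eta2)) / (1 - lam * eta2)
    return lhs - rhs


def residue_by_multiplication(alpha, zeta):
    """Coefficient of lambda^{-1} in (sum_n alpha[n] lambda^{-n}) * sum_k zeta^{-k} lambda^k.

    alpha is a finite list, so only k <= len(alpha) - 1 can contribute.
    """
    total = Fraction(0)
    for n, a_n in enumerate(alpha):
        for k in range(len(alpha)):
            if k - n == -1:
                total += a_n * zeta ** (-k)
    return total


def residue_closed_form(alpha, zeta):
    """Right-hand side of the second identity in (3.4)."""
    return zeta * sum(a_n * zeta ** (-n) for n, a_n in enumerate(alpha) if n >= 1)


alpha = [Fraction(1), Fraction(2), Fraction(-3), Fraction(5, 7)]
zeta = Fraction(3, 2)

gap = partial_fraction_gap(Fraction(1, 5), Fraction(2), Fraction(-3))
res_direct = residue_by_multiplication(alpha, zeta)
res_formula = residue_closed_form(alpha, zeta)
```

Note that the coefficient $\alpha_0$ never contributes to the residue, which is why the sum on the right-hand side of (3.4) starts at $n = 1$.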
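We consider the following cases:

Case I. $s' = s$, $x' = x - \hbar[\mu_1^{-1}] - \hbar[\mu_2^{-1}]$, $y' = y$. In this case, the bilinear identity (3.6) gives
\[ \mathrm{Res}_\lambda\Big(\hat w^+(\hbar,s;x,y;\lambda)\,\hat w^{+,*}(\hbar,s;x-\hbar[\mu_1^{-1}]-\hbar[\mu_2^{-1}],y;\lambda)\,\frac{1}{(1-\lambda\mu_1^{-1})(1-\lambda\mu_2^{-1})}\Big) = 0. \]
Using the formulas in (3.4), this gives
\[ \hat w^+(\hbar,s;x,y;\mu_1)\,G^+(\mu_1)G^+(\mu_2)\hat w^{+,*}(\hbar,s;x,y;\mu_1) = \hat w^+(\hbar,s;x,y;\mu_2)\,G^+(\mu_1)G^+(\mu_2)\hat w^{+,*}(\hbar,s;x,y;\mu_2). \tag{3.7} \]
Setting $\mu_1 = \lambda$ and putting $\mu_2^{-1} = 0$, we have
\[ \hat w^+(\hbar,s;x,y;\lambda)\,G^+(\lambda)\hat w^{+,*}(\hbar,s;x,y;\lambda) = 1, \tag{3.8} \]
or equivalently,
\[ G^+(\lambda)\hat w^{+,*}(\hbar,s;x,y;\lambda) = \frac{1}{\hat w^+(\hbar,s;x,y;\lambda)}. \tag{3.9} \]
Using this relation, (3.7) gives
\[ \frac{\hat w^+(\hbar,s;x,y;\lambda_1)}{G^+(\lambda_2)\hat w^+(\hbar,s;x,y;\lambda_1)} = \frac{\hat w^+(\hbar,s;x,y;\lambda_2)}{G^+(\lambda_1)\hat w^+(\hbar,s;x,y;\lambda_2)}, \]
or equivalently,
\[ \log\hat w^+(\hbar,s;x,y;\lambda_1) - G^+(\lambda_2)\log\hat w^+(\hbar,s;x,y;\lambda_1) = \log\hat w^+(\hbar,s;x,y;\lambda_2) - G^+(\lambda_1)\log\hat w^+(\hbar,s;x,y;\lambda_2). \tag{3.10} \]

Case II. $s' = s + \hbar$, $x' = x - \hbar[\mu^{-1}]$, $y' = y - \hbar[\nu]$. In this case, the bilinear identity (3.6) gives
\[ \mathrm{Res}_\lambda\Big(\hat w^+(\hbar,s;x,y;\lambda)\,\hat w^{+,*}(\hbar,s+\hbar;x-\hbar[\mu^{-1}],y-\hbar[\nu];\lambda)\,\lambda^{-1}\frac{1}{1-\lambda\mu^{-1}}\Big) = \mathrm{Res}_\lambda\Big(\hat w^-(\hbar,s;x,y;\lambda^{-1})\,\hat w^{-,*}(\hbar,s+\hbar;x-\hbar[\mu^{-1}],y-\hbar[\nu];\lambda^{-1})\,\lambda^{-1}\frac{1}{1-\lambda\nu}\Big). \]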
Using the second formula in (3.4), this gives
\[ \hat w^+(\hbar,s;x,y;\mu)\,G^+(\mu)G^-(\nu)\hat w^{+,*}(\hbar,s+\hbar;x,y;\mu) = \hat w^-(\hbar,s;x,y;\nu)\,G^+(\mu)G^-(\nu)\hat w^{-,*}(\hbar,s+\hbar;x,y;\nu). \tag{3.11} \]
Setting $\mu^{-1} = 0$ and $\nu = \lambda$, we obtain
\[ \hat w^-(\hbar,s;x,y;\lambda)\,G^-(\lambda)\hat w^{-,*}(\hbar,s+\hbar;x,y;\lambda) = 1. \tag{3.12} \]
Equivalently,
\[ G^-(\lambda)\hat w^{-,*}(\hbar,s;x,y;\lambda) = \frac{1}{\hat w^-(\hbar,s-\hbar;x,y;\lambda)}. \tag{3.13} \]
Substituting (3.13) and (3.9) into (3.11), we have
\[ \frac{\hat w^+(\hbar,s;x,y;\lambda_1)}{G^-(\lambda_2)\hat w^+(\hbar,s+\hbar;x,y;\lambda_1)} = \frac{\hat w^-(\hbar,s;x,y;\lambda_2)}{G^+(\lambda_1)\hat w^-(\hbar,s;x,y;\lambda_2)}, \]
or equivalently,
\[ \log\hat w^+(\hbar,s;x,y;\lambda_1) - G^-(\lambda_2)\log\hat w^+(\hbar,s+\hbar;x,y;\lambda_1) = \log\hat w^-(\hbar,s;x,y;\lambda_2) - G^+(\lambda_1)\log\hat w^-(\hbar,s;x,y;\lambda_2). \tag{3.14} \]

Case III. $s' = s + 2\hbar$, $x' = x$, $y' = y - \hbar[\nu_1] - \hbar[\nu_2]$. In this case, the bilinear identity (3.6) gives
\[ 0 = \mathrm{Res}_\lambda\Big(\hat w^-(\hbar,s;x,y;\lambda^{-1})\,\hat w^{-,*}(\hbar,s+2\hbar;x,y-\hbar[\nu_1]-\hbar[\nu_2];\lambda^{-1})\,\frac{1}{(1-\lambda\nu_1)(1-\lambda\nu_2)}\Big). \]
Using the formulas in (3.4), this gives
\[ \hat w^-(\hbar,s;x,y;\nu_1)\,G^-(\nu_1)G^-(\nu_2)\hat w^{-,*}(\hbar,s+2\hbar;x,y;\nu_1) = \hat w^-(\hbar,s;x,y;\nu_2)\,G^-(\nu_1)G^-(\nu_2)\hat w^{-,*}(\hbar,s+2\hbar;x,y;\nu_2). \]
Using (3.13), this gives
\[ \frac{\hat w^-(\hbar,s;x,y;\lambda_1)}{G^-(\lambda_2)\hat w^-(\hbar,s+\hbar;x,y;\lambda_1)} = \frac{\hat w^-(\hbar,s;x,y;\lambda_2)}{G^-(\lambda_1)\hat w^-(\hbar,s+\hbar;x,y;\lambda_2)}, \]
or equivalently,
\[ \log\hat w^-(\hbar,s;x,y;\lambda_1) - G^-(\lambda_2)\log\hat w^-(\hbar,s+\hbar;x,y;\lambda_1) = \log\hat w^-(\hbar,s;x,y;\lambda_2) - G^-(\lambda_1)\log\hat w^-(\hbar,s+\hbar;x,y;\lambda_2). \tag{3.15} \]
Now, using the definition (3.5) of $\omega^+$ and $\omega^-$, (3.10), (3.14) and (3.15) give us, respectively,
\[ \omega^+(\hbar,s;x,y) - G^+(\lambda)\omega^+(\hbar,s;x,y) = -d^+\log\hat w^+(\hbar,s;x,y;\lambda), \]
\[ \omega^-(\hbar,s;x,y) - G^+(\lambda)\omega^-(\hbar,s;x,y) = -d^-\log\hat w^+(\hbar,s;x,y;\lambda), \]
\[ \omega^+(\hbar,s;x,y) - G^-(\lambda)\omega^+(\hbar,s+\hbar;x,y) = -d^+\log\hat w^-(\hbar,s;x,y;\lambda), \tag{3.16} \]
\[ \omega^-(\hbar,s;x,y) - G^-(\lambda)\omega^-(\hbar,s+\hbar;x,y) = -d^-\log\hat w^-(\hbar,s;x,y;\lambda). \]
The first two equations and the last two equations give, respectively,
\[ \omega(\hbar,s;x,y) - G^+(\lambda)\omega(\hbar,s;x,y) = -d\log\hat w^+(\hbar,s;x,y;\lambda), \qquad \omega(\hbar,s;x,y) - G^-(\lambda)\omega(\hbar,s+\hbar;x,y) = -d\log\hat w^-(\hbar,s;x,y;\lambda). \tag{3.17} \]
Setting $\lambda = 0$ in the second equation, we have
\[ \omega(\hbar,s;x,y) - \omega(\hbar,s+\hbar;x,y) = -d\log w^-_0(\hbar,s;x,y). \tag{3.18} \]
Applying $d$ again to both sides of Eqs. (3.17) and (3.18), we obtain
\[ d\omega(\hbar,s;x,y) = G^+(\lambda)\,d\omega(\hbar,s;x,y), \qquad d\omega(\hbar,s;x,y) = G^-(\lambda)\,d\omega(\hbar,s+\hbar;x,y), \qquad d\omega(\hbar,s;x,y) = d\omega(\hbar,s+\hbar;x,y). \tag{3.19} \]
This implies that
\[ d\omega(\hbar,s;x,y) = \sum_{n\ne0}\sum_{m\ne0} a_{mn}(\hbar)\,dt_m\wedge dt_n, \tag{3.20} \]
where $t_n = x_n$ if $n > 0$ and $t_n = y_{-n}$ if $n < 0$; $a_{mn}(\hbar)$ are independent of $x$, $y$ and $s$ and $a_{nm} = -a_{mn}$. Therefore,
\[ \omega(\hbar,s;x,y) = \sum_{n\ne0}\sum_{m\ne0} a_{mn}(\hbar)\,t_m\,dt_n + dH(\hbar,s;x,y) \tag{3.21} \]
for some function $H(\hbar,s;x,y)$. Substituting back into Eqs. (3.17), we have
\[ d\log\hat w^+(\hbar,s;x,y;\lambda) = G^+(\lambda)\,dH(\hbar,s;x,y) - dH(\hbar,s;x,y) - \hbar\sum_{n\ne0}\sum_{m=1}^{\infty} a_{mn}(\hbar)\frac{\lambda^{-m}}{m}\,dt_n, \]
\[ d\log\hat w^-(\hbar,s;x,y;\lambda) = G^-(\lambda)\,dH(\hbar,s+\hbar;x,y) - dH(\hbar,s;x,y) - \hbar\sum_{n\ne0}\sum_{m=1}^{\infty} a_{-m,n}(\hbar;s)\frac{\lambda^{m}}{m}\,dt_n. \]
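Therefore, for some functions $A(\hbar,s;\lambda) = \sum_{n=1}^{\infty}A_n(\hbar,s)\lambda^{-n}$ and $B(\hbar,s;\lambda) = \sum_{n=0}^{\infty}B_n(\hbar,s)\lambda^{n}$ independent of $x, y$, we have
\[ \log\hat w^+(\hbar,s;x,y;\lambda) = G^+(\lambda)H(\hbar,s;x,y) - H(\hbar,s;x,y) - \hbar\sum_{n\ne0}\sum_{m=1}^{\infty} a_{mn}(\hbar)\frac{\lambda^{-m}}{m}\,t_n + A(\hbar,s;\lambda), \]
\[ \log\hat w^-(\hbar,s;x,y;\lambda) = G^-(\lambda)H(\hbar,s+\hbar;x,y) - H(\hbar,s;x,y) - \hbar\sum_{n\ne0}\sum_{m=1}^{\infty} a_{-m,n}(\hbar;s)\frac{\lambda^{m}}{m}\,t_n + B(\hbar,s;\lambda). \tag{3.22} \]
Substituting these back into Eqs. (3.10), (3.14) and (3.15), we find that
\[ \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} a_{mn}(\hbar)\frac{\lambda_1^{-m}}{m}\frac{\lambda_2^{-n}}{n} = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} a_{mn}(\hbar)\frac{\lambda_2^{-m}}{m}\frac{\lambda_1^{-n}}{n}, \]
\[ \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} a_{m,-n}(\hbar)\frac{\lambda_1^{-m}}{m}\frac{\lambda_2^{n}}{n} + A(\hbar,s;\lambda_1) - A(\hbar,s+\hbar;\lambda_1) = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} a_{-m,n}(\hbar)\frac{\lambda_2^{m}}{m}\frac{\lambda_1^{-n}}{n}, \]
\[ \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} a_{-m,-n}(\hbar)\frac{\lambda_1^{m}}{m}\frac{\lambda_2^{n}}{n} = \sum_{n=1}^{\infty}\sum_{m=1}^{\infty} a_{-m,-n}(\hbar)\frac{\lambda_2^{m}}{m}\frac{\lambda_1^{n}}{n}. \]
Comparing coefficients on both sides, we find that $a_{mn}(\hbar) = a_{nm}(\hbar)$ for all $n \ne 0$, $m \ne 0$. However, since by definition $a_{nm}(\hbar) = -a_{mn}(\hbar)$, we conclude that $a_{mn}(\hbar) = 0$ for all $m \ne 0$, $n \ne 0$. Hence from (3.20), we have $d\omega(\hbar,s;x,y) = 0$. Together with Eq. (3.18), we conclude that there exists a function $\tau(\hbar,s;x,y)$ such that
\[ \omega(\hbar,s;x,y) = d\log\tau(\hbar,s;x,y), \qquad \frac{\tau(\hbar,s+\hbar;x,y)}{\tau(\hbar,s;x,y)} = w^-_0(\hbar,s;x,y). \tag{3.23} \]
We call $\tau(\hbar,s;x,y)$ the tau function of Toda lattice hierarchy. Comparing with (3.21), we can take $H = \log\tau$. Then (3.22) gives us
\[ \log\hat w^+(\hbar,s;x,y;\lambda) = G^+(\lambda)\log\tau(\hbar,s;x,y) - \log\tau(\hbar,s;x,y) + A(\hbar,s;\lambda), \]
\[ \log\hat w^-(\hbar,s;x,y;\lambda) = G^-(\lambda)\log\tau(\hbar,s+\hbar;x,y) - \log\tau(\hbar,s;x,y) + B(\hbar,s;\lambda). \]
Substituting this into the definition (3.5) of $\omega$ and using (3.23), we conclude that $A(\hbar,s;\lambda) = B(\hbar,s;\lambda) = 0$. This gives the first and third equations in (3.1). Equations (3.9) and (3.13) then give the second and fourth equations in (3.1).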
4. Fay-Like Identities

In this section, we derive the Fay-like identities for Toda lattice hierarchy from the bilinear identity. Substituting Eqs. (3.1) into the bilinear identity (3.6), we obtain the bilinear identity of Toda lattice hierarchy in terms of the tau function:
\[ \mathrm{Res}_\lambda\big(\tau(\hbar,s;x-\hbar[\lambda^{-1}],y)\,\tau(\hbar,s';x'+\hbar[\lambda^{-1}],y')\,\lambda^{(s-s')/\hbar}e^{(\xi(x,\lambda)-\xi(x',\lambda))/\hbar}\big) = \mathrm{Res}_\lambda\big(\tau(\hbar,s+\hbar;x,y-\hbar[\lambda^{-1}])\,\tau(\hbar,s'-\hbar;x',y'+\hbar[\lambda^{-1}])\,\lambda^{-2+(s'-s)/\hbar}e^{(\xi(y,\lambda)-\xi(y',\lambda))/\hbar}\big). \tag{4.1} \]
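We consider the following cases.

Case I. $s' = s+\hbar$, $x' = x-\hbar[\mu_1^{-1}]-\hbar[\mu_2^{-1}]$, $y' = y$. In this case, the bilinear identity (4.1) and formulas in (3.4) give us
\[
\begin{aligned}
&\frac{\mu_1^{-1}}{\mu_1^{-1}-\mu_2^{-1}}\,\tau(\hbar, s; x-\hbar[\mu_1^{-1}], y)\,\tau(\hbar, s+\hbar; x-\hbar[\mu_2^{-1}], y)\\
&\quad- \frac{\mu_2^{-1}}{\mu_1^{-1}-\mu_2^{-1}}\,\tau(\hbar, s; x-\hbar[\mu_2^{-1}], y)\,\tau(\hbar, s+\hbar; x-\hbar[\mu_1^{-1}], y)\\
&\quad= \tau(\hbar, s+\hbar; x, y)\,\tau(\hbar, s; x-\hbar[\mu_1^{-1}]-\hbar[\mu_2^{-1}], y).
\end{aligned} \tag{4.2}
\]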
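Case II. $s' = s+\hbar$, $x' = x$, $y' = y-\hbar[\nu_1]-\hbar[\nu_2]$. In this case, the bilinear identity (4.1) and formulas in (3.4) give us
\[
\begin{aligned}
\tau(\hbar, s; x, y)\,\tau(\hbar, s+\hbar; x, y-\hbar[\nu_1]-\hbar[\nu_2])
&= \frac{\nu_1}{\nu_1-\nu_2}\,\tau(\hbar, s+\hbar; x, y-\hbar[\nu_1])\,\tau(\hbar, s; x, y-\hbar[\nu_2])\\
&\quad- \frac{\nu_2}{\nu_1-\nu_2}\,\tau(\hbar, s+\hbar; x, y-\hbar[\nu_2])\,\tau(\hbar, s; x, y-\hbar[\nu_1]).
\end{aligned} \tag{4.3}
\]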
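Case III. $s' = s$, $x' = x-\hbar[\mu^{-1}]$, $y' = y-\hbar[\nu]$. In this case, the bilinear identity (4.1) and formulas in (3.4) give us
\[
\begin{aligned}
&\mu\bigl(\tau(\hbar, s; x-\hbar[\mu^{-1}], y)\,\tau(\hbar, s; x, y-\hbar[\nu]) - \tau(\hbar, s; x, y)\,\tau(\hbar, s; x-\hbar[\mu^{-1}], y-\hbar[\nu])\bigr)\\
&\quad= \nu\,\tau(\hbar, s+\hbar; x, y-\hbar[\nu])\,\tau(\hbar, s-\hbar; x-\hbar[\mu^{-1}], y).
\end{aligned} \tag{4.4}
\]
Rearranging (4.2)–(4.4), we obtain

(A)
\[
\mu_1 - \mu_2\,\frac{\tau(\hbar, s; x-\hbar[\mu_1^{-1}], y)\,\tau(\hbar, s+\hbar; x-\hbar[\mu_2^{-1}], y)}{\tau(\hbar, s; x-\hbar[\mu_2^{-1}], y)\,\tau(\hbar, s+\hbar; x-\hbar[\mu_1^{-1}], y)}
= (\mu_1-\mu_2)\,\frac{\tau(\hbar, s+\hbar; x, y)\,\tau(\hbar, s; x-\hbar[\mu_1^{-1}]-\hbar[\mu_2^{-1}], y)}{\tau(\hbar, s; x-\hbar[\mu_2^{-1}], y)\,\tau(\hbar, s+\hbar; x-\hbar[\mu_1^{-1}], y)},
\]
(B)
\[
\nu_1 - \nu_2\,\frac{\tau(\hbar, s+\hbar; x, y-\hbar[\nu_2])\,\tau(\hbar, s; x, y-\hbar[\nu_1])}{\tau(\hbar, s+\hbar; x, y-\hbar[\nu_1])\,\tau(\hbar, s; x, y-\hbar[\nu_2])}
= (\nu_1-\nu_2)\,\frac{\tau(\hbar, s; x, y)\,\tau(\hbar, s+\hbar; x, y-\hbar[\nu_1]-\hbar[\nu_2])}{\tau(\hbar, s+\hbar; x, y-\hbar[\nu_1])\,\tau(\hbar, s; x, y-\hbar[\nu_2])},
\]
(C)
\[
\frac{\tau(\hbar, s; x, y)\,\tau(\hbar, s; x-\hbar[\mu^{-1}], y-\hbar[\nu])}{\tau(\hbar, s; x-\hbar[\mu^{-1}], y)\,\tau(\hbar, s; x, y-\hbar[\nu])}
= 1 - \frac{\nu}{\mu}\,\frac{\tau(\hbar, s+\hbar; x, y-\hbar[\nu])\,\tau(\hbar, s-\hbar; x-\hbar[\mu^{-1}], y)}{\tau(\hbar, s; x-\hbar[\mu^{-1}], y)\,\tau(\hbar, s; x, y-\hbar[\nu])}. \tag{4.5}
\]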
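The step from (4.2) to identity (A) is purely algebraic, and it can be sanity-checked with exact rational arithmetic. The sketch below is only an illustration: the six letters `a` to `f` are hypothetical stand-ins for the six tau values in (4.2), drawn at random, and `f` is solved from (4.2) so that the relation holds by construction; the function name is an assumption and nothing here evaluates a real tau function.

```python
from fractions import Fraction
import random


def check_rearrangement(seed=0):
    """Check that (4.2) implies identity (A) of (4.5) for random rational data."""
    rng = random.Random(seed)

    def rnd():
        # Random positive rational, so that no denominator below vanishes.
        return Fraction(rng.randint(1, 50), rng.randint(1, 50))

    mu1, mu2 = Fraction(3), Fraction(7)  # any two distinct nonzero values
    a = rnd()  # stands for tau(s;      x - [mu1^-1], y)
    b = rnd()  # stands for tau(s+hbar; x - [mu2^-1], y)
    c = rnd()  # stands for tau(s;      x - [mu2^-1], y)
    d = rnd()  # stands for tau(s+hbar; x - [mu1^-1], y)
    e = rnd()  # stands for tau(s+hbar; x, y)

    # Solve (4.2) for f, which stands for tau(s; x - [mu1^-1] - [mu2^-1], y).
    den = 1 / mu1 - 1 / mu2
    f = ((1 / mu1) / den * a * b - (1 / mu2) / den * c * d) / e

    # Identity (A): mu1 - mu2 * (a b)/(c d) = (mu1 - mu2) * (e f)/(c d).
    lhs = mu1 - mu2 * (a * b) / (c * d)
    rhs = (mu1 - mu2) * (e * f) / (c * d)
    return lhs == rhs
```

Because `Fraction` is exact, the comparison is an equality test rather than a tolerance check; the same pattern applies to the passage from (4.3) to (B) and from (4.4) to (C).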
We are going to prove in the next section that these three identities alone are enough to imply that $\tau(\hbar, s; x, y)$ is a tau function of the Toda lattice hierarchy. We are also going to prove in Sec. 6 that the dispersionless limits of these identities are precisely the dispersionless Hirota equations of the dispersionless Toda (dToda) hierarchy [7, 15, 6, 16, 1, 11]. These should be compared to the work of Takasaki and Takebe in [10, Appendix], where they showed that the differential Fay identity of the KP hierarchy is equivalent to the KP hierarchy, and that the dispersionless limit of the differential Fay identity is the dispersionless Hirota equation of the KP hierarchy [10, 4, 2]. Therefore we call the identities (A)–(C) in (4.5) the Fay-like identities of the Toda lattice hierarchy.

Remark 4.1. In [16], Zabrodin has written down an equivalent form of the identities (A) and (B) for a sector of the Toda lattice hierarchy where the variables $x_n, y_n$, $n \in \mathbb{N}$, are complex variables satisfying $y_n = -\bar{x}_n$.

5. Equivalence of Fay-Like Identities to the Toda Lattice Hierarchy

In this section, we are going to show that the Fay-like identities (4.5) are equivalent to the Toda lattice hierarchy. More precisely, we have:

Proposition 5.1. If $\tau(\hbar, s; x, y)$ is a function that satisfies the Fay-like identities (4.5), then $\tau(\hbar, s; x, y)$ is a tau function of the Toda lattice hierarchy.

Proof. Given the function $\tau(\hbar, s; x, y)$ that satisfies the Fay-like identities (4.5), we define the functions
\[
\tilde{w}^+(\hbar, s; x, y; \lambda) = \frac{\tau(\hbar, s; x-\hbar[\lambda^{-1}], y)}{\tau(\hbar, s; x, y)}\,e^{\xi(x,\lambda)/\hbar}
= \sum_{n=0}^{\infty} w_n^+(\hbar, s; x, y)\,\lambda^{-n}\,e^{\xi(x,\lambda)/\hbar},
\]
\[
\tilde{w}^-(\hbar, s; x, y; \lambda) = \frac{\tau(\hbar, s+\hbar; x, y-\hbar[\lambda])}{\tau(\hbar, s; x, y)}\,e^{\xi(y,\lambda^{-1})/\hbar}
= \sum_{n=0}^{\infty} w_n^-(\hbar, s; x, y)\,\lambda^{n}\,e^{\xi(y,\lambda^{-1})/\hbar},
\]
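and define the operators $W^\pm(\hbar, s; x, y)$ by (2.4), with $w_n^\pm(\hbar, s; x, y)$ defined by the formulas above. We are going to show that $W^\pm$ satisfy the system of Eqs. (2.5). Then by Proposition 2.1, $W^\pm$ are the dressing operators of the Toda lattice hierarchy. This implies that $\tau(\hbar, s; x, y)$ is a tau function of the Toda lattice hierarchy. We define the functions $w^\pm$, $v^\pm$ by
\[
\begin{aligned}
w^+(\hbar, s; x, y; \lambda) &= \frac{\tilde{w}^+(\hbar, s; x, y; \lambda)}{w_0^-(\hbar, s; x, y)} = \frac{\tau(\hbar, s; x-\hbar[\lambda^{-1}], y)}{\tau(\hbar, s+\hbar; x, y)}\,e^{\xi(x,\lambda)/\hbar},\\
w^-(\hbar, s; x, y; \lambda) &= \frac{\tilde{w}^-(\hbar, s; x, y; \lambda)}{w_0^-(\hbar, s; x, y)} = \frac{\tau(\hbar, s+\hbar; x, y-\hbar[\lambda])}{\tau(\hbar, s+\hbar; x, y)}\,e^{\xi(y,\lambda^{-1})/\hbar},\\
v^+(\hbar, s; x, y; \lambda) &= \lambda^{-1}\,\frac{w^+(\hbar, s; x, y; \lambda)}{w^+(\hbar, s+\hbar; x, y; \lambda)} = \sum_{n=1}^{\infty} v_n^+(\hbar, s; x, y)\,\lambda^{-n},\\
v^-(\hbar, s; x, y; \lambda) &= \lambda\,\frac{\tilde{w}^-(\hbar, s; x, y; \lambda)}{\tilde{w}^-(\hbar, s-\hbar; x, y; \lambda)} = \sum_{n=1}^{\infty} v_n^-(\hbar, s; x, y)\,\lambda^{n},
\end{aligned} \tag{5.1}
\]
and the operators $X[\mu]$, $Y[\nu]$ by
\[
X[\mu] = 1 - G^+(\mu), \qquad Y[\nu] = 1 - G^-(\nu),
\]
where the operators $G^\pm$ are defined by (3.3). It is easy to verify that
\[
\hbar\sum_{n=1}^{\infty}\frac{\mu^{-n}}{n}\frac{\partial}{\partial x_n} = \sum_{n=1}^{\infty}\frac{X[\mu]^n}{n}, \qquad
\hbar\sum_{n=1}^{\infty}\frac{\nu^{n}}{n}\frac{\partial}{\partial y_n} = \sum_{n=1}^{\infty}\frac{Y[\nu]^n}{n}. \tag{5.2}
\]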
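The first relation in (5.2) is the logarithmic series $-\log(1-X) = \sum_n X^n/n$ applied to $X[\mu] = 1 - G^+(\mu)$. A minimal numerical sketch follows, under two assumptions that are not taken from this section: $G^+(\mu)$ acts as $\exp(-\hbar\sum_n \mu^{-n} n^{-1}\partial/\partial x_n)$ (the form used in Sec. 6.2), and only the single time $x_1$ is kept, so the test function $f(x_1) = e^{a x_1}$ is an eigenfunction of every operator involved. The function name and parameter values are hypothetical.

```python
import math


def check_log_series(a=0.7, hbar=0.1, mu=2.0, terms=200):
    """Compare both sides of (5.2) on f(x1) = exp(a * x1), with only x1 kept."""
    # G^+(mu) shifts x1 -> x1 - hbar/mu, so f is multiplied by g.
    g = math.exp(-a * hbar / mu)
    x_eig = 1.0 - g  # eigenvalue of X[mu] = 1 - G^+(mu) on f

    # Right-hand side of (5.2): sum_n X[mu]^n / n, truncated.
    rhs = sum(x_eig ** n / n for n in range(1, terms + 1))

    # Left-hand side of (5.2): hbar * mu^{-1} * d/dx1, acting on f as a scalar.
    lhs = hbar * a / mu
    return lhs, rhs
```

The series converges quickly here because the eigenvalue of $X[\mu]$ is small; the check says nothing about convergence on general functions, where (5.2) is understood formally.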
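Applying $X[\mu]$ to $w^\pm$ and $Y[\nu]$ to $\tilde{w}^\pm$, we obtain respectively the following results:

I.
\[
\begin{aligned}
X[\mu]w^+(\hbar, s; x, y; \lambda)
&= \frac{\tau(\hbar, s; x-\hbar[\lambda^{-1}], y)}{\tau(\hbar, s+\hbar; x, y)}\,e^{\xi(x,\lambda)/\hbar}
- \Bigl(1-\frac{\lambda}{\mu}\Bigr)\frac{\tau(\hbar, s; x-\hbar[\lambda^{-1}]-\hbar[\mu^{-1}], y)}{\tau(\hbar, s+\hbar; x-\hbar[\mu^{-1}], y)}\,e^{\xi(x,\lambda)/\hbar}\\
&= w^+(\hbar, s; x, y; \lambda)\Bigl(1 - \frac{\mu-\lambda}{\mu}\,\frac{\tau(\hbar, s+\hbar; x, y)\,\tau(\hbar, s; x-\hbar[\lambda^{-1}]-\hbar[\mu^{-1}], y)}{\tau(\hbar, s; x-\hbar[\lambda^{-1}], y)\,\tau(\hbar, s+\hbar; x-\hbar[\mu^{-1}], y)}\Bigr).
\end{aligned}
\]
By (A) of (4.5), this is equal to
\[
\begin{aligned}
&\frac{\lambda}{\mu}\,w^+(\hbar, s; x, y; \lambda)\,\frac{\tau(\hbar, s+\hbar; x-\hbar[\lambda^{-1}], y)\,\tau(\hbar, s; x-\hbar[\mu^{-1}], y)}{\tau(\hbar, s; x-\hbar[\lambda^{-1}], y)\,\tau(\hbar, s+\hbar; x-\hbar[\mu^{-1}], y)}\\
&\quad= \frac{\lambda}{\mu}\,w^+(\hbar, s; x, y; \lambda)\,\frac{w^+(\hbar, s+\hbar; x, y; \lambda)\,w^+(\hbar, s; x, y; \mu)}{w^+(\hbar, s+\hbar; x, y; \mu)\,w^+(\hbar, s; x, y; \lambda)}\\
&\quad= v^+(\hbar, s; x, y; \mu)\,\lambda\,w^+(\hbar, s+\hbar; x, y; \lambda).
\end{aligned}
\]

II.
\[
\begin{aligned}
X[\mu]w^-(\hbar, s; x, y; \lambda)
&= \frac{\tau(\hbar, s+\hbar; x, y-\hbar[\lambda])}{\tau(\hbar, s+\hbar; x, y)}\,e^{\xi(y,\lambda^{-1})/\hbar}
- \frac{\tau(\hbar, s+\hbar; x-\hbar[\mu^{-1}], y-\hbar[\lambda])}{\tau(\hbar, s+\hbar; x-\hbar[\mu^{-1}], y)}\,e^{\xi(y,\lambda^{-1})/\hbar}\\
&= w^-(\hbar, s; x, y; \lambda)\Bigl(1 - \frac{\tau(\hbar, s+\hbar; x, y)\,\tau(\hbar, s+\hbar; x-\hbar[\mu^{-1}], y-\hbar[\lambda])}{\tau(\hbar, s+\hbar; x, y-\hbar[\lambda])\,\tau(\hbar, s+\hbar; x-\hbar[\mu^{-1}], y)}\Bigr).
\end{aligned}
\]
By (C) of (4.5), this is equal to
\[
\begin{aligned}
&\frac{\lambda}{\mu}\,w^-(\hbar, s; x, y; \lambda)\,\frac{\tau(\hbar, s+2\hbar; x, y-\hbar[\lambda])\,\tau(\hbar, s; x-\hbar[\mu^{-1}], y)}{\tau(\hbar, s+\hbar; x, y-\hbar[\lambda])\,\tau(\hbar, s+\hbar; x-\hbar[\mu^{-1}], y)}\\
&\quad= \frac{\lambda}{\mu}\,w^-(\hbar, s; x, y; \lambda)\,\frac{w^-(\hbar, s+\hbar; x, y; \lambda)\,w^+(\hbar, s; x, y; \mu)}{w^-(\hbar, s; x, y; \lambda)\,w^+(\hbar, s+\hbar; x, y; \mu)}\\
&\quad= v^+(\hbar, s; x, y; \mu)\,\lambda\,w^-(\hbar, s+\hbar; x, y; \lambda).
\end{aligned}
\]

III.
\[
\begin{aligned}
Y[\nu]\tilde{w}^+(\hbar, s; x, y; \lambda)
&= \frac{\tau(\hbar, s; x-\hbar[\lambda^{-1}], y)}{\tau(\hbar, s; x, y)}\,e^{\xi(x,\lambda)/\hbar}
- \frac{\tau(\hbar, s; x-\hbar[\lambda^{-1}], y-\hbar[\nu])}{\tau(\hbar, s; x, y-\hbar[\nu])}\,e^{\xi(x,\lambda)/\hbar}\\
&= \tilde{w}^+(\hbar, s; x, y; \lambda)\Bigl(1 - \frac{\tau(\hbar, s; x, y)\,\tau(\hbar, s; x-\hbar[\lambda^{-1}], y-\hbar[\nu])}{\tau(\hbar, s; x-\hbar[\lambda^{-1}], y)\,\tau(\hbar, s; x, y-\hbar[\nu])}\Bigr)\\
&= \frac{\nu}{\lambda}\,\tilde{w}^+(\hbar, s; x, y; \lambda)\,\frac{\tau(\hbar, s-\hbar; x-\hbar[\lambda^{-1}], y)\,\tau(\hbar, s+\hbar; x, y-\hbar[\nu])}{\tau(\hbar, s; x-\hbar[\lambda^{-1}], y)\,\tau(\hbar, s; x, y-\hbar[\nu])}\\
&= \frac{\nu}{\lambda}\,\tilde{w}^+(\hbar, s; x, y; \lambda)\,\frac{\tilde{w}^+(\hbar, s-\hbar; x, y; \lambda)\,\tilde{w}^-(\hbar, s; x, y; \nu)}{\tilde{w}^+(\hbar, s; x, y; \lambda)\,\tilde{w}^-(\hbar, s-\hbar; x, y; \nu)}\\
&= v^-(\hbar, s; x, y; \nu)\,\lambda^{-1}\,\tilde{w}^+(\hbar, s-\hbar; x, y; \lambda).
\end{aligned}
\]

IV.
\[
\begin{aligned}
Y[\nu]\tilde{w}^-(\hbar, s; x, y; \lambda)
&= \frac{\tau(\hbar, s+\hbar; x, y-\hbar[\lambda])}{\tau(\hbar, s; x, y)}\,e^{\xi(y,\lambda^{-1})/\hbar}
- \Bigl(1-\frac{\nu}{\lambda}\Bigr)\frac{\tau(\hbar, s+\hbar; x, y-\hbar[\lambda]-\hbar[\nu])}{\tau(\hbar, s; x, y-\hbar[\nu])}\,e^{\xi(y,\lambda^{-1})/\hbar}\\
&= \tilde{w}^-(\hbar, s; x, y; \lambda)\Bigl(1 - \frac{\lambda-\nu}{\lambda}\,\frac{\tau(\hbar, s; x, y)\,\tau(\hbar, s+\hbar; x, y-\hbar[\lambda]-\hbar[\nu])}{\tau(\hbar, s+\hbar; x, y-\hbar[\lambda])\,\tau(\hbar, s; x, y-\hbar[\nu])}\Bigr).
\end{aligned}
\]
By (B) of (4.5), this is equal to
\[
\begin{aligned}
&\frac{\nu}{\lambda}\,\tilde{w}^-(\hbar, s; x, y; \lambda)\,\frac{\tau(\hbar, s; x, y-\hbar[\lambda])\,\tau(\hbar, s+\hbar; x, y-\hbar[\nu])}{\tau(\hbar, s+\hbar; x, y-\hbar[\lambda])\,\tau(\hbar, s; x, y-\hbar[\nu])}\\
&\quad= \frac{\nu}{\lambda}\,\tilde{w}^-(\hbar, s; x, y; \lambda)\,\frac{\tilde{w}^-(\hbar, s-\hbar; x, y; \lambda)\,\tilde{w}^-(\hbar, s; x, y; \nu)}{\tilde{w}^-(\hbar, s; x, y; \lambda)\,\tilde{w}^-(\hbar, s-\hbar; x, y; \nu)}\\
&\quad= v^-(\hbar, s; x, y; \nu)\,\lambda^{-1}\,\tilde{w}^-(\hbar, s-\hbar; x, y; \lambda).
\end{aligned}
\]

Applying $X[\mu]$ again to I and II, we have
\[
\begin{aligned}
X[\mu]^2 w^\pm(\hbar, s; x, y; \lambda)
&= (X[\mu]v^+(\hbar, s; x, y; \mu))(\lambda e^{\hbar\partial_s})\,w^\pm(\hbar, s; x, y; \lambda)\\
&\quad+ (G^+(\mu)v^+(\hbar, s; x, y; \mu))(\lambda e^{\hbar\partial_s})(X[\mu]w^\pm(\hbar, s; x, y; \lambda))\\
&= \bigl((X[\mu]v^+(\hbar, s; x, y; \mu))(\lambda e^{\hbar\partial_s}) + (G^+(\mu)v^+(\hbar, s; x, y; \mu))\,v^+(\hbar, s+\hbar; x, y; \mu)(\lambda e^{\hbar\partial_s})^2\bigr)\,w^\pm(\hbar, s; x, y; \lambda)\\
&= \bigl(q_{2,1}(\hbar, s; x, y; \mu)(\lambda e^{\hbar\partial_s}) + q_{2,2}(\hbar, s; x, y; \mu)(\lambda e^{\hbar\partial_s})^2\bigr)\,w^\pm(\hbar, s; x, y; \lambda).
\end{aligned}
\]
Since $v^+(\hbar, s; x, y; \mu) = O(\mu^{-1})$ as $\mu \to \infty$, we have
\[
q_{2,1}(\hbar, s; x, y; \mu) = X[\mu]v^+(\hbar, s; x, y; \mu) = O(\mu^{-1}), \qquad
q_{2,2}(\hbar, s; x, y; \mu) = (G^+(\mu)v^+(\hbar, s; x, y; \mu))\,v^+(\hbar, s+\hbar; x, y; \mu) = O(\mu^{-2}).
\]
By induction, we find that
\[
X[\mu]^n w^\pm(\hbar, s; x, y; \lambda) = \sum_{k=1}^{n} q_{n,k}(\hbar, s; x, y; \mu)(\lambda e^{\hbar\partial_s})^k\,w^\pm(\hbar, s; x, y; \lambda),
\]
where $q_{n,k}(\hbar, s; x, y; \mu)$ is a power series in $\mu^{-1}$, containing $\mu^j$ with $j \le -k$. Similarly, from III and IV, we have
\[
Y[\nu]^n \tilde{w}^\pm(\hbar, s; x, y; \lambda) = \sum_{k=1}^{n} r_{n,k}(\hbar, s; x, y; \nu)(\lambda^{-1} e^{-\hbar\partial_s})^k\,\tilde{w}^\pm(\hbar, s; x, y; \lambda),
\]
where $r_{n,k}(\hbar, s; x, y; \nu)$ is a power series in $\nu$, containing $\nu^j$ with $j \ge k$. Consequently, we obtain from (5.2) that
\[
\begin{aligned}
\sum_{n=1}^{\infty}\frac{\mu^{-n}}{n}\,\hbar\frac{\partial w^\pm(\hbar, s; x, y; \lambda)}{\partial x_n}
&= \sum_{n=1}^{\infty}\frac{\mu^{-n}}{n}\,Q_n(\hbar, s; x, y; \lambda e^{\hbar\partial_s})\,w^\pm(\hbar, s; x, y; \lambda),\\
\sum_{n=1}^{\infty}\frac{\nu^{n}}{n}\,\hbar\frac{\partial \tilde{w}^\pm(\hbar, s; x, y; \lambda)}{\partial y_n}
&= \sum_{n=1}^{\infty}\frac{\nu^{n}}{n}\,R_n(\hbar, s; x, y; (\lambda e^{\hbar\partial_s})^{-1})\,\tilde{w}^\pm(\hbar, s; x, y; \lambda),
\end{aligned} \tag{5.3}
\]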
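where $Q_n(\hbar, s; x, y; \Lambda)$ and $R_n(\hbar, s; x, y; \Lambda)$ are polynomials of order $n$ in $\Lambda$ without a constant term. Define
\[
\tilde{W}^\pm(\hbar, s; x, y) = w_0^-(\hbar, s; x, y)^{-1}\,W^\pm(\hbar, s; x, y).
\]
Using the definition (2.4) of $W^\pm(\hbar, s; x, y)$, and the fact that for any function $A(\hbar, s; x, y)$
\[
\bigl(e^{\hbar\partial_s}(A(\hbar, s; x, y)e^{n\hbar\partial_s})\bigr)\big|_{e^{\hbar\partial_s}=\lambda}
= \bigl(A(\hbar, s+\hbar; x, y)e^{(n+1)\hbar\partial_s}\bigr)\big|_{e^{\hbar\partial_s}=\lambda}
= (\lambda e^{\hbar\partial_s})(A(\hbar, s; x, y)\lambda^n),
\]
we obtain by comparing coefficients in (5.3) that
\[
\hbar\frac{\partial \tilde{W}^\pm(\hbar, s; x, y)}{\partial x_n} = Q_n(\hbar, s; x, y; e^{\hbar\partial_s})\,\tilde{W}^\pm(\hbar, s; x, y), \qquad
\hbar\frac{\partial W^\pm(\hbar, s; x, y)}{\partial y_n} = R_n(\hbar, s; x, y; e^{-\hbar\partial_s})\,W^\pm(\hbar, s; x, y).
\]
These give
\[
\frac{\partial \tilde{W}^+}{\partial x_n}\cdot(\tilde{W}^+)^{-1} = \frac{\partial \tilde{W}^-}{\partial x_n}\cdot(\tilde{W}^-)^{-1}, \qquad
\frac{\partial W^+}{\partial y_n}\cdot(W^+)^{-1} = \frac{\partial W^-}{\partial y_n}\cdot(W^-)^{-1}. \tag{5.4}
\]
Finally, we have from the first equation,
\[
\begin{aligned}
\frac{\partial W^+}{\partial x_n}\cdot(W^+)^{-1}
&= \frac{\partial w_0^-}{\partial x_n}(w_0^-)^{-1} + w_0^-\,\frac{\partial \tilde{W}^+}{\partial x_n}\cdot(\tilde{W}^+)^{-1}(w_0^-)^{-1}\\
&= \frac{\partial w_0^-}{\partial x_n}(w_0^-)^{-1} + w_0^-\,\frac{\partial \tilde{W}^-}{\partial x_n}\cdot(\tilde{W}^-)^{-1}(w_0^-)^{-1}\\
&= \frac{\partial W^-}{\partial x_n}\cdot(W^-)^{-1}.
\end{aligned}
\]
Together with the second equation of (5.4), this gives (2.5). This concludes the proof.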
6. Dispersionless Limit of Fay-Like Identities

In this section, we show that the dispersionless limits ($\hbar \to 0$) of the Fay-like identities (4.5) are the dispersionless Hirota equations of the dToda hierarchy. First we review some facts about the dToda hierarchy.
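6.1. Dispersionless Toda hierarchy

By taking the dispersionless limit ($\hbar \to 0$) of the Toda lattice hierarchy (2.1), we obtain the Lax representation of the dToda hierarchy [9, 10]:
\[
\frac{\partial L}{\partial x_n} = \{B_n, L\}, \qquad
\frac{\partial L}{\partial y_n} = \{C_n, L\}, \qquad
\frac{\partial K}{\partial x_n} = \{B_n, K\}, \qquad
\frac{\partial K}{\partial y_n} = \{C_n, K\}. \tag{6.1}
\]
Here $L = L(p; s; x, y)$ and $K = K(p; s; x, y)$ are formal power series in the variable $p$, such that $L$ and $K^{-1}$ have the form
\[
L = L(p; s; x, y) = p + \sum_{n=0}^{\infty} u^+_{n+1,0}(s; x, y)\,p^{-n},
\]
\[
K^{-1} = K(p; s; x, y)^{-1} = u^-_{0,0}(s; x, y)\,p^{-1} + \sum_{n=0}^{\infty} u^-_{n+1,0}(s; x, y)\,p^{n},
\]
obtained by replacing the operator $e^{\hbar\partial_s}$ by $p$ and taking the limit $\hbar \to 0$ in the definition of $L$ and $K$ in (2.2). $B_n$ and $C_n$ are defined by
\[
B_n = (L^n)_{\ge 0}, \qquad C_n = (K^{-n})_{<0},
\]
where now for a power series $A = \sum_{n\in\mathbb{Z}} A_n p^n$ and a subset $S$ of $\mathbb{Z}$, we define $A_S = \sum_{n\in S} A_n p^n$. $\{\cdot,\cdot\}$ is the Poisson bracket
\[
\{f, g\} = p\,\frac{\partial f}{\partial p}\frac{\partial g}{\partial s} - p\,\frac{\partial f}{\partial s}\frac{\partial g}{\partial p}
\]
of the dToda hierarchy. The function $F(s; x, y)$, defined by (3.2) as the leading term in the expansion of $\log\tau(\hbar, s; x, y)$ with respect to $\hbar$, is called the free energy of the dToda hierarchy. The tau function of the dToda hierarchy $\tau_{\mathrm{dToda}}$ is then defined as $\tau_{\mathrm{dToda}} = \exp F$. It satisfies the dispersionless Hirota equation of the dToda hierarchy, which is the following set of equations [7, 15, 6, 16, 1, 11]:
\[
\begin{aligned}
&z_1\exp\Bigl(-\sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2\log\tau_{\mathrm{dToda}}}{\partial s\,\partial x_n}z_1^{-n}\Bigr)
- z_2\exp\Bigl(-\sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2\log\tau_{\mathrm{dToda}}}{\partial s\,\partial x_n}z_2^{-n}\Bigr)\\
&\quad= (z_1-z_2)\exp\Bigl(\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{1}{mn}\frac{\partial^2\log\tau_{\mathrm{dToda}}}{\partial x_m\,\partial x_n}z_1^{-m}z_2^{-n}\Bigr),
\end{aligned} \tag{6.2}
\]
\[
\begin{aligned}
&z_1^{-1}\exp\Bigl(\sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2\log\tau_{\mathrm{dToda}}}{\partial s\,\partial y_n}z_1^{n}\Bigr)
- z_2^{-1}\exp\Bigl(\sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2\log\tau_{\mathrm{dToda}}}{\partial s\,\partial y_n}z_2^{n}\Bigr)\\
&\quad= (z_1^{-1}-z_2^{-1})\exp\Bigl(\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{1}{mn}\frac{\partial^2\log\tau_{\mathrm{dToda}}}{\partial y_m\,\partial y_n}z_1^{m}z_2^{n}\Bigr),
\end{aligned} \tag{6.3}
\]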
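\[
\begin{aligned}
&1 - z_1 z_2^{-1}\exp\Bigl(\frac{\partial^2 F}{\partial s^2} - \sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2\log\tau_{\mathrm{dToda}}}{\partial s\,\partial y_n}z_1^{n} + \sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2\log\tau_{\mathrm{dToda}}}{\partial s\,\partial x_n}z_2^{-n}\Bigr)\\
&\quad= \exp\Bigl(\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{1}{mn}\frac{\partial^2\log\tau_{\mathrm{dToda}}}{\partial y_m\,\partial x_n}z_1^{m}z_2^{-n}\Bigr).
\end{aligned} \tag{6.4}
\]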
It was shown in [1, 4, 2, 11] that this set of equations is equivalent to the dToda hierarchy.

6.2. Dispersionless limit of Fay-like identities

Making use of the behavior of $\tau(\hbar, s; x, y)$ as $\hbar \to 0$ given by (3.2), it is easy to show that

Proposition 6.1. The dispersionless limits of the Fay-like identities (A)–(C) in (4.5) are Eqs. (6.2)–(6.4), respectively.

Proof. Using the operators $G^\pm$ defined by (3.3), and by a suitable renaming of the variables, we rewrite (A)–(C) in (4.5) as

(A′)
\[
z_1 - z_2\exp\bigl((e^{\hbar\partial_s}-1)(G^+(z_2)-G^+(z_1))\log\tau(\hbar, s; x, y)\bigr)
= (z_1-z_2)\exp\bigl((e^{\hbar\partial_s}-G^+(z_2))(1-G^+(z_1))\log\tau(\hbar, s; x, y)\bigr),
\]
(B′)
\[
z_2^{-1} - z_1^{-1}\exp\bigl((e^{\hbar\partial_s}-1)(G^-(z_2)-G^-(z_1))\log\tau(\hbar, s; x, y)\bigr)
= (z_2^{-1}-z_1^{-1})\exp\bigl((1-e^{\hbar\partial_s}G^-(z_1))(1-G^-(z_2))\log\tau(\hbar, s; x, y)\bigr),
\]
(C′)
\[
\exp\bigl((1-G^+(z_2))(1-G^-(z_1))\log\tau(\hbar, s; x, y)\bigr)
= 1 - z_1 z_2^{-1}\exp\bigl((e^{\hbar\partial_s}-1)(G^-(z_1)-e^{-\hbar\partial_s}G^+(z_2))\log\tau(\hbar, s; x, y)\bigr).
\]
Since as $\hbar \to 0$,
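\[
G^+(\mu) = \exp\Bigl(-\hbar\sum_{n=1}^{\infty}\frac{\mu^{-n}}{n}\frac{\partial}{\partial x_n}\Bigr) = 1 - \hbar\sum_{n=1}^{\infty}\frac{\mu^{-n}}{n}\frac{\partial}{\partial x_n} + O(\hbar^2),
\]
\[
G^-(\nu) = \exp\Bigl(-\hbar\sum_{n=1}^{\infty}\frac{\nu^{n}}{n}\frac{\partial}{\partial y_n}\Bigr) = 1 - \hbar\sum_{n=1}^{\infty}\frac{\nu^{n}}{n}\frac{\partial}{\partial y_n} + O(\hbar^2),
\]
we have
\[
(e^{\hbar\partial_s}-1)(G^+(z_2)-G^+(z_1)) = \hbar^2\frac{\partial}{\partial s}\Bigl(\sum_{n=1}^{\infty}\frac{z_1^{-n}}{n}\frac{\partial}{\partial x_n} - \sum_{n=1}^{\infty}\frac{z_2^{-n}}{n}\frac{\partial}{\partial x_n}\Bigr) + O(\hbar^3),
\]
\[
(e^{\hbar\partial_s}-G^+(z_2))(1-G^+(z_1)) = \hbar^2\Bigl(\frac{\partial}{\partial s} + \sum_{n=1}^{\infty}\frac{z_2^{-n}}{n}\frac{\partial}{\partial x_n}\Bigr)\Bigl(\sum_{n=1}^{\infty}\frac{z_1^{-n}}{n}\frac{\partial}{\partial x_n}\Bigr) + O(\hbar^3),
\]
\[
(e^{\hbar\partial_s}-1)(G^-(z_2)-G^-(z_1)) = \hbar^2\frac{\partial}{\partial s}\Bigl(\sum_{n=1}^{\infty}\frac{z_1^{n}}{n}\frac{\partial}{\partial y_n} - \sum_{n=1}^{\infty}\frac{z_2^{n}}{n}\frac{\partial}{\partial y_n}\Bigr) + O(\hbar^3),
\]
\[
(1-e^{\hbar\partial_s}G^-(z_1))(1-G^-(z_2)) = \hbar^2\Bigl(\sum_{n=1}^{\infty}\frac{z_1^{n}}{n}\frac{\partial}{\partial y_n} - \frac{\partial}{\partial s}\Bigr)\Bigl(\sum_{n=1}^{\infty}\frac{z_2^{n}}{n}\frac{\partial}{\partial y_n}\Bigr) + O(\hbar^3),
\]
\[
(1-G^+(z_2))(1-G^-(z_1)) = \hbar^2\Bigl(\sum_{n=1}^{\infty}\frac{z_2^{-n}}{n}\frac{\partial}{\partial x_n}\Bigr)\Bigl(\sum_{n=1}^{\infty}\frac{z_1^{n}}{n}\frac{\partial}{\partial y_n}\Bigr) + O(\hbar^3),
\]
\[
(e^{\hbar\partial_s}-1)(G^-(z_1)-e^{-\hbar\partial_s}G^+(z_2)) = \hbar^2\frac{\partial}{\partial s}\Bigl(\frac{\partial}{\partial s} - \sum_{n=1}^{\infty}\frac{z_1^{n}}{n}\frac{\partial}{\partial y_n} + \sum_{n=1}^{\infty}\frac{z_2^{-n}}{n}\frac{\partial}{\partial x_n}\Bigr) + O(\hbar^3).
\]
Therefore, using the behavior of $\tau(\hbar, s; x, y)$ given by (3.2), we find that as $\hbar \to 0$, (A′)–(C′) give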
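(A″)
\[
z_1 - z_2\exp\Bigl(\sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2 F}{\partial s\,\partial x_n}z_1^{-n} - \sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2 F}{\partial s\,\partial x_n}z_2^{-n}\Bigr)
= (z_1-z_2)\exp\Bigl(\sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2 F}{\partial s\,\partial x_n}z_1^{-n} + \sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{1}{mn}\frac{\partial^2 F}{\partial x_m\,\partial x_n}z_1^{-m}z_2^{-n}\Bigr),
\]
(B″)
\[
z_2^{-1} - z_1^{-1}\exp\Bigl(\sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2 F}{\partial s\,\partial y_n}z_1^{n} - \sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2 F}{\partial s\,\partial y_n}z_2^{n}\Bigr)
= (z_2^{-1}-z_1^{-1})\exp\Bigl(\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{1}{mn}\frac{\partial^2 F}{\partial y_m\,\partial y_n}z_1^{m}z_2^{n} - \sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2 F}{\partial s\,\partial y_n}z_2^{n}\Bigr),
\]
(C″)
\[
\exp\Bigl(\sum_{m=1}^{\infty}\sum_{n=1}^{\infty}\frac{1}{mn}\frac{\partial^2 F}{\partial y_m\,\partial x_n}z_1^{m}z_2^{-n}\Bigr)
= 1 - z_1 z_2^{-1}\exp\Bigl(\frac{\partial^2 F}{\partial s^2} - \sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2 F}{\partial s\,\partial y_n}z_1^{n} + \sum_{n=1}^{\infty}\frac{1}{n}\frac{\partial^2 F}{\partial s\,\partial x_n}z_2^{-n}\Bigr).
\]
After simple manipulations, it is easy to see that this set of equations gives Eqs. (6.2)–(6.4), respectively.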
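The leading-order operator expansions used in the proof can be checked numerically on a single exponential. The sketch below is an illustration under stated assumptions: only the time $x_1$ is kept, $G^+(z)$ is taken to be $\exp(-\hbar z^{-1}\partial/\partial x_1)$ in that truncation, and the test function $e^{as+bx_1}$ turns every operator into a scalar. The function name and parameter values are hypothetical.

```python
import math


def compare(hbar, a=0.3, b=0.5, z1=2.0, z2=3.0):
    """Exact versus leading-order value of (e^{hbar d/ds} - 1)(G^+(z2) - G^+(z1)).

    On f = exp(a*s + b*x1) the operators act as scalars:
    d/ds -> a, d/dx1 -> b, e^{hbar d/ds} -> exp(hbar*a),
    and, with only x1 kept, G^+(z) -> exp(-hbar*b/z).
    """
    def g_plus(z):
        return math.exp(-hbar * b / z)

    exact = (math.exp(hbar * a) - 1.0) * (g_plus(z2) - g_plus(z1))
    # Leading term: hbar^2 * d/ds * (z1^{-1} d/dx1 - z2^{-1} d/dx1).
    leading = hbar ** 2 * a * (b / z1 - b / z2)
    return exact, leading
```

Evaluating `compare` at decreasing values of `hbar` shows the ratio of the two numbers approaching 1, which is the content of the $O(\hbar^3)$ remainder; the other five expansions can be tested the same way.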
Acknowledgments

I am grateful for the hospitality of Academia Sinica of Taiwan during my visit, when part of this work was done. Special thanks go to Derchyi Wu, who arranged the visit. I would also like to thank J. S. Chang, Y. T. Chen, N. C. Lee, J. C. Shaw, M. H. Tu and D. C. Wu for the discussions we had during my visit. This work is partially supported by MMU Internal Funding PR/2006/0590.

References

[1] A. Boyarsky, A. Marshakov, O. Ruchayskiy, P. Wiegmann and A. Zabrodin, Associativity equations in dispersionless integrable hierarchies, Phys. Lett. B 515(3–4) (2001) 483–492.
[2] R. Carroll and Y. Kodama, Solution of the dispersionless Hirota equations, J. Phys. A 28(22) (1995) 6373–6387.
[3] E. Date, M. Kashiwara, M. Jimbo and T. Miwa, Transformation groups for soliton equations, in Proc. RIMS Symp. Nonlinear Integrable Systems, Classical Theory and Quantum Theory (Kyoto, 1981), eds. M. Jimbo and T. Miwa (World Scientific Publishing, Singapore, 1983), pp. 39–119.
[4] J. Gibbons and Y. Kodama, Solving dispersionless Lax equations, in Singular Limits of Dispersive Waves (Lyon, 1991), NATO Adv. Sci. Inst. Ser. B Phys., Vol. 320 (Plenum, New York, 1994), pp. 61–66.
[5] I. K. Kostov, I. M. Krichever, M. Mineev-Weinstein, P. B. Wiegmann and A. Zabrodin, The τ-function for analytic curves, in Random Matrix Models and Their Applications, Math. Sci. Res. Inst. Publ., Vol. 40 (Cambridge Univ. Press, Cambridge, 2001), pp. 285–299.
[6] A. Marshakov, P. Wiegmann and A. Zabrodin, Integrable structure of the Dirichlet boundary problem in two dimensions, Comm. Math. Phys. 227(1) (2002) 131–153.
[7] M. Mineev-Weinstein, P. B. Wiegmann and A. Zabrodin, Integrable structure of interface dynamics, Phys. Rev. Lett. 84 (2000) 5106–5109.
[8] K. Takasaki, Dispersionless Toda hierarchy and two-dimensional string theory, Comm. Math. Phys. 170(1) (1995) 101–116.
[9] K. Takasaki and T. Takebe, Quasi-classical limit of Toda hierarchy and W-infinity symmetries, Lett. Math. Phys. 28(3) (1993) 165–176.
[10] K. Takasaki and T. Takebe, Integrable hierarchies and dispersionless limit, Rev. Math. Phys. 7(5) (1995) 743–808.
[11] L.-P. Teo, Analytic functions and integrable hierarchies: Characterization of tau functions, Lett. Math. Phys. 64(1) (2003) 75–92.
[12] M. Toda, Studies of a non-linear lattice, Phys. Rep. 18C(1) (1975) 1–123.
[13] K. Ueno and K. Takasaki, Toda lattice hierarchy, in Group Representations and Systems of Differential Equations (Tokyo, 1982), Adv. Stud. Pure Math., Vol. 4 (North-Holland, Amsterdam, 1984), pp. 1–95.
[14] P. Wiegmann and A. Zabrodin, Large scale correlations in normal non-Hermitian matrix ensembles, J. Phys. A 36(12) (2003) 3411–3424.
[15] P. B. Wiegmann and A. Zabrodin, Conformal maps and integrable hierarchies, Comm. Math. Phys. 213(3) (2000) 523–538.
[16] A. V. Zabrodin, The dispersionless limit of the Hirota equations in some problems of complex analysis, Teoret. Mat. Fiz. 129(2) (2001) 239–257.
December 15, 2006 16:52 WSPC/148-RMP
J070-00286
Reviews in Mathematical Physics Vol. 18, No. 10 (2006) 1075–1102 © World Scientific Publishing Company
ON UNCERTAINTY, BRAIDING AND ENTANGLEMENT IN GEOMETRIC QUANTUM MECHANICS
ALBERTO BENVEGNÙ
Dipartimento di Matematica, Università di Ferrara, Via Machiavelli 35, 44100 Ferrara, Italy
[email protected]

MAURO SPERA
Dipartimento di Metodi e Modelli Matematici per le Scienze Applicate, Università di Padova, Torre Archimede, Via Trieste 63, 35121 Padova, Italy
[email protected]

Received 22 February 2006
Revised 17 September 2006

Acting within the framework of geometric quantum mechanics, an interpretation of quantum uncertainty is discussed in terms of Jacobi fields, and a connection with the theory of elliptic curves is outlined, via classical integrability of Schrödinger’s dynamics and the cross-ratio interpretation of quantum transition probabilities. Furthermore, a thoroughly geometrical construction of all special unitary representations of the 3-strand braid group on the quantum 1-qubit space is given, and the connection of one of them with elliptic curves admitting complex multiplication automorphisms (the physically relevant one corresponding to the anharmonic ratio) is shown. Also, contact is made with the Temperley–Lieb algebra theoretic constructions of Kauffman and Lomonaco, and it is shown that the standard trace relative to one of the above representations computes the Jones polynomial for particular values of the parameter, for knots arising as closures of 3-strand braids. Subsequently, a geometric entanglement criterion (in terms of Segre embeddings) is discussed, together with a projective geometrical portrait for quantum 2-gates. Finally, Aravind’s idea of describing quantum states via knot theory is critically analyzed, and a geometrical picture (involving a blend of SU(2)-representation theory, classical projective geometry, binary trees and Brunnian and Hopf links) is set up in order to describe successive measurements made upon generalized GHZ states, close in spirit to the quantum knot picture again devised by Kauffman and Lomonaco.
Keywords: Geometric quantum mechanics; Jacobi fields; elliptic curves; Artin’s braid group; Segre and Veronese maps; quantum entanglement; links. Mathematics Subject Classification 2000: 81Q70, 53B21, 14N05, 33E05, 14H52, 81P15, 81P68, 57M25, 05C05
1. Introduction

In the present note, which may be viewed as an ideal sequel to [8], we discuss three relevant issues in quantum mechanics, namely uncertainty, braiding and entanglement, from a geometric point of view (actually, different fields of geometry will be involved). As in the latter paper, we do not pursue any foundational aim (see, e.g., [16, 17] for differential geometric non linear extensions of quantum mechanics) but just act within the standard framework. Also, we confine ourselves to a finite dimensional environment, this however being physically interesting in view, for instance, of applications to quantum computing and quantum chemistry. Briefly, we address quantum entanglement and/vs topological entanglement (cf. [30, 31, 33, 34]), a topic of current interest and still mysterious in many aspects. The latter, in contrast to the former, already crops up at the 1-qubit space level, through the existence of many 3-strand braid group representations. Our geometric approach unveils a possibly fruitful relationship among quantum uncertainty, elliptic function theory and the 3-strand braid group, based on the classical integrability of Schrödinger’s dynamics pointed out in [8]. As for quantum entanglement, we elaborate on its formulation in terms of classical algebraic geometry, via (generalized) Segre embeddings. Also, connections emerge between measurements taken on generalized GHZ entangled states, projective geometry and knot theory, pursuing the pioneering work of Aravind [2].

Here is a more detailed outline of the contents of the paper. In Sec. 2, we recall the basic framework of geometric quantum mechanics, including the cross-ratio interpretation of transition probabilities [27, 11, 12], which is going to be crucial for all further developments. The exposition of new results begins with Sec. 3. Recall that, according to Aharonov and Anandan [1], the dispersion of the Hamiltonian in a (pure) state is given by the length (with respect to the Fubini–Study metric) of the Hamiltonian vector field induced on projective Hilbert space. In Theorem 3.1, we show that, taking a geodesic joining two orthogonal eigenstates of the Hamiltonian (so we actually work on a (complex) projective line, which is topologically a 2-sphere), the Hamiltonian vector field is indeed a Jacobi field along the geodesic (a meridian circle). This turns out to be extremely straightforward, but it does not appear to have been previously noticed explicitly. Also, the given geometrical setting naturally suggests a connection with elliptic integrals and curves, elaborated on in Sec. 4. This part is rather speculative, but it might lead to unexpected developments. Briefly, this goes as follows (see Theorem 4.1): by virtue of integrability of Schrödinger’s dynamics (see, e.g., [8] and references therein, or Secs. 2 and 4), the sphere is foliated into 1-dimensional Lagrangian (or Liouville) tori (circles, actually parallels, with the poles being given by the orthogonal eigenstates in question) degenerating into points at the eigenstates themselves. Any parallel is labeled by the value of the action (one of the transition probabilities, i.e. a suitable cross-ratio) of any state thereon; its radius is proportional to the corresponding dispersion of the
Hamiltonian. These 1-dimensional circles (dynamical cycles) may be looked upon as 1-cycles on the natural (two-dimensional) tori canonically associated to the relevant cross-ratio (Jacobi modulus (squared)). Natural complementary cycles, provided by the meridians passing through the poles (eigenstates) are associated to the former ones, and may be termed collapse cycles, since the measurement of the Hamiltonian forces collapse onto an eigenstate, with the appropriate probability. This provides a simple instance of the procedure of embedding Liouville tori in abelian varieties arising in the theory of algebraic integrable systems (see, e.g., [4]). The following situation also emerges: we have a family of tori possessing the same symplectic structure, but with varying complex structure, the variation being governed by uncertainty. This is formally the same portrait (but with a completely different physical meaning) arising in abelian Chern–Simons theory [54, 3, 5]. Also, in a purely formal way, a classical pendulum may be associated to the whole arrangement, and Berry phase phenomena (via its Hannay-type interpretation set forth in [8]) yield a Foucault-type rotation of the plane thereof. In Sec. 5, we study special unitary representations of the 3-strand braid group B3 from a geometrical viewpoint (focussing on the relationship between rotation axes and rotation angles of the possible representations of the two braid generators, Theorem 5.1), making contact with the purely algebraic approach of Kauffman and Kauffman–Lomonaco (Proposition 5.2, [28–31]) via Temperley–Lieb algebras. We notice that a similar independent “geometric” study has been quite recently undertaken in [34]. However, we do not discuss the important issue of “density” (see [34] for full details). 
We further show that the ordinary trace evaluated on a particular representation of $B_3$ yields the Jones polynomial for particular values of its parameter (Theorem 5.3), for knots obtained as closures of 3-strand braids. We also discuss another special representation of $B_3$, coming from braiding of the three real roots of the Weierstraß elliptic cubic associated to the Jacobi modulus $k^2 = \frac{1}{2}$ pertaining to the anharmonic ratio (lemniscatic, or square lattice case), Theorem 5.4. More precisely, the above roots correspond to points of a unit sphere forming an equilateral triangle inscribed in a great circle; the various positions of the plane thereby determined with respect to a stereographic projection plane give rise to a family of solutions (giving, however, the same representation class). The above mentioned representation corresponds to perpendicular planes, whereas their coincidence yields the so-called triangular lattice case.

Subsequently, starting from Sec. 6, we address quantum entanglement. First, we establish, in a geometric fashion via Segre maps, an entanglement criterion (Theorem 6.1; however, a Segre-type approach similar to ours already appears, in a more condensed form, in [14]). The variety $X$ of disentangled states emerges as an intersection of quadrics. A recursive “localization” procedure is devised which produces a (minimal) set of equations locally cutting out $X$. This construction is then extended to encompass “partial” entanglement. In particular we show that, up to order, the possible Segre embeddings pertaining to partial entanglements are given by Euler’s partitio numerorum (Theorem 6.3).
Then, in Sec. 7, as an application of the previous results and in the spirit of [12], we build up (see Theorem 7.1) a geometrical portrait of the 2-qubit space and its unitary operators (quantum gates) which could turn out to be useful in discussing quantum teleportation, see, e.g., [20] and references therein, [31, 35, 55]: intersecting the (Segre) quadric $Q$ parametrizing disentangled states in $P^3$ with a suitable “plane at infinity” yields an “absolute” conic (classical terminology) which is the image of the 1-qubit space under the Veronese map. Any diagonalization of the quadric $Q$ is obtained via a universal 2-quantum gate (by the Brylinskis’ theorem [15]). A natural choice is provided by the unitary R-matrix of, e.g., [31], see also below.

Finally, in Sec. 8, we discuss Aravind’s intriguing idea of connecting quantum states with knots and links [2], further pursued, e.g., in [31]. We discuss successive measurements of particular entangled states made up of identical particles (specifically photons, but also Fermions can be accommodated, after taking the Pauli Exclusion principle into due account), generalizing the so-called GHZ states [24], by resorting to standard $SU(2)$- (actually $U(2)$-) representation theory (Clebsch–Gordan decomposition). We obtain a clear-cut, systematic description of these successive measurements via suitable trees, determined by simplices and subsimplices in certain complex vector spaces (in passing, the whole set-up illustrates finite $\mathbb{F}_2$ projective geometries), whose leaves can be associated to Brunnian or Hopf links (Proposition 8.1). The upshot is that, in agreement with Aravind and the “quantum knot” picture of Kauffman–Lomonaco [33], links are related to measurements rather than to states alone. The final section is devoted to concluding observations and outlook.

2. Geometric Quantum Mechanics

The present treatment partially overlaps with the one given in [8] (see also the references given therein).
Throughout the paper we assume $\hbar = 1$. Let $V$ be a complex Hilbert space of finite dimension $n+1$, for simplicity, with scalar product $\langle\cdot|\cdot\rangle$, linear in the second variable. Let $P(V)$ denote its associated projective space, of complex dimension $n$. This is the space of (pure) states in quantum mechanics. Making use of Dirac’s bra-ket notation, we can identify a point in $P(V)$, which is, by definition, the ray $[v]$ (i.e. the one-dimensional vector space $\langle v\rangle$) pertaining to (resp. generated by) a vector $v \equiv |v\rangle$ (of norm one, for simplicity), with the projection operator onto that line, namely
\[
[v] = |v\rangle\langle v|. \tag{2.1}
\]
If $U(V)$ denotes the unitary group associated to $(V, \langle\cdot|\cdot\rangle)$, with Lie algebra $\mathfrak{u}(V)$, consisting of all skew-hermitian endomorphisms of $V$ (the quantum observables, with a slight abuse of language), then the projective space $P(V)$ is a $U(V)$-homogeneous Kähler manifold. The isotropy group (stabilizer) of a point $[v] \in P(V)$ is isomorphic to $U(V') \times U(1)$, with $V'$ the orthogonal complement to $\langle v\rangle$ in $V$, the $U(1)$ part coming from phase invariance: $[e^{i\alpha}v] = [v]$. Hence
\[
P(V) \cong U(V)/(U(V') \times U(1)) \cong U(n+1)/(U(n) \times U(1)) \cong P^n. \tag{2.2}
\]
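The fundamental vector field $A^\sharp$ associated to $A \in \mathfrak{u}(V)$ reads (evaluated at $[v] \in P(V)$, $\|v\| = 1$)
\[
A^\sharp|_{[v]} = |v\rangle\langle Av| + |Av\rangle\langle v|. \tag{2.3}
\]
In view of homogeneity, these vectors span the tangent space of $P(V)$ at each point. The (action of the) complex structure $J$ is, accordingly:
\[
J|_{[v]}A^\sharp|_{[v]} = |v\rangle\langle iAv| + |iAv\rangle\langle v|. \tag{2.4}
\]
The expressions for the natural (i.e. Fubini–Study) metric $g$ and Kähler form $\omega$ read
\[
g_{[v]}(A^\sharp|_{[v]}, B^\sharp|_{[v]}) = \operatorname{Re}\{\langle Av|Bv\rangle + \langle v|Av\rangle\langle v|Bv\rangle\} \tag{2.5}
\]
and
\[
\omega_{[v]}(A^\sharp|_{[v]}, B^\sharp|_{[v]}) = g_{[v]}(J|_{[v]}A^\sharp|_{[v]}, B^\sharp|_{[v]}) = \frac{i}{2}\langle v|[A, B]v\rangle. \tag{2.6}
\]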
In particular one has the following formula for the dispersion (variance) of an observable $A \in \mathfrak{u}(V)$ in a state $[v]$, see also, e.g., [1, 18, 46, 20]:
\[
\Delta_{[v]}A = \|Av - \langle v|Av\rangle v\| = \|A^\sharp|_{[v]}\| = \sqrt{g_{[v]}(A^\sharp|_{[v]}, A^\sharp|_{[v]})} = \|J|_{[v]}A^\sharp|_{[v]}\|. \tag{2.7}
\]
In particular, the relationship between the canonical metrics of the two-sphere and the projective line is $g_{P(\mathbb{C}^2)} = \frac{1}{4}g_{S^2}$, i.e. the Fubini–Study metric on the projective line (“Bloch sphere”, 1-qubit or simply qubit state space) is the metric on a sphere of radius $\frac{1}{2}$, whence the curvature $K_{P(\mathbb{C}^2)} = 4K_{S^2} = 4$. This can also be checked via the calculation we shall perform in Sec. 3 (see [20] as well). Projective lines yield totally geodesic submanifolds of projective space, which, in turn, has constant holomorphic sectional curvature equal to 4 (see, e.g., [20–22]). We also notice in passing that the “hyperplane at infinity” corresponding to $[v]$, i.e. $P(\langle v\rangle^\perp) \cong P^{n-1}$, coincides with the cut-locus pertaining to a point $[v] \in P(V)$ (see [22, 21, 36]).

We now briefly recall, for clarity and motivation, the basic steps taken in [8], leading to show the integrability of Schrödinger’s dynamics when looked upon classically over $P(V)$. One starts from a non degenerate quantum Hamiltonian
\[
H = \sum_{j=0}^{n}\lambda_j P_j = \sum_{j=0}^{n}\lambda_j|e_j\rangle\langle e_j| \tag{2.8}
\]
i.e. $\lambda_i \neq \lambda_j$ if $i \neq j$, and $(e_j)$ is an orthonormal basis of eigenvectors, with $P_j := |e_j\rangle\langle e_j|$ being the orthogonal projection operator onto the line $\langle e_j\rangle$. Without loss of generality one may assume $0 = \lambda_0 < \lambda_1 < \cdots < \lambda_n$, so
\[
H = \sum_{j=1}^{n}\lambda_j P_j. \tag{2.9}
\]
The Schrödinger equation is given by (recall that $\hbar = 1$)
\[
\frac{\partial}{\partial t}|v\rangle = -iH|v\rangle. \tag{2.10}
\]
Set $v = \sum_{j=0}^{n}\alpha_j e_j$, $\sum_{j=0}^{n}|\alpha_j|^2 = 1$. The mean value of $H$ on a state $[v]$ yields a “classical” Hamiltonian $h$ on $P(V)$; with the above notations
\[
h([v]) = \langle v|Hv\rangle = \sum_{j=1}^{n}\lambda_j|\alpha_j|^2. \tag{2.11}
\]
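In the eigenbasis $(e_j)$ the Schrödinger equation (2.10) decouples, each coefficient $\alpha_j$ only acquires the phase $e^{-i\lambda_j t}$, and so the moduli $|\alpha_j|^2$ and the mean energy (2.11) stay constant along the flow. A minimal sketch of this, with hypothetical function names and arbitrary sample values for the spectrum and the initial state:

```python
import cmath
import math


def evolve(alpha, lam, t):
    """Solve d|v>/dt = -iH|v> for H = sum_j lam[j] |e_j><e_j|, in the eigenbasis."""
    return [a * cmath.exp(-1j * l * t) for a, l in zip(alpha, lam)]


def mean_energy(alpha, lam):
    """h([v]) = <v|Hv> = sum_j lam[j] |alpha_j|^2 for a unit vector v."""
    return sum(l * abs(a) ** 2 for a, l in zip(alpha, lam))


# Sample non degenerate spectrum with lam[0] = 0, and a normalized state.
lam = [0.0, 1.0, 2.5]
raw = [1 + 2j, -0.5j, 0.75]
norm = math.sqrt(sum(abs(a) ** 2 for a in raw))
alpha0 = [a / norm for a in raw]
alpha_t = evolve(alpha0, lam, t=3.7)
```

Comparing `mean_energy(alpha0, lam)` with `mean_energy(alpha_t, lam)`, and the individual `abs(a) ** 2`, shows the conserved quantities; the sketch works in coordinates and does not model the projective space itself.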
Now let $\alpha_j \neq 0$ for all $j = 0, \ldots, n$. The submanifold consisting of such $[v]$’s is open and dense in $P(V)$. The torus $T^{n+1}$ acts on $P(V)$ via the position $e_j \mapsto e^{i\beta_j}e_j$, $\beta_j \in [0, 2\pi)$, but actually, in view of global phase arbitrariness, this action descends to an effective action of $G := T^n$. We set $\beta_0 = 0$ in order to be specific. The generators of the torus action are the (mutually commuting) operators $iP_j$, $j = 1, 2, \ldots, n$. Their associated Hamiltonians $p_j := \langle\cdot|P_j\,\cdot\rangle$ give rise to $n$ constants of motion (first integrals) in involution with respect to the Poisson bracket defined by the Fubini–Study form, which turn out to be the action variables. In the complement, we have a stratification of toral orbits of dimensions $k = 0, 1, \ldots, n-1$ (isotropic tori), but the basic picture persists. More precisely, one has the following

Theorem 2.1 [8]. (i) The “classical” Hamiltonian system $(P(V), \omega, h)$ (actually an open dense set thereof) is completely integrable. The Lagrangian tori are provided by the orbits $G\cdot[v]$ of the $n$-dimensional torus $G$-action above. The action variables $I_j$ coincide with the transition probabilities $|\alpha_j|^2 = p_j([v])$, $j = 1, 2, \ldots, n$. (ii) Indeed, the full system remains integrable, allowing isotropic tori, and the orbit space can be identified with the standard $n$-simplex in the Euclidean space $\mathbb{R}^n$.

Remarks. 1. As we have already pointed out in [8], this result is known in different, less direct and explicit guises (cf. among others [49, 19]). 2. The simplest case $\dim V = 2$, i.e. $P(V) \cong P(\mathbb{C}^2) \cong S^2$, is already interesting: Schrödinger’s dynamics takes place on parallels (associated to the “poles” $[e_0] \equiv [0]$, $[e_1] \equiv [1]$) parametrized by an appropriate transition probability: this geometric picture will be retrieved in the course of the paper and will be crucial for establishing a possible link with elliptic function theory.

Let us finally quote the following basic result, whose proof is elementary.
Theorem 2.2 (cf. [27, 11, 12]). Given two distinct points (quantum states) $[\xi]$ and $[\eta]$ in $P(V)$, with representative (ket) vectors $\xi$ and $\eta$, and given their respective orthogonal states $[\xi^\perp]$, $[\eta^\perp]$ on the projective line $[\xi][\eta]$ they determine, the cross-ratio $k^2 := ([\xi], [\eta], [\eta^\perp], [\xi^\perp])$ equals the transition probability between $[\xi]$ and $[\eta]$, namely
\[ ([\xi], [\eta], [\eta^\perp], [\xi^\perp]) = \frac{|\langle \xi | \eta \rangle|^2}{\langle \xi | \xi \rangle \langle \eta | \eta \rangle}. \qquad (2.12) \]
Notice that if $[\xi][\eta]$ is regarded as a sphere, then $[\xi]$ and $[\xi^\perp]$, and $[\eta]$ and $[\eta^\perp]$, respectively, become antipodal points thereon. As a final remark, we may observe that a quantum observable induces a projective reference frame built from its eigenstates, and a choice of phase of their representing vectors amounts to fixing its unit point.
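The identity of Theorem 2.2 can be checked numerically in the 1-qubit case. The sketch below is only an illustration under stated assumptions: states are written as $(1, \zeta)$ in an inhomogeneous coordinate $\zeta$, the orthogonal state of $\zeta$ is taken to be $-1/\bar\zeta$, and the cross-ratio convention $(a, b, c, d) = \frac{(a-c)(b-d)}{(a-d)(b-c)}$ is assumed (other orderings give the usual variants of the cross-ratio). All function names are hypothetical.

```python
import random


def cross_ratio(a, b, c, d):
    """Cross-ratio under the ASSUMED convention (a-c)(b-d) / ((a-d)(b-c))."""
    return (a - c) * (b - d) / ((a - d) * (b - c))


def antipode(z):
    """Coordinate of the state orthogonal to (1, z): solves 1 + conj(z) * w = 0."""
    return -1 / z.conjugate()


def transition_probability(a, b):
    """|<xi|eta>|^2 / (<xi|xi><eta|eta>) for xi = (1, a), eta = (1, b)."""
    overlap = 1 + a.conjugate() * b
    return abs(overlap) ** 2 / ((1 + abs(a) ** 2) * (1 + abs(b) ** 2))


def max_discrepancy(trials=1000, seed=0):
    """Largest |cross-ratio - transition probability| over random pairs of states."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        a = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
        b = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
        k2 = cross_ratio(a, b, antipode(b), antipode(a))
        worst = max(worst, abs(k2 - transition_probability(a, b)))
    return worst
```

Under these assumptions the cross-ratio comes out real and agrees with the right-hand side of (2.12) up to rounding, which is consistent with its invariance under the projective unitary group.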
3. Uncertainty and Jacobi Fields

In this section, we are going to prove that the fundamental vector field induced by the Schrödinger Hamiltonian, when restricted to a minimal geodesic connecting two orthogonal eigenstates pertaining to different energy levels, is a Jacobi field thereon. We first observe that we may confine ourselves to the case of a two-level system with non-degenerate Hamiltonian $H = \lambda_0 |0\rangle\langle 0| + \lambda_1 |1\rangle\langle 1|$ with $\lambda_0 < \lambda_1$ and $\delta h := \lambda_1 - \lambda_0$. The result we are going to discuss will hold even in the infinite-dimensional case, for two different eigenvalues of $H$ (when present). Also recall from Sec. 2 that $\Delta_{[v]} H = \| J_H |_{[v]} \|$. The vector field $J := J_H$, taken along a (minimal) geodesic curve joining two orthogonal eigenstates of $H$ (this is just a half-meridian in $S^2 \cong P(C^2)$ viewed as a totally geodesic submanifold of the full projective space), is perpendicular to it at every point (see also below).

Theorem 3.1. (i) The dispersion $\Delta_{[v]} H$ equals $\delta h \cdot r_\vartheta$, with $r_\vartheta$ the radius of the parallel with colatitude $\vartheta$ pertaining to the sphere with radius $\frac{1}{2}$. (ii) The vector field $J$ is a Jacobi vector field when restricted to a geodesic connecting two orthogonal eigenstates corresponding to different energy levels.

Proof. Assertion (i) amounts to stating that the Fubini–Study metric for $P(C^2)$ coincides with the standard metric on a sphere of radius $1/2$ (whose curvature is $K = 4$), see Sec. 2; however, we can directly check our assertion as follows. First one has $r_\vartheta = \frac{1}{2} \sin\vartheta = \sin\frac{\vartheta}{2} \cos\frac{\vartheta}{2}$; then we explicitly compute the dispersion: we have, setting as usual (see below, Remark 2) $|v\rangle = z_0 |0\rangle + z_1 |1\rangle = \cos\frac{\vartheta}{2} |0\rangle + e^{i\varphi} \sin\frac{\vartheta}{2} |1\rangle$, with $\vartheta \in [0, \pi]$ the "colatitude" taken along a "meridian" (on the standard $S^2$), and $\varphi \in [0, 2\pi)$ the "longitude",
\[ \Delta_{[v]} H = (\lambda_1 - \lambda_0) |z_1| (1 - |z_1|^2)^{1/2} \equiv \delta h \left( |z_1| (1 - |z_1|^2)^{1/2} \right) = \delta h \cdot r_\vartheta \qquad (3.1) \]
as desired, and $s = \frac{\vartheta}{2}$ is then the geodesic parameter along a meridian with respect to the Fubini–Study metric. Then assertion (ii) is also immediate:
\[ \frac{d^2 J}{ds^2} + 4J = 0 \qquad (3.2) \]
which is indeed the Jacobi equation since the sectional curvature of the appropriate plane is $K = 4$ (i.e. the constant holomorphic sectional curvature, see [21, 22], and Sec. 2). Another quick proof of (ii) is the following: a rotation around the "polar axis" induces a geodesic variation with fixed extremities, which, when infinitesimalized, gives rise to a Jacobi field (cf., e.g., the article of Kobayashi in [36]). In our case, it is immediately verified via elementary geometry that the Jacobi field is, up to a constant, the one given above.

Remarks. 1. We clearly see that the Heisenberg Uncertainty Principle is essentially a manifestation of curvature. 2. Notice that the standard parametrization of $S^2$ with "half-angles" comes from stereographic projection, and the inhomogeneous complex variable $\zeta := \frac{z_1}{z_0} = e^{i\varphi} \tan\frac{\vartheta}{2}$ is just the coordinate of the projection onto the "equatorial" plane (taken from the "south pole" $[1]$) of the above point $[v]$ on the sphere, having colatitude $\vartheta$ (i.e. the angle between $[0]$ and $[v]$) and longitude $\varphi$, measured from one fixed meridian. In the projective line picture (cf. Sec. 2), $[0]$ is the origin of a projective frame in which $[1]$ is the point at infinity and $[|0\rangle + |1\rangle]$ is the unit point.

We call the quantity $r_\vartheta^2 = \frac{(\Delta H)^2}{\delta h^2}$ the geometrical uncertainty, in view of its purely geometric origin. Also recall, in passing, that the quantity $\tau_Z = 1/\Delta_{[v]} H$ is called the Zeno time and plays an important role in quantum measurement theory [23].

4. Integrability and Elliptic Functions

This section is somewhat speculative, and outlines a possible connection with elliptic functions. This stems from the basic fact, recalled in Sec. 2, that the transition probability in quantum mechanics is actually a cross-ratio. For full details on this venerable topic we refer, for instance, to the classical treatise [53], and also to [45, 25, 38, 50]. We recall the complete elliptic integral of the first kind (according to Legendre's classification) associated to the Jacobi modulus $k \in C \setminus \{0, 1\}$ (in Legendre's and Jacobi's form, respectively):
\[ K(k) = \int_0^1 \frac{dx}{\sqrt{(1 - x^2)(1 - k^2 x^2)}} = \int_0^{\pi/2} \frac{d\phi}{\sqrt{1 - k^2 \sin^2\phi}} \qquad (4.1) \]
and its complementary integral $K'(k) := K(k')$, with $k'^2 = 1 - k^2$. An important case (especially for applications) arises when $k$ is real and $0 < k < 1$. The modulus
squared $k^2$ can be interpreted as the cross-ratio of the four roots of a complex polynomial $P_4 = P_4(x)$ of fourth degree (or three, if one of the roots goes to infinity) appearing in a generic elliptic integral $\int \frac{dx}{y}$, where $y^2 = P_4(x)$, see [53, 45, 50] for a detailed discussion. One obtains a non-singular cubic $C$ in $P^2$ whose Weierstraß form reads
\[ y^2 = 4x^3 - g_2 x - g_3 = 4(x - e_1)(x - e_2)(x - e_3) \qquad (4.2) \]
with $e_1 + e_2 + e_3 = 0$ (the $e_i$'s are all distinct, see also below). The elliptic integral above is explicitly inverted by Weierstraß' function $\wp = \wp(z, g_2, g_3) \equiv \wp(z, \tau)$, fulfilling the above equation with $x = \wp$, $y = \wp'$. Then $C \cong T := C/\Lambda$, a torus defined by quotienting $C$ by the normalized lattice $\Lambda = Z(1, \tau)$, where $\tau \in C$, $\Im\tau > 0$. If we set $\omega_1 = \frac{1}{2}$, $\omega_2 = \frac{1 + \tau}{2}$, $\omega_3 = \frac{\tau}{2}$, then $e_i = \wp(\omega_i)$, and also $\tau = i \frac{K'}{K}$. We record the following homogeneity property of $\wp$ (with $t$ a non-zero complex number):
\[ \wp(z; g_2, g_3) = t^2 \wp(tz; t^{-4} g_2, t^{-6} g_3) \qquad (4.3) \]
and the formulae for the discriminant $\Delta$ and the $j$-invariant (the latter parametrizing the isomorphism classes of elliptic curves)
\[ \Delta = g_2^3 - 27 g_3^2 = 16 (e_1 - e_2)^2 (e_2 - e_3)^2 (e_3 - e_1)^2 \qquad (4.4) \]
and
\[ j = \frac{g_2^3}{\Delta} = \frac{g_2^3}{g_2^3 - 27 g_3^2} = \frac{4}{27} \frac{(1 - k^2 k'^2)^3}{k^4 k'^4}. \qquad (4.5) \]
The following lattice arrangements are particularly important (fix $g_3 > 0$ and $g_2 > 0$, respectively, cf. [50, 38]): the "equilateral triangle" lattice (equianharmonic case) $g_2 = 0$, $g_3 = 4$, $k^2 = e^{\frac{\pi}{3} i}$, stemming from
\[ \wp(\xi z; 0, g_3) = \xi\, \wp(z; 0, g_3) \qquad (4.6) \]
with $\xi$ a cubic root of unity, and the "square" lattice (lemniscatic case) $g_2 = 12$ and $g_3 = 0$, with $k^2 = \frac{1}{2}$, arising from
\[ \wp(iz; g_2, 0) = -\wp(z; g_2, 0). \qquad (4.7) \]
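Formulae (4.4) and (4.5) can be evaluated directly on the two special lattices. The following is a minimal sketch, not part of the original treatment: the function names are hypothetical, and the expression $k^2 = (e_2 - e_3)/(e_1 - e_3)$ used for generic roots is an assumed ordering of the cross-ratio of the roots (with the fourth root at infinity).

```python
import cmath


def invariants_from_roots(e1, e2, e3):
    """g2, g3 such that 4(x-e1)(x-e2)(x-e3) = 4x^3 - g2 x - g3 (needs e1+e2+e3 = 0)."""
    g2 = -4 * (e1 * e2 + e2 * e3 + e3 * e1)
    g3 = 4 * e1 * e2 * e3
    return g2, g3


def j_from_invariants(g2, g3):
    """First form of (4.5): j = g2^3 / Delta, with Delta as in (4.4)."""
    delta = g2 ** 3 - 27 * g3 ** 2
    return g2 ** 3 / delta


def j_from_modulus(k2):
    """Last form of (4.5), in terms of k^2 and k'^2 = 1 - k^2."""
    kp2 = 1 - k2
    return (4 / 27) * (1 - k2 * kp2) ** 3 / (k2 ** 2 * kp2 ** 2)


# Square (lemniscatic) lattice: g2 = 12, g3 = 0, k^2 = 1/2.
j_square = (j_from_invariants(12, 0), j_from_modulus(0.5))

# Equilateral (equianharmonic) lattice: g2 = 0, g3 = 4, k^2 = exp(i pi / 3).
j_triangle = (j_from_invariants(0, 4), j_from_modulus(cmath.exp(1j * cmath.pi / 3)))

# Generic distinct roots with zero sum; k^2 from the ASSUMED root ordering.
e1, e2, e3 = 3.0, -1.0, -2.0
j_generic = (j_from_invariants(*invariants_from_roots(e1, e2, e3)),
             j_from_modulus((e2 - e3) / (e1 - e3)))
```

Both routes give $j = 1$ for the square lattice and $j = 0$ for the equilateral one (up to rounding), and they agree on the generic example as well, which illustrates that $j$ depends on the lattice only through $k^2$.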
They are the only lattices admitting complex multiplication automorphisms, and yield the singular points of the moduli space $M_1$ of elliptic curves (see, e.g., [38] or [42] for a more detailed discussion and physical applications). Therefore we have, building on the geometric description of the 1-qubit space of Secs. 2 and 3 and employing the notation therein,
\[ |\langle 1 | v \rangle|^2 = ([v], [1], [0], [v^\perp]) =: k^2 \quad \text{and} \quad |\langle 0 | v \rangle|^2 =: k'^2 = 1 - k^2 \qquad (4.8) \]
and thus we may regard, simply, $k^2 = |\langle 1 | v \rangle|^2$ as the Jacobi modulus (squared) of an elliptic curve $C = C_{k^2} = C_j$ (with $j$ given by (4.5)). The modulus $k^2$ will also be the cross-ratio of the corresponding Weierstraß roots.
These data determine the modular parameter $\tau = i \frac{K'(k)}{K(k)}$, which induces a complex structure.

We are now prepared to discuss our result. Recall that in view of the integrability of Schrödinger's dynamics (see, e.g., [8] and references therein, or Sec. 2) the Bloch sphere is foliated into 1-dimensional Lagrangian (or Liouville) tori (circles, actually parallels, with the poles being given by the orthogonal eigenstates $[0]$ and $[1]$), whereon the dynamics takes place as a uniform rotation around the "polar" axis, of period $T = \frac{2\pi}{\delta h}$. They degenerate into points at the eigenstates themselves. Upon stereographically projecting as in Sec. 3, Remark 2, we see that the Schrödinger evolution of a state $[v]$ describes a circle centred at the origin $[0]$. Any parallel, say $P_{k^2}$, is labeled by the value of the action $k^2 = |\langle 1 | v \rangle|^2$; its radius is given by the dispersion of the Hamiltonian on any state thereon. These 1-dimensional "variable" circles (dynamical cycles) may be looked upon as 1-cycles on the natural (two-dimensional) (Weierstraß) torus with periods $K$ and $iK'$ canonically associated to $C$. This is a standard procedure in the theory of algebraic integrable systems (cf., e.g., [4]). Then, obvious complementary "fixed" 1-cycles, provided by the meridians passing through the poles (eigenstates), are associated to the former ones; we call them collapse cycles, since the measurement of the Hamiltonian forces collapse onto an eigenstate, with the appropriate probabilities $k^2$ and $k'^2$. They are also naturally mapped to 1-cycles on the elliptic curve. This can be made explicit as follows: take $[v]$ on a fixed parallel $P_{k^2}$, with $v = k' e_0 + e^{-i\beta} k e_1$ (via an appropriate phase adjustment), where $\beta \in [0, 2\pi]/\sim$ (endpoint identification). Then, in terms of a suitable stereographic projection, the Liouville torus $P_{k^2}$ is embedded into the complex torus $T$ by means of the map
\[ P_{k^2} \ni e^{i\beta} \frac{k}{k'} \mapsto i\beta \frac{K'}{K} = \beta\tau \in T. \qquad (4.9) \]
Recall, for completeness, that $k$ and $k'$ can in turn be recovered from the modular parameter $\tau$ via Jacobi's theta functions (see the above references):
\[ k^2 = \frac{\vartheta_2^4(0)}{\vartheta_3^4(0)}, \qquad k'^2 = 1 - k^2 = \frac{\vartheta_4^4(0)}{\vartheta_3^4(0)}. \qquad (4.10) \]
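As an illustration of (4.10), the theta constants can be summed numerically. The sketch below assumes the standard series $\vartheta_2(0) = \sum_n q^{(n+1/2)^2}$, $\vartheta_3(0) = \sum_n q^{n^2}$, $\vartheta_4(0) = \sum_n (-1)^n q^{n^2}$ with nome $q = e^{i\pi\tau}$ (the paper does not spell out its normalization, so this is an assumption), and the function names are hypothetical.

```python
import cmath


def theta_constants(tau, terms=10):
    """Truncated series for theta_2(0), theta_3(0), theta_4(0); requires Im(tau) > 0."""
    def q_power(exponent):
        # q**exponent with q = exp(i*pi*tau), written so that no branch choice is needed
        return cmath.exp(1j * cmath.pi * tau * exponent)

    theta2 = sum(q_power((n + 0.5) ** 2) for n in range(-terms, terms))
    theta3 = sum(q_power(n * n) for n in range(-terms, terms + 1))
    theta4 = sum((-1) ** n * q_power(n * n) for n in range(-terms, terms + 1))
    return theta2, theta3, theta4


def moduli_from_tau(tau):
    """Return (k^2, k'^2) as in (4.10)."""
    theta2, theta3, theta4 = theta_constants(tau)
    return (theta2 / theta3) ** 4, (theta4 / theta3) ** 4


# Square lattice tau = i: the two moduli coincide.
k2_square, kp2_square = moduli_from_tau(1j)

# A generic purely imaginary tau: k^2 + k'^2 = 1 (Jacobi's identity).
k2_generic, kp2_generic = moduli_from_tau(0.7j)
```

For $\tau = i$ this gives $k^2 = k'^2 = \frac{1}{2}$ up to rounding, matching the lemniscatic case, and the relation $k^2 + k'^2 = 1$ is reproduced for other values of $\tau$ as well.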
Thus we have a family of tori possessing the same symplectic structure but variable complex structure, the variation being governed by uncertainty. We collect the above remarks in the following

Theorem 4.1. (i) There exists a family of elliptic curves $C_{k^2}$ parametrized by $k^2$, building up a (topologically trivial, having contractible base) fibration $F \to (0, 1)$ in abelian tori, wherein the dynamical Lagrangian tori (parallels on the unit sphere) can be embedded and made to correspond, in the normalized lattice $Z(1, \tau)$, to the $\tau$-1-cycle. The 1-1-cycle can be associated to a meridian passing through the poles, and can be called the collapse cycle, since the measurement of the Hamiltonian forces collapse onto an eigenstate, with the appropriate probability.
(ii) The tori have varying complex structures (induced by $\tau$), ultimately governed by the geometrical uncertainty, which appears directly in the expression for the $j$-invariant. (iii) With the above notation, the angle $\vartheta$ also represents the maximal elongation of a simple pendulum, with period given by $4K(k)$ (in view of $k^2 = \sin^2\frac{\vartheta}{2}$). In this picture an adiabatic evolution of the Hamiltonian yields a Berry phase, resulting in a rotation along a parallel, this inducing a Foucault-type rotation of the plane of the pendulum. (iv) The above fibration yields a (prequantizable) symplectic family, whereupon a line bundle $L \to F$ can be constructed, restricting to the standard theta line bundle over fibres.

For details concerning the mechanical analogy, see [38]; for the Hannay–Berry interpretation of the Foucault pendulum, one can refer to [20]. In [8], a Hannay-type interpretation of Berry's phase is established, again by exploiting integrability. We need to comment a bit more on point (iv); this is easily made explicit via Riemann's theta function
\[ \vartheta(z, \tau) = \sum_{n \in Z} e^{i\pi n^2 \tau + 2\pi i n z} \qquad (4.11) \]
corresponding to the (unique up to a scalar) holomorphic section of the "theta line bundle", defined over any principally polarized abelian variety (see, e.g., [25] for details). The heat equation fulfilled by $\vartheta$ is a manifestation of its covariant constancy with respect to a natural projectively flat connection. This ties neatly with abelian Chern–Simons theory [3, 54]. Assertion (iv) can be seen as a simple instance of the so-called GLSW-construction (see [13, 48] for details and more refined applications), in the sense that it presents an (unobstructed) family of (geometric) quantizations over a family of symplectic manifolds (abelian varieties).

5. 3-Strand Braiding in 1-Qubit Spaces

In this section we are going to describe all $SU(2)$-representation (classes) of the 3-strand braid group $B_3$ in a purely geometric fashion, and then we compare our conclusions with the Temperley–Lieb theoretic approach of Kauffman and Kauffman–Lomonaco (see e.g. [28–31]). Background for the topics involved in this section can be found, among others, in [28, 9]. Recall that the braid group $B_n$ can be presented via generators $b_i$, $i = 1, 2, \ldots, n-1$, subject to relations $b_i b_{i+1} b_i = b_{i+1} b_i b_{i+1}$ for $i = 1, 2, \ldots, n-2$ and $b_i b_j = b_j b_i$ for $|i - j| \geq 2$. Adjoining the relations $b_i^2 = 1$ we get a presentation for the symmetric group $S_n$. There is a natural surjection $B_n \to S_n$, and its kernel is given by the pure (or colored) braid group $P_n$. We also recall that $B_n$ is the fundamental group of $Y_n := \mathrm{Conf}(C, n)/S_n$ consisting of all collections of $n$ different but indistinguishable points on the complex plane $C$ (thus it is the quotient of the
configuration space $\mathrm{Conf}(C, n)$ by the obvious action of the permutation group $S_n$). The latter space can also be identified with the space of monic polynomials of degree $n$ possessing distinct roots. Also, $B_n$ is the subgroup of the mapping class group (viz. group of components of orientation preserving diffeomorphisms) of a sphere with $n + 1$ marked points $p_1, p_2, \ldots, p_{n+1} = \infty$ leaving the last one (say) fixed. It is well known that, in view of Alexander's theorem, all links can be realized via closing a braid (determined up to Markov moves [9, 28]). In the present paper we shall concentrate on the simplest non-trivial case $n = 3$, where we have the single condition $b_1 b_2 b_1 = b_2 b_1 b_2$. It is easily seen that the center $Z$ of $B_3$ is generated by $(b_1 b_2)^3$ and that one has $B_3/Z \cong PSL(2, Z)$ (the latter being the modular group), see, e.g., [52]. This further substantiates the relationship with elliptic functions discussed above. Explicitly, one has the surjective map $B_3 \to PSL(2, Z)$ induced by
\[ b_1 \mapsto \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}, \qquad b_2 \mapsto \begin{pmatrix} 1 & 0 \\ -1 & 1 \end{pmatrix}. \qquad (5.1) \]
One has $PSL(2, Z) \cong Z_2 * Z_3$ (free product) via the explicit representation
\[ b_1 b_2 b_1 = b_2 b_1 b_2 \mapsto S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}, \qquad b_1 b_2 \mapsto U = \begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}. \qquad (5.2) \]
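The relations behind (5.1) and (5.2) are easy to confirm with integer arithmetic. The following sketch uses only the two matrices of (5.1) with ordinary row-by-column multiplication; the helper names are hypothetical. With this product, $b_1 b_2$ evaluates to $\begin{pmatrix} 0 & 1 \\ -1 & 1 \end{pmatrix}$ and $b_2 b_1$ to $\begin{pmatrix} 1 & 1 \\ -1 & 0 \end{pmatrix}$; both cube to $-I$, so the check does not depend on which composition order is intended.

```python
def matmul(x, y):
    """Product of two 2x2 integer matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def power(x, n):
    """n-th power of a 2x2 matrix, n >= 0."""
    result = ((1, 0), (0, 1))
    for _ in range(n):
        result = matmul(result, x)
    return result


B1 = ((1, 1), (0, 1))    # image of b1 in (5.1)
B2 = ((1, 0), (-1, 1))   # image of b2 in (5.1)
MINUS_I = ((-1, 0), (0, -1))

braid_lhs = matmul(matmul(B1, B2), B1)   # b1 b2 b1
braid_rhs = matmul(matmul(B2, B1), B2)   # b2 b1 b2
order_three_a = matmul(B1, B2)
order_three_b = matmul(B2, B1)

checks = {
    "braid relation": braid_lhs == braid_rhs,
    "S^2 = -I": power(braid_lhs, 2) == MINUS_I,
    "(b1 b2)^3 = -I": power(order_three_a, 3) == MINUS_I,
    "(b2 b1)^3 = -I": power(order_three_b, 3) == MINUS_I,
}
```

All four checks hold, so in $PSL(2, Z)$ the images of $S$ and of the product of the two generators have orders 2 and 3, as the free product decomposition requires.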
Clearly, $S^2 = U^3 = -I$, whence the right-hand side of the projected relations in $PSL(2, Z)$ is the identity. One also has the useful identities (with a slight abuse of notation) $S = b_1 b_2 b_1$, so $S = U b_1$, $b_1 = U^{-1} S$, $b_1 b_2 = U$, $b_2 = (U^{-1} S)^{-1} U = S^{-1} U^2$.

To proceed further, we also need, for the sake of completeness, to gather some basic information about the special unitary group $SU(2)$, the universal (double) covering group of $SO(3)$ (see, e.g., [37, 39]). The reader may prefer to proceed directly to Theorem 5.1 and go back if necessary. A general special unitary matrix takes the form (in terms of the so-called Cayley–Klein parameters)
\[ \begin{pmatrix} \alpha & \beta \\ -\bar\beta & \bar\alpha \end{pmatrix} \qquad (5.3) \]
with $\alpha, \beta \in C$, $|\alpha|^2 + |\beta|^2 = 1$. First recall the expression for the Pauli matrices (multiplied by $i$, they provide a basis for the Lie algebra $\mathrm{Lie}(SU(2)) \cong \mathrm{Lie}(SO(3)) \cong R^3$),
\[ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \qquad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (5.4) \]
Given a geometric vector $a = a_1 i + a_2 j + a_3 k$ (standard notation) and setting, successively, $\sigma = (\sigma_1, \sigma_2, \sigma_3)$ and $\sigma \cdot a := \sum_{i=1}^{3} a_i \sigma_i$, we have (with $\cdot$ and $\times$ denoting the scalar and vector product in the space of geometric vectors, respectively):
\[ (\sigma \cdot a)(\sigma \cdot b) = (a \cdot b) I_2 + i \sigma \cdot a \times b. \qquad (5.5) \]
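Identity (5.5) can be spot-checked numerically with the Pauli matrices of (5.4). This is a small sketch in plain Python; the helper names (`sigma_dot`, `pauli_product_residual`) are hypothetical.

```python
def sigma_dot(v):
    """The 2x2 matrix sigma . v = v1*sigma1 + v2*sigma2 + v3*sigma3."""
    x, y, z = v
    return ((complex(z), complex(x, -y)), (complex(x, y), complex(-z)))


def matmul(p, q):
    """Product of two 2x2 complex matrices stored as tuples of rows."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def dot(a, b):
    return sum(s * t for s, t in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pauli_product_residual(a, b):
    """Largest entry of (sigma.a)(sigma.b) - [(a.b) I + i sigma.(a x b)]."""
    lhs = matmul(sigma_dot(a), sigma_dot(b))
    cross_part = sigma_dot(cross(a, b))
    scalar = dot(a, b)
    return max(
        abs(lhs[i][j] - (scalar * (1 if i == j else 0) + 1j * cross_part[i][j]))
        for i in range(2) for j in range(2)
    )


residual = pauli_product_residual((1.0, 2.0, 3.0), (-0.5, 0.25, 2.0))
```

The residual vanishes up to rounding for arbitrary real vectors, not only for unit ones, since both sides of (5.5) are bilinear in $a$ and $b$.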
The preceding formula can be cast into quaternionic form, upon setting
\[ I := i\sigma_1, \qquad J := i\sigma_2, \qquad K := i\sigma_3 \qquad (5.6) \]
and remembering that quaternionic multiplication (also denoted by $\cdot$) entails multiplying the respective matrices in reverse order, e.g., $I \cdot J = (i\sigma_2)(i\sigma_1) = -i\sigma \cdot j \times i = i\sigma \cdot i \times j = i\sigma_3 = K$. Now, given a unit vector $n$ and an oriented angle $\varphi \in [0, 2\pi]$, the (special) unitary operator
\[ U_n(\varphi) = \exp\left( i \frac{\varphi}{2} \sigma \cdot n \right) = \cos\frac{\varphi}{2} I_2 + \sin\frac{\varphi}{2} \, i\sigma \cdot n \qquad (5.7) \]
induces, via the adjoint action Ad on $\mathrm{Lie}(SU(2))$, a counter-clockwise rotation $R_n(\varphi)$ of angle $\varphi$ around the axis $n$ on planes perpendicular thereto:
\[ \mathrm{Ad}(U_n(\varphi)) X = U_n(\varphi) X U_n(-\varphi) \qquad (5.8) \]
i.e., setting, with a slight abuse of notation,
\[ f : R^3 \to \mathrm{Lie}(SU(2)), \qquad x = (x, y, z) \mapsto X = x\sigma_1 + y\sigma_2 + z\sigma_3 = \begin{pmatrix} z & x - iy \\ x + iy & -z \end{pmatrix} \qquad (5.9) \]
then f ◦ Rn (ϕ) = Ad(Un (ϕ)) ◦ f.
(5.10)
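Relation (5.10) can be probed numerically: conjugating $X = f(x)$ by $U_n(\varphi)$ should return $f(x')$ with $x'$ obtained from $x$ by a rotation of angle $\varphi$ about $n$. The sketch below checks the three properties that characterize such a rotation up to its sense (length preserved, component along $n$ preserved, perpendicular part turned by $\varphi$); the sense of rotation depends on orientation conventions and is deliberately not tested. The helper names are hypothetical.

```python
import math


def sigma_dot(v):
    """The map f of (5.9): X = x*sigma1 + y*sigma2 + z*sigma3."""
    x, y, z = v
    return ((complex(z), complex(x, -y)), (complex(x, y), complex(-z)))


def vector_of(m):
    """Inverse of f on traceless Hermitian 2x2 matrices."""
    return (m[1][0].real, m[1][0].imag, m[0][0].real)


def matmul(p, q):
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def dagger(p):
    return tuple(tuple(p[j][i].conjugate() for j in range(2)) for i in range(2))


def rotation_operator(n, phi):
    """Closed form of (5.7); n is assumed to be a unit vector."""
    c, s = math.cos(phi / 2), math.sin(phi / 2)
    sn = sigma_dot(n)
    return tuple(
        tuple((c if i == j else 0.0) + 1j * s * sn[i][j] for j in range(2))
        for i in range(2)
    )


def adjoint_action(u, v):
    """x' with f(x') = U f(x) U^(-1), using U^(-1) = U^dagger = U_n(-phi)."""
    return vector_of(matmul(matmul(u, sigma_dot(v)), dagger(u)))


def dot(a, b):
    return sum(s * t for s, t in zip(a, b))


axis = (1 / 3, 2 / 3, 2 / 3)          # unit vector
angle = 1.1
x = (0.3, -1.2, 0.5)
u = rotation_operator(axis, angle)
x_rot = adjoint_action(u, x)

# Part of x perpendicular to the axis, and its image.
w = tuple(xi - dot(axis, x) * ni for xi, ni in zip(x, axis))
w_rot = adjoint_action(u, w)
cos_turn = dot(w, w_rot) / dot(w, w)
```

In this example `cos_turn` agrees with $\cos\varphi$, and the length of $x$ and its component along the axis are unchanged, as expected of $R_n(\varphi)$.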
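Comparison between (5.3) and (5.7) easily yields
\[ \cos\frac{\varphi}{2} = \Re\alpha, \qquad \sin\frac{\varphi}{2} = \sqrt{1 - (\Re\alpha)^2}, \qquad n = \frac{1}{\sqrt{1 - (\Re\alpha)^2}} \left( \Im\beta \, i + \Re\beta \, j + \Im\alpha \, k \right) \qquad (5.11) \]
unless $\alpha = \pm 1 = \Re\alpha$, i.e. for $\pm I_2$, both inducing the trivial rotation. Let $a, b \in R^3$ be unit vectors, and let $a \cdot b =: \cos\Omega$. Then, in the preceding notation,
\[ U_a(\alpha) \cdot U_b(\beta) = p I_2 + q \, i\sigma \cdot a + r \, i\sigma \cdot b + s \, i\sigma \cdot a \times b, \qquad (5.12) \]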
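where
\[ \begin{aligned} p &= \cos\frac{\alpha}{2} \cos\frac{\beta}{2} - \sin\frac{\alpha}{2} \sin\frac{\beta}{2} \cos\Omega \\ q &= \sin\frac{\alpha}{2} \cos\frac{\beta}{2} \\ r &= \cos\frac{\alpha}{2} \sin\frac{\beta}{2} \\ s &= -\sin\frac{\alpha}{2} \sin\frac{\beta}{2}. \end{aligned} \]
Setting $U_a(\alpha) \cdot U_b(\beta) =: U_n(\psi)$, one has, in particular,
\[ \cos\frac{\psi}{2} = p, \qquad \sin\frac{\psi}{2} = +(1 - p^2)^{1/2} \qquad (5.13) \]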
which is easily interpreted in terms of spherical trigonometry (or, conversely, one could establish the latter via the present machinery). A tedious but straightforward calculation, given unit vectors a, b, c ∈ R3 , and recalling the general vector identity (a × b) × c = (a · c)b − (b · c)a
(5.14)
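also yields
\[ U_a(\alpha) \cdot U_b(\beta) \cdot U_c(\gamma) = p' I_2 + q' \, i\sigma \cdot a + r' \, i\sigma \cdot b + s' \, i\sigma \cdot c + t' \, i\sigma \cdot a \times b + u' \, i\sigma \cdot a \times c + v' \, i\sigma \cdot b \times c \qquad (5.15) \]
with
\[ \begin{aligned} p' &= \cos\frac{\alpha}{2} \cos\frac{\beta}{2} \cos\frac{\gamma}{2} - \sin\frac{\alpha}{2} \sin\frac{\beta}{2} \cos\frac{\gamma}{2} \, a \cdot b - \sin\frac{\alpha}{2} \cos\frac{\beta}{2} \sin\frac{\gamma}{2} \, a \cdot c \\ &\quad - \cos\frac{\alpha}{2} \sin\frac{\beta}{2} \sin\frac{\gamma}{2} \, b \cdot c + \sin\frac{\alpha}{2} \sin\frac{\beta}{2} \sin\frac{\gamma}{2} \, a \times b \cdot c \\ q' &= \sin\frac{\alpha}{2} \cos\frac{\beta}{2} \cos\frac{\gamma}{2} - \sin\frac{\alpha}{2} \sin\frac{\beta}{2} \sin\frac{\gamma}{2} \, b \cdot c \\ r' &= \cos\frac{\alpha}{2} \sin\frac{\beta}{2} \cos\frac{\gamma}{2} + \sin\frac{\alpha}{2} \sin\frac{\beta}{2} \sin\frac{\gamma}{2} \, a \cdot c \\ s' &= \cos\frac{\alpha}{2} \cos\frac{\beta}{2} \sin\frac{\gamma}{2} - \sin\frac{\alpha}{2} \sin\frac{\beta}{2} \sin\frac{\gamma}{2} \, a \cdot b \\ t' &= -\sin\frac{\alpha}{2} \sin\frac{\beta}{2} \cos\frac{\gamma}{2} \\ u' &= -\sin\frac{\alpha}{2} \cos\frac{\beta}{2} \sin\frac{\gamma}{2} \\ v' &= -\cos\frac{\alpha}{2} \sin\frac{\beta}{2} \sin\frac{\gamma}{2}. \end{aligned} \qquad (5.16) \]
We are now prepared to state the following: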
Theorem 5.1. (i) There exists a unique family of $SU(2)$-representation classes of the 3-strand braid group $B_3$, where the rotation angle $\alpha$ of both generators and the angle $\Omega$ between their respective axes are related by means of the formula
\[ 2 \sin\frac{\alpha}{2} \cos\frac{\Omega}{2} = 1 \qquad (5.17) \]
with $\Omega \in [-\frac{2\pi}{3}, \frac{2\pi}{3}]$. Equivalent forms are
\[ \cos\Omega = \frac{\cos\alpha}{1 - \cos\alpha}, \qquad \cos\alpha = \frac{\cos\Omega}{1 + \cos\Omega} \qquad (5.18) \]
with $\alpha \in [\frac{\pi}{3}, \frac{5\pi}{3}]$ (trivial representations are included). (ii) The above representations induce, in turn, special unitary representations of $SL(2, Z)$ (and of the modular group $PSL(2, Z)$). In particular, the rotation axis
pertaining to the generator $S$ bisects the angle formed by the corresponding axes of the braid group generators $b_1$ and $b_2$.

Proof. Ad (i). The proof is straightforward. Indeed, after specializing the above formula for the product of three "rotations", with $c = a$ and upon exchanging the roles of $a$ and $b$, the braid identity
\[ U_a(\alpha) \cdot U_b(\beta) \cdot U_a(\alpha) = U_b(\beta) \cdot U_a(\alpha) \cdot U_b(\beta) \qquad (5.19) \]
leads to the following equations, relating $\alpha$, $\beta$ and $\Omega$ (they follow by comparing the coefficients of $I_2$, $i\sigma \cdot a$ and $i\sigma \cdot b$ on the two sides):
\[ \begin{aligned} \cos\alpha \cos\frac{\beta}{2} - \sin\alpha \sin\frac{\beta}{2} \cos\Omega &= \cos\beta \cos\frac{\alpha}{2} - \sin\beta \sin\frac{\alpha}{2} \cos\Omega \\ 2 \sin\frac{\alpha}{2} \cos\frac{\alpha}{2} \cos\frac{\beta}{2} - 2 \sin^2\frac{\alpha}{2} \sin\frac{\beta}{2} \cos\Omega &= \sin\frac{\alpha}{2} \\ 2 \sin\frac{\beta}{2} \cos\frac{\beta}{2} \cos\frac{\alpha}{2} - 2 \sin^2\frac{\beta}{2} \sin\frac{\alpha}{2} \cos\Omega &= \sin\frac{\beta}{2}. \end{aligned} \qquad (5.20) \]
Appropriate manipulation of trigonometric identities or, more simply, taking symmetry of the braid relation into due account shows that if, for a fixed $\Omega$, solutions to the above equations exist, then $\beta = \pm\alpha$ (or, working in the interval $[0, 2\pi]$, $\beta = 2\pi - \alpha$). We treat the first case in full detail, the other reducing to the first upon changing one of the generators into its inverse. Let us set $x = \cos\frac{\alpha}{2}$, $y = \sin\frac{\alpha}{2}$. The first equation becomes an identity, the other two merge into the following one:
\[ y (2x^2 - 2 \cos\Omega \, y^2 - 1) = 0 \qquad (5.21) \]
(in addition to $x^2 + y^2 = 1$). If $y = 0$, then $\alpha = 2k\pi$, $k \in Z$, which yields a trivial solution. If $y \neq 0$, then
\[ x^2 = \frac{1 + 2\cos\Omega}{2(1 + \cos\Omega)}, \qquad y^2 = \frac{1}{2(1 + \cos\Omega)} \qquad (5.22) \]
which requires $\cos\Omega \geq -\frac{1}{2}$. Also notice that $y \geq 0$ for $\alpha \in [0, 2\pi]$. The equation involving $y$ can be cast in the form
\[ y = \sin\frac{\alpha}{2} = \frac{1}{2 \cos\frac{\Omega}{2}} \qquad (5.23) \]
or in the equivalent form (5.17), with $\Omega \in [-\frac{2\pi}{3}, \frac{2\pi}{3}]$ (and this is, in turn, tantamount to (5.18)). The case $\Omega = \frac{2\pi}{3}$ yields $\alpha = \pi$ (restricting to the fundamental interval), and this is the special solution we shall encounter later on in connection with elliptic functions, and which can also be easily obtained by a synthetic argument via the corresponding rotations. If $\cos\Omega = 0$, then $\alpha = \frac{\pi}{2}$ or $\alpha = \frac{3\pi}{2}$, also arrived at by a geometric reasoning. We also observe that, as a sort of consistency check, the basic equation (5.17) comes from the irreducibility condition $(b_1 b_2)^3 = \pm I_2$ (Schur's lemma; recall that
the left-hand side generates the center $Z$ of $B_3$). Indeed (again with $\beta = \alpha$), upon resorting to the above formulae, $\cos\frac{3\psi}{2} = \pm 1$, whence $\psi = 0, \frac{2\pi}{3}, \frac{4\pi}{3}$, and
\[ \cos\frac{\psi}{2} = x^2 - \cos\Omega \, y^2 = A \qquad (5.24) \]
with $A = 1, \pm\frac{1}{2}$. The only case consistent with the braid equation (5.21), for $y \neq 0$, is $A = \frac{1}{2}$ and corresponds to $(b_1 b_2)^3 = (b_1^2 b_2)^2 = -I_2$. Recall that we always have (Cayley–Hamilton), for $U$ being either $U_a(\alpha)$ or $U_b(\alpha)$,
\[ U^2 - 2 \cos\frac{\alpha}{2} \, U + I_2 = 0 \qquad (5.25) \]
(the eigenvalues of both matrices are clearly $e^{\pm i\alpha/2}$) and, for $\alpha = \pi$, $b_i^2 = U^2 = -I_2$. By continuity with respect to $\Omega$, this remains true for all representations involved. Clearly, everything depends just on $\Omega$ (thence on $\alpha$) and not on the direction of a fixed axis $a$.

Ad (ii). This part is also immediate. One has, indeed:
\[ \begin{aligned} S &= \sin\frac{\alpha}{2} \, i\sigma \cdot (a + b) \\ U &= \frac{1}{2} I_2 + \sin\frac{\alpha}{2} \cos\frac{\alpha}{2} \, i\sigma \cdot (a + b) - \sin^2\frac{\alpha}{2} \, i\sigma \cdot a \times b \end{aligned} \qquad (5.26) \]
with respective rotation angles equal to $\pi$ and $\frac{2\pi}{3}$. It is immediately verified that the rotation axis of $S$ bisects the angle $\Omega$. A short calculation using
\[ \| a + b \|^2 = 4 \cos^2\frac{\Omega}{2} \qquad (5.27) \]
and (5.17) shows that the angle $\Upsilon$ between the axes of $S$ and $U$ fulfills the condition
\[ \tan\Upsilon = \mp \tan\frac{\alpha}{2} \sin\frac{\Omega}{2}. \qquad (5.28) \]
The minus sign is necessary for $\alpha \in [\frac{\pi}{3}, \pi]$, whereas the plus sign is to be employed for $\alpha \in [\pi, \frac{5\pi}{3}]$. Notice that the special case $\Omega = \frac{2\pi}{3}$ yields $\Upsilon = \frac{\pi}{2}$. The particular case $\Omega = \frac{\pi}{2}$ is also notable: $\Upsilon = \arctan(-\frac{1}{\sqrt{2}})$. Further elaboration yields, for $S$,
\[ S = i\sigma \cdot u \qquad (5.29) \]
with $u = \frac{a + b}{\| a + b \|}$. Notice that any unit vector $u$ may appear in the above formula and that the explicit dependence on $\alpha$ (and $\Omega$) has been stored in $\Upsilon$.

Remarks. 1. Observe that all non-trivial special unitary representations of $B_3$ are genuine braid group representations in the sense that they do not induce representations of the symmetric group $S_3$: indeed, this is the case if the extra condition $b_1^2 = b_2^2 = 1$ is fulfilled, which never happens unless the representation is trivial. The characters of the representations read, in turn, $\chi(U_a(\alpha)) = \mathrm{Tr}(U_a(\alpha)) = 2 \cos\frac{\alpha}{2} = \chi(U_b(\alpha))$.
2. The above unitary representations of the modular group induce unitary representations on any tensor product (and in particular on the symmetric part, of dimension $n + 1$, see the following sections on entanglement). 3. Formula (5.18) has been obtained, independently and in a different guise, in [34].

Now let us compare our geometric treatment with Kauffman's algebraic one; this is done via a short computation, yielding the following:

Proposition 5.2 (Comparison with Kauffman). Let $A = e^{i\varphi}$, $\delta = -A^2 - A^{-2}$. Consider the (Temperley–Lieb algebra) representations of $B_3$ on $C^2$ devised by Kauffman [28, 29]
\[ \tilde b_1 = \begin{pmatrix} A + A^{-1}\delta & 0 \\ 0 & A \end{pmatrix}, \qquad \tilde b_2 = \begin{pmatrix} A + A^{-1}\delta^{-1} & A^{-1}\sqrt{1 - \delta^{-2}} \\ A^{-1}\sqrt{1 - \delta^{-2}} & A + A^{-1}(\delta - \delta^{-1}) \end{pmatrix}. \qquad (5.30) \]
Unitarity (and non-triviality) is ensured for $\delta^2 > 1$, i.e. for $\{|\varphi| < \frac{\pi}{6}\} \cup \{|\varphi - \pi| < \frac{\pi}{6}\} \cup \{|\varphi - \frac{\pi}{2}| < \frac{\pi}{6}\} \cup \{|\varphi - \frac{3\pi}{2}| < \frac{\pi}{6}\}$. Then $\tilde b_j = iA^{-1} b_j$, $j = 1, 2$, with $b_j \in SU(2)$, and the $b_j$'s have rotation angle $\psi = \pi - 4\varphi$ and the angle $\Omega$ between their induced rotation axes fulfills Eq. (5.18), i.e.
\[ \cos\Omega = -\frac{\cos 4\varphi}{\cos 4\varphi + 1} \qquad (5.31) \]
and Kauffman's condition $\delta^2 > 1$ is tantamount to $\cos 4\varphi > -\frac{1}{2}$ (i.e. $|\cos\Omega| < 1$).

Remarks. 1. The last two angle ranges are omitted in [29–31]. 2. Low-dimensional representations of $B_3$ have also been discussed in [52, 51] in a purely algebraic fashion.

Before stating our next result, we record, again for the sake of completeness, the unitary R-matrix used in the Kauffman–Lomonaco (KL) paper [31]
\[ R = \begin{pmatrix} 1/\sqrt{2} & 0 & 0 & 1/\sqrt{2} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} & 0 \\ 0 & 1/\sqrt{2} & 1/\sqrt{2} & 0 \\ -1/\sqrt{2} & 0 & 0 & 1/\sqrt{2} \end{pmatrix}. \qquad (5.32) \]
Also, we quickly review the definition of the Jones polynomial $V_L = V_L(q)$ (or rather $V_L = V_L(q^{1/2})$, $q$ being a formal parameter with "square root" $q^{1/2}$) for an oriented link $L$ via the skein relation
\[ q^{-1} V_{L_+} - q V_{L_-} = (q^{1/2} - q^{-1/2}) V_{L_0} \qquad (5.33) \]
together with the normalization condition $V_{\bigcirc} = 1$ for the unknot $\bigcirc$; from this it follows, if $L \sqcup \bigcirc$ denotes the disjoint union of a link $L$ with the unknot, that $V_{L \sqcup \bigcirc} = -(q^{1/2} + q^{-1/2}) V_L$. As usual, the three links in question differ by a single crossing (a choice of a plane projection being understood; positive, negative, no crossing, respectively). Exhaustive discussion concerning $V_L$ can be found in many
references, see, e.g., [28]. The bracket polynomial approach of Kauffman provides the shortest route thereto. In the sequel we shall take $q \in S^1 \subset C$, with a suitably chosen square root. We are now in a position to state the following:

Theorem 5.3. (i) Taking the braid representation (class) with $\alpha = \Omega = \frac{\pi}{2}$ (in Kauffman's description, it corresponds, e.g., to $\delta = -\sqrt{2}$, $\varphi = \frac{\pi}{8}$), the standard trace Tr thereupon fulfills the KL-skein relation [33] associated to the unitary R-matrix above, reading:
\[ \mathrm{Tr}(b_+) + \mathrm{Tr}(b_-) = \sqrt{2} \, \mathrm{Tr}(b_0). \qquad (5.34) \]
(ii) Furthermore, the same trace computes the Jones polynomial (for links obtained by closing 3-braids) for the values $q = -i$, $q^{1/2} = e^{i\frac{3}{4}\pi}$, and for $q = +i$, $q^{1/2} = e^{i\frac{5}{4}\pi}$.

Proof. The proof is straightforward, via the basic formulae of the present Section; we shall verify (i), an entirely similar computation yielding (ii). As for the former, in view of the properties of Tr it is enough to check it for $b_+ = b \cdot b_j$, $b_- = b \cdot (b_j)^{-1}$. We set $b \equiv b_0 = \cos\frac{\gamma}{2} I_2 + \sin\frac{\gamma}{2} \, i\sigma \cdot \mathbf{b}$, $\|\mathbf{b}\| = 1$, with $\Omega_j$ denoting the angle between $\mathbf{b}$ and the rotation axis of $b_j$. We easily find $\mathrm{Tr}(b_\pm) = 2 (\cos\frac{\gamma}{2} \cos\frac{\pi}{4} \mp \sin\frac{\gamma}{2} \sin\frac{\pi}{4} \cos\Omega_j)$, whence
\[ \mathrm{Tr}(b_+) + \mathrm{Tr}(b_-) = 4 \cos\frac{\gamma}{2} \cdot \frac{1}{\sqrt{2}} = \sqrt{2} \cdot 2 \cos\frac{\gamma}{2} = \sqrt{2} \, \mathrm{Tr}(b_0). \qquad (5.35) \]
Since the unknot can be realized as the closure of the braid $b_1 b_2$, and one has $\mathrm{Tr}(b_1 b_2) = 1$, the proof is complete.

We now discuss a specialized construction involving braiding of the Weierstraß roots. We shall recover the only essentially different lattices admitting a complex multiplication, i.e. the "square" lattice and the "equilateral triangle" lattice, at one stroke via the same unitary $B_3$ representation, corresponding to $\pi$-rotations around two oriented axes forming a $\frac{2\pi}{3}$ angle. In detail, we consider the following problem: find the unitary representations of $B_3$ involving "concrete" braiding of three specific quantum states in the 1-qubit space (looked upon as points on a unit sphere). A quick reflection shows that these three points must necessarily form an equilateral triangle, necessarily inscribed in a great circle. The braid generators induce rotations of angle $\pi$, and their corresponding axes form an angle $\frac{2\pi}{3}$, i.e. we abut at the "extremal" representation class previously found. Stereographic projection (which preserves generalized circles) onto a general plane passing through the center of the sphere yields a triangle inscribed in the projected circle. Two specific situations arise: in the first case, the projection plane coincides with the one determined by the triangle, this yielding the equilateral triangle lattice. In the second case, the planes in question are perpendicular, and the three roots are collinear, and simple geometric reasoning gives $e_1 = \sqrt{3}$, $e_2 = 0$, $e_3 = -\sqrt{3}$ (according to the
usual convention). This yields the square lattice, with Jacobi modulus (squared) $k^2 = \frac{e_2 - e_3}{e_1 - e_3} = \frac{1}{2}$. The above discussion leads, in particular, to the following:

Theorem 5.4. There exists a unique "physical" (i.e. with Jacobi modulus $0 < k^2 < 1$) unitary representation (class) of the 3-strand braid group $B_3$ (and thence of the modular group), causing braiding of the three roots $e_1 = \sqrt{3}$, $e_2 = 0$, $e_3 = -\sqrt{3}$ of the natural elliptic cubic, in Weierstraß form, and it is the one associated to the Jacobi modulus $k^2 = \frac{e_2 - e_3}{e_1 - e_3} = \frac{1}{2}$, with $\alpha = \pi$, $\Omega = \frac{2\pi}{3}$ (and $\Upsilon = \frac{\pi}{2}$).

Remark. Notice that in this specific case we get two pairs of antipodal points on the sphere, namely the poles and two antipodal equatorial points, yielding for the change of basis matrix from one pair to the other the gate $PH$, with $P$ the phase shift gate and $H$ the Hadamard gate, see e.g. [20, 35].

6. Geometric Entanglement Criteria

In this section we present a general entanglement criterion. We resort to the Segre embedding, familiar from classical algebraic geometry (see, e.g., [26, 7]). This approach is also briefly outlined in [14], but it will be useful to discuss it more explicitly. Let us review the Segre and Veronese embeddings, referring to [26] for full details. Given (complex) vector spaces $V$ and $W$ of respective dimensions $n + 1$ and $m + 1$, the Segre map $S : P(V) \times P(W) \to P(V \otimes W)$ (the latter space has then dimension $(n + 1)(m + 1) - 1$) is intrinsically given by $([v], [w]) \mapsto [v \otimes w]$. In terms of homogeneous coordinates, it reads (obvious notation)
\[ S : P^n \times P^m \to P^{(n+1)(m+1)-1}, \qquad ([Z_i], [W_j]) \mapsto [Z_i W_j] \qquad (6.1) \]
where $i = 0, \ldots, n$, $j = 0, \ldots, m$ and lexicographic ordering is adopted. The Veronese map $\nu_d : P(V) \to P(\mathrm{Sym}^d V) \to P(V^{\otimes d})$ is intrinsically given by $[v] \mapsto [v \otimes \cdots \otimes v] \equiv [v^d]$. Here $\mathrm{Sym}^d V$ denotes the $d$th symmetric tensor power of $V$. If $\dim V = 2$, we get a curve in $P^d$, called the rational normal curve. It is immediately checked that the image of $\nu_d$ is given by the common zero locus of the polynomials $Z_i Z_j - Z_{i-1} Z_{j+1}$, $1 \leq i \leq j \leq d - 1$.

Here $(V, \langle \cdot | \cdot \rangle)$ will be again a Hilbert space of dimension 2, with a choice of an orthonormal basis $\{|0\rangle, |1\rangle\}$, with 1-dimensional associated complex projective space $P(V) \cong P^1 \cong S^2$. Concretely, and also in view of further analysis later on, one may consider the space of polarization states for a monochromatic electromagnetic wave. The chosen orthonormal basis may represent the (right and left-handed) circularly polarized states, yielding the eigenstates of the helicity operator $H$ (the analogue of spin for photons, see [20] and Sec. 8 for further discussion of this point). Thus $V$ can be regarded as the carrier of the fundamental representation of $U(2) = SU(2) \times U(1)$. Let $V^{\otimes n}$ denote the $n$-fold tensor product of $V$ (the $n$-qubit space). In view of enforcement of Bose–Einstein statistics, we are also interested in $\mathrm{Sym}^n V$, the fully
symmetric part of $V^{\otimes n}$, which, upon resorting to the Clebsch–Gordan theory (see, e.g., [41, 37, 39]), is given by $V_{n/2}$, the $(n + 1)$-dimensional space pertaining to the $\frac{n}{2}$-spin representation (of $SU(2)$). A state in $P(V^{\otimes n})$ (which has dimension $2^n - 1$, a Mersenne number) is (completely) disentangled if it is of the form $[\xi_1 \otimes \cdots \otimes \xi_n]$, i.e. if it comes from a decomposable vector $|\xi_1 \cdots \xi_n\rangle$. These states build up the (generalized) Segre variety $X \subset P^{2^n - 1}$. The corresponding Veronese curve describes the completely symmetric and disentangled states. Since it is nonlinear, it is not physically realizable (no cloning theorem). In particular, in the 1-qubit space case only the chosen basis vectors $|0\rangle$ and $|1\rangle$ can be copied and $P(V)$ is embedded via $\nu_2$ into $P(\mathrm{Sym}^2 V)$ as a conic $C$ (whose only physically realizable states are then $|00\rangle$ and $|11\rangle$).

Although the following theorem can be subsumed by a more general result (see, e.g., [14], and below), it is possibly useful to discuss it separately, in view of its special importance, and for the explicit proof we give. The notation is as follows: the projective space (homogeneous) coordinates of a point in $P^{2^n - 1}$ can be represented as $[Z_\gamma]$, $\gamma = 0, \ldots, 2^n - 1$, with $\gamma$ written in binary form, so, for instance, if $n = 3$ one has $[Z_{000}, Z_{001}, \ldots, Z_{111}]$; the suffix $\alpha 0_k$ (with $\alpha = 0, 1, \ldots, 2^{n-1} - 1$) is just a string of $n$ binary digits given by the ones of $\alpha$, with the $k$th position occupied by 0 (so they are $n - 1$). A similar meaning is attached to $\alpha 1_k$. Thus, for example, if $n = 4$, $\alpha = 5$, $k = 3$, one has $\alpha 0_k = 1001$.

Theorem 6.1. (i) The set of completely disentangled states is an algebraic subvariety (generalized Segre variety) $X \subset P^{2^n - 1}$ of dimension $n$ and degree $n!$ cut out set-theoretically by the family of quadratic polynomials
\[ Q_{\alpha, k} = Z_{0 0_k} Z_{\alpha 1_k} - Z_{0 1_k} Z_{\alpha 0_k} \qquad (6.2) \]
where $\alpha = 1, \ldots, 2^{n-1} - 1$ and $k = 1, 2, \ldots, n - 1$, i.e. $X$ is the common zero locus of the $(n - 1) \cdot (2^{n-1} - 1)$ polynomials $Q_{\alpha, k}$; geometrically, $X$ is the intersection of the quadric hypersurfaces $Q_{\alpha, k} = 0$. Equivalently, $X$ is the common zero locus of the polynomials
\[ Q_{\alpha, \beta, k} = Z_{\alpha 0_k} Z_{\beta 1_k} - Z_{\alpha 1_k} Z_{\beta 0_k} \qquad (6.3) \]
where $\alpha, \beta = 0, 1, \ldots, 2^{n-1} - 1$ ($\alpha \neq \beta$) and $k = 1, 2, \ldots, n - 1$. (ii) A recursive change of coordinates procedure can be devised so as to produce an "optimal" set of $2^n - n - 1$ equations.

Proof. The (necessary and sufficient) disentanglement conditions for the first particle state read
\[ \frac{\alpha_0^{(1)}}{\alpha_1^{(1)}} = \frac{Z_{0\beta}}{Z_{1\beta}}, \qquad \beta = 0, 1, \ldots, 2^{n-1} - 1. \qquad (6.4) \]
Thus we get $2^{n-1} - 1$ equations for the $Z$'s. The fact that $k$ ranges from 1 to $n - 1$ is clear since the conditions for $k = n$ are automatically fulfilled if the preceding ones
are (if $n - 1$ states are disentangled, the remaining one is such). Thus we obtain $(n - 1) \cdot (2^{n-1} - 1)$ equations, which can be put in the form $Q_{\alpha,k} = 0$. Vanishing denominator situations are easily handled.

Now, if we denote the homogeneous coordinates of $P^{2^{n-1} - 1}$ collectively by $Z'$, we get, for the embedding $P^1 \times P^{2^{n-1} - 1} \to P^{2^n - 1}$, the equations
$$Z_{0\beta} = \alpha^{(1)}_0 \cdot Z'_\beta, \qquad Z_{1\beta} = \alpha^{(1)}_1 \cdot Z'_\beta, \tag{6.5}$$
which enable us to compute $Z'_\beta$, in view of (6.4). The special case in which one of the $\alpha$'s vanishes is easily settled, and corresponds to a disentangled state containing one of the basis vectors in the first copy of $V$. Then, proceeding inductively, we get $(2^{n-1} - 1) + (2^{n-2} - 1) + \cdots + (2^0 - 1) = 2^n - 1 - n$ equations locally cutting out, set theoretically, the variety $X$ (this number equals the codimension of $X$). The above procedure can be easily algorithmically implemented.

Remarks.
1. The above proof can be used to check partial entanglement conditions as well, i.e. whether a certain “particle” is disentangled from the others.
2. An entanglement criterion similar to ours has been discussed by Kauffman and Lomonaco in [32]; however, it seems to have only a “local” character, in the sense that it works only in the local chart $Z_{00\cdots0} \neq 0$ (with their notation, $a_{00\cdots0} \neq 0$). For example, the manifestly entangled state (for $n = 3$) given by $|1\rangle(|00\rangle + |01\rangle + 2|10\rangle + |11\rangle)$ fulfils the KL-criterion. It does not satisfy ours: $Z_{100} Z_{111} - Z_{110} Z_{101} = -1 \neq 0$ (here $\alpha = 10$, $\beta = 11$, $k = 2$).
3. See, e.g., [25, 26, 7] for the notion of degree of a variety.

As a simple application of the above criterion, we observe that the symmetry (or antisymmetry) operator is in general entangling, i.e. transforms a disentangled quantum state into an entangled one. Specifically, we consider the following example: take the $n$-particle state vector $\Psi = |0\alpha\rangle$, $\alpha \neq 0 \cdots 0$ ($n - 1$ binary digits). Then its symmetrization $S\Psi$ induces an entangled state. Indeed, the initial state has just one non vanishing component $Z_{0\alpha} = 1$. In view of the above assumption, $S\Psi$ is a superposition of the states labeled by the appropriately permuted digits containing $|1\beta\rangle$, for some $\beta$. Then $Z_{1\beta} = 1$ (it is not necessary to normalize). But clearly $Z_{1\alpha} = Z_{0\beta} = 0$, whence $Z_{0\alpha} Z_{1\beta} - Z_{1\alpha} Z_{0\beta} = 1 \neq 0$, yielding the conclusion. Actually, one has the following:

Proposition 6.2.
Any symmetric disentangled state must be of the form $[\xi^n]$, $\xi \neq 0$, i.e. it is a point on the Veronese curve. The latter can be cut out by the above quadrics $Q_{\alpha,\beta,k} = 0$, in addition to the hyperplanes $Z_\gamma - Z_{\sigma \cdot \gamma} = 0$, with $\sigma$ denoting any permutation from the symmetric group $S_n$ acting on $\gamma \in \{0, \dots, 2^n - 1\}$, written in binary form (redundancies occur). Thus one arrives again at an intersection of quadrics.

We may then consider the following general situation.
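The quadric criterion of Theorem 6.1 lends itself to a direct numerical check. The following Python sketch is only an illustration under stated assumptions, not the authors' implementation: an unnormalized $n$-qubit state is stored as a dictionary from binary strings to amplitudes (a representation chosen here), and the helper names `insert_digit`, `segre_quadrics`, `is_disentangled` and `product_state` are hypothetical. It evaluates the polynomials $Q_{\alpha,\beta,k}$ of (6.3) for $k = 1, \dots, n - 1$.

```python
from itertools import product


def insert_digit(s, k, digit):
    """Insert `digit` so that it occupies the k-th position (1-based) of the result."""
    return s[:k - 1] + digit + s[k - 1:]


def segre_quadrics(Z, n):
    """Yield (alpha, beta, k, value) for the quadrics Q_{alpha,beta,k} of (6.3).

    Z maps binary strings of length n to amplitudes; missing keys count as 0.
    """
    labels = [''.join(bits) for bits in product('01', repeat=n - 1)]
    for k in range(1, n):  # k = 1, ..., n - 1
        for a in labels:
            for b in labels:
                if a == b:
                    continue
                value = (Z.get(insert_digit(a, k, '0'), 0) * Z.get(insert_digit(b, k, '1'), 0)
                         - Z.get(insert_digit(a, k, '1'), 0) * Z.get(insert_digit(b, k, '0'), 0))
                yield a, b, k, value


def is_disentangled(Z, n):
    """True if every quadric Q_{alpha,beta,k} vanishes on Z."""
    return all(value == 0 for _, _, _, value in segre_quadrics(Z, n))


def product_state(factors):
    """Amplitudes of a decomposable vector; factors[i] = (a0, a1) for the i-th qubit."""
    Z = {}
    for bits in product('01', repeat=len(factors)):
        amp = 1
        for (a0, a1), bit in zip(factors, bits):
            amp *= a1 if bit == '1' else a0
        Z[''.join(bits)] = amp
    return Z


# The state |1>(|00> + |01> + 2|10> + |11>) from Remark 2 (n = 3).
kl_example = {'100': 1, '101': 1, '110': 2, '111': 1}
quadric_values = {(a, b, k): v for a, b, k, v in segre_quadrics(kl_example, 3)}
```

With integer amplitudes the test is exact; for floating-point amplitudes one would compare against a tolerance instead of testing equality with zero.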
Let us consider the Segre embedding (representing the full “partial entanglement” hierarchy)
$$S : P^{n_1} \times P^{n_2} \times \cdots \times P^{n_m} \to P^N \tag{6.6}$$
with $n_i = 2^{\ell_i} - 1$, $N = 2^n - 1$, $\sum_{i=1}^m \ell_i = n$. Indeed, one checks: $\prod_{i=1}^m (n_i + 1) - 1 = \prod_{i=1}^m 2^{\ell_i} - 1 = 2^n - 1 = N$. However, the construction below is completely general. Denote points in $P^{n_i}$ via their homogeneous coordinates as follows: $z^{(i)} := [z^{(i)}_0, z^{(i)}_1, \dots, z^{(i)}_{n_i}]$. A set of coordinates for a point of the target projective space $P^N$ (lexicographic order employed) reads as
$$[Z_{i_1 i_2 \cdots i_m}], \qquad i_k = 0, 1, \dots, n_k, \quad k = 1, 2, \dots, m. \tag{6.7}$$
Let $N_j$ denote the dimension of the projective space wherein the product of the remaining factors (i.e. other than $P^{n_j}$) of the full cartesian product $P^{n_1} \times P^{n_2} \times \cdots \times P^{n_m}$ is embedded, that is
$$N_j = \prod_{i \neq j} (n_i + 1) - 1. \tag{6.8}$$
We also introduce a notation analogous to the previous one: set $\alpha = (i_1 i_2 \cdots \hat{i}_k \cdots i_m)$, the hat meaning omission, $i_j = 0, 1, \dots, n_j$, $j = 1, 2, \dots, m$. Then, for example, $(\alpha, j_k)$ means insertion of $j_k$ at the $k$th position etc. Then we have, explicitly,
$$S : (z^{(1)}, z^{(2)}, \dots, z^{(m)}) \mapsto [Z_{i_1 i_2 \cdots i_m} = z^{(1)}_{i_1} z^{(2)}_{i_2} \cdots z^{(m)}_{i_m}] \tag{6.9}$$
with $i_k = 0, 1, \dots, n_k$, and $k = 1, 2, \dots, m$.

Theorem 6.3. (i) With the above notation, the image of the Segre embedding is given as the common zero locus of the quadratic polynomials
$$Q_{\alpha,\beta,i_k,j_k} := Z_{\alpha i_k} Z_{\beta j_k} - Z_{\alpha j_k} Z_{\beta i_k} = 0, \qquad \alpha \neq \beta. \tag{6.10}$$
(ii) The number of admissible “decompositions” $(n_1, n_2, \dots, n_m)$, corresponding to $(\ell_1, \ell_2, \dots, \ell_m)$ (up to ordering and including the trivial embedding) is equal to Euler's partitio numerorum $p(n)$, i.e. the number of ways of decomposing a positive integer into a sum of positive integers, up to order.

Proof. The above set of equations is immediately written down starting from the parametric form of the Segre embedding. Conversely, it is easily seen that any point in $P^N$ fulfilling the above equations comes from a point in $P^{n_1} \times P^{n_2} \times \cdots \times P^{n_m}$: the above equations are indeed enough to determine the ratios $z^{(k)}_{i_k} / z^{(k)}_{j_k}$, say, for
$i_k, j_k = 0, 1, \dots, n_k$, and $k = 1, 2, \dots, m$. Explicitly (and temporarily assuming non vanishing quantities throughout):
$$\frac{z^{(k)}_{i_k}}{z^{(k)}_{j_k}} = \frac{z^{(1)}_{i_1} \cdots z^{(k)}_{i_k} \cdots z^{(m)}_{i_m}}{z^{(1)}_{i_1} \cdots z^{(k)}_{j_k} \cdots z^{(m)}_{i_m}} = \frac{Z_{i_1 \cdots i_k \cdots i_m}}{Z_{i_1 \cdots j_k \cdots i_m}} \equiv \frac{Z_{\alpha i_k}}{Z_{\alpha j_k}} \tag{6.11}$$
for all $\alpha$ as above, yielding (6.10) and we have some redundancy coming from $z^{(k)}_{i_k} / z^{(k)}_{j_k} = z^{(k)}_{i_k} / z^{(k)}_{h_k} \cdot z^{(k)}_{h_k} / z^{(k)}_{j_k}$. However, we must keep all equations in order to handle non generic situations (i.e. points lying in “hyperplanes at infinity”; in the previous situation we had just points since all projective spaces were 1-dimensional). For generic points
$$\sum_{j=1}^{m-1} n_j N_j = \sum_{j=1}^{m-1} (2^{\ell_j} - 1) \cdot (2^{n - \ell_j} - 1) \tag{6.12}$$
equations suffice. It is also possible to devise a recursive procedure, as in the preceding theorem, to get, locally, a minimal set of $N - \sum_{j=1}^m n_j$ equations cutting out $X$. Assertion (ii) is clear.

Remarks.
1. The former entanglement criterion (and the number of equations obtained) is a special instance of the latter, when $\ell_i = 1$ for all $i = 1, 2, \dots, n$.
2. Recall that Euler's function $p$ is given via the identity (for a formal parameter $q$)
$$\prod_{n=1}^{\infty} \frac{1}{1 - q^n} = 1 + \sum_{n=1}^{\infty} p(n) q^n. \tag{6.13}$$
We point out, in passing, that $p$ also emerges in the expression of the $S^1$-equivariant $L^2$-index of the Dirac operator on loops in flat spaces [47].
3. The above theorem can be easily extended verbatim to partial symmetric entanglements as well. One has a substantial simplification in dimensional complexity, since one goes from $2^{\ell} - 1$ to $\ell$.

7. On the Geometry of Quantum 2-Gates

This section furnishes an application of the preceding techniques and it is meant to provide a projective geometric interpretation of the KL R-matrix discussed above, and it is quite close to the discussion of spin 1-systems given in [12], see also [35, 20, 31, 55] for the standard algebraic approach. Consider the so-called Bell basis in $V \otimes V$ given by $(\varphi^+, \varphi^-, \psi^+, \psi^-)$, with:
$$\begin{aligned} |\varphi^+\rangle &= \frac{1}{\sqrt{2}} (|00\rangle + |11\rangle), & |\varphi^-\rangle &= \frac{1}{\sqrt{2}} (|00\rangle - |11\rangle), \\ |\psi^+\rangle &= \frac{1}{\sqrt{2}} (|01\rangle + |10\rangle), & |\psi^-\rangle &= \frac{1}{\sqrt{2}} (|01\rangle - |10\rangle). \end{aligned} \tag{7.1}$$
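That the four Bell states are entangled can be read off from the Segre quadric for $n = 2$, namely $x_{00} x_{11} - x_{01} x_{10}$, which vanishes exactly on decomposable vectors. The short Python sketch below is an illustration only; it uses unnormalized integer amplitudes in the order $(x_{00}, x_{01}, x_{10}, x_{11})$, and the names `segre_quadric` and `BELL` are introduced here for the example.

```python
def segre_quadric(x00, x01, x10, x11):
    """Value of the quadric x00*x11 - x01*x10; it is zero exactly on product states."""
    return x00 * x11 - x01 * x10


# Unnormalized Bell states, amplitudes ordered as (x00, x01, x10, x11).
BELL = {
    "phi+": (1, 0, 0, 1),
    "phi-": (1, 0, 0, -1),
    "psi+": (0, 1, 1, 0),
    "psi-": (0, 1, -1, 0),
}

bell_values = {name: segre_quadric(*amps) for name, amps in BELL.items()}

# A decomposable vector (a0|0> + a1|1>) (b0|0> + b1|1>) for comparison.
a0, a1, b0, b1 = 2, 3, 5, 7
product_value = segre_quadric(a0 * b0, a0 * b1, a1 * b0, a1 * b1)
```

Normalizing by $1/\sqrt{2}$ rescales each value by $1/2$ and does not affect whether it vanishes.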
We have the following:

Theorem 7.1. The basis $(\psi^-, \psi^+, \varphi^+, \varphi^-)$ of $V \otimes V \cong C^4$ (made up of entangled states) gives rise, projectively, to a self-polar tetrahedron in $P^3$ (with respect to the polarity induced by the (Segre) quadric $Q$ of disentangled states), namely, the equation of the quadric $Q$ takes (after appropriate adjustment) the projective canonical form
$$\xi_0^2 + \xi_1^2 + \xi_2^2 + \xi_3^2 = 0. \tag{7.2}$$
Taking the plane $\pi_\infty : \xi_3 = 0$ as the plane at infinity, the conic $C = \pi_\infty \cap Q$ is the image of the Bloch sphere $P(V)$ under the Veronese map.

Proof. Consider the following modified Bell basis $(\tilde\varphi^+, \tilde\varphi^-, \tilde\psi^+, \tilde\psi^-)$, with $\tilde\varphi^+ = \varphi^+$, $\tilde\varphi^- = -i\varphi^-$, $\tilde\psi^+ = -i\psi^+$, $\tilde\psi^- = \psi^-$ (they give rise to the same states), with respective coordinates $(\xi_0, \xi_1, \xi_2, \xi_3)$. One has (obvious notation)
$$\begin{aligned} \xi_0 &= \frac{1}{\sqrt{2}} (x_{00} + x_{11}), & \xi_1 &= \frac{i}{\sqrt{2}} (x_{00} - x_{11}), \\ \xi_2 &= \frac{i}{\sqrt{2}} (x_{01} + x_{10}), & \xi_3 &= \frac{1}{\sqrt{2}} (x_{01} - x_{10}), \end{aligned} \tag{7.3}$$
(notice that the corresponding matrix is $R$, up to minor modifications). Therefore, the equation of $Q$ becomes
$$\xi_0^2 + \xi_1^2 + \xi_2^2 + \xi_3^2 = 2(x_{00} x_{11} - x_{01} x_{10}) = 0 \tag{7.4}$$
as claimed. Intersecting it with $\pi_\infty$, we see that $C$ coincides with the Veronese curve on that plane (indeed $\xi_3 = 0$ enforces the symmetry condition $x_{01} = x_{10}$). The geometrical assertions come from rephrasing in classical algebro-geometric language; also, the points $[\varphi^+]$ and $[\varphi^-]$ lie on the polar of $[\psi^+]$ with respect to $C$, and, together with $[|00\rangle]$ and $[|11\rangle]$, belonging to $C$, give rise to a harmonic quadruple (in an appropriate order), whereas the tangents drawn therefrom meet in $[\psi^+]$.

Remark. By virtue of a theorem of J. L. and R. Brylinski [15], the change of basis $R$ yields a universal quantum gate.

8. Brunnian Links, Projective Geometry and Measurement

In this section we wish to point out the emergence of a possibly interesting geometrical pattern in discussing measurements made upon particular entangled states. We first resume the discussion begun in Sec. 6. The eigenvalues of the helicity operator $H$ are $\pm n, \pm(n - 2), \dots, \pm(n - 2[\frac{n}{2}])$, with (non normalized) eigenvectors given (up to phase) below, starting from $H|0\rangle = |0\rangle$, $H|1\rangle = -|1\rangle$:
$$\phi_n = |0 \cdots 0\rangle, \quad \phi_{n-2} = |1 \cdots 0\rangle + |01 \cdots 0\rangle + \cdots + |0 \cdots 1\rangle, \quad \cdots \quad \phi_{-n} = |1 \cdots 1\rangle. \tag{8.1}$$
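The eigenvectors in (8.1) are easy to generate explicitly. The Python sketch below is an illustration under assumptions made here: the $n$-particle helicity is taken to act on a basis string as the sum of the single-particle values ($+1$ for the digit 0, $-1$ for the digit 1), states are dictionaries from binary strings to amplitudes, and the function names `helicity`, `phi` and `apply_helicity` are hypothetical.

```python
from itertools import combinations


def helicity(bits):
    """Helicity of a basis string: +1 for each digit 0, -1 for each digit 1."""
    return sum(1 if b == '0' else -1 for b in bits)


def phi(n, ones):
    """Unnormalized symmetric sum of all basis strings of length n with `ones` digits 1."""
    state = {}
    for positions in combinations(range(n), ones):
        bits = ''.join('1' if i in positions else '0' for i in range(n))
        state[bits] = 1
    return state


def apply_helicity(state):
    """Apply the (diagonal) helicity operator to a state given as {bits: amplitude}."""
    return {bits: helicity(bits) * amp for bits, amp in state.items()}


# phi(n, w) plays the role of the eigenvector with eigenvalue n - 2*w.
phi_3_1 = phi(3, 1)             # |100> + |010> + |001>
image = apply_helicity(phi_3_1)
```

Each basis string with `ones` digits equal to 1 has helicity `n - 2 * ones`, so the symmetric sum is an eigenvector for that value.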
The (non normalized) state (vector) $\phi_n + \phi_{-n} = |0 \cdots 0\rangle + |1 \cdots 1\rangle$ is a generalized GHZ-state (see, e.g., [24]): a measurement of the helicity carried out upon any particle yields a completely disentangled state as outcome (either $|0 \cdots 0\rangle$ or $|1 \cdots 1\rangle$). According to the suggestion of Aravind, this arrangement (state plus measurement!) can be depicted by a Brunnian link, i.e. a link such that removing any of its components yields a trivial link (we ought to observe at this point that there are inequivalent Brunnian links with $n$ components, for $n \ge 4$, see [44, 43]); in the case $n = 3$ we find the celebrated Borromean rings. We now wish to show that similar remarks apply to the states $\phi_{n-2}$ etc., confining ourselves to the first one. The following statement is easily proved, and we refer to any book in graph theory (e.g., [10]) for the basic terminology.

Proposition 8.1. (i) All potential successive measurements of the state
$$\phi_{n-2} := f_1 + f_2 + \cdots + f_n \tag{8.2}$$
give rise to an oriented graph which can be geometrically portrayed as follows: its nodes are the vertices of the simplex $(f_1, f_2, \dots, f_n)$ in $C^n$ where
$$f_j := |00 \cdots 1 \cdots 0\rangle \tag{8.3}$$
(1 at the $j$th position), together with the barycentres of its various subsimplices; in total, they amount to $2^n - 1$. The $n + 1 = (n - 1) + 2$ points $[f_1], [f_2], \dots, [f_n], [\phi_{n-2}]$ provide a projective frame for the complex $(n - 1)$-dimensional projective space corresponding to $\langle f_1, f_2, \dots, f_n \rangle$, with $[\phi_{n-2}]$ being the unit point. Furthermore, upon passing to $F_2$-coefficients ($F_2$ being the Galois field with two elements), one gets the projective space $P(F_2^n)$. Its arrows connect a barycentre with a basis vertex and with the (sub)face opposite to it.
(ii) The successive measurements of the state $\phi_{n-2}$ (with respect to a fixed particle, or, better, position) give rise to a binary tree $(B_{n-2}, B_{n-3}, \dots, B_0, B_0')$. The leaves $B_i$ can be depicted as Brunnian (or Borromean) links of decreasing complexity. The last two leaves are (two-component) Hopf links.

Comment. We briefly discuss the case $n = 3$. Upon measuring helicity in the state $[f_1 + f_2 + f_3]$, if, say, we measure 1 at the first position, then we get $[f_1]$, which is completely disentangled, so the leaf $B_1$ is represented by the Borromean rings. Upon measuring 0, we find $[f_2 + f_3]$, and the state is partially entangled; a successive measurement (of the second particle) produces a disentangled state in both cases, so the corresponding leaves $B_0$ and $B_0'$ can be both represented by a Hopf link (discarding a disjoint circle given by the first measured particle). Geometric interpretation leads to the well-known (projective) Fano plane.

9. Conclusions and Outlook

In this paper we tried to elucidate some issues related to quantum and topological entanglement mostly relying on geometrical methods. The relationship between
elliptic function theory, braid groups and quantum mechanics certainly deserves deeper scrutiny. A geometric approach in the spirit of the present one could shed extra light on important issues such as quantum teleportation. Also, the generalization of Jacobi elliptic functions devised in [6] could possibly have a strong relevance on the matters discussed here, towards extensions to general $n$-qubit spaces. We hope to be able to delve further into these problems elsewhere.

Acknowledgments

The authors are grateful to A. Giacobbe, P. Marchetti, E. Previato, M. C. Ronconi, N. Sansonetto and E. Zizioli for useful discussions. They also thank the Referee for his/her critical remarks on a previous version of this work. Financial support from M.I.U.R. (ex 60%) is acknowledged.

References

[1] Y. Aharonov and J. Anandan, Geometry of quantum evolution, Phys. Rev. Lett. 65 (1990) 1697–1700.
[2] P. K. Aravind, Borromean entanglement of the GHZ state, in Potentiality, Entanglement and Passion-at-a-Distance, eds. R. S. Cohen, M. Horne and J. Stachel (Kluwer Academic Publishers, Boston, 1997).
[3] M. Atiyah, The Geometry and Physics of Knots, Lezioni Lincee (Cambridge University Press, Cambridge, 1990).
[4] M. Audin, Courbes algébriques et systèmes intégrables: Géodésiques des quadriques, Expo. Math. 12 (1994) 193–226.
[5] S. Axelrod, S. Della Pietra and E. Witten, Geometric quantization of Chern–Simons gauge theory, J. Diff. Geom. 33 (1991) 787–902.
[6] L. Bates and R. Cushman, Complete integrability beyond Liouville–Arnol'd, Rep. Math. Phys. 12 (2005) 77–91.
[7] M. C. Beltrametti, E. Carletti, D. Gallarati and G. Monti Bragadin, Letture su curve, superficie e varietà proiettive speciali. Un'introduzione alla geometria algebrica (Bollati Boringhieri, Torino, 2002) (in Italian).
[8] A. Benvegnù, N. Sansonetto and M. Spera, Remarks on geometric quantum mechanics, J. Geom. Phys. 51 (2004) 229–243.
[9] J. Birman, Braids, Links and Mapping Class Groups, Annals of Mathematical Studies, Vol. 82 (Princeton, NJ, 1974).
[10] B. Bollobás, Extremal Graph Theory (Dover, New York, 1978); reprinted (2005).
[11] D. C. Brody and L. P. Hughston, The quantum canonical ensemble, J. Math. Phys. 39 (1998) 2586–2592.
[12] D. C. Brody and L. P. Hughston, Geometric quantum mechanics, J. Geom. Phys. 38 (2001) 19–53.
[13] J. L. Brylinski, Loop Spaces, Characteristic Classes and Geometric Quantization (Birkhäuser, Basel, 1993).
[14] J. L. Brylinski, Algebraic measures of entanglement, in Mathematics of Quantum Computation, eds. R. Brylinski and G. Chen, Computational Mathematics Series (Chapman & Hall/CRC Press, Boca Raton, Florida, 2002), pp. 3–23.
[15] J. L. Brylinski and R. Brylinski, Universal quantum gates, in Mathematics of Quantum Computation, eds. R. Brylinski and G. Chen, Computational Mathematics Series (Chapman & Hall/CRC Press, Boca Raton, Florida, 2002), pp. 101–116.
[16] R. Cirelli, M. Gatti and A. Manià, On the non-linear extension of quantum superposition and uncertainty principles, J. Geom. Phys. 29 (1999) 64–86.
[17] R. Cirelli, M. Gatti and A. Manià, The pure state space of quantum mechanics as Hermitian symmetric space, J. Geom. Phys. 45 (2003) 267–284.
[18] R. Cirelli, A. Manià and L. Pizzocchero, Quantum mechanics as an infinite dimensional Hamiltonian system with uncertainty structure, Parts I and II, J. Math. Phys. 31 (1990) 2891–2897 and 2898–2903.
[19] R. Cirelli and L. Pizzocchero, On the integrability of quantum mechanics as an infinite-dimensional Hamiltonian system, Nonlinearity 3 (1990) 259–268.
[20] D. Chruściński and A. Jamiolkowski, Geometric Phases in Classical and Quantum Mechanics (Birkhäuser, Boston, 2004).
[21] M. do Carmo, Riemannian Geometry (Birkhäuser, Boston, 1992).
[22] S. Gallot, D. Hulin and J. Lafontaine, Riemannian Geometry (Springer, Heidelberg, 1987).
[23] D. Giulini, E. Joos, C. Kiefer, J. Kupsch, I. O. Stamatescu and H. D. Zeh, Decoherence and the Appearance of Classical World in Quantum Theory (Springer, Heidelberg, 2003).
[24] D. Greenberger, M. Horne, A. Shimony and A. Zeilinger, Bell's theorem without inequalities, Am. J. Phys. 58 (1990) 1131–1143.
[25] P. Griffiths and J. Harris, Principles of Algebraic Geometry (J. Wiley & Sons, New York, 1978).
[26] J. Harris, Algebraic Geometry: A First Course (Springer-Verlag, New York, 1992).
[27] L. P. Hughston, Geometric aspects of quantum mechanics, in Twistor Theory, ed. S. Huggett (Marcel Dekker, Inc., 1995), pp. 59–79.
[28] L. Kauffman, Knots and Physics, 3rd edn. (World Scientific, Singapore, 2001).
[29] L. Kauffman, Quantum computing and the Jones polynomial, Cont. Math. 305 (2002) 100–137.
[30] L. Kauffman and S. Lomonaco, Quantum entanglement and topological entanglement, New J. Phys. 4 (2002) 73.1–73.18.
[31] L. Kauffman and S. Lomonaco, Braiding operators are universal quantum gates, New J. Phys. 6 (2004) 134.
[32] L. Kauffman and S. Lomonaco, Entanglement criteria - Quantum and topological, in Quantum Information and Computation - Spie Proceedings, Orlando, Florida, USA, Vol. 5105 (April, 2003), pp. 51–58.
[33] L. Kauffman and S. Lomonaco, Quantum knots, arXiv:quant-ph/0403228.
[34] L. Kauffman and S. Lomonaco, q-deformed spin networks, knot polynomials and anyonic topological quantum computation, arXiv:quant-ph/0606114v2.
[35] A. Yu. Kitaev, A. H. Shen and M. N. Vyalyi, Classical and Quantum Computation (AMS, Providence, RI, 2002).
[36] S. Kobayashi, On conjugate and cut loci, in Studies in Global Geometry and Analysis (MAA, Prentice-Hall, Englewood Cliffs, NJ, 1967), pp. 96–122.
[37] L. D. Landau and M. E. Lifšits, Quantum Mechanics (Pergamon, London, 1960).
[38] H. McKean and V. Moll, Elliptic Curves (Cambridge University Press, Cambridge, 1999).
[39] A. Messiah, Mécanique Quantique I, II (Dunod, Paris, 1959, 1962).
[40] G. D. Mostow, Braids, hypergeometric functions, and lattices, Bull. Am. Math. Soc. 16 (1987) 225–246.
[41] M. Naimark and A. Stern, Théorie des Représentations des Groupes (MIR, Moscow, 1979).
[42] C. Nash, Differential Topology and Quantum Field Theory (Academic Press, London, 1991).
[43] V. Penna and M. Spera, Higher order linking numbers, curvature and holonomy, J. Knot Theory Ram. 11 (2002) 701–723.
[44] D. Rolfsen, Knots and Links (Publish or Perish, Berkeley, 1976).
[45] C. L. Siegel, Topics in Complex Function Theory, Vol. I (Wiley, New York, 1969, 1988).
[46] M. Spera, On a generalized uncertainty principle, coherent states, and the moment map, J. Geom. Phys. 12 (1993) 165–182.
[47] M. Spera and T. Wurzbacher, The Dirac–Ramond operator on loops in flat space, J. Funct. Analysis 197 (2003) 110–139.
[48] M. Spera and T. Wurzbacher, Twistor spaces and spinors over loop spaces, Preprint LMAM Université de Metz (January, 2005).
[49] A. Thimm, Integrabilität beim geodätisch Fluss, Bonner Math. Schrift B. 10 (1978); ibid., dissertation, Universität Bonn (1980); Integrable geodesic flows, Ergodic Theory Dynam. Systems 1 (1981) 495–517.
[50] F. Tricomi, Funzioni Ellittiche (Zanichelli, Bologna, 1937) (in Italian).
[51] I. Tuba, Low-dimensional representations of $B_3$, Proc. Amer. Math. Soc. 129 (2001) 2597–2606.
[52] I. Tuba and H. Wenzl, Representations of the braid group $B_3$ and of $SL(2, Z)$, Pacific J. Math. 197 (2001) 491–509.
[53] E. T. Whittaker and G. N. Watson, A Course of Modern Analysis (Cambridge University Press, Cambridge, 1927), 4th edn., reprinted (1980).
[54] E. Witten, Quantum field theory and the Jones polynomial, Commun. Math. Phys. 121 (1989) 351–399.
[55] Y. Zhang, Teleportation, braid group and Temperley–Lieb algebra, arXiv:quant-ph/0601050.
December 15, 2006 16:52 WSPC/148-RMP
J070-00287
Reviews in Mathematical Physics Vol. 18, No. 10 (2006) 1103–1154 © World Scientific Publishing Company
THE TOPOLOGY OF T-DUALITY FOR $T^n$-BUNDLES
ULRICH BUNKE∗, PHILIPP RUMPF† and THOMAS SCHICK‡
∗Mathematisches Institut, Georg-August-Universität Göttingen, Bunsenstr. 3, 37073 Göttingen, Germany [email protected]
†Fakultät für Mathematik, Universität Münster, Einsteinstr. 62, 48149 Münster, Germany p [email protected]
‡Mathematisches Institut, Georg-August-Universität Göttingen, Bunsenstr. 3, 37073 Göttingen, Germany [email protected]
Received 26 April 2005
Revised 17 September 2006
In string theory, the concept of T-duality between two principal $T^n$-bundles $E$ and $\hat{E}$ over the same base space $B$, together with cohomology classes $h \in H^3(E, Z)$ and $\hat{h} \in H^3(\hat{E}, Z)$, has been introduced. One of the main virtues of T-duality is that $h$-twisted K-theory of $E$ is isomorphic to $\hat{h}$-twisted K-theory of $\hat{E}$. In this paper, a new, very topological concept of T-duality is introduced. We construct a classifying space for pairs as above with additional “dualizing data”, with a forgetful map to the classifying space for pairs (also constructed in the paper). On the first classifying space, we have an involution which corresponds to passage to the dual pair, i.e. to each pair with dualizing data there exists a well defined dual pair (with dualizing data). We show that a pair $(E, h)$ can be lifted to a pair with dualizing data if and only if $h$ belongs to the second step of the Leray–Serre filtration of $E$ (i.e. not always), and that in general many different lifts exist, with topologically different dual bundles. We establish several properties of the T-dual pairs. In particular, we prove a T-duality isomorphism of degree $-n$ for twisted K-theory.

Keywords: Topological T-duality; twisted K-theory.

Mathematics Subject Classification 2000: 55R15, 55T10
Contents
1. Introduction
2. Topological T-duality via T-duality triples
3. The space $R_n$
4. The T-duality group and the universal triple
5. Pairs and triples
6. T-Duality transformations in twisted cohomology
7. Classification of T-duality triples and extensions
Appendix A. Twists, spectral sequences and other conventions
1. Introduction

1.1. String theory is a part of mathematical quantum physics. Its ultimate goal is the construction of quantum theories modeling the basic structures of our universe. More specifically, a string theory should associate a quantum field theory to a target consisting of a manifold equipped with further geometric structures like metrics, complex structures, vector bundles with connections, etc. A schematic picture is
$$\text{target} \xrightarrow{\ \text{string theory}\ } \text{quantum field theory}.$$
The target is thought to encode fundamental properties of the universe. Actually there are several types of string theories, where the most important ones for the present paper are called type IIA and type IIB (see [18, Chap. 10]).

1.2. T-duality is a relation between two string theories on the level of quantum field theories to the effect that two different targets can very well lead to the same quantum field theory. The simplest example is the duality of bosonic string theories on the circles of radius $R$ and $R^{-1}$ (see [18, Chap. 8]). A relevant problem is to understand the factorization of the T-duality given on the level of quantum theory through T-duality on the level of targets. Schematically it is the problem of understanding the dotted arrow (on the left) in
$$\begin{array}{ccc} \text{target} & \xrightarrow{\ \text{string theory, e.g. IIA}\ } & \text{quantum field theory} \\ \vdots\ {\scriptstyle \text{target level T-duality}} & & \Big\updownarrow\ {\scriptstyle \text{quantum level T-duality}} \\ \text{target} & \xrightarrow{\ \text{string theory, e.g. IIB}\ } & \text{quantum field theory.} \end{array}$$
The problem starts with the question of existence, and even of the meaning of such an arrow.

1.3. T-duality on the target level is an intensively studied object in physics as well as in mathematics. We are not qualified to review the extensive relevant literature here, but let us mention mirror symmetry as one prominent aspect, mainly studied in algebraic geometry (see, e.g., [20]).

1.4. In general, the target of a string theory is a manifold equipped with further geometric structures which in physics play the role of low-energy effective fields. The problem of topological T-duality can be understood schematically as the question of studying the dotted arrow (on the right) in the following diagram.
$$\begin{array}{ccc} \text{target} & \xrightarrow{\ \text{forget geometry}\ } & \text{underlying topological space} \\ \Big\updownarrow\ {\scriptstyle \text{target level T-duality}} & & \vdots\ {\scriptstyle \text{topological T-duality}} \\ \text{target} & \xrightarrow{\ \text{forget geometry}\ } & \text{underlying topological space.} \end{array}$$
1.5. At this level one faces the following natural problems:
(1) How can one characterize the topological T-dual of a topological space? It is not a priori clear that this is possible at all.
(2) If one understands the characterization of T-duals on the topological level, then one wonders if a given space admits a T-dual.
(3) Given a satisfactory characterization of topological T-duals one asks for a classification of T-duals of a given space.
As long as string theory is not part of rigorous mathematics the answer to the first question has to be found by physical reasoning and is part of the construction of mathematical models. Once an answer has been proposed the remaining two questions can be studied rigorously by methods of algebraic topology. This is the philosophy of the present paper. For a certain class of spaces to be explained below we propose a mathematical characterization of topological T-duals. On this basis we then present a thorough and rigorous study of the existence and classification problems.

1.6. The expression “space” has to be understood in a somewhat generalized sense since we consider targets with additional non-trivial B-field background. There are several possibilities to model these backgrounds mathematically. In the present paper we use an axiomatic approach going under the notion of a twist, see Sec. A.1.

1.7. Topological T-duality in the presence of non-trivial B-field backgrounds has been studied mainly in the case of $T^n$-principal bundles ([1–5, 14–16]). Our proposal for the characterizations of T-duals in terms of T-duality triples is strongly based on the analysis made in these papers.

1.8. The quantum field theory level T-duality predicts transformation rules for the low-energy effective fields which are objects of classical differential geometry like metrics and connections on the $T^n$-bundle, but also more exotic objects like a connective structure and a curving of the B-field background (these notions are explained in the framework of gerbes, e.g., in [12]). These transformation rules are known as Buscher rules [9, 10].

1.9. The Buscher rules provide local rules for the behavior of the geometric objects under T-duality on the target level. The underlying spaces of the targets (being principal bundles on manifolds) are locally isomorphic. Therefore, topological T-duality is really interesting only on the global level. The idea for setting up a characterization of a topological T-dual comes from the desire to realize the Buscher transformation rules globally. The analysis of this transition from geometry to topology has been started in the case of circle bundles, e.g., [2] and continued including the higher dimensional case with [3, 5, 1], without stating a precise mathematical definition of topological T-duality there.
1.10. Currently, such a precise mathematical definition of topological T-duality has to be given in an ad hoc manner. For $T^n$-bundles with twists we know three possibilities:
(1) A definition in the framework of non-commutative geometry can be extracted from the works [14–16] and will be explained in 2.26.
(2) The homotopy theoretic definition used in the present paper is based on the notion of a T-duality triple (see Definition 2.4).
(3) Following an idea of T. Pantev, in a forthcoming paper [8] we propose a definition of topological T-duality for $T^n$-bundles with twists using Pontrjagin duality for topological group stacks.
Surprisingly, all three definitions eventually lead to equivalent theories of topological T-duality for $T^n$-bundles with twists (the equivalence of (1) and (2) is shown in [19], and the equivalence of (2) and (3) is shown in [8]). This provides strong evidence for the fact that these definitions for topological T-duality correctly reflect the T-duality on the target or even quantum theory level.

1.11. If two spaces (with twist, i.e. B-field background) are in T-duality then this has strong consequences for some of their topological invariants. For example, there are distinguished isomorphisms (called T-duality isomorphisms, see Definition 2.18) between their twisted cohomology groups and twisted K-theory groups. The existence of these T-duality isomorphisms has already been observed in [1, 14] and their follow-ups. The desire for a T-duality isomorphism actually was one of our main guiding principles which led to the introduction of the notion of a T-duality triple and therefore our mathematical definition of topological T-duality.

1.12. Having understood T-duality on the level of underlying topological spaces one can now lift back to the geometric level. We hope that the topological classification results (and their natural generalizations to topological stacks in order to include non-free $T^n$-actions) will find applications to mirror symmetry in algebraic geometry and string theory.

2. Topological T-duality via T-duality triples

2.1. In this section we propose a mathematical set-up for topological T-duality of total spaces of $T^n$-bundles with twists and give detailed statements of our classification results. We will also shed some light on the relation with other pictures in the literature.

2.2. In the present paper we will use elements of the homotopy classification theory of principal fiber bundles [13, Chap. 4]. Therefore, spaces in the present paper are always assumed to be Hausdorff and paracompact.

2.3. Let us fix a base space $B$ and $n \in N$. By
$$T^n := \underbrace{U(1) \times \cdots \times U(1)}_{n\text{-factors}}$$
we denote the $n$-torus. The fundamental notion of the theory is that of a pair.

Definition 2.1. A pair $(E, h)$ over $B$ consists of a principal $T^n$-bundle $E \to B$ and a cohomology class $h \in H^3(E, Z)$. An isomorphism of pairs $\phi : (E, h) \to (E', h')$ is an isomorphism
$$\begin{array}{ccc} E & \xrightarrow{\ \phi\ } & E' \\ & \searrow \quad \swarrow & \\ & B & \end{array}$$
of $T^n$-principal bundles such that $\phi^* h' = h$. We let $P(B)$ denote the set of isomorphism classes of pairs over $B$.

We can extend $P$ to a functor $P : \{\text{Spaces}\}^{op} \to \{\text{Sets}\}$. Let $f : B' \to B$ be a continuous map and $(E, h) \in P(B)$. Then we define $(E', h') := P(f)(E, h)$ as the pull-back of $(E, h)$. More precisely, the $T^n$-bundle $E' \to B'$ is defined by the pull-back diagram
$$\begin{array}{ccc} E' & \xrightarrow{\ F\ } & E \\ \downarrow & & \downarrow \\ B' & \xrightarrow{\ f\ } & B, \end{array}$$
and $h' := F^* h$.

2.4. The study of topological T-duality started with the case of circle bundles, i.e. $n = 1$. Guided by the experience obtained in [1–3, 14], a mathematical definition of topological T-duality for pairs in the case $n = 1$ was given in [6]. In the latter paper T-duality appears in two flavors. On the one hand, T-duality is a relation (see [6, Definition 2.9]) which may or may not be satisfied by two pairs $(E, h)$ and $(\hat{E}, \hat{h})$ over $B$. The relation has a cohomological characterization. We will not recall the details of the definition here since it will be equivalent to Definition 2.4 in terms of T-duality triples (reduced to the case $n = 1$). On the other hand we construct in [6] a T-duality transformation, a natural automorphism of functors of order two
$$T : P \to P, \tag{2.1}$$
which assigns to each pair $(E, h)$ a specific T-dual $(\hat{E}, \hat{h}) := T(E, h)$. The existence of such a transformation is a special property of the case $n = 1$. It has already been observed in [3, 6, 14] that such a transformation cannot exist for general higher dimensional torus bundles. The first reason is that for $n \ge 2$ not every pair admits a T-dual, which implies that $T$ in (2.1) could at most be partially defined. An additional obstruction (to a partially defined transformation) is the non-uniqueness of T-duals.
U. Bunke, P. Rumpf & Th. Schick
2.5. In order to describe topological $T$-duality in the higher dimensional ($n > 1$) case we introduce the notion of a $T$-duality triple. To this end we must categorify the third integral cohomology using the notion of twists. There are various models for twists, some of which are reviewed in Sec. A.1. The reader not familiar with the concept of twists and twisted cohomology theories is advised to consult this appendix. The results of the present paper are independent of the choice of the model. Therefore, let us once and for all fix a model for twists. Let us recall the essential properties of twists used in the constructions below. First of all we have a transformation
\[ \{\text{category of twists over } B\}/\text{isomorphism} \xrightarrow{\ \cong\ } H^3(B, \mathbb{Z}) \]
which is natural in $B$. For a twist $H$ we let $[H]$ denote the cohomology class corresponding to the isomorphism class of $H$. Furthermore, given isomorphic twists $H, H'$, the set $\mathrm{Hom}_{\mathrm{Twists}}(H, H')$ is a torsor over $H^2(B, \mathbb{Z})$, and this structure is again compatible with the functoriality in $B$. In this paper we frequently identify the based set of automorphisms $\mathrm{Hom}_{\mathrm{Twists}}(H, H)$ with $H^2(B, \mathbb{Z})$. For a twist $H$ over $B$ we will use the schematic notation
\[ H \longrightarrow B \]
which acquires real sense if one realizes twists as gerbes or bundles of compact operators over $B$.

2.6. We fix an integer $n \ge 1$ and a connected base space $B$ with a base point $b \in B$. A $T^n$-principal bundle $\pi : F \to B$ is classified by an $n$-tuple of Chern classes $c_1, \dots, c_n \in H^2(B, \mathbb{Z})$. Let $\hat\pi : \hat F \to B$ be a second $T^n$-principal bundle with Chern classes $\hat c_1, \dots, \hat c_n \in H^2(B, \mathbb{Z})$. Let $H$ be a twist on $F$ such that its characteristic class lies in the second filtration step of the Leray–Serre spectral sequence filtration, i.e. satisfies $[H] \in F^2 H^3(F, \mathbb{Z})$ (see Sec. A.2 for notation). Furthermore we assume that its leading part fulfills
\[ [H]^{2,1} = \Bigl[ \sum_{i=1}^{n} y_i \otimes \hat c_i \Bigr] \in {}^{\pi}E_\infty^{2,1}, \tag{2.2} \]
where $y_i$ are generators of the cohomology of the fibre $U(1)^n$ of $F$, compare again Sec. A.2. Similarly, let $\hat H$ be a twist on $\hat F$ such that $[\hat H] \in F^2 H^3(\hat F, \mathbb{Z})$ and (with similar notation)
\[ [\hat H]^{2,1} = \Bigl[ \sum_{i=1}^{n} \hat y_i \otimes c_i \Bigr] \in {}^{\hat\pi}E_\infty^{2,1}. \tag{2.3} \]
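We assume that we have an isomorphism of twists $u : \hat p^* \hat H \to p^* H$ as indicated in the diagram
\[
\begin{array}{ccccc}
 & p^* H & \xleftarrow{\ u\ } & \hat p^* \hat H & \\
 & & \downarrow & & \\
H & & F \times_B \hat F & & \hat H \\
\downarrow & \swarrow\, p & \downarrow r & \hat p\, \searrow & \downarrow \\
F & & & & \hat F \\
 & \searrow\, \pi & & \hat\pi\, \swarrow & \\
 & & B & &
\end{array}
\tag{2.4}
\]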
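We require that this isomorphism satisfies the condition $P(u)$ which we now describe. Let $F_b$ and $\hat F_b$ denote the fibers of $F$ and $\hat F$ over $b \in B$ and consider the induced diagram
\[
\begin{array}{ccccc}
 & & F_b \times \hat F_b & & \\
 & \swarrow\, p_b & & \hat p_b\, \searrow & \\
F_b & & & & \hat F_b
\end{array}
\]
The assumptions on $H$ and $\hat H$ imply the existence of isomorphisms $v : H|_{F_b} \xrightarrow{\sim} 0$ and $\hat v : 0 \xrightarrow{\sim} \hat H|_{\hat F_b}$. We now consider the composition
\[
u(b) := \Bigl( 0 \xrightarrow{\ \hat p_b^* \hat v\ } \hat p_b^* \hat H|_{\hat F_b} \xrightarrow{\ u|_{F_b \times \hat F_b}\ } p_b^* H|_{F_b} \xrightarrow{\ p_b^* v\ } 0 \Bigr) \in H^2(F_b \times \hat F_b, \mathbb{Z}). \tag{2.5}
\]
The condition $P(u)$ requires that
\[
[u(b)] = \Bigl[ \sum_{i=1}^{n} y_i \cup \hat y_i \Bigr] \in H^2(F_b \times \hat F_b, \mathbb{Z}) / (\mathrm{im}(p_b^*) + \mathrm{im}(\hat p_b^*)). \tag{2.6}
\]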
The class $[u(b)]$ in this quotient is well-defined, independent of the choice of $v$ and $\hat v$.

Definition 2.2. An $n$-dimensional $T$-duality triple over $B$ is a triple
\[ ((F, H), (\hat F, \hat H), u) \]
consisting of $T^n$-bundles $\pi : F \to B$, $\hat\pi : \hat F \to B$, twists $H \to F$, $\hat H \to \hat F$ satisfying Eqs. (2.2) and (2.3), respectively, and an isomorphism $u : \hat p^* \hat H \xrightarrow{\sim} p^* H$ (for notation see (2.4)) which satisfies condition $P(u)$.

Definition 2.3. We will say that the triple $((F, H), (\hat F, \hat H), u)$ extends the pair $(F, [H])$ and connects the two pairs $(F, [H])$ and $(\hat F, [\hat H])$.
2.7. We can now define our notion of topological $T$-duality based on $T$-duality triples.

Definition 2.4. Two pairs $(F, h)$ and $(\hat F, \hat h)$ over $B$ are in $T$-duality if there is a $T$-duality triple connecting them.

The main results of the present paper concern the following problems:
(1) Classification of isomorphism classes of $T$-duality triples over $B$.
(2) Classification of $T$-duality triples which connect two given pairs.
(3) Existence and classification of $T$-duality triples extending a given pair.

2.8. There is a natural notion of an isomorphism of $T$-duality triples. Its details will be spelled out in Definition 4.5. If $f : B \to B'$ is a continuous map, and $x' := ((F', H'), (\hat F', \hat H'), u')$ is a $T$-duality triple over $B'$, then one defines a $T$-duality triple $((F, H), (\hat F, \hat H), u) = f^* x'$ over $B$ in a canonical way. First of all the underlying $T^n$-bundles are given by the pull-back diagrams
\[
\begin{array}{ccc}
F & \xrightarrow{\ \phi\ } & F' \\
\downarrow & & \downarrow \\
B & \xrightarrow{\ f\ } & B'
\end{array}
\qquad\qquad
\begin{array}{ccc}
\hat F & \xrightarrow{\ \hat\phi\ } & \hat F' \\
\downarrow & & \downarrow \\
B & \xrightarrow{\ f\ } & B'
\end{array}
\]
Then we define the twists $H := \phi^* H'$ and $\hat H := \hat\phi^* \hat H'$. Finally we consider the induced map $\psi := (\phi, \hat\phi) : F \times_B \hat F \to F' \times_{B'} \hat F'$ and define $u$ as the composition
\[ \hat p^* \hat H \cong \psi^* (\hat p')^* \hat H' \xrightarrow{\ \psi^* u'\ } \psi^* (p')^* H' \cong p^* H \]
of natural isomorphisms and the pull-back of $u'$ via $\psi$.

Definition 2.5. We define the functor $\mathrm{Triple}_n : \{\text{spaces}\}^{op} \to \{\text{sets}\}$ which associates to a space $B$ the set of isomorphism classes $\mathrm{Triple}_n(B)$ of $n$-dimensional $T$-duality triples over $B$.

2.9. In Lemma 7.1 we will observe that the functor $\mathrm{Triple}_n$ is homotopy invariant. In general, given a contravariant homotopy invariant functor from spaces to sets one asks whether it can be represented by a classifying space. If this is the case, then the functor can be studied by applying methods of algebraic topology to its classifying space. Our study of the functor $\mathrm{Triple}_n$ follows this philosophy.

2.10. In the following we describe a space $R_n$ which will turn out to be a classifying space of the functor $\mathrm{Triple}_n$ by Theorem 2.8. Consider the product of two copies of the Eilenberg–MacLane space $K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2)$ with canonical generators $x_1, \dots, x_n$ and $\hat x_1, \dots, \hat x_n$ of the second integral cohomology. We consider the class $q := \sum_{i=1}^{n} x_i \cup \hat x_i$ as a map $q : K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2) \to K(\mathbb{Z}, 4)$.
Definition 2.6. Let $R_n$ be the homotopy fiber of $q$.

We consider the two components of the map $(c, \hat c) : R_n \to K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2)$ as the classifying maps of two $T^n$-principal bundles $\pi_n : F_n \to R_n$ and $\hat\pi_n : \hat F_n \to R_n$. By a calculation of the cohomology of $F_n$, $\hat F_n$ and $F_n \times_{R_n} \hat F_n$ we show the following theorem.

Theorem 2.7 (Theorem 4.6). There exists a unique isomorphism class of $n$-dimensional $T$-duality triples $[x_{n,univ}] = [(F_n, H_n), (\hat F_n, \hat H_n), u_n] \in \mathrm{Triple}_n(R_n)$ with underlying $T^n$-bundles isomorphic to $F_n$ and $\hat F_n$.

Let $P_n$ denote the set-valued functor classified by $R_n$. This functor associates to $B$ the set $P_n(B)$ of homotopy classes $[f]$ of maps $f : B \to R_n$. The universal triple $[x_{n,univ}]$ induces a natural transformation of functors $\Psi_B : P_n(B) \to \mathrm{Triple}_n(B)$ by
\[ \Psi_B([f]) := \mathrm{Triple}_n(f)[x_{n,univ}] = f^* [x_{n,univ}]. \]
The following theorem characterizes $R_n$ as a classifying space of the functor $\mathrm{Triple}_n$.

Theorem 2.8 (Theorem 7.12). The natural transformation $\Psi$ is an isomorphism of functors.

2.11. In order to prove Theorem 2.8 we must investigate the fine structure of the functor $\mathrm{Triple}_n$. Of particular importance is the following action of $H^3(B, \mathbb{Z})$ on $\mathrm{Triple}_n(B)$ (see 7.3). Let $x := ((F, H), (\hat F, \hat H), u)$ represent a class $[x] \in \mathrm{Triple}_n(B)$, and let $\alpha \in H^3(B, \mathbb{Z})$. We choose a twist $V$ in the class $\alpha$ and set
\[ x + V := ((F, H \otimes \pi^* V), (\hat F, \hat H \otimes \hat\pi^* V), u \otimes r^* \mathrm{id}_V) \]
(see diagram (2.4) for the definition of $r$). Then we define $[x] + \alpha := [x + V]$.

We now consider the set $\mathrm{Triple}_n^{(F, \hat F)}(B)$ of isomorphism classes of $n$-dimensional $T$-duality triples over fixed $T^n$-bundles $F$ and $\hat F$ (see 7.2). The group $H^3(B, \mathbb{Z})$ acts naturally on $\mathrm{Triple}_n^{(F, \hat F)}(B)$ by the same construction as above.

Proposition 2.9 (Proposition 7.3). $\mathrm{Triple}_n^{(F, \hat F)}(B)$ is an $H^3(B, \mathbb{Z})$-torsor.
2.12. In terms of the classifying spaces, fixing $F$ and $\hat F$ corresponds to fixing classifying maps $(c, \hat c) : B \to K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2)$. The set $\mathrm{Triple}_n^{(F, \hat F)}(B)$ then corresponds to the set of homotopy classes of lifts $f$ in the diagram
\[
\begin{array}{ccc}
 & & R_n \\
 & \nearrow\, f & \downarrow (c, \hat c) \\
B & \xrightarrow{\ (c, \hat c)\ } & K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2)
\end{array}
\]
Since the homotopy fiber of $(c, \hat c) : R_n \to K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2)$ has the homotopy type of a $K(\mathbb{Z}, 3)$-space it is clear by obstruction theory that $H^3(B, \mathbb{Z})$ acts freely and transitively on the set of such lifts. In combination with Proposition 2.9 this leads to the key step in the proof that $R_n$ is the correct classifying space.
2.13. Now let $\psi$ and $\hat\psi$ be bundle automorphisms of $F$ and $\hat F$. We can realize $\psi$ and $\hat\psi$ as right multiplication by maps $\psi, \hat\psi : B \to T^n \cong K(\mathbb{Z}^n, 1)$. In this way the homotopy classes of $\psi$ and $\hat\psi$ can be considered as classes $[\psi], [\hat\psi] \in H^1(B, \mathbb{Z}^n)$. Let $x := ((F, H), (\hat F, \hat H), u)$ be an $n$-dimensional $T$-duality triple. Then we form the triple
\[ x^{(\psi, \hat\psi)} := ((F, \psi^* H), (\hat F, \hat\psi^* \hat H), (\psi, \hat\psi)^* u). \]
We introduce the notation $\hat c \cup [\psi] := \sum_{i=1}^{n} \hat c_i \cup [\psi]_i \in H^3(B, \mathbb{Z})$, where $\hat c_1, \dots, \hat c_n$ are the components of the Chern class of $\hat F$, and $[\psi]_1, \dots, [\psi]_n$ are the components of $[\psi]$. We define $c \cup [\hat\psi]$ similarly. Then we show:

Proposition 2.10 (Proposition 7.17). In $\mathrm{Triple}_n^{(F, \hat F)}(B)$ we have
\[ [x^{(\psi, \hat\psi)}] = [x] + \hat c \cup [\psi] + c \cup [\hat\psi]. \]

There is a natural forgetful map
\[ \Psi : \mathrm{Triple}_n^{(F, \hat F)}(B) \to \mathrm{Triple}_n(B). \]
Recall the definition of the map $r$ in diagram (2.4) and note that $\mathrm{im}({}^{r}d_2^{2,1}) \subseteq H^3(B, \mathbb{Z})$ (see Sec. A.2 for notation) is exactly the subgroup of elements which can be written in the form $c \cup a + \hat c \cup b$ for $a, b \in H^1(B, \mathbb{Z}^n)$. Proposition 7.17 immediately implies:

Corollary 2.11. If $\alpha \in \mathrm{im}({}^{r}d_2^{2,1}) \subseteq H^3(B, \mathbb{Z})$, then we have $\Psi([x] + \alpha) = \Psi([x])$.
2.14. Let $(e_1, \dots, e_n, \hat e_1, \dots, \hat e_n)$ be the standard basis of $\mathbb{Z}^{2n}$. Let $O(n, n, \mathbb{Z}) \subset GL(2n, \mathbb{Z})$ be the subgroup of transformations which fix the quadratic form $q : \mathbb{Z}^{2n} \to \mathbb{Z}$ with
\[ q\Bigl( \sum_{i=1}^{n} a_i e_i + b_i \hat e_i \Bigr) := \sum_{i=1}^{n} a_i b_i. \]

Proposition 2.12 (Lemma 4.1). The group $O(n, n, \mathbb{Z})$ acts by homotopy equivalences on $R_n$. We have an induced action of $O(n, n, \mathbb{Z})$ on the functor $\mathrm{Triple}_n$ by automorphisms.

In the literature this group is sometimes called the $T$-duality group.

2.15. Recall Definition 2.1 of the functor $B \mapsto P(B)$ which associates to a space $B$ the set of isomorphism classes of $n$-dimensional pairs over $B$. We will write $\tilde P_{(0)} := P$ since this functor appears at the lowest level of a tower of functors $\tilde P_{(0)} \leftarrow \tilde P_{(1)} \leftarrow \cdots$ (see 5.4). In the notation for these functors we will not indicate the dimension $n$ of the torus $T^n$ explicitly. The functor $\tilde P_{(0)}$ is homotopy invariant (the proof of [6, Lemma 2.2] extends from the case $n = 1$ to arbitrary $n \ge 1$). Generalizing again the approach of [6] from the case $n = 1$ to general $n \ge 1$ we construct a classifying space $\tilde R_n(0)$ for the functor $\tilde P_{(0)}$ as follows. Let $U^n \to K(\mathbb{Z}^n, 2)$ be the universal $T^n$-bundle. Then we define
\[ \tilde R_n(0) := U^n \times_{T^n} \mathrm{Map}(T^n, K(\mathbb{Z}, 3)). \]
The natural map $\tilde R_n(0) \to K(\mathbb{Z}^n, 2)$ classifies a $T^n$-principal bundle $\tilde F_n(0) \to \tilde R_n(0)$ which admits a natural map $\tilde F_n(0) \to K(\mathbb{Z}, 3)$. We interpret the homotopy
class of this map as a class $\tilde h(0) \in H^3(\tilde F_n(0), \mathbb{Z})$. The isomorphism class of the universal pair $[\tilde F_n(0), \tilde h(0)] \in \tilde P_{(0)}(\tilde R_n(0))$ induces a natural transformation of functors $\tilde v_B : [B, \tilde R_n(0)] \to \tilde P_{(0)}(B)$ (see Lemma 5.1) which turns out to be an isomorphism.

2.16. In Sec. 5 we introduce the one-connected cover $\tilde R_n \to \tilde R_n(0)$. It is the universal covering of a certain connected component of $\tilde R_n(0)$. The first entry of the universal triple $((F_n, H_n), (\hat F_n, \hat H_n), u_n)$ over $R_n$ gives rise to a classifying map $f(0) : R_n \to \tilde R_n(0)$. We shall see (Lemma 5.3) that $f(0)$ has a factorization
\[
\begin{array}{ccc}
 & & \tilde R_n \\
 & \nearrow\, f & \downarrow \\
R_n & \xrightarrow{\ f(0)\ } & \tilde R_n(0)
\end{array}
\tag{2.7}
\]
Note that the factorization $f$ is not unique.

Theorem 2.13 (Theorem 5.3). The map $f : R_n \to \tilde R_n$ is a weak homotopy equivalence.

2.17. There are two natural transformations of functors
\[
\begin{array}{ccc}
 & \mathrm{Triple}_n & \\
\swarrow\, s & & \hat s\, \searrow \\
P & & P
\end{array}
\]
where
\[ s((F, H), (\hat F, \hat H), u) := (F, [H]), \qquad \hat s((F, H), (\hat F, \hat H), u) := (\hat F, [\hat H]). \]
The problem of the existence and the classification of $T$-duals of a pair $(F, h) \in P(B)$ is essentially a question about the fiber $s^{-1}(F, h) \subseteq \mathrm{Triple}_n(B)$. The transformation $s$ is realized on the level of classifying spaces by the map $\tilde R_n \to \tilde R_n(0)$ in diagram (2.7). This allows us to translate questions about the fibers of $s$ into homotopy theory.

2.18. Consider a pair $(F, h)$ over a space $B$. The representatives of elements of $s^{-1}(F, h)$ will be called extensions of $(F, h)$.

Definition 2.14. An extension of $(F, h)$ to an $n$-dimensional $T$-duality triple is an $n$-dimensional $T$-duality triple $((F, H), (\hat F, \hat H), u)$ over $B$ such that $[H] = h$.

The difference between the notions of an extension of $(F, h)$ and an element in the fiber $s^{-1}(F, h)$ is seen on the level of the notion of an isomorphism of extensions (see Definition 7.19). Roughly speaking, an isomorphism of extensions of $(F, h)$ is
an isomorphism of triples such that the underlying bundle isomorphism of $F$ is the identity.

Definition 2.15. We let $\mathrm{Ext}(F, h)$ denote the set of isomorphism classes of extensions of $(F, h)$ to $n$-dimensional $T$-duality triples.

We have a natural surjective map $\mathrm{Ext}(F, h) \to s^{-1}(F, h)$ which in general may not be injective.

2.19. We then consider the following two problems:
(1) Under which conditions does $(F, h)$ admit an extension, i.e. is the set $\mathrm{Ext}(F, h)$ non-empty?
(2) Describe the set $\mathrm{Ext}(F, h)$.
Answers to these questions settle the problem of existence and classification of $T$-duals of $(F, h)$ in the following sense:
(1) The pair $(F, h)$ admits a $T$-dual if and only if $\mathrm{Ext}(F, h)$ is not empty.
(2) The set of $T$-duals of $(F, h)$ can be written as $\hat s(\mathrm{Ext}(F, h)) \subseteq P(B)$.

2.20. As a consequence of Theorem 2.13 we derive the following answer to the first question.

Theorem 2.16 (Theorem 5.6). The pair $(F, h)$ admits an extension to a $T$-duality triple $((F, H), (\hat F, \hat H), u)$ if and only if $h \in F^2 H^3(F, \mathbb{Z})$.

In particular, the condition $h \in F^2 H^3(F, \mathbb{Z})$ is a necessary and sufficient condition for the existence of a $T$-dual to $(F, h)$. If we write out the leading part of $h$ as
\[ h^{2,1} = \Bigl[ \sum_{i=1}^{n} y_i \otimes \hat c_i \Bigr] \in {}^{\pi}E_\infty^{2,1}, \]
then we can read off some information about the Chern classes $\hat c_1, \dots, \hat c_n$ of the $T$-dual bundle $\hat F$. In fact we have ${}^{\pi}E_\infty^{2,1} = {}^{\pi}E_2^{2,1} / \mathrm{im}({}^{\pi}d_2^{0,2})$, and
\[ {}^{\pi}d_2^{0,2}\Bigl( \sum_{i<j} A_{i,j}\, y_i \cup y_j \Bigr) = \sum_{i,j} (A_{i,j} - A_{j,i})\, y_j \otimes c_i, \]
where we set $A_{i,j} := 0$ for $i \ge j$. It follows that the Chern classes $\hat c_i$ of the dual bundle $\hat F$ are determined by the pair $(F, h)$ up to a change $\hat c_i \mapsto \hat c_i + \sum_{j=1}^{n} B_{i,j} c_j$ for some antisymmetric matrix $B \in \mathrm{Mat}(n, n, \mathbb{Z})$. Of course, the classes $\hat c_i$ are completely determined by the choice of an extension of $(F, h)$.

We fix a pair $(F, h)$. If $n \ge 2$, then even the topology of the $T$-dual bundle $\hat F$ may depend on the choice of the extension of the pair $(F, h)$ to a $T$-duality triple. We have already demonstrated this by an example in [6, Sec. 4.4].
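As a purely illustrative aside (not part of the argument of the paper), the ambiguity $\hat c_i \mapsto \hat c_i + \sum_j B_{i,j} c_j$ can be compared with the quadratic form $q$ of 2.14. The Python sketch below rests on an assumption of ours: the pair $(c, \hat c)$ is modelled by integer coefficient vectors $(a, b) \in \mathbb{Z}^{2n}$. Under that assumption it checks, on a finite box of test vectors, that the shift $b \mapsto b + Ba$ preserves $q(a, b) = \sum_i a_i b_i$ when $B$ is antisymmetric, and fails to do so for a symmetric example. All function names are hypothetical.

```python
from itertools import product

def q(a, b):
    """Quadratic form q(sum a_i e_i + b_i e^_i) = sum a_i b_i on Z^{2n}."""
    return sum(x * y for x, y in zip(a, b))

def shift(B, a, b):
    """Apply (a, b) -> (a, b + B a) for an integer n x n matrix B."""
    n = len(a)
    Ba = [sum(B[i][j] * a[j] for j in range(n)) for i in range(n)]
    return a, [b[i] + Ba[i] for i in range(n)]

def is_antisymmetric(B):
    """True if B[i][j] == -B[j][i] for all i, j (so the diagonal vanishes)."""
    n = len(B)
    return all(B[i][j] == -B[j][i] for i in range(n) for j in range(n))

def preserves_q(B, bound=2):
    """Check q(shift(v)) == q(v) for all v with entries in [-bound, bound]."""
    n = len(B)
    box = range(-bound, bound + 1)
    for v in product(box, repeat=2 * n):
        a, b = list(v[:n]), list(v[n:])
        if q(*shift(B, a, b)) != q(a, b):
            return False
    return True

# Illustrative test matrices (our choice, not taken from the paper).
B_anti = [[0, 3], [-3, 0]]  # antisymmetric
B_sym = [[0, 1], [1, 0]]    # symmetric and non-zero

print(preserves_q(B_anti), preserves_q(B_sym))
```

The check reflects the identity $q(a, b + Ba) = q(a, b) + a^{T} B a$, in which the last term vanishes for antisymmetric $B$.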
The discussion above gives a description of the topological invariants of a $T$-dual (see footnote a) of $(F, h)$ in terms of relations between the Chern classes $c, \hat c$ and the $H^3$-classes $[H]$ and $[\hat H]$. Our results improve the results of [15] (see footnote b).

2.21. Let $x := ((F, H), (\hat F, \hat H), u)$ represent a class $\{x\} \in \mathrm{Ext}(F, h)$. Then we let $c(x), \hat c(x)$ denote the Chern classes of $F, \hat F$. Note that the group $\ker(\pi^*) \subseteq H^3(B, \mathbb{Z})$ acts on $\mathrm{Ext}(F, h)$. Furthermore we have a homomorphism $C : H^1(B, \mathbb{Z}^n) \to \ker(\pi^*)$ given by $C(a) := \sum_{i=1}^{n} c_i \cup a_i$, where $a_i$ denotes the components of $a$. The first two assertions of the following theorem have already been discussed above.

Theorem 2.17.
(1) The set $\mathrm{Ext}(F, h)$ is non-empty if and only if $h \in F^2 H^3(F, \mathbb{Z})$.
(2) If $\{x\} \in \mathrm{Ext}(F, h)$, then for any antisymmetric matrix $B \in \mathrm{Mat}(n, n, \mathbb{Z})$ there exists $\{x'\} \in \mathrm{Ext}(F, h)$ such that $\hat c_i(x') = \hat c_i(x) + \sum_{j=1}^{n} B_{i,j} c_j(x)$. Vice versa, if $\{x'\} \in \mathrm{Ext}(F, h)$, then $\hat c(x')$ is of this form.
(3) If $\{x\} \in \mathrm{Ext}(F, h)$, then $\{\{x'\} \in \mathrm{Ext}(F, h) \mid \hat c(x') = \hat c(x)\}$ is an orbit under the effective action of $\ker(\pi^*)/\mathrm{im}(C)$.

We prove this theorem in 7.29. Note that in terms of the Leray–Serre spectral sequence of $\pi : F \to B$ we can identify $\ker(\pi^*)/\mathrm{im}(C) \cong \mathrm{im}({}^{\pi}d_3^{0,2})$.

2.22. Our work on $T$-duality was inspired by [3]. In [3] the authors study smooth $T^n$-bundles equipped with real valued cohomology classes $h_{\mathbb{R}}$. Guided by the principles which we explained in 1.9, in [3] the $T$-dual pair is constructed geometrically using differential forms. Furthermore it was observed in [3] that a condition similar to $h_{\mathbb{R}} \in F^2 H^3(F, \mathbb{R})$ is necessary and sufficient for the construction of a $T$-dual to work. The problem of non-uniqueness of the construction was not addressed in that paper. In the present paper we provide a precise counterpart of [3] including the full information of integral cohomology.

2.23. An interesting feature of topological $T$-duality is the $T$-duality isomorphism in twisted cohomology theories. In [6, Definition 3.1], compare with 6.1, we introduced axioms (in particular the concept of admissibility) for a twisted cohomology theory so that one can define for two pairs in $T$-duality a $T$-duality transformation which is then proved to be an isomorphism. Examples of twisted cohomology theories satisfying the admissibility axioms are twisted $K$-theory and twisted real (periodic) cohomology. See Sec. A.1 for a definition of twisted $K$-theory. The existence of a $T$-duality isomorphism has been previously observed in [1–3, 14] and was our main guiding principle for the definition of the $T$-duality relation.

Footnote a. At the moment of writing this paper, it was not clear that the notion of a $T$-dual used in the present paper coincides with that of [15]. Meanwhile results in this direction have been obtained in [19].
Footnote b. The calculation in [15] is not correct since the inclusion $X_j \to X$ in (4) of [15] does not exist in general.
2.24. Let us assume that $((F, H), (\hat F, \hat H), u)$ is a $T$-duality triple. Let $(X, H) \mapsto h(X, H)$ be a twisted cohomology theory, where the notation suggests that the cohomology groups depend on two entries in a functorial way, namely the space $X$ and the twist $H$. The following definition uses the notation of (2.4).

Definition 2.18. The $T$-duality transformation $T : h(F, H) \to h(\hat F, \hat H)$ is defined by
\[ T := \hat p_! \circ u^* \circ p^*. \]

Note that $T$ shifts degrees by $-n$. Furthermore, it is linear over $h(B)$.

Theorem 2.19 (Theorem 6.2). If the twisted cohomology theory $h(\dots, \dots)$ is $T$-admissible then the $T$-duality transformation is an isomorphism.

2.25. A $T$-duality isomorphism for twisted $K$-theory and twisted periodic de Rham cohomology was also obtained in [3]. In contrast to that paper, we take torsion in the third cohomology into account. We refer to [6, Sec. 4.3] for an explicit example which shows that the torsion part plays a significant role.

2.26. The approach of [14–16] to $T$-duality uses ideas from noncommutative topology. The class $h \in H^3(F, \mathbb{Z})$ is interpreted as the Dixmier–Douady class of a unique isomorphism class of a stable continuous trace algebra $A := A(F, h)$ with spectrum $F$. Equivalently, the twist $H \to F$ is a bundle of $C^*$-algebras with fiber the compact operators on a separable infinite dimensional complex Hilbert space such that $[H] = h$ (see Sec. A.1). Then we can write $A(F, h) \cong C_0(F, H)$ (we assume for simplicity that $B$ (and hence $F$) is locally compact). The authors study the question of lifting the $T^n$-action on $F$ to an $\mathbb{R}^n$-action on $A$ such that the Mackey invariant is trivial. In this case the crossed product $\hat A := A \rtimes \mathbb{R}^n$ is again a continuous trace algebra with a spectrum $\hat F$ which is a $T^n$-principal bundle over $B$. Let $\hat h \in H^3(\hat F, \mathbb{Z})$ denote the Dixmier–Douady class of $\hat A$. From the point of view of [14–16] the pair $(\hat F, \hat h)$ is the $T$-dual of $(F, h)$.

There is an obvious similarity of the following notions and their role in the theory of topological $T$-duality.
• $\mathbb{R}^n$-action on $A(F, h)$ lifting the $T^n$-action on $F$ with trivial Mackey obstruction
• Extension of $(F, h)$ to a $T$-duality triple.
The equivalence of the two approaches is established in [19].

In the approach of [14], the $T$-duality isomorphism for twisted $K$-theory is equivalent to an isomorphism $K(A) \cong K(\hat A)$. In fact, this isomorphism is Connes' Thom isomorphism for crossed products with $\mathbb{R}^n$.

Using the approach via noncommutative topology, the two natural problems are to decide under which conditions the required $\mathbb{R}^n$-action on $A$ exists, and to study the set of choices for such an action. A satisfactory picture can be obtained in the cases $n = 1$ and $n = 2$. The case $n = 1$ is easy and has been reviewed in [6]. The main results of [14] deal with the case $n = 2$. The necessary and sufficient
condition for the existence of the $\mathbb{R}^n$-action is again that $h \in F^2 H^3(F, \mathbb{Z})$. It is then claimed in [14] that the action is unique. This is not always true. In fact, it follows from the diagram given in [14, Theorem 4.3.3], and the observation that $d_2$ (we use the notation of [14]) factors over $p_! : H^2(F, \mathbb{Z}) \to H^0(B, \mathbb{Z})$, that the group $H^0(B, \mathbb{Z})/\mathrm{im}(p_!)$ acts freely on the set of $\mathbb{R}^n$-actions with trivial Mackey invariant lifting the $T^n$-action on $F$.

2.27. Let us mention a very interesting aspect of [14] which goes beyond the theory covered by the present paper. As explained above, the necessary and sufficient condition for the existence of a $T$-dual of $(F, h)$ is $h \in F^2 H^3(F, \mathbb{Z})$. The lift of the $T^n$-action on $F$ to an $\mathbb{R}^n$-action on $A(F, h)$ exists if $h \in F^1 H^3(F, \mathbb{Z})$, but under this weaker condition one might encounter a non-trivial Mackey obstruction. In this case the crossed product $\hat A := A \rtimes \mathbb{R}^n$ is not the algebra of sections of a bundle of compact operators on a $T^n$-principal bundle over $B$. It has been observed in [14] (case $n = 2$) that one can interpret $\hat A$ as an algebra of sections of a bundle of noncommutative tori over $B$. In other words, the $T$-dual of $(F, h)$ can be realized as a bundle of noncommutative tori. For further discussion of this phenomenon see also [15, 16].

2.28. In [5] the point of view of [14, 15] is generalized even further by considering torus bundles with a completely arbitrary $H$-flux differential 3-form with integral periods (see footnote c). It is then argued that the resulting dual should be a bundle of nonassociative noncommutative tori.

Footnote c. In the present paper's language [4] considers pairs $(F, h)$ without any condition on $h \in H^3(F, \mathbb{Z})$.

2.29. Assume that $F = B \times T^n$ is the trivial $T^n$-bundle, and that we consider the trivial twist $h = 0$. Then the $T$-dual bundle is again the trivial bundle, $\hat F = B \times T^n$, and the dual twist vanishes: $\hat h = 0$. In this situation the $T$-duality transformation $T : K(B \times T^n) \to K(B \times T^n)$ is a $K$-theory version of the Fourier–Mukai transformation (see, e.g., [17]). Note that the algebraic geometric analog is more precise. In this case $F$ and $\hat F$ are bundles of dual abelian varieties. On $F \times_B \hat F$ one has the so-called Poincaré sheaf $P$. Its first Chern class $c_1$ (considered as an automorphism of the trivial twist) satisfies the condition $P(c_1)$. The Fourier–Mukai transformation is a functor $T : D^b(F) \to D^b(\hat F)$ between bounded derived categories of coherent sheaves given on objects by $T(X) = R\hat p_* (P \otimes^{L} p^* X)$. Thus the $T$-duality transformation considered in the present paper is a coarsening of the Fourier–Mukai transformation since it takes in a certain sense only the isomorphism classes of objects into account. The tensor product with the Poincaré sheaf plays the role of an automorphism of the trivial twist.

A bundle of abelian varieties has a section. Therefore this case corresponds to the case of trivial $T^n$-bundles in the present paper. Non-trivial bundles can be interpreted as bundles of torsors. In this case a good analog of the Poincaré bundle
such that $P(c_1)$ (see 2.6) is satisfied may not exist. In the topological situation we must replace the Poincaré bundle by an isomorphism $u$ of non-trivial twists in order to satisfy $P(u)$, and to have a $T$-duality isomorphism. In algebraic geometry a similar observation is known (see, e.g., [11]), where twists are represented by Azumaya algebras.

3. The space $R_n$

3.1. If $G$ is an abelian group and $k \in \mathbb{N}$, then we consider the homotopy type $K(G, k)$ of the Eilenberg–MacLane space. It is characterized by $\pi_i(K(G, k)) \cong 0$ for $i \ne k$, and $\pi_k(K(G, k)) \cong G$. We denote a CW-complex of this homotopy type by the same symbol. The Eilenberg–MacLane space $K(G, k)$ classifies the cohomology functor $H^k(\dots, G)$. In fact, there is a universal class $z \in H^k(K(G, k), G)$ such that $f \mapsto f^*(z)$ induces a natural isomorphism $[B, K(G, k)] \to H^k(B, G)$, where $[B, K(G, k)]$ denotes homotopy classes of maps.

Occasionally, we will interpret $K(\mathbb{Z}, 2)$ also as the classifying space of the topological group $T^1 := U(1)$. An explicit model is $U/T^1$, where $U$ is the unitary group of a separable infinite dimensional complex Hilbert space with the strong topology. The bundle $U \to U/T^1$ is the universal $T^1$-principal bundle. Note further that $K(\mathbb{Z}^n, 2) \cong K(\mathbb{Z}, 2)^n$ has the homotopy type of $BT^n$, and that this space carries a universal $T^n$-bundle $U^n \to K(\mathbb{Z}, 2)^n$.

3.2. Let $x \in H^2(K(\mathbb{Z}, 2), \mathbb{Z})$ denote the canonical generator. The product of Eilenberg–MacLane spaces $K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2)$ can be written in the form $K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2) \cong K(\mathbb{Z}, 2)^n \times K(\mathbb{Z}, 2)^n$. We let
\[ p_i, \hat p_i : K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2) \to K(\mathbb{Z}, 2), \qquad i = 1, \dots, n \]
denote the projections onto the components of the first and second factors. Then we define the canonical generators $x_i, \hat x_i \in H^2(K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2), \mathbb{Z})$, $i = 1, \dots, n$, by $x_i := p_i^*(x)$ and $\hat x_i := \hat p_i^*(x)$, respectively.

Let $q : K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2) \to K(\mathbb{Z}, 4)$ be the map classifying $x_1 \cup \hat x_1 + \dots + x_n \cup \hat x_n \in H^4(K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2), \mathbb{Z})$.

Definition 3.1. We define the homotopy type $R_n$ by the homotopy pull-back diagram
\[
\begin{array}{ccc}
R_n & \xrightarrow{\ (c, \hat c)\ } & K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2) \\
\downarrow & & \downarrow q \\
* & \longrightarrow & K(\mathbb{Z}, 4)
\end{array}
\]
In other words, $R_n$ is defined as the homotopy fiber of $q$.
3.3. For later use we determine the homotopy groups of $R_n$.

Lemma 3.2. The homotopy groups of $R_n$ are given by
\[
\begin{array}{c|ccccc}
i & 0 & 1 & 2 & 3 & \ge 4 \\ \hline
\pi_i(R_n) & * & 0 & \mathbb{Z}^{2n} & \mathbb{Z} & 0
\end{array}
\]

Proof. The homotopy fiber of $(c, \hat c)$ is homotopy equivalent to the one of $* \to K(\mathbb{Z}, 4)$, i.e. to $K(\mathbb{Z}, 3)$. The assertion follows immediately from the long exact sequence of homotopy groups.

3.4. We write $c = (c_1, \dots, c_n)$ and $\hat c = (\hat c_1, \dots, \hat c_n)$, i.e. we let $c_i$ and $\hat c_i$ denote the components of $c$ or $\hat c$, respectively.

Lemma 3.3. We have
\[
\begin{array}{c|ccccc}
i & 0 & 1 & 2 & 3 & 4 \\ \hline
H^i(R_n, \mathbb{Z}) & \mathbb{Z} & 0 & \mathbb{Z}^{2n} & 0 & \mathbb{Z}^{n(2n+1)-1}
\end{array}
\]
Here $H^2(R_n, \mathbb{Z})$ is freely generated by the components of $c$ and $\hat c$, and $H^4(R_n, \mathbb{Z})$ is generated by all possible products of the components of $c$ and $\hat c$ subject to one relation $0 = c_1 \cup \hat c_1 + \dots + c_n \cup \hat c_n$.

Proof. Recall from the proof of Lemma 3.2 that the homotopy fiber of $R_n \to K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2)$ is a $K(\mathbb{Z}, 3)$. The relevant part of the second page of the Leray–Serre spectral sequence (see Sec. A.2) ${}^{(c, \hat c)}E_2^{p,q} \cong H^p(K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2), H^q(K(\mathbb{Z}, 3), \mathbb{Z}))$ therefore becomes
\[
\begin{array}{c|ccccc}
3 & \mathbb{Z} & 0 & * & 0 & * \\
2 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & \mathbb{Z} & 0 & \mathbb{Z}^{2n} & 0 & \mathbb{Z}^{2n(2n+1)/2} \\ \hline
q/p & 0 & 1 & 2 & 3 & 4
\end{array}
\]
We read off that $H^2(R_n, \mathbb{Z}) \cong \mathbb{Z}^{2n}$ is generated by the components of $c$ and $\hat c$. The group ${}^{(c, \hat c)}E_2^{4,0} \cong H^4(K(\mathbb{Z}^n, 2) \times K(\mathbb{Z}^n, 2), \mathbb{Z})$ is freely generated by all possible products of the components of $c$ and $\hat c$.

Let $z_3 \in H^3(K(\mathbb{Z}, 3), \mathbb{Z}) \cong {}^{(c, \hat c)}E_2^{0,3}$ be the canonical generator. It also generates the group ${}^{t}E_2^{0,3}$ of the Leray–Serre spectral sequence of the homotopy fibration
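$t : * \to K(\mathbb{Z}, 4)$. A part of its second page is
\[
\begin{array}{c|ccccc}
3 & \mathbb{Z} & 0 & 0 & 0 & * \\
2 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & \mathbb{Z} & 0 & 0 & 0 & \mathbb{Z} \\ \hline
q/p & 0 & 1 & 2 & 3 & 4
\end{array}
\]
Since $H^3(*, \mathbb{Z}) \cong 0 \cong H^4(*, \mathbb{Z})$ and the rows $q = 1, 2$ vanish, the only differential that can connect ${}^{t}E^{0,3}$ and ${}^{t}E^{4,0}$ is $d_4$, and we conclude that ${}^{t}d_4^{0,3}(z_3) = z_4 \in H^4(K(\mathbb{Z}, 4), \mathbb{Z}) \cong {}^{t}E_2^{4,0}$ is the generator. Now by construction $q^* z_4 = x_1 \cup \hat x_1 + \dots + x_n \cup \hat x_n$, and by naturality of the spectral sequences ${}^{(c, \hat c)}d_4^{0,3}(z_3) = q^* z_4$. This implies the assertion about $H^4(R_n, \mathbb{Z})$.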
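The rank $n(2n+1) - 1$ in Lemma 3.3 can be cross-checked by a simple count. The sketch below is an illustration only and assumes the standard fact that the integral cohomology ring of $K(\mathbb{Z}, 2)^{2n}$ is a polynomial ring on $2n$ generators of degree 2, so that $H^4$ is free on the degree-2 monomials in $c_1, \dots, c_n, \hat c_1, \dots, \hat c_n$. Passing to $R_n$ imposes the single relation $\sum_i c_i \cup \hat c_i = 0$, which lowers the rank by one. The function names are hypothetical.

```python
from itertools import combinations_with_replacement

def rank_h4_product(n):
    """Rank of H^4(K(Z^n,2) x K(Z^n,2); Z): count degree-2 monomials in 2n generators."""
    gens = [f"c{i}" for i in range(1, n + 1)] + [f"chat{i}" for i in range(1, n + 1)]
    return sum(1 for _ in combinations_with_replacement(gens, 2))

def rank_h4_Rn(n):
    """Rank of H^4(R_n; Z): the single relation sum_i c_i u chat_i = 0 removes one generator."""
    return rank_h4_product(n) - 1

for n in range(1, 5):
    print(n, rank_h4_product(n), rank_h4_Rn(n))
```

For each $n$ the first count agrees with the entry $2n(2n+1)/2$ of the $E_2$-page, and the second with the rank stated in Lemma 3.3.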
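3.5. Recall that we consider $K(\mathbb{Z}^n, 2) \cong BT^n$ (see 3.1).

Definition 3.4. We define $\pi_n : F_n \to R_n$ to be the $T^n$-bundle which is classified by $c : R_n \to K(\mathbb{Z}^n, 2)$.

Let $U \to K(\mathbb{Z}, 2)$ be the universal $T^1$-bundle. The $n$-fold product $U^n \to K(\mathbb{Z}, 2)^n \cong K(\mathbb{Z}^n, 2)$ is the universal $T^n$-bundle. By definition we get a pull-back diagram
\[
\begin{array}{ccc}
F_n & \longrightarrow & U^n \\
\pi_n \downarrow & & \downarrow \\
R_n & \xrightarrow{\ c\ } & K(\mathbb{Z}, 2)^n
\end{array}
\]

3.6. In the following we use the spectral sequence notation introduced in Sec. A.2.

Lemma 3.5.
(1) We have
\[
\begin{array}{c|cccc}
i & 0 & 1 & 2 & 3 \\ \hline
H^i(F_n, \mathbb{Z}) & \mathbb{Z} & 0 & \mathbb{Z}^{n} & \mathbb{Z}
\end{array}
\]
(2) Here the group $H^2(F_n, \mathbb{Z})$ is freely generated by the components of $\pi_n^* \hat c$.
(3) In particular, restriction to the fiber of $F_n \to R_n$ induces the zero homomorphism on $H^2(F_n, \mathbb{Z})$.
(4) Furthermore, $H^3(F_n, \mathbb{Z})$ is generated by a class $h_n \in F^2 H^3(F_n, \mathbb{Z})$ which is characterized by $[h_n]^{2,1} = \sum_{i=1}^{n} [y_i \otimes \hat c_i] \in {}^{\pi_n}E_\infty^{2,1}$.

Proof. We write out the second page ${}^{\pi_n}E_2^{p,q}$ (entries left blank are not needed):
\[
\begin{array}{c|ccccc}
3 & \mathbb{Z}^{n(n-1)(n-2)/6} & & & & \\
2 & \mathbb{Z}^{n(n-1)/2} & 0 & \mathbb{Z}^{n^2(n-1)} & 0 & \\
1 & \mathbb{Z}^{n} & 0 & \mathbb{Z}^{2n^2} & 0 & \\
0 & \mathbb{Z} & 0 & \mathbb{Z}^{2n} & 0 & \mathbb{Z}^{n(2n+1)-1} \\ \hline
q/p & 0 & 1 & 2 & 3 & 4
\end{array}
\]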
We know that ${}^{\pi_n}d_2^{0,1}(y_i) = c_i \in {}^{\pi_n}E_2^{2,0} \cong H^2(R_n,\mathbb{Z})$. It follows that ${}^{\pi_n}d_2^{0,1}$ is an isomorphism onto the subgroup of ${}^{\pi_n}E_2^{2,0}$ generated by $c$, so that ${}^{\pi_n}E_3^{0,1} \cong 0$ and ${}^{\pi_n}E_3^{2,0}$ is freely generated by the components of $\hat c$. We see already that $H^1(F_n,\mathbb{Z}) \cong 0$.

The group ${}^{\pi_n}E_2^{0,2}$ is freely generated by all products $y_i\cup y_j$, $i<j$. We now use the multiplicativity of the Leray–Serre spectral sequence in order to see that ${}^{\pi_n}d_2^{0,2}(y_i\cup y_j) = y_j\otimes c_i - y_i\otimes c_j$. We conclude that ${}^{\pi_n}d_2^{0,2}$ is injective. This implies that ${}^{\pi_n}E_3^{0,2} \cong 0$, and it follows that $H^2(F_n,\mathbb{Z})$ is freely generated by the components of $\pi_n^*\hat c$.

We have ${}^{\pi_n}E_2^{4,0} \cong H^4(R_n,\mathbb{Z})$, ${}^{\pi_n}d_2^{2,1}(y_i\otimes c_j) = c_i\cup c_j$, and ${}^{\pi_n}d_2^{2,1}(y_i\otimes\hat c_j) = c_i\cup\hat c_j$. In order to calculate $\ker({}^{\pi_n}d_2^{2,1})$ we recall the relation $c_1\cup\hat c_1 + \dots + c_n\cup\hat c_n = 0$. Let
\[
h := y_1\otimes\hat c_1 + \dots + y_n\otimes\hat c_n .
\]
Then we have ${}^{\pi_n}d_2^{2,1}(h) = 0$. We claim that $\ker({}^{\pi_n}d_2^{2,1}) = \mathbb{Z}h \oplus \mathrm{im}({}^{\pi_n}d_2^{0,2})$. Let $t := \sum_{i,j=1}^n \big(a_{i,j}\, y_i\otimes c_j + b_{i,j}\, y_i\otimes\hat c_j\big)$ for $a_{i,j}, b_{i,j} \in \mathbb{Z}$ and assume that ${}^{\pi_n}d_2^{2,1}(t) = 0$. Then $\sum_{i,j=1}^n \big(a_{i,j}\, c_i\cup c_j + b_{i,j}\, c_i\cup\hat c_j\big) = 0$. This implies that $a_{i,j} + a_{j,i} = 0$ for all pairs $i,j$, that $b_{i,j} = 0$ for $i \neq j$, and that there exists $b \in \mathbb{Z}$ such that $b_{i,i} = b$ for all $i = 1,\dots,n$. But then we can write $t = \sum_{i<j} a_{i,j}\, {}^{\pi_n}d_2^{0,2}(y_j\cup y_i) + bh$. It follows that ${}^{\pi_n}E_3^{2,1} \cong \mathbb{Z}$ is generated by the class of $h$.

The group ${}^{\pi_n}E_2^{0,3}$ is freely generated by the products $y_i\cup y_j\cup y_k$, $i<j<k$. Furthermore, ${}^{\pi_n}d_2^{0,3}(y_i\cup y_j\cup y_k) = y_j\cup y_k\otimes c_i - y_i\cup y_k\otimes c_j + y_i\cup y_j\otimes c_k$. We thus calculate that ${}^{\pi_n}d_2^{0,3}$ is injective. We conclude that $H^3(F_n,\mathbb{Z}) \cong F^2H^3(F_n,\mathbb{Z})$ is generated by the class $h_n$ represented by $h \in {}^{\pi_n}E_2^{2,1}$.
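The kernel computation in the proof above can be sanity-checked numerically. The following is only a sketch over the rationals: it checks the rank $\frac{n(n-1)}{2} + 1$ of $\ker(d_2)$ on the classes $y_i\otimes c_j$, $y_i\otimes\hat c_j$, not the integral statement. The function names and the choice to eliminate $c_n\cup\hat c_n$ through the relation $\sum_i c_i\cup\hat c_i = 0$ are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import combinations_with_replacement


def rank(rows):
    """Rank over Q of a matrix given as a list of rows (Gaussian elimination)."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                f = m[i][col] / m[rk][col]
                m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def kernel_rank_d2(n):
    """Rank of the kernel of y_i (x) c_j -> c_i c_j, y_i (x) chat_j -> c_i chat_j.

    Target basis (illustrative choice): products c_i c_j with i <= j, and
    c_i chat_j for (i, j) != (n, n); the last mixed product is rewritten via
    the relation sum_i c_i chat_i = 0. Products chat_i chat_j are never hit
    and are therefore left out of the target basis.
    """
    cc = {p: k for k, p in enumerate(combinations_with_replacement(range(n), 2))}
    mixed = [(i, j) for i in range(n) for j in range(n) if (i, j) != (n - 1, n - 1)]
    cm = {p: len(cc) + k for k, p in enumerate(mixed)}
    width = len(cc) + len(cm)
    rows = []
    for i in range(n):
        for j in range(n):
            row = [0] * width
            row[cc[(min(i, j), max(i, j))]] = 1  # image of y_i (x) c_j
            rows.append(row)
    for i in range(n):
        for j in range(n):
            row = [0] * width
            if (i, j) == (n - 1, n - 1):
                for k in range(n - 1):
                    row[cm[(k, k)]] = -1  # c_n chat_n = -sum_{k<n} c_k chat_k
            else:
                row[cm[(i, j)]] = 1  # image of y_i (x) chat_j
            rows.append(row)
    return len(rows) - rank(rows)
```

For small $n$ the computed kernel rank agrees with the rank of $\mathbb{Z}h \oplus \mathrm{im}(d_2^{0,2})$ claimed in the proof, namely $1 + \frac{n(n-1)}{2}$.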
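3.7. In the proof of Lemma 3.2 we have found a homotopy Cartesian square
\[
\begin{array}{ccc}
K(\mathbb{Z},3) & \xrightarrow{\;i\;} & R_n \\
\downarrow & & \downarrow{\scriptstyle (c,\hat c)} \\
{*} & \longrightarrow & K(\mathbb{Z}^n,2)\times K(\mathbb{Z}^n,2) .
\end{array}
\]
We pull this square back along the map $\psi : K(\mathbb{Z}^n,2) \cong U^n\times K(\mathbb{Z}^n,2) \to K(\mathbb{Z}^n,2)\times K(\mathbb{Z}^n,2)$ and obtain a cube of homotopy Cartesian squares. One face of the cube is the square above, the opposite face is
\[
\begin{array}{ccc}
i^*F_n & \xrightarrow{\;I\;} & F_n \\
{\scriptstyle \lambda}\downarrow & & \downarrow{\scriptstyle \kappa} \\
K(\mathbb{Z},1)^n & \longrightarrow & K(\mathbb{Z}^n,2) ,
\end{array}
\]
and the two faces are connected by the maps $\mathrm{pr} : i^*F_n \to K(\mathbb{Z},3)$, $\pi_n : F_n \to R_n$, $K(\mathbb{Z},1)^n \to *$ and $\psi$.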
Lemma 3.6. We have I ∗ hn = ±pr∗ z3 , where z3 ∈ H 3 (K(Z, 3), Z) is the canonical generator.
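Proof. The fiber of $\kappa$ is equivalent to $K(\mathbb{Z},3)$. The second page ${}^{\kappa}E_2$ of the corresponding Leray–Serre spectral sequence has the form
\[
\begin{array}{c|ccccc}
3 & \mathbb{Z} & 0 & * & * & * \\
2 & 0 & 0 & 0 & 0 & * \\
1 & 0 & 0 & 0 & 0 & * \\
0 & \mathbb{Z} & 0 & \mathbb{Z}^{n} & 0 & \mathbb{Z}^{\frac{n(n+1)}{2}} \\ \hline
q/p & 0 & 1 & 2 & 3 & 4
\end{array}
\]
We know that $H^3(F_n,\mathbb{Z}) \cong \mathbb{Z}$ is freely generated by $h_n$. We see that ${}^{\kappa}d_2^{0,3} = 0$ and ${}^{\kappa}E_2^{0,3} \cong H^3(R_n,\mathbb{Z})$ is generated by $h_n$. On the other hand the group ${}^{\kappa}E_2^{0,3} \cong H^3(K(\mathbb{Z},3),\mathbb{Z})$ is freely generated by $z_3$. Therefore $h_n = \pm z_3$.

Note that $i^*F_n \cong K(\mathbb{Z},3)\times K(\mathbb{Z},1)^n$ so that the Leray–Serre spectral sequence ${}^{\lambda}E$ of $\lambda$ degenerates. Let $I^* : {}^{\kappa}E_2 \to {}^{\lambda}E_2$ be the induced map of the second pages and note that ${}^{\lambda}E_2$ has the form
\[
\begin{array}{c|ccccc}
3 & \mathbb{Z} & \mathbb{Z}^{n} & * & * & * \\
2 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 \\
0 & \mathbb{Z} & \mathbb{Z}^{n} & \mathbb{Z}^{\frac{n(n-1)}{2}} & \mathbb{Z}^{\frac{n(n-1)(n-2)}{6}} & \mathbb{Z}^{\frac{n(n-1)(n-2)(n-3)}{24}} \\ \hline
q/p & 0 & 1 & 2 & 3 & 4
\end{array}
\]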
where ${}^{\lambda}E_2^{0,3}$ is freely generated by $z_3 \in H^3(K(\mathbb{Z},3),\mathbb{Z})$. The map $I$ induces an equivalence of the fibers of $\lambda$ and $\kappa$. In particular, it induces an isomorphism $I^* : {}^{\kappa}E_2^{0,3} \to {}^{\lambda}E_2^{0,3}$ identifying the generators above. This implies that $I^*h_n = \pm\mathrm{pr}^*z_3$.

4. The T-duality group and the universal triple

4.1. Let $(e_1,\dots,e_n,\hat e_1,\dots,\hat e_n)$ be the standard basis of $\mathbb{Z}^{2n}$. Let $G_n \subset GL(2n,\mathbb{Z})$ be the subgroup of transformations which fix the form $q : \mathbb{Z}^{2n} \to \mathbb{Z}$ given by
\[
q\Big(\sum_{i=1}^n a_i e_i + b_i\hat e_i\Big) := \sum_{i=1}^n a_i b_i .
\]
Usually denoted $O(n,n,\mathbb{Z})$, $G_n$ will here be called the group of T-duality transformations.

4.2. Each $g \in G_n$ induces an equivalence $g : K(\mathbb{Z}^n,2)\times K(\mathbb{Z}^n,2) \to K(\mathbb{Z}^n,2)\times K(\mathbb{Z}^n,2)$.

Lemma 4.1. There exists a unique homotopy class of lifts $\tilde g$ in the diagram
\[
\begin{array}{ccc}
R_n & \xrightarrow{\;\tilde g\;} & R_n \\
{\scriptstyle (c,\hat c)}\downarrow & & \downarrow{\scriptstyle (c,\hat c)} \\
K(\mathbb{Z}^n,2)\times K(\mathbb{Z}^n,2) & \xrightarrow{\;g\;} & K(\mathbb{Z}^n,2)\times K(\mathbb{Z}^n,2) .
\end{array}
\]

Proof. We apply obstruction theory to the problem of existence and classification of lifts $\tilde g$. The situation is simple because the fiber of $(c,\hat c)$ is a $K(\mathbb{Z},3)$. In fact, the
obstruction to the existence of a lift is the class $(c,\hat c)^*g^*\big(\sum_{i=1}^n x_i\cup\hat x_i\big) \in H^4(R_n,\mathbb{Z})$, which vanishes since $g$ preserves $q$. Therefore there exists a lift $\tilde g$. The set of homotopy classes of lifts is a torsor over $H^3(R_n,\mathbb{Z})$. Since $H^3(R_n,\mathbb{Z}) \cong \{0\}$, the lift $\tilde g$ is unique.

The correspondence $G_n \ni g \mapsto \tilde g$ induces a homotopy action of $G_n$ on $R_n$.

4.3. We let $t \in G_n$ be the transformation given by $t(e_i) = \hat e_i$ and $t(\hat e_i) = e_i$.

Definition 4.2. The universal T-duality is the lift $T := \tilde t : R_n \to R_n$ of $t$ according to Lemma 4.1.

Note that $T\circ T = \mathrm{id}_{R_n}$ since $t^2 = 1 \in G_n$.

4.4. Definition 4.3. The universal dual $T^n$-bundle is defined by the pull-back
\[
\begin{array}{ccc}
\hat F_n & \xrightarrow{\;\tilde T\;} & F_n \\
{\scriptstyle \hat\pi_n}\downarrow & & \downarrow{\scriptstyle \pi_n} \\
R_n & \xrightarrow{\;T\;} & R_n .
\end{array}
\]
Furthermore, we define $\hat h_n := \tilde T^*h_n \in H^3(\hat F_n,\mathbb{Z})$.

4.5. We consider the pull-back diagram
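\[
\begin{array}{ccc}
F_n\times_{R_n}\hat F_n & \xrightarrow{\;\hat p_n\;} & \hat F_n \\
{\scriptstyle p_n}\downarrow & & \downarrow{\scriptstyle \hat\pi_n} \\
F_n & \xrightarrow{\;\pi_n\;} & R_n
\end{array}
\]
with diagonal $r_n : F_n\times_{R_n}\hat F_n \to R_n$.

Lemma 4.4. We have
\[
\begin{array}{c|cccc}
i & 0 & 1 & 2 & 3 \\ \hline
H^i(F_n\times_{R_n}\hat F_n,\mathbb{Z}) & \mathbb{Z} & 0 & 0 & \mathbb{Z}
\end{array}
\]
Moreover, $p_n^*h_n = \hat p_n^*\hat h_n$, and this element generates $H^3(F_n\times_{R_n}\hat F_n,\mathbb{Z})$.

Proof. We use the Leray–Serre spectral sequence ${}^{r_n}E$. The relevant part of its second page has the form
\[
\begin{array}{c|ccccc}
3 & \mathbb{Z}^{\frac{2n(2n-1)(2n-2)}{6}} & & & & \\
2 & \mathbb{Z}^{\frac{2n(2n-1)}{2}} & 0 & \mathbb{Z}^{2n^2(2n-1)} & 0 & \\
1 & \mathbb{Z}^{2n} & 0 & \mathbb{Z}^{4n^2} & 0 & \\
0 & \mathbb{Z} & 0 & \mathbb{Z}^{2n} & 0 & \mathbb{Z}^{n(2n+1)-1} \\ \hline
q/p & 0 & 1 & 2 & 3 & 4
\end{array}
\]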
Now ${}^{r_n}E_2^{0,1}$ is freely generated by $y_i, \hat y_i$, and ${}^{r_n}E_2^{2,0}$ is freely generated by $c_i, \hat c_i$. We know that ${}^{r_n}d_2^{0,1}(y_i) = c_i$ and ${}^{r_n}d_2^{0,1}(\hat y_i) = \hat c_i$. We conclude that ${}^{r_n}d_2^{0,1}$ is injective and $H^1(F_n\times_{R_n}\hat F_n,\mathbb{Z}) \cong 0$.

The group ${}^{r_n}E_2^{0,2}$ is freely generated by the products $y_i\cup y_j$, $\hat y_i\cup\hat y_j$, $i<j$, and $y_i\cup\hat y_j$. We have ${}^{r_n}d_2^{0,2}(y_i\cup y_j) = y_j\otimes c_i - y_i\otimes c_j$, ${}^{r_n}d_2^{0,2}(\hat y_i\cup\hat y_j) = \hat y_j\otimes\hat c_i - \hat y_i\otimes\hat c_j$, and ${}^{r_n}d_2^{0,2}(y_i\cup\hat y_j) = \hat y_j\otimes c_i - y_i\otimes\hat c_j$. It follows that ${}^{r_n}d_2^{0,2}$ is injective. In a similar way we see that ${}^{r_n}d_2^{0,3}$ is injective.

We now calculate ${}^{r_n}E_3^{2,1}$. We claim that this group is freely generated by one class which can be represented by $\sum_{i=1}^n y_i\otimes\hat c_i \in {}^{r_n}E_2^{2,1}$ or alternatively by $\sum_{i=1}^n \hat y_i\otimes c_i \in {}^{r_n}E_2^{2,1}$. In view of the construction of $h_n$ and $\hat h_n$, and by naturality of the Leray–Serre spectral sequence, this would imply the assertion of the lemma about the third cohomology.

The group ${}^{r_n}E_2^{4,0}$ is generated by the products $c_i\cup c_j$, $\hat c_i\cup\hat c_j$, $i\le j$, and all products $c_i\cup\hat c_j$, subject to one relation $\sum_{i=1}^n c_i\cup\hat c_i = 0$. Let
\[
t = \sum_{i,j} a_{i,j}\, y_i\otimes c_j + \sum_{i,j} b_{i,j}\, y_i\otimes\hat c_j + \sum_{i,j} c_{i,j}\, \hat y_i\otimes c_j + \sum_{i,j} d_{i,j}\, \hat y_i\otimes\hat c_j
\]
and assume that ${}^{r_n}d_2^{2,1}(t) = 0$. Then we have
\[
\sum_{i,j} a_{i,j}\, c_i\otimes c_j + \sum_{i,j} b_{i,j}\, c_i\otimes\hat c_j + \sum_{i,j} c_{i,j}\, \hat c_i\otimes c_j + \sum_{i,j} d_{i,j}\, \hat c_i\otimes\hat c_j = 0 .
\]
This implies that $a_{i,j} + a_{j,i} = 0$, $d_{i,j} + d_{j,i} = 0$ for all $i,j$, $b_{i,j} + c_{i,j} = 0$ for all $i \neq j$, and that there exists a unique $e \in \mathbb{Z}$ such that $b_{i,i} + c_{i,i} = e$ for all $i = 1,\dots,n$. We can now write
\[
t = \sum_{i<j} a_{i,j}\, {}^{r_n}d_2^{0,2}(y_j\cup y_i) + \sum_{i<j} d_{i,j}\, {}^{r_n}d_2^{0,2}(\hat y_j\cup\hat y_i) - \sum_{i\neq j} b_{i,j}\, {}^{r_n}d_2^{0,2}(y_i\cup\hat y_j) - \sum_{i=1}^n b_{i,i}\, {}^{r_n}d_2^{0,2}(y_i\cup\hat y_i) + e\sum_{i=1}^n \hat y_i\otimes c_i .
\]
This already shows that ${}^{r_n}E_3^{2,1}$ is freely generated by the class of $\sum_{i=1}^n \hat y_i\otimes c_i$. Finally note that $\sum_{i=1}^n {}^{r_n}d_2^{0,2}(y_i\cup\hat y_i) = \sum_{i=1}^n \hat y_i\otimes c_i - \sum_{i=1}^n y_i\otimes\hat c_i$. This finishes the proof of the claim.

4.6. Let $\mathcal{H}_n \in T(F_n)$ be a twist with isomorphism class $[\mathcal{H}_n] = h_n \in H^3(F_n,\mathbb{Z})$. Set further $\hat{\mathcal{H}}_n := \tilde T^*\mathcal{H}_n \in T(\hat F_n)$, where $\tilde T$ was defined in 4.3. Then $[\hat{\mathcal{H}}_n] = \hat h_n$. Recall the definition of $p_n$ and $\hat p_n$ in 4.5. Since $p_n^*h_n = \hat p_n^*\hat h_n$ by Lemma 4.4, we conclude that there exists an isomorphism of twists $u_n : \hat p_n^*\hat{\mathcal{H}}_n \to p_n^*\mathcal{H}_n$. Since $H^2(F_n\times_{R_n}\hat F_n,\mathbb{Z}) \cong 0$ by Lemma 4.4 this isomorphism is unique.

4.7. The notion of a T-duality triple over a space $B$ was introduced in Definition 2.2. Here we clarify the notion of an isomorphism between such triples $x := ((F,\mathcal{H}),(\hat F,\hat{\mathcal{H}}),u)$ and $x' := ((F',\mathcal{H}'),(\hat F',\hat{\mathcal{H}}'),u')$.

Definition 4.5. We say that $x$ and $x'$ are isomorphic if there exist underlying bundle isomorphisms
\[
\begin{array}{ccc}
F & \xrightarrow{\;\psi\;} & F' \\
\downarrow & & \downarrow \\
B & = & B
\end{array}
\qquad
\begin{array}{ccc}
\hat F & \xrightarrow{\;\hat\psi\;} & \hat F' \\
\downarrow & & \downarrow \\
B & = & B
\end{array}
\]
and isomorphisms of twists
\[
v : \psi^*\mathcal{H}' \to \mathcal{H}, \qquad \hat v : \hat\psi^*\hat{\mathcal{H}}' \to \hat{\mathcal{H}}
\]
such that the composition
\[
\hat p^*\hat{\mathcal{H}} \xrightarrow{\;\hat p^*\hat v^{-1}\;} \hat p^*\hat\psi^*\hat{\mathcal{H}}' \cong (\psi,\hat\psi)^*(\hat p')^*\hat{\mathcal{H}}' \xrightarrow{\;(\psi,\hat\psi)^*u'\;} (\psi,\hat\psi)^*(p')^*\mathcal{H}' \cong p^*\psi^*\mathcal{H}' \xrightarrow{\;p^*v\;} p^*\mathcal{H}
\]
is equal to $u : \hat p^*\hat{\mathcal{H}} \to p^*\mathcal{H}$. Here $(\psi,\hat\psi) : F\times_B\hat F \to F'\times_B\hat F'$ is the induced map (and the other notation is as in 2.6).

4.8. Theorem 4.6. $((F_n,\mathcal{H}_n),(\hat F_n,\hat{\mathcal{H}}_n),u_n)$ represents the unique isomorphism class of T-duality triples over $(F_n,\hat F_n)$.

Proof. We must only verify that $((F_n,\mathcal{H}_n),(\hat F_n,\hat{\mathcal{H}}_n),u_n)$ is a T-duality triple, because uniqueness is clear from the conditions on T-duality triples of 2.6, from uniqueness of $[\mathcal{H}_n]$ and $[\hat{\mathcal{H}}_n]$ by Lemma 3.5(1), and the uniqueness of $u_n$ by Lemma 4.4.

The classes $[\mathcal{H}]$ and $[\hat{\mathcal{H}}]$ have the required properties by construction. It remains to show that condition $P(u)$ is satisfied (see 2.5). We fix a base point $* \in R_n$. Note that by Lemma 5.3 and Proposition 5.4 (which are independent of the result to be proved) we have a canonical equivalence $R_1 \cong \tilde R_1$. In [6] we studied in detail the topology of $R_1$ and the associated T-duality. The idea of the proof is to reduce the present task to the case of $n = 1$.

We consider the $n$-fold product $F \to B$ of the $T^1$-bundle $F_1 \to R_1$, i.e. we set $F := F_1^n$ and $B := R_1^n$. Let $p_i : B \to R_1$, $i = 1,\dots,n$, denote the projections. Let $(z,\hat z) : B \to K(\mathbb{Z}^n,2)\times K(\mathbb{Z}^n,2)$ be the map whose components classify $z_i := p_i^*x$ and $\hat z_i := p_i^*\hat x$, where $x,\hat x \in H^2(R_1,\mathbb{Z})$ are the canonical generators. We now apply obstruction theory to the lifting problem
\[
\begin{array}{ccc}
 & & R_n \\
 & {\scriptstyle f}\nearrow & \downarrow{\scriptstyle (c,\hat c)} \\
B & \xrightarrow{\;(z,\hat z)\;} & K(\mathbb{Z}^n,2)\times K(\mathbb{Z}^n,2) .
\end{array}
\]
The relation $x\cup\hat x = 0$ in $H^4(R_1,\mathbb{Z})$ implies that $z_i\cup\hat z_i = 0$, and hence that $\sum_{i=1}^n z_i\cup\hat z_i = 0$. In view of Definition 3.1 this diagram admits a lift $f$. Since $H^3(B,\mathbb{Z}) \cong 0$ the homotopy class of the lift $f$ is in fact uniquely determined. We therefore have a pull-back diagram of principal $T^n$-bundles
\[
\begin{array}{ccc}
F & \xrightarrow{\;\tilde f\;} & F_n \\
{\scriptstyle \pi}\downarrow & & \downarrow{\scriptstyle \pi_n} \\
B & \xrightarrow{\;f\;} & R_n .
\end{array}
\]
We define $h := \tilde f^*h_n$. In this way we obtain a pair $(F,h)$ over $B$. We further have natural projections $\mathrm{pr}_i : F \to F_1$, $i = 1,\dots,n$. Using the characterizations of
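$h_n$ and $h_1$ in Lemma 3.5, the naturality of Leray–Serre spectral sequences, and $H^3(B,\mathbb{Z}) \cong 0$, we see that
\[
h = \sum_{i=1}^n \mathrm{pr}_i^*h_1 . \tag{4.1}
\]
Let $T_B : B \to B$ be the product of the T-duality transformations $T : R_1 \to R_1$ on each factor. Since $T$ is a homotopy equivalence, there is a unique homotopy class of lifts $\alpha$ in
\[
\begin{array}{ccc}
 & & R_n \\
 & {\scriptstyle \alpha}\nearrow & \downarrow{\scriptstyle T} \\
B & \xrightarrow{\;f\circ T_B\;} & R_n .
\end{array}
\]
Since the composition $B \xrightarrow{\alpha} R_n \to K(\mathbb{Z}^n,2)\times K(\mathbb{Z}^n,2)$ coincides with $(z,\hat z)$ we have $\alpha = f$, and the following diagram commutes up to homotopy
\[
\begin{array}{ccc}
B & \xrightarrow{\;f\;} & R_n \\
{\scriptstyle T_B}\downarrow & & \downarrow{\scriptstyle T} \\
B & \xrightarrow{\;f\;} & R_n .
\end{array}
\]
This shows that we have a pull-back diagram
\[
\begin{array}{ccc}
\hat F & \xrightarrow{\;\hat f\;} & \hat F_n \\
{\scriptstyle \hat\pi}\downarrow & & \downarrow{\scriptstyle \hat\pi_n} \\
B & \xrightarrow{\;f\;} & R_n ,
\end{array}
\]
where $\hat F \to B$ is the $n$-fold product of $\hat F_1 \to R_1$. We get the commutative diagram
\[
\begin{array}{ccccc}
F_b\times\hat F_b & \longrightarrow & F\times_B\hat F & \xrightarrow{\;\tilde f\times_B\hat f\;} & F_n\times_{R_n}\hat F_n \\
\downarrow & & \downarrow{\scriptstyle r} & & \downarrow{\scriptstyle r_n} \\
\{b\} & \longrightarrow & B & \xrightarrow{\;f\;} & R_n ,
\end{array}
\]
where we assume that $f(b) = *$. It follows from Lemma 4.4 (in the case $n = 1$) and the Künneth formula that $H^2(F\times_B\hat F,\mathbb{Z}) \cong 0$. Therefore $(\tilde f\times_B\hat f)^*u : (\tilde f\times_B\hat f)^*\hat p_n^*\hat{\mathcal{H}} \to (\tilde f\times_B\hat f)^*p_n^*\mathcal{H}$ is the unique isomorphism.

Let $\mathcal{V} \in T(F_1)$ and $\hat{\mathcal{V}} \in T(\hat F_1)$ be twists in the classes $h_1$ and $\hat h_1$. Then by Eq. (4.1) there exist isomorphisms of twists $\kappa : \tilde f^*\mathcal{H} \to \sum_{i=1}^n \mathrm{pr}_i^*\mathcal{V}$ and $\hat\kappa : \hat f^*\hat{\mathcal{H}} \to \sum_{i=1}^n \mathrm{pr}_i^*\hat{\mathcal{V}}$. The choice of these isomorphisms is not unique. However, their restrictions to the fiber over $\{b\} \to B$ are unique. This follows from the structure of $H^2(F,\mathbb{Z})$ (and of $H^2(\hat F,\mathbb{Z})$) implied by Lemma 3.5(1) in the case $n = 1$ and the Künneth formula. At this moment we fix some choices of $\kappa$ and $\hat\kappa$.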
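Let $q_i : F\times_B\hat F \to F_1\times_{R_1}\hat F_1$ be the projection onto the $i$th component. Note that $\tilde f\circ p = p_n\circ(\tilde f\times_B\hat f)$, $\hat f\circ\hat p = \hat p_n\circ(\tilde f\times_B\hat f)$, $p_1\circ q_i = \mathrm{pr}_i\circ p$ and $\hat p_1\circ q_i = \mathrm{pr}_i\circ\hat p$. We now have fixed isomorphisms
\[
p^*\kappa : (\tilde f\times_B\hat f)^*p_n^*\mathcal{H} \cong \sum_{i=1}^n q_i^*p_1^*\mathcal{V}, \qquad
\hat p^*\hat\kappa : (\tilde f\times_B\hat f)^*\hat p_n^*\hat{\mathcal{H}} \cong \sum_{i=1}^n q_i^*\hat p_1^*\hat{\mathcal{V}} .
\]
Note that there is a unique isomorphism $\psi : \hat p_1^*\hat{\mathcal{V}} \to p_1^*\mathcal{V}$. This induces another isomorphism
\[
\Phi : (\tilde f\times_B\hat f)^*\hat p_n^*\hat{\mathcal{H}} \overset{\hat p^*\hat\kappa}{\cong} \sum_{i=1}^n q_i^*\hat p_1^*\hat{\mathcal{V}} \overset{\sum_{i=1}^n q_i^*\psi}{\cong} \sum_{i=1}^n q_i^*p_1^*\mathcal{V} \overset{(p^*\kappa)^{-1}}{\cong} (\tilde f\times_B\hat f)^*p_n^*\mathcal{H} .
\]
It follows that $(\tilde f\times_B\hat f)^*u = \Phi$. We can now restrict $\Phi$ to the fiber $F_b\times\hat F_b$. It was shown in [6, 3.2.4] that the restriction of $\psi$ to the fiber $T^1\times\hat T^1$ is classified by a generator of $H^2(T^1\times\hat T^1,\mathbb{Z})$, namely by $y\cup\hat y$ in the canonical basis of $H^1(T^1\times\hat T^1,\mathbb{Z})$. If we restrict the whole composition defining $\Phi$ to the fiber then we see that $[(\tilde f,\hat f)^*u(b)] = [\sum_{i=1}^n y_i\cup\hat y_i] \in H^2(F_b\times\hat F_b,\mathbb{Z})/(\mathrm{im}(p_b^*) + \mathrm{im}(\hat p_b^*))$, as required.

5. Pairs and triples

5.1. Recall the construction of the classifying space $\tilde R_n(0)$ of pairs in 2.15. The goal of the present section is the identification of $R_n$ with a one-connected covering of $\tilde R_n(0)$.

We start with the universal $T^n$-bundle $U^n \to K(\mathbb{Z}^n,2)$ and the $T^n$-space $\mathrm{Map}(T^n,K(\mathbb{Z},3))$, where $T^n$ acts by reparametrization. In a first step we form the associated bundle
\[
p : \tilde R_n(0) := U^n\times_{T^n}\mathrm{Map}(T^n,K(\mathbb{Z},3)) \to K(\mathbb{Z}^n,2) .
\]
We define a $T^n$-bundle $\tilde F_n(0)$ via the pull-back
\[
\begin{array}{ccc}
\tilde F_n(0) & \longrightarrow & U^n \\
\downarrow & & \downarrow \\
\tilde R_n(0) & \xrightarrow{\;p\;} & K(\mathbb{Z}^n,2) .
\end{array}
\]
There is a canonical map $\tilde h_n(0) : \tilde F_n(0) \to K(\mathbb{Z},3)$. It is given by $\tilde h_n(0)([v,\phi],u) := \phi(s)$, where $s \in T^n$ is the unique element such that $sv = u$. Here $u,v \in U^n$, $\phi \in \mathrm{Map}(T^n,K(\mathbb{Z},3))$, $[v,\phi] \in \tilde R_n(0)$, and $([v,\phi],u) \in \tilde F_n(0)$.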
5.2. Recall that a pair $(F,h)$ over a space $B$ consists of a $T^n$-bundle $F \to B$ and a class $h \in H^3(F,\mathbb{Z})$. An isomorphism between pairs $(F,h)$ and $(F',h')$ is given by a diagram
\[
\begin{array}{ccc}
F & \xrightarrow{\;\Phi\;} & F' \\
\downarrow & & \downarrow \\
B & = & B ,
\end{array}
\]
where $\Phi$ is a $T^n$-bundle isomorphism such that $\Phi^*h' = h$. Given a map $f : B' \to B$ of spaces we can form the pull-back
\[
\begin{array}{ccc}
F' & \xrightarrow{\;\tilde f\;} & F \\
\downarrow & & \downarrow \\
B' & \xrightarrow{\;f\;} & B .
\end{array}
\]
We define the pair $f^*(F,h) := (F',\tilde f^*h)$ over $B'$. Pull-back preserves isomorphism classes of pairs.

5.3. Let $\tilde P_{(0)}$ be the contravariant set-valued functor which associates to each space $B$ the set $\tilde P_{(0)}(B)$ of isomorphism classes of pairs.

Lemma 5.1. The space $\tilde R_n(0)$ is a classifying space for $\tilde P_{(0)}$. More precisely, the pair $[\tilde F_n(0),\tilde h_n(0)] \in \tilde P_{(0)}(\tilde R_n(0))$ induces a natural isomorphism $\tilde v_B : [B,\tilde R_n(0)] \to \tilde P_{(0)}(B)$ such that $\tilde v_B(f) = f^*[\tilde F_n(0),\tilde h_n(0)]$.

Proof. This is completely analogous to the proof of [6, Proposition 2.6]. We therefore refrain from repeating the proof here.

5.4. By Lemma 5.1 we have an isomorphism $\pi_0(\tilde R_n(0)) \cong \tilde P_{(0)}(*)$. Furthermore, note that $\tilde P_{(0)}(*) \cong H^3(T^n,\mathbb{Z})$ in a canonical way. We define $\tilde R_n(1) \subset \tilde R_n(0)$ to be the component which corresponds to $0 \in H^3(T^n,\mathbb{Z})$. Restricting the pair $[\tilde F_n(0),\tilde h_n(0)]$ gives a pair $[\tilde F_n(1),\tilde h_n(1)]$ over $\tilde R_n(1)$. We let $\tilde P_{(1)}$ be the functor classified by $\tilde R_n(1)$. Observe that $\tilde P_{(1)}(B) \subset \tilde P_{(0)}(B)$ is the set of isomorphism classes of pairs $[F,h]$ such that the restriction of $h$ to the fibers of $F$ vanishes.

5.5. By Lemma 5.1 we have $\tilde P_{(0)}(S^1) \cong H^3(S^1\times T^n)$. By the Künneth formula $H^3(S^1\times T^n,\mathbb{Z}) \cong H^3(T^n,\mathbb{Z})\oplus H^2(T^n,\mathbb{Z})$, and $\pi_1(\tilde R_n(1)) \cong \tilde P_{(1)}(S^1) \cong H^2(T^n,\mathbb{Z})$ corresponds to the second summand. One can check that this bijection is a group homomorphism.

We consider the isomorphism $\phi : \pi_1(\tilde R_n(1)) \xrightarrow{\sim} H^2(T^n,\mathbb{Z})$ as a cohomology class $\phi \in H^1(\tilde R_n(1),H^2(T^n,\mathbb{Z}))$, i.e. as a homotopy class of maps $\phi : \tilde R_n(1) \to K(H^2(T^n,\mathbb{Z}),1)$. We define $\tilde R_n$ as the homotopy pullback
\[
\begin{array}{ccc}
\tilde R_n & \longrightarrow & \tilde R_n(1) \\
\downarrow & & \downarrow{\scriptstyle \phi} \\
{*} & \longrightarrow & K(H^2(T^n,\mathbb{Z}),1) .
\end{array}
\]
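Furthermore, we consider the pull-back
\[
\begin{array}{ccc}
\tilde F_n & \longrightarrow & \tilde F_n(1) \\
\downarrow & & \downarrow \\
\tilde R_n & \longrightarrow & \tilde R_n(1) ,
\end{array}
\]
and we let $\tilde h_n \in H^3(\tilde F_n,\mathbb{Z})$ be the pullback of $\tilde h_n(1)$. Note that by construction and naturality, $\tilde h_n$ pulls back to zero on the fiber of $\tilde F_n \to \tilde R_n$. Since $\tilde R_n$ is simply connected, it then even belongs to the second step $F^2H^3(\tilde F_n,\mathbb{Z})$ of the Leray–Serre filtration associated with $\tilde F_n \to \tilde R_n$. Let $x_1,\dots,x_n \in H^2(\tilde R_n,\mathbb{Z})$ be the Chern classes of $\tilde F_n$.

5.6. Lemma 5.2. The homotopy groups of $\tilde R_n$ are given by
\[
\begin{array}{c|ccccc}
i & 0 & 1 & 2 & 3 & \ge 4 \\ \hline
\pi_i(\tilde R_n) & * & 0 & \mathbb{Z}^{2n} & \mathbb{Z} & 0
\end{array}
\]

Proof. By construction, $\tilde R_n$ is connected and simply connected. The homotopy fiber of $\tilde R_n \to \tilde R_n(1)$ is equivalent to the homotopy fiber of $* \to K(H^2(T^n,\mathbb{Z}),1)$, i.e. to $K(H^2(T^n,\mathbb{Z}),0)$. Hence this map induces an isomorphism $\pi_i(\tilde R_n) \cong \pi_i(\tilde R_n(1))$ for $i \ge 2$.

Note that $K(\mathbb{Z},k)$ is an H-space for each $k$. Hence we have an equivalence $\mathrm{Map}(T^1,K(\mathbb{Z},k)) \simeq K(\mathbb{Z},k)\times\Omega K(\mathbb{Z},k) \simeq K(\mathbb{Z},k)\times K(\mathbb{Z},k-1)$. We use the exponential law to write $\mathrm{Map}(T^n,K(\mathbb{Z},3))$ as an iterated mapping space, and we obtain in this way an equivalence
\[
\mathrm{Map}(T^n,K(\mathbb{Z},3)) \simeq K(\mathbb{Z},3)\times K(\mathbb{Z},2)^n\times K(\mathbb{Z},1)^{\frac{n(n-1)}{2}}\times K(\mathbb{Z},0)^{\frac{n(n-1)(n-2)}{6}} .
\]
The long exact sequence of homotopy groups for $\mathrm{Map}(T^n,K(\mathbb{Z},3)) \to \tilde R_n(0) \to K(\mathbb{Z},2)^n$ and the fact that $\pi_3(K(\mathbb{Z},2)^n) \cong 0 \cong \pi_1(K(\mathbb{Z},2)^n)$ and $\pi_2(K(\mathbb{Z},2)^n) \cong \mathbb{Z}^n$ yield the exact sequence
\[
0 \to \mathbb{Z}^n \to \pi_2(\tilde R_n(1)) \to \mathbb{Z}^n \xrightarrow{\;\delta\;} \mathbb{Z}^{\frac{n(n-1)}{2}} \xrightarrow{\;\alpha\;} \pi_1(\tilde R_n(1)) \to 0 .
\]
Furthermore, we observe that $\pi_3(\tilde R_n(1)) \cong \mathbb{Z}$ and $\pi_i(\tilde R_n(1)) = 0$ for $i \ge 4$. We have seen in Sec. 5.5 that $\pi_1(\tilde R_n(1)) \cong H^2(T^n,\mathbb{Z}) \cong \mathbb{Z}^{\frac{n(n-1)}{2}}$. We conclude that $\alpha$ must be surjective. Consequently it is injective and $\delta = 0$. We therefore have an exact sequence
\[
0 \to \mathbb{Z}^n \to \pi_2(\tilde R_n(1)) \to \mathbb{Z}^n \to 0 ,
\]
and this implies that $\pi_2(\tilde R_n(1)) \cong \pi_2(\tilde R_n) \cong \mathbb{Z}^{2n}$.

By duality, the classes $x_1,\dots,x_n \in H^2(\tilde R_n,\mathbb{Z})$ introduced above form half of a $\mathbb{Z}$-basis for $H^2(\tilde R_n,\mathbb{Z}) \cong \mathbb{Z}^{2n}$.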
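The exponents in the product decomposition of $\mathrm{Map}(T^n,K(\mathbb{Z},3))$ used in the proof are the binomial coefficients $\binom{n}{j}$, the ranks of $H^j(T^n,\mathbb{Z})$. A minimal sketch that tabulates them and the resulting rank of $\pi_2$; the function names are hypothetical and chosen only for this illustration.

```python
from math import comb


def mapping_space_exponents(n, k=3):
    """Exponents e with Map(T^n, K(Z,k)) ~ prod_j K(Z, k-j)^{C(n,j)}.

    Returns a dict {degree of the Eilenberg-MacLane factor: exponent}.
    """
    return {k - j: comb(n, j) for j in range(k + 1)}


def rank_pi2(n):
    """Rank of pi_2 read off from 0 -> Z^n -> pi_2 -> Z^n -> 0.

    The sequence splits because the quotient Z^n is free, so the ranks add.
    """
    return n + n


exponents = mapping_space_exponents(5)
# K(Z,3)^1 x K(Z,2)^5 x K(Z,1)^10 x K(Z,0)^10 for n = 5
```

For every $n$ the entries in degrees 1 and 0 agree with the closed forms $\frac{n(n-1)}{2}$ and $\frac{n(n-1)(n-2)}{6}$ appearing in the text.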
5.7. Since the fiber of $p : \tilde R_n \to \tilde R_n(1)$ is equivalent to $K(H^2(T^n,\mathbb{Z}),0) \cong \pi_1(\tilde R_n(1))$ we can consider this map as a universal covering of $\tilde R_n(1)$. We now consider the problem of existence and classification of lifts in the diagram
\[
\begin{array}{ccccc}
 & & \tilde R_n & \longrightarrow & {*} \\
 & {\scriptstyle \tilde f}\nearrow & \downarrow{\scriptstyle p} & & \downarrow \\
B & \xrightarrow{\;f\;} & \tilde R_n(1) & \xrightarrow{\;\phi\;} & K(H^2(T^n,\mathbb{Z}),1) .
\end{array}
\]
It follows from the construction of $\tilde R_n$ that a lift $\tilde f$ exists if and only if $\phi\circ f$ is homotopic to a constant map. The lift itself depends on the choice of an explicit homotopy. If a lift exists, the set of homotopy classes of lifts is a torsor over $H^0(B,H^2(T^n,\mathbb{Z}))$.

5.8. The classification of homotopy classes $\tilde f$ (considered just as maps, not as lifts) lifting a homotopy class $f$ is more subtle. In order to study this problem we assume that $B$ is path connected and equipped with a base point $b \in B$. Let $\tilde f_0$ be a lift of $f$ and consider $x \in \pi_1(\tilde R_n(1)) \cong H^0(B,H^2(T^n,\mathbb{Z}))$. Then we consider the lift $\tilde f_1 = x\tilde f_0$, i.e. the composition of $\tilde f_0$ with the deck transformation associated to $x$.

Assume that $\tilde f_0$ and $\tilde f_1$ are homotopic. Let $H : I\times B \to \tilde R_n$ be a homotopy. Then $p\circ H : S^1\times B \to \tilde R_n(1)$ can be considered as a map $h : B \to \mathrm{Map}(S^1,\tilde R_n(1))$. We have the following diagram
\[
\begin{array}{ccc}
\{b\} & \xrightarrow{\;x\;} & \mathrm{Map}(S^1,\tilde R_n(1)) \\
\downarrow & {\scriptstyle h}\nearrow & \downarrow{\scriptstyle \mathrm{ev}_1} \\
B & \xrightarrow{\;f\;} & \tilde R_n(1) ,
\end{array}
\tag{5.1}
\]
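where $\mathrm{ev}_1 : \mathrm{Map}(S^1,\tilde R_n(1)) \to \tilde R_n(1)$ is the evaluation at $1 \in S^1$.

Vice versa, if $x \in \pi_1(\tilde R_n(1))$ is such that the diagram above admits a lift $h$, then $\tilde f_0$ and $x\tilde f_0$ are homotopic.

5.9. The existence problem for a lift $h$ can be studied using obstruction theory. The fiber of the map $\mathrm{ev}_1$ is $\Omega(\tilde R_n(1))$. In the proof of Lemma 5.2 we have seen that the homotopy groups of $\tilde R_n(1)$ are given by
\[
\begin{array}{c|ccccc}
i & 0 & 1 & 2 & 3 & \ge 4 \\ \hline
\pi_i(\tilde R_n(1)) & * & \mathbb{Z}^{\frac{n(n-1)}{2}} & \mathbb{Z}^{2n} & \mathbb{Z} & 0
\end{array}
\]
It follows that the homotopy groups of $\Omega\tilde R_n(1)$ are given by
\[
\begin{array}{c|cccc}
i & 0 & 1 & 2 & \ge 3 \\ \hline
\pi_i(\Omega\tilde R_n(1)) & \mathbb{Z}^{\frac{n(n-1)}{2}} & \mathbb{Z}^{2n} & \mathbb{Z} & 0
\end{array}
\]
We therefore have obstructions in $H^1(B,\mathbb{Z}^{\frac{n(n-1)}{2}})$, $H^2(B,\mathbb{Z}^{2n})$ and $H^3(B,\mathbb{Z})$, and a general discussion seems to be complicated.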
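The bookkeeping behind the two tables of 5.9 is the degree shift $\pi_i(\Omega X) \cong \pi_{i+1}(X)$ together with the standard fact that obstructions to a lift lie in $H^{i+1}(B;\pi_i(\text{fiber}))$. A short sketch that reproduces the three obstruction groups from the ranks of $\pi_i(\tilde R_n(1))$; the helper names are hypothetical, and only ranks of free abelian groups are tracked.

```python
def ranks_of_R1(n):
    """Ranks of pi_i of the space called R~_n(1) in the text, for i = 1, 2, 3."""
    return {1: n * (n - 1) // 2, 2: 2 * n, 3: 1}


def loop_space_ranks(ranks):
    """pi_i(Omega X) = pi_{i+1}(X): shift every degree down by one."""
    return {i - 1: r for i, r in ranks.items() if i >= 1}


def obstruction_ranks(fiber_ranks):
    """Map cohomological degree i+1 of B to the rank of the coefficient
    group pi_i(fiber); trivial coefficient groups are dropped."""
    return {i + 1: r for i, r in fiber_ranks.items() if r > 0}


n = 3
fiber = loop_space_ranks(ranks_of_R1(n))   # homotopy of the fiber of ev_1
obstructions = obstruction_ranks(fiber)    # degrees 1, 2, 3 with ranks 3, 6, 1
```

For $n = 3$ this gives coefficient ranks 3, 6 and 1 in degrees 1, 2 and 3, matching $\frac{n(n-1)}{2}$, $2n$ and $1$.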
5.10. Let $\tilde P_n$ be the set-valued functor classified by $\tilde R_n$. For each space $B$ we have a natural transformation $p_B : \tilde P_n(B) \to \tilde P_{(1)}(B)$ induced by composition with the map $p : \tilde R_n \to \tilde R_n(1)$.

We conclude from Sec. 5.7 that the fibers of $p_B$ are homogeneous spaces over $H^0(B,H^2(T^n,\mathbb{Z}))$. Consider a class $[F,h] \in \tilde P_{(1)}(B)$. By Sec. 5.7, it belongs to the image of $p_B$ if and only if the restriction $[F,h]|_{B^{(1)}}$ to a 1-skeleton $B^{(1)} \subset B$ is trivial. Since every principal torus bundle on a 1-dimensional complex is trivial, this is equivalent to the condition that the restriction of $h$ to $F|_{B^{(1)}}$ is trivial, or equivalently $h \in F^2H^3(F,\mathbb{Z})$. Let us fix a map $f : B \to \tilde R_n(1)$ representing a pair $[F,h]$ with this property. If we choose a homotopy from $\phi\circ f$ to the constant map, then we distinguish an element in the fiber $p_B^{-1}([F,h])$.

5.11. In Sec. 3 we have introduced a pair $(F_n,h_n)$ over $R_n$ such that $h_n \in F^2H^3(F_n,\mathbb{Z})$. The isomorphism class of this pair gives rise to a classifying map $f(1) : R_n \to \tilde R_n(1)$, well defined up to homotopy.

Lemma 5.3. The set of homotopy classes of maps $f$ which are lifts of $f(1)$ in the diagram
\[
\begin{array}{ccc}
 & & \tilde R_n \\
 & {\scriptstyle f}\nearrow & \downarrow{\scriptstyle p} \\
R_n & \xrightarrow{\;f(1)\;} & \tilde R_n(1)
\end{array}
\]
is a torsor over $H^2(T^n,\mathbb{Z})$.

Proof. Since $R_n$ is simply connected we know that lifts exist. Let now $x \in \pi_1(\tilde R_n(1))$. We choose a base point $b \in R_n$ and identify $\pi_1(\tilde R_n(1)) \cong H^2(F_{n,b},\mathbb{Z})$, where $F_{n,b}$ denotes the fiber of $F_n$ over $b$. In particular we view $x \in H^2(F_{n,b},\mathbb{Z})$. We must show that the existence of a lift $h$ in the diagram (5.1) (with $B$ replaced by $R_n$ and $f$ replaced by $f(1)$) implies that $x = 0$. Assume that a lift $h$ exists, adjoint to a map $H : S^1\times R_n \to \tilde R_n(1)$. This corresponds to a $T^n$-bundle $F \to S^1\times R_n$ and a class $h \in H^3(F,\mathbb{Z})$. Let $\mathrm{pr} : S^1\times R_n \to R_n$ be the projection. Since $R_n$ is simply connected, $\mathrm{pr}$ induces an isomorphism in second cohomology and the bundle $F$ is the pull-back via $\mathrm{pr}$ of a $T^n$-bundle from $R_n$. Since $H$ restricts to $f$ on $\{1\}\times R_n$ and the corresponding $T^n$-bundle is $F_n$, necessarily $F \cong \mathrm{pr}^*F_n = S^1\times F_n$. By the Künneth formula $H^3(F,\mathbb{Z}) \cong H^3(F_n,\mathbb{Z})\oplus H^2(F_n,\mathbb{Z})$, with corresponding decomposition $h = h_n\oplus u$. By the definition of $h$ and the calculation of $\pi_1(\tilde R_n(1))$ in Sec. 5.5, the restriction of $u$ to $F_{n,b}$ is $x$. Since the restriction $H^2(F_n,\mathbb{Z}) \to H^2(F_{n,b},\mathbb{Z})$ is trivial by the description of $H^2(F_n,\mathbb{Z})$ given in Lemma 3.5, it follows that $x = 0$.

5.12. We now fix one choice of $f$ in Lemma 5.3.

Proposition 5.4. The map $f : R_n \to \tilde R_n$ is a weak homotopy equivalence.
Proof. By Lemmas 3.2 and 5.2, it suffices to show that $f$ induces isomorphisms on $\pi_2$ and $\pi_3$.

Note that $H^2(\tilde R_n,\mathbb{Z}) \cong \mathrm{Hom}_{\mathbb{Z}}(\pi_2(\tilde R_n),\mathbb{Z}) \cong \mathbb{Z}^{2n}$. We have a natural map $x : \tilde R_n \to K(\mathbb{Z}^n,2)$ which classifies the $T^n$-principal bundle $\tilde F_n \to \tilde R_n$. We study the corresponding first Chern classes $x_i \in H^2(\tilde R_n,\mathbb{Z})$, $i = 1,\dots,n$.

Let us consider the second page of the Leray–Serre spectral sequence ${}^{\tilde p}E_2^{p,q}$ of the fibration $\tilde p : \tilde F_n \to \tilde R_n$. Since $\tilde h_n \in F^2H^3(\tilde F_n,\mathbb{Z})$, there are elements $\hat x_1,\dots,\hat x_n \in H^2(\tilde R_n,\mathbb{Z})$ such that $\tilde h_n$ is represented by $\tilde h := \sum_i y_i\otimes\hat x_i \in {}^{\tilde p}E_2^{2,1}$, where the $y_i$ are the canonical generators of $H^1(T^n,\mathbb{Z})$ (for the fiber).

Under the pull-back induced by $f$ the spectral sequence ${}^{\tilde p}E$ is mapped to the spectral sequence ${}^{\pi_n}E$ considered in 3.6. In particular $[\tilde h] \in {}^{\tilde p}E_\infty^{2,1}$ is mapped to $[\sum_{i=1}^n y_i\otimes\hat c_i] \in {}^{\pi_n}E_\infty^{2,1}$. We conclude that $f^*\hat x_i = \hat c_i$. It follows that $f^* : H^2(\tilde R_n,\mathbb{Z}) \to H^2(R_n,\mathbb{Z})$ maps $(x_1,\dots,x_n,\hat x_1,\dots,\hat x_n)$ to the basis $(c_1,\dots,c_n,\hat c_1,\dots,\hat c_n)$ and is therefore surjective and injective. This implies that $f_* : \pi_2(R_n) \to \pi_2(\tilde R_n)$ is an isomorphism.

It now suffices to show that $f_* : \pi_3(R_n) \to \pi_3(\tilde R_n)$ is surjective. A generator $\tilde g \in \pi_3(\tilde R_n)$ is represented by a map $\tilde g : S^3 \to \tilde R_n$. The corresponding pair is the trivial torus bundle $\mathrm{pr}_1 : S^3\times T^n \to S^3$ with the cohomology class of the form $h = \mathrm{pr}_1^*z$ for some $z \in H^3(S^3,\mathbb{Z})$ which is a generator.

It suffices to show that the isomorphism class of pairs $[S^3\times T^n,\mathrm{pr}_1^*z]$ is the pull-back of $[F_n,h_n]$ on $R_n$. We consider the composition $g : S^3 \to K(\mathbb{Z},3) \xrightarrow{\,i\,} R_n$, where the map $i$ was defined in 3.7 and the first map realizes a generator of $\pi_3(K(\mathbb{Z},3))$. It then follows immediately from Lemma 3.6 that $g^*[F_n,h_n] = [S^3\times T^n,\pm\mathrm{pr}_1^*z]$. Choosing the opposite generator of $\pi_3(K(\mathbb{Z},3))$, if necessary, the assertion follows.

5.13. Corollary 5.5. The functors $P_n$ classified by $R_n$ and $\tilde P_n$ classified by $\tilde R_n$ are naturally isomorphic. The group $H^2(T^n,\mathbb{Z})$ acts freely on the set of such isomorphisms, and it acts transitively if we fix the composition with $p_* : \tilde P_n \to \tilde P_{(1)}$.

5.14. We consider a pair $(F,h)$ over a space $B$. Recall the notion of an extension of a pair to a T-duality triple in 2.14.

Theorem 5.6. The pair $(F,h)$ admits an extension to a T-duality triple $((F,\mathcal{H}),(\hat F,\hat{\mathcal{H}}),u)$ if and only if $h \in F^2H^3(F,\mathbb{Z})$.

Proof. The condition is necessary by definition of a triple. We show that it is also sufficient. Let $g : \tilde R_n \to R_n$ be a homotopy inverse of a choice of $f$ in Proposition 5.4. Then $g^*((F_n,\mathcal{H}_n),(\hat F_n,\hat{\mathcal{H}}_n),u_n)$ is a T-duality triple over $\tilde R_n$ such that $g^*(F_n,[\mathcal{H}_n])$ is isomorphic to $(\tilde F_n,\tilde h_n)$. Therefore we can consider $g^*((F_n,\mathcal{H}_n),(\hat F_n,\hat{\mathcal{H}}_n),u_n)$ as an extension of $(\tilde F_n,\tilde h_n)$ to a T-duality triple.
Let now $(F,h)$ be a pair over $B$ with classifying map $\phi : B \to \tilde R_n(0)$. Every lift $\Psi$ in the diagram
\[
\begin{array}{ccc}
 & & \tilde R_n \\
 & {\scriptstyle \Psi}\nearrow & \downarrow \\
B & \xrightarrow{\;\phi\;} & \tilde R_n(0)
\end{array}
\]
yields an extension $\Psi^*g^*((F_n,\mathcal{H}_n),(\hat F_n,\hat{\mathcal{H}}_n),u_n)$ of $(F,h)$ to a T-duality triple. The lifting problem can be decomposed into two stages
\[
\begin{array}{ccc}
B & \xrightarrow{\;\Psi\;} & \tilde R_n \\
 & & \downarrow{\scriptstyle p} \\
B & \xrightarrow{\;\Psi_1\;} & \tilde R_n(1) \\
 & & \downarrow \\
B & \xrightarrow{\;\phi\;} & \tilde R_n(0) .
\end{array}
\]
The lift $\Psi_1$ exists since $h \in F^1H^3(F,\mathbb{Z})$. Since $p$ is the universal covering map, the stronger condition $h \in F^2H^3(F,\mathbb{Z})$ implies the existence of the lift $\Psi$ in the second stage.

6. T-Duality transformations in twisted cohomology

6.1. In [6, Sec. 3.1], we have introduced a set of axioms which describe the basic properties of twisted cohomology theories. We will only use these axioms below. Let $h(\dots,\dots)$ be a twisted cohomology theory.

6.2. Let $((F,\mathcal{H}),(\hat F,\hat{\mathcal{H}}),u)$ be a T-duality triple over a space $B$. It gives rise to the diagram
\[
\begin{array}{ccc}
F\times_B\hat F & \xrightarrow{\;\hat p\;} & \hat F \\
{\scriptstyle p}\downarrow & & \downarrow{\scriptstyle \hat\pi} \\
F & \xrightarrow{\;\pi\;} & B
\end{array}
\tag{6.1}
\]
with diagonal $r : F\times_B\hat F \to B$. Recall Definition 2.18 of the T-duality transformation
\[
T := \hat p_!\circ u^*\circ p^* : h(F,\mathcal{H}) \to h(\hat F,\hat{\mathcal{H}}) .
\]

6.3. There is a unique 1-dimensional T-duality triple over a point. Recall the following definition from [6].

Definition 6.1. The twisted cohomology theory $h$ is called T-admissible, if the T-duality transformation associated with the unique one-dimensional T-duality triple over a point is an isomorphism (of degree $-1$).
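6.4. Theorem 6.2. If $((F,\mathcal{H}),(\hat F,\hat{\mathcal{H}}),u)$ is a T-duality triple over a finite dimensional CW-complex $B$, and if $h$ is a T-admissible twisted cohomology theory, then the T-duality transformation of 2.18 is an isomorphism.

Proof. Exactly as in [6, Proof of Theorem 3.13], one uses induction on the dimension of the base space, the Mayer–Vietoris exact sequence and the 5-lemma to reduce this to the case $B = *$. For $B = *$ one observes that the $n$-dimensional T-duality transformation is an iterated 1-dimensional T-duality transformation which is an isomorphism by the definition of T-admissibility.

7. Classification of T-duality triples and extensions

7.1. Recall from Sec. 2.8 the definition of the functor $B \mapsto \mathrm{Triple}_n(B)$ which associates to a space the set of isomorphism classes (see Definition 4.5) of $n$-dimensional T-duality triples.

Lemma 7.1. The functor $B \mapsto \mathrm{Triple}_n(B)$ is homotopy invariant.

Proof. Let $x = ((F,\mathcal{H}),(\hat F,\hat{\mathcal{H}}),u)$ be a triple over $B$. Let $H : [0,1]\times B \to B$ be a map. Let $f_0, f_1$ denote the evaluations of $H$ at the endpoints of the interval. We must show that $f_0^*x \cong f_1^*x$. Let $\mathrm{pr} : [0,1]\times B \to B$ denote the projection. In the first step we choose bundle isomorphisms which extend the identity at $0$:
\[
\begin{array}{ccc}
H^*F & \xleftarrow{\;\psi\;} & \mathrm{pr}^*f_0^*F \\
\downarrow & & \downarrow \\
{[0,1]\times B} & = & [0,1]\times B
\end{array}
\qquad
\begin{array}{ccc}
H^*\hat F & \xleftarrow{\;\hat\psi\;} & \mathrm{pr}^*f_0^*\hat F \\
\downarrow & & \downarrow \\
{[0,1]\times B} & = & [0,1]\times B .
\end{array}
\]
Evaluation at $\{1\}\times B$ gives isomorphisms $\psi_1 : f_0^*F \to f_1^*F$ and $\hat\psi_1 : f_0^*\hat F \to f_1^*\hat F$. Let $\Phi_0 : f_0^*F \to F$, $\hat\Phi_0 : f_0^*\hat F \to \hat F$, $\Phi_1 : f_1^*F \to F$, $\hat\Phi_1 : f_1^*\hat F \to \hat F$, $\gamma : H^*F \to F$, and $\hat\gamma : H^*\hat F \to \hat F$ denote the induced maps. Let $J_i : f_0^*F \to \mathrm{pr}^*f_0^*F$ and $\hat J_i : f_0^*\hat F \to \mathrm{pr}^*f_0^*\hat F$ be the canonical embeddings over $B \to \{i\}\times B \subset [0,1]\times B$, $i = 0,1$. Since $\gamma\circ\psi\circ J_0 = \Phi_0$ and $\hat\gamma\circ\hat\psi\circ\hat J_0 = \hat\Phi_0$ we have canonical isomorphisms of twists
\[
J_0^*\psi^*\gamma^*\mathcal{H} \cong \Phi_0^*\mathcal{H}, \qquad \hat J_0^*\hat\psi^*\hat\gamma^*\hat{\mathcal{H}} \cong \hat\Phi_0^*\hat{\mathcal{H}} .
\]
These uniquely extend to isomorphisms of twists
\[
w : \psi^*\gamma^*\mathcal{H} \to \mathrm{pr}^*\Phi_0^*\mathcal{H}, \qquad \hat w : \hat\psi^*\hat\gamma^*\hat{\mathcal{H}} \to \mathrm{pr}^*\hat\Phi_0^*\hat{\mathcal{H}} .
\]
We now restrict to $\{1\}\times B$ and obtain isomorphisms of twists
\[
v : \psi_1^*\Phi_1^*\mathcal{H} \cong J_1^*\psi^*\gamma^*\mathcal{H} \xrightarrow{\;J_1^*w\;} J_1^*\mathrm{pr}^*\Phi_0^*\mathcal{H} \cong \Phi_0^*\mathcal{H}
\]
and
\[
\hat v : \hat\psi_1^*\hat\Phi_1^*\hat{\mathcal{H}} \cong \hat J_1^*\hat\psi^*\hat\gamma^*\hat{\mathcal{H}} \xrightarrow{\;\hat J_1^*\hat w\;} \hat J_1^*\mathrm{pr}^*\hat\Phi_0^*\hat{\mathcal{H}} \cong \hat\Phi_0^*\hat{\mathcal{H}} .
\]
We have the following diagram of isomorphisms of twists over $I\times(f_0^*F\times_B f_0^*\hat F)$:
\[
\begin{array}{ccc}
\hat p^*\mathrm{pr}^*\hat\Phi_0^*\hat{\mathcal{H}} & \xrightarrow{\;\mathrm{pr}^*(\Phi_0,\hat\Phi_0)^*u\;} & p^*\mathrm{pr}^*\Phi_0^*\mathcal{H} \\
{\scriptstyle \hat p^*\hat w}\uparrow & & \uparrow{\scriptstyle p^*w} \\
\hat p^*\hat\psi^*\hat\gamma^*\hat{\mathcal{H}} & \xrightarrow{\;(\gamma\circ\psi,\hat\gamma\circ\hat\psi)^*u\;} & p^*\psi^*\gamma^*\mathcal{H} .
\end{array}
\]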
Here $p : I\times(f_0^*F\times_B f_0^*\hat F) \to I\times f_0^*F$ and $\hat p : I\times(f_0^*F\times_B f_0^*\hat F) \to I\times f_0^*\hat F$ are the projections. This diagram commutes after restriction to $\{0\}\times B$. Hence it commutes, and so does its restriction to $\{1\}\times B$. The latter restriction gives
\[
\begin{array}{ccc}
\hat p_1^*\hat\Phi_0^*\hat{\mathcal{H}} & \xrightarrow{\;(\Phi_0,\hat\Phi_0)^*u\;} & p_1^*\Phi_0^*\mathcal{H} \\
{\scriptstyle \hat p_1^*\hat v}\uparrow & & \uparrow{\scriptstyle p_1^*v} \\
\hat p_1^*\hat\psi_1^*\hat\Phi_1^*\hat{\mathcal{H}} & \xrightarrow{\;(\psi_1,\hat\psi_1)^*(\Phi_1,\hat\Phi_1)^*u\;} & p_1^*\psi_1^*\Phi_1^*\mathcal{H}
\end{array}
\]
with the projections $p_1 : f_0^*F\times_B f_0^*\hat F \to f_0^*F$ and $\hat p_1 : f_0^*F\times_B f_0^*\hat F \to f_0^*\hat F$. This shows that the bundle isomorphisms $\psi_1, \hat\psi_1$ and the isomorphisms of twists $v, \hat v$ form an isomorphism of triples $f_0^*x \cong f_1^*x$.

7.2. Let us fix the $T^n$-bundles $F$ and $\hat F$ over $B$. Then we can consider triples of the form $x := ((F,\mathcal{H}),(\hat F,\hat{\mathcal{H}}),u)$, $x' := ((F,\mathcal{H}'),(\hat F,\hat{\mathcal{H}}'),u')$.

Definition 7.2. We say that $x$ is isomorphic to $x'$ over $(F,\hat F)$ if there exists an isomorphism of triples $x \cong x'$ such that the underlying bundle isomorphisms are the identity maps (see 4.5).

Let $\mathrm{Triple}_n^{(F,\hat F)}(B)$ denote the set of isomorphism classes of triples over $(F,\hat F)$.

7.3. We have a natural action of $H^3(B,\mathbb{Z})$ on $\mathrm{Triple}_n^{(F,\hat F)}(B)$. Let $\delta \in H^3(B,\mathbb{Z})$ and choose a twist $\mathcal{V}$ in the corresponding isomorphism class. If $x := ((F,\mathcal{H}),(\hat F,\hat{\mathcal{H}}),u)$ represents $[x] \in \mathrm{Triple}_n^{(F,\hat F)}(B)$, then we define the triple
\[
x + \mathcal{V} := ((F,\mathcal{H}\otimes\pi^*\mathcal{V}),(\hat F,\hat{\mathcal{H}}\otimes\hat\pi^*\mathcal{V}),u\otimes r^*\mathrm{id}_{\mathcal{V}}), \tag{7.1}
\]
and $[x] + \delta := [x + \mathcal{V}]$, where $\pi : F \to B$, $\hat\pi : \hat F \to B$, and $r : F\times_B\hat F \to B$ are the projections.

Proposition 7.3. $H^3(B,\mathbb{Z})$ acts freely and transitively on the set $\mathrm{Triple}_n^{(F,\hat F)}(B)$.

Proof. We prove Proposition 7.3 in several steps. First recall the following principle, which we will employ several times. Let a group $G$ act on a set $S$. If $\tau$ is an equivalence relation on $S$ preserved by $G$, then $G$ acts freely and transitively on $S$ if and only if the two following conditions hold:
(1) $G$ acts transitively on the set $S/\tau$ of equivalence classes,
(2) the isotropy group of one (and hence every) equivalence class acts freely and transitively on this equivalence class.

7.4. Since $F$ and $\hat F$ are fixed at the moment, we use the notation $[\mathcal{H},\hat{\mathcal{H}},u]$ for the class $[(F,\mathcal{H}),(\hat F,\hat{\mathcal{H}}),u]$. Let us first introduce an equivalence relation $\tau_1$ on the set $\mathrm{Triple}_n^{(F,\hat F)}(B)$ by defining that $[\mathcal{H},\hat{\mathcal{H}},u]$ is equivalent to $[\mathcal{H}',\hat{\mathcal{H}}',u']$ if and only if $\mathcal{H} \cong \mathcal{H}'$ and $\hat{\mathcal{H}} \cong \hat{\mathcal{H}}'$. The action of $H^3(B,\mathbb{Z})$ preserves this equivalence relation.
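The principle recalled in the proof of Proposition 7.3 (free and transitive action checked through an invariant equivalence relation) can be tried out on a small finite example. This is only a toy illustration with hypothetical names: the cyclic group $\mathbb{Z}/6$ acting by translation, with the equivalence relation given by parity. It is not the action on triples studied in the text.

```python
from itertools import product


def acts_freely_transitively(group, points, act):
    """True if for every pair (s, t) exactly one g satisfies act(g, s) == t."""
    return all(
        sum(1 for g in group if act(g, s) == t) == 1
        for s, t in product(points, repeat=2)
    )


def check_via_quotient(group, points, act, cls):
    """Check conditions (1) and (2) for the relation s ~ t iff cls(s) == cls(t).

    The relation is assumed to be preserved by the group action.
    """
    classes = {cls(s) for s in points}
    # (1) the group acts transitively on the set of equivalence classes
    transitive_on_classes = all(
        any(cls(act(g, s)) == c for g in group)
        for s in points
        for c in classes
    )
    # (2) the isotropy group of one class acts freely and transitively on it
    s0 = points[0]
    members = [s for s in points if cls(s) == cls(s0)]
    isotropy = [g for g in group if all(cls(act(g, s)) == cls(s0) for s in members)]
    return transitive_on_classes and acts_freely_transitively(isotropy, members, act)


N = 6
G = list(range(N))


def translate(g, s):
    return (g + s) % N


def parity(s):
    return s % 2


direct = acts_freely_transitively(G, list(range(N)), translate)
staged = check_via_quotient(G, list(range(N)), translate, parity)
```

In this example the direct check and the two-stage check agree; replacing the action by one with nontrivial stabilizers makes both fail.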
Lemma 7.4. The action of H 3 (B, Z) on the set of equivalence classes (F,Fˆ )
Triplen
(B)/τ1 is transitive with stabilizer ker(π ∗ ) ∩ ker(ˆ π ∗ ).
Proof. The assertion about the stabilizer is an immediate consequence of the definitions. We now verify transitivity of the action on the set of equivalence classes. If Fˆ ) ˆ u] ∈ Triple(F, [H, H, (B), then by the definition of a T -duality triple, the leading n
n ˆ 2,1 is yi ⊗ cˆi ] ∈ π E 2,1 of [H] is determined by Fˆ , and similar [H] part [H]2,1 = [ i=1
∞
Fˆ ) ˆ , u ] ∈ Triple(F, determined by F . Let us consider a second class [H , H (B). n It follows from the structure of the spectral sequences that there are classes ˆ ˆ ] = [H] ˆ +π ˆ ∗ δ. δ, δˆ ∈ H 3 (B, Z) such that [H ] = [H] + π ∗ δ and [H
Lemma 7.5. We have $\delta - \hat\delta \in \ker(\pi^*) + \ker(\hat\pi^*)$.

Proof. We assume without loss of generality that $B$ is connected and choose a base point $b \in B$. Because of the presence of the isomorphisms $u$ and $u'$ of the pull-backs of the twists to $F \times_B \hat F$ we know that $r^*(\delta - \hat\delta) = 0$ (with $r$ as in Eq. (7.1)). We choose twists $V$ and $\hat V$ representing $\delta$ and $\hat\delta$, respectively. For our question, we can replace $H'$ by $H \otimes \pi^*V$ and $\hat H'$ by $\hat H \otimes \hat\pi^*\hat V$. Thus, with the choice of an appropriate isomorphism of twists $w : r^*V \to r^*\hat V$ we can write $u' = u \otimes w$. Note that we have canonical isomorphisms $V|_{\{b\}} \cong 0$, $\hat V|_{\{b\}} \cong 0$. Therefore we can consider
\[
\bigl(w(b) : 0 \xrightarrow{\ \sim\ } r^*V|_{F_b \times \hat F_b} \xrightarrow{\ w|_{F_b \times \hat F_b}\ } r^*\hat V|_{F_b \times \hat F_b} \xrightarrow{\ \sim\ } 0\bigr) \in H^2(F_b \times \hat F_b, \mathbb{Z}).
\]
We know by Lemma A.1 that ${}^{r}d_2^{0,2}(w(b)) = 0$ and ${}^{r}d_3^{0,2}(w(b)) = \delta - \hat\delta + \mathrm{im}({}^{r}d_2^{1,1})$. Let $W := H^2(F_b \times \hat F_b, \mathbb{Z})/(\mathrm{im}(p_b^*) + \mathrm{im}(\hat p_b^*))$, and let $[\ldots]$ denote for the moment classes in this quotient. Note that by 2.5 we have well-defined classes $[u(b)], [u'(b)] \in W$. Since $u$ and $u'$ satisfy the condition P we have $[u(b)] = [u'(b)]$, and it follows that $[w(b)] = [u'(b)] - [u(b)] = 0$. Accordingly, we choose a decomposition $w(b) = p_b^*(d) + \hat p_b^*(\hat d)$ for some $d \in H^2(F_b, \mathbb{Z})$ and $\hat d \in H^2(\hat F_b, \mathbb{Z})$. Then
\[
0 = {}^{r}d_2^{0,2}(w(b)) = {}^{r}d_2^{0,2}(p_b^*(d)) + {}^{r}d_2^{0,2}(\hat p_b^*(\hat d)).
\]
Since $\mathrm{im}({}^{r}d_2^{0,2} p_b^*) \cap \mathrm{im}({}^{r}d_2^{0,2} \hat p_b^*) = 0$, we get ${}^{\pi}d_2^{0,2}(d) = 0$ and ${}^{\hat\pi}d_2^{0,2}(\hat d) = 0$. Therefore
\[
\delta - \hat\delta + \mathrm{im}({}^{r}d_2^{1,1}) = {}^{r}d_3^{0,2}(w(b)) = {}^{\pi}d_3^{0,2}(d) + {}^{\hat\pi}d_3^{0,2}(\hat d) + \mathrm{im}({}^{r}d_2^{1,1}).
\]
Finally we use that $\mathrm{im}({}^{r}d_2^{1,1}) = \mathrm{im}({}^{\pi}d_2^{1,1}) + \mathrm{im}({}^{\hat\pi}d_2^{1,1})$ since $H^1(F_b \times \hat F_b, \mathbb{Z}) \cong H^1(F_b, \mathbb{Z}) \oplus H^1(\hat F_b, \mathbb{Z})$. Using the relation between the images of ${}^{\pi}d_*^{*,*}$, ${}^{\hat\pi}d_*^{*,*}$ and the kernels of $\pi^*$ and $\hat\pi^*$ we obtain the assertion.

Write now $\delta - \hat\delta = -a + b$ with $a \in \ker(\pi^*)$ and $b \in \ker(\hat\pi^*)$. Then $e := \delta + a = \hat\delta + b$, and $[H'] = [H] + \pi^*\delta = [H] + \pi^*e$, $[\hat H'] = [\hat H] + \hat\pi^*\hat\delta = [\hat H] + \hat\pi^*e$. It follows that $[H', \hat H', u'] \sim_{\tau_1} [H, \hat H, u] + e$. This proves the transitivity statement of Lemma 7.4.
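The bookkeeping behind this last step can be spelled out as a short worked computation. It only restates the argument, using that $\pi^*a = 0$ and $\hat\pi^*b = 0$ by the choice of $a$ and $b$.

```latex
\begin{align*}
  \delta - \hat\delta = -a + b
    &\;\Longrightarrow\; e := \delta + a = \hat\delta + b, \\
  \pi^* e &= \pi^*\delta + \pi^* a = \pi^*\delta, \\
  \hat\pi^* e &= \hat\pi^*\hat\delta + \hat\pi^* b = \hat\pi^*\hat\delta .
\end{align*}
```

Hence the two a priori different classes $\delta$ and $\hat\delta$ can be replaced by the single class $e$, which is what is needed for an element of $H^3(B,\mathbb{Z})$ to move one $\tau_1$-class to the other.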
7.5. Fix now an equivalence class for $\tau_1$. In fact, we can fix the corresponding twists $H$ and $\hat H$. The isomorphism classes over $(F, \hat F)$ in the given $\tau_1$-equivalence class can be represented in the form $[u] := [H, \hat H, u]$ for varying $u$. By $\mathrm{Triple}(H, \hat H)$ we denote the set of these isomorphism classes.

We introduce an equivalence relation $\tau_2$ on $\mathrm{Triple}(H, \hat H)$. Let $i : F_b \times \hat F_b \to F \times_B \hat F$ be the inclusion of the fiber over the base point $b \in B$. We declare $[u], [u'] \in \mathrm{Triple}(H, \hat H)$ to be $\tau_2$-equivalent if and only if the class $w \in H^2(F \times_B \hat F, \mathbb{Z})$ which is determined by $u + w = u'$ satisfies
\[
i^*w \in i^*\bigl(p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z})\bigr). \tag{7.2}
\]
Note that $[u_1] = [u_2] \in \mathrm{Triple}(H, \hat H)$ if and only if there are $a \in H^2(F,\mathbb{Z})$, $\hat a \in H^2(\hat F,\mathbb{Z})$ (representing automorphisms of $H$ and $\hat H$, respectively) such that $u_1 = p^*a + \hat p^*\hat a + u_2$. It follows that, although the class $w \in H^2(F \times_B \hat F, \mathbb{Z})$ is not determined by the classes $[u], [u'] \in \mathrm{Triple}(H, \hat H)$, the condition (7.2) is well-defined, and it defines an equivalence relation.
Lemma 7.6. The subgroup $\ker(\pi^*) \cap \ker(\hat\pi^*) \subset H^3(B,\mathbb{Z})$ preserves $\tau_2$, acts transitively on $\mathrm{Triple}(H, \hat H)/\tau_2$, and the isotropy subgroup of each $\tau_2$-equivalence class is $\mathrm{im}({}^{\pi}d_2^{1,1}) \cap \mathrm{im}({}^{\hat\pi}d_2^{1,1}) \subset \ker(\pi^*) \cap \ker(\hat\pi^*) \subset H^3(B,\mathbb{Z})$.

Proof. Choose $\delta \in \ker(\pi^*) \cap \ker(\hat\pi^*) \subset H^3(B,\mathbb{Z})$ and a twist $V$ representing $\delta$. In order to describe the action of $\delta$ on $\mathrm{Triple}(H, \hat H)$, we choose trivializations $w : \pi^*V \xrightarrow{\sim} 0$ and $\hat w : \hat\pi^*V \xrightarrow{\sim} 0$. If $[u] \in \mathrm{Triple}(H, \hat H)$, then $[u] + \delta$ is represented by $[u \otimes \hat p^*\hat w \circ p^*w^{-1}]$. We introduce the cohomology class $v := \hat p^*\hat w \circ p^*w^{-1} \in H^2(F \times_B \hat F, \mathbb{Z})$. Assume that $[u] \sim_{\tau_2} [u']$, i.e. the class $w \in H^2(F \times_B \hat F, \mathbb{Z})$ defined by $u' = u + w$ satisfies condition (7.2). Then we have $[u] + \delta \sim_{\tau_2} [u'] + \delta$, since $[u] + \delta = [u + v]$, $[u'] + \delta = [u' + v]$, and $u' + v = u + v + w$. This shows that the action of $\ker(\pi^*) \cap \ker(\hat\pi^*)$ preserves $\tau_2$.

We now calculate the stabilizer of the $\tau_2$-equivalence classes. We choose $\delta \in \ker(\pi^*) \cap \ker(\hat\pi^*)$ fixing $[u]$, and continue to use the above notation. The restriction of $\pi^*V$ to the fiber $F_b$ is canonically trivial. We can therefore consider $x := -w|_{F_b} \in H^2(F_b, \mathbb{Z})$. Similarly we have $\hat x := \hat w|_{\hat F_b} \in H^2(\hat F_b, \mathbb{Z})$. We can now write $i^*v = p_b^*(x) + \hat p_b^*(\hat x)$. By Lemma A.1, ${}^{\pi}d_2^{0,2}(x) = 0$, ${}^{\hat\pi}d_2^{0,2}(\hat x) = 0$, ${}^{\pi}d_3^{0,2}(x) = -[\delta] \in H^3(B,\mathbb{Z})/\mathrm{im}({}^{\pi}d_2^{1,1})$, and ${}^{\hat\pi}d_3^{0,2}(\hat x) = [\delta] \in H^3(B,\mathbb{Z})/\mathrm{im}({}^{\hat\pi}d_2^{1,1})$. Note that $x$ is well-defined modulo restrictions of classes in $H^2(F,\mathbb{Z})$ to $F_b$. Similarly, $\hat x$ is well-defined modulo restrictions of $H^2(\hat F,\mathbb{Z})$ to $\hat F_b$. Therefore the condition that ${}^{\pi}d_3^{0,2}(x) = 0$ and ${}^{\hat\pi}d_3^{0,2}(\hat x) = 0$ is independent of all choices. Furthermore, it is equivalent to $\delta \in \mathrm{im}({}^{\pi}d_2^{1,1}) \cap \mathrm{im}({}^{\hat\pi}d_2^{1,1})$. This holds if and only if $x$ is a restriction of a class in $H^2(F,\mathbb{Z})$ to $F_b$, and $\hat x$ is the restriction of a class in $H^2(\hat F,\mathbb{Z})$ to $\hat F_b$.

We see that $\delta \in \mathrm{im}({}^{\pi}d_2^{1,1}) \cap \mathrm{im}({}^{\hat\pi}d_2^{1,1})$ implies that $i^*v \in i^*p^*H^2(F,\mathbb{Z}) + i^*\hat p^*H^2(\hat F,\mathbb{Z})$, and therefore $[u] + \delta \sim_{\tau_2} [u]$. Vice versa, if $i^*v \in i^*p^*H^2(F,\mathbb{Z}) + i^*\hat p^*H^2(\hat F,\mathbb{Z})$, then we write $i^*v = i^*p^*y + i^*\hat p^*\hat y$ for $y \in H^2(F,\mathbb{Z})$ and $\hat y \in$
7.7.
$H^2(\hat F,\mathbb{Z})$. We then have $p_b^*(x) + \hat p_b^*(\hat x) = p_b^*(y|_{F_b}) + \hat p_b^*(\hat y|_{\hat F_b})$. This implies that $x = y|_{F_b}$ and $\hat x = \hat y|_{\hat F_b}$. We now see that ${}^{\pi}d_3^{0,2}(x) = 0$ and ${}^{\hat\pi}d_3^{0,2}(\hat x) = 0$, and therefore $\delta \in \mathrm{im}({}^{\pi}d_2^{1,1}) \cap \mathrm{im}({}^{\hat\pi}d_2^{1,1})$.

It remains to prove transitivity. Choose a second class $[u'] \in \mathrm{Triple}(H, \hat H)$. Then we have $u' = u + h$ with $h \in H^2(F \times_B \hat F, \mathbb{Z})$ such that $c := i^*h$ is of the form $c = p_b^*(a) + \hat p_b^*(\hat a)$ for $(a, \hat a) \in H^2(F_b, \mathbb{Z}) \oplus H^2(\hat F_b, \mathbb{Z})$. Since $c$ is in the image of $i^*$ we have $0 = {}^{r}d_2^{0,2}(c) = {}^{r}d_2^{0,2}(p_b^*(a)) + {}^{r}d_2^{0,2}(\hat p_b^*(\hat a))$. Consequently, ${}^{\pi}d_2^{0,2}(a) = 0$ and ${}^{\hat\pi}d_2^{0,2}(\hat a) = 0$. Set $\delta := {}^{\hat\pi}d_3^{0,2}(\hat a)$. Then $\delta = -{}^{\pi}d_3^{0,2}(a)$. It follows that $\delta \in \ker(\pi^*) \cap \ker(\hat\pi^*)$. By Lemma A.1 and similarly to the above considerations $[u] + \delta \sim_{\tau_2} [u']$.
7.6. We now fix a class $[u_0] \in \mathrm{Triple}(H, \hat H)$ and let
\[
U := \{[u] \in \mathrm{Triple}(H, \hat H) \mid [u] \sim_{\tau_2} [u_0]\}
\]
denote the corresponding $\tau_2$-equivalence class.

Lemma 7.7. The subgroup $\ker(i^*) \subset H^2(F \times_B \hat F, \mathbb{Z})$ acts transitively on $U$, and the stabilizer of each element is given by $p^*H^2(F,\mathbb{Z}) \cap \ker(i^*) + \hat p^*H^2(\hat F,\mathbb{Z}) \cap \ker(i^*)$. Alternatively, if $H_{\tau_2} := (i^*)^{-1} i^*(p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z})) \subset H^2(F \times_B \hat F, \mathbb{Z})$, then $H_{\tau_2}$ acts transitively on $U$ with point stabilizers $p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z})$.

Proof. Let $[u] \in U$ and $w \in \ker(i^*)$. Then $w$ obviously satisfies condition (7.2), and therefore we have $[u + w] \sim_{\tau_2} [u]$. Hence $\ker(i^*)$ acts on $U$. Let now $[u], [u'] \in U$. Then $u' = u + w$ and $w$ satisfies condition (7.2). Let us write $i^*w = i^*p^*x + i^*\hat p^*\hat x$ for $x \in H^2(F,\mathbb{Z})$ and $\hat x \in H^2(\hat F,\mathbb{Z})$. We define $w' := w - p^*x - \hat p^*\hat x$. Then we have $[u + w] = [u + w']$ on the one hand, and $i^*w' = 0$ on the other. It follows that $[u'] = [u + w']$. This shows that $\ker(i^*)$ acts transitively on $U$.

Assume that $w \in \ker(i^*)$ and $[u + w] = [u]$. Then we have $w = p^*x + \hat p^*\hat x$ for some $x \in H^2(F,\mathbb{Z})$ and $\hat x \in H^2(\hat F,\mathbb{Z})$. From $0 = i^*w = i^*(p^*x + \hat p^*\hat x)$ we conclude that $i^*p^*x = 0$ and $i^*\hat p^*\hat x = 0$. This shows the assertion about the point stabilizers.

In order to deduce the second assertion of the lemma we shall observe that the natural map $\ker(i^*) \to H_{\tau_2}$ induces an isomorphism
\[
\frac{\ker(i^*)}{p^*H^2(F,\mathbb{Z}) \cap \ker(i^*) + \hat p^*H^2(\hat F,\mathbb{Z}) \cap \ker(i^*)} \xrightarrow{\ \sim\ } \frac{H_{\tau_2}}{p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z})}. \tag{7.3}
\]
Injectivity is clear since $p^*H^2(F,\mathbb{Z}) \cap \ker(i^*) + \hat p^*H^2(\hat F,\mathbb{Z}) \cap \ker(i^*) = (p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z})) \cap \ker(i^*)$. In order to show surjectivity we consider $w \in H_{\tau_2}$. We write $i^*w = i^*p^*x + i^*\hat p^*\hat x$ for some $x \in H^2(F,\mathbb{Z})$ and $\hat x \in H^2(\hat F,\mathbb{Z})$. Then $w' := w - p^*x - \hat p^*\hat x$ represents the same class as $w$ on the right-hand side of Eq. (7.3), and it satisfies $i^*w' = 0$.
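One way to read (7.3), offered as a remark and not as part of the original argument: write $A := \ker(i^*)$ and $C := p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z})$ (both names are introduced only for this remark). The definition of $H_{\tau_2}$ then gives $H_{\tau_2} = A + C$, and, granting the identity used above for injectivity, (7.3) takes the shape of the second isomorphism theorem for abelian groups.

```latex
\begin{align*}
  H_{\tau_2} &= (i^*)^{-1} i^*(C)
      = \{\, w : i^*w = i^*c \text{ for some } c \in C \,\}
      = A + C, \\
  A/(A \cap C) &\xrightarrow{\ \cong\ } (A + C)/C = H_{\tau_2}/C .
\end{align*}
```

The surjectivity step in the proof is exactly the decomposition $w = w' + (p^*x + \hat p^*\hat x)$ with $w' \in A$ and $p^*x + \hat p^*\hat x \in C$.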
We define a map
\[
\mu : \mathrm{im}({}^{\pi}d_2^{1,1}) \cap \mathrm{im}({}^{\hat\pi}d_2^{1,1}) \to H_{\tau_2}/(p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z})) \tag{7.4}
\]
in the following way. Represent $a \in \mathrm{im}({}^{\pi}d_2^{1,1}) \cap \mathrm{im}({}^{\hat\pi}d_2^{1,1}) \subset \ker(\pi^*) \cap \ker(\hat\pi^*) \subset H^3(B,\mathbb{Z})$ by a map $f : B \to K(\mathbb{Z},3)$. Choose a homotopy $H : [0,1] \times F \to K(\mathbb{Z},3)$ between the constant map and $f \circ \pi$, and a homotopy $\hat H : [1,2] \times \hat F \to K(\mathbb{Z},3)$ between $f \circ \hat\pi$ and the constant map. Note that the homotopy classes of these homotopies can be modified by concatenation with homotopies from the constant map to the constant map, i.e. by elements of $H^3(\Sigma F, \mathbb{Z}) \cong H^2(F,\mathbb{Z})$ or of $H^2(\hat F,\mathbb{Z})$, respectively.

We can now pull back $H$ and $\hat H$ to $[0,1] \times F \times_B \hat F$ and $[1,2] \times F \times_B \hat F$. By construction, these two pull-backs coincide at $\{1\} \times F \times_B \hat F$, and therefore they can be concatenated to a map $[0,2] \times F \times_B \hat F \to K(\mathbb{Z},3)$. This concatenation is constant at both ends of the interval $[0,2]$ and thus factors over the suspension $\Sigma((F \times_B \hat F)_+)$. Therefore it represents an element in $H^3(\Sigma((F \times_B \hat F)_+), \mathbb{Z}) \cong H^2(F \times_B \hat F, \mathbb{Z})$. Its class $\mu(a) \in H^2(F \times_B \hat F, \mathbb{Z})/(p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z}))$ is independent of the choice of the homotopies $H$ and $\hat H$. We still have to check that $\mu(a)$ belongs to the subspace $H_{\tau_2}/(p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z}))$. This will follow from a universal construction of $\mu$ for such classes and is therefore postponed until we discuss this universal construction, compare Eq. (7.10).

Observe that this secondary operation actually makes sense for all elements in $\ker(\pi^*) \cap \ker(\hat\pi^*)$, provided that we allow arbitrary values in $H^2(F \times_B \hat F, \mathbb{Z})/(p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z}))$.

7.8. Note that the groups $\mathrm{im}({}^{\pi}d_2^{1,1}) \cap \mathrm{im}({}^{\hat\pi}d_2^{1,1})$ and $H_{\tau_2}/(p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z}))$ act on $U$.

Lemma 7.8. The map $\mu : \mathrm{im}({}^{\pi}d_2^{1,1}) \cap \mathrm{im}({}^{\hat\pi}d_2^{1,1}) \to H_{\tau_2}/(p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z}))$ is compatible with the action of both groups on $U$.

Proof. Choose $[u] \in U$ and $a \in \mathrm{im}({}^{\pi}d_2^{1,1}) \cap \mathrm{im}({}^{\hat\pi}d_2^{1,1})$. As in 7.7 we represent $a$ by a map $f : B \to K(\mathbb{Z},3)$. We choose a twist $\mathcal{K}$ on $K(\mathbb{Z},3)$ such that $[\mathcal{K}] \in H^3(K(\mathbb{Z},3), \mathbb{Z})$ is the canonical generator. The homotopies $H$ and $\hat H$ (see 7.7) give rise to isomorphisms of twists $w : 0 \xrightarrow{\sim} \pi^*f^*\mathcal{K}$ and $\hat w : \hat\pi^*f^*\mathcal{K} \xrightarrow{\sim} 0$. We consider $\hat p^*\hat w \circ p^*w^{-1} \in H^2(F \times_B \hat F, \mathbb{Z})$ and observe that $[u] + a = [u + \hat p^*\hat w \circ p^*w^{-1}]$. The lemma now follows since by construction $\hat w \circ w^{-1}$ represents $\mu(a)$.
7.9. Lemma 7.9. The map $\mu$ is an isomorphism.

By a combination of Lemma 7.9 with Lemmas 7.8 and 7.7 we obtain:

Corollary 7.10. $\mathrm{im}({}^{\pi}d_2^{1,1}) \cap \mathrm{im}({}^{\hat\pi}d_2^{1,1})$ acts freely and transitively on $U$.
Together with Lemmas 7.4 and 7.6, this finishes the proof of Proposition 7.3.
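The group-theoretic principle recalled at the beginning of the proof, and applied there first to $\tau_1$ and then to $\tau_2$, can be checked by brute force on a toy example. The sketch below is only an illustration under stated assumptions: the group $\mathbb{Z}/6$ acting on itself by translation and the relation "congruent mod 3" are hypothetical stand-ins, not objects from the paper.

```python
from itertools import product

def acts_freely_transitively(group, points, act):
    """True if for every pair (s, t) exactly one g in group satisfies act(g, s) == t."""
    return all(
        sum(1 for g in group if act(g, s) == t) == 1
        for s, t in product(points, repeat=2)
    )

# Toy data (an assumption for illustration): G = Z/6 acting on S = Z/6 by translation.
N = 6
G = list(range(N))
S = list(range(N))

def act(g, s):
    return (g + s) % N

# Equivalence relation tau: s ~ t iff s = t mod 3. Translations preserve it.
def cls(s):
    return s % 3

classes = sorted({cls(s) for s in S})

# Condition (1): G acts transitively on the set S/tau of equivalence classes.
transitive_on_classes = all(
    any(cls(act(g, s)) == c for g in G) for s in S for c in classes
)

# Condition (2): the isotropy group of one class acts freely and
# transitively on that class.
c0 = classes[0]
members = [s for s in S if cls(s) == c0]
isotropy = [g for g in G if all(cls(act(g, s)) == c0 for s in members)]
free_transitive_on_class = acts_freely_transitively(isotropy, members, act)

# Direct check of the conclusion: G acts freely and transitively on S.
direct = acts_freely_transitively(G, S, act)
```

In this toy case both conditions hold and the direct check succeeds as well, in line with the principle. In the proof above the role of $G$ is played by $H^3(B,\mathbb{Z})$, and the two-step reduction uses $\tau_1$ on $\mathrm{Triple}_n^{(F,\hat F)}(B)$ followed by $\tau_2$ on $\mathrm{Triple}(H,\hat H)$.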
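7.10. We now prove Lemma 7.9. We have an isomorphism
\[
\ker(i^*)/r^*H^2(B,\mathbb{Z}) \cong \ker({}^{r}d_2^{1,1}). \tag{7.5}
\]
Let $i_b : F_b \to F$ and $\hat i_b : \hat F_b \to \hat F$ be the inclusions of the fibers over the basepoint. Using isomorphisms similar to Eq. (7.5) for $\pi$ and $\hat\pi$ we see that Eq. (7.5) induces an isomorphism
\[
\frac{\ker(i^*)}{p^*(\ker(i_b^*)) + \hat p^*(\ker(\hat i_b^*))} \xrightarrow{\ \cong\ } \frac{\ker({}^{r}d_2^{1,1})}{\ker({}^{\pi}d_2^{1,1}) \oplus \ker({}^{\hat\pi}d_2^{1,1})}.
\]
To prove Lemma 7.9, we now consider the diagram
\[
\begin{array}{ccc}
\dfrac{\ker({}^{r}d_2^{1,1})}{\ker({}^{\pi}d_2^{1,1}) \oplus \ker({}^{\hat\pi}d_2^{1,1})}
 & \xrightarrow{\ {}^{\pi}d_2^{1,1} \circ \mathrm{pr}_1\ } &
 \mathrm{im}({}^{\pi}d_2^{1,1}) \cap \mathrm{im}({}^{\hat\pi}d_2^{1,1}) \\[1ex]
 {\scriptstyle\cong}\,\uparrow & & \downarrow\,{\scriptstyle\mu} \\[1ex]
 \dfrac{\ker(i^*)}{p^*(\ker(i_b^*)) + \hat p^*(\ker(\hat i_b^*))}
 & \xrightarrow[\ \cong\ ]{\ \alpha\ } &
 \dfrac{H_{\tau_2}}{p^*H^2(F,\mathbb{Z}) + \hat p^*H^2(\hat F,\mathbb{Z})}
\end{array}
\tag{7.6}
\]
Here, the upper horizontal map is induced by the projection $\mathrm{pr}_1 : {}^{r}E_2^{1,1} \cong {}^{\pi}E_2^{1,1} \oplus {}^{\hat\pi}E_2^{1,1} \to {}^{\pi}E_2^{1,1}$ composed with the differential. Since ${}^{r}d_2^{1,1} = {}^{\pi}d_2^{1,1} + {}^{\hat\pi}d_2^{1,1}$, on $\ker({}^{r}d_2^{1,1})$ this map indeed maps to the intersection of the two images and is an isomorphism. The map $\alpha$ is the isomorphism (7.3), where we note that $p^*\ker(i_b^*) = p^*H^2(F,\mathbb{Z}) \cap \ker(i^*)$ and $\hat p^*\ker(\hat i_b^*) = \hat p^*H^2(\hat F,\mathbb{Z}) \cap \ker(i^*)$. Consequently, all maps in (7.6) apart from $\mu$ are isomorphisms, and Lemma 7.9 follows immediately from the following lemma.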
Lemma 7.11. The diagram (7.6) commutes (up to sign).

Note that this reduces the proof of Proposition 7.3 to a question about the cohomology of fiber bundles, namely to a precise description of the secondary operation $\mu$.

7.11. To study commutativity of the diagram, we can deal with one element in the lower left corner at a time. Since all the constructions are natural, it therefore suffices to study a universal situation and prove the assertion there. To do this, we first have to identify such a universal situation.

We are given a base space $B$ together with two $T^n$-bundles $F, \hat F$, classified by their first Chern classes $c_1, \ldots, c_n, \hat c_1, \ldots, \hat c_n \in H^2(B,\mathbb{Z})$. Moreover, we have classes $a_1, \ldots, a_n, \hat a_1, \ldots, \hat a_n \in H^1(B,\mathbb{Z})$ such that
\[
{}^{r}d_2^{1,1}\Bigl(\sum_i a_i \otimes y_i - \hat a_i \otimes \hat y_i\Bigr) = \sum_i (a_i \cup c_i - \hat a_i \cup \hat c_i) = 0 \in H^3(B,\mathbb{Z}).
\]

7.12. We now describe the universal case for this kind of data. Let $t : K \to K(\mathbb{Z},2)^{2n} \times K(\mathbb{Z},1)^{2n}$ be the homotopy fiber of the map $K(\mathbb{Z},2)^{2n} \times K(\mathbb{Z},1)^{2n} \to K(\mathbb{Z},3)$ classified by $\sum_i x_i \cup a_i - \hat x_i \cup \hat a_i$, where $x_i, \hat x_i, a_i, \hat a_i$ are the canonical generators of the cohomology of the factors of the source space. The universal $T^n$-bundles classified by $c_i$ and $\hat c_i$ pull back to bundles $\pi^u : F^u \to K$ and $\hat\pi^u : \hat F^u \to K$, and $r^u : F^u \times_K \hat F^u \to K$ is the fiber product. A straightforward calculation with the Leray–Serre spectral sequence of $t$ shows that we have isomorphisms
\[
\begin{aligned}
t^* &: H^1(K(\mathbb{Z},1)^{2n}, \mathbb{Z}) \xrightarrow{\ \cong\ } H^1(K,\mathbb{Z}), \\
t^* &: H^2(K(\mathbb{Z},2)^{2n}, \mathbb{Z}) \oplus H^2(K(\mathbb{Z},1)^{2n}, \mathbb{Z}) \xrightarrow{\ \cong\ } H^2(K,\mathbb{Z}), \\
\bar t^* &: H^3(K(\mathbb{Z},2)^{2n} \times K(\mathbb{Z},1)^{2n}, \mathbb{Z}) \Big/ \Bigl(\sum_i (a_i \cup x_i - \hat a_i \cup \hat x_i)\Bigr)\mathbb{Z} \xrightarrow{\ \cong\ } H^3(K,\mathbb{Z}).
\end{aligned}
\tag{7.7}
\]
Continuing with the Leray–Serre spectral sequences of $r^u$, $\pi^u$ and $\hat\pi^u$, we see that for $k = 2, 3$,
\[
H^k(F^u \times_K \hat F^u, \mathbb{Z}) = \ker(i^*), \qquad H^k(F^u,\mathbb{Z}) = \ker(i_b^*), \qquad H^k(\hat F^u,\mathbb{Z}) = \ker(\hat i_b^*), \tag{7.8}
\]
and that the projections
\[
\ker(i^*)/(r^u)^*H^2(K,\mathbb{Z}) \xrightarrow{\ \cong\ } \ker({}^{r^u}d_2^{1,1}) \xrightarrow{\ \cong\ } \ker({}^{r^u}d_2^{1,1})/(\ker({}^{\pi^u}d_2^{1,1}) \oplus \ker({}^{\hat\pi^u}d_2^{1,1})) \cong \mathbb{Z} \tag{7.9}
\]
all are isomorphisms.

7.13. In the universal situation we have $\ker(i^*) = H_{\tau_2} = H^2(F^u \times_K \hat F^u, \mathbb{Z})$, and consequently
\[
\mathrm{im}(\mu) \subset H_{\tau_2}/(p^*H^2(F^u,\mathbb{Z}) + \hat p^*H^2(\hat F^u,\mathbb{Z})). \tag{7.10}
\]
By naturality this implies that (7.10) holds in general. This adds the missing detail in the construction of $\mu$ of (7.4).

7.14. We choose the representative
\[
g := \sum_i (t^*a_i \otimes y_i - t^*\hat a_i \otimes \hat y_i) \in {}^{r^u}E_2^{1,1}
\]
of the generator (see Eq. (7.9)) of $\ker({}^{r^u}d_2^{1,1})/(\ker({}^{\pi^u}d_2^{1,1}) \oplus \ker({}^{\hat\pi^u}d_2^{1,1}))$, and let $\tilde g \in \ker(i^*)$ be a lift. Let $[\tilde g]$ be the class represented by $\tilde g$ in the lower left corner of (7.6). Note that
\[
{}^{\pi^u}d_2^{1,1} \circ \mathrm{pr}_1(g) = \sum_i t^*a_i \cup t^*x_i = \sum_i t^*\hat a_i \cup t^*\hat x_i. \tag{7.11}
\]
We must show that
\[
\mu\Bigl(\sum_i t^*a_i \cup t^*c_i\Bigr) = \alpha([\tilde g]).
\]
Since the relevant group is cyclic, it suffices to show the equality of the leading parts in ${}^{r^u}E_\infty^{1,1} = \ker({}^{r^u}d_2^{1,1})$, i.e. that
\[
\mu\Bigl(\sum_i t^*a_i \cup t^*x_i\Bigr)^{1,1} = \alpha([\tilde g])^{1,1} = \sum_i (t^*a_i \otimes y_i - t^*\hat a_i \otimes \hat y_i). \tag{7.12}
\]

7.15. We have $(\pi^u)^*t^*x_i = 0$ and $(\hat\pi^u)^*t^*\hat x_i = 0$. Thus we can choose a homotopy $[0,1] \times F^u \to K(\mathbb{Z},2)$ between the constant map and $(\pi^u)^*t^*x_i$, and a homotopy $[1,2] \times \hat F^u \to K(\mathbb{Z},2)$ between $(\hat\pi^u)^*t^*\hat x_i$ and the constant map. Since the transgression of the Chern class of a $U(1)$-bundle is represented by a generator of the first cohomology of the fiber (modulo the image of the first cohomology of the total space) we can choose these homotopies in such a way that their restrictions to $[0,1] \times F^u_b$ and $[1,2] \times \hat F^u_b$ (which are necessarily constant at both ends of the intervals) are the suspensions of the generators $y_i$ or $\hat y_i$, respectively.

We now take the product of the above homotopies with the corresponding maps $(\pi^u)^*t^*a_i$ and $(\hat\pi^u)^*t^*\hat a_i$, respectively, and then the product over the index $i$ in order to get homotopies
\[
h_1 : [0,1] \times F^u \to \prod_{i=1}^n (K(\mathbb{Z},2) \times K(\mathbb{Z},1)), \qquad \hat h_1 : [1,2] \times \hat F^u \to \prod_{i=1}^n (K(\mathbb{Z},2) \times K(\mathbb{Z},1)).
\]
We finally compose with the map $(K(\mathbb{Z},2) \times K(\mathbb{Z},1))^n \to K(\mathbb{Z},3)$ representing $\sum_i x_i \cup a_i$ to get the required homotopies $H_1 : [0,1] \times F^u \to K(\mathbb{Z},3)$ and $\hat H_1 : [1,2] \times \hat F^u \to K(\mathbb{Z},3)$ in the construction of $\mu(\sum_i t^*a_i \cup t^*x_i)$.

7.16. We consider the lifting problem
\[
\begin{array}{ccc}
 & & K \\
 & {}^{\alpha_1}\nearrow & \downarrow\,{\scriptstyle t} \\
K(\mathbb{Z},1) & \xrightarrow{\ w\ } & K(\mathbb{Z},2)^{2n} \times K(\mathbb{Z},1)^{2n}
\end{array}
\]
where $w$ is the inclusion of the first $K(\mathbb{Z},1)$-factor (defined using the base points of the remaining factors). Since $w^*(\sum_i a_i \cup x_i - \hat a_i \cup \hat x_i) = 0$, a lift $\alpha_1$ exists. Since $\alpha_1^*t^*x_i$ and $\alpha_1^*t^*\hat x_i$ are constant, it follows that the bundles $\alpha_1^*F^u$ and $\alpha_1^*\hat F^u$ are trivialized. Since we can choose the map $\sum_i x_i \cup a_i : (K(\mathbb{Z},2) \times K(\mathbb{Z},1))^n \to K(\mathbb{Z},3)$ to factorize over the product of smash products $(K(\mathbb{Z},2) \wedge K(\mathbb{Z},1))^n$ we see that the pull-back of $H_1$ along $[0,1] \times \alpha_1^*F^u \to [0,1] \times F^u$ has a factorization over the suspension $\Sigma(\alpha_1^*F^u_+) \to K(\mathbb{Z},3)$. It represents the suspension of the class $a \cup y_1 \in H^2(\alpha_1^*F^u, \mathbb{Z}) \cong H^2(K(\mathbb{Z},1) \times T^n, \mathbb{Z})$ since $t^*a_i$ pulls back to $\delta_{i1}a_1$, and the null homotopy for $(\pi^u)^*t^*x_1$ gives the suspension of $y_1$ in the fiber by our choices above. In the same way the corresponding pull-back of $\hat H_1$ has a factorization $\Sigma(\alpha_1^*\hat F^u_+) \to K(\mathbb{Z},3)$ which in this case is actually constant and thus represents the suspension of $0 \in H^2(\alpha_1^*\hat F^u, \mathbb{Z})$.

7.17. Let $A_1 : \alpha_1^*(F^u \times_K \hat F^u) \to F^u \times_K \hat F^u$ be the induced map over $\alpha_1$. We let $A_1^*$ denote the induced map on the Leray–Serre spectral sequences. Then we have
by the above discussion that $A_1^*\mu(\sum_i t^*a_i \cup x_i)^{1,1} = a \otimes y_1 \in {}^{\alpha_1^*r^u}E_\infty^{1,1}$. On the other hand $A_1^*(\sum_i (t^*a_i \otimes y_i - t^*\hat a_i \otimes \hat y_i)) = a \otimes y_1$. This shows (7.12), since it is an equality in a cyclic group. This finishes the proof of Lemma 7.11 and therefore of Proposition 7.3.

7.18. We consider the isomorphism class $[x_{n,univ}] \in \mathrm{Triple}_n(R_n)$ of the universal T-duality triple $((F_n, H_n), (\hat F_n, \hat H_n), u_n)$ (see Theorem 4.6). Furthermore, we consider the functor $B \mapsto P_n(B) := [B, R_n]$ classified by $R_n$. The element $[x_{n,univ}] \in \mathrm{Triple}_n(R_n)$ induces a natural transformation of functors $\Psi_B : P_n(B) \to \mathrm{Triple}_n(B)$ which maps $[f] \in P_n(B)$ to $\Psi_B(f) := f^*[x_{n,univ}]$.

Theorem 7.12. The natural transformation $\Psi : P_n \to \mathrm{Triple}_n$ is an isomorphism. In other words, $R_n$ is the classifying space for T-duality triples.

The following is then a consequence of 4.1.

Corollary 7.13. For each space $B$ there is a natural action of the T-duality group $G_n$ on the set of triples $\mathrm{Triple}_n(B)$.

7.19. Proof of Theorem 7.12. Given a space $B$ we must show that $\Psi_B$ is a bijection of sets. Therefore we show:

Lemma 7.14. $\Psi_B$ is surjective.

Lemma 7.15. $\Psi_B$ is injective.

7.20. The homotopy fiber of $(c, \hat c) : R_n \to K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2)$ is $K(\mathbb{Z},3)$ (see 3.2). By obstruction theory the set of homotopy classes of lifts in the diagram
\[
\begin{array}{ccc}
 & & R_n \\
 & {}^{f}\nearrow & \downarrow\,{\scriptstyle (c,\hat c)} \\
B & \xrightarrow{\ (c,\hat c)\ } & K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2)
\end{array}
\]
is a torsor over $H^3(B,\mathbb{Z})$. So given two such lifts $f_0, f_1$ we have a well-defined difference element $\delta(f_1, f_0) \in H^3(B,\mathbb{Z})$.

Let $F := c^*U^n$ and $\hat F := \hat c^*U^n$ be the pull-backs of the universal $T^n$-bundle over $K(\mathbb{Z}^n,2)$. Note further that by construction $F_n = c^*U^n$ and $\hat F_n = \hat c^*U^n$. Since $f$ lifts $(c, \hat c)$ we have natural identifications $f^*F_n \cong F$ and $f^*\hat F_n \cong \hat F$. We can thus consider $f^*[x_{n,univ}] \in \mathrm{Triple}_n^{(F,\hat F)}(B)$ in a natural way. Recall the action of $H^3(B,\mathbb{Z})$ on $\mathrm{Triple}_n^{(F,\hat F)}(B)$ introduced in 7.3.

Lemma 7.16. We have $f_1^*[x_{n,univ}] = f_0^*[x_{n,univ}] + \delta(f_1, f_0)$.

Proof. We can choose a model of $K(\mathbb{Z},3)$ which is an abelian group. Furthermore, we choose a model of the map $(c, \hat c) : R_n \to K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2)$ which is a $K(\mathbb{Z},3)$-principal bundle. Let $a : R_n \times K(\mathbb{Z},3) \to R_n$ be the right-action. Let
$z \in H^3(K(\mathbb{Z},3), \mathbb{Z})$ be a generator, and let $W$ be a twist in the corresponding isomorphism class. We claim that
\[
a^*x_{n,univ} \cong \mathrm{pr}_1^*x_{n,univ} + \mathrm{pr}_2^*W. \tag{7.13}
\]
We have two maps $\mathrm{pr}, m : F_n \times K(\mathbb{Z},3) \to F_n$, namely the projection and the map induced by the right-action $a : R_n \times K(\mathbb{Z},3) \to R_n$. By the Künneth formula and Lemma 3.5 we have $H^3(F_n \times K(\mathbb{Z},3), \mathbb{Z}) \cong \mathbb{Z}\,\mathrm{pr}^*h_n \oplus \mathbb{Z}\,\mathrm{pr}_2^*z$. Then $m^*h_n = d\,\mathrm{pr}^*h_n + b\,\mathrm{pr}_2^*z$ with $d, b \in \mathbb{Z}$ to be determined. By restriction to $F_n \times \{1\} \subset F_n \times K(\mathbb{Z},3)$ we see that $d = 1$. In order to calculate $b$ we restrict the torus bundle to a fiber of $(c, \hat c)$. We identify this restriction with $T^n \times K(\mathbb{Z},3)$. By Lemma 3.6 we know that the restriction of $h_n$ is $1 \times z$ (after choosing the appropriate sign of the generator $z$). It follows that the restriction of $m^*h_n$ is equal to $1 \times z \times 1 + 1 \times 1 \times bz \in H^3(T^n \times K(\mathbb{Z},3) \times K(\mathbb{Z},3), \mathbb{Z})$. Let $\mu : K(\mathbb{Z},3) \times K(\mathbb{Z},3) \to K(\mathbb{Z},3)$ be the multiplication. Then we have $\mu^*z = z \times 1 + 1 \times z$. By a comparison of these two formulas we conclude that $b = 1$.

We now know that we have an isomorphism $m^*H_n \cong \mathrm{pr}^*H_n \otimes \mathrm{pr}_2^*W$. In a similar manner we have a map $\hat m : \hat F_n \times K(\mathbb{Z},3) \to \hat F_n$ and an isomorphism $\hat m^*\hat H_n \cong \widehat{\mathrm{pr}}^*\hat H_n \otimes \widehat{\mathrm{pr}}_2^*W$. Note that $H^2(F_n \times_{R_n} \hat F_n \times K(\mathbb{Z},3) \times K(\mathbb{Z},3), \mathbb{Z}) \cong \{0\}$. Therefore the identification of the isomorphism classes of the twists already implies (7.13).

We now consider maps $f_0, f_1 : B \to R_n$. Then we can write $f_1 = a \circ (f_0, g)$, where $g : B \to K(\mathbb{Z},3)$ represents $\delta(f_1, f_0) \in H^3(B,\mathbb{Z})$. It follows that $f_1^*x_{n,univ} \cong (f_0, g)^*a^*x_{n,univ} \cong (f_0, g)^*(\mathrm{pr}_1^*x_{n,univ} + \mathrm{pr}_2^*W) \cong f_0^*x_{n,univ} + g^*W$. Therefore $f_1^*[x_{n,univ}] = f_0^*[x_{n,univ}] + \delta(f_1, f_0)$.

7.21. Proof of Lemma 7.14. It suffices to show the lemma under the assumption that $B$ is connected. Let $((F, H), (\hat F, \hat H), u)$ represent an isomorphism class $x \in \mathrm{Triple}_n(B)$. The bundles $F$ and $\hat F$ give rise to classifying maps $c : B \to K(\mathbb{Z}^n,2)$ and $\hat c : B \to K(\mathbb{Z}^n,2)$. We thus have a map $(c, \hat c) : B \to K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2)$. Let $c_i, \hat c_i$ be the components of $c, \hat c$. It follows from the structure of $H$ that
\[
0 = {}^{\pi}d_2^{2,1}\Bigl(\sum_{i=1}^n y_i \otimes \hat c_i\Bigr) = \sum_{i=1}^n c_i \cup \hat c_i.
\]
This implies by Definition 3.1 of $R_n$ the existence of a lift $f$ in the diagram
\[
\begin{array}{ccc}
 & & R_n \\
 & {}^{f}\nearrow & \downarrow\,{\scriptstyle (c,\hat c)} \\
B & \xrightarrow{\ (c,\hat c)\ } & K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2).
\end{array}
\]
As already noticed in 7.20, the set of homotopy classes of such lifts is a torsor over $H^3(B,\mathbb{Z})$. Now $x_f := f^*x_{n,univ} =: ((F', H'), (\hat F', \hat H'), u')$. By construction we can assume that $F' = F$ and $\hat F' = \hat F$ for all such lifts. We therefore consider
$[x], [x_f] \in \mathrm{Triple}_n^{(F,\hat F)}(B)$. By Proposition 7.3 there exists a class $\alpha_f \in H^3(B,\mathbb{Z})$ such that $[x] = [x_f] + \alpha_f$. By Lemma 7.16, if we use $f' := f + \alpha_f$, then $[x] = [x_{f'}]$.

7.22. Proof of Lemma 7.15. Let $f_0, f_1 : B \to R_n$ be given such that $f_0^*x_{n,univ}$ and $f_1^*x_{n,univ}$ are isomorphic. We must show that $f_0$ and $f_1$ are homotopic. First we choose an isomorphism $f_0^*x_{n,univ} \cong f_1^*x_{n,univ}$. Let $\Psi : f_0^*F_n \to f_1^*F_n$ and $\hat\Psi : f_0^*\hat F_n \to f_1^*\hat F_n$ denote the underlying isomorphisms of $T^n$-bundles. Their existence implies that the compositions $(c, \hat c) \circ f_0$ and $(c, \hat c) \circ f_1$ are homotopic. We choose such a homotopy $H$ and consider the diagram
\[
\begin{array}{ccccc}
[0,1] \times f_0^*F_n \times_B f_0^*\hat F_n & \xrightarrow{\ \Phi\ } & H^*(U^n \times U^n) & \longrightarrow & U^n \times U^n \\
\downarrow & & \downarrow & & \downarrow \\
{[0,1] \times B} & \xrightarrow{\ \mathrm{id}\ } & [0,1] \times B & \xrightarrow{\ H\ } & K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2).
\end{array}
\tag{7.14}
\]
The bundle isomorphism $\Phi$ is uniquely determined up to homotopy by the property that its restriction to $\{0\} \times B$ is the identity. Let $\Phi_1 : f_0^*F_n \times_B f_0^*\hat F_n \to f_1^*F_n \times_B f_1^*\hat F_n$ be the restriction of $\Phi$ to $\{1\} \times B$. Its homotopy class depends on the choice of $H$ in the following way.

7.23. The homotopy $H$ can be concatenated with a map $g : [0,1] \times B \to \Sigma(B_+) \to K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2)$. Homotopy classes of such maps are classified by $H^1(B,\mathbb{Z}^n) \times H^1(B,\mathbb{Z}^n)$. Since $T^n$ has the homotopy type of $K(\mathbb{Z}^n,1)$, elements in this cohomology group can be represented by maps $B \to T^n \times T^n$, and these maps act by automorphisms on $f_0^*F_n \times_B f_0^*\hat F_n$. Let $H'$ be obtained from $H$ by concatenation with $g$, and let $\tilde g : B \to T^n \times T^n$ represent the corresponding homotopy class. As a consequence of the discussion in Sec. A.3 the homotopy class of the evaluation $\Phi'_1$ ($\Phi'$ is defined by (7.14) for $H'$ in the place of $H$) is obtained from $\Phi_1$ by composition with $\tilde g$. We see that we can arrange $H$ and the choice of $\Phi$ such that $\Phi_1 = (\Psi, \hat\Psi)$. For the remainder of the proof we adopt this choice.

7.24. We choose a lift $\tilde H$ in the diagram
\[
\begin{array}{ccc}
B & \xrightarrow{\ f_0\ } & R_n \\
{\scriptstyle b \mapsto (0,b)}\,\downarrow & {}^{\tilde H}\nearrow & \downarrow\,{\scriptstyle (c,\hat c)} \\
I \times B & \xrightarrow{\ H\ } & K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2),
\end{array}
\]
and we let $\tilde f$ be the evaluation of $\tilde H$ at $\{1\} \times B$. Then $\tilde f$ and $f_1$ are two maps lifting $(c, \hat c) \circ f_1$. Furthermore, $f_0$ is homotopic to $\tilde f$. By Lemma 7.1 the pull-backs $f_0^*x_{n,univ}$ and $\tilde f^*x_{n,univ}$ are isomorphic. Note that the underlying $T^n$-bundles of $\tilde f^*x_{n,univ}$ are canonically isomorphic to $f_1^*F_n$ and $f_1^*\hat F_n$. Actually, by the choice of $H$ above, the proof of Lemma 7.1 gives us an isomorphism $f_0^*x_{n,univ} \cong \tilde f^*x_{n,univ}$ such that the underlying bundle isomorphisms are $\Psi$ and $\hat\Psi$. The composition
\[
f_1^*x_{n,univ} \cong f_0^*x_{n,univ} \cong \tilde f^*x_{n,univ}
\]
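is therefore an isomorphism over $(f_1^*F_n, f_1^*\hat F_n)$. We conclude that $[f_1^*x_{n,univ}] = [\tilde f^*x_{n,univ}]$ holds in $\mathrm{Triple}_n^{(f_1^*F_n, f_1^*\hat F_n)}(B)$.

The maps $f_1$ and $\tilde f$ are lifts in the diagram
\[
\begin{array}{ccc}
 & & R_n \\
 & & \downarrow\,{\scriptstyle (c,\hat c)} \\
B & \xrightarrow{\ (c,\hat c) \circ f_1\ } & K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2).
\end{array}
\]
Their difference as homotopy classes of lifts is measured by $\delta(f_1, \tilde f) \in H^3(B,\mathbb{Z})$. But by Proposition 7.3 and the equality $[f_1^*x_{n,univ}] = [\tilde f^*x_{n,univ}]$ we have $\delta(f_1, \tilde f) = 0$. Now $f_0$ is homotopic to $\tilde f$, and $\tilde f$ is homotopic to $f_1$ (even as a lift), so that $f_0$ is homotopic to $f_1$.

7.25. We fix classifying maps $(c, \hat c) : B \to K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2)$. We consider a lift $f$ in the diagram
\[
\begin{array}{ccc}
 & & R_n \\
 & {}^{f}\nearrow & \downarrow\,{\scriptstyle (c,\hat c)} \\
B & \xrightarrow{\ (c,\hat c)\ } & K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2).
\end{array}
\]
Let $x := f^*x_{n,univ}$ and write $x = ((F, H), (\hat F, \hat H), u)$. Let furthermore $\psi : F \to F$ and $\hat\psi : \hat F \to \hat F$ be automorphisms of $T^n$-bundles. We will use the notation introduced in 2.13:

Proposition 7.17. In $\mathrm{Triple}_n^{(F,\hat F)}(B)$ we have the following identity:
\[
[x^{(\psi,\hat\psi)}] = [x] + \hat c \cup [\psi] + c \cup [\hat\psi].
\]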
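Proof. We have $\Omega K(\mathbb{Z}^n,2) \cong K(\mathbb{Z}^n,1)$. As in 7.23 we can concatenate the constant homotopy from $(c, \hat c)$ to $(c, \hat c)$ by a map corresponding to the class $([\psi], [\hat\psi]) \in H^1(B,\mathbb{Z}^n) \times H^1(B,\mathbb{Z}^n)$. In this way we obtain a new homotopy $H : [0,1] \times B \to K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2)$ from $(c, \hat c)$ to $(c, \hat c)$. We again consider the diagram
\[
\begin{array}{ccccc}
[0,1] \times F \times_B \hat F & \xrightarrow{\ (\phi,\hat\phi)\ } & H^*(U^n \times U^n) & \longrightarrow & U^n \times U^n \\
\downarrow & & \downarrow & & \downarrow \\
{[0,1] \times B} & \xrightarrow{\ \mathrm{id}\ } & [0,1] \times B & \xrightarrow{\ H\ } & K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2).
\end{array}
\]
As explained in 7.23 we can choose $\phi$ and $\hat\phi$ such that their restrictions $\phi_1$ and $\hat\phi_1$ to $\{1\} \times B$ coincide with $\psi$ and $\hat\psi$. We now choose a lift $\tilde H$ in
\[
\begin{array}{ccc}
B & \xrightarrow{\ f\ } & R_n \\
{\scriptstyle i_0}\,\downarrow & {}^{\tilde H}\nearrow & \downarrow\,{\scriptstyle (c,\hat c)} \\
{[0,1] \times B} & \xrightarrow{\ H\ } & K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2),
\end{array}
\]
where $i_0 : B \cong \{0\} \times B \to [0,1] \times B$ is the inclusion. Let $f_1 := \tilde H|_{\{1\} \times B} : B \to R_n$. Then we have by construction $f_1^*x_{n,univ} \cong x^{(\psi,\hat\psi)}$. By Lemma 7.16 we have $f_1^*[x_{n,univ}] = f^*[x_{n,univ}] + \delta(f_1, f_0)$ in $\mathrm{Triple}_n^{(F,\hat F)}(B)$. Therefore it remains to show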
that $\delta(f_1, f_0) = \hat c \cup [\psi] + c \cup [\hat\psi]$. We want to apply the result of Sec. A.3. The homotopy $H$ gives rise to a map $H : B \to L(K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2))$ which restricts to $(c, \hat c)$ at $0 \in S^1$. We must consider the composition
\[
B \xrightarrow{\ H\ } L(K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2)) \xrightarrow{\ Lq\ } LK(\mathbb{Z},4) \xrightarrow{\ \sim\ } K(\mathbb{Z},4) \times K(\mathbb{Z},3) \xrightarrow{\ \mathrm{pr}_2\ } K(\mathbb{Z},3).
\]
By Eq. (A.2) we know that $\delta(f_0, f_1)$ is given by the cohomology class which classifies this map. In order to calculate this class we consider the following general result.

7.26. Fix integers $k, m \ge 1$. A homotopy class $f \in [B, LK(\mathbb{Z},k)]$ is characterized by a pair of cohomology classes $(f_0, f_1) \in H^k(B,\mathbb{Z}) \times H^{k-1}(B,\mathbb{Z})$. Let $q : K(\mathbb{Z},k) \times K(\mathbb{Z},m) \to K(\mathbb{Z},k) \wedge K(\mathbb{Z},m) \to K(\mathbb{Z},k+m)$ be the composition of the canonical projection and the cup product. It induces a map $Lq : LK(\mathbb{Z},k) \times LK(\mathbb{Z},m) \to LK(\mathbb{Z},k+m)$. We let $q : LK(\mathbb{Z},k) \times LK(\mathbb{Z},m) \to K(\mathbb{Z},k+m-1)$ be the composition of $Lq$ with the projection $LK(\mathbb{Z},k+m) \cong K(\mathbb{Z},k+m) \times \Omega K(\mathbb{Z},k+m) \to \Omega K(\mathbb{Z},k+m) \cong K(\mathbb{Z},k+m-1)$.

Lemma 7.18. We have $q_*(f, g) = f_0 \cup g_1 + f_1 \cup g_0$.

Proof. We consider (adjoint) representatives $f_1 : \Sigma B_+ \to K(\mathbb{Z},k)$ and $g_1 : \Sigma B_+ \to K(\mathbb{Z},m)$ such that $f_1$ is constant on $[0,1/2] \times B$ and $g_1$ is constant on $[1/2,1] \times B$. Then we see that $q_*(f, g) = q_*(f_0, g_1) + q_*(f_1, g_0)$, where $+$ indicates concatenation of loops. This follows because of commutativity in the diagram
\[
\begin{array}{ccccc}
\Omega K(\mathbb{Z},k-1) \wedge K(\mathbb{Z},l) & \longrightarrow & \Omega(K(\mathbb{Z},k-1) \wedge K(\mathbb{Z},l)) & \xrightarrow{\ \Omega\cup\ } & \Omega K(\mathbb{Z},k+l-1) \\
{\scriptstyle\sim}\,\downarrow & & & & \downarrow\,{\scriptstyle\sim} \\
K(\mathbb{Z},k) \wedge K(\mathbb{Z},l) & & \xrightarrow{\quad \cup \quad} & & K(\mathbb{Z},k+l)
\end{array}
\]
where the first arrow in the upper row comes from the inclusion $K(\mathbb{Z},l) \to \Omega K(\mathbb{Z},l)$ as constant loops. The lemma now follows by taking the adjoint.

7.27. Let us finish the proof of Proposition 7.17. Recall that $R_n$ is the homotopy fiber of the map $q : K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2) \to K(\mathbb{Z},4)$ classified by $\sum_{i=1}^n x_i \cup \hat x_i$. We consider the composition
\[
L(K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2)) \xrightarrow{\ Lq\ } LK(\mathbb{Z},4) \cong K(\mathbb{Z},4) \times K(\mathbb{Z},3) \xrightarrow{\ \mathrm{pr}_2\ } K(\mathbb{Z},3).
\]
Furthermore note that $L(K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2)) \cong K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,1) \times K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,1)$. We let $x_i, a_i, \hat x_i, \hat a_i$ denote the corresponding canonical generators of the cohomology of the factors. Furthermore, we let $z \in H^3(K(\mathbb{Z},3), \mathbb{Z})$ be the generator. We apply Lemma 7.18 to the summands of $q$ and obtain
\[
Lq^* \circ \mathrm{pr}_2^*(z) = \sum_{i=1}^n (x_i \cup \hat a_i + \hat x_i \cup a_i).
\]
We now see that $(H)^*x_i = c_i$, $(H)^*\hat x_i = \hat c_i$, and using the discussion in Sec. A.4, $(H)^*a_i = [\psi]_i$, $(H)^*\hat a_i = [\hat\psi]_i$. We get
\[
(H)^*\sum_{i=1}^n (c_i \cup \hat a_i + \hat c_i \cup a_i) = c \cup [\hat\psi] + \hat c \cup [\psi],
\]
using the notation of 2.13.

7.28. We fix a pair $(F, h)$ over a space $B$. Then we consider $n$-dimensional T-duality triples $x = ((F, H), (\hat F, \hat H), u)$ which extend $(F, h)$ (see Definition 2.14).

Definition 7.19. An isomorphism of extensions
\[
x := ((F, H), (\hat F, \hat H), u), \qquad x' := ((F, H'), (\hat F', \hat H'), u')
\]
is an isomorphism of T-duality triples $x \cong x'$ (see 4.5) which induces the identity on $F$. The set of such isomorphism classes is denoted $\mathrm{Ext}(F, h)$.

7.29. Proof of Theorem 2.17. The first assertion of Theorem 2.17 is exactly Theorem 5.6. We continue with the existence statement of (2). Let $x := ((F, H), (\hat F, \hat H), u)$ represent a class in $\mathrm{Ext}(F, h)$. Let $B \in \mathrm{Mat}(n, n, \mathbb{Z})$ be antisymmetric and set $\hat c'_i := \hat c_i(x) + \sum_{j=1}^n B_{i,j}c_j(x)$. We compute
\[
\sum_{i=1}^n c_i(x) \cup \hat c'_i = \sum_{i=1}^n c_i(x) \cup \hat c_i(x) + \sum_{j,i=1}^n B_{i,j}\,c_i(x) \cup c_j(x) = 0.
\]
This implies the existence of a lift $f'$ in
\[
\begin{array}{ccc}
 & & R_n \\
 & {}^{f'}\nearrow & \downarrow\,{\scriptstyle (c,\hat c)} \\
B & \xrightarrow{\ (c,\hat c')\ } & K(\mathbb{Z}^n,2) \times K(\mathbb{Z}^n,2).
\end{array}
\]
Then $\{x'\} := \{(f')^*x_{n,univ}\} \in \mathrm{Ext}(F, h)$ has the required properties. The remaining part of (2) was shown in Sec. 2.18.

We now show (3). We first consider the set of classes of triples $[x'] \in \mathrm{Triple}_n^{(F,\hat F)}(B)$ which extend $(F, h)$. It follows from Proposition 7.3 that $\ker(\pi^*)$ acts freely and transitively on this set. It follows that $\ker(\pi^*)$ acts transitively on any subset of $\mathrm{Ext}(F, h)$ with fixed $c(\hat F)$. But note that the equivalence relation in $\mathrm{Triple}_n^{(F,\hat F)}(B)$ is stronger than in $\mathrm{Ext}(F, h)$ since isomorphisms must induce the identity on $\hat F$. In $\mathrm{Ext}(F, h)$ we admit non-trivial bundle automorphisms of $\hat F$. In view of Proposition 7.17 we see that the quotient $\ker(\pi^*)/\mathrm{im}(C)$ acts freely on the set of isomorphism classes of extensions of $(F, h)$ with prescribed isomorphism class of $\hat F$.

Appendix A. Twists, spectral sequences and other conventions

A.1. Twists

We start with a description of twists. The literature contains various models for twists. Therefore we first describe the common core of these models and in particular
the properties which we use in the present paper. Then we exhibit two of these models explicitly.

In each case one considers a presheaf of monoidal groupoids $B \mapsto T(B)$ on the category of spaces. The objects of $T(B)$ are called twists. The unit of the monoidal structure is called the trivial twist and denoted by $0$. If $f\colon A \to B$ is a map of spaces, then there is a monoidal functor $f^*\colon T(B) \to T(A)$. Functoriality is implemented by natural transformations. We refer to [6, Sec. 3.1] for more details.

The following three requirements provide the coupling to topology:

(1) We require that there is a natural monoidal transformation $T(B) \to H^3(B,\mathbb{Z})$, $\mathcal{H} \mapsto [\mathcal{H}]$ (the group $H^3(B,\mathbb{Z})$ is considered as a monoidal category which has only identity morphisms), which classifies the isomorphism classes of $T(B)$ for each $B$.

(2) If $\mathcal{H}, \mathcal{H}' \in T(B)$ are equivalent objects, then we require that $\mathrm{Hom}_{T(B)}(\mathcal{H},\mathcal{H}')$ is an $H^2(B,\mathbb{Z})$-torsor such that the composition with a fixed morphism from $\mathcal{H}'$ to $\mathcal{H}''$ or from $\mathcal{H}''$ to $\mathcal{H}$, respectively, gives isomorphisms of torsors. Furthermore, we require that the torsor structure is compatible with the pull-back. Note that we have natural bijections $\mathrm{Hom}(\mathcal{H},\mathcal{H}) \xrightarrow{\sim} H^2(B,\mathbb{Z})$ which map compositions to sums. In order to simplify the notation we will (and have in the text) frequently identify automorphisms of twists and cohomology classes.

The map
$$\mathrm{Hom}_{T(B)}(\mathcal{H}_0,\mathcal{H}_0) \times \mathrm{Hom}_{T(B)}(\mathcal{H}_1,\mathcal{H}_1) \to \mathrm{Hom}_{T(B)}(\mathcal{H}_0 \otimes \mathcal{H}_1, \mathcal{H}_0 \otimes \mathcal{H}_1), \qquad (v,w) \mapsto v \otimes w,$$
is bilinear.

We now discuss two explicit realizations:

(1) Let $K$ be the algebra of compact operators on a separable complex Hilbert space. The group of automorphisms of $K$ is the projective unitary group $PU$. As a topological group we have $PU = U/U(1)$, where the topology of the unitary group $U$ of the Hilbert space is the strong topology. We define $T(B)$ to be the category of locally trivial bundles with fiber $K$ such that the transition functions are continuous functions with values in $PU$. We further define $\mathrm{Hom}_{T(B)}(\mathcal{H},\mathcal{H}')$ as the set of homotopy classes of algebra bundle isomorphisms. The monoidal structure is given by the fiberwise (completed) tensor product of bundles. In order to see that this preserves the category we use the fact that $K \otimes K \cong K$.

Isomorphism classes of objects in $T(B)$ are classified by homotopy classes $[B, BPU]$. Since the classifying space $BPU$ has the homotopy type of $K(\mathbb{Z},3)$, we see that isomorphism classes in $T(B)$ are in one-to-one correspondence with $H^3(B,\mathbb{Z})$. The group of automorphisms of a bundle $\mathcal{H} \in T(B)$ can be identified with the group of homotopy classes of maps $[B, PU]$ and is therefore in one-to-one correspondence with $H^2(B,\mathbb{Z})$ (this identification is not quite automatic and uses the fact that $PU$ is homotopy commutative, to pass from the sections of the inner automorphism bundle of the non-trivial bundle to the ones of the trivial bundle).
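The two homotopy-theoretic identifications used in this model can be sketched as follows. The sketch assumes Kuiper's theorem, that is, contractibility of the unitary group $U$ of an infinite-dimensional separable Hilbert space (which also holds in the strong topology), and that $B$ has the homotopy type of a CW-complex; neither assumption is spelled out in the text above.

```latex
\begin{align*}
U(1) \to U \to PU, \quad U \simeq \ast
  &\;\Longrightarrow\; PU \simeq BU(1) \simeq K(\mathbb{Z},2) \\
  &\;\Longrightarrow\; BPU \simeq K(\mathbb{Z},3), \\
[B, BPU] \cong H^3(B,\mathbb{Z}),
  &\qquad [B, PU] \cong H^2(B,\mathbb{Z}).
\end{align*}
```

The first implication uses that $U \to PU$ is a principal $U(1)$-bundle with contractible total space, so that $PU$ is a model of the classifying space $BU(1)$.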
This model of twists is very suitable for a quick definition of twisted K-theory. Assume that $B$ is locally compact. Given a twist $\mathcal{H} \in T(B)$ we consider the $C^*$-algebra $C_0(B,\mathcal{H})$ of continuous sections of $\mathcal{H}$ vanishing at infinity. Then we can define $K(B,\mathcal{H}) := K(C_0(B,\mathcal{H}))$, where the right-hand side is the K-theory of the $C^*$-algebra $C_0(B,\mathcal{H})$.

(2) In our second model we fix an h-space model of $K(\mathbb{Z},3)$. In fact, note that there exist models which are topological abelian groups. A twist $\mathcal{H} \in T(B)$ is a map $\mathcal{H}\colon B \to K(\mathbb{Z},3)$. The monoidal structure is implemented by the h-space structure of $K(\mathbb{Z},3)$. The set $\mathrm{Hom}_{T(B)}(\mathcal{H},\mathcal{H}')$ in this model is the set of homotopy classes of homotopies from $\mathcal{H}$ to $\mathcal{H}'$. It is obvious from the definition that isomorphism classes in $T(B)$ are classified by $H^3(B,\mathbb{Z})$. Furthermore, since $\Omega K(\mathbb{Z},3) \cong K(\mathbb{Z},2)$ we see that $\mathrm{Hom}_{T(B)}(\mathcal{H},\mathcal{H}')$ is empty or a torsor over $H^2(B,\mathbb{Z})$.

A.2. Spectral sequences

The cohomology of the total space $F$ of a fiber bundle $\pi\colon F \to B$ has a decreasing filtration
$$0 \subset F^nH^n(F,\mathbb{Z}) \subset F^{n-1}H^n(F,\mathbb{Z}) \subset \cdots \subset F^1H^n(F,\mathbb{Z}) \subset F^0H^n(F,\mathbb{Z}) = H^n(F,\mathbb{Z})$$
such that $F^nH^n(F,\mathbb{Z}) = \pi^*(H^n(B,\mathbb{Z}))$. By definition, $x \in F^kH^n(F,\mathbb{Z})$ if for any $(k-1)$-dimensional CW-complex $X$ and map $\phi\colon X \to B$ the condition $\Phi^*(x) = 0$ is satisfied. Here $\Phi\colon \phi^*F \to F$ is the induced map. The associated graded group is calculated by the Leray–Serre spectral sequence $({}^\pi E_r^{s,t}, {}^\pi d_r^{s,t})$. If $x \in F^kH^n(F,\mathbb{Z})$, then we let $x^{k,n-k} \in {}^\pi E_\infty^{k,n-k}$ denote the leading part.

Let us assume that $B$ is connected and choose a base point $b \in B$. Let $F_b := \pi^{-1}(b)$ be the fiber over $b$. We further assume that $\pi_1(B,b)$ acts trivially on $H^*(F_b,\mathbb{Z})$. In the present paper this assumption is always satisfied. Then we have isomorphisms ${}^\pi E_2^{p,q} \cong H^p(B, H^q(F_b,\mathbb{Z}))$. For various calculations we will employ naturality and multiplicativity of the spectral sequence.

Let us now fix some notation in the case that $\pi\colon F \to B$ is a $T^n$-principal bundle. We fix a generator of $H^1(U(1),\mathbb{Z})$. Since $T^n := U(1)^n$ this provides a natural set of generators of $H^1(T^n,\mathbb{Z})$. If we fix a base point in $F_b$, then using the right action we obtain a homeomorphism $F_b \cong T^n$. We let $y_1,\dots,y_n \in H^1(F_b,\mathbb{Z})$ denote the natural generators obtained in this way. They do not depend on the choice of the base point in $F_b$. If we consider a bundle $\hat\pi\colon \hat F \to B$, then we will write $\hat y_1,\dots,\hat y_n$ for the corresponding set of generators. Finally, the notation $y_1,\dots,y_n,\hat y_1,\dots,\hat y_n$ for the generators of $H^1(F_b \times \hat F_b,\mathbb{Z})$ is self-explanatory. Note that $H^*(F_b,\mathbb{Z})$ is a free $\mathbb{Z}$-module. Therefore we have an isomorphism
$${}^\pi E_2^{p,q} \cong H^p(B, H^q(F_b,\mathbb{Z})) \cong H^q(F_b,\mathbb{Z}) \otimes H^p(B,\mathbb{Z}).$$
Let $c_1,\dots,c_n \in H^2(B,\mathbb{Z})$ be the Chern classes of the $T^n$-bundle $\pi\colon F \to B$. Then we know that ${}^\pi d_2^{0,1}(y_i) = c_i$. By multiplicativity this leads to a complete calculation of ${}^\pi d_2$.
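As an illustration of how multiplicativity determines ${}^\pi d_2$, the following sketch applies the graded Leibniz rule $d(ab) = d(a)\,b + (-1)^{|a|} a\, d(b)$ to products of the generators $y_i$. It assumes that ${}^\pi d_2$ vanishes on classes pulled back from the base and that the classes $c_i$, being of even degree, commute with the $y_j$ without sign; the overall signs depend on the convention chosen for the product structure on the spectral sequence.

```latex
\begin{align*}
{}^\pi d_2^{0,2}(y_i\, y_j)
  &= {}^\pi d_2^{0,1}(y_i)\, y_j - y_i\, {}^\pi d_2^{0,1}(y_j)
   = c_i\, y_j - c_j\, y_i \;\in\; {}^\pi E_2^{2,1}, \\
{}^\pi d_2^{p,q}\bigl(b\, y_{i_1} \cdots y_{i_q}\bigr)
  &= (-1)^{p}\, b \sum_{l=1}^{q} (-1)^{l-1}\, c_{i_l}
     \prod_{m \neq l} y_{i_m},
  \qquad b \in H^p(B,\mathbb{Z}),
\end{align*}
```

where the product over $m \neq l$ is taken in increasing order of $m$. For $n = 2$, for example, the first line reads ${}^\pi d_2^{0,2}(y_1 y_2) = c_1 y_2 - c_2 y_1$.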
Let $T \xrightarrow{\,i\,} F \xrightarrow{\,\pi\,} B$ be a fibration (of pointed spaces) with base $B$ and fiber $T$. Choose $\delta \in H^3(B,\mathbb{Z})$ such that $\pi^*\delta = 0$. We choose a twist $\mathcal{H}$ over $B$ such that $[\mathcal{H}] = \delta$. Furthermore, we choose a trivialization $w\colon \pi^*\mathcal{H} \xrightarrow{\sim} 0$. Since $\pi \circ i$ is the constant map to the base point we have a natural isomorphism $\mathrm{can}\colon 0 \to i^*\pi^*\mathcal{H}$. We obtain the automorphism
$$w_b\colon 0 \xrightarrow{\ \mathrm{can}\ } i^*\pi^*\mathcal{H} \xrightarrow{\ i^*w\ } 0$$
of the trivial twist. We will consider $w_b \in H^2(T,\mathbb{Z})$. Note that the set of trivializations $w$ is a torsor over $H^2(F,\mathbb{Z})$. It follows that the class $[w_b] \in H^2(T,\mathbb{Z})/i^*H^2(F,\mathbb{Z})$ is independent of the choice of $w$. If we replace $\mathcal{H}$ by an isomorphic twist $\mathcal{H}'$ with isomorphism $u\colon \mathcal{H} \to \mathcal{H}'$, and we let $w'$ be obtained from $w$ by conjugation with $\pi^*u$, then we have $w_b' = w_b$. Therefore the class $[w_b] \in H^2(T,\mathbb{Z})/i^*H^2(F,\mathbb{Z})$ is well defined, independent of the choice of $\mathcal{H}$ in the class given by $\delta$. Note that we can consider ${}^\pi E_3^{0,2}/\ker({}^\pi d_3^{0,2}) \subset H^2(T,\mathbb{Z})/i^*H^2(F,\mathbb{Z})$.

Lemma A.1. We have $[w_b] \in {}^\pi E_3^{0,2}/\ker({}^\pi d_3^{0,2})$ and
$${}^\pi d_3^{0,2}([w_b]) = [\delta] \in H^3(B,\mathbb{Z})/\mathrm{im}({}^\pi d_2^{1,1}).$$
Proof. We choose a model for $K(\mathbb{Z},3)$ and a (pointed) classifying map $f\colon B \to K(\mathbb{Z},3)$ such that $\delta = f^*z$, where $z \in H^3(K(\mathbb{Z},3),\mathbb{Z})$ is the canonical generator. We further choose a twist $\mathcal{K}$ on $K(\mathbb{Z},3)$ in the class $z$. Then we can assume that $\mathcal{H} = f^*\mathcal{K}$. We choose a (pointed) homotopy $H\colon [0,1] \times F \to K(\mathbb{Z},3)$ between $f \circ \pi$ and the constant map, which exists because $\pi^*\delta = 0$. The homotopy class of such homotopies is well defined up to the action of $H^2(F,\mathbb{Z}) \cong H^3(\Sigma F,\mathbb{Z})$, where $\Sigma F$ is the reduced suspension of $F$. Elements of $H^3(\Sigma F,\mathbb{Z})$ are represented by maps $\Sigma F \to K(\mathbb{Z},3)$, and the action mentioned above is defined by concatenation with the homotopy given by the composition $[0,1] \times F \to \Sigma F \to K(\mathbb{Z},3)$. Note that by the axioms of twists each such map $H$ induces a well-defined isomorphism of twists
$$\pi^*\mathcal{H} \xrightarrow{\sim} H_0^*\mathcal{K} \xrightarrow{\sim} H_1^*\mathcal{K} \xrightarrow{\sim} 0, \tag{A.1}$$
where we use canonical identifications, and in particular that $H_1$ is the constant map. The axioms also imply that the action of $H^2(F,\mathbb{Z})$ on such isomorphisms corresponds exactly to the action of $H^2(F,\mathbb{Z})$ on the homotopies $H$ as constructed above. Therefore, we can choose $H$ such that the isomorphism (A.1) is exactly the isomorphism $w$ chosen above.

If we restrict $H$ to the fiber $T$, then we obtain a map $h\colon [0,1] \times T \to K(\mathbb{Z},3)$. Since $H_0 = f \circ \pi$ and $\pi \circ i$ is the constant map to the base point, it follows that $h_0$ is the constant map. Consequently, we have a factorization of $h$ as
$$[0,1] \times T \to \Sigma T \xrightarrow{\ h'\ } K(\mathbb{Z},3).$$
The axioms of twists give us an isomorphism $h_0^*\mathcal{K} \xrightarrow{\sim} h_1^*\mathcal{K}$ which is by naturality the pullback $i^*w$. We now see that $w_b \in H^2(T,\mathbb{Z})$ corresponds to $(h')^*z \in H^3(\Sigma T,\mathbb{Z})$ under the suspension isomorphism.
A different interpretation uses adjunction to translate $h'$ to a map $g\colon T \to \Omega K(\mathbb{Z},3) = K(\mathbb{Z},2)$. Then $w_b = g^*u$, where $u$ is the canonical generator of $H^2(K(\mathbb{Z},2),\mathbb{Z})$. We now consider the Leray–Serre spectral sequence of the fibration
$$\Omega K(\mathbb{Z},3) \to PK(\mathbb{Z},3) \xrightarrow{\ p\ } K(\mathbb{Z},3),$$
where $PK(\mathbb{Z},3)$ is the (contractible) space of pointed paths in $K(\mathbb{Z},3)$. Observe that ${}^p d_2^{0,2}(u) = 0$ and ${}^p d_3^{0,2}(u) = z$. By naturality, $[w_b] \in {}^\pi E_3^{0,2}/\ker({}^\pi d_3^{0,2})$ and ${}^\pi d_3^{0,2}(w_b) = [\delta] \in H^3(B,\mathbb{Z})/\mathrm{im}({}^\pi d_2^{1,1})$.

A.3. Classification of lifts

We consider an integer $k \ge 1$ and an abelian group $G$. We form the loop space $LK(G,k) := \mathrm{Map}(S^1, K(G,k))$. Let $h\colon K(G,k) \times K(G,k) \to K(G,k)$ be the h-space structure. Then the map $\phi\colon K(G,k) \times \Omega K(G,k) \to LK(G,k)$ given by $\phi(x,l)(t) = h(x,l(t))$, $t \in S^1$, is a homotopy equivalence. We choose a homotopy inverse $\psi$ of $\phi$. Note that $\Omega K(G,k) \cong K(G,k-1)$.

We consider a space $B$ and a map $c\colon B \to K(G,k)$. Furthermore we let $p\colon X \to B$ be the homotopy fiber of $c$. Then the homotopy fiber of $p$ is a $K(G,k-1)$. By obstruction theory the set of lifts $\tilde f$ in the diagram
$$\begin{array}{ccc} & & X \\ & \overset{\tilde f}{\nearrow} & \downarrow{\scriptstyle p} \\ Y & \xrightarrow{\ f\ } & B \end{array}$$
is empty or a torsor over $H^{k-1}(Y,G)$. Given two lifts $\tilde f_0, \tilde f_1$ we have a difference element $\delta(\tilde f_0,\tilde f_1) \in H^{k-1}(Y,G)$. In fact, we can choose a model for $K(G,k-1)$ which is an abelian group. Furthermore, we can choose a model for $p\colon X \to B$ which is a $K(G,k-1)$-principal bundle. Then there exists a unique map $g\colon Y \to K(G,k-1)$ such that $\tilde f_1 = \tilde f_0\, g$ using the right action of $K(G,k-1)$ on $X$. In this case the homotopy class of $g$ is classified by $\delta(\tilde f_0,\tilde f_1)$.

We consider now a map $H\colon S^1 \times Y \to B$ such that $H|_{\{0\} \times Y} = f$ (we parametrize $S^1$ by $[0,1]$ with endpoints identified). Furthermore we consider a lift $\tilde H$ in the diagram
$$\begin{array}{ccc} Y & \xrightarrow{\ \tilde f\ } & X \\ {\scriptstyle i_0}\downarrow & \overset{\tilde H}{\nearrow} & \downarrow{\scriptstyle p} \\ {[0,1]} \times Y & \xrightarrow{\ H \circ q\ } & B \end{array}$$
where $q\colon [0,1] \times Y \to S^1 \times Y$ is the quotient map, and $i_0(y) := (0,y)$. We define $\tilde f' := \tilde H|_{\{1\} \times Y}\colon Y \to X$.

Let $H'\colon Y \to LB$ be the adjoint of $H$. By composition with $Lc\colon LB \to LK(G,k)$, the homotopy inverse $\psi$, and the projection to the second component we obtain a map $\mathrm{pr}_2 \circ \psi \circ Lc \circ H'\colon Y \to K(G,k-1)$. The homotopy class of this map is classified by a cohomology class $d(H) \in H^{k-1}(Y,G)$. Several times we need the following formula:
$$\delta(\tilde f', \tilde f) = d(H). \tag{A.2}$$
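A quick consistency check of (A.2) is the case of a loop that is constant in the circle direction. The sketch below writes $H'$ for the adjoint $Y \to LB$ of $H$, $e$ for the unit of the h-space structure $h$, and $\mathrm{const}_x$ for the constant loop at $x$; these labels are introduced here for illustration only.

```latex
\begin{align*}
H(t,y) = f(y) \ \text{for all } t
  &\;\Longrightarrow\; (Lc \circ H')(y) = \mathrm{const}_{c(f(y))}
     \simeq \phi\bigl(c(f(y)),\, \mathrm{const}_e\bigr) \\
  &\;\Longrightarrow\; \mathrm{pr}_2 \circ \psi \circ Lc \circ H'
     \ \text{is null-homotopic},
     \qquad d(H) = 0, \\
\tilde H_t \ \text{is a lift of } f \ \text{for every } t
  &\;\Longrightarrow\; \delta(\tilde f', \tilde f) = 0 = d(H).
\end{align*}
```

In this case $\tilde H$ is a homotopy from $\tilde f$ to $\tilde f'$ through lifts of $f$, so both sides of (A.2) vanish, as expected.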
References

[1] P. Bouwknegt, J. Evslin and V. Mathai, Topology and H-flux of T-dual manifolds, Phys. Rev. Lett. 92(18) (2004) 181601; arXiv:hep-th/0312284.
[2] P. Bouwknegt, J. Evslin and V. Mathai, T-duality: Topology change from H-flux, Comm. Math. Phys. 249 (2004) 383–415.
[3] P. Bouwknegt, K. Hannabuss and V. Mathai, T-duality for principal torus bundles, J. High Energy Phys. 3 (2004) 018 (electronic); arXiv:hep-th/0312284.
[4] P. Bouwknegt, K. Hannabuss and V. Mathai, T-duality for principal torus bundles and dimensionally reduced Gysin sequences, Adv. Theor. Math. Phys. 9 (2005) 749–773; arXiv:hep-th/0412268.
[5] P. Bouwknegt, K. Hannabuss and V. Mathai, Nonassociative tori and applications to T-duality, Comm. Math. Phys. 264 (2006) 41–69.
[6] U. Bunke and Th. Schick, On the topology of T-duality, Rev. Math. Phys. 17 (2005) 77–112; arXiv:math.GT/0405132.
[7] U. Bunke and Th. Schick, T-duality for non-free circle actions, Analysis, Geometry and Topology of Elliptic Operators (World Sci. Publ., Hackensack, NJ, 2006), pp. 429–466; arXiv:math.GT/0508550.
[8] U. Bunke, M. Spitzweck, Th. Schick and A. Thom, Pontrjagin duality for locally compact group stacks and T-duality, in preparation.
[9] T. Buscher, A symmetry of the string background field equations, Phys. Lett. B 194 (1987) 59–62.
[10] T. Buscher, Path integral derivation of quantum duality in nonlinear sigma models, Phys. Lett. B 201 (1988) 466–472.
[11] R. Donagi and T. Pantev, Torus fibrations, gerbes, and duality, arXiv:math.AG/0306213.
[12] N. Hitchin, Lectures on special Lagrangian submanifolds, in Winter School on Mirror Symmetry, Vector Bundles and Lagrangian Submanifolds (Cambridge, MA, 1999), AMS/IP Stud. Adv. Math., Vol. 23 (Amer. Math. Soc., Providence, RI, 2001), pp. 151–182.
[13] D. Husemoller, Fibre Bundles, Graduate Texts in Mathematics, Vol. 20 (Springer-Verlag, New York, 1994).
[14] V. Mathai and J. Rosenberg, T-duality for torus bundles with H-fluxes via noncommutative topology, Comm. Math. Phys. 253 (2005) 705–721.
[15] V. Mathai and J. Rosenberg, On mysteriously missing T-duals, H-flux, and the T-duality group, to appear in Proc. XXXIII Int. Conf. Differential Geometric Methods in Mathematical Physics (August 2005), eds. M.-L. Ge and W. Zhang (World Scientific, 2006); arXiv:hep-th/0409073.
[16] V. Mathai and J. Rosenberg, T-duality for torus bundles with H-fluxes via noncommutative topology. II. The high-dimensional case and the T-duality group, Adv. Theor. Math. Phys. 10 (2006) 123–158.
[17] S. Mukai, Duality between $D(X)$ and $D(\hat X)$ with its applications to Picard sheaves, Nagoya Math. J. 81 (1981) 153–175.
[18] J. Polchinski, String Theory, Vols. I, II, Cambridge Monographs on Mathematical Physics (Cambridge University Press, 2005).
[19] A. Schneider, PhD thesis, Mathematisches Institut, Georg-August-Universität Göttingen, Germany (2006/07).
[20] A. Strominger, S. T. Yau and E. Zaslow, Mirror symmetry is T-duality, Nuclear Phys. B 479 (1996) 243–259.
December 15, 2006 16:52 WSPC/148-RMP J070-00288
Reviews in Mathematical Physics Vol. 18, No. 10 (2006) 1155–1157 © World Scientific Publishing Company
REVIEWS IN MATHEMATICAL PHYSICS Author Index Volume 18 (2006)
Babin, A. & Figotin, A., Linear superposition in nonlinear wave dynamics, 9 (2006) 971
Bach, V., Lieb, E. H. & Travaglia, M. V., Ferromagnetism of the Hubbard model at strong coupling in the Hartree–Fock approximation, 5 (2006) 519
Bahn, C., Ko, C. K. & Park, Y. M., Quantum dynamical semigroups generated by noncommutative unbounded elliptic operators, 6 (2006) 595
Bai, C., A further study on non-abelian phase spaces: Left-symmetric algebraic approach and related geometry, 5 (2006) 545
Baumgärtel, H., Generalized eigenvectors for resonances in the Friedrichs model and their associated Gamov vectors, 1 (2006) 61
Benvegnù, A. & Spera, M., On uncertainty, braiding and entanglement in geometric quantum mechanics, 10 (2006) 1075
Berceanu, S., A holomorphic representation of the Jacobi algebra, 2 (2006) 163
Bojowald, M. & Skirzewski, A., Effective equations of motion for quantum systems, 7 (2006) 713
Bouwknegt, P. & Ridout, D., Presentations of Wess–Zumino–Witten fusion rings, 2 (2006) 201
Bunke, U., Rumpf, P. & Schick, Th., The topology of T-duality for $T^n$-bundles, 10 (2006) 1103
Chaturvedi, S., Marmo, G., Mukunda, N., Simon, R. & Zampini, A., The Schwinger Representation of a group: Concept and applications, 8 (2006) 887
Cicogna, G. & Laino, M., On the notion of conditional symmetry of differential equations, 1 (2006) 1
D'Antoni, C. & Morsella, G., Scaling algebras and superselection sectors: Study of a class of models, 5 (2006) 565
Dappiaggi, C., Moretti, V. & Pinamonti, N., Rigorous steps towards holography in asymptotically flat spacetimes, 4 (2006) 349
De Roeck, W. & Maes, C., Steady state fluctuations of the dissipated heat for a quantum stochastic model, 6 (2006) 619
Deng, H., Hou, B.-Y., Shi, K.-J., Yang, Z.-Y., Yue, R.-H. & Zhao, L., The manifestly covariant soliton solutions on noncommutative orbifolds $T^2/Z_6$ and $T^2/Z_3$, 3 (2006) 255
Figotin, A., see Babin, A., 9 (2006) 971
Fiore, G., On the hermiticity of q-differential operators and forms on the quantum Euclidean spaces $\mathbb{R}_q^n$, 1 (2006) 79
Georgescu, V. & Iftimovici, A., Localizations at infinity and essential spectrum of quantum Hamiltonians: I. General theory, 4 (2006) 417
Horváthy, P. A., Dynamical (super)symmetries of monopoles and vortices, 3 (2006) 329
Horváthy, P. A., The Biedenharn approach to relativistic Coulomb-type problems, 3 (2006) 311
Hou, B.-Y., see Deng, H., 3 (2006) 255
Iftimovici, A., see Georgescu, V., 4 (2006) 417
Ignat, R. & Millot, V., Energy expansion and vortex location for a two-dimensional rotating Bose–Einstein condensate, 2 (2006) 119
Møller, J. S., The polaron revisited, 5 (2006) 485
Keyl, M., Matsui, T., Schlingemann, D. & Werner, R. F., Entanglement, Haag-duality and type properties of infinite quantum spin chains, 9 (2006) 935
Keyl, M., Quantum state estimation and large deviations, 1 (2006) 19
Klimčík, C., On moment maps associated to a twisted Heisenberg double, 7 (2006) 781
Ko, C. K., see Bahn, C., 6 (2006) 595
Laino, M., see Cicogna, G., 1 (2006) 1
Lieb, E. H., see Bach, V., 5 (2006) 519
Long, E., Existence and stability of solitary waves in non-linear Klein–Gordon–Maxwell equations, 7 (2006) 747
Maes, C., see De Roeck, W., 6 (2006) 619
Marmo, G., see Chaturvedi, S., 8 (2006) 887
Matsui, T., see Keyl, M., 9 (2006) 935
Millot, V., see Ignat, R., 2 (2006) 119
Mine, T. & Nomura, Y., Periodic Aharonov–Bohm solenoids in a constant magnetic field, 8 (2006) 913
Moretti, V., see Dappiaggi, C., 4 (2006) 349
Morsella, G., see D'Antoni, C., 5 (2006) 565
Mukunda, N., see Chaturvedi, S., 8 (2006) 887
Nomura, Y., see Mine, T., 8 (2006) 913
Park, Y. M., see Bahn, C., 6 (2006) 595
Pinamonti, N., see Dappiaggi, C., 4 (2006) 349
Qi, Y. W., Dynamics and universality of an isothermal combustion problem in 2D, 3 (2006) 285
Ridout, D., see Bouwknegt, P., 2 (2006) 201
Rumpf, P., see Bunke, U., 10 (2006) 1103
Schick, Th., see Bunke, U., 10 (2006) 1103
Schlingemann, D., see Keyl, M., 9 (2006) 935
Seiringer, R., A correlation estimate for quantum many-body systems at positive temperature, 3 (2006) 233
Shaynkman, O. V., Tipunin, I. Yu. & Vasiliev, M. A., Unfolded form of conformal equations in M dimensions and o(M + 2)-modules, 8 (2006) 823
Shi, K.-J., see Deng, H., 3 (2006) 255
Simon, R., see Chaturvedi, S., 8 (2006) 887
Skirzewski, A., see Bojowald, M., 7 (2006) 713
Spera, M., see Benvegnù, A., 10 (2006) 1075
Teo, L.-P., Fay-like identities of the Toda lattice hierarchy and its dispersionless limit, 10 (2006) 1055
Tipunin, I. Yu., see Shaynkman, O. V., 8 (2006) 823
Travaglia, M. V., see Bach, V., 5 (2006) 519
Vasiliev, M. A., see Shaynkman, O. V., 8 (2006) 823
Weimar-Woods, E., The general structure of G-graded contractions of Lie algebras, II: The contracted Lie algebra, 6 (2006) 655
Werner, R. F., see Keyl, M., 9 (2006) 935
Yang, Z.-Y., see Deng, H., 3 (2006) 255
Yue, R.-H., see Deng, H., 3 (2006) 255
Zampini, A., see Chaturvedi, S., 8 (2006) 887
Zhao, L., see Deng, H., 3 (2006) 255