Commun. Math. Phys. 229, 1 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0712-5
Communications in
Mathematical Physics
© Springer-Verlag 2002
Editorial
Online First Publication
Springer-Verlag and the editors of CMP are pleased to announce a new service for our authors and readers: Online First, the immediate online publication of accepted papers – each as soon as it is made ready for print. While the electronic versions of the CMP issues will continue to appear simultaneously with the printed journal, from now on the electronic form of articles will be available to subscribers upon the receipt of the author-approved galley proofs. Articles will be published on the Web with a registered and permanent international identification code, the “Digital Object Identifier”, or DOI. The final page numbers will appear only with the printed version, however the printed articles will continue to display the DOI and the Online First publication date. The DOI is linked to the Web site’s URL (Uniform Resource Locator) and to the bibliographical address of the published article when that appears. It can be used to create hyperlinks between Online First articles. Online First is a genuine publication of articles in their final form. Therefore, papers cannot be changed or withdrawn after electronic publication. Any corrections that might be necessary have to be made in an Erratum, which will be hyperlinked to the article. Posted articles will be announced in LINK alert, which is a free alerting service http://link.springer.de/alert. A table of contents with all accepted, but not yet printed, papers can be found for CMP via the Springer Web page http://link.springer.de/link/service/journals/00220/tocs.htm or http://link.springer-ny.com/link/service/journals/00220/tocs.htm by clicking Online First. We are glad to be able to add this service, thus responding to requests and proposals made by the public which is served by the journal.
The Editor-in-Chief
Springer-Verlag
Commun. Math. Phys. 229, 3 – 47 (2002)
Liouville Type Equations with Singular Data and Their Applications to Periodic Multivortices for the Electroweak Theory
D. Bartolucci, G. Tarantello
University of Rome “Tor Vergata”, Mathematics Department, Via della Ricerca Scientifica, 00133 Rome, Italy. E-mail:
[email protected];
[email protected] Received: 24 September 2001 / Accepted: 7 December 2001
Abstract: Motivated by the study of multivortices in the Electroweak Theory of Glashow–Salam–Weinberg [33], we obtain a concentration-compactness principle for the following class of mean field equations:
\[ -\Delta_g v = \lambda \frac{K \exp(v)}{\int_M K \exp(v)\, d\tau_g} - W \quad \text{on } M, \tag*{$(1)_\lambda$} \]
where $(M, g)$ is a compact 2-manifold without boundary, $0 < a \le K(x) \le b$, $x \in M$ and $\lambda > 0$. We take $W = 4\pi \sum_{i=1}^{m} \alpha_i \delta_{p_i} - \psi$ with $\alpha_i > 0$, $\delta_{p_i}$ the Dirac measure with pole at the point $p_i \in M$, $i = 1, \dots, m$, and $\psi \in L^\infty(M)$ satisfying the necessary integrability condition for the solvability of $(1)_\lambda$. We provide an accurate analysis for solution sequences of $(1)_\lambda$ which admit a “blow up” point at a pole $p_i$ of the Dirac measure, in the same spirit as the work of Brezis–Merle [11] and Li–Shafrir [35]. As a consequence, we are able to extend the work of Struwe–Tarantello [49] and Ding–Jost–Li–Wang [21] and derive necessary and sufficient conditions for the existence of periodic N-vortices in the Electroweak Theory. Our result is sharp for N = 1, 2, 3, 4 and was motivated by the work of Spruck–Yang [46], who established an analogous sharp result for N = 1, 2.

Research supported by M.U.R.S.T. project: Variational Methods and Nonlinear Diff. Eq.

1. Introduction

The Electroweak Theory of Glashow–Salam–Weinberg (see Lai (ed.) [33]) is one of the most successful physical theories describing fundamental interactions. A goal of this paper is to show rigorously that such a theory allows for vortex type configurations which form periodic patterns in the plane. Periodic vortex configurations were first predicted in the pioneering work of Abrikosov [1] for the magnetic properties of superconductive materials and then observed experimentally, see [53]. Ambjorn and Olesen [2–5] observed that in the Electroweak Theory of Glashow–Salam–Weinberg, Abrikosov’s
vortices could be realized as solutions of certain selfdual equations of Bogomol’nyi type when the coupling constant satisfies a critical condition. The selfdual equations form a first order system which furnishes a convenient reduction of the more complicated second order Euler–Lagrange equations of motion. They include a gauge invariant version of the Cauchy–Riemann equation, which implies that the W-boson field is locally holomorphic, so it admits isolated zeroes, whose number (counted according to multiplicity) gives the vortex number $N \in \mathbb{N}$. The periodicity property is realized by imposing ’t Hooft periodic boundary conditions (see [27]) on the selfdual equations. In particular, this leads to a quantization effect for the flux and total energy of the vortex, which are shown to take only specific values proportional to the vortex number. Ambjorn and Olesen used perturbation analysis and numerical experiments to support the existence of such selfdual periodic vortices, but they were unable to obtain a rigorous proof. In the early ’90s, Spruck and Yang [46, 47] attacked the above mentioned Bogomol’nyi system by variational methods and proved the existence of multivortex solutions for the bosonic sector of the theory. In particular, in the periodic case (cf. [46]), they obtained necessary and sufficient conditions on the parameters involved in the theory which guarantee the existence of periodic solutions. Their results are sharp in the case of vortex number N = 1, 2. The main motivation of the present work has been to improve the sufficient conditions for solvability given in [46] and to derive sharp results also in the case of vortex number N = 3, 4. As in [46], our analysis of N-vortices reduces to the study of a nonlinear elliptic system of equations of Liouville type.
This class of equations has proved to be relevant in many other contexts, such as the assigned Gaussian curvature problem (see [16, 17, 31]), the study of the statistical mechanics of point vortices in the mean field limit (see [13, 14] and [32]) and, more recently, the asymptotic analysis of the Chern–Simons vortex condensates (cf. [28, 30, 12, 24]) as discussed in [50, 39, 40, 44, 21, 23] and [41]. See also [15]. To be more specific, for $(M, g)$ a compact two-dimensional Riemannian manifold without boundary, we are interested in analyzing the mean field equation:
\[ -\Delta_g u = \lambda \frac{K e^{u}}{\int_M K e^{u}\, d\tau_g} - W(x) \quad \text{in } M, \tag*{$(1)_\lambda$} \]
where $\Delta_g$ and $d\tau_g$ denote respectively the Laplace–Beltrami operator and the volume element corresponding to the metric g. Moreover we take $\lambda > 0$, the function K to satisfy
\[ a \le K(x) \le b, \quad \forall\, x \in M, \tag{1} \]
for some $0 < a \le b$, and the function W of the form
\[ W(x) = 4\pi \sum_{j=1}^{m} \alpha_j \delta_{p_j} - \psi(x), \tag{2} \]
with $\psi \in L^\infty(M)$, $\alpha_j \in \mathbb{R}^+$ and $\delta_{p_j}$ the Dirac measure with pole at the point $p_j \in M$, $j = 1, \dots, m$, for some $m \in \mathbb{N}$. Clearly, the condition
\[ 4\pi \sum_{j=1}^{m} \alpha_j - \int_M \psi(x)\, d\tau_g = \lambda, \tag{3} \]
is necessary for the solvability of $(1)_\lambda$. Notice also that solutions to $(1)_\lambda$ are determined up to a constant, so, without loss of generality, we can supplement $(1)_\lambda$ with the additional condition:
\[ \int_M u \, d\tau_g = 0. \tag{4} \]
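The necessity of condition (3) can be checked by integrating the equation over M. The sketch below assumes only that M is closed, so that the Laplace–Beltrami operator integrates to zero, and that each Dirac measure has unit mass.

```latex
% Integrate (1)_\lambda over the closed manifold M, using \int_M \Delta_g u \, d\tau_g = 0.
\begin{align*}
0 = -\int_M \Delta_g u \, d\tau_g
  &= \lambda \, \frac{\int_M K e^{u}\, d\tau_g}{\int_M K e^{u}\, d\tau_g} - \int_M W \, d\tau_g \\
  &= \lambda - \Bigl( 4\pi \sum_{j=1}^{m} \alpha_j - \int_M \psi \, d\tau_g \Bigr),
\end{align*}
% which is exactly condition (3).
```

The same quotient structure explains the normalization (4): replacing u by u + c multiplies numerator and denominator of the nonlinear term by the same factor $e^{c}$, so the equation is unchanged.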
In the particular case of the Electroweak Theory, we shall need to solve a system of Liouville type equations whose structure, from a variational point of view, is similar to that of problem $(1)_\lambda$–(4), when $\alpha_j \in \mathbb{N}$ and M is the flat two-dimensional torus. When $\lambda < 8\pi$, in view of the Moser–Trudinger inequality, the existence of a solution for $(1)_\lambda$ is derived by direct minimization of the corresponding variational principle (e.g. [49]). On the contrary, when $\lambda \ge 8\pi$, the existence of solutions for $(1)_\lambda$ becomes a delicate issue. Much of the progress made in this direction concerns the regular case, namely the case where the given function W does not involve the Dirac measures, so that in (2) one takes $\alpha_j = 0$, $\forall\, j = 1, \dots, m$. We start by discussing this situation. The case $\lambda = 8\pi$ has been widely studied for the 2-sphere $M = S^2$, in connection with the assigned Gaussian curvature problem (see [16, 17, 31]). General manifolds were considered in [39] and [22]. The case $\lambda > 8\pi$ has been treated by Struwe–Tarantello [49] when M is the flat 2-torus, and more generally by Ding–Jost–Li–Wang [21] for manifolds of positive genus. In this situation, the existence of a solution for $(1)_\lambda$ is established for $\lambda \in (8\pi, 16\pi)$. When $M = S^2$, the analogous existence result has been recently obtained by C.S. Lin in [36]. For $\lambda \ge 16\pi$, the problem of existence of solutions for $(1)_\lambda$ is essentially open. A crucial ingredient in the above mentioned results is that, for K satisfying (1) and $\alpha_j = 0$ in (2), $\forall\, j = 1, \dots, m$, the solutions of $(1)_\lambda$ satisfy uniform estimates in the $C^0(M)$-norm, for every $\lambda$ in a compact subset of $\mathbb{R}^+ \setminus 8\pi\mathbb{N}$. This is a consequence of the work of Brezis–Merle [11] and Li–Shafrir [35]. We also mention that an accurate description of unbounded sequences of solutions for $(1)_{\lambda_n}$, $\lambda_n \to \lambda \in 8\pi\mathbb{N}$, is provided by Y.Y. Li in [34].
On the other hand, when we include Dirac measures as inhomogeneous data in $(1)_\lambda$, the analysis of the corresponding solution set becomes more involved as far as a priori estimates are concerned. In this situation, one needs to deal with the additional difficulty of considering solution sequences which become unbounded (from above) around a pole of the given Dirac measures. Locally, such a situation can be illustrated by the following example. Let
\[ u_n(x) = \ln \frac{\mu_n |x - p|^{2\alpha}}{\Bigl(1 + \frac{1}{8(1+\alpha)^2}\, \mu_n |x - p|^{2(1+\alpha)}\Bigr)^{2}}, \quad x \in \mathbb{R}^2, \tag{5} \]
with $p \in \mathbb{R}^2$ and $\mu_n \to +\infty$. Note that, for any domain $\Omega \subset \mathbb{R}^2$ with $p \in \Omega$, we have
\[ \lambda_n := \int_\Omega e^{u_n} \to 8\pi(1+\alpha), \quad \text{as } n \to \infty, \]
and
\[ -\Delta u_n = \lambda_n \frac{e^{u_n}}{\int_\Omega e^{u_n}} - 4\pi\alpha\,\delta_p \quad \text{on } \Omega. \]
Furthermore, $\exists\, x_n \to p : u_n(x_n) \to +\infty$. Therefore, in the terminology of [11], the point p defines a blow up point for $u_n$. In addition, $e^{u_n}$ “concentrates” around p in the sense that
\[ e^{u_n} = \lambda_n \frac{e^{u_n}}{\int_\Omega e^{u_n}} \to 8\pi(1+\alpha)\,\delta_p, \quad \text{weakly in the sense of measures in } \Omega. \]
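The limiting mass $8\pi(1+\alpha)$ in this example can be confirmed by an explicit computation. The sketch below takes p = 0 for convenience and introduces the shorthand c (not part of the original notation) for the constant in (5).

```latex
% Take p = 0 and set c = 1 / (8 (1+\alpha)^2). In polar coordinates,
\begin{align*}
\int_{\mathbb{R}^2} e^{u_n}\,dx
  &= 2\pi \int_0^{\infty} \frac{\mu_n\, r^{2\alpha+1}}{\bigl(1 + c\,\mu_n\, r^{2(1+\alpha)}\bigr)^{2}}\,dr .
\end{align*}
% Substitute t = c \mu_n r^{2(1+\alpha)}, so that dt = 2(1+\alpha)\, c\, \mu_n\, r^{2\alpha+1}\,dr:
\begin{align*}
\int_{\mathbb{R}^2} e^{u_n}\,dx
  = \frac{2\pi}{2(1+\alpha)\,c} \int_0^{\infty} \frac{dt}{(1+t)^2}
  = \frac{\pi}{(1+\alpha)\,c}
  = 8\pi(1+\alpha).
\end{align*}
% On a ball B_R(0) the upper limit becomes t = c \mu_n R^{2(1+\alpha)} \to \infty,
% so the mass over any fixed neighbourhood of p tends to the same value as \mu_n \to +\infty.
```

Since any domain $\Omega$ containing p lies between a small ball around p and the whole plane, the convergence $\lambda_n \to 8\pi(1+\alpha)$ follows for every such $\Omega$.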
Hence, in the “singular” case, we see that the condition $\lambda \notin 8\pi\mathbb{N}$ is no longer sufficient to guarantee uniform estimates (from above) for solutions of $(1)_\lambda$. In fact, concerning $(1)_\lambda$–(4), we can anticipate that the values $\lambda \in 8\pi(1+\alpha_j)\mathbb{N}$ may be responsible for a possible blow up point at the Dirac pole $p_j$, $j = 1, \dots, m$, and lead to a “concentration” phenomenon. In this direction we prove:

Theorem (Concentration-Compactness). Let $u_n$ be a solution sequence for $(1)_{\lambda=\lambda_n}$ and (4) with $\lambda_n \to \lambda$. There exists a subsequence $u_{n_k}$ of $u_n$ for which the following alternative holds:
either $u_{n_k}$ is uniformly bounded from above in M;
or
\[ \max_M \Bigl( u_{n_k} - \ln \int_M K e^{u_{n_k}} \Bigr) \to +\infty, \]
and there exists a finite (blow up) set $S = \{x_1, \dots, x_l\} \subset M$ such that:
(a) $\forall\, j \in \{1, \dots, l\}$, $\exists\, x_{j,n} \to x_j : u_{n_k}(x_{j,n}) \to +\infty$, and $u_{n_k} \to -\infty$ uniformly on compact sets of $M \setminus S$,
(b) $\lambda_{n_k} \dfrac{K e^{u_{n_k}}}{\int_M K e^{u_{n_k}}} \to \sum_{j=1}^{l} \beta_j \delta_{x_j}$ weakly in the sense of measures, with $\beta_j = 8\pi$ for $x_j \notin \{p_1, \dots, p_m\}$, or $\beta_j = 8\pi(1+\alpha_i)$ for $x_j = p_i$, for some $i \in \{1, \dots, m\}$.
In particular,
\[ \lambda = 8\pi n + 8\pi \sum_{j \in J} (1 + \alpha_j), \]
for some $n \in \mathbb{N} \cup \{0\}$ and $J \subset \{1, \dots, m\}$ (possibly empty) satisfying $n + |J| > 0$, where $|J|$ denotes the cardinality of the set J.

Clearly, this result states a concentration-compactness principle for the sequence $\dfrac{K e^{u_n}}{\int_M K e^{u_n}} \subset C^0(M)$. We refer to Theorem 7 and Corollary 5 in Sect. 4 for a more general statement. An immediate and very helpful consequence of the above result is that we can ensure uniform bounds from above for solutions of $(1)_\lambda$ and (4) whenever $\lambda$ lies in a compact set of $\mathbb{R}^+ \setminus \Gamma$, where
\[ \Gamma = 8\pi \Bigl\{ n + \sum_{j \in J} (1 + \alpha_j),\ \forall\, n \in \mathbb{N} \cup \{0\}, \text{ and } J \subset \{1, \dots, m\} \Bigr\}. \]
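As a concrete illustration of the set of critical values just introduced (written $\Gamma$ here), consider a single pole, m = 1, with weight $\alpha_1 = 1/2$. These numbers are hypothetical and chosen only for the example.

```latex
% Hypothetical data: m = 1, \alpha_1 = 1/2, so 8\pi(1+\alpha_1) = 12\pi.
% J ranges over the subsets of \{1\}: J = \emptyset or J = \{1\}.
\begin{align*}
\Gamma &= \{\, 8\pi n : n \in \mathbb{N}\cup\{0\} \,\}
          \;\cup\; \{\, 8\pi n + 12\pi : n \in \mathbb{N}\cup\{0\} \,\} \\
       &= \{\, 0,\ 8\pi,\ 12\pi,\ 16\pi,\ 20\pi,\ 24\pi,\ \dots \,\}.
\end{align*}
% Hence (8\pi, 16\pi) \setminus \Gamma = (8\pi, 12\pi) \cup (12\pi, 16\pi):
% the only new critical value in this window is 12\pi = 8\pi(1+\alpha_1).
```

In this example the singular datum therefore removes exactly one additional value of $\lambda$ from the interval $(8\pi, 16\pi)$, compared with the regular case.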
Notice that, when $\alpha_j = 0$, $\forall\, j \in \{1, \dots, m\}$, then $\Gamma = 8\pi\mathbb{N}$ and we recover the known results concerning the “regular” case. In particular, the above result allows us to extend the existence result of Ding–Jost–Li–Wang [21] to the singular equation $(1)_\lambda$, provided $\lambda \in (8\pi, 16\pi) \setminus \Gamma$. In fact, we shall follow [21] to treat a more general coupled system of singular Liouville type equations arising from the study of periodic Electroweak vortices as described in Sect. 5. The proof of the concentration-compactness principle above relies upon two main facts, which represent the core of our analysis. The first concerns a local concentration-compactness principle in the same spirit as Brezis–Merle’s work [11], for solution sequences of the following “singular” problem:
\[ \begin{cases} -\Delta v_n = V_n(x)\, e^{v_n} - 4\pi\alpha\,\delta_p & \text{in } \Omega, \\ \int_\Omega e^{v_n} \le C, \end{cases} \tag{6} \]
where $0 \le V_n \le b$, $\alpha > 0$, $\Omega$ is a bounded domain in $\mathbb{R}^2$ and $p \in \Omega$. Note that in case $v_n$ admits a blow up point at p, it is not possible to treat problem (6) by means of Brezis–Merle’s analysis. Indeed, the arguments in [11] allow one to derive a concentration phenomenon provided $\limsup_{n\to\infty} \int_{B_r(p)} V_n e^{v_n} \ge 4\pi(1+\alpha)$ (see also the proof of Theorem 4 in Sect. 3). On the other hand, the opposite condition, $\limsup_{n\to\infty} \int_{B_r(p)} V_n e^{v_n} < 4\pi(1+\alpha)$, is no longer sufficient to exclude blow up at the point p, as happens in the case $\alpha = 0$. This leaves a gap that needs to be filled for the desired concentration-compactness principle to hold. To this end, we use a Pohozaev type identity that, in case of blow up at p, leads to the following local result:

Theorem (Concentration). Suppose that $V_n \to V$ in $C^0(\overline{\Omega}) \cap C^1_{loc}(\Omega)$. Let the sequence $v_n$ satisfy (6) and assume it admits p as the only blow up point in $\Omega$ (in the sense of (16) below). Then, along a subsequence (denoted in the same way), we have:
\[ \max_K v_n \to -\infty, \quad \text{as } n \to \infty, \text{ for any compact set } K \subset \Omega \setminus \{p\}, \tag{7} \]
\[ V_n e^{v_n} \to \beta\,\delta_{p=0}, \quad \text{weakly in the sense of measures in } \Omega, \text{ with } \beta \ge 8\pi. \tag{8} \]
This clearly leads to a version of Brezis–Merle’s result [11] appropriate for problem (6), which is stated in Theorem 5 of Sect. 3. At this point, the second important piece of information to derive consists in identifying the precise value of $\beta$ in (8). In the regular case $\alpha = 0$, this task has been successfully carried out by Li–Shafrir in [35], where they show that any blow up point carries a mass $\beta \in 8\pi\mathbb{N}$. In analogy with Li–Shafrir’s result, it is reasonable to expect that, in the singular case, when the pole p occurs as a blow up point, (8) should hold with
\[ \beta \in 8\pi(1+\alpha)\mathbb{N}. \tag{9} \]
This conjecture is supported also by the fact that all solutions of the “singular” Liouville equation
\[ \begin{cases} -\Delta \xi = \lambda e^{\xi} - 4\pi\alpha\,\delta_p & \text{in } \mathbb{R}^2, \\ \int_{\mathbb{R}^2} e^{\xi} < +\infty, \end{cases} \tag{10} \]
satisfy $\lambda \int_{\mathbb{R}^2} e^{\xi} = 8\pi(1+\alpha)$, see Chen–Li [20]. Note that problem (10) occurs, with $\lambda = V(p)$, as a limiting equation after blowing up $v_n$ around p. All solutions of (10) have been completely classified by Prajapat–Tarantello in [43]. In deriving (9), the solutions of (10) should play a significant role, as occurs for the solutions of the “regular” Liouville equation (i.e. $\alpha = 0$ in (10)) in Li–Shafrir’s analysis. At this point we are able to contribute towards (9) only in case we supplement (6) with some mild boundary conditions. More precisely, if we assume that, for a suitable constant $C > 0$, we have
\[ \sup_{\partial\Omega} v_n - \inf_{\partial\Omega} v_n \le C, \tag{11} \]
then we are able to prove that (9) holds with
\[ \beta = 8\pi(1+\alpha), \tag{12} \]
see Theorem 6. Again, we shall derive (12) by means of Pohozaev’s identity as in [19, 20]. Notice that if $u_n$ is defined on a compact manifold (cf. problem $(1)_\lambda$), locally, we may ensure (11) by means of Green’s representation formula. As already mentioned, the analysis of problem $(1)_\lambda$ described above will be crucial to establish periodic N-vortex solutions for the Electroweak Theory, as will be clarified in the last section, where we give the basic definitions and formulate this theory in terms of the unitary gauge. By imposing that the magnetic excitation is in the third spatial direction, as in [46], we derive the Bogomol’nyi (first order) system of equations and discuss the (gauge invariant) ’t Hooft periodic boundary conditions. Consequently, we will show that selfdual N-vortex solutions for the theory can be obtained by solving an elliptic system of equations on the flat two-torus, see (171). Towards this goal, in Sect. 5, we shall introduce a variational framework that will allow us to follow the approach of Ding–Jost–Li–Wang [21] and obtain an existence result for a general Liouville-type system modelled on the elliptic system (171), see Theorem 8. The concentration-compactness principle discussed above will be crucial at this stage. As a consequence of Theorem 8, we will be able to derive natural conditions, both necessary and sufficient, for the existence of selfdual N-vortex solutions that improve those of Spruck–Yang [46], see Theorem 10 and Corollary 7.

2. Preliminaries

In this section we collect some known results which will be used frequently in the following sections. Let $\Omega \subset \mathbb{R}^2$ be an open domain and $v_n$ be a solution sequence for the equation
\[ -\Delta v_n = V_n e^{v_n} \quad \text{in } \Omega, \tag{13} \]
satisfying
\[ \int_\Omega e^{v_n} \le C, \tag{14} \]
where
\[ 0 \le V_n \le b \quad \text{in } \Omega, \tag{15} \]
for suitable positive constants C and b. Following [11], we define a “blow-up” point relative to $v_n$ as follows:
Definition. The point $p \in \Omega$ is said to be a blow up point for $v_n$ if
\[ \exists\, \{x_n\} \subset \Omega : x_n \to p \text{ and } v_n(x_n) \to +\infty. \tag{16} \]
Concerning (13)–(14), Brezis and Merle in [11] proved the following:

Theorem 1 (Brezis–Merle). Suppose that $v_n$ satisfies (13), (14) and assume that (15) holds. Then, possibly extracting a subsequence (still called $v_n$), one of the following alternatives holds:
i) $v_n$ is uniformly bounded in $L^\infty_{loc}(\Omega)$.
ii) $v_n \to -\infty$ uniformly on compact sets of $\Omega$.
iii) There exist a finite set $S = \{q_1, \dots, q_r\} \subset \Omega$ (blow up set) and corresponding sequences $\{x_n^1\}_{n\in\mathbb{N}}, \dots, \{x_n^r\}_{n\in\mathbb{N}}$ in $\Omega$ with $x_n^i \to q_i$ and $v_n(x_n^i) \to +\infty$ for $i \in \{1, \dots, r\}$. Moreover, $v_n \to -\infty$ uniformly on compact sets of $\Omega \setminus S$, and $V_n e^{v_n} \to \sum_{i=1}^{r} \beta_i \delta_{q_i}$ weakly in the sense of measures in $\Omega$, with $\beta_i \ge 4\pi$ for any $i \in \{1, \dots, r\}$.

Li–Shafrir in [35] have further investigated alternative iii), by showing that, under the additional hypothesis (17) below, each blow up point $q_i$ carries a mass $\beta_i = 8\pi m_i$, with $m_i \in \mathbb{N}$, $i = 1, \dots, r$. More precisely,

Theorem 2 (Li–Shafrir). Suppose that $v_n$ satisfies (13), (14) and assume that
\[ 0 \le V_n \in C^0(\overline{\Omega}), \quad V_n \to V \text{ uniformly in } \Omega. \tag{17} \]
If alternative iii) holds in Theorem 1, then $\beta_i = 8\pi m_i$, with $m_i \in \mathbb{N}$ and $i = 1, \dots, r$.

An immediate consequence of Theorems 1 and 2 is the following:

Corollary 1. Suppose that $v_n$ satisfies (13), (14) and
\[ \limsup_{n\to\infty} \int_\Omega V_n e^{v_n} < 8\pi. \]
Assume (17); then $v_n$ is uniformly bounded from above on compact sets of $\Omega$. That is, only alternatives i) and ii) may occur in Theorem 1 in this case.

A very accurate description of the behaviour of $v_n$ near each blow up point is furnished by Li in [34]. In particular, Li’s analysis excludes the possibility of “multiple-bubbling” (i.e. $m_i > 1$ in Theorem 2) under some mild boundary conditions, see (18) below. This fact was first noticed by Wolanski and it is an immediate consequence of Theorem 0.3 in [34].

Theorem 3. Let $v_n$ be a sequence of solutions of (13) satisfying (14), (17) and in addition assume that $V_n$ is a sequence of Lipschitz continuous functions with $|\nabla V_n| \le C_0$ in $\Omega$. Let $p \in \Omega$ be such that
\[ v_n(p) = \max_\Omega v_n \to +\infty, \quad \text{and} \quad V_n e^{v_n} \to \beta\,\delta_p, \]
weakly in the sense of measures in $\Omega$. If
\[ \max_{\partial\Omega} v_n - \min_{\partial\Omega} v_n \le C, \tag{18} \]
for suitable $C > 0$, then $\beta = 8\pi$.
We shall provide the appropriate versions of Theorems 1 and 3 in case Dirac measures are included in Eq. (13) as inhomogeneous data. See Theorem 5 and Theorem 6 in Sect. 3. We recall the following form of the Harnack inequality, which will be widely used in the sequel.

Lemma 1 (Harnack type inequality). Let $\Omega \subset \mathbb{R}^2$ be a smooth bounded domain and v satisfy
\[ -\Delta v = f \quad \text{in } \Omega, \tag{19} \]
with $f \in L^s(\Omega)$, $s > 1$. For any subdomain $\Omega' \subset\subset \Omega$ there exist two positive constants $\tau \in (0, 1)$ and $\gamma > 0$, depending on $\Omega'$ only, such that:
(a) if $\sup_{\partial\Omega} v \le C$, then
\[ \sup_{\Omega'} v \le \tau \inf_{\Omega'} v + (1+\tau)\,\gamma\, \|f\|_{L^s(\Omega)} + (1-\tau)\, C, \tag{20} \]
(b) if $\inf_{\partial\Omega} v \ge -C$, then
\[ \tau \sup_{\Omega'} v \le \inf_{\Omega'} v + (1+\tau)\,\gamma\, \|f\|_{L^s(\Omega)} + (1-\tau)\, C. \tag{21} \]

3. Local Analysis: The Case of Blow Up at the Dirac Pole

In this section we analyze the blow up behaviour of a solution sequence around a blow up point which is assumed to coincide with a pole of the Dirac measure included in the equation under examination. Without loss of generality, we take such a pole $p = 0$ and, due to the dilation invariance of our problem via the transformation $u(x) \to u(Rx) + 2\ln R$, we can always assume that our local assumptions hold on the unit ball. Set $D = B_1(0)$ and $D_r = B_r(0)$. In the following analysis, we adopt the point of view of considering solutions to $(1)_\lambda$ normalized by the condition $\int_M K e^{u} = \lambda$. Therefore, “locally”, we are going to consider a solution sequence $u_n$ of the problem:
\[ \begin{cases} -\Delta u_n = K_n(x)\, e^{u_n} - \psi_n(x) - 4\pi\alpha_n\,\delta_{p=0} & \text{in } D, \\ \int_D e^{u_n(x)}\,dx \le C, \end{cases} \tag{22} \]
where, unless otherwise specified, we assume that
\[ 0 \le K_n \in C^0(\overline{D}), \text{ and } K_n \to K \text{ uniformly in } D, \tag{23} \]
\[ \|\psi_n\|_{L^s(D)} \le C, \quad s > 1, \tag{24} \]
for a suitable constant $C > 0$. We suppose that zero is a point of blow up for $u_n$ in D, and more precisely that
\[ \exists\, \{x_n\} \subset D : x_n \to 0 \text{ and } \sup_D u_n = u_n(x_n) \to +\infty, \text{ as } n \to \infty. \tag{25} \]
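The dilation invariance invoked at the start of this section can be verified directly. The sketch below drops the index n and uses the hypothetical notation $\tilde u$, $\tilde K$, $\tilde\psi$ for the rescaled quantities; it relies on the identity $\delta_0(Rx) = R^{-2}\delta_0(x)$ in two dimensions.

```latex
% Rescaling \tilde u(x) = u(Rx) + 2\ln R, with R > 0, for a solution of
%   -\Delta u = K e^{u} - \psi - 4\pi\alpha\,\delta_{0}  in B_R(0).
\begin{align*}
-\Delta \tilde u(x) &= R^{2}\,(-\Delta u)(Rx) \\
  &= R^{2} K(Rx)\, e^{u(Rx)} - R^{2}\psi(Rx) - 4\pi\alpha\, R^{2}\delta_{0}(Rx) \\
  &= \tilde K(x)\, e^{\tilde u(x)} - \tilde\psi(x) - 4\pi\alpha\,\delta_{0}(x)
  \quad \text{in } B_1(0),
\end{align*}
% where \tilde K(x) = K(Rx) and \tilde\psi(x) = R^{2}\psi(Rx), using e^{\tilde u(x)} = R^{2} e^{u(Rx)}.
% The mass is preserved under the change of variables y = Rx:
\begin{align*}
\int_{B_1(0)} e^{\tilde u(x)}\,dx = \int_{B_1(0)} R^{2} e^{u(Rx)}\,dx = \int_{B_R(0)} e^{u(y)}\,dy .
\end{align*}
```

The rescaled function thus solves a problem of the same form as (22) on the unit ball, with the same bound on the exponential integral.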
Lemma 2 (Minimal-Mass Lemma). Suppose that (23) and (24) hold. Let the sequence $u_n$ satisfy (22) with $\alpha_n \to \alpha > 0$, (25) and $K_n e^{u_n} \to \nu$ weakly in the sense of measures in D. Then $K(0) > 0$ and $\nu(\{0\}) \ge 8\pi$.

Proof. To derive our conclusion, we need to show that, for every $r \in (0, 1)$,
\[ \liminf_{n\to\infty} \int_{D_r} K_n e^{u_n} \ge 8\pi. \tag{26} \]
To this purpose, fix $r \in (0, 1)$, and note that for n sufficiently large, $\max_{D_r} u_n = u_n(x_n)$. Set
\[ \delta_n = \exp\Bigl( -\frac{u_n(x_n)}{2} \Bigr) \to 0, \quad \text{as } n \to \infty, \]
and define $t_n = \max\{\delta_n, |x_n|\} \to 0$, as $n \to \infty$. The sequence of functions $\xi_n(x) = u_n(t_n x) + 2\ln t_n$, defined on the set $B_n = B_{r/t_n}(0)$, satisfies:
\[ \begin{cases} -\Delta \xi_n = K_n(t_n x)\, e^{\xi_n} - \tilde\psi_n(x) - 4\pi\alpha_n\,\delta_{p=0} & \text{in } B_n, \\ \int_{B_n} e^{\xi_n(x)}\,dx \le C, \end{cases} \tag{27} \]
with $\tilde\psi_n(x) = t_n^2\, \psi_n(t_n x)$. Note that $\|\tilde\psi_n\|_{L^p(B_n)} \le C$ and $\tilde\psi_n \to 0$ weakly in $L^p_{loc}(\mathbb{R}^2)$. Set $y_n = \frac{x_n}{t_n}$, and note that $|y_n| \le 1$, so by taking a subsequence we can assume $y_n \to y_0 \in \mathbb{R}^2$, with $|y_0| \le 1$. Furthermore,
\[ \sup_{B_n} \xi_n = \xi_n(y_n) = u_n(x_n) + 2\ln t_n \ge u_n(x_n) + 2\ln\delta_n = 0. \]
Hence $\xi_n$ is uniformly bounded from below along $y_n$. We distinguish two cases:
Case A. $\xi_n(y_n) \le C$, $\forall\, n \in \mathbb{N}$.
Case B. $\limsup_{n\to\infty} \xi_n(y_n) = +\infty$.
Concerning Case A, we obtain

Claim. Under the assumptions of Case A, we have that
\[ \liminf_{n\to\infty} \int_{D_r} K_n e^{u_n} \ge 8\pi(1+\alpha). \]
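Proof of the Claim. In this case,
\[ 0 \le \sup_{B_n} \xi_n = \xi_n(y_n) \le C. \tag{28} \]
Write $\xi_n(x) = 2\alpha_n \ln|x| + \varphi_n(x)$, so that $\varphi_n(x)$ defines the regular part of $\xi_n$, and for every $R > 1$ there exists a constant $C_R > 0$ such that it satisfies:
\[ \begin{cases} -\Delta \varphi_n = K_n(t_n x)\, e^{\xi_n} - \tilde\psi_n(x) & \text{on } D_{2R}, \\ \sup_{\partial D_{2R}} |\varphi_n| \le C_R, \end{cases} \tag{29} \]
provided n is sufficiently large. Since $f_n = K_n(t_n x)\, e^{\xi_n} - \tilde\psi_n(x)$ is uniformly bounded in $L^s(D_{2R})$, $s > 1$, by elliptic estimates we may conclude that $|\varphi_n|$ is uniformly bounded in $C^{1,\delta}_{loc}(\mathbb{R}^2)$ for some $\delta \in (0, 1)$. Therefore, we can use a diagonal process to conclude that, along a subsequence, $\varphi_n \to \varphi$ uniformly on compact sets of $\mathbb{R}^2$ and $\varphi$ satisfies:
\[ \begin{cases} -\Delta \varphi = |x|^{2\alpha} K(0)\, e^{\varphi} & \text{in } \mathbb{R}^2, \\ \int_{\mathbb{R}^2} |x|^{2\alpha} e^{\varphi} < +\infty. \end{cases} \tag{30} \]
Notice that since $\varphi + 2\alpha\ln|x|$ is bounded in $\mathbb{R}^2$, necessarily $K(0) \ne 0$. Thus, by the results in [20], we have $K(0) \int_{\mathbb{R}^2} |x|^{2\alpha} e^{\varphi} = 8\pi(1+\alpha)$. So,
\begin{align*}
\liminf_{n\to+\infty} \int_{D_r} K_n e^{u_n}
  &= \liminf_{n\to+\infty} \int_{B_n} K_n(t_n x)\, e^{\xi_n}
   = \liminf_{n\to+\infty} \int_{B_n} K_n(t_n x)\, |x|^{2\alpha_n} e^{\varphi_n} \\
  &\ge K(0) \int_{\mathbb{R}^2} |x|^{2\alpha} e^{\varphi} = 8\pi(1+\alpha).
\end{align*}
That is, $\nu(\{0\}) \ge 8\pi(1+\alpha)$ in this case.

Case B. In this case, necessarily $t_n = |x_n|$ (along a subsequence) and consequently $|y_0| = 1$ (recall $y_0 = \lim_{n\to\infty} \frac{x_n}{t_n}$). Hence, in this situation, $\xi_n$ admits a blow up point at $y_0 \ne 0$. So we can apply the Li–Shafrir Theorem (cf. Theorem 2 in Sect. 2) to the sequence $\xi_n$ in a small neighborhood $B_\delta(y_0) \subset\subset \mathbb{R}^2 \setminus \{0\}$ and obtain that $K(0) > 0$ and $\nu|_{B_r(y_0)} = 8\pi m\,\delta_{y_0}$, for some $m \in \mathbb{N}$. So, we derive the desired conclusion in this case as well.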
An immediate consequence of Lemma 2 is the following extension of Corollary 1:

Corollary 2. Let $u_n$ satisfy (22) with $\limsup_{n\to\infty} \int_D K_n e^{u_n} < 8\pi$. Then $u_n$ is uniformly bounded from above on any compact set of D, and so it cannot admit a blow-up point in D.
Proof of Corollary. Argue by contradiction and suppose that $u_n$ admits a blow-up point p in D. According to whether $p \ne 0$ or $p = 0$, we can apply either Corollary 1 in Sect. 2 or Lemma 2 above in a small neighborhood of p to conclude that necessarily
\[ \liminf_{n\to\infty} \int_D K_n e^{u_n} \ge 8\pi, \]
a contradiction.

Next, we want to show that, if zero is the only point of blow up for $u_n$ in D, that is,
\[ \text{for any } r \in (0, 1),\ \exists\, C_r > 0 \text{ such that: } \max_{D \setminus D_r} u_n \le C_r, \tag{31} \]
\[ \max_{D_r} u_n \to +\infty, \tag{32} \]
then $u_n$ undergoes the same concentration phenomenon that occurs in alternative iii) of Brezis–Merle’s Theorem. Namely, there exists a subsequence $u_{k_n}$ of $u_n$ such that, as $n \to \infty$, we have
\[ u_{k_n} \to -\infty, \quad \text{uniformly on every compact set } \Omega \subset D \setminus \{0\}, \tag{33} \]
\[ K_{k_n} e^{u_{k_n}} \to \beta\,\delta_{p=0}, \quad \text{weakly in the sense of measures on } D, \text{ with } \beta \ge 8\pi. \tag{34} \]
For this purpose, we decompose $u_n$ as the sum of its regular and singular parts. Hence, define the function $s_n(x)$ as the unique solution of the problem:
\[ \begin{cases} \Delta s_n = \psi_n(x) + 4\pi\alpha_n\,\delta_{p=0} & \text{in } D, \\ s_n = 0 & \text{on } \partial D. \end{cases} \tag{35} \]
Consequently,
\[ s_n(x) = 2\alpha_n \ln|x| + \sigma_n(x), \tag{36} \]
and, by virtue of (24), $\sigma_n$ satisfies:
\[ \|\sigma_n\|_{C^{0,\gamma}(D)} \le C_0, \quad \text{for some } \gamma \in (0, 1) \text{ and } C_0 > 0. \tag{37} \]
Write
\[ u_n(x) = v_n(x) + s_n(x). \tag{38} \]
So $v_n$ solves the problem:
\[ \begin{cases} -\Delta v_n = V_n(x)\, e^{v_n} & \text{in } D, \\ \int_D |x|^{2\alpha_n} e^{v_n(x)}\,dx < C, \end{cases} \tag{39} \]
with
\[ V_n(x) = K_n(x)\, |x|^{2\alpha_n} e^{\sigma_n(x)}. \tag{40} \]
We observe the equivalence between the blow-up properties for the sequence $u_n$ satisfying (22) and those of its corresponding regular part $v_n$ satisfying (39).
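For completeness, the passage from (22) to (39)–(40) is a short computation. The sketch below reads (35) with the sign convention $\Delta s_n = \psi_n + 4\pi\alpha_n\delta_{p=0}$, which is the one compatible with (36).

```latex
% Subtract the singular part: v_n = u_n - s_n, with s_n = 2\alpha_n \ln|x| + \sigma_n.
\begin{align*}
-\Delta v_n &= -\Delta u_n + \Delta s_n \\
  &= \bigl( K_n e^{u_n} - \psi_n - 4\pi\alpha_n \delta_{p=0} \bigr)
     + \bigl( \psi_n + 4\pi\alpha_n \delta_{p=0} \bigr) \\
  &= K_n\, e^{v_n + s_n}
   = K_n(x)\,|x|^{2\alpha_n} e^{\sigma_n(x)}\, e^{v_n}
   = V_n(x)\, e^{v_n}.
\end{align*}
% The integral bound in (39) follows from (22) and the uniform bound (37) on \sigma_n:
\begin{align*}
\int_D |x|^{2\alpha_n} e^{v_n}\,dx
  = \int_D e^{-\sigma_n} e^{u_n}\,dx
  \le e^{C_0} \int_D e^{u_n}\,dx .
\end{align*}
```

Both the source term $\psi_n$ and the Dirac mass cancel, so the singular datum survives only through the weight $|x|^{2\alpha_n}$ inside $V_n$.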
Lemma 3. $u_n$ satisfies (31)–(32) if and only if $v_n$ satisfies (31)–(32).

Proof. If $u_n$ satisfies (31)–(32), then $v_n$ also satisfies (31)–(32), as follows easily from (36), (37) and (38). To show the converse, we argue by contradiction and suppose that $\forall\, 0 < r < 1$, $\exists\, C_r > 0$:
\[ \max_{D \setminus D_r} v_n \le C_r, \qquad \max_{|x| \le r} v_n \to +\infty, \tag{41} \]
while there exists a uniform constant C such that
\[ u_n(x) \le C, \quad \forall\, x \in D. \tag{42} \]
Clearly (42) implies that $f_n(x) := V_n e^{v_n} = K_n e^{u_n}$ satisfies $0 \le f_n \le C_1$, $\forall\, x \in D$, with $C_1 > 0$ a suitable constant. We see that $v_n$ fulfills all the assumptions of Lemma 1 in Sect. 2 and we derive that $\inf_{D_r} v_n \to +\infty$, for every $r \in (0, 1)$. Hence $v_n$ blows up uniformly in D, which is impossible since it contradicts the condition
\[ \int_D |x|^{2\alpha_n} e^{v_n} \le C, \quad \forall\, n \in \mathbb{N}. \]
Thus, in case of blow up at zero, Lemma 3 shows that, to deduce the nature of the limiting measure for $K_n e^{u_n} = K_n |x|^{2\alpha_n} e^{\sigma_n} e^{v_n}$, we face a delicate problem, as we have to control a product of two competing terms: $|x|^{2\alpha_n}$, vanishing as $x \to 0$, against $e^{v_n}$, which explodes as $x \to 0$. We obtain the following:

Theorem 4 (Concentration). Let $u_n$ satisfy (22), (31) and (32) with
\[ \alpha_n \to \alpha \ge 0, \tag{43} \]
\[ 0 \le K_n \in C^0(\overline{D}) : K_n \to K \text{ uniformly in } D, \text{ and in } C^1_{loc}(D), \tag{44} \]
\[ \|\psi_n\|_{L^s(D)} \le C, \quad s > 2; \tag{45} \]
then there exists a subsequence $u_{k_n}$ of $u_n$ such that (33) and (34) hold.

Proof. We shall work with the sequence
\[ v_n(x) = u_n(x) - s_n(x), \tag{46} \]
where $s_n(x) = 2\alpha_n \ln|x| + \sigma_n(x)$ is the unique solution of problem (35). Hence, $v_n$ solves problem (39) and, in view of our assumptions, along a subsequence, we have that $\sigma_n \to \sigma$ uniformly in D and in $C^1_{loc}(D)$. In particular,
\[ K_n(x)\, e^{\sigma_n(x)} \to K(x)\, e^{\sigma(x)} \quad \text{in } C^0(\overline{D}) \cap C^1_{loc}(D), \]
and $K(0) > 0$, as follows from Lemma 2. We need to show that for every $r \in (0, 1)$, along a subsequence, we have
\[ \min_{|x| = r} v_n \to -\infty, \quad \text{as } n \to \infty. \]
Indeed, by (39), (40) and (41) we can apply Harnack’s inequality, as stated in Lemma 1, to conclude that
\[ \max_{|x| = r} v_n \to -\infty, \quad \text{as } n \to \infty, \tag{47} \]
and by a diagonalization process derive (33). We argue by contradiction and suppose that there exist $r \in (0, 1)$ and $C > 0$ such that
\[ \min_{|x| = r} v_n \ge -C, \quad \forall\, n \in \mathbb{N}. \]
By the maximum principle and (41) we conclude that $v_n$ is uniformly bounded in $L^\infty_{loc}(D_r \setminus \{0\})$. Thus we can use elliptic estimates and, by extracting a subsequence, we may assume that
\[ v_n \to \xi \text{ pointwise a.e. and in } C^{1,\delta}_{loc}(D_r \setminus \{0\}), \text{ for some } \delta \in (0, 1), \tag{48} \]
\[ V_n(x)\, e^{v_n(x)} \to V(x)\, e^{\xi(x)}, \text{ in } C^0_{loc}(D_r \setminus \{0\}), \tag{49} \]
where we recall that $V_n(x) = |x|^{2\alpha_n} K_n(x)\, e^{\sigma_n(x)}$ (see (40)) and we have set
\[ V(x) = |x|^{2\alpha} K(x)\, e^{\sigma(x)}. \tag{50} \]
Note that by Fatou’s lemma, $V e^{\xi} \in L^1(D_r)$. Consequently, by taking into account Lemma 2, we derive
\[ V_n e^{v_n} \to \nu = V e^{\xi} + \beta\,\delta_{p=0}, \text{ weakly in the sense of measures in } D_r, \text{ with } \beta \ge 8\pi. \tag{51} \]
Since $K(0) > 0$, from (44) and (51) we may also conclude
\[ |x|^{2\alpha_n} e^{\sigma_n} e^{v_n} \to |x|^{2\alpha} e^{\sigma} e^{\xi} + \frac{\beta}{K(0)}\,\delta_{p=0}, \tag{52} \]
weakly in the sense of measures in $D_r$. Fix $0 < r_0 < r$, and on $D_0 = D_{r_0}$ define:
\[ \phi_n(x) = V_n(x)\, e^{v_n(x)} \quad \text{and} \quad \phi(x) = V(x)\, e^{\xi(x)}. \tag{53} \]
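We use Green’s representation formula for $v_n$ in $D_0$ to derive that
\[ \xi(x) = \frac{\beta}{2\pi} \ln\frac{1}{|x|} + \varphi(x) + \gamma(x), \tag{54} \]
with
\[ \varphi(x) = \frac{1}{2\pi} \int_{D_0} \ln\frac{1}{|x - y|}\, \phi(y)\, dy, \tag{55} \]
and
\[ \gamma(x) = \frac{1}{2\pi} \int_{|y| = r_0} \ln|x - y|\, \frac{\partial\xi}{\partial\nu}(y)\, dy - \frac{1}{2\pi} \int_{|y| = r_0} \frac{(x - y)\cdot\nu}{|x - y|^2}\, \xi(y)\, dy. \tag{56} \]
Clearly,
\[ \gamma \in C^1(D_r), \quad \text{for every } r \in (0, r_0). \tag{57} \]
Next we note that $\varphi \in L^\infty(D_0)$. To see this, we observe first that $\varphi(x)$ is clearly bounded from below on $D_0$, as we have
\[ \varphi(x) \ge \frac{1}{2\pi} \ln\frac{1}{2r_0}\, \|\phi\|_{L^1(D_0)}, \quad \forall\, x \in D_0. \]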
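Since $K(0) > 0$, for r sufficiently small, by (54) we find
\[ \phi(x) = V(x)\, e^{\xi(x)} = \frac{|x|^{2\alpha}}{|x|^{\frac{\beta}{2\pi}}}\, K(x)\, e^{\sigma(x)}\, e^{\varphi(x) + \gamma(x)} \ge \frac{c}{|x|^{\frac{\beta}{2\pi} - 2\alpha}}, \quad 0 < |x| < r, \]
for a suitable $c > 0$. Thus, by the integrability of $\phi$, we see that necessarily
\[ \beta < 4\pi(1+\alpha). \tag{58} \]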
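The step from the lower bound on $\phi$ to (58) is the standard integrability criterion for radial powers in the plane. It is spelled out below, with $\kappa$ a shorthand exponent introduced only for this sketch.

```latex
% For 0 < r < 1 and an exponent \kappa, in polar coordinates
\begin{align*}
\int_{|x|<r} \frac{dx}{|x|^{\kappa}} = 2\pi \int_0^{r} \rho^{\,1-\kappa}\,d\rho < +\infty
\quad\Longleftrightarrow\quad \kappa < 2 .
\end{align*}
% With \kappa = \beta/(2\pi) - 2\alpha, the bound \phi(x) \ge c\,|x|^{-\kappa}
% together with \phi \in L^1(D_r) forces
\begin{align*}
\frac{\beta}{2\pi} - 2\alpha < 2
\quad\Longleftrightarrow\quad \beta < 4\pi(1+\alpha).
\end{align*}
```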
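Since $\beta \ge 8\pi$, notice that (58) already yields a contradiction in case $\alpha \in [0, 1]$. However, for $\alpha \in [0, 1]$, it follows as in [11] that it is possible to derive (33) and (34) under weaker assumptions on $K_n$ and $\psi_n$, see [8]. In case $\alpha > 1$, set $s = \frac{\beta}{2\pi} - 2\alpha$ and note that, in view of (58), we have $s \in (0, 2)$. Since
\[ \phi(x) = V(x)\, e^{\xi(x)} \le \frac{C}{|x|^{s}}\, e^{\varphi(x)} \quad \text{in } D_0, \tag{59} \]
and $e^{\varphi} \in L^k(D_0)$, $\forall\, k \ge 1$ (see for example Corollary 1 in [11]), by Hölder’s inequality we obtain immediately that $\phi \in L^t(D_0)$, $\forall\, t \in \bigl[1, \frac{2}{s}\bigr)$. In turn, from (55) we get that $\varphi$ is also bounded from above. Consequently,
\[ |x|^{2\alpha} e^{\xi(x)} = O\Bigl( \frac{1}{|x|^{s}} \Bigr), \quad |x| < r_0, \quad \text{and } s = \frac{\beta}{2\pi} - 2\alpha \in (0, 2). \tag{60} \]
To estimate $\nabla\varphi(x)$, for $|x| = r < r_0$, note that
\begin{align*}
|\nabla\varphi(x)| &\le \frac{1}{2\pi} \int_{D_0} \frac{1}{|x - y|}\, \phi(y)\, dy \\
  &= \frac{1}{2\pi} \int_{\{|x-y| \ge \frac{r}{2}\} \cap D_0} \frac{1}{|x - y|}\, \phi(y)\, dy
   + \frac{1}{2\pi} \int_{\{|x-y| \le \frac{r}{2}\} \cap D_0} \frac{1}{|x - y|}\, \phi(y)\, dy
   = I_{1,r} + I_{2,r}.
\end{align*}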
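Fix $t \in \bigl(1, \frac{2}{s}\bigr)$ and choose $\tau > 0$ so that $\tau\, \frac{t}{t-1} < 2$, and $0 < \tau < 2 - s$. Thus, by Hölder’s inequality we obtain
\[ I_{1,r} \le \frac{c_1}{r^{1-\tau}}, \quad \text{for some suitable } \tau \in (0, 2 - s), \text{ and } c_1 > 0. \]
Concerning $I_{2,r}$, we use (60) to get
\[ I_{2,r} = \frac{1}{2\pi} \int_{\{|x-y| \le \frac{r}{2}\} \cap D_0} \frac{1}{|x - y|}\, \phi(y)\, dy \le C \int_{|x-y| \le \frac{r}{2}} \frac{dy}{|x - y|\, |y|^{s}}; \tag{61} \]
since $|x| = r$, the condition $|x - y| \le \frac{r}{2}$ implies that $|y| \ge \frac{r}{2}$, whence
\[ I_{2,r} \le \frac{C}{r^{s}} \int_{|x-y| \le \frac{r}{2}} \frac{dy}{|x - y|} \le c_2\, r^{1-s}, \]
for suitable $c_2 > 0$. In conclusion, $\forall\, x : |x| = r < r_0$,
\[ |\nabla\varphi(x)| \le \frac{c_1}{r^{1-\tau}} + c_2\, r^{1-s} \le C \Bigl( \frac{1}{r^{1-\tau}} + 1 \Bigr), \tag{62} \]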
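for suitable $\tau \in (0, 2 - s)$ and $C > 0$. At this point we are ready to derive our contradiction by means of a Pohozaev type identity. Multiply (39) by $x \cdot \nabla v_n$ and integrate over $D_r$, $r \in (0, r_0)$. After some integration by parts we obtain the following identity:
\[ \int_{\partial D_r} \Bigl[ (x, \nu)\, \frac{|\nabla v_n|^2}{2} - (\nu, \nabla v_n)(x, \nabla v_n) \Bigr] d\sigma = \int_{\partial D_r} (x, \nu)\, V_n(x)\, e^{v_n}\, d\sigma - \int_{D_r} \bigl( 2V_n(x) + x \cdot \nabla V_n(x) \bigr)\, e^{v_n}\, dx. \tag{63} \]
We may use (48), (51) and (52) together with the uniform convergence $x \cdot \nabla K_n(x) \to x \cdot \nabla K(x)$ and $x \cdot \nabla\sigma_n(x) \to x \cdot \nabla\sigma(x)$ in $D_r$, to pass to the limit in (63) as $n \to \infty$, and derive the following identity:
\begin{align*}
& \int_{\partial D_r} \Bigl[ (x, \nu)\, \frac{|\nabla\xi|^2}{2} - (\nu, \nabla\xi)(x, \nabla\xi) \Bigr] d\sigma \tag{64} \\
&\quad = \int_{\partial D_r} (x, \nu)\, V(x)\, e^{\xi(x)}\, d\sigma - \int_{D_r} \bigl( 2V(x) + x \cdot \nabla V(x) \bigr)\, e^{\xi(x)}\, dx - 2\beta(1+\alpha), \tag{65}
\end{align*}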
for any r ∈ (0, r0 ). We shall analyze the right- and left-hand side of the identity above separately. Set, η = φ + γ,
(66)
D. Bartolucci, G. Tarantello
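so that, by (54), we have
\[ \nabla\xi(x) = -\frac{\beta}{2\pi}\frac{x}{|x|^2} + \nabla\eta(x). \]
Thus, (64) gives
\[ \begin{aligned}
B_r &:= \int_{\partial D_r}\left((x,\nu)\frac{|\nabla\xi|^2}{2} - (\nu,\nabla\xi)(x,\nabla\xi)\right)d\sigma \\
&= r\int_{|x|=r}\frac12\left[\left(\frac{\beta}{2\pi}\right)^{2}\frac{1}{|x|^2} - 2\,\frac{\beta}{2\pi}\frac{x}{|x|^2}\cdot\nabla\eta + |\nabla\eta|^2\right]d\sigma - r\int_{|x|=r}\left(-\frac{\beta}{2\pi}\frac{1}{|x|} + \frac{x}{|x|}\cdot\nabla\eta\right)^{2}d\sigma \\
&= r\int_{|x|=r}\left[-\frac12\left(\frac{\beta}{2\pi}\right)^{2}\frac{1}{|x|^2} + \frac{\beta}{2\pi}\frac{x}{|x|^2}\cdot\nabla\eta + \frac12|\nabla\eta|^2 - \left(\frac{x}{|x|}\cdot\nabla\eta\right)^{2}\right]d\sigma \\
&= -\frac12\left(\frac{\beta}{2\pi}\right)^{2}2\pi + \frac{\beta}{2\pi}\,r\int_{|x|=r}\frac{x}{|x|^2}\cdot\nabla\eta\,d\sigma + \frac r2\int_{|x|=r}|\nabla\eta|^2\,d\sigma - r\int_{|x|=r}\left(\frac{x}{|x|}\cdot\nabla\eta\right)^{2}d\sigma .
\end{aligned} \]
Since $\gamma \in C^{1}(D_r)$, by (66) and (62) we find:
\[ |\nabla\eta(x)| \le C\left(\frac{1}{r^{1-\tau}} + 1\right), \quad \text{for } |x| = r, \tag{67} \]
with τ ∈ (0, 2 − s) and C > 0 suitable constants. So,
\[ B_r = -\frac{\beta^2}{4\pi} + o(1), \quad \text{as } r \to 0. \tag{68} \]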
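On the other hand, by (65) and (60) we also have
\[ \begin{aligned}
B_r &= \int_{\partial D_r}(x,\nu)\,V(x)\,e^{\xi(x)}\,d\sigma - \int_{D_r}\big(2V(x) + x\cdot\nabla V(x)\big)e^{\xi(x)}\,dx - 2\beta(1+\alpha) \\
&= r\int_{|x|=r}V(x)\,e^{\xi(x)}\,d\sigma - 2(1+\alpha)\int_{D_r}V(x)\,e^{\xi(x)}\,dx - \int_{D_r}x\cdot\nabla K(x)\,|x|^{2\alpha}e^{\sigma(x)}e^{\xi(x)}\,dx \\
&\qquad - \int_{D_r}x\cdot\nabla\sigma(x)\,V(x)\,e^{\xi(x)}\,dx - 2\beta(1+\alpha) \\
&= -2\beta(1+\alpha) + o(1), \quad \text{as } r \to 0,
\end{aligned} \tag{69} \]
where each integral is $O(r^{2-s})$ by (60), since s < 2.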
Letting r → 0, by comparing (68) and (69), we see that necessarily β = 8π(1 + α), in contradiction with (58). Thus, we have established that there exists a subsequence $v_{k_n}$ such that, for every compact set $A \subset D\setminus\{0\}$,
\[ \max_{A} v_{k_n} \to -\infty, \quad \text{as } n \to \infty. \]
By (46) we immediately derive (33) for $u_{k_n}$. Furthermore, $V_{k_n}e^{v_{k_n}} \to 0$ uniformly on compact subsets of $D\setminus\{0\}$, so ν is supported at zero and $K_{k_n}e^{u_{k_n}} = V_{k_n}e^{v_{k_n}} \to \nu = \beta\,\delta_{p=0}$, with $\beta = \nu(\{0\}) \ge 8\pi$, by Lemma 2.
A useful consequence of Theorem 4 is the following version of the Brezis–Merle result.

Theorem 5. Assume (43), (44), (45) and let $u_n$ be a solution sequence for problem (22) with $\alpha_n \to \alpha \ge 0$. There exists a subsequence $u_{k_n}$ of $u_n$ for which one of the following alternatives holds:
(i) $\sup_{\Omega}\,\{u_{k_n}(x) - 2\alpha_{k_n}\ln|x|\} \le C_{\Omega}$, $\forall\,\Omega \subset\subset D$.
(ii) $\sup_{\Omega}\,\{u_{k_n}(x) - 2\alpha_{k_n}\ln|x|\} \to -\infty$, $\forall\,\Omega \subset\subset D$.
(iii) There exist a finite and nonempty set $S = \{q_1, \dots, q_l\} \subset D$, $l \in \mathbb{N}$, and sequences of points $\{x_n^1\}_{n\in\mathbb{N}}, \dots, \{x_n^l\}_{n\in\mathbb{N}} \subset D$, such that $x_n^i \to q_i$ and $u_{k_n}(x_n^i) \to \infty$ for $i \in \{1, \dots, l\}$. Moreover $\sup_{\Omega}\,\{u_{k_n}(x) - 2\alpha_{k_n}\ln|x|\} \to -\infty$ on any compact set $\Omega \subset D\setminus S$ and $K_{k_n}e^{u_{k_n}} \to \sum_{i=1}^{l}\beta_i\,\delta_{q_i}$ weakly in the sense of measures in D; furthermore $\beta_i \in 8\pi\mathbb{N}$ if $q_i \ne 0$ and $\beta_i \ge 8\pi$ if $q_i = 0$ for some $i = 1, \dots, l$.

Proof. As above, we shall work with the sequence $v_n$ defined in (38). Note that in any subdomain $\Omega \subset\subset D\setminus\{0\}$ we have
\[ \int_{\Omega}e^{v_n} \le C_{\Omega}, \tag{70} \]
with $C_{\Omega} > 0$ a suitable constant depending on Ω only. Recall that the blow up set S of $v_n$ in D is defined by setting
\[ S = \{x \in D : \exists\,\{x_n\} \subset D \text{ such that } x_n \to x \text{ and } v_n(x_n) \to +\infty\}. \]
Since for every δ > 0 sufficiently small the solution sequence $v_n$ satisfies all assumptions of Brezis–Merle's Theorem in $D\setminus\overline{D}_{\delta}$, and $v_n(x) = u_n(x) - 2\alpha_n\ln|x| + O(1)$ in D, we may conclude that $S' = S\setminus\{0\}$ is a finite set and, along a subsequence, $u_n(x) - 2\alpha_n\ln|x|$ satisfies one of the alternatives (i)–(iii) above with D replaced by $D' = D\setminus\{0\}$ and S replaced by $S'$. Obviously, each blow up point for $v_n$ in $S'$ (when not empty) is also a blow up point for $u_n$. Hence we are left to analyze what happens around zero. Observe that, by virtue of Lemma 3, the point x = 0 is a blow up point for $v_n$ if and only if it is a blow up point for $u_n$. Thus, in case zero is not a blow up point for $v_n$ (and hence for $u_n$), that is $S = S'$, then $v_n$ is uniformly bounded from above in a small neighborhood of zero. This, combined with (70), gives that $v_n$ satisfies all assumptions of Brezis–Merle's Theorem on D, and so we immediately derive the desired conclusion in this case. If zero is a blow up point for $v_n$, and hence for $u_n$, then $S = S' \cup \{0\}$. Thus, $u_n$ satisfies all assumptions of Theorem 4 in a ball $B_{r_0}(0)$, which we may take disjoint from $S'$ (when $S' \ne \emptyset$). Then, by virtue of (38), the conclusion follows in this case as well, by combining Brezis–Merle's result applied to $v_n$ on $D\setminus\{0\}$ with Theorem 4 applied to $u_n$ in $B_{r_0}(0)$.

Our next goal is to determine the precise value of the "mass" β that occurs in (34). We can handle the case where $u_n$ satisfies (22) together with the following "mild" boundary condition:
\[ \max_{\partial D}u_n - \min_{\partial D}u_n \le C, \tag{71} \]
with C a suitable positive constant. We obtain:

Theorem 6 (Mass Quantization). Under the assumptions of Theorem 4, suppose in addition that $u_n$ satisfies (71). Then (34) holds with
\[ \beta = 8\pi(1+\alpha). \tag{72} \]
Proof. As before, we shall work with the sequence $v_n(x) = u_n(x) - s_n(x)$, where $s_n(x) = 2\alpha_n\ln|x| + \sigma_n(x)$ is the unique solution for (35). By virtue of (45), along a subsequence, we have that
\[ \sigma_n(x) \to \sigma(x) \quad \text{uniformly in } D \text{ and in } C^{1}_{loc}(D). \tag{73} \]
Since $s_n = 0$ on ∂D, we have that $u_n$ and $v_n$ coincide on ∂D, so that (71) still holds if we replace $u_n$ with $v_n$. In particular,
\[ 0 \le v_n - \min_{\partial D}v_n \le C_2 \quad \text{on } \partial D. \tag{74} \]
Define $h_n$ as the unique solution of the Dirichlet problem:
\[ \begin{cases} -\Delta h_n = 0 & \text{in } D, \\ h_n = v_n - \min_{\partial D}v_n & \text{on } \partial D. \end{cases} \tag{75} \]
By (74), we have that
\[ \|h_n\|_{\infty} \le C, \quad \text{for suitable } C > 0, \tag{76} \]
and, along a subsequence, we may assume that $h_n \to h$ in $C^{1}_{loc}(D)$. Furthermore, the function
\[ w_n = v_n - \min_{\partial D}u_n - h_n, \tag{77} \]
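satisfies the Dirichlet problem:
\[ \begin{cases} -\Delta w_n = W_n\,e^{w_n} & \text{in } D, \\ \int_{D}W_n(x)\,e^{w_n}\,dx \le C, \\ w_n = 0 & \text{on } \partial D, \end{cases} \tag{78} \]
with
\[ W_n(x) = K_n(x)\,|x|^{2\alpha_n}e^{\gamma_n(x)}, \quad \text{and} \quad \gamma_n(x) = \sigma_n(x) + h_n(x) + \min_{\partial D}u_n. \tag{79} \]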
We have
\[ \nabla\gamma_n \to \nabla\gamma \ \text{ and } \ \nabla K_n \to \nabla K, \quad \text{uniformly on compact sets of } D, \tag{80} \]
with γ = σ + h. Since $W_n(x)\,e^{w_n} = K_n(x)\,e^{u_n(x)}$, by Theorem 4, along a subsequence, we find
\[ W_n e^{w_n} \to 0 \quad \text{uniformly on compact sets of } D\setminus\{0\}, \tag{81} \]
\[ W_n e^{w_n} \to \beta\,\delta_{p=0} \quad \text{weakly in the sense of measures in } D. \tag{82} \]
Again notice that (82) can be stated equivalently as follows:
\[ |x|^{2\alpha_n}e^{\gamma_n}e^{w_n} \to \frac{\beta}{K(0)}\,\delta_{p=0}, \quad \text{weakly in the sense of measures in } D. \tag{83} \]
Set
\[ f_n(x) := W_n(x)\,e^{w_n(x)}; \tag{84} \]
by Green's representation formula, write
\[ w_n(x) = \frac{1}{2\pi}\int_{D}\ln\frac{1}{|x-y|}\,f_n(y)\,dy + \int_{D}R(x,y)\,f_n(y)\,dy, \tag{85} \]
where R(x, y) is the regular part of the Green's function associated to the Laplacian operator with respect to Dirichlet boundary conditions on D. Passing to the limit in (85) we obtain
\[ w_n(x) \to \frac{\beta}{2\pi}\ln\frac{1}{|x|} + \beta R(x,0), \quad \text{in } C^{1}_{loc}(D\setminus\{0\}). \tag{86} \]
Set $g(x) = \beta R(x,0) \in C^{1}(D)$, and let
\[ w_0(x) = \frac{\beta}{2\pi}\ln\frac{1}{|x|} + g(x). \tag{87} \]
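At this point we can argue as in the proof of Theorem 4 and consider the following Pohozaev type identity:
\[ \int_{\partial D_r}\left((x,\nu)\frac{|\nabla w_n|^2}{2} - (\nu,\nabla w_n)(x,\nabla w_n)\right)d\sigma \tag{88} \]
\[ = \int_{\partial D_r}(x,\nu)\,W_n e^{w_n}\,d\sigma - \int_{D_r}\big(2W_n(x) + x\cdot\nabla W_n(x)\big)e^{w_n}\,dx, \quad r \in (0,1). \tag{89} \]
Letting n → ∞ in (88), and using (81), (82), (86) and (83), at the limit we find the identity:
\[ \frac r2\int_{\partial D_r}|\nabla w_0|^2\,d\sigma - r\int_{\partial D_r}(\nu,\nabla w_0)^2\,d\sigma = -2\beta(1+\alpha). \tag{90} \]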
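Inserting (87), we obtain
\[ \begin{aligned}
\frac r2\int_{\partial D_r}|\nabla w_0|^2\,d\sigma &= \frac r2\int_{\partial D_r}\left(\frac{\beta}{2\pi}\right)^{2}\frac{d\sigma}{|x|^2} - 2\,\frac r2\int_{\partial D_r}\frac{\beta}{2\pi}\frac{x}{|x|^2}\cdot\nabla g\,d\sigma + \frac r2\int_{\partial D_r}|\nabla g|^2\,d\sigma \\
&= \frac{\beta^2}{4\pi} + o(1), \quad \text{as } r \to 0,
\end{aligned} \tag{91} \]
\[ \begin{aligned}
r\int_{\partial D_r}(\nu,\nabla w_0)^2\,d\sigma &= r\int_{\partial D_r}\left(\frac{\beta}{2\pi}\right)^{2}\frac{d\sigma}{|x|^2} - 2r\int_{\partial D_r}\frac{\beta}{2\pi}\frac{x\cdot\nabla g}{|x|^2}\,d\sigma + r\int_{\partial D_r}\frac{1}{|x|^2}(x,\nabla g)^2\,d\sigma \\
&= \frac{\beta^2}{2\pi} + o(1), \quad \text{as } r \to 0.
\end{aligned} \tag{92} \]
Consequently, passing to the limit as r → 0 in (90), by (91) and (92), we derive the identity:
\[ -\frac{\beta^2}{4\pi} = -2\beta(1+\alpha), \]
that gives β = 8π(1 + α), as claimed.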
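The limit used in (90)–(92) can be checked numerically. The following is only an illustrative sketch: it evaluates the boundary expression on the left of (90) for $w_0(x) = \frac{\beta}{2\pi}\ln\frac{1}{|x|} + g(x)$ by a midpoint rule on the circle $|x| = r$. The smooth function g chosen here, the helper names `pohozaev_boundary` and `sample_grad_g`, and the value α = 0.5 are assumptions made for the example, not part of the paper.

```python
import math

def pohozaev_boundary(beta, r, grad_g, n=4000):
    """Approximate (r/2) * int |grad w0|^2 - r * int (nu . grad w0)^2 over |x| = r,
    where grad w0 = -(beta / 2 pi) x / |x|^2 + grad g, using the midpoint rule."""
    total = 0.0
    dtheta = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * dtheta
        nx, ny = math.cos(t), math.sin(t)          # outward unit normal nu = x / |x|
        gx, gy = grad_g(r * nx, r * ny)
        wx = -beta / (2.0 * math.pi) * nx / r + gx
        wy = -beta / (2.0 * math.pi) * ny / r + gy
        normal = nx * wx + ny * wy
        # arc-length element on the circle is r * dtheta
        total += (0.5 * (wx * wx + wy * wy) - normal * normal) * r * dtheta
    return r * total

def sample_grad_g(x, y):
    # gradient of the hypothetical smooth function g(x, y) = 0.3 x + 0.1 x y
    return 0.3 + 0.1 * y, 0.1 * x

alpha = 0.5                                   # illustrative value only
beta = 8.0 * math.pi * (1.0 + alpha)          # the quantized mass (72)
for r in (1e-1, 1e-2, 1e-3):
    print(r, pohozaev_boundary(beta, r, sample_grad_g), -beta ** 2 / (4.0 * math.pi))
```

As r decreases, the computed value approaches $-\beta^2/(4\pi)$, which for $\beta = 8\pi(1+\alpha)$ coincides with $-2\beta(1+\alpha)$, the right-hand side of (90).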
4. A Concentration-Compactness Result

We are going to apply the results established above in order to derive the concentration-compactness principle stated in the Introduction. For this purpose, let (M, g) be a compact two dimensional Riemannian manifold without boundary and $Z = \{p_1, \dots, p_m\}$ be a finite set of points in M. For given $\alpha_j > 0$, j = 1, ..., m and λ > 0, we rewrite the mean field problem (1)λ-(4) as follows:
\[ \begin{cases} -\Delta_g u = \lambda\left(\dfrac{V(x)\,e^{u}}{\int_M V(x)\,e^{u}\,d\tau_g} - \dfrac{1}{|M|}\right) - 4\pi\left(\sum_{j=1}^{m}\alpha_j\delta_{p_j} - \varphi(x)\right) & \text{in } M, \\ \int_M u\,d\tau_g = 0, \end{cases} \tag{93} \]
and take the functions
\[ V(x) = e^{\sigma(x)}, \quad \text{with } \sigma \in C^{1}(M), \tag{94} \]
\[ \varphi \in L^{s}(M), \quad \text{with } s > 2. \tag{95} \]
We refer to problem (93) as problem (93)λ. For the solvability of (93)λ, we need to satisfy the necessary condition
\[ \int_M\varphi\,d\tau_g = \sum_{j=1}^{m}\alpha_j. \tag{96} \]
Furthermore, without loss of generality we may also assume that
\[ \int_M\sigma\,d\tau_g = 0. \tag{97} \]
Conditions (96) and (97) will be assumed throughout this section. Concerning problem (93)λ, we shall analyze the behaviour of a blow up sequence $u_n$ which satisfies:
\[ \begin{cases} -\Delta_g u_n = \lambda_n\left(\dfrac{V_n(x)\,e^{u_n}}{\int_M V_n(x)\,e^{u_n}\,d\tau_g} - \dfrac{1}{|M|}\right) - 4\pi\left(\sum_{j=1}^{m}\alpha_{j,n}\delta_{p_j} - \varphi_n(x)\right) & \text{in } M, \\ \int_M u_n\,d\tau_g = 0, \end{cases} \tag{98} \]
with
\[ \lambda_n \to \lambda, \quad \text{and} \quad \alpha_{j,n} \to \alpha_j > 0, \ j \in \{1, \dots, m\}, \tag{99} \]
\[ V_n(x) := e^{\sigma_n(x)} \quad \text{and} \quad \sigma_n(x) \to \sigma(x) \ \text{in } C^{1}(M), \tag{100} \]
\[ \|\varphi_n\|_{L^{s}} \le C_0, \quad \text{for suitable } C_0 > 0 \text{ and } s > 2. \tag{101} \]
Theorem 7. Assume (98), (99), (100), (101) and suppose that
\[ \max_M u_n(x) - \ln\int_M V_n(x)\,e^{u_n}\,d\tau_g \to +\infty, \quad \text{as } n \to \infty \quad \text{(blow up)}. \tag{102} \]
Then there exists a non-empty finite set $S = \{x_1, \dots, x_l\} \subset M$ such that, along a subsequence, we have
\[ \lambda_n\,\frac{V_n(x)\,e^{u_n}}{\int_M V_n(x)\,e^{u_n}\,d\tau_g} \to \sum_{i=1}^{l}\beta_i\,\delta_{x_i}, \quad \text{weakly in the sense of measures on } M, \]
where $\beta_i = 8\pi$ if $x_i \notin \{p_1, \dots, p_m\}$ and $\beta_i = 8\pi(1+\alpha_j)$ if $x_i = p_j$ for some $i \in \{1, \dots, l\}$ and $j \in \{1, \dots, m\}$. In particular,
\[ \lambda = 8\pi n + 8\pi\sum_{j\in J}(1+\alpha_j), \]
for some $n \in \mathbb{N}\cup\{0\}$ and $J \subset \{1, \dots, m\}$, with n + |J| > 0. Here |J| denotes the cardinality of the finite (possibly empty) set J.

Theorem 7 may be considered as an extension to the singular problem (98) of a result obtained by Y.Y. Li in [34] for the "regular" mean field equations, where no Dirac measures are included in the equation. Set
\[ \Gamma = \Big\{8\pi n + 8\pi\sum_{j\in J}(1+\alpha_j), \ n \in \mathbb{N}\cup\{0\} \text{ and } J \subset \{1, \dots, m\}\Big\}; \]
as an immediate consequence of Theorem 7, we derive:

Corollary 3. For every compact set $E \subset \mathbb{R}^{+}\setminus\Gamma$, and every λ ∈ E, all solutions of (93)λ are uniformly bounded from above in M.

Let $u_0$ be the unique solution for the following problem:
\[ \begin{cases} \Delta_g u_0 = 4\pi\left(\sum_{j=1}^{m}\alpha_j\delta_{p_j} - \varphi(x)\right) & \text{in } M, \\ \int_M u_0\,d\tau_g = 0, \end{cases} \tag{103} \]
see [6]. Note that, in view of (96), problem (103) is well posed. Moreover, since $\varphi \in L^{s}(M)$ with s > 2, the function $\sigma_0(x) = u_0(x) - \sum_{j=1}^{m}2\alpha_j\ln d_g(x,p_j)$ belongs to $C^{1,\gamma}(M)$, for suitable γ ∈ (0, 1). Here $d_g$ denotes the euclidean distance on (M, g). By means of $u_0$ we can reformulate (93)λ in terms of the (smooth) function
\[ w(x) = u(x) - u_0(x), \quad x \in M, \tag{104} \]
(the regular part of u), which satisfies
\[ \begin{cases} -\Delta_g w = \lambda\left(\dfrac{V_0(x)\,e^{w}}{\int_M V_0(x)\,e^{w}\,d\tau_g} - \dfrac{1}{|M|}\right) & \text{in } M, \\ \int_M w\,d\tau_g = 0, \end{cases} \tag{105} \]
with
\[ V_0(x) = e^{\sigma(x)+u_0(x)} = \Big(\prod_{j=1}^{m}d_g(x,p_j)^{2\alpha_j}\Big)e^{\sigma_1(x)}, \tag{106} \]
and $\sigma_1(x) = \sigma(x) + \sigma_0(x)$, $\sigma_1 \in C^{1}(M)$. We refer to problem (105) as problem (105)λ. In terms of problem (105)λ we have:

Corollary 4. For every compact set $E \subset \mathbb{R}^{+}\setminus\Gamma$, all solutions of (105)λ, with λ ∈ E and $V_0$ satisfying (106), are uniformly bounded in $C^{2,\delta}(M)$, for suitable δ ∈ (0, 1).

Proof. By Theorem 7 we have that the right-hand side of (105)λ is bounded uniformly in $L^{\infty}(M)$-norm, ∀ λ ∈ E. Thus the desired conclusion follows by a bootstrap argument and standard elliptic estimates, see [6].

Clearly, taking into account the decomposition (104), Theorem 7 together with Corollary 4 immediately yields the concentration-compactness result stated in the Introduction. More precisely we have:

Corollary 5 (Concentration/Compactness). Let $u_n$ satisfy (98) and assume (99), (100) and (101). Then $u_n$ admits a subsequence (still denoted in the same way) which satisfies the following alternative: either $u_n$ is uniformly bounded from above in M and its regular part converges uniformly in $C^{2}(M)$, or (102) holds together with the conclusion of Theorem 7.

Proof of Theorem 7. Set
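\[ v_n(x) = u_n(x) - \ln\int_M V_n(x)\,e^{u_n}\,d\tau_g, \quad x \in M, \tag{107} \]
which satisfies
\[ \begin{cases} -\Delta_g v_n = \lambda_n\left(V_n(x)\,e^{v_n} - \dfrac{1}{|M|}\right) - 4\pi\left(\sum_{j=1}^{m}\alpha_{j,n}\delta_{p_j} - \varphi_n\right) & \text{in } M, \\ \int_M V_n(x)\,e^{v_n}\,d\tau_g = 1. \end{cases} \tag{108} \]
By (102),
\[ \max_M v_n \to +\infty. \tag{109} \]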
In a small neighborhood U(p) of a given point p ∈ M, define an isothermal coordinate system $y = (y_1, y_2)$ centered at p, so that p corresponds to y = 0, and $ds^2 = e^{\phi}\,(dy_1^2 + dy_2^2)$ in $B_{2r}(0) = \{y_1^2 + y_2^2 \le 2r\}$, where $\phi$ is smooth and $\phi(p) = 0$. Recalling that $Z = \{p_1, \dots, p_m\}$, choose such a neighborhood small enough so that if $p \notin Z$, then $U(p) \cap \{p_1, \dots, p_m\} = \emptyset$, while, if $p = p_j$ for some $j \in \{1, \dots, m\}$, then $U(p_j) \cap Z = \{p_j\}$. Consequently, with respect to the isothermal coordinates, $v_n$ satisfies:
\[ \begin{cases} -\Delta v_n = \lambda_n e^{\phi(y)}V_n(y)\,e^{v_n} - e^{\phi(y)}\left(\dfrac{\lambda_n}{|M|} - 4\pi\varphi_n(y)\right) - 4\pi\alpha_n\delta_{y=0} & \text{in } B_r(0), \\ \int_{B_r(0)}e^{\phi(y)}V_n(y)\,e^{v_n}\,dy \le 1, \end{cases} \tag{110} \]
where $\Delta = \partial_{y_1y_1} + \partial_{y_2y_2}$ is the usual Laplacian and α = 0 if $p \notin Z$, or $\alpha = \alpha_j$ if $p = p_j$ for some $j \in \{1, \dots, m\}$ (with $\alpha_n = 0$, respectively $\alpha_n = \alpha_{j,n}$). In view of (99), (100) and (101), we easily check that (110) satisfies all the assumptions of Theorem 5, with $K_n = \lambda_n e^{\phi}V_n$ and $\psi_n = e^{\phi}\left(\frac{\lambda_n}{|M|} - 4\pi\varphi_n\right)$. Thus, we may conclude that, along a subsequence, one of the following alternatives holds:
(i) $v_n - 2\alpha\ln|y|$ is uniformly bounded on compact sets of $B_r(0)$.
(ii) $v_n - 2\alpha\ln|y| \to -\infty$ uniformly on compact sets of $B_r(0)$.
(iii) There exists a finite set $S = \{y_1, \dots, y_s\} \subset B_r(0)$ such that $v_n - 2\alpha\ln|y| \to -\infty$ uniformly on compact sets of $B_r(0)\setminus S$, and the sequence of measures
\[ \lambda_n V_n e^{v_n}e^{\phi(y)}\,dy \to \sum_{j=1}^{s}\beta_j\,\delta_{y_j}, \quad \text{weakly in the sense of measures in } B_r(0), \tag{111} \]
and $\beta_j \ge 8\pi$ for $j \in \{1, \dots, s\}$.
In view of (109), in fact, only alternatives (ii) or (iii) are possible. Since M is compact and connected, we can patch up such "local" information and conclude that there exists a non-empty finite set $S = \{x_1, \dots, x_l\} \subset M$ such that, along a subsequence, we have
\[ \lambda_n V_n e^{v_n}\,d\tau_g \to \sum_{j=1}^{l}\beta_j\,\delta_{x_j}, \quad \text{weakly in the sense of measures in } M, \text{ with } \beta_j \ge 8\pi. \tag{112} \]
In order to characterize precisely the values $\beta_j$ in (112), denote by G(x, y) the Green's function associated to M, as given by the unique solution for the problem
\[ \begin{cases} -\Delta_g G = \delta_{x=y} - \dfrac{1}{|M|} & \text{in } M, \\ \int_M G(x,y)\,d\tau_g = 0, \end{cases} \tag{113} \]
see [6]. In view of (108) and (112), by Green's representation theorem, we have that
\[ v_n - \frac{1}{|M|}\int_M v_n\,d\tau_g \to \sum_{j=1}^{l}\beta_j\,G(x,x_j) + u_0, \]
uniformly on compact sets of $M\setminus S$, with $u_0$ as uniquely defined by (103). Hence, on any compact set of $M\setminus(S\cup Z)$, the sequence $v_n - \frac{1}{|M|}\int_M v_n\,d\tau_g$ admits uniformly bounded mean oscillation. Consequently, for every open set Ω with $\overline{\Omega} \subset M\setminus(S\cup Z)$, there exists a constant C > 0 such that
\[ \max_{\Omega}v_n - \min_{\Omega}v_n \le C. \tag{114} \]
For $x_i \in S$ and r > 0, set
\[ U_i = \{x \in M : d_g(x,x_i) < r\}. \]
Take r > 0 sufficiently small so that $U_i \cap U_j = \emptyset$ for $i \ne j$ and $\partial U_i \cap (S\cup Z) = \emptyset$. Note that for n large, $\sup_{U_i}v_n$ is attained at an interior point $x_{i,n} \in U_i$. Suppose first that $x_i \notin Z$. In this situation, we may further assume that $U_i \cap Z = \emptyset$. Hence, in an isothermal coordinate system centered at $x_{i,n}$ the equation in (110) holds with α = 0. So, using (114) with $\Omega = U_i$, we can apply Y.Y. Li's local result (see Theorem 0.3 in [34] or Theorem 3 in Sect. 2) to conclude that
\[ \lambda_n\int_{U_i}V_n e^{v_n}\,d\tau_g \to 8\pi, \quad \text{as } n \to \infty, \]
and so $\beta_i = 8\pi$ in this case. If $x_i \in Z$, hence $x_i = p_j$ for some $j \in \{1, \dots, m\}$, then in terms of an isothermal coordinate system centered at $p_j$, we see that $v_n$ satisfies the equation in (110) with $\alpha = \alpha_j$. Consequently, by means of (114), we easily check that all the assumptions of Theorem 6 are satisfied and derive that
\[ \lambda_n\int_{U_i}V_n e^{v_n}\,d\tau_g \to 8\pi(1+\alpha_j), \quad \text{as } n \to \infty, \]
so $\beta_i = 8\pi(1+\alpha_j)$ in this case. In conclusion,
\[ \lambda = \lim_{n\to+\infty}\lambda_n = \lim_{n\to+\infty}\lambda_n\int_M V_n e^{v_n}\,d\tau_g = \lim_{n\to+\infty}\lambda_n\sum_{i=1}^{l}\int_{U_i}V_n e^{v_n}\,d\tau_g \in \Gamma, \]
and Theorem 7 is established.
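To make the set Γ of possible blow up values concrete, the following sketch enumerates its elements below a given bound for a sample list of weights. The function name `critical_values`, the sample values of $\alpha_j$ and the rounding used to merge coincident values are assumptions made for the illustration only.

```python
import math
from itertools import combinations

def critical_values(alphas, upper):
    """Sorted elements of Gamma in (0, upper), where
    Gamma = { 8 pi n + 8 pi * sum_{j in J} (1 + alpha_j) } with n >= 0 an integer,
    J a subset of the indices of `alphas`, and n + |J| > 0."""
    values = set()
    m = len(alphas)
    for size in range(m + 1):
        for J in combinations(range(m), size):
            base = 8.0 * math.pi * sum(1.0 + alphas[j] for j in J)
            n = 0 if J else 1                      # enforce n + |J| > 0
            while base + 8.0 * math.pi * n < upper:
                # rounding merges values that coincide up to floating-point error
                values.add(round(base + 8.0 * math.pi * n, 12))
                n += 1
    return sorted(values)

# Hypothetical weights alpha_1 = 0.5, alpha_2 = 2.0, restricted to (0, 16 pi):
print([v / math.pi for v in critical_values([0.5, 2.0], 16.0 * math.pi)])
```

For these sample weights the only elements of Γ in (0, 16π) are 8π and 8π(1 + 0.5) = 12π, that is, 8π together with those $8\pi(1+\alpha_j)$ for which $\alpha_j < 1$.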
5. An Existence Result

Motivated by the study of Electroweak multivortices, in this section we are going to analyze the following elliptic system:
\[ \begin{cases}
-\Delta_g u_1 = \lambda\left(\dfrac{V e^{u_1}}{\int_M V e^{u_1}\,d\tau_g} - \dfrac{1}{|M|}\right) + \mu\left(\dfrac{e^{u_2}}{\int_M e^{u_2}\,d\tau_g} - \dfrac{1}{|M|}\right) - 4\pi\left(\sum_{j=1}^{m}\alpha_j\delta_{p_j} - \varphi\right) & \text{on } M, \\
\Delta_g u_2 = \dfrac{\lambda}{2}\left(\dfrac{V e^{u_1}}{\int_M V e^{u_1}\,d\tau_g} - \dfrac{1}{|M|}\right) + \dfrac{\mu}{2\cos^2\theta}\left(\dfrac{e^{u_2}}{\int_M e^{u_2}\,d\tau_g} - \dfrac{1}{|M|}\right) & \text{on } M, \\
\int_M u_1\,d\tau_g = 0, \quad \int_M u_2\,d\tau_g = 0,
\end{cases} \tag{115} \]
with µ ≥ 0, λ > 0, the angle $\theta \in (0, \frac{\pi}{2})$, V and $\varphi$ satisfying (94)–(97). We denote problem (115) by (115)λ. We will show that problem (115)λ admits a variational structure that reduces the search for its solutions to finding critical points of an appropriate action functional (see (117) below). It turns out that, in view of the Moser–Trudinger inequality (cf. Fontana [25] and (127) below), for λ ∈ (0, 8π) such a functional is coercive and bounded from below. Hence, in this case, solutions to (115)λ will be easily derived by minimization. More interestingly, we will be able to establish an existence result for (115)λ also in case λ ∈ (8π, 16π). In this range of parameters the relative action functional is no longer bounded from below and, in general, also lacks the compactness property required for the validity of the Palais–Smale condition. Thus, to use techniques from critical point theory, we need to understand the loss of compactness for the solution set of (115)λ in terms of the parameter λ. For this purpose note that any solution $u_2$ to the second equation in (115)λ is uniformly bounded from above independently of $u_1$. Thus, we may treat the first equation in (115)λ as a problem of type (93)λ and use the concentration-compactness principle established above to control the possible lack of compactness for its solutions. In this way, we are able to adapt for (115)λ a min-max construction introduced by Ding–Jost–Li–Wang in [21] and derive the following result:

Theorem 8. Let M be a compact Riemannian manifold with positive genus. For given $\alpha_j \in \mathbb{R}^{+}$ and $p_j \in M$, j = 1, ..., m, assume (94), (95), (96) and (97). Then for any µ ≥ 0 and $\lambda \in (0, 16\pi)\setminus\{8\pi,\ 8\pi(1+\alpha_j);\ j = 1, \dots, m\}$, problem (115) admits a solution.

In particular note that by taking µ = 0, from Theorem 8 we also obtain the following extension of the Ding–Jost–Li–Wang result [21] for the singular problem (93)λ.

Corollary 6. Let M be a compact Riemannian manifold with positive genus. For given $\alpha_j \in \mathbb{R}^{+}$ and $p_j \in M$, j = 1, ..., m, assume (94), (95), (96) and (97). Then, for every $\lambda \in (0, 16\pi)\setminus\{8\pi,\ 8\pi(1+\alpha_j);\ j = 1, \dots, m\}$, problem (93)λ admits a solution.
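It will be clear from the proof that, when λ ∈ (0, 8π), Theorem 8 and Corollary 6 hold without the assumption that M has positive genus. To establish Theorem 8, we reformulate (115)λ as a "regular" problem, and by means of the function $u_0$ defined in (103), we set
\[ u_1 = u_0 + w_1, \quad u_2 = w_2. \]
So, in terms of the unknown $(w_1, w_2)$, problem (115)λ becomes:
\[ \begin{cases}
-\Delta_g w_1 = \lambda\left(\dfrac{V_0 e^{w_1}}{\int_M V_0 e^{w_1}\,d\tau_g} - \dfrac{1}{|M|}\right) + \mu\left(\dfrac{e^{w_2}}{\int_M e^{w_2}\,d\tau_g} - \dfrac{1}{|M|}\right) & \text{on } M, \\
\Delta_g w_2 = \dfrac{\lambda}{2}\left(\dfrac{V_0 e^{w_1}}{\int_M V_0 e^{w_1}\,d\tau_g} - \dfrac{1}{|M|}\right) + \dfrac{\mu}{2\cos^2\theta}\left(\dfrac{e^{w_2}}{\int_M e^{w_2}\,d\tau_g} - \dfrac{1}{|M|}\right) & \text{on } M, \\
\int_M w_1\,d\tau_g = 0, \quad \int_M w_2\,d\tau_g = 0,
\end{cases} \tag{116} \]
with $V_0$ as defined in (106). Note in particular that $V_0 \in L^{\infty}(M)$. We start by obtaining a variational formulation for (116) on the Hilbert space E × E, where
\[ E = \Big\{w \in H^{1}(M) : \int_M w\,d\tau_g = 0\Big\} \]
is equipped with the standard scalar product
\[ \langle\phi,\psi\rangle = \int_M\nabla\phi\cdot\nabla\psi\,d\tau_g. \]
Indeed, for every $(w_1, w_2) \in E\times E$ define the functional:
\[ I(w_1,w_2) = \frac{\tan^2\theta}{2}\left[\frac12\int_M|\nabla w_1|^2\,d\tau_g - \lambda\ln\Big(\frac{1}{|M|}\int_M V_0 e^{w_1}\,d\tau_g\Big)\right] + \int_M\Big|\nabla\Big(\frac{w_1}{2} + w_2\Big)\Big|^2\,d\tau_g + \mu\tan^2\theta\,\ln\Big(\frac{1}{|M|}\int_M e^{w_2}\,d\tau_g\Big). \tag{117} \]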
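It is easy to check that $I \in C^{1}(E\times E)$ and any critical point $(w_1, w_2) \in E\times E$ for I satisfies:
\[ \begin{aligned}
\frac{\partial I}{\partial w_1}(w_1,w_2)\,\phi &= \frac{\tan^2\theta}{2}\int_M\left(\nabla w_1\cdot\nabla\phi - \lambda\,\frac{V_0 e^{w_1}}{\int_M V_0 e^{w_1}\,d\tau_g}\,\phi\right)d\tau_g + \int_M\nabla\Big(\frac{w_1}{2} + w_2\Big)\cdot\nabla\phi\,d\tau_g = 0, \\
\frac{\partial I}{\partial w_2}(w_1,w_2)\,\phi &= 2\int_M\nabla\Big(\frac{w_1}{2} + w_2\Big)\cdot\nabla\phi\,d\tau_g + \mu\tan^2\theta\int_M\frac{e^{w_2}}{\int_M e^{w_2}\,d\tau_g}\,\phi\,d\tau_g = 0,
\end{aligned} \tag{118} \]
for every $\phi \in E$. In other words, critical points of I define (weak) solutions for the system of equations
\[ \begin{cases}
-\Delta_g\Big(\dfrac{w_1}{2\cos^2\theta} + w_2\Big) = \dfrac{\lambda}{2}\tan^2\theta\left(\dfrac{V_0 e^{w_1}}{\int_M V_0 e^{w_1}\,d\tau_g} - \dfrac{1}{|M|}\right) & \text{on } M, \\
\Delta_g\Big(\dfrac{w_1}{2} + w_2\Big) = \dfrac{\mu}{2}\tan^2\theta\left(\dfrac{e^{w_2}}{\int_M e^{w_2}\,d\tau_g} - \dfrac{1}{|M|}\right) & \text{on } M, \\
\int_M w_1 = 0, \quad \int_M w_2 = 0,
\end{cases} \tag{119} \]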
from which we immediately obtain (116). Therefore, solutions to (116) can be derived as critical points of I in E × E. For this purpose, from now on we fix µ ≥ 0, and in the sequel we no longer emphasize the dependence on µ. Next we note that for any given $w_1 \in E$, there exists a unique $w_2 \in E$ (depending on $w_1$) which satisfies weakly the second equation in (119). A simple application of the Implicit Function Theorem also shows that the dependence of $w_2$ on $w_1$ is of class $C^{1}$. More precisely:

Lemma 4. There exists a $C^{1}$-map γ : E → E such that
\[ \frac{\partial I}{\partial w_2}(w,z) = 0 \ \text{ in } E^{*} \quad \text{if and only if} \quad z = \gamma(w); \tag{120} \]
here $E^{*}$ denotes the dual space of E.

Proof. We start with the following:

Claim 1. For every $w_1 \in E$, there exists a unique $w_2 \in E$ satisfying
\[ \frac{\partial I}{\partial w_2}(w_1,w_2) = 0 \quad \text{in } E^{*}. \tag{121} \]
To obtain Claim 1, fix $w_1 \in E$ and observe that the functional $I_0(w) := I(w_1, w)$ is coercive, weakly lower semicontinuous and bounded below on E. The corresponding minimizer $w_2 \in E$ gives the desired unique solution for (121), as we show that there are no other critical points for $I_0$ in E. In fact, suppose that z ∈ E is another critical point for $I_0$, so that it satisfies $\frac{\partial I}{\partial w_2}(w_1,z) = 0$. Set $\psi = z - w_2$. The function $f \in C^{2}(\mathbb{R},\mathbb{R})$ defined by $f(t) = I(w_1, w_2 + t\psi)$ satisfies
\[ f'(0) = f'(1) = 0, \tag{122} \]
\[ f''(t) = 2\int_M|\nabla\psi|^2\,d\tau_g + \mu\tan^2\theta\left[\int_M\frac{e^{w_2+t\psi}}{\int_M e^{w_2+t\psi}\,d\tau_g}\,\psi^2\,d\tau_g - \left(\int_M\frac{e^{w_2+t\psi}}{\int_M e^{w_2+t\psi}\,d\tau_g}\,\psi\,d\tau_g\right)^{2}\right] \ge 0, \tag{123} \]
∀ t ∈ ℝ, since by Jensen's inequality,
\[ \int_M\frac{e^{w}}{\int_M e^{w}\,d\tau_g}\,\psi^2\,d\tau_g - \left(\int_M\frac{e^{w}}{\int_M e^{w}\,d\tau_g}\,\psi\,d\tau_g\right)^{2} \ge 0, \quad \forall\,w,\psi \in E. \tag{124} \]
Hence necessarily ψ = 0, and so $z = w_2$. Thus the map γ : E → E, which associates to every $w_1 \in E$ the unique $w_2$ satisfying (121), is well defined. In other words, (120) holds. To show that $\gamma \in C^{1}(E)$ we apply the Implicit Function Theorem (see [38]) to the function $F : E\times E \to E^{*}$,
\[ F(w_1,w_2) = \frac{\partial I}{\partial w_2}(w_1,w_2). \]
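The inequality (124) says that the variance of ψ with respect to the probability density $e^{w}/\int_M e^{w}\,d\tau_g$ is non-negative. A discrete analogue can be sketched as follows; the uniform cell areas, the random sample data and the name `weighted_variance` are assumptions made for the illustration and do not come from the paper.

```python
import math
import random

def weighted_variance(w, psi, cell_areas):
    """Discrete analogue of the left-hand side of (124):
    sum(q psi^2) / sum(q) - (sum(q psi) / sum(q))^2, with q = area * exp(w)."""
    weights = [a * math.exp(wi) for wi, a in zip(w, cell_areas)]
    mass = sum(weights)
    mean = sum(q * p for q, p in zip(weights, psi)) / mass
    second = sum(q * p * p for q, p in zip(weights, psi)) / mass
    return second - mean * mean

random.seed(0)
cells = 200
areas = [1.0 / cells] * cells                       # hypothetical uniform partition of M
w = [random.uniform(-2.0, 2.0) for _ in range(cells)]
psi = [random.uniform(-1.0, 1.0) for _ in range(cells)]
print(weighted_variance(w, psi, areas))             # positive for non-constant psi
print(weighted_variance(w, [0.7] * cells, areas))   # zero up to rounding for constant psi
```

Equality holds only for constant ψ, and the only constant in E is zero. In the proof above, however, ψ = 0 is obtained from the gradient term in (123): by (122) and (123) the function f'' vanishes identically on [0, 1], which forces $\int_M|\nabla\psi|^2\,d\tau_g = 0$.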
Claim 2. $F \in C^{1}(E\times E, E^{*})$ and for every $(w_1, w_2) \in E\times E$, $\frac{\partial F}{\partial w_2}(w_1,w_2)$ defines an isomorphism from E onto $E^{*}$.

From (118) it is straightforward to check that F is Fréchet differentiable and for every $(w_1, w_2), (\psi_1, \psi_2) \in E\times E$, we have:
\[ F'(w_1,w_2)(\psi_1,\psi_2) = \frac{\partial F}{\partial w_1}(w_1,w_2)\,\psi_1 + \frac{\partial F}{\partial w_2}(w_1,w_2)\,\psi_2 \in E^{*}, \]
with
\[ \left[\frac{\partial F}{\partial w_1}(w_1,w_2)\,\psi_1\right]\phi = \int_M\nabla\psi_1\cdot\nabla\phi\,d\tau_g, \]
\[ \left[\frac{\partial F}{\partial w_2}(w_1,w_2)\,\psi_2\right]\phi = 2\int_M\nabla\psi_2\cdot\nabla\phi\,d\tau_g + \mu\tan^2\theta\int_M\frac{e^{w_2}}{\int_M e^{w_2}\,d\tau_g}\,\phi\left(\psi_2 - \int_M\frac{e^{w_2}}{\int_M e^{w_2}\,d\tau_g}\,\psi_2\,d\tau_g\right)d\tau_g, \]
for every $\phi \in E$. Since the map $w \to e^{w}$ is continuous (in fact compact) from E into $L^{p}$, ∀ p ≥ 1 (see [6]), we immediately conclude that $F \in C^{1}(E\times E, E^{*})$. Moreover, if we identify in the canonical way E with its dual space $E^{*}$, then for $(w_1, w_2) \in E\times E$, $\frac{\partial F}{\partial w_2}(w_1,w_2)$ identifies with a continuous linear operator $A \in B(E,E)$ of the form
\[ A = 2I + K, \tag{125} \]
where K (depending on $w_2$ only) is a compact linear map on E. It remains to show that A defines an isomorphism onto E. In view of (125), by the Fredholm alternative, this is ensured as soon as we check that Ker A = {0}. For this purpose, let ψ ∈ Ker A; thus,
\[ 0 = \langle A\psi,\psi\rangle = \left[\frac{\partial F}{\partial w_2}(w_1,w_2)\,\psi\right]\psi = 2\int_M|\nabla\psi|^2\,d\tau_g + \mu\tan^2\theta\left[\int_M\frac{e^{w_2}}{\int_M e^{w_2}\,d\tau_g}\,\psi^2\,d\tau_g - \left(\int_M\frac{e^{w_2}}{\int_M e^{w_2}\,d\tau_g}\,\psi\,d\tau_g\right)^{2}\right], \]
and by (124) we immediately obtain ψ = 0. At this point the conclusion that $\gamma \in C^{1}(E)$ easily follows by Claim 1 and the Implicit Function Theorem applied around each pair $(w_1, \gamma(w_1))$.

For w ∈ E, define the functional:
\[ J_\lambda(w) = \frac{\tan^2\theta}{2}\left[\frac12\int_M|\nabla w|^2\,d\tau_g - \lambda\ln\Big(\frac{1}{|M|}\int_M V_0 e^{w}\,d\tau_g\Big)\right] + \int_M\Big|\nabla\Big(\frac{w}{2} + \gamma(w)\Big)\Big|^2\,d\tau_g + \mu\tan^2\theta\,\ln\Big(\frac{1}{|M|}\int_M e^{\gamma(w)}\,d\tau_g\Big). \]
Clearly $J_\lambda \in C^{1}(E)$, and in view of (120) we have that w defines a critical point for $J_\lambda$ if and only if the pair (w, γ(w)) is a critical point for I. So, to establish Theorem 8, it will suffice to find critical points for $J_\lambda$ in E.

Proof of Theorem 8. First of all notice that in view of (97)–(103), by Jensen's inequality we find
\[ \frac{1}{|M|}\int_M V_0 e^{w}\,d\tau_g = \frac{1}{|M|}\int_M e^{\sigma+u_0+w}\,d\tau_g \ge 1, \]
and similarly,
\[ \frac{1}{|M|}\int_M e^{\gamma(w)}\,d\tau_g \ge 1, \]
for any w ∈ E. Therefore, for w ∈ E, $J_\lambda(w)$ defines a non-increasing function of λ and
\[ J_\lambda(w) \ge \frac{\tan^2\theta}{2}\left[\frac12\int_M|\nabla w|^2\,d\tau_g - \lambda\ln\Big(\frac{1}{|M|}\int_M V_0 e^{w}\,d\tau_g\Big)\right]. \tag{126} \]
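The lower bound (126) turns into a coercivity estimate once it is combined with the Moser–Trudinger inequality (127) recalled just below. The following is a sketch of that step; it uses only that $V_0 \in L^{\infty}(M)$, and the constant $C_0$ is a shorthand introduced here for the illustration.

```latex
\begin{align*}
\ln\Big(\frac{1}{|M|}\int_M V_0 e^{w}\,d\tau_g\Big)
  &\le \ln\frac{\|V_0\|_{L^\infty(M)}\,C_M}{|M|} + \frac{1}{16\pi}\,\|\nabla w\|^2_{L^2(M)}
   =: C_0 + \frac{1}{16\pi}\,\|\nabla w\|^2_{L^2(M)}, \\
J_\lambda(w)
  &\ge \frac{\tan^2\theta}{2}\Big[\Big(\frac12 - \frac{\lambda}{16\pi}\Big)\|\nabla w\|^2_{L^2(M)} - \lambda\,C_0\Big]
   = \frac{\tan^2\theta}{32\pi}\,(8\pi - \lambda)\,\|\nabla w\|^2_{L^2(M)} - \frac{\lambda\tan^2\theta}{2}\,C_0 .
\end{align*}
```

The coefficient of $\|\nabla w\|^2_{L^2(M)}$ is positive exactly when λ < 8π, which is the range where minimization applies.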
Recall the Moser–Trudinger inequality (cf. [37, 25]):
\[ \int_M e^{w} \le C_M\exp\Big(\frac{1}{16\pi}\,\|\nabla w\|^2_{L^2(M)}\Big), \quad \forall\,w \in E, \tag{127} \]
with $C_M > 0$ a constant depending on M only. From (126) and (127) it follows immediately that for λ < 8π, the functional $J_\lambda$ is coercive and bounded from below in E. A critical point in this case is easily obtained by minimization. Indeed, for a minimizing sequence $\{w_n\} \subset E$, we find that
\[ \|\nabla w_n\|^2_{L^2(M)} + \Big\|\nabla\Big(\frac{w_n}{2} + \gamma(w_n)\Big)\Big\|^2_{L^2(M)} \le C, \quad \forall\,n \in \mathbb{N}, \]
for a suitable constant C > 0. In turn, this implies that $z_n := \gamma(w_n)$ satisfies:
\[ \begin{aligned}
&\text{(a)} \quad \|\nabla z_n\|^2_{L^2(M)} \le C, \quad \forall\,n \in \mathbb{N}, \\
&\text{(b)} \quad \int_M\nabla\Big(\frac{w_n}{2} + z_n\Big)\cdot\nabla\phi\,d\tau_g + \frac{\mu}{2}\tan^2\theta\int_M\frac{e^{z_n}}{\int_M e^{z_n}\,d\tau_g}\,\phi\,d\tau_g = 0, \quad \forall\,\phi \in E.
\end{aligned} \tag{128} \]
Therefore, we can find a subsequence of the pair $(w_n, z_n)$ (still denoted in the same way) such that
\[ \begin{aligned}
&(w_n, z_n) \to (w, z) \ \text{ weakly in } E \text{ and strongly in } L^{s}(M),\ s > 1, \\
&(e^{w_n}, e^{z_n}) \to (e^{w}, e^{z}) \ \text{ strongly in } L^{s}(M),\ s > 1,
\end{aligned} \tag{129} \]
for suitable (w, z) ∈ E × E. Hence we can pass to the limit in (128) and, via (120), derive that necessarily z = γ(w), and consequently that $J_\lambda$ attains its infimum at w. Therefore, the pair (w, γ(w)) defines a solution for (119). Note that for λ < 8π we do not need to require that M has positive genus, which is a necessary assumption to treat the remaining case,
\[ \lambda \in (8\pi, 16\pi)\setminus\{8\pi(1+\alpha_j),\ j = 1, \dots, m\}. \tag{130} \]
For this purpose, recall that $Z = \{p_1, \dots, p_m\}$ denotes the set of zeroes of the function $V_0 \in L^{\infty}(M)$. Let $X : M \to \mathbb{R}^{N}$ be the embedding map of M into $\mathbb{R}^{N}$, N ≥ 3, and $\Gamma_1 \subset M\setminus Z$ be a regular simple closed curve such that its image $\tilde{\Gamma}_1 = X(\Gamma_1)$ links with a closed curve $\Gamma_2 \subset \mathbb{R}^{N}\setminus X(M)$.
For any w ∈ E, define its center of mass:
\[ m(w) = \frac{\int_M X e^{w}\,d\tau_g}{\int_M e^{w}\,d\tau_g}. \]
Consider the family of continuous maps h : D → E such that, using polar coordinates in D (the open unit disk of $\mathbb{R}^2$), we have:
(i) $\lim_{r\to1^-}J_\lambda(h(r,\tau)) = -\infty$, τ ∈ [0, 2π],
(ii) $m(\tau) = \lim_{r\to1^-}m(h(r,\tau))$ defines a continuous map from $S^1$ into $\tilde{\Gamma}_1$ with non-zero degree.
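Denote by $D_\lambda$ the set of all such maps.

Claim 1. If λ > 8π then $D_\lambda$ is not empty.

To establish Claim 1, let δ > 0 be sufficiently small and for any $p \in M\setminus Z$, let $U_{2\delta}(p) \subset M$ be a ball around p with radius 2δ, where an isothermal coordinate system $y = (y_1, y_2)$ is defined. Suppose that p corresponds to y = 0 and $ds^2 = e^{\phi}\,(dy_1^2 + dy_2^2)$ with $\phi$ smooth and $\phi(p) = 0$. In $U_{2\delta}(p)$, consider the function $u_{\varepsilon,p}$ defined in such a way that, in terms of the isothermal coordinates, we have:
\[ u_{\varepsilon,p}(y) = \ln\frac{\varepsilon^2}{\big(\varepsilon^2 + \sigma\pi|y|^2\big)^{2}}, \]
where $\sigma = V_0(p) > 0$. Denote by $w_{\varepsilon,p} \in E$ the unique solution for the problem:
\[ \begin{cases} -\Delta_g w_{\varepsilon,p} = 8\pi\left(\dfrac{\chi\,e^{u_{\varepsilon,p}}}{\int_M\chi\,e^{u_{\varepsilon,p}}\,d\tau_g} - \dfrac{1}{|M|}\right) & \text{on } M, \\ \int_M w_{\varepsilon,p}\,d\tau_g = 0, \end{cases} \tag{131} \]
where χ denotes a standard cut-off function supported in $U_{2\delta}(p)$ with χ identically equal to 1 in $U_\delta(p)$. Clearly, $w_{\varepsilon,p} \in E$ depends continuously on ε > 0 and p ∈ M. Moreover, by means of the Green's representation formula, it is not difficult to show that, for $p \in M\setminus Z$ and ε → 0, we have:
\[ \|\nabla w_{\varepsilon,p}\|^2_{L^2(M)} = 32\pi\ln\frac{1}{\varepsilon} + O(1), \]
\[ \ln\Big(\frac{1}{|M|}\int_M V_0 e^{w_{\varepsilon,p}}\,d\tau_g\Big) = 2\ln\frac{1}{\varepsilon} + O(1), \]
\[ \frac{e^{w_{\varepsilon,p}}}{\int_M e^{w_{\varepsilon,p}}\,d\tau_g} \to \delta_p \quad \text{weakly in the sense of measures.} \]
For details, in case M is the flat two torus, see [49].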
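We notice also that
\[ \Big\|\nabla\Big(\frac{w_{\varepsilon,p}}{2} + \gamma(w_{\varepsilon,p})\Big)\Big\|^2_{L^2(M)} + \mu\tan^2\theta\,\ln\Big(\frac{1}{|M|}\int_M e^{\gamma(w_{\varepsilon,p})}\,d\tau_g\Big) \le C, \tag{132} \]
for a suitable constant C > 0 independent of ε > 0 and p ∈ M. Indeed, in view of (131), we find that $z = \gamma(w_{\varepsilon,p})$ satisfies:
\[ \begin{cases} \Delta_g z = 4\pi\,\dfrac{\chi\,e^{u_{\varepsilon,p}}}{\int_M\chi\,e^{u_{\varepsilon,p}}\,d\tau_g} + \mu\,\dfrac{\tan^2\theta}{2}\,\dfrac{e^{z}}{\int_M e^{z}\,d\tau_g} - \dfrac{1}{|M|}\Big(4\pi + \mu\,\dfrac{\tan^2\theta}{2}\Big) & \text{on } M, \\ \int_M z = 0. \end{cases} \tag{133} \]
Therefore, by means of Green's representation formula, we find a constant A > 0 such that
\[ \max_M z \le A, \tag{134} \]
which also implies $\dfrac{e^{z}}{\int_M e^{z}\,d\tau_g} \le \dfrac{e^{A}}{|M|}$. On the other hand,
\[ -\Delta_g\Big(\frac{w_{\varepsilon,p}}{2} + z\Big) = \frac{\mu}{2}\tan^2\theta\left(\frac{1}{|M|} - \frac{e^{z}}{\int_M e^{z}\,d\tau_g}\right), \]
from which we immediately derive $\big\|\nabla\big(\frac{w_{\varepsilon,p}}{2} + z\big)\big\|_{L^2(M)} \le R$, for a suitable R > 0 independent of ε and p, and (132) follows. Therefore, if we denote by p = p(τ), τ ∈ [0, 2π], a regular simple parametrization of $\Gamma_1$, in view of (134) and (132) we easily check that, for λ > 8π, the function $h(r,\tau) := w_{1-r,\,p(\tau)}$ belongs to $D_\lambda$. At this point we may follow [21], and define
\[ c_\lambda = \inf_{h\in D_\lambda}\ \sup_{w\in h(D)}J_\lambda(w), \tag{135} \]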
as a "good" candidate for a critical value of $J_\lambda$.

Claim 2. If λ ∈ (8π, 16π) then $c_\lambda > -\infty$.

The proof of Claim 2 relies in an essential way upon the following improved form of the Moser–Trudinger inequality (see [18]):

Lemma. Let $S_1$, $S_2$ be two subsets of M satisfying $\mathrm{dist}(S_1, S_2) \ge \delta_0 > 0$, and let $\gamma_0 \in (0, \frac12)$. For any ε > 0, there exists a constant $c = c(\varepsilon, \delta_0, \gamma_0) > 0$ such that, for all u ∈ E satisfying
\[ \frac{\int_{S_1}e^{u}\,d\tau_g}{\int_M e^{u}\,d\tau_g} \ge \gamma_0, \qquad \frac{\int_{S_2}e^{u}\,d\tau_g}{\int_M e^{u}\,d\tau_g} \ge \gamma_0, \tag{136} \]
we have
\[ \int_M e^{u}\,d\tau_g \le c\,\exp\Big(\frac{1}{32\pi - \varepsilon}\int_M|\nabla u|^2\,d\tau_g\Big). \tag{137} \]
A simple application of Hölder's inequality shows that, for λ < 16π, $J_\lambda$ is bounded below and coercive on the set of functions satisfying (137). Notice that for any $h \in D_\lambda$ there exists a function w ∈ h(D) with $m(w) \in \Gamma_2$. So, if by contradiction we assume that (135) yields the value −∞, then we would find a sequence
\[ w_n \in E, \ \text{ with } m(w_n) \in \Gamma_2, \ \text{ and } J_\lambda(w_n) \to -\infty. \tag{138} \]
Hence, $w_n$ has to violate (136). So there must exist a point $p_0 \in M$ such that, for every ε > 0, we have
\[ \frac{\int_{U_\varepsilon(p_0)}e^{w_n}\,d\tau_g}{\int_M e^{w_n}\,d\tau_g} \to 1, \quad \text{as } n \to \infty. \]
But this is impossible, since it would imply that
\[ |m(w_n) - X(p_0)| \le \frac{\int_{U_\varepsilon(p_0)}e^{w_n}\,|X - X(p_0)|\,d\tau_g}{\int_M e^{w_n}\,d\tau_g} + o(1) \le \mathrm{dist}\big(X(U_\varepsilon(p_0)), X(p_0)\big) + o(1), \quad \text{as } n \to \infty. \]
Therefore, by choosing ε > 0 sufficiently small, for n large we can ensure that $|m(w_n) - X(p_0)| < \frac12\,\mathrm{dist}(\Gamma_2, X(M))$, in contradiction with the fact that $m(w_n) \in \Gamma_2$ and $p_0 \in M$.

Claim 3. There exists a dense set E ⊂ (8π, 16π) such that for every λ ∈ E there exists a sequence $\{w_n\} \subset E$ satisfying:
(a) $w_n$ is uniformly bounded in E,
(b) $J_\lambda(w_n) \to c_\lambda$,
(c) $J'_\lambda(w_n) \to 0$ in $E^{*}$.
Moreover $c_\lambda$ defines a critical value for $J_\lambda$, for every λ ∈ E.

To establish (a)–(c), as in [49] and [21] we shall use Struwe's monotonicity trick. For this purpose note that in view of (128), if $\lambda_1 < \lambda_2$ then $D_{\lambda_1} \subset D_{\lambda_2}$ and $c_{\lambda_1} \ge c_{\lambda_2}$. Hence $c_\lambda$ defines a non-increasing function of λ. Denote by E ⊂ (8π, 16π) the set of λ's where $c_\lambda$ is differentiable. Clearly E is dense in [8π, 16π], see [48]. Thus, for any λ ∈ E, we can choose a sequence $\lambda_n \nearrow \lambda$ such that
\[ 0 \le \frac{c_{\lambda_n} - c_\lambda}{\lambda - \lambda_n} \le C, \quad \text{for some constant } C \text{ independent of } n. \tag{139} \]
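Using (139), we see that, for $h_n \in D_{\lambda_n} \subset D_\lambda$ satisfying
\[ \sup_{w\in h_n(D)}J_{\lambda_n}(w) \le c_{\lambda_n} + \lambda - \lambda_n, \tag{140} \]
and for all $w \in h_n(D)$ such that $J_\lambda(w) \ge c_\lambda - (\lambda - \lambda_n)$, we have:
\[ \ln\Big(\frac{1}{|M|}\int_M V_0 e^{w}\,d\tau_g\Big) = 2\,\mathrm{cotan}^2\theta\,\frac{J_{\lambda_n}(w) - J_\lambda(w)}{\lambda - \lambda_n} \le 2\,\mathrm{cotan}^2\theta\left(\frac{c_{\lambda_n} - c_\lambda}{\lambda - \lambda_n} + 2\right) \le 2\,\mathrm{cotan}^2\theta\,(C + 2) \equiv C_1, \tag{141} \]
and consequently,
\[ \|\nabla w\|^2_{L^2(M)} \le 4\,\mathrm{cotan}^2\theta\,J_{\lambda_n}(w) + 2\lambda_n\ln\Big(\frac{1}{|M|}\int_M V_0 e^{w}\,d\tau_g\Big) \le 4\,\mathrm{cotan}^2\theta\,\big(c_{\lambda_n} + \lambda - \lambda_n\big) + 2\lambda_n C_1. \tag{142} \]
In view of (139), we have
\[ c_{\lambda_n} \le c_\lambda + C(\lambda - \lambda_n), \tag{143} \]
and from (142) we find a constant R > 0, independent of n, such that if (140) and (141) hold, then $\|\nabla w\|_{L^2(M)} \le R$. To establish our claim, we are going to show that there exists a sequence $\{w_n\} \subset E$ with $\|\nabla w_n\|_{L^2(M)} \le R$, satisfying (b)-(c). For this purpose we argue by contradiction,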
and suppose that every w ∈ E such that $\|\nabla w\|_{L^2(M)} \le R$ and $|J_\lambda(w) - c_\lambda| < a$ satisfies $\|J'_\lambda(w)\|_{E^{*}} \ge \delta$, for suitable a > 0 and δ > 0. Thus, by means of standard deformation arguments (cf. [48]), we can find ε ∈ (0, a) and a homeomorphism η : E → E with the following properties:
(i) η(w) = w, provided $|J_\lambda(w) - c_\lambda| \ge a$;
(ii) $J_\lambda(\eta(w)) \le J_\lambda(w)$;
(iii) if $\|\nabla w\|_{L^2(M)} \le R$ and $J_\lambda(w) \le c_\lambda + \varepsilon$, then $J_\lambda(\eta(w)) \le c_\lambda - \varepsilon$.
Let us choose $h_n \in D_{\lambda_n} \subset D_\lambda$ so that (140) holds. In view of (i) and (ii), we have that $\eta\circ h_n \in D_{\lambda_n} \subset D_\lambda$. Let $w_n \in h_n(D)$ be such that $J_\lambda(w_n) = \max_{w\in h_n(D)}J_\lambda(w)$. By definition $J_\lambda(w_n) \ge c_\lambda$, so that (140) and (141) hold for $w_n$ and imply $\|\nabla w_n\|_{L^2(M)} \le R$. Therefore, by (140) and (143), we have
\[ \max_{w\in h_n(D)}J_\lambda(w) = J_\lambda(w_n) \le J_{\lambda_n}(w_n) \le c_{\lambda_n} + \lambda - \lambda_n \le c_\lambda + (C+1)(\lambda - \lambda_n) < c_\lambda + \varepsilon, \]
provided n is sufficiently large. But this is impossible, since (ii) implies that, for n sufficiently large, we would have
\[ \max_{w\in\eta\circ h_n(D)}J_\lambda(w) < c_\lambda - \varepsilon \]
and $\eta\circ h_n \in D_\lambda$, which contradicts the definition of $c_\lambda$.

At this point, it is easy to show that if $\{w_n\}$ satisfies (a), (b) and (c), then $w_n$ admits a subsequence converging strongly to a critical point of $J_\lambda$, with $c_\lambda$ the corresponding critical value. Indeed, if $\{w_n\}$ satisfies (a), (b) and (c), then both $w_n$ and $z_n := \gamma(w_n)$ define uniformly bounded sequences in E. Hence, as above, passing to a subsequence (denoted in the same way) we find (w, z) ∈ E × E such that (129) holds and z = γ(w). Note also that, as n → ∞,
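\[
2\int_M\nabla\left(\frac{w_n}{2}+z_n\right)\cdot\nabla(w_n-w)\,d\tau_g=-\mu\tan^2\theta\,\frac{\int_M e^{z_n}(w_n-w)\,d\tau_g}{\int_M e^{z_n}\,d\tau_g}=o(1),
\]
and
\[
\left|J'_\lambda(w_n)(w_n-w)\right|\le\|J'_\lambda(w_n)\|_{E^*}\,\|\nabla(w_n-w)\|_{L^2(M)}=o(1).
\]
Therefore, as $n\to\infty$,
\[
\begin{aligned}
o(1)=J'_\lambda(w_n)(w_n-w)={}&\frac{\tan^2\theta}{2}\int_M\nabla w_n\cdot\nabla(w_n-w)\,d\tau_g-\lambda\,\frac{\int_M V_0e^{w_n}(w_n-w)\,d\tau_g}{\int_M V_0e^{w_n}\,d\tau_g}\\
&+2\int_M\nabla\left(\frac{w_n}{2}+z_n\right)\cdot\nabla(w_n-w)\,d\tau_g+\mu\tan^2\theta\,\frac{\int_M e^{z_n}(w_n-w)\,d\tau_g}{\int_M e^{z_n}\,d\tau_g}\\
={}&\frac{\tan^2\theta}{2}\left(\|\nabla w_n\|^2_{L^2(M)}-\|\nabla w\|^2_{L^2(M)}\right)+o(1).
\end{aligned}
\]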
Consequently, $w_n\to w$ and $\gamma(w_n)\to\gamma(w)$ strongly in $E$, $c_\lambda=J_\lambda(w)$ and $J'_\lambda(w)=0$. So, $c_\lambda$ defines a critical value for $J_\lambda$ for every $\lambda\in E$, and Claim 3 is established. From Claim 3 we get that, for every $\lambda\in E$, problem (115)$_\lambda$ admits a solution. Next we use the compactness result of the previous section to show that, in fact, this remains true for every $\lambda$ satisfying (130). To this end, let $\{\lambda_n\}\subset E$ be such that $\lambda_n\to\lambda$, and denote by $(u_{1,n},u_{2,n})$ a solution pair for (115)$_{\lambda=\lambda_n}$. From the second equation in (115)$_{\lambda_n}$ and by a simple application of the maximum principle, we have that
\[
\max_M\,\mu\,\frac{e^{u_{2,n}}}{\int_M e^{u_{2,n}}\,d\tau_g}\le\frac{1}{|M|}\left(1+\lambda\cos^2\theta\right),\quad\forall\,n\in\mathbb{N}.
\]
Therefore, we can use Theorem 7 for the first equation in (115)$_{\lambda_n}$ (with $\varphi_n=\frac{\mu}{4\pi}\left(\frac{e^{u_{2,n}}}{\int_M e^{u_{2,n}}\,d\tau_g}-\frac{1}{|M|}\right)+\varphi$) and, by (130), conclude that $\frac{e^{u_{1,n}}}{\int_M e^{u_{1,n}}\,d\tau_g}$ is also uniformly bounded in $M$.
Consequently, setting $u_{1,n}=u_0+w_{1,n}$ ($u_0$ given in (103)) and $u_{2,n}=w_{2,n}$, we find that the pair $(w_{1,n},w_{2,n})$ is uniformly bounded in $E$. So we can argue as above to conclude that, along a subsequence, $w_{j,n}\to w_j$, $j=1,2$, strongly in $E$ and, by elliptic regularity, in any other relevant norm. Clearly, $u_1=u_0+w_1$ and $u_2=w_2$ give the desired solution for (115)$_\lambda$.

In the following section we show how an application of Theorem 8 yields the existence of multivortices in the Electroweak theory.

6. Vortices in the Electroweak Theory

The Electroweak Theory is a relativistic non-abelian gauge field theory defined on four-dimensional Minkowski space $\mathbb{R}^{1+3}$. The gauge group is $SU(2)\times U(1)$, acting on the space of $\mathbb{C}^2$-valued Higgs matter fields $\phi$, with corresponding gauge potentials $A=(A^a)$, $a=1,2,3$, and $B$ defined through the real 4-vectors $A^a=(A^a_\mu)$, $B=(B_\mu)$, $\mu=0,1,2,3$. We refer to [46] for a detailed discussion of this model. Here we consider the theory formulated with respect to the unitary gauge, which is defined by the condition
\[
\phi=\begin{pmatrix}0\\ \varphi\end{pmatrix}, \tag{144}
\]
with $\varphi$ a real scalar function. We follow [46], and consider the new field configurations $W_\mu$, $P_\mu$ and $Z_\mu$ given as follows:
\[
\begin{aligned}
P_\mu&=B_\mu\cos\theta+A^3_\mu\sin\theta,\\
Z_\mu&=-B_\mu\sin\theta+A^3_\mu\cos\theta,\\
W_\mu&=\frac{1}{\sqrt2}\left(A^1_\mu+iA^2_\mu\right).
\end{aligned}\tag{145}
\]
The angle $\theta$ is called the Weinberg (mixing) angle and relates the $SU(2)$-coupling constant $g$ with the $U(1)$-coupling constant $g_*$ and the electron charge $-e$ via the relations
\[
\cos\theta=\frac{g}{\left(g^2+g_*^2\right)^{1/2}},\qquad e=g_*\cos\theta=g\sin\theta. \tag{146}
\]
The fields $W_\mu$, $Z_\mu$ are expected to be the massive fields mediating short-range (weak) interactions, while $P_\mu$ mediates the long-range (electromagnetic) interactions, see [33]. Motivated by [4], we are interested in analyzing Electroweak-vortex configurations, and so we take the magnetic excitation to be confined in the third direction and express the vortex ansatz as follows:
\[
\begin{aligned}
&A^a_0=A^a_3=B_0=B_3=0,\quad a=1,2,3,\\
&A^a_j=A^a_j(x_1,x_2),\quad B_j=B_j(x_1,x_2),\quad j=1,2,\ a=1,2,3,\\
&\phi=\phi(x_1,x_2).
\end{aligned}\tag{147}
\]
Let $D_\mu=\partial_\mu-igA^3_\mu$, $P_{\mu\nu}=\partial_\mu P_\nu-\partial_\nu P_\mu$ and $Z_{\mu\nu}=\partial_\mu Z_\nu-\partial_\nu Z_\mu$. If we assume further that the relations $W_1=W$, $W_2=iW$ hold, namely that $A^1_2=-A^2_1$ and $A^2_2=A^1_1$, then the energy density corresponding to the theory may be expressed as follows:
\[
\begin{aligned}
\mathcal{E}={}&|D_1W+iD_2W|^2+\frac12P_{12}^2+\frac12Z_{12}^2-2g\left(Z_{12}\cos\theta+P_{12}\sin\theta\right)|W|^2\\
&+2g^2|W|^4+(\partial_j\varphi)^2+\frac{1}{4\cos^2\theta}\,g^2\varphi^2Z_j^2+g^2\varphi^2|W|^2+\sigma\left(\varphi_0^2-\varphi^2\right)^2,
\end{aligned}\tag{148}
\]
with $\sigma$ and $\varphi_0$ given positive constants. The residual $U(1)$ symmetry of the model is now represented by the invariance of (148) under the gauge transformation
\[
W\to\exp(i\xi)W,\quad P_j\to P_j+\frac1e\,\partial_j\xi,\quad Z_j\to Z_j,\ j=1,2,\quad\text{and}\quad\varphi\to\varphi, \tag{149}
\]
for any given smooth function $\xi:\mathbb{R}^2\to\mathbb{R}$. At this point we can introduce the appropriate periodic boundary conditions for the (vortex-like) field configurations $(W,P,Z,\varphi)$ in the unitary gauge that account for (149). Let
\[
\Omega=\{x=(x_1,x_2)\in\mathbb{R}^2\mid x=s_1a_1+s_2a_2,\ 0<s_1,s_2<1\}
\]
be the fundamental domain of a lattice in $\mathbb{R}^2$ generated by the independent vectors $a_1$ and $a_2$. Set $\Gamma_{a_k}=\{x\in\mathbb{R}^2\mid x=s_ka_k,\ 0<s_k<1\}$, so that $\partial\Omega=\Gamma_{a_1}\cup\Gamma_{a_2}\cup\{a_1+\Gamma_{a_2}\}\cup\{a_2+\Gamma_{a_1}\}\cup\{0,a_1,a_2,a_1+a_2\}$. We require
\[
\begin{aligned}
&\exp\left(i\xi_k(x+a_k)\right)W(x+a_k)=\exp\left(i\xi_k(x)\right)W(x),\\
&\left(P_j-\frac1e\,\partial_j\xi_k\right)(x+a_k)=\left(P_j-\frac1e\,\partial_j\xi_k\right)(x),\quad j=1,2,\\
&Z_j(x+a_k)=Z_j(x),\quad j=1,2,\\
&\varphi(x+a_k)=\varphi(x),
\end{aligned}\tag{150}
\]
where $x\in\Gamma_{a_1}\cup\Gamma_{a_2}-\Gamma_{a_k}$, $k=1,2$, and $\xi_1$, $\xi_2$ are smooth real functions defined in a neighborhood of $\Gamma_{a_2}\cup\{a_1+\Gamma_{a_2}\}$ and $\Gamma_{a_1}\cup\{a_2+\Gamma_{a_1}\}$ respectively. It is explicitly worked out in [46] that the full $SU(2)\times U(1)$ 't Hooft periodic boundary conditions imposed on the $(A,B,\phi)$ field configurations reduce to (150) when expressed in terms of the unitary gauge field configurations. Let $B$ denote the (magnetic) flux of $P$ through $\Omega$,
\[
B=\int_\Omega P_{12}\,dx. \tag{151}
\]
The periodic boundary conditions (150) yield a quantization property for $B$. In fact, since $W$ is a single-valued complex scalar field (which for simplicity we assume not to vanish on $\partial\Omega$), its phase change around $\partial\Omega$ must be an integral multiple of $2\pi$, say $2\pi N$.
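Thus, as a consequence of (150), we find
\[
B=\int_\Omega P_{12}\,dx=\int_\Omega(\partial_1P_2-\partial_2P_1)\,dx=\oint_{\partial\Omega}P_j\,dx^j=\frac{2\pi}{e}\,N. \tag{152}
\]
The integer $N$, called the total vortex number, defines a topological invariant which, in the selfdual setting, we check to coincide with the degree of $W$ at zero on $\partial\Omega$. Using the energy density (148), after integration over $\Omega$ and some straightforward calculations, we derive the following expression for the total energy:
\[
\begin{aligned}
E=\int_\Omega\mathcal{E}\,dx={}&\int_\Omega\left\{|D_1W+iD_2W|^2+\frac12\left(P_{12}-\frac{g}{2\sin\theta}\,\varphi_0^2-2g\sin\theta\,|W|^2\right)^2\right\}dx\\
&+\int_\Omega\left\{\frac12\left(Z_{12}-\frac{g}{2\cos\theta}\left(\varphi^2-\varphi_0^2\right)-2g\cos\theta\,|W|^2\right)^2+\left(\frac{g\varphi}{2\cos\theta}\,Z_j+\varepsilon_{jk}\partial_k\varphi\right)^2\right\}dx\\
&+\int_\Omega\left\{\left(\sigma-\frac{g^2}{8\cos^2\theta}\right)\left(\varphi^2-\varphi_0^2\right)^2-\frac{g^2}{8\sin^2\theta}\,\varphi_0^4+\frac{g\varphi_0^2}{2\sin\theta}\,P_{12}-\frac{g\varphi_0^2}{2\cos\theta}\,Z_{12}-\frac{g}{2\cos\theta}\,\partial_k\left(\varepsilon_{jk}Z_j\varphi^2\right)\right\}dx,
\end{aligned}
\]
see Ambjorn–Olesen [4]. Therefore, from (150) and (152) we find
\[
E\ge\frac{g\varphi_0^2}{\sin\theta}\,\frac{\pi N}{e}-\frac{g^2\varphi_0^4}{8\sin^2\theta}\,|\Omega|,\qquad\text{for }\sigma\ge\frac{g^2}{8\cos^2\theta}.
\]
In the critical case
\[
\sigma=\frac{g^2}{8\cos^2\theta},
\]
the above energy lower bound may be saturated by the solutions of the following selfdual Bogomol'nyi system:
\[
\begin{aligned}
&D_1W+iD_2W=0,\\
&P_{12}=\frac{g}{2\sin\theta}\,\varphi_0^2+2g\sin\theta\,|W|^2,\\
&Z_{12}=\frac{g}{2\cos\theta}\left(\varphi^2-\varphi_0^2\right)+2g\cos\theta\,|W|^2,\\
&Z_j=-\frac{2\cos\theta}{g}\,\varepsilon_{jk}\partial_k\ln\varphi,\quad j=1,2.
\end{aligned}\tag{153}
\]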
It is straightforward to verify that solutions of (153) give rise to static solutions for the original equations of motion of the Electroweak Theory. Moreover, using the second equation in (153) and (152), we have that necessarily the vortex number N must be positive. Our next task will be to establish a solution for (153) subject to the periodic boundary conditions (150). In the sequel, we shall refer to the solution of (153)–(150) with total
vortex number $N$ as selfdual $N$-vortices. To seek selfdual $N$-vortices we reduce the problem to the analysis of a system of elliptic equations of Liouville type over the flat 2-torus. To this purpose, we use complex notation in $\mathbb{R}^2$ and let
\[
\partial^+=\frac12\left(\partial_1+i\partial_2\right),\qquad\omega=A^3_1+iA^3_2.
\]
Thus, the first equation in (153) reads
\[
\partial^+W=\frac12\,ig\,\omega W. \tag{154}
\]
The relation (154) implies that, if $Z(W)$ is the set of zeroes of $W$ in $\Omega$, then $Z(W)$ contains a finite number of points, and for $p_0\in Z(W)$ we have
\[
W(z)=(z-p_0)^{n_0}h_0(z) \tag{155}
\]
in a neighborhood of $p_0$, with $n_0\in\mathbb{N}$ and $h_0$ a non-vanishing complex-valued smooth function (see [29]). Consequently, for solutions of (153), the vortex number $N$ counts the zeroes of $W$ in $\Omega$ according to their multiplicities. Furthermore, in $\Omega\setminus Z(W)$, Eq. (154) may be expressed as follows:
\[
\omega=-\frac{2i}{g}\,\partial^+\ln W. \tag{156}
\]
Set $Z(W)=\{p_1,\dots,p_m\}\subset\Omega$ and assume that $n_l\in\mathbb{N}$ is the multiplicity of the zero $p_l$, $l=1,\dots,m$, so that $N=n_1+\dots+n_m$. We refer to $p_l$ as a vortex point. By virtue of (145), Eq. (156) implies that
\[
-\frac{1}{2g}\,\Delta\ln|W|^2=\partial_1A^3_2-\partial_2A^3_1=P_{12}\sin\theta+Z_{12}\cos\theta\quad\text{in }\Omega\setminus Z(W). \tag{157}
\]
Since we work in the unitary gauge, it is necessary to impose that the real Higgs field $\varphi$ never vanishes on $\Omega$, and the last equation in (153) gives
\[
Z_{12}=\frac{2\cos\theta}{g}\,\Delta\ln\varphi. \tag{158}
\]
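Thus, using (157) and (158), we may reduce (153) to the following elliptic system:
\[
\begin{cases}
-\Delta\ln|W|^2=g^2\varphi^2+4g^2|W|^2&\text{on }\Omega\setminus Z(W),\\[4pt]
\Delta\ln\varphi=\dfrac{g^2}{4\cos^2\theta}\left(\varphi^2-\varphi_0^2\right)+g^2|W|^2&\text{on }\Omega\setminus Z(W).
\end{cases}\tag{159}
\]
Using (155) and the substitution $|W|^2=e^w$ and $\varphi^2=e^u$, we may combine (153) and the boundary conditions (150) to show that any selfdual $N$-vortex yields a solution of the following elliptic problem:
\[
\begin{cases}
-\Delta w=4g^2e^w+g^2e^u-4\pi\displaystyle\sum_{l=1}^m n_l\delta_{p_l}&\text{in }\Omega,\\[4pt]
-\Delta u=-2g^2e^w-\dfrac{g^2}{2\cos^2\theta}\,e^u+\dfrac{g^2\varphi_0^2}{2\cos^2\theta}&\text{in }\Omega,\\[4pt]
w,\ u\ \text{doubly periodic on }\partial\Omega.
\end{cases}\tag{160}
\]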
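The reduction to (159) can be checked by direct substitution. The following is only a sketch of that check: it reads the second and third equations of (153) as $P_{12}=\frac{g}{2\sin\theta}\varphi_0^2+2g\sin\theta|W|^2$ and $Z_{12}=\frac{g}{2\cos\theta}(\varphi^2-\varphi_0^2)+2g\cos\theta|W|^2$, and writes $\Delta$ for the Laplacian; nothing beyond (153), (157) and (158) is assumed.

```latex
% Sketch: how (157), (158) and (153) combine into (159).
% Step 1: insert P_{12} and Z_{12} from (153) into the right-hand side of (157).
\begin{align*}
-\frac{1}{2g}\,\Delta\ln|W|^2
 &= P_{12}\sin\theta + Z_{12}\cos\theta \\
 &= \Big(\tfrac{g}{2}\,\varphi_0^2 + 2g\sin^2\theta\,|W|^2\Big)
  + \Big(\tfrac{g}{2}\,(\varphi^2-\varphi_0^2) + 2g\cos^2\theta\,|W|^2\Big) \\
 &= \tfrac{g}{2}\,\varphi^2 + 2g\,|W|^2 .
\end{align*}
% Multiplying by 2g gives the first equation of (159):
\[
-\Delta\ln|W|^2 = g^2\varphi^2 + 4g^2|W|^2 .
\]
% Step 2: equate (158) with the third equation of (153) and divide by 2cos(theta)/g.
\[
\frac{2\cos\theta}{g}\,\Delta\ln\varphi
 = \frac{g}{2\cos\theta}\,(\varphi^2-\varphi_0^2) + 2g\cos\theta\,|W|^2
 \quad\Longrightarrow\quad
 \Delta\ln\varphi = \frac{g^2}{4\cos^2\theta}\,(\varphi^2-\varphi_0^2) + g^2|W|^2 .
\]
```

With $|W|^2=e^w$ and $\varphi^2=e^u$ (so that $\Delta u=2\Delta\ln\varphi$), the two lines above become the regular parts of the two equations in (160).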
Interestingly enough, this procedure can be reversed, in the sense that any solution of (160) yields a selfdual $N$-vortex for the Electroweak Theory. Indeed, if $(u,w)$ is a solution of (160), then we can first recover $\varphi$, $W$, $P_j$, $Z_j$ by setting:
\[
\begin{aligned}
&\varphi(z)=\exp\left(\tfrac12u(z)\right),\\
&W(z)=\exp\left(\tfrac12\left[w(z)+iQ(z)\right]\right);\qquad Q(z)=2\sum_{l=1}^m n_l\arg(z-p_l),\\
&Z_j=-\frac{2\cos\theta}{g}\,\varepsilon_{jk}\partial_k\ln\varphi,\quad j=1,2.
\end{aligned}\tag{161}
\]
It was observed by Taubes [51] (see also [29]) that, by using such an expression for $W$, the function $\partial^+\ln W$ extends smoothly over the points $p_l$, $l=1,\dots,m$. Therefore (156) may be used to recover $A^3_j$, $j=1,2$ and, in view of (145), derive
\[
P_j=\operatorname{cosec}\theta\,A^3_j(z)-\cot\theta\,Z_j(z),\quad j=1,2. \tag{162}
\]
It is not hard to check that $\varphi$, $W$, $P_j$, $Z_j$, $j=1,2$, as given by (161) and (162), indeed define a solution of the Bogomol'nyi system (153) which satisfies the periodic boundary condition (150). Moreover, $W$ vanishes exactly at each point $p_j$ with multiplicity $n_j$, $j=1,\dots,m$, and the corresponding total vortex number is given by $N=n_1+\dots+n_m$. In view of the periodic boundary conditions, we may set (160) over the flat two-dimensional torus $M=\mathbb{R}^2/(\mathbb{Z}a_1+\mathbb{Z}a_2)$. By means of this identification we are reduced to seeking solutions of the elliptic system:
\[
\begin{cases}
-\Delta w=4g^2e^w+g^2e^u-4\pi\displaystyle\sum_{j=1}^m n_j\delta_{p_j}&\text{on }M,\\[4pt]
-\Delta u=-2g^2e^w-\dfrac{g^2}{2\cos^2\theta}\,e^u+\dfrac{g^2\varphi_0^2}{2\cos^2\theta}&\text{on }M.
\end{cases}\tag{163}
\]
Integrating over $M$, we find the following constraints for the solvability of (163):
\[
\begin{aligned}
&4g^2\int_M e^w+g^2\int_M e^u=4\pi N,\\
&2g^2\int_M e^w+\frac{g^2}{2\cos^2\theta}\int_M e^u=\frac{g^2\varphi_0^2}{2\cos^2\theta}\,|M|.
\end{aligned}\tag{164}
\]
Consequently,
\[
\int_M e^w=\frac{4\pi N-g^2\varphi_0^2|M|}{4g^2\sin^2\theta},\qquad
\int_M e^u=\frac{g^2\varphi_0^2|M|-4\pi N\cos^2\theta}{g^2\sin^2\theta}, \tag{165}
\]
which imply the following necessary condition for the solvability of (163):
\[
g^2\varphi_0^2<\frac{4\pi N}{|M|}<\frac{g^2\varphi_0^2}{\cos^2\theta}. \tag{166}
\]
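For the reader's convenience, the elimination leading from (164) to (165) and (166) can be written out as below. This is a sketch only; the abbreviations $A=\int_M e^w$ and $B=\int_M e^u$ are introduced here for the computation and do not appear in the original text.

```latex
% Sketch: solving the linear system (164) for A = \int_M e^w and B = \int_M e^u.
% First line of (164), and second line of (164) multiplied by 2cos^2(theta):
\begin{align*}
4g^2 A + g^2 B &= 4\pi N, \\
4g^2\cos^2\theta\, A + g^2 B &= g^2\varphi_0^2\,|M| .
\end{align*}
% Subtracting the second from the first eliminates B:
\[
4g^2\sin^2\theta\, A = 4\pi N - g^2\varphi_0^2|M|
\quad\Longrightarrow\quad
A = \frac{4\pi N - g^2\varphi_0^2|M|}{4g^2\sin^2\theta}.
\]
% Back-substitution into the first line gives B:
\[
g^2 B = 4\pi N - \frac{4\pi N - g^2\varphi_0^2|M|}{\sin^2\theta}
      = \frac{g^2\varphi_0^2|M| - 4\pi N\cos^2\theta}{\sin^2\theta}.
\]
% Both integrals are integrals of positive functions, so A > 0 and B > 0:
\[
A>0 \iff g^2\varphi_0^2 < \frac{4\pi N}{|M|},
\qquad
B>0 \iff \frac{4\pi N}{|M|} < \frac{g^2\varphi_0^2}{\cos^2\theta}.
\]
```

The two positivity conditions together are exactly (166), which is why that condition is necessary.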
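Decomposing
\[
\begin{aligned}
&w=u_1+c,\quad\text{with }\int_M u_1=0\text{ and }c=\frac{1}{|M|}\int_M w,\\
&u=u_2+d,\quad\text{with }\int_M u_2=0\text{ and }d=\frac{1}{|M|}\int_M u,
\end{aligned}\tag{167}
\]
from (165) we can express the mean values $(c,d)$ of $w$ and $u$ respectively as follows:
\[
\begin{aligned}
\exp c&=\frac{4\pi N-g^2\varphi_0^2|M|}{4g^2\sin^2\theta}\,\frac{1}{\int_M e^{u_1}},\\
\exp d&=\frac{g^2\varphi_0^2|M|-4\pi N\cos^2\theta}{g^2\sin^2\theta}\,\frac{1}{\int_M e^{u_2}}.
\end{aligned}\tag{168}
\]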
Setting
\[
\lambda=4\pi N-\frac{g^2\varphi_0^2|M|-4\pi N\cos^2\theta}{\sin^2\theta}=\frac{4\pi N-g^2\varphi_0^2|M|}{\sin^2\theta}, \tag{169}
\]
the necessary constraint (166) may be stated as follows:
\[
0<\lambda<4\pi N. \tag{170}
\]
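The equivalence between (166) and (170) is a one-line computation from (169); a sketch is given below, using only the quantities already defined in (165) and (169).

```latex
% Sketch: (169) and the equivalence (166) <=> (170).
% The two expressions for lambda in (169) agree because sin^2 + cos^2 = 1:
\[
4\pi N - \frac{g^2\varphi_0^2|M| - 4\pi N\cos^2\theta}{\sin^2\theta}
 = \frac{4\pi N(\sin^2\theta+\cos^2\theta) - g^2\varphi_0^2|M|}{\sin^2\theta}
 = \frac{4\pi N - g^2\varphi_0^2|M|}{\sin^2\theta}.
\]
% Comparing with (165):
\[
\lambda = 4g^2\int_M e^{w}, \qquad 4\pi N - \lambda = g^2\int_M e^{u}.
\]
% Hence each inequality in (170) is one of the two inequalities in (166):
\[
\lambda > 0 \iff g^2\varphi_0^2 < \frac{4\pi N}{|M|},
\qquad
\lambda < 4\pi N \iff \frac{4\pi N}{|M|} < \frac{g^2\varphi_0^2}{\cos^2\theta}.
\]
```

In particular, $\lambda$ and $4\pi N-\lambda$ are the total masses carried by the two exponential terms in the first equation of (163).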
Using (167) and (168), after straightforward calculations, we are able to formulate (163) in terms of the unknowns $(u_1,u_2)$ as follows:
\[
\begin{cases}
-\Delta u_1=\lambda\,\dfrac{e^{u_1}}{\int_M e^{u_1}}+(4\pi N-\lambda)\,\dfrac{e^{u_2}}{\int_M e^{u_2}}-4\pi\displaystyle\sum_{j=1}^m n_j\delta_{p_j}&\text{on }M,\\[6pt]
-\Delta u_2=-\dfrac{\lambda}{2}\,\dfrac{e^{u_1}}{\int_M e^{u_1}}+\dfrac{(\lambda-4\pi N)}{2\cos^2\theta}\,\dfrac{e^{u_2}}{\int_M e^{u_2}}+\dfrac{1}{2|M|}\left(4\pi N+(4\pi N-\lambda)\tan^2\theta\right)&\text{on }M,\\[6pt]
\displaystyle\int_M u_1=0,\quad\int_M u_2=0.
\end{cases}\tag{171}
\]
Note that, if (166) holds, then via (168), (169) and (170) we have a complete equivalence between problems (163) and (171). Clearly, problem (171) is a particular case of the general elliptic system (115)$_\lambda$ in the situation where $M$ is the flat 2-torus, $\mu=4\pi N-\lambda$, $\alpha_j=n_j\in\mathbb{N}$, $\varphi=\frac{N}{|M|}$ and $V\equiv1$. Thus, by setting $\lambda_N=\min\{16\pi,4\pi N\}$, from Theorem 8 we derive:

Theorem 9. For any $\lambda\in(0,\lambda_N)\setminus\{8\pi\}$, there exists a solution $(u_1,u_2)$ of the coupled elliptic system (171).

It is a very interesting open problem to know whether or not problem (171) admits a solution for $\lambda=8\pi$. From Theorem 9 we immediately derive that, except for the values of the parameters satisfying the identity
\[
\frac{4\pi N-g^2\varphi_0^2|M|}{8\pi\sin^2\theta}=1
\]
when $N=3$ and $4$, the condition (166) defines a necessary and sufficient condition for the existence of a selfdual $N$-vortex. More precisely we have:
Theorem 10 (Existence of Multivortex in the Electroweak Theory). For given $N\in\mathbb{N}$, suppose that (166) holds and
\[
1\ne\frac{4\pi N-g^2\varphi_0^2|M|}{8\pi\sin^2\theta}<2.
\]
Then, for any $\{p_1,\dots,p_m\}\subset M$ and $n_1,\dots,n_m\in\mathbb{N}$ with $N=n_1+\dots+n_m$, there exists a selfdual $N$-vortex $(\varphi,W,P_j,Z_j)$, $j=1,2$, satisfying (153) and (150), such that $\varphi>0$, $W$ vanishes precisely at $p_j$ with multiplicity $n_j$, $j=1,\dots,m$, and the total flux is $B=2\pi N/e$.

Corollary 7. In case $N=1,2,3,4$ and
\[
\frac{4\pi N-g^2\varphi_0^2|M|}{8\pi\sin^2\theta}\ne1,
\]
the condition (166) is necessary and sufficient for the existence of a selfdual $N$-vortex solution.

As already mentioned, Spruck–Yang [46] have derived the analogue of Theorem 10 under the more restrictive assumption
\[
\frac{4\pi N-g^2\varphi_0^2|M|}{8\pi\sin^2\theta}<1,
\]
which implies Corollary 7 only for $N=1,2$.

Acknowledgements. The authors wish to thank the referee for pointing out a mistake (concerning the proof of Theorem 9) in the earlier version of the paper. His/her appropriate criticism helped us to improve the presentation of our results.
References
1. Abrikosov, A.A.: On the magnetic properties of superconductors of second group. Sov. Phys. JETP 5, 1174–1182 (1957)
2. Ambjorn, J., Olesen, P.: Anti-screening of large magnetic fields by vector-boson. Phys. Lett. B214, 565–569 (1988)
3. Ambjorn, J., Olesen, P.: A magnetic condensate solution of the classical electroweak theory. Phys. Lett. B218, 67–71 (1989)
4. Ambjorn, J., Olesen, P.: On electroweak magnetism. Nucl. Phys. B315, 606–614 (1989)
5. Ambjorn, J., Olesen, P.: A condensate solution of the electroweak theory which interpolates between the broken and the symmetry phase. Nucl. Phys. B330, 193–204 (1990)
6. Aubin, T.: Nonlinear analysis on Manifolds: Monge–Ampere Equations. New York–Berlin: Springer-Verlag, 1982
7. Bartolucci, D.: A priori estimate for an elliptic equation with exponential nonlinearity. Preprint 2001
8. Bartolucci, D.: A compactness result for periodic multivortices in the Electroweak Theory. Preprint 2001
9. Bartolucci, D., Tarantello, G.: The Liouville equations with singular data: A concentration-compactness principle via a local representation formula. Accepted for publication on Jour. Diff. Eqs.
10. Brezis, H., Li, Y.Y., Shafrir, I.: A sup+inf inequality for some nonlinear elliptic equations involving exponential nonlinearities. J. Funct. Anal. 115, 344–358 (1993)
11. Brezis, H., Merle, F.: Uniform estimates and blow-up behaviour for solutions of $-\Delta u=V(x)e^u$ in two dimensions. Comm. in P.D.E. 16 (8,9), 1223–1253 (1991)
12. Caffarelli, L., Yang, Y.: Vortex condensation in the Chern–Simons–Higgs model: an existence theorem. Commun. Math. Phys. 168, 321–336 (1995)
13. Caglioti, E., Lions, P.L., Marchioro C., Pulvirenti, M.: A special class of stationary flows for two dimensional Euler equations: A statistical mechanics description. Commun. Math. Phys. 143, 501–525 (1992)
14. Caglioti, E., Lions, P.L., Marchioro C., Pulvirenti, M.: A special class of stationary flows for two dimensional Euler equations: A statistical mechanics description, part II. Commun. Math. Phys. 174, 229–260 (1995)
15. Chae, D., Imanuvilov, O.: The Existence of Non-topological Multivortex Solutions in the Relativistic Self-Dual Chern–Simons Theory. Commun. Math. Phys. 215, 119–142 (2000)
16. Chang, S.Y.A., Yang, P.: Prescribing Gaussian Curvature on $S^2$. Acta Math. 159, 214–259 (1987)
17. Chen, W., Ding, W.: Scalar curvatures on $S^2$. Trans. AMS 303, 365–382 (1987)
18. Chen, W., Li, C.: Prescribing Gaussian curvature on surfaces with conical singularities. J. Geom. Anal. 1, 359–372 (1991)
19. Chen, W., Li, C.: Classification of solutions of some nonlinear elliptic equations. Duke Math. J. 63 (3), 615–622 (1991)
20. Chen, W., Li, C.: Qualitative properties of solutions to some nonlinear elliptic equations in $\mathbb{R}^2$. Duke Math. J. 71 (2), 427–439 (1993)
21. Ding, W., Jost, J., Li, J., Wang, G.: Existence results for mean field equations. Ann. Inst. H. Poincaré Anal. Nonlin. 16, 653–666 (1999)
22. Ding, W., Jost, J., Li, J., Wang, G.: The differential equation $\Delta u=8\pi-8\pi he^u$ on a compact Riemann surface. Asian J. Math. 1, 230–248 (1997)
23. Ding, W., Jost, J., Li, J., Wang, G.: An analysis of the two vortex case in the Chern–Simons–Higgs model. Calc. Var. and P.D.E. 7, 87–97 (1998)
24. Dunne, G.: Selfdual Chern–Simons Theories. Lecture Notes in Phys. m36. Berlin–New York: Springer-Verlag, 1995
25. Fontana, L.: Sharp borderline Sobolev inequalities on compact Riemannian Manifolds. Comment. Math. Helv. 68, 415–454 (1993)
26. Gilbarg, D., Trudinger, N.: Elliptic Partial Differential Equations of Second Order. New York: Springer-Verlag, 1977
27. 't Hooft, G.: A property of electric and magnetic flux in nonabelian gauge theories. Nucl. Phys. B153, 141–160 (1979)
28. Hong, J., Kim, Y., Pac, P.Y.: Multivortex solutions of the abelian Chern–Simons theory. Phys. Rev. Letter 64, 2230–2233 (1990)
29. Jaffe, A., Taubes, C.H.: Vortices and Monopoles. Boston: Birkhäuser (1980)
30. Jackiw, R., Weinberg, E.J.: Selfdual Chern–Simons vortices. Phys. Rev. Lett. 64, 2234–2237 (1990)
31. Kazdan, J., Warner, F.: Curvature functions for compact 2-manifolds. Ann. of Math. 99, 14–47 (1974)
32. Kiessling, M.H.K.: Statistical mechanics of classical particles with logarithmic interaction. Comm. Pure Appl. Math. 46, 27–56 (1993)
33. Lai, C.H. (ed.): Selected Papers on Gauge Theory of Weak and Electromagnetic Interactions. Singapore: World Scientific
34. Li, Y.Y.: Harnack Type inequality: The Method of Moving Planes. Commun. Math. Phys. 200, 421–444 (1999)
35. Li, Y.Y., Shafrir, I.: Blow-up analysis for Solutions of $-\Delta u=V(x)e^u$ in dimension two. Ind. Univ. Math. J. 43 (4), 1255–1270 (1994)
36. Lin, C.S.: Topological Degree for Mean Field Equations on $S^2$. Duke Math. J. 104 (3), 501–536 (2000)
37. Moser, J.: A sharp form of an inequality of N. Trudinger. Indiana Univ. Math. J. 20, 1077–1092 (1971)
38. Nirenberg, L.: Topics in Nonlinear Analysis. Courant Lecture Notes in Math. Providence, RI: AMS, 2001
39. Nolasco, M., Tarantello, G.: Double vortex condensates in the Chern–Simons–Higgs theory. Calc. of Var. and P.D.E. 9, 31–94 (1999)
40. Nolasco, M., Tarantello, G.: On a sharp Sobolev-type Inequality on two-dimensional compact manifold. Arch. Rational Mech. Anal. 145, 161–195 (1998)
41. Nolasco, M., Tarantello, G.: Vortex Condensates for the SU(3) Chern–Simons Theory. Commun. Math. Phys. 213 (3), 599–639 (2000)
42. Olesen, P.: Soliton condensation in some self-dual Chern–Simons theories. Phys. Lett. B265, 361–365 (1991)
43. Prajapat, J., Tarantello, G.: On a class of Elliptic problems in $\mathbb{R}^2$: Symmetry and Uniqueness Results. Proc. Royal Soc. of Edinb. 131 (4), 967–985 (2001)
44. Ricciardi, T., Tarantello, G.: Self-dual vortices in the Maxwell–Chern–Simons–Higgs theory. Comm. Pure Appl. Math. 53 (7), 811–851 (2000)
45. Spruck, J., Yang, Y.: Topological solutions in the self-dual Chern–Simons theory: existence and approximation. Ann. Inst. H. Poincaré Anal. Nonlin. 12, 75–97 (1995)
46. Spruck, J., Yang, Y.: On Multivortices in the Electroweak Theory I: Existence of Periodic Solutions. Commun. Math. Phys. 144, 1–16 (1992)
47. Spruck, J., Yang, Y.: On Multivortices in the Electroweak Theory II: Existence of Bogomol'nyi Solutions in $\mathbb{R}^2$. Commun. Math. Phys. 144, 215–234 (1992)
48. Struwe, M.: Variational Methods and Application to nonlinear partial differential equations and Hamiltonian systems. Berlin–Heidelberg: Springer-Verlag (1990)
49. Struwe, M., Tarantello, G.: On multivortex solutions in Chern–Simons gauge theory. Boll. U.M.I. 8, 109–121 (1998)
50. Tarantello, G.: Multiple condensate solutions for the Chern–Simons–Higgs theory. J. Math. Phys. 37, 3769–3796 (1996)
51. Taubes, C.H.: Arbitrary N-vortex solutions to the first order Ginzburg–Landau equation. Commun. Math. Phys. 72, 277–292 (1980)
52. Taubes, C.H.: On the equivalence of first order and second order equations for gauge theories. Commun. Math. Phys. 75, 207–227 (1980)
53. Wang, S., Yang, Y.: Abrikosov's Vortices in the critical coupling. SIAM J. Math. Anal. 23, 1125–1140 (1992)

Communicated by P. Constantin
Commun. Math. Phys. 229, 49 – 55 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0675-6
Communications in
Mathematical Physics
On the Spectrum of a Class of Second Order Periodic Elliptic Differential Operators
L. Friedlander
Department of Mathematics, University of Arizona, Tucson, AZ 85721, USA
Received: 19 December 2001 / Accepted: 10 January 2002
Published online: 24 July 2002 – © Springer-Verlag 2002
Abstract: Under an additional symmetry condition, we prove that the spectrum of a second order self-adjoint elliptic differential operator with periodic coefficients is purely absolutely continuous.
1. Introduction

Let
\[
L=-\sum_{p,l=1}^n\frac{\partial}{\partial x_p}\,g_{pl}(x)\,\frac{\partial}{\partial x_l}+\frac1i\sum_{l=1}^n\left(a_l(x)\frac{\partial}{\partial x_l}+\frac{\partial}{\partial x_l}\,a_l(x)\right)+V(x) \tag{1.1}
\]
be an elliptic second order differential operator in $\mathbb{R}^n$. We assume that $g_{pl}(x)=g_{lp}(x)$, $p,l=1,\dots,n$, and that all the functions $g_{pl}(x)$, $a_l(x)$, and $V(x)$ are smooth, real-valued, and $2\pi$-periodic in all variables. The differential expression (1.1) defines a self-adjoint operator in $L^2(\mathbb{R}^n)$. It is believed that its spectrum is always purely absolutely continuous. However, this theorem has not been proven yet. In this paper, we prove that the spectrum of $L$ is absolutely continuous under an additional symmetry assumption on $L$. Before we formulate our theorem, let us recall some previous results. In his celebrated paper, L. E. Thomas [Th] proved absolute continuity of the spectrum for a periodic Schrödinger operator. M. Sh. Birman and T. A. Suslina proved the theorem for a two-dimensional magnetic Schrödinger operator [BS], and A. Sobolev [S] proved it for a magnetic Schrödinger operator in higher dimensions. A. Morame proved in [M] the absence of singular spectrum for a two-dimensional periodic Schrödinger operator in the case of a non-constant metric (see also [KL]). There have been a number of recent publications on the subject; we are not going to review them here. If $n>2$, all previously known results deal essentially with the situations where the leading coefficients $g_{pl}(x)$ are constant.
Our additional assumption will be that the operator $L$ is invariant under the symmetry $x_1\to-x_1$. We will use the following notation. The indices that take values from 1 to $n$ will be denoted by Roman letters; the indices that take values from 2 to $n$ will be denoted by Greek letters. If $x=(x_1,x_2,\dots,x_n)$ then $x'=(x_2,\dots,x_n)$, so $x=(x_1,x')$. In terms of the coefficients of $L$, our symmetry assumption means that
\[
\begin{aligned}
g_{11}(-x_1,x')&=g_{11}(x_1,x'),&g_{\alpha\beta}(-x_1,x')&=g_{\alpha\beta}(x_1,x'),&g_{1\alpha}(-x_1,x')&=-g_{1\alpha}(x_1,x'),\\
a_\alpha(-x_1,x')&=a_\alpha(x_1,x'),&V(-x_1,x')&=V(x_1,x'),&a_1(-x_1,x')&=-a_1(x_1,x').
\end{aligned}\tag{1.2}
\]
Theorem. Assume that the operator $L$ given by (1.1) is elliptic, that the functions $g_{pl}(x)$, $a_l(x)$, and $V(x)$ are smooth, real-valued, $2\pi$-periodic in all variables, and that they satisfy (1.2). Then the spectrum of the operator $L$ in $L^2(\mathbb{R}^n)$ is purely absolutely continuous.

Let us recall some facts from Floquet's theory (e.g., see [Ku]). Let $k=(k_1,\dots,k_n)\in\mathbb{R}^n$. One introduces a family of operators
\[
L(k)=-\sum_{p,l=1}^n\left(\frac{\partial}{\partial x_p}+ik_p\right)g_{pl}(x)\left(\frac{\partial}{\partial x_l}+ik_l\right)+\frac1i\sum_{l=1}^n\left[a_l(x)\left(\frac{\partial}{\partial x_l}+ik_l\right)+\left(\frac{\partial}{\partial x_l}+ik_l\right)a_l(x)\right]+V(x). \tag{1.3}
\]
As a set, the $L^2$ spectrum of the operator $L$ is the union of periodic spectra of operators $L(k)$ over all $k\in\mathbb{R}^n$. (Actually, one can take $k\in[0,1)^n$.) It follows from a theorem of Thomas [Th, Ku] that the spectrum of $L$ is not purely absolutely continuous if, for some value of $\lambda$, the equation
\[
(L(k)-\lambda)u=0 \tag{1.4}
\]
has a non-trivial periodic solution for any choice of $k\in\mathbb{C}^n$. Let us emphasize that here the quasi-momentum $k$ is allowed to be complex-valued. We will assume that this is the case, and our assumption will eventually lead us to a contradiction. Because $L(k)-\lambda$ is an operator of the type (1.1), with $V(x)$ replaced by $V(x)-\lambda$, we can assume that $\lambda=0$. So, our assumption is
\[
\ker(L(k))\ne0,\qquad k\in\mathbb{C}^n. \tag{1.5}
\]
Here, $L(k)$ is considered as an operator acting on periodic functions. In Sect. 2 we exhibit our main construction, and in Sect. 3 we prove the theorem.

2. The Main Construction

First, we restrict ourselves to quasi-momenta $k=(k_1,0,\dots,0)$. With some abuse of notation, we will use $k$ for $k_1$. Then the problem of finding periodic solutions of the equation $L(k)u=0$ is equivalent to the problem of finding solutions of the equation $Lu=0$ that are periodic in the $x'$-variables and that satisfy the quasi-periodicity condition
\[
u(x_1+2\pi,x')=e^{2\pi ik}u(x_1,x'). \tag{2.1}
\]
Let $C=[-\pi,\pi]\times\mathbb{T}^{n-1}$ be a cylinder; here $\mathbb{T}^{n-1}$ is an $(n-1)$-dimensional torus. We denote $\zeta=\exp(2\pi ik)\ne0$. Then the above problem is equivalent to the boundary value problem
\[
Lu=0\ \text{ in }C,\qquad u(\pi,x')=\zeta u(-\pi,x'),\qquad\frac{\partial u}{\partial x_1}(\pi,x')=\zeta\,\frac{\partial u}{\partial x_1}(-\pi,x'). \tag{2.2}
\]
Let $\Gamma_\pm=\{\pm\pi\}\times\mathbb{T}^{n-1}$ be the top and the bottom of the cylinder $C$. The symmetry assumptions (1.2) imply that $g_{1\alpha}(\pm\pi,x')=0$, so the $x_1$ direction is normal to both the top and the bottom of the cylinder. It is convenient to take $\partial_\nu=\partial/\partial\nu=\pm g_{11}(\pm\pi,x')\,\partial/\partial x_1$ as a standard outward normal vector to $\Gamma_\pm$ at $(\pm\pi,x')$. With this convention, one has the standard Green formula
\[
(Lu,v)_C=(u,Lv)_C-(\partial_\nu u,v)_\Gamma+(u,\partial_\nu v)_\Gamma, \tag{2.3}
\]
where $\Gamma=\Gamma_+\cup\Gamma_-$, and $(\cdot,\cdot)$ is the usual $L^2$ scalar product. Notice that $a_1(\pm\pi,x')=0$ (see (1.2)), so there are no boundary terms that come from the first order terms in (1.1). We will make the reduction of problem (2.2) to the boundary. To make this reduction, we introduce the Dirichlet-to-Neumann operators. Let $\ker(L_D)$ be the space of solutions of the Dirichlet problem for the equation $Lu=0$ in $C$. This space is finite-dimensional. We introduce a space
\[
\mathcal{L}=\{\varphi(x')\in L^2(\mathbb{T}^{n-1}):\ \varphi(x')=\partial_\nu u(-\pi,x')\ \text{for some }u\in\ker(L_D)\}. \tag{2.4}
\]
Notice that, in this definition, one can replace $-\pi$ by $\pi$ because the operator $L$ is invariant under the reflection $x_1\to-x_1$. It is a standard fact from elliptic theory that the boundary value problem
\[
Lu=0\ \text{ in }C,\qquad u(-\pi,x')=\psi(x'),\quad u(\pi,x')=0 \tag{2.5}
\]
is solvable if and only if $\psi\perp\mathcal{L}$. If problem (2.5) is solvable then its solution is not unique, but one can find the unique solution that satisfies the additional constraint $\partial_\nu u(-\pi,x')\perp\mathcal{L}$. Such a solution will be denoted by $P\psi$. ($P$ stands for "Poisson operator.") For a function $u(x)$ in $C$, we define $j_\pm u$ to be its normal derivatives on $\Gamma_\pm$. Finally we define the Dirichlet-to-Neumann operators
\[
N_0\psi=j_-P\psi,\qquad N_1\psi=j_+P\psi,\qquad\psi\in\mathcal{L}^\perp. \tag{2.6}
\]
In words, one takes the solution $u(x)$ of (2.5) that satisfies the additional condition $\partial_\nu u(-\pi,x')\perp\mathcal{L}$; then $N_0\psi=\partial_\nu u(-\pi,x')$ and $N_1\psi=\partial_\nu u(\pi,x')$. Clearly, $N_0$ maps $\mathcal{L}^\perp$ into $\mathcal{L}^\perp$. Because the operator $L$ is invariant under the symmetry $x_1\to-x_1$, one can interchange $\Gamma_+$ and $\Gamma_-$. It means that if $u(x)$ is the solution of $Lu=0$ such that $u(-\pi,x')=0$, $u(\pi,x')=\psi$, and $\partial_\nu u(\pi,x')\in\mathcal{L}^\perp$, then $\partial_\nu u(\pi,x')=N_0\psi$ and $\partial_\nu u(-\pi,x')=N_1\psi$. This is actually the main reason why the symmetry assumption is helpful. It is known that $N_0$ is an elliptic pseudo-differential operator of order 1 whose principal symbol is positive; so the number of its non-positive eigenvalues is finite. The fact that it is defined not on the whole Sobolev space $H^1(\mathbb{T}^{n-1})$ but only on its subspace of finite codimension is not essential. The operator $N_1$ is a smoothing operator because the Schwartz kernel of the Poisson operator $P$ is smooth outside of $\Gamma_-$. To make the reduction of problem (2.2) to the boundary, we set $u(-\pi,x')=\psi(x')$ and solve the equation $Lu=0$ together with the first boundary condition in (2.2); then the second boundary condition will give us an equation for $\psi(x')$. We start from the following proposition.
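Proposition 1. Let $\zeta\ne\pm1$. Then the problem
\[
Lf=0\ \text{ in }C,\qquad f(-\pi,x')=\psi(x'),\qquad f(\pi,x')=\zeta\psi(x') \tag{2.7}
\]
is solvable if and only if $\psi(x')\in\mathcal{L}^\perp$.

Proof. Let
\[
\tilde{\mathcal{L}}=\left\{\begin{pmatrix}\partial_\nu u(-\pi,x')\\ \partial_\nu u(\pi,x')\end{pmatrix}:\ u(x)\in\ker(L_D)\right\}.
\]
Recall that $L_D$ is the operator in $C$ given by the differential expression (1.1), with the Dirichlet boundary conditions. The problem (2.7) is solvable if and only if
\[
\begin{pmatrix}\psi\\ \zeta\psi\end{pmatrix}\perp\tilde{\mathcal{L}}. \tag{2.8}
\]
The operator $L$ is invariant under the reflection $x_1\to-x_1$, so the kernel of $L_D$ splits into the direct sum of even solutions and odd solutions of $Lu=0$,
\[
\ker(L_D)=(\ker(L_D))^{ev}\oplus(\ker(L_D))^{odd}.
\]
This splitting gives rise to the splitting $\tilde{\mathcal{L}}=(\tilde{\mathcal{L}})^{ev}+(\tilde{\mathcal{L}})^{odd}$. Denote by $\mathcal{L}^{ev(odd)}$ the space of first components from $(\tilde{\mathcal{L}})^{ev(odd)}$. Then $\mathcal{L}=\mathcal{L}^{ev}+\mathcal{L}^{odd}$. Notice that
\[
\tilde{\mathcal{L}}^{ev}=\left\{\begin{pmatrix}\varphi^{ev}\\ \varphi^{ev}\end{pmatrix}:\ \varphi^{ev}\in\mathcal{L}^{ev}\right\}\quad\text{and}\quad\tilde{\mathcal{L}}^{odd}=\left\{\begin{pmatrix}\varphi^{odd}\\ -\varphi^{odd}\end{pmatrix}:\ \varphi^{odd}\in\mathcal{L}^{odd}\right\}.
\]
Now (2.8) holds if and only if $(1+\zeta)(\psi,\varphi^{ev})=0$ for every $\varphi^{ev}\in\mathcal{L}^{ev}$ and $(1-\zeta)(\psi,\varphi^{odd})=0$ for every $\varphi^{odd}\in\mathcal{L}^{odd}$. Our assumption $\zeta\ne\pm1$, together with $\mathcal{L}=\mathcal{L}^{ev}+\mathcal{L}^{odd}$, implies that this is equivalent to $\psi\perp\mathcal{L}$.

Let $\zeta\ne\pm1,0$. For $\psi\in\mathcal{L}^\perp$, the general solution of problem (2.7) is
\[
f(x)=(P\psi)(x_1,x')+\zeta(P\psi)(-x_1,x')+v(x),\qquad v\in\ker(L_D).
\]
Here, once again, we used the invariance of $L$ under the reflection $x_1\to-x_1$. The last boundary condition from (2.2) is equivalent to
\[
\frac{\partial f}{\partial\nu}(\pi,x')+\zeta\,\frac{\partial f}{\partial\nu}(-\pi,x')=0.
\]
In terms of the Dirichlet-to-Neumann operators, the last equality can be rewritten as
\[
2\zeta N_0\psi+(1+\zeta^2)N_1\psi+\frac{\partial v}{\partial\nu}(\pi,x')+\zeta\,\frac{\partial v}{\partial\nu}(-\pi,x')=0. \tag{2.9}
\]
In particular,
\[
2\zeta N_0\psi+(1+\zeta^2)N_1\psi\in\mathcal{L}. \tag{2.10}
\]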
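The step from the last boundary condition in (2.2) to (2.9) can be spelled out as follows. This is only a sketch: the abbreviations $u=P\psi$ and $\tilde u(x_1,x')=u(-x_1,x')$ are introduced here for the illustration, and $x'=(x_2,\dots,x_n)$ as in the Introduction.

```latex
% Sketch: deriving (2.9). Here u = P\psi and \tilde u(x_1,x') = u(-x_1,x') (notation for this sketch only).
% By the definition (2.6) of the Dirichlet-to-Neumann operators:
\[
\partial_\nu u(-\pi,x') = N_0\psi, \qquad \partial_\nu u(\pi,x') = N_1\psi .
\]
% The reflection x_1 -> -x_1 exchanges the two ends of the cylinder together with
% their outward normals, so for the reflected solution:
\[
\partial_\nu \tilde u(\pi,x') = N_0\psi, \qquad \partial_\nu \tilde u(-\pi,x') = N_1\psi .
\]
% With f = u + \zeta\tilde u + v, v in ker(L_D):
\begin{align*}
\partial_\nu f(\pi,x')  &= N_1\psi + \zeta N_0\psi + \partial_\nu v(\pi,x'), \\
\partial_\nu f(-\pi,x') &= N_0\psi + \zeta N_1\psi + \partial_\nu v(-\pi,x').
\end{align*}
% Adding the first line to zeta times the second:
\[
\partial_\nu f(\pi,x') + \zeta\,\partial_\nu f(-\pi,x')
 = 2\zeta N_0\psi + (1+\zeta^2) N_1\psi
 + \partial_\nu v(\pi,x') + \zeta\,\partial_\nu v(-\pi,x') .
\]
```

Since both normal derivatives of $v\in\ker(L_D)$ belong to the space defined in (2.4), setting the right-hand side to zero gives the inclusion (2.10).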
Proposition 2. Let $\zeta\ne\pm1,0$. The problem (2.2), with $u(-\pi,x')=\psi(x')$, is solvable if and only if $\psi\in\mathcal{L}^\perp$ and (2.10) holds.
Proof. The "only if" part has already been proven. Let us do the "if" part. Assume that (2.10) holds. Denote $-\varphi(x')=2\zeta N_0\psi+(1+\zeta^2)N_1\psi\in\mathcal{L}$. We decompose $\varphi$ as a sum $\varphi^{ev}+\varphi^{odd}$. (See the proof of Proposition 1.) Let $v^{ev(odd)}$ be the even (odd) solution of the Dirichlet problem for $Lv=0$ such that $\partial_\nu v^{ev(odd)}(-\pi,x')=\varphi^{ev(odd)}(x')$. Since $\partial_\nu v^{odd}(\pi,x')=-\varphi^{odd}(x')$, (2.9) is satisfied for the function
\[
v(x)=\frac{v^{ev}(x)}{1+\zeta}-\frac{v^{odd}(x)}{1-\zeta}.
\]
We conclude that assumption (1.5) implies that the inclusion (2.10) has a non-trivial solution $\psi\in\mathcal{L}^\perp$ for every $\zeta\ne\pm1,0$. In the next section we will show that this cannot happen.

3. Proof of the Theorem

Let $Q$ be the orthogonal projection onto the space $\mathcal{L}^\perp$, and let $z=(\zeta^2+1)/2\zeta$. Then (2.10) can be rewritten as
\[
N_0\psi+z\tilde N_1\psi=0,\qquad\psi\in\mathcal{L}^\perp, \tag{3.1}
\]
where $\tilde N_1=QN_1$. The assumption (1.5) implies that Eq. (3.1) has a non-trivial solution for every $z\ne\pm1$. First, we establish some simple properties of the operators $N_0$ and $\tilde N_1$.

Proposition 3. (i) Operators $N_0$ and $\tilde N_1$ are self-adjoint in $\mathcal{L}^\perp\subset L^2(\mathbb{T}^{n-1})$. (ii) $\ker(\tilde N_1)=\{0\}$.

Proof. Let $\psi_1,\psi_2\in\mathcal{L}^\perp$, and let $u_j(x_1,x')=P\psi_j(x')$, $j=1,2$. One applies Green's formula (2.3) to $u_1$ and $u_2$ to get
\[
(N_0\psi_1,\psi_2)=(\partial_\nu u_1,u_2)_{\Gamma_-}=(u_1,\partial_\nu u_2)_{\Gamma_-}=(\psi_1,N_0\psi_2).
\]
This means that the operator $N_0$ is symmetric and, if one takes $H^1(\mathbb{T}^{n-1})\cap\mathcal{L}^\perp$ as its domain, it becomes self-adjoint. Let $v(x)\in\ker(L_D)$ be a function such that
\[
\partial_\nu\left(u_2(-x_1,x')+v(x)\right)\in\mathcal{L}^\perp\quad\text{when }x_1=-\pi.
\]
Let $w(x)=u_2(-x_1,x')+v(x)$. The invariance of $L$ under the reflection $x_1\to-x_1$ implies $Lw=0$. In addition,
\[
\partial_\nu w(-\pi,x')=\tilde N_1\psi_2,\qquad w(\pi,x')=\psi_2(x').
\]
We apply Green's formula (2.3) to $u_1$ and $w$ to get
\[
(\tilde N_1\psi_1,\psi_2)=(\partial_\nu u_1,w)_{\Gamma_+}=(u_1,\partial_\nu w)_{\Gamma_-}=(\psi_1,\tilde N_1\psi_2).
\]
This equation shows that the operator $\tilde N_1$ is self-adjoint. (The operator $\tilde N_1$ is bounded, so one does not have to worry about its domain.)
Finally, suppose that $\tilde N_1\psi = 0$. Let $u(x) = P\psi$. One has $Lu = 0$, $u(-\pi, x') = \psi(x')$, $u(\pi, x') = 0$, and $\partial_\nu u(\pi, x') \in L$. Let $v(x) \in \ker(L_D)$ with $\partial_\nu v(\pi, x') = \partial_\nu u(\pi, x')$. Then the function $w = u - v$ is a solution of the equation $Lw = 0$, and, on the face $x_1 = \pi$, both $w$ and its normal derivative vanish. Therefore $w(x) = 0$, and $\psi(x') = u(-\pi, x') = v(-\pi, x') = 0$.

It follows from the theory of analytic families of operators (e.g., see [Ka]) that $\dim\ker(N_0 + z\tilde N_1) = \mathrm{const}$ for all complex numbers $z$ outside a discrete set $E \subset \mathbb{C}$. Our assumption that (3.1) has a non-trivial solution for all $z \neq \pm 1$ implies that this constant is positive. Moreover, the Riesz projections onto $\ker(N_0 + z\tilde N_1)$ depend on $z$ analytically in $\mathbb{C} \setminus E$. In particular, one can construct a family of functions $\psi(z)$, $z \in \mathbb{R} \setminus E$, such that $\|\psi(z)\| = 1$, $\psi(z)$ solves (3.1), and $\psi(z)$ is continuous in $z$. We restrict $z$ to the real axis as a matter of convenience. Let us show that
\[ (\tilde N_1\psi(z_1), \psi(z_2)) = 0 \tag{3.2} \]
for any $z_1, z_2 \in \mathbb{R} \setminus E$. If $z_1 \neq z_2$ then
\[ 0 = ((N_0 + z_1\tilde N_1)\psi(z_1), \psi(z_2)) = (\psi(z_1), (N_0 + z_2\tilde N_1)\psi(z_2)) + (z_1 - z_2)(\tilde N_1\psi(z_1), \psi(z_2)) = (z_1 - z_2)(\tilde N_1\psi(z_1), \psi(z_2)), \]
and (3.2) follows immediately. If $z_2 = z_1$ then we take the limit $z_2 \to z_1$ in (3.2). Equation (3.2) implies
\[ (N_0\psi(z_1), \psi(z_2)) = 0, \qquad z_1, z_2 \in \mathbb{R} \setminus E. \tag{3.3} \]
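The algebra behind (3.2) and (3.3) for $z_1 \neq z_2$ can be checked in a finite-dimensional toy model. The sketch below is only an illustration under stated assumptions: the $2 \times 2$ symmetric matrices `N0` and `N1` are hypothetical stand-ins for the operators $N_0$ and $\tilde N_1$, and the pencil is singular only at two isolated values of $z$ (so the limiting case $z_2 \to z_1$ of the argument is not reproduced).

```python
import math

def pencil(N0, N1, z):
    """Return the 2x2 matrix N0 + z*N1."""
    return [[N0[i][j] + z * N1[i][j] for j in range(2)] for i in range(2)]

def apply(A, v):
    """Matrix-vector product for a 2x2 matrix."""
    return [A[i][0] * v[0] + A[i][1] * v[1] for i in range(2)]

def inner(u, v):
    """Euclidean inner product on R^2."""
    return u[0] * v[0] + u[1] * v[1]

def singular_z(N0, N1):
    """Real roots of det(N0 + z*N1) = c2*z^2 + c1*z + c0."""
    c2 = N1[0][0] * N1[1][1] - N1[0][1] * N1[1][0]
    c1 = (N0[0][0] * N1[1][1] + N1[0][0] * N0[1][1]
          - N0[0][1] * N1[1][0] - N1[0][1] * N0[1][0])
    c0 = N0[0][0] * N0[1][1] - N0[0][1] * N0[1][0]
    disc = math.sqrt(c1 * c1 - 4 * c2 * c0)
    return [(-c1 - disc) / (2 * c2), (-c1 + disc) / (2 * c2)]

def unit_kernel_vector(A):
    """Unit vector in the kernel of a singular symmetric 2x2 matrix
    whose first row (a, b) is non-zero: take (-b, a) and normalise."""
    a, b = A[0]
    n = math.hypot(a, b)
    return [-b / n, a / n]

# Hypothetical symmetric matrices playing the roles of N_0 and N~_1.
N0 = [[1.0, 0.0], [0.0, -1.0]]
N1 = [[2.0, 1.0], [1.0, 2.0]]

z1, z2 = singular_z(N0, N1)
psi1 = unit_kernel_vector(pencil(N0, N1, z1))
psi2 = unit_kernel_vector(pencil(N0, N1, z2))

cross_N1 = inner(apply(N1, psi1), psi2)  # analogue of (3.2)
cross_N0 = inner(apply(N0, psi1), psi2)  # analogue of (3.3)
```

Both `cross_N1` and `cross_N0` vanish up to rounding, which is the symmetric-pencil identity used in the text for distinct parameters.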
Let $M$ be the linear span of all functions $\psi(z)$, $z \in \mathbb{R} \setminus E$. It follows from (3.3) that $(N_0\psi, \psi) = 0$ for every $\psi \in M$. Let us recall that $N_0$ is a self-adjoint, bounded from below operator with discrete spectrum. Therefore, $\dim M < \infty$. Now, let $z_j \in \mathbb{R} \setminus E$ be a sequence such that $z_j \to \infty$, and let $\psi_j = \psi(z_j)$. The functions $\psi_j$ lie on a finite-dimensional sphere in $L^\perp$, so one can assume that $\psi_j \to \psi$. Note that $\|\psi\| = 1$, so $\psi \neq 0$. One has
\[ \tilde N_1\psi_j = -\frac{1}{z_j} N_0\psi_j. \tag{3.4} \]
The operator $\tilde N_1$ is bounded, and the restriction of $N_0$ to $M$ is bounded. ($M$ is finite-dimensional!) By taking the limit $j \to \infty$ in (3.4), one gets $\tilde N_1\psi = 0$. This contradicts statement (ii) of Proposition 3.

References

[BS] Birman, M.Sh., Suslina, T.A.: Two-dimensional periodic magnetic Hamiltonian is absolutely continuous (in Russian). Algebra i Analiz 9, 32–48 (1997); translation in St. Petersburg Math. J. 9, 21–32 (1998)
[Ka] Kato, T.: Perturbation theory for linear operators. Berlin–Heidelberg–New York: Springer Verlag, 1966
[KL] Kuchment, P., Levendorskiî, S.: On the structure of spectra of periodic elliptic operators. To appear in Transactions of the AMS
[Ku] Kuchment, P.: Floquet Theory for Partial Differential Equations. Basel: Birkhäuser Verlag, 1993
[M] Morame, A.: Absence of singular spectrum for a perturbation of a two-dimensional Laplace–Beltrami operator with periodic electro-magnetic potential. J. Phys. A: Math. Gen. 31, 7593–7601 (1998)
[S] Sobolev, A.: Absolute continuity of the periodic magnetic Schrödinger operator. Inventiones Mathematicae 137, 85–119 (1999)
[Th] Thomas, L.E.: Time Dependent Approach to Scattering from Impurities in a Crystal. Commun. Math. Phys. 33, 335–343 (1973)

Communicated by P. Sarnak
Commun. Math. Phys. 229, 57 – 71 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0676-5
Communications in
Mathematical Physics
Central Limit Theorems and Invariance Principles for Time-One Maps of Hyperbolic Flows

Ian Melbourne$^1$, Andrei Török$^{2,3}$

1 Department of Mathematics and Statistics, University of Surrey, Guildford GU2 7XH, UK.
2 Department of Mathematics, University of Houston, Houston, TX 77204-3008, USA.
3 Institute of Mathematics of the Romanian Academy, P.O. Box 1–764, 70700 Bucharest, Romania
Received: 4 January 2002 / Accepted: 16 February 2002 Published online: 24 July 2002 – © Springer-Verlag 2002
Abstract: We give a general method for deducing statistical limit laws in situations where rapid decay of correlations has been established. As an application of this method, we obtain new results for time-one maps of hyperbolic flows. In particular, using recent results of Dolgopyat, we prove that many classical limit theorems of probability theory, such as the central limit theorem, the law of the iterated logarithm, and approximation by Brownian motion (almost sure invariance principle), are typically valid for such time-one maps. The central limit theorem for hyperbolic flows goes back to Ratner 1973 and is always valid, irrespective of mixing hypotheses. We give examples which demonstrate that the situation for time-one maps is more delicate than that for hyperbolic flows, illustrating the need for rapid mixing hypotheses.
1. Introduction

Let $\Lambda \subset M$ be a topologically mixing hyperbolic basic set for a smooth flow $T_t$ on a compact manifold $M$. Let $\mu$ denote an equilibrium measure supported on $\Lambda$, corresponding to a Hölder continuous potential [7]. In this paper, we are interested in proving statistical limit laws such as the central limit theorem for the time-one map $T = T_1$ of such a flow. We note that such limit laws are well-known for the hyperbolic flow itself. See Ratner [22] for the central limit theorem, Wong [28] for the law of the iterated logarithm, and Denker and Philipp [9] for the almost sure invariance principle. See also [18].

The validity of such results for time-one maps is considerably more delicate than that for flows. To see this, suppose that $X$ is a mixing hyperbolic basic set and $r : X \to \mathbb{R}$ is a Hölder roof function. Let $X_r$ denote the suspension of $X$ and consider the suspension flow $T_t : X_r \to X_r$. Suppose that $r$ is cohomologous to a rational constant (for example, take $r \equiv 1$). Then the time-one map $T = T_1$ is far from ergodic and the above statistical limit laws fail abjectly. Nevertheless, these results are valid for the flow [18]. In Sect. 4, we discuss the situation when $X_r$ is mixing but not rapidly mixing.

Dolgopyat [10] gave necessary and sufficient conditions for hyperbolic flows to exhibit rapid decay of correlations in the sense that for each $n \ge 1$, and all sufficiently regular observations $\phi, \psi : \Lambda \to \mathbb{R}$, there exists a constant $C(\phi, \psi, n)$ such that
\[ \left| \int \phi\,(\psi \circ T_t)\,d\mu - \int \phi\,d\mu \int \psi\,d\mu \right| \le C(\phi, \psi, n)/|t|^n, \tag{1} \]
for all $t \in \mathbb{R}$. Dolgopyat also proved that a sufficient condition for this result to hold is that there are periodic points $x_1, x_2 \in \Lambda$ with periods $P_1, P_2$ such that $P_1/P_2$ is Diophantine. Thus most hyperbolic flows are rapidly mixing (whereas previously Ruelle [24] and Pollicott [21] had proved the existence of mixing hyperbolic flows whose rates of mixing are arbitrarily slow).

An important feature of this theorem is that, for fixed $\phi$, condition (1) holds for a large class of “test functions” $\psi$. Indeed, as a first step, Dolgopyat proves this result for one-sided subshifts where $\psi$ is required only to be $L^\infty$ and $C(\phi, \psi, n) = D(\phi, n)|\psi|_\infty$. In this paper, we prove that a simple consequence of such an “$L^\infty$” rapid decay result is that any sufficiently regular mean zero observation $\phi$ is cohomologous in $L^p$ to a martingale for all $p \in [2, n)$. Here, $n > 4$ is sufficiently rapid decay for our purposes (and $n > 2$ suffices for the CLT). As a consequence of the martingale reduction, we derive several classical limit theorems, the most powerful being the almost sure invariance principle.

Theorem 1. Let $\Lambda \subset M$ be a topologically mixing hyperbolic basic set for a smooth flow $T_t$ with equilibrium measure $\mu$, corresponding to a Hölder continuous potential. Suppose that there are periodic points $x_1, x_2 \in \Lambda$ with periods $P_1, P_2$ such that $P_1/P_2$ is Diophantine. Let $\phi : M \to \mathbb{R}$ be sufficiently regular$^1$ with mean zero ($\int_\Lambda \phi\,d\mu = 0$) and $\int_0^t \phi \circ T_s\,ds$ unbounded. Then there is a Brownian motion $W$ with variance
\[ \sigma^2 = \lim_{N\to\infty} \frac{1}{N} \int_\Lambda \Bigl( \sum_{j=0}^{N-1} \phi \circ T_j \Bigr)^2 d\mu > 0, \]
and a sequence of random variables $\{S(N) : N \ge 1\}$, equal in distribution to the sequence $\{\sum_{j=0}^{N-1} \phi \circ T_j : N \ge 1\}$, such that for each $\delta > 0$,
\[ S([t]) = W(t) + O(t^{1/4+\delta}) \quad \text{as } t \to \infty, \]
almost surely.

Remark 1. The ASIP for flows (with $\sum_{j=0}^{N-1} \phi \circ T_j$ replaced by $\int_0^N \phi \circ T_t\,dt$) is an immediate consequence of the ASIP for time-one maps, since $\int_0^1 \phi \circ T_t\,dt$ satisfies the hypotheses of Theorem 1. As mentioned earlier, the ASIP for hyperbolic flows is valid even when mixing fails [9, 18].

1 It suffices that $\phi$ is $C^\infty$ in the flow direction, and that $\phi$ together with its time derivatives are Hölder continuous for some fixed Hölder exponent.
Consequences of the ASIP include the central limit theorem, the weak invariance principle and the law of the iterated logarithm, see [20, 12]. We note that Dolgopyat [11], using rather different methods, has proved a version of the above result for time-one maps of Anosov flows with jointly nonintegrable stable and unstable foliations.

Remark 2. The error term $O(t^{1/4+\delta})$ for all $\delta > 0$ improves the error term $O(t^{1/2-\alpha})$ for some $\alpha > 0$ which is more usual in the literature [9, 11, 20]. The improved error term is obtained also in [12, 18].

In Sect. 2, we prove a simple (but apparently novel) abstract result relating rapid mixing and approximation by a martingale. The central limit theorem and weak invariance principle for suspensions of one-sided subshifts of finite type are then an immediate consequence of Dolgopyat's rapid mixing theorem. In Sect. 3, we prove Theorem 1 by passing in the standard way from one-sided subshifts to two-sided subshifts [25, 6] and then from suspensions of two-sided subshifts to hyperbolic flows [5]. In Sect. 4, we consider the situation where the rapid mixing hypothesis is relaxed.

2. Decay of Correlations and Martingales

In this section we prove a simple result that derives statistical limit theorems such as the central limit theorem as a consequence of rapid decay of correlations.

Proposition 1. Let $(Y, m)$ be a probability space and $T : Y \to Y$ be a measure preserving transformation. Let $f \in L^\infty$. Suppose that there exists a constant $C > 0$ such that $\left|\int_Y f\,(g \circ T)\,dm\right| \le C|g|_\infty$ for all $g \in L^\infty$. Define $Ug = g \circ T$, so $U : L^p \to L^p$ is an isometry for all $1 \le p \le \infty$. Let $U^* : L^2 \to L^2$ be the $L^2$-adjoint of $U$. Then $U^*f \in L^\infty$ and $|U^*f|_p \le C^{1/p}|f|_\infty^{(p-1)/p}$ for all finite $p \ge 1$, and $|U^*f|_\infty \le |f|_\infty$.

Proof. By assumption, we have $\left|\int (U^*f)\,g\right| = \left|\int f\,Ug\right| \le C|g|_\infty$. By duality, $|U^*f|_1 \le C$. (Take $g = \mathrm{sgn}(U^*f)$.) Next we derive the $L^\infty$ estimate. Let $\epsilon > 0$ and suppose that $|U^*f| \ge |f|_\infty + \epsilon$ on a set $A$. Take $g = \chi_A\,\mathrm{sgn}(U^*f)$. Then
\[ m(A)\bigl[|f|_\infty + \epsilon\bigr] \le \left|\int (U^*f)\,g\right| \le \int |f\,Ug| = \int_{T^{-1}(A)} |f| \le m(T^{-1}(A))\,|f|_\infty = m(A)\,|f|_\infty, \]
so that $m(A) = 0$. Hence $|U^*f|_\infty \le |f|_\infty$. Finally, compute that
\[ |U^*f|_p^p = \int |U^*f|^{p-1}\,|U^*f| \le |U^*f|_\infty^{p-1}\,|U^*f|_1 \le |f|_\infty^{p-1}\,C. \]
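The last step of the proof is the elementary interpolation bound $|h|_p \le |h|_1^{1/p}\,|h|_\infty^{(p-1)/p}$ on a probability space. A minimal numerical sketch follows, assuming a finite probability space given by hypothetical lists `values` and `weights`; it only illustrates the inequality and plays no role in the argument.

```python
def lp_norm(values, weights, p):
    """L^p norm of a function on a finite probability space."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

def sup_norm(values, weights):
    """Essential supremum: largest |value| on points of positive mass."""
    return max(abs(v) for v, w in zip(values, weights) if w > 0)

def interpolation_bound(values, weights, p):
    """Right-hand side |h|_1^(1/p) * |h|_inf^((p-1)/p)."""
    l1 = lp_norm(values, weights, 1)
    linf = sup_norm(values, weights)
    return l1 ** (1.0 / p) * linf ** ((p - 1.0) / p)

# Hypothetical function h and probability weights (summing to 1).
values = [0.1, -2.0, 0.5, 0.0]
weights = [0.4, 0.1, 0.3, 0.2]

# True when the bound holds for a given exponent p.
checks = {p: lp_norm(values, weights, p) <= interpolation_bound(values, weights, p) + 1e-12
          for p in (1, 2, 4, 7.5)}
```

For $p = 1$ the two sides coincide, and for larger $p$ the bound is strict unless $|h|$ takes a single non-zero value.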
Lemma 1. Let $(Y, m)$ be a probability space and $T : Y \to Y$ be a measure preserving transformation. Define $U^* : L^2 \to L^2$ as in Proposition 1. Let $\phi : Y \to \mathbb{R}$ be in $L^\infty$ with $\int_Y \phi\,dm = 0$. Fix $n > 2$, and suppose that there is a constant $C$ (depending on $\phi$ and $n$) such that
\[ \left| \int_Y \phi\,(\psi \circ T^j)\,dm \right| \le \frac{C}{j^n}\,|\psi|_\infty, \tag{2} \]
for all $\psi \in L^\infty$ and $j \ge 1$. Then $\phi = \hat\phi + \chi \circ T - \chi$, where $\hat\phi$ and $\chi$ lie in $L^p$ for all $p < n$, and $U^*\hat\phi = 0$.

Proof. It follows from Proposition 1 that $(U^*)^j\phi \in L^\infty$, and that
\[ |(U^*)^j\phi|_p \le \frac{C^{1/p}}{j^{n/p}}\,|\phi|_\infty^{(p-1)/p}, \tag{3} \]
for all finite $p \ge 1$. If $p < n$, then $\sum_{j=1}^\infty (U^*)^j\phi$ converges absolutely in $L^p$. Define $\chi = \sum_{j=1}^\infty (U^*)^j\phi$ and $\hat\phi = \phi - U\chi + \chi$. Then $\chi$ and $\hat\phi$ lie in $L^p$. Moreover $U^*\hat\phi = 0$ (cf. Gordin [13]).

Remark 3. Assume that $\phi$ and $\hat\phi$ are as in Lemma 1. Define $\phi_N = \sum_{j=0}^{N-1} U^j\phi$ and define $\hat\phi_N$ similarly. Then $\phi_N = \hat\phi_N + \chi \circ T^N - \chi$. If $\chi \in L^2$, then $\chi \circ T^N = o(N^{1/2})$ almost everywhere by Birkhoff's ergodic theorem, hence $\phi_N = \hat\phi_N + o(N^{1/2})$ almost everywhere.

Theorem 2 (Central limit theorem (CLT)). Let $(Y, m)$ be a probability space and suppose that $T : Y \to Y$ is ergodic. Let $\phi : Y \to \mathbb{R}$ be in $L^\infty$ with $\int_Y \phi\,dm = 0$. Suppose that $\phi$ satisfies condition (2) for some $n > 2$ (and all $\psi \in L^\infty$, $j \ge 1$). Then $\frac{1}{\sqrt N}\sum_{j=0}^{N-1} \phi \circ T^j$ converges in distribution as $N \to \infty$ to a normal distribution with mean zero and variance $\sigma^2$ for some $\sigma \ge 0$. Moreover,
\[ \sigma^2 = \lim_{N\to\infty} \frac{1}{N} \int_Y \Bigl( \sum_{j=0}^{N-1} \phi \circ T^j \Bigr)^2 dm, \]
and $\sigma^2 = 0$ if and only if $\phi$ is an $L^p$-coboundary for all $p < n$.

Proof. Choose $n > p \ge 2$ in Lemma 1 and Remark 3. Then $\phi_N = \hat\phi_N + o(N^{1/2})$ so it suffices to prove the CLT with $\phi$ replaced by $\hat\phi$. Passing to the natural extension [23], we obtain a biinfinite stationary ergodic martingale $\{X_j : j \in \mathbb{Z}\}$, where $X_{-j} = \hat\phi \circ T^j$ for $j \ge 0$ (cf. [12, Remark 3.12]). Hence it follows from Billingsley [1] that $\frac{1}{\sqrt N}\sum_{j=0}^{N-1} X_j$ converges to a normal distribution with mean zero and variance $\int X_1^2$ as $N \to \pm\infty$. In particular, $\frac{1}{\sqrt N}\sum_{j=0}^{N-1} \hat\phi \circ T^j$ converges to a normal distribution with mean zero and variance $\sigma^2 = \int_Y \hat\phi^2\,dm$. Moreover, the variance is zero if and only if $\hat\phi = 0$, which means that $\phi = \chi \circ T - \chi$ is an $L^p$-coboundary.

Finally, we verify the formula for $\sigma^2$ in the last statement of the theorem. First note that $\sigma^2 = \int_Y \hat\phi^2\,dm = \frac{1}{N}\int_Y \hat\phi_N^2\,dm$. That is, $\sigma = \frac{1}{\sqrt N}|\hat\phi_N|_2$. Writing $\phi_N = \hat\phi_N + \chi \circ T^N - \chi$, we compute that $|\phi_N|_2 \le |\hat\phi_N|_2 + 2|\chi|_2$ so that $\limsup_{N\to\infty} \frac{1}{\sqrt N}|\phi_N|_2 \le \sigma$. Similarly, $\liminf_{N\to\infty} \frac{1}{\sqrt N}|\phi_N|_2 \ge \sigma$.
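The decomposition of Lemma 1 and the variance formula of Theorem 2 can be made concrete in a simple model that is not taken from the text: the doubling map $T(x) = 2x \bmod 1$ on the circle with Lebesgue measure, acting on trigonometric polynomials. In this assumed model $U e_k = e_{2k}$ and $U^* e_k = e_{k/2}$ for even $k$ (and $0$ for odd $k$), where $e_k(x) = e^{2\pi i k x}$, so the series for $\chi$ is a finite sum. The dictionary representation and all function names below are hypothetical.

```python
from fractions import Fraction

def U(f):
    """Koopman operator of the doubling map: e_k -> e_{2k}."""
    return {2 * k: c for k, c in f.items()}

def U_star(f):
    """L^2-adjoint of U: e_k -> e_{k/2} for even k, 0 for odd k."""
    return {k // 2: c for k, c in f.items() if k % 2 == 0}

def add(f, g, sign=1):
    """Return f + sign*g, dropping zero coefficients."""
    h = dict(f)
    for k, c in g.items():
        h[k] = h.get(k, 0) + sign * c
    return {k: c for k, c in h.items() if c != 0}

def norm_sq(f):
    """Squared L^2 norm via Parseval."""
    return sum(abs(c) ** 2 for c in f.values())

def martingale_decomposition(phi):
    """chi = sum_{j>=1} (U*)^j phi and phi_hat = phi - U chi + chi."""
    if 0 in phi:
        raise ValueError("phi must have mean zero")
    chi, term = {}, U_star(phi)
    while term:  # terminates: every non-zero frequency becomes odd
        chi = add(chi, term)
        term = U_star(term)
    phi_hat = add(add(phi, U(chi), -1), chi)
    return phi_hat, chi

def birkhoff_sum(phi, N):
    """phi_N = sum_{j=0}^{N-1} U^j phi."""
    total, term = {}, dict(phi)
    for _ in range(N):
        total = add(total, term)
        term = U(term)
    return total

half = Fraction(1, 2)
phi = {4: half, -4: half}                 # cos(8*pi*x)
phi_hat, chi = martingale_decomposition(phi)
variance = norm_sq(birkhoff_sum(phi, 50)) / 50

psi = {1: half, -1: half}                 # cos(2*pi*x)
cobound = add(U(psi), psi, -1)            # psi o T - psi
cobound_hat, _ = martingale_decomposition(cobound)
```

Here `phi_hat` is $\cos(2\pi x)$, which is annihilated by `U_star`, and `variance` equals `norm_sq(phi_hat)`, in line with $\sigma^2 = \int \hat\phi^2$. For the coboundary `cobound` the martingale part `cobound_hat` is zero, matching the degenerate case $\sigma^2 = 0$.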
Remark 4. Suppose that $T_t : Y \to Y$ is a semiflow and that the time-one map $T = T_1$ is ergodic and satisfies the rapid decay condition (2) for some $n > 2$. Then the conclusion of Theorem 2 is valid for the time-one map $T$. Moreover, replacing $\phi$ by $\int_0^1 \phi \circ T_t\,dt$, we conclude that $\frac{1}{\sqrt T}\int_0^T \phi \circ T_t\,dt$ converges in distribution as $T \to \infty$ to a normal distribution with mean zero and variance $\sigma^2$ for some $\sigma \ge 0$, and $\sigma^2 = 0$ if and only if $\int_0^1 \phi \circ T_t\,dt$ is an $L^p$-coboundary.

Remark 5. Under the hypotheses of Theorem 2 (or Remark 4), the weak invariance principle (WIP) (otherwise known as the functional central limit theorem) follows by [2]. The use of martingale approximations to prove the CLT and WIP for dynamical systems is standard since Gordin [13]. In certain situations (see [8, 12] and Sect. 3 of the present paper) martingale approximation leads to the almost sure invariance principle and hence the law of the iterated logarithm. However, this step relies on the class of dynamical systems under consideration being closed under time-reversal (see [8, Remarques 2.7(1)] and [12, Remark 6.5]).

Remark 6. The key hypothesis in Theorem 2 is that for the fixed mean zero observation $\phi$, the correlation function $\int_Y \phi\,(\psi \circ T^j)\,dm$ decays rapidly for all $\psi \in L^\infty$. Such a hypothesis cannot hold for an invertible mapping $T$, since the operator $U^*$ appearing in the proof of the theorem would be a unitary operator and so could not be strictly contractive. However, this hypothesis is often satisfied when $T$ is noninvertible. We note that Theorem 2 is both more restricted and more general than a related result of Liverani [15]. Liverani requires only that $\int_Y \phi\,(\phi \circ T^j)\,dm$ decays rapidly (so $\psi = \phi$), and $n > 1$ is sufficiently rapid decay. However, Liverani requires an additional a priori estimate on the contractivity of the transfer operator.

Application to suspensions of one-sided subshifts of finite type.

We recall the notion of a symbolic (semi)-flow [5, 19]. Suppose that $\sigma : X^+ \to X^+$ is an aperiodic one-sided subshift of finite type. Fix $\theta \in (0, 1)$. Define the metric $d_\theta(x, y) = \theta^N$, where $N$ is the largest positive integer such that $x_i = y_i$ for all $i < N$. Define the Hölder space $F_\theta(X^+)$ consisting of continuous functions $v : X^+ \to \mathbb{R}$ that are Lipschitz with respect to this metric, with Lipschitz constant $|v|_\theta$.
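As a small illustration of the metric $d_\theta$, the sketch below computes it on finite prefixes of one-sided sequences. Working with finite tuples is an assumption made for the example (two prefixes that agree entirely are treated as agreeing up to their common length), and the names `agreement_length`, `d_theta` and `shift` are hypothetical.

```python
def agreement_length(x, y):
    """Largest N such that x_i == y_i for all i < N (on the given prefixes)."""
    n = 0
    for a, b in zip(x, y):
        if a != b:
            break
        n += 1
    return n

def d_theta(x, y, theta):
    """d_theta(x, y) = theta ** N with N the agreement length."""
    return theta ** agreement_length(x, y)

def shift(x):
    """One-sided shift sigma: drop the first symbol."""
    return x[1:]

theta = 0.5
x = (0, 1, 1, 0, 1)
y = (0, 1, 1, 1, 1)
z = (0, 1, 0, 1, 1)

dist_xy = d_theta(x, y, theta)                      # sequences agree on 3 symbols
dist_shifted = d_theta(shift(x), shift(y), theta)   # one fewer common symbol
```

Applying the shift to two sequences with the same first symbol multiplies their distance by $\theta^{-1}$, which is the expansion used repeatedly in the estimates for Hölder roof functions.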
Let $\mu$ be an equilibrium measure on $X^+$ corresponding to a Hölder potential in $F_\theta(X^+)$. Let $r \in F_\theta(X^+)$ be a strictly positive roof function, and define the suspension $X_r^+ = \{(x, s) \in X^+ \times \mathbb{R} : 0 \le s \le r(x)\}/\sim$, where $(x, r(x)) \sim (\sigma x, 0)$. The suspension (semi)-flow is given by $T_t(x, s) = (x, s + t)$ and the invariant measure $\mu_r = \mu \times \ell / \int r\,d\mu$ is an equilibrium measure for the flow, where $\ell$ is Lebesgue measure on $\mathbb{R}$.

Define the space $F_\theta(X_r^+)$ consisting of continuous functions $\phi : X_r^+ \to \mathbb{R}$ that are Lipschitz with respect to the metric $d_\theta(x, x') + |s - s'|$ on $X^+ \times \mathbb{R}$ restricted to $\{(x, s) \in X^+ \times \mathbb{R} : 0 \le s \le r(x)\}$. Note that the functions in $F_\theta(X_r^+)$ are continuous along the flow direction. Let $F_{k,\theta}(X_r^+)$ consist of functions $\phi$ that are $C^k$ in the flow direction such that $\partial_t^j\phi \in F_\theta(X_r^+)$ for $j = 0, 1, \ldots, k$, and let $|\phi|_{k,\theta}$ denote the maximum of the Lipschitz constants corresponding to $\partial_t^j\phi$.

Theorem 3. Let $X_r^+$ be a Hölder suspension of an aperiodic one-sided subshift of finite type, with Hölder equilibrium measure $\mu$. Suppose that there are periodic points $y_1, y_2 \in X_r^+$ with periods $P_1, P_2$ such that $P_1/P_2$ is Diophantine. Then there is an integer $k \ge 1$ such that the CLT and WIP (for the time-one map as well as the flow) hold for all observations $\phi \in F_{k,\theta}(X_r^+)$ with $\int_{X_r^+} \phi\,d\mu_r = 0$.
Proof. Under the Diophantine hypothesis, Dolgopyat [10] proved that for any $n \ge 1$, there exists an integer $k(n) \ge 1$ and a constant $C(n) > 0$ such that if $\phi \in F_{k(n),\theta}(X_r^+)$ and $\psi \in L^\infty(X_r^+)$, then
\[ \left| \int \phi\,(\psi \circ T_t)\,d\mu_r - \int \phi\,d\mu_r \int \psi\,d\mu_r \right| \le C(n)\,|\phi|_{k(n),\theta}\,|\psi|_\infty / t^n, \tag{4} \]
for all $t > 0$. Take $n > 2$ in (4), and apply Theorem 2 and Remark 5.

If the variance $\sigma^2$ vanishes (in Theorem 2), then the CLT and WIP (for $\frac{1}{\sqrt N}\sum_{j=0}^{N-1} \phi \circ T^j$) are said to be degenerate. We want conditions that exclude this possibility. Similarly, in the CLT and WIP for $\frac{1}{\sqrt T}\int_0^T \phi \circ T_t\,dt$ (Remark 4) we wish to rule out the possibility that the corresponding variance vanishes. The next result shows that these situations are highly unlikely in the hyperbolic case.

Proposition 2. Assume the set up of Theorem 3. The following are equivalent.
(a) The variance $\sigma^2$ for the flow (Remark 4) vanishes,
(b) $\int_0^T \phi(T_s y)\,ds = 0$ whenever $y$ is a periodic point of period $T$,
(c) There is a Hölder $g : X_r^+ \to \mathbb{R}$ such that $\int_0^t \phi \circ T_s\,ds = g - g \circ T_t$ for all $t$, and
(d) $\int_0^T \phi(T_s y)\,ds$ is uniformly bounded (in $T > 0$ and $y \in X_r^+$).
If the variance $\sigma^2$ for the time-one map (Theorem 2) vanishes, then conditions (a)–(d) hold.

Proof. The equivalence of (b) and (c) is the Livšic periodic point theorem [16] and [14, Theorem 19.2.4]. It is clear that (c) implies (d). If (d) is valid, then the CLT is degenerate, so (d) implies (a).

If (a) is valid, then by Theorem 2, $\psi = \chi - \chi \circ T_1$ almost everywhere, where $\chi \in L^p$ ($2 \le p < n$) and $\psi = \int_0^1 \phi \circ T_u\,du$. Define $F_t = \int_0^t \psi \circ T_s\,ds$ and $h = \int_0^1 \chi \circ T_s\,ds$. Then $F_t : X_r^+ \to \mathbb{R}$ is a continuous (even Lipschitz) cocycle and $h \in L^p(X_r^+)$. Moreover, $F_t = h - h \circ T_t$ so $F$ is an $L^p$ coboundary. The Livšic regularity theorem for hyperbolic flows [17, 27] guarantees that $h$ has a Hölder continuous version. Now suppose that $y$ is a periodic point of period $T$ and compute that
\[ \int_0^T \phi(T_s y)\,ds = \int_0^1 \Bigl( \int_0^T \phi(T_{s+u} y)\,ds \Bigr) du = F_T(y) = 0, \]
proving (b).

Finally, it is immediate from Theorem 2 and Remark 4 that vanishing of the variance for the time-one map implies vanishing of the variance for the flow.

Remark 7. Ratner [22] proved the CLT for hyperbolic flows and showed that $\sigma^2 = 0$ if and only if $\phi$ is an $L^2$-coboundary (in some sense). However, verifiable criteria for nondegeneracy were first given by [18] who proved the equivalence of (a) and (d) (without requiring rapid mixing).
3. Almost Sure Invariance Principle for Hyperbolic Flows

In this section, we prove Theorem 1. The proof consists of three ingredients:
(a) Reduction to a suspended flow over a two-sided subshift of finite type, using the symbolic dynamics of Bowen [4, 5].
(b) Reduction to the situation where the roof function defining the suspension and the observation $\phi$ depend only on future coordinates (following [25, 6]).
(c) Application of the martingale approximation of Sect. 2 and standard techniques from probability theory (cf. Conze and le Borgne [8] and Field et al. [12]).

3.1. Reduction to a suspended subshift. This step is by now completely standard [4, 5, 7] and we omit the details. After the reduction, we have a flow on the suspension $X_r$ of an aperiodic two-sided subshift of finite type $\sigma : X \to X$. Here, the roof function $r \in F_\theta(X)$ is strictly positive and the suspension is defined to be $X_r = \{(x, s) \in X \times \mathbb{R} : 0 \le s \le r(x)\}/\sim$, where $(x, r(x)) \sim (\sigma x, 0)$. The suspension flow $T_t(x, s) = (x, s + t)$ is weak mixing with respect to an equilibrium measure $\mu_r = \mu \times \ell / \int r\,d\mu$, where $\mu$ is an equilibrium measure on $X$ corresponding to a Hölder potential. The reduced observation $\phi$ lies in $F_{k,\theta}(X_r)$ and has mean zero. (The spaces $F_\theta(X)$ and $F_{k,\theta}(X_r)$ for the two-sided shift are defined analogously to the one-sided case.)

3.2. Reduction to future coordinates. By [25, 6], $r$ is cohomologous to a roof function $r' \in F_{\theta^{1/2}}(X)$ that depends only on future coordinates, and the suspension flows on $X_r$ and $X_{r'}$ are topologically conjugate. Unfortunately, $r'$ is not strictly positive which introduces a number of technical difficulties. (In particular, it is not clear how to define $F_\theta(X_{r'})$.) To circumvent these difficulties, define $r'_n = \sum_{j=0}^{n-1} r' \circ \sigma^j$. There exists an integer $m \ge 1$ such that $r'_m$ is strictly positive, and it is possible to pass from observations in $F_{k,\theta}(X_r)$ to observations in $F_{k,\theta}(X_{r_m})$ and then to $F_{k,\theta^{1/2}}(X_{r'_m})$ (cf. [10, 21]). We omit the tedious details.

The upshot of the discussion above is that without loss of generality we may suppose from the outset that $r \in F_\theta(X)$ depends only on future coordinates. Suppose that $\phi \in F_{k+1,\theta}(X_r)$. A generalization of the argument of [25, 6] shows that there is a constant $q$ (depending only on $X_r$ and $\theta$) such that $\phi$ is cohomologous in $F_{k,\theta^{1/q}}(X_r)$ to an element $\psi \in F_{k,\theta^{1/q}}(X_r)$ depending only on future coordinates. Since we could not find this fact mentioned even implicitly in the literature, we give the proof in detail in the appendix (Theorem 5). This completes Step (b), and we may suppose without loss that $r$ and $\phi$ depend only on future coordinates.

3.3. Martingale approximation. This step is almost identical to that in [12] and we only sketch the details. Since the class of hyperbolic sets for smooth flows is closed under time-reversal, it is sufficient to prove the ASIP in reverse time. Hence we consider reverse partial sums $\phi_{-N} = \sum_{j=0}^{N-1} \phi \circ T_{-j}$. By Lemma 1 (with $n > 4$) and Dolgopyat's results (4), $\phi = \psi + \chi - \chi \circ T_1$, where $\psi, \chi \in L^4$, $\psi$ depends only on future coordinates, and $U^*\psi = 0$. Here, $U^*$ is the adjoint of the (noninvertible) isometry $U : L^2(X_r^+) \to L^2(X_r^+)$ induced by $T_1$. As in Remark 3, $\phi_{-N} = \psi_{-N} + o(N^{1/4})$, hence it suffices to prove the ASIP for $\psi$.
Since $\psi$ and $T_1$ depend only on future coordinates, the condition $U^*\psi = 0$ guarantees that the sequence $\{\psi_{-N}, N \in \mathbb{Z}\}$ is a martingale (with respect to the sequence of $\sigma$-algebras $T_N(\mathcal{M}^+)$, where $\mathcal{M}^+$ is the $\sigma$-algebra on $X_r^+$ lifted up to $X_r$). We now apply the method of Strassen [26]. The version stated in [12, Theorem B.3] is sufficient for our purposes. (Hypothesis (a) in [12] is automatically valid since $\psi$ lies in $L^4$ and the sequence $\psi \circ T_{-j}$ is stationary. Hypothesis (b) follows as in [12] from the strong law of large numbers for martingales since the partial sums of squares also admit a martingale approximation.)

4. Counterexamples for Nonrapid Mixing Time-One Maps

Let $X_r$ be the suspension by a Hölder roof function $r$ of a hyperbolic basic set $X$. In the introduction, we mentioned that the hyperbolic flow $T_t : X_r \to X_r$ enjoys statistical properties such as the ASIP, without requiring even ergodicity for the time-one map $T_1 : X_r \to X_r$. This illustrates the fact that establishing statistical properties is more delicate for time-one maps than for the flow.

It is natural to ask whether weak mixing is a sufficient condition for the ASIP to hold for the time-one map. We strongly conjecture that the answer is negative and that it is necessary to impose rapid mixing hypotheses as in this paper. In this section, we show that certain aspects of Theorem 1 break down when $X_r$ is weak mixing but not rapidly mixing. (The example below is also an alternative counterexample to rapid mixing of hyperbolic flows, cf. [21, 24].)

We give an example of a suspension $X_r$, and a mean zero observation $\phi : X_r \to \mathbb{R}$, satisfying the following properties:
(i) $X$ is a (one-sided) subshift of finite type on two symbols,
(ii) The suspension flow $T_t : X_r \to X_r$ is weak mixing,
(iii) The roof function $r$ is Hölder continuous,
(iv) The observation $\phi$ is Hölder continuous, $C^\infty$ in the flow direction, and the derivatives in the flow direction are Hölder continuous (with respect to a fixed Hölder exponent),
and yet
(v) $\lim_{N\to\infty} \frac{1}{N}\int_{X_r} \phi_N^2\,d\mu_r$ does not exist.

As usual, $\phi_N = \sum_{j=0}^{N-1} \phi \circ T^j$, where $T = T_1$ is the time-one map. In fact, we prove the following result.

Theorem 4. The suspension $X_r$ can be constructed so that conditions (i)–(iv) are satisfied, and for any $\epsilon > 0$,
\[ \limsup_{N\to\infty} \frac{1}{N^{2-\epsilon}} \int_{X_r} \phi_N^2\,d\mu_r = \infty. \]

Construction of $X_r$ and $\phi$. Let $b : [0, 1] \to \mathbb{R}$ be a $C^\infty$ function supported inside $(0, 1)$ (in $[1/4, 3/4]$ say), satisfying $\int_0^1 b(s)\,ds = 0$. We extend $b$ to a smooth 1-periodic function on $\mathbb{R}$. Then
\[ b(s) = \sum_{k\in\mathbb{Z}} b_k e^{2\pi i k s}, \tag{5} \]
where $b_0 = 0$ and $b_{-k} = \overline{b_k}$. More importantly, $b_k \to 0$ as $|k| \to \infty$ and $b_k \neq 0$ infinitely often. Choose an irrational number $\alpha \in (1, 2)$ such that the equation
\[ |k\alpha - p| < |b_k| \tag{6} \]
has infinitely many solutions $k \in \mathbb{Z}$, $p \in \mathbb{Z}$. (The set of such $\alpha$ is a dense $G_\delta$ in $\mathbb{R}$.)

Let $A = \begin{pmatrix} 0 & 1 \\ 1 & 1 \end{pmatrix}$ and let $X = X_A$ denote the corresponding subshift of finite type. Write $X = C_0 \cup C_1$, where $C_m$ consists of symbol sequences starting with an $m$. Define the roof function $r : X \to \mathbb{R}$ by $r|_{C_0} \equiv 1$ and $r|_{C_1} \equiv \alpha$. Let $T_t : X_r \to X_r$ be the corresponding suspension flow. Since $\alpha$ is irrational, $T_t$ is mixing. Consider the observation $\phi : X_r \to \mathbb{R}$ given by $\phi(x, s) = b(s)$ on $C_0 \times [0, 1]$ and $\phi(x, s) = 0$ on $C_1 \times [0, \alpha]$. Evidently, $X_r$ and $\phi$ satisfy conditions (i)–(iv) listed above.

Since $\alpha > 1$, and two consecutive 0's in the symbol sequence of $x$ are forbidden, we have the following result.

Proposition 3. For any $x \in X$, $t > 0$, the set $T_t(\{x\} \times [0, 1]) \cap C_0 \times [0, 1]$ is connected.

To prove Theorem 4, it suffices to show that
\[ \limsup_{N\to\infty} \frac{1}{N^{2-\epsilon}} \int_{C_0}\int_0^1 \phi_N^2\,ds\,d\mu = \infty. \]
We compute that $T^j(x, s) = (\sigma^n x, s + j - r_n(x))$, where $n = n(x, s, j)$ is such that $r_n(x) \le s + j < r_{n+1}(x)$. Note that $r_n(x) = n_1 + n_2\alpha$, where $n_1 + n_2 = n$. Also, $n \le j$ (since $\alpha > 1$). In particular, $\phi \circ T^j(x, s) = b(s - n_2\alpha)$ or $\phi \circ T^j(x, s) = 0$, and we can write
\[ \phi \circ T^j(x, s) = I_{C_0}(\sigma^{n(x,s,j)}x)\; b(s - n_2(x, s, j)\alpha). \]
By Proposition 3, for each $x \in C_0$ and $j \ge 0$, there exist integers $n = n(x, j)$ and $n_2 = n_2(x, j)$ such that for each $s \in [0, 1]$ either $n(x, s, j) = n$, $n_2(x, s, j) = n_2$, or $T^j(x, s) \in C_1 \times [0, \alpha]$. If $T^j(x, s) \in C_1 \times [0, \alpha]$ for all $s \in [0, 1]$, we simply choose $n(x, j)$ to be any $n$ for which $\sigma^n x \in C_1$. This means that we can write
\[ \phi \circ T^j(x, s) = I_{C_0}(\sigma^{n(x,j)}x) \sum_k b_k e^{2\pi i k s} e^{-2\pi i k n_2(x,j)\alpha}, \]
for all $(x, s) \in C_0 \times [0, 1]$, $j \ge 0$. Hence
\[ \phi_N(x, s) = \sum_k b_{k,N}(x)\,e^{2\pi i k s}, \qquad \text{where } b_{k,N}(x) = b_k \sum_{j=0}^{N-1} I_{C_0}(\sigma^{n(x,j)}x)\,e^{-2\pi i k n_2(x,j)\alpha}. \]
By Parseval's identity, we have
\[ \int_{C_0}\int_0^1 \phi_N^2\,ds\,d\mu = \sum_k \int_{C_0} |b_{k,N}|^2\,d\mu. \]
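The bookkeeping behind $n(x, s, j)$ and $n_2(x, s, j)$ can be sketched in a few lines. The code below is a minimal illustration, assuming a finite prefix of an admissible sequence (long enough for the requested iterate) and the hypothetical choice $\alpha = \sqrt 2$, which is merely some irrational in $(1, 2)$ and is not claimed to satisfy (6). The function names are likewise hypothetical.

```python
import math

ALPHA = math.sqrt(2)  # hypothetical irrational in (1, 2); not claimed to satisfy (6)

def is_admissible(x):
    """A = [[0, 1], [1, 1]] forbids two consecutive 0's."""
    return all(not (a == 0 and b == 0) for a, b in zip(x, x[1:]))

def roof(symbol, alpha=ALPHA):
    """r = 1 on C_0 and r = alpha on C_1."""
    return 1.0 if symbol == 0 else alpha

def time_one_iterate(x, s, j, alpha=ALPHA):
    """Return (n, n2, height) with T^j(x, s) = (sigma^n x, height),
    where r_n(x) <= s + j < r_{n+1}(x) and n2 counts the 1's among
    x_0, ..., x_{n-1}, so that r_n(x) = (n - n2) + n2 * alpha."""
    target = s + j
    r_n, n, n2 = 0.0, 0, 0
    while r_n + roof(x[n], alpha) <= target:
        r_n += roof(x[n], alpha)
        n2 += x[n]
        n += 1
    return n, n2, target - r_n

x = (0, 1, 1, 0, 1, 1, 0, 1, 0, 1)   # admissible prefix starting in C_0
s = 0.25

n3, n2_3, h3 = time_one_iterate(x, s, 3)   # lands over C_1 (x[n3] == 1)
n4, n2_4, h4 = time_one_iterate(x, s, 4)   # lands over C_0 (x[n4] == 0)
```

When the iterate lands over $C_0$, the height differs from $s - n_2\alpha$ by the integer $j - n_1$, so the 1-periodic function $b$ takes the value $b(s - n_2\alpha)$ there, as in the formula for $\phi \circ T^j$.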
The next step is to estimate $b_{k,N}$ for suitable choices of $k$ and $N$. By definition of $b$ and $\alpha$, we can choose $k \in \mathbb{Z}$ and $N \ge 1$ arbitrarily large such that $b_k \sim N^{-\epsilon/3}$ and $|k\alpha - p| < |b_k|$ for some $p \in \mathbb{Z}$. Then
\[ |e^{-2\pi i k n_2(x,j)\alpha} - 1| = |e^{-2\pi i k n_2(x,j)\alpha} - e^{-2\pi i n_2(x,j)p}| \le 4\pi n_2(x,j)\,|k\alpha - p| \le 4\pi N|b_k| \sim 4\pi|b_k|^{1-\epsilon/3} < 1/2. \]
Hence for such $k$, $N$ large enough
\[ |b_{k,N}(x)| \ge M(x, N)\,|b_k|\,(1 - 4\pi|b_k|^{1-\epsilon/3}) \ge M(x, N)\,|b_k|/2, \qquad \text{where } M(x, N) = \sum_{j=0}^{N-1} I_{C_0}(\sigma^{n(x,j)}x). \]

Proposition 4. Let $K = \mu_r(C_0 \times [0, 1])$ and $\bar r = \int_X r\,d\mu$. Then
\[ \liminf_{N\to\infty} \int_{C_0} M(x, N)^2/N^2\,d\mu \ge K^3\bar r. \]

Proof. Let $M_0(x, s, N) = \sum_{j=0}^{N-1} I_{C_0\times[0,1]} \circ T^j(x, s)$. Since $T_t$ is mixing, it follows that $T$ is mixing and hence ergodic. By the ergodic theorem, $M_0(x, s, N)/N \to K$ almost everywhere. Since $M_0(x, s, N)/N \le 1$, the dominated convergence theorem implies that
\[ \lim_{N\to\infty} \int_{C_0}\int_0^1 M_0(x, s, N)^2/N^2\,ds\,d\mu = \int_{C_0}\int_0^1 K^2\,ds\,d\mu = K^3\bar r. \]
By definition of $n(x, j)$,
\[ M(x, N) = \sum_{j=0}^{N-1} I_{C_0}(\sigma^{n(x,j)}x) \ge \sum_{j=0}^{N-1} I_{C_0}(\sigma^{n(x,s,j)}x) = M_0(x, s, N), \]
and the result follows.

By Proposition 4, $\int_{C_0} M(x, N)^2/N^2\,d\mu \ge K^3\bar r/2$ eventually so that
\[ \int_{C_0} |b_{k,N}|^2\,d\mu \ge K^3|b_k|^2N^2\bar r/8. \]
We conclude that
\[ \frac{1}{N^{2-\epsilon}} \int_{C_0}\int_0^1 \phi_N^2\,ds\,d\mu \ge \frac{1}{N^{2-\epsilon}} \int_{C_0} |b_{k,N}|^2\,d\mu \ge K^3|b_k|^2N^\epsilon\bar r/8 \sim K^3N^{\epsilon/3}\bar r/8 \to \infty, \]
as required.
Appendix A. Reduction to Future Coordinates

Suppose that $\sigma : X \to X$ is a two-sided subshift of finite type. Let $\theta \in (0, 1)$, $r \in F_\theta(X)$, and define the suspension $X_r$ corresponding to the roof function $r$ with suspension flow $T_t$. As described earlier, we define the “metric” $d_\theta((x, s), (y, t)) = d_\theta(x, y) + |s - t|$. Let $F_\theta(X_r)$ denote the space of continuous functions $v : X_r \to \mathbb{R}$ that are Lipschitz with respect to the metric $d_\theta$ and let $|v|_\theta$ denote the Lipschitz constant.

Remark 8. We have used $d_\theta$ to denote the metrics on $X$ and $X_r$ but the context should avoid any ambiguity. Also, it should be noted that $d_\theta$ is not really a metric on $X_r$ due to the identifications, but this turns out only to be a minor inconvenience. In this regard, we caution that the continuity assumption for elements of $F_\theta(X_r)$ is not implied by the Lipschitz assumption.

Let $(x, s) \in X_r$. Then $T_t(x, s) = (\sigma^j x, s + t - r_j(x))$, where $s + t \in [r_j(x), r_{j+1}(x))$. The lap number $j$ is a function of $x$, $s$, $t$. Note that $j \in (t/\max r, t/\min r]$.

Proposition 5. Suppose $x, x' \in X$ and $x_i = x'_i$ for all $i \ge 0$. Then the limit
\[ \Delta(x, x') = \sum_{j=0}^\infty \bigl[ r(\sigma^j x) - r(\sigma^j x') \bigr] = \lim_{j\to\infty} \bigl[ r_j(x) - r_j(x') \bigr] \]
exists. Moreover, there exists a $t_0 \ge 1$ such that if $x_i = x'_i$ for all $i \ge 0$ and if $j$ and $k$ are the lap numbers corresponding to $T_t(x, s)$ and $T_t(x', s - \Delta(x, x'))$, then $|j - k| \le 1$ for all $t \ge t_0$.

Proof. Note that $|r(\sigma^j x) - r(\sigma^j x')| \le |r|_\theta\,d_\theta(\sigma^j x, \sigma^j x') \le \theta^j|r|_\theta$ so that $\Delta$ is well-defined. Let $j$ and $k$ be the lap numbers for $T_t(x, s)$ and $T_t(x', s - \Delta(x, x'))$ respectively. Thus $s + t \in [r_j(x), r_{j+1}(x))$ and $s + t \in [r_k(x') + \Delta(x, x'), r_{k+1}(x') + \Delta(x, x'))$. As $k \to \infty$, the interval $[r_k(x') + \Delta(x, x'), r_{k+1}(x') + \Delta(x, x'))$ converges to the interval $[r_k(x), r_{k+1}(x))$. Hence, within an arbitrarily small error, the intervals $[r_j(x), r_{j+1}(x))$ and $[r_k(x), r_{k+1}(x))$ must eventually overlap. But if $|j - k| \ge 2$, then these intervals are separated by at least distance $\min r$. It follows that eventually $|j - k| \le 1$.

Corollary 1. There exists $N \ge 1$ such that
\[ |v \circ T_n(x, s) - v \circ T_n(x', s - \Delta(x, x'))| \le |v|_\theta \bigl[ 1 + |r|_\theta/(1 - \theta) \bigr]\,\theta^{n/|r|_\infty}, \]
for all $v \in F_\theta(X_r)$ and $n \ge N$.

Proof. Denote the lap numbers of $T_n(x, s)$ and $T_n(x', s - \Delta(x, x'))$ by $j$ and $k$ respectively. It follows from Proposition 5 that for each $n \ge N$ large enough, $|j - k| \le 1$. In the case $k = j$,
\[ |v \circ T_n(x, s) - v \circ T_n(x', s - \Delta(x, x'))| \le |v|_\theta \bigl[ d_\theta(\sigma^j x, \sigma^j x') + |\Delta(x, x') - r_j(x) + r_j(x')| \bigr] \le |v|_\theta\,\theta^j \bigl[ 1 + |r|_\theta/(1 - \theta) \bigr] \le |v|_\theta\,\theta^{n/|r|_\infty} \bigl[ 1 + |r|_\theta/(1 - \theta) \bigr]. \]
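In the case $k = j + 1$, we have the estimate
\[ |v(\sigma^j x, s + n - r_j(x)) - v(\sigma^j x, r(\sigma^j x))| + |v(\sigma^{j+1}x, 0) - v(\sigma^{j+1}x', s - \Delta(x, x') + n - r_{j+1}(x'))| \]
\[ \le |v|_\theta \bigl[ (r(\sigma^j x) - (s + n - r_j(x))) + d_\theta(\sigma^{j+1}x, \sigma^{j+1}x') + (s - \Delta(x, x') + n - r_{j+1}(x')) \bigr] \]
\[ = |v|_\theta \bigl[ d_\theta(\sigma^{j+1}x, \sigma^{j+1}x') + (r_{j+1}(x) - r_{j+1}(x') - \Delta(x, x')) \bigr] \le |v|_\theta\,\theta^{j+1} \bigl[ 1 + |r|_\theta/(1 - \theta) \bigr] \le |v|_\theta\,\theta^{n/|r|_\infty} \bigl[ 1 + |r|_\theta/(1 - \theta) \bigr]. \]
The case $k = j - 1$ is similar.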
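Proposition 6. Suppose that $x, x', y, y' \in X$ and $x_i = x'_i$ for $i \ge 0$ and $y_i = y'_i$ for $i \ge 0$. If $d_\theta(x, y) < \theta^{2N}$, $d_\theta(x', y') < \theta^{2N}$, then $|\Delta(x, x') - \Delta(y, y')| < 4|r|_\theta\,\theta^N/(1 - \theta)$.

Proof. Write
\[ \Delta(x, x') - \Delta(y, y') = (r_N(x) - r_N(y)) - (r_N(x') - r_N(y')) + \Delta(\sigma^N x, \sigma^N x') - \Delta(\sigma^N y, \sigma^N y'). \]
Now,
\[ |r_N(x) - r_N(y)| \le \sum_{j=0}^{N-1} |r(\sigma^j x) - r(\sigma^j y)| \le \sum_{j=0}^{N-1} |r|_\theta\,d_\theta(\sigma^j x, \sigma^j y) \le \sum_{j=0}^{N-1} |r|_\theta\,\theta^{-j} d_\theta(x, y) \le |r|_\theta\,\theta^{-N} d_\theta(x, y)/(1 - \theta) \le |r|_\theta\,\theta^N/(1 - \theta), \]
and similarly for $r_N(x') - r_N(y')$. Next, compute that
\[ |\Delta(\sigma^N x, \sigma^N x')| \le \sum_{j=N}^\infty |r(\sigma^j x) - r(\sigma^j x')| \le |r|_\theta \sum_{j=N}^\infty \theta^j = |r|_\theta\,\theta^N/(1 - \theta), \]
and similarly for $\Delta(\sigma^N y, \sigma^N y')$.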
Let $\partial_t v = (\partial/\partial t)(v \circ T_t)|_{t=0}$ denote the derivative of $v : X_r \to \mathbb{R}$ in the flow direction. Let $F_{k,\theta}(X_r)$ denote the space of functions $v : X_r \to \mathbb{R}$ such that $\partial_t^j v \in F_\theta(X_r)$ for $j = 0, \ldots, k$ and define $|v|_{k,\theta} = \max_{j=0,\ldots,k} |\partial_t^j v|_\theta$.

Theorem 5. Let $\sigma : X \to X$ be a two-sided subshift and let $r \in F_\theta(X)$ be a roof function, $r > 0$. Suppose further that $r$ depends only on future coordinates. Define $q = (4 + 2|1/r|_\infty)|r|_\infty$. Let $v \in F_{k+1,\theta}(X_r)$. Then there exist $w, \chi \in F_{k,\theta^{1/q}}(X_r)$ such that $w$ depends only on future coordinates, and $v = w + \chi - \chi \circ T_1$.
Proof. For each letter $a$, choose an element $x^a \in X$ such that $(x^a)_0 = a$. Given $x \in X$ define $\varphi(x) \in X$ as follows: $(\varphi(x))_i = x_i$ for $i \ge 0$ and $(\varphi(x))_i = (x^{x_0})_i$ for $i \le 0$. So the future coordinates of $\varphi(x)$ agree with $x$ whereas the past coordinates of $\varphi(x)$ depend only on $x_0$. In particular, the map $\varphi : X \to X$ depends only on future coordinates. By Proposition 5, we can define $\hat\varphi(x, s) = (\varphi x, s - \Delta(x, \varphi x))$. Define (formally for the moment)
$$\chi = \sum_{n=0}^{\infty} (v \circ T_n - v \circ T_n \circ \hat\varphi).$$
Compute that $v = w + \chi - \chi \circ T_1$ where $w = \sum_{n=0}^{\infty} (v \circ T_n \circ \hat\varphi - v \circ T_n \circ \hat\varphi \circ T_1)$, which clearly depends only on future coordinates (since $\varphi$ and $r$ depend only on future coordinates). It remains to show that $\chi$ (and hence $w$) lies in $F_{k,\theta^{1/q}}(X_r)$.
First, we show that $\chi$ is $C^{k+1}$ in the flow direction. Differentiating $\chi$ formally term by term yields the series $\partial_t^j \chi = \sum_{n=0}^{\infty} ((\partial_t^j v) \circ T_n - (\partial_t^j v) \circ T_n \circ \hat\varphi)$. For fixed $0 \le j \le k + 1$, since $\partial_t^j v \in F_\theta(X_r)$, we deduce from Proposition 5 that the $n$th term of $\partial_t^j \chi$ is bounded in absolute value by $|\partial_t^j v|_\theta\, \theta^{n/|r|_\infty} [1 + |r|_\theta/(1 - \theta)]$ and so the series converges uniformly to a continuous function $\partial_t^j \chi$. In particular, $\chi$ is $C^{k+1}$ in the flow direction.
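The identity $v = w + \chi - \chi \circ T_1$ can be checked at the level of partial sums. The display below is only a sketch and is not part of the original argument: it writes $\hat\varphi$ for the map $(x, s) \mapsto (\varphi x, s - \Delta(x, \varphi x))$, and the regrouped series for $w$ in the last line is an assumption about how the formal series is meant to be summed.

```latex
% Partial sums up to M (recall T_0 = id):
\begin{align*}
v - \sum_{n=0}^{M}\bigl(v\circ T_n - v\circ T_n\circ\hat\varphi\bigr)
  + \sum_{n=0}^{M-1}\bigl(v\circ T_{n+1} - v\circ T_n\circ\hat\varphi\circ T_1\bigr)
 &= v\circ\hat\varphi
  + \sum_{n=0}^{M-1}\bigl(v\circ T_{n+1}\circ\hat\varphi
  - v\circ T_n\circ\hat\varphi\circ T_1\bigr).
\end{align*}
% As M -> infinity the left-hand side tends to v - chi + chi o T_1, hence
\[
w = v - \chi + \chi\circ T_1
  = v\circ\hat\varphi + \sum_{n=0}^{\infty}\bigl(v\circ T_{n+1}\circ\hat\varphi
  - v\circ T_n\circ\hat\varphi\circ T_1\bigr).
\]
```

Formally shifting the summation index in the first half of each summand recovers the series for $w$ stated in the proof.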
It remains to show that $\partial_t^j \chi$ is Lipschitz with respect to the $d_{\theta^{1/q}}$ metric for all $0 \le j \le k$. It suffices to show that $\chi$ is Lipschitz with respect to the $d_{\theta^{1/q}}$ metric under the assumption that $v \in F_{1,\theta}$ (the general case follows replacing $v$ by $\partial_t^j v$). Moreover, since $\chi$ is $C^1$ and hence Lipschitz in the flow direction (which we can identify with the $s$ variable), we may keep the $s$ variable fixed. Choose $N$ large as in Proposition 5. In analogy with the proof of Proposition 6, we have the decomposition $|\chi(x, s) - \chi(y, s)| \le A_1(x, y) + A_2(x, y) + B(x) + B(y)$, where
$$A_1(x, y) = \sum_{n=0}^{N} |v \circ T_n(x, s) - v \circ T_n(y, s)|,$$
$$A_2(x, y) = \sum_{n=0}^{N} |v \circ T_n(\hat\varphi(x, s)) - v \circ T_n(\hat\varphi(y, s))|,$$
$$B(x) = \sum_{n=N+1}^{\infty} |v \circ T_n(x, s) - v \circ T_n(\hat\varphi(x, s))|.$$
Let $q_1 = |r|_\infty$ and $q_2 = 2 + |1/r|_\infty$. We claim that provided $N$ is large enough (independent of $v$), there exists a constant $K > 0$ such that (i) $B(x) \le K\theta^{N/q_1}$ for all $x \in X$, and (ii) $A_1(x, y), A_2(x, y) \le K\theta^{N/2}$ for all $x, y \in X$ with $d_\theta(x, y) < \theta^{Nq_2}$. Let $q = 2q_1 q_2$. It then follows that $|\chi(x, s) - \chi(y, s)| \le 4K d_{\theta^{1/q}}(x, y)$, proving the result.
As before, the $n$th term of $B(x)$ is dominated by $C\theta^{n/|r|_\infty} = C\theta^{n/q_1}$, verifying (i). It remains to verify (ii). We give the details for the more difficult term $A_2(x, y)$. Choose $N$ so large that $4|r|_\theta \theta^N/(1 - \theta) < \min r/2$ and $N\theta^{N/2} < 1$.
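Suppose that $d_\theta(x, y) < \theta^{Nq_2}$. By Proposition 6, $|\Delta(x, \varphi x) - \Delta(y, \varphi y)| < \min r/2$. Also,
$$|r_j(\varphi x) - r_j(\varphi y)| \le |r|_\theta\, \theta^{-j+1} \theta^{Nq_2}/(1 - \theta) \le |r|_\theta\, \theta^{N(q_2 - |1/r|_\infty)}/(1 - \theta) = |r|_\theta\, \theta^{2N}/(1 - \theta) < \min r/2,$$
for all $1 \le j \le [N|1/r|_\infty] + 1$. Hence for this range of $j$, the intervals $[r_j(\varphi x) + \Delta(x, \varphi x), r_{j+1}(\varphi x) + \Delta(x, \varphi x))$ and $[r_j(\varphi y) + \Delta(y, \varphi y), r_{j+1}(\varphi y) + \Delta(y, \varphi y))$ almost coincide (the initial points are within distance $\min r$, as are the final points). It follows as in the proof of Proposition 5 that the lap numbers $j$ and $k$ of $T_n(\hat\varphi(x, s))$ and $T_n(\hat\varphi(y, s))$ satisfy $|j - k| \le 1$ for all $0 \le n \le N$.
The estimation of the terms in $A_2(x, y)$ now splits into three cases as in the proof of Corollary 1. When $j = k$, we obtain the term
$$\bigl| v(\sigma^j \varphi x, s - \Delta(x, \varphi x) + n - r_j(\varphi x)) - v(\sigma^j \varphi y, s - \Delta(y, \varphi y) + n - r_j(\varphi y)) \bigr|,$$
which is dominated by
$$|v|_\theta \bigl( d_\theta(\sigma^j \varphi x, \sigma^j \varphi y) + |r_j(\varphi x) - r_j(\varphi y)| + |\Delta(x, \varphi x) - \Delta(y, \varphi y)| \bigr) \le |v|_\theta \bigl( [1 + |r|_\theta/(1 - \theta)]\,\theta^{-j} d_\theta(\varphi x, \varphi y) + 4|r|_\theta \theta^N/(1 - \theta) \bigr) \le |v|_\theta \bigl( [1 + |r|_\theta/(1 - \theta)]\,\theta^{Nq_2 - n|1/r|_\infty} + 4|r|_\theta \theta^N/(1 - \theta) \bigr).$$
The computations for $j = k \pm 1$ lead to the same estimates (just as in the proof of Corollary 1) and summing the terms we obtain
$$A_2(x, y) \le |v|_\theta \bigl( [1 + |r|_\theta/(1 - \theta)]\,\theta^{N(q_2 - |1/r|_\infty)}/(1 - \theta) + 4|r|_\theta N\theta^N/(1 - \theta) \bigr) \le |v|_\theta \bigl( [1 + |r|_\theta/(1 - \theta)]\,\theta^{2N}/(1 - \theta) + 4|r|_\theta \theta^{N/2}/(1 - \theta) \bigr)$$
(since $N\theta^{N/2} < 1$), completing the proof.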
Acknowledgement. This research was supported in part by NSF Grant DMS-0071735 and by the ESF “Probabilistic methods in non-hyperbolic dynamics” (PRODYN) programme. IM is grateful to Francois Ledrappier and Matthew Nicol for helpful discussions and suggestions.
Communicated by G. Gallavotti
Commun. Math. Phys. 229, 73 – 120 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Duality, Biorthogonal Polynomials and Multi-Matrix Models
M. Bertola$^{1,2}$, B. Eynard$^{1,3}$, J. Harnad$^{1,2}$
1 Centre de recherches mathématiques, Université de Montréal, C. P. 6128, succ. centre ville, Montréal, Québec H3C 3J7, Canada. E-mail: [email protected]
2 Department of Mathematics and Statistics, Concordia University, 7141 Sherbrooke W., Montréal, Québec H4B 1R6, Canada. E-mail: [email protected]
3 Service de Physique Théorique, CEA/Saclay, Orme des Merisiers, 91191 Gif-sur-Yvette Cedex, France. E-mail: [email protected]
Received: 14 September 2001 / Accepted: 18 February 2002
Abstract: The statistical distribution of eigenvalues of pairs of coupled random matrices can be expressed in terms of integral kernels having a generalized Christoffel–Darboux form constructed from sequences of biorthogonal polynomials. For measures involving exponentials of a pair of polynomials $V_1$, $V_2$ in two different variables, these kernels may be expressed in terms of finite dimensional “windows” spanned by finite subsequences having length equal to the degree of one or the other of the polynomials $V_1$, $V_2$. The vectors formed by such subsequences satisfy “dual pairs” of first order systems of linear differential equations with polynomial coefficients, having rank equal to one of the degrees of $V_1$ or $V_2$ and degree equal to the other. They also satisfy recursion relations connecting the consecutive windows, and deformation equations, determining how they change under variations in the coefficients of the polynomials $V_1$ and $V_2$. Viewed as overdetermined systems of linear difference-differential-deformation equations, these are shown to be compatible, and hence to admit simultaneous fundamental systems of solutions. The main result is the demonstration of a spectral duality property; namely, that the spectral curves defined by the characteristic equations of the pair of matrices defining the dual differential systems are equal upon interchange of eigenvalue and polynomial parameters.
Work supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC) and the Fonds FCAR du Québec.
1. Introduction
1.1. Random matrices.
1.1.1. Background and motivation. Random matrices [37, 7] play an important rôle in many areas of physics. They were first introduced by Wigner [51] in the context of the spectra of large nuclei, and the theory was greatly developed in pioneering work of Mehta, Gaudin [39, 23] and Dyson [16, 17]. It has found subsequent applications in solid state physics [24] (e.g., conduction in mesoscopic devices, quantum chaos and, lately, crystal growth [42]), in particle physics [50], 2d-quantum gravity and string theory [11, 12, 5]. The success and large range of applications of random matrices is due, in part, to their universality property; when the size $N$ of the matrices becomes large, the statistics of the eigenvalues tend to be independent of the model, and determined only by its symmetries and the spectral region considered, relative to critical points and edges in the spectral density. Matrix integrals are also known to give special realizations of KP, Toda and isomonodromic $\tau$-functions, and thus have a close relationship to integrable systems [13, 36, 46, 47, 2, 3, 28, 7].
Random matrices also have important applications in pure mathematics, for example, in the statistical distribution of the zeros of $\zeta$-functions [41, 43, 34]. They are also related to other statistical problems such as random word growth and the lengths of nondecreasing subsequences of random sequences [4, 33].
The model we shall consider here is called the “2-matrix model” [30, 38, 10, 40, 12, 19, 22]. This involves an ensemble consisting of pairs of $N \times N$ hermitian matrices $M_1$ and $M_2$, with a $U(N)$ invariant probability measure of the form:
$$\frac{1}{\tau_N}\, d\mu(M_1, M_2) := \frac{1}{\tau_N} \exp \mathrm{tr}\bigl(-V_1(M_1) - V_2(M_2) + M_1 M_2\bigr)\, dM_1\, dM_2, \tag{1.1}$$
where $dM_1\, dM_2$ is the standard Lebesgue measure for pairs of Hermitian matrices, $V_1$ and $V_2$ are polynomials of degrees $d_1 + 1$, $d_2 + 1$ respectively, called the potentials, with coefficients viewed as deformation parameters, and the normalization factor (partition function) is
$$\tau_N = \int_{M_1} \int_{M_2} d\mu, \tag{1.2}$$
which is known to be a KP $\tau$-function in each set of deformation parameters, as well as providing solutions to the two-Toda equations [48, 2, 3].
This model was introduced in [30, 38] as a toy model for quantum gravity and string theory. The main interest was in a special “double scaling” limit, where $N \to \infty$ and the potentials $V_1$ and $V_2$ are fine-tuned to critical potentials. The asymptotic behaviour in such limits is related to finite dimensional irreducible representations of the 2D conformal group [14, 45, 10]. The best known example is when $V_1$ and $V_2$ are cubic polynomials, tuned to their critical values, which reproduces the critical behaviour of the Ising model on a random surface [35, 8]. It is important to note that the 2-matrix model contains more critical points than a 1-matrix model; for instance, the 1-matrix model cannot have an Ising transition [14].
String theorists have also introduced a generalization, known as the “multi-matrix model” [13, 19, 20, 22], where one has a set of $m \ge 2$ matrices ($N \times N$ hermitian) coupled together in a chain, with a measure of the form
$$\frac{1}{\tau_N}\, d\mu(M_1, \dots, M_m) = \frac{1}{\tau_N} \exp \mathrm{tr}\Bigl(-\sum_{j=1}^{m} V_j(M_j) + \sum_{i=1}^{m-1} M_i M_{i+1}\Bigr) \prod_{i=1}^{m} dM_i, \tag{1.3}$$
and the $V_j$’s are again polynomials in their arguments. This model has the same universal behaviour as the 2-matrix model and, in some sense, does not seem to contain any more
information. Throughout the main body of this work, we will concentrate on the 2-matrix model, for which the statistics of the eigenvalues can be calculated using biorthogonal polynomials [38, 37, 22, 20, 2, 3]. In the appendix, it will be explained how to extend all the results in the present work from the 2-matrix model to the differential systems associated with the multi-matrix model.
1.1.2. Relation to biorthogonal polynomials. By biorthogonal polynomials, we mean two sequences of monic polynomials
$$\pi_n(x) = x^n + \cdots, \qquad \sigma_n(y) = y^n + \cdots, \qquad n = 0, 1, \dots \tag{1.4}$$
which are orthogonal with respect to a coupled measure on the product space:
$$\int\!\!\int dx\, dy\, \pi_n(x) \sigma_m(y)\, e^{-V_1(x) - V_2(y) + xy} = h_n \delta_{mn}, \tag{1.5}$$
where $V_1(x)$ and $V_2(y)$ are polynomials chosen to be the same as those appearing in the 2-matrix model measure (1.1), and a suitable contour is chosen to make the integrals convergent. The orthogonality relations determine the two families.
Once the biorthogonal polynomials are known, they may be used to compute four different kernels:
$$\overset{N}{K}_{12}(x, y) = \sum_{n=0}^{N-1} \frac{1}{h_n} \pi_n(x) \sigma_n(y)\, e^{-V_1(x)} e^{-V_2(y)}, \qquad \overset{N}{K}_{11}(x, x') = \int dy\, \overset{N}{K}_{12}(x, y)\, e^{x'y}, \tag{1.6}$$
$$\overset{N}{K}_{22}(y', y) = \int dx\, \overset{N}{K}_{12}(x, y)\, e^{xy'}, \qquad \overset{N}{K}_{21}(y', x') = \int\!\!\int dx\, dy\, \overset{N}{K}_{12}(x, y)\, e^{xy'} e^{x'y}. \tag{1.7}$$
All the statistical properties of the spectra of the 2-matrix ensemble may then be expressed in terms of these kernels [22] and the corresponding Fredholm integral operators $\overset{N}{K}_{ij}$, $i, j = 1, 2$. For instance the density of eigenvalues of the first matrix is:
$$\overset{N}{\rho}_1(x) = \frac{1}{N} \overset{N}{K}_{11}(x, x), \tag{1.8}$$
the correlation function of two eigenvalues of the first matrix is:
$$\overset{N}{\rho}_{11}(x, x') = \frac{1}{N^2} \Bigl( \overset{N}{K}_{11}(x, x) \overset{N}{K}_{11}(x', x') - \overset{N}{K}_{11}(x, x') \overset{N}{K}_{11}(x', x) \Bigr), \tag{1.9}$$
and the correlation function of two eigenvalues, one of the first matrix and one of the second is:
$$\overset{N}{\rho}_{12}(x, y) = \frac{1}{N^2} \Bigl( \overset{N}{K}_{11}(x, x) \overset{N}{K}_{22}(y, y) - \overset{N}{K}_{12}(x, y) \bigl( \overset{N}{K}_{21}(y, x) - e^{xy} \bigr) \Bigr). \tag{1.10}$$
Any other correlation function of $m$ eigenvalues can similarly be written as a determinant involving these four kernels only.
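For instance, the two-point function (1.9) is the $2 \times 2$ case of this determinantal structure. The display below is a sketch that only rewrites (1.9) in determinant form, with the label $N$ on the kernels suppressed for readability; the general $m$-point formula is not reproduced here.

```latex
\[
\rho_{11}(x,x') \;=\; \frac{1}{N^{2}}\,
\det\begin{pmatrix}
K_{11}(x,x)  & K_{11}(x,x') \\
K_{11}(x',x) & K_{11}(x',x')
\end{pmatrix}.
\]
```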
The spacing distributions (the probability that two neighbouring eigenvalues are at some given distance) can be computed as Fredholm determinants. For example, the probability that some subset $J$ of the real axis contains no eigenvalue of the first matrix is the Fredholm determinant:
$$p_J^{N,1} = \det\bigl(1 - \overset{N}{K}_{11} \circ \chi_J\bigr), \tag{1.11}$$
where $\chi_J$ is the characteristic function of the set $J$.
An important feature in the study of the $N \to \infty$ limit is that the kernels $\overset{N}{K}_{ij}$ may be expressed [20] in terms of sums involving only a fixed number of terms (either $d_1 + 1$ or $d_2 + 1$), independently of $N$, as a consequence of a “generalized Christoffel–Darboux” formula [44, 49] following from the recursion relations satisfied by the biorthogonal polynomials. This allows one, in the $N \to \infty$ limit, with suitable scaling in the spectral variables, depending on the region considered, to treat $N$ as just a parameter.
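As a rough numerical illustration of a gap probability of the form (1.11), the sketch below approximates a Fredholm determinant by a quadrature (Nyström-type) discretization. It is not taken from the paper: the helper names `gap_probability` and `sine_kernel` are hypothetical, and instead of the finite-$N$ kernel it uses the sine kernel in the common normalization $\sin(\pi(x - x'))/(\pi(x - x'))$, which is an assumption made only to keep the example self-contained.

```python
import numpy as np

def gap_probability(kernel, a, b, n=40):
    """Approximate det(1 - K restricted to [a, b]) with an n-point
    Gauss-Legendre rule (hypothetical helper, illustration only)."""
    t, w = np.polynomial.legendre.leggauss(n)   # nodes/weights on [-1, 1]
    x = 0.5 * (b - a) * t + 0.5 * (b + a)       # map nodes to [a, b]
    w = 0.5 * (b - a) * w
    sw = np.sqrt(w)
    K = kernel(x[:, None], x[None, :])          # kernel matrix K(x_i, x_j)
    # symmetrized discretization: delta_ij - sqrt(w_i) K(x_i, x_j) sqrt(w_j)
    return np.linalg.det(np.eye(n) - sw[:, None] * K * sw[None, :])

def sine_kernel(x, y):
    # np.sinc(t) = sin(pi t) / (pi t), with the value 1 at t = 0
    return np.sinc(x - y)

p = gap_probability(sine_kernel, 0.0, 0.1)
print(p)
```

For a short interval of length $s$ the result should be close to $1 - s$, since with this normalization the mean density of points is 1.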
1.2. Duality.
1.2.1. Dual isomonodromic deformations. The notion of duality arises in a number of contexts, both in relation to isospectral flows [1] and isomonodromic [26, 27, 25] deformations. What is meant here by “duality” in the case of isomonodromic deformations is the existence of a pair of parametric families of meromorphic covariant derivative operators on the Riemann sphere
$$D_1 := \frac{\partial}{\partial x} + L(x, u), \qquad D_2 := \frac{\partial}{\partial y} + M(y, u), \tag{1.12}$$
where $L(x, u)$ and $M(y, u)$ are, respectively, $l \times l$ and $m \times m$ matrices that are rational functions of the complex variables $x \in \mathbb{P}^1$ and $y \in \mathbb{P}^1$, with pole divisors of fixed degrees, depending smoothly on a set of deformation parameters $u = (u_1, u_2, \dots)$ in such a way that:
(1) The matrices $L(x, u)$ and $M(y, u)$ are obtained from the integral curves of a set of commuting (in general, nonautonomous) vector fields defined on a phase space $\mathcal{M}$ by composition with a prescribed pair of maps (possibly depending explicitly on the deformation parameters) from $\mathcal{M}$ to the spaces of $l \times l$ or $m \times m$ matrix valued rational functions of the spectral parameter $x$ or $y$, respectively, with pole divisors of fixed degree.
(2) The generalized monodromy data of both the operators $D_1$ and $D_2$ are invariant under the $u$-deformations. (This includes the monodromy representation of the fundamental group of the punctured Riemann sphere obtained by removing the locus of poles and, in the case of non-Fuchsian systems, the Stokes matrices and connection matrices [31, 32].)
(3) The spectral curves determined by the characteristic equations:
$$\det(L(x, u) - y\mathbf{1}) = 0, \qquad \det(M(y, u) - x\mathbf{1}) = 0 \tag{1.13}$$
are biholomorphically equivalent.
(A similar definition can be given for the case of dual isospectral flows of matrices $L(x, u)$ and $M(y, u)$.)
Such “dual pairs” of isomonodromic families occur in many applications. They are related to the solution of “dual” matrix Riemann–Hilbert (RH) problems [29, 25, 27] which, in certain cases, are equivalent to determining the resolvents of a special class of “integrable” Fredholm integral operators $K$, $\tilde{K}$, with kernels of the form:
$$K(x, x') = \frac{\sum_{i=1}^{l} f_i(x) g_i(x')}{x - x'}, \tag{1.14}$$
$$\tilde{K}(y, y') = \frac{\sum_{a=1}^{m} \tilde{f}_a(y) \tilde{g}_a(y')}{y - y'}. \tag{1.15}$$
Here, the vector valued functions
$$\mathbf{f}(x, u) = (f_1(x, u), \dots, f_l(x, u)); \qquad \mathbf{g}(x, u) = (g_1(x, u), \dots, g_l(x, u)) \tag{1.16}$$
$$\tilde{\mathbf{f}}(y, u) = (\tilde{f}_1(y, u), \dots, \tilde{f}_m(y, u)); \qquad \tilde{\mathbf{g}}(y, u) = (\tilde{g}_1(y, u), \dots, \tilde{g}_m(y, u)) \tag{1.17}$$
depend on the spectral variables $x$ and $y$ as well as on some, but not necessarily all, of the deformation parameters $(u_1, u_2, \dots)$. They satisfy overdetermined, compatible differential systems in these variables which imply the invariance of the monodromy of associated “vacuum” isomonodromic families of covariant derivative operators
$$D_{0,1} := \frac{\partial}{\partial x} + L_0(x, u), \qquad D_{0,2} := \frac{\partial}{\partial y} + M_0(y, u). \tag{1.18}$$
The dual kernels $K(x, x')$ and $\tilde{K}(y, y')$ are related to each other by applying partial integral transforms with respect to one of the two spectral variables $(x, y)$ (e.g., Fourier–Laplace transforms) to an integral operator on the product space $\mathbb{P}^1 \times \mathbb{P}^1$. Application of the Riemann–Hilbert dressing method for suitably chosen “dual” sets of discontinuity data then gives rise to the “dressed” families (1.12), which have a similar relation to the resolvent kernels of the two operators. (Additional deformation parameter dependence may enter, besides that contained in the vacuum equations, characterizing the support of these operators.) The Fredholm determinants $\det(1 - K)$ and $\det(1 - \tilde{K})$ may then be shown, through deformation formulae, to coincide with the corresponding isomonodromic tau functions, and with each other (see e.g. [25, 27]). Such systems arise naturally, as discussed above, in the study of the spectral statistics of random matrix ensembles, both in the finite case, and in suitable infinite limits.
An example of such dual pairs is given by the class of exponential kernels in which:
$$f_i(x) = (-1)^i e^{u_i x}, \qquad g_i(x) = e^{-u_i x}, \tag{1.19}$$
$$\tilde{f}_a(x) = (-1)^a e^{v_a x}, \qquad \tilde{g}_a(x) = e^{-v_a x}. \tag{1.20}$$
For the case $l = m = 2$, these include the sine kernel
$$K_S(x, x') = \frac{\sin(u(x - x'))}{x - x'}, \qquad \tilde{K}_S(y, y') = \frac{\sin(v(y - y'))}{y - y'} \tag{1.21}$$
governing the spectral statistics in the scaling limit of the GUE in the bulk region [37]. Other examples include the various Painlevé equations $P_{II}$, $P_{IV}$, $P_V$, $P_{VI}$ [26, 28, 15] which each possess “dual” isomonodromic representations. A last class of examples, unrelated to random matrices, but including a special case of $P_{VI}$, consists of the isomonodromic representations of the WDVV equations of topological 2D gravity entering in
the theory of Frobenius manifolds [15]. These possess both Fuchsian representations with $n$ finite poles with residues of rank 1, and non-Fuchsian $n \times n$ representations, having a single irregular singular point of Poincaré index 1 at $\infty$.
1.2.2. Duality in the large N limit. It is proved in [52], under a suitable large $N$ assumption, and conjectured for other cases [52, 20] that in a particular large $N$ limit (where the coefficients $\{u_K, v_J\}$ scale as $N$, and the supports of $\rho_1(x) := \lim_{N\to\infty} \overset{N}{\rho}_1(x)$ and $\rho_2(y) := \lim_{N\to\infty} \overset{N}{\rho}_2(y)$ are connected intervals $[a_1, b_1]$, $[a_2, b_2]$ respectively), the following functions:
$$f(x) := V_1'(x) - \frac{1}{N} \int_{a_1}^{b_1} \frac{\rho_1(x')}{x - x'}\, dx', \qquad g(y) := V_2'(y) - \frac{1}{N} \int_{a_2}^{b_2} \frac{\rho_2(y')}{y - y'}\, dy', \tag{1.22}$$
are inverses of each other. In other words, if $y = f(x)$ then $g(y) = x$.
One can see that the functions $f(x)$ and $g(y)$ are related to the eigenvalues of the operators which implement the derivative with respect to $x$ and $y$ for the biorthogonal polynomials. The spectral duality Theorems 4.1 and 4.2 which we present in this work are a more precise statement related to this conjecture, with a rigorous proof valid for all $N$.
1.3. Outline of the article.
1.3.1. Biorthogonal polynomials and differential systems. In Sect. 2, we consider the normalized quasi-polynomials
$$\psi_n(x) = \frac{1}{\sqrt{h_n}} \pi_n(x)\, e^{-V_1(x)}, \qquad \phi_n(y) = \frac{1}{\sqrt{h_n}} \sigma_n(y)\, e^{-V_2(y)}, \qquad n = 0, \dots, \infty. \tag{1.23}$$
Viewing these as the components of a pair of column vectors
$$\underset{\infty}{\Psi} = (\psi_0, \psi_1, \dots, \psi_n, \dots)^t \quad \text{and} \quad \underset{\infty}{\Phi} = (\phi_0, \phi_1, \dots, \phi_n, \dots)^t, \tag{1.24}$$
we obtain a pair of semi-infinite matrices $Q$ and $P$ that implement multiplication of $\underset{\infty}{\Psi}$ by $x$ and derivation with respect to $x$, respectively. Equivalently, we obtain the transposes $Q^t$ and $P^t$ by applying $-\frac{d}{dy}$ or multiplication by $-y$ to $\underset{\infty}{\Phi}$. By construction, these satisfy the Heisenberg commutation relations
$$[P, Q] = 1, \tag{1.25}$$
and, as shown in Sect. 2, they are finite band matrices; $Q$ has nonvanishing elements only along diagonals that range from 1 above the principal diagonal to $d_2$ below it, and $P$ has nonvanishing elements only along the diagonals from 1 below the principal to $d_1$ above it, where $d_1 + 1$ and $d_2 + 1$ are the degrees of the polynomials $V_1(x)$ and $V_2(y)$, respectively.
The first result (Prop. 2.1) following from the finite recursion relations satisfied by the quasi-polynomials $\{\psi_n(x)\}$ and $\{\phi_n(y)\}$ is a set of “generalized Christoffel–Darboux
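relations” [49, 20], which imply that the kernels $\overset{N}{K}_{11}(x, x')$ and $\overset{N}{K}_{22}(y', y)$ may be expressed as:
$$\overset{N}{K}_{11}(x, x') = -\frac{\bigl( \overset{N-1}{\underline{\Phi}}(x'),\, \overset{N}{A}\, \underset{N}{\Psi}(x) \bigr)}{x - x'}, \qquad \overset{N}{K}_{22}(y', y) = \frac{\bigl( \overset{N-1}{\underline{\Psi}}(y'),\, \overset{N}{B}\, \underset{N}{\Phi}(y) \bigr)}{y' - y}, \tag{1.26}$$
where $\underset{N}{\Psi}(x)$ and $\underset{N}{\Phi}(y)$ are the $d_2 + 1$ and $d_1 + 1$ dimensional column vectors with components $[\psi_{N-d_2}, \dots, \psi_N]$ and $[\phi_{N-d_1}, \dots, \phi_N]$, respectively, and $\overset{N-1}{\underline{\Psi}}(y)$ and $\overset{N-1}{\underline{\Phi}}(x)$ are the $d_1 + 1$ and $d_2 + 1$ component row vectors with components $[\underline{\psi}_{N-1}, \dots, \underline{\psi}_{N+d_1-1}]$ and $[\underline{\phi}_{N-1}, \dots, \underline{\phi}_{N+d_2-1}]$, respectively, where the underbarred quantities $\{\underline{\psi}_n(y)\}$ and $\{\underline{\phi}_n(x)\}$ designate the Fourier–Laplace transforms of the quasi-polynomials $\{\psi_n(x)\}$ and $\{\phi_n(y)\}$. The matrices $\overset{N}{A}$ and $\overset{N}{B}$ are, essentially, the nonvanishing parts of the matrices obtained by commuting $Q$ and $P$, respectively, with the projectors to the appropriate finite-dimensional subspace. A similar “differential” form of the generalized Christoffel–Darboux relations holds, following from applying the derivations $\partial_x + \partial_{x'}$ and $\partial_{y'} + \partial_y$ to the kernels $\overset{N}{K}_{11}(x, x')$ and $\overset{N}{K}_{22}(y', y)$.
The recursion relations satisfied by the quasi-polynomials $\{\psi_n(x)\}_{n=0,\dots,\infty}$ and $\{\phi_n(y)\}_{n=0,\dots,\infty}$ may be conveniently expressed (Lemma 2.3) as:
$$\underset{N}{a}\, \underset{N}{\Psi}(x) = \underset{N+1}{\Psi}(x), \qquad \underset{N}{b}\, \underset{N}{\Phi}(y) = \underset{N+1}{\Phi}(y),$$
where the “ladder” matrices $\underset{N}{a}$ and $\underset{N}{b}$ are linear in $x$ and in $y$, respectively, and are formed from the rows of $Q$ and the columns of $P$ (see Eqs. (2.49, 2.50) for their exact definitions).
The vectors $\underset{N}{\Psi}(x)$ and $\underset{N}{\Phi}(y)$ also satisfy the following differential equations (Lemma 2.4)
$$\frac{\partial}{\partial x} \underset{N}{\Psi} = -\overset{N}{D}_1(x)\, \underset{N}{\Psi}, \qquad \frac{\partial}{\partial y} \underset{N}{\Phi} = -\overset{N}{D}_2(y)\, \underset{N}{\Phi}, \tag{1.27}$$
where the matrices $\overset{N}{D}_1(x)$ and $\overset{N}{D}_2(y)$ are, respectively, of size $(d_2 + 1) \times (d_2 + 1)$ and $(d_1 + 1) \times (d_1 + 1)$, with entries that are polynomials in the indicated variables of degrees $d_1$ and $d_2$, respectively (and also polynomials in the matrix entries of $Q$ and $P$).
Furthermore, if $\{u_K\}_{K=1\dots d_1+1}$ and $\{v_J\}_{J=1\dots d_2+1}$ are the coefficients of the polynomials $V_1(x)$ and $V_2(y)$, respectively, and these are varied smoothly, the effect of such deformations is given by the following system of PDE’s (Lemma 2.8)
$$\frac{\partial}{\partial u_K} \underset{N}{\Psi} = \overset{N,\Psi}{U}_K\, \underset{N}{\Psi}, \qquad \frac{\partial}{\partial v_J} \underset{N}{\Psi} = -\overset{N,\Psi}{V}_J\, \underset{N}{\Psi},$$
$$\frac{\partial}{\partial u_K} \underset{N}{\Phi} = \overset{N,\Phi}{U}_K\, \underset{N}{\Phi}, \qquad \frac{\partial}{\partial v_J} \underset{N}{\Phi} = -\overset{N,\Phi}{V}_J\, \underset{N}{\Phi}, \tag{1.28}$$
where the matrices $\overset{N,\Psi}{U}_K(x)$, $\overset{N,\Psi}{V}_J(x)$, $\overset{N,\Phi}{U}_K(y)$ and $\overset{N,\Phi}{V}_J(y)$ are again polynomials in the indicated variables and in the matrix entries of $Q$ and $P$.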
1.3.2. Compatibility. So far, these statements are just a re-writing of the infinite series of recursion relations, differential equations and deformation equations satisfied by the functions $\{\psi_n(x)\}_{n=0,\dots,\infty}$ and $\{\phi_n(y)\}_{n=0,\dots,\infty}$, projected onto the finite “windows” represented by the vectors $\underset{N}{\Psi}$ and $\underset{N}{\Phi}$. However, we may now view these equations as defining an overdetermined system of finite difference-differential-deformation equations for vector functions of the variables $\{N, x, y, u_K, v_J\}$, and ask whether, as such, these systems are compatible; i.e., whether they admit a basis of simultaneous linearly independent solutions. The affirmative answer to this question is provided in Sect. 3 by Prop. 3.3, which states that sequences of invertible $(d_2 + 1) \times (d_2 + 1)$ and $(d_1 + 1) \times (d_1 + 1)$ matrices $\underset{N}{\Psi}(x)$ and $\underset{N}{\Phi}(y)$ exist (fundamental solutions), for which all the column vectors satisfy the above difference-differential-deformation equations simultaneously.
The compatibility of the deformation equations and finite difference equations with the $x$ and $y$ differential equations implies, in particular, that the (generalized) monodromy of the polynomial covariant derivative operators
$$\frac{\partial}{\partial x} + \overset{N}{D}_1(x), \qquad \frac{\partial}{\partial y} + \overset{N}{D}_2(y) \tag{1.29}$$
is invariant under both the $\{u_K, v_J\}$ deformations and the shifts in $N$. A similar statement can be made of the corresponding operators
$$\frac{\partial}{\partial x} - \overset{N}{\underline{D}}_1(x), \qquad \frac{\partial}{\partial y} - \overset{N}{\underline{D}}_2(y) \tag{1.30}$$
annihilating the Fourier–Laplace transformed vectors $\overset{N-1}{\underline{\Phi}}(x)$ and $\overset{N-1}{\underline{\Psi}}(y)$.
1.3.3. Spectral duality. A result that is not at all obvious from the above discussion is proved in Prop. 4.1; namely, that the pairs of matrices $(\overset{N}{D}_1(x), \overset{N}{\underline{D}}_2(y))$ and $(\overset{N}{D}_2(y), \overset{N}{\underline{D}}_1(x))$ have the same spectral curves. More specifically, we have the following equalities between their characteristic equations, which actually are identities:
$$u_{d_1+1} \det\bigl( x \mathbf{1}_{d_1+1} - \overset{N}{\underline{D}}_2(y) \bigr) = v_{d_2+1} \det\bigl( y \mathbf{1}_{d_2+1} - \overset{N}{D}_1(x) \bigr), \tag{1.31}$$
$$u_{d_1+1} \det\bigl( x \mathbf{1}_{d_1+1} - \overset{N}{D}_2(y) \bigr) = v_{d_2+1} \det\bigl( y \mathbf{1}_{d_2+1} - \overset{N}{\underline{D}}_1(x) \bigr). \tag{1.32}$$
(Note that the two integers $d_1 + 1$ and $d_2 + 1$ which determine, in one case, the dimension of the matrix, in the other, its degree as a polynomial in the variable $x$ or $y$, are interchanged in these equalities, as are the rôles of the variables $x$ and $y$.)
What is even less obvious, and actually depends on the validity of the Heisenberg commutation relation satisfied by $Q$ and $P$, is that these curves are not only pairwise equal, but in fact they all coincide, since the pairs of matrices $(\overset{N}{D}_1(x), \overset{N}{\underline{D}}_1(x))$ and $(\overset{N}{D}_2(y), \overset{N}{\underline{D}}_2(y))$ are conjugate to each other (Theorem 4.2), so we also have the equalities:
$$\det\bigl( y \mathbf{1}_{d_2+1} - \overset{N}{D}_1(x) \bigr) = \det\bigl( y \mathbf{1}_{d_2+1} - \overset{N}{\underline{D}}_1(x) \bigr), \tag{1.33}$$
$$\det\bigl( x \mathbf{1}_{d_1+1} - \overset{N}{D}_2(y) \bigr) = \det\bigl( x \mathbf{1}_{d_1+1} - \overset{N}{\underline{D}}_2(y) \bigr). \tag{1.34}$$
Moreover, the transformations relating them are $x$ and $y$ independent, and are just those defined by the matrices entering in the generalized Christoffel–Darboux relations:
$$\overset{N}{A}\, \overset{N}{D}_1(x) = \overset{N}{\underline{D}}_1(x)\, \overset{N}{A}, \qquad \overset{N}{B}\, \overset{N}{D}_2(y) = \overset{N}{\underline{D}}_2(y)\, \overset{N}{B}. \tag{1.35}$$
These same matrices also relate the system of deformation equations, where they enter as gauge transformations depending on the deformation parameters $\{u_K, v_J\}$.
The key to proving all these results lies in noting (Theorem 4.1) that, as a consequence of the differential equations and recursion relations satisfied by the $\psi_n$’s, $\phi_n$’s and their Fourier–Laplace transforms, the following quantities are in fact independent of all the variables $N$, $x$, $y$, $\{u_K\}_{K=1,\dots,d_1+1}$ and $\{v_J\}_{J=1,\dots,d_2+1}$:
$$\tilde{f}_N(y) := \bigl( \overset{N-1}{\tilde{\underline{\Psi}}}(y),\, \overset{N}{B}\, \underset{N}{\tilde{\Phi}}(y) \bigr), \qquad \tilde{g}_N(x) := \bigl( \overset{N-1}{\tilde{\underline{\Phi}}}(x),\, \overset{N}{A}\, \underset{N}{\tilde{\Psi}}(x) \bigr), \tag{1.36}$$
where $\underset{N}{\tilde{\Psi}}(x)$, $\underset{N}{\tilde{\Phi}}(y)$, $\overset{N-1}{\tilde{\underline{\Psi}}}(y)$, $\overset{N-1}{\tilde{\underline{\Phi}}}(x)$ are any solutions of the full system of difference-differential-deformation equations. This allows us to conclude that there exist compatible sequences of fundamental solutions $\underset{N}{\Psi}(x)$ and $\underset{N}{\Phi}(y)$ of the above difference-differential-deformation equations, and the corresponding equations for the Fourier–Laplace transformed quantities $\overset{N-1}{\underline{\Phi}}$, $\overset{N-1}{\underline{\Psi}}$ such that
$$\bigl( \overset{N-1}{\underline{\Phi}},\, \overset{N}{A}\, \underset{N}{\Psi} \bigr) \equiv 1, \qquad \bigl( \overset{N-1}{\underline{\Psi}},\, \overset{N}{B}\, \underset{N}{\Phi} \bigr) \equiv 1, \tag{1.37}$$
for all values of $\{N, x, y, u_J, v_K\}$. This fact may be viewed as a form of the bilinear identities implying the existence of $\tau$-functions [48, 3]. The development of this relation, and its connection with the isomonodromic deformation equations, which requires a study of the formal asymptotics of the fundamental solutions near $x = \infty$ and $y = \infty$, will be left to a later work [6].
In the appendix, all the above results are extended to the sequences of multiorthogonal functions that replace the biorthogonal quasi-polynomials in the multi-dimensional case associated to the multi-matrix model discussed above.
2. Biorthogonal Polynomials
2.1. Biorthogonality measure. We first consider sequences of biorthogonal polynomials with respect to the measure arising in the study of the two-matrix model discussed in the introduction. Consideration of the recursion relations obtained by multiplication by or derivation with respect to the independent variable gives rise to representations of the Heisenberg relations (string equation) in terms of pairs of semi-infinite matrices. However, most of the results obtained here may also be shown valid in the fully infinite case.
Let us fix two polynomials, which we refer to as the “potentials”,
$$V_1(x) = \sum_{K=1}^{d_1+1} \frac{u_K}{K} x^K, \qquad V_2(y) = \sum_{J=1}^{d_2+1} \frac{v_J}{J} y^J. \tag{2.1}$$
The coupling constants are normalized in a convenient way so that the derivatives are
$$V_1'(x) = \sum_{K=1}^{d_1+1} u_K x^{K-1}. \tag{2.2}$$
We may define two sequences of mutually orthogonal monic polynomials $\pi_n(x)$, $\sigma_n(y)$ of degree $n$ such that
$$\int_{\Gamma_x} \int_{\tilde\Gamma_y} dx\, dy\, \pi_n(x) \sigma_m(y)\, e^{-V_1(x) - V_2(y) + xy} = h_n \delta_{mn}, \tag{2.3}$$
$$\pi_n(x) = x^n + \cdots, \qquad \sigma_n(y) = y^n + \cdots. \tag{2.4}$$
In order that the integrals be convergent, one should suitably define the two closed contours of integration $\Gamma_x$, $\tilde\Gamma_y$. If we require that these be the real axis, the degrees of the potentials must be even with the leading coefficient having positive real part. In applications of random matrices to string theory however, the integral is not convergent on the real axis, and the contour should approach $\infty$ in some appropriate Stokes sector in the complex plane.
It is more convenient to deal with the quasi-polynomials defined by
$$\psi_n(x) = \frac{1}{\sqrt{h_n}} \pi_n(x)\, e^{-V_1(x)}, \qquad \phi_n(y) = \frac{1}{\sqrt{h_n}} \sigma_n(y)\, e^{-V_2(y)}, \tag{2.5}$$
and their Fourier–Laplace transforms
$$\underline{\psi}_n(y) = \int_{\Gamma_x} dx\, e^{xy} \psi_n(x), \qquad \underline{\phi}_n(x) = \int_{\tilde\Gamma_y} dy\, e^{xy} \phi_n(y). \tag{2.6}$$
The choice of normalization is somewhat arbitrary. We have here chosen the most “symmetric” one, in which the leading coefficients in the various recursion relations are the same. In this notation the orthogonality relations take on a simpler form,
$$\int dx\, \psi_n(x) \underline{\phi}_m(x) = \int dy\, \underline{\psi}_n(y) \phi_m(y) = \int\!\!\int dx\, dy\, \psi_n(x) \phi_m(y)\, e^{xy} = \delta_{mn}. \tag{2.7}$$
(We suppress for the present the specification of the contour of integration.) We shall think of the spaces spanned by the $\psi_n(x)$’s and $\phi_n(y)$’s as infinite graded spaces in
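duality through the pairing in Eq. (2.7). It can easily be seen, using these relations and integration by parts, that multiplication of the $\psi_n$'s by $x$ produces a linear combination of $\psi_m$'s with $n - d_2 \le m \le n+1$ and multiplication of the $\phi_n$'s by $y$ produces a linear combination of $\phi_j$'s with $n - d_1 \le j \le n+1$. Moreover it is clear (through integration by parts) that multiplication of the $\psi_n$'s by $x$ is dual to application of $\partial_y$ to the $\phi_n$'s and vice-versa.

2.2. Recursion relations and generalized Christoffel–Darboux formulae. We denote by $Q$ and $P$ the semi-infinite matrices which implement multiplication and differentiation by $x$ on the space spanned by the $\psi_n$ quasi-polynomials. Introducing the semi-infinite column vectors
$$\underset{\infty}{\Psi} := [\psi_0, \dots, \psi_n, \dots]^t \quad\text{and}\quad \underset{\infty}{\Phi} := [\phi_0, \dots, \phi_n, \dots]^t, \tag{2.8}$$
the above remarks imply that
$$x\,\underset{\infty}{\Psi} = Q\,\underset{\infty}{\Psi}, \qquad \frac{\partial}{\partial x}\underset{\infty}{\Psi} = P\,\underset{\infty}{\Psi}, \qquad y\,\underset{\infty}{\Phi} = -P^t\,\underset{\infty}{\Phi}, \qquad \frac{\partial}{\partial y}\underset{\infty}{\Phi} = -Q^t\,\underset{\infty}{\Phi}, \tag{2.9}$$
where $P$ and $Q$ are semi-infinite matrices of the form¹
$$Q := \left[\begin{array}{ccccccc}
\alpha_0(0) & \gamma(0) & 0 & \cdots & \cdots & \cdots & \cdots\\
\alpha_1(1) & \alpha_0(1) & \gamma(1) & 0 & \cdots & \cdots & \cdots\\
\alpha_2(2) & \alpha_1(2) & \alpha_0(2) & \gamma(2) & 0 & \cdots & \cdots\\
\vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \\
\alpha_{d_2}(d_2) & \alpha_{d_2-1}(d_2) & \cdots & \alpha_0(d_2) & \gamma(d_2) & 0 & \cdots\\
0 & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots
\end{array}\right), \tag{2.10}$$
$$-P := \left[\begin{array}{ccccccc}
\beta_0(0) & \beta_1(1) & \cdots & \beta_{d_1}(d_1) & 0 & \cdots & \cdots\\
\gamma(0) & \beta_0(1) & \beta_1(2) & \cdots & \beta_{d_1}(d_1+1) & 0 & \cdots\\
0 & \gamma(1) & \beta_0(2) & \ddots & \ddots & \ddots & \ddots\\
0 & 0 & \gamma(2) & \beta_0(3) & \ddots & \ddots & \ddots\\
\vdots & \ddots & \ddots & \ddots & \ddots & \ddots & \ddots
\end{array}\right), \tag{2.11}$$
satisfying the string equation,
$$[P, Q] = 1. \tag{2.12}$$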
The fact that both matrices have a finite band size as indicated in (2.10), (2.11) follows from the fact that they are related polynomially to each other through the potentials $V_1$ and $V_2$ as follows.

¹ In what follows a round bracket on the right of a matrix signifies that the matrix extends to infinity to the right and below.
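Lemma 2.1. The two matrices $P$ and $Q$ satisfy the following relations:
$$\left(P + V_1'(Q)\right)_{\ge 0} = 0, \tag{2.13}$$
$$\left(-Q^t + V_2'(-P^t)\right)_{\ge 0} = 0, \tag{2.14}$$
where the subscript $\ge 0$ means the part above the main diagonal (main diagonal included).

Proof. It is obvious that
$$\left(Q\,\underset{\infty}{\Psi}\right)_n = x\,\psi_n(x) = x\,\frac{1}{\sqrt{h_n}}\,\pi_n(x)\,e^{-V_1(x)} = \sqrt{\frac{h_{n+1}}{h_n}}\;\psi_{n+1}(x) + \text{lower terms}, \tag{2.15}$$
$$\left(\left(P + V_1'(Q)\right)\underset{\infty}{\Psi}\right)_n = \left(\frac{\partial}{\partial x} + V_1'(x)\right)\psi_n(x) = \frac{1}{\sqrt{h_n}}\,\pi_n'(x)\,e^{-V_1(x)} = n\,\sqrt{\frac{h_{n-1}}{h_n}}\;\psi_{n-1}(x) + \text{lower terms}. \tag{2.16}$$
Equation (2.15) means that $Q$ has only one diagonal above the main diagonal with entries given by
$$\gamma(n) = \sqrt{h_{n+1}/h_n}. \tag{2.17}$$
Equation (2.16) then implies that Eq. (2.13) holds and $P$ has $d_1$ diagonals above the main one. Repeating the argument for the $\phi_n$ quasi-polynomials similarly shows that $-P^t$ (i.e. multiplication by $y$) has one diagonal above the main one, Eq. (2.14) holds and $-Q^t$ (i.e. differentiation by $y$) has $d_2$ upper diagonals. This proves the lemma and also shows that $P$ and $Q$ are of the finite band sizes indicated in (2.10), (2.11).

Equations (2.9) are just equivalent to the following set of recursion relations:
$$\left(Q\,\underset{\infty}{\Psi}\right)_n = x\,\psi_n(x) = \gamma(n)\,\psi_{n+1}(x) + \sum_{j=0}^{d_2} \alpha_j(n)\,\psi_{n-j}(x), \tag{2.18}$$
$$\left(P\,\underset{\infty}{\Psi}\right)_n = \frac{\partial}{\partial x}\psi_n(x) = -\gamma(n-1)\,\psi_{n-1}(x) - \sum_{j=0}^{d_1} \beta_j(n+j)\,\psi_{n+j}(x), \tag{2.19}$$
$$\left(-P^t\,\underset{\infty}{\Phi}\right)_n = y\,\phi_n(y) = \gamma(n)\,\phi_{n+1}(y) + \sum_{j=0}^{d_1} \beta_j(n)\,\phi_{n-j}(y), \tag{2.20}$$
$$\left(-Q^t\,\underset{\infty}{\Phi}\right)_n = \frac{\partial}{\partial y}\phi_n(y) = -\gamma(n-1)\,\phi_{n-1}(y) - \sum_{j=0}^{d_2} \alpha_j(n+j)\,\phi_{n+j}(y). \tag{2.21}$$
In the following we also define
$$\alpha_{-1}(n) := \gamma(n) =: \beta_{-1}(n), \qquad \alpha_j(n) := 0\ \ \forall j \notin [-1, d_2], \qquad \beta_j(n) := 0\ \ \forall j \notin [-1, d_1]. \tag{2.22}$$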
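As a concrete illustration of (2.3) and (2.17), the following sketch computes the normalization factors $h_n$ exactly in the simplest case of quadratic potentials $V_1 = u x^2/2$, $V_2 = v y^2/2$ on the real axis (so $d_1 = d_2 = 1$, with $u, v > 0$ and $uv > 1$ for convergence). It is not taken from the text: it assumes the standard fact that the $h_n$ are the pivots of the bimoment matrix under elimination without row exchanges, and it uses normalized Gaussian moments, which leaves the ratios $h_{n+1}/h_n = \gamma(n)^2$ unchanged. The function names are hypothetical.

```python
from fractions import Fraction

def gaussian_bimoments(u, v, size):
    """Normalized bimoments M[i][j] = E[x^i y^j] for the weight
    exp(-u x^2/2 - v y^2/2 + x y) on the real plane (needs u, v > 0, u*v > 1)."""
    u, v = Fraction(u), Fraction(v)
    det = u * v - 1
    sxx, sxy, syy = v / det, 1 / det, u / det   # covariance matrix entries
    M = [[Fraction(0)] * size for _ in range(size)]
    M[0][0] = Fraction(1)
    for j in range(2, size):                     # pure y-moments
        M[0][j] = (j - 1) * syy * M[0][j - 2]
    for i in range(1, size):                     # Gaussian integration by parts in x
        for j in range(size):
            term = Fraction(0)
            if i >= 2:
                term += (i - 1) * sxx * M[i - 2][j]
            if j >= 1:
                term += j * sxy * M[i - 1][j - 1]
            M[i][j] = term
    return M

def norms(M):
    """Pivots of M under elimination without row swaps; these play the role of h_n
    (up to the overall normalization of the measure)."""
    A = [row[:] for row in M]
    n = len(A)
    h = []
    for k in range(n):
        p = A[k][k]
        h.append(p)
        for i in range(k + 1, n):
            f = A[i][k] / p
            for j in range(k, n):
                A[i][j] -= f * A[k][j]
    return h

def gamma_squared(u, v, count):
    """gamma(n)^2 = h_{n+1} / h_n for n = 0 .. count-1, as in (2.17)."""
    h = norms(gaussian_bimoments(u, v, count + 1))
    return [h[n + 1] / h[n] for n in range(count)]
```

In this Gaussian case the ratios come out as $\gamma(n)^2 = (n+1)/(uv-1)$, which can be checked by calling `gamma_squared(2, 2, 4)`.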
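Defining similarly the semi-infinite row vectors consisting of the Fourier–Laplace transformed functions
$$\underset{\infty}{\underline{\Psi}} := [\underline{\psi}_0, \dots, \underline{\psi}_n, \dots] \quad\text{and}\quad \underset{\infty}{\underline{\Phi}} := [\underline{\phi}_0, \dots, \underline{\phi}_n, \dots], \tag{2.23}$$
it follows from the dual pairing (2.7), and integration by parts that:

Lemma 2.2.
$$\left(-\underset{\infty}{\underline{\Psi}}\,P^t\right)_n = y\,\underline{\psi}_n(y) = \sum_{j=-1}^{d_1} \beta_j(n+j)\,\underline{\psi}_{n+j}(y), \tag{2.24}$$
$$\left(\underset{\infty}{\underline{\Psi}}\,Q^t\right)_n = \partial_y\,\underline{\psi}_n(y) = \sum_{l=-1}^{d_2} \alpha_l(n)\,\underline{\psi}_{n-l}(y), \tag{2.25}$$
$$\left(\underset{\infty}{\underline{\Phi}}\,Q\right)_n = x\,\underline{\phi}_n(x) = \sum_{l=-1}^{d_2} \alpha_l(n+l)\,\underline{\phi}_{n+l}(x), \tag{2.26}$$
$$\left(-\underset{\infty}{\underline{\Phi}}\,P\right)_n = \partial_x\,\underline{\phi}_n(x) = \sum_{j=-1}^{d_1} \beta_j(n)\,\underline{\phi}_{n-j}(x). \tag{2.27}$$
Now, we introduce the semi-infinite shift matrices
$$\Lambda := \left[\begin{array}{cccccc}
0 & 1 & 0 & 0 & 0 & \cdots\\
0 & 0 & 1 & 0 & 0 & \cdots\\
0 & 0 & 0 & 1 & 0 & \cdots\\
0 & 0 & 0 & 0 & 1 & \cdots\\
0 & 0 & 0 & 0 & 0 & \ddots
\end{array}\right), \qquad
\Lambda^{-1} := \Lambda^t = \left[\begin{array}{cccccc}
0 & 0 & 0 & 0 & 0 & \cdots\\
1 & 0 & 0 & 0 & 0 & \cdots\\
0 & 1 & 0 & 0 & 0 & \cdots\\
0 & 0 & 1 & 0 & 0 & \cdots\\
0 & 0 & 0 & \ddots & 0 & \cdots
\end{array}\right). \tag{2.28}$$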
(The notation $\Lambda^{-1}$ is a convenient shorthand for the transpose $\Lambda^t$, but only signifies that $\Lambda^{-1}$ is the right inverse of $\Lambda$. It leads to the abbreviated form $\Lambda^{-j} := (\Lambda^t)^j$.) Introducing the diagonal semi-infinite matrices
$$\alpha_j := \mathrm{diag}(\alpha_j(0), \alpha_j(1), \dots), \qquad \beta_j := \mathrm{diag}(\beta_j(0), \beta_j(1), \dots), \tag{2.29}$$
(where we set $\alpha_j(n) := 0$ and $\beta_j(n) := 0$ when $n < j$), Eqs. (2.10), (2.11) can be concisely written as
$$Q := \sum_{j=-1}^{d_2} \Lambda^{-j}\,\alpha_j; \qquad -P := \sum_{j=-1}^{d_1} \Lambda^{j}\,\beta_j. \tag{2.30}$$
The commutation relation in Eq. (2.12) gives in particular the following quadratic relations between the coefficients $\{\alpha_j, \beta_k\}$,
$$\sum_{j=l}^{d_1} \beta_j(n+j)\,\alpha_{j-l-1}(n+j) = \sum_{j=l}^{d_1} \alpha_{j-l-1}(n)\,\beta_j(n+l+1), \qquad \forall n,\ \forall l \in [0, d_1], \tag{2.31}$$
$$\sum_{k=l}^{d_2} \beta_{k-l-1}(n+k)\,\alpha_k(n+k) = \sum_{k=l}^{d_2} \alpha_k(n+l+1)\,\beta_{k-l-1}(n), \qquad \forall n,\ \forall l \in [0, d_2]. \tag{2.32}$$
Our first objective is to define a set of closed differential and difference systems for vectors consisting of finite sequences of the functions $\{\psi_n\}$, $\{\phi_n\}$ and their Fourier–Laplace transforms, equivalent to the systems (2.18)–(2.21), and to study their properties and relations. Consider first the sequence of functions $\{\psi_n(x)\}$. For these we have a multiplicative recursion relation defined by the coefficients $\{\alpha_j(n)\}$ and a differential recursion relation defined by the coefficients $\{\beta_j(n)\}$. We shall show presently that we can define closed systems of first order linear ODE's for any consecutive sequence of $d_2+1$ functions $(\psi_{N-d_2}, \dots, \psi_N)$ (or $(\underline{\phi}_{N-1}, \dots, \underline{\phi}_{N+d_2-1})$) with coefficients that are polynomials in $x$. Similar systems can be constructed for any sequence $(\phi_{N-d_1}, \dots, \phi_N)$ (or $(\underline{\psi}_{N-1}, \dots, \underline{\psi}_{N+d_1-1})$). We introduce the following definitions and notations.

Definition 2.1. A window of size $d_1+1$ or $d_2+1$ is any subset of $d_1+1$ or $d_2+1$ consecutive elements of type $\psi_n$, $\underline{\phi}_n$, $\phi_n$ or $\underline{\psi}_n$, with the notations
$$\underset{N}{\Psi} := [\psi_{N-d_2}, \dots, \psi_N]^t,\ \ N \ge d_2, \qquad \underset{N}{\Phi} := [\phi_{N-d_1}, \dots, \phi_N]^t,\ \ N \ge d_1, \tag{2.33}$$
$$\overset{N}{\Psi} := [\psi_N, \dots, \psi_{N+d_1}]^t,\ \ N \ge 0, \qquad \overset{N}{\Phi} := [\phi_N, \dots, \phi_{N+d_2}]^t,\ \ N \ge 0, \tag{2.34}$$
$$\underset{N}{\underline{\Psi}} := [\underline{\psi}_{N-d_2}, \dots, \underline{\psi}_N],\ \ N \ge d_2, \qquad \underset{N}{\underline{\Phi}} := [\underline{\phi}_{N-d_1}, \dots, \underline{\phi}_N],\ \ N \ge d_1, \tag{2.35}$$
$$\overset{N}{\underline{\Psi}} := [\underline{\psi}_N, \dots, \underline{\psi}_{N+d_1}],\ \ N \ge 0, \qquad \overset{N}{\underline{\Phi}} := [\underline{\phi}_N, \dots, \underline{\phi}_{N+d_2}],\ \ N \ge 0. \tag{2.36}$$
Notice the difference in the positioning of the windows for the vectors constructed from the $\psi_n$'s and the $\phi_n$'s, and the fact that the barred quantities are defined to be row vectors while the unbarred ones are column vectors.

Definition 2.2. For any $N$ for which these are defined, the pairs of windows $(\underset{N}{\Psi}, \overset{N-1}{\underline{\Phi}})$ as well as $(\underset{N}{\Phi}, \overset{N-1}{\underline{\Psi}})$, of dimensions $d_2+1$ and $d_1+1$, respectively, will be called dual windows.

The reason for identifying these particular windows as dual will appear in the sequel. Let us now consider the kernels
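$$\overset{N}{K}_{11}(x, x') := \sum_{n=0}^{N-1} \psi_n(x)\,\underline{\phi}_n(x'), \qquad \overset{N}{K}_{22}(y', y) := \sum_{n=0}^{N-1} \underline{\psi}_n(y')\,\phi_n(y), \tag{2.37}$$
$$\overset{N}{K}_{12}(x, y) := \sum_{n=0}^{N-1} \psi_n(x)\,\phi_n(y), \qquad \overset{N}{K}_{21}(y', x') := \sum_{n=0}^{N-1} \underline{\psi}_n(y')\,\underline{\phi}_n(x'), \tag{2.38}$$
that appear in the computation of correlation functions for 2-matrix models [22].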
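Define the following pair of matrices, which will play an important rôle in what follows:
$$\overset{N}{A} := \begin{pmatrix}
0 & 0 & \cdots & 0 & -\gamma(N-1)\\
\alpha_{d_2}(N) & \cdots & \alpha_2(N) & \alpha_1(N) & 0\\
0 & \alpha_{d_2}(N+1) & \cdots & \alpha_2(N+1) & 0\\
0 & 0 & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \alpha_{d_2}(N+d_2-1) & 0
\end{pmatrix}; \tag{2.39}$$
$$\overset{N}{B} := \begin{pmatrix}
0 & 0 & \cdots & 0 & -\gamma(N-1)\\
\beta_{d_1}(N) & \cdots & \beta_2(N) & \beta_1(N) & 0\\
0 & \beta_{d_1}(N+1) & \cdots & \beta_2(N+1) & 0\\
0 & 0 & \ddots & \vdots & \vdots\\
0 & 0 & 0 & \beta_{d_1}(N+d_1-1) & 0
\end{pmatrix}. \tag{2.40}$$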
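For any $N$, the recursion relations (2.18), (2.20), (2.24), (2.26) and the differential relations (2.19), (2.21), (2.25), (2.27) imply that the following generalized Christoffel–Darboux formulae, as well as their “differential” analogs, are satisfied.

Proposition 2.1. Generalized Christoffel–Darboux relations:
$$(x - x')\,\overset{N}{K}_{11}(x, x') = \gamma(N-1)\,\psi_N(x)\,\underline{\phi}_{N-1}(x') - \sum_{j=1}^{d_2}\sum_{k=0}^{j-1} \alpha_j(N+k)\,\underline{\phi}_{N+k}(x')\,\psi_{N+k-j}(x) = -\left(\overset{N-1}{\underline{\Phi}}(x'),\ \overset{N}{A}\,\underset{N}{\Psi}(x)\right), \tag{2.41}$$
$$(y' - y)\,\overset{N}{K}_{22}(y', y) = -\gamma(N-1)\,\underline{\psi}_{N-1}(y')\,\phi_N(y) + \sum_{j=1}^{d_1}\sum_{k=0}^{j-1} \beta_j(N+k)\,\underline{\psi}_{N+k}(y')\,\phi_{N-j+k}(y) = \left(\overset{N-1}{\underline{\Psi}}(y'),\ \overset{N}{B}\,\underset{N}{\Phi}(y)\right). \tag{2.42}$$
“Differential” generalized Christoffel–Darboux relations:
$$(\partial_{x'} + \partial_x)\,\overset{N}{K}_{11}(x, x') = -\left(\underset{N}{\underline{\Phi}}(x'),\ (\overset{N}{B})^t\,\overset{N-1}{\Psi}(x)\right), \tag{2.43}$$
$$(\partial_{y'} + \partial_y)\,\overset{N}{K}_{22}(y', y) = -\left(\underset{N}{\underline{\Psi}}(y'),\ (\overset{N}{A})^t\,\overset{N-1}{\Phi}(y)\right). \tag{2.44}$$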
Proof. Use the relations (2.18)–(2.21), (2.24)–(2.27) and simplify the telescopic sums by cancellation of common terms. Although it will not be needed in the remainder of this paper, for the sake of completeness, we also include the following analogous result for the kernels K12 and K21 , which may be similarly derived. It is related to the above by applying Fourier–Laplace transforms with respect to one of the variables.
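The telescoping argument behind (2.41) is purely algebraic, so it can be checked on arbitrary data. The sketch below is an illustration rather than part of the proof: it draws random rational coefficients $\alpha_l(n)$ (with $\alpha_{-1} = \gamma$), builds one sequence from the recursion (2.18) and another from (2.26) with free initial values, and returns the difference between the two sides of (2.41) in exact arithmetic. It assumes $d_2 \ge 1$; all names are hypothetical.

```python
from fractions import Fraction
import random

def christoffel_darboux_gap(d2=2, N=5, seed=1):
    """Return LHS - RHS of (2.41) for random rational data (expected: exactly 0)."""
    rng = random.Random(seed)
    rnd = lambda: Fraction(rng.randint(1, 9), rng.randint(1, 9))
    top = N + d2 + 1
    # alpha[l][n] stands for alpha_l(n), l = -1..d2; alpha[-1] plays the role of gamma
    alpha = {l: [rnd() for _ in range(top + 1)] for l in range(-1, d2 + 1)}
    x, xp = rnd(), -rnd()

    # psi_n(x) from the recursion (2.18), with psi_n = 0 for n < 0
    psi = {n: Fraction(0) for n in range(-d2 - 1, 0)}
    psi[0] = Fraction(1)
    for n in range(N):
        s = sum(alpha[l][n] * psi[n - l] for l in range(0, d2 + 1))
        psi[n + 1] = (x * psi[n] - s) / alpha[-1][n]

    # underlined phi_n(x') from (2.26), solved for the highest index; phi_{-1} = 0
    phi = {-1: Fraction(0)}
    for n in range(d2):
        phi[n] = rnd()                       # free initial data
    for n in range(N):
        s = alpha[-1][n - 1] * phi[n - 1] if n >= 1 else Fraction(0)
        s += sum(alpha[l][n + l] * phi[n + l] for l in range(0, d2))
        phi[n + d2] = (xp * phi[n] - s) / alpha[d2][n + d2]

    lhs = (x - xp) * sum(psi[n] * phi[n] for n in range(N))
    rhs = alpha[-1][N - 1] * psi[N] * phi[N - 1]
    for j in range(1, d2 + 1):
        for k in range(j):
            rhs -= alpha[j][N + k] * phi[N + k] * psi[N + k - j]
    return lhs - rhs
```

Because only the two recursions enter, the same check passes for any choice of seed, window position `N >= 1` and band width `d2 >= 1`.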
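Proposition 2.2.
$$(x + \partial_y)\,\overset{N}{K}_{12}(x, y) = -\left(\overset{N-1}{\Phi}{}^{t}(y),\ \overset{N}{A}\,\underset{N}{\Psi}(x)\right), \tag{2.45}$$
$$(y + \partial_x)\,\overset{N}{K}_{12}(x, y) = -\left(\overset{N-1}{\Psi}{}^{t}(x),\ \overset{N}{B}\,\underset{N}{\Phi}(y)\right), \tag{2.46}$$
$$(y' - \partial_{x'})\,\overset{N}{K}_{21}(y', x') = \left(\overset{N-1}{\underline{\Psi}}(y'),\ \overset{N}{B}\,\underset{N}{\underline{\Phi}}{}^{t}(x')\right), \tag{2.47}$$
$$(\partial_{y'} - x')\,\overset{N}{K}_{21}(y', x') = -\left(\overset{N-1}{\underline{\Phi}}(x'),\ \overset{N}{A}\,\underset{N}{\underline{\Psi}}{}^{t}(y')\right). \tag{2.48}$$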
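2.3. Folding. We now introduce the sequence of companion-like matrices $\underset{N}{a}(x)$ and $\underset{N}{b}(y)$ of sizes $d_2+1$ and $d_1+1$, respectively,
$$\underset{N}{a}(x) := \begin{pmatrix}
0 & 1 & 0 & \cdots & 0\\
0 & 0 & 1 & \cdots & 0\\
\vdots & & & \ddots & \vdots\\
0 & 0 & \cdots & 0 & 1\\
\dfrac{-\alpha_{d_2}(N)}{\gamma(N)} & \cdots & \cdots & \dfrac{-\alpha_1(N)}{\gamma(N)} & \dfrac{x - \alpha_0(N)}{\gamma(N)}
\end{pmatrix}, \qquad N \ge d_2, \tag{2.49}$$
$$\underset{N}{b}(y) := \begin{pmatrix}
0 & 1 & 0 & \cdots & 0\\
0 & 0 & 1 & \cdots & 0\\
\vdots & & & \ddots & \vdots\\
0 & 0 & \cdots & 0 & 1\\
\dfrac{-\beta_{d_1}(N)}{\gamma(N)} & \cdots & \cdots & \dfrac{-\beta_1(N)}{\gamma(N)} & \dfrac{y - \beta_0(N)}{\gamma(N)}
\end{pmatrix}, \qquad N \ge d_1. \tag{2.50}$$
We then have the following:

Lemma 2.3. The sequences of matrices $\underset{N}{a}$, $\underset{N}{b}$ implement the shift $N \to N+1$ in the windows of quasi-polynomials in the sense that
$$\underset{N}{a}\,\underset{N}{\Psi}(x) = \underset{N+1}{\Psi}(x), \qquad \underset{N}{b}\,\underset{N}{\Phi}(y) = \underset{N+1}{\Phi}(y), \tag{2.51}$$
and in general
$$\underset{N+j}{\Psi} = \underset{N+j-1}{a} \cdots \underset{N}{a}\;\underset{N}{\Psi}, \qquad \underset{N+j}{\Phi} = \underset{N+j-1}{b} \cdots \underset{N}{b}\;\underset{N}{\Phi}. \tag{2.52}$$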
Proof. This is nothing but a matricial form of the sequence of recursion relations (2.18), (2.20) expressing the higher order polynomials as linear combinations of a fixed subset with polynomial coefficients. We will refer to this process of expressing any ψn (x) by means of linear combinations of elements in a specific window with polynomial coefficients as folding onto the specified window.
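The folding step can be made concrete with a short numerical sketch. It is only an illustration under stated assumptions: the coefficients $\alpha_j(n)$, $\gamma(n)$ are random rationals rather than those of an actual two-matrix model, the sequence is generated from the recursion (2.18) with $\psi_n = 0$ for $n < 0$, and $d_2 \ge 1$. The code checks that the companion-like matrix of (2.49) maps the window at $N$ to the window at $N+1$, and that its determinant equals $(-1)^{d_2+1}\alpha_{d_2}(N)/\gamma(N)$. Function names are hypothetical.

```python
from fractions import Fraction
import random

def det(M):
    """Exact determinant by Gaussian elimination with row swaps."""
    A = [row[:] for row in M]
    n, sign, out = len(A), 1, Fraction(1)
    for k in range(n):
        piv = next((r for r in range(k, n) if A[r][k] != 0), None)
        if piv is None:
            return Fraction(0)
        if piv != k:
            A[k], A[piv] = A[piv], A[k]
            sign = -sign
        out *= A[k][k]
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]
            for c in range(k, n):
                A[r][c] -= f * A[k][c]
    return sign * out

def companion(alpha, gamma, N, x):
    """The matrix a_N(x) of (2.49); alpha[j][n] = alpha_j(n), gamma[n] = gamma(n)."""
    d2 = len(alpha) - 1
    size = d2 + 1
    a = [[Fraction(0)] * size for _ in range(size)]
    for r in range(size - 1):
        a[r][r + 1] = Fraction(1)             # ones on the superdiagonal
    for c in range(size):
        a[-1][c] = -alpha[d2 - c][N] / gamma[N]
    a[-1][-1] += x / gamma[N]                 # last entry is (x - alpha_0(N)) / gamma(N)
    return a

def folding_check(d2=2, N=4, seed=0):
    """Return (shift_ok, det_ok) for random rational recursion coefficients."""
    rng = random.Random(seed)
    rnd = lambda: Fraction(rng.randint(1, 9), rng.randint(1, 9))
    alpha = [[rnd() for _ in range(N + 2)] for _ in range(d2 + 1)]
    gamma = [rnd() for _ in range(N + 2)]
    x = rnd()
    psi = {n: Fraction(0) for n in range(-d2, 0)}
    psi[0] = Fraction(1)
    for n in range(N + 1):                    # recursion (2.18) solved for psi_{n+1}
        s = sum(alpha[j][n] * psi[n - j] for j in range(d2 + 1))
        psi[n + 1] = (x * psi[n] - s) / gamma[n]
    window = lambda M: [psi[k] for k in range(M - d2, M + 1)]
    a = companion(alpha, gamma, N, x)
    w = window(N)
    shifted = [sum(a[r][c] * w[c] for c in range(d2 + 1)) for r in range(d2 + 1)]
    det_expected = (-1) ** (d2 + 1) * alpha[d2][N] / gamma[N]
    return shifted == window(N + 1), det(a) == det_expected
```

Calling `folding_check()` with different `d2`, `N` and `seed` exercises the same identity at other window positions.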
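The determinants of the matrices $\underset{N}{a}$ and $\underset{N}{b}$ are easily computed to be
$$\det(\underset{N}{a}) = (-1)^{d_2+1}\,\alpha_{d_2}(N)/\gamma(N), \qquad \det(\underset{N}{b}) = (-1)^{d_1+1}\,\beta_{d_1}(N)/\gamma(N). \tag{2.53}$$
From Eqs. (2.13) and (2.14) we find the relations
$$\alpha_{d_2}(N) = v_{d_2+1} \prod_{j=1}^{d_2} \gamma(N-j); \qquad \beta_{d_1}(N) = u_{d_1+1} \prod_{j=1}^{d_1} \gamma(N-j). \tag{2.54}$$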
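Since the coefficients $\gamma(N)$ are the square roots of the ratios of normalization factors, they cannot vanish for any $N$, and neither can $\alpha_{d_2}(N)$ or $\beta_{d_1}(N)$, since the deformation parameters $u_{d_1+1}$, $v_{d_2+1}$ are the leading coefficients of the polynomials $V_1(x)$, $V_2(y)$ and hence also may not vanish. It follows that the matrices $\underset{N}{a}$ and $\underset{N}{b}$ are all invertible. We denote their inverses as follows:
$$\overset{N}{a} := \left(\underset{N}{a}\right)^{-1} = \begin{pmatrix}
\dfrac{-\alpha_{d_2-1}(N)}{\alpha_{d_2}(N)} & \cdots & \dfrac{x - \alpha_0(N)}{\alpha_{d_2}(N)} & \dfrac{-\gamma(N)}{\alpha_{d_2}(N)}\\
1 & 0 & \cdots & 0\\
0 & \ddots & 0 & 0\\
0 & 0 & 1 & 0
\end{pmatrix}, \tag{2.55}$$
$$\overset{N}{b} := \left(\underset{N}{b}\right)^{-1} = \begin{pmatrix}
\dfrac{-\beta_{d_1-1}(N)}{\beta_{d_1}(N)} & \cdots & \dfrac{y - \beta_0(N)}{\beta_{d_1}(N)} & \dfrac{-\gamma(N)}{\beta_{d_1}(N)}\\
1 & 0 & \cdots & 0\\
0 & \ddots & 0 & 0\\
0 & 0 & 1 & 0
\end{pmatrix}. \tag{2.56}$$
The shifts $N \to N-1$ are thus implemented by the inverse matrices $\overset{N}{a}$, $\overset{N}{b}$, and the folding may take place in either direction with respect to polynomial degrees.

2.4. Folded linear differential systems. We now define the following sequences of finite diagonal matrices:
$$\overset{N}{\alpha_j} := \mathrm{diag}\left(\alpha_j(N+j-d_1),\ \alpha_j(N+j-d_1+1),\ \dots,\ \alpha_j(N+j)\right), \qquad j = -1, \dots, d_2, \tag{2.57}$$
$$\overset{N}{\beta_j} := \mathrm{diag}\left(\beta_j(N+j-d_2),\ \beta_j(N+j-d_2+1),\ \dots,\ \beta_j(N+j)\right), \qquad j = -1, \dots, d_1. \tag{2.58}$$
Recall that $\alpha_{-1}(n) = \gamma(n) = \beta_{-1}(n)$ by our previous conventions, but the diagonal matrices $\overset{N}{\alpha_{-1}}$ and $\overset{N}{\beta_{-1}}$ differ in dimensions. When the context leaves no doubt as to the dimension we will write them as
$$\overset{N}{\alpha_{-1}} = \overset{N}{\gamma} := \mathrm{diag}\left(\gamma(N-d_1-1),\ \dots,\ \gamma(N-1)\right) \tag{2.59}$$
or
$$\overset{N}{\beta_{-1}} = \overset{N}{\gamma} := \mathrm{diag}\left(\gamma(N-d_2-1),\ \gamma(N-d_2),\ \dots,\ \gamma(N-1)\right). \tag{2.60}$$
In either case, we denote the inverse matrix as
$$\underset{N}{\gamma} := \left(\overset{N}{\gamma}\right)^{-1}. \tag{2.61}$$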
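The explicit inverse in (2.55) is easy to sanity-check for a single $N$. The sketch below is an illustration only: it takes the numbers $\alpha_0(N), \dots, \alpha_{d_2}(N)$ and $\gamma(N)$ as plain rational inputs (hypothetical values, with $d_2 \ge 1$ and $\alpha_{d_2}(N), \gamma(N) \ne 0$), builds both the companion-like matrix of (2.49) and the claimed inverse, and multiplies them.

```python
from fractions import Fraction

def companion_and_inverse(alpha, gamma, x):
    """alpha = [alpha_0(N), ..., alpha_{d2}(N)], gamma = gamma(N), with d2 >= 1.
    Returns (a, inv) following the layouts of (2.49) and (2.55)."""
    alpha = [Fraction(t) for t in alpha]
    gamma, x = Fraction(gamma), Fraction(x)
    d2 = len(alpha) - 1
    n = d2 + 1
    a = [[Fraction(0)] * n for _ in range(n)]
    inv = [[Fraction(0)] * n for _ in range(n)]
    for r in range(d2):
        a[r][r + 1] = Fraction(1)         # superdiagonal of a
        inv[r + 1][r] = Fraction(1)       # subdiagonal of the inverse
    for c in range(n):
        a[d2][c] = -alpha[d2 - c] / gamma
    a[d2][d2] += x / gamma
    for c in range(d2):
        inv[0][c] = -alpha[d2 - 1 - c] / alpha[d2]
    inv[0][d2 - 1] += x / alpha[d2]
    inv[0][d2] = -gamma / alpha[d2]
    return a, inv

def matmul(A, B):
    """Plain matrix product for lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def is_identity(M):
    return all(M[i][j] == (1 if i == j else 0)
               for i in range(len(M)) for j in range(len(M)))
```

For example, `is_identity(matmul(*companion_and_inverse([2, 3, 5], 7, 11)))` tests the case $d_2 = 2$.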
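We can now give the closed differential systems referred to previously.

Lemma 2.4. The windows of quasi-polynomials $\underset{N}{\Psi}$, $\underset{N}{\Phi}$ satisfy the following differential systems
$$\frac{\partial}{\partial x}\underset{N}{\Psi} = -\overset{N}{D_1}(x)\,\underset{N}{\Psi}, \qquad N \ge d_2+1, \tag{2.62}$$
$$\frac{\partial}{\partial y}\underset{N}{\Phi} = -\overset{N}{D_2}(y)\,\underset{N}{\Phi}, \qquad N \ge d_1+1, \tag{2.63}$$
where
$$\overset{N}{D_1}(x) := \overset{N}{\gamma}\,\overset{N-1}{a} + \overset{N}{\beta_0} + \sum_{j=1}^{d_1} \overset{N}{\beta_j}\;\underset{N+j-1}{a}\,\underset{N+j-2}{a} \cdots \underset{N}{a} \ \in \mathfrak{gl}_{d_2+1}[x], \tag{2.64}$$
$$\overset{N}{D_2}(y) := \overset{N}{\gamma}\,\overset{N-1}{b} + \overset{N}{\alpha_0} + \sum_{j=1}^{d_2} \overset{N}{\alpha_j}\;\underset{N+j-1}{b}\,\underset{N+j-2}{b} \cdots \underset{N}{b} \ \in \mathfrak{gl}_{d_1+1}[y]. \tag{2.65}$$
These, taken together for all $N$, are equivalent to the relations (2.19), (2.21) when the recursion relations (2.51), (2.52) are taken into account.

Proof. Consider the case of the $\psi_n$'s. The differential relations (2.19) may be written, by stacking them in a window of size $d_2+1$, as follows:
$$\frac{\partial}{\partial x}\underset{N}{\Psi} = -\overset{N}{\gamma}\,\underset{N-1}{\Psi} - \sum_{j=0}^{d_1} \overset{N}{\beta_j}\,\underset{N+j}{\Psi}. \tag{2.66}$$
Using the folding relations Eq. (2.52), we immediately obtain (2.62), (2.64). The same procedure applied to Eq. (2.21) yields (2.63), (2.65).

We can repeat a similar procedure for the sequences $\{\underline{\psi}_n(y)\}_{n\in\mathbb{N}}$ and $\{\underline{\phi}_n(x)\}_{n\in\mathbb{N}}$. The corresponding windows are represented as row vectors $\overset{N}{\underline{\Psi}}$ and $\overset{N}{\underline{\Phi}}$ since their components are naturally dual to the $\phi_n$'s and $\psi_n$'s respectively. The matrices defining the relevant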
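folding are now
$$\overset{N}{\underline{a}} := \begin{pmatrix}
\dfrac{x - \alpha_0(N)}{\gamma(N-1)} & 1 & 0 & \cdots & 0\\
\dfrac{-\alpha_1(N+1)}{\gamma(N-1)} & 0 & 1 & \cdots & 0\\
\vdots & \vdots & & \ddots & \vdots\\
\vdots & 0 & 0 & \cdots & 1\\
\dfrac{-\alpha_{d_2}(N+d_2)}{\gamma(N-1)} & 0 & 0 & \cdots & 0
\end{pmatrix}, \tag{2.67}$$
$$\overset{N}{\underline{b}} := \begin{pmatrix}
\dfrac{y - \beta_0(N)}{\gamma(N-1)} & 1 & 0 & \cdots & 0\\
\dfrac{-\beta_1(N+1)}{\gamma(N-1)} & 0 & 1 & \cdots & 0\\
\vdots & \vdots & & \ddots & \vdots\\
\vdots & 0 & 0 & \cdots & 1\\
\dfrac{-\beta_{d_1}(N+d_1)}{\gamma(N-1)} & 0 & 0 & \cdots & 0
\end{pmatrix}, \tag{2.68}$$
and we again denote their inverses as
$$\underset{N}{\underline{a}} := \left(\overset{N}{\underline{a}}\right)^{-1}, \qquad \underset{N}{\underline{b}} := \left(\overset{N}{\underline{b}}\right)^{-1}. \tag{2.69}$$
As previously, we now have:

Lemma 2.5. The sequences of matrices $\{\overset{N}{\underline{a}}\}$, $\{\overset{N}{\underline{b}}\}$ implement the shift $N \to N-1$,
$$\overset{N-1}{\underline{\Psi}} = \overset{N}{\underline{\Psi}}\;\overset{N}{\underline{b}}; \qquad \overset{N-1}{\underline{\Phi}} = \overset{N}{\underline{\Phi}}\;\overset{N}{\underline{a}}. \tag{2.70}$$
Similarly to the diagonal matrices $\overset{N}{\alpha_j}$, $\overset{N}{\beta_j}$, we define the matrices $\overset{N}{\underline{\alpha}_j}$ and $\overset{N}{\underline{\beta}_j}$ as
$$\overset{N}{\underline{\alpha}_j} := \mathrm{diag}\left(\alpha_j(N),\ \alpha_j(N+1),\ \dots,\ \alpha_j(N+d_1)\right), \qquad j = -1, \dots, d_2, \tag{2.71}$$
$$\overset{N}{\underline{\beta}_j} := \mathrm{diag}\left(\beta_j(N),\ \beta_j(N+1),\ \dots,\ \beta_j(N+d_2)\right), \qquad j = -1, \dots, d_1. \tag{2.72}$$
As before we have the two definitions
$$\overset{N}{\underline{\alpha}_{-1}} := \overset{N}{\underline{\gamma}} := \mathrm{diag}\left(\gamma(N),\ \gamma(N+1),\ \dots,\ \gamma(N+d_1)\right), \tag{2.73}$$
$$\overset{N}{\underline{\beta}_{-1}} := \overset{N}{\underline{\gamma}} := \mathrm{diag}\left(\gamma(N),\ \gamma(N+1),\ \dots,\ \gamma(N+d_2)\right), \tag{2.74}$$
which will be used if there is no ambiguity regarding dimensions. By repeating a procedure similar to what led to the differential systems in Lemma 2.4, we find:

Lemma 2.6. The dual windows of Laplace-transformed quasi-polynomials $\overset{N-1}{\underline{\Psi}}$, $\overset{N-1}{\underline{\Phi}}$ satisfy the following differential systems
$$\frac{\partial}{\partial y}\overset{N-1}{\underline{\Psi}}(y) = \overset{N-1}{\underline{\Psi}}(y)\;\overset{N}{\underline{D}_2}(y), \qquad N \ge d_1+1, \tag{2.75}$$
$$\frac{\partial}{\partial x}\overset{N-1}{\underline{\Phi}}(x) = \overset{N-1}{\underline{\Phi}}(x)\;\overset{N}{\underline{D}_1}(x), \qquad N \ge d_2+1, \tag{2.76}$$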
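where
$$\overset{N}{\underline{D}_2}(y) := \underset{N}{\underline{b}}\;\overset{N-1}{\underline{\gamma}} + \overset{N-1}{\underline{\alpha}_0} + \sum_{j=1}^{d_2} \overset{N-1}{\underline{b}}\;\overset{N-2}{\underline{b}} \cdots \overset{N-j}{\underline{b}}\;\overset{N-1}{\underline{\alpha}_j}, \tag{2.77}$$
$$\overset{N}{\underline{D}_1}(x) := \underset{N}{\underline{a}}\;\overset{N-1}{\underline{\gamma}} + \overset{N-1}{\underline{\beta}_0} + \sum_{j=1}^{d_1} \overset{N-1}{\underline{a}}\;\overset{N-2}{\underline{a}} \cdots \overset{N-j}{\underline{a}}\;\overset{N-1}{\underline{\beta}_j}. \tag{2.78}$$
Summarizing, we have thus obtained four differential systems:
$$\begin{array}{ll}
\text{Size } (d_2+1)\times(d_2+1) & \text{Size } (d_1+1)\times(d_1+1)\\[4pt]
\dfrac{\partial}{\partial x}\underset{N}{\Psi}(x) = -\overset{N}{D_1}(x)\,\underset{N}{\Psi}(x) & \dfrac{\partial}{\partial y}\overset{N-1}{\underline{\Psi}}(y) = \overset{N-1}{\underline{\Psi}}(y)\,\overset{N}{\underline{D}_2}(y)\\[8pt]
\dfrac{\partial}{\partial x}\overset{N-1}{\underline{\Phi}}(x) = \overset{N-1}{\underline{\Phi}}(x)\,\overset{N}{\underline{D}_1}(x) & \dfrac{\partial}{\partial y}\underset{N}{\Phi}(y) = -\overset{N}{D_2}(y)\,\underset{N}{\Phi}(y)
\end{array} \tag{2.79}$$
It should be noted that the two matrices $D_1$ and $\underline{D}_1$ (as well as $D_2$ and $\underline{D}_2$) have so far only superficial similarities. In particular they do not depend on the same subsets of the coefficients $\{\alpha_j(n)\}$ and $\{\beta_j(n)\}$. On the other hand the pairs $(D_1, \underline{D}_2)$ and $(D_2, \underline{D}_1)$ do depend on the same $\alpha_j(n)$'s and $\beta_j(n)$'s although they are of different dimensions.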
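2.5. Deformation equations. The following lemma gives the effect of an infinitesimal deformation in the coefficients $\{u_K, v_J\}$ expressed as differential equations for the (semi)-infinite vectors $\underset{\infty}{\Psi}$, $\underset{\infty}{\Phi}$ of biorthogonal quasi-polynomials (as well as their Fourier–Laplace transforms) and for the matrices $P$, $Q$. (Derivations with different conventions can be found in [12, 2, 18].)

Lemma 2.7.
$$\partial_{u_K}\underset{\infty}{\Psi} = U^K\,\underset{\infty}{\Psi}, \tag{2.80}$$
$$\partial_{v_J}\underset{\infty}{\Psi} = -\left(V^J\right)^t\,\underset{\infty}{\Psi}, \tag{2.81}$$
$$\partial_{u_K}\underset{\infty}{\Phi} = -\left(U^K\right)^t\,\underset{\infty}{\Phi}, \tag{2.82}$$
$$\partial_{v_J}\underset{\infty}{\Phi} = V^J\,\underset{\infty}{\Phi}, \tag{2.83}$$
where
$$U^K := -\frac{1}{K}\left[\left(Q^K\right)_{>0} + \frac{1}{2}\left(Q^K\right)_{0}\right], \qquad V^J := -\frac{1}{J}\left[\left((-P^t)^J\right)_{>0} + \frac{1}{2}\left((-P^t)^J\right)_{0}\right]. \tag{2.84}$$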
Componentwise these read
$$\partial_{u_K}\psi_n(x) = \sum_{j=0}^{K} U^K_j(n)\,\psi_{n+j}(x), \tag{2.85}$$
$$\partial_{v_J}\psi_n(x) = -\sum_{j=0}^{J} V^J_j(n-j)\,\psi_{n-j}(x), \tag{2.86}$$
$$\partial_{u_K}\phi_n(y) = -\sum_{j=0}^{K} U^K_j(n-j)\,\phi_{n-j}(y), \tag{2.87}$$
$$\partial_{v_J}\phi_n(y) = \sum_{j=0}^{J} V^J_j(n)\,\phi_{n+j}(y), \tag{2.88}$$
where we have used the notation
$$U^K_j(n) := U^K_{n,\,j+n}, \qquad V^J_j(n) := V^J_{n,\,j+n}. \tag{2.89}$$
(The same relations hold for the Fourier–Laplace transforms with respect to the variables $x$ and $y$, since the coefficients do not depend on these variables.) Moreover, we have the following equations for the matrices $P$, $Q$:
$$\partial_{u_K} Q = -[Q, U^K], \tag{2.90}$$
$$\partial_{v_J} Q = [Q, (V^J)^t], \tag{2.91}$$
$$\partial_{u_K} P = -[P, U^K], \tag{2.92}$$
$$\partial_{v_J} P = [P, (V^J)^t]. \tag{2.93}$$
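Proof. Equations (2.80) and (2.83) are just definitions, (2.82) and (2.81) follow from (2.7). Equation (2.84) is proved in a way similar to Lemma 2.1. From the definitions (2.5), one has:
$$\left(U^K\,\underset{\infty}{\Psi}\right)_n = \partial_{u_K}\psi_n(x) = -\frac{1}{2}\frac{\partial_{u_K}h_n}{h_n}\,\psi_n(x) + \frac{1}{\sqrt{h_n}}\,\partial_{u_K}\pi_n(x)\,e^{-V_1(x)} - \frac{1}{K}\,x^K\,\psi_n(x), \tag{2.94}$$
and
$$\left(-\left(U^K\right)^t\,\underset{\infty}{\Phi}\right)_n = \partial_{u_K}\phi_n(y) = -\frac{1}{2}\frac{\partial_{u_K}h_n}{h_n}\,\phi_n(y) + \frac{1}{\sqrt{h_n}}\,\partial_{u_K}\sigma_n(y)\,e^{-V_2(y)}. \tag{2.95}$$
Since $\partial_{u_K}\pi_n(x)$ has a degree lower than $n$, one sees that Eq. (2.94) implies that
$$\left(U^K\right)_{>0} = -\frac{1}{K}\left(Q^K\right)_{>0}, \tag{2.96}$$
and similarly Eq. (2.95) implies that
$$\left(U^K\right)_{<0} = 0. \tag{2.97}$$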
They also imply that the diagonal part must be:
$$U^K_{n,n} = -\frac{1}{2}\frac{\partial_{u_K}h_n}{h_n} - \frac{1}{K}\left(Q^K\right)_{n,n} = \frac{1}{2}\frac{\partial_{u_K}h_n}{h_n} = -\frac{1}{2K}\left(Q^K\right)_{n,n}, \tag{2.98}$$
which proves (2.84). Equations (2.90)–(2.93) follow from multiplying (2.80) and (2.81) by $x$ and (2.82) and (2.83) by $y$ and the linear independence of the component functions forming the vectors $\underset{\infty}{\Psi}$ and $\underset{\infty}{\Phi}$. The coefficients of the expansion must vanish, and these are precisely the relations (2.90) to (2.93).

We also require the “folded” version of the deformation equations. This leads to eight equations giving the action of $\partial_{u_K}$ and $\partial_{v_J}$ on $\underset{N}{\Psi}$, $\overset{N-1}{\underline{\Psi}}$, $\underset{N}{\Phi}$ and $\overset{N-1}{\underline{\Phi}}$. They are introduced in the following lemma, for which we need to define diagonal matrices which play rôles similar to that of the matrices defined in (2.57), (2.58), (2.71), (2.72) for the differential equations with respect to $x$ or $y$:
$$\overset{N,d}{U_{j,K}} := \mathrm{diag}\left(U^K_j(N-d),\ \dots,\ U^K_j(N)\right), \tag{2.99}$$
$$\overset{N,d}{V_{j,J}} := \mathrm{diag}\left(V^J_j(N-d-j),\ \dots,\ V^J_j(N-j)\right). \tag{2.100}$$
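With this notation we have:

Lemma 2.8. The deformation equations can be written in the folded windows (and dual windows) as
$$\frac{\partial}{\partial u_K}\underset{N}{\Psi} = \overset{N,\Psi}{U^K}\,\underset{N}{\Psi}, \quad \frac{\partial}{\partial u_K}\overset{N-1}{\underline{\Psi}} = \overset{N-1}{\underline{\Psi}}\;\overset{N,\underline{\Psi}}{U^K}, \quad \frac{\partial}{\partial v_J}\underset{N}{\Psi} = -\overset{N,\Psi}{V^J}\,\underset{N}{\Psi}, \quad \frac{\partial}{\partial v_J}\overset{N-1}{\underline{\Psi}} = -\overset{N-1}{\underline{\Psi}}\;\overset{N,\underline{\Psi}}{V^J}, \tag{2.101}$$
$$\frac{\partial}{\partial u_K}\underset{N}{\Phi} = -\overset{N,\Phi}{U^K}\,\underset{N}{\Phi}, \quad \frac{\partial}{\partial u_K}\overset{N-1}{\underline{\Phi}} = -\overset{N-1}{\underline{\Phi}}\;\overset{N,\underline{\Phi}}{U^K}, \quad \frac{\partial}{\partial v_J}\underset{N}{\Phi} = \overset{N,\Phi}{V^J}\,\underset{N}{\Phi}, \quad \frac{\partial}{\partial v_J}\overset{N-1}{\underline{\Phi}} = \overset{N-1}{\underline{\Phi}}\;\overset{N,\underline{\Phi}}{V^J}, \tag{2.102}$$
where
$$\overset{N,\Psi}{U^K} := \sum_{j=0}^{K} \overset{N,d_2}{U_{j,K}}\;\underset{N+j-1}{a} \cdots \underset{N}{a}, \qquad \overset{N,\underline{\Psi}}{U^K} := \sum_{j=0}^{K} \underset{N}{\underline{b}} \cdots \underset{N+j-1}{\underline{b}}\;\overset{N+d_1-1,d_1}{U_{j,K}},$$
$$\overset{N,\Psi}{V^J} := \sum_{j=0}^{J} \overset{N,d_2}{V_{j,J}}\;\overset{N-j}{a} \cdots \overset{N-1}{a}, \qquad \overset{N,\underline{\Psi}}{V^J} := \sum_{j=0}^{J} \overset{N-1}{\underline{b}} \cdots \overset{N-j}{\underline{b}}\;\overset{N+d_1-1,d_1}{V_{j,J}}, \tag{2.103}$$
$$\overset{N,\Phi}{U^K} := \sum_{j=0}^{K} \overset{N-j,d_1}{U_{j,K}}\;\overset{N-j}{b} \cdots \overset{N-1}{b}, \qquad \overset{N,\underline{\Phi}}{U^K} := \sum_{j=0}^{K} \overset{N-1}{\underline{a}} \cdots \overset{N-j}{\underline{a}}\;\overset{N+d_2-1-j,d_2}{U_{j,K}},$$
$$\overset{N,\Phi}{V^J} := \sum_{j=0}^{J} \overset{N+j,d_1}{V_{j,J}}\;\underset{N+j-1}{b} \cdots \underset{N}{b}, \qquad \overset{N,\underline{\Phi}}{V^J} := \sum_{j=0}^{J} \underset{N}{\underline{a}} \cdots \underset{N-1+j}{\underline{a}}\;\overset{N+j+d_2-1,d_2}{V_{j,J}}. \tag{2.104}$$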
Proof. This follows exactly the same lines as the proof of Lemma 2.4.
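3. Compatibility of the Finite Difference–Differential–Deformation Systems

We now want to prove that the recursion relations (2.51), (2.70), the linear differential systems (2.79) and the systems of deformation equations (2.101), (2.102) are all compatible in the sense that they admit a basis of simultaneous solutions (fundamental systems).

Proposition 3.1. The shifts $N \to N+1$ in Eqs. (2.51) implemented by $\underset{N}{a}$ and $\underset{N}{b}$ and the sequence of differential equations (2.62), (2.63), respectively, are compatible as vector differential–difference systems. That is, there exists a sequence of $(d_2+1)\times(d_2+1)$ fundamental matrix solutions $\{\underset{N}{\Psi}(x)\}_{N \ge d_2+1}$ and $(d_1+1)\times(d_1+1)$ fundamental matrix solutions $\{\underset{N}{\Phi}(y)\}_{N \ge d_1+1}$ simultaneously satisfying the equations
$$\underset{N+1}{\Psi}(x) = \underset{N}{a}(x)\,\underset{N}{\Psi}(x), \tag{3.1}$$
$$\frac{\partial}{\partial x}\underset{N}{\Psi}(x) = -\overset{N}{D_1}(x)\,\underset{N}{\Psi}(x), \qquad N \ge d_2+1, \tag{3.2}$$
and
$$\underset{N+1}{\Phi}(y) = \underset{N}{b}(y)\,\underset{N}{\Phi}(y), \tag{3.3}$$
$$\frac{\partial}{\partial y}\underset{N}{\Phi}(y) = -\overset{N}{D_2}(y)\,\underset{N}{\Phi}(y), \qquad N \ge d_1+1, \tag{3.4}$$
respectively. The same result holds for the barred quantities and the shifts $N \to N-1$ implemented by $\overset{N}{\underline{a}}$ and $\overset{N}{\underline{b}}$. That is, there exist fundamental solutions $\{\overset{N-1}{\underline{\Psi}}(y)\}_{N \ge d_1+1}$ of dimension $(d_1+1)\times(d_1+1)$ and fundamental solutions $\{\overset{N-1}{\underline{\Phi}}(x)\}_{N \ge d_2+1}$ of dimension $(d_2+1)\times(d_2+1)$ simultaneously satisfying the recursion relations and differential systems
$$\overset{N-1}{\underline{\Psi}}(y) = \overset{N}{\underline{\Psi}}(y)\;\overset{N}{\underline{b}}(y), \tag{3.5}$$
$$\frac{\partial}{\partial y}\overset{N-1}{\underline{\Psi}}(y) = \overset{N-1}{\underline{\Psi}}(y)\;\overset{N}{\underline{D}_2}(y), \qquad N \ge d_1+1, \tag{3.6}$$
and
$$\overset{N-1}{\underline{\Phi}}(x) = \overset{N}{\underline{\Phi}}(x)\;\overset{N}{\underline{a}}(x), \tag{3.7}$$
$$\frac{\partial}{\partial x}\overset{N-1}{\underline{\Phi}}(x) = \overset{N-1}{\underline{\Phi}}(x)\;\overset{N}{\underline{D}_1}(x), \qquad N \ge d_2+1, \tag{3.8}$$
respectively.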
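Proof. We prove the compatibility for only one of the four shift-differential systems, the others being completely analogous. The statement amounts to proving that
$$\partial_x + \overset{N}{D_1}(x) = \overset{N}{a} \circ \left(\partial_x + \overset{N+1}{D_1}(x)\right) \circ \underset{N}{a} = \partial_x + \overset{N}{a}(x)\,\overset{N+1}{D_1}(x)\,\underset{N}{a}(x) + \overset{N}{a}(x)\,\frac{d}{dx}\underset{N}{a}(x), \tag{3.9}$$
where the dependence of $\underset{N}{a}$ on $x$ has been emphasized. Let
$$\underset{N}{\tilde\Psi}(x) := \left[\tilde\psi_{N-d_2}(x), \dots, \tilde\psi_N(x)\right]^t, \qquad N \ge d_2, \tag{3.10}$$
be any solution to the equation
$$\left(\partial_x + \overset{N}{D_1}(x)\right)\underset{N}{\tilde\Psi}(x) = 0. \tag{3.11}$$
At this stage the labeling $N-d_2, \dots, N$ has no particular meaning because there are no $\tilde\psi_n(x)$'s with $n \notin [N-d_2, N]$. Nevertheless we can define
$$\underset{N+j}{\tilde\Psi}(x) := \underset{N+j-1}{a} \cdots \underset{N}{a}\;\underset{N}{\tilde\Psi}(x), \qquad j \ge 1, \tag{3.12}$$
and
$$\underset{N-j}{\tilde\Psi}(x) := \overset{N-j}{a} \cdots \overset{N-1}{a}\;\underset{N}{\tilde\Psi}(x), \qquad 0 \le j \le N - d_2. \tag{3.13}$$
Because of the recursive structure of the matrices $\underset{N}{a}$, the above definition, e.g., for $\underset{N+1}{\tilde\Psi}$, actually defines not $d_2+1$ new functions, but only one new function: $\tilde\psi_{N+1}$. Therefore, componentwise, we have defined a sequence of new functions $\tilde\psi_m(x)$, which satisfy the recursion relation
$$x\,\tilde\psi_m(x) = \sum_{j=-1}^{d_2} \alpha_j(m)\,\tilde\psi_{m-j}(x), \qquad m \ge d_2. \tag{3.14}$$
By this definition and by the structure of the matrix
$$\overset{N}{D_1} = \overset{N}{\gamma}\,\overset{N-1}{a} + \overset{N}{\beta_0} + \sum_{j=1}^{d_1} \overset{N}{\beta_j}\;\underset{N+j-1}{a}\,\underset{N+j-2}{a} \cdots \underset{N}{a}, \tag{3.15}$$
the differential system componentwise now reads
$$\partial_x\tilde\psi_n(x) = -\sum_{j=-1}^{d_1} \beta_j(n+j)\,\tilde\psi_{n+j}(x), \qquad n = N-d_2, \dots, N, \tag{3.16}$$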
where the $\tilde\psi_n$'s that fall outside the window $N-d_2, \dots, N$ have been defined above in terms of the ones within the window. Therefore we need to prove that the newly defined function
$$\tilde\psi_{N+1}(x) := \frac{x - \alpha_0(N)}{\gamma(N)}\,\tilde\psi_N(x) - \sum_{l=1}^{d_2} \frac{\alpha_l(N)}{\gamma(N)}\,\tilde\psi_{N-l}(x) \tag{3.17}$$
satisfies the same sort of differential equation as the preceding ones. (A simple argument by induction then shows that all $\tilde\psi_{N+j}$ satisfy the same sort of differential equation for any $j > 1$.) This in turn amounts to proving that
$$\partial_x\tilde\psi_{N+1}(x) = -\sum_{j=-1}^{d_1} \beta_j(N+1+j)\,\tilde\psi_{N+1+j}(x). \tag{3.18}$$
To do this we compute
$$\begin{aligned}
\gamma(N)\,\partial_x\tilde\psi_{N+1}(x) &= \frac{d}{dx}\left(x\,\tilde\psi_N(x) - \sum_{l=0}^{d_2} \alpha_l(N)\,\tilde\psi_{N-l}(x)\right)\\
&= \sum_{l=0}^{d_2}\sum_{j=-1}^{d_1} \alpha_l(N)\,\beta_j(N-l+j)\,\tilde\psi_{N-l+j}(x) + \tilde\psi_N(x) - x\sum_{j=-1}^{d_1} \beta_j(N+j)\,\tilde\psi_{N+j}(x)\\
&= \sum_{l=0}^{d_2}\sum_{j=-1}^{d_1} \alpha_l(N)\,\beta_j(N-l+j)\,\tilde\psi_{N-l+j}(x) + \tilde\psi_N(x) - \sum_{l=-1}^{d_2}\sum_{j=-1}^{d_1} \alpha_l(N+j)\,\beta_j(N+j)\,\tilde\psi_{N+j-l}(x).
\end{aligned} \tag{3.19}$$
Rearranging Eqs. (3.18) and (3.19), we have to prove the identity
$$-\gamma(N)\sum_{j=-1}^{d_1} \beta_j(N+1+j)\,\tilde\psi_{N+1+j}(x) = \sum_{l=0}^{d_2}\sum_{j=-1}^{d_1} \alpha_l(N)\,\beta_j(N-l+j)\,\tilde\psi_{N-l+j}(x) + \tilde\psi_N(x) - \sum_{l=-1}^{d_2}\sum_{j=-1}^{d_1} \alpha_l(N+j)\,\beta_j(N+j)\,\tilde\psi_{N+j-l}(x), \tag{3.20}$$
or equivalently
$$\tilde\psi_N(x) = -\sum_{l=-1}^{d_2}\sum_{j=-1}^{d_1} \alpha_l(N)\,\beta_j(N-l+j)\,\tilde\psi_{N-l+j}(x) + \sum_{l=-1}^{d_2}\sum_{j=-1}^{d_1} \alpha_l(N+j)\,\beta_j(N+j)\,\tilde\psi_{N+j-l}(x). \tag{3.21}$$
But this last equation is nothing but the Heisenberg commutation relations in Eq. (2.31) and Eq. (2.32). This means that, rearranging the coefficients in front of $\tilde\psi_{N+r}(x)$ in the RHS of Eq. (3.21), the only nonvanishing coefficient is that of $\tilde\psi_N(x)$ and it is exactly 1. A similar argument may be used to prove that the relations (3.16) hold also for $1 \le n < N - d_2$. For future convenience we remark that this verification amounts to the fact that the coefficients of the $\tilde\psi_{N+r}(x)$'s are the same as for the orthogonal quasi-polynomials $\psi_{N+r}(x)$, since it relies only on the recursion relations, which are the same. Given that the quasi-polynomials $\psi_{N+r}(x)$ are linearly independent, the equality of LHS and RHS follows also for any other sequence.

Proposition 3.1 means that we can define $d_2+1$ sequences of functions $\{\psi^{(q)}_n(x)\}_{n\in\mathbb{N},\,q=0,\dots,d_2}$ in such a way that in any “window” of size $d_2+1$ they constitute a fundamental system of solutions to the differential system (3.2). One of these sequences is obviously provided by the orthogonal quasi-polynomials. Each of them satisfies both the recursion relations
$$x\,\psi^{(q)}_n(x) = \sum_{l=-1}^{d_2} \alpha_l(n)\,\psi^{(q)}_{n-l}(x) \tag{3.22}$$
and the derivative relations
$$\partial_x\psi^{(q)}_n(x) = -\sum_{j=-1}^{d_1} \beta_j(n+j)\,\psi^{(q)}_{n+j}(x). \tag{3.23}$$
Remark 3.1. In principle, in order to define these $d_2+1$ sequences one should solve the differential system (3.2) in a given window and then define recursively the rest of the sequence backwards and forwards. To pass from the semi-infinite case to the infinite one, we may define the full sequence $\psi^{(q)}_n$ for $n \in \mathbb{Z}$ just by application of products of the matrices $\underset{N}{a}$ and their inverses, provided the $\alpha_j(n)$'s are so defined that all the $\underset{N}{a}$'s are invertible.

In a completely parallel manner we can define $d_1+1$ sequences of functions $\{\phi^{(q)}_n(y)\}_{n\in\mathbb{N},\,q=0,\dots,d_1}$ which provide fundamental systems satisfying
$$\left(\partial_y + \overset{N}{D_2}(y)\right)\underset{N}{\Phi}(y) = 0. \tag{3.24}$$
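Moreover, with minor modifications, we can construct analogous sequences for the dual systems
$$\partial_y\overset{N-1}{\underline{\Psi}}(y) = \overset{N-1}{\underline{\Psi}}(y)\;\overset{N}{\underline{D}_2}(y), \tag{3.25}$$
$$\partial_x\overset{N-1}{\underline{\Phi}}(x) = \overset{N-1}{\underline{\Phi}}(x)\;\overset{N}{\underline{D}_1}(x). \tag{3.26}$$
The only difference is that the matrices $\overset{N}{\underline{a}}$ and $\overset{N}{\underline{b}}$ now implement the shift $N \to N-1$. The barred sequences $\{\underline{\phi}^{(q)}_n(x)\}_{n\in\mathbb{N},\,q=0,\dots,d_2}$ and $\{\underline{\psi}^{(q)}_n(y)\}_{n\in\mathbb{N},\,q=0,\dots,d_1}$ will therefore satisfy the recursion relations
$$x\,\underline{\phi}^{(q)}_n(x) = \sum_{l=-1}^{d_2} \alpha_l(n+l)\,\underline{\phi}^{(q)}_{n+l}(x), \qquad \partial_x\underline{\phi}^{(q)}_n(x) = \sum_{j=-1}^{d_1} \beta_j(n)\,\underline{\phi}^{(q)}_{n-j}(x), \tag{3.27}$$
$$y\,\underline{\psi}^{(q)}_n(y) = \sum_{j=-1}^{d_1} \beta_j(n+j)\,\underline{\psi}^{(q)}_{n+j}(y), \qquad \partial_y\underline{\psi}^{(q)}_n(y) = \sum_{l=-1}^{d_2} \alpha_l(n)\,\underline{\psi}^{(q)}_{n-l}(y). \tag{3.28}$$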
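A completely analogous statement holds for the matrices defining the deformation equations in any window.

Proposition 3.2. The shifts $N \to N+1$ implemented by $\underset{N}{a}$ in Eq. (3.1) and the sequence of differential equations
$$\partial_{u_K}\underset{N}{\Psi} = \overset{N,\Psi}{U^K}\,\underset{N}{\Psi}, \qquad \partial_{v_J}\underset{N}{\Psi} = -\overset{N,\Psi}{V^J}\,\underset{N}{\Psi}, \qquad N \ge d_2+1, \tag{3.29}$$
are compatible, as are the shifts $N \to N-1$ implemented by $\overset{N}{\underline{b}}$ in (3.5) and the sequence of differential equations
$$\partial_{u_K}\overset{N-1}{\underline{\Psi}} = \overset{N-1}{\underline{\Psi}}\;\overset{N,\underline{\Psi}}{U^K}, \qquad \partial_{v_J}\overset{N-1}{\underline{\Psi}} = -\overset{N-1}{\underline{\Psi}}\;\overset{N,\underline{\Psi}}{V^J}, \qquad N \ge 1. \tag{3.30}$$
Similarly the shifts implemented by $\underset{N}{b}$ and $\overset{N}{\underline{a}}$ are compatible with the equations
$$\partial_{u_K}\underset{N}{\Phi} = -\overset{N,\Phi}{U^K}\,\underset{N}{\Phi}, \qquad \partial_{v_J}\underset{N}{\Phi} = \overset{N,\Phi}{V^J}\,\underset{N}{\Phi}, \qquad N \ge d_1+1, \tag{3.31}$$
$$\partial_{u_K}\overset{N-1}{\underline{\Phi}} = -\overset{N-1}{\underline{\Phi}}\;\overset{N,\underline{\Phi}}{U^K}, \qquad \partial_{v_J}\overset{N-1}{\underline{\Phi}} = \overset{N-1}{\underline{\Phi}}\;\overset{N,\underline{\Phi}}{V^J}, \qquad N \ge 1. \tag{3.32}$$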
Proof. We will prove compatibility of only one of the eight kinds of systems with the shift; the remaining cases are proven similarly. As for Prop. 3.1, we first define a continuous parametric family (depending on $x$) of solutions to the system
$$\left(\partial_{u_K} - \overset{N,\Psi}{U^K}\right)\underset{N}{\tilde\Psi} = 0. \tag{3.33}$$
We then define the shifted functions $\underset{N+j}{\tilde\Psi} := \underset{N+j-1}{a} \cdots \underset{N}{a}\;\underset{N}{\tilde\Psi}$ so that the equation reads, componentwise,
$$\partial_{u_K}\tilde\psi_n = \sum_{j=0}^{K} U^K_j(n)\,\tilde\psi_{n+j}, \qquad n = N-d_2, \dots, N. \tag{3.34}$$
Then we have to check that the newly defined $\tilde\psi_{N+1}$ also satisfies
$$\partial_{u_K}\tilde\psi_{N+1} = \sum_{j=0}^{K} U^K_j(N+1)\,\tilde\psi_{N+1+j} \tag{3.35}$$
(and by induction the corresponding equations for $\tilde\psi_{N+r}$, $r \ge 1$). As before, we use the relations
$$\tilde\psi_{N+1} = \frac{x - \alpha_0(N)}{\gamma(N)}\,\tilde\psi_N(x) - \sum_{l=1}^{d_2} \frac{\alpha_l(N)}{\gamma(N)}\,\tilde\psi_{N-l}(x) \tag{3.36}$$
and the differential system satisfied by the $\tilde\psi_{N-d_2}, \dots, \tilde\psi_N$. To conclude the equality we can reason as in the remark in the proof of Prop. 3.1.

The final proposition in this section assures that the deformation equations are compatible with the $x$, $y$ differential equations in all windows.

Proposition 3.3. The system of equations
$$\left(\partial_x + \overset{N}{D_1}\right)\underset{N}{\Psi}(x) = 0, \tag{3.37}$$
$$\left(\partial_{u_K} - \overset{N,\Psi}{U^K}\right)\underset{N}{\Psi}(x) = 0, \tag{3.38}$$
$$\left(\partial_{v_J} + \overset{N,\Psi}{V^J}\right)\underset{N}{\Psi}(x) = 0, \tag{3.39}$$
$$\underset{N+1}{\Psi}(x) = \underset{N}{a}(x)\,\underset{N}{\Psi}(x), \qquad N \ge d_2+1, \tag{3.40}$$
is compatible, and hence sequences of fundamental systems of solutions $\{\underset{N}{\Psi}(x)\}_{N > d_2}$ to all equations exist. The same statement holds for the system
$$\left(\partial_y + \overset{N}{D_2}\right)\underset{N}{\Phi}(y) = 0, \tag{3.41}$$
$$\left(\partial_{u_K} + \overset{N,\Phi}{U^K}\right)\underset{N}{\Phi}(y) = 0, \tag{3.42}$$
$$\left(\partial_{v_J} - \overset{N,\Phi}{V^J}\right)\underset{N}{\Phi}(y) = 0, \tag{3.43}$$
$$\underset{N+1}{\Phi}(y) = \underset{N}{b}(y)\,\underset{N}{\Phi}(y), \qquad N \ge d_1+1. \tag{3.44}$$
The corresponding systems for the barred sequences are also compatible, and hence also admit simultaneous sequences of fundamental solutions $\overset{N-1}{\underline{\Psi}}(y)$ and $\overset{N-1}{\underline{\Phi}}(x)$.
Proof. The proof will only be given for the system (3.37)–(3.40) since the others are proved in the same way. The compatibility follows from Props. 3.1 and 3.2 together with a proof of compatibility of Eqs. (3.37)–(3.39). Indeed, from the $d_2+1$ functions in
$$\underset{N}{\tilde\Psi} = \left[\tilde\psi_{N-d_2}(x, u, v), \dots, \tilde\psi_N(x, u, v)\right]^t \tag{3.45}$$
we can consistently define a whole sequence of functions $\tilde\psi_n$ by means of the $x$-recursion relations in such a way that componentwise the system reads
$$\partial_x\tilde\psi_n = -\sum_{j=-1}^{d_1} \beta_j(n+j)\,\tilde\psi_{n+j}, \tag{3.46}$$
$$\partial_{u_K}\tilde\psi_n = \sum_{j=0}^{K} U^K_j(n)\,\tilde\psi_{n+j}, \tag{3.47}$$
$$\partial_{v_J}\tilde\psi_n = -\sum_{j=0}^{J} V^J_j(n-j)\,\tilde\psi_{n-j}. \tag{3.48}$$
Taking cross derivatives and using these expressions one gets, e.g.,
$$\partial_{u_K}\partial_x\tilde\psi_n = -\sum_{k=0}^{K}\sum_{j=-1}^{d_1} U^K_k(n+j)\,\beta_j(n+j)\,\tilde\psi_{n+j+k} - \sum_{j=-1}^{d_1} \left(\frac{\partial}{\partial u_K}\beta_j(n+j)\right)\tilde\psi_{n+j}, \tag{3.49}$$
$$\partial_x\partial_{u_K}\tilde\psi_n = -\sum_{j=-1}^{d_1}\sum_{k=0}^{K} U^K_k(n)\,\beta_j(n+j+k)\,\tilde\psi_{n+j+k}. \tag{3.50}$$
The expressions for the derivatives of the coefficients $\beta_j$ may be obtained from the deformation equation (2.92) for $P$ and then substituted back into Eq. (3.49). However, to prove the equality of the two cross derivatives it is sufficient to collect the coefficients of $\tilde\psi_{n+q}$ and note that exactly the same coefficients appear when the functions $\{\tilde\psi_n\}$ are replaced by the orthogonal quasi-polynomials $\{\psi_n\}$, for which the equality of the two expressions certainly holds. Since the orthogonal quasi-polynomials are linearly independent functions, the individual coefficients must agree. (This is essentially the same argument as in the remark at the end of the proof of Prop. 3.1.) The mutual compatibility of the $\partial_{u_K}$ and $\partial_{v_J}$ deformations is proved in exactly the same way. One just takes the cross derivatives in Eqs. (3.47) and (3.48) and notes that, since these are the same as for the case of the orthogonal quasi-polynomials $\{\psi_n\}$, the corresponding coefficients in the cross differentiated expression must be equal.

4. Spectral Duality

The aim of this section is to state and prove some remarkable relations between systems related to dual windows (in the sense of Def. 2.2), which will justify the terminology.
One of the main results will be that the four spectral curves given by the characteristic polynomials of D1 , D 1 , D2 , D 2 associated to the four systems (Eq. (2.79)) on the two N−1
N−1
pairs of dual windows (* , , ), (,, * ) are actually the same curve. N
N
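4.1. Dual spectral curves. First we need a linear algebra lemma.

Lemma 4.1. Let T be a square matrix having the block form
$$T = \begin{pmatrix} 0 & F_1 & 0 & \cdots & 0 \\ 0 & 0 & F_2 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & F_d \\ G_0 & G_1 & G_2 & \cdots & G_d \end{pmatrix}, \tag{4.1}$$
where the d + 1 blocks have compatible sizes and the diagonal blocks are square. Then
$$\det[\mathbf{1} - T] = \det[\mathbf{1} - D], \tag{4.2}$$
where
$$D := G_d + \sum_{k=0}^{d-1} G_k \cdot F_{k+1} \cdots F_d, \tag{4.3}$$
and $\mathbf{1}$ denotes, according to the context, the unit matrix of appropriate size.

Proof. Let
$$N_F := \begin{pmatrix} 0 & F_1 & 0 & \cdots & 0 \\ 0 & 0 & F_2 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & F_d \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}. \tag{4.4}$$
We multiply the matrix $\mathbf{1} - T$ from the right by the matrix
$$(\mathbf{1} - N_F)^{-1} = \begin{pmatrix} \mathbf{1} & F_1 & F_1 F_2 & \cdots & F_1 \cdots F_d \\ 0 & \mathbf{1} & F_2 & \cdots & F_2 \cdots F_d \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & \mathbf{1} & F_d \\ 0 & 0 & \cdots & 0 & \mathbf{1} \end{pmatrix}. \tag{4.5}$$
Since the matrix $(\mathbf{1} - N_F)^{-1}$ is unimodular, the determinant of $\mathbf{1} - T$ remains unaffected. Then one computes
$$(\mathbf{1} - T) \cdot (\mathbf{1} - N_F)^{-1} = \begin{pmatrix} \mathbf{1} & 0 & \cdots & 0 & 0 \\ 0 & \mathbf{1} & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & \mathbf{1} & 0 \\ \star & \star & \cdots & \star & \mathbf{1} - D \end{pmatrix}, \tag{4.6}$$
where the entries marked $\star$ do not contribute to the determinant of this block lower triangular matrix, from which the statement follows by taking the determinant.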
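Lemma 4.1 can be spot-checked numerically. The following is a minimal sketch, not part of the original argument: it fills the blocks $F_k$, $G_k$ with random small integers, assembles $T$ as in (4.1) and $D$ as in (4.3), and compares the two determinants exactly over the rationals. The function names (`check_lemma`, `one_minus`, and so on) and the block sizes are illustrative assumptions.

```python
from fractions import Fraction
import random


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n, result = len(M), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            result = -result
        result *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return result


def one_minus(M):
    """Return 1 - M for a square matrix M."""
    n = len(M)
    return [[int(i == j) - M[i][j] for j in range(n)] for i in range(n)]


def check_lemma(sizes, rng):
    """sizes = [n_0, ..., n_d]: hypothetical sizes of the diagonal blocks."""
    d = len(sizes) - 1
    rand = lambda r, c: [[rng.randint(-3, 3) for _ in range(c)] for _ in range(r)]
    F = {k: rand(sizes[k - 1], sizes[k]) for k in range(1, d + 1)}  # block (k-1, k)
    G = {k: rand(sizes[d], sizes[k]) for k in range(d + 1)}         # block (d, k)
    offset = [sum(sizes[:k]) for k in range(d + 1)]
    total = sum(sizes)
    T = [[0] * total for _ in range(total)]

    def place(block, bi, bj):
        for i, row in enumerate(block):
            for j, x in enumerate(row):
                T[offset[bi] + i][offset[bj] + j] = x

    for k in range(1, d + 1):
        place(F[k], k - 1, k)
    for k in range(d + 1):
        place(G[k], d, k)
    # D = G_d + sum_{k=0}^{d-1} G_k F_{k+1} ... F_d, as in (4.3)
    D = [row[:] for row in G[d]]
    for k in range(d):
        term = G[k]
        for i in range(k + 1, d + 1):
            term = matmul(term, F[i])
        D = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(D, term)]
    return det(one_minus(T)), det(one_minus(D))


lhs, rhs = check_lemma([2, 3, 1, 2], random.Random(0))
print(lhs == rhs)
```

Exact rational arithmetic is used so that the comparison is an identity check rather than a floating-point approximation.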
Duality, Biorthogonal Polynomials and Multi-Matrix Models
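Proposition 4.1. The spectral curves associated to the characteristic polynomials of $\overset{N}{D}_1$, $\overset{N}{\underline{D}}_2$, $\overset{N}{D}_2$, $\overset{N}{\underline{D}}_1$ are pairwise equal. More precisely, we have the formulae
$$u_{d_1+1} \det\left[x\mathbf{1}_{d_1+1} - \overset{N}{\underline{D}}_2(y)\right] = v_{d_2+1} \det\left[y\mathbf{1}_{d_2+1} - \overset{N}{D}_1(x)\right], \tag{4.7}$$
$$u_{d_1+1} \det\left[x\mathbf{1}_{d_1+1} - \overset{N}{D}_2(y)\right] = v_{d_2+1} \det\left[y\mathbf{1}_{d_2+1} - \overset{N}{\underline{D}}_1(x)\right], \tag{4.8}$$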
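which connect the spectral curves of the differential operators of different dimensions operating on the two pairs of dual windows.

Proof. We will only prove one equality since the other is proved similarly. We start with the computation of the characteristic polynomial of $\overset{N}{D}_1(x)$,
$$\begin{aligned} \det\left[y\mathbf{1} - \overset{N}{D}_1(x)\right] &= \det\left[-\overset{N}{\gamma}\, \underset{N-1}{a}^{-1}\right] \det\left[\mathbf{1} - \underset{N-1}{a}\, \overset{N}{\gamma}{}^{-1} \Big( y\mathbf{1} - \overset{N}{\beta}_0 - \sum_{j=1}^{d_1} \overset{N}{\beta}_j\, \underset{N+j-1}{a} \cdots \underset{N}{a} \Big)\right] \\ &= \frac{\gamma(N-1)^2}{v_{d_2+1}} \det\left[\mathbf{1} - \underset{N-1}{a}\, \overset{N}{\gamma}{}^{-1} \Big( y\mathbf{1} - \overset{N}{\beta}_0 - \sum_{j=1}^{d_1} \overset{N}{\beta}_j\, \underset{N+j-1}{a} \cdots \underset{N}{a} \Big)\right], \end{aligned} \tag{4.9}$$
where we have used the identity in Eq. (2.54). Now we use Lemma 4.1 with the identifications
$$G_{d_1} = \underset{N-1}{a}\, \overset{N}{\gamma}{}^{-1} \big(y\mathbf{1} - \overset{N}{\beta}_0\big), \quad G_k = -\underset{N-1}{a}\, \overset{N}{\gamma}{}^{-1}\, \overset{N}{\beta}_{d_1-k}, \ k = 0, \ldots, d_1 - 1, \quad F_k = \underset{N+d_1-k}{a}, \ k = 1, \ldots, d_1, \tag{4.10}$$
and thus obtain
$$\det\left[\mathbf{1} - \underset{N-1}{a}\, \overset{N}{\gamma}{}^{-1} \Big( y\mathbf{1} - \overset{N}{\beta}_0 - \sum_{j=1}^{d_1} \overset{N}{\beta}_j\, \underset{N+j-1}{a} \cdots \underset{N}{a} \Big)\right] = \det\left[\mathbf{1}_{(d_1+1)(d_2+1)} - T_{ab}\right]. \tag{4.11}$$
The matrix $T_{ab}$ is defined by
$$T_{ab} := \begin{pmatrix} 0 & \underset{N+d_1-1}{a} & 0 & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & \underset{N}{a} \\ -\underset{N-1}{a}\, \overset{N}{\beta}_{d_1} \overset{N}{\gamma}{}^{-1} & -\underset{N-1}{a}\, \overset{N}{\beta}_{d_1-1} \overset{N}{\gamma}{}^{-1} & \cdots & \cdots & \underset{N-1}{a}\, \big(y\mathbf{1} - \overset{N}{\beta}_0\big) \overset{N}{\gamma}{}^{-1} \end{pmatrix} \tag{4.12}$$
$$= \begin{pmatrix} \mathbf{1} & 0 & \cdots & 0 \\ 0 & \ddots & & \vdots \\ \vdots & & \mathbf{1} & 0 \\ 0 & \cdots & 0 & \underset{N-1}{a} \end{pmatrix} \begin{pmatrix} 0 & \underset{N+d_1-1}{a} & 0 & \cdots & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 0 & \underset{N}{a} \\ -\overset{N}{\beta}_{d_1} \overset{N}{\gamma}{}^{-1} & -\overset{N}{\beta}_{d_1-1} \overset{N}{\gamma}{}^{-1} & \cdots & \cdots & \big(y\mathbf{1} - \overset{N}{\beta}_0\big) \overset{N}{\gamma}{}^{-1} \end{pmatrix}. \tag{4.13}$$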
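We regard $T_{ab}$ as an endomorphism of $\mathbb{C}^{d_2+1} \otimes \mathbb{C}^{d_1+1}$. Let $P_{12}$ be the involution interchanging the two factors of the tensor product and let the matrix $C$ implement the reversal of order endomorphism within the $(d_2 + 1) \times (d_2 + 1)$ blocks:
$$C := \operatorname{Blockdiag}(\overbrace{R, R, \ldots, R}^{d_2+1\ \text{times}}), \qquad R := \begin{pmatrix} 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & {\cdot}^{{\cdot}^{\cdot}} & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 \end{pmatrix} \in GL_{d_1+1}. \tag{4.14}$$
A direct inspection shows $C P_{12} T_{ab} P_{12} C^{-1} =$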
N−d2+1 t
b
0
0
0
..
0
0
0
N−1 t
0
0
0
0
0
N−d2 N−1
N−d2+1 N−1
− γ α d2 bt − γ α d2−1 bt N−1
N−1
0 .
0 b
N−1 N−1
· · · − γ α 1 bt N−1
0
0 , (4.15) 0 N t b N N−1 t γ x1− α 0 b
N−1
where we now have d2 + 1 square blocks of dimension d1 + 1 (i.e. the number of blocks and the dimensions of the blocks have been interchanged). The barred symbols are precisely those defined in Eqs. (2.67, 2.68), (2.71, 2.72) and we recall that the matrices α, γ appearing are all diagonal and hence commute among themselves. We now use Lemma 4.1 again to get N vd2 +1 det y1 − D 1 (x) γ (N − 1)2 = det 1(d1 +1)(d2 +1) − Ta b = det 1(d1 +1)(d2 +1) − CP12 Ta b P12 C −1 d2 N −j N−1 N N−1 N−1 t = det 1d1 +1 − γ x1 − α 0 − α j b · · · bt bt N−1
(
N t
)
= det − γ b N−1
=
ud1 +1 γ (N − 1)2
(
j =1
N
)
t
det x1 − D 2 (y) N det x1 − D 2 (y) .
(4.16)
This concludes the proof. Remark 4.1. Notice that this proof was based purely on an algebraic reinterpretation of the characteristic equations for D1 (x) and D 2 (y) in which the same set of recursion parameters {αj (n), βj (n)} appear. No assumption was required about any relations between these parameters, and therefore the equalities (4.7), (4.8) are really just identities.
4.2. Duality pairings. In what follows we will derive a deeper form of duality; namely, that the linear differential equations satisfied by dual windows are also dual, in the sense of having the same associated spectral curves. This follows from the generalized Christoffel–Darboux formulae satisfied by the kernels $K_{ij}$, $i, j = 1, 2$, when the biorthogonal polynomials and their Fourier–Laplace transforms are replaced by any solution to the equations of Prop. 3.3.

Proposition 4.2. If $\{\tilde\psi_n(y)\}_{n\in\mathbb{N}}$ and $\{\tilde\phi_n(y)\}_{n\in\mathbb{N}}$ are two arbitrary sequences of functions satisfying both the recursion relations under multiplication by $y$ and the differential relations under application of $\partial_y$ (constructed as in Prop. 3.1), then
$$(\partial_{y'} + \partial_y) \left( \overset{N-1}{\underline{\tilde\Psi}}(y'),\ \overset{N}{B}\, \underset{N}{\tilde\Phi}(y) \right) = (y' - y) \left( \underset{N}{\underline{\tilde\Psi}}(y'),\ (\overset{N}{A})^t\, \overset{N-1}{\tilde\Phi}(y) \right), \tag{4.17}$$
and
$$(\partial_{x'} + \partial_x) \left( \overset{N-1}{\underline{\tilde\Phi}}(x'),\ \overset{N}{A}\, \underset{N}{\tilde\Psi}(x) \right) = (x' - x) \left( \underset{N}{\underline{\tilde\Phi}}(x'),\ (\overset{N}{B})^t\, \overset{N-1}{\tilde\Psi}(x) \right). \tag{4.18}$$

Proof. We shall prove the equality (4.17) only, since (4.18) is proved identically. The expressions on either side of Eq. (4.17) read (understanding the $\tilde\psi_n$'s to depend on $y'$ and the $\tilde\phi_n$'s on $y$):
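$$\begin{aligned} (\partial_{y'} + \partial_y) \left( \overset{N-1}{\underline{\tilde\Psi}}(y'),\ \overset{N}{B}\, \underset{N}{\tilde\Phi}(y) \right) &= (\partial_{y'} + \partial_y) \Big[ -\gamma(N-1)\, \tilde\psi_{N-1} \tilde\phi_N + \sum_{j=1}^{d_1} \sum_{k=0}^{j-1} \beta_j(N+k)\, \tilde\psi_{N+k}\, \tilde\phi_{N-j+k} \Big] \\ &= \gamma(N-1)\, \tilde\psi_{N-1} \sum_{l=-1}^{d_2} \alpha_l(N+l)\, \tilde\phi_{N+l} - \gamma(N-1) \sum_{l=-1}^{d_2} \alpha_l(N-1)\, \tilde\psi_{N-1-l}\, \tilde\phi_N \\ &\quad + \sum_{j=1}^{d_1} \sum_{k=0}^{j-1} \sum_{l=-1}^{d_2} \alpha_l(N+k)\, \beta_j(N+k)\, \tilde\psi_{N+k-l}\, \tilde\phi_{N+k-j} \\ &\quad - \sum_{j=1}^{d_1} \sum_{k=0}^{j-1} \sum_{l=-1}^{d_2} \beta_j(N+k)\, \alpha_l(N+k-j+l)\, \tilde\psi_{N+k}\, \tilde\phi_{N+k-j+l} \end{aligned} \tag{4.19}$$
and
$$\begin{aligned} (y' - y) \left( \underset{N}{\underline{\tilde\Psi}}(y'),\ (\overset{N}{A})^t\, \overset{N-1}{\tilde\Phi}(y) \right) &= (y' - y) \Big[ \gamma(N-1)\, \tilde\psi_N \tilde\phi_{N-1} - \sum_{j=1}^{d_2} \sum_{k=0}^{j-1} \alpha_j(N+k)\, \tilde\psi_{N+k-j}\, \tilde\phi_{N+k} \Big] \\ &= \gamma(N-1)\, \tilde\phi_{N-1} \sum_{j=-1}^{d_1} \beta_j(N+j)\, \tilde\psi_{N+j} - \gamma(N-1)\, \tilde\psi_N \sum_{j=-1}^{d_1} \beta_j(N-1)\, \tilde\phi_{N-1-j} \\ &\quad - \sum_{l=1}^{d_2} \sum_{k=0}^{l-1} \sum_{j=-1}^{d_1} \alpha_l(N+k)\, \beta_j(N+k-l+j)\, \tilde\psi_{N+k-l+j}\, \tilde\phi_{N+k} \\ &\quad + \sum_{l=1}^{d_2} \sum_{k=0}^{l-1} \sum_{j=-1}^{d_1} \alpha_l(N+k)\, \beta_j(N+k)\, \tilde\psi_{N+k-l}\, \tilde\phi_{N+k-j}. \end{aligned} \tag{4.20}$$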
The claim now is that these two expressions are the same. There are two ways of proving this. The first is a straightforward computation collecting all bilinear terms of the form
$$F_{pq}(y', y) := \tilde\psi_{N+p}(y')\, \tilde\phi_{N+q}(y) \tag{4.21}$$
in the difference between (4.19) and (4.20) and proving that their coefficients vanish. The coefficients of $F_{pq}$ with $p \geq 0$ and $q \geq 0$ vanish identically, as do the coefficients of $F_{pq}$ with $p < -1$ and $q < -1$. The coefficient of $F_{pq}$ with $p \geq 0$ and $q < -1$ vanishes due to relation (2.31) with $l = p - q - 1$ and $n = N + q$. The coefficient of $F_{pq}$ with $p < -1$ and $q \geq 0$ vanishes due to relation (2.32) with $l = q - p - 1$ and $n = N + p$. These cancellations are summarized in Table 4.1.

The second way does not involve any computation; in fact, we can already conclude that the coefficients of all terms $\tilde\psi_{N+p}(y')\, \tilde\phi_{N+q}(y)$ must agree. We know that for the polynomial solutions $\{\psi_n(x), \phi_n(y)\}$ and their Fourier–Laplace transforms $\{\underline\psi_n(y'), \underline\phi_n(x')\}$ the two expressions are the same because
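$$\begin{aligned} (\partial_{y'} + \partial_y) \left( \overset{N-1}{\underline\Psi}(y'),\ \overset{N}{B}\, \underset{N}{\Phi}(y) \right) &= (\partial_{y'} + \partial_y)\, (y' - y) \sum_{n=0}^{N-1} \underline\psi_n(y')\, \phi_n(y) \\ &= (y' - y)\, (\partial_{y'} + \partial_y) \sum_{n=0}^{N-1} \underline\psi_n(y')\, \phi_n(y) = (y' - y) \left( \underset{N}{\underline\Psi}(y'),\ (\overset{N}{A})^t\, \overset{N-1}{\Phi}(y) \right), \end{aligned} \tag{4.22}$$
where we have used the generalized Christoffel–Darboux formulae and the identity
$$\left[ (y' - y),\ \partial_{y'} + \partial_y \right] = 0. \tag{4.23}$$
Since the $\underline\psi_n(y')$'s are linearly independent and so are the $\phi_n(y)$'s, the functions
$$F_{pq}(y', y) := \underline\psi_{N+p}(y')\, \phi_{N+q}(y) \tag{4.24}$$
are also linearly independent. Considering Eqs. (4.19) and (4.20) as linear equalities for the $F_{pq}(y', y)$'s, one concludes that the coefficients in the two equations must be equal.

Before stating the next result, we define new pairings by studying the effect of deformations on the kernels. By using the same Christoffel–Darboux trick one easily computes
$$\frac{\partial}{\partial u_K} \sum_{n=0}^{N-1} \psi_n(x)\, \underline\phi_n(x') = \sum_{j=1}^{K} \sum_{l=1}^{j} U_j^K(N-l)\, \psi_{N-l+j}\, \underline\phi_{N-l}, \tag{4.25}$$
$$\frac{\partial}{\partial v_K} \sum_{n=0}^{N-1} \psi_n(x)\, \underline\phi_n(x') = \sum_{j=1}^{K} \sum_{l=1}^{j} V_j^K(N-l)\, \underline\phi_{N-l+j}\, \psi_{N-l}. \tag{4.26}$$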
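The commutator identity (4.23), which lets the factor $(y' - y)$ be pulled through $\partial_{y'} + \partial_y$, can be checked mechanically on polynomials. The sketch below is only an illustration: it represents a polynomial in $(y', y)$ as a dictionary mapping exponent pairs to coefficients, and the helper names are hypothetical.

```python
def clean(p):
    """Drop zero coefficients so that the zero polynomial is the empty dict."""
    return {k: c for k, c in p.items() if c != 0}


def times_difference(p):
    """Multiply p(y', y) by (y' - y); keys are exponent pairs (i, j) of y'^i y^j."""
    r = {}
    for (i, j), c in p.items():
        r[(i + 1, j)] = r.get((i + 1, j), 0) + c
        r[(i, j + 1)] = r.get((i, j + 1), 0) - c
    return clean(r)


def total_derivative(p):
    """Apply (d/dy' + d/dy) to p(y', y)."""
    r = {}
    for (i, j), c in p.items():
        if i > 0:
            r[(i - 1, j)] = r.get((i - 1, j), 0) + i * c
        if j > 0:
            r[(i, j - 1)] = r.get((i, j - 1), 0) + j * c
    return clean(r)


def commutator(p):
    """Return [(y' - y), d/dy' + d/dy] applied to p."""
    a = times_difference(total_derivative(p))
    b = total_derivative(times_difference(p))
    r = dict(a)
    for k, c in b.items():
        r[k] = r.get(k, 0) - c
    return clean(r)


# An arbitrary test polynomial: 3 y'^2 y - 5 y^4 + 7 y'
sample = {(2, 1): 3, (0, 4): -5, (1, 0): 7}
print(commutator(sample))
```

The commutator vanishes because $(\partial_{y'} + \partial_y)(y' - y) = 1 - 1 = 0$, so by the Leibniz rule the derivative passes through the prefactor.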
Table 4.1. Comparison of coefficients in Eq. (4.19) and Eq. (4.20)
Coeff. of ψ˜ N−1 φ˜ N−1 ψ˜ N φ˜ N
Eq. (4.19)
+
*d1
j =1 αj (N + j − 1)βj (N + j − 1) *d1 j =1 αj (N )βj (N )
−γ 2 (N − 1) −
ψ˜ N+p φ˜ N+q p ≥ 0, q ≥ 0
l=q+1
p+q ≥1 ψ˜ N−p φ˜ N−q
p ≥ 1, q ≥ 1
k=0
p+q ≥3 ψ˜ N−1 φ˜ N
d2
−
αl (N + q)βp−q+l (N + p)
γ (N − 1)α0 (N ) − γ (N − 1)α0 (N − 1)
j =1
d2
−
l=q+1
k=0
j =1
αk+p (N + k)βk+q (N + k)
αj (N − 1 + j )βj −1 (N − 1 + j )
−
αj −1 (N + j − 1)βj (N + j − 1)
αj −1 (N − 1)βj (N )
γ (N − 1)β0 (N ) − γ (N − 1)β0 (N − 1)
j =0
− j =−1
j =−1
−γ (N − 1)βq−1 (N − 1)
βj +q (N )αj (N − q)
αj (N + p + j )βj +p+1 (N + p + j )
−
j =0
ψ˜ N−1 φ˜ N+q
γ (N − 1)βp (N + p)
αj (N − 1)βj +p+1 (N + p) γ (N − 1)αq (N + q)
q≥1
j =p
αj (N + j − 1)βj −q−1 (N + j − 1)
−
j =0
αj +q+1 (N + q)βj (N − 1)
ψ˜ N−p φ˜ N j =−1
−
q ≥ 0, p ≥ 2
αj +p (N + j )βj (N + j )
−
j =−1
αj +p (N )βj (N − p)
αj (N + j + p)βj +p+q (N + j + p) j =−1
ψ˜ N−p φ˜ N+q
j =0
−γ (N − 1)αp−1 (N − 1)
p≥2
q ≥ 2, p ≥ 0
αj (N )βj −1 (N − 1)
αj (N + j )βj +q (N + j )
q≥2
ψ˜ N+p φ˜ N−q
αl (N + q)βp−q+l (N + p)
j =1
−
ψ˜ N φ˜ N−q
p≥1
j =1 αj (N + j − 1)βj (N + j − 1) *d2 j =1 αj (N )βj (N )
−γ 2 (N − 1) −
αk+p (N + k)βk+q (N + k)
j =1
ψ˜ N+p φ˜ N−1
+
γ 2 (N − 1)
*d2
ψ˜ N φ˜ N−1
Eq. (4.20)
γ 2 (N − 1)
0
βj +p+q (N + p)αj (N − q) 0
j =q−1
−
j =−1
αj +p (N + j )βj −q (N + j ) αj +p+q (N + q)βj (N − p)
We have defined $\alpha_j := 0$ if $j \notin [-1, \ldots, d_2]$ and $\beta_j \equiv 0$ if $j \notin [-1, \ldots, d_1]$.
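This pairing does not change if we take Fourier–Laplace transforms of $\Psi$ or $\Phi$, so we can easily write the deformations of the two kernels $K_{11}$ and $K_{22}$.

Proposition 4.3. For any two sequences satisfying both the deformation equations and the x-recursion relations we have
$$\frac{\partial}{\partial u_K} \left( \overset{N-1}{\underline{\tilde\Psi}}(y'),\ \overset{N}{B}\, \underset{N}{\tilde\Phi}(y) \right) = (y' - y) \sum_{j=1}^{K} \sum_{l=1}^{j} U_j^K(N-l)\, \tilde\psi_{N-l+j}\, \tilde\phi_{N-l}, \tag{4.27}$$
$$\frac{\partial}{\partial v_J} \left( \overset{N-1}{\underline{\tilde\Psi}}(y'),\ \overset{N}{B}\, \underset{N}{\tilde\Phi}(y) \right) = (y' - y) \sum_{j=1}^{J} \sum_{l=1}^{j} V_j^J(N-l)\, \tilde\phi_{N-l+j}\, \tilde\psi_{N-l}, \tag{4.28}$$
$$\frac{\partial}{\partial u_K} \left( \overset{N-1}{\underline{\tilde\Phi}}(x'),\ \overset{N}{A}\, \underset{N}{\tilde\Psi}(x) \right) = (x' - x) \sum_{j=1}^{K} \sum_{l=1}^{j} U_j^K(N-l)\, \tilde\psi_{N-l+j}\, \tilde\phi_{N-l}, \tag{4.29}$$
$$\frac{\partial}{\partial v_J} \left( \overset{N-1}{\underline{\tilde\Phi}}(x'),\ \overset{N}{A}\, \underset{N}{\tilde\Psi}(x) \right) = (x' - x) \sum_{j=1}^{J} \sum_{l=1}^{j} V_j^J(N-l)\, \tilde\phi_{N-l+j}\, \tilde\psi_{N-l}. \tag{4.30}$$

Proof. We prove only one identity, the others being completely similar. If the two sequences consist of the orthogonal quasi-polynomials (and the corresponding Fourier–Laplace transforms), then the equality follows immediately from:
$$\frac{\partial}{\partial u_K} \left( \overset{N-1}{\underline\Psi}(y'),\ \overset{N}{B}\, \underset{N}{\Phi}(y) \right) = (y' - y)\, \frac{\partial}{\partial u_K} \sum_{n=0}^{N-1} \underline\psi_n(y')\, \phi_n(y) = (y' - y) \sum_{j=1}^{K} \sum_{l=1}^{j} U_j^K(N-l)\, \underline\psi_{N-l+j}\, \phi_{N-l}. \tag{4.31}$$
Expanding both sides by means of the recursion relations and the deformation equations in order to obtain linear expressions in $F_{p,q}(y', y)$ in the LHS and RHS, one concludes that the coefficients must be the same. But the same final expression relies only on the recursion relations, and hence the equality holds for any pair of sequences of functions $\tilde\psi_n(y')$ and $\tilde\phi_n(y)$ satisfying these same recursion relations (by the same argument as in the proof of Prop. 4.2).

Theorem 4.1. If $\{\tilde\psi_n(y)\}_{n\in\mathbb{N}}$ and $\{\tilde\phi_n(y)\}_{n\in\mathbb{N}}$ (or $\{\tilde\phi_n(x)\}_{n\in\mathbb{N}}$ and $\{\tilde\psi_n(x)\}_{n\in\mathbb{N}}$) are arbitrary pairs of sequences of functions satisfying the recursion relations (2.24), (2.20) (resp. (2.26), (2.18)), the differential relations (2.25), (2.21) (resp. (2.27), (2.19)) and the deformation equations (2.85)–(2.88), then the bilinear expressions
$$\tilde f_N(y) := \left( \overset{N-1}{\underline{\tilde\Psi}}(y),\ \overset{N}{B}\, \underset{N}{\tilde\Phi}(y) \right), \tag{4.32}$$
$$\tilde g_N(x) := \left( \overset{N-1}{\underline{\tilde\Phi}}(x),\ \overset{N}{A}\, \underset{N}{\tilde\Psi}(x) \right), \tag{4.33}$$
are independent of $y$ (resp. $x$) and $N$, and also are constant in the deformation parameters $\{u_K, v_J\}$.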
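Proof. Using Prop. 4.2 and setting $y' = y$ we find at once that
$$\frac{d}{dy} \tilde f_N(y) = 0,$$
i.e. $\tilde f_N(y) = \tilde f_N$ does not depend on $y$. A similar computation shows that $\tilde g_N(x)$ is independent of $x$. Now we also compute (for, say, $M < N$)
$$\left( \overset{N-1}{\underline{\tilde\Psi}}(y'),\ \overset{N}{B}\, \underset{N}{\tilde\Phi}(y) \right) - \left( \overset{M-1}{\underline{\tilde\Psi}}(y'),\ \overset{M}{B}\, \underset{M}{\tilde\Phi}(y) \right) = (y' - y) \sum_{n=M}^{N-1} \tilde\psi_n(y')\, \tilde\phi_n(y). \tag{4.34}$$
Letting $y' = y$ we obtain $\tilde f_N = \tilde f_M$, and similarly for $\tilde g_N$. To prove the independence of the deformation parameters $u_K$ and $v_J$, we use Prop. 4.3 in a similar way to the above and then again set $y' = y$ (or $x' = x$) to conclude the proof.

Theorem 4.1 allows us to conclude that we can choose fundamental systems of solutions to the pairs of dual differential-difference-deformation equations normalized in such a way that the pairing gives the identity matrix.

Corollary 4.1. There exist two pairs of sequences of fundamental matrix solutions to the difference–differential–deformation equations (3.1)–(3.8), (3.29)–(3.32), $(\underset{N}{\Psi}, \overset{N-1}{\underline\Phi})$, $(\underset{N}{\Phi}, \overset{N-1}{\underline\Psi})$, such that
$$\left( \overset{N-1}{\underline\Phi},\ \overset{N}{A}\, \underset{N}{\Psi} \right) \equiv \mathbf{1}, \qquad \left( \overset{N-1}{\underline\Psi},\ \overset{N}{B}\, \underset{N}{\Phi} \right) \equiv \mathbf{1}. \tag{4.35}$$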
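We conclude with the following theorem.

Theorem 4.2. The differential-deformation systems
$$\partial_y \underset{N}{\Phi} = -\overset{N}{D}_2(y)\, \underset{N}{\Phi}, \qquad \partial_{u_K} \underset{N}{\Phi} = -\overset{N,\Phi}{U^K}\, \underset{N}{\Phi}, \qquad \partial_{v_J} \underset{N}{\Phi} = \overset{N,\Phi}{V^J}\, \underset{N}{\Phi}, \tag{4.36}$$
$$\partial_y \overset{N-1}{\underline\Psi} = \overset{N-1}{\underline\Psi}\, \overset{N}{\underline D}_2(y), \qquad \partial_{u_K} \overset{N-1}{\underline\Psi} = \overset{N-1}{\underline\Psi}\, \overset{N,\Psi}{\underline U^K}, \qquad \partial_{v_J} \overset{N-1}{\underline\Psi} = -\overset{N-1}{\underline\Psi}\, \overset{N,\Psi}{\underline V^J}, \tag{4.37}$$
for $K = 1, \ldots, d_1 + 1$; $J = 1, \ldots, d_2 + 1$, are put in duality by the matrix $\overset{N}{B}$,
$$\overset{N}{B}\, \overset{N}{D}_2(y) = \overset{N}{\underline D}_2(y)\, \overset{N}{B}, \tag{4.38}$$
$$\partial_{u_K} \overset{N}{B} = \overset{N}{B}\, \overset{N,\Phi}{U^K}(y) - \overset{N,\Psi}{\underline U^K}(y)\, \overset{N}{B}, \tag{4.39}$$
$$\partial_{v_J} \overset{N}{B} = \overset{N,\Psi}{\underline V^J}(y)\, \overset{N}{B} - \overset{N}{B}\, \overset{N,\Phi}{V^J}(y). \tag{4.40}$$
In particular, since the matrices $\overset{N}{D}_2(y)$ and $\overset{N}{\underline D}_2(y)$ are conjugate to each other, their spectral curves are the same. Similarly, the differential-deformation systems
$$\partial_x \underset{N}{\Psi} = -\overset{N}{D}_1(x)\, \underset{N}{\Psi}, \qquad \partial_{u_K} \underset{N}{\Psi} = \overset{N,\Psi}{U^K}\, \underset{N}{\Psi}, \qquad \partial_{v_J} \underset{N}{\Psi} = -\overset{N,\Psi}{V^J}\, \underset{N}{\Psi}, \tag{4.41}$$
$$\partial_x \overset{N-1}{\underline\Phi} = \overset{N-1}{\underline\Phi}\, \overset{N}{\underline D}_1(x), \qquad \partial_{u_K} \overset{N-1}{\underline\Phi} = -\overset{N-1}{\underline\Phi}\, \overset{N,\Phi}{\underline U^K}, \qquad \partial_{v_J} \overset{N-1}{\underline\Phi} = \overset{N-1}{\underline\Phi}\, \overset{N,\Phi}{\underline V^J}, \tag{4.42}$$
$K = 1, \ldots, d_1 + 1$; $J = 1, \ldots, d_2 + 1$, are put in duality by the matrix $\overset{N}{A}$,
$$\overset{N}{A}\, \overset{N}{D}_1(x) = \overset{N}{\underline D}_1(x)\, \overset{N}{A}, \tag{4.43}$$
$$\partial_{v_J} \overset{N}{A} = \overset{N}{A}\, \overset{N,\Psi}{V^J}(x) - \overset{N,\Phi}{\underline V^J}(x)\, \overset{N}{A}, \tag{4.44}$$
$$\partial_{u_K} \overset{N}{A} = \overset{N,\Phi}{\underline U^K}(x)\, \overset{N}{A} - \overset{N}{A}\, \overset{N,\Psi}{U^K}(x), \tag{4.45}$$
and hence the spectral curves of $\overset{N}{D}_1(x)$ and $\overset{N}{\underline D}_1(x)$ are also the same.

Proof. The three relations follow easily by taking a fundamental system of solutions for the two compatible differential–deformation systems and using Theorem 4.1.

This theorem together with Prop. 4.1 proves that the four spectral curves
$$\begin{array}{ccc} \det\left[x\mathbf{1} - \overset{N}{\underline D}_2(y)\right] = 0 & \overset{\text{Prop. 4.1}}{\longleftrightarrow} & \det\left[y\mathbf{1} - \overset{N}{D}_1(x)\right] = 0 \\ \updownarrow\ \text{Thm. 4.1} & & \updownarrow\ \text{Thm. 4.1} \\ \det\left[x\mathbf{1} - \overset{N}{D}_2(y)\right] = 0 & \overset{\text{Prop. 4.1}}{\longleftrightarrow} & \det\left[y\mathbf{1} - \overset{N}{\underline D}_1(x)\right] = 0 \end{array} \tag{4.46}$$
all coincide.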
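The step from an intertwining relation such as (4.38) or (4.43) to equality of spectral curves rests on the fact that conjugate matrices share a characteristic polynomial. A minimal sketch of that fact follows, with arbitrary illustrative matrices standing in for $B$ and $D_2$ (they are not taken from the paper); the characteristic polynomial is computed with the Faddeev–LeVerrier recursion in exact arithmetic.

```python
from fractions import Fraction


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def inverse(M):
    """Gauss-Jordan inverse over the rationals (assumes M is invertible)."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        pv = aug[c][c]
        aug[c] = [x / pv for x in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]


def char_poly(M):
    """Coefficients [1, c_1, ..., c_n] of det(lambda*1 - M), highest power first
    (Faddeev-LeVerrier recursion)."""
    n = len(M)
    coeffs = [Fraction(1)]
    Mk = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(1, n + 1):
        AM = matmul(M, Mk)
        c = -sum(AM[i][i] for i in range(n)) / k
        coeffs.append(c)
        Mk = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return coeffs


# Illustrative stand-ins: B plays the role of the pairing matrix, D of D_2(y)
B = [[2, 1, 0], [0, 1, 3], [1, 0, 1]]
D = [[1, 2, 0], [0, -1, 4], [3, 0, 2]]
D_dual = matmul(matmul(B, D), inverse(B))   # defined so that B D = D_dual B

print(matmul(B, D) == matmul(D_dual, B))
print(char_poly(D) == char_poly(D_dual))
```

In the paper the dual matrix is given independently and the intertwining relation is the content of the theorem; here it is constructed by conjugation only to illustrate the consequence.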
4.3. Concluding remarks. In this work, the main results concern the compatibility of the difference-differential-deformation systems arising from the “folding” procedure (Proposition 3.1), and the resulting spectral duality Theorems 4.1 and 4.2. The constancy of the bilinear pairings between solutions given by Corollary 4.1 may be viewed as a form of the bilinear relations for Baker functions which imply the Hirota bilinear equations for the associated tau function of Eq. (1.2).

Another consequence of this compatibility is the fact that the (generalized) monodromy of the covariant derivative operators $\frac{d}{dx} - \overset{N}{D}_1(x)$ and $\frac{d}{dy} - \overset{N}{D}_2(y)$ is independent of both the continuous deformation parameters $\{u_K, v_J\}$ and the integer $N$; i.e., we have a dual pair of families of differential operators whose coefficients satisfy differential equations in the parameters $\{u_K, v_J\}$ and difference equations in the discrete parameter $N$ that generate isomonodromic deformations. Associated to such isomonodromic deformation equations, there is a sequence of isomonodromic tau functions in the sense of refs. [31, 32]. However, since the highest terms of the polynomial matrices $\overset{N}{D}_1(x)$ and $\overset{N}{D}_2(y)$ have a very degenerate spectrum (in fact, they have rank 1), the standard definition of the isomonodromic tau function does not apply. To introduce a suitable definition for this situation, an analysis of the formal asymptotics at $x = \infty$ (or $y = \infty$) is required.

Also, the systems of Proposition 3.1 represent, in a sense, the “vacuum” isomonodromic deformation systems associated with the Fredholm kernels appearing in Proposition 2.1. When the corresponding integral operator is supported on a union of intervals, the computation of its resolvent is equivalent to a Riemann-Hilbert problem with discontinuities given across these cuts [29]. The resulting “dressed” Baker functions determine isomonodromic families of covariant derivative operators having, in addition to polynomial parts, poles at the endpoints of the intervals, which may be viewed as new deformation parameters. The associated isomonodromic tau functions are given by the Fredholm determinants of the integral operator supported on the union of intervals [25, 29]. The study of the formal asymptotics associated to the vacuum systems, the corresponding isomonodromic tau functions and the relation of these to the spectral invariants will be developed in a subsequent work ([6]), as will the study of the $N \to \infty$ asymptotics of the biorthogonal polynomials and associated Fredholm kernels.

A. Appendix: Multi-Matrix Model

The “multi-matrix-model” is a generalization of the 2-matrix-model, which was introduced in the context of string theory and conformal field theory [13, 12], and has been extensively studied [37, 38, 12]. Our notations in the following mainly follow [20]. Calculations of spectral statistics in this model again involve biorthogonal polynomials which obey linear differential systems of finite rank. It will be shown in this appendix that these also satisfy an extended form of the spectral duality relations derived for the 2-matrix case. The results will be summarized, but only a brief sketch of the proofs will be indicated.

Consider $m \geq 2$ random hermitian $N \times N$ matrices $M_1, M_2, \ldots, M_m$, with the measure
$$d\mu = \prod_{k=1}^{m} e^{-\mathrm{Tr}\, V_k(M_k)} \prod_{k=1}^{m-1} e^{\mathrm{Tr}\, M_k M_{k+1}} \prod_{k=1}^{m} dM_k, \tag{A.1}$$
where $dM_k$ is the standard Lebesgue measure for hermitian matrices, and the potentials $V_k$, $k = 1, \ldots, m$, are polynomials of degrees $d_k + 1$, with coefficients
$$V_k(x) = u_{k,0} + \sum_{l=1}^{d_k+1} \frac{u_{k,l}}{l}\, x^l. \tag{A.2}$$
As in the 2-matrix case, all the correlation functions and statistical properties of the eigenvalues of the $m$ matrices can be expressed in terms of determinants involving $m^2$ Fredholm integral kernels, which are constructed from an infinite sequence of biorthogonal polynomials and their integral transforms [22]. In this case, the biorthogonal polynomials
$$\pi_n(x) = x^n + \cdots, \qquad \sigma_n(y) = y^n + \cdots \tag{A.3}$$
are defined to be orthogonal in the following sense:
$$\int \cdots \int dx_1\, dx_2 \cdots dx_{m-1}\, dx_m \prod_{k=1}^{m} e^{-V_k(x_k)} \prod_{k=1}^{m-1} e^{x_k x_{k+1}}\, \pi_n(x_1)\, \sigma_l(x_m) = h_n \delta_{nl}, \tag{A.4}$$
where the integral is convergent on the real axis if all the degrees $d_k + 1$ are even, and the leading coefficients are positive. Otherwise we need to consider other integration paths
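in the complex plane, without boundaries, so that integration by parts may be done. This uniquely determines the polynomials $\pi_n$ and $\sigma_n$ for all $n$.

From $\pi_n$, we define the following $m$ sequences of functions $\{\psi_{1,n}\}_{n=0,\ldots,\infty}, \ldots, \{\psi_{m,n}\}_{n=0,\ldots,\infty}$:
$$\psi_{1,n}(x) := \frac{1}{\sqrt{h_n}}\, \pi_n(x)\, e^{-V_1(x)}, \qquad \psi_{2,n}(x) := \int dy\, \psi_{1,n}(y)\, e^{xy}, \qquad \psi_{k+1,n}(x) := \int dy\, \psi_{k,n}(y)\, e^{xy}\, e^{-V_k(y)} \quad \text{for } m - 1 \geq k \geq 2, \tag{A.5}$$
and from $\sigma_n$, the following $m$ sequences of functions $\{\phi_{1,n}\}_{n=0,\ldots,\infty}, \ldots, \{\phi_{m,n}\}_{n=0,\ldots,\infty}$:
$$\phi_{m,n}(x) := \frac{1}{\sqrt{h_n}}\, \sigma_n(x)\, e^{-V_m(x)}, \qquad \phi_{k-1,n}(x) := \int dy\, \phi_{k,n}(y)\, e^{xy}\, e^{-V_{k-1}(x)} \quad \text{for } m \geq k \geq 3, \qquad \phi_{1,n}(x) := \int dy\, \phi_{2,n}(y)\, e^{xy}, \tag{A.6}$$
which are dual bases for the respective spaces they span:
$$\int dx\, \psi_{k,n}(x)\, \phi_{k,l}(x) = \delta_{nl}. \tag{A.7}$$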
A.1. Recursion relations. We define the semi-infinite matrices $P_k$ and $Q_k$ for each $k = 1, \ldots, m$, such that [9, 20]
$$x\, \psi_{k,n}(x) = \sum_l Q_k(n,l)\, \psi_{k,l}(x), \qquad \frac{\partial}{\partial x} \psi_{k,n}(x) = \sum_l P_k(n,l)\, \psi_{k,l}(x), \tag{A.8}$$
where it will be shown below that these only involve finite sums. From the pairing (A.7), and integration by parts, we have
$$x\, \phi_{k,n}(x) = \sum_l Q_k^t(n,l)\, \phi_{k,l}(x), \qquad \frac{\partial}{\partial x} \phi_{k,n}(x) = -\sum_l P_k^t(n,l)\, \phi_{k,l}(x). \tag{A.9}$$
Note that these matrices all satisfy Heisenberg relations [9],
$$[P_k, Q_k] = \mathbf{1}. \tag{A.10}$$
Using the definitions of $\psi_{k,n}$ and $\phi_{k,n}$, we find the following relationships between them:
$$P_{k+1} = Q_k, \quad k \in [1, m-1], \qquad -P_k = Q_{k+1} - V_k'(Q_k), \quad k \in [2, m-1], \qquad -P_1 = Q_2. \tag{A.11}$$
In particular, this implies that
$$Q_{k-1} + Q_{k+1} = V_k'(Q_k) \quad \text{for } k \in [2, m-1]. \tag{A.12}$$
These relations are enough to ensure that all the matrices $Q_k$ and $P_k$ are of finite band type.
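Proposition A.1. The matrix $Q_k$ has $r_k$ bands above the principal diagonal and $s_k$ bands below the principal diagonal, with
$$r_1 = 1 \quad \text{and} \quad r_k = \prod_{l=1}^{k-1} d_l \quad \text{for } k \in [2, m], \tag{A.13}$$
$$s_m = 1 \quad \text{and} \quad s_k = \prod_{l=k+1}^{m} d_l \quad \text{for } k \in [1, m-1]. \tag{A.14}$$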
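The band counts of Proposition A.1 can be tabulated for a given list of degrees. The sketch below makes one assumption that should be stated plainly: the operator lost from (A.13) and (A.14) in this copy of the text is read as a product over the $d_l$, which is what the recursion (A.12) yields, since a degree $d_k$ polynomial in a matrix with $r_k$ upper bands has $d_k r_k$ upper bands. The function names are illustrative.

```python
from math import prod


def band_widths(d):
    """d = [d_1, ..., d_m]. Returns (r, s) with r[k-1] = r_k and s[k-1] = s_k,
    assuming r_k = d_1 ... d_{k-1} and s_k = d_{k+1} ... d_m."""
    m = len(d)
    r = [1] + [prod(d[:k - 1]) for k in range(2, m + 1)]
    s = [prod(d[k:]) for k in range(1, m)] + [1]
    return r, s


def upper_bands_by_recursion(d):
    """Upper band counts from Q_2 = -P_1 and Q_{k+1} = V_k'(Q_k) - Q_{k-1}."""
    r = [1]
    for k in range(1, len(d)):
        previous = r[-2] if len(r) > 1 else 0
        r.append(max(d[k - 1] * r[-1], previous))
    return r


r, s = band_widths([2, 3, 2])
window_sizes = [rk + sk for rk, sk in zip(r, s)]
print(r, s, window_sizes)
```

For $m = 2$ the window sizes $r_k + s_k$ reduce to $d_2 + 1$ and $d_1 + 1$, the sizes of the windows used in the two-matrix case.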
Proof. $Q_1$ multiplies the vector of polynomials $\{\pi_n(x)\}_{n=0,\ldots,\infty}$ by $x$, and can therefore raise the degree at most by 1; i.e. $Q_1$ has at most one line above the diagonal. From the same argument, since the multiplication by $P_1 + V_1'(Q_1)$ takes the derivative of the vector of polynomials $[\pi_n(x)]_{n=0,\ldots,\infty}$ with respect to $x$, it must lower the degree; therefore $P_1$ has at most $d_1 = \deg V_1'$ lines above the diagonal and so does $Q_2 = -P_1$. Using (A.12) recursively, it follows that $Q_k$ has at most $r_k$ lines above the diagonal. Repeating the argument with the polynomials $\{\sigma_n\}$, we see that $Q_k^t$ has at most $s_k$ lines above the diagonal.

Denoting
$$\alpha_{k,l}(n) := Q_k(n, n+l), \tag{A.15}$$
the recursion relations may be written componentwise as
$$x\, \psi_{k,n}(x) = \sum_{l=-s_k}^{+r_k} \alpha_{k,l}(n)\, \psi_{k,n+l}(x), \qquad x\, \phi_{k,n}(x) = \sum_{l=-s_k}^{+r_k} \alpha_{k,l}(n-l)\, \phi_{k,n-l}(x), \tag{A.16}$$
which implies
$$\frac{\partial}{\partial x} \psi_{1,n}(x) = -\sum_{l=-s_2}^{+r_2} \alpha_{2,l}(n)\, \psi_{1,n+l}(x), \qquad \frac{\partial}{\partial x} \psi_{k,n}(x) = \sum_{l=-s_{k-1}}^{+r_{k-1}} \alpha_{k-1,l}(n)\, \psi_{k,n+l}(x), \quad k \in [2, m], \tag{A.17}$$
and
$$\frac{\partial}{\partial x} \phi_{1,n}(x) = \sum_{l=-s_2}^{+r_2} \alpha_{2,l}(n-l)\, \phi_{1,n-l}(x), \qquad \frac{\partial}{\partial x} \phi_{k,n}(x) = -\sum_{l=-s_{k-1}}^{+r_{k-1}} \alpha_{k-1,l}(n-l)\, \phi_{k,n-l}(x), \quad k \in [2, m]. \tag{A.18}$$
Note that
$$\alpha_{1,1}(n) = \alpha_{m,-1}(n+1) = \gamma(n) = \sqrt{\frac{h_{n+1}}{h_n}}. \tag{A.19}$$
A.2. Folding. Again, it is possible to “fold” these recursion relations to form finite rank linear differential systems with polynomial coefficients. For each k, define the following windows of size rk + sk : ψk,N−sk .. *k = ,k = φk,N−rk , . . . , φk,N+sk −1 . (A.20) , . N N ψk,N+rk −1 Shift operators. For each k, define “ladder” matrices of size rk + sk , 0 1 0 0 .. . 0 0 0 , ak (x) = 0 0 0 1 N α αk,rk −1 (N) x−αk,0 (N) k,−sk (N) − αk,r (N) . . . αk,r (N) . . . − αk,r (N) k αk,r k−1 (N−rk +1) k − αkk,r (N−rk ) 1 0 0 k .. k,0 (N) . αx−α(N−r 0 0 k,rk k) , a˜ k (x) = .. N . 0 0 1 αk,−s (N+sk ) − αk,r k(N−rk ) 0 0 0
(A.21)
k
which implement the shifts2 in N , ak (x) *k = *k , N
N
,k a˜ k (x) = ,k .
N +1
N N
(A.22)
N−1
Differential systems. The differential systems3 satisfied by the vectors *k N (x) and ,k N (x) are: N ∂ ,k (x) = − ,k (x) D˜ k (x), ∂x N N
N N ∂ N *k (x) = Dk (x) *k (x), ∂x
(A.23)
N
N
where Dk (x) and D˜ k (x) are matrices of size rk + sk , and with polynomial coefficients of degree at most dk in x. N
The matrices Dk (x) are given by: N D1 (x) N Dk (x)
r2 l 1N α 2,l = − α 2,0 − 1N
l=1 rk−1
j =1
a1 N+l−j
l kN α k−1,l = α k−1,0 + kN
for m ≥ k ≥ 2,
l=1
j =1
s2 l−1 1N α 2,−l (x) −
ak N+l−j
l=1 sk−1
j =0
a1 (x) N−l+j
l−1 kN α k−1,−l (x) + l=1
−1
j =0
,
−1
ak (x) N−l+j (A.24)
2 In the m = 2 case, we had a = a , a = a˜ , b = (a˜ t )−1 , b = (a t )−1 . 1 1 2 2 3 Notice that the notations are changed for m = 2. D is now what we called −D t and D ˜ 2 is what we 2 2 previously called −D2t .
where kN
α j,l := diag (αj,l (N − sk ), . . . , αj,l (N + rk − 1)).
(A.25)
Similarly: N
rk−1
1N
D˜ 1 (x) = − α˜ 2,0 − N
kN
D˜ k (x) = α˜ k−1,0 +
l=1 j =0 rk−1 l−1
l=1
for m ≥ k ≥ 2,
l−1
a˜ 1 (x) N−j
−1 1N α˜ , a˜ 1 (x)
sk−1
kN
N+j l=1 j =1
sk−1 l
α˜ 2,l −
l
N−j
l=1
2,−l
−1 a˜ k (x)
a˜ k (x) α˜ k−1,l +
j =0
1N
j =1
N+j
kN
α˜
k−1,−l
(A.26)
where kN
α˜ j,l := diag (αj,l (N − l − rk ), . . . , αj,l (N − l + sk − 1)).
(A.27)
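Christoffel–Darboux matrices. Consider the kernel
$$\overset{N}{K}_{k,k}(x, y) = \sum_{n=0}^{N-1} \phi_{k,n}(y)\, \psi_{k,n}(x). \tag{A.28}$$
The generalization of the Christoffel–Darboux theorem for this kernel reads
$$\begin{aligned} (x - y)\, \overset{N}{K}_{k,k}(x, y) &= \sum_{l=1}^{r_k} \sum_{j=1}^{l} \alpha_{k,l}(N-j)\, \psi_{k,N-j+l}\, \phi_{k,N-j} - \sum_{l=1}^{s_k} \sum_{j=1}^{l} \alpha_{k,-l}(N-j+l)\, \psi_{k,N-j}\, \phi_{k,N-j+l} \\ &= \underset{N}{\Phi_k}(y)\, \overset{N}{A}_k\, \underset{N}{\Psi_k}(x), \end{aligned} \tag{A.29}$$
where the $\psi$'s are evaluated at $x$ and the $\phi$'s at $y$, and $\overset{N}{A}_k$ is the $(r_k + s_k) \times (r_k + s_k)$ matrix: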
0 .. .
...
... 0 ... 0 ... Ak = −αk,−sk (N) . . . N .. . 0 .. .. . . 0 0
αk,rk (N − rk ) 0 0 .. .. .. . . . .. .. . 0 . 0 0 αk,1 (N − 1) . . . αk,rk (N − 1) . −αk,−1 (N ) 0 ... 0 .. . 0 ... 0 .. .. .. .. . . . . 0 0 0 −αk,+sk −1 (N + sk − 1) (A.30) 0 .. .
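There is also a “differential Christoffel–Darboux theorem”, for $k > 1$:
$$\begin{aligned} (\partial_x + \partial_y)\, \overset{N}{K}_{k,k}(x, y) &= \sum_{l=1}^{r_{k-1}} \sum_{j=1}^{l} \alpha_{k-1,l}(N-j)\, \psi_{k,N-j+l}\, \phi_{k,N-j} - \sum_{l=1}^{s_{k-1}} \sum_{j=1}^{l} \alpha_{k-1,-l}(N-j+l)\, \psi_{k,N-j}\, \phi_{k,N-j+l} \\ &= \underset{N}{\hat\Phi_k}(y)\, \overset{N}{A}_{k-1}\, \underset{N}{\hat\Psi_k}(x), \end{aligned} \tag{A.31}$$
where
$$\underset{N}{\hat\Psi_k} = \left( \psi_{k,N-s_{k-1}}, \ldots, \psi_{k,N+r_{k-1}-1} \right)^t, \qquad \underset{N}{\hat\Phi_k} = \left( \phi_{k,N-r_{k-1}}, \ldots, \phi_{k,N+s_{k-1}-1} \right), \tag{A.32}$$
and for $k = 1$,
$$\begin{aligned} (\partial_x + \partial_y)\, \overset{N}{K}_{1,1}(x, y) &= -\sum_{l=1}^{r_2} \sum_{j=1}^{l} \alpha_{2,l}(N-j)\, \psi_{1,N-j+l}\, \phi_{1,N-j} + \sum_{l=1}^{s_2} \sum_{j=1}^{l} \alpha_{2,-l}(N-j+l)\, \psi_{1,N-j}\, \phi_{1,N-j+l} \\ &= -\underset{N}{\hat\Phi_1}(y)\, \overset{N}{A}_2\, \underset{N}{\hat\Psi_1}(x), \end{aligned} \tag{A.33}$$
where
$$\underset{N}{\hat\Psi_1} = \left( \psi_{1,N-s_2}, \ldots, \psi_{1,N+r_2-1} \right)^t, \qquad \underset{N}{\hat\Phi_1} = \left( \phi_{1,N-r_2}, \ldots, \phi_{1,N+s_2-1} \right). \tag{A.34}$$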
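A.3. Duality. All the systems $\overset{N}{D}_k(x)$ and $\overset{N}{\tilde D}_k(x)$ have the same spectral curve. This result follows again in two steps.

• For each $k$, we have:
$$\overset{N}{\tilde D}_k(x)\, \overset{N}{A}_k = \overset{N}{A}_k\, \overset{N}{D}_k(x), \tag{A.35}$$
which implies that
$$\det\left[ y\mathbf{1} - \overset{N}{D}_k(x) \right] = \det\left[ y\mathbf{1} - \overset{N}{\tilde D}_k(x) \right]. \tag{A.36}$$

• The relationship between spectral curves for different $k$ is:
$$\det\left[ x\mathbf{1} - \overset{N}{D}_{k+1}(y) \right] \propto \det\left[ y\mathbf{1} - V_k'(x)\mathbf{1} + \overset{N}{D}_k(x) \right] \quad \text{for } k > 1, \tag{A.37}$$
and
$$\det\left[ x\mathbf{1} - \overset{N}{D}_2(y) \right] \propto \det\left[ y\mathbf{1} + \overset{N}{D}_1(x) \right]. \tag{A.38}$$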
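Proof. We prove (A.35) using the same method as for $m = 2$. Here is a sketch of the proof for $k > 1$. Let $\underset{N}{\tilde\Psi_k}(x)$ and $\underset{N}{\tilde\Phi_k}(y)$ be any solutions of the differential systems
$$\frac{\partial}{\partial x} \underset{N}{\tilde\Psi_k}(x) = \overset{N}{D}_k(x)\, \underset{N}{\tilde\Psi_k}(x), \qquad \frac{\partial}{\partial y} \underset{N}{\tilde\Phi_k}(y) = -\underset{N}{\tilde\Phi_k}(y)\, \overset{N}{\tilde D}_k(y). \tag{A.39}$$
We construct the functions $\tilde\psi_{k,n}(x)$ with $N - s_k - s_{k-1} \leq n \leq N + r_k + r_{k-1} - 1$ and $\tilde\phi_{k,n}(y)$ with $N - r_k - r_{k-1} \leq n \leq N + s_k + s_{k-1} - 1$ by recursively applying the shift operators (which are again compatible with the differential systems). This gives
$$(\partial_x + \partial_y) \left[ \underset{N}{\tilde\Phi_k}(y)\, \overset{N}{A}_k\, \underset{N}{\tilde\Psi_k}(x) \right] = (x - y)\, \underset{N}{\hat{\tilde\Phi}_k}(y)\, \overset{N}{A}_{k-1}\, \underset{N}{\hat{\tilde\Psi}_k}(x). \tag{A.40}$$
This equality holds term by term, since the coefficients for each monomial of type $\tilde\psi_{k,n}(x)\, \tilde\phi_{k,l}(y)$ are the same as when $\tilde\psi_{k,n}(x) = \psi_{k,n}(x)$ and $\tilde\phi_{k,l}(y) = \phi_{k,l}(y)$, which are linearly independent functions of $x$ and $y$. By taking $x = y$ one has, for any $\underset{N}{\tilde\Phi_k}(y)$ and $\underset{N}{\tilde\Psi_k}(x)$,
$$\underset{N}{\tilde\Phi_k}(x) \left( \overset{N}{\tilde D}_k(x)\, \overset{N}{A}_k - \overset{N}{A}_k\, \overset{N}{D}_k(x) \right) \underset{N}{\tilde\Psi_k}(x) = 0. \tag{A.41}$$
Since this holds for any basis of solutions, the factor in brackets $(\ldots)$ must vanish.

Here is a sketch of the proof of (A.37); it is very similar to the proof of Prop. 4.1. First, we show how to prove that
$$\det\left( y\mathbf{1} + D_1(x) \right) \propto \det\left( x\mathbf{1} - D_2(y) \right), \tag{A.42}$$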
the other cases being similar. First, notice that: det (y1 + D1 (x)) =
1N
det − α 2,−s2 det a1 . . . a1 N−s2
× det 1 − a1
y
N−s2 1N α
2,−s2
N−1
a1 . . . a1 + N −1 N−s2 +1
r2 l=−s2 +1
1N
α 2,l
a1 N−s2 1N α
2,−s2
a1 . . . a1 . N+l−1 N−s2 +1 (A.43)
Using Lemma 4.1, the last determinant can be written as the determinant of a block matrix T1 of size (r1 + s1 ) × (r2 + s2 ), det (y1 + D1 (x)) = c1 det (1 − T1 ), c1 = const.,
(A.44)
where
a1
0
0
0
..
0
0
0
0
T1 :=
− a1N −s
N+r2 −1
0 .
0 a1 N−s2 +1
α 1N 2,r α 1N 2,r −1 α 1N 2,−s +1 y−α 1N 2,0 2 −a 2 2 ··· a1N −s ··· − a1N −s 1N −s2 1N 2 α 1N 2 α 1N 2 α 1N α 2,−s2 2,−s2 2,−s2 2,−s2
. (A.45)
On the other hand, by the same argument, we have det (x1 − D2 (y)) = c2 det (1 − T2 ), c2 = const.,
(A.46)
where
T2 :=
a2
0
0
0
..
0
0
0
0
− a2N −s
N+r1 −1
0 .
0 a2 N −s1 +1
α 2N 1,r α 2N 1,−s +1 α 2N 1,r −1 x−α 2N 1 −a 1 1 ··· a2N −s 2N 1,0 ··· − a2N −s 2N −s1 2N 1 α 2N 1 α 1 α 2N α 1,−s 1,−s1 1,−s 1,−s1 1 1
. (A.47)
It is easy to see that $T_1$ and $T_2$ are equal up to permutations of rows and of columns, and therefore they have the same determinant. The other equalities with $k > 1$,
$$\det\left[ x\mathbf{1} - \overset{N}{D}_{k+1}(y) \right] \propto \det\left[ y\mathbf{1} - V_k'(x)\, \mathrm{Id} + \overset{N}{D}_k(x) \right] \quad \text{for } k > 1, \tag{A.48}$$
are proved by the same method and by induction on $k$. We define the sequence of functions $x_j(x, y)$, $1 \leq j \leq k + 1$, such that
$$x_k = x, \qquad x_{k+1} = y, \qquad x_{j-1} = V_j'(x_j) - x_{j+1}, \quad 2 \leq j \leq k. \tag{A.49}$$
We then prove by induction on $j$ that
$$\det\left( x_{j-1} - D_j(x_j) \right) \propto \det\left( x_j - D_{j+1}(x_{j+1}) \right), \quad 2 \leq j \leq k. \tag{A.50}$$
Each step of the induction is similar to the method described above for $k = 1$. This completes the proof of (A.37).

It can also be proven that all these systems are compatible with the shifts and deformations. It follows that if $\underset{N}{\Phi_k}(x)$ and $\underset{N}{\Psi_k}(x)$ denote fundamental solution matrices for the systems (A.23), it is possible to choose their normalizations such that
$$\underset{N}{\Phi_k}(x)\, \overset{N}{A}_k\, \underset{N}{\Psi_k}(x) = \mathbf{1}. \tag{A.51}$$
This may again be viewed as a form of the bilinear identities that allow us to deduce bilinear equations for $\tau$-functions.
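The sequence $x_j(x, y)$ of (A.49) is a scalar analogue of the matrix recursion (A.12) and is easy to evaluate for concrete potentials. The sketch below assumes the coefficient convention of (A.2), so that the derivative of $V_j$ is $\sum_{l \geq 1} u_{j,l}\, x^{l-1}$; the function names and the sample coefficients are hypothetical.

```python
def v_prime(u, x):
    """Derivative of V(x) = u[0] + sum_{l>=1} u[l]/l * x**l, i.e. sum_{l>=1} u[l] x**(l-1)."""
    return sum(u[l] * x ** (l - 1) for l in range(1, len(u)))


def x_sequence(potentials, k, x, y):
    """Return {j: x_j} for 1 <= j <= k+1, starting from x_k = x and x_{k+1} = y
    and descending with x_{j-1} = V_j'(x_j) - x_{j+1} for 2 <= j <= k.
    potentials[j] holds the coefficient list [u_{j,0}, u_{j,1}, ...] of V_j."""
    xs = {k: x, k + 1: y}
    for j in range(k, 1, -1):
        xs[j - 1] = v_prime(potentials[j], xs[j]) - xs[j + 1]
    return xs


# Hypothetical potentials: V_2'(x) = 1 + 2x + 3x^2 and V_3'(x) = x
potentials = {2: [0, 1, 2, 3], 3: [0, 0, 1]}
xs = x_sequence(potentials, 3, 2, 5)
print(xs)
```

By construction every interior term satisfies $x_{j-1} + x_{j+1} = V_j'(x_j)$, mirroring (A.12) with the matrices $Q_j$ replaced by numbers.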
Acknowledgement. The authors would like to thank J. Hurtubise for helpful discussions relating to this work. The first two authors (MB, BE) would like to thank the CRM for support throughout the period (2000–2001) in which this work was completed.
References 1. Adams, M.R., Harnad, J., Hurtubise, J.: Dual Moment Maps to Loop Algebras. Lett. Math. Phys. 20, 294–308 (1990) 2. Adler, M., Van Moerbeke, P.: String-orthogonal polynomials, string equations and 2-Toda symmetries. Comm. Pure and Appl. Math. J. 50, 241–290 (1997) 3. Adler, M., Van Moerbeke, P.: The Spectrum of Coupled Random Matrices. Ann. Math. 149, 921–976 (1999) 4. Baik, J., Deift, P., Johansson, K.: The Longest increasing subsequence in a random permutation and a unitary random matrix model. J. Amer. Math. Soc. 12, 1119–1178 (1999). 5. Banks, T., Fischler, W., Shenker, S.H., Susskind, L.: M theory as a matrix model: A conjecture. Phys. Rev. D55, 5112 (1997), hep-th/9610043 6. Bertola, M., Eynard, B., Harnad, J.: Formal asymptotics of dual isomonodromic families and tau functions associated to two-matrix models. In preparation, 2002 7. Bleher, P.M., Its, A.R., eds.: Random Matrix Models and Their Applications. MSRI Research Publications 40.Cambridge: Cambridge Univ. Press, 2001 8. Boulatov, D.V., Kazakov V.A.: The Ising model on a random planar lattice: the structure of the phase transition and the exact critical exponents. Phys. Lett. B 186, 379 (1987) 9. Chadha, S., Mahoux, G., Mehta, M.L.: A method of integration over matrix variables 2. J. Phys. A: Math. Gen. 14, 579 (1981) 10. Daul, J.M., Kazakov, V., Kostov, I.K.: Rational Theories of 2D Gravity from the Two-Matrix Model. Nucl. Phys. B409, 311–338 (1993), hep-th/9303093 11. David, F.: Planar diagrams, two-dimensional lattice gravity and surface models. Nucl. Phys. B 257 [FS14], 45 (1985) 12. Di Francesco, P., Ginsparg, P., Zinn-Justin, J.: 2D Gravity and Random Matrices. Phys. Rep. 254, 1 (1995) 13. Douglas, M.: Strings in less than one dimension and the generalized KdV hierarchies. Phys. Lett. 238B, 176 (1990) 14. Douglas, M.R.: The two-matrix model. In: Random surfaces and quantum gravity. (Cargèse, 1990), NATO Adv. Sci. Inst. Ser. B Phys. 262. New York: Plenum, 1991, pp. 77–83 15. 
Dubrovin, B.: Geometry of 2D topological field theory. In: Integrable systems and quantum groups. M. Francaviglia, S. Greco eds., Lect. Notes in Math. 1260, Berlin–Heidelberg–New York: Springer, 1996, pp. 120–348 16. Dyson, F.: Statistical theory of energy levels of complex systems, I, II and III. J. Math. Phys. 3, 140–156, 157–165, 166–177 (1962) 17. Dyson, F.: Correlations between the eigenvalues of a random matrix. Commun. Math. Phys. 19, 235 (1970) 18. Ercolani, N.M., McLaughlin, K.T.-R.:Asymptotics and integrable structures for biorthogonal polynomials associated to a random two-matrix model. Physica D 152–153, 232–268 (2001) 19. Eynard, B.: Eigenvalue distribution of large random matrices, from one matrix to several coupled matrices. Nucl. Phys. B 506, 633 (1997), cond-mat/9707005 20. Eynard, B.: Correlation functions of eigenvalues of multi-matrix models, and the limit of a time dependent matrix. J. Phys. A: Math. Gen. 31, 8081 (1998), cond-mat/9801075 21. Eynard, B.: An introduction to random matrices. Lectures given at Saclay, October 2000, notes available at http://www-spht.cea.fr/articles/t01/014/. 22. Eynard, B., Mehta, M.L.: Matrices coupled in a chain: Eigenvalue correlations. J. Phys. A: Math. Gen. 31, 4449 (1998), cond-mat/9710230 23. Gaudin, M.: Sur la loi limite de l’espacement des valeurs propres d’une matrice aléatoire. Nucl. Phys. 25, 447–458 (1960) 24. Guhr, T., Mueller-Groeling, A., Weidenmuller, H.A.: Random matrix theories in quantum physics: Common concepts. Phys. Rep. 299, 189 (1998) 25. Harnad, J., Its, A.R.: Integrable Fredholm Operators and Dual Isomonodromic Deformations. solvint/9706002 Commun. Math. Phys., in press, 2002 26. Harnad, J.: Dual Isomonodromic Deformations and Moment Maps into Loop Algebras. Commun. Math. Phys. 166, 337–365 (1994) 27. Harnad, J.: Dual Isomonodromic Tau Functions and Determinants of Integrable Fredholm Operators. In: Random Matrices and Their Applications, MSRI Research Publications 40. 
Cambridge: Cambridge Univ. Press, eds. P.M. Bleher and A.R. Its, 2001
28. Harnad, J., Tracy C.A., Widom, H.: Hamiltonian Structure of Equations Appearing in Random Matrices. In: Low Dimensional Topology and Quantum Field Theory, ed. H. Osborn. New York: Plenum, 1993, pp. 231–245 29. Its, A.R., Izergin, A.G., Korepin, V.E., Slavnov, N.A.: Differential Equations for Quantum Correlation Functions. Int. J. Mod. Phys. B4, 1003–1037 (1990) 30. Itzykson, C., Zuber, J.B.: The planar approximation (II). J. Math. Phys. 21, 411 (1980) 31. Jimbo, M., Miwa, T., Ueno, K.: Monodromy Preserving Deformation of Linear Ordinary Differential Equations with Rational Coefficients I. Physica 2D, 306–352 (1981) 32. Jimbo, M., Miwa, T.: Monodromy Preserving Deformation of Linear Ordinary Differential Equations with Rational Coefficients II, III. Physica 2D, 407–448 (1981); ibid., 4D, 26–46 (1981) 33. Johansson, K.: Shape fluctuations and random matrices. Commun. Math. Phys. 209, 437–476 (2000) 34. Katz, N., Sarnak, P.: Random Matrices, Frobenius Eigenvalues and Monodromy. A.M.S. Colloquium Publications, Vol. 45. Providence, RI: AMS, 1999 35. Kazakov, V.A.: Ising model on a dynamical planar random lattice: exact solution. Phys Lett. A119, 140–144 (1986) 36. Kostov, I.: Gauge Invariant Matrix Model for the A-D-E Closed Strings. Phys. Lett. B297, 74 (1992), hep-th/9208053 37. Mehta, M.L.: Random Matrices, 2nd edition. New York: Academic Press, 1991 38. Mehta, M.L.: A method of integration over matrix variables. Commun. Math. Phys. 79, 327 (1981) 39. Mehta, M.L., Gaudin, M.: On the density of eigenvalues of a random matrix. Nucl. Phys. 18, 420–427 (1960) 40. Mehta, M.L., Shukla, P.: Two coupled matrices: Eigenvalue correlations and spacing functions. J. Phys. A: Math. Gen. 27, 7793–7803 (1994) 41. Odlyzko, A.M.: On the distribution of spacings between the zeros of the zeta function. Math. Comp. 48, 273–308 (1987) 42. Praehofer, M., Spohn, H.: Universal distributions for growth processes in 1 + 1 dimensions and random matrices. Phys. Rev. Lett. 
84 (2000) 4882, cond-mat/9912264. 43. Rudnick, Z., Sarnak, P.: Zeros of principal L-functions and random matrix theory. Duke Math. J. 81, 269–322 (1996) 44. Szegö, G.: Orthogonal Polynomials. Providence, RI: AMS, 1939 45. Tada, T.: (q, p) critical points from the two-matrix models. Phys. Lett. B 259, 442 (1991) n 46. Tracy, C.A., Widom, H.: Introduction to random matrices. In: Geometric and Quantum Aspects of Integrable Systems, ed. G.F. Helminck. Springer Lecture Notes in Physics 424. Berlin–Heidelberg–New York: Springer, 1993, pp. 103–130 47. Tracy, C.A., Widom, H.: Fredholm determinants, differential equations and matrix models. Commun. Math. Phys. 161, 289–309 (1994) 48. Ueno, K., Takasaki, K.: Toda Lattice Hierarchy. Adv. Studies Pure Math. 4, 1–95 (1984) 49. Userkesm, A., Norset, S.P.: Christoffel–Darboux-Type Formulae and a Recurrence for Biorthogonal Polynomials. Constructive Approximation, 5, 437–454 (1989) 50. Verbaarshot, J.J.M.: Random matrix model approach to chiral symmetry. Nucl. Phys. Proc. Suppl. 53, 88 (1997) 51. Wigner, E.P.: On the statistical distribution of widths and spacings of nuclear resonance levels. Proc. Cambridge Phil. Soc. 47, 790–798 (1951) 52. Zinn-Justin, P.: Universality of correlation functions of hermitian random matrices in an external field. Commun. Math. Phys. 194, 631 (1998), cond-mat/9705044 Communicated by L. Takhtajan
Commun. Math. Phys. 229, 121 – 139 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0679-2
Communications in
Mathematical Physics
The Local Structure of Zero Mode Producing Magnetic Potentials
Daniel M. Elton
School of Mathematical Sciences, University of Sussex, Falmer, Brighton BN1 9QH, United Kingdom.
Received: 30 May 2001 / Accepted: 11 March 2002
Published online: 24 July 2002 – © Springer-Verlag 2002
Supported by EPSRC under grant GR/M21720.

Abstract: We consider the class of continuous magnetic potentials on $\mathbb{R}^3$ which decay as $o(|x|^{-1})$. Within this class it is shown that the set of potentials whose associated Weyl-Dirac operator produces zero modes with multiplicity $m$ forms a smooth submanifold of co-dimension $m^2$ when $m = 0, 1, 2$, and is contained in a smooth submanifold of co-dimension $2m - 1$ when $m \ge 3$.

1. Introduction

Let $A$ be a continuous real magnetic (vector) potential on $\mathbb{R}^3$. Consider the Weyl-Dirac operator $\sigma.(D - A)$, where $\sigma = (\sigma_1, \sigma_2, \sigma_3)$ is the vector of Pauli matrices and $D = -i\nabla$, where $\nabla = (\partial_1, \partial_2, \partial_3)$ denotes the gradient operator on $\mathbb{R}^3$. The operator $\sigma.(D - A)$ acts on 2-component spinor fields, which (for our purposes) are simply $\mathbb{C}^2$ valued functions on $\mathbb{R}^3$. It is a classical result that $\sigma.(D - A)$ with the domain $C_0^\infty$ is essentially self-adjoint (see Theorem 4.3 in [15] for example). Thus $\sigma.(D - A)$ has a unique self-adjoint extension which is an unbounded operator on $L^2(\mathbb{R}^3, \mathbb{C}^2)$. This extension will also be denoted by $\sigma.(D - A)$. In this paper we are interested in the problem of determining when 0 is an eigenvalue of $\sigma.(D - A)$ or, equivalently, of determining when $\sigma.(D - A)$ has a non-trivial kernel.

Definition 1. Any eigenfunction of $\sigma.(D - A)$ corresponding to 0 is called a zero mode.

Zero modes first arose in the work of [9] where it was shown that the existence of zero modes implied single electron atoms of sufficiently large nuclear charge collapse
in the presence of certain external magnetic fields. The first examples of zero modes were then given in the associated paper [13]. Further examples using ideas from [13] appeared in [1] and [6], whilst in [8] a related 2-dimensional problem and ideas from differential geometry were used to construct a new class of examples. An alternative approach was taken in [3] and [4] where the authors were interested in the question of "how many" magnetic potentials (or magnetic fields) from a given class produce zero modes. The results given below are of a similar spirit to those in [3] and [4].

Let us introduce the class of magnetic potentials to which our results apply.

Definition 2. Let $\mathcal{A}$ denote the set of all $A \in C^0(\mathbb{R}^3, \mathbb{R}^3)$ with $A = o(|x|^{-1})$ as $|x| \to \infty$. We define a norm on $\mathcal{A}$ using the expression $\|A\|_{\mathcal{A}} = \|(1 + |x|)A\|_{L^\infty}$.

It is clear that $\mathcal{A}$ is a Banach space which contains $C_0^\infty$ as a dense subset.

Definition 3. For any $m \in \mathbb{N}_0$ let $Z_m$ denote the subset of $\mathcal{A}$ consisting of those $A \in \mathcal{A}$ for which the operator $\sigma.(D - A)$ has an $m$-dimensional kernel.

The set $Z_m$ is simply the set of magnetic potentials in $\mathcal{A}$ which possess zero modes with multiplicity $m$. The following basic properties are established for these sets.

Theorem 1. We have the following:
(i) $\mathcal{A} = \bigcup_{m \ge 0} Z_m$.
(ii) $Z_0$ is a dense open subset of $\mathcal{A}$.
(iii) For any $m \in \mathbb{N}_0$ and non-empty open set $\Omega \subseteq \mathbb{R}^3$, $C_0^\infty(\Omega, \mathbb{R}^3) \cap Z_m \ne \emptyset$.

Remark 1. Parts (i) and (ii) are similar to known results; in particular part (i) simply says $\dim \mathrm{Ker}\, \sigma.(D - A) < +\infty$ for any $A \in \mathcal{A}$, a result which comes from a general theory for elliptic operators on $\mathbb{R}^n$ (see [5, 12, 7] and references therein). Results similar to part (ii) appear in [3] and [4]. In [2] and [8] it is shown that $\dim \mathrm{Ker}\, \sigma.(D - A)$ takes arbitrary finite values if $A$ is allowed to vary over $\mathcal{A}$ (see Sect. 5 for further details). The fact that there are smooth compactly supported magnetic potentials possessing zero modes with arbitrary finite multiplicity appears to be new.
For any $m \in \mathbb{N}_0$ let $SA(m)$ denote the set of self-adjoint $m \times m$ matrices (with complex entries). Therefore $SA(m)$ is a (real) vector space of dimension $m^2$. Our second main result can now be stated as follows.

Theorem 2. For $m = 1, 2$ the set $Z_m$ is a smooth sub-manifold of $\mathcal{A}$ with co-dimension $m^2$. More precisely, let $m \in \{1, 2\}$ and suppose $A_0 \in Z_m$. Then we can find a neighbourhood $U \subseteq \mathcal{A}$ of $A_0$ and a smooth map $\Phi : U \to SA(m)$ with the following properties:
(i) $\Phi(A_0) = 0$.
(ii) For any $m' \ge 0$, $A \in Z_{m'} \cap U$ iff $\mathrm{rank}(\Phi(A)) = m - m'$.
(iii) The Fréchet derivative $D_F\Phi(A_0) : \mathcal{A} \to SA(m)$ has full rank (i.e. rank $m^2$).

It is not clear whether Theorem 2 extends to the case $m \ge 3$; although the map $\Phi$ can be defined in the same manner it is no longer apparent that the derivative $D_F\Phi(A_0)$ has full rank. This property can be established for certain magnetic potentials, namely the class of examples given in [8]; see Corollary 1, Theorem 5 and Remark 5 for further details. We are also able to obtain the following weaker result in general.
Theorem 3. For any $m \ge 3$ the set $Z_m$ is contained in a smooth sub-manifold of $\mathcal{A}$ with co-dimension $2m - 1$. More precisely, let $m \ge 3$ and suppose $A_0 \in Z_m$. Then we can find a neighbourhood $U \subseteq \mathcal{A}$ of $A_0$ and a smooth map $\widetilde{\Phi} : U \to \mathbb{R}^{2m-1}$ with the following properties:
(i) If $A \in Z_m \cap U$ then $\widetilde{\Phi}(A) = 0$.
(ii) The Fréchet derivative $D_F\widetilde{\Phi}(A_0) : \mathcal{A} \to \mathbb{R}^{2m-1}$ has full rank (i.e. rank $2m - 1$).

The paper is arranged as follows. In Sect. 2 we introduce some weighted function spaces on which the operator $\sigma.(D - A)$ defines a Fredholm map and rephrase our problem in terms of these spaces. The map $\Phi$ appearing in Theorem 2 is defined in Sect. 3 where properties (i) and (ii) are also justified. The rank of the derivative $D_F\Phi(A_0)$ is discussed in Sect. 4 where the proofs of Theorems 2 and 3 are completed. Theorem 2 is extended to cover the examples from [8] in Sect. 5 where we also establish Theorem 1. Finally, in Appendix A we give a proof of the Unique Continuation Property (UCP) for $\sigma.(D - A)$.

Notation. We use $\mathbb{N}_0$ to denote the set $\mathbb{N} \cup \{0\}$ and $|\cdot|$ to denote the usual norm on any of the spaces $\mathbb{R}$, $\mathbb{C}$, $\mathbb{R}^3$ or $\mathbb{C}^2$. By writing $H = K \oplus L$ we mean that $K$ and $L$ are complementary subspaces of a Banach space $H$. Also $D_F$ is used to denote the Fréchet derivative of a map between (open subsets of) Banach spaces. For any $k \in \mathbb{Z}$ and $p \in [1, \infty]$, we use $H^{p,k}$ to denote the Sobolev space of "functions with $k$ $p$-integrable derivatives". This notation is simplified to $H^k$ in the case $p = 2$ and $L^p$ in the case $k = 0$. We use $C^0$ and $C^1$ to denote the spaces of continuous and continuously differentiable functions without restrictions on growth at $\infty$. If not otherwise specified, elements of these standard function spaces are assumed to have domain $\mathbb{R}^3$ and codomain either $\mathbb{R}^3$ (for vector fields) or $\mathbb{C}^2$ (for spinors).

2. Function Spaces and Fredholm Maps

In order to make $\sigma.(D - A)$ act as a Fredholm map we need to introduce some weighted Sobolev type spaces.

Definition 4.
Let $\mathcal{H}^1$ denote the Banach space obtained by taking the completion of $C_0^\infty(\mathbb{R}^3, \mathbb{C}^2)$ with respect to the norm defined by
$$\|\phi\|_{\mathcal{H}^1}^2 = \int (1 + |x|^2)^{-1} |\phi(x)|^2 \, d^3x + \sum_{j=1}^{3} \int |D_j \phi(x)|^2 \, d^3x.$$
We also define $\mathcal{H}^{-1}$ to be the Banach space obtained by taking the dual of $\mathcal{H}^1$ with respect to the $L^2$ pairing on $\mathbb{R}^3$ and, to complete the scale of spaces, we set $\mathcal{H}^0 = L^2(\mathbb{R}^3, \mathbb{C}^2)$. For $k = -1, 0, 1$, we will use $\langle \cdot, \cdot \rangle_k$ to denote the sesquilinear pairing obtained from the dual pairing of $\mathcal{H}^k$ and $\mathcal{H}^{-k}$.

Remark 2. The pairing $\langle \cdot, \cdot \rangle_0$ is nothing other than the usual $L^2$ pairing, which will also be written as $\langle \cdot, \cdot \rangle$. For $k = -1, 1$, $\langle \cdot, \cdot \rangle_k$ is an extension of the $L^2$ pairing; it follows that the pairings $\langle \cdot, \cdot \rangle_k$ for $k = -1, 0, 1$ agree on their common domains. We also note that $\mathcal{H}^1 \cap \mathcal{H}^0 = H^1$.
It is easy to check that the operator $\sigma.D$ defines a bounded map $\mathcal{H}^k \to \mathcal{H}^{k-1}$ for $k = 0, 1$. The next result allows us to deal with magnetic potentials (in fact, the need for such a result essentially governs our choice of $\mathcal{A}$).

Proposition 1. If $A \in \mathcal{A}$ then multiplication by $\sigma.A$ defines a compact map $\mathcal{H}^k \to \mathcal{H}^{k-1}$ for $k = 0, 1$.

Proof. Duality reduces the result to the case $k = 1$. Now multiplication by $\sigma.A$ clearly defines a continuous bilinear map $\mathcal{A} \times \mathcal{H}^1 \to \mathcal{H}^0$, whilst multiplication by any element of $C_0^\infty$ defines a compact map $\mathcal{H}^1 \to \mathcal{H}^0$ (by the same argument which is used to show that multiplication by an element of $C_0^\infty$ defines a compact map $H^1 \to L^2$). The fact that $C_0^\infty$ is dense in $\mathcal{A}$ then completes the result.

If $A \in \mathcal{A}$ it follows that $\sigma.(D - A)$ defines a bounded map $\mathcal{H}^k \to \mathcal{H}^{k-1}$ for $k = 0, 1$. This map will be denoted by $T_A$ when $k = 1$. Now $(\mathcal{H}^k)^* = \mathcal{H}^{-k}$ while the operator $\sigma.(D - A)$ is formally self-adjoint (n.b. $A$ is real valued). It follows that the adjoint of the map $T_A$ is the operator $\sigma.(D - A)$ acting as a bounded map $\mathcal{H}^0 \to \mathcal{H}^{-1}$. In particular,
$$T_A \phi = T_A^* \phi \in \mathcal{H}^0 \cap \mathcal{H}^{-1} \quad \text{for all } \phi \in \mathcal{H}^1 \cap \mathcal{H}^0. \tag{1}$$
Theorem 4. If $A \in \mathcal{A}$ then the maps $T_A : \mathcal{H}^1 \to \mathcal{H}^0$ and $T_A^* : \mathcal{H}^0 \to \mathcal{H}^{-1}$ are Fredholm with index 0.

Proof. Standard properties of Fredholm maps and Proposition 1 reduce our consideration to the map $T_A$ with $A = 0$. Now $T_0 = \sigma.D$ is an elliptic system of differential operators on $\mathbb{R}^3$ (n.b. the principal symbol of $T_0$ at any $(x, \xi) \in \mathbb{R}^3 \times \mathbb{R}^3$ is the matrix $\sigma.\xi$ which has determinant $-|\xi|^2$). Furthermore $T_0$ is homogeneous of order 1 and has constant coefficients. The result now follows directly from Theorem 3 in [12], or Theorem 4.31 and Remark 4.27 in [7].

We conclude this section with some useful regularity results; in particular, Proposition 2 restates our problem in terms of the map $T_A$.

Remark 3. If $A \in \mathcal{A}$ and $\phi \in L^2$ satisfies $\sigma.(D - A)\phi \in L^2$ (in a distributional sense) then $\sigma.D\phi \in L^2$ and thus $\phi \in H^1$ (n.b. $\sigma.D$ is a first order elliptic differential operator on $\mathbb{R}^3$ with constant coefficients). On the other hand, if $\phi \in L^2_{loc}$ solves $\sigma.(D - A)\phi = 0$ then standard elliptic regularity results give us $\phi \in H^{p,1}_{loc}$ for all $p \in [1, \infty)$; in particular, taking any $p > 3$ and using a Sobolev embedding theorem, it follows that $\phi$ is continuous.

Proposition 2. If $A \in \mathcal{A}$ then $\mathrm{Ker}\, T_A = \mathrm{Ker}\, T_A^* = \mathrm{Ker}\, \sigma.(D - A)$ (where the last set is the kernel of $\sigma.(D - A)$ regarded as an unbounded operator on $L^2$). In particular, given $m \in \mathbb{N}_0$, we have $A \in Z_m$ iff $\dim \mathrm{Ker}\, T_A = m$.

Proof. From Remark 3 we know that if $\phi \in \mathcal{H}^0 = L^2$ solves $T_A^* \phi = \sigma.(D - A)\phi = 0$ then we must have $\phi \in H^1 = \mathcal{H}^1 \cap \mathcal{H}^0$. It follows that $\mathrm{Ker}\, T_A^* = \mathrm{Ker}\, \sigma.(D - A) \subseteq \mathrm{Ker}\, T_A$. On the other hand $T_A$ is a Fredholm operator with index 0 (by Theorem 4) so $\dim \mathrm{Ker}\, T_A = \dim \mathrm{Ker}\, T_A^*$. The result now follows.

Proposition 3. Suppose $A \in \mathcal{A}$. If $T_A \phi \in \mathcal{H}^0 \cap \mathcal{H}^{-1}$ for some $\phi \in \mathcal{H}^1$ then we must also have $\phi \in \mathcal{H}^0$.
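Proof. Since $T_A$ and $T_A^*$ are Fredholm maps we can write
$$\mathrm{Ran}\, T_A = \{ \psi \in \mathcal{H}^0 : \langle \psi, \theta \rangle_0 = 0 \text{ for all } \theta \in \mathrm{Ker}\, T_A^* \} \tag{2}$$
and
$$\mathrm{Ran}\, T_A^* = \{ \psi \in \mathcal{H}^{-1} : \langle \psi, \theta \rangle_{-1} = 0 \text{ for all } \theta \in \mathrm{Ker}\, T_A \}. \tag{3}$$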
Now suppose $T_A \phi \in \mathcal{H}^0 \cap \mathcal{H}^{-1}$ for some $\phi \in \mathcal{H}^1$. Setting $\psi = T_A \phi$ it follows that $\psi \in \mathrm{Ran}\, T_A$ and $\psi \in \mathcal{H}^{-1}$. If $\theta \in \mathrm{Ker}\, T_A$ then we also have $\theta \in \mathrm{Ker}\, T_A^*$ (by Proposition 2) so $\langle \psi, \theta \rangle_{-1} = \langle \psi, \theta \rangle_0 = 0$ by (2) and Remark 2. Using (3) it follows that we must have $\psi \in \mathrm{Ran}\, T_A^*$. Hence there exists $\phi' \in \mathcal{H}^0$ with $T_A^* \phi' = \psi = T_A \phi$. Since $\psi \in \mathcal{H}^0 = L^2$, Remark 3 then gives $\phi' \in H^1 = \mathcal{H}^1 \cap \mathcal{H}^0$. Therefore $\phi - \phi' \in \mathrm{Ker}\, T_A$ and so $\phi \in \phi' + \mathrm{Ker}\, T_A^* \subset \mathcal{H}^0$ by Proposition 2.

3. Definition of $\Phi$

The essential idea behind our analysis is to reduce the study of the operator $T_A$ for $A$ near some fixed potential $A_0$ to the study of a finite dimensional operator (the matrix $\Phi(A)$). This is achieved by using suitable projections to split the operator $T_A$ into an isomorphism between infinite dimensional spaces, and an operator acting between the finite dimensional spaces $K = \mathrm{Ker}\, T_{A_0}$ and $L$, a complementary space of $\mathrm{Ran}\, T_{A_0}$ (this splitting is carried out in Proposition 4). The most significant technical issue lies in the choice of the projections onto $K$ and $L$ (the operators denoted by $P_1$ and $P_0^*$ below); these must be chosen so as to ensure the symmetry of the operator $T_A$ (cf. (1)) is inherited by the split operator. The necessary properties for the projections (given in (4)) determine how we must choose $L$.

Let $A_0 \in \mathcal{A}$ and set $K = \mathrm{Ker}\, T_{A_0}$, $m = \dim K \in \mathbb{N}_0$. From Proposition 2 we have $\mathrm{Ker}\, T_{A_0}^* = K$ and hence $K \subset \mathcal{H}^1 \cap \mathcal{H}^0$. We now need to choose a space $L$ which simultaneously complements $\mathrm{Ran}\, T_{A_0}$ and $\mathrm{Ran}\, T_{A_0}^*$. Since $\mathrm{Ran}\, T_{A_0}^*$ is a closed subspace of $\mathcal{H}^{-1}$ with co-dimension $\dim \mathrm{Ker}\, T_{A_0} = m$, and $C_0^\infty$ is dense in $\mathcal{H}^{-1}$, we can choose an $m$-dimensional space $L \subset C_0^\infty \subset \mathcal{H}^0 \cap \mathcal{H}^{-1}$ such that $\mathcal{H}^{-1} = L \oplus \mathrm{Ran}\, T_{A_0}^*$. Proposition 3 and (1) then imply $L \cap \mathrm{Ran}\, T_{A_0} = 0$, and hence we have $\mathcal{H}^0 = L \oplus \mathrm{Ran}\, T_{A_0}$ since $\mathrm{Ran}\, T_{A_0}$ is a closed subspace of $\mathcal{H}^0$ with co-dimension $m = \dim L$.

For $k = -1, 0$ let $K_k^\perp \subseteq \mathcal{H}^k$ denote the annihilator of $K \subset \mathcal{H}^{-k}$ and, for $k = 0, 1$, let $L_k^\perp \subseteq \mathcal{H}^k$ denote the annihilator of $L \subset \mathcal{H}^{-k}$. Therefore $K_0^\perp = \mathrm{Ran}\, T_{A_0}$ and $K_{-1}^\perp = \mathrm{Ran}\, T_{A_0}^*$, whilst we can write
$$\mathcal{H}^1 = K \oplus L_1^\perp, \qquad \mathcal{H}^0 = K \oplus L_0^\perp = L \oplus K_0^\perp \qquad \text{and} \qquad \mathcal{H}^{-1} = L \oplus K_{-1}^\perp.$$
We will use these direct sums to define projections; in order to do this explicitly choose a basis $\{\phi_1, \dots, \phi_m\}$ for $K$ and pick a convenient basis for $L$ using the following result.

Lemma 1. We can choose a basis $\{\psi_1, \dots, \psi_m\}$ for $L$ such that $\langle \phi_i, \psi_j \rangle_0 = \langle \phi_i, \psi_j \rangle_1 = \delta_{ij}$ for all $i, j \in \{1, \dots, m\}$.
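Proof. Let $\{\theta_1, \dots, \theta_m\}$ be any basis for $L$ and define a matrix by $Q_{ij} = \langle \phi_i, \theta_j \rangle_0$ for $i, j \in \{1, \dots, m\}$. It is straightforward to check that $Q$ is invertible; indeed if $Qx = 0$ for some $x \in \mathbb{C}^m$, then $x_1 \theta_1 + \dots + x_m \theta_m$ must be in $L \cap K_0^\perp = 0$, which implies $x = 0$. The required basis may now be defined by
$$\psi_j = \sum_{i=1}^{m} (Q^{-1})_{ij}\, \theta_i$$
for $j \in \{1, \dots, m\}$.

For $k = 0, 1$ let $P_k$ denote the projection in $\mathcal{H}^k$ onto $K$ along $L_k^\perp$. As a consequence of Lemma 1 we have that
$$P_k \phi = \sum_{i=1}^{m} \langle \phi, \psi_i \rangle_k\, \phi_i \qquad \text{and} \qquad P_k^* \psi = \sum_{i=1}^{m} \langle \psi, \phi_i \rangle_{-k}\, \psi_i$$
for all $\phi \in \mathcal{H}^k$ and $\psi \in \mathcal{H}^{-k}$. In particular, it follows that $P_k^*$ is the projection in $\mathcal{H}^{-k}$ onto $L$ along $K_{-k}^\perp$. From Remark 2 we also get
$$P_1 \phi = P_0 \phi \in \mathcal{H}^1 \cap \mathcal{H}^0 \qquad \text{and} \qquad P_0^* \psi = P_1^* \psi \in \mathcal{H}^0 \cap \mathcal{H}^{-1} \tag{4}$$
for all $\phi \in \mathcal{H}^1 \cap \mathcal{H}^0$ and $\psi \in \mathcal{H}^0 \cap \mathcal{H}^{-1}$.

Using the projections $P_1$ and $P_0^*$ we can decompose the spaces $\mathcal{H}^1$ and $\mathcal{H}^0$ as $\mathcal{H}^1 = \mathrm{Ran}\, P_1 \oplus \mathrm{Ker}\, P_1 = K \oplus L_1^\perp$ and $\mathcal{H}^0 = \mathrm{Ran}\, P_0^* \oplus \mathrm{Ker}\, P_0^* = L \oplus K_0^\perp$ respectively. With these decompositions in mind, write the map $T_A : \mathcal{H}^1 \to \mathcal{H}^0$ in matrix form as
$$T_A = \begin{pmatrix} S_{00}(A) & S_{01}(A) \\ S_{10}(A) & S_{11}(A) \end{pmatrix}.$$
In particular $S_{11}(A) \in \mathcal{L}(L_1^\perp, K_0^\perp)$ is the bounded linear map defined by
$$S_{11}(A)\phi = (I - P_0^*)\, T_A\, (I - P_1)\phi = (I - P_0^*)\, T_A \phi$$
for all $\phi \in L_1^\perp$. Clearly $A \mapsto S_{11}(A)$ defines a smooth map $\mathcal{A} \to \mathcal{L}(L_1^\perp, K_0^\perp)$, where $\mathcal{L}(L_1^\perp, K_0^\perp)$ is given the operator norm. Furthermore $S_{11}(A_0)$ is simply the map $T_{A_0}$ with domain restricted to a complementary space of $\mathrm{Ker}\, T_{A_0}$ and codomain restricted to $K_0^\perp = \mathrm{Ran}\, T_{A_0}$. It follows that $S_{11}(A_0)$ is an isomorphism. The fact that the set of invertible elements in $\mathcal{L}(L_1^\perp, K_0^\perp)$ is open then allows us to find a neighbourhood $U \subseteq \mathcal{A}$ of $A_0$ such that the map $S_{11}(A) : L_1^\perp \to K_0^\perp$ is invertible for all $A \in U$. Given $A \in U$ we can then define a linear map $S(A) : K \to L$ between finite dimensional vector spaces by
$$S(A) = S_{00}(A) - S_{01}(A)\, S_{11}(A)^{-1} S_{10}(A). \tag{5}$$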
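Proposition 4. For any $A \in U$ and $m' \in \mathbb{N}_0$ we have $A \in Z_{m'}$ iff $S(A) : K \to L$ has rank $m - m'$.

Proof. Assume $A \in U$ so $S_{11}(A)$ is invertible. Define operators
$$T_{A,+} = \begin{pmatrix} I & S_{01}(A) S_{11}(A)^{-1} \\ 0 & I \end{pmatrix} : L \oplus K_0^\perp \to L \oplus K_0^\perp,$$
$$T_{A,-} = \begin{pmatrix} I & 0 \\ S_{11}(A)^{-1} S_{10}(A) & I \end{pmatrix} : K \oplus L_1^\perp \to K \oplus L_1^\perp$$
and
$$T_{A,0} = \begin{pmatrix} S(A) & 0 \\ 0 & S_{11}(A) \end{pmatrix} : K \oplus L_1^\perp \to L \oplus K_0^\perp.$$
A straightforward computation shows $T_{A,+} T_{A,0} T_{A,-} = T_A$. Since $T_{A,\pm}$ and $S_{11}(A)$ are invertible we then get $\dim \mathrm{Ker}\, T_A = \dim \mathrm{Ker}\, T_{A,0} = \dim \mathrm{Ker}\, S(A)$. The result now follows from Proposition 2.

Let $M(m)$ denote the set of $m \times m$ matrices with complex entries and, for any $A \in U$, let $\Phi(A) \in M(m)$ denote the matrix of $S(A)$ with respect to the bases $\{\phi_1, \dots, \phi_m\}$ and $\{\psi_1, \dots, \psi_m\}$ for $K$ and $L$ respectively. From Lemma 1 we get $\Phi(A)_{ij} = \langle S(A)\phi_j, \phi_i \rangle_0$ for $i, j \in \{1, \dots, m\}$. Since $S_{00}(A) = P_0^* T_A P_1$ and $\phi_i, \phi_j \in \mathrm{Ran}\, P_1 \cap \mathrm{Ran}\, P_0$, (5) then gives
$$\Phi(A)_{ij} = \langle T_A \phi_j, \phi_i \rangle_0 - \langle S_{01}(A) S_{11}(A)^{-1} S_{10}(A) \phi_j, \phi_i \rangle_0. \tag{6}$$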
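The reduction in (5) and the factorisation used in the proof of Proposition 4 are the familiar Schur complement construction. The following is a minimal finite dimensional sketch of that algebra only: the 2×2 blocks are made-up numbers standing in for $S_{00}, S_{01}, S_{10}, S_{11}$ (they are not derived from any magnetic potential), and exact rational arithmetic is used so that ranks can be compared without rounding.

```python
from fractions import Fraction


def to_fraction(rows):
    """Convert a nested list of integers to exact rationals."""
    return [[Fraction(entry) for entry in row] for row in rows]


def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def subtract(x, y):
    return [[a - b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def inverse_2x2(x):
    (a, b), (c, d) = x
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def block(top_left, top_right, bottom_left, bottom_right):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom


def rank(x):
    """Rank by Gaussian elimination over the rationals."""
    rows = [row[:] for row in x]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r


def nullity(x):
    return len(x[0]) - rank(x)


IDENTITY = to_fraction([[1, 0], [0, 1]])
ZERO = to_fraction([[0, 0], [0, 0]])

# Hypothetical blocks; S11 is chosen invertible, as in the neighbourhood U.
S00 = to_fraction([[-1, -2], [5, 6]])
S01 = to_fraction([[1, 0], [0, 1]])
S10 = to_fraction([[1, 2], [3, 4]])
S11 = to_fraction([[2, 1], [1, 1]])

S11_INV = inverse_2x2(S11)
# Finite dimensional analogue of (5).
S = subtract(S00, matmul(matmul(S01, S11_INV), S10))

T = block(S00, S01, S10, S11)
T_PLUS = block(IDENTITY, matmul(S01, S11_INV), ZERO, IDENTITY)
T_ZERO = block(S, ZERO, ZERO, S11)
T_MINUS = block(IDENTITY, ZERO, matmul(S11_INV, S10), IDENTITY)

FACTORISED = matmul(matmul(T_PLUS, T_ZERO), T_MINUS)
```

In this toy example `FACTORISED` equals `T` entry by entry, and `nullity(T)` agrees with `nullity(S)`, which mirrors the identity $\dim \mathrm{Ker}\, T_A = \dim \mathrm{Ker}\, S(A)$ used above.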
In Proposition 4 the projections $P_1$ and $P_0^*$ are used to transform the operator $T_A$ into the diagonal matrix operator $T_{A,0}$, which in turn is used to define $\Phi(A)$. By taking adjoints it can be shown that the projections $P_0$ and $P_1^*$ transform the operator $T_A^*$ into $(T_{A,0})^*$. The symmetry given by (1) and (4), together with the regularity result Proposition 3, can then be used to show that $T_{A,0}$ and $(T_{A,0})^*$ agree on their common domain, which implies $\Phi(A)$ must be self-adjoint. These ideas underlie the proof of the next result, although the unnecessary intermediate results are not established explicitly or completely.

Proposition 5. For all $A \in U$ the $m \times m$ matrix $\Phi(A)$ is self-adjoint.

Proof. For each $i \in \{1, \dots, m\}$ set $f_i = S_{11}(A)^{-1} S_{10}(A) \phi_i$. A priori we have $f_i \in L_1^\perp \subseteq \mathcal{H}^1$. Hence $P_1 f_i = 0$ and so, using the definitions of $S_{11}(A)$ and $S_{10}(A)$, we get
$$T_A f_i = S_{11}(A) f_i + P_0^* T_A f_i = (I - P_0^*) T_A \phi_i + P_0^* T_A f_i. \tag{7}$$
However $\mathrm{Ran}\, P_0^* = L \subset \mathcal{H}^0 \cap \mathcal{H}^{-1}$ whilst $\phi_i \in K \subset \mathcal{H}^1 \cap \mathcal{H}^0$ so $(I - P_0^*) T_A \phi_i = (I - P_1^*) T_A^* \phi_i \in \mathcal{H}^0 \cap \mathcal{H}^{-1}$ by (1) and (4). Proposition 3 then gives $f_i \in \mathcal{H}^1 \cap \mathcal{H}^0$. It follows that $P_0 f_i = P_1 f_i = 0$ while (7) can be rewritten as
$$T_A f_i = T_A^* f_i = (I - P_1^*) T_A^* \phi_i + P_1^* T_A^* f_i. \tag{8}$$
For $i, j \in \{1, \dots, m\}$, (8), (7) and the identities $P_1 f_j = 0$, $P_0 \phi_i = \phi_i$ lead to
$$\langle T_A f_j, f_i \rangle_0 = \langle f_j, T_A^* f_i \rangle_1 = \langle T_A (I - P_1) f_j, \phi_i \rangle_0 + \langle P_1 f_j, T_A^* f_i \rangle_1 = \langle (I - P_0^*) T_A \phi_j, \phi_i \rangle_0 + \langle P_0^* T_A f_j, \phi_i \rangle_0 = \langle P_0^* T_A f_j, \phi_i \rangle_0.$$
However we have $P_0^* T_A f_j = S_{01}(A) S_{11}(A)^{-1} S_{10}(A) \phi_j$ by the definitions of $S_{01}(A)$ and $f_j$. Combined with (6) it follows that
$$\Phi(A)_{ij} = \langle T_A \phi_j, \phi_i \rangle_0 - \langle T_A f_j, f_i \rangle_0.$$
Since we have $\phi_i, \phi_j, f_i, f_j \in \mathcal{H}^1 \cap \mathcal{H}^0$, Remark 2 and (1) now complete the result.

Since $T_{A_0} P_1 = 0$ we have $S_{00}(A_0), S_{10}(A_0) = 0$ and so $S(A_0) = 0$ by (5). This observation and Proposition 4 can be reworded for $\Phi$ as follows.

Proposition 6. We have $\Phi(A_0) = 0$. Furthermore, for any $A \in U$ and $m' \in \mathbb{N}_0$, $A \in Z_{m'}$ iff $\Phi(A) \in SA(m)$ has rank $m - m'$.

Proposition 7. The map $\Phi : U \to SA(m)$ is smooth. Furthermore, for any $A \in \mathcal{A}$, the matrix $D_F\Phi(A_0)(A) \in SA(m)$ is given by
$$D_F\Phi(A_0)(A)_{ij} = -\langle \sigma.A\, \phi_j, \phi_i \rangle$$
for $i, j \in \{1, \dots, m\}$.

Proof. For $\alpha, \beta = 0, 1$ the map $A \mapsto S_{\alpha\beta}(A)$ is affine and hence smooth. Furthermore $S_{11} : U \to \mathcal{L}(L_1^\perp, K_0^\perp)$ is pointwise invertible so the Inverse Function Theorem (see Theorem I.5.1 in [11] for example) shows that $A \mapsto S_{11}(A)^{-1}$ is a smooth map on $U$. The fact that $S : U \to \mathcal{L}(K, L)$ and $\Phi : U \to SA(m)$ are smooth now follows immediately from their definitions.

Let $A \in \mathcal{A}$. If $j \in \{1, \dots, m\}$ then $\phi_j \in K = \mathrm{Ker}\, T_{A_0}$ so
$$T_{A_0 + \varepsilon A}\, \phi_j = (T_{A_0} - \varepsilon\, \sigma.A)\, \phi_j = -\varepsilon\, \sigma.A\, \phi_j \tag{9}$$
for any $\varepsilon \in \mathbb{R}$. On the other hand, the maps $S_{01}$, $S_{10}$ and $S_{11}(\cdot)^{-1}$ are smooth on $U$, while $S_{01}(A_0), S_{10}(A_0) = 0$. Hence $S_{01}(A_0 + \varepsilon A), S_{10}(A_0 + \varepsilon A) = O(\varepsilon)$ and $S_{11}(A_0 + \varepsilon A)^{-1} = O(1)$ (all in operator norm) for sufficiently small $|\varepsilon|$. Combined with (6) and (9) we then get
$$\Phi(A_0 + \varepsilon A)_{ij} - \Phi(A_0)_{ij} = -\varepsilon \langle \sigma.A\, \phi_j, \phi_i \rangle + O(\varepsilon^2)$$
for $i, j \in \{1, \dots, m\}$ and sufficiently small $|\varepsilon|$. The result now follows.

4. The Rank of $D_F\Phi$

In view of the results of the previous section the key remaining question concerns the rank of the derivative of $\Phi$.

Definition 5. We call $A_0 \in Z_m$ a regular potential if the map $D_F\Phi(A_0) : \mathcal{A} \to SA(m)$ has full rank (i.e. rank $m^2$).

Remark 4. Since $D_F\Phi(A_0) : \mathcal{A} \to SA(m)$ is continuous, $C_0^\infty$ is dense in $\mathcal{A}$ and $SA(m)$ is finite dimensional, we have $D_F\Phi(A_0)(\mathcal{A}) = D_F\Phi(A_0)(C_0^\infty)$. It follows that $A_0$ is a regular potential iff $D_F\Phi(A_0)(C_0^\infty) = SA(m)$.
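The derivative formula of Proposition 7 takes values in $SA(m)$ because each $\sigma_k$ is self-adjoint and $A$ is real. A rough way to see this concretely is to replace the $L^2$ pairing by a weighted sum over a few sample points. Everything in the sketch below is illustrative: the sample weights, potential values and spinor values are invented placeholders rather than actual zero modes, and the pairing is taken to be linear in its first argument.

```python
# Pauli matrices as 2x2 nested lists.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def sigma_dot(a):
    """Return the 2x2 matrix sigma.a for a real 3-vector a."""
    return [[sum(a[k] * SIGMA[k][r][c] for k in range(3)) for c in range(2)]
            for r in range(2)]


def apply(matrix, spinor):
    return [matrix[r][0] * spinor[0] + matrix[r][1] * spinor[1] for r in range(2)]


def inner(u, v):
    """C^2 inner product, linear in u and conjugate-linear in v (assumed convention)."""
    return sum(x * complex(y).conjugate() for x, y in zip(u, v))


# Hypothetical quadrature data: (weight, A(x), [phi_1(x), phi_2(x)]).
SAMPLES = [
    (0.5, (0.3, -1.2, 0.7), [(1 + 2j, 0.5j), (-0.4, 2 - 1j)]),
    (1.5, (-0.8, 0.1, 0.25), [(0.2 - 0.3j, 1.0), (1j, -1 + 0.5j)]),
]


def derivative_matrix(samples, m):
    """Discrete stand-in for the matrix with entries -<sigma.A phi_j, phi_i>."""
    matrix = [[0j] * m for _ in range(m)]
    for weight, a, spinors in samples:
        s = sigma_dot(a)
        for i in range(m):
            for j in range(m):
                matrix[i][j] -= weight * inner(apply(s, spinors[j]), spinors[i])
    return matrix


D = derivative_matrix(SAMPLES, 2)
```

The resulting matrix `D` is self-adjoint up to rounding, whatever sample values are used, which is the discrete counterpart of $D_F\Phi(A_0)(A) \in SA(m)$.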
The next result indicates in which sense "regular potentials" are well behaved; it is simply a summary of parts of Propositions 5, 6 and 7.

Theorem 5. If $m \in \mathbb{N}_0$ and $A_0 \in Z_m$ is a regular potential then, in a neighbourhood of $A_0$, the set $Z_m$ is a smooth sub-manifold of $\mathcal{A}$ with co-dimension $m^2$. More precisely the smooth map $\Phi : U \to SA(m)$ defined above has the following properties:
(i) $\Phi(A_0) = 0$.
(ii) For any $m' \ge 0$, $A \in Z_{m'} \cap U$ iff $\mathrm{rank}(\Phi(A)) = m - m'$.
(iii) $D_F\Phi(A_0) : \mathcal{A} \to SA(m)$ has full rank (i.e. rank $m^2$).

The defining condition on regular potentials can be restated in a computationally more practical way as follows.

Proposition 8. Let $m \in \mathbb{N}_0$ and suppose $A_0 \in Z_m$. Then $A_0$ is not a regular potential iff we can find $m_\pm \in \mathbb{N}_0$ with $0 < m_+ + m_- \le m$, and linearly independent spinors $\psi_1, \dots, \psi_{m_+ + m_-} \in \mathrm{Ker}\, T_{A_0}$ such that
$$\sum_{k=1}^{m_+} \langle \sigma_j \psi_k, \psi_k \rangle_{\mathbb{C}^2} = \sum_{k=1}^{m_-} \langle \sigma_j \psi_{m_+ + k}, \psi_{m_+ + k} \rangle_{\mathbb{C}^2} \tag{10}$$
for $j = 1, 2, 3$.

Proof. The set $D_F\Phi(A_0)(C_0^\infty)$ is a linear subspace of $SA(m)$ whilst the expression $\mathrm{Tr}(XY)$ defines an inner product on $SA(m)$. With the help of Remark 4 it follows that $A_0$ is not a regular potential iff we can find $0 \ne X \in SA(m)$ such that
$$\mathrm{Tr}\big( X\, D_F\Phi(A_0)(A) \big) = 0 \quad \text{for all } A \in C_0^\infty. \tag{11}$$
Let $X \in SA(m)$. Since $X$ is self-adjoint we can find a unitary matrix $V$ such that $X = V Y V^*$, where $Y = \mathrm{diag}(\lambda_1, \dots, \lambda_m)$ is the matrix of eigenvalues of $X$. Clearly we may assume that these eigenvalues are ordered so that
$$\lambda_k = \begin{cases} y_k^2 & \text{for } k = 1, \dots, m_+, \\ -y_k^2 & \text{for } k = m_+ + 1, \dots, m_+ + m_-, \\ 0 & \text{for } k = m_+ + m_- + 1, \dots, m, \end{cases}$$
where $y_k \in \mathbb{R}^+$ for $k = 1, \dots, m_+ + m_-$ and $m_+, m_- \in \mathbb{N}_0$ are the number of positive and negative eigenvalues of $X$ (counted with multiplicity); in particular $m_+ + m_- \le m$ whilst $X \ne 0$ iff $0 < m_+ + m_-$. Using Proposition 7 we now get
$$-\mathrm{Tr}\big( X\, D_F\Phi(A_0)(A) \big) = \sum_{i,j=1}^{m} X_{ij} \langle \sigma.A\, \phi_i, \phi_j \rangle = \sum_{i,j,k=1}^{m} \lambda_k \langle \sigma.A\, V_{ik} \phi_i, V_{jk} \phi_j \rangle = \sum_{k=1}^{m_+} \langle \sigma.A\, \psi_k, \psi_k \rangle - \sum_{k=1}^{m_-} \langle \sigma.A\, \psi_{m_+ + k}, \psi_{m_+ + k} \rangle, \tag{12}$$
where
$$\psi_k = \sum_{i=1}^{m} y_k V_{ik}\, \phi_i$$
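for $k = 1, \dots, m_+ + m_-$. The fact that $V$ is unitary implies $\psi_1, \dots, \psi_{m_+ + m_-}$ are linearly independent whilst (12) shows (11) holds iff
$$\sum_{k=1}^{m_+} \langle \sigma.A\, \psi_k, \psi_k \rangle = \sum_{k=1}^{m_-} \langle \sigma.A\, \psi_{m_+ + k}, \psi_{m_+ + k} \rangle \tag{13}$$
for all $A \in C_0^\infty$. Given $x_0 \in \mathbb{R}^3$ and $j \in \{1, 2, 3\}$, take a sequence $\{A^{(n)}\}_{n \in \mathbb{N}}$ in $C_0^\infty$ whose $j$th component converges to a Dirac-delta centered at $x_0$ and whose other components are 0. Using the continuity of $\psi_k$ (see Remark 3) we then get
$$\langle \sigma.A^{(n)} \psi_k, \psi_k \rangle = \int A^{(n)}_j \langle \sigma_j \psi_k, \psi_k \rangle_{\mathbb{C}^2} \, d^3x \xrightarrow{\ n \to \infty\ } \langle \sigma_j \psi_k(x_0), \psi_k(x_0) \rangle_{\mathbb{C}^2}$$
for $k = 1, \dots, m_+ + m_-$ (no summation over $j$). The equivalence of (10) and (13) now follows.

We now show that, for certain choices of $m_+$ and $m_-$, (10) cannot be satisfied by linearly independent spinors $\psi_1, \dots, \psi_{m_+ + m_-}$. In the proof of the next result a repeated Latin index will imply summation over $\{1, 2, 3\}$, unless otherwise indicated. We will also use $e_{ijk}$ to denote the totally anti-symmetric tensor.

Proposition 9. Let $A \in \mathcal{A}$ and suppose we have $\psi_1, \dots, \psi_m \in \mathrm{Ker}\, \sigma.(D - A)$ which satisfy
$$\sum_{k=1}^{m} \langle \sigma_j \psi_k, \psi_k \rangle_{\mathbb{C}^2} = 0$$
for $j = 1, 2, 3$. Then $\psi_1 = \dots = \psi_m = 0$.

Proof. Suppose $\psi \in L^2_{loc}$ solves $\sigma.(D - A)\psi = 0$. By Remark 3 we thus have $\psi \in H^{p,1}_{loc}$ for all $p \in [1, \infty)$. Using the fact that $A$ is real we then get
$$\nabla |\psi|^2 = i \langle (D - A)\psi, \psi \rangle_{\mathbb{C}^2} - i \langle \psi, (D - A)\psi \rangle_{\mathbb{C}^2} \tag{14}$$
as functions in $L^p_{loc}(\mathbb{R}^3, \mathbb{C})$. On the other hand the algebraic properties of the Pauli matrices can be coupled with the equation $\sigma.(D - A)\psi = 0$ to show
$$|(D - A)\psi|^2_{\mathbb{R}^3, \mathbb{C}^2} = i e_{ijk} \langle \sigma_k D_i \psi, D_j \psi \rangle_{\mathbb{C}^2} - e_{ijk} A_j \partial_i \langle \sigma_k \psi, \psi \rangle_{\mathbb{C}^2} \tag{15}$$
as functions in $L^p_{loc}(\mathbb{R}^3, \mathbb{C})$. If we additionally suppose that $\psi \in L^2$ then $\psi \in H^1$ (see Remark 3). Integrating (15) over $\mathbb{R}^3$ thus gives
$$\|(D - A)\psi\|^2 = -\int e_{ijk} A_j \partial_i \langle \sigma_k \psi, \psi \rangle_{\mathbb{C}^2} \, d^3x, \tag{16}$$
where we have made use of the symmetry of $\langle \sigma_k D_i \psi, D_j \psi \rangle$ with respect to $i$ and $j$ (n.b. $D_i$ and $D_j$ are commuting self-adjoint operators). By applying (16) to the spinors $\psi_1, \dots, \psi_m$ and using our hypothesis we get
$$\sum_{k=1}^{m} \|(D - A)\psi_k\|^2 = -\int e_{ijl} A_j \partial_i \sum_{k=1}^{m} \langle \sigma_l \psi_k, \psi_k \rangle_{\mathbb{C}^2} \, d^3x = 0.$$
Therefore $(D - A)\psi_k = 0$ as an $L^2$ function for $k = 1, \dots, m$. Combined with (14) it follows that $|\psi_k|$ is constant. However $\psi_k \in L^2$ so we must have $\psi_k = 0$ for $k = 1, \dots, m$.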
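The quantities $\langle \sigma_j \psi, \psi \rangle_{\mathbb{C}^2}$ appearing in (10) and in Proposition 9 form a real 3-vector attached to each value $\psi(x) \in \mathbb{C}^2$. A small sketch, using arbitrary sample spinors and assuming the inner product is linear in its first argument, illustrates two standard facts: this vector has length $|\psi|^2$ and is unchanged when $\psi$ is multiplied by a phase. It also shows that the vectors of two non-zero spinors can cancel at a single point, so the conclusion of Proposition 9 genuinely relies on the equation $\sigma.(D - A)\psi_k = 0$ and not on pointwise algebra alone.

```python
import cmath
import math

# Pauli matrices as 2x2 nested lists.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def apply(matrix, spinor):
    return [matrix[r][0] * spinor[0] + matrix[r][1] * spinor[1] for r in range(2)]


def inner(u, v):
    """C^2 inner product, linear in u and conjugate-linear in v (assumed convention)."""
    return sum(x * complex(y).conjugate() for x, y in zip(u, v))


def pauli_vector(psi):
    """Return the real vector (<sigma_j psi, psi>) for j = 1, 2, 3."""
    return [inner(apply(SIGMA[j], psi), psi).real for j in range(3)]


def norm(vector):
    return math.sqrt(sum(c * c for c in vector))


# Arbitrary sample spinor value; |psi|^2 = 5 + 10 = 15.
PSI = (1 + 2j, 3 - 1j)
PHASE = cmath.exp(0.7j)
PSI_ROTATED = tuple(PHASE * c for c in PSI)

# Two non-zero spinors whose Pauli vectors cancel at a point.
UP = (1, 0)
DOWN = (0, 1)
CANCELLED = [a + b for a, b in zip(pauli_vector(UP), pauli_vector(DOWN))]
```

Here `pauli_vector(PSI)` has norm 15, `pauli_vector(PSI_ROTATED)` agrees with it up to rounding, and `CANCELLED` is the zero vector.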
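Proposition 10. Let $A \in \mathcal{A}$. Suppose $\psi_1, \psi_2 \in L^2_{loc}$ solve $\sigma.(D - A)\psi_k = 0$ and satisfy
$$\langle \sigma_j \psi_1, \psi_1 \rangle_{\mathbb{C}^2} = \langle \sigma_j \psi_2, \psi_2 \rangle_{\mathbb{C}^2} \tag{17}$$
for $j = 1, 2, 3$. Then $\psi_2 = \alpha \psi_1$ for some $\alpha \in S^1$.

Proof. By Remark 3 we know that $\psi_1$ and $\psi_2$ are continuous. Therefore (17) is satisfied pointwise on $\mathbb{R}^3$, from which a standard calculation can be used to show that the vectors $\psi_1(x), \psi_2(x) \in \mathbb{C}^2$ are parallel with equal lengths for any $x \in \mathbb{R}^3$; that is, we have $\psi_2 = \alpha \psi_1$, where $\alpha$ is some $S^1$ valued function on $\mathbb{R}^3$. If $\psi_1$ is identically 0 we can clearly choose $\alpha$ to be constant. Now suppose $\psi_1$ is not identically 0 and choose a bounded non-empty connected open set $\Omega \subseteq \mathbb{R}^3$ such that $|\psi_1|$ is bounded away from 0 on $\Omega$ (recall that $\psi_1$ is continuous). It follows that $\alpha = |\psi_1|^{-2} \langle \psi_2, \psi_1 \rangle_{\mathbb{C}^2}$ on $\Omega$. However $\psi_1, \psi_2 \in H^{p,1}_{loc}(\Omega)$ for any $p \in [1, \infty)$ (by Remark 3) so, with the help of the bound on $|\psi_1|^{-2}$, we get $\alpha \in H^{p,1}_{loc}(\Omega)$ for any $p \in [1, \infty)$. In particular, $\alpha$ is continuous. Since $\psi_2 = \alpha \psi_1$ and $\sigma.(D - A)\psi_k = 0$ for $k = 1, 2$ we have
$$\sigma.(D\alpha)\psi_1 = \sigma.(D\alpha)\psi_1 + \alpha\, \sigma.(D - A)\psi_1 = \sigma.(D - A)(\alpha \psi_1) = 0$$
as functions in $L^p_{loc}(\Omega)$ for any $p \in [1, \infty)$. However $\alpha$ is $S^1$ valued from which it can be shown that $(\sigma.(D\alpha))^* \sigma.(D\alpha) = |D\alpha|^2 I_2$. Therefore $|D\alpha|^2 \psi_1 = 0$ on $\Omega$ and so $D\alpha = -i\nabla\alpha = 0$. Thus $\alpha$ is constant on $\Omega$ (recall that $\Omega$ is connected).

In summary we have shown that $\psi_2 = \alpha \psi_1$ on $\Omega$ for some (constant) $\alpha \in S^1$. Setting $\psi = \psi_2 - \alpha \psi_1$ it follows that $\psi \in L^2_{loc}$, $\sigma.(D - A)\psi = 0$ on $\mathbb{R}^3$, and $\psi = 0$ on $\Omega$. By the UCP for the Weyl-Dirac operator (see Theorem 10) we then get $\psi = 0$ on $\mathbb{R}^3$, completing the result.

If $m_\pm \in \mathbb{N}_0$ satisfy $m_+ + m_- \le 2$ then either $m_+ = 0$, $m_- = 0$ or $m_+ = m_- = 1$. The next result thus follows directly from Propositions 8, 9 and 10.

Theorem 6. The sets $Z_0$, $Z_1$ and $Z_2$ contain only regular potentials.

Theorem 2 is an obvious consequence of Theorems 5 and 6. On the other hand Theorem 3 can be proved using Propositions 8, 9 and 10 as follows.

Proof of Theorem 3. Define a projection $\Pi : SA(m) \to \mathbb{R} \times \mathbb{C}^{m-1} \cong \mathbb{R}^{2m-1}$ by $\Pi(X) = (X_{11}, X_{12}, \dots, X_{1m})$ for all $X \in SA(m)$ (n.b. $X_{11} \in \mathbb{R}$). Therefore the composite $\widetilde{\Phi} := \Pi \circ \Phi$ is a smooth map $U \to \mathbb{R} \times \mathbb{C}^{m-1}$. Furthermore Proposition 7 gives
$$D_F\widetilde{\Phi}(A_0)(A) = -\big( \langle \sigma.A\, \phi_1, \phi_1 \rangle, \langle \sigma.A\, \phi_2, \phi_1 \rangle, \dots, \langle \sigma.A\, \phi_m, \phi_1 \rangle \big)$$
for all $A \in \mathcal{A}$. Suppose $D_F\widetilde{\Phi}(A_0)(C_0^\infty) \ne \mathbb{R} \times \mathbb{C}^{m-1}$. Using the fact that the map $D_F\widetilde{\Phi}(A_0)$ is ($\mathbb{R}$-)linear we can therefore find $x_1, x_2, y_2, \dots, x_m, y_m \in \mathbb{R}$, not all of which are 0, such that
$$x_1 \langle \sigma.A\, \phi_1, \phi_1 \rangle + x_2 \mathrm{Re} \langle \sigma.A\, \phi_2, \phi_1 \rangle + y_2 \mathrm{Im} \langle \sigma.A\, \phi_2, \phi_1 \rangle + \dots + x_m \mathrm{Re} \langle \sigma.A\, \phi_m, \phi_1 \rangle + y_m \mathrm{Im} \langle \sigma.A\, \phi_m, \phi_1 \rangle = 0 \tag{18}$$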
for all $A \in C_0^\infty$. Set $z_1 = x_1$ and $z_j = (x_j + i y_j)/2 \in \mathbb{C}$ for $j = 2, \dots, m$. Proposition 7 and a straightforward calculation can then be used to show that (18) can be rewritten as
$$\mathrm{Tr}\big( X\, D_F\Phi(A_0)(A) \big) = 0 \tag{19}$$
for all $A \in C_0^\infty$, where $X \in SA(m)$ is given by
$$X = \begin{pmatrix} z_1 & z_2 & \cdots & z_m \\ \overline{z_2} & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ \overline{z_m} & 0 & \cdots & 0 \end{pmatrix}.$$
Now
$$\det(X - \lambda I) = (-\lambda)^{m-2} \big( \lambda^2 - z_1 \lambda - (|z_2|^2 + \dots + |z_m|^2) \big)$$
for all $\lambda \in \mathbb{C}$. Therefore $X$ has at most 2 non-zero eigenvalues (counting multiplicities). Arguing as in the proof of Proposition 8 it follows that (19) is equivalent to (10) with $m_+ + m_- \le 2$. By applying either Proposition 9 or Proposition 10 we now obtain a contradiction. Thus we must have $D_F\widetilde{\Phi}(A_0)(C_0^\infty) = \mathbb{R} \times \mathbb{C}^{m-1}$, which establishes property (ii) for $\widetilde{\Phi}$. On the other hand property (i) follows directly from Proposition 6 and the definition of $\widetilde{\Phi}$.
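The characteristic polynomial used in the proof of Theorem 3 can be spot-checked numerically. The sketch below builds the self-adjoint "arrowhead" matrix $X$ from an arbitrary choice of $z_1 \in \mathbb{R}$ and $z_2, \dots, z_m \in \mathbb{C}$ (the values are illustrative only), computes $\det(X - \lambda I)$ by cofactor expansion, and compares it with the closed form at a few values of $\lambda$. The placement of the conjugates in the first column is an assumption of the sketch; the determinant only depends on the products $z_j \overline{z_j}$.

```python
def det(matrix):
    """Determinant by cofactor expansion along the first row (fine for small m)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for c, entry in enumerate(matrix[0]):
        minor = [row[:c] + row[c + 1:] for row in matrix[1:]]
        total += (-1) ** c * entry * det(minor)
    return total


def arrowhead(z):
    """Self-adjoint matrix with first row z and zeros outside row 1 and column 1."""
    m = len(z)
    x = [[0j] * m for _ in range(m)]
    x[0][0] = complex(z[0])  # z_1 is assumed real
    for j in range(1, m):
        x[0][j] = complex(z[j])
        x[j][0] = complex(z[j]).conjugate()
    return x


def char_poly_direct(z, lam):
    x = arrowhead(z)
    m = len(z)
    shifted = [[x[r][c] - (lam if r == c else 0) for c in range(m)] for r in range(m)]
    return det(shifted)


def char_poly_formula(z, lam):
    """Closed form (-lam)^(m-2) (lam^2 - z_1 lam - sum |z_j|^2), valid for m >= 2."""
    m = len(z)
    tail = sum(abs(w) ** 2 for w in z[1:])
    return (-lam) ** (m - 2) * (lam ** 2 - z[0] * lam - tail)


# Illustrative data with m = 4.
Z = [1.5, 1 + 2j, -0.5j, 3]
LAMBDAS = [0.3, -2.0, 1 + 1j]
DIFFERENCES = [abs(char_poly_direct(Z, lam) - char_poly_formula(Z, lam)) for lam in LAMBDAS]
```

Each entry of `DIFFERENCES` is zero up to rounding, consistent with $\lambda = 0$ being an eigenvalue of multiplicity at least $m - 2$.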
5. The Examples of Erdős and Solovej

In [8] Erdős and Solovej constructed a large class of magnetic fields on $\mathbb{R}^3$ for which the dimension of the kernel of any corresponding Weyl-Dirac operator can be determined exactly. These magnetic fields are obtained by pulling back magnetic fields (or 2-forms) from $S^2$ to $\mathbb{R}^3$ using the map $\rho : \mathbb{R}^3 \to S^3 \to S^2$ obtained by composing inverse stereographic projection with the Hopf map from $S^3$ to $S^2$. Any corresponding zero modes are essentially given as the pull-backs of the Aharonov-Casher zero modes of the Weyl-Dirac operator on $S^2$. The explicit nature of the latter (in particular the fact that different zero modes can be obtained from each other by multiplying by suitable holomorphic functions) can be used to show that magnetic potentials corresponding to the magnetic fields considered by Erdős and Solovej must be regular (in the sense of Definition 5). The necessary results from [8] are summarised in Theorem 7, although firstly we give a precise definition of the class of magnetic fields under consideration.

Definition 6. Let $\mathcal{B}_{ES}$ denote the set of those magnetic (vector) fields $B$ which can be written as $B(x) = (1 + |x|^2)^{-1} M(x)\, g(\rho(x))$ for some $g \in C^\infty(S^2, \mathbb{R})$, where $M$ is the vector field given by
$$M(x) = \frac{4}{(1 + |x|^2)^2} \big( 2x_1 x_3 - 2x_2,\ 2x_2 x_3 + 2x_1,\ 1 - x_1^2 - x_2^2 + x_3^2 \big)^T.$$
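A direct expansion shows that the vector $(2x_1x_3 - 2x_2,\ 2x_2x_3 + 2x_1,\ 1 - x_1^2 - x_2^2 + x_3^2)$ has length $1 + |x|^2$, so that $|M(x)| = 4/(1 + |x|^2)$. The short sketch below evaluates $M$ at a few arbitrarily chosen points and checks this identity numerically; the function names are illustrative.

```python
import math


def field_M(x):
    """Evaluate the vector field M of Definition 6 at a point x in R^3."""
    x1, x2, x3 = x
    scale = 4.0 / (1.0 + x1 * x1 + x2 * x2 + x3 * x3) ** 2
    return (
        scale * (2 * x1 * x3 - 2 * x2),
        scale * (2 * x2 * x3 + 2 * x1),
        scale * (1 - x1 * x1 - x2 * x2 + x3 * x3),
    )


def norm(vector):
    return math.sqrt(sum(c * c for c in vector))


def expected_norm(x):
    """Closed form |M(x)| = 4 / (1 + |x|^2)."""
    return 4.0 / (1.0 + sum(c * c for c in x))


# Arbitrary sample points.
POINTS = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0), (-0.5, 0.25, 4.0), (10.0, -7.0, 2.5)]
ERRORS = [abs(norm(field_M(p)) - expected_norm(p)) for p in POINTS]
```

In particular $(1 + |x|^2)M(x)$ is 4 times a unit vector at every point, and the decay $|M(x)| = O(|x|^{-2})$ gives $B(x) = O(|x|^{-4})$ for any $B \in \mathcal{B}_{ES}$.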
Remark 5. Clearly any $B \in \mathcal{B}_{ES}$ is smooth and satisfies $B(x) = O(|x|^{-4})$ as $|x| \to \infty$. It follows that it is possible to find a smooth magnetic potential $A$ with $B = \operatorname{curl} A$ and $A(x) = O(|x|^{-2})$ as $|x| \to \infty$ (for example one can use the integral operator from Theorem A.1 in [9]); any such $A$ is contained in $\mathcal{A}$. On the other hand, if $A, A' \in C^0$ are two magnetic potentials which give rise to the same magnetic field (in the sense that $\operatorname{curl} A = \operatorname{curl} A'$ as distributions) then we can find $\theta \in C^1(\mathbb{R}^3, \mathbb{R})$ with $\nabla\theta = A - A'$. Multiplication by $e^{i\theta}$ then establishes a unitary equivalence between the operators $\sigma.(D-A)$ and $\sigma.(D-A')$; in particular, these operators must have kernels of equal dimension.

The next result is taken from Theorems 35, 39 and 41 in [8] (and rewritten in terms of magnetic potentials with the help of Remark 5 above).

Theorem 7 (Erdős and Solovej). Let $B \in \mathcal{B}_{ES}$ and suppose
\[
\langle B, M \rangle = 4\pi^2(2m+1) \tag{20}
\]
for some $m \in \mathbb{Z}$ (here $\langle\cdot,\cdot\rangle$ denotes the inner product on $L^2(\mathbb{R}^3, \mathbb{R}^3)$). Let $A \in C^0$ be any magnetic potential which satisfies $\operatorname{curl} A = B$ (in a distributional sense) and set $s = \operatorname{sgn}(m + \tfrac12)$. Then we have the following:
(i) $\dim \operatorname{Ker} \sigma.(D - A) = |m + \tfrac12| - \tfrac12$.
(ii) If $\phi \in \operatorname{Ker} \sigma.(D - A)$ then $(1 + |x|^2)\,\sigma.M\phi = 4s\,\phi$.
(iii) Suppose $\phi \in \operatorname{Ker} \sigma.(D - A)$ with $\phi(x_0) \neq 0$ for some $x_0 \in \mathbb{R}^3$. Then we can find a neighbourhood $\Omega \subseteq \mathbb{R}^3$ of $x_0$ and a submersion $\tau : \Omega \to \mathbb{C}$ such that any $\psi \in \operatorname{Ker} \sigma.(D - A)$ can be written as $\psi = (f\circ\tau)\,\phi$ on $\Omega$ for some function $f : \tau(\Omega) \to \mathbb{C}$. Furthermore $f$ is holomorphic if $s = +1$ and anti-holomorphic if $s = -1$.

Remark 6. Let $m \in \mathbb{N}$ and define $B^{(m)} \in \mathcal{B}_{ES}$ to be the magnetic field given by Definition 6 when $g$ is the constant $2m+1$. Taking $A^{(m)} = \tfrac14(2m+1)M$ one can check that $A^{(m)} \in \mathcal{A}$, $\operatorname{curl} A^{(m)} = B^{(m)}$ and $\langle B^{(m)}, M\rangle = 4\pi^2(2m+1)$. Theorem 7 then gives $A^{(m)} \in Z_m$. The magnetic potential $A^{(1)}$ was considered in [13] where it was given as the first example of a zero mode producing magnetic potential. The magnetic potentials $A^{(m)}$ for $m > 1$ were considered in [2] where it was shown that they produce at least $m$ linearly independent zero modes (n.b. the exact dimension of the set of zero modes was not determined).

Proposition 11. Let $\Omega \subseteq \mathbb{C}$ be a connected open set and $m, n \in \mathbb{N}_0$. If
\[
\sum_{k=1}^{m} |f_k|^2 = \sum_{k=1}^{n} |g_k|^2
\]
for some holomorphic functions $f_1, \dots, f_m, g_1, \dots, g_n : \Omega \to \mathbb{C}$, then $\operatorname{Span}\{f_1, \dots, f_m\} = \operatorname{Span}\{g_1, \dots, g_n\}$.

Remark 7. If $h : \Omega \to \mathbb{C}$ is anti-holomorphic then its conjugate $\bar h : z \mapsto \overline{h(z)}$ is holomorphic and $|\bar h|^2 = |h|^2$. It follows that Proposition 11 also holds for anti-holomorphic functions.
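A small example may help fix ideas for Proposition 11. The sketch below is an illustration chosen here, not taken from the paper: it uses $f_1 = 1$, $f_2 = z$ and the unitary recombination $g_1 = (1+z)/\sqrt2$, $g_2 = (1-z)/\sqrt2$, checks that the two sums of squared moduli agree at sample points, and then exhibits the span relation explicitly.

```python
import math

# Illustrative holomorphic families (assumed for this example only).
f = [lambda z: 1.0 + 0j, lambda z: z]
g = [lambda z: (1 + z) / math.sqrt(2), lambda z: (1 - z) / math.sqrt(2)]

def sum_sq(funcs, z):
    """Sum of |h(z)|^2 over the family funcs."""
    return sum(abs(h(z)) ** 2 for h in funcs)

samples = [0.3 + 0.4j, -1.5 + 2.0j, 0.0j, 2.0 - 0.7j]

for z in samples:
    # Hypothesis of Proposition 11: both sums equal 1 + |z|^2.
    print(z, sum_sq(f, z), sum_sq(g, z))

def f_from_g(z):
    """Recover f_1, f_2 as linear combinations of g_1, g_2 (equal spans)."""
    g1, g2 = g[0](z), g[1](z)
    return ((g1 + g2) / math.sqrt(2), (g1 - g2) / math.sqrt(2))
```

Here the two families are related by a constant unitary matrix, which is the simplest way the hypothesis can hold; the proposition asserts that equality of spans is forced in general.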
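Proof. If $\Omega_1 \subseteq \mathbb{C}$ is open and $h : \Omega_1 \to \mathbb{C}$ is holomorphic then
\[
\Delta |h(z)|^2 = 4\,\partial_z \partial_{\bar z}\bigl(h(z)\overline{h(z)}\bigr) = 4\,\partial_z h(z)\,\overline{\partial_z h(z)} = 4|h'(z)|^2 \tag{21}
\]
for all $z \in \Omega_1$. Now let $n \in \mathbb{N}_0$. Using symmetry we can clearly establish the result from the following claim.

Claim: Let $\Omega_1 \subseteq \mathbb{C}$ be a connected open set and $m \in \mathbb{N}_0$. If
\[
\sum_{k=1}^{m} |f_k|^2 = \sum_{k=1}^{n} |g_k|^2 \tag{22}
\]
for some holomorphic functions $f_1, \dots, f_m, g_1, \dots, g_n : \Omega_1 \to \mathbb{C}$ then $g_1, \dots, g_n \in \operatorname{Span}\{f_1, \dots, f_m\}$.

If $m = 0$ then (22) implies $g_1 = \dots = g_n = 0 \in \operatorname{Span}\emptyset$, completing the claim in this case. Now assume the claim is true for some $m = M \in \mathbb{N}_0$. Suppose
\[
\sum_{k=1}^{M+1} |f_k|^2 = \sum_{k=1}^{n} |g_k|^2 \tag{23}
\]
for some holomorphic functions $f_1, \dots, f_{M+1}, g_1, \dots, g_n : \Omega_1 \to \mathbb{C}$. If $f_{M+1} \equiv 0$ then the claim for $m = M + 1$ clearly reduces to the case $m = M$. Now suppose otherwise; thus we can find a non-empty connected open set $\Omega_0 \subseteq \Omega_1$ such that $f_{M+1}$ is nowhere zero on $\Omega_0$. Dividing (23) by $|f_{M+1}|^2$, applying the Laplacian and using (21) then gives
\[
\sum_{k=1}^{M} |\tilde f_k'|^2 = \sum_{k=1}^{n} |\tilde g_k'|^2
\]
on $\Omega_0$, where $\tilde f_k = f_k/f_{M+1}$ and $\tilde g_k = g_k/f_{M+1}$. Applying the inductive assumption to the holomorphic functions $\tilde f_1', \dots, \tilde f_M', \tilde g_1', \dots, \tilde g_n' : \Omega_0 \to \mathbb{C}$ then gives $\tilde g_1', \dots, \tilde g_n' \in \operatorname{Span}\{\tilde f_1', \dots, \tilde f_M'\}$ as functions on $\Omega_0$. Integration and multiplication by $f_{M+1}$ then implies $g_1, \dots, g_n \in f_{M+1}\operatorname{Span}\{\tilde f_1, \dots, \tilde f_M, 1\} = \operatorname{Span}\{f_1, \dots, f_M, f_{M+1}\}$ on $\Omega_0$, and hence on $\Omega_1$ by the uniqueness of holomorphic extensions. Induction now completes the claim.

Theorem 8. Suppose $B \in \mathcal{B}_{ES}$ satisfies (20) and let $A \in C^0$ be any magnetic potential which satisfies $\operatorname{curl} A = B$ (in a distributional sense). If $m, n \in \mathbb{N}_0$ and $\phi_1, \dots, \phi_m, \psi_1, \dots, \psi_n \in \operatorname{Ker} \sigma.(D - A)$ satisfy
\[
\sum_{k=1}^{m} \langle \sigma_j \phi_k, \phi_k \rangle_{\mathbb{C}^2} = \sum_{k=1}^{n} \langle \sigma_j \psi_k, \psi_k \rangle_{\mathbb{C}^2}
\]
for $j = 1, 2, 3$, then $\operatorname{Span}\{\phi_1, \dots, \phi_m\} = \operatorname{Span}\{\psi_1, \dots, \psi_n\}$.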
Proof. Since $M$ is real, $\langle \sigma.M\phi, \phi\rangle_{\mathbb{C}^2} = \sum_{j=1}^{3} M_j \langle\sigma_j\phi, \phi\rangle_{\mathbb{C}^2}$ for any $\phi$. Coupling this with our hypothesis and Theorem 7(ii) we obtain
\[
\sum_{k=1}^{m} |\phi_k|^2 = \sum_{k=1}^{n} |\psi_k|^2. \tag{24}
\]
If $\phi_1, \dots, \phi_m, \psi_1, \dots, \psi_n$ are identically zero then the result holds trivially. Now suppose otherwise. Using the continuity of the $\phi_k$'s and $\psi_k$'s (see Remark 3) we may, without loss of generality, suppose $\phi_1(x_0) \neq 0$ for some $x_0 \in \mathbb{R}^3$. Using Theorem 7(iii) we can now find a neighbourhood $\Omega \subseteq \mathbb{R}^3$ of $x_0$, a submersion $\tau : \Omega \to \mathbb{C}$ and functions $f_1, \dots, f_m, g_1, \dots, g_n : \tau(\Omega) \to \mathbb{C}$ such that $\phi_k = (f_k\circ\tau)\,\phi_1$ and $\psi_k = (g_k\circ\tau)\,\phi_1$ on $\Omega$. Furthermore $f_1, \dots, f_m, g_1, \dots, g_n$ are either all holomorphic or all anti-holomorphic. On the other hand (24) implies
\[
\sum_{k=1}^{m} |f_k|^2 = \sum_{k=1}^{n} |g_k|^2
\]
on some connected open neighbourhood $\Omega' \subset \mathbb{C}$ of $\tau(x_0)$. Proposition 11 (see also Remark 7) then gives $\operatorname{Span}\{\phi_1, \dots, \phi_m\} = \operatorname{Span}\{\psi_1, \dots, \psi_n\}$ on $\tau^{-1}(\Omega')$. However $\tau^{-1}(\Omega')$ is a non-empty open subset of $\mathbb{R}^3$ so the result now follows from the UCP for the Weyl–Dirac operator (see Theorem 10).

Combined, Proposition 8, Theorem 7 and Theorem 8 give the following.

Corollary 1. Let $A \in \mathcal{A}$. If $B := \operatorname{curl} A \in \mathcal{B}_{ES}$ and satisfies (20) for some $m \in \mathbb{Z}$, then $A$ is a regular potential in $Z_{m'}$, where $m' = |m + \tfrac12| - \tfrac12$.

Before proving Theorem 1 we give a useful combination of Theorem 5 and the Implicit Mapping Theorem (see Sect. I.5 in [11] for example).

Theorem 9. Suppose $A_0 \in Z_m$ is a regular potential for some $m \in \mathbb{N}_0$. Let $F$ be the map given by Theorem 5 and choose any subspace $L \subset \mathcal{A}$ of dimension $m^2$ for which the map $DF(A_0) : L \to \mathrm{SA}(m)$ is surjective. Then we can find neighbourhoods $V \subseteq L$ of 0 and $W \subseteq \operatorname{Ker} DF(A_0)$ of 0, and a smooth map $\mathsf{A} : W \to V$ with the following properties:
(i) $\mathsf{A}(0) = 0$.
(ii) $D\mathsf{A}(0) = 0$.
(iii) $A \in Z_m \cap U$ iff $A = A_0 + A' + \mathsf{A}(A')$ for some $A' \in W$, where $U \subseteq \mathcal{A}$ is defined by $U = A_0 + W + V$.

Remark 8. Since $L$ and $\operatorname{Ker} DF(A_0)$ are complementary subspaces of $\mathcal{A}$ the set $U$ defined in part (iii) is a neighbourhood of $A_0 \in \mathcal{A}$. Also, since $C_0^\infty$ is dense in $\mathcal{A}$, it is always possible to choose $L$ to be a subspace of $C_0^\infty$.

Proof of Theorem 1. Part (i). Using Proposition 2 and Theorem 4 we get $\dim \operatorname{Ker} \sigma.(D - A) = \dim \operatorname{Ker} T_A < +\infty$ for all $A \in \mathcal{A}$.

Part (ii). The assignment $A \mapsto T_A$ is a continuous map from $\mathcal{A}$ to the (open) subset of L(H1, H0) consisting of those operators which are Fredholm. On the other hand the function dim Ker is upper semi-continuous on the set of Fredholm operators (see
Theorem IV.5.17 in [10] for example). Thus, given any $A_0 \in \mathcal{A}$, we have $\dim \operatorname{Ker} T_A \le \dim \operatorname{Ker} T_{A_0}$ for all $A$ in some neighbourhood of $A_0$. Using Proposition 2 it follows that $\bigcup_{m' \le m} Z_{m'}$ is an open subset of $\mathcal{A}$ for any $m \in \mathbb{N}_0$; in particular $Z_0$ must be open. Suppose $A_0 \in Z_m$ for some $m > 0$ and let $U$ and $F$ be as given by Theorem 3 (n.b. it is easy to see from its proof that Theorem 3 also holds for $m = 1, 2$). Using the above observation we may assume $U \subset \bigcup_{m' \le m} Z_{m'}$. Theorem 3(i) then gives
\[
A \in U \cap \bigcup_{m' \ge m} Z_{m'} \;\Longrightarrow\; F(A) = 0. \tag{25}
\]
On the other hand Theorem 3(ii) allows us to find a subspace $L \subset \mathcal{A}$ of dimension $2m - 1$ such that the restricted map $DF(A_0) : L \to \mathbb{R}^{2m-1}$ is surjective. Setting $V = (L + A_0) \cap U$ it follows that $V$ is a neighbourhood of $A_0$ in $L + A_0$ whilst the restricted map $F|_V : V \to \mathbb{R}^{2m-1}$ is smooth and has an invertible differential at $A_0$. Therefore $F|_V$ maps a neighbourhood of $A_0 \in V$ diffeomorphically onto a neighbourhood of $0 \in \mathbb{R}^{2m-1}$. Hence, for all sufficiently small $\varepsilon > 0$, we can define $A_\varepsilon \in V \subset U$ as the inverse image of $(\varepsilon, 0, \dots, 0) \in \mathbb{R}^{2m-1}$ under the map $F|_V$. Clearly $A_\varepsilon \to A_0$ as $\varepsilon \to 0$ whilst $F(A_\varepsilon) = (\varepsilon, 0, \dots, 0) \neq 0$ so $A_\varepsilon \in \bigcup_{m' < m} Z_{m'}$ by (25). In summary, we have thus shown $Z_m$ is contained in the closure of $\bigcup_{m' < m} Z_{m'}$ for any $m > 0$. Induction and part (i) now imply $Z_0$ is dense in $\mathcal{A}$.

Part (iii). Let $A \in C_0^\infty$ and, for any $R > 0$, define a scaled potential $A_R \in C_0^\infty$ by $A_R(x) = R\,A(Rx)$. It is straightforward to check that $\phi$ is a zero mode for $A$ iff $\phi_R$ is a zero mode for $A_R$, where $\phi_R$ is defined by $\phi_R(x) = \phi(Rx)$. Hence we have $A \in Z_m$ iff $A_R \in Z_m$. On the other hand we have $\operatorname{supp}(A_R) = R^{-1}\operatorname{supp}(A)$ so, by choosing $R$ sufficiently large, we can ensure that $\operatorname{supp}(A_R)$ is contained in an arbitrary neighbourhood of 0. Combining this observation with the fact that $\sigma.D$ is translationally invariant, it follows that it is sufficient to consider the case $\Omega = \mathbb{R}^3$. Using Remark 6 and Corollary 1 we know that $Z_m$ must contain a regular potential, say $A_0$. Now let $V, W, U \subset \mathcal{A}$ and $\mathsf{A} : W \to V$ be as given by Theorem 9 with $L$ chosen so that $L \subset C_0^\infty$ (see Remark 8). Therefore $V \subset C_0^\infty$. The density of $C_0^\infty$ in $\mathcal{A}$ also means we can find $A' \in U \cap C_0^\infty$. By the definition of $U$ we then have $A' = A_0 + A_W + A_V$ for some $A_W \in W$ and $A_V \in V$. Now set $A = A_0 + A_W + \mathsf{A}(A_W) \in U$. Since $\mathsf{A}(A_W) \in V \subset C_0^\infty$ we have $A = A' - A_V + \mathsf{A}(A_W) \in C_0^\infty$. On the other hand, Theorem 9(iii) gives us $A \in Z_m$, completing the result.
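The scaling step in Part (iii) can be verified in one line. The worked computation below is a sketch that assumes the convention $D = -i\nabla$, which is not restated in this section; the prefactor $R$ in $A_R$ is exactly what makes the two operators match.

```latex
% Assumption: D = -i\nabla, with A_R(x) = R\,A(Rx) and \phi_R(x) = \phi(Rx).
\begin{align*}
(D\phi_R)(x) &= -i\nabla_x\bigl[\phi(Rx)\bigr] = R\,(D\phi)(Rx),\\
\sigma.(D - A_R)\phi_R(x)
  &= \sigma.\bigl(R\,(D\phi)(Rx) - R\,A(Rx)\,\phi(Rx)\bigr)
   = R\,\bigl[\sigma.(D - A)\phi\bigr](Rx).
\end{align*}
```

Since $R > 0$, the right-hand side vanishes identically exactly when $\sigma.(D - A)\phi = 0$, and $\phi \mapsto \phi_R$ is a linear bijection of $L^2$ with $\|\phi_R\|^2 = R^{-3}\|\phi\|^2$, so the two kernels have the same dimension.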
A. The UCP for the Weyl–Dirac Operator on $\mathbb{R}^3$

In this appendix we establish the Unique Continuation Property for the operator $\sigma.(D - A)$ when $A$ is a continuous vector potential on $\mathbb{R}^3$. Our proof is modeled on an argument due to E. Heinz (see Appendix to Sect. XIII.13 of [14]). In what follows we set $B_R = \{|x| < R\} \subset \mathbb{R}^3$ and $B_R' = B_R \setminus \{0\}$ for any $R > 0$. We also use $\|\cdot\|$ to denote the $L^2$ norm on $S^2$.

Theorem 10. Suppose $\Omega \subseteq \mathbb{R}^3$ is a connected open set and $A$ is a continuous vector potential on $\Omega$. If $\phi \in L^2_{loc}(\Omega)$ satisfies $\sigma.(D - A)\phi = 0$ on $\Omega$ and $\phi = 0$ in some non-empty open subset of $\Omega$, then $\phi = 0$ on $\Omega$.

In the next two results we develop the key estimate for our argument.
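Proposition 12. Suppose $B$ is a symmetric operator on $S^2$. Then, for any $\alpha \in \mathbb{R}$ and $u \in C_0^\infty(\mathbb{R}^+, \operatorname{Dom}(B))$, we have
\[
\int_0^\infty e^{2(\alpha-1)t}\,\|u(t)\|^2\,dt \le \int_0^\infty e^{2\alpha t}\,\|(D_t - iB)u(t)\|^2\,dt.
\]

Proof. For any $w \in C_0^\infty(\mathbb{R}^+, \operatorname{Dom}(B))$ we have
\[
e^{-t/2}\,\|w(t)\| \le \int_0^t e^{(s-t)/2}\,e^{-s/2}\,\|D_s w(s)\|\,ds \le \int_0^\infty e^{-s/2}\,\|D_s w(s)\|\,ds.
\]
Using the Cauchy–Schwarz inequality it follows that
\[
e^{-t}\,\|w(t)\|^2 \le \int_0^\infty e^{-s}\,ds \int_0^\infty \|D_s w(s)\|^2\,ds = \int_0^\infty \|D_s w(s)\|^2\,ds.
\]
Multiplying by $e^{-t}$ and integrating over $\mathbb{R}^+$ with respect to $t$ then gives
\[
\int_0^\infty e^{-2t}\,\|w(t)\|^2\,dt \le \int_0^\infty \|D_s w(s)\|^2\,ds. \tag{26}
\]
On the other hand, the symmetry of the operator $B$ implies
\[
\|(D_t - i(B - \alpha))w(t)\|^2 = \|D_t w(t)\|^2 + \partial_t \langle w(t), (B - \alpha)w(t)\rangle + \|(B - \alpha)w(t)\|^2.
\]
Combining this with (26) we then get
\[
\int_0^\infty e^{-2t}\,\|w(t)\|^2\,dt \le \int_0^\infty \|(D_t - i(B - \alpha))w(t)\|^2\,dt.
\]
The result clearly follows if we take $w(t) = e^{\alpha t}u(t)$.

Proposition 13. Suppose $\phi \in H^1_{loc}$ with $\operatorname{supp}(\phi) \subset B_R'$ for some $R > 0$. Then, for any $\beta \in \mathbb{R}$, we have
\[
\int_{B_R} |x|^\beta\,|\phi(x)|^2\,d^3x \le R^2 \int_{B_R} |x|^\beta\,|\sigma.D\phi(x)|^2\,d^3x.
\]

Proof. By continuity we may assume $\phi \in C_0^\infty(B_R')$. Now consider the change of variables from $\mathbb{R}^3\setminus\{0\}$ to $\mathbb{R} \times S^2$ given by $(r, \omega) \mapsto (t, \omega)$, where $t = -\ln(r/R)$. Under this change of variables we clearly have
\[
r \to Re^{-t}, \qquad rD_r \to -D_t \qquad\text{and}\qquad r^{-1}\,dr \to -dt.
\]
Define $u \in C_0^\infty(\mathbb{R}^+, H^1(S^2))$ by $u(t) = \phi(r, \cdot)$. Since $d^3x = r^2\,dr\,dS^2$ we get
\[
\int_{B_R} |x|^\beta\,|\phi(x)|^2\,d^3x = R^{\beta+3} \int_0^\infty e^{-(\beta+3)t}\,\|u(t)\|^2\,dt. \tag{27}
\]
Let $B$ denote the operator $\sigma.(x \times D)$. It is straightforward to check $B$ acts as a self-adjoint operator on $S^2$, whilst on $B_R'$ we can write $\sigma.D\phi = r^{-2}\,\sigma.x\,(rD_r + iB)\phi$. Since $|r^{-2}\sigma.x\,\psi| = r^{-1}|\psi|$ for any $\psi \in \mathbb{C}^2$, we now get
\begin{align*}
\int_{B_R} |x|^\beta\,|\sigma.D\phi(x)|^2\,d^3x
 &= \int_{B_R} r^{\beta-2}\,|(rD_r + iB)\phi(r, \omega)|^2\,r^2\,dr\,dS^2 \\
 &= R^{\beta+1} \int_0^\infty e^{-(\beta+1)t}\,\|(D_t - iB)u(t)\|^2\,dt. \tag{28}
\end{align*}
The result now follows from (27), (28) and Proposition 12 with $\alpha = -(\beta + 1)/2$.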
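For readers who want the exponents in the last step spelled out, here is the bookkeeping as a short worked chain; it only rearranges (27), (28) and Proposition 12 and introduces nothing new.

```latex
% With \alpha = -(\beta+1)/2 we have 2\alpha = -(\beta+1) and 2(\alpha-1) = -(\beta+3).
\begin{align*}
\int_{B_R} |x|^\beta |\phi|^2\,d^3x
  &= R^{\beta+3}\int_0^\infty e^{-(\beta+3)t}\,\|u(t)\|^2\,dt
     && \text{by (27)}\\
  &\le R^{\beta+3}\int_0^\infty e^{-(\beta+1)t}\,\|(D_t - iB)u(t)\|^2\,dt
     && \text{by Proposition 12}\\
  &= R^{2}\int_{B_R} |x|^\beta |\sigma.D\phi|^2\,d^3x
     && \text{by (28)}.
\end{align*}
```

The factor $R^2$ in Proposition 13 is thus the ratio $R^{\beta+3}/R^{\beta+1}$ of the prefactors in (27) and (28).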
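Lemma 2. Suppose $\Omega \subseteq \mathbb{R}^3$ is a connected open set and $A$ is a continuous vector potential on $\Omega$ with $|A| \le M$ for some $M$. If $\phi \in H^1_{loc}(\Omega)$ satisfies $\sigma.(D - A)\phi = 0$ on $\Omega$ and $\phi = 0$ in a neighbourhood of some $x_0 \in \Omega$, then $\phi = 0$ on the ball $x_0 + B_R$, where $R = \min\{1/(9M), \operatorname{dist}(x_0, \partial\Omega)/3\}$.

Proof. Since the operator $\sigma.D$ is translationally invariant, we may assume $x_0 = 0$. Now choose $\chi \in C_0^\infty$ with $\operatorname{supp}(\chi) \subset B_{3R}$, $\chi = 1$ on $B_{2R}$ and $\operatorname{Ran}\chi = [0, 1]$. Set $\psi = \chi\phi$. Therefore $\sigma.D\psi = \chi\,\sigma.A\phi + \sigma.(D\chi)\phi$ on $\Omega$. On the other hand $\operatorname{supp}(\psi) \subset B_{3R}'$. Thus, for any $\beta \in \mathbb{R}$, Proposition 13 gives us
\begin{align*}
\int_{B_{2R}} |x|^\beta\,|\phi(x)|^2\,d^3x
 &\le \int_{B_{3R}} |x|^\beta\,|\psi(x)|^2\,d^3x \\
 &\le 18R^2 \int_{B_{3R}} |x|^\beta\,|\sigma.A\phi(x)|^2\,d^3x + 18R^2 \int_{B_{3R}} |x|^\beta\,|\sigma.(D\chi)\phi(x)|^2\,d^3x.
\end{align*}
Now $|\sigma.A\phi|^2 \le 2|A|^2|\phi|^2 \le 2M^2|\phi|^2$ whilst $\operatorname{supp}(D\chi) \subset B_{3R}\setminus B_{2R}$. Hence
\[
\int_{B_{2R}} |x|^\beta\,|\phi(x)|^2\,d^3x \le 36M^2R^2 \int_{B_{3R}} |x|^\beta\,|\phi(x)|^2\,d^3x + C \int_{B_{3R}\setminus B_{2R}} |x|^\beta\,|\phi(x)|^2\,d^3x
\]
for some constant $C$. Now $36M^2R^2 < 1/2$ and, for any $\beta \le 0$, $R^\beta \le |x|^\beta$ on $B_R$ and $|x|^\beta \le (2R)^\beta$ on $B_{3R}\setminus B_{2R}$. Therefore
\[
\int_{B_R} |\phi(x)|^2\,d^3x \le R^{-\beta} \int_{B_{2R}} |x|^\beta\,|\phi(x)|^2\,d^3x \le 2^\beta(2C + 1) \int_{B_{3R}\setminus B_{2R}} |\phi(x)|^2\,d^3x.
\]
Taking the limit $\beta \to -\infty$ it follows that $\int_{B_R} |\phi(x)|^2\,d^3x = 0$.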
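The constants in the proof of Lemma 2 can be traced as follows. This is a worked restatement of the arithmetic, under the reading that $C$ is independent of $\beta$ (for instance $C = 18R^2 \sup|\nabla\chi|^2$, which is an assumption about how $C$ is chosen rather than something stated in the text).

```latex
\begin{align*}
&\text{Proposition 13 on } B_{3R} \text{ gives the factor } (3R)^2 = 9R^2,
  \text{ and } |a+b|^2 \le 2|a|^2 + 2|b|^2 \text{ turns it into } 18R^2;\\
&18R^2 \cdot 2M^2 = 36M^2R^2 \le \tfrac{36}{81} = \tfrac{4}{9} < \tfrac12
  \quad\text{since } R \le 1/(9M);\\
&\bigl(1 - 36M^2R^2\bigr)\int_{B_{2R}} |x|^\beta|\phi|^2
  \le \bigl(36M^2R^2 + C\bigr)\int_{B_{3R}\setminus B_{2R}} |x|^\beta|\phi|^2
  \;\Longrightarrow\;
  \int_{B_{2R}} |x|^\beta|\phi|^2 \le (2C+1)\int_{B_{3R}\setminus B_{2R}} |x|^\beta|\phi|^2;\\
&\int_{B_R} |\phi|^2 \le R^{-\beta}\int_{B_{2R}} |x|^\beta|\phi|^2
  \le R^{-\beta}(2R)^\beta(2C+1)\int_{B_{3R}\setminus B_{2R}} |\phi|^2
  = 2^\beta(2C+1)\int_{B_{3R}\setminus B_{2R}} |\phi|^2 .
\end{align*}
```

Because $2^\beta \to 0$ as $\beta \to -\infty$ while the remaining factors stay fixed, the left-hand side must vanish.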
Proof of Theorem 10. Since $\phi \in L^2_{loc}(\Omega)$ solves $\sigma.(D - A)\phi = 0$ on $\Omega$ for some continuous $A$, standard regularity results give us $\phi \in H^1_{loc}(\Omega)$. Now, by hypothesis, there exists $x_0 \in \Omega$ such that $\phi = 0$ in a neighbourhood of $x_0$. Let $x_1 \in \Omega$ and choose a continuous path $\gamma : [0, 1] \to \Omega$ with $\gamma(t) = x_t$ for $t = 0, 1$. Since $\Omega$ is open we can find some $R_0 > 0$ such that
\[
\Omega' := \bigcup_{t \in [0,1]} \bigl(\gamma(t) + B_{R_0}\bigr) \subseteq \Omega.
\]
Now $\Omega'$ is bounded and $A$ is continuous so we can find $M$ such that $|A| \le M$ on $\Omega'$. Set $R = \min\{1/(9M), R_0/3\}$ and choose a finite set of points $\{t_0, \dots, t_n\} \subset [0, 1]$ such that $t_0 = 0$, $t_n = 1$ and $\gamma(t_i) \in \gamma(t_{i-1}) + B_R$ for each $i \in \{1, \dots, n\}$. Since $\phi = 0$ in a neighbourhood of $x_0 = \gamma(t_0)$, Lemma 2 and induction imply that $\phi = 0$ on $\gamma(t_n) + B_R$. Therefore $\phi = 0$ in a neighbourhood of $\gamma(t_n) = x_1$, completing the result.
References
1. Adam, C., Muratori, B. and Nash, C.: Zero modes of the Dirac operator in three dimensions. Phys. Rev. D 60, 125001 (1999)
2. Adam, C., Muratori, B. and Nash, C.: Degeneracy of zero modes of the Dirac operator in three dimensions. Phys. Lett. B 485, 314–318 (2000)
3. Balinsky, A. A. and Evans, W. D.: On the zero modes of Pauli operators. J. Funct. Anal. 179, 120–135 (2001)
4. Balinsky, A. A. and Evans, W. D.: On the zero modes of Weyl–Dirac operators and their multiplicity. Bull. London Math. Soc., to appear
5. Bagirov, L. A. and Kondrat'ev, V. A.: Elliptic Equations in Rn. Differentsial'nye Uravneniya 11, 498–504 (1975) = Differential Equations 11, 375–379 (1976)
6. Elton, D. M.: New examples of zero modes. J. Phys. A 33, 7297–7303 (2000)
7. Elton, D. M.: Fredholm properties of elliptic operators on Rn. Dissertationes Math. (Rozprawy Mat.) 399, 1–72 (2001)
8. Erdős, L. and Solovej, J. P.: The kernel of Dirac operators on S3 and R3. Rev. Math. Phys. 13, 1247–1280 (2001)
9. Fröhlich, J., Lieb, E. and Loss, M.: Stability of Coulomb Systems with Magnetic Fields I. The One Electron Atom. Commun. Math. Phys. 104, 251–270 (1986)
10. Kato, T.: Perturbation Theory for Linear Operators, 2nd Edition. Berlin: Springer-Verlag, 1976
11. Lang, S.: Differentiable Manifolds. Reading, Massachusetts: Addison Wesley, 1972
12. Lockhart, R. B. and McOwen, R. C.: On elliptic systems in Rn. Acta Math. 150, 125–135 (1983)
13. Loss, M. and Yau, H. T.: Stability of Coulomb Systems with Magnetic Fields III. Zero Energy States of the Pauli Operator. Commun. Math. Phys. 104, 283–290 (1986)
14. Reed, M. and Simon, B.: Methods of Modern Mathematical Physics IV: Analysis of Operators. New York: Academic Press, 1978
15. Thaller, B.: The Dirac Equation. Berlin: Springer-Verlag, 1992

Communicated by B. Simon
Commun. Math. Phys. 229, 141 – 182 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Diffusive Fluctuations for One-Dimensional Totally Asymmetric Interacting Random Dynamics
Timo Seppäläinen
Department of Mathematics, University of Wisconsin, Madison, WI 53706-1388, USA
Received: 4 October 2001 / Accepted: 12 March 2002
Research partially supported by NSF grants DMS-9801085 and DMS-0126775.

Abstract: We study central limit theorems for a totally asymmetric, one-dimensional interacting random system. The models we work with are the Aldous–Diaconis–Hammersley process and the related stick model. The A-D-H process represents a particle configuration on the line, or a 1-dimensional interface on the plane which moves in one fixed direction through random local jumps. The stick model is the process of local slopes of the A-D-H process, and has a conserved quantity. The results describe the fluctuations of these systems around the deterministic evolution to which the random system converges under hydrodynamic scaling. We look at diffusive fluctuations, by which we mean fluctuations on the scale of the classical central limit theorem. In the scaling limit these fluctuations obey deterministic equations with random initial conditions given by the initial fluctuations. Of particular interest is the effect of macroscopic shocks, which play a dominant role because dynamical noise is suppressed on the scale we are working.

1. Introduction

We study fluctuations on the scale of the classical central limit theorem for totally asymmetric interacting random systems in one space dimension. The model system for which we prove theorems is the Aldous–Diaconis–Hammersley process. To summarize this model in one sentence, it consists of point particles on the real line that jump randomly to the left, at rate equal to the distance to the left neighbor, with new locations chosen uniformly at random between the jumper and its left neighbor. The idea for this process appeared in Hammersley's classic paper [15], and Aldous and Diaconis [1] first defined it as an infinite system of interacting particles. We consider the general nonequilibrium hydrodynamic limit situation where the limiting interface (or tagged particle, depending on one's point of view) is governed by a Hamilton–Jacobi equation $u_t + f(u_x) = 0$. The initial distributions can be fairly arbitrary, subject to a limit assumption on the fluctuations around the initial macroscopic profile and some moment bounds. In particular, we do not restrict to product initial distributions or particular types of initial macroscopic profiles.

The overall picture is this: the limiting fluctuation field $\zeta(x, t)$ is governed by the linearization of the hydrodynamic equation: $\zeta_t + f'(u_x)\zeta_x = 0$, where $u(x, t)$ is the deterministic limit around which the random interface fluctuates. This is a deterministic equation, and all the randomness is confined to the initial condition. The dynamics transports the initial fluctuations along the characteristics and shocks of the hydrodynamic equation. This picture of characteristics rigidly transporting fluctuations has been understood to some degree for quite a while, and has been proved in some special cases. What our paper furnishes are proofs in the general nonequilibrium setting, for our particular model.

The mathematical reason for the suppression of dynamical noise lies in two facts: (i) The most general evolution of Hammersley's process can be realized as an envelope of an infinite family of simpler processes. This is the microscopic variational representation of the process. (ii) The results of Baik–Deift–Johansson [2] imply that these simpler processes have fluctuations of order $n^{1/3}$ which are swamped by the initial diffusive fluctuations of order $n^{1/2}$.

It has been more common to use the exclusion process and its variants to develop a rigorous theory of large scale behavior. But at the time of writing this paper, Hammersley's process has one mathematical advantage over the exclusion process. The commonality is that for both processes, the totally asymmetric versions can be conveniently coupled with simple growth models, and represented by particle-level variational formulations. For Hammersley's process this growth model is the increasing sequences model on a planar Poisson point process [1, 22].
For exclusion it is the last-passage percolation model with weakly increasing paths on the two-dimensional square lattice [24]. The advantage of Hammersley’s process is that better probability estimates are available for the planar increasing sequences model than for the lattice last-passage model. In particular, Lemma 4.1(b) used in our proof has not yet been proved for the exclusion model. This estimate for the increasing sequences model was proved by Baik, Deift and Johansson [2] with Riemann–Hilbert techniques. Obtaining this estimate for exclusion has not been simply a matter of repeating the argument, but turned out to be a somewhat tricky problem (personal communication from J. Baik). A recent preprint [3], circulated after this work was completed, provides this estimate for discrete-time exclusion. Once this estimate is available, we believe that the results of this paper can be repeated for totally asymmetric simple exclusion. The reader can find comprehensive overviews of fluctuation results for interacting systems in [14] and in Chapter 11 of [18]. So we make only a few remarks here. Fundamental work on the fluctuations of the asymmetric exclusion process was done by Ferrari and Fontes. In a series of papers ([10–12]) they study the fluctuations of the current and the tagged particle in equilibrium, and the fluctuations of a second class particle with shock initial conditions, given by a product measure with different densities to the left and right of the origin. In this situation the authors prove the basic feature of asymmetric fluctuations, namely the rigid transport along the characteristics. Their proofs use couplings and monotonicity arguments and necessitate special initial distributions such as i.i.d. distributions or product measures with piecewise constant densities. A predecessor of our paper is a recent work of Rezakhanlou [21] on the totally asymmetric exclusion process. He used the variational representation of the exclusion
process and the Ferrari-Fontes results to study the nonequilibrium situation. What he obtained corresponds to our Theorem 2.2(i), with some technical restrictions on the space-time points for which the result is valid. Our paper complements the earlier work by going into general nonequilibrium profiles and initial distributions, and by giving complete results on the convergence of the entire interface and the distribution-valued density fluctuation field. Our results cover tagged particles for Hammersley's process, and thereby also the current for the stick process. The stick process is the process of increments for Hammersley's process, and hence the displacements of Hammersley's particles are the currents of the stick process. Let us contrast our methods and results with those of symmetric, reversible processes. In the one-dimensional setting symmetric means that particles are equally likely to jump both left and right. The fluctuation theory of reversible interacting processes relies on methods of martingales and the Holley-Stroock theory of generalized Ornstein–Uhlenbeck processes. The limiting fluctuation fields of reversible processes are governed by equations driven by white noise. Both our methods and the qualitative results are different. We use no martingale theory. Instead our methods rely on sharp control of the paths of individual particles, and on the theory of shocks and characteristics of one-dimensional conservation laws and Hamilton–Jacobi equations. And we already highlighted the main difference, that for asymmetric systems no dynamical noise is visible on the diffusive scale. Between symmetric and asymmetric systems are the weakly asymmetric systems where the asymmetry vanishes in the hydrodynamic limit. The central limit behavior of weakly asymmetric systems is qualitatively the same as that of symmetric systems, governed by a linear stochastic partial differential equation whose drift term is the linearization of the hydrodynamic equation.
But the weakly asymmetric systems have an additional interesting feature proved by Bertini and Giacomin [4]: A small perturbation of a flat profile obeys, on larger space and time scales, a nonlinear stochastic equation of KPZ type. This raises the question whether such a result could be obtained for asymmetric systems at some suitable scaling.

The shocks of the hydrodynamic equation turn out to have interesting effects on fluctuations. For example, a basic result one would expect is that the motion of a tagged particle converges to something related to Brownian motion. But we find that in the presence of shocks the fluctuation processes of a tagged particle are not tight in the Skorokhod space $D([0, \infty), \mathbb{R})$. We can still prove the tagged particle's convergence to a function of Brownian motion uniformly on compact time intervals away from shocks, and even pointwise at the shocks. The limiting path has discontinuities at the shock times, and is not right-continuous (in time), but instead lower semicontinuous at the discontinuities. We prove a distributional limit theorem for the entire interface in a weaker topology, as an element of $L^p_{loc}(\mathbb{R})$. This limiting process is a weak solution of the linearization of the hydrodynamic equation, as mentioned above. However, due to the shocks a particular version of the limiting process has to be chosen (from among the a.e. equal versions) to get a weak solution of the linearized equation. And this weak solution turns out to disagree with the pointwise distributional limit at the shocks.

Organization of the paper. Section 2 describes the particle process and the results. The last part of that section gives a rigorous construction of the process in terms of increasing sequences among Poisson points on the Euclidean plane. This is the variational coupling formulation basic for our approach. Section 3 records properties of the characteristics
and shocks of a one-dimensional conservation law. The approach here follows [20] and is based on the Hopf–Lax and Lax–Oleinik formulas. Sections 4 and 5 contain probability estimates needed for taking advantage of the variational coupling formulation. These are based on the known estimates for increasing sequences ([2, 17, 23]). The remaining sections discuss the proofs of the theorems. A longer version of this paper with more complete proofs is available at http://front.math.ucdavis.edu/math.PR/0108174. 2. Results We study the large scale behavior of the Aldous–Diaconis–Hammersley process, or Hammersley’s process for short. The state of this process is z(t) = (zi (t) : i ∈ Z) that represents a countable collection of labeled point particles on R. The variable zi (t) ∈ R is the location of particle i at time t. The particles are ordered, so that zi−1 (t) ≤ zi (t) for all i ∈ Z and t ≥ 0. All particles make jumps to the left, according to the following description. Suppose the state at time t is z(t) = (zi (t) : i ∈ Z). Then each particle i has an instantaneous rate of zi (t) − zi−1 (t) of jumping, independently of all other particles. And when particle i jumps, its new location is chosen uniformly at random from the interval (zi−1 (t), zi (t)). Of course this sketch needs justification because infinitely many jumps happen in every positive time interval. In Sect. 2.4 we give a rigorous construction of this infinite-particle dynamics in terms of increasing sequences on the plane. Instead of thinking about a particle configuration, we can regard Hammersley’s process as a model for a 1-dimensional interface on the plane. The interface is represented by the height function z(t) defined on the integers, so that zi (t) is the height of the interface above site i. Through the jumps of the zi ’s the interface moves downward. The stick process η(t) = (ηi (t) : i ∈ Z) is the process of increment variables defined by ηi (t) = zi (t) − zi−1 (t). 
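The jump rule just described can be illustrated on a finite system. The sketch below is a toy Gillespie-style simulation and not the construction used in the paper (which handles infinitely many particles via increasing sequences in Sect. 2.4): it assumes finitely many particles and treats the leftmost one as a fixed wall that never jumps. The function name `simulate_hammersley` and its parameters are hypothetical.

```python
import random

def simulate_hammersley(z, t_end, rng):
    """Run a finite Hammersley-type process up to time t_end.

    z     : sorted list of initial positions; z[0] is a fixed wall (assumption)
    t_end : time horizon
    rng   : a random.Random instance
    Particle i >= 1 jumps at rate z[i] - z[i-1] to a uniform point of
    the interval between z[i-1] and z[i].
    """
    z = list(z)
    t = 0.0
    while True:
        gaps = [z[i] - z[i - 1] for i in range(1, len(z))]
        total_rate = sum(gaps)
        if total_rate <= 0.0:
            break  # all particles sit on the wall; nothing can jump
        t += rng.expovariate(total_rate)  # waiting time to the next jump
        if t > t_end:
            break
        # choose the jumping particle with probability proportional to its gap
        i = rng.choices(range(1, len(z)), weights=gaps)[0]
        z[i] = rng.uniform(z[i - 1], z[i])
    return z

start = [0.0, 1.0, 2.0, 3.0, 4.0]
final = simulate_hammersley(start, t_end=5.0, rng=random.Random(0))
sticks = [final[i] - final[i - 1] for i in range(1, len(final))]  # eta_i = z_i - z_{i-1}
print(final)
print(sticks)
```

The ordering $z_{i-1} \le z_i$ is preserved by construction and every particle only moves left, which matches the description of the interface moving downward.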
The dynamics of $\eta(\cdot)$ can be represented by the following generator $L$ which acts on bounded cylinder functions $\psi$ on the product space $[0, \infty)^{\mathbb{Z}}$:
\[
L\psi(\eta) = \sum_{i \in \mathbb{Z}} \int_0^{\eta_i} \bigl[\psi(\eta^{u,i,i+1}) - \psi(\eta)\bigr]\,du,
\]
where $\eta^{u,i,i+1}$ represents the configuration after a piece of size $u$ has been moved from site $i$ to $i+1$: $\eta_i^{u,i,i+1} = \eta_i - u$, $\eta_{i+1}^{u,i,i+1} = \eta_{i+1} + u$, and $\eta_j^{u,i,i+1} = \eta_j$ for $j \neq i, i+1$. This process can be rigorously defined on a certain subspace of the full product space $[0, \infty)^{\mathbb{Z}}$, see [22] for details.

Let $u_0$ be a nondecreasing locally Lipschitz continuous function on $\mathbb{R}$. It represents the initial macroscopic interface. The evolving macroscopic interface $u(x, t)$, $(x, t) \in \mathbb{R} \times [0, \infty)$, is the unique viscosity solution of the Hamilton–Jacobi equation
\[
u_t + f(u_x) = 0, \qquad u(x, 0) = u_0(x), \tag{1}
\]
with velocity function $f(\rho) = \rho^2$. Equivalently, $u$ is defined for $t > 0$ by the Hopf–Lax formula
\[
u(x, t) = \inf_{y:\,y \le x} \Bigl\{ u_0(y) + t\,g\Bigl(\frac{x - y}{t}\Bigr) \Bigr\}, \tag{2}
\]
where $g(x) = x^2/4$ is the convex dual of $f$. For a fixed $t$ the partial $x$-derivative $\rho(x, t) = u_x(x, t)$ exists for all but countably many $x$. This function is the unique entropy solution of the Burgers equation
\[
\rho_t + f(\rho)_x = 0, \qquad \rho(x, 0) = \rho_0(x), \tag{3}
\]
where $\rho_0 = u_0'$ (a.e. defined derivative). We look at properties of these equations in Sect. 3. See Chapters 3, 10, 11 in [9] for basic theory.

Assume we have a sequence $z^n(\cdot)$ of Hammersley's processes, with random initial configurations $\{z_i^n(0) : i \in \mathbb{Z}\}$, and $n = 1, 2, 3, \dots$ is the index of the sequence. The objective of our paper is to study the fluctuations of the random interface $z^n_{[nx]}(nt)$ around the deterministic interface $nu(x, t)$ in the diffusive, or central limit theorem, scale $n^{1/2}$. The fluctuations are described by the stochastic process $\zeta_n(x, t)$ defined for $(x, t) \in \mathbb{R} \times [0, \infty)$ by
\[
\zeta_n(x, t) = n^{-1/2}\bigl\{ z^n_{[nx]}(nt) - nu(x, t) \bigr\}. \tag{4}
\]
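The Hopf–Lax formula (2) is easy to evaluate numerically, which can help when experimenting with initial profiles. The following is a minimal sketch that replaces the infimum over $y \le x$ by a minimum over a finite grid on $[y_{\min}, x]$; the truncation point `y_min`, the grid size `n` and the function names are assumptions made for this illustration. For the linear profile $u_0(y) = \rho y$ with $\rho \ge 0$ the exact solution is $u(x, t) = \rho x - t\rho^2$, with minimizer $y = x - 2t\rho$, which gives a simple check.

```python
def g(v):
    """Convex dual of f(rho) = rho^2, i.e. g(v) = v^2 / 4."""
    return v * v / 4.0

def hopf_lax(u0, x, t, y_min, n=20001):
    """Approximate u(x,t) = inf_{y <= x} { u0(y) + t g((x - y)/t) }.

    The infimum is taken over a uniform grid on [y_min, x]; y_min must be
    chosen far enough to the left to contain the true minimizer (assumption).
    """
    best = float("inf")
    for k in range(n):
        y = y_min + (x - y_min) * k / (n - 1)
        best = min(best, u0(y) + t * g((x - y) / t))
    return best

rho = 0.5
linear = lambda y: rho * y          # u0(y) = rho * y
approx = hopf_lax(linear, x=1.0, t=2.0, y_min=-10.0)
exact = rho * 1.0 - 2.0 * rho ** 2  # rho*x - t*rho^2
print(approx, exact)
```

Because the grid is a subset of the admissible set, the approximation is never below the true infimum, and the error is quadratic in the grid spacing near a smooth minimizer.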
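Think of the initial process $\{\zeta_n(y, 0) : y \in \mathbb{R}\}$ as a random function with values in the Skorokhod space $D(\mathbb{R})$ of right-continuous functions on $\mathbb{R}$ with left limits (RCLL functions). This space is metrized as follows. Let $\Lambda$ be the collection of strictly increasing, bijective Lipschitz functions $\lambda : \mathbb{R} \to \mathbb{R}$ such that
\[
\|\lambda\| = \int_0^\infty e^{-u}\,d(\lambda, u)\,du + \sup_{x \neq y} \Bigl| \log \frac{\lambda(x) - \lambda(y)}{x - y} \Bigr| < \infty, \tag{5}
\]
where
\[
d(\lambda, u) = \sup_{x \in \mathbb{R}} \bigl| (\lambda(x) \wedge u) \vee (-u) - (x \wedge u) \vee (-u) \bigr| \wedge 1.
\]
For $\alpha, \beta \in D(\mathbb{R})$ and $u > 0$ let
\[
d(\alpha, \beta, \lambda, u) = \sup_{x \in \mathbb{R}} \bigl| \alpha\bigl((x \wedge u) \vee (-u)\bigr) - \beta\bigl((\lambda(x) \wedge u) \vee (-u)\bigr) \bigr| \wedge 1
\]
and then
\[
d_S(\alpha, \beta) = \inf_{\lambda \in \Lambda} \Bigl\{ \|\lambda\| + \int_0^\infty e^{-u}\,d(\alpha, \beta, \lambda, u)\,du \Bigr\}. \tag{6}
\]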
The metric $d_S$ is complete and separable. Convergence $d_S(\alpha_j, \alpha) \to 0$ is equivalent to the existence of a sequence $\lambda_j \in \Lambda$ such that $\lambda_j$ converges to the identity function uniformly on compacts, and $|\alpha_j - \alpha \circ \lambda_j| \to 0$ uniformly on compacts. Let $C(\mathbb{R})$ denote the subspace of continuous functions. Our basic hypothesis is weak convergence at time 0 to a continuous limit function:

There exists a $C(\mathbb{R})$-valued random function $\zeta_0$ such that $\zeta_n(\cdot, 0) \to \zeta_0(\cdot)$ in distribution as $n \to \infty$, on the space $D(\mathbb{R})$. (7)

Assumption (7) is in fact equivalent to a stronger assumption, which is important for us so we clarify it right away. Let $D_u(\mathbb{R})$ be the space of RCLL functions endowed with the $d_u$-metric of uniform convergence on compact sets:
\[
d_u(\alpha, \beta) = \sum_{j=1}^{\infty} 2^{-j} \Bigl( \sup_{-j \le r \le j} |\alpha(r) - \beta(r)| \wedge 1 \Bigr) \qquad\text{for } \alpha, \beta \in D_u(\mathbb{R}). \tag{8}
\]
The metric $d_u$ is much stronger than the Skorokhod metric $d_S$, and in fact $D_u(\mathbb{R})$ is not even separable. But on $C(\mathbb{R})$ the two metrics induce the same topologies. Because the jumps of $\zeta_n(\cdot, 0)$ occur at deterministic locations and because the limit process $\zeta_0(\cdot)$ is continuous, it follows that $\zeta_n(\cdot, 0)$ is measurable as a $D_u(\mathbb{R})$-valued random function, and assumption (7) is equivalent to this stronger assumption:

There exists a $C(\mathbb{R})$-valued random function $\zeta_0$ such that $\zeta_n(\cdot, 0) \to \zeta_0(\cdot)$ in distribution as $n \to \infty$, on the space $D_u(\mathbb{R})$. (9)

See Sect. 18 in [5] for details of this point. Since the state space is large, we need a uniformity assumption. But only on one side since the dynamics is totally asymmetric.

There exists a fixed $b \in \mathbb{R}$ such that for every $\varepsilon > 0$ one can find $q$ and $n_0$ such that
\[
\sup_{n \ge n_0} P\Bigl\{ \sup_{k:\,k \le nq} n k^{-2}\,\bigl| z^n_{[nb]}(0) - z^n_k(0) \bigr| \ge \varepsilon \Bigr\} \le \varepsilon. \tag{10}
\]
If (10) holds for some $b$, it holds for all $b$. It forces $u_0$ to satisfy
\[
\lim_{y \to -\infty} |y|^{-2} u_0(y) = 0. \tag{11}
\]
A consequence of assumption (9) is convergence in probability to the macroscopic interface $u_0$:
\[
\lim_{n \to \infty} P\Bigl\{ \sup_{y \in [a,b]} \bigl| n^{-1} z^n_{[ny]}(0) - u_0(y) \bigr| \ge \varepsilon \Bigr\} = 0. \tag{12}
\]
This and (10) are sufficient for a hydrodynamic limit: $n^{-1} z^n_{[nx]}(nt) \to u(x, t)$ in probability as $n \to \infty$, uniformly over $(x, t)$ in compact sets. See [22]. Property (11) guarantees that there exists a nonempty compact set $I(x, t) \subseteq (-\infty, x]$ on which the infimum in (2) is achieved:
\[
I(x, t) = \Bigl\{ y \le x : u(x, t) = u_0(y) + t\,g\Bigl(\frac{x - y}{t}\Bigr) \Bigr\}.
\]
For $t = 0$ it is convenient to have the convention $I(x, 0) = \{x\}$. The minimal and maximal Hopf–Lax minimizers are
\[
y^-(x, t) = \inf I(x, t) \qquad\text{and}\qquad y^+(x, t) = \sup I(x, t). \tag{13}
\]
Define
\[
\rho^\pm(x, t) = g'\Bigl(\frac{x - y^\pm(x, t)}{t}\Bigr) \qquad\text{for } (x, t) \in \mathbb{R} \times (0, \infty). \tag{14}
\]
It turns out that, for a fixed $t$, $y^-(x, t) = y^+(x, t)$ for all except at most countably many $x$. At such points the function $\rho(x, t) = \rho^\pm(x, t)$ is defined and continuous, and is the $x$-derivative $\rho(x, t) = u_x(x, t)$ of the viscosity solution of (1). Definition (14) is called the Lax–Oleinik formula. We say that $(x, t) \in \mathbb{R} \times (0, \infty)$ is the location of a shock if $y^-(x, t) < y^+(x, t)$. We will not call $(x, 0)$ a shock even if the initial function $u_0$ is nondifferentiable at $x$. Our first result shows that later fluctuations are close to a deterministic transformation of the initial fluctuations.
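The definitions (13) and (14) can be made concrete with a toy profile. The sketch below is an illustration under stated assumptions, not part of the paper: it takes $u_0(y) = \min(y, 0)$ (density 1 to the left of the origin, 0 to the right), approximates $I(x, t)$ by the grid points whose Hopf–Lax value lies within a tolerance `tol` of the minimum, and reports $y^\pm$ and $\rho^\pm = (x - y^\pm)/(2t)$. The grid, tolerance and function name are hypothetical choices. For this profile the shock travels along $x = t$.

```python
def minimizers(u0, x, t, y_min, n=4001, tol=1e-9):
    """Grid approximation of y^-(x,t), y^+(x,t) and rho^-(x,t), rho^+(x,t).

    Uses g(v) = v^2/4, so t*g((x-y)/t) = (x-y)^2/(4t) and g'(v) = v/2.
    y_min and the tolerance tol are assumptions of this sketch.
    """
    ys = [y_min + (x - y_min) * k / (n - 1) for k in range(n)]
    vals = [u0(y) + (x - y) ** 2 / (4.0 * t) for y in ys]
    m = min(vals)
    near = [y for y, v in zip(ys, vals) if v <= m + tol]
    y_minus, y_plus = min(near), max(near)
    rho_minus = (x - y_minus) / (2.0 * t)  # g'((x - y^-)/t)
    rho_plus = (x - y_plus) / (2.0 * t)    # g'((x - y^+)/t)
    return y_minus, y_plus, rho_minus, rho_plus

u0 = lambda y: min(y, 0.0)

# On the shock curve x = t there are two minimizers, y = x - 2t and y = x.
on_shock = minimizers(u0, x=1.0, t=1.0, y_min=-3.0)
# Away from the shock the minimizer is unique (here y = x - 2t).
off_shock = minimizers(u0, x=0.5, t=1.0, y_min=-3.0)
print(on_shock)
print(off_shock)
```

At the shock the two one-sided densities $\rho^-$ and $\rho^+$ differ, while away from it they coincide and give $u_x$.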
Theorem 2.1. Suppose $u_0$ is a locally Lipschitz continuous function. Assume (9) and (10).
(i) Let $A \subseteq \mathbb{R} \times [0,\infty)$ be a compact set such that either (a) $A$ is finite, or (b) there are no shocks in $A$, in other words $y^-(x,t) = y^+(x,t)$ for all $(x,t) \in A$. Then
\[
\lim_{n \to \infty} \sup_{(x,t) \in A} \Bigl| \zeta_n(x,t) - \inf_{y \in I(x,t)} \zeta_n(y,0) \Bigr| = 0 \quad \text{in probability.} \tag{15}
\]
(ii) For all $-\infty < a < b < \infty$, $0 < \tau < \infty$, and $1 \le p < \infty$,
\[
\lim_{n \to \infty} \sup_{0 \le t \le \tau} \int_a^b \Bigl| \zeta_n(x,t) - \inf_{y \in I(x,t)} \zeta_n(y,0) \Bigr|^p dx = 0 \quad \text{in probability.} \tag{16}
\]
From this theorem we deduce distributional limits for the interface and the stick profile.
2.1. Weak limits and the linearized equation. In assumption (9) we assumed the existence of a $C(\mathbb{R})$-valued random function $\zeta_0$. On the probability space of $\zeta_0$ define random variables $\zeta(x,t)$, $(x,t) \in \mathbb{R} \times [0,\infty)$, by
\[
\zeta(x,t) = \inf_{y \in I(x,t)} \zeta_0(y). \tag{17}
\]
To formulate a process-level weak convergence result, we consider, for a fixed $t$, the random function $x \mapsto \zeta_n(x,t)$ as an element of the space $L^p_{\mathrm{loc}}(\mathbb{R})$ of functions that are locally in $L^p$. By definition, a measurable function $f$ on $\mathbb{R}$ lies in $L^p_{\mathrm{loc}}(\mathbb{R})$ if for all $0 < k < \infty$,
\[
\|f\|_{L^p[-k,k]} \equiv \Bigl( \int_{[-k,k]} |f(x)|^p\,dx \Bigr)^{1/p} < \infty.
\]
$L^p_{\mathrm{loc}}(\mathbb{R})$ is a complete separable metric space under the metric
\[
d_p(f,g) = \sum_{k=1}^{\infty} 2^{-k} \bigl( \|f - g\|_{L^p[-k,k]} \wedge 1 \bigr), \tag{18}
\]
and we endow $L^p_{\mathrm{loc}}(\mathbb{R})$ with its Borel $\sigma$-algebra. The path $\zeta_n : t \mapsto \zeta_n(\cdot,t)$ is a measurable map from the underlying probability space into the Skorokhod space $D([0,\infty), L^p_{\mathrm{loc}}(\mathbb{R}))$ of right-continuous $L^p_{\mathrm{loc}}(\mathbb{R})$-valued paths with left limits at all time points $t$. The random variables $\zeta(x,t)$ defined in (17) specify an $L^p_{\mathrm{loc}}(\mathbb{R})$-valued path $\zeta : t \mapsto \zeta(\cdot,t)$
which is a random element of the space $C([0,\infty), L^p_{\mathrm{loc}}(\mathbb{R}))$ of continuous paths.
Theorem 2.2. Suppose $u_0$ is a locally Lipschitz continuous function. Assume (9) and (10).
(i) For any finitely many points $(x_i,t_i) \in \mathbb{R} \times [0,\infty)$, $1 \le i \le k$, we have the limit in distribution
\[
(\zeta_n(x_1,t_1), \ldots, \zeta_n(x_k,t_k)) \xrightarrow{\;d\;} (\zeta(x_1,t_1), \ldots, \zeta(x_k,t_k)) \quad \text{as } n \to \infty \tag{19}
\]
in the space $\mathbb{R}^k$.
(ii) The process $\zeta_n$ converges in distribution to the process $\zeta$ on the path space $D([0,\infty), L^p_{\mathrm{loc}}(\mathbb{R}))$.
As one would expect, $\zeta(x,t)$ is a solution of the linearization of the Hamilton–Jacobi equation (1). For this we must choose the correct version of $\zeta$ in the a.e. sense. Let $\bar\zeta(x,t) = \tfrac12 \bigl[ \zeta_0(y^-(x,t)) + \zeta_0(y^+(x,t)) \bigr]$. For a fixed $t$, $\bar\zeta(x,t) = \zeta(x,t)$ at all $x$ except shock locations. $\bar\zeta$ is a weak solution of the equation
\[
\bar\zeta_t(x,t) + f'(\rho(x,t))\, \bar\zeta_x(x,t) = 0, \qquad \bar\zeta(\cdot,0) = \zeta_0(\cdot). \tag{20}
\]
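This is a linear transport equation with a discontinuous coefficient. The appropriate definition of a weak solution is that, for all $\phi \in C_c^\infty(\mathbb{R} \times [0,\infty))$, $\bar\zeta$ satisfies this integral criterion:
\[
\int_0^\infty \!\! \int_{\mathbb{R}} \phi_t(x,t)\, \bar\zeta(x,t)\,dx\,dt + \int_0^\infty dt \int_{\mathbb{R}} \bar\zeta(x,t)\, d\bigl[ \phi(\cdot,t) f'(\rho^+(\cdot,t)) \bigr](x) + \int_{\mathbb{R}} \zeta_0(x) \phi(x,0)\,dx = 0. \tag{21}
\]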
For each $t$, the $x$-integral in the second term is with respect to the signed measure $\mu = \mu^{(t)}$ on $\mathbb{R}$ defined by $\mu(a,b] = \phi(b,t) f'(\rho^+(b,t)) - \phi(a,t) f'(\rho^+(a,t))$. For this to make sense we took the right-continuous version $\rho^+(\cdot,t)$ of $\rho(\cdot,t)$. The definition also requires that $\rho(\cdot,t)$ be locally of bounded variation, which is true by the Lax–Oleinik formula (14).
Equation (21) shows why the choice of $\bar\zeta$ matters. Suppose $(r(t),t)$ is a shock location for $t_0 \le t \le t_1$. Then $f'(\rho(\cdot,t))$ jumps at $r(t)$ and the measure $\mu$ gives nonzero mass to the singleton $\{r(t)\}$ for each $t$. Clearly the value of the second term in (21) depends on which value $\bar\zeta(r(t),t)$ takes. It is a curious discord that the correct weak solution of (20) differs from the pointwise limit in (19) at the shocks.
There is no dynamically generated noise in Eq. (20), as all the randomness is in the initial data $\zeta_0$. The equation expresses the point that on the diffusive scale the initial noise is transported along the characteristics, and the noise created by the dynamics is not visible because it is of lower order.
That $\bar\zeta$ is a weak solution of (20) follows from this more general result. Given a strictly convex, differentiable flux function $f$, let $\Theta(\lambda,\rho) \in [0,1]$ for $\lambda \ne \rho$ be defined by
\[
\frac{f(\lambda) - f(\rho)}{\lambda - \rho} = \Theta(\lambda,\rho)\, f'(\rho) + (1 - \Theta(\lambda,\rho))\, f'(\lambda). \tag{22}
\]
Let $\rho^\pm(x,t)$ be the functions defined by the Lax–Oleinik formula (14). Given a continuous function $v_0$, set for $(x,t) \in \mathbb{R} \times (0,\infty)$ first $\theta(x,t) = \Theta(\rho^+(x,t), \rho^-(x,t))$ and then
\[
v(x,t) = \theta(x,t)\, v_0(y^+(x,t)) + (1 - \theta(x,t))\, v_0(y^-(x,t)). \tag{23}
\]
Theorem 2.3. Suppose $f$ is a strictly convex $C^1$ flux function with convex conjugate $g$, the minimizers $y^\pm(x,t)$ are defined by (13), and $\rho^\pm(x,t)$ are defined by the Lax–Oleinik formula (14). Let $v_0$ be an arbitrary continuous function on $\mathbb{R}$, and define $v$ by (23). Then $v$ is a weak solution of the linear transport equation
\[
v_t + f'(\rho(x,t))\, v_x = 0, \qquad v|_{t=0} = v_0, \tag{24}
\]
in the sense of the integral criterion (21).
We would expect $v$ to be the unique weak solution of (24) under some natural uniqueness criterion. Presently a uniqueness theory exists for continuous solutions of equations of this type. See Petrova and Popov [19] and their references. For the special case $f(\rho) = \rho^2$ we get $\Theta \equiv \tfrac12$, which explains why we defined $\bar\zeta$ as the $(\tfrac12, \tfrac12)$ convex combination of $\zeta_0(y^\pm(x,t))$.
2.1.1. Remark. Above we chose to work with the $x$-right-continuous function $\zeta_n(x,t)$ defined by (4). The reader may prefer to linearly interpolate between the point locations $z^n_k$ to define an $x$-continuous random interface
\[
z_n(x,t) = (nx - [nx])\, z^n_{[nx]+1}(nt) + ([nx] + 1 - nx)\, z^n_{[nx]}(nt), \tag{25}
\]
and then consider the $x$-continuous fluctuation process $\zeta_n^{(c)}(x,t) = n^{-1/2}\{ z_n(x,t) - n u(x,t) \}$. The results would be the same. In particular, assumption (9) is equivalent to $\zeta_n^{(c)}(\cdot,0) \to \zeta_0(\cdot)$ weakly in $C(\mathbb{R})$.
2.1.2. Remark. In Theorems 2.1 and 2.2 part (i) is a sharper statement for a restricted set of space-time points, and part (ii) is a weaker statement without restriction on space-time points. Let us emphasize that the limits in (15) and (19) are valid for any finite collection of points, including shock locations. However, the uniform convergence in (15) cannot be extended to arbitrary compact sets that contain shocks. This can be seen from examples.
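As an illustration of definition (22), the weight can be solved for explicitly and evaluated numerically. The following is a minimal Python sketch and not part of the original argument: the function name `theta_weight` and the exponential sample flux are our own illustrative assumptions. It recovers the value 1/2 for the flux f(ρ) = ρ² used in the paper, and shows that another strictly convex flux gives a weight in [0, 1] that depends on the pair (λ, ρ).

```python
import math


def theta_weight(f, fprime, lam, rho):
    """Solve (22) for the weight, assuming lam != rho and f strictly convex."""
    chord = (f(lam) - f(rho)) / (lam - rho)  # left-hand side of (22)
    # (22) reads chord = w * f'(rho) + (1 - w) * f'(lam); solve for w.
    return (chord - fprime(lam)) / (fprime(rho) - fprime(lam))


# Quadratic flux f(rho) = rho^2 from the paper.
theta_quadratic = theta_weight(lambda r: r * r, lambda r: 2.0 * r, 3.0, 1.0)

# Exponential flux, chosen here only as a second strictly convex example.
theta_exponential = theta_weight(math.exp, math.exp, 1.0, 0.0)
```

For the quadratic flux the chord slope is λ + ρ, the midpoint of the two derivatives 2ρ and 2λ, which is why the weight does not depend on the pair.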
2.2. Starting in local equilibrium. Now we take the point of view of an observer on the initial interface, whose location is taken as the origin. Furthermore, we assume that at time zero this observer sees the interface to his left and right in local equilibrium, which is an assumption on the local slopes $\eta_i^n(0) = z_i^n(0) - z_{i-1}^n(0)$. (If we think of the $z_i^n$'s as particles, we call the $\eta_i^n$'s interparticle distances.) Then we can strengthen the distributional limits to almost sure limits, and give the limiting objects concrete descriptions in terms of Brownian motion. For the precise hypotheses, let $\rho_0$ be a nonnegative, locally bounded measurable function on $\mathbb{R}$. It will be the macroscopic profile of the $\eta_i^n(0)$ variables. Assume that for some real number $b$ (and hence for all $b$),
\[
\lim_{r \to -\infty} |r|^{-1} \cdot \sup_{r \le x \le b} \rho_0(x) = 0. \tag{26}
\]
Define a locally Lipschitz function $u_0$ by
\[
u_0(0) = 0, \qquad u_0(x) - u_0(y) = \int_y^x \rho_0(r)\,dr \quad \text{for all } y < x.
\]
Let $u(x,t)$ and $\rho(x,t)$ be again the relevant solutions of the macroscopic equations (1) and (3). The assumption on the initial interfaces $z^n(0)$ is as follows: For each $n$, $z_0^n(0) = 0$ with probability 1, and the variables $(\eta_i^n(0) : i \in \mathbb{Z})$ are mutually independent, exponentially distributed with expectations
\[
E[\eta_i^n(0)] = n u_0(i/n) - n u_0((i-1)/n) = n \int_{(i-1)/n}^{i/n} \rho_0(x)\,dx. \tag{27}
\]
Now $-z_0^n(t)$ is the cumulative current from site 0 for the stick process $\eta^n(\cdot)$, which is the total stick length that has moved across the bond $(0,1)$ during time interval $(0,t]$. Let $B(\cdot)$ denote a two-sided standard Brownian motion. In other words, take two independent 1-dimensional standard Brownian motions $B_1(s)$ and $B_2(s)$ defined for $0 \le s < \infty$, and set
\[
B(s) = \begin{cases} B_1(s), & s \ge 0 \\ -B_2(-s), & s < 0. \end{cases} \tag{28}
\]
The limiting processes are defined in terms of this Brownian motion by
\[
\zeta_0(y) = B\Bigl( \int_0^y \rho_0^2(s)\,ds \Bigr) \quad \text{and} \quad \zeta(x,t) = \inf_{y \in I(x,t)} \zeta_0(y) = \inf_{y \in I(x,t)} B\Bigl( \int_0^y \rho_0^2(s)\,ds \Bigr). \tag{29}
\]
The integrals above are signed, so for $y < 0$,
\[
\int_0^y \rho_0^2(s)\,ds = -\int_y^0 \rho_0^2(s)\,ds \le 0.
\]
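The signed time change inside (29) can be made concrete with a few lines of code. The following is a minimal Python sketch under our own assumptions: the names `variance_clock` and `step_profile` are hypothetical, the integral is approximated by a midpoint rule, and the two-valued density is only an example. It shows that the argument fed to the Brownian motion is negative for points left of the origin and positive for points to the right.

```python
def variance_clock(rho0, y, n=10_000):
    """Midpoint-rule value of the signed integral of rho0(s)**2 from 0 to y."""
    h = y / n  # the step is negative for y < 0, which makes the integral signed
    return h * sum(rho0((i + 0.5) * h) ** 2 for i in range(n))


def step_profile(s, lam=2.0, rho=1.0):
    """Illustrative two-valued density: lam left of the origin, rho right of it."""
    return lam if s < 0 else rho


clock_left = variance_clock(step_profile, -1.0)   # close to -lam**2 = -4
clock_right = variance_clock(step_profile, 1.0)   # close to rho**2 = 1
```

With this profile the initial fluctuation at a point y is the Brownian motion evaluated at 4y for y < 0 and at y for y > 0.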
Theorem 2.4. Assume (26) and (27). Then we can construct the processes $\{z^n(\cdot)\}$ on a common probability space with a two-sided Brownian motion $B(\cdot)$ so that the following almost sure limits hold.
(i) Let $A \subseteq \mathbb{R} \times [0,\infty)$ be a compact set such that either (a) $A$ is finite, or (b) there are no shocks in $A$, in other words $y^-(x,t) = y^+(x,t)$ for all $(x,t) \in A$. Then
\[
\lim_{n \to \infty} \sup_{(x,t) \in A} |\zeta_n(x,t) - \zeta(x,t)| = 0 \quad \text{a.s.} \tag{30}
\]
(ii) For all $-\infty < a < b < \infty$, $0 < \tau < \infty$, and $1 \le p < \infty$,
\[
\lim_{n \to \infty} \sup_{0 \le t \le \tau} \int_a^b |\zeta_n(x,t) - \zeta(x,t)|^p\,dx = 0 \quad \text{a.s.} \tag{31}
\]
2.2.1. Remark. We assumed the initial increment variables $\{\eta_i^n(0)\}$ exponentially distributed in assumption (27) just to be concrete. It is a natural choice because i.i.d. exponential distributions are invariant for the $\eta(\cdot)$ process so we can call (27) “local equilibrium”. But the validity of Theorem 2.4 does not depend on this special choice. The proof works as long as the initial distributions can be suitably embedded in Brownian motion, and the moments are sufficiently bounded for various estimates needed along the way. However, definition (29) of $\zeta(x,t)$ would change with different choices of initial distributions. The $\rho_0^2(s)$ inside the integral $\int_0^y \rho_0^2(s)\,ds$ appears because the variance of an exponential random variable is the square of the mean.
2.2.2. Moving along a characteristic from the origin. If $y^\pm(x,t) = 0$, which means that $(x,t)$ is a point on a genuine characteristic (not a shock) emanating from $(0,0)$, then $\zeta(x,t) = 0$ and (30) gives $\zeta_n(x,t) \to 0$ a.s. This tells us that $n^{-1/2}$ is the wrong normalization. We might expect the fluctuation to be of size $n^{1/3}$ because the situation studied by Baik, Deift and Johansson [2] is of this type. Their initial condition corresponds to setting $z_i(0) = 0$ for $i \le 0$ and $z_i(0) = \infty$ for $i > 0$. And their result can be expressed as the weak limit of $n^{-1/3}\{ z_{[nx]}(nt) - n x^2/(4t) \}$ for $x, t > 0$. In this situation $u_0(x) = \infty \cdot \mathbf{1}_{(0,\infty)}(x)$ and $y^\pm(x,t) = 0$ for all $x, t > 0$.
On the other hand, suppose $y^-(x,t) \le 0 \le y^+(x,t)$ with at least one inequality strict. Then $(x,t)$ lies on a characteristic from the origin that is a shock. Now $\zeta(x,t) \ne 0$ with positive probability, and with probability 1 if $y^-(x,t) < 0 < y^+(x,t)$. Equation (30) says that the current across a shock has fluctuations of order $n^{1/2}$.
2.2.3. Shock. A shock produces discontinuous fluctuations that skip segments of the Brownian path of the initial fluctuation.
Consider the simplest shock case, with initial profile
\[
\rho_0(y) = \begin{cases} \lambda, & y < 0 \\ \rho, & y > 0, \end{cases}
\]
with $\lambda > \rho$. The convex flux $f(\rho) = \rho^2$ preserves a downward jump. At later times $t > 0$ the shock is located at $x = (\rho + \lambda)t$, and the profile is given by
\[
\rho(x,t) = \begin{cases} \lambda, & x < (\rho + \lambda)t \\ \rho, & x > (\rho + \lambda)t. \end{cases}
\]
The Hopf–Lax minimizers are $y^\pm(x,t) = x - 2\lambda t$ for $x < (\rho + \lambda)t$, $y^\pm(x,t) = x - 2\rho t$ for $x > (\rho + \lambda)t$, and $I(x,t) = \{ (\rho - \lambda)t, (\lambda - \rho)t \}$ for $x = (\rho + \lambda)t$. At macroscopic time $t$, the limiting fluctuation process is
\[
\zeta(x,t) = \begin{cases} B\bigl( \lambda^2 (x - 2\lambda t) \bigr), & x < (\rho + \lambda)t \\ \min\bigl\{ B\bigl( \lambda^2 t(\rho - \lambda) \bigr),\ B\bigl( \rho^2 t(\lambda - \rho) \bigr) \bigr\}, & x = (\rho + \lambda)t \\ B\bigl( \rho^2 (x - 2\rho t) \bigr), & x > (\rho + \lambda)t. \end{cases} \tag{32}
\]
There is a jump in $\zeta(\cdot,t)$ at the shock $x = (\rho + \lambda)t$, and the path may be left- or right-continuous, depending on which choice makes it lower semicontinuous. The initial fluctuation in the range $\{ B(s) : \lambda^2 t(\rho - \lambda) < s < \rho^2 t(\lambda - \rho) \}$ disappeared from (32). Ferrari and Fontes [10] show that in asymmetric exclusion this becomes the fluctuation of a second class particle.
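The minimizers quoted in this shock example can be checked by brute force. The following is a minimal Python sketch, not taken from the paper: the slopes λ = 2 and ρ = 1, the function name `hopf_lax`, the grid step and the left cutoff `lo` are all our own assumptions. It minimizes u0(y) + t g((x − y)/t) over a grid of y ≤ x, with g(x) = x²/4, and reports the smallest and largest near-minimizers.

```python
LAM, RHO = 2.0, 1.0  # illustrative slopes with LAM > RHO >= 0 (our choice)


def u0(y):
    """Initial interface: slope LAM left of the origin, RHO right of it."""
    return LAM * y if y < 0 else RHO * y


def g(x):
    """Convex conjugate g(x) = x^2 / 4 of the flux f(rho) = rho^2."""
    return x * x / 4.0


def hopf_lax(x, t, lo=-10.0, step=1e-3, tol=1e-6):
    """Grid search for u(x, t) and the extreme minimizers over lo <= y <= x.

    Returns (u, y_minus, y_plus). The minimizers are resolved only up to a
    few grid steps, and lo must lie to the left of all minimizers.
    """
    n = int(round((x - lo) / step))
    ys = [lo + (x - lo) * i / n for i in range(n + 1)]
    vals = [u0(y) + t * g((x - y) / t) for y in ys]
    u = min(vals)
    near = [y for y, v in zip(ys, vals) if v <= u + tol]
    return u, near[0], near[-1]


shock = hopf_lax((RHO + LAM) * 1.0, 1.0)  # on the shock x = (rho + lambda) t
left = hopf_lax(2.0, 1.0)                 # left of the shock
right = hopf_lax(4.0, 1.0)                # right of the shock
```

At the shock point the search returns two well separated minimizers, near (ρ − λ)t and (λ − ρ)t, while on either side the two extreme minimizers agree up to grid resolution with x − 2λt and x − 2ρt respectively.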
2.2.4. A tagged particle fails to be tight in the presence of shocks. A basic question is the fluctuation of the motion of a tagged particle. In other words, fix $x$ and consider the process $\zeta_n(x,t) = n^{-1/2}\{ z^n_{[nx]}(nt) - n u(x,t) \}$ as $t$ varies in $[0,T]$. If there are no shocks in $\{x\} \times [0,T]$, (30) gives uniform convergence to a time-changed Brownian path. But if $(x,\sigma)$ is a shock for some $\sigma \in (0,T)$, it turns out that the sequence of processes $\{\zeta_n(x,\cdot)\}$ is not even tight in the Skorokhod space $D([0,T], \mathbb{R})$. To see this, recall this condition for tightness: for every $\varepsilon > 0$ there must exist a $\delta > 0$ such that $P(w_n(\delta) > \varepsilon) < \varepsilon$ for all $n$, where $w_n(\delta)$ is the following modulus of continuity: $w_n(\delta) = \inf_{\{t_i\}} w_n(\{t_i\})$, where the infimum is over partitions $\{t_i\}$ of $[0,T]$ such that $t_i - t_{i-1} > \delta$ for all $i$, and
\[
w_n(\{t_i\}) = \max_i \sup\{ |\zeta_n(x,s) - \zeta_n(x,t)| : s, t \in [t_{i-1}, t_i) \}.
\]
(See [5, Chapter 3] or [8, Chapter 3].)
Now fix $y_0 < y^-(x,\sigma)$, a constant $\alpha > 0$, the event $A = \{ \zeta_0(y^+(x,\sigma)) < \zeta_0(y) - \alpha \text{ for } y \in [y_0, y^-(x,\sigma)] \}$, and $\beta = P(A) > 0$. The probability $P(A)$ is positive because $y^-(x,\sigma) < y^+(x,\sigma)$ by the assumption that $(x,\sigma)$ is a shock. Let $\varepsilon < (\alpha \wedge \beta)/8$, and suppose there is a $\delta > 0$ such that $P(w_n(\delta) \ge \varepsilon) < \varepsilon$ for all $n$. Fix $\tau \in (\sigma, \sigma + \delta/2)$ so that $(x,\tau)$ is not a shock and so that $y(x,\tau) \in [y_0, y^-(x,\sigma)]$. This is possible because $t \mapsto y^-(x,t)$ is right-continuous and nonincreasing. Let $F_n$ be the event
\[
F_n = \{ w_n(\delta) < \varepsilon,\ |\zeta_n(x,t) - \zeta(x,t)| \le \varepsilon \text{ for } t = \sigma, \tau \}.
\]
By (30), $P(F_n) \ge 1 - 2\varepsilon$ for large enough $n$, and then $P(A \cap F_n) \ge \beta/2 > 0$. Fix a sample point $\omega \in A \cap F_n$. Fix a partition $\{t_i\}$ that achieves $w_n(\{t_i\}) < \varepsilon$ for this $\omega$. At this $\omega$,
\[
\zeta_n(x,\sigma) \le \zeta(x,\sigma) + \varepsilon \le \zeta_0(y^+(x,\sigma)) + \varepsilon \le \zeta_0(y(x,\tau)) - \alpha + \varepsilon = \zeta(x,\tau) - \alpha + \varepsilon \le \zeta_n(x,\tau) - 3\alpha/4.
\]
This implies that $\sigma$ and $\tau$ cannot lie in the same partition interval $[t_{i-1}, t_i)$, so there must be at least one partition point in $(\sigma, \tau]$. Since $\tau - \sigma < \delta/2$, there must be a unique partition point $t_k \in \{t_i\} \cap (\sigma, \tau]$. Then $w_n(\{t_i\}) < \varepsilon$ forces $|\zeta_n(x,t) - \zeta_n(x,\sigma)| < \varepsilon$ for $t \in [t_{k-1}, t_k)$, while $|\zeta_n(x,t) - \zeta_n(x,\tau)| < \varepsilon$ for $t \in [t_k, t_{k+1})$. Combining this with the earlier inequality gives $\zeta_n(x, t_k-) \le \zeta_n(x, t_k) - \alpha/2$, which by the continuity of $u(x,t)$ implies
\[
z^n_{[nx]}(n t_k -) \le z^n_{[nx]}(n t_k) - n^{1/2} \alpha/2
\]
and contradicts the basic rule that the particle $z^n_{[nx]}(\cdot)$ jumps leftward.
2.3. Fluctuations for the conserved quantity. We continue assuming that the process starts in local equilibrium according to assumptions (26) and (27). In this section we consider the fluctuations of the empirical density of the stick variables $\{\eta_i^n(nt)\}$. Total stick length is conserved by the dynamics, as each jump of particle $z_i$ means that a random portion is subtracted from $\eta_i$ and added on to $\eta_{i+1}$. Under assumptions (26) and (27) the empirical measure $n^{-1} \sum_i \eta_i^n(nt)\, \delta_{i/n}$ satisfies a hydrodynamic limit. Precisely, for any finite $a < b$,
\[
\lim_{n \to \infty} \frac{1}{n} \sum_{i=[na]+1}^{[nb]} \eta_i^n(nt) = \int_a^b \rho(x,t)\,dx \quad \text{a.s.} \tag{33}
\]
See [22]. Actually only a limit in probability is proved in [22], but the result can be strengthened under assumption (27). The next theorem is the fluctuation theorem for this hydrodynamic limit. The result is stated for the random distribution $\xi_n(t)$ defined below. First set
\[
\rho_i^n(t) = n \int_{(i-1)/n}^{i/n} \rho(x,t)\,dx = n u(i/n, t) - n u((i-1)/n, t).
\]
Then, for compactly supported test functions $\phi$, define
\[
\xi_n(t,\phi) = n^{-1/2} \sum_{i \in \mathbb{Z}} \phi(i/n) \bigl[ \eta_i^n(nt) - \rho_i^n(t) \bigr].
\]
Define another random distribution $\xi(t)$ by
\[
\xi(t,\phi) = -\int_{-\infty}^{\infty} \phi'(x)\, \zeta(x,t)\,dx, \tag{34}
\]
where $\zeta(x,t)$ is defined by (29) in terms of the Brownian motion $B(\cdot)$.
We want to put $\xi_n(t)$ and $\xi(t)$ into some reasonable metric space, and a workable choice turns out to be the space $H^{-1}_{\mathrm{loc}}(\mathbb{R})$ of distributions that are locally in $H^{-1}(\mathbb{R})$. To explain this we need some definitions. A source for the background needed here is Chapter 9 in [13].
Let $D'$ be the space of distributions in Schwartz's notation. Elements $F \in D'$ are linear functionals on the space $C_c^\infty(\mathbb{R})$ of compactly supported infinitely differentiable functions, and they are continuous in this sense: $F(\phi_j) \to F(\phi)$ if all derivatives of $\phi_j$ converge uniformly to the corresponding derivatives of $\phi$, and all $\phi_j$ and $\phi$ are supported on a common compact set. Distributions can be multiplied by smooth functions: if $\chi$ is a $C^\infty$-function then the distribution $\chi F$ is defined by $\chi F(\phi) = F(\chi\phi)$.
The Sobolev space $H^1(\mathbb{R})$ contains those $L^2$-functions $v$ that possess a weak derivative $v'$ in $L^2$. It is a separable Hilbert space with (one possible) norm $\|v\|_{H^1(\mathbb{R})} = \|v\|_{L^2(\mathbb{R})} + \|v'\|_{L^2(\mathbb{R})}$. $H^{-1}(\mathbb{R})$ is the dual space of $H^1(\mathbb{R})$, and itself a separable Hilbert space. A continuous linear functional on $H^1(\mathbb{R})$ also acts continuously on $C_c^\infty(\mathbb{R})$, and consequently the elements of $H^{-1}(\mathbb{R})$ are from the space $D'$. Give $H^{-1}(\mathbb{R})$ the operator norm
\[
\|F\|_{H^{-1}(\mathbb{R})} = \sup\{ |F(v)| : \|v\|_{H^1(\mathbb{R})} \le 1 \}.
\]
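Now we can define the space of distributions locally in $H^{-1}$:
\[
H^{-1}_{\mathrm{loc}}(\mathbb{R}) = \{ F \in D' : \chi F \in H^{-1}(\mathbb{R}) \text{ for all } \chi \in C_c^\infty(\mathbb{R}) \}. \tag{35}
\]
Fix once and for all an increasing sequence of $C_c^\infty(\mathbb{R})$ functions $\chi_k$ such that $\mathbf{1}_{[-k+1,k-1]} \le \chi_k \le \mathbf{1}_{(-k,k)}$. Then a distribution $F$ lies in $H^{-1}_{\mathrm{loc}}(\mathbb{R})$ iff $\chi_k F \in H^{-1}(\mathbb{R})$ for all $k$. We metrize $H^{-1}_{\mathrm{loc}}(\mathbb{R})$ by
\[
R(F,G) = \sum_{k=1}^{\infty} 2^{-k} \bigl( 1 \wedge \|\chi_k F - \chi_k G\|_{H^{-1}(\mathbb{R})} \bigr). \tag{36}
\]
Under this metric $H^{-1}_{\mathrm{loc}}(\mathbb{R})$ is a complete separable metric space. The process $t \mapsto \xi_n(t)$ is a random element of the Skorokhod space $D([0,\infty), H^{-1}_{\mathrm{loc}}(\mathbb{R}))$, and $t \mapsto \xi(t)$ is a random element of the space $C([0,\infty), H^{-1}_{\mathrm{loc}}(\mathbb{R}))$.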
Theorem 2.5. Assume (26) and (27). Construct the processes $\{z^n(\cdot)\}$ on a common probability space with a two-sided Brownian motion $B(\cdot)$ so that the conclusions of Theorem 2.4 are valid. Fix a finite time horizon $\tau < \infty$. Then almost surely
\[
\lim_{n \to \infty} \sup_{t \in [0,\tau]} R(\xi_n(t), \xi(t)) = 0. \tag{37}
\]
In particular, $\xi_n(\cdot)$ converges almost surely to $\xi(\cdot)$ on the path space $D([0,\infty), H^{-1}_{\mathrm{loc}}(\mathbb{R}))$.
Theorem 2.5 is a corollary of Theorem 2.4 and is valid under any hypotheses that make Theorem 2.4 true. See Remark 2.2.1. To complement the theorem, we give alternative characterizations of the limiting distribution-valued process $\xi(\cdot)$. Spohn [27, p. 260] argued that the limiting fluctuations of an asymmetric conservative system should be governed by the equation
\[
\partial_t \xi + \partial_x [f'(\rho)\xi] = 0. \tag{38}
\]
By definition (34), $\xi(t) = \partial_x \zeta(\cdot,t)$ in the distribution sense. Formally differentiating through (20) with respect to $x$ then gives exactly Eq. (38). Thus we can regard $\xi(\cdot)$ as a distribution solution to (38). Following (21), the correct interpretation of the distribution $\partial_x [f'(\rho(\cdot,t))\xi(t)]$ is then, applied to a test function $\psi \in C_c^\infty(\mathbb{R})$,
\[
\partial_x [f'(\rho(\cdot,t))\xi(t)](\psi) = \int_{\mathbb{R}} \bar\zeta(x,t)\, d[f'(\rho(\cdot,t))\psi'](x). \tag{39}
\]
We can also consider $\xi$ as a Gaussian process indexed by time and compactly supported test functions. Define forward characteristics $w^\pm(a,t)$ by
\[
w^-(a,t) = \inf\{ x : y^\pm(x,t) \ge a \} \quad \text{and} \quad w^+(a,t) = \sup\{ x : y^\pm(x,t) \le a \}.
\]
We discuss these in Sect. 3. For a fixed $t$, $w^-(a,t) = w^+(a,t)$ for all but countably many points $a \in \mathbb{R}$. As functions of $t$, $w^\pm(a,\cdot)$ are the minimal and maximal Filippov solutions of the initial value problem
\[
\frac{dx}{dt} = f'(\rho(x,t)), \qquad x(0) = a. \tag{40}
\]
See [6] and [20] for more about this. Ignoring the Lebesgue null set of shocks, we can write
\[
\xi(t,\phi) = -\int_{\mathbb{R}} \phi'(x)\, B\Bigl( \int_0^{y^\pm(x,t)} \rho_0^2(r)\,dr \Bigr) dx,
\]
which shows that $\xi = \{ \xi(t,\phi) : t \in [0,\infty), \phi \in C_c^\infty(\mathbb{R}) \}$ is a mean zero Gaussian process. Its distribution is determined by the correlations $E[\xi(s,\psi)\xi(t,\phi)]$, which can be shown to satisfy
\[
E[\xi(s,\psi)\xi(t,\phi)] = \int_{\mathbb{R}} \psi(w(r,s))\, \phi(w(r,t))\, \rho_0^2(r)\,dr. \tag{41}
\]
Here we wrote $w(r,t)$ for the a.e. defined function that agrees with both $w^-(r,t)$ and $w^+(r,t)$ at a.e. $r$, for any fixed $t$. Correlations (41) show that $\xi(t,\phi)$ can be equivalently described as follows. Fix a single two-sided Brownian motion $W(\cdot)$. For $t \in [0,\infty)$ and $\phi \in C_c^\infty(\mathbb{R})$, define the random variables $\tilde\xi(t,\phi)$ by the Itô integrals
\[
\tilde\xi(t,\phi) = \int_{\mathbb{R}} \phi(w(r,t))\, \rho_0(r)\,dW(r). \tag{42}
\]
The function $\phi(w(r,t))$ is supported on some compact interval $a \le r \le b$ so there is no problem in defining the stochastic integral (42) as a function of the increments $\{ W(r) - W(a) : a \le r \le b \}$. The process $\tilde\xi = \{ \tilde\xi(t,\phi) : t \in [0,\infty), \phi \in C_c^\infty(\mathbb{R}) \}$ has the correlations given in (41). From this we conclude that on the product space $\mathbb{R}^{[0,\infty) \times C_c^\infty(\mathbb{R})}$ the distributions of $\xi$ and $\tilde\xi$ are identical.
2.4. Construction of the process and the variational coupling. The purpose of this section is to establish notation. For more explanation of this construction we refer to [1, 22, 23, 25]. Consider a rate one, homogeneous Poisson point process on $\mathbb{R} \times (0,\infty)$. A sequence $(x_1,t_1), (x_2,t_2), \ldots, (x_m,t_m)$ of Poisson points is increasing if
\[
x_1 < x_2 < \cdots < x_m \quad \text{and} \quad t_1 < t_2 < \cdots < t_m.
\]
For $(a,s), (b,t) \in \mathbb{R} \times [0,\infty)$, let $L((a,s),(b,t))$ be the maximal number of Poisson points on an increasing sequence contained in $(a,b] \times (s,t]$. Abbreviate $L(b,t) = L((0,0),(b,t))$. Define an inverse to $L$ by
\[
\Gamma((a,s), m, \tau) = \inf\{ h > 0 : L((a,s),(a+h, s+\tau)) \ge m \}.
\]
Again abbreviate $\Gamma(m,\tau) = \Gamma((0,0), m, \tau)$. The well-known laws of large numbers are
\[
\lim_{s \to \infty} \frac{1}{s} L(sb, st) = 2\sqrt{bt} \quad \text{and} \quad \lim_{s \to \infty} \frac{1}{s} \Gamma([sa], st) = \frac{a^2}{4t} \quad \text{a.s.}
\]
Assume given a probability space $(\Omega, \mathcal{F}, P)$ on which are defined a rate one Poisson point process on $\mathbb{R} \times (0,\infty)$ and an initial configuration $(z_i(0) : i \in \mathbb{Z})$ for Hammersley's process. The process $z(t) = (z_k(t) : k \in \mathbb{Z})$ is defined by
\[
z_k(t) = \inf_{i:\, i \le k} \bigl\{ z_i(0) + \Gamma((z_i(0), 0), k - i, t) \bigr\} \tag{43}
\]
for all $k \in \mathbb{Z}$ and $t > 0$. Define the state space
\[
Z = \Bigl\{ z = (z_i) \in \mathbb{R}^{\mathbb{Z}} : z_{i-1} \le z_i \text{ for all } i, \text{ and } \lim_{i \to -\infty} i^{-2} z_i = 0 \Bigr\}.
\]
If $(z_i(0)) \in Z$ a.s., then the infimum in (43) is attained at some finite $i$ and $z(t) \in Z$ for all $t$ a.s. Thus (43) defines a time-homogeneous Markov process $z(\cdot)$ on $Z$.
In this paper we work with a family of processes $\{z^n(\cdot)\}$. For each $n$ assume a probability space $(\Omega, \mathcal{F}, P)$ that supports the initial configuration $z^n(0) = (z_i^n(0) : i \in \mathbb{Z})$ and the space-time Poisson point process. On $\Omega$ define the random variables
\[
E^{n,i}_m(t) = \Gamma((z_i^n(0), 0), m, t). \tag{44}
\]
Then, following (43), the processes $\{z^n(t)\}$ are defined by
\[
z_k^n(t) = \inf_{i:\, i \le k} \bigl\{ z_i^n(0) + E^{n,i}_{k-i}(t) \bigr\}. \tag{45}
\]
3. Characteristics and the Hopf–Lax Formula
Our proofs take advantage of the correspondence between the macroscopic and microscopic situations, and estimates on the probability that the microscopic situation deviates from the macroscopic one. This section collects facts about the macroscopic situation. Recall that the initial function $u_0$ is nondecreasing, locally Lipschitz, and satisfies the left growth bound (11). We work throughout with the flux $f(\rho) = \rho^2$ with convex conjugate $g(x) = x^2/4$. Similar results can be derived for any strictly convex, differentiable conjugate pair $(f,g)$, and the regularity assumption on $u_0$ can be relaxed to lower semicontinuity. The growth bound (11) would need to be tailored to the $g$ in question.
The function $F(y) = u_0(y) + t g((x-y)/t)$ is continuous and satisfies $\lim_{y \to -\infty} F(y) = \infty$. Consequently the minimum in the Hopf–Lax formula (2) is achieved at some point $y$, and the set of minimizers is compact. Define the function $u(x,t)$ by the initial condition $u(x,0) = u_0(x)$ and by (2). Then for a fixed $t > 0$, $u(\cdot,t)$ is locally Lipschitz in $x$ (we check this below), and $x^{-2} u(x,t) \to 0$ as $x \to -\infty$. The Hopf–Lax formula can be iterated as a semigroup:
\[
u(x,t) = \inf_{y:\, y \le x} \Bigl\{ u(y,s) + (t-s)\, g\Bigl( \frac{x-y}{t-s} \Bigr) \Bigr\} \tag{46}
\]
for all $0 < s < t$ and $x \in \mathbb{R}$. Define the set of minimizers in (46) by
\[
I(x; s,t) = \Bigl\{ y \le x : u(x,t) = u(y,s) + (t-s)\, g\Bigl( \frac{x-y}{t-s} \Bigr) \Bigr\}. \tag{47}
\]
$I(x; s,t)$ is nonempty and compact. Define minimal and maximal minimizers by
\[
y^-(x; s,t) = \inf I(x; s,t) \quad \text{and} \quad y^+(x; s,t) = \sup I(x; s,t). \tag{48}
\]
The following properties can be checked: If x1 < x2 then y + (x1 ; s, t) ≤ y − (x2 ; s, t), while if t1 < t2 then y + (x; s, t2 ) ≤ y − (x; s, t1 ). y ± (x; s, t) is nondecreasing in x and nonincreasing in t. y + is right- and y − left-continuous in x, while y + is left- and y − right-continuous in t. Consequently, for fixed s < t, y ± (· ; s, t) have the same continuity points, they coincide on these continuity points, and y − (x; s, t) < y + (x; s, t) iff x is a
discontinuity point. A similar statement holds for $y^\pm(x; s, \cdot)$ as a function of $t$, for fixed $x, s$. Next define minimal and maximal forward characteristics by
\[
w^-(a; s,t) = \sup\{ x : y^\pm(x; s,t) < a \} = \inf\{ x : y^\pm(x; s,t) \ge a \} \tag{49}
\]
and
\[
w^+(a; s,t) = \sup\{ x : y^\pm(x; s,t) \le a \} = \inf\{ x : y^\pm(x; s,t) > a \}. \tag{50}
\]
The equalities between the alternative definitions follow from the properties of $y^\pm(x; s,t)$. For $w^\pm(a; s,t)$ we have these properties: nondecreasing in $a$, nondecreasing in $t$, $w^+$ is right- and $w^-$ left-continuous in $a$, $w^+(a_1; s,t) \le w^-(a_2; s,t)$ for $a_1 < a_2$. As above, for fixed $s < t$, $w^\pm(\cdot\,; s,t)$ have the same points of continuity, coincide on continuity points, and $w^-(a; s,t) < w^+(a; s,t)$ iff $a$ is a discontinuity point. Note the equivalence
\[
y^-(x; s,t) \le a \le y^+(x; s,t) \iff w^-(a; s,t) \le x \le w^+(a; s,t). \tag{51}
\]
Note also that as a trivial consequence of the definitions, $y^-(x; s,t) \le y^+(x; s,t) \le x$, and $a \le w^-(a; s,t) \le w^+(a; s,t)$. We adopt the following notational conventions. When the $\pm$ functions coincide we write $y^\pm(x; s,t) = y(x; s,t)$ and $w^\pm(a; s,t) = w(a; s,t)$. When $s = 0$ abbreviate $y^\pm(x; 0,t) = y^\pm(x,t)$ and similarly $y(x,t)$, $w^\pm(a,t)$, $w(a,t)$.
As mentioned earlier, $u(x,t)$ is the unique viscosity solution of the Hamilton–Jacobi equation $u_t + f(u_x) = 0$ with $f(\rho) = \rho^2$ and initial data $u|_{t=0} = u_0$. Set $b(x) = g'(x) = x/2$, and define two functions $\rho^\pm(x,t)$ by
\[
\rho^\pm(x,t) = b\Bigl( \frac{x - y^\pm(x,t)}{t} \Bigr) \quad \text{for } (x,t) \in \mathbb{R} \times (0,\infty). \tag{52}
\]
For a fixed $t > 0$, $\rho^\pm$ give the one-sided $x$-derivatives of $u$:
\[
\rho^\pm(x,t) = \lim_{\varepsilon \to 0\pm} \frac{u(x + \varepsilon, t) - u(x,t)}{\varepsilon}.
\]
There is a function $\rho$ such that $\rho(x,t) = \rho^\pm(x,t)$ for all but countably many $x$, because $y^-(x,t) = y^+(x,t)$ for all but countably many $x$ (for fixed $t$ again). The a.e. defined function $\rho(x,t)$ is the unique entropy solution of the Burgers equation $\rho_t + f(\rho)_x = 0$ with initial condition $\rho_0 = u_0'$. Formula (52) is known as the Lax–Oleinik formula.
The next lemma collects some properties proved in Sect. 3 of Rezakhanlou [20]. It is worthwhile to note that strict convexity and continuous differentiability of $g(x) = x^2/4$ are critical for many of the good properties of the characteristics utilized in this section. The reader can compare with [26] where the $g$ function corresponding to the K-exclusion process is not known to possess these properties.
Lemma 3.1. (a) Suppose $0 \le t_1 < t_2 < t_3$, $y_2 \in I(x; t_2, t_3)$, and $y_1 \in I(y_2; t_1, t_2)$. Then $y_1 \in I(x; t_1, t_3)$ and
\[
\frac{x - y_1}{t_3 - t_1} = \frac{x - y_2}{t_3 - t_2} = \frac{y_2 - y_1}{t_2 - t_1}.
\]
In other words, the points $(y_1,t_1)$, $(y_2,t_2)$, and $(x,t_3)$ lie on a line segment.
(b) For $0 \le s < s_1 < t$,
\[
y^\pm(x; s_1, t) = \frac{s_1 - s}{t - s}\, x + \frac{t - s_1}{t - s}\, y^\pm(x; s,t).
\]
(c) For $0 < s < t$ and all $a \in \mathbb{R}$, $w^\pm(a; s,t) = w(a; s,t)$. For $s = 0$ and $a \in \mathbb{R}$, $w^\pm(a,t) = w(a,t)$ is guaranteed by
\[
\liminf_{\varepsilon \searrow 0} \frac{u_0(a + \varepsilon) - u_0(a)}{\varepsilon} \le \limsup_{\varepsilon \searrow 0} \frac{u_0(a) - u_0(a - \varepsilon)}{\varepsilon}. \tag{53}
\]
(d) Suppose $0 \le t_1 < t_2 < t_3$. Then $w^\pm(a; t_1, t_3) = w(w^\pm(a; t_1, t_2); t_2, t_3)$.
Next some further properties.
Lemma 3.2. (i) Let $0 < s < t$ and $x_1 = w(x_0; s,t)$. Then
\[
y^-(x_1,t) \le y^-(x_0,s) \le y^+(x_0,s) \le y^+(x_1,t). \tag{54}
\]
Conversely, if (54) holds and the middle inequality is strict, then $x_1 = w(x_0; s,t)$.
(ii) Let $(x,t)$ and $(x_1,t_1)$ be arbitrary points in $\mathbb{R} \times (0,\infty)$ with $t \le t_1$. Suppose the open intervals $J_{x,t} = (y^-(x,t), y^+(x,t))$ and $J_{x_1,t_1} = (y^-(x_1,t_1), y^+(x_1,t_1))$ are nonempty. Then one of two cases happens: either the intervals are disjoint, which happens if $t = t_1$ and $x \ne x_1$, or if $t < t_1$ and $x_1 \ne w(x; t,t_1)$. Or $J_{x,t} \subseteq J_{x_1,t_1}$ which happens if $(x,t) = (x_1,t_1)$ or if $t < t_1$ and $x_1 = w(x; t,t_1)$.
Proof. (i) By (51), $y' \equiv y^-(x_1; s,t) \le x_0 \le y^+(x_1; s,t) \equiv y''$. By Lemma 3.1(a), $y^-(y'; 0,s),\ y^+(y''; 0,s) \in I(x_1,t)$. By the monotonicity of $y^\pm(\cdot\,, s)$,
\[
y^-(x_1; 0,t) \le y^-(y'; 0,s) \le y^-(x_0; 0,s) \le y^+(x_0; 0,s) \le y^+(y''; 0,s) \le y^+(x_1; 0,t).
\]
For the converse part, if $x_1 > w(x_0; s,t)$ then by monotonicity and the part already proved,
\[
y^-(x_1; 0,t) \ge y^+(w(x_0; s,t); 0,t) \ge y^+(x_0; 0,s).
\]
This contradicts (54) if the middle inequality of (54) is strict. Similarly rule out the case $x_1 < w(x_0; s,t)$.
(ii) If $t = t_1$ then either $x = x_1$ or the intervals must be disjoint, because $x < x_1$ implies $y^+(x,t) \le y^-(x_1,t)$. Suppose $t_1 > t$. If $x_1 = w(x; t,t_1)$ then by part (i) $(y^-(x,t), y^+(x,t))$ is contained in $(y^-(x_1,t_1), y^+(x_1,t_1))$. On the other hand, if $x_1 > w(x; t,t_1)$ then we have disjointness by the argument already used in part (i):
\[
y^-(x_1; 0,t_1) \ge y^+(w(x; t,t_1); 0,t_1) \ge y^+(x; 0,t).
\]
Similarly for the remaining cases. □
Part (i) of the previous lemma has the following meaning. We say that $(x,t)$ is a shock if $y^-(x,t) < y^+(x,t)$. By (52), this is the same as saying that the Lax–Oleinik solution $\rho$ is discontinuous at $(x,t)$. For when the minimizer $y(x,t) = y^\pm(x,t)$ is unique and $(x_j,t_j) \to (x,t)$, then any choice $y_j \in I(x_j,t_j)$ satisfies $y_j \to y(x,t)$. Inequalities (54) imply that once a shock is created, it moves along a forward characteristic and never disappears. Note though that shocks merge when characteristics merge, so the number of shocks may decrease.
Next we look at the continuity of characteristics and $u(x,t)$.
Lemma 3.3. (a) Fix $0 \le s < T < \infty$ and suppose the function $u(\cdot,s)$ on the right-hand side of (46) is locally Lipschitz in the $x$-variable. Fix $a < b$, and let $K$ be the Lipschitz constant of $u(\cdot,s)$ on the interval $[y^-(a; s,T), b]$. Then for all $x \in [a,b]$ and $t \in (s,T]$, $I(x; s,t) \subseteq [x - 4K(t-s), x]$.
(b) Recall that $u_0$ is assumed locally Lipschitz and to satisfy (11). Then $u(x,t)$ is locally Lipschitz on $\mathbb{R} \times [0,\infty)$. For any $a < b$ and $T < \infty$ there exists a constant $L = L(a,b,T)$ such that $I(x; s,t) \subseteq [x - L(t-s), x]$ for all $x \in [a,b]$ and $0 \le s < t \le T$. Also, $\lim_{t \to 0} u(x,t) = u_0(x)$ for all $x \in \mathbb{R}$.
Proof. (a) Let $c = y^-(a; s,T)$. Since $y^\pm(x; s,t)$ is nonincreasing in $t$, $I(x; s,t) \subseteq [c,b]$ for all $(x,t) \in [a,b] \times (s,T]$. Any $y \in I(x; s,t)$ must satisfy $y \le x$ and $u(x,s) \ge u(y,s) + (x-y)^2/(4(t-s))$, from which follows
\[
(x-y)^2 \le 4(t-s)(u(x,s) - u(y,s)) \le 4(t-s) K (x-y).
\]
(b) First step: to show that on a bounded interval $[a,b]$, $u(\cdot,t)$ is locally Lipschitz in the $x$-variable with Lipschitz constant independent of $t \in [0,T]$. Let $K$ be the Lipschitz constant of $u_0$ on $[y^-(a,T), b]$. Let $x_1 < x_2$ in $[a,b]$ and pick $y_1 \in I(x_1,t)$. Then $y_1 \in [y^-(x_1,t), x_1] \subseteq [y^-(a,T), b]$. Set $y_2 = y_1 + x_2 - x_1 \in [y_1, x_2] \subseteq [y^-(a,T), b]$. Then
\[
0 \le u(x_2,t) - u(x_1,t) \le u_0(y_2) + t g\Bigl( \frac{x_2 - y_2}{t} \Bigr) - u_0(y_1) - t g\Bigl( \frac{x_1 - y_1}{t} \Bigr) = u_0(y_2) - u_0(y_1) \le K|y_2 - y_1| = K|x_2 - x_1|.
\]
Second step: to show the existence of a constant $C$ such that for all $x \in [a,b]$ and $0 \le s < t \le T$, $|u(x,t) - u(x,s)| \le C|t-s|$. The two steps together imply that $u$ is locally Lipschitz. For the second step, apply the first step to let $K$ be the common Lipschitz constant for the functions $\{ u(\cdot,t) : 0 \le t \le T \}$ on the $x$-interval $y^-(a,T) \le x \le b$. Let $y = y^-(x; s,t) \in I(x; s,t)$. By part (a), $|x - y| \le 4K(t-s)$. Furthermore, by Lemma 3.1(b),
\[
y = y^-(x; s,t) = \frac{s}{t}\, x + \Bigl( 1 - \frac{s}{t} \Bigr) y^-(x,t) \in [y^-(a,T), b].
\]
Now we may reason as follows for $x \in [a,b]$ and $0 \le s < t \le T$:
\[
0 \le u(x,s) - u(x,t) = u(x,s) - u(y,s) - \frac{(x-y)^2}{4(t-s)} \le u(x,s) - u(y,s) \le K|x-y| \le 4K^2(t-s).
\]
We may take $C = 4K^2$ and the proof of Lipschitz continuity is complete.
By Lemma 3.1(b), $[y^-(a;s,T), b] \subseteq [y^-(a,T), b]$ for all $0 \le s < T$, so a single Lipschitz constant $K$ works for all $s$ in part (a). The limit $u(x,0+) = u_0(x)$ now follows from $u_0(y^+(x,t)) \le u(x,t) \le u_0(x)$. □

Let $0 \le t_0 < T$. A forward characteristic emanating from $(x_0,t_0)$ is any function $r(t)$, $t_0 \le t \le T$, that satisfies $r(t_0) = x_0$ and
\[ w^-(r(s); s, t) \le r(t) \le w^+(r(s); s, t) \quad \text{for all } t_0 \le s < t \le T. \tag{55} \]
Notice that this implies $r(t) = w(r(s); s, t)$ for all $0 < s < t$. Multiple forward characteristics can emanate only from a point $(a,0)$ on the $t = 0$ line [and only if (53) fails]. For example, $w^\pm(a,t)$ are forward characteristics that emanate from $(a,0)$. By Lemma 3.3 a forward characteristic $r(\cdot)$ is locally Lipschitz. Hence $r'(t)$ exists at a.e. $t$, and $r$ is the integral of its derivative. We can find a formula for the derivative. Let
\[ h(x,t) = \begin{cases} f'(\rho(x,t)), & \text{if } y^-(x,t) = y^+(x,t), \\[4pt] \dfrac{f(\rho^+(x,t)) - f(\rho^-(x,t))}{\rho^+(x,t) - \rho^-(x,t)}, & \text{if } y^-(x,t) < y^+(x,t). \end{cases} \tag{56} \]

Theorem 3.1. For any forward characteristic $r(\cdot)$, $r'(t) = h(r(t),t)$ for Lebesgue a.e. $t$.

The above theorem is from Rezakhanlou [20]. As the last result of this section we show that the shocks in a compact set can be enclosed in an open set with small $t$-sections.

Proposition 3.1. Fix $-\infty < a < b < \infty$, $0 < \tau < \infty$, and $\varepsilon > 0$. Then there exists an open set $G \subseteq \mathbb{R} \times (0,\infty)$ such that $G$ contains all the shocks in $[a,b] \times (0,\tau]$, and for each $t \in (0,\tau]$, the $t$-section $G_t = \{x : (x,t) \in G\}$ has 1-dimensional Lebesgue measure $|G_t| \le \varepsilon$.

The following notion will be helpful for the proof: Say a shock $(x,t)$ is a new shock if there does not exist a shock $(x_0,t_0)$ such that $t_0 < t$ and $x = w(x_0; t_0, t)$. There can be at most countably many new shocks because if $(x,t)$ and $(x',t')$ are new shocks, the open intervals $(y^-(x,t), y^+(x,t))$ and $(y^-(x',t'), y^+(x',t'))$ must be disjoint by Lemma 3.2(ii).

Proof of Proposition 3.1. Fix a point $c < y^-(a,\tau)$. Let $\{(x_i,t_i) : i \ge 1\}$ be the (at most countably many) new shocks in $[a,b] \times (0,\tau]$. Let $S$ be the set of all shocks in $[c, b+1] \times (0, \tau+1]$. Define $\Delta y(x,t) = y^+(x,t) - y^-(x,t)$, so that $\Delta y(x,t) > 0$ iff $(x,t)$ is a shock. Let $\varepsilon_1 < (\varepsilon/2) \cdot (b - y^-(c,\tau))^{-1}$. Write $B(x,\delta)$ for the Euclidean ball in $\mathbb{R}^2$ centered at $x$ with radius $\delta$. Define the following subset of $\mathbb{R}^2$:
\[ H = \bigcup_{(x,t)\in S} \bigl(x - \varepsilon_1 \Delta y(x,t),\; x + \varepsilon_1 \Delta y(x,t)\bigr) \times \{t\} \;\cup\; \bigcup_{i\ge 1} B\bigl((x_i,t_i),\, 2^{-i-2}\varepsilon\bigr). \]
We claim that every shock in $[a,b] \times (0,\tau]$ is an interior point of $H$. This is clear for new shocks. If $(x_1,t_1)$ is a non-new shock, we can find a shock $(x_0,t_0)$ such that $0 < t_0 < t_1$ and $x_1 = w(x_0; t_0, t_1)$. Since we chose $c < y^-(a,\tau) \le y^-(x_1,t_1)$, the shock $(x_0,t_0)$ and the forward characteristic $w(x_0; t_0, t)$ for $t_0 \le t \le t_1$ lie in $S$. Furthermore, since $S$ contains the shocks in $[x_1, b+1] \times [t_1, \tau+1]$, we can choose $t_2 > t_1$ so that $S$ contains the forward characteristic $r(t) \equiv w(x_0; t_0, t)$ for $t_0 \le t \le t_2$.
Fluctuations for Totally Asymmetric Systems
Let $h = \varepsilon_1 \Delta y(x_0,t_0) > 0$. By (54), $\varepsilon_1 \Delta y(r(t),t) \ge h$ for $t_0 \le t \le t_2$. Consequently $H$ contains the set $t_0$ … and consequently
\[ |G_t| \le \sum_{c \le x \le b} 2\varepsilon_1 \Delta y(x,t) + \sum_{i\ge 1} 2^{-i-1}\varepsilon \le 2\varepsilon_1\bigl(y^+(b,t) - y^-(c,t)\bigr) + \varepsilon/2 < \varepsilon. \]
The inequalities above follow because, as $x \in [c,b]$ ranges over the shock locations with time coordinate $t$, the open intervals $(y^-(x,t), y^+(x,t))$ are disjoint subintervals of $(y^-(c,t), y^+(b,t))$, which itself is a subinterval of $(y^-(c,\tau), b)$. □
4. Estimates for Increasing Sequences

We have the following bounds on $L$ and $\Gamma$.

Lemma 4.1. Suppose $a$, $s$ and $h$ are positive real numbers.
(a) For $x \ge 2$, define
\[ I(x) = 2x \cosh^{-1}(x/2) - 2\sqrt{x^2 - 4}. \]
When $x > 0$ is small enough, there is a constant $C$ such that $I(2+x) \ge Cx^{3/2}$. For any $C$, $I(x) \ge Cx$ for large enough $x$. For all real $b > 0$ and $m \ge 2b$,
\[ P\{L(b,b) \ge m\} \le \exp\bigl(-b\,I(m/b)\bigr). \tag{57} \]
(b) There are fixed positive constants $B_0$, $B_1$, $d_0$, $C_0$ and $C_1$ such that if $a \ge B_0$ and $B_1 a^{4/3} \le hs \le d_0 a^2$, then
\[ P\Bigl\{\Gamma([a], s) > \frac{a^2}{4s} + h\Bigr\} \le C_0 \exp\Bigl(-C_1 \frac{s^3h^3}{a^4}\Bigr). \]
(c) There are finite positive constants $C_0$ and $C_1$ such that for all $0 < a \le s$,
\[ P\{\Gamma([a], s) > s\} \le C_0 \exp(-C_1 s^2). \]
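A short computation makes the two growth claims for the rate function in Lemma 4.1(a) plausible. This is a sketch that uses only the formula $I(x) = 2x\cosh^{-1}(x/2) - 2\sqrt{x^2-4}$; the constant $4/3$ below comes out of this sketch and is not stated in the source.

```latex
% Derivative of the rate function I(x) = 2x arccosh(x/2) - 2 sqrt(x^2 - 4), x > 2.
\begin{align*}
I'(x) &= 2\cosh^{-1}(x/2) + \frac{2x}{\sqrt{x^2-4}} - \frac{2x}{\sqrt{x^2-4}}
       = 2\cosh^{-1}(x/2), \\
I(2) &= 0, \qquad I(2+x) = \int_0^x 2\cosh^{-1}(1+u/2)\,du .
\end{align*}
% Near the edge: arccosh(1+h) = sqrt(2h) (1 + O(h)) as h -> 0+, applied with h = u/2.
\[
I(2+x) = \int_0^x 2\sqrt{u}\,\bigl(1+O(u)\bigr)\,du
       = \tfrac{4}{3}\,x^{3/2}\bigl(1+O(x)\bigr), \qquad x \downarrow 0 .
\]
% Far from the edge: I'(x) -> infinity as x -> infinity, hence I(x)/x -> infinity,
% so for every C one has I(x) >= Cx once x is large enough.
```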
Part (a) was first proved by Kim [17]. Seppäläinen [23] proved that $I(x)$ is the correct rate function for the deviations in (57). Part (b) is a consequence of Lemma 7.1(iv) in Baik–Deift–Johansson [2]. [See Lemma 5.2 in [25] for the conversion of Baik–Deift–Johansson's lemma into part (b) above.] Part (c) is a consequence of Lemma 2.2 in Johansson [16].

Next we use these inequalities to derive estimates tailored to our needs. Most technical complications arise from the need to treat small $t$ that vanish as $n \to \infty$, in order to get the $t$-uniformity of the theorems. When using Lemma 4.1, it is often useful to note that $L(a,b) \overset{d}{=} L(\sqrt{ab}, \sqrt{ab})$ ($\overset{d}{=}$ means equality in distribution). This follows from the invariance of the homogeneous planar Poisson point process under the maps $(x,y) \mapsto (rx, r^{-1}y)$, $r > 0$.

Lemma 4.2. (a) Let $\beta, \tau > 0$. Then there exists a constant $\alpha \in (0,\infty)$ such that
\[ \sum_{n\ge 1} P\bigl\{\Gamma([\alpha n t^{1/2}], nt) \le n\beta \text{ for some } t \in [n^{-2}(\log n)^2, \tau]\bigr\} < \infty. \]
(b) Let $b, \delta, \gamma, \tau > 0$. Assume these restrictions: $\gamma < 3/4$ and $\gamma(1+\delta) < 1$. Then there exists a constant $\alpha \in (0,\infty)$ such that
\[ \sum_{n\ge 1} n\,P\bigl\{\Gamma([\alpha n t^{\gamma}], nt) \le b\alpha n t^{1/2} \text{ for some } t \in [n^{-(1+\delta)}, \tau]\bigr\} < \infty. \]
(c) Let $b, \delta, \gamma, \eta, \tau > 0$. Assume these restrictions: $1/2 < \gamma < 3/4$, $\gamma(1+\delta) < 1$, and $\delta < (3 - 4\gamma - 2\eta)/(4\gamma - 2)$. Then there exists a constant $\alpha \in (0,\infty)$ such that
\[ \sum_{n\ge 1} n\,P\bigl\{\Gamma([\alpha n t^{\gamma}], nt) \le b n^{1/2+\eta} \text{ for some } t \in [n^{-(1+\delta)}, \tau]\bigr\} < \infty. \]
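The distributional identity $L(a,b) \overset{d}{=} L(\sqrt{ab},\sqrt{ab})$ noted before Lemma 4.2 can be checked directly. The sketch below assumes that $L(a,b)$ is the maximal number of Poisson points on an increasing path in a rectangle with side lengths $a$ and $b$, taken here to be $[0,a]\times[0,b]$; the specific choice $r = \sqrt{b/a}$ is made for this illustration.

```latex
% Area-preserving rescaling that turns an a-by-b rectangle into a square.
\[
T_r(x,y) = (r x,\; r^{-1} y), \qquad r = \sqrt{b/a}, \qquad
T_r\bigl([0,a]\times[0,b]\bigr) = [0,\sqrt{ab}\,]\times[0,\sqrt{ab}\,].
\]
% T_r has Jacobian determinant r * r^{-1} = 1, so the image of a homogeneous
% Poisson point process is again a homogeneous Poisson process of the same rate.
% T_r is increasing in each coordinate, so increasing paths map to increasing paths.
\[
L(a,b) \overset{d}{=} L(\sqrt{ab},\sqrt{ab}), \qquad
\text{for instance}\quad
L(n\beta,\, n t) \overset{d}{=} L\bigl(n(\beta t)^{1/2},\, n(\beta t)^{1/2}\bigr).
\]
```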
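Proof. We prove part (a), and omit the similar arguments of the other parts. Set $t_i = 4^i n^{-2}(\log n)^2$ for $i \ge 0$. Pick $K$ so that $t_{K-1} < \tau \le t_K$. Then $K \le C\log n$ for a constant $C$. Let $\alpha_0 = \alpha/2$ to account for the effect of replacing the integer $[\alpha n t^{1/2}]$ by $\alpha n t^{1/2}$. Suppose $\alpha_0 > 6\beta^{1/2}$. Pick $a_1$ so that $I(x) \ge a_1 x$ for $x \ge 3$. Increase $\alpha_0$ further so that $a_1\alpha_0 \ge 2$. Use Lemma 4.1(a) to bound the probability:
\begin{align*}
& P\bigl\{\Gamma([\alpha n t^{1/2}], nt) \le n\beta \text{ for some } t \in [n^{-2}(\log n)^2, \tau]\bigr\} \\
&\quad \le P\bigl\{L(n\beta, nt) \ge \alpha_0 n t^{1/2} \text{ for some } t \in [n^{-2}(\log n)^2, \tau]\bigr\} \\
&\quad \le \sum_{i=0}^{K-1} P\bigl\{L(n\beta, n t_{i+1}) \ge \alpha_0 n t_i^{1/2}\bigr\} \\
&\quad = \sum_{i=0}^{K-1} P\bigl\{L\bigl(n(\beta t_{i+1})^{1/2}, n(\beta t_{i+1})^{1/2}\bigr) \ge \alpha_0 n t_i^{1/2}\bigr\} \\
&\quad \le \sum_{i=0}^{K-1} \exp\bigl\{-n(\beta t_{i+1})^{1/2}\, I\bigl(\alpha_0 (4\beta)^{-1/2}\bigr)\bigr\} \\
&\quad \le K \exp\{-a_1\alpha_0 \log n\} \le C n^{-2}\log n.
\end{align*}
This bound is summable over $n$. □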
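The last two steps in the proof of Lemma 4.2(a) rest only on the choices $t_{i+1} = 4t_i$, $t_0 = n^{-2}(\log n)^2$, $\alpha_0 > 6\beta^{1/2}$, $a_1\alpha_0 \ge 2$ and $I(x) \ge a_1 x$ for $x \ge 3$. A sketch of the arithmetic follows; the shorthand $b_i$ and $m_i$ is introduced here for illustration and is not notation from the source.

```latex
% Arithmetic behind the exponential bound in the proof of Lemma 4.2(a).
\begin{align*}
b_i &= n(\beta t_{i+1})^{1/2}, \qquad m_i = \alpha_0\, n\, t_i^{1/2}, \qquad
\frac{m_i}{b_i} = \alpha_0\Bigl(\frac{t_i}{\beta t_{i+1}}\Bigr)^{1/2}
               = \alpha_0 (4\beta)^{-1/2} > 3, \\
b_i\, I(m_i/b_i) &\ge b_i\, a_1\, \frac{m_i}{b_i} = a_1 \alpha_0\, n\, t_i^{1/2}
   \ge a_1 \alpha_0\, n\, t_0^{1/2} = a_1 \alpha_0 \log n \ge 2\log n, \\
\sum_{i=0}^{K-1} \exp\{-b_i\, I(m_i/b_i)\} &\le K n^{-2} \le C n^{-2} \log n .
\end{align*}
% The ratio m_i/b_i exceeds 3 >= 2, so the hypothesis m >= 2b of (57) holds.
```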
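Lemma 4.3. Let $r, \tau > 0$ be positive constants. Then there exist finite positive constants $C_0$ and $C_1$ such that, for all $n \ge 1$,
\[ \sum_{m=1}^{[nr]} P\Bigl\{\Gamma(m, nt) \le \frac{m^2}{4nt} - \frac{m^{4/3}\log n}{4nt} \text{ for some } t \in [n^{-2}, \tau]\Bigr\} \le C_0\, n^{5/3} \exp\bigl(-C_1 (\log n)^{3/2}\bigr). \]

Proof. It suffices to consider $m^{2/3} > \log n$, otherwise the probability is 0. As in the previous proof, partition the time interval $[n^{-2}, \tau]$ by $n^{-2} = t_0 < t_1 < \cdots < t_{K-1} < \tau \le t_K$, and bound the probability by Lemma 4.1(a):
\begin{align*}
& P\Bigl\{\Gamma(m, nt) \le \frac{m^2}{4nt} - \frac{m^{4/3}\log n}{4nt} \text{ for some } t \in [n^{-2}, \tau]\Bigr\} \\
&\quad \le P\bigl\{L\bigl(m^2 (4nt)^{-1}(1 - m^{-2/3}\log n),\, nt\bigr) \ge m \text{ for some } t \in [n^{-2}, \tau]\bigr\} \\
&\quad \le \sum_{i=0}^{K-1} P\bigl\{L\bigl(m^2 (4nt_i)^{-1}(1 - m^{-2/3}\log n),\, nt_{i+1}\bigr) \ge m\bigr\} \\
&\quad \le \sum_{i=0}^{K-1} \exp\bigl(-b_i\, I(m/b_i)\bigr),
\end{align*}
where we wrote
\[ b_i = \frac{m}{2}\Bigl(\frac{t_{i+1}}{t_i}\Bigr)^{1/2}\bigl(1 - m^{-2/3}\log n\bigr)^{1/2}. \]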
We argue separately for two ranges of $m$.

Case 1. $\log n < m^{2/3} \le (1+\delta)\log n$ for a small $\delta \in (0, 1/4)$. Define the partition by $t_i = 4^i t_0$. Check that then $m/b_i \ge \delta^{-1/2}$. Pick a constant $a_0$ so that $I(x) \ge a_0 x$ for $x \ge \delta^{-1/2}$. (This makes sense because $\delta^{-1/2} > 2$.) The size $K$ of the partition satisfies $K \le C\log n$ for a constant $C$ that depends on $\tau$. Summing the bound $\exp(-b_i I(m/b_i)) \le \exp(-a_0 m)$ over $m$ and $i$ gives the following upper bound:
\[ \sum_{m:\ \log n < m^{2/3} \le (1+\delta)\log n}\ \sum_{i=0}^{K-1} e^{-a_0 m} \le C_0 (\log n)^{5/2} \exp\bigl(-C_1 (\log n)^{3/2}\bigr) \]
for suitable constants $C_0$, $C_1$.

Case 2. $(1+\delta)^{3/2}(\log n)^{3/2} \le m \le nr$. Set $\theta = 1 + (1/2)m^{-2/3}\log n$, and define the partition by $t_i = t_0\theta^i$. Now $K \le Cm^{2/3}$. Note that
\begin{align*}
\frac{m}{b_i} &= 2\Bigl(\frac{t_i}{t_{i+1}}\Bigr)^{1/2}\Bigl(1 - \frac{\log n}{m^{2/3}}\Bigr)^{-1/2}
 \ge 2\Bigl(\frac{t_i}{t_{i+1}}\Bigr)^{1/2}\Bigl(1 + \frac{\log n}{2m^{2/3}}\Bigr) \\
&= 2\Bigl(1 + \frac{\log n}{2m^{2/3}}\Bigr)^{1/2} \ge 2 + \frac{\log n}{4m^{2/3}},
\end{align*}
where we used the inequalities $(1-h)^{-1/2} \ge 1 + h/2$ and $(1+h)^{1/2} \ge 1 + h/4$ that are valid for $0 \le h < 1$. Choose $a_1$ so that $I(2+x) \ge a_1 x^{3/2}$ for $0 < x < 1/4$. Check that $b_i \ge Cm$ for a constant $C = C(\delta)$. Then finally the bound becomes
\[ \sum_{i=0}^{K-1} \exp\bigl(-b_i\, I(m/b_i)\bigr) \le K \exp\Bigl\{-Cm\Bigl(\frac{\log n}{4m^{2/3}}\Bigr)^{3/2}\Bigr\} \le Cm^{2/3} \exp\bigl(-C_1 (\log n)^{3/2}\bigr). \]
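The two elementary inequalities invoked in Case 2, $(1-h)^{-1/2} \ge 1 + h/2$ and $(1+h)^{1/2} \ge 1 + h/4$ for $0 \le h < 1$, can be verified by squaring. The following is a sketch of that check, added for illustration:

```latex
% (1+h)^{1/2} >= 1 + h/4: square both sides (both are positive).
\[
\Bigl(1+\frac{h}{4}\Bigr)^2 = 1 + \frac{h}{2} + \frac{h^2}{16} \le 1 + h
\iff \frac{h^2}{16} \le \frac{h}{2}
\iff h \le 8 \quad (h > 0;\ \text{the case } h = 0 \text{ is trivial}).
\]
% (1-h)^{-1/2} >= 1 + h/2: equivalent to (1 + h/2)^2 (1 - h) <= 1 for 0 <= h < 1.
\[
\Bigl(1+\frac{h}{2}\Bigr)^2 (1-h)
= \Bigl(1 + h + \frac{h^2}{4}\Bigr)(1-h)
= 1 - \frac{3h^2}{4} - \frac{h^3}{4} \le 1
\quad\Longrightarrow\quad
1+\frac{h}{2} \le (1-h)^{-1/2}.
\]
```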
Summing this over $(1+\delta)^{3/2}(\log n)^{3/2} \le m \le nr$, and combining with Case 1 above, gives the bound in the statement of the lemma. □

Lemma 4.4. Let $\tau > 0$. There exist finite positive constants $M_0$, $n_0$, $C_0$, and $C_1$ such that, for all $n \ge n_0$ and $m \ge M_0(\log n)^{3/2}$,
\[ P\Bigl\{\Gamma(m, nt) > \frac{m^2}{4nt} + \frac{m^{4/3}\log n}{4nt} \text{ for some } t \in [n^{-1}, \tau]\Bigr\} \le C_0\, m^{2/3} \exp\bigl(-C_1 (\log n)^3\bigr). \]

We omit the proof of this lemma. It is a partition argument in the same vein as the previous proof. The range of $m$'s not covered by the last lemma is taken care of by the next statement.

Lemma 4.5. Let $\tau, \alpha, M$ be positive constants. Then there exist finite positive constants $n_0$, $C_0$ and $C_1$ such that, for all $n \ge n_0$,
\[ P\bigl\{\Gamma(m, nt) > n^{\alpha} \text{ for some } t \in [n^{-1}, \tau] \text{ and } 1 \le m \le M(\log n)^{3/2}\bigr\} \le C_0 \exp(-C_1 n^{\alpha}). \]

Proof. Since $\Gamma(m, nt)$ is nonincreasing in $t$ and nondecreasing in $m$, the probability is bounded by
\[ P\bigl\{\Gamma([M(\log n)^{3/2}], 1) > n^{\alpha}\bigr\} = P\bigl\{\Gamma([M(\log n)^{3/2}], n^{\alpha/2}) > n^{\alpha/2}\bigr\} \le C_0 \exp(-C_1 n^{\alpha}). \]
We used the equality in distribution $L(a,b) \overset{d}{=} L((ab)^{1/2}, (ab)^{1/2})$ and Lemma 4.1(c). □

5. Estimates for the Microscopic Variational Formula

For the $n$th process the appropriately scaled variational coupling equality (45) reads
\[ z^n_{[nx]}(nt) = \inf_{i:\, i \le [nx]} \bigl\{z^n_i(0) + E^{n,i}_{[nx]-i}(nt)\bigr\}. \tag{58} \]
Let $i_n(x,t)$ denote the minimal $i$ at which the infimum is attained in (58):
\[ i_n(x,t) = \inf\bigl\{i : z^n_{[nx]}(nt) = z^n_i(0) + E^{n,i}_{[nx]-i}(nt)\bigr\}. \tag{59} \]
It is proved in [22] that under assumption (10) $i_n(x,t)$ is almost surely finite, and it is a nonincreasing function of $t$. For $t = 0$ we interpret $E^{n,i}_m(0) = \infty$ for $m \ge 1$, and $i_n(x,0) = [nx]$. The technical key to benefiting from (58) lies in estimating how far the $n^{-1}$-scaled random minimizing indices of (58) lie from the set $I(x,t)$ of macroscopic minimizers. We start with a crude bound, and successively refine it.

Lemma 5.1. For $c < a < b$ and $\tau > 0$ define the events
\[ G_n = \{\text{for some } x \in [a,b] \text{ and } t \in (0,\tau],\ \text{(58) is minimized by some } i \le cn\}. \tag{60} \]
For fixed $a < b$ and $\tau$ we can choose $c < 0$ such that this holds:
(i) Under the uniformity assumption (10), $\lim_{n\to\infty} P(G_n) = 0$.
(ii) Under assumptions (26) and (27), $\sum_{n=1}^{\infty} P(G_n) < \infty$.

Proof. Since $i_n(x,t)$ is nonincreasing in $t$, we can express (60) as
\[ G_n = \{\text{for some } x \in [a,b] \text{ and } t = \tau,\ \text{(58) is minimized by some } i \le cn\}. \]
Let $C_1 > 0$, and pick $C_0 > 0$ so that $I(x) \ge C_1 x$ for $x \ge C_0$ [recall Lemma 4.1(a)]. Pick $\varepsilon > 0$ small enough so that $C_0^2 \tau\varepsilon < 1$. And then pick $c < 0$ so that $c < a\bigl(1 - C_0\sqrt{\tau\varepsilon}\bigr)^{-1}$ and $c < q$ for the $q$ that satisfies assumption (10) for $\varepsilon$. Abbreviate $Y_{n,i} = z^n_{[nb]}(0) - z^n_i(0)$. Under these conditions $i \le nc$ implies $\varepsilon i^2 \le ([na] - i)^2/(C_0^2\tau)$, and consequently on the event $A_n = \{i^{-2} n Y_{n,i} \le \varepsilon \text{ for all } i \le nc\}$ we have
\[ ([na] - i) \cdot (n\tau Y_{n,i})^{-1/2} \ge C_0 \quad \text{for all } i \le nc. \tag{61} \]
Now we can estimate:
\begin{align*}
P(G_n) &\le P\bigl\{\text{for some } i \le cn,\ L\bigl((z^n_i(0), 0), (z^n_{[nb]}(0), n\tau)\bigr) \ge [na] - i\bigr\} \\
&\le P(A_n^c) + E\Bigl[\mathbf{1}_{A_n} \cdot \sum_{i \le nc} \exp\bigl\{-(n\tau Y_{n,i})^{1/2}\, I\bigl(([na]-i)(n\tau Y_{n,i})^{-1/2}\bigr)\bigr\}\Bigr] \\
&\le P(A_n^c) + \sum_{i \le nc} \exp[-C_1([na] - i)].
\end{align*}
The first inequality above comes from $z^n_i(0) + E^{n,i}_{[nx]-i}(n\tau) \le z^n_{[nx]}(0)$, which must follow if $i$ is to be a minimizer in (58), and also from $[na] \le [nx] \le [nb]$. The second inequality comes from (57), and the last from (61) and $I(x) \ge C_1 x$. Assumption (10) implies that $P(A_n^c) \to 0$, so part (i) of the lemma is proved. To get the summability $\sum P(G_n) < \infty$ required in part (ii), apply standard large deviation estimates of exponential random variables. □
Lemma 5.2. Let $a < b$, $\alpha > 0$, $\tau > 0$, $\gamma \in (1/2, 3/4)$. Suppose $\delta > 0$ satisfies $\gamma(1+\delta) < 1$ and $\delta < (3 - 4\gamma)/(4\gamma - 2)$. Define the events
\[ H_{n,0} = \{\text{for some } x \in [a,b] \text{ and } t \in [n^{-(1+\delta)}, \tau],\ \text{(58) is minimized by some } i < [nx] - \alpha n t^{\gamma}\} \]
and
\[ H_{n,1} = \{\text{for some } x \in [a,b] \text{ and } t \in (0, n^{-(1+\delta)}],\ \text{(58) is minimized by some } i < [nx] - \alpha n^{(1-\delta)/2}\}. \]
If $\alpha$ is chosen large enough, the following is true:
(i) Under assumptions (9) and (10), $\lim_{n\to\infty} P(H_{n,0} \cup H_{n,1}) = 0$.
(ii) Under assumptions (26) and (27), $\sum_{n=1}^{\infty} P(H_{n,0} \cup H_{n,1}) < \infty$.

Proof. We prove statements (i) and (ii) for $H_{n,0}$, and leave to the reader the similar arguments needed for $H_{n,1}$. The challenge here is in the small values of $t$ that vanish as $n \to \infty$. The proof will be achieved in two rounds. First we rule out minimizers $i \le nx - \alpha n t^{1/2}$, and then in the second step we rule out $i \le nx - \alpha n t^{\gamma}$.

By conditioning on the event $G_n^c$ of Lemma 5.1, it suffices to consider $i \ge nc$. Suppose some $i \in [nc, nx - \alpha n t^{1/2}]$ minimizes (58) for some $x \in [a,b]$ and $t \in [n^{-(1+\delta)}, \tau]$. Since $[nx]$ is among the indices over which the infimum is taken in (58), it must follow that
\[ z^n_i(0) + E^{n,i}_{[nx]-i}(nt) \le z^n_{[nx]}(0) \le z^n_{[nb]}(0). \]
Bound the left-hand side from below:
\[ z^n_i(0) + E^{n,i}_{[nx]-i}(nt) \ge z^n_{[nc]}(0) + E^{n,[nc]}_{[nx]-i}(nt) \ge z^n_{[nc]}(0) + E^{n,[nc]}_{[\alpha n t^{1/2}]}(nt). \]
The consequence is that for some $t \in [n^{-(1+\delta)}, \tau]$,
\[ E^{n,[nc]}_{[\alpha n t^{1/2}]}(nt) \le z^n_{[nb]}(0) - z^n_{[nc]}(0). \]
Let $\beta = u_0(b) - u_0(c) + 1$, and $G_n$ be the event in Lemma 5.1. Then the previous reasoning gives
\begin{align*}
& P\bigl\{\text{for some } x \in [a,b] \text{ and } t \in [n^{-(1+\delta)}, \tau],\ \text{(58) is minimized by some } i \le nx - \alpha n t^{1/2}\bigr\} \\
&\quad \le P(G_n) + P\bigl\{\Gamma([\alpha n t^{1/2}], nt) \le n\beta \text{ for some } t \in [n^{-(1+\delta)}, \tau]\bigr\} + P\bigl\{z^n_{[nb]}(0) - z^n_{[nc]}(0) > n\beta\bigr\}.
\end{align*}
Note that $\gamma(1+\delta) < 1$ forces $\delta < 1$, and then $n^{-(1+\delta)} > n^{-2}(\log n)^2$ for large $n$. Thus Lemma 4.2(a) applies, and we can conclude that the second probability after the inequality above is summable over $n \ge 1$ if $\alpha$ is chosen large enough. The last probability converges to zero in Case (i), and is summable over $n$ in Case (ii) of the lemma.
Now condition on the event that all minimizers satisfy $i \ge nx - \alpha n t^{1/2}$, for $(x,t)$ in the range under consideration. Under this condition,
\begin{align*}
H_{n,0} &\Longrightarrow \text{for some } x \in [a,b],\ t \in [n^{-(1+\delta)}, \tau],\ i \in [nx - \alpha n t^{1/2},\, nx - \alpha n t^{\gamma}]: \quad z^n_i(0) + E^{n,i}_{[nx]-i}(nt) \le z^n_{[nx]}(0) \\
&\Longrightarrow \text{for some } x \in [a,b],\ t \in [n^{-(1+\delta)}, \tau]: \quad E^{n,[nx-\alpha n t^{1/2}]}_{[\alpha n t^{\gamma}]}(nt) \le z^n_{[nx]}(0) - z^n_{[nx-\alpha n t^{1/2}]}(0).
\end{align*}
By the definition of $\zeta_n$, we can write
\begin{align*}
z^n_{[nx]}(0) - z^n_{[nx-\alpha n t^{1/2}]}(0) &= n\bigl\{u_0(x) - u_0(x - \alpha t^{1/2})\bigr\} + n^{1/2}\bigl\{\zeta_n(x,0) - \zeta_n(x - \alpha t^{1/2}, 0)\bigr\} \\
&\le C\alpha n t^{1/2} + 2n^{1/2} \cdot \sup_{x\in[c,b]} |\zeta_n(x,0)|,
\end{align*}
where $[c,b]$ is an interval that contains $[x - \alpha t^{1/2}, x]$ for all $(x,t)$ under consideration, and $C$ is the Lipschitz constant for $u_0$ on the interval $[c,b]$. By assumption $\delta < (3 - 4\gamma)/(4\gamma - 2)$, so we can pick a small $\eta > 0$ so that $\delta < (3 - 4\gamma - 2\eta)/(4\gamma - 2)$. Now summarize everything in this upper bound:
\begin{align*}
P(H_{n,0}) &\le P(G_n) + P\bigl\{\Gamma([\alpha n t^{1/2}], nt) \le n\beta \text{ for some } t \in [n^{-(1+\delta)}, \tau]\bigr\} \\
&\quad + P\bigl\{z^n_{[nb]}(0) - z^n_{[nc]}(0) > n\beta\bigr\} \\
&\quad + P\bigl\{E^{n,[nx-\alpha n t^{1/2}]}_{[\alpha n t^{\gamma}]}(nt) \le C\alpha n t^{1/2} + 2n^{1/2+\eta} \text{ for some } x \in [a,b],\ t \in [n^{-(1+\delta)}, \tau]\bigr\} \\
&\quad + P\Bigl\{\sup_{x\in[c,b]} |\zeta_n(x,0)| > n^{\eta}\Bigr\} \\
&\equiv P(G_n) + p_{n,1} + p_{n,2} + p_{n,3} + p_{n,4}.
\end{align*}
By Lemma 4.2(a)–(c), $\sum_n p_{n,j} < \infty$ for $j = 1, 3$. Note that for $p_{n,3}$ we have to sum over the superscript $[nx - \alpha n t^{1/2}]$ as $x$ varies over $[a,b]$. This gives $O(n)$ terms, which is why the probabilities in Lemma 4.2(b)–(c) are multiplied by $n$. Under assumption (9), $\lim_{n\to\infty} p_{n,k} = 0$ for $k = 2, 4$. Under assumption (27) and local boundedness of $\rho_0$, $\sum_n p_{n,k} < \infty$ for $k = 2, 4$. This proves the lemma for the event $H_{n,0}$. □

As usual, the distance between a point $x$ and a set $A$ is denoted by $\operatorname{dist}(x, A) = \inf\{|x - y| : y \in A\}$.

Lemma 5.3. Let $A \subseteq \mathbb{R} \times [0,\infty)$ be a compact set. Assume that $A$ satisfies either assumption (a) or (b): (a) $A$ is a finite set; or (b) there are no shocks in $A$, in other words $y^-(x,t) = y^+(x,t)$ for all $(x,t) \in A$. For $\delta > 0$, define the events $H_n = H_n(\delta)$ by
\[ H_n = \bigl\{\text{for some } (x,t) \in A,\ \text{(58) is minimized by some } i \text{ such that } \operatorname{dist}\bigl(n^{-1}i, I(x,t)\bigr) > \delta\bigr\}. \]
(i) Under assumptions (9) and (10), $\lim_{n\to\infty} P(H_n) = 0$.
(ii) Under assumptions (26) and (27), $\sum_{n=1}^{\infty} P(H_n) < \infty$.

Proof. Fix finite $a < b$ and $\tau > 0$ so that $A \subseteq [a,b] \times [0,\tau]$. For small enough $\sigma > 0$, $I(x,t) \subseteq [x - \delta/2, x]$ for all $x \in [a,b]$ and $t \in (0,\sigma]$ by Lemma 3.3, so Lemma 5.2 gives the conclusion for $0 < t \le \sigma$. Thus for the proof we can assume that $A \subseteq [a,b] \times [\sigma,\tau]$, where $0 < \sigma < \tau$. The important point here is bounding $t$ away from 0 because the estimation gets harder if $t \to 0$ as $n \to \infty$.

Choose $c < 0$, $c < y^-(a,\tau) \le a$ so that Lemma 5.1 is satisfied. By that lemma we only need to consider minimizers in the range $[nc, nb]$. Let $I(x,t)^{(\delta)} = \{q : |q - y| < \delta \text{ for some } y \in I(x,t)\}$ be the $\delta$-neighborhood of $I(x,t)$. Set
\[ \varepsilon = \frac{1}{5} \cdot \inf\bigl\{u_0(y) + t\,g((x-y)/t) - u(x,t) : (x,t) \in A,\ y \in [c,x] \setminus I(x,t)^{(\delta)}\bigr\}. \]
It turns out that $\varepsilon$ is a positive quantity if $A$ satisfies one of the two assumptions (a) or (b) in the statement of the lemma. For an arbitrary compact set with shocks $\varepsilon$ can be zero. For each $x \in [a,b]$, $t \in [\sigma,\tau]$, choose finitely many points $a_k = a_k(x,t)$ and $b_k = b_k(x,t)$, $1 \le k \le K = K(x,t)$, so that
\[ [c,x] \setminus I(x,t)^{(\delta)} = \bigcup_{k=1}^{K} [a_k, b_k] \]
and
\[ \Bigl|\, t\,g\Bigl(\frac{x - b_k}{t}\Bigr) - t\,g\Bigl(\frac{x - a_k}{t}\Bigr) \Bigr| \le \varepsilon \]
for each $k = 1, \dots, K$. For each $(x,t) \in A$, choose a point $y_{x,t} \in I(x,t)$. Reason as follows:

for some $(x,t) \in A$, (58) is minimized by some $i$ such that $n^{-1}i \in [c,x] \setminus I(x,t)^{(\delta)}$

$\Longrightarrow$ for some $(x,t) \in A$, (58) is minimized by some $i$ such that $n^{-1}i \in [a_k, b_k]$ for some $1 \le k \le K$

$\Longrightarrow$ for some $(x,t) \in A$ and $1 \le k \le K$,
\[ z^n_{[na_k]}(0) + E^{n,[na_k]}_{[nx]-[nb_k]}(nt) \le z^n_{[ny_{x,t}]}(0) + E^{n,[ny_{x,t}]}_{[nx]-[ny_{x,t}]}(nt) \]

$\Longrightarrow$ for some $(x,t) \in A$ and $1 \le k \le K$,
either $z^n_{[na_k]}(0) < nu_0(a_k) - n\varepsilon$,
or $E^{n,[na_k]}_{[nx]-[nb_k]}(nt) < nt\,g((x - b_k)/t) - n\varepsilon$,
or $z^n_{[ny_{x,t}]}(0) > nu_0(y_{x,t}) + n\varepsilon$,
or $E^{n,[ny_{x,t}]}_{[nx]-[ny_{x,t}]}(nt) > nt\,g((x - y_{x,t})/t) + n\varepsilon$
$\Longrightarrow$ for some $y \in [c,b]$, $|z^n_{[ny]}(0) - nu_0(y)| > n\varepsilon$, or for some $x \in [a,b]$, $t \in [\sigma,\tau]$, $y \in [c,x]$,
\[ \bigl|E^{n,[ny]}_{[nx]-[ny]}(nt) - nt\,g((x-y)/t)\bigr| \ge n\varepsilon. \]
The next to last implication above followed from the choice of $\varepsilon$, because
\[ u_0(a_k) + t\,g\Bigl(\frac{x - b_k}{t}\Bigr) \ge u_0(a_k) + t\,g\Bigl(\frac{x - a_k}{t}\Bigr) - \varepsilon \ge u_0(y_{x,t}) + t\,g\Bigl(\frac{x - y_{x,t}}{t}\Bigr) + 4\varepsilon. \]
The entire argument can be summarized in this bound:
\begin{align*}
P(H_n) &\le P(G_n) + P(H_{n,0}) + P(H_{n,1}) + P\bigl\{|z^n_{[ny]}(0) - nu_0(y)| > n\varepsilon \text{ for some } y \in [c,b]\bigr\} \\
&\quad + P\bigl\{\bigl|E^{n,[ny]}_{[nx]-[ny]}(nt) - nt\,g((x-y)/t)\bigr| \ge n\varepsilon \text{ for some } x \in [a,b],\ y \in [c,x],\ t \in [\sigma,\tau]\bigr\}.
\end{align*}
Apply the assumptions and previous lemmas to treat the terms on the right-hand side above. The probabilities of $E^{n,[ny]}_{[nx]-[ny]}(nt)$ are handled by Lemmas 4.3 and 4.4. Assumption (9) of weak convergence of $n^{-1/2}\{z^n_{[ny]}(0) - nu_0(y)\}$ to a $y$-continuous process in the topology of uniform convergence on compact sets of $y$'s guarantees that
\[ \lim_{n\to\infty} P\bigl\{|z^n_{[ny]}(0) - nu_0(y)| > n\varepsilon \text{ for some } y \in [c,b]\bigr\} = 0. \]
Under assumptions (26) and (27) use elementary large deviation estimates after a partitioning: if $c = b_0 < b_1 < \cdots < b_k = b$ is a fine enough partition, monotonicity of both $z^n_{[ny]}(0)$ and $nu_0(y)$, and the Lipschitz continuity of $u_0(y)$, give
\[ P\bigl\{|z^n_{[ny]}(0) - nu_0(y)| > n\varepsilon \text{ for some } y \in [c,b]\bigr\} \le \sum_{j=0}^{k} P\bigl\{|z^n_{[nb_j]}(0) - nu_0(b_j)| > n\varepsilon/2\bigr\}. \]
These probabilities are summable over $n$, by large deviation bounds for exponential random variables. □

6. Proof of Theorem 2.1

6.1. Proof of Theorem 2.1(i). Recall the definition (59) of $i_n(x,t)$. Choose a random $y_n(x,t) \in I(x,t)$ such that
\[ |n^{-1}i_n(x,t) - y_n(x,t)| = \operatorname{dist}\bigl(n^{-1}i_n(x,t), I(x,t)\bigr). \]
We prove the limit (15) in Theorem 2.1 with separate arguments for different ranges of $t$. Let $[a,b] \times [0,\tau]$ be a compact rectangle that contains $A$.
6.1.1. Lower Bound, Case 1. Consider $t \in (0, n^{-(1+\delta)}]$ for a small $\delta > 0$. By Lemma 5.2(i) we may condition on the event $H_{n,1}^c$, and thereby assume that $i_n(x,t) \ge nx - \alpha n^{(1-\delta)/2}$ for all $(x,t) \in [a,b] \times (0, n^{-(1+\delta)}]$. Since the $E$-term is always nonnegative,
\[ z^n_{[nx]}(nt) \ge z^n_{i_n(x,t)}(0) \ge z^n_{[nx - \alpha n^{(1-\delta)/2}]}(0). \]
Furthermore, by Lemma 3.3 and by the local Lipschitz property of $u_0$, there exists a constant $C$ such that $u_0(y) \ge u_0(x) - Ct \ge u_0(x) - Cn^{-(1+\delta)}$ for all $x \in [a,b]$, $t \in (0, n^{-(1+\delta)}]$, and $y \in I(x,t)$. Monotonicity in time and space give $u(x,t) \le u_0(x)$, and $z^n_{[ny]}(0) \le z^n_{[nx]}(0)$ whenever $y \in I(x,t)$. We get the following lower bound, valid on the event $H_{n,1}^c$ for $y \in I(x,t)$:
\[ z^n_{[nx]}(nt) - nu(x,t) - \{z^n_{[ny]}(0) - nu_0(y)\} \ge -\{z^n_{[nx]}(0) - z^n_{[nx - \alpha n^{(1-\delta)/2}]}(0)\} - Cn^{-\delta}. \]
Add and subtract the term $nu_0(x) - nu_0(x - \alpha n^{-(1+\delta)/2})$, which is of order $O(n^{1/2-\delta/2})$ uniformly over $x \in [a,b]$ by the local Lipschitz property of $u_0$. Multiply through by $n^{-1/2}$ and uniformize over $(x,t)$:
\begin{align*}
& \inf\{\zeta_n(x,t) - \zeta_n(y,0) : (x,t) \in [a,b] \times (0, n^{-(1+\delta)}],\ y \in I(x,t)\} \\
&\quad \ge -\sup\{\zeta_n(x,0) - \zeta_n(y,0) : x \in [a,b],\ |x-y| \le 2\alpha n^{-(1+\delta)/2}\} - Cn^{-\delta/2}.
\end{align*}
This bound is valid on the event $H_{n,1}^c$, hence by Lemma 5.2 with probability $1 - \varepsilon$ if $n$ is large enough. The lower bound converges to 0 in probability by assumption (9). The constant $\alpha$ was replaced by $2\alpha$ to account for the effects of integer parts.

6.1.2. Lower Bound, Case 2. Now $t \in [n^{-(1+\delta)}, \tau]$.
\begin{align*}
z^n_{[nx]}(nt) - nu(x,t) &= z^n_{i_n(x,t)}(0) + E^{n,i_n(x,t)}_{[nx]-i_n(x,t)}(nt) - nu(x,t) \\
&= \bigl\{z^n_{[ny_n(x,t)]}(0) - nu_0(y_n(x,t))\bigr\} + \bigl\{E^{n,i_n(x,t)}_{[nx]-i_n(x,t)}(nt) - (nx - i_n(x,t))^2/4tn\bigr\} \\
&\quad + \bigl\{z^n_{i_n(x,t)}(0) - z^n_{[ny_n(x,t)]}(0) - nu_0(n^{-1}i_n(x,t)) + nu_0(y_n(x,t))\bigr\} \\
&\quad + n\bigl\{F(x, n^{-1}i_n(x,t)) - F(x, y_n(x,t))\bigr\}.
\end{align*}
Above we used the notation $F(x,y) = u_0(y) + t\,g((x-y)/t)$ for the function minimized in the Hopf–Lax formula (2). Since $y_n(x,t) \in I(x,t)$ minimizes $F(x,\cdot)$, the term $F(x, n^{-1}i_n(x,t)) - F(x, y_n(x,t))$ is nonnegative and can be discarded. Recalling the definition (4) of $\zeta_n$, we get
\begin{align*}
\zeta_n(x,t) &= n^{-1/2}\bigl\{z^n_{[nx]}(nt) - nu(x,t)\bigr\} \\
&\ge \inf_{y\in I(x,t)} \zeta_n(y,0) + n^{-1/2}\bigl\{E^{n,i_n(x,t)}_{[nx]-i_n(x,t)}(nt) - (nx - i_n(x,t))^2/4tn\bigr\} \tag{62} \\
&\quad + n^{-1/2}\bigl\{z^n_{i_n(x,t)}(0) - nu_0(n^{-1}i_n(x,t))\bigr\} - n^{-1/2}\bigl\{z^n_{[ny_n(x,t)]}(0) - nu_0(y_n(x,t))\bigr\}.
\end{align*}
Recall the definitions of the events $H_{n,0}$ and $H_n(\delta)$ in Lemmas 5.2 and 5.3. By Lemma 5.3, $\lim_{n\to\infty} P(H_n(\delta)) = 0$ for any fixed $\delta > 0$. Then it is possible to find a sequence $\delta_n \searrow 0$ such that $\lim_{n\to\infty} P(H_n(\delta_n)) = 0$. Now condition on the event $H_{n,0}^c \cap H_n(\delta_n)^c$, the complement of these events. Then for all $(x,t) \in A$ such that $t \in [n^{-(1+\delta)}, \tau]$,
\[ [nx] - i_n(x,t) \le \alpha n t^{\gamma} \quad \text{and} \quad |n^{-1}i_n(x,t) - y_n(x,t)| \le \delta_n. \tag{63} \]
Consequently we get, on the event $H_{n,0}^c \cap H_n(\delta_n)^c$, for all $(x,t) \in A$ such that $t \in [n^{-(1+\delta)}, \tau]$,
\[ \zeta_n(x,t) - \inf_{y\in I(x,t)} \zeta_n(y,0) \ge R_{n,1} + R_{n,2}, \]
where we abbreviated
\[ R_{n,1} = \inf\bigl\{n^{-1/2}\bigl(E^{n,i}_m(nt) - m^2/(4tn)\bigr) : [nc] \le i \le [nb],\ 0 \le m \le \alpha n t^{\gamma},\ t \in [n^{-(1+\delta)}, \tau]\bigr\} \]
and
\[ R_{n,2} = \inf\{\zeta_n(r,0) - \zeta_n(s,0) : |r - s| \le \delta_n \text{ and } r, s \in [c,d]\}. \]
In the definition of $R_{n,1}$ and $R_{n,2}$ we picked $c < d$ depending on $\alpha$, $\gamma$, and $\delta_n$ in (63) to ensure that $y_n(x,t)$ and $n^{-1}i_n(x,t) \in [c,d]$ for all $(x,t) \in A$ and for all $n$. An application of Lemma 4.3 gives the estimate
\[ \sum_{n=1}^{\infty} P\{R_{n,1} \le -\varepsilon\} < \infty. \]
By the assumption of weak convergence $\zeta_n(\cdot,0) \to \zeta_0$ on $D_u(\mathbb{R})$ and the continuity of $\zeta_0$, $\lim_{n\to\infty} |R_{n,2}| = 0$ in probability. We now have for Case 2
\begin{align*}
& \lim_{n\to\infty} P\Bigl\{\inf_{(x,t)\in A,\ t\in[n^{-(1+\delta)},\tau]} \Bigl(\zeta_n(x,t) - \inf_{y\in I(x,t)} \zeta_n(y,0)\Bigr) \le -\varepsilon\Bigr\} \\
&\quad \le \lim_{n\to\infty} \bigl[P(H_{n,1}) + P(H_n(\delta_n)) + P\{R_{n,1} \le -\varepsilon/2\} + P\{R_{n,2} \le -\varepsilon/2\}\bigr] = 0
\end{align*}
for any $\varepsilon > 0$. Cases 1 and 2 together give
\[ \lim_{n\to\infty} P\Bigl\{\inf_{(x,t)\in A} \Bigl(\zeta_n(x,t) - \inf_{y\in I(x,t)} \zeta_n(y,0)\Bigr) \le -\varepsilon\Bigr\} = 0. \]
This completes the proof of the lower bound.
6.1.3. Upper Bound, Case 1. Consider $0 < t \le n^{-1}$. Let $C$ be a finite constant such that $u(x,t) \ge u_0(x) - Ct$ for all $a \le x \le b$, $0 \le t \le \tau$. Since $z^n_{[nx]}(nt) \le z^n_{[nx]}(0)$, we can write
\[ z^n_{[nx]}(nt) - nu(x,t) - \{z^n_{[ny]}(0) - nu_0(y)\} \le z^n_{[nx]}(0) - nu_0(x) - \{z^n_{[ny]}(0) - nu_0(y)\} + Ctn. \]
Estimate this uniformly over $a \le x \le b$, $0 < t \le n^{-1}$. By Lemma 3.3, there is a constant $\gamma$ such that $y \in I(x,t)$ implies $|x - y| \le \gamma t \le \gamma n^{-1}$, for all $(x,t)$ in this range. Dividing by $\sqrt{n}$ above gives, for $n$ large enough to have $x - \gamma n^{-1} \ge a - 1$,
\begin{align*}
& \sup\{\zeta_n(x,t) - \zeta_n(y,0) : a \le x \le b,\ 0 < t \le n^{-1},\ y \in I(x,t)\} \\
&\quad \le \sup\{\zeta_n(x,0) - \zeta_n(y,0) : x, y \in [a-1, b],\ |x - y| \le \gamma n^{-1}\} + Cn^{-1/2}.
\end{align*}
The last quantity converges to 0 in probability by assumption (9).

6.1.4. Upper Bound, Case 2. Lastly consider $n^{-1} \le t \le \tau$. Use the fact that $u(x,t) = u_0(y) + (x-y)^2/(4t)$ for any $y \in I(x,t)$,
\begin{align*}
z^n_{[nx]}(nt) - nu(x,t) &= \inf_{y \le x} \bigl\{z^n_{[ny]}(0) + E^{n,[ny]}_{[nx]-[ny]}(nt) - nu(x,t)\bigr\} \\
&\le \inf_{y\in I(x,t)} \bigl\{z^n_{[ny]}(0) - nu_0(y)\bigr\} + R_{n,3} + C,
\end{align*}
where
\[ R_{n,3} = \sup_{x\in[a,b],\ t\in[n^{-1},\tau]}\ \sup_{y\in I(x,t)} \Bigl\{E^{n,[ny]}_{[nx]-[ny]}(nt) - \frac{([nx] - [ny])^2}{4nt}\Bigr\}, \]
and the constant $C$ accounts for replacing $(x-y)^2/(4t)$ with $([nx] - [ny])^2/(4nt)$. Consequently
\[ \sup_{(x,t)\in A,\ n^{-1} \le t \le \tau} \Bigl\{\zeta_n(x,t) - \inf_{y\in I(x,t)} \zeta_n(y,0)\Bigr\} \le n^{-1/2}R_{n,3} + Cn^{-1/2}. \]
By Lemmas 4.4 and 4.5,
\[ \sum_{n=1}^{\infty} P\{R_{n,3} > n^{\alpha}\} < \infty \]
for any $\alpha > 1/3$. This gives the required upper bound. Combining Cases 1 and 2, we have bounded
\[ \sup_{(x,t)\in A} \Bigl\{\zeta_n(x,t) - \inf_{y\in I(x,t)} \zeta_n(y,0)\Bigr\} \]
above by a random variable that vanishes in probability as $n \to \infty$. Together with the lower bound, this completes the proof of part (i) of Theorem 2.1.
6.2. Proof of Theorem 2.1(ii). Fix $-\infty < a < b < \infty$ and $\tau < \infty$. The goal is to prove the limit in probability (16). Let $\varepsilon > 0$. Let
\[ Y_n = \sup_{(x,t)\in[a,b]\times[0,\tau]} \Bigl|\zeta_n(x,t) - \inf_{y\in I(x,t)} \zeta_n(y,0)\Bigr|^p. \]
We claim there exists a constant $C \in (0,\infty)$ such that $P(Y_n > C) \le \varepsilon$ for all $n$. This can be proved by repeating the arguments of Sect. 6.1 for the compact set $A_0 = [a,b] \times [0,\tau]$. Since $A_0$ can contain shocks, the argument does not give $Y_n \to 0$, but it still gives a bound in probability. Define the event $D_n = \{Y_n \le C\}$.

By Proposition 3.1 we can find an open set $G \subseteq \mathbb{R} \times (0,\infty)$ such that $G$ contains all the shocks in $[a,b] \times [0,\tau]$, and its $t$-section has 1-dimensional Lebesgue measure $|G_t| < \varepsilon/(3C)$ for all $t$. (Recall that by definition there are no shocks on the $t = 0$ line.) Let $A = [a,b] \times [0,\tau] \setminus G$. $A$ is a compact set with no shocks, so by Theorem 2.1(i)
\[ X_n \equiv \sup_{(x,t)\in A} \Bigl|\zeta_n(x,t) - \inf_{y\in I(x,t)} \zeta_n(y,0)\Bigr|^p \to 0 \quad \text{in probability.} \tag{64} \]
Let $A_t = \{x : (x,t) \in A\}$ be the $t$-section of $A$. On the event $D_n$ we can now bound
\begin{align*}
\sup_{0\le t\le\tau} \int_a^b \Bigl|\zeta_n(x,t) - \inf_{y\in I(x,t)} \zeta_n(y,0)\Bigr|^p dx
&\le \sup_{0\le t\le\tau} \Bigl\{\int_{A_t} X_n\,dx + \int_{[a,b]\setminus A_t} Y_n\,dx\Bigr\} \\
&\le (b-a)X_n + Y_n\,\varepsilon/(3C) \le (b-a)X_n + \varepsilon/3.
\end{align*}
Thus by (64)
\[ \limsup_{n\to\infty} P\Bigl\{\sup_{0\le t\le\tau} \int_a^b \Bigl|\zeta_n(x,t) - \inf_{y\in I(x,t)} \zeta_n(y,0)\Bigr|^p dx \ge \varepsilon\Bigr\} \le \limsup_{n\to\infty} P(D_n^c) \le \varepsilon. \]
This proves (16).

7. Proof of the Weak Limit and the Linearized Equation

7.1. Proof of Theorem 2.2. For part (i), take $A = \{(x_1,t_1), \dots, (x_k,t_k)\}$ in (15). The mapping
\[ h \mapsto \Bigl(\inf_{y\in I(x_1,t_1)} h(y), \dots, \inf_{y\in I(x_k,t_k)} h(y)\Bigr) \]
from $D_u(\mathbb{R})$ into $\mathbb{R}^k$ is continuous. Then use the assumption (9) of weak convergence at time zero, and the continuous mapping theorem [5, p. 30].

Part (ii) goes by the same general principle. Let us abbreviate
\[ \sigma_n(x,t) = \inf_{y\in I(x,t)} \zeta_n(y,0). \tag{65} \]
We need to check that $\zeta_n$ defines a random element of $D\bigl([0,\infty), L^p_{\mathrm{loc}}(\mathbb{R})\bigr)$ and that $\sigma_n$ and $\zeta$ define random elements of $C\bigl([0,\infty), L^p_{\mathrm{loc}}(\mathbb{R})\bigr)$. For $f \in D_u(\mathbb{R})$ let $Gf(x,t) = \inf_{y\in I(x,t)} f(y)$.
Lemma 7.1. $G$ is a continuous map from $D_u(\mathbb{R})$ into $C\bigl([0,\infty), L^p_{\mathrm{loc}}(\mathbb{R})\bigr)$, when we interpret $Gf$ as the path $t \mapsto Gf(\cdot,t) \in L^p_{\mathrm{loc}}(\mathbb{R})$.

Proof. For $0 \le t \le T$ and $a \le x \le b$, $I(x,t) \subseteq [y^-(a,T), b]$, and so $Gf$ is locally bounded as a function of $(x,t)$. Consequently for a fixed $t$, $Gf(\cdot,t)$ is in $L^p_{\mathrm{loc}}(\mathbb{R})$. Secondly, we need to argue that as $s \to t$, $Gf(\cdot,s) \to Gf(\cdot,t)$ in $L^p_{\mathrm{loc}}(\mathbb{R})$. By the local boundedness and dominated convergence, we need only show
\[ Gf(x,s) \to Gf(x,t) \quad \text{for a.e. } x. \tag{66} \]
Recall from Sect. 3 that if $y_1 \in I(x,s_1)$ and $y_2 \in I(x,s_2)$ for $s_1 < s_2$, then $y_2 \le y_1$. Consider first $s \nearrow t$. Fix $x$ so that $(x,t)$ is not a shock. Then for any choice $y_s \in I(x,s)$, $y_s \searrow y(x,t)$ = the unique Hopf–Lax minimizer for $(x,t)$. By right-continuity $f(y_s) \to f(y(x,t)) = Gf(x,t)$, and since we can let $f(y_s)$ be arbitrarily close to $Gf(x,s)$, we have (66).

Now suppose $s \searrow t$. $f$ has at most countably many discontinuities, so we still have a.e. $x$ if we exclude all $x$ such that $(x,t)$ is a shock, and all points $x = w^\pm(\bar{y},t)$ for discontinuities $\bar{y}$ of $f$. Suppose $x$ is not one of the excluded points. Then $(x,t)$ has a unique minimizer $y(x,t)$, and the previous paragraph shows again $f(y_s) \to f(y(x,t))$ if $f$ is continuous at $y(x,t)$. But suppose $\bar{y} = y(x,t)$ is a discontinuity for $f$. Then it must be that $w^-(\bar{y},t) < x < w^+(\bar{y},t)$. [Justification: (51) forces $w^-(\bar{y},t) \le x \le w^+(\bar{y},t)$, but $x \in \{w^\pm(\bar{y},t)\}$ cannot happen because $x$ is not among the excluded points.] Since forward characteristics are continuous, it follows that for $s > t$ but close enough to $t$, $w^-(\bar{y},s) < x < w^+(\bar{y},s)$. This implies $y^\pm(x,s) = \bar{y}$, which in turn says $Gf(x,s) = Gf(x,t)$. Again (66) checks.

We have shown that the function $t \mapsto Gf(\cdot,t) \in L^p_{\mathrm{loc}}(\mathbb{R})$ is continuous. Continuity of the map $G : D_u(\mathbb{R}) \to C\bigl([0,\infty), L^p_{\mathrm{loc}}(\mathbb{R})\bigr)$ is a consequence of the locally uniform topology on $D_u(\mathbb{R})$. □

By definitions (17) and (65), the processes $\zeta$ and $\sigma_n$ are obtained by applying the mapping $G$ to the $D_u(\mathbb{R})$-valued random functions $\zeta_0$ and $\zeta_n(\cdot,0)$. This checks that $\sigma_n$ and $\zeta$ define random elements of $C\bigl([0,\infty), L^p_{\mathrm{loc}}(\mathbb{R})\bigr)$. Also, by assumption (9) and the continuous mapping theorem, $\sigma_n \overset{d}{\to} \zeta$ in the space $C\bigl([0,\infty), L^p_{\mathrm{loc}}(\mathbb{R})\bigr)$.

Now consider $\zeta_n$ defined by (4). It is locally bounded in $(x,t)$ so local $L^p$-integrability is not a problem. By dominated convergence, for a fixed sample point $\omega$ the map $t \mapsto \zeta_n(\cdot,t;\omega)$ from $[0,\infty)$ into $L^p_{\mathrm{loc}}(\mathbb{R})$ is right-continuous.

The weak convergence $\zeta_n \overset{d}{\to} \zeta$ now follows readily. For any finite time-horizon $T$, $\sup_{0\le t\le T} d_p(\zeta_n(t), \sigma_n(t)) \to 0$ in probability by (16), where $d_p$ is the metric on $L^p_{\mathrm{loc}}(\mathbb{R})$. Uniform in time is stronger than the Skorokhod topology. So it follows that, if we let $d_D$ denote the Skorokhod metric on $D\bigl([0,\infty), L^p_{\mathrm{loc}}(\mathbb{R})\bigr)$, $d_D(\zeta_n, \sigma_n) \to 0$ in probability also. This together with $\sigma_n \overset{d}{\to} \zeta$ implies $\zeta_n \overset{d}{\to} \zeta$. We have proved Theorem 2.2.

7.2. Proof of Theorem 2.3.

Lemma 7.2. Let $F$, $G$ be right-continuous functions, $F$ locally BV and $G$ nondecreasing. Let $H^-$ be the left-continuous inverse of $G$ defined by
\[ H^-(y) = \sup\{x : G(x) < y\} = \inf\{x : G(x) \ge y\}. \]
Then for all continuous functions $\varphi$ for which the integrals exist,
\[ \int \varphi(H^-(y))\,dF(y) = \int \varphi(x)\,d(F \circ G)(x). \]

Proof. It suffices to take $\varphi = \mathbf{1}_{(a,b]}$, the indicator function of a left-open right-closed interval. Check that $\{y : a < H^-(y) \le b\} = (G(a), G(b)]$. Then
\[ \int \mathbf{1}_{(a,b]}(H^-(y))\,dF(y) = \int \mathbf{1}_{(G(a),G(b)]}(y)\,dF(y) = F(G(b)) - F(G(a)) = \int \mathbf{1}_{(a,b]}\,d(F \circ G). \]
□
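The set identity that the proof of Lemma 7.2 asks the reader to check follows from the monotonicity and right-continuity of $G$. The following is a sketch of that check, using the form $H^-(y) = \inf\{x : G(x) \ge y\}$ of the definition; the justifications in parentheses are added here for illustration.

```latex
% Why {y : a < H^-(y) <= b} = (G(a), G(b)] for nondecreasing, right-continuous G.
\begin{align*}
H^-(y) \le b &\iff G(b) \ge y \\
  &\qquad \text{($\Leftarrow$: $b$ lies in the set $\{x : G(x) \ge y\}$;} \\
  &\qquad \text{ $\Rightarrow$: take $x_k \downarrow H^-(y)$ with $G(x_k) \ge y$,
                 then use monotonicity and right-continuity)}, \\
H^-(y) > a &\iff G(a) < y
  \qquad \text{(negation of the first line with $a$ in place of $b$)}, \\
\{y : a < H^-(y) \le b\} &= \{y : G(a) < y \le G(b)\} = (G(a), G(b)].
\end{align*}
```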
This lemma will be applied below to the pair $G(a) = w^+(a,t)$, $H^-(b) = y^-(b,t)$.

Fix a test function $\phi \in C_c^{\infty}(\mathbb{R} \times [0,\infty))$. Let $(A,B) \times [0,T)$ contain the support of $\phi$. Let $F(x,t) = \int_{-\infty}^{x} \phi(y,t)\,dy$. By Theorem 3.1, for any $q \in \mathbb{R}$ we have the formula
\begin{align*}
-F(q,0) &= \int_0^T \frac{d}{dt} F(w^+(q,t),t)\,dt \\
&= \int_0^T F_t(w^+(q,t),t)\,dt + \int_0^T \phi(w^+(q,t),t)\,h(w^+(q,t),t)\,dt. \tag{67}
\end{align*}
Now we calculate, beginning with the leftmost term of (21), with $v$ in place of $\bar{\zeta}$. Note that $v(x,t) = v_0(y^-(x,t))$ a.e. so in this first integral these two are interchangeable.
\begin{align*}
\int_0^T dt \int v(x,t)\,\phi_t(x,t)\,dx &= \int_0^T dt \int v_0(y^-(x,t))\,d[F_t(\cdot,t)](x) \\
&= \int_0^T dt \int v_0(q)\,d[F_t(w^+(\cdot,t),t)](q).
\end{align*}
There is a fixed compact interval $[a,b]$ on which the Lebesgue–Stieltjes measure $d[F_t(w^+(\cdot,t),t)](q)$ is supported for all $t \in [0,T]$. Let $a = q_0 < q_1 < \cdots < q_m = b$ be a partition of this interval with mesh $\Delta = \max(q_i - q_{i-1})$. We can choose the partitions so that $w(q_i,t) = w^\pm(q_i,t)$ for all $i$, because by Lemma 3.1(c) we only need to pick the $q_i$'s outside a certain Lebesgue null set. The integrand $v_0$ is continuous by assumption, hence the $q$-integral can be written as a limit, and the last line above equals
\[ = \int_0^T dt\, \lim_{\Delta\to 0} \sum_i v_0(q_i)\bigl\{F_t(w(q_i,t),t) - F_t(w(q_{i-1},t),t)\bigr\}. \]
The function inside the t-integral is bounded by a constant, uniformly over t ∈ [0, T ] and over partitions of [a, b]. Hence we can take the limit outside, apply (67), and then
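put the limit back inside, to get
\begin{align*}
&= \lim_{\Delta\to 0} \sum_i v_0(q_i) \int_0^T \bigl\{F_t(w(q_i,t),t) - F_t(w(q_{i-1},t),t)\bigr\}\,dt \\
&= \lim_{\Delta\to 0} \Bigl\{ -\sum_i v_0(q_i) \int_0^T \bigl\{\phi(w(q_i,t),t)\,h(w(q_i,t),t) - \phi(w(q_{i-1},t),t)\,h(w(q_{i-1},t),t)\bigr\}\,dt \\
&\qquad\qquad - \sum_i v_0(q_i)\,[F(q_i,0) - F(q_{i-1},0)] \Bigr\} \\
&= -\int_0^T \lim_{\Delta\to 0} \sum_i v_0(q_i)\bigl\{\phi(w(q_i,t),t)\,h(w(q_i,t),t) - \phi(w(q_{i-1},t),t)\,h(w(q_{i-1},t),t)\bigr\}\,dt \\
&\qquad - \int v_0(q)\,\phi(q,0)\,dq. \tag{68}
\end{align*}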
At this point we replace $h(w(q_i,t),t)$ by $f(\rho^+(w(q_i,t),t))$ and write $R_I$ for the error term. Then the last line above equals
\[
\begin{aligned}
= &-\lim_{I\to 0}\int_0^T \sum_i v_0(q_i)\{\phi(w(q_i,t),t)f(\rho^+(w(q_i,t),t)) - \phi(w(q_{i-1},t),t)f(\rho^+(w(q_{i-1},t),t))\}\,dt\\
&- \int v_0(q)\phi(q,0)\,dq - \lim_{I\to 0}R_I.
\end{aligned}
\]
Ignoring the term $\lim_{I\to 0}R_I$ for the moment, take the $I\to 0$ limit in the first sum to get again a Lebesgue-Stieltjes integral. After another application of Lemma 7.2, we get this intermediate equation:
\[
\begin{aligned}
\int_0^T dt\int v(x,t)\phi_t(x,t)\,dx
&= -\int_0^T dt\int v_0(q)\,d[\phi(w^+(\cdot,t),t)f(\rho^+(w^+(\cdot,t),t))](q) - \int v_0(q)\phi(q,0)\,dq - \lim_{I\to 0}R_I\\
&= -\int_0^T dt\int v_0(y^-(x,t))\,d[\phi(\cdot,t)f(\rho^+(\cdot,t))](x) - \int v_0(x)\phi(x,0)\,dx - \lim_{I\to 0}R_I.
\end{aligned}
\tag{69}
\]
It remains to take care of limI→0 RI . Notice that on line (68), at the stage where RI was introduced, the summation can be restricted to i such that w(qi−1 , t) < w(qi , t) because otherwise w(qi−1 , t) = w(qi , t) and the expression in braces {} equals zero.
Thus we can write $R_I$ as follows, and sum by parts:
\[
\begin{aligned}
R_I &= \int_0^T dt \sum_{i:\,w(q_{i-1},t)<w(q_i,t)} v_0(q_i)\Big\{\phi(w(q_i,t),t)\big[h(w(q_i,t),t) - f(\rho^+(w(q_i,t),t))\big]\\
&\qquad\qquad - \phi(w(q_{i-1},t),t)\big[h(w(q_{i-1},t),t) - f(\rho^+(w(q_{i-1},t),t))\big]\Big\}\\
&= \int_0^T dt \sum_i \phi(w(q_i,t),t)\big[h(w(q_i,t),t) - f(\rho^+(w(q_i,t),t))\big]\\
&\qquad\qquad \times\Big\{v_0(q_i)1\{w(q_{i-1},t)<w(q_i,t)\} - v_0(q_{i+1})1\{w(q_i,t)<w(q_{i+1},t)\}\Big\}.
\end{aligned}
\]
Now note that the last sum can be restricted to $i$ such that $(w(q_i,t),t)$ is a shock because $h(x,t) - f(\rho^+(x,t)) = 0$ unless $(x,t)$ is a shock. Supposing that $(x,t)$ is a shock, observe that if $y^-(x,t)\le q_i < q_{i+1}\le y^+(x,t)$ then $w(q_i,t) = w(q_{i+1},t) = x$. [In general $w^+(y^+(x,t),t)$ could be strictly larger than $x$, but then $w^-(y^+(x,t),t) < w^+(y^+(x,t),t)$, which we have prevented by assuming $w^-(q_i,t) = w^+(q_i,t)$.] Consequently, for the shock $(x,t)$,
\[
\begin{aligned}
&\sum_{i:\,w(q_i,t)=x}\Big\{v_0(q_i)1\{w(q_{i-1},t)<w(q_i,t)\} - v_0(q_{i+1})1\{w(q_i,t)<w(q_{i+1},t)\}\Big\}\\
&\qquad = v_0\big(\min\{q_i : q_i\ge y^-(x,t)\}\big) - v_0\big(\min\{q_i : q_i > y^+(x,t)\}\big).
\end{aligned}
\]
To express this in a single function, write
\[
L_I(x,t) = 1\big[(x,t)\text{ is a shock, and } x\in\{w(q_i,t): 0\le i\le m\}\big]\cdot\Big\{v_0\big(\min\{q_i : q_i > y^+(x,t)\}\big) - v_0\big(\min\{q_i : q_i\ge y^-(x,t)\}\big)\Big\}.
\]
The subscript $I$ expresses the dependence of $L_I$ on the partition. For any shock $(x,t)$, some $q_i$ lies in $(y^-(x,t), y^+(x,t))$ when $I$ is small enough, and then $x = w(q_i,0)$. By the continuity of $v_0$ we have the convergence
\[
\lim_{I\to 0}L_I(x,t) = v_0(y^+(x,t)) - v_0(y^-(x,t)), \tag{70}
\]
which happens boundedly and at all $(x,t)$. Now write
\[
\begin{aligned}
R_I &= \int_0^T dt \sum_{a\le x\le b}\phi(x,t)\big[f(\rho^+(x,t),t) - h(x,t)\big]L_I(x,t)\\
&= \int_0^T dt\int \theta(x,t)L_I(x,t)\,d[\phi(\cdot,t)f(\rho^+(\cdot,t))](x),
\end{aligned}
\]
where we recognized that $f(\rho^+(x,t),t) - h(x,t) = \theta(x,t)[f(\rho^+(x,t),t) - f(\rho^-(x,t),t)]$. The $x$-integral lives entirely on the countable set of shocks because $L_I$ vanishes elsewhere. This is why we can slip the continuous $\phi(x,t)$ factor into the integrator. Taking the limit gives
\[
\lim_{I\to 0}R_I = \int_0^T dt\int \theta(x,t)\big\{v_0(y^+(x,t)) - v_0(y^-(x,t))\big\}\,d[\phi(\cdot,t)f(\rho^+(\cdot,t))](x).
\]
Substituting this on line (69) above completes the proof that v(x, t) satisfies (21). We have proved Theorem 2.3.
8. Proof of Theorem 2.4

We begin by realizing the initial configurations $(z_i^n(0) : i\in\mathbf{Z})$ with Skorokhod's representation. Let (D, F, P) be a probability space on which are defined a two-sided Brownian motion $B(\cdot)$, and independently of it a space-time Poisson point process for constructing the Hammersley dynamics. Recall that $B(\cdot)$ is defined by $B(s) = B_1(s)$ for $s\ge 0$ and $B(s) = -B_2(-s)$ for $s<0$, where $B_1(\cdot)$, $B_2(\cdot)$ are two independent standard 1-dimensional Brownian motions defined on $[0,\infty)$. For each $n$, define a two-sided Brownian motion $B_n(\cdot)$ by $B_n(s) = n^{1/2}B(s/n)$ for $s\in\mathbf{R}$.

Fix $n$. Construct the Skorokhod representation for the independent mean zero random variables $\eta_i^n - E[\eta_i^n]$ whose distribution is defined in assumption (27). The usual construction (see e.g. Sect. 7.6 in [7]) is applied to $B_1$ for $i>0$ and to $B_2$ for $i\le 0$. This gives random variables
\[
\cdots\le T_{n,-2}\le T_{n,-1}\le 0 = T_{n,0}\le T_{n,1}\le T_{n,2}\le\cdots
\]
such that the variables $\{\tau_{n,i} = T_{n,i} - T_{n,i-1} : i\in\mathbf{Z}\}$ are mutually independent, we have the equality in distribution of the processes
\[
\{B_n(T_{n,i}) - B_n(T_{n,i-1}) : i\in\mathbf{Z}\} \overset{d}{=} \{\eta_i^n(0) - E[\eta_i^n(0)] : i\in\mathbf{Z}\},
\]
and for each $i$
\[
E[\tau_{n,i}] = \mathrm{Var}[\eta_i^n(0)] = \big(E[\eta_i^n(0)]\big)^2 = \Big(n\int_{(i-1)/n}^{i/n}\rho_0(s)\,ds\Big)^2. \tag{71}
\]
Note that the assumption of exponentially distributed $\eta_i^n(0)$ was used here. Now we take this construction as the definition of the initial interface:
\[
z_i^n(0) = nu_0(i/n) + B_n(T_{n,i}) = nu_0(i/n) + n^{1/2}B(n^{-1}T_{n,i}). \tag{72}
\]
The initial process $\zeta_n(y,0)$ defined by (4) is now given by
\[
\zeta_n(y,0) = B(n^{-1}T_{n,[ny]}) + n^{1/2}\big(u_0([ny]/n) - u_0(y)\big).
\]

Lemma 8.1. For any $-\infty<a<b<\infty$,
\[
\lim_{n\to\infty}\sup_{y\in[a,b]}\Big|\frac{T_{n,[ny]}}{n} - \int_0^y\rho_0^2(s)\,ds\Big| = 0 \tag{73}
\]
almost surely.
Proof. Suppose $0\le a<b$. The other cases are handled with similar arguments. Let us first check
\[
\lim_{n\to\infty}\sup_{y\in[a,b]}\Big|\frac{1}{n}ET_{n,[ny]} - \int_0^y\rho_0^2(s)\,ds\Big| = 0. \tag{74}
\]
By (71),
\[
\frac{1}{n}ET_{n,[ny]} = \int_0^{[ny]/n}\sum_{i=1}^{[ny]}\Big(n\int_{(i-1)/n}^{i/n}\rho_0(s)\,ds\Big)^2 1_{(\frac{i-1}{n},\frac{i}{n}]}(r)\,dr.
\]
The integrand is bounded by the assumption $\rho_0\in L^\infty_{loc}(\mathbf{R})$, and converges to $\rho_0^2(r)$ at every Lebesgue point $r$ of $\rho_0$. So the required convergence in (74) holds for each fixed $y$. To get uniformity over $y$, observe that
\[
\sup_{y\in[a,b]}\Big|\frac{1}{n}ET_{n,[ny]} - \int_0^y\rho_0^2(s)\,ds\Big|
\le \int_0^b\Big|\sum_{i=1}^{[nb]}\Big(n\int_{(i-1)/n}^{i/n}\rho_0(s)\,ds\Big)^2 1_{(\frac{i-1}{n},\frac{i}{n}]}(r) - \rho_0^2(r)\Big|\,dr + \frac{C}{n},
\]
and apply dominated convergence. Next we show that, for a fixed $y$,
\[
\lim_{n\to\infty}\big(n^{-1}T_{n,[ny]} - n^{-1}ET_{n,[ny]}\big) = 0 \quad\text{a.s.} \tag{75}
\]
The moments of the waiting times $\tau_{n,i}$ satisfy
\[
E[(\tau_{n,i})^k]\le C_k\,E\big[\big(\eta_i^n(0) - E[\eta_i^n(0)]\big)^{2k}\big]\le C_k' < \infty
\]
for all $0\le i\le nb$, for constants $C_k$, $C_k'$. The first inequality follows from the Burkholder-Davis-Gundy inequalities, and the second from the local boundedness of $\rho_0$. Thus by Chebychev,
\[
P\big(|T_{n,[ny]} - ET_{n,[ny]}|\ge n\varepsilon\big)\le\frac{1}{n^4\varepsilon^4}E\Big[\Big(\sum_{i=1}^{[ny]}(\tau_{n,i} - E\tau_{n,i})\Big)^4\Big]\le\frac{C}{n^2\varepsilon^4},
\]
and now Borel–Cantelli gives (75). To get (75) uniformly over $y\in[a,b]$, partition $[a,b]$ and use monotonicity of $T_{n,i}$ in $i$. Combine this with (74). $\Box$

By definition (72) and the path-continuity of Brownian motion, Lemma 8.1 is sufficient for proving that
\[
\lim_{n\to\infty}\sup_{y\in[a,b]}\Big|\zeta_n(y,0) - B\Big(\int_0^y\rho_0^2(s)\,ds\Big)\Big| = 0 \quad\text{almost surely.} \tag{76}
\]
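The deterministic step (74) in the proof of Lemma 8.1 is a Riemann-sum statement and can be checked numerically. The following is a minimal sketch only: the density $\rho_0(s) = 1 + s$ and all function names are assumptions made for illustration, chosen because the cell integrals in (71) and the limit $\int_0^y\rho_0^2$ are then available in closed form.

```python
import math

# Illustrative assumption (not from the paper): rho_0(s) = 1 + s for s >= 0.

def cell_mean(i, n):
    """n * integral of rho_0 over ((i-1)/n, i/n], exact for rho_0(s) = 1 + s."""
    return 1.0 + (2 * i - 1) / (2.0 * n)

def mean_T_over_n(y, n):
    """(1/n) E T_{n,[ny]} = (1/n) * sum_{i <= [ny]} (n * int rho_0)^2, by (71)."""
    top = math.floor(n * y)
    return sum(cell_mean(i, n) ** 2 for i in range(1, top + 1)) / n

def limit(y):
    """int_0^y rho_0(s)^2 ds for rho_0(s) = 1 + s."""
    return ((1.0 + y) ** 3 - 1.0) / 3.0

def error(y, n):
    """Absolute difference appearing inside the supremum in (74)."""
    return abs(mean_T_over_n(y, n) - limit(y))

def sup_error(a, b, n, grid=201):
    """Supremum of the error over an evenly spaced grid of y in [a, b]."""
    ys = [a + (b - a) * j / (grid - 1) for j in range(grid)]
    return max(error(y, n) for y in ys)
```

Under these assumptions `sup_error(0.0, 2.0, n)` shrinks roughly like $1/n$, which is the size of the $C/n$ boundary term in the uniform bound; the almost sure fluctuation statement (75) is not simulated here.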
Thus to prove limit (30) in Theorem 2.4 it suffices to show
\[
\lim_{n\to\infty}\sup_{(x,t)\in A}\Big|\zeta_n(x,t) - \inf_{y\in I(x,t)}\zeta_n(y,0)\Big| = 0 \quad\text{almost surely.} \tag{77}
\]
In other words, we need to strengthen (15) to a.s. convergence. One can follow the case-by-case reasoning in Sect. 6.1 for (15), and check that in each case the stronger assumptions (26) and (27) give almost sure convergence. To prove part (ii) of Theorem 2.4, one can show that $M\equiv\sup_n\sup_{(x,t)\in A_0}|\zeta_n(x,t)|$ is a.s. finite for an arbitrary compact set $A_0$. This furnishes the a.s. bound needed to turn the proof of part (ii) of Theorem 2.1 (Sect. 6.2) into a proof for part (ii) of Theorem 2.4.
9. The Distribution-Valued Processes

Let us check that $\xi_n(t)\in H^{-1}_{loc}(\mathbf{R})$, or in other words that $\chi\xi_n(t)\in H^{-1}(\mathbf{R})$ for any $\chi\in C_c^\infty(\mathbf{R})$. Let $\varphi\in H^1(\mathbf{R})$. Set
\[
\bar u_n(x,t) = n^{1/2}\big(u(x,t) - u([nx]/n,t)\big).
\]
By a summation by parts,
\[
\begin{aligned}
\chi\xi_n(t,\varphi) = \xi_n(t,\chi\varphi)
&= n^{-1/2}\sum_{i\in\mathbf{Z}}\chi(i/n)\varphi(i/n)\{z_i^n(nt) - nu(i/n,t) - z_{i-1}^n(nt) + nu((i-1)/n,t)\}\\
&= -\sum_{i\in\mathbf{Z}}[\chi((i+1)/n)\varphi((i+1)/n) - \chi(i/n)\varphi(i/n)]\,n^{-1/2}\{z_i^n(nt) - nu(i/n,t)\}\\
&= -\sum_{i\in\mathbf{Z}}\int_{i/n}^{(i+1)/n}(\chi\varphi)'(x)\,dx\cdot n^{-1/2}\{z_i^n(nt) - nu(i/n,t)\}\\
&= -\int(\chi\varphi)'(x)\big(\zeta_n(x,t) + \bar u_n(x,t)\big)\,dx.
\end{aligned}
\]
Use $(\chi\varphi)' = \varphi\chi' + \varphi'\chi$ and the Schwarz inequality. Let $[a,b]$ contain the support of $\chi$. By the local Lipschitz property of $u$ (Lemma 3.3), $\bar u_n = O(n^{-1/2})$ on any compact set. We get
\[
\begin{aligned}
|\chi\xi_n(t,\varphi)|
&\le \|\varphi\|_{L^2(\mathbf{R})}\Big(\int|\chi'(x)|^2\{\zeta_n(x,t) + \bar u_n(x,t)\}^2\,dx\Big)^{1/2}
 + \|\varphi'\|_{L^2(\mathbf{R})}\Big(\int|\chi(x)|^2\{\zeta_n(x,t) + \bar u_n(x,t)\}^2\,dx\Big)^{1/2}\\
&\le C\cdot\|\varphi\|_{H^1(\mathbf{R})}\Big\{\Big(\int_a^b\zeta_n(x,t)^2\,dx\Big)^{1/2} + \frac{1}{\sqrt n}\Big\}.
\end{aligned}
\]
This verifies that $\chi\xi_n(t)\in H^{-1}(\mathbf{R})$, because for any fixed $\omega$, the process $\zeta_n(x,t)$ is locally bounded in $(x,t)$. In other words, $\xi_n(t)\in H^{-1}_{loc}(\mathbf{R})$. We leave it to the reader to fill in the details to show that $\xi_n(\cdot)$ is a random element of the Skorokhod space $D([0,\infty), H^{-1}_{loc}(\mathbf{R}))$, and $\xi(\cdot)$ is a random element of $C([0,\infty), H^{-1}_{loc}(\mathbf{R}))$.

We move to the main point, to prove the strong law $\sup_{0\le t\le\tau}R(\xi_n(t),\xi(t))\to 0$ a.s. Note that in the definition (36) of $R(F,G)$ the quantities $\|\chi_kF - \chi_kG\|_{H^{-1}(\mathbf{R})}$ are nondecreasing in $k$. Consequently, for any $k$, $R(F,G)\le\|\chi_kF - \chi_kG\|_{H^{-1}(\mathbf{R})} + 2^{-k}$. So it suffices to show that for any $k$, almost surely
\[
\lim_{n\to\infty}\sup_{0\le t\le\tau}\ \sup_{\|\varphi\|_{H^1(\mathbf{R})}\le 1}|\xi_n(t,\chi_k\varphi) - \xi(t,\chi_k\varphi)| = 0.
\]
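Following the earlier calculation and by the definition (34), we get
\[
\begin{aligned}
|\xi_n(t,\chi_k\varphi) - \xi(t,\chi_k\varphi)|
&\le \Big|-\int(\chi_k\varphi)'(x)\zeta_n(x,t)\,dx + \int(\chi_k\varphi)'(x)\zeta(x,t)\,dx\Big| + \Big|\int(\chi_k\varphi)'(x)\bar u_n(x,t)\,dx\Big|\\
&\le C_k\cdot\|\varphi\|_{H^1(\mathbf{R})}\Big\{\Big(\int_a^b|\zeta_n(x,t) - \zeta(x,t)|^2\,dx\Big)^{1/2} + \frac{1}{\sqrt n}\Big\}.
\end{aligned}
\]
The constant $C_k$ is determined by $\chi_k$. Consequently
\[
\sup_{0\le t\le\tau}\ \sup_{\|\varphi\|_{H^1(\mathbf{R})}\le 1}|\xi_n(t,\chi_k\varphi) - \xi(t,\chi_k\varphi)|
\le C_k\cdot\Big\{\sup_{0\le t\le\tau}\Big(\int_a^b|\zeta_n(x,t) - \zeta(x,t)|^2\,dx\Big)^{1/2} + \frac{1}{\sqrt n}\Big\},
\]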
which converges a.s. to 0 by Theorem 2.4(ii). This completes the proof of Theorem 2.5.

References

1. Aldous, D., Diaconis, P.: Hammersley’s interacting particle process and longest increasing subsequences. Probab. Theory Related Fields 103, 199–213 (1995)
2. Baik, J., Deift, P., Johansson, K.: On the distribution of the length of the longest increasing subsequence of random permutations. J. Am. Math. Soc. 12, 1119–1178 (1999)
3. Baik, J., Deift, P., McLaughlin, K., Miller, P., Zhou, X.: Optimal tail estimates for directed last passage site percolation with geometric random variables. Preprint at http://front.math.ucdavis.edu/math.PR/0112162, 2001
4. Bertini, L., Giacomin, G.: Stochastic Burgers and KPZ equations from particle systems. Commun. Math. Phys. 183, 571–607 (1997)
5. Billingsley, P.: Convergence of Probability Measures. New York: John Wiley and Sons, 1968
6. Dafermos, C.M.: Generalized characteristics and the structure of solutions of hyperbolic conservation laws. Indiana Univ. Math. J. 26, 1097–1119 (1977)
7. Durrett, R.: Probability: Theory and Examples. Second Edition. London: Duxbury Press, Wadsworth Publishing Company, 1996
8. Ethier, S.N., Kurtz, T.G.: Markov Processes: Characterization and Convergence. New York: John Wiley and Sons, 1986
9. Evans, L.C.: Partial Differential Equations. Providence, RI: American Mathematical Society, 1998
10. Ferrari, P., Fontes, L.: Shock fluctuations in the asymmetric simple exclusion process. Probab. Theory Related Fields 99, 305–319 (1994)
11. Ferrari, P., Fontes, L.: Current fluctuations for the asymmetric simple exclusion process. Ann. Probab. 22, 820–832 (1994)
12. Ferrari, P., Fontes, L.: Poissonian approximation for the tagged particle in asymmetric simple exclusion. J. Appl. Probab. 33, 411–419 (1996)
13. Folland, G.B.: Real Analysis: Modern Techniques and Their Applications. Second Edition. New York: John Wiley and Sons, 1999
14. Giacomin, G., Lebowitz, J., Presutti, E.: Deterministic and stochastic hydrodynamic equations arising from simple microscopic model systems. In: Stochastic partial differential equations: Six perspectives, Math. Surveys Monogr. 64, Providence, RI: Am. Math. Soc., 1999, pp. 107–152
15. Hammersley, J.M.: A few seedlings of research. Proc. Sixth Berkeley Symp. Math. Stat. Probab. Vol. I, 345–394 (1972)
16. Johansson, K.: The longest increasing subsequence in a random permutation and a unitary random matrix model. Math. Res. Lett. 5, no. 1–2, 63–82 (1998)
17. Kim, J.H.: On increasing subsequences of random permutations. J. Combin. Theory Ser. A 76, 148–155 (1996)
18. Kipnis, C., Landim, C.: Scaling Limits of Interacting Particle Systems. Grundlehren der mathematischen Wissenschaften, Vol. 320, Berlin: Springer Verlag, 1999
19. Petrova, G., Popov, G.: Linear transport equations with discontinuous coefficients. Comm. Partial Differential Equations 24, 1849–1873 (1999)
20. Rezakhanlou, F.: Microscopic structure of shocks in one conservation laws. Ann. Inst. H. Poincaré Anal. Non Linéaire 12, 119–153 (1995)
21. Rezakhanlou, F.: A central limit theorem for the asymmetric simple exclusion process. Preprint, 2000
22. Seppäläinen, T.: A microscopic model for the Burgers equation and longest increasing subsequences. Electronic J. Probab. 1, Paper 5, 1–51 (1996)
23. Seppäläinen, T.: Large deviations for increasing sequences on the plane. Probab. Theory Related Fields 112, 221–244 (1998)
24. Seppäläinen, T.: Coupling the totally asymmetric simple exclusion process with a moving interface. Markov Process. Related Fields 4, 593–628 (1998)
25. Seppäläinen, T.: Perturbation of the equilibrium for a totally asymmetric stick process in one dimension. Ann. Probab. 29, 176–204 (2001)
26. Seppäläinen, T.: Second-class particles as microscopic characteristics in totally asymmetric nearest-neighbor K-exclusion processes. Trans. Am. Math. Soc. 353, 4801–4829 (2001)
27. Spohn, H.: Large Scale Dynamics of Interacting Particles. Berlin: Springer-Verlag, 1991

Communicated by H. Spohn
Commun. Math. Phys. 229, 183 – 207 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Bosonic Monocluster Expansion

A. Abdesselam^1, J. Magnen^2, V. Rivasseau^2

1 Département de Mathématiques, Université Paris XIII, Paris-Nord, Villetaneuse, Avenue J.B. Clément, 93430 Villetaneuse, France
2 Centre de Physique Théorique, CNRS UMR 7644, Ecole Polytechnique, 91128 Palaiseau Cedex, France
Received: 18 May 2000 / Accepted: 21 March 2001
Abstract: We compute Green’s functions of a Bosonic field theory with cutoffs by means of a “minimal” expansion which in a single move, interpolating a generalized propagator, performs the usual tasks of the cluster and Mayer expansion. In this way it allows a direct construction of the infinite volume or thermodynamic limit and it brings constructive Bosonic expansions closer to constructive Fermionic expansions and to perturbation theory.

1. Introduction

A key problem in physics is to construct the thermodynamic limit of large systems. Only intensive or normalized quantities have a well defined limit. For a Bosonic field theory the standard way to construct this limit is to introduce first a finite volume cutoff, then to perform a cluster expansion, which writes the theory as a polymer gas but with hard core constraints, then to perform a Mayer expansion which removes these constraints by comparing this gas to a perfect gas [GJS1]. In fact in [GJS2, part 3] it was shown that after interpolation of covariances has been used to decouple regions by the cluster expansion it is also possible to “recouple” them in such a way as to replace a region containing an observable by a “clean” region with no observable, in order to obtain an expansion for the normalized Schwinger functions. The situation was nevertheless still slightly frustrating for two reasons. Firstly for Fermionic theories there is no need of a sequence of two expansions on top of each other: a single tree formula expresses directly the infinite volume limit of normalized functions as a convergent series [AR1]. It is therefore desirable to have such a single formula computing directly the infinite volume limit of connected Green’s functions in the Bosonic case too.
Secondly mathematically both the cluster and the Mayer expansions can be written elegantly using forest formulas [AR2]; they have therefore some common nature, which led us to suspect for quite a while that there should exist a single expansion performing
both tasks at once, instead of one after the other. In fact the first example of such a formula was given in [AR2], but it is still really a somewhat artificial mixing of the two expansions (using a two stages formula technically called a “jungle” formula), and it is not obtained by interpolating propagators only. In this paper we propose a much more natural solution to this problem, which writes directly the infinite volume limit of normalized functions as a convergent series. To incorporate the Mayer expansion we have simply to work in some extended space of copies. Therefore we propose, for any space $\mathbb{R}^d$, to define the Mayer space as $\mathbb{R}^d\times\mathbb{N}$. In this extended space we introduce expansion steps which interpolate solely the (generalized) propagator of the extended theory. The outcome of our expansion is not exactly but almost a tree formula in this extended Mayer space-time. It generates a single cluster (hence we name our expansion a “monocluster” expansion), and the profile of this cluster in the Mayer space is a solid-on-solid profile, with no overhangs. This means that our expansion makes truly a minimal use of the Mayer copies. We hope to extend this analysis in the future to multiscale expansions such as the one of [AR3], written for the infrared $\phi^4_4$ model. This would suppress the need for iteration of Mayer expansions to perform renormalization (probably the most cumbersome aspect of explicit multiscale expansions). In this way we hope to obtain a completely explicit non-perturbative solution of the renormalization group induction for Bosonic theories (apart from the inductive computation of the effective constants). It would bring these Bosonic theories to the same level of understanding as Fermionic theories, for which such explicit solutions are known [DR]. For a review of rigorous renormalization group methods for bosonic field theory models we refer the reader to [Br, Ga, GK, Ri].

2. The Model

Let $C(x,y)$ be the smooth translation-invariant kernel of a covariance operator on $\mathbb{R}^d$, i.e. such that $(f,g)\mapsto\langle f,Cg\rangle_{L^2(\mathbb{R}^d)}$ is a positive continuous bilinear form on the Schwartz space $S(\mathbb{R}^d)$. By the Bochner-Minlos theorem (see [GJ]), there is an associated Gaussian measure $d\mu_C$ on $S'(\mathbb{R}^d)$ with covariance $C$. The smoothness of $C$ ensures that $d\mu_C$ is supported on smooth functions. We assume that $C$ satisfies a condition of rapid decay:
\[
\forall r\ge 1,\ \exists K_1(r)>0,\ \forall x,y\in\mathbb{R}^d,\quad |C(x,y)|\le K_1(r)(1+|x-y|)^{-r}. \tag{1}
\]
Let $P(x)$ be a real polynomial with even degree $2m$ and positive leading coefficient. There is then a constant $K_2>0$ such that, for all $x\in\mathbb{R}$, $|P(x)|\le K_2(1+x^{2m})$. We introduce a discretization
\[
D \overset{\mathrm{def}}{=} \Big\{\prod_{i=1}^d[k_i,k_i+1[\ \Big|\ (k_1,\ldots,k_d)\in\mathbb{Z}^d\Big\} \tag{2}
\]
of $\mathbb{R}^d$ with boxes of unit size. If $x\in\mathbb{R}^d$, we denote by $\Delta(x)$ the unique $\Delta\in D$ containing $x$. We denote by $\Lambda$ a hypercube of $\mathbb{R}^d$ that is a union of boxes in $D$, and by $|\Lambda|$ the number of these boxes, which also happens to be equal to $\mathrm{vol}(\Lambda)$. For any $\lambda\ge 0$, we introduce a partition function with free boundary conditions:
\[
Z(\Lambda) \overset{\mathrm{def}}{=} \int d\mu_C(\phi)\exp\Big(-\lambda\int_\Lambda P(\phi(x))\,dx\Big) \tag{3}
\]
as well as unnormalized Schwinger functions, for $x_1,\ldots,x_n$ in $\mathbb{R}^d$:
\[
S_{\Lambda,u}(x_1,\ldots,x_n) \overset{\mathrm{def}}{=} \int d\mu_C(\phi)\,\phi(x_1)\cdots\phi(x_n)\exp\Big(-\lambda\int_\Lambda P(\phi(x))\,dx\Big). \tag{4}
\]
These are well defined quantities, besides $Z(\Lambda)>0$. Indeed, by Jensen’s inequality and Wick’s theorem (see [GJ]),
\[
Z(\Lambda)\ge\exp\Big(\int d\mu_C(\phi)(-\lambda)\int_\Lambda P(\phi(x))\,dx\Big) \tag{5}
\]
\[
\ge\exp\Big(-K_2\lambda\int_\Lambda dx\int d\mu_C(\phi)\big(1+\phi(x)^{2m}\big)\Big) \tag{6}
\]
\[
\ge\exp\Big(-K_2\lambda|\Lambda|\Big(1+\frac{(2m)!}{2^m m!}C(0,0)^m\Big)\Big)>0. \tag{7}
\]
One can thus consider the finite-volume normalized Schwinger functions, or correlation functions,
\[
S_\Lambda(x_1,\ldots,x_n) \overset{\mathrm{def}}{=} \frac{S_{\Lambda,u}(x_1,\ldots,x_n)}{Z(\Lambda)} \tag{8}
\]
and study their thermodynamic limit when $\Lambda\nearrow\mathbb{R}^d$. The typical example we have in mind is the $\phi^4$ theory in a single slice of momenta, that is with both ultraviolet and infrared cut-offs as defined, e.g. by the choice:
\[
C(x,y) \overset{\mathrm{def}}{=} \int\frac{d^dp}{(2\pi)^d}\,e^{ip(x-y)}\,\frac{e^{-p^2}}{p^2+1} \tag{9}
\]
and $P(x) = x^4$. One of the classical results we rederive using our new expansion scheme is

Theorem 1. There exists $\lambda_0>0$, such that, for any $\lambda\in[0,\lambda_0]$, any $n\ge 1$, and $x_1,\ldots,x_n\in\mathbb{R}^d$, $S(x_1,\ldots,x_n) = \lim_{\Lambda\nearrow\mathbb{R}^d}S_\Lambda(x_1,\ldots,x_n)$ exists.

Of course, more results can be obtained with our method, like Borel summability of perturbation theory, or complete asymptotic expansion of the decay rate of $S(x_1,x_2)$ etc. But as explained in the introduction, our purpose here is rather to present, at work, a new expansion scheme in the cluster expansion business that produces a sum over a single polymer (i.e. set of cubes), and therefore completely avoids the so-called Mayer expansion.

3. The Expansion

We first introduce a denumerable set of copies of the field $\phi$. We let $L \overset{\mathrm{def}}{=} D\times\mathbb{N}$ which we identify with a discretization of the “Mayer space” $\mathbb{R}^d\times\mathbb{N}$. This Mayer space can be pictured as in Fig.1 with the horizontal direction to represent the cubes in $D$ and the vertical direction to represent the copies index in $\mathbb{N}$. This index is therefore called the height of the cube.
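For $M$ a positive matrix with entries indexed by elements $b$ of $L$, we define the covariance operator on $\mathbb{R}^d\times\mathbb{N}$:
\[
C[M](x,k;x',k') \overset{\mathrm{def}}{=} C(x,x')\,M(b(x,k),b(x',k')) \tag{1}
\]
where $b(x,k) = (\Delta(x),k)$ denotes, with a slight abuse of terminology, the box of $L$ containing the pair $(x,k)$. In particular we consider $M_\emptyset$ defined by
\[
M_\emptyset((\Delta,k),(\Delta',k')) \overset{\mathrm{def}}{=}
\begin{cases}
1 & \text{if } k=k'=0\\
\delta_{\Delta,\Delta'} & \text{if } k=k'\ge 1\\
0 & \text{otherwise}
\end{cases}
\tag{2}
\]
i.e. in block form, with rows and columns indexed by $L_0$ and $L_{\ge 1}$,
\[
M_\emptyset \overset{\mathrm{def}}{=}
\begin{pmatrix}
\mathbb{1} & 0\\
0 & \mathrm{Id}
\end{pmatrix}
\tag{3}
\]
where $L_0 \overset{\mathrm{def}}{=} D\times\{0\}$, $L_{\ge 1} \overset{\mathrm{def}}{=} D\times\mathbb{N}^*$, $\mathbb{1}$ is the matrix with entries 1 everywhere and $\mathrm{Id}$ is the identity matrix. Clearly, $C_\emptyset = C[M_\emptyset]$ is a positive covariance operator; and we can define $d\mu_{C_\emptyset}(\Phi)$ the measure of a Gaussian random field $\Phi(x,k)$ on $\mathbb{R}^d\times\mathbb{N}$, with covariance $C_\emptyset$. We introduce also the notations $D_\Lambda \overset{\mathrm{def}}{=} \{\Delta\in D\mid\Delta\subset\Lambda\}$, and for any integer $N\ge 0$, $L_{\Lambda,N} \overset{\mathrm{def}}{=} D_\Lambda\times\{0,1,\ldots,N\}\subset L$. Now consider
\[
H_{\Lambda,N}(x_1,\ldots,x_n) \overset{\mathrm{def}}{=} \int d\mu_{C_\emptyset}(\Phi)\prod_{i=1}^n\Phi(x_i,0)\,\exp\Big(-\lambda\sum_{(\Delta,k)\in L_{\Lambda,N}}\int_\Delta P(\Phi(x,k))\,dx\Big). \tag{4}
\]
We obviously have, due to the definition of $C_\emptyset$, the factorization
\[
H_{\Lambda,N}(x_1,\ldots,x_n) = S_{\Lambda,u}(x_1,\ldots,x_n)\cdot Z_0^{N|\Lambda|} \tag{5}
\]
where
\[
Z_0 \overset{\mathrm{def}}{=} \int d\mu_{1_\Delta C 1_\Delta}\exp\Big(-\lambda\int_\Delta P(\Phi(x,k))\,dx\Big), \tag{6}
\]
the normalization of an isolated cube, does not depend on $\Delta$, since the kernel $C$ is translation-invariant. Here, $1_\Delta$ denotes the sharp characteristic function of $\Delta$, hence note that $Z_0 = Z(\Delta)$. Before starting combinatorial definitions we give an informal description of our expansion.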
Let $\Gamma_0 \overset{\mathrm{def}}{=} \{\Delta\in D\mid\exists i,\ x_i\in\Delta\}\times\{0\}\subset L_0$ be the set of cubes which contain the sources. They belong to the first layer $L_0$. The idea now is to remark that in an ordinary cluster expansion we introduce a weakening parameter $h_1$ which tests the coupling between the source cubes of $\Gamma_0$ and the other cubes of $L_0$ through the non local propagator $C$. But in this way in the decoupled term, the normalization $Z(\Lambda)$ does not quite factor out: in the numerator the cubes of $\Gamma_0$ have been factored from the other ones of $L_0$, hence in the numerator we have factored $Z(\Lambda\setminus\Gamma_0)$ which does not cancel completely the denominator $Z(\Lambda)$. So we introduce new cubes $(\Delta,1)$, which are copies of the cubes $(\Delta,0)\in\Gamma_0$ but without the source fields, and with height index 1 instead of 0. Initially they are decoupled between themselves and from the cubes with height 0. As the parameter $h_1$ varies from 1 to 0 we do not merely turn off the couplings between $\Gamma_0$ and $\Lambda\setminus\Gamma_0$ but also turn on the couplings between the copies $(\Delta,1)$ and between these copies and the cubes of $\Lambda\setminus\Gamma_0$. In this way the copies couple together and substitute for the source cubes, hence at $h_1 = 0$ the exact factor $Z(\Lambda)$ cancels between the numerator and the denominator. Correction terms involve a cluster link, either with plus sign between $\Gamma_0$ and $\Lambda\setminus\Gamma_0$, or with a minus sign between the copies or between a copy and a cube of $\Lambda\setminus\Gamma_0$. This minus sign is the analog of a Mayer link in the standard point of view.

Then we iterate. This means that at each step a single cluster has been built containing the sources $\Gamma_0$. We cover this “monocluster” with exactly a single layer of decoupled copies which (together with the remaining set of boxes of height 0 not in the cluster) form what we call the “roof” of the cluster. Then we turn off the couplings of the cluster to the roof, and we turn on the internal couplings of the roof. An expansion step creates either a + link between the cluster and the roof, or a - link between the roof cubes themselves. In each case the cluster grows. When factorization succeeds, the roof exactly reconstructs the normalization factor $Z(\Lambda)$ which then cancels with the denominator.

As an example in Fig.1 we have shown a world with only seven cubes and 4 layers. The cubes are indexed by two indices, one between 1 and 7, the other, the height, between 0 and 3. The initial cubes containing sources are (3,0) and (4,0). The first expansion step created a minus link $l_1$ between (3,1) and (1,0), and the second expansion step created a minus link $l_2$ between (4,1) and (7,0). The “roof” is always made of seven cubes: at the third step in Fig.1, these seven cubes are (1,1), (2,0), (3,2), (4,2), (5,0), (6,0) and (7,1). If factorization succeeds they will together exactly reconstruct $Z(\Lambda)$. Otherwise a new propagator $l_3$ will appear, for instance with a plus sign between (4,0) and (5,0), or with a minus sign between (3,2) and (4,2). In the first case (5,0) will become a cluster cube and (5,1) a roof cube; in the second case (3,2) and (4,2) will both become cluster cubes and (3,3) and (4,3) will become roof cubes. But couplings between cluster and roof cubes are turned off only if they existed at all, and this entails some subtle constraints: for instance we shall see below that in Fig.1 the new propagator $l_3$ can appear between (3,2) and (4,1) but not between (3,1) and (4,2).

After this informal overview we now proceed to write down precisely our expansion for $H_{\Lambda,N}(x_1,\ldots,x_n)$. We introduce some combinatorial definitions. First we define the notion of a polymer. We define $\Gamma_{-1} \overset{\mathrm{def}}{=} \emptyset$. We say that a finite set $\Gamma\subset L$ is a polymer if, whenever $(\Delta,k)\in\Gamma$, we also have $(\Delta,k')\in\Gamma$ for any $k'$, $0\le k'\le k$. We also introduce the altitude function $h_\Gamma$ of a polymer, on $D$ as:
\[
h_\Gamma(\Delta) \overset{\mathrm{def}}{=}
\begin{cases}
-1 & \text{if } \{k\mid(\Delta,k)\in\Gamma\} = \emptyset\\
\max\{k\mid(\Delta,k)\in\Gamma\} & \text{otherwise.}
\end{cases}
\tag{7}
\]
A polymer $\Gamma$ is uniquely determined by its altitude function $h_\Gamma$. We also introduce the roof $W(\Gamma)\subset L$ of a polymer $\Gamma$ as:
\[
W(\Gamma) \overset{\mathrm{def}}{=} \{(\Delta,h_\Gamma(\Delta)+1)\mid\Delta\in D\} \tag{8}
\]
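[Figure 1: the Mayer space for a world of seven cubes (horizontal index 1 to 7) and four layers (heights 0 to 3), showing the links l1 and l2, with legend entries Sky, Roof, Propagator, Sources and Cluster.]
Fig. 1. A cluster graph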
and its sky $S(\Gamma) \overset{\mathrm{def}}{=} L\setminus(\Gamma\cup W(\Gamma))$. The sets $\Gamma$, $W(\Gamma)$ and $S(\Gamma)$ then form a partition of $L$. Let $g = (l_1,\ldots,l_p)$ be an ordered sequence of unordered pairs of the form $l = \{b,b'\}$ with $b$, $b'$ distinct elements of $L$. $p$ is called the length of the graph and $p = 0$, corresponding to $g = \emptyset$, is allowed. We define, for $1\le i\le p$, $\Gamma_{i,g} \overset{\mathrm{def}}{=} \Gamma_0\cup l_1\cup\cdots\cup l_i$. We also set, by convention, $\Gamma_{0,g} \overset{\mathrm{def}}{=} \Gamma_0$ and $\Gamma_{-1,g} \overset{\mathrm{def}}{=} \Gamma_{-1} = \emptyset$. We say that $g$ is a cluster-graph if, for any $i$, $1\le i\le p$, the unordered pair $l_i$ is of the form $\{b,b'\}$ for some $b$ and $b'$ that satisfy one of the following two conditions:
(i) $b\in\Gamma_{i-1,g}$ and $b'\in W(\Gamma_{i-1,g})$,
(ii) $b,b'\in W(\Gamma_{i-1,g})$ and $b\notin L_0$.
It is easy to check by induction on $i$ that $\Gamma_{i,g}$ defined previously is indeed a polymer, for any $i$, $1\le i\le p$. A pair $l_i$, which is called a link of the graph $g$, is said of type cluster-roof or $\Gamma W$ if (i) occurs, and of type roof-roof or $WW$ if (ii) occurs (see Fig.1). If $b\in L$, we define the conception index of $b$ with respect to $g$:
\[
\mu_g(b) \overset{\mathrm{def}}{=} \inf\big(\{i\mid -1\le i\le p,\ b\in W(\Gamma_{i,g})\}\cup\{p+1\}\big) \tag{9}
\]
and the creation index of $b$:
\[
\nu_g(b) \overset{\mathrm{def}}{=} \inf\big(\{i\mid -1\le i\le p,\ b\in\Gamma_{i,g}\}\cup\{p+1\}\big). \tag{10}
\]
In Fig.1, $g = (l_1,l_2)$ and $p = 2$. The conception index keeps track of the step at which a cube enters into the roof: for example $\mu_g(3,0) = -1$, $\mu_g(5,0) = -1$, $\mu_g(3,1) = 0$, $\mu_g(4,2) = 2$ and $\mu_g(6,3) = 3$, where here the value $p+1 = 3$ means simply the future. The creation index keeps track of the step at which a cube enters into the cluster: for example $\nu_g(3,0) = 0$, $\nu_g(4,1) = \nu_g(7,0) = 2$ and $\nu_g(5,0) = \nu_g(6,3) = 3$ (where 3 again means the future). Note that we always have $\mu_g(b)<\nu_g(b)$ if $b\in(\Gamma_{p,g}\cup W(\Gamma_{p,g}))$. Indeed, by definition of a cluster-graph $\Gamma_{i,g}\setminus\Gamma_{i-1,g} = l_i\setminus\Gamma_{i-1,g}\subset W(\Gamma_{i-1,g})$. In fact, $W(\Gamma_i)$ can be viewed as a solid-on-solid interface that elevates in $L$ as the cluster $\Gamma_{i,g}$ grows with increasing $i$. A cube $b$ has to belong to a $W(\Gamma_{i,g})$ before it belongs to a $\Gamma_{i,g}$. If $b$, $b'$ are two elements of $L$ we define:
\[
s\mu_g(b,b') \overset{\mathrm{def}}{=} \max(\mu_g(b),\mu_g(b')), \tag{11}
\]
\[
i\nu_g(b,b') \overset{\mathrm{def}}{=} \min(\nu_g(b),\nu_g(b')), \tag{12}
\]
\[
s\nu_g(b,b') \overset{\mathrm{def}}{=} \max(\nu_g(b),\nu_g(b')). \tag{13}
\]
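The bookkeeping of polymers, roofs and the two indices can be sketched in a few lines of Python on the seven-cube world of Fig.1. This is only an illustration under stated assumptions: boxes are encoded as pairs `(cube, height)`, and every name below (`roof`, `polymers`, `conception`, `creation`, `SOURCES`, `L1`, `L2`) is hypothetical rather than taken from the paper.

```python
# Seven cubes labelled 1..7, as in the example of Fig. 1.
CUBES = range(1, 8)

SOURCES = {(3, 0), (4, 0)}            # Gamma_0: the cubes containing sources
L1 = ((3, 1), (1, 0))                 # first link of the example graph
L2 = ((4, 1), (7, 0))                 # second link of the example graph

def altitude(polymer, cube):
    """Altitude of the polymer over a cube: its top height there, or -1."""
    return max((k for (c, k) in polymer if c == cube), default=-1)

def roof(polymer):
    """Roof W(Gamma): one box per cube, sitting just above the polymer."""
    return {(c, altitude(polymer, c) + 1) for c in CUBES}

def polymers(sources, links):
    """Return [Gamma_{-1}, Gamma_0, ..., Gamma_p]; list position j holds i = j - 1."""
    seq = [set(), set(sources)]
    for link in links:
        seq.append(seq[-1] | set(link))
    return seq

def conception(box, seq):
    """mu_g(box): first i in {-1..p} with box in W(Gamma_i), else p + 1."""
    p = len(seq) - 2
    return next((j - 1 for j, gam in enumerate(seq) if box in roof(gam)), p + 1)

def creation(box, seq):
    """nu_g(box): first i in {-1..p} with box in Gamma_i, else p + 1."""
    p = len(seq) - 2
    return next((j - 1 for j, gam in enumerate(seq) if box in gam), p + 1)

def pair_indices(b, c, seq):
    """Return (s_mu, i_nu, s_nu) for the pair of boxes b, c."""
    mu_b, mu_c = conception(b, seq), conception(c, seq)
    nu_b, nu_c = creation(b, seq), creation(c, seq)
    return max(mu_b, mu_c), min(nu_b, nu_c), max(nu_b, nu_c)

SEQ = polymers(SOURCES, (L1, L2))
```

With these definitions `roof(SEQ[-1])` is the seven-cube roof listed for the third step of Fig.1, and `conception`, `creation` and `pair_indices` return the index values quoted in the text for that figure.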
Before $s\mu_g(b,b')$, one of the cubes in the pair at least belongs to the sky, they are decoupled and the cluster expansion step cannot create a link between them. Between $s\mu_g(b,b')$ and $i\nu_g(b,b')$ both cubes are conceived but not created, hence the cluster expansion step turns on their coupling and they can be linked by a minus link. Between $i\nu_g(b,b')$ and $s\nu_g(b,b')$ one cube is in the cluster, the other in the roof and the cluster expansion turns off their remaining coupling hence they can be linked by a plus link. Finally after $s\nu_g(b,b')$ both cubes are in the cluster, their coupling is frozen and no link can join them anymore. In our example of Fig.1, $s\mu_g((3,1),(5,0)) = 0$, $i\nu_g((3,1),(5,0)) = 1$ and $s\nu_g((3,1),(5,0)) = 3$; also $s\mu_g((1,0),(7,0)) = -1$, $i\nu_g((1,0),(7,0)) = 1$ and $s\nu_g((1,0),(7,0)) = 2$.

Now given a decreasing vector $h$ of $p+1$ parameters $1>h_1>\cdots>h_p>h_{p+1}>0$ with the additional convention $h_0 \overset{\mathrm{def}}{=} 1$ and $h_{-1} \overset{\mathrm{def}}{=} +\infty$ so that $\frac{1}{h_{-1}} = 0$, we define the following matrix $M_{g,h}$ on $L$. For $b$, $b'$ in $L$ we let
\[
M_{g,h}(b,b') \overset{\mathrm{def}}{=}
\begin{cases}
1 & \text{if } b = b'\\
0 & \text{if } b\ne b' \text{ and } s\mu_g(b,b')\ge i\nu_g(b,b')\\
h_{s\nu_g(b,b')}\Big(\dfrac{1}{h_{i\nu_g(b,b')}} - \dfrac{1}{h_{s\mu_g(b,b')}}\Big) & \text{if } b\ne b' \text{ and } s\mu_g(b,b')<i\nu_g(b,b').
\end{cases}
\tag{14}
\]
We will later prove that $M_{g,h}$ is a positive matrix. Before that, we introduce the following operation on covariance matrices on $L$. If $\Gamma$ is a polymer, and $M$ is a matrix on $L$, we define the new matrix $T_\Gamma[M]$ by
\[
T_\Gamma[M](b,b') \overset{\mathrm{def}}{=}
\begin{cases}
M(b,b') & \text{if } b,b'\in\Gamma\\
1 & \text{if } b,b'\in W(\Gamma)\\
\delta_{b,b'} & \text{if } b,b'\in S(\Gamma)\\
0 & \text{otherwise,}
\end{cases}
\tag{15}
\]
or in block form, with rows and columns indexed by $\Gamma$, $W(\Gamma)$ and $S(\Gamma)$,
\[
T_\Gamma(M) =
\begin{pmatrix}
M|_\Gamma & 0 & 0\\
0 & \mathbb{1} & 0\\
0 & 0 & \mathrm{Id}
\end{pmatrix}.
\tag{16}
\]
Obviously $T_\Gamma[M]$ is positive if $M$ is.
Lemma 1. If $g = (g',l_p)$ is a cluster-graph of length $p\ge 1$, and $h = (h',h_{p+1})$ is a decreasing vector of parameters, we have
\[
M_{g,h} = \frac{h_{p+1}}{h_p}M_{g',h'} + \Big(1 - \frac{h_{p+1}}{h_p}\Big)T_{\Gamma_{p,g}}[M_{g',h'}]. \tag{17}
\]
Proof. We check the equality for every pair of boxes $b$, $b'$ in $L$. The case $b = b'$ holds trivially.
• If $b\ne b'$ are both in $\Gamma_{p,g}$, then the choice of upper cut-off on the infimum in (9) and (10) readily implies that $\mu_g(b) = \mu_{g'}(b)\le p-1$ and $\nu_g(b) = \nu_{g'}(b)\le p$. Therefore,
\[
M_{g,h}(b,b') = M_{g',h'}(b,b') = T_{\Gamma_{p,g}}[M_{g',h'}](b,b') \tag{18}
\]
so that (17) holds.
• If $b\ne b'$ are both in $W(\Gamma_{p,g})$, then $\mu_g(b) = \mu_{g'}(b)\le p$, whereas $\nu_g(b) = p+1$, $\nu_{g'}(b) = p$ and likewise for $b'$. Therefore
\[
M_{g,h}(b,b') = h_{p+1}\Big(\frac{1}{h_{p+1}} - \frac{1}{h_{s\mu_g(b,b')}}\Big), \tag{19}
\]
\[
M_{g',h'}(b,b') = h_p\Big(\frac{1}{h_p} - \frac{1}{h_{s\mu_g(b,b')}}\Big), \tag{20}
\]
whereas $T_{\Gamma_{p,g}}[M_{g',h'}](b,b') = 1$, and thus
\[
\frac{h_{p+1}}{h_p}M_{g',h'}(b,b') + \Big(1 - \frac{h_{p+1}}{h_p}\Big)T_{\Gamma_{p,g}}[M_{g',h'}](b,b')
= \frac{h_{p+1}}{h_p}\,h_p\Big(\frac{1}{h_p} - \frac{1}{h_{s\mu_g(b,b')}}\Big) + \Big(1 - \frac{h_{p+1}}{h_p}\Big) \tag{21}
\]
\[
= h_{p+1}\Big(\frac{1}{h_{p+1}} - \frac{1}{h_{s\mu_g(b,b')}}\Big), \tag{22}
\]
so that (17) holds.
• If $b\in\Gamma_{p,g}$ and $b'\in W(\Gamma_{p,g})$, then $\mu_g(b) = \mu_{g'}(b)\le p$, $\nu_g(b) = \nu_{g'}(b)\le p$, $\mu_g(b') = \mu_{g'}(b')\le p$, $\nu_g(b') = p+1$ and $\nu_{g'}(b') = p$. Therefore $s\mu_g(b,b') = s\mu_{g'}(b,b')$ and $i\nu_g(b,b') = i\nu_{g'}(b,b')$. Besides $T_{\Gamma_{p,g}}[M_{g',h'}](b,b') = 0$. So if $s\mu_g(b,b')\ge i\nu_g(b,b')$ both sides of (17) vanish; else we have
\[
M_{g,h}(b,b') = h_{p+1}\Big(\frac{1}{h_{i\nu_g(b,b')}} - \frac{1}{h_{s\mu_g(b,b')}}\Big) \tag{23}
\]
and
\[
M_{g',h'}(b,b') = h_p\Big(\frac{1}{h_{i\nu_g(b,b')}} - \frac{1}{h_{s\mu_g(b,b')}}\Big), \tag{24}
\]
which implies (17).
• Finally if $b\in S(\Gamma_{p,g})\subset S(\Gamma_{p-1,g})$ and $b'\ne b$ is anywhere in $L$, we have $T_{\Gamma_{p,g}}[M_{g',h'}](b,b') = 0$, $\mu_g(b) = \nu_g(b) = p+1$ and $\mu_{g'}(b) = \nu_{g'}(b) = p$. Thus $s\mu_g(b,b')\ge i\nu_g(b,b')$ and $s\mu_{g'}(b,b')\ge i\nu_{g'}(b,b')$ so that both sides of (17) vanish again. This completes the check in every case.
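The recursion (17) can be checked entry by entry on a finite window of $L$ with exact rational arithmetic. The sketch below does this for the two-link graph of Fig.1; it is an illustration under assumptions, not part of the proof: the window (seven cubes, heights 0 to 3), the parameter values and all function names are hypothetical choices.

```python
from fractions import Fraction
from itertools import product

CUBES, HEIGHTS = range(1, 8), range(0, 4)
BOXES = list(product(CUBES, HEIGHTS))          # finite window of L = D x N

SOURCES = {(3, 0), (4, 0)}                     # Gamma_0 in Fig. 1
L1, L2 = ((3, 1), (1, 0)), ((4, 1), (7, 0))    # the links l_1 and l_2

def roof(gam):
    """Roof W(Gamma): the box just above the polymer over every cube."""
    top = {c: max((k for (d, k) in gam if d == c), default=-1) for c in CUBES}
    return {(c, top[c] + 1) for c in CUBES}

def polymers(sources, links):
    """[Gamma_{-1}, Gamma_0, ..., Gamma_p]; list position j holds index j - 1."""
    seq = [set(), set(sources)]
    for link in links:
        seq.append(seq[-1] | set(link))
    return seq

def first_index(box, sets, cutoff):
    return next((j - 1 for j, s in enumerate(sets) if box in s), cutoff)

def matrix_M(sources, links, h):
    """M_{g,h} of (14) for g = links and h = (h_1, ..., h_{p+1})."""
    seq, p = polymers(sources, links), len(links)
    roofs = [roof(g) for g in seq]
    hh = {0: Fraction(1)}
    hh.update({i + 1: Fraction(x) for i, x in enumerate(h)})
    inv = lambda i: Fraction(0) if i == -1 else 1 / hh[i]   # 1/h_{-1} = 0
    mu = {b: first_index(b, roofs, p + 1) for b in BOXES}
    nu = {b: first_index(b, seq, p + 1) for b in BOXES}
    M = {}
    for b, c in product(BOXES, BOXES):
        s_mu, i_nu, s_nu = max(mu[b], mu[c]), min(nu[b], nu[c]), max(nu[b], nu[c])
        if b == c:
            M[b, c] = Fraction(1)
        elif s_mu >= i_nu:
            M[b, c] = Fraction(0)
        else:
            M[b, c] = hh[s_nu] * (inv(i_nu) - inv(s_mu))
    return M

def T(gam, M):
    """The operation T_Gamma[M] of (15)."""
    W = roof(gam)
    sky = lambda b: b not in gam and b not in W
    out = {}
    for b, c in product(BOXES, BOXES):
        if b in gam and c in gam:
            out[b, c] = M[b, c]
        elif b in W and c in W:
            out[b, c] = Fraction(1)
        elif sky(b) and sky(c):
            out[b, c] = Fraction(int(b == c))
        else:
            out[b, c] = Fraction(0)
    return out

def check_lemma1(sources, links, h):
    """True if (17) holds for every pair of boxes in the window (needs p >= 1)."""
    p = len(links)
    M_full = matrix_M(sources, links, h)
    M_prev = matrix_M(sources, links[:-1], h[:-1])
    T_prev = T(polymers(sources, links)[-1], M_prev)
    r = Fraction(h[p]) / Fraction(h[p - 1])                 # h_{p+1} / h_p
    return all(M_full[k] == r * M_prev[k] + (1 - r) * T_prev[k] for k in M_full)

H = (Fraction(3, 4), Fraction(1, 2), Fraction(1, 4))        # h_1 > h_2 > h_3
```

A call such as `check_lemma1(SOURCES, (L1, L2), H)` compares both sides of (17) on all pairs of boxes in the window; using `Fraction` avoids any floating-point tolerance in the comparison.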
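Lemma 2. For any cluster-graph $g$ of length $p \ge 0$ and associated decreasing parameter vector $h$ of length $p+1$, the matrix $M_{g,h}$ is positive.
Proof. Convex combinations and the operation $M \mapsto T_\Gamma[M]$ preserve positivity; so, by induction thanks to the previous lemma, we only need to check the $p = 0$ situation. But then $g = \emptyset$, $h = (h_1)$, and for $b \in L$ we have
\[ \mu_\emptyset(b) = \begin{cases} -1 & \text{if } b \in L_0 \\ 0 & \text{if } b \in (W(\Gamma_0)\setminus L_0) \subset D \times \{1\} \\ 1 & \text{if } b \in S(\Gamma_0) \end{cases} \tag{25} \]
and
\[ \nu_\emptyset(b) = \begin{cases} 0 & \text{if } b \in \Gamma_0 \\ 1 & \text{if } b \in W(\Gamma_0) \cup S(\Gamma_0). \end{cases} \tag{26} \]
Now a straightforward calculation using (14) shows that, in block form, with rows and columns ordered as $\Gamma_0$, $W(\Gamma_0)\cap L_0$, $W(\Gamma_0)\setminus L_0$, $S(\Gamma_0)$, we have
\[ M_{\emptyset,(h_1)} = \begin{pmatrix} \mathbf{1} & h_1\mathbf{1} & 0 & 0 \\ h_1\mathbf{1} & \mathbf{1} & (1-h_1)\mathbf{1} & 0 \\ 0 & (1-h_1)\mathbf{1} & (1-h_1)\mathbf{1} + h_1\,\mathrm{Id} & 0 \\ 0 & 0 & 0 & \mathrm{Id} \end{pmatrix}, \tag{27} \]
i.e.
\[ M_{\emptyset,(h_1)} = h_1 \begin{pmatrix} \mathbf{1} & \mathbf{1} & 0 & 0 \\ \mathbf{1} & \mathbf{1} & 0 & 0 \\ 0 & 0 & \mathrm{Id} & 0 \\ 0 & 0 & 0 & \mathrm{Id} \end{pmatrix} + (1-h_1) \begin{pmatrix} \mathbf{1} & 0 & 0 & 0 \\ 0 & \mathbf{1} & \mathbf{1} & 0 \\ 0 & \mathbf{1} & \mathbf{1} & 0 \\ 0 & 0 & 0 & \mathrm{Id} \end{pmatrix} \tag{28} \]
or
\[ M_{\emptyset,(h_1)} = h_1 M_\emptyset + (1-h_1)\, T_{\Gamma_0}[M_\emptyset], \tag{29} \]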
which is clearly positive. Note that we have shown, en passant, that (17) really starts at $p = 0$, $M_\emptyset$ being the matrix corresponding to a cluster-graph of "length -1".
We need some more notation to proceed. Here $g = (l_1, \dots, l_p)$, $p \ge 0$, is a cluster-graph, and $h = (h_1, \dots, h_{p+1})$ is a decreasing vector of parameters. For any $b \in L$, and any $\alpha$, $0 \le \alpha \le p+1$, we let
\[ \mu_{g,\alpha}(b) := \inf\big(\{i \mid -1 \le i \le \alpha-1,\ b \in W(\Gamma_{i,g})\} \cup \{\alpha\}\big) \tag{30} \]
and
\[ \nu_{g,\alpha}(b) := \inf\big(\{i \mid -1 \le i \le \alpha-1,\ b \in \Gamma_{i,g}\} \cup \{\alpha\}\big). \tag{31} \]
This is the same as the previously defined $\mu_g(b)$ and $\nu_g(b)$, using the truncation $(l_1, \dots, l_{\alpha-1})$ of $g$ instead of the full graph $g$. We also denote for $b, b'$ in $L$,
\[ s\mu_{g,\alpha}(b,b') := \max(\mu_{g,\alpha}(b), \mu_{g,\alpha}(b')), \tag{32} \]
\[ s\nu_{g,\alpha}(b,b') := \max(\nu_{g,\alpha}(b), \nu_{g,\alpha}(b')), \tag{33} \]
and
\[ i\nu_{g,\alpha}(b,b') := \min(\nu_{g,\alpha}(b), \nu_{g,\alpha}(b')). \tag{34} \]
We next define for any $q$, $1 \le q \le p$, the expression $\omega(g,h,q)$ as follows. Let $l_q = \{b, b'\}$ for some $b \neq b'$ in $L$.
• If $b \in \Gamma_{q-1,g}$ and $b' \in W(\Gamma_{q-1,g})$, we let
\[ \omega(g,h,q) := \begin{cases} 0 & \text{if } s\mu \ge i\nu \\ \dfrac{1}{h_{i\nu}} - \dfrac{1}{h_{s\mu}} & \text{if } s\mu < i\nu, \end{cases} \tag{35} \]
where $s\mu$ and $i\nu$ are shorthand for $s\mu_{g,q-1}(b,b')$ and $i\nu_{g,q-1}(b,b')$ respectively. Note that $s\mu_{g,q-1}(b,b') = s\mu_g(b,b') \le q-1$ and $i\nu_{g,q-1}(b,b') = i\nu_g(b,b') \le q-1$. For instance in Fig. 1 the $l_3$ link can appear between (3,2) and (4,1) because in that case $s\mu = 1$ and $i\nu = 2$, but it will not appear between (3,1) and (4,2) because in that case $s\mu = 2$ and $i\nu = 1$, hence the corresponding factor $\omega$ is 0.
• If $b, b' \in W(\Gamma_{q-1,g})$, then we let
\[ \omega(g,h,q) := -\frac{1}{h_{s\mu}} \tag{36} \]
where, again, $s\mu$ is shorthand for $s\mu_{g,q-1}(b,b')$. Note again that
\[ s\mu_{g,q-1}(b,b') = s\mu_g(b,b') \le q-1. \tag{37} \]
• Finally, in every other case for $b$ and $b'$, we let
\[ \omega(g,h,q) := 0. \tag{38} \]
Now let $l = \{b, b'\}$ be an unordered pair of elements of $L$, such that $b = (\Delta, k)$ and $b' = (\Delta', k')$; we then introduce the functional derivation operator:
\[ D_l := \int_{\Delta} dx \int_{\Delta'} dx'\, C(x,x')\, \frac{\delta}{\delta\Phi(x,k)}\, \frac{\delta}{\delta\Phi(x',k')}. \tag{39} \]
We also introduce
\[ R(g,h) := \int d\mu_{C[M_{g,h}]}(\Phi)\, \prod_{q=1}^{p} \big(\omega(g,h,q)\, D_{l_q}\big) \prod_{i=1}^{n} \Phi(x_i,0)\, \exp\Big(-\lambda \sum_{(\Delta,k)\in L_{\Lambda,N}} \int_{\Delta} P(\Phi(x,k))\,dx\Big), \tag{40} \]
the functional derivations acting on any factor to their right. We are now ready to state the main lemma for our expansion scheme.
Lemma 3. For any $m \ge 1$,
\[ H_{\Lambda,N}(x_1,\dots,x_n) = \sum_{0 \le p < m}\ \sum_{g=(l_1,\dots,l_p)} \int_{1>h_1>\cdots>h_p>0} dh_1 \dots dh_p\, R(g,(h_1,\dots,h_p,0)) + \sum_{g=(l_1,\dots,l_m)} \int_{1>h_1>\cdots>h_m>0} dh_1 \dots dh_m\, R(g,(h_1,\dots,h_m,h_m)). \tag{41} \]
The sums on $g$ are on all cluster-graphs with the prescribed length.
Proof. We first prove the lemma for $m = 1$. For that we notice, according to Eq. (29), that
\[ H_{\Lambda,N}(x_1,\dots,x_n) = R(g,h), \tag{42} \]
where $g = \emptyset$ is the empty graph and $h = (h_1)$ with $h_1 = 1$. We then simply write
\[ H_{\Lambda,N}(x_1,\dots,x_n) = R(\emptyset,(0)) + \int_0^1 dh_1\, \frac{d}{dh_1} R(\emptyset,(h_1)). \tag{43} \]
The covariance matrix appearing in $R(\emptyset,(h_1))$ is
\[ M_{\emptyset,(h_1)} = h_1 M_\emptyset + (1-h_1)\, T_{\Gamma_0}[M_\emptyset]. \tag{44} \]
Therefore, the derivation with respect to $h_1$ produces a functional derivation operator acting on the integrand, associated to a matrix element of $M_\emptyset - T_{\Gamma_0}[M_\emptyset]$ (this is obvious by Wick's theorem for polynomial integrands, then true for our smooth decreasing integrand by an easy limiting argument, see [GJ]). That is, we get a sum over $l_1 = \{b,b'\}$ and a factor $(M_\emptyset - T_{\Gamma_0}[M_\emptyset])(b,b')\, D_{l_1}$ in the functional integral defining $R(\emptyset,(h_1))$. It is a simple check to verify, with our previous definitions, that
\[ (M_\emptyset - T_{\Gamma_0}[M_\emptyset])(b,b') = \omega((l_1),(h_1,h_1),1) \tag{45} \]
\[ = \begin{cases} 1 & \text{if } b \in \Gamma_0,\ b' \in W(\Gamma_0)\cap L_0 \\ -1 & \text{if } b \neq b' \in W(\Gamma_0) \text{ and } \{b,b'\} \not\subset L_0. \end{cases} \tag{46} \]
Besides, the covariance matrix $M_{\emptyset,(h_1)}$ involved in the functional integral can be rewritten, according to (17), as $M_{(l_1),(h_1,h_1)}$. Therefore
\[ H_{\Lambda,N}(x_1,\dots,x_n) = R(\emptyset,(0)) + \sum_{l_1} \int_0^1 dh_1\, R((l_1),(h_1,h_1)), \tag{47} \]
which is the wanted result for $m = 1$.
We now prove the induction step from $m \ge 1$ to $m+1$. For this, we simply have to show that, given a cluster-graph $g = (l_1,\dots,l_m)$ of length $m$ and parameters $1 > h_1 > \cdots > h_m > 0$,
\[ R(g,(h_1,\dots,h_m,h_m)) = R(g,(h_1,\dots,h_m,0)) + \sum_{l_{m+1}} \int_0^{h_m} dh_{m+1}\, R((g,l_{m+1}),(h_1,\dots,h_m,h_{m+1},h_{m+1})), \tag{48} \]
which is proven in the same way as for the $m = 1$ case. Indeed, we write
\[ R(g,(h_1,\dots,h_m,h_m)) = R(g,(h_1,\dots,h_m,0)) + \int_0^{h_m} dh_{m+1}\, \frac{d}{dh_{m+1}} R(g,(h_1,\dots,h_m,h_{m+1})) \tag{49} \]
and use (17) to make explicit the dependence on $h_{m+1}$ of the covariance matrix:
\[ M_{g,(h_1,\dots,h_{m+1})} = \frac{h_{m+1}}{h_m}\, M_{g',(h_1,\dots,h_m)} + \Big(1 - \frac{h_{m+1}}{h_m}\Big)\, T_{\Gamma_{m,g}}[M_{g',(h_1,\dots,h_m)}], \tag{50} \]
where $g' = (l_1,\dots,l_{m-1})$. Derivation with respect to $h_{m+1}$ again introduces a sum over a new link $l_{m+1} = \{b,b'\}$, with a corresponding functional derivation operator $D_{l_{m+1}}$ times a factor
\[ \frac{1}{h_m}\Big(M_{g',(h_1,\dots,h_m)} - T_{\Gamma_{m,g}}[M_{g',(h_1,\dots,h_m)}]\Big)(b,b') \tag{51} \]
which is easily checked to be equal to
\[ \omega((g,l_{m+1}),(h_1,\dots,h_{m+1},h_{m+1}),m+1). \tag{52} \]
Indeed, if $b \neq b' \in W(\Gamma_{m,g})$, (51) is equal to
\[ \frac{1}{h_m}\Big(M_{g',(h_1,\dots,h_m)}(b,b') - 1\Big) = \frac{1}{h_m}\Big[h_m\Big(\frac{1}{h_m} - \frac{1}{h_{s\mu_{g'}(b,b')}}\Big) - 1\Big] \tag{53} \]
since $\nu_{g'}(b) = \nu_{g'}(b') = m$. The situation $b \in \Gamma_{m,g}$, $b' \in W(\Gamma_{m,g})$ can be checked in the same way. Finally the involved covariance matrix can be rewritten, thanks to (17), as
\[ M_{g,(h_1,\dots,h_{m+1})} = M_{(g,l_{m+1}),(h_1,\dots,h_{m+1},h_{m+1})} \tag{54} \]
which proves (48).
The easy proof that the cluster-graphs that are summed over in Lemma 3 satisfy the conditions (i) and (ii) stated earlier, is left to the reader.
We are now ready to move on to the proof of Theorem 1. We first notice that, if $g = (l_1,\dots,l_p)$ is a cluster-graph, then $\#(\Gamma_{p,g}) \ge p$; besides, the contribution of $g$ in (41) vanishes if $\Gamma_{p,g}$ is not contained in $D_{\Lambda,N}$ since a functional derivation $\frac{\delta}{\delta\Phi(x,k)}$ would have nothing to contract to. As a result, $p > \#(D_{\Lambda,N})$ implies that $g = (l_1,\dots,l_p)$ gives a zero contribution; it is then straightforward to take the limit $m \to +\infty$ in (41) to write
\[ H_{\Lambda,N}(x_1,\dots,x_n) = \sum_{p=0}^{+\infty}\ \sum_{\substack{g=(l_1,\dots,l_p) \\ \Gamma_{p,g}\subset D_{\Lambda,N}}} \int_{1>h_1>\cdots>h_p>0} dh_1 \dots dh_p\, R(g,(h_1,\dots,h_p,0)). \tag{55} \]
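We can now write an expression for the normalized Schwinger functions since:
\[ S_\Lambda(x_1,\dots,x_n) = \frac{H_{\Lambda,N}(x_1,\dots,x_n)}{Z(\Lambda) \times Z_0^{N|\Lambda|}} \tag{56} \]
\[ = \sum_{p=0}^{+\infty}\ \sum_{\substack{g=(l_1,\dots,l_p) \\ \Gamma_{p,g}\subset D_{\Lambda,N}}} A(g,\Lambda,N), \tag{57} \]
where
\[ A(g,\Lambda,N) := \frac{A_0(g)}{Z_0^{\#(\Gamma_{p,g})}} \times \frac{Z(Y_g)\cdot Z_0^{\#(\Lambda)-\#(Y_g)}}{Z(\Lambda)} \tag{58} \]
with the following notations.
• First, $Y_g := \{\Delta \in \Lambda \mid h_{\Gamma_{p,g}}(\Delta) < N\}$.
• Next, $Z(Y_g)$ is defined as in (3) of Sect. 2 by
\[ Z(Y_g) := \int d\mu_C(\phi)\, \exp\Big(-\lambda \sum_{\Delta\in Y_g} \int_{\Delta} P(\phi(x))\,dx\Big) \tag{59} \]
with a free boundary condition covariance.
• Finally, $A_0(g)$ is defined, independently of $\Lambda$ and $N$, by
\[ A_0(g) := \int_{1>h_1>\cdots>h_p>0} dh_1 \dots dh_p \int d\mu_{C[M_{g,(h_1,\dots,h_p)}]}(\Phi)\, \prod_{q=1}^{p} \big(\omega(g,(h,0),q)\, D_{l_q}\big) \prod_{i=1}^{n} \Phi(x_i,0)\, \exp\Big(-\lambda \sum_{(\Delta,k)\in\Gamma_{p,g}} \int_{\Delta} P(\Phi(x,k))\,dx\Big), \tag{60} \]
where
\[ M_{g,(h_1,\dots,h_p)}(b,b') := \begin{cases} M_{g,(h_1,\dots,h_p,0)}(b,b') & \text{if } b, b' \in \Gamma_{p,g} \\ 0 & \text{otherwise.} \end{cases} \tag{61} \]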
The factorization (58) stems from the fact that the parameter vectors involved in (55) have a null last component, and therefore the corresponding covariance matrix is
\[ M_{g,(h_1,\dots,h_p,0)} = T_{\Gamma_{p,g}}[M_{g,(h_1,\dots,h_p,0)}], \tag{62} \]
which completely couples together the cubes of $W(\Gamma_{p,g})$ and decouples them from the rest of $L$. This accounts for the factor $Z(Y_g)$ which might be different from $Z(\Lambda)$, in case $\Gamma_{p,g}$ reaches the highest cubes of $L_{\Lambda,N}$ which contain all interaction terms of the form $\exp(-\lambda \int_\Delta P(\Phi(x,k))\,dx)$. For a given $g$, $A(g,\Lambda,N) = A_0(g) / Z_0^{\#(\Gamma_{p,g})}$ as soon as $N > \max\{h_{\Gamma_{p,g}}(\Delta) \mid \Delta \in D\}$, which is finite. Besides, the only dependence in $\Lambda$ is embodied in the condition $\Gamma_{p,g} \subset L_{\Lambda,N}$. We will then show in the next section that there exists a positive function $B(g)$ of cluster-graphs $g$, depending on $\lambda$, such that, for small $\lambda$,
\[ \sum_g B(g) < +\infty, \tag{63} \]
where the sum is without restriction on $g$, and such that
\[ |A(g,\Lambda,N)| \le B(g) \tag{64} \]
for any $g$, $\Lambda$, and $N$ satisfying $\Gamma_{p,g} \subset L_{\Lambda,N}$ and $N \ge \#(\Lambda)$. The discrete version of the Lebesgue dominated convergence theorem will thus allow us to first take the limit $N \to +\infty$ and then the limit $\Lambda \to \mathbb{R}^d$ in (57), thereby proving Theorem 1. The next section is devoted to finding a uniform estimate $B(g)$ which does the job.

4. The Uniform Estimates

We first use a very coarse bound for the "parasite" factors in (58) of Sect. 3.
Lemma 4.
\[ 0 < \frac{1}{Z_0^{\#(\Gamma_{p,g})}} \times \frac{Z(Y_g)\cdot Z_0^{\#(\Lambda)-\#(Y_g)}}{Z(\Lambda)} \le \exp\big(2K_3\lambda\,\#(\Gamma_{p,g})\big), \tag{1} \]
where
\[ K_3 := K_2\Big(1 + \frac{(2m)!}{2^m\, m!}\, C(0,0)\Big). \tag{2} \]
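Proof. Indeed as we derived in Sect. 2 a lower bound for $Z(\Lambda)$, it is easy to do the same with $Z_0$ and $Z(Y_g)$, from which we obtain the three estimates
\[ 1 \ge Z(\Lambda) \ge \exp(-K_3\lambda\,\#(\Lambda)), \tag{3} \]
\[ 1 \ge Z(Y_g) \ge \exp(-K_3\lambda\,\#(Y_g)), \tag{4} \]
and
\[ 1 \ge Z_0 \ge \exp(-K_3\lambda). \tag{5} \]
Now given $g$, $\Lambda$ and $N$, with $N \ge \#(\Lambda)$, we have two possible situations:
1st case. $Y_g = \Lambda$. Then
\[ \frac{1}{Z_0^{\#(\Gamma_{p,g})}} \times \frac{Z(Y_g)\cdot Z_0^{\#(\Lambda)-\#(Y_g)}}{Z(\Lambda)} = Z_0^{-\#(\Gamma_{p,g})} \le \exp\big(K_3\lambda\,\#(\Gamma_{p,g})\big). \tag{6} \]
2nd case. $Y_g \subset \Lambda$ and $Y_g \neq \Lambda$. Then $N \le \max\{h_{\Gamma_{p,g}}(\Delta) \mid \Delta \in D\}$ from the remarks at the end of Sect. 3. But $\#(\Lambda) \le N$ and $\max\{h_{\Gamma_{p,g}}(\Delta) \mid \Delta \in D\} \le \#(\Gamma_{p,g})$ so that $\#(\Lambda) \le \#(\Gamma_{p,g})$ and thus
\[ \frac{1}{Z_0^{\#(\Gamma_{p,g})}} \times \frac{Z(Y_g)\cdot Z_0^{\#(\Lambda)-\#(Y_g)}}{Z(\Lambda)} \le Z_0^{-\#(\Gamma_{p,g})} \cdot Z(\Lambda)^{-1} \tag{7} \]
\[ \le \exp\big(2K_3\lambda\,\#(\Gamma_{p,g})\big). \tag{8} \]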
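The two cases of the proof use nothing beyond the estimates (3)-(5) and, in the second case, the inequality $\#(\Lambda) \le \#(\Gamma_{p,g})$. The sketch below samples partition-function values compatible with those estimates and checks the bound (1) numerically; the names, the sampling ranges and the single parameter `k3_lam` standing for the product $K_3\lambda$ are illustrative assumptions.

```python
import math
import random

def parasite_factor(z0, z_lam, z_y, n_gamma, n_lam, n_y):
    """Left-hand side of (1): Z0^(-#Gamma) * Z(Y_g) * Z0^(#Lambda - #Y_g) / Z(Lambda)."""
    return z0 ** (-n_gamma) * z_y * z0 ** (n_lam - n_y) / z_lam

def lemma4_sampled_ok(trials=2000, seed=0):
    """Sample values allowed by (3)-(5) and test the bound (1) in both cases."""
    rng = random.Random(seed)
    for _ in range(trials):
        k3_lam = rng.uniform(0.0, 0.5)          # stands for K3 * lambda
        n_gamma = rng.randint(1, 30)            # #(Gamma_{p,g})
        if rng.random() < 0.5:
            # 1st case: Y_g = Lambda, so Z(Y_g) = Z(Lambda)
            n_lam = rng.randint(1, 60)
            n_y = n_lam
        else:
            # 2nd case: Y_g strictly inside Lambda, hence #Lambda <= #Gamma
            n_lam = rng.randint(1, n_gamma)
            n_y = rng.randint(0, n_lam - 1)
        z0 = math.exp(-k3_lam * rng.random())
        z_lam = math.exp(-k3_lam * n_lam * rng.random())
        z_y = z_lam if n_y == n_lam else math.exp(-k3_lam * n_y * rng.random())
        value = parasite_factor(z0, z_lam, z_y, n_gamma, n_lam, n_y)
        bound = math.exp(2 * k3_lam * n_gamma)
        if not (0 < value <= bound * (1 + 1e-9)):
            return False
    return True
```

A random search of this kind cannot prove the lemma; it only illustrates that the factor 2 in the exponent is what absorbs $Z(\Lambda)^{-1}$ in the second case.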
We now need a few lemmas to bound $A_0(g)$.
Lemma 5. If $b = (\Delta,k) \in \Gamma_{p,g}$, and $\Delta' \in D$, then
\[ \sum_{k' \ge 0} M_{g,(h_1,\dots,h_p)}(b,(\Delta',k')) \le 1. \tag{9} \]
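Proof. Let us denote $b' = (\Delta',k')$. Now only $b' \in \Gamma_{p,g}$ contributes. Besides, either $b = b'$ or $s\mu_g(b,b') < i\nu_g(b,b')$ is needed for $M_{g,(h_1,\dots,h_p)}(b,b') \neq 0$. Now remark that, for any $c \in \Gamma_{p,g}$, $\mu_g(c) \le j < \nu_g(c)$ is equivalent to $c \in W(\Gamma_{j,g})$. Therefore $s\mu_g(b,b') < i\nu_g(b,b')$ means that there is $i$, $-1 \le i \le p$, such that both $b$ and $b'$ belong to $W(\Gamma_{i,g})$.
1st case. $\Delta = \Delta'$. Since $W(\Gamma)$ has a unique cube with a given $\Delta$, whatever the cluster $\Gamma$ is, the only contribution comes from $k' = k$, which gives 1 and satisfies the inequality.
2nd case. $\Delta \neq \Delta'$. Let $[k_1,k_2] := \{k' \mid \exists i,\ \mu_g(b) \le i < \nu_g(b),\ (\Delta',k') \in W(\Gamma_{i,g})\}$. First notice that when $k_1 = k_2$ the sum reduces to one term and the bound is trivial, therefore we suppose from now on that $k_2 \ge k_1 + 1$. We let $\mu_{k'} := \mu_g((\Delta',k'))$ and $\nu_{k'} := \nu_g((\Delta',k'))$. If $b' = (\Delta',k')$ with $k_1 < k' < k_2$, it follows from the definition of a cluster-graph like $g$ that we have $\nu_{k'} = \mu_{k'+1}$, $\mu_{k'} > \mu_g(b)$ and $\nu_{k'} < \nu_g(b)$. Therefore
\[ \sum_{k_1 < k' < k_2} M_{g,(h_1,\dots,h_p)}(b,(\Delta',k')) = \sum_{k_1 < k' < k_2} h_{\nu_g(b)}\Big(\frac{1}{h_{\mu_{k'+1}}} - \frac{1}{h_{\mu_{k'}}}\Big) \tag{10} \]
\[ = h_{\nu_g(b)}\Big(\frac{1}{h_{\mu_{k_2}}} - \frac{1}{h_{\mu_{k_1+1}}}\Big). \tag{11} \]
One also checks easily that the contribution of $k' = k_1$ is
\[ h_{\nu_g(b)}\Big(\frac{1}{h_{\mu_{k_1+1}}} - \frac{1}{h_{\mu_g(b)}}\Big) \tag{12} \]
and that of $k' = k_2$ is
\[ h_{\nu_{k_2}}\Big(\frac{1}{h_{\nu_g(b)}} - \frac{1}{h_{\mu_{k_2}}}\Big). \tag{13} \]
Therefore
\[ \sum_{k' \ge 0} M_{g,(h_1,\dots,h_p)}(b,(\Delta',k')) = \frac{h_{\nu_g(b)}}{h_{\mu_{k_2}}} - \frac{h_{\nu_g(b)}}{h_{\mu_g(b)}} + \frac{h_{\nu_{k_2}}}{h_{\nu_g(b)}} - \frac{h_{\nu_{k_2}}}{h_{\mu_{k_2}}}. \tag{14} \]
But, from $\mu_g(b) < \mu_{k_2} < \nu_g(b) \le \nu_{k_2}$, it follows that there are $\alpha, \beta, \gamma \in [0,1]$ such that $h_{\nu_{k_2}} = \alpha\, h_{\nu_g(b)}$, $h_{\nu_g(b)} = \beta\, h_{\mu_{k_2}}$ and $h_{\mu_{k_2}} = \gamma\, h_{\mu_g(b)}$. Thus
\[ \sum_{k' \ge 0} M_{g,(h_1,\dots,h_p)}(b,(\Delta',k')) = \beta - \beta\gamma + \alpha - \alpha\beta \tag{15} \]
\[ \le \beta + \alpha - \alpha\beta \tag{16} \]
\[ \le 1 - (1-\alpha)(1-\beta) \tag{17} \]
\[ \le 1, \tag{18} \]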
which proves the assertion.
As a consequence of this lemma we have a bound
\[ C[M_{g,(h_1,\dots,h_p)}](x,k;x',k') \le G((\Delta(x),k),(\Delta(x'),k')), \tag{19} \]
where the function $G(b,b')$ on $L^2$ satisfies
\[ \forall b \in L, \quad \sum_{b' \in L} G(b,b') \le K_4 \tag{20} \]
for some constant $K_4$. Indeed,
\[ G((\Delta,k),(\Delta',k')) := M_{g,(h_1,\dots,h_p)}((\Delta,k),(\Delta',k')) \times K_1(d+1) \times (1 + d(\Delta,\Delta'))^{-(d+1)} \tag{21} \]
with $d(\Delta,\Delta') := \min\{|x-y| \mid x \in \Delta,\ y \in \Delta'\}$ works, since the sum over $k'$, by Lemma 5, is no greater than 1, and the sum over $\Delta'$ is bounded by the rapid decay (1) of the propagator. Note that $K_4$, unlike $G(b,b')$, is independent of $g$ and $(h_1,\dots,h_p)$.
Lemma 6 (The principle of local factorials). We have the bound:
\[ \Big|\int d\mu_{C[M_{g,(h_1,\dots,h_p)}]}(\Phi)\ \Phi(z_1,k_1) \cdots \Phi(z_r,k_r)\Big| \le K_5^r \times \prod_{b \in L} \sqrt{n(b)!}, \tag{22} \]
where $n(b) := \#(\{j \mid 1 \le j \le r,\ (\Delta(z_j),k_j) = b\})$ and $K_5$ is a constant.
Proof. Using Wick's theorem, the functional integral can be computed as a sum over contractions $c$ of the fields $\Phi(z_j,k_j)$, with the propagator of $C[M_{g,(h_1,\dots,h_p)}]$. Here $c$ is simply an involution without fixed points of the set $J = \{1,\dots,r\}$. We get
\[ \Big|\int d\mu_{C[M_{g,(h_1,\dots,h_p)}]}(\Phi)\ \Phi(z_1,k_1) \cdots \Phi(z_r,k_r)\Big| = \Big|\sum_c \prod_{\substack{\{j,j'\}\subset J \\ j'=c(j)}} C[M_{g,(h_1,\dots,h_p)}](z_j,k_j;z_{c(j)},k_{c(j)})\Big| \tag{23} \]
\[ \le \sum_c \prod_{\substack{\{j,j'\}\subset J \\ j'=c(j)}} G(b_j,b_{c(j)}), \tag{24} \]
where $b_j$ denotes $(\Delta(z_j),k_j) \in L$. Suppose we have ordered $J$ as $\{j_1,\dots,j_r\}$ such that $n(b_{j_1}) \ge n(b_{j_2}) \ge \cdots \ge n(b_{j_r})$. To sum over $c(j_1)$, we first sum over $b_{c(j_1)}$, then over $c(j_1)$ knowing $b_{c(j_1)}$. The sum over $b_{c(j_1)}$ is bounded by $K_4$. The sum over $c(j_1)$ knowing $b_{c(j_1)}$ costs a factor $n(b_{c(j_1)}) \le \sqrt{n(b_{j_1})\, n(b_{c(j_1)})}$ because of the ordering of $J$. We now pick the element $j$ with the smallest label in $J \setminus \{j_1, c(j_1)\}$, and sum over $c(j)$ in the same way, thus getting a factor $K_4 \sqrt{n(b_j)\, n(b_{c(j)})}$, and so on. Since each $\sqrt{n(b_j)}$ appears exactly once by definition of a contraction $c$, we obtain a bound
\[ K_4^{r/2} \times \prod_{j \in J} \sqrt{n(b_j)} = K_4^{r/2} \times \prod_{\substack{b \in L \\ n(b) \neq 0}} \sqrt{n(b)^{n(b)}} \tag{25} \]
\[ \le K_5^r \prod_{b \in L} \sqrt{n(b)!} \tag{26} \]
with $K_5 := \sqrt{e K_4}$.
A0 (g) = A0 (g, ρ),
(27)
ρ
where ρ is a derivation procedure for the operators Dlq and A0 (g, ρ) is the contribution p of ρ in the expansion that computes the action of q=1 Dlq on the integrand n
%(xi , 0) exp −λ P (%(x, k))dx . (28) (,k)∈(p,g
i=1
When considering the expression for A0 (g, ρ), we take out of the functional pintegral all the ω(g, (h, 0), q) factors, as well as the C(x, x ) factors coming from q=1 Dlq , and also the spatial integrations dx that come from the Dlq , as well as all numerical factors such as λ or the coefficients of the polynomial P . The resulting expression is a functional integral of the form: I = dµC˜ (%) %(z1 , k1 ) . . . %(zv , kv )
exp −λ P (%(x, k))dx , (29) (,k)∈(p,g
where C˜ denotes C[Mg,(h1 ,... ,hp ) ]. We bound it using |I| ≤ dµC˜ (%) |%(z1 , k1 ) . . . %(zv , kv )| exp(λK6 #((p,g )),
(30)
where K6 = min{P (x)|x ∈ IR}. Then by the Cauchy-Schwartz inequality, |I| ≤ exp(λK6 #((p,g ))
dµC˜ (%) %(z1 , k1 )2 . . . %(zv , kv )2 .
(31)
200
A. Abdesselam, J. Magnen, V. Rivasseau
Now we bound the functional integral in the last inequality using Lemma 6 thus obtaining: |I| ≤ exp(λK6 #((p,g )) × K5v ×
1
(2ng,ρ (b))! 4 ,
(32)
b∈L def
where ng,ρ (b) = #({j |1 ≤ j ≤ v, ((zj ), kj ) = b}). We now explain the bound on the sum over the derivation procedures ρ that act on n
%(xi , 0) exp −λ P (%(x, k))dx . (33) (,k)∈(p,g
i=1
First we bound the propagators $C(x,x')$ corresponding to a $D_{l_q}$ with $l_q = \{(\Delta,k),(\Delta',k')\}$ by $K_1(r)(1 + d(\Delta,\Delta'))^{-r}$. The exponent $r$ will be adjusted later. We also bound the spatial integrations $\int_\Delta dx$ by 1. Since each $(\Delta,k) \in \Gamma_{p,g} \setminus \Gamma_0$ belongs to an $l_q$, there is at least a $\frac{\delta}{\delta\Phi}$ that acts on the corresponding interaction term $\exp(-\lambda \int_\Delta P(\Phi(x,k))\,dx)$; therefore there is at least a factor $\lambda^{\#(\Gamma_{p,g}) - \#(\Gamma_0)}$, and possibly some more factors $\lambda$ that we bound by 1 as we assume from now on that $\lambda \le 1$. We also introduce the notation $\|P\|$ for the maximum absolute value of the coefficients of the polynomial $P$. Note that each $\frac{\delta}{\delta\Phi(x,k)}$ can derive an interaction term, and thus generate a coefficient of $P$. We therefore globally bound these factors by $(1 + \|P\|)^{2p}$.
We let $n_g(b) := \#(\{q \mid 1 \le q \le p,\ b \in l_q\})$, i.e. the coordination number of $b$ with respect to the graph $g$, for any $b \in \Gamma_{p,g}$. We also let $s(b) := \#(\{i \mid 1 \le i \le n,\ b = (\Delta(x_i),0)\})$, which counts the sources located in $b$.
Choose an arbitrary order to perform the functional derivations. Let $\frac{\delta}{\delta\Phi(x,k)}$ be the one performed last. It is located in $b = (\Delta(x),k)$, and can either derive one of the sources, which gives $s(b)$ possibilities. It can also derive a new vertex from the interaction $\exp(-\lambda \int_{\Delta(x)} P(\Phi(y,k))\,dy)$; we then have to choose the derived monomial in $P$, and the field in the monomial, which gives at most $(2m)^2$ new possibilities. Finally it can rederive a vertex that was derived for the first time by a previously performed functional derivation $\frac{\delta}{\delta\Phi(x',k)}$ that is also located in $b$. This gives a total number of possibilities, for $\frac{\delta}{\delta\Phi(x,k)}$, that is bounded by $s(b) + 4m^2 n_g(b)$. We then do the same sum over the ways of computing the before last functional derivation, and so on. It follows that the number of derivation procedures $\rho$ is bounded by
\[ \prod_{b \in \Gamma_{p,g}} \big(s(b) + 4m^2 n_g(b)\big)^{n_g(b)}, \tag{34} \]
since there are $n_g(b)$ functional derivations in each $b$. We write for convenience
\[ \prod_{b \in \Gamma_{p,g}} \big(s(b) + 4m^2 n_g(b)\big)^{n_g(b)} \le \prod_{b \in \Gamma_{p,g}} n_g(b)!\ e^{s(b) + 4m^2 n_g(b)} \tag{35} \]
\[ \le e^{n + 8m^2 p} \prod_{b \in \Gamma_{p,g}} n_g(b)!. \tag{36} \]
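Inequality (35) is the bound $x^k \le k!\, e^x$ applied box by box, and (36) then uses $\sum_b s(b) \le n$ and $\sum_b n_g(b) = 2p$. The sketch below checks the resulting chain in logarithmic form on sample data; the function names and the sample lists are illustrative assumptions.

```python
import math

def log_count_bound(sources, coord, m):
    """Logarithm of the product (34) over boxes; boxes with n_g(b) = 0 contribute 1."""
    return sum(
        k * math.log(s + 4 * m * m * k)
        for s, k in zip(sources, coord)
        if k > 0
    )

def log_rhs_36(sources, coord, m):
    """Logarithm of the right-hand side of (36), with n = sum of s(b) and 2p = sum of n_g(b)."""
    n_sources = sum(sources)
    twice_p = sum(coord)
    log_factorials = sum(math.lgamma(k + 1) for k in coord)
    return n_sources + 4 * m * m * twice_p + log_factorials

def chain_35_36_ok(sources, coord, m):
    return log_count_bound(sources, coord, m) <= log_rhs_36(sources, coord, m) + 1e-9

# Sample data: s(b) and n_g(b) for five boxes, interaction of degree 2m = 4.
sample_ok = chain_35_36_ok([2, 0, 1, 0, 0], [3, 1, 0, 7, 1], m=2)
```

In this form the exponent $8m^2 p$ of (36) appears as $4m^2$ times the total coordination number $2p$.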
Now note that in (32), $v \le n + 4mp$, and for each $b$, $n_{g,\rho}(b) \le s(b) + 2m\, n_g(b)$. As a result, the previous bound on $I$ becomes
\[ |I| \le \exp(K_6\, \#(\Gamma_{p,g})) \times (1 + K_5)^{n+4mp} \prod_{b \in \Gamma_{p,g}} \big((2s(b) + 4m\, n_g(b))!\big)^{1/4} \tag{37} \]
\[ \le \exp(K_6\, \#(\Gamma_{p,g})) \times (1 + K_5)^{n+4mp} \times \prod_{b \in \Gamma_{p,g}} \Big[\sqrt{s(b)!} \times (n_g(b)!)^m \times \exp\big(3m\, s(b) + 6m^2 n_g(b)\big)\Big] \tag{38} \]
\[ \le \exp(K_6\, \#(\Gamma_{p,g})) \times (1 + K_5)^{n+4mp} \times \sqrt{n!} \times e^{3mn + 12m^2 p} \times \prod_{b \in \Gamma_{p,g}} (n_g(b)!)^m. \tag{39} \]
We are now able to write a raw bound on $A_0(g)$ as:
\[ |A_0(g)| \le \lambda^{\#(\Gamma_{p,g}) - \#(\Gamma_0)} \times \int_{1>h_1>\cdots>h_p>0} dh_1 \dots dh_p \prod_{q=1}^{p} |\omega(g,(h,0),q)| \times \prod_{q=1}^{p} K_1(r)(1 + d(\Delta_q,\Delta'_q))^{-r} \times (1 + \|P\|)^{2p} \times \exp(K_6\, \#(\Gamma_{p,g})) \times (1 + K_5)^{n+4mp} \times \sqrt{n!} \times e^{(3m+1)n + 20m^2 p} \times \prod_{b \in \Gamma_{p,g}} (n_g(b)!)^{m+1}, \tag{40} \]
where $\Delta_q$, $\Delta'_q$ are such that $l_q = \{(\Delta_q,k_q),(\Delta'_q,k'_q)\}$, for some $k_q$ and $k'_q$. The right-hand side is not quite $B(g)$; we need first to get rid of the local factorials $n_g(b)!$. This requires a volume argument and the next two lemmas.
Lemma 7. If $g = (l_1,\dots,l_p)$ is a cluster-graph with $A_0(g) \neq 0$, and $l_{q_\alpha} = \{b_\alpha, b'_\alpha\}$, $1 \le \alpha \le 3$, are three links in $g$ such that $q_1 < q_2 < q_3$ and $b_1 = b_2 = b_3$; then $b'_1$, $b'_2$ and $b'_3$ cannot all be of the form $(\Delta',k'_\alpha)$ with the same $\Delta' \in D$.
Proof. Ad absurdum. Let $b = b_1 = b_2 = b_3 = (\Delta,k)$, and $b'_\alpha = (\Delta',k'_\alpha)$, $1 \le \alpha \le 3$. Since $l_q \not\subset \Gamma_{q-1,g}$ for any $q$, and since $q_1 < q_2 < q_3$, we have that $k'_1$, $k'_2$ and $k'_3$ are distinct. We even have $k'_1 < k'_2 < k'_3$. Indeed, if for instance $k'_2 < k'_1$, since $l_{q_1} = \{b,(\Delta',k'_1)\} \subset \Gamma_{q_1,g}$ and $\Gamma_{q_1,g}$ is a cluster, it would follow that $(\Delta',k'_2) \in \Gamma_{q_1,g}$ and thus $l_{q_2} \subset \Gamma_{q_1,g} \subset \Gamma_{q_2-1,g}$, which is not allowed. Now if we only consider $l_{q_1}$ and $l_{q_2}$, since $b \in \Gamma_{q_2-1,g}$, $l_{q_2}$ can only be of type cluster-roof, and $\omega(g,(h,0),q_2) \neq 0$ implies $s\mu_{g,q_2-1}(b,b'_2) < i\nu_{g,q_2-1}(b,b'_2)$. That is, there exists $q < q_2$ such that $b, b'_2 \in W(\Gamma_{q,g})$. Thus $b \notin \Gamma_{q,g}$ and therefore $q < q_1$. Besides, $b'_2 \in W(\Gamma_{q,g})$ and $k'_2 > k'_1$ implies $b'_1 \in \Gamma_{q,g} \subset \Gamma_{q_1-1,g}$. But $l_{q_1} \not\subset \Gamma_{q_1-1,g}$, therefore $b \notin \Gamma_{q_1-1,g}$. As a result, $\mu_g(b'_1) < \mu_g(b) = q_1 - 1$. We can now do the same reasoning, considering $l_{q_2}$ and $l_{q_3}$ this time, to conclude $\mu_g(b'_2) < \mu_g(b) = q_2 - 1$ as well, which gives a different value for $\mu_g(b)$ and yields a contradiction.
Lemma 8 (The volume argument). We have, with the notations of (40),
\[ \prod_{b \in \Gamma_{p,g}} (n_g(b)!)^{m+1} \times \prod_{q=1}^{p} (1 + d(\Delta_q,\Delta'_q))^{-r_1} \le K_7^p \tag{41} \]
for some constants $r_1$ and $K_7$ that only depend on the dimension $d$ and the degree $2m$ of the interaction.
Proof. We let $r_1 = 4d(m+2)$. We now write
\[ \prod_{b \in \Gamma_{p,g}} (n_g(b)!)^{m+1} \times \prod_{q=1}^{p} (1 + d(\Delta_q,\Delta'_q))^{-r_1} = \prod_{\substack{b \in \Gamma_{p,g} \\ n_g(b) \ge 1}} \xi(b) \tag{42} \]
with
\[ \xi(b) := (n_g(b)!)^{m+1} \times \prod_{b' \text{ linked to } b} (1 + d(\Delta(b),\Delta(b')))^{-r_1/2}, \tag{43} \]
where the product is over all $b' \in \Gamma_{p,g}$ such that $\{b,b'\}$ is a link of $g$, and $\Delta(b)$ denotes the first projection on $D$ of the pair $b \in L$. Now it follows from Lemma 7 that there cannot be more than two cubes $b'$, with the same $\Delta(b')$, linked to $b$. Remark that there is a constant $K$ such that for $\delta$ big enough,
\[ \#(\{\Delta' \in D \mid d(\Delta(b),\Delta') \le \delta\}) \le K\delta^d. \tag{44} \]
Therefore
\[ \#(\{b' \in \Gamma_{p,g} \mid b' \text{ linked to } b,\ d(\Delta(b),\Delta(b')) \le \delta\}) \le 2K\delta^d. \tag{45} \]
If $n_g(b)$ is big enough and if we set $\delta = \big(\frac{n_g(b)}{4K}\big)^{1/d}$, it follows that at least $\frac{n_g(b)}{2}$ cubes $b'$ that are linked to $b$ satisfy $d(\Delta(b),\Delta(b')) > \delta$. As a result:
\[ \xi(b) \le (n_g(b)!)^{m+1} \times (1 + \delta)^{-\frac{r_1 n_g(b)}{4}} \tag{46} \]
\[ \le n_g(b)^{(m+1) n_g(b)} \times \Big(\frac{n_g(b)}{4K}\Big)^{-\frac{r_1 n_g(b)}{4d}} \tag{47} \]
\[ \le n_g(b)^{-n_g(b)} \times (4K)^{\frac{r_1 n_g(b)}{4d}} \tag{48} \]
because of our choice for $r_1$. It easily follows that $\xi(b) \le K'^{\,n_g(b)}$ for some constant $K' \ge 1$, for any value of $n_g(b)$. Taking $K_7 := K'^2$ concludes the proof of the lemma.
We now return to (40) and proceed to define the bounding term $B(g)$. First we choose $r = r_1 + d + 1$. Next we note that $\#(\Gamma_{p,g}) - \#(\Gamma_0) \ge p$ and $\#(\Gamma_{p,g}) \le 2p + n$. Combining Lemma 4, (40) and Lemma 8, we now easily obtain a bound
\[ |A(g,\Lambda,N)| \le K_8(n)\, K_9^p\, \lambda^p \times \int_{1>h_1>\cdots>h_p>0} dh_1 \dots dh_p \prod_{q=1}^{p} |\omega(g,(h,0),q)|\, (1 + d(\Delta_q,\Delta'_q))^{-(d+1)}, \tag{49} \]
where $K_8(n)$ and $K_9$ are independent of $g$, $\Lambda$ and $N$. We let $B(g)$ be the right-hand side of (49). The proof of Theorem 1 will be complete when we prove the following result.
Proposition 1. There exists $\lambda_0 > 0$ such that for any $\lambda \in [0,\lambda_0]$,
\[ \sum_g B(g) < +\infty, \tag{50} \]
where the cluster-graph $g$ is summed without any restriction of volume in $L$.
Proof. For any cluster-graph $g$ with nonzero contribution, we define the following function $\sigma_g : \{1,\dots,p\} \to \{0,\dots,p-1\}$. Let $q$, $1 \le q \le p$, and $l_q = \{b_q, b'_q\}$, and let $\bar b_q$ and $\bar b'_q$ be the two elements of $W(\Gamma_{q-1,g})$ with the same first projection on $D$ as $b_q$ and $b'_q$ respectively. We pose, by definition,
\[ \sigma_g(q) := \max(\mu_g(\bar b_q), \mu_g(\bar b'_q)) < q. \tag{51} \]
Note that, indeed, $\sigma_g(q) \ge 0$, otherwise we would have $\bar b_q, \bar b'_q \in W_{-1} = L_0$ and therefore also $b_q, b'_q \in W_{-1}$, which would give $\omega(g,(h,0),q) = 0$ and a zero contribution for $g$. We will first bound the conditional sum on $g$, knowing $\sigma_g$. We start by summing over the last link $l_p$ knowing $g' = (l_1,\dots,l_{p-1})$ and $\sigma_g$. We first perform the sum over $l_p = \{b_p, b'_p\}$ with $b_p = (\Delta_p,k)$ and $b'_p = (\Delta'_p,k')$, knowing $\Delta_p$ and $\Delta'_p$. This is done thanks to the factor $|\omega(g,(h,0),p)|$ as in Lemma 5. Note that there are three cases.
1st case. $l_p$ is a roof-roof link. In this situation $b_p, b'_p \in W(\Gamma_{p-1,g})$ and thus $b_p = \bar b_p$, $b'_p = \bar b'_p$ and
\[ |\omega(g,(h,0),p)| = \Big|-\frac{1}{h_{s\mu_g(b_p,b'_p)}}\Big| = \frac{1}{h_{\sigma_g(p)}}. \tag{52} \]
2nd case. $l_p$ is cluster-roof, with $b_p \in W(\Gamma_{p-1,g})$ and $b'_p \in \Gamma_{p-1,g}$. Then $b_p = \bar b_p$ is unique, and we have to sum over the second projection $k'$ of $b'_p$, $0 \le k' \le h_{\Gamma_{p-1,g}}(\Delta'_p)$, with the condition that $s\mu_g(b_p,b'_p) < i\nu_g(b_p,b'_p)$. We obtain, the previous condition being implicit in the following sums:
\[ \sum_{k'} |\omega(g,(h,0),p)| = \sum_{k'} \Big(\frac{1}{h_{i\nu_g(b_p,b'_p)}} - \frac{1}{h_{s\mu_g(b_p,b'_p)}}\Big). \tag{53} \]
Note that $\Delta_p \neq \Delta'_p$ as no link is vertical. We let
\[ [k_1,k_2] := \{k' \mid 0 \le k' \le h_{\Gamma_{p-1,g}}(\Delta'_p) \text{ and } \exists i,\ \mu_g(b_p) \le i \le p-1,\ (\Delta'_p,k') \in W(\Gamma_{i,g})\}. \tag{54} \]
With the notation $\mu_{k'} = \mu_g((\Delta'_p,k'))$ and $\nu_{k'} = \nu_g((\Delta'_p,k'))$, we have that for any $k'$, $k_1 \le k' < k_2$, $\mu_{k'+1} = \nu_{k'}$. Note also that $\nu_{k_2} = \mu_g(\bar b'_p)$. Therefore
\[ \sum_{k'} \Big(\frac{1}{h_{i\nu_g(b_p,b'_p)}} - \frac{1}{h_{s\mu_g(b_p,b'_p)}}\Big) = \sum_{k_1 < k' \le k_2} \Big(\frac{1}{h_{\mu_{k'+1}}} - \frac{1}{h_{\mu_{k'}}}\Big) + \Big(\frac{1}{h_{\mu_{k_1+1}}} - \frac{1}{h_{\mu_g(b_p)}}\Big) \tag{55} \]
\[ = \frac{1}{h_{\mu_g(\bar b'_p)}} - \frac{1}{h_{\mu_g(b_p)}}, \tag{56} \]
which is positive, since $\mu_g(\bar b'_p) \ge \mu_g(b_p)$ is necessary for the existence of cluster-roof links $\{b_p, b'_p\}$ with $b'_p$ under $\bar b'_p$. Finally
\[ \sum_{k'} |\omega(g,(h,0),p)| \le \frac{1}{h_{\mu_g(\bar b'_p)}} = \frac{1}{h_{\sigma_g(p)}}. \tag{57} \]
3rd case. $l_p$ is cluster-roof, with $b'_p \in W(\Gamma_{p-1,g})$ and $b_p \in \Gamma_{p-1,g}$. This symmetric counterpart of the 2nd case is treated in the same way, giving a bound of $\frac{1}{h_{\sigma_g(p)}}$ again.
So summing on $l_p$, knowing $\Delta_p$ and $\Delta'_p$, gives a bound of $\frac{3}{h_{\sigma_g(p)}}$.
We then need to sum over the unordered pair $\{\Delta_p, \Delta'_p\}$, knowing $g' = (l_1,\dots,l_{p-1})$ and $\sigma_g$. Note that one of the cubes $\bar b_p$ and $\bar b'_p$ has a $\mu_g$ equal to $\sigma_g(p)$. Assume it is $\bar b_p$ for instance. Since $\sigma_g(p) = \mu_g(\bar b_p) \ge 0$, we have that $\bar b_p \notin L_0$. There is then a unique box $b$ just under $\bar b_p$, i.e. such that $b = (\Delta_p,k-1)$ if $\bar b_p = (\Delta_p,k)$. We then have $\nu_g(b) = \mu_g(\bar b_p) = \sigma_g(p)$. Either $\sigma_g(p) = 0$, in which case $b \in \Gamma_0$, for which there are at most $\#(\Gamma_0) \le n$ possibilities. Or $\sigma_g(p) > 0$; in that case $b \in l_{\sigma_g(p)} \setminus \Gamma_{\sigma_g(p)-1,g}$, which leaves two possibilities. Once we know $b$, we know one of the elements of $\{\Delta_p, \Delta'_p\}$. The sum over the other one is done thanks to the factor $(1 + d(\Delta_p,\Delta'_p))^{-(d+1)}$, and is bounded by some constant. As a result
\[ \sum_{l_p} |\omega(g,(h,0),p)|\, (1 + d(\Delta_p,\Delta'_p))^{-(d+1)} \le \frac{K_{10}}{h_{\sigma_g(p)}} \Big(1\!\mathrm{l}_{\{\sigma_g(p)>0\}} + n\, 1\!\mathrm{l}_{\{\sigma_g(p)=0\}}\Big) \tag{58} \]
for some constant $K_{10}$, the sum being over $l_p$ knowing $(l_1,\dots,l_{p-1})$ and the full map $\sigma_g$. Here $1\!\mathrm{l}_{\{\dots\}}$ denotes the characteristic function of the event between braces.
We can now repeat the operation and sum over $l_{p-1}$ knowing $(l_1,\dots,l_{p-2})$ and $\sigma_g$; and so on. We then get
\[ \sum_{g \text{ of length } p} B(g) \le \sum_{\sigma} K_8(n)\, K_9^p\, \lambda^p\, K_{10}^p \times \int_{1>h_1>\cdots>h_p>0} dh_1 \dots dh_p \prod_{q=1}^{p} \frac{1\!\mathrm{l}_{\{\sigma(q)>0\}} + n\, 1\!\mathrm{l}_{\{\sigma(q)=0\}}}{h_{\sigma(q)}}, \tag{59} \]
where the sum is over all maps $\sigma : \{1,\dots,p\} \to \{0,\dots,p-1\}$ such that $\sigma(q) < q$ for any $q$, $1 \le q \le p$. The last step relies on the following lemma.
Lemma 9. For any $p \ge 1$, any $J = \{j_1,\dots,j_\alpha\} \subset \{1,\dots,p\}$ with $j_1 < \cdots < j_\alpha$, we have
\[ \sum_{\sigma | J} \int_{1>h_1>\cdots>h_p>0} \prod_{q=1}^{p} \frac{1}{h_{\sigma(q)}}\ dh_1 \dots dh_p \le \frac{e^p}{\alpha!}, \tag{60} \]
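where the sum is over maps $\sigma : \{1,\dots,p\} \to \{0,\dots,p-1\}$ such that for any $q \in J$, $\sigma(q) = 0$ and for any $q \notin J$, $1 \le \sigma(q) < q$.
Proof of the lemma. We remark that $1 \in J$. We perform a change of variables by letting $h_q = s_1 s_2 \dots s_q$, $1 \le q \le p$, so that
\[ \int_{1>h_1>\cdots>h_p>0} dh_1 \dots dh_p \prod_{q=1}^{p} \frac{1}{h_{\sigma(q)}} = \int_0^1 ds_1 \dots \int_0^1 ds_p \prod_{q=1}^{p}\ \prod_{\sigma(q) < j < q} s_j \tag{61} \]
and
\[ \sum_{\sigma | J} \int_{1>h_1>\cdots>h_p>0} dh_1 \dots dh_p \prod_{q=1}^{p} \frac{1}{h_{\sigma(q)}} = \int_0^1 ds_1 \dots \int_0^1 ds_p \prod_{q=1}^{p} P_q(s), \tag{62} \]
where
\[ P_q(s) := \begin{cases} \prod_{1 \le j < q} s_j & \text{if } q \in J \\ 1 + s_{q-1} + s_{q-1}s_{q-2} + \cdots + s_{q-1}s_{q-2}\dots s_2 & \text{if } q \notin J \end{cases} \tag{63} \]
(we recall that $1 \in J$, and that by convention an empty product is 1). Suppose $q \notin J$ and $q+1 \in J$. The product of the corresponding factors is then
\[ (1 + s_{q-1} + s_{q-1}s_{q-2} + \cdots + s_{q-1}s_{q-2}\dots s_2)\, s_1 s_2 \dots s_q \le (1 + s_{q-1} + s_{q-1}s_{q-2} + \cdots + s_{q-1}s_{q-2}\dots s_2)\, s_1 s_2 \dots s_q + s_1 s_2 \dots s_{q-1} \tag{64} \]
\[ = s_1 s_2 \dots s_{q-1}\, (1 + s_q + s_q s_{q-1} + \cdots + s_q s_{q-1} \dots s_2), \tag{65} \]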
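which is the product we would get if the opposite situation occurred, that is $q \in J$ and $q+1 \notin J$. Therefore, if we lower the elements of $J$, one by one, in $\{1,\dots,p\}$ we maximize the right-hand side of (62), and we only need to prove the bound for
\[ \int_0^1 ds_1 \dots \int_0^1 ds_p \prod_{q=1}^{\alpha}\ \prod_{1 \le j < q} s_j \prod_{q=\alpha+1}^{p} (1 + s_{q-1} + s_{q-1}s_{q-2} + \cdots + s_{q-1}s_{q-2}\dots s_2). \tag{66} \]
Now for given $s_1,\dots,s_\alpha$ we compute
\[ \int_0^1 ds_{\alpha+1} \dots \int_0^1 ds_p \prod_{q=\alpha+1}^{p} (1 + s_{q-1} + s_{q-1}s_{q-2} + \cdots + s_{q-1}s_{q-2}\dots s_2) \tag{67} \]
by changing to the variables $y_{\alpha+1},\dots,y_p$ defined by
\[ y_q := s_q\, (1 + s_{q-1} + s_{q-1}s_{q-2} + \cdots + s_{q-1}s_{q-2}\dots s_2) \tag{68} \]
for $\alpha+1 \le q \le p$. We then obtain, with $y_\alpha := s_\alpha + s_\alpha s_{\alpha-1} + \cdots + s_\alpha s_{\alpha-1} \dots s_2 \le \alpha - 1$,
\[ \int_0^{1+y_\alpha} dy_{\alpha+1} \int_0^{1+y_{\alpha+1}} dy_{\alpha+2} \dots \int_0^{1+y_{p-1}} dy_p \]
\[ \le \int_0^{1+y_\alpha} dy_{\alpha+1} \int_0^{1+y_{\alpha+1}} dy_{\alpha+2} \dots \int_0^{1+y_{p-2}} e^{y_{p-1}}\, dy_{p-1} \tag{69} \]
\[ \le \int_0^{1+y_\alpha} dy_{\alpha+1} \int_0^{1+y_{\alpha+1}} dy_{\alpha+2} \dots \int_0^{1+y_{p-3}} (e^{1+y_{p-2}} - 1)\, dy_{p-2} \tag{70} \]
\[ \le e \int_0^{1+y_\alpha} dy_{\alpha+1} \int_0^{1+y_{\alpha+1}} dy_{\alpha+2} \dots \int_0^{1+y_{p-3}} e^{y_{p-2}}\, dy_{p-2} \tag{71} \]
and, by repeating the argument leading from (69) to (71), we get the inequality
\[ \int_0^1 ds_{\alpha+1} \dots \int_0^1 ds_p \prod_{q=\alpha+1}^{p} \Big(\sum_{j=2}^{q}\ \prod_{j \le k \le q-1} s_k\Big) \le e^{y_\alpha} \cdot e^{p-1-\alpha} \tag{72} \]
\[ \le e^{\alpha-1} \cdot e^{p-1-\alpha} = e^{p-2}. \tag{73} \]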
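Therefore
\[ \sum_{\sigma | J} \int_{1>h_1>\cdots>h_p>0} dh_1 \dots dh_p \prod_{q=1}^{p} \frac{1}{h_{\sigma(q)}} \le e^{p-2} \times \int_0^1 ds_1 \dots \int_0^1 ds_\alpha \prod_{q=1}^{\alpha}\ \prod_{1 \le j < q} s_j \tag{74} \]
\[ \le \frac{e^{p-2}}{\alpha!}, \tag{75} \]
which proves the lemma.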
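For small $p$ the left-hand side of (60) can be evaluated exactly from the representation (62)-(63), since the integrand is a polynomial in $s_1,\dots,s_p$ and each monomial integrates to a product of reciprocals over the unit cube. The sketch below does this with rational arithmetic and compares the result with $e^p/\alpha!$; the polynomial representation (dictionaries from exponent tuples to coefficients) and all function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations
from math import e, factorial

def poly_mul(a, b):
    """Multiply two polynomials stored as {exponent tuple: coefficient}."""
    out = {}
    for ea, ca in a.items():
        for eb, cb in b.items():
            key = tuple(x + y for x, y in zip(ea, eb))
            out[key] = out.get(key, 0) + ca * cb
    return out

def monomial(p, lo, hi):
    """The monomial s_lo * ... * s_hi in p variables (empty product if lo > hi)."""
    return {tuple(1 if lo <= j <= hi else 0 for j in range(1, p + 1)): Fraction(1)}

def factor_P(p, q, J):
    """P_q(s) as in (63): one monomial if q is in J, a sum over sigma(q) otherwise."""
    if q in J:
        return monomial(p, 1, q - 1)
    out = {}
    for sigma in range(1, q):                     # 1 <= sigma(q) < q
        for key, c in monomial(p, sigma + 1, q - 1).items():
            out[key] = out.get(key, 0) + c
    return out

def lhs_60(p, J):
    """Exact value of the right-hand side of (62), i.e. the left-hand side of (60)."""
    poly = {tuple([0] * p): Fraction(1)}
    for q in range(1, p + 1):
        poly = poly_mul(poly, factor_P(p, q, J))
    total = Fraction(0)
    for exps, c in poly.items():
        term = Fraction(c)
        for a in exps:
            term /= (a + 1)                       # integral of s^a over [0, 1]
        total += term
    return total

def lemma9_ok(p_max=6):
    for p in range(1, p_max + 1):
        for alpha in range(0, p + 1):
            for J in combinations(range(1, p + 1), alpha):
                if float(lhs_60(p, set(J))) > e ** p / factorial(alpha):
                    return False
    return True
```

When $1 \notin J$ the factor $P_1$ is an empty sum, so the integral vanishes, in line with the remark that only sets $J$ containing 1 contribute.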
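Now the end of the proof of convergence is trivial:
\[ \sum_g B(g) \le \sum_{p \ge 0}\ \sum_{J \subset \{1,\dots,p\}} K_8(n)\, K_9^p\, \lambda^p\, K_{10}^p\, n^{\#(J)} \sum_{\sigma | J} \int_{1>h_1>\cdots>h_p>0} dh_1 \dots dh_p \prod_{q=1}^{p} \frac{1}{h_{\sigma(q)}} \tag{76} \]
\[ \le \sum_{p \ge 0} K_8(n)\, K_9^p\, \lambda^p\, K_{10}^p\, e^p \sum_{0 \le j \le p} \binom{p}{j} \frac{n^j}{j!} \tag{77} \]
\[ \le K_8(n)\, e^n \sum_{p \ge 0} (2e K_9 K_{10} \lambda)^p < +\infty \tag{78} \]
for $\lambda$ small enough.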
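The passage from (77) to (78) uses $\binom{p}{j} \le 2^p$ and $\sum_j n^j/j! \le e^n$, after which the sum over $p$ is geometric. The following sketch checks both steps numerically; the single parameter `x` standing for the product $K_9 K_{10} \lambda$, the normalization $K_8(n) = 1$ and the function names are illustrative assumptions.

```python
from math import comb, e, exp, factorial

def inner_sum(p, n):
    """Sum over j of binom(p, j) * n^j / j!, as in (77)."""
    return sum(comb(p, j) * n ** j / factorial(j) for j in range(p + 1))

def partial_series(n, x, p_max):
    """Partial sum of (77) with K8(n) = 1 and x = K9 * K10 * lambda."""
    return sum((e * x) ** p * inner_sum(p, n) for p in range(p_max + 1))

def geometric_bound(n, x):
    """Right-hand side of (78) with K8(n) = 1; finite only when 2*e*x < 1."""
    return exp(n) / (1 - 2 * e * x)

inner_ok = all(
    inner_sum(p, n) <= 2 ** p * exp(n) * (1 + 1e-12)
    for p in range(0, 15)
    for n in range(0, 6)
)
series_ok = partial_series(3, 0.1, 50) <= geometric_bound(3, 0.1)
```

In this normalization the series converges as soon as $2e K_9 K_{10} \lambda < 1$, which is one explicit reading of "for $\lambda$ small enough".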
Acknowledgements. We thank C. de Calan for his contribution to Lemma 9 and C. Kopper for a careful reading of our manuscript.
References
[AR1] Abdesselam, A., Rivasseau, V.: Explicit Fermionic Tree Expansion. Lett. in Math. Phys. 44, 77 (1998)
[AR2] Abdesselam, A., Rivasseau, V.: Trees, forests and jungles: A botanical garden for cluster expansions. In: Constructive Physics, V. Rivasseau, ed., Lecture Notes in Physics 446, Berlin–Heidelberg–New York: Springer-Verlag, 1995
[AR3] Abdesselam, A., Rivasseau, V.: An Explicit Large Versus Small Field Multiscale Cluster Expansion. Rev. Math. Phys. 9, No. 2, 123 (1997)
[Br] Brydges, D.: Weak perturbations of massless Gaussian measures. In: Mathematical Quantum Theory I: Field theory and many body theory, J. Feldman, R. Froese and L. Rosen, eds., AMS-CRM, 1994
[DR] Disertori, M., Rivasseau, V.: Continuous Constructive Fermionic Renormalization. Ann. Henri Poincaré 1, 1 (2000)
[Ga] Gallavotti, G.: Renormalization theory and ultraviolet stability for scalar fields via renormalization group methods. Rev. Mod. Phys. 57, 471 (1985)
[GK] Gawedzki, K., Kupiainen, A.: Asymptotic freedom beyond perturbation theory. In: Proceedings of Les Houches Summer school on Critical phenomena, Random systems and Gauge theories, Amsterdam: North Holland, 1984
[GJ] Glimm, J., Jaffe, A.: Quantum Physics: A Functional Integral Point of View. New York: Springer-Verlag, 1988
[GJS1] Glimm, J., Jaffe, A., Spencer, T.: The particle structure of the weakly coupled P(φ)_2 model and other applications of high temperature expansions. In: Constructive Quantum Field Theory, G. Velo and A.S. Wightman, eds., New York: Springer-Verlag, 1973
[GJS2] Glimm, J., Jaffe, A., Spencer, T.: The Wightman axioms and particle structure in the P(φ)_2 quantum field model. Ann. of Math. (2) 100, 585 (1974)
[Ri] Rivasseau, V.: From Perturbative to Constructive Renormalization. Princeton, NJ: Princeton University Press, 1991
Communicated by D. Brydges
Commun. Math. Phys. 229, 209–227 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0687-2
Communications in
Mathematical Physics
Functors of White Noise Associated to Characters of the Infinite Symmetric Group
Marek Bożejko¹,*, Mădălin Guţă²
¹ Instytut Matematyczny, Uniwersytet Wrocławski, Plac Grunwałdzki 2/4, 50-384 Wrocław, Poland
² Mathematisch Instituut, Katholieke Universiteit Nijmegen, Toernooiveld 1, 6526 ED Nijmegen, The Netherlands. E-mail:
[email protected]
Received: 28 September 2001 / Accepted: 10 November 2001 Published online: 31 July 2002 – © Springer-Verlag 2002
* Partially supported by KBN grant 2P03A05415

Abstract: The characters $\varphi_{\alpha,\beta}$ of the infinite symmetric group are extended to multiplicative positive definite functions $t_{\alpha,\beta}$ on pair partitions by using an explicit representation due to Veršik and Kerov. The von Neumann algebra $\Gamma_{\alpha,\beta}(K)$ generated by the fields $\omega_{\alpha,\beta}(f)$ with $f$ in an infinite dimensional real Hilbert space $K$ is infinite and the vacuum vector is not separating. For a family $t_N$ depending on an integer $N < -1$ an "exclusion principle" is found allowing at most $|N|$ "identical particles" on the same state:
\[ a(f)\, a^*(g) = \langle f, g \rangle \cdot \mathbf{1} + \frac{1}{N}\, d\Gamma(|g\rangle\langle f|). \]
The algebras $\Gamma_N(\ell^2_{\mathbb{R}}(\mathbb{Z}))$ are type $I_\infty$ factors. Functors of white noise $\Gamma_N$ are constructed and proved to be non-equivalent for different values of $N$.

1. Introduction

The theory of non-commutative processes with "independent increments" has been the object of various investigations in quantum probability, the most important approaches being the Hudson-Parthasarathy calculus [19] based on tensor independence, and Voiculescu's free probability with its concept of free independence [25]. A general theory of quantum white noise, Brownian motion and Markov processes is developed by Köstler [14] in the spirit of Kümmerer's approach to quantum probability [15–17]. The white noise is described by a finite quantum probability space of $A_0$-valued random variables, i.e., a von Neumann algebra $A$, endowed with a tracial normal state $\rho$ together with a subalgebra $A_0$ and the state preserving conditional expectation $P_0$ from $A$ to $A_0$ [22]. The triple $(A, \rho, A_0)$ is provided with a filtration of subalgebras $A_I$ of $A$, for all closed intervals $I$ of the time axis $\mathbb{R}$. A group $(S_t)_{t \in \mathbb{R}}$ of automorphisms of $(A, \rho)$
acts as a shift on the local algebras, $S_t(\mathcal{A}_I) = \mathcal{A}_{I+t}$, and leaves $\mathcal{A}_0$ pointwise invariant. For disjoint intervals $I, J$ the local algebras $\mathcal{A}_I$ and $\mathcal{A}_J$ are statistically independent over $\mathcal{A}_0$, i.e., $P_I \circ P_J = P_0$, a notion in which we recognise a commuting square of von Neumann algebras [8]. The quantum Brownian motion is an additive cocycle $(B_t)_{t\in\mathbb{R}}$ with respect to the white noise $(\mathcal{A}, \rho, S_t, \mathcal{A}_I)$ over $\mathcal{A}_0$, that is, a process which is adapted to the filtration $\mathcal{A}_{[0,t]}$, satisfying $B_{s+t} = B_t + S_t(B_s)$ and certain continuity requirements in the $L^p$-norms (see definitions in Chap. 3 of [14]).

A more functorial approach has been abstracted from the study of the algebra of deformed or q-commutation relations [2, 4–7, 9, 18, 26]. The selfadjoint field operators $\omega_q(f) := a_q(f) + a_q^*(f)$ for $f \in L^2_{\mathbb{R}}(\mathbb{R}_+)$ have vacuum expectations expressed as a sum over all possible partitions of the terms in the monomial into pairs,
\[ \rho_q(\omega_q(f_1) \cdots \omega_q(f_n)) = \sum_{\mathcal{V}\in P_2(n)} t_q(\mathcal{V}) \prod_{(k,l)\in\mathcal{V}} \langle f_k, f_l\rangle, \tag{1.1} \]
where $t_q(\mathcal{V}) = q^{\mathrm{cr}(\mathcal{V})}$ and $\mathrm{cr}(\mathcal{V})$ is the number of crossings of the pair partition $\mathcal{V}$. The classical Brownian motion is realised for $q = 1$ and the free Brownian motion [25] for $q = 0$, by defining $B_t^{(q)} = \omega_q(\chi_{[0,t]})$. Those functions $t : P_2(\infty) \to \mathbb{C}$ on pair partitions which give rise to positive “gaussian” functionals $\rho_t$ on the algebra of fields $\omega(f)$ are called positive definite [3]. In particular they restrict to positive definite functions on the infinite symmetric group $S(\infty)$, by identifying the permutations with certain pair partitions [3]. The representations with respect to such functionals $\rho_t$ are called generalised Brownian motions and are the object of the papers [3, 10, 11, 21].

We regard the usual (symmetric) Fock space over a Hilbert space as an endofunctor of the category of Hilbert spaces. This can be generalised to the analytic functors of Joyal [13] whose symmetries are determined by combinatorial objects like species of structures [12, 1]. In [10, 11] it has been observed that such generalised Fock spaces are the representation spaces of generalised Brownian motions. One can define creation and annihilation operators $a^\sharp(f)$ whose sums are the fields $\omega(f)$. This framework is described in Sect. 2. For positive definite functions which have a certain multiplicativity property, it has been shown [11] that the field operators are selfadjoint, and thus one can investigate the von Neumann algebras which they generate as well as the existence of functors from the category of (real) Hilbert spaces with contractions to the category of non-commutative probability spaces. For tracial vacuum states $\rho_t$, the functor $\Gamma_t$ of white noise constructed in [11] is a concrete realisation of a quantum white noise in the sense of Köstler, if we define $\mathcal{A}_0 := \mathbb{C}$, $\mathcal{A} := \Gamma_t(L^2(\mathbb{R}))$, $\mathcal{A}_I := \Gamma_t(L^2(I))$ with $S_t := \Gamma_t(s_t)$, where $s_t$ is the shift operator on $L^2(\mathbb{R})$. The Brownian motion is $B_s^{(t)} := \omega_t(\chi_{[0,s]})$.
No example of the function $t$ is yet known such that the vacuum state $\rho_t$ is faithful but not tracial. In this paper we treat a class of generalised Brownian motions for which the vacuum state is not faithful. They arise from the characters of the infinite symmetric group $S(\infty)$. By the theorem of Thoma [23] any such character has the form
\[ \phi_{\alpha,\beta}(\sigma) = \prod_{m\ge 2} \Big( \sum_{i=1}^{\infty} \alpha_i^m + (-1)^{m+1} \sum_{i=1}^{\infty} \beta_i^m \Big)^{\rho_m(\sigma)}, \]
where $(\alpha_i)_{i=1}^{\infty}$ and $(\beta_i)_{i=1}^{\infty}$ are decreasing sequences of positive numbers with $\sum_{i=1}^{\infty}\alpha_i + \sum_{i=1}^{\infty}\beta_i \le 1$ and $\rho_m(\sigma)$ is the number of cycles of length $m$ in the cycle decomposition
Functors of White Noise Associated with the Infinite Symmetric Group
of $\sigma$. An explicit construction of the representation of $S(\infty)$ with respect to the positive definite function $\phi_{\alpha,\beta}$ has been presented by Veršik and Kerov in [24]. In Sect. 3 we employ this representation to calculate the expression of a multiplicative positive definite function on pair partitions $t_{\alpha,\beta}$ which extends $\phi_{\alpha,\beta}$. For an arbitrary pair partition $\mathcal{V}$ we have
\[ t_{\alpha,\beta}(\mathcal{V}) = \prod_{m\ge 2} \Big( \sum_{i=1}^{\infty} \alpha_i^m + (-1)^{m+1} \sum_{i=1}^{\infty} \beta_i^m \Big)^{\rho_m(\mathcal{V})}, \]
where $\rho_m(\mathcal{V})$ is the number of cycles of length $m$ in the pair partition $\mathcal{V}$, the cycle of a pair partition being a new combinatorial concept extending that from permutations. For any real Hilbert space $K$ we construct the von Neumann algebra $\Gamma_{\alpha,\beta}(K)$ generated by selfadjoint field operators $\omega_{\alpha,\beta}(f)$ with $f \in K$ and investigate its properties by using the general theory developed in [11]. If the space $K$ is infinite dimensional we find that $\Gamma_{\alpha,\beta}(K)$ is an infinite von Neumann algebra and the vacuum state is not faithful. This is the object of Sect. 4.

In the last section we treat in more detail a particular case of positive definite functions indexed by an integer $N < -1$,
\[ t_N(\mathcal{V}) = \Big(\frac{1}{N}\Big)^{|\mathcal{V}| - c(\mathcal{V})}, \]
where $|\mathcal{V}|$ is the number of pairs of $\mathcal{V}$ and $c(\mathcal{V})$ the number of cycles. As an alternative to the representation inspired by Veršik and Kerov, we use the technique of deformation of the inner product on the full Fock space known from the q-deformed Brownian motion [2, 4, 5] and obtain an interesting example of relations between creation and annihilation operators:
\[ a_N(f)a_N^*(g) = \langle f, g\rangle \mathbf{1} - \frac{1}{|N|}\, d\Gamma(|g\rangle\langle f|). \]
The differential second quantisation operators $d\Gamma(A)$ are defined similarly to their counterparts in quantum field theory [20]. For $f = g$ this implies that the number operator $N_f = d\Gamma(|f\rangle\langle f|)$ counting the number of one-particle $f$-states is bounded by $|N|$, an “exclusion principle” which could have some interest also from the physics point of view. The algebra $\Gamma_N(\ell^2_{\mathbb{R}}(\mathbb{Z}))$ generated by the field operators $\omega_N(f)$ acting on $\mathcal{F}_N(\ell^2(\mathbb{Z}))$ contains all bounded operators on the Fock space.

From the functorial point of view a different algebra $\Gamma_N(K)$, generated by the so-called generalised Wick product operators $\#(\mathbf{f}, \mathcal{V})$ acting on $\mathcal{F}_N(K_{\mathbb{C}} \oplus \ell^2(\mathbb{Z}))$, is more interesting. For any contraction $T : K \to K'$ between real Hilbert spaces one can define its second quantisation, a vacuum state preserving completely positive map $\Gamma_N(T)$ between $\Gamma_N(K)$ and $\Gamma_N(K')$. Altogether, $\Gamma_N$ is a functor of white noise. For infinite dimensional $K$ the von Neumann algebra $\Gamma_N(K)$ is a discrete sum of type $I_\infty$ factors; for finite dimensional $K$, $\Gamma_N(K)$ is a matrix algebra. In particular
\[ \Gamma_N(\mathbb{R}) = \bigoplus_{k=2}^{N+1} M_k(\mathbb{C}), \]
implying that the functors $\Gamma_N$ are inequivalent for different values of $N$.
2. Theory of Generalised Brownian Motion

In this section we define the notion of generalised Brownian motion [2–5], and describe some results from [10, 11].

Definition 2.1. Let $K$ be a real separable Hilbert space. The algebra $\mathcal{A}(K)$ is the free unital $*$-algebra with generators $\omega(h)$ for all $h \in K$, divided by the relations
\[ \omega(af + bg) = a\,\omega(f) + b\,\omega(g), \qquad \omega(f) = \omega(f)^* \tag{2.1} \]
for all $f, g \in K$ and $a, b \in \mathbb{R}$.

In this paper we will consider GNS-like representations of such $*$-algebras with respect to positive functionals called gaussian states. These states arise from a noncommutative central limit theorem (Theorem 0 in [3]) and are described by functions on pair partitions.

Definition 2.2. Let $S$ be a finite ordered set with $n$ elements. We denote by $P_2(S)$ the set of pair partitions of $S$, that is, $\mathcal{V} \in P_2(S)$ if $\mathcal{V}$ consists of $\frac{1}{2}n$ disjoint ordered pairs $(l, r)$ with $l < r$ having $S$ as their union. The set of all pair partitions is
\[ P_2(\infty) := \bigcup_{r=0}^{\infty} P_2(2r). \tag{2.2} \]
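As an illustration of Definition 2.2, the following minimal Python sketch enumerates $P_2(n)$ for small $n$ and evaluates the sum $\sum_{\mathcal{V}} q^{\mathrm{cr}(\mathcal{V})}$ appearing in (1.1) when all $f_i$ equal a single unit vector. The function names `pair_partitions`, `crossings` and `moment` are illustrative choices and do not come from the paper.

```python
def pair_partitions(points):
    """Yield every pair partition of an ordered list as a list of pairs (l, r), l < r."""
    points = list(points)
    if len(points) % 2:          # an odd set admits no pair partition
        return
    if not points:               # the empty set has exactly one (empty) partition
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        for tail in pair_partitions(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail


def crossings(partition):
    """Count the pairs of pairs that cross, i.e. l1 < l2 < r1 < r2."""
    return sum(1 for (l1, r1) in partition for (l2, r2) in partition
               if l1 < l2 < r1 < r2)


def moment(n, q):
    """Sum of q**cr(V) over V in P_2(n), the n-th moment of a unit-norm field in (1.1)."""
    return sum(q ** crossings(v) for v in pair_partitions(range(1, n + 1)))


# Sizes of P_2(0), ..., P_2(8); the odd entries vanish.
counts = [sum(1 for _ in pair_partitions(range(1, n + 1))) for n in range(9)]
print(counts)
print(moment(4, 0), moment(4, 1))   # free (q = 0) and classical (q = 1) fourth moments
```

For $q = 0$ only noncrossing partitions contribute, so the moments are Catalan numbers, while $q = 1$ gives the Gaussian moments $(n-1)!!$.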
Note that $P_2(n) = \emptyset$ if $n$ is odd. We use the symbol $t$ exclusively for functions $t : P_2(\infty) \to \mathbb{C}$. We will always choose the normalisation $t(p) = 1$ for $p$ the pair partition containing only one pair.

Definition 2.3. A Gaussian state on $\mathcal{A}(K)$ is a positive normalised linear functional $\rho_t$ with moments
\[ \rho_t(\omega(f_1) \cdots \omega(f_n)) = \sum_{\mathcal{V}\in P_2(n)} t(\mathcal{V}) \prod_{(k,l)\in\mathcal{V}} \langle f_k, f_l\rangle \tag{2.3} \]
for even $n$, and zero for odd $n$. A function $t$ is called positive definite if $\rho_t$ is a Gaussian state.

Remark 2.4. If $t$ is positive definite then its restriction to the pair partitions of the form $\mathcal{V}_\pi := \{(i, 2n + 1 - \pi(i)) : i = 1, \dots, n\}$, where $\pi \in S(n)$, is a positive definite function on the symmetric group $S(n)$ [3].

By analysing the GNS representation associated to a Gaussian state $\rho_t$ we have generalised [10, 11] the notion of Fock space over $K_{\mathbb{C}}$ in the following way:

Definition 2.5. Let $V = (V_n)_{n=0}^{\infty}$ be a collection of Hilbert spaces such that each $V_n$ carries a unitary representation $U_n$ of the symmetric group $S(n)$. Let $H$ be a (complex) Hilbert space. The $V$-Fock space over $H$ is defined by
\[ \mathcal{F}_V(H) := \bigoplus_{n=0}^{\infty} \frac{1}{n!}\, V_n \otimes_s H^{\otimes n}, \tag{2.4} \]
where $\otimes_s$ denotes the closed subspace of the tensor product $V_n \otimes H^{\otimes n}$ whose orthogonal projection is
\[ P_n = \frac{1}{n!} \sum_{\tau\in S(n)} U_n(\tau) \otimes \tilde{U}_n(\tau), \tag{2.5} \]
and
\[ \tilde{U}_n(\tau)\, f_1 \otimes \dots \otimes f_n = f_{\tau^{-1}(1)} \otimes \dots \otimes f_{\tau^{-1}(n)} \tag{2.6} \]
for $f_i \in H$. The factor $\frac{1}{n!}$ in (2.4) refers to the inner product on $V_n \otimes_s H^{\otimes n}$. We note that $\mathcal{F}_V$ is an endofunctor of the category of Hilbert spaces with contractions, called an analytic functor [13]. We use the shorter notation $v \otimes_s (h_1 \otimes \dots \otimes h_n)$ for the vector $P_n\, v \otimes (h_1 \otimes \dots \otimes h_n)$. Let $T : H_1 \to H_2$ be a contraction between Hilbert spaces. Then its second quantisation on the level of Hilbert spaces $\mathcal{F}_V(T)$ is defined by
\[ \mathcal{F}_V(T) : v \otimes_s (h_1 \otimes \dots \otimes h_n) \mapsto v \otimes_s (T h_1 \otimes \dots \otimes T h_n) \tag{2.7} \]
for all $v \in V_n$, $h_i \in H$ when $n \ge 1$, and equal to the identity on $V_0$.

On $\mathcal{F}_V(H)$ we define creation and annihilation operators whose domain consists of vectors with a “finite number of particles”. If $\psi_{n+1} \in V_{n+1} \otimes_s H^{\otimes n+1}$ then $a(f)\psi_{n+1} = (j_n^* \otimes r^*(f))\psi_{n+1}$, where $j_n : V_n \to V_{n+1}$ are densely defined linear maps having the intertwining property
\[ j_n U_n(\tau) = U_{n+1}(\iota(\tau))\, j_n \tag{2.8} \]
for all $\tau \in S(n)$, with $\iota : S(n) \hookrightarrow S(n + 1)$ the natural inclusion keeping the element $n + 1$ fixed. The operator $r(f)$ is the right creation operator on the full Fock space in the notation of Voiculescu (see Remark 2.6.7 in [25]). Equation (2.8) ensures that $(j_n^* \otimes r^*(f))P_{n+1} = P_n (j_n^* \otimes r^*(f))P_{n+1}$. The creation operator $a^*(f)$ is the adjoint of $a(f)$ and has the action
\[ a^*(f) : v \otimes_s (f_1 \otimes \dots \otimes f_n) \mapsto (j_n v) \otimes_s (f_1 \otimes \dots \otimes f_n \otimes f) \tag{2.9} \]
for $v \in V_n$, $f_i \in H$.

Remark. One can also use the left creation operator $l(f)$ by choosing another inclusion of $S(n)$ into $S(n + 1)$.

Theorem 2.6. Let $K$ be an infinite dimensional real Hilbert space and $t$ a positive definite function on pair partitions. Then there exists a unique (up to unitary equivalence) analytic functor $V$ and densely defined linear maps $j_n : V_n \to V_{n+1}$ satisfying (2.8) such that the GNS representation of $\mathcal{A}(K)$ with respect to $\rho_t$ is unitarily equivalent to the $*$-algebra of symmetric operators $\omega(f) := a(f) + a^*(f)$ for $f \in K$, acting on the Hilbert space $\mathcal{F}_V(K_{\mathbb{C}})$. The state $\rho_t$ is implemented by a unit vector $\Omega \in V_0$.

Remark. In [11] we have shown how the spaces $V_n$ and the maps $j_n$ arise through the representation of the $*$-semigroup $BP_2(\infty)$ of broken pair partitions, with respect to the positive functional $\hat{t}$. The semigroup $BP_2(\infty)$ contains $P_2(\infty)$ as a sub-semigroup and $\hat{t}$ is the extension of $t$ to $BP_2(\infty)$ obtained by setting $\hat{t}(d) = 0$ for all $d \notin P_2(\infty)$. The positivity of the function $t$ is thus linked with an algebraic object rather than through the indirect Definition 2.3. We will therefore denote the $V$-Fock space over $H$ associated to the positive definite $t$ by $\mathcal{F}_t(H)$, and the creation and annihilation operators by $a_t^\sharp(f)$. We denote by the same symbol $\rho_t$ the vacuum state $\langle \Omega, \cdot\, \Omega\rangle$ on the algebra of creation and annihilation operators $a^\sharp(f)$ acting on $\mathcal{F}_t(H)$.
3. Generalised Brownian Motions Associated to Characters of the Infinite Symmetric Group

By $S(\infty)$ we denote the infinite symmetric group, i.e. the group of finitary permutations of a countable set. A finite character of $S(\infty)$ is a central positive definite indecomposable (not representable as a nontrivial convex combination of other such functions) function. The fundamental result of Thoma gives an explicit description of the finite characters.

Theorem 3.1 ([23]). All normalized finite characters of the group $S(\infty)$ are given by the formula
\[ \phi_{\alpha,\beta}(\sigma) = \prod_{m\ge 2} \Big( \sum_{i=1}^{\infty} \alpha_i^m + (-1)^{m+1} \sum_{i=1}^{\infty} \beta_i^m \Big)^{\rho_m(\sigma)}, \tag{3.1} \]
where $\alpha_1 \ge \alpha_2 \ge \dots \ge 0$, $\beta_1 \ge \beta_2 \ge \dots \ge 0$, $\sum_i \alpha_i + \sum_i \beta_i \le 1$, and $\rho_m(\sigma)$ is the number of cycles of length $m$ in the permutation $\sigma$.
We extend the character $\phi_{\alpha,\beta}$ to a positive definite function $t_{\alpha,\beta}$ on pair partitions having a certain multiplicative property. This extension is based on the inclusion of $S(n)$ in $P_2(2n)$ formulated in Remark 2.4.

The representation of Veršik and Kerov [24]. We deal first with the case $\sum_{i=1}^{\infty} \alpha_i = 1$, thus $\beta_j = 0$. We consider a fixed but arbitrary $n \in \mathbb{N} \cup \{\infty\}$. Let $\mathcal{N} = \{1, 2, \dots\}$ and $\alpha = (\alpha_1, \alpha_2, \dots)$ a measure on $\mathcal{N}$. Let $X_n = \prod_1^n \mathcal{N}$ with the product measure $m_n^{(\alpha)} = \prod_1^n \alpha$. The group $S(n)$ acts on $X_n$ by $\sigma(x_1, \dots, x_n) = (x_{\sigma^{-1}(1)}, \dots, x_{\sigma^{-1}(n)})$ and preserves $m_n^{(\alpha)}$. We define $\tilde{X}_n = \{(x, y) \in X_n \times X_n : x \sim y\}$, where $x \sim y$ means $x = \sigma y$ for some $\sigma \in S(n)$. The Hilbert space
\[ V_n^{(\alpha)} = \Big\{ f : \tilde{X}_n \to \mathbb{C} \;\Big|\; \infty > \|f\|^2 = \int_{X_n} \sum_{y\sim x} |f(x, y)|^2 \, dm_n^{(\alpha)}(x) \Big\} \tag{3.2} \]
carries a unitary representation $U_n^{(\alpha)}$ of $S(n)$ given by
\[ (U_n^{(\alpha)}(\sigma)h)(x, y) = h(\sigma^{-1}x, y). \tag{3.3} \]
Let $1_n$ be the indicator function of the diagonal $\{(x, x) \mid x \in X_n\} \subset \tilde{X}_n$.

Theorem 3.2. On $V_n^{(\alpha)}$ we have
\[ \langle U_n^{(\alpha)}(\sigma)1_n, 1_n\rangle = m_n^{(\alpha)}\{x : \sigma x = x\} = \phi_{\alpha,0}(\sigma). \tag{3.4} \]
In particular for $n = \infty$ we obtain the representation of $S(\infty)$ associated to the character $\phi_{\alpha,0}$ in the convex hull of the vector $1_\infty$.

For any $n \in \mathbb{N}$ there is a natural isometry $j_n$ from $V_n^{(\alpha)}$ to $V_{n+1}^{(\alpha)}$,
\[ (j_n h)(x, y) = \delta_{x_{n+1}, y_{n+1}}\, h(x^{(n)}, y^{(n)}), \tag{3.5} \]
where $x = (x_1, \dots, x_n, x_{n+1}) = (x^{(n)}, x_{n+1})$. The maps $j_n$ satisfy (2.8). We have thus a representation of the $*$-semigroup $BP_2(\infty)$ on $\bigoplus_{n=0}^{\infty} V_n^{(\alpha)}$ [11]. We denote the associated positive definite function on pair partitions by $t_\alpha$.
Definition 3.3. Let $\mathcal{V} \in P_2(2n)$. There exists a unique noncrossing pair partition $\hat{\mathcal{V}} \in P_2(2n)$ such that the sets of left points of the pairs in $\mathcal{V}$ and $\hat{\mathcal{V}}$ coincide. A cycle in $\mathcal{V}$ is a sequence $((l_1, r_1), \dots, (l_m, r_m))$ of pairs of $\mathcal{V}$ such that the pairs $(l_1, r_2), (l_2, r_3), \dots, (l_m, r_1)$ belong to $\hat{\mathcal{V}}$. The length of this cycle is $m$. We denote by $\rho_m(\mathcal{V})$ the number of cycles of length $m$ in the pair partition $\mathcal{V}$.

Theorem 3.4. The function $t_\alpha$ has the expression
\[ t_\alpha(\mathcal{V}) = \prod_{m\ge 2} \Big( \sum_{i=1}^{\infty} \alpha_i^m \Big)^{\rho_m(\mathcal{V})}. \tag{3.6} \]
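The cycle structure of Definition 3.3 can be computed mechanically. The sketch below is one possible reading of the definition: it builds $\hat{\mathcal{V}}$ by matching each right point with the most recently opened left point, follows the cycles, and then evaluates (3.6) for a finite weight sequence. The helper names `noncrossing_companion`, `cycle_lengths` and `t_alpha` are illustrative and not taken from the paper.

```python
from fractions import Fraction


def noncrossing_companion(partition):
    """Return V-hat as a dict left -> right: the noncrossing partition with the same left points."""
    lefts = {l for l, _ in partition}
    points = sorted(p for pair in partition for p in pair)
    stack, hat = [], {}
    for p in points:
        if p in lefts:
            stack.append(p)          # open a new pair
        else:
            hat[stack.pop()] = p     # close the most recently opened pair
    return hat


def cycle_lengths(partition):
    """Sorted list of cycle lengths of a pair partition (Definition 3.3)."""
    hat = noncrossing_companion(partition)
    left_of_right = {r: l for l, r in partition}
    seen, lengths = set(), []
    for l, _ in partition:
        length, cur = 0, l
        while cur not in seen:
            seen.add(cur)
            length += 1
            cur = left_of_right[hat[cur]]   # (l_k, r_{k+1}) lies in V-hat
        if length:
            lengths.append(length)
    return sorted(lengths)


def t_alpha(partition, alpha):
    """Evaluate (3.6) for a finite sequence alpha (assumed to sum to 1)."""
    value = Fraction(1)
    for m in cycle_lengths(partition):
        if m >= 2:
            value *= sum(Fraction(a) ** m for a in alpha)
    return value


print(cycle_lengths([(1, 4), (2, 3)]))   # noncrossing: only cycles of length 1
print(cycle_lengths([(1, 3), (2, 4)]))   # one crossing: a single cycle of length 2
print(t_alpha([(1, 3), (2, 4)], [Fraction(1, 2), Fraction(1, 2)]))
```

Under the embedding of Remark 2.4 the identity of $S(2)$ corresponds to $\{(1,4),(2,3)\}$ and the transposition to $\{(1,3),(2,4)\}$, so in these two cases the cycle lengths agree with those of the permutations.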
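Proof. Let $\mathcal{V} \in P_2(2n)$. In order to calculate $t_\alpha(\mathcal{V})$ we will have to deal with the spaces $V_k^{(\alpha)}$ for $k \le n$. The indicator functions $\delta_{(x,y)}$ for $x, y \in X_k$ and $x = \sigma y$ for some $\sigma \in S(k)$ form a basis in $V_k^{(\alpha)}$. As the operators $j_k$ are isometries, we identify all $V_k^{(\alpha)}$ for $0 \le k \le n$ with their image in $V_p^{(\alpha)}$ under $j_{k,p} := j_{p-1} \cdots j_k$ for $p > k$ and denote by $P_{k,p} = j_{k,p}\, j_{k,p}^*$ the orthogonal projections onto these subspaces. The actions of the various operators are:
\[ j_{k,p}\, \delta_{(x,y)} = \sum_{z\in X_{p-k}} \delta_{((x,z),(y,z))}, \tag{3.7} \]
\[ j_{k,p}^*\, \delta_{(x,y)} = \prod_{i=k+1}^{p} \alpha_{x_i} \cdot \prod_{i=k+1}^{p} \delta_{x_i,y_i} \cdot \delta_{(x^{(k)},y^{(k)})}, \tag{3.8} \]
\[ P_{k,p}\, \delta_{(x,y)} = \prod_{i=k+1}^{p} \alpha_{x_i} \cdot \prod_{i=k+1}^{p} \delta_{x_i,y_i} \cdot \sum_{z\in X_{p-k}} \delta_{((x^{(k)},z),(y^{(k)},z))}, \tag{3.9} \]
\[ U_k^{(\alpha)}(\sigma)\, \delta_{(x,y)} = \delta_{(\sigma x, y)}. \tag{3.10} \]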
The function $t_\alpha$ can be calculated (see Theorem 3.2 in [11]) in terms of the operators $U_p^{(\alpha)}(\sigma)$ and $j_{k,p}$:
\[ t_\alpha(\mathcal{V}) = \Big\langle 1_0, \prod_{a=1}^{r} j_{k_{a+1},p_a}^* \cdot U_{p_a}^{(\alpha)}(\pi_a) \cdot j_{k_a,p_a}\, 1_0 \Big\rangle, \tag{3.11} \]
where $k_1 = k_{r+1} = 0$ and $1_0$ is a unit vector of $V_0^{(\alpha)} = \mathbb{C}$. The numbers $k_a$, $p_a$ and the permutations $\pi_a$ are determined from the “standard form” (see Fig. 1) of the pair partition $\mathcal{V}$, which consists of a repeated sequence (from right to left) of $p_a - k_a$ right legs denoted $d_0$, followed by a permutation $\pi_a$, then a sequence of $p_a - k_{a+1}$ left legs denoted $d_0^*$, in such a way that two pairs intersect at most once, at the rightmost possible permutation. We interpret (3.11) in the following way. We begin with the characteristic function of the pair of empty sets $\delta_{\emptyset,\emptyset} = 1_0$. We apply the operators one by one from right to left. Every operator $j_{k_a,p_a}$ (Eq. 3.7) brings a sum over all $|X_{p_a-k_a}|$ possible “words” of $p_a - k_a$ letters to be added to both sequences of the previous pair. Next, a permutation $\pi_a$ (Eq. 3.10) acts on the first sequence of the pair, leaving the other sequence unchanged. The operator $j_{k_{a+1},p_a}^*$ compares the last $p_a - k_{a+1}$ letters of the two sequences and if
[Fig. 1. The standard form of a pair partition]
they coincide, it erases them from both sequences and produces a coefficient equal to the product of the weights $\alpha_i$ of the letters. If the letters differ in at least one position, the pair is removed. In the end we are back to the pair of empty sets and we have a coefficient which is the value of $t_\alpha(\mathcal{V})$.

Now we come to the two pair partitions $\mathcal{V}$ and $\hat{\mathcal{V}}$. Let $x : \mathcal{V} \to X$ be a possible sequence of letters attributed to the right legs of $\mathcal{V}$ by the steps $j_{k_a,p_a}$ of the above procedure. The right legs of $\mathcal{V}$ and $\hat{\mathcal{V}}$ coincide by definition, thus $x$ defines also a function $\hat{x} : \hat{\mathcal{V}} \to X$. As the pair partition $\hat{\mathcal{V}}$ is noncrossing, it corresponds to the second sequence of the pair, on which no permutation of letters is performed. The term $x$ of the sum survives the tests of all $j_{k_{a+1},p_a}^*$ if and only if, for each two pairs, one from $\mathcal{V}$ and one from $\hat{\mathcal{V}}$, having the same left leg, the corresponding letters in $x$ and respectively $\hat{x}$ coincide. But this means that all the pairs in each cycle of $\mathcal{V}$ must have the same letter, which implies
\[ t_\alpha(\mathcal{V}) = \sum_{x:\ \text{ct. on cycles}}\ \prod_{V\in\mathcal{V}} \alpha_{x(V)} = \prod_{m\ge 2} \Big( \sum_{i=1}^{\infty} \alpha_i^m \Big)^{\rho_m(\mathcal{V})}. \tag{3.12} \]
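The general case. Let $\gamma = 1 - \sum_i \alpha_i - \sum_i \beta_i$. As previously we fix $n \in \mathbb{N} \cup \{\infty\}$. Let $\mathcal{N}_+ = \mathcal{N}_- = \mathcal{N}$ and $Q = \mathcal{N}_+ \cup \mathcal{N}_- \cup [0, \gamma]$ with the measure $\mu$ defined as the Lebesgue measure on $[0, \gamma]$, $\mu(i) = \alpha_i$ for $i \in \mathcal{N}_+$, and $\mu(j) = \beta_j$ for $j \in \mathcal{N}_-$. The measure space is $X_n = \prod_1^n Q$ with measure $m_n^{(\alpha,\beta)} = \prod_1^n \mu$; $\tilde{X}_n$ is defined as before, as well as the Hilbert space $V_n^{(\alpha,\beta)}$ (Eq. 3.2). The representation of $S(n)$ is given by
\[ (U_n^{(\alpha,\beta)}(\sigma)h)(x, y) = (-1)^{i(\sigma,x)}\, h(\sigma^{-1}x, y), \tag{3.13} \]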
where $i(\sigma, x)$ is the number of inversions in the sequence $(\sigma i_1(x), \sigma i_2(x), \dots)$ of indices $i_r(x)$ for which $x_{i_r} \in \mathcal{N}_-$. The vector $1_n$ is the indicator function of the diagonal $\{(x, x)\} \subset \tilde{X}_n$.

Theorem 3.5 ([24]). On $V_n^{(\alpha,\beta)}$ we have
\[ \langle U_n^{(\alpha,\beta)}(\sigma)1_n, 1_n\rangle = \phi_{\alpha,\beta}(\sigma). \tag{3.14} \]
In particular for $n = \infty$ we obtain the representation of $S(\infty)$ associated to the character $\phi_{\alpha,\beta}$ in the convex hull of the vector $1_\infty$.
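We define $j_n$ as in (3.5). Then we have the general version of Theorem 3.4:

Theorem 3.6. The function $t_{\alpha,\beta}$ has the expression
\[ t_{\alpha,\beta}(\mathcal{V}) = \prod_{m\ge 2} \Big( \sum_{i=1}^{\infty} \alpha_i^m + (-1)^{m+1} \sum_{i=1}^{\infty} \beta_i^m \Big)^{\rho_m(\mathcal{V})}. \tag{3.15} \]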
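Since (3.15) depends on $\mathcal{V}$ only through its cycle lengths, it can be evaluated directly from a list of lengths. The following sketch does this for finite sequences $\alpha$, $\beta$ with exact rational arithmetic; the function name `t_alpha_beta` and the convention of passing the cycle lengths as a list are assumptions made for illustration.

```python
from fractions import Fraction


def t_alpha_beta(cycle_lengths, alpha, beta):
    """Evaluate (3.15) from the cycle lengths of a pair partition.

    alpha and beta are finite sequences of weights; cycles of length 1
    do not contribute because the product runs over m >= 2.
    """
    value = Fraction(1)
    for m in cycle_lengths:
        if m >= 2:
            power_sum_alpha = sum(Fraction(a) ** m for a in alpha)
            power_sum_beta = sum(Fraction(b) ** m for b in beta)
            value *= power_sum_alpha + (-1) ** (m + 1) * power_sum_beta
    return value


# alpha_1 = 1: every pair partition has weight 1 (classical case).
print(t_alpha_beta([2, 3], [1], []))
# beta_1 = 1: a cycle of even length contributes a sign (fermionic case).
print(t_alpha_beta([2], [], [1]))
# alpha = beta = 0: any cycle of length >= 2 kills the weight (free case).
print(t_alpha_beta([2], [], []))
```

With $\alpha_i = 1/3$ for $i = 1, 2, 3$ and cycle lengths $[2, 3]$ the value is $(1/3)(1/9) = (1/3)^{5-2}$, which is the exponent "number of pairs minus number of cycles".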
Definition 3.7 ([3]). A function t on pair partitions is called multiplicative if for all k, l, n ∈ N with 0 ≤ k < l ≤ n and all V1 ∈ P2 ({1, . . . , k, l + 1, . . . , n}) and V2 ∈ P2 ({k + 1, . . . , l}) we have t(V1 ∪ V2 ) = t(V1 ) · t(V2 ).
(3.16)
Corollary 3.8. The functions tα,β are multiplicative. Proof. If V = V1 ∪ V2 as in Definition 3.7 then any cycle of V belongs either to V1 or to V2 , thus ρm (V) = ρm (V1 ) + ρm (V2 ) for all m ≥ 2.
(3.17)
Corollary 3.9. The operators ωα,β (f ) are essentially selfadjoint. Proof. This follows from the previous corollary and Proposition 5.10 in [11].
From now on we will denote by the same symbol the selfadjoint closure of the essentially selfadjoint operator $\omega_{\alpha,\beta}(f)$.

4. The von Neumann Algebras $\Gamma_{\alpha,\beta}(K)$

Using Corollary 3.9 we can make a step further and construct von Neumann algebras generated by the “field operators” $\omega_{\alpha,\beta}(f)$ for vectors $f$ in certain real Hilbert spaces.

Definition 4.1. Let $\Gamma_{\alpha,\beta}(K)$ be the von Neumann algebra generated by the spectral projections of the operators $\omega_{\alpha,\beta}(f)$ acting on $\mathcal{F}_{\alpha,\beta}(K_{\mathbb{C}})$, for all $f$ in $K$, a real Hilbert space. On $\Gamma_{\alpha,\beta}(K)$ we distinguish the vacuum state $\rho_{\alpha,\beta}(X) := \langle \Omega_{\alpha,\beta}, X\Omega_{\alpha,\beta}\rangle$.

Remark. In [11] we have used the notation $\tilde{\Gamma}_{\alpha,\beta}(K)$, the symbol $\Gamma_{\alpha,\beta}(K)$ referring to another von Neumann algebra which we do not investigate in this paper.

On $\mathcal{F}_{\alpha,\beta}(K_{\mathbb{C}})$ we define a $*$-algebra $W_{\alpha,\beta}(K)$ of (not necessarily bounded) operators having $D := \mathcal{F}_{\alpha,\beta}^{(\mathrm{fin})}(K)$ as domain and leaving this domain invariant. We call $W_{\alpha,\beta}(K)$ a Wick algebra and its elements generalised Wick products. Such an operator is a finite linear combination of elementary operators denoted by a symbol $\#(\mathbf{f}, \mathcal{V})$ where $\mathbf{f} : F \to K$, $\mathcal{V} \in P_2(P)$ and $\{F, P\}$ is a partition into disjoint subsets of an arbitrary ordered set. The simplest examples of Wick products are $\#(\emptyset, \mathcal{V}) = t_{\alpha,\beta}(\mathcal{V})\mathbf{1}$, and $\#(f, \emptyset) = \omega_{\alpha,\beta}(f)$. For the exact definition of the Wick products we refer to Sect. 4 of [11]. Let $\#(\mathbf{f}_1, \mathcal{V}_1)$ and $\#(\mathbf{f}_2, \mathcal{V}_2)$ be two Wick products. Then
\[ \langle \#(\mathbf{f}_1, \mathcal{V}_1)\Omega, \#(\mathbf{f}_2, \mathcal{V}_2)\Omega\rangle = \sum_{\tilde{\mathcal{V}}\in P_2(F_1^*, F_2)} \eta_{\mathbf{f}_1^*\oplus \mathbf{f}_2}(\tilde{\mathcal{V}}) \cdot t(\mathcal{V}_1^* \cup \mathcal{V}_2 \cup \tilde{\mathcal{V}}) \tag{4.1} \]
with the following notations:
1) if $A$ is a finite ordered set then $A^*$ denotes the same set with the reversed order, and similarly for $\mathbf{f}^*$ and $\mathcal{V}^*$;
2) if $\mathbf{f}_i : A_i \to K$ with $A_i$ finite ordered sets then $A_1 + A_2$ denotes the ordered set obtained by concatenating $A_1$ and $A_2$, and $\mathbf{f}_1 \oplus \mathbf{f}_2 : A_1 + A_2 \to K$ is the map which restricts to $\mathbf{f}_i$ on $A_i$ for $i = 1, 2$;
3) if $\{A_1, A_2\}$ is a partition into disjoint subsets of an ordered set $A$ then $P_2(A_1, A_2)$ denotes the subset of $P_2(A)$ whose elements contain only pairs with one element from $A_1$ and the other from $A_2$;
4) with the previous notations, for $\mathcal{V} \in P_2(A_1, A_2)$,
\[ \eta_{\mathbf{f}_1\oplus \mathbf{f}_2}(\mathcal{V}) := \prod_{(l,r)\in\mathcal{V}} \langle \mathbf{f}_1(l), \mathbf{f}_2(r)\rangle. \tag{4.2} \]
The adjoint of $\#(\mathbf{f}, \mathcal{V})$ is $\#(\mathbf{f}^*, \mathcal{V}^*)$. The product of two Wick products is written in terms of elementary Wick products as follows:
\[ \#(\mathbf{f}_1, \mathcal{V}_1) \cdot \#(\mathbf{f}_2, \mathcal{V}_2) = \sum_{\tilde{P}_1, \tilde{P}_2}\ \sum_{\mathcal{V}\in P_2(\tilde{P}_1, \tilde{P}_2)} \eta_{\mathbf{f}_1\oplus \mathbf{f}_2}(\mathcal{V}) \cdot \#(\check{\mathbf{f}}_1 \oplus \check{\mathbf{f}}_2, \mathcal{V}_1 \cup \mathcal{V}_2 \cup \mathcal{V}) \tag{4.3} \]
with $\tilde{P}_i$ denoting a subset of $P_i$, and $\check{\mathbf{f}}_i$ the restriction of the function $\mathbf{f}_i : P_i \to K$ to the complement of $\tilde{P}_i$ in $P_i$.

Remark 4.2. In general, if the field operators $\omega_t(f)$ are bounded then the Wick algebra $W_t(K)$ is weakly dense in $\Gamma_t(K)$ for infinite dimensional $K$. However, for finite dimensional $K$ the von Neumann closure of $W_t(K)$ can be larger than $\Gamma_t(K)$. If the field operators are unbounded then the Wick operators are affiliated to $\Gamma_t(K)$.

Let us take a closer look at the type of the von Neumann algebras $\Gamma_{\alpha,\beta}(K)$. The following cases are already known:
1) $\alpha_1 = 1$: we obtain the classical (commutative) Brownian motion $B_t := \omega(\chi_{(0,t]})$ for $K = L^2_{\mathbb{R}}(\mathbb{R}_+)$, or the algebra of $n$ independent gaussian random variables for $K = \mathbb{R}^n$.
2) $\beta_1 = 1$: fermionic Brownian motion, the corresponding von Neumann algebra $\Gamma_{0,1}(K)$ being the type $II_1$ hyperfinite factor for $\dim(K) = \infty$.
3) $\alpha_i = \beta_i = 0$: free Brownian motion [25]; $\Gamma_{0,0}(\mathbb{C}^n)$ is the non-hyperfinite $II_1$ factor isomorphic to the von Neumann algebra of the free group with $n$ generators ($n \ge 2$ or $n = \infty$).
In all the above cases the vacuum state $\rho_{\alpha,\beta}$ is tracial.

Lemma 4.3. Let $\alpha, \beta$ be as in Theorem 3.1 with $\alpha_1 \ne 1$, $\beta_1 \ne 1$, $\sum_i \alpha_i + \sum_i \beta_i \ne 0$. Then $\Gamma_{\alpha,\beta}(\ell^2_{\mathbb{R}}(\mathbb{Z}))$ does not have any tracial normal state.
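Proof. Let us suppose that there exists a tracial state $\tau$ on $\Gamma_{\alpha,\beta}(\ell^2_{\mathbb{R}}(\mathbb{Z}))$ and consider the automorphism $\Gamma_{\alpha,\beta}(S) := \mathrm{ad}\,\mathcal{F}_{\alpha,\beta}(S)$ of $\Gamma_{\alpha,\beta}(\ell^2_{\mathbb{R}}(\mathbb{Z}))$, where $S$ is the right shift on $\ell^2_{\mathbb{R}}(\mathbb{Z})$. From Sect. 5 of [11] we have
\[ \operatorname*{w-lim}_{n\to\infty} \Gamma_{\alpha,\beta}(S^n)(X) = \rho_{\alpha,\beta}(X)\mathbf{1} \tag{4.4} \]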
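for elements $X$ in the Wick algebra $W_{\alpha,\beta}$. If the field operators $\omega_{\alpha,\beta}(f)$ are bounded then $W_{\alpha,\beta}(\ell^2_{\mathbb{R}}(\mathbb{Z}))$ is dense in $\Gamma_{\alpha,\beta}(\ell^2_{\mathbb{R}}(\mathbb{Z}))$ and we can conclude that $\tau(\Gamma_{\alpha,\beta}(S^n)(XY)) \to \rho_{\alpha,\beta}(XY)$ for $X, Y \in W_{\alpha,\beta}(\ell^2_{\mathbb{R}}(\mathbb{Z}))$, so that $\rho_{\alpha,\beta}(XY) = \rho_{\alpha,\beta}(YX)$ by the trace property of $\tau$. This would imply
\[ \rho_{\alpha,\beta}(\omega_1\omega_2\omega_3\omega_4\omega_5\omega_1\omega_4\omega_3\omega_2\omega_5) = \rho_{\alpha,\beta}(\omega_5\omega_1\omega_2\omega_3\omega_4\omega_5\omega_1\omega_4\omega_3\omega_2), \]
\[ \rho_{\alpha,\beta}(\omega_1\omega_2\omega_3\omega_4\omega_5\omega_3\omega_6\omega_2\omega_1\omega_6\omega_5\omega_4) = \rho_{\alpha,\beta}(\omega_4\omega_1\omega_2\omega_3\omega_4\omega_5\omega_3\omega_6\omega_2\omega_1\omega_6\omega_5) \]
for $\omega_i = \omega_{\alpha,\beta}(e_i)$ with the $e_i$ orthonormal. Thus
\[ \sum_i \alpha_i^2 - \sum_i \beta_i^2 = \Big( \sum_i \alpha_i^2 - \sum_i \beta_i^2 \Big) \Big( \sum_i \alpha_i^3 + \sum_i \beta_i^3 \Big), \]
\[ \Big( \sum_i \alpha_i^2 - \sum_i \beta_i^2 \Big) \Big( \sum_i \alpha_i^4 - \sum_i \beta_i^4 \Big) = \Big( \sum_i \alpha_i^3 + \sum_i \beta_i^3 \Big)^2 . \]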
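The two identities above can be read off from the cycle types of the four pair partitions determined by the words in the $\omega_i$ (each letter occurs exactly twice, so only one pair partition contributes to each expectation). The sketch below recomputes these cycle types under the reading of Definition 3.3 used here; `word_to_pairs` and `cycle_lengths` are illustrative helper names, not notation from the paper.

```python
def word_to_pairs(word):
    """Pair the two occurrences of each letter; positions are counted from 1."""
    first_seen, pairs = {}, []
    for position, letter in enumerate(word, start=1):
        if letter in first_seen:
            pairs.append((first_seen.pop(letter), position))
        else:
            first_seen[letter] = position
    return pairs


def cycle_lengths(partition):
    """Sorted cycle lengths of a pair partition in the sense of Definition 3.3."""
    lefts = {l for l, _ in partition}
    stack, hat = [], {}
    for p in sorted(q for pair in partition for q in pair):
        if p in lefts:
            stack.append(p)
        else:
            hat[stack.pop()] = p          # noncrossing companion V-hat
    left_of_right = {r: l for l, r in partition}
    seen, lengths = set(), []
    for l, _ in partition:
        length, cur = 0, l
        while cur not in seen:
            seen.add(cur)
            length += 1
            cur = left_of_right[hat[cur]]
        if length:
            lengths.append(length)
    return sorted(lengths)


words = {
    "first, left":   [1, 2, 3, 4, 5, 1, 4, 3, 2, 5],
    "first, right":  [5, 1, 2, 3, 4, 5, 1, 4, 3, 2],
    "second, left":  [1, 2, 3, 4, 5, 3, 6, 2, 1, 6, 5, 4],
    "second, right": [4, 1, 2, 3, 4, 5, 3, 6, 2, 1, 6, 5],
}
for name, word in words.items():
    print(name, cycle_lengths(word_to_pairs(word)))
```

A cycle of length $m \ge 2$ contributes the factor $\sum_i \alpha_i^m + (-1)^{m+1}\sum_i \beta_i^m$, so cycle types (2) versus (2, 3) and (2, 4) versus (3, 3) give exactly the two displayed equations.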
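If $\omega_{\alpha,\beta}(f)$ are unbounded operators one has to be more careful with expressions of the type $\tau(\prod_{i=1}^{p} \omega_{k_i})$. We consider first the cutoff fields $\omega_i^{(c)} = P_i^{(c)}\omega_i$, where $P_i^{(c)}$ is the spectral projection of $\omega_i$ associated to the interval $[-c, c]$. Then we still have
\[ \operatorname*{w-lim}_{n\to\infty} \Gamma_{\alpha,\beta}(S^n)(M^{(c)}) = \rho_{\alpha,\beta}(M^{(c)})\mathbf{1} \tag{4.5} \]
for $M^{(c)} = \prod_{i=1}^{p} \omega_{k_i}^{(c)}$. Finally, by letting $c \to \infty$ we obtain the same result as in the case of bounded fields.

Lemma 4.4. Let $\alpha, \beta$ be as in Lemma 4.3. The vacuum vector $\Omega \in \mathcal{F}_{\alpha,\beta}(\ell^2_{\mathbb{R}}(\mathbb{Z}))$ is not separating for $\Gamma_{\alpha,\beta}(\ell^2_{\mathbb{R}}(\mathbb{Z}))$.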
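Proof. Let $(e_n)_{n\in\mathbb{Z}}$ be the orthonormal basis of $\ell^2_{\mathbb{R}}(\mathbb{Z})$ and denote $a_i^\sharp = a^\sharp(e_i)$. We consider the generalised Wick products $X = \#(\mathbf{f}, \{(1, 4), (2, 6)\})$ and $Y = \#(\mathbf{g}, \{(2, 4)\})$ with $\mathbf{f}(3) = \mathbf{g}(3) = e_2$, $\mathbf{f}(5) = \mathbf{g}(1) = e_1$. Then
\[ X\Omega = Y\Omega = a_1^* a_3 a_2^* a_3^* \Omega. \tag{4.6} \]
On the other hand consider
\[ \psi_n = a_3 a_2^* a_4 a_3^* a_5 a_4^* \cdots a_n a_{n-1}^* a_n^* \Omega, \tag{4.7} \]
then
\[ \langle a_1^*\Omega, X\psi_n\rangle = \sum_i \alpha_i^{n+2} + (-1)^{n+1} \sum_i \beta_i^{n+2}, \qquad \langle a_1^*\Omega, Y\psi_n\rangle = \sum_i \alpha_i^{n} + (-1)^{n+1} \sum_i \beta_i^{n}, \tag{4.8} \]
which cannot be equal for all $n$. Hence $X \ne Y$ although $X\Omega = Y\Omega$.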
Remark. At first sight it might be surprising that the state $\rho_{\alpha,\beta}$ which is obtained by extending the characters of the infinite symmetric group is not tracial. However the trace property of $\rho_{\alpha,\beta}$ on $\Gamma_{\alpha,\beta}(K)$ is independent of the trace property of the characters $\phi_{\alpha,\beta}$.
5. An Example with Type $I_\infty$ Factors

We consider in more detail the following particular functions:
1) $t_N := t_{\alpha,\beta}$ with $\alpha_i = \frac{1}{N}$ for $i = 1, \dots, N$ and $\beta_j = 0$ for all $j$;
2) $t_{-N} := t_{\alpha,\beta}$ with $\beta_i = \frac{1}{N}$ for $i = 1, \dots, N$ and $\alpha_j = 0$ for all $j$.

For any $N \in \mathbb{Z} \setminus \{0\}$ we have
\[ t_N(\mathcal{V}) = \Big(\frac{1}{N}\Big)^{|\mathcal{V}| - c(\mathcal{V})}, \tag{5.1} \]
where $|\mathcal{V}|$ and $c(\mathcal{V})$ denote the number of pairs, respectively the number of cycles, of the pair partition $\mathcal{V}$. The corresponding character is denoted by $\phi_N$.

In the spirit of [2–5] we define an alternative representation of the algebra of creation and annihilation operators using the technique of deformation of the inner product on the full Fock space. Let $H$ be a Hilbert space and $\mathcal{F}^{(\mathrm{fin})}(H)$ be the linear span of the vectors of the form $f_1 \otimes \dots \otimes f_n$ with $n \ge 0$ and $f_i \in H$, endowed with the usual inner product on the full Fock space over $H$. We consider the new sesquilinear form $\langle \cdot, \cdot\rangle_N$ given by the sesquilinear extension of
\[ \langle f_1 \otimes \dots \otimes f_n, g_1 \otimes \dots \otimes g_m\rangle_N = \delta_{n,m} \sum_{\tau\in S(n)} \phi_N(\tau)\, \langle f_1, g_{\tau(1)}\rangle \cdots \langle f_n, g_{\tau(n)}\rangle. \tag{5.2} \]
The positivity of $\langle \cdot, \cdot\rangle_N$ follows from that of the operator $Q_N$ defined on $\mathcal{F}^{(\mathrm{fin})}(H)$ whose restriction to $H^{\otimes n}$ is
\[ Q_N^{(n)} = \sum_{\tau\in S(n)} \phi_N(\tau)\, \tilde{U}_n(\tau). \tag{5.3} \]
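For a unit vector $f$, formula (5.2) gives $\langle f^{\otimes n}, f^{\otimes n}\rangle_N = \sum_{\tau\in S(n)} \phi_N(\tau)$, and with $\phi_N(\tau) = N^{c(\tau)-n}$ (where $c(\tau)$ counts all cycles of $\tau$, fixed points included) this sum can be evaluated by brute force. The sketch below does so with exact fractions; the function names are illustrative, and the closed form $\prod_{k=0}^{n-1}(1 + k/N)$ used for comparison is the standard generating function of cycle counts rather than a formula quoted from the paper.

```python
from fractions import Fraction
from itertools import permutations


def cycle_count(perm):
    """Number of cycles (fixed points included) of a permutation given as a tuple of images."""
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            cur = start
            while cur not in seen:
                seen.add(cur)
                cur = perm[cur]
    return count


def phi_N(perm, N):
    """The character phi_N evaluated as (1/N)**(n - c(perm))."""
    return Fraction(1, N) ** (len(perm) - cycle_count(perm))


def norm_sq_power(n, N):
    """<f^n, f^n>_N for a unit vector f, computed from (5.2)."""
    return sum(phi_N(p, N) for p in permutations(range(n)))


def closed_form(n, N):
    """Product of (1 + k/N) for k = 0, ..., n-1."""
    value = Fraction(1)
    for k in range(n):
        value *= 1 + Fraction(k, N)
    return value


for N in (-2, -3):
    print(N, [norm_sq_power(n, N) for n in range(6)])
```

For negative $N$ the squared norm vanishes as soon as $n > |N|$, which is the "exclusion principle" mentioned in the abstract seen at the level of the deformed inner product.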
We denote by $D_N$ the operator on $\mathcal{F}^{(\mathrm{fin})}(H)$ which restricts to
\[ D_N^{(n+1)} := 1 + \frac{1}{N} \sum_{i=2}^{n+1} \tilde{U}_{n+1}(\pi_{1,i}) \tag{5.4} \]
on $H^{\otimes n+1}$ and $D_N^{(0)}\Omega = \Omega$. The permutation $\pi_{j,i}$ is the transposition of $i$ and $j$. Let $\tilde{Q}_N^{(n)}$ be the operator $1 \otimes Q_N^{(n)}$ acting on $H^{\otimes n+1}$. The following relation is immediate:
\[ Q_N^{(n+1)} = D_N^{(n+1)}\, \tilde{Q}_N^{(n)}. \tag{5.5} \]
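Relation (5.5) can be checked at the level of the group algebra of $S(n+1)$, where it says that every permutation is either one fixing the first point or a transposition $\pi_{1,i}$ composed with such a permutation, the latter having one cycle fewer. The sketch below verifies this for small $n$; it works with formal sums of permutations rather than with the operators $\tilde{U}_n(\tau)$ themselves, and all names in it are illustrative.

```python
from fractions import Fraction
from itertools import permutations


def cycle_count(perm):
    seen, count = set(), 0
    for start in range(len(perm)):
        if start not in seen:
            count += 1
            cur = start
            while cur not in seen:
                seen.add(cur)
                cur = perm[cur]
    return count


def Q(n, N):
    """Formal sum over S(n) with coefficients phi_N(tau) = (1/N)**(n - c(tau))."""
    return {p: Fraction(1, N) ** (n - cycle_count(p)) for p in permutations(range(n))}


def Q_tilde(n, N):
    """1 (x) Q^(n): embed S(n) into S(n+1) so that it fixes the first point."""
    return {(0,) + tuple(x + 1 for x in p): c for p, c in Q(n, N).items()}


def D(size, N):
    """Identity plus (1/N) times the transpositions of the first point with every other point."""
    identity = tuple(range(size))
    element = {identity: Fraction(1)}
    for i in range(1, size):
        swap = list(identity)
        swap[0], swap[i] = swap[i], swap[0]
        element[tuple(swap)] = Fraction(1, N)
    return element


def multiply(x, y):
    """Product in the group algebra; (a * b)(i) = a(b(i))."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            key = tuple(a[b[i]] for i in range(len(a)))
            out[key] = out.get(key, 0) + ca * cb
    return out


for n in range(4):
    print(n, multiply(D(n + 1, -2), Q_tilde(n, -2)) == Q(n + 1, -2))
```

Because the check is purely combinatorial, it holds for every nonzero integer $N$ and does not depend on whether the representation $\tilde{U}$ composes on the left or on the right.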
Let $l(f)$, $l^*(f)$ be the left creation and annihilation operators on the full Fock space over $H$. On $\mathcal{F}^{(\mathrm{fin})}(H)$ we define a new creation operator $a_N^*(f) := l(f)$ and the annihilation operator $a_N(f) = l^*(f)D_N$. From (5.5) we get
\[ \langle \eta, a_N(f)\xi\rangle_N = \langle a_N^*(f)\eta, \xi\rangle_N \tag{5.6} \]
for $\eta, \xi \in \mathcal{F}^{(\mathrm{fin})}(H)$. By dividing out the $\|\cdot\|_N$-norm zero vectors we obtain the pre-Hilbert space $\mathcal{F}_N^{(\mathrm{fin})}(H)$ whose completion with respect to $\langle \cdot, \cdot\rangle_N$ is denoted by $\mathcal{F}_N(H)$. The operators $a_N(f)$, $a_N^*(f)$ are well defined on $\mathcal{F}_N^{(\mathrm{fin})}(H)$ and are each other’s adjoint on $\mathcal{F}_N(H)$. Let $\omega_N(f) := a_N(f) + a_N^*(f)$ be the symmetric “field operators”. We will prove that the vacuum expectations of monomials in $\omega_N(\cdot)$ satisfy Eq. (2.3) characterising the gaussian states with $t = t_N$.
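Lemma 5.1. For any $f, g \in H$ the following relation holds on $\mathcal{F}_N(H)$:
\[ a_N(f)a_N^*(g) = \langle f, g\rangle \mathbf{1} + \frac{1}{N}\, d\Gamma(|g\rangle\langle f|), \tag{5.7} \]
where the differential second quantisation operator $d\Gamma(A)$ is defined by
\[ d\Gamma(A)\, f_1 \otimes \dots \otimes f_n = \sum_{i=1}^{n} f_1 \otimes \dots \otimes A f_i \otimes \dots \otimes f_n \tag{5.8} \]
for $A \in B(H)$.

Proof. Direct application of the definitions.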
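The "direct application of the definitions" can be traced on basis vectors. The sketch below represents finite linear combinations of elementary tensors $e_{w_1} \otimes \dots \otimes e_{w_n}$ as dictionaries from index tuples to coefficients and checks (5.7) for $f = e_a$, $g = e_b$. It works on the algebraic tensor space before null vectors are divided out, and every function name in it is an illustrative assumption.

```python
from fractions import Fraction
from itertools import product


def add(vec, word, coeff):
    """Add coeff * word to vec, dropping entries whose coefficient becomes zero."""
    total = vec.get(word, 0) + coeff
    if total == 0:
        vec.pop(word, None)
    else:
        vec[word] = total


def create(b, vec):
    """a_N^*(e_b) = l(e_b): put e_b in the first tensor position."""
    return {(b,) + word: coeff for word, coeff in vec.items()}


def apply_D(vec, N):
    """D_N from (5.4): identity plus (1/N) times the swaps of position 1 with position i."""
    out = {}
    for word, coeff in vec.items():
        add(out, word, coeff)
        for i in range(1, len(word)):
            swapped = list(word)
            swapped[0], swapped[i] = swapped[i], swapped[0]
            add(out, tuple(swapped), coeff * Fraction(1, N))
    return out


def annihilate(a, vec, N):
    """a_N(e_a) = l^*(e_a) D_N: apply D_N, then strip a leading e_a."""
    out = {}
    for word, coeff in apply_D(vec, N).items():
        if word and word[0] == a:
            add(out, word[1:], coeff)
    return out


def d_gamma(b, a, vec):
    """dGamma(|e_b><e_a|) from (5.8): replace one e_a by e_b, summed over positions."""
    out = {}
    for word, coeff in vec.items():
        for i, letter in enumerate(word):
            if letter == a:
                add(out, word[:i] + (b,) + word[i + 1:], coeff)
    return out


def relation_holds(word, a, b, N):
    """Check a_N(e_a) a_N^*(e_b) = <e_a, e_b> 1 + (1/N) dGamma(|e_b><e_a|) on one basis vector."""
    vec = {word: Fraction(1)}
    lhs = annihilate(a, create(b, vec), N)
    rhs = {}
    if a == b:
        add(rhs, word, Fraction(1))
    for w, c in d_gamma(b, a, vec).items():
        add(rhs, w, c / N)
    return lhs == rhs


print(all(relation_holds(w, a, b, -2)
          for n in range(4) for w in product(range(2), repeat=n)
          for a in range(3) for b in range(3)))
```

Taking $a = b$ and a word consisting of $|N|$ copies of $e_a$ shows both sides vanishing for negative $N$, in line with the bound on the number operator discussed in the introduction.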
Lemma 5.2. Let $f_1, \dots, f_p \in H$. Then
\[ \langle \Omega, \omega_N(f_1) \cdots \omega_N(f_p)\Omega\rangle = \sum_{\mathcal{V}\in P_2(p)} t_N(\mathcal{V}) \prod_{(k,l)\in\mathcal{V}} \langle f_k, f_l\rangle. \tag{5.9} \]

Proof. For $p$ odd the expectation is zero. From the definitions of the creation and annihilation operators it is clear that the vacuum state is a Fock state for a certain positive definite function $t$ on pair partitions. We have to prove that $t \equiv t_N$ with the latter as defined in (5.1). By linearity it is enough to prove the relation for $\frac{p}{2}$ pairs of orthonormal vectors $e_1, \dots, e_{p/2}$ ordered such that they give rise to a certain pair partition $\mathcal{V} = \{(l_1, r_1), \dots, (l_{p/2}, r_{p/2})\} \in P_2(p)$:
\[ \Big\langle \Omega, \prod_{i=1}^{p} a_N^{\sharp_i}(f_i)\Omega \Big\rangle = t_N(\mathcal{V}) \tag{5.10} \]
for $a_N^{\sharp_{l_k}}(f_{l_k}) = a_N(e_k) = (a_N^{\sharp_{r_k}}(f_{r_k}))^*$.

We consider the innermost pairs of the noncrossing pair partition $\hat{\mathcal{V}}$ associated to $\mathcal{V}$ (see Definition 3.3). Each such pair is of the form $(k, k + 1)$ for some $1 \le k < p$. The corresponding term in the monomial is $a_N(f_k)a_N^*(f_{k+1})$. We distinguish two cases:

1) If $f_k = f_{k+1} = f$ then $(k, k + 1) \in \mathcal{V}$. By (5.7) we have
\[ a_N(f_k)a_N^*(f_{k+1}) = \mathbf{1} + \frac{1}{N}\, d\Gamma(P_f) \tag{5.11} \]
with $P_f$ the projection on the one dimensional space spanned by $f$. The term $d\Gamma(P_f)$ contributes zero to the expectation because the rest of the vectors $f_i$ are orthogonal to $f$. Thus the pair $(k, k + 1) \in \mathcal{V}$ can be deleted without changing the expectation.

2) If $f_k \ne f_{k+1}$ then $k$ and $k + 1$ belong to different pairs $(k, a)$ and respectively $(b, k + 1)$ in $\mathcal{V}$. By (5.7) we have
\[ a_N(f_k)a_N^*(f_{k+1}) = \frac{1}{N}\, d\Gamma(|f_{k+1}\rangle\langle f_k|). \tag{5.12} \]
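The action of the operator $d\Gamma(|f_{k+1}\rangle\langle f_k|)$ on $\prod_{i=k+2}^{p} a_N^{\sharp_i}(f_i)\Omega$ is in effect to replace the vector $f_k$, which appears exactly once in any tensor product, by $f_{k+1}$. Equivalently, one can delete the positions $k$ and $k + 1$ from the ordered sequence and replace the pairs $(b, k + 1)$, $(k, a)$ by one pair $(b, a)$, leaving the other pairs invariant. Let us denote this pair partition by $\check{\mathcal{V}}_k$. Then
\[ \Big\langle \Omega, \prod_{i=1}^{p} a_N^{\sharp_i}(f_i)\Omega \Big\rangle = t(\mathcal{V}) = \frac{1}{N}\, t(\check{\mathcal{V}}_k). \tag{5.13} \]
By repeating the procedure of reducing the number of pairs through Step 1) or 2) we arrive at
\[ \Big\langle \Omega, \prod_{i=1}^{p} a_N^{\sharp_i}(f_i)\Omega \Big\rangle = t(\mathcal{V}) = \Big(\frac{1}{N}\Big)^{\frac{p}{2} - c(\mathcal{V})} = t_N(\mathcal{V}). \tag{5.14} \]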
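Identity (5.14) lends itself to a brute-force check for small $p$: build the monomial of creation and annihilation operators attached to a pair partition, apply it to the vacuum, and compare the resulting coefficient with $(1/N)^{|\mathcal{V}| - c(\mathcal{V})}$. The sketch below does this for all pair partitions of 4 and 6 points, using the same dictionary representation of tensors as an illustrative device; none of the function names come from the paper, and the cycle count follows the reading of Definition 3.3 via the stack-matched noncrossing partition.

```python
from fractions import Fraction


def pair_partitions(points):
    points = list(points)
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for i, partner in enumerate(rest):
        for tail in pair_partitions(rest[:i] + rest[i + 1:]):
            yield [(first, partner)] + tail


def cycle_count(partition):
    """Number of cycles of a pair partition (Definition 3.3)."""
    lefts = {l for l, _ in partition}
    stack, hat = [], {}
    for p in sorted(q for pair in partition for q in pair):
        if p in lefts:
            stack.append(p)
        else:
            hat[stack.pop()] = p
    left_of_right = {r: l for l, r in partition}
    seen, count = set(), 0
    for l, _ in partition:
        if l not in seen:
            count += 1
            cur = l
            while cur not in seen:
                seen.add(cur)
                cur = left_of_right[hat[cur]]
    return count


def create(b, vec):
    """a_N^*(e_b) = l(e_b)."""
    return {(b,) + word: coeff for word, coeff in vec.items()}


def annihilate(a, vec, N):
    """a_N(e_a) = l^*(e_a) D_N on finite combinations of basis tensors."""
    out = {}
    for word, coeff in vec.items():
        for i in range(len(word)):
            # i = 0 is the identity term of D_N, i >= 1 the swap of positions 1 and i + 1
            swapped = list(word)
            swapped[0], swapped[i] = swapped[i], swapped[0]
            if swapped[0] == a:
                rest = tuple(swapped[1:])
                out[rest] = out.get(rest, 0) + (coeff if i == 0 else coeff / N)
    return out


def vacuum_expectation(partition, N):
    """<Omega, prod_i a^#_i(f_i) Omega> with an annihilator at each left point, a creator at each right point."""
    operators = {}
    for letter, (l, r) in enumerate(partition):
        operators[l] = ("annihilate", letter)
        operators[r] = ("create", letter)
    vec = {(): Fraction(1)}
    for position in sorted(operators, reverse=True):   # rightmost operator acts first
        kind, letter = operators[position]
        vec = create(letter, vec) if kind == "create" else annihilate(letter, vec, N)
    return vec.get((), Fraction(0))


def t_N(partition, N):
    return Fraction(1, N) ** (len(partition) - cycle_count(partition))


print(all(vacuum_expectation(v, -2) == t_N(v, -2)
          for v in pair_partitions(range(1, 7))))
```

The vacuum expectation is read off as the coefficient of the empty tensor, which is legitimate because $\langle\cdot,\cdot\rangle_N$ restricted to the vacuum component is the usual inner product.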
Remark. We thus conclude that the representation of the algebra of creation and annihilation operators constructed in this section and the one described in Sect. 2 are unitarily equivalent as GNS representations with respect to the Fock state $\rho_N$ associated to $t_N$. As in Definition 4.1, we denote by $\Gamma_N(K)$ the von Neumann algebra generated by the spectral projections of the selfadjoint extensions of the operators $\omega_N(f)$ acting on $\mathcal{F}_N(K_{\mathbb{C}})$ for all $f \in K$.
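From the relation (5.7) we can conclude that $a_N(f)$ is bounded for $N < 0$ and unbounded for $N > 0$. We will concentrate on the von Neumann algebra $\Gamma_N(K)$ for $N < -1$ and infinite dimensional $K$. We have seen that, except for the three special cases listed in Sect. 4, the von Neumann algebras $\Gamma_{\alpha,\beta}(K)$ are infinite. In fact we will show that $\Gamma_N(\ell^2_{\mathbb{R}}(\mathbb{Z}))$ is the whole algebra of bounded operators on $\mathcal{F}_N(\ell^2(\mathbb{Z}))$ (for $N < -1$).

Proposition 5.3. Let $N < -1$ be an integer. Then $\Gamma_N(\ell^2_{\mathbb{R}}(\mathbb{Z})) = B(\mathcal{F}_N(\ell^2(\mathbb{Z})))$.

Proof. We show that the projection $P_\Omega$ onto the one dimensional space spanned by the vacuum vector belongs to $\Gamma_N(\ell^2_{\mathbb{R}}(\mathbb{Z}))$. From this we can conclude that the algebra $\Gamma_N(\ell^2_{\mathbb{R}}(\mathbb{Z}))$ is the whole $B(\mathcal{F}_N(\ell^2(\mathbb{Z})))$ because $\Omega$ is a cyclic vector for $W_N(\ell^2_{\mathbb{R}}(\mathbb{Z}))$, which is dense in $\Gamma_N(\ell^2_{\mathbb{R}}(\mathbb{Z}))$. Let $N_i := d\Gamma(|e_i\rangle\langle e_i|)$ be the number operator counting “one-particle” $e_i$-states in $\mathcal{F}_N(\ell^2(\mathbb{Z}))$. For simplicity we use the notations $\omega_i := \omega_N(e_i)$ and similarly for $a_i$, $a_i^*$. Let $\#(\mathbf{f}, \mathcal{V})$ be an arbitrary Wick product with $\mathbf{f} : F \to \ell^2_{\mathbb{R}}(\mathbb{Z})$, $\mathcal{V} \in P_2(P)$ and $\{F, P\}$ a disjoint partition of the ordered set $\{1, \dots, 2n + p\}$. On the Wick algebra $W_N(\ell^2_{\mathbb{R}}(\mathbb{Z}))$ we define the map $F$:
\[ F : \#(\mathbf{f}, \mathcal{V}) \mapsto \operatorname*{w-lim}_{n\to\infty} \omega_n\, \#(\mathbf{f}, \mathcal{V})\, \omega_n = \#(\mathbf{f}, \bar{\mathcal{V}}), \tag{5.15} \]
where $\bar{\mathcal{V}} = \mathcal{V} \cup \{(0, 2n + p + 1)\}$ is obtained by adding to $\mathcal{V}$ the pair $(0, 2n + p + 1)$, which embraces all other points of the set $\{1, \dots, 2n + p\}$. Such a map has been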
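used previously in Sect. 6 of [11]. The following limits are easy to check by taking expectations with respect to vectors in $\mathcal{F}_N^{(\mathrm{fin})}(\ell^2(\mathbb{Z}))$:
\[ \operatorname*{w-lim}_{n\to\infty} \omega_n a_i a_i \omega_n = \frac{1}{N}\, a_i a_i, \qquad \operatorname*{w-lim}_{n\to\infty} \omega_n a_i^* a_i \omega_n = \frac{1}{N^2}\, N_i, \]
\[ \operatorname*{w-lim}_{n\to\infty} \omega_n N_i \omega_n = N_i, \qquad \operatorname*{w-lim}_{n\to\infty} \omega_n a_i a_i^* \omega_n = \mathbf{1} + \frac{1}{N}\, N_i. \]
These relations lead to
\[ \operatorname*{w-lim}_{k\to\infty} F^k(\omega_i^2) = \mathbf{1} + \frac{N + 1}{N^2}\, N_i, \tag{5.16} \]
which implies that $N_i \in \Gamma_N(\ell^2_{\mathbb{R}}(\mathbb{Z}))$. Let $P(i)$ denote the projection on the eigenspace of $N_i$ with corresponding eigenvalue equal to zero. Then $(P(i))_{i=-\infty}^{\infty}$ form a commuting family of projections in $\Gamma_N(\ell^2_{\mathbb{R}}(\mathbb{Z}))$ and
\[ P_\Omega = \operatorname*{w-lim}_{k\to\infty} \prod_{i=-k}^{k} P(i). \tag{5.17} \]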
Definition 5.4. i) The category of non-commutative probability spaces has as objects pairs $(\mathcal A, \rho_{\mathcal A})$ of von Neumann algebras and normal states, and as morphisms between two objects $(\mathcal A, \rho_{\mathcal A})$ and $(\mathcal B, \rho_{\mathcal B})$ all completely positive maps $T : \mathcal A \to \mathcal B$ such that $T(1_{\mathcal A}) = 1_{\mathcal B}$ and $\rho_{\mathcal B}(Tx) = \rho_{\mathcal A}(x)$ for all $x \in \mathcal A$. ii) A functor $G$ from the category of (real) Hilbert spaces with contractions to the category of non-commutative probability spaces is called a functor of white noise if $G(\{0\}) = \mathbb C$, where $\{0\}$ stands for the zero dimensional Hilbert space.

We construct for any real Hilbert space $K$ a von Neumann algebra $\Gamma_N(K)$ such that $\Gamma_N$ becomes a functor of white noise.

Definition 5.5. Let $K$ be a real Hilbert space. On the Fock space $\mathcal F_N(K_{\mathbb C} \oplus \ell^2(\mathbb Z))$ we define the von Neumann algebra $\Gamma_N(K)$ generated by the Wick products $\#(f, \mathcal V)$ with $\mathrm{Im}(f) \subset K \oplus 0$.

Lemma 5.6. Let $T : K \to K'$ be a contraction. Then the map defined on the Wick products $\#(f, \mathcal V) \in \Gamma_N(K)$ by
$$\Gamma_N(T) : \#(f, \mathcal V) \mapsto \#((T \oplus 1) \circ f, \mathcal V) \tag{5.18}$$
extends to a morphism from $\Gamma_N(K)$ to $\Gamma_N(K')$. Moreover $\Gamma_N$ is a functor of white noise.
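Proof. If $TT^* = 1_{K'}$ then
$$\#(f, \mathcal V) \mapsto \mathcal F_N(T \oplus 1)\, \#(f, \mathcal V)\, \mathcal F_N(T \oplus 1)^* = \#((T \oplus 1) \circ f, \mathcal V) \tag{5.19}$$
restricts to the desired map on Wick products $\#(f, \mathcal V)$ with $f(k) = f_k \oplus 0$ and subsequently extends to a completely positive map $\Gamma_N(T)$ from $\Gamma_N(K)$ to $\Gamma_N(K')$.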
If $T^*T = 1_K$ then there exists an orthogonal operator $O_T : K \oplus \ell^2_{\mathbb R}(\mathbb Z) \to K' \oplus \ell^2_{\mathbb R}(\mathbb Z)$ whose restriction to $K$ coincides with $T$, and thus
$$\mathcal F_N(O_T)\, \#(f, \mathcal V)\, \mathcal F_N(O_T)^* = \#(O_T \circ f, \mathcal V) \tag{5.20}$$
has again the required action on Wick products with vectors from $K$ and extends to a $*$-homomorphism $\Gamma_N(T)$ from $\Gamma_N(K)$ to $\Gamma_N(K')$. An arbitrary contraction $T$ can be written as a product $PI$ of a co-isometry and an isometry. Then we define $\Gamma_N(T) := \Gamma_N(P)\Gamma_N(I)$, which does not depend on the particular choice of $P$ and $I$, and apply the previous cases. By definition $\Gamma_N(\emptyset) = \mathbb C$.
Theorem 5.7. The von Neumann algebra $\Gamma_N(\ell^2_{\mathbb R}(\mathbb Z))$ is isomorphic to a discrete sum of type $I_\infty$ factors.

Proof. We denote by $(e_i \oplus \check e_j)_{i,j\in\mathbb Z}$ an orthonormal basis of $\ell^2(\mathbb Z) \oplus \ell^2(\mathbb Z)$. The corresponding number operators are $N_i$ and $\check N_j$. As in the proof of Proposition 5.3 one can show that $N_i \in \Gamma_N(\ell^2_{\mathbb R}(\mathbb Z))$ and $\check N_j \in \Gamma_N(\ell^2_{\mathbb R}(\mathbb Z))'$. The common eigenspaces of all the number operators $N_i$, $\check N_j$ are finite dimensional, which implies that the selfadjoint elements $Z$ in the center of $\Gamma_N(\ell^2_{\mathbb R}(\mathbb Z))$ have discrete spectrum, and thus the center is isomorphic to $\ell^\infty(M)$ for a countable discrete set $M$. Let $n := (n_i)_{i\in\mathbb Z}$ be a sequence of natural numbers such that only a finite number of them are different from zero. We denote by $\mathcal F_N(n)$ the joint eigenspace of the operators $N_i$ with eigenvalues $n_i$. The projection $P_n$ onto this space belongs to $\Gamma_N(\ell^2_{\mathbb R}(\mathbb Z))$. We use a similar notation for the projections $\check P_m$ onto the eigenspaces of $\check N_j$. Let $Q \le P_n$ be another projection in $\Gamma_N(\ell^2_{\mathbb R}(\mathbb Z))$ which is equivalent to $P_n$. Then there exists a partial isometry $W \in \Gamma_N(\ell^2_{\mathbb R}(\mathbb Z))$ such that $WW^* = P_n$ and $W^*W = Q$. Furthermore the projections $\check P_m Q$ and $\check P_m P_n$ have finite range for all $m$, which implies that $Q = P_n$, and thus $P_n$ is finite. As $\sum_n P_n = 1$ we conclude that each factor in the decomposition of $\Gamma_N(\ell^2_{\mathbb R}(\mathbb Z))$ must be of type I. But just as in Lemma 4.3 we can show that on $\Gamma_N(\ell^2_{\mathbb R}(\mathbb Z))$ there is no tracial state, and thus all factors are $I_\infty$.

Theorem 5.8. The von Neumann algebra $\Gamma_N(\mathbb R)$ is isomorphic to $\bigoplus_{p=2}^{|N|+1} M_p(\mathbb C)$.

Proof. From Theorem 5.7 we have that $N_i \in \Gamma_N(\ell^2_{\mathbb R}(\mathbb Z)) \subset B(\mathcal F_N(\ell^2(\mathbb Z) \oplus \ell^2(\mathbb Z)))$. Let $P_{i,k}$ be the spectral projection of $N_i$ corresponding to the eigenvalue $0 \le k \le -N$. Then the creation operator $a_i^*$ can be written as
$$a_i^* = \sum_{k=0}^{|N|-1} P_{i,k+1}\, \omega_i\, P_{i,k}, \tag{5.21}$$
and thus all creation and annihilation operators $a_i^*$, $a_i$ belong to $\Gamma_N(\ell^2_{\mathbb R}(\mathbb Z))$. The Wick products $\#(f, \mathcal V)$ can be expressed in terms of the creation and annihilation operators by using the relations
$$a_i a_j^* = \delta_{i,j} + \frac{1}{N}\, d\Gamma(|j\rangle\langle i|) \tag{5.22}$$
and the commutation relations
$$[a(f), d\Gamma(A)] = a(A^* f) \tag{5.23}$$
for $A \in B(\ell^2_{\mathbb R}(\mathbb Z))$. Thus $\Gamma_N(K)$ is generated by the operators $a^\sharp(f \oplus 0)$ with $f \in K$. In particular $\Gamma_N(\mathbb R)$ is the von Neumann algebra generated by $a^\sharp := a^\sharp(e)$ on $\mathcal F_N(\mathbb C \oplus \ell^2(\mathbb Z))$ with $e$ a unit vector in $\mathbb C$. Let $\mathbf N$ be the corresponding number operator and $\psi_k$ a vector with $\mathbf N\psi_k = k\psi_k$. Notice that $0 \le k \le |N|$. If moreover we have $a\psi_k = 0$, then the cyclic representation generated by $\psi_k$ has dimension $|N| - k + 1$ and is isomorphic to $M_{|N|-k+1}(\mathbb C)$. By choosing the appropriate basis of the representation we obtain the matrix of $a^*$
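$$a^* = \frac{1}{\sqrt{|N|}} \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & \sqrt 2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & \sqrt{|N| - k} \\ 0 & 0 & 0 & \cdots & 0 \end{pmatrix}.$$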
The space $\mathcal F_N(\mathbb C \oplus \ell^2(\mathbb Z))$ decomposes into an infinite number of copies of each of these representations, provided that we verify that such "pseudo-vacuum" vectors $\psi_k$ exist in $\mathcal F_N(\mathbb C \oplus \ell^2(\mathbb Z))$. Let $b := a(e_0)$ be another annihilation operator. Define for $0 \le k < |N|$,
$$\psi_k = b^* a^{*k}\Omega + \frac{1}{|N| - k + 1}\, a^*\, d\Gamma(|e_0\rangle\langle e|)\, a^{*k}\Omega. \tag{5.24}$$
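Then $\mathbf N\psi_k = k\psi_k$ and
$$a\psi_k = \frac{1}{N}\, d\Gamma(|e_0\rangle\langle e|)\, a^{*k}\Omega + \frac{1}{|N| - k + 1}\Big(1 + \frac{k-1}{N}\Big)\, d\Gamma(|e_0\rangle\langle e|)\, a^{*k}\Omega = 0. \tag{5.25}$$
We show that $\psi_k \neq 0$. Making use of the previous equality we have
$$\begin{aligned} \langle\psi_k, \psi_k\rangle &= \langle\Omega, a^k b\,\psi_k\rangle \\ &= \|a^{*k}\Omega\|^2 - \frac{1}{|N|(|N| - k + 1)}\, \langle\Omega, a^k\, d\Gamma(|e\rangle\langle e_0|)\, d\Gamma(|e_0\rangle\langle e|)\, a^{*k}\Omega\rangle \\ &= \|a^{*k}\Omega\|^2 - \frac{k}{|N|(|N| - k + 1)}\, \|a^{*k}\Omega\|^2 \\ &= \frac{(|N| - k)(|N| + 1)}{|N|(|N| - k + 1)}\, \|a^{*k}\Omega\|^2 \neq 0. \end{aligned} \tag{5.26}$$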
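The two scalar identities behind (5.25) and (5.26) can be checked with exact rational arithmetic. This is a small sketch under the reading of those formulas given in the text: the total coefficient in (5.25) is $\frac{1}{N} + \frac{1}{|N|-k+1}\big(1 + \frac{k-1}{N}\big)$, and the norm ratio in (5.26) is $1 - \frac{k}{|N|(|N|-k+1)}$. The function names are hypothetical.

```python
from fractions import Fraction


def a_psi_coefficient(N: int, k: int) -> Fraction:
    """Total scalar coefficient multiplying dGamma(|e0><e|) a^{*k} Omega in (5.25)."""
    abs_n = abs(N)
    return Fraction(1, N) + Fraction(1, abs_n - k + 1) * (1 + Fraction(k - 1, N))


def norm_ratio(N: int, k: int) -> Fraction:
    """<psi_k, psi_k> / ||a^{*k} Omega||^2 as read from the third line of (5.26)."""
    abs_n = abs(N)
    return 1 - Fraction(k, abs_n * (abs_n - k + 1))


# Check both identities for a range of negative integers N and 0 <= k < |N|.
for N in range(-8, -1):
    for k in range(0, abs(N)):
        assert a_psi_coefficient(N, k) == 0
        closed_form = Fraction((abs(N) - k) * (abs(N) + 1),
                               abs(N) * (abs(N) - k + 1))
        assert norm_ratio(N, k) == closed_form
        assert closed_form > 0  # so psi_k is nonzero whenever a^{*k} Omega is
```

The positivity of the closed form for $0 \le k < |N|$ is the reason the range of $k$ stops short of $|N|$.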
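Finally let $\psi_N$ be a vector with $\mathbf N\psi_N = |N|\psi_N$. The monomials in creation operators $a^*$ and $(a_k^*)_{k\in\mathbb Z}$ applied to the vacuum form a total set in the Fock space $\mathcal F_N(\mathbb C \oplus \ell^2(\mathbb Z))$. Let us write
$$\psi_N = \sum_{k=-p}^{p} a_k^*\hat\psi_k + a^*\tilde\psi, \tag{5.27}$$
for some vectors $\hat\psi_k$ with $\mathbf N\hat\psi_k = |N|\hat\psi_k$ and $\mathbf N\tilde\psi = (|N| - 1)\tilde\psi$. We show that $\|\psi_N\|^2 = |N| \cdot \|a\psi_N\|^2$:
$$\|a\psi_N\|^2 = \Big\langle \psi_N,\ a^*a\Big(\sum_{k=-p}^{p} a_k^*\hat\psi_k + a^*\tilde\psi\Big)\Big\rangle = \frac{1}{|N|}\Big\langle \psi_N,\ \sum_{k=-p}^{p} a_k^*\hat\psi_k + a^*\tilde\psi\Big\rangle = \frac{1}{|N|}\, \|\psi_N\|^2, \tag{5.28}$$
where we have used that $aa^*\tilde\psi = \frac{1}{|N|}\tilde\psi$, and
$$a^*a\,a_k^*\hat\psi_k = \frac{1}{N}\, a^*\, d\Gamma(|e_k\rangle\langle e|)\hat\psi_k = \frac{1}{N}\, [a^*, d\Gamma(|e_k\rangle\langle e|)]\hat\psi_k = \frac{1}{|N|}\, a_k^*\hat\psi_k.$$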
Corollary 5.9. The functors $\Gamma_{N_1}$ and $\Gamma_{N_2}$ are not isomorphic for $N_1 \neq N_2$.

References
1. Bergeron, F., Labelle, G., Leroux, P.: Combinatorial Species and Tree-like Structures. Cambridge: Cambridge University Press, 1998
2. Bożejko, M., Kümmerer, B., Speicher, R.: q-Gaussian processes: non-commutative and classical aspects. Commun. Math. Phys. 185, 129–154 (1997)
3. Bożejko, M., Speicher, R.: Interpolations between Bosonic and Fermionic relations given by generalized Brownian motion. Math. Z. 222, 135–160 (1996)
4. Bożejko, M., Speicher, R.: An example of a generalized Brownian motion. Commun. Math. Phys. 137, 519–531 (1991)
5. Bożejko, M., Speicher, R.: An example of a generalized Brownian motion II. In: Quantum Probability and Related Topics VII. L. Accardi, ed. Singapore: World Scientific, 1992, pp. 67–77
6. Fivel, D.: Interpolation between Fermi and Bose statistics using generalized commutators. Phys. Rev. Lett. 65, 3361–3364 (1990)
7. Frisch, U., Bourret, R.: Parastochastics. J. Math. Phys. 11, 364–390 (1970)
8. Goodman, F.H., de la Harpe, P., Jones, V.F.R.: Coxeter Graphs and Towers of Algebras. Berlin-Heidelberg-New York: Springer-Verlag, 1989
9. Greenberg, O.W.: Particles with small violations of Fermi or Bose statistics. Phys. Rev. D 43, 4111–4120 (1991)
10. Guţă, M., Maassen, H.: Symmetric Hilbert spaces arising from species of structures. Preprint math-ph/0007005, to appear in Math. Z.
11. Guţă, M., Maassen, H.: Generalised Brownian motion and second quantisation. Preprint math-ph/0011028, to appear in J. Funct. Anal.
12. Joyal, A.: Une Théorie Combinatoire des Séries Formelles. Adv. Math. 42, 1–82 (1981)
13. Joyal, A.: Foncteurs Analytiques et Espèces de Structures. In: Combinatoire énumérative, Proc. Colloq., Montreal/Can. 1985, Lect. Notes Math. 1234, Berlin-Heidelberg-New York: Springer-Verlag, 1986, pp. 126–159
14. Köstler, C.: Quanten-Markoff-Prozesse und Quanten-Brownsche Bewegungen. PhD thesis, Stuttgart 2000
15. Kümmerer, B.: Quantum white noise. In: Infinite Dimensional Harmonic Analysis. H. Heyer, et al. ed. Bamberg: D. u. M. Graebner, 1996, pp. 156–168
16. Kümmerer, B.: Survey on a theory of non-commutative Markov processes. In: Quantum Probability and Applications III. L. Accardi, W. von Waldenfels, eds. Lect. Notes Math. 1303, Heidelberg: Springer-Verlag, 1988, pp. 154–182
17. Kümmerer, B.: A Dilation Theory for Completely Positive Operators on W*-Algebras. Dissertation, Tübingen 1982
18. Maassen, H., van Leeuwen, H.: A q-deformation of the Gaussian distribution. J. Math. Phys. 36, 4743–4756 (1995)
19. Parthasarathy, K.R.: An Introduction to Quantum Stochastic Calculus. Basel: Birkhäuser Verlag, 1992
20. Simon, B.: The $P(\phi)_2$ Euclidean (Quantum) Field Theory. Princeton, NJ: Princeton Univ. Press, 1974
21. Speicher, R.: Generalized statistics of macroscopic fields. Lett. Math. Phys. 27, 97–104 (1993)
22. Takesaki, M.: Theory of Operator Algebras I. New York: Springer-Verlag, 1979
23. Thoma, E.: Die unzerlegbaren positiv-definiten Klassenfunktionen der abzählbar unendlichen, symmetrischen Gruppe. Math. Z. 85, 40–61 (1964)
24. Veršik, A.M., Kerov, S.V.: Characters and factor representations of the infinite symmetric group. Soviet Math. Dokl. 23, 389–392 (1981)
25. Voiculescu, D., Dykema, K., Nica, A.: Free Random Variables. Providence, RI: AMS, 1992
26. Zagier, D.: Realizability of a model in infinite statistics. Commun. Math. Phys. 147, 199–210 (1992)

Communicated by H. Araki
Commun. Math. Phys. 229, 229 – 269 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Vector Bundles and Lax Equations on Algebraic Curves I. Krichever1,2, 1 Department of Mathematics, Columbia University, New York, NY 10027, USA.
E-mail:
[email protected]
2 Landau Institute for Theoretical Physics and ITEP, Moscow, Russia
Received: 25 September 2001 / Accepted: 22 December 2001
Abstract: The Hamiltonian theory of zero-curvature equations with spectral parameter on an arbitrary compact Riemann surface is constructed. It is shown that the equations can be seen as commuting flows of an infinite-dimensional field generalization of the Hitchin system. The field analog of the elliptic Calogero-Moser system is proposed. An explicit parameterization of the Hitchin system based on the Tyurin parameters for stable holomorphic vector bundles on algebraic curves is obtained.

Research is supported in part by the National Science Foundation under the grant DMS-98-02577 and by CRDF Award RP1-2102.

1. Introduction

The main goal of this paper is to construct a Hamiltonian theory of zero curvature equations on an algebraic curve introduced in [1], and to identify them as infinite-dimensional field analogs of the Hitchin system [2]. The zero curvature equation
$$\partial_t L - \partial_x M + [L, M] = 0, \tag{1.1}$$
where $L(x, t, \lambda)$ and $M(x, t, \lambda)$ are rational matrix functions of a spectral parameter $\lambda$,
$$L = u_0(x, t) + \sum_{i,s} u_{is}(x, t)(\lambda - \lambda_i)^{-s}, \qquad M = v_0(x, t) + \sum_{j,k} v_{jk}(x, t)(\lambda - \mu_j)^{-k}, \tag{1.2}$$
of degree $n$ and $m$, respectively, was proposed in [4] as one of the most general types of representation for integrable systems. Equation (1.1), which has to be valid identically in $\lambda$, is equivalent to a system of $(n + m + 1)$ matrix equations for the unknown functions
$u_0$, $v_0$, $u_{is}$, $v_{jk}$. The number of the equations is less than the number of unknown functions. That is due to a gauge symmetry of (1.1). If $g(x, t)$ is an arbitrary matrix function then the transformation
$$L \longrightarrow g_x g^{-1} + gLg^{-1}, \qquad M \longrightarrow g_t g^{-1} + gMg^{-1} \tag{1.3}$$
maps solutions of (1.1) into solutions of the same equations. The gauge transformation can be used to normalize $L$ and $M$. For example, in the gauge $u_0 = v_0 = 0$ the numbers of equations and unknown functions are equal. Hence, Eq. (1.1) is well-defined.

The Riemann-Roch theorem shows that the naive direct generalization of the zero curvature equation for matrix functions that are meromorphic on an algebraic curve of genus $g > 0$ leads to an over-determined system of equations. Indeed, the dimension of the space of $(r \times r)$ matrix functions with a fixed degree $d$ divisor of poles in general position equals $r^2(d - g + 1)$. If the divisors of $L$ and $M$ have degrees $n$ and $m$, then the commutator $[L, M]$ is of degree $n + m$. Therefore, the number of equations $r^2(n + m - g + 1)$ is bigger than the number $r^2(n + m - 2g + 1)$ of unknown functions modulo gauge equivalence.

There are two ways to overcome the difficulty in defining the zero curvature equations on algebraic curves. The first one is based on a choice of a special ansatz for $L$ and $M$. In this way a few integrable systems were found with Lax matrices that are elliptic functions of the spectral parameter. The second possibility, based on a theory of high rank solutions of the KP equation [3], was discovered in [1]. It was shown that if in addition to fixed poles the matrix functions $L$ and $M$ have $rg$ moving poles with special dependence on $x$ and $t$, then Eq. (1.1) is a well-defined system on the space of singular parts of $L$ and $M$ at fixed poles. Recently, an algebraic construction of the zero curvature equations on an algebraic curve was proposed in [5].

If the matrix functions $L$ and $M$ do not depend on $x$, then (1.1) reduces to the Lax equation
$$\partial_t L = [M, L]. \tag{1.4}$$
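The Riemann-Roch counting quoted earlier in this section, $r^2(n+m-g+1)$ equations against $r^2(n+m-2g+1)$ unknowns modulo gauge equivalence, can be tabulated directly. The sketch below only restates those two counts; the function names are illustrative and not taken from the paper.

```python
def equation_count(r: int, n: int, m: int, g: int) -> int:
    """Number of scalar equations: dimension of r x r matrix functions of degree n + m."""
    return r * r * (n + m - g + 1)


def unknown_count(r: int, n: int, m: int, g: int) -> int:
    """Number of unknown functions modulo gauge equivalence, as quoted in the text."""
    return r * r * (n + m - 2 * g + 1)


def overdetermination(r: int, n: int, m: int, g: int) -> int:
    """Excess of equations over unknowns; algebraically this is r^2 * g."""
    return equation_count(r, n, m, g) - unknown_count(r, n, m, g)


# In genus 0 the counts agree (the rational case); for g > 0 the excess is r^2 * g.
for r in range(1, 5):
    for g in range(0, 4):
        assert overdetermination(r, n=3, m=4, g=g) == r * r * g
```

The excess $r^2 g$ matches the dimension of the Tyurin data $(\gamma, \alpha)$ (namely $rg + rg(r-1)$), which is consistent with the role the $rg$ moving poles play in the second approach described above.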
A theory of the Lax equations on an algebraic curve was briefly outlined in [1]. In the next section, for each effective degree $N > g$ divisor $D$ on a smooth genus $g$ algebraic curve $\Gamma$, we introduce a space $\mathcal L^D$ of the Lax matrices, and define a hierarchy of commuting flows on it. The spaces of the Lax matrices associated to equivalent divisors are isomorphic. If $D = K$ is the divisor of zeros of a holomorphic differential, then the space $\mathcal L^K$ is identified with an open set of the cotangent bundle $T^*(\widehat{\mathcal M})$ of the moduli space of semistable holomorphic vector bundles on $\Gamma$, i.e. with an open set of the phase space of the Hitchin system. The commuting hierarchy of the Lax equations on $\mathcal L^K$ are commuting flows of the Hitchin system.

The conventional approach to a theory of the Hitchin system is based on a representation of $T^*(\widehat{\mathcal M})$ as the Hamiltonian reduction of a free infinite-dimensional system modulo an infinite-dimensional gauge group. In the finite-gap or algebro-geometric theory of soliton equations, involutivity of the integrals of motion cannot be taken for granted, as it can in the case of the Hamiltonian reduction. Instead, the commutativity of the hierarchy of the Lax equations is a starting point. It implies involutivity of the integrals whenever the equations are Hamiltonian.

The Lax matrices provide an explicit parameterization of the Hitchin system based on Tyurin parameters for framed stable holomorphic bundles on an algebraic curve [6]. Let $V$ be a stable, rank $r$, and degree $rg$ holomorphic vector bundle on $\Gamma$. Then the dimension of the space of its holomorphic sections is $r = \dim H^0(\Gamma, V)$. Let $\sigma_1, \dots, \sigma_r$ be a basis
of this space. The vectors $\sigma_i(\gamma)$ are linearly independent in the fiber of $V$ over a generic point $\gamma \in \Gamma$, and are linearly dependent,
$$\sum_{i=1}^{r} \alpha_s^i\, \sigma_i(\gamma_s) = 0, \tag{1.5}$$
at the zeros $\gamma_s$ of the corresponding section of the determinant bundle associated to $V$. For a generic $V$ these zeros are simple, i.e. the number of distinct points $\gamma_s$ is equal to $rg = \deg V$, and the vectors $\alpha_s = (\alpha_s^i)$ of linear dependence (1.5) are uniquely defined up to multiplication. A change of the basis $\sigma_i' = \sum_j g_{ij}\sigma_j$ corresponds to the linear transformation of the vectors $\alpha_s$, $\alpha_s' = g^T\alpha_s$. Hence, an open set $\mathcal M \subset \widehat{\mathcal M}$ of the moduli space of vector bundles is parameterized by points of the factor-space
$$\mathcal M = \mathcal M_0 / SL_r, \qquad \mathcal M_0 \subset S^{rg}\big(\Gamma \times CP^{r-1}\big), \tag{1.6}$$
where $SL_r$ acts diagonally on the symmetric power of $CP^{r-1}$. In [1, 7] the parameters $(\gamma_s, \alpha_s)$ were called Tyurin parameters. Recently, the Tyurin parameterization of the Hitchin system for $r = 2$ was found [8].

In Sect. 3 we show that the standard scheme to solve conventional Lax equations using the concept of the Baker-Akhiezer function is equally applicable to the case of Lax equations on algebraic curves. We would like to emphasize that the solution of the Lax equations via the spectral transform of the phase space to algebraic-geometric data does not use a Hamiltonian description of the system. Moreover, a priori it is not clear why the Lax equations are Hamiltonian. In Sect. 4 we clarify this problem using the approach to the Hamiltonian theory of soliton equations proposed in [9–11]. It turns out that for $D = K$ the universal two-form, which is expressed in terms of the Lax matrix and its eigenvectors, coincides with the canonical symplectic structure on the cotangent bundle $T^*(\mathcal M)$. If the divisor $D_K = D - K$ is effective, then the form is non-degenerate on symplectic leaves defined by a choice of the orbits of the adjoint action of $SL_r$ on the singular parts of $L \in \mathcal L^D$ at the punctures $P_m \in D_K$.

In Sect. 5, for each degree $N > g$ divisor $D$ on $\Gamma$, a commuting hierarchy of zero curvature equations is defined. The infinite-dimensional phase space $\mathcal A^D$ of the hierarchy can be seen as a space of connections $\partial_x - L(x, q)$ along loops in $\mathcal M_0$.
We would like to emphasize that $\mathcal A^D$ does depend on the divisor $D$ and not simply on its equivalence class, as in the case of the Lax equations. If $D_K$ is effective, then the equations of the hierarchy are Hamiltonian after restriction to symplectic leaves. The Riemann surface of the Bloch solutions of the equation
$$(\partial_x - L(x, q))\,\psi(x, q) = 0, \qquad x \in S^1,\ q \in \Gamma, \tag{1.7}$$
is an analog of the spectral curves in the $x$-independent case. Algebro-geometric solutions of the hierarchy are constructed in the last section. Note that they can be constructed in all the cases, independently of whether the equations are Hamiltonian or not.

It is instructive to present two examples of the zero curvature equations. The first one is a field analog of the elliptic Calogero–Moser system. The elliptic CM system is a system of $r$ particles with coordinates $q_i$ on an elliptic curve with the Hamiltonian
$$H = \frac{1}{2}\Big(\sum_{i} p_i^2 + \sum_{i \neq j} \wp(q_i - q_j)\Big), \tag{1.8}$$
where $\wp(q)$ is the Weierstrass function. In [12] the elliptic CM system was identified with a particular case of the Hitchin system on an elliptic curve with a puncture. In Sect. 5 we show that the zero curvature equation on an elliptic curve with a puncture is equivalent to the Hamiltonian system which can be seen as the field analog of the elliptic CM system. For $r = 2$ this system is equivalent to the system on a space of periodic functions $p(x)$, $q(x)$ with canonical Poisson brackets
$$\{p(x), q(y)\} = \delta(x - y). \tag{1.9}$$
The Hamiltonian is
$$H = \int \Big( p^2\big(1 - q_x^2\big) - \frac{q_{xx}^2}{2(1 - q_x^2)} + 2\big(1 - 3q_x^2\big)\wp(2q) \Big)\, dx. \tag{1.10}$$
The second example is the Krichever-Novikov equation [3]
$$q_t = \frac{1}{4}\, q_{xxx} + \frac{3}{8 q_x}\big(1 - q_{xx}^2\big) - \frac{1}{2}\, Q(q)\, q_x^2, \tag{1.11}$$
where
$$Q(q) = \partial_q \Phi + \Phi^2, \qquad \Phi = \Phi(q, y) = \zeta(q - y) + \zeta(q + y) - \zeta(2q). \tag{1.12}$$
Note that $Q(q)$ does not depend on $y$. Each solution $q = q(x, t)$ of (1.11) defines a rank 2, genus 1 solution of the KP equation by the formula
$$8u(x, y, t) = \big(q_{xx}^2 - 1\big) q_x^{-2} - 2 q_{xxx}\, q_x^{-1} + 8 q_{xx}\Phi + 4 q_x^2\big(\partial_q\Phi - \Phi^2\big). \tag{1.13}$$
Equation (1.11) has a zero curvature representation on the elliptic curve with a puncture with $r = 2$. The difference between the two examples is in the choice of orbits at the puncture. In the first example the orbit is that of the diagonal matrix $\mathrm{diag}(1, -1)$, while the second example corresponds to the orbit of the Jordan cell.

2. The Lax Equations

We define first the space of Lax matrices associated with a generic effective divisor $D$ on $\Gamma$, and a point $(\gamma, \alpha) = \{\gamma_s, \alpha_s\}$ of the symmetric product $X = S^{rg}\big(\Gamma \times CP^{r-1}\big)$. Throughout the paper it is assumed that the points $\gamma_s \in \Gamma$ are distinct, $\gamma_s \neq \gamma_k$. Let $\mathcal F_{\gamma,\alpha}$ be the space of meromorphic vector functions $f$ on $\Gamma$ that are holomorphic except at the points $\gamma_s$, at which they have a simple pole of the form
$$f(z) = \frac{\lambda_s \alpha_s}{z - z(\gamma_s)} + O(1), \qquad \lambda_s \in \mathbb C. \tag{2.1}$$
The Riemann-Roch theorem implies that dim Fγ ,α ≥ r(rg − g + 1) − rg(r − 1) = r.
(2.2)
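The bound (2.2) is pure arithmetic once the two terms are fixed: $r(rg - g + 1)$ for vector functions with simple poles at the $rg$ points, minus $rg(r-1)$ for the proportionality constraints. A minimal sketch of that count follows; the function name `section_space_bound` is an illustrative choice, not notation from the paper.

```python
def section_space_bound(r: int, g: int) -> int:
    """Lower bound for dim F_{gamma,alpha} from (2.2)."""
    # Riemann-Roch: r components, each a function with simple poles at rg points.
    with_simple_poles = r * (r * g - g + 1)
    # Each of the rg residues must be proportional to alpha_s: (r - 1) conditions each.
    proportionality_constraints = r * g * (r - 1)
    return with_simple_poles - proportionality_constraints


# The bound equals the rank r for every genus.
for r in range(1, 7):
    for g in range(0, 7):
        assert section_space_bound(r, g) == r
```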
The first term in (2.2) is the dimension of the space of meromorphic vector-functions with simple poles at γs . The second term is the number of equations equivalent to the constraint that poles of f are proportional to the vectors αs . The space Fs of meromorphic functions in the neighborhood of γs that have a simple pole at γs of the form (2.1) is the space of local sections of the vector bundle Vγ ,α
corresponding to $(\gamma, \alpha)$ under the inverse to the Tyurin map described in terms of Hecke modification of the trivial bundle. The space of global holomorphic sections of $V_{\gamma,\alpha}$ is just the space $\mathcal F_{\gamma,\alpha}$. Let $\mathcal M_0$ be an open set of the parameters $(\gamma, \alpha)$ such that $\dim \mathcal F_{\gamma,\alpha} = r$.

Let $D = \sum_i m_i P_i$ be an effective divisor on $\Gamma$ that does not intersect with $\gamma$. Then we define a space $\mathcal N^D_{\gamma,\alpha}$ of meromorphic matrix functions $M = M(q)$, $q \in \Gamma$, such that:

$1^0$. $M$ is holomorphic except at the points $\gamma_s$, where it has at most simple poles, and at the points $P_i$ of $D$, where it has poles of degree not greater than $m_i$;

$2^0$. The coefficient $M_{s0}$ of the Laurent expansion of $M$ at $\gamma_s$,
$$M = \frac{M_{s0}}{z - z_s} + M_{s1} + M_{s2}(z - z_s) + O((z - z_s)^2), \qquad z_s = z(\gamma_s), \tag{2.3}$$
is a rank 1 matrix of the form
$$M_{s0} = \mu_s \alpha_s^T \longleftrightarrow M_{s0}^{ij} = \mu_s^i \alpha_s^j, \tag{2.4}$$
where $\mu_s$ is a vector.

The constraint (2.4) does not depend on a choice of local coordinate $z$ in the neighborhood of $\gamma_s$. If $(\gamma, \alpha) \in \mathcal M_0$, then the constraints (2.4) are linearly independent and
$$\dim \mathcal N^D_{\gamma,\alpha} = r^2(N + rg - g + 1) - r^2 g(r - 1) = r^2(N + 1), \qquad N = \deg D. \tag{2.5}$$
Central to all our further constructions is a map
$$\mathcal D : \mathcal N^D_{\gamma,\alpha} \longrightarrow T_{\gamma,\alpha}(\mathcal M_0) \tag{2.6}$$
from $\mathcal N^D_{\gamma,\alpha}$ to the tangent space to $\mathcal M_0$ at the point $(\gamma, \alpha)$. The tangent vector $\partial_m = \mathcal D(M)$ is defined by derivatives of the coordinates
$$\partial_m z_s = -\mathrm{tr}\, M_{s0} = -\alpha_s^T \mu_s, \qquad z_s = z(\gamma_s), \tag{2.7}$$
$$\partial_m \alpha_s^T = -\alpha_s^T M_{s1} + \kappa_s \alpha_s^T, \tag{2.8}$$
where $\kappa_s$ is a scalar. The tangent space to $CP^{r-1}$ at a point represented by the vector $\alpha_s$ is a space of $r$-dimensional vectors $v$ modulo the equivalence $v' = v + \kappa_s \alpha_s$. Therefore, the right hand side of (2.8) is a well-defined tangent vector to $CP^{r-1}$.

Simple dimension counting shows that on an open set of $\mathcal M_0$ the linear map $\mathcal D$ is an injection for $N < g - 1$, and is an isomorphism for $N = g - 1$.

Let us define the space $\mathcal L^D_{\gamma,\alpha}$ of the Lax matrices as the kernel of $\mathcal D$. In other words: a matrix function $L(q) \in \mathcal N^D_{\gamma,\alpha}$ is a Lax matrix if

(i) the singular term of the expansion
$$L = \frac{L_{s0}}{z - z_s} + L_{s1} + L_{s2}(z - z_s) + O((z - z_s)^2), \qquad L_{s0} = \beta_s \alpha_s^T, \quad z_s = z(\gamma_s), \tag{2.9}$$
is traceless,
$$\alpha_s^T \beta_s = \mathrm{tr}\, L_{s0} = 0; \tag{2.10}$$
(ii) $\alpha_s^T$ is a left eigenvector of the matrix $L_{s1}$,
$$\alpha_s^T L_{s1} = \alpha_s^T \kappa_s. \tag{2.11}$$
For a non-special degree $N \ge g$ divisor $D$ and a generic set of the parameters $(\gamma, \alpha)$, the space $\mathcal L^D_{\gamma,\alpha}$ is of dimension
$$\dim \mathcal L^D_{\gamma,\alpha} = r^2(N + 1) - rg - rg(r - 1) = r^2(N - g + 1). \tag{2.12}$$
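The dimension formulas (2.5) and (2.12) chain together: the $rg$ trace conditions (2.10) and the $rg(r-1)$ eigenvector conditions (2.11) are subtracted from $\dim \mathcal N^D_{\gamma,\alpha}$. The sketch below reproduces that bookkeeping; `dim_N` and `dim_L` are hypothetical helper names, and the genericity assumptions of the text are taken for granted.

```python
def dim_N(r: int, g: int, N: int) -> int:
    """dim of N^D_{gamma,alpha} as in (2.5), for a degree-N divisor D."""
    # Riemann-Roch count for poles on D and simple poles at the rg points gamma_s,
    # minus the rank-one constraints (2.4): r(r - 1) conditions per point.
    return r * r * (N + r * g - g + 1) - r * r * g * (r - 1)


def dim_L(r: int, g: int, N: int) -> int:
    """dim of L^D_{gamma,alpha} as in (2.12)."""
    trace_conditions = r * g                 # (2.10): one per point gamma_s
    eigenvector_conditions = r * g * (r - 1)  # (2.11): r - 1 per point gamma_s
    return dim_N(r, g, N) - trace_conditions - eigenvector_conditions


for r in range(1, 5):
    for g in range(0, 5):
        for N in range(g, g + 4):
            assert dim_N(r, g, N) == r * r * (N + 1)
            assert dim_L(r, g, N) == r * r * (N - g + 1)
```

Adding back the dimension $r^2 g$ of the base $\mathcal M_0$ gives $r^2(N+1)$ for the total space, the figure used later in the proof of Lemma 2.3.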
A key characterization of constraints (2.9)–(2.11) is as follows.

Lemma 2.1. A meromorphic matrix-function $L$ in the neighborhood $U$ of $\gamma_s$ with a pole at $\gamma_s$ satisfies the constraints (2.10) and (2.11) if and only if it has the form
$$L = \Phi_s(z)\, \tilde L_s(z)\, \Phi_s^{-1}(z), \tag{2.13}$$
where $\tilde L_s$ and $\Phi_s$ are holomorphic in $U$, and $\det \Phi_s$ has at most a simple zero at $\gamma_s$.

Proof. Let $g_s$ be a constant non-degenerate matrix such that
$$\alpha_s^T g_s = e_1^T, \qquad e_1^T = (1, 0, 0, \dots, 0). \tag{2.14}$$
If $L$ satisfies (2.9, 2.10), then the coefficient $L'_{s0}$ of the Laurent expansion at $\gamma_s$ of the gauge equivalent Lax matrix
$$L'_s = g_s^{-1} L g_s = \frac{L'_{s0}}{z - z_s} + L'_{s1} + O(z - z_s), \qquad z_s = z(\gamma_s), \tag{2.15}$$
equals $f e_1^T$, where $f = g_s^{-1}\beta_s$. Therefore, it has non-zero entries in the first column only,
$$(L'_{s0})_{i,j} = 0, \qquad j = 2, \dots, r. \tag{2.16}$$
Further, the vector $e_1^T$ is a left eigenvector for $L'_{s1}$ corresponding to the eigenvalue $\kappa_s$. Hence, the first row of $L'_{s1}$ equals
$$(L'_{s1})_{11} = \kappa_s, \qquad (L'_{s1})_{1j} = 0, \quad j = 2, \dots, r. \tag{2.17}$$
From (2.16), (2.17) it follows that the matrix $\tilde L_s = f_s^{-1} L'_s f_s$, where $f_s$ is the diagonal matrix
$$f_s(z) = \mathrm{diag}\{(z - z_s), 1, 1, \dots, 1\}, \tag{2.18}$$
is regular at $\gamma_s$. Hence, the Lax matrix $L$ has the form (2.13), where
$$\Phi_s = g_s f_s(z). \tag{2.19}$$
Conversely, suppose $L$ has the form (2.13), and let $\alpha_s$ be the unique (up to multiplication) vector such that $\alpha_s^T \Phi_s(z_s) = 0$. Then the Laurent expansion of $L$ at $\gamma_s$ has the form (2.9). The trace of $L$ is holomorphic, which implies (2.10). Using the equality $\alpha_s^T \Phi_s(z_s) = 0$ we obtain that $\alpha_s^T L$ is holomorphic at $\gamma_s$ and its evaluation at this point is proportional to $\alpha_s^T$. This implies (2.11) and the lemma is proved.
Let $[D]$ be the equivalence class of a degree $N > g$ divisor $D$. Then for any set $(\gamma, \alpha)$ there is a divisor $D'$ equivalent to $D$ that does not intersect with $\gamma$. Constraints (2.10) and (2.11) are invariant under the transformation $L \to hL$, where $h$ is a function holomorphic in the neighborhood of $\gamma_s$. Therefore, the spaces $\mathcal L^D_{\gamma,\alpha}$ and $\mathcal L^{D'}_{\gamma,\alpha}$ of Lax matrices corresponding to equivalent divisors $D$ and $D'$ are isomorphic. They can be regarded as charts of a total space $\mathcal L^{[D]}$, the Lax matrices corresponding to $[D]$.

Let us consider in greater detail the case $D = K$, where $K$ is the zero divisor of a holomorphic differential $dz$. Then $L\,dz$, where $L \in \mathcal L^K_{\gamma,\alpha}$, is a matrix valued one-form that is holomorphic everywhere except at the points $\gamma_s$. The constraints (2.10, 2.11) imply that the space $\mathcal F_s$ of local sections of $V_{\gamma,\alpha}$ is invariant under the adjoint action of $L$,
$$f \in \mathcal F_s \longrightarrow L^T(z) f(z) \in \mathcal F_s. \tag{2.20}$$
Therefore, the gauge equivalence class of the matrix valued differential $L\,dz$ can be seen as a global section of the bundle $\mathrm{End}(V_{\gamma,\alpha}) \otimes \Omega^{1,0}(\Gamma)$. It is basic in the Hitchin system theory that the space of such sections, called Higgs fields, is identified with the cotangent bundle $T^*(\mathcal M)$. It is instructive to establish directly the equivalence
$$\mathcal L^K / SL_r = T^*(\mathcal M), \tag{2.21}$$
using the map (2.6). The formula
$$\langle L, M \rangle = -\sum_s \mathrm{res}_{\gamma_s} \mathrm{Tr}\,(LM)\, dz \tag{2.22}$$
defines a natural pairing between $\mathcal L^K_{\gamma,\alpha}$ and $\mathcal N^D_{\gamma,\alpha}$. For a generic degree $(g - 1)$ divisor $D$ the map (2.6) is an isomorphism. Therefore each tangent vector $w = (\dot z_s, \dot\alpha_s)$ to $\mathcal M_0$ at the point $(z_s = z(\gamma_s), \alpha_s)$ can be represented in the form $\mathcal D(M)$. From (2.7, 2.8) it follows that (2.22) actually defines a pairing between $\mathcal L^K_{\gamma,\alpha}$ and the tangent space $T_{\gamma,\alpha}(\mathcal M_0)$,
$$\langle L, w \rangle = \sum_s \big(\kappa_s \dot z_s + \dot\alpha_s^T \beta_s\big). \tag{2.23}$$
This formula shows that the vector $\beta_s$ and the eigenvalue $\kappa_s$ in (2.10, 2.11) can be regarded as coordinates of a cotangent vector to $\mathcal M_0$. Note that under the change of $dz$ to another holomorphic differential $dz_1$, $\kappa_s$ gets transformed to $\kappa_s' = \kappa_s\, dz/dz_1$. Therefore, the pair $(\gamma_s, \kappa_s)$ can be seen as a point of the cotangent bundle $T^*(\Gamma)$ to the curve $\Gamma$.

The pairing (2.23) descends to a pairing of $\mathcal L^K / SL_r$ with tangent vectors to $\mathcal M$. Indeed, tangent vectors to $\mathcal M$ at a point represented by the gauge equivalence class of $\alpha$ are identified with vectors $\dot\alpha_s$ modulo the transformation $\dot\alpha_s^T \to \dot\alpha_s^T + \alpha_s^T W$, where $W$ is a matrix. Under this transformation the right hand side of (2.23) does not change due to the equation
$$\sum_{s=1}^{rg} \beta_s \alpha_s^T = \sum_s \mathrm{res}_{\gamma_s} L\, dz = 0, \tag{2.24}$$
which is valid because $L\,dz$ is holomorphic except at $\gamma$.
The induced pairing of $\mathcal L^K / SL_r$ with $T(\mathcal M)$ is non-degenerate. Indeed, if $w = \mathcal D(M)$, then (2.22) implies that
$$\langle L, w \rangle = \langle L, M \rangle = \sum_i \mathrm{res}_{P_i} \mathrm{Tr}\,(LM)\, dz. \tag{2.25}$$
Therefore, if (2.23) is degenerate then there is a nontrivial $L$ which has a zero of order $m_i$ at all the points $P_i$ of $D$. That is impossible because $D$ is a generic degree $(g - 1)$ divisor.

Our next goal is to introduce an explicit parameterization of $\mathcal L^K$. Recall that we always assume $(\gamma, \alpha) \in \mathcal M_0$.

Lemma 2.2. The map
$$L \in \mathcal L^K \longmapsto \{\alpha_s, \beta_s, \gamma_s, \kappa_s\}, \tag{2.26}$$
where pairs of orthogonal vectors $(\alpha_s^T \beta_s) = 0$ are considered modulo gauge transformations
$$\alpha_s \to \lambda_s \alpha_s, \qquad \beta_s \to \lambda_s^{-1} \beta_s, \tag{2.27}$$
and satisfy Eq. (2.24), is a one-to-one correspondence.

Proof. Suppose that the images of $L$ and $L'$ under (2.26) coincide. Then $(L - L')\,dz$ is a holomorphic matrix valued differential $\varphi$ such that
$$\alpha_s^T \varphi(\gamma_s) = 0. \tag{2.28}$$
Let $\mathcal F^P_{\gamma,\alpha}$ be the space of meromorphic vector functions with poles at $\gamma_s$ of the form (2.1) and with a simple pole at a point $P \in \Gamma$. By the definition of $\mathcal M_0$, the constraints (2.1) are linearly independent. Therefore, $\mathcal F^P_{\gamma,\alpha}$ has dimension $2r$, and the vectors of the singular part of $f \in \mathcal F^P_{\gamma,\alpha}$ at $P$ span the whole space $\mathbb C^r$. From (2.28) it follows that if $f \in \mathcal F^P_{\gamma,\alpha}$, then the differential $f^T\varphi$ has no poles at $\gamma_s$. As the sum of all the residues of a meromorphic differential equals zero, $f^T\varphi$ is regular at $P$. That implies $\varphi(P) = 0$. The point $P$ is arbitrary, therefore (2.26) is an injection.

The map (2.26) is linear on fibers over $(\gamma, \alpha)$. Therefore, in order to complete the proof of the lemma, it is enough to show that the dimension of $\mathcal L^K_{\gamma,\alpha}$ is greater than or equal to the dimension $d$ of the corresponding data $(\beta_s, \kappa_s)$. The vectors $\beta_s$ are orthogonal to $\alpha_s$. Therefore, $d$ equals $r^2 g$ minus the rank of the system of equations (2.24). Let us show that if $(\gamma, \alpha) \in \mathcal M_0$, then the vectors $\alpha_s$ span $\mathbb C^r$. Suppose that they span an $l$-dimensional subspace. Then by a gauge transformation we can reduce the problem to the case when the vectors $\alpha_s$ have $(r - l)$ vanishing coordinates, $\alpha_s^i = 0$, $i > l$. The Riemann–Roch theorem then implies that the dimension of the corresponding space $\mathcal F_{\gamma,\alpha}$ is not less than $l(rg - g + 1) - rg(l - 1) + (r - l) = (r - l)g + r$, which exceeds $r$ when $l < r$. If the rank of $\alpha_s^i$ is $r$, then Eqs. (2.24) are linearly independent by themselves, but one of them is already satisfied due to the orthogonality condition for $\beta_s$, which implies $\mathrm{Tr}\,(\beta_s \alpha_s^T) = 0$. Therefore the dimension of the fiber of data (2.26) over $(\gamma, \alpha) \in \mathcal M_0$ equals $r^2(g - 1) + 1$. On the other hand, for $L \in \mathcal L^K_{\gamma,\alpha}$, among the constraints (2.10) there are at most $(rg - 1)$ linearly independent ones, because a meromorphic differential cannot have a single simple pole. Hence, dimension counting as in (2.5) implies $\dim \mathcal L^K_{\gamma,\alpha} \ge r^2(g - 1) + 1$ and the lemma is proved.
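Two counts in the proof of Lemma 2.2 are easy to verify mechanically: the Riemann-Roch bound $l(rg-g+1) - rg(l-1) + (r-l) = (r-l)g + r$ for vectors $\alpha_s$ spanning an $l$-dimensional subspace, and the fiber dimension $r^2 g - (r^2 - 1) = r^2(g-1) + 1$. The following is a sketch of both, with hypothetical function names.

```python
def degenerate_section_bound(r: int, g: int, l: int) -> int:
    """Riemann-Roch lower bound for dim F_{gamma,alpha} when the alpha_s span only l dimensions."""
    constrained = l * (r * g - g + 1) - r * g * (l - 1)  # first l components
    unconstrained = r - l                                # components forced to be constants
    return constrained + unconstrained


def fiber_dimension(r: int, g: int) -> int:
    """Dimension of the data (beta_s, kappa_s) over a point of M_0."""
    data = r * r * g        # rg vectors beta_s orthogonal to alpha_s, plus rg scalars kappa_s
    rank_2_24 = r * r - 1   # r^2 equations (2.24), one implied by orthogonality
    return data - rank_2_24


for r in range(1, 5):
    for g in range(1, 5):
        assert fiber_dimension(r, g) == r * r * (g - 1) + 1
        for l in range(1, r + 1):
            bound = degenerate_section_bound(r, g, l)
            assert bound == (r - l) * g + r
            # Strictly more than r sections unless l == r, contradicting dim F = r on M_0.
            assert (bound > r) == (l < r)
```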
Example. Let $\Gamma$ be a hyperelliptic curve defined by the equation
$$y^2 = R(x) = x^{2g+1} + \sum_{i=0}^{2g} u_i x^i. \tag{2.29}$$
A set of points $\gamma_s$ on $\Gamma$ is a set of pairs $(y_s, x_s)$, such that
$$y_s^2 = R(x_s). \tag{2.30}$$
A meromorphic differential on $\Gamma$ with residues $(\beta_s \alpha_s^T)$ at $\gamma_s$ has the form
$$L\, \frac{dx}{2y} = \Big(\sum_{i=0}^{g-1} L_i x^i + \sum_{s=1}^{rg} (\beta_s \alpha_s^T)\, \frac{y + y_s}{x - x_s}\Big) \frac{dx}{2y}, \tag{2.31}$$
where $L_i$ is a set of arbitrary matrices. The constraints (2.11) are a system of linear equations defining $L_i$:
$$\sum_{i=0}^{g-1} \alpha_n^T L_i x_n^i + \sum_{s \neq n} (\alpha_n^T \beta_s)\, \alpha_s^T\, \frac{y_n + y_s}{x_n - x_s} = \kappa_n \alpha_n^T, \qquad n = 1, \dots, rg, \tag{2.32}$$
in terms of data (2.26). In a similar way the Lax matrices can be explicitly written for any algebraic curve using the Riemann theta-functions.

For $g > 1$, the correspondence (2.26) descends to a system of local coordinates on $\mathcal L^K / SL_r$ over an open set $\mathcal M_0'$ of $\mathcal M_0$, which we define as follows. As shown above, for $(\gamma, \alpha) \in \mathcal M_0$ the matrix $\alpha_s^i$ is of rank $r$. We call $(\gamma, \alpha)$ a non-special set of the Tyurin parameters if additionally they satisfy the constraint: there is a subset of $(r + 1)$ indices $s_1, \dots, s_{r+1}$ such that all minors of the $(r + 1) \times r$ matrix $\alpha^i_{s_j}$ are non-degenerate. The action of the gauge group on the space of non-special sets of the Tyurin parameters $\mathcal M_0'$ is free.

Let us define charts of coordinates on a smooth bundle of equivalence classes of Lax matrices over $\mathcal M_0'$. Consider the open set of $\mathcal M_0'$ such that the vectors $\alpha_j$, $j = 1, \dots, r$, are linearly independent and all the coefficients of an expansion of $\alpha_{r+1}$ in this basis do not vanish,
$$\alpha_{r+1} = \sum_{j=1}^{r} c_j \alpha_j, \qquad c_j \neq 0. \tag{2.33}$$
Then for each point of this open set there exists a unique matrix $W \in GL_r$, such that $\alpha_j^T W$ is proportional to the basis vector $e_j$ with the coordinates $e_j^i = \delta_j^i$, and $\alpha_{r+1}^T W$ is proportional to the vector $e_0 = \sum_j e_j$. Using the global gauge transformation defined by $W$,
$$B_s = W^{-1}\beta_s, \qquad A_s = W^T \alpha_s, \tag{2.34}$$
and the part of local transformations
$$A_s \to \lambda_s A_s; \qquad B_s \to \lambda_s^{-1} B_s, \tag{2.35}$$
for $s = 1, \dots, r + 1$, we obtain that on the open set of $\mathcal M_0'$ each equivalence class has a representation of the form $(A_s, B_s)$ such that
$$A_i = e_i, \quad i = 1, \dots, r; \qquad A_{r+1} = e_0. \tag{2.36}$$
This representation is unique up to local transformations (2.35) for $s = r + 2, \dots, rg$. In the gauge (2.36) Eq. (2.24) can be easily solved for $B_1, \dots, B_{r+1}$. Using (2.36), we get
$$B_j^i + B_{r+1}^i = -\sum_{s=r+2}^{rg} B_s^i A_s^j. \tag{2.37}$$
The orthogonality condition of $B_j$ to $A_j = e_j$ implies that $B_j^j = 0$. Hence,
$$B_{r+1}^i = -\sum_{s=r+2}^{rg} B_s^i A_s^i. \tag{2.38}$$
The sets of $(r(g - 1) - 1)$ pairs of orthogonal vectors $A_s$, $B_s$, $s = r + 2, \dots, rg$, modulo the transformations (2.35), and points $\{\gamma_s, \kappa_s\} \in S^{rg}(T^*(\Gamma))$ provide a parameterization of an open set of $T^*(\mathcal M)$. Here and below $\mathcal M = \mathcal M_0 / SL_r$. In the same way, taking various subsets of $(r + 1)$ indices we obtain charts of local coordinates which cover $T^*(\mathcal M)$. In Sect. 4 we provide a similar explicit parameterization of $\mathcal L^D$ for divisors $D$ such that $D_K = D - K$ is an effective divisor.

Our next goal is to construct a hierarchy of commuting flows on the total space $\mathcal L^D$ of a vector bundle over an open set of $\mathcal M_0$. Let us identify the tangent space $T_L(\mathcal L^D)$ to $\mathcal L^D$ at the point $L$ with the space of meromorphic matrix functions spanned by derivatives $\partial_\tau L|_{\tau=0}$ of all one-parametric deformations $L(q, \tau) \in \mathcal L^D$ of $L$.
D Lemma 2.3. The commutator [M, L] of matrix functions L ∈ LD γ ,α and M ∈ Nγ ,α is a D tangent vector to L at L if and only if its divisor of poles outside the points γs is not greater than D.
Proof. First of all, let us show that the tangent space $T_L(\mathcal L^D)$ can be identified with a space of matrix functions T on Γ with poles of order not greater than $m_i$ at $P_i$, and double poles at the points $\gamma_s$, where they have an expansion of the form
$$T = \dot z_s \frac{\beta_s \alpha_s^T}{(z - z_s)^2} + \frac{\dot\beta_s \alpha_s^T + \beta_s \dot\alpha_s^T}{z - z_s} + T_{s1} + O(z - z_s). \tag{2.39}$$
Here $\dot z_s$ is a constant, and $\dot\alpha_s$, $\dot\beta_s$ are vectors that satisfy the constraint
$$\alpha_s^T \dot\beta_s + \dot\alpha_s^T \beta_s = 0. \tag{2.40}$$
The vectors $\alpha_s$, $\beta_s$ are defined by L. In addition it is required that the following equation holds:
$$\alpha_s^T T_{s1} = \dot\alpha_s^T \kappa_s + \alpha_s^T \dot\kappa_s - \dot\alpha_s^T L_{s1} - \dot z_s \alpha_s^T L_{s2}, \tag{2.41}$$
where $L_{s1}$, $L_{s2}$ and $\kappa_s$ are defined by (2.9, 2.11), and $\dot\kappa_s$ is a constant.
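Equations (2.40) and (2.41) can be easily checked for a tangent vector $\partial_\tau L|_{\tau=0}$, if we identify $(\dot z_s, \dot\alpha_s, \dot\beta_s)$ with
$$\dot z_s = \partial_\tau z(\gamma_s(\tau))|_{\tau=0}, \qquad \dot\alpha_s = \partial_\tau \alpha_s(\tau)|_{\tau=0}, \qquad \dot\beta_s = \partial_\tau \beta_s(\tau)|_{\tau=0} \tag{2.42}$$
and $T_{s1}$ with
$$T_{s1} = (\partial_\tau L_{s1} - \dot z_s L_{s2})|_{\tau=0}. \tag{2.43}$$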
Direct counting of the number of constraints shows that the dimension of the space of matrix functions that have poles of order $m_i$ at $P_i$ and satisfy (2.39)-(2.41) equals $r^2(N + 1)$, which is the dimension of $\mathcal L^D$. Therefore, these relations are necessary and sufficient conditions for T to be a tangent vector. From (2.10, 2.11) it follows that, if we define $\dot z_s$ and $\dot\alpha_s$ with the help of formulae (2.7, 2.8), then the expansion of [M, L] at $\gamma_s$ satisfies the constraints (2.39)-(2.41). The lemma is thus proved.

The lemma directly implies that the Lax equation $L_t = [M, L]$ is a well-defined system on an open set of $\mathcal L^D$, whenever we can define M = M(L) as a function of L that outside of the points $\gamma_s$ commutes with L up to a meromorphic function with poles at the points $P_i$ of order not greater than $m_i$. Let us fix a point $P_0 \in \Gamma$ and local coordinates w in the neighborhoods of the punctures $P_0$, $P_i \in D$. Our next goal is to define gauge invariant functions $M_a(L)$ that satisfy the conditions of Lemma 2.3. They are parameterized by sets
$$a = (P_i, n, m), \qquad n > 0,\ m > -m_i \ \text{integers}. \tag{2.44}$$
As follows from (2.5), for generic $L \in \mathcal L^D_{\gamma,\alpha}$ there is a unique matrix function $M_a(q)$ such that:
(i) it has the form (2.3, 2.4) at the points $\gamma_s$;
(ii) outside of the divisor γ it has a pole at the point $P_i$ only, where the singular part of $M_a$ coincides with the singular part of $w^{-m}L^n$, i.e.
$$M_a^- = M_a(q) - w^{-m}L^n(q) = O(1) \ \text{is regular at } P_i; \tag{2.45}$$
(iii) $M_a$ is normalized by the condition $M_a(P_0) = 0$.

Theorem 2.1. The equations
$$\partial_a L = [M_a, L], \qquad \partial_a = \partial/\partial t_a \tag{2.46}$$
define a hierarchy of commuting flows on an open set of $\mathcal L^D$, which descends to the commuting hierarchy on an open set of $\mathcal L^D/SL_r$.

By definition, $M_a$ only depends on L, i.e. $M_a = M_a(L)$. Equation (2.45) implies that $[M_a, L]$ satisfies the conditions of Lemma 2.3. Therefore, the right-hand side of (2.46) is a tangent vector to $\mathcal L^D$ at the point L. Hence, (2.46) is a well-defined dynamical system on an open set of $\mathcal L^D$. The Laurent expansion of (2.46) at $\gamma_s$ shows that the projection $\pi_*(\partial_a) \in T(\mathcal M_0)$ of the vector $\partial_a \in T(\mathcal L^D)$ equals
$$\pi_*(\partial_a) = D(M_a). \tag{2.47}$$
Now let us prove the second statement of the theorem. Commutativity of flows (2.46) is equivalent to the equation
$$\partial_a M_b - \partial_b M_a - [M_a, M_b] = 0. \tag{2.48}$$
The left-hand side of (2.48) equals zero at $P_0$, and, as follows from (2.47), its expansion at $\gamma_s$ satisfies (2.39)-(2.41). Therefore, it equals zero identically, if it is regular at D. This easily follows from standard arguments used in KP theory. If indices a and b correspond to the same point $P_i$, i.e. $a = (P_i, n, m)$, $b = (P_i, n_1, m_1)$, then in the neighborhood of $P_i$ we have
$$\partial_a M_b = w^{-m_1}\partial_a L^{n_1} + \partial_a M_b^- = w^{-m_1}[M_a, L^{n_1}] + \partial_a M_b^- = w^{-m_1}[M_a^-, L^{n_1}] + \partial_a M_b^-, \tag{2.49}$$
and
$$[M_a, M_b] = [w^{-m}L^n + M_a^-,\ w^{-m_1}L^{n_1} + M_b^-] = w^{-m}[L^n, M_b^-] - w^{-m_1}[L^{n_1}, M_a^-] + O(1). \tag{2.50}$$
From (2.49, 2.50) it follows that the left-hand side of (2.48) is regular at $P_i$. From the definition of $M_a$, it is regular at all the other points of D as well. In a similar way we prove (2.48) for indices $a = (P_i, n, m)$, $b = (P_j, n', m')$ for $P_i \neq P_j$.

Let us now define an extended hierarchy of commuting flows on generic fibers of the evaluation map $\mathcal L^D_{\gamma,\alpha} \to L(P_0) = L_0$. Note that these fibers are invariant with respect to (2.46). Additional flows are parameterized by indices
$$a = (P_0, m; l), \qquad m > 0,\ l = 1, \dots, r. \tag{2.51}$$
Let $L_0$ be a matrix with distinct eigenvalues, and let us fix a representation of $L_0$ in the form $I_0 K_0 I_0^{-1}$, where $K_0$ is a diagonal matrix. Then for each $L \in \mathcal L^D_{\gamma,\alpha}$ such that $L(q) = L_0$, there exists a unique holomorphic matrix function I, $I(q) = I_0$, which diagonalizes L in the neighborhood of q, i.e. $L = IKI^{-1}$. For each index a of the form (2.51) we define $M_a$ as the unique matrix $M_a \in \mathcal N^{nP_0}_{\gamma,\alpha}$ that in the neighborhood of $P_0$ has the form
$$M_a = w^{-m} I(w) E_l I^{-1}(w) + O(w), \tag{2.52}$$
where $E_l$ is the diagonal matrix $E_l^{ij} = \delta^{il}\delta^{jl}$.

Theorem 2.2. The equations
$$\partial_a L = [M_a, L], \qquad a = (P_0, m; l), \tag{2.53}$$
define commuting flows on the fiber of the evaluation map $\mathcal L^D \to L_0$. The flows (2.53) commute with flows (2.46).

The proof is almost identical to that of the previous theorem.
3. The Baker-Akhiezer Functions

In this section we show that the standard procedure in the algebro-geometric theory of soliton equations for solving conventional Lax equations using the concept of the Baker–Akhiezer functions ([13, 14]) is equally applicable to the case of Lax equations on algebraic curves. Let $L \in \mathcal L^D$ be a Lax matrix. The characteristic equation
$$R(k, q) \equiv \det(k - L(q)) = k^r + \sum_{j=1}^{r} r_j(q) k^{r-j} = 0 \tag{3.1}$$
defines a time-independent algebraic curve $\hat\Gamma$, which is an r-fold branch cover of Γ. The following statement is a direct corollary of Lemma 2.1.

Lemma 3.1. The coefficients $r_j(q)$ of the characteristic equation (3.1) are holomorphic functions on Γ except at the points $P_i$ of the divisor D, where they have poles of order $j m_i$, respectively.

For a non-special divisor D the dimension of the space $S^D$ of sets of meromorphic functions $\{r_j(Q),\ j = 1, \dots, r\}$ with the divisor of poles jD equals
$$\dim S^D = \frac{N r(r + 1)}{2} - r(g - 1). \tag{3.2}$$
Note that dimension counting in the case of the special divisor K gives
$$\dim S^K = r^2(g - 1) + 1. \tag{3.3}$$
Equation (3.1) defines a map $\mathcal L^D \to S^D$. The coefficients of an expansion of $r_j$ in some basis of $S^D$ can be seen as functions on $\mathcal L^D$. The Lax equation implies that these functions are integrals of motion. Usual arguments show that they are independent. These arguments are based on the solution of the inverse spectral problem, which reconstructs L, modulo gauge equivalence, from a generic set of spectral data: a smooth curve $\hat\Gamma$ defined by $\{r_j\} \in S^D$, and a point of the Jacobian $J(\hat\Gamma)$, i.e. an equivalence class $[\hat\gamma]$ of a degree $\hat g + r - 1$ divisor $\hat\gamma$ on $\hat\Gamma$. Here $\hat g$ is the genus of $\hat\Gamma$.

For a generic point of $S^D$ the corresponding spectral curve is smooth. Its genus $\hat g$ can be found with the help of the Riemann–Hurwitz formula $2\hat g - 2 = 2r(g - 1) + \deg \nu$, where ν is the divisor on Γ which is the projection of the branch points of $\hat\Gamma$ over Γ. The branch points are zeros on $\hat\Gamma$ of the function $\partial_k R(k, Q)$. This function has poles on all the sheets of $\hat\Gamma$ over $P_i$ of order $(r - 1)m_i$. Because the numbers of poles and zeros of a meromorphic function are equal, $\deg \nu = N r(r - 1)$, and we obtain that
$$\hat g = \frac{N r(r - 1)}{2} + r(g - 1) + 1. \tag{3.4}$$
Moreover, a product of $\partial_k R$ on all the sheets of $\hat\Gamma$ is a well-defined meromorphic function on Γ. Its divisor of zeros coincides with ν and the divisor of poles is r(r − 1)D. Therefore, these divisors are equivalent, i.e. in the Jacobian J(Γ) of Γ we have the equality
$$[\nu] = r(r - 1)[D] \in J(\Gamma). \tag{3.5}$$
For a generic point Q = (q, k) of $\hat\Gamma$ there is a unique eigenvector ψ = ψ(Q) of L,
$$L(q)\psi(Q) = k\psi(Q), \tag{3.6}$$
normalized by the condition that the sum of its components $\psi_i$ equals 1,
$$\sum_{i=1}^{r} \psi_i = 1. \tag{3.7}$$
The coordinates of ψ are rational expressions in k and the entries of L. Therefore, they define ψ(Q) as a meromorphic vector-function on $\hat\Gamma$. The degree of the divisor $\hat\gamma$ of its poles can be found in the usual way. Let I(q), q ∈ Γ, be a matrix with columns $\psi(Q_i)$, where $Q_i = (q, k_i(q))$ are preimages of q on $\hat\Gamma$,
$$I(q) = \{\psi(Q_1), \dots, \psi(Q_r)\}. \tag{3.8}$$
This matrix depends on an ordering of the roots $k_i(q)$ of (3.1), but the function $F(q) = \det{}^2 I(q)$ is independent of this. Therefore, F is a meromorphic function on Γ. Its divisor of poles equals $2\pi_*(\hat\gamma)$, where $\pi: \hat\Gamma \to \Gamma$ is the projection. In general position, when the branch points of $\hat\Gamma$ over Γ are simple, the function F has simple zeros at the images of the branch points, and double zeros at the points $\gamma_s$, because evaluations of ψ at preimages of $\gamma_s$ span the subspace orthogonal to $\alpha_s$. Therefore, the zero divisor of F is ν + 2γ, where $\gamma = \gamma_1 + \cdots + \gamma_{rg}$, and we obtain the equality for equivalence classes of the divisors
$$2[\pi_*(\hat\gamma)] = [\nu] + 2[\gamma] = 2[\gamma] + r(r - 1)[D], \tag{3.9}$$
which implies
$$\deg \hat\gamma = \deg \nu/2 + rg = \hat g + r - 1. \tag{3.10}$$
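The counting formulas above fit together, which can be checked by plain integer arithmetic. The sketch below is only a bookkeeping aid with hypothetical function names: it encodes (3.2), (3.4) and the degree of ν, and compares them with the Riemann–Hurwitz formula, with (3.10), and with the value $r^2(N + 1)$ quoted in the proof of Lemma 2.3 for the dimension of the space of Lax matrices, reduced by $\dim SL_r = r^2 - 1$.

```python
def dim_S(N, r, g):
    """(3.2): dimension of S^D for a non-special divisor D of degree N."""
    return N * r * (r + 1) // 2 - r * (g - 1)

def branch_degree(N, r):
    """Degree of the branch divisor: deg nu = N r (r - 1)."""
    return N * r * (r - 1)

def spectral_genus(N, r, g):
    """(3.4): genus of the spectral curve."""
    return N * r * (r - 1) // 2 + r * (g - 1) + 1

def pole_divisor_degree(N, r, g):
    """(3.10), first form: deg nu / 2 + r g."""
    return branch_degree(N, r) // 2 + r * g

checks = []
for N in range(1, 6):
    for r in range(1, 5):
        for g in range(1, 4):
            g_hat = spectral_genus(N, r, g)
            checks.append((
                # Riemann-Hurwitz: 2 g_hat - 2 = 2 r (g - 1) + deg nu
                2 * g_hat - 2 == 2 * r * (g - 1) + branch_degree(N, r),
                # (3.10): deg of the pole divisor equals g_hat + r - 1
                pole_divisor_degree(N, r, g) == g_hat + r - 1,
                # curve parameters plus Jacobian match r^2 (N + 1) - (r^2 - 1)
                dim_S(N, r, g) + g_hat == r * r * (N + 1) - (r * r - 1),
            ))
```

The last comparison expresses that the number of parameters of the spectral curve plus the dimension of its Jacobian equals $r^2 N + 1$, the expected dimension of the quotient by $SL_r$.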
Let I0 be the matrix defined by (3.8) for q = P0 . Normalization (3.7) implies that I0 leaves the co-vector e0 = (1, . . . , 1) invariant, i.e. e0 I0 = e0 .
(3.11)
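The role of the normalization (3.7) can be seen on a small example. The following sketch uses exact rational arithmetic; the three sample vectors are hypothetical and merely stand for the eigenvectors at the preimages of $P_0$. It checks that the matrix built from normalized columns as in (3.8) satisfies (3.11), and that the square of its determinant does not change when the columns are reordered.

```python
from fractions import Fraction
from itertools import permutations

def det(M):
    """Determinant by the Leibniz formula (adequate for small matrices)."""
    n = len(M)
    total = Fraction(0)
    for p in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if p[i] > p[j])
        term = Fraction(-1 if inversions % 2 else 1)
        for i in range(n):
            term *= M[i][p[i]]
        total += term
    return total

def normalize(v):
    """Rescale v so that its components sum to 1, as in (3.7)."""
    s = sum(v)
    return [Fraction(x) / s for x in v]

def from_columns(cols):
    """Matrix whose columns are the given vectors, as in (3.8)."""
    return [[c[i] for c in cols] for i in range(len(cols[0]))]

# Hypothetical eigenvectors; any basis with nonzero component sums would do.
raw = [[1, 2, 3], [2, -1, 1], [0, 1, 4]]
psi = [normalize(v) for v in raw]
I0 = from_columns(psi)

# e_0 I_0 is the row vector of column sums of I_0.
e0_I0 = [sum(I0[i][l] for i in range(3)) for l in range(3)]

# det I_0 changes sign under a transposition of columns, its square does not.
d = det(I0)
d_swapped = det(from_columns([psi[1], psi[0], psi[2]]))
F, F_swapped = d ** 2, d_swapped ** 2
```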
The spectral curve and the pole divisor $\hat\gamma$ are invariant under the gauge transformation $L \to I_0^{-1} L I_0$, $\psi \to I_0^{-1}\psi$, but the matrix $I_0$ gets transformed to the identity $I_0 = I$. Let $F = \mathrm{diag}(f_1, \dots, f_r)$ be a diagonal matrix, then the gauge transformation
$$L \to F L F^{-1}, \qquad \psi(Q) \to f^{-1}(Q) F \psi(Q), \qquad \text{where } f(Q) = \sum_{i=1}^{r} f_i \psi_i(Q), \tag{3.12}$$
which preserves the normalization (3.7) and the equality $I_0 = I$, changes $\hat\gamma$ to an equivalent divisor $\hat\gamma'$ of zeros of the meromorphic function f(Q). The gauge transformation of L by a permutation matrix corresponds to a permutation of preimages $P_0^i \in \hat\Gamma$ of $P_0 \in \Gamma$, which was used to define $I_0$. A matrix g with different eigenvalues has a representation of the form $g = I_0 F$, where $I_0$ satisfies (3.11) and F is a diagonal matrix. That representation is unique up to conjugation by a permutation matrix. Therefore, the correspondence described above $L \to \{\hat\Gamma, \hat\gamma, I_0\}$ descends to a map
$$\mathcal L^D/SL_r \longrightarrow \{\hat\Gamma, [\hat\gamma]\}, \tag{3.13}$$
which is well-defined on an open set of $\mathcal L^D/SL_r$.
According to the Riemann–Roch theorem for each smooth genus $\hat g$ algebraic curve $\hat\Gamma$ with fixed points $q^1, \dots, q^r$, and for each non-special degree $\hat g + r - 1$ effective divisor $\hat\gamma$ there is a unique meromorphic function $\psi_i(Q)$, $Q \in \hat\Gamma$, with divisor of poles in $\hat\gamma$, which is normalized by the conditions $\psi_i(q^j) = \delta_i^j$. Let ψ(Q) be a meromorphic vector-function with the coordinates $\psi_i(Q)$. Note that it satisfies (3.7).

Let $\hat\Gamma$ be a curve defined by Eq. (3.1), where $r_j$ is a generic set of meromorphic functions on Γ with divisor of poles in jD. Then for each point q ∈ Γ we define a matrix I(q) with the help of (3.8). It depends on a choice of order of the roots $k_i(q)$ of Eq. (3.1), but the matrix function
$$L(q) = I(q) K(q) I^{-1}(q), \qquad K(q) = \mathrm{diag}(k_1(q), \dots, k_r(q)), \tag{3.14}$$
is independent of the choice, and therefore is a meromorphic matrix function on Γ. It has poles of degree $m_i$ at $P_i \in D$ and is holomorphic at the points of the branch divisor ν. By reversing the arguments used for the proof of (3.10), we get that the degree of the zero divisor γ of det I equals rg. In general position the zeros $\gamma_s$ are simple. From Lemma 2.1 it follows that an expansion of L at $\gamma_s$ satisfies constraints (2.10, 2.11), where $\alpha_s$ is the vector, unique up to multiplication, orthogonal to the vector-columns of $I(\gamma_s)$. Hence, L is a Lax matrix-function. If the points $P_0^i$ used for normalization of $\psi_j$ are preimages of $P_0 \in \Gamma$, then L, given by (3.14), is diagonal at $q = P_0$, and the correspondence $\{\hat\Gamma, \hat\gamma\} \to L$ descends to a map
$$\{\hat\Gamma, [\hat\gamma]\} \longrightarrow \mathcal L^D/SL_r, \tag{3.15}$$
which is well-defined on an open set of the Jacobian bundle over $S^D$, where it is inverse to (3.13).

Now, let L = L(q, t) be a solution of the Lax equations (2.46, 2.53). Then the spectral curve $\hat\Gamma$ of L(q, t) is time-independent and can be regarded as a generating form of the integrals of the Lax equations. The divisor $\hat\gamma$ of poles of the eigenvector ψ, defined by (3.6, 3.7), does depend on $t_a$. It is now standard procedure to show that $[\hat\gamma]$ evolves linearly on $J(\hat\Gamma)$. From the Lax equation $\partial_a L = [M_a, L]$ it follows that, if ψ is an eigenvector of L, then $(\partial_a - M_a)\psi$ is also an eigenvector. Therefore,
$$(\partial_a - M_a)\psi(Q, t) = f_a(Q, t)\psi(Q, t), \tag{3.16}$$
where $f_a(Q, t)$ is a scalar meromorphic function on $\hat\Gamma$. The vector-function
$$\hat\psi(Q, t) = \varphi(Q, t)\psi(Q, t), \qquad \varphi(Q, t) = \exp\left(-\int_0^{t_a} f_a(Q, \tau)\,d\tau\right) \tag{3.17}$$
satisfies the equations
$$L(q, t)\hat\psi(Q, t) = k\hat\psi(Q, t), \qquad (\partial_a - M_a(q, t))\hat\psi(Q, t) = 0. \tag{3.18}$$
It turns out that the pole divisor $\hat\gamma(t)$ of ψ under the gauge transform (3.17) gets transformed to a time-independent divisor $\hat\gamma = \hat\gamma(0)$ of poles of $\hat\psi$. All the time dependence of $\hat\psi(Q, t)$ is encoded in the form of its essential singularities, which it acquires at the constant poles of $f_a$.

Let L(q, t) be a solution of the hierarchy of Eqs. (2.46), (2.53). Here and below we assume that only a finite number of "times" $t_a$ are not equal to zero. For brevity we
denote the variables $t_a$ corresponding to indices (2.44) and (2.51) by $t_{(i,n,m)}$ and $t_{(0,m;l)}$, respectively. Commutativity of the hierarchy implies that there is a unique common gauge transform $\hat\psi(Q, t) = \varphi(Q, t)\psi(Q, t)$ such that $\hat\psi$ solves all the auxiliary linear equations (3.18).

Lemma 3.2. Let $\hat\psi(Q, t)$, $\hat\psi(Q, 0) = \psi(Q, 0)$, be the common solution of equations (3.18). Then
$1^0$. $\hat\psi$ is a meromorphic function on $\hat\Gamma$ except at the points $P_i^l$ and $P_0^l$, which are preimages on $\hat\Gamma$ of the points $P_i \in D$ and $P_0$ on Γ, respectively. Its divisor of poles on $\hat\Gamma$ outside of $P_i^l$, $P_0^l$ is not greater than $\hat\gamma$;
$2^0$. In the neighborhood of $P_i^l$ the function $\hat\psi$ has the form
$$\hat\psi = \xi_{i,l}(w, t) \exp\left(\sum_{n,m} t_{(i,n,m)} w^{-m} k^n\right), \tag{3.19}$$
where $\xi_{i,l}(w, t)$ is a holomorphic vector-function, and $k = k_l(q)$ is the corresponding root of Eq. (3.1);
$3^0$. In the neighborhood of $P_0^l$ the function $\hat\psi$ has the form
$$\hat\psi = \chi_l(w, t) \exp\left(\sum_{m} t_{(0,m;l)} w^{-m}\right), \tag{3.20}$$
where $\chi_l$ is a holomorphic vector-function such that evaluation of its coordinates at $P_0^l$ equals $\chi_l^i(P_0^l) = \delta^{il}$.

The function $\hat\psi(Q, t)$ is a particular case of the conventional Baker–Akhiezer functions. As shown in [14], for any generic divisor $\hat\gamma$ of degree $\hat g + r - 1$ there is a unique vector function $\hat\psi(Q, t)$ which satisfies all the properties $1^0$–$3^0$. It can be written explicitly in terms of the Riemann theta-function of the curve $\hat\Gamma$.

Theorem 3.1. Let $\hat\psi(Q, t)$ be the Baker–Akhiezer vector function associated with a non-special divisor $\hat\gamma$ on $\hat\Gamma$. Then there exist unique matrix functions L(q, t), $M_a(q, t)$ such that Eqs. (3.18) hold.

As a corollary we get that the Lax operator $L(q, t) \in \mathcal L^D$ constructed with the help of $\hat\psi$ solves the whole hierarchy of the Lax equations (2.46, 2.53).

4. Hamiltonian Approach

As we have seen, the spectral transform which identifies the space of gauge equivalent Lax matrices with a total space of a Jacobian bundle over the moduli space of the spectral curves does not involve a Hamiltonian description of the Lax equations. Moreover, a priori it is not clear why all the systems constructed above are Hamiltonian. In this section we show that the general algebraic approach to the Hamiltonian theory of the Lax equations proposed in [9, 10] and developed in [11] is equally applicable to the Lax equations on the Riemann surfaces.

The entries of $L(q) \in \mathcal L^D$ can be regarded as functions on $\mathcal L^D$. Therefore, L by itself can be seen as a matrix-valued function and its external derivative δL as a matrix-valued one-form on $\mathcal L^D$. The matrix I (3.8) with columns formed by the canonically normalized eigenvectors $\psi(Q_i)$ of L can also be regarded as a matrix function on $\mathcal L^D$
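defined modulo permutation of the columns. Hence, its differential δI is a matrix-valued one-form on $\mathcal L^D$. In the same way we consider the differential δK of the diagonal matrix K (3.14). Let us define a two-form Ω(q) on $\mathcal L^D$ with values in a space of meromorphic functions on Γ by the formula
$$\Omega(q) = \mathrm{Tr}\left(I^{-1}\delta L \wedge \delta I - I^{-1}\delta I \wedge \delta K\right). \tag{4.1}$$
This form does not depend on an order of the eigenvalues of L, and therefore is well defined on $\mathcal L^D$. Fix a holomorphic differential dz on Γ. Then the formula
$$\omega = -\frac{1}{2}\left(\sum_{s=1}^{rg} \mathrm{res}_{\gamma_s}\,\Omega\,dz + \sum_{P_i \in D} \mathrm{res}_{P_i}\,\Omega\,dz\right) \tag{4.2}$$
defines a scalar-valued two-form on $\mathcal L^D$. The equation
$$\delta L = I\,\delta K\,I^{-1} + \delta I\,K I^{-1} + I K\,\delta I^{-1} \tag{4.3}$$
implies
$$\Omega = 2\,\delta\,\mathrm{Tr}\left(K I^{-1}\delta I\right) = 2\,\delta\,\mathrm{Tr}\left(I^{-1} L\,\delta I\right). \tag{4.4}$$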
We would like to emphasize that though the last formula looks simpler than (4.1) and directly shows that ω is a closed two-form, the original definition is more universal. As shown in [9, 10], it provides symplectic structure for general soliton equations.

Lemma 4.1. The two-form ω defined by (4.2) is invariant under gauge transformations defined by matrices g that preserve the co-vector $e_0 = (1, \dots, 1)$, $e_0 g = e_0$.

Proof. If g preserves $e_0$, then the gauge transformation
$$L' = g^{-1} L g, \qquad I' = g^{-1} I \tag{4.5}$$
preserves normalization (3.7) of the eigenvectors. If $h = (\delta g) g^{-1}$, then from (4.4) it follows that under (4.5) Ω gets transformed to Ω' = Ω + F, where
$$F = -2\,\delta\left(\mathrm{Tr}(L h)\right) = -2\,\mathrm{Tr}\left(\delta L \wedge h + L\,h \wedge h\right). \tag{4.6}$$
The additional term F is a meromorphic function on Γ with poles at the points $\gamma_s$ and $P_i$. Therefore, the sum of residues at these points of the differential F dz equals zero and the lemma is proved.

It is necessary to emphasize that in the generic case the form ω is not gauge invariant with respect to the whole group $SL_r$, because it does depend on a choice of the normalization of the eigenvectors. A change of normalization corresponds to the transformation I' = IV, L' = L, where V = V(Q) is a diagonal matrix, which might depend on Q. The corresponding transformation of Ω has the form:
$$\Omega' = \Omega + 2\,\delta\left(\mathrm{Tr}(K v)\right) = \Omega + 2\,\mathrm{Tr}\left(\delta K \wedge v\right), \qquad v = \delta V\,V^{-1}. \tag{4.7}$$
Here we use the equation δv = v ∧ v = 0, which is valid because v is diagonal.
Let $\mathcal P_0^D \subset \mathcal L^D$ be a subspace of the Lax matrices such that restriction of δk dz to $\mathcal P_0^D$ is a holomorphic differential. This subspace is a leaf of foliation on $\mathcal L^D$ defined by the common level sets of the functions defined on $\mathcal L^D$ by the formulae
$$T_{i,j,l} = \mathrm{res}_{P_i^l}\,(z - z(P_i))^j\,k\,dz, \qquad j = 0, \dots, (m_i - d_i), \tag{4.8}$$
where $d_i$ is the order of zero of dz at $P_i$ (compare with the definition of the universal configuration space in [9]). Note that although the functions (4.8) are multivalued, their common level sets are leaves of a well-defined foliation on $\mathcal L^D$.

Lemma 4.2. The two-form ω defined by (4.2) restricted to $\mathcal P_0^D \subset \mathcal L^D$ is gauge invariant, i.e. it descends to a form on $\mathcal P^D = \mathcal P_0^D/SL_r$.

Let $L \in \mathcal L^K$ be a Lax matrix corresponding to the zero divisor K of a holomorphic differential dz, then L dz has poles at the points $\gamma_s$ only. Therefore, $\mathcal P_0^K = \mathcal L^K$.

Lemma 4.3. The two-form ω on $\mathcal L^K$ defined by the formula (4.2) descends to a form on $\mathcal L^K/SL_r$, which under the isomorphism (2.21) coincides with the canonical symplectic structure on the cotangent bundle $T^*(\mathcal M)$.

Proof. The first statement is a direct corollary of the previous lemma. The second one follows from the equality
$$\mathrm{res}_{\gamma_s}\,\Omega\,dz = -2\left(\delta\kappa_s \wedge \delta z_s + \sum_{i=1}^{r} \delta\beta_s^i \wedge \delta\alpha_s^i\right), \tag{4.9}$$
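which can be proved as follows. Let $L_s$ be the matrix defined by the gauge transformation (2.15), and let $\Omega_s$ be the function defined by (4.1) for $L = L_s$. Then as shown above,
$$\mathrm{res}_{\gamma_s}\,\Omega_s\,dz = \mathrm{res}_{\gamma_s}\,\Omega\,dz + 2\,\mathrm{res}_{\gamma_s}\,\mathrm{Tr}\left(\delta g_s\,g_s^{-1} \wedge \delta L - L\,\delta g_s\,g_s^{-1} \wedge \delta g_s\,g_s^{-1}\right). \tag{4.10}$$
From (2.9), (2.10) it follows that the second term in (4.10) equals
$$II = -2\,\mathrm{Tr}\left((\beta_s\,\delta\alpha_s^T + \delta\beta_s\,\alpha_s^T) \wedge \delta g_s\,g_s^{-1} + \beta_s\alpha_s^T\,\delta g_s\,g_s^{-1} \wedge \delta g_s\,g_s^{-1}\right). \tag{4.11}$$
Using the equality $\delta\alpha_s^T g_s + \alpha_s^T \delta g_s = 0$, which follows from (2.14), we get
$$II = 2\,\mathrm{Tr}\left(\delta\beta_s \wedge \delta\alpha_s^T\right) = -2\,\delta\alpha_s^T \wedge \delta\beta_s. \tag{4.12}$$
The matrix $L_s$ under the gauge transformation $\tilde L = f_s^{-1} L_s f_s$, where $f_s = f_s(z)$ is the diagonal matrix (2.18), gets transformed to a holomorphic matrix. Therefore,
$$0 = \mathrm{res}_{\gamma_s}\,\Omega_s\,dz + 2\,\mathrm{res}_{\gamma_s}\,\mathrm{Tr}\left(\delta f_s\,f_s^{-1} \wedge \delta L_s - L_s\,\delta f_s\,f_s^{-1} \wedge \delta f_s\,f_s^{-1}\right). \tag{4.13}$$
The last term in (4.13) equals zero because $f_s$ is diagonal. From (2.14)–(2.18) it follows that
$$\mathrm{res}_{\gamma_s}\,\Omega_s\,dz = -2\,\mathrm{res}_{\gamma_s}\,\mathrm{Tr}\left(\delta f_s\,f_s^{-1} \wedge \delta L_s\right) = 2\,\delta z_s \wedge \delta\kappa_s. \tag{4.14}$$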
Equations (4.10)–(4.14) imply (4.9). In the coordinates $A_s$ and $B_s$ (2.34)–(2.38) on an open set of $T^*(\mathcal M)$ the form ω due to (2.36) equals
$$\omega_0 = \sum_{s=1}^{rg} \delta\kappa_s \wedge \delta z_s + \sum_{s=r+1}^{rg} \delta B_s^T \wedge \delta A_s, \qquad g > 1, \tag{4.15}$$
and the Lemma is proved.

Let us now consider the contribution to ω from poles of L dz at the points $P_m$ of the divisor $D_K = D - K$. The residue of the last term in (4.1) restricted to $\mathcal P_0^D$ vanishes. Therefore,
$$\omega_m = -\frac{1}{2}\,\mathrm{res}_{P_m}\,\Omega\,dz = \mathrm{res}_{P_m}\,\mathrm{Tr}\left(L\,\delta I\,I^{-1} \wedge \delta I\,I^{-1}\right) dz. \tag{4.16}$$
If L dz has a simple pole at $P_m$, then its residue $L_m$ is a point of the orbit $\mathcal O_m$ of the adjoint action of $GL_r$, corresponding to the fixed singular part of k dz, which defines the leaf $\mathcal P_0^D$. Let ξ be a matrix, which we regard as a point of the Lie algebra $\xi \in sl_r$. The formula
$$\partial_\xi L_m = [L_m, \xi] \tag{4.17}$$
defines a tangent vector $\partial_\xi \in T_{L_m}(\mathcal O_m)$ to the orbit at $L_m$. The correspondence $\xi \to \partial_\xi$ is an isomorphism between $sl_r/sl_r(L_m)$ and $T_{L_m}(\mathcal O_m)$. Here $sl_r(L_m)$ is the subalgebra of matrices that commute with $L_m$. Evaluation of the form $\delta I\,I^{-1}$ at $\partial_\xi$ is equal to ξ. Hence, (4.16) restricted to $\mathcal P_0^D$ coincides with the canonical symplectic structure on the orbit $\mathcal O_m$. Its evaluation on a pair of vectors ξ, η is equal to
$$\omega_m(\xi, \eta) = \mathrm{Tr}\left(L_m[\xi, \eta]\right). \tag{4.18}$$
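Two elementary properties of the orbit form (4.18) can be checked directly. The sketch below uses small integer matrices chosen only for illustration; it is not taken from the paper. It evaluates $\mathrm{Tr}(L_m[\xi, \eta])$, checks that the expression is skew-symmetric, that it equals $\mathrm{Tr}([L_m, \xi]\eta)$, and that it vanishes when ξ commutes with $L_m$, which is why the form is defined on the quotient by the subalgebra of matrices commuting with $L_m$.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(X, Y):
    """[X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(XY, YX)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def omega(L, xi, eta):
    """(4.18): omega_m(xi, eta) = Tr(L_m [xi, eta])."""
    return trace(matmul(L, commutator(xi, eta)))

# Illustrative data: a sample residue and two sample directions.
L_m = [[1, 2, 0], [0, -3, 1], [4, 0, 2]]
xi = [[0, 1, 0], [2, 1, -1], [0, 3, -1]]
eta = [[1, 0, 2], [0, -2, 5], [1, 1, 1]]
stab = matmul(L_m, L_m)          # a matrix commuting with L_m

value = omega(L_m, xi, eta)
antisym = value + omega(L_m, eta, xi)                 # skew-symmetry
alt = trace(matmul(commutator(L_m, xi), eta))         # Tr([L_m, xi] eta)
kernel = omega(L_m, stab, eta)                        # stabilizer direction
```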
If L dz has a multiple pole at $P_m$, then we define $\tilde L_m$ as the equivalence class of the singular part of L dz. By definition two matrix differentials L and L' meromorphic in the neighborhood of $P_m$ are equivalent if L − L' is a holomorphic differential. Let $G_-$ be a group of the invertible holomorphic matrix functions in the neighborhood of $P_m$. The transformation $L \to g L g^{-1}$, $g \in G_-$, defines a representation of $G_-$ on the finite-dimensional space of singular parts of meromorphic differentials. Let $\tilde{\mathcal O}_m$ be an orbit of this representation.

If $H_-$ is the Lie algebra of $G_-$, then the equivalence class of the right-hand side of (4.17) for $\xi \in H_-$ depends only on the equivalence class of $\tilde L_m$. Therefore, (4.17) defines an isomorphism between the tangent space to $\tilde{\mathcal O}_m$ at $\tilde L_m$ and $H_-/H_-(\tilde L_m)$, where $H_-(\tilde L_m)$ is the subalgebra of holomorphic matrix functions ξ such that $[L_m, \xi]$ is holomorphic at $P_m$. The formula
$$\omega_m = \mathrm{res}_{P_m}\,\mathrm{Tr}\left(\tilde L_m[\xi, \eta]\right) \tag{4.19}$$
defines a symplectic structure on $\tilde{\mathcal O}_m$.

Lemma 4.4. If $D_K = D - K > 0$ is an effective divisor, then the map
$$L \longrightarrow \{z_s, \kappa_s, \alpha_s, \beta_s, \tilde L_m\} \tag{4.20}$$
is a bijective correspondence between points of the bundle $\mathcal L^D$ over $\mathcal M_0$ and sets of the data (4.20) subject to the constraints $(\alpha_s^T \beta_s) = 0$, and
$$\sum_{s=1}^{rg} \beta_s \alpha_s^T + \sum_{P_m \in D} \mathrm{res}_{P_m}\,\tilde L_m = 0, \tag{4.21}$$
modulo gauge transformations (2.27).

If we fix a gauge on an open set of $\mathcal L^D$ by (2.36), then the reconstruction formulae for $B_1, \dots, B_{r+1}$ become
$$B_{r+1}^i = -\sum_{s=r+2}^{rg} B_s^i A_s^i - \sum_{m} \mathrm{res}_{P_m}\,\tilde L_m^{ii}, \tag{4.22}$$
and
$$B_j^i = -B_{r+1}^i - \sum_{s=r+2}^{rg} B_s^i A_s^j - \sum_{m} \mathrm{res}_{P_m}\,\tilde L_m^{ij}. \tag{4.23}$$
If g > 1, then for $D_K > 0$ the data $\{z_s, \kappa_s, A_s, B_s, \tilde L_m \in \tilde{\mathcal O}_m\}$ provide explicit coordinates on an open set of $\mathcal P^D$.

Theorem 4.1. Let D be a divisor such that $D_K \geq 0$, where K is the zero divisor of a holomorphic differential dz. Then the form ω defined by (4.2), restricted to $\mathcal P_0^D$, descends to a non-degenerate closed two-form on $\mathcal P^D$:
$$\omega = \omega_0 + \sum_{P_m \in D_K} \omega_m, \tag{4.24}$$
where $\omega_0$ and $\omega_m$ are given by (4.15) and (4.19), respectively.

The representation of the form ω in terms of the Lax operator and its eigenvectors provides a straightforward and universal way to show that the Lax equations are Hamiltonian, and to construct the action-angle variables. By definition a vector field $\partial_t$ on a symplectic manifold is Hamiltonian if the contraction $i_{\partial_t}\omega(X) = \omega(\partial_t, X)$ of the symplectic form is an exact one-form dH(X). The function H is the Hamiltonian corresponding to the vector field $\partial_t$.

Theorem 4.2. Let $\partial_a$ be the vector fields corresponding to the Lax equations (2.46, 2.53). Then the contraction of ω defined by (4.2) restricted to $\mathcal P^D$ equals
$$i_{\partial_a}\omega = \delta H_a, \tag{4.25}$$
where
$$H_a = -\frac{1}{n + 1}\,\mathrm{res}_{P_i}\,\mathrm{Tr}\left(w^{-m} L^{n+1}\right) dz, \qquad a = (P_i, n, m), \tag{4.26}$$
$$H_a = -\mathrm{res}_{P_0}\,w^{-m} k_l\,dz, \qquad a = (P_0, m; l). \tag{4.27}$$
Here $k_l = k_l(q)$ is the l-th eigenvalue of L in the neighborhood of the puncture $P_0$.
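The Hamiltonians (4.26) are built from traces of powers of L. That such traces are constant along any flow of the Lax form $\partial_a L = [M_a, L]$ follows from the cyclicity of the trace, since $\partial_a \mathrm{Tr}\,L^n = n\,\mathrm{Tr}(L^{n-1}[M_a, L]) = 0$. The sketch below checks this identity for constant integer matrices; it ignores the dependence on the point q of the curve, and the sample matrices and helper names are illustrative only.

```python
def matmul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def commutator(X, Y):
    """[X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(XY, YX)]

def trace(X):
    return sum(X[i][i] for i in range(len(X)))

def power(X, n):
    """X^n for n >= 0 by repeated multiplication."""
    size = len(X)
    P = [[int(i == j) for j in range(size)] for i in range(size)]
    for _ in range(n):
        P = matmul(P, X)
    return P

def trace_power_derivative(L, M, n):
    """Derivative of Tr L^n along L_t = [M, L]: n Tr(L^(n-1) [M, L])."""
    return n * trace(matmul(power(L, n - 1), commutator(M, L)))

# Illustrative constant matrices standing in for L and M_a.
L = [[2, 1, 0], [1, -1, 3], [0, 4, 1]]
M = [[0, 5, -2], [1, 1, 0], [3, -1, 2]]
derivatives = [trace_power_derivative(L, M, n) for n in range(1, 6)]
```

This only shows that the functions $\mathrm{Tr}\,L^{n+1}$ are integrals of every Lax flow; the statement of the theorem, that they generate the flows with respect to ω, is the content of the proof.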
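Proof. The Lax equation $\partial_a L = [M_a, L]$, $\partial_a k = 0$, and Eq. (3.16),
$$\partial_a I = M_a I + I F_a, \tag{4.28}$$
where I is the matrix of eigenvectors (3.8), and $F_a = \mathrm{diag}(f_a(Q_1), \dots, f_a(Q_r))$, imply
$$i_{\partial_a}\omega = -\frac{1}{2}\left(\sum_{s=1}^{rg} \mathrm{res}_{\gamma_s}\,R\,dz + \sum_{P_i \in D} \mathrm{res}_{P_i}\,R\,dz\right), \tag{4.29}$$
where R = R(q) equals
$$R = \mathrm{Tr}\left(I^{-1}[M_a, L]\,\delta I - I^{-1}\delta L\,(M_a I + I F_a) - I^{-1}(M_a I + I F_a)\,\delta K\right). \tag{4.30}$$
Using, as before, the equality $L\,\delta I - \delta I\,K = I\,\delta K - \delta L\,I$, we get that
$$\mathrm{Tr}\left(I^{-1}[M_a, L]\,\delta I\right) = \mathrm{Tr}\left(I^{-1} M_a I\,\delta K - M_a\,\delta L\right). \tag{4.31}$$
Using the fact that K and $F_a$ are diagonal, we also obtain the equation
$$\mathrm{Tr}\left(I^{-1}\delta L\,I F_a\right) = \mathrm{Tr}\left(\delta K\,F_a\right). \tag{4.32}$$
From (4.31), (4.32) it follows that
$$i_{\partial_a}\omega = \sum_{P_i \in D} \mathrm{res}_{P_i}\,\mathrm{Tr}\left(\delta K\,F_a\right) dz + R_a, \tag{4.33}$$
where
$$R_a = \sum_{s=1}^{rg} \mathrm{res}_{\gamma_s}\,\mathrm{Tr}\left(\delta L\,M_a\right) dz + \sum_{P_i \in D} \mathrm{res}_{P_i}\,\mathrm{Tr}\left(\delta L\,M_a\right) dz. \tag{4.34}$$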
Note that in the first term of (4.33) a sum of residues at $\gamma_s$ has been dropped because K and $F_a$ are holomorphic at these points.

Consider first the case of the Lax equations (2.46). The matrix $M_a$ for $a = (P_i, n, m)$ is holomorphic everywhere except at the points $\gamma_s$ and $P_i$. Therefore, $R_{i,n,m} = 0$. The corresponding diagonal matrix $F_{i,n,m}$ is holomorphic at the points $P_j \in D$, $j \neq i$. From (2.45) it follows that $F_{i,n,m}$ in the neighborhood of $P_i$ has the form
$$F_{i,n,m} = -w^{-m} K^n + O(1). \tag{4.35}$$
The form δK dz restricted to $\mathcal P^D$ is holomorphic in the neighborhood of $P_i$. Therefore,
$$-\mathrm{res}_{P_i}\,\mathrm{Tr}\left(\delta K\,F_{i,n,m}\right) dz = \mathrm{res}_{P_i}\,\mathrm{Tr}\left(w^{-m} K^n\,\delta K\right) dz = \frac{1}{n + 1}\,\delta\,\mathrm{res}_{P_i}\,\mathrm{Tr}\left(w^{-m} L^{n+1}\right) dz. \tag{4.36}$$
The matrix $F_a$ corresponding to $a = (P_0, m; l)$ is holomorphic at the points of D. Therefore, the right-hand side of (4.33) reduces just to $R_a$. Because $M_{0,m;l}$ is holomorphic except at the points $\gamma_s$ and $P_0$, we have in this case the equation
$$R_{0,m;l} = -\mathrm{res}_{P_0}\,\mathrm{Tr}\left(\delta L\,M_{0,m;l}\right) dz, \tag{4.37}$$
which, with the help of (2.52), implies (4.27). The theorem is therefore proved. It shows that the Lax equations restricted to $\mathcal P^D$ are Hamiltonian whenever the restriction of ω is non-degenerate.
Corollary 4.1. If $D_K$ is an effective divisor, then the Lax equations (2.46), (2.53) restricted to $\mathcal P^D$ are Hamiltonian. The corresponding Hamiltonians (4.26), (4.27) are in involution
$$\{H_a, H_b\} = 0. \tag{4.38}$$
The basic relation which implies all Eqs. (4.38) is involutivity of all the eigenvalues of the Lax matrices at different points of Γ, i.e.
$$\{k_l(q), k_{l_1}(q_1)\} = 0. \tag{4.39}$$
Example. Let us consider the Lax matrices on an elliptic curve $\Gamma = \mathbb C/\{2n\omega_1, 2m\omega_2\}$ with one puncture, which without loss of generality we put at z = 0. In this example we denote the parameters $\gamma_s$ and $\kappa_s$ by $q_s$ and $p_s$, respectively.

In the gauge $\alpha_s = e_s$, $e_s^j = \delta_s^j$, the j-th column of the Lax matrix $L^{ij}$ has poles only at the points $q_j$ and z = 0. From (2.10) it follows that $L^{jj}$ is regular everywhere, i.e. it is a constant. Equation (2.11) implies that $L^{ji}(q_j) = 0$, $i \neq j$, and $L^{jj} = p_j$. An elliptic function with two poles and one zero fixed is uniquely defined up to a constant. It can be written in terms of the Weierstrass σ-function as follows:
$$L^{ij}(z) = f^{ij}\,\frac{\sigma(z + q_i - q_j)\,\sigma(z - q_i)\,\sigma(q_j)}{\sigma(z)\,\sigma(z - q_j)\,\sigma(q_i - q_j)\,\sigma(q_i)}, \quad i \neq j; \qquad L^{ii} = p_i. \tag{4.40}$$
Let $f^{ij}$ be a rank 1 matrix $f^{ij} = a^i b^j$. As was mentioned above, the equations $\alpha_i = e_i$ fix the gauge up to transformation by diagonal matrices. We can use these transformations to make $a^i = b^i$. The corresponding momentum is given then by the collection $(a^i)^2$ and we fix it to the values $(a^i)^2 = 1$. The matrix L given by (4.40) with $f^{ij} = 1$ is gauge equivalent to the Lax matrix $\tilde L$ with a spectral parameter for the elliptic Calogero–Moser system found in [15]:
$$\tilde L^{ii} = p_i, \qquad \tilde L^{ij} = \Phi(q_i - q_j, z), \quad i \neq j, \tag{4.41}$$
where
$$\Phi(q, z) = \frac{\sigma(z - q)}{\sigma(z)\,\sigma(q)}\,e^{\zeta(z) q}. \tag{4.42}$$
Note that $\tilde L$ has an essential singularity at z = 0, which is due to the gauge transformation by the diagonal matrix $\mathrm{diag}(\Phi(q_i, z))$, which removes poles of L at the points $q_i$. The Hamiltonian of the elliptic CM system (1.8) is equal to
$$H_{CM} = \frac{1}{2}\,\mathrm{res}_0\,\mathrm{Tr}\left(z^{-1} L^2\right) dz. \tag{4.43}$$
For the sequel, we would like to express $H_{CM}$ in terms of the first two coefficients of the Laurent expansion of the marked branch of the eigenvalue of L at z = 0. Indeed, expansions of the eigenvalues of L at z = 0 have the form
$$k_1(z) = (r - 1)z^{-1} + k_{11} + k_{12} z + O(z^2), \qquad k_l(z) = -z^{-1} + k_{l1} + k_{l2} z + O(z^2), \quad l > 1. \tag{4.44}$$
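The equation
$$H_1 = \sum_{i=1}^{r} p_i = \mathrm{Tr}\,L = \sum_{l=1}^{r} k_l(z) \tag{4.45}$$
implies
$$H_1 = \sum_{l=1}^{r} k_{l1}, \qquad \sum_{l=1}^{r} k_{l2} = 0. \tag{4.46}$$
From (4.44) and (4.46) it follows that
$$2H_{CM} = 2r k_{12} + \sum_{l=1}^{r} k_{l1}^2. \tag{4.47}$$
The trace of $L^m$ has its only pole at z = 0. Hence, we have the equations
$$\mathrm{res}_0\,\mathrm{Tr}(L^2) = 2\left((r - 1)k_{11} - \sum_{l=2}^{r} k_{l1}\right) = 0, \tag{4.48}$$
$$\mathrm{res}_0\,\mathrm{Tr}(L^3) = 3\left((r - 1)^2 k_{12} + (r - 1)k_{11}^2 + \sum_{l=2}^{r}\left(k_{l2} - k_{l1}^2\right)\right) = 0. \tag{4.49}$$
Equations (4.48) and (4.49) imply
$$H_1 = r k_{11}, \qquad 2H_{CM} = r^2 k_{12} + r k_{11}^2. \tag{4.50}$$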
Our next goal is to construct the action-angle variables for ω.

Theorem 4.3. Let $L \in \mathcal L^D$ be a Lax matrix, and let $\hat\gamma_s$ be the poles of the normalized (3.7) eigenvector ψ. Then the two-form ω defined by (4.2) is equal to
$$\omega = \sum_{s=1}^{\hat g + r - 1} \delta k(\hat\gamma_s) \wedge \delta z(\hat\gamma_s). \tag{4.51}$$
The meaning of the right-hand side of this formula is as follows. The spectral curve $\hat\Gamma$ is equipped by definition with the meromorphic function k(Q). The pull back to $\hat\Gamma$ of the abelian integral $z(Q) = \int^Q dz$ on Γ is a multi-valued holomorphic function on $\hat\Gamma$. The evaluations $k(\hat\gamma_s)$, $z(\hat\gamma_s)$ at the points $\hat\gamma_s$ define functions on the space $\mathcal L^D$, and the wedge product of their external differentials is a two-form on $\mathcal L^D$. (Note that the differential $\delta z(\hat\gamma_s)$ of the multi-valued function $z(\hat\gamma_s)$ is single-valued, because the periods of dz are constants.)

Proof. The proof of formula (4.51) is very general and does not rely on any specific form of L. Let us present it briefly following the proof of Lemma 5.1 in [11] (more details can be found in [16]).
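Let $\gamma_s^j$, $P_i^j$ be preimages on $\hat\Gamma$ of the points $\gamma_s \in \Gamma$ and $P_i \in D$. Then the form ω is equal to
$$\omega = -\frac{1}{2} \sum_{j=1}^{r} \left(\sum_{s=1}^{rg} \mathrm{res}_{\gamma_s^j}\,\hat\Omega\,dz + \sum_{i} \mathrm{res}_{P_i^j}\,\hat\Omega\,dz\right), \tag{4.52}$$
where $\hat\Omega$ is a meromorphic function on $\hat\Gamma$ defined by the formula
$$\hat\Omega(Q) = \psi^*(Q)\,\delta L(q) \wedge \delta\psi(Q) - \psi^*(Q)\,\delta\psi(Q) \wedge \delta k, \qquad Q = (k, q) \in \hat\Gamma. \tag{4.53}$$
The expression $\psi^*(Q)$ is the dual eigenvector, which is the row-vector solution of the equation
$$\psi^*(Q) L(q) = k\,\psi^*(Q), \tag{4.54}$$
normalized by the condition
$$\psi^*(Q)\psi(Q) = 1. \tag{4.55}$$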
Note that $\psi^*(Q)$ can be identified with the only row of the matrix $\Psi^{-1}(q)$ which is not orthogonal to the column $\psi(Q)$ of $\Psi(q)$. That implies that $\psi^*(Q)$, as a function on the spectral curve, has poles at the points $\gamma_s^j$ and at the branch points of the spectral curve. Equation (4.55) implies that it has zeros at the poles $\hat\gamma_s$ of $\psi(Q)$. These analytic properties will be crucial in the sequel.

The differential $\hat\Omega\,dz$ is a meromorphic differential on the spectral curve $\hat\Gamma$. Therefore, the sum of its residues at the punctures $P_i^j$, $\gamma_s^j$ is equal to the negative of the sum of the other residues on $\hat\Gamma$. There are poles of two types. First of all, $\hat\Omega$ has poles at the poles $\hat\gamma_s$ of $\psi$. Note that $\delta\psi$ has a pole of the second order at $\hat\gamma_s$. Taking into account that $\psi^*$ has a zero at $\hat\gamma_s$, we obtain
\[
\operatorname{res}_{\hat\gamma_s}\hat\Omega\,dz=(\psi^*\delta L\psi)(\hat\gamma_s)\wedge\delta z(\hat\gamma_s)+\delta k(\hat\gamma_s)\wedge\delta z(\hat\gamma_s)=2\,\delta k(\hat\gamma_s)\wedge\delta z(\hat\gamma_s). \tag{4.56}
\]
The last equality follows from the standard formula for the variation of an eigenvalue of an operator, $\psi^*\delta L\psi=\delta k$.

The second set of poles of $\hat\Omega$ is the set of branch points $q_i$ of the cover. The pole of $\psi^*$ at $q_i$ cancels with the zero of the differential $dz$, $dz(q_i)=0$, considered as a differential on $\hat\Gamma$. The vector function $\psi$ is holomorphic at $q_i$. If we take an expansion of $\psi$ in the local coordinate $(z-z(q_i))^{1/2}$ (in general position, when the branch point is simple) and consider its variation, we get
\[
\delta\psi=-\frac{d\psi}{dz}\,\delta z(q_i)+O(1). \tag{4.57}
\]
Therefore, $\delta\psi$ has a simple pole at $q_i$. In a similar way we have
\[
\delta k=-\frac{dk}{dz}\,\delta z(q_i). \tag{4.58}
\]
Equalities (4.57) and (4.58) imply that
\[
\operatorname{res}_{q_i}\big(\psi^*\delta L\wedge\delta\psi\big)\,dz=\operatorname{res}_{q_i}\Big[(\psi^*\delta L\,d\psi)\wedge\frac{\delta k\,dz}{dk}\Big]. \tag{4.59}
\]
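As an aside on the eigenvalue-variation formula $\psi^*\delta L\psi=\delta k$ invoked for (4.56), the following is a short sketch of the standard first-order argument. It assumes only the eigenvalue equation $L\psi=k\psi$, the dual equation (4.54) and the normalization (4.55); no other property of the Lax matrix is used.

```latex
% Vary the eigenvalue equation L\psi = k\psi:
\delta L\,\psi + L\,\delta\psi = \delta k\,\psi + k\,\delta\psi .
% Multiply on the left by \psi^* and use \psi^* L = k\,\psi^*  (4.54):
\psi^*\delta L\,\psi + k\,\psi^*\delta\psi
   = \delta k\,\psi^*\psi + k\,\psi^*\delta\psi .
% The terms k\,\psi^*\delta\psi cancel; with \psi^*\psi = 1  (4.55):
\psi^*\delta L\,\psi = \delta k .
```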
Due to the skew-symmetry of the wedge product we may replace $\delta L$ in (4.59) by $(\delta L-\delta k)$. Then, using the identities $\psi^*(\delta L-\delta k)=\delta\psi^*(k-L)$ and $(k-L)\,d\psi=(dL-dk)\psi$, we obtain
\[
\operatorname{res}_{q_i}\big(\psi^*\delta L\wedge\delta\psi\big)\,dz=-\operatorname{res}_{q_i}(\delta\psi^*\psi)\wedge\delta k\,dz=\operatorname{res}_{q_i}(\psi^*\delta\psi)\wedge\delta k\,dz. \tag{4.60}
\]
Note that the term with $dL$ does not contribute to the residue, because $dL(q_i)=0$. The right-hand side of (4.60) cancels with the residue of the second term in (4.53), and the theorem is proved.

Remark. The right-hand side of (4.51) can be identified with a particular case of the universal algebraic-geometric symplectic form proposed in [9]. It is defined on the generalized Jacobian bundles over a proper subspace of the moduli spaces of Riemann surfaces with punctures. In the case of families of hyperelliptic curves that form was pioneered by Novikov and Veselov [17].

Let $\varphi_k$ be coordinates on the Jacobian $J(\hat\Gamma)$ of the spectral curve. The isomorphism of the symmetric power of the spectral curve and the Jacobian is defined by the Abel map
\[
\varphi_i(\hat\gamma)=\sum_{s}\int^{\hat\gamma_s}d\omega_i, \tag{4.61}
\]
where $d\omega_i$ is the basis of normalized holomorphic differentials on $\hat\Gamma$, corresponding to a choice of a basis of $a$- and $b$-cycles on $\hat\Gamma$ with the canonical matrix of intersections. Restricted to $\mathcal P^D$, the differential $\delta k\,dz$ is holomorphic. Therefore, it can be represented as a sum of the basis differentials,
\[
\delta k\,dz=\sum_i\delta I_i\,d\omega_i. \tag{4.62}
\]
The coefficients of the sum are differentials on $\mathcal P^D$ of the functions
\[
I_i=\oint_{a_i}k\,dz. \tag{4.63}
\]
From (4.51) it follows that $\omega=\delta\alpha$, where
\[
\alpha=\sum_{s=1}^{\hat g+r-1}\int^{\hat\gamma_s}\delta k\,dz=\sum_{i=1}^{\hat g}\delta I_i\wedge\varphi_i. \tag{4.64}
\]
Corollary 4.2. The form $\omega$ restricted to $\mathcal P^D$ equals
\[
\omega=\sum_{i=1}^{\hat g}\delta I_i\wedge\delta\varphi_i. \tag{4.65}
\]
For the case when $D_K\ge 0$, this result was first obtained in [18]. It is instructive to show that (4.65) directly implies that $\omega$ is non-degenerate for $D_K\ge 0$. First of all, (4.65) implies that the forms $\delta I_i$ are linearly independent. Indeed, if they are linearly dependent at $s\in\mathcal S^D$, then there is a vector $v$ tangent to $\mathcal S^D$ at $s$ such that
$\delta I_i(v)=0$. Due to (4.62) we conclude that $\partial_v k\equiv 0$. This is impossible for generic $s$, because the equation
\[
\partial_v k=-\frac{\sum_{j=1}^{r}(\partial_v r_j)\,k^{r-j}}{R_k(k,Q)}\equiv 0 \tag{4.66}
\]
implies, then, that $k(Q)$ satisfies an algebraic equation of degree less than $r$, i.e. the spectral curve cannot be an $r$-sheeted branched cover of $\Gamma$. The second argument needed to complete the proof is that the dimension of the space $\mathcal S_P^D\subset\mathcal S^D$ of the spectral curves corresponding to $\mathcal P^D$ equals $\hat g$. The number of conditions that the singular parts of the eigenvalues of $L$ at the points $P_m\in D_K$ are constant along $\mathcal P^D$ equals $(r\deg D_K)$ minus 1, due to the relation
\[
\sum_{P_m\in D_K}\operatorname{res}_{P_m}(\operatorname{Tr}L)\,dz=0, \tag{4.67}
\]
which is valid because the singular parts of $L$ at $\gamma_s$ are traceless. From (3.2) we get
\[
2\dim\mathcal S_P^D=Nr(r-1)-2r(g-1)+2=2\hat g=\dim\mathcal P^D. \tag{4.68}
\]
5. The Zero-Curvature Equations

The main goal of this section is to present the non-stationary analog of the Lax equations on an algebraic curve as an infinite-dimensional Hamiltonian system.

Let $\mathcal A^D$ be the space of $(r\times r)$ matrix functions $L(x,q)=L(x+T,q)$ of the real variable $x$ such that:

$1^0$. $L(x,q)$ is a meromorphic function of the variable $q\in\Gamma$ with poles at $D$ and at the points $\gamma_s(x)$, where it has the form (2.3), i.e. $L(x,q)\in\mathcal N^D_{\gamma(x),\alpha(x)}$,
\[
L(x,z)=\frac{\beta_s(x)\,\alpha_s^T(x)}{z-z_s(x)}+L_{s1}(x)+O(z-z_s(x)),\qquad z_s(x)=z(\gamma_s(x)). \tag{5.1}
\]
$2^0$. The vector $D(L(x,q))$ defined by the map (2.6) is tangent to the loop $\{\gamma(x),\alpha(x)\}$, i.e.
\[
\partial_x z_s(x)=-\alpha_s^T(x)\beta_s(x),\qquad \partial_x\alpha_s^T(x)=-\alpha_s^T(x)L_{s1}(x)+\kappa_s(x)\alpha_s^T(x), \tag{5.2}
\]
where $\kappa_s(x)$ is a scalar function.

Remark. It is necessary to emphasize that, although the loops $S^1\to\mathcal N^D/SL_r$ are lifted to matrix functions $\tilde L(x,q)\in\mathcal N^D$, $x\in\mathbb R$, such that
\[
\tilde L(x+T,q)=g\tilde L(x,q)g^{-1}+\partial_x g\,g^{-1},\qquad g=g(x)\in GL_r, \tag{5.3}
\]
without loss of generality we may consider functions periodic in $x$, because $\tilde L$ with the monodromy property (5.3) is gauge equivalent to a periodic matrix function $L$.

The space $\mathcal A^D_\sigma$ of the matrix functions corresponding to a loop $\sigma=\{\gamma(x),\alpha(x)\}$ in $\mathcal M_0$ is the space of sections of a finite-dimensional affine bundle over the loop, because for any two functions $L_1,L_2\in\mathcal A^D_\sigma$ their difference is a Lax matrix, $L_1-L_2\in\mathcal L^D$. Therefore, for a generic divisor $D$ the space $\mathcal A^D_\sigma$ is non-trivial only if $\deg D=N\ge g$. The functional dimension of $\mathcal A^D_\sigma$ is equal to $r^2(N-g+1)$, while the functional dimension of $\mathcal A^D$ equals $r^2(N+1)$.
Lemma 5.1. If $D=K$ is the zero divisor of a holomorphic differential $dz$, then the map
\[
L\in\mathcal A^K\longmapsto\{\alpha_s(x),\beta_s(x),\gamma_s(x),\kappa_s(x)\} \tag{5.4}
\]
is a bijective correspondence between $\mathcal A^K$ and the space of functions periodic in $x$ such that
\[
\partial_x z(\gamma_s(x))=-\alpha_s^T(x)\beta_s(x),\qquad \sum_{s=1}^{rg}\beta_s(x)\alpha_s^T(x)=0, \tag{5.5}
\]
modulo the gauge transformations
\[
\alpha_s(x)\to\lambda_s(x)\alpha_s(x),\quad \beta_s(x)\to\lambda_s^{-1}(x)\beta_s(x),\quad \kappa_s(x)\to\kappa_s(x)+\partial_x\ln\lambda_s(x), \tag{5.6}
\]
\[
\alpha_s(x)\to W(x)^T\alpha_s(x),\qquad \beta_s(x)\to W^{-1}(x)\beta_s(x), \tag{5.7}
\]
where $\lambda_s(x)$ is a non-vanishing function periodic in $x$ and $W(x)\in\widehat{GL}_r$ is a periodic non-degenerate matrix function.

Note that from (5.2) it follows that locally, in the neighborhood of $\gamma_s(x)$, the matrix function $L(x,Q)\in\mathcal A^D_\sigma$ can be regarded as a connection of the bundle $V$ over $S^1\times\Gamma$ along the loop $\{\gamma(x),\alpha(x)\}$. Indeed, if $\mathcal F_s$ is the space of local sections of this bundle, which can be identified with the space of meromorphic vector functions $f(x,z)$ that have the form (2.1) in the neighborhood of $\gamma_s$, then
\[
\big(\partial_x+L^T(x,z)\big)f(x,z)\in\mathcal F_s. \tag{5.8}
\]
Another characterization of the constraints (5.2) is as follows.

Lemma 5.2. A meromorphic matrix function $L$ in the neighborhood of $\gamma_s(x)$ with a pole at $\gamma_s(x)$ satisfies the constraints (5.2) if and only if there exists a holomorphic matrix function $\Phi_s(x,z)$, with at most a simple zero of $\det\Phi_s$ at $\gamma_s$, such that $L$ is gauge equivalent,
\[
L=\Phi_s\tilde L_s\Phi_s^{-1}+\partial_x\Phi_s\,\Phi_s^{-1}, \tag{5.9}
\]
to a holomorphic matrix function $\tilde L_s$.

The tangent space to $\mathcal A^D$ is the space of functions of $x$ with values in the tangent space to the space of Lax matrices $T(\mathcal L^D)$.
Lemma 5.3. Let $L\in\mathcal A^D_\sigma$ and $M\in\mathcal N^D_{\gamma(x),\alpha(x)}$. Then the commutator $[\partial_x-L,M]=M_x+[M,L]$ is a tangent vector to $\mathcal A^D$ at $L$ if and only if its divisor of poles outside of $\gamma_s(x)$ is not greater than $D$.

From Eqs. (5.2) it follows that the Laurent expansion of the matrix function $T=M_x+[M,L]$ at the point $\gamma_s(x)$ has the form (2.39), where $\dot z_s$ and $\dot\alpha_s$ are given by formulae (2.7), (2.8). That proves that $T$ is a tangent vector to $\mathcal L^D$. Lemma 5.3 shows that the zero-curvature equation
\[
L_t=M_x+[M,L] \tag{5.10}
\]
is a well-defined system whenever we can define $M(L)$ such that the conditions of the lemma are satisfied. Our goal is to construct zero-curvature equations that are equivalent to differential equations. That requires $M(L)$ to be expressed in terms of $L$ and its derivatives in $x$.

It is instructive enough to consider the case when all the multiplicities of the points $P_i\in D$ equal $m_i=1$. Let $\mathcal A^D_0$ be an open set in $\mathcal A^D$ such that the singular part of $L\in\mathcal A^D_0$ at $P_i$ has distinct eigenvalues,
\[
\begin{gathered}
L(x,q)=w_i^{-1}\,C_i(x)\,u^{(i)}(x)\,C_i^{-1}(x)+O(1),\qquad w_i=w_i(q),\quad w_i(P_i)=0,\\
u^{(i)}=\operatorname{diag}\big(u^{(i)}_1(x),\dots,u^{(i)}_r(x)\big),\qquad u^{(i)}_k(x)\neq u^{(i)}_l(x),\quad k\neq l.
\end{gathered} \tag{5.11}
\]
Lemma 5.4. Let $L(x,w)$ be a formal Laurent series
\[
L=\sum_{j=-1}^{\infty}l_j(x)\,w^j \tag{5.12}
\]
such that $l_{-1}(x)=C(x)u(x)C^{-1}(x)$, where $u$ is a diagonal matrix with distinct diagonal elements. Then there is a unique formal solution $\Psi_0=\Psi_0(x,w)$ of the equation
\[
\big(\partial_x-L(x,w)\big)\Psi(x,w)=0, \tag{5.13}
\]
which has the form
\[
\Psi_0(x,w)=C(x)\Big(\sum_{s=0}^{\infty}\xi_s(x)\,w^s\Big)\,e^{\int_{x_0}^{x}h(x',w)\,dx'},\qquad h=\operatorname{diag}(h_1,\dots,h_r), \tag{5.14}
\]
normalized by the conditions
\[
\xi_0^{ij}=\delta^{ij},\qquad \xi_s^{ii}(x)=0. \tag{5.15}
\]
The coefficients $\xi_s(x)$ of (5.14) and the coefficients $h_s(x)$ of the Laurent series
\[
h(x,w)=\sum_{s=-1}^{\infty}h_s(x)\,w^s,\qquad h_{-1}=u, \tag{5.16}
\]
are differential polynomials in the matrix elements of $L$.

Substitution of (5.14) into (5.13) gives a system of equations of the form
\[
h_s-[u,\xi_{s+1}]=R(\xi_0,\dots,\xi_s;\,h_0,\dots,h_{s-1}),\qquad s=-1,0,1,\dots. \tag{5.17}
\]
They recursively determine the off-diagonal part of $\xi_{s+1}$ and the diagonal matrix $h_s$ as polynomial functions of the matrix elements of $l_i(x)$, $i\le s$.

Corollary 5.1. Let $\Psi_0$ be the formal solution (5.14) of Eq. (5.13). Then for any diagonal matrix $E$ the expression $w^{-m}\Psi_0E\Psi_0^{-1}$ does not depend on $x_0$ and is formally meromorphic, i.e. it has the form
\[
w^{-m}\Psi_0E\Psi_0^{-1}=\sum_{s=-m}^{\infty}m_s(x)\,w^{s}. \tag{5.18}
\]
The coefficients $m_s(x)$ are differential polynomials in the matrix elements of the coefficients $l_i(x)$.
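To make the recursion (5.17) concrete, here is a sketch of its two lowest orders. The abbreviation $A$ is introduced only for this illustration and does not appear in the text; the computation assumes the ansatz (5.14) with $\xi_0=1$ and the normalization (5.15).

```latex
% Insert \Psi_0 = C\,\xi\,e^{\int h}, \xi = \sum_{s\ge 0}\xi_s w^s, into (5.13);
% multiply by C^{-1} on the left and by e^{-\int h} on the right:
\xi_x + \xi\,h = \big(C^{-1}LC - C^{-1}C_x\big)\,\xi ,
\qquad C^{-1}LC = u\,w^{-1} + \sum_{j\ge 0} C^{-1}l_jC\,w^{j}.
% Order w^{-1}:
\xi_0 h_{-1} = u\,\xi_0 \;\Longrightarrow\; h_{-1} = u .
% Order w^{0}, with A := C^{-1}l_0C - C^{-1}C_x (illustrative notation):
h_0 + \xi_1 u = u\,\xi_1 + A
\;\Longleftrightarrow\; h_0 - [u,\xi_1] = A .
% Diagonal and off-diagonal parts (h_0 diagonal, \xi_1^{ii} = 0):
h_0^{ii} = A^{ii}, \qquad
\xi_1^{ij} = -\frac{A^{ij}}{u_i-u_j}, \quad i\neq j .
```

The division by $u_i-u_j$ is where the assumption of distinct eigenvalues enters; higher orders repeat the same split into diagonal and off-diagonal parts.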
Expression (5.18) is meromorphic and does not depend on $x_0$, because the essential singularities of the factors commute with $E$ and so cancel each other. We are now in a position to define matrices
\[
M_a,\qquad a=(P_i,m;l),\quad m\ge 1,\quad l=1,\dots,r, \tag{5.19}
\]
which are differential polynomials in the entries of $L$ and satisfy the conditions of Lemma 5.3. Let $\Psi_0(x,q)=\Psi_0(x,w(q))$ be the formal solution of Eq. (5.13) constructed above for the expansion (5.11) of $L\in\mathcal A^D_0$ at $P_i$. Then we define $M_{(i,m;l)}(x,q)$ as the unique meromorphic matrix function which has the form (2.3), (2.4) at the points $\gamma_s(x)$ and is holomorphic everywhere else except at the point $P_i$, where
\[
M_{(i,m;l)}(x,q)=w^{-m}(q)\,\Psi_0(x,q)\,E_l\,\Psi_0^{-1}(x,q)+O(1),\qquad E_l^{ij}=\delta^{il}\delta^{jl}. \tag{5.20}
\]
As before, we normalize $M_{(i,m;l)}$ by the condition $M_{(i,m;l)}(x,P_0)=0$. It is necessary to mention that $M_a$, as a function of $L$, is defined only locally, because it depends on a representation of the singular part of $L$ at $P_i$ in the form (5.11).

Theorem 5.1. The equations
\[
\partial_aL=\partial_xM_a+[M_a,L],\qquad a=(P_i,m;l), \tag{5.21}
\]
define a hierarchy of commuting flows on $\mathcal A^D_0$.

Let the coefficients of (5.12) be periodic functions of $x$. Then Lemma 5.4 implies that
\[
\Psi_0(x+T,w)=\Psi_0(x,w)\,e^{p(w)},\qquad p=\int_0^T h(x,w)\,dx. \tag{5.22}
\]
Therefore, the columns of $\Psi_0$ are Bloch solutions of Eq. (5.13), i.e. solutions that are eigenvectors of the monodromy operator. The diagonal elements of the matrix $p(w)$ are the formal quasimomenta of the operator (5.13).

Our next goal is to show that for $D_K\ge 0$ the zero-curvature equations are Hamiltonian on suitable symplectic leaves, and to identify their Hamiltonians with the coefficients of the quasimomentum matrices $p_i$ corresponding to the expansion (5.11) of $L$ at the punctures $P_i$,
\[
p_i(w)=\sum_{s=-1}^{\infty}H_{(i,s)}\,w^s,\qquad H_{(i,s)}=\operatorname{diag}\{H_{(i,s;l)}\}. \tag{5.23}
\]
Let us fix a holomorphic differential $dz$ with simple zeros, and a set of diagonal matrix functions $v^{(i)}(x)$. Then, for a divisor $D$ such that $D_K$ is effective, we first define a subspace $\mathcal B^D$ of $\mathcal A^D_0$ by the constraints
\[
\partial_x\big(u^{(i)}(x)-v^{(i)}(x)\big)=0, \tag{5.24}
\]
where $u^{(i)}$ are the matrices of eigenvalues (5.11) of the singular parts of $L\in\mathcal A^D_0$. Next we define a foliation of $\mathcal B^D$. The leaves $\mathcal P^D_0$ of the foliation are parameterized by sets of constant diagonal matrices $c^{(m)}$ with distinct diagonal elements, and are defined by the equations
\[
u^{(m)}(x)-v^{(m)}(x)=c^{(m)},\qquad\text{if } dz(P_m)\neq 0. \tag{5.25}
\]
We would like to stress the difference between the constraints (5.24) and (5.25). Equations (5.24) imply that for all the points of the divisor $D$ the differences $(u^{(i)}(x)-v^{(i)}(x))$ are $x$-independent matrices. For $P_m\in D_K$ we require additionally that the difference equals a fixed matrix.

As before, we define a two-form on $\mathcal P^D_0$ by formula (4.2), where now
\[
\Omega(q)=\operatorname{Tr}\Big(\int_{x_0}^{x_0+T}\Psi^{-1}\delta L\wedge\delta\Psi\,dx-\Psi^{-1}\delta\Psi(x_0)\wedge\delta p\Big) \tag{5.26}
\]
and $\Psi$ is the matrix of the Bloch solutions of (5.13), i.e.
\[
\big(\partial_x-L(x,q)\big)\Psi(x,q)=0,\qquad \Psi(x+T,q)=\Psi(x,q)\,e^{p(q)}. \tag{5.27}
\]
We would like to emphasize that this definition is a slight modification of the formula for the symplectic structure for soliton equations proposed in [9]. The second term in (5.26) gives zero contribution in the conventional theory. It is included here to remove the dependence on the choice of $x_0$ in the definition, as may be seen as follows. The monodromy property (5.27) implies
\[
\operatorname{Tr}\big(\Psi^{-1}\delta L\wedge\delta\Psi\big)(x+T)-\operatorname{Tr}\big(\Psi^{-1}\delta L\wedge\delta\Psi\big)(x)=\operatorname{Tr}\big(\Psi^{-1}\delta L\,\Psi\big)(x)\wedge\delta p. \tag{5.28}
\]
Using the equation $\delta L\,\Psi=\delta\Psi_x-L\,\delta\Psi$, we obtain
\[
\operatorname{Tr}\big(\Psi^{-1}\delta L\,\Psi\big)=\operatorname{Tr}\,\partial_x\big(\Psi^{-1}\delta\Psi\big). \tag{5.29}
\]
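The step leading to (5.29) can be written out as follows. This is a sketch that uses only the linear equation (5.13) for the matrix of Bloch solutions and its variation; it shows that the identity already holds before taking the trace.

```latex
% From \partial_x\Psi = L\Psi and its variation:
\partial_x(\delta\Psi) = \delta L\,\Psi + L\,\delta\Psi ,
\qquad
\partial_x(\Psi^{-1}) = -\Psi^{-1}(\partial_x\Psi)\Psi^{-1} = -\Psi^{-1}L .
% Product rule:
\partial_x\big(\Psi^{-1}\delta\Psi\big)
  = -\Psi^{-1}L\,\delta\Psi + \Psi^{-1}\big(\delta L\,\Psi + L\,\delta\Psi\big)
  = \Psi^{-1}\delta L\,\Psi .
% Taking the trace of both sides gives (5.29).
```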
Hence, the form $\Omega$ does not depend on the choice of the initial point $x=x_0$.

The same arguments as before show that $\omega$, when restricted to $\mathcal P^D_0$, does not depend on the normalization of the Bloch solutions.

Theorem 5.2. The formula (4.2) with $\Omega$ given by (5.26) defines a closed two-form on $\mathcal P^D_0$. It is gauge invariant with respect to the affine gauge group $\widehat{GL}_r$. If $D\ge K$, then the contraction of $\omega$ by the vector field $\partial_a$ defined by (5.21) equals
\[
i_{\partial_a}\omega=\delta H_a, \tag{5.30}
\]
where for $a=(P_i,m;l)$,
\[
H_{(i,m;l)}=-\operatorname{res}_{P_i}\operatorname{Tr}\big(w^{-m}E_l\,p\big)\,dz, \tag{5.31}
\]
and $p$ is the quasimomentum matrix.

The proof of this theorem proceeds along lines identical to the proof of the stationary analogs of these results presented above. First, we show that under the gauge transformation $L'=g^{-1}Lg-g^{-1}\partial_xg$, $\Psi'=g^{-1}\Psi$, the form $\Omega$ is transformed to
\[
\Omega'=\Omega+\operatorname{Tr}\int_{x_0}^{x_0+T}\big(2\,\delta h\wedge\delta L-2L\,\delta h\wedge\delta h+\delta h_x\wedge\delta h\big)\,dx, \tag{5.32}
\]
where $\delta h=\delta g\,g^{-1}$. Note that the last term does not contribute to the residues. The first two terms are meromorphic on $\Gamma$ with poles at $\gamma_s$ and $P_i\in D$ only. Therefore, the sum of their contributions to the residues of $\Omega\,dz$ equals zero. Hence, $\omega$ does descend to a form on
\[
\mathcal P^D=\mathcal P^D_0/\widehat{GL}_r. \tag{5.33}
\]
Using (5.32) for the gauge transformation (5.9), where $\Phi_s$ depends on a point $z$ in the neighborhood of $\gamma_s$, we obtain
\[
\operatorname{res}_{\gamma_s}\Omega\,dz=-2\int_{x_0}^{x_0+T}\Big(\delta\kappa_s(x)\wedge\delta z_s(x)+\sum_{i=1}^{r}\delta\beta_s^i(x)\wedge\delta\alpha_s^i(x)\Big)\,dx. \tag{5.34}
\]
From (5.14) we obtain that if $dz(P_i)=0$, then
\[
\operatorname{res}_{P_i}\Omega\,dz=\operatorname{Tr}\Big(\int_{x_0}^{x_0+T}\Big(\delta u^{(i)}(x)\wedge\int_{x_0}^{x}\delta u^{(i)}(y)\,dy\Big)dx-\delta u^{(i)}(x_0)\wedge\int_{x_0}^{x_0+T}\delta u^{(i)}(x)\,dx\Big). \tag{5.35}
\]
Equations (5.24) imply that the restriction of $\delta u^{(i)}$ to $\mathcal P^D_0$ is $x$-independent. Then from (5.35) it follows that the points $P_i\in K$ give zero contribution to $\omega$. From (5.14) and (5.25) it follows that the form $\delta\Psi\,\Psi^{-1}$, when restricted to $\mathcal P^D_0$, is holomorphic in the neighborhood of $P_m\in D_K$. Therefore, in this neighborhood
\[
\big(\delta\Psi_x\Psi^{-1}+\delta\Psi\,(\Psi^{-1})_x\big)\big|_{\mathcal P^D_0}=O(1). \tag{5.36}
\]
Using this equality we obtain that on $\mathcal P^D_0$ the following equation holds:
\[
\operatorname{res}_{P_m}\Omega\,dz=-2\operatorname{res}_{P_m}\operatorname{Tr}\Big(\int_{x_0}^{x_0+T}L\,\delta\Psi\,\Psi^{-1}\wedge\delta\Psi\,\Psi^{-1}\,dx\Big)dz. \tag{5.37}
\]
Therefore, restricted to $\mathcal P^D_0$, the form $\omega$ is equal to the integral over the period of (4.24). The proof of Eq. (5.30), where $H_a$ is given by (5.31), is almost identical to the proof of (4.26).

Important remark. The formulae (5.34) and (5.37) do not directly imply that $\omega$ restricted to $\mathcal P^D$ is non-degenerate, because of the constraints (5.24). The conventional theory of the soliton equations and the results of the next section provide some evidence that it is non-degenerate for $D_K\ge 0$, although at this moment the author does not know a direct proof of that. In any case, Eq. (5.30) shows that Eqs. (5.21) are Hamiltonian on suitable subspaces of $\mathcal P^D$. Then commutativity of the flows implies
\[
\{H_a,H_b\}=0. \tag{5.38}
\]
The previous results can be easily extended to the case when the leading coefficient of the singular part of $L$ at the puncture $P_i$ has multiple eigenvalues.

Lemma 5.5. Let $L(x,w)$ be a formal Laurent series (5.12) such that $l_{-1}=C(x)\,u\,C^{-1}(x)$, where $u=u_i\delta^{ij}$ is a constant diagonal matrix. Then there is a unique formal solution $\Psi_0=\Psi_0(x,w)$ of Eq. (5.13) which has the form
\[
\Psi_0(x,w)=C(x)\Big(\sum_{s=0}^{\infty}\xi_s(x)\,w^{s}\Big)T(x,w),\qquad T(x_0,w)=1, \tag{5.39}
\]
where
\[
\xi_0^{ij}=\delta^{ij};\qquad \xi_s^{ij}(x)=0\ \text{ if } u_i=u_j,\quad s\ge 1, \tag{5.40}
\]
and the logarithmic derivative $h(x,w)$ of $T$ is a formal series with non-vanishing entries only for indices $(i,j)$ such that $u_i=u_j$, i.e.
\[
h=\partial_xT\,T^{-1}=u\,w^{-1}+\sum_{s=0}^{\infty}h_s(x)\,w^s,\qquad h^{ij}=0\ \text{ if } u_i\neq u_j. \tag{5.41}
\]
The coefficients $\xi_s(x)$ of (5.39) and the coefficients $h_s(x)$ of (5.41) are differential polynomials in the matrix elements of $L$.

Substitution of (5.39) in (5.13) gives a system of equations which have the form (5.17). They recursively determine $\xi_{s+1}^{ij}$ for indices $(i,j)$ such that $u_i\neq u_j$, and the matrix $h_s$, as polynomial functions of the matrix elements of $l_i(x)$, $i\le s$.

Corollary 5.2. Let $\Psi_0$ be the formal solution (5.39) of Eq. (5.13). Then for any diagonal matrix $E=E_i\delta^{ij}$ such that $E_i=E_j$ if $u_i=u_j$, the expression $w^{-m}\Psi_0E\Psi_0^{-1}$ does not depend on $x_0$ and is formally meromorphic. The coefficients $m_s(x)$ of its Laurent expansion (5.18) are differential polynomials in the entries of the coefficients $l_i(x)$.

The expression $w^{-m}\Psi_0E\Psi_0^{-1}$ is meromorphic and does not depend on $x_0$ because $[T,E]=0$. The corollary implies that if the singular parts of $L$ at the punctures $P_i$ have multiple eigenvalues, then the commuting flows are parameterized by sets
\[
a=(P_i,m;E_\lambda), \tag{5.42}
\]
where $E_\lambda$ is a diagonal matrix that satisfies the condition of Corollary 5.2. The Hamiltonians of the corresponding equations are equal to
\[
H_a=-\operatorname{res}_{P_i}\operatorname{Tr}\Big(w^{-m}E_\lambda\int_0^Th(x)\,dx\Big)dz. \tag{5.43}
\]
Example. Field analog of the elliptic CM system. Let us consider the zero-curvature equation on the elliptic curve with one puncture. We use the same notation as in Sect. 4. In the gauge $\alpha_s=e_s$, $e_s^j=\delta_s^j$, the phase space can be identified with the space of elliptic matrix functions such that $L^{ij}$ has poles at the point $q_j(x)$ and at $z=0$ only. From (5.2) it follows that the residue of $L^{jj}$ at $q_j$ equals $-q_{jx}$. Therefore, $L^{jj}=p_j+q_{jx}\big(\zeta(z)-\zeta(z-q_j)-\zeta(q_j)\big)$. Equation (5.2) also implies that $L^{ji}(q_j)=0$, $i\neq j$. Let us assume, as in the case of the elliptic CM system, that the singular part of $L$ at the puncture $z=0$ is a point of the orbit of the adjoint action corresponding to the diagonal matrix $\operatorname{diag}(r-1,-1,\dots,-1)$. Then, taking into account the momentum map corresponding to the gauge transformation by diagonal matrices, we get that the non-stationary analog of the Lax matrix for the CM system has the form
\[
L^{ii}=p_i+q_{ix}\big(\zeta(z)-\zeta(z-q_i)-\zeta(q_i)\big), \tag{5.44}
\]
\[
L^{ij}=f_if_j\,\frac{\sigma(z+q_i-q_j)\,\sigma(z-q_i)\,\sigma(q_j)}{\sigma(z)\,\sigma(z-q_j)\,\sigma(q_i-q_j)\,\sigma(q_i)},\qquad i\neq j. \tag{5.45}
\]
The values $f_i^2$ are fixed to
\[
f_i^2=1+q_{ix},\qquad \sum_{i=1}^{r}q_{ix}=0. \tag{5.46}
\]
According to (5.34), the symplectic form equals
\[
\omega=\int_0^T\sum_{i=1}^{r}\delta p_i(x)\wedge\delta q_i(x)\,dx\ \longrightarrow\ \{p_i(x),q_j(y)\}=\delta_{ij}\,\delta(x-y). \tag{5.47}
\]
The commuting Hamiltonians are the coefficients of the Laurent expansion at $z=0$ of the quasimomentum corresponding to the only simple eigenvalue of the singular part of $L$ at $z=0$. To find them we look for the solution of (5.13) in the form
\[
\psi=C(x,z)\,e^{\int_0^xh(x',z)\,dx'}, \tag{5.48}
\]
\[
C=\sum_{s=0}^{\infty}C^{(s)}(x)\,z^s,\qquad h=\sum_{s=-1}^{\infty}h_s(x)\,z^s, \tag{5.49}
\]
where $C^{(0)}$ is the eigenvector of the singular part of $L$ corresponding to the eigenvalue $(r-1)$, i.e.
\[
C^{(0)}_i=f_i, \tag{5.50}
\]
and the coefficients $C^{(s)}$ for $s>0$ are vectors normalized by the condition
\[
\sum_{i=1}^{r}f_i\,C^{(s)}_i=0,\qquad s>0. \tag{5.51}
\]
Substitution of (5.44), (5.45) into (5.13) gives a system of equations for the coordinates $C_i$ of the vector $C$:
\[
\partial_xC_i+hC_i=p_iC_i+q_{ix}C_i\big[\zeta(z)-\zeta(z-q_i)-\zeta(q_i)\big]+f_i\sum_{j\neq i}f_jC_j\big[\zeta(z)-\zeta(z-q_j)+\zeta(q_i-q_j)-\zeta(q_i)\big], \tag{5.52}
\]
where we use the identity
\[
\frac{\sigma(z+q_i-q_j)\,\sigma(z-q_i)\,\sigma(q_j)}{\sigma(z)\,\sigma(z-q_j)\,\sigma(q_i-q_j)\,\sigma(q_i)}=\zeta(z)-\zeta(z-q_j)+\zeta(q_i-q_j)-\zeta(q_i). \tag{5.53}
\]
Taking the expansion of (5.52) at $z=0$, we find recursively the coefficients $C^{(s)}_i$ and the densities $h_s$ of the Hamiltonians. The first two steps are as follows. The coefficients at $z^{-1}$ of the right- and left-hand sides of (5.52) give
\[
h_{-1}=q_{ix}+\sum_{j\neq i}f_j^2=q_{ix}+(r-f_i^2)=r-1. \tag{5.54}
\]
The next system of equations is
\[
f_{ix}+f_ih_0+(r-1)C^{(1)}_i=p_if_i+q_{ix}C^{(1)}_i+f_i\sum_{j\neq i}\big(f_jC^{(1)}_j+f_j^2V_{ij}\big), \tag{5.55}
\]
where
\[
V_{ij}=\zeta(q_j)+\zeta(q_{ij})-\zeta(q_i),\qquad q_{ij}=q_i-q_j. \tag{5.56}
\]
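The coefficients that enter (5.54), (5.55) and the next order come from the Laurent expansions at $z=0$ of the two bracketed combinations in (5.52). A brief sketch, assuming only the standard facts that $\zeta$ is odd, $\zeta'=-\wp$, and $\zeta(z)=z^{-1}+O(z^3)$:

```latex
% Taylor expansion of the shifted zeta function (q \neq 0 fixed):
\zeta(z-q) = -\zeta(q-z) = -\zeta(q) - \wp(q)\,z + O(z^2) .
% Diagonal bracket of (5.52):
\zeta(z)-\zeta(z-q_i)-\zeta(q_i) = \frac{1}{z} + \wp(q_i)\,z + O(z^2) .
% Off-diagonal bracket of (5.52), with V_{ij} as in (5.56):
\zeta(z)-\zeta(z-q_j)+\zeta(q_i-q_j)-\zeta(q_i)
   = \frac{1}{z} + V_{ij} + \wp(q_j)\,z + O(z^2) .
```

In particular the diagonal bracket has no constant term, which is why $q_{ix}$ multiplies only $C^{(1)}_i$ in (5.55), while $V_{ij}$ appears only in the off-diagonal sum.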
Using (5.51), we get
\[
rC^{(1)}_i+f_ih_0=p_if_i-f_{ix}+f_i\sum_{j\neq i}f_j^2V_{ij}. \tag{5.57}
\]
Multiplying (5.57) by $f_i$ and taking the sum over $i$, we find, using (5.51) and the skew-symmetry of $V_{ij}$,
\[
rh_0=\sum_{i=1}^{r}p_if_i^2=\sum_{i=1}^{r}p_i(1+q_{ix}). \tag{5.58}
\]
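For completeness, the summation that produces (5.58) can be spelled out term by term. This sketch uses only (5.46), (5.51), (5.57) and $V_{ij}=-V_{ji}$, which is immediate from (5.56) and the oddness of $\zeta$.

```latex
% Multiply (5.57) by f_i and sum over i:
r\sum_i f_iC^{(1)}_i + h_0\sum_i f_i^2
   = \sum_i p_if_i^2 - \sum_i f_if_{ix} + \sum_i\sum_{j\neq i} f_i^2f_j^2V_{ij} .
% Each auxiliary sum is evaluated separately:
\sum_i f_iC^{(1)}_i = 0 \quad\text{by (5.51)}, \qquad
\sum_i f_i^2 = \sum_i (1+q_{ix}) = r \quad\text{by (5.46)},
\sum_i f_if_{ix} = \tfrac12\,\partial_x\sum_i f_i^2 = 0, \qquad
\sum_{i\neq j} f_i^2f_j^2V_{ij} = 0 \quad\text{since } V_{ij}=-V_{ji} .
% Hence
r\,h_0 = \sum_i p_if_i^2 .
```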
In the same way we get the system of equations for $C^{(2)}_i$,
\[
rC^{(2)}_i+\partial_xC^{(1)}_i+h_0C^{(1)}_i+h_1f_i=p_iC^{(1)}_i+q_{ix}f_i\wp(q_i)+f_i\sum_{j\neq i}\big(f_jC^{(1)}_jV_{ij}+f_j^2\wp(q_j)\big). \tag{5.59}
\]
Consequently, the expression for the density of the second Hamiltonian is
\[
\begin{aligned}
r^2h_1&=r\sum_i\Big[q_{ix}f_i^2\wp(q_i)+C^{(1)}_i(f_{ix}+p_if_i)+\sum_{j\neq i}\big(f_i^2f_jC^{(1)}_jV_{ij}+f_i^2f_j^2\wp(q_j)\big)\Big]\\
&=r\sum_i\Big[(r-1)f_i^2\wp(q_i)+C^{(1)}_i\Big(f_{ix}+p_if_i+\sum_{j\neq i}f_if_j^2V_{ji}\Big)\Big].
\end{aligned} \tag{5.60}
\]
For the first line we have used the equation $\sum_i\big(f_iC^{(1)}_{ix}+f_{ix}C^{(1)}_i\big)=0$. From (5.57) it follows that the second term in (5.60) equals
\[
II=-rh_0^2+\sum_i\Big[p_i^2f_i^2-f_{ix}^2+\sum_{j\neq i}(f_i^2)_xf_j^2V_{ij}+\sum_{j,k\neq i}f_i^2f_j^2f_k^2V_{ij}V_{ki}\Big]. \tag{5.61}
\]
For any triple of distinct integers $i\neq j\neq k\neq i$ the following equation holds:
\[
V_{ij}V_{ki}+V_{jk}V_{ij}+V_{ki}V_{jk}=-\wp(q_i)-\wp(q_j)-\wp(q_k). \tag{5.62}
\]
In order to prove (5.62), it is enough to check that the left-hand side, which is a symmetric function of all the variables $q_i,q_j,q_k$, as a function of the variable $q_i$ has a double pole at $q_i=0$ and is regular at $q_i=q_j$. In the same way one can obtain the well-known relation
\[
V_{ij}V_{ji}=-\wp(q_i)-\wp(q_j)-\wp(q_{ij}). \tag{5.63}
\]
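The relation (5.63) can also be checked directly against a classical identity for the Weierstrass functions. The sketch below assumes the standard fact that $(\zeta(a)+\zeta(b)+\zeta(c))^2=\wp(a)+\wp(b)+\wp(c)$ whenever $a+b+c=0$; the labels $a,b,c$ are introduced only for this illustration.

```latex
% With a = q_j,  b = q_i - q_j = q_{ij},  c = -q_i  (so a+b+c = 0), (5.56) reads
V_{ij} = \zeta(a)+\zeta(b)+\zeta(c) .
% Oddness of \zeta gives skew-symmetry:
V_{ji} = \zeta(q_i)+\zeta(q_{ji})-\zeta(q_j) = -V_{ij} .
% Classical identity for a+b+c = 0, together with evenness of \wp:
V_{ij}^2 = \wp(q_j)+\wp(q_{ij})+\wp(q_i) .
% Therefore
V_{ij}V_{ji} = -V_{ij}^2 = -\wp(q_i)-\wp(q_j)-\wp(q_{ij}) .
```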
Equations (5.62), (5.63) imply
\[
\sum_i\sum_{j,k\neq i}f_i^2f_j^2f_k^2V_{ij}V_{ki}=-\sum_i\Big[rf_i^2(r-f_i^2)\wp(q_i)+\sum_{j\neq i}f_i^2f_j^4\wp(q_{ij})\Big]. \tag{5.64}
\]
From (5.56) it follows that
\[
\begin{aligned}
\sum_i\sum_{j\neq i}(f_i^2)_xf_j^2V_{ij}&=\frac12\sum_i\Big[\sum_{j\neq i}\big((f_i^2)_xf_j^2-f_i^2(f_j^2)_x\big)\zeta(q_{ij})-2r(f_i^2)_x\zeta(q_i)\Big]\\
&=-\frac12\sum_i\Big[2rq_{ixx}\zeta(q_i)-\sum_{j\neq i}q_{ijxx}\zeta(q_{ij})\Big]+\frac12\sum_i\sum_{j\neq i}\big(q_{ixx}q_{jx}-q_{jxx}q_{ix}\big)\zeta(q_{ij}).
\end{aligned} \tag{5.65}
\]
The first sum is equal to
\[
\frac12\sum_i\Big[2rq_{ixx}\zeta(q_i)-\sum_{j\neq i}q_{ijxx}\zeta(q_{ij})\Big]=\frac12\sum_i\Big[2rq_{ix}f_i^2\wp(q_i)-\sum_{j\neq i}q_{ijx}^2\wp(q_{ij})\Big]+\partial_xF, \tag{5.66}
\]
where
\[
F=\frac12\sum_i\Big[2rf_i^2\zeta(q_i)-\sum_{j\neq i}q_{ijx}\zeta(q_{ij})\Big]. \tag{5.67}
\]
The first terms in (5.60), (5.64), (5.65) cancel each other. The function $F$, as a function of the variable $q_i$, has poles at the points $0$, $q_j$, $j\neq i$, and the sum of its residues at these points equals
\[
rf_i^2-\sum_{j\neq i}q_{ijx}=r. \tag{5.68}
\]
Therefore, it has the same monodromy properties with respect to all the variables. The functions $q_i(x)$ represent loops on the elliptic curve. Therefore, $q_i(x+T)=q_i(x)+b_i$, where $b_i$ is a period of the elliptic curve. The constraint (5.46) implies $\sum_ib_i=0$. Then from (5.68) it follows that $F$ is a periodic function of $x$. The densities of the Hamiltonians are defined up to a total derivative of periodic functions in $x$. Hence, a density of the
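second Hamiltonian of the hierarchy equals
\[
r^2h_1=-\frac1r\Big(\sum_ip_i(1+q_{ix})\Big)^2+\sum_i\Big(p_i^2(1+q_{ix})-\frac{q_{ixx}^2}{4(1+q_{ix})}\Big) \tag{5.69}
\]
\[
\qquad-\frac12\sum_i\sum_{j\neq i}\big(q_{ixx}q_{jx}-q_{jxx}q_{ix}\big)\zeta(q_{ij}) \tag{5.70}
\]
\[
\qquad+\frac12\sum_i\sum_{j\neq i}\Big[(1+q_{ix})(1+q_{jx})^2+(1+q_{jx})(1+q_{ix})^2-q_{ijx}^2\Big]\wp(q_{ij}). \tag{5.71}
\]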
The transformation $p_i\to p_i+f(x)$ does not change $h_s$ for $s>0$. In particular, the first two terms in (5.69) can be rewritten as
\[
-\frac1r\Big(\sum_ip_i(1+q_{ix})\Big)^2+\sum_ip_i^2(1+q_{ix})=\frac{1}{2r}\sum_{i,j}(p_i-p_j)^2(1+q_{ix})(1+q_{jx}). \tag{5.72}
\]
The symplectic form (5.47) restricted to the subspace
\[
\sum_iq_i=0,\qquad \sum_ip_i=0, \tag{5.73}
\]
is non-degenerate. The Hamiltonians $H_s=\int_0^Th_s(x)\,dx$ restricted to this space generate a hierarchy of commuting flows, which we regard as the field analog of the elliptic CM system. For $r=2$ the Hamiltonian $2H_1$ has the form (1.10), where $q=q_1=-q_2$, $p=p_1=-p_2$.

6. The Algebro-Geometric Solutions

So far, our consideration of the Bloch solutions (5.27) has been purely local and formal. For generic $L\in\mathcal A^D_0$ the series (5.14), (5.16) for the formal solutions $\Psi(x,q)$ and the quasimomentum have zero radius of convergence. The main goal of this section is to construct algebro-geometric solutions of the zero-curvature equations, for which these series do converge and, moreover, have meromorphic continuations to a compact Riemann surface.

Let $\mathcal T(q)$ be the restriction of the monodromy operator $f(x)\to f(x+T)$ to the space of solutions of the equation $(\partial_x-L(x,q))f=0$, where $f$ is a vector function. Then we define the Riemann surface $\hat\Gamma$ of the Bloch solutions by the characteristic equation
\[
R(\mu,q)\equiv\det\big(\mu-\mathcal T(q)\big)=\mu^r+\sum_{j=1}^{r}R_j(q)\,\mu^{r-j}=0. \tag{6.1}
\]
Lemma 6.1. The coefficients $R_j(q)$ of the characteristic equation (6.1) are holomorphic functions on $\Gamma$ except at the points $P_i$ of the divisor $D$.
Proof. In the basis defined by the columns of the fundamental matrix of solutions $F(x,q;x_0)$ to the equation $(\partial_x-L)F=0$, $F(x_0,q;x_0)=1$, the operator $\mathcal T(q)$ can be identified with the matrix
\[
\mathcal T(q)=F(x_0+T,q;x_0). \tag{6.2}
\]
A priori this matrix is holomorphic on $\Gamma$ except at the points of the divisor $D$ and at the points of the loops $\gamma_s(x)$, where $L$ has singularities. From Lemma 5.2 it follows that in the neighborhood of the loop we have
\[
F=\Phi_s(x,q)\,\tilde F_s(x,q)\,\Phi_s^{-1}(x_0,q),\qquad \tilde F_s(x_0,q)=1, \tag{6.3}
\]
where $\tilde F_s$ is a holomorphic matrix function and $\Phi_s$ is defined by (2.14), (2.18), (2.19). The function $\Phi_s$ is periodic, because $\gamma_s$, $\alpha_s$ are periodic. In the neighborhood of the loop $\gamma_s$, the functions $R_j(q)$ coincide with the coefficients of the characteristic equation for $\tilde F_s$. Therefore, they are holomorphic in that neighborhood. The lemma is then proved.

It is standard in the conventional spectral theory of periodic linear operators that for a generic operator the Riemann surface of the Bloch functions is smooth and has infinite genus. For algebro-geometric or finite-gap operators the corresponding Riemann surface is singular, and is birationally equivalent to a smooth algebraic curve.

It is instructive to consider first, as an example of such operators, the case when $L$ does not depend on $x$, i.e. $L\in\mathcal L^D$. In this case the equation $(\partial_x-L)\psi=0$ can be easily solved. The Bloch solutions have the form
\[
\psi=\psi_0\,e^{kx}, \tag{6.4}
\]
where $\psi_0$ is an eigenvector of $L$ and $k$ is the corresponding eigenvalue. These solutions are parameterized by points $Q$ of the spectral curve $\hat\Gamma_0$ of $L$. The image of $\hat\Gamma_0$ under the map into $\mathbb C^1\times\Gamma$ defined by the formula
\[
(k,q)\in\hat\Gamma_0\longmapsto(\mu=e^{kT},q)\in\mathbb C^1\times\Gamma \tag{6.5}
\]
is the Riemann surface $\hat\Gamma$ defined by (6.1), where the coefficients are symmetric polynomials of $e^{k_i(q)T}$. For example, if $\hat\Gamma_0$ is defined by the equation
\[
k^2+u(q)=0, \tag{6.6}
\]
where $u(q)$ is a meromorphic function with double poles at the points of $D$, then $\hat\Gamma$ is defined by the equation
\[
\mu^2+2R_1\mu+1=0,\qquad R_1(q)=\cosh\big(\sqrt{u(q)}\big). \tag{6.7}
\]
The Riemann surface defined by (6.7) is singular. The projections onto $\Gamma$ of the points of self-intersection of $\hat\Gamma$ are roots of the equation
\[
u(q)=\Big(\frac{\pi N}{2T}\Big)^2, \tag{6.8}
\]
where $N$ is an integer. The coefficient $u(q)$ has poles of the second order at $D$, $u=a_i^2w^{-2}+O(w^{-1})$, where $w$ is a local coordinate at $P_i\in D$. Therefore, as $|N|\to\infty$,
the roots of (6.7) tend to the points of $D$. The coordinates of the singular points $q_{i,N}$ that tend to $P_i$ equal
\[
w(q_{i,N})=2Ta_i(\pi N)^{-1}+O(N^{-2}). \tag{6.9}
\]
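The asymptotics (6.9) follow from (6.8) and the local form of $u$ by a one-line inversion. A sketch, assuming $a_i\neq 0$ and absorbing the choice of sign of the square root into $N\to-N$:

```latex
% Near P_i:  u = a_i^2 w^{-2}\,(1 + O(w)).  Insert into (6.8):
a_i^2\,w^{-2}\,\big(1+O(w)\big) = \Big(\frac{\pi N}{2T}\Big)^2
\;\Longrightarrow\;
w = \frac{2Ta_i}{\pi N}\,\big(1+O(w)\big)^{1/2} .
% Since w = O(N^{-1}), the correction factor is 1 + O(N^{-1}):
w(q_{i,N}) = \frac{2Ta_i}{\pi N} + O(N^{-2}) .
```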
As usual in perturbation theory, for generic $L$ each double eigenvalue $q_{i,N}$ splits into two smooth branch points $q^{\pm}_{i,N}$. By analogy with the conventional theory we expect that, if $L$ is an analytic function of $x$, then the differences $|w(q^{+}_{i,N})-w(q^{-}_{i,N})|<O(N^{-k})$ decay faster than any power of $N^{-1}$.

Localization of the branch points is a key element in the construction [19] of a theory of theta-functions for infinite-genus hyperelliptic curves of the Bloch solutions for periodic Sturm–Liouville operators. In [20] a general approach to the construction of Riemann surfaces of the Bloch functions was proposed. The model of the spectral curves developed in [20] was chosen in [21] as a starting point of the theory of general (non-hyperelliptic) infinite-genus Riemann surfaces. It was shown that for such surfaces many classical theorems of algebraic geometry hold.

Algebro-geometric or finite-gap operators can be seen as operators for which there are only a finite number of multiple eigenvalues that split into smooth branch points.

Let $\hat\Gamma$ be a smooth genus $\hat g$ algebraic curve that is an $r$-sheeted branched cover of $\Gamma$. Note that, unlike the stationary case, for a given rank $r$ there is no relation between $\hat g$ and the genus $g$ of $\Gamma$. As $\hat g$ increases, the dimension of the space of $r$-sheeted covers increases. It equals $2(\hat g-rg+r-1)$. Assume that the preimages $P_i^l$, $P_0^l$ on $\hat\Gamma$ of the points of the divisor $D$ and of a point $P_0$ on $\Gamma$ are not branch points. The definition of the Baker–Akhiezer function corresponding to these data and to a non-special degree $\hat g+r-1$ divisor $\hat\gamma$ on $\hat\Gamma$ is as follows:

$1^0$. $\psi$ is a meromorphic vector function on $\hat\Gamma$ except at the points $P_i^l$. Its divisor of poles on $\hat\Gamma$ outside of $P_i^l$ is not greater than $\hat\gamma$.

$2^0$. In the neighborhood of $P_i^l$ the vector function $\psi$ has the form
\[
\psi=\xi_{i,l}(q,t)\,\exp\Big(\sum_mt_{(i,m;l)}\,w^{-m}\Big), \tag{6.10}
\]
where $\xi_{i,l}(q,t)$ is a holomorphic vector function.

$3^0$. The evaluations of $\psi$ at the punctures $P_0^l$ are vectors with coordinates $(\psi(P_0^l))^{(i)}=\delta^{il}$.

Theorem 6.1. Let $\psi(q,t)$ be the Baker–Akhiezer vector function associated with a non-special divisor $\hat\gamma$ on $\hat\Gamma$. Then there exist unique matrix functions $M_{(i,m;l)}(q,t)\in\mathcal N^D_{\gamma(t),\alpha(t)}$ such that the equations
\[
\big(\partial_{(i,m;l)}-M_{(i,m;l)}\big)\psi(q,t)=0 \tag{6.11}
\]
hold.

Now, let $v^{(i)}_l(x)$ be a set of periodic functions, $\int_0^Tv^{(i)}_l\,dx=0$, and let $u^{(i)}_l$ be a set of constants. Then the change of the independent variables
\[
t_{(i,1;l)}=x\,u^{(i)}_l+v^{(i)}_l(x)+t'_{(i,1;l)} \tag{6.12}
\]
defines the Baker–Akhiezer function $\psi$ as a function of $(q,t)$ and the variable $x$. From (6.11) it follows that
\[
(\partial_x-L)\psi=0,\qquad L=\sum_{i,l}\big(u^{(i)}_l+\partial_xv^{(i)}_l\big)M_{(i,1;l)}. \tag{6.13}
\]
As follows from Lemma 5.2, the vector $D(M_a)$, $a=(i,m;l)$, corresponding to $M_a$ under (2.6), is tangent to $(\gamma_s(t_a),\alpha_s(t_a))$. Therefore, $D(L)$ is tangent to $(\gamma_s(x),\alpha_s(x))$.

In general, $L$ constructed above is not a periodic function of $x$. It is periodic if we impose additional constraints on the set of data, namely on the curve $\hat\Gamma$ and the constants $u^{(i)}_l$. We call the set $\{\hat\Gamma,u^{(i)}_l\}$ admissible if there exists a meromorphic differential $dp$ on $\hat\Gamma$ which has second-order poles at $P_i^l$,
\[
dp=-u^{(i)}_l\,dw\,\big(w^{-2}+O(1)\big), \tag{6.14}
\]
and such that all periods of $dp$ are multiples of $2\pi i/T$,
\[
\oint_cdp=\frac{2\pi i\,m_c}{T},\qquad m_c\in\mathbb Z,\quad c\in H_1(\hat\Gamma,\mathbb Z). \tag{6.15}
\]
Lemma 6.2. The Baker–Akhiezer function $\psi$ associated with an admissible set of data $\{\hat\Gamma,u^{(i)}_l\}$ satisfies the equation
\[
\psi(x+T,q)=g\,\psi(x,q)\,\mu(q),\qquad \mu=e^{p(q)T}, \tag{6.16}
\]
where $g$ is the diagonal matrix $g=\operatorname{diag}\big(\mu(P_0^1),\dots,\mu(P_0^r)\big)$.

From (6.15) it follows that the function $\mu$ defined by the multi-valued abelian integral $p$ is single-valued. Equation (6.16) follows from the uniqueness of the Baker–Akhiezer function, because the left- and right-hand sides have the same analytic properties. The matrix function $L$ constructed with the help of $\psi$ satisfies the monodromy property
\[
L(x+T,q)=g\,L(x,q)\,g^{-1}. \tag{6.17}
\]
Let $S=S(T,p)$ be the space of curves with a meromorphic differential $dp$ satisfying (6.15). We would like to mention that the closure of $S$, as $T\to\infty$, coincides with the space of all genus $\hat g$ branched covers of $\Gamma$.

Corollary 6.1. A set of data $\hat\Gamma\in S$, $[\hat\gamma]\in J(\hat\Gamma)$, and a set of periodic functions $v^{(i)}_l(x)$ define, with the help of the corresponding Baker–Akhiezer function, a solution of the hierarchy (5.21) on $\mathcal B^D/\widehat{GL}_r$.

The finite-gap or algebro-geometric solutions are singled out by the constraint that there is a Lax matrix $L_1\in\mathcal L^{nD}$ such that
\[
[\partial_x-L,L_1]=0. \tag{6.18}
\]
Indeed, let $k$ be a function on $\hat\Gamma$ with divisor of poles $n\hat D$, where $\hat D$ is the preimage of $D$. If $n$ is big enough, such a function exists. Let $\psi$ be the Baker–Akhiezer function on $\hat\Gamma$; then, as was shown above, there is a unique Lax matrix $L_1$ such that
\[
L_1(t,q)\,\psi(t,q)=k(q)\,\psi(t,q). \tag{6.19}
\]
Equation (6.19) implies that the spectral curve of $L_1$ is birationally equivalent to the Riemann surface of Bloch solutions for $L$.
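The passage from (6.19) to the constraint (6.18) can be sketched in two lines. The computation assumes only $(\partial_x-L)\psi=0$ from (6.13) and that $k(q)$ does not depend on $x$; the final step relies on the values of $\psi$ on the $r$ preimages of a generic point $q$ being linearly independent, which is stated here as an assumption of the sketch.

```latex
% Differentiate L_1\psi = k\psi in x, using \partial_x\psi = L\psi:
(\partial_xL_1)\psi + L_1L\,\psi = k\,L\psi = L\,(k\psi) = L\,L_1\psi .
% Rearranged:
\big(\partial_xL_1 - [L,L_1]\big)\psi = 0 ,
\qquad\text{where}\quad [\partial_x - L,\,L_1] = \partial_xL_1 - [L,L_1] .
% The matrix [\partial_x - L, L_1] annihilates \psi(x,Q) at every preimage Q of q;
% if these r vectors are independent for generic q, the matrix vanishes: (6.18).
```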
Theorem 6.2. The form $\omega$ defined by (4.2) and (5.26), restricted to the space of algebro-geometric solutions corresponding to a set of functions $v^{(i)}_l(x)$, equals
\[
\omega=\sum_{s=1}^{\hat g+r-1}\delta p(\hat\gamma_s)\wedge\delta z(\hat\gamma_s). \tag{6.20}
\]
The meaning of the right-hand side of this formula is analogous to that of formula (4.51). It shows that the form ω restricted to the space of algebro-geometric solutions is non-degenerate. It is well known that the finite-gap solutions of the KdV hierarchy are dense in the space of all periodic solutions ([22]). As shown in [20], the finite-gap solutions are dense for the KP-2 equation as well. It seems quite natural to expect that a similar result is valid for the zero-curvature equations on an arbitrary algebraic curve. In the conjectured scenario the infinite-dimensional space B D can be identified with a direct limit of finite-dimensional spaces LnD , as n → ∞. We are going to address that problem in the near future.

References

1. Krichever, I.M., Novikov, S.P.: Holomorphic bundles over algebraic curves and non-linear equations. Uspekhi Mat. Nauk 35, n 6 (1980)
2. Hitchin, N.: Stable bundles and integrable systems. Duke Math. Journ. 54, (1), 91–114 (1987)
3. Krichever, I.M., Novikov, S.P.: Holomorphic bundles over Riemann surfaces and Kadomtsev–Petviashvili equation. Funk. anal. i pril. 12, n 4, 41–52 (1978)
4. Zakharov, V., Shabat, A.: Integration of non-linear equations of mathematical physics by the method of the inverse scattering problem. II. Funct. Anal. and Appl. 13, 13–22 (1979)
5. Ben-Zvi, D., Frenkel, E.: Spectral Curves, Opers and Integrable Systems. math.AG/9902068
6. Tyurin, A.: Classification of vector bundles over an algebraic curve of arbitrary genus. Amer. Math. Soc. Translat. II, Ser 63, 245–279 (1967)
7. Krichever, I.M.: The commutative rings of ordinary differential operators. Funk. anal. i pril. 12, n 3, 20–31 (1978)
8. Enriquez, E., Rubtsov, V.: Hecke-Tyurin parametrization of the Hitchin and KZB systems. math.AG/9911087
9. Krichever, I., Phong, D.H.: On the integrable geometry of N = 2 supersymmetric gauge theories and soliton equations. J. Differential Geometry 45, 445–485 (1997); hep-th/9604199
10. Krichever, I., Phong, D.H.: Symplectic forms in the theory of solitons. In: Surveys in Differential Geometry IV, edited by C.L. Terng and K. Uhlenbeck, Cambridge: International Press, 1998, pp. 239–313; hep-th/9708170
11. Krichever, I.M.: Elliptic solutions to difference non-linear equations and nested Bethe ansatz. solv-int/9804016
12. Gorsky, A., Nekrasov, N.: Elliptic Calogero-Moser system from two-dimensional current algebra. Preprint ITEP-NG/1-94. hep-th/9401021
13. Krichever, I.M.: The algebraic-geometrical construction of Zakharov–Shabat equations and their periodic solutions. Doklady Acad. Nauk USSR 227, n 2, 291–294 (1976)
14. Krichever, I.: The integration of non-linear equations with the help of algebraic-geometrical methods. Funk. Anal. i Pril. 11, n 1, 15–31 (1977)
15. Krichever, I.M.: Elliptic solutions of Kadomtsev–Petviashvili equation and integrable systems of particles. Funct. Anal. App. 14, n 4, 282–290 (1980)
16. Krichever, I., Phong, D.H.: Spin Chain Models with Spectral Curves from M Theory. Commun. Math. Phys. 213, 539–574 (2000)
17. Novikov, S.P., Veselov, A.P.: On Poisson brackets compatible with algebraic geometry and Korteweg–de Vries dynamics on the space of finite-zone potentials. Soviet Math. Doklady 26, 357–362 (1982)
18. Nekrasov, N.: Holomorphic Bundles and Many-Body systems. Commun. Math. Phys. 180, 587–604 (1996)
19. McKean, H., Trubowitz, E.: Hill’s operator and hyperelliptic function theory in the presence of infinitely many branch points. Comm. Pure Appl. Math. 29, 143–226 (1976)
20. Krichever, I.: Spectral theory of two-dimensional periodic operators and its applications. Uspekhi Mat. Nauk 44, n 2, 121–184 (1989)
21. Felder, J., Knörrer, H., Trubowitz, E.: Riemann surfaces of the infinite genus. Preprint ETH Zurich
22. Marchenko, V., Ostrovski, I.: In the book: V. Marchenko, Sturm-Liouville operator and applications. Naukova Dumka, 1977 (in Russian)

Communicated by L. Takhtajan
Commun. Math. Phys. 229, 271–292 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0690-7
Communications in
Mathematical Physics
Coinvariants for Lattice VOAs and q-Supernomial Coefficients

B.L. Feigin (1), S.A. Loktev (2), I.Yu. Tipunin (3)

(1) Landau Institute for Theoretical Physics, Russian Academy of Sciences, 142432 Russia, Moscow Region, Chernogolovka
(2) Institute for Theoretical and Experimental Physics, B. Cheremushkinskaja, 25, 117259 Moscow, Russia and Independent University of Moscow, Bolshoi Vlasevksy Pereulok, 11, 121002, Moscow, Russia
(3) Tamm Theory Division, Lebedev Physics Institute, Russian Academy of Sciences, Leninsky pr., 53, 117924 Russia

Received: 13 August 2001 / Accepted: 23 January 2002
Published online: 6 August 2002 – © Springer-Verlag 2002
Abstract: We propose an alternative definition of the q-supernomial coefficients as the characters of coinvariants for the one-dimensional lattice vertex operator algebras. This provides a new formula for the q-supernomial coefficients. We also prove that the spaces of the coinvariants form a bundle over the configuration space of complex points on Riemann surfaces (the configuration space includes the diagonals).

Contents

1. Introduction
2. Lattice VOAs
3. Coinvariants and Verlinde Rules
4. Degenerations of $\mathfrak{A}_p^{\mathrm{ass}}$ and the Main Inequality
5. Combinatorics of the Characters
6. Dimension and Character Formulas for the Coinvariants
References
1. Introduction

We study the spaces of conformal blocks for the one-dimensional lattice vertex operator algebras. The spaces of conformal blocks, or the modular functor (see [S]), play an important role in conformal field theory. These spaces, which parameterize the states of the corresponding field theory, can be defined for a wide class of vertex operator algebras. We now describe our approach to the affine algebras; this approach corresponds to the WZNW theory and allows avoiding complicated notions. We first recall the definition of the space of conformal blocks. Let $\mathfrak{g}$ be a simple Lie algebra, and let $\hat{\mathfrak{g}}$ denote the central extension of the Lie algebra of the $\mathfrak{g}$-valued Laurent polynomials. We fix a level $k$, that is, we identify the central element of $\hat{\mathfrak{g}}$ with
a positive integer $k$. As the geometric part of the data, we consider a complex curve $C$ and a set of pairwise distinct points $p_1,\dots,p_n \in C$; we fix a local coordinate in a small vicinity of each point. We define the Lie algebra $\mathfrak{g}^{C\setminus\{p_1,\dots,p_n\}}$ of the $\mathfrak{g}$-valued meromorphic functions on $C$ with poles only at $p_i$. We obviously have the inclusion
$$\mathfrak{g}^{C\setminus\{p_1,\dots,p_n\}} \to \hat{\mathfrak{g}} \oplus \dots \oplus \hat{\mathfrak{g}}$$
mapping any $\mathfrak{g}$-valued function to its Laurent expansions at the points $p_1,\dots,p_n$. Another part of the data is a set of the integrable representations $L_1,\dots,L_n$ of $\hat{\mathfrak{g}}$ at level $k$. Their external tensor product is a representation of $\hat{\mathfrak{g}} \oplus \dots \oplus \hat{\mathfrak{g}}$ and therefore of $\mathfrak{g}^{C\setminus\{p_1,\dots,p_n\}}$. The space of conformal blocks is then defined as the space of coinvariants
$$\langle L_1,\dots,L_n\rangle^{C}_{p_1,\dots,p_n} = (L_1 \boxtimes \dots \boxtimes L_n)\,/\,\mathfrak{g}^{C\setminus\{p_1,\dots,p_n\}},$$
i.e., the quotient of $L_1 \boxtimes \dots \boxtimes L_n$ by the action of $\mathfrak{g}^{C\setminus\{p_1,\dots,p_n\}}$. As is well known, these spaces are finite-dimensional and form a flat family over the moduli space of curves with punctures.

Another construction is proposed in [FFu]. As above, let $C$ be a curve, and let $p_1,\dots,p_n$ and $p$ be pairwise distinct points on $C$. We consider the subalgebra $\mathfrak{g}^{C\setminus p}(p_1,\dots,p_n) \subset \mathfrak{g}^{C\setminus p}$ of the functions with zeroes at $p_1,\dots,p_n$ and the subalgebra $\mathfrak{g}^{C\setminus p}_{\mathfrak{n}}(p_1,\dots,p_n) \subset \mathfrak{g}^{C\setminus p}$ of the functions whose values at $p_1,\dots,p_n$ belong to the nilpotent Borel subalgebra $\mathfrak{n} \subset \mathfrak{g}$. We then have the natural isomorphisms (see [FKLMM1])
$$L/\mathfrak{g}^{C\setminus p}_{\mathfrak{n}}(p_1,\dots,p_n) \cong \bigoplus_{L_1,\dots,L_n} \langle L_1,\dots,L_n,L\rangle^{C}_{p_1,\dots,p_n,p}, \tag{1.1}$$
$$L/\mathfrak{g}^{C\setminus p}(p_1,\dots,p_n) \cong \bigoplus_{L_1,\dots,L_n} \pi(L_1)\otimes\dots\otimes\pi(L_n)\otimes\langle L_1,\dots,L_n,L\rangle^{C}_{p_1,\dots,p_n,p} \tag{1.2}$$
for any integrable representation $L$ at the level $k$, where the summation ranges over all sets of integrable representations at the level $k$ and $\pi(L_i)$ denotes the finite-dimensional representation of $\mathfrak{g}$ generated by the highest vector in $L_i$. The space of the coinvariants of a single representation $L$ therefore contains the spaces of the conformal blocks involving $L$. The advantage of this construction is that these spaces of coinvariants are well defined if some of the points $p_1,\dots,p_n$ coincide. In this case, we consider zeroes with multiplicities. We have no isomorphisms (1.1), (1.2) for multiple points, but we can easily show that these spaces form a coherent sheaf over the variety $S^n C$ of sets of points on $C$.

Two questions naturally arise here. The first question is whether the dimension of the space of coinvariants is preserved if some points coincide. This question is actually local, and verifying this in the case where $C$ is the Riemann sphere suffices. The answer is positive for $\mathfrak{g} = sl_2$ (see [FKLMM1] for $L/\mathfrak{g}^{C\setminus p}(p_1,\dots,p_n)$ and [FKLMM2] for $L/\mathfrak{g}^{C\setminus p}_{\mathfrak{n}}(p_1,\dots,p_n)$) and negative for $\mathfrak{g} = E_8$. The answer to the same question for the $(2,m)$-minimal models of the Virasoro algebras (see [FFr]) is also positive. But the general case is far from being studied. Here we prove that dimensions are preserved in the case of the one-dimensional lattice VOAs.

The second question concerns the natural filtration on these spaces. Namely, the affine Lie algebra $\hat{\mathfrak{g}}$ and its integrable representations are graded by the action of the energy element. But the subalgebras defined above are not homogeneous (except for the case
of the Riemann sphere and $p_1 = \dots = p_n$); hence, the spaces of coinvariants can only be treated as filtered spaces. Nevertheless, we consider the Hilbert polynomials $H(q)$ of the adjoint graded spaces and interpret them as a $q$-analogue of the corresponding Verlinde numbers (that is, of the dimensions of the spaces of conformal blocks). This object is interesting even in the case of the Riemann sphere (and is well studied only in this case). There is a conjecture in [FL] that $H(q)$ is expressed in terms of the generalized $q$-supernomial coefficients. For the $(2,m)$-minimal models of the Virasoro algebras, the polynomials $H(q)$ enter the Andrews–Gordon identities and can also be decomposed into sums of the $q$-multinomial coefficients. In this paper, we prove a formula that represents $H(q)$ for the one-dimensional lattice VOAs as a sum of $q$-supernomial coefficients (see Theorem 6.3, (ii)). We note that this formula can be considered another definition of the $q$-supernomials.

These two questions are connected. Namely, we suppose that the dimension is preserved when the points coincide. Then in the case of the Riemann sphere, it suffices to find $H(q)$ for the graded subalgebra with $p_1 = \dots = p_n$. This makes it possible to degenerate the algebra together with the subalgebra into an abelian one or a rather simple VOA and obtain a fermionic-type formula (in terms of [KKMM]) for $H(q)$. Such a formula exists for $\mathfrak{g} = sl_2$ (see [FKLMM1, FKLMM2, FKLMM3]), for $(2,m)$-minimal models of the Virasoro algebra (see [FFr]), and for one-dimensional lattice VOAs (see Theorem 6.3, (i)).

In more general cases ($W$-algebras, lattice VOAs, etc.), it is not evident which subalgebras should be chosen instead of $\mathfrak{g}^{C\setminus p}(p_1,\dots,p_n)$ and $\mathfrak{g}^{C\setminus p}_{\mathfrak{n}}(p_1,\dots,p_n)$. We explain what we do in the case of affine Lie algebras. We introduce some “elementary” graded subalgebras $\mathfrak{g}_i \subset \hat{\mathfrak{g}}$. As the subalgebras are graded, they consist of $\mathfrak{g}$-valued functions with some local conditions at the zero point. We then define the “fused” subalgebra $\mathfrak{g}^{C\setminus p}_{i_1,\dots,i_n}(p_1,\dots,p_n)$ of functions with a pole only at $p$ and local conditions at $p_i$ prescribed by the subalgebras $\mathfrak{g}_i$. We then have
$$L/\mathfrak{g}^{C\setminus p}_{i_1,\dots,i_n}(p_1,\dots,p_n) \cong \bigoplus_{L_1,\dots,L_n} L_1/\mathfrak{g}_{i_1}\otimes\dots\otimes L_n/\mathfrak{g}_{i_n}\otimes\langle L_1,\dots,L_n,L\rangle^{C}_{p_1,\dots,p_n,p}. \tag{1.3}$$
In particular, the subalgebra $\mathfrak{g}^{C\setminus p}_{\mathfrak{n}}(p_1,\dots,p_n)$ is fused from $n$ copies of the affine nilpotent Borel subalgebra $\mathfrak{g}_{\mathfrak{n}} \subset \hat{\mathfrak{g}}$, the subalgebra $\mathfrak{g}^{C\setminus p}(p_1,\dots,p_n)$ is fused from $n$ copies of the affine nilpotent parabolic subalgebra $\mathfrak{g}_0 \subset \hat{\mathfrak{g}}$, and (1.1) and (1.2) therefore follow from (1.3).

This paper is organized as follows. In Sect. 2.1, we recall some facts about lattice vertex operator algebras (“VOA”) and their representations. In Sect. 2.2, we introduce an associative algebra that acts in each representation of the VOA. Representations of the VOA can thus be obtained as induced representations of the associative algebra. In Sect. 3.1, we introduce spaces of coinvariants and give basic statements about their dimensions. In Sect. 3.2, elementary subalgebras are introduced, and the fused subalgebras are studied. Sections 4 and 5 are rather technical. In Sect. 4, the character of an auxiliary space of coinvariants is calculated. In Sect. 5, we show that characters of these auxiliary spaces can be expressed in terms of $q$-supernomial coefficients. This allows proving the main results of the paper in Sect. 6.
We use the standard notation for the $q$-binomial coefficients
$$\binom{n}{m}_q = \begin{cases}\dfrac{(q)_n}{(q)_{n-m}(q)_m}, & n \ge m \ge 0,\\[4pt] 0, & \text{otherwise},\end{cases}\qquad\text{where } (q)_n = (1-q)\cdots(1-q^n). \tag{1.4}$$

2. Lattice VOAs

2.1. We recall that a lattice vertex operator algebra (VOA) is determined by $\mathbb{R}^n$, $\Gamma$, and $\langle\cdot,\cdot\rangle$, where $\Gamma$ is a lattice in $\mathbb{R}^n$ and $\langle\cdot,\cdot\rangle$ is a scalar product in $\mathbb{R}^n$ taking integer values on vectors from $\Gamma$. Such an algebra is generated by vertex operators $V(x,z)$, $x \in \Gamma$. In this paper, we concentrate on one-dimensional lattice VOAs with a positive definite scalar product. Let $\alpha$ be the basis vector of $\Gamma$ and $\langle\alpha,\alpha\rangle = p$, $p \in \mathbb{N}$. Let $\mathfrak{A}_p$ for $p = 1, 2, 3, 4, \dots$ denote this VOA. The algebra $\mathfrak{A}_p$ is generated by $G^+(z) = V(\alpha,z)$ and $G^-(z) = V(-\alpha,z)$. The generators $G^\pm$ are bosonic ($G^\pm(z)G^\pm(w) - G^\pm(w)G^\pm(z) = 0$) for even $p$ and fermionic ($G^\pm(z)G^\pm(w) + G^\pm(w)G^\pm(z) = 0$) otherwise. They satisfy the relations
$$G^+(z)^2 = 0,\quad \partial G^+(z)^2 = 0,\quad \dots,\quad \partial^{p-2} G^+(z)^2 = 0, \tag{2.1}$$
$$G^-(z)^2 = 0,\quad \partial G^-(z)^2 = 0,\quad \dots,\quad \partial^{p-2} G^-(z)^2 = 0 \tag{2.2}$$
for even $p$ and the relations
$$\partial G^+(z)G^+(z) = 0,\quad \partial^3 G^+(z)G^+(z) = 0,\quad \dots,\quad \partial^{p-2} G^+(z)G^+(z) = 0, \tag{2.3}$$
$$\partial G^-(z)G^-(z) = 0,\quad \partial^3 G^-(z)G^-(z) = 0,\quad \dots,\quad \partial^{p-2} G^-(z)G^-(z) = 0 \tag{2.4}$$
for odd $p$. We introduce the currents $H^m$ for $0 \le m \le p-2$ as the currents that appear in the OPE
$$G^+(z)G^-(w) = \frac{1/p}{(z-w)^p} + \frac{H^0}{(z-w)^{p-1}} + \frac{H^1}{(z-w)^{p-2}} + \dots + \frac{H^{p-2}}{z-w} + \text{regular terms}. \tag{2.5}$$
The currents $H^0$ and $H^1$ respectively correspond to the $u(1)$-current and the energy–momentum tensor. We choose the Virasoro $T$ such that $G^+$ and $G^-$ are primary fields with the respective conformal weights $p-1$ and $1$,
$$T = H^1 - \frac{p-1}{2}\,\partial H^0. \tag{2.6}$$
The mode decomposition corresponding to the choice (2.6) is
$$G^+(z) = \sum_{n\in\mathbb{Z}} G^+_n z^{-n-p+1},\qquad G^-(z) = \sum_{n\in\mathbb{Z}} G^-_n z^{-n-1}, \tag{2.7}$$
$$H^m(z) = \sum_{n\in\mathbb{Z}} H^m_n z^{-n-m-1},\qquad T(z) = \sum_{n\in\mathbb{Z}} L_n z^{-n-2}. \tag{2.8}$$
This gives the commutation relations
$$[L_n, G^+_i] = ((p-2)n - i)\,G^+_{i+n},\qquad [L_n, G^-_i] = -i\,G^-_{i+n},\qquad [L_0, H^m_i] = -i\,H^m_i. \tag{2.9}$$
The algebra $\mathfrak{A}_p$ admits the following family of automorphisms (spectral flow):
$$U_\theta:\quad G^+_n \to G^+_{n+\theta},\qquad G^-_n \to G^-_{n-\theta},$$
$$H^0_n \to H^0_n + \theta\,\tfrac{1}{p}\,\delta_{n,0},\qquad H^1_n \to H^1_n + \theta H^0_n + \tfrac{1}{2}\theta(\theta-1)\,\tfrac{1}{p}\,\delta_{n,0},$$
$$H^m_n \to H^m_n + \theta H^{m-1}_n + \tfrac{1}{2}\theta(\theta-1)H^{m-2}_n + \tfrac{1}{6}\theta(\theta-1)(\theta-2)H^{m-3}_n + \dots + \tfrac{1}{(m+1)!}\,\theta(\theta-1)(\theta-2)\cdots(\theta-m)\,\tfrac{1}{p}\,\delta_{n,0}, \tag{2.10}$$
where $\theta \in \mathbb{Z}$. (The factorial in the last term is $(m+1)!$, in agreement with the cases $m = 0$ and $m = 1$ written out above.) The automorphisms with $\theta \in p\mathbb{Z}$ play the role of the affine part of the Weyl group.

We note that the algebra $\mathfrak{A}_p$ for $p = 1$ can be considered the infinite-dimensional Clifford algebra; for $p = 2$, the algebra $\widehat{sl}_2$ at the level $k = 1$; and for $p = 3$, the $N = 2$ superconformal algebra with the central charge $c = 1/3$.

We recall some facts about representations of lattice VOAs. The category of representations of any lattice VOA is semisimple, i.e., any representation decomposes into the direct sum of irreducible ones. The irreducible representations are enumerated by the quotient $\Gamma^\vee/\Gamma$, where $\Gamma^\vee$ is the dual lattice. Namely, the representation corresponding to the element $y \in \Gamma^\vee$ is the space of descendants of $V(y,0)$, and if two elements of $\Gamma^\vee$ differ by an element of $\Gamma$, then the corresponding representations are isomorphic. It is also known that these representations form a minimal model with the Verlinde algebra isomorphic to the group ring of $\Gamma^\vee/\Gamma$.

We describe it in detail for the one-dimensional case. Let $I_r$ with $0 \le r \le p-1$ denote the representation corresponding to $V(-\frac{r}{p}\alpha, 0)$. This representation contains the highest-weight vector $|p;r\rangle$ satisfying the conditions
$$G^+_i|p;r\rangle = G^-_j|p;r\rangle = 0,\quad i \ge -p+r+2,\ j \ge -r,\qquad H^0_0|p;r\rangle = -\frac{r}{p}\,|p;r\rangle,\qquad L_0|p;r\rangle = \frac{r}{2p}(r-p+2)\,|p;r\rangle. \tag{2.11}$$
Another description of $I_r$ is
$$I_r = \bigoplus_{m\in\mathbb{Z}} I_r(m), \tag{2.12}$$
where $I_r(m)$ is the Fock module over $H^0$ with the highest-weight vector $|p;r+pm\rangle$. In this realization, $G^\pm : I_r(m) \to I_r(m\pm 1)$ act as vertex operators. We call the representation $I_0$ a vacuum representation. The representations $I_r$ with $0 \le r \le p-1$ compose a minimal model, and the Verlinde algebra (the fusion algebra) is
$$I_r \cdot I_s = I_{r+s \bmod p}. \tag{2.13}$$
The vacuum representation $I_0$ is the unit in the Verlinde algebra. We also note that the module $I_r$ is isomorphic to the contragredient to $I_{p-r}$. This equips the Verlinde algebra with the scalar product. The automorphisms $U_\theta$ act naturally on the set of representations $I_r$ as
$$U_\theta I_r = I_{r+\theta \bmod p}. \tag{2.14}$$
In particular, the action of $U_1$ on the set of representations $I_r$ coincides with the fusion with $I_1$.
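A quick consistency check of the spectral flow on the currents may be helpful here. The sketch below assumes (this reading is ours, not a statement from the text) that the coefficients in (2.10) are the generalized binomial coefficients $\binom{\theta}{k}$, so that $U_\theta$ acts on the ordered list (scalar term, $H^0$, $H^1$, ...) by a lower-triangular matrix. The helper names `gbinom`, `flow_matrix` and `matmul` are hypothetical. Under that assumption, $U_\theta U_{\theta'} = U_{\theta+\theta'}$ reduces to the Vandermonde identity, which the code verifies on small examples.

```python
def gbinom(theta, k):
    """Generalized binomial coefficient theta(theta-1)...(theta-k+1)/k! for any integer theta."""
    num = 1
    for t in range(k):
        num *= theta - t
    den = 1
    for t in range(1, k + 1):
        den *= t
    return num // den  # exact: a product of k consecutive integers is divisible by k!


def flow_matrix(theta, size):
    """Assumed matrix of U_theta on the slots (scalar, H^0, H^1, ..., H^{size-2}).

    Row a lists the coefficients of the image of slot a; slot 0 is the scalar term.
    """
    return [[gbinom(theta, a - b) if b <= a else 0 for b in range(size)]
            for a in range(size)]


def matmul(x, y):
    """Plain integer matrix product."""
    return [[sum(x[a][c] * y[c][b] for c in range(len(y))) for b in range(len(y[0]))]
            for a in range(len(x))]


if __name__ == "__main__":
    # Composition of two flows agrees with the flow by the sum of the parameters.
    print(matmul(flow_matrix(2, 6), flow_matrix(-3, 6)) == flow_matrix(-1, 6))
```

The same matrices are Toeplitz, so the order of composition does not matter in this model.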
Each space $I_r$ is bigraded by the operators $H^0_0$ and $L_0$. By definition, the characters of the representations $I_r$ corresponding to this bigrading are
$$\operatorname{ch} I_r(q,z) = \operatorname{tr}\, z^{H^0_0} q^{L_0}, \tag{2.15}$$
where the trace is calculated over $I_r$, and (2.12) implies
$$\operatorname{ch} I_r(q,z) = \frac{z^{-\frac{r}{p}}\, q^{\frac{r}{2p}(r-p+2)}}{(q)_\infty} \sum_{n\in\mathbb{Z}} z^n q^{\frac{p}{2}(n^2+n) - n(r+1)}. \tag{2.16}$$
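The exponents in (2.16) can be cross-checked against (2.11) and (2.12). The sketch below assumes that the eigenvalue formulas of (2.11) hold for every vector $|p;s\rangle$ with $s = r + pm$ (our assumption, stated only for $0 \le r \le p-1$ in the text), and compares the resulting charge and weight of $|p; r - pn\rangle$ with the $n$-th term of the series. The function names are hypothetical.

```python
from fractions import Fraction


def l0_weight(p, s):
    """Assumed L_0 eigenvalue of |p; s>: s(s - p + 2)/(2p), as in (2.11) with r replaced by s."""
    return Fraction(s * (s - p + 2), 2 * p)


def charge(p, s):
    """Assumed H^0_0 eigenvalue of |p; s>: -s/p."""
    return Fraction(-s, p)


def exponents_from_2_16(p, r, n):
    """(z-exponent, q-exponent) of the n-th term of (2.16), prefactor included."""
    z_exp = Fraction(-r, p) + n
    q_exp = (Fraction(r * (r - p + 2), 2 * p)
             + Fraction(p * (n * n + n), 2)
             - n * (r + 1))
    return z_exp, q_exp


def check(p_max=7, n_max=6):
    """True if every term of (2.16) matches the highest-weight vector |p; r - p n>."""
    for p in range(1, p_max + 1):
        for r in range(p):
            for n in range(-n_max, n_max + 1):
                s = r - p * n
                if exponents_from_2_16(p, r, n) != (charge(p, s), l0_weight(p, s)):
                    return False
    return True


if __name__ == "__main__":
    print(check())
```

The factor $1/(q)_\infty$ is the character of a single Fock module and is not tested here.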
An analogous series can be considered for any graded subspace. They are denoted by the symbol ch in front of the symbol of the subspace in what follows.

2.2. The representations of $\mathfrak{A}_p$ can be considered from a more algebraic standpoint. The operators $G^\pm_n$ act on each representation $I_r$; therefore, the associative algebra generated by $G^\pm_n$ acts on each $I_r$. We let $\mathfrak{A}_p^{\mathrm{ass}}$ denote this algebra and we describe its defining relations. We first note that $\mathfrak{A}_p$ contains a Virasoro algebra; hence, the algebra of vector fields on the circle acts by derivations (commutators with $L_j$) on $\mathfrak{A}_p^{\mathrm{ass}}$. Let $F_\lambda$ denote the representation of the algebra of vector fields in weight-$\lambda$ tensor fields $\mathbb{C}[z,z^{-1}](dz)^\lambda$. Then the space $\langle G^+_i,\ i\in\mathbb{Z}\rangle = W_+$ is isomorphic to $F_{2-p}$, and $\langle G^-_i,\ i\in\mathbb{Z}\rangle = W_-$ is isomorphic to $F_0$. This isomorphism is given by the identification $G^+_i = z_1^{i-p+2}(dz_1)^{2-p}$, $G^-_i = z_2^{i}(dz_2)^0$. The generators $G^\pm_i$ commute whenever $p$ is even, anticommute otherwise, and in both cases satisfy the quadratic relations
$$\sum_{\alpha+\beta=n} p(\alpha,\beta)\,G^\pm_\alpha G^\pm_\beta = 0,\qquad n\in\mathbb{Z}, \tag{2.17}$$
for any polynomial $p(\alpha,\beta)$ of degree not higher than $p-2$. The relations between $G^+_i$ and $G^-_j$ are also quadratic. We introduce the space $W_0$ of all commutators of $G^+_i$ and $G^-_j$ for even $p$ and anticommutators for odd $p$. We describe $W_0$ as a quotient of the tensor product $W_+\otimes W_-$. We identify $W_+\otimes W_-$ with the space of Laurent polynomials $\mathbb{C}[z_1,z_1^{-1},z_2,z_2^{-1}](dz_1)^{2-p}(dz_2)^0$ and then introduce the subspaces
$$W_0^{(j)} = \mathbb{C}[z_1,z_1^{-1},z_2,z_2^{-1}]\,(z_1-z_2)^{p-2-j}(dz_1)^{2-p}(dz_2)^0,\qquad j\in\mathbb{Z}.$$
We note that the quotient $W_0^{(j)}/W_0^{(j-1)}$ is isomorphic to $F_{-j}$ as a module over the algebra of vector fields. The OPE (2.5) implies that the space $W_0^{(-2)}$ acts on any $I_r$ by zero and $W_0^{(-1)}$ acts by a scalar. The scalar can be found as follows. We know that $W_0^{(-1)}/W_0^{(-2)} \cong F_1$, but $F_1$ is the space of 1-forms. Hence, there exists the functional
$$\beta : W_0^{(-1)} \longrightarrow F_1 \xrightarrow{\ \mathrm{res}\ } \mathbb{C}. \tag{2.18}$$
Let $K \subset W_0^{(-1)}$ be the kernel of the mapping $\beta$. Then $K$ is a subspace in $W_+\otimes W_-$, and we have $W_0 = (W_+\otimes W_-)/K$.
Theorem 2.1. The defining relations in $\mathfrak{A}_p^{\mathrm{ass}}$ are (2.17) and the relations between $W_+$ and $W_-$ following from (2.18).

Because this theorem is not used anywhere in the paper, we omit the proof. But this consideration gives us the description of $H^i_s$ in terms of Laurent polynomials as
$$H^i_s = z_1^{-p+2}\, z_2^{s+i}\,(z_1-z_2)^{p-i-2}(dz_1)^{p-2}(dz_2)^0. \tag{2.19}$$
This notation is very convenient for us. In particular, automorphism (2.10) acts on the space of Laurent polynomials as multiplication by $(z_1/z_2)^\theta$.

It can be shown that the category of representations of $\mathfrak{A}_p^{\mathrm{ass}}$ with highest weight is equivalent to the category of representations of $\mathfrak{A}_p$. Therefore, irreducible representations of $\mathfrak{A}_p^{\mathrm{ass}}$ with highest weight are precisely $I_r$.

We define a class of representations for the algebra $\mathfrak{A}_p^{\mathrm{ass}}$. Each of these representations contains a cyclic vector that is annihilated by some subspaces in $W_+\oplus W_0\oplus W_-$, which we now describe. Let $N = (N_+, N_-; N_0, N_1, \dots, N_{p-2})$ be a $(p+1)$-dimensional vector with integer components that satisfy
$$0 \le N_0 \le N_1 \le \dots \le N_{p-2} \le N_+ + N_-. \tag{2.20}$$
We define the subspace $W(N) = W_+(N)\oplus W_0(N)\oplus W_-(N)$ as
$$W_+(N) = \mathbb{C}[z_1]\,z_1^{N_+}(dz_1)^{2-p} \subset \mathbb{C}[z_1,z_1^{-1}](dz_1)^{2-p},$$
$$W_-(N) = \mathbb{C}[z_2]\,z_2^{N_-}(dz_2)^0 \subset \mathbb{C}[z_2,z_2^{-1}](dz_2)^0,$$
and the space $W_0(N)$ is the sum (not direct!) of the images of the subspaces
$$W_0^{(j)}(N) = \mathbb{C}[z_1,z_2]\,z_1^{N_j-N_-}\,z_2^{N_-}\,(z_1-z_2)^{p-j-2}(dz_1)^{2-p}(dz_2)^0$$
for $j = 0,\dots,p-2$ under the mapping $W_+\otimes W_-\to W_0$. We note that if $N_0 = N_1 = \dots = N_{p-2} = N_+ + N_-$, then $W_0(N)$ is the subspace of (anti)commutators of $W_+(N)$ and $W_-(N)$. Clearly, we have
$$U_\theta(W(N)) = W(N+\theta u),\qquad\text{where } u = (1,-1;0,\dots,0). \tag{2.21}$$

Proposition 2.2. The irreducible representation $I_r$ is induced from the subspace $W(ru)$.

Example 2.3. Let $p = 2$. Then we can identify $G^+_i = e\cdot t^i$, $G^-_i = f\cdot t^i$, and $H^0_i = h\cdot t^i$, where $e, h, f$ is the standard basis of $sl_2$. Our VOA is therefore isomorphic to $\widehat{sl}_2$ at the level 1. We then have
$$W((N_+,N_-,N_0)) = e\cdot t^{N_+}\mathbb{C}[t]\oplus h\cdot t^{N_0}\mathbb{C}[t]\oplus f\cdot t^{N_-}\mathbb{C}[t]. \tag{2.22}$$
Thus, we cover the cases for $k = 1$ considered in [FKLMM1, FKLMM2, FKLMM3].

Let $R(N)$ be the representation induced from the trivial representation of the subalgebra generated by the subspace $W(N)$. We know that the representation $R(N)$ is a direct sum of $I_r$. The multiplicity of $I_r$ in $R(N)$ is the dimension of the space $\operatorname{Hom}_{\mathfrak{A}_p^{\mathrm{ass}}}(R(N), I_r)$, which is isomorphic to the space of $W(N)$-invariants in $I_r$.
3. Coinvariants and Verlinde Rules

3.1. It is more convenient to study the space dual to the space of $W(N)$-invariants in $I_r$, namely, the space of $W(N)$-coinvariants in $I_r^*$, which is $I_r^*/W(N)I_r^*$ by definition. Note that the dual representation $I_r^*$ differs from $I_r$ only by exchanging positive and negative modes.

The space $W(N)$ can be deformed, i.e., can be included in a family of subspaces of $W$ depending on a set of parameters that are points on $\mathbb{C}$. Given $k$ vectors $N^i = (N^i_+, N^i_-; N^i_0, N^i_1, \dots, N^i_{p-2})$ satisfying (2.20) and given a vector $a = (a^1,\dots,a^k)\in\mathbb{C}^k$, we define $W(N^1,\dots,N^k;a)$ as the direct sum of the spaces $W_\pm(N^1,\dots,N^k;a)$ and $W_0(N^1,\dots,N^k;a)$, where
$$W_+(N^1,\dots,N^k;a) = \mathbb{C}[z_1](dz_1)^{2-p}\prod_{i=1}^{k}(z_1-a^i)^{N^i_+} \subset \mathbb{C}[z_1,z_1^{-1}](dz_1)^{2-p},$$
$$W_-(N^1,\dots,N^k;a) = \mathbb{C}[z_2](dz_2)^{0}\prod_{i=1}^{k}(z_2-a^i)^{N^i_-} \subset \mathbb{C}[z_2,z_2^{-1}](dz_2)^{0},$$
and the space $W_0(N^1,\dots,N^k;a)$ is the sum (not direct!) of the images of the subspaces
$$W_0^{(j)}(N^1,\dots,N^k;a) = \mathbb{C}[z_1,z_2]\,(z_1-z_2)^{p-j-2}(dz_1)^{2-p}(dz_2)^0\prod_{i=1}^{k}(z_1-a^i)^{N^i_j-N^i_-}(z_2-a^i)^{N^i_-}$$
for $j = 0,\dots,p-2$ under the mapping $W_+\otimes W_-\to W_0$. Then $W(N) = W(N^1,N^2,\dots,N^k;0)$ for $N = N^1+\dots+N^k$, and the family of subalgebras $W(N^1,\dots,N^k;a)$ is therefore a deformation of $W(N)$. Such a deformation can be used to calculate dimensions for the coinvariants. To simplify the notation, we let $I_r[W]$ denote the space of coinvariants $I_r^*/WI_r^*$.

Theorem 3.1. If the points $a^1,a^2,\dots,a^k$ are pairwise distinct, then the dimension of $I_r[W(N^1,\dots,N^k;a)]$ coincides with the coefficient of $I_r$ in the expression
$$\prod_{i=1}^{k}\Bigl(\dim I_0[W(N^i)]\cdot I_0 + \dots + \dim I_{p-1}[W(N^i)]\cdot I_{p-1}\Bigr), \tag{3.1}$$
where the product is the Verlinde multiplication.

Proof. We first recall the fusion procedure in the case of one input and $k$ outputs. We suppose that $N^1,\dots,N^k$ and $M^1,\dots,M^k$ are two sets of vectors such that $N^i_j \ge M^i_j$ for any $i = 1,\dots,k$ and $j = +,-,0,\dots,p-2$. We then have the natural projection $I_r[W(N^1,\dots,N^k;a)]\to I_r[W(M^1,\dots,M^k;a)]$. The dual spaces thus form an inductive system, and we can consider the limit
$$I_r[a] = \varinjlim\,\bigl(I_r[W(N^1,\dots,N^k;a)]\bigr)^*.$$
The action of the algebra $\mathfrak{A}_p^{\mathrm{ass}}$ on $I_r$ can be extended to the action of the algebra $\mathfrak{A}_p^{\mathrm{ass}}\oplus\dots\oplus\mathfrak{A}_p^{\mathrm{ass}}$ on the limit $I_r[a]$, where the summands correspond to the points $a^1,\dots,a^k$.
The statement that the set of representations $I_r$ forms a minimal model means that the representation $I_r[a]$ decomposes into the direct sum of external tensor products
$$I_r[a] = \bigoplus_{\substack{0\le s_1,\dots,s_k\\ I_{s_1}\cdot I_{s_2}\cdots I_{s_k} = I_r}} I_{s_1}\boxtimes\dots\boxtimes I_{s_k}.$$
We emphasize that the sum goes over the set of representations whose product in the Verlinde algebra is $I_r$. Clearly, the space $I_r[W(N^1,\dots,N^k;a)]$ is dual to the space of $W(N^1,\dots,N^k;a)$-invariants in $I_r[a]$. Because the subspace $W(N^1,\dots,N^k;a)$ is dense with respect to the topology of the direct limit in the subspace $W(N^1)\oplus\dots\oplus W(N^k)\subset\mathfrak{A}_p^{\mathrm{ass}}\oplus\dots\oplus\mathfrak{A}_p^{\mathrm{ass}}$, we have
$$\dim I_r[W(N^1,\dots,N^k;a)] = \sum_{\substack{0\le s_1,\dots,s_k\\ I_{s_1}\cdot I_{s_2}\cdots I_{s_k} = I_r}} \dim I_{s_1}[W(N^1)]\cdots\dim I_{s_k}[W(N^k)],$$
which is equivalent to the statement of the theorem.

We have a family of subalgebras here parameterized by complex numbers $a^1,\dots,a^k$. The deformation argument (see, e.g., [FKLMM2]) implies that the dimension of the space of coinvariants with respect to a special subalgebra is not less than that with respect to a generic subalgebra.

Proposition 3.2. The dimension of $I_r[W(N^1,\dots,N^k;a)]$ is minimal whenever the $a^i$ are pairwise distinct and maximal whenever the $a^i$ are equal to each other.

3.2. We introduce elementary spaces $W_{i,j} = W(N_{i,j})$ with $N_{i,j} = (i-j, j; 1, 2, \dots, i-1, i, \dots, i)$ and $0\le i\le p$, $j\in\mathbb{Z}$.

Proposition 3.3. We have
$$\dim I_r[W_{i,j}] = \#\{n \mid 0\le pn+j+r\le i\}. \tag{3.2}$$
In other words,
$$\dim I_r[W_{i,j}] = \begin{cases} 2, & i = p,\ (r+j)\bmod p = 0,\\ 0, & (r+j)\bmod p > i,\\ 1, & \text{otherwise}.\end{cases}$$
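The two forms of the count in Proposition 3.3 can be compared mechanically. The following sketch, with hypothetical function names, counts the integers $n$ in (3.2) directly and checks the result against the case distinction for $0\le i\le p$; it only tests the arithmetic and says nothing about the representation-theoretic statement itself.

```python
def dim_by_counting(p, r, i, j):
    """Number of integers n with 0 <= p*n + j + r <= i, as in (3.2)."""
    x = j + r
    lo = -(x // p)        # smallest admissible n, i.e. ceil(-x / p)
    hi = (i - x) // p     # largest admissible n, i.e. floor((i - x) / p)
    return max(0, hi - lo + 1)


def dim_by_cases(p, r, i, j):
    """Case distinction stated after (3.2); assumes 0 <= i <= p."""
    m = (r + j) % p
    if i == p and m == 0:
        return 2
    if m > i:
        return 0
    return 1


def agree(p_max=6, j_range=8):
    """True if both descriptions agree on a small grid of parameters."""
    return all(dim_by_counting(p, r, i, j) == dim_by_cases(p, r, i, j)
               for p in range(1, p_max + 1)
               for r in range(p)
               for i in range(p + 1)
               for j in range(-j_range, j_range + 1))


if __name__ == "__main__":
    print(agree())
```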
Proof. We note that $W_{i,j}$ contains the subalgebra $H^0_{\ge 1}$; therefore, the space of $W_{i,j}$-invariants in $I_r$ is spanned by several vectors $|p;r+pn\rangle$, $n\in\mathbb{Z}$. Taking the invariants of $G^+_{\ge i-j-p+2}$ and $G^-_{\ge j}$ selects the vectors $|p;r+pn\rangle$ with $-j\le r+pn\le i-j$. We must show that these vectors are invariant. We note that if $-j\le r+pn\le i-j$, then $W_{i,j}\subset W_{0,-(r+pn)}$. And we know that the vector $|p;s\rangle$ is annihilated by the subspace $W_{0,-s}$.

According to (3.1), (3.2), and (2.13), we define the positive integers $d_{p,r}[i^1,j^1;i^2,j^2;\dots;i^k,j^k]$ by the formula
$$\prod_{s=1}^{k}\bigl(\xi^{-j^s}+\xi^{-j^s+1}+\dots+\xi^{-j^s+i^s}\bigr) = \sum_{r=0}^{p-1} d_{p,r}[i^1,j^1;i^2,j^2;\dots;i^k,j^k]\cdot\xi^r, \tag{3.3}$$
where $\xi$ is a primitive root of unity of degree $p$. By Theorem 3.1 and Proposition 3.2, we then have
$$\dim I_r[W(N_{i^1,j^1},\dots,N_{i^k,j^k};0)] \ge \dim I_r[W(N_{i^1,j^1},\dots,N_{i^k,j^k};a)] \ge d_{p,r}[i^1,j^1;\dots;i^k,j^k] \tag{3.4}$$
for any $a\in\mathbb{C}^k$. One aim of this paper is to show that (3.4) is indeed an equality for an arbitrary vector $a\in\mathbb{C}^k$ (see Theorem 6.1).

It is easy to see from (3.3) that $d_{p,r}[i^1,j^1;i^2,j^2;\dots;i^k,j^k]$ is a sum with the step $p$ of supernomial coefficients. To write this sum, we introduce the $m\times m$ matrix (see [SW])
$$T_m = \begin{pmatrix} 2 & -1 & 0 & 0 & \dots & 0 & 0 & 0\\ -1 & 2 & -1 & 0 & \dots & 0 & 0 & 0\\ 0 & -1 & 2 & -1 & \dots & 0 & 0 & 0\\ \vdots & & & & \ddots & & & \vdots\\ 0 & 0 & 0 & 0 & \dots & -1 & 2 & -1\\ 0 & 0 & 0 & 0 & \dots & 0 & -1 & 1 \end{pmatrix}, \tag{3.5}$$
and then $(T_m^{-1})_{ij} = \min\{i,j\}$ for $1\le i,j\le m$. With the vector $N = (N_+,N_-;N_0,N_1,\dots,N_{p-2})$, we associate the other vector $\bar N = (N_0,N_1,\dots,N_{p-2},N_++N_-)$. We define the vector $L = (L_1,L_2,\dots,L_p)$ as
$$L = \bar N\,T_p. \tag{3.6}$$

Proposition 3.4. Let $N = \sum_{s=1}^{k} N_{i^s,j^s}$ and $L = \bar N\,T_p$. Then
$$d_{p,r}[i^1,j^1;i^2,j^2;\dots;i^k,j^k] = \sum_{a\in\mathbb{Z}}\binom{L}{pa+N_--r}_{p+1}. \tag{3.7}$$

Sum (3.7) can be easily calculated in the case where at least one $i^s$ is equal to $p-1$.

Remark 3.5. We have
$$d_{p,r}[i^1,j^1;i^2,j^2;\dots;i^k,j^k] = 2^{L_1}\,3^{L_2}\cdots(p-1)^{L_{p-2}}\,p^{L_{p-1}-1}\,(p+1)^{L_p} \tag{3.8}$$
whenever $L_{p-1}\ge 1$ or, equivalently, at least one $i^s$ is equal to $p-1$.

Finally, we describe the class of subalgebras $W(N)$ that can be thus deformed.

Proposition 3.6. If all components of the vector $L = \bar N\,T_p$ are non-negative, then
$$W(N) = W(N_{i^1,j^1},N_{i^2,j^2},\dots,N_{i^k,j^k};0)$$
for some $i^1,\dots,i^k$; $j^1,\dots,j^k$.
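The combinatorics of this subsection is easy to experiment with. The sketch below makes two assumptions that should be read as ours: identity (3.3) is interpreted in the group ring of $\mathbb{Z}/p$ (so that the coefficients $d_{p,r}$ are well defined), and the vector $\bar N$ of an elementary $N_{i,j}$ has the entries $\min(m,i)$ for $m = 1,\dots,p$. All function names are hypothetical. The code checks that $T_m$ is inverse to the matrix $\min\{i,j\}$, that $L = \bar N T_p$ counts how many $i^s$ equal each value, and that the product formula of Remark 3.5 agrees with the direct expansion on an example.

```python
from math import prod


def t_matrix(m):
    """Tridiagonal matrix (3.5): 2 on the diagonal (1 in the last entry), -1 beside it."""
    mat = [[0] * m for _ in range(m)]
    for a in range(m):
        mat[a][a] = 2 if a < m - 1 else 1
        if a + 1 < m:
            mat[a][a + 1] = mat[a + 1][a] = -1
    return mat


def min_matrix(m):
    """Matrix with entries min{i, j} for 1 <= i, j <= m."""
    return [[min(a, b) + 1 for b in range(m)] for a in range(m)]


def matmul(x, y):
    return [[sum(x[a][c] * y[c][b] for c in range(len(y))) for b in range(len(y[0]))]
            for a in range(len(x))]


def l_vector(p, pairs):
    """L = N-bar * T_p for N = sum of N_{i,j} over the given (i, j) pairs."""
    n_bar = [sum(min(m, i) for i, _ in pairs) for m in range(1, p + 1)]
    return matmul([n_bar], t_matrix(p))[0]


def d_coefficients(p, pairs):
    """Coefficients d_{p,0..p-1} of the product in (3.3), computed modulo xi^p = 1."""
    coeffs = [1] + [0] * (p - 1)
    for i, j in pairs:
        factor = [0] * p
        for t in range(i + 1):
            factor[(-j + t) % p] += 1
        new = [0] * p
        for a in range(p):
            for b in range(p):
                new[(a + b) % p] += coeffs[a] * factor[b]
        coeffs = new
    return coeffs


def remark_3_5(p, l_vec):
    """Right-hand side of (3.8); requires p >= 2 and L_{p-1} >= 1."""
    exps = list(l_vec)
    exps[p - 2] -= 1  # the exponent of p is L_{p-1} - 1
    return prod((m + 2) ** e for m, e in enumerate(exps))


if __name__ == "__main__":
    pairs = [(2, 0), (1, 1), (3, -1)]  # sample (i^s, j^s) with p = 3
    print(l_vector(3, pairs), d_coefficients(3, pairs), remark_3_5(3, l_vector(3, pairs)))
```

When one factor has $i^s = p-1$, that factor is the sum of all group elements, which is why every $d_{p,r}$ takes the same value in the example.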
4. Degenerations of $\mathfrak{A}_p^{\mathrm{ass}}$ and the Main Inequality

4.1. Let $I$ be a set with $\ell$ elements. Let $e_a$ with $a\in I$ be the standard basis in $\mathbb{Z}^\ell$. We write elements $x$ from $\mathbb{Z}^\ell$ as rows. Let $x\cdot y$ denote the standard scalar product in $\mathbb{Z}^\ell$. Let $\Gamma(A)$ be the lattice with the basis $e_a$ and the scalar product determined by a symmetric matrix $A$,
$$\langle e_a,e_b\rangle = A_{ab}\in\mathbb{Z},\qquad a,b\in I. \tag{4.1}$$
We consider the algebra $\mathfrak{B}(A)$ generated by vertex operators $V(e_a,z)$, $a\in I$. We emphasize that we do not involve vertex operators corresponding to $-e_a$. We fix the decomposition of $V(e_a,z)$ in a formal power series,
$$V(e_a,z) = \sum_{n\in\mathbb{Z}} V(e_a)_n\,z^{-n-1}. \tag{4.2}$$
The algebra $\mathfrak{B}(A)$ is naturally $\Gamma(A)$-graded. Decomposition (4.2) determines another grading by the operator $L_0$,
$$[L_0, V(e_a)_n] = -n\,V(e_a)_n. \tag{4.3}$$
This algebra acts in the space
$$\mathcal{H} = \bigoplus_{y\in\Gamma^\vee(A)}\mathcal{H}(y), \tag{4.4}$$
where $\mathcal{H}(y)$ is the Fock module with the momentum $y$ and $\Gamma^\vee(A)$ is the lattice dual to $\Gamma(A)$. The representation $\mathcal{H}$ contains the subrepresentation $J_y$ generated from the highest-weight vector $|y\rangle$ in $\mathcal{H}(y)$. The representation $J_y$ plays the role of a Verma module, namely, it is maximal among the representations with the same annihilation conditions satisfied by the vector $|y\rangle$. Clearly, $J_y$ admits the $\Gamma(A)$-grading; to fix it, we set $\deg|y\rangle = 0$.

The dual space $J_y^*$ can be identified with the space of correlation functions. Namely, any homogeneous linear functional $\langle\theta|\in J_y^*$ of degree $\sum n_i e_i$ determines and is determined by the function
$$Y_{\theta,y}(x^1_1,x^1_2,\dots,x^1_{n_1};\,x^2_1,x^2_2,\dots,x^2_{n_2};\,\dots;\,x^\ell_1,x^\ell_2,\dots,x^\ell_{n_\ell}) = \langle\theta|\,V(e_1,x^1_1)V(e_1,x^1_2)\cdots V(e_1,x^1_{n_1})\,V(e_2,x^2_1)V(e_2,x^2_2)\cdots V(e_2,x^2_{n_2})\cdots V(e_\ell,x^\ell_1)V(e_\ell,x^\ell_2)\cdots V(e_\ell,x^\ell_{n_\ell})\,|y\rangle. \tag{4.5}$$

Proposition 4.1. The correlation functions $Y_{\theta,y}$ are products of the polynomial
$$(x^1_1x^1_2\cdots x^1_{n_1})^{w_1}(x^2_1x^2_2\cdots x^2_{n_2})^{w_2}\cdots(x^\ell_1x^\ell_2\cdots x^\ell_{n_\ell})^{w_\ell}\times\prod_{1\le a\le\ell}\ \prod_{1\le i<j\le n_a}(x^a_i-x^a_j)^{A_{aa}}\ \prod_{1\le a<b\le\ell}\ \prod_{i,j}(x^a_i-x^b_j)^{A_{ab}} \tag{4.6}$$
and a polynomial symmetric in each group of variables $x^a_1,x^a_2,\dots,x^a_{n_a}$. The numbers $w_i$ are determined by the annihilation conditions satisfied by the vector $|y\rangle$.
In other words, we have the decomposition
$$J_y^* = \bigoplus_{x\in\Gamma(A)} J_y^*(x), \tag{4.7}$$
and $J_y^*(x)$ can be identified with the space of correlation functions of degree $x$. Using Proposition 4.1, we obtain the following formula of the Gordon type.

Proposition 4.2. The character of the space $J_y$ is given by the formula
$$\operatorname{ch} J_y(q,z) = \sum_{n\in\mathbb{Z}^\ell} z^n\,q^{\frac{1}{2}nA\cdot n+v\cdot n}\prod_{a\in I}\frac{1}{(q)_{e_a\cdot n}}, \tag{4.8}$$
where $v = -\frac{1}{2}\operatorname{diag}(A)+(w_1,w_2,\dots,w_\ell)$ and $z^n = z_1^{n_1}z_2^{n_2}\cdots z_\ell^{n_\ell}$.

Let $N$ be an $\ell$-dimensional vector with integer components. We define the subspaces $W^A(N)\subset\mathfrak{B}(A)$ by the condition
$$W^A(N)\ \text{is spanned by}\ \{V(e_a)_n \mid n\ge N_a,\ a\in I\}. \tag{4.9}$$
Then $W^A(N)$-coinvariants of $J_y^*$ can be easily calculated in terms of correlation functions. Namely, the dual space consists of functions given in Proposition 4.1 with a restriction on the degrees in each variable. This yields the following proposition.

Proposition 4.3. The character of $J_y[W^A(N)] = J_y^*/W^A(N)J_y^*$ is given by the formula
$$\operatorname{ch} J_y[W^A(N)](q,z) = \sum_{n\in\mathbb{Z}^\ell} z^n\,q^{\frac{1}{2}nA\cdot n+v\cdot n}\prod_{a\in I}\binom{e_a\cdot(N+n-nA+w)}{e_a\cdot n}_q, \tag{4.10}$$
where $w = (w_1,w_2,\dots,w_\ell)$.
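Formula (4.10) is built from the $q$-binomial coefficients of (1.4). As an illustration only, here is a minimal sketch that evaluates (1.4) as a polynomial in $q$, stored as a list of integer coefficients with the lowest degree first. The function names are hypothetical, and the code makes no claim about how the authors carry out such computations.

```python
def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def q_pochhammer(n):
    """(q)_n = (1 - q)(1 - q^2)...(1 - q^n) as a coefficient list; (q)_0 = 1."""
    poly = [1]
    for k in range(1, n + 1):
        poly = poly_mul(poly, [1] + [0] * (k - 1) + [-1])
    return poly


def poly_div_exact(num, den):
    """Exact quotient num / den for a denominator with constant term 1."""
    quot = []
    for k in range(len(num) - len(den) + 1):
        carry = sum(den[t] * quot[k - t] for t in range(1, min(k, len(den) - 1) + 1))
        quot.append(num[k] - carry)
    assert poly_mul(quot, den) == num  # guards against a non-exact division
    return quot


def q_binomial(n, m):
    """q-binomial coefficient of (1.4); the zero polynomial unless n >= m >= 0."""
    if not 0 <= m <= n:
        return [0]
    return poly_div_exact(q_pochhammer(n),
                          poly_mul(q_pochhammer(n - m), q_pochhammer(m)))


if __name__ == "__main__":
    print(q_binomial(4, 2))
```

Setting $q = 1$, that is, summing the coefficients, recovers the ordinary binomial coefficient, which gives a simple sanity check.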
4.2. We consider the family of algebras $\mathfrak{A}^d_p$ obtained from the algebra $\mathfrak{A}_p^{\mathrm{ass}}$ by the following recursive procedure.

We introduce a filtration on $\mathfrak{A}_p^{\mathrm{ass}}$ by assigning degree 1 to the generators $G^+$ and $G^-$. The associated graded algebra $\mathfrak{A}^0_p$ is generated by $G^+$, $G^-$. We define the currents $H^m$ by the OPE
$$G^+(z)G^-(w) = \frac{H^0}{(z-w)^{p-1}}+\dots+\frac{H^{p-2}}{z-w}+\text{regular terms}.$$
We note that the original algebra $\mathfrak{A}_p^{\mathrm{ass}}$ is the central extension of $\mathfrak{A}^0_p$.

At the next step, we assign degree 1 to $G^+$, $G^-$, and $H^0$. After taking graded objects, we obtain the algebra $\mathfrak{A}^1_p$ whose generators are again denoted (abusing the notation) by $G^+$, $G^-$, and $H^0$. The algebra $\mathfrak{A}^1_p$ is generated by the currents $G^+$, $G^-$, and $H^0$ that have OPEs with the properties
$$G^\pm(z)G^\pm(w)\sim(z-w)^p,\qquad H^0(z)H^0(w)\sim(z-w)^2,\qquad G^\pm(z)H^0(w)\sim(z-w)^1, \tag{4.11}$$
$$G^+(z)G^-(w) = \frac{H^1}{(z-w)^{p-2}}+\dots+\frac{H^{p-2}}{z-w}+\text{regular terms}. \tag{4.12}$$
To make the step $\mathfrak{A}^{d-1}_p\to\mathfrak{A}^d_p$, we introduce a filtration by attaching degree 1 to $G^+$, $G^-$, and $H^0,\dots,H^{d-1}$. As a result, we obtain the algebra $\mathfrak{A}^d_p$ with the generators $G^+$, $G^-$, and $H^i$, $0\le i\le d-1$, with the OPEs
$$G^\pm(z)G^\pm(w)\sim(z-w)^p,\qquad G^\pm(z)H^m(w)\sim(z-w)^{A_{\pm m}},\qquad H^n(z)H^m(w)\sim(z-w)^{A_{nm}}, \tag{4.13}$$
$$G^+(z)G^-(w) = \frac{H^d}{(z-w)^{p-d-1}}+\dots+\frac{H^{p-2}}{z-w}+\text{regular terms} \tag{4.14}$$
determined by the $(d+2)\times(d+2)$ matrix
$$A = \begin{pmatrix} p & -p+d+1 & 1 & 2 & 3 & \dots & d\\ -p+d+1 & p & 1 & 2 & 3 & \dots & d\\ 1 & 1 & 2 & 2 & 2 & \dots & 2\\ 2 & 2 & 2 & 4 & 4 & \dots & 4\\ 3 & 3 & 2 & 4 & 6 & \dots & 6\\ \dots & \dots & \dots & \dots & \dots & \dots & \dots\\ d & d & 2 & 4 & 6 & \dots & 2d \end{pmatrix}, \tag{4.15}$$
where the first row and column correspond to the index $+$; the second ones, to $-$; the third to $0$; and so on to $d-1$.

Proposition 4.4. There is the surjective homomorphism

$$ᑜ(A) \to ᑛ_p^d \tag{4.16}$$

acting on generators as

$$V(e_+, z) \to G^+(z), \qquad V(e_-, z) \to G^-(z), \qquad V(e_i, z) \to H^i(z), \quad 0 \le i \le d-1. \tag{4.17}$$

Proof. From (4.13), we can deduce that all relations between $V(e_a, z)$ hold for their images in $ᑛ_p^d$. Hence, (4.16) defines a homomorphism. Because the image contains all the generators of $ᑛ_p^d$, this map is surjective.
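The matrix (4.15) is easy to assemble mechanically, which helps when checking small cases by hand. The following is a minimal sketch, assuming the index order $+, -, 0, \dots, d-1$ described above; the function names `matrix_A` and `plus_minus_row_sum` are hypothetical helpers introduced only for this illustration.

```python
def matrix_A(p, d):
    """Build the (d+2) x (d+2) matrix (4.15); index order is +, -, 0, ..., d-1."""
    size = d + 2
    A = [[0] * size for _ in range(size)]
    A[0][0] = A[1][1] = p
    A[0][1] = A[1][0] = -p + d + 1
    for i in range(d):
        # Rows/columns + and - pair with H^i through the entry i + 1.
        A[0][i + 2] = A[i + 2][0] = i + 1
        A[1][i + 2] = A[i + 2][1] = i + 1
        for j in range(d):
            # The H^i, H^j block has entries 2 * (min(i, j) + 1).
            A[i + 2][j + 2] = 2 * (min(i, j) + 1)
    return A


def plus_minus_row_sum(p, d):
    """Sum of the + and - rows of A; p cancels in the first two entries."""
    A = matrix_A(p, d)
    return [x + y for x, y in zip(A[0], A[1])]


if __name__ == "__main__":
    for row in matrix_A(5, 2):
        print(row)
    print(plus_minus_row_sum(5, 3))
```

The sum of the $+$ and $-$ rows does not depend on $p$, which is the combination that enters the finiteness argument of Sect. 5.2.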
We note that the map (4.16) is graded. Namely, it preserves the grading with respect to $L_0$ and specializes the -grading of $ᑜ(A)$ into the $H_0$-grading of $ᑛ_p^d$. This specialization is defined by the functional $u$ on such that $u(e_+) = 1$, $u(e_-) = -1$, and $u(e_i) = 0$ (see (2.21)).

Any representation $I_r$ of $ᑛ_p^{ass}$ contains the highest-weight vector. Therefore, the filtration in $ᑛ_p^{ass}$ determines a filtration in $I_r$, and the associated graded algebra $ᑛ_p^0$ acts in the associated graded module $I_r^0$. We can repeat this procedure with $I_r^0$ and the image of the highest-weight vector. We thus obtain the family of representations $I_r^d$ of the algebras $ᑛ_p^d$. Evidently, characters of all $I_r^d$, $0 \le d \le p-2$, are equal to each other and equal to the character of $I_r$ given by (2.16).

We note that (4.16) allows treating any $ᑛ_p^d$-module as a $ᑜ(A)$-module. According to this, we have the following proposition.

Proposition 4.5. Let $y = ru$. Then there is the surjective mapping of $ᑜ(A)$-modules

$$J_y \to I_r^d \tag{4.18}$$

mapping the vector $|y\rangle$ into the highest-weight vector.

Proof. This follows from the observation that the highest-weight vector of $I_r^d$ satisfies the same annihilation conditions as $|y\rangle$.

We show (see Corollary 6.2) that (4.18) is indeed an isomorphism. We note that this map is bigraded with respect to $L_0$ and u().

We define the subspaces $W^d(N) \subset ᑛ_p^d$ recursively from $W(N)$ as the adjoint graded subspaces.

Proposition 4.6. Let $0 \le d \le p-1$. Then $W^d(N)$ contains the image of $W^A(N)$ under map (4.16).

Proof. We note that $W(N)$ contains linear combinations of the form

$$H^i_s + c_1 H^{i-1}_s + \cdots + c_i H^0_s + c_{i+1}$$

for $s \ge N_i - i$. Therefore, $W^d(N)$ contains $H^i_s$ for $s \ge N_i - i$ and hence (according to (2.8) and (4.2)) contains the image of $W^A(N)$.

The most important case of this statement is $d = p-1$, where the condition on $N$ is empty.

Corollary 4.7. Under the conditions of Proposition 4.6, we have the surjective map

$$J_{ru}[W^A(N)] \to I_r^d[W(N)]. \tag{4.19}$$
This map is an isomorphism only under some additional conditions (see Corollary 6.2).

The deformation argument (see, e.g., [FKLMM2]) implies that for any filtered representation $V$ of a filtered (sub)algebra $W$, we have $\dim(\mathrm{gr}\, V[\mathrm{gr}\, W]) \ge \dim V[W]$. In our case, this implies

$$\dim I_r^d[W^d(N)] \ge \dim I_r^{d-1}[W^{d-1}(N)] \ge \cdots \ge \dim I_r^0[W^0(N)] \ge \dim I_r[W(N)]. \tag{4.20}$$

Taking (4.19), (4.20), and Proposition 3.4 together, we obtain the main inequality.

Proposition 4.8. For $0 \le d \le p-1$, we have

$$\dim J_{ru}[W^A(N)] \ge \dim I_r[W(N_{i^1,j^1}, N_{i^2,j^2}, \dots, N_{i^k,j^k}; 0)] \ge \dim I_r[W(N_{i^1,j^1}, N_{i^2,j^2}, \dots, N_{i^k,j^k}; a)] \ge d_{p,r}[i^1, j^1; i^2, j^2; \dots; i^k, j^k]. \tag{4.21}$$
5. Combinatorics of the Characters

5.1. Following [SW], we introduce q-supernomial coefficients as

$$\begin{bmatrix} L \\ a \end{bmatrix}^{k+1}_q = \sum_{\substack{n_1, \dots, n_k \in \mathbb{Z} \\ n_1 + \cdots + n_k = a}} q^{\sum_{i=2}^{k} n_{i-1}\left(-n_i + \sum_{j=i}^{k} L_j\right)} \begin{bmatrix} L_k \\ n_k \end{bmatrix}_q \begin{bmatrix} L_{k-1} + n_k \\ n_{k-1} \end{bmatrix}_q \cdots \begin{bmatrix} L_1 + n_2 \\ n_1 \end{bmatrix}_q, \tag{5.1}$$

where $L = (L_1, L_2, \dots, L_k)$ is a $k$-dimensional vector with nonnegative integer entries. The sum in the RHS is a well-defined polynomial in $q$ because q-binomial coefficients vanish whenever the bottom index becomes greater than the top one.

Let $N = (N_+, N_-; N_0, \dots, N_{d-1})$ be a vector in $\mathbb{Z}^{d+2}$, let $\bar N = (N_0, \dots, N_{d-1}, N_+ + N_-)$, and let $L = \bar N T_{d+1}$. The aim of this section is to prove the following polynomial identity.

Theorem 5.1. Let $0 \le d < 2p-2$. If $L_i \ge 0$ and, in addition,

$$2N_+ - N_{d-1} \ge -(2p-d-2), \qquad 2N_- - N_{d-1} \ge -(2p-d-2), \tag{5.2}$$

then

$$\sum_{n \in \mathbb{Z}^\ell} z^{n_+ - n_-} q^{\frac12 nA\cdot n} \prod_{a \in I} \begin{bmatrix} e_a \cdot (N + n - nA) \\ e_a \cdot n \end{bmatrix}_q = \sum_{a \in \mathbb{Z}} z^a q^{pa^2/2} \begin{bmatrix} L \\ pa + N_- \end{bmatrix}^{d+2}_q, \tag{5.3}$$

where $I = \{+, -; 0, 1, \dots, d-1\}$.
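Definition (5.1) can be evaluated directly for small arguments. The sketch below is an illustration only, not the authors' method: polynomials in $q$ are stored as dictionaries `{exponent: coefficient}`, the sum over $n_1, \dots, n_k$ is done by brute force, and the helper names (`add`, `mul`, `q_binomial`, `q_supernomial`) are hypothetical.

```python
from itertools import product


def add(f, g, shift=0):
    """Return f + q**shift * g for polynomials stored as {exponent: coefficient}."""
    out = dict(f)
    for e, c in g.items():
        out[e + shift] = out.get(e + shift, 0) + c
    return {e: c for e, c in out.items() if c}


def mul(f, g):
    """Product of two polynomials in q."""
    out = {}
    for i, a in f.items():
        for j, b in g.items():
            out[i + j] = out.get(i + j, 0) + a * b
    return {e: c for e, c in out.items() if c}


def q_binomial(n, m):
    """Ordinary q-binomial [n; m]_q; zero unless 0 <= m <= n."""
    if m < 0 or m > n:
        return {}
    if n == 0:
        return {0: 1}
    # q-Pascal rule: [n; m] = [n-1; m-1] + q^m [n-1; m]
    return add(q_binomial(n - 1, m - 1), q_binomial(n - 1, m), shift=m)


def q_supernomial(L, a):
    """q-supernomial of (5.1) for L = (L_1, ..., L_k), by direct summation."""
    k = len(L)
    total = {}
    for n in product(range(max(a, 0) + 1), repeat=k):
        if sum(n) != a:
            continue
        # Exponent: sum over i = 2..k of n_{i-1} * (L_i + ... + L_k - n_i).
        power = sum(n[i - 1] * (sum(L[i:]) - n[i]) for i in range(1, k))
        term = q_binomial(L[k - 1], n[k - 1])
        for i in range(k - 2, -1, -1):
            term = mul(term, q_binomial(L[i] + n[i + 1], n[i]))
        total = add(total, term, shift=power)
    return total


if __name__ == "__main__":
    print(q_supernomial((0, 2), 2))
```

For $k = 1$ the function reduces to the ordinary q-binomial. For $k = 2$ the values satisfy the recurrence $[L + e_2; a] = [L + e_1; a-1] + q^a [L; a]$, which is the form of recurrence used for the right-hand side in Sect. 5.3.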
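We call (5.2) the well-balanced conditions. Without them, identity (5.3) can fail. To eliminate the well-balanced conditions, we use an improved version of q-binomial coefficients:

$$\begin{bmatrix} n \\ m \end{bmatrix}^+_q = \begin{cases} \begin{bmatrix} n \\ m \end{bmatrix}_q, & n \ge 0, \\[2ex] (-1)^{n-m} q^{-\frac{(n-m)^2 + (n-m)}{2}} \begin{bmatrix} -m-1 \\ -n-1 \end{bmatrix}_{q^{-1}}, & n < 0. \end{cases} \tag{5.4}$$

In particular, for $q \to 1$, the integer $\begin{bmatrix} n \\ m \end{bmatrix}^+_1$ is the coefficient of $z^{-m}$ in the Laurent expansion of $(1 + z^{-1})^n$ at $z = 0$. This definition is motivated by the following relations.

Proposition 5.2. The polynomials $\begin{bmatrix} n \\ m \end{bmatrix}^+_q$ are uniquely defined by the q-Pascal identities

$$\begin{bmatrix} n \\ m \end{bmatrix}^+_q = q^m \begin{bmatrix} n-1 \\ m \end{bmatrix}^+_q + \begin{bmatrix} n-1 \\ m-1 \end{bmatrix}^+_q \quad \text{for any } n, m, \tag{5.5}$$

$$\begin{bmatrix} n \\ m \end{bmatrix}^+_q = \begin{bmatrix} n-1 \\ m \end{bmatrix}^+_q + q^{n-m} \begin{bmatrix} n-1 \\ m-1 \end{bmatrix}^+_q \quad \text{for any } n, m, \tag{5.6}$$

and the conditions

$$\begin{bmatrix} m \\ m \end{bmatrix}^+_q = 1, \qquad \begin{bmatrix} n \\ 0 \end{bmatrix}^+_q = 0 \ \text{ for } n < 0, \tag{5.7}$$

which need verification for only certain $m$ and $n$.

Proof. Indeed, any set of polynomials $\begin{bmatrix} n \\ m \end{bmatrix}^?_q$ satisfying (5.5) and (5.6) can be expressed as

$$\begin{bmatrix} n \\ m \end{bmatrix}^?_q = \begin{cases} P_1 \cdot (-1)^{n-m} q^{-\frac{(n-m)^2 + (n-m)}{2}} \begin{bmatrix} -m-1 \\ n-m \end{bmatrix}_{q^{-1}}, & n < 0,\ m < 0, \\[2ex] P_2 \cdot (-1)^{m} q^{-\frac{m^2 + m}{2}} \begin{bmatrix} -(n-m)-1 \\ m \end{bmatrix}_{q^{-1}}, & n < 0,\ m \ge 0, \\[2ex] (P_1 + P_2) \cdot \begin{bmatrix} n \\ m \end{bmatrix}_q, & n \ge 0, \end{cases}$$

where $P_1$ and $P_2$ are certain polynomials in $q$. In our case, we need only verify that $P_1 = 1$ and $P_2 = 0$.

[Diagram of the index regions, with entries 0, $P_1$, $P_2$, 0, 0, $P_1 + P_2$.]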
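The improved q-binomial (5.4) and the q-Pascal identities (5.5) and (5.6) can be checked on a finite range of indices. The following sketch assumes Laurent polynomials in $q$ are stored as `{exponent: coefficient}` dictionaries; the function names are hypothetical and the check is a finite experiment, not a proof.

```python
def add(f, g, shift=0):
    """Return f + q**shift * g for Laurent polynomials {exponent: coefficient}."""
    out = dict(f)
    for e, c in g.items():
        out[e + shift] = out.get(e + shift, 0) + c
    return {e: c for e, c in out.items() if c}


def q_binomial(n, m):
    """Ordinary q-binomial [n; m]_q; zero unless 0 <= m <= n."""
    if m < 0 or m > n:
        return {}
    if n == 0:
        return {0: 1}
    return add(q_binomial(n - 1, m - 1), q_binomial(n - 1, m), shift=m)


def improved_q_binomial(n, m):
    """Improved q-binomial [n; m]^+_q of (5.4), valid for all integers n, m."""
    if n >= 0:
        return q_binomial(n, m)
    base = q_binomial(-m - 1, -n - 1)            # evaluated at q, then q -> 1/q below
    shift = -(((n - m) ** 2 + (n - m)) // 2)     # prefactor exponent
    sign = -1 if (n - m) % 2 else 1
    return {shift - e: sign * c for e, c in base.items()}


def pascal_holds(n, m):
    """Check both (5.5) and (5.6) at the index pair (n, m)."""
    lhs = improved_q_binomial(n, m)
    first = add(improved_q_binomial(n - 1, m - 1), improved_q_binomial(n - 1, m), shift=m)
    second = add(improved_q_binomial(n - 1, m), improved_q_binomial(n - 1, m - 1), shift=n - m)
    return lhs == first and lhs == second


if __name__ == "__main__":
    print(improved_q_binomial(-2, -3))
    print(all(pascal_holds(n, m) for n in range(-4, 5) for m in range(-4, 5)))
```

The case $n = m = 0$ is the interesting one: the identities hold there only because $[-1; -1]^+_q = 1$, whereas the ordinary coefficient with negative entries vanishes.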
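We note that for the usual q-binomial coefficients, (5.5) and (5.6) fail for $n = m = 0$.

Theorem 5.3. Let $0 \le d < 2p-2$. If $L_i \ge 0$, then

$$\sum_{n \in \mathbb{Z}^\ell} z^{n_+ - n_-} q^{\frac12 nA\cdot n} \prod_{a \in I} \begin{bmatrix} e_a \cdot (N + n - nA) \\ e_a \cdot n \end{bmatrix}^+_q = \sum_{a \in \mathbb{Z}} z^a q^{pa^2/2} \begin{bmatrix} L \\ pa + N_- \end{bmatrix}^{d+2}_q. \tag{5.8}$$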
We emphasize that in the left-hand side, the sum ranges over all integer vectors.

5.2. Here we show that the left-hand side of (5.8) is a well-defined polynomial and that Theorem 5.3 implies Theorem 5.1. Let

$$P_N(n) = \prod_{a \in I} \begin{bmatrix} e_a \cdot (N + n - nA) \\ e_a \cdot n \end{bmatrix}^+_q.$$

We must prove that if $L_i \ge 0$ for any $i$, then there is only a finite number of vectors $n$ such that $P_N(n) \ne 0$. And if we impose well-balanced conditions (5.2) in addition, then $P_N(n) \ne 0$ implies $n_a \ge 0$, $a = +, -, 0, 1, \dots, d-1$, where $n_a = e_a \cdot n$.

Proposition 5.4. If $P_N(n) \ne 0$, then

$$N_a \ge e_a \cdot nA \quad \text{for any } a = +, -, 0, 1, 2, \dots, d-1; \tag{5.9}$$

$$N_a < e_a \cdot (nA - n) \quad \text{if } n_a < 0. \tag{5.10}$$

Proof. Condition (5.9) follows because $\begin{bmatrix} n \\ m \end{bmatrix}^+_q = 0$ if $n < m$. Condition (5.10) follows because $\begin{bmatrix} n \\ m \end{bmatrix}^+_q = 0$ if $m < 0$ and $n \ge 0$.

Lemma 5.5. Let $1 \le i \le d$. If $L_i \ge 0$ and $n_{i-1} < 0$, then $P_N(n) = 0$.

Proof. We suppose that $n_{i-1} < 0$ and $P_N(n) \ne 0$. Then adding (5.10) for $a = i-1$ twice and subtracting (5.9) for $a = i-2$ (if $i > 1$) and $a = i$ (if $i < d$; if $i = d$, then for $a = +$ and $a = -$), we obtain $L_i < 0$.

Lemma 5.6. If $L_{d+1} \ge 0$ and $n_+ < 0$, $n_- < 0$, then $P_N(n) = 0$.

Proof. The proof is similar. Here we add (5.10) for $a = +$ and for $a = -$ and then subtract (5.9) for $a = d-1$.

Lemma 5.7. If $n_+ < 0$ and $n_- \ge 0$, then $P_N(n) = 0$ for $n_+ \le C_+ = (2N_+ - N_{d-1})/(2p-d-2)$.

Proof. We suppose that $n_+ < 0$ and $P_N(n) \ne 0$. Then adding (5.10) for $a = +$ twice and subtracting (5.9) for $a = d-1$, we obtain $2N_+ - N_{d-1} < (2p-d-2)n_+ - (2p-d-2)n_-$. Taking $n_- \ge 0$ and $2p-d-2 > 0$ into account, we deduce $(2p-d-2)n_+ > 2N_+ - N_{d-1}$.

Clearly, we have a similar statement for $n_- < 0$ and $n_+ \ge 0$. Therefore, if $L_i \ge 0$ for any $i$ and the well-balanced conditions are satisfied, then $P_N(n) \ne 0$ only if $n_a \ge 0$ for any $a = +, -, 0, \dots, d-1$; hence, Theorem 5.3 implies Theorem 5.1.

Under the conditions of Theorem 5.3, we have $n_i \ge 0$ for $i = 0, \dots, d-1$ and $n_+ > C_+$, $n_- > C_-$; otherwise, $P_N(n) = 0$. Adding (5.9) for $a = +$ and $a = -$, we obtain $N_+ + N_- \ge (d+1)n_+ + (d+1)n_- + 2n_0 + 4n_1 + 6n_2 + \cdots$. Because all the coefficients in the right-hand side of this inequality are positive, there exist only a finite number of such vectors $n$.

5.3. Here we compare the left-hand side $\chi^L_{p,d}[N](q, z)$ and the right-hand side $\chi^R_{p,d}[L, N_-](q, z)$ of (5.8). We note that for q-supernomial coefficients, there is the identity

$$\begin{bmatrix} L + e_k \\ a \end{bmatrix}^{k+1}_q = \begin{bmatrix} L + e_{k-1} \\ a-1 \end{bmatrix}^{k+1}_q + q^a \begin{bmatrix} L \\ a \end{bmatrix}^{k+1}_q.$$

Therefore, we have

$$\chi^R_{p,d}[L + e_{d+1}, N_-] = \chi^R_{p,d}[L + e_d, N_- - 1] + z^{-1} q^{N_- - p/2} \chi^R_{p,d}[L, N_- - p] \tag{5.11}$$

for the right-hand side. Because

$$\begin{bmatrix} (L_1, \dots, L_{k-1}, 0) \\ a \end{bmatrix}^{k+1}_q = \begin{bmatrix} (L_1, \dots, L_{k-1}) \\ a \end{bmatrix}^{k}_q,$$

we also have

$$\chi^R_{p,d}[(L_1, \dots, L_d, 0), N_-] = \chi^R_{p,d-1}[(L_1, \dots, L_d), N_-]. \tag{5.12}$$
The idea of the proof is to verify (5.11) and (5.12) for the left-hand side.

Proposition 5.8. Let $A$ be a symmetric $m \times m$ matrix with integer components, and let $u$ and $N$ be vectors in $\mathbb{Z}^m$. We consider the polynomial

$$\chi_{A,u}[N](q, z) = \sum_{n \in \mathbb{Z}^m} z^{u \cdot n} q^{\frac12 nA\cdot n} \prod_{a} \begin{bmatrix} e_a \cdot (N + n - nA) \\ e_a \cdot n \end{bmatrix}^+_q. \tag{5.13}$$

For any $a$, we then have

$$\chi_{A,u}[N] = \chi_{A,u}[N - e_a] + z^{u_a} q^{N_a - A_{aa}/2} \chi_{A,u}[N - e_a A]. \tag{5.14}$$

Proof. Identity (5.14) follows from identity (5.6) applied to the factor $\begin{bmatrix} e_a \cdot (N + n - nA) \\ e_a \cdot n \end{bmatrix}^+_q$ in each summand, followed by the shift $n \to n + e_a$ in the second resulting sum.

Applying (5.14) with $a = -$ to $\chi^L_{p,d}[N]$, we verify (5.11) for the left-hand side. We note that for $a = 0, \dots, d-1$, (5.14) implies recursive relations for the q-supernomial coefficients the same as in [SW]. To verify (5.12) for the left-hand side, we need the following identity.

Lemma 5.9. There is the relation

$$\begin{bmatrix} N \\ n \end{bmatrix}^+_q \begin{bmatrix} M \\ m \end{bmatrix}^+_q = \sum_{l \in \mathbb{Z}} q^{(n-l)(m-l)} \begin{bmatrix} N - m \\ n - l \end{bmatrix}^+_q \begin{bmatrix} M - n \\ m - l \end{bmatrix}^+_q \begin{bmatrix} N + M - n - m + l \\ l \end{bmatrix}^+_q. \tag{5.15}$$
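Proof. We first verify that the sum in the right-hand side is indeed finite. We note that if $N + M - n - m < 0$, then the third factor is always zero, and the entire sum is therefore zero. If $N + M - n - m \ge 0$, then either $N - m \ge 0$ or $M - n \ge 0$. In the first case, we have nonzero summands only if $n + m - N \le l \le n$, and in the second case, only if $n + m - M \le l \le m$.

We let the polynomial in the right-hand side be denoted by $\mathrm{RHS}^{N,M}_{n,m}$. The idea of the proof is to apply Proposition 5.2 to $\mathrm{RHS}^{N,M}_{n,m}$ as a set of polynomials indexed by $N$ and $n$. This means that we prove that

$$\mathrm{RHS}^{N,M}_{n,m} = q^n\, \mathrm{RHS}^{N-1,M}_{n,m} + \mathrm{RHS}^{N-1,M}_{n-1,m}, \qquad \mathrm{RHS}^{N,M}_{n,m} = \mathrm{RHS}^{N-1,M}_{n,m} + q^{N-n}\, \mathrm{RHS}^{N-1,M}_{n-1,m} \tag{5.16}$$

for any $N$, $n$, $M$, and $m$ and

$$\mathrm{RHS}^{n,M}_{n,m} = \begin{bmatrix} M \\ m \end{bmatrix}^+_q, \qquad \mathrm{RHS}^{N,M}_{0,m} = 0 \tag{5.17}$$

for any $M$ and $m$ and certain $N < 0$ and $n$.

To prove (5.16), we fix $N$, $M$, $n$, and $m$ for the moment and introduce the notation

$$\begin{Bmatrix} a & b & c \\ d & e & f \end{Bmatrix} = q^{(n-l)(m-l)} \begin{bmatrix} N - m + a \\ n - l + d \end{bmatrix}^+_q \begin{bmatrix} M - n + b \\ m - l + e \end{bmatrix}^+_q \begin{bmatrix} N + M - n - m + l + c \\ l + f \end{bmatrix}^+_q.$$

Then substituting $l + 1$ for $l$ leads to the identity

$$\sum_{l \in \mathbb{Z}} \begin{Bmatrix} a & b & c-1 \\ d+1 & e+1 & f-1 \end{Bmatrix} = \sum_{l \in \mathbb{Z}} q^{2l - m - n + 1} \begin{Bmatrix} a & b & c \\ d & e & f \end{Bmatrix}.$$

Using this identity together with (5.5), we obtain

$$\begin{aligned}
\mathrm{RHS}^{N,M}_{n,m} &= \sum \begin{Bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{Bmatrix} \\
&= \sum \left( q^{n-l} \begin{Bmatrix} -1 & 0 & 0 \\ 0 & 0 & 0 \end{Bmatrix} + \begin{Bmatrix} -1 & 0 & 0 \\ -1 & 0 & 0 \end{Bmatrix} \right) \\
&= \sum \left( q^{n} \begin{Bmatrix} -1 & 0 & -1 \\ 0 & 0 & 0 \end{Bmatrix} + q^{n-l} \begin{Bmatrix} -1 & 0 & -1 \\ 0 & 0 & -1 \end{Bmatrix} + \begin{Bmatrix} -1 & 0 & 0 \\ -1 & 0 & 0 \end{Bmatrix} \right) \\
&= \sum \left( q^{n} \begin{Bmatrix} -1 & 0 & -1 \\ 0 & 0 & 0 \end{Bmatrix} + q^{l-m} \begin{Bmatrix} -1 & 0 & 0 \\ -1 & -1 & 0 \end{Bmatrix} + \begin{Bmatrix} -1 & 0 & 0 \\ -1 & 0 & 0 \end{Bmatrix} \right) \\
&= \sum \left( q^{n} \begin{Bmatrix} -1 & 0 & -1 \\ 0 & 0 & 0 \end{Bmatrix} + q^{l-m} \begin{Bmatrix} -1 & 1 & 0 \\ -1 & 0 & 0 \end{Bmatrix} \right) \\
&= q^n\, \mathrm{RHS}^{N-1,M}_{n,m} + \mathrm{RHS}^{N-1,M}_{n-1,m},
\end{aligned}$$

where the sums range over $l \in \mathbb{Z}$. The proof of the second part of (5.16) is similar.

To verify (5.17), we take $n = M$ and any negative $N < m - M$. We have only one nonzero summand in $\mathrm{RHS}^{M,M}_{M,m}$, namely, for $l = m$, and it coincides with the q-binomial given in (5.17). If $N < m - M$ and $n = 0$, then $N + M - m - n < 0$, and, as described above, the result is zero. Proposition 5.2 therefore implies the lemma.

Applying (5.15) to the factors in $\chi^L_{p,d-1}[(N_+, N_-; N_0, \dots, N_{d-2})]$ corresponding to $n_+$ and $n_-$, we obtain

$$\chi^L_{p,d-1}[(N_+, N_-; N_0, \dots, N_{d-2})] = \chi^L_{p,d}[(N_+, N_-; N_0, \dots, N_{d-2}, N_+ + N_-)],$$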
i.e., (5.12) for the left-hand side.

We can now prove Theorem 5.3 by induction on $d$ and $L_{d+1}$. To complete the proof, we need only verify the case $d = 0$ manually.

Lemma 5.10. If $M + S \ge 0$, then

$$\sum_{k \in \mathbb{Z}} q^{k^2 + ak} \begin{bmatrix} M \\ a + k \end{bmatrix}^+_q \begin{bmatrix} S \\ k \end{bmatrix}^+_q = \begin{bmatrix} M + S \\ S + a \end{bmatrix}^+_q. \tag{5.18}$$

Proof. We first verify that the sum in the left-hand side is indeed finite. Because $M + S \ge 0$, we have either $M \ge 0$ or $S \ge 0$. In the first case, we have nonzero summands only if $-a \le k \le M - a$, and in the second case, only if $0 \le k \le S$.

We let the polynomial in the left-hand side be denoted by $\mathrm{LHS}^{M,S}_a$. Clearly, we have $\mathrm{LHS}^{M,S}_a = \mathrm{LHS}^{S,M}_{-a}$ and the same identity for the right-hand side. We can therefore restrict ourselves to the case $S \ge 0$ and prove that

$$\mathrm{LHS}^{M,S}_a = \begin{bmatrix} M + S \\ S + a \end{bmatrix}^+_q \quad \text{for } S \ge 0 \text{ and any } M. \tag{5.19}$$

Applying (5.6) to the second factor in $\mathrm{LHS}^{M,S}_a$, we obtain $\mathrm{LHS}^{M,S}_a = \mathrm{LHS}^{M,S-1}_a + q^{S+a}\, \mathrm{LHS}^{M,S-1}_{a+1}$, i.e., (5.5) for the right-hand side of (5.19). It therefore remains to verify (5.19) for $S = 0$. If $S = 0$, then the only nonzero summand in $\mathrm{LHS}^{M,S}_a$ is for $k = 0$, and it is equal to the right-hand side of (5.19).

The case $d = 0$ in Theorem 5.3 can be obtained by substituting $M = N_+ - (p-1)(n_+ - n_-)$, $S = N_- + (p-1)(n_+ - n_-)$, $a = n_+ - n_-$, and $k = n_-$ in (5.18).
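Since the dimension formulas of Sect. 6 are obtained by setting $q = 1$, a quick numerical check of (5.18) at $q = 1$ is a useful sanity test. The sketch below is an illustration under that specialization only; `improved_binomial` is a hypothetical helper implementing (5.4) at $q = 1$, and the summation window is an assumption chosen large enough for the small parameters tested.

```python
from math import comb


def improved_binomial(n, m):
    """Value of the improved q-binomial (5.4) at q = 1."""
    if n >= 0:
        return comb(n, m) if 0 <= m <= n else 0
    if m > n:
        return 0  # for n < 0 the coefficient vanishes unless m <= n
    return (-1) ** (n - m) * comb(-m - 1, -n - 1)


def lhs_5_18(M, S, a, window=40):
    """Left-hand side of (5.18) at q = 1; only finitely many k contribute."""
    return sum(
        improved_binomial(M, a + k) * improved_binomial(S, k)
        for k in range(-window, window + 1)
    )


def check_5_18(bound=4):
    """Compare both sides of (5.18) at q = 1 for all small M, S with M + S >= 0."""
    for M in range(-bound, bound + 1):
        for S in range(-bound, bound + 1):
            if M + S < 0:
                continue
            for a in range(-bound - 2, bound + 3):
                if lhs_5_18(M, S, a) != improved_binomial(M + S, S + a):
                    return False
    return True


if __name__ == "__main__":
    print(check_5_18())
```

For $M, S \ge 0$ this is the classical Vandermonde convolution; the cases with one negative argument are where the improved coefficients matter.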
6. Dimension and Character Formulas for the Coinvariants

Theorem 6.1. For any $a \in \mathbb{C}^k$, we have

$$\dim I_r[W(N_{i^1,j^1}, N_{i^2,j^2}, \dots, N_{i^k,j^k}; a)] = d_{p,r}[i^1, j^1; i^2, j^2; \dots; i^k, j^k]. \tag{6.1}$$
Proof. We first note that because of the action of the automorphisms $U_\theta$, we can consider only coinvariants of the representation $I_0$. The main idea of the proof is to verify that main inequality (4.21) is indeed an equality. We have a chain of inequalities here, where we know the first element by Proposition 4.3 and the last element. As soon as we verify that these two numbers are equal, we also obtain the equality of the other numbers in (4.21).

We suppose that the vector $N$ satisfies well-balanced conditions (5.2). Then this equality follows from Theorem 5.1 by setting $z = 1$ and $q = 1$. But if conditions (5.2) are not satisfied, then this equality can be violated. Hence, we should deduce the general case from the well-balanced one. For this, we consider the family of automorphisms $U_{p\theta}$. They preserve the representation $I_0$ and permute the subalgebras $W(N)$. Namely, $U_{p\theta}$ preserves $N_0, \dots, N_{d-1}$, increases $N_+$ by $p\theta$, and decreases $N_-$ by $p\theta$.

We recall that $L_p = N_+ + N_- - N_{p-2}$. We first consider the case $L_p > 0$. Then we should set $d = p-1$. Let $\theta \in \mathbb{Z}$ be the minimum number such that $2N_+ + p\theta - N_{p-2} \ge -p+1$. We therefore have $2N_+ + p\theta - N_{p-2} \le p$, and because $L_p \ge 1$, we obtain $2N_- - p\theta - N_{p-2} > -p+1$. Hence, after the automorphism $U_{p\theta}$ is applied, the well-balanced conditions are satisfied.

In the case $L_p = 0$, we set $d = p-2$. As above, let $\theta \in \mathbb{Z}$ be the minimum number such that $2N_+ + p\theta - N_{p-3} \ge -p$. We therefore have $2N_+ + p\theta - N_{p-3} < p$, and because $L_{p-1} \ge 0$, we obtain $2N_- - p\theta - N_{p-3} > -p$. Hence, after the automorphism $U_{p\theta}$ is applied, the well-balanced conditions are satisfied.

Corollary 6.2. Let $0 \le d \le p-1$. Let $A$ be the matrix given by (4.15). Then we have

$$J_{ru} \cong I_r^d. \tag{6.2}$$

If $N_d = N_{d+1} = \cdots = N_{p-2} = N_+ + N_-$ and $N$ satisfies the well-balanced condition

$$2N_+ - N_{d-1} + 2r \ge -(2p-d-2), \qquad 2N_- - N_{d-1} - 2r \ge -(2p-d-2),$$

then we have

$$J_{ru}[W^A(N)] \cong I_r^d[W(N)]. \tag{6.3}$$

Therefore, for the character of $I_r$, we have a set of formulas indexed by $0 \le d \le p-1$,

$$\mathrm{ch}\, I_r(q, z) = z^{-\frac{r}{p}} q^{\frac{r}{2p}(r-p+2)} \sum_{n \in \mathbb{Z}^\ell} z^{u \cdot n} q^{\frac12 nA\cdot n + v\cdot n} \prod_{a \in I} \frac{1}{(q)_{e_a \cdot n}},$$

where the set of indices is $I = \{+, -, 0, 1, 2, \dots, d-1\}$, $u = (1, -1; 0, \dots, 0)$, and $v = (p/2 - r - 1)u$.

We now discuss the character formulas for coinvariants.

Theorem 6.3. We consider the character defined as $\mathrm{ch}\, I_r[W(N)](q, z) = \mathrm{tr}\, z^{H_0} q^{L_0}$.
(i) Let the set of indices be $I = \{+, -, 0, 1, \dots, p-2\}$. We consider the vectors $u = (1, -1; 0, \dots, 0)$, $v = (p/2 - r - 1)u$, $w = ru$. Then

$$\mathrm{ch}\, I_r[W(N)](q, z) = z^{-\frac{r}{p}} q^{\frac{r}{2p}(r-p+2)} \sum_{n \in \mathbb{Z}^\ell} z^{u \cdot n} q^{\frac12 nA\cdot n + v\cdot n} \prod_{a \in I} \begin{bmatrix} e_a \cdot (N + n - nA + w) \\ e_a \cdot n \end{bmatrix}^+_q,$$

where $\begin{bmatrix} n \\ m \end{bmatrix}^+_q$ is given by (5.4).

(ii) We have

$$\mathrm{ch}\, I_r[W(N)](q, z) = z^{-\frac{r}{p}} q^{\frac{r}{2p}(r-p+2)} \sum_{m \in \mathbb{Z}} z^m q^{\frac{p}{2}(m^2 + m) - (r+1)m} \begin{bmatrix} L \\ pm - r + N_- \end{bmatrix}^{p+1}_q.$$

Proof. We note that $U_\theta$ acts on the character by substituting $zq$ for $z$. It therefore suffices to prove the theorem in the case $r = 0$.

We first suppose that $N$ satisfies well-balanced conditions (5.2). Then (i) follows from Corollary 6.2 and Proposition 4.3; statement (ii) can be deduced from (i) by Theorem 5.1.

We know that the general case can be reduced to the well-balanced case by the action of $U_{p\theta}$. Because $U_{p\theta}$ acts on the right-hand side of (ii) by the same substitution of $zq^p$ for $z$, we have (ii) in the general case. Statement (i) can be deduced from (ii) by Theorem 5.3.

Corollary 6.4. Let $0 \le d \le p-2$. Suppose that $N_d = N_{d+1} = \cdots = N_{p-2} = N_+ + N_-$. Let the set of indices be $I = \{+, -, 0, 1, \dots, d-1\}$ and, as above, $u = (1, -1; 0, \dots, 0)$, $v = (p/2 - r - 1)u$, and $w = ru$. Then

$$\mathrm{ch}\, I_r[W(N)](q, z) = z^{-\frac{r}{p}} q^{\frac{r}{2p}(r-p+2)} \sum_{n \in \mathbb{Z}^\ell} z^{u \cdot n} q^{\frac12 nA\cdot n + v\cdot n} \prod_{a \in I} \begin{bmatrix} e_a \cdot (N + n - nA + w) \\ e_a \cdot n \end{bmatrix}^+_q,$$

where $\begin{bmatrix} n \\ m \end{bmatrix}^+_q$ is given by (5.4).
Acknowledgements. This work is supported by grants RFBR 00-15-96579, 01-01-00906, 01-02-16686 and 01-01-00546 and CRDF RP1-2254. The work of I. Yu. Tipunin is partially supported by Science support foundation, grant for talented young researchers. We are grateful to William B. Everett for improving English in this paper.
References

[D] Dong, C.: Vertex algebras associated with even lattice. J. Algebra 160, (1993)
[FFr] Feigin, B., Frenkel, E.: Coinvariants of nilpotent subalgebras of the Virasoro algebra and partition identities. Adv. Sov. Math. 16, 139–148 (1993), hep-th/9301039
[FFu] Feigin, B. L., Fuchs, D. B.: Cohomology of some nilpotent subalgebras of the Virasoro and Kac-Moody algebras. In: Geometry and Physics, Essays in Honor of I.M. Gelfand on the Occasion of his 75th Birthday. S. Gindikin, I. M. Singer, eds, Amsterdam: North-Holland, 1991, pp. 209–235
[FKLMM1] Feigin, B., Kedem, R., Loktev, S., Miwa, T., Mukhin, E.: Combinatorics of the $\widehat{sl}_2$ spaces of coinvariants. In: Transformation Groups. Vol. 6, No. 1, 2001, pp. 25–52, math-ph/9908003
[FKLMM2] Feigin, B., Kedem, R., Loktev, S., Miwa, T., Mukhin, E.: Combinatorics of the $\widehat{sl}_2$ spaces of coinvariants, II. math.QA/0009198, to appear in Selecta Mathematica
[FKLMM3] Feigin, B., Kedem, R., Loktev, S., Miwa, T., Mukhin, E.: Combinatorics of the $\widehat{sl}_2$ spaces of coinvariants, III. math.QA/0012190, to appear in Compositio Mathematica
[FL] Feigin, B., Loktev, S.: On generalized Kostka polynomials and the quantum Verlinde rule. In: Differential Topology, Infinite-dimensional Lie Algebras, and Applications. Amer. Math. Soc. Transl. Ser. Vol. 2, 194 (1999), pp. 61–79, math.QA/9812093
[K] Kac, V.: Vertex Algebras for Beginners. University Lecture Series, 10, 1997
[KKMM] Kedem, R., Klassen, T., McCoy, B., Melzer, E.: Fermionic sum representations for conformal field theory characters. Phys. Lett. B 307, 68–76 (1993)
[SW] Schilling, A., Ole Warnaar, S.: Supernomial coefficients, polynomial identities and q-series. The Ramanujan Journal 2, 459–494 (1998), math.QA/9701007
[S] Segal, G.: Geometric aspect of quantum field theory. In: Proceedings of the ICM at Kyoto, 1990
[V] Verlinde, E.: Fusion rules and modular transformations in 2D conformal field theory. Nucl. Phys. B300, 360–376 (1988)

Communicated by L. Takhtajan
Commun. Math. Phys. 229, 293–307 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0683-6
Communications in
Mathematical Physics
Numerical Linear Algebra and Solvability of Partial Differential Equations
Maciej Zworski
Mathematics Department, University of California, Evans Hall, Berkeley, CA 94720, USA. E-mail: [email protected]
Received: 26 November 2001 / Accepted: 20 February 2002
Published online: 6 August 2002 – © Springer-Verlag 2002
Abstract: It was observed long ago that the obstruction to the accurate computation of eigenvalues of large non-self-adjoint matrices is inherent in the problem. The basic idea is that the resolvent of a highly non-normal operator can be very large far away from the spectrum. This leads to an easily observable fact that algorithms for locating eigenvalues will typically find some “false eigenvalues”. These false eigenvalues also explain one of the most surprising phenomena in linear PDEs, namely the fact (discovered by Hans Lewy in 1957, in Berkeley) that one cannot always locally solve the PDE $Pu = f$. Almost immediately after that discovery, Hörmander provided an explanation of Lewy’s example showing that almost all operators with non-constant complex valued coefficients are not locally solvable. In modern language, that was done by considering the essentially dual problem of existence of non-propagating singularities. The purpose of this article is to review this work in the context of “almost eigenvalues” and from the point of view of semi-classical analysis.

1. Introduction

The purpose of this article is to describe a connection between the following seemingly unrelated issues:

• Let $A \in M_n(\mathbb{C})$ be an $n$ by $n$ matrix with complex entries. Its eigenvalues, $\lambda_1, \dots, \lambda_n$, which are the solutions of the characteristic equation $\det(A - \lambda) = 0$, are well defined mathematical objects. Their numerical computation is a delicate problem which in the case when $A$ is not normal ($AA^* \ne A^*A$) may be very unstable. In fact, in many situations it is practically impossible.

• Let $V$ be a non-vanishing vector field in three variables, $V = \sum_{j=1}^{3} a_j(x)\partial_{x_j}$. Can the equation $Vu = f$, $f \in C^\infty(\mathbb{R}^3)$, be locally solved somewhere? That is, does there exist $\Omega \subset \mathbb{R}^3$, open, and $u \in C^1(\Omega)$ such that $Vu = f$ in $\Omega$? When the coefficients
$a_j$ are real valued, that can be done, as by a change of variables $V$ can be locally transformed to $\partial_{y_1}$. When the $a_j$’s are allowed to be complex valued, but both the $a_j$’s and $f$ are real analytic, then local solvability follows from the classical Cauchy-Kovalevskaya theorem. It came as a great surprise to everybody when Hans Lewy [14] discovered a simple vector field $V = \partial_{x_1} + i\partial_{x_2} + i(x_1 + ix_2)\partial_{x_3}$, for which there exist many functions $f \in C^\infty(\mathbb{R}^3)$, such that $Vu = f$ cannot be locally solved anywhere.

As we will see, some of the difficulties in finding eigenvalues for highly non-selfadjoint problems result from the phenomena which also cause the lack of solvability of most partial differential operators with complex coefficients.

I take the point of view of semi-classical analysis, with the “Planck constant” $h$ being small. It should be stressed however that the same methods are applicable in many settings of asymptotic analysis, where $h$ can be replaced by the wave length, step size in discretization of PDEs, the reciprocals of the Péclet number, or the Reynolds number, and even the reciprocal of the size of the matrix.

This article is written from the point of view of a “press-the-button” user of numerical analysis. What I found fascinating was the fact that things which have been standard in microlocal analysis can be easily seen in numerical experiments. I learned this because of the work of Brian Davies and Nick Trefethen.

2. Spectrum

The simplest operator for which the eigenvalues are interesting and well understood is the quantum harmonic oscillator:

$$P = D_x^2 + x^2, \qquad D_x = \frac{1}{i}\frac{\partial}{\partial x}, \tag{1}$$

where we follow the notational convention which is useful when dealing with Fourier transforms: $\widehat{D_x f}(\xi) = \xi \hat f(\xi)$. This is also motivated by quantum mechanics as the operator $D_x$ is the quantization of the classical momentum $\xi$. The classical object corresponding to the quantum harmonic oscillator is the energy,

$$E = \xi^2 + x^2, \tag{2}$$

of the classical harmonic oscillator. The spectrum, that is the set of eigenvalues of $P$, can be analyzed using the creation and annihilation operators

$$A_+ = D_x + ix, \qquad A_- = A_+^* = D_x - ix,$$

which satisfy $P = A_+A_- + 1 = A_-A_+ - 1$. A calculation, which also captures the essence of the uncertainty principle, shows that the lowest eigenvalue is 1:

$$\|u\|^2 = |\langle [D_x, x]u, u\rangle| = 2|\mathrm{Im}\langle xD_x u, u\rangle| \le 2\|xu\|\,\|D_x u\| \le \|xu\|^2 + \|D_x u\|^2 = \langle Pu, u\rangle,$$

$$Pu_0 = u_0, \qquad u_0(x) = \exp(-x^2/2).$$

By applying the creation operator $A_+$ to $u_0$, we obtain the eigenfunctions corresponding to higher energies:

$$u_n = A_+^n u_0, \qquad Pu_n = (2n+1)u_n.$$

The operator $P$ is self-adjoint on $L^2(\mathbb{R})$ and the time evolution of a state preserves the energy, and can be described using the eigenfunctions:

$$\|\exp(-itP)u\|_{L^2} = \|u\|_{L^2}, \qquad \exp(-itP)u_n = \exp(-it(2n+1))u_n. \tag{3}$$

Here the norm which defines the Hilbert space $L^2(\mathbb{R})$ is given by $\|u\|_{L^2}^2 = \int_{\mathbb{R}} |u(x)|^2\,dx$, and by “evolution” we mean solving the Schrödinger equation,

$$i\frac{\partial}{\partial t}u = Pu.$$
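The ladder construction $u_n = A_+^n u_0$ can be checked symbolically with a few lines of polynomial arithmetic. The sketch below is an illustration only: it writes $u = q(x)\exp(-x^2/2)$ and tracks the polynomial $q$ as a list of complex coefficients, so that $A_+$ acts as $q \mapsto -iq' + 2ixq$ and $P$ acts as $q \mapsto -q'' + 2xq' + q$. All function names are hypothetical.

```python
def lin_comb(*terms):
    """Linear combination of polynomials given as (coefficient, list) pairs."""
    size = max(len(poly) for _, poly in terms)
    out = [0j] * size
    for coef, poly in terms:
        for k, c in enumerate(poly):
            out[k] += coef * c
    return out


def derivative(poly):
    """Derivative of a polynomial stored as [c_0, c_1, ...]."""
    return [k * c for k, c in enumerate(poly)][1:] or [0j]


def times_x(poly):
    """Multiplication by x."""
    return [0j] + list(poly)


def creation(poly):
    """A_+ = D_x + ix acting on q(x) exp(-x^2/2): q -> -i q' + 2i x q."""
    return lin_comb((-1j, derivative(poly)), (2j, times_x(poly)))


def oscillator(poly):
    """P = D_x^2 + x^2 acting on q(x) exp(-x^2/2): q -> -q'' + 2x q' + q."""
    first = derivative(poly)
    return lin_comb((-1, derivative(first)), (2, times_x(first)), (1, poly))


def eigenfunction(n):
    """Polynomial part of u_n = A_+^n u_0."""
    poly = [1 + 0j]
    for _ in range(n):
        poly = creation(poly)
    return poly


def residual(n):
    """Largest coefficient of P u_n - (2n + 1) u_n; should vanish."""
    u = eigenfunction(n)
    diff = lin_comb((1, oscillator(u)), (-(2 * n + 1), u))
    return max(abs(c) for c in diff)


if __name__ == "__main__":
    print([residual(n) for n in range(6)])
```

Up to normalization, the polynomials produced this way are the Hermite polynomials multiplied by $i^n$.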
Although the operator $P$ is very special, many quantum systems are modeled by ensembles of harmonic oscillators. In general, when the energy of a system is conserved, we have propagation which can be described in terms of oscillations coming from various real modes of the system. These modes are the eigenvalues of a self-adjoint operator

$$P = -\Delta + V(x), \qquad \Delta = \sum_{j=1}^{n} \partial_{x_j}^2, \qquad V(x) \text{ real valued}. \tag{4}$$

For instance we can be in the situation shown in Fig. 1a, where we impose Dirichlet boundary conditions, $u(0) = u(\pi) = 0$.

Fig. 1. (a) A potential well on a finite interval. The spectrum is real and the hard wall causes reflections. (b) A potential on $\mathbb{R}$: same classical picture but a very different quantum picture. The absorbing potential $-iW(x)$ is added to absorb the states escaping from the well (no reflection)

3. Non-selfadjoint Operators

There are many ways in which non-selfadjoint operators occur in real life problems. Roughly speaking, any phenomenon in which dissipation or escape are possible (consequently, practically any phenomenon) will be governed by a non-selfadjoint operator. We will consider here an example motivated by quantum chemistry and illustrated in a simplified form in Fig. 1b. The situation is still described by a self-adjoint operator of the form (4) but the possible escape to infinity, in the now open system, will cause local decay of states. In a concrete example studied in [8], the bottom of the well in the figure corresponds to an unstable molecule of oxirene and the open regions on the left and the right to isomers of ketene. To study the reaction rates in isomerization of ketene it is useful to isolate the unstable system and the simplest way to do so is through the introduction of a complex absorbing potential $-iW(x)$. Here, $W(x) \ge 0$ is zero, or nearly zero, in $[0, \pi]$ and becomes very large outside of that interval [8]. The new operator,

$$\widetilde P = -\Delta + V(x) - iW(x), \tag{5}$$

is now non-self-adjoint. The difference from a hard barrier in Fig. 1a (which could also be modeled by adding the real potential $W(x)$) is that the complex absorbing potential produces no reflection and we model an escaping state. The eigenvalues of $\widetilde P$ are now complex and lie in the lower half-plane: $z = E - i\Gamma$. The evolution of a pure state is as in (3):

$$\widetilde P u = (E - i\Gamma)u, \qquad \exp(-it\widetilde P)u = \exp(-itE - t\Gamma)u,$$

where we now see decay at the rate given by $-\mathrm{Im}\, z = \Gamma$. As in [8] the complex absorbing potential method can be used to compute reaction rates and the location of the Breit-Wigner peaks. Finer description is given in terms of resonances (see [21] for an introduction and pointers to references) and the relation between the two methods is going to be described elsewhere.

4. A Simple Model

Instead of considering a complicated system such as (5) we follow Davies [1] and study the rotated harmonic oscillator:

$$P_\alpha = D_x^2 + e^{i\alpha}x^2, \qquad 0 \le \alpha < \pi. \tag{6}$$

The spectrum of this operator is easily computed by making a change of variables $y = e^{i\alpha/4}x$, so that $P_\alpha = e^{i\alpha/2}(D_y^2 + y^2) = e^{i\alpha/2}P$, where $P$ is the harmonic oscillator (1). With a minimal amount of justification, we can see from this that the spectrum of $P_\alpha$ is given by $e^{i\alpha/2}(2n+1)$, $n = 0, 1, \dots$, where spectrum is the set of $z \in \mathbb{C}$ for which there exists $u \in L^2(\mathbb{R})$ such that $Pu = zu$.

We can now try to find the same eigenvalues numerically. Although better accounts of this are available in [1] and [20, Chapter 9], out of curiosity, I proceeded directly using Mathematica©, and a simple discretization based on taking as a basis of $L^2$ eigenfunctions, $\{\psi_j\}_{j=0}^{\infty}$, of a different harmonic oscillator ($D_x^2 + 2x^2$). For a truncated basis of 200 elements, and the discretization $\left(\langle P_\alpha\psi_i, \psi_j\rangle\right)_{0 \le i,j \le 200}$, the results of the computation are shown in Fig. 2. What we see is a very regular bifurcation, with the correct angle $\alpha/2$ at first, and then with one arm of the bifurcation going up at the angle $\alpha$. The computation is in fact very accurate up to the bifurcation point as seen in the magnified picture shown in Fig. 3. The same basis works very well for the normal operator $e^{i\alpha/2}(D_y^2 + y^2)$, which is shown as the “control” result in the same figure.
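An experiment of the same kind can be sketched in a few lines. This is not the computation described above: it assumes NumPy is available and, as a simplifying assumption, uses the eigenfunctions of $D_x^2 + x^2$ (rather than of $D_x^2 + 2x^2$) as the basis, in which $P_\alpha = \tfrac12\big[(1 + e^{i\alpha})(2N+1) + (e^{i\alpha} - 1)(a^2 + (a^\dagger)^2)\big]$ with $N$ the number operator and $a$, $a^\dagger$ the usual ladder operators. The function names are hypothetical.

```python
import numpy as np


def rotated_oscillator_matrix(alpha, size):
    """Truncated matrix of P_alpha in the Hermite basis of D_x^2 + x^2."""
    n = np.arange(size)
    w = np.exp(1j * alpha)
    # Diagonal part: (1 + e^{i alpha}) (2n + 1) / 2.
    mat = np.diag((1 + w) * (2 * n + 1) / 2).astype(complex)
    # Coupling of levels n and n + 2 through a^2 and (a^dagger)^2.
    off = (w - 1) / 2 * np.sqrt((n[:-2] + 1) * (n[:-2] + 2))
    mat += np.diag(off, 2) + np.diag(off, -2)
    return mat


def eigenvalue_errors(alpha, size):
    """Distance from each exact eigenvalue to the nearest computed one."""
    computed = np.linalg.eigvals(rotated_oscillator_matrix(alpha, size))
    exact = np.exp(1j * alpha / 2) * (2 * np.arange(size) + 1)
    return [float(np.min(np.abs(computed - lam))) for lam in exact]


if __name__ == "__main__":
    errors = eigenvalue_errors(np.pi / 3, 80)
    for n in (0, 1, 2, 20, 40, 60):
        print(n, errors[n])
```

Printing the errors for increasing $n$ shows where the computed values stop tracking $e^{i\alpha/2}(2n+1)$; plotting the computed eigenvalues in the complex plane is the natural next step for comparing with Fig. 2.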
Fig. 2. Computation of eigenvalues of the discretized rotated harmonic oscillator with α = π/3
Fig. 3. Results near the bifurcation point: the eigenvalues of a normal operator with the same spectrum are computed as control data
This example is typical. As emphasized by Trefethen, computations of eigenvalues of highly non-normal operators will show “pseudo-eigenvalues”, and that set will exhibit structure of its own.

5. Pseudospectrum

We will now follow Trefethen’s review article [19], and recall the definition of the pseudospectrum for matrices. Thus let $A \in M_{n \times n}(\mathbb{C})$ be a matrix. A well known consequence of the spectral theorem says that
$$A = A^* \ \Longrightarrow\ \|(\lambda - A)^{-1}\| = d(\lambda, \mathrm{spec}(A))^{-1},$$

where $\mathrm{spec}(A)$ is the set of eigenvalues of $A$, and $d(\bullet, \bullet)$ the usual distance in $\mathbb{C}$. Nothing of this sort remains valid for non-self-adjoint operators (or strictly speaking, non-normal operators, $AA^* \ne A^*A$). The resolvent $(\lambda - A)^{-1}$ can be very large for points very far from the spectrum. This leads to the definition of the $\epsilon$-pseudospectrum:

$$\Lambda_\epsilon(A) = \{\lambda \in \mathbb{C} : \|(\lambda - A)^{-1}\| \ge \epsilon^{-1}\} = \{\lambda \in \mathbb{C} : \lambda \in \mathrm{spec}(A + \delta A) \text{ for some } \delta A \text{ with } \|\delta A\| \le \epsilon\},$$

where we used the fact that $A$ is a matrix for the second equality. If the resolvent is large for points away from the spectrum, false eigenvalues will appear. They come from inevitable perturbations originating for instance from the round-off errors, and from the discretization (such as the finite difference, finite element, or spectral method employed in a specific code). That explains, at least roughly, the regularity of bifurcation in Fig. 2. Due to the specifics of our discretization, and computer code¹ used by Mathematica©, at some stage of the computation, points on the level sets of the norm of the resolvent, $\Lambda_\epsilon(A)$, with $\epsilon$ small, will be chosen over the actual eigenvalues.

The computational results presented in Figs. 2 and 3 constituted a heuristic experiment, though one that a user of a numerical package is likely to encounter. A proper approach involves the computation of the pseudospectra and for the rotated harmonic oscillator that can be seen in [20, Output 24].

The importance of the size of the resolvent in computational problems was stressed early on by Kreiss [13]. Many systems in science and engineering are described using some form of linearized propagation $\exp(tA)$, not unlike the quantum mechanical propagation (3). The following intuitive statement is quite standard:

$$\text{Stability} \iff \max_{\lambda \in \mathrm{spec}(A)} \mathrm{Re}\,\lambda < 0. \tag{7}$$
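The gap between the resolvent norm and the reciprocal distance to the spectrum is already visible for $2 \times 2$ matrices. The sketch below is an illustration with a hypothetical example matrix: it uses only the standard library and the closed form for the singular values of a $2 \times 2$ matrix $B$, namely $s_{\max}^2 = \tfrac12\big(F + \sqrt{F^2 - 4|\det B|^2}\big)$ with $F$ the squared Frobenius norm, together with $\|B^{-1}\| = 1/s_{\min} = s_{\max}/|\det B|$.

```python
import math


def resolvent_norm_2x2(a, lam):
    """Spectral norm of (lam - A)^{-1} for a 2x2 matrix A given as nested tuples."""
    (a11, a12), (a21, a22) = a
    b11, b12, b21, b22 = lam - a11, -a12, -a21, lam - a22
    det = b11 * b22 - b12 * b21
    frob2 = abs(b11) ** 2 + abs(b12) ** 2 + abs(b21) ** 2 + abs(b22) ** 2
    disc = math.sqrt(max(frob2 ** 2 - 4 * abs(det) ** 2, 0.0))
    s_max = math.sqrt((frob2 + disc) / 2)
    # ||B^{-1}|| = 1 / s_min and s_min * s_max = |det B|.
    return s_max / abs(det)


def distance_to_spectrum_triangular(a, lam):
    """Distance from lam to the eigenvalues of an upper triangular 2x2 matrix."""
    return min(abs(lam - a[0][0]), abs(lam - a[1][1]))


if __name__ == "__main__":
    normal = ((-1.0, 0.0), (0.0, -2.0))
    non_normal = ((-1.0, 100.0), (0.0, -2.0))   # same eigenvalues, large coupling
    for name, mat in (("normal", normal), ("non-normal", non_normal)):
        print(name, resolvent_norm_2x2(mat, 0.0),
              1 / distance_to_spectrum_triangular(mat, 0.0))
```

Both matrices have eigenvalues $-1$ and $-2$, so $\lambda = 0$ is at distance 1 from the spectrum in each case; only for the diagonal matrix does the resolvent norm equal 1, while for the non-normal one $\lambda = 0$ belongs to $\Lambda_\epsilon$ for $\epsilon$ as small as about $0.02$.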
This is correct in the asymptotic sense as for a matrix A, we always have
exp(tA) ≤ exp max Re λ + δ t , t ≥ t (δ) 0 . λ∈spec(A)
If however sup{|z| : z ∈ &% (A)} > 0, the linearized propagator, exp tA, can be very large for very long times – see [19, Theorem 5], and references there. What matters in specific problems is far from clear and, in some cases, somewhat controversial. 6. Semi-classical Explanation The problem of finding eigenvalues for the operator Pα defined by (6), is the same as that of solving the equation (Pα − λ)u = 0. We will rephrase it semi-classically by putting 1 y = h 2 x, so that Pα − λ = Dx2 + eiα x 2 − λ
= h−1 (hDy )2 + eiα y 2 − hλ = h−1 (P (h) − z) , z = hλ . 1
The choice of this particular code was motivated only by easy availability.
Since we are interested in the behaviour of the resolvent as λ gets larger, we can work with z in a fixed region, and let h → 0. As was observed by Davies [1, 2] and discussed further in [22], the resolvent becomes very large inside of the open set given by the values of the symbol of the operator P(h), $\xi^2 + e^{i\alpha}x^2$:
$$\forall N \quad \|(P(h) - z)^{-1}\| \ge C_N(\Omega)\, h^{-N}, \qquad z \in \Omega \Subset \{w : 0 < \arg w < \alpha\}, \quad h < h_0(\Omega). \tag{8}$$
In fact, as will be discussed later, the "super-polynomial" lower bound could be replaced by an exponential bound $\exp(1/(C(\Omega)h))$. Rescaling back to λ shows that the level sets of the resolvent are close to lines homothetic to the boundary of the range of the symbol (a more precise description will be given later). The two arms of the bifurcation in Fig. 2 are getting close to these level sets.

In general, the symbol is defined by substituting ξ for $hD_x$ in a differential operator, and we think of the differential operator as the quantization of its classical symbol, just as (1) was the quantization of (2). The principal symbol comes from neglecting the terms which depend on h (none in this case). Conversely, if we think of (x, ξ) as an element of the classical configuration space, $T^*\mathbb{R}$ (or more generally $T^*X$ where X is our physical space), then any function p(x, ξ) satisfying
$$|\partial_x^\alpha \partial_\xi^\beta p(x, \xi)| \le C_{\alpha\beta}(1 + |\xi|)^m, \quad \text{for some } m, \tag{9}$$
can be quantized to give an operator P(h) = p(x, hD). Differential operators are obtained from polynomials in ξ; see [4] for a comprehensive introduction to semi-classical analysis. The estimate (8) follows from the existence of "almost eigenvalues", or what in mathematical physics would be called quasi-modes:
$$\|(P(h) - z)u(h)\|_{L^2} = O(h^\infty), \qquad \|u(h)\|_{L^2} = 1. \tag{10}$$
In addition, the states u(h) are highly localized, which in the semi-classical/microlocal language means that they are non-propagating. Normally, we expect states to propagate according to the rules of classical mechanics, just as light, governed by the Helmholtz equation, $(-h^2\Delta - 1)u = 0$, propagates (footnote 2). Consequently, to construct approximate solutions, global properties of the classical flow, such as existence of invariant sets, need to be considered, leading to Bohr-Sommerfeld quantization conditions (see [15] and references given there).

Different mechanisms take over in the case of complex coefficients. To describe them we need some elementary symplectic geometry. The configuration space $T^*X$ is a symplectic manifold, that is, a manifold equipped with a non-degenerate closed differential two-form ω. By the classical theorem of Darboux (which will be essential below), any symplectic form can be locally transformed to the standard one: $\omega_0 = \sum_{i=1}^n d\xi_i \wedge dx_i$. Classical evolution is described by flows which preserve the symplectic form, and a natural object which arises is the Poisson bracket:
$$\{f, g\} = \sum_{i=1}^n \left( \frac{\partial f}{\partial \xi_i} \frac{\partial g}{\partial x_i} - \frac{\partial g}{\partial \xi_i} \frac{\partial f}{\partial x_i} \right).$$

2 See the remark after the proof of the main theorem for an explanation.
The Poisson bracket is essential in quantization theories, through the following fundamental relation:
$$[f(x, hD), g(x, hD)] = \frac{h}{i}\{f, g\}(x, hD) + O(h^2),$$
where in the roughest form $O(h^2)$ may mean an operator bound in $L^2$, provided that f and g satisfy (9) with m = 0. With this definition, we can state a semi-classical generalization of Hörmander's theorem [9], which is an immediate adaptation of the results of Duistermaat and Sjöstrand [5]:

Theorem. Suppose that p(x, ξ) satisfies (9) and that
$$p(x_0, \xi_0) = 0, \qquad \{\operatorname{Re} p, \operatorname{Im} p\}(x_0, \xi_0) < 0.$$
Then for any $P(h) = p(x, hD) + hp_1(x, hD, h)$, there exist u(h) such that
$$\|P(h)u(h)\|_{L^2} = O(h^\infty), \qquad \|u(h)\|_{L^2} = 1, \qquad \|q(x, hD)u(h)\|_{L^2} = O(h^\infty) \tag{11}$$
for any q(x, ξ) which vanishes in a neighbourhood of $(x_0, \xi_0)$.
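For the rotated harmonic oscillator symbol of this section, $p(x, \xi) = \xi^2 + e^{i\alpha}x^2 - z$, the two hypotheses of the theorem can be checked numerically. The sketch below is illustrative only: the function names, the finite-difference step, and the sample values of α and z are assumptions, and the bracket is evaluated directly from the definition of the Poisson bracket given above.

```python
import cmath
import math

def symbol(x, xi, alpha, z):
    """p(x, xi) = xi^2 + e^{i alpha} x^2 - z."""
    return xi * xi + cmath.exp(1j * alpha) * x * x - z

def poisson_bracket_re_im(x, xi, alpha, z, step=1e-5):
    """{Re p, Im p} = d_xi(Re p) d_x(Im p) - d_xi(Im p) d_x(Re p), by central differences."""
    dp_dx = (symbol(x + step, xi, alpha, z) - symbol(x - step, xi, alpha, z)) / (2 * step)
    dp_dxi = (symbol(x, xi + step, alpha, z) - symbol(x, xi - step, alpha, z)) / (2 * step)
    return dp_dxi.real * dp_dx.imag - dp_dxi.imag * dp_dx.real

def quasimode_point(alpha, z):
    """A point (x0, xi0) with p = 0 and x0 * xi0 < 0, assuming 0 < arg z < alpha < pi."""
    x0 = math.sqrt(z.imag / math.sin(alpha))              # from Im p = 0
    xi0 = -math.sqrt(z.real - x0 * x0 * math.cos(alpha))  # from Re p = 0, negative root
    return x0, xi0

alpha = math.pi / 3
z = cmath.rect(1.0, math.pi / 6)   # arg z = alpha / 2, inside the sector
x0, xi0 = quasimode_point(alpha, z)
print(abs(symbol(x0, xi0, alpha, z)), poisson_bracket_re_im(x0, xi0, alpha, z))
```

Choosing the negative root for ξ0 is what makes the bracket negative; the positive root gives a zero of p at which the bracket has the opposite sign.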
The last statement in (11) says that u(h) is localized in space (x) and momentum (ξ). A moment's reflection also shows that the $O(h^\infty)$ smallness of the $L^2$ norms implies smallness of norms including derivatives. When p(x, ξ) is real analytic (for instance, when we are dealing with a differential operator with real analytic coefficients), then using the work of Kashiwara-Kawai [12] (which provided partial motivation for [5]), the estimates can be improved to exponential ones, that is, $O(h^\infty)$ can be replaced by exp(−1/Ch). The level sets of the norm of the resolvent are also related to the level sets of {Re p, Im p}.

In the simple example discussed above we take
$$P(h) = (hD_x)^2 + e^{i\alpha}x^2 - z,$$
so that $p(x, \xi) = \xi^2 + e^{i\alpha}x^2 - z$, and
$$\{\operatorname{Re} p, \operatorname{Im} p\}(x, \xi) = 4x\xi \sin\alpha.$$
For any z in the interior of {w : 0 < arg w < α} we can find $(x_0, \xi_0)$ such that the assumptions of the theorem are satisfied. That produces the complex quasi-modes, and results in the blow-up of the resolvent as h → 0. The estimates are uniform on compact sets in which the Poisson bracket is uniformly bounded away from 0.

We conclude this section with another numerical experiment motivated by [19, Example 5]. Let us put
$$P(h) = (hD_x)^2 + ihD_x + V(x), \qquad V(x) \ge 0,$$
where V(x) grows very fast outside of a finite interval (faster than $x^2$; we can take $V(x) = x^2$ for simplicity). This is a form of a convection-diffusion operator where the small constant h can be interpreted as the inverse of the Péclet number, rather than as the Planck constant. If we conjugate the operator by the exponential exp(−x/2h) we see that the spectrum is real and bounded from below by 1/4:
$$e^{-x/2h}\big((hD_x)^2 + ihD_x\big)e^{x/2h} = (hD_x)^2 + \frac{1}{4}.$$
On the other hand, if $p(x, \xi) = \xi^2 + i\xi + V(x)$, then
$$\{\operatorname{Re} p, \operatorname{Im} p\}(x, \xi) = -V_x(x),$$
and that is negative for sufficiently large x (the potential is supposed to grow). This will produce complex quasi-modes concentrated where $V'(x) > 0$ (for $V(x) = x^2$, at points $x_0$ with $x_0 > 0$). A naïve numerical experiment confirms that the resolvent gets large inside the set of values of p(x, ξ):
$$\{z : \operatorname{Re} z > (\operatorname{Im} z)^2\}.$$
This is shown in Fig. 4, where the computations are done for $V(x) = x^2$ with the 100 basis vectors used in the previous computation. We take $h = 10^{-k}$, k = 1, 2, 3. What is strange is the fact that in principle the number of basis vectors for $h = 10^{-3}$ should be too small: in one dimension we ordinarily need $\sim h^{-1}$ basis vectors. Yet, we already see the semi-classically determined pattern of a parabola.

Fig. 4. Computation of the eigenvalues of a convection-diffusion operator with a quadratic potential

Finally, let us point out that if we put A = 1/8 − P(h), we obtain a simple example in which the criterion for stability (7) is not accurate for a very long time: the eigenvalues lie left of −1/8 but the resolvent is very large (for small h) for Re z ∼ 1/8.

7. Lack of Solvability

Hörmander's theorem quoted in the previous section was motivated by a very different problem. It was a tool to explain Lewy's example mentioned in the very beginning, and to provide a general condition for the lack of solvability. The idea used in [9] can be roughly described as follows. Suppose we want to solve
$$Pu = f. \tag{12}$$
If one succeeds in constructing a family of u(h) such that
$$P^*u(h) = O(h^\infty), \qquad \lim_{h \to 0} \langle u(h), f \rangle = \infty,$$
then (12) is clearly impossible:
$$O(h^\infty) = \langle P^*u(h), u \rangle = \langle u(h), Pu \rangle = \langle u(h), f \rangle \longrightarrow \infty,$$
a contradiction. The implementation of this idea for $f \in C^\infty$ and $u \in \mathcal{D}'$ (the space of distributions) involves an elegant use of functional analysis (Banach's applications of Baire's Category Theorem). The main point is the construction of approximate solutions for the adjoint $P^*$, and that is in essence the theorem of the previous section. Its translation to the classical differential operator gives the celebrated commutator condition. Suppose that
$$P(x, D) = \sum_{|\alpha| \le m} a_\alpha(x) D_x^\alpha, \qquad p(x, \xi) = \sum_{|\alpha| = m} a_\alpha(x) \xi^\alpha,$$
and that
$$p(x_0, \xi_0) = 0, \qquad \{\operatorname{Re} p, \operatorname{Im} p\}(x_0, \xi_0) \neq 0, \qquad \xi_0 \neq 0.$$
Then for a large class (generic) of $f \in C^\infty$, (12) cannot be solved in any neighbourhood of $x_0$. For differential operators, the sign of the Poisson bracket is irrelevant as the sign can be changed by changing the sign of ξ (it is a polynomial of degree 2m − 1). Hence the vanishing of the symbol of the commutator of P and $P^*$ is a necessary condition for solvability (the commutator condition).

The commutator condition appeared already in Hörmander's thesis before Lewy's example. In a stronger form it was used to guarantee solvability for some operators with complex coefficients (see [10] and references given there). In Lewy's example we have
$$p(x, \xi) = \xi_1 + i\xi_2 + i(x_1 + ix_2)\xi_3,$$
so that at any $x \in \mathbb{R}^3$ we can find $\xi(x) = (x_2\xi_3, -x_1\xi_3, \xi_3) \neq 0$ such that
$$p(x, \xi(x)) = 0, \qquad \{\operatorname{Re} p, \operatorname{Im} p\}(x, \xi(x)) = 2\xi_3 \neq 0,$$
which shows lack of solvability anywhere. Let us also use this example to indicate the relation to the semi-classical discussion before. If we take the Fourier transform in $x_3$, put $h = 1/\xi_3$, $\xi_3 \to +\infty$, and multiply P by h, then we obtain a semi-classical operator,
$$P(h) = hD_{x_1} + ihD_{x_2} + ix_1 - x_2.$$
The smoothness properties are now translated to decay properties as h → 0. The adjoint of P(h) has the symbol $\xi_1 - i\xi_2 - ix_1 - x_2$, which satisfies the assumptions needed for (11) (with P(h) replaced by $P(h)^*$). The approximate solutions used to prove lack of solvability can be obtained from the quasi-modes constructed there.

Since the early papers [9, 14], the question of solvability of differential and pseudodifferential equations (footnote 3) was studied by the leading analysts; see [11], and also [10], for excellent surveys.

8. Proof of Theorem

Replacing technical arguments by heuristics, we can present an outline of the modern proof of the main theorem. It essentially follows [11, Sect. 26.2] with simplifications in the semi-classical setting. Let P(h) be a semi-classical operator with a principal symbol p(x, ξ):
$$P(h) = p(x, hD_x) + hp_1(x, hD_x, h).$$

3 The solution of the long open problem of sufficiency of condition (Ψ) has been recently announced by Dencker [3].
Suppose that the commutator condition holds in a stronger form,
$$\{\operatorname{Re} p, \operatorname{Im} p\}(x, \xi) = -1,$$
in a neighbourhood of a point $(x_0, \xi_0)$ such that $p(x_0, \xi_0) = 0$ (footnote 4). The classical theorem of Darboux shows that there exists a symplectic change of variables (that is, a change of variables preserving the symplectic form, or in other words, classical mechanics) $\kappa(y, \eta) = (x, \xi)$, defined near (0, 0), and such that
$$\kappa(0, 0) = (x_0, \xi_0), \qquad \operatorname{Re} p(\kappa(y, \eta)) = \eta_1, \qquad \operatorname{Im} p(\kappa(y, \eta)) = -y_1.$$
In other words,
$$\kappa^* p(y, \eta) = \eta_1 - iy_1,$$
which is the symbol of the annihilation operator $A_- = hD_{y_1} - iy_1$ which appeared in the discussion of the quantum harmonic oscillator. This operator has a highly localized solution given by the ground state of the harmonic oscillator, and we can localize it trivially to (0, 0) in all variables:
$$u_0(h, y) = \exp(-|y|^2/2h), \qquad (hD_{y_1} - iy_1)u_0(h, y) = 0.$$
The question now lies in transplanting $u_0$ to the (x, ξ) coordinates so that we obtain an approximate solution of P(h)u = 0.

The point of semi-classical analysis, and of its reflection in the theory of partial differential equations, microlocal analysis, is that the symplectic transformation κ can also be quantized (just as we quantized functions to obtain generalized differential operators, or pseudo-differential operators). That gives the theory of Fourier Integral Operators (see [11, Chapter XXV], and also [18] for a self-contained discussion of local theory in the semi-classical setting). More precisely, we can associate to κ a family of operators
$$U_h(\kappa) : L^2(\mathbb{R}^n) \longrightarrow L^2(\mathbb{R}^n),$$
uniformly bounded, invertible for small h, such that for any function q(x, ξ) satisfying (9) we have
$$U_h(\kappa)^{-1} q(x, hD_x) U_h(\kappa) = \kappa^* q(y, hD_y) + O_{L^2 \to L^2}(h). \tag{13}$$
This relation is called the Egorov Theorem, and for simplicity we assumed here that κ is globally defined. There are many operators which satisfy (13) and, with care, one can choose $U_h(\kappa)$ and q(x, ξ, h) so that
$$U_h(\kappa)^{-1} P(h) U_h(\kappa) = q(y, hD_y, h)(hD_{y_1} - iy_1) + O_{L^2 \to L^2}(h^\infty),$$
$$q(y, hD_y; h) = q_0(y, hD_y) + hq_1(y, hD_y; h), \qquad q_0(0, 0) \neq 0,$$
is valid for functions localized near (0, 0) (in the sense given in (11)). This concludes the proof of the theorem, as by putting $u(h) = U_h(\kappa)u_0(h)$ we obtain a family of quasi-modes satisfying (11).

In the case when we have analyticity, similar results can be obtained, but now with exponentially small errors. That involves the geometric result from [12] (existence of an analytic symplectic transformation) and the theory of Fourier Integral Operators in complex domains [16], or a direct complex WKB construction in the pseudo-differential setting.

4 As shown by Duistermaat and Sjöstrand [5], [11, Lemma 21.3.4], this can always be arranged by multiplying p by a non-vanishing function.
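The model computation at the heart of the proof, namely that the Gaussian ground state is annihilated by $hD_{y_1} - iy_1$, can be checked numerically in one variable. This is a small sketch under stated assumptions: the function names, the value h = 0.1, the sample grid, and the finite-difference step are all illustrative choices and are not part of the paper.

```python
import math

def gaussian(y, h):
    """Ground state u0(h, y) = exp(-y^2 / (2h)) in one variable."""
    return math.exp(-y * y / (2.0 * h))

def annihilation_residual(y, h, step=1e-6):
    """|(h D_y - i y) u0| with D_y = (1/i) d/dy, derivative by central differences."""
    du = (gaussian(y + step, h) - gaussian(y - step, h)) / (2.0 * step)
    # h D_y u0 = (h / i) u0' = -i h u0', so (h D_y - i y) u0 = -i (h u0' + y u0).
    return abs(complex(0.0, -1.0) * (h * du + y * gaussian(y, h)))

h = 0.1
residuals = [annihilation_residual(0.1 * k, h) for k in range(-10, 11)]
print(max(residuals))
```

The residual is limited only by the finite-difference error; the same Gaussian is also concentrated in a region of width about $h^{1/2}$ around y = 0, which is the localization used in (11).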
Remark. We should briefly mention what happens for operators, P(h), with real symbols satisfying the principal type condition:
$$p(m) = 0 \;\Longrightarrow\; dp(m) \neq 0.$$
In that case, we can proceed as before, obtaining $\kappa^* p(y, \eta) = \eta_1$, and then
$$U_h(\kappa)^{-1} P(h) U_h(\kappa) = hD_{y_1} + O_{L^2 \to L^2}(h^\infty).$$
If we consider solutions of $hD_{y_1}u = 0$, we see that in solving (11) the condition on q has to be modified: the support of q has to be invariant under the flow of the Hamilton vector field of p ($\partial_{y_1}$ in the (y, η) coordinates; the Hamilton vector field is defined by $H_p f = \{p, f\}$). In the actual construction of approximate solutions (quasi-modes) global properties of the flow are important and "matching conditions" will allow only for a discrete set of z's. An easy way to see it is to consider the problem $(hD_x - z)u(h) = O(h^\infty)$, $x \in S^1 = \mathbb{R}/2\pi\mathbb{Z}$. A complex analogue is obtained when we consider the case where {Re p, Im p} = 0 but d Re p and d Im p are independent; see [11, Sect. 26.2] and [15].

9. Annihilation Operator in Linear Algebra

The annihilation operator $A_- = hD_{y_1} - iy_1$ has spectrum equal to $\mathbb{C}$ and it satisfies the commutator condition globally:
$$[A_-, A_-^*] = 2\,\mathrm{Id},$$
a property which we already used in deriving the lower bound for the quantum harmonic oscillator. As discussed in the previous section, this is also the microlocal model for an operator with a non-propagating semi-classical singularity (quasi-mode). From the point of view of linear algebra the simplest model exhibiting "almost eigenvalues", or more precisely pseudospectrum away from the actual eigenvalues, is the Jordan block matrix:
$$J_n = \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & \cdots & \cdots & 0 \end{pmatrix}.$$
It is a truncation of the shift operator $\tilde{A}_-$, which can also be interpreted as an annihilation operator ($\tilde{A}_- u_{n+1} = u_n$, n ≥ 0). The spectrum of $\tilde{A}_-$ is the unit disc and $[\tilde{A}_-, \tilde{A}_-^*] \ge 0$. We have
$$(\lambda - J_n)^{-1} = \frac{I}{\lambda} + \frac{J_n}{\lambda^2} + \cdots + \frac{J_n^{n-1}}{\lambda^n},$$
and although the spectrum of $J_n$ is equal to {0} (with multiplicity n), the norm of the resolvent grows exponentially with n for |λ| < 1 − δ, δ > 0. Hence for any $\epsilon > 0$, the $\epsilon$-pseudospectrum will approach the disc of radius $1 + \epsilon$ as the size of the matrix goes to infinity.
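The exponential growth of the resolvent can be read off from a single matrix entry. In the sketch below (the function names and the sample value λ = 0.5 are illustrative assumptions), the top right-hand entry of $(\lambda - J_n)^{-1}$ is computed in two ways: from the finite series above, where only $J_n^{n-1}/\lambda^n$ contributes to that corner, and by solving $(\lambda - J_n)v = e_n$ by back substitution.

```python
def resolvent_corner(lam, n):
    """Top-right entry of (lam - J_n)^{-1}, read off from the finite series: lam^{-n}."""
    return lam ** (-n)

def check_corner_by_solving(lam, n):
    """Solve (lam - J_n) v = e_n by back substitution and return v[0]."""
    v = [0.0] * n
    v[n - 1] = 1.0 / lam
    for k in range(n - 2, -1, -1):
        # Row k of (lam - J_n) v = e_n reads lam * v[k] - v[k + 1] = 0.
        v[k] = v[k + 1] / lam
    return v[0]

lam = 0.5
for n in (10, 20, 40):
    print(n, resolvent_corner(lam, n), check_corner_by_solving(lam, n))
```

Since the operator norm dominates the modulus of any entry, the resolvent norm at λ = 0.5 is at least $2^n$, whereas for a normal matrix with spectrum {0} it would equal 1/|λ| = 2.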
Fig. 5. Eigenvalues of perturbations of 200 × 200 Jordan matrices: $10^{-5}$ put in the lower left-hand corner, and a random perturbation with entries bounded by $10^{-5}$

From the spectral point of view, the most dramatic perturbation of $J_n$ comes from adding to it the matrix
$$\begin{pmatrix} 0 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ x & \cdots & 0 \end{pmatrix},$$
with the resulting matrix denoted by $J_n(x)$. Its characteristic polynomial is given by $\lambda^n - x$. Hence, for an arbitrarily small x, the spectrum will be very close to the boundary of the pseudospectrum, $|x|^{1/n} \to 1$, as n → ∞. This is the phenomenon which we have seen in our numerical experiments: in Figs. 2 and 4 the "false eigenvalues" were computed near the boundary of the semi-classically determined pseudospectrum.

In fact, this can be used to explain the effect of the pseudospectra on some numerical computations. The proximity to Jordan block-like matrices will cause small perturbations (caused in turn by the properties of the discretization used, and by round-off errors) to push the computed eigenvalues, which are the actual eigenvalues of A + δA, for δA small, to the boundary of the pseudospectrum. We refer to [6] for a discussion of related issues and content ourselves with another naïve experiment. Figure 5 shows the eigenvalues of $J_{200}(10^{-5})$ and the eigenvalues of a matrix obtained by adding to $J_{200}$ a random matrix with entries of size bounded by $10^{-5}$. The eigenvalues of the random perturbation are certainly closer to the unit circle than to zero.
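For the corner perturbation the eigenvalues are available in closed form, so the first experiment can be reproduced without an eigenvalue solver. The sketch below is an illustration under assumptions (the helper names are hypothetical, and x is taken positive): it lists the roots of $\lambda^n = x$ and verifies one of them by applying $J_n(x)$ to the candidate eigenvector $(1, \lambda, \ldots, \lambda^{n-1})$.

```python
import cmath
import math

def perturbed_jordan_eigenvalues(n, x):
    """Roots of lam^n = x, the characteristic equation of J_n(x), for x > 0."""
    radius = x ** (1.0 / n)
    return [cmath.rect(radius, 2.0 * math.pi * k / n) for k in range(n)]

def apply_perturbed_jordan(v, x):
    """Multiply J_n(x) by v: ones on the superdiagonal, x in the lower left-hand corner."""
    return [v[k + 1] for k in range(len(v) - 1)] + [x * v[0]]

n, x = 200, 1e-5
eigs = perturbed_jordan_eigenvalues(n, x)
lam = eigs[1]
v = [lam ** k for k in range(n)]          # candidate eigenvector (1, lam, ..., lam^(n-1))
w = apply_perturbed_jordan(v, x)
residual = max(abs(w[k] - lam * v[k]) for k in range(n))
print(abs(lam), residual)
```

With n = 200 and $x = 10^{-5}$ all eigenvalues have modulus $10^{-5/200}$, a little over 0.94, so a perturbation of size $10^{-5}$ moves the spectrum from the origin almost to the unit circle.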
10. Other Directions

There are at least two natural directions for further investigation of the connection described here:
• The commutator condition and the rôle of the annihilation operator can be understood in a more abstract framework. Classes of large matrices can be understood in terms of quantization of compact symplectic manifolds with the Planck constant corresponding to the inverse of the size of the matrix. In other words, the phenomena described here can be seen directly on the level of large matrices, and not only in discretization of differential operators.
• Better estimates are expected to hold in the regions where {Re p, Im p} = 0, and those regions are often most interesting physically. The estimates on the resolvent are expected to be better there, and consequently numerical computations should be more stable.

We motivate the second item above by the example from chemistry, schematically illustrated by Fig. 1. The complex eigenvalues which are of interest (that is, the ones which model the unstable states) are expected to have imaginary parts of size much smaller than h (since the imaginary part divided by h will appear as the decay rate). Also, the artificial potential −iW(x) will produce its own irrelevant eigenvalues in a region with Im z −h. Hence, any eigenvalues there are "false", either due to −iW(x) or to numerical problems, and in any case of no interest. Consequently all interest lies in an h-dependent neighbourhood of the boundary of the "semi-classical pseudospectrum". The Poisson bracket vanishes there and consequently the general results do not apply.

For more subtle objects, namely resonances (see [21] and references given there), a similar statement is true. Of various recently studied cases, the furthest that one ever gets from the real axis is in the case of scattering by convex obstacles [17] (footnote 5), and there the corresponding distance is $Ch^{2/3}$.

In computing resonances one also has a lot of freedom in choosing the non-self-adjoint operator of which they are eigenvalues. Deforming so that {Re p, Im p} = 0 holds in a relevant region was recently exploited by Melin and Sjöstrand [15]. To get a heuristic understanding of such deformations, we refer to the example shown in Fig. 4: the operator became normal after a conjugation by exp(−x/2h), and that could be considered as a deformation of our operator. In the case shown in Fig. 3, the conjugation involves a differential operator in the exponential weight, and has a geometric interpretation as a deformation into the complex domain, which is in fact what we used.

Consequently one arrives again at an issue similar to the one which arises in the study of solvability: {Re p, Im p} = 0 is necessary for solvability, but what other condition would make it sufficient as well? We refer to Hörmander's review [10] for a discussion of that. The relation of solvability to quantum mechanics and semi-classics was emphasized by Fefferman [7], but the relation to practical problems described here remains unclear.

Acknowledgements. I would like to thank Mike Christ, Peter Greiner, and Lars Hörmander for helpful discussions of the issues of solvability, Alan Edelman, William Kahan, and Beresford Parlett for advice on numerical linear algebra, Bill Miller for the explanation of the CAP method in a concrete situation, and Carl Wunsch and Jared Wunsch for comments on the first version of this paper. I am also grateful to Nick Trefethen and Mark Embree for their detailed and instructive comments, and to Tom Wright for providing the correct version of Fig. 5.

5 That is in fact also the oldest case, as the resonances for the sphere appear in the Watson transform used to study the behaviour of a diffracted wave in the deep shadow, a real but subtle thing.
References
1. Davies, E.B.: Pseudospectra, the harmonic oscillator and complex resonances. R. Soc. Lond. Proc. Ser. A Math. Phys. Eng. Sci. 455, 585–599 (1999)
2. Davies, E.B.: Semi-classical states for non-self-adjoint Schrödinger operators. Commun. Math. Phys. 200, 35–41 (1999)
3. Dencker, N.: The sufficiency of condition (Ψ). Preprint, 2001
4. Dimassi, M., Sjöstrand, J.: Spectral Asymptotics in the semi-classical limit. Cambridge: Cambridge University Press, 1999
5. Duistermaat, J.J., Sjöstrand, J.: A global construction for pseudo-differential operators with non-involutive characteristics. Inv. Math. 20, 209–225 (1973)
6. Edelman, A., Ma, Y.: Staircase failures explained by orthogonal versal forms. SIAM J. Matrix Anal. Appl. 21, 1004–1025 (2000)
7. Fefferman, C.: The uncertainty principle. Bull. Am. Math. Soc. 9, 129–206 (1983)
8. Gezelter, J.D., Miller, W.H.: Resonant features in the energy dependence of the rate of ketene isomerization. J. Chem. Phys. 103, 7868–7876 (1995)
9. Hörmander, L.: Differential operators of principal type. Math. Ann. 140, 124–146 (1960); Differential equations without solutions. Math. Ann. 140, 169–173 (1960)
10. Hörmander, L.: On the solvability of pseudodifferential equations. In: Structure of solutions of differential equations (Katata/Kyoto, 1995). River Edge, NJ: World Sci. Publishing, 1996, pp. 183–213
11. Hörmander, L.: The Analysis of Linear Partial Differential Operators. Vol. III, IV, Berlin-Heidelberg-New York: Springer Verlag, 1985
12. Kashiwara, M., Kawai, T.: Microhyperbolic pseudo-differential operators. I. J. Math. Soc. Japan 27, 359–404 (1975)
13. Kreiss, H.-O.: Über die Stabilitätsdefinitionen für Differenzengleichungen die partielle Differentialgleichungen approximieren. BIT 2, 153–181 (1962)
14. Lewy, H.: An example of a smooth linear partial differential equation without solution. Ann. Math. 66, 155–158 (1957)
15. Melin, A., Sjöstrand, J.: Bohr-Sommerfeld quantization condition for non-self-adjoint operators in dimension two. Preprint, 2001
16. Sjöstrand, J.: Singularités analytiques microlocales. Astérisque 95 (1982)
17. Sjöstrand, J., Zworski, M.: Asymptotic distribution of resonances for convex obstacles. Acta Math. 183, 191–253 (1999)
18. Sjöstrand, J., Zworski, M.: Quantum monodromy and semi-classical trace formulæ. J. Math. Pure Appl. 81, 1–33 (2002)
19. Trefethen, L.N.: Pseudospectra of linear operators. SIAM Review 39, 383–400 (1997)
20. Trefethen, L.N.: Spectral methods in MATLAB. Software, Environments, and Tools. Philadelphia, PA: Society for Industrial and Applied Mathematics (SIAM), 2000
21. Zworski, M.: Resonances in Physics and Geometry. Notices of the AMS 46 no. 3, March, 319–328 (1999)
22. Zworski, M.: A remark on a paper by E.B. Davies. Proc. Am. Math. Soc. 129, 2955–2957 (2001)

Communicated by P. Sarnak
Commun. Math. Phys. 229, 309 – 335 (2002)
Communications in
Mathematical Physics
© Springer-Verlag 2002
Modular Categories and Orbifold Models
Alexander Kirillov, Jr.
Department of Mathematics, SUNY at Stony Brook, Stony Brook, NY 11794, USA. E-mail: [email protected]
Received: 27 August 2001 / Accepted: 1 March 2002
Abstract: In this paper, we try to answer the following question: given a modular tensor category $\mathcal{A}$ with an action of a compact group G, is it possible to describe in a suitable sense the "quotient" category $\mathcal{A}/G$? We give a full answer in the case when $\mathcal{A} = \mathrm{Vec}$ is the category of vector spaces; in this case, Vec/G turns out to be the category of representations of Drinfeld's double D(G). This should be considered as the category theory analog of the topological identity {pt}//G = BG. This implies a conjecture of Dijkgraaf, Vafa, E. Verlinde and H. Verlinde regarding so-called orbifold conformal field theories: if V is a vertex operator algebra which has a unique irreducible module, V itself, and G is a compact group of automorphisms of V, and some not too restrictive technical conditions are satisfied, then G is finite, and the category of representations of the algebra of invariants, $V^G$, is equivalent as a tensor category to the category of representations of Drinfeld's double D(G). We also get some partial results in the non-holomorphic case, i.e. when V has more than one simple module.

Supported in part by NSF grant DMS9970473.

Introduction

The goal of this paper is to discuss the properties of the so-called orbifold models of Conformal Field Theory from the categorical point of view. For the reader's convenience, we recall here the main definitions and results, assuming that the reader is familiar with the notion of a vertex operator algebra.

Let V be a VOA and G a finite group acting on V by automorphisms. Then the subspace of invariants $V^G$ is itself a VOA. The main question is: is it possible to describe the category of $V^G$-modules in terms of the category of V-modules and the group G?

This question was asked in this form in [DVVV]. They did not give a full answer, but did suggest (without a proof) an answer in a special case, when V is holomorphic, i.e. has only one simple module (vacuum module). In this case, the category of representations of V is equivalent to the category of vector spaces. The answer suggested in [DVVV] and further discussed in [DPR] is that in this case, the category of $V^G$-modules is equivalent to the category of modules over the (twisted) Drinfeld double D(G) of the group G. The case of VOA's coming from the Wess–Zumino–Witten model (or, equivalently, from affine Lie algebras) was studied in detail in [KT].

Many of the results of [DVVV] were rigorously proved in the language of VOA's in a series of papers of Dong, Li, and Mason; in particular, it is proved in [DLM1] that in the holomorphic case, V considered as a module over both $V^G$ and G can be written as
$$V = \bigoplus_{\lambda \in \hat{G}} V_\lambda \otimes M_\lambda, \tag{0.1}$$
where $V_\lambda$ are irreducible G-modules and $M_\lambda$ are non-zero simple pairwise non-isomorphic $V^G$-modules. However, even in the holomorphic case the full result (i.e., that the category of $V^G$-modules is equivalent to the category of modules over the (twisted) Drinfeld double D(G)) is still not proved.

In this paper we suggest a new approach to the problem. The main idea of this approach is not using the structure theory of VOA's. In the author's opinion, all the information about VOA's which is relevant for this problem is encoded in the category of representations of V. For example, the pair $V^G \subset V$ can be described in this way: as discussed in [KO], such a pair is the same as an associative commutative algebra in the category $\mathcal{C} = \operatorname{Rep} V^G$ with some technical restrictions. Thus, if we know some basic properties of $\mathcal{C}$ (e.g., that this category is semisimple, braided, and rigid), then we can forget anything else about VOA's, operator product expansions, etc. Instead, we use well-known tools for working with braided tensor categories, such as graphical presentation of morphisms.

Using this approach, in this paper we give an accurate proof of the above conjecture; for simplicity, we only consider the case when all the "twists", i.e. phase factors, are trivial. In this case, the main result reads as follows.

Theorem. Let V be a VOA, G a finite group of automorphisms of V, and $V^G$ the algebra of invariants. Assume that
(1) V is "holomorphic", i.e. has a unique irreducible module, V itself, so that Rep V = Vec.
(2) $\operatorname{Rep} V^G$ is a semisimple braided rigid balanced category.
(3) V has finite length as a $V^G$-module.
(4) A certain cohomology class $\omega \in H^3(G, \mathbb{C}^\times)$ defined by V is trivial.
Then the category $\operatorname{Rep} V^G$ is equivalent to the category of modules over $D(G, H) = \mathbb{C}[G] \ltimes \mathcal{F}(H)$ for some normal subgroup $H \subset G$. If, in addition, we assume that $\operatorname{Rep} V^G$ is modular, then H = G so $\operatorname{Rep} V^G \simeq \operatorname{Rep} D(G)$.

In this paper we assume that the reader is well familiar with braided tensor categories and in particular, with the technique of using graphs to prove identities in such categories, developed by Reshetikhin and Turaev. This can be found in many textbooks (see, e.g., [Ka]); we follow the conventions of [BK]. Conversely, knowledge of vertex operator algebras and conformal field theory is not required: they do not even appear in the paper except in this introduction. The paper also makes heavy use of results of [KO], so we suggest that the reader keep a copy of that paper handy.
1. Preliminaries

Throughout the paper, we denote by $\mathcal{C}$ a semisimple braided tensor category over $\mathbb{C}$, with simple objects $L_i$, $i \in I$ ("simple" always means "non-zero simple"). As usual, we assume that the unit object is simple and denote the corresponding index in I by 0: $\mathbf{1} = L_0$. We assume that all spaces of morphisms are finite-dimensional and denote $\langle V, W \rangle = \dim \operatorname{Hom}_{\mathcal{C}}(V, W)$; in particular, $\langle L_i, V \rangle$ is the multiplicity of $L_i$ in V.

As any abelian category, $\mathcal{C}$ is a module over the category Vec of finite-dimensional complex vector spaces, i.e. we have a natural functor of "external tensor product":
$$\boxtimes : \mathrm{Vec} \times \mathcal{C} \to \mathcal{C} \tag{1.1}$$
defined by $\operatorname{Hom}(L_i, V \boxtimes L) = V \otimes \operatorname{Hom}(L_i, L)$ (more formally: $V \boxtimes L$ is the object representing the functor $F(M) = V \otimes \operatorname{Hom}(M, L)$). This functor is bilinear and has natural associativity properties:
$$V \boxtimes (W \boxtimes L) = (V \otimes W) \boxtimes L, \qquad V \boxtimes (L \otimes M) = (V \boxtimes L) \otimes M, \qquad (V \boxtimes L) \otimes (W \boxtimes M) = (V \otimes W) \boxtimes (L \otimes M) \tag{1.2}$$
(here = means "canonically isomorphic"). Abusing the language, we will sometimes use ⊗ instead of ⊠.

Also, we denote by G a compact group (e.g., G can be finite) and by Rep G the category of finite-dimensional complex representations of G. This category is semisimple. We denote by $\hat{G}$ the set of isomorphism classes of simple G-modules, and for each $\lambda \in \hat{G}$ we choose a representative $V_\lambda$.

We denote by $\mathcal{C}[G]$ the category whose objects are pairs ($M \in \mathcal{C}$, action of G by automorphisms on M). In particular, each object of $\mathcal{C}$ can be considered as an object in $\mathcal{C}[G]$ by letting G act trivially. This category has the following properties, the proof of which is left as an exercise to the reader.
(1) $\mathcal{C}[G]$ is a semisimple rigid braided tensor category, with simple objects $V_\lambda \boxtimes L_i$.
(2) Define the functor of G-invariants $\mathcal{C}[G] \to \mathcal{C}$ by $\operatorname{Hom}_{\mathcal{C}}(L_i, M^G) = (\operatorname{Hom}_{\mathcal{C}}(L_i, M))^G = \operatorname{Hom}_{\mathcal{C}[G]}(L_i, M)$, i.e. $(V \boxtimes L_i)^G = V^G \boxtimes L_i$. Then this functor is exact, and one has a canonical embedding
$$X^G \otimes Y^G \to (X \otimes Y)^G. \tag{1.3}$$
(3) One can define the canonical functor of "exterior tensor product" $\boxtimes : \operatorname{Rep} G \times \mathcal{C}[G] \to \mathcal{C}[G]$ which has the associativity properties (1.2).
2. Untwisted Sector

Throughout the paper, we let A be a $\mathcal{C}$-algebra (i.e. an object of $\mathcal{C}$ with a map $\mu : A \otimes A \to A$) as defined in [KO]. We also assume that G acts on A by multiplication-preserving automorphisms and that $A^G = \mathbf{1}$ (recall that by the axioms of a $\mathcal{C}$-algebra, one has a canonical embedding $\mathbf{1} \to A$ and the multiplicity $\langle A, \mathbf{1} \rangle = 1$). In addition, we will also assume that $\mathcal{C}$ is rigid and balanced, and that A is rigid and satisfies $\theta_A = \mathrm{id}$.

Our main goal is to describe the category $\mathcal{C}$ in terms of the group G and the category $\operatorname{Rep}^0 A$ (see [KO] for definitions). The main motivation for this comes from the orbifold conformal field theories, as explained in the introduction; in this situation, $\mathcal{C} = \operatorname{Rep} V^G$, and A is V considered as a $V^G$-module (cf. [KO]). We will freely use results and notation of [KO].

We start by describing the structure of A as an object of $\mathcal{C}$. Define a functor $\Phi : \operatorname{Rep} G \to \mathcal{C}$ by
$$V \mapsto (V \boxtimes A)^G. \tag{2.1}$$
In other words, if one writes
$$A = \bigoplus_{\lambda \in \hat{G}} V_\lambda \boxtimes M_\lambda$$
for some $M_\lambda \in \mathcal{C}$, then $\Phi(V_\lambda^*) = M_\lambda$.

Theorem 2.1. $\Phi(\mathbb{C}) = \mathbf{1}$ and $\langle \Phi(V_\lambda), \mathbf{1} \rangle = 0$ for $V_\lambda \not\simeq \mathbb{C}$.

Proof. Immediate from definitions.

Our next goal is to prove that under suitable conditions, Φ is a tensor functor. An impatient reader can find the final result as Theorem 2.11 below. We start by constructing a morphism $\Phi(V) \otimes \Phi(W) \to \Phi(V \otimes W)$.

Theorem 2.2. Define the functorial morphism $J : \Phi(V) \otimes \Phi(W) \to \Phi(V \otimes W)$ as the following composition:
$$(V \boxtimes A)^G \otimes (W \boxtimes A)^G \to ((V \otimes W) \boxtimes (A \otimes A))^G \xrightarrow{\;\mu\;} ((V \otimes W) \boxtimes A)^G$$
(we have used (1.2), (1.3)). Then J is compatible with associativity, commutativity, unit, and balancing morphisms in Rep G, $\mathcal{C}$.

Proof. Immediate from definitions.

This theorem allows one to define functorial morphisms $J : \Phi(W_1) \otimes \cdots \otimes \Phi(W_n) \to \Phi(W_1 \otimes \cdots \otimes W_n)$. For future use, we explicitly write the functoriality property: for any $f : W_i \to W_i'$, the following diagram is commutative:
$$\begin{array}{ccc}
\Phi(W_1) \otimes \cdots \otimes \Phi(W_i) \otimes \cdots \otimes \Phi(W_n) & \xrightarrow{\;1 \otimes \cdots \otimes \Phi(f) \otimes \cdots \otimes 1\;} & \Phi(W_1) \otimes \cdots \otimes \Phi(W_i') \otimes \cdots \otimes \Phi(W_n) \\
\downarrow J & & \downarrow J \\
\Phi(W_1 \otimes \cdots \otimes W_i \otimes \cdots \otimes W_n) & \xrightarrow{\;\Phi(1 \otimes \cdots \otimes f \otimes \cdots \otimes 1)\;} & \Phi(W_1 \otimes \cdots \otimes W_i' \otimes \cdots \otimes W_n)
\end{array}$$

Remark 2.3. We do not claim that J is an isomorphism: in general, this is false.
Now let us use rigidity of $A$. We denote by $e_V \colon V^* \otimes V \to \mathbf{1}$, $i_V \colon \mathbf{1} \to V \otimes V^*$ the canonical rigidity morphisms, and by $\dim M$ the dimension of an object $M \in \mathcal{C}$.

Theorem 2.4. For every $V \in \operatorname{Rep} G$, the map
\[ \tilde e \colon \Phi(V^*) \otimes \Phi(V) \xrightarrow{\ J\ } \Phi(V^* \otimes V) \xrightarrow{\ \Phi(e)\ } \Phi(\mathbb{C}) = \mathbf{1} \tag{2.2} \]
gives an isomorphism $\Phi(V^*) \simeq \Phi(V)^*$. In other words, there exists a morphism $\tilde\imath \colon \mathbf{1} \to \Phi(V) \otimes \Phi(V^*)$ such that $\tilde e$, $\tilde\imath$ satisfy the rigidity axioms.

Proof. It is easy to see that for any $X \in \mathcal{C}[G]$, the morphism $(X^*)^G \otimes X^G \to (X^* \otimes X)^G \to \mathbf{1}^G = \mathbf{1}$ identifies $(X^*)^G \simeq (X^G)^*$ (here we have used (1.3) and functoriality of $X \mapsto X^G$). Combining this with the definition of $J$ and rigidity of $A$ we get the statement of the theorem.

Theorem 2.5. For any $V, W \in \operatorname{Rep} G$, the morphism $J \colon \Phi(V) \otimes \Phi(W) \to \Phi(V \otimes W)$ is injective.

Proof. Define the morphism $I \colon \Phi(V \otimes W) \to \Phi(V) \otimes \Phi(W)$ by the graph shown in Fig. 1.
[Fig. 1. Definition of $I$]
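Then the composition $IJ \colon \Phi(V) \otimes \Phi(W) \to \Phi(V) \otimes \Phi(W)$ can be rewritten as shown in Fig. 2 and thus, $IJ = \mathrm{id}$, which proves that $J$ is injective.

Theorem 2.6. Let $V_\lambda$ be an irreducible representation of $G$.
(1) $\Phi(V_\lambda)$ is either zero or a simple object in $\mathcal{C}$.
(2) If $\lambda, \mu \in \hat G$, $\lambda \ne \mu$ are such that $\Phi(V_\lambda) \ne 0$, $\Phi(V_\mu) \ne 0$, then $\Phi(V_\lambda) \not\simeq \Phi(V_\mu)$.

Proof. Using Theorem 2.1 and Theorem 2.5, we get $\langle \Phi(V_\lambda) \otimes \Phi(V_\mu^*), \mathbf{1} \rangle \le \langle \Phi(V_\lambda \otimes V_\mu^*), \mathbf{1} \rangle = \delta_{\lambda\mu}$. Thus, using $\Phi(V^*) \simeq \Phi(V)^*$ (see Theorem 2.4), for $\lambda = \mu$ we get $\langle \Phi(V_\lambda) \otimes \Phi(V_\lambda)^*, \mathbf{1} \rangle \le 1$, which shows that $\Phi(V_\lambda)$ is either simple or zero, and for $\lambda \ne \mu$, we get $\langle \Phi(V_\lambda) \otimes \Phi(V_\mu)^*, \mathbf{1} \rangle = 0$.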
[Fig. 2. Proof of $IJ = \mathrm{id}$]
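Lemma 2.7. Let $V_\lambda$ be an irreducible representation of $G$ such that $\Phi(V_\lambda) \ne 0$. Then the composition
\[ \mathbf{1} \xrightarrow{\ \tilde\imath\ } \Phi(V_\lambda) \otimes \Phi(V_\lambda^*) \xrightarrow{\ J\ } \Phi(V_\lambda \otimes V_\lambda^*) \]
is equal to $\Phi(i_\lambda)$, where $i_\lambda$ is the canonical map $\mathbb{C} \to V_\lambda \otimes V_\lambda^*$.

Remark 2.8. For $\Phi(V_\lambda) = 0$, this lemma obviously fails.

Proof. Denote this composition by $\varphi$. Since $\langle \Phi(V_\lambda \otimes V_\lambda^*), \mathbf{1} \rangle = 1$, one has $\varphi = c\,\Phi(i_\lambda)$ for some $c \in \mathbb{C}$. To find $c$, compose both $\varphi$ and $\Phi(i_\lambda)$ with the morphism $f \colon \Phi(V_\lambda^*) \otimes \Phi(V_\lambda \otimes V_\lambda^*) \to \Phi(V_\lambda^*)$ shown in Fig. 3. Arguing as in the proof of Theorem 2.5, we see that $f\varphi = f\Phi(i_\lambda) = \mathrm{id}_{\Phi(V_\lambda^*)}$, and hence $c = 1$.

Theorem 2.9. If $\lambda \in \hat G$ is such that $\Phi(V_\lambda) \ne 0$, then $J \colon \Phi(V_\lambda) \otimes \Phi(W) \to \Phi(V_\lambda \otimes W)$ is an isomorphism for any $W \in \operatorname{Rep} G$.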
[Fig. 3. Definition of $f \colon \Phi(V_\lambda^*) \otimes \Phi(V_\lambda \otimes V_\lambda^*) \to \Phi(V_\lambda^*)$]
Proof. Let $I$ be as in the proof of Theorem 2.5. Let us calculate $JI$. Using Lemma 2.7 and functoriality of $J$, we can rewrite $JI$ as shown in Fig. 4. Thus, $JI = \mathrm{id}$. On the other hand, it was proved in Theorem 2.5 that $IJ = \mathrm{id}$.

Now, let us assume that the action of $G$ on $A$ is faithful, that is, every $g \in G$, $g \ne 1$ acts on $A$ by a non-trivial automorphism.

Theorem 2.10. If the action of $G$ is faithful, then $\Phi(V_\lambda) \ne 0$ for any $\lambda \in \hat G$.

Proof. Let $I = \{\lambda \in \hat G \mid \Phi(V_\lambda) \ne 0\}$. Then, by Theorem 2.4, $I$ is closed under duality (as usual, we denote by $\lambda^*$ the class of the representation $V_\lambda^*$). Denote by $N_{\lambda\mu}^\nu$ the multiplicities in the tensor product decomposition: $V_\lambda \otimes V_\mu \simeq \bigoplus_\nu N_{\lambda\mu}^\nu V_\nu$. Then
\[ N_{\lambda\mu}^\nu = 0 \quad \text{if } \lambda, \nu \in I,\ \mu \notin I. \tag{2.3} \]
Indeed: if $N_{\lambda\mu}^\nu \ne 0$, then there is an embedding $V_\nu \subset V_\lambda \otimes V_\mu$, which gives $\Phi(V_\nu) \subset \Phi(V_\lambda \otimes V_\mu) \simeq \Phi(V_\lambda) \otimes \Phi(V_\mu) \simeq 0$ (by Theorem 2.9), which contradicts $\Phi(V_\nu) \ne 0$. By using $N_{\lambda\mu}^\nu = N_{\lambda\nu^*}^{\mu^*}$, we can rewrite (2.3) as
\[ N_{\lambda\mu}^\nu = 0 \quad \text{if } \lambda, \mu \in I,\ \nu \notin I. \tag{2.4} \]
Let $\mathcal{A}$ be the full subcategory in $\operatorname{Rep} G$ generated (as an abelian category) by $V_\lambda$, $\lambda \in I$. Then it follows from (2.4) that $\mathcal{A}$ is closed under the tensor product; it is also closed under duality (by Theorem 2.4) and contains $\mathbb{C}$ (by Theorem 2.1). By the usual reconstruction theorems, this means that $\mathcal{A}$ is the category of representations of some group $H$ which is a quotient of $G$. But by assumption, the action of $G$ on $A$ is faithful, which means that the action of $G$ on $\bigoplus_{\lambda \in I} V_\lambda$ is faithful. Thus, $H = G$, $I = \hat G$.

Combining all the results above, we get the main theorem of this section.

Theorem 2.11. Let $A$ be a rigid $\mathcal{C}$-algebra with $\theta_A = 1$, and let $G$ be a compact group acting faithfully on $A$. Then
(1)
\[ A \simeq \bigoplus_{\lambda \in \hat G} V_\lambda \boxtimes M_\lambda, \tag{2.5} \]
where $M_\lambda = \Phi(V_\lambda^*)$ are non-zero, simple, and $M_\lambda \not\simeq M_\mu$ for $\lambda \ne \mu$.
[Fig. 4. Proof of $JI = \mathrm{id}$]
(2) Let $\mathcal{C}_1$ be the full subcategory in $\mathcal{C}$ generated as an abelian category by $M_\lambda$, $\lambda \in \hat G$. Then $\mathcal{C}_1$ is a symmetric tensor subcategory in $\mathcal{C}$, and the functor $\Phi \colon \operatorname{Rep} G \to \mathcal{C}_1$ defined by (2.1) is an equivalence of tensor categories.

Corollary 2.12. $G$ is finite.

Proof. Immediate from (2.5) and finite-dimensionality of spaces of morphisms in $\mathcal{C}$.

Corollary 2.13. $\check R^2_{AA} = \mathrm{id}$.
Example 2.14. Let $G$ be a finite group, $\mathcal{C} = \operatorname{Rep} G$, and $A = \mathcal{F}(G)$ the algebra of functions on $G$, with the usual (pointwise) multiplication. The formula $g\delta_h = \delta_{gh}$ makes $A$ a $G$-module and thus an object of $\mathcal{C}$. It is trivial to show (see [KO]) that $A$ is a rigid $\mathcal{C}$-algebra, and $\operatorname{Rep} A = \mathcal{V}ec$. We also have another action of $G$ on $A$, by $\pi_g \delta_h = \delta_{hg^{-1}}$. This commutes with the previously defined action and thus defines an action of $G$ by automorphisms on $A$ considered as a $\mathcal{C}$-algebra. In this case, the functor $\Phi$ is an equivalence of categories $\operatorname{Rep} G \simeq \mathcal{C}$, and the decomposition (2.5) becomes the standard decomposition
\[ \mathcal{F}(G) = \bigoplus_{\lambda \in \hat G} V_\lambda \otimes V_\lambda^*. \]
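The statements of Example 2.14 can be checked by brute force on a small group. The sketch below is only an illustration: it assumes $G = S_3$ realised as permutations, stores a function on $G$ as a dictionary, and the helper names (`mul`, `inv`, `delta`, `left`, `pi`, `pointwise`) are hypothetical and not taken from the paper.

```python
from itertools import permutations

# Illustrative assumption: G = S_3, realised as permutations of {0, 1, 2}.
G = list(permutations(range(3)))

def mul(p, q):
    """Product pq of permutations (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inv(p):
    """Inverse permutation."""
    out = [0] * len(p)
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)

def delta(h):
    """The delta function at h, stored as a dict {group element: value}."""
    return {k: int(k == h) for k in G}

def left(g, f):
    """G-module structure on F(G): g . delta_h = delta_{gh}."""
    return {mul(g, h): c for h, c in f.items()}

def pi(g, f):
    """Action by automorphisms: pi_g delta_h = delta_{h g^{-1}}."""
    return {mul(h, inv(g)): c for h, c in f.items()}

def pointwise(f1, f2):
    """Pointwise multiplication of functions on G."""
    return {k: f1[k] * f2[k] for k in G}

# The two actions commute, so each pi_g is a morphism in C = Rep G.
actions_commute = all(
    left(g, pi(k, delta(h))) == pi(k, left(g, delta(h)))
    for g in G for k in G for h in G
)

# Each pi_g preserves the multiplication (checked on the basis of deltas).
pi_multiplicative = all(
    pi(g, pointwise(delta(h), delta(k)))
    == pointwise(pi(g, delta(h)), pi(g, delta(k)))
    for g in G for h in G for k in G
)

# pi permutes the basis {delta_h} transitively, so the invariants A^G are
# spanned by the single vector sum_h delta_h.
orbit = {mul(G[0], inv(g)) for g in G}
invariants_one_dimensional = (orbit == set(G))

# dim F(G) = |G| = sum of (dim V_lambda)^2; for S_3 the irreducible
# representations have dimensions 1, 1, 2.
dims_match = (len(G) == 1**2 + 1**2 + 2**2)

print(actions_commute, pi_multiplicative, invariants_one_dimensional, dims_match)
```

The transitivity check is what gives $A^G = \mathbf{1}$ in this example: a $\pi$-invariant function must be constant on $G$.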
Let us return to the general case.

Theorem 2.15. Under the assumptions of Theorem 2.11, consider $A \in \mathcal{C}_1 \subset \mathcal{C}$ as an object of $\operatorname{Rep} G$ using the equivalence $\Phi$. Then $A \simeq \mathcal{F}(G)$ with multiplication, structure of $G$-module and action of $G$ by automorphisms as defined in Example 2.14.

Proof. It is immediate from Theorem 2.11 that $A$ lies in $\mathcal{C}_1 \simeq \operatorname{Rep}(G)$ and that as an object of $\operatorname{Rep}(G)$, $A = \bigoplus_{\lambda \in \hat G} V_\lambda \otimes V_\lambda^*$. The structure of $G$-module is determined by the action on the second factor, and the action of $G$ by automorphisms is defined by the action on the first factor. On the other hand, it is shown in [KO] that any algebra in the category $\operatorname{Rep} G$ must be of the form $A = \mathcal{F}(G/H)$ for some subgroup $H$. Combining these statements, we see that $H = \{1\}$, $A = \mathcal{F}(G)$.

Corollary 2.16. $G$ is the group of all automorphisms of $A$ as a $\mathcal{C}$-algebra.

Corollary 2.17. $\dim A = |G|$.

Of course, in the general case $\mathcal{C}$ can be (and usually is) larger than $\operatorname{Rep} G$. A very important example is when $\mathcal{C}$ is the category of representations of $D(G)$ (the Drinfeld double of $G$), and $A = \mathcal{F}(G) \in \operatorname{Rep} G \subset \operatorname{Rep} D(G)$ with the same action of $G$ as in Example 2.14. This example plays an important role in what follows; it is discussed in detail in Sect. 3.

Example 2.18. Let $G$ be commutative. Then all its irreducible representations are one-dimensional, and it immediately follows from Theorem 2.11 that
\[ A \simeq \bigoplus_{\lambda \in \hat G} M_\lambda, \]
where $M_\lambda$ are non-zero, simple, pairwise non-isomorphic objects in $\mathcal{C}$ and $M_\lambda \otimes M_\mu \simeq M_{\lambda\mu}$. This case is well studied in numerous papers under the name “simple current extensions”; a review of known results can be found, e.g., in [FMS] and [FS].

3. Example: D(G)

Let $D(G) = \mathbb{C}[G] \ltimes \mathcal{F}(G)$ be Drinfeld’s double of the finite group $G$ (see, e.g., [BK]), and let $\mathcal{C}$ be the category of finite-dimensional complex $D(G)$-modules. As is well known, this category is equivalent to the category of $G$-equivariant vector bundles on $G$.
An object of this category is a complex vector space $V$ with an action of $G$ and with a $G$-grading $V = \bigoplus_{g \in G} V_g$ such that $gV_x \subset V_{gxg^{-1}}$, or, equivalently, $\mathrm{wt}(gv) = g\,\mathrm{wt}(v)\,g^{-1}$, where $\mathrm{wt}(v) \in G$ denotes the weight of a homogeneous $v \in V$. The tensor product in $\mathcal{C}$ is the usual tensor product, with $\mathrm{wt}(v \otimes w) = \mathrm{wt}(v)\mathrm{wt}(w)$. The braiding in $\mathcal{C}$ is given by $\check R = PR$, where
\[ R = \sum_{g \in G} \delta_g \otimes g \in D(G) \otimes D(G). \tag{3.1} \]
In other words, if $v, w$ are homogeneous vectors in $V$, $W$ respectively, then
\[ \check R(v \otimes w) = gw \otimes v, \qquad g = \mathrm{wt}(v). \]
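As a sanity check of the braiding formula, one can take the simplest non-trivial object: the vector space with basis $e_x$, $x \in G$, weight $\mathrm{wt}(e_x) = x$ and action $g e_x = e_{gxg^{-1}}$. The sketch below assumes $G = S_3$ and this particular object (both are illustrative choices, not part of the paper) and verifies that $\check R$ preserves the total weight, commutes with the action of $G$, and satisfies the braid relation.

```python
from itertools import permutations, product

# Illustrative assumption: G = S_3, realised as permutations of {0, 1, 2}.
G = list(permutations(range(3)))

def mul(p, q):
    """Product pq of permutations (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inv(p):
    """Inverse permutation."""
    out = [0] * len(p)
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)

def conj(g, x):
    """Conjugation g x g^{-1}, i.e. the action of g on the basis vector e_x."""
    return mul(mul(g, x), inv(g))

def braid(x, y):
    """R(e_x (x) e_y) = wt(e_x) e_y (x) e_x = e_{x y x^{-1}} (x) e_x."""
    return conj(x, y), x

def b12(t):
    """Braiding applied to the first two tensor factors of a triple."""
    x, y, z = t
    a, b = braid(x, y)
    return (a, b, z)

def b23(t):
    """Braiding applied to the last two tensor factors of a triple."""
    x, y, z = t
    b, c = braid(y, z)
    return (x, b, c)

# The braiding preserves the weight wt(v (x) w) = wt(v) wt(w).
weight_preserved = all(
    mul(*braid(x, y)) == mul(x, y) for x, y in product(G, repeat=2)
)

# The braiding commutes with the diagonal action of G.
equivariant = all(
    braid(conj(g, x), conj(g, y)) == tuple(conj(g, z) for z in braid(x, y))
    for g in G for x, y in product(G, repeat=2)
)

# Braid (Yang-Baxter) relation on triple tensor products.
braid_relation = all(
    b12(b23(b12(t))) == b23(b12(b23(t))) for t in product(G, repeat=3)
)

print(weight_preserved, equivariant, braid_relation)
```

On this object the braid relation reduces to the identity $x(yzy^{-1})x^{-1} = (xyx^{-1})(xzx^{-1})(xyx^{-1})^{-1}$, which holds in any group.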
It is also known that $\operatorname{Rep} D(G)$ is semisimple, and the set of isomorphism classes of simple objects is $(g, \pi)/G$, where $g \in G$, $\pi$ is an irreducible representation of the centralizer $Z(g) = \{h \in G \mid hg = gh\}$, and $G$ acts on the set of pairs $(g, \pi)$ by $h(g, \pi) = (hgh^{-1}, \pi \circ h^{-1})$. We will denote the corresponding representation of $D(G)$ by $V_{g,\pi}$.

Let $A = \mathcal{F}(G)$; consider it as an object of $\mathcal{C}$ by endowing it with the standard action of $G$ (the same as in Example 2.14), and by letting $\mathrm{wt}(v) = 1$ for all $v \in A$.

Lemma 3.1. $A$ is a rigid $\mathcal{C}$-algebra, with $\theta_A = 1$.

The proof is straightforward. As in Example 2.14, we also have an action of $G$ by automorphisms on $A$ defined by $\pi_g \delta_h = \delta_{hg^{-1}}$, and $A^G = \mathbf{1} = \mathbb{C} \sum_{h \in G} \delta_h$.

We remind the reader that for every $\mathcal{C}$-algebra $A$, one can define the category $\operatorname{Rep} A$ of $A$-modules and two natural functors $F \colon \mathcal{C} \to \operatorname{Rep} A$, $G \colon \operatorname{Rep} A \to \mathcal{C}$ (see [KO]). The category $\operatorname{Rep} A$ is a tensor category; we denote the tensor product in $\operatorname{Rep} A$ by $\otimes_A$.

Theorem 3.2. The category $\operatorname{Rep} A$ is equivalent to the category $G\mathcal{V}ec$ of $G$-graded vector spaces. Under this equivalence, the functor $\otimes_A$ becomes the usual tensor product of vector spaces with the grading given by $\mathrm{wt}(v \otimes w) = \mathrm{wt}(v)\mathrm{wt}(w)$, and the functors $F$, $G$ are given by
\[ F(V) = V, \quad \text{forgetting the action of $G$ but keeping the grading,} \]
\[ G(V) = \mathcal{F}(G) \otimes V, \quad \text{with grading given by } \mathrm{wt}(\delta_g \otimes v) = g\,\mathrm{wt}(v)\,g^{-1}. \]
Proof. It is immediate from the definition that $\operatorname{Rep} A$ is the category of $G$-modules with $G \times G$-grading such that
\[ \mathrm{wt}(gv) = (gxg^{-1}, gy) \quad \text{if } \mathrm{wt}(v) = (x, y) \in G \times G. \]
Indeed, the action of $G$ and the first component of the grading define $V$ as an object of $\mathcal{C}$, and the second component of the grading defines the action of $A$: if $v \in V$ is homogeneous, then
\[ \delta_g v = \begin{cases} v, & \mathrm{wt}(v) = (\cdot, g) \\ 0, & \text{otherwise.} \end{cases} \]
Define a new $G \times G$-grading $\widetilde{\mathrm{wt}}$ by
\[ \widetilde{\mathrm{wt}}(v) = (y^{-1}xy, y) \quad \text{if } \mathrm{wt}(v) = (x, y) \]
(which implies that $\mathrm{wt}(v) = (y\tilde x y^{-1}, y)$ if $\widetilde{\mathrm{wt}}(v) = (\tilde x, y)$). Then
\[ \widetilde{\mathrm{wt}}(gv) = (\tilde x, gy) \quad \text{if } \widetilde{\mathrm{wt}}(v) = (\tilde x, y). \]
From this it immediately follows that the functor $\operatorname{Rep} A \to G\mathcal{V}ec$ given by
\[ V \mapsto \{v \in V \mid \mathrm{wt}(v) = (\cdot, 1)\} = \{v \in V \mid \widetilde{\mathrm{wt}}(v) = (\cdot, 1)\}, \tag{3.2} \]
considered with the $G$-grading given by the first component of $\mathrm{wt}$, is an equivalence of categories. The remaining statements of the theorem are straightforward.

Corollary 3.3. The set of simple objects in $\operatorname{Rep} A$ is $G$.

For future use we give a description of the corresponding simple objects $X_g$ in terms of $G$-graded vector spaces and in terms of $G \times G$-graded $G$-modules. As a $G$-graded vector space,
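\[ (X_g)_h = \begin{cases} \mathbb{C}, & g = h \\ 0, & \text{otherwise.} \end{cases} \]
As a bi-graded $G$-module, $X_g$ is given by
\[ X_g = \bigoplus_{y^{-1}xy = g} \mathbb{C} e_{x,y} \tag{3.3} \]
with $\mathrm{wt}(e_{x,y}) = (x, y)$, the action of $G$ and $A$ given by
\[ h e_{x,y} = e_{hxh^{-1}, hy}, \qquad \mu(\delta_h \otimes e_{x,y}) = \begin{cases} e_{x,y}, & h = y \\ 0, & \text{otherwise.} \end{cases} \]
Note that the first component of $\mathrm{wt}(v)$, $v \in X_g$, is supported on the conjugacy class of $g$. This description immediately implies the following result:
\[ G(X_g) = V_{g, \mathcal{F}(Z(g))} = \bigoplus_{\pi \in \widehat{Z(g)}} \pi \otimes V_{g,\pi}, \tag{3.4} \]
where $\pi$ is the representation space, considered with trivial action of $D(G)$.

Theorem 3.4. $\operatorname{Rep}^0 A = \mathcal{V}ec$, which is considered as a subcategory in $G\mathcal{V}ec$ consisting of spaces with grading identically equal to 1.

Proof. Let $V \in \operatorname{Rep} A$; for now, we consider $V$ as a $G \times G$-graded $G$-module, as in the proof of Theorem 3.2. Then explicit calculation shows that
\[ \check R_{VA} \check R_{AV}(\delta_g \otimes v) = \delta_{xg} \otimes v \quad \text{if } \mathrm{wt}(v) = (x, \cdot). \]
Therefore,
\[ \mu_V \check R_{VA} \check R_{AV}(\delta_g \otimes v) = \begin{cases} v, & xg = y \\ 0, & \text{otherwise,} \end{cases} \]
where $\mathrm{wt}(v) = (x, y)$. Comparing it with the usual formula for the action of $A$,
\[ \mu_V(\delta_g \otimes v) = \begin{cases} v, & g = y \\ 0, & \text{otherwise,} \end{cases} \]
we see that $\mu_V = \mu_V \check R_{VA} \check R_{AV}$ iff $\mathrm{wt}(v) = (1, \cdot)$ for all $v \in V$.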
For future use, we give here two more results about $D(G)$. First, define the map
\[ \tau \colon D(G) \to D(G), \qquad g\delta_h \mapsto g\delta_{h^{-1}}. \tag{3.5} \]
Then it is trivial to check the following properties.

Lemma 3.5.
(1) $\tau$ is an algebra automorphism.
(2) $\tau$ is a coalgebra anti-automorphism: $\Delta\tau(a) = (\tau \otimes \tau)(\Delta^{op} a)$, $\tau \circ S = S \circ \tau$, where $S$ is the antipode in $D(G)$.
(3) $(\tau \otimes \tau)(R) = R^{-1}$.

Thus, if we define for a representation $V$ of $D(G)$ a new representation $V^\tau$ which coincides with $V$ as a vector space but with the action of $D(G)$ twisted by $\tau$: $\pi_{V^\tau}(a) = \pi_V(\tau a)$, then one has canonical isomorphisms $(V \otimes W)^\tau = W^\tau \otimes V^\tau$, $(V^*)^\tau = (V^\tau)^*$ and thus, $\tau$ gives an equivalence
\[ \tau \colon (\operatorname{Rep} D(G))^{op} \xrightarrow{\ \sim\ } \operatorname{Rep} D(G), \tag{3.6} \]
where $(\operatorname{Rep} D(G))^{op}$ coincides with $\operatorname{Rep} D(G)$ as an abelian category but has tensor product and braiding defined by $V \otimes^{op} W = W \otimes V$, $R^{op} = R^{-1}$.

Second, note that $D(G)$ can be generalized as follows. Let $H \subset G$ be a normal subgroup. Define
\[ D(G, H) = \mathbb{C}[G] \ltimes \mathcal{F}(H). \tag{3.7} \]
One easily sees that it is a quotient of $D(G)$: $D(G, H) = D(G)/J_H$, where $J_H$ is the ideal generated by $\delta_g$, $g \in G \setminus H$ (this is a Hopf ideal). For $H = \{1\}$, we get $D(G, H) = \mathbb{C}[G]$; for $H = G$, $D(G, H) = D(G)$. One also easily sees that $\operatorname{Rep} D(G, H)$ is the subcategory of $\operatorname{Rep} D(G)$ consisting of representations such that $\mathrm{wt}(v) \in H$.

Theorem 3.6. Let $\mathcal{A} \subset \operatorname{Rep} D(G)$ be a full subcategory containing $\operatorname{Rep} G$ (which we consider as a subcategory in $\operatorname{Rep} D(G)$ as before), and closed under duality, tensor product, and taking sub-objects. Then $\mathcal{A} = \operatorname{Rep} D(G, H)$ for some normal subgroup $H \subset G$.

Proof. The proof is based on the following easily proved lemma.

Lemma 3.7. Let $g_1, g_2 \in G$ and let $\pi_1$, $\pi$ be irreducible representations of $Z(g_1)$, $Z(g_1 g_2)$ respectively. Then there exists an irreducible representation $\pi_2$ of $Z(g_2)$ such that $\langle V_{g_1,\pi_1} \otimes V_{g_2,\pi_2}, V_{g_1 g_2,\pi} \rangle \ne 0$.

Using this lemma with $g_2 = 1$, we see that if $V_{g_1,\pi_1} \in \mathcal{A}$ for some $\pi_1 \in \widehat{Z(g_1)}$, then $V_{g_1,\pi} \in \mathcal{A}$ for all $\pi \in \widehat{Z(g_1)}$. Using this lemma again, we see that the set $H = \{g \in G \mid V_{g,\pi} \in \mathcal{A}\}$ is closed under product. Since $V_{g,\pi} = V_{hgh^{-1}, \pi \circ h^{-1}}$, this set is also invariant under conjugation. Thus, $H$ is a normal subgroup in $G$.
We also need the following theorem.

Theorem 3.8. $\operatorname{Rep} D(G, H)$ is modular iff $H = G$.

Proof. It is well known that $\operatorname{Rep} D(G)$ is modular; thus, let us prove that if $H \ne G$ then $\operatorname{Rep} D(G, H)$ is not modular. As discussed above, simple objects in $\operatorname{Rep} D(G, H)$ are $V_{g,\pi}$, $g \in H$. Let $\pi$ be a formal linear combination of irreducible representations of $G$ such that $\operatorname{tr} \pi(h) = 0$ for all $h \in H$; such a $\pi$ always exists if $H \ne G$. Then it follows from explicit formulas for $s$ (see, e.g., [BK]) that
\[ s_{(1,\pi),(h,\pi')} = 0 \quad \text{for all } h \in H,\ \pi' \in \widehat{Z(h)}. \]
Thus, $s$ is singular.
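Two facts used in this section lend themselves to a quick machine check on a small group: the parametrisation of simple $D(G)$-modules by pairs $(g, \pi)$ up to conjugation, and Lemma 3.5(1). The sketch below is an illustration under stated assumptions: $G = S_3$; a basis element $g\delta_h$ of $D(G)$ is stored as the pair `(g, h)`; the product uses the relation $g\delta_h g^{-1} = \delta_{ghg^{-1}}$, which is the convention matching $gV_x \subset V_{gxg^{-1}}$; and the number of irreducible representations of $Z(g)$ is computed as the number of its conjugacy classes. All function names are hypothetical.

```python
from itertools import permutations, product

# Illustrative assumption: G = S_3, realised as permutations of {0, 1, 2}.
G = list(permutations(range(3)))

def mul(p, q):
    """Product pq of permutations (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))

def inv(p):
    """Inverse permutation."""
    out = [0] * len(p)
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)

def conj(g, x):
    """Conjugation g x g^{-1}."""
    return mul(mul(g, x), inv(g))

def conj_classes(H):
    """Conjugacy classes of a finite group H given as a list of elements."""
    classes, seen = [], set()
    for x in H:
        if x not in seen:
            cls = {conj(h, x) for h in H}
            classes.append(cls)
            seen |= cls
    return classes

def centralizer(g):
    """Z(g) = {h in G | hg = gh}."""
    return [h for h in G if mul(h, g) == mul(g, h)]

# Simple objects V_{g,pi}: one conjugacy class of g, one irreducible pi of
# Z(g); the number of irreducibles of Z(g) equals its number of classes.
num_simples = sum(
    len(conj_classes(centralizer(min(cls)))) for cls in conj_classes(G)
)

def dmul(a, b):
    """Product of basis elements (g delta_h)(g' delta_h') of D(G).

    Returns the resulting basis element, or None if the product is zero.
    """
    if a is None or b is None:
        return None
    (g, h), (g2, h2) = a, b
    if conj(inv(g2), h) != h2:  # delta_h g' = g' delta_{g'^{-1} h g'}
        return None
    return (mul(g, g2), h2)

def tau(a):
    """The map (3.5): g delta_h -> g delta_{h^{-1}}."""
    if a is None:
        return None
    g, h = a
    return (g, inv(h))

basis = list(product(G, repeat=2))
associative = all(
    dmul(dmul(a, b), c) == dmul(a, dmul(b, c))
    for a in basis for b in basis for c in basis
)
tau_multiplicative = all(
    tau(dmul(a, b)) == dmul(tau(a), tau(b)) for a in basis for b in basis
)
tau_involution = all(tau(tau(a)) == a for a in basis)

print(num_simples, associative, tau_multiplicative, tau_involution)
```

For $S_3$ the count is $3 + 2 + 3 = 8$: three representations of $Z(1) = S_3$, two of the centralizer of a transposition, and three of the centralizer of a 3-cycle.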
4. Twisted Modules

As before, we let $A$ be a rigid $\mathcal{C}$-algebra with $\theta_A = \mathrm{id}$, and $G$ a compact group acting faithfully on $A$ by automorphisms (by Corollary 2.12, this implies that $G$ is finite). For every $g \in G$, we will denote the corresponding automorphism of $A$ by $\pi_g$. We use the same notation $\operatorname{Rep} A$, $\operatorname{Rep}^0 A$, $\mu_V \colon A \otimes V \to V$ and functors $F \colon \mathcal{C} \to \operatorname{Rep} A$, $G \colon \operatorname{Rep} A \to \mathcal{C}$ as in [KO]. We will also use the same conventions in figures as in [KO], representing $A$ by a dashed line.

From now on, we will also assume that $A$ is such that $\operatorname{Rep}^0 A$ has a unique simple module, $A$ itself; thus, $\operatorname{Rep}^0 A \simeq \mathcal{V}ec$. This corresponds to the “holomorphic” case in conformal field theory; for this reason, we will call such $A$ “holomorphic”.

Definition 4.1. Let $g \in G$. A module $X \in \operatorname{Rep} A$ is called $g$-twisted if $\mu \check R^2 = \mu \circ (\pi_{g^{-1}} \otimes \mathrm{id}) \colon A \otimes X \to X$ (see Fig. 5). In particular, $X$ is 1-twisted iff $X \in \operatorname{Rep}^0 A$.

This definition is, of course, nothing but a rewriting in our language of the definition given in [DVVV].
[Fig. 5. Definition of $g$-twisted module]
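Example 4.2. In the situation of Sect. 3, the simple module $X_g$ is $g$-twisted. Indeed,
\[ \mu \check R^2(\delta_h \otimes e_{x,y}) = \begin{cases} e_{x,y}, & xh = y \\ 0, & \text{otherwise.} \end{cases} \]
But for $X_g$, $y^{-1}xy = g$, which trivially implies that $xh = y \iff hg = y$. Thus, $\mu \check R^2(\delta_h \otimes e_{x,y}) = \mu(\delta_{hg} \otimes e_{x,y})$.

The key result of this section is the following theorem.

Theorem 4.3. Assume that $\operatorname{Rep}^0 A = \mathcal{V}ec$. Then every simple object $X_i \in \operatorname{Rep} A$ is $g$-twisted for some $g = g(X_i) \in G$.

Proof. The proof is based on the following lemma.

Lemma 4.4. If $X \in \operatorname{Rep} A$ is simple and $A$ is holomorphic, then

[Graphical identity: a diagram in $X^*$ and $X$ with coefficient $\frac{1}{(\dim A)^2}$ equals a diagram in $X^*$ and $X$ with coefficient $\frac{1}{\dim X}$; figure omitted.]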
Note that $\dim X$ is non-zero in any semisimple rigid category in which $X^{**} \simeq X$ (see, e.g., [BK, Sect. 2.4]). In particular, this implies $\dim_{\operatorname{Rep} A} X \ne 0$ and thus, $\dim X = (\dim A)(\dim_{\operatorname{Rep} A} X) \ne 0$ (see [KO, Theorem 3.5]).

Proof. Let us rewrite the left-hand side as shown in Fig. 6. Using Lemma 1.15 and Lemma 5.3 from [KO], we see that the left-hand side is the composition
\[ X^* \otimes X \to X^* \otimes_A X \xrightarrow{\ P\ } (X^* \otimes_A X)^0, \]
where $P$ is the projector $\operatorname{Rep} A \to \operatorname{Rep}^0 A$, and for $V \in \operatorname{Rep} A$, $V^0 = P(V)$ is the maximal sub-object of $V$ which lies in $\operatorname{Rep}^0 A$. But if $A$ is holomorphic, then the only simple object in $\operatorname{Rep}^0 A$ is $A$ itself, which is the unit object in $\operatorname{Rep} A$. It appears in $X^* \otimes_A X$ with multiplicity one, and the right-hand side is exactly the projection of $X^* \otimes_A X$ on $A \subset X^* \otimes_A X$.
[Fig. 6.]
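For every $X \in \operatorname{Rep} A$, define the morphism $T_X \colon A \to A$ by the following graph:

[Graph defining $T_X$, involving $X$; figure omitted.] (4.1)

Lemma 4.5. If $X \in \operatorname{Rep} A$ is simple, then $\frac{1}{\dim X} T_X$ is an automorphism of $A$ as a $\mathcal{C}$-algebra.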
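Proof. Let us calculate $\mu \circ (T_X \otimes T_X)$. Using Lemma 4.4 we can rewrite the graph defining $\mu \circ (T_X \otimes T_X)$ as shown in Fig. 7. (Note that we need to use $\check R^2_{AA} = 1$ (Corollary 2.13) to move the “ring” through $A$; the last step also uses Lemma 1.10 from [KO].) This shows that $\mu \circ (T_X \otimes T_X) = (\dim X)\, T_X \circ \mu$. In other words, $\frac{1}{\dim X} T_X$ is a morphism of $\mathcal{C}$-algebras. But it easily follows from Theorem 2.15 that every such morphism is either zero or invertible. Restricting $T_X$ to $\mathbf{1} \subset A$, we see that $T_X$ is non-zero; thus, $\frac{1}{\dim X} T_X$ is an automorphism.

Lemma 4.6.

[Graphical identity between two diagrams in $X$, one of them containing $\frac{1}{\dim X} T_X$; as it is used in the proof of Theorem 4.3, it states $\mu \circ \check R^{-2} = \mu \circ (\frac{1}{\dim X} T_X \otimes \mathrm{id})$. Figure omitted.]

Proof. Using Lemma 4.4 and $\check R^2_{AA} = \mathrm{id}$, we can rewrite the left-hand side as shown in Fig. 8.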
[Fig. 7. Proof of Lemma 4.5: a chain of graphical equalities ending in $(\dim X)\, T_X \circ \mu$]
The statement of the theorem easily follows from these two lemmas. Indeed, combining Lemma 4.5 with Corollary 2.16 we see that $\frac{1}{\dim X} T_X = \pi_g$ for some $g \in G$. On the other hand, Lemma 4.6 gives $\mu \circ \check R^{-2} = \mu \circ (\frac{1}{\dim X} T_X \otimes \mathrm{id}) = \mu \circ (\pi_g \otimes \mathrm{id})$ and thus, $\mu \circ \check R^2 = \mu \circ (\pi_{g^{-1}} \otimes \mathrm{id})$. This completes the proof of Theorem 4.3.

Let us study some properties of the correspondence $X \mapsto g(X)$.

Theorem 4.7. Let $X \in \operatorname{Rep} A$ be simple. Then
(1) For $h \in G$, let $X^h \in \operatorname{Rep} A$ coincide with $X$ as an object of $\mathcal{C}$, but with a twisted action of $A$: $\mu_{X^h} = \mu_X \circ (\pi_h \otimes \mathrm{id})$. Then $g(X^h) = h^{-1} g(X) h$.
[Fig. 8. Proof of Lemma 4.6]
(2) $g(X) = 1$ iff $X \simeq A$.
(3) $g(X^*) = g(X)^{-1}$.
(4) If $X, Y \in \operatorname{Rep} A$ are simple then so is $X \otimes_A Y$, and $g(X \otimes_A Y) = g(X) g(Y)$.
(5) If $X, Y \in \operatorname{Rep} A$ are simple and non-isomorphic, then $g(X) \ne g(Y)$.

Proof. (1) is immediate from the definitions. (2) follows from the fact that the only simple object in $\operatorname{Rep}^0 A$ is $A$. To prove (3) and (4), we will use the following lemma, which easily follows from Lemma 1.15 in [KO].

Lemma 4.8. $T_{X \otimes_A Y} = \frac{1}{\dim A} T_X T_Y$.

This lemma implies that $T_{X \otimes_A X^*} = c\, \pi_{g(X) g(X^*)}$ for some $c \in \mathbb{C}$. On the other hand, $X \otimes_A X^* \simeq A \oplus \bigoplus_{i \ne 0} N_i X_i$, and $T_{X \otimes_A X^*} = \pi_1 + \sum c_i \pi_{g(X_i)}$. Since, by (2), $g(X_i) \ne 1$ for $i \ne 0$ and the operators $\pi_g$ are linearly independent, we see that these two expressions can be equal only if $X \otimes_A X^* \simeq A$, $g(X) g(X^*) = 1$.

To prove (4), note that $X^* \otimes_A (X \otimes_A Y) \simeq (X^* \otimes_A X) \otimes_A Y \simeq A \otimes_A Y \simeq Y$ is simple, which immediately implies that $X \otimes_A Y$ is simple. The identity $g(X \otimes_A Y) = g(X) g(Y)$ immediately follows from Lemma 4.8.

Finally, to prove (5) note that (3) and (4) imply $g(X) = g(Y) \iff g(X^*) g(Y) = g(X^* \otimes_A Y) = 1$. By (2), this is only possible if $X^* \otimes_A Y \simeq A$.

Corollary 4.9. The map $X \mapsto g(X)$ is a bijection between the set of isomorphism classes of simple objects in $\operatorname{Rep} A$ and some normal subgroup $H \subset G$.
We will denote the unique simple $g$-twisted object $X \in \operatorname{Rep} A$ by $X_g$; then Theorem 4.7 implies that
\[ X_g \otimes_A X_h \simeq X_{gh}. \tag{4.2} \]
Combining this with multiplicativity of dimension and [KO, Theorem 3.5], we see that
\[ g \mapsto \dim_A X_g = \frac{\dim X_g}{\dim A} \]
is a character of the group $H \subset G$. Since $H$ is a finite group, this immediately implies the following result.

Lemma 4.10. For any simple $X \in \operatorname{Rep} A$, $\left| \frac{\dim X}{\dim A} \right| = 1$.
In particular, if $\dim X \ge 0$ (which happens if $\mathcal{C}$ is a unitary category in the sense of Turaev), then this implies $\dim X_g = \dim A = |G|$.

5. Twisted Sectors

Now that we have a description of irreducible objects in $\operatorname{Rep} A$ in terms of the group $G$, we can move on to our ultimate goal: description of irreducible objects in $\mathcal{C}$. As before, we assume that $A$ is rigid, $\theta_A = \mathrm{id}$ and holomorphic ($\operatorname{Rep}^0 A = \mathcal{V}ec$), and $G$ is a finite group acting faithfully on $A$.

It is immediate from the identity $\langle M, G(X) \rangle_{\mathcal{C}} = \langle F(M), X \rangle_{\operatorname{Rep} A}$ (see [KO]) that every simple $L_i \in \mathcal{C}$ appears in the decomposition of some $X_g \in \operatorname{Rep} A$ (considered as an object of $\mathcal{C}$). Thus, our first goal is to study the decomposition
\[ X_g \simeq \bigoplus_i N_{g,i} L_i. \tag{5.1} \]
Note that it immediately follows from Theorem 4.7 that as an object of $\mathcal{C}$, $X_g \simeq X_{hgh^{-1}}$; thus, the multiplicities $N_{g,i}$ only depend on the conjugacy class of $g$.

Our strategy in studying decomposition (5.1) is parallel to the approach taken in Sect. 2 to study the decomposition of $A = X_1$. However, instead of the functor $\Phi \colon \operatorname{Rep} G \to \mathcal{C}$ which was defined using $A$, we will define a functor $\Phi \colon \operatorname{Rep} D(G) \to \mathcal{C}$ using $\tilde A = \bigoplus_{g \in G} X_g$, where $D(G)$ is the Drinfeld double of $G$ (see Sect. 3).

First of all, we need to define the algebra structure on $\tilde A$. To do so, note that it follows from Theorem 4.7 that for every $g, h \in G$ there exists a unique up to a constant isomorphism of $A$-modules
\[ \tilde\mu_{g,h} \colon X_g \otimes_A X_h \xrightarrow{\ \sim\ } X_{gh}. \tag{5.2} \]
Considering morphisms $X_g \otimes_A X_h \otimes_A X_k \to X_{ghk}$ obtained as compositions of $\tilde\mu$, we get
\[ \tilde\mu_{g,hk} \circ (1 \otimes_A \tilde\mu_{h,k}) = \omega(g, h, k)\, \tilde\mu_{gh,k} \circ (\tilde\mu_{g,h} \otimes_A 1) \tag{5.3} \]
for some $\omega(g, h, k) \in \mathbb{C}^\times$. One immediately sees that $\omega$ is a 3-cocycle on $G$ with values in $\mathbb{C}^\times$ and that rescaling $\tilde\mu$ by $\tilde\mu_{g,h} \to \tilde\mu_{g,h} \cdot f(g, h)$ results in replacing $\omega$ by a cohomologous cocycle. Thus, the class $[\omega] \in H^3(G, \mathbb{C}^\times)$ is well-defined.

To simplify the exposition, in this paper we only consider the simplest case $\omega \equiv 1$. The general case is similar but will involve a “twisted” version of the Drinfeld double, as in [DPR], and will be discussed elsewhere. Note also that rescaling $\tilde\mu_{g,h} \to c\, \tilde\mu_{g,h}$, where $c \in \mathbb{C}^\times$ is independent of $g, h$, does not change $\omega$; thus, without loss of generality we can assume that $\tilde\mu_{1,1} \colon A \otimes_A A \to A$ coincides with the multiplication map $\mu$.
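The effect of rescaling in (5.3) can be made explicit: substituting $\tilde\mu_{g,h} \to f(g,h)\,\tilde\mu_{g,h}$ multiplies $\omega(g,h,k)$ by $f(g,hk) f(h,k) / (f(gh,k) f(g,h))$. The following sketch checks, in exact rational arithmetic, that this factor satisfies the 3-cocycle condition for an arbitrary non-vanishing $f$. It assumes the cyclic group $\mathbb{Z}/3$ and a randomly chosen rational-valued $f$ purely for illustration; the names `op`, `coboundary` and `is_3cocycle` are hypothetical.

```python
from fractions import Fraction
from itertools import product
import random

# Illustrative assumption: G = Z/3 written additively.
n = 3
G = range(n)

def op(g, h):
    """Group operation of Z/n."""
    return (g + h) % n

# An arbitrary nowhere-vanishing 2-cochain f(g, h); rational values keep
# the arithmetic exact.
rng = random.Random(0)
f = {(g, h): Fraction(rng.randint(1, 9), rng.randint(1, 9))
     for g in G for h in G}

def coboundary(f):
    """Factor by which omega(g, h, k) changes when mu_{g,h} -> f(g,h) mu_{g,h}."""
    return {
        (g, h, k): f[g, op(h, k)] * f[h, k] / (f[op(g, h), k] * f[g, h])
        for g, h, k in product(G, repeat=3)
    }

def is_3cocycle(w):
    """3-cocycle condition:
    w(h,k,l) w(g,hk,l) w(g,h,k) = w(gh,k,l) w(g,h,kl)."""
    return all(
        w[h, k, l] * w[g, op(h, k), l] * w[g, h, k]
        == w[op(g, h), k, l] * w[g, h, op(k, l)]
        for g, h, k, l in product(G, repeat=4)
    )

trivial = {t: Fraction(1) for t in product(G, repeat=3)}
rescaled = coboundary(f)  # what omega = 1 becomes after rescaling by f

print(is_3cocycle(trivial), is_3cocycle(rescaled))
```

In particular, when $\omega \equiv 1$ the rescaled $\omega$ is exactly this coboundary, so the assumption $\omega \equiv 1$ is really an assumption on the class $[\omega]$.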
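Assumption 5.1. From now on, we assume that $\omega \equiv 1$ and $\tilde\mu_{1,1} = \mu$.

In this case, the morphism
\[ \tilde\mu = \bigoplus_{g,h} \tilde\mu_{g,h} \colon \tilde A \otimes_A \tilde A \to \tilde A \tag{5.4} \]
is associative. We will also use the same symbol $\tilde\mu$ for the composition
\[ \tilde A \otimes \tilde A \to \tilde A \otimes_A \tilde A \xrightarrow{\ \tilde\mu\ } \tilde A, \]
where the first morphism is the canonical projection. This morphism defines on $\tilde A$ the structure of an associative (but not commutative) $\mathcal{C}$-algebra.

Lemma 5.2. Under Assumption 5.1,
(1) The morphisms $\tilde\mu_{1,g} \colon A \otimes X_g \to X_g$, $\tilde\mu_{g,1} \colon X_g \otimes A \to X_g$ coincide with $\mu_{X_g}$, $\mu_{X_g} \circ \check R^{-1}_{A,X_g}$ respectively.
(2) The morphisms
\[ X_{g^{-1}} \otimes X_g \xrightarrow{\ \tilde\mu\ } X_1 = A \to \mathbf{1}, \qquad \mathbf{1} \to A = X_1 \xrightarrow{\ (\dim A)\, \tilde\mu^{-1}\ } X_g \otimes_A X_{g^{-1}} \to X_g \otimes X_{g^{-1}} \tag{5.5} \]
satisfy the rigidity axioms and thus define an isomorphism
\[ X_{g^{-1}} \simeq X_g^*. \tag{5.6} \]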
(We use the fact that $X \otimes_A Y$ is canonically a direct summand in $X \otimes Y$, see [KO, Corollary 1.16].)
(3) For all $g$, $\frac{\dim X_g}{\dim A} = 1$.

The proof is left to the reader as an exercise.

Now let us define an action of $G$ on $\tilde A$. It follows from Theorem 4.7 that for every $g, x \in G$ there exists a unique up to a constant $A$-morphism
\[ \varphi_x(g) \colon X_x^{g^{-1}} \xrightarrow{\ \sim\ } X_{gxg^{-1}}. \tag{5.7} \]
Equivalently, $\varphi_x(g)$ is a $\mathcal{C}$-morphism $X_x \to X_{gxg^{-1}}$ such that $\varphi_x(g) \circ \mu \circ (\pi_{g^{-1}} \otimes 1) = \mu \circ (1 \otimes \varphi_x(g)) \colon A \otimes X_x \to X_{gxg^{-1}}$.

Considering the composition $X_x \xrightarrow{\varphi_x(g)} X_{gxg^{-1}} \xrightarrow{\varphi_{gxg^{-1}}(h)} X_{hgx(hg)^{-1}}$ and using uniqueness, we see that
\[ \varphi_{gxg^{-1}}(h)\, \varphi_x(g) = c_x(g, h)\, \varphi_x(hg) \tag{5.8} \]
for some $c_x(g, h) \in \mathbb{C}^\times$. In particular, denoting by $Z(x)$ the centralizer of $x$: $Z(x) = \{g \in G \mid gx = xg\}$, we see that $g \mapsto \varphi_x(g)$ defines a projective action of $Z(x)$ on $X_x$.
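Lemma 5.3. If $\omega \equiv 1$, then $\varphi_x(g)$ can be chosen so that $c_x(g, h) \equiv 1$.

Proof. Define $\varphi_x(g)$ by Fig. 9 (where we used (5.6) to identify $X_g^* \simeq X_{g^{-1}}$). We leave it to the reader to check that the $\varphi$ so defined satisfies the associativity property (5.8) with $c_x(g, h) \equiv 1$.

[Fig. 9. Definition of $\varphi_x(g) \colon X_x^{g^{-1}} \xrightarrow{\sim} X_{gxg^{-1}}$: $\varphi_x(g)$ equals $\frac{1}{\dim A}$ times a graph built from $\tilde\mu$ and the objects $X_g^*$, $X_g$, $X_x$.]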
Example 5.4. Let $x = 1$, $X_1 = A$. Then it immediately follows from the construction given in the proof of Theorem 4.3 and Lemma 5.2 that $\varphi_1(g) = \frac{1}{\dim X_g} T_{X_g} = \pi_g$.

Lemma 5.5. If $\omega \equiv 1$ and $\varphi$ is defined as in Lemma 5.3, then $\varphi$ is compatible with $\tilde\mu$, i.e., the following diagram is commutative:
\[
\begin{array}{ccc}
X_x \otimes X_y & \xrightarrow{\ \tilde\mu\ } & X_{xy} \\
\downarrow \varphi_x(g) \otimes \varphi_y(g) & & \downarrow \varphi_{xy}(g) \\
X_{gxg^{-1}} \otimes X_{gyg^{-1}} & \xrightarrow{\ \tilde\mu\ } & X_{gxyg^{-1}}
\end{array}
\tag{5.9}
\]

Remark 5.6. In the general case ($\omega \ne 1$), it is easy to see that (5.9) is commutative up to a constant factor $\gamma_{x,y}(g) \in \mathbb{C}^\times$. We plan to show in a forthcoming paper that both $c_x(g, h)$ and $\gamma_{x,y}(g)$ can be expressed in terms of $\omega$ in a manner similar to [DPR, Eqs. 3.5.2, 3.5.3].

Denote
\[ \varphi(g) = \bigoplus_x \varphi_x(g) \colon \tilde A \to \tilde A. \]
Then we have the following result. (We assume that the reader is familiar with the definition and properties of the Drinfeld double $D(G)$ of a finite group $G$; these results are briefly reviewed in Sect. 3.)

Theorem 5.7. Let $\omega \equiv 1$ and $\varphi$ be defined as in Lemma 5.3. Define the map $\varphi \colon D(G) \to \operatorname{End}_{\mathcal{C}} \tilde A$ by
\[ g \mapsto \varphi(g), \qquad \delta_h \mapsto p_h, \tag{5.10} \]
where $p_h \colon \bigoplus_x X_x \to \bigoplus_x X_x$ is the projection on $X_h$. Then $\varphi$ defines an action of $D(G)$ on $\tilde A$ by $\mathcal{C}$-endomorphisms. This action preserves the multiplication $\tilde\mu$: for every $x \in D(G)$, $\tilde\mu \circ (\varphi \otimes \varphi)\Delta(x) = \varphi(x) \circ \tilde\mu$.

Proof. Immediately follows from the commutation relations in $D(G)$, Lemma 5.3 and Lemma 5.5.

Note that it follows from Example 5.4 that the restriction of $\varphi$ to $\mathbb{C}[G] \subset D(G)$, $A \subset \tilde A$ coincides with the original action of $G$ on $A$.

Thus, we have a situation analogous to that of Sect. 2: we have an associative $\mathcal{C}$-algebra $\tilde A$ on which $D(G)$ acts by endomorphisms. Analogously to the definition in Sect. 1, let $\mathcal{C}[D(G)]$ be the category with objects: pairs ($M \in \mathcal{C}$, $\rho \colon D(G) \to \operatorname{End}_{\mathcal{C}}(M)$) and morphisms: $\mathcal{C}$-morphisms commuting with the action of $D(G)$. We have the following results, which are parallel to those given in Sect. 1 for $\mathcal{C}[G]$.

Lemma 5.8.
(1) $\mathcal{C}[D(G)]$ has a canonical structure of a rigid monoidal category.
(2) Both $\operatorname{Rep} D(G)$ and $\mathcal{C}$ are naturally subcategories in $\mathcal{C}[D(G)]$. This, in particular, allows us to define the functor of exterior tensor product $\boxtimes \colon \operatorname{Rep} D(G) \times \mathcal{C}[D(G)] \to \mathcal{C}[D(G)]$.
(3) $\mathcal{C}[D(G)]$ is a semisimple abelian category with simple objects $V_{g,\pi} \boxtimes L_i$.
(4) $\mathcal{C}[D(G)]$ is braided with the commutativity isomorphism
\[ \check R^D = \check R \circ R^{D(G)}, \tag{5.11} \]
where $R^{D(G)}$ is the R-matrix of $D(G)$.
(5) For $V \in \mathcal{C}$, $W \in \operatorname{Rep} D(G)$ considered as objects in $\mathcal{C}[D(G)]$, one has $(\check R^D)^2_{VW} = \mathrm{id}$.

The proof of this lemma is left to the reader as an exercise. Note that unlike the $\mathcal{C}[G]$ case, $\check R^D$ does not coincide with the usual commutativity isomorphism in $\mathcal{C}$.

We can also define the notion of “$D(G)$ invariants”. Namely, define functors
\[ \operatorname{Rep} D(G) \to \mathcal{V}ec, \qquad V \mapsto V^{D(G)} = \operatorname{Hom}_{D(G)}(\mathbb{C}, V) \]
and
\[ \mathcal{C}[D(G)] \to \mathcal{C}, \qquad \bigoplus V_i \boxtimes L_i \mapsto \bigoplus V_i^{D(G)} \boxtimes L_i \]
(or, more formally, by $\operatorname{Hom}_{\mathcal{C}}(L, M^{D(G)}) = \operatorname{Hom}_{\mathcal{C}[D(G)]}(L, M)$, where $L \in \mathcal{C}$ is considered as an object of $\mathcal{C}[D(G)]$ with trivial action of $D(G)$). Using semisimplicity of $D(G)$, one easily sees that $M^{D(G)}$ is canonically a direct summand in $M$, and that for every $L, M \in \mathcal{C}[D(G)]$ there is a canonical embedding $L^{D(G)} \otimes M^{D(G)} \to (L \otimes M)^{D(G)}$.

Theorem 5.9. $\tilde A$ is an associative commutative algebra in $\mathcal{C}[D(G)]$ (with multiplication $\tilde\mu$ and action of $D(G)$ defined as in Theorem 5.7), and $\tilde A^{D(G)} = A^G = \mathbf{1}$.
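Proof. The only part which is not obvious is the fact that $\tilde A$ is commutative (with respect to $\check R^D$, not $\check R$!), i.e. that the composition
\[ \tilde A \otimes \tilde A \xrightarrow{\ R^{D(G)}\ } \tilde A \otimes \tilde A \xrightarrow{\ \check R\ } \tilde A \otimes \tilde A \xrightarrow{\ \tilde\mu\ } \tilde A \]
coincides with $\tilde\mu$. To prove it, note that the explicit formula (3.1) for $R^{D(G)}$ shows that this composition, when restricted to $X_{g_1} \otimes X_{g_2}$, is equal to
\[ X_{g_1} \otimes X_{g_2} \xrightarrow{\ 1 \otimes \varphi(g_1)\ } X_{g_1} \otimes X_{g_1 g_2 g_1^{-1}} \xrightarrow{\ \check R\ } X_{g_1 g_2 g_1^{-1}} \otimes X_{g_1} \xrightarrow{\ \tilde\mu\ } X_{g_1 g_2}. \]
Using the presentation of $\varphi(g_1)$ given in Fig. 9 and associativity of $\tilde\mu$, we can rewrite it as shown in Fig. 10, which shows that it is equal to $\tilde\mu$.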
[Fig. 10. Proof of commutativity of $\tilde\mu$ with respect to $\check R^D$]
Now, define the functor $\Phi \colon \operatorname{Rep} D(G) \to \mathcal{C}$ by
\[ \Phi(V) = (V \otimes \tilde A)^{D(G)} \tag{5.12} \]
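(cf. (2.1)) and the functorial morphism $J \colon \Phi(V) \otimes \Phi(W) \to \Phi(W \otimes V)$ (note that it reverses the order!) by
\[
\begin{aligned}
(V \otimes \tilde A)^{D(G)} \otimes (W \otimes \tilde A)^{D(G)} &\xrightarrow{\ \sim\ } \bigl( (V \otimes \tilde A)^{D(G)} \otimes W \otimes \tilde A \bigr)^{D(G)} \\
&\xrightarrow{\ \check R^D\ } \bigl( W \otimes (V \otimes \tilde A)^{D(G)} \otimes \tilde A \bigr)^{D(G)} \to (W \otimes V \otimes \tilde A \otimes \tilde A)^{D(G)} \\
&\xrightarrow{\ \tilde\mu\ } (W \otimes V \otimes \tilde A)^{D(G)} = \Phi(W \otimes V)
\end{aligned}
\tag{5.13}
\]
(cf. Theorem 2.2). Please note that by Lemma 5.8(5), $\check R^D$ in the second line can be replaced by $(\check R^D)^{-1}$; this will be used in the future. The definition of $J$ is illustrated in Fig. 11, where, unlike all previous figures in this paper, crossings represent $\check R^D$ and not $\check R$, and the dashed line represents the object $\tilde A$. The boxes are the canonical projections $V \otimes \tilde A \to (V \otimes \tilde A)^{D(G)}$ and embeddings $(V \otimes \tilde A)^{D(G)} \to V \otimes \tilde A$.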
[Fig. 11. Definition of $J$]
Theorem 5.10. $J$ is compatible with associativity and unit isomorphisms and reverses the commutativity isomorphism, i.e. $J \circ \check R^{-1} = \Phi(\check R^{D(G)}) \circ J$.

Proof. The only statement which is not obvious is the one about the commutativity isomorphisms, the proof of which is shown in Fig. 12, which uses the same conventions as Fig. 11.

Now, repeating with obvious changes the same steps as in Sect. 2, we get the following results. We denote by $\widehat{D(G)}$ the set of isomorphism classes of irreducible representations of $D(G)$, and for each $\lambda \in \widehat{D(G)}$ we choose a representative $V_\lambda$.

Theorem 5.11.
(1) For any $\lambda \in \widehat{D(G)}$, $\Phi(V_\lambda)$ is either zero or an irreducible object in $\mathcal{C}$.
[Fig. 12. Chain of graphical equalities $J \check R = \dots = \Phi(\check R^{-1}) J$]
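(2) If $\lambda, \mu \in \widehat{D(G)}$ are such that $\Phi(V_\lambda)$, $\Phi(V_\mu)$ are non-zero, and $\lambda \ne \mu$, then $\Phi(V_\lambda) \not\simeq \Phi(V_\mu)$.
(3) If $\lambda \in \widehat{D(G)}$ and $\Phi(V_\lambda) \ne 0$, then $J \colon \Phi(V_\lambda) \otimes \Phi(V) \to \Phi(V \otimes V_\lambda)$ is an isomorphism.
(4) Let $I = \{\lambda \in \widehat{D(G)} \mid \Phi(V_\lambda) \ne 0\}$, and let $\mathcal{A}$ be the abelian subcategory in $\operatorname{Rep} D(G)$ generated by $V_\lambda$, $\lambda \in I$. Then $\mathcal{A}$ is a subcategory in $\operatorname{Rep} D(G)$ which is closed under taking submodules, tensor product, and duality.

Combining this last part with Theorem 2.11 and Theorem 3.6, we get the following theorem, which is the main result of this paper.

Theorem 5.12. Assume that the action of $G$ is faithful and $\omega \equiv 1$. Let $H \subset G$ be the normal subgroup defined in Corollary 4.9. Then the functor $\Phi \colon \operatorname{Rep} D(G) \to \mathcal{C}$ defined by (5.12) is an equivalence $(\operatorname{Rep} D(G, H))^{op} \xrightarrow{\sim} \mathcal{C}$. (The category $\operatorname{Rep} D(G, H)$ is defined in (3.7) and $(\operatorname{Rep} D(G, H))^{op}$ is $\operatorname{Rep} D(G, H)$ with opposite tensor product, see (3.6).)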
If, in addition, $\mathcal{C}$ is modular then $H = G$ and thus, $\Phi$ is an equivalence $(\operatorname{Rep} D(G))^{op} \simeq \mathcal{C}$.

Corollary 5.13. Let $\tau \colon \operatorname{Rep} D(G) \to (\operatorname{Rep} D(G))^{op}$ be as defined in (3.5). Then the composition $\Phi \circ \tau \colon V \mapsto \Phi(V^\tau)$ is an equivalence of tensor categories $\operatorname{Rep} D(G, H) \xrightarrow{\sim} \mathcal{C}$. If $\mathcal{C}$ is modular, this gives an equivalence $\operatorname{Rep} D(G) \xrightarrow{\sim} \mathcal{C}$.

Combining this result with the results of [KO], we get the theorem formulated in the introduction.

Corollary 5.14.
(1) For every $g \in H$, $X_g$ considered as an object of $\mathcal{C}$ has the decomposition
\[ X_g \simeq \bigoplus_\pi \pi \boxtimes \Phi(V^*_{g,\pi}), \]
where the sum is over $\pi \in \widehat{Z(g)}$, and $\pi$ is the vector space of the representation $\pi$.
(2) For every simple object $L_i \in \mathcal{C}$ there exists a unique conjugacy class $C$ in $G$ such that $L_i$ appears in the decomposition of $X_g$ iff $g \in C$.

Proof. Follows from the previous corollary and Eq. (3.4).
6. D(G) Revisited

It is instructive to explicitly describe the constructions of the previous sections, and in particular the equivalence $\operatorname{Rep} D(G) \xrightarrow{\sim} \mathcal{C}$, in the case when $\mathcal{C} = \operatorname{Rep} D(G)$, $A = \mathcal{F}(G)$, so that $G$ acts on $A$ by automorphisms and $\operatorname{Rep}^0 A \xrightarrow{\sim} \mathcal{V}ec$ (see Sect. 3). It is natural to expect that in this case, the functor $\mathcal{C} \xrightarrow{\sim} \operatorname{Rep} D(G)$ is the identity functor; as we will show, this is indeed so.

We already have an explicit description of the modules $X_g$. The “multiplication” map $\tilde\mu \colon X_{g_1} \otimes X_{g_2} \to X_{g_1 g_2}$ is given by $e_{x_1,y_1} \otimes e_{x_2,y_2} \mapsto \delta_{y_1,y_2}\, e_{x_1 x_2, y_1}$. We leave it to the reader to check that this map is indeed a morphism of $A$-modules, and is associative. Explicit calculation also shows that the map $\varphi_x(g) \colon X_x \to X_{gxg^{-1}}$ defined by Fig. 9 can be explicitly written as $\varphi_x(g) \colon e_{a,b} \mapsto e_{a,bg^{-1}}$.

The object
\[ \tilde A = \bigoplus_g X_g = \bigoplus_{x,y} \mathbb{C} e_{x,y} \tag{6.1} \]
is a $D(G) \otimes D(G)$-module: the first action is defined as in Theorem 5.7 and the second action comes from the fact that each $X_g$ is an object of $\mathcal{C} = \operatorname{Rep} D(G)$. Explicitly, these two actions are written as follows:
\[ \pi^1(g)\, e_{x,y} = e_{x,\,yg^{-1}}, \qquad \pi^1(\delta_h)\, e_{x,y} = \delta_{h,\,y^{-1}xy}\, e_{x,y}, \tag{6.2} \]
\[ \pi^2(g)\, e_{x,y} = e_{gxg^{-1},\,gy}, \qquad \pi^2(\delta_h)\, e_{x,y} = \delta_{h,x}\, e_{x,y}. \tag{6.3} \]
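The two actions (6.2) and (6.3) can be sanity-checked by brute force on a small group. The sketch below is only an illustration: the choice $G = S_3$, the dictionary representation of vectors, and all function names are assumptions made here, and the covariance relation $g\,\delta_h\,g^{-1} = \delta_{ghg^{-1}}$ is the standard relation in the Drinfeld double, assumed rather than quoted from the text.

```python
from itertools import permutations

# Elements of S3 as permutation tuples; S3 is only an illustrative choice of G.
G = list(permutations(range(3)))

def mul(a, b):
    """Group product: (a*b)(i) = a[b[i]]."""
    return tuple(a[i] for i in b)

def inv(a):
    """Inverse permutation."""
    out = [0] * len(a)
    for i, ai in enumerate(a):
        out[ai] = i
    return tuple(out)

def conj(g, x):
    """Conjugation g x g^{-1}."""
    return mul(mul(g, x), inv(g))

# A vector is a dict {(x, y): coefficient}; the basis vector e_{x,y} is {(x, y): 1}.
def pi1_g(g, v):       # (6.2): e_{x,y} -> e_{x, y g^{-1}}
    return {(x, mul(y, inv(g))): c for (x, y), c in v.items()}

def pi1_delta(h, v):   # (6.2): e_{x,y} -> delta_{h, y^{-1} x y} e_{x,y}
    return {(x, y): c for (x, y), c in v.items() if conj(inv(y), x) == h}

def pi2_g(g, v):       # (6.3): e_{x,y} -> e_{g x g^{-1}, g y}
    return {(conj(g, x), mul(g, y)): c for (x, y), c in v.items()}

def pi2_delta(h, v):   # (6.3): e_{x,y} -> delta_{h, x} e_{x,y}
    return {(x, y): c for (x, y), c in v.items() if x == h}

def check_actions():
    """Check that pi^1 and pi^2 commute and each respects the assumed D(G) relations."""
    for x in G:
        for y in G:
            e = {(x, y): 1}
            for g in G:
                for h in G:
                    # the two actions commute
                    if pi1_g(g, pi2_g(h, e)) != pi2_g(h, pi1_g(g, e)):
                        return False
                    if pi1_delta(h, pi2_g(g, e)) != pi2_g(g, pi1_delta(h, e)):
                        return False
                    if pi1_g(g, pi2_delta(h, e)) != pi2_delta(h, pi1_g(g, e)):
                        return False
                    # group elements act multiplicatively
                    if pi1_g(g, pi1_g(h, e)) != pi1_g(mul(g, h), e):
                        return False
                    if pi2_g(g, pi2_g(h, e)) != pi2_g(mul(g, h), e):
                        return False
                    # assumed relation g delta_h g^{-1} = delta_{g h g^{-1}}
                    ghg = conj(g, h)
                    if pi1_g(g, pi1_delta(h, pi1_g(inv(g), e))) != pi1_delta(ghg, e):
                        return False
                    if pi2_g(g, pi2_delta(h, pi2_g(inv(g), e))) != pi2_delta(ghg, e):
                        return False
    return True
```

Calling `check_actions()` loops over all basis vectors and all pairs of group elements; a finite check of this kind does not replace the general argument, but it catches index mistakes in formulas such as (6.2) and (6.3).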
A. Kirillov, Jr.
This bi-module admits a more explicit description. Namely, consider $D(G)$ as a $D(G) \otimes D(G)$-module by $\pi^1(x)a = xa$, $\pi^2(x)a = aS(x)$, where $S$ is the antipode in $D(G)$. Since $D(G)$ is semisimple as an associative algebra, we have
\[ D(G) \xrightarrow{\sim} \bigoplus V_{g,\pi} \otimes V_{g,\pi}^* \]
considered as a $D(G) \otimes D(G)$-module: the first copy of $D(G)$ acts on the first factor in the tensor product, the second on the second. This also shows that the functor $V \mapsto (V \otimes D(G))^{D(G)}$ (we consider invariants with respect to the action of $D(G)$ on the tensor product given by $\pi^1$; thus, this space becomes a $D(G)$-module via $\pi^2$) can be canonically identified with the identity functor.

How is this related to $\tilde A$? The answer is given by the following simple lemma.

Lemma 6.1. Let $\tau : D(G) \to D(G)$ be defined by (3.5), and let $D(G)^\tau$ be $D(G)$ considered as a $D(G) \otimes D(G)$-module by $\pi^1(x)a = xa$, $\pi^2(x)a = aS(\tau x)$. Then the map
\[ \tilde A = \bigoplus \mathbb{C}\, e_{x,y} \to D(G)^\tau, \qquad e_{x,y} \mapsto y^{-1}\delta_x \]
is an isomorphism of $D(G) \otimes D(G)$-modules.

The proof is obtained by direct calculation. Note: the map $\mu : \tilde A \otimes \tilde A \to \tilde A$ does not coincide with the multiplication in $D(G)$!

Corollary 6.2.
(1) As a $D(G) \otimes D(G)$-module, $\tilde A \simeq \bigoplus V^\tau_{g,\pi} \otimes V^*_{g,\pi}$.
(2) The functor $V \mapsto (V \otimes \tilde A)^{D(G)}$ is canonically isomorphic to $\tau$.
Thus, the composition with $\tau$ described in Corollary 5.13 can be identified with the identity functor, as should have been expected.

References

[BK] Bakalov, B., Kirillov, A., Jr.: Lectures on tensor categories and modular functors. Providence, RI: Amer. Math. Soc., 2000
[DLM1] Dong, C., Li, H., Mason, G.: Compact Automorphism Groups of Vertex Operator Algebras. International Math. Research Notices 18, 913–921 (1996)
[DLM2] Dong, C., Li, H., Mason, G.: Regularity of rational vertex operator algebras. Adv. in Math. 132, 148–166 (1997)
[DLM3] Dong, C., Li, H., Mason, G.: Twisted representations of vertex operator algebras. Math. Ann. 310, 571–600 (1998)
[DLM4] Dong, C., Li, H., Mason, G.: Vertex Operator Algebras and Associative Algebras. J. of Algebra 206, 67–96 (1998)
[DM1] Dong, C., Mason, G.: Nonabelian Orbifolds and the Boson–Fermion Correspondence. Commun. Math. Phys. 163, 523–559 (1994)
[DM2] Dong, C., Mason, G.: On quantum Galois theory. Duke Math. J. 86, 305–321 (1997)
[DM3] Dong, C., Mason, G.: Quantum Galois theory for compact Lie groups. J. Algebra 214, 92–102 (1999)
[DM4] Dong, C., Mason, G.: Radical of a vertex operator algebra associated to a module. arXiv:math.QA/9904155
[DM5] Dong, C., Mason, G.: Vertex operator algebras and their automorphism groups. In: Proceedings of International Conference on Representation Theory (Shanghai, 1998), China Higher Education Press and Springer-Verlag, to appear
[DPR] Dijkgraaf, R., Pasquier, V., Roche, P.: Quasi Hopf algebras, group cohomology and orbifold models. Nucl. Phys. B (Proc. Suppl.) 18B, 60–72 (1990)
[DVVV] Dijkgraaf, R., Vafa, C., Verlinde, E., Verlinde, H.: The operator algebra of orbifold models. Commun. Math. Phys. 123, 485–526 (1989)
[DY] Dong, C., Yamskulna, G.: Vertex operator algebras, generalized doubles and dual pairs. arXiv:math.QA/0006005
[FMS] Di Francesco, P., Mathieu, P., Sénéchal, D.: Conformal field theory. Graduate Texts in Contemporary Physics. New York: Springer-Verlag, 1997
[FS] Fuchs, J., Schweigert, C.: Lie algebra automorphisms in conformal field theory. arXiv:math.QA/0011160
[Ka] Kassel, C.: Quantum groups. Graduate Texts in Mathematics, 155. New York: Springer-Verlag, 1995
[KO] Kirillov, A., Jr., Ostrik, V.: On q-analog of McKay correspondence and ADE classification of $\widehat{\mathfrak{sl}}_2$ conformal field theories. arXiv:math.QA/0101219
[KT] Kac, V.G., Todorov, I.T.: Affine orbifolds and rational conformal field theory extensions of $W_{1+\infty}$. Commun. Math. Phys. 190, 57–111 (1997)

Communicated by L. Takhtajan
Commun. Math. Phys. 229, 337–346 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0685-4
Communications in
Mathematical Physics
Invariance Principles for Interval Maps with an Indifferent Fixed Point
Mark Pollicott, Richard Sharp
Department of Mathematics, University of Manchester, Oxford Road, Manchester M13 9PL, Manchester, England
Received: 15 February 2002 / Accepted: 26 March 2002
Published online: 31 July 2002 – © Springer-Verlag 2002
Abstract: In this note we establish an almost sure invariance principle for a large class of interval maps with indifferent fixed points, including the Manneville-Pomeau map. This implies a number of well-known corollaries, including the Weak Invariance Principle and the Law of the Iterated Logarithm.

0. Introduction

It is a classical problem in ergodic theory to understand the statistical properties of typical orbits. For example, the Birkhoff ergodic theorem describes the average behaviour of such orbits and the Central Limit Theorem describes the deviation from this average. These results are subsumed by more general invariance principles. The situation for uniformly hyperbolic systems is reasonably well understood. In this note, we shall study a particular class of non-uniformly hyperbolic systems.

Let $T : X \to X$ be a continuous transformation of the interval $X = [0, 1]$ preserving an absolutely continuous probability measure $\mu$. Assume that $T$ is expanding, except at an indifferent fixed point. More precisely, for $0 < \alpha < 1$, we consider the class ᑣα of $C^2$ interval maps $T : X \to X$, with a fixed point $T(0) = 0$, such that:
(i) $T'(0) = 1$;
(ii) $T'(x) > 1$ for $0 < x \le 1$;
(iii) there exists $c \ne 0$ such that $\lim_{x \to 0} T''(x)\, x^{1-\alpha} = c$.

Any transformation $T \in$ ᑣα has an absolutely continuous invariant probability measure $\mu$. A simple example is provided by the Manneville-Pomeau map $T_\alpha : [0, 1] \to [0, 1]$ defined by $T_\alpha(x) = x + x^{1+\alpha} \pmod 1$, for $0 < \alpha < 1$.

Let $\phi : X \to \mathbb{R}$ be a Hölder continuous function with $\int \phi\, d\mu = 0$. We say that $\phi$ is a coboundary if there exists $u \in C^0(X, \mathbb{R})$ such that $\phi = u \circ T - u$. We introduce the sequence $\phi^n(x) = \phi(x) + \phi(Tx) + \dots + \phi(T^{n-1}x)$, for each $n \ge 1$.
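As a concrete illustration of the objects just introduced, the following sketch evaluates the Manneville-Pomeau map and the sums $\phi^n(x)$ numerically. The function names are hypothetical, floating-point orbits are only a rough stand-in for true orbits, and no attempt is made here to centre $\phi$ with respect to $\mu$.

```python
def manneville_pomeau(x, alpha):
    """One step of T_alpha(x) = x + x^(1 + alpha) (mod 1) on [0, 1]."""
    return (x + x ** (1.0 + alpha)) % 1.0

def birkhoff_sum(phi, x, n, alpha):
    """Return phi^n(x) = phi(x) + phi(Tx) + ... + phi(T^(n-1) x)."""
    total = 0.0
    for _ in range(n):
        total += phi(x)                   # add the observable at the current point
        x = manneville_pomeau(x, alpha)   # move one step along the orbit
    return total
```

For instance, `birkhoff_sum(lambda x: x, 0.25, 2, 0.5)` adds $\phi(0.25)$ and $\phi(T_{1/2}(0.25))$ for the identity observable.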
Under the hypothesis that $0 < \alpha < \frac{1}{2}$, Young [19] and Liverani, Saussol and Vaienti [14] established the Central Limit Theorem, i.e., provided $\phi$ is not a coboundary then
\[ \lim_{n \to \infty} \mu\left\{ x : \phi^n(x) < \sqrt{n}\,\sigma t \right\} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{t} e^{-u^2/2}\, du, \]
where $\sigma^2 = \int \phi(x)^2\, d\mu + 2 \sum_{n=1}^{\infty} \int \phi(T^n x)\phi(x)\, d\mu > 0$.

We shall now consider a stronger result than the Central Limit Theorem (cf. [4, p. 35]). Define a sequence of functions $\zeta_n : X \to C^0([0, 1], \mathbb{R})$ by
\[ \zeta_n(x) : t \mapsto \frac{1}{\sigma\sqrt{n}} \left( \phi^{[tn]}(x) + (nt - [nt])\,\phi(T^{[nt]+1} x) \right), \quad n \ge 1, \]
for $0 \le t \le 1$, i.e., $\zeta_n(x)$ is the piecewise linear function on $[0, 1]$ defined by interpolating between the values $\zeta_n(x)(k/n) = (\sigma\sqrt{n})^{-1} \phi^k(x)$, for $k = 0, \dots, n$. The Weak Invariance Principle asserts that the measures $(\zeta_n)_* \mu$ converge weakly on $C^0([0, 1], \mathbb{R})$ to the standard Wiener measure [9]. Our first main result is the following.

Theorem 1. Suppose that $0 < \alpha < \frac{1}{3}$. Let $T \in$ ᑣα and let $\phi : X \to \mathbb{R}$ be a Hölder continuous function with $\int \phi\, d\mu = 0$. Then the Weak Invariance Principle holds provided $\phi$ is not a coboundary.
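The interpolated path $\zeta_n(x)$ is easy to build once the sums $\phi^k(x)$, $k = 0, \dots, n$, are available. The sketch below follows the description "piecewise linear interpolation between the values at $t = k/n$"; the function name and the convention that the caller supplies both the list of sums and a value for $\sigma$ are assumptions made for illustration.

```python
import math

def zeta_path(partial_sums, sigma, t):
    """Evaluate the piecewise linear path at time t in [0, 1].

    partial_sums[k] is assumed to hold phi^k(x) for k = 0..n (so partial_sums[0] = 0),
    and sigma is assumed to be a positive number supplied by the caller.
    """
    n = len(partial_sums) - 1
    k = min(int(math.floor(n * t)), n - 1)   # left node index; t = 1 uses the last segment
    frac = n * t - k                         # position inside the segment [k/n, (k+1)/n]
    value = partial_sums[k] + frac * (partial_sums[k + 1] - partial_sums[k])
    return value / (sigma * math.sqrt(n))
```

At the nodes $t = k/n$ the function returns $(\sigma\sqrt{n})^{-1}\phi^k(x)$, and between nodes it interpolates linearly, which is all the weak invariance principle needs from the path.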
Our second theorem relates to the Law of the Iterated Logarithm. This describes the growth of the sums $\phi^n$ and asserts that
\[ \limsup_{n \to +\infty} \frac{1}{\sigma\sqrt{2n \log\log n}}\, \phi^n(x) = 1, \quad \text{a.e.}(\mu). \]

Theorem 2. Suppose that $0 < \alpha < \frac{1}{3}$. Let $T \in$ ᑣα and let $\phi : X \to \mathbb{R}$ be a Hölder continuous function with $\int \phi\, d\mu = 0$. Then the Law of the Iterated Logarithm holds provided $\phi$ is not a coboundary.

A functional form of the Law of the Iterated Logarithm can also be deduced [4, p. 36].

Our interest in these problems was motivated by the earlier work of Liverani, Saussol and Vaienti [14], Young [19] and Isola [11] on central limit theorems for indifferent maps, and the papers of Denker and Philipp [5] and Field, Melbourne and Török [7] on almost sure invariance principles for flows and skew products. Invariance principles for maps admitting an infinite invariant measure were studied in [3].

1. Transfer Operators, Martingales and Invariance Principles

In this section we shall first describe the induced transformation and associated transfer operator. Let us choose $a_0$ with $T(a_0) = 0$ such that, for the subinterval $Y = [a_0, 1] \subset X$, the map $T : Y \to X$ is a homeomorphism. Given any point $x \in Y$ we can define the first return time to $Y$ by $R(x) = \inf\{n \ge 1 : T^n x \in Y\}$. We can define an induced transformation $S : Y \to Y$ by $S(x) = T^{R(x)}(x)$. The transformation $S : Y \to Y$ preserves an absolutely continuous invariant measure $m$, which is the (normalized) restriction to $Y$ of the measure $\mu$ on $X$.
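The first return time and the induced map can be sketched numerically for the Manneville-Pomeau example. In the code below, `left_endpoint` locates $a_0$ as the solution of $a + a^{1+\alpha} = 1$ by bisection, and `first_return` iterates until the orbit re-enters $Y = [a_0, 1]$. All names, the iteration cap, and the use of floating point are assumptions of this sketch.

```python
def mp_map(x, alpha):
    """Manneville-Pomeau map x -> x + x^(1 + alpha) (mod 1)."""
    return (x + x ** (1.0 + alpha)) % 1.0

def left_endpoint(alpha, iters=80):
    """Bisection for a_0 in (0, 1) with a_0 + a_0^(1 + alpha) = 1."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + mid ** (1.0 + alpha) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def first_return(x, alpha, a0, max_iter=10**6):
    """Return (R(x), S(x)) for x in Y = [a0, 1], where S(x) = T^{R(x)}(x)."""
    n = 0
    while n < max_iter:
        x = mp_map(x, alpha)
        n += 1
        if x >= a0:          # the orbit is back in Y
            return n, x
    # points such as x = 1 land on the fixed point 0 and never return
    raise RuntimeError("no return to Y within max_iter steps")
```

Points of $Y$ close to $a_0$ are mapped close to the indifferent fixed point and need many steps to return, which is the source of the heavy tail of $R$ used later in the argument.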
Let us assume, for simplicity of notation, that $T$ has two inverse branches $T_0 : X \to [0, a_0]$ and $T_1 : X \to [a_0, 1]$ and let us denote $a_n = T_0^n(a_0)$. In particular, this sequence converges monotonically to 0 at a rate $a_n = O(n^{-1/\alpha})$. We can write the inverse branches to the induced map $S$ in the form $T_1 T_0^n(x)$, for $x \in [T_1(a_{n+1}), T_1(a_n)]$. In particular, $S : [T_1(a_{n+1}), T_1(a_n)] \to Y$ is continuous and surjective.

Theorems 1 and 2 are standard consequences of the Technical Theorem below, which relates the summations $\phi^n(x)$ to a sequence $\chi^n(x)$ which has the stronger property of being approximated by a Brownian motion.
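The decay rate $a_n = O(n^{-1/\alpha})$ quoted above can be probed numerically for the Manneville-Pomeau map. The sketch below inverts the left branch $x \mapsto x + x^{1+\alpha}$ by bisection and lists $a_0, a_1, \dots$; the function names and the bisection depth are choices made here, and a bounded value of $a_n n^{1/\alpha}$ over a finite range is only consistent with the rate, not a proof of it.

```python
def inverse_left_branch(y, alpha, iters=80):
    """Solve x + x^(1 + alpha) = y for x in [0, 1], assuming 0 <= y <= 1."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid + mid ** (1.0 + alpha) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def preimage_sequence(alpha, n_max):
    """Return [a_0, a_1, ..., a_{n_max}] with a_0 + a_0^(1+alpha) = 1 and a_n = T_0(a_{n-1})."""
    a = [inverse_left_branch(1.0, alpha)]
    for _ in range(n_max):
        a.append(inverse_left_branch(a[-1], alpha))
    return a

def scaled_terms(alpha, n_max):
    """Return the values a_n * n^(1/alpha) for n = 1..n_max."""
    a = preimage_sequence(alpha, n_max)
    return [a[n] * n ** (1.0 / alpha) for n in range(1, n_max + 1)]
```

For $\alpha = 1/2$ the scaled terms stay of order one over the first few hundred indices, in line with the stated rate.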
Technical Theorem. Suppose that $0 < \alpha < \frac{1}{3}$. Let $T \in$ ᑣα and let $\phi : X \to \mathbb{R}$ be a Hölder continuous function with $\int \phi\, d\mu = 0$. Then there exists a Hölder continuous function $\chi : Y \to \mathbb{R}$, a one-dimensional Brownian motion $W : \Omega \to C^0(\mathbb{R}^+, \mathbb{R})$ on some probability space $(\Omega, \nu)$, such that $W(\cdot)(t)$ has variance $t\sigma^2$, and a sequence of random variables $\#_n : \Omega \to \mathbb{R}$ such that:
(a) for some $\delta > 0$ and a.e.$(m)$ $x \in Y$ there exists a sequence $k = k(n)$ such that $\phi^n(x) = \chi^k(x) + O(n^{\frac{1}{2} - \delta})$ and $n/k = \int R\, dm + o(1)$;
(b) the families $\{\#_k : k \ge 1\}$ and $\{\chi^k : k \ge 1\}$ have the same distribution (i.e., for every Borel set $A \subset \mathbb{R}$ we have $m\{x \in Y : \chi^k(x) \in A\} = \nu\{\omega \in \Omega : \#_k(\omega) \in A\}$, for all $k \ge 1$); and
(c) $\#_k(\cdot) = W(\cdot)(n) + o(n^{\frac{1}{2}})$, for $n \ge 0$, a.e.$(\nu)$, provided $\phi$ is not a coboundary.

The derivation of Theorems 1 and 2 from almost sure invariance principles of this type is explained in Chapter 1 of [16]. Many other consequences are also discussed there.

Given $0 < \beta \le 1$, let $C^\beta(Y)$ be the Banach space of Hölder continuous functions of exponent $\beta$ on $Y$ with respect to the usual norm $\|f\|_\beta = \|f\|_\infty + |f|_\beta$, where
\[ |f|_\beta = \sup_{x \ne y} \frac{|f(x) - f(y)|}{|x - y|^\beta}. \]
We can associate to $S$ a transfer operator $L : C^\beta(Y) \to C^\beta(Y)$ defined by
\[ Lf(x) = \sum_{Sy = x} \frac{1}{|S'(y)|}\, f(y). \]
This operator is well defined (cf. [11] and [17]). The advantage of studying $S$ instead of $T$ is that we are helped by the additional feature of hyperbolicity and, in particular, by the resulting estimates on the iterates of $L$. The following lemma collects together some useful estimates.

Lemma 1.
(1) There exists $\lambda > 1$ such that $\inf_{x \in Y} |S'(x)| \ge \lambda$.
(2) There exists a constant $C > 0$ such that $\log |S'(x)/S'(y)| \le C|x - y|$.
(3) If $\phi : X \to \mathbb{R}$ is a Hölder continuous function then the function $\psi : Y \to \mathbb{R}$ defined by $\psi(x) = \phi^{R(x)}(x)$ has the property that $(L\psi) : Y \to \mathbb{R}$ is Hölder continuous.
(4) There exists a positive Hölder continuous function $h > 0$ such that $Lh = h$.
(5) If we define $Pf(x) = \frac{1}{h} L(hf)$, then $P1 = 1$ and there exists $0 < \theta < 1$ such that if $\int \phi\, d\mu = 0$ then $\|P^n \psi\|_\infty = O(\theta^n)$.
Proof. Since the restriction of $S$ to the interval $[T_1(a_{n+1}), T_1(a_n)] \subset Y$ is a composition of the expanding map $T : Y \to X$ and the non-contracting map $T^n : [a_{n+1}, a_n] \to Y$, part (1) is immediate.

For part (2), we observe that for $x \in [T_1(a_{n+1}), T_1(a_n)]$ we can use the chain rule to write
\[ \log |S'(x)| = \log |(T^n)'(x)| = \log |(T^{n-1})'(Tx)| + \log |T'(x)|. \]
Since $T : Y \to X$ is uniformly expanding and $C^2$, the term $\log |T'(x)|$ is Lipschitz. Moreover, the function $\log |(T^{n-1})'|$ is Lipschitz on $[a_{n+1}, a_n]$ by direct calculation, cf. [19, Lemma 5].

For part (3), observe that if $x, y \in [T_1(a_{n+1}), T_1(a_n)]$, then
\[ |\psi(x) - \psi(y)| \le \sum_{i=0}^{n-1} |\phi(T^i x) - \phi(T^i y)| \le \|\phi\|_\beta \sum_{i=0}^{n-1} |T^i x - T^i y|^\beta \le \|\phi\|_\beta \left( \sum_{i=0}^{n-1} \sup_{x \in [T_1(a_{n+1}), T_1(a_n)]} |(T^i)'(x)|^\beta \right) |x - y|^\beta. \]
The function $\psi$ may have discontinuities at the points $T_1(a_n)$. However, since the map $S : [T_1(a_{n+1}), T_1(a_n)] \to Y$ is surjective we see that $L\psi : Y \to \mathbb{R}$ is Hölder continuous.

For part (4), the existence of a positive Hölder continuous eigenfunction $Lh = h$ is proved in [17].

Finally, for part (5) it follows using the bounds in parts (1) and (2) that there exists $K > 0$, such that $\|P^n \psi\| \le K\|\psi\|_\infty + \lambda^{-\beta n} \|\psi\|_\beta$, for all $n \ge 1$, cf. [15]. The result immediately follows by a standard argument [18, Prop. 5.24].

If we write $U_S f(x) = f(Sx)$ then the condition $P1 = 1$ implies that $P$ is a left inverse to $U_S$, i.e., $P U_S = I$. A consequence of Lemma 1 is the following result.

Lemma 2. There exists $w \in C^\beta(Y)$ such that if we set $\chi := \psi + (w - U_S w)$ then $P\chi = 0$ and $\psi^n(x) = \chi^n(x) + O(1)$.

Proof. We can define $w \in C^\beta(Y)$ by the series $w := \sum_{n=1}^{\infty} P^n \psi$ which converges since, by part (5) of Lemma 1, $\|P^n \psi\|_\beta = O(\lambda^{-\beta n})$. We observe that
\[ Pw - w = \sum_{n=1}^{\infty} P^{n+1} \psi - \sum_{n=1}^{\infty} P^n \psi = -P\psi. \]
Since $P U_S = I$ we see that $P\chi = P\psi + P(w - U_S w) = P\psi + (Pw - w) = 0$. Finally, we notice that $\psi^n(x) - \chi^n(x) = U_S^n w(x) - w(x)$, so that $|\psi^n(x) - \chi^n(x)| \le 2\|w\|_\infty$.

Remark. Rather than working with the induced map $S : Y \to Y$, an alternative approach would be to consider the original map $T : X \to X$ and an associated transfer operator $L : L^p(X, \mu) \to L^p(X, \mu)$, where $p < (2\alpha)^{-1}$. However, using this approach we are only able to show $\|L^n \phi\|_p = O(n^{-\gamma})$, for some $\gamma > 1$, and $\psi \in L^p(X, \mu)$. In particular, in this case we can only prove the conclusion of the Technical Theorem in the smaller range $0 < \alpha < \frac{1}{4}$.
To proceed with the analysis, we need to replace the sequence of functions $\chi^n(x)$ with another sequence (on a related space) which forms a martingale. We consider the natural extension $\hat S : \hat Y \to \hat Y$ of $S : Y \to Y$, which is the space consisting of all sequences $x = (x_n)_{n=-\infty}^{0}$ in $Y$ satisfying $S x_{n-1} = x_n$, for $n \le 0$. We denote by $\hat m$ the associated $\hat S$-invariant measure on $\hat Y$. There is a canonical projection $\pi$ from $\hat Y$ to $Y$ defined by $\pi x = x_0$. The σ-algebra $B$ for $Y$ allows us to associate a natural σ-algebra $B_0 = \pi^{-1} B$ on the natural extension. Let us denote $B_n := \hat S^n B_0$, for $n \ge 0$.

Definition. Given a nested sequence of σ-algebras $B_0 \subset B_1 \subset B_2 \subset \dots$, a sequence of functions $\#_n : \hat Y \to \mathbb{R}$ is called an increasing martingale if $\#_n$ is $B_n$-measurable, and $E(\#_n | B_{n-1}) = \#_{n-1}$ (or, equivalently, $E(\#_n - \#_{n-1} | B_{n-1}) = 0$).

The function $\chi : Y \to \mathbb{R}$ naturally extends to a function $\hat\chi : \hat Y \to \mathbb{R}$ by $\hat\chi\big((x_n)_{n=-\infty}^{0}\big) = \chi(x_0)$. We shall now denote
\[ \hat\chi^n(x) := \hat\chi(\hat S^{-n} x) + \dots + \hat\chi(\hat S^{-2} x) + \hat\chi(\hat S^{-1} x). \]
Since $\hat\chi$ is $B_0$-measurable, it immediately follows that $\hat\chi^n$ is $B_n$-measurable.

Lemma 3. The sequence $\hat\chi^n$ is a martingale with respect to the increasing sequence of σ-algebras $B_n$, $n \ge 1$.

Proof. For $n \ge 1$, we can write
\[ E(\hat\chi^n - \hat\chi^{n-1} | B_{n-1}) = E(\hat\chi \circ \hat S^{-n} | \hat S^{n-1} B_0) = E(\hat\chi | \hat S^{-1} B_0) \circ \hat S^{n} = E(\chi | S^{-1} B) \circ \hat S^{n} = U_S^{n+1} P\chi = 0, \]
since $P\chi = 0$. In particular, the sequence $\hat\chi^n$ is a martingale, as required.

The function $\psi$ has variance $\hat\sigma^2$ defined by
\[ \hat\sigma^2 = \int \psi(x)^2\, dm + 2 \sum_{n=1}^{\infty} \int \psi(S^n x)\psi(x)\, dm > 0. \]
If we replace $\psi$ by the cohomologous function $\chi$ then the variance is unchanged, i.e.,
\[ \hat\sigma^2 = \int \chi(x)^2\, dm + 2 \sum_{n=1}^{\infty} \int \chi(S^n x)\psi(x)\, dm > 0. \]
This is readily seen from alternative characterizations of the variance [15, Chapter 4].

Lemma 4.
(1) We may write $\sigma^2 = \left( \int R\, dm \right)^{-1} \hat\sigma^2$.
(2) We may write $\hat\sigma^2 = \int \chi(x)^2\, dm$.

Proof. For part (1) it is technically easier to work with the invertible natural extension $\hat T : \hat X \to \hat X$ of $T : X \to X$ which we can identify with $\{(x, i) : x \in \hat Y,\ 0 \le i \le R(x) - 1\}$. Corresponding to $\phi$ and $\mu$ we have $\hat\phi$ and $\hat\mu$ on the natural extension. Observe that
\[ \hat\phi(\hat T^m(x, i)) = \sum_{n=-\infty}^{+\infty} \sum_{j=0}^{R(\hat S^n x) - 1} \hat\phi(\hat S^n x, j)\, \delta(i + m - j - R^n(x)), \]
where $\delta(\cdot)$ denotes the Dirac delta function. Substituting into the definition of $\sigma^2$ we obtain
\[ \sum_{m=-\infty}^{+\infty} \int \hat\phi(\hat T^m(x, i))\, \hat\phi(x, i)\, d\hat\mu = \left( \int R\, dm \right)^{-1} \sum_{n=-\infty}^{+\infty} \int \left( \sum_{i=0}^{R(x)-1} \hat\phi(x, i) \right) \left( \sum_{j=0}^{R(\hat S^n x)-1} \hat\phi(\hat S^n x, j) \right) d\hat m = \left( \int R\, dm \right)^{-1} \sum_{n=-\infty}^{+\infty} \int \psi(x)\, \psi(\hat S^n x)\, d\hat m, \]
where we understand $\psi(x)$ as being defined on $\hat Y$, in a natural way.

For part (2) we observe that, for $n \ge 1$,
\[ \int \chi(S^n x)\chi(x)\, dm = \int P^n\big( \chi(S^n x)\chi(x) \big)\, dm = \int \chi(x)(P^n \chi)(x)\, dm = 0. \]

2. The Proof of Part (a) of the Technical Theorem

For a.e.$(m)$ $x \in Y$ we can associate to each $n \ge 0$ the unique value $k = k(n)$ satisfying
\[ R(x) + R(Sx) + \dots + R(S^{k-1} x) \le n < R(x) + R(Sx) + \dots + R(S^{k-1} x) + R(S^k x). \]
By the Birkhoff ergodic theorem we know that
\[ k(n) = n \left( \int_Y R\, dm \right)^{-1} (1 + o(1)) \]
and using the notation $R^k(x) := R(x) + R(Sx) + \dots + R(S^{k-1} x)$ we have that $|R^k(x) - n| \le R(S^k x)$. In particular, we can bound
\[ |\phi^n(x) - \psi^k(x)| = |\phi^n(x) - \phi^{R^k(x)}(x)| \le \|\phi\|_\infty R(S^k x). \]
Recall that $\psi$ and $\chi$ differ by a coboundary, i.e., $\chi^k = \psi^k + (w - U_S^k w)$, and thus we can uniformly bound $|\psi^k(x) - \chi^k(x)| \le 2\|w\|_\infty$. Therefore, using the triangle inequality, we can bound
\[ |\phi^n(x) - \chi^k(x)| \le \|\phi\|_\infty R(S^k x) + 2\|w\|_\infty. \]
Thus to complete the proof of part (a) of the Technical Theorem it suffices to show that there exists $\delta > 0$ such that $R(S^n x) = O(n^{1/2 - \delta})$, a.e.$(m)$. The following estimate is well-known (cf. [11]).

Lemma 5. $m\{x \in Y : R(x) \ge n\} = O\left( n^{1 - \frac{1}{\alpha}} \right)$.
For $\alpha < \frac{1}{3}$ and $\delta > 0$ sufficiently small we have that $(\frac{1}{2} - \delta)(\frac{1}{\alpha} - 1) > 1$ and we can bound
\[ \sum_{n=1}^{\infty} m\{x \in Y : R(S^n x) \ge n^{\frac{1}{2} - \delta}\} \le \sum_{n=1}^{\infty} \left( n^{\frac{1}{2} - \delta} \right)^{1 - \frac{1}{\alpha}} < \infty. \]
Thus by the Borel-Cantelli Lemma we see that $R(S^n x) = O(n^{1/2 - \delta})$, a.e.$(m)$ and this completes the proof of part (a) of the Technical Theorem.

3. The Proof of Parts (b) and (c) of the Technical Theorem

In this section we complete the proof of the Technical Theorem by establishing the last two parts. Let $(\Omega, \nu)$ be a probability space. Recall that a stochastic process $W : \Omega \to C^0(\mathbb{R}^+, \mathbb{R})$ is called a Brownian motion if
(a) $W(\omega)(0) = 0$, a.e.$(\nu)$;
(b) there exists $\sigma^2 > 0$ such that for each $t_0 > 0$ the values $\omega \mapsto W(\omega)(t_0) \in \mathbb{R}$ have a normal distribution with variance $t_0 \sigma^2$;
(c) for times $t_0 < t_1 < \dots < t_n$ the differences $\omega \mapsto W(\omega)(t_{i+1}) - W(\omega)(t_i) \in \mathbb{R}$ are independent random variables.

The following result is standard.

Lemma 6. Brownian motion satisfies the law of the iterated logarithm, i.e.,
\[ \limsup_{t \to +\infty} \frac{|W(\omega)(t)|}{\sqrt{2\sigma^2 t \log\log t}} = 1, \quad \text{a.e.}(\nu). \]

We shall follow the analysis of Field, Melbourne and Török, based on a treatment of Philipp and Stout, for the martingale $\chi^k$. A key ingredient is the martingale version of the Skorokhod embedding theorem [9, Appendix I] which we now state.

Proposition 1 [9, Appendix 1]. There exists a Brownian motion $W^*(\cdot)$ on a probability space $(\Omega, \nu)$ such that $W^*(\cdot)(t)$ has variance $t$, an increasing sequence of σ-algebras $F_k$, and sequences of random variables $\tau_k : \Omega \to \mathbb{R}^+$ such that
(1) $\#_k := W^*(T_k)$, where $T_k := \sum_{l=0}^{k-1} \tau_l$, has the same distribution as $\chi^k$;
(2) $\#_k$ and $T_k$ are $F_k$-measurable; and
(3) $E(\tau_l | F_{l-1}) = E([\#_l - \#_{l-1}]^2 | F_{l-1})$, a.e.$(\nu)$, for each $l \ge 1$.

The above result immediately implies part (b) of the Technical Theorem. To obtain the estimate in part (c) of the Technical Theorem we need to replace $T_k$ by $\hat\sigma^2 k$, where $\hat\sigma^2 = \int_Y \chi^2\, dm > 0$. Following [16, p.11], and using part (3) of Proposition 1, we can write
\[ T_k - \hat\sigma^2 k = \sum_{l=0}^{k-1} \left\{ \tau_l - E(\tau_l | F_{l-1}) \right\} + \sum_{l=0}^{k-1} \left\{ E\big( [\#_l - \#_{l-1}]^2 | F_{l-1} \big) - [\#_l - \#_{l-1}]^2 \right\} + \sum_{l=0}^{k-1} [\#_l - \#_{l-1}]^2 - \hat\sigma^2 k, \quad \text{a.e.}(\nu), \tag{3.1} \]
where we set $\#_{-1} = 0$. Both the first and second terms on the right-hand side of (3.1) are martingales since
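\[ E\big( \tau_l - E(\tau_l | F_{l-1}) \,\big|\, F_l \big) = 0, \quad \text{and} \quad E\Big( E\big( [\#_l - \#_{l-1}]^2 | F_{l-1} \big) - [\#_l - \#_{l-1}]^2 \,\Big|\, F_l \Big) = 0. \]
We can therefore invoke the strong law of large numbers for martingales [6, §VII.9, Theorem 3] for these terms in (3.1) to see that, for any $\delta > 0$,
\[ T_k - \hat\sigma^2 k = \sum_{l=0}^{k-1} [\#_l - \#_{l-1}]^2 - \hat\sigma^2 k + O(k^{1/2 + \delta}), \quad \text{a.e.}(\nu). \tag{3.2} \]
To estimate the summation in (3.2) we shall consider the following integral:
\[ I_\delta := \int_\Omega \left( \sum_{l=1}^{\infty} l^{-(1/2 + \delta)} \Big( [\#_l(\omega) - \#_{l-1}(\omega)]^2 - \hat\sigma^2 \Big) \right)^2 d\nu(\omega), \]
for $\delta > 0$. The next lemma relates $I_\delta$ to the function $\chi$.

Lemma 7. We can write
\[ I_\delta = \int_{\hat Y} \left( \sum_{l=1}^{\infty} l^{-(1/2 + \delta)} \Big( \hat\chi^2(\hat S^l x) - \int_Y \chi^2\, dm \Big) \right)^2 d\hat m(x). \]

Proof. By Proposition 1, $\#_k$ and $\hat\chi^k$ are equal in distribution. Moreover, since $\hat S$ is the natural extension of $S$ we have that $\hat\chi^k$ and $\chi^k$ are equal in distribution. Given any measurable function $F : \mathbb{R}^{\mathbb{N}} \to \mathbb{R}$ we have that
\[ \int_Y F\big( (\chi^k(x))_{k=0}^{\infty} \big)\, dm(x) = \int_\Omega F\big( (\#_k(\omega))_{k=0}^{\infty} \big)\, d\nu(\omega) \]
[1, Prop. 2.39]. To obtain the required result we choose
\[ F\big( (x_l)_{l=0}^{\infty} \big) = \left( \sum_{l=1}^{\infty} l^{-(1/2 + \delta)} \Big( [x_{l+1} - x_l]^2 - \int_Y \chi^2\, dm \Big) \right)^2, \]
noting that $\chi(S^l x) = \chi^{l+1}(x) - \chi^l(x)$.

For convenience we introduce a function $\rho$ defined by $\rho(x) := \chi^2(x) - \int_Y \chi^2\, dm$.

Lemma 8. There exists $C > 0$ and $0 < \theta < 1$ such that $\left| \int_Y \rho \circ S^k\, \rho\, dm \right| \le C\theta^k$, for $k \ge 0$.

Proof. By part (5) of Lemma 1, we can bound $\left| \int_Y \rho \circ S^k\, \rho\, dm \right| = \left| \int_Y \rho\, (P^k \rho)\, dm \right| \le C\theta^k$.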
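Consider the expansion
\[ I_\delta = \sum_{l=1}^{\infty} \sum_{p=1}^{\infty} l^{-(1/2 + \delta)} p^{-(1/2 + \delta)} \int_Y \rho \circ S^l\, \rho \circ S^p\, dm \]
\[ = \sum_{l=1}^{\infty} l^{-(1 + 2\delta)} \int_Y \rho^2\, dm + 2 \sum_{l=1}^{\infty} \sum_{d=1}^{\infty} l^{-(1/2 + \delta)} (l + d)^{-(1/2 + \delta)} \int_Y \rho \circ S^d\, \rho\, dm \]
\[ \le \|\rho\|_\infty^2 \sum_{l=1}^{\infty} l^{-(1 + 2\delta)} + 2C \left( \sum_{l=1}^{\infty} l^{-(1 + 2\delta)} \right) \left( \sum_{d=1}^{\infty} \theta^d \right) < +\infty, \]
where we have used Lemma 8. In particular, we deduce that
\[ \sum_{l=1}^{\infty} l^{-(1/2 + \delta)} \left( [\#_l(\omega) - \#_{l-1}(\omega)]^2 - \int_Y \chi^2\, dm \right) \]
is finite a.e.$(\nu)$, for any $\delta > 0$. Applying the Kronecker lemma [9, p.31] we can deduce that
\[ \sum_{l=1}^{k-1} \left( [\#_l(\omega) - \#_{l-1}(\omega)]^2 - \int_Y \chi^2\, dm \right) = O(k^{1/2 + \delta}). \tag{3.3} \]
Comparing (3.2) and (3.3) shows that $T_k = \hat\sigma^2 k + O(k^{1/2 + \delta})$ and so $\#_k(\cdot) = W^*(T_k) = W^*(\hat\sigma^2 k) + O(k^{1/4 + \delta})$. To complete the proof of part (c) of the Technical Theorem we define a rescaled Brownian motion by $W(\cdot)(t) = W^*(\cdot)(\sigma^2 t)$. Then, for $k = k(n)$,
\[ W^*(T_k) = W^*(\hat\sigma^2 k) + O(n^{1/4 + \delta}) = W^*\left( \frac{n \hat\sigma^2}{\int R\, dm} + o(n) \right) + O(n^{1/4 + \delta}) = W^*(\sigma^2 n) + o(n^{1/2}) = W(n) + o(n^{1/2}). \tag{3.3a} \]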
Remarks.
(1) The error term in part (c) of the Technical Theorem can be improved to give $\#_k(\cdot) = W(\cdot)(n) + O(n^{1/4 + \delta})$, for any $\delta > 0$. More precisely, by comparing known results on the rate of mixing in [14] and [19] with estimates on the rate of convergence in the Birkhoff ergodic theorem in [12, Theorem 16, part 3] we have the stronger estimate $k(n) = n (\int R\, dm)^{-1} + O(n^{1/2 + \delta})$, for any $\delta > 0$. This allows (3.3a) to be improved to $W^*(T_k) = W(n) + O(n^{1/4 + \delta})$, for any $\delta > 0$.
(2) The method above can be adapted to study other systems (e.g., higher dimensional analogues of the interval maps considered here [10] and rational maps). It also applies to certain types of abstract tower model, as introduced by Young [19], providing the return time map $R$ satisfies $\sum_{n=1}^{\infty} \mu\{x : R(x) \ge \sqrt{n}\} < \infty$. Another way in which one could generalise these results is to consider more general invariant Gibbs measures [2].

Acknowledgement. We are grateful to Carlangelo Liverani and Manfred Denker for useful comments.
References

1. Breiman, L.: Probability. London: Addison-Wesley, 1968
2. Buzzi, J., Maume-Deschamps, V.: Decay of correlations on towers for potentials with summable variation. Prépublication de l'Université de Bourgogne, 1999
3. Campanino, M., Isola, S.: On the invariance principle for non-uniformly expanding transformations of [0,1]. Forum Math. 8, 475–484 (1996)
4. Denker, M.: The Central Limit Theorem for dynamical systems. In: Dynamical systems and ergodic theory, Banach Center Publications, Vol. 23, Warsaw: PWN - Polish Scientific Publishers, 1989, pp. 33–62
5. Denker, M., Philipp, W.: Approximation by Brownian motion for Gibbs measures and flows under a function. Ergod. Th. and Dynam. Sys. 4, 541–552 (1984)
6. Feller, W.: An introduction to probability theory and its applications. Vol. II. New York: Wiley, 1971
7. Field, M., Melbourne, I., Török, A.: Decay of correlations, central limit theorems and approximation by Brownian motion for compact Lie group extensions. Preprint
8. Gordin, M.: The central limit theorem for stationary processes. Soviet Math. Doklady 10, 1174–1176 (1969)
9. Hall, P., Heyde, C.: Martingale limit theory and its application. New York: Academic Press, 1980
10. Hu, H.: Conditions for the existence of SBR measures for "almost Anosov" diffeomorphisms. Trans. Am. Math. Soc. 352, 2331–2367 (2000)
11. Isola, S.: Renewal sequences and intermittency. J. Statist. Phys. 97, 263–280 (1999)
12. Kachurovskii, A.: The rate of convergence in ergodic theorems. Russ. Math. Surv. 51, 73–124 (1996)
13. Ledrappier, F.: Principe variationnel et systèmes dynamiques symboliques. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 30, 185–202 (1974)
14. Liverani, C., Saussol, B., Vaienti, S.: A probabilistic approach to intermittency. Ergodic Theory Dynam. Systems 19, 671–685 (1999)
15. Parry, W., Pollicott, M.: Zeta functions and the periodic orbit structure of hyperbolic dynamics. Astérisque 187–188, 1–268 (1990)
16. Philipp, W., Stout, W.: Almost sure invariance principles for partial sums of weakly dependent random variables. Mem. Am. Math. Soc., No. 161, Providence, RI: Amer. Math. Soc., 1975
17. Prellberg, T.: Maps of intervals with indifferent fixed points: Thermodynamic formalism and phase transitions. Ph.D. Thesis, 1991
18. Ruelle, D.: Thermodynamic Formalism. New York: Addison-Wesley, 1978
19. Young, L.-S.: Recurrence times and rates of mixing. Israel J. Math. 110, 153–188 (1999)

Communicated by G. Gallavotti
Commun. Math. Phys. 229, 347–367 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0692-5
Communications in
Mathematical Physics
Einstein Relation for a Tagged Particle in Simple Exclusion Processes Michail Loulakis Courant Institute of Mathematical Sciences, 251 Mercer Str., New York, NY 10012, USA. E-mail:
[email protected] Received: 2 September 2001 / Accepted: 28 March 2002 Published online: 31 July 2002 – © Springer-Verlag 2002
Abstract: It is known that the rescaled position of a tagged particle in symmetric simple exclusion processes converges to a diffusion. If now the tracer particle is driven by a small force, then it picks up a velocity. The Einstein relation states that in the limit, this velocity is proportional to the small force, and the constant of proportionality can be computed from the diffusion matrix of the tracer particle with no driving force. Such a relation is believed to be generally valid. In this article we establish its validity for all symmetric simple exclusion processes in dimension d ≥ 3, and we prove a density property for certain invariant states of the driven system.
1. Introduction

The Einstein relation is an example of a general relation between linear transport coefficients and equilibrium fluctuations, namely the mobility of a charged particle in the presence of an external field and its self-diffusion coefficient.

In a general setting one first considers the diffusion of a tracer particle in a particle system in equilibrium. The tracer particle is then subjected to a small external field $E$ which does not affect the motion of its environment and moves with a drift $v(E)$. The mobility matrix $M$ of the particle is defined by
\[ M_{ij} = \left. \frac{\partial v_i(E)}{\partial E_j} \right|_{E=0} \]
and the Einstein relation states that $M = \beta D$, where $\beta$ is the Boltzmann factor, and $D$ is the self-diffusion coefficient of the tracer particle.

A natural class of models for which one would like to establish the validity of such a relation is simple exclusion processes on $\mathbb{Z}^d$. The motion of a tracer particle when these models are in equilibrium is by now well understood. Saada [Sa] proved a law of large numbers for the position of a tracer particle and Kipnis & Varadhan [KV], Varadhan [V] and Sethuraman, Varadhan & Yau [SVY] have proved an invariance principle for its fluctuations in the symmetric, mean-zero asymmetric, and general asymmetric in $d \ge 3$ case, respectively.
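The definition of the mobility matrix can be made concrete in the simplest situation, a single particle with no environment. The sketch below assumes that the driven particle jumps by $z$ at rate $p(z)e^{\alpha \cdot z}$ with $\alpha = \beta E$ (an assumption made here for illustration), so that its drift is $v(\alpha) = \sum_z p(z)\, z\, e^{\alpha \cdot z}$, and compares a finite-difference derivative at $\alpha = 0$ with the second-moment matrix $\sum_z p(z)\, z z^{T}$ of the undriven walk. The function names and the nearest-neighbour example are hypothetical, and this free-particle check says nothing about the interacting case treated in the article.

```python
import math

def drift(p, alpha):
    """Drift v(alpha) = sum_z p(z) z exp(alpha . z) of the driven free walk."""
    d = len(alpha)
    v = [0.0] * d
    for z, pz in p.items():
        weight = pz * math.exp(sum(a * zi for a, zi in zip(alpha, z)))
        for i in range(d):
            v[i] += weight * z[i]
    return v

def mobility(p, d, h=1e-5):
    """Central-difference estimate of M_ij = d v_i / d alpha_j at alpha = 0."""
    M = [[0.0] * d for _ in range(d)]
    for j in range(d):
        plus = [0.0] * d
        minus = [0.0] * d
        plus[j], minus[j] = h, -h
        vp, vm = drift(p, plus), drift(p, minus)
        for i in range(d):
            M[i][j] = (vp[i] - vm[i]) / (2.0 * h)
    return M

def second_moment(p, d):
    """Matrix sum_z p(z) z_i z_j, the variance per unit time of the undriven walk."""
    D = [[0.0] * d for _ in range(d)]
    for z, pz in p.items():
        for i in range(d):
            for j in range(d):
                D[i][j] += pz * z[i] * z[j]
    return D
```

For the nearest-neighbour walk on $\mathbb{Z}^3$ with $p = 1/6$ on each unit vector, both matrices come out as one third of the identity, which is the free-particle instance of $M = \beta D$ in units where $\alpha = \beta E$.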
With the introduction of an external field the system evolves according to dynamics, under which its initial state is no longer invariant. In principle, there could be a variety of steady states accessible to the system and very little can be said about the properties of these new invariant states. Consequently, the physically proper order of limits, first the scaling limit to compute the drift, then $E \to 0$ to compute the mobility, presents a hard problem. Ferrari, Goldstein & Lebowitz [FGL] proposed rescaling the external field simultaneously with space and time as an alternative. This "weak asymmetry" approach was successfully carried out by Lebowitz & Rost in [LR].

Landim, Olla & Volchan [LOVo] studied the motion of an asymmetric tracer particle for the 1-dimensional nearest neighbor symmetric simple exclusion process. This is the only case where invariant measures for the process "as seen from the particle" are explicitly known [FGL]. Relating the motion of the particle to the diffusion of a particle system with zero range interaction they obtained results for a wide class of non-equilibrium initial configurations, including the validity of the Einstein relation.

The subject of this article is the validity of such a relation for the motion of a tagged particle in symmetric simple exclusion processes, in dimension $d \ge 3$. We prove that in the hydrodynamic limit the velocity of the tagged particle in any direction is bounded between two values and that both bounds exhibit the limiting behavior suggested by the Einstein relation.

2. Notation and Results

Let us fix a finite range probability measure $p(\cdot)$ on $\mathbb{Z}^d$, with $p(0) = 0$. Consider now an initial configuration of particles on $\mathbb{Z}^d$. The simple exclusion process associated to $p(\cdot)$ is a stochastic process whose dynamics can be informally described as follows. Each particle, independently of the others, waits for an exponential time and attempts to jump to another site. If the particle is at $x$, the target site $y$ is chosen with probability $p(y - x)$. If the target site is empty the particle jumps there, otherwise the jump is suppressed. In either case the process starts afresh.

A natural state space for the process is $X = \{0, 1\}^{\mathbb{Z}^d}$, which can be thought of as the space of possible configurations of particles on $\mathbb{Z}^d$ such that different particles are not allowed to occupy the same site. Hence, for $\eta \in X$ and $x \in \mathbb{Z}^d$, $\eta(x)$ is 1 or 0 according to whether the site $x$ is occupied or not. It turns out (cf. [Li], Sect. I.3) that such a stochastic process starting from any initial configuration in $X$ can be well-defined. Local functions (i.e. functions depending on a finite number of coordinates) are a core for $L$ and the action of $L$ on a local function $f$ is given by
\[ Lf(\eta) = \sum_{x,y} p(y - x)\, \eta(x)\, (1 - \eta(y))\, \big( f(\eta^{xy}) - f(\eta) \big), \]
where
\[ \eta^{xy}(z) = \begin{cases} \eta(z) & \text{if } z \ne x, y, \\ \eta(x) & \text{if } z = y, \\ \eta(y) & \text{if } z = x. \end{cases} \]
Hereafter, we will restrict ourselves to symmetric simple exclusion processes, i.e. $p(-z) = p(z)$ for all $z \in \mathbb{Z}^d$. We will also assume that the random walk in $\mathbb{Z}^d$ with one step transition probabilities $p(y - x)$ is irreducible, i.e. $\{x : p(x) > 0\}$ generates
the group $\mathbb{Z}^d$. The measures $P_\rho$ ($0 \le \rho \le 1$), defined as products over the sites of $\mathbb{Z}^d$ of Bernoulli distributions with parameter $\rho$, are reversible and ergodic for the evolution of these processes (cf. [Li], Sect. VIII.1 for a proof).

In order to study the motion of the tracer particle, consider an initial configuration $\eta$, chosen according to $P_\rho$ conditioned to have a particle at the origin. Tag this particle and denote by $X_t$ its position at time $t$. Even though $X_t$ is not Markov (due to the presence of the environment), $(X_t, \eta_t)$ is. Since the motion of the tagged particle is not localized, it is natural to consider the process $\xi_t$: $\xi_t(x) = \eta_t(x + X_t)$ when looking for invariant states of the system. We refer to $\xi_t$ as the "environment seen from the particle". $\xi_t$ is itself a Markov process, whose infinitesimal generator is given by $L = L_0 + L_1$, where
$$L_0 f(\xi) = \sum_{x,y \neq 0} p(y-x)\,\xi(x)\,(1-\xi(y))\,\big(f(\xi^{xy}) - f(\xi)\big)$$
and
$$L_1 f(\xi) = \sum_{z} p(z)\,(1-\xi(z))\,\big(f(\tau_z \xi) - f(\xi)\big).$$
(Here $\tau_z \xi$ stands for the configuration obtained from $\xi$ by transferring the tagged particle to $z$, then shifting the entire configuration by $-z$.) $L_0$ corresponds to jumps of the environment, and $L_1$ takes into account the jumps of the tagged particle. A simple computation shows that $\mu_\rho := P_\rho(\,\cdot \mid \xi(0) = 1)$ is reversible for the process $\xi_t$, i.e. $L$ is self-adjoint in $L^2(\mu_\rho)$, and Saada [Sa] proved that $\mu_\rho$ is ergodic. An application of Itô's rule shows that for any vector $\ell \in \mathbb{R}^d$,
$$X_t \cdot \ell = \int_0^t \psi_\ell(\xi_s)\,ds + \sum_z \ell \cdot z\, J_t^z,$$
where
$$\psi_\ell(\xi) = \sum_z p(z)\,\ell \cdot z\,(1 - \xi(z)), \tag{1}$$
and $J_t^z$ are the compensated Poisson processes associated to the jumps of the tagged particle by $z$:
$$J_t^z = \sum_{s \le t} \chi_{\{\xi_s = \tau_z \xi_{s-}\}} - \int_0^t p(z)\,(1 - \xi_s(z))\,ds. \tag{2}$$
Notice that $J_t^z$ are martingales with respect to the natural filtration of $\xi_t$, so $X_t$ is adapted to this filtration.

In the early 80's Kipnis & Varadhan [KV] proved an invariance principle for the position of a tagged particle in symmetric simple exclusion, i.e. the convergence in the Skorokhod space of $\varepsilon X_{t\varepsilon^{-2}}$, as $\varepsilon \to 0$, to a diffusion. Their proof relies on a general central limit theorem for additive functionals of reversible Markov processes. The limiting diffusion is characterized by a symmetric matrix $D(\rho)$, such that $\forall\, \ell \in \mathbb{R}^d$:
$$\ell \cdot D(\rho)\,\ell = (1-\rho) \sum_z p(z)\,(\ell \cdot z)^2 - 2\,\|\psi_\ell\|_{-1,\rho}^2. \tag{3}$$
M. Loulakis
Here $\|\psi\|_{-1,\rho}$ is defined by
$$\|\psi\|_{-1,\rho}^2 = \sup_f \big\{ 2\langle f, \psi\rangle_\rho - D_\rho(f) \big\},$$
where the supremum is taken over all local functions in $X$, $\langle \cdot, \cdot\rangle_\rho$ stands for the inner product in $L^2(\mu_\rho)$, and $D_\rho(f) = \langle f, (-L) f\rangle_\rho$ is the Dirichlet form of $f$. If we let $U$ stand for the space of functions with finite $\|\cdot\|_{-1,\rho}$, then we can define $H_{-1,\rho}$ by completing $U$ with respect to $\|\cdot\|_{-1,\rho}$. $H_{-1,\rho}$ is a Hilbert space endowed with the inner product that can be defined by polarization.

The canonical way to model the effect of the external field is by modifying the generator of the process to $L^\alpha = L_0 + L_1^\alpha$, where
$$L_1^\alpha f(\eta) = \sum_z p(z)\, e^{\alpha \cdot z}\,(1-\eta(z))\,\big(f(\tau_z \eta) - f(\eta)\big).$$
(Note that $-E \cdot z$ is the change in the potential energy of a unit charge translated by $z$ in a homogeneous field $E$, so one can think of $\alpha$ as $\beta E$.) By standard Markov semi-group theory $L^\alpha$ gives rise to a continuous time Markov chain on $X$. Local functions are again a core for the generator $L^\alpha$. Let us denote by $P^{\alpha,\eta}$ the measure on the space of càdlàg paths on $X$ that corresponds to this process started from the configuration $\eta$, and let $P^{\alpha,\rho} = \int P^{\alpha,\eta}\, d\mu_\rho(\eta)$. $E^{\alpha,\rho}$ will be used to denote expectations with respect to $P^{\alpha,\rho}$. Just as in the absence of the driving force, an application of Itô's rule gives that for any $\ell \in \mathbb{R}^d$:
$$X_t \cdot \ell = \sum_z \ell \cdot z\, M_t^z + \int_0^t f_\alpha^\ell(\eta_s)\,ds, \tag{4}$$
where now
$$f_\alpha^\ell(\eta) = \sum_z p(z)\,\ell \cdot z\; e^{\alpha \cdot z}\,(1 - \eta(z)), \tag{5}$$
and the $P^{\alpha,\eta}$-martingales $M_t^z$ are the compensated Poisson processes associated to the jumps of the tagged particle by $z$.

When $\alpha \neq 0$ the tagged particle is expected to pick up a velocity rather than diffuse (in all cases except the one-dimensional nearest neighbor symmetric simple exclusion process). Hence, the appropriate scaling to capture the characteristics of its movement is now hyperbolic ($z \to \varepsilon z$, $t \to \varepsilon t$). After rescaling, the martingale term in the right-hand side of (4) is easily seen to converge to zero a.s., so what we need to show in order to compute the velocity of the tagged particle is some kind of ergodic averaging property for $f_\alpha^\ell(\eta_t)$. However, the particular dynamics of the tagged particle prohibit the explicit computation of any invariant state of the driven system. It is not even clear at this point if we can make sense of the mobility of the tagged particle. This is in fact the main subject of this article. But before we present the results of this paper, let us introduce some useful notation.

Let $C$ be the space of real-valued continuous functions on $X$, and $M$ be the space of probability measures on $X$ equipped with the topology of weak convergence of measures. For $f \in C$ and $\mu \in M$ we will denote $\int f(\eta)\, d\mu(\eta)$ either by $\langle f, \mu\rangle$ or by $E^\mu[f]$. $\lambda_\rho^*(f)$ will stand for the principal eigenvalue of the self-adjoint generator $L + f$ in $L^2(\mu_\rho)$. We will also need to consider the following function $h_\alpha \in C$:
$$h_\alpha(\eta) = \sum_z p(z)\,\big(\alpha \cdot z\; e^{\alpha \cdot z} - e^{\alpha \cdot z} + 1\big)\,(1 - \eta(z)). \tag{6}$$
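Note that $h_\alpha$ is nonnegative and has a quadratic behavior for small values of $\alpha$. Let now
$$I_\alpha = \{\mu \in M : (L^\alpha)^* \mu = 0\}$$
be the set of invariant states of the process in the presence of the driving force, and
$$A_{\alpha,\rho} = \big\{ \mu \in I_\alpha : \langle f, \mu\rangle \le \langle h_\alpha, \mu\rangle + \lambda_\rho^*(f), \ \forall f \in C \big\}.$$
We will prove in Sect. 4 that the sets $A_{\alpha,\rho}$ are non-empty, and that their elements are orthogonal for different values of $\rho$, when $d \ge 3$. The first theorem provides estimates for the displacement of the tagged particle:

Theorem 1. In any dimension $d$, for every $\ell \in \mathbb{R}^d$, $t \ge 0$ we have $P^{\alpha,\rho}$-a.s.
$$t \inf_{\mu \in A_{\alpha,\rho}} E^\mu[f_\alpha^\ell] \;\le\; \liminf_{\varepsilon \to 0}\, \varepsilon X_{t\varepsilon^{-1}} \cdot \ell \;\le\; \limsup_{\varepsilon \to 0}\, \varepsilon X_{t\varepsilon^{-1}} \cdot \ell \;\le\; t \sup_{\mu \in A_{\alpha,\rho}} E^\mu[f_\alpha^\ell].$$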
Theorem 2 states a density property of the elements in $A_{\alpha,\rho}$ in $d \ge 3$.

Theorem 2. If $d \ge 3$ and $\mu \in A_{\alpha,\rho}$, then for any increasing sequence of cubes $\{\Lambda_n\}$, $\Lambda_n \subset \mathbb{Z}^d$, such that $|\Lambda_n| \to \infty$, we have
$$\lim_{n \to \infty} \frac{1}{|\Lambda_n|} \sum_{x \in \Lambda_n} \eta(x) = \rho \qquad \mu\text{-a.s.}$$

The final result has to do with the properties of $A_{\alpha,\rho}$, as $\alpha \to 0$.

Theorem 3. If $d \ge 3$, then for any $\ell \in \mathbb{R}^d$ we have
$$\frac{1}{|\alpha|} \sup_{\mu \in A_{\alpha,\rho}} \Big| E^\mu[f_\alpha^\ell] - \Big( (1-\rho) \sum_z p(z)\, \ell \cdot z\; \alpha \cdot z - 2 \langle \psi_\ell, \psi_\alpha\rangle_{-1,\rho} \Big) \Big| \longrightarrow 0,$$
as $\alpha \to 0$.

In view of Theorems 1 and 3 the velocity of the tagged particle is determined up to the first order of magnitude of $\alpha$ and therefore we can define the mobility matrix of the tagged particle. The first order term of the drift in the direction $\ell$ is
$$v(\alpha) \cdot \ell = (1-\rho) \sum_z p(z)\, \ell \cdot z\; \alpha \cdot z - 2 \langle \psi_\ell, \psi_\alpha\rangle_{-1,\rho},$$
so the mobility matrix $M$ is given by
$$M_{ij}(\rho) = (1-\rho) \sum_z p(z)\, z_i z_j - 2 \langle \psi_{e_i}, \psi_{e_j}\rangle_{-1,\rho}.$$
A comparison with (3) identifies $M$ with the self-diffusion coefficient of the tagged particle, as suggested by the Einstein relation.

The remaining sections are organized as follows. In Sect. 3 we prove Theorem 1. In Sect. 4, we describe some properties of the sets $A_{\alpha,\rho}$ for a fixed $\alpha$ and prove Theorem 2. Finally, Sect. 5 is concerned with the behavior of $\nu_\rho^\alpha \in A_{\alpha,\rho}$, as $\alpha \to 0$, towards a proof of Theorem 3.
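As an illustration only, the driven dynamics described in this section can be sketched as a small simulation on a finite two-dimensional torus with nearest-neighbour jumps, $p(z) = 1/4$. The torus, the function name `simulate_tagged` and all parameter values are assumptions made for this sketch and do not come from the paper; a finite periodic box can only approximate the infinite-volume process.

```python
import math
import random

# Nearest-neighbour steps on Z^2 with p(z) = 1/4 (an assumption of this sketch).
STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def simulate_tagged(side=8, n_particles=20, alpha=(0.5, 0.0), t_max=50.0, seed=0):
    """Simple exclusion on a side x side torus with a driven tagged particle.

    Environment particles attempt a jump by z at rate p(z); the tagged particle
    attempts it at rate p(z) * exp(alpha . z). Jumps onto occupied sites are
    suppressed. Returns the time horizon, the unwrapped displacement of the
    tagged particle, the set of occupied sites, and the list of particle
    positions (index 0 is the tagged particle).
    """
    rng = random.Random(seed)
    sites = [(i, j) for i in range(side) for j in range(side)]
    particles = rng.sample(sites, n_particles)
    occupied = set(particles)

    # Attempt rates of the tagged particle in each direction.
    tag_rates = [0.25 * math.exp(alpha[0] * dx + alpha[1] * dy) for dx, dy in STEPS]
    tag_total = sum(tag_rates)
    total_rate = tag_total + (n_particles - 1) * 1.0

    t = 0.0
    disp = [0, 0]
    while True:
        t += rng.expovariate(total_rate)  # waiting time to the next attempt
        if t > t_max:
            break
        if rng.random() < tag_total / total_rate:
            idx = 0
            dx, dy = rng.choices(STEPS, weights=tag_rates)[0]
        else:
            idx = rng.randrange(1, n_particles)
            dx, dy = rng.choice(STEPS)
        x, y = particles[idx]
        target = ((x + dx) % side, (y + dy) % side)
        if target in occupied:
            continue  # exclusion rule: the jump is suppressed
        occupied.remove((x, y))
        occupied.add(target)
        particles[idx] = target
        if idx == 0:
            disp[0] += dx
            disp[1] += dy
    return t_max, tuple(disp), occupied, particles
```

Dividing the displacement by the time horizon gives a crude finite-volume estimate of the velocity. Averaging over many seeds for small $|\alpha|$ would be one way to probe the Einstein relation numerically, although this sketch makes no claim about convergence or finite-size effects.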
3. Proof of Theorem 1

The key idea here is the construction of $P^{\alpha,\eta}$ as a Girsanov type transform of the process in the absence of the external field, $P^{0,\eta}$. Since the two processes have the same allowed jumps (only the rates differ), it is possible to construct them both on the same space, in a way that they are mutually absolutely continuous when we only consider paths up to a certain finite time. More precisely, let $F_t := \sigma(\eta_s ;\, s \le t)$ and let $N_t^z$ stand for the process that counts the number of jumps of the tagged particle by $z$:
$$N_t^z = \sum_{s \le t} \chi_{\{\eta_s = \tau_z \eta_{s-}\}}.$$
Under $P^{0,\eta}$, the (varying) rates of $N_t^z$ are given by $\lambda(z,t) = p(z)(1 - \eta_t(z))$, and the compensated processes $J_t^z$ defined in (2) are orthogonal for different $z$. Therefore, if we define
$$\Psi_t(\alpha) := \exp\Big\{ \sum_z \alpha \cdot z\, N_t^z - \sum_z \int_0^t \lambda(z,s)\,\big(e^{\alpha \cdot z} - 1\big)\,ds \Big\},$$
then $\Psi_t(\alpha)$ is a $P^{0,\eta}$-martingale adapted to $F_t$, and $P^{\alpha,\eta}$ defined by
$$\frac{dP^{\alpha,\eta}}{dP^{0,\eta}}\bigg|_{F_t} = \Psi_t(\alpha)$$
is easily seen to be a Markov family, with generator $L^\alpha$. Under $P^{\alpha,\eta}$, the rates of the processes $N_t^z$ now become $\lambda_\alpha(z,t) = p(z)\, e^{\alpha \cdot z}\,(1 - \eta_t(z))$. This construction permits the computation of the relative entropy of $P^{\alpha,\eta}$ with respect to $P^{0,\eta}$ on $F_t$:
$$H_t\big(P^{\alpha,\eta} \,\big|\, P^{0,\eta}\big) := \int \Psi_t(\alpha) \log \Psi_t(\alpha)\, dP^{0,\eta} = E^{\alpha,\eta}[\log \Psi_t(\alpha)].$$
In terms of the $P^{\alpha,\eta}$-martingales $M_t^z = N_t^z - \int_0^t \lambda_\alpha(z,s)\,ds$ we have
$$\log \Psi_t(\alpha) = \sum_z \alpha \cdot z\, M_t^z + \int_0^t h_\alpha(\eta_s)\,ds,$$
where $h_\alpha$ is defined in (6). Hence
$$H_t\big(P^{\alpha,\eta} \,\big|\, P^{0,\eta}\big) = E^{\alpha,\eta}\Big[ \int_0^t h_\alpha(\eta_s)\,ds \Big]. \tag{7}$$
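For intuition about (7), consider the simplest caricature in which the jump rate is a constant $\lambda$ and only one direction is involved, with $\theta = \alpha \cdot z$. Then $N_t$ is Poisson with mean $\lambda t$ under the reference measure and with mean $\lambda e^{\theta} t$ under the tilted one. The sketch below checks numerically that the exponential tilt has expectation one, and that the relative entropy equals $t\,\lambda(\theta e^{\theta} - e^{\theta} + 1)$, the analogue of $\int_0^t h_\alpha\,ds$. The constant rate and the helper names are assumptions of this sketch, not the paper's setting.

```python
import math


def poisson_pmf(k, mean):
    """Probability that a Poisson variable with the given mean equals k."""
    return math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))


def tilt_checks(lam=0.7, theta=0.4, t=3.0, kmax=200):
    """Return (E0[tilt], relative entropy, t * lam * (theta e^theta - e^theta + 1))."""
    m0 = lam * t                      # mean under the reference measure
    m1 = lam * math.exp(theta) * t    # mean under the tilted measure
    mean_density = 0.0
    entropy = 0.0
    for k in range(kmax + 1):
        # Girsanov-type log-density theta * N - lam * (e^theta - 1) * t at N = k.
        log_density = theta * k - lam * (math.exp(theta) - 1.0) * t
        mean_density += poisson_pmf(k, m0) * math.exp(log_density)
        entropy += poisson_pmf(k, m1) * log_density
    h = lam * (theta * math.exp(theta) - math.exp(theta) + 1.0)
    return mean_density, entropy, t * h
```

The truncation at `kmax` is harmless here because the Poisson means are of order one; for larger means it would have to grow accordingly.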
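We will also need to estimate $E^{0,\eta}[\Psi_t(\alpha)^q]$, for $q > 1$:
$$\Psi_t(\alpha)^q = \Psi_t(q\alpha)\, \exp\Big\{ \int_0^t \sum_z p(z)\,\big(e^{q\alpha \cdot z} - q e^{\alpha \cdot z} + q - 1\big)\,(1 - \eta_s(z))\,ds \Big\}$$
$$= \Psi_t(q\alpha)\, \exp\Big\{ (q-1) \int_0^t h_\alpha(\eta_s)\,ds \Big\} \times \exp\Big\{ \int_0^t \sum_z p(z)\, e^{\alpha \cdot z}\,\big(e^{(q-1)\alpha \cdot z} - (q-1)\,\alpha \cdot z - 1\big)\,(1 - \eta_s(z))\,ds \Big\}.$$
It follows immediately that
$$\lim_{q \downarrow 1} \frac{1}{(q-1)t} \log E^{0,\eta}\Big[ \Psi_t(\alpha)^q\, e^{(1-q)\int_0^t h_\alpha(\eta_s)\,ds} \Big] = 0. \tag{8}$$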
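The factorization of $\Psi_t(\alpha)^q$ used to obtain (8) rests on an elementary identity for each exponent $\theta = \alpha \cdot z$, namely $e^{q\theta} - q e^{\theta} + q - 1 = (q-1)(\theta e^{\theta} - e^{\theta} + 1) + e^{\theta}\big(e^{x} - x - 1\big)$ with $x = (q-1)\theta$. A quick numerical check, with illustrative function names, also shows that the remainder divided by $q - 1$ becomes small as $q \downarrow 1$, which is the mechanism behind (8).

```python
import math


def full_exponent(q, theta):
    """e^{q theta} - q e^{theta} + q - 1, the integrand appearing in Psi^q."""
    return math.exp(q * theta) - q * math.exp(theta) + q - 1.0


def entropy_part(q, theta):
    """(q - 1) times the single-direction integrand of h_alpha."""
    return (q - 1.0) * (theta * math.exp(theta) - math.exp(theta) + 1.0)


def remainder(q, theta):
    """e^{theta} * (e^x - x - 1) with x = (q - 1) * theta; of order (q - 1)^2."""
    x = (q - 1.0) * theta
    return math.exp(theta) * (math.expm1(x) - x)
```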
In order to prove the a.s. bounds asserted by Theorem 1 we are going to use some large deviations estimates that we state below:

Proposition 1. For every $\varepsilon > 0$ there exists a strictly positive constant $C(\varepsilon)$ such that:
$$P^{\alpha,\eta}\Big[ \sum_z \ell \cdot z\, M_t^z > t\varepsilon \Big] \le e^{-C(\varepsilon)t}. \tag{9}$$

Proof. The argument is pretty standard, and it is based on the fact that for every $\ell \in \mathbb{R}^d$ we can define the following $P^{\alpha,\eta}$-exponential martingales:
$$A_t(\ell) := \exp\Big\{ \sum_z \ell \cdot z\, M_t^z - \int_0^t \sum_z p(z)\, e^{\alpha \cdot z}\,\big(e^{\ell \cdot z} - \ell \cdot z - 1\big)\,(1 - \eta_s(z))\,ds \Big\}.$$
The rest is simply Jensen's inequality.
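To make the mechanism behind Proposition 1 concrete, consider the toy case of a single Poisson process $N_t$ with constant rate $\lambda$ and compensated martingale $M_t = N_t - \lambda t$. The exponential martingale $\exp\{\gamma M_t - \lambda t\,(e^{\gamma} - \gamma - 1)\}$ combined with an exponential Chebyshev bound gives $P[M_t > t\varepsilon] \le e^{-C(\varepsilon)t}$ with $C(\varepsilon) = \sup_{\gamma > 0}\{\gamma\varepsilon - \lambda(e^{\gamma} - \gamma - 1)\}$. The sketch below, whose constant rate and function names are assumptions for illustration, compares this bound with the exact Poisson tail.

```python
import math


def rate(eps, lam):
    """C(eps) = sup over gamma > 0 of gamma*eps - lam*(e^gamma - gamma - 1)."""
    gamma = math.log(1.0 + eps / lam)      # maximizer of the concave objective
    return gamma * (eps + lam) - eps


def poisson_tail(threshold, mean, kmax=2000):
    """P[N > threshold] for N Poisson with the given mean (direct summation)."""
    start = math.floor(threshold) + 1
    return sum(
        math.exp(-mean + k * math.log(mean) - math.lgamma(k + 1))
        for k in range(start, kmax)
    )


def compare(eps=0.5, lam=1.0, t=20.0):
    """Return (exact tail, exponential bound) for P[N_t - lam*t > eps*t]."""
    exact = poisson_tail((lam + eps) * t, lam * t)
    bound = math.exp(-rate(eps, lam) * t)
    return exact, bound
```

In the setting of the paper the rates depend on the configuration, but they are bounded, which is what allows the same argument to go through with a constant $C(\varepsilon)$.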
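Proposition 2. Let $f$ be a local function on $X$. Then, for every $\varepsilon > 0$ there exist strictly positive constants $C(\varepsilon)$, $A$ such that:
$$P^{\alpha,\eta}\Big[ \int_0^t L^\alpha f(\eta_s)\,ds > t\varepsilon \Big] \le A\, e^{-C(\varepsilon)t}. \tag{10}$$

Proof. The proof is very similar to that of the previous proposition. The appropriate exponential martingales are now given by
$$M_t^f = \exp\Big\{ f(\eta_t) - \int_0^t e^{-f} L^\alpha e^{f}(\eta_s)\,ds \Big\}. \tag{11}$$
Notice that if a generator $L$ has the form $Lf(x) = \sum_y r(x,y)\,(f(y) - f(x))$, then
$$G(f) := e^{-f} L e^{f} - Lf = \sum_y r(x,y)\, Q\big(f(y) - f(x)\big), \tag{12}$$
where $Q(x) := e^x - x - 1$. For any $\gamma < 0$, using the fact that (11) is a martingale we have
$$P^{\alpha,\eta}\Big[ \frac{1}{t} \int_0^t L^\alpha f(\eta_s)\,ds > \varepsilon \Big] \le e^{\gamma\varepsilon t}\, E^{\alpha,\eta}\Big[ \exp\Big\{ -\gamma \int_0^t L^\alpha f(\eta_s)\,ds \Big\} \Big]$$
$$\le \exp\Big\{ \sup_\xi f(\xi) - \inf_\xi f(\xi) \Big\}\, \exp\Big\{ t\,\big(\gamma\varepsilon + \|G(\gamma f)\|_\infty\big) \Big\}.$$
For a local $f$, $\|G(\gamma f)\|_\infty$ is finite and because of (12) it behaves like $\gamma^2$ for small values of $\gamma$. Thus, with a suitable choice $\gamma < 0$ we obtain (10).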
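Identity (12) is easy to confirm on a small finite chain. The sketch below uses an arbitrary three-state rate matrix and test function, both made up for illustration, and checks that $e^{-f} L e^{f} - Lf$ agrees with $\sum_y r(x,y)\,Q(f(y) - f(x))$ at every state. It also shows that $G(\gamma f)/\gamma^2$ stays bounded for small $\gamma$, which is the quadratic behavior used at the end of the proof.

```python
import math

# Hypothetical jump rates r(x, y) on three states (diagonal entries unused).
RATES = [[0.0, 1.0, 0.5],
         [2.0, 0.0, 0.3],
         [0.7, 0.1, 0.0]]
F = [0.2, -1.0, 0.6]        # hypothetical test function


def generator(g, x):
    """(L g)(x) = sum_y r(x, y) * (g(y) - g(x))."""
    return sum(RATES[x][y] * (g[y] - g[x]) for y in range(len(g)))


def G_direct(f, x):
    """e^{-f} L e^{f} - L f evaluated at state x."""
    ef = [math.exp(v) for v in f]
    return math.exp(-f[x]) * generator(ef, x) - generator(f, x)


def G_formula(f, x):
    """sum_y r(x, y) * Q(f(y) - f(x)) with Q(u) = e^u - u - 1."""
    def Q(u):
        return math.exp(u) - u - 1.0
    return sum(RATES[x][y] * Q(f[y] - f[x]) for y in range(len(f)))
```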
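Remark. By essentially repeating the same arguments one can show that
$$P^{\alpha,\eta}\Big[ \sum_z \ell \cdot z\, M_t^z < -t\varepsilon \Big] \quad \text{and} \quad P^{\alpha,\eta}\Big[ \frac{1}{t} \int_0^t L^\alpha f(\eta_s)\,ds < -\varepsilon \Big]$$
decay exponentially in time, as well.

Proposition 3. Let $f \in C$. For every $\varepsilon > 0$ there exists a strictly positive constant $C(\varepsilon)$ such that:
$$P^{\alpha,\rho}\Big[ \frac{1}{t} \int_0^t \big(f(\eta_s) - h_\alpha(\eta_s)\big)\,ds > \lambda_\rho^*(f) + \varepsilon \Big] \le e^{-C(\varepsilon)t}. \tag{13}$$

Proof. For every $\gamma > 0$ we have
$$P^{\alpha,\rho}\Big[ \frac{1}{t} \int_0^t \big(f(\eta_s) - h_\alpha(\eta_s)\big)\,ds > x \Big] \le e^{-x\gamma t}\, E^{\alpha,\rho}\Big[ \exp\Big\{ \gamma \int_0^t \big(f(\eta_s) - h_\alpha(\eta_s)\big)\,ds \Big\} \Big]$$
$$= e^{-x\gamma t}\, E^{0,\rho}\Big[ \Psi_t(\alpha) \cdot \exp\Big\{ \gamma \int_0^t \big(f(\eta_s) - h_\alpha(\eta_s)\big)\,ds \Big\} \Big].$$
By Hölder's inequality now, for all $p, q > 1$ such that $\frac{1}{p} + \frac{1}{q} = 1$ we have
$$\frac{1}{t} \log P^{\alpha,\rho}\Big[ \frac{1}{t} \int_0^t (f - h_\alpha)(\eta_s)\,ds > x \Big] \le -\gamma x + \frac{1}{qt} \log E^{0,\rho}\Big[ \Psi_t(\alpha)^q\, e^{-q\gamma \int_0^t h_\alpha(\eta_s)\,ds} \Big] + \frac{1}{pt} \log E^{0,\rho}\Big[ \exp\Big\{ \gamma p \int_0^t f(\eta_s)\,ds \Big\} \Big].$$
By an application of the Feynman-Kac formula, the last term can be estimated by $\frac{1}{p} \lambda_\rho^*(\gamma p f)$. Choosing $\gamma = \frac{1}{p} = \frac{q-1}{q}$, we can estimate the left-hand side in the previous relation by
$$-\frac{q-1}{q} \Big[ x - \frac{1}{(q-1)t} \log E^{0,\rho}\Big[ \Psi_t(\alpha)^q\, e^{(1-q) \int_0^t h_\alpha(\eta_s)\,ds} \Big] - \lambda_\rho^*(f) \Big].$$
Thus, if $x = \lambda_\rho^*(f) + \varepsilon$, then (13) follows by (8) with a suitable choice of $q > 1$.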
We can now conclude the proof of Theorem 1. Recall Eq. (4) for the position of the tagged particle. After rescaling we have
$$\varepsilon X_{t\varepsilon^{-1}} \cdot \ell = \varepsilon \sum_z \ell \cdot z\, M^z_{t\varepsilon^{-1}} + \varepsilon \int_0^{t\varepsilon^{-1}} f_\alpha^\ell(\eta_s)\,ds.$$
Clearly, the martingale term on the right-hand side goes to zero by (9).
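Let us define now the random measures $\nu_t := \frac{1}{t} \int_0^t \delta_{\eta_s}\,ds \in M$, so that for $f \in C$: $\frac{1}{t} \int_0^t f(\eta_s)\,ds = \langle f, \nu_t\rangle$. It would suffice to show now that with $P^{\alpha,\rho}$-probability 1, any weak limit of $\nu_t$ is in $A_{\alpha,\rho}$. In other words, $P^{\alpha,\rho}$-a.s. we have
$$\lim_{t \to \infty} \frac{1}{t} \int_0^t L^\alpha f(\eta_s)\,ds = 0, \qquad \forall f \in \mathrm{Dom}(L^\alpha), \tag{14}$$
and
$$\limsup_{t \to \infty} \frac{1}{t} \int_0^t \big(f(\eta_s) - h_\alpha(\eta_s)\big)\,ds \le \lambda_\rho^*(f), \qquad \forall f \in C. \tag{15}$$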
Equations (10) and (13) essentially prove this limiting behavior for one local function $f$. But that's really all we need. Since $X$ is compact, $C$ is separable. On the other hand local functions are a dense subset of $C$ and a core for $L^\alpha$. Once the asserted limiting behavior has been established $P^{\alpha,\rho}$-a.s. for one local $f$, it is true $P^{\alpha,\rho}$-a.s. for a dense countable family of local functions, and (14), (15) follow easily by approximation. Thus, for any continuous function $f$ (and in particular for the local function $f_\alpha^\ell$ appearing in the statement of Theorem 1) with $P^{\alpha,\rho}$-probability 1 we have
$$t \inf_{\mu \in A_{\alpha,\rho}} E^\mu[f] \;\le\; \liminf_{\varepsilon \to 0}\, \varepsilon \int_0^{t\varepsilon^{-1}} f(\eta_s)\,ds \;\le\; \limsup_{\varepsilon \to 0}\, \varepsilon \int_0^{t\varepsilon^{-1}} f(\eta_s)\,ds \;\le\; t \sup_{\mu \in A_{\alpha,\rho}} E^\mu[f].$$
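4. Properties of the Sets $A_{\alpha,\rho}$

We will investigate now some properties of the sets $A_{\alpha,\rho} \subset M$ appearing in Theorem 1. Let us begin by showing that $A_{\alpha,\rho}$ are non-empty. The natural candidate for an element in $A_{\alpha,\rho}$ is any weak limit of $\mu_t$, defined by
$$\langle f, \mu_t\rangle := E^{\alpha,\rho}\Big[ \frac{1}{t} \int_0^t f(\eta_s)\,ds \Big], \qquad \forall f \in C.$$
If $\mu$ is any weak limit of $\mu_t$, then it is invariant by construction. On the other hand for any $f \in C$ the entropy inequality gives:
$$\langle f, \mu_t\rangle \le \frac{1}{t} H_t\big(P^{\alpha,\rho} \,\big|\, P^{0,\rho}\big) + \frac{1}{t} \log E^{0,\rho}\Big[ \exp \int_0^t f(\eta_s)\,ds \Big] \le \langle h_\alpha, \mu_t\rangle + \lambda_\rho^*(f),$$
the last inequality following from (7) and the Feynman-Kac formula. Now passing to the limit along an appropriate subsequence we get:
$$\langle f, \mu\rangle \le \langle h_\alpha, \mu\rangle + \lambda_\rho^*(f),$$
and therefore $\mu \in A_{\alpha,\rho}$.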
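The proof of Theorem 2 is based on the following:

Lemma 1. Let $\nu \in A_{\alpha,\rho}$ and $\Lambda$ be a cube in $\mathbb{Z}^d$, $d \ge 3$. Define:
$$f_\Lambda(\eta) := \frac{1}{|\Lambda|} \sum_{x \in \Lambda} (\eta(x) - \rho).$$
Then,
$$E^\nu\big[f_\Lambda^2\big] \le \frac{\rho(1-\rho)}{|\Lambda|} + C\, \|h_\alpha\|_\infty^{1/2}\, \delta_\Lambda,$$
where $C$ is a constant and $\delta_\Lambda = |\Lambda|^{\frac{2-d}{2d}}$.

Let us see how this lemma implies Theorem 2. Consider an increasing sequence of cubes $\{\Lambda_n\}$ such that $|\Lambda_n| \to \infty$. Divide the cubes into the sets $D_k := \{\Lambda_n, \text{ whose side lies in } [k^3, (k+1)^3)\}$. Choose the largest representative from each non-empty $D_k$ and form a subsequence $\{V_k\}$ of $\{\Lambda_n\}$. By Chebyshev's inequality and Lemma 1, for every $\varepsilon > 0$ we have
$$\nu\big[\,|f_{V_k}| > \varepsilon\,\big] \le \frac{1}{\varepsilon^2} E^\nu\big[f_{V_k}^2\big] \le \frac{1}{\varepsilon^2} \Big( \rho(1-\rho)\, k^{-3d} + C\, |\alpha|\, k^{\frac{3}{2}(2-d)} \Big).$$
The Borel-Cantelli lemma guarantees the $\nu$-a.s. convergence to 0 of $\{f_{V_k}\}$. Now if $\Lambda = \Lambda_n \in D_k$ we have
$$f_{V_k} = \frac{|\Lambda|}{|V_k|}\, f_\Lambda + \frac{|V_k \setminus \Lambda|}{|V_k|}\, f_{V_k \setminus \Lambda}.$$
Since for every $\Lambda$, $|f_\Lambda| \le 1$ we have
$$\sup_{\Lambda \in D_k} |f_{V_k} - f_\Lambda| \le 2 \sup_{\Lambda \in D_k} \frac{|V_k \setminus \Lambda|}{|V_k|} \le 2\, \frac{(k+1)^3 - k^3}{k^3} \to 0, \quad \text{as } k \to \infty.$$
Together with the $\nu$-a.s. convergence to 0 of $f_{V_k}$ this implies the $\nu$-a.s. convergence to 0 of $f_{\Lambda_n}$ and thus, the law of large numbers asserted by Theorem 2.

Theorem 2 has the following immediate corollaries:

Corollary 1. If $\nu_\rho \in A_{\alpha,\rho}$, $\nu_{\rho'} \in A_{\alpha,\rho'}$ and $\rho \neq \rho'$ then $\nu_\rho \perp \nu_{\rho'}$.

Corollary 2. There exists a measure $\nu_\rho^\alpha$, which is invariant and ergodic for $L^\alpha$ and such that for every increasing family of cubes $\Lambda_n$, with $|\Lambda_n| \to \infty$:
$$\frac{1}{|\Lambda_n|} \sum_{x \in \Lambda_n} \eta(x) \longrightarrow \rho, \qquad \nu_\rho^\alpha\text{-a.s.}$$

Remark. Uniqueness for each $\rho$ of such an ergodic measure $\nu_\rho^\alpha$ would imply that $A_{\alpha,\rho}$ is a singleton. In view of Theorem 1, this would prove that $P^{\alpha,\rho}$-a.s. the tagged particle moves after rescaling with velocity
$$v(\alpha) = E^{\nu_\rho^\alpha}\Big[ \sum_z z\, p(z)\, e^{\alpha \cdot z}\,(1 - \eta(z)) \Big].$$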
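Let us now turn to the proof of Lemma 1. Without loss, we may prove the inequality for $\Lambda \setminus \{0\}$ instead of $\Lambda$. In order to keep the notation simple though, we will still denote $\Lambda \setminus \{0\}$ by $\Lambda$. If $\nu \in A_{\alpha,\rho}$, then we have
$$\langle f, \nu\rangle \le \|h_\alpha\|_\infty + \lambda_\rho^*(f), \qquad \forall f \in C,$$
and hence
$$\langle f_\Lambda^2, \nu\rangle \le \inf_{\gamma > 0} \Big\{ \frac{\|h_\alpha\|_\infty}{\gamma} + \frac{1}{\gamma} \lambda_\rho^*\big(\gamma f_\Lambda^2\big) \Big\} = \langle f_\Lambda^2, \mu_\rho\rangle + \inf_{\gamma > 0} \Big\{ \frac{\|h_\alpha\|_\infty}{\gamma} + \frac{1}{\gamma} \lambda_\rho^*(\gamma \vartheta_\Lambda) \Big\}, \tag{16}$$
where
$$\vartheta_\Lambda := f_\Lambda^2 - \langle f_\Lambda^2, \mu_\rho\rangle = f_\Lambda^2 - \frac{\rho(1-\rho)}{|\Lambda|} = \frac{1 - 2\rho}{|\Lambda|^2} \sum_{x \in \Lambda} (\eta(x) - \rho) + \frac{1}{|\Lambda|^2} \sum_{\substack{x,y \in \Lambda \\ x \neq y}} (\eta(x) - \rho)(\eta(y) - \rho).$$
The proof now relies on estimating the principal eigenvalue $\lambda_\rho^*(\gamma \vartheta_\Lambda)$. Using the variational characterization of $\lambda_\rho^*$ we have
$$\lambda_\rho^*(\gamma \vartheta_\Lambda) = \sup_{\|\phi\|_2 = 1} \big\{ \gamma \langle \vartheta_\Lambda, \phi^2\rangle_\rho - D_\rho(\phi) \big\}.$$
At this point we would like to integrate by parts in $\langle \vartheta_\Lambda, \phi^2\rangle_\rho$ and create a gradient of $\phi$ that we can control by the Dirichlet form. We cannot hope to do so in terms of the operators $\eta \to \eta^{xy}$ or $\eta \to \tau_z \eta$ because $\eta(x) - \rho$ and $(\eta(x) - \rho)(\eta(y) - \rho)$ do not have zero expectation under all measures $\mu_\rho$. However, we can do so in terms of the operator $\sigma^x$ that flips the configuration at the site $x$:
$$\sigma^x \eta(z) = \begin{cases} \eta(z), & \text{if } z \neq x, \\ 1 - \eta(z), & \text{if } z = x. \end{cases}$$
This is the fundamental operation of another important particle process, the Glauber dynamics, under which the configuration at each site (independently of the other sites) flips from 1 to 0 at rate $\rho^{-1}$ and from 0 to 1 at rate $(1-\rho)^{-1}$. In our case we will not allow flips of the configuration at 0. The generator of this process is given by
$$G_\rho f(\eta) = \sum_{x \neq 0} \Big(\frac{1}{\rho}\Big)^{\eta(x)} \Big(\frac{1}{1-\rho}\Big)^{1-\eta(x)} \big[ f(\sigma^x \eta) - f(\eta) \big].$$
The dynamics are reversible with respect to $\mu_\rho$ and the associated Dirichlet form is given by
$$D_g(f) = \sum_{x \neq 0} D_{g,x}(f) = \sum_{x \neq 0} \int \big[ f(\sigma^x \eta) - f(\eta) \big]^2\, d\mu_\rho.$$
The spectral gap property that $D_g$ enjoys shows that every $\mu_\rho$-mean zero local function is in the range of $G_\rho$, so we can integrate by parts and pass the gradient on $\phi$. The problem becomes now controlling the Glauber form by the Dirichlet form of the simple exclusion. This was shown in ([SVY]) using duality techniques: If we define
$$\xi_x(\eta) = \frac{\eta(x) - \rho}{\sqrt{\rho(1-\rho)}}, \quad \text{and} \quad \xi_A(\eta) = \begin{cases} 1, & \text{if } A = \emptyset, \\ \prod_{x \in A} \xi_x(\eta), & \text{if } A \neq \emptyset, \end{cases}$$
then the functions $\{\xi_A ;\, A \subset \mathbb{Z}^d_*\}$ form an orthonormal basis for $L^2(\mu_\rho)$ and provide the orthogonal decomposition:
$$L^2(\mu_\rho) = \bigoplus_{n \ge 0} H_n, \quad \text{where } H_n = \mathrm{span}\{\xi_A ;\, |A| = n\}.$$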
If $u \in L^2(\mu_\rho)$ then we can write $u = \sum_A \tilde{u}(A)\, \xi_A$ and a straightforward computation shows that for all $x \neq 0$:
$$\int \eta(x)\,\big(u(\sigma^x \eta) - u(\eta)\big)^2\, d\mu_\rho = \frac{1}{1-\rho} \sum_{A \ni x} \tilde{u}^2(A), \tag{17}$$
$$\int (1 - \eta(x))\,\big(u(\sigma^x \eta) - u(\eta)\big)^2\, d\mu_\rho = \frac{1}{\rho} \sum_{A \ni x} \tilde{u}^2(A), \tag{18}$$
$$\int \eta(x)\,\big(u(\sigma^x \eta) + u(\eta)\big)^2\, d\mu_\rho \le \frac{1}{1-\rho}\, \|u\|_2^2. \tag{19}$$
Transience plays an important role in controlling the Glauber form by the Dirichlet form of the simple exclusion process. If $p(t,x,y)$ stands for the transition probability function for the underlying walk, then the Green's function defined by
$$g(x,y) = \int_0^\infty p(t,x,y)\, dt$$
is finite for all $x$ and $y$ in $\mathbb{Z}^d$. Let now $u \in L^2(\mu_\rho)$, and set
$$v^2(x) := \sum_{A \ni x} \tilde{u}^2(A).$$
It is shown in ([SVY]) under the assumption of transience, that for every nonnegative compactly supported function $V(x)$ we have
$$\sum_x V(x)\, v^2(x) \le C\, \sup_x \sum_y V(y)\, g(x,y)\; D_\rho^{ex}(u), \tag{20}$$
where
$$D_\rho^{ex}(f) = \frac{1}{4} \sum_{x,y \neq 0} p(y-x) \int \big(f(\eta^{xy}) - f(\eta)\big)^2\, d\mu_\rho$$
is the piece of the Dirichlet form that corresponds to jumps of the environment.

We are going to use these ideas now, to estimate $\lambda_\rho^*(\gamma \vartheta_\Lambda)$. Note first that for $x \neq 0$:
$$\int (\eta(x) - \rho)\, h(\eta)\, d\mu_\rho = (1-\rho) \int \eta(x)\,\big[ h(\eta) - h(\sigma^x \eta) \big]\, d\mu_\rho. \tag{21}$$
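Identity (21) only involves the single-site marginal at $x$, so it can be checked by brute force on a handful of sites. In the sketch below the number of sites, the density and the test function are arbitrary choices for illustration, and the conditioning at the origin is ignored since it does not affect the sites $x \neq 0$.

```python
import itertools


def bernoulli_weight(eta, rho):
    """Product Bernoulli(rho) weight of a configuration eta in {0,1}^n."""
    w = 1.0
    for occ in eta:
        w *= rho if occ == 1 else 1.0 - rho
    return w


def flip(eta, x):
    """Configuration sigma^x eta: flip the occupation at site x."""
    return eta[:x] + (1 - eta[x],) + eta[x + 1:]


def both_sides(h, x, rho, n_sites=4):
    """Return the left- and right-hand sides of (21) for the function h."""
    lhs = rhs = 0.0
    for eta in itertools.product((0, 1), repeat=n_sites):
        w = bernoulli_weight(eta, rho)
        lhs += w * (eta[x] - rho) * h(eta)
        rhs += w * (1.0 - rho) * eta[x] * (h(eta) - h(flip(eta, x)))
    return lhs, rhs
```

Both sides reduce to $\rho(1-\rho)$ times the average over the other sites of the increment of $h$ when the site $x$ is switched from empty to occupied, which is why the identity holds for every $h$.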
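Therefore,
$$\gamma \langle \vartheta_\Lambda, \phi^2\rangle_\rho = \frac{\gamma(1-2\rho)}{|\Lambda|^2} \sum_{x \in \Lambda} \langle \eta(x) - \rho, \phi^2\rangle_\rho + \frac{\gamma}{|\Lambda|^2} \sum_{\substack{x,y \in \Lambda \\ x \neq y}} \langle (\eta(x) - \rho)(\eta(y) - \rho), \phi^2\rangle_\rho$$
$$= \frac{\gamma(1-2\rho)(1-\rho)}{|\Lambda|^2} \sum_{x \in \Lambda} \int \eta(x)\,\big(\phi(\eta) - \phi(\sigma^x \eta)\big)\big(\phi(\eta) + \phi(\sigma^x \eta)\big)\, d\mu_\rho$$
$$\quad + \frac{\gamma(1-\rho)}{|\Lambda|^2} \sum_{\substack{x,y \in \Lambda \\ x \neq y}} \int \eta(x)(\eta(y) - \rho)\,\big(\phi(\eta) - \phi(\sigma^x \eta)\big)\big(\phi(\eta) + \phi(\sigma^x \eta)\big)\, d\mu_\rho$$
$$\le \frac{\gamma(1-\rho)}{|\Lambda|} \sum_{x \in \Lambda} \Big( \int_{\eta(x)=1} \big(\phi(\eta) - \phi(\sigma^x \eta)\big)^2\, d\mu_\rho \Big)^{1/2} \Big( \int_{\eta(x)=1} \big(\phi(\eta) + \phi(\sigma^x \eta)\big)^2\, d\mu_\rho \Big)^{1/2}$$
$$\le \gamma \Big( \frac{1-\rho}{|\Lambda|} \sum_{x \in \Lambda} \int_{\eta(x)=1} \big(\phi(\eta) - \phi(\sigma^x \eta)\big)^2\, d\mu_\rho \Big)^{1/2} \Big( \frac{1-\rho}{|\Lambda|} \sum_{x \in \Lambda} \int_{\eta(x)=1} \big(\phi(\eta) + \phi(\sigma^x \eta)\big)^2\, d\mu_\rho \Big)^{1/2}.$$
By (17), (19), for such $\phi$ that $\|\phi\|_2 = 1$ the last expression can be bounded by
$$\gamma \Big( \frac{1}{|\Lambda|} \sum_{x \in \Lambda} \sum_{A \ni x} \tilde{\phi}^2(A) \Big)^{1/2} \le \gamma \Big( \frac{C}{|\Lambda|} \sup_x \sum_y g(x,y)\, \chi_\Lambda(y) \Big)^{1/2} \big( D_\rho^{ex}(\phi) \big)^{1/2},$$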
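the last inequality following from (20). Thus,
$$\sup_{\|\phi\|_2 = 1} \big\{ \gamma \langle \vartheta_\Lambda, \phi^2\rangle_\rho - D_\rho(\phi) \big\} \le \frac{\gamma^2 C}{4\,|\Lambda|} \sup_x g(x, \Lambda),$$
where $g(x, \Lambda)$ is the Green's measure of the cube $\Lambda$. Minimizing over $\gamma$ in (16) we have for every $\nu \in A_{\alpha,\rho}$:
$$E^\nu\big[f_\Lambda^2\big] \le \frac{\rho(1-\rho)}{|\Lambda|} + \Big( C\, \|h_\alpha\|_\infty\, \frac{\sup_x g(x,\Lambda)}{|\Lambda|} \Big)^{1/2}.$$
On the other hand, it follows by standard estimates on the Green's function for random walks (cf. [Sp]) that $\sup_x g(x, \Lambda) = O(|\Lambda|^{2/d})$ for large $|\Lambda|$ and the proof of Lemma 1 is completed.

5. The Behavior of $A_{\alpha,\rho}$ as $\alpha \to 0$

Let us first examine the behavior for small values of $\gamma$ of
$$\gamma^{-1} \lambda_\rho^*(\gamma f) = \frac{1}{\gamma} \sup_{\|\phi\|_2 = 1} \big\{ \gamma \langle f, \phi^2\rangle_\rho - D_\rho(\phi) \big\}.$$
Define the sets $A_{\gamma,\rho} \subset M$ as follows:
$$A_{\gamma,\rho} = \big\{ \nu \in M : d\nu = \phi^2\, d\mu_\rho, \text{ with } \|\phi\|_2 = 1,\ D_\rho(\phi) \le \gamma \big\}.$$
It follows easily that $A_{\gamma,\rho}$ are convex, and they are clearly shrinking as $\gamma \downarrow 0$. If we define
$$A_\rho := \bigcap_{\gamma > 0} A_{\gamma,\rho},$$
then $A_\rho$ is a nonempty ($A_\rho \ni \mu_\rho$), convex, compact subset of $M$. It is also not hard to see that for all $f \in C$:
$$\lim_{\gamma \downarrow 0} \gamma^{-1} \lambda_\rho^*(\gamma f) = \sup_{\nu \in A_\rho} \langle f, \nu\rangle, \tag{22}$$
and therefore,
$$\lim_{\gamma \uparrow 0} \gamma^{-1} \lambda_\rho^*(\gamma f) = \inf_{\nu \in A_\rho} \langle f, \nu\rangle.$$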
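A finite-state analogue may help fix ideas about $\gamma^{-1} \lambda_\rho^*(\gamma f)$. For the symmetric nearest-neighbour walk on a ring of $n$ sites, which is reversible with respect to the uniform measure, the principal eigenvalue of $L + \gamma f$ can be found by power iteration. For such an ergodic finite chain the only probability density with zero Dirichlet form is the constant one, so both one-sided limits coincide with the uniform average of $f$. The ring, the test function and the iteration count are assumptions of this sketch.

```python
import math


def top_eigenvalue(f, gamma, iterations=2000):
    """Largest eigenvalue of L + gamma*f for the walk on a ring with rates 1/2."""
    n = len(f)
    # Shifting by a multiple of the identity makes every diagonal entry positive,
    # so plain power iteration converges to the top eigenvector.
    shift = 2.0 + abs(gamma) * max(abs(v) for v in f)
    v = [1.0] * n
    for _ in range(iterations):
        w = [
            0.5 * (v[(i - 1) % n] + v[(i + 1) % n]) - v[i]   # (L v)(i)
            + gamma * f[i] * v[i] + shift * v[i]
            for i in range(n)
        ]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the unshifted operator at the converged unit vector.
    lv = [
        0.5 * (v[(i - 1) % n] + v[(i + 1) % n]) - v[i] + gamma * f[i] * v[i]
        for i in range(n)
    ]
    return sum(a * b for a, b in zip(v, lv))
```

Taking a constant test vector in the variational formula shows that the eigenvalue is at least $\gamma$ times the average of $f$, so the ratio approaches the average from above for $\gamma > 0$ and from below for $\gamma < 0$.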
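The following lemma identifies $A_\rho$.

Lemma 2. If the random walk in $\mathbb{Z}^d$ with one step transition probabilities $p(x,y) = p(y-x)$ is transient, then $A_\rho = \{\mu_\rho\}$. If the walk is recurrent, then $A_\rho = \mathrm{conv}\{\mu_\theta ;\, 0 \le \theta \le 1\}$.

Proof. If $\nu \in A_\rho$ then by a diagonalization argument we can construct a sequence $\{\phi_n\}$ with the following properties:

(i) $\|\phi_n\|_2^2 = \int \phi_n^2\, d\mu_\rho = 1$, (23)
(ii) $D_\rho(\phi_n) \to 0$, as $n \to \infty$, and (24)
(iii) if $d\nu_n := \phi_n^2\, d\mu_\rho$, then $\nu_n \Rightarrow \nu$ as $n \to \infty$. (25)

Then, for any $f$ continuous on $X$, and $x, y \neq 0$ we have
$$\int f(\eta^{xy})\, d\nu(\eta) - \int f(\eta)\, d\nu(\eta) = \lim_{n \to \infty} \int \big[ f(\eta^{xy}) - f(\eta) \big] \cdot \phi_n^2(\eta)\, d\mu_\rho(\eta) = \lim_{n \to \infty} \int f(\eta) \cdot \big[ \phi_n^2(\eta^{xy}) - \phi_n^2(\eta) \big]\, d\mu_\rho(\eta).$$
By Cauchy-Schwarz inequality, the last term is bounded by
$$\limsup_{n \to \infty} \Big( \int f^2(\eta)\,\big(\phi_n(\eta^{xy}) + \phi_n(\eta)\big)^2\, d\mu_\rho(\eta) \Big)^{1/2} \Big( \int \big(\phi_n(\eta^{xy}) - \phi_n(\eta)\big)^2\, d\mu_\rho(\eta) \Big)^{1/2}$$
$$\le \limsup_{n \to \infty}\, 2\, \|f\|_\infty \Big( \int \big(\phi_n(\eta^{xy}) - \phi_n(\eta)\big)^2\, d\mu_\rho(\eta) \Big)^{1/2}. \tag{26}$$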
Since the underlying walk is assumed to be irreducible, in all cases except the one-dimensional nearest neighbor one, we can find a sequence $x = x_0, x_1, \ldots, x_n = y$ such that $p(x_i - x_{i-1}) > 0$. So we can exchange the configuration at the sites $x, y$ in a number of steps, in each of which the bond we flip is present in the Dirichlet form. Therefore, the last term in (26) is controlled by a piece of the Dirichlet form of $\phi_n$, and hence
$$\Big| \int f(\eta^{xy})\, d\nu(\eta) - \int f(\eta)\, d\nu(\eta) \Big| \le \lim_{n \to \infty} C\, \|f\|_\infty \big( D_\rho^{ex}(\phi_n) \big)^{1/2} = 0.$$
It follows immediately that $\nu$ is invariant under finite permutations of the site configurations $\{\eta(z) ;\, z \in \mathbb{Z}^d_*\}$. Since these variables take values in $\{0, 1\}$, by an application of de Finetti's Theorem their distribution under $\nu$ is a convex combination of Bernoullis. Clearly, $\nu\{\eta(0) = 1\} = 1$, so $\nu$ is a convex combination of $\mu_\theta$'s, with $0 \le \theta \le 1$:
$$A_\rho \subset \mathrm{conv}\{\mu_\theta ;\, 0 \le \theta \le 1\}.$$
It only remains to examine which convex combinations of Bernoullis actually belong to $A_\rho$.

Transient case. Consider a local function $f$ that is measurable with respect to the $\sigma$-field $C_\Lambda := \sigma\{\eta(x) ;\, x \in \Lambda\}$, ($\Lambda \not\ni 0$). Let $G_\rho^\Lambda$ stand for the generator of the Glauber dynamics restricted to $\Lambda$, and $D_g^\Lambda$ stand for the associated Dirichlet form. The spectral gap property (cf. [LuY])
$$\big\| \phi - E^{\mu_\rho}[\phi] \big\|_2^2 \le \lambda\, D_g^\Lambda(\phi),$$
for all $C_\Lambda$-measurable functions $\phi \in L^2(\mu_\rho)$, implies that $\mathrm{Null}(G_\rho^\Lambda) \subset \mathrm{span}\{1\}$. Therefore, mean zero $C_\Lambda$-measurable functions are in $\mathrm{Range}(G_\rho^\Lambda)$, and we can find a $C_\Lambda$-measurable function $F$ such that:
$$f(\eta) - E^{\mu_\rho}[f] = G_\rho^\Lambda F(\eta).$$
Suppose now $\nu \in A_\rho$, and consider a sequence of functions $\phi_n$ with the properties (23)–(25). Then,
$$\langle f, \nu\rangle - \langle f, \mu_\rho\rangle = \lim_{n \to \infty} \int \big[ f(\eta) - \langle f, \mu_\rho\rangle \big]\, \phi_n^2(\eta)\, d\mu_\rho(\eta) = \lim_{n \to \infty} \int G_\rho^\Lambda F(\eta)\, \phi_n^2(\eta)\, d\mu_\rho(\eta) = \lim_{n \to \infty} \int F(\eta)\, G_\rho^\Lambda \phi_n^2(\eta)\, d\mu_\rho(\eta).$$
Now, by Cauchy-Schwarz and (23):
$$\Big| \int F\, G_\rho^\Lambda \phi_n^2\, d\mu_\rho \Big| = \Big| \sum_{x \in \Lambda} \int F(\eta)\, \Big(\frac{1}{\rho}\Big)^{\eta(x)} \Big(\frac{1}{1-\rho}\Big)^{1-\eta(x)} \big( \phi_n^2(\sigma^x \eta) - \phi_n^2(\eta) \big)\, d\mu_\rho(\eta) \Big|$$
$$\le 2\, \|F\|_\infty\, (\rho(1-\rho))^{-1/2} \sum_{x \in \Lambda} D_{g,x}(\phi_n)^{1/2} \le C_\rho(f)\, \big( D_\rho^{ex}(\phi_n) \big)^{1/2},$$
the last inequality following from (17)–(20). Letting $n \to \infty$ we obtain that $\langle f, \nu\rangle = \langle f, \mu_\rho\rangle$ for all local $f$ and hence that $\nu \equiv \mu_\rho$.

Recurrent case. Consider a sequence of functions $u_n : \mathbb{Z}^d \to (0, 1)$, such that for each $n$, $u_n(x) = \rho$ for all $x$ outside a finite subset of $\mathbb{Z}^d$. Define now
$$G_{u_n}(\eta) := \prod_{x \neq 0} \phi_{u_n(x)}(\eta(x)),$$
where
$$\phi_x(0) = \Big( \frac{1-x}{1-\rho} \Big)^{1/2}, \quad \text{and} \quad \phi_x(1) = \Big( \frac{x}{\rho} \Big)^{1/2}.$$
The goal is to choose the sequence $u_n$ so that $G_{u_n}$ satisfies the properties (23)–(25), with $\nu = \mu_\theta$. It is easy to see that (23) is satisfied and so is (25) as long as
$$u_n(x) \longrightarrow \theta, \quad \text{as } n \to \infty,\ \forall x \in \mathbb{Z}^d. \tag{27}$$
A simple computation shows that
$$D_{xy}(G_u) := \int \big( G_u(\eta^{xy}) - G_u(\eta) \big)^2\, d\mu_\rho = 2 \Big( \sqrt{u(x)(1-u(y))} - \sqrt{u(y)(1-u(x))} \Big)^2,$$
$$D_z(G_u) := \int (1 - \eta(z))\,\big( G_u(\tau_z \eta) - G_u(\eta) \big)^2\, d\mu_\rho = \Big( \sqrt{1-u(-z)} - \sqrt{1-u(z)} \Big)^2$$
$$\quad + 2 \sqrt{(1-u(-z))(1-u(z))} \times \Big\{ 1 - \prod_{x \neq 0, z} \Big[ \sqrt{u(x)u(x-z)} + \sqrt{(1-u(x))(1-u(x-z))} \Big] \Big\}.$$
Using the elementary inequality $1 - \prod_{i=1}^n x_i \le \sum_{i=1}^n (1 - x_i^2)$, for $0 \le x_i \le 1$, it follows that:
$$D_\rho(G_{u_n}) = \frac{1}{4} \sum_{x,y \neq 0} p(y-x)\, D_{xy}(G_{u_n}) + \frac{1}{2} \sum_z p(z)\, D_z(G_{u_n})$$
$$\le \frac{3}{2} \sum_{x,y \neq 0} p(y-x) \Big( \sqrt{u_n(x)(1-u_n(y))} - \sqrt{u_n(y)(1-u_n(x))} \Big)^2 + \frac{1}{2} \sum_z p(z) \Big( \sqrt{1-u_n(z)} - \sqrt{1-u_n(-z)} \Big)^2. \tag{28}$$
The last term in (28) tends to zero if (27) holds. We may also assume that $\theta \ge \frac{1}{2}$, because if $u_n(x) \to \theta$ and $D_\rho(G_{u_n}) \to 0$, then by (28), $D_\rho(G_{(1-u_n)}) \to 0$, as well. On the other hand, we have
$$S_n := \sum_{x,y \neq 0} p(y-x) \Big( \sqrt{u_n(x)(1-u_n(y))} - \sqrt{u_n(y)(1-u_n(x))} \Big)^2 = \sum_{x,y \neq 0} p(y-x)\, u_n(x)\, u_n(y)\, \big(h_n(x) - h_n(y)\big)^2 \le 4 \tilde{D}(h_n),$$
where
$$h_n(x) := \sqrt{\frac{1 - u_n(x)}{u_n(x)}}, \quad \text{and} \quad \tilde{D}(h_n) = \frac{1}{4} \sum_{x,y} p(y-x)\, \big(h_n(y) - h_n(x)\big)^2$$
is the Dirichlet form of $h_n$ for the random walk. Hence, the feasibility of this construction for all $\theta \in [0, 1]$ depends on the existence of a sequence of functions $\theta_n$ on $\mathbb{Z}^d$ with the following properties:

(i) $\theta_n$ are compactly supported and $0 \le \theta_n(x) \le 1$,
(ii) $\theta_n(x) \uparrow 1$, as $n \to \infty$ for all $x \in \mathbb{Z}^d$,
(iii) the Dirichlet form of $\theta_n$ for the random walk converges to 0, as $n \to \infty$.
If such a sequence exists we may take
$$h_n(x) := \theta_n(x) \sqrt{\frac{1-\theta}{\theta}} + (1 - \theta_n(x)) \sqrt{\frac{1-\rho}{\rho}}.$$
It is a standard result in random walks that these conditions are equivalent to recurrence: Lemma (3.1) in [SVY] shows that such a construction is impossible in the transient case, whereas for a recurrent walk the functions $\theta_n(x) := P^x(\tau_0 \le T_n)$, defined as the probability that the walk started from $x$ will visit 0 in its first $n$ jumps, will do. They obviously satisfy (i), (ii) and
$$\tilde{D}(\theta_n) = \frac{1}{2} \Big[ 1 - \sum_y p(y)\, \theta_n(y) - \sum_{y \neq 0} \theta_n(y)\, \big[ \theta_{n+1}(y) - \theta_n(y) \big] \Big] \le \frac{1}{2} \Big[ 1 - \sum_y p(y)\, \theta_n(y) \Big] \longrightarrow 0,$$
by monotone convergence.

In view of Lemma 2 and (22), if $d \ge 3$, then for any continuous function $f$:
$$\lim_{\gamma \to 0} \gamma^{-1} \lambda_\rho^*(\gamma f) = \langle f, \mu_\rho\rangle. \tag{29}$$
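For the one-dimensional nearest-neighbour walk, which is recurrent, the functions $\theta_n$ of the recurrent case can be computed exactly by the recursion $\theta_0 = \mathbf{1}_{\{0\}}$, $\theta_{n+1}(0) = 1$ and $\theta_{n+1}(x) = \sum_y p(y-x)\,\theta_n(y)$ for $x \neq 0$. The sketch below evaluates the upper bound $\frac{1}{2}\big(1 - \sum_y p(y)\,\theta_n(y)\big)$ on the Dirichlet form and shows it decreasing towards zero. The convention $\theta_n(0) = 1$, the choice of walk and the function names are assumptions of this sketch.

```python
from fractions import Fraction
from math import comb


def theta_functions(n_max):
    """theta_n(x) = P^x(walk visits 0 within its first n jumps), for n = 0..n_max."""
    size = n_max + 2   # theta_n vanishes for |x| > n, so this window is exact
    theta = {x: Fraction(int(x == 0)) for x in range(-size, size + 1)}
    history = [theta]
    for _ in range(n_max):
        new = {}
        for x in range(-size, size + 1):
            if x == 0:
                new[x] = Fraction(1)
            else:
                left = theta.get(x - 1, Fraction(0))
                right = theta.get(x + 1, Fraction(0))
                new[x] = Fraction(1, 2) * (left + right)
        history.append(new)
        theta = new
    return history


def dirichlet_bound(theta):
    """(1/2) * (1 - sum_y p(y) theta(y)) with p(+1) = p(-1) = 1/2."""
    return Fraction(1, 2) * (1 - Fraction(1, 2) * (theta[1] + theta[-1]))
```

The quantity $1 - \sum_y p(y)\,\theta_n(y)$ is the probability that the walk started at 0 does not return within $n + 1$ jumps, which for this walk equals $\binom{2m}{m} 4^{-m}$ when $n + 1 = 2m$; this classical formula gives an independent check of the recursion.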
Theorem 1 states that the rescaled velocity of the tracer particle is bounded between the infimum and the supremum of $\langle f_\alpha^\ell, \nu_\rho^\alpha\rangle$, where $\nu_\rho^\alpha$ is taken in $A_{\alpha,\rho}$. In order to show that the mobility of the particle can be well-defined we have to prove that $\langle \psi, \nu_\rho^\alpha\rangle$ is differentiable in $\alpha$ at zero, at least for a nice class of functions $\psi$. This will be our next step, and it requires the introduction of some notation.

Let $\Lambda$ be a finite subset in $\mathbb{Z}^d_*$, denote by $\mu_\rho^\Lambda$ the restriction of $\mu_\rho$ to configurations in $\Lambda$, and consider the canonical measures:
$$\nu_{\Lambda,K} := \mu_\rho^\Lambda\Big( \,\cdot\, \Big|\, \sum_{x \in \Lambda} \eta(x) = K \Big).$$
Let also $C_0$ denote the space of local functions $\psi$ whose expectation under any canonical measure $\nu_{\Lambda,K}$, such that $\Lambda$ contains the support of $\psi$, is zero. Examples of such functions are the $\psi_\ell$ defined in (1), as well as functions of the form $Lg$, where $g$ is a local function. $C_0$ is the class of functions for which we will prove the differentiability of $\langle \psi, \nu_\rho^\alpha\rangle$, and this will turn out to be sufficient for the purpose of checking the validity of the Einstein relation.

We begin with the simplest case, when $\psi = Lg$ for some local function $g$. In order to simplify the notation, let us drop $\rho$ and denote an arbitrary element of $A_{\alpha,\rho}$ by $\nu_\alpha$. Now,
$$\langle Lg, \nu_\alpha\rangle = \langle (L - L^\alpha) g, \nu_\alpha\rangle \quad (\nu_\alpha \in I_\alpha)$$
$$= \langle (L - L^\alpha) g, \mu_\rho\rangle + \big\langle (L - L^\alpha) g - E^{\mu_\rho}\big[ (L - L^\alpha) g \big], \nu_\alpha \big\rangle. \tag{30}$$
Note that
$$(L - L^\alpha)\, g(\eta) = \sum_z p(z)\,\big(1 - e^{\alpha \cdot z}\big)\,(1 - \eta(z))\,\big[ g(\tau_z \eta) - g(\eta) \big].$$
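The expectation under $\mu_\rho$ of $(L - L^\alpha) g$ can easily be computed with a change of variables $\eta \to \tau_z \eta$, that changes the measure $(1 - \eta(z))\, d\mu_\rho$ to $(1 - \eta(-z))\, d\mu_\rho$:
$$E^{\mu_\rho}\big[ (L - L^\alpha) g \big] = \sum_z p(z)\,\big(1 - e^{\alpha \cdot z}\big) \int g(\eta)\, \big[ (1 - \eta(-z)) - (1 - \eta(z)) \big]\, d\mu_\rho = 2 \langle g, \tilde{\psi}_\alpha\rangle_\rho, \tag{31}$$
where
$$\tilde{\psi}_\alpha(\eta) = \sum_z p(z)\, \sinh(\alpha \cdot z)\,(1 - \eta(z)).$$
The linear part of $\tilde{\psi}_\alpha$ is $\psi_\alpha$. Note also that reversibility implies
$$2 \langle g, \tilde{\psi}_\alpha\rangle_\rho = -2 \langle Lg, \tilde{\psi}_\alpha\rangle_{-1,\rho}. \tag{32}$$
As far as the last term in relation (30) is concerned, notice that it is of the form $R_\alpha(\eta) = \sum_z p(z)\, c_z(\alpha)\, w_z(\eta)$, where $c_z(\alpha) = 1 - e^{\alpha \cdot z}$ $(= O(|\alpha|))$ and $w_z$ are $\mu_\rho$-mean zero local functions. We have
$$\sup_{\nu_\alpha \in A_{\alpha,\rho}} \langle R_\alpha, \nu_\alpha\rangle \le \|h_\alpha\|_\infty + \lambda_\rho^*\Big( \sum_z p(z)\, c_z(\alpha)\, w_z(\eta) \Big)$$
$$= \|h_\alpha\|_\infty + \sup_{\|\phi\|_2 = 1} \Big\{ \sum_z p(z)\, c_z(\alpha)\, \langle w_z, \phi^2\rangle_\rho - D_\rho(\phi) \Big\}$$
$$\le \|h_\alpha\|_\infty + \sum_z p(z)\, \lambda_\rho^*\big( c_z(\alpha)\, w_z \big). \tag{33}$$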
The same bound can be obtained of course for $(-R_\alpha)$. Therefore, by (29), (31), (32) and (33) we get that if $g$ is a local function then,
$$\lim_{\alpha \to 0} \frac{1}{|\alpha|} \sup_{\nu_\alpha \in A_{\alpha,\rho}} \big| \langle Lg, \nu_\alpha\rangle + 2 \langle Lg, \psi_\alpha\rangle_{-1,\rho} \big| = 0. \tag{34}$$
We would like to generalize (34) so that it holds for all functions $\psi \in C_0$. This is done by approximating such a $\psi$ by a function in the range of the generator. Let us recall an approximation lemma for the tagged particle, proved in [LOVa]:

Lemma 3. If $\psi$ is a function in $C_0$, then for every $\varepsilon > 0$ there exist local functions $g_\varepsilon$, $r_\varepsilon$ such that
$$\psi = L g_\varepsilon + r_\varepsilon, \quad \text{and} \quad \|r_\varepsilon\|_{-1,0,\rho} < \varepsilon,$$
where $\|\cdot\|_{-1,0,\rho}$ is defined by
$$\|\psi\|_{-1,0,\rho}^2 = \sup_\phi \big\{ 2 \langle \psi, \phi\rangle_\rho - D_\rho^{ex}(\phi) \big\}.$$
The supremum above is taken over local functions and $D_\rho^{ex}$ is defined after (20).

In view of this lemma, we just need to control the approximation error: $\langle r_\varepsilon, \nu_\alpha\rangle$. Notice that $r_\varepsilon$, being the difference of two functions in $C_0$, is in $C_0$ itself. Functions in $C_0$ are in the range of the generator of the simple exclusion, restricted to a box that contains their support. Again, we can integrate by parts and estimate $\lambda_\rho^*(r_\varepsilon)$. This is the subject of the following standard integration by parts Lemma (cf. [KL], Lemma 7.2.1).
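Lemma 4. Let $h$ be a $C_0$-function, whose support is restricted to $\Lambda_h$ ($\Lambda_h \not\ni 0$). Then, there exists a family of cylinder functions $\{G_b ;\, b \text{ is a bond in } \Lambda_h\}$, measurable with respect to $C_{\Lambda_h}$, such that: if $b$ is the bond $\{x, y\}$ then $G_b(\eta^{xy}) = -G_b(\eta)$,
$$\langle h, \phi\rangle_\rho = \sum_{b \in \Lambda_h} \langle G_b, \nabla_b \phi\rangle_\rho \quad \forall \phi \in L^2(\mu_\rho), \quad \text{and}$$
$$\sum_{b \in \Lambda_h} \langle G_b, G_b\rangle_\rho \le c\, \|h\|_{-1,0,\rho}^2.$$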
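Here, if $b$ is the bond $\{x, y\}$, $\nabla_b \phi(\eta)$ is defined as $\phi(\eta^{xy}) - \phi(\eta)$. Applying this result to the function $r_\varepsilon$, supported on $\Lambda$ ($\Lambda \not\ni 0$), we get that for all $\phi$ with $\|\phi\|_2 = 1$:
$$\gamma \langle r_\varepsilon, \phi^2\rangle_\rho = \gamma \sum_{b \in \Lambda} \langle G_b, \nabla_b \phi^2\rangle_\rho \le \gamma \sum_{b \in \Lambda} \big\langle G_b^2, \big(\phi(\eta^{xy}) + \phi(\eta)\big)^2 \big\rangle_\rho^{1/2}\, \|\nabla_b \phi\|_{L^2(\mu_\rho)} \le 2\gamma \Big( \sum_{b \in \Lambda} \langle G_b^2, \phi^2\rangle_\rho \Big)^{1/2} D_\rho^{ex}(\phi)^{1/2},$$
where in the last step we used the first property of $G_b$ mentioned in Lemma 4. Thus,
$$\lambda_\rho^*(\gamma r_\varepsilon) = \sup_{\substack{\|\phi\|_2 = 1 \\ D_\rho(\phi) \le \gamma \|r_\varepsilon\|_\infty}} \big\{ \gamma \langle r_\varepsilon, \phi^2\rangle_\rho - D_\rho(\phi) \big\}$$
$$\le \sup_{\substack{\|\phi\|_2 = 1 \\ D_\rho(\phi) \le \gamma \|r_\varepsilon\|_\infty}} \Big\{ 2\gamma \Big( \sum_{b \in \Lambda} \langle G_b^2, \phi^2\rangle_\rho \Big)^{1/2} \big( D_\rho^{ex}(\phi) \big)^{1/2} - D_\rho(\phi) \Big\} \le \gamma^2 \sup_{\nu \in A_{\gamma \|r_\varepsilon\|_\infty, \rho}} \sum_{b \in \Lambda} \langle G_b^2, \nu\rangle.$$
For all $\nu_\alpha \in A_{\alpha,\rho}$ we have
$$\langle r_\varepsilon, \nu_\alpha\rangle \le \inf_{\gamma > 0} \Big\{ \frac{\|h_\alpha\|_\infty}{\gamma} + \frac{1}{\gamma} \lambda_\rho^*(\gamma r_\varepsilon) \Big\}.$$
We may now choose $\gamma = A\,|\alpha|$ ($A$ arbitrary), divide both sides by $|\alpha|$ and let $\alpha \to 0$. By Lemma 2 and Lemma 4 we get:
$$\limsup_{\alpha \to 0} \frac{1}{|\alpha|} \sup_{\nu_\alpha \in A_{\alpha,\rho}} |\langle r_\varepsilon, \nu_\alpha\rangle| \le \limsup_{\alpha \to 0} \frac{\|h_\alpha\|_\infty}{A\,|\alpha|^2} + c A\, \|r_\varepsilon\|_{-1,0,\rho}^2.$$
Minimizing over $A$, we end up with
$$\limsup_{\alpha \to 0} \frac{1}{|\alpha|} \sup_{\nu_\alpha \in A_{\alpha,\rho}} |\langle r_\varepsilon, \nu_\alpha\rangle| \le C\, \|r_\varepsilon\|_{-1,0,\rho}. \tag{35}$$
We can now put together (34), (35) and Lemma 3 to state the following: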
Lemma 5. For any function $\psi \in C_0$:
$$\lim_{\alpha \to 0} \frac{1}{|\alpha|} \sup_{\nu_\alpha \in A_{\alpha,\rho}} \big| \langle \psi, \nu_\alpha\rangle + 2 \langle \psi, \psi_\alpha\rangle_{-1,\rho} \big| = 0.$$

Let us finally turn to the proof of Theorem 3. We can write
$$f_\alpha^\ell(\eta) = (1-\rho) \sum_z p(z)\, \ell \cdot z\, \sinh(\alpha \cdot z) + \psi_\ell(\eta) + \sum_z p(z)\, \ell \cdot z\, \big(1 - e^{\alpha \cdot z}\big)\,(\eta(z) - \rho). \tag{36}$$
The first term is deterministic and will give us no trouble. The last term is of the form:
$$W_\alpha(\eta) = \sum_z p(z)\, c_z(\alpha)\,(\eta(z) - \rho),$$
where $|c_z(\alpha)| = O(|\alpha|)$. Using (21) to pass the gradient on the test function and Cauchy-Schwarz inequality we get that for every $\phi \in L^2(\mu_\rho)$, with $\|\phi\|_2 = 1$:
$$\langle W_\alpha, \phi^2\rangle_\rho \le C\, g(0,0)^{1/2} \Big( \sum_z p(z)\, c_z^2(\alpha) \Big)^{1/2} \big( D_\rho^{ex}(\phi) \big)^{1/2},$$
by (20). Therefore,
$$\lambda_\rho^*(W_\alpha) \le \frac{1}{4}\, C^2\, g(0,0) \sum_z p(z)\, c_z^2(\alpha),$$
and by the definition of the set $A_{\alpha,\rho}$:
$$\sup_{\nu_\alpha \in A_{\alpha,\rho}} |\langle W_\alpha, \nu_\alpha\rangle| \le C\, |\alpha|^2. \tag{37}$$
Lemma 5 and Eqs. (36) and (37) conclude the proof of the theorem.

Acknowledgement. I would like to thank the Courant Institute for the fellowship that supported me during my studies, and the Alexander Onassis Foundation for a subsidiary I received. I am thankful to my advisor Professor S.R.S. Varadhan for his continuous help and encouragement.
References

[FGL] Ferrari, P. A., Goldstein, S., Lebowitz, J. L.: Diffusion, mobility and the Einstein relation. In: Statistical Physics and Dynamical Systems. J. Fritz, A. Jaffe, D. Szász, eds, Boston: Birkhäuser, 1985, pp. 405–441
[KL] Kipnis, C., Landim, C.: Scaling limits of interacting particle systems. Grundlehren der mathematischen Wissenschaften, Berlin-Heidelberg-New York: Springer-Verlag, 1999
[KV] Kipnis, C., Varadhan, S. R. S.: Central limit theorem for additive functionals of reversible Markov processes and applications to simple exclusion. Commun. Math. Phys. 106, 1–19 (1986)
[Li] Liggett, T.: Interacting particle systems. Grundlehren der mathematischen Wissenschaften, New York: Springer-Verlag, 1985
[LOVo] Landim, C., Olla, S., Volchan, S. B.: Driven tracer particle in 1 dimensional symmetric simple exclusion. Commun. Math. Phys. 192 (2), 287–307 (1998)
[LOVa] Landim, C., Olla, S., Varadhan, S. R. S.: Finite dimensional approximation of the self-diffusion coefficient for the exclusion process. To appear in Ann. Prob. (2002)
[LR] Lebowitz, J. L., Rost, H.: The Einstein relation for the displacement of a test particle in a random environment. Stoch. Proc. Appl. 54, 183–196 (1994)
[LuY] Lu, S. L., Yau, H. T.: Spectral gap and logarithmic Sobolev inequality for Kawasaki and Glauber dynamics. Commun. Math. Phys. 156, 399–433 (1993)
[Sa] Saada, E.: A limit theorem for the position of a tagged particle in a simple exclusion process. Ann. Probab. 15, 375–381 (1987)
[SVY] Sethuraman, S., Yau, H. T., Varadhan, S. R. S.: Diffusive limit of a tagged particle in asymmetric simple exclusion processes. Comm. Pure Appl. Math. 53, 972–1006 (2000)
[Sp] Spitzer, F.: Principles of Random Walk. New York: D. Van Nostrand, 1964
[V] Varadhan, S. R. S.: Self diffusion of a tagged particle in equilibrium for asymmetric mean zero random walk with simple exclusion. Ann. Inst. H. Poincaré, Probabilités 31, 273–285 (1995)
Communicated by H.-T. Yau
Commun. Math. Phys. 229, 369–374 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0697-0
β-Boundedness, Semipassivity, and the KMS-Condition

Bernd Kuckert∗

Korteweg-de Vries Instituut voor Wiskunde, Amsterdam, The Netherlands

Received: 17 January 2002 / Accepted: 16 April 2002
Published online: 6 August 2002 – © Springer-Verlag 2002
Abstract: The proof of a recent result by Guido and Longo establishing the equivalence of the KMS-condition with complete β-boundedness [2] is shortcut and generalized in such a way that a covariant version of the theorem is obtained.

Recently it was proved by Guido and Longo in [2] that the KMS-condition at a finite nonnegative inverse temperature β is equivalent to a condition called complete β-boundedness. This condition imposes a bound on the number of degrees of freedom in certain phase space regions and is a weak form of the Buchholz-Wichmann nuclearity condition [1], very similar to the (weaker) Haag-Swieca compactness criterion [3]. On the other hand, it was shown in [4] that the KMS-condition at nonnegative temperature in some (a priori unknown) inertial frame is equivalent to a condition called complete semipassivity. Both proofs are variations of the classical result by Pusz and Woronowicz [5], who proved that the KMS-condition at nonnegative temperature is equivalent to complete passivity.

It is of interest to investigate bounds on the efficiency of thermodynamic cycles in less generic settings than that of a stationary and homogeneous state, as the extent to which the passivity condition is violated can be considered as a kind of distance of a nonequilibrium state from thermodynamic equilibrium. As a first step on this path, the result found by Guido and Longo is generalized below in such a way that it characterizes semipassive states as well. As a spinoff of this result, a shortcut of the Guido-Longo argument is given first.

In what follows, the algebra of observables of the system under consideration is a von Neumann algebra $\mathcal{M}$ on a Hilbert space $\mathcal{H}$, and the state ω of $\mathcal{M}$ under consideration is induced by a cyclic unit vector $\Omega$. The time evolution is generated by a selfadjoint operator $H$ with the property that $e^{itH}\mathcal{M}e^{-itH} = \mathcal{M}$ and $H\Omega = 0$.

∗ Current address: II. Institut für Theoretische Physik, Luruper Chaussee 149, 22761 Hamburg, Germany. E-mail: [email protected]
We fix a parameter $\beta \ge 0$ that will, eventually, estimate the inverse temperature of ω.

Definition 1. The state ω is called β-bounded with bound 1 if the linear space $\mathcal{M}\Omega$ is a subspace of the domain of $e^{-\beta H}$ and if the set $e^{-\beta H}\mathcal{M}_1\Omega$ consists of vectors with lengths $\le 1$, where $\mathcal{M}_1 := \{A \in \mathcal{M} : \|A\| \le 1\}$. ω is called completely β-bounded if for each $n \in \mathbb{N}$, the state $\omega^{\otimes n}$ on the algebra $\mathcal{M} \otimes \cdots \otimes \mathcal{M}$ is β-bounded with bound 1.

Theorem 2 (Guido, Longo). ω is completely β-bounded if and only if it is a ground state or a KMS-state at an inverse temperature $\ge 2\beta$.

Proof. The condition is sufficient by Lemma 1.2 in [2]. It remains to prove that it is necessary. If $E$ denotes the orthogonal projection onto the closure of the subspace $h := \mathcal{M}'\Omega$ and if $\Delta$ denotes the modular operator of $\Omega$ with respect to the von Neumann algebra $\mathcal{N}$ obtained by restricting the elements of $\mathcal{M}$ to $h$ (note that $\Omega$ is not only cyclic, but also separating with respect to this von Neumann algebra on $h$), then β-boundedness of ω implies, by Cor. 1.8 in [2], that $e^{-2\beta H} \le 1 + \Delta E$.

If ω is faithful, then this implies that each point $(\eta, \kappa)$ in the joint spectrum $\sigma_{H,K}$ of the operators $H$ and $K := \log(\Delta)$ satisfies
$$e^{-2\beta\eta} \le 1 + e^{\kappa}, \tag{1}$$
and as ω is completely β-bounded, it follows by the arguments used in [5] that this inequality must hold for the elements of the additive group generated by the elements of $\sigma_{H,K}$ as well. Inequality (1) can be rewritten as
$$\eta \ge -\frac{1}{2\beta}\log(1 + e^{\kappa}) =: f(\kappa),$$
and this estimate separates $\sigma_{H,K}$ from a region containing an open cone. Each subgroup in the admitted region must, therefore, be a subgroup of a one-dimensional subspace, and in particular, $\sigma_{H,K}$ and the additive subgroup it generates must be subsets of such a one-dimensional space $X$. As $f(\kappa)$ is defined everywhere, this subspace cannot be the η-axis, so $H = -\alpha K$ for some $\alpha \in \mathbb{R}$. Since $f(\kappa)$ tends to zero as $\kappa \to -\infty$, α must be nonnegative, and since $f(\kappa)$ is asymptotic to $-\frac{\kappa}{2\beta}$ as $\kappa \to +\infty$, it follows that $\alpha \le 1/(2\beta)$, so ω is a KMS-state at an inverse temperature $\ge 2\beta$ if $\alpha \ne 0$, and if $\alpha = 0$, then ω trivially is a ground state of $H = 0$.

It remains to consider the case that ω is not faithful. Then the elements of the space $h^\perp = (1 - E)\mathcal{H}$ are eigenvectors of $\Delta E$ with the eigenvalue zero. As $\mathcal{M}$ is invariant under the adjoint action of $e^{itH}$, the space $h$ is invariant under $e^{itH}$, so $H|_{h^\perp}$ is a selfadjoint operator in $h^\perp$, whose spectral projections are restrictions of the corresponding spectral projections of $H$. But this implies that $\sigma_{H,\Delta E}$ contains some point of the form $(\eta, 0)$. Now let $(\eta', \delta')$ be an arbitrary point in $\sigma_{H,\Delta E}$. As ω satisfies complete β-boundedness, each point of the form $(n\eta' + \eta,\ n\delta' \cdot 0)$, $n \in \mathbb{N}$, must satisfy Eq. (1) as well (cf. [5, 4]), so $2\beta(n\eta' + \eta) \ge 0$ for all $n \in \mathbb{N}$, which can be only if $\eta' \ge 0$. So we have proved that $\eta' \ge 0$ for all $(\eta', \delta') \in \sigma_{H,\Delta E}$, which implies $H \ge 0$ (cf. Lemma B.2 in [4]).
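The geometric step in the faithful case can be explored numerically. The following is a minimal sketch, not part of the original argument: the function names `f` and `line_is_admitted`, the choice β = 1, and the sample grid are illustrative assumptions. It evaluates the bound f(κ) = −log(1 + e^κ)/(2β) from Inequality (1) and tests, on finitely many points, whether the line η = −ακ stays in the admitted region. A finite sample can only suggest, not prove, the condition 0 ≤ α ≤ 1/(2β).

```python
import math

def f(kappa, beta):
    """Lower bound f(kappa) = -log(1 + e^kappa) / (2*beta); assumes beta > 0."""
    # Numerically stable evaluation of log(1 + e^kappa).
    softplus = max(kappa, 0.0) + math.log1p(math.exp(-abs(kappa)))
    return -softplus / (2.0 * beta)

def line_is_admitted(alpha, beta, kappas):
    """True if eta = -alpha*kappa satisfies eta >= f(kappa) at every sample point."""
    return all(-alpha * k >= f(k, beta) for k in kappas)

beta = 1.0                       # illustrative value
kappas = list(range(-50, 51))    # symmetric sample, mimicking the group structure

inside = line_is_admitted(0.25, beta, kappas)      # 0 <= alpha <= 1/(2*beta)
too_steep = line_is_admitted(0.6, beta, kappas)    # alpha > 1/(2*beta)
wrong_sign = line_is_admitted(-0.1, beta, kappas)  # alpha < 0
```

In this sketch only the first slope passes the test; the other two violate the bound at large positive and large negative κ, respectively, which mirrors the two asymptotic regimes of f used in the proof.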
The above proof not only provides a more direct argument than the original proof, it also admits a generalization of the theorem in the spirit of Thm. 3.3 in [4]. To this end, assume that $H$ generates, together with $s$ self-adjoint operators $P_1, \dots, P_s$ collected in a vector operator $\mathbf{P}$, a strongly continuous representation of the $(1+s)$-dimensional spacetime translation group that leaves the vector $\Omega$ invariant. ω is stationary in all inertial frames, whereas in the presence of matter, it can be a thermodynamic equilibrium state in that matter's rest frame only. The reason is that the condition of passivity, which is a consequence of the Second Law, is violated in the other frames. The appropriate weakening of the passivity condition is semipassivity. In the following definition, $U_1(\mathcal{M})$ denotes the group of all unitary operators in $\mathcal{M}$ that can be connected to the unit operator by a norm-continuous path of unitary operators in $\mathcal{M}$. Furthermore, we define
$$|\mathbf{P}| := \sqrt{P_1^2 + \dots + P_s^2}.$$

Definition 3. ω is called semipassive if there exists a constant $\mathcal{E} \ge 0$ such that
$$-\langle W\Omega, HW\Omega\rangle \le \mathcal{E}\,\langle W\Omega, |\mathbf{P}|W\Omega\rangle$$
for each $W \in U_1(\mathcal{M})$ with $[H, W] \in \mathcal{M}$ and $[P_1, W], \dots, [P_s, W] \in \mathcal{M}$.¹ Each constant $\mathcal{E}$ satisfying this condition is called an efficiency bound of ω. ω is called completely semipassive if all its tensorial powers are semipassive with respect to the same efficiency bound.

Denoting $u\mathbf{P} := u_1 P_1 + \dots + u_s P_s$, we now recall Thm. 3.3 from [4]:

Theorem 4. ω is completely semipassive with efficiency bound $\mathcal{E} \ge 0$ if and only if there exists a $u \in \mathbb{R}^s$ with $|u| \le \mathcal{E}$ such that ω is a ground state or a KMS-state (at finite $\beta \ge 0$) with respect to $H + u\mathbf{P}$.

This result describes a most generic example of a nonequilibrium state, and the question is whether bounds on the power of a cyclic process could be of interest in less generic situations. As far as such investigations are concerned, it is an obstacle of the above definition of semipassivity that the invariance of ω is part of the definition and its motivation. While the problem addressed in Thm. 4 is nontrivial only if ω is invariant under all spacetime translations, it would be of interest whether the semipassivity condition can be subdivided into this invariance property plus some additional condition that may be meaningful in other situations as well. Such a condition is semi-β-boundedness.

Definition 5. The state ω is called semi-β-bounded if there exists a damping factor $\mathcal{E} \ge 0$ such that the linear space $\mathcal{M}\Omega$ is a subspace of the domain of $e^{-\beta(H + \mathcal{E}|\mathbf{P}|)}$ and the set $e^{-\beta(H + \mathcal{E}|\mathbf{P}|)}\mathcal{M}_1\Omega$ consists of vectors with length $\le 1$. It is called completely semi-β-bounded if for each $n \in \mathbb{N}$, the state $\omega^{\otimes n}$ on the algebra $\mathcal{M} \otimes \cdots \otimes \mathcal{M}$ is semi-β-bounded with respect to one fixed damping factor $\mathcal{E} \ge 0$.

As $|\mathbf{P}|$ is a positive operator, the operator $e^{-\beta\mathcal{E}|\mathbf{P}|}$ is bounded and provides an additional damping term, so β-boundedness implies semi-β-boundedness for all $\mathcal{E} \ge 0$, i.e., semi-β-boundedness is the weaker assumption.

¹ As a quadratic form, the commutator of $H$ and $W$ is defined on the domain of $H$. The condition $[H, W] \in \mathcal{M}$ means that this quadratic form is bounded and that its associated bounded operator is an element of $\mathcal{M}$. $[P_1, W], \dots, [P_s, W] \in \mathcal{M}$ is to be read accordingly.
Theorem 6. A stationary and homogeneous state ω is completely semi-β-bounded if and only if there exists a $u \in \mathbb{R}^s$ with $|u| \le \mathcal{E}$ such that ω is a ground state or a KMS-state at an inverse temperature $\ge 2\beta$ with respect to $H + u\mathbf{P}$.

Proof. As in the proof of Theorem 2, the only nontrivial part is the proof that the condition is necessary. One checks that the proof of Cor. 1.8 in [2] still works if one replaces $H$ by $H + \mathcal{E}|\mathbf{P}|$, and one obtains
$$e^{-2\beta(H + \mathcal{E}|\mathbf{P}|)} \le 1 + \Delta E, \tag{2}$$
where $\Delta$ and $E$ are defined as in the proof of Theorem 2. Again, we distinguish the cases that ω is faithful and not faithful:

Lemma 7. If ω is faithful, then there exists a $u \in \mathbb{R}^s$ such that either (i) $H + u\mathbf{P} = 0$, or (ii) ω is a KMS-state at an inverse temperature $\ge 2\beta$ with respect to $H + u\mathbf{P}$.

Proof. Inequality (2) implies that for each $(\eta, k, \kappa) \in \sigma_{H,\mathbf{P},K}$, one has
$$\eta + \mathcal{E}|k| \ge -\frac{1}{2\beta}\ln(1 + e^{\kappa}), \tag{3}$$
which expels the joint spectrum of $H$, $\mathbf{P}$, and $K$ from a region containing an open cone. As in [5] it follows that $\sigma_{H,\mathbf{P},K}$ is a subset of a subspace $X$ of $\mathbb{R}^{s+2}$ with codimension $\ge 1$ whose elements satisfy Ineq. (3).

If $H$ is a linear function of $\mathbf{P}$, then there exists a $u \in \mathbb{R}^s$ such that $H + u\mathbf{P} = 0$. Inserting this into Ineq. (3), one finds that
$$-uk + \mathcal{E}|k| \ge -\frac{1}{2\beta}\ln(1 + e^{\kappa})$$
whenever the pair $(k, \kappa)$ is in the joint spectrum of $\mathbf{P}$ and $K$, and by complete semi-β-boundedness, one also obtains
$$-nuk + n\mathcal{E}|k| \ge -\frac{1}{2\beta}\ln(1 + e^{n\kappa}) \tag{4}$$
for all $n \in \mathbb{N}$. If $J$ denotes the modular conjugation associated with $\mathcal{M}$ and $\Omega$, then $JHJ = -H$, $J\mathbf{P}J = -\mathbf{P}$, and $JKJ = -K$ by elementary Tomita-Takesaki theory, so Ineq. (4) holds for all $n \in \mathbb{Z}$. But this immediately entails $|u| \le \mathcal{E}$, proving Alternative (i) of the statement.

There remains the case that $H$ is not a linear function of $\mathbf{P}$. As $X$ cannot contain the κ-axis by Ineq. (3), $K$ is a linear function of $H$ and $\mathbf{P}$, so there exist an $\alpha \in \mathbb{R}$ and a $v \in \mathbb{R}^s$ such that
$$K = -\alpha H + v\mathbf{P}. \tag{5}$$
The vector $v$ is unique up to a component that is perpendicular to the smallest linear subspace $Y$ of $\mathbb{R}^s$ containing the joint spectrum of the components of $\mathbf{P}$, so $v$ can be chosen in $Y$. If $v\mathbf{P} = 0$, then $K = -\alpha H$, and Ineq. (3) implies that $\alpha \ge 2\beta$, so $H$ generates a KMS-dynamics at an inverse temperature $\ge 2\beta$. In the remaining case that $v\mathbf{P} \ne 0$, the unit vector $e_v := |v|^{-1}v$ is in $Y$.
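If $\alpha \ne 0$, then Eq. (5) and the assumption that $H$ is not a linear function of $\mathbf{P}$ entail $K \ne 0$, so for each $\kappa > 0$ and each $\lambda > 0$, one has $(\eta(-\lambda, \kappa), -\lambda e_v, \kappa) \in X$, where
$$\eta(\lambda, \kappa) := -\frac{1}{\alpha}(\kappa - \lambda e_v v) = -\frac{1}{\alpha}(\kappa - \lambda|v|).$$
Inequality (3) yields
$$-\frac{\kappa}{\alpha} - \lambda\left(\frac{|v|}{\alpha} - \mathcal{E}\right) \ge -\frac{1}{2\beta}\ln(1 + e^{\kappa})$$
for all $\kappa, \lambda > 0$, so $\alpha \ge 2\beta$ and $\frac{|v|}{\alpha} \le \mathcal{E}$, and defining $u := -\frac{v}{\alpha}$, so that $K = -\alpha(H + u\mathbf{P})$, one finds that $H + u\mathbf{P}$ generates a KMS-dynamics at an inverse temperature $\ge 2\beta$, and one arrives at Alternative (ii) of the statement.

It remains to consider the case that $\alpha = 0$, i.e., that $K = v\mathbf{P}$. As $H$ is not a linear function of $\mathbf{P}$, while $K = v\mathbf{P}$, $H$ cannot be a linear function of $K$ and $\mathbf{P}$, so $X$ must contain the η-axis. But if $(\eta, k, \kappa) \in \sigma_{H,\mathbf{P},K}$, then so is $(\eta', k, \kappa)$ for all $\eta' \in \mathbb{R}$, and Ineq. (3) implies
$$\eta' + \mathcal{E}|k| \ge -\frac{1}{2\beta}\ln(1 + e^{vk})$$
for all $\eta' \in \mathbb{R}$, which cannot be.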
Lemma 8. If ω is not faithful, then there exists a $u \in \mathbb{R}^s$ with $|u| \le \mathcal{E}$ such that ω is a ground state with respect to $H + u\mathbf{P}$.

Proof. The elements of the space $h^\perp = (1 - E)\mathcal{H}$ are eigenvectors of $\Delta E$ with the eigenvalue zero. As $\mathcal{M}$ is invariant under the adjoint action of $e^{itH}$, the space $h$ is invariant under $e^{itH}$ and $e^{itP_1}, \dots, e^{itP_s}$, so $H|_{h^\perp}$ and $P_1|_{h^\perp}, \dots, P_s|_{h^\perp}$ are self-adjoint operators in $h^\perp$, whose spectral projections are restrictions of the corresponding spectral projections of $H$ and $\mathbf{P}$, respectively. But this implies that $\sigma_{H,\mathbf{P},\Delta E}$ contains some point of the form $(\eta, k, 0)$.

Now let $(\eta', k', \delta')$ be an arbitrary point in $\sigma_{H,\mathbf{P},\Delta E}$. As ω satisfies complete semi-β-boundedness, each point of the form $(\eta + n\eta',\ k + nk',\ 0 \cdot n\delta')$, $n \in \mathbb{N}$, must satisfy Ineq. (3) as well, so $\eta + n\eta' + \mathcal{E}|k + nk'| \ge 0$ for all $n \in \mathbb{N}$, which can be only if $\eta' \ge -\mathcal{E}|k'|$. So we have proved that
$$\eta' \ge -\mathcal{E}|k'| \tag{6}$$
for all $(\eta', k', \delta') \in \sigma_{H,\mathbf{P},\Delta E}$ and, hence, for all $(\eta', k') \in \sigma_{H,\mathbf{P}}$ (use, e.g., Lemma B.2 in [4]). By complete semi-β-boundedness, the corresponding estimate should hold for all tensorial powers of ω as well, which implies that the joint spectrum of $H$ and $\mathbf{P}$ is a subset of a sub-semigroup of $\mathbb{R}^{1+s}$ all of whose elements satisfy Ineq. (6). Such a semigroup must be a subset of a half space all of whose elements satisfy Ineq. (6). This implies that there exists a $u \in \mathbb{R}^s$ with $|u| \le \mathcal{E}$ such that $H + u\mathbf{P}$ is a positive operator (cf. Lemma B.2 in [4]).

With these two lemmas, the proof of Thm. 6 is complete as well.
Acknowledgements. The author thanks D. Arlt and D. Guido for critically reading the manuscript. This work has been supported by the Stichting Fundamenteel Onderzoek der Materie (FOM). It has been written at the Korteweg-de Vries Institut of Mathematics in Amsterdam.
References

1. Buchholz, D., Wichmann, E.H.: Causal independence and the energy-level density of states in local quantum field theory. Commun. Math. Phys. 106, 321–344 (1986)
2. Guido, D., Longo, R.: Natural energy bounds in quantum thermodynamics. Commun. Math. Phys. 218, 513–536 (2001)
3. Haag, R., Swieca, J.A.: When does a quantum field theory describe particles. Commun. Math. Phys. 1, 308–320 (1965)
4. Kuckert, B.: Covariant thermodynamics of quantum systems: passivity, semipassivity, and the Unruh effect. Preprint hep-th/0107236, to appear in Ann. Phys. (N. Y.)
5. Pusz, W., Woronowicz, S.L.: Passive states and KMS states for general quantum systems. Commun. Math. Phys. 58, 273–290 (1978)

Communicated by H. Araki
Commun. Math. Phys. 229, 375–395 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0693-4
Upper Bounds on Coarsening Rates

Robert V. Kohn1, Felix Otto2

1 Courant Institute, New York University, 251 Mercer Street, New York, NY 10012, USA. E-mail: [email protected]
2 Institut für Angewandte Mathematik, Universität Bonn, Wegelerstr. 10, 53115 Bonn, Germany. E-mail: [email protected]

Received: 20 September 2001 / Accepted: 5 February 2002
Published online: 12 August 2002 – © Springer-Verlag 2002
Abstract: We consider two standard models of surface-energy-driven coarsening: a constant-mobility Cahn-Hilliard equation, whose large-time behavior corresponds to Mullins-Sekerka dynamics; and a degenerate-mobility Cahn-Hilliard equation, whose large-time behavior corresponds to motion by surface diffusion. Arguments based on scaling suggest that the typical length scale should behave as $\ell(t) \sim t^{1/3}$ in the first case and $\ell(t) \sim t^{1/4}$ in the second. We prove a weak, one-sided version of this assertion – showing, roughly speaking, that no solution can coarsen faster than the expected rate. Our result constrains the behavior in a time-averaged sense rather than pointwise in time, and it constrains not the physical length scale but rather the perimeter per unit volume. The argument is simple and robust, combining the basic dissipation relations with an interpolation inequality and an ODE argument.

1. Introduction

We prove rigorous upper bounds on the coarsening rates for two standard models of surface-energy-driven interfacial dynamics. The sharp-interface versions of these models are the Mullins-Sekerka law (MS) and motion by surface diffusion (SD). Both evolutions preserve volume and decrease surface energy. The difference between them lies in the mechanism of rearrangement: MS corresponds to diffusion through the bulk, while SD corresponds to diffusion along the interfacial layer.

We prefer to work with diffuse-interface versions of these models. Therefore rather than analyze the sharp-interface MS and SD laws, we shall consider two Cahn-Hilliard equations – one with constant mobility, the other with degenerate mobility – whose large-time regimes are described by MS and SD respectively. We prefer the Cahn-Hilliard models because they make sense even when the geometry becomes singular, for example due to a topological transition such as pinch-off. The Cahn-Hilliard viewpoint is also attractive because it can be derived from a stochastic Ising model, and because it provides a unified description of spinodal decomposition and coarsening.
Our focus is the large-time coarsening behavior, i.e. the growth of the characteristic length scale $\ell(t)$ as $t \to \infty$. The expected behavior is
$$\ell(t) \sim t^{1/3} \ \text{for Mullins-Sekerka}, \qquad \ell(t) \sim t^{1/4} \ \text{for surface diffusion}. \tag{1}$$
To explain why, recall that both MS and SD are scale-invariant: solutions of MS are preserved by $x \to \lambda x$, $t \to \lambda^3 t$, while those of SD are preserved by $x \to \lambda x$, $t \to \lambda^4 t$. Thus if there is any universal law for $\ell$ it must be given by (1). In truth, much more than (1) is conjectured: solutions with random initial data are believed to be statistically self-similar. Such behavior has been confirmed by numerical and physical experiments, but we know no rigorous results in this direction.

The proposed coarsening law (1) can be decomposed into two rather different assertions: (a) an upper bound for $\ell(t)$, saying that microstructure cannot coarsen faster than the similarity rate; and (b) a lower bound for $\ell(t)$, saying that microstructure must coarsen at least at the similarity rate. Assertion (b) is subtle: it may be true generically, or with probability one – but viewed as a universal statement it is clearly false, since there are configurations that do not coarsen at all (e.g. parallel planar layers). We have nothing new to say about it. Assertion (a) is however different and easier, because it should be true universally. Therefore it can be approached using deterministic methods. That is the goal of the present paper.

Our main achievement is a (very) weak version of (a). It constrains the behavior in a time-averaged sense rather than pointwise in time, and it constrains not the physical length scale but rather the surface energy per unit volume.

Our approach is relatively simple and robust. We outline it here using the language of the sharp-interface models, though the proofs presented later are for the Cahn-Hilliard equations. The argument makes use of the interfacial energy density

E(t) = interfacial area per unit volume,

which has the dimensions of 1/length, and the physical scale

L(t) = a suitable negative norm of the order parameter,

which has dimensions of length. They are related by a sort of interpolation inequality – a basic fact of analysis, having nothing to do with the dynamics – which says
$$E L \ge C \tag{2}$$
for some positive universal constant $C$. Of course the interfacial area decreases, in other words
$$\dot E \le 0, \tag{3}$$
since the motion is surface-energy-driven. In addition – this is the heart of the matter – we also have differential inequalities
$$(\dot L)^2 \le C\,(-\dot E) \ \text{for MS}, \qquad (\dot L)^2 \le C\,E\,(-\dot E) \ \text{for SD} \tag{4}$$
as consequences of the basic energy-dissipating structure of the dynamics. Our upper bound on the time-averaged coarsening rate follows from these relations by an elementary ODE argument. The main conclusion is
$$\frac{1}{T}\int_0^T E^2\,dt \ge C\,\frac{1}{T}\int_0^T \left(t^{-1/3}\right)^2 dt \ \text{for MS}, \qquad \frac{1}{T}\int_0^T E^3\,dt \ge C\,\frac{1}{T}\int_0^T \left(t^{-1/4}\right)^3 dt \ \text{for SD}$$
for $T \gg 1$. This is a time-averaged version of the (unproved) pointwise statement
$$E^{-1} \le C t^{1/3} \ \text{for MS}, \qquad E^{-1} \le C t^{1/4} \ \text{for SD},$$
which is in turn a one-sided version of (1) with $\ell = E^{-1}$.

We emphasize that bounding the coarsening rate from above is quite different from bounding it from below. Our upper bound is a matter of kinematics, while a lower bound would be a matter of geometry. Indeed, a system cannot coarsen too quickly, no matter how large its curvature, due to the kinematic restrictions (2)–(4); it can however coarsen slowly if its curvature is small. The situation is roughly analogous to the blowup of semilinear heat equations, where local-in-time existence theory gives a lower bound on the blowup rate but faster blowup is possible, see e.g. [16, 19]. Another analogy is to diffusion-enhanced convection of active scalars, where kinematic considerations lead to upper but not lower bounds for the effective diffusivity, see e.g. [8].

We need a scheme for spatial averaging, to define the quantities $E$ and $L$, and to prove the fundamental relations (2)–(4). Our choice is to consider solutions that are spatially periodic. This does not significantly compromise the physics, since the size of the period cell and the complexity of the initial data are unrestricted. The constants in our estimates are of course independent of the period cell.

We shall focus on the case of a “critical mixture,” i.e. the two phases are assumed to have equal volume fractions. This simplifies the notation somewhat, and it is physically natural when the mixture originates from spinodal decomposition. The restriction of equal volume fractions is, however, merely a convenience, not a mathematical necessity. Similar results hold, with similar proofs, at any volume fraction.

Our rigorous analysis is restricted to the diffuse-interface (Cahn-Hilliard) setting. However, a similar analysis can be given for “reasonable” solutions of the sharp-interface evolution laws – for example, solutions which are classical at all but finitely many times, and continuous across the singular times.

The paper is organized as follows. Section 2 provides physical and mathematical background concerning the Cahn-Hilliard and sharp-interface models. Section 3 states our rigorous results on the coarsening rate, and Sect. 4 presents the proofs. Section 5 concludes with a brief discussion.

2. Background

We have been discussing four evolutions: the sharp-interface MS and SD laws, and the diffuse-interface Cahn-Hilliard equations associated with them. There is, however, a natural unity to the story: all four evolutions arise as limits of a single equation, with a clear link to stochastic Ising models. Sections 2.1 and 2.2 present this unifying viewpoint, and explain how it leads to our two Cahn-Hilliard models – with constant vs. degenerate mobility – in the shallow-quench vs. deep-quench regimes. Section 2.3 discusses
the large-time behavior of these Cahn-Hilliard models, explaining their connection with the sharp-interface MS and SD laws. Finally Sect. 2.4 discusses the scale-invariance of the sharp-interface laws, and the associated conjectures about their coarsening behavior. None of this material is strictly necessary to understand our rigorous analysis: the impatient reader can skip straight to Sect. 3.
2.1. A unifying Cahn-Hilliard model: Variable quench. Our starting point is the following Cahn-Hilliard-type model. The free energy is given by
$$E = \fint \frac{\beta}{2}\left(|\nabla m|^2 + (1 - m^2)\right) + \frac{1}{2}\left((1+m)\log(1+m) + (1-m)\log(1-m)\right) dx, \tag{5}$$
where $m \in (-1, 1)$. Here $c = \frac{1}{2}(1 + m) \in (0, 1)$ stands for the relative concentration of, say, the first species. We have normalized the total free energy by the volume of the system, denoting the average by $\fint$. The first term in (5) is of enthalpic, the second term of entropic origin; β is the inverse temperature. The relative concentration evolves to reduce $E$ while preserving the volume of each phase:
$$\frac{\partial m}{\partial t} - \nabla \cdot \left((1 - m^2)\,\nabla \frac{\partial E}{\partial m}\right) = 0, \tag{6}$$
which leads to the equation
$$\frac{\partial m}{\partial t} - \nabla^2 m + \beta\,\nabla \cdot \left((1 - m^2)\,\nabla(m + \nabla^2 m)\right) = 0. \tag{7}$$
We will be interested in the case of a “critical mixture”, in other words one with
$$\fint m\,dx = 0. \tag{8}$$
This model (7) is a natural starting point, because it has a firm microscopic foundation: it is a local version of the macroscopic limit of an Ising model with long-range Kac potential and Kawasaki dynamics; see [15] or the review article [14, Theorem 6.1]. In particular, the specific form of the mobility $1 - m^2$ in (6), which vanishes at the two extreme values $m = \pm 1$, is natural.

It is well-known and easy to verify that for $\beta > 1$, (5) has two bulk equilibrium values, $m_+ \in (0, 1)$ and $m_- = -m_+$. They behave as
$$m_+ \approx \begin{cases} (3(\beta - 1))^{1/2} & \text{for } 0 < \beta - 1 \ll 1, \\ 1 - 2\exp(-2\beta) & \text{for } \beta \gg 1. \end{cases}$$
Hence one is led to consider two regimes, the “shallow quench” $0 < \beta - 1 \ll 1$ and the “deep quench” $\beta \gg 1$.
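The two asymptotic formulas for the bulk equilibrium value can be compared with a direct numerical solution. The sketch below rests on a short computation that is not spelled out in the text: differentiating the bulk part of (5), $\frac{\beta}{2}(1-m^2) + \frac{1}{2}((1+m)\log(1+m) + (1-m)\log(1-m))$, with respect to $m$ gives $-\beta m + \operatorname{artanh}(m)$, so the equilibria solve $m = \tanh(\beta m)$. The function names and the bisection bracket are illustrative assumptions.

```python
import math

def bulk_equilibrium(beta, lo=1e-3, hi=1.0, iterations=200):
    """Positive root of tanh(beta*m) = m by bisection.

    Assumes beta > 1 and that the root lies in (lo, hi); the default
    bracket fails for beta extremely close to 1.
    """
    g = lambda m: math.tanh(beta * m) - m
    if not (g(lo) > 0.0 > g(hi)):
        raise ValueError("bracket does not enclose the positive root")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def shallow_quench_estimate(beta):
    """Leading-order value (3(beta-1))^(1/2) for 0 < beta - 1 << 1."""
    return math.sqrt(3.0 * (beta - 1.0))

def deep_quench_estimate(beta):
    """Leading-order value 1 - 2 exp(-2 beta) for beta >> 1."""
    return 1.0 - 2.0 * math.exp(-2.0 * beta)

m_shallow = bulk_equilibrium(1.01)  # shallow quench sample
m_deep = bulk_equilibrium(5.0)      # deep quench sample
```

For these two sample values of β the numerical roots agree with the respective estimates to leading order; the shallow-quench formula is only accurate up to relative corrections of order β − 1.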
2.2. Shallow and deep quench regimes: Constant vs. degenerate mobility. In the shallow quench regime, it is natural to rescale time, space, concentration and energy according to
$$t = \left(\frac{2}{\beta - 1}\right)^{2} \hat t, \quad x = \left(\frac{2}{\beta - 1}\right)^{1/2} \hat x, \quad m = (3(\beta - 1))^{1/2}\,\hat m, \quad E = \frac{3}{2}(\beta - 1)^2\,\hat E + \text{const}.$$
As $\beta \to 1$ the bulk equilibrium values become $\hat m_\pm = \pm 1$ and Eqs. (5) and (6) become (formally, to leading order)
$$\hat E = \fint \frac{1}{2}\left(|\hat\nabla \hat m|^2 + (1 - \hat m^2)^2\right) d\hat x \tag{9}$$
and
$$\frac{\partial \hat m}{\partial \hat t} - \hat\nabla^2 \frac{\partial \hat E}{\partial \hat m} = 0,$$
yielding the Cahn-Hilliard equation with constant mobility
$$\frac{\partial \hat m}{\partial \hat t} + \hat\nabla^2\left(\hat\nabla^2 \hat m + 2(1 - \hat m^2)\,\hat m\right) = 0. \tag{10}$$
This is the Cahn-Hilliard equation associated with MS dynamics, as we shall explain presently.

The deep quench regime is even more obvious: One rescales time and energy according to
$$t = \frac{1}{\beta}\,\hat t, \qquad E = \beta\,\hat E + \text{const},$$
and obtains formally from (5) and (6) to leading order
$$\hat E = \fint \frac{1}{2}\left(|\nabla m|^2 + (1 - m^2)\right) dx$$
resp.
$$\frac{\partial m}{\partial \hat t} - \nabla \cdot \left((1 - m^2)\,\nabla \frac{\partial \hat E}{\partial m}\right) = 0,$$
yielding the Cahn-Hilliard equation with degenerate mobility
$$\frac{\partial m}{\partial \hat t} + \nabla \cdot \left((1 - m^2)\,\nabla\left(\nabla^2 m + m\right)\right) = 0. \tag{11}$$
This is the Cahn-Hilliard equation associated with SD dynamics, as we shall explain below. The preceding argument, deriving (11) as the deep-quench limit of (7), has been made rigorous by Elliott & Garcke [9].

Our attention in the remainder of this paper will be restricted to the two Cahn-Hilliard equations (10) and (11). We shall of course drop the hats. We remark that these Cahn-Hilliard equations, being fourth-order, have no maximum principle. However, solutions of (11) preserve the constraint $-1 \le m \le 1$, as a consequence of the degenerate mobility $1 - m^2$, which vanishes at the bulk equilibrium values $m = \pm 1$ [9].
2.3. The interfacial regime. Experimental observation and numerical simulation show the following scenario (see e.g. [11, 12, 28]). Consider as initial data the uniform critical mixture $m = 0$, which is an unstable equilibrium of $E$, perturbed by some stationary random fluctuations of amplitude $o(1)$ and correlation length $o(1)$. The linearization selects a most unstable wavelength, which in our non-dimensionalization is $O(1)$; fluctuations of this wavelength grow fastest. After this exponential growth regime, nonlinear effects kick in: $m$ approximately saturates at its bulk equilibrium values $\pm 1$ in most of the sample. The order parameter $m$ attains its bulk equilibrium value 1 in a convoluted region of characteristic length scale 1. Likewise, there is a region where $m$ attains the other bulk equilibrium value. These regions represent distinct “phases,” and their geometry is highly connected (a “bicontinuous” phase distribution). Each phase has volume fraction 1/2, since the evolution preserves the constraint $\fint m = 0$. The phases are separated by a transition layer of width $O(1)$. The profile of $m$ across the transition layer is approximately in equilibrium. Based on the explicit form of the equilibrium profile, one obtains for the energy $E$ per unit volume
$$E \approx \begin{cases} \frac{4}{3} \times \text{interfacial area density} & \text{in the constant mobility case}, \\ \frac{\pi}{2} \times \text{interfacial area density} & \text{in the degenerate mobility case}. \end{cases} \tag{12}$$
As the system matures it enters the “interfacial regime,” characterized by small energy per unit volume and large characteristic length scale:
$$E \ll 1 \quad \text{and} \quad \ell \gg 1. \tag{13}$$
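The constants 4/3 and π/2 in (12) can be reproduced numerically. The sketch below assumes the standard equipartition identity, which is not stated in the text: for an energy density of the form $\frac{1}{2}(|m'|^2 + W(m))$, the minimal cost of a one-dimensional transition from $m = -1$ to $m = 1$ equals $\int_{-1}^{1} \sqrt{W(m)}\,dm$. Here $W = (1 - m^2)^2$ in the constant mobility case and $W = 1 - m^2$ in the degenerate mobility case; the function name and the midpoint rule are illustrative choices.

```python
import math

def interfacial_energy(potential, n=200_000):
    """Midpoint-rule approximation of the integral of sqrt(W(m)) over [-1, 1]."""
    h = 2.0 / n
    return h * sum(math.sqrt(potential(-1.0 + (i + 0.5) * h)) for i in range(n))

# Energy per unit interfacial area for the two models, per the identity above.
constant_mobility = interfacial_energy(lambda m: (1.0 - m * m) ** 2)
degenerate_mobility = interfacial_energy(lambda m: 1.0 - m * m)
```

The midpoint rule avoids evaluating the square root exactly at the endpoints, where the second integrand has unbounded derivative; the two results can be compared against 4/3 and π/2.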
In this regime the evolution is essentially geometric, since the interface is sharp on the scale of the regions it separates. Motion is driven by the reduction of the total interfacial area, limited by diffusion through the bulk for the case of constant mobility resp. along the interface for the case of degenerate mobility. It leads to a coarsening of the phase distribution, that is, to an increase of its characteristic length scale $\ell$.

The geometric evolution associated with our constant-mobility Cahn-Hilliard equation (10) is the Mullins-Sekerka law, which prescribes the normal velocity $V$ of the evolving interface $\Gamma$ as follows. First, let $p$ be the chemical potential defined by
$$-\nabla^2 p = 0 \ \text{outside } \Gamma, \qquad p = \frac{1}{3} H \ \text{on } \Gamma,$$
where $H$ denotes the mean curvature. Then $V$ is given by
$$V = \left[\frac{\partial p}{\partial \nu}\right] \ \text{on } \Gamma, \tag{14}$$
where $[\frac{\partial p}{\partial \nu}]$ denotes the jump in the normal derivative $\frac{\partial p}{\partial \nu}$ of $p$ across $\Gamma$. The fact that large-time Cahn-Hilliard coarsening is described by the Mullins-Sekerka regime was shown by Pego [27] using a formal, asymptotic-expansion-based argument (see [2] for the multicomponent case). A rigorous proof of this result was given by Alikakos, Bates & Chen, provided the limiting Mullins-Sekerka law has a smooth solution [1]. A rigorous result not requiring any regularity hypotheses, using a very weak notion of solution of the Mullins-Sekerka law, was given by Chen [4].
The geometric evolution associated with our degenerate-mobility Cahn-Hilliard equation (11) is motion by surface diffusion. It prescribes the normal velocity $V$ of the evolving interface $\Gamma$ by
$$V = -\frac{\pi^2}{16}\,\nabla_s^2 H \ \text{on } \Gamma, \tag{15}$$
where $\nabla_s^2$ denotes the surface Laplacian on $\Gamma$. This was shown by Cahn, Elliott & Novick-Cohen [3] using a formal, asymptotic-expansion-based argument (see also [13] for the multicomponent case). There is, to our knowledge, as yet no rigorous version of this result.

The literature on Cahn-Hilliard equations, sharp-interface limits, and related topics is vast; additional information and references can be found in the review [10].
2.4. Scaling. The sharp-interface models are important for their scale-invariance: solutions are preserved under the scaling
$$x = \lambda \hat x,\ t = \lambda^3 \hat t \ \text{for Mullins-Sekerka}, \qquad x = \lambda \hat x,\ t = \lambda^4 \hat t \ \text{for surface diffusion}, \tag{16}$$
as an easy consequence of the definitions (14) and (15). Solutions of the Cahn-Hilliard equations are, therefore, approximately scale-invariant in the interfacial regime.

Of course we do not expect the phase geometry to be pointwise scale invariant. But for a critical mixture (one with $\fint m = 0$), numerical simulations suggest that solutions in the interfacial regime are statistically self-similar (see e.g. [11, 12, 28]). Such behavior imposes itself after an initial transient, and persists as long as the length scale of the phase distribution is much smaller than the system size – after which finite-size effects take over.

Conceptually, statistical self-similarity means that the (suitably defined, random) solution is invariant under the scaling (16). Practically, we can replace statistical averaging by spatial averaging to derive the following very measurable consequence: the two-point correlation function $c(t, r)$ should have the form
$$c(t, r) = \hat c\left(\frac{r}{\ell}\right), \quad \text{where } \ell = t^{\alpha}, \tag{17}$$
for some universal profile $\hat c(r)$, with $\alpha = 1/3$ in the constant-mobility (MS) setting, and $\alpha = 1/4$ in the degenerate-mobility (SD) setting. Such self-similarity is indeed seen experimentally; for example it is a robust feature of many experiments in the spinodal decomposition of polymer melts, where the Fourier transform of the correlation function (the “structure factor”) can be measured with high precision [18, 20].

To our knowledge, there is no convincing theoretical explanation for the observed statistical self-similarity. The closest thing we know to such an explanation is the mean-field theory of Ostwald ripening. This amounts to the constant-mobility Cahn-Hilliard model (or the Mullins-Sekerka law) applied to a strongly off-critical mixture (volume fraction of one phase close to zero, i.e. $\fint m\,dx + 1 \ll 1$). In this setting, the minority phase $m \approx 1$ breaks into many nearly spherical droplets of varying radius. The Lifshitz-Slyozov-Wagner mean field theory [21, 31] gives an evolution equation for the number density $f(t, R)\,dR$ of droplets of radius $R$ at time $t$. This evolution equation has been given a rigorous justification [24, 25]. It admits self-similar solutions, which can be viewed as “statistically self-similar” configurations at the level of the distribution
of radii. Surprisingly, however, the large-time behavior is not necessarily self-similar within this simple mean-field theory [26]. The conjecture (17), asserting self-similarity of the correlation functions, seems intractable. We therefore concentrate on the subsidiary, presumably easier conjecture that $\ell \sim t^{\alpha}$ with $\alpha$ determined by scaling. Let us work out the plausible range of validity of this statement. Assume $t = 0$ corresponds to a fixed time where we are already in the interfacial regime, that is
\[ E_0 := E(t = 0) \ll 1, \qquad \ell_0 := \ell(t = 0) \gg 1. \tag{18} \]
In view of the scale invariance (16), we expect
\[ \ell \sim \begin{cases} (t + \ell_0^3)^{1/3} \sim t^{1/3} & \text{for } t \gg \ell_0^3 \quad \text{(constant mobility)}, \\ (t + \ell_0^4)^{1/4} \sim t^{1/4} & \text{for } t \gg \ell_0^4 \quad \text{(degenerate mobility)}. \end{cases} \]
Because of (12), we expect that
\[ E \sim \ell^{-1}, \]
so the preceding relation becomes
\[ E \sim \begin{cases} t^{-1/3} & \text{for } t \gg \ell_0^3 \quad \text{(constant mobility)}, \\ t^{-1/4} & \text{for } t \gg \ell_0^4 \quad \text{(degenerate mobility)}. \end{cases} \]
Thus, taking into account the hypothesis (18), we expect
\[ E \sim \begin{cases} t^{-1/3} & \text{for } t \gg \ell_0^3 \gg 1 \gg E_0 \quad \text{(constant mobility)}, \\ t^{-1/4} & \text{for } t \gg \ell_0^4 \gg 1 \gg E_0 \quad \text{(degenerate mobility)}. \end{cases} \tag{19} \]
The main result of this paper is a one-sided, time-averaged version of (19).

3. The Main Result

The last two sections mixed rigorous statements with many heuristic arguments and conjectures. From here on, however, our treatment is fully rigorous. We consider solutions of the “constant-mobility” Cahn-Hilliard equation
\[ \frac{\partial m}{\partial t} + \nabla^2\bigl(\nabla^2 m + 2(1 - m^2)\,m\bigr) = 0 \qquad \text{(constant mobility, Eq. (10))} \]
with associated energy
\[ E = \fint \tfrac12\bigl(|\nabla m|^2 + (1 - m^2)^2\bigr)\,dx; \]
and solutions of the “degenerate-mobility” Cahn-Hilliard equation
\[ \frac{\partial m}{\partial t} + \nabla\cdot\bigl((1 - m^2)\,\nabla(\nabla^2 m + m)\bigr) = 0 \qquad \text{(degenerate mobility, Eq. (11))} \]
with associated energy
\[ E = \fint \tfrac12\bigl(|\nabla m|^2 + (1 - m^2)\bigr)\,dx. \]
We restrict our attention for simplicity to the case of a critical mixture, i.e. to solutions with
\[ \fint m\,dx = 0 \qquad \text{(critical mixture, Eq. (8))}. \]
The initial value problem for the constant-mobility Cahn-Hilliard equation is well-posed and solutions are smooth. Less is known about the degenerate-mobility case: weak solutions are known to exist [9] but uniqueness remains open. Our arguments are valid for the weak solutions constructed in [9]. We always use periodic boundary conditions for the PDEs, and $\fint$ denotes averaging over the period cell. The size $\Lambda$ of the period cell is effectively the system size; the interesting case is $\Lambda \gg 1$. We always work with averages, so the system size $\Lambda$ never enters our analysis. In particular, our upper bounds on the coarsening rate are independent of system size. As a specific solution coarsens, its length scale must eventually approach the system size. When this happens finite-size effects will slow and eventually stop the coarsening. This behavior does not falsify our results, since we discuss only upper bounds on the coarsening rate. Our analysis uses two different measures of the length scale of the microstructure. One is the interfacial energy density; we explained in Sect. 2.3 that $E$ itself is a good proxy for this. The other is the physical scale, the quantity $\ell$ in our heuristic discussions. The convenient definition of this quantity is the following:

Definition 1. For any spatially-periodic $m(x)$ with mean value zero, its physical scale $L = L[m]$ is
\[ L := \fint |\nabla^{-1} m|\,dx := \sup\Bigl\{ \fint m\,\zeta\,dx \;\Big|\; \text{periodic } \zeta \text{ with } \sup|\nabla\zeta| \le 1 \Bigr\}. \tag{20} \]
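To get a feel for Definition 1, the following Python sketch evaluates $L[m]$ for a one-dimensional square wave. It rests on an assumption that is not stated in the paper: in one space dimension, integrating by parts against a periodic antiderivative $M$ of $m$ turns the supremum in (20) into the $L^1$ distance from $M$ to the constants, which is the average of $|M - \mathrm{median}(M)|$. The function names `physical_scale_1d` and `square_wave` are hypothetical and chosen only for illustration.

```python
import statistics


def physical_scale_1d(samples, dx):
    """Approximate L[m] for one period of a 1D profile on a uniform grid.

    Assumption (not from the paper): in 1D, L[m] equals the average of
    |M - median(M)|, where M is a periodic antiderivative of m.
    """
    n = len(samples)
    mean = sum(samples) / n
    primitive = []
    running = 0.0
    for value in samples:
        running += (value - mean) * dx  # removing the mean keeps M periodic
        primitive.append(running)
    shift = statistics.median(primitive)  # best constant in the L^1 sense
    return sum(abs(p - shift) for p in primitive) / n


def square_wave(wavelength, n=1000):
    """One period of a sharp-interface profile: +1 on the first half, -1 on the second."""
    dx = wavelength / n
    samples = [1.0 if k < n // 2 else -1.0 for k in range(n)]
    return samples, dx


for wavelength in (1.0, 2.0, 4.0):
    samples, dx = square_wave(wavelength)
    # For this profile the computed value should be close to wavelength / 8.
    print(wavelength, physical_scale_1d(samples, dx))
```

Under the stated assumption, $L$ grows linearly with the wavelength of the pattern, while the number of interfaces per unit length, $2/\text{wavelength}$, shrinks at the same rate. This is the sense in which $L$ behaves like a length.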
The notation $\fint |\nabla^{-1} m|\,dx$ is purely formal: it is not the $L^1$ norm of some function $\nabla^{-1} m$. Rather, it reminds us that $L[m]$ is dual to the $W^{1,\infty}$ norm on $\zeta$. (To extend our analysis to off-critical mixtures, i.e. to permit $\fint m \neq 0$, one must restrict $\zeta$ in (20) to have mean value 0.) We now state our main results. For maximum clarity we state a special case of our result as Theorem 1, then the general case as Theorem 2.

Theorem 1. If the initial energy is $E_0$ and the initial length scale is $L_0$ then we have
\[ \fint_0^T E^2\,dt \gtrsim \fint_0^T \bigl(t^{-1/3}\bigr)^2\,dt \quad \text{for } T \gg L_0^3 \gg 1 \gg E_0 \quad \text{(constant mobility)}, \]
\[ \fint_0^T E^3\,dt \gtrsim \fint_0^T \bigl(t^{-1/4}\bigr)^3\,dt \quad \text{for } T \gg L_0^4 \gg 1 \gg E_0 \quad \text{(degenerate mobility)}. \]
Remark 1. The detailed statement of Theorem 1 is this: There exists a (possibly large but controlled) universal constant $C < \infty$ (depending only on the space dimension $N$) such that
\[ \frac{1}{T}\int_0^T E^2\,dt \ge \frac{1}{C}\,T^{-2/3} \quad \text{provided } T \ge C L_0^3 \text{ and } E_0 \le \frac{1}{C} \]
for the constant mobility case and a similar statement in the degenerate mobility case. Here and throughout, the symbols $\gtrsim$, resp. $\lesssim$ and $\gg$, bear precisely this meaning. The symbol $\sim$ means both $\gtrsim$ and $\lesssim$.
Theorem 1 asserts that $E \gtrsim t^{-1/3}$ in a suitable time-averaged sense for the case of constant mobility, and $E \gtrsim t^{-1/4}$ in a different time-averaged sense for the case of degenerate mobility. It is natural to ask whether similar bounds hold for other norms of $E$, and with $E$ replaced by $E^{\theta} L^{-(1-\theta)}$. The answer is yes: the method used to prove Theorem 1 actually shows the following stronger result.

Theorem 2. For any $0 \le \theta \le 1$, suppose $r$ satisfies
\[ r < 3, \quad \theta r > 1 \quad \text{and} \quad (1 - \theta) r < 2 \quad \text{in the case of constant mobility}, \tag{21} \]
\[ r < 4, \quad \theta r > 2 \quad \text{and} \quad (1 - \theta) r < 2 \quad \text{in the case of degenerate mobility}. \tag{22} \]
Then we have
\[ \fint_0^T E^{\theta r} L^{-(1-\theta) r}\,dt \gtrsim \fint_0^T \bigl(t^{-1/3}\bigr)^r\,dt \quad \text{for } T \gg L_0^3 \gg 1 \gg E_0 \quad \text{(constant mobility)}, \]
\[ \fint_0^T E^{\theta r} L^{-(1-\theta) r}\,dt \gtrsim \fint_0^T \bigl(t^{-1/4}\bigr)^r\,dt \quad \text{for } T \gg L_0^4 \gg 1 \gg E_0 \quad \text{(degenerate mobility)}. \]
The values of $r$ and $\theta$ permitted by (21) and (22) are shown in Fig. 1, resp. 2. Notice that when $\theta = 1$, (21) permits any $1 < r < 3$ and (22) permits any $2 < r < 4$. Also notice that the minimum possible $\theta$ permitted by (21) is 1/3, while the minimum permitted by (22) is 1/2. The conclusion of the theorem is strongest when $\theta$ and $r$ are smallest, i.e. for values close to the curve $\theta r = 1$ (constant mobility), resp. $\theta r = 2$ (degenerate mobility). Indeed, focusing for simplicity on the constant mobility case, if the estimate holds for a given $r_0 < 3$ then it holds for all $r$ between $r_0$ and 3 by an application of Jensen’s inequality; and if the estimate holds for a given $\theta_0 < 1$, then it holds for all $\theta > \theta_0$ by an application of Lemma 1 below.
4. The Proof Theorems 1 and 2 are immediate consequences of three basic lemmas. We state them in Sect. 4.1, then prove each in turn in Sects. 4.2–4.4.
Fig. 1. Constant mobility
Fig. 2. Degenerate mobility
4.1. Ingredients. The first basic lemma relates $L$ and $E$ using just their definitions, making no use of the Cahn-Hilliard dynamics. As motivation, we observe that $L$ scales like length. In the interfacial regime $E \ll 1$, according to (12), $E$ is essentially the interfacial area density, which scales like inverse length. So it is tempting to suggest that $E L \sim 1$. This is true for sufficiently simple geometries with a single length scale. In general, however, there is only an inequality:

Lemma 1 (Interpolation).
\[ E L \gtrsim 1 \quad \text{for } E \ll 1. \]
We call this an “interpolation” lemma because it is closely related to the following relation, asserted for spatially periodic $f$ with mean value 0:
\[ \fint |f|\,dx \lesssim \Bigl(\fint |\nabla f|\,dx\Bigr)^{1/2} \Bigl(\fint |\nabla^{-1} f|\,dx\Bigr)^{1/2}. \tag{23} \]
The proof is similar to (but easier than) the one given below for Lemma 1. We obtain a geometric statement by choosing $f$ to take only the values $\pm 1$, so that $\fint |f|\,dx = 1$ and $\fint |\nabla f|\,dx$ is twice the interfacial area density. Thus (23) contains a sharp-interface version of Lemma 1. We note in passing that interpolation inequalities similar to (23) (interpolating between the BV norm $\fint |\nabla f|\,dx$ and a suitable negative norm) were central to our recent work with Choksi on domain branching in uniaxial ferromagnets [5]. Inequalities of this type have also emerged from recent work on nonlinear approximation theory [6, 7].
The second basic lemma restricts the rate at which $L$ can change. In our Cahn-Hilliard models, the free energy $E$ is dissipated by friction. The following lemma says that a change of the length scale $L$ has to overcome significant friction, and is therefore accompanied by a significant reduction of the free energy $E$.

Lemma 2 (Dissipation).
\[ (\dot L)^2 \lesssim -\dot E \quad \text{(constant mobility)}, \]
\[ (\dot L)^2 \lesssim E\,(-\dot E) \quad \text{(degenerate mobility)}. \]

The third basic lemma is a pure ODE result, reaping the benefits of the other two.

Lemma 3 (ODE). If $0 \le \theta \le 1$ and $r > 0$ satisfy (21), then $E L \gtrsim 1$ and $(\dot L)^2 \lesssim -\dot E$ imply
\[ \fint_0^T E^{\theta r} L^{-(1-\theta) r}\,dt \gtrsim T^{-r/3} \quad \text{for } T \gg L_0^3. \tag{24} \]
If $0 \le \theta \le 1$ and $r > 0$ satisfy (22), then $E L \gtrsim 1$ and $(\dot L)^2 \lesssim E\,(-\dot E)$ imply
\[ \fint_0^T E^{\theta r} L^{-(1-\theta) r}\,dt \gtrsim T^{-r/4} \quad \text{for } T \gg L_0^4. \tag{25} \]
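To see why (24) and (25) match the right-hand sides in Theorems 1 and 2, it may help to evaluate the time averages of the pure power laws explicitly. The following is a routine calculation, included here only as a reading aid; it uses nothing beyond $r < 3$ (resp. $r < 4$), which makes the integrals converge at $t = 0$.

```latex
\[
  \fint_0^T \bigl(t^{-1/3}\bigr)^r\,dt
  = \frac{1}{T}\int_0^T t^{-r/3}\,dt
  = \frac{3}{3-r}\,T^{-r/3}
  \qquad (r < 3),
\]
\[
  \fint_0^T \bigl(t^{-1/4}\bigr)^r\,dt
  = \frac{1}{T}\int_0^T t^{-r/4}\,dt
  = \frac{4}{4-r}\,T^{-r/4}
  \qquad (r < 4).
\]
```

Up to constants depending on $r$, these are the lower bounds in (24) and (25). Theorem 1 corresponds to the choices $\theta = 1$, $r = 2$ (constant mobility) and $\theta = 1$, $r = 3$ (degenerate mobility), both of which are admissible under (21), resp. (22).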
4.2. Proof of Lemma 1. We present the proof for the case of constant mobility. The argument for the case of degenerate mobility is similar (actually slightly easier, since when the mobility is degenerate we have $-1 \le m \le 1$). The first ingredient is the well-known Modica-Mortola [22] inequality. Defining
\[ W(m) := \int_0^m |1 - t^2|\,dt, \tag{26} \]
we have
\[ \frac{\partial W}{\partial m} = |1 - m^2|, \]
so
\[ \fint |\nabla(W(m))|\,dx = \fint |\nabla m|\,\frac{\partial W}{\partial m}\,dx \le \fint \tfrac12\Bigl(|\nabla m|^2 + \Bigl(\frac{\partial W}{\partial m}\Bigr)^2\Bigr)\,dx = E. \tag{27} \]
The second ingredient is the interpolation estimate
\[ \fint m^2\,dx \lesssim \Bigl(\fint |\nabla(W(m))|\,dx \,\fint |\nabla^{-1} m|\,dx\Bigr)^{1/2} + E \tag{28} \]
with $\fint |\nabla^{-1} m|\,dx = L$ defined by (20). The proof of (28) makes use of a smooth mollifier $\varphi$ which is radially symmetric, non-negative, and supported in the unit ball with $\int_{\mathbb{R}^N} \varphi = 1$. Let the subscript $\varepsilon$ denote the convolution with the kernel
\[ \frac{1}{\varepsilon^N}\,\varphi\Bigl(\frac{\cdot}{\varepsilon}\Bigr). \]
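We split the $L^2$-norm according to
\[ \fint m^2\,dx \lesssim \fint (m - m_\varepsilon)^2\,dx + \fint m_\varepsilon^2\,dx. \tag{29} \]
For the first term in (29), we observe that
\[ (m_1 - m_2)^2 \lesssim |W(m_1) - W(m_2)| \]
as an easy consequence of the definition (26). Therefore
\begin{align*}
\fint (m - m_\varepsilon)^2\,dx &\le \sup_{|h| \le \varepsilon} \fint (m(x) - m(x + h))^2\,dx \\
&\lesssim \sup_{|h| \le \varepsilon} \fint |W(m(x)) - W(m(x + h))|\,dx \\
&\lesssim \varepsilon \fint |\nabla(W(m))|\,dx. \tag{30}
\end{align*}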
For the second term in (29), we must deal separately with large and small $|m_\varepsilon|$-values:
\[ \fint m_\varepsilon^2\,dx = \fint \bigl(m_\varepsilon^2 - \min\{m_\varepsilon^2, 4\}\bigr)\,dx + \fint \min\{m_\varepsilon^2, 4\}\,dx. \tag{31} \]
(The case of degenerate mobility is easier at this point, since $\sup |m_\varepsilon| \le 1$.) To estimate the first term in (31) we observe that since $m^2 - \min\{m^2, 4\}$ is non-zero only for $|m| > 2$, we have the following pointwise estimate by the energy density:
\[ m^2 - \min\{m^2, 4\} \lesssim \tfrac12 (1 - m^2)^2. \tag{32} \]
Furthermore, $m^2 - \min\{m^2, 4\}$ is convex in $m$. Hence we obtain by Jensen’s inequality
\[ \fint \bigl(m_\varepsilon^2 - \min\{m_\varepsilon^2, 4\}\bigr)\,dx \le \fint \bigl(m^2 - \min\{m^2, 4\}\bigr)\,dx \overset{(32)}{\lesssim} \fint \tfrac12 (1 - m^2)^2\,dx \le E. \tag{33} \]
To estimate the second term in (31), we observe that
\[ \fint \min\{m_\varepsilon^2, 4\}\,dx \lesssim \fint |m_\varepsilon|\,dx. \tag{34} \]
Since the convolution operator is symmetric in the $L^2$ norm and
\[ \sup |\nabla \zeta_\varepsilon| \lesssim \frac{1}{\varepsilon} \sup |\zeta| \quad \text{for any function } \zeta, \]
a duality argument gives
\[ \fint |m_\varepsilon|\,dx \lesssim \frac{1}{\varepsilon} \fint |\nabla^{-1} m|\,dx. \tag{35} \]
Combining (30), (33), (34) and (35), we conclude that
\[ \fint m^2\,dx \lesssim \varepsilon \fint |\nabla(W(m))|\,dx + \frac{1}{\varepsilon} \fint |\nabla^{-1} m|\,dx + E. \]
Optimization over $\varepsilon$ gives the desired interpolation inequality (28). The final ingredient is the elementary estimate
\[ 1 - \fint m^2\,dx = \fint (1 - m^2)\,dx \le \Bigl(\fint (1 - m^2)^2\,dx\Bigr)^{1/2} \lesssim E^{1/2}. \]
Together with (27) and (28), we obtain as desired
\[ 1 \lesssim (E L)^{1/2} + E + E^{1/2}, \]
which yields Lemma 1 for $E \ll 1$.

4.3. Proof of Lemma 2. In the constant mobility setting, the PDE (10) can be written as
\[ \frac{\partial m}{\partial t} + \nabla \cdot J = 0 \quad \text{where } J := -\nabla \frac{\partial E}{\partial m}, \tag{36} \]
and its solutions are known to be classical. Therefore the rate of change of $E$ is
\[ -\dot E = -\fint \frac{\partial E}{\partial m}\,m_t\,dx = \fint |J|^2\,dx. \tag{37} \]
Concerning the rate of change of $L$, we claim that for any $t_1 < t_2$,
\[ |L(t_2) - L(t_1)| \le \int_{t_1}^{t_2} \fint |J|\,dx\,dt. \tag{38} \]
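Indeed, let $\zeta_*(x)$ be an optimal test function in the definition (20) of $L(t_2)$; thus $L(t_2) = \fint m(x, t_2)\,\zeta_*(x)\,dx$ and $\zeta_*$ is periodic and Lipschitz continuous with $|\nabla \zeta_*| \le 1$. Using $\zeta_*$ as a test function in the definition of $L(t_1)$ gives
\begin{align*}
L(t_2) - L(t_1) &\le \fint (m(x, t_2) - m(x, t_1))\,\zeta_*\,dx \\
&= \int_{t_1}^{t_2} \fint \frac{\partial m}{\partial t}\,\zeta_*\,dx\,dt \\
&\overset{(36)}{=} \int_{t_1}^{t_2} \fint J \cdot \nabla \zeta_*\,dx\,dt \\
&\le \int_{t_1}^{t_2} \fint |J|\,dx\,dt.
\end{align*}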
The opposite inequality
\[ L(t_1) - L(t_2) \le \int_{t_1}^{t_2} \fint |J|\,dx\,dt \]
is proved similarly, choosing $\zeta_*$ to be optimal for the definition of $L(t_1)$. Thus (38) holds. The conclusion of Lemma 2 follows easily from (37) and (38). Indeed, from the latter we see that $L$ is an absolutely continuous function of $t$ and
\[ |\dot L| \le \fint |J|\,dx. \tag{39} \]
Applying the Cauchy-Schwarz inequality and using (37) we conclude that
\[ |\dot L| \le \Bigl(\fint |J|^2\,dx\Bigr)^{1/2} = \bigl(-\dot E\bigr)^{1/2}, \]
which is the assertion of the lemma in the constant mobility setting. The proof in the degenerate mobility setting is very similar. The PDE in this case is (11), which can be written as
\[ \frac{\partial m}{\partial t} + \nabla \cdot J = 0 \quad \text{where } J := -(1 - m^2)\,\nabla \frac{\partial E}{\partial m}. \tag{40} \]
For a classical solution, (40) implies
\[ -\dot E = \fint \frac{1}{1 - m^2}\,|J|^2\,dx. \tag{41} \]
The variation of $L$ is still estimated by (39), and the Cauchy-Schwarz inequality gives
\[ \fint |J|\,dx \le \Bigl(\fint \frac{1}{1 - m^2}\,|J|^2\,dx\Bigr)^{1/2} \Bigl(\fint (1 - m^2)\,dx\Bigr)^{1/2}. \tag{42} \]
We also have
\[ E \ge \frac12 \fint (1 - m^2)\,dx. \tag{43} \]
Combining inequalities (39), (41), (42) and (43) we conclude that
\[ (\dot L)^2 \le -2 E \dot E, \tag{44} \]
which is the assertion of Lemma 2 in the degenerate mobility setting. It is not known whether the degenerate-mobility Cahn-Hilliard equation (40) has a global-in-time classical solution. However Elliott & Garcke proved the existence of a global-in-time weak solution in [9], and the argument just presented extends to the weak solutions constructed by those authors. Indeed, their solutions are obtained by a limiting procedure involving Cahn-Hilliard equations similar to (40), but with a finite quench (so the energy is (5) with $\beta > 0$) and regularized mobility. The regularized equations have classical solutions and support estimates analogous to (41)–(43). There is sufficient compactness to pass to the limit in $L$ and $E$, and the analogues of (41) give in the limit the energy inequality
\[ -\dot E \ge \fint \frac{1}{1 - m^2}\,|J|^2\,dx \]
(the situation is analogous to Leray-Hopf weak solutions of the Navier-Stokes equations). This is, fortunately, all we really needed from (41): passing to the limit in the regularized version of
\[ |\dot L| \le \Bigl(\fint \frac{1}{1 - m^2}\,|J|^2\,dx\Bigr)^{1/2} \Bigl(\fint (1 - m^2)\,dx\Bigr)^{1/2}, \]
and noting that $-2 E \dot E = -dE^2/dt$, we conclude that for the limiting weak solution $L$ is absolutely continuous, $-dE^2/dt$ is a bounded measure, and
\[ (\dot L)^2 \le -2\,dE^2/dt. \]
This is the sense in which the dissipation relation holds for weak solutions.
4.4. Proof of Lemma 3. We begin with some remarks, showing that Lemma 3 takes more or less optimal advantage of its hypotheses. Let us focus for simplicity on the case of constant mobility.

Remark 2. We shall prove a weak form of the statement $E \gtrsim t^{-1/3}$, but we cannot expect to prove any form of the analogous-looking statement $L \lesssim t^{1/3}$. Indeed, the hypotheses $E L \gtrsim 1$ and $(\dot L)^2 \lesssim -\dot E$ are consistent with the choice
\[ L := t^{\alpha} \quad \text{and} \quad E := t^{-\beta} \]
provided
\[ 0 \le \beta \le \alpha \quad \text{and} \quad \beta \le 1 - 2\alpha. \tag{45} \]
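The constraints (45) can be read off by substituting the power laws into the two hypotheses. The following sketch does this for large $t$, ignoring constants and assuming $\beta > 0$ so that $-\dot E$ does not vanish; it is a reading aid rather than part of the original argument.

```latex
\begin{align*}
  E L = t^{\alpha - \beta} \gtrsim 1 \ \text{for large } t
    &\iff \beta \le \alpha, \\
  (\dot L)^2 = \alpha^2\, t^{2\alpha - 2} \lesssim \beta\, t^{-\beta - 1} = -\dot E
    \ \text{for large } t
    &\iff 2\alpha - 2 \le -\beta - 1
    \iff \beta \le 1 - 2\alpha .
\end{align*}
Adding twice the first constraint to the second gives
\[
  3\beta \le 2\alpha + (1 - 2\alpha) = 1,
  \qquad \text{while } \beta \ge 0 \text{ and } \beta \le 1 - 2\alpha
  \text{ force } \alpha \le \tfrac12 .
\]
```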
These inequalities imply that $\beta \le 1/3$, consistent with the expected result $E \gtrsim t^{-1/3}$. However they permit $\alpha$ to take any value between 0 and 1/2. Thus our approach cannot give an upper bound on $L$ better than $L \lesssim t^{1/2}$.

Remark 3. Extending the preceding comment: we can expect to prove a weak form of the statement $E^{\theta} L^{-(1-\theta)} \gtrsim t^{-1/3}$ only for $1/3 \le \theta \le 1$. Indeed, it is easy to see that
\[ 0 \le \beta \le \alpha \quad \text{and} \quad \beta \le 1 - 2\alpha \quad \text{imply} \quad \theta\beta + (1 - \theta)\alpha \le 1/3 \]
only if $1/3 \le \theta \le 1$.

Remark 4. Our tools are not sufficient to prove a pointwise version of the statement $E \gtrsim t^{-1/3}$.
Indeed, for any $0 < E_1 \ll 1$, consider the functions
\[ E(t) := 1 - E_1^2\,t \quad \text{and} \quad L(t) := 1 + E_1 t. \]
They satisfy the restrictions $\dot E \le 0$ and $(\dot L)^2 \lesssim -\dot E$ trivially, and they also satisfy $E L \gtrsim 1$ on the finite time horizon
\[ t \le t_1 := \frac{1 - E_1}{E_1^2} \approx \frac{1}{E_1^2}. \]
Since
\[ E(t_1) = E_1, \]
this example rules out any pointwise lower bound of the form
\[ E(t) \gtrsim t^{-\gamma} \quad \text{with } \gamma < \frac12. \]
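The example of Remark 4 is easy to check numerically. The sketch below (the function name `remark4_example` and the grid size are illustrative assumptions, not from the paper) samples $E L$ on $[0, t_1]$ and evaluates $E(t_1)\,t_1^{\gamma}$, which a pointwise bound $E(t) \ge c\,t^{-\gamma}$ would force to stay above a fixed $c > 0$.

```python
def remark4_example(E1, gamma, n_grid=10_000):
    """Sample the Remark 4 pair E(t) = 1 - E1^2 t, L(t) = 1 + E1 t on [0, t1]."""
    t1 = (1.0 - E1) / E1**2

    def energy(t):
        return 1.0 - E1**2 * t

    def length(t):
        return 1.0 + E1 * t

    # Here (dL/dt)^2 = E1^2 = -dE/dt, so the dissipation relation holds with equality.
    # E*L is a concave quadratic in t equal to 1 at t = 0 and t = t1, hence >= 1 between.
    min_EL = min(
        energy(t1 * k / n_grid) * length(t1 * k / n_grid)
        for k in range(n_grid + 1)
    )
    # A pointwise bound E(t) >= c * t**(-gamma) would require this ratio to be >= c.
    ratio = energy(t1) * t1**gamma
    return t1, min_EL, energy(t1), ratio


for E1 in (1e-2, 1e-3, 1e-4):
    t1, min_EL, E_end, ratio = remark4_example(E1, gamma=1.0 / 3.0)
    print(E1, t1, min_EL, E_end, ratio)
```

For $\gamma < 1/2$ the ratio behaves like $E_1^{1 - 2\gamma}$ and so shrinks as $E_1$ decreases, which is why no constant $c$ can work uniformly over this family.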
We now begin the proof of Lemma 3. We present the argument just for the case of degenerate mobility; the other case, when the mobility is constant, is entirely similar. We may assume $E(t)$ and $L(t)$ are differentiable since the hypotheses $E L \gtrsim 1$ and $(\dot L)^2 \lesssim -d(E^2)/dt$ are preserved under mollification. The differential inequality $(\dot L)^2 \lesssim E(-\dot E)$ implies that $E$ is a monotone function of time, and $L$ is an absolutely continuous function of $E$. Therefore $L$ can be viewed as a function of $E$, and the differential inequality can be rewritten as
\[ \Bigl(\frac{dL}{de}\Bigr)^2 (\dot E)^2 \lesssim E\,|\dot E|. \]
Here we use the lower case $e$ for the energy as an independent variable to distinguish it from $E = E(t)$. Division by $E|\dot E| \ge 0$ gives
\[ \frac{1}{E} \Bigl(\frac{dL}{de}\Bigr)^2 |\dot E| \lesssim 1. \tag{46} \]
(The division is inadmissible if $\dot E = 0$, but the conclusion (46) is trivial in that case, so this conclusion is valid for all $t > 0$.) Multiplying by any function $f(E(t))$ and integrating in time gives
\[ \int_0^T f(E(t))\,dt \gtrsim \int_{E(T)}^{E(0)} \frac{f(e)}{e} \Bigl(\frac{dL}{de}\Bigr)^2 de. \]
Taking $f = e^{\theta r} L^{-(1-\theta) r}$ and writing $E_0 = E(0)$, $E_T = E(T)$, we reach the conclusion that
\[ \int_0^T E^{\theta r}(t)\,L^{-(1-\theta) r}(t)\,dt \gtrsim \int_{E_T}^{E_0} e^{\theta r - 1} L^{-(1-\theta) r} \Bigl(\frac{dL}{de}\Bigr)^2 de \tag{47} \]
for all $T > 0$. Now we must estimate the right-hand side of (47). Consider the change of variables
\[ \hat e = \frac{1}{2 - \theta r}\,e^{2 - \theta r}, \quad \text{and} \quad \hat L = \frac{1}{1 - \frac{(1-\theta) r}{2}}\,L^{1 - \frac{(1-\theta) r}{2}}. \]
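Our hypotheses
\[ \theta r > 2, \qquad (1 - \theta)\,r < 2 \tag{48} \]
assure that $\hat e \to -\infty$ and $\hat L \to \infty$ as $e \to 0$ and $L \to \infty$ respectively. They also imply
\[ \theta > 1/2, \tag{49} \]
which will be needed below. Since
\[ \Bigl(\frac{dL}{de}\Bigr)^2 de = \Bigl(\frac{dL}{d\hat L}\Bigr)^2 \Bigl(\frac{d\hat L}{d\hat e}\Bigr)^2 \frac{d\hat e}{de}\,d\hat e \]
we have
\[ \int_{E_T}^{E_0} e^{\theta r - 1} L^{-(1-\theta) r} \Bigl(\frac{dL}{de}\Bigr)^2 de = \int_{\hat E_T}^{\hat E_0} \Bigl(\frac{d\hat L}{d\hat e}\Bigr)^2 d\hat e. \]
The right-hand side is bounded below by the minimum over all functions $\hat L(\hat e)$ with the same end conditions
\[ \hat L(\hat E_0) = \frac{1}{1 - \frac{(1-\theta) r}{2}}\,(L(0))^{1 - \frac{(1-\theta) r}{2}}, \qquad \hat L(\hat E_T) = \frac{1}{1 - \frac{(1-\theta) r}{2}}\,(L(T))^{1 - \frac{(1-\theta) r}{2}}. \]
To simplify notation we denote these end conditions by $\hat L_0$ and $\hat L_T$ respectively. The extremal $\hat L$ is of course linear in $\hat e$, so we have
\[ \int_0^T E^{\theta r}(t)\,L^{-(1-\theta) r}(t)\,dt \gtrsim \frac{(\hat L_T - \hat L_0)^2}{\hat E_0 - \hat E_T}. \tag{50} \]
When $T$ is such that
\[ L(T) \ge 2 L(0), \]
the right side of (50) is easy to control: we have
\[ \hat L_T - \hat L_0 \gtrsim \hat L_T \quad \text{and} \quad \hat E_0 - \hat E_T \le -\hat E_T \]
so
\[ \int_0^T E^{\theta r} L^{-(1-\theta) r}\,dt \gtrsim \frac{\hat L_T^2}{-\hat E_T} \sim L_T^{2 - (1-\theta) r}\,E_T^{\theta r - 2}. \]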
Rewriting the right-hand side as
\[ L_T^{2 - (1-\theta) r}\,E_T^{\theta r - 2} = \bigl[E_T^{\theta} L_T^{-(1-\theta)}\bigr]^{r - 4}\,[L_T E_T]^{4\theta - 2}, \]
we conclude, using $E L \gtrsim 1$ and (49), that
\[ \int_0^T E^{\theta r} L^{-(1-\theta) r}\,dt \gtrsim \bigl[E_T^{\theta} L_T^{-(1-\theta)}\bigr]^{r - 4} \quad \text{provided } L(T) \ge 2 L(0). \tag{51} \]
Introducing
\[ h(T) := \int_0^T E^{\theta r} L^{-(1-\theta) r}\,dt, \]
we can rewrite (51) as $h \gtrsim (h')^{(r-4)/r}$, so we have shown that
\[ h^{r/(4-r)}(T)\,h'(T) \gtrsim 1 \quad \text{provided } L(T) \ge 2 L(0). \tag{52} \]
Here, we have used $r < 4$. The preceding method doesn’t work when $L(T) < 2 L(0)$, but for such $T$ we can estimate $h'(T) = E^{\theta r}(T)\,L^{-(1-\theta) r}(T)$ by different, more elementary means. Indeed, for such $T$ we have $E(T) \gtrsim L^{-1}(T) \gtrsim L_0^{-1}$, which implies
\[ E^{\theta}(T)\,L^{-(1-\theta)}(T) \gtrsim L_0^{-1}. \]
Thus
\[ h'(T) \gtrsim L_0^{-r} \quad \text{if } L(T) < 2 L_0. \tag{53} \]
Combining (52) and (53) we conclude, using $r < 4$, that
\[ \frac{d}{dt} \bigl(h + L_0^{4-r}\bigr)^{\frac{4}{4-r}} \sim \bigl(h(t) + L_0^{4-r}\bigr)^{\frac{r}{4-r}}\,h'(t) \gtrsim 1 \quad \text{for all } t > 0. \]
Integration in time gives
\[ h(T) + L_0^{4-r} \gtrsim T^{\frac{4-r}{4}} \quad \text{for all } T > 0. \]
Restricting attention to $T \gg L_0^4$, this becomes
\[ \int_0^T E^{\theta r} L^{-(1-\theta) r}\,dt = h(T) \gtrsim T^{\frac{4-r}{4}} \quad \text{for } T \gg L_0^4, \]
which is precisely the conclusion of Lemma 3. 5. Discussion We explained in Sect. 1 that upper bounds on coarsening rates are different from lower bounds, because upper bounds are kinematic and universal, while lower bounds are geometry-dependent. Our rigorous results demonstrate the merit of this viewpoint, by using simple dissipation and interpolation relations to prove weak, time-averaged upper bounds. It would be nice to prove more. We suppose E and L should satisfy pointwise-in-time bounds. But proving this seems to require a new idea, if not an entirely new method. This paper addresses just two of the many energy-driven coarsening models in materials science. Other examples include the coarsening of mounds in epitaxial growth (see e.g. [23, 29, 30]) and the coarsening of defect structures in soft condensed matter (see e.g. [17]). We wonder whether the viewpoint and methods of this paper might be applicable to such problems. Acknowledgement. This research was supported by the National Science Foundation through grants DMS-0073047 and DMS-0101439 (RVK), and by the Deutsche Forschungsgemeinschaft through SFB 611 (FO).
References
1. Alikakos, N., Bates, P., Chen, X.: Convergence of the Cahn-Hilliard equation to the Hele-Shaw model. Arch. Rat. Mech. Anal. 128, 165–205 (1994)
2. Bronsard, L., Garcke, H., Stoth, B.: A multi-phase Mullins-Sekerka system: Matched asymptotic expansions and an implicit time discretisation for the geometric evolution problem. Proc. Royal Soc. Edinburgh Ser. A 128, 481–506 (1998)
3. Cahn, J.W., Elliott, C.M., Novick-Cohen, A.: The Cahn-Hilliard equation with a concentration dependent mobility: Motion by minus the Laplacian of the mean curvature. European J. Appl. Math. 7, 287–301 (1996)
4. Chen, X.: Global asymptotic limit of solutions of the Cahn-Hilliard equation. J. Diff. Geom. 44, 262–311 (1996)
5. Choksi, R., Kohn, R.V., Otto, F.: Domain branching in uniaxial ferromagnetics: A scaling law for the minimum energy. Commun. Math. Phys. 201, 61–79 (1999)
6. Cohen, A., DeVore, R., Petrushev, P., Xu, H.: Nonlinear approximation and the space $BV(\mathbb{R}^2)$. Am. J. Math. 121, 587–628 (1999)
7. Cohen, A., Meyer, Y., Oru, F.: Improved Sobolev embedding theorem. Séminaire sur les Équations aux Dérivées Partielles, 1997–1998, Exp. No. XVI, École Polytech., Palaiseau, 1998
8. Constantin, P., Doering, C.R.: Infinite Prandtl number convection. J. Stat. Phys. 94, 159–172 (1999)
9. Elliott, C.M., Garcke, H.: On the Cahn-Hilliard equation with degenerate mobility. SIAM J. Math. Anal. 27, 404–423 (1996)
10. Fife, P.C.: Models for phase separation and their mathematics. Electron. J. Diff. Eqns. 2000(48), 1–26 (2000)
11. Fratzl, P., Lebowitz, J.L.: Universality of scaled structure functions in quenched systems undergoing phase-separation. Acta Metall. 37, 3245–3248 (1989)
12. Fratzl, P., Lebowitz, J.L., Penrose, O., Amar, J.: Scaling functions, self-similarity, and the morphology of phase separating systems. Phys. Rev. B 44, 4794–4811 (1991)
13. Garcke, H., Novick-Cohen, A.: A singular limit for a system of degenerate Cahn-Hilliard equations. Adv. Diff. Equations 5, 401–434 (2000)
14. Giacomin, G., Lebowitz, J.L., Presutti, E.: Deterministic and stochastic hydrodynamic equations arising from simple microscopic model systems. In: Stochastic Partial Differential Equations: Six Perspectives. R.A. Carmona, B. Rozovskii, eds, Math. Surveys and Monographs 44, Providence, RI: American Mathematical Society, 1997, pp. 107–152
15. Giacomin, G., Lebowitz, J.L.: Phase segregation dynamics in particle systems with long range interactions II: Interface motion. SIAM J. Appl. Math. 58, 1707–1729 (1998)
16. Giga, Y., Kohn, R.V.: Asymptotically self-similar blowup of semilinear heat equations. Comm. Pure Appl. Math. 38, 297–319 (1985)
17. Harrison, C., Adamson, D.H., Cheng, Z.D., Sebastian, J.M., Sethuraman, S., Huse, D.A., Register, R.A., Chaikin, P.M.: Mechanisms of ordering in striped patterns. Science 290, 5497:1558–1560 (2000)
18. Hashimoto, T., Itakura, M., Shimidzu, N.: Late stage spinodal decomposition of a binary polymer mixture. II. Scaling analyses on $Q_m(\tau)$ and $I_m(\tau)$. J. Chem. Phys. 85, 6773–6786 (1986)
19. Herrero, M.A., Velazquez, J.J.L.: Explosion de solutions d’équations paraboliques semilinéaires supercritiques. C. R. Acad. Sci. Paris Sér. I Math. 319(2), 141–145 (1994)
20. Izumitani, T., Tanenaka, M., Hashimoto, T.: Late stage spinodal decomposition of a binary polymer mixture. III. Scaling analyses of late-stage unmixing. J. Chem. Phys. 92, 3213–3221 (1990)
21. Lifshitz, I.M., Slyozov, V.V.: The kinetics of precipitation from supersaturated solid solutions. J. Phys. Chem. Solids 19, 35–50 (1961)
22. Modica, L., Mortola, S.: Un esempio di Γ-convergenza. Boll. U.M.I 14, 285–299 (1977)
23. Moldovan, D., Golubovic, L.: Interfacial coarsening dynamics in epitaxial growth with slope selection. Phys. Rev. E 61, 6190–6214 (2000)
24. Niethammer, B.: Derivation of the LSW-theory for Ostwald ripening by homogenization methods. Archive Rat. Mech. Anal. 147(2), 119–178 (1999)
25. Niethammer, B., Otto, F.: Ostwald ripening: The screening length revisited. Calc. Var. 13, 33–68 (2001)
26. Niethammer, B., Pego, R.: Non-self-similar behavior in the LSW theory of Ostwald ripening. J. Stat. Phys. 95, 867–902 (1999)
27. Pego, R.: Front migration in the nonlinear Cahn-Hilliard equation. Proc. Royal Soc. London A 422, 261–278 (1989)
28. Puri, S., Bray, A.J., Lebowitz, J.L.: Phase-separation kinetics in a model with order-parameter-dependent mobility. Phys. Rev. E 56, 758–765 (1997)
29. Ortiz, M., Repetto, E., Si, H.: A continuum model of kinetic roughening and coarsening in thin films. J. Mech. Phys. Solids 47, 697–730 (1999)
30. Siegert, M.: Ordering dynamics of surfaces in molecular beam epitaxy. Physica A 239, 420–427 (1997)
31. Wagner, C.: Theorie der Alterung von Niederschlägen durch Umlösen. Z. Elektrochemie 65, 581–594 (1961)

Communicated by P. Constantin
Commun. Math. Phys. 229, 397–413 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0684-5
Communications in
Mathematical Physics
Quasi-Product Flows on a C∗ -Algebra Akitaka Kishimoto Department of Mathematics, Hokkaido University, Sapporo 060-0810, Japan Received: 5 February 2002 / Accepted: 22 March 2002 Published online: 12 August 2002 – © Springer-Verlag 2002
Abstract: A version of Glimm’s theorem is given for a flow on a separable C*-algebra; i.e., we give necessary and sufficient conditions for the flow under which any UHF flow (product type flow on a UHF algebra) is embedded in an inner perturbation of the flow.

1. Introduction

A well-known theorem of Glimm [7] says that if $A$ is a separable C*-algebra which is not of type I and $D$ is any UHF C*-algebra (i.e., an infinite tensor product of matrix algebras), then there is a C*-subalgebra $B$ of $A$ and a closed projection $q$ in the second dual $A^{**}$ such that $q \in B'$, $qAq = Bq$, and $Bq$ is isomorphic to $D$. Since any representation of $D$ can be extended to a representation of $A$ (via $Bq$), it follows, as a corollary, that if $M$ is an AFD (approximately finite-dimensional) factor not of type II$_1$, then there is a representation $\pi$ of $A$ such that $\pi(A)'' \cong M$. (See also Chap. V of [5].) In [4] another version of this result is given in the case where $A$ is equipped with a continuous action of a compact abelian group and $D$ carries a product type action. To carry over the proof of Glimm to this case, we need to assume the condition that the action on $A$ has a covariant irreducible representation (whose image does not contain compact operators), which is easily seen to be necessary as well; in this case it thus follows that if $M$ is an AFD factor not of type II$_1$ and if the Connes spectrum is full (or non-zero), then there is a covariant representation $\pi$ of $A$ such that $\pi(A)'' \cong M$. See also [4] for the non-abelian compact case and [6, 2, 3] for further developments. In this note we will handle the case of flows (i.e., continuous actions of $\mathbb{R}$). Namely, if $A$ is equipped with a flow $\alpha$ and $D$ with a UHF flow $\gamma$ (i.e., an infinite tensor product type action of $\mathbb{R}$, see [14]), we will embed $(D, \gamma)$ into $(A, \alpha')$, in a sense as above, for some small inner perturbation $\alpha'$ of $\alpha$ under a suitable assumption on $\alpha$ as in the case of compact groups.
This is a missing result from the analysis in [10, 11, 13] compared with [2, 3]. Thus, as a corollary, we will obtain a similar result on covariant representations as above. As another corollary we will also obtain that for such a flow
$\alpha$ the set of invariant states of a small inner perturbation of $\alpha$ contains a copy of the state space of a UHF C*-algebra as a closed face. Thus if $\alpha$ is asymptotically abelian and $A$ is simple and unital (so that the $\alpha$-invariant states form a simplex), a small inner perturbation can destroy that property in a very strong sense (cf. 2.6 of [12]). If $\alpha$ is a flow on $A$, we denote by $\delta_\alpha$ the (infinitesimal) generator of $\alpha$; $t \mapsto \alpha_t(x)$ is continuous, by the assumption, for any $x \in A$; and if it is differentiable, then $x$ is in the domain $D(\delta_\alpha)$ and $\delta_\alpha(x)$ is defined as $d\alpha_t(x)/dt|_{t=0}$. If $h$ is a self-adjoint element of $A$, then $\delta_\alpha + \mathrm{ad}\, ih$ generates a flow, which will be denoted by $\alpha^{(h)}$ and called an inner perturbation of $\alpha$. If $\alpha$ is a flow on $A$, we denote by $\mathrm{Spec}(\alpha)$ the (Arveson) spectrum of $\alpha$, a closed subset of $\mathbb{R}$, and by $R(\alpha)$ the Connes spectrum, a closed subgroup of $\mathbb{R}$ (see, e.g., [16]). Our result in the case $R(\alpha) \neq (0)$ is given as follows:

Theorem 1.1. Let $A$ be a separable prime C*-algebra and let $\alpha$ be a flow on $A$ with $R(\alpha) \neq (0)$. Then the following conditions are equivalent:
1. There exists a faithful family of $\alpha$-covariant irreducible representations of $A$.
2. There exists a faithful $\alpha$-covariant irreducible representation of $A$ which induces a representation of the crossed product $A \times_\alpha \mathbb{R}$ (on the same Hilbert space), whose kernel is left invariant under $\hat\alpha | R(\alpha)$.
3. For any UHF C*-algebra $D$ and any UHF flow $\gamma$ on $D$ (i.e., $\gamma_t = \bigotimes_{n=1}^{\infty} \mathrm{Ad}\, e^{ith_n}$ on $D = \bigotimes_{n=1}^{\infty} M_{k_n}$ with $h_n = h_n^* \in M_{k_n}$) such that $\mathrm{Spec}(\gamma) \subset R(\alpha)$, and any $\epsilon > 0$, there is a C*-subalgebra $B$ of $A$, an $h = h^* \in A$, and a closed projection $q$ of $A^{**}$ such that
\[ \|h\| < \epsilon, \quad \alpha^{(h)}_t(B) = B, \quad (\alpha^{(h)}_t)^{**}(q) = q, \quad qAq = Bq, \quad (Bq, (\alpha^{(h)})^{**}|Bq) \cong (D, \gamma), \]
where $(\alpha^{(h)})^{**}_t = (\alpha^{(h)}_t)^{**}$ on $A^{**}$, and if $c(q)$ denotes the central support of $q$ in $A^{**}$, $x = 0$ iff $x c(q) = 0$ for any $x \in A$.
Since $R(\alpha) \neq (0)$, $A$ is automatically antiliminary (i.e., it has no abelian hereditary C*-subalgebra). Suppose that $R(\alpha) = \mathbb{R}$. We have shown in [11] that the first two conditions above are equivalent. (See 3.1 of [10] and 2 of [11] for other equivalent conditions.) Since obviously (3) implies (1) or (2), it suffices to prove that (2) implies (3). When we use Glimm’s arguments as presented in Pedersen’s book [16], we use an irreducible representation as in (2) above. Suppose that $R(\alpha) \cong \mathbb{Z}$; in this case we do not seem to know that (1) implies (2); we give the proof of this fact in a more general setting in Sect. 3. We shall give the proof of (2)⇒(3) in Sect. 2. In the case $R(\alpha) = (0)$ our result runs as follows:

Theorem 1.2. Let $A$ be a separable antiliminary prime C*-algebra and let $\alpha$ be a flow on $A$. Then the following conditions are equivalent:
1. There exists a faithful family of $\alpha$-covariant irreducible representations of $A$.
2. There exists a faithful $\alpha$-covariant irreducible representation of $A$.
3. For any UHF C*-algebra $D$ and some UHF flow $\gamma$ on $D$ (i.e., $\gamma_t = \bigotimes_{n=1}^{\infty} \mathrm{Ad}\, e^{ith_n}$ on $D = \bigotimes_{n=1}^{\infty} M_{k_n}$ with $h_n = h_n^* \in M_{k_n}$), and any $\epsilon > 0$, there is a C*-subalgebra $B$ of $A$, an $h = h^* \in A$, and a closed projection $q$ of $A^{**}$ such that
\[ \|h\| < \epsilon, \quad \alpha^{(h)}_t(B) = B, \quad (\alpha^{(h)}_t)^{**}(q) = q, \quad qAq = Bq, \quad (Bq, (\alpha^{(h)})^{**}|Bq) \cong (D, \gamma), \]
where $(\alpha^{(h)})^{**}_t = (\alpha^{(h)}_t)^{**}$ on $A^{**}$, and if $c(q)$ denotes the central support of $q$ in $A^{**}$, $x = 0$ iff $x c(q) = 0$ for any $x \in A$.
Note that $A$ is explicitly assumed to be antiliminary as this does not follow automatically if $R(\alpha) = (0)$ and that in Condition (3) any UHF flow is switched to some UHF flow. We will indicate how this latter change must be made in the proof of (2)⇒(3); otherwise the proof remains the same. We may call a flow satisfying the condition (3) of either form above a quasi-product flow (cf. [2, 3]). For any separable antiliminary prime C*-algebra $A$ there is a quasi-product (approximately inner) flow on $A$ with full Connes spectrum [15]. There is of course a flow which is not a quasi-product flow, as we shall exhibit now. Let $\theta$ be an irrational number and let $A_\theta$ be the (simple) C*-algebra generated by two unitaries $u, v$ with the relation $uv = e^{2\pi i\theta} vu$, which is an irrational rotation C*-algebra. Let $\alpha$ denote the flow on $A_\theta$ defined by $\alpha_t(u) = e^{2\pi i t} u$, $\alpha_t(v) = e^{2\pi i \theta t} v$. Then, since $\alpha_1 = \mathrm{Ad}\, u$ and $\alpha_t(u) \neq u$ for $t \in \mathbb{R} \setminus \mathbb{Z}$, there is no $\alpha$-covariant irreducible representation of $A_\theta$. Thus $\alpha$ is not a quasi-product flow. Note that the reason why we could not formulate the above results for a locally compact abelian group is that we do not know how to make an inner perturbation of such an action. If the group is compact, we do not need to make an inner perturbation as seen in [4]; thus we could have the version of the above two theorems for a compact abelian group. Note that the results for non-full Connes spectrum would be new. We close this introduction by giving two simple corollaries:

Corollary 1.3. Let $A$ be a separable antiliminary prime C*-algebra and let $\alpha$ be a quasi-product flow on $A$. If $M$ is an AFD factor of type III$_\lambda$ with $0 < \lambda \le 1$ or of type II$_\infty$, then there is an $\alpha$-covariant representation $\pi$ of $A$ such that $\pi(A)'' \cong M$ and the flow on $M$ induced by $\alpha$ has $R(\alpha)$ as its Connes spectrum. If $M$ is an AFD factor of type III or of type II$_\infty$ and $R(\alpha) = (0)$, then there is an $\alpha$-covariant representation $\pi$ of $A$ such that $\pi(A)'' \cong M$ (and the Connes spectrum of the induced flow on $M$ is zero).
Note that the (W∗ -version) Connes spectrum of the induced flow is smaller than or equal to the (C∗ -version) Connes spectrum of the flow. If M is an AFD factor of type IIIλ with 0 < λ ≤ 1 or II∞ , then M is an ITPFI factor by the classification theorem due to Connes et al (cf. V.9 of [5]). Here an ITPFI factor is the factor obtained as the weak closure of a UHF algebra in the GNS representation associated with a product state; see [1]. But for a UHF flow it is easy to construct a covariant ITPFI representation
A. Kishimoto
as claimed above. Thus by Theorem 1.1 this part follows in general. If $M$ is an AFD factor of type III or II$_\infty$ and $R(\alpha) = (0)$, then we apply 1.1 with trivial $\gamma$ and take a representation $\rho$ of $D$ with $\rho(D)'' \cong M$. We then extend $\rho$ uniquely to a (covariant) representation $\pi$ of $A$, which still satisfies $\pi(A)'' \cong M$. The first assertion may also follow for any type III$_0$ ITPFI factor. At least, if a type III$_0$ ITPFI factor is given as in 10.10 of [1] for example, we can confirm this.

Corollary 1.4. Let $A$ be a separable antiliminary prime C$^*$-algebra and let $\alpha$ be a quasi-product flow on $A$ with $R(\alpha) = (0)$. For any $\varepsilon > 0$ there is an $h \in A_{sa}$ with $\|h\| < \varepsilon$ such that the set of $\alpha^{(h)}$-invariant states includes a closed face of the state space which is affinely homeomorphic to the state space of a UHF C$^*$-algebra.

Proof. By taking the trivial $\gamma$ in condition (3) of Theorem 1.1, we regard a state on $Bq$ as a state on $A$. All the states on $Bq$ are $\alpha^{(h)}$-invariant and form a closed face of the state space of $A$ (since $q$ is closed).

2. Proof of (2)⇒(3)

Suppose that $R(\alpha) = \mathbb{R}$. Let $\pi$ be a faithful $\alpha$-covariant irreducible representation of $A$ on a Hilbert space $\mathcal{H}$ and let $U$ be a unitary flow (i.e., a strongly continuous one-parameter unitary group) on $\mathcal{H}$ such that $\mathrm{Ad}\,U_t\,\pi = \pi\alpha_t$ for $t \in \mathbb{R}$ and $(\pi, U)$ induces a faithful representation $\pi \times U$ of $A \times_\alpha \mathbb{R}$. (If $\ker \pi \times U$ is $\hat\alpha$-invariant, then it must be generated by an $\alpha$-invariant ideal of $A$; thus $\pi \times U$ in (2) is faithful if $R(\alpha) = \mathbb{R}$.) In particular the spectrum of $U$ is $\mathbb{R}$. We may suppose, by a small inner perturbation, that there is a unit vector $\xi_0 \in \mathcal{H}$ such that $U_t\xi_0 = \xi_0$. (By using a method in [12], we can actually assume that $U$ is diagonal such that the set of eigenvalues of finite multiplicity is an arbitrary dense subset of $\mathbb{R}$. Note that, by doing so, we retain the assumption that $(\pi, U)$ induces a faithful representation of $A \times_\alpha \mathbb{R}$.) We denote by $\omega$ the pure state of $A$ defined by $\omega(x) = \langle \pi(x)\xi_0, \xi_0 \rangle$, $x \in A$.
Let $T = T(A)$ denote the set $\{e \in A \mid 0 \le e \le 1,\ ea = a \text{ for some nonzero } a \in A\}$. There is a decreasing sequence $(e_N)$ in $T$ such that $e_N e_{N+1} = e_{N+1}$ for any $N$ and such that if $\varphi$ is a state, $\varphi = \omega$ iff $\varphi(e_N) = 1$ for any $N$.

Lemma 2.1 ($R(\alpha) = \mathbb{R}$). Let $E_U$ be the spectral measure (on $\mathbb{R}$) of the unitary flow $U$ on $\mathcal{H}$. Then it follows that $\|\pi(e_N)E_U(q - \varepsilon, q + \varepsilon)\| = 1$ for any $q \in \mathbb{R}$, $\varepsilon > 0$, and $N = 1, 2, \ldots$.

Proof. We denote by $\lambda$ the canonical embedding of $C^*(\mathbb{R})$ into the multiplier algebra $M(A \times_\alpha \mathbb{R})$. Since $(\|e_N\lambda(g)e_N\|)$ is a decreasing sequence, we can define
\[ \rho(g) = \lim_{N\to\infty} \|e_N\lambda(g)e_N\| \]
for $g \in C^*(\mathbb{R})$. Then we will assert that $\rho$ is a C$^*$-norm on $C^*(\mathbb{R})$ and hence that $\rho(g) = \|g\|$, $g \in C^*(\mathbb{R})$. If $f$ is an integrable continuous function on $\mathbb{R}$, we define, for $x \in A$,
\[ \alpha_f(x) = \int f(t)\alpha_t(x)\,dt. \]
Quasi-Product Flows on a C∗ -Algebra
If $f \ge 0$ with integral 1, we have that
\[ \lim_{M\to\infty} \|\alpha_f(e_N)e_M - e_M\| = 0, \qquad \lim_{N\to\infty} \|e_M\alpha_f(e_N) - \alpha_f(e_N)\| = 0. \]
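Hence we also have, for example, that
\[ \rho(g) = \lim_{N\to\infty} \|\alpha_f(e_N)\lambda(g)\alpha_f(e_N)\|, \quad g \in C^*(\mathbb{R}). \]
For any $g \in C^*(\mathbb{R})$ and $\varepsilon > 0$, there is a non-negative $f$ with integral 1 such that $\|[\lambda(g), \alpha_f(e_N)]\| < \varepsilon$. (Because, if $g \in L^1(\mathbb{R})$, then
\[ \|[\lambda(g), \alpha_f(e_N)]\| \le \int |g(t)|\,\|\alpha_t(\alpha_f(e_N)) - \alpha_f(e_N)\|\,dt \le \int |g(t)| \Big( \int |f(s - t) - f(s)|\,ds \Big) dt.) \]
Hence, if $g, h \in C^*(\mathbb{R})$, we have, for a suitable choice of $f$ as above, that
\begin{align*}
\rho(gh) &= \lim_N \|\alpha_f(e_N)^2\lambda(g)\lambda(h)\alpha_f(e_N)^2\| \\
&\le \lim_N \|\alpha_f(e_N)\lambda(g)\alpha_f(e_N)^2\lambda(h)\alpha_f(e_N)\| + \varepsilon(\|h\| + \|g\|) \\
&\le \rho(g)\rho(h) + \varepsilon(\|h\| + \|g\|).
\end{align*}
Namely it follows that $\rho(gh) \le \rho(g)\rho(h)$. In a similar way it also follows that $\rho(g^*g) = \rho(g)^2$ for $g \in C^*(\mathbb{R})$. Thus one can conclude that $\rho$ is a C$^*$-semi-norm on $C^*(\mathbb{R})$. Let $\pi \times U$ denote the representation of $A \times_\alpha \mathbb{R}$ induced by $(\pi, U)$. Since
\[ \langle (\pi \times U)(e_N\lambda(g)e_N)\xi_0, \xi_0 \rangle = \int g(t)\,dt, \]
and $\hat\alpha_p(e_N\lambda(g)e_N) = e_N\hat\alpha_p(\lambda(g))e_N$ for $p \in \mathbb{R}$, where $\hat\alpha$ is the dual flow on $A \times_\alpha \mathbb{R}$, we can conclude that $\rho(g) = \|g\|$ for $g \in C^*(\mathbb{R})$. Since $\pi \times U$ is faithful by the assumption and $(\pi \times U)(e_N\lambda(g)) = \pi(e_N)\int g(t)U_t\,dt$ for $g \in L^1(\mathbb{R})$, the assertion now follows.

Let $H$ denote the generator of $U$: $U_t = e^{itH}$. If $h \in A_{sa}$ we define
\[ U_t^{(h)} = e^{it(H + \pi(h))}. \]
Recall that the generator of $\alpha^{(h)}$ is $\delta_\alpha + \mathrm{ad}\,ih$ and note that $\mathrm{Ad}\,U_t^{(h)}\,\pi = \pi\alpha_t^{(h)}$. We denote by $\mathcal{U}(A)$ the unitary group of $A$ (or $A + \mathbb{C}1$ if $A \not\ni 1$) and by $A_{sa}$ the space of self-adjoint elements of $A$.

Lemma 2.2. Let $(A, \alpha)$ and $(\pi, U, \xi_0, H)$ be as above. Let $(p_1, p_2, \ldots, p_m)$ be a sequence in $\mathbb{R}$ and $(x_0, x_1, \ldots, x_m)$ a sequence in the unit ball of $A$ with $x_0 \in T$ such that
\begin{align*}
\pi(x_0)\xi_0 &= \xi_0, \\
U_t\pi(x_k)\xi_0 &= e^{ip_kt}\pi(x_k)\xi_0, \\
x_j^*x_k &= 0 \quad \text{if } j \neq k, \\
x_jx_k &= 0 \quad \text{if } k \neq 0, \\
x_j^*x_jx_0 &= x_0 \quad \text{if } j \neq 0
\end{align*}
for all $j, k = 0, 1, \ldots, m$, where $p_0 = 0$.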
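Let $(q_1, q_2, \ldots, q_n)$ be a sequence in $\mathbb{R}$ and $\varepsilon > 0$. Then there exist a sequence $(y_0, y_1, \ldots, y_n)$ in the unit ball of $A$ with $y_0 \in T$, a $u \in \mathcal{U}(A)$, and an $h \in A_{sa}$ such that $\|u - 1\| < \varepsilon$, $\|h\| < \varepsilon$,
\begin{align*}
x_0y_\ell = y_\ell x_0 &= y_\ell, \\
\pi(y_0)\xi_0 &= \xi_0, \\
y_j^*y_\ell &= 0 \quad \text{if } j \neq \ell, \\
y_jy_\ell &= 0 \quad \text{if } \ell \neq 0, \\
y_j^*y_jy_0 &= y_0 \quad \text{if } j \neq 0,
\end{align*}
for all $j, \ell = 0, 1, \ldots, n$ and
\begin{align*}
U_t^{(h)}\pi(ux_ky_\ell)\xi_0 &= e^{i(p_k + q_\ell)t}\pi(ux_ky_\ell)\xi_0, \qquad \pi(u)\pi(x_k)\xi_0 = \pi(x_k)\xi_0, \\
\|(\alpha_t^{(h)}\mathrm{Ad}\,u(x_ky_\ell) - e^{i(p_k + q_\ell)t}\mathrm{Ad}\,u(x_ky_\ell))y_0\| &< \varepsilon, \quad t \in [-1, 1], \\
\|\alpha_t^{(h)}\mathrm{Ad}\,u(y_0) - \mathrm{Ad}\,u(y_0)\| &< \varepsilon, \quad t \in [-1, 1]
\end{align*}
for $k = 0, 1, \ldots, m$ and $\ell = 0, 1, \ldots, n$ with $q_0 = 0$.

Proof. We mostly follow the arguments in the proof of 6.7.1 of [16]. Recall that we have chosen a decreasing sequence $(e_N)$ in $T = T(A)$ for the pure state $\omega : x \mapsto \langle \pi(x)\xi_0, \xi_0 \rangle$. We may suppose that $e_1 = x_0$. Since $\pi(\alpha_t(x_k) - e^{ip_kt}x_k)\xi_0 = 0$, it follows that
\[ \lim_N \|(\alpha_t(x_k) - e^{ip_kt}x_k)e_N\| = 0 \]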
for each $t \in \mathbb{R}$. Let $\varepsilon' > 0$, which will be chosen later much smaller than the $\varepsilon$ in the statement. There exists an $N > 1$ such that
\[ \|(\alpha_t(x_k) - e^{ip_kt}x_k)e_N\| < \varepsilon', \quad |t| \le 1,\ k = 0, 1, \ldots, m. \]
Let $P$ be the spectral projection of $\pi(e_N)$ corresponding to the eigenvalue 1. Since $\|\pi(e_{N+1})E_U(q_\ell - \varepsilon', q_\ell + \varepsilon')\pi(e_{N+1})\| = 1$ (by the previous lemma) and $\pi(e_{N+1})P = \pi(e_{N+1})$, we have that $\|PE_U(q_\ell - \varepsilon', q_\ell + \varepsilon')P\| = 1$. Hence we find a unit vector $\eta_\ell \in P\mathcal{H}$ such that
\[ 1 - \langle E_U(q_\ell - \varepsilon', q_\ell + \varepsilon')\eta_\ell, \eta_\ell \rangle < \varepsilon'^2, \]
which implies that
\[ \|E_U(q_\ell - \varepsilon', q_\ell + \varepsilon')\eta_\ell - \eta_\ell\| < \varepsilon'. \]
We may furthermore suppose that the vectors $\eta_\ell$ are mutually orthogonal. By Kadison's transitivity [8] there exists a $y_\ell \in A$ such that $\|y_\ell\| = 1$ and $\pi(y_\ell)\xi_0 = \eta_\ell$ for $\ell = 1, 2, \ldots, n$. There also exists a $b \in A_+$ such that $\pi(b)\eta_\ell = (\ell + 1)\eta_\ell$, $\ell = 0, 1, \ldots, n$, where $\eta_0 = \xi_0$. We may replace $b$ by $e_Nbe_N$ and hence we may suppose that $x_0b = b$. Let $(f_0, f_1, \ldots, f_n)$ be a sequence of non-negative functions in $C_0(0, \infty)$ with norm 1 such that $f_\ell(\ell + 1) = 1$ and $\mathrm{supp}(f_\ell) \subset (\ell + 1/2, \ell + 3/2)$ for $\ell = 0, 1, \ldots, n$. Then, since
\[ \pi(f_\ell(b)y_\ell f_0(b))\eta_0 = \eta_\ell, \]
we may replace $y_\ell$ by $f_\ell(b)y_\ell f_0(b)$ for $\ell = 1, \ldots, n$. Then it follows that $x_0y_\ell = y_\ell x_0 = y_\ell$, $y_j^*y_\ell = 0$ for $j \neq \ell$, and $y_jy_\ell = 0$ for $j, \ell = 1, 2, \ldots, n$, besides the original conditions $\pi(y_\ell)\xi_0 = \eta_\ell$ and $\|y_\ell\| = 1$.

Let $f$ be a non-negative function in $C_0(0, \infty)$ such that $f(t) = t^{-1/2}$ around $t = 1$ and $tf(t)^2 \le 1$ for all $t > 0$. Since $\pi(y_1^*y_1)\xi_0 = \xi_0$ and $f(1) = 1$, we have that $\pi(f(y_1^*y_1))\xi_0 = \xi_0$ and $\pi(y_1f(y_1^*y_1))\xi_0 = \eta_1$. Thus, replacing $y_1$ by $y_1f(y_1^*y_1)$, it follows that $y_1^*y_1 \in T$. Let $z_1 \in T$ be such that $y_1^*y_1z_1 = z_1$ and $\pi(z_1)\xi_0 = \xi_0$ (and $z_1$ may be obtained as a function of $y_1^*y_1$). Then we may replace $y_2$ by $y_2z_1f(z_1y_2^*y_2z_1)$; it follows that $y_2y_1^*y_1 = y_2$ and $y_2^*y_2 \in T$. Let $z_2 \in T$ be such that $y_2^*y_2z_2 = z_2$ and $\pi(z_2)\xi_0 = \xi_0$. Repeating this procedure we modify $y_3, \ldots, y_n$ and obtain $z_n \in T$ to set $y_0 = z_n$. Then it follows that $y_0y_\ell^*y_\ell = y_0$ and $y_0y_\ell = 0$ for $\ell = 1, \ldots, n$. Thus $(y_0, \ldots, y_n)$ satisfies all the conditions not involving $h$ and $u$, which we now have to choose.

Since
\begin{align*}
\|U_t\pi(x_ky_\ell)\xi_0 - e^{i(p_k + q_\ell)t}\pi(x_ky_\ell)\xi_0\|
&\le \|\pi\alpha_t(x_k)U_t\eta_\ell - e^{iq_\ell t}\pi\alpha_t(x_k)\eta_\ell\| + \|\pi\alpha_t(x_k)\eta_\ell - e^{ip_kt}\pi(x_k)\eta_\ell\| \\
&\le \|U_t\eta_\ell - e^{iq_\ell t}\eta_\ell\| + \|(\alpha_t(x_k) - e^{ip_kt}x_k)e_N\|,
\end{align*}
we have that
\[ \|U_t\pi(x_ky_\ell)\xi_0 - e^{i(p_k + q_\ell)t}\pi(x_ky_\ell)\xi_0\| < 2\varepsilon', \quad t \in [-1, 1]. \]
Note also that the $(m + 1)(n + 1)$ vectors $\pi(x_ky_\ell)\xi_0$ are mutually orthogonal unit vectors and $\pi(x_ky_0)\xi_0 = \xi_k$. By Lemma 2.4 below we obtain $u \in \mathcal{U}(A)$ and $h \in A_{sa}$ as required except for the last two conditions. To meet the last two conditions we replace $y_0$ by a smaller $z \in T$ with $\pi(z)\xi_0 = \xi_0$, by the argument used in the beginning of this proof and by Lemma 2.5 below.

Lemma 2.3. For any $\varepsilon > 0$ there exists a $\delta > 0$ such that if $\|U_t\xi - e^{iqt}\xi\| < \delta$, $|t| \le 1$, for a unit vector $\xi \in \mathcal{H}$ and $q \in \mathbb{R}$, then $\|E_U(q - \varepsilon, q + \varepsilon)\xi\| > 1 - \varepsilon$.

Proof. Let $\xi \in \mathcal{H}$ be a unit vector and $q \in \mathbb{R}$, and define a probability measure $\mu = \mu_{\xi,q}$ on $\mathbb{R}$ by $\mu(S) = \langle E_U(S + q)\xi, \xi \rangle$.
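The norm identity used in the next step of this proof is an instance of the spectral theorem for $U$. A short derivation, written as a sketch with the same symbols ($\lambda$ is introduced here only as the unshifted spectral variable), might read:

```latex
% \mu(S) = <E_U(S+q)\xi,\xi>, so p is the spectral variable shifted by q.
\begin{align*}
\|U_t\xi - e^{iqt}\xi\|^2
  &= \int_{\mathbb{R}} |e^{i\lambda t} - e^{iqt}|^2 \, d\langle E_U(\lambda)\xi,\xi\rangle \\
  &= \int_{\mathbb{R}} |e^{ipt} - 1|^2 \, d\mu(p) \qquad (p = \lambda - q) \\
  &= 2\int_{\mathbb{R}} (1 - \cos pt)\, d\mu(p),
\end{align*}
% using |e^{ipt} - 1|^2 = 2 - 2\cos(pt).
```

Since $\mu$ is a probability measure, a bound $\|U_t\xi - e^{iqt}\xi\| < \delta$ is then equivalent to $1 - \int \cos(pt)\,d\mu(p) < \delta^2/2$.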
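Since
\[ \|U_t\xi - e^{iqt}\xi\|^2 = 2\int (1 - \cos pt)\,d\mu(p) \]
for $\mu = \mu_{\xi,q}$, if $\|U_t\xi - e^{iqt}\xi\| < \delta$, $|t| \le 1$, we have that
\[ 1 - \int \cos(pt)\,d\mu(p) < \delta^2/2, \quad |t| \le 1. \]
If $g$ is a non-negative integrable function on $[-1, 1]$ with $\int_{-1}^{1} g(t)\,dt = 1$, then $p \mapsto \int_{-1}^{1} g(t)\cos(pt)\,dt$ belongs to $C_0(\mathbb{R})$ and satisfies
\[ 1 - \int \Big( \int_{-1}^{1} g(t)\cos(pt)\,dt \Big) d\mu(p) < \delta^2/2. \]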
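Suppose that the assertion is not correct, i.e., suppose that there were an $\varepsilon > 0$, a sequence $(\xi_n)$ of unit vectors in $\mathcal{H}$, and a sequence $(q_n)$ in $\mathbb{R}$ such that
\[ \lim_n \sup_{|t| \le 1} \|U_t\xi_n - e^{iq_nt}\xi_n\| = 0, \qquad \|E_U(q_n - \varepsilon, q_n + \varepsilon)\xi_n\| \le 1 - \varepsilon. \]
Then, by taking a weak limit point of $(\mu_{\xi_n,q_n})$ (in the dual of $C_0(\mathbb{R})$), we obtain a measure $\mu$ on $\mathbb{R}$ such that
\[ \mu(-\varepsilon, \varepsilon) \le (1 - \varepsilon)^2, \qquad \mu(\mathbb{R}) \le 1, \qquad \int \Big( \int_{-1}^{1} g(t)\cos(pt)\,dt \Big) d\mu(p) = 1 \]
for any non-negative integrable function $g$ on $[-1, 1]$ with $\int g(t)\,dt = 1$. Since, by taking the characteristic function of $[0, 1]$ for $g$, we have that
\[ \int_0^1 \cos(pt)\,dt = \frac{\sin(p)}{p} < 1 \]
for $p \neq 0$, the last two conditions imply that $\mu$ is the Dirac measure at $p = 0$, which contradicts the first condition. Thus we have reached the assertion.

Lemma 2.4. Let $(p_1, \ldots, p_m)$ and $(q_1, \ldots, q_n)$ be sequences in $\mathbb{R}$ and $\varepsilon > 0$. Then there exists a $\delta > 0$ satisfying: If $(\xi_1, \ldots, \xi_m, \eta_1, \ldots, \eta_n)$ is an orthogonal family of unit vectors in $\mathcal{H}$ such that
\[ U_t\xi_j = e^{ip_jt}\xi_j, \qquad \|U_t\eta_j - e^{iq_jt}\eta_j\| < \delta, \]
for $t \in [-1, 1]$ and all $j$, there exist a $u \in \mathcal{U}(A)$ and an $h \in A_{sa}$ such that $\|u - 1\| < \varepsilon$, $\|h\| < \varepsilon$, $\pi(u)\xi_j = \xi_j$,
\[ U_t^{(h)}\xi_j = e^{ip_jt}\xi_j, \qquad U_t^{(h)}\pi(u)\eta_j = e^{iq_jt}\pi(u)\eta_j. \]

Proof. We choose an $\varepsilon_1 \in (0, \varepsilon)$ so that any two of the intervals $(q_j - \varepsilon_1, q_j + \varepsilon_1)$ are identical or mutually disjoint and the intersection of $(q_j - \varepsilon_1, q_j + \varepsilon_1)$ with $\{p_1, \ldots, p_m\}$ is empty or $\{q_j\}$. Let $\varepsilon_2 > 0$. By the previous lemma it follows, from the assumption that $\|U_t\eta_j - e^{iq_jt}\eta_j\| < \delta$ for a sufficiently small $\delta > 0$, that
\[ \|E_U(q_j - \varepsilon_1, q_j + \varepsilon_1)\eta_j\|^2 > 1 - \varepsilon_2^2. \]
We then choose a $\zeta_j \in E_U(q_j - \varepsilon_1, q_j + \varepsilon_1)\mathcal{H}$ for $j = 1, \ldots, n$ such that $\|\zeta_j\| = 1$ and $\|\zeta_j - E_U(q_j - \varepsilon_1, q_j + \varepsilon_1)\eta_j\| < \varepsilon_2$, which implies that
\[ \|\eta_j - \zeta_j\| \le \|E_U(q_j - \varepsilon_1, q_j + \varepsilon_1)\eta_j - \zeta_j\| + \|(1 - E_U(q_j - \varepsilon_1, q_j + \varepsilon_1))\eta_j\| < 2\varepsilon_2. \]
Thus the vectors $\zeta_j$ are almost mutually orthogonal and almost orthogonal to the $\xi_i$.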
We will make the $\zeta_j$ exactly mutually orthogonal and orthogonal to the $\xi_i$. The latter condition can be met by choosing $\zeta_j$ from $E_U(q_j - \varepsilon_1, q_j + \varepsilon_1)V$, where the subspace $V$ is the orthogonal complement of the linear space spanned by the $\xi_i$. Since $E_U(q_j - \varepsilon_1, q_j + \varepsilon_1)\eta_j \in V$, this is indeed possible. If all the intervals $(q_j - \varepsilon_1, q_j + \varepsilon_1)$ are mutually disjoint, the first condition follows automatically. Otherwise we can choose the $\zeta_j$ step by step (via the Gram-Schmidt process) for those $j$ with the same value of $q_j$. Since the vectors $E_U(q_j - \varepsilon_1, q_j + \varepsilon_1)\eta_j$ are almost mutually orthogonal, we can still keep the condition that $\zeta_j \approx E_U(q_j - \varepsilon_1, q_j + \varepsilon_1)\eta_j$.

Thus, by choosing $\varepsilon_2 > 0$ sufficiently small (and hence $\delta > 0$) in the above arguments, we have unit vectors $\zeta_j \in E_U(q_j - \varepsilon_1, q_j + \varepsilon_1)\mathcal{H}$ such that the $\xi_i$ and the $\zeta_j$ are mutually orthogonal and $\|\zeta_j - \eta_j\| < \varepsilon'$ for a prescribed $\varepsilon' > 0$. Here we have chosen $\varepsilon'$ so that there is a unitary $W$ on the subspace spanned by the $\zeta_j$ and the $\eta_j$ such that
\[ \|W - 1\| < \varepsilon, \qquad W\eta_j = \zeta_j. \]
We regard $W$ as a unitary on $\mathcal{H}$ by defining it to be 1 on the orthogonal complement. Note that $W\xi_i = \xi_i$ since the $\xi_i$ are orthogonal to the $\eta_j$ and the $\zeta_j$. Then, by Kadison's transitivity [8], there is a $u \in \mathcal{U}(A)$ such that
\[ \|u - 1\| < \varepsilon, \qquad \pi(u)\xi_i = W\xi_i = \xi_i, \qquad \pi(u)\eta_j = W\eta_j = \zeta_j. \]
Recall that $H$ is the generator of $U$ and define
\[ K = -\sum_{j=1}^{n} (H - q_j)E_U(q_j - \varepsilon_1, q_j + \varepsilon_1). \]
Since $\|K\| \le \varepsilon_1 < \varepsilon$, there is an $h \in A_{sa}$ such that $\|h\| < \varepsilon$ and
\[ \pi(h)\xi_i = K\xi_i = 0, \qquad \pi(h)\zeta_j = K\zeta_j. \]
Then it follows that
\[ (H + \pi(h))\xi_i = p_i\xi_i, \qquad (H + \pi(h))\zeta_j = q_j\zeta_j. \]
This concludes the proof.
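The final step of this proof can be checked term by term. The sketch below uses the shorthand $E_j$ for the spectral projection of $U$ on the interval of half-width $\varepsilon_1$ around $q_j$ (a label introduced here, not in the text), and it assumes that the sum defining $K$ counts each distinct interval once, which matters only when several $q_j$ coincide:

```latex
% E_j := E_U(q_j - \varepsilon_1, q_j + \varepsilon_1); distinct intervals are disjoint,
% so the corresponding E_j are mutually orthogonal projections commuting with H.
\begin{align*}
\|K\| &= \max_j \|(H - q_j)E_j\| \le \varepsilon_1
  && \text{since } |\lambda - q_j| < \varepsilon_1 \text{ on the } j\text{th interval}, \\
K\zeta_j &= -(H - q_j)E_j\zeta_j = -(H - q_j)\zeta_j
  && \text{since } \zeta_j \in E_j\mathcal{H}, \\
(H + K)\zeta_j &= q_j\zeta_j, \\
K\xi_i &= -\sum_j (p_i - q_j)E_j\xi_i = 0
  && \text{since } E_j\xi_i \neq 0 \text{ forces } p_i = q_j .
\end{align*}
```

The last line uses the choice of $\varepsilon_1$: an eigenvalue $p_i$ lying in the $j$th interval must equal $q_j$. With $\pi(h)$ agreeing with $K$ on the vectors $\xi_i$ and $\zeta_j$, these give the two eigenvalue equations stated at the end of the proof.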
Lemma 2.5. Let $x \in T(A)$ be such that $\pi(x)\xi_0 = \xi_0$. Then for any $\varepsilon > 0$ there exists a $b \in T(A)$ such that $\pi(b)\xi_0 = \xi_0$, $xb = b$, and $\|\alpha_t(b) - b\| < \varepsilon$ for $t \in [-1, 1]$.
Proof. Let $(e_N)$ be a decreasing sequence defined for $\xi_0$ as before. If $f \in C_0(\mathbb{R})$ satisfies $f \ge 0$, $\int f(t)\,dt = 1$, and $\int |f(t - s) - f(t)|\,dt < \varepsilon$ for $s \in [-1, 1]$, then we have, for $b_N = \alpha_f(e_N)$, that
\[ \pi(b_N)\xi_0 = \xi_0, \qquad \|\alpha_s(b_N) - b_N\| < \varepsilon, \quad s \in [-1, 1]. \]
Let $c \in T$ be such that $cx = c$ and $\pi(c)\xi_0 = \xi_0$. Then, since $\|c\alpha_t(e_N) - \alpha_t(e_N)\| \to 0$, we have that $\|cb_N - b_N\| \to 0$. Define a $g_N \in C_0(0, \infty)$ by $g_N(t) = (1 + 1/N)t$ for $t < N/(N + 1)$ and $g_N(t) = 1$ for $t \ge N/(N + 1)$. Then, since $g_N(cb_Nc) \in T$, $\|g_N(cb_Nc) - cb_Nc\| \le 1/N$, and $\|cb_Nc - b_N\| \to 0$, we can take $g_N(cb_Nc)$ for $b$ for a sufficiently large $N$.

Now we come to the proof of (2)⇒(3) of Theorem 1.1 in the case $R(\alpha) = \mathbb{R}$. We are given a UHF flow $\gamma$ on a UHF C$^*$-algebra $D$. Up to conjugacy we may suppose that $\gamma$ is given as follows:
\[ \gamma_t = \bigotimes_{n=1}^{\infty} \mathrm{Ad}\,\mathrm{diag}(1, e^{itp_{n1}}, \ldots, e^{itp_{n,k_n-1}}) \]
on $D = \bigotimes_{n=1}^{\infty} M_{k_n}$, where $\mathrm{diag}(\mu_1, \mu_2, \ldots, \mu_k)$ is the diagonal matrix with $\mu_i$ as the $(i, i)$ component and $M_{k_n}$ is the $k_n \times k_n$ matrix algebra with $k_n \ge 2$.

In the covariant irreducible representation $(\pi, U, \mathcal{H})$ we have fixed a unit vector $\xi_0 \in \mathcal{H}$ such that $U_t\xi_0 = \xi_0$. We find an $e \in T = T(A)$ such that $\pi(e)\xi_0 = \xi_0$. Starting with $m = 0$ and $e$ as $x_0$, we apply Lemma 2.2 repeatedly, as we now describe. Let $(\mu_n)$ be a strictly decreasing sequence of positive numbers such that $nk_1k_2 \cdots k_n\mu_n < 1$ and let $\varepsilon_n = \mu_n - \mu_{n+1}$. Suppose that we have $x_{n0}, x_{n1}, \ldots, x_{n,k_n-1} \in A_1$ (= the unit ball of $A$), $u_n \in \mathcal{U}(A)$, and $h_n \in A_{sa}$ for $n \le m$ such that $x_{n0} \in T$, $\|u_n - 1\| < \varepsilon_n$, $\|h_n\| < \varepsilon_n$, and
\begin{align*}
\pi(u_n)\xi_0 &= \xi_0, \\
\pi(x_{n0})\xi_0 &= \xi_0, \\
x_{nj}^*x_{nk} &= 0, \quad j \neq k, \\
x_{nj}x_{nk} &= 0, \quad k \neq 0, \\
x_{nj}^*x_{nj}x_{n0} &= x_{n0}, \quad j \neq 0, \\
x_{n-1,0}x_{nk} &= x_{nk}x_{n-1,0} = x_{nk},
\end{align*}
etc. To describe the other conditions define
\begin{align*}
\bar h_m &= h_1 + h_2 + \cdots + h_m, \\
v_{ni}^{(m)} &= \mathrm{Ad}(u_mu_{m-1} \cdots u_n)x_{ni}, \\
X_m &= \{i = (i_1, i_2, \ldots, i_m) \mid 0 \le i_n < k_n\}, \\
v_i^{(m)} &= v_{1i_1}^{(m)}v_{2i_2}^{(m)} \cdots v_{mi_m}^{(m)} \quad \text{for } i = (i_1, i_2, \ldots, i_m) \in X_m.
\end{align*}
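Then it should follow that for $i \in X_m$,
\begin{align*}
U_t^{(\bar h_m)}\pi(v_i^{(m)})\xi_0 &= e^{ip_i^{(m)}t}\pi(v_i^{(m)})\xi_0, \\
\|(\alpha_t^{(\bar h_m)}(v_i^{(m)}) - e^{ip_i^{(m)}t}v_i^{(m)})v_0^{(m)}\| &< \varepsilon_m, \quad t \in [-1, 1], \\
\|\alpha_t^{(\bar h_m)}(v_0^{(m)}) - v_0^{(m)}\| &< \varepsilon_m, \quad t \in [-1, 1],
\end{align*}
where $p_i^{(m)} = p_{1i_1} + p_{2i_2} + \cdots + p_{mi_m}$ with $p_{n0} = 0$ and $v_0^{(m)} = v_{0_m}^{(m)} = \mathrm{Ad}\,u_m(x_{m0})$ with $0_m = (0, 0, \ldots, 0) \in X_m$.

By applying Lemma 2.2 to $v_i^{(m)}$, $i \in X_m$, with $v_0^{(m)}$ for $x_0$, $(p_{m+1,1}, \ldots, p_{m+1,k_{m+1}-1})$ for a sequence in $\mathbb{R}$ and $\varepsilon_{m+1} > 0$, we obtain $x_{m+1,j} \in A_1$ for $0 \le j < k_{m+1}$, $u_{m+1} \in \mathcal{U}(A)$, and $h_{m+1} \in A_{sa}$ satisfying the conditions as above.

We define
\begin{align*}
h &= \lim_m \bar h_m = \sum_{n=1}^{\infty} h_n, \\
v_{ni} &= \lim_m v_{ni}^{(m)} = \lim_m u_{m,n}x_{ni}u_{m,n}^* = u_{\infty,n}x_{ni}u_{\infty,n}^*, \\
v_i &= v_{1i_1}v_{2i_2} \cdots v_{mi_m}, \quad i \in X_m,
\end{align*}
where $u_{m,n} = u_mu_{m-1} \cdots u_n$ and $u_{\infty,n} = \lim_m u_{m,n}$. Then $\|h\| < \mu_1$, where $\mu_1$ could be assumed to be arbitrarily small. Since $\sum_{n>m} \varepsilon_n = \mu_{m+1}$, we have that $\|u_{\infty,n} - u_{m,n}\| < \mu_{m+1}$ and $\|v_i - v_i^{(m)}\| < 2\mu_{m+1}$ for $i \in X_m$. If $i \in X_m$, $0_m = (0, 0, \ldots, 0) \in X_m$, and $t \in [-1, 1]$, then
\begin{align*}
\|(\alpha_t^{(h)}(v_i) - e^{ip_i^{(m)}t}v_i)v_{0_m}\|
&< \|(\alpha_t^{(h)}(v_i^{(m)}) - e^{ip_i^{(m)}t}v_i^{(m)})v_0^{(m)}\| + 6\mu_{m+1} \\
&< \|(\alpha_t^{(\bar h_m)}(v_i^{(m)}) - e^{ip_i^{(m)}t}v_i^{(m)})v_0^{(m)}\| + 8\mu_{m+1} \\
&< 8\mu_m.
\end{align*}
In a similar way we have that
\[ \|\alpha_t^{(h)}(v_{0_m}) - v_{0_m}\| < 6\mu_m \]
for $t \in [-1, 1]$. Let $p_m$ be the spectral projection of $v_{0_m} = u_{\infty,m}x_{m0}u_{\infty,m}^*$ corresponding to 1. Define
\[ q_m = \sum_{i \in X_m} v_ip_mv_i^* \]
and for $1 \le k \le m$,
\[ q_{mk} = \sum_{i \in X_{k,m}} v_ip_mv_i^*, \]
where $X_{k,m} = \{i \in X_m \mid i_1 = 0, i_2 = 0, \ldots, i_k = 0\}$. Then $(v_ip_mv_j^*)_{i,j \in X_m}$ form a family of matrix units for $M_{k_1} \otimes M_{k_2} \otimes \cdots \otimes M_{k_m}$ and the elements $q_m$, $q_{mk}$ with $1 \le k \le m$ are projections in $A^{**}$ satisfying $v_{ki}q_m = v_{ki}q_{mk} = q_mv_{ki}$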
and $v_{ki}^*v_{ki}q_m = v_{k0}q_m = q_{mk}$. (See 6.7 of [16] for details.) Define
\[ r_m = \sum_{i \in X_m} v_iv_{0_m}^2v_i^* \in A. \]
Then it follows that $q_m \le r_m \le q_{m-1}$. Thus $(q_m)$ is a decreasing sequence of closed projections and hence the limit $q = \lim_m q_m$ is a closed projection in $A^{**}$. If $t \in [-1, 1]$ and $i \in X_m$, then we have that
\begin{align*}
\|\alpha_t^{(h)}(v_iv_{0_m}^2v_i^*) - v_iv_{0_m}^2v_i^*\|
&\le \|\alpha_t^{(h)}(v_i)v_{0_m}^2\alpha_t^{(h)}(v_i^*) - v_iv_{0_m}^2v_i^*\| + 12\mu_m \\
&\le 2\|(\alpha_t^{(h)}(v_i) - e^{ip_i^{(m)}t}v_i)v_{0_m}\| + 12\mu_m \\
&\le 28\mu_m,
\end{align*}
which implies that
\[ \|\alpha_t^{(h)}(r_m) - r_m\| < 28k_1k_2 \cdots k_m\mu_m < 28/m. \]
Since $q$ is the weak$^*$ limit of $(r_m)$, it follows that $\alpha_t^{(h)}(q) = q$ for all $t \in \mathbb{R}$. If $k \le m$, then
\[ \|(\alpha_t^{(h)}(v_{ki}) - e^{ip_{ki}t}v_{ki})q_m\| \le \sum_{j \in X_{k,m}} \|(\alpha_t^{(h)}(v_{ki}) - e^{ip_{ki}t}v_{ki})v_jp_m\|, \]
where each term is estimated to be smaller than $16\mu_m$. Hence we have that
\[ \|(\alpha_t^{(h)}(v_{ki}) - e^{ip_{ki}t}v_{ki})q_m\| < 16k_1k_2 \cdots k_m\mu_m < 16/m, \]
and hence that $\alpha_t^{(h)}(v_{ki})q = e^{ip_{ki}t}v_{ki}q$. Note that $q\alpha_t^{(h)}(v_{ki}) = \alpha_t^{(h)}(qv_{ki}) = \alpha_t^{(h)}(v_{ki}q) = \alpha_t^{(h)}(v_{ki})q = e^{ip_{ki}t}v_{ki}q$.

Let $B$ be the C$^*$-subalgebra of $A$ generated by $\alpha_t^{(h)}(v_{ni})$ with all possible $n, i, t$. Then $B$ is left invariant under $\alpha^{(h)}$ and $q$ ($\in B^{**} \subset A^{**}$) is an $\alpha^{(h)}$-invariant closed projection in the commutant of $B$. We know that $(Bq, (\alpha^{(h)})^{**}|Bq)$ is isomorphic to $(D, \gamma)$ by the construction. Since $\pi^{**}(q)\xi_0 = \xi_0$ and $\pi$ is a faithful representation of $A$, we have that if $x \in A$, then $x = 0$ iff $xc(q) = 0$, where $c(q)$ is the central support of $q$ in $A^{**}$. Thus $(B, q, \alpha^{(h)})$ satisfies all the conditions except for $qAq = Bq$. But this can be handled in exactly the same way as in 6.7.2 of [16]; at this point we make essential use of the separability of $A$. This concludes the proof in the case $R(\alpha) = \mathbb{R}$.

Suppose that $R(\alpha) \cong \mathbb{Z}$. In this case we take a faithful $\alpha$-covariant irreducible representation $\pi$ on a Hilbert space $\mathcal{H}$ and a unitary flow $U$ on $\mathcal{H}$ such that $\mathrm{Ad}\,U_t\,\pi = \pi\alpha_t$ and the kernel of the representation $\pi \times U$ of $A \times_\alpha \mathbb{R}$ is left invariant under $\hat\alpha_p$, $p \in R(\alpha)$. We assume that there is a unit vector $\xi_0 \in \mathcal{H}$ such that $U_t\xi_0 = \xi_0$ (by a small inner perturbation and a phase change) and let $(e_N)$ be a decreasing sequence as before. We have to replace Lemma 2.1 by the following; then the proof will be exactly the same as in the case $R(\alpha) = \mathbb{R}$.
Lemma 2.6 ($R(\alpha) \neq (0)$). Let $E_U$ be the spectral measure of the unitary flow $U$ on $\mathcal{H}$. Then it follows that $\|\pi(e_N)E_U(q - \varepsilon, q + \varepsilon)\| = 1$ for any $q \in R(\alpha)$, $\varepsilon > 0$, and $N = 1, 2, \ldots$. Moreover the spectral projection of $\pi(e_N)E_U(q - \varepsilon, q + \varepsilon)\pi(e_N)$ corresponding to $[1 - \delta, 1]$ is infinite-dimensional for any $\delta > 0$.

Proof. The proof of Lemma 2.1 shows that $\|\pi(e_N)E_U(-\varepsilon, \varepsilon)\| = 1$. Then we use the invariance of the kernel of $\pi \times U$ under $\hat\alpha|R(\alpha)$. If the second part is not true, this means that the range of $\pi \times U$ contains non-zero compact operators. Hence $\hat\alpha_p$ must extend to an automorphism of $(\pi \times U)(A \times_\alpha \mathbb{R})'' = B(\mathcal{H})$ for $p \in R(\alpha)$; but then, since $\pi(A)'' = B(\mathcal{H})$, this extension must be the identity, which is a contradiction. Thus the second assertion also follows.

If $R(\alpha) = (0)$, then the second part of the above lemma may not follow. For example, let $\gamma$ be a UHF flow on the CAR algebra $\bigotimes^{\infty} M_2$ of the form
\[ \bigotimes_{n=1}^{\infty} \mathrm{Ad}\,\mathrm{diag}(1, e^{ip_nt}), \]
where $(p_n)$ is a sequence satisfying $p_1 > 1$ and $p_n > \sum_{k=1}^{n-1} p_k + 1$. Then, in the GNS representation associated with the product state $\bigotimes^{\infty} \varphi$, where $\varphi(x) = x_{11}$ for $x \in M_2$, $U$ is diagonal and the multiplicity of each eigenvalue is 1. That is, in the notation as above, $E_U(-\varepsilon, \varepsilon)$ is one-dimensional if $0 < \varepsilon < 1$. Hence the assertion of Lemma 2.2 does not follow in the case $R(\alpha) = (0)$; we will not be allowed to take an arbitrary sequence $(q_1, q_2, \ldots, q_n)$ in $\mathbb{R}$ but only some.

Lemma 2.7 ($R(\alpha) = (0)$; $A$ is antiliminary). Let $E_U$ be the spectral measure of the unitary flow $U$ on $\mathcal{H}$. Then for any $\varepsilon > 0$ and $N = 1, 2, \ldots$ there is an infinite number of $q \in \mathbb{R}$ such that $\|\pi(e_N)E_U(q - \varepsilon, q + \varepsilon)\| > 1 - \varepsilon$.

Proof. Let $\delta > 0$. There exists a non-zero $\alpha$-invariant hereditary C$^*$-subalgebra $B$ of $A$ such that $\mathrm{Spec}(\alpha|B) \cap [-1, 1] \subset (-\delta, \delta)$. By finding $x \in A$ such that $\pi(x)\xi_0$ is a unit vector in $[\pi(B)\mathcal{H}]$ and $\mathrm{Spec}_\alpha(x)$ is contained in $(p, p + \delta)$ for some $p \in \mathbb{R}$, let $C$ be the hereditary C$^*$-subalgebra generated by $\alpha_t(x)^*B\alpha_s(x)$; then $\mathrm{Spec}(\alpha|C) \cap [-1 + 2\delta, 1 - 2\delta] \subset (-3\delta, 3\delta)$ and $[\pi(C)\mathcal{H}] \ni \xi_0$. If $\delta < 1/8$, then the spectral subspace $C^\alpha(-3\delta, 3\delta)$ forms an $\alpha$-invariant C$^*$-algebra. Hence we can form a decreasing sequence $(e_N)$ for $\xi_0$ in $C^\alpha(-3\delta, 3\delta)$. Thus we assume, for some choice of $(e_N)$, that $\mathrm{Spec}_\alpha(e_N) \subset (-\delta_N, \delta_N)$ with $\delta_N \to 0$. Then the spectral projection $p_N$ of $\pi(e_N)$ corresponding to the eigenvalue 1 satisfies $\|[p_N, H]\| \to 0$, where $H$ is the generator of $U$, and is infinite-dimensional. Then the conclusion follows with the equality $\|\pi(e_N)E_U(q - \varepsilon, q + \varepsilon)\| = 1$ for this choice of $(e_N)$. Thus, in general, by replacing the equality by the weaker inequality as in the statement, we can reach the assertion.
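Returning to the CAR-algebra example that precedes Lemma 2.7, the multiplicity claim can be checked by a short computation. The sketch assumes that the eigenvalues of $U$ in that GNS representation are exactly the finite sums $s(F)$ defined below (this follows from the product form of the state, but is not spelled out in the text); $F$, $F'$ and $s$ are labels introduced here for illustration:

```latex
% s(F) := \sum_{n \in F} p_n for finite F \subset \{1, 2, \dots\}, with s(\emptyset) = 0.
% For F \neq F', let n be the largest element of the symmetric difference and
% assume (by symmetry) that n \in F. All p_k are positive, so
\begin{align*}
s(F) - s(F') &= \sum_{k \in F \setminus F'} p_k - \sum_{k \in F' \setminus F} p_k \\
             &\ge p_n - \sum_{k=1}^{n-1} p_k \\
             &> 1 .
\end{align*}
% Distinct eigenvalues are therefore more than 1 apart; taking F' = \emptyset
% shows that 0 is the only eigenvalue in (-\varepsilon, \varepsilon) for 0 < \varepsilon < 1.
```

Under this assumption every eigenvalue is simple, and the only eigenvector with eigenvalue in $(-\varepsilon, \varepsilon)$ is the cyclic vector, which is the one-dimensionality used in the example.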
3. Covariant Representations We shall give a theorem which implies that (1)⇒(2) of Theorems 1.1 and 1.2 (cf. 1.2 of [13]). Theorem 3.1. Let A be a separable prime C∗ -algebra, G a locally compact abelian group with countable basis with its dual @, and α a continuous action of G on A. Then the following conditions are equivalent: 1. There is a faithful family of α-covariant irreducible representations. 2. There is a faithful α-covariant irreducible representation of A which induces a representation of A ×α G (on the same Hilbert space) whose kernel is left invariant under α|@(α). ˆ ˆ on A ×α G Proof. Before going into the proof, recall that αˆ is the dual action of @ = G and that the spectrum G2 (α) ˆ for αˆ is defined in [11] as follows: t ∈ G belongs to G2 (α) ˆ if for any non-zero x ∈ A ×α G, any compact neighborhood U of t, and any > 0, there is an a ∈ A ×α G such that Specαˆ (a) ⊂ U , a = 1, and x(a + a ∗ )x ∗ ≥ (2 − )x2 . Let us pose another condition: 3. G2 (α) ˆ =G and let us prove that these conditions (1), (2) and (3) are equivalent. If the Connes spectrum of α is full, this is 2 of [11]. The condition (3) follows easily from the condition (1). We will construct a representation as in (2) by using (3). We just mimic the proof of 3.1 (or 3.3) of [10]. Let B denote A ×α G and β = α. ˆ We know that J ∩ βp (J ) = (0) for any p ∈ @(α) and for any non-zero ideal J of B. Moreover we know that ∩p∈C J = (0) for any compact subset C of @(α) and for any non-zero ideal J (see 8.11.7,8 of [16]). Let T (B) denote the set of e ∈ B such that 0 ≤ e ≤ 1 and ae = a = 0 for some a ∈ B. For e ∈ T (B) we write H (e) = {b ∈ B | be = eb = b}. Note that if J is a non-zero ideal of B, ∪{H (e) | e ∈ T (J )} is dense in J . Let (Jn ) be a sequence of non-zero (closed) ideals of B such that for any non-zero ideal J of B, ∪{Jn | Jn ⊂ J } is dense in J . 
(To show the existence of such a sequence, let (en , an ) be a dense sequence in T 2 (B) = {(e, a) ∈ T (B) × T (B) | ea = a} and let Jn be the ideal generated by an . Let J be a non-zero ideal and let x ∈ J . Then there is (e, a) ∈ T 2 (J ) such that ax ≈ x. If en ≈ e and an ≈ a, then an = an (1 + e − en )(1 + e − en )−1 = an e(1 + e − en )−1 ∈ J , i.e., Jn ⊂ J . Since an x ∈ Jn approximates x, the sequence (Jn ) satisfies the desired property.) Let (Kn ) be a sequence of non-zero ideals of A such that for any non-zero ideal K of A, {Kn | Kn ⊂ K} is dense in K. Let Kˆ n be the ideal of B = A ×α G generated by Kn , which is essential because A is prime. Let (In ) be a sequence consisting of infinitely many copies of Jn and Kn . Let (Cn ) be an increasing sequence of compact subsets of @(α) such that n Cn = @(α). Let (un ) be a dense sequence in U(B) and let (Un ) be a countable basis for the open sets of G. We enumerate {(uk , Um ) | k, m = 1, 2, . . . } and denote by {(un , Un )} the resulting sequence. Fix (e1 , a1 ) ∈ T 2 (I1 ). Then by (3) we have that sup{a1 u∗1 (b + b∗ )u1 a1 | b ∈ B β (U1 ), b = 1} = 2,
where $B^\beta(U)$ is the $\beta$-spectral subspace corresponding to $U$ [16]. We choose $b_1 \in B^\beta(U_1)$ such that $\|b_1\| = 1$ and $\sup \mathrm{Spec}(y_1) > 2 - 1/1$, where $y_1 = a_1u_1^*(b_1 + b_1^*)u_1a_1$. For $n = 1, 2, \ldots$, define a continuous function $f_n$ on $\mathbb{R}$ by
\[ f_n(t) = \begin{cases} 0, & t \le 0, \\ 1, & t \ge 2 - 1/n, \end{cases} \]
and by linearity elsewhere. Let z1 = f1 (y1 ) ∈ T . If H (z1 ) ∩ I2 = (0), then we choose (e2 , a2 ) ∈ T 2 ((H (z1 )∩ p∈C2 βp (I2 )). (If I denotes the ideal generated by H (z1 ), then we have that p∈C2 βp (I ∩ I2 ) = (0), which implies that I ∩ p∈C2 βp (I2 ) = (0).) If H (z1 ) ∩ I2 = (0), we omit I2 and set (e2 , a2 ) ∈ T 2 (H (z1 )). We choose b2 ∈ B β (U2 ) such that b2 = 1 and sup Spec(y2 ) > 2 − 1/2, where y2 = a2 u∗2 (b2 + b2∗ )u2 a2 . Let z2 = f2 (y2 ) and proceed as above. then we choose (en+1 , an+1 ) from T 2 (H (zn )∩ When zn is set, if H (zn )∩In+1 = (0), 2 β p∈Cn+1 βp (In+1 )); otherwise from T (H (zn )). Then we choose a bn+1 ∈ B (Un+1 ) such that bn+1 = 1 and sup Spec(yn+1 ) > 2 − 1/(n + 1), ∗ )u where yn+1 = an+1 u∗n+1 (bn+1 + bn+1 n+1 an+1 . We set zn+1 = fn+1 (yn+1 ). Since zn en = zn and en+1 zn = en+1 , (en ) forms a decreasing sequence in T (B). We may further impose the condition that (en ) defines a pure state; i.e., the set of states φ with φ(en ) = 1 for all n is a singleton. (Assuming that we are given a dense sequence (hn ) in Asa beforehand, when we choose en , we further impose the condition that
sup Spec(p(en )hm p(en )) − inf Spec(p(en )hm p(en )) < 1/n for all m ≤ n, where p(en ) is the spectral projection of en corresponding to the eigenvalue 1.) Let ω denote the pure state with ω(en ) = 1 for all n. Since ω(yn ) ≥ 2 − 1/n and yn ≤ 2an2 , we have that limn ω(an2 ) = 1. This implies that π(an )D − D→0, where (π, H, D) is the GNS triple associated with ω. Let t ∈ G = @ˆ and u ∈ U(B). Then there is a subsequence (nk ) such that unk − u→0 and (Unk ) form a basis for the neighborhoods of t in G. Since ! ω(an u∗n bn un an ) ≥ 1−1/2n, any weak limit point Q of (π(bnk )) on H satisfies that Q = 1 and Qπ(u)D = π(u)D. For each t ∈ G we denote by M(t) the set of Q ∈ B(H) = π(B) which is obtained as the limit of a sequence (xn ) in B satisfying that the sequence (Specβ (xn )) shrinks to {t}, i.e., for any neighborhood U of t there is an n0 such that Specβ (xn ) ⊂ U for n ≥ n0 . It follows that M(t) is a weakly closed subspace, M(s)M(t) ⊂ M(st), and M(1) is a von Neumann algebra. Suppose that there is a projection E ∈ M(1) and S ∈ M(t) such that ES(1−E) = 0. Then there are unit vectors ξ, η in H such that (1 − E)ξ = ξ , Eη = η, and Sξ, η = 0. Let Q ∈ M(t −1 ) such that Q = 1 and Qη = η, whose existence we have shown in the last paragraph. Since Q∗ η = η, we have that QSξ, η = Sξ, η = 0. But since QS ∈ M(1), we should have that QSξ, η = QS(1 − E)ξ, Eη = 0, a contradiction. Hence we can conclude that M(t) ⊂ M(1) = M(1). Since M(t) is a non-zero ideal,
it follows that M(t) = M(1) and M(1) = B(H) (see 1.3 of [10] for details). From this one can conclude that the restriction of π to A (which is in the multiplier algebra of B = A ×α R) is irreducible. (To see this, we use β-integration; see 4 of [11].) Since π |Kˆn = 0, it follows that π|A is faithful. Let J = ker π. Suppose that βp (J ) ⊂ J for some p ∈ @(α). Then there is an Jk such that Jk ⊂ βp (J ) and Jk ⊂ J . This implies that en ∈ Jk for a sufficiently large n. (Since π|Jk = 0, there is an f ∈ T (Jk ) such that ω(f ) = 1. Since f en −en →0, we have, for a −1 −1 large n, that en+1 = en+1 (1+e n f −en )(1+en f −en ) = en+1 f (1+en f −en ) ∈ Jk .) Then we have that en ∈ p∈Cm βp (Im ) for m ≥ n with Im = Jk . Thus it follows that en ∈ β−p (Jk ), or en ∈ J , a contradiction. Hence we have that βp (J ) ⊂ J for p ∈ @(α), which shows the invariance of J = ker π under β|@(α). Finally we give a simple consequence of the above theorem; a similar proof is found in [9]. Corollary 3.2. Under the situation of Theorem 3.1 suppose that @/ @(α) is compact. Then the set S of α|@(α)-invariant ˆ primitive ideals P satisfying p∈@ αˆ p (P ) = (0) (in the primitive ideal space of A ×α G) is isomorphic to the compact Haudorff space @/ @(α). Proof. We take a representation ρ of A ×α G as in (2) of the previous theorem. Then the kernel of ρ belongs to S in the statement. We can define an action γ of the quotient H = @/ @(α) on S by γπ(p) (P ) = αˆ p (P ), P ∈ S for p ∈ @ with π the quotient map of @ onto @/ @(α) = H . We show that if γq˙ (P ) ⊂ P for some q˙ ∈ H and some P ∈ S, then q˙ = 0. Let q ∈ @ be such that π(q) = q. ˙ If q ∈ @(α), there is an neighborhood U of 0 ∈ G such that (@(α) + U + q) ∩ (@(α) + U ) = ∅ and then a non-zero α-invariant hereditary C∗ -subalgebra B of A such that Spec(α|B) ⊂ @(α) + U . 
Note that Spec(ρ(λ)|[ρ(B)Hρ ]) ⊂ Spec(α|B)+p ⊂ @(α)+U +p for some p ∈ @, where ρ is the representation of A×α R given above and λ is the canonical unitary representation of @ in the multiplier algebra of A ×α R. Hence it follows that Spec(ρ(λ)|[ρ(B)Hρ ]) ∩ Spec(ρ αˆ q (λ)|ρ(B)Hρ ) = ∅, which implies that BP B = B(P + αˆ q (P ))B = BA ×α RB. Since ρ|A is faithful, this is a contradiction. Thus q ∈ @(α), or q˙ = 0. If P , Q ∈ S, then, since p∈H γp (P ) = (0) ⊂ Q and H is compact, it follows that γp (P ) ⊂ Q for some p ∈ H . By changing roles between P and Q, it also follows that γq (Q) ⊂ P for some q ∈ H . Hence we have that γp+q (P ) ⊂ P , which implies that p + q = 0. Thus we have that γp (P ) = Q. By fixing one P ∈ S we can identify S with H by γp (P ) p, which also preserves the topologies. References 1. Araki, H., Woods, E.J.: A classification of factors. Publ. Res. Inst. Math. Sci. Kyoto 4, 51–130 (1968) 2. Bratteli, O., Elliott, G.A., Evans, D.E., Kishimoto, A.: Quasi-product actions of a compact abelian group on a C∗ -algebra. Tohoku Math. J. 41, 133–161 (1989) 3. Bratteli, O., Elliott, G.A., Kishimoto, A.: Quasi-product actions of a compact group on a C∗ -algebra. J. Funct. Anal. 115, 313–343 (1993) 4. Bratteli, O., Kishimoto, A., Robinson, D.W.: Embedding product type actions into C∗∗ -dynamical systems. J. Funct. Anal. 75, 188–210 (1987) 5. Connes, A.: Non-commutative Geometry. London-San Diego: Academic Press, 1994 6. Evans, D.E., Kishimoto, A.: Duality for automorphisms on a compact C∗ -dynamical system. Ergod. Th. & Dynam. Sys. 8, 173–189 (1988) 7. Glimm, J.: Type I C∗ -algebras. Ann. of Math. 73, 572–612 (1961)
8. Kadison, R.V.: Irreducible operator algebras. Proc. Nat. Acad. Sci. U.S.A. 43, 273–276 (1957) 9. Kishimoto, A.: Simple crossed products of C∗-algebras by locally compact abelian groups. Yokohama Math. J. 28, 69–85 (1980) 10. Kishimoto, A.: Type I orbits in the pure states of a C∗-dynamical system. Publ. Res. Inst. Math. Sci. Kyoto 23, 321–336 (1987) 11. Kishimoto, A.: Type I orbits in the pure states of a C∗-dynamical system II. Publ. Res. Inst. Math. Sci. Kyoto 23, 517–526 (1987) 12. Kishimoto, A.: Outer automorphism subgroups of a compact abelian ergodic action. J. Operator Theory 20, 59–67 (1988) 13. Kishimoto, A.: Automorphism groups and covariant irreducible representations. In: Mappings of Operator Algebras, H. Araki, R.V. Kadison, eds., Basel-Boston: Birkhäuser, pp. 129–139, 1990 14. Kishimoto, A.: UHF flows and the flip automorphism. Rev. Math. Phys. 9, 1163–1181 (2001) 15. Kishimoto, A.: Approximately inner flows on separable C∗-algebras. Rev. Math. Phys., to appear 16. Pedersen, G.K.: C∗-algebras and their automorphism groups. London-San Diego: Academic Press, 1979 Communicated by H. Araki
Commun. Math. Phys. 229, 415–432 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0698-z
Communications in
Mathematical Physics
Equality of Bulk and Edge Hall Conductance Revisited
P. Elbau, G. M. Graf
Theoretische Physik, ETH-Hönggerberg, 8093 Zürich, Switzerland
Received: 11 March 2002 / Accepted: 1 April 2002
Published online: 12 August 2002 – © Springer-Verlag 2002
Abstract: The integral quantum Hall effect can be explained either as resulting from bulk or edge currents (or, as it occurs in real samples, as a combination of both). This leads to different definitions of Hall conductance, which agree under appropriate hypotheses, as shown by Schulz-Baldes et al. by means of K-theory. We propose an alternative proof based on a generalization of the index of a pair of projections to more general operators. The equality of conductances is an expression of the stability of that index as a flux tube is moved from within the bulk across the boundary of a sample.
1. The Model and the Result

The simultaneous quantization of bulk and edge conductance is essential to the QHE in finite samples, as explained in [8, 13]. In these two references that property is established in the context of an effective field theory description, resp. of a microscopic treatment suitable to the integral QHE. The present paper is placed in the latter setting as well. In our model the QHE sample is represented by the upper half-plane $\mathbb{Z} \times \mathbb{N}$, with lattice points denoted as $r = (x, y)$. The Hamiltonian $H$ is a discrete Schrödinger operator on the single-particle Hilbert space $\ell^2(\mathbb{Z} \times \mathbb{N})$. It is obtained from the restriction (with e.g. Dirichlet boundary conditions) of a “bulk” Hamiltonian $H_B$ acting on $\ell^2(\mathbb{Z} \times \mathbb{Z})$. These assumptions are spelled out in detail at the end of this section. The spectrum of $H_B$ (but not that of $H$, as a rule) has an open gap $\Delta$ containing the Fermi energy $\mu$:

$$\Delta \cap \sigma(H_B) = \emptyset. \tag{1}$$

Let $P_B$ be the Fermi projection: $P_B = E_{(-\infty,\mu]}(H_B)$ for any $\mu \in \Delta$. A real-valued function $g \in C^\infty(\mathbb{R})$ with $g(\lambda) = 1$ (resp. 0) for $\lambda$ large and negative (resp. positive) will be called a switch function. We remark that $P_B = g(H_B)$ if the switch function has $\operatorname{supp} g' \subset \Delta$.
Theorem 1. Assume the hypotheses as described and, in particular, (1). Let

$$\sigma_B = \frac{1}{2\pi} \operatorname{Ind}(U P_B U^*, P_B), \tag{2}$$

where $U = U(r) = e^{i \arg r}$, be the bulk Hall conductance; and let

$$\sigma_E = -\operatorname{tr}\bigl(g'(H)\, i[H, \chi(x)]\bigr), \tag{3}$$

where $g$ and $\chi$ are switch functions with $\operatorname{supp} g' \subset \Delta$, be the edge Hall conductance. Then

$$\sigma_B = \sigma_E. \tag{4}$$
In particular, $\sigma_E$ is independent of $g$ and $\chi$ as stated.

Remarks. 1) $\operatorname{Ind}(P, Q)$ is the index of a pair of projections, see [2], from where also the definition of $\sigma_B$ is taken, except for a change of sign. In other words, their definition of $\sigma_B$ agrees with the Kubo formula (6.18) for $\sigma_{12}$, whereas ours with $\sigma_{21}$. Or equivalently: their definition is such that for a Landau Hamiltonian with magnetic field $B > 0$ and electron charge $e = +1$ one has $\sigma_B > 0$, see Remark 6.7c. Ours is opposite.

2) $U(r)$ can be replaced, without affecting $\sigma_B$, by

$$U(r) = e^{i\varphi(\arg r)}, \tag{5}$$

where $\varphi : S^1 \to S^1$ is a continuous function with winding number 1. This follows by continuity from the additivity [2] and the stability of the index:

$$\|Q - P\| < 1 \;\Rightarrow\; \operatorname{Ind}(Q, P) = 0.$$

3) The rationale for the definition (3) is that $-i[H, \chi(x)]$ is the current operator in the $x$-direction (for $\chi(x) = \theta(-x)$, it is the current across the line $x = 0$). For $-g'(H) = E_{[\mu_1,\mu_2]}(H)/(\mu_2 - \mu_1)$, (3) is (up to the sign) the expected current in the 1-particle density matrix $E_{[\mu_1,\mu_2]}(H)$, corresponding to filled edge levels $[\mu_1, \mu_2] \subset \Delta$, divided by the potential difference. For the above Landau Hamiltonian the current is positive, since the electrons run in the positive $x$-direction near the boundary. Thus $\sigma_E$ is, like $\sigma_B$, negative.

The result (4) was proven in [13] and, more extensively, in [11] using non-commutative geometry and K-theory. (However, the quantization of $\sigma_E$ was shown there without making use of these techniques.) The present proof makes use of basic functional analysis. While their result is established using and extending tools developed in [4], ours bears a similar relation to [2].

We conclude this section by specifying the Schrödinger Hamiltonians $H$ used here. Kronecker states corresponding to lattice points $r = (x, y)$ are denoted as $\delta_r \in \ell^2(\mathbb{Z} \times \mathbb{N})$ and matrix elements as $H(r_1, r_2) = (\delta_{r_1}, H \delta_{r_2})$. We assume $H$ to be a self-adjoint operator with short-range off-diagonal hopping terms:

$$\sup_{r_1} \sum_{r_2} |H(r_1, r_2)|\,\bigl(e^{\mu_0 |r_1 - r_2|} - 1\bigr) < \infty \tag{6}$$

for some $\mu_0 > 0$. The bulk Hamiltonian $H_B$ is of the same form, except that the lattice is $\mathbb{Z} \times \mathbb{Z} = \mathbb{Z}^2$. It should restrict to $H$ on the upper half-plane under some largely arbitrary boundary condition. More precisely, let $J : \ell^2(\mathbb{Z} \times \mathbb{N}) \to \ell^2(\mathbb{Z}^2)$ denote the extension by 0. We assume that the “edge term” $E = JH - H_B J : \ell^2(\mathbb{Z} \times \mathbb{N}) \to \ell^2(\mathbb{Z}^2)$ satisfies

$$\sum_{r' \in \mathbb{Z} \times \mathbb{N}} |E(r, r')| \le C e^{-\mu_0 |y|} \tag{7}$$

for all $r = (x, y) \in \mathbb{Z}^2$. For instance, for Dirichlet boundary conditions,

$$E(r, r') = \begin{cases} -H_B(r, r'), & (y < 0), \\ 0, & (y \ge 0), \end{cases}$$

whence (7) follows from (6) for $H_B$ at the expense of making $\mu_0$ smaller. The trace ideals of operators on the Hilbert space $\ell^2(\mathbb{Z} \times \mathbb{N})$ or $\ell^2(\mathbb{Z}^2)$, depending on the context, are denoted as $\mathcal{J}_p$, $(1 \le p < \infty)$, with norm $\|\cdot\|_p$. Universal constants are denoted by $C$.

2. Idea and Outline of the Proof

We consider the gauge transformation (5) with $\varphi$ having $\operatorname{supp} \varphi' \subset [\pi/4, 3\pi/4]$, so that $U(r) - 1$ is supported in a wedge pointing upwards. We shall compare two modifications thereof. The first one, $U_a$, is obtained from (5) by changing $U(r)$ to 1 for $y < a$.
Fig. 1. Isolines of $U$, $U_a$, $\widetilde U_a$

The second one, $\widetilde U_a$, is obtained from $U_a$ by pulling the line of fluxes at $y = a$ across the boundary, as in the figure. Morally, the Hall conductance $\sigma_B$ is given as

$$\frac{1}{2\pi} \operatorname{tr}\bigl(\widehat U_a\, g(H)\, \widehat U_a^* - g(H)\bigr) \tag{8}$$

with either $\widehat U = U, \widetilde U$. Indeed, in both cases the heuristic argument, explained in more detail in [2], Sect. 5, is that the trace in (8) counts the number of electrons which are pulled to infinity as the gauge field is switched on adiabatically starting from zero to a flux quantum, see Fig. 1. That number may also be computed by integrating the current

$$j = \sigma_B\, \varepsilon E \tag{9}$$

(with $\varepsilon$ denoting a rotation by $\pi/2$) over time and across a large circle $C$ enclosing the flux. Here $E = i\partial_t \nabla(\log \widehat U_a)$ is the electric field accompanying the change of magnetic field, and is the same on $C$ in the two cases. Since the phenomenological equation (9) is valid only well inside the sample, it is crucial that the isolines of the gauge transformation run to infinity through the upper half-plane, so that $E$ vanishes where $C$ crosses the boundary of the sample.

It appears reasonable, even without recourse to (9), that for $\widehat U = U$ and $a \to \infty$ (8) tends to $\sigma_B$ as defined in (2). As for $\widetilde U_a$ note that

$$\widetilde U_a(r) = e^{2\pi i \chi_a(r)},$$

where $\chi_a(r)$ is a single-valued function over the sample and, for $r$ close to the boundary, $\chi_a(r) = \chi_a(x) = \chi(x/a)$ is a switch function. This suggests that

$$\frac{1}{2\pi} \operatorname{tr}\bigl(\widetilde U_a g(H) \widetilde U_a^* - g(H)\bigr) = \frac{1}{2\pi} \int_0^{2\pi} d\varphi\, \operatorname{tr} \frac{d}{d\varphi}\bigl(e^{i\varphi\chi_a} g(H) e^{-i\varphi\chi_a} - g(H)\bigr) = -\operatorname{tr} i[g(H), \chi_a] = -\operatorname{tr}\bigl(g'(H)\, i[H, \chi_a]\bigr) = \sigma_E,$$

where the last two traces are formally equal since the operators inside differ by a commutator. The trouble with this explanation for $\sigma_B = \sigma_E$ is that none of the traces starting with (8), except for the last one, is well-defined. In fact, one has the weaker property $\widehat U_a g(H) \widehat U_a^* - g(H) \in \mathcal{J}_3$ for $g$ a switch function (but notice that as a rule even this fails if $g$ is taken as a step function, a fact related to Theorem 3.11 in [2]). Put differently: the formal eigenvalue sum represented by (8) is not absolutely convergent, but exhibits strong cancellations between eigenvalues $\lambda$ of opposite sign (which are exact except for $\lambda = \pm 1$ in a bulk situation, where $g(H_B) = P_B$ is a projection [3]). Let therefore $f_t(\lambda)$ be an odd function with $f_t(1) = 1$ interpolating between $\lambda^3$ (as $t = 0$) and $\lambda$ (as $t = \infty$). For definiteness we take

$$f_t(\lambda) = \frac{(1 + t)\lambda^3}{1 + t\lambda^2}. \tag{10}$$
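The interpolation properties claimed for the regularizing function in (10) can be checked directly. The following is a minimal sketch in exact rational arithmetic; the helper name `f` and the sample points are illustrative choices, not part of the paper.

```python
from fractions import Fraction

def f(t, lam):
    """f_t(lam) = (1 + t) lam^3 / (1 + t lam^2), as in (10)."""
    t, lam = Fraction(t), Fraction(lam)
    return (1 + t) * lam**3 / (1 + t * lam**2)

lam = Fraction(1, 3)                 # illustrative sample point
assert f(0, lam) == lam**3           # t = 0 reproduces the cube
assert f(5, 1) == 1                  # f_t(1) = 1 for every t
assert f(5, -lam) == -f(5, lam)      # f_t is odd

# f_t(lam) - lam = (lam^3 - lam) / (1 + t lam^2), so the gap to lam
# shrinks as t grows (for lam != 0).
gaps = [abs(f(t, lam) - lam) for t in (10, 100, 1000)]
assert gaps[0] > gaps[1] > gaps[2]
```

Since $f_t(\lambda) - \lambda = (\lambda^3 - \lambda)/(1 + t\lambda^2)$, the gap to $\lambda$ is of order $1/t$ away from $\lambda = 0$, which is the sense in which $f_t$ tends to $\lambda$ as $t \to \infty$.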
We regard the “principal value” $\lim_{t\to\infty} \operatorname{tr} f_t(A)$ as a replacement for $\operatorname{tr} A$, when the latter is not defined. But first we pass to a more general setting. We consider a fixed bounded operator $P$ (typically not a projection!) on a Hilbert space $\mathcal{H}$ equipped with a fixed orthonormal basis $\mathcal{B}$. Our standing assumptions are: let

$$Q = U P U^*, \tag{11}$$

where $U$ is a unitary operator satisfying

$$\mathcal{B} \text{ is an eigenbasis for } U, \tag{12}$$
$$Q - P \in \mathcal{J}_3, \tag{13}$$
$$(Q - P)(P - P^2),\ (P - P^2)(Q - P) \in \mathcal{J}_1, \tag{14}$$
$$p(Q) - p(P) \in \mathcal{J}_1, \tag{15}$$

for any polynomial $p(\lambda)$ with $p(0) = p(1) = 0$ and $\deg p \le 3$. This implies

$$\operatorname{tr}\bigl(p(Q) - p(P)\bigr) = 0, \tag{16}$$

as it is seen by evaluating the trace in an eigenbasis of $U$. Specifically, (16) will be used for the polynomials $p(\lambda) = \lambda - \lambda^2$ and $p(\lambda) = (1 - 2\lambda)(\lambda - \lambda^2) = \lambda - 3\lambda^2 + 2\lambda^3$, which span the above space of polynomials. As an abstract replacement for (8) we have

Lemma 2. Assume (11–15) and $P = P^*$. Then

$$\lim_{t\to\infty} \operatorname{tr} f_t(Q - P) = \operatorname{tr}\Bigl(\tfrac{3}{2}\bigl\{Q - P, (Q - Q^2) + (P - P^2)\bigr\} + (Q - P)^3\Bigr) \equiv K(U). \tag{17}$$

The proof of Theorem 1 will not depend on Lemma 2, except for the fact that $K(U)$ is well-defined. The limit (17) will thus be proved only towards the end of the paper. The heuristic discussion following (8) is now substantiated in terms of $K(U)$.

Lemma 3. Let $Q_i = U_i P U_i^*$, $(i = 1, 2)$ satisfy (12–15) and assume

$$U_2 - U_1 \in \mathcal{J}_1. \tag{18}$$

Then $K(U_1) = K(U_2)$.

We now turn to the application to the quantum Hall effect.

Lemma 4. i) The assumptions (12–15) hold true for

$$\mathcal{H} = \ell^2(\mathbb{Z} \times \mathbb{N}), \quad \mathcal{B} = \{\delta_r\}_{r \in \mathbb{Z} \times \mathbb{N}}, \quad U = \widehat U_a, \quad P = g(H), \quad Q_a = \widehat U_a\, g(H)\, \widehat U_a^* \tag{19}$$

with $\widehat U = U, \widetilde U$ and $g$ as in Theorem 1.
ii) Assumption (18) applies to $U_i = \widehat U_a$, with separate choices of $a$ and $\widehat U = U, \widetilde U$ for $i = 1, 2$.
Therefore, $K(\widehat U_a)$ is independent of $a$ and $\widehat U$.

Lemma 5. Let (19) with $\widehat U = U$. Then

$$\lim_{a\to\infty} \operatorname{tr}(Q_a - P)^3 = 2\pi\sigma_B, \tag{20}$$
$$\lim_{a\to\infty} \tfrac{3}{2} \operatorname{tr}\bigl\{Q_a - P, (P - P^2) + (Q_a - Q_a^2)\bigr\} = 0. \tag{21}$$

Lemma 6. Let (19) with $\widehat U = \widetilde U$. Then

$$\lim_{a\to\infty} \operatorname{tr}(Q_a - P)^3 = 0, \tag{22}$$
$$\lim_{a\to\infty} \tfrac{3}{2} \operatorname{tr}\bigl\{Q_a - P, (P - P^2) + (Q_a - Q_a^2)\bigr\} = 2\pi\sigma_E. \tag{23}$$

Proof of Theorem 1. Is immediate from Lemmas 3–6.

Remark. Theorem 1 may be stated for a continuum model as well, though the proof would require some adjustments. For instance, hypothesis (12) does not apply in that case, but coordinate space is still distinguished by the gauge transformation. Consequences like (16) may then be drawn using that $g(H)$ has a jointly continuous integral kernel $g(H)(r_1, r_2)$, cf. Hypothesis 3.1 and Theorem A.1 in [2].
3. The Details

The starting point to the proofs of Lemmas 2 and 3 are two identities from [2] valid for projections $P = P^2$ and $Q = Q^2$. They are

$$(Q - P) - (Q - P)^3 = [QP, PQ] = [QP, [P, Q - P]], \tag{24}$$
$$[P, (Q - P)^2] = [Q, (Q - P)^2] = 0. \tag{25}$$
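Identities (24) and (25) are purely algebraic, so they can be sanity-checked on finite matrices. The sketch below does this in exact rational arithmetic for two rank-one projections in dimension 3; the helper names (`mul`, `add`, `comm`, `projector`) and the chosen vectors are illustrative assumptions, not notation from the paper.

```python
from fractions import Fraction

def mul(X, Y):
    """Matrix product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def add(X, Y, sign=1):
    """Entrywise X + sign * Y."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    return add(mul(X, Y), mul(Y, X), -1)

def projector(v):
    """Orthogonal projection onto the span of the real vector v."""
    norm2 = sum(Fraction(x) ** 2 for x in v)
    return [[Fraction(a) * Fraction(b) / norm2 for b in v] for a in v]

P = projector([1, 2, 2])      # illustrative choice
Q = projector([2, -1, 3])     # illustrative choice
A = add(Q, P, -1)             # A = Q - P
A2 = mul(A, A)
A3 = mul(A2, A)
QP, PQ = mul(Q, P), mul(P, Q)
zero = [[Fraction(0)] * 3 for _ in range(3)]

assert mul(P, P) == P and mul(Q, Q) == Q             # both are projections
lhs = add(A, A3, -1)                                 # (Q - P) - (Q - P)^3
assert lhs == comm(QP, PQ)                           # first equality in (24)
assert lhs == comm(QP, comm(P, A))                   # second equality in (24)
assert comm(P, A2) == zero and comm(Q, A2) == zero   # (25)
```

In finite dimension both traces in $\operatorname{tr}(Q-P)^3 = \operatorname{tr}(Q-P)$ are equal for trivial reasons; the check only illustrates the operator identities themselves.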
The first was used there for it yields the case $n = 0$ of $\operatorname{tr}(Q - P)^{2n+3} = \operatorname{tr}(Q - P)^{2n+1}$ for $Q - P \in \mathcal{J}_{2n+1}$. The second yields the extension to $n \in \mathbb{N}$. For later purpose we remark that they similarly yield

$$\operatorname{tr} f_t(Q - P) = \operatorname{tr}(Q - P)^3 \tag{26}$$

for $0 \le t < \infty$ if $P - Q \in \mathcal{J}_3$. Indeed: since

$$f_t(\lambda) - \lambda^3 = \frac{t\lambda^2(\lambda - \lambda^3)}{1 + t\lambda^2} \tag{27}$$

we have

$$f_t(Q - P) - (Q - P)^3 = t\,\Bigl[QP, \Bigl[P, \frac{(Q - P)^3}{1 + t(Q - P)^2}\Bigr]\Bigr]$$

with the inner commutator being trace class, whence (26).

Our primary concern here is however a generalization of (24, 25) to arbitrary bounded operators $P$, $Q$. More precisely, we take the half-difference between (24) and its “particle-hole” reversed variant ($P \to 1 - P$, $Q \to 1 - Q$), and correct the result by the appropriate terms involving $P - P^2$ and $Q - Q^2$:

$$\begin{aligned}
(Q - P) - (Q - P)^3 ={}& \tfrac{1}{2}[QP, PQ] - \tfrac{1}{2}\bigl[(1 - Q)(1 - P), (1 - P)(1 - Q)\bigr] \\
&+ (1 - 2Q)(Q - Q^2) - (1 - 2P)(P - P^2) \\
&+ \tfrac{3}{2}\bigl\{Q - P, Q - Q^2 + P - P^2\bigr\}.
\end{aligned} \tag{28}$$

In the new setting (25) is replaced with

$$[P, (Q - P)^2] = [Q, (Q - P)^2] = \bigl[Q - P, (Q - Q^2) - (P - P^2)\bigr]. \tag{29}$$

These relations are conveniently stated in terms of the operators

$$A = Q - P, \qquad B = 1 - P - Q \tag{30}$$

introduced in [10, 3], for which

$$\{A, B\} = 2\bigl((Q - Q^2) - (P - P^2)\bigr), \tag{31}$$
$$1 - A^2 - B^2 = 2\bigl((Q - Q^2) + (P - P^2)\bigr). \tag{32}$$
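Then (28) reads (with equality line by line)

$$\begin{aligned}
A - A^3 ={}& \tfrac{1}{4}\bigl[B, [B, A]\bigr] \\
&+ \tfrac{1}{4}\bigl\{B, \{A, B\}\bigr\} - \tfrac{1}{4}\bigl\{A, 1 - A^2 - B^2\bigr\} \\
&+ \tfrac{3}{4}\bigl\{A, 1 - A^2 - B^2\bigr\}
\end{aligned} \tag{33}$$

and (29) (after multiplication by 2)

$$[A^2, B] = [A, \{A, B\}]. \tag{34}$$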
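Since (28), (29) and (31)–(34) are claimed for arbitrary bounded operators, they should hold verbatim for any pair of square matrices. The following sketch checks them in exact arithmetic for two arbitrary non-commuting integer matrices; the helpers (`mul`, `lin`, `comm`, `acomm`) and the matrix entries are illustrative assumptions.

```python
from fractions import Fraction

def mul(X, Y):
    """Matrix product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def lin(*terms):
    """Linear combination: sum of c * X over the given (c, X) pairs."""
    n = len(terms[0][1])
    return [[sum(Fraction(c) * X[i][j] for c, X in terms) for j in range(n)]
            for i in range(n)]

def comm(X, Y):
    """Commutator [X, Y]."""
    return lin((1, mul(X, Y)), (-1, mul(Y, X)))

def acomm(X, Y):
    """Anticommutator {X, Y}."""
    return lin((1, mul(X, Y)), (1, mul(Y, X)))

I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
P = [[1, 2, 0], [0, -1, 3], [4, 0, 1]]    # arbitrary, not a projection
Q = [[0, 1, 1], [2, 0, -1], [1, 3, 2]]    # arbitrary, not a projection

A = lin((1, Q), (-1, P))                  # A = Q - P
B = lin((1, I), (-1, P), (-1, Q))         # B = 1 - P - Q
p = lin((1, P), (-1, mul(P, P)))          # P - P^2
q = lin((1, Q), (-1, mul(Q, Q)))          # Q - Q^2
A2 = mul(A, A)
A3 = mul(A2, A)
B2 = mul(B, B)
Pc = lin((1, I), (-1, P))                 # 1 - P
Qc = lin((1, I), (-1, Q))                 # 1 - Q

assert acomm(A, B) == lin((2, q), (-2, p))                       # (31)
assert lin((1, I), (-1, A2), (-1, B2)) == lin((2, q), (2, p))    # (32)

rhs28 = lin(
    (Fraction(1, 2), comm(mul(Q, P), mul(P, Q))),
    (Fraction(-1, 2), comm(mul(Qc, Pc), mul(Pc, Qc))),
    (1, mul(lin((1, I), (-2, Q)), q)),
    (-1, mul(lin((1, I), (-2, P)), p)),
    (Fraction(3, 2), acomm(A, lin((1, q), (1, p)))),
)
assert lin((1, A), (-1, A3)) == rhs28                            # (28)

assert comm(P, A2) == comm(Q, A2) == comm(A, lin((1, q), (-1, p)))  # (29)
assert comm(A2, B) == comm(A, acomm(A, B))                       # (34)
```

Exact rational arithmetic is used so that the assertions test the identities themselves rather than floating-point agreement.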
Proof of Lemma 3. We remark that

$$\bigl\{Q - P, (Q - Q^2) + (P - P^2)\bigr\} = 2\bigl\{Q - P, P - P^2\bigr\} + \bigl\{Q - P, p(Q) - p(P)\bigr\}$$

with $p(\lambda) = \lambda - \lambda^2$, is trace class by our assumptions (14, 15). Thus $K(U)$ in (17) is well-defined. Let $A_i = Q_i - P$, $(i = 1, 2)$, and similarly for $B_i$. We take the difference between (33) (or (28)) in the two cases. In a mixed notation, where $X_i\big|_1^2 = X_2 - X_1$, we have

$$\begin{aligned}
A_i\Big|_1^2 - A_i^3\Big|_1^2 ={}& \tfrac{1}{4}\bigl[B_i, [B_i, A_i]\bigr]\Big|_1^2 \\
&+ \bigl(p(Q_i) - p(P)\bigr)\Big|_1^2 \\
&+ \tfrac{3}{2}\bigl\{Q_i - P, (Q_i - Q_i^2) + (P - P^2)\bigr\}\Big|_1^2
\end{aligned} \tag{35}$$

with $p(\lambda) = (1 - 2\lambda)(\lambda - \lambda^2)$. We note that $A_2 - A_1 = -(B_2 - B_1) = Q_2 - Q_1 \in \mathcal{J}_1$ with $\operatorname{tr}(Q_2 - Q_1) = 0$. Indeed, by (18),

$$Q_2 - Q_1 = U_2 P U_2^* - U_1 P U_1^* = (U_2 - U_1) P U_2^* + U_1 P (U_2 - U_1)^* \tag{36}$$

is trace class, and the trace is seen to vanish using the basis $\mathcal{B}$. Writing

$$\bigl[B_i, [B_i, A_i]\bigr]\Big|_1^2 = \bigl[B_2, [B_2, A_2 - A_1]\bigr] + \bigl[B_2, [B_2 - B_1, A_1]\bigr] + \bigl[B_2 - B_1, [B_1, A_1]\bigr],$$

we see that the first term on the r.h.s. of (35) is trace class with vanishing trace. So is the next one due to (16).

Proof of Lemma 4. Equation (12) is evident, since the $\widehat U_a$ are multiplication operators. Let $U(r)$ be given by (5) as in Fig. 1. Since $U - \widehat U_a$ has compact support as a function, it is trace class as an operator. Thus (ii) holds true and it suffices to prove (13–15) for $U$ instead of $\widehat U_a$, cf. (36). The $(r_1, r_2)$-matrix element of $(Q - P)U = U g(H) - g(H) U$ is $g(H)(r_1, r_2)\bigl(U(r_1) - U(r_2)\bigr)$, so (13) follows from (A.5),

$$|U(r_1) - U(r_2)| \le C\, \frac{|r_1 - r_2|}{1 + |r_2|}$$

and (A.4) with $p = 3$. To prove (15), we note that $G = p \circ g$ has $\operatorname{supp} G \subset \Delta$. Hence (A.7) applies. Writing the matrix element of $(p(Q) - p(P))U = U G(H) - G(H) U$ as before, the claim follows. As mentioned, the verification of (14) could equally be done on the basis of $U$ instead of $\widehat U_a$. However we prefer to do this for $\widehat U = U, \widetilde U$ explicitly, since this will provide estimates, stated in the lemma below, which will be useful in the proofs of Lemmas 5, 6. Technically, the first part of (14) is just the case $b = 0$ in (37) below. The second part follows by taking the adjoint.
The rough reason for

$$(Q_a - P)(P - P^2) = \bigl(\widehat U_a g(H) \widehat U_a^* - g(H)\bigr)\bigl(g(H) - g(H)^2\bigr)$$

to be trace class is that $\operatorname{supp}(\widehat U_a - 1)$ has compact intersection (possibly empty) with the boundary.

Lemma 7. Let $F_b = F_b(y)$ be the characteristic function of the neighborhood $\{r \mid y < b\}$ of the boundary. Then, in the notation (19),

$$\|(Q_a - P)(1 - F_b)(P - P^2)\|_1 \le C(1 + a)e^{-\kappa b}, \tag{37}$$
$$\|(Q_a - P)(1 - F_b)(Q_a - Q_a^2)\|_1 \le C(1 + a)e^{-\kappa b} \tag{38}$$

for both $\widehat U = U, \widetilde U$, and some $\kappa > 0$. For $b \le a/2$ we furthermore have

$$\|(Q_a - P)F_b\|_1 \le C_N (1 + a)^{-N} \tag{39}$$

in case $\widehat U = U$; and

$$\|(Q_a - P)F_b\|_1 \le C \cdot b, \tag{40}$$
$$\|(Q_a - P)F_b\|_2 \le C(b/a)^{1/2} \tag{41}$$

in case $\widehat U = \widetilde U$.

Proof. We set $G(H) = P - P^2 = g(H) - g(H)^2$ and estimate (37) as

$$\|(Q_a - P)(1 - F_b)G(H)\|_1 = \|(Q_a - P)(1 - F_b)e^{-\kappa y} e^{\kappa y} G(H)\|_1 \le e^{-\frac{\kappa}{2} b}\, \|(Q_a - P)e^{-\frac{\kappa}{2} y}\|_1\, \|e^{\kappa y} G(H)\|,$$

where the last norm is finite due to (A.7). The operator $T = (Q_a - P)e^{-\frac{\kappa}{2} y}\, \widehat U_a$ has kernel

$$T(r_1, r_2) = g(H)(r_1, r_2)\bigl(\widehat U_a(r_1) - \widehat U_a(r_2)\bigr) e^{-\frac{\kappa}{2} y_2}$$

with

$$|\widehat U_a(r_1) - \widehat U_a(r_2)| \le C_k \bigl(1 + (|x_2| - a - y_2)_+\bigr)^{-k} \bigl(1 + |r_1 - r_2|\bigr)^{k}. \tag{42}$$

In fact if $|x_2| < a + y_2$, the first factor on the r.h.s. is bounded below by 1, while the l.h.s. is bounded above by 2. In the opposite case $|x_2| \ge a + y_2$, we distinguish between $|x_1| \ge a + y_1$, whence the l.h.s. vanishes (see Fig. 1), and $|x_1| < a + y_1$, where $\sqrt{2}\,|r_1 - r_2| \ge |x_2| - a - y_2$ implies that the r.h.s. is bounded below away from 0. We claim this proves

$$\|T\|_1 = \|(Q_a - P)e^{-\frac{\kappa}{2} y}\, \widehat U_a\|_1 \le C(1 + a), \tag{43}$$

and hence (37). To this end we apply (A.4) with $p = 1$: using (A.5) with $N + k$ instead of $N$ we have

$$\sum_{r \in \mathbb{Z} \times \mathbb{N}} |T(r + s, r)| \le C(1 + |s|)^{-N} \sum_{r} e^{-\frac{\kappa}{2} y} \bigl(1 + (|x| - a - y)_+\bigr)^{-k} \le C(1 + |s|)^{-N} \sum_{y=0}^{\infty} (1 + a + y) e^{-\frac{\kappa}{2} y} \le C(1 + |s|)^{-N} \cdot (1 + a),$$

for $k \ge 2$. This is summable w.r.t. $s \in \mathbb{Z}^2$ for $N \ge 3$. Taking (37) with $\widehat U_a^*$ instead of $\widehat U_a$ yields (38).

Let now $b \le a/2$ for the rest of the proof. The proof of (39) is just like that of (43), which we supplement with $U_a(r_1) - U_a(r_2) = 0$ if $y_2 < a/2$ and $|r_1 - r_2| \le a/2$. This yields for $T = (Q_a - P)F_b U_a$,

$$\sum_{r} |T(r + s, r)| \le C(1 + |s|)^{-N} \sum_{y=0}^{\infty} (1 + a + y) F_b(y) \le C(1 + |s|)^{-N} (1 + a)^2,$$

and $= 0$ if $|s| < a/2$. Thus

$$\|T\|_1 \le C(1 + a)^2 \sum_{s : |s| \ge a/2} (1 + |s|)^{-N} \le C(1 + a)^2 (1 + a)^{-(N-2)}.$$

Let finally $\widehat U = \widetilde U$, where

$$|\widetilde U_a(r_1) - \widetilde U_a(r_2)| \le C\, \frac{|r_1 - r_2|}{a + |r_2|}. \tag{44}$$

This holds true for $a = 1$ and $r_1, r_2 \in \mathbb{R}^2$, and follows by scaling, $\widetilde U_a(r) = \widetilde U_1(r/a)$, for $a > 0$. To estimate $T = (Q_a - P)F_b \widetilde U_a$ we use (44) for $|x_2| < 3a$ and (42) for $|x_2| \ge 3a$ (with $N + 1$, resp. $N + k$ in (A.5)). Thus

$$\sum_{r \in \mathbb{Z} \times \mathbb{N}} |T(r + s, r)| \le C(1 + |s|)^{-N} \sum_{y=0}^{b-1} \Bigl(\sum_{|x| < 3a} \frac{1}{a} + \sum_{|x| \ge 3a} \bigl(1 + (|x| - a - y)_+\bigr)^{-k}\Bigr) \le C(1 + |s|)^{-N}\, b\, \Bigl(6 + 2 \sum_{m=a}^{\infty} (1 + m)^{-k}\Bigr),$$

where we used $|x| - a - y \ge 3a - 2a = a$. Similarly,

$$\sum_{r \in \mathbb{Z} \times \mathbb{N}} |T(r + s, r)|^2 \le C(1 + |s|)^{-2N}\, b\, \Bigl(\frac{1}{a} + \sum_{m=a}^{\infty} (1 + m)^{-2k}\Bigr) \le C(1 + |s|)^{-2N} \cdot b/a.$$
Proof of Lemma 5. Let $A = Q_a - P = U_a g(H) U_a^* - g(H)$, $A_B = U_a g(H_B) U_a^* - g(H_B) = U_a P_B U_a^* - P_B$ and $D = (A - A_B) U_a$, where $g(H) \equiv J g(H) J^*$ is now meant as an operator on $\ell^2(\mathbb{Z} \times \mathbb{Z})$, simply extended by zero. The kernel of $D$,

$$D(r_1, r_2) = \bigl(J g(H) J^* - g(H_B)\bigr)(r_1, r_2)\, \bigl(U_a(r_1) - U_a(r_2)\bigr),$$

satisfies (up to a factor 2) the bound (A.6), and vanishes if both $r_1, r_2$ are outside of the wedge. Thus (A.4) with $p = 1$ shows

$$\|D\|_1 \le C e^{-\kappa a}.$$

Writing $A^3 - A_B^3 = A^2(A - A_B) + A(A - A_B)A_B + (A - A_B)A_B^2$, this proves

$$\lim_{a\to\infty} \bigl(\operatorname{tr} A^3 - \operatorname{tr} A_B^3\bigr) = 0.$$

But, see [2],

$$\operatorname{tr} A_B^3 = \operatorname{Ind}(U_a P_B U_a^*, P_B) \tag{45}$$

is independent of $a$ due to the stability of the index ([10], Theorem 5.26) under compact perturbations (or use Lemma 3 above instead). In particular (45) equals $2\pi\sigma_B$ as defined. This proves (20). To prove (21), we let $b \le a/2$ and note that by (37, 38, 39),

$$\|(Q_a - P)(P - P^2 + Q_a - Q_a^2)\|_1 \le \|(Q_a - P)(1 - F_b)(P - P^2 + Q_a - Q_a^2)\|_1 + 2\|(Q_a - P)F_b\|_1 \le C(1 + a)e^{-\kappa b} + C_N(1 + a)^{-N}.$$

Upon choosing e.g. $b = a^{1/2}$, this tends to 0 as $a \to \infty$.
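As a preparation to the proof of Lemma 6 we have:

Lemma 8. Equation (3) is well-defined and independent of $\chi$ and $g$ as stated in Theorem 1. In particular,

$$\sigma_E = -\lim_{a\to\infty} \operatorname{tr}\bigl(g'(H)\, i[H, \chi_a(x)]\bigr), \tag{46}$$

where $\chi_a(x) = \chi(x/a)$.

Proof. Equation (3) is well-defined by (A.8). By taking differences of switch functions, independence amounts to (i) $\operatorname{tr} g'(H)\, i[H, X(x)] = 0$, (ii) $\operatorname{tr} G'(H)\, i[H, \chi(x)] = 0$, where $X, G \in C_0^\infty(\mathbb{R})$ with $\operatorname{supp} G \subset \Delta$. These statements are verified as follows:

i) Since $g'(H)X(x) \in \mathcal{J}_1$ by (A.7, A.4) we have $\operatorname{tr} g'(H)[H, X] = \operatorname{tr} g'(H) H X - \operatorname{tr} g'(H) X H = 0$ by cyclicity. This already proves (46).

ii) $[G(H), \chi](r_1, r_2) = G(H)(r_1, r_2)\bigl(\chi(r_2) - \chi(r_1)\bigr)$. By (A.7, A.4), $[G(H), \chi] \in \mathcal{J}_1$ and hence

$$\operatorname{tr}[G(H), \chi(x)] = 0. \tag{47}$$

We then pick $\widetilde G \in C_0^\infty$ with $\operatorname{supp} \widetilde G \subset \Delta$ and $\widetilde G G = G$. Then (47) may also be written, using cyclicity and (A.9), as

$$\operatorname{tr}[\widetilde G G, \chi] = \operatorname{tr}[\widetilde G, \chi] G + \operatorname{tr}[G, \chi] \widetilde G = \operatorname{tr}[H, \chi](\widetilde G' G + G' \widetilde G) = \operatorname{tr}[H, \chi] G'.$$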
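Proof of Lemma 6. Let $A = Q_a - P$. Then, by (A.5) and (44),

$$|A(r_1, r_2)| = |g(H)(r_1, r_2)|\, |\widetilde U_a(r_1) - \widetilde U_a(r_2)| \le C_N \bigl(1 + |r_1 - r_2|\bigr)^{-N}\, \frac{|r_1 - r_2|}{a + |r_1|},$$

so that by (A.4) $\|A^3\|_1 = \|A\|_3^3 \le C a^{-1}$. This proves (22). For $b \le a/2$ we have

$$F_b(y)\widetilde U_a(r) = F_b(y) e^{2\pi i \chi_a(x)},$$

where $\chi_a(x) = \chi(x/a)$ is a switch function. We then have, using (37, 38),

$$\begin{aligned}
\tfrac{3}{2} \operatorname{tr}\{Q_a - P, P - P^2 + Q_a - Q_a^2\}
&= 3 \operatorname{tr} F_b(Q_a - P)F_b(P - P^2 + Q_a - Q_a^2) + O((1 + a)e^{-\kappa b}) \\
&= 3 \operatorname{tr} F_b\bigl(P(2\pi) - P(0)\bigr)F_b\bigl(P(0) - P(0)^2 + P(2\pi) - P(2\pi)^2\bigr) + O((1 + a)e^{-\kappa b}),
\end{aligned} \tag{48}$$

where $P(\varphi) = e^{i\varphi\chi_a(x)} g(H) e^{-i\varphi\chi_a(x)}$. We now apply the fundamental theorem of calculus to

$$P(2\pi) - P(0) = \int_0^{2\pi} d\varphi\, \frac{d}{d\varphi} P(\varphi) = -\int_0^{2\pi} d\varphi\, e^{i\varphi\chi_a}\, i[g(H), \chi_a]\, e^{-i\varphi\chi_a}. \tag{49}$$

We remark that in (37, 38, 40, 41) one can, by the same proof, replace $Q_a - P$ by $i[g(H), \chi_a]$:

$$\|i[g(H), \chi_a]F_b\|_2 \le C(b/a)^{1/2}, \tag{50}$$
$$\|i[g(H), \chi_a](1 - F_b)(g(H) - g(H)^2)\|_1 \le C(1 + a)e^{-\kappa b}. \tag{51}$$

Thus

$$\sup_{0 \le \varphi, \varphi' \le 2\pi} \|\bigl(P(\varphi') - P(\varphi)\bigr)F_b\|_2 \le C(b/a)^{1/2},$$

so that by writing

$$\bigl(P(\varphi') - P(\varphi')^2\bigr) - \bigl(P(\varphi) - P(\varphi)^2\bigr) = \bigl(P(\varphi') - P(\varphi)\bigr)\bigl(1 - P(\varphi')\bigr) - P(\varphi)\bigl(P(\varphi') - P(\varphi)\bigr),$$

we infer

$$\sup_{0 \le \varphi, \varphi' \le 2\pi} \bigl\|F_b\bigl[\bigl(P(\varphi') - P(\varphi')^2\bigr) - \bigl(P(\varphi) - P(\varphi)^2\bigr)\bigr]F_b\bigr\|_2 \le C(b/a)^{1/2}.$$

Using this with $\varphi = 0, 2\pi$, (49, 50) and the Cauchy-Schwarz inequality we find that (48) equals, up to errors $O(b/a) + O((1 + a)e^{-\kappa b})$,

$$\begin{aligned}
-6 \int_0^{2\pi} d\varphi\, \operatorname{tr} F_b\, i[g(H), \chi_a]\, F_b \bigl(g(H) - g(H)^2\bigr)
&= -6 \cdot 2\pi \operatorname{tr} i[g(H), \chi_a]\bigl(g(H) - g(H)^2\bigr) \\
&= -2\pi \cdot 6 \operatorname{tr} i[H, \chi_a]\, g'(H)\bigl(g(H) - g(H)^2\bigr) \\
&= -2\pi \operatorname{tr} i[H, \chi_a]\, \tilde g'(H),
\end{aligned}$$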
where $F_b$ has been dropped using (51) and (A.9) been used. We remark that $\tilde g = 3g^2 - 2g^3$ is also a switch function. We finally pick e.g. $b = a^{1/2}$ so that the error mentioned above vanishes as $a \to \infty$. Thus (23) follows from Lemma 8.

Proof of Lemma 2. This is a variant of the argument leading to (26) in the case of projections. Let, in the general case, $A$, $B$ be as in (30, 11). Then, by (27),

$$\begin{aligned}
f_t(A) - A^3 &= (1 - R_t)(\text{r.h.s. of (33)}) \\
&= \tfrac{1}{4}\bigl((1 - R_t)[B, [B, A]] - R_t\{B, \{A, B\}\}\bigr) \\
&\quad - \tfrac{1}{2} R_t\{A, 1 - A^2 - B^2\} \\
&\quad + \tfrac{1}{4}\{B, \{A, B\}\} - \tfrac{1}{4}\{A, 1 - A^2 - B^2\} \\
&\quad + \tfrac{3}{4}\{A, 1 - A^2 - B^2\} \\
&\equiv L_1 + L_2 + L_3 + L_4 \quad \text{(linewise)},
\end{aligned}$$

where

$$1 - R_t = tA^2(1 + tA^2)^{-1}, \tag{52}$$

resp. $R_t = (1 + tA^2)^{-1}$. Note that since $A = A^*$,

$$\operatorname*{s-lim}_{t\to\infty} t^{1/2} A R_t = 0, \tag{53}$$
$$\operatorname*{s-lim}_{t\to\infty} R_t = \Pi, \tag{54}$$

where $\Pi$ is the projection onto the null space of $A$.

1) We claim $\lim_{t\to\infty} \operatorname{tr} L_1 = 0$. To this end we consider the first term in the corresponding bracket first:

$$(1 - R_t)[B, [B, A]] = [B, (1 - R_t)[B, A]] + [B, R_t][B, A]. \tag{55}$$

Since $A \in \mathcal{J}_3$ and $1 - R_t \in \mathcal{J}_{3/2}$ we have $(1 - R_t)[B, A] \in \mathcal{J}_1$ by the Hölder inequality. Thus $\operatorname{tr}[B, (1 - R_t)[B, A]] = 0$. The last term in (55) is by (34)

$$\begin{aligned}
[B, R_t][B, A] &= tR_t[A^2, B]R_t \cdot [B, A] = tR_t[A, \{A, B\}]R_t[B, A] \\
&= tR_t A\{A, B\}R_t(-2AB + \{A, B\}) - tR_t\{A, B\}AR_t(2BA - \{A, B\}) \\
&= -2tR_t A\{A, B\}R_t AB - 2tR_t\{A, B\}AR_t BA + tR_t\{A, \{A, B\}\}R_t\{A, B\}.
\end{aligned}$$

All terms are trace class since $\{A, B\}$ is by (31, 15) with $p(\lambda) = 2(\lambda - \lambda^2)$. We recall that

$$X_n \xrightarrow{s} 0,\ Y \in \mathcal{J}_1 \Rightarrow \|X_n Y\|_1 \to 0, \qquad X_n^* \xrightarrow{s} 0,\ Y \in \mathcal{J}_1 \Rightarrow \|Y X_n\|_1 \to 0. \tag{56}$$

Thus the first two terms on the r.h.s. do not contribute to the trace as $t \to \infty$ by (53) (use cyclicity for the second). Similarly, in the last term $tR_t(A^2 B + 2ABA + BA^2)R_t\{A, B\}$, the middle term thereof does not. Using cyclicity of the trace on the remaining ones, as well as (52), we find for $t \to \infty$,

$$\begin{aligned}
\operatorname{tr}(1 - R_t)[B, [B, A]] &= \operatorname{tr} BR_t\{A, B\}(1 - R_t) + \operatorname{tr} B(1 - R_t)\{A, B\}R_t + o(1) \\
&= \operatorname{tr} BR_t\{A, B\} + \operatorname{tr} B\{A, B\}R_t + o(1) \\
&= \operatorname{tr} R_t\{B, \{A, B\}\} + o(1),
\end{aligned}$$

where we used $R_t\{A, B\}R_t \to \Pi\{A, B\}\Pi = 0$ in trace norm as $t \to \infty$, a consequence of (54, 56). The traces of the two terms in $L_1$ thus compensate one another in the limit $t \to \infty$.

2) We note that $\{A, 1 - A^2 - B^2\} \in \mathcal{J}_1$ by (30, 32, 14). Again by (54) we have

$$-2 \lim_{t\to\infty} \operatorname{tr} L_2 = \operatorname{tr} \Pi\{A, 1 - A^2 - B^2\} = \operatorname{tr} \Pi\{A, 1 - A^2 - B^2\}\Pi = 0,$$

since $\Pi = \Pi^2$.

3) $L_3$ equals the second line of the r.h.s. of (28), as seen from (33). Hence $\operatorname{tr} L_3 = 0$ follows from (16) for $p(\lambda) = (1 - 2\lambda)(\lambda - \lambda^2)$.

We can now summarize:

$$\lim_{t\to\infty} \operatorname{tr} f_t(A) = \operatorname{tr} A^3 + \tfrac{3}{4} \operatorname{tr}\{A, 1 - A^2 - B^2\},$$

which is (17).

As a final remark, we note that $\lim_{t\to\infty} \operatorname{tr} f_t(U P U^* - P)$, if existent, is invariant under trace class perturbations of $U$. This follows from (A.3). Similarly, as a possible replacement for Lemma 5, one has, without making recourse to Lemma 2,

$$\lim_{a\to\infty} \operatorname{tr} f_t\bigl(U_a g(H) U_a^* - g(H)\bigr) = 2\pi\sigma_B$$
uniformly in $t \ge 1$. This follows from the proof of Lemma 5 together with (26) and (A.2).

A. Appendix

Lemma A.1. Let $X = X^*$, $Y = Y^*$ and $t \ge 0$. For $X \in \mathcal{J}_3$,

$$\|f_t(X)\|_1 \le (1 + t)\|X\|_3^3. \tag{A.1}$$

If $X - Y \in \mathcal{J}_1$, then

$$\|f_t(X) - f_t(Y)\|_1 \le 3(1 + t^{-1})\|X - Y\|_1 \tag{A.2}$$

and

$$\lim_{t\to\infty} \operatorname{tr}\bigl(f_t(X) - f_t(Y)\bigr) = \operatorname{tr}(X - Y). \tag{A.3}$$

Proof. Equation (A.1) is evident from (10). From

$$f_t(\lambda) = (1 + t^{-1})\Bigl(\lambda - \frac{\lambda}{1 + t\lambda^2}\Bigr)$$

and from $X(1 + tY^2) - (1 + tX^2)Y = X - Y - tX(X - Y)Y$ we find

$$f_t(X) - f_t(Y) = (1 + t^{-1})\Bigl[X - Y - (1 + tX^2)^{-1}\bigl(X - Y - tX(X - Y)Y\bigr)(1 + tY^2)^{-1}\Bigr].$$

Using $\|(1 + tX^2)^{-1}\| \le 1$, $\|t^{1/2}X(1 + tX^2)^{-1}\| \le 1$ we obtain (A.2). Using furthermore

$$\operatorname*{s-lim}_{t\to\infty} t^{1/2}X(1 + tX^2)^{-1} = 0, \qquad \operatorname*{s-lim}_{t\to\infty} (1 + tX^2)^{-1} = \Pi_X,$$

where $\Pi_X$ is the projection onto the null space of $X$, together with (56), we obtain

$$f_t(X) - f_t(Y) \xrightarrow[t\to\infty]{} X - Y - \Pi_X(X - Y)\Pi_Y = X - Y$$

in trace norm.
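For commuting (in particular scalar) arguments, the rewriting of $f_t$ used in this proof and the bound (A.2) reduce to statements about real numbers. The sketch below checks both on a small rational grid; it covers only the $1 \times 1$ case, and the helper names `f`, `f_alt` and the grid are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

def f(t, lam):
    """f_t as defined in (10)."""
    return (1 + t) * lam**3 / (1 + t * lam**2)

def f_alt(t, lam):
    """The rewritten form (1 + 1/t) (lam - lam / (1 + t lam^2)) used in the proof."""
    return (1 + 1 / t) * (lam - lam / (1 + t * lam**2))

grid = [Fraction(k, 4) for k in range(-8, 9)]    # sample points in [-2, 2]
for t in (Fraction(1, 2), Fraction(1), Fraction(10)):
    for x in grid:
        assert f(t, x) == f_alt(t, x)            # the two forms agree exactly
    for x, y in product(grid, repeat=2):
        # scalar version of (A.2)
        assert abs(f(t, x) - f(t, y)) <= 3 * (1 + 1 / t) * abs(x - y)
```

The operator statement needs the non-commutative argument given in the proof; the scalar check only guards against sign or factor slips in the formulas.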
Lemma A.2. For $1 \le p < \infty$,

$$\|T\|_p \le \sum_{s} \Bigl(\sum_{r \in \mathbb{Z} \times \mathbb{N}} |T(r + s, r)|^p\Bigr)^{1/p}. \tag{A.4}$$
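The bound (A.4) controls a Schatten norm by summing, over offsets $s$, the $\ell^p$ norm of the $s$-th "diagonal" of the kernel. As a rough illustration, the sketch below checks the case $p = 2$ for a small finite matrix, where the Hilbert-Schmidt norm can be read off from the entries. The one-dimensional index set $\{0, \dots, n-1\}$ (instead of $\mathbb{Z} \times \mathbb{N}$), the helper names and the sample matrix are assumptions made for the example.

```python
import math

def hs_norm(T):
    """Hilbert-Schmidt norm (p = 2): square root of the sum of |T_ij|^2."""
    return math.sqrt(sum(abs(x) ** 2 for row in T for x in row))

def diagonal_bound(T, p):
    """Right-hand side of (A.4) for a finite n x n matrix: the sum over
    offsets s of the l^p norm of the s-th diagonal {T[r + s][r]}."""
    n = len(T)
    total = 0.0
    for s in range(-(n - 1), n):
        diag = [abs(T[r + s][r]) ** p for r in range(n) if 0 <= r + s < n]
        total += sum(diag) ** (1.0 / p)
    return total

T = [[2.0, -1.0, 0.5],
     [1.5, 0.0, -3.0],
     [0.25, 4.0, 1.0]]        # illustrative sample matrix

assert hs_norm(T) <= diagonal_bound(T, 2) + 1e-12
```

For $p = 2$ the diagonals are mutually orthogonal in the Hilbert-Schmidt inner product, so the inequality is just the comparison of an $\ell^2$ sum with an $\ell^1$ sum; other values of $p$ would require singular values and are not covered by this sketch.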
Proof. The case $p = 3$ is Eq. (4.11) in [1], and the proof given there applies to $1 \le p < \infty$.

Lemma A.3. i) Let $g \in C^\infty(\mathbb{R})$ with $\operatorname{supp} g'$ compact. Then, for any $N$,

$$|g(H)(r_1, r_2)| \le C_N (1 + |r_1 - r_2|)^{-N}. \tag{A.5}$$

ii) If furthermore $\operatorname{supp} g' \subset \Delta$, then, for some $\kappa > 0$,

$$\bigl|\bigl(J g(H) J^* - g(H_B)\bigr)(r_1, r_2)\bigr| \le C_N (1 + |r_1 - r_2|)^{-N} e^{-\kappa \min(|y_1|, |y_2|)}, \tag{A.6}$$

unless both $y_1, y_2 < 0$.
iii) If $G \in C_0^\infty(\mathbb{R})$ with $\operatorname{supp} G \subset \Delta$, then

$$|G(H)(r_1, r_2)| \le C_N (1 + |r_1 - r_2|)^{-N} e^{-\kappa(y_1 + y_2)}. \tag{A.7}$$

In particular, $e^{\kappa y} G(H)$ is a bounded operator.

Lemma A.4. Let $\chi', g', G \in C_0^\infty$ with $\operatorname{supp} G \subset \Delta$. Then

$$[H, \chi(x)]G(H), \quad [g(H), \chi(x)]G(H) \in \mathcal{J}_1 \tag{A.8}$$

and

$$\operatorname{tr}\bigl([g(H), \chi(x)]G(H)\bigr) = \operatorname{tr}\bigl([H, \chi(x)]g'(H)G(H)\bigr). \tag{A.9}$$

In [6], Chapter 2, or [12], Lemma B.1 the Helffer-Sjöstrand formula

$$g(H) = \frac{1}{2\pi} \int_{\mathbb{R}^2} \partial_{\bar z} \tilde g(z)\, (H - z)^{-1}\, dx\, dy, \qquad (z = x + iy), \tag{A.10}$$

is proven in the sense of a norm convergent integral for $H$ a self-adjoint operator on a Hilbert space $\mathcal{H}$ and, say, $g \in C_0^\infty$, where $\partial_{\bar z} = \partial_x + i\partial_y$ and $\tilde g$ is a quasi-analytic extension of $g$. For definiteness, let

$$\tilde g(z) = \sum_{k=0}^{N} g^{(k)}(x) \frac{(iy)^k}{k!} \chi(y),$$

with $N \ge 1$, and hence

$$\partial_{\bar z} \tilde g(z) = g^{(N+1)}(x) \frac{(iy)^N}{N!} \chi(y) + i \sum_{k=0}^{N} g^{(k)}(x) \frac{(iy)^k}{k!} \chi'(y), \tag{A.11}$$

where $\chi \in C_0^\infty$ is even and equals 1 in a neighborhood $(-\delta, \delta)$ of $y = 0$. In Lemma A.3 one is mainly interested in functions with $\operatorname{supp} g'$, but not $\operatorname{supp} g$, compact. The difference is of little importance, since, if $H$ were bounded above or below, one could trade the one for the other by adding a constant to $g$ and changing it outside of the spectrum. As we however do not want to resort to this assumption, we maintain that (A.10) still holds in the strong sense.

Proof of Lemma A.3. We claim that

$$g(H)\psi = \frac{1}{2\pi} \int dx\, dy\, \partial_{\bar z} \tilde g(z)\, (H - z)^{-1} \psi \tag{A.12}$$

for all $\psi \in \mathcal{H}$ and $g' \in C_0^\infty$. By the functional calculus it suffices to show that, if $\psi$ is dropped and $H$ replaced by $a \in \mathbb{R}$, the r.h.s. is (a) well-defined as an improper Riemann integral, and (b) agrees with $g(a)$. Indeed, all of $\partial_{\bar z} \tilde g$, except for the $k = 0$ term in (A.11), has compact support $K \subset \mathbb{R}^2$, and

$$|\partial_{\bar z} \tilde g(z) - i g(x)\chi'(y)| \le C|y|^N, \tag{A.13}$$

so that the analysis of [6, 12] still applies, except for the contribution from $i g(x)\chi'(y)$. The latter equals, using that $\chi'$ is odd,

$$\frac{i}{2\pi} \int dx\, g(x) \int_0^\infty dy\, \chi'(y)\bigl[(a - x - iy)^{-1} - (a - x + iy)^{-1}\bigr] = -\frac{1}{\pi} \int dx\, g(x) \int_0^\infty dy\, y\chi'(y)\bigl[(a - x)^2 + y^2\bigr]^{-1},$$

which is absolutely convergent. This proves (a); part (b) follows as, e.g., in [6, 12].
Let $R(r_1, r_2; z) = (H - z)^{-1}(r_1, r_2)$ be the Green function. We shall use the Combes-Thomas [5] estimates

$$|R(r_1, r_2; x + iy)| \le \frac{2}{|y|} e^{-\mu|r_1 - r_2|}, \tag{A.14}$$
$$\int |R(r_1, r_2; x + iy) - R(r_1, r_2; x - iy)|\, dx \le 12\sqrt{2}\,\pi\, e^{-\mu|r_1 - r_2|}, \tag{A.15}$$

which hold true provided

$$\sup_{r_0 \in \mathbb{Z} \times \mathbb{N}} \sum_{r} |H(r_0, r_0 + r)|\bigl(e^{\mu|r|} - 1\bigr) \le |y|/2. \tag{A.16}$$

They have been proven in this form in [1], Appendix D, Eqs. (D.3, D.4, D.11). Since $(e^{\mu|r|} - 1) \le (\mu/\mu_0)(e^{\mu_0|r|} - 1)$ for $0 \le \mu \le \mu_0$ we may take, by (6), $\mu = c|y|$ for $y \in \operatorname{supp} \chi$, where $c > 0$ is some small constant.

i) The contribution to (A.5) from the $k = 0$ term in (A.11) is, through (A.12),

$$\frac{i}{2\pi} \int dx\, g(x) \int_0^\infty dy\, \chi'(y)\bigl(R(r_1, r_2; x + iy) - R(r_1, r_2; x - iy)\bigr),$$

and is bounded in modulus by

$$6\sqrt{2}\, \|g\|_\infty \int_0^\infty dy\, |\chi'(y)|\, e^{-c|y||r_1 - r_2|} \le C e^{-c\delta|r_1 - r_2|},$$

by using (A.15). The remaining contribution is bounded using (A.14, A.13) as

$$C \int_K dx\, dy\, |\chi(y)|\, |y|^N\, \frac{2}{|y|}\, e^{-c|y||r_1 - r_2|} \le C_N (1 + c|r_1 - r_2|)^{-N},$$

since $K$ is compact.

ii) It suffices to establish a bound of the form $C e^{-2\kappa|y_1|}$ if $y_2 \ge 0$ for the l.h.s. of (A.6). In fact by applying that estimate to $\bar g$ we can interchange $y_1$ and $y_2$ in the bound, and hence replace it by $C e^{-2\kappa \min(|y_1|, |y_2|)}$ for $y_1, y_2$ as specified in the lemma. Moreover, we can also bound (A.6) by a constant times $(1 + |r_1 - r_2|)^{-2N}$ by virtue of (A.5), which applies to $H_B$ as well. Then (A.6) follows since $\min(a, b) \le (ab)^{1/2}$ for $a, b > 0$. For $y_2 \ge 0$ the matrix element (A.6) is $(J g(H) - g(H_B) J)(r_1, r_2)$. We use the resolvent identity

$$J(H - z)^{-1} - (H_B - z)^{-1} J = -(H_B - z)^{-1} E (H - z)^{-1}$$

in (A.12) and distinguish as before between the contribution, $I$, to (A.6) from $i g(x)\chi'(y)$, and the rest, $II$. Using again that $\chi'$ is odd and

$$(H_B - z)^{-1} E (H - z)^{-1} - (H_B - \bar z)^{-1} E (H - \bar z)^{-1} = \bigl[(H_B - z)^{-1} - (H_B - \bar z)^{-1}\bigr] E (H - z)^{-1} + (H_B - \bar z)^{-1} E \bigl[(H - z)^{-1} - (H - \bar z)^{-1}\bigr],$$

we have

$$I = -\frac{i}{2\pi} \int dx\, g(x) \int_0^\infty dy\, \chi'(y) \sum_{\substack{r \in \mathbb{Z}^2 \\ r' \in \mathbb{Z} \times \mathbb{N}}} \bigl(\Delta R_B(r_1, r; x + iy) E(r, r') R(r', r_2; x + iy) + R_B(r_1, r; x - iy) E(r, r') \Delta R(r', r_2; x + iy)\bigr),$$

where $\Delta R(r_1, r_2; z) = R(r_1, r_2; z) - R(r_1, r_2; \bar z)$. We use (A.14) for $R$, $R_B$ and (A.15) for $\Delta R$, $\Delta R_B$, and bound $e^{-c\delta|r' - r_2|}$ by 1. The result is

$$|I| \le \sum_{\substack{r \in \mathbb{Z}^2 \\ r' \in \mathbb{Z} \times \mathbb{N}}} |E(r, r')|\, |F_I(r_1, r, r', r_2)|, \qquad |F_I| \le \frac{\|g\|_\infty}{2\pi}\, 2 \int_0^\infty dy\, |\chi'(y)|\, 12\sqrt{2}\,\pi \cdot e^{-c\delta|r_1 - r|} \cdot \frac{2}{\delta},$$

so that by (7) $|I| \le C \sum_{r \in \mathbb{Z}^2} e^{-\mu_0|y|} e^{-c\delta|r_1 - r|}$. We may at this point assume $\mu_0 < c\delta$ and use $|y| \ge |y_1| - |r_1 - r|$, so that

$$|I| \le C e^{-\mu_0|y_1|} \sum_{r \in \mathbb{Z}^2} e^{-(c\delta - \mu_0)|r_1 - r|} \le C e^{-\mu_0|y_1|}. \tag{A.17}$$

Before turning to $II$ we note that $|y|$ in (A.14, A.16) can be replaced with $\operatorname{dist}(x + iy, \sigma(H))$. This follows by inspection of the proof, Eqs. (D.8–D.10) in [1]. By the spectral condition (1) and the assumption of (ii) we have $\operatorname{dist}(z, \sigma(H_B)) \ge d$ for some $d > 0$ and all $z \in \operatorname{supp} \partial_{\bar z} \tilde g$. Therefore,

$$II = -\frac{1}{2\pi} \int_K dx\, dy\, \bigl(\partial_{\bar z} \tilde g(z) - i g(x)\chi'(y)\bigr) \sum_{\substack{r \in \mathbb{Z}^2 \\ r' \in \mathbb{Z} \times \mathbb{N}}} R_B(r_1, r; x + iy) E(r, r') R(r', r_2; x + iy)$$

can be estimated as

$$|II| \le \sum_{\substack{r \in \mathbb{Z}^2 \\ r' \in \mathbb{Z} \times \mathbb{N}}} |E(r, r')|\, |F_{II}(r_1, r, r', r_2)|, \qquad |F_{II}| \le C \int_K dx\, dy\, |y|^N e^{-cd|r_1 - r|} \cdot \frac{2}{d} \cdot \frac{2}{|y|} \le C e^{-cd|r_1 - r|}.$$

We conclude as in (A.17).

iii) In this case $G(H_B) = 0$, and (A.7) follows from (A.6). The final remark follows e.g. from Holmgren's bound [7]:

$$\|A\| \le \max\Bigl(\sup_{r_1} \sum_{r_2} |A(r_1, r_2)|,\ \sup_{r_2} \sum_{r_1} |A(r_1, r_2)|\Bigr).$$

Proof of Lemma A.4. By $\|[H, \chi]G(H)\|_1 \le \|[H, \chi]e^{-\kappa y}\|_1 \|e^{\kappa y} G(H)\|$ and (A.7) we are left to show that $T = [H, \chi]e^{-\kappa y}$ is trace class. Its kernel is $T(r_1, r_2) = H(r_1, r_2)(\chi(x_2) - \chi(x_1))e^{-\kappa y_2}$. Since

$$|\chi(x_2) - \chi(x_1)| \le C\, \frac{|x_2 - x_1|(1 + |x_2 - x_1|)}{1 + x_2^2} \le C\, \frac{e^{\mu_0|x_2 - x_1|} - 1}{1 + x_2^2},$$

we have by (6)

$$\sum_{s} |T(r + s, r)| \le C\, \frac{e^{-\kappa y}}{1 + x^2},$$

which is summable w.r.t. $r = (x, y) \in \mathbb{Z} \times \mathbb{N}$. The first part of (A.8) thus follows by (A.4) with $p = 1$. The same proof with $H$ replaced by $g(H)$, except that (A.5) is used instead of (6), implies the second part of (A.8). Equation (A.12) implies, see [12], Eqs. (B.10, B.14),

$$[g(H), \chi] = -\frac{1}{2\pi} \int dx\, dy\, \partial_{\bar z} \tilde g(z)\, (H - z)^{-1} [H, \chi] (H - z)^{-1}, \tag{A.18}$$
$$g'(H) = -\frac{1}{2\pi} \int dx\, dy\, \partial_{\bar z} \tilde g(z)\, (H - z)^{-2}, \tag{A.19}$$

where the integrals are again meant in the strong sense. For the two sides of (A.9) we may write

$$\operatorname{tr}\bigl([g(H), \chi(x)]G(H)\bigr) = \operatorname{tr}\bigl(E_\Delta(H)[g(H), \chi(x)]G(H)E_\Delta(H)\bigr),$$
$$\operatorname{tr}\bigl([H, \chi(x)]g'(H)G(H)\bigr) = \operatorname{tr}\bigl(E_\Delta(H)[H, \chi(x)]g'(H)G(H)E_\Delta(H)\bigr).$$

We now multiply (A.19) from the left by $[H, \chi]$, and both (A.18, A.19) by $E_\Delta(H)$ from the left and by $G(H)E_\Delta(H)$ from the right. The integrals then become absolutely convergent in trace class norm. This follows from (A.13) and from

$$\|E_\Delta (H - z)^{-1} [H, \chi] (H - z)^{-1} G E_\Delta\|_1 \le \|[H, \chi]G\|_1\, \|(H - z)^{-1} E_\Delta\|^2,$$
$$\|E_\Delta [H, \chi] (H - z)^{-2} G E_\Delta\|_1 \le \|[H, \chi]G\|_1\, \|(H - z)^{-2} E_\Delta\|,$$

since $\|(H - x - iy)^{-p} E_\Delta\| \le C|x|^{-p}$ for large $x$. The traces can thus be carried inside the integral representations, where they are seen to be equal by cyclicity.

Acknowledgement. After completion of this work we learned from A. Klein that Lemma A.3(i) appeared in [9].
References
1. Aizenman, M., Graf, G.M.: Localization bounds for an electron gas. J. Phys. A31, 6783–6806 (1998)
2. Avron, J.E., Seiler, R., Simon, B.: Charge deficiency, charge transport and comparison of dimensions. Commun. Math. Phys. 159, 399–422 (1994)
3. Avron, J.E., Seiler, R., Simon, B.: The index of a pair of projections. J. Funct. Anal. 120, 220–237 (1994)
4. Bellissard, J., van Elst, A., Schulz-Baldes, H.: The non-commutative geometry of the quantum Hall effect. J. Math. Phys. 35, 5373–5451 (1994)
5. Combes, J.M., Thomas, L.: Asymptotic behaviour of eigenfunctions for multiparticle Schrödinger operators. Commun. Math. Phys. 34, 251–270 (1973)
6. Davies, E.B.: Spectral Theory of Differential Operators. Cambridge: Cambridge Univ. Press, 1995
7. Friedrichs, K.O.: Spectral Theory of Operators in Hilbert Space. Berlin-Heidelberg-New York: Springer, 1973
8. Fröhlich, J., Studer, U.M.: Gauge invariance and current algebra in nonrelativistic many-body theory. Rev. Mod. Phys. 65, 733–802 (1993)
9. Germinet, F., Klein, A.: Operator kernel estimates for functions of generalized Schrödinger operators. To appear in Proc. AMS; mp arc 01-289
10. Kato, T.: Perturbation Theory for Linear Operators. Berlin-Heidelberg-New York: Springer, 1966
11. Kellendonk, J., Richter, T., Schulz-Baldes, H.: Edge current channels and Chern numbers in the integer quantum Hall effect. Rev. Math. Phys. 14, 87–119 (2002)
12. Hunziker, W., Sigal, I.M.: Time-dependent scattering theory of N-body quantum systems. Rev. Math. Phys. 12, 1033–1084 (2000)
13. Schulz-Baldes, H., Kellendonk, J., Richter, T.: Simultaneous quantization of edge and bulk Hall conductivity. J. Phys. A33, L27–L32 (2000)
Communicated by B. Simon
Commun. Math. Phys. 229, 433–458 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0682-7
Communications in
Mathematical Physics
Fluctuations in the Composite Regime of a Disordered Growth Model
Janko Gravner1, Craig A. Tracy2, Harold Widom3
1 Department of Mathematics, University of California, Davis, CA 95616, USA. E-mail: [email protected]
2 Department of Mathematics and Institute of Theoretical Dynamics, University of California, Davis, CA 95616, USA. E-mail: [email protected]
3 Department of Mathematics, University of California, Santa Cruz, CA 95064, USA. E-mail: [email protected]
Received: 6 November 2001 / Accepted: 8 April 2002 Published online: 6 August 2002 – © Springer-Verlag 2002
Abstract: We continue to study a model of disordered interface growth in two dimensions. The interface is given by a height function on the sites of the one-dimensional integer lattice and grows in discrete time: (1) the height above the site x adopts the height above the site to its left if the latter height is larger, (2) otherwise, the height above x increases by 1 with probability px . We assume that px are chosen independently at random with a common distribution F , and that the initial state is such that the origin is far above the other sites. Provided that the tails of the distribution F at its right edge are sufficiently thin, there exists a nontrivial composite regime in which the fluctuations of this interface are governed by extremal statistics of px . In the quenched case, the said fluctuations are asymptotically normal, while in the annealed case they satisfy the appropriate extremal limit law.
1. Introduction Disordered systems, which are, especially in the context of magnetic materials, often referred to as spin glasses, have been the subject of much research since the pioneering work in the 1970s. The vast majority of this work is nonrigorous, based on simulations and techniques for which a proper mathematical foundation is yet to be developed. (See [MPV] for early developments and [Tal] for a nice overview of the mean field approach.) As a result, there is a large number of new and intriguing phenomena observed in these models which await rigorous treatment. Among the most fundamental of issues are the existence and the nature of a phase transition into a glassy or composite phase: below a critical temperature, the dynamics of a strongly disordered system becomes extremely slow with strong correlations, aging and localization effects and possibly many local equilibria. We refer the reader to [NSv] and [BCKM] and other papers in the same volume for reviews and pointers to the voluminous literature and to [NSt1] and [NSt2] for some recent rigorous results. In view of the difficulties associated with a detailed
understanding of realistic spinglass systems, other disordered models have been introduced, which are more amenable to existing probabilistic methods. One of the most successful of such (deceptively) simple models is the one–dimensional random walk with random rates [FIN1]. In this model, the walker waits at a site x ∈ Z for an exponential time with mean τx before jumping to either of its two neighbors with equal probability. The disorder variables τx are i.i.d. and quenched, that is, chosen at the beginning. Provided that the distribution of τx has sufficiently fat tails, namely, if P (τx ≥ t) decays for large t as t −α with α < 1, the walk exhibits aging and localization effects ([FIN1, FIN2]). Various one–dimensional voter models and stochastic Ising models at zero temperature can be explicitly represented with random walks. This connection has been explored to demonstrate glassy phenomena such as aging and chaotic time dependence ([FIN1, FINS]). The positive temperature versions of such results remain open problems, even in one dimension. In contrast with models which are exactly solvable in terms of random walks and are by now a classical subject in spatial processes ([Gri1, Lig]), techniques based on the RSK algorithm and random matrix theory have entered into the study of growth processes only recently ([BDJ, Joh1, Joh2, BR, PS, GTW1]). The purpose of this paper is to employ these new methods to prove the existence of a pure phase and a composite phase in a disordered growth model. It has been observed before in similar models [SK] that the role of temperature is for flat interfaces apparently played by their slope. In our case, the initial set is very far from flat and “temperature” is measured instead by the macroscopic direction (from the origin) of points on the boundary. We identify precisely the critical direction and demonstrate that the fluctuations asymptotics provide an order parameter that distinguishes the two phases. 
We emphasize that a hydrodynamic quantity, the asymptotic shape, has a discontinuity of the first derivative at the transition point, at which the shape changes from curved to flat. However, this does not signify the existence of a new phase, as kinks are common in many random growth models [GG]; thus a finer resolution is necessary. The particular model we investigate is Oriented Digital Boiling (ODB) (Feb. 12, 1996, Recipe at [Gri2], [Gra, GTW1, GTW2]), arguably the simplest interacting model for a growing interface in the two-dimensional lattice $\mathbb{Z}^2$. The occupied set, which changes in discrete time $t = 0, 1, 2, \dots$, is given by $A_t = \{(x, y) : x \in \mathbb{Z},\ y \le h_t(x)\}$. The initial state is a long stalk at the origin:
\[
h_0(x) = \begin{cases} 0, & \text{if } x = 0, \\ -\infty, & \text{otherwise,} \end{cases}
\]
while the time evolution of the height function $h_t$ is determined thus:
\[
h_{t+1}(x) = \max\{h_t(x-1),\ h_t(x) + \varepsilon_{x,t}\}.
\]
Here $\varepsilon_{x,t}$ are independent Bernoulli random variables, with $P(\varepsilon_{x,t} = 1) = p_x$. Although this model is simplistic, note that it does involve the roughening noise (random increases) as well as the smoothing surface tension effect (neighbor interaction), the basic characteristics of many growth and deposition processes. (See Sects. 5.1, 5.2 and 5.4 of [Mea] for an overview of simple models of ODB type as well as some other disordered growth processes.) We will assume, throughout this paper, that the disorder variables $p_x$ are initially chosen at random, independently with a common distribution $F(s) = P(p_x \le s)$. We use $\langle\,\cdot\,\rangle$ to denote integration with respect to $dF$ and label by $p$ a generic random variable with distribution $F$.
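The ODB update rule above can be illustrated by a minimal simulation sketch. This is not code from the paper: the function name `simulate_odb`, the restriction to finitely many sites 0, 1, ..., len(p) - 1, and the use of Python's standard `random` module are assumptions made for illustration only. Sites to the left of the origin are not stored, since from the stalk initial state they remain at minus infinity.

```python
import random


def simulate_odb(p, t_max, rng):
    """Run ODB from the stalk initial state on sites 0..len(p)-1 (hypothetical helper).

    p[x] is the quenched probability p_x attached to site x.
    Returns the list of heights h_t(x) at time t = t_max.
    """
    n_sites = len(p)
    minus_inf = float("-inf")
    # Stalk at the origin: h_0(0) = 0 and h_0(x) = -infinity otherwise.
    h = [0.0] + [minus_inf] * (n_sites - 1)
    for _ in range(t_max):
        new_h = []
        for x in range(n_sites):
            eps = 1 if rng.random() < p[x] else 0        # Bernoulli(p_x) coin flip
            left = h[x - 1] if x > 0 else minus_inf      # h_t(x - 1)
            new_h.append(max(left, h[x] + eps))          # h_{t+1}(x)
        h = new_h
    return h


if __name__ == "__main__":
    rng = random.Random(0)
    print(simulate_odb([0.3] * 10, 20, rng))
```

In the two degenerate environments the outcome is deterministic, which gives a quick sanity check: with all p_x = 1 the height at time t is t - x for 0 <= x <= t, and with all p_x = 0 it is 0 on the same range.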
It quickly turns out ([GTW1]) that fluctuations in ODB can be studied via equivalent increasing path problems. Start by constructing a random $m \times n$ matrix $A = A(F)$, with independent Bernoulli entries $\varepsilon_{i,j}$ such that $P(\varepsilon_{i,j} = 1) = p_j$, where, again, $p_j \overset{d}{=} p$ are i.i.d. Label columns as usual, but start rows at the bottom. We call a sequence of 1's in $A$ whose positions have column index nondecreasing and row index strictly increasing an increasing path in $A$, and denote by $H = H(m, n)$ the length of the longest increasing path. Then, under a simple coupling, $h_t(x) = H(t - x, x + 1)$ ([GTW1]). Thus we will concentrate our attention on the random matrix $A$ rather than the associated growth model. From now on we will also replace $p_i$ with its ordered sample, so that $p_1 \ge p_2 \ge \dots \ge p_n$ (see Sect. 2.2 of [GTW1]).
We initiated the study of ODB in a random environment in an earlier paper ([GTW2]), from which we now summarize the notation and the main results. Throughout, we denote by $b$ the right edge of the support of $dF$ and assume it is below 1, i.e., $b = \min\{s : F(s) = 1\} < 1$. Moreover, we fix an $\alpha > 0$ and assume that $n = \alpha m$. (Actually, $n = \lfloor \alpha m \rfloor$, but we omit the obvious integer parts.) As mentioned above, we can expect different behaviors for different slopes on the boundary of the asymptotic shape, which translates to different $\alpha$'s. To be more precise, we define the following critical values:
\[
\alpha_c' = \Big\langle \frac{p}{1-p} \Big\rangle^{-1}, \qquad \alpha_c = \Big\langle \frac{p(1-p)}{(b-p)^2} \Big\rangle^{-1}.
\]
Note that the second critical value is nontrivial, i.e., $\alpha_c > 0$, iff $\langle (b-p)^{-2} \rangle < \infty$. Next, define $c = c(\alpha, F)$ to be the time constant,
\[
c = c(\alpha, F) = \lim_{m\to\infty} H/m,
\]
which determines the limiting shape of $A_t$, namely $\lim A_t/t$, as $t \to \infty$. In Theorem 1 of [GTW2], it was found that $c$ exists a.s. and is given by
\[
c(\alpha, F) = \begin{cases}
b + \alpha(1-b)\,\langle p/(b-p) \rangle, & \text{if } \alpha \le \alpha_c, \\
a + \alpha(1-a)\,\langle p/(a-p) \rangle, & \text{if } \alpha_c \le \alpha \le \alpha_c', \\
1, & \text{if } \alpha_c' \le \alpha.
\end{cases}
\]
Here $a = a(\alpha, F) \in [b, 1]$ is the unique solution to $\alpha \langle p(1-p)/(a-p)^2 \rangle = 1$. In [GTW2], we also determined fluctuations in the pure regime $\alpha_c < \alpha < \alpha_c'$. (The deterministic regime $\alpha_c' < \alpha$ has no fluctuations.) The annealed fluctuations ([GTW2], Theorem 2) about the deterministic shape $c$ grow as $\sqrt{m}$ and are asymptotically normal:
\[
\frac{H - cm}{\tau_0 \sqrt{\alpha} \cdot m^{1/2}} \xrightarrow{\ d\ } N(0, 1)
\]
as $m \to \infty$, where $\tau_0^2 = \mathrm{Var}((1-a)p/(a-p))$. By contrast, quenched fluctuations conditioned on the state of the environment grow more slowly, as $m^{1/3}$, and satisfy the $F_2$-distribution known from random matrices ([TW1, TW2]). To formulate this result, we let $r_j = p_j/(1-p_j)$, define $u_n$ to be the solution of
\[
\frac{\alpha}{n} \sum_{j=1}^{n} \frac{r_j}{(1 + r_j u)^2} = \frac{1}{(u-1)^2} \tag{1.1}
\]
which lies in the interval $(-r_1^{-1}, 0)$. This solution exists provided that $\alpha n^{-1} \sum_{j=1}^{n} r_j < 1$, which holds a.s. for large $n$ as soon as $\alpha < \alpha_c'$. Next, set $c_n = c(u_n)$ where
\[
c(u) = \frac{1}{1-u} - \frac{\alpha}{n} \sum_{j=1}^{n} \frac{r_j u}{1 + r_j u}. \tag{1.2}
\]
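As a numerical illustration of (1.1) and (1.2), the following sketch locates the solution of (1.1) by bisection and then evaluates (1.2). It is not the authors' procedure; the names `solve_un` and `c_of_u` are hypothetical. Bisection is justified here because, on the interval between -1/r_1 and 0, the left side of (1.1) is decreasing in u while the right side is increasing, so the difference changes sign exactly once when the existence condition holds.

```python
def solve_un(p, alpha, tol=1e-12):
    """Solve (1.1) for u in (-1/r_1, 0) by bisection (hypothetical helper)."""
    r = [pj / (1.0 - pj) for pj in p]
    n = len(r)
    r1 = max(r)  # largest r_j, corresponding to the largest p_j

    def f(u):
        # Left side of (1.1) minus right side; strictly decreasing on (-1/r1, 0).
        s = sum(rj / (1.0 + rj * u) ** 2 for rj in r)
        return alpha * s / n - 1.0 / (u - 1.0) ** 2

    if f(0.0) >= 0.0:
        raise ValueError("no solution: need alpha * mean(r_j) < 1")
    lo, hi = -1.0 / r1, 0.0  # f tends to +infinity at lo and is negative at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def c_of_u(p, alpha, u):
    """Evaluate c(u) from (1.2)."""
    r = [pj / (1.0 - pj) for pj in p]
    n = len(r)
    return 1.0 / (1.0 - u) - alpha / n * sum(rj * u / (1.0 + rj * u) for rj in r)
```

In the homogeneous case p_j = 1/2 with alpha = 1/4, equation (1.1) can be solved by hand, giving u = -1/3 and c(u) = 7/8, which agrees with the pure-regime formula for c(alpha, F) with a = 3/4.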
Then ([GTW2], Theorem 3) there exists a constant $g_0 \ne 0$ so that
\[
P\Big( \frac{H - c_n m}{g_0^{-1} m^{1/3}} \le s \ \Big|\ p_1, \dots, p_n \Big) \to F_2(s),
\]
as $m \to \infty$, almost surely, for any fixed $s$.
For fluctuation results in this paper we need to impose some additional assumptions on $F$, which are best expressed in terms of $G(x) = 1 - F((b-x)-)$, the distribution function for $b - p$. First we list our weaker conditions:
(a) If $x, y \to 0$ and $x \sim y$, then $G(x) \sim G(y)$.
(b) If $x, y \to 0$ and $x = O(y)$, then $G(x) = O(G(y))$.
(c) As $x \to 0$, $G(x) = o(x^2 / \log x^{-1})$.
Our stronger assumptions on $F$ require that there exists a $\gamma > 0$ so that:
(a′) The function $G(x)/x^\gamma$ is nonincreasing in a neighborhood of $x = 0$.
(b′) $G(x) = O(x^2 / \log^\nu x^{-1})$ as $x \to 0$ for some $\nu > 2\gamma + 4$.
If $\alpha_c > 0$, then automatically $G(x) = o(x^2)$ as $x \to 0$. The stronger assumptions thus do not require much more: for nicely behaved $G$ they amount to $G(x) = O(x^2 / \log^\nu x^{-1})$ for some $\nu > 8$. The quenched and annealed fluctuations are now determined by the next two theorems.
Theorem 1. Assume that $0 < \alpha < \alpha_c$, let
\[
\tau^2 = b(1-b) \Big( \frac{1}{\alpha} - \frac{1}{\alpha_c} \Big),
\]
and let $\Phi$ be the standard normal distribution function. If (a)–(c) hold, then for any fixed $s$, as $m \to \infty$,
\[
P\Big( \frac{H - c_n m + 2\tau\sqrt{n}}{\tau \sqrt{n}} \le s \ \Big|\ p_1, \dots, p_n \Big) \to \Phi(s).
\]
Here, the convergence is in probability if (a)–(c) hold, and almost sure if (a′) and (b′) hold.
Theorem 2. Assume that $0 < \alpha < \alpha_c$, and that (a)–(c) hold. Then, for any fixed $s$,
\[
P\big( H \le cm - (1 - \alpha/\alpha_c)\, m\, G^{-1}(s/n) \ \big|\ p_1, \dots, p_n \big) \to e^{-s}
\]
in probability. In particular,
\[
P\big( H \le cm - (1 - \alpha/\alpha_c)\, m\, G^{-1}(s/n) \big) \to e^{-s}.
\]
Throughout, we follow the usual convention in defining $G^{-1}(x) = \sup\{y : G(y) < x\}$ to be the left continuous inverse of $G$, although any other inverse works as well. Assume, for simplicity, that, as $x \to 0$, $G(x)$ behaves as $x^\eta$ for some $\eta > 2$. Then, in contrast with the pure regime, the annealed fluctuations in the composite regime scale as $m^{1-1/\eta}$, while the quenched ones scale as $m^{1/2}$. In fact, this can be guessed from [GTW2]. Namely, as explained in Sect. 2 of that paper, the maximal increasing path has a nearly vertical segment of length asymptotic to $(1 - \alpha/\alpha_c)m$ in (or near) the column of $A$ which uses the largest probability $p_1$. Therefore, this vertical part of the path dominates the fluctuations, as the rest presumably has $o(\sqrt{m})$ fluctuations. (These are most likely not of the order exactly $m^{1/3}$ as they correspond to the critical case $\alpha = \alpha_c$. The precise nature of the critical fluctuations is an interesting open problem.) The variables in the $p_1$-column are Bernoulli with variances about $b(1-b)$, thus the contribution of the vertical part to the standard deviation is about $(b(1-b)(1-\alpha/\alpha_c)m)^{1/2} = \tau\sqrt{n}$. The annealed case then simply picks up the variation in the extremal statistic $p_1$. Simple as the above intuition may be, Theorems 1 and 2 are not so easy to prove and require considerable additional technical details. We also note the mysterious correction $2\tau\sqrt{n}$ in Theorem 1 for which we have no intuitive explanation. The fluctuation results in [GTW2] and the present paper thus sharply distinguish between two different phases of one particular growth model. Nevertheless, it seems natural to speculate that this phenomenon is universal in the sense that it occurs in other one-dimensional finite range dynamics of ODB type, started from a variety of initial states. Indeed, such universality has been established in other random matrix contexts [Sos].
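The two scaling exponents quoted in the heuristic discussion above can be checked by a short computation. The following is only a worked illustration under the simplifying assumption, not made in the theorems, that $G(x) = x^\eta$ exactly near $x = 0$; it uses $n = \alpha m$ and the definition of $\tau$ from Theorem 1.

```latex
% Quenched scale: the normalization in Theorem 1.
\[
\tau^2 n = b(1-b)\Big(\frac{1}{\alpha} - \frac{1}{\alpha_c}\Big)\,\alpha m
         = b(1-b)\Big(1 - \frac{\alpha}{\alpha_c}\Big)\, m ,
\qquad\text{so}\qquad \tau\sqrt{n} \asymp m^{1/2}.
\]
% Annealed scale: the centering shift in Theorem 2, assuming G(x) = x^\eta.
\[
G^{-1}(s/n) = (s/n)^{1/\eta}
\quad\Longrightarrow\quad
\Big(1 - \frac{\alpha}{\alpha_c}\Big)\, m\, G^{-1}(s/n)
  = \Big(1 - \frac{\alpha}{\alpha_c}\Big)\, s^{1/\eta}\, \alpha^{-1/\eta}\, m^{1 - 1/\eta}.
\]
```

Since $\eta > 2$ gives $1 - 1/\eta > 1/2$, the annealed scale dominates the quenched one under this assumption.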
Fluctuations of higher-dimensional versions seem much more elusive; it appears that a glassy transition should take place, but the fluctuation scalings could be completely different. To elucidate, we present some simulation results. In all of them, we start from the flat substrate $h_0 \equiv 0$ and use $F(s) = 1 - (1 - 2s)^\eta$, so that $b = 1/2$. It is expected that, as $\eta$ increases, the quenched fluctuation exponent experiences a sudden jump from 1/3 to 1/2. We simulate two dynamics, the ODB and the two-sided digital boiling (abbreviated simply as DB), given by $h_{t+1}(x) = \max\{h_t(x-1),\ h_t(x+1),\ h_t(x) + \varepsilon_{x,t}\}$. The top of Fig. 1 illustrates the ODB on 600 sites (with periodic boundary), run until time 600. The occupied sites are periodically colored so that the sites which become occupied at the same time are given the same color. On the left, $\eta = 1$ (i.e., $p$ is uniform on $[0, 1/2]$ and $\alpha_c = 0$), while $\eta = 3$ (and $\alpha_c > 0$) on the right. The darkly colored sites thus give the height of the surface at different times and provide a glimpse of its evolution. In the pure regime ($\eta = 1$), the boundary of the growing set reaches a local equilibrium ([SK]), while in the composite regime ($\eta = 3$) the boundary apparently divides into domains, which are populated by different equilibria and grow sublinearly. This is the mechanism that causes increasing fluctuations. The bottom of Fig. 1 confirms this observation; it features a log-log plot of quenched standard deviation (estimated over 1000 independent trials) of $h_t(0)$ vs. $t$ up to $t = 10\,000$. The $\eta = 1$ case is drawn with +'s and the $\eta = 3$ case with ×'s; the two least squares approximation lines (with slopes 0.339 and 0.517, respectively) are also drawn. We note that the asymptotic speed
Fig. 1. Evolution and quenched deviation in the two phases of disordered ODB
of this flat interface is known: $\lim_{t\to\infty} h_t(0)/t = \sup_{\alpha>0} (\alpha+1)^{-1} c(\alpha)$. Here is the reason: if ODB dynamics $h_t^i$, $h_t$ start from initial states $h_0^i$, $h_0 = \sup_i h_0^i$, respectively, and are coupled by using the same coin flips $\varepsilon_{x,t}$, then $h_t = \sup_i h_t^i$ for every $t$. Perhaps surprisingly, it appears that the phase transition in the DB does not occur at $\eta = 2$, and in general the delineation is much murkier. At this point, we cannot even eliminate the possibility of continuous dependence of the fluctuation exponent on $\eta$. In Fig. 2, we present the results of simulations for $\eta = 0.2$ (left) and $\eta = 1$ (right). The top figures only show evolution near time $t = 5000$, as no difference is readily apparent at earlier times. The plot of quenched deviations is analogous to the one in Fig. 1, with the least squares slopes 0.395 ($\eta = 0.2$) and 0.49 ($\eta = 1$). The organization of the rest of the paper is as follows. Section 2 reviews the set-up from [GTW1, GTW2], in Sect. 3 we prove the relevant asymptotic properties of the order statistics and of the solutions of (1.1) and (1.2), and demonstrate how Theorem 2 follows from Theorem 1. Section 4 is a detailed analysis of the asymptotic behavior of steepest descent curves. The proof of convergence in probability in Theorem 1 is then concluded in Sect. 4. Finally, Sect. 5 strengthens the results of Sect. 3 (under the stronger conditions) so that almost sure convergence is implied.
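The simulation set-up described in the Introduction (disorder drawn from $F(s) = 1 - (1-2s)^\eta$, ODB and DB dynamics with periodic boundary, and the coupling through shared coin flips) can be sketched as follows. This is an illustrative reconstruction, not the code used for Figs. 1 and 2; the names `sample_p`, `step` and `run` are hypothetical, and the disorder is sampled by inverse transform, which is an assumption about method rather than something stated in the text.

```python
import random


def sample_p(eta, rng):
    """Inverse-transform sample from F(s) = 1 - (1 - 2s)**eta on [0, 1/2]."""
    u = rng.random()
    return 0.5 * (1.0 - (1.0 - u) ** (1.0 / eta))


def step(h, eps, two_sided):
    """One update of ODB (two_sided=False) or DB (two_sided=True) on a ring."""
    n = len(h)
    new_h = []
    for x in range(n):
        best = max(h[x - 1], h[x] + eps[x])   # h[-1] wraps around: periodic boundary
        if two_sided:
            best = max(best, h[(x + 1) % n])  # right neighbor, DB only
        new_h.append(best)
    return new_h


def run(h0, p, t_max, seed, two_sided=False):
    """Evolve h0 for t_max steps; equal seeds give equal coin flips (the coupling)."""
    rng = random.Random(seed)
    h = list(h0)
    for _ in range(t_max):
        eps = [1 if rng.random() < px else 0 for px in p]
        h = step(h, eps, two_sided)
    return h
```

Running two initial states and their pointwise maximum with the same `seed` reproduces the identity $h_t = \sup_i h_t^i$ used in the text, because each update is a maximum of shifted copies of the current heights.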
2. The Basic Set-up We recall how we approached these problems in [GTW1, GTW2]. The starting point is the identity P (H ≤ h) = det (I − Kh ),
Fig. 2. Evolution and quenched deviation in disordered DB
where $K_h$ is the infinite matrix acting on $\ell^2(\mathbb{Z}_+)$ with $(j, k)$-entry
\[
K_h(j, k) = \sum_{\ell=0}^{\infty} (\varphi_-/\varphi_+)_{h+j+\ell+1}\, (\varphi_+/\varphi_-)_{-h-k-\ell-1}.
\]
The subscripts denote Fourier coefficients and the functions $\varphi_\pm$ are given by
\[
\varphi_+(z) = \prod_{j=1}^{n} (1 + r_j z), \qquad \varphi_-(z) = (1 - z^{-1})^{-m}.
\]
The matrix $K_h$ is the product of two matrices, with $(j, k)$-entries given by
\[
(\varphi_+/\varphi_-)_{-h-j-k-1} = \frac{1}{2\pi i} \int \prod_{j=1}^{n} (1 + r_j z)\, (z-1)^m\, z^{-m+h+j+k}\, dz,
\]
\[
(\varphi_-/\varphi_+)_{h+j+k+1} = \frac{1}{2\pi i} \int \prod_{j=1}^{n} (1 + r_j z)^{-1}\, (z-1)^{-m}\, z^{m-h-j-k-2}\, dz.
\]
The contours for both integrals go around the origin once counterclockwise; in the second integral 1 is on the inside and all the $-r_j^{-1}$ are on the outside.
If $h = c_n m + h'$ we have
\[
(\varphi_+/\varphi_-)_{-h-j-k-1} = \frac{1}{2\pi i} \int \psi(z)\, z^{h'+j+k}\, dz, \tag{2.1}
\]
\[
(\varphi_-/\varphi_+)_{h+j+k+1} = \frac{1}{2\pi i} \int \psi(z)^{-1}\, z^{-h'-j-k-2}\, dz, \tag{2.2}
\]
where
\[
\psi(z) = \prod_{j=1}^{n} (1 + r_j z)\, (z-1)^m\, z^{-(1-c_n) m}.
\]
The idea is to apply steepest descent to the above integrals. If $\sigma(z) = m^{-1} \log \psi(z)$, then
\[
\sigma'(z) = \frac{\alpha}{n} \sum_{j=1}^{n} \frac{r_j}{1 + r_j z} + \frac{1}{z-1} + \frac{c_n - 1}{z} \tag{2.3}
\]
and, with $u_n$ and $c_n$ as defined above, $\sigma'(u_n) = \sigma''(u_n) = 0$. The steepest descent curves both pass through $u_n$. As $n \to \infty$ the zeros/poles $-r_j^{-1}$ accumulate on the half-line $(-\infty, \xi]$, where $\xi = 1 - b^{-1}$. In the pure regime the points $u_n$ and the curves are bounded away from this half-line, behave regularly and have nice limits. However, in the composite regime the points and curves come very close to $\xi$, their behavior is not so simple, and we apply steepest descent not quite as described.
3. Preliminary Lemmas I: Properties of $p_n$, $u_n$, and $c_n$
Until Sect. 5, we assume that all limits are in probability, unless otherwise indicated. To prove the first part of Theorem 1 and Theorem 2, we thus assume that (a)–(c) hold. We let $q_j = b - p_j$, so that $q_1, \dots, q_n$ are chosen independently according to the distribution function $G$, then ordered so that $q_1 \le q_2 \le \dots \le q_n$. Let $t_1 < t_2 < \dots < t_n$ be an ordered sample of i.i.d. uniform $(0, 1)$ random variables. Then we may construct the $G$-sample by setting $q_j = G^{-1}(t_j)$. We will also use the well-known fact that, given $t_j$, the conditional distribution of $t_1, \dots, t_{j-1}$ is that of an ordered sample of $j - 1$ uniforms on $[0, t_j]$.
Lemma 3.1. There exists a positive constant $c_1$ so that $x \le G(G^{-1}(x)) \le x/c_1$ for $x \in (0, 1)$. Moreover, $G(G^{-1}(x)) \sim x$ as $x \to 0$.
Proof. Write the complement of the range of $G$ as $\cup_i I_i$, where $I_i$ are disjoint and either of the form $[a_i, b_i)$ or $(a_i, b_i)$. If $x \in (0, 1)$ is in the range of $G$, then $G(G^{-1}(x)) = x$; otherwise, if $x \in I_i$, $G(G^{-1}(x)) = b_i$. By (a), $b_i \sim a_i$ if $a_i \to 0$. The last sentence in the statement is then proved, and the first follows.
Lemma 3.2. With $c_1$ as in Lemma 3.1, for $\eta < 1$ and $j \ge 2$,
\[
P\big( G(q_1) > \eta\, G(q_j) \big) \le (1 - c_1 \eta)^{j-1}.
\]
Proof. By Lemma 3.1 and remarks preceding it,
\[
P\big( G(q_1) > \eta\, G(q_j) \big) \le P( t_1 > c_1 \eta\, t_j ) = (1 - c_1 \eta)^{j-1}.
\]
Lemma 3.3. $\lim_{n\to\infty} P\big( q_1 \le G^{-1}(s/n) \big) = 1 - e^{-s}$.
Proof. Fix an $\varepsilon > 0$. First, by monotonicity of $G^{-1}$, $t_1 \le s/n$ implies $q_1 \le G^{-1}(s/n)$. Second, by Lemma 3.1 and the monotonicity of $G$ we have that, for large enough $n$, $q_1 \le G^{-1}(s/n)$ implies $t_1 \le G(G^{-1}(t_1)) = G(q_1) \le G(G^{-1}(s/n)) \le (1 + \varepsilon)s/n$. These give the inequalities $P(q_1 \le G^{-1}(s/n)) \ge 1 - (1 - s/n)^n$ and $P(q_1 \le G^{-1}(s/n)) \le 1 - (1 - (1 + \varepsilon)s/n)^n$. The statement of the lemma now follows upon first letting $n \to \infty$ and then $\varepsilon \to 0$.
Remark. It follows from Lemma 3.3, and the fact that $G(x) = o(x^2)$ near $x = 0$, that $n^{1/2} q_1 \to \infty$ as $n \to \infty$.
Lemma 3.4. With high probability $q_1/q_2$ is bounded away from 1 as $n \to \infty$. More precisely, for every $\eta > 0$ there is a $\delta > 0$ such that $P(q_1 \le (1 - \delta)\, q_2) \ge 1 - \eta$ for large enough $n$.
Proof. It follows from Lemma 3.1 that for every $\eta > 0$ there exists a $\delta_1 > 0$ so that the following implication holds for $t_2 < \delta_1$: if $G(q_1) > (1 - \delta_1) G(q_2)$ then $t_1 > (1 - \eta) t_2$. Furthermore, by the assumption (a), there exists a $\delta \in (0, \delta_1)$ so that, for $t_2 < \delta$, $q_1 > (1 - \delta) q_2$ implies $G(q_1) > (1 - \delta_1) G(q_2)$. Therefore, $P(q_1 > (1 - \delta) q_2) \le P(t_1 > (1 - \eta) t_2) + P(t_2 > \delta) = \eta + P(t_2 > \delta)$, and the proof is concluded since $t_2 \to 0$ a.s.
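The limit in Lemma 3.3 can be illustrated numerically in the simplest situation. The sketch below assumes, beyond what the lemma requires, that $G$ is continuous and strictly increasing near 0, so that the event $q_1 \le G^{-1}(s/n)$ coincides with $t_1 \le s/n$ for the minimum $t_1$ of $n$ uniforms; the function name `prob_min_uniform_below` is hypothetical.

```python
import math


def prob_min_uniform_below(s, n):
    """Exact P(t_1 <= s/n) for the minimum t_1 of n i.i.d. uniform(0,1) variables."""
    return 1.0 - (1.0 - s / n) ** n


if __name__ == "__main__":
    # Compare the finite-n probability with the limit 1 - exp(-s) for s = 1.
    for n in (10, 100, 10_000):
        print(n, prob_min_uniform_below(1.0, n), 1.0 - math.exp(-1.0))
```

The finite-n values decrease towards the limit as n grows, in line with the lower bound $1 - (1 - s/n)^n$ appearing in the proof.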
Lemma 3.5. $n^{-1} \sum_{j=1}^{n} q_1/q_j^3 \to 0$ as $n \to \infty$.
Proof. For any fixed $k$ we have $n^{-1} \sum_{j=1}^{k} q_1/q_j^3 \le k/(n q_1^2) \to 0$. Also, $n^{-1} \sum_{j=k+1}^{n} q_j^{-2} < \langle q^{-2} \rangle + 1$ a.s. for large $n$. Let $\delta > 0$ be given. By the above paragraph, it suffices to show that
\[
\limsup_{n\to\infty} P\Big( \frac{q_1}{q_{k+1}} > \delta \Big)
\]
will be arbitrarily small for sufficiently large $k$. Now, from the assumption (b), it follows that for some $\eta > 0$ we have $G(q_1) > \eta\, G(q_{k+1})$ whenever $q_1 > \delta q_{k+1}$ and $q_1 < \eta$. With this $\eta$ (which we may assume is less than 1) we have, from Lemma 3.2,
\[
P\Big( \frac{q_1}{q_{k+1}} > \delta \Big) \le (1 - c_1 \eta)^k + P(q_1 \ge \eta),
\]
which is clearly enough.
From now on $\{\varphi_n\}$ will denote a sequence of random variables satisfying $\varphi_n = o(q_1)$. Since $q_1 \gg n^{-1/2}$ we shall assume when convenient that also $\varphi_n \gg n^{-1/2}$. In the statement of the next lemma, the expression $O(\varphi_n)$ could have been replaced by the less awkward $o(q_1)$. The reasons for the present statement are that the substitute for this lemma (Lemma 6.2) when we consider almost sure convergence will have this form, and that the same sequence $\{\varphi_n\}$ will appear in later lemmas.
Lemma 3.6. Let $\{v_n\}$ be a sequence of points in a disc with diameter the real interval $[-r_1^{-1} - O(\varphi_n),\ \xi]$. Then
\[
\lim_{n\to\infty} \frac{1}{n} \sum_{j=2}^{n} \frac{r_j}{(1 + r_j v_n)^2} = \Big\langle \frac{r}{(1 + r\xi)^2} \Big\rangle.
\]
Proof. Write $v_n = (b_n - 1)/b_n$. Then if we recall that $\xi = (b-1)/b$ and $p_j = b - q_j$ we see that $b - b_n$ lies in a disc with diameter $[0,\ q_1 + O(\varphi_n)]$ and that
\[
\frac{1}{n} \sum_{j=2}^{n} \frac{r_j}{(1 + r_j v_n)^2} = \frac{1}{n} \sum_{j=2}^{n} \frac{b_n^2\, (b - q_j)(1 - b + q_j)}{(b_n - b + q_j)^2}.
\]
If we subtract from this the same expression with $b_n$ replaced by $b$, that is,
\[
\frac{1}{n} \sum_{j=2}^{n} \frac{b^2\, (b - q_j)(1 - b + q_j)}{q_j^2}, \tag{3.1}
\]
we obtain
\[
\frac{1}{n} \sum_{j=2}^{n} \Big( \frac{b_n^2}{(b_n - b + q_j)^2} - \frac{b^2}{q_j^2} \Big) (b - q_j)(1 - b + q_j). \tag{3.2}
\]
We shall show that this is $o(1)$. Assuming this for the moment, we can finish the proof by first noting that we may, with error $o(1)$, start the sum in (3.1) at $j = 1$ since $q_1 \gg n^{-1/2}$, and then (3.1) has the a.s. limit
\[
\Big\langle \frac{b^2\, (b - q)(1 - b + q)}{q^2} \Big\rangle = \Big\langle \frac{r}{(1 + r\xi)^2} \Big\rangle.
\]
It remains to show that (3.2) is $o(1)$. If we replace the numerator $b^2$ on the right by $b_n^2$, the error is $o(1)$, since $n^{-1} \sum_{j=2}^{n} q_j^{-2}$ is a.s. bounded. If we make this replacement then what we obtain is bounded by a constant times
\[
\frac{1}{n} \sum_{j=2}^{n} \frac{\big| (b_n - b)^2 + 2 (b_n - b)\, q_j \big|}{q_j^2\, |b_n - b + q_j|^2}.
\]
Since $|b - b_n| \le q_1 + O(\varphi_n) = q_1 + o(q_1)$ it follows from Lemma 3.4 that $|b_n - b + q_j|$ is at least a constant times $q_j$ for large $n$ and so the above is at most a constant times
\[
\frac{1}{n} \sum_{j=2}^{n} \frac{|b_n - b|}{q_j^3} \le \frac{1}{n} \sum_{j=2}^{n} \frac{q_1}{q_j^3},
\]
and by Lemma 3.5 this is $o(1)$.
We denote
\[
\theta = 1 - \alpha/\alpha_c, \qquad \beta = \Big( \frac{(1-b)\,\alpha}{b^3\, \theta} \Big)^{1/2}. \tag{3.3}
\]
Lemma 3.7. We have $u_n = -r_1^{-1} + \beta n^{-1/2} + o(n^{-1/2})$ as $n \to \infty$.
Proof. We show first that $u_n \ge \xi$ cannot occur for arbitrarily large $n$. If it did, then we would have, using Eq. (1.1) for $u_n$,
\[
b^2 = \frac{1}{(\xi - 1)^2} \le \frac{1}{(u_n - 1)^2} \le \frac{\alpha}{n} \frac{r_1}{(1 + r_1 \xi)^2} + \frac{\alpha}{n} \sum_{j=2}^{n} \frac{r_j}{(1 + r_j \xi)^2}.
\]
It follows from the remark following Lemma 3.3 that the first term on the right is $o(1)$ and from Lemma 3.6 that the second term on the right has limit
\[
\alpha \Big\langle \frac{r}{(1 + r\xi)^2} \Big\rangle = \alpha b^2 \Big\langle \frac{p(1-p)}{(b-p)^2} \Big\rangle < b^2
\]
since we are in the composite regime. This contradiction shows that $u_n \le \xi$ for sufficiently large $n$, and so $u_n \in [-r_1^{-1}, \xi]$. By Lemma 3.6 again,
\[
\frac{\alpha}{n} \sum_{j=2}^{n} \frac{r_j}{(1 + r_j u_n)^2} \to \alpha \Big\langle \frac{r}{(1 + r\xi)^2} \Big\rangle = b^2 \alpha/\alpha_c.
\]
Therefore Eq. (1.1) for $u_n$ becomes
\[
\frac{\alpha}{n} \frac{r_1}{(1 + r_1 u_n)^2} = \frac{1}{(\xi - 1)^2} - \alpha \Big\langle \frac{r}{(1 + r\xi)^2} \Big\rangle + o(1) = b^2 \theta + o(1).
\]
Since $r_1 = b/(1-b) + o(1)$ we find that the solution is as stated.
Next, we see how $c_n$ behaves.
Lemma 3.8. We have $c_n = c(\alpha, F) - \theta q_1 + o(q_1)$ as $n \to \infty$, where $\theta$ is given in (3.3).
Proof. Write
\[
c_n = \frac{1}{1 - u_n} - \frac{\alpha}{n} \sum_{j=2}^{n} \frac{r_j u_n}{1 + r_j u_n} - \frac{\alpha}{n} \frac{r_1 u_n}{1 + r_1 u_n}. \tag{3.4}
\]
By Lemma 3.7, the last term above is $O(n^{-1/2})$. Equation (1.1) tells us that
\[
\frac{d}{du} \Big( \frac{1}{1-u} - \frac{\alpha}{n} \sum_{j=1}^{n} \frac{r_j u}{1 + r_j u} \Big) \Big|_{u = u_n} = 0,
\]
and so
\[
\frac{d}{du} \Big( \frac{1}{1-u} - \frac{\alpha}{n} \sum_{j=2}^{n} \frac{r_j u}{1 + r_j u} \Big) \Big|_{u = u_n} = \frac{\alpha}{n} \frac{r_1}{(1 + r_1 u_n)^2} = \frac{\alpha}{r_1 \beta^2} + o(1) = \frac{\alpha (1-b)}{b \beta^2} + o(1).
\]
By Lemma 3.6 and its proof, with an error $o(1)$ the derivative of the expression in the parentheses above equals in $[u_n, \xi]$ what it equals at $u = \xi$, so the above holds with $u_n$ replaced by any point in this interval. From this and (3.4) we get
\[
c_n = c(u_n) = c(\xi) - \frac{\alpha (1-b)}{b \beta^2} (\xi - u_n) + o(\xi - u_n).
\]
We have
\[
\xi - u_n = 1 - b^{-1} + r_1^{-1} + O(n^{-1/2}) = p_1^{-1} - b^{-1} + O(n^{-1/2}) = \frac{q_1}{b^2} + o(q_1),
\]
where we have used the fact that $q_1 \gg n^{-1/2}$. Thus
\[
c_n = c(\xi) - \frac{\alpha (1-b)}{b^3 \beta^2}\, q_1 + o(q_1).
\]
Finally, as $\langle (b-p)^{-2} \rangle < \infty$, we can use the central limit theorem to conclude that $c(\xi) = c(\alpha, F) + O(n^{-1/2})$, which completes the proof.
Remark. Lemmas 3.3 and 3.8 show that Theorem 2 follows from the part of Theorem 1 on convergence in probability.
4. Preliminary Lemmas II: Steepest Descent Curves
Now we go to our integrals (2.1) and (2.2). We are not going to apply steepest descent with $\psi$ as the main integrand, but rather with the function $\psi_1$ which is $\psi$ with the factor $1 + r_1 z$ removed. It is convenient to introduce the notation
\[
\psi_1(z, c) = \prod_{j=2}^{n} (1 + r_j z)\, (z-1)^m\, z^{-(1-c) m},
\]
where $c > 0$. (This parameter is not to be confused with the time constant $c = c(\alpha, F)$ defined earlier.) Thus $\psi_1(z) = \psi_1(z, c_n)$ in this notation. We also define the integrals
\[
I^+(c) = \frac{1}{2\pi i} \int (1 + r_1 z)\, \psi_1(z, c)\, dz, \qquad I^-(c) = \frac{1}{2\pi i} \int (1 + r_1 z)^{-1}\, \psi_1(z, c)^{-1}\, z^{-2}\, dz.
\]
(Since $I^+(c) = 0$ when $c \ge 1$ we always assume that $c < 1$.) Notice that these are exactly the integrals (2.1) and (2.2) when we set $c = c_n + (h' + j + k)/m$. Since $j, k \ge 0$ and we will eventually set $h' = s n^{1/2}$, we may also assume that
\[
c \ge c_n - O(n^{-1/2}). \tag{4.1}
\]
To apply steepest descent to $I^\pm(c)$ we must locate the critical points and determine the critical values of $\psi_1(z, c)$. Thus we define
\[
\sigma_1(z, c) = \frac{1}{m} \log \psi_1(z, c),
\]
so that
\[
\sigma_1'(z, c) = \frac{\alpha}{n} \sum_{j=2}^{n} \frac{r_j}{1 + r_j z} + \frac{1}{z-1} + \frac{c-1}{z}.
\]
As before, if the parameter $c$ does not appear we take it to be $c_n$, e.g., $\sigma_1(z) = \sigma_1(z, c_n)$. So
\[
\sigma_1'(z) = \frac{1}{m} \big( \log \psi_1(z) \big)' = \sigma'(z) - \frac{\alpha}{n} \frac{r_1}{1 + r_1 z}.
\]
Using $\sigma'(u_n) = \sigma''(u_n) = 0$ we get from the above and Lemma 3.7 that
\[
\sigma_1'(u_n) = -\frac{\alpha}{\beta \sqrt{n}} (1 + o(1)), \qquad \sigma_1''(u_n) = \frac{\alpha}{\beta^2} (1 + o(1)). \tag{4.2}
\]
To determine the critical values of $\sigma_1(z, c)$ let us first find the value of $c$ for which its derivative has a double zero. (This is the analogue of the quantity $c_n$ for $\sigma(z)$.) For this we use the analogue of (1.1) and (1.2) but where the terms corresponding to $j = 1$ are dropped from the sums. If we call the solution of this analogue of (1.1) $\bar{u}$ and set $\bar{c} = c(\bar{u})$ then $\sigma_1'(z, \bar{c})$ has a double zero at $\bar{u}$. In analogy with $u_n$, we know that $\bar{u}$ is to the right of and within $O(n^{-1/2})$ of $-r_2^{-1}$. As for $\bar{c}$, we use Lemma 3.8, its analogue where the sums in (1.1) and (1.2) start with $j = 2$, as well as Lemma 3.4, to see that to a first approximation
\[
\bar{c} = c_n - \theta (q_2 - q_1)
\]
and that $q_2 - q_1 \gg n^{-1/2}$. From this and (4.1) we see that $c > \bar{c}$.
Using subscripts for derivatives now, we have
\[
\sigma_{1z}(\bar{u}, \bar{c}) = \sigma_{1zz}(\bar{u}, \bar{c}) = 0,
\]
and we want to see how the critical points $u_c^\pm$ of $\sigma_1(z, c)$ move away from $\bar{u}$ as $c$ increases from $\bar{c}$. (Here we take $u_c^- < u_c^+$.) The function $\sigma_{1z}(z, \bar{c})$ vanishes at $\bar{u}$ and is otherwise positive in $(-r_2^{-1}, 0)$. It follows that for $c$ close to but larger than $\bar{c}$ we have $u_c^- < \bar{u} < u_c^+$. Differentiating $\sigma_{1z}(u_c^\pm, c) = 0$ with respect to $c$ gives
\[
0 = \sigma_{1zz}(u_c^\pm, c)\, \frac{du_c^\pm}{dc} + \sigma_{1zc}(u_c^\pm, c) = \sigma_{1zz}(u_c^\pm, c)\, \frac{du_c^\pm}{dc} + \frac{1}{u_c^\pm}. \tag{4.3}
\]
Since $u_c^\pm < 0$ it follows that $du_c^\pm/dc \ne 0$, and so each of $u_c^\pm$ is either a decreasing or increasing function of $c$ for $c > \bar{c}$. From their behavior that we already know for $c$ close to $\bar{c}$ we deduce that $u_c^+$ increases and $u_c^-$ decreases as $c$ increases. In particular, $u_c^-$ is even closer to $-r_2^{-1}$ than $\bar{u}$. We remark that from (4.3) and the signs of $du_c^\pm/dc$ we deduce
\[
\sigma_{1zz}(u_c^+, c) > 0, \qquad \sigma_{1zz}(u_c^-, c) < 0. \tag{4.4}
\]
Next we shall determine the asymptotics of the critical values $\sigma_1(u_c^\pm, c)$. The sequence $\{\varphi_n\}$ is as described before Lemma 3.6.
Lemma 4.1. For $c - c_n = O(\varphi_n)$,
\[
\sigma_1(u_c^+, c) = \sigma_1(-r_1^{-1}, c) - \frac{r_1^2 \beta^2}{2\alpha} \Big( c - c_n + \frac{2\alpha}{r_1 \beta} (1 + o(1))\, n^{-1/2} \Big)^2, \tag{4.5}
\]
and for all $c \ge c_n$,
\[
\sigma_1(u_c^+, c) < \sigma_1(-r_1^{-1}, c) - \eta\, n^{-1/2} (c - c_n) + O(n^{-1}) \tag{4.6}
\]
for some $\eta > 0$. Moreover for all $c$
\[
\sigma_1(u_c^-, c) > \sigma_1(-r_1^{-1}, c) + \varphi_n^2
\]
when $n$ is sufficiently large.
Remark. In these and analogous inequalities below we think of $\sigma_1$ as actually meaning $\Re\, \sigma_1$.
Proof. Consider first the case $c = c_n$. We have
\[
\sigma_1(u_n + \zeta) = \sigma_1(u_n) + \sigma_1'(u_n)\, \zeta + \zeta^2 \int_0^1 (1 - t)\, \sigma_1''(u_n + t\zeta)\, dt.
\]
If $\zeta = O(\varphi_n)$ then it follows from Lemma 3.6 that $\sigma_1''(u_n + t\zeta) = \sigma_1''(u_n) + o(1)$. Hence, by (4.2), we have for such $\zeta$
\[
\sigma_1(u_n + \zeta) = \sigma_1(u_n) - \frac{\alpha}{\beta \sqrt{n}}\, \zeta + \Big( \frac{\alpha}{2\beta^2} + o(1) \Big) \zeta^2. \tag{4.7}
\]
This has zero derivative for
\[
\zeta = \frac{\beta}{\sqrt{n}} (1 + o(1))
\]
and it follows that
\[
u_{c_n}^+ = u_n + \frac{\beta}{\sqrt{n}} (1 + o(1)) = -r_1^{-1} + \frac{2\beta}{\sqrt{n}} (1 + o(1)). \tag{4.8}
\]
(This critical value must be $u_{c_n}^+$ rather than $u_{c_n}^-$ since the latter is within $O(n^{-1/2})$ of $-r_2^{-1}$.) From this and (4.7), taking $\zeta = -r_1^{-1} - u_n = -(\beta + o(1))\, n^{-1/2}$ and $\zeta = u_{c_n}^+ - u_n = (\beta + o(1))\, n^{-1/2}$ and subtracting, it follows that
\[
\sigma_1(u_{c_n}^+) = \sigma_1(-r_1^{-1}) - 2(\alpha + o(1))\, n^{-1}. \tag{4.9}
\]
To determine the behavior of $u_c^+$ and $\sigma_1(u_c^+, c)$ for more general $c$ we assume first that
\[
c = c_n + o(1), \qquad u_c^+ = u_n + O(\varphi_n) = -r_1^{-1} + O(\varphi_n).
\]
Then
\[
\sigma_{1zz}(u_c^+, c) = \sigma_1''(u_n) - \frac{c - c_n}{(u_c^+)^2} = \frac{\alpha}{\beta^2} + o(1)
\]
by (4.2). Therefore (4.3) gives
\[
\frac{du_c^+}{dc} = -(\beta^2/\alpha + o(1))/u_c^+ = r_1 \frac{\beta^2}{\alpha} (1 + o(1)),
\]
whence
\[
u_c^+ = u_{c_n}^+ + r_1 \frac{\beta^2}{\alpha} (c - c_n)(1 + o(1)) = -r_1^{-1} + \frac{2\beta}{\sqrt{n}} (1 + o(1)) + r_1 \frac{\beta^2}{\alpha} (c - c_n)(1 + o(1)), \tag{4.10}
\]
by (4.8). This holds if $c - c_n = O(\varphi_n)$ since this assures that $u_c^+ = u_n + O(\varphi_n)$. The above gives
\[
\log(-u_c^+) = \log(-r_1^{-1}) - 2 r_1 \beta (1 + o(1))\, n^{-1/2} - r_1^2 \frac{\beta^2}{\alpha} (c - c_n)(1 + o(1)). \tag{4.11}
\]
(Again, real parts are tacitly meant.)
To determine $\sigma_1(u_c^+, c)$ we use $\sigma_{1z}(u_c^+, c) = 0$ to deduce
\[
\frac{d}{dc}\, \sigma_1(u_c^+, c) = \log u_c^+. \tag{4.12}
\]
We continue to assume that $c - c_n = O(\varphi_n)$ so our estimates hold. Integrating (4.12) using the first part of (4.10) gives (since $u_{c_n}^+ \to -r_1^{-1}$)
\[
\sigma_1(u_c^+, c) = \sigma_1(u_{c_n}^+) + (c - c_n) \log u_{c_n}^+ - \frac{1}{2} r_1^2 \frac{\beta^2}{\alpha} (c - c_n)^2 (1 + o(1))
\]
\[
= \sigma_1(-r_1^{-1}) - 2(\alpha + o(1))\, n^{-1} + \log(-r_1^{-1})(c - c_n) - 2 r_1 \beta (c - c_n)\, n^{-1/2} (1 + o(1)) - \frac{1}{2} r_1^2 \frac{\beta^2}{\alpha} (c - c_n)^2 (1 + o(1)),
\]
by (4.9) and (4.11). This gives (4.5).
For all $c \ge c_n$ we use the fact that $\log(-u_c^+)$ is a decreasing function of $c$, since $u_c^+$ increases, and integrate (4.12) with respect to $c$ from $c_n$ to $c$, which gives
\[
\sigma_1(u_c^+, c) \le \sigma_1(u_{c_n}^+) + \log(-u_{c_n}^+)(c - c_n).
\]
Using (4.9) and (4.8) gives (4.6).
For the lower bound for $\sigma_1(u_c^-, c)$, we assume first that $c \le c_n$. By (4.1) this implies in particular that $c - c_n = O(n^{-1/2})$. Now $\sigma_1(z)$ is decreasing on the interval $(u_c^-, u_c^+)$ and $u_c^+ - u_c^- \gg \varphi_n$. To see the last inequality, note that, from Lemma 3.6, $\sigma_{1zz}(u_n + \zeta, c) \ne 0$ for $\zeta = O(\varphi_n)$ and $c - c_n = o(1)$. Therefore $\sigma_{1z}(u_n + \zeta, c)$ can vanish for at most one such $\zeta$ and, since $u_c^+ - u_n = O(\varphi_n)$, we must have $u_n - u_c^- \gg \varphi_n$. Take any sequence $\varphi_n = o(q_1)$ and write
\[
\sigma_1(u_c^-, c) \ge \sigma_1(u_c^+ - \varphi_n, c) = \sigma_1(u_c^+ - \varphi_n) + (c - c_n) \log(\varphi_n - u_c^+).
\]
(As usual, we imagine real parts having been taken.) If we apply (4.7) with $\zeta = u_c^+ - u_n$ and with $\zeta = u_c^+ - \varphi_n - u_n$ and subtract, we obtain
\[
\sigma_1(u_c^+ - \varphi_n) - \sigma_1(u_c^+) = \frac{\alpha}{\beta}\, n^{-1/2} \varphi_n (1 + o(1)) + \frac{\alpha}{2\beta^2} \big( -2 \varphi_n (u_c^+ - u_n) + \varphi_n^2 \big) (1 + o(1)).
\]
By subtracting the first parts of (4.10) and (4.8) we see that this equals
\[
o(n^{-1/2} \varphi_n) + \frac{\alpha}{2\beta^2}\, \varphi_n^2.
\]
Since $\varphi_n \gg n^{-1/2}$, as we may assume, we obtain
\[
\sigma_1(u_c^+ - \varphi_n) > \sigma_1(u_c^+) + \eta \varphi_n^2
\]
for some $\eta > 0$. Also, since $c - c_n > -\eta n^{-1/2}$ for some $\eta$ and $\log(1 - \varphi_n/u_c^+)$ is positive and $O(\varphi_n)$ we have
\[
(c - c_n) \log(\varphi_n - u_c^+) \ge (c - c_n) \log(-u_c^+) - \eta\, n^{-1/2} \varphi_n.
\]
Putting these together gives
\[
\sigma_1(u_c^-, c) > \sigma_1(u_c^+, c) + \eta \varphi_n^2
\]
for some $\eta > 0$.
This was for $c \le c_n$. For $c > c_n$ we use what we get from (4.12) by replacing $+$ with $-$, subtracting the two, and integrating. Together with the already proved inequality for $c = c_n$ this gives
\[ \sigma_1(u_c^-, c) - \sigma_1(u_c^+, c) > \eta\varphi_n^2 + \int_{c_n}^{c} \log(u_c^-/u_c^+)\, dc. \]
The logarithm is nonnegative. Hence $\sigma_1(u_c^-, c) - \sigma_1(u_c^+, c) > \eta\varphi_n^2$ for all $c$. If $c - c_n = O(\varphi_n)$ then using this and (4.5) gives
\[ \sigma_1(u_c^-, c) > \sigma_1(-r_1^{-1}) + \log(r_1^{-1})(c - c_n) + \eta\varphi_n^2 \]
with a different $\eta$. If $c \ge c_n$ we use
\[ \sigma_1(u_c^-, c) - \sigma_1(u_{c_n}^-) = \int_{c_n}^{c} \log(-u_c^-)\, dc. \]
Since $u_c^-$ is decreasing and is less than $-r_1^{-1}$ when $c = c_n$ this gives
\[ \sigma_1(u_c^-, c) \ge \sigma_1(u_{c_n}^-) + \log(r_1^{-1})(c - c_n) \ge \sigma_1(u_{c_n}^+) + \log(r_1^{-1})(c - c_n) + \varphi_n^2. \]
Combining this with (4.5) for $c = c_n$ shows that
\[ \sigma_1(u_c^-, c) \ge \sigma_1(-r_1^{-1}) + \log(r_1^{-1})(c - c_n) + \eta\varphi_n^2 \]
holds for these $c$ as well. Since $\{\varphi_n\}$ was an arbitrary sequence satisfying $\varphi_n = o(q_1)$ the last statement of the lemma follows.

Next we consider the steepest descent curves, which we denote by $C^\pm(c)$, corresponding to the integrals $I^\pm(c)$. It follows from (4.4) that $C^+(c)$ passes through $u_c^+$ because on the curve $|\psi_1(z, c)|$ has a maximum at that point; similarly, $C^-(c)$ passes through $u_c^-$. We have enough information to evaluate the portions of these integrals taken over the immediate neighborhoods of these points, but we also have to show that the integrals over the rest of the curves are negligible. This requires not only that the integrands are much smaller there, which they are, but also that the curves themselves are not too badly behaved.

To see what is needed, let $\Gamma^\pm$ be arcs of steepest descent curves for a function $\rho$, curves on which $\Im\rho$ is constant. In analogy with our $C^\pm(c)$ we assume $\Re\rho$ is increasing on $\Gamma^-$ as we move away from the critical point and decreasing on $\Gamma^+$. If $s$ measures arc length on $\Gamma^\pm$ we have for $z \in \Gamma^\pm$,
\[ \frac{dz}{ds} = \mp \frac{|\rho'(z)|}{\rho'(z)}. \tag{4.13} \]
If the arc goes from $a$ to $b$ then
\[ \int_{\Gamma^\pm} |\rho'(z)|\, ds = \mp \int_{\Gamma^\pm} \rho'(z)\, dz = \mp(\rho(b) - \rho(a)). \]
Hence the length of $\Gamma^\pm$ is at most
\[ \frac{|\rho(b) - \rho(a)|}{\min_{z \in \Gamma^\pm} |\rho'(z)|}. \tag{4.14} \]
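The length estimate (4.14) can be checked numerically on a toy example. The sketch below is illustrative only: the choice $\rho(z) = z^2$, the starting point $1 + i$, the arc length and the step count are all assumptions made for this example and do not come from the paper. It follows the descent direction of (4.13) with explicit midpoint steps and compares the length of the resulting arc with the bound $|\rho(b) - \rho(a)| / \min|\rho'|$.

```python
def rho(z):
    # Illustrative choice (an assumption, not the paper's sigma_1): rho(z) = z^2.
    return z * z


def drho(z):
    # Derivative of the illustrative rho.
    return 2 * z


def descend(a, total_length, steps):
    """Follow dz/ds = -|rho'(z)| / rho'(z) from the point a using midpoint steps."""
    h = total_length / steps
    z = a
    path = [z]
    for _ in range(steps):
        k1 = -abs(drho(z)) / drho(z)        # unit tangent at the current point
        zm = z + 0.5 * h * k1               # midpoint predictor
        k2 = -abs(drho(zm)) / drho(zm)      # unit tangent at the midpoint
        z = z + h * k2
        path.append(z)
    return path


path = descend(1 + 1j, 1.0, 2000)
a, b = path[0], path[-1]

# Polygonal length of the computed arc.
length = sum(abs(q - p) for p, q in zip(path, path[1:]))

# Right-hand side of the length estimate, with the minimum taken over sample points.
bound = abs(rho(b) - rho(a)) / min(abs(drho(z)) for z in path)

# On a steepest descent curve the imaginary part of rho should stay constant.
imag_drift = max(abs(rho(z).imag - rho(a).imag) for z in path)

print(length, bound, imag_drift)
```

In this example $|\rho'|$ grows along the arc, so the bound is not attained; equality in the estimate would require $|\rho'|$ to be constant on the arc.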
This is to be modified if $\rho'$ has a simple zero at $z = a$, for example. In this case we replace $\rho'(z)$ by $\rho'(z)/(z - a)$. (This is seen by making the variable change $z = a + \sqrt{\xi}$.)

Our goal is Lemma 4.5 below. In order to use the length estimate (4.14) to deduce the bounds of the lemma, we must first locate regions in which our curves are located, and then find lower bounds for $\sigma_1'(z, c)$ in these regions. (Upper bounds for $|\sigma_1(z, c)|$ will be easy.) These will be established in the next lemmas.

For $r > 0$ define $n(r) = \#\{j : r_j \ge r\}$.

Lemma 4.2. The curves $C^\pm(c)$ lie in the regions
\[ \Big\{ z : |\arg(r^{-1} + z)| \le \pi\, \frac{c\,n}{\alpha\, n(r) + c\,n} \Big\} \]
for all $r$ and in $|z + r_2^{-1}| \ge \delta n^{-1}$ if $\delta$ is small enough.

Proof. For a point $z$ on either of the curves, say in the upper half-plane, we have
\[ c\pi = \frac{\alpha}{n} \sum_{j=2}^{n} \arg(r_j^{-1} + z) + \arg(z - 1) + (c - 1)\arg z \ge \frac{\alpha\, n(r)}{n}\,\arg(r^{-1} + z) + c\,\arg(r^{-1} + z), \]
which gives the first statement of the lemma. For the second, observe that if $\zeta = O(\varphi_n)$ then $\sigma_1'(-r_2^{-1} + \zeta, c) = \alpha/(n\zeta) + O(1)$. This shows, first, that $u_c^-$ lies to the right of the circle $|\zeta| = \delta n^{-1}$ if $\delta$ is small enough and, second, that $1/\sigma_1'(z, c)$, thought of as a vector, points outward from this circle if $\delta$ is small enough. Since a point of $C^-(c)$ moves in the direction of $1/\sigma_1'(z, c)$ as it moves away from $u_c^-$ (see (3.7) of [GTW2]), the curve can never pass inside the circle. Therefore the entire disc $|\zeta| \le \delta n^{-1}$ lies to the left of $C^-(c)$. This gives the second statement for $C^-(c)$ and it follows also for $C^+(c)$ since this is to the right of $C^-(c)$.

The next lemma, together with (4.13) and the length estimate (4.14), will imply that for $z$ large the curves will move in the direction of $z$ and are well-behaved. If we take any $\bar r < b/(1 - b)$ then a positive proportion of the $r_j$ are greater than $\bar r$ and so by Lemma 4.2 the curves lie in a region
\[ \{ z : |\arg(\bar r^{-1} + z)| \le \pi(1 - \delta) \} \tag{4.15} \]
for some $\delta > 0$.

Lemma 4.3. We have $z\,\sigma_1'(z, c) \to c + \alpha$ as $n \to \infty$ and $z \to \infty$ through region (4.15).

Proof. We have
\[ z\,\sigma_1'(z, c) = c + \alpha + O(n^{-1}) + O(z^{-1}) + \frac{\alpha}{n} \sum_{j=2}^{n} \frac{1}{1 + r_j z}, \]
and it suffices to show that the last term tends to 0 as $n \to \infty$ and $z \to \infty$ through region (4.15). If $z$ is in this region and $r < \bar r/2$ then $|1 + rz| \ge \delta(1 + r|z|)$ for another $\delta$. The same bound will hold for all $r \le b/(1 - b)$ if $z$ is large enough. Choose $M$ large and
break the sum on the right, with its factor $n^{-1}$, into two parts, the terms where $r_j|z| < M$ and the terms where $r_j|z| \ge M$. We find that its absolute value is at most
\[ n^{-1}\big(n - n(M/|z|)\big) + \frac{1}{\delta M}. \]
The first term tends to 0 as $z \to \infty$ while the second could have been arbitrarily small to begin with.

Remark. If $P(p = 0)$ is positive then the above has to be modified. We replace $c + \alpha$ by $c + \alpha\, P(p > 0)$.

Because of the above lemma we need only consider $z$ in a bounded set. We use the fact that by Lemma 4.2 with $r = r_2$ our curves lie in a region
\[ \{ z : |\arg(r_2^{-1} + z)| \le \pi(1 - \delta n^{-1}),\ |r_2^{-1} + z| \ge \delta n^{-1} \}. \tag{4.16} \]

Lemma 4.4. For all $z$ in any bounded subset of the region (4.16) we have
\[ |\sigma_1'(z, c)| \ge \delta\, n^{-6} \left| \frac{(z - u_c^-)(z - u_c^+)}{z(z - 1)} \right| \]
for some $\delta > 0$ independent of $c$.

Proof. To obtain the lower bound we write
\[ \varphi(s; z) = \varphi(s_2, s_3, \cdots, s_n; z) = \frac{\alpha}{n} \sum_{j=2}^{n} \frac{1}{s_j + z} + \frac{1}{z - 1} + \frac{c - 1}{z}. \]
Of course $\sigma_1'(z, c) = \varphi(r_2^{-1}, r_3^{-1}, \cdots, r_n^{-1})$. Think of $s_2 = r_2^{-1}$ and $z$ as fixed, and consider the problem of finding $\inf |\varphi(s; z)|$, where $s_3, \cdots, s_n$ are subject to the conditions $s_j \ge s_2$, $\varphi(s; u_c^\pm) = 0$. If we take sequences so that the inf is approached in the limit, then some $s_j$ may tend to infinity, others may tend to $s_2$, and the rest, if any, tend to values strictly greater than $s_2$. Thus our inf is equal to the minimum of $|\varphi(s; z)|$, where $\varphi$ now has the form
\[ \varphi(s_2, s_3, \cdots, s_{n'}; z) = \frac{\alpha}{n} \sum_{j=2}^{n'} \frac{n_j}{s_j + z} + \frac{1}{z - 1} + \frac{c - 1}{z} \]
with $n' \le n$, $\sum n_j = n - 1$, and the $s_j$ with $j > 2$ satisfying $s_j > s_2$ and the constraints $\varphi(s; u_c^\pm) = 0$. Notice that the minimum cannot be zero since $\varphi(s; z)$, thought of for the moment as a function of $z$, has $n'$ finite zeros. It has zeros at $u_c^\pm$ and one between each pair of consecutive $-s_j$ since all the coefficients of $1/(s_j + z)$ are positive. This accounts for all $n'$ zeros, so our $z$ cannot be one of them.

We apply Lagrange multipliers to find the minimum of $|\varphi(s; z)|^2$ over $s_3, \cdots, s_{n'}$, achieved at interior points. There are two constraints, hence two multipliers $\lambda$ and $\mu$.
If $p + iq$ is the value $\varphi(s; z)$, where its absolute value achieves its minimum, then the equations we get are
\[ \Re\left[ (p - iq)\, \frac{1}{(s_j + z)^2} \right] = \frac{\lambda}{(s_j + u_c^-)^2} + \frac{\mu}{(s_j + u_c^+)^2}, \]
where we have divided by the factor $n_j$ appearing in all terms. This is the same sixth degree polynomial equation for all the $s_j$. It follows that there are at most six different $s_j$. Assuming there are exactly six (if there are fewer the argument is the same and the final estimate is better) we change notation again and write these as $s_3, \cdots, s_8$ so that the minimum is achieved for
\[ \varphi(s_2, s_3, \cdots, s_8; z) = \frac{\alpha}{n} \sum_{j=2}^{8} \frac{n_j}{s_j + z} + \frac{1}{z - 1} + \frac{c - 1}{z} \]
with other $n_j$. This has eight zeros. Two of them are $u_c^\pm$ and the other six, lying between consecutive $-s_j$, we denote by $u_1, \cdots, u_6$. We have the factorization
\[ \varphi(s; z) = \frac{1 - c}{u_c^-\, u_c^+}\; \frac{(z - u_c^-)(z - u_c^+)}{z(z - 1)}\; \frac{\prod_{i=1}^{6} (1 - z/u_i)}{\prod_{j=2}^{8} (1 - z/s_j)}, \]
and it remains to find a lower bound for this. Near $z = 0$ we have $\sigma_1'(z, c) = (1 - c) z^{-1} - 1 + \alpha \langle r \rangle + o(1)$, so if $c$ is close to 1 then $(1 - c)/u_c^+ = 1 - \alpha \langle r \rangle + o(1)$. In particular this is bounded away from zero. Thus the first factor above is bounded away from zero. As for the factors in the products, observe first that each factor $1 - z/s_j$ is bounded since $z$ and all factors $1/s_j$ are. For the others, we use again the fact that the curves lie in a region (4.16). In any bounded subset of this region each $|1 - z/u_i| \ge \eta n^{-1}$ for some $\eta > 0$. (If $z$ is in a neighborhood of 0 this is clear since each $u_i < 0$. Otherwise write $1 - z/u_i = z(z^{-1} - u_i^{-1})$.) Therefore the product of these is bounded below by a constant times $n^{-6}$. This completes the proof.

Now we can show that the curves $C^\pm(c)$ are not too badly behaved.

Lemma 4.5. For some constant $A > 0$ the length of $C^+(c)$ is $O(n^A)$ and
\[ \int_{C^-(c)} |z|^{-2}\, |dz| = O(n^A). \]
Proof. It follows from Lemma 4.3 that $C^+(c)$ lies in a bounded set. For, this lemma implies that the vectors $1/\sigma_1'(z, c)$ point outward from a large circle $|z| = R$, and since by (4.13) $C^+(c)$ goes in the direction opposite to $1/\sigma_1'(z, c)$, a point of the curve starting at $u_c^+$ can never pass outside the circle. Also, some disc $|z| \le \delta(1 - c)$ is disjoint from $C^+(c)$ because $1/\sigma_1'(z, c)$ points outward from a small enough circle $|z| = \delta(1 - c)$ and so $C^+(c)$ cannot cross into it. It follows that $\sigma_1'(z, c)$, and so also $\sigma_1(z, c)$, is bounded on any portion of $C^+(c)$ close to $z = 0$. A similar argument shows that some disc $|z - 1| \le \delta$ lies entirely inside $C^+(c)$. Finally, we know that $u_c^-$ is within $O(n^{-1/2})$ of $-r_2^{-1}$ and if $\zeta = o(q_1)$ then $\sigma_1'(-r_2^{-1} + \zeta, c) = \alpha/(n\zeta) + O(1)$. In particular $u_c^-$ lies in a region $|\zeta| \ge \delta n^{-1}$ for some $\delta > 0$. Since also $\sigma_1'' = -\alpha/(n\zeta^2) + O(1)$, by Lemma 3.6, we deduce that $\sigma_1''(z, c) = O(n)$ when $|z - u_c^-| \le \delta n^{-1}/2$, thus for such
$z$ we have $\sigma_1(z, c) = \sigma_1(u_c^-, c) + O(n|z - u_c^-|^2)$. But it follows from Lemma 4.1 that $\sigma_1(u_c^-, c) - \sigma_1(u_c^+, c) > \varphi_n^2$, and then, since $n^{-1} = o(\varphi_n^2)$, $\sigma_1(u_c^+, c) < \sigma_1(z, c)$ for $|z - u_c^-| \le \delta n^{-1}/2$. As the maximum of $\sigma_1(z, c)$ on $C^+(c)$ occurs at $u_c^+$, this shows that the distance from $C^+(c)$ to $u_c^-$ is at least $\delta n^{-1}/2$. With these facts established we use the lower bound of Lemma 4.4, the length estimate (4.14) (extended as in the remark following it), and the obvious upper bound for $|\sigma_1(z, c)|$ in the region (4.16) to deduce that the length of $C^+(c)$ is $O(n^A)$ for some constant $A$.

As for the integral over $C^-(c)$, we observe that, since $c < 1$ and $cm$ is an integer, $1 - c$ is at least a constant times $n^{-1}$. Since $C^-(c)$ lies outside a disc $|z| \le \delta(1 - c)$, we have $z^{-1} = O(n)$ on $C^-(c)$. A lower bound for the distance from $C^-(c)$ to $u_c^+$ is obtained using the fact that $\sigma_1(u_c^-, c) - \sigma_1(u_c^+, c) > \varphi_n^2$. Since $\sigma_1'$ is bounded in a neighborhood of $u_c^+$, we have $\sigma_1(u_c^-, c) > \sigma_1(z, c)$ for $|z - u_c^+|$ less than $\varphi_n^2$ times a sufficiently small constant. This shows that $C^-(c)$ is at least this far from $u_c^+$. We apply the other bounds as before; we think of the integral over the portion of $C^-(c)$ outside a large circle as the sum of integrals over the arcs from $a_k$ to $a_{k+1}$, where $a_k$ is the point of $C^-(c)$ where $|z| = k$. Lemma 4.3 and (4.14) are used again here.
5. Asymptotic Evaluation of the Integrals

We evaluate $I^+(c)$ first when $c - c_n = O(\varphi_n)$. Then $\sigma_{1zz}(u_c^+, c) = \alpha/\beta^2 + o(1)$ and so if we set $z = u_c^+ + \zeta$ we have
\[ \sigma_1(z, c) = \sigma_1(u_c^+, c) + \frac{\alpha}{2\beta^2}\,(1 + o(1))\,\zeta^2 \]
as long as $\zeta = O(\varphi_n)$. If $|\zeta| = \varphi_n$ then the real part of the second term above is less than a negative constant times $\varphi_n^2$ and, since this real part decreases as we go out $C^+(c)$, it is at least this negative whenever $|\zeta| \ge \varphi_n$. If we recall that this gets multiplied by $m$ in the exponent and the fact that $C^+(c)$ has length at most a power of $n$ (by Lemma 4.5), we see that the contribution of this part of the integral is $O\big(e^{m\sigma_1(u_c^+, c) - n\varphi_n^2 + O(\log n)}\big)$. It follows from Lemma 3.3 and assumption (c) that with high probability $q_1 \gg \log n/n^{1/2}$, and we could have chosen $\varphi_n$ to satisfy this also. Thus, with error $o\big(e^{m\sigma_1(u_c^+, c)}\big)$ the integral $I^+(c)$ is equal to
\[ e^{m\sigma_1(u_c^+, c)}\, \frac{1}{2\pi i} \int_{|\zeta| < \varphi_n} \big(1 + r_1(u_c^+ + \zeta)\big)\, e^{(n/2\beta^2)(1 + o(1))\zeta^2}\, dz \]
(since $\alpha m = n$). Since $\varphi_n \gg n^{-1/2}$, in the limit after making the variable change $\zeta \to n^{-1/2}\zeta$ the integration can be taken over $(-i\infty, i\infty)$ (downward really, but we can reverse the directions of integrations), the linear factor $\zeta$ contributes zero, and by (4.10),
\[ 1 + r_1 u_c^+ = r_1 \Big( 2\beta n^{-1/2} + \frac{r_1\beta^2}{\alpha}\,(c - c_n) + o(n^{-1/2} + |c - c_n|) \Big). \]
Thus the integral is asymptotically equal to $\beta\sqrt{2\pi}\, i\, n^{-1/2}$ times the above and, by (4.5),
\[ I^+(c) = \frac{r_1\beta^2}{\sqrt{2\pi}}\, n^{-1} \Big( 2 + \frac{r_1\beta}{\alpha}\, n^{1/2}(c - c_n) + o(1 + n^{1/2}|c - c_n|) \Big) \times \psi_1(-r_1^{-1}, c)\, \exp\Big\{ -\frac{r_1^2\beta^2}{2\alpha}\, m \Big( c - c_n + \frac{2\alpha}{r_1\beta}\,(1 + o(1))\, n^{-1/2} \Big)^2 \Big\}. \]
This assumes that $c - c_n = O(\varphi_n)$. For all $c \ge c_n$ we use the second part of Lemma 4.1 and again the fact that $C^+(c)$ has length at most a power of $n$. We deduce
\[ I^+(c) = O\big( \psi_1(-r_1^{-1}, c)\, e^{-\eta n^{1/2}(c - c_n) + O(\log n)} \big) \]
for $c \ge c_n$.

For the integral over $C^-(c)$ we use the last part of Lemma 4.1 and the second part of Lemma 4.5. These imply that the integral over $C^-$ is
\[ O\big( \psi_1(-r_1^{-1}, c)^{-1}\, e^{-n\varphi_n^2 + O(\log n)} \big) = o(\psi_1(-r_1^{-1}, c)). \]
But our integral for $I^-(c)$ is not taken over $C^-(c)$. Recall that the original contour must have all the $-r_j^{-1}$ on the outside whereas $-r_1^{-1}$ is inside (more precisely, on the other side of) $C^-(c)$. Therefore if we deform the contour to $C^-(c)$ we pass through the pole at $-r_1^{-1}$. Thus
\[ I^-(c) = r_1\, \psi_1(-r_1^{-1}, c)^{-1} + o(\psi_1(-r_1^{-1}, c)). \]

Now recall that in $I^+(c)$ we set $c - c_n = h + j + \ell$, in $I^-(c)$ we set $c - c_n = h + \ell + k$ and then we sum over $\ell$ to get the matrix product. Recall also that
\[ \psi_1(-r_1^{-1}, c) = \psi_1(-r_1^{-1})\, (-r_1)^{-m(c - c_n)}. \]
The factors $(-r_1)^{-m(c - c_n)}$ in $I^+(c)$ and $(-r_1)^{m(c - c_n)}$ in $I^-(c)$ will combine to give $(-r_1)^{m(k - j)}$ which can be eliminated without affecting the determinant. It follows that we can modify the expressions for $I^\pm(c)$ by removing these factors. We can also remove the factors $\psi_1(-r_1^{-1})^{\pm 1}$ since they cancel upon multiplying. Thus our replacements are
\[ I^+(c) \to \frac{r_1\beta^2}{\sqrt{2\pi}}\, n^{-1} \Big( 2 + \frac{r_1\beta}{\alpha}\, n^{1/2}(c - c_n) \Big) \exp\Big\{ -\frac{r_1^2\beta^2}{2\alpha}\, m \Big( c - c_n + \frac{2\alpha}{r_1\beta}\,(1 + o(1))\, n^{-1/2} \Big)^2 \Big\}, \]
if $c - c_n = O(\varphi_n)$, and
\[ I^+(c) \to O\big( e^{-\eta n^{1/2}(c - c_n) + O(\log n)} \big), \]
if $c > c_n$. Furthermore, $I^-(c) \to r_1 + o(1)$.

Recall next that we set $h = s n^{1/2}$ and in $I^+(c)$, $c = c_n + s n^{1/2} + x n^{1/2} + z n^{1/2}$, so that
\[ c - c_n = (s + x + z + o(1))\, n^{1/2}/m = \alpha (s + x + z + o(1))\, n^{-1/2}, \]
and eventually we multiply by $n$ because of the scaling. Take first the case $c - c_n = O(\varphi_n)$, that is, $x + z = O(n^{1/2}\varphi_n)$. Since $m = n/\alpha$ and $r_1\beta = \tau^{-1}(1 + o(1))$ the modified $I^+(c)$ equals
\[ \frac{r_1^2\beta^3}{\sqrt{2\pi}}\, n^{-1}\, (2\tau + s + x + z + o(1 + x + y))\, e^{-(2\tau + s + x + z + o(1))^2/2\tau^2}. \]
On the other hand, $I^-(c)$ is equal to $r_1$ with error $o(1)$. The result of multiplying these together, multiplying by $n$, and integrating with respect to $z$ over $(0, \infty)$, is asymptotically equal to
\[ \frac{1}{\sqrt{2\pi}\,\tau}\, e^{-(2\tau + s + x)^2/2\tau^2}. \tag{5.1} \]
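A kernel of the form (5.1) depends only on its first variable, so as an operator on $(0, \infty)$ it has rank one and its Fredholm determinant is one minus its trace. The following sketch is illustrative only: the function names, the quadrature rule and the sample values of $\tau$ and $s$ are assumptions made for this example. It checks numerically that one minus the trace agrees with the distribution function of a centered Gaussian with standard deviation $\tau$ evaluated at $2\tau + s$.

```python
import math


def kernel_diag(x, tau, s):
    """Value K(x, x) of the kernel (5.1); it does not depend on the second variable."""
    gauss = math.exp(-(2 * tau + s + x) ** 2 / (2 * tau ** 2))
    return gauss / (math.sqrt(2 * math.pi) * tau)


def trace(tau, s, steps=200_000):
    """Midpoint-rule approximation of the integral of K(x, x) over (0, infinity)."""
    # Truncate where the Gaussian tail is negligible (at least 12 standard deviations out).
    upper = max(0.0, -(2 * tau + s)) + 12 * tau
    h = upper / steps
    return h * sum(kernel_diag((i + 0.5) * h, tau, s) for i in range(steps))


def gaussian_cdf(t, tau):
    """Distribution function of a centered Gaussian with standard deviation tau."""
    return 0.5 * (1 + math.erf(t / (tau * math.sqrt(2))))


tau, s = 0.7, -0.4                               # sample parameters, chosen arbitrarily
det_from_trace = 1 - trace(tau, s)               # rank one: determinant = 1 - trace
det_closed_form = gaussian_cdf(2 * tau + s, tau)

print(det_from_trace, det_closed_form)
```

The two printed numbers should agree up to quadrature error; other values of `tau > 0` and `s` can be substituted in the same way.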
This holds for $c - c_n = O(\varphi_n)$. If $c - c_n \ge \varphi_n$ we have, for our modified $I^+(c)$, the estimate
\[ O\big( e^{-\eta n^{1/2}(c - c_n) + O(\log n)} \big) = O(n^{-1}). \]
Integrating the square of this over a region $x + z = O(n^{1/2})$ will give $o(1)$. It follows that the matrix product scales to the operator on $(0, \infty)$ with kernel (5.1). This is a rank one kernel so its Fredholm determinant equals one minus its trace, which equals
\[ \frac{1}{\sqrt{2\pi}\,\tau} \int_{-\infty}^{2\tau + s} e^{-x^2/2\tau^2}\, dx. \]
This establishes the convergence in probability statement of Theorem 1.

Remark. One could rightly object that to scale a product to a trace class operator we should know that each factor scales in Hilbert-Schmidt norm. In our case the second limiting kernel is a constant and the product is not even Hilbert-Schmidt. But we could have multiplied the kernel of the first operator by $(1 + x)(1 + z)$ and the kernel of the second operator by $(1 + z)^{-1}(1 + y)^{-1}$. This would not have affected the determinant of the product, both operators would have scaled in Hilbert-Schmidt norm and the product would have scaled in trace norm to the rank one kernel
\[ \frac{1}{\sqrt{2\pi}\,\tau}\, e^{-(2\tau + s + x)^2/2\tau^2}\, \frac{1 + x}{1 + y} \]
which has the same Fredholm determinant.

6. Almost Sure Convergence

What is needed, and all that is needed, is an "almost sure" substitute for Lemma 3.6 under assumptions (a') and (b'). We begin with a lemma on extreme order statistics of uniform random variables, part or all of which may well be in the literature.

Lemma 6.1. Let $a > 1$ be arbitrary. Then, almost surely,
\[ t_1 \ge \frac{\eta}{n \log^a n}, \qquad \frac{t_1}{t_2} \le 1 - \frac{1}{\log^a n}, \]
for sufficiently large $n$. Here, $\eta$ is a positive constant depending on $a$.

Proof. We use the notation $t_{n,j}$ for our $t_j$ to display their dependence on $n$. We have
\[ P(t_{n,1} \le \delta) = 1 - (1 - \delta)^n \sim n\delta \quad \text{if } n\delta = o(1). \]
In particular
\[ P\Big( t_{2^k,1} \le \frac{2^{-k}}{k^a} \Big) \sim \frac{1}{k^a}. \]
It follows that, a.s. for sufficiently large $k$ we have
\[ t_{2^k,1} > \frac{2^{-k}}{k^a}. \]
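The step above is a Borel-Cantelli argument: the probabilities $P(t_{2^k,1} \le 2^{-k}/k^a)$ behave like $k^{-a}$, which is summable when $a > 1$. The sketch below is illustrative only: the value $a = 1.5$ and the range of $k$ are arbitrary choices made for this example. It evaluates the exact formula $1 - (1 - \delta)^n$ for the minimum of $n$ independent uniform variables and compares it with $k^{-a}$.

```python
import math


def prob_min_below(n, delta):
    """P(min of n i.i.d. uniform(0,1) variables <= delta) = 1 - (1 - delta)^n."""
    # log1p / expm1 keep full precision when delta is tiny.
    return -math.expm1(n * math.log1p(-delta))


a = 1.5            # any exponent a > 1 (illustrative choice)
ratios = []        # p_k * k^a, expected to approach 1 as k grows
partial_sum = 0.0  # partial sum of the probabilities p_k

for k in range(2, 41):
    n = 2 ** k
    delta = 2.0 ** (-k) / k ** a
    p = prob_min_below(n, delta)
    ratios.append(p * k ** a)
    partial_sum += p

print(ratios[0], ratios[-1], partial_sum)
```

Since each probability is at most $n\delta = k^{-a}$, the partial sums stay below $\sum_{k \ge 2} k^{-a}$, which is finite for $a > 1$; this summability is what the almost sure statement relies on.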
Take any $n$ and let $k$ be such that $2^{k-1} < n \le 2^k$. From the above we have, a.s. for sufficiently large $n$,
\[ t_{n,1} \ge t_{2^k,1} > \frac{2^{-k}}{k^a} \ge \frac{\eta}{n \log^a n}, \]
for some $\eta$.

For the ratio we use the fact that
\[ P\Big( \frac{t_{n,j}}{t_{n,j+1}} > 1 - \delta \Big) = 1 - (1 - \delta)^j \sim j\delta \quad \text{if } j\delta = o(1). \tag{6.1} \]
Now suppose that
\[ \frac{t_{n,1}}{t_{n,2}} > 1 - \frac{1}{\log^a n}, \tag{6.2} \]
and let $k$ be such that $2^{k-1} < n \le 2^k$. Take any $J$ (which will eventually be of order $\log k$). Then there are two possibilities:
(1) $t_{2^k,j} \le t_{n,1}$ for all $j \le J$;
(2) $t_{2^k,j} > t_{n,1}$ for some $j \le J$.

Consider possibility (1) first. Let $G_n$ be the event that $t_{n,1} \le a \log\log n/n$. By Ex. 4.3.2 of [Gal], $P(G_n \text{ eventually}) = 1$. Moreover,
\[ P(\{t_{2^k,j} \le t_{n,1} \text{ for all } j \le J\} \cap G_n) \le P(t_{2^k,j} \le 2\log\log n/n \text{ for all } j \le J) \le \frac{1}{J!} \Big( 2^k\, \frac{2\log\log n}{n} \Big)^J \le e^{J \log\log k - J \log J + AJ}, \]
for some constant $A$. If $J = B \log k$ then the bound above equals $e^{-B(\log B - A)\log k}$, so if we choose $B$ large enough the sum over $k$ of these probabilities will be finite. With this $J$, (1) can therefore a.s. occur for only finitely many $k$.

Next consider possibility (2) and let $j$ be the smallest integer $\le J$ such that $t_{2^k,j} > t_{n,1}$. Then $t_{2^k,j} \le t_{n,2}$ and $t_{n,1} = t_{2^k,\ell}$ for some $\ell < j$. It follows that $t_{2^k,j-1}/t_{2^k,j} > t_{n,1}/t_{n,2}$ and by (6.2) this is at least $1 - C/k^a$, for some constant $C$ (which will change from appearance to appearance). Therefore, by (6.1),
\[ P(\text{(6.2) and (2) both happen}) \le P(t_{2^k,j-1}/t_{2^k,j} > 1 - C/k^a \text{ for some } j \le J) \le C J^2/k^a \le C \log^2 k/k^a. \]
It follows that (2) and (6.2) can happen together only for finitely many $n$. The upshot is that a.s. the inequality (6.2) can occur for only finitely many $n$, which completes the proof.

We are now ready to prove our substitute for Lemma 3.6. Recall that we can set $q_j = G^{-1}(t_j)$. The assumption (a') implies that $G$ is continuous near 0, so that $G(G^{-1}(x)) = x$ for small $x$.

Lemma 6.2. Suppose (a') and (b') are satisfied. Then there exists a sequence $\varphi_n \gg \log n/n^{1/2}$ such that a.s. for any sequence $\{v_n\}$ lying in the disc with diameter the real interval $[-r_1^{-1} - O(\varphi_n), \xi]$ we have
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{j=2}^{n} \frac{r_j}{(1 + r_j v_n)^2} = \Big\langle \frac{r}{(1 + r\xi)^2} \Big\rangle. \]
Proof. From the proof of Lemma 3.6 we see that we want to show that, for some sequence $\varphi_n$ as described, we have a.s.
\[ \lim_{n \to \infty} \frac{1}{n} \sum_{j=2}^{n} \frac{q_1}{q_j\,(q_j - (q_1 + O(\varphi_n)))^2} = 0. \]
Assumption (a') implies that
\[ \frac{x}{y} \ge \Big( \frac{G^{-1}(x)}{G^{-1}(y)} \Big)^{\gamma}, \]
when $x \le y$ are small enough. Therefore, it follows from the second part of Lemma 6.1, that a.s. for large $n$,
\[ \frac{q_1}{q_2} \le 1 - \frac{\eta}{\log^a n} \tag{6.3} \]
for another constant $\eta > 0$. Set
\[ \psi_n = \frac{1}{2}\, \frac{\eta}{\log^a n}\, q_2. \]
Let us show that $\psi_n \gg \log n/n^{1/2}$. Assumption (a') implies that $G^{-1}(x)$ is at most a constant times $x^{1/\gamma}$, thus the fact that $t_1 = O(\log\log n/n)$ shows that $q_1$ is at most a constant times $(\log\log n/n)^{1/\gamma}$. Furthermore, assumption (b') gives, with a slightly smaller $\nu$,
\[ x^2 \gg G(x)\, \log^{\nu} x^{-1}. \]
Applying this with $x = q_1 = G^{-1}(t_1)$ and using the first part of Lemma 6.1 gives
\[ q_1^2 \gg \frac{1}{n \log^a n}\, \log^{\nu} q_1^{-1}. \]
We therefore deduce that
\[ q_1^2 \gg \frac{1}{n}\, \log^{\nu - a} n \tag{6.4} \]
for a slightly smaller $\nu$ than in (b'). By (6.3), the same holds for $q_2$ and so
\[ \psi_n^2 \gg \frac{1}{n}\, \log^{\nu - 3a} n \]
and $\psi_n \gg \log n/n^{1/2}$ as long as $\nu - 3a > 2$. Since $a > 1$ is arbitrary the requirement becomes $\nu > 5$. But from (a') and (b') we see that necessarily $\gamma > 2$, so that $\nu > 8$.

If $j \ge 2$, then (6.3) and the inequality $q_2 \le q_j$ imply that
\[ q_j - (q_1 + \psi_n) \ge \frac{1}{2}\, \frac{\eta}{\log^a n}\, q_j. \]
We take for $\{\varphi_n\}$ any sequence satisfying
\[ \frac{\log n}{n^{1/2}} \ll \varphi_n \ll \psi_n. \]
At this point we follow the proof of Lemma 3.6 to see that the expression
\[ \frac{\log^{2a} n}{n} \sum_{j=2}^{n} \frac{q_1}{q_j^3} \tag{6.5} \]
needs to go to 0 a.s. to conclude the proof of this lemma. This is what we will demonstrate.
For any $k_n$, if we separate the sum in (6.5) over $j \le k_n$ from the sum over $j > k_n$, we see that (6.5) is at most
\[ \frac{\log^{2a} n}{n\, q_1^2}\, k_n + \log^{2a} n\, \frac{q_1}{q_{k_n+1}}\, \frac{1}{n} \sum_{j=1}^{n} \frac{1}{q_j^2}. \tag{6.6} \]
We first determine $k_n$ so the second term in (6.6) goes a.s. to 0. By the strong law, $n^{-1} \sum q_j^{-2} \to \langle q^{-2} \rangle$ a.s., so $\log^{2a} n\; q_1/q_{k_n+1}$ needs to go to 0. We have, for each $\delta > 0$,
\[
\begin{aligned}
P\Big( \log^{2a} n\, \frac{q_1}{q_{k_n+1}} \ge \delta \Big) &= P\Big( \frac{G^{-1}(t_1)}{G^{-1}(t_{k_n+1})} \ge \frac{\delta}{\log^{2a} n} \Big) \le P\Big( \frac{t_1}{t_{k_n+1}} \ge \Big( \frac{\delta}{\log^{2a} n} \Big)^{\gamma} \Big) \\
&= \Big( 1 - \Big( \frac{\delta}{\log^{2a} n} \Big)^{\gamma} \Big)^{k_n} \le e^{-k_n \left( \delta/\log^{2a} n \right)^{\gamma}}.
\end{aligned}
\]
This is summable over $n$ if we choose
\[ k_n = \log^a n\, \big( \log^{2a} n \big)^{\gamma} + 1. \]
With this choice, the second summand in (6.6) therefore goes to 0 a.s. On the other hand, the first term in (6.6) is with the same choice of $k_n$ at most a constant times
\[ \frac{\log^{(2\gamma + 3)a} n}{n\, q_1^2}, \]
and from (6.4) this is $o(1)$ times $\log^{(2\gamma + 4)a - \nu} n$. Since $a > 1$ was arbitrary and $\nu > 2\gamma + 4$, we can make $(2\gamma + 4)a - \nu < 0$ and then the first summand in (6.6) goes to 0 a.s. This completes the proof.

With this lemma in place of Lemma 3.6 the reader will find that all subsequent limits and estimates in Sects. 4 and 5 will hold almost surely, thus giving the second statement of the theorem. The reason our sequence had to satisfy $\varphi_n \gg \log n/n^{1/2}$ is that errors of the form $O\big(e^{-n\varphi_n^2 + O(\log n)}\big)$ appeared in the evaluation of $I^\pm(c)$ and these had to be $o(1)$.

Acknowledgement. This work was partially supported by National Science Foundation grants DMS–9703923, DMS–9802122, and DMS–9732687, as well as the Republic of Slovenia's Ministry of Science Program Group 503. Special thanks go to Harry Kesten, who supplied the main idea for the proof of Lemma 6.1. The authors are also thankful to the referee for the careful reading of the manuscript and suggestions for its improvement.
References

[BCKM] Bouchaud, J.-P., Cugliandolo, L.F., Kurchan, J., Mézard, M.: Out of equilibrium dynamics in spin–glasses and other glassy systems. In: Spin Glasses and Random Fields. A.P. Young, ed., Singapore: World Scientific, 1998
[BDJ] Baik, J., Deift, P., Johansson, K.: On the distribution of the length of the longest increasing subsequence of random permutations. J. Am. Math. Soc. 12, 1119–1178 (1999)
[BR] Baik, J., Rains, E.M.: Limiting distributions for a polynuclear growth model with external sources. J. Stat. Phys. 100, 523–541 (2000)
[FIN1] Fontes, L.R.G., Isopi, M., Newman, C.M.: Random walks with strongly inhomogeneous rates and singular diffusions: Convergence, localization and aging in one dimension. Ann. Probab. 30, 579–604 (2002)
[FIN2] Fontes, L.R.G., Isopi, M., Newman, C.M.: Chaotic time dependence in a disordered spin system. Probab. Theory Relat. Fields 115, 417–443 (1999)
[FINS] Fontes, L.R.G., Isopi, M., Newman, C.M., Stein, D.L.: Aging in 1D discrete spin models and equivalent systems. Phys. Rev. Lett. 87, 110201–1 (2001)
[Gal] Galambos, J.: The Asymptotic Theory of Order Statistics. Second ed. Malabar, FL: Krieger, 1987
[GG] Gravner, J., Griffeath, D.: Cellular automaton growth on $\mathbb{Z}^2$: Theorems, examples, and problems. Adv. Appl. Math. 21, 241–304 (1998)
[Gri1] Griffeath, D.: Additive and cancellative particle systems. Lecture Notes in Mathematics. Vol. 724, Berlin-Heidelberg-New York: Springer, 1979
[Gri2] Griffeath, D.: Primordial Soup Kitchen. psoup.math.wisc.edu
[Gra] Gravner, J.: Recurrent ring dynamics in two-dimensional excitable cellular automata. J. Appl. Prob. 36, 492–511 (1999)
[GTW1] Gravner, J., Tracy, C.A., Widom, H.: Limit theorems for height fluctuations in a class of discrete space and time growth models. J. Statist. Phys. 102, 1085–1132 (2001)
[GTW2] Gravner, J., Tracy, C.A., Widom, H.: A growth model in a random environment. To appear in Ann. Probab. (ArXiv: math.PR/0011150)
[Joh1] Johansson, K.: Shape fluctuations and random matrices. Commun. Math. Phys. 209, 437–476 (2000)
[Joh2] Johansson, K.: Discrete orthogonal polynomial ensembles and the Plancherel measure. Ann. Math. 153, 259–296 (2001)
[Lig] Liggett, T.: Interacting Particle Systems. Berlin-Heidelberg-New York: Springer-Verlag, 1985
[Mea] Meakin, P.: Fractals, Scaling and Growth Far from Equilibrium. Cambridge: Cambridge University Press, 1998
[MPV] Mézard, M., Parisi, G., Virasoro, M.A.: Spin Glass Theory and Beyond. Singapore: World Scientific, 1987
[NSt1] Newman, C.M., Stein, D.L.: Equilibrium pure states and nonequilibrium chaos. J. Stat. Phys. 94, 709–722 (1999)
[NSt2] Newman, C.M., Stein, D.L.: Realistic spin glasses below eight dimensions: A highly disordered view. Phys. Rev. E (3) 63(1), part 2, 016101, 9 pp (2001)
[NSv] Norblad, P., Svendlindh, P.: Experiments on spin glasses. In: Spin Glasses and Random Fields. A.P. Young, ed., Singapore: World Scientific, 1998
[PS] Prähofer, M., Spohn, H.: Universal distribution for growth processes in 1 + 1 dimensions and random matrices. Phys. Rev. Lett. 84, 4882–4885 (2000)
[SK] Seppäläinen, T., Krug, J.: Hydrodynamics and platoon formation for a totally asymmetric exclusion model with particlewise disorder. J. Stat. Phys. 95, 525–567 (1999)
[Sos] Soshnikov, A.: Universality at the edge of the spectrum in Wigner random matrices. Commun. Math. Phys. 207, 697–733 (1999)
[Tal] Talagrand, M.: Huge random structures and mean field models for spin glasses. Doc. Math., Extra Vol. I, 507–536 (1998)
[TW1] Tracy, C.A., Widom, H.: Level spacing distributions and the Airy kernel. Commun. Math. Phys. 159, 151–174 (1994)
[TW2] Tracy, C.A., Widom, H.: Universality of the distribution functions of random matrix theory. II. In: Integrable Systems: From Classical to Quantum. J. Harnad, G. Sabidussi, P. Winternitz, eds. Providence, RI: American Mathematical Society, 2000, pp. 251–264

Communicated by H. Spohn
Commun. Math. Phys. 229, 459–489 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0694-3
Hamiltonian Monodromy via Picard-Lefschetz Theory

Michèle Audin
Institut de Recherche Mathématique Avancée, Université Louis Pasteur et CNRS, 7 rue René Descartes, 67084 Strasbourg cedex, France

Received: 28 September 2001 / Accepted: 12 April 2002
Published online: 12 August 2002 – © Springer-Verlag 2002

Abstract: In this paper, we investigate the "Hamiltonian" monodromy of the fibration in Liouville tori of certain integrable systems via (real) algebraic geometry. Using Picard-Lefschetz theory in a relative Prym variety, we determine the Hamiltonian monodromy of the "geodesic flow on SO(4)". Using a relative generalized Jacobian, we prove that the Hamiltonian monodromy of the spherical pendulum can also be obtained by the Picard-Lefschetz formula.

Introduction

The aim of this paper is to relate the "Hamiltonian" monodromy of the fibration in Liouville tori of certain integrable systems to the monodromy of a fibration in (real) Abelian varieties, via Picard-Lefschetz theory.

Consider an integrable system with n degrees of freedom. Assume the Hamiltonian is a proper map, so that, according to the Arnold-Liouville theorem (see [2]), the connected components of the regular level sets of the momentum mapping are n-dimensional compact tori, the Liouville tori. If the set of regular values of the momentum mapping is not simply connected, the fibration by the Liouville tori can have a nontrivial monodromy.

Assume now that n = 2. When a singular point around which it is possible to turn is a "focus-focus" point (this is a nondegeneracy assumption), a general theorem of Nguyen Tien Zung [19] asserts that the Hamiltonian monodromy is nontrivial. This is, very classically, what happens for the spherical pendulum: the Hamiltonian monodromy phenomenon showed up precisely in this example investigated by Cushman to illustrate the possible nonexistence of global action-angle coordinates (see Duistermaat's foundational paper [11] and the book [10]). The case of the symmetric spinning top is completely similar.

Another example in which there is a component in the set of regular values which is not simply connected is the case of geodesic flows of invariant metrics on SO(4) (or "4-dimensional free rigid body") that I will investigate in this paper.
In these examples as in many others, the phase space is an algebraic submanifold of a space $\mathbf{R}^{2N}$ and the first integrals are polynomials. It is thus easy to complexify all the data. Very often (and in particular in the examples mentioned above) the differential system may be written as a Lax equation with a spectral parameter, the set of singular values of the momentum mapping is the discriminant locus of a family of affine plane curves and the fibration in Liouville tori appears (sometimes up to a covering map) as the real part of a fibration in Abelian varieties associated with this family.

We will use this algebro-geometric framework to describe the "Hamiltonian" monodromy. Of course, there can be complex monodromy, turning around a branch of the discriminant hypersurface, but this is not what we are interested in: what we want is to turn around something real, staying in the reals. Thus we must turn around something having codimension 2, like the intersection of two branches of the (codimension-1!) discriminant: the singular spectral curves we will look at will have two double points. Applying twice the Picard-Lefschetz formula and looking carefully at the real structures of the curves and their Jacobians, we will get the monodromy.

Here is a description of the results and organization of the paper.
– In Sect. 1, I consider families of (real) algebraic curves and express, according to Picard-Lefschetz theory, the monodromy around a real point of the discriminant corresponding to two nonreal double points.
– In Sect. 2, I concentrate on two specific families of elliptic curves. To get nontrivial monodromy from the general considerations of Sect. 1 in these examples, I will replace the relative Jacobian by another family of algebraic tori:
  – either the family of Pryms of a double branched cover of the elliptic curve
  – or the family of generalized Jacobians of a singular genus-2 curve (of which the original elliptic curve is the normalization).
  In both cases, I determine the monodromy of the fibration, for the family of complex tori and for its real part.
– In Sect. 3, I come to integrable systems. The two families of curves considered abstractly in Sect. 2 turn out to be the families of spectral curves for Lax equations describing respectively:
  – the geodesic flow on SO(4),
  – the spherical pendulum.
  Using the eigenvectors of the Lax matrices, I prove that the fibrations in algebraic tori of Sect. 2 are models for the fibrations in Liouville tori and deduce the Hamiltonian monodromy from the algebro-geometric results. As far as I know, the explicit determination of the monodromy for the SO(4) case is new¹. In the case of the spherical pendulum², I recover Cushman's classical results. As a byproduct, the Picard-Lefschetz formula shows the Hamiltonian monodromy as given by integrating a (nonholomorphic) meromorphic differential form. This phenomenon was observed, using a direct computation, in [9].
– In Sect. 4, I recall a few facts I use on real generalized Jacobians and in Sect. 5, I have grouped a few computations used or mentioned in the paper.

¹ It would be a consequence of the results of [19] if the number of critical points was determined, but explicit computation is rather tedious...
² In this case, the connected components of the real part of the Jacobian are indeed compact 2-tori although the Jacobian itself is noncompact. This is something we should expect, since one of the first integrals is a periodic Hamiltonian, inducing an $S^1$-action, which complexifies into a $\mathbf{C}^\star$-action. See, more generally, [13, 6].
Hamiltonian Monodromy via Picard-Lefschetz Theory
1. General Setting

Consider an n-dimensional family of affine (real) plane curves. Assume that two smooth local branches $B_1$ and $B_2$ of the discriminant intersect transversally in such a way that $B_1 \cap \mathbf{R}^n = B_2 \cap \mathbf{R}^n = (B_1 \cap B_2) \cap \mathbf{R}^n$ is a codimension-2 submanifold of $\mathbf{R}^n$. To be more specific, we will consider two families of elliptic curves.

Example 1.1. Consider the curves $E_{h,k}$ of equations $A(x)y^2 + B_{h,k}(x)y + C(x) = 0$, where $A$, $B_{h,k}$ and $C$ are real polynomials, the second depending on two real parameters $h$ and $k$. Assume that these polynomials are such that $\Delta_{h,k}(x) = B_{h,k}(x)^2 - 4A(x)C(x)$ is a degree-4 polynomial. Let $(h_0, k_0)$ be such that $\Delta_{h_0,k_0}(x) = (ax^2 + bx + c)^2$ for some real numbers $a$, $b$ and $c$ such that $b^2 - 4ac < 0$. Then $(h_0, k_0)$ is a point in $\mathbf{R}^2$ at which two nonreal branches of the discriminant of the family of curves intersect.

Example 1.2. Consider the curves $C_{h,k}$ of equations $y^2 + P_{h,k}(x) = 0$, where $P_{h,k}$ is a monic degree-4 polynomial, real when the parameters $h$, $k$ are real. Assume that $(h_0, k_0) \in \mathbf{R}^2$ is such that $P_{h_0,k_0}(x) = (x^2 + ax + b)^2$ for some real numbers $a$, $b$ satisfying $a^2 - 4b < 0$. Here again, $(h_0, k_0)$ is a point in $\mathbf{R}^2$ at which two nonreal branches of the discriminant of the family of curves intersect.

Remark 1.3. Except for the way we have chosen to write the affine curves, the only difference between these two examples is that, for $(h, k)$ close to $(h_0, k_0)$, the curve $E_{h,k}$ has a nonempty real part (with two connected components) while $C_{h,k}$ has no real point.

Going back to the general situation, let $\Sigma = B_1 \cap B_2$ and $\Sigma_{\mathbf{R}}$ be its real part. Let $U$ be a neighbourhood of $\Sigma$ in $\mathbf{C}^n$, and $U_{\mathbf{R}}$ its real part. Taking transversal linear subspaces $\mathbf{C}^2$ and $\mathbf{R}^2$, we see that the fundamental group of the complement of $\Sigma_{\mathbf{R}}$ in $\mathbf{R}^n$ is an infinite cyclic group while $\pi_1(U - \Sigma)$ is isomorphic to $\mathbf{Z} \times \mathbf{Z}$. More precisely, we have:

Lemma 1.4. The natural map $\pi_1(U_{\mathbf{R}} - \Sigma_{\mathbf{R}}) \to \pi_1(U - \Sigma) \cong \mathbf{Z} \oplus \mathbf{Z}$ maps a generator to $(1, -1)$.
M. Audin
Fig. 1. The local fundamental group
Proof. Having taken 2-dimensional transversals, we can assume that $n = 2$. Then the two branches can be locally described by an equation $P(x, y)\bar{P}(x, y) = 0$ for some complex linear form $P : \mathbf{C}^2 \to \mathbf{C}$. An explicit isomorphism from the fundamental group $\pi_1(\mathbf{C}^2 - \Sigma)$ to $\mathbf{Z} \times \mathbf{Z}$ is given by
\[ \gamma \longmapsto \left( \frac{1}{2i\pi} \int_\gamma \frac{dP}{P},\ \frac{1}{2i\pi} \int_\gamma \frac{d\bar{P}}{\bar{P}} \right). \]
Our assumption is that $P(x, y)\bar{P}(x, y) = 0$ at $(x, y) \in \mathbf{R}^2$ if and only if $(x, y) = (0, 0)$, namely that $|P(x, y)|^2 = \varepsilon$ is the equation of an ellipse in $\mathbf{R}^2$. Let $c$ be a generator of $\pi_1(\mathbf{R}^2 - \{(0, 0)\})$ (see Fig. 1). Evaluate
\[ \frac{1}{2i\pi} \left( \int_c \frac{dP}{P} + \int_c \frac{d\bar{P}}{\bar{P}} \right) = \frac{1}{2i\pi} \int_c \frac{d|P|^2}{|P|^2} = 0 \]
(since one can choose a parametrization of $|P|^2 = \varepsilon$ for $c$). Also
\[ \frac{1}{2i\pi} \left( \int_c \frac{dP}{P} - \int_c \frac{d\bar{P}}{\bar{P}} \right) = \frac{1}{2i\pi} \int_c \frac{\bar{P}\,dP - P\,d\bar{P}}{|P|^2} = \pm 2, \]
according to the orientation of $c$. Thus $\pm c$ is, indeed, mapped to $(1, -1) \in \mathbf{Z} \times \mathbf{Z}$.

Remark 1.5. Notice that a complex transversal to a branch of the complex discriminant has a natural orientation, but this is not the case for an $\mathbf{R}^2$ transversal to $\Sigma_{\mathbf{R}}$.

In the family of curves, those corresponding to parameters in $\Sigma$ have two double points. Choose a smooth curve nearby and call $\delta$, $\bar{\delta}$ the vanishing cycles corresponding to the two double points. The Picard-Lefschetz formula [17] gives, for the monodromy along the real loop $c$,
\[ \gamma \longmapsto \gamma + \varepsilon\big((\gamma \cdot \delta)\delta - (\gamma \cdot \bar{\delta})\bar{\delta}\big) \qquad (1) \]
where $\varepsilon = \pm 1$ according to the orientation of $c$. Note that this applies to cycles on the complex curve.
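The winding-number computation in the proof of Lemma 1.4 can be checked numerically. The following is a minimal sketch under stated assumptions: the linear form P(x, y) = ax + by with the particular coefficients below is a hypothetical choice (any a, b with non-real ratio would do), and the real loop c is taken to be the unit circle, oriented counterclockwise.

```python
import cmath
import math

def winding_number(f, steps=2000):
    """Winding number around 0 of the closed curve t -> f(cos t, sin t)."""
    total = 0.0
    prev = f(1.0, 0.0)
    for k in range(1, steps + 1):
        t = 2.0 * math.pi * k / steps
        cur = f(math.cos(t), math.sin(t))
        total += cmath.phase(cur / prev)  # small angle increment between samples
        prev = cur
    return round(total / (2.0 * math.pi))

# Hypothetical complex linear form; b/a is not real, so P vanishes on R^2 only at 0.
a, b = 2 + 1j, -1 + 3j

def P(x, y):
    return a * x + b * y

def P_bar(x, y):
    return (a * x + b * y).conjugate()

# Image of the real loop c under gamma -> (winding of P, winding of P_bar).
image_of_c = (winding_number(P), winding_number(P_bar))
print(image_of_c)
```

The two components are opposite, so their sum vanishes and their difference is ±2, as in the proof; reversing the orientation of c exchanges (1, −1) and (−1, 1).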
Fig. 2. The curve E
2. Elliptic Curves with Two Double Points

In Examples 1.1 and 1.2 above, the curves have genus 1. On a punctured neighbourhood of $(h_0, k_0)$, the two vanishing cycles corresponding to the two double points at $(h_0, k_0)$ are in the same homology class, so that formula (1) gives nothing, neither in the homology of the curve nor in that of its Jacobian. The problem is that the Jacobian of the curve has complex dimension 1. For each of these examples, we are going to construct a fibration in complex tori of dimension 2 for which the Picard-Lefschetz formula (1) will indeed give a nontrivial monodromy. We use different approaches in these different examples.
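To see concretely why formula (1) gives nothing in genus 1, here is a minimal sketch. It assumes a basis (α, δ) of the first homology of the elliptic curve with α · δ = 1 and represents cycles by integer coordinate pairs; the function names are illustrative only.

```python
def intersection(u, v):
    """Intersection number in the basis (alpha, delta), with alpha . delta = 1."""
    return u[0] * v[1] - u[1] * v[0]

def picard_lefschetz(gamma, delta, delta_bar, eps):
    """Formula (1): gamma + eps*((gamma.delta) delta - (gamma.delta_bar) delta_bar)."""
    c1 = intersection(gamma, delta)
    c2 = intersection(gamma, delta_bar)
    return tuple(g + eps * (c1 * d - c2 * db)
                 for g, d, db in zip(gamma, delta, delta_bar))

def single_twist(gamma, delta):
    """For contrast: the monodromy around one branch, gamma + (gamma.delta) delta."""
    c = intersection(gamma, delta)
    return tuple(g + c * d for g, d in zip(gamma, delta))

alpha = (1, 0)
delta = (0, 1)
delta_bar = (0, 1)  # same homology class as delta on the genus-1 curve

real_loop_image = picard_lefschetz(alpha, delta, delta_bar, eps=1)
one_branch_image = single_twist(alpha, delta)
print(real_loop_image, one_branch_image)
```

The two contributions cancel on the real loop, while a loop around a single branch of the discriminant already acts nontrivially; this is what motivates passing to tori of dimension 2.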
2.1. Double coverings of elliptic curves – Prym varieties.

Let us start with elliptic curves³ $E = E_{h,k}$ as in Example 1.1. For $(h, k)$ in a neighbourhood $U$ of $(h_0, k_0)$, the real part of $E$ has two connected components. We will choose four real points $A_i$ (for $0 \le i \le 3$), two on each component, and look at the double cover of $E$ branched at these points. To be more specific in the way these points depend on the parameters, I will assume that, in the equation of $E$, $A(x)y^2 + B_{h,k}(x)y + C(x) = 0$, $C$ is a constant and the polynomial $A$ has four distinct real roots $a_0 < a_1 < a_2 < a_3$, so that $A_i = (a_i, \infty)$ is indeed a real point of $E$ and the $A_i$’s are suitably divided on the real part, $A_0$ and $A_3$ on a component, $A_1$ and $A_2$ on the other (see the real part of the affine curve $E_{h,k}$, for $(h, k)$ in $U$, in the left part of Fig. 2).

Look at the covering map $x : E \to \mathbf{P}^1$. The branch points are the four roots of $\Delta_{h,k}(x)$. As $\Delta_{h_0,k_0}(x) = (ax^2 + bx + c)^2$ with $b^2 - 4ac < 0$, for the nearby value $(h, k)$, the polynomial $\Delta_{h,k}$ has nonreal distinct roots. Call $s$, $t$ the two roots in the upper half plane, $\bar{s}$ and $\bar{t}$ their conjugates. The right part of Fig. 2 shows the complex curve $E_{h,k}$ with its real part, the vanishing cycles $\delta$ and $\bar{\delta}$ (which are, indeed, conjugate to each other). The heavy points are the four points $A_i$, where $y = \infty$. Notice that $\delta \cup \bar{\delta}$ separates the curve in two connected components, each of which contains two of the points $A_i$.

Let us now look at the curve $D_{h,k}$ (nonsingular completion of the curve) of equation $z^4 A(x) + z^2 B_{h,k}(x) + C = 0$. This is a real curve, endowed with an involution $\tau(x, z) = (x, -z)$ which preserves the real structure and shows $D_{h,k}$ as a double cover of $E_{h,k}$ (by $y = z^2$) branched at the points $A_i$. Hence, $D_{h,k}$ is a genus-3 curve, smooth exactly when $E_{h,k}$ is smooth (recall that the points $A_i$ are assumed to be distinct). The projection $\pi : D \to E$ induces an injective map
\[ \pi^* : H^0(\Omega^1_E) \longrightarrow H^0(\Omega^1_D), \]
the image of which is the vector space of holomorphic forms $\eta$ on $D$ such that $\tau^*\eta = \eta$. Dualizing and factoring out the period lattice, this defines a surjective morphism
\[ \mathrm{Jac}(D) \longrightarrow \mathrm{Jac}(E), \]
the kernel of which is an Abelian dimension-2 subvariety of $\mathrm{Jac}(D)$, the Prym variety $\mathrm{Prym}(D|E)$. The relative Prym is a fibration of the complement of $(h_0, k_0)$ in complex 2-dimensional tori. This is the fibration of which we are going to compute the monodromy.

What we need to compute is the monodromy of the lattice of anti-invariant (with respect to $\tau$) elements of $H_1(D; \mathbf{Z})$. Let us use our description of $D$ and $E$ to define a basis of this group. We take two copies of $E$ and identify the corresponding points $A_i$ in both. In the $i$-th copy of $E$, we have the curves $\alpha_i$, $\delta_i$ and $\bar{\delta}_i$ corresponding to $\alpha$, $\delta$ and $\bar{\delta}$. Call $\beta$ a curve on $D$ obtained in the following way:
– go from $A_0$ to $A_1$ on the path $b$ of Fig. 2 in the first copy of $E$, meeting $\delta_1$ once transversally with intersection number 1,
– then go back from $A_1$ to $A_0$ in the second copy of $E$, meeting $\bar{\delta}_2$ once transversally, with intersection number $-1$ (see Fig. 3).

³ In this section, I will use the same symbol $E = E_{h,k}$ for the affine curve and for its nonsingular completion.
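Then $\alpha_1$, $\alpha_2$, $\beta$, $\delta_1$, $\bar{\delta}_1$ and $\bar{\delta}_2$ form a basis of $H_1(D; \mathbf{Z})$. Notice that we have succeeded, in the sense that $\delta_i \ne \bar{\delta}_i$ in $H_1(D; \mathbf{Z})$. In this group, we also have $\delta_1 + \delta_2 = \bar{\delta}_1 + \bar{\delta}_2$ so that $\delta_2 = \bar{\delta}_2 - (\delta_1 - \bar{\delta}_1)$. The orientations have been chosen so that $\alpha_i \cdot \delta_i = \alpha_i \cdot \bar{\delta}_i = 1$ and $\beta \cdot \bar{\delta}_1 = 1$, $\beta \cdot \bar{\delta}_2 = -1$.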
Fig. 3. The curve D
The real structure. Notice firstly that the real part of $D$ has two connected components (as the branch points are real and there are two of them on each component of the real part of $E$). The effect of the real involution $S$ on the cycles defined above is transparent, as $S$ comes from the real structure on $E$. We have:
\[ S(\alpha_i) = -\alpha_i, \qquad S(\delta_i) = \bar{\delta}_i. \]
Performing the complex conjugation on the path $b$, we also get $S(\beta) = \beta - \alpha_1 + \alpha_2$.

The involution $\tau$. Its effect is also transparent: the cycles with a label 1, 2 are sent to the same cycle with the other label, and $\tau(\beta) = -\beta$, so that the anti-invariant cycles are generated by $\alpha_1 - \alpha_2$, $\beta$, $\delta_1 - \bar{\delta}_1$, $\bar{\delta}_1 - \bar{\delta}_2$. The real cycles that are anti-invariant are thus generated by
\[ \alpha_1 - \alpha_2 - 2\beta \quad\text{and}\quad \delta_1 - \bar{\delta}_2. \]
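Monodromy. Look now at the monodromy of the relative Jacobian and Prym. Formula (1) gives
\[ \gamma \longmapsto \gamma + \varepsilon\big((\gamma \cdot \delta_1)\delta_1 + (\gamma \cdot \delta_2)\delta_2 - (\gamma \cdot \bar{\delta}_1)\bar{\delta}_1 - (\gamma \cdot \bar{\delta}_2)\bar{\delta}_2\big), \]
namely
\[ \alpha_i \longmapsto \alpha_i + \varepsilon(\delta_i - \bar{\delta}_i), \qquad \beta \longmapsto \beta + \varepsilon(-\bar{\delta}_1 + \bar{\delta}_2), \]
and the others fixed. If we restrict our attention to the real part, we find
\[ \alpha_1 - \alpha_2 - 2\beta \longmapsto \alpha_1 - \alpha_2 - 2\beta + 2\varepsilon(\delta_1 - \bar{\delta}_2). \]
Eventually, we have proved:

Theorem 2.1. The relative Jacobian on the complement of the discriminant in $U_{\mathbf{C}}$ is a fibration in complex 3-dimensional tori with monodromy matrix conjugate to
\[ \begin{pmatrix} \mathrm{Id} & 0 \\ N & \mathrm{Id} \end{pmatrix}, \qquad N = \begin{pmatrix} \varepsilon & -\varepsilon & 0 \\ -\varepsilon & \varepsilon & -\varepsilon \\ 0 & 0 & \varepsilon \end{pmatrix}. \]
The relative Prym is a fibration in complex 2-dimensional tori with monodromy matrix conjugate to
\[ \begin{pmatrix} \mathrm{Id} & 0 \\ N' & \mathrm{Id} \end{pmatrix}, \qquad N' = \begin{pmatrix} 2\varepsilon & 0 \\ 0 & -\varepsilon \end{pmatrix}. \]
The real part is a fibration in real tori of the complement of $(h_0, k_0)$ in $U_{\mathbf{R}}$ with monodromy matrix conjugate to
\[ \begin{pmatrix} 1 & 0 \\ 2\varepsilon & 1 \end{pmatrix}. \]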
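The restriction to the real anti-invariant cycles can be checked mechanically. The sketch below takes the images of α₁, α₂ and β stated above as given (it does not rederive them from intersection numbers), writes every cycle in the basis (α₁, α₂, β, δ₁, δ̄₁, δ̄₂) and uses the relation δ₂ = δ̄₂ − (δ₁ − δ̄₁); the helper names are illustrative.

```python
def unit(i):
    v = [0] * 6
    v[i] = 1
    return v

def add(u, v):
    return [a + b for a, b in zip(u, v)]

def scale(c, u):
    return [c * a for a in u]

def monodromy_images(eps):
    """Images of the basis (alpha1, alpha2, beta, delta1, delta1_bar, delta2_bar)."""
    a1, a2, b, d1, d1b, d2b = (unit(i) for i in range(6))
    d2 = add(d2b, add(scale(-1, d1), d1b))  # delta2 = delta2_bar - (delta1 - delta1_bar)
    return [
        add(a1, scale(eps, add(d1, scale(-1, d1b)))),  # alpha1 + eps(delta1 - delta1_bar)
        add(a2, scale(eps, add(d2, scale(-1, d2b)))),  # alpha2 + eps(delta2 - delta2_bar)
        add(b, scale(eps, add(scale(-1, d1b), d2b))),  # beta + eps(-delta1_bar + delta2_bar)
        d1, d1b, d2b,                                  # vanishing cycles are fixed
    ]

def apply(images, v):
    out = [0] * 6
    for coeff, img in zip(v, images):
        out = add(out, scale(coeff, img))
    return out

real_a = [1, -1, -2, 0, 0, 0]   # alpha1 - alpha2 - 2 beta
real_b = [0, 0, 0, 1, 0, -1]    # delta1 - delta2_bar

for eps in (1, -1):
    images = monodromy_images(eps)
    print(eps, apply(images, real_a), apply(images, real_b))
```

In the basis (real_a, real_b), with images written as columns, this is the matrix with rows (1, 0) and (2ε, 1) of the theorem.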
Fig. 4. The affine curve C
2.2. Elliptic vs. singular genus-2 curves – generalized Jacobians.

Let us now investigate the example of the family $y^2 + P_{h,k}(x) = 0$ (Example 1.2). The polynomial has degree 4 so that the curve $C_{h,k}$ (or $C$) is the complement of two points in an elliptic curve, with only one point at infinity, which is a singular point (Fig. 4, note that this picture is not a real one). Call $\bar{C}_{h,k}$ (or $\bar{C}$) the complete singular curve obtained by adding this point at infinity to the affine curve and $\tilde{C}_{h,k}$ (or $\tilde{C}$) the normalized curve, that has two points $\infty_\pm$ at infinity. In other words, $C$ can be completed in two different ways, and I will use both:
– Add a unique point at infinity, completing the affine curve in the completion $\mathbf{C}^2 \subset \mathbf{P}^2(\mathbf{C})$. In homogeneous coordinates $[x, y, Z]$, this is the point $\infty = [0, 1, 0]$. This is the completion $\bar{C}$.
– Normalize at infinity, that is, define a complete curve $\tilde{C}$ by separating the two branches. Note that $\tilde{C}$ is smooth at infinity (by definition) and has a degree-2 function to $\mathbf{P}^1(\mathbf{C})$ (extending $x$), with $x^{-1}(\infty) = \{\infty_+, \infty_-\}$.

The curve $\tilde{C}$ is a double cover of $\mathbf{P}^1(\mathbf{C})$ branched at the four roots of the polynomial $P_{h,k}$ and thus, when these roots are distinct, this is a genus-1 curve.

The simplest way to construct $\tilde{C}$ is to complete the affine curve, not in $\mathbf{P}^2(\mathbf{C})$, but in the total space of the bundle $\mathcal{O}(2) \to \mathbf{P}^1(\mathbf{C})$. The affine coordinate $x$ becomes an element of $\mathbf{P}^1(\mathbf{C})$, the fibration $(x, y) \mapsto x$ from $\mathbf{C}^2$ to $\mathbf{C}$ becomes the fibration $\mathcal{O}(2) \to \mathbf{P}^1(\mathbf{C})$ defined by gluing $(\mathbf{P}^1 - \{\infty\}) \times \mathbf{C}$ and $(\mathbf{P}^1 - \{0\}) \times \mathbf{C}$ by the map
\[ (x, y) \longmapsto \left( \frac{1}{x}, \frac{y}{x^2} \right). \]
The complete curve $\tilde{C}$ is naturally embedded in this surface, the two points at infinity are the points $x = \infty$, $y/x^2 = \pm i$.

Figure 5 represents the two completions of $C$, in a neighborhood of $\infty$. The dotted lines represent the projection onto the $x$-axis (projective line). Notice that $\tilde{C}$ is smooth if and only if $C$ is.

Fig. 5. The complete curves

Notice that, as we plan to use the Jacobians of the curves in the family (the relative Jacobian) as a model for the Liouville tori in a 2-degrees of freedom system, we should prefer to use genus-2 curves. The singular curve $\bar{C}$ has, indeed, genus 2. When $\tilde{C}$ is smooth, $\bar{C}$ has a generalized Jacobian⁴ which is the extension
\[ 1 \longrightarrow \mathbf{C}^* \longrightarrow \mathrm{Jac}(\bar{C}) \longrightarrow \mathrm{Jac}(\tilde{C}) \longrightarrow 0 \]
of the Jacobian of the elliptic curve $\tilde{C}$ (the quotient of $\mathbf{C}$ by a lattice $\Lambda$) by a multiplicative group $\mathbf{C}^*$. Recall that the period lattice $\Lambda$ of the elliptic curve $\tilde{C}$ is given by the integration of the holomorphic form $dx/y$ on the cycles in $\tilde{C}$. In the same way, the noncompact subgroup $\mathbf{C}^*$ created by the identification of $\infty_-$ and $\infty_+$ into a single point $\infty \in \bar{C}$ can be materialized by the integration of the meromorphic 1-form $x\,dx/y$ on a small loop $\gamma$ around $\infty$.

Look now at the real structure induced by that of $C$ on the Jacobians. The singular affine curve $C_{h_0,k_0}$ has equation $y^2 + (x^2 + ax + b)^2 = 0$ (with $a^2 - 4b < 0$), so that it has no real point, and the same is true for the smooth nearby curves. Then it is known that $\mathrm{Jac}(\tilde{C}) = \mathrm{Pic}^0(\tilde{C})$ (we must be specific with the degrees, or at least with their parity, here) is a real hyperelliptic curve whose real part has two connected components. On the other hand, the map $\mathbf{C}^* \to \mathrm{Pic}(\bar{C})$ associates to the nonzero complex number $z$ the divisor of any function $f$ on $\tilde{C}$ such that $f(\infty_+)/f(\infty_-) = z$. Let us look at the real structure induced on $\mathbf{C}^*$:
\[ \frac{f(\infty_+)}{f(\infty_-)} = z \longmapsto \frac{\overline{f(\infty_-)}}{\overline{f(\infty_+)}} = \frac{1}{\bar{z}}, \]
so that the real points are, indeed, the points of the unit circle in $\mathbf{C}^*$. Hence we see that the real part of the (noncompact) Jacobian of the curve $\bar{C}$ is a compact 2-torus.

Let us now be more specific on the curves which generate the homology of this torus. Fix a neighbourhood $U_{\mathbf{R}}$ of the critical point $(h_0, k_0)$ such that $(h_0, k_0)$ is the unique critical point in $U_{\mathbf{R}}$. Choose a regular point $(h, k)$ in $U_{\mathbf{R}}$. The polynomial has two pairs of conjugate (nonreal) roots. Let, as above, $s$, $t$ be the two roots in the upper half plane and $\delta$ be a cycle in $C = C_{h,k}$ above the segment $[s, t]$, let $\bar{\delta}$ be the conjugated cycle and let $\alpha$ be a cycle above the segment $[s, \bar{s}]$ (see Fig. 6). Choose orientations so that $\alpha \cdot \delta = 1$.

⁴ See Sect. 4 and the references given there.
Fig. 6. The curve C¯
Notice that the two double points of $C_{h_0,k_0}$ are obtained when $s$ and $t$ come together, so that $\delta$ and $\bar{\delta}$ are indeed the vanishing cycles considered above. The complex curve $\bar{C}$ is represented on Fig. 6 with the cycles $\alpha$, $\delta$ and $\bar{\delta}$, which we will use as a basis of a suitable version of its first homology group, namely of the group $H_1(C; \mathbf{Z})$.

Consider now the real structure $S$ defined by the complex conjugation of coordinates on $\tilde{C}$. The holomorphic 1-form $dx/y$ is real in the sense that $S^*(dx/y) = \overline{dx/y}$. Also, $S(\delta) \sim \delta$ and $S(\alpha) \sim -\alpha$. But the real structure induced by $S$ on the Jacobian is a little more tricky, as $\tilde{C}$ has no real point while $\mathrm{Pic}^0(\tilde{C})_{\mathbf{R}}$ has two connected components. It turns out that the Abel-Jacobi map “defined” on divisors on $\tilde{C}$ by
\[ P - Q \longmapsto i \int_Q^P \frac{dx}{y} \]
defines an isomorphism of real curves
\[ (\mathrm{Pic}^0(\tilde{C}), S) \longrightarrow (\mathbf{C}/\Lambda,\ z \mapsto -\bar{z}) \]
(see, e.g., Sect. 5.3 for a concrete example). The period lattice $\Lambda$ is generated by
\[ i \int_\delta \frac{dx}{y} \quad\text{and}\quad i \int_\alpha \frac{dx}{y}, \]
so that the real cycle is $\alpha$.

Notice finally that in $H_1(\tilde{C}; \mathbf{Z})$, $\delta = \bar{\delta}$ but that this is not the case in $H_1(\bar{C}; \mathbf{Z})$, as $\delta - \bar{\delta}$ is represented by a small loop $\gamma$ around $\infty$ and we have
\[ \int_{\delta - \bar{\delta}} \frac{x\,dx}{y} = \int_\gamma \frac{x\,dx}{y} = \pm \int_\gamma \frac{dx}{ix} = \pm 2\pi \ne 0. \]
We have proved:

Proposition 2.2. The relative Jacobian of the family of curves on the complement of the discriminant in $U_{\mathbf{C}}$ is a fibration in complex tori
\[ 1 \longrightarrow \mathbf{C}^* \longrightarrow \mathrm{Pic}^0(\bar{C}) \longrightarrow \mathrm{Pic}^0(\tilde{C}) \longrightarrow 0 \]
with real structure such that the connected components of the real part of the fiber on the complement of $(h_0, k_0)$ in $U_{\mathbf{R}}$ are fibrations in compact real tori
\[ 1 \longrightarrow S^1 \longrightarrow T^2 \longrightarrow \mathbf{R}/T\mathbf{Z} \longrightarrow 0. \]
The homology of the real torus $T^2$ is generated by two elements $\tilde{\alpha}$, $\gamma$, such that
– the image of $\tilde{\alpha}$ in $H_1(\mathbf{R}/T\mathbf{Z}; \mathbf{Z})$ is the generator $\alpha$ coming from $H_1(C; \mathbf{Z})$ and satisfying $\int_\alpha dx/y = iT$,
– $\gamma$ is the image of the generator of $H_1(S^1; \mathbf{Z})$ coming from $H_1(C; \mathbf{Z})$, that is, the difference $\delta - \bar{\delta}$ of the two vanishing cycles and satisfying $\int_\gamma x\,dx/y = \pm 2\pi$.

Now, apply the Picard-Lefschetz formula (1) to get:

Corollary 2.3. The monodromy of the fibration in tori over $U_{\mathbf{R}} - \{(h_0, k_0)\}$ is given by
\[ \tilde{\alpha} \longmapsto \tilde{\alpha} + \varepsilon\gamma \quad\text{and}\quad \gamma \longmapsto \gamma. \]

Remark 2.4. Notice that the fact that there is, indeed, real monodromy in this situation reflects the fact that the complex monodromy respects the real structure on the Jacobian. This is certainly not the case turning around a single branch of the discriminant.

3. Hamiltonian Monodromy

The two families of curves $C_{h,k}$ and $D_{h,k}$ used so far are the families of spectral curves for certain Lax equations. We will apply the previous study to the determination of the Hamiltonian monodromy of the corresponding integrable systems. Both are integrable systems with two degrees of freedom. The regular levels are (unions of) 2-dimensional tori. In both cases, the set of regular values of the momentum mapping contains a punctured disk (see Figs. 7 and 8). According to a general theorem of Nguyen Tien Zung [19], if the singular point inside this disk is a “focus-focus” point, the fibration by Liouville tori should have nontrivial monodromy. I will use the previous results:
– to prove that this is, indeed, the case for the geodesic flow on SO(4), using Pryms, and
– to prove that the well known Hamiltonian monodromy [11, 10] of the spherical pendulum, usually determined by direct computation, is indeed given by the Picard-Lefschetz formula as above, using generalized Jacobians.

3.1. Hamiltonian monodromy for geodesic flows on SO(4).

The geodesic flow of an invariant metric on SO(n) gives rise to the differential system (Euler-Arnold equations [2])
\[ \dot{M} = [M, \Omega], \]
where $M$ and $\Omega$ are skew-symmetric matrices related by $M = \Omega J + J\Omega$ for some diagonal constant matrix $J$. It also describes the motion of a free rigid body in a three dimensional space when $M$ and $\Omega$ are $3 \times 3$-matrices. For this reason, the case of $4 \times 4$-matrices, that we will study now, is sometimes called the “4-dimensional rigid body”.
A Lax equation. The Euler-Arnold equations have been put in Lax form by Manakov [18]:
\[ \frac{d}{dt}(M + J^2\lambda) = [M + J^2\lambda,\ \Omega + J\lambda]. \]
Call $b_i$ ($0 \le i \le 3$) the diagonal entries of the matrix $J$ and assume that they are distinct, and satisfy $b_0^2 < b_1^2 < b_2^2 < b_3^2$. Without loss of generality, assume that $b_0 = 0$ and write $a_i = b_i^2$. As is usual, write the skew-symmetric matrix $M$ as
\[ M = (x, y) = \begin{pmatrix} 0 & -x_3 & x_2 & y_1 \\ x_3 & 0 & -x_1 & y_2 \\ -x_2 & x_1 & 0 & y_3 \\ -y_1 & -y_2 & -y_3 & 0 \end{pmatrix}. \]
The spectral curve, given by the characteristic polynomial $\det(M + \lambda J^2 - \lambda\mu\,\mathrm{Id})$ of the Lax matrix $A_\lambda = M + J^2\lambda$, has the equation
\[ \lambda^4 \mu \prod_{i=1}^{3} (\mu - a_i) + \lambda^2 \big( \mu^2 f_1(x, y) - \mu H(x, y) + K(x, y) \big) + f_2(x, y)^2 = 0, \]
where
\[ f_1(x, y) = \sum_{i=1}^{3} x_i^2 + \sum_{i=1}^{3} y_i^2 \ \text{is the square of the norm of } M, \qquad f_2(x, y) = \sum_{i=1}^{3} x_i y_i \ \text{is its Pfaffian}, \]
\[ H(x, y) = \sum_{i=1}^{3} a_i x_i^2 + \frac{1}{2} \sum_{\{i,j,k\} = \{1,2,3\}} (a_i + a_j) y_k^2, \qquad K(x, y) = \frac{1}{2} \sum_{\{i,j,k\} = \{1,2,3\}} a_i a_j y_k^2. \]
We fix $f_1 > 0$ and $f_2$ such that $|2f_2| < f_1$. Then the set of skew-symmetric matrices $M = (x, y)$ that have squared norm $f_1$ and Pfaffian $f_2$ is a coadjoint orbit $\mathcal{O}_f$ of SO(4) diffeomorphic to $S^2 \times S^2$. The two functions $H$ and $K$ are then in involution on this symplectic manifold, the Euler-Arnold equations being the Hamiltonian system associated with $H$.

The algebro-geometric features of this integrable system were investigated in the classical and beautiful paper of Haine [16] and some of its topological aspects were described by Oshemkov [20].

As in Sect. 2.1, call $E$ the elliptic curve, that is the quotient of $D$ by the involution $\tau(\mu, \lambda) = (\mu, -\lambda)$. The spectral curve $D$ and the curve $E$ are of the type considered in Sect. 2.1. The points $A_i$ are indeed the points $(a_i, \infty)$ (with $a_0 = 0$). The parameters in the equations of $E$ and $D$ are all real and satisfy:
– The orbit invariants $f_1$ and $f_2$ are two fixed real numbers such that $f_2 \ne 0$, $f_1^2 - 4f_2^2 = 4\beta^2 > 0$.
– The values $h$, $k$ of $H$ and $K$ are positive parameters.
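The spectral curve equation stated above can be compared with a direct determinant computation in exact integer arithmetic. This is a minimal sketch under an explicit assumption: the text does not say in which order the entries of J sit on the diagonal, so the code takes J² = diag(a₁, a₂, a₃, a₀) with a₀ = 0, the ordering for which the matrix M written above reproduces the stated first integrals. The sample values are arbitrary.

```python
from itertools import permutations

def det(m):
    """Determinant by the Leibniz expansion (fine for a 4 x 4 integer matrix)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n)
                         if perm[i] > perm[j])
        term = -1 if inversions % 2 else 1
        for i in range(n):
            term *= m[i][perm[i]]
        total += term
    return total

def lax_determinant(x, y, a, lam, mu):
    """det(M + lam*J^2 - lam*mu*Id), assuming J^2 = diag(a1, a2, a3, 0)."""
    x1, x2, x3 = x
    y1, y2, y3 = y
    m = [[0, -x3, x2, y1],
         [x3, 0, -x1, y2],
         [-x2, x1, 0, y3],
         [-y1, -y2, -y3, 0]]
    diag = [a[0], a[1], a[2], 0]  # assumed ordering of the entries a_i = b_i^2
    for i in range(4):
        m[i][i] += lam * (diag[i] - mu)
    return det(m)

def curve_polynomial(x, y, a, lam, mu):
    """Left-hand side of the spectral curve equation, with f1, f2, H, K as in the text."""
    a1, a2, a3 = a
    f1 = sum(t * t for t in x) + sum(t * t for t in y)
    f2 = sum(s * t for s, t in zip(x, y))
    H = (a1 * x[0] ** 2 + a2 * x[1] ** 2 + a3 * x[2] ** 2
         + (a2 + a3) * y[0] ** 2 + (a1 + a3) * y[1] ** 2 + (a1 + a2) * y[2] ** 2)
    K = a2 * a3 * y[0] ** 2 + a1 * a3 * y[1] ** 2 + a1 * a2 * y[2] ** 2
    return (lam ** 4 * mu * (mu - a1) * (mu - a2) * (mu - a3)
            + lam ** 2 * (mu ** 2 * f1 - mu * H + K) + f2 ** 2)

sample = dict(x=(1, 2, 3), y=(4, 5, 6), a=(2, 3, 5), lam=7, mu=11)
print(lax_determinant(**sample) == curve_polynomial(**sample))
```

Only even powers of λ occur, which is the algebraic reason for the involution τ(µ, λ) = (µ, −λ) on the spectral curve.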
Then, it can be shown (see Sect. 5.1 for a justification) that, for $\beta$ in a suitable interval $(\varepsilon_1, \varepsilon_2)$, the point
\[ (h_0, k_0) = \left( \frac{1}{2}(f_1 + 2\beta)(a_1 + a_2) + \frac{1}{2}(f_1 - 2\beta)a_3,\ \frac{1}{2}(f_1 + 2\beta)a_1 a_2 \right) \]
is a point at which $\Delta_{h_0,k_0}(x) = (ax^2 + bx + c)^2$ with $b^2 - 4ac < 0$, thus two nonreal branches of the discriminant intersect at this point.
Eigenvector mapping. We consider now the eigenvectors of the Lax matrix $A_\lambda = M + J^2\lambda$. This will give us a very precise dictionary between the topology and the algebraic geometry. Fix a value $(h, k)$ of $(H, K)$. This fixes a curve $D = D_{h,k}$. Consider a point $(\lambda, \mu) \in D$. The eigenvectors of the Lax matrix for the eigenvalue $\lambda\mu$ form a line bundle on the complement of the branch points of $\lambda$, a sub-bundle of the trivial bundle $D \times \mathbf{C}^4$. When the curve $D$ is smooth, there is a unique way to extend this line bundle to the whole of $D$ as a sub-line bundle of $D \times \mathbf{C}^4$ (an easy exercise, see [15]). Thus, we have a line bundle, the eigenvector bundle $L$, on the complete curve $D$.

Let $T_D$ be the common level set of the first integrals $H$ and $K$ corresponding to the curve $D$. This is a subset of the phase space $S^2 \times S^2$. Letting the point $(x, y)$ vary in the level $T_D$, we get a map (recall we assume $D$ to be smooth):
\[ \varphi_D : T_D \longrightarrow \mathrm{Pic}(D), \]
where the group $\mathrm{Pic}(D)$ of isomorphism classes of line bundles over $D$ is identified with the set of linear equivalence classes of divisors on $D$.

Remark 3.1. Eigenvector mappings turn out to be very useful in the investigation of algebraically integrable systems as they allow us to understand the geometry of the system with the help of the algebraic geometry of a curve: determination of Liouville tori⁵, determination of the regular levels⁶, construction of action coordinates⁷. Here we will use it to get information on the monodromy.

Proposition 3.2. Assume the elliptic curve $E$ is smooth. Then the tangent mapping to $\varphi_D$ maps the Hamiltonian vector fields of $H$ and $K$ to independent vectors in $H^1(\mathcal{O}_D) = T_{\cdot}\,\mathrm{Pic}(D)$.

⁵ See [8] for the Kowalevski top.
⁶ See [3] for the case of geodesics of quadrics, Corollary 3.4 for the geodesic flow on SO(4) and Corollaries 3.10 and 3.13 for the spherical pendulum.
⁷ See [5, 7].
Proof. Assume $E$ is smooth. Then, $D$ is smooth. We use the covering $D = U_+ \cup U_-$, $U_+$ (resp. $U_-$) being the open set where $\lambda \ne \infty$ (resp. $\lambda \ne 0$). The cohomology group $H^1(\mathcal{O}_D)$ is isomorphic with the first cohomology group of this covering. We are going to express the images of the Hamiltonian vector fields in $H^1(\mathcal{O}_D)$ as cocycles of this covering. Notice that the Hamiltonian vector fields $X_H$ and $X_K$ generate the same subspace in $T_{\cdot}\,\mathcal{O}_f$ as the Hamiltonian vector fields of the functions $H_3(M) = 3\,\mathrm{tr}(J^2 M)$ and $H_4(M) = 6\,\mathrm{tr}(J^4 M)$. Now these two functions have the form
\[ H_k(M) = \mathrm{Res}\ \lambda\,\mathrm{tr}\big(\lambda^{-1}(M + J^2\lambda)\big)^k\,d\lambda \quad\text{for } k = 3, 4. \]
Using the standard linearization theorem of [1] and [21], it is seen that the corresponding Hamiltonian vector fields are mapped to the cocycles
\[ \lambda\big(\lambda^{-1}(\lambda\mu)\big)^k = \lambda\mu^k, \quad k = 3, 4 \]
(recall that our eigenvalue is $\lambda\mu$). It is proved that these two cocycles are independent by a residue computation (see Sect. 5.2).

Remark 3.3. Notice that these two cocycles are anti-invariant (with respect to $\tau$) so that the eigenvector mapping maps $T_D$ into a subvariety of $\mathrm{Pic}(D)$ parallel to $\mathrm{Prym}(D|E)$.

Notice that Proposition 3.2 implies that the eigenvector mapping is a covering of its image, a result of Haine that we will state more precisely below.

Corollary 3.4. If $(h, k)$ is such that the elliptic curve $E$ is smooth, then this is a regular value of the momentum mapping.

Proof. If the images of the Hamiltonian vector fields $X_H$ and $X_K$ are independent, the two vectors themselves are independent.

A complete picture of the discriminant of this family of curves can be found in Chapter IV of [4]. We will only need the case, considered in Example 1.1, where $\varepsilon_1 < \beta < \varepsilon_2$ (see Sect. 5.1). In this case, the discriminant has the form depicted in Fig. 7, which, thanks to our Corollary 3.4, is coherent with Oshemkov’s pictures [20].

Proposition 3.5 ([16]). If $E$ is smooth, the eigenvector mapping $\varphi_D : T_D \to \mathrm{Pic}(D)$ is a covering of degree 4 of its image. This image is isomorphic to an open subset of $\mathrm{Prym}(D|E)$.

Proof. We have already mentioned that $\varphi_D$ was a covering map and that its image was contained in a subvariety parallel to $\mathrm{Prym}(D|E)$. The pairs $(x, y)$ such that all the corresponding $M + J^2\lambda$ belong to the same conjugacy class modulo $GL(4; \mathbf{C})$ (and are in the same SO(4)-orbit) are all the
\[ \left( \begin{pmatrix} \varepsilon x_1 \\ \eta x_2 \\ \varepsilon\eta x_3 \end{pmatrix}, \begin{pmatrix} \varepsilon y_1 \\ \eta y_2 \\ \varepsilon\eta y_3 \end{pmatrix} \right) \quad\text{with } \varepsilon^2 = \eta^2 = 1. \]
Fig. 7. The regular values
Hamiltonian monodromy. We now deduce the Hamiltonian monodromy from the algebro-geometric description in Theorem 2.1.

Theorem 3.6. Assume $\varepsilon_1 < \beta < \varepsilon_2$. Then the set of regular values of the momentum mapping $(H, K)$ for the geodesic flow on SO(4) has a connected component which is not simply connected. The monodromy of the fibration in Liouville tori along a loop generating the fundamental group is given by the matrix $\begin{pmatrix} 1 & 0 \\ \varepsilon & 1 \end{pmatrix}$ (where $\varepsilon = \pm 1$).

Proof. We relate the relative Prym to the fibration in Liouville tori by the eigenvector mapping. According to Oshemkov [20], the regular levels close to the singular point $(h_0, k_0)$ have two connected components. We concentrate on the monodromy of the fibration for one of them. According to Haine, the eigenvector mapping is a 4-fold covering map. This is of course a real map, so that it maps our Liouville tori to the real part of the Prym variety by a degree-2 covering map. In each fiber, we thus have a map $T^2 \to T^2$ inducing, at the level of fundamental groups,
\[ \mathbf{Z}^2 \longrightarrow \mathbf{Z}^2, \qquad (a, b) \longmapsto (a, 2b). \]
The only possibility to have monodromy $\begin{pmatrix} 1 & 0 \\ 2\varepsilon & 1 \end{pmatrix}$ on the right is to have monodromy $\begin{pmatrix} 1 & 0 \\ \varepsilon & 1 \end{pmatrix}$ on the left.

3.2. Hamiltonian monodromy for the spherical pendulum.

Both the Hamiltonian systems for the spinning top and the spherical pendulum can be described by Lax equations whose spectral curves are of the type considered in Example 1.2 and Sect. 2.2. To be more specific, I concentrate here on the case of the spherical pendulum. The spherical pendulum is the mechanical system described on the unit sphere in $\mathbf{R}^3$ by the Hamiltonian
\[ H = \frac{1}{2}\|p\|^2 - \Gamma \cdot q, \]
$q$ denoting the position of the ball on the unit sphere, $\Gamma$ the (constant) gravity and $p$ the momentum. The phase space is
\[ TS^2 = \{ (q, p) \in \mathbf{R}^3 \times \mathbf{R}^3 \mid \|q\|^2 = 1 \text{ and } p \cdot q = 0 \}, \]
and the Hamiltonian system is
\[ \dot{q} = p, \qquad \dot{p} = \Gamma - (q \cdot \Gamma + \|p\|^2)q. \]
The system is invariant under the rotations around the vertical (i.e., $\Gamma$) axis, so that the momentum $K = (q \times p) \cdot \Gamma$ is a first integral: the spherical pendulum is a completely integrable system.

Remark 3.7. The $S^1$-action by rotations around the vertical is generated by the flow of $K$. Considering $q$, $p$ as vectors in $\mathbf{C}^3$, this action complexifies into a $\mathbf{C}^*$-action. This is the reason why, in this example as in the spinning top case, one should expect to need noncompact tori (as are the generalized Jacobians). See, more generally, [14, 13, 6].

The elliptic curve, down-to-earth approach. It is more or less obvious (and in any case well known) that there should be an elliptic curve present. Choose an orthonormal basis of $\mathbf{R}^3$ such that $\Gamma = -e_3$ and eliminate $q_1$, $q_2$, $p_1$ and $p_2$ using $H$ and $K$ to get
\[ p_3^2 = 2(H - q_3)(1 - q_3^2) - K^2, \]
so that $x = q_3$ satisfies a differential equation of the form $\dot{x}^2 = f(x)$, where $f$ is a degree-3 polynomial. The elliptic curve $X$ of equation
\[ y^2 = 2x^3 - 2Hx^2 - 2x + 2H - K^2 \]
thus plays a role. It allows one to solve the equations, using the Weierstrass ℘-function it defines. Even more directly, it shows, as the polynomial $f$ must have two real roots⁸ between $-1$ and $1$, that the ball will oscillate between two horizontal parallel circles on the unit sphere, according to everyday experience.

A Lax equation. The best way to understand why there should be elliptic curves around is to re-write the Hamiltonian system as a Lax equation with a spectral parameter. A Lax equation for the spherical pendulum involving matrices in the Lie algebra $\mathfrak{so}(3, 1)$ appears in [22] as an example of the beautiful general constructions explained in this highly recommendable paper. Here I will use something simpler⁹. As usual, identify vectors in $\mathbf{R}^3$ with skew-symmetric $3 \times 3$-matrices (and the vector cross product with the matrix bracket).
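The elimination leading to the cubic f can be checked numerically at a point of the phase space. The sketch below assumes, as in the text, Γ = −e₃; the raw vectors used to build a point of TS² are arbitrary sample values, and the helper names are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

GAMMA = (0.0, 0.0, -1.0)  # gravity, Gamma = -e3 as in the text

def point_of_TS2(q_raw, p_raw):
    """Normalize q_raw and project p_raw onto the tangent plane at q."""
    norm = math.sqrt(dot(q_raw, q_raw))
    q = tuple(c / norm for c in q_raw)
    pq = dot(p_raw, q)
    p = tuple(pc - pq * qc for pc, qc in zip(p_raw, q))
    return q, p

def first_integrals(q, p):
    H = 0.5 * dot(p, p) - dot(GAMMA, q)
    K = dot(cross(q, p), GAMMA)
    return H, K

def f(x, H, K):
    """Right-hand side of the equation of the elliptic curve X."""
    return 2 * x ** 3 - 2 * H * x ** 2 - 2 * x + 2 * H - K ** 2

q, p = point_of_TS2((1.0, 2.0, 2.0), (0.3, -1.0, 0.5))
H, K = first_integrals(q, p)
lhs = p[2] ** 2                                        # p3^2
factored = 2 * (H - q[2]) * (1 - q[2] ** 2) - K ** 2   # form obtained by elimination
print(lhs, factored, f(q[2], H, K))
```

All three printed values agree up to rounding, which confirms both the elimination of q₁, q₂, p₁, p₂ and the expanded form of the cubic.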
The Lax equation
\[ \frac{d}{dt}\big(q + \lambda(p \times q) - \lambda^2\Gamma\big) = \big[q + \lambda(p \times q) - \lambda^2\Gamma,\ p \times q - \lambda\Gamma\big] \]
is equivalent to the Hamiltonian system on $TS^2$. Then, there is a spectral curve given by the characteristic polynomial of the Lax matrix. Let $A_\lambda = q + \lambda(p \times q) - \lambda^2\Gamma - \mu\,\mathrm{Id}$. Then
\[ \det A_\lambda = -\mu\Big(\mu^2 + \big\|q + \lambda(p \times q) - \lambda^2\Gamma\big\|^2\Big) \]
\[ = -\mu\Big(\mu^2 + \lambda^4 - 2\lambda^3 (p \times q) \cdot \Gamma + 2\lambda^2\big(\tfrac{1}{2}\|p \times q\|^2 - \Gamma \cdot q\big) + \|q\|^2\Big) \]
\[ = -\mu\big(\mu^2 + \underbrace{\lambda^4 + 2K\lambda^3 + 2H\lambda^2 + 1}_{P_{H,K}(\lambda)}\big), \]
where we recover the first integrals.

⁸ It always has a real root in $]1, +\infty[$; for a real motion it must also have roots in $[-1, 1]$.
⁹ The Lax equation I use here is so simple that it probably belongs to folklore. I learned it from Alexei Reyman.
We will thus use the curve¹⁰ of equation $\mu^2 + P_{H,K}(\lambda) = 0$.

Remark 3.8. It is also possible to replace $\mathfrak{so}(3)$ by $\mathfrak{su}(2)$ to avoid the extra factor $\mu$ as is done in [14] for the Lagrange top. I have preferred to use real matrices in order to keep an eye on the reality questions, especially when I will use eigenvectors.

Now we have a family of the type we have discussed in Sect. 2.2. I will use the notation $C$ (affine), $\tilde{C}$ (elliptic, normalized) and $\bar{C}$ (genus 2, singular)¹¹ as above.

Eigenvector mapping (classical). We now come to the eigenvectors of the Lax matrix $A_\lambda = q + \lambda(p \times q) - \lambda^2\Gamma$. They will give us a very precise dictionary between the topology of the Liouville tori and the geometry of the relative Jacobian. Fix a value $(h, k)$ of $(H, K)$. This fixes a curve $C$. Consider a point $(\lambda, \mu) \in C$. The eigenvectors of the Lax matrix for the eigenvalue $\mu$ form a line bundle on the complement of the branch points of $\lambda$, a subbundle of the trivial bundle $C \times \mathbf{C}^3$. As above, when the curve $C$ is smooth, there is a unique way to extend this line bundle to the whole of $\tilde{C}$ as a sub-line bundle of $\tilde{C} \times \mathbf{C}^3$. Thus, we have a line bundle, the eigenvector bundle $L$, on the complete curve $\tilde{C}$.

Let $T_C$ be the common level set of the first integrals $H$ and $K$ corresponding to the curve $C$. This is a subset of the phase space $TS^2$. Letting the point $(q, p)$ vary on the level $T_C$, we get a map (recall we assume $C$ to be smooth):
\[ \varphi_C : T_C \longrightarrow \mathrm{Pic}(\tilde{C}), \]
where the group $\mathrm{Pic}(\tilde{C})$ of isomorphism classes of line bundles over $\tilde{C}$ is identified with the set of linear equivalence classes of divisors on $\tilde{C}$. This mapping relates the geometry of the spherical pendulum to the algebraic geometry of $\tilde{C}$.

Eigenvector mapping (modified). We can also define an eigenvector bundle on the singular curve¹² $\bar{C}$. Let us look at what happens when $(\lambda, \mu)$ tends to infinity in $C$. Write
\[ \big(q + \lambda(p \times q) - \lambda^2\Gamma\big) \cdot v = \mu v. \]
The eigenvalue $\mu$ is equivalent to $\pm i\lambda^2$, so that we are interested in the eigenvectors of the matrix $-\Gamma$ (the infinitesimal rotation around the vertical axis) with respect to the eigenvalues $\pm i$, namely in the two vectors ${}^t(1, \pm i, 0)$, one for each branch. We have found a preferred basis of $L_{\infty_+}$ and $L_{\infty_-}$, so that we have an isomorphism $L_{\infty_+} \to L_{\infty_-}$, which allows one to define a line bundle over the singular curve $\bar{C}$. Letting the point $(q, p)$ vary on the level $T_C$, we now get a map (recall we assume $C$ to be smooth):
\[ \varphi_{\bar{C}} : T_C \longrightarrow \mathrm{Pic}(\bar{C}), \]
where the group $\mathrm{Pic}(\bar{C})$ of isomorphism classes of line bundles over $\bar{C}$ is identified with the set of linear equivalence classes of divisors on $\bar{C}$. The neutral component is a generalized Jacobian, a dimension-2 complex algebraic group. The exact sequence relating it to the usual Picard group of the smooth curve $\tilde{C}$,
\[ 0 \longrightarrow \mathbf{C}^* \longrightarrow \mathrm{Pic}(\bar{C}) \longrightarrow \mathrm{Pic}(\tilde{C}) \longrightarrow 0, \]
in which $\varphi_C : T_C \to \mathrm{Pic}(\tilde{C})$ is the composition of $\varphi_{\bar{C}} : T_C \to \mathrm{Pic}(\bar{C})$ with the projection, allows one to relate the two “eigenvector mappings”. We do not need to compute the map $\varphi_C$ explicitly. But we will use its tangent mapping.

Tangent map to $\varphi_{\bar{C}}$. The next aim is to show that, under $\varphi_{\bar{C}}$, the flows of $H$ and $K$ are mapped to linear and independent flows on $\mathrm{Pic}(\bar{C})$. Here we really need the generalized Jacobian: the level set $T_C$ is 2-dimensional while the Jacobian $\mathrm{Pic}(\tilde{C})$ of the genus-1 curve $\tilde{C}$ is only 1-dimensional (either over $\mathbf{C}$ or over $\mathbf{R}$, but over the same field for both!). Let us begin by the usual eigenvector mapping $\varphi_C$.

¹⁰ The curve $C$ defined by this equation is isomorphic to the elliptic curve $X$ defined above. See Sect. 5.3 for a more precise statement.
¹¹ The analogue of the singular curve $\bar{C}$ is used by Gavrilov and Zhivkov [14] to describe the “complex geometry” of the spinning top. Here we will use it to describe the “real monodromy”. See also [23], where the monodromy of the top is determined, still by a direct computation, but using the singular curves.
¹² See Sect. 4 and the references given there for the generalized Jacobian and the Picard group of the singular curve $\bar{C}$.
Hamiltonian Monodromy via Picard-Lefschetz Theory
477
Proposition 3.9. The eigenvector mapping $\varphi_C$ is constant on the orbits of the vector field $X_K$. Its tangent mapping maps the Hamiltonian vector field $X_H$ to the class in $H^1(\widetilde{C}; \mathcal{O}_{\widetilde{C}})$ of the 1-cocycle $\lambda^{-1}\mu$ of the covering of $\widetilde{C}$ by $U_0$ (where $\lambda \neq \infty$) and $U_\infty$ (where $\lambda \neq 0$).

Proof. The flow of K is the flow of rotations around the vertical axis. The first assertion can of course be checked directly. For further use, I will prove it in an indirect way. The differential system associated with $X_K$ is
$$\dot{q} = \Gamma \times q, \qquad \dot{p} = \Gamma \times p,$$
a system that has an obvious Lax form using our matrix $A_\lambda$:
$$\frac{d}{dt}\left(q + \lambda(p \times q) - \lambda^2\Gamma\right) = \left[q + \lambda(p \times q) - \lambda^2\Gamma,\ -\Gamma\right].$$
The matrix $B_\lambda$ in this Lax equation is
$$-\Gamma = \left(\lambda^{-2}(q + \lambda(p \times q) - \lambda^2\Gamma)\right)_+$$
(the notation $+$ denotes the polynomial part of a Laurent polynomial). Using as above the linearization theorem (for this case, see also, e.g., [7, Theorem IV.2.5]), we get that the image of $X_K$ in $H^1(\widetilde{C}; \mathcal{O}_{\widetilde{C}}) = T_\cdot \operatorname{Pic}(\widetilde{C})$ is the class of the cocycle $\lambda^{-2}\mu$. In the same way, the matrix $B_\lambda$ in the Lax equation describing the Hamiltonian system associated with $X_H$ is
$$p \times q - \lambda\Gamma = \left(\lambda^{-1}(q + \lambda(p \times q) - \lambda^2\Gamma)\right)_+$$
so that the image of $X_H$ is the class of the cocycle $\lambda^{-1}\mu$.

Now, as we have already mentioned, $\mu$ is equivalent to $\pm i\lambda^2$ when $\lambda$ goes to infinity, so that the function $\lambda^{-2}\mu$ on $U_0 \cap U_\infty$ extends to a holomorphic function on $U_\infty$. Thus the cocycle $\lambda^{-2}\mu$ is a coboundary and its class is zero.

On the other hand, the cocycle $\lambda^{-1}\mu$ is nonzero in $H^1(\widetilde{C}; \mathcal{O}_{\widetilde{C}})$. An easy way to prove this is to compute its residue at 0 (or at $\infty$) against a holomorphic form on $\widetilde{C}$. We take the form $d\lambda/\mu$ and compute
$$\operatorname{Res}_{\lambda=0}\, \lambda^{-1}\mu\, \frac{d\lambda}{\mu} = \operatorname{Res}_{\lambda=0}\, \lambda^{-1}\, d\lambda = 2.$$
In conclusion, the tangent map to $\varphi_C$ sends $X_K$ to zero and $X_H$ to a generator of $H^1(\widetilde{C}; \mathcal{O}_{\widetilde{C}})$.

Notice that Proposition 3.9 readily has the following consequence:

Corollary 3.10. If the complex curve C is smooth, the corresponding value of (H, K) in $\mathbf{C}^2$ or $\mathbf{R}^2$ is regular.
Proof. Note that the eigenvector mapping $\varphi_C$ is well-defined as soon as the curve C is smooth. Assume this is the case. Then $X_K$ is mapped to zero and $X_H$ to a nonzero element. The only way in which $X_H$ and $X_K$ can be dependent is then to have $X_K = 0$. This means that q and p are vertical. As they are orthogonal and q is nonzero, p must be zero. Thus $H = \pm 1$ and $K = 0$, and the equation of C is written
$$\mu^2 + (\lambda^2 \pm 1)^2 = 0,$$
an obviously singular curve (it has two double points), which is a contradiction. Thus $X_H$ and $X_K$ were independent and the value is regular.

Let us now look more closely at the map $\varphi_{\bar{C}}$, to get:

Proposition 3.11. Assume the curve C is smooth. Then the tangent mapping to the eigenvector map $\varphi_{\bar{C}}$ maps $X_H$ and $X_K$ to independent vectors in $H^1(\bar{C}; \mathcal{O}_{\bar{C}})$.

Proof. It is based on the considerations used in the proof of Proposition 3.9. A 1-cocycle of holomorphic functions on $\widetilde{C}$ for the covering $\widetilde{C} = U_0 \cup U_\infty$ is just a holomorphic function
$$f_{0,\infty} : U_0 \cap U_\infty \longrightarrow \mathbf{C}.$$
We can also consider such an $f_{0,\infty}$ as a cocycle $\bar{f}_{0,\infty}$ of holomorphic functions on $\bar{C}$ now, for the covering
$$\bar{C} = U_0 \cup U_{\infty_+} \cup U_{\infty_-}$$
(with an obvious notation, in which $U_{\infty_+} \cap U_{\infty_-} = \emptyset$). On $\widetilde{C}$, coboundaries are differences $f_0 - f_\infty$ of functions that are holomorphic on $U_0$, $U_\infty$ respectively. On $\bar{C}$, these are also the differences of functions $f_0 - f_{\infty_\pm}$ on $U_0 \cap U_{\infty_\pm}$. But now, the functions $f_{\infty_\pm}$ must take the same value at the two points at $\infty$ (see Sect. 4).

As soon as we have taken care of the fact that the eigenvector bundle is well-defined on $\bar{C}$, the proof of the linearization theorem works to give us, as in the proof of Proposition 3.9, that $T_{(q,p)}\varphi_{\bar{C}}(X_H(q, p))$ is the class of $\lambda^{-1}\mu$ and that $T_{(q,p)}\varphi_{\bar{C}}(X_K(q, p))$ is the class of $\lambda^{-2}\mu$.
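The big difference now is that these two cocycles give independent classes in $H^1(\bar{C}; \mathcal{O}_{\bar{C}})$. For this, consider the two holomorphic forms $d\lambda/\mu$ and $\lambda\, d\lambda/\mu$ on $\bar{C}$ and compute the residues,
– for $X_H$:
$$\operatorname{Res}_{\lambda=0}\, \lambda^{-1}\mu\, \frac{d\lambda}{\mu} = \operatorname{Res}_{\lambda=0}\, \lambda^{-1}\, d\lambda = 2$$
as above and
$$\operatorname{Res}_{\lambda=0}\, \lambda^{-1}\mu\, \frac{\lambda\, d\lambda}{\mu} = \operatorname{Res}_{\lambda=0}\, d\lambda = 0,$$
– and for $X_K$:
$$\operatorname{Res}_{\lambda=0}\, \lambda^{-2}\mu\, \frac{d\lambda}{\mu} = \operatorname{Res}_{\lambda=0}\, \lambda^{-2}\, d\lambda = 0$$
once again, but now
$$\operatorname{Res}_{\lambda=0}\, \lambda^{-2}\mu\, \frac{\lambda\, d\lambda}{\mu} = \operatorname{Res}_{\lambda=0}\, \lambda^{-1}\, d\lambda = 2.$$
Thus the two cocycles give independent cohomology classes.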
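The Lax form of the rotation flow used in the proof of Proposition 3.9 can be checked numerically. The following is only a sketch under stated assumptions: vectors of $\mathbf{C}^3$ are identified with skew-symmetric matrices through the usual "hat" map (so that the matrix of $a$ acts by $v \mapsto a \times v$), and the names `hat`, `lax_vector` and `lax_residual` are illustrative, not notation from the paper.

```python
def cross(a, b):
    """Cross product of two 3-vectors (works for complex entries too)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def hat(a):
    """Skew-symmetric matrix of a, assumed convention: hat(a) v = a x v."""
    return ((0, -a[2], a[1]),
            (a[2], 0, -a[0]),
            (-a[1], a[0], 0))

def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return tuple(tuple(AB[i][j] - BA[i][j] for j in range(3)) for i in range(3))

def add(*vectors):
    return tuple(sum(components) for components in zip(*vectors))

def scale(c, v):
    return tuple(c * x for x in v)

def lax_vector(q, p, gamma, lam):
    """Vector q + lam (p x q) - lam^2 Gamma whose hat matrix is A_lambda."""
    return add(q, scale(lam, cross(p, q)), scale(-lam ** 2, gamma))

def lax_residual(q, p, gamma, lam):
    """Largest entry of dA/dt - [A, -Gamma] along q' = Gamma x q, p' = Gamma x p."""
    qdot, pdot = cross(gamma, q), cross(gamma, p)
    # Gamma is constant, so only q and p x q contribute to the time derivative.
    adot = add(qdot, scale(lam, add(cross(pdot, q), cross(p, qdot))))
    lhs = hat(adot)
    rhs = commutator(hat(lax_vector(q, p, gamma, lam)), hat(scale(-1, gamma)))
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
```

The residual vanishes (up to rounding) for arbitrary $q$, $p$ and complex $\lambda$, which reflects the Jacobi identity for the cross product rather than any special property of the level set.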
Remark 3.12. Let us show that the fact that the spectral curve has no real point is related to the periodicity of the flow of K. Look at the diagram
$$\begin{array}{ccc}
T_C & \xrightarrow{\ \varphi_{\bar{C}}\ } & \operatorname{Pic}(\bar{C}) \\
{\scriptstyle S^1}\downarrow & & \downarrow{\scriptstyle \mathbf{C}^*} \\
T_C/S^1 & \xrightarrow{\ \varphi_C\ } & \operatorname{Pic}(\widetilde{C})
\end{array}$$
where, on the left, we factor out the regular real level $T_C$ by the flow of K (rotations around the vertical axis) and on the right, we look at the generalized Jacobian as above. As $T_C$ is compact, the map $\varphi_{\bar{C}}$, whose differential is always injective, according to Proposition 3.11, is a covering of its image. This re-proves that the real part of the right side should also be an $S^1$-fibration (see Sect. 2.2).

Critical values vs. discriminant. We have shown that, if the value (H, K) corresponds to a smooth curve C, it is a regular value. Of course, non-smooth curves C correspond to polynomials with double roots
$$\lambda^4 + 2K\lambda^3 + 2H\lambda^2 + 1 = (\lambda - u)^2(\lambda - v)(\lambda - w),$$
that is
$$\begin{cases} H = \dfrac{1}{2}u^2 - \dfrac{3}{2u^2} \\[2mm] K = -u + \dfrac{1}{u^3} \end{cases} \qquad u \in \mathbf{C}.$$
The real part of the discriminant curve (meaning that H and K are real, which does not mean that u is real) is shown in Fig. 8. The point (1, 0) is part of the discriminant (a real double point with imaginary branches). Obviously, for real (q, p), H must be greater than or equal to −1, so that the only connected component of the complement of the discriminant whose points are actual values for real (q, p) is the one containing the positive H-axis. These points must be regular values (according to Corollary 3.10) except for the point (1, 0) which we have seen corresponds to a singular level. The boundary points of the image of (H, K) must be critical values. Thus we have proved the converse of Corollary 3.10 for real (H, K), namely:
Fig. 8. The regular values for the spherical pendulum
Corollary 3.13. The value of (H, K) in $\mathbf{R}^2$ is regular if and only if the corresponding complex curve C is smooth.

The points of the discriminant curve actually correspond to the trajectories of the pendulum on horizontal circles, namely, they are the images of the points
$$(q, p) = \left(\frac{1}{u^2}\Gamma + q',\ u\, q' \times \Gamma\right) \quad \text{with } q' \perp \Gamma,$$
for which $X_H(q, p) = -u X_K(q, p)$: critical points indeed, a result of Huygens (see [11]).

Hamiltonian monodromy. We have seen that the eigenvector mapping is a covering of its image, so that $H_1(T_C; \mathbf{Z})$ is a sublattice of $H_1(\operatorname{Pic}^2(\bar{C})_{\mathbf{R}}; \mathbf{Z})$. We have more precisely:

Proposition 3.14. The eigenvector mapping $T_C \to \operatorname{Pic}^2(\bar{C})$ embeds $H_1((T_C)_{\mathbf{R}}; \mathbf{Z})$ as the sublattice
$$2H_1(\operatorname{Pic}^2(\bar{C})_{\mathbf{R}}; \mathbf{Z}) \subset H_1(\operatorname{Pic}^2(\bar{C})_{\mathbf{R}}; \mathbf{Z}).$$

Thus, we have a basis in $H_1(T_C; \mathbf{Z})$ which is sent to $(2\alpha, 2\gamma) \in H_1(\operatorname{Pic}^0(\bar{C}); \mathbf{Z})$ so that the monodromy of the fibration in Liouville tori is the same as that of the fibration by the relative Jacobian.

Corollary 3.15 (Cushman [11]). The monodromy of the fibration in Liouville tori for the spherical pendulum is given by the matrix $\begin{pmatrix} 1 & 0 \\ \varepsilon & 1 \end{pmatrix}$.

Remark 3.16. According to Proposition 2.2, the Hamiltonian monodromy for the spherical pendulum is thus, in the end, given by the integration of a meromorphic 1-form. See [9] for a proof of this fact by a direct computation.

Proof (of the proposition). Let us now compute the degree of the eigenvector mapping $\varphi_{\bar{C}}$. We have seen that the classical eigenvector mapping $\varphi_C$ defines a covering map (on its image)
$$\varphi_C : T_C/\mathbf{C}^* \longrightarrow \operatorname{Pic}^2(\widetilde{C}).$$
It is well known (see, e.g., [4] or [14]) that $\varphi_C$ maps the real level set $T_C/S^1$ to one of the two components of $\operatorname{Pic}^2(\widetilde{C})_{\mathbf{R}} = X_{\mathbf{R}}$ by a degree-2 map. To compute the degree of $\varphi_{\bar{C}}$, we just need to understand what happens in the direction of the field $X_K$.
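Lemma 3.17. The restriction of $\varphi_{\bar{C}}$ to the orbits of $X_K$ is a covering of degree 2. It maps a real orbit to a circle via a degree-2 map.

Proof. It obviously suffices to show this for one (well chosen) orbit in a well chosen level set. Fix $q = \Gamma$, then p is horizontal ($p_3 = 0$), $K = 0$ and p is determined by
$$\frac{1}{2}\|p\|^2 - \|\Gamma\|^2 = H, \quad \text{that is} \quad \|p\|^2 = 2(H + 1).$$
Fix H (not equal to ±1) and look at the orbit of $X_K$ through $(\Gamma, p)$ in the level (H, 0). The Lax matrix $A_\lambda$ has the form
$$A_\lambda = q + (p \times q)\lambda - \Gamma\lambda^2 = (1 - \lambda^2)\Gamma + (p \times \Gamma)\lambda,$$
this is the skew-symmetric matrix corresponding to the vector
$${}^t(-\lambda p_2,\ \lambda p_1,\ \lambda^2 - 1).$$
The vector
$$V = \begin{pmatrix} \left(1 - \dfrac{1}{\lambda^2}\right)p_2 - \dfrac{\mu}{\lambda^2}\,p_1 \\[2mm] -\left(1 - \dfrac{1}{\lambda^2}\right)p_1 - \dfrac{\mu}{\lambda^2}\,p_2 \\[2mm] (p_1^2 + p_2^2)\,\dfrac{1}{\lambda} \end{pmatrix}$$
is an eigenvector of $A_\lambda$ for the eigenvalue $\mu$, that never vanishes and has neither a zero nor a pole at infinity.

Let us look at the pole divisor of V in $\operatorname{Pic}(\bar{C})$. There is a pole when a component of V tends to infinity. This happens only for $\lambda = 0$. The divisor obtained does not depend on p. This is no surprise, since we know (Proposition 3.9) that the classical eigenvector mapping $\varphi_C$ is constant on the orbits of $X_K$. We must thus concentrate on what happens at infinity.
– at $\infty_+$, we have $\dfrac{\mu}{\lambda^2} = i$ and
$$V(\infty_+) = \begin{pmatrix} p_2 - ip_1 \\ -p_1 - ip_2 \\ 0 \end{pmatrix} = (p_2 - ip_1)\begin{pmatrix} 1 \\ -i \\ 0 \end{pmatrix},$$
– at $\infty_-$, similarly, $\dfrac{\mu}{\lambda^2} = -i$ and
$$V(\infty_-) = \begin{pmatrix} p_2 + ip_1 \\ -p_1 + ip_2 \\ 0 \end{pmatrix} = (p_2 + ip_1)\begin{pmatrix} 1 \\ i \\ 0 \end{pmatrix}.$$
The vectors ${}^t(1, \pm i, 0)$ are the reference vectors we have used to define the eigenvector bundle as a bundle over the singular $\bar{C}$. The mapping $\varphi_{\bar{C}}$ thus sends the point $(\Gamma, p)$ to the ratio $V(\infty_+)/V(\infty_-)$, namely to
$$\frac{p_2 - ip_1}{p_2 + ip_1} = \frac{(p_2 - ip_1)^2}{p_1^2 + p_2^2} = \frac{(p_2 - ip_1)^2}{2(H + 1)}.$$
Thus $\varphi_{\bar{C}}$ is a degree-2 map, as the solutions of
$$\begin{cases} (p_2 - ip_1)^2 = 2a(H + 1) \\ p_1^2 + p_2^2 = 2(H + 1) \end{cases}$$
for $a = \alpha^2 \in \mathbf{C}^*$ are the two points
$$p_1 = \varepsilon\,\frac{\sqrt{2(H + 1)}}{2i}\left(\frac{1}{\alpha} - \alpha\right), \qquad p_2 = \varepsilon\,\frac{\sqrt{2(H + 1)}}{2}\left(\frac{1}{\alpha} + \alpha\right)$$
for $\varepsilon = \pm 1$. Notice that they are real when $|\alpha| = 1$.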
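The two computations in the proof of Lemma 3.17 lend themselves to a quick numerical check. The sketch below makes two assumptions that are not spelled out in the text: the skew-symmetric matrix of a vector $a$ acts by $v \mapsto a \times v$, and the normalization $p_1^2 + p_2^2 = 2(H + 1)$ fixed at the beginning of the proof is used throughout. The function names are hypothetical.

```python
import cmath

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def eigen_residual(p1, p2, lam, branch=1):
    """Largest entry of A_lambda V - mu V for q = Gamma, p = (p1, p2, 0)."""
    H = (p1 ** 2 + p2 ** 2) / 2 - 1
    # mu is a point of the spectral curve mu^2 + lam^4 + 2 H lam^2 + 1 = 0 (K = 0).
    mu = branch * cmath.sqrt(-(lam ** 4 + 2 * H * lam ** 2 + 1))
    a = (-lam * p2, lam * p1, lam ** 2 - 1)
    V = ((1 - 1 / lam ** 2) * p2 - mu / lam ** 2 * p1,
         -(1 - 1 / lam ** 2) * p1 - mu / lam ** 2 * p2,
         (p1 ** 2 + p2 ** 2) / lam)
    AV = cross(a, V)
    return max(abs(AV[i] - mu * V[i]) for i in range(3))

def ratio_at_infinity(p1, p2):
    """Ratio V(infinity_+) / V(infinity_-) in the reference vectors t(1, -i, 0), t(1, i, 0)."""
    return (p2 - 1j * p1) / (p2 + 1j * p1)

def preimages(alpha, H):
    """The two points (p1, p2) on the orbit that are sent to alpha^2."""
    r = cmath.sqrt(2 * (H + 1))
    points = []
    for eps in (1, -1):
        p1 = eps * r / 2j * (1 / alpha - alpha)
        p2 = eps * r / 2 * (1 / alpha + alpha)
        points.append((p1, p2))
    return points
```

For $|\alpha| = 1$ both preimages have real coordinates and differ by the sign $\varepsilon$, which is the degree-2 behaviour on a real orbit stated in the lemma.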
4. The Generalized Jacobian, a Few Words

Let me recall very briefly a few notions related to the geometry of the singular curve $\bar{C}$ that I have used in this paper. See [12] for details.

The curves. Consider a smooth$^{13}$ curve $\widetilde{C}$ of genus $g - 1$. Choose two distinct points $\infty_+$ and $\infty_-$ on $\widetilde{C}$, call $C$ their complement in $\widetilde{C}$ and $\bar{C}$ the curve obtained by the identification of $\infty_+$ and $\infty_-$ to a single point, that we call $\infty \in \bar{C}$. Notice that the natural map
$$H_1(C; \mathbf{Z}) \longrightarrow H_1(\widetilde{C}; \mathbf{Z})$$
is surjective, its kernel being the infinite cyclic group generated by the homology class of a small loop turning once around $\infty_+$ (or $\infty_-$).

Functions, vector fields, forms. We start from germs of holomorphic functions on $\bar{C}$: these are germs of holomorphic functions on $\widetilde{C}$ which take the same value at $\infty_+$ and $\infty_-$. Note that a germ at $\infty$ on $\bar{C}$ is a germ at both $\infty_+$ and $\infty_-$ on $\widetilde{C}$. Then, vector fields must be derivations, we must thus have
$$(X \cdot f)(\infty_+) = (X \cdot f)(\infty_-)$$
for all functions f, which forces $X(\infty_+) = X(\infty_-) = 0$, thus the holomorphic vector fields on $\bar{C}$ are the holomorphic vector fields on $\widetilde{C}$ satisfying this condition.

Now, the holomorphic 1-forms are obtained by duality: a holomorphic form on $\bar{C}$ is a meromorphic form on $\widetilde{C}$ with at worst simple poles at $\infty_+$ and $\infty_-$ (with opposite residues). According to the Riemann-Roch theorem, the complex vector space of meromorphic forms on $\widetilde{C}$ that have a simple pole at $\infty_+$, a simple pole at $\infty_-$ and no other pole, has dimension 1. Hence the cokernel of the natural injection $H^0(\Omega^1_{\widetilde{C}}) \to H^0(\Omega^1_{\bar{C}})$ has dimension 1. The dimension of $H^0(\Omega^1_{\bar{C}})$ is g, that is, $\bar{C}$ has arithmetic genus g.

13 Starting from a curve with double points, we see that there is no difficulty to make, by induction, more complicated examples.
Example 4.1. Consider, on the curve $\widetilde{C} = \mathbf{P}^1(\mathbf{C}) = \mathbf{C} \cup \{\infty\}$, the two points $\infty_+ = \infty$, $\infty_- = 0$. Then $C = \mathbf{C} - \{0\}$ and $\bar{C}$ is a sphere with two points identified (or a pinched torus). The meromorphic form $dz/z$ over $\widetilde{C}$ can be considered as a holomorphic form on $\bar{C}$.

Example 4.2. On the curve $\bar{C}$ considered in this paper, the form $x\,dx/y$ is equivalent to $\pm i\,dx/x$ near $\infty_\pm$, so that it defines a holomorphic form on $\bar{C}$.

Divisors, line bundles, Picard group. In order to be able to integrate “holomorphic” forms over paths, it is necessary that the paths do not pass through the points $\infty_+$ and $\infty_-$. Hence divisors on $\bar{C}$ must be divisors on $\widetilde{C}$ not involving $\infty_+$ or $\infty_-$. Linear equivalence of divisors on $\bar{C}$ is defined by $D \sim D'$ if and only if there exists a meromorphic function f on $\bar{C}$, which has neither a zero nor a pole at $\infty$ and such that $D - D' = (f)$. We can then assume that $f(\infty) = 1$.

Consider now a line bundle over $\widetilde{C}$. In order that it defines a line bundle on $\bar{C}$, there should be an isomorphism between the fibers of $\infty_+$ and $\infty_-$. This can be achieved if a meromorphic section with neither a zero nor a pole at $\infty_+$ and at $\infty_-$ is given. This way, we get the generalized version of the equivalence between line bundles and divisors we need.

The Picard group $\operatorname{Pic}(\bar{C})$ is defined either as the group of isomorphism classes of line bundles over $\bar{C}$ or as the group of linear equivalence classes of divisors. There is a projection $\operatorname{Pic}(\bar{C}) \to \operatorname{Pic}(\widetilde{C})$, the forgetful map which sends the class of D as a divisor on $\bar{C}$ to its class as a divisor on $\widetilde{C}$. The kernel is a set of divisors of functions, isomorphic to $\mathbf{C}^*$ under the map $z \mapsto (f)$ where f is any function on $\widetilde{C}$ such that $f(\infty_+)/f(\infty_-) = z$.

Generalized Jacobian. The Jacobian can be defined by holomorphic forms. The map
$$H_1(C; \mathbf{Z}) \longrightarrow H^0(\Omega^1_{\bar{C}})^*, \qquad \alpha \longmapsto \left(\omega \mapsto \int_\alpha \omega\right)$$
is an embedding $\mathbf{Z}^{2g-1} \subset \mathbf{C}^g$, the Jacobian is the quotient, a noncompact g-dimensional complex torus. More precisely, there is a commutative diagram
$$\begin{array}{ccccccccc}
0 & \longrightarrow & \mathbf{C} & \longrightarrow & H^0(\Omega^1_{\bar{C}})^* & \longrightarrow & H^0(\Omega^1_{\widetilde{C}})^* & \longrightarrow & 0 \\
 & & \uparrow{\scriptstyle u_\infty} & & \uparrow{\scriptstyle \bar{u}} & & \uparrow{\scriptstyle u} & & \\
0 & \longrightarrow & \mathbf{Z} & \longrightarrow & H_1(C; \mathbf{Z}) & \longrightarrow & H_1(\widetilde{C}; \mathbf{Z}) & \longrightarrow & 0
\end{array}$$
in which the usual Abel-Jacobi map u embeds $H_1(\widetilde{C}; \mathbf{Z}) \cong \mathbf{Z}^{2g-2}$ as a lattice $\widetilde{\Lambda}$ into the complex $(g - 1)$-dimensional vector space $H^0(\Omega^1_{\widetilde{C}})^*$. The integration map $u_\infty$ sends $\mathbf{Z}$ to $2i\pi\mathbf{Z} \subset \mathbf{C}$ so that $H_1(C; \mathbf{Z})$ is mapped to a “lattice”$^{14}$ $\bar{\Lambda}$. The exact sequence of quotients
$$0 \longrightarrow \mathbf{C}/2i\pi\mathbf{Z} \longrightarrow \operatorname{Jac}(\bar{C}) \longrightarrow \operatorname{Jac}(\widetilde{C}) \longrightarrow 0$$
is isomorphic to the exact sequence
$$0 \longrightarrow \mathbf{C}^* \longrightarrow \operatorname{Pic}^0(\bar{C}) \longrightarrow \operatorname{Pic}^0(\widetilde{C}) \longrightarrow 0$$
of Picard groups.

14 I use the quotation marks because this is a $\mathbf{Z}^{2g-1}$ in $\mathbf{C}^g$, and thus not quite a lattice.

Example 4.3. Consider again the case of the sphere with two points identified. This is a curve of arithmetic genus 1 (and it is easy to embed it in the plane as a rational cubic with a double point). The Jacobian of $\mathbf{P}^1(\mathbf{C})$ is trivial, so that the sequences above reduce to the inclusion $H_1(\mathbf{C} - \{0\}; \mathbf{Z}) \to \mathbf{C}$ given by integration of the form $dz/z$. The generalized Jacobian is $\mathbf{C}/2i\pi\mathbf{Z} \cong \mathbf{C}^*$.

5. Complements

5.1. The family of curves $E_{h,k}$. In this section, we prove the assertions on the family $E_{h,k}$ that we have used in Sect. 3.1. Recall the equation
$$y^2 x \prod_{i=1}^{3}(x - a_i) + y(f_1 x^2 - hx + k) + f_2^2 = 0,$$
and the assumptions we have made on the $a_i$'s, $0 < a_1 < a_2 < a_3$, on the $f_i$'s:
$$f_2 \neq 0, \qquad f_1^2 - 4f_2^2 = 4\beta^2 > 0,$$
and that h, k are positive parameters. Write
$$\Delta(x) = (f_1 x^2 - hx + k)^2 - 4f_2^2\, x \prod_{i=1}^{3}(x - a_i),$$
and define $\sigma_1, \sigma_2, \sigma_3$ for the elementary symmetric functions of the $a_i$'s. We are looking for values of the parameters such that $\Delta(x) = (ax^2 + bx + c)^2$ with $b^2 - 4ac < 0$. Identifying terms,
– using constant terms, we get $k^2 = c^2$, we choose $c = k$,
– using coefficients of $x^4$, we get $f_1^2 - 4f_2^2 = a^2$, so that $a = \pm 2\beta$. As $c = k > 0$, we must take $a = 2\beta$,
– from the first and third degree terms, we get h and b as functions of k:
$$h = \frac{1}{2}(f_1 + 2\beta)\left(\sigma_1 - 2\beta\,\frac{\sigma_3}{k}\right), \qquad b = \frac{1}{2}(f_1 + 2\beta)\left(\sigma_1 - f_1\,\frac{\sigma_3}{k}\right).$$
Replacing in the $x^2$ term gives an equation in k:
$$8k^3 - 4(f_1 + 2\beta)\sigma_2 k^2 + 2(f_1 + 2\beta)^2\sigma_1\sigma_3 k - (f_1 + 2\beta)^3\sigma_3^2 = 0.$$
This is easy to solve: as the roots of
$$\alpha^3 - \sigma_2\alpha^2 + \sigma_3\sigma_1\alpha - \sigma_3^2 = 0$$
are the $a_i a_j$ ($i < j$), we get $k = \frac{1}{2}(f_1 + 2\beta)a_i a_j$ and $h = -2\beta a_k + \frac{1}{2}(f_1 + 2\beta)\sigma_1$ (in these formulas, $i < j$ and $\{i, j, k\} = \{1, 2, 3\}$). Moreover, in terms of $u = \frac{1}{2}(f_1 + 2\beta)$,
$$a = 2u - f_1, \qquad b = -f_1 a_k + u\sigma_1, \qquad c = u a_i a_j,$$
so that
$$b^2 - 4ac = (\sigma_1^2 - 8a_i a_j)u^2 + 2f_1(2a_i a_j - \sigma_1 a_k)u + f_1^2 a_k^2.$$
Notice that this is positive at $u = 0$. In order that there exist values of u for which this expression is negative, we must have
$$f_1^2\left[(2a_i a_j - \sigma_1 a_k)^2 - a_k^2(\sigma_1^2 - 8a_i a_j)\right] > 0,$$
that is
$$4a_i a_j f_1^2(a_k - a_i)(a_k - a_j) > 0.$$
Thus we must have $i = 1$ and $j = 2$ or $i = 2$ and $j = 3$. In the second case, there are no values of $\beta$ such that $b^2 - 4ac < 0$ and $0 < \beta < \frac{1}{2}|f_1|$. The first case gives the desired interval $(\varepsilon_1, \varepsilon_2)$.

5.2. Holomorphic forms and cocycles on D. The equation of the elliptic curve E of Example 1.1 and Sect. 2.1 can be put in the form
$$t^2 + 2a(x)t + b(x) = 0, \quad \text{with } t = \frac{1}{y}.$$
Hence the form
$$\omega = \frac{dx}{t + a(x)} = \frac{-2\,dt}{2a'(x)t + b'(x)}$$
is holomorphic (and has no zero!). Let $\eta$ be its pull back to D, thus
$$\eta = \frac{dx}{u^2 + a(x)} = \frac{-4u\,du}{2a'(x)u^2 + b'(x)}, \quad \text{with } u^2 = \frac{1}{z^2} = t,$$
is a holomorphic form on D. Its divisor is
$$(\eta) = (u)_0 = (z)_\infty = \sum_{i=0}^{3} A_i.$$
The divisor of the meromorphic form $z\eta$ is
$$(z\eta) = \sum_{i=0}^{3} A_i + (z)_0 - \sum_{i=0}^{3} A_i = (z)_0,$$
thus $z\eta$ is also holomorphic. Now, on D, z and x both have degree 4; the zeroes of z are the poles of x, so that $(z)_0 = (x)_\infty$, and
$$(xz\eta) = (x)_0 - (x)_\infty + (z)_0 = (x)_0.$$
Thus we have exhibited three holomorphic forms on the genus-3 curve$^{15}$ D, the forms $\eta$, $z\eta$ and $xz\eta$. Notice that the first one is $\tau$-invariant while the two others are anti-invariant.

Consider now, as in the proof of Proposition 3.2, the cocycles $zx^k$ of the covering $D = U_+ \cup U_-$; let us prove the lemma that ends the proof of this proposition.

Lemma 5.1. The cocycles $zx^3$ and $zx^4$ define independent elements of $H^1(\mathcal{O}_D)$.

Proof. We show that the images of these cocycles under the linear map
$$H^1(\mathcal{O}_D) \longrightarrow \mathbf{C}^2, \qquad f \longmapsto \left(\operatorname{Res}_{z=\infty}(f z\eta),\ \operatorname{Res}_{z=\infty}(f xz\eta)\right)$$
are independent vectors. Let us thus compute
$$R_m = \operatorname{Res}_{z=\infty}(z^2 x^m \eta) = \operatorname{Res}_{u=0}\frac{-4u\,x^m\,du}{u^2(2a'(x)u^2 + b'(x))} = -2\operatorname{Res}_{t=0}\frac{x^m\,dt}{t(2a'(x)t + b'(x))}.$$
The points of E at which $t = 0$ are the four points $A_i$. Near such a point, t is a local coordinate on E and x can be expressed as a function of t, $x = a_i + O(t)$, hence
$$R_m = -2\sum_{i=1}^{3}\frac{a_i^m}{b'(a_i)}$$
(notice that the residue at $A_0$ is zero as $m \geq 3$). Now,
$$b(x) = \frac{\prod_{j=0}^{3}(x - a_j)}{f_2^2}, \quad \text{so that} \quad b'(a_i) = \frac{\prod_{j \neq i}(a_i - a_j)}{f_2^2}$$
and
$$R_m = -2f_2^2\,\frac{a_1^{m-1}(a_2 - a_3) + a_2^{m-1}(a_3 - a_1) + a_3^{m-1}(a_1 - a_2)}{\prod_{1 \leq i < j \leq 3}(a_i - a_j)}.$$
To prove that $zx^3$ and $zx^4$ are independent, we only need to check that
$$\begin{vmatrix} R_3 & R_4 \\ R_4 & R_5 \end{vmatrix} \neq 0,$$
which is true.

15 The computation below will show that they are independent in $H^0(\Omega^1_D)$.
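The final determinant in the proof of Lemma 5.1 can be verified with exact rational arithmetic. This is a sketch under an assumption read off from the formula for $b(x)$: that $a_0 = 0$, so that $a_i^m / b'(a_i)$ is a common nonzero multiple of $a_i^{m-1} / \prod_{j \neq i}(a_i - a_j)$ with $i, j \in \{1, 2, 3\}$. The helper names are illustrative. With this normalization the reduced residues for $m = 3, 4, 5$ are $1$, $\sigma_1$ and $\sigma_1^2 - \sigma_2$, so the determinant is a nonzero multiple of $-\sigma_2$, which does not vanish for positive $a_i$.

```python
from fractions import Fraction

def reduced_residue(a, m):
    """Sum over i of a_i^(m-1) / prod_{j != i} (a_i - a_j), i.e. R_m up to a common nonzero factor."""
    total = Fraction(0)
    for i, ai in enumerate(a):
        denom = Fraction(1)
        for j, aj in enumerate(a):
            if j != i:
                denom *= (ai - aj)
        total += Fraction(ai) ** (m - 1) / denom
    return total

def lemma_determinant(a):
    """Determinant |r3 r4; r4 r5| of the reduced residues for the roots a = (a1, a2, a3)."""
    r3, r4, r5 = (reduced_residue(a, m) for m in (3, 4, 5))
    return r3 * r5 - r4 * r4

def sigma2(a):
    """Second elementary symmetric function of (a1, a2, a3)."""
    return a[0] * a[1] + a[0] * a[2] + a[1] * a[2]
```

For instance, with the hypothetical values $(a_1, a_2, a_3) = (1, 2, 5)$ one has $\sigma_2 = 17$ and `lemma_determinant((1, 2, 5))` equals $-17$.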
5.3. An equivalence of elliptic curves. Let us investigate the relations between the two elliptic curves that appeared in the spherical pendulum problem:
– the curve X of equation $y^2 = 2x^3 - 2Hx^2 - 2x + 2H - K^2$, obtained by the “naïve” method, and
– the genus-1 spectral curve $\widetilde{C}$ of equation
$$\mu^2 + \underbrace{\lambda^4 + 2K\lambda^3 + 2H\lambda^2 + 1}_{P(\lambda)} = 0.$$
Let us check first that they are isomorphic as complex curves. The method is classical and can be found, e.g., in [24, p. 453]. We send a root $\lambda_0$ of P to $\infty$ to transform it into a degree-3 polynomial. Expand P by the Taylor formula
$$P(\lambda) = 4A_3(\lambda - \lambda_0) + 6A_2(\lambda - \lambda_0)^2 + 4A_1(\lambda - \lambda_0)^3 + (\lambda - \lambda_0)^4$$
and put $u = \dfrac{1}{\lambda - \lambda_0}$, so that
$$P(\lambda) = (\lambda - \lambda_0)^4\,\underbrace{\left(4A_3u^3 + 6A_2u^2 + 4A_1u + 1\right)}_{Q(u)}$$
and
$$\mu^2 + P(\lambda) = (\lambda - \lambda_0)^4\left[\left(\frac{\mu}{(\lambda - \lambda_0)^2}\right)^2 + Q(u)\right].$$
Let us now transform Q(u) to eliminate the $u^2$ term. Write $s = A_3u + \frac{1}{2}A_2$ to get
$$Q(u) = \frac{1}{A_3^2}\Big(4s^3 - \underbrace{(3A_2^2 - 4A_1A_3)}_{g_2}\,s - \underbrace{(2A_1A_2A_3 - A_3^2 - A_2^3)}_{g_3}\Big) = \frac{1}{A_3^2}\left(4s^3 - g_2s - g_3\right).$$
Putting $v = \dfrac{iA_3\mu}{(\lambda - \lambda_0)^2}$, we get a complex isomorphism
$$\varphi : \widetilde{C} \longrightarrow Y, \qquad (\lambda, \mu) \longmapsto (s, v)$$
from $\widetilde{C}$ to the elliptic curve Y of equation $v^2 = 4s^3 - g_2s - g_3$. The whole point of the computation now is that $g_2$ and $g_3$ do not depend on $\lambda_0$. Using
$$4A_3 = P'(\lambda_0), \qquad 12A_2 = P''(\lambda_0), \qquad 24A_1 = P'''(\lambda_0),$$
we get after a few computations,
$$g_2 = 1 + \frac{1}{3}H^2, \quad \text{and} \quad g_3 = \frac{1}{3}H - \frac{1}{27}H^3 - \frac{1}{4}K^2.$$
The next step is to put the equation of X in the same form, which is (more) easily done, putting
$$s = -\frac{1}{2}\left(x - \frac{1}{3}H\right) \quad \text{and} \quad v = \frac{iy}{2},$$
a change of variable which gives an isomorphism between X and Y. The reason why I have given such explicit formulas is that they allow me to give a more precise result.
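The independence of $g_2$ and $g_3$ from the chosen root $\lambda_0$ can be tested numerically. The sketch below is an illustration, not the author's method: it finds the four roots of P with a plain Durand-Kerner iteration (my choice of root finder, with the customary starting points), evaluates $A_1, A_2, A_3$ from the derivatives of P at each root, and compares the resulting invariants with the closed forms $1 + H^2/3$ and $H/3 - H^3/27 - K^2/4$. All function names are hypothetical.

```python
def P_derivatives(lam, H, K):
    """P(lam) = lam^4 + 2K lam^3 + 2H lam^2 + 1 and its first three derivatives."""
    P = lam ** 4 + 2 * K * lam ** 3 + 2 * H * lam ** 2 + 1
    P1 = 4 * lam ** 3 + 6 * K * lam ** 2 + 4 * H * lam
    P2 = 12 * lam ** 2 + 12 * K * lam + 4 * H
    P3 = 24 * lam + 12 * K
    return P, P1, P2, P3

def quartic_roots(H, K, iterations=200):
    """Durand-Kerner iteration for the four roots of the monic quartic P."""
    roots = [(0.4 + 0.9j) ** k for k in range(4)]
    for _ in range(iterations):
        updated = []
        for i, r in enumerate(roots):
            denom = 1
            for j, s in enumerate(roots):
                if j != i:
                    denom *= (r - s)
            updated.append(r - P_derivatives(r, H, K)[0] / denom)
        roots = updated
    return roots

def invariants_at(lam0, H, K):
    """g2 and g3 computed from the Taylor coefficients of P at the root lam0."""
    _, P1, P2, P3 = P_derivatives(lam0, H, K)
    A3, A2, A1 = P1 / 4, P2 / 12, P3 / 24
    g2 = 3 * A2 ** 2 - 4 * A1 * A3
    g3 = 2 * A1 * A2 * A3 - A3 ** 2 - A2 ** 3
    return g2, g3

def closed_form(H, K):
    """The values of g2 and g3 stated in the text."""
    return 1 + H ** 2 / 3, H / 3 - H ** 3 / 27 - K ** 2 / 4
```

Calling `invariants_at` on each element of `quartic_roots(H, K)` should return, up to rounding, the same pair as `closed_form(H, K)`, whichever root is sent to infinity.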
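Proposition 5.2. As complex curves, X and $\widetilde{C}$ are isomorphic. As real curves, X and $\operatorname{Pic}^{2k}(\widetilde{C})$ are isomorphic for any integer k.

Proof. The “complex” statement has already been proved. Let us look at the reality questions. Firstly, we use the Abel-Jacobi mapping to identify Y with $\mathbf{C}/\Lambda$:
$$u : Y \longrightarrow \mathbf{C}/\Lambda, \qquad P \longmapsto \int_\infty^P \frac{ds}{v}.$$
Look at X and at the composition
$$X \xrightarrow{\ \psi\ } Y \xrightarrow{\ u\ } \mathbf{C}/\Lambda,$$
where $\psi$ is the change of variable above
$$\psi(x, y) = \left(s = -\frac{1}{2}\left(x - \frac{H}{3}\right),\ v = \frac{iy}{2}\right).$$
Hence we have
$$u(\psi(P)) = \int_\infty^{\psi(P)} \frac{ds}{v} = i\int_\infty^P \frac{dx}{y},$$
so that, calling S the real structure on X ($S(x, y) = (\bar{x}, \bar{y})$), we have
$$u \circ \psi(S(P)) = -\overline{u \circ \psi(P)}.$$
Thus, the real curve (X, S) is isomorphic with the real curve $(\mathbf{C}/\Lambda, z \mapsto -\bar{z})$.

Let us now look at $\operatorname{Pic}^0(\widetilde{C})$ (recall that $\widetilde{C}$ has no real point, so that there is no natural base point to identify the real curve $\widetilde{C}$ with $\mathbf{C}/\Lambda$, we must use the parity of the degree here) and at the composition
$$\operatorname{Pic}^0(\widetilde{C}) \xrightarrow{\ \varphi\ } Y \xrightarrow{\ u\ } \mathbf{C}/\Lambda.$$
Here I call $\varphi$ the change of variables
$$\varphi(\lambda, \mu) = \left(s = \frac{A_3}{\lambda - \lambda_0} + \frac{1}{2}A_2,\ v = \frac{iA_3\mu}{(\lambda - \lambda_0)^2}\right)$$
and the map it defines on $\operatorname{Pic}^0(\widetilde{C})$, so that:
$$u \circ \varphi\left(\sum P_i - \sum Q_i\right) = \sum\left(\int_\infty^{\varphi(P_i)} \frac{ds}{v} - \int_\infty^{\varphi(Q_i)} \frac{ds}{v}\right) = \sum\int_{\varphi(Q_i)}^{\varphi(P_i)} \frac{ds}{v} = i\sum\int_{Q_i}^{P_i} \frac{d\lambda}{\mu}.$$
Thus, the real curve $(\operatorname{Pic}^0(\widetilde{C}), S)$ is, also, isomorphic with $(\mathbf{C}/\Lambda, z \mapsto -\bar{z})$.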
References
1. Adler, M., van Moerbeke, P.: Completely integrable systems, Euclidean Lie algebras and curves, and Linearization of Hamiltonian systems, Jacobi varieties and representation theory. Adv. Math. 38, 267–317 and 318–379 (1980)
2. Arnold, V.I.: Mathematical Methods in Classical Mechanics. Berlin-Heidelberg-New York: Springer, 1978
3. Audin, M.: Courbes algébriques et systèmes intégrables: géodésiques des quadriques. Expositiones Math. 12, 193–226 (1994)
4. Audin, M.: Spinning Tops, a Course on Integrable Systems. Cambridge: Cambridge University Press, 1996. Russian translation: Regular and Chaotic Dynamics, Moscow, 1999; Japanese translation: Kyoritsu, 2000
5. Audin, M.: Eigenvectors of Lax matrices, spaces of hyperelliptic curves and action coordinates for Moser systems. Regular and Chaotic Dynamics 5, 67–88 (2000)
6. Audin, M.: Actions hamiltoniennes de tores et jacobiennes généralisées. C. R. Acad. Sci. Paris 334, 37–42 (2001)
7. Audin, M.: Les systèmes hamiltoniens et leur intégrabilité. Cours Spécialisés, 8, Paris: Société Mathématique de France & EDP Sciences, 2001
8. Audin, M., Silhol, R.: Variétés abéliennes réelles et toupie de Kowalevski. Compositio Math. 87, 153–229 (1993)
9. Beukers, F., Cushman, R.: The complex geometry of the spherical pendulum. Preprint, Utrecht, 2000
10. Cushman, R., Bates, L.: Global Aspects of Classical Integrable Systems. Basel-Boston: Birkhäuser, 1997
11. Duistermaat, J.J.: On global action-angle coordinates. Comm. Pure Appl. Math. 33, 687–706 (1980)
12. Fay, J.D.: Theta Functions on Riemann Surfaces. Lecture Notes in Mathematics, Vol. 352. Berlin: Springer, 1973
13. Gavrilov, L.: Generalized Jacobians of spectral curves and completely integrable systems. Math. Z. 230(3), 487–508 (1999)
14. Gavrilov, L., Zhivkov, A.: The complex geometry of the Lagrange top. Enseign. Math. (2) 44, 133–170 (1998)
15. Griffiths, P.A.: Linearizing flows and a cohomological interpretation of Lax equations. Amer. J. of Math. 107, 1445–1483 (1985)
16. Haine, L.: Geodesic flow on SO(4) and Abelian surfaces. Math. Ann. 263, 435–472 (1983)
17. Lefschetz, S.: L’Analysis situs et la géométrie algébrique. Paris: Gauthier-Villars, 1924
18. Manakov, S.V.: Note on the integration of Euler’s equations of the dynamics of an n-dimensional rigid body. Funct. Anal. Appl. 11, 328–329 (1976)
19. Nguyen, T.Z.: A note on focus–focus singularities. Differential Geom. Appl. 7(2), 123–130 (1997)
20. Oshemkov, A.A.: Topology of isoenergetic surfaces, and bifurcation diagrams of integrable cases of the dynamics of a rigid body on SO(4). Russ. Math. Surv. 42, 241–242 (1987)
21. Reiman, A.G.: Integrable Hamiltonian systems connected with graded Lie algebras. J. Soviet Math. 19, 1507–1545 (1982)
22. Reyman, A.G., Semenov-Tian-Shanski, M.A.: Group theoretical methods in the theory of finite dimensional integrable systems. In: Dynamical Systems VII, Encyclopaedia of Math. Sci. 16, Springer, 1994
23. Vivolo, O.: Thèse, Toulouse, 1997
24. Whittaker, E.T., Watson, G.N.: A Course of Modern Analysis. Cambridge: Cambridge University Press, 1996. An introduction to the general theory of infinite processes and of analytic functions; with an account of the principal transcendental functions. Reprint of the fourth (1927) edition

Communicated by L. Takhtajan
Commun. Math. Phys. 229, 491–509 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0695-2
Gross-Pitaevskii Theory of the Rotating Bose Gas∗
Robert Seiringer∗∗
Department of Physics, Jadwin Hall, Princeton University, P.O. Box 708, Princeton, NJ 08544, USA. E-mail: [email protected]
Received: 1 December 2001 / Accepted: 19 April 2002
Published online: 6 August 2002 – © Robert Seiringer 2002
Abstract: We study the Gross-Pitaevskii functional for a rotating two-dimensional Bose gas in a trap. We prove that there is a breaking of the rotational symmetry in the ground state; more precisely, for any value of the angular velocity and for large enough values of the interaction strength, the ground state of the functional is not an eigenfunction of the angular momentum. This has interesting consequences on the Bose gas with spin; in particular, the ground state energy depends non-trivially on the number of spin components, and the different components do not have the same wave function. For the special case of a harmonic trap potential, we give explicit upper and lower bounds on the critical coupling constant for symmetry breaking.

1. Introduction

We consider the Gross-Pitaevskii (GP) theory of a rotating two-dimensional Bose gas in a trap. The Bose gas is described by a single function $\phi$ on $\mathbf{R}^2$, the wave function of the condensate. It is confined in some trap potential V, and rotates around the origin at an angular velocity $\Omega$. The strength of the interaction between the particles is measured by the positive parameter a appearing in the GP functional (1.3) below. It is related to the particle number N, the scattering length $a_s$ of the interaction potential and the average particle density $\rho$ via
$$a = \frac{4\pi N}{|\ln(a_s^2\rho)|}. \tag{1.1}$$
Minimization of the GP functional is supposed to describe the physical properties of rotating Bose gases at very low temperatures, as considered in recent experiments [1, 2]. These show various interesting properties, in particular, the appearance of multiple vortices and a resulting breaking of the rotational symmetry. There have been a lot of theoretical investigations on these phenomena (see, e.g., the review article [3]), based on the GP approach, either using numerical methods or various simplifying approximations, but a proof that the Gross-Pitaevskii functional indeed captures all these features is still missing.

∗ © by the author. This paper may be reproduced, in its entirety, for non-commercial purposes.
∗∗ Erwin Schrödinger Fellow. On leave from Institut für Theoretische Physik, Universität Wien, Boltzmanngasse 5, 1090 Vienna, Austria

We shall not be concerned here with the derivation of the GP functional from the basic quantum mechanical N-particle Hamiltonian. For non-rotating systems, i.e., $\Omega = 0$, this has been achieved in [4, 5] (see also [6]). However, the methods used there allow no simple generalization of these results to the rotating case.

We shall now describe the setting more precisely. We denote by $(r, \varphi)$ polar coordinates for $\mathbf{x} = (x, y) \in \mathbf{R}^2$. In these coordinates, the angular momentum is given by $L = -i\partial/\partial\varphi$. For
$$H_0 = -\Delta - \Omega L + V(r) \tag{1.2}$$
and $\phi \in Q(H_0) \cap L^4(\mathbf{R}^2, d^2x)$ define the Gross-Pitaevskii energy functional by
$$\mathcal{E}^{\rm GP}[\phi] = \langle\phi|H_0\phi\rangle + a\int|\phi(\mathbf{x})|^4\,d^2x. \tag{1.3}$$
(Here $Q(H_0)$ denotes the quadratic form domain of $H_0$, and $\langle\cdot|\cdot\rangle$ denotes the standard inner product on $L^2(\mathbf{R}^2)$.) The parameter a is non-negative, and without loss of generality also $\Omega \geq 0$. We assume that $V \in L^\infty_{\rm loc}(\mathbf{R}^2)$ is a positive radial function with the property that
$$V(r) \geq \widetilde{\Omega}^2r^2/4 - C_{\widetilde{\Omega}} \tag{1.4}$$
for all $\widetilde{\Omega}$ in some non-zero interval $[0, \Omega_c)$, with $C_{\widetilde{\Omega}} < \infty$. We take $[0, \Omega_c)$ to be the largest such interval, allowing it to be the whole half line; i.e., we allow $\Omega_c$ to be infinity. Moreover, we assume that V(r) is polynomially bounded at infinity, i.e., there exist constants $C_1$, $C_2$ and $2 \leq s < \infty$ such that $V(r) \leq C_1 + C_2r^s$. For convenience, let $\inf_r V(r) = 0$. Let $E^{\rm GP}(a, \Omega)$ be the ground state energy of $\mathcal{E}^{\rm GP}$, i.e.,
$$E^{\rm GP}(a, \Omega) = \inf\left\{\mathcal{E}^{\rm GP}[\phi],\ \phi \in Q(H_0) \cap L^4(\mathbf{R}^2),\ \|\phi\|_2 = 1\right\}, \tag{1.5}$$
being finite for $|\Omega| < \Omega_c$ and $a \geq 0$. Using standard methods (see e.g. [4]) one can show that there exists a minimizer $\phi^{\rm GP}$ for $\mathcal{E}^{\rm GP}$ as long as $|\Omega| < \Omega_c$, i.e., the infimum is actually a minimum. For $|\Omega| > \Omega_c$ the functional $\mathcal{E}^{\rm GP}$ is not bounded from below.

The purpose of this paper is a detailed study of the GP functional (1.3). One of our main results is that for any $\Omega > 0$ and for large enough interaction strength a, no minimizer of $\mathcal{E}^{\rm GP}$ is an eigenfunction of the angular momentum L. Since $\mathcal{E}^{\rm GP}$ is invariant under rotation of $\phi$, this result means that the rotational symmetry is broken in the ground state. This has interesting consequences on the multi-component Bose gas (or equivalently, Bose gas with spin). In particular, we will show that the ground state energy (of the natural generalization of the GP functional to multi-component systems) depends non-trivially on the number of spin components, and the different components necessarily have a different wave function in the symmetry breaking regime.

The paper is organized as follows: In Sect. 2 we study stationary points of the GP functional, in particular minimizers of $\mathcal{E}^{\rm GP}$ restricted to the subspace of eigenfunctions
of the angular momentum with fixed eigenvalue n, so-called vortex states. We show that for large enough angular momentum, these vortex states can never be the absolute minimizer of the GP functional, uniformly in the coupling constant a, which will be crucial in the proof of symmetry breaking. In Sect. 3 we study the critical values of the angular velocity $\Omega$, where an (n + 1)-vortex becomes energetically favorable to an n-vortex. These critical velocities all tend to zero as a goes to infinity, which will allow us to conclude that all vortex states with angular momentum smaller than a certain value cannot be the actual minimizers of $\mathcal{E}^{\rm GP}$. In Sect. 4 we will use these results to prove symmetry breaking. Section 5 is devoted to the study of a GP density matrix functional, which will be useful in investigations on the multi-component Bose gas in Sect. 6. There we show that in the symmetry breaking regime, the GP energy depends non-trivially on the number of spin components. In Sect. 7 we finally consider the special case of a harmonic potential $V(r) = r^2$, where we derive explicit upper and lower bounds on the critical coupling constant for symmetry breaking.

2. Vortex States

Given any stationary state of $\mathcal{E}^{\rm GP}$, i.e., a function $\phi$ with $\|\phi\|_2 = 1$ satisfying
$$\left(H_0 + 2a|\phi|^2 - \mu\right)\phi = 0 \tag{2.1}$$
for some $\mu \in \mathbf{R}$, we define the (real) quadratic form Q(w) by the perturbation
$$\mathcal{E}^{\rm GP}_\mu[\phi + \varepsilon w] - \mathcal{E}^{\rm GP}_\mu[\phi] = \varepsilon^2Q(w) + O(\varepsilon^3) \tag{2.2}$$
as $\varepsilon \to 0$, where $\mathcal{E}^{\rm GP}_\mu[\phi] = \mathcal{E}^{\rm GP}[\phi] - \mu\int|\phi|^2$. A simple calculation shows that
$$Q(w) = \langle w|H_0 + 4a|\phi|^2 - \mu|w\rangle + 2a\,\Re\int\bar{\phi}^2w^2, \tag{2.3}$$
where $\Re$ denotes the real part. Multiplying (2.1) with $\bar{\phi}$ and integrating shows that
$$\mu = \mathcal{E}^{\rm GP}[\phi] + a\int|\phi|^4. \tag{2.4}$$

Definition 1 (Stability). We say that a stationary state $\phi$ is stable if and only if $Q(w) \geq 0$ for all $w \in Q(H_0) \cap L^4(\mathbf{R}^2)$ that are orthogonal to $\phi$.

We have $Q(i\phi) = 0$, which corresponds to a simple phase change in (2.2). Moreover, $Q(\partial\phi/\partial\varphi) = 0$ because of rotational invariance of $\mathcal{E}^{\rm GP}$. Note that, by definition, an absolute minimizer of $\mathcal{E}^{\rm GP}$ is necessarily stable, since
$$\mathcal{E}^{\rm GP}\left[(\phi + \varepsilon w)/\|\phi + \varepsilon w\|_2\right] = \mathcal{E}^{\rm GP}[\phi] + \varepsilon^2Q(w) + O(\varepsilon^3) \tag{2.5}$$
if $\langle\phi|w\rangle = 0$.

We now look for special solutions to (2.1) of the form
$$\phi(\mathbf{x}) = f(r)e^{in\varphi} \tag{2.6}$$
for some $n \in \mathbf{N}$, a so-called n-vortex. Here f is a real radial function. Since $\mathcal{E}^{\rm GP}[fe^{-in\varphi}] = \mathcal{E}^{\rm GP}[fe^{in\varphi}] + 2n\Omega$ we can restrict ourselves to non-negative n without loss of generality. At least one solution of the form (2.6) for each n always exists, as one easily
sees by minimizing the functional $\mathcal{E}^{\rm GP}$ in the subspace of functions with $L\phi = n\phi$. For $\phi$ of the form (2.6) there is the following direct sum decomposition of $Q$: Writing $w(x) = \sum_{m \ge 0} w_m(x)$ with
$$w_m(x) = A_m(r) e^{i(n-m)\varphi} + B_m(r) e^{i(n+m)\varphi} \tag{2.7}$$
one easily sees that $Q(w) = \sum_{m \ge 0} Q(w_m)$.

Considering the stability of vortices, we restrict ourselves to the case $\Omega < \Omega_c$, since for $\Omega > \Omega_c$ all states are certainly unstable. (The case $\Omega = \Omega_c$ depends on the particular form of the potential $V$.) First of all, $\phi$ has a chance of being stable only if $f$ has no zeros away from $r = 0$. More precisely, the following proposition holds.

Proposition 1 (Instability for $f$'s with zeros). Assume that either $n = 0$ and $f$ has some zero, or $n \ge 1$ and $f$ has some zero away from $r = 0$. Then $\phi$ is unstable.

Proof. For some real $h$ we choose $w(x) = i h(r) e^{in\varphi}$ as a trial function for $Q$. We get
$$Q(w) = \left\langle h \left| -\Delta + \frac{n^2}{r^2} - n\Omega + V(r) + 2af^2 - \mu \right| h \right\rangle \equiv \langle h | \widetilde H | h \rangle. \tag{2.8}$$
We know that $\widetilde H f = 0$, but because of its zeros, $f$ cannot be the ground state of $\widetilde H$, so there exists an $h$, orthogonal to $f$, with $\langle h | \widetilde H | h \rangle < 0$.

Now let $\phi$ be the minimizer of $\mathcal{E}^{\rm GP}$ in the subspace with angular momentum $n$. Then $f$ defined in (2.6) minimizes the energy functional
$$\mathcal{E}_n[f] = \left\langle f \left| -\Delta + \frac{n^2}{r^2} + V(r) \right| f \right\rangle + 2\pi a \int_0^\infty |f(r)|^4\, r\,dr \tag{2.9}$$
under the condition $2\pi \int |f(r)|^2\, r\,dr = 1$, with corresponding energy
$$E_n(a) = \mathcal{E}_n[f] = \mathcal{E}^{\rm GP}[f e^{in\varphi}] + n\Omega. \tag{2.10}$$
In the following, we will study $E_n$ for all $n \ge 0$, not only for integers. Denote
$$\tilde\mu \equiv \mu + n\Omega, \tag{2.11}$$
which is independent of $\Omega$. The minimizer $f$ of $\mathcal{E}_n$ has the following properties:

Lemma 1 (Properties of $f$). $f(r) > 0$ for $r > 0$, $f \in C^\infty(\mathbb{R}_+)$ if $V \in C^\infty$, and $f(r) = O(r^n)$ as $r \to 0$. Moreover, $f \in L^\infty(\mathbb{R}_+)$, and
$$\|f\|_\infty^2 \le \frac{\tilde\mu}{2a}. \tag{2.12}$$

Proof. The regularity and strict positivity follow in a standard way from the variational equation for $f$ (cf. [7]). Writing $f(r) = r^n g(r)$ we see that $g$ minimizes the functional
$$\tilde{\mathcal{E}}[g] = \int_0^\infty g(r) \left( -g''(r) - \frac{2n+1}{r} g'(r) + V(r) g(r) + a r^{2n} g(r)^3 \right) r^{2n+1}\,dr \tag{2.13}$$
under the condition $2\pi \int g(r)^2 r^{2n+1}\,dr = 1$, from which we conclude that $g$ is a bounded, strictly positive function. The bound (2.12) is proved analogously to Lemma 2.1 in [5]: Let $B = \{x : 2a f(|x|)^2 > \tilde\mu\}$. We see that $-\Delta f < 0$ on $B$, i.e., $f$ is subharmonic on $B$ and therefore achieves its maximum on the boundary of $B$. Hence $B$ is empty.
We remark that all the properties of $f$ stated in Lemma 1, except for the positivity, hold for all $n$-vortices and not only for minimizers. The following lemma also holds for arbitrary vortex states.

Lemma 2 (Properties of $g$). Let $f(r) = r^n g(r)$ be a stationary point of $\mathcal{E}_n$, for $n \in \mathbb{R}_+$. Then
$$\|g\|_\infty \le \|f\|_\infty \left( c_n^2 \tilde\mu \right)^{n/2}, \tag{2.14}$$
where
$$c_n = \begin{cases} \left[ \dfrac{2^{-n}}{(2-n)\Gamma(n)} \left( \dfrac{2-n}{n} \right)^{n/2} \pi \csc\!\left( \dfrac{n\pi}{2} \right) \right]^{1/n} & \text{for } n \le 1 \\[2.5ex] \dfrac{\sqrt{\pi}\, \Gamma(n + \frac12)}{n\, \Gamma(n)} & \text{for } n \ge 1. \end{cases} \tag{2.15}$$
If $f$ is the minimizer of $\mathcal{E}_n$, and if $V$ is monotone increasing, then $g$ is a monotone decreasing function.

Proof. By a rearrangement argument one sees from (2.13) that the minimizer of $\tilde{\mathcal{E}}$ is monotone decreasing if $V$ is monotone increasing. For a general $n$-vortex, $g$ fulfills the equation
$$-g''(r) - \frac{2n+1}{r} g'(r) + V(r) g(r) + 2a r^{2n} g(r)^3 = \tilde\mu\, g(r). \tag{2.16}$$
Kato's inequality and the positivity of $V$ imply that
$$-|g(r)|'' - \frac{2n+1}{r} |g(r)|' \le \tilde\mu\, |g(r)| \tag{2.17}$$
in the sense of distributions. Now let $\chi_n(r,s)$ be the kernel of the operator
$$\left( -\frac{d^2}{dr^2} - \frac{2n+1}{r} \frac{d}{dr} + 1 \right)^{-1} \tag{2.18}$$
acting on $L^2(\mathbb{R}_+, r^{2n+1}dr)$. It is given by
$$\chi_n(r,s) = \frac{1}{(rs)^n} \begin{cases} I_n(r) K_n(s) & \text{for } r \le s \\ K_n(r) I_n(s) & \text{for } r \ge s, \end{cases} \tag{2.19}$$
where $I_n$ and $K_n$ denote the usual modified Bessel functions. Note that both $I_n$ and $K_n$ are positive, so $\chi_n$ is positive. By scaling, the integral kernel of (2.18) with $+1$ replaced by $+t^2$ is $t^{2n} \chi_n(rt, st)$. Therefore (2.17) implies that, for $t > 0$ and $0 \le \alpha \le 1$,
$$\begin{aligned} |g(r)| &\le (\tilde\mu + t^2)\, t^{2n} \int_0^\infty \chi_n(rt, st)\, |g(s)|\, s^{2n+1}\,ds \\ &\le (\tilde\mu + t^2)\, t^{2n}\, \|g\|_\infty^{1-\alpha} \|f\|_\infty^{\alpha} \int_0^\infty \chi_n(rt, st)\, s^{2n+1-n\alpha}\,ds. \end{aligned} \tag{2.20}$$
We now claim that $\int \chi_n(r,s) h(s) s^{2n+1}\,ds$ is monotone decreasing in $r$ if $h$ is a positive, monotone decreasing function. To prove this, it suffices to consider a step function $h(s) = \Theta(R - s)$, $R > 0$. A simple calculation yields
$$\int_0^R \chi_n(r,s)\, s^{2n+1}\,ds = \begin{cases} 1 - R^{n+1} K_{n+1}(R)\, \dfrac{I_n(r)}{r^n} & \text{for } r \le R \\[1.5ex] \dfrac{K_n(r)}{r^n}\, R^{n+1} I_{n+1}(R) & \text{for } r \ge R, \end{cases} \tag{2.21}$$
which proves the claim, since $I_n(r)/r^n$ and $K_n(r)/r^n$ are monotone increasing and decreasing, respectively. Therefore the maximum on the right-hand side of (2.20) is achieved for $r = 0$, and we get
$$\|g\|_\infty \le \frac{\tilde\mu + t^2}{t^{2-n\alpha}}\, \|g\|_\infty^{1-\alpha} \|f\|_\infty^{\alpha}\, \frac{2^{-n}}{\Gamma(n+1)} \int_0^\infty K_n(s)\, s^{n+1-n\alpha}\,ds. \tag{2.22}$$
The last integral can be evaluated explicitly if $n\alpha < 2$. Choosing $\alpha = 1$ for $n \le 1$ and $\alpha = 1/n$ for $n \ge 1$ and optimizing over $t$ yields the desired result.

Equation (2.14) effectively gives a lower bound on $s$, the size of the vortex core, defined by $|f(r)| \sim \|f\|_\infty (r/s)^n$ as $r \to 0$, i.e.,
$$s = \lim_{r \to 0} \left( \frac{|f(r)|}{r^n \|f\|_\infty} \right)^{-1/n} \ge \left( \frac{\|f\|_\infty}{\|g\|_\infty} \right)^{1/n} \ge \frac{1}{\tilde\mu^{1/2} c_n}. \tag{2.23}$$
Note that $c_n^n \to 1$ as $n \to 0$, and $c_n = O(n^{-1/2})$ for large $n$. The latter fact will be important in the proof of the following theorem.

Theorem 1 (Instability for large $n$). For all $0 \le \Omega < \Omega_c$ there exists an $N_\Omega < \infty$ (independent of $a$!) such that all vortices with $n \ge N_\Omega$ are unstable.

Proof. Let $n \ge 1$, and let $w_1 \in H^1(\mathbb{R}^2)$ be radial and normalized, with support in the ball of radius 1. Let $X = \int |w_1(r)|^2\, V(r/(c_n \tilde\mu^{1/2}))\, d^2x$, $T = \langle w_1 | -\Delta w_1 \rangle$, and define $w$ by $w(x) = c_n \tilde\mu^{1/2}\, w_1(c_n \tilde\mu^{1/2} r)$, with $c_n$ given in (2.15). We have, using (2.12) and (2.14),
$$Q(w) \le n\Omega + \tilde\mu \left( c_n^2 T + \frac{1}{\tilde\mu} X - 1 + 2 \int |w(x)|^2 \min\{1, (r^2 \tilde\mu c_n^2)^n\}\, d^2x \right). \tag{2.24}$$
With $M = \int |w_1|^2 r^{2n}$ this gives
$$Q(w) \le n\Omega + \tilde\mu \left( c_n^2 T - 1 + 2M \right) + X. \tag{2.25}$$
Now $\tilde\mu$ is larger than
$$e_n^{(0)} \equiv \inf {\rm spec}\, (-\Delta + V)\big|_{L=n}, \tag{2.26}$$
which can be bounded below as follows. By assumption (1.4), $V(r) \ge -\tilde C + \tilde\Omega^2 r^2/4$ for some constant $\tilde C$ and $\Omega < \tilde\Omega < \Omega_c$. Denoting by $\psi_n^{(0)}$ the eigenfunction corresponding to $e_n^{(0)}$, we have
$$e_n^{(0)} \ge \left\langle \psi_n^{(0)} \left| \frac{n^2}{r^2} + V(r) \right| \psi_n^{(0)} \right\rangle \ge \left\langle \psi_n^{(0)} \left| -\tilde C + \frac{n^2}{r^2} + \frac{\tilde\Omega^2}{4} r^2 \right| \psi_n^{(0)} \right\rangle \ge -\tilde C + n \tilde\Omega. \tag{2.27}$$
Now $c_n^2 = O(n^{-1})$ as $n \to \infty$, and the same holds for $M$. $X$ can be bounded by $\sup_{r \le (c_n^2 \tilde\mu)^{-1/2}} V(r)$, which is, for fixed $\Omega$, bounded independently of $n$ and $a$ by the considerations above. Therefore $Q(w) < 0$ for $n$ large enough.

For a special class of potentials, we can extend the previous result in the following way:

Theorem 2 (Instability for special $V$'s). Assume that for some $d \in \mathbb{N}$, $d \ge 2$,
$$\left( r \left( \frac{V(r)}{r^{2(d-1)}} \right)' \right)' \le 0 \tag{2.28}$$
for all $r$. Assume also that $n \ge d$ and
$$\tilde\mu > 2n\Omega \left( 1 + \frac{1}{d-1} \right). \tag{2.29}$$
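Then $\phi$ is unstable.

Proof. For $1 \le d \le n$ we choose as a trial function
$$w(x) = (A(r) + B(r))\, e^{i(n-d)\varphi} + (A(r) - B(r))\, e^{i(n+d)\varphi}. \tag{2.30}$$
Then $Q(w)$ can be written as
$$Q(w) = 2 \left\langle \begin{pmatrix} A \\ B \end{pmatrix} \middle| \mathcal{H} \middle| \begin{pmatrix} A \\ B \end{pmatrix} \right\rangle, \tag{2.31}$$
where
$$\mathcal{H} = \begin{pmatrix} H_0 + \dfrac{n^2 + d^2}{r^2} - n\Omega - \mu + 6af^2 & d\Omega - \dfrac{2nd}{r^2} \\[1.5ex] d\Omega - \dfrac{2nd}{r^2} & H_0 + \dfrac{n^2 + d^2}{r^2} - n\Omega - \mu + 2af^2 \end{pmatrix}. \tag{2.32}$$
We now choose $A(r) = f'(r)/r^{d-1}$ and $B(r) = n f(r)/r^d$. Note that $A - B = r^{n-d+1} (f/r^n)' \le O(r)$ as $r \to 0$, so $w \in H^1(\mathbb{R}^2)$. A straightforward calculation using Eq. (2.1) yields
$$\begin{aligned} Q(w) &= 8\pi \int_0^\infty dr\, \frac{f(r) f'(r)}{r^{2(d-1)}} \left( -\mu(d-1) + 2a(d-1) f(r)^2 + n\Omega + (d-1) V(r) - \frac{r}{2} V'(r) \right) \\ &= 8\pi \int_0^\infty dr\, \frac{f(r)^2}{r^{2(d-1)+1}} \left( -\mu(d-1)^2 + a(d-1)^2 f(r)^2 + (d-1) n\Omega + \frac14\, r^{2(d-1)+1} \left( r \left( \frac{V(r)}{r^{2(d-1)}} \right)' \right)' \right), \end{aligned} \tag{2.33}$$
where we used partial integration in the last step. Estimating $a f(r)^2$ by (2.12) this shows the negativity of $Q(w)$ as long as (2.28) and (2.29) are satisfied.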
In the case of a homogeneous potential $V(r) = r^\nu$, $2 \le \nu < \infty$, the condition (2.28) is fulfilled for $\nu = 2(d-1)$, $d \in \mathbb{N}$, showing that in this case every vortex with $n \ge d = \frac12 \nu + 1$ is unstable, if (2.29) is fulfilled, i.e., if $\tilde\mu$ is large enough. (For fixed $n$ this means that $a$ has to be large enough.)

Remark 1 (Translational stability). The calculation in the proof of Theorem 2 shows that any vortex with $n \ge 1$ is stable against translations, if the opposite of the assumption on $V$ is true for $d = 1$. More precisely, the function $w(x)$ defined above is, for $d = 1$, proportional to $\partial\phi/\partial x$. Looking at (2.33) we see that this expression is always positive for $d = 1$, if $(r V'(r))' \ge 0$. This implies that $Q(\partial\phi/\partial x) \ge 0$, and the same conclusion holds for $\partial\phi/\partial y$. Note that the condition on $V$ is in particular fulfilled for any homogeneous potential $V(r) = r^\nu$.

The choice of the test function in the proof of Theorem 2 is motivated by analogous considerations in [8] (see also [9] for a treatment of the Ginzburg-Landau model).

3. The Critical Frequencies

From (2.10) one sees that an $(n+1)$-vortex becomes energetically favorable to an $n$-vortex if $\Omega > \Omega_n$, where the critical frequency is given by
$$\Omega_n(a) = E_{n+1}(a) - E_n(a) > 0, \tag{3.1}$$
with $E_n(a)$ defined in (2.10). In the following we will study the properties of the $\Omega_n$'s, in particular their behavior for large $a$. This will be important in the proof of symmetry breaking in the ground state of the GP functional.

Lemma 3 (Relation between $\Omega_n$'s). For $n \ge 0$,
$$\Omega_{n+1} \le \frac{2n+3}{2n+1}\, \Omega_n. \tag{3.2}$$

Proof. Using $f_{n+1}$, the minimizer for $\mathcal{E}_{n+1}$, as a trial function for $\mathcal{E}_{n+2}$ and $\mathcal{E}_n$, respectively, we get
$$\Omega_{n+1} \le (2n+3) \int \frac{f_{n+1}(r)^2}{r^2}\, d^2x \tag{3.3}$$
and
$$\Omega_n \ge (2n+1) \int \frac{f_{n+1}(r)^2}{r^2}\, d^2x. \tag{3.4}$$

Theorem 3 (Bounds on $\Omega_n$). For all $n \in \mathbb{N}_0$,
$$\Omega_n(a) \le (2n+1)\, \frac{2\pi e}{a}\, E_1(a) \left( 3 + \left[ \ln \frac{a}{2\pi e^2} \right]_+ \right), \tag{3.5}$$
$$\Omega_n(a) \ge (2n+1)\, \frac{\tilde\Omega^2}{4}\, \frac{1}{\tilde C + E_{n+1}(a)} \quad \text{for all } 0 \le \tilde\Omega < \Omega_c. \tag{3.6}$$
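Proof. The concavity of $E_n(a)$ in $n^2$ implies that the right and left derivatives of $E_n$ with respect to $n$ exist, and from the existence of a unique minimizer for $\mathcal{E}_n$ for all $n$ we conclude that $E_n$ is in fact differentiable in $n$. Therefore we have
$$\Omega_n = \left. \frac{\partial E_n}{\partial n} \right|_{n = n_0} \tag{3.7}$$
for some $n_0 \in (n, n+1)$. Now let $f(r) = r^n g(r)$ be the minimizer of $\mathcal{E}_n$. To obtain the upper bound, we estimate, for $0 < \alpha \le 1$,
$$\begin{aligned} \frac{\partial E_n}{\partial n} = 2n \int \frac{f(r)^2}{r^2}\, d^2x &\le 2n \left( \|f\|_4^2 \left( \int_{|x| \ge R} r^{-4}\, d^2x \right)^{1/2} + \|g\|_\infty^{2\alpha} \|f\|_\infty^{2(1-\alpha)} \int_{|x| \le R} r^{2n\alpha - 2}\, d^2x \right) \\ &\le 2n \left( \|f\|_\infty \frac{\sqrt{\pi}}{R} + \|g\|_\infty^{2\alpha} \|f\|_\infty^{2(1-\alpha)}\, \frac{\pi}{n\alpha}\, R^{2n\alpha} \right) \end{aligned} \tag{3.8}$$
for all $R > 0$. Optimizing over $R$ yields
$$\frac{\partial E_n}{\partial n} \le \left( 2n + \frac{1}{\alpha} \right) \left( 2\pi^{1+n\alpha}\, \|f\|_\infty^{2n\alpha + 2(1-\alpha)}\, \|g\|_\infty^{2\alpha} \right)^{1/(2n\alpha+1)}, \tag{3.9}$$
and by (2.14),
$$\frac{\partial E_n}{\partial n} \le \left( 2n + \frac{1}{\alpha} \right) \left( 2\pi^{1+n\alpha}\, c_n^{2n\alpha}\, \|f\|_\infty^{2n\alpha+2}\, \tilde\mu^{\alpha n} \right)^{1/(2n\alpha+1)}. \tag{3.10}$$
Next we choose
$$\alpha = \min\left\{ 1,\ \frac{1}{n} \left[ \ln \frac{c_n^2 \tilde\mu}{4\pi e^2 \|f\|_\infty^2} \right]_+^{-1} \right\} \tag{3.11}$$
and use (2.12), which yields
$$\frac{\partial E_n}{\partial n} \le \tilde\mu\, \frac{\pi e}{a} \max\left\{ 2n+1,\ n \ln \frac{c_n^2 a}{2\pi} \right\}. \tag{3.12}$$
Using $\tilde\mu \le 2E_n(a)$ and $c_n^{2n} \le e$ this gives, together with (3.7), for $\Omega_0$
$$\Omega_0(a) \le \frac{2\pi e}{a}\, E_1(a) \max\left\{ 3,\ 1 + \ln \frac{a}{2\pi} \right\}. \tag{3.13}$$
Now $\Omega_n \le (2n+1)\Omega_0$ by Lemma 3, which finishes the proof of the upper bound. To obtain the lower bound, we use (3.4) and
$$\int \frac{f_{n+1}^2}{r^2}\, d^2x \ge \frac{1}{\int f_{n+1}^2 r^2\, d^2x} \ge \frac{\tilde\Omega^2}{4}\, \frac{1}{\tilde C + E_{n+1}(a)} \tag{3.14}$$
because of (1.4), for all $0 \le \tilde\Omega < \Omega_c$.

Note that since $V(r) \le C_1 + C_2 r^s$ for some $2 \le s < \infty$ by assumption, a simple trial wave function shows that $E_n(a) \le O(a^{s/(s+2)})$ as $a \to \infty$, implying that $\Omega_n$ behaves at most as $a^{-2/(s+2)} \ln a$ for large $a$. In particular, $\lim_{a \to \infty} \Omega_n(a) = 0$ for all $n$.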
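The proofs of Theorems 1 and 3 rely on three properties of the constant $c_n$ from (2.15): $c_n^n \to 1$ as $n \to 0$, $c_n^{2n} \le e$ for $0 < n \le 1$, and $c_n = O(n^{-1/2})$ for large $n$. The following is a minimal numerical sketch of these properties, assuming the two-branch form $c_n^n = 2^{-n} ((2-n)/n)^{n/2}\, \pi \csc(n\pi/2) / ((2-n)\Gamma(n))$ for $n \le 1$ and $c_n = \sqrt{\pi}\,\Gamma(n+\frac12)/(n\Gamma(n))$ for $n \ge 1$. The function names are illustrative and not part of the original text.

```python
import math


def c_small(n: float) -> float:
    """Assumed branch of (2.15) for 0 < n <= 1."""
    power = (2.0 ** (-n) * ((2.0 - n) / n) ** (n / 2.0) * math.pi
             / (math.sin(n * math.pi / 2.0) * (2.0 - n) * math.gamma(n)))
    return power ** (1.0 / n)


def c_large(n: float) -> float:
    """Assumed branch of (2.15) for n >= 1; lgamma avoids overflow for large n."""
    ratio = math.exp(math.lgamma(n + 0.5) - math.lgamma(n))
    return math.sqrt(math.pi) * ratio / n


def c_n(n: float) -> float:
    """Constant c_n, switching branches at n = 1."""
    if n <= 0:
        raise ValueError("n must be positive")
    return c_small(n) if n <= 1 else c_large(n)


if __name__ == "__main__":
    # Both branches should agree at n = 1 (value pi/2).
    print("c_1 (both branches):", c_small(1.0), c_large(1.0))
    # c_n^n should approach 1 for small n.
    print("c_n^n at n = 1e-3:", c_n(1e-3) ** 1e-3)
    # Largest value of c_n^{2n} on a grid in (0, 1], to be compared with e.
    print("max c_n^{2n} on (0,1]:", max(c_n(k / 100) ** (2 * k / 100) for k in range(1, 101)))
    # sqrt(n) * c_n should approach sqrt(pi) for large n.
    print("sqrt(n) c_n at n = 100:", 10.0 * c_n(100.0), "vs sqrt(pi) =", math.sqrt(math.pi))
```

A grid check of this kind does not replace the analytic argument; it only guards against transcription errors in the constant.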
4. Symmetry Breaking in the Ground State

We now have the necessary tools to prove symmetry breaking. With the results of Theorems 1 and 3 the following is easily shown.

Theorem 4 (Symmetry breaking). For all $0 < \Omega < \Omega_c$ there is an $a_\Omega$ such that $a \ge a_\Omega$ implies that no ground state of the functional (1.3) is an eigenfunction of the angular momentum.

Proof. Fix $0 < \Omega < \Omega_c$. From Theorem 1 we see that there exists an $N_\Omega$ independent of $a$ such that all vortex states with $n \ge N_\Omega$ are unstable, and therefore cannot be minimizers of $\mathcal{E}^{\rm GP}$. By the definition of the critical frequencies (3.1),
$$\min_{0 \le n < N_\Omega} \{E_n(a) - n\Omega\} > \min_{n \ge N_\Omega} \{E_n(a) - n\Omega\} \tag{4.1}$$
if
$$\Omega > \max_{0 \le j < N_\Omega} \frac{1}{N_\Omega - j} \sum_{i=j}^{N_\Omega - 1} \Omega_i(a). \tag{4.2}$$
By Theorem 3 (and the remark after the proof) this can always be fulfilled for $a$ large enough, so the ground state of $\mathcal{E}^{\rm GP}$ cannot be an $n$-vortex.

This shows that no minimizer of the GP functional is an eigenfunction of the angular momentum. We can even show more, namely that the absolute value of a minimizer is not a radial function. To prove this, we need the following general lemma.

Lemma 4 (Fourier series of $e^{ih(\varphi)}$). Let $h : [0, 2\pi] \to \mathbb{R}$ be a measurable function. Then the set of Fourier coefficients of $e^{ih}$ contains either only one or infinitely many non-zero elements.

Proof. Let $e^{ih(\varphi)} = \sum_n h_n e^{in\varphi}$, where the $h_n$'s are the Fourier coefficients of $e^{ih}$. Let $n_0 = \max\{n : h_n \ne 0\}$ and $n_1 = \min\{n : h_n \ne 0\}$, assuming that both are finite. Since
$$1 = \left| e^{ih(\varphi)} \right|^2 = \sum_n k_n e^{in\varphi} \quad \text{with} \quad k_n = \sum_m h_{n+m} \overline{h_m}, \tag{4.3}$$
we know that $k_n = \delta_{n0}$. But $k_{n_0 - n_1} = h_{n_0} \overline{h_{n_1}} \ne 0$, so $n_0 = n_1$.
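The identity $k_n = \delta_{n0}$ used in the proof of Lemma 4 has an exact discrete counterpart, which is easy to check numerically. The sketch below samples a unimodular function $e^{ih}$ on an equidistant grid, computes its discrete Fourier coefficients, and evaluates the circular autocorrelation $\sum_m h_{n+m} \overline{h_m}$. The phase function, the grid size and all function names are illustrative assumptions; the discrete identity illustrates (4.3) but does not by itself prove the lemma.

```python
import cmath
import math


def dft_coefficients(samples):
    """Normalized DFT: H_k = (1/N) * sum_j x_j * exp(-2*pi*i*j*k/N)."""
    n = len(samples)
    return [sum(x * cmath.exp(-2j * math.pi * j * k / n) for j, x in enumerate(samples)) / n
            for k in range(n)]


def autocorrelation(coeffs):
    """k_s = sum_m H_{s+m} * conj(H_m), with indices taken modulo N."""
    n = len(coeffs)
    return [sum(coeffs[(s + m) % n] * coeffs[m].conjugate() for m in range(n))
            for s in range(n)]


def count_nonzero(coeffs, tol=1e-9):
    """Number of coefficients with modulus above the tolerance."""
    return sum(1 for c in coeffs if abs(c) > tol)


N = 64
# Hypothetical real phase h(phi) sampled at phi_j = 2*pi*j/N.
phases = [math.sin(2 * math.pi * j / N) + 0.3 * math.cos(6 * math.pi * j / N) for j in range(N)]
generic = dft_coefficients([cmath.exp(1j * h) for h in phases])
# Pure vortex phase exp(3*i*phi): exactly one non-zero coefficient.
single = dft_coefficients([cmath.exp(3j * 2 * math.pi * j / N) for j in range(N)])
k = autocorrelation(generic)

if __name__ == "__main__":
    print("k_0 =", k[0])
    print("max |k_s|, s != 0:", max(abs(v) for v in k[1:]))
    print("non-zero coefficients (generic, single):", count_nonzero(generic), count_nonzero(single))
```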
Corollary 1 (Symmetry breaking, Part 2). Let $0 < \Omega < \Omega_c$ and $a \ge a_\Omega$, and let $\phi^{\rm GP}$ be a minimizer of $\mathcal{E}^{\rm GP}$. Then $|\phi^{\rm GP}|$ is not a radial function.

Proof. Assume $|\phi^{\rm GP}|$ is radial. With $\widetilde H = H_0 + 2a|\phi^{\rm GP}|^2 - \mu$ we have $\widetilde H \phi^{\rm GP} = 0$. Because $\widetilde H$ commutes with $L$, it has eigenfunctions $h_n(r) e^{in\varphi}$ with corresponding eigenvalue 0. Therefore
$$\phi^{\rm GP}(r, \varphi) = \sum_n \lambda_n h_n(r) e^{in\varphi} \tag{4.4}$$
for some $0 \ne \lambda_n \in \mathbb{C}$, and the sum is finite, since the ground state energies of $\widetilde H$ restricted to subspaces of $L = n$ go to infinity as $n \to \infty$, and there can only be one eigenfunction for each $n$. Choosing some interval $I \subset \mathbb{R}_+$ where $|\phi^{\rm GP}|$ does not vanish, we can conclude with Lemma 4 that only one $h_n$ is unequal to zero in $I$. But since the $h_n$'s cannot vanish on an open set without vanishing identically, this is true on all of $\mathbb{R}_+$. Therefore $\phi^{\rm GP}$ has to be an eigenfunction of $L$, contradicting Theorem 4.
Note that Theorem 4 implies in particular that the minimizer of $\mathcal{E}^{\rm GP}$ is not unique (up to a constant phase), and Corollary 1 shows that even the absolute value is not unique. By rotating a minimizer $\phi^{\rm GP}$ one obtains again a minimizer, which is, except possibly for exceptional values of the rotation angle, different from the original one. Numerical investigations [10, 11] indicate that the symmetry breaking results from a splitting of an $n$-vortex into several vortices with winding number 1. That is, one expects that $\phi^{\rm GP}$ has $d$ distinct zeros of degree 1, where $d = \deg\{\phi^{\rm GP}/|\phi^{\rm GP}|\}$ for large enough $r$. This property was proved for large $a$ for the minimizer of models similar to the GP functional [12–14]. There is also experimental evidence for vortex splitting [1, 2].
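The comparison of vortex energies in the proof of Theorem 4 is a statement about a finite table of numbers, since $\sum_{i=j}^{N-1} \Omega_i = E_N - E_j$ by (3.1). The sketch below evaluates the right-hand side of (4.2) for a table of energies and confirms that, for $\Omega$ above this threshold, the vortex minimizing $E_n - n\Omega$ has index at least $N$. The energy values are made up for illustration and the function names are not from the original text.

```python
def frequency_threshold(energies, big_n):
    """Right-hand side of (4.2): max over j < N of (E_N - E_j) / (N - j)."""
    return max((energies[big_n] - energies[j]) / (big_n - j) for j in range(big_n))


def best_vortex(energies, omega):
    """Index n that minimizes E_n - n * omega among the tabulated energies."""
    return min(range(len(energies)), key=lambda n: energies[n] - n * omega)


# Hypothetical values of E_n(a) for n = 0..4 (illustration only).
energies = [2.0, 3.5, 4.8, 6.0, 7.1]
N = 3
omega = frequency_threshold(energies, N) + 0.05

if __name__ == "__main__":
    low = min(energies[n] - n * omega for n in range(N))
    high = min(energies[n] - n * omega for n in range(N, len(energies)))
    print("threshold:", frequency_threshold(energies, N))
    print("min over n < N:", low, " min over n >= N:", high)
    print("best vortex index:", best_vortex(energies, omega))
```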
5. A Density Matrix Functional

We now introduce a new functional, which will be convenient in the following; firstly, to obtain a lower bound on the critical parameter $a_\Omega$, and secondly, for studying a generalization of the GP functional to a Bose gas with several components, which will be done in the next section. Analogously to the Gross-Pitaevskii functional we define the GP density matrix (DM) functional as
$$\mathcal{E}^{\rm DM}[\gamma] = {\rm Tr}[H_0 \gamma] + a \int \rho_\gamma(x)^2\, d^2x. \tag{5.1}$$
Here $\gamma$ is a one-particle density matrix, a positive trace-class operator on $L^2(\mathbb{R}^2)$, and $\rho_\gamma$ denotes its density. The ground state energy, the infimum of (5.1) under the condition ${\rm Tr}[\gamma] = 1$, will be denoted by $E^{\rm DM}(a, \Omega)$. It is clear that $E^{\rm DM} \le E^{\rm GP}$. Using the methods of [4, 15] one can prove the following theorem.

Theorem 5 (Minimizer of $\mathcal{E}^{\rm DM}$). For each $0 \le \Omega < \Omega_c$ and $a \ge 0$ there exists a minimizing density matrix for (5.1) under the condition ${\rm Tr}[\gamma] = 1$. The density corresponding to the minimizer, denoted by $\rho^{\rm DM}$, is unique (and therefore a radial function), and each minimizer also minimizes the linearized functional
$$\mathcal{E}^{\rm DM}_{\rm lin}[\gamma] = {\rm Tr}[(H_0 + 2a\rho^{\rm DM})\gamma]. \tag{5.2}$$
The uniqueness of the density results from the strict convexity of $\mathcal{E}^{\rm DM}$ in $\rho_\gamma$. In general, the minimizing density matrix need not be unique. However, for the functional (5.1) we can show that this is indeed the case.

Theorem 6 (Uniqueness of $\gamma^{\rm DM}$). The minimizer of $\mathcal{E}^{\rm DM}$, denoted by $\gamma^{\rm DM}$, is unique. Moreover, it has finite rank.

Proof. Being a minimizer of (5.2), $\gamma^{\rm DM}$ can be decomposed as
$$\gamma^{\rm DM}(x, x') = \sum_{j,k \ge 0} \lambda_{j,k}\, f_j(r) f_k(r')\, e^{i(j\varphi - k\varphi')}, \tag{5.3}$$
where the $f_k(r) e^{ik\varphi}$ are the ground states of $H_0 + 2a\rho^{\rm DM}$, and the sum is finite because of the discreteness of the spectrum of this operator, implying finite rank of $\gamma^{\rm DM}$. Moreover,
there can be only one ground state for each angular momentum, and $f_j(r) = r^j g_j(r)$ with $g_j(r)$ bounded and strictly positive. Therefore
$$\rho^{\rm DM}(r) = \sum_{j \ge 0} r^j \chi_j(r, \varphi), \tag{5.4}$$
with
$$\chi_j(r, \varphi) = \sum_{0 \le k \le j} \lambda_{j-k,k}\, g_{j-k}(r) g_k(r)\, e^{i(j-2k)\varphi}. \tag{5.5}$$
Hence each $\chi_j$ has to be independent of $\varphi$ as $r \to 0$, which implies, together with Lemma 4, that $\lambda_{j,k} = 0$ for $j \ne k$. Moreover, $\lambda_{j,j}$ is determined by the unique density $\rho^{\rm DM} = \sum_{j \ge 0} r^{2j} \lambda_{j,j} g_j^2$, so $\gamma^{\rm DM}$ is unique.

Analogously to minimizers of the GP functional, the DM density has the following properties.

Lemma 5 (Properties of $\rho^{\rm DM}$). For $r > 0$ we have $\rho^{\rm DM} > 0$ and $\rho^{\rm DM} \in C^\infty$ if $V \in C^\infty$. Moreover, $\|\rho^{\rm DM}\|_\infty \le \mu^{\rm DM}/(2a)$, where $\mu^{\rm DM}$ is the chemical potential of the DM theory, which is the ground state energy of $\mathcal{E}^{\rm DM}_{\rm lin}$.

Proof. Note that $\rho^{\rm DM} = \sum_j \lambda_{j,j} f_j(r)^2$ with the notation of the proof of Theorem 6, where $\lambda_{j,j} \ge 0$. The first two properties follow from this decomposition and a bootstrap argument. Moreover, a direct computation gives
$$-\Delta \rho^{\rm DM} \le 2\rho^{\rm DM} \left( \mu^{\rm DM} - 2a\rho^{\rm DM} - V(r) \right). \tag{5.6}$$
Since $V(r) \ge 0$ this implies that $2a\rho^{\rm DM} \le \mu^{\rm DM}$ by a subharmonicity argument as in Lemma 1.

An important consequence of the uniqueness of the minimizer of $\mathcal{E}^{\rm DM}$ is the following corollary.

Corollary 2 (Non-equivalence of $E^{\rm GP}$ and $E^{\rm DM}$). Assume that the minimizer of $\mathcal{E}^{\rm GP}$ is not unique, which is in particular the case for $a \ge a_\Omega$. Then $E^{\rm GP}(a, \Omega) > E^{\rm DM}(a, \Omega)$.

Proof. This follows immediately from the uniqueness of $\gamma^{\rm DM}$.
Note that in the case of non-uniqueness of $\phi^{\rm GP}$, the rank of $\gamma^{\rm DM}$ is always greater than or equal to two, and therefore the ground state of $H_0 + 2a\rho^{\rm DM}$ is degenerate. This holds in particular in the whole region $a \ge a_\Omega$, not only for isolated points or lines in the $(a, \Omega)$ plane.

For the non-rotating case, i.e. $\Omega = 0$, $E^{\rm GP}$ and $E^{\rm DM}$ are equal for all $a$. This remains true if $\Omega$ is not too large.

Proposition 2 (Equivalence of $E^{\rm GP}$ and $E^{\rm DM}$ for small $\Omega$). Assume that
$$\Omega \le \frac{\tilde\Omega^2}{4}\, \frac{1}{\tilde C + \mu^{\rm DM}} \tag{5.7}$$
for some $\Omega < \tilde\Omega < \Omega_c$. Then $E^{\rm GP}(a, \Omega) = E^{\rm DM}(a, \Omega)$, and the minimizer of $\mathcal{E}^{\rm GP}$ has zero angular momentum.
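Note that this proposition implies a lower bound on $a_\Omega$.

Proof. Assume that $\psi$ is a ground state of $H^{\rm DM} \equiv H_0 + 2a\rho^{\rm DM}$ with angular momentum $L\psi = m\psi$. Then
$$\begin{aligned} \mu^{\rm DM} = \inf {\rm spec}\, H^{\rm DM} &= \left\langle |\psi| \left| -\Delta - m\Omega + \frac{m^2}{r^2} + V(r) + 2a\rho^{\rm DM} \right| |\psi| \right\rangle && (5.8) \\ &\ge \mu^{\rm DM} + \left\langle \psi \left| \frac{m^2}{r^2} - m\Omega \right| \psi \right\rangle, && (5.9) \end{aligned}$$
implying that
$$m \le \frac{\Omega}{\langle \psi | r^{-2} | \psi \rangle} \le \Omega\, \langle \psi | r^2 | \psi \rangle. \tag{5.10}$$
Choosing some $\tilde\Omega$ with $\Omega < \tilde\Omega < \Omega_c$ we have $r^2 \le 4(\tilde C + V(r))/\tilde\Omega^2$ by (1.4), and hence
$$\langle \psi | r^2 | \psi \rangle \left( 1 - \frac{\Omega^2}{\tilde\Omega^2} \right) \le \frac{4}{\tilde\Omega^2} \left( \tilde C + \left\langle \psi \left| V(r) - \frac{\Omega^2}{4} r^2 \right| \psi \right\rangle \right) \le \frac{4}{\tilde\Omega^2} \left( \tilde C + \mu^{\rm DM} - \Omega \right), \tag{5.11}$$
where we have used that the ground state energy of $-\Delta - \Omega L + \Omega^2 r^2/4$ is $\Omega$. Therefore
$$\Omega\, \langle \psi | r^2 | \psi \rangle \le \frac{4\Omega \left( \tilde C + \mu^{\rm DM} - \Omega \right)}{\tilde\Omega^2 - \Omega^2} < 1 \tag{5.12}$$
if
$$\Omega \le \frac{\tilde\Omega^2}{4}\, \frac{1}{\tilde C + \mu^{\rm DM}}, \tag{5.13}$$
showing that any ground state of $H^{\rm DM}$ necessarily has angular momentum $m = 0$.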
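The last step of this proof is elementary algebra: if $4\Omega(\tilde C + \mu^{\rm DM}) \le \tilde\Omega^2$ and $0 < \Omega < \tilde\Omega$, then $4\Omega(\tilde C + \mu^{\rm DM} - \Omega) \le \tilde\Omega^2 - 4\Omega^2 < \tilde\Omega^2 - \Omega^2$. The sketch below checks this implication on random parameter values. The sampling ranges and the function names are arbitrary choices made for illustration.

```python
import random


def radius_bound(omega, omega_tilde, c_tilde, mu_dm):
    """Middle expression of (5.12): 4*Omega*(C + mu - Omega) / (Omega_tilde^2 - Omega^2)."""
    return 4.0 * omega * (c_tilde + mu_dm - omega) / (omega_tilde ** 2 - omega ** 2)


def threshold(omega_tilde, c_tilde, mu_dm):
    """Right-hand side of (5.13): Omega_tilde^2 / (4 * (C + mu))."""
    return omega_tilde ** 2 / (4.0 * (c_tilde + mu_dm))


def count_violations(trials=10000, seed=0):
    """Count random parameter sets satisfying (5.13) for which (5.12) fails."""
    rng = random.Random(seed)
    violations = 0
    for _ in range(trials):
        omega_tilde = rng.uniform(0.1, 2.0)
        c_tilde = rng.uniform(0.0, 5.0)
        mu_dm = rng.uniform(1.0, 20.0)
        # Pick Omega anywhere in (0, threshold]; with these ranges the
        # threshold is below Omega_tilde, so Omega < Omega_tilde holds.
        omega = rng.uniform(1e-6, 1.0) * threshold(omega_tilde, c_tilde, mu_dm)
        if not radius_bound(omega, omega_tilde, c_tilde, mu_dm) < 1.0:
            violations += 1
    return violations


if __name__ == "__main__":
    print("violations found:", count_violations())
```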
6. The Multi-Component Bose Gas

We now consider the Gross-Pitaevskii theory of a rotating Bose gas with $n_c$ different components, or equivalently, a Bose gas consisting of particles with spin $(n_c - 1)/2$. The natural generalization of the Gross-Pitaevskii functional is
$$\mathcal{E}^{\rm GP}_{n_c}[\phi_1, \ldots, \phi_{n_c}] = \sum_{i=1}^{n_c} \langle \phi_i | H_0 \phi_i \rangle + a \sum_{1 \le i,j \le n_c} \int |\phi_i|^2 |\phi_j|^2, \tag{6.1}$$
which has to be minimized under the constraint $\sum_{i=1}^{n_c} \int |\phi_i|^2 = 1$. The corresponding ground state energy will be denoted by $E^{\rm GP}_{n_c}(a, \Omega)$.

Using standard methods, one can show that for all values of $n_c \in \mathbb{N}$, $0 \le \Omega < \Omega_c$ and $a \ge 0$ there exist minimizing functions $\phi_1^{\rm GP}, \ldots, \phi_{n_c}^{\rm GP}$ for $\mathcal{E}^{\rm GP}_{n_c}$. The proof goes analogously to the proof of Theorem 5, noting that $\mathcal{E}^{\rm GP}_{n_c}$ can be considered as the restriction
of $\mathcal{E}^{\rm DM}$ to density matrices of rank less than or equal to $n_c$. However, since this set is not convex, we can in general not conclude that the density of a minimizer, $\sum_{i=1}^{n_c} |\phi_i^{\rm GP}|^2$, is unique, as was the case for the DM functional. We see that for all values of $a$, $\Omega$ and $n_c$ we always have $E^{\rm DM} \le E^{\rm GP}_{n_c} \le E^{\rm GP}$. Denoting
$$n^{\rm DM}(a, \Omega) = {\rm rank}\, \gamma^{\rm DM}, \tag{6.2}$$
which we showed to be finite in Theorem 6, we can distinguish the following cases.

Theorem 7 (Minimizers of the multi-component GP functional). Let $\phi_1^{\rm GP}, \ldots, \phi_{n_c}^{\rm GP}$ be minimizers of $\mathcal{E}^{\rm GP}_{n_c}$.

(i) If $n_c \ge n^{\rm DM}$, then $E^{\rm DM}(a, \Omega) = E^{\rm GP}_{n_c}(a, \Omega)$. Moreover,
$$\sum_{i=1}^{n_c} |\phi_i^{\rm GP}\rangle\langle\phi_i^{\rm GP}| = \gamma^{\rm DM}, \tag{6.3}$$
implying that the $\phi_i^{\rm GP}$'s can be written as $\phi_i^{\rm GP} = \sum_{j=1}^{n^{\rm DM}} A_{ij} \psi_j$, where $\gamma^{\rm DM} = \sum_{i=1}^{n^{\rm DM}} |\psi_i\rangle\langle\psi_i|$, $\psi_i$ orthogonal, and $A$ is an $n_c \times n^{\rm DM}$-matrix with $A^\dagger A = 1$.

(ii) If $n_c < n^{\rm DM}$, then $E^{\rm DM}(a, \Omega) < E^{\rm GP}_{n_c}(a, \Omega)$.

(iii) If $n_c \ge 2$, $a \ge a_\Omega$, then $E^{\rm GP}_{n_c}(a, \Omega) < E^{\rm GP}(a, \Omega)$, and the minimizers $\phi_i^{\rm GP}$ are not all equal; i.e., $\sum_{i=1}^{n_c} |\phi_i^{\rm GP}\rangle\langle\phi_i^{\rm GP}|$ has at least rank 2.

Proof. As remarked earlier, $E^{\rm DM} \le E^{\rm GP}_{n_c}$. To prove (i), we write $\gamma^{\rm DM} = \sum_{i=1}^{n^{\rm DM}} |\psi_i\rangle\langle\psi_i|$, where the $\psi_i$ are orthogonal (but not necessarily normalized). Using $\phi_i = \psi_i$ for $i \le n^{\rm DM}$ and $\phi_i = 0$ for $n^{\rm DM} < i \le n_c$ as trial functions, we obtain $E^{\rm DM} \ge E^{\rm GP}_{n_c}$. Since $\gamma^{\rm DM}$ is unique, (6.3) follows. (ii) is a trivial consequence of the uniqueness of $\gamma^{\rm DM}$. For $a \ge a_\Omega$, we know from Corollary 1 that there are at least two different minimizers, $\phi^{(1)}$ and $\phi^{(2)}$, for $\mathcal{E}^{\rm GP}$, whose absolute values are not the same. For $n_c \ge 2$ we use as trial functions $\phi_1 = \phi^{(1)}$, $\phi_i = \phi^{(2)}$ for $i \ge 2$ to obtain $E^{\rm GP}_{n_c} < E^{\rm GP}$. Therefore the minimizers for $\mathcal{E}^{\rm GP}_{n_c}$ cannot be all equal, and (iii) is proved.

An important consequence of part (iii) of the theorem above is that the GP ground state energy depends non-trivially on the number of spin components, at least in the symmetry breaking regime $a \ge a_\Omega$. Moreover, there is a clear separation between different spin components: their individual densities $|\phi_i^{\rm GP}|^2$ can never be all equal.

7. The Special Case $V(r) = r^2$

In the case of a harmonic potential $V(r) = r^2$ the theorems above are more explicit, setting $\Omega_c = 2$ and $\tilde C = 0$. Moreover, the special case $\Omega = \Omega_c = 2$ is easy to treat: only for $a = 0$ is there a minimizer for $\mathcal{E}^{\rm GP}$ (in fact there are infinitely many), whereas for $a > 0$ there is no minimizer, and also all $n$-vortices are unstable. The following bound on the energies $E_n(a)$ can be easily obtained, and will be used several times in the considerations below.
Lemma 6 (Upper bound on $E_n(a)$).
$$E_n(a) \le 2(n+1) \sqrt{1 + \frac{a}{b_n (n+1)}} \tag{7.1}$$
with $b_n = 2\pi\, 4^n (n!)^2/(2n)!$.

Proof. This follows from a trial function of the form $C r^n \exp(-c r^2)$, where $C$ is a normalization constant and $c$ is to be optimized.

The estimate above implies that, for $\tilde\mu$ the chemical potential corresponding to the minimizer of $\mathcal{E}_n$,
$$\tilde\mu - 2n \le 2(E_n - 2n) - 2 \le 2 \left( 1 + 2 \sqrt{\frac{(n+1)}{b_n}\, a} \right). \tag{7.2}$$
This will be useful for an upper bound on $f$, since in the case of $V(r) = r^2$ we can improve the estimate (2.12) by
$$\|f\|_\infty^2 \le \frac{1}{2a} (\tilde\mu - 2n). \tag{7.3}$$
The proof is analogous to the one of (2.12), using in addition that $n^2/r^2 + r^2 \ge 2n$.

We now consider the stability of $n$-vortices $\phi$, i.e., solutions to (2.1) of the form (2.6). In addition to the results of Sect. 2 we can state another proposition.

Proposition 3 (Instability for small $a$). Assume that $a < \pi n (2 - \Omega)$. Then $\phi$ is unstable.

Proof. Let $n \ge 1$, and let $w(x) = \sqrt{1/\pi}\, \exp(-r^2/2)$ be the ground state of $H_0$. Then
$$Q(w) = 2 - \mu + 4a \int w^2 |\phi|^2 \le n(\Omega - 2) + \frac{a}{\pi}, \tag{7.4}$$
where we used Cauchy-Schwarz and the fact that $\mu \ge 2 - n(\Omega - 2) + 2a \int |\phi|^4$.

Moreover, we can improve Theorem 1 in the following way.

Theorem 8 (Instability for large $n$, harmonic potential). Assume that $n \ge 10$ and
$$\tilde\mu \ge \frac{n\Omega}{1 - d_n}, \tag{7.5}$$
where $d_n$ is a monotone decreasing function of $n$, with $d_n < 1$ for $n \ge 10$, namely
$$d_n = \min\left\{ \frac{2\pi\, \Gamma(n + \frac12)^2}{n^2\, \Gamma(n)^2} + \frac{2}{e^2} + 2^{1-n} \left( n! - \Gamma(n+1, 2) \right),\ \frac{19}{n} \right\}. \tag{7.6}$$
Then $\phi$ is unstable.

Note that since $\tilde\mu > 2n$, (7.5) is in particular fulfilled if $d_n \le 1 - \Omega/2$.
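The threshold $n \ge 10$ in Theorem 8 can be checked numerically. The sketch below evaluates $d_n$ for integer $n$, assuming the form $d_n = \min\{2\pi\Gamma(n+\frac12)^2/(n^2\Gamma(n)^2) + 2/e^2 + 2^{1-n}(n! - \Gamma(n+1,2)),\ 19/n\}$ for (7.6). The incomplete-gamma term is rewritten as the rapidly convergent series $2e^{-2} \sum_{j \ge 1} 2^j/((n+1)\cdots(n+j))$, which avoids the cancellation in $n! - \Gamma(n+1,2)$. The function name is illustrative.

```python
import math


def d_n(n: int) -> float:
    """d_n of (7.6) for integer n >= 1, under the form assumed in the lead-in."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # 2 * c_n^2 with c_n = sqrt(pi) * Gamma(n + 1/2) / (n * Gamma(n)).
    ratio = math.exp(math.lgamma(n + 0.5) - math.lgamma(n))
    kinetic = 2.0 * math.pi * ratio ** 2 / n ** 2
    # Contribution 2 / e^2 of the region r >= 1.
    outer = 2.0 / math.e ** 2
    # 2^{1-n} * (n! - Gamma(n+1, 2)) = 2 * e^{-2} * sum_{j>=1} 2^j / ((n+1)...(n+j)).
    term, series = 1.0, 0.0
    for j in range(1, 200):
        term *= 2.0 / (n + j)
        series += term
    inner = 2.0 * math.exp(-2.0) * series
    return min(kinetic + outer + inner, 19.0 / n)


if __name__ == "__main__":
    for n in (8, 9, 10, 11, 20, 50, 100):
        print(n, d_n(n))
```

With this form, the first $n$ at which $d_n$ drops below 1 can be read off from the printed values.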
Proof. Let $n \ge 1$, and let $w_1 \in H^1(\mathbb{R}^2)$ be radial and normalized. Let $T = \langle w_1 | -\Delta w_1 \rangle$ and $X = \langle w_1 | r^2 w_1 \rangle$, and define $w$ by $w(x) = c_n \tilde\mu^{1/2}\, w_1(c_n \tilde\mu^{1/2} r)$, with $c_n$ given in (2.15). Using (7.3) and (2.14), we can estimate
$$Q(w) \le n\Omega + \tilde\mu \left( c_n^2 T + \frac{1}{c_n^2 \tilde\mu^2} X - 1 \right) + 2(\tilde\mu - 2n) \int |w(x)|^2 \min\{1, (r^2 \tilde\mu c_n^2)^n\}\, d^2x. \tag{7.7}$$
With
$$N = \int_{r \ge 1} |w_1|^2 \quad \text{and} \quad M = \int_{r \le 1} |w_1|^2 r^{2n} \tag{7.8}$$
this gives, using $\tilde\mu \ge 2(n+1)$ in front of $X$,
$$Q(w) \le n\Omega + \tilde\mu \left( c_n^2 T - 1 + 2(N + M) \right) + \left[ \frac{1}{c_n^2\, 2(n+1)} X - 4n(N + M) \right]. \tag{7.9}$$
Now if we choose $w_1(r) = (2/\pi)^{1/2} \exp(-r^2)$ the last term in (7.9) is negative, and the computation of $T$, $N$ and $M$ yields
$$Q(w) < n\Omega - \tilde\mu (1 - d_n), \tag{7.10}$$
where $d_n$ is the first part in the parenthesis in (7.6). For large $n$, this can be improved by choosing $w_1(r) = (35/9\pi)^{1/2} [1 - r^{3/2}]_+$, which gives
$$Q(w) < n\Omega - \tilde\mu \left( 1 - \frac{19}{n} \right). \tag{7.11}$$

In [11] the authors used a similar method to the one of Theorem 8 and a particular assumption on the form of the vortex state $\phi$ to obtain a $d_n$ in (7.5) that is less than 1 if $n \ge 2$. We conjecture that in the case of a harmonic potential an $n$-vortex with $n \ge 2$ is unstable, for all values of $\Omega \ge 0$ and $a > 0$. However, we can prove this only for $\Omega \le 1$. Namely, if we insert $V(r) = r^2$ in (2.33), set $d = 2$, use the improved bound (7.3) and $\mu \ge 2 - n(\Omega - 2) + 2a \int |\phi|^4$, this shows the negativity of $Q(w)$ as long as $n \ge 2$ and
$$\Omega < 1 + \frac{1}{2n} \left( 1 + a \int |\phi|^4 \right) \tag{7.12}$$
is satisfied. This implies that all vortices with $n \ge 2$ and $a \ge 0$ are unstable as long as $\Omega \le 1$. As a consequence of the considerations above we can state an explicit condition on $a$ where an $n$-vortex is necessarily unstable, using the general lower bound
$$\int |\phi(x)|^4\, d^2x \ge \frac{4}{9\pi}\, \frac{\left( \int |\phi(x)|^2\, d^2x \right)^3}{\int |\phi(x)|^2 |x|^2\, d^2x}, \tag{7.13}$$
which can easily be proved using elementary calculus of variations. Moreover, since in two dimensions $\int |\phi|^4$ scales as $1/({\rm length})^2$, the virial theorem implies for the minimizer $f$ of $\mathcal{E}_n$,
$$\int f(r)^2 r^2\, d^2x = \frac12 E_n(a). \tag{7.14}$$
Hence we only need an upper bound on $E_n(a)$, which is given in Lemma 6, to obtain a condition on $a$ for validity of (7.12).

The critical frequencies $\Omega_n(a)$, defined in (3.1), have the following properties:

Lemma 7 (Properties of critical frequencies). For all $n \in \mathbb{N}_0$ we have $\Omega_n(0) = 2$ and for all $a \ge 0$ $\lim_{n \to \infty} \Omega_n(a) = 2$. Moreover,
$$\Omega_n'(0) = -\frac{1}{4^{n+1} \pi}\, \frac{(2n)!}{n!\,(n+1)!} < \Omega_{n+1}'(0) < 0. \tag{7.15}$$

Proof. The first assertion follows from $E_n(0) = 2(n+1)$. Using the harmonic oscillator eigenstates $\chi_n(r) = (\pi\, n!)^{-1/2}\, r^n \exp(-r^2/2)$ as trial functions the second assertion is proved by
$$\Omega_n \le 2 + a \int |\chi_{n+1}|^4 \quad \text{and} \quad \Omega_n \ge 2 - a \int |\chi_n|^4, \tag{7.16}$$
noting that $\int |\chi_n|^4 = O(n^{-1/2})$ as $n \to \infty$. To prove (7.15) we use the Feynman-Hellmann principle to calculate
$$\Omega_n'(a) = \int |f_{n+1}|^4 - \int |f_n|^4, \tag{7.17}$$
where $f_n$ is the minimizer of $\mathcal{E}_n$. For $a = 0$ we have $f_n = \chi_n$, yielding (7.15).
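The last step of this proof can be made explicit: for the oscillator states one finds $\int |\chi_n|^4\, d^2x = (2n)!/(2\pi\, 4^n (n!)^2) = 1/b_n$, so that (7.17) at $a = 0$ gives $\Omega_n'(0) = 1/b_{n+1} - 1/b_n$. The sketch below checks this numerically, assuming $\chi_n(r) = (\pi\, n!)^{-1/2} r^n e^{-r^2/2}$ and $b_n$ as in Lemma 6. The quadrature parameters and function names are illustrative choices.

```python
import math


def b_n(n: int) -> float:
    """b_n = 2*pi * 4^n * (n!)^2 / (2n)! from Lemma 6."""
    return 2.0 * math.pi * 4 ** n * math.factorial(n) ** 2 / math.factorial(2 * n)


def quartic_norm_numeric(n: int, r_max: float = 12.0, steps: int = 60000) -> float:
    """Midpoint rule for the integral of |chi_n|^4 over the plane."""
    h = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * h
        chi_sq = r ** (2 * n) * math.exp(-r * r) / (math.pi * math.factorial(n))
        total += chi_sq ** 2 * 2.0 * math.pi * r * h
    return total


def omega_prime_at_zero(n: int) -> float:
    """(7.17) at a = 0, using the closed form 1/b_n for the quartic norms."""
    return 1.0 / b_n(n + 1) - 1.0 / b_n(n)


def omega_prime_closed_form(n: int) -> float:
    """Right-hand side of (7.15)."""
    return -math.factorial(2 * n) / (4 ** (n + 1) * math.pi
                                     * math.factorial(n) * math.factorial(n + 1))


if __name__ == "__main__":
    for n in range(4):
        print(n, quartic_norm_numeric(n), 1.0 / b_n(n),
              omega_prime_at_zero(n), omega_prime_closed_form(n))
```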
In the special case of a harmonic potential, the results of Theorem 3 can be improved. We get the following bounds on the critical frequencies:

Theorem 9 (Bounds on $\Omega_n$, harmonic potential). For all $n \in \mathbb{N}_0$,
$$\Omega_n(a) \le (2n+1)\, \frac{2\pi e}{a} \left( 1 + \sqrt{\frac{2a}{\pi}} \right) \left( 3 + \left[ \ln \frac{a}{2\pi e^2} \right]_+ \right), \tag{7.18}$$
$$\Omega_n(a) \ge \frac{2n+1}{(n+2) \sqrt{1 + \dfrac{a}{b_{n+1}(n+2)}}}, \tag{7.19}$$
with $b_n$ given in Lemma 6.

Proof. We proceed as in Theorem 3, but now use the improved estimate (7.3) to replace (3.12) by
$$\frac{\partial E_n}{\partial n} \le (\tilde\mu - 2n)\, \frac{\pi e}{a} \max\left\{ 2n+1,\ n \ln \frac{c_n^2 a}{2\pi} \right\}. \tag{7.20}$$
Inserting (7.2) and using $(n+1)/b_n \le (2\pi)^{-1}$ for $0 \le n \le 1$ and $c_n^{2n} \le e$ this gives for $\Omega_0$
$$\Omega_0(a) \le \frac{2\pi e}{a} \left( 1 + \sqrt{\frac{2a}{\pi}} \right) \max\left\{ 3,\ 1 + \ln \frac{a}{2\pi} \right\}, \tag{7.21}$$
and using Lemma 3 we obtain (7.18). For the lower bound we proceed as in (3.14) and note that $\int f_{n+1}^2 r^2 = E_{n+1}(a)/2$ by the virial theorem. Equation (7.19) is obtained by inserting the bound (7.1) for $E_{n+1}(a)$.

Numerical investigations in [16] indicate that for all $n \ge 0$, $\Omega_n$ is strictly monotone decreasing in $a$, and, for $a > 0$, $\Omega_n < \Omega_{n+1}$, i.e., $E_n$ is convex in $n$. Note that the theorem above states that $\Omega_n$ behaves at most as $a^{-1/2} \ln a$ for large $a$, in accordance with previous considerations (see [11] and references therein).

We can now use the results above to derive explicit upper and lower bounds on $a_\Omega$. Denote
$$\Xi(a) = \frac{2\pi e}{a} \left( 1 + \sqrt{\frac{2a}{\pi}} \right) \left( 3 + \left[ \ln \frac{a}{2\pi e^2} \right]_+ \right), \tag{7.22}$$
which is a strictly monotone decreasing function of $a$. We know from Theorem 8 and (7.12) that all $n$-vortices are unstable for $n \ge N_\Omega$, where
$$N_\Omega = \begin{cases} 2 & \text{for } \Omega \le 1 \\[1ex] \dfrac{38}{2 - \Omega} & \text{for } \Omega > 1. \end{cases} \tag{7.23}$$
Using the bound on the critical frequencies (7.18), we see by analogous considerations as in the proof of Theorem 4 that symmetry breaking occurs if
$$\Xi(a) \le \frac{\Omega}{2N_\Omega - 1}. \tag{7.24}$$
From Proposition 2 we see that $E^{\rm DM}(a, \Omega) = E^{\rm GP}(a, \Omega)$ if $\Omega \le 1/\mu^{\rm DM}$. Moreover, it is easy to see that the same holds for $\Omega \le 2 - a/\pi$. Namely, using Cauchy-Schwarz and the fact that $\int (\rho^{\rm DM})^2$ is monotone decreasing in $a$ (because of concavity of $E^{\rm DM}$ in $a$) we have, for $H^{\rm DM} = H_0 + 2a\rho^{\rm DM}$,
$$\inf {\rm spec}\, H^{\rm DM}\big|_{L=0} \le 2 + 2a \int \rho^{\rm DM} \chi^2 < 2 + 2a \int \chi^4 = 2 + \frac{a}{\pi}, \tag{7.25}$$
where $\chi(x) = \sqrt{1/\pi}\, \exp(-r^2/2)$ is the ground state of $H_0$. Moreover,
$$H^{\rm DM}\big|_{|L| \ge 1} > 4 - \Omega, \tag{7.26}$$
showing that, for $a \le \pi(2 - \Omega)$, $H^{\rm DM}$ has a unique ground state with zero angular momentum which necessarily also minimizes the GP functional. Since the minimizer of the GP functional is unique and therefore an angular momentum eigenfunction as long as $E^{\rm DM} = E^{\rm GP}$, we obtain as a consequence a lower bound on $a_\Omega$, using $\mu^{\rm DM} \le 2E^{\rm DM}(a, \Omega) \le 2E_0(a)$ and the bound on $E_0(a)$ given in (7.1). Thus we have proved the following theorem:
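Theorem 10 (Bounds on $a_\Omega$). In the case of a harmonic potential, the critical parameter for symmetry breaking fulfills the bounds
$$a_\Omega \le \Xi^{-1}\!\left( \frac{\Omega}{2N_\Omega - 1} \right), \tag{7.27}$$
with $\Xi$ defined in (7.22) and $N_\Omega$ given in (7.23), and
$$a_\Omega \ge \pi \max\left\{ 2 - \Omega,\ \frac{1 - 16\Omega^2}{8\Omega^2} \right\}. \tag{7.28}$$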
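The upper bound (7.27) is implicit, since it requires inverting the function of (7.22). The following sketch evaluates that function and inverts it by bisection on a logarithmic scale, assuming the form $(2\pi e/a)(1 + \sqrt{2a/\pi})(3 + [\ln(a/2\pi e^2)]_+)$ and $N_\Omega$ as in (7.23). The function names, the bracketing interval and the sample values of $\Omega$ are illustrative assumptions; monotonicity of the function in $a$ is taken from the text.

```python
import math


def xi(a: float) -> float:
    """Function of (7.22), under the form assumed in the lead-in."""
    log_part = max(0.0, math.log(a / (2.0 * math.pi * math.e ** 2)))
    return (2.0 * math.pi * math.e / a) * (1.0 + math.sqrt(2.0 * a / math.pi)) * (3.0 + log_part)


def n_omega(omega: float) -> float:
    """N_Omega of (7.23): 2 for Omega <= 1, 38 / (2 - Omega) for 1 < Omega < 2."""
    return 2.0 if omega <= 1.0 else 38.0 / (2.0 - omega)


def target(omega: float) -> float:
    """Right-hand side of (7.24)."""
    return omega / (2.0 * n_omega(omega) - 1.0)


def upper_bound_a(omega: float, lo: float = 1e-6, hi: float = 1e30) -> float:
    """Solve xi(a) = target(omega) by bisection, using that xi is decreasing."""
    goal = target(omega)
    for _ in range(300):
        mid = math.sqrt(lo * hi)  # geometric midpoint, since a spans many decades
        if xi(mid) > goal:
            lo = mid
        else:
            hi = mid
    return hi


if __name__ == "__main__":
    for omega in (0.5, 1.0, 1.5):
        a_star = upper_bound_a(omega)
        print(omega, a_star, xi(a_star), target(omega))
```

The resulting values of $a$ are typically very large, reflecting the slow $a^{-1/2} \ln a$ decay of the bound on the critical frequencies.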
Acknowledgements. The author would like to thank Elliott Lieb and Jakob Yngvason for fruitful discussions and helpful comments. Financial support by the Austrian Science Fund in the form of an Erwin Schrödinger Fellowship is gratefully acknowledged.
References

1. Madison, K.W., Chevy, F., Wohlleben, W., Dalibard, J.: Vortex formation in a stirred Bose-Einstein condensate. Phys. Rev. Lett. 84, 806–809 (2000)
2. Abo-Shaeer, J.R., Raman, C., Vogels, J.M., Ketterle, W.: Observation of vortex lattices in Bose-Einstein condensates. Science 292, 476–479 (2001)
3. Fetter, A.L., Svidzinsky, A.A.: Vortices in a trapped dilute Bose-Einstein condensate. J. Phys.: Condens. Matter 13, R135–R194 (2001)
4. Lieb, E.H., Seiringer, R., Yngvason, J.: Bosons in a Trap: A rigorous derivation of the Gross-Pitaevskii energy functional. Phys. Rev. A 61, 043602-1–13 (2000)
5. Lieb, E.H., Seiringer, R., Yngvason, J.: A rigorous derivation of the Gross-Pitaevskii energy functional for a two-dimensional Bose gas. Commun. Math. Phys. 224, 17–31 (2001)
6. Lieb, E.H., Seiringer, R.: Proof of Bose-Einstein condensation for dilute trapped gases. Phys. Rev. Lett. 88, 170409-1–4 (2002)
7. Lieb, E.H., Loss, M.: Analysis, 2nd ed., Providence, RI: AMS, 2001
8. Hagan, P.S.: Spiral waves in reaction–diffusion equations. SIAM J. Appl. Math. 42, 762–786 (1982)
9. Mironescu, P.: On the stability of radial solutions of the Ginzburg-Landau equation. J. Funct. Anal. 130, 334–344 (1995)
10. Butts, D.A., Rokhsar, D.S.: Predicted signatures of rotating Bose-Einstein condensates. Nature 397, 327–329 (1999)
11. Castin, Y., Dum, R.: Bose-Einstein condensates with vortices in rotating traps. Eur. Phys. J. D 7, 399–412 (1999)
12. Bethuel, F., Brezis, H., Hélein, F.: Ginzburg-Landau Vortices, Progress in Nonlinear Differential Equations and Their Applications. Vol. 13, Basel-Boston: Birkhäuser, 1994
13. Serfaty, S.: On a model of rotating superfluids. ESAIM, Control Optim. Calc. Var. 6, 201–238 (2001)
14. Aftalion, A., Du, Q.: Vortices in a rotating Bose-Einstein condensate: Critical angular velocities and energy diagrams in the Thomas-Fermi regime. Phys. Rev. A 64, 063603-1–11 (2001)
15. Baumgartner, B., Seiringer, R.: Atoms with Bosonic “Electrons” in strong magnetic fields. Ann. Henri Poincaré 2, 41–76 (2001)
16. García-Ripoll, J.J., Pérez-García, V.M.: Stability of vortices in inhomogeneous Bose condensates subject to rotation: A three-dimensional analysis. Phys. Rev. A 60, 4864–4874 (1999)

Communicated by M. Aizenman
Commun. Math. Phys. 229, 511–542 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0689-0
A Hamiltonian Model for Linear Friction in a Homogeneous Medium

Laurent Bruneau, Stephan De Bièvre

UFR de Mathématiques et UMR AGAT, Université des Sciences et Technologies de Lille, 59655 Villeneuve d’Ascq Cedex, France. E-mail: {bruneau; debievre}@agat.univ-lille1.fr

Received: 18 July 2001 / Accepted: 20 April 2002
Published online: 12 August 2002 – © Springer-Verlag 2002
Abstract: We introduce and study rigorously a Hamiltonian model of a classical particle moving through a homogeneous dissipative medium at zero temperature in such a way that it experiences an effective linear friction force proportional to its velocity (at small speeds). The medium consists, at each point in space, of a vibration field modelling an obstacle with which the particle exchanges energy and momentum in such a way that total energy and momentum are conserved. We show that in the presence of a constant (not too large) external force, the particle reaches an asymptotic velocity proportional to this force. In a potential well, on the other hand, the particle comes exponentially fast to rest in the bottom of the well. The exponential rate is in both cases an explicit function of the model parameters and independent of the potential.

1. Introduction

Many simple microscopic or macroscopic systems obey (at zero temperature) an effective equation of motion of the type
\[ m\ddot q(t) + \gamma\,\dot q(t) = -\nabla V(q(t)), \qquad \gamma > 0. \tag{1.1} \]
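As a quick numerical sanity check of (1.1) in the case of a constant force (so that $-\nabla V = F$), the following minimal Python sketch integrates the one-dimensional equation $m\dot v = F - \gamma v$ and compares the result with the closed-form solution $v(t) = F/\gamma + (v_0 - F/\gamma)e^{-\gamma t/m}$. The function names, the RK4 integrator and the parameter values are illustrative assumptions; they are not part of the model studied in this paper.

```python
import math


def simulate_velocity(m, gamma, force, v0, t_end, steps):
    """Integrate m*dv/dt = force - gamma*v with classical RK4; return v(t_end)."""
    dt = t_end / steps
    v = v0

    def accel(u):
        # Right-hand side of (1.1) for a constant force, written for v = dq/dt.
        return (force - gamma * u) / m

    for _ in range(steps):
        k1 = accel(v)
        k2 = accel(v + 0.5 * dt * k1)
        k3 = accel(v + 0.5 * dt * k2)
        k4 = accel(v + dt * k3)
        v += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return v


def exact_velocity(m, gamma, force, v0, t):
    """Closed-form solution: exponential approach to force/gamma at rate gamma/m."""
    v_inf = force / gamma
    return v_inf + (v0 - v_inf) * math.exp(-gamma * t / m)


if __name__ == "__main__":
    # Arbitrary illustrative parameters (assumptions, not taken from the paper).
    m, gamma, force = 2.0, 0.5, 1.5
    v_num = simulate_velocity(m, gamma, force, v0=0.0, t_end=40.0, steps=4000)
    print(v_num, exact_velocity(m, gamma, force, 0.0, 40.0), force / gamma)
```

With these arbitrary values the velocity relaxes towards $F/\gamma = 3$ at the rate $\gamma/m = 0.25$, independently of the initial velocity.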
Examples include the motion of electrons in a metal, or of a small particle in a viscous medium, but the coordinate $q$ need not always be of a geometrical nature. The energy loss due to the linear friction force $-\gamma\dot q$ (occurring at a rate $-\gamma\dot q^2$) implied by this equation leads to several well-known phenomena. First, for confining potentials $V$, the particle will come to a stop exponentially fast (with rate $\frac{\gamma}{2m}$ if $\gamma$ is small enough) at one of the critical points of the potential. Note in particular that the decay rate $\frac{\gamma}{2m}$ does not depend on the potential $V$. If, on the other hand, $V(q) = -F\cdot q$, for some $F\in\mathbb{R}^d$, the particle will reach a limiting speed $v(F) = \frac{F}{\gamma}$ which is proportional to the applied field. This, in particular, is at the origin of Ohm’s law. Again, the approach is exponential, but this time with rate $\frac{\gamma}{m}$. In particular, if $F = 0$, the particle comes exponentially fast to
a full stop. The phenomenological friction force summarizes the reaction of the environment of the particle to its passage and the energy lost by the particle is transferred to the medium surrounding the particle by various processes (such as inelastic collisions, for example). A more fundamental, microscopic treatment of these phenomena requires therefore considering the combined system consisting of the particle and the medium. This combined system should allow for a Hamiltonian treatment in which the total energy is conserved.

Our goal in this paper is to present and study a Hamiltonian model of a system composed of a particle and a homogeneous medium. We show rigorously that the particle has the behaviour described above and analyse the physical mechanisms at the origin of the observed phenomena. We stay within the context of classical mechanics and at zero temperature, hoping to come back to other points in the $\hbar$–$T$ plane at a later date. In particular, at positive temperature, a fluctuating force term is to be added to (1.1), transforming the equation to the Langevin equation. Such a term is indeed produced by our model, but is much harder to analyse at positive temperatures.

The model we consider consists of one classical particle that is, on the one hand, coupled to “obstacles” represented by scalar vibration fields and on the other subjected to a time-independent external force $F = -\nabla V$. We are mostly interested in the more difficult case where $F$ is constant (so $V = -F\cdot q$), but we will also deal with confining potentials. More precisely, the equations of motion for the coupled system are:
\[ \partial_t^2\psi(x,y,t) - c^2\Delta_y\psi(x,y,t) = -\rho_1(x-q(t))\,\sigma_2(y), \tag{1.2} \]
\[ \ddot q(t) = -\nabla V(q(t)) - \int_{\mathbb{R}^d}dx\int_{\mathbb{R}^n}dy\;\rho_1(x-q(t))\,\sigma_2(y)\,(\nabla_x\psi)(x,y,t). \tag{1.3} \]
Here $\psi$ is the vibration field and $q\in\mathbb{R}^d$ the position of the particle. The “form factor” $\rho_1(x)\sigma_2(y)$ determines the coupling of the particle to the vibration field $\psi$. We shall assume:

(H1) $\rho_1(x)\sigma_2(y)\in C_0^\infty(\mathbb{R}^{d+n})$, $\rho_1\sigma_2\neq 0$, where $\rho_1,\sigma_2\geq 0$ are radial functions with $\rho_1(x)=0$ if $|x|\geq R_1>0$ and $\sigma_2(y)=0$ if $|y|\geq R_2>0$.

To obtain our main results (Sects. 4 and 5), we will need to take the propagation speed $c$ large enough, for reasons that will be explained then. In the first part of the paper, on the other hand, it is convenient to absorb $c$ through the scaling
\[ \phi(x,y,t) = c^{\frac n2}\,\psi(x,cy,t) \quad\text{and}\quad \rho_2(y) = c^{\frac n2}\,\sigma_2(cy), \tag{1.4} \]
so that (1.2)–(1.3) is transformed into
\[ \partial_t^2\phi(x,y,t) - \Delta_y\phi(x,y,t) = -\rho_1(x-q(t))\,\rho_2(y), \tag{1.5} \]
\[ \ddot q(t) = -\nabla V(q(t)) - \int_{\mathbb{R}^d}dx\int_{\mathbb{R}^n}dy\;\rho_1(x-q(t))\,\rho_2(y)\,(\nabla_x\phi)(x,y,t). \tag{1.6} \]
Note that the field $\phi(x,y,t)\equiv\phi_t(x,y)$ plays the role of a potential for the particle. Indeed, the second term in (1.6) is $F_{\phi_t}(q(t))$, where
\[ F_\phi(q) = -\int_{\mathbb{R}^d}dx\int_{\mathbb{R}^n}dy\;\rho_1(x-q)\,\rho_2(y)\,(\nabla_x\phi)(x,y) \tag{1.7} \]
is the force exerted on the particle by the environment when the latter is in the state $\phi$; this becomes $F_\phi(q) = -\nabla\phi(q,0)$ if we consider a point interaction so that $\rho_1(x)\rho_2(y) = \delta(x)\delta(y)$.

To understand the model intuitively the following observations are helpful. First of all, the particle moves in $x$-space (or, more precisely, in the $y=0$ subspace of $\mathbb{R}^{d+n}$). In fact, one can think of $\phi(x,\cdot)$ as representing, for each fixed value of $x$ in the configuration space of the particle, an “obstacle”, which has a large number of degrees of freedom, and is therefore modeled for simplicity by a vibration field $\phi(x,\cdot)$. The variables $y$ should in other words not be interpreted geometrically and are in particular not spatial variables for the particle. To understand this, it is helpful to Fourier transform (1.5)–(1.6) in the $y$ variable to obtain
\[ \partial_t^2\hat\phi(x,k,t) + |k|^2\hat\phi(x,k,t) = -\rho_1(x-q(t))\,\hat\rho_2(k), \tag{1.8} \]
\[ \ddot q(t) = -\nabla V(q(t)) + \int_{\mathbb{R}^n}dk\;\tilde F_{\phi_t}(q(t),k), \tag{1.9} \]
where
\[ \tilde F_\phi(q,k) = -\int_{\mathbb{R}^d}dx\;\rho_1(x-q)\,\hat\rho_2(k)\,\nabla_x\hat\phi(x,k). \tag{1.10} \]
Clearly, $\hat\phi(x,k,t)$ is, for each value of $x$ and $k$, the amplitude of a driven oscillator of frequency $\omega(k)=|k|$. All of these oscillators are decoupled and each of them contributes separately a force $-\rho_1(x-q(t))\,\hat\rho_2(k)\,\nabla_x\hat\phi(x,k,t)$ acting on the particle.

One way to get an intuitive understanding of why this model should exhibit dissipative behaviour is to imagine for a moment that the particle is constrained to move in one dimension ($x\in\mathbb{R}$), and that $y\in\mathbb{R}^2$, so that one can picture $\phi(x,y)$ as describing the vibrations of an elastic membrane positioned at $x$, perpendicular to the axis on which the particle moves. As the particle hits the successive membranes, it creates a wake, much like a boat ploughing the surface of a lake (Fig. 1). Taking for the moment $V(q)=0$ in (1.6) (so that there is no external field: $F=0$) one can imagine launching the particle with an initial speed $v_0$, with all membranes initially at rest. In that case intuition predicts that the particle should lose all its kinetic energy into the membranes and come to a full stop. We shall prove that this intuition is correct and that the particle stops exponentially fast for arbitrary values of $d$, but with $n=3$ (Theorem 3) and for $c$ large enough.
Fig. 1. Waves created by the passage of the particle through the successive membranes
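Since each mode in (1.8) is a driven harmonic oscillator, it may help to see numerically how the energy absorbed by a single oscillator depends on the duration of the driving pulse. The sketch below is only an illustration under stated assumptions: the Gaussian pulse $\sigma(t)=e^{-t^2/(2T^2)}$, the function names and the RK4 integrator are choices made here and do not come from the model. For an oscillator $\ddot u+\omega^2u=\sigma(t)$ starting at rest, the final energy is $\frac12\big|\int\sigma(t)e^{i\omega t}dt\big|^2$, which for this pulse equals $\pi T^2e^{-\omega^2T^2}$.

```python
import math


def pulse(t, duration):
    """Assumed Gaussian driving term sigma(t) = exp(-t^2 / (2 T^2))."""
    return math.exp(-0.5 * (t / duration) ** 2)


def energy_after_pulse(omega, duration, dt=2e-3):
    """Integrate u'' + omega^2 u = sigma(t) from rest with RK4; return final energy."""
    t_max = 8.0 * duration              # pulse is negligible outside [-t_max, t_max]
    steps = int(round(2.0 * t_max / dt))
    h = 2.0 * t_max / steps
    t, u, p = -t_max, 0.0, 0.0          # p denotes du/dt

    def rhs(time, pos, mom):
        return mom, -omega ** 2 * pos + pulse(time, duration)

    for _ in range(steps):
        k1u, k1p = rhs(t, u, p)
        k2u, k2p = rhs(t + 0.5 * h, u + 0.5 * h * k1u, p + 0.5 * h * k1p)
        k3u, k3p = rhs(t + 0.5 * h, u + 0.5 * h * k2u, p + 0.5 * h * k2p)
        k4u, k4p = rhs(t + h, u + h * k3u, p + h * k3p)
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        t += h
    return 0.5 * (p * p + omega ** 2 * u * u)


def closed_form_energy(omega, duration):
    """Final energy pi * T^2 * exp(-omega^2 T^2) for the Gaussian pulse above."""
    return math.pi * duration ** 2 * math.exp(-(omega * duration) ** 2)


if __name__ == "__main__":
    for duration in (1.0, 3.0):
        print(duration, energy_after_pulse(1.0, duration), closed_form_energy(1.0, duration))
```

In this toy setting a longer pulse (the analogue of a slower particle) transfers far less energy to an oscillator of fixed frequency, because of the factor $e^{-\omega^2T^2}$.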
The physical origin of these restrictions to the case where $n$ equals 3 and $c$ is large is explained in Sect. 2 and at the end of Sect. 4.

Another situation of interest is the case where $V$ is confining. Then techniques similar to the ones used in [KKS1] allow one to show that the particle comes to rest at one of the equilibrium positions of the potential $V$. We furthermore show this approach is exponential with the expected rate (Theorem 4) provided the particle comes to rest at a non-degenerate minimum of the potential.

Our main interest is in the case where $V(q) = -F\cdot q$. In that case, we show that for a suitable class of initial conditions (and for $c$ large enough) the particle approaches asymptotically a constant speed $v(F)$ (Theorem 2) which is linear in $F$ for small $F$, as in an ohmic medium.

Various Hamiltonian models for dissipation in general and for linear friction in particular have previously been proposed in the physics literature, mostly with the purpose of deriving the classical or quantum Langevin equation (see [CEFM] and [FLO] for further references). As in the model we propose here (see (1.8)–(1.9)), they all involve the coupling of a particle to a family of independent oscillators representing the degrees of freedom of the environment. Our model has the particular feature of describing a homogeneous (i.e. translationally invariant) medium to which the particle is coupled in a translationally invariant manner (see (3.3)). The coupling is therefore non-linear in the particle position (no dipole approximation), while it is linear in the field variables. It is the only Hamiltonian model we are aware of that describes linear friction at low speeds in the presence of each of the three most commonly studied potentials: $V = 0$, $V = -F\cdot q$ and $V$ confining.

In more realistic models, one ought to couple the oscillators at different points in space. This is easily done in the context of our models by changing the potential energy of the field into $\int dx\,dy\,\big(c_1^2|\nabla_x\psi(x,y)|^2 + c_2^2|\nabla_y\psi(x,y)|^2\big)$. It turns out, however, that in that case the force exerted by the medium on a particle moving at constant speed $v$ vanishes identically for all $|v|\leq c_1$. In such models, the friction force is therefore proportional to higher derivatives of $q$. In particular, this is the case when $c = c_1 = c_2$, as in the model for radiation damping studied in [KKS1, KKS2, KS]. This leads to some very different behaviour. For example, in that case, there exist for all $|v| < c$ constant speed solutions for the particle in the absence of an external potential $V$. In a confining potential, the particle still converges exponentially fast to a minimum of the potential, but this time the exponential rate also depends on the shape of the potential.

The rest of the paper is organized as follows. In Sect. 2 we study in detail the friction force exerted by the medium on the particle. This allows us to discuss in some detail the intuition behind the model. The rather routine but essential question of existence and uniqueness of the solutions of (1.5)–(1.6) is settled in Sect. 3. In Sect. 4 we study the long time asymptotics of the particle behaviour for the case when $V(q) = -F\cdot q$, whereas Sect. 5 is devoted to the confined case.

2. The Friction Force

Crucial for understanding the model and for the proofs of our results is a detailed study of the reaction force of the medium defined in (1.7). Imagine we apply a constant external
force $F$ to the particle. We then look for solutions of the equations of motion (1.5)–(1.6) where the particle executes a uniform rectilinear motion $q(t) = q_0 + vt$ and the field is comoving, i.e. $\phi_v(x,y,t) = \Phi_v(x-(q_0+vt),y)$. Inserting this ansatz into (1.8), one easily finds the solution:
\[ \hat\Phi_v(x,k) = -\int_0^{+\infty}ds\;\rho_1(x+vs)\,\hat\rho_2(k)\,\frac{\sin(|k|s)}{|k|}. \tag{2.1} \]
This is the so-called retarded solution, describing the waves created in the “membranes” by the passage of the particle. Note that it has zero initial conditions at $t=-\infty$ in the sense that, for all $(x,y)\in\mathbb{R}^{d+n}$, there exists $T$ (depending only on $x$) so that $\phi_v(x,y,t)=0$ for all $t\leq T$ (Fig. 1). It is easy to see this is the unique comoving solution. This wave $\phi_v(x,y,t)$ induces a force on the particle that is easily computed from (2.1) and (1.7) using a change of variables in the integration ($x\to x+vt+q_0$):
\[
\begin{aligned}
F_{\phi_{v,t}}(q_0+vt) &= -\int_{\mathbb{R}^d}dx\int_{\mathbb{R}^n}dy\;\rho_1(x-(q_0+vt))\,\rho_2(y)\,(\nabla_x\phi_v)(x,y,t)\\
&= -\int_{\mathbb{R}^d}dx\int_{\mathbb{R}^n}dk\int_0^{+\infty}ds\;\nabla\rho_1(x)\,\rho_1(x+vs)\,|\hat\rho_2(k)|^2\,\frac{\sin(|k|s)}{|k|}\\
&=: f(v),
\end{aligned}
\tag{2.2}
\]
which is clearly independent of $q_0$ and $t$. As a result, $\phi_v(x,y,t)$ and $q(t)=vt+q_0$ will satisfy the coupled system (1.5)–(1.6) with $-\nabla V = F$ provided $v$ satisfies the equation $f(v) = -F$. In conclusion, a comoving solution to (1.5)–(1.6) at velocity $v$ exists provided the equation $f(v)=-F$ has at least one solution. We will see below that for $F$ sufficiently small two such solutions exist, one at “low” and one at “high” velocity. Our main result will say that, given “any” sufficiently small initial condition and any not too large force $F$, the particle trajectory asymptotically converges to the corresponding constant velocity trajectory (Theorem 2).

It is important for the proof of our results to understand the behaviour of the function $f(v)$ rather well, a task we now turn to. Note that $f$ is a functional of $\rho_1$ and $\rho_2$. In this and the following section the latter are kept fixed, so we do not explicitly indicate this dependence. In Sects. 4 and 5, we will reintroduce $c$ explicitly via (1.4) keeping $\rho_1$ and $\sigma_2$ fixed: $f$ will then be a function of $v$ and $c$. It is clear that $f\in C^\infty(\mathbb{R}^d)$. Furthermore, it is easy to see that
\[ f(v) = -f_r(|v|)\,\frac{v}{|v|}, \qquad f_r(|v|) > 0, \tag{2.3} \]
so that the reaction force of the medium on the particle is directed opposite to the particle velocity as expected for a friction force. To prove this, first note that the rotational invariance of $\rho_1$ implies that
\[ \forall R\in O(d),\qquad R[f(v)] = f(Rv). \]
Now, if $v = |v|e_1$, one finds, after a few changes of variables ($\lambda = |v|s$ and $\tilde k = \frac{k}{|v|}$):
\[ f(v) = -|v|^{n-2}\int_{\mathbb{R}^d}dx\int_{\mathbb{R}^n}d\tilde k\int_0^{+\infty}d\lambda\;\nabla\rho_1(x)\,\rho_1(x+\lambda e_1)\,|\hat\rho_2(|v|\tilde k)|^2\,\frac{\sin(\lambda|\tilde k|)}{|\tilde k|}. \]
The rotational invariance of $\rho_1$ now implies that $f_i(|v|e_1) = 0$ for $i = 2,\dots,d$, so that $f(v)$ has the direction of $e_1$ and so in the general case ($v\neq 0$) one has indeed
\[ f(v) = -f_r(|v|)\,\frac{v}{|v|}. \]
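We need to study the asymptotic behaviour of $f(v)$ as $|v|$ goes to $0$ and as $|v|$ goes to $+\infty$. For that purpose, we write (see (2.2))
\[ f(v) = \int_{\mathbb{R}^n}dk\;f(v,k), \]
with (after some manipulations)
\[ f(v,k) = -f_r(|v|,|k|)\,\frac{v}{|v|}, \tag{2.4} \]
\[ f_r(|v|,|k|) = \frac{1}{|v|^2}\,|\hat\rho_2(k)|^2\,h\!\left(\frac{k}{|v|}\right), \tag{2.5} \]
\[ h(\xi) = \int_{\mathbb{R}^d}dx\int_0^{+\infty}d\lambda\;\partial_1\rho_1(x)\,\rho_1(x_1+\lambda,x_\perp)\,\frac{\sin(\lambda|\xi|)}{|\xi|} \tag{2.6} \]
\[ \phantom{h(\xi)} = \pi\int_{\mathbb{R}^{d-1}}d\eta\;|\hat\rho_1(|\xi|,\eta)|^2, \tag{2.7} \]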
where $\hat\rho_1$ is the Fourier transform of $\rho_1$. Here $f(v,k)$ is the force produced by the “oscillators” $\hat\phi(x,k)$ of frequency $\omega = |k|$. It follows immediately from the above that $f_r(|v|) > 0$ and that, given $\ell\in\mathbb{N}$, there exists a constant $C_\ell > 0$ so that
\[ |f(v,k)| \leq C_\ell\,\frac{1}{|k|^2}\left(\frac{|v|}{|k|}\right)^{\ell}. \]
In other words, for fixed $k$, $f(v,k)$ vanishes to all orders in $|v|$ as $v\to 0$. So, as $v\to 0$, the force on the particle due to one of the oscillators of frequency $\omega(k)=|k|$ present at $x$ decreases faster than any power of $|v|$ for small $v$ (i.e. when $|v|\ll|k|R_1$). Roughly speaking, the coupling of the particle to such an oscillator is extremely weak when $|v|$ is much smaller than $|k|R_1$. This corresponds to a well-known piece of physical intuition: if the particle has speed $v$, it interacts during a time of order $\frac{R_1}{|v|}$ with any given oscillator. For the energy transfer between the particle and the oscillator to be efficient, this interaction time has to be comparable to the period of the oscillator, as an explicit computation easily confirms. Indeed, the total energy transfer $\Delta E$ (from $t=-\infty$ to $t=+\infty$) to a driven oscillator of frequency $\omega$, $\ddot u(t)+\omega^2u(t)=\sigma(t)$, is easily computed to be $\Delta E = \pi|\hat\sigma(\omega)|^2$. Applying this to (1.8) with $q(t)=vt$, one finds $\Delta E = \frac{\pi}{|v|^2}\,|\hat\rho_2(k)|^2\,|\hat\rho_1(|k|/|v|,0)|^2$, which vanishes again to all orders in $|v|$ as $|v|\to 0$.

In particular, it is clear from this observation that when coupling the particle to a family of oscillators, all of the same fixed frequency (as in a pinball machine where each circular obstacle would be mounted on a spring), no ohmic behaviour can be expected since the friction force is not linear in $v$ at small $v$ in that case. As the particle slows down, it couples less and less effectively to such oscillators, leading to a friction force vanishing to all orders in $|v|$. To remedy this effect, one has to couple the particle to a family of sufficiently many oscillators of arbitrarily low frequency. As the particle slows down, it will then transfer energy to those oscillators with which it is in resonance. In the model above, the number of low-frequency oscillators present at the point $x$ depends on the dimension $n$ of the $y$ variables through the volume element $dk = |k|^{n-1}\,d|k|\,d\Omega$. Because of the factor $|k|^{n-1}$, the higher the dimension $n$, the fewer such oscillators are present. This reflects itself immediately in the low $v$ behaviour of the force $f(v)$:
\[
\begin{aligned}
f_r(|v|) &= |v|^{n-2}\int_{\mathbb{R}^n}|\hat\rho_2(|v|\xi)|^2\,h(\xi)\,d\xi\\
&= |v|^{n-2}\,|\hat\rho_2(0)|^2\int_{\mathbb{R}^n}h(\xi)\,d\xi + o(|v|^{n-2}).
\end{aligned}
\tag{2.8}
\]
One notices indeed that for small $|v|$, $f_r$ is smaller if $n$ is higher. So, only when $n=3$ is a friction force proportional to the velocity (and hence ohmic behaviour) obtained. More precisely, for $n=3$,
\[ f(v) = -\left(|\hat\rho_2(0)|^2\int_{\mathbb{R}^n}h(\xi)\,d\xi\right)v + o(v) = -\gamma v + o(v), \]
where we defined
\[ \gamma = |\hat\rho_2(0)|^2\int_{\mathbb{R}^n}h(\xi)\,d\xi. \tag{2.9} \]
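The scaling in (2.8)–(2.9) can be illustrated numerically. The following sketch evaluates $f_r(|v|)=|v|^{n-2}\int_{\mathbb{R}^n}|\hat\rho_2(|v|\xi)|^2h(\xi)\,d\xi$ by radial quadrature under purely illustrative assumptions: Gaussian profiles $|\hat\rho_2(k)|^2=e^{-|k|^2}$ and $h(\xi)=\pi e^{-|\xi|^2}$ (which do not satisfy the compact support hypothesis (H1) and whose normalization is chosen here for convenience), together with hypothetical function names. For these profiles the integral is elementary, $f_r(|v|)=\pi\,|v|^{n-2}\big(\pi/(1+|v|^2)\big)^{n/2}$, which gives an independent check.

```python
import math


def sphere_area(n):
    """Surface area of the unit sphere S^{n-1} in R^n."""
    return 2.0 * math.pi ** (n / 2.0) / math.gamma(n / 2.0)


def rho2_hat_sq(k):
    """Assumed Gaussian form factor: |rho2_hat(k)|^2 = exp(-k^2)."""
    return math.exp(-k * k)


def h(xi):
    """Assumed Gaussian profile for h in (2.6)-(2.7): h(xi) = pi * exp(-xi^2)."""
    return math.pi * math.exp(-xi * xi)


def simpson(func, a, b, intervals):
    """Composite Simpson rule on [a, b] with an even number of intervals."""
    if intervals % 2:
        intervals += 1
    step = (b - a) / intervals
    total = func(a) + func(b)
    for i in range(1, intervals):
        total += (4.0 if i % 2 else 2.0) * func(a + i * step)
    return total * step / 3.0


def friction(v, n=3, r_max=8.0, intervals=4000):
    """Radial evaluation of the first line of (2.8) for the assumed profiles."""
    def integrand(r):
        return r ** (n - 1) * rho2_hat_sq(v * r) * h(r)
    return v ** (n - 2) * sphere_area(n) * simpson(integrand, 0.0, r_max, intervals)


def friction_closed_form(v, n=3):
    """Elementary Gaussian integral for the same assumed profiles."""
    return math.pi * v ** (n - 2) * (math.pi / (1.0 + v * v)) ** (n / 2.0)


if __name__ == "__main__":
    gamma = math.pi ** 2.5  # (2.9) for the assumed profiles with n = 3
    for v in (1e-3, 0.5, 1.0, 5.0):
        print(v, friction(v), friction_closed_form(v), gamma * v)
```

For $n=3$ the toy profile is linear in $|v|$ at small speeds, with slope $\gamma=\pi^{5/2}$ in these units, and decays at large speeds.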
This shows how motion through the medium modeled here produces a friction term of the type occurring in (1.1) provided $n=3$. Note that the friction coefficient $\gamma$ is given explicitly in terms of $\rho_1$ and $\rho_2$ and is different from $0$ under hypothesis (H1). Since, in this paper, we are interested in studying linear friction at low $v$, we will restrict ourselves to $n=3$ in the main theorems (Sects. 4 and 5).

We now turn to the behaviour of $f_r(|v|)$ for large values of $|v|$. It is easy to see from (2.8) that $\lim_{|v|\to+\infty}f_r(|v|)=0$. In other words, at high speeds as well, the friction force exerted by the medium on the particle is small. As one can see in Eqs. (1.8) and (2.8), this is mostly due to the fact that for high $\omega(k)=|k|$, the oscillators are only very weakly coupled to the particle due to the presence of the smooth form factor $\hat\rho_2$. In particular, in the presence of an external driving force $F$, the model can therefore only be expected to display dissipative behaviour when $v$ is not too large. The profile for $f_r(|v|)$ when $\rho$ is a Gaussian is given in Fig. 2.

Fig. 2. Profile of $f_r(|v|)$

3. Existence of Solutions

The assumptions on the potential are

(H2) $V\in C^1(\mathbb{R}^d)$ and $\nabla V$ is Lipschitz. Moreover, one of the two following assumptions holds: either $\nabla V$ is bounded (such as when $V(q)=-F\cdot q$) or $V$ is bounded from below.

We are now ready to introduce the phase space $E$ of the model. Let $\|\cdot\|_2$ denote the usual norm on $L^2(\mathbb{R}^{d+n},dx\,dy)$. On $C_0^\infty(\mathbb{R}^d\times\mathbb{R}^n)$, $\|\phi\| = \|\nabla_y\phi\|_2$ defines a norm. Let $\mathcal{E}$ be the completion of $C_0^\infty(\mathbb{R}^d\times\mathbb{R}^n)$ with this norm. Actually, as a consequence of the Sobolev imbedding theorems ([B], Chapter 9), $\mathcal{E}$ is the space $L^2(\mathbb{R}^d,D,dx)$ where
\[ D = \{\phi\in L^{\frac{2n}{n-2}}(\mathbb{R}^n,dy)\;|\;\nabla_y\phi\in L^2(\mathbb{R}^n,dy)\}. \]
We then define $E = \mathcal{E}\times\mathbb{R}^d\times L^2(\mathbb{R}^{d+n})\times\mathbb{R}^d$ with the norm:
\[ |Y|_E = \big(\|\phi\|^2 + |q|^2 + \|\pi\|_2^2 + |p|^2\big)^{\frac12} \qquad\text{for } Y = (\phi,q,\pi,p). \]
With this norm, $E$ is a Hilbert space. We now write the problem (1.5)–(1.6) in a more convenient way, so as to prove the existence and uniqueness of a solution:
\[ \dot Y(t) = G(Y(t)), \qquad Y(0) = Y_0\in E, \tag{3.1} \]
where
\[ G : (\phi,q,\pi,p)\mapsto\Big(\pi,\;p,\;\Delta_y\phi-\rho_1(x-q)\rho_2(y),\;-\nabla V(q)+\int_{\mathbb{R}^{d+n}}dx\,dy\;\nabla\rho_1(x-q)\,\rho_2(y)\,\phi(x,y)\Big). \]
By solution, we mean that:
\[ Y(t) = Y_0 + \int_0^t G(Y(s))\,ds \tag{3.2} \]
in the sense of distributions.
Theorem 1. Let $n\geq 3$. Under the assumptions (H1) and (H2), we have:
1. For each $Y_0$ in $E$, the differential equation (3.1) has a unique solution $Y(t)$ in $C^0(\mathbb{R},E)$.
2. For every $t\in\mathbb{R}$, the map $W^t : Y_0\mapsto Y(t)$ is continuous on $E$.
3. For every $t\in\mathbb{R}$, $H(Y(t)) = H(Y_0)$, where
\[ H(Y) = \frac{p^2}{2} + V(q) + \frac12\int_{\mathbb{R}^{d+n}}dx\,dy\;\big(|\nabla_y\phi(x,y)|^2+|\pi(x,y)|^2\big) + \int_{\mathbb{R}^{d+n}}dx\,dy\;\rho_1(x-q)\,\rho_2(y)\,\phi(x,y) \tag{3.3} \]
is a continuous function on $E$.

Remark 1. Using the first part of the theorem and (1.6), one sees that $q(t)\in C^2(\mathbb{R},\mathbb{R}^d)$.

For later reference, we define
\[ H_0(Y) = \frac{p^2}{2} + \frac12\int_{\mathbb{R}^{d+n}}dx\,dy\;\big(|\nabla_y\phi(x,y)|^2+|\pi(x,y)|^2\big). \tag{3.4} \]
Note that the densely defined bilinear anti-symmetric form
\[ \omega(Y_1,Y_2) = q_1p_2 - p_1q_2 + \int_{\mathbb{R}^{d+n}}dx\,dy\;(\phi_1\pi_2-\pi_1\phi_2) \]
makes $E$ a symplectic vector space. The equations of motion (1.5)–(1.6) are of course the Hamiltonian equations for the Hamiltonian $H$ in (3.3), which is the total energy of the system. Note that the latter is not bounded from below when $V$ is not, such as when $V=-F\cdot q$. This makes for a slight complication in the existence proof, which is otherwise standard and largely follows [KKS1].

Proof. We start by showing there is a local solution. Then we will use conservation of energy to show the solution is global. We first look at the problem
\[ \dot Y(t) = G_0(Y(t)), \qquad Y(0) = Y_0, \tag{3.5} \]
with
\[ G_0Y = (\pi;\,0;\,\Delta_y\phi;\,0). \tag{3.6} \]
This problem is just the free wave equation in $\mathbb{R}^n$ with a parameter $x\in\mathbb{R}^d$. It admits a unique solution: $t\in\mathbb{R}\mapsto Y(t)\in E$. Moreover, let $W_0^t : Y_0\mapsto Y(t)$ denote the corresponding continuous group; then $W_0^t$ turns out to be a linear isometry with the norm $|\cdot|_E$. It is also continuous on $\mathbb{R}\times E$ ([LM], Chapter 3). Now define $Z(t) = W_0^{-t}Y(t)$ or $Y(t) = W_0^tZ(t)$. In particular, $Z(0)=Y(0)=Y_0$. We have $\dot Y(t) = G_0Y(t) + W_0^t\dot Z(t)$. $Y(t)$ is a solution of the problem (3.1) if and only if $Z(t)$ satisfies
\[ \dot Z(t) = W_0^{-t}G_1(W_0^tZ(t)), \qquad Z(0) = Y_0, \tag{3.7} \]
where
\[ G_1 : (\phi;q;\pi;p)\in E\mapsto\Big(0;\;p;\;-\rho_1(x-q)\rho_2(y);\;-\nabla V(q)+\int_{\mathbb{R}^{d+n}}dx\,dy\;\nabla\rho_1(x-q)\,\rho_2(y)\,\phi(x,y)\Big)\in E. \tag{3.8} \]
Introducing
\[ \tilde G : (t,Z)\in\mathbb{R}\times E\mapsto W_0^{-t}G_1(W_0^tZ)\in E, \]
it is clear that $\tilde G$ is continuous on $\mathbb{R}\times E$ and Lipschitz on $E$ because $W_0^t$ is an isometry and $G_1$ is Lipschitz. This problem satisfies all the conditions of the Cauchy-Lipschitz theorem ([H], Theorem 3.1), so it has a unique solution which is defined on an open interval. More precisely, there exists an open interval $J$ such that $0\in J$ and there exists a unique function $Z : t\in J\mapsto Z(t)\in E$ satisfying (3.7). Moreover, $\tilde W^t : Z_0\mapsto Z(t)$ is continuous on $E$ for every $t\in J$ and so we have the same results for $W^t : Y_0\in E\mapsto Y(t) = W_0^t\tilde W^tY_0\in E$.

In order to prove global existence, we now prove conservation of energy. We first prove the result for smooth initial data (i.e. $\phi_0,\pi_0\in C^\infty(\mathbb{R}^{d+n})$). Let $Y_0 = (\phi_0,q_0,\pi_0,p_0)$ with $\phi_0,\pi_0\in C_0^\infty(\mathbb{R}^{d+n})$. Then $W_0^tY_0$ is smooth ([CH], Chapter 6) and by the integral representation:
\[ Y(t) = W_0^tY_0 + \int_0^t ds\;W_0^{t-s}G_1(Y(s)), \]
it is clear that $\phi(t),\pi(t)$ are smooth as well (in $x$ and $y$). Note that $\phi(x,y,t)$ and $\pi(x,y,t)$ are also smooth in $t$ ([LM], Chap. 3). For such initial data a simple computation then yields:
\[ \frac{d}{dt}H(Y(t)) = 0, \]
so that, for smooth initial data, $H(Y(t))$ is a constant for all $t$ in $J$. We now prove that $H$ is continuous on $E$. The continuity of $W^t$ on $E$ and the fact that smooth initial data are dense in $E$ will then imply the result for all initial data. Since $V$ is continuous, it only remains to show the interaction term in $H$ is continuous. Its continuity in $\phi$ is immediate from the following computation:
\[
\begin{aligned}
&\left|\int_{\mathbb{R}^{d+n}}dx\,dy\;\rho_1(x-q)\rho_2(y)\phi(x,y) - \int_{\mathbb{R}^{d+n}}dx\,dy\;\rho_1(x-q)\rho_2(y)\psi(x,y)\right|\\
&\qquad= \left|\int_{\mathbb{R}^{d+n}}dx\,dk\;\frac{\rho_1(x-q)\bar{\hat\rho}_2(k)}{|k|}\,\big(|k|\hat\phi(x,k)-|k|\hat\psi(x,k)\big)\right|\\
&\qquad\leq \left\|\frac{\rho_1(x-q)\bar{\hat\rho}_2(k)}{|k|}\right\|_{L^2}\times\big\||k|(\hat\phi-\hat\psi)\big\|_{L^2}\\
&\qquad\leq \left\|\frac{\rho_1(x-q)\bar{\hat\rho}_2(k)}{|k|}\right\|_{L^2}\times\|\phi-\psi\|.
\end{aligned}
\]
Because $\rho$ has compact support and $n\geq 3$ the first factor of the right-hand side is finite and so $H$ is continuous (the continuity in $(q,\phi)$ follows similarly).
We will furthermore need the following obvious inequality (based on $|ab|\leq\frac{\epsilon}{2}a^2+\frac{1}{2\epsilon}b^2$):
\[ \left|\int_{\mathbb{R}^{d+n}}dx\,dy\;\rho_1(x-q)\rho_2(y)\phi(x,y)\right| \leq \frac{\|\phi\|^2}{4} - \langle\rho_1\rho_2;\rho_1\Delta_y^{-1}\rho_2\rangle. \tag{3.9} \]
Hence:
\[ H(Y(t)) \geq \frac12p(t)^2 + V(q(t)) + \frac14\|\phi(t)\|^2 + \frac12\|\pi(t)\|_2^2 + \langle\rho_1\rho_2;\rho_1\Delta_y^{-1}\rho_2\rangle. \tag{3.10} \]
We are now ready to prove that $J=\mathbb{R}$. We know that $J$ can be written $]a;b[$ with $-\infty\leq a<0$ and $0<b\leq+\infty$. We will show by contradiction that $b=+\infty$ (the same can be done for $a=-\infty$). If $b<+\infty$, we know by the theory of differential equations ([H], Theorem 2.1) that
\[ \lim_{t\to b}|Z(t)|_E = +\infty, \]
and the same holds for $|Y(t)|_E$ because $|Y(t)|_E = |W_0^tZ(t)|_E = |Z(t)|_E$.

We consider first the (harder) case where $\nabla V$ is bounded (but $V$ is not necessarily bounded below). For $t>0$, we can write $\phi$ as:
\[ \phi = \phi^r + \phi^0, \tag{3.11} \]
where $\phi^r$ is the solution of the wave equation with initial data equal to $0$ and $\phi^0$ is the solution of the homogeneous wave equation with initial data $\phi_0$ and $\pi_0$ [CH] [J]. Consequently
\[
\begin{aligned}
\dot p(t) = {}&-\nabla V(q(t)) + \int_{\mathbb{R}^{d+n}}dx\,dy\;\nabla\rho_1(x-q(t))\,\rho_2(y)\,\phi^r(x,y,t)\\
&+ \int_{\mathbb{R}^{d+n}}dx\,dy\;\nabla\rho_1(x-q(t))\,\rho_2(y)\,\phi^0(x,y,t).
\end{aligned}
\]
The first term $-\nabla V(q(t))$ is bounded by hypothesis. The second one is easily bounded using the Cauchy-Schwarz inequality and the exact form of $\phi^r$ given in ([CH], p. 692). Using (3.9) with $\nabla\rho_1$ instead of $\rho_1$, we have
\[ \left|\int_{\mathbb{R}^{d+n}}dx\,dy\;\nabla\rho_1(x-q(t))\,\rho_2(y)\,\phi^0(x,y,t)\right| \leq \frac14\|\nabla_y\phi^0(t)\|_2^2 + \|\nabla^{-1}\rho_2(y)\,\nabla\rho_1(x-q(t))\|_2^2. \]
But $\phi^0$ is a solution of the free wave equation with initial conditions $\phi_0$ and $\pi_0$, so, by energy conservation
\[ \|\nabla_y\phi^0(t)\|_2^2 + \Big\|\frac{d}{dt}\phi^0(t)\Big\|_2^2 = \|\nabla_y\phi_0\|_2^2 + \|\pi_0\|_2^2, \tag{3.12} \]
and so $\|\nabla_y\phi^0(t)\|_2^2$ is bounded too.

Finally, $\dot p(t)$ is bounded on $J$: there exists $C>0$ such that
\[ \forall t\in J,\ t>0:\qquad |\dot p(t)|\leq C. \tag{3.13} \]
We have supposed $b$ to be finite, so $p(t)$ and $q(t)$ are also bounded for $t>0$, $t\in J$. By energy conservation and (3.10), $\|\phi(t)\|$ and $\|\pi(t)\|_2$ are bounded. Therefore $|Y(t)|_E$ is bounded as well, which contradicts the assumption that $b$ is finite.

We finally deal with the second (easier) case, where $V$ is bounded from below. There exists $V_0\in\mathbb{R}$ such that for every $q\in\mathbb{R}^d$, $V(q)\geq V_0$. Equation (3.3) implies
\[ H(Y_0) \geq \frac12p(t)^2 + V_0 + \frac14\|\phi(t)\|^2 + \frac12\|\pi(t)\|_2^2 + \langle\rho_1\rho_2;\rho_1\Delta_y^{-1}\rho_2\rangle. \tag{3.14} \]
So $p(t)$, $\|\phi(t)\|$ and $\|\pi(t)\|_2$ are bounded on $J$ and because $b$ is supposed to be finite, $q(t)$ is also bounded, which is again a contradiction.
4. Behaviour of the Solutions: Constant Force

From now on, we take $n=3$. To prove our results we shall need to assume that the propagation speed $c$ (see (1.2)) is large. We will comment on this condition at the end of this section. We therefore reintroduce $c$ explicitly as in (1.4):
\[ \rho_2(y) = c^{\frac32}\,\sigma_2(cy). \tag{4.1} \]
In the following, $\rho_1$ and $\sigma_2$ are fixed and satisfy (H1); $c$ is treated as a parameter. The force exerted by the medium on a particle moving at velocity $v$ is defined in (2.2). One has
\[ f(v) = \frac{1}{c^2}\,\tilde f\Big(\frac vc\Big) = -\frac{1}{c^2}\,\tilde f_r\Big(\frac{|v|}{c}\Big)\frac{v}{|v|}, \tag{4.2} \]
where, for all $w\in\mathbb{R}^d$,
\[ \tilde f(w) = -\int_{\mathbb{R}^d}dx\int_{\mathbb{R}^n}dk\int_0^{+\infty}ds\;\nabla\rho_1(x)\,\rho_1(x+ws)\,|\hat\sigma_2(k)|^2\,\frac{\sin(|k|s)}{|k|}. \]
Note that $\tilde f$ and $\tilde f_r$ do not depend on $c$. The friction coefficient $\gamma$ defined in (2.9) then becomes
\[ \gamma = f_r'(0) = \frac{1}{c^3}\,\tilde f_r'(0) \equiv \frac{\tilde\gamma}{c^3} > 0, \]
where $\tilde\gamma$ does not depend on $c$:
\[ \tilde\gamma = |\hat\sigma_2(0)|^2\int_{\mathbb{R}^n}h(\xi)\,d\xi > 0. \tag{4.3} \]
We can define $w_M$ to be the smallest zero of $\tilde f_r'$ and $\tilde F_M = \tilde f_r(w_M)$. For all $w<w_M$, $\tilde f_r$ is increasing, so for all $F\in\mathbb{R}^d$, $|F|\leq\frac{\tilde F_M}{c^2}$, there exists a unique $v(F)\in\mathbb{R}^d$, $|v(F)|\leq w_Mc = v_M$ (see Fig. 2) such that
\[ f(v(F)) = -F. \tag{4.4} \]
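To make the “low” and “high” velocity solutions of $f(v)=-F$ concrete, the sketch below solves $f_r(|v|)=|F|$ by bisection on each side of the maximum of a toy friction profile. Everything specific in it is an assumption made for illustration: the profile $f_r(v)=\gamma v/(1+v^2)^{3/2}$ (the shape obtained for Gaussian form factors with $n=3$ in suitable units), the value $\gamma=1$, the choice $c=1$, and the function names. For this profile the maximum is located at $w_M=1/\sqrt2$.

```python
import math

GAMMA = 1.0  # hypothetical friction coefficient, arbitrary units


def f_r(v):
    """Toy friction profile (assumption): linear at small v, decaying at large v."""
    return GAMMA * v / (1.0 + v * v) ** 1.5


W_M = 1.0 / math.sqrt(2.0)  # maximum of the toy profile (zero of its derivative)
F_M = f_r(W_M)              # largest force the toy medium can balance


def bisect(func, lo, hi, iterations=200):
    """Plain bisection; requires func(lo) and func(hi) to have opposite signs."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0.0:
        raise ValueError("no sign change on the bracket")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def stationary_speeds(force):
    """Return the (low, high) speeds at which friction balances the given force."""
    if not 0.0 < force < F_M:
        raise ValueError("force must lie strictly between 0 and F_M")

    def balance(v):
        return f_r(v) - force

    low = bisect(balance, 0.0, W_M)      # increasing branch: the solution v(F)
    high = bisect(balance, W_M, 1.0e6)   # decreasing branch
    return low, high


if __name__ == "__main__":
    low, high = stationary_speeds(0.01)
    print(low, high, 0.01 / GAMMA)
```

For a small force the low-velocity root is close to $F/\gamma$, in line with the ohmic behaviour discussed above, while the second root lies on the decreasing branch of the profile.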
This defines $v(F)$. To obtain our results, we finally need some hypothesis on the initial conditions. For that purpose, we define the following set:

Definition 1. Let $D$ be the set of all states $Y_0 = (\phi_0,q_0,\pi_0,p_0)$ in $E$ such that
\[ |\phi_0(x,y)| + |y|\big(|\nabla_y\phi_0(x,y)| + |\pi_0(x,y)|\big) \leq \kappa(x)(1+|y|)^{-\nu} \tag{4.5} \]
for some $\nu>2$ and $\kappa\in L^\infty\cap L^2$.

We are now ready to state our main results.

Theorem 2. Let $\rho_1$ and $\sigma_2$ satisfy (H1) and consider (1.5)–(1.6) with $V(q) = -F\cdot q$, $F\in\mathbb{R}^d$.
(i) For all $F_0,K,R,\varepsilon,\eta>0$ there exists $c_0(\rho_1,\sigma_2,\varepsilon,\eta,F_0,K,R)>0$ such that for any $c>c_0$, for all $|F|<F_0c^{-2-\varepsilon}$ and for all $Y_0\in E$ such that $\phi_0(x,\cdot),\pi_0(x,\cdot)$ have compact support in $B_{Rc}\subset\mathbb{R}^3$, satisfying $H_0(Y_0)<Kc^{2-2\varepsilon}$, there exist $q_\infty(F,Y_0)\in\mathbb{R}^d$ and $K'>0$ such that for all $t>0$,
\[ |q(t) - q_\infty - v(F)t| \leq K'\,e^{-\frac{\tilde\gamma(1-\eta)}{c^3}t}. \tag{4.6} \]
(ii) For all $F_0,K,\varepsilon,\eta>0$ there exists $c_0(\rho_1,\sigma_2,\varepsilon,\eta,F_0,K)>0$ such that for any $c>c_0$, for all $|F|<F_0c^{-2-\varepsilon}$ and for all $Y_0\in D$ with $\|\kappa\|_\infty<Kc$ and $H_0(Y_0)<Kc^{2-2\varepsilon}$, there exist $q_\infty(F,Y_0)\in\mathbb{R}^d$ and $K'>0$ such that for all $t>0$,
\[ |q(t) - q_\infty - v(F)t| \leq K'\,|t|^{2-\nu}. \]
Note that, since $\eta$ can be taken arbitrarily small, the exponential decay rate in (4.6) is essentially given by the friction coefficient $\gamma = \frac{\tilde\gamma}{c^3}$ and, in addition, that
\[ v(F) = \frac{F}{\gamma} + O(c^{1-2\varepsilon}) \tag{4.7} \]
uniformly for $|F|\leq F_0c^{-2-\varepsilon}$. This shows that the solutions $q(t)$ of (1.6) do indeed have the same asymptotic behaviour as those of (1.1), as announced in the introduction.

The restriction on the energy $H_0(Y_0)$ of the initial conditions in the hypothesis is related to the fact that $f(v)\to 0$ as $v\to\infty$. Indeed, it is intuitively clear that, if at some time $t$, $|\dot q(t)|$ is too large, then the reaction force of the medium will be too small to counter the driving force $F$ and the particle will accelerate. This argument fails when $F=0$. In that case, one can indeed omit the hypothesis on the initial energy $H_0(Y_0)$, provided one imposes an additional hypothesis on $\sigma_2$:

(W) $\hat\sigma_2(k)\neq 0$ for all $k\in\mathbb{R}^3$.

This yields:

Theorem 3. Let $\rho_1,\sigma_2$ satisfy (H1) and let $\sigma_2$ satisfy (W). We consider (1.5)–(1.6) with $V\equiv 0$.
(i) For all $\eta>0$ there exists $c_0(\rho_1,\sigma_2,\eta)>0$ such that for any $c>c_0$ and for all $Y_0\in E$ such that $\phi_0(x,\cdot),\pi_0(x,\cdot)$ have compact support in the $y$ direction for each $x$, there exist $q_\infty(Y_0)\in\mathbb{R}^d$ and $K>0$ such that for all $t>0$,
\[ |q(t) - q_\infty| \leq K\,e^{-\frac{\tilde\gamma(1-\eta)}{c^3}t}. \]
(ii) For all $\eta>0$ there exists $c_0(\rho_1,\sigma_2,\eta)>0$ such that for any $c>c_0$ and for all $Y_0\in D$, there exist $q_\infty(Y_0)\in\mathbb{R}^d$ and $K>0$ such that for all $t>0$,
\[ |q(t) - q_\infty| \leq K\,|t|^{2-\nu}. \]
The proof of Theorem 3, which uses techniques of this section and the following one, is given at the end of Sect. 5. We now prove Theorem 2. We introduce some notation which will frequently appear. We denote by $Df(v)$ the differential of the function $f(v)$. One can see that in any orthonormal basis $(e_1,\dots,e_d)$, where $e_1 = \frac{v}{|v|}$, we have
\[ Df(v) = \mathrm{diag}\Big(-f_r'(|v|),\,-\frac{f_r(|v|)}{|v|},\,\dots,\,-\frac{f_r(|v|)}{|v|}\Big) \]
for $v\neq 0$ and
\[ Df(0) = -\gamma\,\mathrm{Id}. \]
We define for $w\in\mathbb{R}^d$,
\[ \tilde\theta_*(w) = \max\Big\{\tilde f_r'(|w|),\,\frac{\tilde f_r(|w|)}{|w|}\Big\}, \qquad \tilde\gamma_*(w) = \min\Big\{\tilde f_r'(|w|),\,\frac{\tilde f_r(|w|)}{|w|}\Big\}. \tag{4.8} \]
In view of the definition of $w_M$ and (4.2), it is clear that $\tilde\theta_*(w)$ and $\tilde\gamma_*(w)$ are strictly positive provided $|w|<w_M$ and that $\|Df(v)\| = \frac{1}{c^3}\tilde\theta_*(\frac vc)$. Clearly,
\[ \lim_{w\to 0}\tilde\theta_*(w) = \lim_{w\to 0}\tilde\gamma_*(w) = \tilde\gamma, \tag{4.9} \]
where $\tilde\gamma$ is defined in (4.3). For simplicity, we will write $Df_F$, $\tilde\gamma_F$ and $\tilde\theta_F$ for $Df(v(F))$, $\tilde\gamma_*(\frac{v(F)}{c})$ and $\tilde\theta_*(\frac{v(F)}{c})$. Since we expect to prove $\dot q(t)\to v(F)$, it is convenient to introduce $h(t) = q(t) - v(F)t$. For the proof of Theorem 2, we need the following lemma.

Lemma 1. Under the hypothesis of Theorem 2(i) (resp. Theorem 2(ii)), there exist $c_0>0$ and $\beta>0$ such that
\[ \sup_{t\geq 0}|\dot h(t)| \leq \beta c^{1-\varepsilon} \]
for all $c>c_0$ and for all initial conditions as in Theorem 2(i) (resp. Theorem 2(ii)).

The proof of Lemma 1 will be given below.

Proof of Theorem 2. During the proof, many estimates will be done in terms of $c$, so one should remember that $\rho_2$ depends on $c$ via (4.1). On the other hand, the different constants will only depend on $\rho_1,\sigma_2,\eta,\varepsilon,F_0,K,R$, but not on $c$, $F$, or on the initial conditions. We first fix $c$ large enough so that $F_0c^{-2-\varepsilon}<\tilde F_Mc^{-2}$, which implies that $v(F)$ is well defined (see (4.4)), and we consider (1.5)–(1.6) for some $F\in\mathbb{R}^d$, $|F|\leq F_0c^{-2-\varepsilon}$ and $Y_0\in E$.

The first part of the proof consists of a rather straightforward but somewhat lengthy computation leading from (1.5)–(1.6) to an effective integro-differential equation for $h(t) = q(t)-v(F)t$ obtained in (4.20). Solving (1.5) yields, according to (3.11), $\phi(x,y,t) = \phi^r(x,y,t)+\phi^0(x,y,t)$, where in the 3-dimensional case we deal with here:
\[ \phi^r(x,y,t) = -\frac{1}{4\pi}\int_{|z|\leq t}dz\;\frac{\rho_2(y-z)}{|z|}\,\rho_1(x-q(t-|z|)), \tag{4.10} \]
\[ \phi^0(x,y,t) = \frac{1}{4\pi t^2}\int_{S_t(y)}\big[\phi_0(x,\sigma)+\sigma\cdot\nabla_y\phi_0(x,\sigma)+t\pi_0(x,\sigma)\big]\,d\sigma, \tag{4.11} \]
and $S_t(y)$ is the sphere of radius $t$ centered at $y$ ([J], Chap. 3). Inserting this in (1.6) leads to the following integro-differential equation for $q(t)$:
\[ \ddot q(t) = F - \frac{1}{4\pi}\int dx\int dy\int_{|z|\leq t}dz\;\frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\rho_1(x-q(t-|z|))\,\nabla\rho_1(x-q(t)) + A_0(t), \tag{4.12} \]
where
\[ A_0(t) = \frac{1}{4\pi t^2}\int dx\int dy\int_{S_t(y)}d\sigma\;\big[\phi_0(x,\sigma)+\sigma\cdot\nabla_y\phi_0(x,\sigma)+t\pi_0(x,\sigma)\big]\,\rho_2(y)\,\nabla\rho_1(x-q(t)). \tag{4.13} \]
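Since $n = 3$, it is not difficult to see that $f(v)$, defined in (2.2), can be rewritten as follows:
\[ f(v) = -\frac{1}{4\pi}\int_{\mathbb{R}^d} dx\int_{\mathbb{R}^3} dy\int_{\mathbb{R}^3} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\rho_1(x + v|z|)\,\nabla\rho_1(x). \tag{4.14} \]
Using this expression to replace $F$ in (4.12) by $-f(v(F))$ we find
\[ \ddot q(t) = \frac{1}{4\pi}\int dx\int dy\int dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\rho_1(x + v(F)|z|)\,\nabla\rho_1(x) - \frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\rho_1(x - q(t-|z|))\,\nabla\rho_1(x - q(t)) + A_0(t). \]
To lighten the notation, we shall from now on write $v = v(F)$. We now divide the first integral in two parts:
\[ \int_{\mathbb{R}^d} dx\int_{\mathbb{R}^3} dy\int_{\mathbb{R}^3} dz = \int_{\mathbb{R}^d} dx\int_{\mathbb{R}^3} dy\int_{|z|\le t} dz + \int_{\mathbb{R}^d} dx\int_{\mathbb{R}^3} dy\int_{|z|\ge t} dz. \]
We denote by $\tilde f(t)$ the second one of these two terms, i.e.
\[ \tilde f(t) = \frac{1}{4\pi}\int dx\int dy\int_{|z|\ge t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\rho_1(x + v|z|)\,\nabla\rho_1(x). \tag{4.15} \]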
Now, remark that for $|z| \ge \frac{2R_2}{c}$, $\rho_2(y)\rho_2(y-z) = 0$ because $|y-z| \ge |z| - |y| \ge \frac{2R_2}{c} - |y|$ and $\rho_2(y) = 0$ for $|y| \ge \frac{R_2}{c}$. So $\tilde f(t)$ vanishes if $t \ge \frac{2R_2}{c}$. Finally, let
\[ A_1(t) = A_0(t) + \tilde f(t). \tag{4.16} \]
This leads to
\[ \ddot q(t) = \frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\bigl[-\rho_1(x - q(t-|z|))\,\nabla\rho_1(x - q(t)) + \rho_1(x + v|z|)\,\nabla\rho_1(x)\bigr] + A_1(t). \tag{4.17} \]
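Inserting $q(t-|z|) = q(t) - \int_{t-|z|}^{t}\dot q(s)\,ds$ in (4.17) and using translation invariance, we find:
\[ \ddot q(t) = \frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\Bigl[\rho_1(x + v|z|) - \rho_1\Bigl(x + \int_{t-|z|}^{t}\dot q(s)\,ds\Bigr)\Bigr]\nabla\rho_1(x) + A_1(t). \tag{4.18} \]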
We are now ready to introduce $h(t) = q(t) - vt$, in terms of which (4.18) becomes
\[ \ddot h(t) = \frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\Bigl[\rho_1(x + v|z|) - \rho_1\Bigl(x + v|z| + \int_{t-|z|}^{t}\dot h(s)\,ds\Bigr)\Bigr]\nabla\rho_1(x) + A_1(t). \]
We can write
\[ \rho_1\Bigl(x + v|z| + \int_{t-|z|}^{t}\dot h(s)\,ds\Bigr) - \rho_1(x + v|z|) = \int_{t-|z|}^{t}\dot h(s)\cdot\nabla\rho_1(x + v|z|)\,ds + \frac{1}{2}\,\mathrm{Hess}\rho_1(\tilde x_{t,|z|})\Bigl(\int_{t-|z|}^{t}\dot h(s)\,ds;\ \int_{t-|z|}^{t}\dot h(s)\,ds\Bigr) \]
for some $\tilde x_{t,|z|}$ belonging to the segment $\bigl[x + v|z|;\ x + v|z| + \int_{t-|z|}^{t}\dot h(s)\,ds\bigr]$. In addition, an integration by parts yields:
\[ \int_{t-|z|}^{t}\dot h(s)\,ds = |z|\,\dot h(t) + \int_{t-|z|}^{t}(t - |z| - s)\,\ddot h(s)\,ds. \]
As a result
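\[
\begin{aligned}
\ddot h(t) ={}& -\frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \rho_2(y-z)\rho_2(y)\,\bigl(\dot h(t)\cdot\nabla\rho_1(x + v|z|)\bigr)\nabla\rho_1(x) \\
& -\frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\Bigl(\int_{t-|z|}^{t}(t-|z|-s)\,\ddot h(s)\,ds\cdot\nabla\rho_1(x + v|z|)\Bigr)\nabla\rho_1(x) \\
& -\frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\frac{1}{2}\,\mathrm{Hess}\rho_1(\tilde x_{t,|z|})\Bigl(\int_{t-|z|}^{t}\dot h(s)\,ds;\ \int_{t-|z|}^{t}\dot h(s)\,ds\Bigr)\nabla\rho_1(x) + A_1(t).
\end{aligned}
\]
Once again we rewrite the first integral as $\int_{|z|\le t} = \int_{\mathbb{R}^3} - \int_{|z|\ge t}$. It is easily seen from (4.14) that the first term then equals $Df_F\cdot\dot h(t)$, whereas the second one once again vanishes for $t \ge \frac{2R_2}{c}$. We define
\[ A_2(t) = A_1(t) + \frac{1}{4\pi}\int dx\int dy\int_{|z|\ge t} dz\, \rho_2(y-z)\rho_2(y)\,\bigl(\dot h(t)\cdot\nabla\rho_1(x + v|z|)\bigr)\nabla\rho_1(x), \tag{4.19} \]
and we finally obtain the following convenient form of the integro-differential equation for $h(t) = q(t) - vt$:
\[
\begin{aligned}
\ddot h(t) ={}& Df_F\cdot\dot h(t) - \frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\Bigl(\int_{t-|z|}^{t}(t-|z|-s)\,\ddot h(s)\,ds\cdot\nabla\rho_1(x + v|z|)\Bigr)\nabla\rho_1(x) \\
& -\frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\frac{1}{2}\,\mathrm{Hess}\rho_1(\tilde x_{t,|z|})\Bigl(\int_{t-|z|}^{t}\dot h(s)\,ds;\ \int_{t-|z|}^{t}\dot h(s)\,ds\Bigr)\nabla\rho_1(x) + A_2(t),
\end{aligned}
\tag{4.20}
\]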
where $A_2(t)$ is defined via (4.13), (4.15), (4.16) and (4.19). One recognizes here, in the first two terms, Eq. (1.1) with $V = -F\cdot q$. We can now show $\dot h(t) \to 0$ and control the rate of convergence. We first define
\[ g(t) = e^{\tilde\theta_F t/c^3}\,\dot h(t). \]
We have
\[ \dot h(t) = e^{-\tilde\theta_F t/c^3} g(t), \qquad \ddot h(t) = -\frac{\tilde\theta_F}{c^3}\,e^{-\tilde\theta_F t/c^3} g(t) + e^{-\tilde\theta_F t/c^3}\dot g(t) = -\frac{\tilde\theta_F}{c^3}\,\dot h(t) + e^{-\tilde\theta_F t/c^3}\dot g(t), \]
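so that (4.20) becomes
\[
\begin{aligned}
\dot g(t) ={}& \Bigl(\frac{\tilde\theta_F}{c^3}\,\mathrm{Id} + Df_F\Bigr)\cdot g(t) \\
& + \frac{\tilde\theta_F}{4\pi c^3}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\Bigl(\int_{t-|z|}^{t}(t-|z|-s)\,e^{\tilde\theta_F(t-s)/c^3} g(s)\,ds\cdot\nabla\rho_1(x + v|z|)\Bigr)\nabla\rho_1(x) \\
& - \frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\frac{1}{2}\,\mathrm{Hess}\rho_1(\tilde x_{t,|z|})\Bigl(\int_{t-|z|}^{t} e^{\tilde\theta_F(t-s)/c^3} g(s)\,ds;\ \int_{t-|z|}^{t}\dot h(s)\,ds\Bigr)\nabla\rho_1(x) \\
& - \frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\Bigl(\int_{t-|z|}^{t}(t-|z|-s)\,e^{\tilde\theta_F(t-s)/c^3}\dot g(s)\,ds\cdot\nabla\rho_1(x + v|z|)\Bigr)\nabla\rho_1(x) \\
& + e^{\tilde\theta_F t/c^3} A_2(t).
\end{aligned}
\tag{4.21}
\]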
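As a quick check of the two elementary steps behind (4.20) and (4.21), the following LaTeX sketch spells out the integration by parts and the effect of the substitution $\dot h = e^{-\tilde\theta_F t/c^3} g$. The shorthand $a = t - |z|$ and $\kappa = \tilde\theta_F/c^3$ is introduced here only for readability and is not part of the original notation.

```latex
% Shorthand (not in the original): a = t - |z|, \kappa = \tilde\theta_F / c^3.
\begin{align*}
\int_a^t (a-s)\,\ddot h(s)\,ds
  &= \bigl[(a-s)\,\dot h(s)\bigr]_{s=a}^{s=t} + \int_a^t \dot h(s)\,ds
   = -|z|\,\dot h(t) + \int_a^t \dot h(s)\,ds, \\
\dot h(s) &= e^{-\kappa s} g(s), \qquad
\ddot h(s) = e^{-\kappa s}\bigl(\dot g(s) - \kappa\, g(s)\bigr), \\
e^{\kappa t}\,\ddot h(s) &= e^{\kappa (t-s)}\bigl(\dot g(s) - \kappa\, g(s)\bigr).
\end{align*}
% The first line is the integration by parts used before (4.20).
% The last line, inserted in the memory integral of (4.20), produces the
% terms with prefactors +\kappa/(4\pi) and -1/(4\pi) in (4.21).
```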
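Note that in the third term of the right-hand side we only replaced one factor $\dot h(s)$ by $e^{-\tilde\theta_F s/c^3} g(s)$. We will use Lemma 1 to control the other one. We define
\[ M(t) = \sup_{0\le s\le t}|g(s)| \quad\text{and}\quad N(t) = \sup_{0\le s\le t}|\dot g(s)|. \]
Writing $R(t)$ for the right-hand side of (4.21) and using Lemma 1 to control its third term, we easily find (remembering (4.1)) that there exist constants $D_1, D_2, D_3 > 0$ depending on $\rho_1$, $\sigma_2$ so that
\[ |R(t)| \le \Bigl(\frac{\tilde\theta_F}{c^3} - \frac{\tilde\gamma_F}{c^3}\Bigr)M(t) + \frac{\tilde\theta_F}{c^7}\,D_1\,e^{2\tilde\theta_F R_2/c^4} M(t) + \frac{D_2\,e^{2\tilde\theta_F R_2/c^4}}{c^{3+\varepsilon}}\,M(t) + \frac{D_3}{c^4}\,e^{2R_2\tilde\theta_F/c^4} N(t) + e^{\tilde\theta_F t/c^3}|A_2(t)|. \]
Here, $R_2$ is the radius of the support of $\sigma_2$ (see (H1)). Taking $c$ large enough (depending on $D_1$, $D_2$, $\tilde\theta_F$) we obtain for all $s \ge 0$,
\[ |\dot g(s)| \le \Bigl(\frac{\tilde\theta_F}{c^3} - \frac{\tilde\gamma_F}{c^3} + \frac{D_4\,e^{2\tilde\theta_F R_2/c^4}}{c^{3+\varepsilon}}\Bigr)M(s) + N(s)\,\frac{D_3}{c^4}\,e^{2R_2\tilde\theta_F/c^4} + e^{\tilde\theta_F s/c^3}|A_2(s)|. \tag{4.22} \]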
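Taking the supremum over all $s \in [0,t]$, first in the right-hand side and then in the left-hand side of this inequality, we obtain
\[ N(t)\Bigl(1 - \frac{D_3}{c^4}\,e^{2R_2\tilde\theta_F/c^4}\Bigr) \le \Bigl(\frac{\tilde\theta_F}{c^3} - \frac{\tilde\gamma_F}{c^3} + \frac{D_4\,e^{2\tilde\theta_F R_2/c^4}}{c^{3+\varepsilon}}\Bigr)M(t) + \sup_{0\le s\le t} e^{\tilde\theta_F s/c^3}|A_2(s)|. \]
We denote by $k_c$ the inverse of the factor of $N(t)$. Remark that $k_c \sim 1 + \frac{k}{c^4}$. Hence
\[ N(t) \le k_c\Bigl(\frac{\tilde\theta_F}{c^3} - \frac{\tilde\gamma_F}{c^3} + \frac{D_4\,e^{2\tilde\theta_F R_2/c^4}}{c^{3+\varepsilon}}\Bigr)M(t) + k_c\sup_{0\le s\le t} e^{\tilde\theta_F s/c^3}|A_2(s)|. \]
Remark that, in view of (4.7), uniformly for all $F \in \mathbb{R}^d$ so that $0 \le |F| \le F_0 c^{-2-\varepsilon}$, $\lim_{c\to+\infty}\frac{v(F)}{c} = 0$. Hence, from (4.9),
\[ \lim_{c\to+\infty}\tilde\gamma_F = \lim_{c\to+\infty}\tilde\theta_F = \tilde\gamma. \tag{4.23} \]
Using this, it is now easy to see that for all $\eta > 0$, there exists $c_0(\rho_1, \sigma_2, K, F_0, \varepsilon, \eta)$ so that, for all $c > c_0$ one has
\[ 0 < k_c\Bigl(\frac{\tilde\theta_F}{c^3} - \frac{\tilde\gamma_F}{c^3} + \frac{D_4}{c^{3+\varepsilon}}\,e^{2\tilde\theta_F R_2/c^4}\Bigr) \le \bigl(\tilde\theta_F - \tilde\gamma(1-\eta)\bigr)\frac{1}{c^3}. \]
We obtain then:
\[ N(t) \le \frac{1}{c^3}\bigl(\tilde\theta_F - \tilde\gamma(1-\eta)\bigr)M(t) + k_c\sup_{0\le s\le t} e^{\tilde\theta_F s/c^3}|A_2(s)|. \tag{4.24} \]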
To control the last term in this inequality, we now need to use the hypotheses on the initial conditions $Y_0 \in E$. We treat Theorem 2(ii) first. Recall that $A_2 - A_0$ is a function of compact support, vanishing for $t \ge \frac{2R_2}{c}$. Hence, there exists $B_c(Y_0)$ so that, for all $t \ge 0$:
\[ k_c\sup_{0\le s\le t}\bigl|e^{\tilde\theta_F s/c^3}(A_2(s) - A_0(s))\bigr| \le B_c(Y_0) < +\infty. \tag{4.25} \]
On the other hand, since $\rho_2(y) = 0$ for $|y| \ge \frac{R_2}{c}$ (see (4.1)), we have
\[ A_0(t) = \frac{1}{4\pi t^2}\int dx\int_{|y|\le R_2/c} dy\int_{S_t(y)} d\sigma\, [\phi_0(x,\sigma) + \sigma\cdot\nabla_y\phi_0(x,\sigma) + t\pi_0(x,\sigma)]\,\rho_2(y)\,\nabla\rho_1(x - q(t)). \]
If $|y| \le \frac{R_2}{c}$, then $|\sigma| \ge \bigl|t - \frac{R_2}{c}\bigr|$. According to (4.5), and the hypothesis on $\phi_0$, $\pi_0$,
\[ |\phi_0(x,\sigma)| \le Kc\,\Bigl|t - \frac{R_2}{c}\Bigr|^{-\nu}, \qquad |\sigma\cdot\nabla_y\phi_0(x,\sigma)| \le Kc\,\Bigl|t - \frac{R_2}{c}\Bigr|^{-\nu}, \qquad |t\pi_0(x,\sigma)| \le Kc\,t\,\Bigl|t - \frac{R_2}{c}\Bigr|^{-\nu-1}, \]
uniformly in the $x$ variable. So we have, for some constant $A$,
\[ |A_0(t)| \le \frac{A}{c^{1/2}(1+t)^{\nu}}. \tag{4.26} \]
A simple computation then shows there exists $t_*\bigl(\frac{\tilde\theta_F}{c^3}\bigr) > 0$ so that
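\[ \sup_{0\le s\le t}\frac{e^{\tilde\theta_F s/c^3}}{(1+s)^{\nu}} = \begin{cases} \dfrac{e^{\tilde\theta_F t/c^3}}{(1+t)^{\nu}}, & \forall t \ge t_*\bigl(\frac{\tilde\theta_F}{c^3}\bigr), \\[2mm] 1, & \forall t \le t_*\bigl(\frac{\tilde\theta_F}{c^3}\bigr). \end{cases} \tag{4.27} \]
We shall write $t_* = t_*\bigl(\frac{\tilde\theta_F}{c^3}\bigr)$. Now we have, for all $0 \le s \le t$,
\[ |g(s)| \le |g(0)| + \int_0^s |\dot g(u)|\,du \le |g(0)| + \int_0^s N(u)\,du \le |g(0)| + \int_0^t N(u)\,du, \]
so that, using (4.24), (4.25) and (4.27), we find
\[
\begin{aligned}
M(t) &\le |M(0)| + \int_0^t N(u)\,du \\
&\le |g(0)| + \frac{1}{c^3}\bigl(\tilde\theta_F - \tilde\gamma(1-\eta)\bigr)\int_0^t M(u)\,du + k_c A c^{-1/2}\int_0^t \frac{e^{\tilde\theta_F u/c^3}}{(1+u)^{\nu}}\,du + k_c A c^{-1/2}\,t_* + B_c t.
\end{aligned}
\]
We can use the Gronwall lemma ([H], Lemma 6.2) to obtain
\[
\begin{aligned}
|g(t)| \le M(t) \le{}& \Bigl(|g(0)| + k_c A c^{-1/2}\,t_* + \frac{B_c c^3}{\tilde\theta_F - \tilde\gamma(1-\eta)}\Bigr)e^{(\tilde\theta_F - \tilde\gamma(1-\eta))t/c^3} \\
& + k_c A c^{-1/2}\Bigl(\int_0^t \frac{e^{\tilde\gamma(1-\eta)s/c^3}}{(1+s)^{\nu}}\,ds\Bigr)e^{(\tilde\theta_F - \tilde\gamma(1-\eta))t/c^3}.
\end{aligned}
\tag{4.28}
\]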
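The threshold $t_*$ appearing in (4.27) can be located by an elementary calculus argument. The sketch below assumes $0 < a < \nu$, where $a = \tilde\theta_F/c^3$ and $u(s)$ are shorthand introduced here for readability (they are not part of the original notation); the assumption holds once $c$ is large.

```latex
% Shorthand (not in the original): a = \tilde\theta_F / c^3,
% u(s) = e^{as} (1+s)^{-\nu}, with the assumption 0 < a < \nu.
\begin{align*}
u'(s) &= u(s)\Bigl(a - \frac{\nu}{1+s}\Bigr),
\end{align*}
% so u decreases on [0, \nu/a - 1] and increases afterwards.
% Since u(0) = 1 and u(s) \to +\infty as s \to +\infty, there is a unique
% t_* > \nu/a - 1 with u(t_*) = 1, that is,
\begin{align*}
e^{a t_*} &= (1 + t_*)^{\nu},
\qquad
\sup_{0 \le s \le t} u(s) = \max\{1, u(t)\} =
\begin{cases}
1, & t \le t_*, \\
u(t), & t \ge t_*.
\end{cases}
\end{align*}
```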
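We define $h_\infty = |g(0)| + k_c A c^{-1/2}\,t_* + \frac{B_c c^3}{\tilde\theta_F - \tilde\gamma(1-\eta)}$. Remembering that $\dot h(t) = e^{-\tilde\theta_F t/c^3} g(t)$, this yields
\[
\begin{aligned}
|\dot h(t)| &\le h_\infty\,e^{-\tilde\gamma(1-\eta)t/c^3} + k_c A c^{-1/2}\,e^{-\tilde\gamma(1-\eta)t/c^3}\int_0^t \frac{e^{\tilde\gamma(1-\eta)s/c^3}}{(1+s)^{\nu}}\,ds \\
&\le h_\infty\,e^{-\tilde\gamma(1-\eta)t/c^3} + k_c A c^{-1/2}\,e^{-\tilde\gamma(1-\eta)t/c^3}\int_0^{t/2} \frac{e^{\tilde\gamma(1-\eta)s/c^3}}{(1+s)^{\nu}}\,ds + k_c A c^{-1/2}\,e^{-\tilde\gamma(1-\eta)t/c^3}\int_{t/2}^{t} \frac{e^{\tilde\gamma(1-\eta)s/c^3}}{(1+s)^{\nu}}\,ds \\
&\le h_\infty\,e^{-\tilde\gamma(1-\eta)t/c^3} + k_c A c^{-1/2}\,\frac{c^3\,e^{-\tilde\gamma(1-\eta)t/c^3}}{\tilde\gamma(1-\eta)}\Bigl(e^{\tilde\gamma(1-\eta)t/(2c^3)} - 1\Bigr) + k_c A c^{-1/2}\int_{t/2}^{t}\frac{ds}{(1+s)^{\nu}} \\
&\le h_\infty\,e^{-\tilde\gamma(1-\eta)t/c^3} + \frac{k_c A c^{5/2}}{\tilde\gamma(1-\eta)}\,e^{-\tilde\gamma(1-\eta)t/(2c^3)} + \frac{k_c A c^{-1/2}}{(\nu-1)\bigl(1 + \frac{t}{2}\bigr)^{\nu-1}}.
\end{aligned}
\]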
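The last two steps of this chain rest on two elementary integral bounds, sketched below in LaTeX. The shorthand $b = \tilde\gamma(1-\eta)/c^3$ is introduced here for readability only, and $\nu > 1$ is assumed so that the tail integral converges.

```latex
% Shorthand (not in the original): b = \tilde\gamma (1-\eta) / c^3 > 0; assume \nu > 1.
\begin{align*}
e^{-bt}\int_0^{t/2} \frac{e^{bs}}{(1+s)^{\nu}}\,ds
  &\le e^{-bt}\int_0^{t/2} e^{bs}\,ds
   = e^{-bt}\,\frac{e^{bt/2} - 1}{b}
   \le \frac{e^{-bt/2}}{b}, \\
e^{-bt}\int_{t/2}^{t} \frac{e^{bs}}{(1+s)^{\nu}}\,ds
  &\le \int_{t/2}^{t} \frac{ds}{(1+s)^{\nu}}
   = \frac{(1+t/2)^{1-\nu} - (1+t)^{1-\nu}}{\nu - 1}
   \le \frac{1}{(\nu-1)\,(1+t/2)^{\nu-1}}.
\end{align*}
% With 1/b = c^3 / (\tilde\gamma (1-\eta)), the first bound explains the factor
% c^{-1/2} \cdot c^{3} = c^{5/2} in the final estimate.
```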
Consequently
\[ \dot h(t) = O(t^{1-\nu}), \]
so that we can conclude that there exists $q_\infty(Y_0, K, F, \varepsilon) \in \mathbb{R}^d$ with the property that
\[ q(t) = q_\infty + v(F)t + O(t^{2-\nu}), \]
which proves the second part of the theorem. In part (i) of the theorem, $\phi_0$ and $\pi_0$ are compactly supported. Hence $A_2$ is compactly supported as well. In that case, (4.24) becomes
\[ N(t) \le \frac{1}{c^3}\bigl(\tilde\theta_F - \tilde\gamma(1-\eta)\bigr)M(t) + \tilde N, \]
where $\tilde N$ is a constant which depends on everything except $t$, yielding instead of (4.28),
\[ |g(t)| \le |g(0)|\,e^{(\tilde\theta_F - \tilde\gamma(1-\eta))t/c^3} + \tilde N\,e^{(\tilde\theta_F - \tilde\gamma(1-\eta))t/c^3}, \]
and hence
\[ |\dot h(t)| \le \bigl(|\dot h(0)| + \tilde N\bigr)\,e^{-\tilde\gamma(1-\eta)t/c^3}. \]
From this, the announced behaviour of $q(t)$ follows again.
Proof of Lemma 1. First note that in the case considered here ($V = -F\cdot q$) the Hamiltonian is not bounded below (unless $F = 0$), so that there is no a priori reason why $\dot h$ should be bounded. We start by controlling $\ddot h(t)$. Using (4.1), (4.12) and $\ddot h(t) = \ddot q(t)$ we have:
\[ |\ddot h(t)| \le |F| + \frac{\tilde K}{c^2} + |A_0(t)|. \]
But
\[ |A_0(t)| = \Bigl|\int dx\,dy\,\rho_2(y)\,\nabla\rho_1(x - q(t))\,\phi^0(x,y,t)\Bigr| \le \|\nabla_y\phi^0(t)\|_2 \times \|\nabla_y^{-1}\rho_2(y)\,\nabla\rho_1(x - q(t))\|_2, \]
and using (3.12) together with the hypothesis $H_0(Y_0) \le Kc^{2-2\varepsilon}$ and the form of $\rho_2$ we have
\[ |A_0(t)| \le Ac^{-\varepsilon}. \tag{4.29} \]
Then remember that $|F| \le F_0 c^{-2-\varepsilon} \le F_M(c) = \frac{\tilde F_M}{c^2}$, so finally:
\[ |\ddot h(t)| \le K_0 c^{-\varepsilon}. \tag{4.30} \]
We now turn to the bound on $\dot h(t)$, which is obtained in (4.35). To lighten the notation, we shall write $L_F$ for $Df_F$. Multiplying (4.20) by $e^{-L_F t}$ and integrating between $0$ and $T$, we obtain after some rewriting:
\[
\begin{aligned}
\dot h(T) ={}& e^{L_F T}\dot h(0) - \frac{1}{4\pi}\int_0^T dt\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\Bigl(\int_{t-|z|}^{t}(t-|z|-s)\,\ddot h(s)\,ds\cdot\nabla\rho_1(x + v|z|)\Bigr)e^{-L_F(t-T)}\nabla\rho_1(x) \\
& - \frac{1}{4\pi}\int_0^T dt\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\frac{1}{2}\,\mathrm{Hess}\rho_1(\tilde x_{t,|z|})\Bigl(\int_{t-|z|}^{t}\dot h(s)\,ds;\ \int_{t-|z|}^{t}\dot h(s)\,ds\Bigr)e^{-L_F(t-T)}\nabla\rho_1(x) \\
& + \int_0^T dt\, e^{-L_F(t-T)} A_2(t).
\end{aligned}
\]
Defining $B(t) = \sup_{0\le s\le t}|\dot h(s)|$ and using (4.1) and (4.30), we find for all $t \ge 0$ and for some $K_1, K_2 > 0$:
\[ |\dot h(t)| \le |\dot h(0)| + \frac{K_1}{c^{4+\varepsilon}}\int_0^t \bigl\|e^{-L_F(s-t)}\bigr\|\,ds + \frac{K_2}{c^4}\int_0^t \bigl\|e^{-L_F(s-t)}\bigr\|\,B^2(s)\,ds + \int_0^t \bigl|e^{-L_F(s-t)} A_2(s)\bigr|\,ds. \]
Then, with the notations introduced after Theorem 3 (see (4.8)–(4.9)),
\[ |\dot h(t)| \le |\dot h(0)| + \frac{K_1}{c^{4+\varepsilon}}\int_0^t e^{\tilde\gamma_F(s-t)/c^3}\,ds + \frac{K_2}{c^4}\,B^2(t)\int_0^t e^{\tilde\gamma_F(s-t)/c^3}\,ds + \int_0^t e^{\tilde\gamma_F(s-t)/c^3}|A_2(s)|\,ds, \]
and consequently
\[ |\dot h(t)| \le |\dot h(0)| + \frac{K_1}{\tilde\gamma_F c^{1+\varepsilon}} + \frac{K_2}{\tilde\gamma_F c}\,B^2(t) + \int_0^t e^{\tilde\gamma_F(s-t)/c^3}|A_2(s)|\,ds. \]
We now control the last term of this inequality. Under the hypothesis of part (i) of Theorem 2, $A_0(s)$ has compact support. One should in addition remember that $A_2$ differs from $A_0$ by terms which have compact support in the ball of radius $\frac{2R_2}{c}$. Moreover one of these terms is bounded by $\frac{1}{c^2}$ and the other by $\frac{|\dot h(t)|}{c^3}$. Therefore, using (4.29), the last integral can be bounded as follows:
\[
\begin{aligned}
\int_0^t e^{\tilde\gamma_F(s-t)/c^3}|A_2(s)|\,ds &\le \int_0^{\alpha} e^{\tilde\gamma_F(s-t)/c^3}|A_0(s)|\,ds + \int_0^{2R_2/c} e^{\tilde\gamma_F(s-t)/c^3}|A_2(s) - A_0(s)|\,ds \\
&\le Ac^{-\varepsilon}\int_0^{\alpha} e^{\tilde\gamma_F(s-t)/c^3}\,ds + k\bigl(B(t)c^{-3} + c^{-2}\bigr)\int_0^{2R_2/c} e^{\tilde\gamma_F(s-t)/c^3}\,ds \\
&\le A\alpha c^{-\varepsilon} + k c^{-3} + k B(t)\alpha c^{-4},
\end{aligned}
\tag{4.31}
\]
where $\alpha$ is such that $A_0(s) = 0$ for $s > \alpha$ (note that $\alpha < \sup\{\frac{2R_2}{c}, \frac{R_2}{c} + Rc\}$) and provided $c$ is large enough. And so we have for all $t \ge 0$,
\[ |\dot h(t)| \le |\dot h(0)| + K_3 c^{1-\varepsilon} + \frac{K_2}{\tilde\gamma_F c}\,B^2(t) + k B(t)c^{-4}. \tag{4.32} \]
If we are now under the hypothesis of part (ii), we use (4.26) to control $\int |A_0(s)|\,ds$. We obtain then
\[ \int_0^t e^{\tilde\gamma_F(s-t)/c^3}|A_0(s)|\,ds \le \frac{A}{c^{1/2}}\int_0^t \frac{e^{\tilde\gamma_F(s-t)/c^3}}{(1+s)^{\nu}}\,ds \le \frac{A}{c^{1/2}}\int_0^t \frac{ds}{(1+s)^{\nu}} \le A'c^{-1/2} \]
because $\nu > 2$. Finally we have once again (4.32) for all $t \ge 0$. We can now conclude as follows. Since $B(t)$ is increasing, we have, for all $0 \le t \le T$,
\[ |\dot h(t)| \le |\dot h(0)| + K_3 c^{1-\varepsilon} + \frac{K_2}{\tilde\gamma_F c}\,B^2(T) + k B(T)c^{-4}. \]
So, taking the supremum over $t$, we have the following inequality for all $T \ge 0$:
\[ B(T) - k B(T)c^{-4} \le |\dot h(0)| + K_3 c^{1-\varepsilon} + \frac{K_2}{\tilde\gamma_F c}\,B^2(T). \tag{4.33} \]
Using the hypothesis on $F$ and $H_0(Y_0)$, one has $|\dot h(0)| < Kc^{1-\varepsilon}$, so, for $c$ large enough:
\[ B(T) \le 2(K + K_3)c^{1-\varepsilon} + \frac{2K_2}{\tilde\gamma_F c}\,B^2(T). \tag{4.34} \]
An easy computation and the continuity of $B(t)$ tell us that:
\[ B(t) \le B_- \quad \forall t \ge 0 \qquad\text{or}\qquad B(t) \ge B_+ \quad \forall t \ge 0 \]
with
\[ B_\pm = \frac{\tilde\gamma_F c}{4K_2}\Bigl(1 \pm \sqrt{1 - \frac{16K_2(K + K_3)}{\tilde\gamma_F c^{\varepsilon}}}\Bigr). \]
We will now take $c$ large enough so that there exist two constants $\beta$ and $\beta'$ such that:
\[ B_- \le \beta c^{1-\varepsilon}, \qquad B_+ \ge \beta' c > Kc^{1-\varepsilon}. \]
Note now that
\[ B(0) = |\dot h(0)| \le Kc^{1-\varepsilon} < B_+, \]
and so we have finally the following bound for $\dot h(t)$:
\[ |\dot h(t)| \le B(t) \le B_- \le \beta c^{1-\varepsilon} \quad \forall t \ge 0. \tag{4.35} \]
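The "easy computation" leading from (4.34) to $B_\pm$ is just the solution of a quadratic inequality. The LaTeX sketch below spells it out; the shorthand $a$ and $b$ is introduced here for readability, and the explicit constant $4(K + K_3)$ is one admissible choice for $\beta$ under these assumptions, not necessarily the one the authors had in mind.

```latex
% Shorthand (not in the original):
%   a = 2 (K + K_3) c^{1-\varepsilon},   b = 2 K_2 / (\tilde\gamma_F c).
% Inequality (4.34) reads B \le a + b B^2, i.e. b B^2 - B + a \ge 0.
\begin{align*}
B_\pm &= \frac{1 \pm \sqrt{1 - 4ab}}{2b},
\qquad
4ab = \frac{16 K_2 (K + K_3)}{\tilde\gamma_F\, c^{\varepsilon}},
\qquad
\frac{1}{2b} = \frac{\tilde\gamma_F\, c}{4 K_2}.
\end{align*}
% For c large, 4ab < 1, the roots are real and distinct, and
\begin{align*}
B_- &= \frac{2a}{1 + \sqrt{1 - 4ab}} \le 2a = 4 (K + K_3)\, c^{1-\varepsilon},
\qquad
B_+ \ge \frac{1}{2b} = \frac{\tilde\gamma_F\, c}{4 K_2}.
\end{align*}
% B(t) is continuous and B(0) < B_+, so B cannot jump across the gap
% (B_-, B_+): it stays below B_- for all t \ge 0.
```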
The condition “c large” is certainly essential in our proofs. Whether the results can also be obtained without this condition is not clear. On an intuitive level, the condition can be understood as follows. Remark that the model contains three intrinsic time scales that are functions of $\rho_1$, $\sigma_2$ and $c$:
(i) the relaxation time $\tau_1 \equiv \gamma^{-1} = c^3\tilde\gamma^{-1}$ defined in (4.3),
(ii) the time $\tau_2 \equiv \frac{2R_1}{v_M}$ the particle needs to cross its own diameter when moving at speed $v_M$,
(iii) the time $\tau_3 \equiv \frac{2R_2}{c}$ the signals in the membranes need to cross the particle.
One has therefore two dimensionless parameters:
\[ \frac{\tau_1}{\tau_3} = c^4\,\frac{1}{2R_2\tilde\gamma}; \qquad \frac{\tau_1}{\tau_2} = c^4\,\frac{w_M}{2R_1\tilde\gamma}. \]
Taking $c$ large is therefore equivalent to $\tau_1 \gg \tau_3$, which expresses the idea that the membranes evacuate the energy deposited by the particle “quickly”. Alternatively, $c$ large is equivalent to $v_M\gamma^{-1} \gg 2R_1$, which is saying that the distance travelled by a particle moving at the characteristic speed $v_M$ during a time $\tau_1$ is much larger than the particle diameter.
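For completeness, the two ratios can be recomputed directly from the three time scales. The sketch assumes the relation $v_M = c\,w_M$, which is consistent with the scaling $w = v/c$ used in this section but is stated here as an assumption rather than quoted from the text.

```latex
% Assumption (consistent with w = v/c in Section 4): v_M = c\, w_M.
\begin{align*}
\frac{\tau_1}{\tau_3}
  &= \frac{c^3}{\tilde\gamma}\cdot\frac{c}{2R_2}
   = \frac{c^4}{2R_2\,\tilde\gamma},
\qquad
\frac{\tau_1}{\tau_2}
   = \frac{c^3}{\tilde\gamma}\cdot\frac{v_M}{2R_1}
   = \frac{c^4\, w_M}{2R_1\,\tilde\gamma}.
\end{align*}
% Both ratios grow like c^4, so a single condition "c large" makes
% \tau_1 much larger than both \tau_2 and \tau_3.
```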
5. The Confined Case
We turn to the case of a confining potential. We make the following assumptions on $V$ and $\sigma_2$:
(C) $\lim_{|q|\to\infty} V(q) = +\infty$;
(W) $\hat\sigma_2(k) \ne 0$ for all $k \in \mathbb{R}^3$.
Then $\rho_2$ is defined as in (4.1). Let $S = \{q^* \in \mathbb{R}^d \mid \nabla V(q^*) = 0\}$ be the set of critical points of $V$. We suppose that $S$ is discrete. For all $q \in \mathbb{R}^d$, we denote by $\phi_q$ the unique solution of $-\Delta_y\phi(x,y) = -\rho_1(x - q)\rho_2(y)$ decaying at infinity. Therefore, $\{(\phi_q, q, 0, 0) \mid q \in S\}$ is the set of equilibrium points for the dynamics.
Theorem 4. Suppose that (H1), (H2), (C) and (W) are satisfied and $n = 3$. Denote by $Y(t) = (\phi(t), q(t), \pi(t), p(t))$ the solution of (1.5)–(1.6). For all $Y_0 \in D$, there exists $q^* \in S$ such that:
\[ \lim_{t\to+\infty} q(t) = q^* \quad\text{and}\quad \lim_{t\to+\infty}\dot q(t) = 0. \tag{5.1} \]
If moreover, $q^*$ is a non-degenerate minimum for $V$, then for all $\eta > 0$ there exists a $c_0 > 0$ such that for any $c > c_0$ and for all $\phi_0$ and $\pi_0$ with compact support, we have for all $t > 0$,
\[ |q(t) - q^*| \le K e^{-\frac{(1-\eta)\tilde\gamma}{2c^3}t}. \tag{5.2} \]
A similar result holds for $\dot q(t)$. One could also, as in [KKS1], study the convergence of $\phi(t)$ to $\phi_{q^*}$, but we shall not do this here. Note that the first part of the theorem does not require $c$ to be large. In fact, using the linearization method of [KKS1], one could prove the convergence is exponential for all $c$ as well. In this way, we do not, however, obtain a very explicit expression for the exponential rate of decay. Our method here shows it to be equal to $\frac{\tilde\gamma}{2c^3}$, confirming the solutions in this model behave very much like those of the phenomenological equation (1.1).
Proof. In order to prove (5.1), i.e. the convergence of $q(t)$ and $\dot q(t)$, we follow the method of [KKS1]. The exponential rate (5.2) will then be obtained by the same techniques as in the case $V(q) = -F\cdot q$. Using the conservation of energy and the hypothesis on $V$, one concludes immediately that $q(t)$, $\dot q(t)$ and $\ddot q(t)$ are bounded. Let $B_R \subset \mathbb{R}^3$ be the ball of radius $R$ centered at $0$. We define:
\[ E_R(t) = \frac{1}{2}\int_{\mathbb{R}^d} dx\int_{B_R} dy\,\bigl(|\nabla_y\phi(x,y,t)|^2 + |\pi(x,y,t)|^2\bigr) + V(q(t)) + \frac{p^2(t)}{2} + \int_{\mathbb{R}^{d+n}} dx\,dy\,\rho_1(x - q(t))\rho_2(y)\phi(x,y,t). \tag{5.3} \]
We take $R \ge \frac{R_2}{c}$. Using (3.11), we can write $\phi$ as $\phi^r + \phi^0$ and in a similar way, $\pi = \pi^r + \pi^0$, where $\pi^r(x,y,t) = \dot\phi^r(x,y,t)$ and $\pi^0(x,y,t) = \dot\phi^0(x,y,t)$. Using this decomposition and the regularity of $\phi_0$ and $\pi_0$, we see that $\phi(x,y,t)$ and $\pi(x,y,t)$ are differentiable. Let us write $n(y) = \frac{y}{|y|}$ and let $d\sigma$ be the surface area element of $\partial B_R$. Then differentiating (5.3), we have
\[
\begin{aligned}
\frac{d}{dt}E_R(t) &= \frac{d}{dt}H(Y(t)) - \frac{d}{dt}\,\frac{1}{2}\int_{\mathbb{R}^d} dx\int_{|y|>R} dy\,\bigl(|\nabla_y\phi(x,y,t)|^2 + |\pi(x,y,t)|^2\bigr) \\
&= \int_{\mathbb{R}^d} dx\int_{\partial B_R} d\sigma(y)\, n(y)\cdot\nabla_y\phi(x,y,t)\,\pi(x,y,t) \\
&= \int_{\mathbb{R}^d} dx\int_{\partial B_R} d\sigma(y)\, n(y)\cdot\bigl(\nabla_y\phi^r\,\pi^r + \nabla_y\phi^r\,\pi^0 + \nabla_y\phi^0\,\pi^r + \nabla_y\phi^0\,\pi^0\bigr)(x,y,t).
\end{aligned}
\]
We bound the three last terms by the Young inequality, and we then integrate in $t$. Hence, for all $T > \frac{R_2}{c}$,
\[ E_R(T + R) - E_R\Bigl(R + \frac{R_2}{c}\Bigr) \le \int_{R + R_2/c}^{T+R} dt\int_{\mathbb{R}^d} dx\int_{\partial B_R} d\sigma(y)\,\Bigl[n(y)\cdot\nabla_y\phi^r\,\pi^r + \frac{1}{4}\bigl(|\nabla_y\phi^r|^2 + |\pi^r|^2\bigr) + 2\bigl(|\nabla_y\phi^0|^2 + |\pi^0|^2\bigr)\Bigr](x,y,t). \]
We know that $E_R(R + \frac{R_2}{c}) \le H(Y(R + \frac{R_2}{c})) = H(Y_0)$, and the hypothesis on the potential and (3.14) tell us that:
\[ E_R(t) \ge H(Y(t)) - \frac{1}{2}\bigl(\|\pi(t)\|_2^2 + \|\phi(t)\|^2\bigr) \ge -H(Y_0) + 2V_0 + 2\langle\rho_1\rho_2;\ \rho_1\Delta^{-1}\rho_2\rangle, \]
where $V_0$ is the infimum of $V$. So we have
\[ E_R(T + R) - E_R\Bigl(R + \frac{R_2}{c}\Bigr) \ge -C, \]
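where $C$ is a constant not depending on $R$, $T$. Hence,
\[ -\int_{R + R_2/c}^{T+R} dt\int_{\mathbb{R}^d} dx\int_{\partial B_R} d\sigma(y)\,\Bigl[n(y)\cdot\nabla_y\phi^r(x,y,t)\,\pi^r(x,y,t) + \frac{1}{4}\bigl(|\nabla_y\phi^r|^2 + |\pi^r|^2\bigr)\Bigr] \le C + 2\int_{R + R_2/c}^{T+R} dt\int_{\mathbb{R}^d} dx\int_{\partial B_R} d\sigma(y)\,\bigl(|\nabla_y\phi^0(x,y,t)|^2 + |\pi^0(x,y,t)|^2\bigr). \tag{5.4} \]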
We first need to bound the right-hand side. This follows from Lemma 3.3 of [KKS1] and the fact that $\kappa \in L^2$ (recall that $\kappa$ is defined in Definition 1):
\[ \int_{R + R_2/c}^{T+R} dt\int_{\mathbb{R}^d} dx\int_{\partial B_R} d\sigma(y)\,\bigl(|\nabla_y\phi^0(x,y,t)|^2 + |\pi^0(x,y,t)|^2\bigr) \le I_0 \]
uniformly in $R$ and $T$. Then, using (4.10), we have
\[ \phi^r(x,y,t) = -\frac{1}{4\pi}\int_{|y-z|\le t} dz\, \frac{\rho_2(z)\,\rho_1(x - q(t - |y-z|))}{|y-z|}. \]
If $|y| = R$, because $\rho_2(z) = 0$ for $|z| \ge \frac{R_2}{c}$, we have for $t > R + \frac{R_2}{c}$,
\[ \phi^r(x,y,t) = -\frac{1}{4\pi}\int_{|z| < R_2/c} dz\, \frac{\rho_2(z)}{|y-z|}\,\rho_1(x - q(t - |y-z|)). \]
Consequently, still for $|y| = R$ and $t > R + \frac{R_2}{c}$,
\[ \pi^r(x,y,t) = \dot\phi^r(x,y,t) = -\frac{1}{4\pi}\int_{|z| < R_2/c} dz\, \frac{\rho_2(z)}{|y-z|}\,\frac{\partial}{\partial t}\rho_1(x - q(t - |y-z|)) \tag{5.5} \]
and
\[
\begin{aligned}
\nabla_y\phi^r(x,y,t) ={}& \frac{1}{4\pi}\int_{|z| < R_2/c} dz\, \frac{\rho_2(z)}{|y-z|}\,\frac{\partial}{\partial t}\rho_1(x - q(t - |y-z|))\,n(y) \\
& + \frac{1}{4\pi}\int_{|z| < R_2/c} dz\, \frac{\rho_2(z)}{|y-z|^2}\,\rho_1(x - q(t - |y-z|))\,n(y-z) \\
& + \frac{1}{4\pi}\int_{|z| < R_2/c} dz\, \frac{\rho_2(z)}{|y-z|}\,\frac{\partial}{\partial t}\rho_1(x - q(t - |y-z|))\,\bigl(n(y-z) - n(y)\bigr).
\end{aligned}
\]
The last two integrals are bounded by $KR^{-2}$ because $\dot q$ is bounded. Hence
\[ \nabla_y\phi^r(x,y,t) = -\pi^r(x,y,t)\,n(y) + O(|y|^{-2}). \]
Since we know that $q(t)$ is bounded by some constant $Q_0$, for $|x| > Q_0 + R_1 = \tilde R_1$, we have $\phi^r(x,y,t) = \pi^r(x,y,t) = 0$. So (5.4) becomes
\[ \int_{R + R_2/c}^{T+R} dt\int_{|x|\le\tilde R_1} dx\int_{\partial B_R} d\sigma(y)\,|\pi^r(x,y,t)|^2 \le K + T\,O(R^{-2}), \tag{5.6} \]
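and using once again (5.5) we have
\[ \int_{R + R_2/c}^{T+R} dt\int_{B_{\tilde R_1}} dx\int_{\partial B_R} d\sigma(y)\,\Bigl|\int_{B_{R_2/c}} dz\, \frac{\rho_2(z)}{|y-z|}\,\frac{\partial}{\partial t}\rho_1(x - q(t - |y-z|))\Bigr|^2 \le K + T\,O(R^{-2}). \]
But $|y - z| \sim R$ and $t + R - |y-z| = t + n(y)\cdot z + O(R^{-1})$, so
\[ \int_{R_2/c}^{T} dt\int_{B_{\tilde R_1}} dx\int_{\partial B_R} d\sigma(y)\,\Bigl|\int_{B_{R_2/c}} dz\, \frac{\rho_2(z)}{|y-z|}\,\frac{\partial}{\partial t}\rho_1\bigl(x - q(t + n(y)\cdot z + O(R^{-1}))\bigr)\Bigr|^2 \le K + T\,O(R^{-2}). \]
After the change of variable $y = R\sigma$, we now take the limit $R \to +\infty$ and so
\[ \int_{R_2/c}^{T} dt\int_{B_{\tilde R_1}} dx\int_{S^2} d\sigma\,\Bigl|\int_{B_{R_2/c}} dz\, \rho_2(z)\,\frac{\partial}{\partial t}\bigl(\rho_1(x - q(t + \sigma\cdot z))\bigr)\Bigr|^2 \le K. \]
This bound holds for all $T$ and so
\[ \int_0^{+\infty} dt\int_{B_{\tilde R_1}} dx\int_{S^2} d\sigma\,\Bigl|\int_{B_{R_2/c}} dz\, \rho_2(z)\,\nabla\rho_1(x - q(t + \sigma\cdot z))\cdot\dot q(t + \sigma\cdot z)\Bigr|^2 < +\infty. \tag{5.7} \]
We define the function
\[ I(x,\sigma,t) = \Bigl|\int_{B_{R_2/c}} dz\, \rho_2(z)\,\nabla\rho_1(x - q(t + \sigma\cdot z))\cdot\dot q(t + \sigma\cdot z)\Bigr|^2, \]
which is differentiable in $x$, $\sigma$ and uniformly Lipschitz in $t$ because $\dot q$ and $\ddot q$ are bounded. As a result
\[ \lim_{t\to+\infty} I(x,\sigma,t) = 0 \tag{5.8} \]
uniformly in $x \in B_{\tilde R_1}$ and $\sigma \in S^2$. We fix $\sigma$ and $x$. We take a basis of $\mathbb{R}^3$ such that $\sigma = e_1$, and we define $\bar\rho_2(z_1) = \int dz_2\,dz_3\,\rho_2(z_1,z_2,z_3)$ and $s = \sigma\cdot z$. Then we have
\[
\begin{aligned}
I(x,\sigma,t) &= \Bigl|\int ds\,\bar\rho_2(s)\,\nabla\rho_1(x - q(t+s))\cdot\dot q(t+s)\Bigr|^2 \\
&= \Bigl|\int ds\,\bar\rho_2(t-s)\,\nabla\rho_1(x - q(s))\cdot\dot q(s)\Bigr|^2 \\
&= \bigl|\bar\rho_2 * (\nabla\rho_1(x - q)\cdot\dot q)(t)\bigr|^2.
\end{aligned}
\]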
Then (5.8) leads to
\[ \lim_{t\to+\infty}\bar\rho_2 * (\nabla\rho_1(x - q)\cdot\dot q)(t) = 0. \]
Hence (W) and Pitt’s extension to Wiener’s Tauberian theorem [R] imply
\[ \lim_{t\to+\infty}\nabla\rho_1(x - q(t))\cdot\dot q(t) = 0 \]
uniformly in $x$. So we have
\[ 0 = \lim_{t\to+\infty}\sup_{x\in B_{\tilde R_1}}\bigl|\nabla\rho_1(x - q(t))\cdot\dot q(t)\bigr| = \lim_{t\to+\infty}\sup_{x\in\mathbb{R}^d}\bigl|\nabla\rho_1(x - q(t))\cdot\dot q(t)\bigr| \]
because $\nabla\rho_1(x - q(t)) = 0$ if $|x| > \tilde R_1$. Hence
\[ \lim_{t\to+\infty}\nabla\rho_1(x)\cdot\dot q(t) = 0 \quad \forall x \in \mathbb{R}^d, \]
which proves that $\dot q(t)$ tends to zero (one can take $x = re_i$, where $r$ is such that $\rho_1'(re_i) \ne 0$ and $(e_1, \dots, e_d)$ is any orthonormal basis). It remains to show that $q(t)$ converges to some $q^*$ which satisfies $\nabla V(q^*) = 0$. Remember that $\phi_q$ is the stationary solution of (1.5) corresponding to $q(t) = q$. Let $A = \{Y_q = (\phi_q, q, 0, 0) \mid q \in \mathbb{R}^d, |q| \le Q_0\}$. $A$ is compact in $E$. Finally, we denote by $\|\cdot\|_R$ the $L^2$ norm restricted to the ball of radius $R$ and $|Y|^2_{E,R} = \|\nabla_y\phi\|_R^2 + |q|^2 + \|\pi\|_R^2 + |p|^2$. We first prove
\[ \inf_{Y_q\in A}|Y(t) - Y_q|^2_{E,R} = |p(t)|^2 + \|\pi(t)\|_R^2 + \inf_{|q|\le Q_0}\bigl(\|\nabla_y(\phi(t) - \phi_q)\|_R^2 + |q(t) - q|^2\bigr) \xrightarrow[t\to+\infty]{} 0. \tag{5.9} \]
We know that $|p(t)| \to 0$ as $t \to +\infty$. Then (5.5) implies that $\|\pi^r(t)\|_R \to 0$ as $t \to +\infty$. The bound on $\|\pi^0(t)\|_R$ (see Lemma 3.3 of [KKS1]) then shows that the same result holds for $\|\pi(t)\|_R$. To estimate the infimum over $q$ in (5.9) we take $q(t)$ for $q$. Then the last term vanishes and we have to control
\[ \nabla_y\bigl(\phi^r(x,y,t) - \phi_{q(t)}(x,y)\bigr) = \nabla_y\,\frac{-1}{4\pi}\int dz\, \frac{\rho_2(y-z)}{|z|}\,\bigl(\rho_1(x - q(t-|z|)) - \rho_1(x - q(t))\bigr) \]
for $|y| \le R$, the term with $\nabla_y\phi^0$ being controlled using once again Lemma 3.3 of [KKS1]. The difference $\rho_1(x - q(t-|z|)) - \rho_1(x - q(t))$ can be written using an integral depending only on $\dot q(s)$ for $s \in [t - (R + \frac{R_2}{c}), t]$, which tends to zero as $t$ goes to infinity uniformly in $(x,y) \in B_R$. All this proves (5.9). Given a solution $Y(t)$ of (3.1), we call $B$ the set of all $\bar Y \in E$ such that there exists some sequence $t_n \to +\infty$ with $Y(t_n) \to \bar Y$ in the semi-norm $|\cdot|_{E,R}$ for all $R$. The continuity of $W^t$ tells us that $B$ is an invariant set. Then, (5.9) tells us that $B \subset A$. So, for $\bar Y \in B$ there exists a $C^2$ curve $t \mapsto \tilde q(t) \in \mathbb{R}^d$ such that $W^t\bar Y = Y_{\tilde q(t)}$. But $W^t\bar Y$ is a solution of (3.1) so we must have $\dot{\tilde q}(t) = 0$, hence $\tilde q(t) = q^*$ with $\nabla V(q^*) = 0$ and $q^* \in S$. Therefore $\bar Y = Y_{q^*}$ and $B \subset \{Y_q, q \in S\}$.
We now prove that $q(t) \to q^*$. Suppose there exist $R_0, \varepsilon > 0$ and a sequence $t_n \to +\infty$ such that
\[ \inf_{q\in S}|Y(t_n) - Y_q|_{E,R_0} \ge \varepsilon. \tag{5.10} \]
But (5.9) and the compactness of $A$ imply that there exists $\bar Y \in A$ and a subsequence $t_{n'}$ such that $Y(t_{n'}) \to \bar Y$ in the norm $|\cdot|_{E,R}$ for all $R$. Then, by definition, $\bar Y \in B$. But (5.10) is then a contradiction to $B \subset \{Y_q, q \in S\}$. So
\[ \inf_{q\in S}|q(t) - q| \to 0, \]
and because $S$ is discrete, there exists $q^* \in S$ such that $q(t) \to q^*$. We have therefore proven (5.1). To prove (5.2), we now suppose that $q^*$ is a non-degenerate minimum for $V$. Because of the translational invariance of the interaction term, we can suppose that $q^* = 0$. Now, the computation leading up to (4.20) in the particular case $v = 0$, so that $h(t) = q(t)$, yields:
\[
\begin{aligned}
\ddot q(t) ={}& -\nabla V(q(t)) - \frac{\tilde\gamma}{c^3}\,\dot q(t) - \frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\Bigl(\int_{t-|z|}^{t}(t-|z|-s)\,\ddot q(s)\,ds\cdot\nabla\rho_1(x)\Bigr)\nabla\rho_1(x) \\
& - \frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\frac{1}{2}\,\mathrm{Hess}\rho_1(\tilde x_{t,|z|})\Bigl(\int_{t-|z|}^{t}\dot q(s)\,ds;\ \int_{t-|z|}^{t}\dot q(s)\,ds\Bigr)\nabla\rho_1(x) + A_2(t).
\end{aligned}
\tag{5.11}
\]
Moreover, $\nabla V(q(t)) = W\cdot q(t) + r(q(t))$, where $W$ is the Hessian matrix of $V$ at $q = 0$ and $r(q) = o(|q|)$. Remark that $A_2(t)$ has compact support. We now define $Q(t) = (q(t), \dot q(t)) \in \mathbb{R}^{2d}$ and $\tilde W$ the $2d \times 2d$ matrix:
\[ \tilde W = \begin{pmatrix} O & I \\ -W & -\frac{\tilde\gamma}{c^3} I \end{pmatrix}. \]
Since $0$ is a non-degenerate minimum for $V$, $W$ is a diagonalizable positive definite matrix. One should remark that $\tilde W$ is diagonalizable as well with eigenvalues $\lambda_k = -\frac{\tilde\gamma}{2c^3} + i\alpha_k$, and so for all $t$ in $\mathbb{R}$, $\|e^{\tilde W t}\| = e^{-\frac{\tilde\gamma}{2c^3}t}$. We rewrite (5.11) as
\[ \dot Q(t) = \tilde W Q(t) + \psi(t, Q, \dot Q), \]
where $\psi$ is a function we will control in terms of $|\ddot q(t)|$, $|\dot q(t)|$ and $|q(t)|$. Defining $X(t) = e^{-\tilde W t}Q(t)$ we have:
\[ \dot X(t) = e^{-\tilde W t}\,\psi\bigl(t,\ e^{\tilde W t}X(t),\ \tilde W e^{\tilde W t}X(t) + e^{\tilde W t}\dot X(t)\bigr), \]
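\[
\begin{aligned}
|\dot X(t)| \le{}& \frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\int_{t-|z|}^{t}\bigl|t-|z|-s\bigr|\,e^{\frac{\tilde\gamma}{2c^3}(t-s)}\bigl(|\tilde W X(s)| + |\dot X(s)|\bigr)\,ds\ |\nabla\rho_1(x)|^2 \\
& + \frac{1}{4\pi}\int dx\int dy\int_{|z|\le t} dz\, \frac{\rho_2(y-z)\rho_2(y)}{|z|}\,\frac{1}{2}\,|\mathrm{Hess}\rho_1(\tilde x_{t,|z|})|\int_{t-|z|}^{t} e^{\frac{\tilde\gamma}{2c^3}(t-s)}|X(s)|\,ds\int_{t-|z|}^{t}|Q(s)|\,ds\ |\nabla\rho_1(x)| \\
& + e^{\frac{\tilde\gamma}{2c^3}t}|\tilde A(t)| + e^{\frac{\tilde\gamma}{2c^3}t}|r(q(t))|.
\end{aligned}
\tag{5.12}
\]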
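The eigenvalue claim for $\tilde W$ made before (5.12) can be checked with a short computation. The sketch below writes $w_k > 0$ for the eigenvalues of $W$ and $e_k$ for the corresponding eigenvectors; these symbols, and the explicit expression for $\alpha_k$, are introduced here for illustration and assume $c$ is large enough for the square root to be real.

```latex
% Notation (not in the original): W e_k = w_k e_k with w_k > 0.
% Try an eigenvector of the form (e_k, \lambda e_k):
\begin{align*}
\tilde W \begin{pmatrix} e_k \\ \lambda e_k \end{pmatrix}
  &= \begin{pmatrix} \lambda e_k \\ -\bigl(w_k + \tfrac{\tilde\gamma}{c^3}\lambda\bigr) e_k \end{pmatrix}
   = \lambda \begin{pmatrix} e_k \\ \lambda e_k \end{pmatrix}
\iff
\lambda^2 + \frac{\tilde\gamma}{c^3}\,\lambda + w_k = 0, \\
\lambda &= -\frac{\tilde\gamma}{2c^3} \pm i\,\alpha_k,
\qquad
\alpha_k = \sqrt{w_k - \frac{\tilde\gamma^2}{4c^6}}
\quad\text{(real as soon as } 4c^6 w_k > \tilde\gamma^2\text{)}.
\end{align*}
% All 2d eigenvalues then share the real part -\tilde\gamma/(2c^3), which is
% the decay rate appearing in (5.2).
```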
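Let $\varepsilon > 0$. Since $r(q) = o(|q|)$, there exists $\delta > 0$ such that for all $|q| < \delta$, $|r(q)| < \varepsilon|q| < \varepsilon|Q|$. We define $\eta = \min(\delta, \varepsilon)$. Moreover, we already know that $Q(t) \to 0$, so there exists $T$ such that for all $t \ge T$, $|Q(t)| \le \eta$. Using the fact that the evolution of the solution $Y(t)$ of (3.1) is given by a continuous linear group and that $Y(T)$ satisfies the same conditions as $Y_0$, we can suppose $T = 0$. So we have
\[ \forall t \ge 0, \quad |Q(t)| < \eta < \varepsilon \quad\text{and}\quad |r(q)| < \varepsilon|Q|. \tag{5.13} \]
As in Sect. 4, we define
\[ M(t) = \sup_{0\le s\le t}|X(s)| \quad\text{and}\quad N(t) = \sup_{0\le s\le t}|\dot X(s)|. \]
Remembering once more that $\rho_2$ depends on $c$ via (4.1) and using (5.12) and (5.13), we have for all $0 \le s \le t$,
\[ |\dot X(s)| \le \frac{K_1}{c^4}\,e^{\tilde\gamma R_2/c^4}\,\|\tilde W\|\,M(t) + \frac{\varepsilon K_2}{c^4}\,e^{\tilde\gamma R_2/c^4} M(t) + \frac{K_1}{c^4}\,e^{\tilde\gamma R_2/c^4} N(t) + \sup_{0\le\tau\le t}\bigl(\varepsilon\,e^{\frac{\tilde\gamma}{2c^3}\tau}|Q(\tau)|\bigr) + \sup_{0\le\tau\le t}\bigl(e^{\frac{\tilde\gamma}{2c^3}\tau}|\tilde A(\tau)|\bigr). \]
Remark that $e^{\frac{\tilde\gamma}{2c^3}s}|Q(s)| = |e^{-\tilde W s}Q(s)| = |X(s)|$ and $\sup_{0\le\tau\le t} e^{\frac{\tilde\gamma}{2c^3}\tau}|\tilde A(\tau)| \le K_3$, so taking the supremum over all $s \in [0,t]$ in the left-hand side, we have
\[ N(t) \le \Bigl(\frac{K_1}{c^4}\,\|\tilde W\|\,e^{\tilde\gamma R_2/c^4} + \frac{\varepsilon K_2}{c^4}\,e^{\tilde\gamma R_2/c^4} + \varepsilon\Bigr)M(t) + \frac{K_1}{c^4}\,e^{\tilde\gamma R_2/c^4} N(t) + K_3, \]
and so
\[ \Bigl(1 - \frac{K_1}{c^4}\,e^{\tilde\gamma R_2/c^4}\Bigr)N(t) \le \Bigl(\frac{K_1}{c^4}\,\|\tilde W\|\,e^{\tilde\gamma R_2/c^4} + \frac{\varepsilon K_2}{c^4}\,e^{\tilde\gamma R_2/c^4} + \varepsilon\Bigr)M(t) + K_3. \]
We call $(k_c)^{-1}$ the factor of $N(t)$. We can choose $\varepsilon$ as small as we want, so the factor of $M(t)$ can be bounded by $\frac{K}{c^4}$. Then, the same computation as in the last part of the proof of Theorem 2 leads to
\[ M(t) \le \Bigl(M(0) + \frac{K_3 c^4}{K}\Bigr)e^{\frac{k_c K}{c^4}t}, \]
and finally we have
\[ |Q(t)| \le \Bigl(M(0) + \frac{K_3 c^4}{K}\Bigr)e^{\bigl(\frac{k_c K}{c^4} - \frac{\tilde\gamma}{2c^3}\bigr)t}, \]
which is the announced result.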
Proof of Theorem 3. In the first part, we will follow the proof of Theorem 4 in order to prove that $\dot q(t) \to 0$. The only thing we have to worry about in the present case is that, unlike in the case of Theorem 4, $q(t)$ is not a priori bounded. However, $\dot q(t)$ is bounded because $V \equiv 0$. In order to obtain the exponential decay rate, we will then make the same computations as in the proof of Theorem 2, except that we will not use Lemma 1, but the fact that we already know that $\dot q(t) \to 0$. We first prove that $\dot q(t) \to 0$. We follow the computation of the proof of Theorem 4. Since $\dot q(t)$ is bounded, if $t$ belongs to $\bigl[R + \frac{R_2}{c};\ R + T\bigr]$, if $|y| = R$ and $|z| \le \frac{R_2}{c}$, we have
\[ |q(t - |y-z|)| \le C\Bigl(T + \frac{R_2}{c}\Bigr) \]
for some constant $C > 0$. With that estimate, (5.6) clearly becomes
\[ \int_{R + R_2/c}^{R+T} dt\int_{|x|\le C(T + R_2/c) + R_1} dx\int_{\partial B_R} d\sigma(y)\,|\pi^r(x,y,t)|^2 \le K + T^{d+1}\,O(R^{-2}). \]
Then, (5.7) becomes
\[ \int_0^{+\infty} dt\int_{\mathbb{R}^d} dx\int_{S^2} d\sigma\,\Bigl|\int_{B_{R_2/c}} dz\,\rho_2(z)\,\nabla\rho_1(x - q(t + \sigma\cdot z))\cdot\dot q(t + \sigma\cdot z)\Bigr|^2 < +\infty, \]
and the end of the proof follows identically. Now that we know that $\dot q(t) \to 0$, we can control $|q(t) - q_\infty|$ in exactly the same way as in the proof of Theorem 2, but instead of using Lemma 1, we remark that there exists $T > 0$ such that for all $t \ge T$, $|\dot q(t)| < 1$. Using the fact that the evolution of the solution $Y(t)$ of (3.1) is given by a continuous linear group and that $Y(T)$ satisfies the same conditions as $Y_0$, we can suppose $T = 0$. So, with the notations of Sect. 4 one has, instead of (4.24),
\[ N(t) \le \eta\,\frac{\tilde\gamma}{c^3}\,M(t) + k_c\sup_{0\le s\le t} e^{\frac{\tilde\gamma}{c^3}s}|A_2(s)|. \]
The end of the proof is then similar.
Acknowledgements. The authors take pleasure in thanking V. Jaksic, H. Leschke, C.A. Pillet, and S.Teufel for helpful comments, encouraging remarks and useful references. They also thank one of the referees for his careful reading of the manuscript and his many useful comments.
References
[B] Brezis, H.: Analyse fonctionnelle. Théorie et applications. Paris: Masson, 1993
[CEFM] Castella, F., Erdős, L., Frommlet, F., Markovich, P.A.: Fokker-Planck equations as scaling limits of reversible quantum systems. J. Stat. Phys. 100(3–4), 543–601 (2000)
[CH] Courant, R., Hilbert, D.: Methods of mathematical physics (vol 2), Interscience (1962)
[FLO] Ford, G.W., Lewis, J.T., O’Connell, R.F.: Independent oscillator model of a heat bath: exact diagonalization of the Hamiltonian. J. Stat. Phys. 53(1/2), 439–455 (1988)
[H] Hale, J.: Ordinary Differential Equations. Robert E. Krieger Publishing Company, 1969
[J] John, F.: Partial Differential Equations. Berlin-Heidelberg-New York: Springer-Verlag, 1971
[KKS1] Komech, A., Kunze, M., Spohn, H.: Long-time asymptotics for a classical particle interacting with a scalar wave field. Comm. Partial Differential Equations 22, 307–335 (1997)
[KKS2] Komech, A., Kunze, M., Spohn, H.: Effective dynamics for a mechanical particle coupled to a wave field. Commun. Math. Phys. 203, 1–19 (1999)
[KS] Komech, A., Spohn, H.: Soliton-like asymptotics for a classical particle interacting with a scalar wave field. Nonlinear Anal. 33, 13–24 (1998)
[LM] Lions, J.L., Magenes, E.: Problèmes aux limites non homogènes (vol 1), Dunod, 1968
[R] Rudin, W.: Functional Analysis. New York: McGraw Hill, 1973
Communicated by H. Spohn
Commun. Math. Phys. 229, 543–564 (2002) Digital Object Identifier (DOI) 10.1007/s00220-002-0706-3
Communications in
Mathematical Physics
On the Reality of the Eigenvalues for a Class of PT-Symmetric Oscillators
K. C. Shin
Department of Mathematics, University of Illinois, Urbana, IL 61801, USA. E-mail: [email protected]
Received: 16 January 2002 / Accepted: 1 May 2002
Published online: 6 August 2002 – © Springer-Verlag 2002
Abstract: We study the eigenvalue problem $-u''(z) - [(iz)^m + P(iz)]u(z) = \lambda u(z)$ with the boundary conditions that $u(z)$ decays to zero as $z$ tends to infinity along the rays $\arg z = -\frac{\pi}{2} \pm \frac{2\pi}{m+2}$, where $P(z) = a_1 z^{m-1} + a_2 z^{m-2} + \cdots + a_{m-1}z$ is a real polynomial and $m \ge 2$. We prove that if for some $1 \le j \le \frac{m}{2}$ we have $(j - k)a_k \ge 0$ for all $1 \le k \le m - 1$, then the eigenvalues are all positive real. We then sharpen this to a larger class of polynomial potentials. In particular, this implies that the eigenvalues are all positive real for the potentials $\alpha iz^3 + \beta z^2 + \gamma iz$ when $\alpha, \beta, \gamma \in \mathbb{R}$ with $\alpha \ne 0$ and $\alpha\gamma \ge 0$, and with the boundary conditions that $u(z)$ decays to zero as $z$ tends to infinity along the positive and negative real axes. This verifies a conjecture of Bessis and Zinn-Justin.
1. Introduction
1.1. The main results. We are considering the eigenvalue problem
\[ -u''(z) - [(iz)^m + P(iz)]u(z) = \lambda u(z) \tag{1} \]
with the boundary conditions that $u(z)$ decays to zero as $z$ tends to infinity along the rays $\arg z = -\frac{\pi}{2} \pm \frac{2\pi}{m+2}$, where $m \ge 2$, $\lambda \in \mathbb{C}$ and $P$ is a real polynomial of the form
\[ P(z) = a_1 z^{m-1} + a_2 z^{m-2} + \cdots + a_{m-1}z, \quad\text{with all } a_k \in \mathbb{R}. \tag{2} \]
If a non-constant function $u$ along with a complex number $\lambda$ solves (1) with the boundary conditions, then we call $u$ an eigenfunction and $\lambda$ an eigenvalue. The boundary conditions here are those considered by Bender and Boettcher [1]. Note that the boundary conditions imply for $m = 3$ that $u$ decays rapidly to zero as $z$ tends to infinity along the positive and negative real axes (as will become clear after Lemma 4), and so $u \in L^2(\mathbb{R})$. However, for the cases $m \ge 4$, the boundary conditions do not imply that $u$ decays to
zero as $z$ tends to infinity along the positive and negative real axes. We will discuss this after Lemma 4 in Sect. 2. Before we state our main theorem, we first introduce some known facts by Sibuya [21] about the eigenvalues $\lambda$ of (1), facts that hold even when $a_k \in \mathbb{C}$.
Proposition 1. The eigenvalues $\lambda_k$ of (1) have the following properties.
(I) Eigenvalues are discrete.
(II) All eigenvalues are simple.
(III) Infinitely many eigenvalues exist.
(IV) Eigenvalues have the following asymptotic expression:
\[ \lambda_k = \left(\frac{\Gamma\bigl(\frac{3}{2} + \frac{1}{m}\bigr)\sqrt{\pi}\,\bigl(k - \frac{1}{2}\bigr)}{\sin\bigl(\frac{\pi}{m}\bigr)\,\Gamma\bigl(1 + \frac{1}{m}\bigr)}\right)^{\frac{2m}{m+2}}[1 + o(1)] \quad\text{as } k \text{ tends to infinity},\ k \in \mathbb{N}, \tag{3} \]
where the error term $o(1)$ could be complex.
We will give precise references for Proposition 1 after Proposition 5 in Sect. 2. In this paper, we will prove the following theorem that says that Eq. (1) with a polynomial potential in a certain class has positive real eigenvalues only.
Theorem 2. Let the $a_k$ be the coefficients of the real polynomial $P(z) = a_1 z^{m-1} + a_2 z^{m-2} + \cdots + a_{m-1}z$. If for some $1 \le j \le \frac{m}{2}$ we have $(j - k)a_k \ge 0$ for all $k$, then the eigenvalues of (1) are all positive real.
Corollary 3. In particular, with $m = 3$ the eigenvalues $\lambda$ of
\[ -u''(z) + (iz^3 + \beta z^2 + \gamma iz)u(z) = \lambda u(z), \qquad u(\pm\infty + 0i) = 0, \]
are all positive real, provided $\beta \in \mathbb{R}$ and $\gamma \ge 0$.
Proof. This is a special case of Theorem 2 with $m = 3$, $j = 1$ and $P(z) = \beta z^2 - \gamma z$.
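The one-line proof of Corollary 3 can be unpacked as follows. The LaTeX sketch checks that the stated $P$ reproduces the potential of the corollary and that the hypothesis of Theorem 2 holds with $j = 1$; it uses only the definitions in (1) and (2).

```latex
% With m = 3 and P(z) = \beta z^2 - \gamma z, i.e. a_1 = \beta, a_2 = -\gamma:
\begin{align*}
(iz)^3 &= -i z^3, \qquad
P(iz) = \beta (iz)^2 - \gamma (iz) = -\beta z^2 - i\gamma z, \\
-\bigl[(iz)^3 + P(iz)\bigr] &= i z^3 + \beta z^2 + i\gamma z,
\end{align*}
% which is the potential in Corollary 3. The condition of Theorem 2 with
% j = 1 (allowed since 1 \le j \le m/2 = 3/2) reads
\begin{align*}
(j-1)\,a_1 &= 0 \cdot \beta = 0 \ge 0, \qquad
(j-2)\,a_2 = (-1)(-\gamma) = \gamma \ge 0,
\end{align*}
% so it holds for every real \beta exactly when \gamma \ge 0.
```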
Some kind of lower bound on $\gamma$ is necessary in the corollary, since Delabaere et al. [9, 10] studied the potential $iz^3 + \gamma iz$ and showed that a pair of non-real eigenvalues develops for large negative $\gamma$. Handy et al. [14, 15] showed that the same potential admits a pair of non-real eigenvalues for small negative values of $\gamma \approx -3.0$.
Remark. By rescaling, the conclusion of Corollary 3 holds for the potential $\alpha iz^3 + \beta z^2 + \gamma iz$ when $\alpha \in \mathbb{R} - \{0\}$, $\beta \in \mathbb{R}$ and $\alpha\gamma \ge 0$.
This paper is organized as follows. In the rest of the Introduction, we will briefly mention some earlier work. Then in the next section, we state some known facts about Eq. (1) and examine further properties. In Sect. 3, we prove Theorem 2, and in Sect. 4 we extend Theorem 2. Finally, in the last section we discuss some open problems for further research.
Reality of Eigenvalues for a Class of PT -Symmetric Oscillators
1.2. Motivation and earlier work. Around 1995, Bessis and Zinn-Justin conjectured that the eigenvalues λ of
\[
\left[ -\frac{d^2}{dz^2} - \alpha (iz)^3 + \beta z^2 \right] u(z) = \lambda u(z), \quad \text{for } \alpha \in \mathbb{R} - \{0\},\ \beta \in \mathbb{R}, \tag{4}
\]
are all positive real. And later Bender and Boettcher [1] generalized the BZJ conjecture; that is, they argued that the eigenvalues λ of
\[
\left[ -\frac{d^2}{dz^2} - (iz)^m + \beta z^2 \right] u(z) = \lambda u(z), \quad \text{for } \beta \in \mathbb{R}, \tag{5}
\]
are all positive real when β ≥ 0. Notice this follows for β ≤ 0 by Theorem 2 with $P(z) = \beta z^2$. The case β > 0 is open, except for m = 3, 4 which are covered by Theorem 2.

Recently, Dorey et al. [11, 12] have studied the following problem:
\[
\left[ -\frac{d^2}{dz^2} - (iz)^{2M} - \alpha (iz)^{M-1} + \frac{l(l+1)}{z^2} \right] u(z) = \lambda u(z), \tag{6}
\]
with the boundary conditions the same as those of (1), and M, α, l being all real. They proved that for M > 1, α < M + 1 + |2l + 1|, the eigenvalues are all real, and for M > 1, α < M + 1 − |2l + 1|, they are all positive. A special case of (6) is the potential $i z^3$ (when $M = \frac{3}{2}$, α = l = 0), which is the β = 0 version of the BZJ conjecture, but their results do not cover the β ≠ 0 version. (Suzuki [23] also studied the whole l = 0 version of (6) under different boundary conditions.)

The proof of our main theorem, Theorem 2, has two parts. The first part follows closely the method of Dorey et al. [12, 13], developing functional equations for spectral determinants, expressing them in factorized forms and then studying an “associated” eigenvalue problem. We also introduce a symmetry lemma that is required by our more complicated potentials. The second part builds on earlier work of the author in [20], estimating eigenvalues of the “associated” problem by integrating over suitably chosen half-lines in the complex plane. Of course both this paper and [12, 13] are indebted to the work of Sibuya [21]. Note that our result Corollary 3 proves the full BZJ conjecture; that is, the eigenvalues λ of (4) are all positive real.
Also Theorem 2 contains the polynomial potential case (l = 0, M ∈ N) of problem (6), though only with α ≤ 0, whereas Dorey et al. handle α < M. (Our proof in the case α ≤ 0 can be seen to reduce to that of Dorey et al.) In Theorem 12 we do manage to handle the case 0 < α < M, by using also the harmonic oscillator inequality, which is a different approach from that used in [12, p. 5701].

In a related direction, Bender and Boettcher [2] found a family of the following quasi-exactly solvable quartic potential problems:
\[
\left[ -\frac{d^2}{dz^2} - (iz)^4 + 2\alpha (iz)^3 + (\alpha^2 - 2\beta)(iz)^2 - 2(\alpha\beta - J)(iz) \right] u(z) = \lambda u(z) \tag{7}
\]
with the same boundary conditions as those of (1), where α, β ∈ R and J ∈ N. In their paper [2], the positive integer J corresponds to the number of the eigenfunctions that can be found exactly in closed form. However, for the purpose of studying the reality of the eigenvalues, we can allow J ∈ R. Our results in Theorem 2 confirm that if for
K.C. Shin
any J ∈ R, we have either αβ ≥ J and α ≥ 0, or αβ ≥ J and $2\beta \ge \alpha^2$, then the eigenvalues of (7) are all positive real.

The above Hamiltonians are not Hermitian in general. However, according to Bender and Weniger [6], Hermiticity of traditional Hamiltonians is a useful mathematical constraint that guarantees real eigenvalues, rather than a physical requirement. All Hamiltonians mentioned above are the so-called PT-symmetric Hamiltonians. A PT-symmetric Hamiltonian is a Hamiltonian which is invariant under the product of the parity operation $\mathcal{P}$ (: $z \to -\bar{z}$) (an upper bar denotes the complex conjugate) and the time reversal operation $\mathcal{T}$ (: $i \to -i$). These PT-symmetric Hamiltonians have arisen in recent years in a number of physics papers, see [7, 14, 15, 18, 19, 24] and other references mentioned above, which support that some PT-symmetric Hamiltonians have real eigenvalues only. In general the PT-symmetric Hamiltonians are not Hermitian and hence the reality of eigenvalues is not obviously guaranteed. But the important work of Dorey et al. [12], and results in this paper, prove rigorously that some PT-symmetric Hamiltonians indeed have real eigenvalues only.

We also know that if $H = -\frac{d^2}{dz^2} + V(z)$ is PT-symmetric, then $\overline{V(-\bar{z})} = V(z)$ and so Re V(z) is an even function and Im V(z) is an odd function. Hence if V(z) is a polynomial, then V(z) = Q(iz) for some real polynomial Q. Certainly (1) is a PT-symmetric Hamiltonian.

As a final remark of the Introduction, we mention that some problems of type (1) were studied on the real line by Simon [22] and Caliceti et al. [8]. They proved compactness of the resolvent and discreteness of spectrum for $-\frac{d^2}{dx^2} + x^2 + \beta x^n$, where $\beta \in \mathbb{C} - \mathbb{R}^-$, n = 3, 4, 5, . . . . Regarding the reality of eigenvalues, Caliceti et al. [8] showed that eigenvalues for $-\frac{d^2}{dx^2} + x^2 + \beta x^{2n+1}$ are real if β is small enough.

2. Properties of the Solutions

In this section we will introduce some definitions and known facts related to Eq.
(1). One of our main tasks is to identify the eigenvalues as being the zeros of a certain entire function, in Lemma 7. But first, we rotate Eq. (1) as follows because some known facts, which are related to our argument throughout, are directly available for this rotated equation. Let u be a solution of (1) and let v(z) = u(−iz). Then v solves
\[
-v''(z) + [z^m + P(z) + \lambda]\, v(z) = 0, \tag{8}
\]
where m ≥ 2 and P is a real polynomial (possibly, P ≡ 0) of the form $P(z) = a_1 z^{m-1} + a_2 z^{m-2} + \cdots + a_{m-1} z$. Next we will rotate the boundary conditions. We state them in a more general context by using the following.

Definition. The Stokes sectors $S_k$ of Eq. (8) are
\[
S_k = \left\{ z \in \mathbb{C} : \left| \arg z - \frac{2k\pi}{m+2} \right| < \frac{\pi}{m+2} \right\} \quad \text{for } k \in \mathbb{Z}.
\]
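As an illustration of the definition, the membership test for a ray in a Stokes sector can be sketched as follows. This is a hedged example, not part of the paper; the helper name `in_stokes_sector` is hypothetical, and the ray is identified by its argument in radians.

```python
import math


def in_stokes_sector(arg_z: float, k: int, m: int) -> bool:
    """True if the ray with argument arg_z lies in the open sector S_k of Eq. (8)."""
    center = 2 * k * math.pi / (m + 2)
    # Reduce the angular distance to the center ray to the interval [-pi, pi].
    diff = math.remainder(arg_z - center, 2 * math.pi)
    return abs(diff) < math.pi / (m + 2)


# For m = 3 the positive imaginary axis (arg z = pi/2) lies in S_1.
print(in_stokes_sector(math.pi / 2, 1, 3))
# For m = 5 it does not.
print(in_stokes_sector(math.pi / 2, 1, 5))
```

Rays on the boundary of a sector are deliberately not tested here, since floating-point rounding makes the strict inequality unreliable in that case.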
See Fig. 1. It is known that every non-constant solution of (8) either decays to zero or blows up exponentially, in each Stokes sector Sk . That is:
Fig. 1. The Stokes sectors for m = 3. The dashed rays are $\arg z = \pm\frac{\pi}{5}, \pm\frac{3\pi}{5}, \pi$

Lemma 4 ([16, §7.4]). For each k ∈ Z, every solution v of (8) (with no boundary conditions imposed) is asymptotic to
\[
(\text{const.})\, z^{-\frac{m}{4}} \exp\left[ \pm \int^z \left( \xi^m + P(\xi) + \lambda \right)^{\frac{1}{2}} d\xi \right] \tag{9}
\]
as z → ∞ in every closed sector of $S_k$.

The asymptotic expressions imply that for each k, v(z) either decays to 0 or blows up, as z approaches infinity in closed subsectors of $S_k$. In particular, Lemma 4 implies that if v(z) → 0 as z → ∞ along one ray in $S_k$, then v(z) → 0 as z → ∞ along every ray in $S_k$. Likewise, if v(z) → ∞ as z → ∞ along one ray in $S_k$, then v(z) → ∞ as z → ∞ along every ray in $S_k$. Thus the boundary conditions on u in (1) mean that
\[
v \text{ decays in } S_{-1} \cup S_1. \tag{10}
\]
(Note that the rotation z → iz maps the two rays $\arg z = -\frac{\pi}{2} \pm \frac{2\pi}{m+2}$, along which we impose the boundary conditions for u, onto the center rays of $S_{-1}$ and $S_1$.)

The above observation shows that the boundary conditions for m = 3 are equivalent to having v decaying along both ends of the imaginary axis, since the Stokes sector $S_1$ contains the positive imaginary axis, and the Stokes sector $S_{-1}$ contains the negative imaginary axis. However, if m ≥ 4 then the open Stokes sectors $S_1$ and $S_{-1}$ do not contain the positive or negative imaginary axis, and hence the conditions (10) do not imply that v decays as z tends to infinity along both ends of the imaginary axis. Thus in terms of u, the boundary conditions that u decays to zero as z tends to infinity along the two rays $\arg z = -\frac{\pi}{2} \pm \frac{2\pi}{m+2}$ mean that $u \in L^2(\mathbb{R})$ for m = 3, but this is not guaranteed for m ≥ 4.

Next we will introduce Sibuya’s results, but first we define a sequence of complex numbers $b_j$ in terms of the $a_k$ and λ, as follows. For λ ∈ C fixed, we expand
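\[
\begin{aligned}
\left( 1 + a_1 z^{-1} + a_2 z^{-2} + \cdots + a_{m-1} z^{1-m} + \lambda z^{-m} \right)^{1/2}
&= 1 + \sum_{k=1}^{\infty} \binom{\frac{1}{2}}{k} \left( a_1 z^{-1} + a_2 z^{-2} + \cdots + a_{m-1} z^{1-m} + \lambda z^{-m} \right)^k \\
&= 1 + \sum_{j=1}^{\infty} \frac{b_j(a, \lambda)}{z^j}, \quad \text{for large } |z|,
\end{aligned} \tag{11}
\]
where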
$a := (a_1, a_2, \ldots, a_{m-1})$ is the coefficient vector of P(z). Note that $b_1, b_2, \ldots, b_{m-1}$ do not depend on λ. We further define $r_m = -\frac{m}{4}$ if m is odd, and $r_m = -\frac{m}{4} - b_{\frac{m}{2}+1}$ if m is even. Also we define the order of an entire function g as
\[
\limsup_{r \to \infty} \frac{\log \log M(r, g)}{\log r},
\]
where $M(r, g) = \max\{ |g(re^{i\theta})| : 0 \le \theta \le 2\pi \}$ for r > 0. If for some positive real numbers $\sigma, c_1, c_2$, we have $M(r, g) \le c_1 \exp[c_2 r^{\sigma}]$ for all large r, then the order of g is finite and less than or equal to σ.

Now we are ready to introduce some existence results and asymptotic estimates of Sibuya [21]. The existence of an entire solution with a specified asymptotic representation for fixed $a_k$’s and λ, is presented as well as an asymptotic expression of the value of the solution at z = 0 as λ tends to infinity. These results are in Theorems 6.1, 7.2, 19.1 and 20.1 of Sibuya’s book [21]. The following is a special case of these theorems that is enough for our argument later. The coefficient vector a is allowed to be complex, here.

Proposition 5. Equation (8), with $a \in \mathbb{C}^{m-1}$, admits a solution f(z, a, λ) with the following properties:

(i) f(z, a, λ) is an entire function of (z, a, λ).

(ii) f(z, a, λ) and $f'(z, a, \lambda) = \frac{d}{dz} f(z, a, \lambda)$ admit the following asymptotic expressions: Let ε > 0. Then
\[
\begin{aligned}
f(z, a, \lambda) &= z^{r_m} \left( 1 + O(z^{-1/2}) \right) \exp[-F(z, a, \lambda)], \\
f'(z, a, \lambda) &= -z^{r_m + \frac{m}{2}} \left( 1 + O(z^{-1/2}) \right) \exp[-F(z, a, \lambda)],
\end{aligned}
\]
as z tends to infinity in the sector $|\arg z| \le \frac{3\pi}{m+2} - \varepsilon$, uniformly on each compact set of (a, λ)-values. Here
\[
F(z, a, \lambda) = \frac{2}{m+2}\, z^{\frac{m+2}{2}} + \sum_{1 \le j < \frac{m}{2}+1} \frac{2}{m+2-2j}\, b_j\, z^{\frac{1}{2}(m+2-2j)}.
\]

(iii) Properties (i) and (ii) uniquely determine the solution f(z, a, λ) of (8).

(iv) For each fixed a and δ > 0, f and f′ also admit the asymptotic expressions,
\[
f(0, a, \lambda) = [1 + o(1)]\, \lambda^{-1/4} \exp\left[ K \lambda^{\frac{1}{2} + \frac{1}{m}} (1 + o(1)) \right], \tag{12}
\]
\[
f'(0, a, \lambda) = -[1 + o(1)]\, \lambda^{1/4} \exp\left[ K \lambda^{\frac{1}{2} + \frac{1}{m}} (1 + o(1)) \right], \tag{13}
\]
as λ tends to infinity in the sector $|\arg \lambda| \le \pi - \delta$, where
\[
K = \int_0^{\infty} \left( \sqrt{1 + t^m} - \sqrt{t^m} \right) dt. \tag{14}
\]
(v) The entire functions λ → f(0, a, λ) and λ → f′(0, a, λ) have orders $\frac{1}{2} + \frac{1}{m}$.
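The coefficients $b_j(a, \lambda)$ of (11), which enter F(z, a, λ) in Proposition 5 (ii), can be generated term by term. The sketch below is illustrative only: instead of summing the binomial series it uses the equivalent recurrence obtained by squaring the series, it assumes real rational inputs so that exact `Fraction` arithmetic applies, and the name `b_coefficients` is hypothetical.

```python
from fractions import Fraction


def b_coefficients(a, lam, n_terms):
    """Return [b_1, ..., b_{n_terms}] from (11) for real rational a_k and lambda.

    With w = 1/z and x(w) = a_1 w + ... + a_{m-1} w^{m-1} + lam w^m, the series
    s(w) = 1 + sum_j b_j w^j satisfies s(w)^2 = 1 + x(w).  Comparing the
    coefficients of w^n gives 2 b_n + sum_{i=1}^{n-1} b_i b_{n-i} = x_n.
    """
    m = len(a) + 1
    x = [Fraction(0)] * (n_terms + 1)
    for k, a_k in enumerate(a, start=1):
        if k <= n_terms:
            x[k] = Fraction(a_k)
    if m <= n_terms:
        x[m] = Fraction(lam)
    s = [Fraction(1)] + [Fraction(0)] * n_terms
    for n in range(1, n_terms + 1):
        cross = sum(s[i] * s[n - i] for i in range(1, n))
        s[n] = (x[n] - cross) / 2
    return s[1:]


# m = 4 with a = (2, 3, 5): b_1, b_2, b_3 are the same for lambda = 7 and lambda = 0.
print(b_coefficients([2, 3, 5], 7, 4))
print(b_coefficients([2, 3, 5], 0, 4))
```

The first m − 1 entries agree for both values of λ, in line with the remark that $b_1, \ldots, b_{m-1}$ do not depend on λ; only $b_m$ and later coefficients change.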
Proof. In Sibuya’s book [21], see Theorem 6.1 for a proof of (i) and (ii); Theorem 7.2 for a proof of (iii); and Theorem 19.1 for a proof of (iv). And (v) is a consequence of (iv) along with Theorem 20.1. Note that properties (i), (ii) and (iv) are summarized on pp. 112–113 of Sibuya’s book.

We now give references for the proof of Proposition 1. We use the number
\[
\omega = \exp\left[ \frac{2\pi i}{m+2} \right].
\]

Proof of Proposition 1. See Theorem 29.1 of Sibuya [21] for a proof which says that eigenvalues are simple, and
\[
\lambda_k = \omega^m \left( \frac{(-2k+1)\pi}{2K \sin\left(\frac{2\pi}{m}\right)} \right)^{\frac{2m}{m+2}} [1 + o(1)], \quad \text{as } k \to \infty, \tag{15}
\]
where K is given by (14). Note that Sibuya studies Eq. (8) with the boundary conditions that v decays in $S_0 \cup S_2$, while in this paper we consider the boundary conditions of the rotated equation (8) that v decays in $S_{-1} \cup S_1$. The factor $\omega^m$ in our formula (15) is due to this rotation of the problem. The remaining two claims (I) and (III) are easy consequences of the asymptotic expression (15). Also one can compute K directly or see Eq. (2.22) in [13], which says
\[
K = -\frac{1}{2\sqrt{\pi}}\, \Gamma\left( -\frac{1}{2} - \frac{1}{m} \right) \Gamma\left( 1 + \frac{1}{m} \right).
\]
So this along with (15) and the identity Γ(λ)Γ(1 − λ) = π csc(πλ) implies (3).

Note that the asymptotic expression (3) of the eigenvalues agrees with that of Bender and Boettcher [1] obtained by the WKB calculation for the eigenvalue problem (5), after an index shift. We mention that the simplicity of the eigenvalues can be proved by using the fact that for each Stokes sector, there exist two solutions of (8) with no boundary conditions imposed such that one decays to zero and another blows up as z tends to infinity in the sector.

The next thing we want to introduce is the Stokes multiplier. First, we let
\[
G^k(a) := \left( \omega^{-k} a_1, \omega^{-2k} a_2, \ldots, \omega^{-(m-1)k} a_{m-1} \right) \quad \text{for } k \in \mathbb{Z}.
\]
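As a numerical sanity check on the proof of Proposition 1 above, one can confirm that the closed form for K turns the modulus of the leading term in (15) into the leading term in (3). The sketch below is an illustration under stated assumptions: it compares moduli only (the phase factor $\omega^m$ and the branch of the fractional power are not examined), it requires m ≥ 3 so that the Gamma function is evaluated away from its poles, and the function names are hypothetical.

```python
import math


def K_closed_form(m: int) -> float:
    """K via the Gamma-function formula quoted from Eq. (2.22) of [13]; needs m >= 3."""
    return (-math.gamma(-0.5 - 1.0 / m) * math.gamma(1.0 + 1.0 / m)
            / (2.0 * math.sqrt(math.pi)))


def leading_term_3(m: int, k: int) -> float:
    """Leading term of (3), without the factor [1 + o(1)]."""
    base = (math.gamma(1.5 + 1.0 / m) * math.sqrt(math.pi) * (k - 0.5)
            / (math.sin(math.pi / m) * math.gamma(1.0 + 1.0 / m)))
    return base ** (2.0 * m / (m + 2))


def leading_modulus_15(m: int, k: int) -> float:
    """Modulus of the leading term of (15), without the factor [1 + o(1)]."""
    base = (2 * k - 1) * math.pi / (2.0 * K_closed_form(m) * math.sin(2.0 * math.pi / m))
    return base ** (2.0 * m / (m + 2))


for m in (3, 4, 5):
    print(m, leading_term_3(m, 1), leading_modulus_15(m, 1))
```

For m = 2 the exponent in (3) is 1 and the leading term reduces to 2k − 1, the harmonic oscillator spectrum; `K_closed_form` is not defined in that case because the integral (14) diverges.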
Let f(z, a, λ) be the function in Proposition 5. Note that f(z, a, λ) decays to zero exponentially as z → ∞ in $S_0$, and blows up in $S_{-1} \cup S_1$. Then one can see that the function
\[
f_k(z, a, \lambda) := f(\omega^{-k} z, G^k(a), \omega^{-mk} \lambda),
\]
which is obtained by rotating $f(z, G^k(a), \omega^{-mk}\lambda)$ in the z-variable, solves (8). It is clear that $f_0(z, a, \lambda) = f(z, a, \lambda)$, and that $f_k(z, a, \lambda)$ decays in $S_k$ and blows up in $S_{k-1} \cup S_{k+1}$ since $f(z, G^k(a), \omega^{-mk}\lambda)$ decays in $S_0$. Then since no non-constant solution decays in two consecutive Stokes sectors, $f_k$ and $f_{k+1}$ are linearly independent and hence any solution of (8) can be expressed as a linear combination of these two. Especially, for some coefficients C(a, λ) and $\tilde{C}(a, \lambda)$,
\[
f_{-1}(z, a, \lambda) = C(a, \lambda) f_0(z, a, \lambda) + \tilde{C}(a, \lambda) f_1(z, a, \lambda). \tag{16}
\]
These C(a, λ) and $\tilde{C}(a, \lambda)$ are called the Stokes multipliers of $f_{-1}$ with respect to $f_0$ and $f_1$. We then see that
\[
C(a, \lambda) = \frac{W_{-1,1}(a, \lambda)}{W_{0,1}(a, \lambda)} \quad \text{and} \quad \tilde{C}(a, \lambda) = -\frac{W_{-1,0}(a, \lambda)}{W_{0,1}(a, \lambda)},
\]
where $W_{j,k} = f_j f_k' - f_j' f_k$ is the Wronskian of $f_j$ and $f_k$. Since both $f_j$, $f_k$ are solutions of the same linear equation (8), we know that the Wronskians are constant functions of z. Since $f_k$ and $f_{k+1}$ are linearly independent, $W_{k,k+1} \ne 0$ for all k ∈ Z. Moreover, we have the following which is needed in the proof of our main theorem.

Lemma 6. The Stokes multiplier $\tilde{C}(a, \lambda)$ is independent of λ. Moreover, if $a \in \mathbb{R}^{m-1}$ then $|\tilde{C}(a, \lambda)| = 1$.

Proof. First note that Sibuya’s multiplier $\tilde{c}(a, \lambda)$ in [21] is $W_{1,0}/W_{1,2}$ while we use $\tilde{C}(a, \lambda) = -W_{-1,0}/W_{0,1}$. Since $f_k(z, a, \lambda) = f(\omega^{-k} z, G^k(a), \omega^{-mk}\lambda)$, we see that
\[
f_{k+1}(z, a, \lambda) = f(\omega^{-(k+1)} z, G^{k+1}(a), \omega^{-m(k+1)} \lambda) = f_k(\omega^{-1} z, G(a), \omega^{-m} \lambda).
\]
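Hence using $\omega^{m+2} = 1$, we see that
\[
W_{k+1,j+1}(a, \lambda) = \omega^{-1} W_{k,j}(G(a), \omega^2 \lambda), \tag{17}
\]
which is Eq. (26.28) of [21]. So using Eq. (26.29) on p. 117 of [21], one can get
\[
\begin{aligned}
\tilde{C}(a, \lambda) &= -\frac{W_{-1,0}(a, \lambda)}{W_{0,1}(a, \lambda)} \\
&= -\frac{W_{0,1}(G^{-1}(a), \omega^{-2}\lambda)}{W_{1,2}(G^{-1}(a), \omega^{-2}\lambda)}, \quad \text{by (17)}, \\
&= -\omega^{1 - 2\nu(G^{-1}(a))}, \quad \text{by (26.29) of [21]},
\end{aligned} \tag{18}
\]
where
\[
\nu(G^{-1}(a)) = \begin{cases} 0 & \text{if } m \text{ is odd}, \\ b_{\frac{m}{2}+1}(G^{-1}(a)) & \text{if } m \text{ is even}. \end{cases}
\]
From (18), it is clear that $\tilde{C}(a, \lambda)$ is independent of λ. We want $\nu(G^{-1}(a))$ to be real if $a \in \mathbb{R}^{m-1}$, so that $|\tilde{C}(a, \lambda)| = |\omega| = 1$. Suppose $a \in \mathbb{R}^{m-1}$. Since $b_{\frac{m}{2}+1}(G^{-1}(a)) = -b_{\frac{m}{2}+1}(a)$ as noted on p. 117 of [21] (or can be directly verified from (11)), it is sufficient to show that $b_{\frac{m}{2}+1}(a)$ is real when $a \in \mathbb{R}^{m-1}$. Since the $a_k$ are all real, from (11) we conclude that $b_{\frac{m}{2}+1}(a)$ must be real. This completes the proof.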
Thus from the proof of Lemma 6 we get $\tilde{C}(a, \lambda) = e^{i\phi_0}$ for some $\phi_0 = \phi_0(a) \in \mathbb{R}$ and hence from (16) we have
\[
C(a, \lambda) f_0(z, a, \lambda) = f_{-1}(z, a, \lambda) - e^{i\phi_0} f_1(z, a, \lambda) \tag{19}
\]
\[
\qquad\qquad = f(\omega z, G^{-1}(a), \omega^m \lambda) - e^{i\phi_0} f(\omega^{-1} z, G(a), \omega^{-m} \lambda). \tag{20}
\]
From this, for each $a \in \mathbb{R}^{m-1}$ we can relate the zeros of C(a, λ) with the eigenvalues of (1) as follows.

Lemma 7. For each fixed $a = (a_1, a_2, \ldots, a_{m-1}) \in \mathbb{R}^{m-1}$, a complex number λ is an eigenvalue of (1) if and only if λ is a zero of the entire function C(a, λ). Hence, the eigenvalues are discrete because they are zeros of a non-constant entire function.

Note that the Stokes multiplier C(a, λ) is called a spectral determinant or an Evans function, because its zeros are all eigenvalues of an eigenvalue problem.

Proof. Suppose that λ is an eigenvalue of (1) with the corresponding eigenfunction u. Then we let v(z) = u(−iz), and hence v solves (8) and decays in $S_{-1} \cup S_1$. Since $f_{-1}$ is another solution of (8) that decays in $S_{-1}$, we see that $f_{-1}$ is a multiple of v. Similarly $f_1$ is a multiple of v. Hence the right-hand side of (19) decays in $S_{-1} \cup S_1$. But $f_0$ blows up in $S_{-1} \cup S_1$, and so (19) implies C(a, λ) = 0. Conversely we suppose that C(a, λ) = 0 for some λ ∈ C. Then from (19) we see that $f_{-1}$ is a constant multiple of $f_1$. Thus both are decaying in $S_{-1} \cup S_1$ and hence $u(z) := f_{-1}(iz, a, \lambda)$ is an eigenfunction of (1) with the corresponding eigenvalue λ.

Next we examine (20) and its differentiated form at z = 0, which are,
\[
C(a, \lambda) f(0, a, \lambda) = f(0, G^{-1}(a), \omega^m \lambda) - e^{i\phi_0} f(0, G(a), \omega^{-m} \lambda), \tag{21}
\]
\[
C(a, \lambda) f'(0, a, \lambda) = \omega f'(0, G^{-1}(a), \omega^m \lambda) - e^{i\phi_0} \omega^{-1} f'(0, G(a), \omega^{-m} \lambda), \tag{22}
\]
where we use f instead of $f_0$. The right-hand sides of these are given by differences of two functions of λ. We will express these right-hand sides with single functions, respectively. To this end, we prove that f and f′ both have some symmetry as follows.

Lemma 8. Let $a = (a_1, a_2, \ldots, a_{m-1}) \in \mathbb{C}^{m-1}$. Then we have
\[
\overline{f(0, a, \lambda)} = f(0, \bar{a}, \bar{\lambda}) \quad \text{and} \quad \overline{f'(0, a, \lambda)} = f'(0, \bar{a}, \bar{\lambda}). \tag{23}
\]
Especially, we have that if $a = (a_1, a_2, \ldots, a_{m-1}) \in \mathbb{R}^{m-1}$ is real, then
\[
\overline{f(0, G(a), \lambda)} = f(0, G^{-1}(a), \bar{\lambda}) \quad \text{and} \quad \overline{f'(0, G(a), \lambda)} = f'(0, G^{-1}(a), \bar{\lambda}). \tag{24}
\]

Proof. Let g(z) = f(z, a, λ), which is the entire function f in Proposition 5 and hence decays in $S_0$. Then g solves
\[
-g''(z) + (z^m + a_1 z^{m-1} + a_2 z^{m-2} + \cdots + a_{m-1} z + \lambda)\, g(z) = 0.
\]
Next we take the complex conjugate of this and replace z by $\bar{z}$. Then we see that $\overline{g(\bar{z})}$ is entire and solves the following equation:
\[
-\frac{d^2}{dz^2}\, \overline{g(\bar{z})} + (z^m + \bar{a}_1 z^{m-1} + \bar{a}_2 z^{m-2} + \cdots + \bar{a}_{m-1} z + \bar{\lambda})\, \overline{g(\bar{z})} = 0. \tag{25}
\]
Since the entire functions $\overline{g(\bar{z})}$ and $f(z, \bar{a}, \bar{\lambda})$ are solutions of (25) that decay in $S_0$, we see that these two are linearly dependent. So one is a constant multiple of the other. Moreover, from (11) we see that $\overline{b_k(a, \lambda)} = b_k(\bar{a}, \bar{\lambda})$ for all k ∈ N. Also we have $\overline{F(\bar{z}, a, \lambda)} = F(z, \bar{a}, \bar{\lambda})$ in Proposition 5. Hence the entire functions $\overline{g(\bar{z})}$ and $f(z, \bar{a}, \bar{\lambda})$ along with their first derivatives satisfy the same asymptotic expressions in Proposition 5 (ii), so we conclude that
\[
\overline{g(\bar{z})} = f(z, \bar{a}, \bar{\lambda}) \tag{26}
\]
by Proposition 5 (iii). Next substituting z = 0 in (26) gives the first equation in (23). Also we differentiate (26) with respect to z and substitute z = 0 to get the second equation in (23). For (24), just note that $\overline{G(a)} = G^{-1}(a)$.

Next we want infinite product representations of f(0, a, λ) and f′(0, a, λ), with respect to λ.

Lemma 9. Suppose m ≥ 3. The functions λ → f(0, a, λ) and λ → f′(0, a, λ) have infinitely many zeros $E_j$ and $E_j'$, respectively. They admit the following infinite product representations for each fixed $a = (a_1, a_2, \ldots, a_{m-1}) \in \mathbb{C}^{m-1}$:
\[
\begin{aligned}
f(0, a, \lambda) &= D_0 \lambda^{n_0} \prod_{j=1}^{\infty} \left( 1 - \frac{\lambda}{E_j} \right) && \text{for some } D_0 \in \mathbb{C} \text{ and nonnegative integer } n_0, \\
f'(0, a, \lambda) &= D_1 \lambda^{n_1} \prod_{j=1}^{\infty} \left( 1 - \frac{\lambda}{E_j'} \right) && \text{for some } D_1 \in \mathbb{C} \text{ and nonnegative integer } n_1.
\end{aligned}
\]
Moreover, these infinite products converge absolutely.

Proof. Fix $a \in \mathbb{C}^{m-1}$. We know that both λ → f(0, a, λ) and λ → f′(0, a, λ) have orders $\frac{1}{2} + \frac{1}{m} \in (0, 1)$ by Proposition 5 (v). Thus this lemma is a consequence of the Hadamard factorization theorem (see, for example, Theorem 14.2.6 on p. 199 of [17]).
3. Proof of Theorem 2

When m = 2, Eq. (1) is a translation of the harmonic oscillator. So there is nothing new here. We mention that since $z^2 + a_1 i z = \left( z + \frac{a_1}{2} i \right)^2 + \frac{a_1^2}{4}$, the eigenvalues for the potential $z^2 + a_1 i z$ are $2k + 1 + \frac{a_1^2}{4} > 0$.

Suppose m ≥ 3 and suppose that λ ∈ C is an eigenvalue of the eigenproblem (1); then by Lemma 7 we have C(a, λ) = 0. Then from (21) and (22) along with (24), we have
\[
\begin{aligned}
0 &= f(0, G^{-1}(a), \omega^m \lambda) - e^{i\phi_0}\, \overline{f(0, G^{-1}(a), \omega^m \bar{\lambda})}, \\
0 &= \omega f'(0, G^{-1}(a), \omega^m \lambda) - e^{i\phi_0} \omega^{-1}\, \overline{f'(0, G^{-1}(a), \omega^m \bar{\lambda})}.
\end{aligned}
\]
Since the non-constant function $f(z, G^{-1}(a), \omega^m \lambda)$ solves a linear second order ordinary differential equation, both $f(0, G^{-1}(a), \omega^m \lambda)$ and $f'(0, G^{-1}(a), \omega^m \lambda)$ cannot be zero at the same time; otherwise, $f(z, G^{-1}(a), \omega^m \lambda) \equiv 0$.
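Suppose that $f(0, G^{-1}(a), \omega^m \lambda) \ne 0$. Then from Lemma 9 we have
\[
0 \ne D_0 (\omega^m \lambda)^{n_0} \prod_{j=1}^{\infty} \left( 1 - \frac{\omega^m \lambda}{E_j} \right) = e^{i\phi_0}\, \overline{ D_0 (\omega^m \bar{\lambda})^{n_0} \prod_{j=1}^{\infty} \left( 1 - \frac{\omega^m \bar{\lambda}}{E_j} \right) }.
\]
Then by equating the absolute values of the two sides of the equation (and using $\omega^{m+2} = 1$), we have
\[
\prod_{j=1}^{\infty} \left| \frac{\omega^2 E_j - \lambda}{\omega^2 E_j - \bar{\lambda}} \right| = 1. \tag{27}
\]
Likewise, when $f'(0, G^{-1}(a), \omega^m \lambda) \ne 0$, we get the following.
\[
\prod_{j=1}^{\infty} \left| \frac{\omega^2 E_j' - \lambda}{\omega^2 E_j' - \bar{\lambda}} \right| = 1. \tag{28}
\]
We mention that $\omega^2 E_j$ and $\omega^2 E_{j'}'$ lie in the open lower half-plane for some j, j′. From Lemma 9 we know that $f(0, G^{-1}(a), E)$ and $f'(0, G^{-1}(a), E)$ have infinitely many zeros $E_*$. And (12) and (13) imply that the zeros $E_*$ near infinity lie near the negative real axis. Thus certainly $\operatorname{Im} \omega^2 E_j < 0$ and $\operatorname{Im} \omega^2 E_{j'}' < 0$ for some j, j′.

Below we will show that the hypotheses on the signs of the coefficients $a_1, a_2, \ldots, a_{m-1}$ of P force all the $\omega^2 E_j$ and $\omega^2 E_j'$ to lie in the closed lower half-plane, which implies either
\[
\left| \omega^2 E_j - \lambda \right| \ge \left| \omega^2 E_j - \bar{\lambda} \right| \quad \text{and} \quad \left| \omega^2 E_j' - \lambda \right| \ge \left| \omega^2 E_j' - \bar{\lambda} \right|, \quad \forall j \in \mathbb{N}, \text{ if } \operatorname{Im} \lambda \ge 0,
\]
or
\[
\left| \omega^2 E_j - \lambda \right| \le \left| \omega^2 E_j - \bar{\lambda} \right| \quad \text{and} \quad \left| \omega^2 E_j' - \lambda \right| \le \left| \omega^2 E_j' - \bar{\lambda} \right|, \quad \forall j \in \mathbb{N}, \text{ if } \operatorname{Im} \lambda \le 0, \tag{29}
\]
since λ and $\bar{\lambda}$ are reflections of each other with respect to the real axis. If (27) holds, then (29) implies $|\omega^2 E_j - \lambda| = |\omega^2 E_j - \bar{\lambda}|$ for all j ∈ N. If (28) holds, then (29) implies $|\omega^2 E_j' - \lambda| = |\omega^2 E_j' - \bar{\lambda}|$ for all j ∈ N. Since $\operatorname{Im} \omega^2 E_j < 0$ for some j and $\operatorname{Im} \omega^2 E_{j'}' < 0$ for some j′, and since λ and $\bar{\lambda}$ are reflections of each other with respect to the real axis, we deduce in either case that $\lambda = \bar{\lambda}$ and hence λ is real.

So our next task is to show that all the $\omega^2 E_j$ and $\omega^2 E_j'$ lie in the closed lower half-plane. Suppose that for some $E_* \in \mathbb{C}$, either
\[
f(0, G^{-1}(a), E_*) = 0 \quad \text{or} \quad f'(0, G^{-1}(a), E_*) = 0. \tag{30}
\]
That is, either $E_* = E_j$ or $E_* = E_j'$ for some j ∈ N. We know that $v(z) = f(z, G^{-1}(a), E_*)$ solves
\[
-v''(z) + \left[ z^m + \sum_{k=1}^{m-1} a_k \omega^k z^{m-k} \right] v(z) = -E_* v(z), \tag{31}
\]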
where $a_k \in \mathbb{R}$, k = 1, 2, . . . , m − 1, and by (30), v satisfies either the Dirichlet ($E_* = E_j$) or the Neumann ($E_* = E_j'$) boundary condition at z = 0, and the Dirichlet boundary condition at ∞ + 0i. We call (31) with these boundary conditions the “associated” eigenvalue problem. We aim to show all the eigenvalues $E_*$ have $\operatorname{Im}(\omega^2 E_*) \le 0$.

Let $g(r) = v(re^{i\theta})$ with θ fixed, $|\theta| < \frac{\pi}{m+2}$. We then replace v″(z) in (31) by $e^{-2i\theta} g''(r)$, multiply the resulting equation by $\omega^2 \overline{g(r)}$ and integrate over 0 ≤ r < ∞ to get
\[
-\omega^2 e^{-2i\theta} \int_0^{\infty} g''(r) \overline{g(r)}\, dr + \int_0^{\infty} \left[ \omega^2 e^{mi\theta} r^m + \sum_{k=1}^{m-1} a_k \omega^{k+2} e^{(m-k)i\theta} r^{m-k} \right] |g(r)|^2\, dr = -\omega^2 E_* \int_0^{\infty} |g(r)|^2\, dr. \tag{32}
\]
Since $f(z, G^{-1}(a), E_*)$ decays to zero exponentially in $S_0$, we know the integrability of every term in (32) for each $|\theta| < \frac{\pi}{m+2}$. Next we integrate the first term by parts, using g(0) = 0 or g′(0) = 0 by (30), so the boundary term vanishes. And then taking the imaginary part of the resulting equation gives
\[
\begin{aligned}
&\sin\left( \frac{4\pi}{m+2} - 2\theta \right) \int_0^{\infty} |g'(r)|^2\, dr + \sin\left( m\theta + \frac{4\pi}{m+2} \right) \int_0^{\infty} r^m |g(r)|^2\, dr \\
&\qquad + \sum_{k=1}^{m-1} a_k \sin\left( \frac{2(k+2)\pi}{m+2} + (m-k)\theta \right) \int_0^{\infty} r^{m-k} |g(r)|^2\, dr \\
&\quad = -\operatorname{Im}\left( \omega^2 E_* \right) \int_0^{\infty} |g(r)|^2\, dr.
\end{aligned} \tag{33}
\]
Recall our hypothesis that $(j - k) a_k \ge 0$ for all 1 ≤ k ≤ m − 1 for some $1 \le j \le \frac{m}{2}$. We want to prove the reality of the eigenvalues by showing that $\operatorname{Im} \omega^2 E_* \le 0$ for all the $E_*$. To this end, we will divide the proof into two cases: Case I, when $1 \le j \le \frac{m}{2}$ and m ≥ 5, or when j = 1 and m = 3, 4; and Case II, when j = 2 and m = 4.

Case I. When $1 \le j \le \frac{m}{2}$ and m ≥ 5, or when j = 1 and m = 3, 4. We choose θ in (33) by
\[
\theta = \frac{(m - 2j - 2)\pi}{(m - j)(m + 2)}, \tag{34}
\]
where the motivation for this choice will be fairly clear later in the proof. Notice here that $|\theta| < \frac{\pi}{m+2}$ as required. Then
\[
\begin{aligned}
0 < \frac{4\pi}{m+2} - 2\theta = \frac{2\pi}{m-j} \le \pi, \quad &\text{and hence} \quad \sin\left( \frac{4\pi}{m+2} - 2\theta \right) \ge 0, \text{ and} \\
0 \le m\theta + \frac{4\pi}{m+2} = \frac{(m-2j)\pi}{m-j} < \pi, \quad &\text{and hence} \quad \sin\left( m\theta + \frac{4\pi}{m+2} \right) \ge 0.
\end{aligned}
\]
(Clearly these inequalities use that $j \le \frac{m}{2}$.) Also we see that
\[
\frac{2(k+2)\pi}{m+2} + (m-k)\theta = \left( 1 - \frac{j-k}{m-j} \right) \pi, \quad \text{for all } k,
\]
and so
\[
\sin\left( \left( 1 - \frac{j-k}{m-j} \right) \pi \right)
\]
has the same sign as (j − k) for all 1 ≤ k ≤ m − 1.
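The angle identities of Case I can be confirmed with exact rational arithmetic. The following is a small sketch, not taken from the paper: angles are stored in units of π as `Fraction` objects, the sine sign is then checked in floating point, and the function name `check_case_one` is hypothetical.

```python
from fractions import Fraction
import math


def check_case_one(m: int, j: int) -> bool:
    """Check the choice (34) of theta and the three angle identities that follow it."""
    theta = Fraction(m - 2 * j - 2, (m - j) * (m + 2))      # theta / pi from (34)
    ok = abs(theta) < Fraction(1, m + 2)                     # |theta| < pi / (m + 2)
    ok = ok and Fraction(4, m + 2) - 2 * theta == Fraction(2, m - j)
    ok = ok and m * theta + Fraction(4, m + 2) == Fraction(m - 2 * j, m - j)
    for k in range(1, m):
        angle = Fraction(2 * (k + 2), m + 2) + (m - k) * theta
        ok = ok and angle == 1 - Fraction(j - k, m - j)
        s = math.sin(float(angle) * math.pi)
        # sin(angle) must have the sign of (j - k); it vanishes up to rounding when k == j.
        ok = ok and (abs(s) < 1e-12 if k == j else s * (j - k) > 0)
    return ok


print(check_case_one(3, 1), check_case_one(7, 3))
# j = 2, m = 4 gives |theta| = pi / (m + 2), the boundary case excluded from Case I.
print(check_case_one(4, 2))
```

The last call returns `False` only because the strict inequality on |θ| fails, which is the reason j = 2, m = 4 is handled separately.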
Among other things, this is why we choose the θ as above. So from (33) and the hypothesis $(j - k) a_k \ge 0$, we conclude that $\operatorname{Im} \omega^2 E_* \le 0$. This proves that the eigenvalue λ is real.

Case II. When j = 2 and m = 4. The reason we separate this case from Case I is that in this case, $|\theta| = \frac{\pi}{6} = \frac{\pi}{m+2}$, whereas our argument needed $|\theta| < \frac{\pi}{m+2}$ in order to get integrability for terms in (32). So we modify the proof as follows. For ε > 0 small, we multiply (32) by $e^{-2i\varepsilon}$ and set $\theta = -\frac{\pi}{6} + \varepsilon$. Then integrating the first term by parts and taking the imaginary part of the resulting equation gives
\[
\begin{aligned}
&\sin(4\varepsilon) \int_0^{\infty} |g'(r)|^2\, dr + \sin(2\varepsilon) \int_0^{\infty} r^4 |g(r)|^2\, dr + \sum_{k=1}^{3} a_k \sin\left( \frac{k\pi}{2} + (2-k)\varepsilon \right) \int_0^{\infty} r^{4-k} |g(r)|^2\, dr \\
&\quad = -\operatorname{Im}\left( \omega^2 E_* e^{-2i\varepsilon} \right) \int_0^{\infty} |g(r)|^2\, dr.
\end{aligned} \tag{35}
\]
Clearly $\sin\left( \frac{k\pi}{2} + (2-k)\varepsilon \right)$ has the same sign as (2 − k) for 1 ≤ k ≤ 3. Using the hypothesis $(2 - k) a_k \ge 0$ for all 1 ≤ k ≤ 3, we have that the left side of (35) is nonnegative and so $\operatorname{Im}\left( \omega^2 E_* e^{-2i\varepsilon} \right) \le 0$. Thus by sending ε to zero, we get $\operatorname{Im} \omega^2 E_* \le 0$, which proves the reality for the case of j = 2 and m = 4. Therefore, the eigenvalues of (1) are all real under the hypotheses on the $a_k$’s given in the statement of this theorem.

We must still prove the positivity of the eigenvalues. Suppose u is an eigenfunction of (1) with an eigenvalue λ ∈ R, and suppose the $a_k$’s satisfy the hypotheses of the theorem. Let v(z) = u(−iz). Then we have Eq. (8) with the boundary conditions that v decays in
\[
S_{-1} \cup S_1 = \left\{ z \in \mathbb{C} : \frac{\pi}{m+2} < |\arg z| < \frac{3\pi}{m+2} \right\}.
\]
Since λ and all $a_k$’s are real, one can see that $\overline{v(\bar{z})}$ satisfies the same equation and decays in $S_{-1} \cup S_1$. Then since the eigenvalues are simple, v(z) and $\overline{v(\bar{z})}$ must be linearly dependent, and hence $v(z) = c\, \overline{v(\bar{z})}$ for some c ∈ C. Since |v(z)| and $|\overline{v(\bar{z})}|$ agree on the
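real line, we see that |c| = 1 and so $|v(z)| = |v(\bar{z})|$ for all z ∈ C. That is, |v(x + iy)| is even in y. From this we have that
\[
0 = \left. \frac{\partial}{\partial y} |v(x + iy)|^2 \right|_{y=0} = -2 \operatorname{Im}\left( v'(x) \overline{v(x)} \right), \quad \text{for all } x \in \mathbb{R}. \tag{36}
\]
Next we let $h(r) = v(re^{i\theta})$. By substituting into the differential equation (8), then multiplying by $\overline{h(r)}$ and integrating, we get
\[
-\int_0^{\infty} h''(r) \overline{h(r)}\, dr + \int_0^{\infty} \left[ e^{(m+2)i\theta} r^m + \sum_{k=1}^{m-1} a_k e^{(m-k+2)i\theta} r^{m-k} \right] |h(r)|^2\, dr = -\lambda e^{2i\theta} \int_0^{\infty} |h(r)|^2\, dr, \quad \text{for } \frac{\pi}{m+2} < \theta < \frac{3\pi}{m+2}.
\]
Integrating the first term by parts and using $h'(0) = e^{i\theta} v'(0)$, one can get
\[
-v'(0) \overline{v(0)} + e^{-i\theta} \int_0^{\infty} |h'(r)|^2\, dr + \int_0^{\infty} \left[ e^{(m+1)i\theta} r^m + \sum_{k=1}^{m-1} a_k e^{(m-k+1)i\theta} r^{m-k} \right] |h(r)|^2\, dr = -\lambda e^{i\theta} \int_0^{\infty} |h(r)|^2\, dr, \quad \text{for } \frac{\pi}{m+2} < \theta < \frac{3\pi}{m+2}.
\]
Taking the imaginary part and using (36) at x = 0, we have
\[
\sin\theta \int_0^{\infty} |h'|^2\, dr - \int_0^{\infty} \left[ r^m \sin(m+1)\theta + \sum_{k=1}^{m-1} a_k r^{m-k} \sin(m-k+1)\theta \right] |h|^2\, dr = \lambda \sin\theta \int_0^{\infty} |h|^2\, dr, \quad \text{for all } \frac{\pi}{m+2} < \theta < \frac{3\pi}{m+2}. \tag{37}
\]
(Here again we used that λ is real.) We choose
\[
\theta = \frac{\pi}{m - j + 1},
\]
so that
\[
\frac{\pi}{m+2} < \frac{\pi}{m+1} \le \theta \le \frac{2\pi}{m+1} < \frac{3\pi}{m+2} < \pi
\]
as required, and
\[
\sin(m-k+1)\theta = \sin\left( \frac{m-k+1}{m-j+1}\, \pi \right)
\]
has the same sign as (k − j), for all k. Since $(k - j) a_k \le 0$, sin θ ≥ 0 and sin(m + 1)θ ≤ 0, we see that the left-hand side of (37) is positive, and hence so is the right-hand side. Therefore, the real number λ must be positive. This completes the proof of Theorem 2.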
Remarks.
1. The idea of using the infinite product in (27) to prove reality of the eigenvalues is due to Dorey et al. [12]. But their potentials are much simpler and the $E_*$ are all negative real in their situation, so that (29) is immediate. Here it is not.
2. The ideas above for proving positivity of the eigenvalues are similar to those used earlier by the author in [20].
3. We note that the hypotheses assumed in Theorem 2 on the coefficients of P are sufficient for real eigenvalues, but not necessary, for at least two reasons. Let $Q(z) = -[z^m + P(z)]$. Then first, the problem (1) with the potential $Q(iz) = -[(iz)^3 - (iz)^2]$ is covered by Theorem 2 while the problem with $Q_1(iz) = -[(iz)^3 + 2(iz)^2 + iz]$ is not. However, $Q(iz + 1) = Q_1(iz)$ and so the potential $Q_1(iz)$ produces positive real eigenvalues only. For general cases, if the problem (1) with the potential Q(iz), for some real polynomial Q, has positive real eigenvalues λ only, then the problem with the potential Q(iz + c) − Q(c) for some real c ∈ R has eigenvalues λ + Q(c) which are all real. Second, in the proof of Theorem 2 in order to ensure that $\operatorname{Im} \omega^2 E_* \le 0$, we insisted that each and every term on the left-hand side of (33) has a single sign, and it is clear from Sect. 4 below that this is not necessary.

4. Extensions of Theorem 2

In this section, we study two particular classes of polynomial potentials to illustrate different methods for sharpening Theorem 2.

Theorem 10. Let m ≥ 4 and suppose α < 0, γ < 0. Suppose that an entire function u along with λ ∈ C solves Eq. (1) with $P(z) = \alpha z^3 + \beta z^2 + \gamma z$. Then the eigenvalue λ is positive real, provided that
\[
\beta \le \sqrt{\alpha\gamma}\, \sqrt{3 - \tan^2 \frac{\pi}{m}}. \tag{38}
\]
The eigenvalue λ is also positive real provided that λ ∈ R and
\[
\beta \le 4\sqrt{2}\, \sqrt{\alpha\gamma}\, \frac{\sqrt{1 - \tan^2 \frac{\pi}{m+1}}}{3 - \tan^2 \frac{\pi}{m+1}}. \tag{39}
\]

Remarks.
1. Note that for α, β, γ ≤ 0, we have λ > 0 by Theorem 2. The point of Theorem 10, then, is that if α, γ < 0 then we can allow some values of β > 0.
2. The right-hand side of (38) is less than that of (39) as we show at the end of the proof.

Proof of Theorem 10. Since the theorem for β ≤ 0 is contained in Theorem 2, it suffices to show the claims of the theorem hold under the hypotheses (38) and (39) with β replaced by |β|. In proving this we will closely follow the proof of Theorem 2. As in the proof of Theorem 2, in order to prove the reality of the eigenvalues we show that $\operatorname{Im}(\omega^2 E_*) \le 0$ for all $E_* \in \mathbb{C}$ satisfying (30). In this case, we see that (33) becomes
\[
\begin{aligned}
&\sin(2\phi) \int_0^{\infty} |g'(r)|^2\, dr - \sin(m\phi) \int_0^{\infty} r^m |g(r)|^2\, dr - \int_0^{\infty} \left[ \alpha r^2 \sin(3\phi) + \beta r \sin(2\phi) + \gamma \sin\phi \right] r |g(r)|^2\, dr \\
&\quad = -\operatorname{Im}\left( \omega^2 E_* \right) \int_0^{\infty} |g(r)|^2\, dr,
\end{aligned} \tag{40}
\]
where $\phi = \frac{2\pi}{m+2} - \theta$. (See the proof of Theorem 2 for the definition of g(r).)
Recall that the positivity of the left-hand side of (40) implies λ ∈ R. Since $|\theta| < \frac{\pi}{m+2}$, we get $\frac{\pi}{m+2} < \phi < \frac{3\pi}{m+2}$. Then since we are trying to show the left-hand side is positive under certain conditions on the coefficients, we restrict φ to $\frac{\pi}{m} \le \phi \le \frac{2\pi}{m}$ if m ≥ 5, and $\frac{\pi}{4} \le \phi < \frac{\pi}{2}$ if m = 4, so that sin(mφ) ≤ 0 in the second term above. (Note that when m = 4, $\phi = \frac{\pi}{2}$, we have $\theta = -\frac{\pi}{6}$ for which some terms in (40) are not integrable.) We further want the discriminant of the quadratic $\alpha r^2 \sin(3\phi) + \beta r \sin(2\phi) + \gamma \sin\phi$ to satisfy
\[
\beta^2 \sin^2(2\phi) - 4\alpha\gamma \sin(3\phi) \sin\phi \le 0,
\]
so that the quadratic expression has a single sign. That is, we want
\[
\beta^2 \le 4\alpha\gamma\, \frac{\sin(3\phi) \sin\phi}{\sin^2(2\phi)} = \alpha\gamma \left( 3 - \tan^2\phi \right). \tag{41}
\]
So in order to have a positive right-hand side in (41), since α < 0 and γ < 0, we need $\phi \in \left[ \frac{\pi}{m}, \frac{\pi}{3} \right) \cap \left[ \frac{\pi}{m}, \frac{2\pi}{m} \right]$ for which $3 - \tan^2\phi$ is positive. Since $3 - \tan^2\phi$ is decreasing, to maximize the right-hand side of (41), we choose $\phi = \frac{\pi}{m}$. Hence as we remarked at the beginning of the proof, this proves the reality of the eigenvalue under (38).

Similarly, in order to prove the positivity of the eigenvalues, suppose λ ∈ R and use (37). Let $h(r) = v(re^{i\theta}) = u(-ire^{i\theta})$. Then (37) becomes
\[
\begin{aligned}
&\sin\theta \int_0^{\infty} |h'|^2\, dr - \sin(m+1)\theta \int_0^{\infty} r^m |h|^2\, dr - \int_0^{\infty} \left[ \alpha r^2 \sin(4\theta) + \beta r \sin(3\theta) + \gamma \sin(2\theta) \right] r |h(r)|^2\, dr \\
&\quad = \lambda \sin\theta \int_0^{\infty} |h|^2\, dr, \quad \text{for all } \frac{\pi}{m+2} < \theta < \frac{3\pi}{m+2}.
\end{aligned} \tag{42}
\]
Then we restrict $\theta \in \left[ \frac{\pi}{m+1}, \frac{2\pi}{m+1} \right]$ so that sin(m + 1)θ ≤ 0 in the second term above. We also want the discriminant of the quadratic $\alpha r^2 \sin(4\theta) + \beta r \sin(3\theta) + \gamma \sin(2\theta)$ to be non-positive, so that the quadratic expression has a single sign. That is,
\[
\beta^2 \sin^2(3\theta) - 4\alpha\gamma \sin(4\theta) \sin(2\theta) \le 0.
\]
Hence we have
\[
\beta^2 \le 4\alpha\gamma\, \frac{\sin(4\theta) \sin(2\theta)}{\sin^2(3\theta)} = 32\alpha\gamma\, \frac{1 - \tan^2\theta}{\left( 3 - \tan^2\theta \right)^2}. \tag{43}
\]
One can check that $\frac{1 - \tan^2\theta}{(3 - \tan^2\theta)^2}$ is decreasing on $\left[ \frac{\pi}{m+1}, \frac{\pi}{3} \right)$. Also we want to have $1 - \tan^2\theta \ge 0$ so that the right-hand side of (43) is nonnegative (that is, $\frac{\pi}{m+1} \le \theta \le \frac{\pi}{4}$). Then it is not difficult to see that $\theta = \frac{\pi}{m+1}$ maximizes the right-hand side of (43). Also with $\theta = \frac{\pi}{m+1}$, the left-hand side of (42) is positive and hence λ > 0. So with help of Theorem 2, we conclude that all real eigenvalues are positive under the hypothesis (39).
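The two trigonometric identities used in (41) and (43), and the ordering of the bounds (38) and (39) stated in Remark 2, can be spot-checked numerically. The following sketch is illustrative only; the function names are hypothetical, and floating-point comparison at a few sample points is of course not a proof.

```python
import math


def rhs_38(m: int, alpha: float, gamma: float) -> float:
    """Right-hand side of (38); assumes alpha < 0, gamma < 0 and m >= 4."""
    return math.sqrt(alpha * gamma) * math.sqrt(3 - math.tan(math.pi / m) ** 2)


def rhs_39(m: int, alpha: float, gamma: float) -> float:
    """Right-hand side of (39); assumes alpha < 0, gamma < 0 and m >= 4."""
    t = math.tan(math.pi / (m + 1)) ** 2
    return 4 * math.sqrt(2) * math.sqrt(alpha * gamma) * math.sqrt(1 - t) / (3 - t)


def identity_41(phi: float):
    """Both sides of 4 sin(3 phi) sin(phi) / sin^2(2 phi) = 3 - tan^2(phi)."""
    lhs = 4 * math.sin(3 * phi) * math.sin(phi) / math.sin(2 * phi) ** 2
    return lhs, 3 - math.tan(phi) ** 2


def identity_43(theta: float):
    """Both sides of 4 sin(4 t) sin(2 t) / sin^2(3 t) = 32 (1 - tan^2 t) / (3 - tan^2 t)^2."""
    lhs = 4 * math.sin(4 * theta) * math.sin(2 * theta) / math.sin(3 * theta) ** 2
    t = math.tan(theta) ** 2
    return lhs, 32 * (1 - t) / (3 - t) ** 2


print(identity_41(0.3), identity_43(0.3))
print(rhs_38(5, -1.0, -1.0), rhs_39(5, -1.0, -1.0))
```

With α = γ = −1 the printed bounds show how much room for positive β each hypothesis leaves at m = 5.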
Still we must show that eigenvalues are positive under (38). We will do this by showing that the hypothesis (38) implies (39). That is, we will show
\[
3-\tan^2\frac{\pi}{m} < 32\,\frac{1-\tan^2\frac{\pi}{m+1}}{\left(3-\tan^2\frac{\pi}{m+1}\right)^2} \quad\text{for all } m \ge 4. \tag{44}
\]
Since $\frac{1-\tan^2\theta}{(3-\tan^2\theta)^2}$ is decreasing and positive for $\theta \in \left(0, \frac{\pi}{4}\right)$, the right-hand side of (44) is an increasing function of $m \ge 5$ and hence greater than or equal to its value at $m = 5$, which is 3. So (44) holds for $m \ge 5$ since its left-hand side is less than 3. And for $m = 4$, one just checks (44) directly. This completes the proof.

Remark. Above we have chosen $P(z) = \alpha z^3 + \beta z^2 + \gamma z$ for simplicity. One should note that the above argument works for real polynomials of the type $P(z) = \alpha z^{n+k} + \beta z^n + \gamma z^{n-k}$ for some positive integers $n > k$.

The previous theorem handled $\alpha < 0$. Similarly, we get the following for $\alpha > 0$ when $m = 4, 5, 6$.

Theorem 11. Let $m = 4, 5,$ or $6$ and let $\alpha > 0$, $\gamma < 0$. Suppose $\lambda$ is an eigenvalue of (1) with $P(z) = \alpha z^3 + \beta z^2 + \gamma z$ for some $\beta \in \mathbb{R}$. Then the eigenvalue is positive real, provided that
\[
\beta \le \sqrt{\alpha|\gamma|}\,\sqrt{\tan^2\frac{2\pi}{m} - 3}. \tag{45}
\]
The eigenvalue $\lambda$ is also positive real provided that $\lambda \in \mathbb{R}$ and
\[
\beta \le \begin{cases} \infty & \text{if } m = 4, 5, \\[4pt] 4\sqrt{2}\,\sqrt{\alpha|\gamma|}\,\dfrac{\sqrt{\left|1-\tan^2\frac{2\pi}{7}\right|}}{3-\tan^2\frac{2\pi}{7}} & \text{if } m = 6. \end{cases} \tag{46}
\]
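Inequality (44), which closes the proof of Theorem 10, and the thresholds in (45) and (46) are easy to evaluate numerically. The sketch below is illustrative only: the function names are hypothetical, the thresholds are normalized by $\sqrt{\alpha|\gamma|}$, and the $m = 6$ value assumes that (46) is (43) evaluated at $\theta = 2\pi/7$ with $\alpha\gamma = -\alpha|\gamma|$.

```python
import math

def lhs_44(m):
    """Left-hand side of (44): 3 - tan²(π/m)."""
    return 3 - math.tan(math.pi / m) ** 2

def rhs_44(m):
    """Right-hand side of (44): 32 (1 - tan²θ)/(3 - tan²θ)² at θ = π/(m+1)."""
    t2 = math.tan(math.pi / (m + 1)) ** 2
    return 32 * (1 - t2) / (3 - t2) ** 2

def reality_threshold(m):
    """sqrt(tan²(2π/m) - 3): the bound on β / sqrt(α|γ|) in (45), meant for m = 5 or 6.

    The max(..., 0.0) guards against a tiny negative rounding error at m = 6,
    where the exact value is 0.
    """
    return math.sqrt(max(math.tan(2 * math.pi / m) ** 2 - 3, 0.0))

def positivity_threshold_m6():
    """4√2 sqrt(tan²(2π/7) - 1) / (3 - tan²(2π/7)): the bound on β / sqrt(α|γ|) for m = 6."""
    t2 = math.tan(2 * math.pi / 7) ** 2
    return 4 * math.sqrt(2) * math.sqrt(t2 - 1) / (3 - t2)

if __name__ == "__main__":
    for m in range(4, 11):
        print(m, lhs_44(m), rhs_44(m), lhs_44(m) < rhs_44(m))
    print(reality_threshold(5), reality_threshold(6), positivity_threshold_m6())
```

The case $m = 4$ of (45) is excluded from `reality_threshold` because the bound is $+\infty$ by convention there.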
Remark. In this theorem we restrict $m$ to $m = 4, 5, 6$ for reasons explained in the proof below. In (45), by convention we take $\sqrt{|\alpha\gamma|}\,\sqrt{\tan^2\frac{2\pi}{m} - 3} = +\infty$ when $m = 4$, so that (45) just says $\beta \in \mathbb{R}$ in that case, as in Theorem 2. Note that by (46), all real eigenvalues are positive when $m = 5$, $\alpha > 0$, $\beta \in \mathbb{R}$, $\gamma < 0$. But there could perhaps be some non-real eigenvalues. We mention that the case $\beta \le 0$ or $m = 4$ of Theorem 11 is contained in Theorem 2, as explained in the proof of Theorem 10. The other cases of Theorem 11 are new.

Proof of Theorem 11. Since the case $\beta \le 0$ is known already, it is enough to prove the theorem under the hypotheses (45) and (46) with $\beta$ replaced by $|\beta|$. The proof below is very similar to that of the previous theorem, so we will refer to equations from that proof. Since the case $m = 6$ in (45) says $\beta \le 0$, this case is contained in Theorem 2. So for (45) we can assume $m = 5$. Again we use (33) with $P(z) = \alpha z^3 + \beta z^2 + \gamma z$. Then as we did in the proof of Theorem 10, we can get (40), where we want $\frac{\pi}{m} \le \phi \le \frac{2\pi}{m}$ so
that $\sin(m\phi) \le 0$ in the second term. But this time since $\alpha > 0$, we want $\sin(3\phi) < 0$ and want the discriminant of the quadratic $\alpha r^2\sin(3\phi) + \beta r\sin(2\phi) + \gamma\sin\phi$ to be non-positive. Then again we have (41). Obviously we want a nonnegative right-hand side in (41), and since $\alpha > 0$ and $\gamma < 0$, we need $3 - \tan^2\phi \le 0$. This means $\frac{\pi}{3} \le \phi < \frac{\pi}{2}$ as well as $\frac{\pi}{m} \le \phi \le \frac{2\pi}{m}$. Then in order to maximize the right-hand side of (41) we choose $\phi = \frac{2\pi}{5}$ for $m = 5$. This proves the reality of the eigenvalue since $\beta \le 0$ is covered by Theorem 2.

Similarly, in order to prove the positivity of the eigenvalues under the hypothesis (46), suppose $m = 5$ or $6$, let $\lambda \in \mathbb{R}$ and use (37). Then like before we get (43), where we want $\theta \in \left[\frac{\pi}{m+1}, \frac{2\pi}{m+1}\right]$ so that $\sin(m+1)\theta \le 0$ in the second term in (42). We further want the right-hand side of (43) to be nonnegative, and hence want $\theta \ge \frac{\pi}{4}$. For $m = 5$, since $\frac{\pi}{4} < \frac{\pi}{3} \le \frac{2\pi}{m+1}$, we get $\beta < \infty$ from (43). And for $m = 6$, since $\frac{\pi}{4} < \frac{2\pi}{m+1} = \frac{2\pi}{7} < \frac{\pi}{3}$,
and since $\frac{1-\tan^2\theta}{(3-\tan^2\theta)^2} < 0$ is decreasing for $\theta \in \left(\frac{\pi}{4}, \frac{\pi}{3}\right)$, we choose $\theta = \frac{2\pi}{7}$ in order to maximize the right-hand side of (43). Thus along with Theorem 2 for $\beta \le 0$, we conclude all real eigenvalues are positive under the hypotheses $\lambda \in \mathbb{R}$ and (46). Finally, it is not difficult to see that (45) implies (46), and hence we get the positivity of the eigenvalue under (45) as before. This completes the proof.

The second method for sharpening Theorem 2 is to make use of the $\int |g'(r)|^2\,dr$ term, by means of the harmonic oscillator inequality. Below we will prove that eigenvalues $\lambda$ are positive real if $\alpha < \frac{m}{2}$. Note that Dorey et al. [12] already prove this, and they also show that eigenvalues are real if $\alpha < \frac{m}{2} + 2$.

Theorem 12. Let $m \ge 4$ be an even integer. Suppose that an entire function $u$ along with $\lambda \in \mathbb{C}$ solves
\[
\left[-\frac{d^2}{dz^2} - (iz)^m - \alpha(iz)^{\frac{m}{2}-1}\right]u(z) = \lambda u(z), \tag{47}
\]
with the boundary conditions that $u$ decays to zero as $z$ tends to infinity along the rays $\arg z = -\frac{\pi}{2} \pm \frac{2\pi}{m+2}$. Then the eigenvalue $\lambda$ is positive real if $\alpha < \frac{m}{2}$. If $\alpha = \frac{m}{2}$, then all eigenvalues are positive real except the smallest one, which is zero and has the corresponding eigenfunction
\[
u_0(z) = \exp\left[\frac{2}{m+2}\,(iz)^{\frac{m+2}{2}}\right].
\]
Note that the function $v(z) = \exp\left[-\frac{2}{m+2}\,z^{\frac{m+2}{2}}\right]$ solves $-v''(z) + z^m v(z) = \frac{m}{2}\,z^{\frac{m}{2}-1}v(z)$, and when $m = 4k + 2$ for some $k \in \mathbb{N}$, $v(z)$ decays along both ends of the real axis. This type of problem was studied by Bender and Wang [5].

Proof. The outline of the proof is similar to those of the proofs of Theorems 10 and 11 above. But this time we will make use of the $\int |g'|^2\,dr$ term via the harmonic oscillator inequality.
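Before the estimates, the claim in Theorem 12 that $u_0$ solves (47) with $\lambda = 0$ when $\alpha = \frac{m}{2}$, and that it decays along the boundary rays, can be spot-checked numerically. The following Python sketch uses a central finite difference for $u''$; the function names, the step size `h`, and the sample points are assumptions chosen for illustration.

```python
import cmath
import math

def u0(z, m):
    """Candidate eigenfunction exp((2/(m+2)) (iz)^((m+2)/2)) for even m."""
    k = (m + 2) // 2  # an integer because m is even
    return cmath.exp((2.0 / (m + 2)) * (1j * z) ** k)

def residual(z, m, alpha, lam=0.0, h=1e-4):
    """-u'' - [(iz)^m + alpha (iz)^(m/2 - 1)] u - lam u, with u'' from central differences."""
    second = (u0(z + h, m) - 2 * u0(z, m) + u0(z - h, m)) / h ** 2
    potential = (1j * z) ** m + alpha * (1j * z) ** (m // 2 - 1)
    return -second - potential * u0(z, m) - lam * u0(z, m)

def modulus_on_ray(r, m, sign=+1):
    """|u0| at distance r along the ray arg z = -pi/2 + sign * 2 pi/(m+2)."""
    z = cmath.rect(r, -math.pi / 2 + sign * 2 * math.pi / (m + 2))
    return abs(u0(z, m))
```

For instance, `abs(residual(0.3 - 0.4j, 4, 2.0))` should be at the level of the finite-difference error, while `residual(0.3 - 0.4j, 4, 0.0)` is not small, since then $\alpha \ne \frac{m}{2}$. On the boundary rays $(iz)^{\frac{m+2}{2}} = -r^{\frac{m+2}{2}}$, so `modulus_on_ray(r, m)` should agree with $\exp\left(-\frac{2}{m+2}r^{\frac{m+2}{2}}\right)$.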
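Like before, we examine Eq. (33), for the choice $P(iz) = \alpha(iz)^{\frac{m}{2}-1}$. Then we have
\begin{align*}
&\sin\left(\frac{4\pi}{m+2} - 2\theta\right)\int_0^\infty |g'(r)|^2\,dr + \sin\left(m\theta + \frac{4\pi}{m+2}\right)\int_0^\infty r^m|g(r)|^2\,dr \\
&\qquad - \alpha\sin\left(\frac{4\pi}{m+2} + \frac{m-2}{2}\,\theta\right)\int_0^\infty r^{\frac{m}{2}-1}|g(r)|^2\,dr
 = -\operatorname{Im}\left(\omega^2 E_*\right)\int_0^\infty |g(r)|^2\,dr, \tag{48}
\end{align*}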
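where we refer to the proof of Theorem 2 for the definition of $g(r)$. Now we use the harmonic oscillator inequality on the first two terms above so that we have that for $|\theta| < \frac{\pi}{m+2}$ with $\sin\left(m\theta + \frac{4\pi}{m+2}\right) \ge 0$,
\begin{align*}
&\sin\left(\frac{4\pi}{m+2} - 2\theta\right)\int_0^\infty |g'(r)|^2\,dr + \sin\left(m\theta + \frac{4\pi}{m+2}\right)\int_0^\infty r^m|g(r)|^2\,dr \\
&\qquad \ge 2\sqrt{\sin\left(\frac{4\pi}{m+2} - 2\theta\right)\sin\left(m\theta + \frac{4\pi}{m+2}\right)}\int_0^\infty r^{\frac{m}{2}}\left|g'(r)g(r)\right|dr \\
&\qquad \ge \sqrt{\sin\left(\frac{4\pi}{m+2} - 2\theta\right)\sin\left(m\theta + \frac{4\pi}{m+2}\right)}\,\left|\int_0^\infty r^{\frac{m}{2}}\frac{d}{dr}|g(r)|^2\,dr\right| \\
&\qquad = \frac{m}{2}\sqrt{\sin\left(\frac{4\pi}{m+2} - 2\theta\right)\sin\left(m\theta + \frac{4\pi}{m+2}\right)}\int_0^\infty r^{\frac{m}{2}-1}|g(r)|^2\,dr,
\end{align*}
by parts. We then combine this with (48) to get that if
\[
\frac{m}{2}\sqrt{\sin\left(\frac{4\pi}{m+2} - 2\theta\right)\sin\left(m\theta + \frac{4\pi}{m+2}\right)} \ge \alpha\sin\left(\frac{4\pi}{m+2} + \frac{m-2}{2}\,\theta\right), \tag{49}
\]
then $\operatorname{Im}\left(\omega^2 E_*\right) \le 0$, which then proves the reality of the eigenvalues like in the proof of Theorem 2. Next we examine the condition (49) and find with a little effort that $\theta = 0$ is the best choice to get the best bound for $\alpha$ out of (49). That is, $\alpha \le \frac{m}{2}$.

Similarly, in order to prove the positivity and non-negativity of the eigenvalues, we use (37). Let $h(r) = u(-ire^{i\theta})$. Since $\lambda \in \mathbb{R}$, one can get the following from (37):
\[
\sin\theta\int_0^\infty |h'|^2\,dr - \sin(m+1)\theta\int_0^\infty r^m|h(r)|^2\,dr - \alpha\sin\left(\frac{m}{2}\theta\right)\int_0^\infty r^{\frac{m}{2}-1}|h|^2\,dr = \lambda\sin\theta\int_0^\infty |h|^2\,dr,
\]
provided $\frac{\pi}{m+2} < \theta < \frac{3\pi}{m+2}$. Then since $\sin(m+1)\theta < 0$ for $\frac{\pi}{m+1} < \theta < \frac{2\pi}{m+1}$, we apply the harmonic oscillator inequality to the first two terms above to get
\begin{align*}
\lambda\sin\theta\int_0^\infty |h|^2\,dr
&\ge \left[\frac{m}{2}\sqrt{\sin\theta\,|\sin(m+1)\theta|} - \alpha\sin\left(\frac{m}{2}\theta\right)\right]\int_0^\infty r^{\frac{m}{2}-1}|h|^2\,dr \\
&\ge \left(\frac{m}{2} - \alpha\right)\sin\left(\frac{2\pi}{m+2}\right)\int_0^\infty r^{\frac{m}{2}-1}|h|^2\,dr,
\end{align*}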
where we have chosen $\theta = \frac{2\pi}{m+2}$. (One can check with a little effort that $\theta = \frac{2\pi}{m+2}$ is the best choice for this argument.) Since $\sin\theta > 0$, we see $\lambda > 0$ when $\alpha < \frac{m}{2}$ and $\lambda \ge 0$ when $\alpha = \frac{m}{2}$.

When $\alpha = \frac{m}{2}$, we see that for $m$ even, $u_0(-iz)$ solves $-v''(z) + \left[z^m + \frac{m}{2}\,z^{\frac{m}{2}-1}\right]v(z) = 0$ with the properties that $u_0(-iz)$ decays in $S_{-1} \cup S_1$ and blows up in $S_0$. So it satisfies the proper boundary conditions of (8) and hence $u_0(z)$ is, in fact, the eigenfunction of (47). Hence, all eigenvalues are positive except the smallest eigenvalue zero, since eigenvalues are simple by Proposition 1. This completes the proof.
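The two trigonometric facts used in this proof, namely that (49) at $\theta = 0$ reduces to $\alpha \le \frac{m}{2}$ and that the bracket in the positivity estimate equals $\left(\frac{m}{2} - \alpha\right)\sin\frac{2\pi}{m+2}$ at $\theta = \frac{2\pi}{m+2}$, can be checked with a short Python sketch. The function names are hypothetical, and the sample values of $\theta$ and $\alpha$ are arbitrary choices for illustration.

```python
import math

def positivity_coefficient(theta, m, alpha):
    """(m/2) sqrt(sin θ |sin((m+1)θ)|) - α sin(mθ/2), the bracket in the positivity estimate."""
    root = math.sqrt(math.sin(theta) * abs(math.sin((m + 1) * theta)))
    return (m / 2) * root - alpha * math.sin(m * theta / 2)

def closed_form(m, alpha):
    """(m/2 - α) sin(2π/(m+2)), the value claimed at θ = 2π/(m+2)."""
    return (m / 2 - alpha) * math.sin(2 * math.pi / (m + 2))

def largest_alpha_in_49(theta, m):
    """Largest α allowed by (49) at a given θ.

    Assumes θ is small enough that both sines under the square root
    and the sine in the denominator are positive.
    """
    a = 4 * math.pi / (m + 2)
    root = math.sqrt(math.sin(a - 2 * theta) * math.sin(m * theta + a))
    return (m / 2) * root / math.sin(a + (m - 2) * theta / 2)
```

Evaluating `largest_alpha_in_49` at small nonzero $\theta$ gives values below $\frac{m}{2}$ for the sampled cases, which is consistent with the statement that $\theta = 0$ is the best choice in (49).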
Remark. Note that arguments similar to Theorem 12 work, for example, for some polynomial potentials with $P(z) = \beta z^2 + \gamma z$ when $m \ge 7$. Note also that to prove the reality and positivity of the eigenvalues, it is enough to show the left-hand sides of (33) and (37) are positive, respectively. For specific potentials, this might be achieved by other kinds of estimates.

Remark. The methods of proving the theorems in this section show how to sharpen Theorem 2 to problems with potentials “almost the same” as those of Theorem 2. But for Theorem 12, the proof of the reality can be shortened as follows. Suppose that an analytic function $v(z) = f(z, G^{-1}(\alpha), E_*)$ along with $E_* \in \mathbb{C}$ solves
\[
-v''(z) + \left(z^m - \alpha z^{\frac{m}{2}-1}\right)v(z) = -E_* v(z), \quad\text{with}\quad v'(0)v(0) = 0, \quad v(+\infty + 0i) = 0.
\]
This eigenproblem on the positive real axis is self-adjoint, and so $E_* \in \mathbb{R}$. Precisely, multiplying both sides by $\overline{v(z)}$, integrating over the positive real axis, and integrating the first term of the resulting equation by parts give
\[
\int_0^\infty |v'(x)|^2\,dx + \int_0^\infty x^m|v(x)|^2\,dx - \alpha\int_0^\infty x^{\frac{m}{2}-1}|v(x)|^2\,dx = -E_*\int_0^\infty |v(x)|^2\,dx.
\]
(Hence $E_* \in \mathbb{R}$.) Then one uses the harmonic oscillator inequality on the first two terms to have
\[
\left(\frac{m}{2} - \alpha\right)\int_0^\infty x^{\frac{m}{2}-1}|v(x)|^2\,dx \le -E_*\int_0^\infty |v(x)|^2\,dx.
\]
So if $\frac{m}{2} \ge \alpha$ then $E_* \le 0$, which implies $\operatorname{Im}\left(\omega^2 E_*\right) \le 0$. Thus the eigenvalue $\lambda$ is real, as before.

5. Conclusions

In this paper we have proved that a family of one dimensional Schrödinger equations with PT-symmetric polynomial potential $-\left[(iz)^m + a_1(iz)^{m-1} + a_2(iz)^{m-2} + \cdots + a_{m-1}(iz)\right]$ has all positive real eigenvalues, provided that $(j - k)a_k \ge 0$ for all $k$, for some $1 \le j \le \frac{m}{2}$. In particular, this result implies the original Bessis and Zinn-Justin conjecture for the potential $i\alpha z^3 + z^2$.

One would like to further extend the proof of the reality and positivity to a larger class of PT-symmetric potentials. For example, can we get a similar conclusion for $j > \frac{m}{2}$? Another interesting question is how much the reality and positivity of the eigenvalues depend on the boundary conditions. Our boundary conditions allow only one blowing up Stokes sector between the two decaying sectors near the negative
imaginary axis. It will also be interesting to consider three or more (an odd number of) blowing up sectors between the two decaying sectors on which we impose the boundary conditions, with the decaying sectors being symmetric with respect to the imaginary axis. Then since the negative imaginary axis is the center of one of the Stokes sectors for Eq. (1), one should not consider an even number of Stokes sectors between decaying sectors near the negative imaginary axis (see, for example, [4]). The problem with the potentials $+(iz)^m - P(iz)$ (whose leading term has the opposite sign to those in the class of problems studied in this paper) would be interesting too, in which case the negative imaginary axis is a critical ray, and hence we impose the boundary conditions to allow an even number of blowing up sectors between the decaying sectors. It should also be possible to apply the methods of this paper to some rational potentials.

One big question that remains is to determine the span of the set of the eigenfunctions. For Sturm-Liouville problems, we know that zeros of eigenfunctions interlace, which seems to play an important role in the completeness of the set of the eigenfunctions. Numerical work of Bender et al. [3] shows some intriguing interlacing properties of the zeros of the eigenfunctions for some PT-symmetric oscillators, too. So understanding these interlacing properties of the zeros might lead to progress. Still, it seems there are many more questions than answers in this direction. Also, one would like to study similar problems in higher dimensions.

Acknowledgements. The author was partially supported by the Campus Research Board at the University of Illinois. He thanks Richard S. Laugesen for encouragement, invaluable suggestions and discussions throughout the work.
References

1. Bender, C.M., Boettcher, S.: Real spectra in non-Hermitian Hamiltonians having PT-symmetry. Phys. Rev. Lett. 80, 5243–5246 (1998)
2. Bender, C.M., Boettcher, S.: Quasi-exactly solvable quartic potentials. J. Phys. A: Math. Gen. 31, L273–L277 (1998)
3. Bender, C.M., Boettcher, S., Savage, V.M.: Conjecture on the interlacing of zeros in complex Sturm-Liouville problems. J. Math. Phys. 41, 6381–6387 (2000)
4. Bender, C.M., Turbiner, A.: Analytic continuation of eigenvalue problems. Phys. Lett. A 173, 442–446 (1993)
5. Bender, C.M., Wang, Q.: A class of exactly-solvable eigenvalue problems. J. Phys. A: Math. Gen. 34, 9835–9847 (2001)
6. Bender, C.M., Weniger, E.J.: Numerical evidence that the perturbation expansion for a non-Hermitian PT-symmetric Hamiltonian is Stieltjes. J. Math. Phys. 42, 2167–2183 (2001)
7. Bernard, C., Savage, V.M.: Numerical simulations of PT-symmetric quantum field theories. Phys. Rev. D 64, 085010:1–11 (2001)
8. Caliceti, E., Graffi, S., Maioli, M.: Perturbation theory of odd anharmonic oscillators. Commun. Math. Phys. 75, 51–66 (1980)
9. Delabaere, E., Pham, F.: Eigenvalues of complex Hamiltonians with PT-symmetry I, II. Phys. Lett. A 250, 25–32 (1998)
10. Delabaere, E., Trinh, D.T.: Spectral analysis of the complex cubic oscillator. J. Phys. A: Math. Gen. 33, 8771–8796 (2000)
11. Dorey, P., Dunning, C., Tateo, R.: Supersymmetry and the spontaneous breakdown of PT-symmetry. J. Phys. A: Math. Gen. 34, L391–L400 (2001)
12. Dorey, P., Dunning, C., Tateo, R.: Spectral equivalences, Bethe ansatz equations, and reality properties in PT-symmetric quantum mechanics. J. Phys. A: Math. Gen. 34, 5679–5704 (2001)
13. Dorey, P., Tateo, R.: On the relation between Stokes multipliers and T-Q systems of conformal field theory. Nucl. Phys. B 563, 573–602 (1999)
14. Handy, C.R.: Generating converging bounds to the (complex) discrete states of the $P^2 + iX^3 + i\alpha X$ Hamiltonian. J. Phys. A: Math. Gen. 34, 5065–5081 (2001)
15. Handy, C.R., Khan, D., Wang, X.-Q., Tymczak, C.J.: Multiscale reference function analysis of the PT symmetry breaking solutions for the $P^2 + iX^3 + i\alpha X$ Hamiltonian. J. Phys. A: Math. Gen. 34, 5593–5602 (2001)
16. Hille, E.: Lectures on Ordinary Differential Equations. Reading, MA: Addison-Wesley, 1969
17. Hille, E.: Analytic Function Theory, Volume II. New York: Chelsea Publishing Company, 1987
18. Mezincescu, G.A.: Some properties of eigenvalues and eigenfunctions of the cubic oscillator with imaginary coupling constant. J. Phys. A: Math. Gen. 33, 4911–4916 (2000)
19. Mostafazadeh, A.: Pseudo-Hermiticity versus PT Symmetry: The necessary condition for the reality of the spectrum of a non-Hermitian Hamiltonian. J. Math. Phys. 43, 205–214 (2002)
20. Shin, K.C.: On the eigenproblems of PT-symmetric oscillators. J. Math. Phys. 42, 2513–2530 (2001)
21. Sibuya, Y.: Global theory of a second order linear ordinary differential equation with a polynomial coefficient. Amsterdam-Oxford: North-Holland Publishing Company, 1975
22. Simon, B.: Coupling constant analyticity for the anharmonic oscillator. Ann. Phys. 58, 76–136 (1970)
23. Suzuki, J.: Functional relations in Stokes multipliers − Fun with $x^6 + \alpha x^2$. J. Stat. Phys. 102, 1029–1047 (2001)
24. Znojil, M.: Spiked and PT-symmetrized decadic potentials supporting elementary N-plets of bound states. J. Phys. A: Math. Gen. 33, 4911–4916 (2000)

Communicated by B. Simon