0. The lemma 2 is proved. 3. // o diagram with (n~,n+) external lines consists only from pairs of conjugated vertices then n~ = n+ and the diagram has on n+ connected parts. Each connected diagram has (1,1) external lines and is half-planar, i.e. it can be drown in the half-plane without self intersections. LEMMA 0}. Its dual space [S]*a is the space of generalized functions. By identifying (L2) with its dual we get the following continuous inclusion maps: 0 there exists a constant K > 0 such that \F(Z)\ 0. Hence 0. Thus y? 6 [£]a and the converse of the theorem is proved. ■ 3 3.1 (x), it is well known that it appears a stability condition which, after a simple change of variables of the type 00, there exists a critical value TJC (percolation thresh old) for which the overlapping disks form an infinite cluster "spanning the entire domain of volume V". The dimensionless factor T)c has been obtained by numerical simulations 13 ' 14 ?7C w 1.18 (\) | A € Spectrum} are square integrable functions. However, the integrals ( l M ) = y"dA
o € V. and a selfadjoint Hamiltonian H with domain D(H) dense in U, has a solution given by the one parameter group of (strongly continuous) unitary operators W{t) inH:2 4>{t) = U\t)4>0=e-iHt o o (-iff{A)|p'> = K(p»,p'-A) Jo n ti n (x). The system {u n (x)} is left to be determined later. n A(x)d3x - 0 for all (x, t) G R2 in The orem 2.2. However, as proven by Lax [45], one can show that ip(x, to) > 0 for some t0 G K and all x G E actually implies i>(x, t) > 0 for all (x, t) € R2 (see also [31]). Moreover, in case V(x,to) is real-valued, we note that L{to)vj{x,to) — 0 has a positive solution tp(x,to) > 0 if and only if the Schrodinger differential expression L(to) = —dx + V(x,to) is nonoscillatory at ±oo (cf. [35]/ While the system of equations (2.4) always has a solution ip(x,t), (cf. Lemma 3 in [31]j, it is the additional requirement ip(x,t) > 0 for all (x, t) € K2 which renders (j> (and hence V) nonsingular. Without the condition xj) > 0, Theorem 2.2 describes (auto)Bdcklund transformations for the KdV and mKdV equations with characteristic singularities (cf. [32]). We conclude this section with a brief discussion of the KdV and mKdV hierarchies and indicate the extension of the basic identity (2.1) to these hierarchies. The higher-order (m)KdV equations can recursively be denned by KdVn(V) = Vt-2Xn+hx(V) = Q, n € N ) , mKdVn( 0) dq = -—VS(q)ds + Xadb(s) (5) m Let q«(x) be the solution of the stochastic equation (5) with the initial con dition x till the (random) explosion time r(q) then <Mx) = £ [ 0 ( q , W ) ] 4, that E|C/(a;)|2 = oo. The field 0, and by an associated Schwarz inequality 6 that 0<\(0\v(fiMf2M92)v(9i)\0)T\2 0) fail to exist in the usual sense. We summarize the situation as follows: On the one hand, we observe that taking the classical limit (h -> 0) after going to the asymptotic quantum fields leaves us with no classical scattering. 
On the other hand, if one takes the classical limit (h — ► 0) of the quantum equations of motion at finite time values and solves the resultant classical theory, then one should find the nontrivial scattering of the classical theory alluded to before. From another perspective, the behavior just outlined can also lead to intermediate situations. Let us imagine a quantum system endowed with a 0 so that p : = 2 e 2 K 2 | | ^ " ( r + 1 ) | | H 5 < 1. Let q:='p + r + l. Then $ := S~lF belongs to Ti~q, and ||*||2,_, 0 is such that r> (2 1n(3/2)) _ 1 (2 + ln(2X 2 )), f ~ l g in S{R). Choose ^ € C~(R) so that ipg = g. Then ((1 4>) fn, n € N) converges in <S(R) to zero. Given a, /? G N 0 , we have Wffn - 9\\Q,0 < W (a:),i;(a:))r;c is the usual directional derivative on X along the vector field v and Vx denotes the gradient on X. The logarithmic derivative of the measure a is given by the vector field 0° : = v x logp = Vxp/p (where (3a = 0 on {p = 0}). Then the logarithmic derivative of a along v is the function x >-> 0"(x) = {(5° (x) ,V(X))TXX + div t)(x), where div^ denotes the divergence on X w.r.t. the volume element m. Analogously, we define div* as the divergence on X w.r.t. a, i.e., div* is the dual operator on L2(a) := L2(X, a) of V * . Definition 3.2 For any v € VQ(X) we define the logarithmic derivative ofwa along v as the following function on T : := < # ) 7 ) = / \(^(x),v(x))TiX+divxv(x)}d7(x). (12) Jx Theorem 3.3 For all F,G € J"C^°(D,r) and any v € V0(X) the following integration by parts formula for wa holds: f 3 7 - BZ'{i) ,il>)ui(o), T^T and we can r define the T-Laplacian ( A F ) ( 7 ) := T r ( V r V r F ) ( 7 ) e TC^{V,T). We in troduce a differential operator in L?{ira) on the domain FCffip^T) by the formula «F)( , n 6 N the following formulas hold VjQ*-(7l V®n) = n( R U {00} such that the condition in Theorem 6.3 ensuring closability of (££, FC^iJ), T)) on L2(n) for \i G G\c{p, $ ) can be now easily formulated as follows: for /xa.e. 7 e T and m-a.e. 
x G {y € R d | J2v'ei\{v) ^ y ~ y'^ < °°} ** holds that Jv e^v'eT\<»> v y m(dy) < 00 for some open neighborhood Vx of x. This condition trivially holds e.g. if supp<j> is compact, { + € Lf£c)({cf> < 00};m). If even\x 6 £gC(.z,0) and(j> satisfies the assumptions in Proposition 7.1, then it suffices to merely assume that {4> < 00} is open and <j)+ 6 -^^({0 < 00}; m). This follows by an elementary consideration. Remark 6.6 1. We emphasize that Example 6.5 generalizes the closability result in 10, though an a-priori bigger domain for ££ is considered there. However, Theorems 6.1-6.3 are also valid for this bigger domain. The proofs are exactly the same. 2. We also like to emphasize that similarly to Example 6.5 one proves the closability of (££, J-Cf?(V,T)) (or with a larger domain in 10) on L2(fx) for fi € Glc(o~, $) in the case of multi-body potentials <j>. = „ , ; J is the top extreme eigenstate of J 2 and Jz. The variable a = tan | e ' v defines the stereographic projection from the south pole of the sphere to the plane passing through the equator, with complex coordinates a,a*. Under this projection, distances in the sphere are mapped into the plane according to (ds) = d92 + sin2 8d ,N}pBeiJdtc •^ i=l
We do not present the simple proof of this lemma here. The theorem now follows from the three lemmas above.
5 Entangled Commutation Relations

5.1 Stochastic limit for N point correlators in one-particle states
The following theorem has been proved in [13].

THEOREM 2. The stochastic limit
$$\lim_{\lambda\to 0} A_\lambda(p,k,t) = B^-(p,k,t), \qquad \lim_{\lambda\to 0} A_\lambda^+(p,k,t) = B^+(p,k,t)$$
exists in the sense of the convergence of the matrix elements ($\varepsilon_i = \pm$)
$$\lim_{\lambda\to 0} \langle 0|c(q)\,A_\lambda^{\varepsilon_1}(p_1,k_1,t_1)\cdots A_\lambda^{\varepsilon_n}(p_n,k_n,t_n)\,c^+(q')|0\rangle = (\,\ldots\,) \qquad (5.47)$$
as distributions, and the limiting operators $B^- = B$ and $B^+$ satisfy the entangled commutation relations
$$B(p,k,t)\,B^+(p',k',t') = 2\pi\,\delta(t-t')\,\delta(p-p')\,\delta(k-k')\,\delta(E(p,k))\,n(p), \qquad (5.48)$$
$$[n(p'), B^\pm(p,k,t)] = (\pm)\big(\delta(p'-p) - \delta(p'-p+k)\big)B^\pm(p,k,t), \qquad (5.49)$$
$$[n(p), n(p')] = 0. \qquad (5.50)$$
Here $\Psi_0$ is the vacuum in the new Hilbert space, $B(p,k,t)\prod_i c^+(q_i)\Psi_0 = 0$. We use the same notation for the creation and annihilation operators of $c$-particles in the original and in the new Hilbert spaces. $n(p)$ is the operator density of the $c$-particles, $n(p) = c^+(p)c(p)$. If we set $B(p,k,t) = b(t)\otimes B(p,k)$, where $b(t)b^+(t') = 2\pi\delta(t-t')$, then we obtain the relations (1.2):
$$B(p,k)\,B^+(p',k') = n(p)\,\delta(E(p,k))\,\delta(p-p')\,\delta(k-k').$$
A more general theorem will be proved in the next section. Note that to get a non-zero right-hand side in (5.48) we have to choose suitable dispersion relations, so that there are non-trivial solutions of the equation $E(p,k) = 0$.
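The remark on dispersion relations can be made concrete with a small numerical sketch. The function $E(p,k)$ is not defined in this section, so the code below assumes, purely for illustration, a one-dimensional energy defect of the form $E(p,k) = \omega(k) + \varepsilon(p-k) - \varepsilon(p)$ with $\varepsilon(q) = q^2/2$ and $\omega(k) = |k|$; the names `energy_defect` and `nontrivial_roots` are hypothetical. With this choice the equation $E(p,k)=0$ has a solution $k \neq 0$ only for $|p| > 1$.

```python
def energy_defect(p, k, eps=lambda q: 0.5 * q * q, omega=abs):
    """Assumed form E(p, k) = omega(k) + eps(p - k) - eps(p); illustrative only."""
    return omega(k) + eps(p - k) - eps(p)


def nontrivial_roots(p, k_min=-10.0, k_max=10.0, steps=4001):
    """Return the roots k != 0 of E(p, k) = 0 found by a grid scan plus bisection."""
    roots = []
    h = (k_max - k_min) / steps
    for i in range(steps):
        a, b = k_min + i * h, k_min + (i + 1) * h
        fa, fb = energy_defect(p, a), energy_defect(p, b)
        if fb == 0.0:
            root = b  # a grid point happens to be an exact root
        elif fa * fb < 0.0:
            # Bisection on a bracketing interval.
            for _ in range(200):
                m = 0.5 * (a + b)
                fm = energy_defect(p, m)
                if fa * fm <= 0.0:
                    b = m
                else:
                    a, fa = m, fm
            root = 0.5 * (a + b)
        else:
            continue
        if abs(root) > 1e-6:  # discard the trivial solution k = 0
            roots.append(root)
    return roots
```

For this assumed dispersion, `nontrivial_roots(2.0)` finds a single root near $k = 2(p-1) = 2$, while `nontrivial_roots(0.5)` returns an empty list, so the right-hand side of (5.48) would vanish identically for such $p$.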
5.2 Stochastic limit for N point correlators in n-particle states
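In this section the main result of this paper will be proved. Let us consider the matrix element
$$\langle 0|\prod_{i=1}^{n} c(f'_i)\,\prod_{i=1}^{N} A_\lambda^{\varepsilon_i}\,\prod_{j=1}^{m} c^+(f_j)|0\rangle, \qquad (5.51)$$
where $A_\lambda$ and $A_\lambda^+$ are given by (4.31) and (4.32) and
$$c^+(f) = \int c^+(k)\,f(k)\,dk, \qquad c(f') = \int c(k)\,f'(k)\,dk.$$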
Lemma 2 claims that if the diagram corresponding to (5.51) does not consist only of pairs of conjugated vertices, then its contribution vanishes in the limit $\lambda \to 0$ (in the sense of distributions). To select the diagrams corresponding to (5.51) in which all vertices form pairs of conjugated vertices, one has to consider diagrams that satisfy the following requirements:
a) $n = m$;
b) if $n \neq 1$ then the diagram must be disconnected and it must include $n$ connected parts. Each connected part corresponds to an $N_i$-point correlator of composite operators between one-particle states;
c) each connected part contains only pairs of conjugated vertices.
To describe all diagrams satisfying the above requirements it is convenient to proceed as follows. Let us first apply the Wick theorem to represent the product of the $A$'s in normal form with respect to the operators $c$ and $c^+$ only:
$$\prod_{i=1}^{N} A_\lambda^{\varepsilon_i} = \sum_{p} :K_p^{\{\varepsilon_i\}}(a,a^+,c,c^+):_c \qquad (5.52)$$
Here $:\ :_c$ means the normal product with respect to the operators $c$ and $c^+$, and the summation over $p$ involves all terms that appear in the process of applying the Wick theorem. Applying the Wick theorem to
$$\prod_i c(f'_i)\ :K_p^{\{\varepsilon_i\}}:_c\ \prod_j c^+(f_j)$$
we have
$$\prod_{i=1}^{n} c(f'_i)\ :K_p^{\{\varepsilon_i\}}:_c\ \prod_{j=1}^{m} c^+(f_j) = \ldots \qquad (5.53)$$
Here the index marks that we take into account the diagram with $l$ contractions ($c$-lines). For example,
$$A_\lambda^+ A_\lambda = K_{1,+-} + K_{2,+-}, \qquad (5.54)$$
$$K_{1,+-} = \int dp\,dp'\,dk\,dk'\; a^+(k)\,c^+(p-k)\,c^+(p')\,c(p)\,c(p'-k')\,a(k')\,v(p,k)\,v(p',k'),$$
$$K_{2,+-} = \int dp\,dk\,dk'\; c^+(p-k)\,a^+(k)\,a(k')\,c(p-k')\,v(p,k)\,v(p,k')$$
(see Fig. 5).

Figure 5: Friedrichs diagrams representation of (5.54)
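For the product $A_\lambda A_\lambda^+$ we have
$$A_\lambda A_\lambda^+ = K_{1,-+} + K_{2,-+}, \qquad (5.55)$$
where
$$K_{1,-+} = \int dp\,dp'\,dk\,dk'\; v(p,k)\,v(p',k')\,c^+(p)\,c^+(p'-k')\,c(p-k)\,c(p')\,a(k)\,a^+(k'),$$
$$K_{2,-+} = \int dp\,dk\,dk'\; v(p,k)\,v(p-k+k',k')\,c^+(p)\,c(p-k+k')\,a(k)\,a^+(k')$$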
Figure 6: Friedrichs diagrams representation of (5.55)
(see Fig. 6). To select the non-zero vacuum average of (5.53) that consists of diagrams satisfying requirements a), b) and c) we take the factorized products
$$\ldots \qquad (5.56)$$
Now (5.56) is a sum of products of Wick monomials containing the operators $a$ and $a^+$. The diagrams corresponding to (5.51) that survive in the limit $\lambda \to 0$ consist of $n$ continuous $c$-lines, each connecting one annihilation operator, some number of vertices and one creation operator. These $c$-lines enter different connected parts of the same disconnected diagram. Besides the $c$-lines, the diagrams contain $a$-lines, which connect only vertices that are crossed by the same $c$-line. Each connected part is represented by a half-planar diagram. This consideration proves the following theorem.

THEOREM 3. The stochastic limit of the matrix element (5.51) can be represented as follows:
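$$\lim_{\lambda\to 0}\langle 0|\prod_{i=1}^{n} c(f'_i)\,\prod_{i=1}^{N} A_\lambda^{\varepsilon_i}\,\prod_{j=1}^{m} c^+(f_j)|0\rangle = \delta_{nm}\sum_{p\in\mathcal{P}_n}\ \sum_{\text{"partitions"}} \langle 0|c(f'_1)\,B^{\varepsilon_{i_1}}\cdots B^{\varepsilon_{i_{k_1}}}\,c^+(f_{p(1)})|0\rangle \cdot \langle 0|c(f'_2)\,B^{\varepsilon_{j_1}}\cdots B^{\varepsilon_{j_{k_2}}}\,c^+(f_{p(2)})|0\rangle \cdots \langle 0|c(f'_n)\,B^{\varepsilon_{l_1}}\cdots B^{\varepsilon_{l_{k_n}}}\,c^+(f_{p(n)})|0\rangle$$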
Here $\mathcal{P}_n$ is the group of permutations, and the sum over "partitions" runs over all subsets $(i_1,\dots,i_{k_1})$, $(j_1,\dots,j_{k_2})$, ..., $(l_1,\dots,l_{k_n})$ of $\{1,\dots,N\}$ preserving the order, with $k_1 + k_2 + \dots + k_n = N$.
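The combinatorics of the sum over "partitions" in Theorem 3 can be sketched in a few lines of code. The snippet below is only an illustration: it enumerates all ways of distributing the indices $\{1,\dots,N\}$ over $n$ labelled, order-preserving blocks with $k_1+\dots+k_n=N$. The function name `ordered_partitions` and the `allow_empty` switch are assumptions of this sketch; whether empty blocks (connected parts without composite operators) are to be included is not stated explicitly in the text.

```python
from itertools import product
from math import factorial


def ordered_partitions(N, n, allow_empty=True):
    """Yield n-tuples of increasing index tuples whose disjoint union is {1, ..., N}."""
    for labels in product(range(n), repeat=N):
        # labels[i] is the block that receives the index i + 1; scanning the
        # indices in increasing order keeps every block order-preserving.
        blocks = tuple(
            tuple(i + 1 for i in range(N) if labels[i] == block)
            for block in range(n)
        )
        if allow_empty or all(blocks):
            yield blocks


def number_of_terms(N, n, allow_empty=True):
    """Count the terms of the double sum: n! permutations times the partitions."""
    return factorial(n) * sum(1 for _ in ordered_partitions(N, n, allow_empty))
```

For example, with $N=3$ and $n=2$ the enumeration contains the block pair `((1, 3), (2,))`, which corresponds to the factorized term with $B^{\varepsilon_1}B^{\varepsilon_3}$ in the first one-particle correlator and $B^{\varepsilon_2}$ in the second.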
Conclusion

Theorem 3 of the previous section describes the stochastic limit of composite operators in n-particle states. It gives the second-quantized generalization of the interacting commutation relations from the previous works [5], [13]. It would be interesting to find an operator expression of the obtained formula as well as the corresponding path integral representation.

Acknowledgments. I.Ya.A. and I.V.V. are grateful to the Centro Vito Volterra, Università di Roma Tor Vergata, for the kind hospitality. This work is supported in part by INTAS grant 96-0698; I.Ya.A. is also supported by RFFI-99-01-00166 and I.V.V. by RFFI-99-01-00105.

1. L. Streit, A new look at functional integration, in: Capri 1993, Proceedings, Advances in dynamical systems and quantum physics, 307-325.
2. L. Streit, White noise analysis and functional integrals, in: "Mathematical approach to fluctuations", T. Hida, ed., World Sci., 1993.
3. I.Ya. Aref'eva and I.V. Volovich, Large N QCD and q-deformed quantum field theories, Nucl. Phys. B 462 (1996) 600.
4. M.B. Halpern and C. Schwartz, The Algebras of Large N Matrix Mechanics, hep-th/9809197.
5. L. Accardi, Y.G. Lu and I.V. Volovich, Quantum theory and its stochastic limit, Springer Verlag (in press).
6. L. Accardi, S.V. Kozyrev and I.V. Volovich, Dynamical q-deformation in quantum theory and the stochastic limit, J. Phys. A: Math. Gen. 32 (1999) 3485-3495.
7. I.Ya. Aref'eva and I.V. Volovich, Quantum group particles and nonarchimedean geometry, Phys. Lett. B 268 (1991) 179-187.
8. I.Ya. Aref'eva and I.V. Volovich, Quantum group gauge theory, Mod. Phys. Lett. A 6 (1991) 893-907.
9. I.Ya. Aref'eva and I.V. Volovich, Noncommutative Gauge Fields on Poisson Manifolds, hep-th/9907114.
10. R. Vilela Mendes, Geometry, stochastic calculus and quantum fields in a non-commutative space-time, math-ph/9907001.
11. L. Accardi, Y.G. Lu and I.V. Volovich, Interacting Fock spaces and Hilbert module extensions of the Heisenberg commutation relations, Publications of RIMS, Kyoto, 1997.
12. L. Accardi, Y.G. Lu, Comm. Math. Phys. 180 (1996) 605.
13. L. Accardi, I.Ya. Aref'eva and I.V. Volovich, Non-Equilibrium Quantum Field Theory and Entangled Commutation Relations, hep-th/9905035.
14. V.S. Vladimirov, Equations of Mathematical Physics, Academic Press, 1986.
DE RHAM COMPLEX OVER PRODUCT MANIFOLDS: DIRICHLET FORMS AND STOCHASTIC DYNAMICS

SERGIO ALBEVERIO
Institut für Angewandte Mathematik, Universität Bonn, Wegelerstr. 6, D 53115 Bonn; SFB 256 (Bonn); BiBoS Research Centre (Bielefeld); CERFIM (Locarno); Acc. Arch. (USI)

ALEXEI DALETSKII
Institut für Angewandte Mathematik, Universität Bonn, Wegelerstr. 6, D 53115 Bonn; SFB 256 (Bonn); Institute of Mathematics, Kiev (Ukraine)

YURI KONDRATIEV
Institut für Angewandte Mathematik, Universität Bonn, Wegelerstr. 6, D 53115 Bonn; SFB 256 (Bonn); BiBoS Research Centre (Bielefeld); Institute of Mathematics, Kiev (Ukraine)

We define the de Rham complex over a product manifold (an infinite product of compact manifolds), and Dirichlet operators on differential forms, associated with differentiable measures (in particular, Gibbs measures), which generalize the notions of Bochner and de Rham Laplacians. We give probabilistic representations for the corresponding semigroups, and study properties of the corresponding stochastic dynamics.
1 Introduction
Various questions of stochastic analysis on product manifolds (that is, on infinite products of compact manifolds) have received strong interest in recent times. This interest is strongly motivated by applications to models of statistical mechanics. Various aspects of the study of Dirichlet operators associated with Gibbs measures on product manifolds, and of the corresponding semigroups, were considered by many authors, see ', ! for a detailed review of the literature. Let us remark that the Dirichlet operator of a differentiable measure, which is actually an infinite dimensional generalization of the Laplace-Beltrami operator, has a natural supersymmetric extension. Namely, we can consider Dirichlet operators on differential forms over a product manifold $\mathbf{M}$, extending the notions of Bochner and de Rham Laplacians. The study of the latter operators and the corresponding semigroups on finite dimensional manifolds was the subject of many works, and leads to deep results on the border of stochastic analysis, differential geometry and topology, and mathematical physics, see e.g. [17], [18]. In an infinite dimensional situation, such questions were discussed in the flat case in [10], [11], [12], [6]. A regularized heat semigroup on differential forms over the infinite dimensional torus was studied in [13]. The study of such questions on product manifolds reflects the non-trivial interplay of measure theory and differential geometry (which is essential in an infinite dimensional setting). The present paper represents a step in this direction.

The structure of the paper is as follows. In Section 2 we introduce the main notations and geometrical objects. Section 3 is devoted to the definition of the de Rham complex over $\mathbf{M}$. In Section 4 we introduce the Bochner-Dirichlet and de Rham-Dirichlet operators in the de Rham complex, and prove an analog of the Weitzenböck formula. Then we formulate the main result, which characterizes Markov and hypercontractivity properties of the corresponding semigroups. Section 5 is devoted to the development of a probabilistic technique needed for the proof of this theorem. In particular, we consider SDEs on tensor bundles over $\mathbf{M}$ and give with their aid probabilistic representations of the semigroups. Section 6 is devoted to a discussion of the obtained results in the framework of Gibbs measures. We give here only the main ideas of the proofs. A more extended presentation can be found in [6].

Dedication. It is a great pleasure for the authors to dedicate this work to Ludwig Streit on the occasion of his 60th birthday. His work on the interface of mathematical physics and stochastic analysis was always a great inspiration for us, and we are very grateful to him.

Acknowledgments. We are glad to thank K. D. Elworthy, M. Röckner and A. Thalmaier for useful and stimulating discussions. The financial support of SFB-256, DFG research projects and INTAS project No. 378 is gratefully acknowledged.

2 Setting
Let $A$, $B$ be Banach spaces and $H$ be a Hilbert space. We will use the following general notations:
- $\langle\cdot,\cdot\rangle$ - the pairing of $A$ and $A'$, $A'$ being the dual space;
- $(\cdot,\cdot)_H$ - the scalar product in $H$;
- $\|\cdot\|_A$ - the norm in $A$;
- $\mathcal{L}(A,B)$ - the space of bounded linear operators $A \to B$;
- $\mathcal{L}(A) = \mathcal{L}(A,A)$;
- $\mathcal{L}^n(A,B)$ - the space of bounded $n$-linear operators $A \to B$;
- $HS(H,B)$ - the space of Hilbert-Schmidt operators $H \to B$.
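Let $M$ be a compact connected smooth $N$-dimensional manifold. Let us assume that $M$ is equipped with the Riemannian structure given by the operator field $G(x): T_xM \to T_x^*M$, $(\cdot,\cdot)_{T_xM} = \langle G(x)\cdot,\cdot\rangle$. The distance on $M$ which corresponds to this Riemannian structure will be denoted by $\rho$. Let us consider the integer lattice $\mathbb{Z}^d$, $d \ge 1$, and define the space $\mathbf{M}$, which is an infinite product of manifolds $M_k = M$:
$$\mathbf{M} = M^{\mathbb{Z}^d} := \times_{k\in\mathbb{Z}^d} M_k \ni x = (x_k)_{k\in\mathbb{Z}^d}. \qquad (1)$$
$\mathbf{M}$ is endowed with the product topology. Given $\Lambda \subset \mathbb{Z}^d$,
$$\mathbf{M} \ni x \mapsto x_\Lambda = (x_k)_{k\in\Lambda} \in M^\Lambda \qquad (2)$$
denotes the natural projection of $\mathbf{M}$ onto $M^\Lambda$. We denote by $\nabla_k\phi$ the derivative of the function $\phi$ w.r.t. the variable $x_k$ (which will always be identified with the vector field over $M$ (gradient of ...

Let us remark that for $u \in \mathcal{F}C^m(\mathbf{M})$ we have
$$\nabla u \in \mathcal{F}C^{m-1}(\mathbf{M}\to T\mathbf{M}). \qquad (4)$$
We will use the notations
$$\langle X(x),Y(x)\rangle_x = \sum_{k\in\mathbb{Z}^d} (X_k(x),Y_k(x))_{T_{x_k}M}, \qquad \operatorname{div}X = \sum_{k\in\mathbb{Z}^d} \operatorname{div}_k X_k, \qquad (5)$$
$\operatorname{div}_k$ meaning the divergence with respect to $x_k$. The space $\mathbf{M}$ has a Banach manifold structure with the Banach space $l_b(\mathbb{Z}^d\to\mathbb{R}^N)$ of bounded sequences $Y = (Y_k)_{k\in\mathbb{Z}^d}$, $Y_k\in\mathbb{R}^N$, equipped with the norm
$$\|Y\|_{l_b} := \sup_{k\in\mathbb{Z}^d}\|Y_k\|_{\mathbb{R}^N}, \qquad (6)$$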
as the model. However, since this norm is not smooth, one encounters difficulties in using this manifold structure for the purposes of stochastic analysis. A way of overcoming this difficulty was proposed in [3], [4]. In these works, an analog of a Riemannian structure on $\mathbf{M}$ was introduced. On a heuristic level, the tangent space to $\mathbf{M}$ at the point $x$ can be identified with the space $\times_{k\in\mathbb{Z}^d}T_{x_k}M$. In order to define a differentiable structure on $\mathbf{M}$, it is natural to consider some Hilbert subspace of $\times_{k\in\mathbb{Z}^d}T_{x_k}M$. Let $l_1 := l_1(\mathbb{Z}^d\to\mathbb{R}^1_+)$ be the space of summable sequences $p = (p_k)_{k\in\mathbb{Z}^d}$ of positive numbers. For a fixed $p\in l_1$ let us define the space
$$T_{p,x} = \Big\{X \in \times_{k\in\mathbb{Z}^d}T_{x_k}M : \sum_{k\in\mathbb{Z}^d} p_k\,\|X_k\|^2_{T_{x_k}M} < \infty\Big\}, \qquad (7)$$
equipped with the natural scalar product
$$(X,Y)_{p,x} = \sum_{k\in\mathbb{Z}^d} p_k\,(X_k,Y_k)_{T_{x_k}M}. \qquad (8)$$
The scalar product $(X,Y)_{p,x}$ in the spaces $T_{p,x}$ will play the role of a Riemannian-like structure for $\mathbf{M}$. The space $\mathbf{M}$ equipped with this structure will be denoted by $\mathbf{M}_p$. The bundle over $\mathbf{M}_p$ with fibres $T_{p,x}$ will be called the tangent bundle of $\mathbf{M}_p$ and denoted by $T\mathbf{M}_p$. The fibres $T_{p,x}$ will be denoted by $T_x\mathbf{M}_p$. The bundle $T\mathbf{M}_p$ is not the tangent bundle to $\mathbf{M}_p$ in the proper sense. Nevertheless, $T\mathbf{M}_p$ gives us the possibility to define analogues of various differentiable structures on $\mathbf{M}_p$. In [3], [4], the spaces $C^n(\mathbf{M}_p\to B)$ of differentiable mappings of $\mathbf{M}$ into a Banach space $B$, resp. $C^n(\mathbf{M}_p\to T\mathbf{M}_p)$ of differentiable vector fields over $\mathbf{M}$, were introduced. That is, $f\in C^1(\mathbf{M}_p\to T\mathbf{M}_p)$ iff the matrix
$$\nabla f(x) := (\nabla_j f_k(x))_{j,k\in\mathbb{Z}^d} \qquad (9)$$
of partial (covariant) derivatives generates a bounded operator in $T_x\mathbf{M}_p$, continuous in $x$. We will use the notation $T\mathbf{M} := T\mathbf{M}_{\mathbf{1}}$, where $\mathbf{1}$ is the weight sequence with elements $\mathbf{1}_k = 1$. The space $\mathbf{M}_p$ possesses the metric $\rho_p$ defined by
$$\rho_p(x,y)^2 = \sum_{k\in\mathbb{Z}^d}\rho(x_k,y_k)^2\,p_k, \qquad (10)$$
which makes it a complete metric space.

Let $\mu$ be a probability measure on $\mathbf{M}$, differentiable in the sense that the following integration by parts formula holds true: for any $u\in\mathcal{F}C^1(\mathbf{M}\to\mathbb{R}^1)$ and any vector field $X\in\mathcal{F}C^1(\mathbf{M}\to T\mathbf{M})$,
$$\int\sum_{k\in\mathbb{Z}^d}(\nabla_k u(x),X_k(x))_{T_{x_k}M}\,d\mu(x) = -\int u(x)\,\beta^\mu_X(x)\,d\mu(x), \qquad (11)$$
with some $\beta^\mu_X\in L^2(\mathbf{M},\mu)$. $\beta^\mu_X$ is called the logarithmic derivative of $\mu$ in the direction $X$. We assume that $\beta^\mu_X$ is given by
$$\beta^\mu_X(x) = \sum_{k\in\mathbb{Z}^d}\big((\beta^\mu_k(x),X_k(x))_{T_{x_k}M} + \operatorname{div}_k X_k(x)\big), \qquad (12)$$
where $\beta^\mu(x) = (\beta^\mu_k(x))_{k\in\mathbb{Z}^d}\in C^1(\mathbf{M}_p\to T\mathbf{M}_p)$ for some weight sequence $p\in l_1$. We will call $\beta^\mu$ the (vector) logarithmic derivative of $\mu$.

3 Differential forms and de Rham complex
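The simplest differential forms over $\mathbf{M}$ are the forms with both image and domain consisting of cylinder elements. The space of $m$-times differentiable forms of order $n$ of this type will be denoted by $\mathcal{F}\Omega_n^m$. We will use the notation $\mathcal{F}\Omega_n := \cup_m\mathcal{F}\Omega_n^m$. Each $v\in\mathcal{F}\Omega_n$ has the form
$$v(x) = \sum_{k_1,\dots,k_n\in\mathbb{Z}^d} v_{k_1,\dots,k_n}(x), \qquad (13)$$
where $v_{k_1,\dots,k_n}(x)\in T_{x_{k_1}}M\wedge\dots\wedge T_{x_{k_n}}M$, and the sum is finite. Actually, each $w\in\mathcal{F}\Omega_n$ can be regarded as a differential form on the manifold $M^\Lambda$ for some finite $\Lambda\subset\mathbb{Z}^d$. For such forms, the symbols $\nabla_k$ resp. $\Delta_k$ will denote the covariant derivative resp. the Bochner Laplacian ($\Delta_k w(x) := \operatorname{Tr}\nabla_k^2 w(x)$) w.r.t. $x_k$. We will use the notation $\nabla w = (\nabla_k w)_{k\in\mathbb{Z}^d}$, $\Delta w = \sum_k\Delta_k w$. For $w\in\mathcal{F}\Omega_n$ we have $\nabla w\in$ ... . We will use the notation
$$\langle w(x),v(x)\rangle_x = \sum_{k_1,\dots,k_n\in\mathbb{Z}^d}\big(w_{k_1,\dots,k_n}(x),v_{k_1,\dots,k_n}(x)\big)_{T_{x_{k_1}}M\wedge\dots\wedge T_{x_{k_n}}M}. \qquad (14)$$
Let $\mu$ be a differentiable measure on $\mathbf{M}$ with the logarithmic derivative $\beta^\mu$. We introduce spaces of integrable forms $L^s_\mu\Omega_n$ as the completions of $\mathcal{F}\Omega_n$ w.r.t. the norm $\|\cdot\|_s$ given by
$$\|v\|_s^s = \int\Big(\sum_{k_1,\dots,k_n\in\mathbb{Z}^d}\big(v_{k_1,\dots,k_n}(x),v_{k_1,\dots,k_n}(x)\big)_{T_{x_{k_1}}M\wedge\dots\wedge T_{x_{k_n}}M}\Big)^{s/2}d\mu(x). \qquad (15)$$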
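In order to define the de Rham complex over $\mathbf{M}$, we need spaces of smooth forms. Given a Hilbert space $\mathcal{K}$, we define the space $\wedge^n\mathcal{K}$, which is the $n$-fold antisymmetric tensor power of the Hilbert space $\mathcal{K}$. We introduce in the standard way the exterior multiplication
$$\wedge : \wedge^n\mathcal{K}\times\wedge^m\mathcal{K}\to\wedge^{n+m}\mathcal{K}, \qquad (16)$$
$$\phi\wedge\psi := \frac{(n+m)!}{n!\,m!}\,AS_{n+m}(\phi\otimes\psi),$$
$AS_{n+m}:\otimes^{n+m}\mathcal{K}\to\wedge^{n+m}\mathcal{K}$ being the antisymmetrization operator, and the creation resp. annihilation operators
$$a^*(k):\wedge^n\mathcal{K}\to\wedge^{n+1}\mathcal{K},\quad k\in\mathcal{K}, \qquad (17)$$
resp.
$$a(k):\wedge^{n+1}\mathcal{K}\to\wedge^n\mathcal{K} \qquad (18)$$
(the adjoint operator). Let us define the bundle $\wedge^nT\mathbf{M}$ with fibres $\wedge^nT_x\mathbf{M}$. Sections of this bundle can be considered as differential forms over $\mathbf{M}$ (we identify the space $T^*_x\mathbf{M}$ with $T_x\mathbf{M}$). Each section $w$ of $\wedge^nT\mathbf{M}$ can be represented as a sum of sections with cylinder images,
$$w(x) = \sum_{k_1,\dots,k_n\in\mathbb{Z}^d} w_{k_1,\dots,k_n}(x),\qquad w_{k_1,\dots,k_n}(x)\in T_{x_{k_1}}M\wedge\dots\wedge T_{x_{k_n}}M, \qquad (19)$$
converging at each $x$ in the norm of $\wedge^nT_x\mathbf{M}$. We define the (covariant) derivative $\nabla w$ as an infinite block-operator matrix,
$$\nabla w(x) := \big(\nabla_j w_{k_1,\dots,k_n}(x)\big)_{j,k_1,\dots,k_n\in\mathbb{Z}^d}, \qquad (20)$$
where $\nabla_j w_{k_1,\dots,k_n}(x)\in\mathcal{L}(T_{x_j}M,\,T_{x_{k_1}}M\wedge\dots\wedge T_{x_{k_n}}M)$ (if it exists). We can now define the space $C^1(\mathbf{M}_p\to\wedge^nT\mathbf{M})$ of differentiable forms, requiring that
$$\nabla w(x)\in\mathcal{L}(T_x\mathbf{M}_p,\wedge^nT_x\mathbf{M}) \qquad (21)$$
and is bounded uniformly in $x\in\mathbf{M}$. Similarly we define the spaces $C^m(\mathbf{M}_p\to\wedge^nT\mathbf{M})$. We will use the notation
$$\Omega^m_n(\mathbf{M}_p) := C^m(\mathbf{M}_p\to\wedge^nT\mathbf{M}). \qquad (22)$$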
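We introduce the exterior differential
$$d_n:\Omega^m_n(\mathbf{M}_p)\to\Omega^{m-1}_{n+1}(\mathbf{M}_p), \qquad (23)$$
setting
$$d_n w(x) = (n+1)\,AS_{n+1}(\nabla w(x)), \qquad (24)$$
where $AS_{n+1}:\otimes^{n+1}T_x\mathbf{M}\to\wedge^{n+1}T_x\mathbf{M}$ is the antisymmetrization operator, and $\nabla w(x)\in\mathcal{L}(T_x\mathbf{M}_p\to\wedge^nT_x\mathbf{M})$ is identified with the element of $(\wedge^nT_x\mathbf{M})$ ...

... $w_{k_1,\dots,k_n}(x) = a_{k_1,\dots,k_n}(x)\,h_{x_{k_1}}\wedge\dots\wedge h_{x_{k_n}}$, $h_{x_k}\in T_{x_k}M$, $a_{k_1,\dots,k_n}\in C^m(\mathbf{M}_p\to\mathbb{R}^1)$. We have as usual
$$d_n w_{k_1,\dots,k_n}(x) = da_{k_1,\dots,k_n}(x)\wedge h_{x_{k_1}}\wedge\dots\wedge h_{x_{k_n}},\qquad d_n w = \sum_{k_1,\dots,k_n} d_n w_{k_1,\dots,k_n}, \qquad (25)$$
where $da_{k_1,\dots,k_n}$ is the 1-form defined by the derivative of the function $a_{k_1,\dots,k_n}$.

Let us observe that $d_n$ can be considered as an unbounded operator $L^2_\mu\Omega_n\to L^2_\mu\Omega_{n+1}$. We denote by $d_n^*: L^2_\mu\Omega_{n+1}\to L^2_\mu\Omega_n$ the adjoint operator. The following fact follows in a straightforward manner from the integration by parts formula.

Proposition 3.1 Let us assume that $\beta^\mu\in C^1(\mathbf{M}_p\to T\mathbf{M}_p)$. Then the operator $d_n^*$ is densely defined, its domain of definition contains the space $\mathcal{F}\Omega_{n+1}$, and for $w\in\mathcal{F}\Omega_{n+1}$ we have $d_n^*w\in\Omega^1_n(\mathbf{M}_p)$.

In what follows, we will also use the spaces $L^s(\mathbf{M}\to\wedge^nT\mathbf{M}_q,\mu)$ of integrable forms with values in $\wedge^nT\mathbf{M}_q$, with the norm given by
$$\Big(\int\|w(x)\|^s_{\wedge^nT_x\mathbf{M}_q}\,d\mu(x)\Big)^{1/s}.$$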
4
Dirichlet operators on differential forms
In this section we define Dirichlet forms and Dirichlet operators in spaces of differential forms over M, generalizing the notion of Bochner and de Rham Laplacians. Let n be a differentiable measure on M such that the corresponding log arithmic derivative &P belongs to C1 (M p -t TM P ) for some weight sequence p € h. We introduce two Dirichlet forms on l£ft„. For «,« 6 T£ln we define the pre-Dirichlet form £f? resp. £* associated with /x, setting £«(«,») := f ^ ( V k u ( x ) , Vkt;(x))T.M dn(x), J
(26)
k
resp. £*(u, v) := (du, d«) t jn„ +1 + (<**". d * u )/.jn„_ 1 •
(27)
Analogously to the classical pre-Dirichlet form £M on functions (see e.g. 7 ), the forms £j? resp. £j* have generators Hjf resp. H£ acting in L2fi„ on the domain Tiln as H*u(x) = - E Aku(x) - ^(/3 k (a;), Vku(x))A»T.M®T.kAf, k
(28)
k
resp. H£ = dd* + d*d,
(29)
and are therefore closable. We preserve the same notations for their closures resp. generators of the closures. Let us observe that in the case of a single manifold M and fi a Riemannian volume measure on it, the operator Hjf resp. Hj* coincide with the Bochner resp. de Rham Laplacian, see e.g. , 15. We will call them the BochnerDirichlet resp. the de Rham-Dirichlet operator associated with p. Our next goal is to find a relation between the operators H * and H^. / .\i=i,-.Af
/ ,\i=i
N
Let f e^J be an orthonormal basis in T^M. Then f e£ 1 is obviously the orthonormal basis for T Z M, (for any weight sequence q). We set (4)* := aV k )> 4 :=«(«£)• Let us introduce the operator R-W= E
E
[^i.«(^k)(o k )X(a k )*oL]
k € Z d i,j,M,l
in A n T x M,, where Rij,i is the curvature tensor on M.
(30)
Lemma 4.1 $R_n(x)$ is a bounded operator in each space $\wedge^nT_x\mathbf{M}_q$, $q$ arbitrary.

Because of the assumptions on $\beta^\mu$, we have $\nabla\beta^\mu(x)\in\mathcal{L}(T_x\mathbf{M}_p)$. Because of the symmetry of the Levi-Civita connection, the operator $\nabla\beta^\mu(x)$ is symmetric w.r.t. the scalar product of $T_x\mathbf{M}$ and therefore belongs also to $\mathcal{L}(T_x\mathbf{M}_{p^{-1}})$. Let us define the operator
$$[\nabla\beta^\mu(x)]^{\wedge n} := \nabla\beta^\mu(x)\otimes 1\otimes\dots\otimes 1 + 1\otimes\nabla\beta^\mu(x)\otimes\dots\otimes 1 + \dots + 1\otimes\dots\otimes 1\otimes\nabla\beta^\mu(x) \qquad (31)$$
in $\wedge^nT_x\mathbf{M}$. We have $[\nabla\beta^\mu(x)]^{\wedge n}\in\mathcal{L}(\wedge^nT_x\mathbf{M}_{p^{-1}})$ and $[\nabla\beta^\mu(x)]^{\wedge n}\in\mathcal{L}(\wedge^nT_x\mathbf{M}_p)$. The following result is an extension of the classical Weitzenböck formula.

Theorem 4.1 For $u\in\mathcal{F}\Omega_n$
$$H^R_\mu u(x) = H^B_\mu u(x) + R_n(x)u(x) - [\nabla\beta^\mu(x)]^{\wedge n}u(x). \qquad (32)$$

Let us introduce the operator $R^\mu_n(x)\in\mathcal{L}(\wedge^nT_x\mathbf{M}_p)$,
$$R^\mu_n(x) = R_n(x) - [\nabla\beta^\mu(x)]^{\wedge n}. \qquad (33)$$
Formula (32) implies that $R^\mu_n(x)$ does not depend on the choice of the basis in (30). Both $H^B_\mu$ resp. $H^R_\mu$ are non-negative self-adjoint operators in $L^2_\mu\Omega_n$ and therefore generate strongly continuous contraction semigroups $T^B_\mu(t)$ resp. $T^R_\mu(t)$ in it. The following result characterizes the properties of these semigroups.

Theorem 4.2 Let us assume that $\beta^\mu\in C^4(\mathbf{M}_p\to T\mathbf{M}_p)$. Then:
1) both $H^B_\mu$ and $H^R_\mu$ are essentially self-adjoint operators on $\mathcal{F}\Omega_n$;
2) for any weight sequence $q$ the semigroup $T^B_\mu(t)$ leaves invariant the space $C(\mathbf{M}\to\wedge^nT\mathbf{M}_q)$, and for $v\in C(\mathbf{M}\to\wedge^nT\mathbf{M}_q)$ we have
$$\|T^B_\mu(t)v(x)\|_{\wedge^nT_x\mathbf{M}_q}\le T_\mu(t)\,\|v\|_{\wedge^nT\mathbf{M}_q}(x); \qquad (34)$$
3) for the weight sequence $q = p$ or $q = p^{-1}$ the semigroup $T^R_\mu(t)$ leaves invariant the space $C(\mathbf{M}\to\wedge^nT\mathbf{M}_q)$, and for $v\in C(\mathbf{M}\to\wedge^nT\mathbf{M}_q)$ we have
$$\|T^R_\mu(t)v(x)\|_{\wedge^nT_x\mathbf{M}_q}\le e^{-tr}\,T_\mu(t)\,\|v\|_{\wedge^nT\mathbf{M}_q}(x), \qquad (35)$$
where $r$ is such that $r\cdot(h,h)_{\wedge^nT_x\mathbf{M}_q}\le(R^\mu_n(x)h,h)_{\wedge^nT_x\mathbf{M}_q}$ for each $h\in\wedge^nT_x\mathbf{M}_q$ and $x\in\mathbf{M}$;
4) let us assume that there exists a constant $r_0$ such that $r_0\cdot(h,h)_{\wedge^nT_x\mathbf{M}}\le(R^\mu_n(x)h,h)_{\wedge^nT_x\mathbf{M}}$ for each $h\in\wedge^nT_x\mathbf{M}$ and $x\in\mathbf{M}$; then the semigroup $T^R_\mu(t)$ leaves invariant the space $C(\mathbf{M}\to\wedge^nT\mathbf{M})$, and for $v\in C(\mathbf{M}\to\wedge^nT\mathbf{M})$ we have
$$\|T^R_\mu(t)v(x)\|_{\wedge^nT_x\mathbf{M}}\le e^{-tr_0}\,T_\mu(t)\,\|v\|_{\wedge^nT\mathbf{M}}(x). \qquad (36)$$

The proof follows from a probabilistic representation of the semigroups $T^B_\mu(t)$ and $T^R_\mu(t)$, which will be given in Section 5.

Let us recall that a semigroup $T(t)$ (acting on functions on $\mathbf{M}$) is called hypercontractive if for all $1 < s < a < \infty$
$$T(t): L^s(\mathbf{M},\mu)\to L^a(\mathbf{M},\mu), \qquad (37)$$
and $T(t)$ is a contraction when
$$\frac{s-1}{a-1}\ge e^{-2t\lambda} \qquad (38)$$
for all $t > 0$ and some $\lambda > 0$. Formula (34) emphasizes a certain Markov property of $T^B_\mu(t)$ and implies its hypercontractivity provided $T_\mu$ is hypercontractive. An analogous statement for $T^R_\mu(t)$ requires positivity of $R^\mu_n$. The following result is a corollary of Theorem 4.2.

Theorem 4.3 Let us assume that in the framework of Theorem 4.2 $r > 0$ resp. $r_0 > 0$. Then:
1) the semigroup $T^R_\mu(t)$ is Markov in the sense that for $q = p$ or $q = p^{-1}$ resp. $q = 1$
$$\|T^R_\mu(t)v(x)\|_{\wedge^nT_x\mathbf{M}_q}\le T_\mu(t)\,\|v\|_{\wedge^nT\mathbf{M}_q}(x); \qquad (39)$$
2) the semigroup $T^R_\mu(t)$ is contractive in each $L^s(\mathbf{M}\to\wedge^nT\mathbf{M}_q,\mu)$, where $q = p^{-1}$ resp. $q = 1$;
3) if $T_\mu(t)$ is hypercontractive, then $T^R_\mu(t)$ is also hypercontractive (with the same $\lambda$), in the sense that
$$T^R_\mu(t): L^s(\mathbf{M}\to\wedge^nT\mathbf{M}_q,\mu)\to L^a(\mathbf{M}\to\wedge^nT\mathbf{M}_q,\mu), \qquad (40)$$
where $q = p^{-1}$ resp. $q = 1$, and $T^R_\mu(t)$ is a contraction when (38) holds.

5 Probabilistic representations of semigroups
The aim of this section is to obtain probabilistic representations of the semigroups $T^B_\mu(t)$ and $T^R_\mu(t)$, and to prove with their aid Theorem 4.2. The construction of the diffusion process in $\mathbf{M}$, which gives the stochastic dynamics associated with the classical Dirichlet form $\mathcal{E}_\mu$, is given in [2]-[4]. This process was constructed as the strong solution to an infinite system of SDEs
$$d\xi_k(t) = \beta_k(\xi(t))\,dt + P(\xi_k(t))\circ dw_k(t),\quad k\in\mathbb{Z}^d, \qquad (41)$$
in the Stratonovich form on $M$, where $w_k$, $k\in\mathbb{Z}^d$, are independent Wiener processes in a given Euclidean space $\mathbb{R}^n$ such that $M\subset\mathbb{R}^n$ isometrically, and $P(m):\mathbb{R}^n\to T_mM$ is the orthoprojector. We will also write this system in the form of one SDE
$$d\xi(t) = \beta(\xi(t))\,dt + \mathbf{P}(\xi(t))\circ dw(t) \qquad (42)$$
on $\mathbf{M}$, where $w$ is the cylinder Wiener process in $\mathcal{K} := l_2(\mathbb{Z}^d\to\mathbb{R}^n)$, and $\mathbf{P}(x)$ is the block-diagonal operator $\mathcal{K}\to T_x\mathbf{M}$ with diagonal blocks $\mathbf{P}_{kk}(x) = P(x_k)$. It is easy to see that $\mathbf{P}(x)\in HS(\mathcal{K},T_x\mathbf{M}_p)$ for each $p\in l_1$.

Theorem 5.1 ([3], [4]) Let $\beta^\mu\in C^1(\mathbf{M}_p\to T\mathbf{M}_p)$. Then the SDE (42) has a unique solution $\xi_x(t)$ for any initial data $x\in\mathbf{M}$. $\xi_x(t)$ depends continuously (in the mean square sense) on the initial data $x$. The corresponding generator coincides on $\mathcal{F}C^1(\mathbf{M})$ with the Dirichlet operator of the measure $\mu$.

Our next goal is to construct the parallel translation of differential forms along solutions of (42). For this, we need an analog of the Levi-Civita connection on $\mathbf{M}$. Let $O(M)$ be the orthonormal frame bundle over $M$. The space $O(M)$ obviously has the structure of a compact manifold which fits into the framework of the previous sections. We define the product manifold
$$O(\mathbf{M}) := \times_{k\in\mathbb{Z}^d}O(M). \qquad (43)$$
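As an informal numerical illustration of the projection form of the SDE (41), the following sketch simulates a single lattice site with zero drift, taking $M$ to be the unit sphere $S^2\subset\mathbb{R}^3$ so that $P(x)v = v - (x\cdot v)\,x$. All of these choices are assumptions made only for the example: the sphere, the vanishing drift, the Heun (Stratonovich midpoint-type) discretization, the final renormalization back onto the sphere, and the function names `project_tangent`, `heun_step` and `simulate` are not taken from the paper.

```python
import math
import random


def project_tangent(x, v):
    """Orthoprojector onto the tangent space of the sphere at x: v - (x.v / x.x) x."""
    scale = sum(a * b for a, b in zip(x, v)) / sum(a * a for a in x)
    return [b - scale * a for a, b in zip(x, v)]


def heun_step(x, dt, rng):
    """One Heun step for d xi = P(xi) o dw (Stratonovich), then renormalization."""
    dw = [rng.gauss(0.0, math.sqrt(dt)) for _ in x]
    k1 = project_tangent(x, dw)
    predictor = [a + b for a, b in zip(x, k1)]
    k2 = project_tangent(predictor, dw)
    y = [a + 0.5 * (b + c) for a, b, c in zip(x, k1, k2)]
    norm = math.sqrt(sum(a * a for a in y))
    # Retraction onto the sphere; it only removes the discretization error.
    return [a / norm for a in y]


def simulate(x0, steps, dt, seed=0):
    """Return the end point of a discretized path started at x0."""
    rng = random.Random(seed)
    x = list(x0)
    for _ in range(steps):
        x = heun_step(x, dt, rng)
    return x
```

A call such as `simulate([0.0, 0.0, 1.0], 1000, 1e-3)` produces a point that stays on the sphere; the infinite system (41) would couple many such sites through the drift $\beta_k$, which is omitted here.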
This space will play the role of orthonormal frame bundle over M, with fibres O s M:=x k6Z
(44)
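We denote by $\pi : \mathbf{OM} \to \mathbf{M}$ the corresponding projection. Given the Levi-Civita connection on $OM$, we equip $\mathbf{OM}$ with the product connection. This means the following. Let us consider the tangent bundle $T(\mathbf{OM})_p$ to $\mathbf{OM}$. By definition, for each $z \in \mathbf{OM}$
$$T_z(\mathbf{OM})_p = \bigoplus_{k \in \mathbb{Z}^d} \sqrt{p_k}\, T_{z_k}(OM). \tag{45}$$
We define the horizontal tangent space $HT_z(\mathbf{OM})_p$ resp. the vertical tangent space $VT_z(\mathbf{OM})_p$ as
$$HT_z(\mathbf{OM})_p := \bigoplus_{k \in \mathbb{Z}^d} \sqrt{p_k}\, HT_{z_k}(OM), \qquad VT_z(\mathbf{OM})_p := \bigoplus_{k \in \mathbb{Z}^d} \sqrt{p_k}\, VT_{z_k}(OM), \tag{46}$$
where $HT_{z_k}(OM)$ resp. $VT_{z_k}(OM)$ is the horizontal resp. vertical tangent space of $OM$ associated with the given connection. We obviously have the decomposition
$$T_z(\mathbf{OM})_p = HT_z(\mathbf{OM})_p \oplus VT_z(\mathbf{OM})_p. \tag{47}$$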
Let us observe that the corresponding covariant derivative of a vector field $X \in C^1(\mathbf{M}_p \to T\mathbf{M}_p)$ coincides with the derivative $dX$ defined above. We denote by $\tilde X$ the horizontal lift of the vector field $X$ over $\mathbf{M}$ to $\mathbf{OM}$. It follows directly from our definition of $HT_z(\mathbf{OM})$ that
$$(\tilde X(z))_k = \tilde Y^k(z_k), \tag{48}$$
where $\tilde Y^k$ is the horizontal lift of the vector field $Y^k := X_k(\cdot, \pi(z_\Lambda))$ over $M$, $\Lambda = \mathbb{Z}^d \setminus \{k\}$. We then obviously have $\tilde X \in C^n(\mathbf{OM}_p \to T(\mathbf{OM})_p)$ provided $X \in C^n(\mathbf{M}_p \to T\mathbf{M}_p)$.

Let now $\xi(t)$ be the Brownian motion with the drift $a \in C^1(\mathbf{M}_p \to T\mathbf{M}_p)$ described by SDE (42). We consider its horizontal lift $\gamma(t) \in \mathbf{OM}$ defined by the SDE
$$d\gamma(t) = \tilde a(\gamma(t))\,dt + \tilde{\mathbf{P}}(\gamma(t)) \circ dw(t), \tag{49}$$
where $\tilde{\mathbf{P}}(z)h$ is the horizontal lift of $\mathbf{P}(\pi(z))h$, with initial data $\gamma(0)$ such that $\pi(\gamma(0)) = \xi(0)$, which is uniquely solvable by Theorem 5.1. As in the case of a single manifold $M$ (see [17]) we have $\pi(\gamma(t)) = \xi(t)$.

Given $n \ge 1$ and a weight sequence $q$ of positive numbers (not necessarily decreasing), we consider the bundle $V := \wedge^n T\mathbf{M}_q$ over $\mathbf{M}$. Let $v \in V_x$ and denote by $\rho(g)v$ the natural action of $g \in O(T_x\mathbf{M}) := \times_{k \in \mathbb{Z}^d} O(T_{x_k}M)$ on $V_x$, $O(T_mM)$ being the space of orthogonal linear transformations of $T_mM$. We can now define the parallel translation
$$P_{\xi(t)} : V_{\xi(0)} \to V_{\xi(t)} \tag{50}$$
along the solution $\xi(t)$ of (42) as
$$P_{\xi(t)}v = \rho(\gamma(t)\gamma(0)^{-1})v, \tag{51}$$
where $\gamma(t)$ solves (49). The transformation $P_{\xi(t)}$ is obviously orthogonal.

Let $J$ be a continuous operator field on $\mathbf{M}$, $J(x) \in \mathcal{L}(V_x)$. We define the mapping $\tilde J : \mathbf{OM} \to \mathcal{L}(V_{\xi(0)})$ by the formula
$$\tilde J(z) := \rho(z_0z^{-1})\,J(\pi(z))\,\rho(zz_0^{-1}), \quad z_0 = \gamma(0), \tag{52}$$
and consider the equation
$$d\eta(t) = \tilde J(\gamma(t))\eta(t)\,dt \tag{53}$$
in $V_{\xi(0)}$. This equation is uniquely solvable for any initial data by the general theory of SDEs in Hilbert spaces (see e.g. [16]), and defines a multiplicative functional of the process $\gamma(t)$ and consequently of $\xi(t)$. Let us consider the semigroup $T^{\xi,J}(t)$ acting in the space $C(\mathbf{M} \to V)$, which is defined by the expression
$$\langle T^{\xi,J}(t)w(x), y\rangle = \mathbf{E}\,\langle w(\xi_x(t)), P_{\xi(t)}\eta_y(t)\rangle, \tag{54}$$
where $\eta_y(t)$ is the solution to (53) with the initial condition $y$, and denote by $H^{\xi,J}$ its generator.

Proposition 5.1 For $u \in \mathcal{F}C^2(\mathbf{M} \to V)$ we have
$$H^{\xi,J}u(x) = \tfrac{1}{2}\Delta u(x) + \langle a(x), \nabla u(x)\rangle_x + J^*(x)u(x). \tag{55}$$

Proof. Let us first assume that $a = 0$ and $J = 0$. This case obviously reduces to the case of a single manifold, and the corresponding generator is equal to $\tfrac{1}{2}\Delta$ by the finite dimensional theory [17]. In the general case, applying the Ito formula to the function $\Phi(z,v) = \langle u(\pi(z)), \rho(zz_0^{-1})v\rangle$ on $\mathbf{OM} \times V_x$, we obtain the additional first-order term
$$\nabla_z\Phi(z,v)[\tilde a(z)] + \nabla_v\Phi(z,v)[\tilde J(z)v] = \langle \nabla u(\pi(z))[a(\pi(z))], \rho(zz_0^{-1})v\rangle + \langle u(\pi(z)), J(\pi(z))\rho(zz_0^{-1})v\rangle, \tag{56}$$
which implies the result.

The following statement follows in a straightforward way from the Ito formula and the Gronwall inequality.

Proposition 5.2 The semigroup $T^{\xi,J}(t)$ satisfies the following estimate:
$$\|T^{\xi,J}(t)v(x)\|_{V_x} \le e^{tc}\,T^{\xi}(t)\|v\|_{V}(x), \tag{57}$$
$v \in C(\mathbf{M} \to V)$, where $c$ is such that $c\mathbf{1} \ge J(x)$ for each $x$, as operators in $V_x$.

Remark 5.1 Let us consider the bundle $W := \wedge^n T\mathbf{M}_{q^1}$, where the weight sequence $q^1$ is such that $q^1_k \le q_k$ for each $k$. Then we have $V_x \subset W_x$ for each $x \in \mathbf{M}$. Let us assume that there exists a constant $c_1$ such that $c_1\mathbf{1} \ge J(x)$ for each $x$, as operators in $W_x$. Similar arguments show that the process $\eta(t)$ also satisfies the estimate
$$\|\eta(t)\|_{W_{\xi(0)}} \le e^{tc_1}\|\eta(0)\|_{W_{\xi(0)}}, \tag{58}$$
which implies that $T^{\xi,J}(t)$ leaves invariant the space $C(\mathbf{M} \to W)$, and for $v \in C(\mathbf{M} \to W)$ we have
$$\|T^{\xi,J}(t)v(x)\|_{W_x} \le e^{tc_1}\,T^{\xi}(t)\|v\|_{W}(x). \tag{59}$$
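The Gronwall-type bound behind (57)-(59) can be checked numerically in a finite-dimensional toy case. The sketch below is an illustration under stated assumptions, not part of the original argument: it replaces equation (53) by a deterministic linear ODE $\dot\eta = J(t)\eta$ in $\mathbb{R}^2$, with a hypothetical operator family $J(t)$ whose quadratic form satisfies $\langle J(t)v, v\rangle \le c|v|^2$ for $c = 0.5$, and verifies $|\eta(t)| \le e^{ct}|\eta(0)|$ along a Runge-Kutta solution.

```python
import math

C = 0.5  # assumed bound: <J(t)v, v> <= C |v|^2 for all t


def J(t):
    """Hypothetical operator family: symmetric part diag(0.5 cos t, -0.3),
    antisymmetric part a rotation, which does not affect the quadratic form."""
    return ((0.5 * math.cos(t), 2.0), (-2.0, -0.3))


def rhs(t, eta):
    (a, b), (c, d) = J(t)
    return (a * eta[0] + b * eta[1], c * eta[0] + d * eta[1])


def rk4_step(t, eta, h):
    """Classical fourth-order Runge-Kutta step for d(eta)/dt = J(t) eta."""
    k1 = rhs(t, eta)
    k2 = rhs(t + h / 2, (eta[0] + h / 2 * k1[0], eta[1] + h / 2 * k1[1]))
    k3 = rhs(t + h / 2, (eta[0] + h / 2 * k2[0], eta[1] + h / 2 * k2[1]))
    k4 = rhs(t + h, (eta[0] + h * k3[0], eta[1] + h * k3[1]))
    return tuple(
        eta[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) for i in range(2)
    )


def max_ratio(T=5.0, h=1e-3, eta0=(1.0, 0.0)):
    """Largest value of |eta(t)| / (e^{Ct} |eta(0)|) observed on [0, T]."""
    n0 = math.hypot(*eta0)
    eta, t, worst = eta0, 0.0, 0.0
    for _ in range(int(round(T / h))):
        eta = rk4_step(t, eta, h)
        t += h
        worst = max(worst, math.hypot(*eta) / (math.exp(C * t) * n0))
    return worst


if __name__ == "__main__":
    print(max_ratio())
```

The ratio stays below one because $\frac{d}{dt}|\eta|^2 = 2\langle J\eta, \eta\rangle \le 2c|\eta|^2$, which is the same one-line computation that yields the factor $e^{tc}$ in Proposition 5.2.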
Now we can construct probabilistic representations for the Bochner and de Rham semigroups and obtain with their aid a proof of Theorem 4.2. Let us observe that on $\mathcal{F}\Omega^n$ we have
$$H^B_\mu = -H^{\xi}, \tag{60}$$
where $H^{\xi}$ is the generator of the parallel translation along the paths of the "stochastic dynamics" process $\xi$ associated with $\mu$, defined by SDE (42). If $(H^B_\mu, \mathcal{F}\Omega^n)$ is essentially self-adjoint, the semigroup $T^B_\mu(t)$ has a simple probabilistic interpretation. Indeed, in this case $T^B_\mu(t)$ is the unique semigroup with the generator $(-H^B_\mu, \mathcal{F}\Omega^n)$, and therefore for $w \in C(\mathbf{M} \to \wedge^n T\mathbf{M})$ we have
$$T^B_\mu(t)w(x) = \mathbf{E}\,\big(P^*_{\xi(t)}w(\xi_x(t))\big). \tag{61}$$
In order to obtain a probabilistic representation for the semigroup $T^R_\mu$ associated with $H^R_\mu$, let us observe that for $w \in \mathcal{F}\Omega^n$ we have
$$H^{\xi,-R_\mu}w = -H^R_\mu w. \tag{62}$$
The corresponding semigroup $T^{\xi,-R_\mu}(t)$ leaves invariant the space $C(\mathbf{M} \to \wedge^n T\mathbf{M}_{p^{-1}})$, and coincides on this space with $T^R_\mu$, provided $H^R_\mu$ is essentially self-adjoint on $\mathcal{F}\Omega^n$. The latter fact holds true if $\beta^\mu \in C^4(\mathbf{M}_p \to T\mathbf{M}_p)$ (in this case both semigroups $T^R_\mu(t)$, $T^{\xi,-R_\mu}(t)$ leave invariant the space $C^2(\mathbf{M} \to \wedge^n T\mathbf{M}_{p^{-1}})$, which is obviously contained in the domains of $H^B_\mu$ and $H^R_\mu$). This implies Theorem 4.2.

6 Stochastic dynamics for lattice models associated with Gibbs measures on product manifolds
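Let us consider a family of potentials $U = (U_\Lambda)_{\Lambda \in \Omega}$, $U_\Lambda \in C(M^\Lambda)$. Let $\Omega(k)$ be the family of all sets $\Lambda \in \Omega$ containing the point $k \in \mathbb{Z}^d$. We will assume the following:

(U1)
$$\sum_{\Lambda \in \Omega(k)} \sup_{x \in \mathbf{M}} |U_\Lambda(x)| < \infty \tag{63}$$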
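for any $k \in \mathbb{Z}^d$. Let $\mu$ be a Gibbs measure on the Borel $\sigma$-algebra $\mathcal{B}(\mathbf{M})$, associated with the family of potentials $U$. We denote by $\mathcal{G}(U)$ the family of all such Gibbs measures. $\mathcal{G}(U)$ is non-empty under the condition (63), see e.g. [19, 18]. Let us now assume that the family of potentials $U$ satisfies (in addition to (U1)) the following conditions:

(U2) $U_\Lambda \in C^1(M^\Lambda)$ for each $\Lambda$, and
$$\sup_{k \in \mathbb{Z}^d} \sum_{\Lambda \in \Omega(k)} |||\nabla_k U_\Lambda|||_{TM} < \infty, \tag{64}$$
where $|||\nabla_k U_\Lambda|||_{TM} := \sup_{x \in \mathbf{M}} \|\nabla_k U_\Lambda(x)\|_{T_{x_k}M}$;

(U3) $U_\Lambda \in C^2(M^\Lambda)$ for each $\Lambda$, and there exists $C < \infty$ such that
$$\sup_{k \in \mathbb{Z}^d} \sum_{j \in \mathbb{Z}^d} \sum_{\Lambda \in \Omega(k)} |||\nabla_j\nabla_k U_\Lambda|||_{TM \otimes TM} < C. \tag{65}$$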
SU
P E
k6Z 1
Elll V Ji V J' VkC/A IH® 4 ™ <00 '
' JiJj6Z''A6n
(66) 8U
P
£
V
V
£lH J'- J«
Vk:/A
8
HI® ™
where
$$|||\nabla_{j_1}\cdots\nabla_{j_m}\nabla_k U_\Lambda|||_{\otimes^{m+1}TM} := \sup_{x \in \mathbf{M}} \|\nabla_{j_1}\cdots\nabla_{j_m}\nabla_k U_\Lambda(x)\|_{T_{x_{j_1}}M \otimes \cdots \otimes T_{x_{j_m}}M \otimes T_{x_k}M}.$$
This condition obviously implies that there exists a weight sequence $p \in l_1$ such that $\beta^\mu \in C^4(\mathbf{M}_p \to T\mathbf{M}_p)$. We summarize our discussion in the following

Theorem 6.2 Let the family of interactions $U$ satisfy (U1), (U2), (U3) and (66), and let $\mu \in \mathcal{G}(U)$. Then there exists a weight sequence $p \in l_1$ such that the statements of Theorems 4.2 and 4.3 hold true.

Remark 6.2 An application of an approximation technique similar to the one developed in [3, 7] makes it possible to prove the essential self-adjointness of the operators $H^B_\mu$ and $H^R_\mu$ on $\mathcal{F}\Omega^n$ under the conditions (U1), (U2), (U3) only, and therefore to avoid the additional condition (66) in Theorem 6.2.

References
1. S. Albeverio: Some applications of infinite dimensional analysis in mathematical physics, Helv. Phys. Acta 70 (1997), 479-506.
2. S. Albeverio, A. Daletskii, Yu. Kondratiev: Infinite systems of stochastic differential equations and some lattice models on compact Riemannian manifolds, Ukr. Math. J. 49 (1997), 326-337.
3. S. Albeverio, A. Daletskii, Yu. Kondratiev: Stochastic analysis on product manifolds, in the book: "Stochastic Dynamics", ed. H. Crauel and M. Gundlach, Springer 1999 (Proceedings of the conference "Random Dynamical Systems", Bremen, April 28 - May 2, 1997).
4. S. Albeverio, A. Daletskii, Yu. Kondratiev: Stochastic equations and Dirichlet operators on product manifolds, Univ. Bonn, Preprint No. 591 SFB 256 (1999).
5. S. Albeverio, A. Daletskii, Yu. Kondratiev: Stochastic analysis on product manifolds: Dirichlet operators on differential forms, Univ. Bonn, Preprint No. 598 SFB 256 (1999).
6. S. Albeverio, Yu. Kondratiev: Supersymmetric Dirichlet operators, Ukr. Math. J. 47 (1995), 583-592.
7. S. Albeverio, Yu. Kondratiev, M. Röckner: Uniqueness of the stochastic dynamics for continuous spin systems on a lattice, J. Func. Anal. 133, No. 1 (1995), 10-20.
8. S. Albeverio, Yu. Kondratiev, M. Röckner: Quantum fields, Markov fields and stochastic quantization, in the book: Stochastic Analysis: Mathematics and Physics, Nato ASI, Academic Press (1995) (A. Cardoso et al., eds).
9. S. Albeverio and M. Röckner: Dirichlet forms on topological vector space - construction of an associated diffusion process, Probab. Th. Rel. Fields 83 (1989), 405-434.
10. A. Arai: Supersymmetric extension of quantum scalar field theories, in Quantum and non-commutative analysis, H. Araki et al. (eds.), 73-90, Kluwer Academic Publishers, Holland (1993).
11. A. Arai: Dirac operators in Boson-Fermion Fock spaces and supersymmetric quantum field theory, Journal of Geometry and Physics 11 (1993), 465-490.
12. A. Arai, I. Mitoma: De Rham-Hodge-Kodaira decomposition in ∞-dimensions, Math. Ann. 291 (1991), 51-73.
13. A. Bendikov, R. Léandre: Regularized Euler-Poincaré number of the infinite dimensional torus, to appear.
14. Yu. M. Berezansky and Yu. G. Kondratiev: "Spectral Methods in Infinite Dimensional Analysis", Naukova Dumka, Kiev, 1988. [English translation: Kluwer Academic, Dordrecht/Norwell, MA, 1995].
15. L. Cycon, R. G. Froese, W. Kirsch, B. Simon: "Schrödinger Operators with Applications to Quantum Mechanics and Global Geometry", Springer, 1987.
16. Yu. L. Daletcky and S. V. Fomin: "Measures and Differential Equations in Infinite-Dimensional Space", Kluwer Academic, Dordrecht, Boston, London, 1991.
17. K. D. Elworthy: Geometric aspects of diffusions on manifolds, Lecture Notes in Math. 1362, Springer Verlag, Berlin and New York, 276-425 (1988).
18. V. Enter, R. Fernandez, D. Sokal: Regularity properties and Pathologies of Position-Space renormalization-group transformations, J. Stat. Phys. 72, Nos. 5/6 (1993), 879-1168.
19. H. O. Georgii: "Gibbs measures and phase transitions", Studies in Mathematics, Vol. 9, de Gruyter, Berlin, New York (1988).
20. L. Gross: Hypercontractivity and logarithmic Sobolev inequalities for Clifford-Dirichlet forms, Duke Math. J. 43 (1975), 383-386.
REAL TIME RANDOM WALKS ON P-ADIC NUMBERS

SERGIO ALBEVERIO
Institut für Angewandte Mathematik, Universität Bonn, Bonn, Germany

WITOLD KARWOWSKI
Institute of Theoretical Physics, University of Wrocław, Wrocław, Poland

Dedicated to Ludwig Streit on the occasion of his 60th birthday

The field of p-adic numbers is a complete metric space under a non-Archimedean metric. For this reason it is a suitable mathematical framework for the description of hierarchical phenomena appearing in many disciplines. Here we summarize some investigations of real time random walks on the p-adic numbers, based on the properties of transition functions obtained by solving the Kolmogorov equations.
1 Introduction

The last two decades have seen a steadily growing interest in p-adic analysis. A substantial part of the mathematical investigations has received momentum from physical motivations. There have been two main physical ideas leading towards the application of local fields and in particular the p-adic numbers [1]. One came from particle physics. It was based on the conjecture that space-time at Planck distances may have a structure better described by a field other than that of the real numbers. This has motivated extensive mathematical work which resulted in the elaboration of interesting structures going under the names of p-adic quantum mechanics and p-adic quantum field theory [2,3].

In this paper we will relate mainly to another idea coming from statistical physics, in particular in connection with models describing relaxation in glasses. The non-exponential nature of those relaxations has been interpreted as a consequence of a hierarchical structure of the state space, which can in turn be put in connection with p-adic structures. In fact p-adic numbers can be interpreted as having a "tree structure". In the physical literature various models of "diffusion" or random walks on (finite or infinite) trees have been investigated [1,15]. In particular L. Brekke and M. Olson [4] constructed a class of random walks on p-adic numbers and discussed the relation with relaxations in glasses.

This and pure mathematical interest motivated systematic studies of random walks indexed by continuous real time with the field of p-adic numbers as state space. This note is intended to present some methods and results obtained in this direction. The Lévy processes on p-adic numbers have been constructed by different methods, see Evans [16] and Figà-Talamanca [5], but we shall be mostly concerned with the approach developed by Albeverio, Karwowski [7,8] and later extended by Karwowski, Vilela-Mendes [9], Albeverio, Karwowski, Zhao [10] and Albeverio, Karwowski, Yasuda [13]. For the relation between Figà-Talamanca [5] and Albeverio, Karwowski [7,8] see Hussmann [6].

In Section 2 we give basic properties of p-adic numbers and show how the translation and rotation invariant Lévy processes are obtained by solving the corresponding Kolmogorov equations. We also present spectral properties of the generator. In Section 3 we generalize this method to cover corresponding processes taking values in a weighted state space. In Section 4 we discuss conditions for a process to be recurrent, and also hitting and exit times for p-adic balls. In Section 5 we discuss a p-adic trace formula and its analogy with the Selberg trace formula.

2 Lévy Processes on $Q_p$
We begin with basic definitions and facts about p-adic numbers [14]. Let $p > 1$ be a prime number. A p-adic number $a$ is associated with the formal power series
$$a = \sum_{i=N}^{\infty} a_ip^i, \tag{2.1}$$
where $N$ is an integer and $a_i = 0, 1, \ldots, p-1$. With addition and multiplication defined in the natural way for formal power series, the set $Q_p$ of all p-adic numbers becomes a field. Given $a \in Q_p$, set $i_0$ for the smallest value of $i$ in the sum (2.1) for which $a_i \ne 0$. Then we put
$$\|a\|_p = p^{-i_0}. \tag{2.2}$$
The map $a \to \|a\|_p$ defines a norm in $Q_p$, which has the non-Archimedean triangle property
$$\|a + b\|_p \le \max\{\|a\|_p, \|b\|_p\}, \tag{2.3}$$
and $Q_p$ with this norm is a complete separable locally compact totally disconnected space (with the cardinality of the continuum). The series (2.1) converges with respect to the $\|\cdot\|_p$ norm. Let $a \in Q_p$, $M \in \mathbb{Z}$; then the set
$$K(a,p^M) = \{x \in Q_p;\ \|a - x\|_p \le p^M\} \tag{2.4}$$
is called a sphere (or a ball) of radius $p^M$ centered at $a$. We note the following consequences of (2.3):

i) If $x \in K(a,p^M)$ then $K(x,p^M) = K(a,p^M)$.

ii) If $a_l \in Q_p$, $l = 1, \ldots, p$, $\|a_l - a_k\|_p = p^{M+1}$, $k \ne l$, then $\bigcup_{l=1}^{p} K(a_l,p^M) = K(a_k,p^{M+1})$.

iii) Let $\mathcal{K}^M$ be the family of all disjoint balls of radius $p^M$. Then $\mathcal{K}^M$ is countable and writing $\mathcal{K}^M = \{K_i\}_{i \in \mathbb{N}}$ we have $\bigcup_i K_i = Q_p$.

iv) $K(a,p^M)$ is open and compact.
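The definitions (2.2)-(2.4) can be tried out on rational numbers, which form a dense subset of $Q_p$. The following sketch is only an illustration: the helper names `padic_norm`, `in_ball` and the sample grid are not from the text, and the checks cover the non-Archimedean inequality (2.3) and property i) on a finite sample rather than on all of $Q_p$.

```python
from fractions import Fraction
from itertools import product


def padic_norm(x, p):
    """||x||_p = p^(-i0), where i0 is the exponent of p in x; ||0||_p = 0."""
    x = Fraction(x)
    if x == 0:
        return Fraction(0)
    num, den, i0 = x.numerator, x.denominator, 0
    while num % p == 0:
        num //= p
        i0 += 1
    while den % p == 0:
        den //= p
        i0 -= 1
    return Fraction(p) ** (-i0)


def in_ball(x, a, M, p):
    """True if x lies in K(a, p^M) = {x : ||a - x||_p <= p^M}."""
    return padic_norm(Fraction(x) - Fraction(a), p) <= Fraction(p) ** M


def sample_points():
    """A small grid of rationals used as test points."""
    return [Fraction(n, d) for n in range(-12, 13) for d in (1, 2, 3, 9)]


def check_ultrametric(p):
    """Check (2.3) on all pairs of sample points."""
    pts = sample_points()
    return all(
        padic_norm(a + b, p) <= max(padic_norm(a, p), padic_norm(b, p))
        for a, b in product(pts, pts)
    )


def check_every_point_is_center(a, M, p):
    """Check property i): K(x, p^M) = K(a, p^M) for sample x in K(a, p^M)."""
    pts = sample_points()
    members = [x for x in pts if in_ball(x, a, M, p)]
    return all(
        in_ball(y, x, M, p) == in_ball(y, a, M, p) for x in members for y in pts
    )


if __name__ == "__main__":
    print(padic_norm(Fraction(9, 2), 3))   # exponent of 3 is 2
    print(check_ultrametric(3), check_every_point_is_center(Fraction(1, 2), -1, 3))
```

Exact rational arithmetic is used throughout so that the comparisons are not affected by rounding.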
Let $\mathcal{B}$ denote the $\sigma$-algebra generated by the family of all balls in $Q_p$. Then the set function $\mu$ defined on the balls by
$$\mu(K(a,p^M)) = p^M, \quad a \in Q_p,\ M \in \mathbb{Z}, \tag{2.5}$$
can be uniquely extended to a measure on $\mathcal{B}$, also denoted by $\mu$. It is a Haar measure for the additive group in $Q_p$. Put $G_+$ for $Q_p$ as an additive group and
$$G_* = \{x \in Q_p,\ \|x\|_p = 1\} \tag{2.6}$$
as a multiplicative subgroup of $Q_p \setminus \{0\}$. Then $G_*$ defines a group of automorphisms $\theta_z$, $z \in G_*$, of $G_+$ by $\theta_za = za$, $a \in G_+$. We put $G$ for the semidirect product of $G_*$ and $G_+$ relative to $\theta$:
$$G = G_* \times_\theta G_+, \tag{2.7}$$
i.e.
$$G = \{g = [z,a] : z \in G_*,\ a \in G_+\} \tag{2.8}$$
and
$$g_1g_2 = [z_1z_2,\ z_1a_2 + a_1]. \tag{2.9}$$
Defining the action of $G$ on $Q_p$ by
$$gx = [z,a]x = zx + a, \quad x \in Q_p,\ g \in G, \tag{2.10}$$
we find that $G$ is a doubly transitive group of isometries for $Q_p$. We also have (see [13]):

Proposition 2.1 $\mu$ is the Haar measure for the group $G$.

There are different starting points possible for the construction of random processes. Before presenting the approach proposed in Albeverio et al. [7,8] we briefly mention two others.

a) Lévy-Khinchine representation [12,16]. Put $\{x\}$ for the fractional part of $x \in Q_p$. The normalized additive character on $Q_p$ ($G_+$) is defined by $\chi(x) = \exp(2\pi i\{x\})$, and then the Fourier transform of a complex valued function $\phi \in L^1(Q_p,\mu)$ is defined by
$$\hat\phi(\xi) = \int_{Q_p} \chi(x\xi)\phi(x)\,\mu(dx).$$
Any Lévy process on $Q_p$ can be defined by the Lévy-Khinchine formula
$$\Phi(t,\xi) = \exp\Big\{t\int_{Q_p}[\chi(x\xi) - 1]\,\nu(dx)\Big\}, \tag{2.11}$$
where $\nu$ is a $\sigma$-finite measure on $\mathcal{B}$ satisfying $\nu(\{x \in Q_p;\ \|x\|_p > p^M\}) < \infty$, $M \in \mathbb{Z}$. $\nu$ is called a Lévy measure and $\Phi(t,\xi)$ is the Fourier transform of the transition function for a Lévy process.

b) The generator of a Markovian semigroup [17,18]. Consider the real space $L^2(Q_p;\mu)$. Then the notions of Dirichlet form and Markovian semigroup are well defined and their relations with the Hunt processes follow from the general theory [19]. Vladimirov, Volovich, Zelenov and Kochubei [2,17,18] have taken as the starting point the generators of the Markovian semigroups $D^\alpha$, $\alpha > 0$, defined on the space $D(Q_p)$ of locally constant functions with compact support by the formula
$$D^\alpha\psi(x) = \frac{p^\alpha - 1}{1 - p^{-\alpha-1}}\int_{Q_p}[\psi(x - y) - \psi(x)]\,\|y\|_p^{-\alpha-1}\,\mu(dy). \tag{2.12}$$

Remark 2.1 Recently H. Kaneko [22] modified formula (2.12) and obtained new random walks which apparently do not belong to any class discussed in this note.
The construction proposed in Albeverio et al. [7,8] is based on the Kolmogorov equations. We begin by taking $\mathcal{K}^M$ for the state space. Then the system of forward and backward Kolmogorov equations reads
$$\dot P_{K_iK_j}(t) = -a(K_j)P_{K_iK_j}(t) + \sum_{l \ne j} u(K_l,K_j)P_{K_iK_l}(t) \tag{2.13a}$$
resp.
$$\dot P_{K_iK_j}(t) = -a(K_i)P_{K_iK_j}(t) + \sum_{l \ne i} u(K_i,K_l)P_{K_lK_j}(t) \tag{2.13b}$$
with $t > 0$, $i,j \in \mathbb{N}$ and the initial condition $P_{K_iK_j}(0) = \delta_{ij}$. $a(K_j)$ is interpreted as the intensity of the state $K_j$ and $u(K_i,K_j)$ as the infinitesimal transition probability from the initial state $K_i$ to the target state $K_j$. Till now $\mathcal{K}^M$ is just a countable set. Its non-Archimedean structure must be imprinted in the coefficients $u(K_i,K_j)$. We shall do it as follows. We say that a sequence of real numbers $A = \{a(n)\}_{n \in \mathbb{Z}}$ belongs to the class $\mathcal{A}$ iff
$$\text{a) } a(n) \ge a(n+1), \qquad \text{b) } \lim_{n \to \infty} a(n) = 0. \tag{2.14}$$
Let $A \in \mathcal{A}$. For any $M \in \mathbb{Z}$ and $m \in \mathbb{N}$ put $U(M) = a(M-1) - a(M)$ and $u(M,m) = (p-1)^{-1}p^{-m+1}U(M+m)$. If $K_i \cap K_j = \emptyset$ then $\mathrm{dist}_p(K_i,K_j) = p^{M+m}$ for some $m \in \mathbb{N}$. Then we define
$$u(K_i,K_j) = u(M,m). \tag{2.15}$$
Requiring further
$$a(K_j) = \sum_{i \ne j} u(K_i,K_j) \tag{2.16}$$
we immediately obtain $a(K_j) = a(M)$.
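The identity $a(K_j) = a(M)$ rests on a counting step that the text leaves implicit: by property ii), a ball of radius $p^{M+m}$ contains $p^m$ balls of radius $p^M$, so exactly $(p-1)p^{m-1}$ balls of $\mathcal{K}^M$ lie at distance $p^{M+m}$ from a given one, and the sum in (2.16) telescopes. The sketch below checks this numerically. The sequence $a(n) = p^{-\alpha n}$ is only an illustrative member of the class $\mathcal{A}$, and the truncation `m_max` is a numerical convenience.

```python
def a(n, p, alpha=1.0):
    """Illustrative sequence of class A: non-increasing with limit 0."""
    return float(p) ** (-alpha * n)


def U(n, p, alpha=1.0):
    """U(n) = a(n - 1) - a(n)."""
    return a(n - 1, p, alpha) - a(n, p, alpha)


def u(M, m, p, alpha=1.0):
    """Rate (2.15) between two balls of radius p^M at distance p^(M+m)."""
    return U(M + m, p, alpha) * float(p) ** (-m + 1) / (p - 1)


def balls_at_distance(m, p):
    """Number of balls of radius p^M at distance p^(M+m) from a fixed one."""
    return (p - 1) * p ** (m - 1)


def total_rate(M, p, alpha=1.0, m_max=60):
    """Truncated version of the sum (2.16), grouped by distance."""
    return sum(
        balls_at_distance(m, p) * u(M, m, p, alpha) for m in range(1, m_max + 1)
    )


if __name__ == "__main__":
    for M in (-2, 0, 3):
        print(M, total_rate(M, 3), a(M, 3))
```

After the grouping, each summand equals $U(M+m)$, so the truncated sum is $a(M) - a(M + m_{\max})$, which converges to $a(M)$ by condition b) of (2.14).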
The equations (2.13) with the coefficients specified by (2.15) and (2.16) can be explicitly solved. Further, by shrinking the initial ball to a point while keeping the target ball fixed one arrives at formulas which can be extended to the transition function of a random walk with $Q_p$ as the state space [7,8]. Namely
$$P_t(x,K(a,p^M)) = p^{-1}(p-1)\sum_{i=0}^{\infty} p^{-i}\exp\{-\tau_{M+i}t\} \tag{2.17a}$$
if $x \in K(a,p^M)$ and
$$P_t(x,K(a,p^M)) = p^{-m}\Big[p^{-1}(p-1)\sum_{i=0}^{\infty} p^{-i}\exp\{-\tau_{M+m+i}t\} - \exp\{-\tau_{M+m-1}t\}\Big] \tag{2.17b}$$
if $\mathrm{dist}_p(x,K(a,p^M)) = p^{M+m}$. These formulas hold for all $x,a \in Q_p$, $M \in \mathbb{Z}$. We put here
$$\tau_n = (p-1)^{-1}[pa(n) - a(n+1)], \quad n \in \mathbb{Z}. \tag{2.18}$$
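A quick numerical consistency check of (2.17)-(2.18) is sketched below, under assumptions that are not part of the text: the sequence $a(n) = p^{-n}$ is used as an example, the infinite series are truncated after `terms` summands, and the function names are illustrative. The check sums the mass of all balls of radius $p^M$ inside the ball of radius $p^{M+N}$ around $x$ and compares it with formula (2.17a) applied directly to the larger ball; it also confirms the initial condition at $t = 0$.

```python
import math


def a(n, p):
    """Example sequence a(n) = p^(-n)."""
    return float(p) ** (-n)


def tau(n, p):
    """tau_n from (2.18)."""
    return (p * a(n, p) - a(n + 1, p)) / (p - 1)


def p_same(t, M, p, terms=80):
    """(2.17a): x lies in the target ball K(a, p^M)."""
    return (p - 1) / p * sum(
        p ** (-i) * math.exp(-tau(M + i, p) * t) for i in range(terms)
    )


def p_other(t, M, m, p, terms=80):
    """(2.17b): the target ball K(a, p^M) is at distance p^(M+m) from x."""
    s = (p - 1) / p * sum(
        p ** (-i) * math.exp(-tau(M + m + i, p) * t) for i in range(terms)
    )
    return p ** (-m) * (s - math.exp(-tau(M + m - 1, p) * t))


def mass_within(t, M, N, p):
    """Mass of K(x, p^(M+N)), summed over its sub-balls of radius p^M:
    one ball containing x and (p-1)p^(m-1) balls at distance p^(M+m)."""
    total = p_same(t, M, p)
    for m in range(1, N + 1):
        total += (p - 1) * p ** (m - 1) * p_other(t, M, m, p)
    return total


if __name__ == "__main__":
    print(mass_within(1.0, 0, 5, 3), p_same(1.0, 5, 3))
```

The two numbers agree because the formulas are additive over disjoint balls, as a transition function must be; letting $N \to \infty$ the common value tends to one, which is the conservativeness discussed in Section 4.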
The transition function $P_t(x,B)$ ($t > 0$, $x \in Q_p$, $B \in \mathcal{B}$) determined by the formulas (2.17) can be seen as the integral kernel for a Markovian semigroup $(T_t, t > 0)$ in $L^2(Q_p;\mu)$. Set $T_t = e^{-Ht}$. Then $H$ is a positive self-adjoint operator in this space. Its action on the indicator functions of the balls can be given explicitly, which provides sufficient information for the spectral analysis of $H$. It turns out that the spectrum of $H$ is essential and pure point:
$$\sigma(H) = \{\tau_n\}_{n \in \mathbb{Z}}.$$
The corresponding eigenvectors are defined as follows. Given $a \in Q_p$ and $n \in \mathbb{Z}$, put $a_1 = a$ and choose $a_2, \ldots, a_p$ so that $\|a_i - a_j\|_p = p^{n+1}$, $i \ne j$, $i,j = 1, \ldots, p$. Let further $\chi_i(x)$ stand for the indicator function of $K(a_i,p^n)$. Then for any real numbers $b_1, \ldots, b_p$ such that $\sum_{i=1}^{p} b_i = 0$ we have
$$H\sum_{i=1}^{p} b_i\chi_i = \tau_n\sum_{i=1}^{p} b_i\chi_i.$$
The Dirichlet form $\mathcal{E}$ corresponding to $H$ has only the jump part in the Beurling-Deny representation, i.e.
$$\mathcal{E}(f,g) = \int_{Q_p \times Q_p \setminus d} (f(x) - f(y))(g(x) - g(y))\,J(dx,dy)$$
for all $f,g \in D[\mathcal{E}]$. $J$ is a symmetric positive Radon measure on $Q_p \times Q_p$ supported off the diagonal. $J$ is uniquely determined by
$$J(K(a,p^N),K(b,p^M)) = \frac{p^{N+M-n+1}}{2(p-1)}[a(n-1) - a(n)]$$
if $\mathrm{dist}_p(K(a,p^N),K(b,p^M)) = p^n$ ($n > \max\{M,N\}$), and
$$J(K(a,p^N),K(a,p^N)) = p^Na(N).$$
Since $G$ (2.7) is a group of isometries it follows that the transition functions (2.17) are $G$ invariant. This results in the following

Theorem 2.1 [20] The class of random walks on $Q_p$ constructed above is identical with the class of all $G_*$ symmetric Lévy processes defined by (2.11). It naturally includes the processes defined by $\{D^\alpha, \alpha > 0\}$ (2.12) with the following relation between $\alpha$ and $\{a(n)\}_{n \in \mathbb{Z}}$:
$$a(n) = \frac{p-1}{p}\cdot\frac{p^{-\alpha n}}{1 - p^{-\alpha-1}}.$$

3 Random Walks with Weighted Target States
The procedure presented in Sec. 2 can be generalized to yield a wider class of random walks which are not necessarily Lévy processes. Such a class has been constructed in Karwowski et al. [9]. As in Albeverio et al. [7,8], one begins by solving the Kolmogorov equations (2.13), but the coefficients $u(K_i,K_j)$ are defined differently. Let $\rho$ be a nonnegative locally $L^1(\mu)$ function. For $B \in \mathcal{B}$ we set $\rho_B = \int_B \rho(x)\,\mu(dx)$. Taking $\{a(n)\}_{n \in \mathbb{Z}} \in \mathcal{A}$ we put
$$u(K_i,K_j) = \rho_{K_j}U(M+m),$$
where as before $U(N) = a(N-1) - a(N)$ and $\mathrm{dist}_p(K_i,K_j) = p^{M+m}$, $m \in \mathbb{N}$. Since in our notation $K_i$ is the initial state and $K_j$ the target state, we see that in this formulation the target states are weighted. The method of solving equations (2.13) developed in Albeverio et al. [7,8] has been modified in Karwowski et al. [9] to also cover this case. To formulate the results we shall need more notation. For any $a \in Q_p$ and $n \in \mathbb{Z}$ put
$$\rho^a_n = \int_{K(a,p^n)} \rho(x)\,\mu(dx). \tag{3.1}$$
(3-2)
and - W ' = X > ( » + * " 1) " a(" + * + 1)K+* • Jfc=0 10
These quantities are finite
(3-3)
iff
«=1
The formulas for transition functions depend on whether $\int_{Q_p}\rho(x)\,\mu(dx)$ is finite or not. In the first case we normalize $\rho$ so that $\int_{Q_p}\rho(x)\,\mu(dx) = 1$. Then
$$P_t(x,K(a,p^M)) = \rho^a_M\Big\{1 + \sum_{i=M}^{\infty}\Big(\frac{1}{\rho^a_i} - \frac{1}{\rho^a_{i+1}}\Big)\exp\{tW^a_{i+1}\}\Big\} \tag{3.4a}$$
if $x \in K(a,p^M)$ and
$$P_t(x,K(a,p^M)) = \rho^a_M\Big\{1 + \sum_{i=n}^{\infty}\Big(\frac{1}{\rho^a_i} - \frac{1}{\rho^a_{i+1}}\Big)\exp\{tW^a_{i+1}\} - \frac{1}{\rho^a_n}\exp\{tW^a_n\}\Big\} \tag{3.4b}$$
if $\mathrm{dist}_p(x,K(a,p^M)) = p^n$, $n > M$. If $\int_{Q_p}\rho(x)\,\mu(dx) = \infty$ then
$$P_t(x,K(a,p^M)) = \rho^a_M\sum_{i=M}^{\infty}\Big(\frac{1}{\rho^a_i} - \frac{1}{\rho^a_{i+1}}\Big)\exp\{tW^a_{i+1}\} \tag{3.5a}$$
if $x \in K(a,p^M)$ and
$$P_t(x,K(a,p^M)) = \rho^a_M\Big\{\sum_{i=n}^{\infty}\Big(\frac{1}{\rho^a_i} - \frac{1}{\rho^a_{i+1}}\Big)\exp\{tW^a_{i+1}\} - \frac{1}{\rho^a_n}\exp\{tW^a_n\}\Big\} \tag{3.5b}$$
if $\mathrm{dist}_p(x,K(a,p^M)) = p^n$, $n > M$.

Remark 3.1 When $\int_{Q_p}\rho(x)\,\mu(dx) < \infty$ then $P_t(x,K(a,p^M)) \to \rho^a_M$ as $t \to \infty$, i.e. $\rho^a_M$, $a \in Q_p$, $M \in \mathbb{Z}$, defines the invariant measure for the process.

Remark 3.2 If $a(N) = 0$ for some $N \in \mathbb{Z}$ then $P_t(x,K(a,p^M)) = 0$ whenever $\mathrm{dist}_p(x,K(a,p^M)) > p^N$, i.e. the process is confined to the ball $K(x,p^N)$. This is a consequence of the non-Archimedean distance property. Indeed, if $a(N) = 0$ then $U(n) = a(n-1) - a(n) = 0$ for all $n > N$, which means that the process cannot make a jump larger than $p^N$. But the p-adic distance between $K(x,p^N)$ and any ball disjoint from it is larger than $p^N$, and this distance must be covered in one jump, which is impossible.

Remark 3.3 If $a(N+1) < a(N) = a(N-m)$ for all $m \in \mathbb{N}$ then the process does not make jumps smaller than $p^{N+1}$. In this case the process is equivalent to one with $\mathcal{K}^N$ as the state space.

Remark 3.4 If $\rho(x) = 1$ then the resulting process is $G$ invariant. However, given $A \in \mathcal{A}$ the transition functions (2.17) and (3.5) are different. Nevertheless both formulations are equivalent. It can be seen by direct computations that (3.5) defined by $\rho(x) = 1$ and $A \in \mathcal{A}$ is identical with (2.17) defined by $A' \in \mathcal{A}$ where
o'(M)=p"
1
a M
{p-l)- a{M)-J2 (
+ i)P~i ■
»=i
Similar arguments work in the converse direction.

Remark 3.5 The processes discussed in this section have been used in a model for turbulent cascades [11].

In order to consider $P_t(x,B)$ as the integral kernel for a Markovian semigroup $(T_t, t > 0)$ we have to take $L^2(Q_p,\rho\mu)$ for the underlying Hilbert space. Then an explicit formula for the action of the generator $H$ on the indicator functions can be given [9]. It follows further that $H$ has pure point spectrum $\sigma(H) = \{W^a_i\}_{i \in \mathbb{Z},\,a \in Q_p}$.
The eigenvectors are defined as follows. Given $a \in Q_p$ and $n \in \mathbb{Z}$ we define $\chi_i(x)$, $i = 1, \ldots, p$, as in the $G$ symmetric case. Then $\sum_{i=1}^{p} c_i\chi_i$ is an eigenvector of $H$ whenever $c_i \in \mathbb{R}$, $i = 1, \ldots, p$, satisfy $\sum_{i=1}^{p} c_i\rho^{a_i}_n = 0$.

Passing to the corresponding Dirichlet form we find that it is $\rho\mu$ symmetric and consists of the jump part only, as should be expected. The jump measure $J(dx,dy)$ is defined by
$$J(K(a,p^M),K(a,p^M)) = -\tfrac{1}{2}W^a_M\rho^a_M$$
and
$$J(K(a,p^M),K(b,p^M)) = \tfrac{1}{2}\rho^a_M\rho^b_M[a(M+m-1) - a(M+m)]$$
if $\mathrm{dist}_p(K(a,p^M),K(b,p^M)) = p^{M+m}$, $m \in \mathbb{N}$.

4 The Global Properties
Now we shall list some global properties of the Markovian semigroups $(T_t, t > 0)$ constructed in the previous sections [10]. First, $(T_t, t > 0)$ is conservative, i.e. $P_t(x,Q_p) = 1$. This property holds generally, provided that in the case $\rho_\infty = \int_{Q_p}\rho(x)\,\mu(dx) < \infty$ the function $\rho$ is normalized so that $\rho_\infty = 1$.

Concerning the transience resp. recurrence problem, results have been obtained in [10]; we summarize them here. The semigroup $(T_t, t > 0)$ of bounded operators in $L^2(Q_p;\rho\mu)$ is irreducible iff $a(n) > 0$ for all $n \in \mathbb{Z}$ (see Remark 3.2). Recall that a Markovian semigroup which is irreducible is either transient or recurrent. The simplest case is when $\rho_\infty = 1$ and $a(n) > 0$, $n \in \mathbb{Z}$. Then the constant function $1 \in L^1(\rho\mu)$ and $T_t1(x) = P_t(x,Q_p) = 1$, so that $\int_0^\infty T_t1(x)\,dt = \infty$ and $(T_t, t > 0)$ is not transient. As it is irreducible it must be recurrent.

If $\rho_\infty = \infty$ then $(T_t, t > 0)$ is recurrent iff $\int_0^\infty P_t(x,B)\,dt = \infty$ for some $B$ with $\int_B\rho(x)\,\mu(dx) > 0$. By direct examination this is seen to be equivalent to
$$-\sum_{i=M}^{\infty}\Big(\frac{1}{\rho^0_i} - \frac{1}{\rho^0_{i+1}}\Big)\frac{1}{W^0_{i+1}} = \infty, \tag{4.1}$$
where $M$ is such that $\rho^0_M > 0$. If $(T_t, t > 0)$ is determined by (2.17) then it is recurrent [12] iff
$$\sum_{n=0}^{\infty}\frac{p^{-n}}{a(n)} = \infty.$$
This condition can also be obtained by specifying $\rho = 1$ and passing from $A$ to $A'$ according to Remark 3.4.

We add a short comment on the probabilistic interpretation of the constants $W^a_M$. Let $\xi_t$ be the random walk defined by the transition function (3.4) or (3.5). Then $\tau_{K(a,p^M)} = \inf\{t > 0;\ \xi_t \notin K(a,p^M)\}$ is the exit time for the ball $K(a,p^M)$. We have

Theorem 4.1 [10] If $a(M) > 0$ and $\rho^a_M \ne \rho_\infty$ then $\tau_{K(a,p^M)}$ is almost surely finite and its expectation is $E_x\tau_{K(a,p^M)} = (-W^a_M)^{-1}$, $x \in K(a,p^M)$. If $\rho = 1$, or equivalently the transition functions are defined by (2.17), then
$$E_x\tau_{K(a,p^M)} = a(M)^{-1}.$$
Denote by $\sigma_{K(a,p^M)}$ the hitting time for a ball $K(a,p^M)$, i.e.
$$\sigma_{K(a,p^M)} = \inf\{t > 0;\ \xi_t \in K(a,p^M)\}.$$
Recall that recurrence is said to be positive or persisting if ExTK(a
Trace Formula
We shall close this review with a result which bears similarity with the trace formulas for the Laplace-Beltrami operator for hyperbolic spaces. We shall make our point using a probabilistic interpretation [13].

Consider the complex upper half-plane $C_+$ with the Poincaré metric. Then $SL(2,\mathbb{R})/\{1,-1\}$ is a group of isometries of the half-plane $C_+$. Given a discrete subgroup $\Gamma$ consisting of hyperbolic elements there is a fundamental domain $F$ for $\Gamma$, i.e. an open subset of $C_+$ such that
$$\gamma_1F \cap \gamma_2F = \emptyset \quad \text{if } \gamma_1,\gamma_2 \in \Gamma \text{ and } \gamma_1 \ne \gamma_2$$
and
$$\bigcup_{\gamma \in \Gamma}\gamma\bar F = C_+.$$
Then identification of the points $x \in \bar F$ with $\gamma x$, $\gamma \in \Gamma$, defines a compact Riemann surface $M$. The Laplace-Beltrami operator $\Delta$ on $C_+$ (resp. $\Delta_M$ on $M$) generates a random walk $X_t$, $t \ge 0$, on $C_+$ (resp. $R_t$, $t \ge 0$, on $M$). Let $p_t(x,y)$ (resp. $q_t(x,y)$) be the transition density for $X_t$ (resp. for $R_t$). Define
$$\tilde q_t(x,y) = \sum_{\gamma \in \Gamma} p_t(x,\gamma y), \quad x,y \in M.$$
It follows that $\tilde q_t = q_t$, i.e. $\tilde q_t$ is the transition density for $R_t$. Moreover
$$\mathrm{Tr}\,e^{\Delta_Mt} = \int_F p_t(x,x)\,dx + \sum_{\gamma \in \Gamma,\ \gamma \ne e}\int_F p_t(x,\gamma x)\,dx,$$
where $e$ is the unit of $\Gamma$, can be expressed in terms of the lengths of closed geodesics in $M$, i.e. the distances between $x$ and $\gamma x$, $\gamma \in \Gamma$ [21].

When looking for a p-adic analog of this formula we first observe that we do not have a diffusion but rather a family of jump processes. Thus instead of the Laplace-Beltrami operator on $C_+$ we have a class of generators for the random walks on $Q_p$. We already mentioned that $G$ is a group of isometries for $Q_p$. For a certain discrete family $\Gamma$ of p-adic translations ($\Gamma$ cannot be a group and we assume that $\gamma \in \Gamma$ implies $\|\gamma\|_p > 1$) $K(0,1)$ becomes a "fundamental domain" for $Q_p$. Let $P_t(x,K(a,p^M))$, $x,a \in Q_p$, $M \in \mathbb{Z}$, be given by (2.17), i.e. we consider $G$ symmetric processes. We define the transition density $p_t(x,y)$, $t > 0$, and assume it to be bounded on $Q_p \times Q_p$ (this can be realized by a suitable choice of the sequence $a(n)$). Then
$$q_t(x,y) = p_t(x,y) + \sum_{\gamma \in \Gamma} p_t(x,\gamma + y), \quad x,y \in K(0,1),$$
is a transition density for the process confined to $K(0,1)$. If $-H_0$ is the generator of the process in $K(0,1)$ then
$$\mathrm{Tr}\,e^{-H_0t} = q_t(0,0) = p_t(0,0) + \sum_{\gamma \in \Gamma} p_t(0,\gamma). \tag{5.1}$$
If the process in $Q_p$ is defined by $a(n) = p^{-n}$ then the right hand side can be expressed in terms of $\|\gamma\|_p$, i.e. the p-adic distances between 0 and $\gamma$. The process in $K(0,1)$ can be viewed as obtained from the process on $Q_p$ by identification of the points $x \in K(0,1)$ and $x + \gamma$, $\gamma \in \Gamma$. Conversely, if $-H_0$ generates a process on $K(0,1)$ such that $\mathrm{Tr}\,e^{-H_0t} = q_t(0,0) < \infty$ then there is a family of processes on $Q_p$ such that (5.1) holds.

References
1. R. Rammal and G. Toulouse, Ultrametricity for Physicists, Rev. Modern Phys. 58, 765-78 (1986)
2. V.S. Vladimirov, I.V. Volovich, E.I. Zelenov, p-adic Numbers in Mathematical Physics, World Scientific, Singapore (1993)
3. A. Khrennikov, p-adic Valued Distributions in Mathematical Physics, Kluwer, Dordrecht (1994)
4. L. Brekke, M. Olson, p-adic Diffusion and Relaxation in Glasses, Preprint EFI Chicago (1989)
5. A. Figà-Talamanca, Diffusion on compact ultrametric spaces. In: Noncompact Lie Groups and Some of Their Applications, Ed. E.A. Tanner, R. Wilson, Kluwer, Dordrecht (1994)
6. S. Hussmann, Random walks on p-adic tree. SFB 237 Preprint No. 359 '97 (1997)
7. S. Albeverio, W. Karwowski, Diffusion on p-adic Numbers, pp. 86-99 in: Gaussian Random Fields, Ed. K. Ito, H. Hida, World Scientific, Singapore (1991)
8. S. Albeverio, W. Karwowski, A Random Walk on p-adics: The Generator and its Spectrum, Stochastic Processes and their Applications 53, 1-22 (1994)
9. W. Karwowski, R. Vilela-Mendes, Hierarchical Structures and Asymmetric Processes on p-adics and Adeles, J. Math. Phys. 35, 4637-4650 (1994)
10. S. Albeverio, W. Karwowski, X. Zhao, Asymptotic and Spectral Results for Random Walks on p-adics, Stochastic Processes and their Applications 83, 39-59 (1999)
11. R. Lima, R. Vilela-Mendes, Stochastic Processes for the Turbulent Cascades, Phys. Rev. E 53, 3536-3540 (1996)
12. K. Yasuda, Additive Processes on Local Fields, J. Math. Sci. Univ. Tokyo 3, 629-654 (1996)
13. S. Albeverio, W. Karwowski, K. Yasuda, Trace Formula for p-adics (to be published)
14. N. Koblitz, p-adic Numbers, p-adic Analysis and Zeta-Functions, Springer, New York, 2nd ed. (1984)
15. M. Schreckenberg, Long Range Diffusion in Ultrametric Spaces, Z. Phys. B60, 483-488 (1985)
16. S.N. Evans, Local Properties of Lévy Processes on a Totally Disconnected Group, J. Theoret. Probab. 2, 209-259 (1989)
17. V.S. Vladimirov, Generalized Functions over the Field of p-adic Numbers, Russian Math. Surveys 43, 19-64 (1988)
18. A.N. Kochubei, Parabolic Equations over the Field of p-adic Numbers, Math. USSR Izvestiya 39, 1263-1280 (1992)
19. M. Fukushima, Y. Oshima, M. Takeda, Dirichlet Forms and Symmetric Markov Processes, De Gruyter, Berlin 1994
20. S. Albeverio, Z. Zhao, On the Relation between different Constructions of Random Walks on p-adics (to be published)
21. D.A. Hejhal, The Selberg Trace Formula for PSL(2,R), Vol. 1, 2, Lecture Notes in Mathematics 548 resp. 1001, Springer-Verlag (1976 resp. 1983)
22. H. Kaneko, A class of spatially inhomogeneous Dirichlet spaces on the p-adic number field. Preprint (1999)
CHARACTERIZATION OF TEST FUNCTIONS IN CKS-SPACE

NOBUHIRO ASAI
Graduate School of Mathematics, Nagoya University, Nagoya 464-8602, JAPAN

IZUMI KUBO
Department of Mathematics, Faculty of Science, Hiroshima University, Higashi-Hiroshima 739-8526, JAPAN

HUI-HSIUNG KUO
Department of Mathematics, Louisiana State University, Baton Rouge, LA 70803, USA

We prove a characterization theorem for the test functions in a CKS-space. Some crucial ideas concerning the growth condition are given.

1 Introduction
Let $\mathcal{E}$ be a real countably-Hilbert space with topology given by a sequence of norms $\{|\cdot|_p\}_{p=0}^{\infty}$ (see 11). Let $\mathcal{E}_p$ be the completion of $\mathcal{E}$ with respect to the norm $|\cdot|_p$. Assume the following conditions:
(a) There exists a constant $0 < \rho < 1$ such that $|\cdot|_0 \le \rho|\cdot|_1 \le \cdots \le \rho^p|\cdot|_p \le \cdots$.
(b) For any $p \ge 0$, there exists $q \ge p$ such that the inclusion map $i_{q,p} : \mathcal{E}_q \to \mathcal{E}_p$ is a Hilbert-Schmidt operator.
By using the Riesz representation theorem to identify $\mathcal{E}_0$ with its dual space we get the continuous inclusion maps:
$$\mathcal{E} \subset \mathcal{E}_p \subset \mathcal{E}_0 \subset \mathcal{E}_p' \subset \mathcal{E}', \qquad p \ge 0,$$
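where $\mathcal{E}_p'$ and $\mathcal{E}'$ are the dual spaces of $\mathcal{E}_p$ and $\mathcal{E}$, respectively. Condition (b) says that $\mathcal{E}$ is a nuclear space and so $\mathcal{E} \subset \mathcal{E}_0 \subset \mathcal{E}'$ is a Gel'fand triple. Let $\mu$ be the standard Gaussian measure on $\mathcal{E}'$. Let $(L^2)$ denote the Hilbert space of $\mu$-square integrable functions on $\mathcal{E}'$. By the Wiener-Itô theorem each $\varphi$ in $(L^2)$ can be uniquely expressed as
$$\varphi = \sum_{n=0}^{\infty} I_n(f_n) = \sum_{n=0}^{\infty} \langle :\cdot^{\otimes n}:, f_n \rangle, \qquad f_n \in \mathcal{E}_0^{\hat{\otimes} n}, \tag{1}$$
and the $(L^2)$-norm $\|\varphi\|_0$ of $\varphi$ is given by
$$\|\varphi\|_0 = \Bigl(\sum_{n=0}^{\infty} n!\,|f_n|_0^2\Bigr)^{1/2}.$$
Now, we describe the spaces of test and generalized functions on the space $\mathcal{E}'$ introduced by Cochran et al. in a recent paper 4. Let $\{\alpha(n)\}_{n=0}^{\infty}$ be a sequence of numbers satisfying the following conditions:
(A1) $\alpha(0) = 1$ and $\inf_{n \ge 0} \alpha(n) > 0$.
(A2) $\lim_{n \to \infty} \bigl(\frac{\alpha(n)}{n!}\bigr)^{1/n} = 0$.
Let
$$\|\varphi\|_{p,\alpha} = \Bigl(\sum_{n=0}^{\infty} n!\,\alpha(n)\,|f_n|_p^2\Bigr)^{1/2}. \tag{2}$$
Let $[\mathcal{E}_p]_\alpha = \{\varphi \in (L^2);\ \|\varphi\|_{p,\alpha} < \infty\}$ and let $[\mathcal{E}]_\alpha$ be the projective limit of $\{[\mathcal{E}_p]_\alpha;\ p \ge 0\}$. Its dual space $[\mathcal{E}]_\alpha^*$ is the space of generalized functions. By identifying $(L^2)$ with its dual we get the following continuous inclusion maps: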
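$$[\mathcal{E}]_\alpha \subset [\mathcal{E}_p]_\alpha \subset (L^2) \subset [\mathcal{E}_p]_\alpha^* \subset [\mathcal{E}]_\alpha^*, \qquad p \ge 0.$$
If $\Phi \in [\mathcal{E}_p]_\alpha^*$ (the dual space of $[\mathcal{E}_p]_\alpha$) is represented by $\Phi = \sum_{n=0}^{\infty} \langle :\cdot^{\otimes n}:, F_n \rangle$, then its $[\mathcal{E}_p]_\alpha^*$-norm is given by
$$\|\Phi\|_{-p,1/\alpha} = \Bigl(\sum_{n=0}^{\infty} \frac{n!}{\alpha(n)}\,|F_n|_{-p}^2\Bigr)^{1/2}. \tag{3}$$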
This Gel'fand triple $[\mathcal{E}]_\alpha \subset (L^2) \subset [\mathcal{E}]_\alpha^*$ is called the CKS-space associated with a sequence $\{\alpha(n)\}_{n=0}^{\infty}$ of numbers satisfying the above conditions (A1) and (A2). Several characterization theorems for generalized functions in $[\mathcal{E}]_\alpha^*$ have been proved in the paper 4. However, no characterization theorem for test functions in $[\mathcal{E}]_\alpha$ is given. The purpose of the present paper is to prove such a theorem. In addition we will mention some crucial ideas for obtaining a complete description of the characterization theorems for test and generalized functions in our ongoing research collaboration project. We remark that similar results have been obtained by Gannoun et al. 5.
2 Characterization theorems
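For $\xi \in \mathcal{E}_c$ (the complexification of $\mathcal{E}$), the renormalized exponential function $:e^{\langle \cdot, \xi \rangle}:$ is defined by
$$:e^{\langle \cdot, \xi \rangle}: \;=\; \sum_{n=0}^{\infty} \frac{1}{n!} \langle :\cdot^{\otimes n}:, \xi^{\otimes n} \rangle. \tag{4}$$
For any $p \ge 0$, we have
$$\| :e^{\langle \cdot, \xi \rangle}: \|_{p,\alpha} = G_\alpha(|\xi|_p^2)^{1/2}, \tag{5}$$
where $G_\alpha$ is the exponential generating function of the sequence $\{\alpha(n)\}_{n=0}^{\infty}$, i.e.,
$$G_\alpha(z) = \sum_{n=0}^{\infty} \frac{\alpha(n)}{n!} z^n, \qquad z \in \mathbb{C}.$$
By condition (A2) of the sequence $\{\alpha(n)\}_{n=0}^{\infty}$, the function $G_\alpha$ is an entire function. Equation (5) implies that $:e^{\langle \cdot, \xi \rangle}:$ is a test function in $[\mathcal{E}]_\alpha$ for any $\xi \in \mathcal{E}_c$. The S-transform of a generalized function $\Phi$ in $[\mathcal{E}]_\alpha^*$ is the function $S\Phi$ defined on $\mathcal{E}_c$ by
$$S\Phi(\xi) = \langle\langle \Phi, :e^{\langle \cdot, \xi \rangle}: \rangle\rangle, \qquad \xi \in \mathcal{E}_c,$$
where $\langle\langle \cdot, \cdot \rangle\rangle$ is the bilinear pairing of $[\mathcal{E}]_\alpha^*$ and $[\mathcal{E}]_\alpha$. We state three conditions on the sequence $\{\alpha(n)\}_{n=0}^{\infty}$ as follows:
(B1) $\limsup_{n \to \infty} \Bigl(\frac{n!}{\alpha(n)} \inf_{r > 0} \frac{G_\alpha(r)}{r^n}\Bigr)^{1/n} < \infty$.
(B2) The sequence $\gamma(n) = \frac{\alpha(n)}{n!}$, $n \ge 0$, is log-concave, i.e., $\gamma(n)\gamma(n+2) \le \gamma(n+1)^2$, $\forall n \ge 0$.
(B3) The sequence $\{\alpha(n)\}_{n=0}^{\infty}$ is log-convex, i.e., $\alpha(n)\alpha(n+2) \ge \alpha(n+1)^2$, $\forall n \ge 0$.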
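Conditions (B2) and (B3) can be tested numerically for a concrete sequence. The following is a minimal sketch and not part of the original paper: the helper names (`is_log_concave`, `is_log_convex`, `log_alpha_ks`, `check_b2_b3`) are hypothetical, sequences are handled through their logarithms, and only finitely many terms are inspected, so a passing check is numerical evidence rather than a proof.

```python
import math

def second_differences(log_terms):
    """Second differences of log s(0), log s(1), ... ."""
    return [log_terms[n] - 2.0 * log_terms[n + 1] + log_terms[n + 2]
            for n in range(len(log_terms) - 2)]

def is_log_concave(log_terms, tol=1e-9):
    """s(n) s(n+2) <= s(n+1)^2 for all available n, checked on logarithms."""
    return all(d <= tol for d in second_differences(log_terms))

def is_log_convex(log_terms, tol=1e-9):
    """s(n) s(n+2) >= s(n+1)^2 for all available n, checked on logarithms."""
    return all(d >= -tol for d in second_differences(log_terms))

def log_alpha_ks(n, beta):
    """log alpha(n) for the illustrative choice alpha(n) = (n!)^beta."""
    return beta * math.lgamma(n + 1)

def check_b2_b3(log_alpha, n_max=60):
    """Return (B2 holds, B3 holds) on the first n_max + 1 terms.

    B2: gamma(n) = alpha(n)/n! is log-concave.
    B3: alpha(n) is log-convex.
    """
    log_a = [log_alpha(n) for n in range(n_max + 1)]
    log_gamma = [log_a[n] - math.lgamma(n + 1) for n in range(n_max + 1)]
    return is_log_concave(log_gamma), is_log_convex(log_a)
```

For instance, `check_b2_b3(lambda n: 0.0)` treats the constant sequence $\alpha(n) = 1$, and `check_b2_b3(lambda n: log_alpha_ks(n, 0.5))` treats $\alpha(n) = (n!)^{1/2}$.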
Condition (B1) is used in the characterization theorem for generalized functions. By Corollary 4.4 in 4 condition (B2) implies condition (B1). Condition (B3) implies condition $(\tilde{B}2)$ to be defined below. The following characterization theorem for generalized functions in $[\mathcal{E}]_\alpha^*$ has been proved in 4.
Theorem 2.1 Suppose the sequence $\{\alpha(n)\}_{n=0}^{\infty}$ satisfies conditions (A1) and (A2).
(I) Let $\Phi \in [\mathcal{E}]_\alpha^*$. Then the S-transform $F = S\Phi$ of $\Phi$ satisfies the conditions:
(1) For any $\xi, \eta$ in $\mathcal{E}_c$, the function $F(z\xi + \eta)$ is an entire function of $z \in \mathbb{C}$.
(2) There exist constants $K, a, p \ge 0$ such that
$$|F(\xi)| \le K\, G_\alpha(a|\xi|_p^2)^{1/2}, \qquad \xi \in \mathcal{E}_c. \tag{6}$$
(II) Conversely, assume that condition (B1) holds and let $F$ be a function on $\mathcal{E}_c$ satisfying the above conditions (1) and (2). Then there exists a unique $\Phi \in [\mathcal{E}]_\alpha^*$ such that $F = S\Phi$.
We mention that condition (2) is actually equivalent to the condition: there exist constants $K, p \ge 0$ such that
$$|F(\xi)| \le K\, G_\alpha(|\xi|_p^2)^{1/2}, \qquad \xi \in \mathcal{E}_c.$$
This fact can be easily checked by using the fact that $|\xi|_p \le \rho^{q-p}|\xi|_q$ for any $q \ge p$. The constant $a$ is kept in condition (2) because it makes it more convenient to check whether a given function $F$ satisfies the condition. By this theorem, if we assume condition (B1), then conditions (1) and (2) are necessary and sufficient for a function $F$ defined on $\mathcal{E}_c$ to be the S-transform of a generalized function in $[\mathcal{E}]_\alpha^*$. As mentioned above, condition (B2) implies condition (B1). Thus under condition (B2), conditions (1) and (2) are also necessary and sufficient for $F$ to be the S-transform of a generalized function in $[\mathcal{E}]_\alpha^*$.
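For the characterization theorem for test functions in $[\mathcal{E}]_\alpha$, we need to define the exponential generating function of the sequence $\{\frac{1}{\alpha(n)}\}_{n=0}^{\infty}$:
$$G_{1/\alpha}(z) = \sum_{n=0}^{\infty} \frac{1}{n!\,\alpha(n)} z^n, \qquad z \in \mathbb{C}.$$
Moreover, we need the corresponding conditions $(\tilde{A}2)$, $(\tilde{B}1)$, $(\tilde{B}2)$ for the sequence $\{\frac{1}{\alpha(n)}\}_{n=0}^{\infty}$:
$(\tilde{A}2)$ $\lim_{n \to \infty} \bigl(\frac{1}{n!\,\alpha(n)}\bigr)^{1/n} = 0$.
$(\tilde{B}1)$ $\limsup_{n \to \infty} \Bigl(n!\,\alpha(n) \inf_{r > 0} \frac{G_{1/\alpha}(r)}{r^n}\Bigr)^{1/n} < \infty$.
$(\tilde{B}2)$ The sequence $\{\frac{1}{n!\,\alpha(n)}\}_{n=0}^{\infty}$ is log-concave.
It follows from condition $(\tilde{A}2)$ that the exponential generating function $G_{1/\alpha}$ is an entire function. Note that by condition (A1), $\alpha(n) \ge \alpha_0$ for all $n$, where $\alpha_0 = \inf_{n \ge 0} \alpha(n)$. Hence by the Stirling formula
$$\Bigl(\frac{1}{n!\,\alpha(n)}\Bigr)^{1/n} \le \Bigl(\frac{1}{n!\,\alpha_0}\Bigr)^{1/n} \sim \Bigl(\frac{1}{\alpha_0 \sqrt{2\pi n}}\Bigr)^{1/n} \frac{e}{n} \to 0, \qquad \text{as } n \to \infty.$$
This shows that condition (A1) implies condition $(\tilde{A}2)$. On the other hand, by applying Corollary 4.4 in 4 to the sequence $\{\frac{1}{\alpha(n)}\}_{n=0}^{\infty}$ we see that condition $(\tilde{B}2)$ implies condition $(\tilde{B}1)$. Moreover, it is easy to check that condition (B3) implies condition $(\tilde{B}2)$.
Now, we study the characterization theorem for test functions in $[\mathcal{E}]_\alpha$. First we prove a lemma.
Lemma 2.2 Assume that condition $(\tilde{B}1)$ holds and let $F$ be a function on $\mathcal{E}_c$ satisfying the conditions:
(1) For any $\xi, \eta$ in $\mathcal{E}_c$, the function $F(z\xi + \eta)$ is an entire function of $z \in \mathbb{C}$.
(2) There exist constants $K, a, p \ge 0$ such that
$$|F(\xi)| \le K\, G_{1/\alpha}(a|\xi|_{-p}^2)^{1/2}, \qquad \xi \in \mathcal{E}_c.$$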
Then there exists a unique ip € [£]l, such that F = S
^— )
< 1
(8)
and
ML < £ £ (»'*(») W % ^ ) (-2Hv„ll^)nn=0 ^
0)
'
Proof. We can modify the proof of Theorem 8.9 in n . Use the analyticity in condition (1) to get the expansion oo
n=0
Then use the Cauchy formula and condition (2) to show that for any £ i , . . . , £n in [Sp]a and any R > 0: l(/n,6®"-®€n-^|<^
^
—
Kl|-p"-|£n|-p.
Make a change of variables r = an2R2 and use the inequality n n < n!e n /v / 27r (see page 357 in n ) and then take the infimum over r > 0 to get
K/»,{,S-SC- - I2 < £«V« ( i n f % ^ ) l&lV-KnlV Now, suppose G [0,p] satisfies the condition in Equation (8). Then by similar arguments as in the proofs of Theorems 8.2 and 8.9 in n we can derive l/n|, <
2?ra
e
^nt o
r„
J||tp,,||HS.
Therefore oo
|M|2,Q = ^n!a(n)|/„| 2 n=0
n=0 ^
'
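Note that the series converges because of the condition in Equation (8). ■
Theorem 2.3 Suppose the sequence $\{\alpha(n)\}_{n=0}^{\infty}$ satisfies condition (A1).
(I) Let $\varphi \in [\mathcal{E}]_\alpha$. Then the S-transform $F = S\varphi$ of $\varphi$ satisfies the conditions:
(1) For any $\xi, \eta$ in $\mathcal{E}_c$, the function $F(z\xi + \eta)$ is an entire function of $z \in \mathbb{C}$.
(2) For any constants $a, p \ge 0$ there exists a constant $K \ge 0$ such that
$$|F(\xi)| \le K\, G_{1/\alpha}(a|\xi|_{-p}^2)^{1/2}, \qquad \xi \in \mathcal{E}_c. \tag{10}$$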
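(II) Conversely, assume that condition $(\tilde{B}1)$ holds and let $F$ be a function on $\mathcal{E}_c$ satisfying the above conditions (1) and (2). Then there exists a unique $\varphi \in [\mathcal{E}]_\alpha$ such that $F = S\varphi$.
We remark that condition (2) is actually equivalent to the condition: for any constant $p \ge 0$ there exists a constant $K \ge 0$ such that
$$|F(\xi)| \le K\, G_{1/\alpha}(|\xi|_{-p}^2)^{1/2}, \qquad \xi \in \mathcal{E}_c.$$
This fact can be easily checked by using the fact that $|\xi|_{-q} \le \rho^{q-p}|\xi|_{-p}$ for any $q \ge p$. The constant $a$ is kept in condition (2) because it makes it more convenient to check whether a given function $F$ satisfies the condition.
Proof. Let $\varphi \in [\mathcal{E}]_\alpha$. Then
$$F(\xi) = S\varphi(\xi) = \langle\langle :e^{\langle \cdot, \xi \rangle}:, \varphi \rangle\rangle, \qquad \xi \in \mathcal{E}_c.$$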
Since
|F(0|
Hence
|F(OI
<
^\S\-P-
Therefore, we obtain
$$|F(\xi)| \le \|\varphi\|_{p,\alpha}\, G_{1/\alpha}(|\xi|_{-p}^2)^{1/2} \le \|\varphi\|_{p,\alpha}\, G_{1/\alpha}(a|\xi|_{-p}^2)^{1/2}.$$
To prove the converse, assume that condition $(\tilde{B}1)$ holds and let $F$ be a function on $\mathcal{E}_c$ satisfying conditions (1) and (2). Let $q \ge 0$ be any given number. Choose $a, p > 0$ such that
(
C (r) \ '^n nla{n) inf — — — 1 < 1. r>0
T
)
This inequality can be achieved in two ways: (1) choose any $a > 0$ and then use the fact that $\lim_{p \to \infty} \|i_{p,q}\|_{HS} = 0$, (2) choose any $p \ge 0$ such that $i_{p,q}$ is a Hilbert-Schmidt operator and then choose a sufficiently small number $a > 0$. (For the first way we can choose $a = 1$ and this is exactly the fact mentioned in the above remark.) With the chosen $a$ and $p$, use condition (2) to get a constant $K$ such that the inequality in Equation (10) holds. Then we apply Lemma 2.2 to get the inequality in Equation (9) so that $\|\varphi\|_{q,\alpha} < \infty$. Hence $\varphi \in [\mathcal{E}_q]_\alpha$ for any $q \ge 0$. Thus $\varphi \in [\mathcal{E}]_\alpha$ and the converse of the theorem is proved. ■
3 Examples and comments
3.1 Four conditions
In the definition of CKS-space and the characterization theorems of generalized and test functions we have assumed several conditions on the sequence $\{\alpha(n)\}_{n=0}^{\infty}$: (A1), (A2), (B1), (B2), (B3), $(\tilde{A}2)$, $(\tilde{B}1)$, $(\tilde{B}2)$. Recall that we have the following implications:
$$(A1) \Rightarrow (\tilde{A}2), \qquad (B2) \Rightarrow (B1), \qquad (\tilde{B}2) \Rightarrow (\tilde{B}1), \qquad (B3) \Rightarrow (\tilde{B}2).$$
Taking these implications into account we will consider below the four conditions: (A1), (A2), (B2), (B3).
3.2 Examples
We give three examples corresponding to the Hida-Kubo-Takenaka, Kondratiev-Streit, and CKS-spaces.
Example 3.1 For the Hida-Kubo-Takenaka space $(\mathcal{E}) \subset (L^2) \subset (\mathcal{E})^*$ (see 6, 9, 10, 13), the sequence is given by $\alpha(n) = 1$. Obviously, this sequence satisfies the above four conditions. The corresponding exponential generating functions are
$$G_\alpha(r) = e^r, \qquad G_{1/\alpha}(r) = e^r.$$
Thus the growth conditions in Equations (6) and (10) can be stated as
$$|F(\xi)| \le K e^{a|\xi|_p^2}, \qquad |F(\xi)| \le K e^{a|\xi|_{-p}^2}.$$
Theorems 2.1 and 2.3 are due to Potthoff-Streit 14 and Kuo-Potthoff-Streit 12, respectively.
Example 3.2 For the Kondratiev-Streit space $(\mathcal{E})_\beta \subset (L^2) \subset (\mathcal{E})_\beta^*$ (see 7, 8, 11), the sequence is given by $\alpha(n) = (n!)^\beta$. It is easy to check that this sequence satisfies the above four conditions. The corresponding exponential generating functions are
$$G_\alpha(r) = \sum_{n=0}^{\infty} \frac{1}{(n!)^{1-\beta}} r^n, \qquad G_{1/\alpha}(r) = \sum_{n=0}^{\infty} \frac{1}{(n!)^{1+\beta}} r^n.$$
But from page 358 in 11 and Lemma 7.1 (page 61) in 4 we have the inequalities:
$$\exp\bigl[(1-\beta)\, r^{\frac{1}{1-\beta}}\bigr] \le G_\alpha(r) \le 2^\beta \exp\bigl[(1-\beta)\, 2^{\frac{\beta}{1-\beta}}\, r^{\frac{1}{1-\beta}}\bigr]. \tag{11}$$
On the other hand, from page 358 in 11 and the same argument as in the proof of Lemma 7.1 in 4 we can derive the inequalities:
$$2^{-\beta} \exp\bigl[(1+\beta)\, 2^{-\frac{\beta}{1+\beta}}\, r^{\frac{1}{1+\beta}}\bigr] \le G_{1/\alpha}(r) \le \exp\bigl[(1+\beta)\, r^{\frac{1}{1+\beta}}\bigr]. \tag{12}$$
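The two-sided bounds (11) and (12) can be spot-checked numerically. The sketch below is an illustration added here, not taken from the paper: it assumes $0 \le \beta < 1$ and $r > 0$, truncates both series after a fixed number of terms, and uses hypothetical helper names (`egf_factorial_power`, `bounds_11`, `bounds_12`, `check_inequalities`). A small relative tolerance absorbs rounding error in the case $\beta = 0$, where all bounds coincide with $e^r$.

```python
import math

def egf_factorial_power(r, exponent, terms=400):
    """Partial sum of sum_n r^n / (n!)^exponent for r > 0, term by term in log space."""
    log_r = math.log(r)
    return sum(math.exp(n * log_r - exponent * math.lgamma(n + 1))
               for n in range(terms))

def bounds_11(r, beta):
    """Lower and upper bounds of (11) for G_alpha with alpha(n) = (n!)^beta."""
    s = r ** (1.0 / (1.0 - beta))
    lower = math.exp((1.0 - beta) * s)
    upper = 2.0 ** beta * math.exp((1.0 - beta) * 2.0 ** (beta / (1.0 - beta)) * s)
    return lower, upper

def bounds_12(r, beta):
    """Lower and upper bounds of (12) for G_{1/alpha} with alpha(n) = (n!)^beta."""
    s = r ** (1.0 / (1.0 + beta))
    lower = 2.0 ** (-beta) * math.exp((1.0 + beta) * 2.0 ** (-beta / (1.0 + beta)) * s)
    upper = math.exp((1.0 + beta) * s)
    return lower, upper

def check_inequalities(r, beta, rel=1e-9):
    """Return (does (11) hold, does (12) hold) at the point r, up to rounding."""
    g_alpha = egf_factorial_power(r, 1.0 - beta)
    g_inverse = egf_factorial_power(r, 1.0 + beta)
    low11, up11 = bounds_11(r, beta)
    low12, up12 = bounds_12(r, beta)
    ok11 = low11 <= g_alpha * (1.0 + rel) and g_alpha <= up11 * (1.0 + rel)
    ok12 = low12 <= g_inverse * (1.0 + rel) and g_inverse <= up12 * (1.0 + rel)
    return ok11, ok12
```

The truncation at 400 terms is only adequate for moderate values of $r^{1/(1-\beta)}$; for larger arguments the number of terms would have to grow accordingly.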
The inequalities in Equations (11) and (12) imply that the growth conditions in Theorems 2.1 and 2.3 are respectively equivalent to the conditions:
• There exist constants $K, a, p \ge 0$ such that $|F(\xi)| \le K \exp\bigl[a|\xi|_p^{\frac{2}{1-\beta}}\bigr]$, $\xi \in \mathcal{E}_c$.
• For any constants $a, p \ge 0$ there exists a constant $K \ge 0$ such that $|F(\xi)| \le K \exp\bigl[a|\xi|_{-p}^{\frac{2}{1+\beta}}\bigr]$, $\xi \in \mathcal{E}_c$.
These two inequalities are the growth conditions used by Kondratiev and Streit in 7, 8 (see also Theorems 8.2 and 8.10 in 11).
Example 3.3 (Bell's numbers) For each integer $k \ge 2$, consider the $k$-th iterated exponential function $\exp_k(z) = \exp(\exp(\cdots(\exp(z))))$. This function is entire and so it has the power series expansion
$$\exp_k(z) = \sum_{n=0}^{\infty} \frac{B_k(n)}{n!} z^n.$$
The $k$-th order Bell's numbers $\{b_k(n)\}_{n=0}^{\infty}$ are defined by
$$b_k(n) = \frac{B_k(n)}{\exp_k(0)}, \qquad n \ge 0.$$
Obviously, this sequence $\{b_k(n)\}_{n=0}^{\infty}$ satisfies conditions (A1) and (A2). It has been shown recently in 1 that this sequence satisfies conditions (B2) and (B3). The corresponding exponential generating functions are given by
$$G_{b_k}(r) = \sum_{n=0}^{\infty} \frac{b_k(n)}{n!} r^n = \frac{\exp_k(r)}{\exp_k(0)}, \tag{13}$$
$$G_{1/b_k}(r) = \sum_{n=0}^{\infty} \frac{1}{n!\, b_k(n)} r^n. \tag{14}$$
The function $\exp_k(r)$ in Equation (13) can be used to give the growth condition in Theorem 2.1 for generalized functions, i.e.,
• There exist constants $K, a, p \ge 0$ such that $|F(\xi)| \le K \exp_k(a|\xi|_p^2)^{1/2}$, $\xi \in \mathcal{E}_c$.
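For $k = 2$ the sequence of Example 3.3 can be computed exactly, since $\exp(\exp(z)) = e \cdot \exp(e^z - 1)$ and $\exp(e^z - 1)$ is the exponential generating function of the classical Bell numbers, so that $b_2(n)$ is the $n$-th Bell number. The sketch below is an illustration added here, with hypothetical function names: it builds these numbers with the Bell triangle and checks conditions (B2) and (B3) on finitely many terms in exact arithmetic. It is numerical evidence only and does not replace the proof cited in the text.

```python
import math
from fractions import Fraction

def bell_numbers(count):
    """First `count` Bell numbers 1, 1, 2, 5, 15, ... via the Bell triangle."""
    row = [1]
    bells = [1]
    while len(bells) < count:
        new_row = [row[-1]]            # each row starts with the last entry of the previous row
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells[:count]

def satisfies_b3(seq):
    """Log-convexity: a(n) a(n+2) >= a(n+1)^2 on the given terms."""
    return all(seq[n] * seq[n + 2] >= seq[n + 1] ** 2 for n in range(len(seq) - 2))

def satisfies_b2(seq):
    """Log-concavity of gamma(n) = a(n)/n! on the given terms (exact rationals)."""
    gamma = [Fraction(a, math.factorial(n)) for n, a in enumerate(seq)]
    return all(gamma[n] * gamma[n + 2] <= gamma[n + 1] ** 2
               for n in range(len(gamma) - 2))

def egf(seq, r):
    """Truncated exponential generating function sum_n a(n) r^n / n!."""
    return sum(float(Fraction(a, math.factorial(n))) * r ** n
               for n, a in enumerate(seq))
```

With these helpers, `egf(bell_numbers(60), 1.0)` can be compared with `math.exp(math.e - 1.0)`, which is the value of $\exp_2(r)/\exp_2(0)$ at $r = 1$ in Equation (13).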
On the other hand, we cannot express the sum of the series in Equation (14) as an elementary function. Thus the growth condition (2) in Theorem 2.3 is very hard, if not impossible, to verify. Hence it is desirable to find inequalities similar to those in Equation (12) for the sequence $\{b_k(n)\}_{n=0}^{\infty}$.
3.3 General growth order
Being motivated by Examples 3.2 and 3.3, we consider the question: What are the possible functions $U$ and $u$ such that the growth conditions in Theorems 2.1 and 2.3 can be respectively stated as follows?
• There exist constants $K, a, p \ge 0$ such that $|F(\xi)| \le K\, U(a|\xi|_p^2)^{1/2}$, $\xi \in \mathcal{E}_c$.
• For any constants $a, p \ge 0$ there exists a constant $K \ge 0$ such that $|F(\xi)| \le K\, u(a|\xi|_{-p}^2)^{1/2}$, $\xi \in \mathcal{E}_c$.
The answer to this question will be given in our forthcoming papers 2, 3. In particular, when $\alpha(n) = b_k(n)$, condition (2) in Theorem 2.3 can be replaced by the following growth condition:
• For any constants $a, p \ge 0$ there exists a constant $K \ge 0$ such that
$$|F(\xi)| \le K \exp\Bigl[2\sqrt{a|\xi|_{-p}^2 \log_{k-1}(a|\xi|_{-p}^2)}\Bigr], \qquad \xi \in \mathcal{E}_c,$$
where $\log_j$, $j \ge 1$, is the function defined by
$$\log_1(x) = \log(\max\{x, e\}), \qquad \log_j(x) = \log_1(\log_{j-1}(x)), \quad j \ge 2.$$
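The truncated iterated logarithm $\log_j$ is straightforward to evaluate. The following sketch is an illustration added here: the function names are hypothetical, and `bell_growth_exponent` assumes that the exponent in the growth condition reads $2\sqrt{t \log_{k-1}(t)}$ with $t = a|\xi|_{-p}^2$, which is how the displayed bound is read in this text.

```python
import math

def log_j(j, x):
    """Iterated logarithm: log_1(x) = log(max(x, e)), log_j(x) = log_1(log_{j-1}(x))."""
    if j < 1:
        raise ValueError("j must be at least 1")
    value = x
    for _ in range(j):
        value = math.log(max(value, math.e))
    return value

def bell_growth_exponent(k, a, norm_squared):
    """Assumed exponent 2 * sqrt(t * log_{k-1}(t)) with t = a * norm_squared, k >= 2."""
    t = a * norm_squared
    return 2.0 * math.sqrt(t * log_j(k - 1, t))
```

Because of the truncation at $e$, every $\log_j$ takes values in $[1, \infty)$, so the square root above is always well defined for $t \ge 0$.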
References
1. Asai, N., Kubo, I., and Kuo, H.-H.: Bell numbers, log-concavity, and log-convexity; in: Classical and Quantum White Noise (A special volume in honor of Hida's 70th birthday), L. Accardi et al. (eds.), Kluwer Academic Publishers (1999)
2. Asai, N., Kubo, I., and Kuo, H.-H.: Log-concavity and growth order in white noise analysis; Preprint (1999)
3. Asai, N., Kubo, I., and Kuo, H.-H.: General characterization theorems and intrinsic topologies in white noise analysis; Preprint (1999)
4. Cochran, W. G., Kuo, H.-H., and Sengupta, A.: A new class of white noise generalized functions; Infinite Dimensional Analysis, Quantum Probability and Related Topics 1 (1998) 43-67
5. Gannoun, R., Hachaichi, R., Ouerdiane, H., and Rezgui, A.: Un Théorème de Dualité Entre Espaces de Fonctions Holomorphes à Croissance Exponentielle; BiBoS preprint, no. 829 (1998)
6. Hida, T., Kuo, H.-H., Potthoff, J., and Streit, L.: White Noise: An Infinite Dimensional Calculus. Kluwer Academic Publishers, 1993
7. Kondratiev, Yu. G. and Streit, L.: A remark about a norm estimate for white noise distributions; Ukrainian Math. J. 44 (1992) 832-835
8. Kondratiev, Yu. G. and Streit, L.: Spaces of white noise distributions: Constructions, Descriptions, Applications. I; Reports on Math. Phys. 33 (1993) 341-366
9. Kubo, I. and Takenaka, S.: Calculus on Gaussian white noise I; Proc. Japan Acad. 56A (1980) 376-380
10. Kubo, I. and Takenaka, S.: Calculus on Gaussian white noise II; Proc. Japan Acad. 56A (1980) 411-416
11. Kuo, H.-H.: White Noise Distribution Theory. CRC Press, 1996
12. Kuo, H.-H., Potthoff, J., and Streit, L.: A characterization of white noise test functionals; Nagoya Math. J. 121 (1991) 185-194
13. Obata, N.: White Noise Calculus and Fock Space. Lecture Notes in Math. 1577, Springer-Verlag, 1994
14. Potthoff, J. and Streit, L.: A characterization of Hida distributions; J. Funct. Anal. 101 (1991) 212-229
NONLINEAR LIE ALGEBRAS IN QUANTUM PHYSICS AND THEIR INTEREST IN QUANTUM FIELD THEORY
J. BECKERS, Theoretical and Mathematical Physics, Institute of Physics (B5), University of Liège, B-4000 Liège 1 (Belgium)
Solutions of the soliton type and their stability conditions obviously appear through polynomial interactions in toy models enlightening relativistic scalar field theories. We show that differential realizations of operators characterizing nonlinear Lie algebras connected with deformations of the angular momentum algebra are useful in the study of such polynomial interactions in quantum field theory.
To Ludwig, in reference to our first meeting at the "Workshop" of the University of Bujumbura (Burundi, October 1987)! Eleven years already!
The use and interest of spectrum generating algebras [1] in quantum problems are very well known: in particular, they lead to solvable (or quasisolvable) Schrödinger equations and facilitate the determination of the eigenvalues of the spectrum and their associated eigenfunctions. The method has often been exploited when, in general, it is subtended by linear Lie algebras such as, for example, the real forms $sl(2,\mathbb{R})$ and $su(2,\mathbb{C})$ of the simple Lie algebra $A_1$ in the Cartan classification [2]. There are no reasons to exclude nonlinear Lie algebras, which have been studied very recently [3,4]. In fact, here we want to exploit the irreducible representations of such nonlinear $sl(2,\mathbb{R})$-Lie algebras in order to add further information in quantum field theory in the particularly well studied context of the scalar case. As a typical work in this scalar field theory, let us point out the remarkable Christ and Lee reference [5] leading to solitonic solutions but without any reference to a spectrum generating algebra, and let us see how a (nonlinear) cubic (Lie) algebra, the so-called Higgs algebra [6], can help us to get more general solutions. In terms of the scalar field $\phi(x,t)$ and its derivatives
(l+l)-dimensional space, the Lagrangian density takes the form C{4>,u
(l)
where $V(\phi)$ refers to the interaction potential characterizing specific toy models very often called $\phi^4$- or $\phi^6$-models more specifically. Through the principle of least action in the stationary context and by varying
z = f{x)
(2)
(when we restrict ourselves to scalar potentials seen as even functions of (j>), simply becomes d2
d
F(z)=u,2F(z)
.
(3)
Here $P_k(z)$ obviously refers to polynomials of degree $k$ in the variable $z$. Such an equation (3) is thus equivalent to a Schrödinger equation associated with a specific scalar field theory. Its general solutions are welcome through considerations on nonlinear Lie algebras as we propose to show in the following. Let us first consider the linear $sl(2,\mathbb{R})$-algebra given by the commutation relations
$$[J_0, J_\pm] = \pm J_\pm, \tag{4}$$
$$[J_+, J_-] = 2J_0, \tag{5}$$
where $J_0$ is the diagonal operator and $J_\pm$ the scaling ones as is well known and, secondly, the nonlinear Higgs algebra [6] given by the commutation relations (4) but with eq. (5) replaced by
$$[J_+, J_-] = 2J_0 + 8\beta J_0^3, \tag{6}$$
where $\beta$ refers to a possible family of such nonlinear algebras and can also be interpreted as a deformation parameter. This algebra (4), (6) is recognized as a specific W-algebra [7] and belongs to the so-called category of polynomial deformations already studied [3,8] with respect, in particular, to the determination of its irreducible representations through an angular momentum basis [3] or a monomial one [8], the latter obviously asking for a realization of the operators $(J_0, J_\pm)$ as differential operators.
In terms of a single variable $x$ and its powers $N = 1, 2, 3, \ldots$, we have proposed [8] differential operators of the form
$$J_+ = x^N F(D), \qquad J_- = G(D)\frac{d^N}{dx^N}, \qquad J_0 = \frac{1}{N}(D + c), \tag{7}$$
where $D$ is the dilation operator $x\frac{d}{dx}$ with the properties
$$[D, x^N] = N x^N, \qquad \Bigl[D, \frac{d^N}{dx^N}\Bigr] = -N\frac{d^N}{dx^N}, \tag{8}$$
when we take account of the usual Heisenberg commutation relations. If we ask for Schrödinger-like Hamiltonian operators expressed in terms of generators belonging to (nonlinear) spectrum generating algebras and if we want to privilege the category of Higgs algebras, we have noticed [8] that we have two families of realizations, for $N = 1$ and $N = 2$ only, respectively given in terms of real parameters $(c_1, c_2, \ldots, c_5)$ and $(d_1, d_2, d_3, d_4)$ by
$$J_+ = x(c_1 D^2 + c_2 D + c_3), \qquad J_- = (c_4 D + c_5)\frac{d}{dx}, \qquad J_0 = D + c \tag{9}$$
and
$$J_+ = x^2(d_1 D^2 + d_2 D + d_3), \qquad J_- = d_4\frac{d^2}{dx^2}, \qquad J_0 = \frac{1}{2}(D + c). \tag{10}$$
$$[J_+, J_-] = f(1, J_0, J_0^2, J_0^3), \tag{11}$$
including obviously the Higgs character (6) and leading to Schrodingerlike Hamiltonians given by H = ai J+ + a2J- + o 3 J 0 + a 4 Jo + a 5
(12)
where $a_1, a_2, \ldots, a_5$ are parameters chosen in such a way that this $H$-operator is selfadjoint as required. In particular, the above realization (9) then leads to a corresponding Hamiltonian
$$H^{(1)} = P_3^{(1)}(x)\frac{d^2}{dx^2} + P_2^{(1)}(x)\frac{d}{dx} + P_1^{(1)}(x), \tag{13}$$
where the superscripts refer to the $N = 1$ case and where
$$P_3^{(1)}(x) = A_1 x^3 + B_1 x^2 + C_1 x, \qquad P_2^{(1)}(x) = D_1 x^2 + E_1 x + 1, \qquad P_1^{(1)}(x) = F_1 x + G_1. \tag{14}$$
Here is the evident connection with the stability condition (3) and its Hamiltonian intimately connected with our information (13) and (14).
We thus conclude that nonlinear deformations of the angular momentum algebra $sl(2,\mathbb{R})$ have to deal with polynomial interactions of special interest in scalar field theories, these (nonlinear) algebras playing the role of (nonlinear) spectrum generating algebras for such developments. Let us end this discussion by considering the example of a $\phi^6$-model leading to a Hamiltonian in the $N = 2$ case:
$$H^{(2)} = P_4^{(2)}(z)\frac{d^2}{dz^2} + P_3^{(2)}(z)\frac{d}{dz} + P_2^{(2)}(z) \tag{15}$$
with
$$P_4(z) = 4z(z - A)(z - B)(z - C), \tag{16}$$
$$P_3(z) = 2\bigl[4z^3 - 3z^2(A + B + C) + 2z(AB + AC + BC) - ABC\bigr] \tag{17}$$
and
$$P_2(z) = -15z^2 + 6z(A + B + C) + AB + AC + BC \tag{18}$$
so that the equation (3) is completely determined in this $\phi^6$-model. We thus get families of solutions of the corresponding Heun equation [9] and it is easy to show that we obtain in that way more general (solitonic) solutions in comparison with the old results coming from the Christ and Lee developments [5]. In particular, the solution
$$F(z) = (A - z)^{1/2}\Bigl(z - \frac{B + C}{2}\Bigr), \qquad \omega^2 = 2(B - C)^2, \tag{19}$$
exists not only for $A = B = 1$ and $C = -\frac{1}{2}$ [5] but also if $A = 2(B + C)$ in general, etc.
References
1. "Dynamical Groups and Spectrum Generating Algebras", A. Barut, A. Bohm and Y. Ne'eman, Eds., World Scientific (Singapore, 1989)
2. J.F. Cornwell, "Group Theory in Physics, an Introduction", Academic Press, London (1997)
3. B. Abdesselam, J. Beckers, A. Chakrabarti and N. Debergh, J. Phys. A 29 (1996) 3075 and references therein
4. N. Debergh, J. Phys. A 30 (1997) 5239
5. N.H. Christ and T.D. Lee, Phys. Rev. D 12 (1975) 1606
6. P.W. Higgs, J. Phys. A 12 (1979) 309
7. F. Barbarin, E. Ragoucy and P. Sorba, Nucl. Phys. B 442 (1995) 425
8. J. Beckers, Y. Brihaye and N. Debergh, "On realizations of nonlinear Lie algebras by differential operators" (University of Liège preprint, August 1998)
9. A. Erdélyi (Editor), "Higher Transcendental Functions", McGraw Hill, New York (1955)
CONTINUOUS PERCOLATION: TREE APPROXIMATION AND MOST PROBABLE CLUSTERS
PH. BLANCHARD, Fakultät für Physik, Theoretische Physik and BiBoS, Universität Bielefeld, Universitätsstrasse 25, D-33615 Bielefeld, GERMANY. E-mail: blanchard@physik.uni-bielefeld.de
G. F. DELL'ANTONIO, Dipartimento di Matematica, Università di Roma "La Sapienza", Piazzale Aldo Moro 5, 00185 Roma, Italy. E-mail: [email protected]
D. GANDOLFO, Centre de Physique Théorique, CNRS, Luminy, F-13288 Marseille Cedex 9, and PhyMat, Département de Mathématiques, UTV, F-83957 La Garde, France. E-mail: [email protected]
M. SIRUGUE-COLLIN, Centre de Physique Théorique, CNRS, Luminy, F-13288 Marseille Cedex 9, and UFR Sciences de la matière, Université de Provence, 3 Place Victor Hugo, 13001 Marseille, France. E-mail: [email protected]
A new approach based on tree approximation is derived in order to estimate the critical parameters of continuous percolation of overlapping disks in $\mathbb{R}^2$ when the centres of the disks are Poisson distributed. In the case of directed percolation, a lower bound is found for the critical density. The cluster structure near the percolation threshold is analyzed.
1 Introduction
Percolation theory on regular lattices already has a long-standing and successful history. By contrast, percolation in continuous systems and on random graphs has received less attention in spite of its great interest in applications. Indeed continuous percolation has been used to model such diverse phenomena as conductivity in disordered semiconductors, permeability of porous media, fracture networks of rocks, spread of AIDS epidemics 1 and even quark-gluon plasma formation in QCD gauge theories 2. We would like to stress that our approach here will refer to continuous systems without any introduction of lattice approximation. Let us recall the main features of continuous percolation, i.e. modeling a random medium (in fact randomly distributed objects in homogeneous media) by using the features of stochastic geometry. The basic facts are quite elementary: a great number $N$ of "small" objects are thrown at random in a large volume $V$; their distribution follows a Poisson law if the mean density $\rho = \frac{N}{V}$ is sufficiently low. So let $\{x_i\}_{i \in I \subset \mathbb{N}}$ be the points of a standard Poisson process with intensity $\rho$ in $\mathbb{R}^d$, $d \ge 2$. We choose a geometrical object $P$ with a distinguished point $x_0$ and a given orientation and for each $i \in I$ we place a copy of $P$ with $x_0 \equiv x_i$. We could also consider objects of random shape and size but in the following we will only be concerned with the simplest case. Denoting by $P_i$ the object centered in $x_i$, we are interested in the description of clusters of overlapping objects and particularly in the possible existence of an infinite cluster "spanning" the space domain. Let us declare two objects $P_i$ and $P_j$ adjacent if $P_i \cap P_j \ne \emptyset$. We say that two objects (or the two corresponding points $x_i$ and $x_j$) are connected, $P_i \leftrightarrow P_j$ (or $x_i \leftrightarrow x_j$), if there exists a chain of adjacent objects (or points) connecting these two objects (or these two points). A cluster is a maximal set $\{P_i\}_{i \in J \subset \mathbb{N}}$ (or $\{x_i\}_{i \in J}$) of connected objects (or points). The aim of continuous percolation theory is to study the sizes and shapes of clusters for specified values of $\rho$ and typical objects $P$. As in the case of lattice percolation in an infinite domain, the existence of an "infinite incipient" cluster is linked to the long range connectivity property of the random system. The study of the phase transition associated with the appearance of an infinite cluster is one of the main problems of percolation theory. Objects as diverse as disks, squares, needles in $\mathbb{R}^2$, and spheres, cubes, ellipsoids in $\mathbb{R}^3$ have been used for typical models (see e.g. 5, 13).
Figure 1.
For general results on continuous percolation see 3 and also 4, 6, 7. For recent results concerning the scaling properties of the infinite percolating cluster see 11, 12. Percolation theory in continuous media is naturally related to classical problems of stochastic geometry, see e.g. 8. Let us now state the framework in which we want to develop our approach. It follows from the scaling property of the Poisson process that the statistical properties of the system are invariant under the scaling transformations:
$$\rho \to \lambda\rho, \qquad v \to \lambda^{-1} v, \qquad \forall \lambda \in\, ]0, \infty[,$$
where $v$ is the volume of the elementary objects. Due to this property, the simplest invariant, dimensionless parameter we can use in order to describe the behaviour of a percolation model is the so-called filling factor 7, defined by $\eta = \rho v$. We can better understand the physical meaning of the parameter $\eta$ by considering $N$ objects of volume $v$ thrown at random in a domain of volume $V$. In the limit of low density, the fraction of space not covered by these objects is $\lim_{V \to \infty} \bigl(\frac{V - v}{V}\bigr)^N$. In the limit $V \to \infty$, $N \to \infty$, $\frac{N}{V} = \rho < \infty$, we have
$$\Bigl(\frac{V - v}{V}\Bigr)^N \to \exp(-\rho v) = \exp(-\eta).$$
Notice that (pc is in some rough sense an indicator of how the "incipient spanning cluster" is branched, at least if one makes the assumption that there is only one such cluster. In order to find the main characteristics of the cluster distribution one defines the following quantities for 77 near T7C • the probability Px for a disk to belong to an infinite cluster,
87
• the correlation length £, maximum size above which the clusters are ex ponentially scarce, • the average cluster size Sci (mean size of the clusters containing one given disk ), Currently accepted wisdom tell us that there should exist a set of critical exponents describing the behaviour of these quantities near the percolation threshold, i.e. there should exist positive constants /3,7, v such that the fol lowing asymptotic behaviour holds: P 0 0 (T ? )
« (i - - )
>
v>Vc
Z(v) <* \v-Vc \~" Sci{rj) oc I r\-i]c I" 7 In R 2 for different types of objects (squares, sticks, . . . ) the percolation threshold r?c changes but the critical exponents are the same as those found for lattice percolation in dimension d = 2, i.e.
fl-A
-t
-H
^~36' V 3' 7 18" The same description can be used for spheres percolation in R 3 , in this case one has T7(d==3) = p^irR3. The corresponding numerical values are in this case: 7]c w 0.35 (
p w 0.41,
v ss 0.90,
7 w 1.8.
In R 3 , as in R 2 , lattice percolation and continuous percolation belong to the same universality class of critical phenomena: the class of random (uncorrelated) percolation. Let us just remark that in R the continuous percolation problem is trivial, clearly we have
• hopping conduction in doped semiconductors,
• localization of electrons in random potentials generated by disordered systems,
• mathematical biology, epidemiology,
• deconfinement of strongly interacting matter at high density (transition to a plasma of deconfined, coloured quarks and gluons).
In spite of its theoretical and practical importance, continuous percolation is essentially an open field for mathematical physics. Some exact general results have already been obtained (see e.g. 11, 12) but almost all that is known in practice concerning continuous percolation has been obtained from numerical simulations (mainly from Monte-Carlo simulations or direct connectedness expansion) or by using heuristic arguments 4. However no realistic model has been constructed which gives even a tentative answer to the problems which are of practical importance:
• size and shape of typical clusters,
• percolation threshold and the large scale phenomena such as the existence of an infinite cluster,
• behaviour of the percolation process near the threshold: power laws, critical exponents,
• generalization to other types of percolating objects (sticks, ellipsoids, squares, cubes, ...),
• problems tied to computer simulation, e.g. the determination of the proper window in which to perform computations,
• connection with scaling theory, renormalization group, universality, fractal properties,
• study of the properties of spanning probabilities on compact regions $[-s, s]^d$
(see however 11, 12 for the last two points).
In fact, as we shall illustrate with examples in the following, we can think of different basic techniques in order to build more realistic models. Besides the general mathematical framework which is related to the general theory of "coverage processes" 10 or the relation between continuous percolation models and bond percolation on Voronoi lattices, some techniques which appear relatively "natural" are:
• put in evidence geometrical and probabilistic constraints which give an insight into the behaviour of the system,
• use the connection with the theory of branching processes by constructing a tree approximation (via, e.g., the method of generations),
• use the results and experience of bond and site percolation on the lattice $\mathbb{Z}^d$, i.e. establish a relation between the continuous percolation problem on $\mathbb{R}^d$ and an ordinary percolation process on $\mathbb{Z}^d$ (long range percolation),
• use methods of statistical mechanics and relate the phase transition in the continuous Potts model with the existence of an infinite cluster in a corresponding continuous percolation process.
Concerning the last point, we would like to state some facts and results dealing with the random cluster representation of a Potts spin system when the spin positions are Poisson distributed in $\mathbb{R}^d$. This problem was initially introduced by Klein 15 as an attempt to generalize the Fortuin and Kasteleyn transformation 16 for the Potts model to the case of systems in continuous spaces. Closely related are the studies of the Widom-Rowlinson model 17, 18 (see also 19). Random cluster measures are natural generalizations of percolation, Ising and Potts models. Let $G_\omega = (V_\omega, E_\omega)$ be the finite graph associated to some realization $\omega$ of a Poisson distribution of density $\rho$ in a finite domain $\Lambda \subset \mathbb{R}^d$. With $\Sigma_\omega = \{1, 2, \ldots, q\}^{V_\omega}$ and $\Omega'_\omega = \{0, 1\}^{E_\omega}$ we consider the probability measure
$$P_\omega(\sigma, \omega') = \frac{1}{Z} \prod_{e \in E_\omega} \bigl[(1 - p(e))\,\delta_{\omega'(e),0} + p(e)\,\delta_{\omega'(e),1}\,\delta_\sigma(e)\bigr], \qquad \sigma \in \Sigma_\omega,\ \omega' \in \Omega'_\omega,$$
where $p(e) = 1 - \exp[-\beta J(e)] \ge 0$ with inverse temperature $\beta$ and
$$J(e) = J(|x_i - x_j|) = \begin{cases} 1 & \text{if } |x_i - x_j| < 2R, \\ 0 & \text{otherwise,} \end{cases}$$
and $\delta_\sigma(e) = \delta_{\sigma_i,\sigma_j}$, $\forall e = (x_i, x_j) \equiv (i, j) \in E_\omega$. The first marginal of $P_\omega(\sigma, \omega')$ leads to the distribution of the continuous Potts model
$$\pi_\Lambda(\sigma) = \frac{1}{Z} \exp\Bigl[-\beta \sum_{(i,j) \in E_\omega} J(e)\,(1 - \delta_{\sigma_i,\sigma_j})\Bigr],$$
which reduces, when $q = 2$, to the Ising model with spins positioned in $\mathbb{R}^d$, where $(i, j) \in E_\omega$ means any edge for which $|x_i - x_j| < 2R$.
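The statement about the first marginal can be verified by brute force on a small configuration. The sketch below is an illustration added here, not the authors' code: the function names and the sample points are hypothetical, the edge set is restricted to pairs of centres closer than $2R$ (where $J(e) = 1$), and normalization constants are left out on both sides. Summing the joint weight over all edge configurations $\omega'$ should reproduce the Potts weight $\exp[-\beta \sum_e J(e)(1 - \delta_{\sigma_i,\sigma_j})]$.

```python
import itertools
import math

def overlap_edges(points, R):
    """Edges (i, j), i < j, between centres closer than 2R, i.e. where J(e) = 1."""
    return [(i, j) for i, j in itertools.combinations(range(len(points)), 2)
            if math.dist(points[i], points[j]) < 2 * R]

def joint_weight(sigma, omega, edges, p):
    """Unnormalized weight prod_e [(1-p) 1{omega_e = 0} + p 1{omega_e = 1} 1{sigma_i = sigma_j}]."""
    weight = 1.0
    for (i, j), is_open in zip(edges, omega):
        if is_open:
            weight *= p if sigma[i] == sigma[j] else 0.0
        else:
            weight *= 1.0 - p
    return weight

def potts_weight(sigma, edges, beta):
    """Unnormalized Potts weight exp(-beta * number of edges with unequal spins)."""
    disagreements = sum(1 for i, j in edges if sigma[i] != sigma[j])
    return math.exp(-beta * disagreements)

def first_marginal(sigma, edges, p):
    """Sum of the joint weight over all edge configurations omega' in {0, 1}^E."""
    return sum(joint_weight(sigma, omega, edges, p)
               for omega in itertools.product((0, 1), repeat=len(edges)))
```

The enumeration is exponential in the number of edges, so this check is only meant for a handful of points.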
The second marginal is the random cluster measure
$$\Phi_\omega(\omega') = \frac{1}{Z} \prod_{e \in E_\omega} \bigl[p(e)^{\omega'(e)}\,(1 - p(e))^{1 - \omega'(e)}\bigr]\, q^{k(\omega')},$$
where $k(\omega')$ is the number of open components of $\omega'$ (connected clusters in the sense that two points are connected iff $|x_i - x_j| < 2R$). For $q = 1$ we recover the continuous percolation model. This generalization of the continuous percolation model allows a geometric representation of the Potts model. The interested reader is encouraged to consult 20. Our purpose here is to develop a model for continuous percolation based on tree approximation.
2 Model of generations
2.1 Presentation
Consider the simplest model of percolating spheres (radius R, volume v
Figure 2.

Notice that a path is a collection of overlapping spheres, but the order in the sequence may not be uniquely defined. The percolation threshold $\rho_c$ can be defined as the infimum over $L$ of the values $\rho(L)$ of the density such that, for any box of characteristic length $L$, there exists, with probability one, a spanning cluster, i.e. a cluster which connects two opposite sides of the box. In the limit of large volume, where translational invariance of the Poisson law holds, we can choose any point $O$ in the percolation cluster and consider all the paths leaving $O$. The set of points belonging to such a graph can then be divided into generations $G_k$ ($k = 1, 2, \ldots$), with $G_0 = \{O\}$, in the following way: $x_k \in G_k$ iff, $\{O, x_1, \ldots, x_k, \ldots, x_{k+i}, \ldots\}$ being on the same path, then necessarily

$$|x_k - x_{k \pm 1}| < 2R, \qquad |x_k - x_{k \pm i}| > 2R \quad \forall i \ge 2. \qquad (*)$$
Figure 3.

This definition of generations is unique if the graph has no loop. Due to the Poisson distribution, for any sphere of volume $v_d$ the mean number of neighbours is given by

$$E(\#\text{neighbours}) = \rho\, 2^d v_d,$$

$\rho$ being the mean density of sphere centres.
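The mean number of neighbours can be checked with a small Monte Carlo experiment. The following is a minimal sketch under stated assumptions: it works in $d = 2$, uses a periodic box to avoid boundary effects, and the function name `mean_neighbours` and all numerical parameters are hypothetical. Two centres are counted as neighbours when their distance is below $2R$, so the expected value is $\rho\,\pi\,(2R)^2$.

```python
import numpy as np


def mean_neighbours(rho, R, box, n_real, seed=0):
    """Estimate the mean number of centres within 2R of a given centre (d = 2)."""
    rng = np.random.default_rng(seed)
    estimates = []
    for _ in range(n_real):
        n = rng.poisson(rho * box**2)            # Poisson number of centres
        if n < 2:
            continue
        pts = rng.uniform(0.0, box, size=(n, 2))
        diff = np.abs(pts[:, None, :] - pts[None, :, :])
        diff = np.minimum(diff, box - diff)      # periodic (torus) distance
        dist2 = (diff**2).sum(axis=-1)
        # Count centres closer than 2R, excluding the centre itself.
        neigh = (dist2 < (2 * R) ** 2).sum(axis=1) - 1
        estimates.append(neigh.mean())
    return float(np.mean(estimates))


estimate = mean_neighbours(rho=1.0, R=0.5, box=20.0, n_real=40, seed=1)
expected = 1.0 * np.pi * (2 * 0.5) ** 2          # rho * pi * (2R)^2
```

With these parameters the estimate should fall within a few percent of the expected value.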
Then, for any $k$ sufficiently large, we can define $F_k$ by

$$E(\#G_{k+1} \mid \#G_k) = \rho\, F_k\, E(\#G_k).$$

$F_k$ is a geometrical factor which depends on the definition of generations. It measures how much the graph "folds" when one passes from one generation to the next. In this tree approximation the critical value $\rho_c$ is defined by the relation $\rho_c \langle F_k \rangle = 1$, where $\langle F_k \rangle$ is a suitable average of $F_k$ over different characterizations of generations. We expect $\langle F_k \rangle$ to be independent of $k$ in the asymptotic limit. Define $\langle F \rangle = \lim_{k \to \infty} \langle F_k \rangle$; then $\rho_c$ is defined by

$$\rho_c v_d = \eta_c = \frac{v_d}{\langle F \rangle}.$$

2.2 Directed percolation
This method of generations turns out to be particularly well suited to study the directed percolation problem of disks in $\mathbb{R}^2$ (characterized by a mean direction vector $\mathbf{e}$). This model is defined by simply adding to the previous conditions (*) the additional constraint

$$(x_{k+1} - x_k) \cdot \mathbf{e} > 0. \qquad (**)$$

For a path $(O, x_1 \in G_1, x_2 \in G_2, \ldots)$ (see Fig. 4) the condition (*) implies that

$$x_1 \in C_1, \qquad x_2 \in U(\theta) = C_2 \setminus (C_1 \cap C_2).$$

Figure 4.

Moreover, the condition (**) implies $x_2 \in U(\theta) \cap P_\Delta = U_r(\theta)$, where $P_\Delta$ is the half-plane limited by $\Delta$ ($x_1 \in \Delta$, $\mathbf{e}$ orthogonal to $\Delta$). Then, using elementary geometrical arguments and symmetry simplifications, one can calculate the mean value $\langle U \rangle$ of $U_r(\theta)$. The same is true for any sequence of 3 successive points. In this way we can identify $\langle U \rangle$ and $F$. We obtain $F = \ldots\, R^2$ and $\eta_c = 1.21$.
To our knowledge, there was until now no known result concerning this type of percolating model, which can be thought to model, for example, percolation of liquid in porous rocks under the influence of the gravitational field. Numerical simulations based on a new algorithm for cluster statistics in continuous percolation systems confirm the above result, whose details will be developed in a forthcoming paper. We can point out that this result is not too far from the known numerical result for ordinary (undirected) percolation ($\eta_c = 1.18$). The case of directed percolation in $\mathbb{R}^3$ can be handled in the same way. The corresponding value for the critical parameter is found to be $\eta_c = 0.47$, to be compared with $\eta_c = 0.35$ for ordinary continuous percolation. On a quite general basis, one of the interesting problems is to find the structure of the most probable types of clusters near the percolation threshold (p

3 The k-cluster model
In 4 Alon, Balberg and Drory proposed a new heuristic percolation criterion for continuous systems, based on the comparison of two fundamental lengths of the system, namely the average bonding length $l_1$ (defined as the mean distance between two connected centres) and the average distance $l_k$ between two centres each of which has at least $k$ neighbours. They postulated that percolation, which can be regarded as the condition that the system be macroscopically connected, occurs when the condition $l_k = 2 l_1$ is fulfilled for some definite $k$ depending on the space dimension. The mean distance $l_k$ of the centres of two $k$-clusters is approximated using the effective density $\rho_k$ of such clusters, which is given by the basic Poisson law:

$$\rho_k = \rho \left[ 1 - \exp(-B) \sum_{i=0}^{k-1} \frac{B^i}{i!} \right], \qquad \text{where } B = \rho\, 2^d V_d.$$
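As an illustration of how the effective density enters the criterion, the sketch below evaluates the Poisson probability that a centre has at least $k$ neighbours and turns it into a length. It rests on assumptions that should be read as such: $V_d$ is taken to be the volume of a sphere of radius $R$, the estimate $l_k \approx \rho_k^{-1/d}$ is a simple stand-in for the mean distance (the exact prefactor used in 4 is not given here), and all function names are hypothetical.

```python
import math


def unit_ball_volume(d):
    """Volume of the unit ball in dimension d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def fraction_with_at_least_k_neighbours(k, B):
    """Poisson tail: probability of at least k neighbours when the mean is B."""
    return 1.0 - math.exp(-B) * sum(B**i / math.factorial(i) for i in range(k))


def mean_distance_k_clusters(k, rho, R, d):
    """Rough length scale rho_k**(-1/d); the prefactor is an assumption."""
    V_d = unit_ball_volume(d) * R**d
    B = rho * 2**d * V_d                     # mean number of neighbours
    rho_k = rho * fraction_with_at_least_k_neighbours(k, B)
    return rho_k ** (-1.0 / d)
```

Since the fraction of centres with at least $k$ neighbours decreases with $k$, the associated length $l_k$ grows with $k$, which is what allows the condition $l_k = 2 l_1$ to select a definite $k$.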
The results they obtained (k = 4 in dimension 2 and k = 2 in dimension 3), are in very good agreement with the numerical simulations. A better insight concerning the typical clusters near the percolation threshold can be obtained using a probabilistic model. First let us give some results on the cluster shapes.
3.1 Most probable cluster shapes

In $\mathbb{R}^d$ we consider spheres with radius $R$, thrown at random with uniform density $\rho$. Their centres are distributed according to a homogeneous Poisson law so that the probability $P_n(V)$ to have $n$ points in a volume $V$ is

$$P_n(V) = \frac{(\rho V)^n}{n!} \exp(-\rho V).$$

Two points $x_1$ and $x_2$ are neighbours iff | x_1 - x

Figure 5.

The volume of the empty region is minimal if the centres of the spheres almost coincide, but this event has very small probability. Nevertheless, for the same volume of overlapping spheres, the empty volume is minimal if the spheres are in a compact shape, so to say in a sphere of radius $r_0$. Then there is a balance between the probabilities of the two following events (see Fig. 6):
• to have $n$ centres in a sphere of radius $r_0$,
• to have no centre in an annulus $[r_0, r_0 + 2R)$.

Figure 6.
The probability to have a cluster of size $n$ and radius $r_0$ surrounded by a corona of depth $2R$ is (with $\pi_d$ the volume of the unit ball in dimension $d$):

$$P(n, r_0) = \frac{(\rho \pi_d r_0^d)^n}{n!} \exp(-\rho \pi_d r_0^d)\, \exp\!\big[ -\rho \pi_d \big( (r_0 + 2R)^d - r_0^d \big) \big].$$

This probability is maximal for $r_0 = \bar r_0$ such that

$$\bar r_0 (\bar r_0 + 2R)^{d-1} = \frac{n}{\rho \pi_d}.$$
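The maximization can be verified numerically. The sketch below is a hedged illustration: it writes $\log P(n, r_0) = n \log(\rho \pi_d r_0^d) - \log n! - \rho \pi_d (r_0 + 2R)^d$, solves the stationarity condition $r_0 (r_0 + 2R)^{d-1} = n / (\rho \pi_d)$ by bisection, and compares the result in $d = 2$ with the positive root of the resulting quadratic equation. The function names and parameter values are hypothetical.

```python
import math


def unit_ball_volume(d):
    """Volume pi_d of the unit ball in dimension d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def log_prob(n, r0, rho, R, d):
    """log P(n, r0): n centres in a ball of radius r0, none in the corona."""
    pi_d = unit_ball_volume(d)
    return (n * math.log(rho * pi_d * r0**d) - math.lgamma(n + 1)
            - rho * pi_d * (r0 + 2 * R) ** d)


def most_probable_radius(n, rho, R, d, tol=1e-12):
    """Solve r (r + 2R)^(d-1) = n / (rho pi_d) by bisection."""
    target = n / (rho * unit_ball_volume(d))
    g = lambda r: r * (r + 2 * R) ** (d - 1)   # increasing in r
    lo, hi = 0.0, 1.0
    while g(hi) < target:
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def closed_form_radius_2d(n, rho, R):
    """Positive root of r^2 + 2 R r - n / (rho pi) = 0."""
    return R * (-1.0 + math.sqrt(1.0 + n / (rho * math.pi * R**2)))
```

For example, `most_probable_radius(5, 1.0, 0.5, 2)` and `closed_form_radius_2d(5, 1.0, 0.5)` should agree to numerical precision, and `log_prob` should be smaller at slightly perturbed radii.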
For $d = 2$, we obtain

$$\frac{\bar r_0(n)}{R} = -1 + \sqrt{1 + \frac{n}{\rho \pi R^2}}.$$

For a given density (and a given filling factor $\eta$), $\bar r_0(n)$ can be taken to be the radius of the most probable clusters with $n$ spheres.

3.2 Structure of percolating clusters

To find out whether some types of clusters (with a fixed number of elements) have more prominence than others at the percolation threshold, we now introduce a probabilistic model of percolating clusters. For this we use as building blocks the most probable clusters introduced above. Let $O$ be the centre of a sphere in $\mathbb{R}^d$; $O$ is an occupied point. We consider a spherical cluster, i.e. an occupied point $O_1$ surrounded by $k$ occupied points $A_i$, $i = 1, \ldots, k$, at the same distance $lR$ from $O_1$ and uniformly distributed around $O_1$. Take two such clusters $\{O_1, A_i; O_2, B_j\}$ such that

$$O_1 A_i = O_2 B_j = lR, \qquad O_1 O_2 = \delta R,$$

where $\delta$ and $l$ are dimensionless parameters.

Figure 7.

Let $M_k(l, \delta) R$ be the minimum of the mean square distance between one point of the set $\{O_1, A_i\}$ and one point of the set $\{O_2, B_j\}$. We shall stipulate that the two clusters percolate if $M_k(l, \delta) R$ is less than $2R$. This gives, for fixed $l$, a critical value $\delta_c(k)$ for $\delta$:

$$M_k(l, \delta_c(k)) = 2.$$

The set of curves $M_k(1, \delta / l)$ can be obtained by numerical simulations, and $\delta_c(k)$ satisfies

$$M_k(1, \delta_c(k)/l) = 2/l. \qquad (1)$$
The proportion of points with $k$ neighbours decreases with $k$. They have a density $\lambda_k$, and if $L_k R$ is the mean distance between them we can write

$$(L_k R)^d \lambda_k \approx 1. \qquad (2)$$

$\lambda_k$ is given by the Poisson law, i.e. $\lambda_k = \rho\, \frac{a^k}{k!} \exp(-a)$, with $a = \rho \pi_d (2R)^d = 2^d \eta$ being the mean number of neighbours of one occupied point. Percolation is supposed to appear first when

$$\delta_c(k) \approx L_k. \qquad (3)$$
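Relations (2) and (3) can be made concrete with a few lines of code. The sketch below is an illustration only: it uses the filling factor $\eta = \rho \pi_d R^d$ implied by $a = 2^d \eta$, expresses the density of points with exactly $k$ neighbours in units of $R^{-d}$, and inverts (2) to obtain the dimensionless distance $L_k$. The function names are hypothetical.

```python
import math


def unit_ball_volume(d):
    """Volume pi_d of the unit ball in dimension d."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def lambda_k_Rd(k, eta, d):
    """Dimensionless density lambda_k * R^d of points with exactly k neighbours."""
    a = 2**d * eta                      # mean number of neighbours
    rho_Rd = eta / unit_ball_volume(d)  # rho * R^d, from eta = rho pi_d R^d
    return rho_Rd * a**k / math.factorial(k) * math.exp(-a)


def L_k(k, eta, d):
    """Dimensionless mean distance from (L_k R)^d lambda_k = 1."""
    return lambda_k_Rd(k, eta, d) ** (-1.0 / d)
```

Summing `lambda_k_Rd` over all $k$ recovers the total density $\rho R^d$, which is a quick consistency check of the Poisson weights.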
We focus first on the case $d = 2$. Using dimensionless variables, the minimum of the mean squared distance between two elements of two different $k$-clusters is given by the set of curves

$$f_k(x) = [M_k(1, x)]^2,$$

which depend only on one parameter $x = \delta / l$ and are obtained numerically. If we choose for $l$ the radius of the most probable cluster with $k + 1$ elements, $l$ is a function of $k$ and $\eta$:

$$l_k(\eta) = -1 + \sqrt{1 + \frac{k+1}{\eta}}.$$

Moreover, with $a = 4\eta$ and $\rho = \eta / (\pi R^2)$, the Poisson law can be put in the form

$$\lambda_k R^2 = \frac{1}{h_k(\eta)}, \qquad h_k(\eta) = \frac{\pi\, k!\, e^{4\eta}}{\eta\, (4\eta)^k}.$$

Then the relations (1), (2), (3) lead to a system of two coupled equations:

$$\begin{cases} f_k(x) = \dfrac{4}{l_k(\eta)^2} \\[2mm] h_k(\eta) = x^2\, [l_k(\eta)]^2 \end{cases}$$

For each value of $k$ we can determine the critical value $\eta_c(k)$ given by the lowest intersecting point of the curves

$$F_k(\eta) = f_k^{-1}\!\left( \frac{4}{l_k(\eta)^2} \right) \qquad \text{and} \qquad H_k(\eta) = \left[ \frac{h_k(\eta)}{l_k(\eta)^2} \right]^{1/2}.$$

The percolation critical value $\eta_c$ is just given by

$$\eta_c = \min_k \eta_c(k). \qquad (4)$$

The value $k_0$ for which the minimum of $\eta_c(k)$ is attained in (4) can be regarded as a measure of the size of the building blocks of the percolation cluster. The results we obtain with these prescriptions are summarized below.

3.3 Results

• Dimension $d = 2$
  - the curves do not intersect for $k \le 3$, are almost tangent for $k = 4$, but intersect clearly for $k \ge 5$.
  - $\eta_c(k)$ is almost stationary around $k = 5$ and very near the value obtained by numerical simulations, $\eta_c = 1.18$. Moreover, the heuristic hypothesis of Alon, Balberg and Drory is in reasonable agreement with the result we have obtained. However, the percolation seems not to distinguish between $k$-clusters for $k$ between 4 and 6.
  - For $k > 6$, $\eta_c(k)$ increases regularly and behaves roughly as $\ldots\,(k + 1)$ for $k \to \infty$.
• Dimension $d = 3$
  The situation is rather different. The clusters are smaller and there is no solution with one type of cluster compatible with the known physical values of the percolation problem. This may be seen as evidence that in dimension 3 the percolating cluster is more structured and is composed of several types of "building blocks" placed according to an optimal strategy. In order to understand this structure we have to consider more collective phenomena, replacing $\lambda_k$ by the probability of clusters of size greater than or equal to $k$. Then for $l \approx 0.75$ and $k = 2$ we obtain $\eta_c \approx 0.32$, to compare with $\eta_c \approx 0.35$ obtained by numerical simulations.
Acknowledgments

The BiBoS research center (Bielefeld-Bonn-Stochastic) of Bielefeld University, the Centre de Physique Théorique, CNRS, Marseille and the Mathematical Institute Guido Castelnuovo, Università La Sapienza, Roma are gratefully acknowledged for kind hospitality and financial support. G.F.D.A. acknowledges the generous support of the Alexander von Humboldt Foundation.

References

1. Ph. Blanchard, Modelling HIV/AIDS, in Fractals in Biology and Medicine, T. F. Nonnenmacher, G. A. Losa, E. R. Weibel Eds., Birkhäuser (1994).
2. H. Satz, Nucl. Phys. A 642, 130 (1998).
3. I. Balberg, Philos. Mag. B 55, 991 (1987).
4. U. Alon, I. Balberg and A. Drory, Phys. Rev. A 42, 4634 (1990).
5. E. J. Garboczi, K. A. Snyder, J. F. Douglas and M. F. Thorpe, Phys. Rev. E 52, 819 (1995).
6. U. Alon, I. Balberg, B. Berkowitz and A. Drory, Phys. Rev. A 43, 6604 (1991).
7. M. B. Isichenko, Rev. Mod. Phys. 64, 4, 961 (1992).
8. D. Stoyan, W. S. Kendall and J. Mecke, Stochastic Geometry and Applications, Akademie-Verlag Berlin (1989).
9. V. K. S. Shante and S. Kirkpatrick, Adv. Phys. 20, 325 (1971).
10. P. Hall, Introduction to the Theory of Coverage Processes, Wiley Series in Probability and Mathematical Physics (1988).
11. M. Aizenman, Nucl. Phys. B , 551 (1997).
12. M. Aizenman, A. Burchard, C. M. Newman and D. B. Wilson, to appear in: Random Structures and Algorithms (1999).
13. G. E. Pike and C. H. Seager, Phys. Rev. Lett. B 10, 1421 (1984).
14. E. T. Gawlinski and H. E. Stanley, J. Phys. A 10, 205 (1977).
15. W. Klein, Phys. Rev. B 26, 5, 2677 (1982).
16. P. W. Kasteleyn and C. M. Fortuin, J. Phys. Soc. Jpn. (Suppl.) 26, 11 (1969).
17. D. Ruelle, Phys. Rev. Lett. 27, 1040 (1971).
18. L. Chayes and R. Kotecky, Comm. Math. Phys. 172, 551 (1995).
19. O. Häggström, Markov Proc. Rel. Fields 4, 275 (1998).
20. G. R. Grimmett, Lectures on probability theory and statistics, Ecole d'été de Probabilités de Saint-Flour (XXV), Lecture Notes in Mathematics, M. T. Barlow, D. Nualart Eds., Springer (1998).
RIGGED HILBERT SPACE RESONANCES AND TIME ASYMMETRIC QUANTUM MECHANICS*

A. BOHM AND H. KALDASS
University of Texas at Austin, Physics Department, Austin, TX 78712
E-mail: [email protected]

The Rigged Hilbert Space (RHS) theory of resonance scattering and decay is reviewed and contrasted with the standard Hilbert space (HS) theory of quantum mechanics. The main difference is in the choice of boundary conditions. Whereas the conventional theory allows for the in-states $\phi^+$ and the out-states (observables) $\psi^-$ of the S-matrix elements $(\psi^-, \phi^+) = (\psi^{out}, S\phi^{in})$ any elements of the HS $\mathcal{H}$, $\{\psi^-\} = \{$

1 Introduction
I am happy to be here to celebrate Ludwig Streit. My relation with Ludwig goes back to 1967 when we overlapped as post-docs at Syracuse University. I still remember that at Syracuse Ludwig gave me his unsolicited advice, that I did not heed, which had severe repercussions. He was already at that time much more clever than I. The other thing that I remember often is that sometime in the 1980s Ludwig worked on Gamow vectors and organized a meeting on resonances 1,2. I think the motivation for the meeting was that nobody really knew what resonance states were. At that time my paper on Gamow kets was out already 3 and I had written many chapters in a monograph 4 on that subject, but Ludwig did not invite me to that meeting. Then there is Ludwig's interest in the Rigged Hilbert Space 5. I had been advocating its use in quantum physics for many years, but was not very successful. When I found out that Ludwig also was using it, I gained hope again that ultimately the Rigged Hilbert Space will make it into physics.

*Based on a talk delivered at the International Conference on Mathematical Physics and Stochastic Analysis in honor of the 60th birthday of Professor Ludwig Streit.

Resonances and Rigged Hilbert Space are the subjects that I want to discuss here. Resonances and decaying states can really not be understood as autonomous elementary particles in Hilbert space quantum mechanics because the Hilbert space mathematics does not allow state vectors characterized by both an energy $E_R$ and a lifetime $\tau$ (or a width $\Gamma = \hbar/\tau$). This is in contrast to the way experimentalists analyze their data and list their results as Breit-Wigner peak value and width $(E_R, \Gamma)$ for resonances (large values of $\Gamma/E_R$) and as $(E_R, \tau)$ for decaying states if the (Breit-Wigner or Lorentzian) line shape cannot be resolved but the decay rate can be fitted to an exponential (small values of $\Gamma/E_R$). Since experimental data are finite in number and there are always experimental tolerances and interference with background, one will never be able to say that the experiment has precisely established an ideal Breit-Wigner or an exact exponential. However, Breit-Wigner energy distribution and exponential time evolution have been observed so many times for such an enormous number of different systems in all areas of physics and chemistry that one can safely take them as the defining signature of a quasistable state and attribute observed experimental deviations to the ever present background and calculated mathematical deviations to an unjustified mathematical idealization. For the mathematical physicists this means that they have to provide the suitable mathematical idealization. How and why to do that is the content of our discussion here.

2 The fundamental calculational tools of quantum mechanics
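In quantum theory one has states denoted by $\rho$ or $W$, or by $\phi$ for a pure state

$$\rho = |\phi\rangle\langle\phi|, \qquad (1)$$

and one has observables denoted by

$$A\ (= A^\dagger),\ \Lambda,\ P\ (P^2 = P), \quad \text{or by } \psi, \text{ if } P = |\psi\rangle\langle\psi|, \text{ for properties.} \qquad (2)$$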
The vectors
(or $\phi$) is thus experimentally defined ("determined") by the preparation apparatus. The quantum physical observables are observed or registered by a registration apparatus, e.g. a detector. The observable $A$ (or $\psi$) or $\Lambda$ are thus experimentally defined by the registration apparatus. In experiments with quantum systems one measures ratios of large integers $N_i/N$ or $N(t)/N$, e.g. as ratios of detector counts of the $i$-th detector $N_i$ and counts of all detectors $N$, or as ratios of detector counts $N(t)$ in the time interval between $t = 0$ and $t = t$ and in "all" time $N = N(\infty)$. This ratio of large numbers is interpreted as probability, e.g. as probability $\mathcal{P}(P_i)$ for a property $P_i$

$$\frac{N_i}{N} \approx \mathcal{P}(P_i) \qquad (3a)$$

or as probability for the observable $\Lambda(t)$ at a time $t$

$$\frac{N(t)}{N} \approx \mathcal{P}(\Lambda(t)) \qquad (3b)$$

where the observables $P_i$ or $\Lambda$ are experimentally given ("defined") by the detector. For a more general observable

$$A = \sum_i a_i P_i \qquad (4)$$

one obtains the average value (of the eigenvalues $a_i$)

$$\sum_{i=1}^{\text{finite}} a_i \frac{N_i}{N} \approx \sum_{i=1}^{\infty} a_i\, \mathcal{P}(P_i). \qquad (3c)$$

The symbol $\approx$ denotes the association of the experimentally measured quantity on the left hand side with the theoretically calculated quantity on the right hand side. In quantum theory the probability of an observable $\Lambda$ in the state $W$ at time $t$ is calculated as

$$\mathcal{P}(t) = \mathcal{P}(\Lambda(t)) = \mathrm{Tr}(\Lambda(t) W_0) = \mathrm{Tr}(\Lambda_0 W(t)). \qquad (5a)$$

If the state is pure, given by the state vector $\phi$, and if the observable is a property given by the one-dimensional projector $|\psi\rangle\langle\psi|$, or given by the "observable" vector $\psi$, then

$$\mathcal{P}(t) = |\langle\psi(t)|\phi_0\rangle|^2 = |\langle\psi_0|\phi(t)\rangle|^2. \qquad (5b)$$
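A finite-dimensional toy calculation may help to fix the meaning of these formulas. The sketch below is only an illustration with hypothetical data: it draws two random unit vectors in a four-dimensional space, checks that the trace formula for a pure state and a one-dimensional projector reduces to the squared modulus of the scalar product, and simulates detector counts whose ratio $N_i/N$ approximates the calculated probability.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_unit_vector(dim):
    """Hypothetical normalized vector in C^dim."""
    v = rng.normal(size=dim) + 1j * rng.normal(size=dim)
    return v / np.linalg.norm(v)


phi = random_unit_vector(4)                  # prepared pure state
psi = random_unit_vector(4)                  # registered "observable" vector

W = np.outer(phi, phi.conj())                # W = |phi><phi|
Lam = np.outer(psi, psi.conj())              # Lambda = |psi><psi|

p_trace = np.trace(Lam @ W).real             # Tr(Lambda W)
p_born = abs(np.vdot(psi, phi)) ** 2         # |<psi|phi>|^2

# Simulated experiment: N preparations, N_i detector counts.
N = 100_000
N_i = rng.binomial(N, p_born)
ratio = N_i / N
```

The two calculated values agree to machine precision, while the simulated ratio agrees only within statistical fluctuations, which mirrors the distinction between $=$ and $\approx$ made in the text.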
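The trace in (5) is calculated using any basis system of the space $\Phi$; either any discrete basis

$$\Phi \ni \phi = \sum_i |i\rangle\langle i|\phi\rangle; \qquad (6a)$$

or any continuous basis (Dirac basis vector expansion)

$$\Phi \ni \phi = \int d\lambda\, |\lambda\rangle\langle\lambda|\phi\rangle \qquad (6b)$$

or any basis system consisting of discrete and continuous (generalized) eigenvectors of a complete system of commuting observables. Thus the trace is given by e.g.

$$\mathrm{Tr}(\Lambda W) = \sum_i \langle i|\Lambda W|i\rangle \quad \text{or} \quad \mathrm{Tr}(\Lambda W) = \int d\lambda\, \langle\lambda|\Lambda W|\lambda\rangle. \qquad (7a)$$

For the special case (5b) the probability for the observable $\psi$ in the state $\phi$ is given by

$$\mathrm{Tr}(|\psi\rangle\langle\psi|\phi\rangle\langle\phi|) = |\langle\psi|\phi\rangle|^2 = \Big| \sum_{i=0} \langle\psi|i\rangle\langle i|\phi\rangle \Big|^2 \qquad (7b)$$

or by

$$|\langle\psi|\phi\rangle|^2 = \Big| \int d\lambda\, \langle\psi|\lambda\rangle\langle\lambda|\phi\rangle \Big|^2 \qquad (7c)$$

if one uses a continuous basis of Dirac kets $|\lambda\rangle$. The time evolution in (5a) and (5b) (dynamics of the quantum system) is given by the Hamiltonian operator $H$ of the system; either as

$$i\hbar \frac{\partial W(t)}{\partial t} = [H, W(t)] \quad (8a); \qquad i\hbar \frac{\partial \phi(t)}{\partial t} = H \phi(t) \quad (8b)$$

with $\phi(t = 0) = \phi_0$ (Schroedinger picture), or by

$$i\hbar \frac{\partial \Lambda(t)}{\partial t} = -[H, \Lambda(t)] \quad (8c); \qquad i\hbar \frac{\partial \psi(t)}{\partial t} = -H \psi(t) \quad (8d)$$

(Heisenberg picture). None of the above equations is mathematically precise until we define the space $\Phi$, the kets $|\lambda\rangle$ or the integration $d\lambda$ in (6b) and (7c), and specify the initial-boundary conditions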
3 Empirical reasons for time asymmetric quantum mechanics
The time $t$ in (5) and in the operators $\Lambda(t)$ and $W(t)$ is usually allowed to take positive and negative values, i.e. one chooses time symmetric boundary conditions for the Schroedinger and Heisenberg equations of motion (8). This choice is not compelling from an empirical point of view for the following reason. An obvious expression of causality is that a state $W_0$ (or $\phi_0$) has to be prepared first before an observable $\Lambda(t)$ or $\psi(t)$ can be measured in it. If one calls $t = 0$ the time at which the preparation of the state $W$ (or $\phi$) is completed, then the time translation in $\Lambda(t)$ (or $\psi(t)$) in (5) makes physical sense only for $t > 0$. The registration counts $N(t)$ in (3b), by which $\mathcal{P}(t)$ is measured, can be taken in the future $t > 0$, but not in the past. Thus, $\mathrm{Tr}(\Lambda(t) W)$ will have an experimental counterpart for $t > 0$ but not for $t < 0$. The experimental data for $\mathcal{P}(t)$ would give that

$$\mathcal{P}(t) = \mathrm{Tr}(\Lambda(t) W) \approx \frac{N(t)}{N} \quad \text{for } t > 0 \qquad (9a)$$

and

$$\mathcal{P}(t) = \mathrm{Tr}(\Lambda(t) W) \approx 0 \quad \text{for } t < 0 \qquad (9b)$$

because if the detector would click before the state is prepared we would discount this click as noise. Often, such as for stationary states and/or time independent observables, it does not matter at which time $\mathcal{P}(t) = \mathrm{Tr}(\Lambda(t) W) = \mathrm{Tr}(\Lambda W(t))$ is calculated. But one cannot make it a general principle that time evolution of the observable $\Lambda(t)$ (or of the state $W(t)$) must go into both directions. In fact the most natural description of the experimental situation (9) would be to require a time ordering in the probability formula

$$\mathcal{P}(t) = \mathrm{Tr}(\Lambda(t) W(t_0)), \qquad t > t_0 = 0 \qquad (10)$$

and admit in the mathematical description only time translations of observables $\Lambda(t)$ relative to the state $W(t_0)$ by an amount $t - t_0 > 0$. This would not only reflect the experimental situation, which forbids time translation of the registration apparatus relative to the preparation apparatus to a time $t < t_0$. It would also incorporate the notion of causality into the mathematical theory on the operator level.

The time translation is quite different from other transformations of the space-time symmetry group, e.g. rotations and space-translations. These symmetry transformations of space-time are experimentally realized as transformations of the registration apparatus (detector) relative to the preparation apparatus (accelerator). Whereas space translations and rotations of the apparatuses can be performed back and forth, the time at which the detector is activated can only be shifted into the future, not backward past the time of preparation. Thus time translations of the apparatuses form only a semigroup whereas rotations and translations of the apparatuses form a group. Therefore it is natural to represent space translations and rotations by unitary groups of operators in the space of physical states. But it is unjustified to assume that time translations are also always represented by group operators in the space of states, i.e. to assume that in quantum mechanics only states with reversible time evolution exist. When we make the choices that give a precise mathematical meaning to the vectors $\phi$ and $\psi$ and the operators $W$, $A$, $\Lambda$, $A^\dagger$ in (1) and (2) we should therefore not restrict ourselves to those mathematical assumptions that dictate a unitary group evolution.

4 The Hilbert space idealization has reversible time evolution
Of the vectors $\phi$, $\psi$ we have so far assumed that they fulfill the mathematical axioms of a linear space with scalar product (

in (7a) or other topological notions. The equality $\approx$ between experimental and mathematical quantities has a meaning only within certain experimental errors and for large numbers $N$. Therefore, one has to make some more or less arbitrary mathematical idealizations in order to obtain complete mathematical structures, like linear topological spaces and topological algebras of operators. One such idealization is the Hilbert space (HS). This is obtained by adjoining to the linear scalar product space $\Phi$ the limit elements of infinite converging (Cauchy) sequences, with the Hilbert space convergence (or Hilbert space topology $\tau_{\mathcal{H}}$) defined by

$$\phi_n \xrightarrow{\ \tau_{\mathcal{H}}\ } \phi \iff \|\phi_n - \phi\| \to 0 \quad \text{for } n \to \infty, \qquad (11)$$

where the norm $\|\phi\|$ is defined by the scalar product, $\|\phi\| = \sqrt{(\phi, \phi)}$
(12)
are not Riemann but are Lebesgue integrals. This means the values of the functions $\psi(\lambda) = \langle\lambda|\psi\rangle$ at a particular point (or at all rational numbers) are not defined, which in turn means that the Dirac kets $|\lambda\rangle$ cannot be defined. The Lebesgue integration also makes the interpretation of the probability density $|\langle E|\psi\rangle|^2$ as the energy resolution of the detector for the observable $|\psi\rangle\langle\psi|$ rather unintuitive, at least for those $\psi \in \mathcal{H}$ that do not have a smooth function in the class of Lebesgue integrable functions $\{\langle E|\psi\rangle\}$ which represent $\psi$.

In the (complete) Hilbert space one can define self-adjointness and give the precise meaning of "spectral resolution" to equations like (4). One can make the hypothesis that observables are (not just hermitian or symmetric but) self-adjoint operators and one can prove existence theorems. One of these existence theorems is the Gleason theorem 7, which states that if $\mathcal{P}(P_i)$ is the function on the set of projectors $\{P_i\}$ which fulfills the axioms of probabilities¹, then there exists a positive trace class operator (density operator) $\rho$ in $\mathcal{H}$ such that $\mathcal{P}(P_i) = \mathrm{Tr}(P_i \rho) = \mathcal{P}_\rho(P_i)$. This operator $\rho$ thus defines the state. Since the converse (for any positive trace class operator $\rho$, $\mathcal{P}_\rho(P_i) = \mathrm{Tr}(P_i \rho)$ fulfills the axioms of probability theory¹) is simple to see, one was led to the conclusion 8 that there is a one-to-one correspondence between

$$\text{quantum physical states} \iff \text{density operators } \rho \qquad (13a)$$

and in particular between

$$\text{pure quantum physical states} \iff \text{vectors } \phi \text{ (up to a phase) in } \mathcal{H}. \qquad (13b)$$

¹ For $\mathcal{P}(P_i)$ or $\mathcal{P}(\Lambda)$ to be a probability it has to fulfill the axioms of probability theory: $\mathcal{P}(P) \ge 0$, $\mathcal{P}(1) = 1$, $\mathcal{P}(P_i + P_j) = \mathcal{P}(P_i) + \mathcal{P}(P_j)$ ($[P_i, P_j] = 0$, $P_i P_j = 0$).
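Another existence theorem (Stone-von Neumann operator calculus 9) asserts that the Cauchy problem in quantum mechanics, e.g. in the form of the Schroedinger equation (8b) with the initial condition $\phi(t = 0) = \phi_0 \in \mathcal{H}$ and a selfadjoint Hamiltonian $H$ with domain $D(H)$ dense in $\mathcal{H}$, has a solution given by the one parameter group of (strongly continuous) unitary operators $U^\dagger(t)$ in $\mathcal{H}$:²

$$\phi(t) = U^\dagger(t) \phi_0 = e^{-iHt} \phi_0, \qquad -\infty < t < \infty \qquad (14a)$$

where $\phi(t)$ depends continuously on the initial condition. This means

$$U^\dagger(t + \tau) = U^\dagger(t)\, U^\dagger(\tau), \qquad U^\dagger(-t) = (U^\dagger)^{-1}(t) = U(t) \qquad (15)$$

$$\frac{d}{dt} U^\dagger(t) \phi = (-iH)\, U^\dagger(t) \phi = U^\dagger(t)\, (-iH) \phi \quad \text{for } \phi \in D(H). \qquad (16)$$

The same result holds for the time evolution of the observable vector $\psi$ in $\mathcal{H}$

$$\psi(t) = U(t) \psi_0 = e^{iHt} \psi_0, \qquad -\infty < t < \infty. \qquad (14b)$$

In terms of the operators $W$ and $\Lambda$ this is given by

$$W(t) = U^\dagger(t)\, W_0\, U(t) = e^{-iHt} W_0 e^{iHt}, \qquad -\infty < t < +\infty \qquad (17a)$$

$$\Lambda(t) = e^{iHt} \Lambda_0 e^{-iHt}, \qquad -\infty < t < +\infty. \qquad (17b)$$

² The operator $U^\dagger(t)$ is defined in all of $\mathcal{H}$ by the Stone-von Neumann calculus as $U^\dagger(t) = \int e^{-iEt}\, dP(E)$, not by the exponential series

$$\sum_{n=0}^{\infty} \frac{(-iHt)^n}{n!}$$

which is defined (converges) only on a dense subspace $A \subset D \subset \mathcal{H}$ called analytic vectors.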
The conclusion that one draws from these two existence theorems is the following: The choice to describe states in the HS theory is very restricted and can only be given by a density operator (positive definite trace-class) and (for a pure state) by a vector $\phi \in \mathcal{H}$. The time evolution of these states must be the time symmetric reversible group evolution (14). There are no states in the HS quantum mechanics of closed systems (which fulfill (8a) or (8b)) which have asymmetric (irreversible) time evolution³. The evolution of the quantum mechanical observables is also time symmetric, i.e. the observables $\Lambda(t - t_0)$ can evolve relative to the state $W(t_0)$ by an arbitrary amount $t - t_0 > 0$ as well as $t - t_0 < 0$. For every $U(t_2 -$ ... $\phi^D$ (or more general state $W^D$) which has been created at any finite time $t_0 \neq -\infty$ (which we call $t_0 = 0$) and then decays into decay products. This mathematical consequence is not surprising if one keeps in mind that in the HS theory the $t$-evolution, where $t$ is the relative time between state and observable, is given by a reversible unitary group. One cannot just impose on the group evolution an arbitrary condition like causality, chopping off one half of the theory, and expect that what remains is still a consistent theory. The reversibility of quantum mechanics in HS and the violation of the causality principle is a consequence of the mathematical idealization given by the topology of the HS, i.e. by (11). It is not a consequence of the fundamental hypothesis of quantum physics as given by the interpretation (5) and by the dynamical equation (8). We shall therefore return to the quantum mechanical Cauchy problem (8) and modify the boundary conditions

³ Irreversibility in conventional quantum theory is considered to be "non-quantum mechanical" and always thought of as being due to external influences upon "open" systems. It is described by an additional term on the right hand side of (8a) which is not given by the commutator with $H$ of the quantum system.

since we have already a scalar product $(\ ,\ )$ in the linear space $\Phi$ (and we need the scalar product to calculate such physical quantities like the probabilities $|(\phi, \psi)|^2$), Banach space completion is no more an option.

5 The Rigged Hilbert Space idealization has irreversible semigroup evolution and Gamow vectors with exponential decay
We shall complete the linear scalar product space $\Phi$ into a locally convex nuclear space with a topology stronger than the Hilbert space topology given by the scalar product. Specifically we shall choose a countable Hilbert space where the meaning of convergence, i.e. the topology $\tau_\Phi$, is defined by a countable number of scalar products $(\ ,\ )_p$, $p = 0, 1, 2, \ldots$, where $(\ ,\ )_{p=0} = (\ ,\ )$ is the scalar product of the HS. Convergence with respect to $\tau_\Phi$, $\phi_n \xrightarrow{\ \tau_\Phi\ } \phi$, means:

$$\|\phi_n - \phi\|_p \to 0 \quad \text{for } n \to \infty$$

for every $p = 0, 1, 2, \ldots$. This topology is stronger than $\tau_{\mathcal{H}}$. The countable number of scalar products, i.e. the topology $\tau_\Phi$, is usually chosen such that the algebra of observables for the physical system under consideration becomes an algebra of $\tau_\Phi$-continuous operators. If we denote the $\tau_\Phi$-completion of the linear space $\Phi$ again by $\Phi$, then we have the two complete topological spaces $\Phi \subset \mathcal{H}$ with $\Phi$ dense in $\mathcal{H}$. Taking in addition the space $\Phi^\times$ of $\tau_\Phi$-continuous antilinear functionals $F(\phi)$ on $\Phi$, and the space $\mathcal{H}^\times$ of $\tau_{\mathcal{H}}$-continuous functionals $f(h) = (h, f)$ which are given by the scalar product, we obtain the Gelfand triplet or Rigged Hilbert Space (RHS) 10

$$\Phi \subset \mathcal{H} = \mathcal{H}^\times \subset \Phi^\times. \qquad (19)$$

We shall use the Dirac notation for the $\tau_\Phi$-continuous functionals, $F(\phi) = \langle\phi|F\rangle$, because $F(\phi)$ is an extension of the scalar product $(h, f)$ to those $F \in \Phi^\times$ which are not in $\mathcal{H}$. In these RHS's (one for each kind of quantum physical system) Dirac's formalism of kets (with a continuous set of eigenvalues) and the continuous basis vector expansion (6b) attain a mathematical meaning and the integrals in (7) are Riemann integrals. These RHS's also allow for time-asymmetric solutions of the quantum mechanical Cauchy problem (8). In the RHS formulation one can choose different subspaces of $\mathcal{H}$ to distinguish between states and observables. We call $\Phi_-$ the space that describes the
109 states (called in-states in the scattering experiment) prepared by preparation apparatuses (e.g. accelerator). We call $ + the space that describes the ob servables (called out-states in scattering theory) registered by the registration apparatuses (e.g. detector). The two subspaces $_ and $+ are not disjoint. The HS formulation, in contrast, does not allow for this mathematical dis tinction into separate subspaces of states and observables. Thus there is one Hilbert space % and for each quantum mechanical (scattering) system two dense subspaces $ T and therewith two RHS's <J>_ C H C $ ! with the physical interpretation as in-states and
(20a)
$+ C H C $+ with the physical interpretation as out-observables.
(20b)
Mathematically $ T are defined by their realization as function spaces for their energy wavefunctions :
>+ e $_ «• (+E\4>+) eSr\H2_|R+ il>-e*+&(-E\i>-)€Sr\Hl\R+
(2ia) (2ib)
where S denotes the Schwartz space of functions and "H2_, %\ denotes Hardy class functions in the lower and upper complex plane, respectively. (Here complex half-planes refer to the second Riemann sheet of the S-matrix). The ± in {±E\ refers to the ±ie in the Lippmann-Schwinger equation for the eigenkets oiH = H0 + V. The interpretation (20) of the mathematical spaces (21) can be inferred from the preparation -> registration arrow of time n . In addition to the vectors
(Hxp- \ER - tr/2-> = (i,- \H* \ER - tr/2-> =
(ER-iT/2)(ip-\ER-iT/2-)
for all xj)~ 6 $+ (the space of observed decay products). Here ER represents the resonance energy and T the width of the Breit-Wigner energy distribu tion. The superscript " in \ER - iT/2~) is inherited from the kets of the Lippmann-Schwinger equation, the subscripts on the corresponding space $.£ from the mathematicians' notation for Hardy class functions (21). Scattering theory in physics and Hardy class functions in mathematics were developed independently of each other. Except for this discrepancy in the notation for
110
the labels +", the Hardy class spaces $_ provide an excellent mathematical image for prepared states {
{ t )
= e-iHXt\ER-iT/2-)
= e~iE-le-rt/2\ER
-
iT/2~),
for t > 0 only. The Gamow kets are solutions of the Schroedinger equation (8b) but do not fulfill the Hilbert space boundary condition. Instead they fulfill the time asymmetric boundary condition ipG(t = 0) £ $ + . Whereas the first part of (23a) can be formally verified from (22), the derivation of the time asym metry t > 0 is highly non-trivial and requires specific properties of Hardy class functions. The semigroup eZ'H l is only defined for positive values of the time, t > 0, (H* is the operator H^ extended into 4>x defined by the first equality of (22)). There is another semigroup eZtH l,t<0 and another Gamow ket tj)G = \ER + iT/2+) G $ * with the asymmetric time evolution jG(t)
= e~_iHXt\ER+ir/2+) = e-iB^e+r^\ER for t < 0 only .
+
iV/2+),
The Gamow vectors are defined from the pole term of the analytically continued S-matrix at the resonance position $z_R = E_R - i\Gamma/2$ (and at $z_R = E_R + i\Gamma/2$ for (23b)) in the second Riemann sheet. From this one obtains, using the Hardy class property, the Breit-Wigner energy distribution for their wave function 3,13
$$\langle{}^-E|\psi^G\rangle = i\sqrt{\frac{\Gamma}{2\pi}}\;\frac{1}{E - (E_R - i\Gamma/2)}, \qquad -\infty_{II} < E < +\infty . \qquad (24)$$
The variable $E$ extends over the physical values (upper rim of the positive real axis of the first sheet = lower rim of the positive real axis of the second sheet) and from $-\infty_{II}$ to 0 in the second sheet. This is an idealized Breit-Wigner in contrast to the standard Breit-Wigner for which $0 < E < \infty$. The results (22) and (23) are derived from the pole term definition using the properties of the Hardy spaces (21) 3,13. If there are $N$ resonances in the system, each occurring as a pole of the $j$-th partial S-matrix at the positions $z_{R_i} = E_{R_i} - i\Gamma_i/2$, then one obtains $N$ Gamow vectors $\psi_i^G$.
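The two properties just described can be checked numerically. The following is a minimal sketch, not part of the original derivation: it assumes units with $\hbar = 1$, and the values of `E_R` and `gamma` are purely illustrative. It verifies that the squared modulus of the idealized Breit-Wigner amplitude (24) integrates to one over the whole real energy axis, and that the exponential factor in (23a) gives a survival probability $e^{-\Gamma t}$.

```python
import math

def breit_wigner_density(E, E_R, gamma):
    """|<E|psi^G>|^2 for the amplitude i*sqrt(gamma/(2*pi)) / (E - (E_R - i*gamma/2))."""
    return (gamma / (2.0 * math.pi)) / ((E - E_R) ** 2 + (gamma / 2.0) ** 2)

def integrate_midpoint(f, lo, hi, n):
    """Plain midpoint rule on [lo, hi] with n sub-intervals."""
    step = (hi - lo) / n
    return step * sum(f(lo + (k + 0.5) * step) for k in range(n))

def survival_probability(t, gamma):
    """|exp(-i*E_R*t) * exp(-gamma*t/2)|^2 = exp(-gamma*t), meaningful for t >= 0."""
    return math.exp(-gamma * t)

# Illustrative (hypothetical) resonance parameters, not taken from the text.
E_R, gamma = 5.0, 0.2

# The idealized distribution extends to negative energies, so the integration
# window is taken symmetric around E_R and many widths wide.
half_window = 1000.0 * gamma
norm = integrate_midpoint(lambda E: breit_wigner_density(E, E_R, gamma),
                          E_R - half_window, E_R + half_window, 200_000)

# After one lifetime 1/gamma the survival probability has dropped to 1/e.
lifetime = 1.0 / gamma
print(norm, survival_probability(lifetime, gamma))
```

Restricting the window to $0 < E < \infty$ instead would give a norm below one, which is one way to see the difference between the idealized and the standard Breit-Wigner mentioned above.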
The Gamow vectors $\psi_i^G$ are members of a "complex" basis vector expansion 13. In place of the well known Dirac basis system expansion (6b) given for the Hamiltonian $H$ by
$$\phi^+ = \int_0^{+\infty} dE\,|E^+\rangle\langle{}^+E|\phi^+\rangle \qquad (25)$$
(where a discrete sum over bound states has been ignored), every state vector $\phi^+ \in \Phi_-$ can be expanded as
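$$\phi^+ = \sum_{i=1}^{N} |\psi_i^G\rangle\,c_i + \int_0^{-\infty_{II}} dE\,|E^+\rangle\langle{}^+E|\phi^+\rangle \qquad (26)$$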
(where $-\infty_{II}$ indicates that the integration along the negative real axis (or other contours) is in the second Riemann sheet of the S-matrix). The "complex" basis vector expansion (26) is rigorous. This allows us to mathematically isolate the exponentially decaying states $\psi_i^G$. It also allows an easy approximation by omitting the background integral in (26), and just using
$$\phi^+ = \sum_{i=1}^{N} |\psi_i^G\rangle\,c_i, \qquad c_i = -\langle \psi_i^G | \phi^+ \rangle. \qquad (27)$$
Then one obtains the "effective" theories with a finite-dimensional complex Hamiltonian. For instance, for the neutral K-meson system ($K_S$, $K_L$) with $N = 2$,
$$\phi^+ = |K_S\rangle\,b_S + |K_L\rangle\,b_L$$
(28)
and one obtains the Lee-Oehme-Yang theory 14. The finite dimensional approximations (27) have been successfully applied to many areas of physics, in particular to nuclear physics 15,2, which shows that isolating the Gamow states can be a good approximation. Since (23a) implies the exponential law for the decay rate 6, $\dot{\mathcal{P}}_\eta(t) = \Gamma_\eta e^{-\Gamma t}$, the width $\Gamma$ of the Breit-Wigner distribution (24) and the lifetime $\tau$ fulfill the exact relation $\tau = \hbar/\Gamma$. This has not been obtained before as an exact, precisely derived relation, though it has always been assumed on the basis of some "approximate derivations" 16. Gamow vectors are ideally suited to describe resonances (the pair (23a) and (23b)) in a scattering process or quasistable particles (23a) that decay. Like the Dirac-Lippmann-Schwinger kets $|E^\pm\rangle$, from which they are constructed, they do not describe interaction-free in- or out- asymptotic states. In a theory that allows only asymptotic particles, they are therefore not admitted. Gamow states have all the properties that heuristically the unstable
states need to possess. In addition, and unintended, they give rise to an asymmetric time evolution, which may not have been wanted but is in agreement with the empirical principle of causality.

Acknowledgement

We gratefully acknowledge valuable support of the Welch Foundation for the preparation of this paper and kind hospitality of Rui Vilela Mendes in Lisbon.

References

1. Resonances-Models and Phenomena, S. Albeverio, L. S. Ferreira and L. Streit [Eds.], Springer Lecture Notes in Physics 211, (Berlin, 1984).
2. M. Baldo, L. S. Ferreira and L. Streit, Nucl. Phys. A 467, 44 (1987); M. Baldo, L. S. Ferreira and L. Streit, Phys. Rev. C 36, 1743 (1987).
3. A. Bohm, J. Math. Phys. 22, 2813 (1981); Lett. Math. Phys. 3, 455 (1978).
4. A. Bohm, Quantum Mechanics, 3rd ed., sections XVIII.6-9 and XX.3, (Springer-Verlag, New York, 1994).
5. L. Streit in Stochastic Analysis and Applications in Physics, A. I. Cardoso, et al. [Eds.], (Kluwer, Dordrecht, 1994), page 415; M. de Faria and L. Streit in Stochastic Analysis on Infinite Dimensional Spaces, H. Kunita and H. H. Kuo [Eds.], (Longman, 1994), page 52.
6. A. Bohm and N. L. Harshman in Irreversibility and Causality, p. 225, Sect. 7.4; A. Bohm, H. D. Doebner and P. Kielanowski [Eds.], (Springer, Berlin, 1998).
7. A. M. Gleason, J. Math. Mech. 6, 885 (1957).
8. J. von Neumann, Mathematische Grundlagen der Quantentheorie, (Springer, Berlin, 1931) (English translation by R. T. Beyer) (Princeton University Press, Princeton, 1955).
9. M. H. Stone, Ann. of Math. 33, 643 (1932).
10. A. Bohm and M. Gadella, Dirac Kets, Gamow Vectors and Gel'fand Triplets, Lecture Notes in Physics 348, (Springer-Verlag, Berlin, 1989).
11. A. Bohm, I. Antoniou and P. Kielanowski, Phys. Lett. A 189, 442 (1994); A. Bohm, I. Antoniou and P. Kielanowski, J. Math. Phys. 36, 2593 (1995).
12. E. P. Wigner, Symmetries and Reflections, (Indiana University Press, Bloomington, 1967), page 38.
13. A. Bohm, S. Maxson, M. Loewe and M. Gadella, Physica A 236, 485 (1997).
14. T. D. Lee, R. Oehme and C. N. Yang, Phys. Rev. 106, 340 (1957).
15. L. S. Ferreira in Resonances, E. Brandas [Ed.], Lecture Notes in Physics 325, (Springer, 1989), p. 201; C. Mahaux, p. 139; O. I. Tolstikhin, V. N. Ostrovsky and H. Nakamura, Phys. Rev. 58, 2077 (1998); V. I. Kukulin, V. M. Krasnopolsky and J. Horacek, Theory of Resonances, (Kluwer Acad. Publ., 1989).
16. e.g. M. L. Goldberger and K. M. Watson, Collision Theory, (Wiley, New York, 1964); Newton, Scattering Theory of Waves and Particles, 2nd ed., (Springer-Verlag, 1982), chapter 8.
Q-ISING NEURAL NETWORK DYNAMICS: A COMPARATIVE REVIEW OF VARIOUS ARCHITECTURES

D. BOLLE AND G. JONGEN
Instituut voor Theoretische Fysica, K.U. Leuven, B-3001 Leuven, Belgium
E-mail: {desire.bolle, greetje.jongen}@fys.kuleuven.ac.be

G. M. SHIM
Department of Physics, Chungnam National University, Yuseong, Taejon 305-764, R.O. Korea
E-mail: [email protected]

This contribution reviews the parallel dynamics of Q-Ising neural networks for various architectures: extremely diluted asymmetric, layered feedforward, extremely diluted symmetric, and fully connected. Using a probabilistic signal-to-noise ratio analysis, taking into account all feedback correlations, which are strongly dependent upon these architectures, the evolution of the distribution of the local field is found. This leads to a recursive scheme determining the complete time evolution of the order parameters of the network. Arbitrary Q and mainly zero temperature are considered. For the asymmetrically diluted and the layered feedforward network a closed-form solution is obtained while for the symmetrically diluted and fully connected architecture the feedback correlations prevent such a closed-form solution. For these symmetric networks equilibrium fixed-point equations can be derived under certain conditions on the noise in the system. They are the same as those obtained in a thermodynamic replica-symmetric mean-field theory approach.
1 Introduction
Artificial neural networks have been widely applied to memorize and retrieve information. During the last few years there has been considerable interest in neural networks with multistate neurons (see 1 and references cited therein), which can function as associative memories for gray-toned or coloured patterns or allow for a more complicated internal structure in the retrieval, e.g., a distinction between background and patterns. Here we review the dynamics of so-called Q-Ising neural networks (see the references in 2) for arbitrary Q. They are built from Q-Ising spin-glasses 3,4 with couplings defined in terms of patterns through a learning rule. For $Q = 2$ one recovers the Hopfield model 5,6, for $Q \to \infty$ one has an analog network (see 7 and references therein). One of the aims of these networks is to memorize a number of patterns and find them back as attractors of the
retrieval process. Consequently these networks are also interesting from the point of view of dynamical systems. Besides a learning rule one also needs to specify an architecture indicating how the spins (= neurons) are connected with each other. Several architectures have been studied in the literature for different purposes. From a practical point of view mostly perceptrons or, more generally, feedforward layered networks have been used for a very long time (see, e.g., 8 for a history). Hopfield 5,6 studied a fully connected network with symmetric couplings because it satisfies the detailed balance principle and hence a Hamiltonian can be defined. Asymmetrically diluted models 9 were used because their dynamics can be solved exactly and because they can teach us something about the loss of information content when some of the synaptic couplings break down. In this contribution we review the study of the parallel dynamics of these types of network using a probabilistic approach (see, e.g., 10,11). In more detail, employing a signal-to-noise ratio analysis based on the law of large numbers (LLN) and the central limit theorem (CLT) we derive the evolution of the distribution of the local field at every time step. This allows us to obtain a recursive scheme for the evolution of the relevant order parameters in the system, which are, in general, the main overlap for the condensed pattern, the mean of the neuron activities and the variance of the residual overlap responsible for the intrinsic noise in the dynamics of the main overlap (sometimes called the width parameter). The details of this approach depend in an essential way on the architecture because different temporal correlations are possible. For extremely diluted asymmetric and layered feedforward architectures recursion relations have been obtained in closed form directly for the relevant order parameters 9,12,13,14.
This has been possible because in these types of networks there are no feedback correlations as time progresses. As a technical consequence the local field contains only Gaussian noise leading to an explicit solution. For the parallel dynamics of networks with symmetric connections, however, things are quite different 1,10,11. Even for extremely diluted versions of these systems 15,16,17 feedback correlations become essential from the second time step onwards, complicating the dynamics in a nontrivial way. Therefore, explicit results concerning the time evolution of the order parameters for these models have to be obtained indirectly by starting from the distribution of the local field. Technically speaking, both for the symmetrically diluted and fully connected architectures the local field contains both a discrete and a normally distributed part. The difference between the diluted and fully connected models is that for the diluted model the discrete part at a certain time $t$ does not involve the spins at all previous times $t-1, t-2, \ldots$ up to 0 but only the spins at time
step $t-1$. But in both cases the discrete part prevents a closed-form solution of the dynamics for the relevant order parameters. Nevertheless, the development of a recursive scheme is possible in order to calculate their complete time evolution. In this way a comparative discussion of the parallel dynamics at zero temperature for the various architectures specified above is possible. Finally, by requiring the local field to become time-independent, implying that some correlations between its Gaussian and discrete noise parts are neglected, we can obtain fixed-point equations for the order parameters. It turns out that they are equivalent to the fixed-point equations obtained through a thermodynamic replica-symmetric mean-field theory approach. At this point we remark that we do not aim for complete rigour in our derivations. From the point of view of rigorous mathematics, the Hopfield model and, in general, spin-glass theory is recognized to be an extremely difficult, if not impossible, field. For a recent overview of the modest results obtained, mostly concerning thermodynamics, we refer to 18. The rest of this contribution is organized as follows. In Section 2 we introduce the model, its dynamics and the Hamming distance as a macroscopic measure for the retrieval quality. In Section 3 we use the probabilistic approach in order to derive a recursive scheme for the evolution of the distribution of the local field, leading to recursion relations for the order parameters. The differences between the various architectures are outlined. We do not aim for complete rigour and mostly concentrate on zero temperature. In Section 4 we discuss the evolution of the system to fixed-point attractors. Some concluding remarks are given in Section 5.
2 Q-Ising neural networks
Consider a neural network $\Lambda$ consisting of $N$ neurons which can take values $\sigma_i$ from a discrete set $\mathcal{S} = \{-1 = s_1 < s_2 < \ldots < s_Q = +1\}$. The $p$ patterns to be stored in this network are supposed to be a collection of independent and identically distributed random variables (i.i.d.r.v.), $\{\xi_i^\mu \in \mathcal{S}\}$, $\mu \in \mathcal{P} = \{1, \ldots, p\}$ and $i \in \Lambda$, with zero mean, $E[\xi_i^\mu] = 0$, and variance $A = \mathrm{Var}[\xi_i^\mu]$. The latter is a measure for the activity of the patterns. We remark that for simplicity we have taken the patterns and the neurons out of the same set of variables but this is no essential restriction. Given the configuration $\sigma_\Lambda(t) = \{\sigma_j(t)\}$, $j \in \Lambda$, the local field in neuron $i$ equals
$$h_i(\sigma_\Lambda(t)) = \sum_{j \in \Lambda} J_{ij}(t)\,\sigma_j(t) \qquad (1)$$
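with $J_{ij}$ the synaptic coupling from neuron $j$ to neuron $i$. In the sequel we write the shorthand notation $h_{\Lambda,i}(t) \equiv h_i(\sigma_\Lambda(t))$. For the extremely diluted (ED), fully connected (FC) and layered feedforward (LF) architectures the couplings are given by the learning rules
$$J_{ij}^{ED} = \frac{c_{ij}}{CA} \sum_{\mu \in \mathcal{P}} \xi_i^\mu \xi_j^\mu \quad \text{for } i \neq j \qquad (2)$$
$$J_{ij}^{FC} = \frac{1}{NA} \sum_{\mu \in \mathcal{P}} \xi_i^\mu \xi_j^\mu \quad \text{for } i \neq j \qquad (3)$$
$$J_{ij}^{LF}(t) = \frac{1}{NA} \sum_{\mu \in \mathcal{P}} \xi_i^\mu(t+1)\,\xi_j^\mu(t) \qquad (4)$$
with the $\{c_{ij} = 0, 1\}$, $i, j \in \Lambda$, chosen to be i.i.d.r.v. with distribution $\Pr\{c_{ij} = x\} = (1 - C/N)\,\delta_{x,0} + (C/N)\,\delta_{x,1}$ and satisfying $c_{ij} = c_{ji}$, $c_{ii} = 0$ for symmetric dilution, and $c_{ij}$ and $c_{ji}$ statistically independent (with $c_{ii} = 0$) for asymmetric dilution. At zero temperature all neurons are updated in parallel according to the rule
$$\sigma_i(t) \to \sigma_i(t+1) = s_k : \quad \min_{s \in \mathcal{S}} \epsilon_i[s|\sigma_\Lambda(t)] = \epsilon_i[s_k|\sigma_\Lambda(t)] \qquad (5)$$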
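We remark that this rule is the zero temperature limit $T = \beta^{-1} \to 0$ of the stochastic parallel spin-flip dynamics defined by the transition probabilities
$$\Pr\{\sigma_i(t+1) = s_k \in \mathcal{S} \,|\, \sigma_\Lambda(t)\} = \frac{\exp[-\beta\,\epsilon_i(s_k|\sigma_\Lambda(t))]}{\sum_{s \in \mathcal{S}} \exp[-\beta\,\epsilon_i(s|\sigma_\Lambda(t))]} \qquad (6)$$
Here the energy potential $\epsilon_i[s|\sigma_\Lambda]$ is defined by 19
$$\epsilon_i[s|\sigma_\Lambda] = -\frac{1}{2}\,[h_i(\sigma_\Lambda) - b s]\,s \qquad (7)$$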
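where $b > 0$ is the gain parameter of the system. The updating rule (5) is equivalent to using a gain function $g_b(\cdot)$,
$$\sigma_i(t+1) = g_b(h_{\Lambda,i}(t)), \qquad g_b(x) \equiv \sum_{k=1}^{Q} s_k \left[\theta\big(b(s_{k+1} + s_k) - x\big) - \theta\big(b(s_k + s_{k-1}) - x\big)\right] \qquad (8)$$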
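with $s_0 = -\infty$ and $s_{Q+1} = +\infty$. For finite $Q$, this gain function $g_b(\cdot)$ is a step function. The gain parameter $b$ controls the average slope of $g_b(\cdot)$. In order to measure the retrieval quality of the system one can use the Hamming distance between a stored pattern and the microscopic state of the network
$$d(\xi^\mu, \sigma_\Lambda(t)) \equiv \frac{1}{N} \sum_{i \in \Lambda} [\xi_i^\mu - \sigma_i(t)]^2 \qquad (9)$$
This introduces the main overlap and the arithmetic mean of the neuron activities
$$m_\Lambda^\mu(t) = \frac{1}{NA} \sum_{i \in \Lambda} \xi_i^\mu \sigma_i(t), \qquad a_\Lambda(t) = \frac{1}{N} \sum_{i \in \Lambda} [\sigma_i(t)]^2 \qquad (10)$$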
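The gain function (8) and the macroscopic measures (9)-(10) are easy to make concrete. The sketch below is only an illustration under stated assumptions: the states are taken equidistant (one possible choice of $\mathcal{S}$), the function names are ours, and the small pattern and configuration are made up. It also checks that the step function selects the state that minimizes the energy potential (7).

```python
def make_states(Q):
    """Equidistant states -1 = s_1 < ... < s_Q = +1 (an assumed choice of S)."""
    return [-1.0 + 2.0 * k / (Q - 1) for k in range(Q)]

def gain(x, b, states):
    """Step gain g_b(x): the state s_k with b(s_k + s_{k-1}) < x < b(s_{k+1} + s_k)."""
    for k in range(len(states) - 1):
        if x < b * (states[k + 1] + states[k]):
            return states[k]
    return states[-1]

def energy(s, h, b):
    """Energy potential -(1/2) [h - b s] s for a neuron in state s with local field h."""
    return -0.5 * (h - b * s) * s

def main_overlap(pattern, sigma, A):
    """m = (1 / (N A)) * sum_i xi_i sigma_i."""
    return sum(x * s for x, s in zip(pattern, sigma)) / (len(sigma) * A)

def activity(sigma):
    """a = (1 / N) * sum_i sigma_i^2."""
    return sum(s * s for s in sigma) / len(sigma)

def hamming_distance(pattern, sigma):
    """d = (1 / N) * sum_i (xi_i - sigma_i)^2."""
    return sum((x - s) ** 2 for x, s in zip(pattern, sigma)) / len(sigma)

# Hypothetical Q = 3 example with four neurons.
states = make_states(3)
pattern = [1.0, 0.0, -1.0, 1.0]
sigma = [1.0, 0.0, 1.0, 1.0]
A_emp = activity(pattern)          # empirical pattern activity (1/N) sum_i xi_i^2
print(gain(0.2, 0.5, states), hamming_distance(pattern, sigma),
      main_overlap(pattern, sigma, A_emp), activity(sigma))
```

Expanding the square in (9) shows why the overlap and the activity are the natural order parameters: with the empirical pattern activity in place of $A$, the distance equals $A - 2A\,m + a$ exactly.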
We remark that for $Q = 2$ the variance of the patterns is $A = 1$, and the neuron activity is $a_\Lambda(t) = 1$.

3 Solving the dynamics

3.1 Correlations
We first discuss some of the geometric properties of the various architectures which are particularly relevant for the understanding of their long-time dynamic behaviour. For a fully connected architecture there are two main sources of strong correlations between the neurons complicating the dynamical evolution: feedback loops and the common ancestor problem 20. Feedback loops occur when in the course of the time evolution, e.g., the following string of connections is possible: $i \to j \to k \to i$. We remark that architectures with symmetric connections always have these feedback loops. In the absence of these loops the network functions in fact as a layered system, i.e., only feedforward connections are possible. But in this layered architecture common ancestors are still present when, e.g., for the sites $i$ and $j$ there are sites in the foregoing time steps that have a connection with both $i$ and $j$.
In extremely diluted asymmetric architectures these sources of correlations are absent. This class of neural networks was introduced in connection with $Q = 2$ Ising models 9. We recall that the couplings are then given by eq. (2) and that in the limit $N \to \infty$ two important properties of this network are essential 9,21. The first property is the high asymmetry of the connections, viz.
$$\Pr\{c_{ij} = c_{ji} = 1\} = \left(\frac{C}{N}\right)^2, \qquad \Pr\{c_{ij} = 1 \wedge c_{ji} = 0\} = \frac{C}{N}\left(1 - \frac{C}{N}\right). \qquad (11)$$
Therefore, the number of symmetric connections in the infinite configuration $c = \{c_{ij}\}$, $i, j \neq i \in \mathbb{N}$, is finite with probability one, i.e. almost all connections of the graph $G_N(c) = \{(i,j) : c_{ij} = 1,\ i, j \neq i \in \mathbb{N}\}$ are directed: $c_{ij} \neq c_{ji}$. The second property in the limit of extreme dilution is the directed local Cayley-tree structure of the graph $G_N(c)$. By the arguments above the probability $P_k^{(\Lambda)}(c)$ that $k$ connections are directed towards a given site $i \in \Lambda$ is
$$P_k^{(\Lambda)}(c) = \Pr\{k = |T_i^{(in)}|\} = \binom{N-1}{k} \left(\frac{C}{N}\right)^k \left(1 - \frac{C}{N}\right)^{N-1-k} \qquad (12)$$
where $T_i^{(in)} = \{c_{ji} = 1,\ j \in \Lambda \setminus i\}$ is the in-tree for $i$ and $|T_i^{(in)}|$ its cardinality. This probability is equal to $\Pr\{k = |T_i^{(out)}| = |\{c_{ij} = 1,\ j \in \Lambda \setminus i\}|\}$ for connections directed outward from a given site $i \in \Lambda$. In the limit of extreme dilution we get a Poisson distribution:
$$\lim_{N \to \infty} P_k^{(\Lambda)}(c) = \frac{C^k}{k!}\,e^{-C}. \qquad (13)$$
Hence, the mean value of the number of in (out) connections for any site $i \in \Lambda$ is $E[|T_i^{(in)(out)}|] = C$. The probability that two sites $i$ and $i'$ have site $j$ as a common ancestor is obviously equal to $C/N$. From $E[|T_i^{(in)}|] = C$ it follows that after $t$ time steps the cardinality of the cluster of ancestors for site $i$ will be of the order of $C^t$. The same is valid for site $i'$. Therefore, the probability that the sites $i$ and $i'$ have disjoint clusters of ancestors approaches $(1 - C^t/N)^{C^t} \simeq \exp(-C^{2t}/N)$ for $N \gg 1$. So we find that in the limit of extreme dilution: (i) Almost all (i.e. with probability 1) feedback loops in $G_N(c)$ are eliminated. (ii) With probability 1 any finite number of neurons have disjoint clusters of ancestors. So we first dilute the system by taking $N \to \infty$ and then we take the limit $C \to \infty$ in order to get infinite average connectivity allowing to store infinitely many patterns $p$.
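The extreme dilution limit can be illustrated numerically. The sketch below assumes that the in-degree of a site is binomially distributed over the other $N - 1$ sites with connection probability $C/N$, as follows from the i.i.d. choice of the $c_{ij}$; the function names and parameter values are ours. It compares this distribution with its Poisson limit and evaluates the estimate for disjoint ancestor clusters quoted above.

```python
import math

def in_degree_pmf(k, N, C):
    """Probability that exactly k of the other N - 1 sites connect to a given site."""
    n, q = N - 1, C / N
    return math.comb(n, k) * q ** k * (1.0 - q) ** (n - k)

def poisson_pmf(k, C):
    """Poisson probability C^k exp(-C) / k!."""
    return math.exp(-C) * C ** k / math.factorial(k)

def max_deviation(N, C, k_max=20):
    """Largest pointwise gap between the finite-N in-degree law and its Poisson limit."""
    return max(abs(in_degree_pmf(k, N, C) - poisson_pmf(k, C))
               for k in range(k_max + 1))

def disjoint_ancestor_probability(N, C, t):
    """The estimate (1 - C^t / N)^(C^t) for two sites having disjoint ancestor clusters."""
    size = C ** t
    return (1.0 - size / N) ** size

# Illustrative values: mean connectivity C = 5, increasing network size N.
for N in (10 ** 3, 10 ** 4, 10 ** 6):
    print(N, max_deviation(N, 5.0))
print(disjoint_ancestor_probability(10 ** 8, 5.0, 3), math.exp(-5.0 ** 6 / 10 ** 8))
```

The deviation shrinks as $N$ grows at fixed $C$, which is the order of limits used in the text: first $N \to \infty$, then $C \to \infty$.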
This implies that for this asymmetrically diluted model at any given time step $t$ all spins are uncorrelated and, hence, the first step dynamics describes the full time evolution of the network. For the symmetrically diluted model the architecture is still a local Cayley-tree but no longer directed, and in the limit $N \to \infty$ the number of connections $T_i = \{j \in \Lambda \,|\, c_{ij} = 1\}$ giving information to the site $i \in \Lambda$ is still Poisson distributed with mean $C = E[|T_i|]$. However, at time $t$ the spins are no longer uncorrelated, causing a feedback from $t \geq 2$ onwards 17. In order to solve the dynamics we start with a discussion of the first time step dynamics, the form of which is independent of the architecture.

3.2 First time step
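Consider a fully connected network. Suppose that the initial configuration of the network $\{\sigma_i(0)\}$, $i \in \Lambda$, is a collection of i.i.d.r.v. with mean $E[\sigma_i(0)] = 0$, variance $\mathrm{Var}[\sigma_i(0)] = a_0$, and correlated with only one stored pattern, say the first one $\{\xi_i^1\}$:
$$E[\xi_i^\mu \sigma_j(0)] = \delta_{i,j}\,\delta_{\mu,1}\,m_0^1 A, \qquad m_0^1 > 0. \qquad (14)$$
This pattern is said to be condensed. By the law of large numbers (LLN) one gets for the main overlap and the activity at $t = 0$
$$m^1(0) \equiv \lim_{N \to \infty} m_\Lambda^1(0) \overset{\Pr}{=} \frac{1}{A}\,E[\xi_i^1 \sigma_i(0)] = m_0^1 \qquad (15)$$
$$a(0) \equiv \lim_{N \to \infty} a_\Lambda(0) \overset{\Pr}{=} E[\sigma_i^2(0)] = a_0 \qquad (16)$$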
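where the convergence is in probability 22. In order to obtain the configuration at $t = 1$ we first have to calculate the local field (1) at $t = 0$. To do this we employ the signal-to-noise ratio analysis (see, e.g., 10,12). Recalling the learning rule (3) we separate the part containing the condensed pattern, i.e., the signal, from the rest, i.e., the noise to arrive at
$$h_i(\sigma_\Lambda(0)) = \xi_i^1\,\frac{1}{NA} \sum_{j \in \Lambda \setminus i} \xi_j^1 \sigma_j(0) + \sqrt{\alpha}\,\frac{1}{\sqrt{p}} \sum_{\mu \in \mathcal{P} \setminus \{1\}} \xi_i^\mu\,\frac{1}{A\sqrt{N}} \sum_{j \in \Lambda \setminus i} \xi_j^\mu \sigma_j(0) \qquad (17)$$
where $\alpha = p/N$. The properties of the initial configurations (14)-(16) assure us that the summation in the first term on the r.h.s. of (17) converges in the limit $N \to \infty$ to
$$\lim_{N \to \infty} \frac{1}{NA} \sum_{j \in \Lambda \setminus i} \xi_j^1 \sigma_j(0) \overset{\Pr}{=} m^1(0). \qquad (18)$$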
The first term £?m1 (0) is independent of the second term on the r.h.s of (17). This second term contains the influence of the non-condensed patterns causing the intrinsic noise in the dynamics of the main overlap. In view of this we define the residual overlap r " ( 0 = lim r£(<) = lim - ± = £ # o , - ( « )
/i£P\{l}.
(19)
jGA
Applying the CLT to this second term in (17) we find
„"5.,/f £ ««*«> = J S . ^ p E ^
£ ^ (0)(20)
= >/a-V(0,XZJ(0))
(21)
where the quantity $\mathcal{N}(0, V)$ represents a Gaussian random variable with mean 0 and variance $V$ and where $D(0) = \mathrm{Var}[r^\mu(0)] = a(0)/A$. Thus we see that in fact the variance of this residual overlap, i.e., $D(t)$, is the relevant quantity characterising the intrinsic noise. In conclusion, in the limit $N \to \infty$ the local field is the sum of two independent random variables, i.e.
$$h_i(0) \equiv \lim_{N \to \infty} h_{\Lambda,i}(0) \overset{\mathcal{D}}{=} \xi_i^1 m^1(0) + \sqrt{\alpha}\;\mathcal{N}(0, a(0)). \qquad (22)$$
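The signal-plus-Gaussian-noise structure (22) can be checked by direct simulation. The sketch below is a finite-size illustration, not the authors' procedure: it assumes a fully connected $Q = 2$ network (so $A = 1$ and $a(0) = 1$), binary patterns drawn with equal probability, and an initial state that copies the first pattern with probability $(1 + m_0)/2$. The sizes, seed and function name are arbitrary choices of ours.

```python
import random

def first_step_noise(N, p, m0, seed=1):
    """Return the empirical overlap m(0) and the empirical variance of the noise part
    of the local field at t = 0 for a fully connected Q = 2 network."""
    rng = random.Random(seed)
    xi = [[1 if rng.random() < 0.5 else -1 for _ in range(N)] for _ in range(p)]
    # sigma_j(0) equals xi^1_j with probability (1 + m0) / 2, otherwise -xi^1_j
    sigma = [x if rng.random() < 0.5 * (1.0 + m0) else -x for x in xi[0]]
    # Unnormalised overlaps S^mu = sum_j xi^mu_j sigma_j(0)
    S = [sum(x * s for x, s in zip(pattern, sigma)) for pattern in xi]
    m_emp = S[0] / N
    noise = []
    for i in range(N):
        # Non-condensed part of the local field, self-coupling (j = i) excluded
        n_i = sum(xi[mu][i] * (S[mu] - xi[mu][i] * sigma[i]) for mu in range(1, p)) / N
        noise.append(n_i)
    mean = sum(noise) / N
    variance = sum((x - mean) ** 2 for x in noise) / N
    return m_emp, variance

N, p, m0 = 1000, 300, 0.5
alpha = p / N
m_emp, noise_var = first_step_noise(N, p, m0)
# Eq. (22) predicts a noise variance alpha * a(0) = alpha for Q = 2.
print(m_emp, noise_var, alpha)
```

At these sizes the measured variance fluctuates by several percent around $\alpha$ from seed to seed, so the comparison is only indicative of the $N \to \infty$ statement.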
For a more rigorous discussion of the first time step for the underlying spin-glass model we refer to 23. At this point we note that the structure (22) of the distribution of the local field at time zero, signal plus Gaussian noise, is typical for all architectures discussed here because the correlations caused by the dynamics only appear for $t \geq 1$. Some technical details are different for the various architectures. The first change in details that has to be made is an adaptation of the sum over the sites $j$ to $\Lambda$ for the layered feedforward architecture and to $T_i$, the part of the tree connected to neuron $i$, in the diluted architectures. The second change is that for the diluted architectures an additional limit $C \to \infty$ has to be taken besides the $N \to \infty$ limit. So in the thermodynamic limit $C, N \to \infty$ all averages will have to be taken over the treelike structure, viz. $\frac{1}{N} \sum_{i \in \Lambda} \to \frac{1}{C} \sum_{i \in T_j}$. Furthermore $\alpha = p/N$ has to be replaced by $\alpha = p/C$.

3.3 Recursive dynamical scheme
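The key question is then how these quantities evolve in time under the parallel dynamics specified before. For a general time step we find from the LLN in the limit $C, N \to \infty$ for the main overlap and the activity (10)
$$m^1(t+1) \overset{\Pr}{=} \frac{1}{A}\,\langle\langle \xi_i^1 \langle \sigma_i(t+1) \rangle_\beta \rangle\rangle, \qquad a(t+1) \overset{\Pr}{=} \langle\langle \langle \sigma_i^2(t+1) \rangle_\beta \rangle\rangle \qquad (23)$$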
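with the thermal average defined as
$$\langle f(\sigma_i(t+1)) \rangle_\beta = \frac{\sum_{\sigma \in \mathcal{S}} f(\sigma)\,\exp[\frac{1}{2}\beta\sigma(h_i(t) - b\sigma)]}{\sum_{\sigma \in \mathcal{S}} \exp[\frac{1}{2}\beta\sigma(h_i(t) - b\sigma)]} \qquad (24)$$
where $h_i(t) = \lim_{N \to \infty} h_{\Lambda,i}(t)$. In the above $\langle\langle \cdot \rangle\rangle$ denotes the average both over the distribution of the embedded patterns $\{\xi_i^\mu\}$ and the initial configurations $\{\sigma_i(0)\}$. The average over the latter is hidden in an average over the local field through the updating rule (8). In the sequel we focus on zero temperature. Then, using eq. (8), these formulae reduce to
$$m^1(t+1) \overset{\Pr}{=} \frac{1}{A}\,\langle\langle \xi_i^1\, g_b(h_i(t)) \rangle\rangle, \qquad a(t+1) \overset{\Pr}{=} \langle\langle g_b^2(h_i(t)) \rangle\rangle . \qquad (25)$$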
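For the first time step the averages in (25) can be evaluated directly, because the local field at $t = 0$ has the signal-plus-Gaussian form (22). The following sketch does this by simple numerical quadrature. It is an illustration under assumptions of ours: equidistant states, a pattern distribution passed in as a dictionary, a trapezoidal rule for the Gaussian average, and made-up parameter values.

```python
import math

def gain(x, b, states):
    """Step gain g_b(x): the state s_k with b(s_k + s_{k-1}) < x < b(s_{k+1} + s_k)."""
    for k in range(len(states) - 1):
        if x < b * (states[k + 1] + states[k]):
            return states[k]
    return states[-1]

def gauss_average(f, n=4001, z_max=8.0):
    """Trapezoidal approximation of the average of f(z) over a standard normal z."""
    step = 2.0 * z_max / (n - 1)
    total = 0.0
    for k in range(n):
        z = -z_max + k * step
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * f(z) * math.exp(-0.5 * z * z)
    return total * step / math.sqrt(2.0 * math.pi)

def first_step(m0, a0, alpha, b, states, pattern_probs):
    """m(1) and a(1) from (25), with h_i(0) = xi * m0 + sqrt(alpha * a0) * z as in (22)."""
    A = sum(prob * xi * xi for xi, prob in pattern_probs.items())
    width = math.sqrt(alpha * a0)
    m1 = sum(prob * xi * gauss_average(lambda z, xi=xi: gain(xi * m0 + width * z, b, states))
             for xi, prob in pattern_probs.items()) / A
    a1 = sum(prob * gauss_average(lambda z, xi=xi: gain(xi * m0 + width * z, b, states) ** 2)
             for xi, prob in pattern_probs.items())
    return m1, a1

# Q = 2 check: the result should reduce to m(1) = erf(m0 / sqrt(2 * alpha * a0)), a(1) = 1.
m1_q2, a1_q2 = first_step(0.5, 1.0, 0.3, 0.5, [-1.0, 1.0], {-1.0: 0.5, 1.0: 0.5})

# Q = 3 with uniformly distributed patterns (A = 2/3); values are illustrative only.
uniform3 = {-1.0: 1.0 / 3.0, 0.0: 1.0 / 3.0, 1.0: 1.0 / 3.0}
m1_q3, a1_q3 = first_step(0.5, 2.0 / 3.0, 0.05, 0.5, [-1.0, 0.0, 1.0], uniform3)
print(m1_q2, a1_q2, m1_q3, a1_q3)
```

Because the integrand is a step function, the trapezoidal rule is only accurate to roughly the grid spacing; a finer grid or an error-function evaluation of each step would be the natural refinement.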
As seen already in the first time step, we have to study carefully the influence of the non-condensed patterns causing the intrinsic noise in the dynamics of the main overlap. The method used to obtain these order parameters is then to calculate the distribution of the local field as a function of time. In order to determine the structure of the local field we have to concentrate on the evolution of the residual overlap. The details of this calculation are very technical and depend on the precise correlations in the system and hence on the architecture of the network as discussed before. For these technical details we refer to the relevant literature 1,2,12,14,15. Here we give an extensive discussion of the results obtained. In general, the distribution of the local field at time $t+1$ is given by
$$h_i(t+1) = \xi_i^1 m^1(t+1) + \mathcal{N}(0, \alpha a(t+1)) + \chi(t)\left[F\,(h_i(t) - \xi_i^1 m^1(t)) + B\,\alpha\,\sigma_i(t)\right] \qquad (26)$$
where $F$ and $B$ are binary coefficients given below, which depend on the specific architecture. From this it is clear that the local field at time $t$ consists of a discrete part and a normally distributed part, viz.
$$h_i(t) = M_i(t) + \mathcal{N}(0, V(t)) \qquad (27)$$
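where $M_i(t)$ satisfies the recursion relation
$$M_i(t+1) = \chi(t)\left[F\,(M_i(t) - \xi_i^1 m^1(t)) + B\,\alpha\,\sigma_i(t)\right] + \xi_i^1 m^1(t+1) \qquad (28)$$
and where $V(t) = \alpha A D(t)$ with $D(t)$ itself given by the recursion relation
$$D(t+1) = \frac{a(t+1)}{A} + L\,\chi^2(t)\,D(t) + 2 B_2\,\chi(t)\,\mathrm{Cov}[\tilde r^\mu(t), r^\mu(t)] \qquad (29)$$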
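where $L$ and $B_2$ are again coefficients specified below. The quantity $\chi(t)$ reads
$$\chi(t) = \sum_{k=1}^{Q-1} f_{\hat h_i^\mu(t)}\big(b(s_{k+1} + s_k)\big)\,(s_{k+1} - s_k) \qquad (30)$$
where $f_{\hat h_i^\mu(t)}$ is the probability density of $\hat h_i^\mu(t) = \lim_{N \to \infty} \hat h_{\Lambda,i}^\mu(t)$ with
$$\hat h_{\Lambda,i}^\mu(t) = h_{\Lambda,i}(t) - \frac{1}{\sqrt{N}}\,\xi_i^\mu\, r_\Lambda^\mu(t). \qquad (31)$$
Furthermore, $\tilde r^\mu(t)$ is defined as
$$\tilde r^\mu(t) \equiv \lim_{N \to \infty} \frac{1}{A\sqrt{N}} \sum_{i \in \Lambda} \xi_i^\mu\, g_b(\hat h_{\Lambda,i}^\mu(t)). \qquad (32)$$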
Finally, as can be read off from eq. (28) the quantity $M_i(t)$ consists of the signal term and a discrete noise term, viz.
$$M_i(t) = \xi_i^1 m^1(t) + B_1\,\alpha\,\chi(t-1)\,\sigma_i(t-1) + B_2 \sum_{t'=0}^{t-2} \alpha \left[\prod_{s=t'}^{t-1} \chi(s)\right] \sigma_i(t'). \qquad (33)$$
Since different architectures contain different correlations not all terms in these final equations are present. In particular we have for the coefficients $F$, $B$, $L$, $B_1$ and $B_2$ introduced above

         F    B    L    B_1  B_2
FC       1    1    1    1    1
SED      0    1    0    1    0
LF       1    0    1    0    0
AED      0    0    0    0    0        (34)
with $B$ indicating the feedback caused by the symmetry in the architectures and $L$ the common ancestors contribution. At this point we remark that in the so-called theory of statistical neurodynamics 24,25 one starts from an approximate local field by leaving out any discrete noise (the term in $\sigma_i(t)$). As a consequence the covariance in the recursion relation for $D(t)$ can be written down more explicitly since only Gaussian noise is involved. For more details we refer to 26. We still have to determine the probability density $f_{\hat h_i^\mu(t)}$ in eq. (30), which in the thermodynamic limit equals the probability density $f_{h_i(t)}$. This can be done by looking at the form of $M_i(t)$ given by eq. (33). The evolution equation tells us that $\sigma_i(t')$ can be replaced by $g_b(h_i(t'-1))$ such that the second and third terms of $M_i(t)$ are the sums of step functions of
correlated variables. These are also correlated through the dynamics with the normally distributed part of $h_i(t)$. Therefore the local field can be considered as a transformation of a set of correlated normally distributed variables $x_s$, $s = 0, \ldots, t-2, t$, which we choose to normalize. Defining the correlation matrix $W = (\rho(s, s') \equiv E[x_s x_{s'}])$ we arrive at the following expression for the probability density of the local field at time $t$
$$f_{h_i(t)}(y) = \int \prod_{s=0}^{t-2} dx_s\, dx_t\; \delta\big(y - M_i(t) - \sqrt{\alpha A D(t)}\,x_t\big)\, \frac{1}{\sqrt{\det(2\pi W)}} \exp\left(-\frac{1}{2}\,\mathbf{x} W^{-1} \mathbf{x}^T\right) \qquad (35)$$
with $\mathbf{x} = (x_0, \ldots, x_{t-2}, x_t)$. For the symmetrically diluted case this expression simplifies to
$$f_{h_i(t)}(y) = \int \prod_{s=0}^{[t/2]} dx_{t-2s}\; \delta\big(y - \xi_i^1 m^1(t) - \alpha\,\chi(t-1)\,\sigma_i(t-1) - \sqrt{\alpha a(t)}\,x_t\big)\, \frac{1}{\sqrt{\det(2\pi W)}} \exp\left(-\frac{1}{2}\,\mathbf{x} W^{-1} \mathbf{x}^T\right) \qquad (36)$$
with $\mathbf{x} = (\{x_s\}) = (x_{t-2[t/2]}, \ldots, x_{t-2}, x_t)$. The brackets $[t/2]$ denote the integer part of $t/2$. So the local field at time $t$ consists of a signal term, a discrete noise part and a normally distributed noise part. Furthermore, the discrete noise and the normally distributed noise are correlated and this prevents us from deriving a closed expression for the overlap and activity. Together with the eqs. (25) for $m^1(t+1)$ and $a(t+1)$ the results above form a recursive scheme in order to obtain the order parameters of the system. The practical difficulty which remains is the explicit calculation of the correlations in the network at different time steps as present in eq. (29). For AED and LF architectures this scheme leads to an explicit form for the recursion relations for the order parameters
$$m^1(t+1) = \frac{1}{A} \left\langle\left\langle \xi^1(t+1) \int \mathcal{D}z\; g_b\big(\xi^1(t+1)\,m^1(t) + \sqrt{\alpha A D(t)}\,z\big) \right\rangle\right\rangle \qquad (37)$$
$$a(t+1) = \left\langle\left\langle \int \mathcal{D}z\; g_b^2\big(\xi^1(t+1)\,m^1(t) + \sqrt{\alpha A D(t)}\,z\big) \right\rangle\right\rangle \qquad (38)$$
$$D(t+1) = \frac{a(t+1)}{A} + \frac{L}{\alpha A} \left[ \left\langle\left\langle \int \mathcal{D}z\; z\, g_b\big(\xi^1(t+1)\,m^1(t) + \sqrt{\alpha A D(t)}\,z\big) \right\rangle\right\rangle \right]^2 \qquad (39)$$
with $\mathcal{D}z = dz\,(2\pi)^{-1/2} \exp(-z^2/2)$. For the AED architecture ($L = 0$) the second term on the r.h.s. of (39) coming from the correlations caused by the common ancestors is absent. For the LF architecture we remark that this explicit solution requires an independent choice of the representations of the patterns at different layers. At finite temperatures analogous recursion relations for the AED and LF networks can be derived 12,14 by introducing auxiliary thermal fields 27 in order to express the stochastic dynamics within the gain function formulation of the deterministic dynamics. Furthermore, damage spreading 9,28,29, i.e., the evolution of two network configurations which are initially close in Hamming distance, can be studied 12,14. Finally, a complete self-control mechanism can be built into the dynamics of these systems by introducing a time-dependent threshold in the gain function improving, e.g., the basins of attraction of the memorized patterns 30-32. For explicit examples of this dynamical scheme with numerical results we refer to 1,15. By using the recursion relations the first few time steps are written out explicitly and studied numerically, e.g., for the $Q = 2$ and $Q = 3$ FC and SED models with equidistant states and a uniform distribution of patterns. These results are compared with the approximations studied in the literature 10,11,16,17,24,25,33-38 obtained by neglecting some feedback correlations for $t \geq 2$. In the whole retrieval region of these networks we find that the first four or five time steps already give a clear picture of their time evolution.

4 Fixed-point equations
Equilibrium results for the AED and LF Q-Ising models are obtained immediately by leaving out the time dependence in (37)-(39) (see 12,14), since the evolution equations for the local field and the order parameters do not change their form as time progresses. This still allows small fluctuations in the configurations $\{\sigma_i\}$. The difference between the fixed-point equations for these two architectures is that for the AED model the variance of the residual noise, $D(t)$, is simply proportional to the activity of the neurons at time $t$ while for the LF model a recursion is needed. For the SED and FC architectures, however, the evolution equations for the order parameters do change their form by the explicit appearance of the $\{\sigma_i(t')\}$, $t' = 1, \ldots, t$ term. Hence we cannot use the simple procedure above to obtain the fixed-point equations. Instead we derive the equilibrium results of our dynamical scheme by requiring through the recursion relations (26) that the distribution of the local field becomes time-independent. This is an approximation because fluctuations in the network configuration are no longer allowed. In fact, it means that out of the discrete part of this distribution,
i.e., $M_i(t)$ (recall (33)), only the $\sigma_i(t-1)$ term is kept besides, of course, the signal term. This procedure implies that the main overlap and activity in the fixed point are found from the definitions (10) and not from leaving out the time dependence in the recursion relation (25). We start by eliminating the time dependence in the evolution equations for the local field (26). This leads to
$$h_i = \xi_i^1 m^1 + [\chi^{ar}]^{-1}\,\mathcal{N}(0, \alpha a) + [\chi^{ar}]^{-1}\,\alpha\,\chi\,\sigma_i \qquad (40)$$
with $\chi^{ar} = 1 - F\chi$ being 1 for the SED and $1 - \chi$ for the FC model and $h_i = \lim_{t \to \infty} h_i(t)$. This expression consists of two parts: a normally distributed part $\tilde h_i = \mathcal{N}(\xi_i^1 m^1, \alpha a/[\chi^{ar}]^2)$ and some discrete noise part. At this point some remarks are in order. First, the discrete noise coming from the correlations of the $\{\sigma_i(t)\}$ at different time steps (here only the preceding time step is considered) is inherent in the SED and FC dynamics. Second, the so-called self-consistent signal-to-noise ratio analysis of the FC network considered in the literature 39,40 starts from such a type of equation by assuming the presence of a term proportional to the output in the local field without any reference or argumentation based upon the underlying dynamics of the network. Employing this expression in the updating rule (8) one finds
$$\sigma_i = g_b\big(\tilde h_i + [\chi^{ar}]^{-1}\,\alpha\,\chi\,\sigma_i\big) \qquad (41)$$
where $\tilde h_i = \mathcal{N}(\xi_i^1 m^1, \alpha a/[\chi^{ar}]^2)$ is the normally distributed part of eq. (40). This is a self-consistent equation in $\sigma_i$ which in general admits more than one solution. These types of equation have been solved in the literature in the context of thermodynamics using a geometric Maxwell construction 39,40. We remark that for analog networks the geometric Maxwell construction is not necessary: the fixed-point equation (41) has only one solution. For more technical details we refer to 26. This approach leads to a unique solution
$$\sigma_i = g_{\tilde b}(\tilde h_i), \qquad \tilde b = b - [2\chi^{ar}]^{-1}\,\alpha\,\chi . \qquad (42)$$
We remark that plugging this result into the local field (40) tells us that the latter is the sum of two Gaussians with shifted mean (see also 37). Using the definition of the main overlap and activity (10) in the limit $N \to \infty$ for the FC model and the limit $C, N \to \infty$ for the SED model, one finds in the fixed point
$$m^1 = \frac{1}{A} \left\langle\left\langle \xi^1 \int \mathcal{D}z\; g_{\tilde b}\big(\xi^1 m^1 + \sqrt{\alpha A D}\,z\big) \right\rangle\right\rangle \qquad (43)$$
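$$a = \left\langle\left\langle \int \mathcal{D}z\; g_{\tilde b}^2\big(\xi^1 m^1 + \sqrt{\alpha A D}\,z\big) \right\rangle\right\rangle . \qquad (44)$$
From (29) and (30) one furthermore sees that
$$D = [\chi^{ar}]^{-2}\,a/A \qquad (45)$$
with
$$\chi = \frac{1}{\sqrt{\alpha A D}} \left\langle\left\langle \int \mathcal{D}z\; z\, g_{\tilde b}\big(\xi^1 m^1 + \sqrt{\alpha A D}\,z\big) \right\rangle\right\rangle . \qquad (46)$$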
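To see how the two routes to equilibrium differ, it helps to specialise to $Q = 2$, where $A = a = 1$ and the gain function is a sign function, so the shifted gain parameter plays no role. The sketch below is an illustration under these assumptions, not a general solver; the function names and parameter values are ours. For the AED model, dropping the time dependence amounts to iterating $m \to \mathrm{erf}(m/\sqrt{2\alpha})$, since $D = 1$. For the FC model, the Gaussian integrals in (43) and (46) can be done in closed form and the equations (43)-(46) are iterated together.

```python
import math

def aed_step(m, alpha):
    """AED, Q = 2: with A = a = 1 and L = 0 one has D = 1 and m -> erf(m / sqrt(2 alpha))."""
    return math.erf(m / math.sqrt(2.0 * alpha))

def aed_fixed_point(alpha, m=1.0, iters=200):
    for _ in range(iters):
        m = aed_step(m, alpha)
    return m

def fc_fixed_point(alpha, m=1.0, D=1.0, iters=500):
    """FC, Q = 2: plain iteration of the fixed-point equations with a sign gain function.
    No damping is used, so convergence is only expected well inside the retrieval region."""
    chi = 0.0
    for _ in range(iters):
        v = alpha * D                                   # variance alpha * A * D, A = 1
        m = math.erf(m / math.sqrt(2.0 * v))            # eq. (43) for g = sign
        chi = math.sqrt(2.0 / (math.pi * v)) * math.exp(-m * m / (2.0 * v))  # eq. (46)
        D = 1.0 / (1.0 - chi) ** 2                      # eq. (45), chi^ar = 1 - chi
    return m, D, chi

m_aed_low = aed_fixed_point(0.3)     # retrieval: nonzero overlap
m_aed_high = aed_fixed_point(1.0)    # no retrieval: overlap decays to zero
m_fc, D_fc, chi_fc = fc_fixed_point(0.1)
print(m_aed_low, m_aed_high, m_fc, D_fc, chi_fc)
```

In this $Q = 2$ case the AED map keeps a nonzero fixed point only while its slope at the origin, $\sqrt{2/(\pi\alpha)}$, exceeds one, i.e. for $\alpha < 2/\pi$. In the FC equations the feedback enters through $\chi^{ar} = 1 - \chi$, which enlarges $D$ above its AED value of one.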
These resulting equations (43)-(45) are the same as the fixed-point equations derived from a replica-symmetric mean-field theory treatment in 41-44. Their solution leads to capacity-gain parameter phase diagrams (see, e.g., 44).

5 Concluding remarks
An evolution equation is derived for the distribution of the local field governing the parallel dynamics at zero temperature of extremely diluted symmetric and asymmetric, layered feedforward and fully connected $Q \geq 2$ Ising networks. All feedback correlations are taken into account. In general, the local field is not normally distributed but contains a discrete noise part. Employing this evolution equation, a general recursive scheme is developed allowing one to calculate the relevant order parameters of the system, i.e., the main overlap, the activity and the variance of the residual noise, for any time step. For the extremely diluted asymmetric and the layered feedforward architectures this scheme immediately leads to explicit recursion relations for the order parameters because the discrete noise part in the local field is absent. For the extremely diluted symmetric and the fully connected architectures, equilibrium fixed-point equations for the order parameters are obtained under the condition that the local field becomes time-independent, meaning that some of the discrete noise is neglected. The resulting equations are the same as those derived from a replica-symmetric mean-field theory approach.
Acknowledgments
This work has been supported in part by the Research Fund of the K.U.Leuven (Grant OT/94/9) and the Korea Science and Engineering Foundation through the SRC program. The authors are indebted to S. Amari, R. Kühn, G. Massolo, A. Patrick and V. Zagrebnov for constructive discussions. One of us (D.B.) thanks the Belgian National Fund for Scientific Research for financial support.
References
1. D. Bollé, G. Jongen and G.M. Shim, J. Stat. Phys. 91, 125 (1998).
2. D. Bollé, B. Vinck and V.A. Zagrebnov, J. Stat. Phys. 70, 1099 (1993).
3. D. Sherrington and S. Kirkpatrick, Phys. Rev. Lett. 35, 1792 (1972).
4. S.K. Ghatak and D. Sherrington, J. Phys. C 10, 3149 (1977).
5. J.J. Hopfield, Proc. Nat. Acad. Sci. USA 79, 2554 (1982).
6. J.J. Hopfield, Proc. Nat. Acad. Sci. USA 81, 3088 (1984).
7. R. Kühn and S. Bös, J. Phys. A: Math. Gen. 26, 831 (1993).
8. J. Hertz, A. Krogh and R.G. Palmer, Introduction to the theory of neural computation, Addison-Wesley, Redwood City (1991).
9. B. Derrida, E. Gardner and A. Zippelius, Europhys. Lett. 4, 167 (1987).
10. A.E. Patrick and V.A. Zagrebnov, J. Stat. Phys. 63, 59 (1991).
11. A.E. Patrick and V.A. Zagrebnov, J. Phys. A: Math. Gen. 24, 3413 (1991).
12. D. Bollé, G.M. Shim, B. Vinck and V.A. Zagrebnov, J. Stat. Phys. 74, 565 (1994).
13. E. Domany, W. Kinzel and R. Meir, J. Phys. A: Math. Gen. 22, 2081 (1989).
14. D. Bollé, G.M. Shim and B. Vinck, J. Stat. Phys. 74, 583 (1994).
15. D. Bollé, G. Jongen and G.M. Shim, Parallel dynamics of extremely diluted symmetric Q-Ising neural networks, to appear in J. Stat. Phys. 96, August 1999.
16. T.L.H. Watkin and D. Sherrington, J. Phys. A: Math. Gen. 24, 5427 (1991).
17. A.E. Patrick and V.A. Zagrebnov, J. Phys. A: Math. Gen. 23, L1323 (1990); J. Phys. A: Math. Gen. 25, 1009 (1992).
18. A. Bovier and V. Gayrard, in Mathematical aspects of spin-glasses and neural networks, eds. A. Bovier and P. Picco, Progress in Probability, Vol. 41 (1997).
19. H. Rieger, J. Phys. A: Math. Gen. 23, L1273 (1990).
20. E. Barkai, I. Kanter and H. Sompolinsky, Phys. Rev. A 41, 590 (1990).
21. R. Kree and A. Zippelius, in Models of neural networks, eds. E. Domany, J.L. van Hemmen and K. Schulten, Springer, Berlin (1991).
22. A.N. Shiryayev, Probability, Springer, New York (1984).
23. A. Patrick, J. Stat. Phys. 84, 973 (1996).
24. S. Amari and K. Maginu, Neural Networks 1, 63 (1988).
25. M. Okada, Neural Networks 9, 1429 (1996).
26. G. Jongen, On the dynamics of spin-glass models of neural networks, Ph.D. thesis, K.U.Leuven, Belgium (1999).
27. A.E. Patrick and V.A. Zagrebnov, J. Phys. (Paris) 51, 1129 (1990).
28. B. Derrida, J. Phys. A: Math. Gen. 20, L271 (1987).
29. B. Derrida and R. Meir, Phys. Rev. A 38, 3116 (1988).
30. D.R.C. Dominguez and D. Bollé, Phys. Rev. Lett. 80, 2961 (1998).
31. K. Kitano and T. Aoyagi, J. Phys. A: Math. Gen. 31, L613 (1998).
32. S. Grosskinsky, J. Phys. A: Math. Gen. 32, 2983 (1999).
33. W. Kinzel, Z. Phys. B 60, 205 (1985).
34. E. Gardner, B. Derrida and P. Mottishaw, J. Physique 48, 741 (1987).
35. W. Krauth, J.P. Nadal and M. Mezard, J. Phys. A: Math. Gen. 21, 2995 (1988).
36. H. Horner, D. Bormann, M. Frick, H. Kinzelbach and A. Schmidt, Z. Phys. B 76, 381 (1989).
37. R.D. Henkel and M. Opper, Europhys. Lett. 11, 403 (1990); J. Phys. A: Math. Gen. 24, 2201 (1991).
38. D. Gandolfo, M. Sirugue-Collin and V.A. Zagrebnov, Network: Computation in Neural Systems 9, 563 (1998).
39. M. Shiino and T. Fukai, J. Phys. A: Math. Gen. 25, L375 (1992).
40. M. Shiino and T. Fukai, Phys. Rev. E 48, 867 (1993).
41. T.L.H. Watkin and D. Sherrington, Europhys. Lett. 14, 791 (1991).
42. D. Bollé, D. Carlucci and G.M. Shim, Thermodynamic properties of extremely diluted symmetric Q-Ising neural networks, in preparation.
43. D. Amit, H. Gutfreund and H. Sompolinsky, Ann. Phys. (N.Y.) 173, 30 (1987).
44. D. Bollé, H. Rieger and G.M. Shim, J. Phys. A: Math. Gen. 27, 3411 (1994).
THE RELATIVISTIC AHARONOV-BOHM-COULOMB PROBLEM: A PATH INTEGRAL SOLUTION
J. BORNALES
Physics Department, MSU-Iligan Institute of Technology, Iligan City 9200, Philippines
C. C. BERNIDO
Research Center for Theoretical Physics, Central Visayan Institute, Jagna, Bohol 6308, Philippines
M. V. CARPIO-BERNIDO
Research Center for Theoretical Physics, Central Visayan Institute, Jagna, Bohol 6308, Philippines
The Dirac equation in (2+1)-dimensional spacetime for a particle interacting with a combined Aharonov-Bohm field and Coulomb potential is evaluated by the path integral method. To do this, a modified Biedenharn transformation is used to reduce the path integral to a form similar to the non-relativistic Coulomb problem.
1
Introduction
It is with gratitude and pleasure that we take this opportunity to contribute this work to the 60th birthday celebration in honor of Professor Dr. Ludwig Streit. Professor Streit has been scientific host to two of the authors (C. C. B. and M. V. C. B.) at the Research Center BiBoS in Bielefeld, Germany, during their terms as Alexander von Humboldt Research Fellows. Indeed, the research stays at BiBoS provided special periods of personal and scientific growth, a unique opportunity made memorable by a kind and supportive host. The subsequent visits of Professor Streit to Jagna, Bohol in the Philippines have also enlarged his circle of friends to include Philippine students, such as the first author (J. B.), who now appreciates his generosity. Together we wish Professor Streit all the best. The bonds that strongly couple Professor Streit's group at BiBoS and Bohol are primarily based on our investigation of various mathematical and physical aspects of Feynman path integrals. We thus briefly present here the highlights of a path integral solution to the relativistic Aharonov-Bohm-plus-Coulomb problem. The details are contained in reference [1].
2
Green Function for a Relativistic Aharonov-Bohm-Coulomb System
The original Aharonov-Bohm system [2] is described by the magnetic vector potential, $\mathbf{A} = (\Phi/2\pi\rho)\,\hat{\varphi}$, $\rho^2 = x^2 + y^2$, for a magnetic flux $\Phi$ trapped in a long impenetrable solenoid of radius $R$ at the origin, and taken to lie along the $z$-axis. To this system we add a Coulomb interaction potential, $V = -k/\rho$. The relativistic case for this problem continues to be investigated as shown in recent works [3-6]. Here we show a derivation, within the path integral framework, of the Green function for a spin 1/2 particle interacting with the combined Aharonov-Bohm-Coulomb potential. For the relativistic problem, we consider the Dirac equation ($\hbar = c = 1$),
$$(\Pi - M)\,G(\vec{\rho}\,'', \vec{\rho}\,') = \delta(\vec{\rho}\,'' - \vec{\rho}\,') \tag{1}$$
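for the Green function, $G(\vec{\rho}\,'', \vec{\rho}\,') = \langle \vec{\rho}\,''|G|\vec{\rho}\,'\rangle$, for a particle of mass $\mu$. In Eq. (1), $\vec{\rho} = (\rho,$
$+\ \beta[(k/\rho) + E]$, (2)
with $\alpha_\rho = \vec{\alpha}\cdot\vec{\rho}/\rho$, and
$$K_{ABC} = \beta\left[\sigma_z\left(L_z - \frac{a}{2}\right) + \frac{1}{2}\right] \tag{3}$$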
in terms of $L_z = x\,p_y - y\,p_x$ and $a = e\Phi/\pi$. At this point, we make use of an approach used by Biedenharn [7] when he solved the 3-dimensional relativistic Kepler problem by casting it into a nonrelativistic Schroedinger form via a suitable similarity transformation. To this end, Eq. (1) is iterated, giving
$$(\Pi^2 - M^2)\,g(\vec{\rho}\,'', \vec{\rho}\,') = \delta(\vec{\rho}\,'' - \vec{\rho}\,') \tag{4}$$
for a Green function, $g(\vec{\rho}\,'', \vec{\rho}\,') = \langle \vec{\rho}\,''|g|\vec{\rho}\,'\rangle$, satisfying
$$G(\vec{\rho}\,'', \vec{\rho}\,') = (\Pi + M)\,g(\vec{\rho}\,'', \vec{\rho}\,'). \tag{5}$$
Following references [8-9], we can write the operator $g$ as
$$g = (\Pi^2 - M^2)^{-1} = (i/2\mu)\int_0^\infty \exp(-iH\Lambda)\,d\Lambda, \tag{6}$$
where $H = (\Pi^2 - M^2)/2\mu$. The Green function of the iterated Eq. (4) is just the matrix element of Eq. (6), i.e.,
$$g(\vec{\rho}\,'', \vec{\rho}\,') = (i/2\mu)\int_0^\infty \langle \vec{\rho}\,''|\exp(-iH\Lambda)|\vec{\rho}\,'\rangle\,d\Lambda, \tag{7}$$
in which the integrand, $\langle \vec{\rho}\,''|\exp(-iH\Lambda)|\vec{\rho}\,'\rangle$, can be identified as a quantum propagator for evolution in $\Lambda$-time with an effective Hamiltonian $H$. This integrand can then be evaluated as a path integral to get $g(\vec{\rho}\,'', \vec{\rho}\,')$. From the result, the Green function $G(\vec{\rho}\,'', \vec{\rho}\,')$ for the linear Dirac equation can then be recovered. Introducing an operator in terms of $K_{ABC}$ in Eq. (3),
$$\Gamma_{ABC} = -(\beta K_{ABC} + ik\alpha_\rho), \tag{8}$$
the effective Hamiltonian can be written in the suggestive form,
$$H = \frac{1}{2\mu}\left[p_\rho^2 + \frac{\Gamma_{ABC}(\Gamma_{ABC} + 1)}{\rho^2} - \frac{2kE}{\rho} - \eta^2\right], \tag{9}$$
where $p_\rho^2 = -[(\partial/\partial\rho) + (1/2\rho)]^2$ and $\eta^2 = E^2 - \mu^2$. We can relate $\Gamma_{ABC}$ in Eq. (9) for this (2+1)-dimensional system to the Temple-Martin-Glauber operator [10-11] used by Biedenharn for the Kepler problem [7], and also by Berrondo and McIntosh [12] for systems with magnetic monopoles. However, as in these earlier works, it is noted that $[\Gamma_{ABC}, H] \neq 0$, and a suitable similarity transformation $S$ is required for diagonalization. For our case, this is found to be
$$S = \exp\left[(i\beta\alpha_\rho/2)\,\tanh^{-1}(k/K_{ABC})\right], \tag{10}$$
such that
$$H_S = SHS^{-1} = \frac{1}{2\mu}\left[p_\rho^2 + \frac{\Gamma^S_{ABC}(\Gamma^S_{ABC} + 1)}{\rho^2} - \frac{2kE}{\rho} - \eta^2\right], \tag{11}$$
where
$$\Gamma^S_{ABC} = S\,\Gamma_{ABC}\,S^{-1} = -\beta\,\left|(K_{ABC}^2 - k^2)^{1/2}\right|. \tag{12}$$
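The diagonalization in Eqs. (10)-(12) can be checked numerically in a small matrix model. The following is only a sketch under stated assumptions: the operator $K_{ABC}$ is replaced by a positive eigenvalue `kappa`, and the 2x2 representation $\beta = \sigma_z$, $\alpha_\rho = \sigma_x$ (polar angle zero) is a hypothetical choice, not taken from the text. All function names are illustrative.

```python
import math

# Assumed 2x2 representation (not fixed by the text): beta = sigma_z and,
# at polar angle phi = 0, alpha_rho = sigma_x. Both are hypothetical choices.
IDENTITY = [[1, 0], [0, 1]]
BETA = [[1, 0], [0, -1]]
ALPHA_RHO = [[0, 1], [1, 0]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][n] * b[n][j] for n in range(2)) for j in range(2)]
            for i in range(2)]


def combine(c1, a, c2, b):
    """Linear combination c1*a + c2*b of two 2x2 matrices."""
    return [[c1 * a[i][j] + c2 * b[i][j] for j in range(2)] for i in range(2)]


def gamma_operator(kappa, k):
    """Gamma = -(beta*kappa + i*k*alpha_rho), with K_ABC replaced by kappa."""
    return combine(-kappa, BETA, -1j * k, ALPHA_RHO)


def similarity(kappa, k, sign=1):
    """S (sign=+1) or its inverse (sign=-1) for S = exp[(i beta alpha_rho/2) theta].

    Since (i beta alpha_rho)^2 = 1, the exponential reduces to
    cosh(theta/2) + sign * (i beta alpha_rho) * sinh(theta/2).
    """
    theta = math.atanh(k / kappa)  # requires |k| < |kappa|
    generator = combine(1j, matmul(BETA, ALPHA_RHO), 0, IDENTITY)
    return combine(math.cosh(theta / 2), IDENTITY,
                   sign * math.sinh(theta / 2), generator)


def transformed_gamma(kappa, k):
    """Return S Gamma S^{-1} as a 2x2 complex matrix."""
    s = similarity(kappa, k, +1)
    s_inv = similarity(kappa, k, -1)
    return matmul(matmul(s, gamma_operator(kappa, k)), s_inv)
```

For positive `kappa` with `k < kappa`, the off-diagonal entries of `transformed_gamma(kappa, k)` cancel and the result is $-\beta(\kappa^2 - k^2)^{1/2}$, in line with Eq. (12). The cancellation relies on the rotation angle being the inverse hyperbolic tangent of $k/\kappa$.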
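For some eigenstates $|\gamma\rangle$, we have $\Gamma^S_{ABC}|\gamma\rangle = \gamma|\gamma\rangle$, where $\gamma = \pm(\kappa^2 - k^2)^{1/2}$, $\kappa^2 = [m \pm (1/2) - (a/2)]^2$, $m = 0, \pm 1, \pm 2, \ldots$, and $a = e\Phi/\pi$. From Eq. (11), we also have $\Gamma^S_{ABC}(\Gamma^S_{ABC} + 1)|\xi\rangle = \xi(\xi + 1)|\xi\rangle$, where $\xi = |\gamma| + \frac{1}{2}(\mathrm{sgn}\,\gamma - 1)$. The diagonalization procedure, in fact, allows for a straightforward separation of the radial and angular components. Thus, using $1 = \sum_\xi |\xi\rangle\langle\xi|$, we can write the Green function as
$$g(\vec{\rho}\,'', \vec{\rho}\,') = (i/2\mu)\int_0^\infty \langle \vec{\rho}\,''_S|\exp(-iH_S\Lambda)|\vec{\rho}\,'_S\rangle\,d\Lambda$$
$$= (i/2\mu)\int_0^\infty \sum_\xi \langle\varphi''|\xi\rangle\,\langle\rho''|\exp(-iH_\xi\Lambda)|\rho'\rangle\,\langle\xi|\varphi'\rangle\,d\Lambda$$
$$= \sum_\xi \langle\varphi''|\xi\rangle\; g_\xi(\rho'', \rho')\;\langle\xi|\varphi'\rangle, \tag{13}$$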
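where the radial part is given by
$$g_\xi(\rho'', \rho') = (i/2\mu)\int_0^\infty \langle\rho''|\exp(-iH_\xi\Lambda)|\rho'\rangle\,d\Lambda, \tag{14}$$
with an effective radial Hamiltonian of the form
$$H_\xi = \frac{1}{2\mu}\left[p_\rho^2 + \frac{\xi(\xi + 1)}{\rho^2} - \frac{2kE}{\rho} - \eta^2\right]. \tag{15}$$
With $\kappa$ taking positive or negative values, the angular function (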
2
1\i
«-5 $ (
K+J
(16)
- K - j
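where
$$\Phi_{\kappa \pm \frac{1}{2}}(\varphi) = (1/\sqrt{2\pi})\,\exp\left[i\left(\kappa \pm \frac{1}{2} + \frac{a}{2}\right)\varphi\right]. \tag{17}$$
At this stage, the radial part $g_\xi(\rho'', \rho')$ still remains to be determined.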
3
The Radial Sum-Over-All Histories
We now proceed to evaluate the radial Green function by expressing the integrand as the path integral,
$$\langle\rho''|\exp(-iH_\xi\Lambda)|\rho'\rangle = \int \exp[iS(\rho'', \rho')]\;\Theta(\rho - R)\;\mathcal{D}[\rho]. \tag{18}$$
The presence of the impenetrable solenoid is represented by $\Theta(\rho - R)$, which is a step function taking the value unity when the particle is outside the solenoid, $\rho > R$, and zero when $\rho < R$. The calculation can be carried out by slicing the time-like parameter $\Lambda$ into $N$ subintervals, i.e., $\Lambda/N = \epsilon_j = \Lambda_j - \Lambda_{j-1}$ $(j = 1, 2, \ldots, N)$, giving the propagator in the form
$$\int \prod_{j=1}^{N} \exp(iS_j)\;\Theta(\rho_j - R)\,(\mu/2\pi i\epsilon_j)^{1/2} \prod_{j=1}^{N-1} d\rho_j \tag{19}$$
with $\rho_j = \rho(\Lambda_j)$, $\rho_0 = \rho'$, $\rho_N = \rho''$, and $\Lambda = \sum_j \epsilon_j$. The corresponding action for each interval is
$$S_j = \frac{\mu(\Delta\rho_j)^2}{2\epsilon_j} - \frac{\xi(\xi + 1)}{2\mu\hat{\rho}_j^{\,2}}\,\epsilon_j + \frac{b}{\hat{\rho}_j}\,\epsilon_j + \frac{\eta^2}{2\mu}\,\epsilon_j, \tag{20}$$
where $\Delta\rho_j = \rho_j - \rho_{j-1}$, $b = kE/\mu$, and $\hat{\rho}_j = (\rho_j\rho_{j-1})^{1/2}$ is the geometric mean. We note that the radial path integral, with the $S_j$ given above, is similar to that encountered for the non-relativistic Coulomb problem and, hence, path integrable [13,14]. This, in fact, was the motivation for the application of the Biedenharn transformation to the iterated Dirac equation for the system. Then, taking the limit $R \to 0$ for the radius of the solenoid, while maintaining a non-vanishing flux $\Phi$ at the origin, the path integration yields the result
$$K(r'', r') = (1/2r'r'')\,\exp[i4b\sigma]\,(-i\mu\omega)\csc(\omega\sigma)\,\exp\left[\tfrac{i}{2}\mu\omega\,(r''^2 + r'^2)\cot(\omega\sigma)\right] \times I_{\lambda + \frac{1}{2}}[-i\mu\omega\,r''r'\csc(\omega\sigma)], \tag{21}$$
where $\omega = i2\eta/\mu$, $\lambda = 2\xi + \frac{1}{2}$, $r'^2 = \rho'$, $r''^2 = \rho''$, $\sigma = \Lambda/4(\rho'\rho'')^{1/2}$, and $I_{\lambda + \frac{1}{2}}$ is a modified Bessel function. We can use this result to write the radial Green function (14) as
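$$g_\xi(\rho'', \rho') = \int_0^\infty \exp[-2pq]\;\mathrm{csch}(q)\;\exp[i\eta(\rho'' + \rho')\coth(q)] \times I_{\lambda + \frac{1}{2}}[-2i\eta(\rho'\rho'')^{1/2}\,\mathrm{csch}(q)]\;dq, \tag{22}$$
where $p = -ikE/\eta$, $\omega\sigma = i\eta\Lambda/(4\rho'\rho''\mu^2)^{1/2}$, and $q = \eta\Lambda/(4\rho'\rho''\mu^2)^{1/2}$. Integrating over $q$, we further obtain the radial Green function in terms of the Whittaker functions, $M_{\alpha,\beta}$ and $W_{\alpha,\beta}$,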
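$$g_\xi(\rho'', \rho') = \frac{\Gamma(p + \xi + 1)}{2i\eta\,(\rho'\rho'')^{1/2}\,\Gamma(2\xi + 2)}\; M_{-p,\,\xi + \frac{1}{2}}(-2i\eta\rho'')\; W_{-p,\,\xi + \frac{1}{2}}(-2i\eta\rho'). \tag{23}$$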
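The factor $\Gamma(p + \xi + 1)$ in Eq. (23) diverges when its argument is a non-positive integer, and these poles fix the bound-state energies. As a hedged numerical aside, the sketch below solves the pole condition $p + \xi + 1 = -n$ using only the definitions $p = -ikE/\eta$ and $\eta^2 = E^2 - \mu^2$ from the text. The branch $\eta = +i(\mu^2 - E^2)^{1/2}$ for $E < \mu$ and an attractive coupling $k > 0$ are assumptions made here, and the function names are illustrative.

```python
import math


def xi_from_gamma(gamma):
    """xi = |gamma| + (sgn(gamma) - 1)/2, as defined in the text."""
    return abs(gamma) + 0.5 * (math.copysign(1.0, gamma) - 1.0)


def bound_state_energy(n, xi, k, mu=1.0):
    """Energy solving p + xi + 1 = -n for 0 < E < mu.

    Assumption: eta = +i*sqrt(mu^2 - E^2), so that p = -k*E/sqrt(mu^2 - E^2).
    The condition k*E/sqrt(mu^2 - E^2) = n + xi + 1 then gives the closed form.
    """
    big_n = n + xi + 1.0
    return mu / math.sqrt(1.0 + (k / big_n) ** 2)


def pole_argument(energy, xi, k, mu=1.0):
    """Argument p + xi + 1 of the Gamma function in Eq. (23) on the same branch."""
    s = math.sqrt(mu ** 2 - energy ** 2)
    p = -k * energy / s
    return p + xi + 1.0
```

With `xi = xi_from_gamma(gamma)`, feeding `bound_state_energy(n, xi, k)` back into `pole_argument` returns `-n` up to rounding, and the energy tends to `mu` as `k` goes to zero.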
The Green function $G(\vec{\rho}\,'', \vec{\rho}\,')$ for the linear Dirac equation (1) can also be evaluated in the $S$ representation using Eq. (5), where the operator $M_S = SMS^{-1} = i\beta\alpha_\rho\left[(\partial/\partial\rho) + (\frac{1}{2} + \gamma)/\rho + (kE/\gamma)\right] - (\kappa E/\gamma)$, when acting on the $\kappa$ state. This can be used in Eq. (5) together with the recurrence relations for Whittaker functions. The energy spectrum for the Aharonov-Bohm-Coulomb problem can, however, already be obtained from the poles of the Gamma function in equation (23), i.e., when $p + \xi + 1 = -n$ $(n = 0, 1, 2, \ldots)$, and this yields
$$E = \mu\left\{1 + \frac{k^2}{\left[n + \sqrt{\left(m \pm \frac{1}{2} - \frac{a}{2}\right)^2 - k^2}\,\right]^2}\right\}^{-1/2}. \tag{24}$$
This energy spectrum agrees with that of reference [3]. For the special case where the flux, $a = e\Phi/\pi$, is zero, the result yields the energy spectrum of the relativistic (2+1)-dimensional hydrogen atom [15].
4
Conclusion
In this paper, we have considered the relativistic problem for a fermion interacting with a combined Aharonov-Bohm-Coulomb potential using a path integral approach to solve the (2+1)-dimensional Dirac equation. For our solution, a key role is played by a similarity transformation that casts the iterated Dirac equation into a form similar to the nonrelativistic problem for which path integration has been done earlier. This type of transformation was used by Biedenharn [7] in his treatment of the relativistic Coulomb problem. In analogy to Biedenharn's work, we have also introduced an operator, $\Gamma_{ABC} = -(\beta K_{ABC} + ik\alpha_\rho)$, of the Temple-Martin-Glauber type [10-11], used for (3+1)-dimensional problems. Appropriate diagonalization of this operator allows the straightforward separation of the radial and angular parts of the path integral. Evaluation of the radial path integral reduces to the simple application of techniques used in the solution of the non-relativistic Coulomb problem. From the resulting Green function, we also obtained the energy spectrum of the Aharonov-Bohm-Coulomb system, which reduces to the energy values for the (2+1)-dimensional relativistic hydrogen problem when the magnetic flux $\Phi = 0$.
We note here that a path integral treatment of the relativistic Aharonov-Bohm-Coulomb problem has also been recently considered [4,6]. The approach adopted in reference [4], however, applied the Levi-Civita transformation to handle the AB potential and, therefore, differs from the procedure presented here. The system tackled in reference [6], on the other hand, deals only with spinless particles which obey the Klein-Gordon equation. We also note that other systems can be investigated within the relativistic Aharonov-Bohm-Coulomb framework. These include a spin-1/2 particle moving in the vicinity of a straight cosmic string with a flux $\Phi$ inside its long lineal structure [16,17], a particle in the field of a massless spinning source [18], and a generalization of the gravitational anyon [19,20] to cases which involve fermions.
Acknowledgments
Two of the authors (C. C. B. and M. V. C. B.) wish to thank the conference organizers for the invitation to participate in this celebration in honor of Professor Ludwig Streit. They are also grateful to the Abdus Salam International Centre for Theoretical Physics (Trieste), where part of this work was done during an Associateship visit with support from the Japanese government.
References
[1] J. Bornales, C. C. Bernido, and M. V. Carpio-Bernido, Phys. Lett. A 260 (1999) 447.
[2] Y. Aharonov and D. Bohm, Phys. Rev. 115 (1959) 485; an account of interesting basic theoretical and experimental features of the A-B effect is given in M. Peshkin and A. Tonomura, The Aharonov-Bohm Effect (Springer Verlag, Berlin, 1989) LNP 340.
[3] C. R. Hagen and D. K. Park, Ann. Phys. 251 (1996) 45.
[4] D. K. Park and S.-K. Yoo, Ann. Phys. 263 (1998) 295.
[5] V. M. Villalba, J. Math. Phys. 36 (1995) 3332.
[6] D.-H. Lin, J. Phys. A: Math. Gen. 31 (1998) 4785.
[7] L. C. Biedenharn, Phys. Rev. 126 (1962) 845.
[8] J. Schwinger, Phys. Rev. 82 (1951) 664.
[9] B. S. DeWitt, Dynamical Theory of Groups and Fields (Gordon and Breach, New York, 1965).
[10] G. Temple, The General Principles of Quantum Mechanics (Methuen, New York, 1948) p. 92.
[11] P. C. Martin and R. J. Glauber, Phys. Rev. 109 (1958) 1307.
[12] M. Berrondo and H. V. McIntosh, Jour. Math. Phys. 11 (1970) 125.
[13] H. Kleinert, Path Integrals in Quantum Mechanics, Statistics, and Polymer Physics (World Scientific, Singapore, 1990).
[14] R. Ho and A. Inomata, Phys. Rev. Lett. 48 (1982) 231; A. Inomata, Phys. Lett. A 101 (1984) 253; see also M. A. Kayed and A. Inomata, Phys. Rev. Lett. 53 (1984) 107.
[15] S. H. Guo, X. L. Yang, F. T. Chan, K. W. Wong, and W. Y. Ching, Phys. Rev. A 43 (1991) 1197.
[16] M. G. Alford and F. Wilczek, Phys. Rev. Lett. 62 (1989) 1071; M. G. Alford, J. March-Russell, and F. Wilczek, Nucl. Phys. B 328 (1989) 140.
[17] Ph. Gerbert, Phys. Rev. D 40 (1989) 1346.
[18] Ph. Gerbert and R. Jackiw, Comm. Math. Phys. 124 (1989) 229.
[19] Y. M. Cho, D. H. Park, and C. G. Han, Phys. Rev. D 43 (1991) 1421.
[20] C. C. Bernido, J. Phys. A: Math. Gen. 26 (1993) 5461.
STATISTICAL MANIFOLDS: THE NATURAL AFFINE-METRIC STRUCTURE OF PROBABILITY THEORY
G. BURDET AND PH. COMBE
Centre de Physique Théorique, CNRS-Luminy, Case 907, F 13288 Marseille cedex 9, France
H. NENCKA
University of Madeira, Center of Mathematical Sciences, PT 9000 Funchal, Madeira, Portugal
The methods of differential geometry applied to probability and statistics open a new domain of investigation: the theory of statistical manifolds, or Information Geometry. This framework provides a geometrical description of statistical quantities and leads to a new approach to complex statistical problems. A peculiar feature is that statistical manifolds are naturally associated with a family of affine-metric geometries. The differential geometric content of a particular statistical model, the finite probability manifold, is used to exhibit the relations between some dynamical properties of the relative entropy and self-parallel curves. Applications are made to the learning process of stochastic neural networks.
1
Introduction
The efforts of Fisher [25] and Cramér [17] to give a more rigorous presentation of statistics and to provide new mathematical tools to solve complex statistical problems were successful in the forties, when C.R. Rao [33] linked statistics to differential geometry. Having in mind the basis given by Cramér and Fisher, Rao could show that the Fisher matrix was in fact a Riemannian metric, giving the very first beginning of Riemannian geometry on spaces of probabilities. Only thirty years later did Efron [22,23] introduce an affine connection into the geometry of parameter spaces to interpret the important notion of statistical curvature. Then Dawid suggested the interest of introducing other affine connections [20,21]. This idea of relating different branches of mathematics, such as differential geometry, probability, information theory and statistics, presents all the difficulties due to the translation between separate and complete mathematical formalisms, but leads to a new branch of mathematics: the theory of statistical manifolds or Information geometry [3,11,19,30,34].
The second section treats the construction of parametric statistical manifolds. The $\alpha$-Chentsov-Amari connections [3,4,15] are introduced and the $\alpha$-self-parallel curves [12,13] are discussed. Sec. 3 is devoted to the important case of $(\pm 1)$-flat manifolds [14] and some interesting properties are given. The role of $(\pm 1)$-self-parallel curves in the minimization of the Kullback-Leibler relative entropy [28,29] is studied. In Sec. 4 the special case of statistical manifolds of finite probabilities is considered and $(\pm 1)$-self-parallel curves are constructed. In Sec. 5 we apply these results to the learning of the Boltzmann machine [1,2,27]. Finally, in Sec. 6, we give an explicit calculation for a toy model based on probabilities on four points. The $(\pm 1)$-self-parallel curves orthogonal to submanifolds associated to the simplest Boltzmann machine are discussed.
2
Parametric statistical manifold, an overview
2.1
Statistical manifold
Let $(X, \mathcal{F}, \mu)$ be a measure space with $\mu$ a $\sigma$-finite measure and consider a parametric family of probabilities, absolutely continuous with respect to the measure $\mu$,
$$\mathcal{P}_\mu(\Theta) = \{P_\theta,\ \theta \in \Theta,\ P_\theta \ll \mu\},$$
where $\Theta \subset \mathbb{R}^d$ is an open subset (or a family of open overlapping subsets) which defines uniquely the elements of the family. Because of the absolute continuity of $P_\theta$, the Radon-Nikodym derivative $dP_\theta/d\mu$ exists and the family $\mathcal{P}_\mu(\Theta)$ can be realized in terms of random variables:
$$\left\{p_\theta = \frac{dP_\theta}{d\mu},\ \theta \in \Theta\right\}.$$
This realization is called the distribution or likelihood realization. The map $\varphi: P_\theta \mapsto \theta$ defines a chart $\varphi: \mathcal{P}_\mu(\Theta) \to \Theta$. Under general smoothness conditions it is possible to equip $\mathcal{P}_\mu(\Theta)$ with a fully symmetric 2-covariant tensor $g$, the Fisher metric [25,33], and a fully symmetric 3-covariant tensor $t$, the skewness tensor [3,4,15]. The triplet $S = \{\mathcal{P}_\mu(\Theta), g, t\}$, or equivalently the same triplet built on the likelihood realization, is called a (parametric) statistical manifold on $(X, \mathcal{F}, \mu)$. Any real one-to-one map allows one to construct other realizations of the parametric family $\{P_\theta,\ \theta \in \Theta\}$ as spaces of random variables. One which is widely used is the log-likelihood:
$$\mathcal{L}_\mu(\Theta) = \{\ell_\theta = \ln p_\theta\ \text{a.s.},\ P_\theta \in \mathcal{P}_\mu(\Theta)\}.$$
This realization is better adapted to information theory and will be called the information realization.
2.2
Tangent and cotangent plane
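For the sake of simplicity let us assume that $P_\theta$ and $P_{\theta'}$ are equivalent for all $\theta$ and $\theta'$, $P_\theta \equiv P_{\theta'}$, that is $P_\theta \ll P_{\theta'}$ and $P_{\theta'} \ll P_\theta$, and that $\Theta$ is an open set of $\mathbb{R}^d$ with coordinates $\theta = \{\theta^1, \theta^2, \ldots, \theta^d\}$ in the canonical basis of $\mathbb{R}^d$. The tangent plane $T_{P_\theta}$ at a point $P_\theta \in \mathcal{P}_\mu(\Theta)$ is generated by the tangent vectors at $P_\theta$,
$$A = \sum_k A^k \frac{\partial}{\partial\theta^k}. \tag{1}$$
The derivative $\partial_k P_\theta$ of $P_\theta$ along the direction $\theta^k$ is a signed measure absolutely continuous w.r.t. $P_\theta$, with Radon-Nikodym derivative
$$\frac{d(\partial_k P_\theta)}{dP_\theta} = \partial_k \ell_\theta = \partial_k \ln p_\theta, \tag{2}$$
which verifies
$$E_{P_\theta}[\partial_k \ell_\theta] = 0. \tag{3}$$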
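The random variables $\partial_k \ell_\theta$, $k = 1, \ldots, d$, are called in statistics the components of the score vector and generate a $d$-dimensional linear space. The tangent space $T_{P_\theta}$ at $P_\theta \in \mathcal{P}_\mu(\Theta)$ can be identified with the $d$-dimensional Euclidean space $T_\theta$ of random variables with vanishing expectation:
$$T_\theta = \left\{A_\theta \in \mathbb{R}^X : A_\theta = \sum_k A^k\,\partial_k \ell_\theta\right\}.$$
The space $T_\theta$ can be equipped with the natural scalar product:
$$\langle A_\theta, B_\theta\rangle_\theta = E_{P_\theta}[A_\theta B_\theta]. \tag{4}$$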
T£fl at Pg e P M (©), generated by the differential
d
i=l
can be also realized on the space Tg through the application d
A(a) = £ 0 , ( 0 ) 0 * ,
(6)
where
(a*)* = M
Y,
ekjl-Z-'j*(die)jl
... {d~£)i<... (ddl)jd,
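(7)
the $a^i$ and $\partial_j \ell_\theta$ verifying the duality relation $E_{P_\theta}[a^i\,\partial_j \ell_\theta] = \delta_{ij}$.
2.3
S as a Riemannian manifold
Using the natural scalar product Eq. (4) on $T_\theta$ we can define a metric tensor on $\mathcal{P}_\mu(\Theta)$ in the following way
$$g|_{P_\theta}(A, B) = E_{P_\theta}[A_\theta B_\theta],$$
or in $\theta$-coordinates:
$$g|_{P_\theta}(\partial_i, \partial_j) = g_{ij} = E_{P_\theta}[\partial_i \ell_\theta\,\partial_j \ell_\theta] = -E_{P_\theta}[\partial_i \partial_j \ell_\theta]. \tag{8}$$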
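The two expressions for $g_{ij}$ in Eq. (8), together with the vanishing mean of the score in Eq. (3), can be checked on the simplest one-parameter family. The sketch below is only an illustration: the Bernoulli family with $p_\theta(1) = \theta$, $p_\theta(0) = 1 - \theta$ is an example chosen here, not one used in the text, and the function names are hypothetical.

```python
def bernoulli_derivatives(theta, x):
    """First and second theta-derivatives of l = ln p_theta(x) for x in {0, 1}."""
    first = x / theta - (1 - x) / (1 - theta)
    second = -x / theta ** 2 - (1 - x) / (1 - theta) ** 2
    return first, second


def fisher_checks(theta):
    """Return (E[dl], E[dl * dl], -E[d^2 l]), expectations taken under P_theta."""
    weights = {1: theta, 0: 1.0 - theta}
    mean_score = 0.0
    from_scores = 0.0
    from_hessian = 0.0
    for x, w in weights.items():
        first, second = bernoulli_derivatives(theta, x)
        mean_score += w * first            # Eq. (3): should vanish
        from_scores += w * first * first   # first form of Eq. (8)
        from_hessian -= w * second         # second form of Eq. (8)
    return mean_score, from_scores, from_hessian
```

For any `theta` strictly between 0 and 1, the mean score is zero and both forms of the metric agree with the closed form $1/[\theta(1 - \theta)]$.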
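This tensor $g$ is just the Fisher information metric [25]. The choice of this metric is unique under the condition of Markov invariance [32]. The Riemannian geometry is given by the Levi-Civita connection (covariant derivative) $\overset{0}{\nabla}_A : T_{P_\theta}(\mathcal{P}_\mu(\Theta)) \to T_{P_\theta}(\mathcal{P}_\mu(\Theta))$; $B \mapsto \overset{0}{\nabla}_A B$, defined by
$$\overset{0}{\nabla}_{\partial_i}\partial_j = \sum_k \overset{0}{\Gamma}{}_{ij}^{\ k}\,\partial_k, \tag{9}$$
or by the Christoffel symbols defined by
$$g|_{P_\theta}(\overset{0}{\nabla}_{\partial_i}\partial_j, \partial_m) = \sum_k \overset{0}{\Gamma}{}_{ij}^{\ k}\,g_{km} = \overset{0}{\Gamma}_{ijm}. \tag{10}$$
Then
$$\overset{0}{\Gamma}_{ijk} = \frac{1}{2}\left[\partial_i g_{jk} + \partial_j g_{ik} - \partial_k g_{ij}\right], \tag{11}$$
with the symmetry relation
$$\overset{0}{\Gamma}_{ijk} = \overset{0}{\Gamma}_{jik}. \tag{12}$$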
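This connection is torsion free. Indeed, the torsion is defined by the following relation
$$\overset{0}{T}(A, B) = \overset{0}{\nabla}_A B - \overset{0}{\nabla}_B A - (AB - BA), \qquad A, B \in T_{P_\theta},$$
or in $\theta$-coordinates
$$\overset{0}{T}_{ijk} = g|_{P_\theta}(\overset{0}{T}(\partial_i, \partial_j), \partial_k) = \overset{0}{\Gamma}_{ijk} - \overset{0}{\Gamma}_{jik}. \tag{13}$$
It follows, from the symmetry of $\overset{0}{\Gamma}$, that
$$\overset{0}{T}(A, B) = 0. \tag{14}$$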
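Let us now introduce the curvature tensor
$$\overset{0}{R}(A, B)C = [\overset{0}{\nabla}_A, \overset{0}{\nabla}_B]\,C - \overset{0}{\nabla}_{[A,B]}\,C.$$
Then the components, in the $\theta$-coordinates, of the Riemann tensor are given by
$$\overset{0}{R}_{ijk\ell} = g|_{P_\theta}(\overset{0}{R}(\partial_i, \partial_j)\partial_k, \partial_\ell) = g_{\ell m}\left(\partial_i \overset{0}{\Gamma}{}_{jk}^{\ m} - \partial_j \overset{0}{\Gamma}{}_{ik}^{\ m}\right) + \overset{0}{\Gamma}_{im\ell}\,\overset{0}{\Gamma}{}_{jk}^{\ m} - \overset{0}{\Gamma}_{jm\ell}\,\overset{0}{\Gamma}{}_{ik}^{\ m}. \tag{15}$$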
By contraction, we obtain the components of the Ricci tensor o
^^ o
.
and the scalar curvature o
A
o *
( 17 )
R
R = T, ife=i
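The geodesics are curves $s \mapsto \gamma(s)$ on $S$ such that $\dot{\gamma} := d\gamma/ds$ is parallelly transported along $\gamma$, and are therefore solutions of the equation
$$\overset{0}{\nabla}_{\dot{\gamma}}\,\dot{\gamma} = 0,$$
which in the $\theta$-coordinate system is rewritten as
$$\ddot{\theta}^k + \overset{0}{\Gamma}{}_{ij}^{\ k}\,\dot{\theta}^i\dot{\theta}^j = 0. \tag{18}$$
The geodesic curves minimize the Riemannian distance; indeed $g(\dot{\gamma}, \dot{\gamma}) = \mathrm{Cst}$, corresponding to
$$\frac{d}{ds}\,g(\dot{\gamma}, \dot{\gamma}) = 0. \tag{19}$$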
2.4
Affine metric geometries on S
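Let us first define the fully symmetric "skewness" tensor $t$ by
$$t|_{P_\theta}(A, B, C) = E_{P_\theta}[A_\theta B_\theta C_\theta], \qquad A, B, C \in T_{P_\theta},$$
or in the $\theta$-coordinates
$$t_{ijk} = E_{P_\theta}[\partial_i \ell\,\partial_j \ell\,\partial_k \ell]. \tag{20}$$
The family of affine connections $\overset{\alpha}{\nabla}$ satisfying
$$\overset{\alpha}{\nabla} g = \alpha\,t, \qquad \alpha \in \mathbb{R}, \tag{21}$$
are the $\alpha$-connections of Amari-Chentsov [3,4,15]. Such $\alpha$-connections are non-Riemannian if $\alpha \neq 0$. Moreover, the couple $(\alpha, -\alpha)$, $\alpha \neq 0$, defines a pair of dual connections w.r.t. the metric $g$ in the sense that
$$A\,g|_{P_\theta}(B, C) = g|_{P_\theta}(\overset{\alpha}{\nabla}_A B, C) + g|_{P_\theta}(B, \overset{-\alpha}{\nabla}_A C), \qquad A, B, C \in T_{P_\theta}. \tag{22}$$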
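In the same way as for the Levi-Civita connection we define the coefficients of an $\alpha$-connection by
$$g|_{P_\theta}(\overset{\alpha}{\nabla}_{\partial_i}\partial_j, \partial_m) = \sum_k \overset{\alpha}{\Gamma}{}_{ij}^{\ k}\,g_{km} = \overset{\alpha}{\Gamma}_{ijm}, \tag{23}$$
and the $\alpha$-torsion
$$\overset{\alpha}{T}(A, B) = \overset{\alpha}{\nabla}_A B - \overset{\alpha}{\nabla}_B A - (AB - BA), \qquad A, B \in T_{P_\theta},$$
or
$$\overset{\alpha}{T}_{ijk} = g|_{P_\theta}(\overset{\alpha}{T}(\partial_i, \partial_j), \partial_k) = \overset{\alpha}{\Gamma}_{ijk} - \overset{\alpha}{\Gamma}_{jik}. \tag{24}$$
Again the $\alpha$-connections are torsion free,
$$\overset{\alpha}{T}(A, B) = 0. \tag{25}$$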
The associated a-Riemannian tensor takes the form R ijke = 9em (% Tjkm-
dj f
m ik
) + f tmi f jkm - f
giving rise to the Ricci a-curvature tensor
fc=i
and to the scalar a-curvature defined by d
i
jmt
f
m tk
,
(26)
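The $\alpha$-self-parallel curves or "$\alpha$-geodesics" are the curves $s \mapsto \gamma(s)$, with $\dot{\gamma} := d\gamma/ds$ being parallelly transported along $\gamma$ with respect to the $\alpha$-connection. Then:
$$\overset{\alpha}{\nabla}_{\dot{\gamma}}\,\dot{\gamma} = 0. \tag{29}$$
In the $\theta$-coordinate system the equation for $\alpha$-geodesics takes the form
$$\ddot{\theta}^k + \overset{\alpha}{\Gamma}{}_{ij}^{\ k}\,\dot{\theta}^i\dot{\theta}^j = 0. \tag{30}$$
But the $\alpha$-self-parallel curves with $\alpha \neq 0$ do not minimize the Riemannian distance; indeed
$$\frac{d}{ds}\,g(\dot{\gamma}, \dot{\gamma}) = \alpha\,t(\dot{\gamma}, \dot{\gamma}, \dot{\gamma}), \tag{31}$$
so the $\alpha$-self-parallel curves do not admit $g(\dot{\gamma}, \dot{\gamma}) = \mathrm{Cst}$ as a first integral.
3
Exponential families
3.1
The manifolds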
We consider in this section important $(\pm 1)$-flat families of probability distributions: the exponential families, dominated by the probability $P$ [12]. Let $P$ be a probability on $(X, \mathcal{F}, \mu)$; the probability $P'$ is of exponential type with respect to $P$ if the logarithm of the Radon-Nikodym derivative $dP'/dP$ is given by an element of the linear space generated by the score vectors defined in subsection 2.2. Let us denote by $T_P$ the set of (centered) random variables over $(X, \mathcal{F}, P)$ which admit an expansion in terms of the scores of the following form
$$T_P = \left\{X \in \mathbb{R}^X \ \middle|\ X - E_P[X] = g_P^{-1}\left(E_P[X\,d\ell], d\ell\right)\right\}.$$
By direct inspection, one finds that the log-likelihood $\ell = \ln p$ of the usual parametric families of probability distributions belongs to $T_P$, as does the difference $\ell - \ell^*$ of the log-likelihoods of two probabilities of the same family. So we want to study the consequences of such empirical properties.
Proposition 3.1 Given an exponential family of probability distributions, the metric is generated by the $(-1)$-Hessian of the corresponding entropy
$$\overset{-1}{\nabla} d|_P\,H(P) = -g|_P, \tag{32}$$
where $H(P) = -E_P[\ell]$ is the entropy of the probability $P$.
Indeed, for an exponential manifold one has V d£=-g-^-t(de),
(33)
and it is easy to verify that = -g[P+l-±^t(J)\P,
Vd]PH(P)
(34)
where J denotes the "entropy current" J = g-\dH).
(35)
Moreover, the a-curvature tensor of the torsionless a-connection is given by R(X,Y)d£=^—^-
{t{t(d£,X),Y)
-t(t(d£,Y),X)}
= ~R(X,Y)di.
(36)
An algebraic computation leads to the P r o p o s i t i o n 3.2 Let P and P* be two probability distributions of an expo nential family, then Vd|P
K(P;P')=g\P
(37)
Vd|p.Ar(P;P*)=S|p., where K(P;P*)
= EP[e-e*}
(38)
is the Kullback-Leibler relative entropy. More generaly, one verifies that: V d | P K(P;P*)=g\P Vd|P.K(P;P*)=5lp. 3.2
+
±±ZEP[(e-e*)de®d£]
(39)
+L^t(Ep[diP.t}).
The (±1)-self-parallel curves
To obtain properties relating (±l)-self-parallel curves to the relative en tropy ( 12 ), we consider the following identities ^ K ( P ( S ) ; P * ) = VdlpK(P;P*)(7s,7,)+ff(7«,grad|pK(P;P*)X40) | ^ ( P ; P*(t)) = V d| P . K(P; P*)(7t*,7,*) + 9\P> <7«,grad | P .K(P; P')X41)
146 where P belongs to 7. Let us suppose that 'y(s) is (-l)-self-parallel, using the proposition 3.2, one gets
£.j A - ( P ( * ) ; P ' ) = | , ( 7 . , 7 . ) .
(42) P i ds ' The lenght of the velocity field appeared at the right hand side and does not depend on P*, so the differential in P* is null X 1
d,p. (^K(P(s);n
= ^(d,P.*(P(s);P*)) =0.
(43)
Therefore, $\mathrm{grad}|_{P^*} K(P(s); P^*)$ is a vector field linear in $s$. Now, to express that the differential in $P^*$ of the relative entropy vanishes if $P^* = P$, the origin of $\gamma$ is fixed at $P(0) = P^*$. Hence, the set of points $P(s)$ satisfying the relation $(\mathrm{grad}|_{P^*} K)(P(s); P(0)) = -\dot{\gamma}_0\,s$ belongs to the $(-1)$-self-parallel curve $\gamma$ of which $s$ is the affine parameter and $\dot{\gamma}_0$ the initial velocity. Finally, by interchanging the roles of $P$ and $P^*$ and working with the identity (41), one gets
Proposition 3.3 The parametrized $(\pm 1)$-self-parallel curves of a 1-flat statistical manifold are explicitly described by the gradients of the relative entropy,
$$(-1)\text{-self-parallel curves}: \quad (\mathrm{grad}|_{P^*} K)(P(s); P(0)) = -\dot{\gamma}_0\,s \tag{44}$$
$$(+1)\text{-self-parallel curves}: \quad (\mathrm{grad}|_{P} K)(P^*(0); P^*(t)) = -\dot{\gamma}^*_0\,t \tag{45}$$
Hence, a theoretical basis is given to the steepest gradient optimization method in the statistical manifold framework.
3.3
Minimizations of the Kullback-Leibler relative entropy
Let $P$ be a current probability on a curve $\gamma$ in the statistical manifold $S$. Given $P^* \in S$, with $P^* \notin \gamma$, the position of $P$ which locally minimizes the Kullback-Leibler relative entropy $K(P; P^*) = E_P[\ell - \ell^*]$, if it exists, is given by the solutions $P(s_0)$ of the equation
$$\frac{d}{ds} K(P(s); P^*) = 0,$$
i.e. by the $s = s_0$ solutions of
$$g(\dot{\gamma}_s, (\mathrm{grad}|_P K)(P; P^*)) = 0. \tag{46}$$
On the other hand, by proposition 3.2, the relation (40) can be written as
$$\frac{d^2}{ds^2} K(P(s); P^*) = g|_P(\dot{\gamma}_s, \dot{\gamma}_s) + g(\ddot{\gamma}_s, \mathrm{grad}|_P K(P; P^*)). \tag{47}$$
Due to the properties of the Riemann metric, the first term on the right hand side is positive, and from the minimization condition one deduces that the second term vanishes at $s_0$, showing that the extrema with $\dot{\gamma}_s \neq 0$ are minima. Then, from relation (45), there follows
Proposition 3.4 Let $\gamma^*$ be a $(1)$-self-parallel curve with $t$ as affine parameter, crossing for $t = 0$ a curve $\gamma$ in $P(s_0)$. Then, for any $t$, $K(P(s_0); P^*(t))$ is minimizing $K(P(s); P^*(t))$ locally if the curves $\gamma$ and $\gamma^*$ are orthogonal, i.e. $g(\dot{\gamma}_{s_0}, \dot{\gamma}^*_0) = 0$.
Figure 1. Minimization of $K(P(s); P^*(t))$ w.r.t. the parameter $s$ (labels: $P^*(t)$, $P(s_0) = P^*(0)$, $P(s)$, $(1)$-self-parallel curve).
As previously, the above analysis can be done by interchanging the roles of $P$ and $P^*$, so one gets by relation (44)
Proposition 3.5 Let $\gamma$ be a $(-1)$-self-parallel curve with $s$ as affine parameter, crossing for $s = 0$ a curve $\gamma^*$ in $P^*(t_0)$. Then, for any $s$, $K(P(s); P^*(t_0))$ is minimizing $K(P(s); P^*(t))$ locally if the curves $\gamma$ and $\gamma^*$ are orthogonal, i.e. $g(\dot{\gamma}_0, \dot{\gamma}^*_{t_0}) = 0$.
These properties can be extended to congruences of curves, i.e. to submanifolds of $S$, allowing for instance to define the orthogonal $(\pm 1)$-projections minimizing the relative entropy locally. This is the geometrical version of the Pythagorean theorem of Csiszár [18,19], see also Amari [3,6].
Figure 2. Minimization of $K(P(s); P^*(t))$ w.r.t. the parameter $t$.
4 Statistical manifolds of finite probability distributions

4.1 The manifold
Let $(X, \mathcal{P}(X), P)$ be a probability space, where $X = \{x_0, x_1, \ldots, x_d\}$ is a finite space of cardinality $|X| = d + 1$ and $\mathcal{P}(X)$ is the algebra of all measurable subsets of $X$, $|\mathcal{P}(X)| = 2^{d+1}$. This algebra is generated by the family $\{\{x_0\}, \{x_1\}, \ldots, \{x_d\}\}$ of the $d + 1$ one-point subsets of $X$. The finite probability $P$, absolutely continuous with respect to the uniform measure, is well defined by its distribution function:
$$p(x_k) = P[\{x_k\}] = p_k, \quad \forall x_k \in X, \qquad \sum_{k=0}^{d} p_k = 1. \tag{48}$$
The distribution $p$ appears as a random variable $p : X \to [0,1]$ which can be written
$$p = \sum_{k=0}^{d} p_k\,\delta_{x_k}, \qquad \ln p = \ell = \sum_{k=0}^{d} \ell_k\,\delta_{x_k}, \qquad \ell_k = \ln p_k,$$
where
$$\delta_{x_k}(x) = \begin{cases} 1 & \text{if } x = x_k, \\ 0 & \text{otherwise.} \end{cases}$$
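Then the $d$-dimensional manifold $\mathcal{S}$ of finite probability distributions on $d + 1$ points is defined by one of the natural coordinate systems $\{p_i\}$ or $\{\theta_i\}$, $i = 1, \ldots, d$,
$$p = \sum_{i=1}^{d} p_i\,\delta_{x_i} + \Big(1 - \sum_{i=1}^{d} p_i\Big)\delta_{x_0}, \qquad \ell = \sum_{i=1}^{d} \theta_i\,\delta_{x_i} + \ln\Big(1 - \sum_{i=1}^{d} p_i\Big), \qquad \theta_i = \ln\frac{p_i}{p_0}, \quad i = 1, \ldots, d. \tag{49}$$

4.2 The Fisher metric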
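The Fisher information metric is given by
$$g|_P = \sum_{i,j} g_{ij}(p)\,dp_i \otimes dp_j = \sum_{i,j}\Big(\frac{\delta_{ij}}{p_i} + \frac{1}{p_0}\Big)dp_i \otimes dp_j, \tag{50}$$
where $p_0 = 1 - \sum_{i=1}^{d} p_i$ and $g_{ij}(p) = E_p[\partial_{p_i}\ell\,\partial_{p_j}\ell](p)$, and its inverse by
$$g^{-1}|_P = \sum_{i,j} g^{ij}(p)\,\partial_{p_i} \otimes \partial_{p_j} = \sum_{i,j}\big(p_i\delta_{ij} - p_i p_j\big)\,\partial_{p_i} \otimes \partial_{p_j}. \tag{51}$$

4.3 The skewness tensor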
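The skewness tensor is defined through the fully symmetric 3-covariant tensor:
$$T|_P = \sum_{ijk} T_{ijk}(p)\,dp_i \otimes dp_j \otimes dp_k = \sum_{ijk}\Big(\frac{\delta_{ij}\delta_{jk}}{p_i^2} - \frac{1}{p_0^2}\Big)dp_i \otimes dp_j \otimes dp_k. \tag{52}$$
Let us introduce the mixed tensor $T|_P = \sum_{ijk} T_{ij}^{\ k}\,dp_i \otimes dp_j \otimes \partial_{p_k}$,
$$T_{ij}^{\ k} = \sum_{l} T_{ijl}\,g^{lk} = \frac{\delta_{ij}\delta_{ik}}{p_i} - p_k\Big(\frac{\delta_{ij}}{p_i} + \frac{1}{p_0}\Big). \tag{53}$$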
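The components of the $\alpha$-connection $\nabla^{(\alpha)}$ take the form
$$\Gamma^{(\alpha)\,k}_{ij} = -\frac{1+\alpha}{2}\,T_{ij}^{\ k}, \tag{54}$$
and the components, in the $p$ coordinates, of the Riemann tensor are
$$R^{(\alpha)}_{hijk} = \frac{1-\alpha^2}{4}\big(g_{hj}g_{ik} - g_{hk}g_{ij}\big), \qquad R^{(\alpha)\,h}_{\ \ ijk} = \frac{1-\alpha^2}{4}\big(\delta^h_j g_{ik} - \delta^h_k g_{ij}\big), \tag{55}$$
those of the Ricci tensor are
$$R^{(\alpha)}_{ij} = \frac{1-\alpha^2}{4}(d-1)\,g_{ij}, \tag{56}$$
and the scalar curvature is given by
$$R^{(\alpha)} = \frac{1-\alpha^2}{4}\,d(d-1). \tag{57}$$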
It follows that the $\alpha$-scalar curvature is constant for any $\alpha$ and that the statistical manifold of probabilities on a finite set is (±1)-flat, the Riemannian geometry being obtained for $\alpha = 0$. Finally, the $\alpha$-self-parallel curves are solutions of the following system of differential equations
$$\ddot p_i = \frac{1+\alpha}{2}\Big(\frac{\dot p_i^{\,2}}{p_i} - p_i\sum_{j=1}^{d}\frac{\dot p_j^{\,2}}{p_j} - p_i\,\frac{\dot p_0^{\,2}}{p_0}\Big), \qquad p_0 = 1 - \sum_{j=1}^{d} p_j. \tag{58}$$
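It follows that the (-1)-self-parallel curves are linear in $t$,
$$p_i(t) = p_i(0) + \dot p_i(0)\,t, \tag{59}$$
and the (1)-self-parallel curves correspond to
$$p_i(t) = \frac{p_i(0)\exp(\kappa_i t)}{1 + \sum_{j=1}^{d} p_j(0)\big(\exp(\kappa_j t) - 1\big)}, \tag{60}$$
where
$$\kappa_i = \frac{\dot p_i(0)}{p_i(0)} + \sum_{j=1}^{d}\frac{\dot p_j(0)}{p_0(0)}, \qquad p_0(0) = 1 - \sum_{j=1}^{d} p_j(0). \tag{61}$$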
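As an illustration of Eqs. (59)-(61), the following minimal sketch evaluates both families of self-parallel curves on the simplex in the coordinates $p_1, \ldots, p_d$. The function names `m_geodesic` and `e_geodesic` and the sample initial data are assumptions made for this example only; they do not come from the text.

```python
import math

def m_geodesic(p_init, v_init, t):
    """(-1)-self-parallel curve, Eq. (59): a straight line in the p coordinates."""
    return [p + v * t for p, v in zip(p_init, v_init)]

def e_geodesic(p_init, v_init, t):
    """(+1)-self-parallel curve, Eqs. (60)-(61), for the coordinates p_1..p_d."""
    p0_init = 1.0 - sum(p_init)            # p_0(0)
    shift = sum(v_init) / p0_init          # sum_j pdot_j(0) / p_0(0)
    kappa = [v / p + shift for p, v in zip(p_init, v_init)]
    denom = 1.0 + sum(p * (math.exp(k * t) - 1.0)
                      for p, k in zip(p_init, kappa))
    return [p * math.exp(k * t) / denom for p, k in zip(p_init, kappa)]

# Hypothetical initial data on the 2-dimensional simplex (d = 2).
p_init = [0.2, 0.3]      # p_1(0), p_2(0), hence p_0(0) = 0.5
v_init = [0.1, -0.05]    # initial velocities pdot_1(0), pdot_2(0)

print(m_geodesic(p_init, v_init, 0.7))
print(e_geodesic(p_init, v_init, 0.7))
```

Along the (1)-curve the ratios $\ln(p_i/p_0)$ are linear in $t$ with slope $\kappa_i$, which is the sense in which the $\theta$ coordinates of Eq. (49) are affine for $\alpha = 1$.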
5 Symmetric Boltzmann machines

5.1 The manifold

A symmetric Boltzmann machine [1,2,5,7,8,16,27,31] with $n$ neurons ($B(n)$) is a probabilistic formal neural network, the architecture being given by a non-directed simplicial labeled graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, with set of vertices $\mathcal{V} = \{v_1, v_2, \ldots, v_n\}$ and set of edges $\mathcal{E} = \{e_1, e_2, \ldots, e_m\}$. The graph is not necessarily complete, but we prescribe that each vertex has a high degree of connectivity. Because the graph is simplicial and non-directed, between two vertices $v_i$, $v_j$ there is at most one edge $e = e_{i,j} = (v_i, v_j)$. The weight $W_{ij} = W(e_{i,j})$ of the edge $e_{i,j}$ is symmetric:
$$W_{ij} = W_{ji}, \qquad W_{ii} = 0.$$
On each vertex there is a probabilistic binary automaton with state space $\{0,1\}$, so the configuration space of the network is $X = \{0,1\}^n$ with cardinality $d + 1 = 2^n$. For the sake of simplicity, we assume in the following that each neuron is an input and output neuron. Then the ambient probability space ($\mathcal{S}$) is the family of finite probabilities on $X$ and depends on $2^n - 1$ parameters. Moreover, the log-likelihood $\ell = \ln p$ can be written as follows
$$\ell(x) = \sum_{i_1=1}^{n}\theta^{i_1}x_{i_1} + \sum_{i_1<i_2}\theta^{i_1 i_2}x_{i_1}x_{i_2} + \ldots + \theta^{i_1\ldots i_n}x_{i_1}x_{i_2}\cdots x_{i_n} - \Psi(\theta), \tag{62}$$
with $i_1 < i_2 < \ldots < i_n$. The normalization constant $\Psi(\theta)$ is related to the entropy $H$ by
$$\Psi(\theta) = \sum \theta^{i_1\ldots i_k}\,\eta_{i_1\ldots i_k} + H(P), \qquad \text{where } \eta_{i_1\ldots i_k} = E_P[x_{i_1}x_{i_2}\cdots x_{i_k}]. \tag{63}$$
Let us notice that this parametrization is well adapted to describe higher order Boltzmann machines [9]. This family of probabilities is (±1)-flat and has the $\theta$'s as natural parametrization. The $\eta$'s define another parametrization, the "momentum parametrization" [5].
Neurons are updated according to the state of the connected neurons. An individual neuron at vertex $v_i$ is, at time $t + 1$, in the state 0 or 1 according to the probability law:
$$x_i(t+1) = \begin{cases} 1 & \text{with probability } p_i = \dfrac{1}{1 + e^{-\beta h_i(x(t))}}, \\[2mm] 0 & \text{with probability } 1 - p_i, \end{cases} \tag{64}$$
where $\beta$ is a characteristic parameter of the neurons and $h_i(t)$ is an effective field defined by
$$h_i(t) = h_i(x(t)) = \sum_{j=1}^{n} W_{ij}\,x_j(t); \tag{65}$$
to simplify, the threshold of the neurons is chosen to be zero. The updating procedure (64) does not by itself define the dynamics of the network; we need to specify how the states of the network are updated. In the following, we consider two extreme cases, where the updating is sequential or parallel, respectively.

5.2 Sequential dynamics
In a sequential Boltzmann machine [1,2] ($B_{seq}(n)$) one neuron is updated at each time step according to the probabilistic law (64), this neuron being chosen with the uniform law. This procedure determines a Markov chain $x(t)$ on the set of configurations $X = \{0,1\}^n$, defined by the transition probability:
$$P[x(t+1) = v \mid x(t) = u] = \begin{cases} P_k(u) & \text{if } v = u^{(k)}, \\ 1 - \sum_{k=1}^{n} P_k(u) & \text{if } v = u, \\ 0 & \text{otherwise.} \end{cases} \tag{66}$$
The configuration $u^{(k)} = \{u_1, \ldots, 1 - u_k, \ldots, u_n\}$, $u_j \in \{0,1\}$, is obtained from $u = \{u_1, \ldots, u_k, \ldots, u_n\}$ by the change of the state of the neuron $x_k$ alone. The transition probability of this change is given by
$$P_k(u) = P[x(t+1) = u^{(k)} \mid x(t) = u] = \frac{1}{n}\,\frac{1}{1 + e^{\beta\Delta E_k(u)}}, \tag{67}$$
with $\Delta E_k(u) = E(u^{(k)}) - E(u)$, where $E(u)$ is the energy function
$$E(u) = -\frac{1}{2}\sum_{i,j=1}^{n} W_{ij}u_iu_j = -\sum_{i<j} W_{ij}u_iu_j. \tag{68}$$
The transition probability (66) defines an irreducible Markov chain with the Gibbs probability measure $G_\beta(u)$ as unique stationary probability,
$$G_\beta(u) = \frac{e^{-\beta E(u)}}{Z(\beta)}, \qquad Z(\beta) = \sum_{u \in \{0,1\}^n} e^{-\beta E(u)}, \tag{69}$$
or, taking the logarithm,
$$\ln G_\beta(u) = \beta\sum_{i<j} W_{ij}u_iu_j - \ln Z(\beta). \tag{70}$$
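The statement that the Gibbs measure (69) is stationary for the sequential chain (66)-(67) can be checked numerically on a small network. The sketch below enumerates all $2^n$ configurations; the helper names and the weight matrix `W` are hypothetical choices for this example, and the uniform choice of the updated neuron enters as the factor $1/n$.

```python
import itertools
import math

def energy(u, W):
    """Energy function of Eq. (68), summed over pairs i < j."""
    n = len(u)
    return -sum(W[i][j] * u[i] * u[j] for i in range(n) for j in range(i + 1, n))

def gibbs(W, beta):
    """Gibbs measure of Eq. (69) on all configurations of {0,1}^n."""
    states = list(itertools.product((0, 1), repeat=len(W)))
    weights = [math.exp(-beta * energy(u, W)) for u in states]
    Z = sum(weights)
    return states, [w / Z for w in weights]

def sequential_kernel(W, beta):
    """Transition matrix of Eq. (66): one uniformly chosen neuron may flip."""
    n = len(W)
    states = list(itertools.product((0, 1), repeat=n))
    index = {u: a for a, u in enumerate(states)}
    T = [[0.0] * len(states) for _ in states]
    for u in states:
        a = index[u]
        for k in range(n):
            v = u[:k] + (1 - u[k],) + u[k + 1:]      # configuration u^(k)
            dE = energy(v, W) - energy(u, W)
            T[a][index[v]] = (1.0 / n) / (1.0 + math.exp(beta * dE))
        T[a][a] = 1.0 - sum(T[a])                    # probability of staying at u
    return states, T

# Hypothetical symmetric weights with zero diagonal.
W = [[0.0, 0.8, -0.5], [0.8, 0.0, 0.3], [-0.5, 0.3, 0.0]]
states, G = gibbs(W, beta=1.0)
_, T = sequential_kernel(W, beta=1.0)
pushed = [sum(G[a] * T[a][b] for a in range(len(states))) for b in range(len(states))]
print(max(abs(x - y) for x, y in zip(pushed, G)))   # deviation from stationarity
```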
The family of all Gibbs measures is a submanifold $\mathcal{Q}$ of $\mathcal{S}$, defined by the linear equations
$$\theta^{ij} = \beta W_{ij}; \tag{71}$$
then $\mathcal{Q}$ is a (+1)-linear submanifold of $\mathcal{S}$ which inherits the dually flat structure of $\mathcal{S}$. By the propositions of Section 3, because $\mathcal{Q}$ is a (1)-flat submanifold of $\mathcal{S}$, there exists a unique (-1)-self-parallel orthogonal projection $G_P$ on $\mathcal{Q}$ of the probability $P$. The projection $G_P$ is the best approximation of $P$ by a probability in $\mathcal{Q}$. By the Pythagorean theorem [5], if $G$ is a Gibbs measure we have for the Kullback-Leibler relative entropy
$$K(P;G) = K(P;G_P) + K(G_P;G).$$
Then the minimal value of the Kullback-Leibler relative entropy is fixed by the projection property and is given by
$$\min_G K(P;G) = K(P;G_P).$$
Then the learning method proposed by Ackley, Hinton and Sejnowski [2], which consists in minimizing the Kullback-Leibler relative entropy $K(P;G)$, is reduced to a motion of the current point $G$ toward $G_P$ in $\mathcal{Q}$. This learning procedure is implemented by the steepest descent gradient method, which consists in modifying the synaptic weight $W_{ij}$ into $W_{ij} + \delta W_{ij}$ with
$$\delta W_{ij} = -\varepsilon\sum_{\{kl\}} g^{ij,kl}\,\frac{\partial K(P;G)}{\partial W_{kl}} = \varepsilon\beta\sum_{\{kl\}} g^{ij,kl}\big\{E_P[u_ku_l] - E_G[u_ku_l]\big\}, \tag{72}$$
where $\varepsilon$ is a sufficiently small parameter. Let us remark that under this infinitesimal transformation the momentum coordinate $\eta_{ij} = \beta E_G[u_iu_j]$ of $W_{ij}$ in $\mathcal{Q}$ is changed into $\eta_{ij} + \delta\eta_{ij}$,
$$\delta\eta_{ij} = \sum_{\{kl\}} g_{ij,kl}\,\delta W_{kl}. \tag{73}$$
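The learning rule and the Pythagorean relation can be illustrated on a three-neuron machine. The sketch below is a simplification and not the rule (72) itself: it replaces the inverse metric $g^{ij,kl}$ by the identity (plain gradient descent), which reaches the same projection $G_P$ along a different path. The target distribution `P`, the comparison weights and all function names are assumptions for this example.

```python
import itertools
import math

def gibbs(W, beta, states):
    """Gibbs measure with ln G = beta * sum_{i<j} W_ij u_i u_j - ln Z."""
    n = len(W)
    weights = [math.exp(beta * sum(W[i][j] * u[i] * u[j]
                                   for i in range(n) for j in range(i + 1, n)))
               for u in states]
    Z = sum(weights)
    return [w / Z for w in weights]

def pair_moments(prob, states, n):
    """Expectations E[u_i u_j] for all pairs i < j."""
    return {(i, j): sum(p * u[i] * u[j] for p, u in zip(prob, states))
            for i in range(n) for j in range(i + 1, n)}

def kl(P, Q):
    """Kullback-Leibler relative entropy K(P;Q) = E_P[ln P - ln Q]."""
    return sum(p * math.log(p / q) for p, q in zip(P, Q) if p > 0.0)

def learn(P, states, n, beta=1.0, eps=1.0, steps=5000):
    """Plain gradient descent on K(P;G): W_ij += eps*beta*(E_P - E_G)[u_i u_j]."""
    W = [[0.0] * n for _ in range(n)]
    target = pair_moments(P, states, n)
    for _ in range(steps):
        model = pair_moments(gibbs(W, beta, states), states, n)
        for (i, j), m in target.items():
            step = eps * beta * (m - model[(i, j)])
            W[i][j] += step
            W[j][i] += step
    return W

n = 3
states = list(itertools.product((0, 1), repeat=n))
P = [0.10, 0.05, 0.15, 0.10, 0.05, 0.20, 0.10, 0.25]   # hypothetical target
W_star = learn(P, states, n)
G_P = gibbs(W_star, 1.0, states)                        # projection of P on Q
G = gibbs([[0.0, 0.5, -0.3], [0.5, 0.0, 0.2], [-0.3, 0.2, 0.0]], 1.0, states)
print(kl(P, G), kl(P, G_P) + kl(G_P, G))
```

The two printed numbers agree once the pair moments of `G_P` match those of `P`, which is the moment-matching characterization of the (-1)-projection.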
Then the gradient learning method for a $B_{seq}(n)$-machine consists in following the (+1)-self-parallel curve in $\mathcal{Q}$ from the current point $G$ to the (-1)-self-parallel orthogonal projection $G_P$ of $P \in \mathcal{S}$ on $\mathcal{Q}$. This method provides a minimization algorithm of the Kullback-Leibler relative entropy $K(P;G)$ in $G$.

5.3 Parallel dynamics
In a parallel Boltzmann machine [7,10] ($B_{par}(n)$) all the neurons are updated simultaneously; then, for thresholds taken at zero, the transition probability takes the form:
$$\Pi[x(t+1) = v \mid x(t) = u] = \prod_{i=1}^{n}\frac{1}{1 + e^{\beta(1-2v_i)h_i(u)}}, \qquad h_i(u) = \sum_{j=1}^{n} W_{ij}u_j. \tag{74}$$
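In this case the Markov chain defined by this transition probability also has a unique stationary probability measure:
$$\Pi_\beta(u) = \frac{1}{Z(\beta)}\prod_{i=1}^{n} e^{\frac{\beta}{2}h_i(u)}\cosh\Big(\frac{\beta}{2}h_i(u)\Big) = \frac{1}{2^n Z(\beta)}\prod_{i=1}^{n}\big(e^{\beta h_i(u)} + 1\big), \tag{75}$$
and the log-likelihood
$$\ln\Pi_\beta(u) = \frac{\beta}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} W_{ij}u_j + \sum_{i=1}^{n}\ln\cosh\Big(\frac{\beta}{2}\sum_{j=1}^{n} W_{ij}u_j\Big) + \ln\frac{1}{Z(\beta)}. \tag{76}$$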
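A direct numerical check of stationarity for the parallel chain is sketched below. It builds the full $2^n \times 2^n$ transition matrix of Eq. (74) and compares it with the normalized weights $\prod_i (1 + e^{\beta h_i(u)})$, which are proportional to Eq. (75). The function names and the weight matrix are assumptions for this example.

```python
import itertools
import math

def fields(u, W):
    """Effective fields h_i(u) = sum_j W_ij u_j."""
    return [sum(W[i][j] * u[j] for j in range(len(u))) for i in range(len(u))]

def parallel_kernel(W, beta):
    """Transition matrix of Eq. (74): all neurons updated simultaneously."""
    n = len(W)
    states = list(itertools.product((0, 1), repeat=n))
    T = []
    for u in states:
        h = fields(u, W)
        row = []
        for v in states:
            prob = 1.0
            for i in range(n):
                prob *= 1.0 / (1.0 + math.exp(beta * (1 - 2 * v[i]) * h[i]))
            row.append(prob)
        T.append(row)
    return states, T

def parallel_stationary(W, beta):
    """Normalized weights prod_i (1 + exp(beta h_i(u))), cf. Eq. (75)."""
    states = list(itertools.product((0, 1), repeat=len(W)))
    weights = [math.prod(1.0 + math.exp(beta * hi) for hi in fields(u, W))
               for u in states]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical symmetric weights with zero diagonal.
W = [[0.0, 0.8, -0.5], [0.8, 0.0, 0.3], [-0.5, 0.3, 0.0]]
states, T = parallel_kernel(W, beta=1.0)
Pi = parallel_stationary(W, beta=1.0)
pushed = [sum(Pi[a] * T[a][b] for a in range(len(states))) for b in range(len(states))]
print(max(abs(x - y) for x, y in zip(pushed, Pi)))   # deviation from stationarity
```

The check relies on the symmetry of `W`: the product $\Pi_\beta(u)\,\Pi[v \mid u]$ is then proportional to $\exp(\beta\sum_{ij} v_iW_{ij}u_j)$, which is symmetric in $u$ and $v$.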
Let us remark that for a parallel dynamics the submanifold $V$ of the output stationary measures has a boundary. Indeed, in this case the stationary measure depends on the $W_{ij}$ through terms of the form $z_{ij} = e^{\frac{\beta}{2}W_{ij}}\cosh\frac{\beta}{2}W_{ij}$, which are bounded below by $\frac{1}{2}$ (we discuss this aspect on a toy model in the next section). Moreover, for $n > 2$ the log-likelihood (76) cannot be written in the form of Eq. (62) with the $\theta^{i_1\ldots i_k}$ linearly related. Then the submanifold $V$ of the output stationary measures of a given parallel Boltzmann machine is not a (1)-linear submanifold of $\mathcal{S}$, and not even a (+1)-convex submanifold, in the sense that a (+1)-self-parallel curve joining two elements of $V$ need not lie in $V$.
The uniqueness of the (-1)-projection is not automatically ensured and requires a study of the (±1)-self-parallel curves. The classical learning algorithm is always based on the minimization by the gradient method of the Kullback-Leibler relative entropy
$$K(P;\Pi_\beta) = \sum_{u \in \{0,1\}^n} P(u)\ln\frac{P(u)}{\Pi_\beta(u)},$$
and leads to
$$\delta W_{ij} = -\varepsilon\sum_{\{kl\}} g^{ij,kl}\,\frac{\partial K(P;\Pi_\beta)}{\partial W_{kl}} = \varepsilon\beta\sum_{\{kl\}} g^{ij,kl}\big\{E_P[u_k(t)u_l(t-1)] - E_\Pi[u_k(t)u_l(t-1)] + E_P[u_k(t-1)u_l(t)] - E_\Pi[u_k(t-1)u_l(t)]\big\}, \tag{77}$$
which does not depend on the present configuration but rather on two consecutive points of the trajectory; the geometrical interpretation is then no longer simple and needs further study to characterize the learning trajectory.

6 A toy model

6.1 The manifolds
Let us consider the simplest Boltzmann machine with two neurons. It will be used as a toy model. Clearly, the associated statistical manifold has specific properties due to its small dimension, but it allows an explicit calculation and sheds some light on the learning of stochastic neural networks. In this model, the family of ambient probabilities consists of the probabilities on $X = \{0,1\}^2$. We denote by $x = \{u_2, u_1\}$ with $u_i \in \{0,1\}$ an element of $X$, and put
$$p(0,0) = p_0, \quad p(0,1) = p_1, \quad p(1,0) = p_2, \quad p(1,1) = p_3, \tag{78}$$
with $p_0 = 1 - p_1 - p_2 - p_3$. Let $\ell$ be the logarithm of a probability of the ambient family,
$$\ell = \ln p = \theta^1u_1 + \theta^2u_2 + \theta^{12}u_1u_2 - \Psi(\theta), \qquad \Psi(\theta) = -\ln p_0, \tag{79}$$
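with
$$\theta^1 = \ln\frac{p_1}{p_0}, \qquad \theta^2 = \ln\frac{p_2}{p_0}, \qquad \theta^1 + \theta^2 + \theta^{12} = \ln\frac{p_3}{p_0}. \tag{80}$$
The $B_{seq}(2)$-machine is characterised by the submanifold $B_{seq}$ of probabilities,
$$p_1 = p_2 = p_0 = \frac{1}{3+z}, \qquad p_3 = \frac{z}{3+z}, \qquad z = e^{W}, \tag{81}$$
or equivalently by the equations
$$p_0 = p_1 = p_2 = p, \qquad p_3 = 1 - 3p, \qquad 0 < p < \frac{1}{3}, \qquad \sum_{i=0}^{3} p_i = 1. \tag{82}$$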
The $B_{par}(2)$-machine is characterised by the submanifold $B_{par}$ of probabilities,
$$p_1 = p_2 = \frac{z}{(1+z)^2}, \qquad p_3 = \frac{z^2}{(1+z)^2}, \qquad z = e^{W}\cosh W; \tag{83}$$
it follows that $p_0 = \frac{1}{(1+z)^2}$, and the equations can be written
$$p_1 = p_2 = p, \qquad p = \sqrt{p_3} - p_3, \qquad 0 < p_i < 1, \qquad p_3 > \frac{1}{9}, \qquad \sum_{i=0}^{3} p_i = 1. \tag{84}$$
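The two parametrizations (81) and (83) are easy to tabulate, which makes the constraints (82) and (84) concrete. In the sketch below the curves are parametrized directly by $z$; the function names `b_seq` and `b_par` are hypothetical.

```python
import math

def b_seq(z):
    """Point (p0, p1, p2, p3) of B_seq, Eq. (81), for z > 0."""
    total = 3.0 + z
    return (1.0 / total, 1.0 / total, 1.0 / total, z / total)

def b_par(z):
    """Point (p0, p1, p2, p3) of B_par, Eq. (83), for z > 1/2."""
    total = (1.0 + z) ** 2
    return (1.0 / total, z / total, z / total, z * z / total)

for z in (0.5, 1.0, 4.0):
    p0, p1, p2, p3 = b_par(z)
    # The last two columns coincide: p = sqrt(p3) - p3 on B_par, Eq. (84).
    print(z, p3, p1, math.sqrt(p3) - p3)
```

At the limiting value $z = \frac{1}{2}$ the point of $B_{par}$ is $(p_0, p_1, p_2, p_3) = (\frac{4}{9}, \frac{2}{9}, \frac{2}{9}, \frac{1}{9})$, which is where the lower bound on $p_3$ in (84) comes from.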
Let us remark that the submanifold $B_{par}$ is bounded by $p_1 = p_2 = \frac{2}{9}$, $p_3 = \frac{1}{9}$.

6.2 Tangent and orthogonal vector fields to the B submanifolds
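Since the submanifolds $B_{seq}$ and $B_{par}$ have the common characteristic $p_1 = p_2$, it is natural to introduce the coordinate system $\{b = \frac{p_2 - p_1}{2},\ p = \frac{p_1 + p_2}{2},\ p_3\}$, in which the metric tensor takes the form
$$g = (db, dp, dp_3) \otimes \begin{pmatrix} \dfrac{2p}{p^2-b^2} & -\dfrac{2b}{p^2-b^2} & 0 \\[3mm] -\dfrac{2b}{p^2-b^2} & \dfrac{2p}{p^2-b^2} + \dfrac{4}{p_0} & \dfrac{2}{p_0} \\[3mm] 0 & \dfrac{2}{p_0} & \dfrac{1}{p_3} + \dfrac{1}{p_0} \end{pmatrix}\begin{pmatrix} db \\ dp \\ dp_3 \end{pmatrix} \tag{85}$$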
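with $p_0 = 1 - 2p - p_3$, and its inverse
$$g^{-1} = (\partial_b, \partial_p, \partial_{p_3}) \otimes \begin{pmatrix} \frac{1}{2}p - b^2 & \frac{b}{2}(1-2p) & -bp_3 \\[1mm] \frac{b}{2}(1-2p) & \frac{p}{2}(1-2p) & -pp_3 \\[1mm] -bp_3 & -pp_3 & p_3(1-p_3) \end{pmatrix}\begin{pmatrix} \partial_b \\ \partial_p \\ \partial_{p_3} \end{pmatrix}. \tag{86}$$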
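The restriction $g|_{\{b=0\}}$ of the metric $g$ to the submanifold $\{b = 0\}$ takes the form
$$g|_{\{b=0\}} = (db, dp, dp_3) \otimes \begin{pmatrix} \dfrac{2}{p} & 0 & 0 \\[3mm] 0 & \dfrac{2}{p} + \dfrac{4}{p_0} & \dfrac{2}{p_0} \\[3mm] 0 & \dfrac{2}{p_0} & \dfrac{1}{p_3} + \dfrac{1}{p_0} \end{pmatrix}\begin{pmatrix} db \\ dp \\ dp_3 \end{pmatrix} \tag{87}$$
with $p_0 = 1 - 2p - p_3$. The tangent vector field $V$ on $\{b = 0\}$ is defined, up to a multiplicative factor, by $V = \lambda\partial_p + \mu\partial_{p_3}$. The orthogonal vector field $N$ to $V$ in the ambient space, $N = \alpha\partial_b + \beta\partial_p + \gamma\partial_{p_3}$, is defined, up to a multiplicative factor, by the orthogonality condition
$$g|_{\{b=0\}}(N, V) = 0,$$
which gives
$$\frac{\gamma}{\beta} = r\,\frac{p_3}{p} \qquad \text{with} \qquad r = \frac{\mu p + \lambda(1 - p_3)}{\mu(p - \frac{1}{2}) - \lambda p_3}. \tag{88}$$
Then the vector field normal to the integral curve $B$ of the vector field $V = \lambda\partial_p + \mu\partial_{p_3}$ is given, up to a multiplicative factor, by
$$N \sim \beta\Big(\partial_p + r\,\frac{p_3}{p}\,\partial_{p_3}\Big) + \alpha\,\partial_b, \qquad (0, p, p_3) \in B. \tag{89}$$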
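The coefficient $r$ of Eq. (88) can be checked against the metric (87) numerically. The sketch below is a verification aid under stated assumptions: the function names are hypothetical, the tangent directions used for the two machines are obtained by differentiating the constraints (82) and (84) along the curves, and only the $(p, p_3)$ block of the metric is needed because the $\partial_b$ direction is orthogonal to $V$ at $b = 0$.

```python
def metric_block(p, p3):
    """(p, p3) block of the metric restricted to b = 0, Eq. (87)."""
    p0 = 1.0 - 2.0 * p - p3
    return ((2.0 / p + 4.0 / p0, 2.0 / p0),
            (2.0 / p0, 1.0 / p3 + 1.0 / p0))

def r_coefficient(lam, mu, p, p3):
    """Coefficient r of Eq. (88) for the tangent V = lam d_p + mu d_p3."""
    return (mu * p + lam * (1.0 - p3)) / (mu * (p - 0.5) - lam * p3)

def inner_product_N_V(lam, mu, p, p3):
    """g(N, V) for N = d_p + r (p3/p) d_p3, which should vanish."""
    g = metric_block(p, p3)
    r = r_coefficient(lam, mu, p, p3)
    N = (1.0, r * p3 / p)
    V = (lam, mu)
    return sum(N[a] * g[a][c] * V[c] for a in range(2) for c in range(2))

z = 2.0
# B_seq: p3 = 1 - 3p, so V is proportional to (1/3) d_p - d_p3.
r_seq = r_coefficient(1.0 / 3.0, -1.0, 1.0 / (3.0 + z), z / (3.0 + z))
# B_par: p = sqrt(p3) - p3, so V is proportional to (1/2)(1 - 1/z) d_p - d_p3.
r_par = r_coefficient(0.5 * (1.0 - 1.0 / z), -1.0,
                      z / (1.0 + z) ** 2, z * z / (1.0 + z) ** 2)
print(r_seq, r_par, -1.0 / z)
```

With these tangent directions the sketch gives $r = 0$ on $B_{seq}$ and $r = -1/z$ on $B_{par}$ for the sample value of $z$.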
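Now, we are interested in the family of (±1)-self-parallel curves $(b(t), p(t), p_3(t))$ orthogonal to the integral curves $B_{seq}$ and $B_{par}$ for $t = 0$. The (-1)-self-parallel curves normal to the integral curve $B$ in the ambient space are given by Eq. (59) with $b(0) = 0$,
$$b(t) = \dot b(0)\,t, \qquad p(t) = \dot p(0)\,t + p(0), \qquad p_3(t) = \Big(r\,\frac{\dot p(0)}{p(0)}\,t + 1\Big)p_3(0), \tag{90}$$
and the (1)-self-parallel curves normal in the ambient space are given by Eq. (60), also with $b(0) = 0$,
$$\begin{aligned} b(t) &= \frac{p(0)}{D(t)}\exp\Big\{\dot p(0)\,t\Big[\Big(2 + r\,\frac{p_3(0)}{p(0)}\Big)\frac{1}{p_0(0)} + \frac{1}{p(0)}\Big]\Big\}\sinh\frac{\dot b(0)\,t}{p(0)}, \\ p(t) &= \frac{p(0)}{D(t)}\exp\Big\{\dot p(0)\,t\Big[\Big(2 + r\,\frac{p_3(0)}{p(0)}\Big)\frac{1}{p_0(0)} + \frac{1}{p(0)}\Big]\Big\}\cosh\frac{\dot b(0)\,t}{p(0)}, \\ p_3(t) &= \frac{p_3(0)}{D(t)}\exp\Big\{\dot p(0)\,t\Big[\Big(2 + r\,\frac{p_3(0)}{p(0)}\Big)\frac{1}{p_0(0)} + \frac{r}{p(0)}\Big]\Big\}, \end{aligned} \tag{91}$$
where
$$D(t) = p_0(0) + \exp\Big[\Big(2 + r\,\frac{p_3(0)}{p(0)}\Big)\frac{\dot p(0)\,t}{p_0(0)}\Big] \times \Big\{p(0)\exp\Big(\frac{\dot p(0)\,t}{p(0)}\Big)\,2\cosh\frac{\dot b(0)\,t}{p(0)} + p_3(0)\exp\Big(r\,\frac{\dot p(0)\,t}{p(0)}\Big)\Big\}.$$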
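Let us first remark that the vector field normal to the integral curve $B$ in $(0, p(0), p_3(0)) \in B$ corresponds to $\beta = \dot p(0)$ and $\alpha = \dot b(0)$ arbitrary; for $V = \lambda\partial_p + \mu\partial_{p_3}$ it takes the form
$$N \sim \dot p(0)\Big(\partial_p + r\,\frac{p_3(0)}{p(0)}\,\partial_{p_3}\Big) + \dot b(0)\,\partial_b, \qquad (0, p(0), p_3(0)) \in B. \tag{92}$$

6.3 The submanifold Bseq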
The restriction of the tangent vector $V$ to the submanifold $B_{seq}$ is given by $\lambda = \frac{1}{3}$ and $\mu = -1$; up to a multiplicative factor,
$$V_{seq} \sim \frac{1}{3}\,\partial_p - \partial_{p_3}, \tag{93}$$
then
$$r_{seq} = 0 \qquad \text{and} \qquad \gamma_{seq} = 0. \tag{94}$$
The (-1)-self-parallel curves normal to the integral curve $B_{seq}$ in the ambient space are given by Eq. (59) with $b(0) = 0$; one gets
$$b(t) = \dot b(0)\,t, \qquad p(t) = \dot p(0)\,t + \frac{1}{3+z}, \qquad z = e^{W} > 0. \tag{95}$$
Figure 3. (±1)-self-parallel curves orthogonal to $B_{seq}$ with $\dot b(0) = 0$. (Panels: (-1)-self-parallel curves; (1)-self-parallel curves.)
They correspond to the line $p - \frac{\dot p(0)}{\dot b(0)}\,b = \frac{1}{3+z}$ in the surface $p_3(t) = \frac{z}{3+z}$. It follows that each distribution of probability $p(u_2,u_1)$ in the ambient space has a unique (-1)-projection on $B_{seq}$. The (1)-self-parallel curves orthogonal in the ambient space are given by Eq. (60), also with $b(0) = 0$,
$$\begin{aligned} b(t) &= \frac{\exp\{3(3+z)\dot p(0)\,t\}}{(3+z)\,D(t)}\,\sinh\{(3+z)\dot b(0)\,t\}, \\ p(t) &= \frac{\exp\{3(3+z)\dot p(0)\,t\}}{(3+z)\,D(t)}\,\cosh\{(3+z)\dot b(0)\,t\}, \qquad z = e^{W}, \\ p_3(t) &= \frac{z\,\exp\{2(3+z)\dot p(0)\,t\}}{(3+z)\,D(t)}, \end{aligned} \tag{96}$$
where
$$D(t) = \frac{1}{3+z}\Big[1 + \exp\{2(3+z)\dot p(0)\,t\}\big(2\exp\{(3+z)\dot p(0)\,t\}\cosh\{(3+z)\dot b(0)\,t\} + z\big)\Big].$$
It follows that each distribution of probability in the open simplex $0 < p(u_2,u_1) < 1$ has a unique (1)-projection on $B_{seq}$. However, this is not true for the limit points $p = p_3 = 0$ and $p = 0.5$, $p_3 = 0$, corresponding to $t \to \mp\infty$, which have an infinity of orthogonal projections on $B_{seq}$. Let us note that for these probabilities the Kullback-Leibler relative entropy $K(G; P(= p_\infty))$, where $G \in B_{seq}$, diverges.
6.4 The submanifold Bpar
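The restriction of $V$ to the submanifold $B_{par}$ corresponds to $\lambda = \frac{1}{2}\big(1 - \frac{1}{z}\big)$ and $\mu = -1$,
$$V_{par} \sim \frac{1}{2}\Big(1 - \frac{1}{z}\Big)\partial_p - \partial_{p_3}, \tag{97}$$
then
$$r_{par} = -\frac{1}{z}, \qquad z = e^{W}\cosh W > \frac{1}{2}, \tag{98}$$
and
$$\gamma_{par} = -\beta_{par}. \tag{99}$$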
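The (-1)-self-parallel curves orthogonal to the integral curve $B_{par}$ in the ambient space are given by Eq. (59) with $b(0) = 0$,
$$b(t) = \dot b(0)\,t, \qquad p(t) = \dot p(0)\,t + \frac{z}{(1+z)^2}; \tag{100}$$
they lie on the surfaces
$$p + p_3 = \frac{z}{1+z}.$$
Therefore, due to the constraint $z > \frac{1}{2}$, one deduces that the probability distributions which are such that $p + p_3 < \frac{1}{3}$ do not have a (-1)-projection on $B_{par}$, but all the others have a unique (-1)-projection on $B_{par}$. The (+1)-self-parallel curves orthogonal to $B_{par}$, in the ambient space, are given by Eq. (60), also with $b(0) = 0$,
$$\begin{aligned} b(t) &= \frac{z}{(1+z)^2 D(t)}\exp\Big\{\frac{(1+z)^3}{z}\,\dot p(0)\,t\Big\}\sinh\Big\{\frac{(1+z)^2}{z}\,\dot b(0)\,t\Big\}, \\ p(t) &= \frac{z}{(1+z)^2 D(t)}\exp\Big\{\frac{(1+z)^3}{z}\,\dot p(0)\,t\Big\}\cosh\Big\{\frac{(1+z)^2}{z}\,\dot b(0)\,t\Big\}, \end{aligned} \tag{101}$$
where
$$D(t) = \frac{1}{(1+z)^2}\Big[1 + 2z\exp\Big\{\frac{(1+z)^3}{z}\,\dot p(0)\,t\Big\}\cosh\Big\{\frac{(1+z)^2}{z}\,\dot b(0)\,t\Big\} + z^2\exp\Big\{\frac{(1+z)^3(z-1)}{z^2}\,\dot p(0)\,t\Big\}\Big].$$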
Figure 4. (±1)-self-parallel curves orthogonal to $B_{par}$ with $\dot b(0) = 0$. (Panels: (-1)-self-parallel curves; (1)-self-parallel curves.)
they lie on the surfaces
$$\ln\frac{z(1 - 2p - p_3)}{\sqrt{p^2 - b^2}} = z\,\ln\frac{p_3}{z\sqrt{p^2 - b^2}}.$$
Here we are faced with a new situation: again there is an open subset of the ambient space for which probability distributions do not have a (+1)-projection on $B_{par}$, but probability distributions belonging to the complementary subset admit a two-fold (+1)-projection on $B_{par}$, characterized by $z > 1$ and $z < 1$. This is reflected in the fact that there are three kinds of limit points:
• $(p = \frac{1}{2},\ p_3 = 0)$, which projects on the whole of $B_{par}$,
• $(p = 0,\ p_3 = 0)$, which projects on the half of $B_{par}$ corresponding to $z > 1$,
• $(p = 0,\ p_3 = 1)$, which projects on the other half, where $\frac{1}{2} < z < 1$.
(See Fig. 4, exhibiting graphs of some self-parallel curves characterised by the initial condition $\dot b(0) = 0$, therefore lying in the plane $b = 0$ of this Euclidean drawing.)

Remark. The learning of the $B_{par}(2)$-machine by minimization of the Kullback-Leibler relative entropy allows one to reach only a subfamily of the ambient probability distributions. This toy model has specific properties due to the low dimension of the underlying space; nevertheless it shows that parallel computations can lead to a domain where the algorithm does not converge, even if the sequential ones converge.
Acknowledgments

Two of us, Ph. C. and H. N., are partially supported by the projects PRAXIS XXI and pluriannual, Portugal.

References

1. E. Aarts, J. Korst, Simulated Annealing and Boltzmann Machines. A Stochastic Approach to Combinatorial Optimization and Neural Computing (Wiley and Sons, 1989).
2. D. Ackley, G. Hinton, T. Sejnowski, A learning algorithm for Boltzmann machines, Cognitive Science 9, 147-169 (1985).
3. S.-I. Amari, Differential Geometrical Methods in Statistics, Lecture Notes in Statistics 28 (Springer Verlag, 1985).
4. S.-I. Amari, Differential Geometrical Theory of Statistics - Towards New Developments, in Differential Geometry in Statistical Inference, ed. S.S. Gupta, IMS Lecture Notes Monograph Series 10, 19-94 (1987).
5. S.-I. Amari, Information Geometry of Boltzmann Machine, IEEE Transactions on Neural Networks 3, 260-271 (1992).
6. S.-I. Amari, Information Geometry, Contemporary Mathematics 203, 81-95 (1997).
7. R. Azencott, Synchronous Boltzmann machines: Learning rules, in Proceedings of Neural Networks (les Arcs), NATO series (Springer Verlag, 1990).
8. R. Azencott, A. Doutriaux, L. Younes, Synchronous Boltzmann machines and outline based image classification, in Proceedings of Neural Networks (les Arcs), INCC (Paris, 1990).
9. R. Azencott, Boltzmann machines: Higher-order interactions and synchronous learning, in Stochastic Models, Statistical Methods, and Algorithms in Image Analysis, Proc. Spec. Year Image Anal., Rome, Italy 1990, Lect. Notes Stat. 74, 14-45 (1992).
10. B. Apolloni, D. de Falco, Learning by parallel Boltzmann machines, IEEE Transactions on Information Theory 37, 1162 (1991).
11. O. Barndorff-Nielsen, Parametric Statistical Models and Likelihood, Lecture Notes in Statistics 50 (Springer Verlag, 1988).
12. G. Burdet, Ph. Combe, H. Nencka, Statistical Manifolds, Self-parallel Curves and Learning Processes, Progress in Probability 45, 87-99 (Birkhauser, 1998).
13. G. Burdet, H. Nencka, M. Perrin, An Example of Dynamical Behaviour of the Relative Entropy, Contemporary Mathematics 203, 97-104 (1997).
14. G. Burdet, H. Nencka, Equation of Self-Parallel Curve Deviation on Statistical Manifolds, Methods of Functional Analysis and Topology 3, 46-50 (1997).
15. N.N. Chensov, Statistical Decision and Optimal Inference, in Russian (Nauka, Moscow, 1972), English translation AMS 53 (Providence, R.I., 1982).
16. Ph. Combe, H. Nencka, Information Geometry and Learning in Formal Neural Networks, Contemporary Mathematics 203, 105-116 (1997).
17. H. Cramer, Mathematical Methods of Statistics (Princeton University Press, 1946).
18. I. Csiszar, I-divergence geometry of probability distributions and minimization problems, Ann. of Probab. 3, 146-158 (1975).
19. I. Csiszar, J. Korner, Information Theory (Academic Press, 1981).
20. A. P. Dawid, Discussion on Pr. Efron's paper (1975), Ann. Stat. 3, 1231-1234 (1975).
21. A. P. Dawid, Further comments on some comments on a paper by Bradley Efron, Ann. Stat. 5, 1242 (1977).
22. B. Efron, Defining the curvature of a statistical problem, Ann. Stat. 3, 1189-1242 (1975).
23. B. Efron, The geometry of exponential families, Ann. Stat. 6, 362-376 (1978).
24. W. Feller, An Introduction to Probability Theory and its Applications, 3rd ed. (Wiley, 1968).
25. R.A. Fisher, Theory of statistical estimation, Proc. Camb. Phil. Soc. 22, 700-725 (1925).
26. R.J. Glauber, Time-dependent statistics of the Ising model, J. Math. Phys. 4, 294-307 (1963).
27. G. Hinton, T. Sejnowski, Learning and relearning in Boltzmann machines, in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, eds. J.L. McClelland, D.E. Rumelhart, Vol. 1, ch. 7, 147-169 (MIT Press, Cambridge, 1986).
28. S. Kullback, Information Theory and Statistics (Wiley, New York, 1959).
29. S. Kullback, R. Leibler, On Information and Sufficiency, Ann. Math. Stat. 22, 79-86 (1951).
30. M.K. Murray, J.W. Rice, Differential Geometry and Statistics (Chapman & Hall, 1990).
31. P. Peretto, Collective properties of neural networks: a statistical physics approach, Biological Cybernetics 50, 45-62 (1984).
32. D.B. Picard, Statistical morphisms and related invariance properties, Ann. Inst. Statist. Math. 44, 45-61 (1992).
33. C.R. Rao, Information and the accuracy attainable in the estimation of statistical parameters, Bull. Calcutta Math. Soc. 37, 81-91 (1945).
34. I. Vajda, Theory of Statistical Inference and Information (Kluwer, 1989).
BRYDGES' OPERATOR IN RENORMALIZATION THEORY

PIERRE CARTIER
Ecole Normale Superieure, 45 rue d'Ulm, F-75230 Paris Cedex 05, France

CECILE DEWITT-MORETTE
Center for Relativity and Department of Physics, The University of Texas at Austin, Austin, Texas 78712-1081

This note is a tribute to both Ludwig Streit and Sergio Albeverio. Given the schedules of the conferences in their honor and the publications of their festschrifts, as well as our current interest in their works, it is more appropriate for us to make our contributions to these celebrations a joint one. Our joint contribution to both Streit and Albeverio makes scientific sense because we show that the white noise calculus of Streit and the Albeverio-Hoegh-Krohn functional integrals are two examples of the same scheme. The scheme is introduced in our paper "Scaling and Functional Integration" submitted recently to the Canadian Mathematical Society Proceedings, which will publish the Albeverio Conference Proceedings. It is summarized here. The initial steps were presented at the Streit Conference in Lisbon.

Following the work of D.C. Brydges, J. Dimock, and T.R. Hurd, we define a field $\varphi(x)$ for $x$ in $\mathbb{R}^4$, or $x$ in spacetime $\mathbb{R}^{1,3}$, as an integral of scale dependent fields over a scaling variable $l \in [0, \infty[$. Let $I$ be a functional integral over such fields of the type used in Quantum Field Theory,
$$I = \int_{\mathbb{X}} \mathcal{D}_{\mathbb{X}}\varphi \cdot \exp\Big(\frac{i}{\hbar}S(\varphi)\Big), \tag{1}$$
which can be rewritten in terms of a gaussian integrator $\mu_G$,
$$I = \int_{\mathbb{X}} d\mu_G(\varphi) \cdot \exp\Big(\frac{i}{\hbar}S_{int}(\varphi)\Big) = \Big\langle \mu_G, \exp\frac{i}{\hbar}S_{int} \Big\rangle. \tag{2}$$
For instance, if the covariance $G$ is given by
$$G(|x - y|) = \frac{C_d}{|x - y|^{d-2}} \qquad \text{for } x, y \in \mathbb{R}^d, \tag{3}$$
one decomposes it into scale dependent contributions,
$$G(\xi) = \int_0^\infty d^\times l \cdot S_l u(\xi), \qquad d^\times l = dl/l, \tag{4}$$
where
$$S_l u(\xi) = l^{[u]}\,u(\xi/l), \qquad [u] = \text{physical length dimension of } u, \tag{5}$$
and
$$\int_0^\infty d^\times k \cdot k^{-[u]}\,u(k) = C_d. \tag{6}$$
The domain of integration of the scaling variable $[0, \infty[$ can be broken into a sum $\sum_{j=-\infty}^{+\infty} [2^j l_0, 2^{j+1} l_0[$. The corresponding decomposition of the covariance,
$$G = \sum_{j=-\infty}^{+\infty} G_{[2^j l_0,\,2^{j+1} l_0[}, \tag{7}$$
induces decompositions of the fields
$$\varphi = \sum_{j=-\infty}^{+\infty} \varphi_{[2^j l_0,\,2^{j+1} l_0[}, \tag{8}$$
of the gaussian $\mu_G$, and of all quantities defined in terms of $\mu_G$, in particular the functional laplacian (see footnote a)
$$\Delta_G = \frac{s}{2\pi}\int dx\int dy\,G(x,y)\,\frac{\delta^2}{\delta\varphi(x)\,\delta\varphi(y)}, \qquad s \in \{1, i\}, \tag{9}$$
the Bargmann-Segal transform
$$B_G = \mu_G * = \exp\tfrac{1}{2}\Delta_G, \tag{10}$$
and its inverse the Wick transform
$$:\ :_G = \exp\big(-\tfrac{1}{2}\Delta_G\big). \tag{11}$$
Henceforth the suffix $G$ is replaced by the interval defining the scale dependent covariance; for example
$$\mu_{[l_0,\infty[} = \mu_{[l_0,l[} * \mu_{[l,\infty[}. \tag{12}$$
A key integral in Quantum Field Theory can be written, with the notation of (2),
$$I(l_0) = \langle \mu_{[l_0,\infty[}, Z \rangle = \langle \mu_{[l,\infty[}, \mu_{[l_0,l[} * Z \rangle. \tag{13}$$
The convolution $\mu_{[l_0,l[} * Z$ integrates out the contributions of the scale dependent fields in the range $[l_0, l[$; it substitutes to the integrand $Z$ an $l$-dependent effective integrand; in the case of (1) (rewritten as (2)), it substitutes an effective action to the action $S$. An important operator $P_l$ introduced by Brydges provides the means for constructing a parabolic equation satisfied by the effective action. The operator $P_l$ rescales the Bargmann-Segal transform so that all integrals are performed with an $l$-independent gaussian; indeed let
$$P_l = S_{l/l_0}\,B_{[l_0,l[} \tag{14}$$
with $S$ defined by (5) and $B$ defined by (10), then
$$\langle \mu_{[l_0,\infty[}, Z \rangle = \langle \mu_{[l_0,\infty[}, P_l Z \rangle. \tag{15}$$

Footnote a: With our normalization the covariance is given by $\int_{\mathbb{X}} d\mu_G(\varphi)\,\varphi(x)\varphi(y) = \frac{s}{2\pi}\,G(x,y)$.
+ \£}{PtZ),
P(0Z = Z
(16)
where t=fn
HhA-
(17)
Equation (16) determines the renormalization group flow equation. We are now in a position to bring together the functional integrals of Streit and Albeverio, as well as several other functional integrals. In (1) and (2), X is the domain of integration of the functional integral. Let ^"(X) be the space of functional, F : X -> K, integrable with respect to the chosen dimensionless volume element dnc\ then T(X) is the domain of the Bargmann-Segal transform BG : .F(X) -► Fock(X).
(18)
In some of Streit's works, JT(X) = L 2 (X,/i G );
(19)
for Albeverio-Hoegh-Krohn, if Fx(
with A a bounded measure on the dual X' of X, then Fx e J-(X).
(20)
168
Other cases include a space .F(X) of polynomials Fn(
exp(27ri{J,
(21) J=0
or a space ^"(X) of distributions whose Bargmann-Segal transforms are entire functional on the complexified Xc of X. In all cases the Brydges operator Pi rescales the convolution BQ = HG *■
FLUCTUATION OF THE BOSE-EINSTEIN CONDENSATE IN A TRAP

H. EZAWA
Department of Physics, Gakushuin University, Mejiro, Toshima-ku, Tokyo 171-8588, Japan
E-mail: hiroshi.ezawa@gakushuin.ac.jp

K. NAKAMURA
Division of Natural Science, Meiji University, Izumi Campus, Eifuku, Suginami-ku, Tokyo 168-8555, Japan
E-mail: [email protected]

K. WATANABE
Department of Physics, Meisei University, Hino, Tokyo 191-8506, Japan
E-mail: [email protected]

Y. YAMANAKA
Waseda University Senior High School, Kamishakujii, Nerima-ku, Tokyo 177-0044, Japan
E-mail: [email protected]

With the modified Bogoliubov replacement $a_0 \to \sqrt{N_0} + a_0'$ we study the fluctuation of the trapped condensate, $N_0$ atoms on average, taking into account the interaction between particles inside and outside the condensate. It is indicated that the operator $a_0'$ should in effect be of O(N0' ) for large $N_0$ if the interaction is repulsive, based on some anticipation of numerical results still to be worked out.
1 Introduction

Since the first observation of the Bose-Einstein condensation (BEC) in 1995 [1] with $2 \times 10^3$ trapped atoms of $^{87}$Rb at 170 nK, followed by the cases [2,3,4] of $^{23}$Na, $^{7}$Li and down to $^{1}$H, experimental and theoretical works have exposed different aspects of this interesting quantum effect. Yet, basic questions remain to be answered. For the free Bose gas, the condition for the BEC is that its number density $\rho$ be high and temperature $T$ low such that $\rho\lambda^3 > 2.612$, where $\lambda$ is the thermal de Broglie wavelength (see footnote a), $\lambda := \sqrt{2\pi\hbar^2/mk_BT}$. This condition derives from the requirement of maximum entropy. In order to see how the condition is affected by the particle interactions, we have to know the excitation spectrum of the system. The basic tool for the analysis has been the Bogoliubov prescription [5] to replace $a_0$ by $\sqrt{N_0}$, where $a_0$ is the annihilation operator and $N_0$ the number of the condensate particles. In 1967, we proposed [6] to modify the Bogoliubov replacement such that
$$a_0 \to \sqrt{N_0} + a_0' \tag{1}$$
by adding another annihilation operator $a_0'$ to preserve the canonical commutation relations, and studied the fluctuation of the condensate to find out the condition that $a_0'$ be effectively small relative to $\sqrt{N_0}$. However, the analysis was difficult because the Goldstone theorem dictates that the energy spectrum of the system with translation invariant interaction extends down to zero as the associated momentum goes to 0, leading to a diverging expectation value of $a_0'^{\dagger}a_0'$ if we simply let the volume of the periodic box be infinite, as the momentum summation tempts us to. With the atoms trapped in a finite volume, we are free from the Goldstone theorem on the one hand, and on the other no longer have the representation question [7] of the canonical commutation relations compelling us to the Bogoliubov replacement. We wish to confirm that the replacement (1), and hence the original one, will serve as a valid method of approximation when the condensation takes place.

Footnote a: This is not the de Broglie wavelength $\lambda_d = 2\pi\hbar/p$ for $p^2/(2m) = 3k_BT/2$.

2 Hamiltonian

2.1 Modified Bogoliubov replacement
A(X)
= V a n " n ( ^ ) with a complete n
orthonormal system of one-particle states {un(x)} and annihilation operators {a„}, the Hamiltonian of the system is given formally by H := JtfA(x) [~A
+ v(x) -
Mj
<j>A(x)d3x
+ \ J
(2)
where v(x) is the trap potential and V(x) = V(—x) the interaction, the former varying much more slowly than the latter; /i is the chemical potential. Suppose the system has iV0 3> 1 condensate particles in the state u0(x) to suggest the replacement (1). Hereafter, we shall simply write ao for a 0 .
171
Then, the field operator takes the form,
(3)
where
2.2
Hartree Potential
If we put (3) in (2), then terms linear in
I
+
± j 2
— ^
A
+ v(x)
+ VH(x)
- fl1
4>\(x)4>\{x')V[x-x')fl>K{x)4>Kix')#x(Px'
/ ■
(4)
by adding and subtracting the term, J
(5)
Now, we choose the system {u„(cc)} to be the eigenfunctions of h2 h=-—A
+ v(x) + vH(x),
(6)
that is to say that u„'s are the self-consistent solutions to hun(x) = enun(x), where n = 0,1,2, ••• in the linear order of increasing e n ; sometimes n is taken to mean a multi-index, arranged yet in the same order. We can take a system of real-valued functions {un(x)}, which will be better suited for later calculations. Then, we have H = E0 + ^ ( e n - /i)oJ,a„ + V^Vo(e0 - /i)(ao + a o) n
+
~rfr
5 2 ■^(mn(a}aman + a/am a n) + -rr /,m,n
5 2 LMmnaka\aman K,/,m,n
(7)
172
where E0 := N0e0 - l-N2 j u2{x)V{x
- x')ul(x')d3xd3x'
- i ] T Jnn,
(8)
and
$$L_{klmn} := N_0\int u_k(x)u_l(x')V(x - x')u_m(x)u_n(x')\,d^3x\,d^3x',$$
which has the symmetry
$$L_{klmn} = L_{lknm} = L_{mnkl}. \tag{9}$$
Further,
$$J_{mn} = L_{00mn}, \qquad K_{lmn} = L_{0lmn}. \tag{10}$$

3 Delta-Function Interaction

In view of the long wavelengths of the atoms under consideration compared with the range of their interaction potential $V$, people take the delta-function approximation,
$$V(x - x') = g\,\delta(x - x'), \qquad \text{with } g = \frac{4\pi\hbar^2a}{M}, \tag{11}$$
where $M$ is the mass and $a$ the scattering length of the atoms. Then, (5) turns out to be $v_H(x) = gN_0u_0^2(x)$ and the eigenvalue equation for (6) becomes
$$\Big\{-\frac{\hbar^2}{2M}\Delta + v(x) + gN_0u_0^2(x)\Big\}u_n(x) = \epsilon_nu_n(x), \tag{12}$$
of which the one for the ground state $n = 0$ is the Gross-Pitaevski equation [8]. We remark that while this particular equation for $n = 0$ is a nonlinear Schrodinger equation, those with $n \neq 0$ are not, being linear Schrodinger equations with a given potential $v(x) + gN_0u_0^2(x)$. Therefore, the eigenfunctions $\{u_n\}$ constitute a complete orthonormal system. (9) becomes
$$L_{klmn} = gN_0\int u_k(x)u_l(x)u_m(x)u_n(x)\,d^3x, \tag{13}$$
acquiring full symmetry under any permutation of the suffices. We have to anticipate that (11) may cause some troubles [9]. We shall be ready to truncate the matrix $(L_{klmn})$ when necessary.
3.1
Splitting the Hamiltonian
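Let us now split the Hamiltonian (7) into four parts, $H = H_C + H_B + H_{BC} + E_0$, such that the first part involves only the operators $a_0$ and $a_0^\dagger$, pertaining to the condensate particles, the second only those $a_n$, $a_n^\dagger$ $(n \neq 0)$ of the particles outside the condensate, the third both. We split each of the operator parts of the Hamiltonian further with respect to the powers in $\lambda := 1/\sqrt{N_0}$,
$$H_\Gamma = \lambda^{-1}H_\Gamma^{(-1)} + H_\Gamma^{(0)} + \lambda H_\Gamma^{(1)} + \lambda^2 H_\Gamma^{(2)} \qquad (\Gamma = C, B, BC) \tag{14}$$
and correspondingly
$$\mu := \mu^{(0)} + \lambda\mu^{(1)} + \lambda^2\mu^{(2)} + \cdots. \tag{15}$$
Thus, the condensate part is given by
$$\begin{aligned}
H_C^{(-1)} &= (\varepsilon_0 - \mu^{(0)})(a_0^\dagger + a_0),\\
H_C^{(0)} &= (\varepsilon_0 - \mu^{(0)})\,a_0^\dagger a_0 - \mu^{(1)}(a_0^\dagger + a_0) + \tfrac{1}{2}J_{00}(a_0^\dagger + a_0)^2,\\
H_C^{(1)} &= -\mu^{(1)}a_0^\dagger a_0 - \mu^{(2)}(a_0^\dagger + a_0) + J_{00}(a_0^{\dagger 2}a_0 + a_0^\dagger a_0^2),\\
H_C^{(2)} &= -\mu^{(2)}a_0^\dagger a_0 - \mu^{(3)}(a_0^\dagger + a_0) + J_{00}\,a_0^{\dagger 2}a_0^2,
\end{aligned} \tag{16}$$
the out-of-the-condensate part by
$$\begin{aligned}
H_B^{(0)} &= {\sum_n}'(\varepsilon_n - \mu^{(0)})\,a_n^\dagger a_n + \tfrac{1}{2}{\sum_{m,n}}' J_{mn}(a_m^\dagger + a_m)(a_n^\dagger + a_n),\\
H_B^{(1)} &= -\mu^{(1)}{\sum_n}' a_n^\dagger a_n + {\sum_{l,m,n}}' J_{lmn}(a_l^\dagger a_m a_n + a_l^\dagger a_m^\dagger a_n),\\
H_B^{(2)} &= -\mu^{(2)}{\sum_n}' a_n^\dagger a_n + {\sum_{k,l,m,n}}' L_{klmn}\,a_k^\dagger a_l^\dagger a_m a_n,
\end{aligned} \tag{17}$$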
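and finally the interaction between the particles in and out of the condensate by
$$\begin{aligned}
H_{BC}^{(0)} &= {\sum_n}' J_{n0}(a_n^\dagger + a_n)(a_0^\dagger + a_0),\\
H_{BC}^{(1)} &= {\sum_{m,n}}' J_{mn}\{2(a_0^\dagger + a_0)\,a_m^\dagger a_n + a_0\,a_m^\dagger a_n^\dagger + a_0^\dagger a_m a_n\} + {\sum_n}' J_{n0}\{a_0^{\dagger 2}a_n + 2a_0^\dagger a_0(a_n^\dagger + a_n) + a_0^2 a_n^\dagger\},\\
H_{BC}^{(2)} &= {\sum_{m,n}}' J_{mn}(a_0^{\dagger 2}a_m a_n + 4a_0^\dagger a_0\,a_m^\dagger a_n + a_0^2\,a_m^\dagger a_n^\dagger) + 2{\sum_n}' J_{n0}(a_0^{\dagger 2}a_0 a_n + a_n^\dagger a_0^\dagger a_0^2).
\end{aligned} \tag{18}$$
Here and in the following, $\sum'$ means summation that excludes $n = 0$.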
3.2 Some estimates
Since the mode functions $u_n$ are normalized, they are of the order of (volume of the condensate)$^{-1/2}$, so that the coefficients $L_{klmn} \sim g\rho$ with $\rho$ being the number density of the condensate. In the experiments reported [1, 2, 3], the trap potential $v(x)$ can be assumed to be of harmonic oscillator type, and the corresponding numerical solutions to the Gross-Pitaevski equation are known [11], so that $v_H(x)$ can be calculated. For the sake of orientation, however, we take
$$v(x) + v_H(x) = \frac{M}{2}\{\nu_\perp^2(x^2 + y^2) + \nu_z^2 z^2\}. \tag{19}$$
Then, the eigenvalue problem (12) has the solution, in the triple index notation $n = (n_x, n_y, n_z)$,
$$\varepsilon_{n_x n_y n_z} = \left\{(n_x + n_y + 1)\,\nu_\perp + \left(n_z + \tfrac{1}{2}\right)\nu_z\right\}\hbar, \tag{20}$$
with
$$u_{n_x n_y n_z}(x) = N_xN_yN_z\,H_{n_x}(\xi)\,H_{n_y}(\eta)\,H_{n_z}(\zeta)\,e^{-(\xi^2 + \eta^2 + \zeta^2)/2}, \tag{21}$$
where $\xi = \alpha_x x$ etc. and
$$\alpha_j = \sqrt{\frac{M\nu_j}{\hbar}}, \qquad N_j = \left(\frac{\alpha_j}{\sqrt{\pi}\,2^{n_j}\,n_j!}\right)^{1/2} \qquad (j = x, y, z).$$
J°° Hn(0Hm(Z)e-2t2dS = (-!)("—)/2 N / 2 ^^T r fm + n + 1 if n + m = even. Also /
Hk(0Hn(0Hm(0e-2i2dt oo
_ Kj2^r, +m-,T(n + m-k
+ l)r(k
+ n-m+l)T(m
+
k
-
n + 1
■n \ 2 ) \ 2 ) \ 2 if k-n — m =even. The integrals vanish otherwise. So, for instance, we obtain (22)
175 where
is a measure of the number density of the condensate since l/oy measure the extension of the condensate in the direction j , and ^nxnynz\mxmym.T
( r ( _ 1 ) (n J -m j ) /2
TT
(nj+m^-l)!!
\
even)
= <
0
(otherwise)
Table 1. Values of $(n+m-1)!!/(2^{(n+m)/2}\sqrt{n!\,m!})$. Case of $(n, m)$ even-even. The matrix is symmetric. Its lower-left part is suppressed and so is 1 for (0, 0).

 m\n      0      2      4      6      8     10     12     14     16     18     20
   0          0.353  0.153  0.070  0.033  0.016  0.007  0.004  0.002  0.001  0.000
   2          0.375  0.271  0.173  0.104  0.060  0.034  0.019  0.010  0.006  0.003
   4                 0.273  0.225  0.165  0.113  0.074  0.047  0.029  0.017  0.010
   6                        0.226  0.196  0.155  0.115  0.081  0.055  0.036  0.023
   8                               0.196  0.176  0.146  0.113  0.084  0.060  0.042
  10                                      0.176  0.161  0.137  0.111  0.085  0.064
  12                                             0.161  0.149  0.130  0.108  0.086
  14                                                    0.149  0.140  0.124  0.105
  16                                                           0.140  0.132  0.119
  18                                                                  0.132  0.125
  20                                                                         0.125
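The entries of Table 1 and the two-Hermite integral behind them can be cross-checked numerically. The following is only a sketch under stated assumptions: it assumes the closed form $(-1)^{(n-m)/2}\,2^{(n+m-1)/2}\,\Gamma((n+m+1)/2)$ for the integral with weight $e^{-2\xi^2}$, and the function names (`hermite`, `integral_numeric`, `integral_closed`, `table_entry`) are illustrative, not taken from the paper.

```python
import math

def hermite(n, x):
    """Physicists' Hermite polynomial H_n(x) via H_{k+1} = 2x H_k - 2k H_{k-1}."""
    h_prev, h = 1.0, 2.0 * x
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    return h

def integral_numeric(n, m, half_width=8.0, steps=4000):
    """Trapezoidal estimate of the integral of H_n H_m exp(-2 xi^2) over the real line."""
    dx = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps + 1):
        x = -half_width + i * dx
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * hermite(n, x) * hermite(m, x) * math.exp(-2.0 * x * x)
    return total * dx

def integral_closed(n, m):
    """Assumed closed form of the same integral (zero when n + m is odd)."""
    if (n + m) % 2:
        return 0.0
    sign = -1.0 if (abs(n - m) // 2) % 2 else 1.0
    return sign * 2.0 ** ((n + m - 1) / 2) * math.gamma((n + m + 1) / 2)

def table_entry(n, m):
    """(n+m-1)!! / (2^((n+m)/2) sqrt(n! m!)) for n + m even, using log-gamma."""
    s = (n + m) // 2
    log_val = (math.lgamma(2 * s + 1) - math.lgamma(s + 1) - 2 * s * math.log(2.0)
               - 0.5 * (math.lgamma(n + 1) + math.lgamma(m + 1)))
    return math.exp(log_val)

if __name__ == "__main__":
    for n, m in [(2, 0), (2, 2), (4, 2), (4, 4)]:
        print(n, m, integral_numeric(n, m), integral_closed(n, m), round(table_entry(n, m), 3))
```

The diagonal entries decrease only slowly with $n$, in line with the remark accompanying the table; the sign factor of the integral is dropped in `table_entry` because the table lists magnitudes.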
We note that
$$\frac{(n+m-1)!!}{2^{(n+m)/2}\sqrt{n!\,m!}} \simeq \frac{1}{\sqrt{\pi n}}\,e^{-(n-m)^2/(8n)} \qquad \text{as } n \to \infty.$$
This number is maximum on the diagonal $m = n$, the maximum value decreasing with increasing $n$ only very slowly (see Table 1). Take the case of the $^{87}$Rb atom (see footnote b), having $M = 86.90919$ u $= 1.443162 \times 10^{-25}$ kg, which was used in the pioneering experiment of M.H. Anderson et al. [1]; the parameters of (19) they gave were:
$$\nu_z = 2\pi \times 120\ \mathrm{Hz} = 7.5 \times 10^{2}\ \mathrm{s}^{-1}, \qquad \nu_\perp = \nu_z/\sqrt{8} = 2.7 \times 10^{2}\ \mathrm{s}^{-1}, \tag{24}$$
so that
$$\alpha_z = 1.01 \times 10^{6}\ \mathrm{m}^{-1}, \qquad \alpha_x = 6.1 \times 10^{5}\ \mathrm{m}^{-1}. \tag{25}$$
We take $a = 6 \times 10^{-9}$ m for the scattering length of Rb, finding
$$g = 6 \times 10^{-51}\ \mathrm{J\,m^3}, \qquad g\rho = (1.4 \times 10^{-34}\ \mathrm{J})\,N_0 \tag{26}$$
or, in unit of the quantum of the phonon, $\hbar\nu_\perp = 2.8 \times 10^{-32}$ J,
$$\frac{g\rho}{\hbar\nu_\perp} = (5 \times 10^{-3})\,N_0, \tag{27}$$
which is of the order of 10 for $N_0 \sim 2000$, the case of ref. 1. We shall find that the excitation energies of the out-of-the-condensate particles are enhanced by their interactions (see Table 2 below), somewhat reducing the measure (27) of the strength of the interaction.

Footnote b: It is radioactive, undergoing $\beta$ decay with the half life of $4.8 \times 10^{10}$ yrs.
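The order-of-magnitude estimates (25)-(27) can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the standard relations $\alpha_j = \sqrt{M\nu_j/\hbar}$, $g = 4\pi\hbar^2 a/M$ and $\rho = N_0\alpha_\perp^2\alpha_z/(2\pi)^{3/2}$; these forms, the constant names and the function `trap_estimates` are assumptions of the sketch, chosen because they reproduce the quoted numbers.

```python
import math

HBAR = 1.054572e-34      # J s
M_RB87 = 1.443162e-25    # kg, mass quoted in the text
A_SCATT = 6e-9           # m, scattering length taken in the text

def trap_estimates(nu_z, n0=1.0):
    """Return the estimates for an axially symmetric trap with nu_perp = nu_z / sqrt(8)."""
    nu_perp = nu_z / math.sqrt(8.0)
    alpha_z = math.sqrt(M_RB87 * nu_z / HBAR)          # inverse axial length
    alpha_perp = math.sqrt(M_RB87 * nu_perp / HBAR)    # inverse radial length
    g = 4.0 * math.pi * HBAR ** 2 * A_SCATT / M_RB87   # coupling constant
    rho = n0 * alpha_perp ** 2 * alpha_z / (2.0 * math.pi) ** 1.5
    quantum = HBAR * nu_perp                           # phonon quantum
    return {
        "alpha_z": alpha_z,
        "alpha_perp": alpha_perp,
        "g": g,
        "g_rho": g * rho,
        "quantum": quantum,
        "ratio": g * rho / quantum,
    }

if __name__ == "__main__":
    est = trap_estimates(7.5e2)          # nu_z = 7.5e2 1/s as quoted in (24)
    for key, value in est.items():
        print(f"{key:10s} {value:.3e}")
    print("ratio for N0 = 2000:", est["ratio"] * 2000)
```

With `n0=1.0` the entries `g_rho` and `ratio` are the coefficients of $N_0$ in (26) and (27); multiplying `ratio` by 2000 gives a value of the order of 10.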
4 Effective Condensate Hamiltonian (1)
We shall take the following steps to study the fluctuation of the condensate.
1. Diagonalize $H_B := H_B^{(0)}$ in (17). Let its eigenstate $|n) \in \mathcal{H}_B$ be the unperturbed state of the out-of-the-condensate particles; $\mathcal{H}_B$ is their Hilbert space.
2. Take $H_C := \lambda^{-1}H_C^{(-1)} + H_C^{(0)}$ as the unperturbed condensate Hamiltonian acting on the Hilbert space $\mathcal{H}_C$ of the condensate particles.
3. Use the perturbation theory of Ezawa and Luban (EL) [6] to construct the effective Hamiltonian operator $\Lambda$ for the condensate such that
$$(H_B + H_C + \lambda H_1 + \lambda^2 H_2)\,\psi_n = \psi_n\Lambda, \qquad \psi_n^\dagger\psi_n = 1, \tag{28}$$
where $\psi_n = U_n|n)$ with $U_n$ being an operator in the Hilbert space $\mathcal{H}_B \otimes \mathcal{H}_C$ of the total system, by taking
$$\lambda H_1 := H_{BC}^{(0)} + \lambda\,[H_B^{(1)} + H_{BC}^{(1)} + H_C^{(1)}] \tag{29}$$
and
$$H_2 := H_B^{(2)} + H_C^{(2)} + H_{BC}^{(2)}. \tag{30}$$
4. Determine $\mu^{(\nu)}$ such that $\Lambda^{(\nu)}$ at each order $\nu$ of the perturbation calculation has no terms linear in $a_0$. This is necessary to keep the average number $N_0$ of the condensate particles fixed by guaranteeing that no change in the first scalar term in our $\sqrt{N_0} + a_0$ be needed when diagonalizing $\Lambda$.
5. $\Lambda$ is the effective Hamiltonian for the condensate in the sense that the solutions of $\Lambda_n\chi_\alpha = \mathcal{E}_{n\alpha}\chi_\alpha$ solve
$$H\psi_n\chi_\alpha = \psi_n\Lambda_n\chi_\alpha = \mathcal{E}_{n\alpha}\psi_n\chi_\alpha \tag{31}$$
to give the eigenstates $\psi_n\chi_\alpha$ and the eigenvalues $\mathcal{E}_{n\alpha}$ of the total Hamiltonian $H$, the operator on the left-hand side of (28).
This is basically a Born-Oppenheimer procedure [10] which treats the faster motion first, then proceeding to examine its effect on the slower motion. Now after the first step of solving $H_B|n) = W_n|n)$, the EL method [6] proceeds to solve (28) perturbatively for
$$\psi_n = (1 + \lambda U^{(1)} + \lambda^2 U^{(2)} + \cdots)|n), \qquad \Lambda_n = \lambda^{-1}\Lambda^{(-1)} + \Lambda^{(0)} + \lambda\Lambda^{(1)} + \cdots. \tag{32}$$
We now specialize to the ground state of the particles outside the condensate. In the lowest order, let $|0)$ be the ground state of $H_B$, and then (28) gives
$$(H_B + H_C)|0) = |0)(\lambda^{-1}\Lambda^{(-1)} + \Lambda^{(0)}), \tag{33}$$
so that
$$\begin{aligned}
\Lambda^{(-1)} &= H_C^{(-1)} = (\varepsilon_0 - \mu^{(0)})(a_0^\dagger + a_0),\\
\Lambda^{(0)} &= W_0 + H_C^{(0)} = W_0 + (\varepsilon_0 - \mu^{(0)})\,a_0^\dagger a_0 - \mu^{(1)}(a_0^\dagger + a_0) + \tfrac{1}{2}J_{00}(a_0^\dagger + a_0)^2.
\end{aligned} \tag{34}$$
By the requirement that $\Lambda^{(\nu)}$ should have no terms linear in $a_0$, we obtain $\mu^{(0)} = \varepsilon_0$, $\mu^{(1)} = 0$ and hence
$$\Lambda^{(-1)} = 0, \qquad \Lambda^{(0)} = W_0 + \tfrac{1}{2}J_{00}(a_0 + a_0^\dagger)^2. \tag{35}$$
In terms of the canonical pair,
$$p_0 = -\frac{i}{\sqrt{2}}(a_0 - a_0^\dagger), \qquad x_0 = \frac{1}{\sqrt{2}}(a_0 + a_0^\dagger), \qquad [p_0, x_0] = -i \tag{36}$$
we have, up to the c-number $W_0$,
$$\Lambda^{(0)} = J_{00}\,x_0^2, \tag{37}$$
which has continuous spectrum. If $J_{00} > 0$, or equivalently if $g > 0$, then the condensate has non-negative excitation energy and hence is stable. The expectation values of the Hamiltonian can be finite if the condensate is in an appropriate wave packet state. To proceed to the higher order approximation, we have to finish the step 1.
5 Diagonalization of the Out-of-the-Condensate Hamiltonian
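In order to diagonalize $H_B = H_B^{(0)}$ in (17), define the canonical pairs,
$$P_n := -i\sqrt{\frac{\bar\varepsilon_n}{2}}\,(a_n - a_n^\dagger), \qquad Q_n := \frac{1}{\sqrt{2\bar\varepsilon_n}}\,(a_n + a_n^\dagger) \qquad (n \neq 0), \tag{38}$$
with $\bar\varepsilon_n := \varepsilon_n - \varepsilon_0$, to obtain
$$H_B = \frac{1}{2}{\sum_n}'(P_n^2 + \bar\varepsilon_n^2Q_n^2) + \frac{1}{2}{\sum_{n,m}}'\tilde J_{nm}Q_nQ_m - \frac{1}{2}{\sum_n}'\bar\varepsilon_n \tag{39}$$
with $\tilde J_{nm} := 2\sqrt{\bar\varepsilon_n\bar\varepsilon_m}\,J_{nm}$, which is real and symmetric. We can diagonalize the potential energy part in (39) by an orthogonal transformation,
$$\eta_k = {\sum_n}' T_{kn}P_n, \qquad \xi_k = {\sum_n}' T_{kn}Q_n \tag{40}$$
without affecting the form of the kinetic energy part nor the commutation relations. The eigenvalues $\omega_k^2$ are all positive if $g > 0$. Thus,
$$H_B = \frac{1}{2}\sum_k(\eta_k^2 + \omega_k^2\xi_k^2) - \frac{1}{2}{\sum_n}'\bar\varepsilon_n. \tag{41}$$
Therefore, by defining
$$b_k := \sqrt{\frac{\omega_k}{2}}\,\xi_k + \frac{i}{\sqrt{2\omega_k}}\,\eta_k, \qquad \text{such that} \qquad [b_k, b_l^\dagger] = \delta_{kl}, \quad [b_k, b_l] = 0, \tag{42}$$
we achieve the diagonalization,
$$H_B = \sum_k\omega_k\left(b_k^\dagger b_k + \frac{1}{2}\right) - \frac{1}{2}{\sum_n}'\bar\varepsilon_n. \tag{43}$$
Table 2 gives the values of $\omega_k^2$ for the case of (24) and (26) in unit of $(\hbar\nu_\perp)^2$.

Table 2. The eigenvalues of the matrix $(\bar\varepsilon_n^2\delta_{nm} + \tilde J_{nm})/(\hbar\nu_\perp)^2$. Those with superscript d are doubly degenerate.
5.048^d  13.86^d  16.36^d  20.54  21.07  31.69^d  33.63^d  35.22^d  47.40  61.96  74.32^d  140.2  204.6

Recalling (38), we can write (42) as
$$b_k = {\sum_n}'(c_{kn}a_n + s_{kn}a_n^\dagger), \qquad b_k^\dagger = {\sum_n}'(c_{kn}a_n^\dagger + s_{kn}a_n) \tag{44}$$
with
$$\left.\begin{matrix}c_{kn}\\ s_{kn}\end{matrix}\right\} = \frac{1}{2}\left(\sqrt{\frac{\omega_k}{\bar\varepsilon_n}} \pm \sqrt{\frac{\bar\varepsilon_n}{\omega_k}}\right)T_{kn}, \tag{45}$$
which satisfy
$${\sum_n}'(c_{kn}c_{ln} - s_{kn}s_{ln}) = \delta_{kl}, \qquad {\sum_n}'(c_{kn}s_{ln} - s_{kn}c_{ln}) = 0 \tag{46}$$
and reciprocally
$$\sum_k(c_{kn}c_{km} - s_{kn}s_{km}) = \delta_{nm}, \qquad \sum_k(c_{kn}s_{km} - s_{kn}c_{km}) = 0. \tag{47}$$
Therefore, (44) can be inverted:
$$a_n = \sum_k(c_{kn}b_k - s_{kn}b_k^\dagger), \qquad a_n^\dagger = \sum_k(c_{kn}b_k^\dagger - s_{kn}b_k). \tag{48}$$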
6 Effective Condensate Hamiltonian (2)
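We have already obtained the effective condensate Hamiltonian (35) in the lowest two orders. To proceed further, we use the formulas from EL [6],
$$\Lambda^{(1)} = \langle 0|H_1|0\rangle \tag{49}$$
and
$$\Lambda^{(2)} = \mathrm{H.P.}\,\langle 0|H_1U^{(1)}|0\rangle + \langle 0|H_2|0\rangle, \tag{50}$$
where H.P. stands for Hermitian part, and
$$(W_0 - W_m)\,(m|U^{(1)}|0) = (m|H_1|0) + [H_C, (m|U^{(1)}|0)]. \tag{51}$$
For (49), we get from (29)
$$\Lambda^{(1)} = J_{00}(a_0^{\dagger 2}a_0 + a_0^\dagger a_0^2) \tag{52}$$
after eliminating the terms linear in $a_0$ by the choice
$$\mu^{(2)} = \sum_k(2J_{kk}^{ss} - J_{kk}^{sc}), \tag{53}$$
where
$$J_{kl}^{ss} := {\sum_{m,n}}' J_{mn}s_{km}s_{ln}, \qquad J_{kl}^{sc} := {\sum_{m,n}}' J_{mn}s_{km}c_{ln}, \qquad \text{etc.}$$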
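In (50), different states $|\kappa)$ contribute to the sum $\sum_\kappa\langle 0|H_1|\kappa)(\kappa|U^{(1)}|0\rangle$. Firstly, single particle states $|k) = b_k^\dagger|0)$ give
$$(k|H_1|0) = {\sum_n}' J_{n0}\{-s_{kn}a_0^{\dagger 2} + 2(c_{kn} - s_{kn})\,a_0^\dagger a_0 + c_{kn}a_0^2 + \lambda^{-1}(c_{kn} - s_{kn})(a_0^\dagger + a_0)\} + \beta_k, \qquad (\beta_k := (k|H_B^{(1)}|0))$$
for which we determine the coefficients in the Ansatz, $(k|U^{(1)}|0) := A_ka_0^{\dagger 2} + B_ka_0^\dagger a_0 + C_ka_0^2 + D_ka_0^\dagger + E_ka_0 + F_k$, by (51), or
$$-\omega_k\,(k|U^{(1)}|0) = (k|H_1|0) + [H_C, (k|U^{(1)}|0)], \tag{54}$$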
obtaining
vk ( 2
u>k
Dk=Ek = -A"1 -U-,
uk
)
Fk = - i - {/?, + £ 7 + - ^ 7 f c - | •
U>k
UJk (
U)k
U>k
)
where fik = — 2 ^ Ltmn(cki
— Ski)(cqm —
2sgm)sqn,
imn,
lk
(5g)
■= 2^, Jno(Ckn ± n
Skn)-
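The contributions to $(\kappa|H_1|0)$ and $(\kappa|U^{(1)}|0)$ from normalized 2- and 3-particle states, $\eta_{kl}\,b_k^\dagger b_l^\dagger|0)$ and $\eta_{hkl}\,b_h^\dagger b_k^\dagger b_l^\dagger|0)$, can be obtained similarly, where $\eta_{kl} = 1$ if $k \neq l$, and $= 1/\sqrt{2}$ if $k = l$, and similarly for $\eta_{hkl}$. In (50), we also have
$$\langle 0|H_2|0\rangle = \Big(4\sum_k J_{kk}^{ss} - \mu^{(2)}\Big)a_0^\dagger a_0 - \mu^{(3)}(a_0^\dagger + a_0) + J_{00}\,a_0^{\dagger 2}a_0^2 - \Big(\sum_k J_{kk}^{sc}\Big)(a_0^{\dagger 2} + a_0^2), \tag{57}$$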
where, we note, $\mu^{(2)}$ has been determined in (53). In this way, we obtain the second order effective condensate Hamiltonian,
+Lu{x0pl
+ PQZO) + M^oPo + Pozo)2,
(58)
up to a c-number after eliminating the terms linear in $x_0$ by choosing $\mu^{(3)}$ appropriately, where
Hi
2k ■= - E ^ ( * * + c * » + c ' ^ - E ^ - ^ ° ™ + E [ft + 2^) ~ -700' IS
mi:=" E ^-kclk + j^oo,
1
- E ZTTT^ + E fa - \JS) - ^oo, K3 ■= - Yl —A -1 c 2 t(c 3 fc + c'3k), k
L4 := - ^
(59)
K4 := - Y* — 4 * c 3 * + 7^00
Uk
W
k
k
4
— f c 4fc c; t + ^/| c 2*(c3fc + c'3k)\ + - Joo-
with c\k ■= Pk ~ %, c2k ■= V2~lk~, /3 4 J 0 0 \ _ 2J 00 ,
C3k t
= ( 2 - ^ r ) i k + wJt> T*1S
TOO
Ol /
Tf*S
a\ki ■= Jkl + Jkl ~ lKJkl
TSC* ^
1
C4k:=
+ Jkl) + -—;
UU
/
+
Joo _
. .
2lk ~~klk'
TOO
\Jkl
TSS \
~ Jkl),
T
d
(60) TCP
2ki '■- Jkl ~
The prime on $c_{jk}$ $(j = 3, 4)$ means to put $J_{00} = 0$ in them.
T*?S
J
ki-
7 Condensate Fluctuation
The result we have obtained so far for the effective condensate Hamiltonian is the sum of (37), (52) and (58): A<°>
+ A A « + A 2 A« = ^ j - / o + 9P4 + * ,
(61)
where we have put $J_{00} = g\rho$ and $R$ is the sum of the products of the entries in the first and third lines, and the second and third lines in respective columns of Table 3. Since the coefficients in the second line as well as $M_4$ here are yet to be determined by numerical computations, we simply assume that they have similar magnitudes $\sim J_{00} = g\rho$. We assume further that $g > 0$ and $M_4 > 0$, which is crucial to the following considerations. Table 3. The terms in R 1/(2M2)
Pt>
magn.
K13
9P/23/2 K3 r3
N0-'/2
9P/2Z>2 L\2 zoPo + ■ 2 7V01/6
K4 x0 N-2/3 iV
0
Li
(xoPo + -Y 4
Then, in place of solving the Schrödinger equation for (61), we resort to a conventional uncertainty principle argument. Minimize (61) $- R$ under the condition $x_0p_0 = 1$, then
$$p_0 = (g\rho N_0M_4)^{1/6}, \qquad x_0 = (g\rho N_0M_4)^{-1/6}, \tag{62}$$
which leads to the estimates as given in Table 3, the entries in the third line having the magnitudes given in the fourth line, where A^ stands symbolically for gpN0M4. We see that
>
J
^ + _^ ( ^ + p S l 0 ) .(^)" j I + (»^)" } (63)
dominates over any other term in (61) if $N_0 \gg 1$; nevertheless it is much smaller than $\hbar\omega_k$, so as to endorse our assumption of slow condensate fluctuation. (62) indicates also that $a_0 \sim N_0^{1/6}$, which is smaller than $N_0^{1/2}$, justifying our method and hence the Bogoliubov prescription. All these conclusions hinge on our crucial assumption of $M_4 > 0$, which, together with the other assumptions on the magnitudes of the coefficients in (61), has to be corroborated by numerical work. Besides, calculations are now in progress that incorporate $H_{BC}^{(0)}$ in (18) into $H_B$ in step 1 of §4. So far, the effective condensate Hamiltonian in §6 appears not to be changed substantially.
Acknowledgments
We are very happy to dedicate this article to Prof. Ludwig Streit on the occasion of his sixtieth birthday. He has been a constant source of inspiration on the interplay between physics and mathematics. One of the authors (H.E.) is grateful to Prof. J.R. Klauder and his group for discussions and warm hospitality during the memorable month of March 1999 at the University of Florida, where a part of this work was done.
References
1. M.H. Anderson, J.R. Ensher, M.R. Matthews, C.E. Wieman, and E.A. Cornell, Science 269, 198 (1995).
2. K.B. Davis, M.-O. Mewes, M.R. Andrews, N.J. van Druten, D.S. Durfee, D.M. Kurn, and W. Ketterle, Phys. Rev. Lett. 75, 3969 (1995); M.-O. Mewes et al., Phys. Rev. Lett. 77, 416 (1996).
3. C.C. Bradley, C.A. Sackett, J.J. Tollett and R.G. Hulet, Phys. Rev. Lett. 75, 1687 (1995); C.C. Bradley, C.A. Sackett and R.G. Hulet, Phys. Rev. Lett. 78, 985 (1997).
4. D.G. Fried et al., Phys. Rev. Lett. 81, 3811 (1998); T.C. Killian et al., Phys. Rev. Lett. 81, 3807 (1998).
5. N.N. Bogoliubov, J. Phys. (USSR) 11, 23 (1947).
6. H. Ezawa and M. Luban, J. Math. Phys. 8, 1285 (1967); H. Ezawa, J. Math. Phys. 6, 380 (1965).
7. H. Araki and J. Woods, J. Math. Phys. 4, 637 (1963).
8. L.P. Pitaevski, Zh. Exp. Theor. Fiz. 40, 646 (1961) [Sov. Phys. JETP 13, 451 (1961)]; E.P. Gross, Nuovo Cimento 20, 454 (1961); J. Math. Phys. 4, 195 (1963).
9. F.A. Berezin, The Method of Second Quantization, tr. by N. Mugibayashi and A. Jeffrey, Academic Press (1966). Chap. M.
10. M. Born and J.R. Oppenheimer, Ann. Physik 84, 457 (1927).
11. M. Edwards and K. Burnett, Phys. Rev. A 51, 1362 (1995); F. Dalfovo and S. Stringari, Phys. Rev. A 53, 2477 (1996); T. Isoshima and K. Machida, J. Phys. Soc. Jpn. 66, 3502 (1997).
TIME DEPENDENT AND NONLINEAR POINT INTERACTIONS
RODOLFO FIGARI
Dipartimento di Scienze Fisiche, Università di Napoli "Federico II", Italy
dedicated to Ludwig Streit
The aim of the paper is to stress the effectiveness of point interactions for the construction of models of quantum systems whose dynamics is generated by time dependent or nonlinear hamiltonians. Examples of time dependent point interaction hamiltonians which could have some importance in applications are given. In particular nonlinear interactions of range zero are analyzed in dimensions one and three. Results on local and global existence of the solution and conditions for the blow-up are given and an explicit blow-up solution is constructed in the critical case.
During the activities of BiBoS, at the end of the '70s and the beginning of the '80s, research on various subjects of Mathematical Physics was flourishing around the Zentrum für Interdisziplinäre Forschung at the University of Bielefeld. Ludwig Streit, as director of several study groups held at ZiF during those years, was able to gather together well known experts from almost every field of mathematical physics and young researchers from almost everywhere in the world. An extremely warm social environment was a strong support for a very effective scientific work.
1 Introduction
Driven by new results obtained by S. Albeverio, R. Hoegh-Krohn, H. Holden, F. Gesztesy, M. Mebkhout, L. Streit and many others a new impulse was given to the analysis of singular perturbations of the laplacian, in particular point interactions, in non relativistic quantum mechanics. As stressed already in the title of the reference book on the subject ([AGH-KH]), one of the main properties of point interactions is the fact that they are solvable, in the sense that one can write explicit formulas for the integral kernel of the resolvent, of the propagator and of other functions of such hamiltonians. The construction of the dynamics is particularly simple in the one dimensional case due to the fact that the Dirac delta is a small form perturbation of the laplacian. For this reason one dimensional point interactions have been used in elementary courses and textbooks of quantum mechanics to exemplify a complete computation of spectral and scattering data for quantum model systems. In dimensions two and three the perturbative approach does not work and point interactions are most naturally defined using extension theory for symmetric operators. One starts with the laplacian defined on smooth functions vanishing on the interaction centers and then characterizes all the self-adjoint extensions. Such hamiltonians are a set of non trivial perturbations of the laplacian indexed by a minimal set of parameters: the strengths and the positions of the interaction centers. As a consequence, the formulas for the propagator of the corresponding dynamics look particularly simple. This remains true even if the dynamical and geometrical parameters depend on time, opening the possibility of analyzing the dynamics of quantum particles subject to linear or nonlinear time dependent forces. In particular Yafaev gave in ([Y2]) a complete analysis of the scattering of a quantum particle by an interaction center with strength variable in time (see also ([DO]) and references therein). In the next section some new results on solutions of the Schroedinger equation with point interactions with strength and position depending on time in a preassigned way will be given. In the last section a class of nonlinear point interactions will be introduced in dimensions one and three together with some preliminary results on local and global existence of the solution of the corresponding evolution problem. Cases in which the solution blows up in finite time will be exhibited.
2 Time dependent interactions
In order to avoid unnecessary technical difficulties and involved notation only the case of one interaction center will be considered. Results valid in the general case and complete proofs will be given in ([DFT]). In order to clarify differences and common features, the one dimensional and the three dimensional cases will be presented in two different subsections.
2.1 One dimensional case
Point interaction hamiltonians in one dimension are a large class of self-adjoint operators corresponding, roughly speaking, to "potentials" $\delta$, $\delta'$ and their combinations, placed in any point of the real line (for a complete characterization of any hamiltonian in the class see ([AGH-KH])). Only the $\delta$ potential will be considered here. The hamiltonian $H_{\alpha,y}$ with a single point interaction of strength $\alpha$, placed in $y$, can be defined as the self-adjoint operator in $L^2(R)$ whose quadratic form is
$$D(F_{\alpha,y}) = H^1(R), \qquad F_{\alpha,y}(u) = \int_R dx\,|\nabla u|^2 + \alpha\,|u(y)|^2. \tag{1}$$
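Due to standard Sobolev inequalities, the value in one point of any function in $H^1(R)$ is well defined and bounded by the $H^1$ norm. This implies that the form (1) is a Kato-small perturbation of the quadratic form of the laplacian. As it is well known,
- the spectrum of $H_{\alpha,y}$ is
$$\sigma(H_{\alpha,y}) = [0, \infty) \quad (\alpha > 0), \qquad \sigma(H_{\alpha,y}) = \{-\alpha^2/4\} \cup [0, \infty) \quad (\alpha < 0)$$
- for $\alpha < 0$ the only eigenvalue is simple and the corresponding normalized eigenfunction is
$$\psi_\alpha(x) = \sqrt{-\frac{\alpha}{2}}\,\exp\!\left(\frac{\alpha}{2}|x - y|\right)$$
- for any $\alpha$, to each momentum $k \in [0, \infty)$ corresponds the generalized eigenfunction
$$\psi_\alpha(k, x) = \frac{1}{\sqrt{2\pi}}\left(\exp(ikx) - \frac{i\alpha}{2|k| + i\alpha}\,e^{iky}\exp(i|k||x - y|)\right)$$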
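The bound state can be checked directly against the quadratic form (1). The sketch below assumes the trial function $\sqrt{-\alpha/2}\,\exp(\alpha|x-y|/2)$ for $\alpha < 0$ and evaluates its norm and the form $F_{\alpha,y}$ by the trapezoidal rule; the helper names `bound_state` and `delta_form` are illustrative only, and the value it should approach, $-\alpha^2/4$, is the standard eigenvalue of the one dimensional $\delta$ interaction.

```python
import math

def bound_state(alpha, y=0.0):
    """Trial bound state for alpha < 0 and its derivative away from the point y."""
    amp = math.sqrt(-alpha / 2.0)

    def psi(x):
        return amp * math.exp(alpha * abs(x - y) / 2.0)

    def dpsi(x):
        return (alpha / 2.0) * math.copysign(1.0, x - y) * psi(x)

    return psi, dpsi

def delta_form(alpha, psi, dpsi, y=0.0, half_width=20.0, steps=40000):
    """Trapezoidal estimates of (||psi||^2, F(psi)), F(u) = int |u'|^2 dx + alpha |u(y)|^2."""
    dx = 2.0 * half_width / steps
    norm2 = 0.0
    kinetic = 0.0
    for i in range(steps + 1):
        x = y - half_width + i * dx
        weight = 0.5 if i in (0, steps) else 1.0
        norm2 += weight * abs(psi(x)) ** 2
        kinetic += weight * abs(dpsi(x)) ** 2
    return norm2 * dx, kinetic * dx + alpha * abs(psi(y)) ** 2

if __name__ == "__main__":
    alpha = -2.0
    psi, dpsi = bound_state(alpha)
    norm2, form = delta_form(alpha, psi, dpsi)
    print("norm^2 =", norm2, " F =", form, " expected -alpha^2/4 =", -alpha ** 2 / 4.0)
```

The grid is centred on $y$ so that the kink of the trial function falls on a node, which keeps the trapezoidal rule second-order accurate on each side.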
The integral kernel of the propagator of the unitary group generated by $H_{\alpha,y}$ can be written explicitly. Writing formally the Schroedinger equation
$$i\frac{\partial\psi_t}{\partial t} = H_{\alpha,y}\psi_t = (-\Delta + \alpha\delta_y)\,\psi_t$$
one obtains the integral equation
$$\psi_t(x) = (U_0(t)\psi_0)(x) - i\alpha\int_0^t ds\,U_0(t - s, |x - y|)\,\psi_s(y) \tag{2}$$
where $U_0$ is the free propagator in dimension one. From (2) one sees that the solution is completely determined by the free evolution of $\psi_0$ and by $\psi_t(y)$, which in turn, as a function of time, must satisfy the integral equation
$$\psi_t(y) = (U_0(t)\psi_0)(y) - i\alpha\int_0^t ds\,U_0(t - s, 0)\,\psi_s(y). \tag{3}$$
It is easy to check that in fact (2) and (3) define a unitary flow in $L^2(R)$ whose generator is $H_{\alpha,y}$ and therefore they may be used as a definition of the dynamics. This procedure is directly generalizable to the case of $y$ and $\alpha$ varying with time. In fact, under suitable conditions on the regularity of $y(t)$, $\alpha(t)$ and on the initial condition it is possible to prove that for each $s \in R$ the map $\psi_s \to \psi_t$ induced by
$$\begin{aligned}
\psi_t(x) &= (U_0(t - s)\psi_s)(x) - i\int_s^t d\tau\,\alpha(\tau)\,U_0(t - \tau, |x - y(\tau)|)\,\psi_\tau(y(\tau)),\\
\psi_t(y(t)) &= (U_0(t - s)\psi_s)(y(t)) - i\int_s^t d\tau\,\alpha(\tau)\,U_0(t - \tau, |y(t) - y(\tau)|)\,\psi_\tau(y(\tau))
\end{aligned} \tag{4}$$
defines a group of unitary transformations $V(s, t)$, with $V(s, s) = 1$, which solves the Schroedinger equation
$$i\frac{\partial V(s, t)}{\partial t} = H_{\alpha(t),y(t)}\,V(s, t). \tag{5}$$
Remark 1 The most general assumptions needed for the validity of the scheme introduced above will not be analyzed. It should be noticed that in dimension
one the form domain of the hamiltonians $H_{\alpha(t),y(t)}$ does not depend on time and coincides with $H^1(R)$. This makes it possible to apply the general abstract theory about the solution of non autonomous evolution problems (see e.g. [Si]). Notice that, on the contrary, in the physical literature, assumptions were always chosen tailored on the specific model taken into consideration and on the techniques used.
Remark 2 The problem of ionization of a one dimensional atom perturbed by a periodic force was recently considered in ([CLR]). The qualitative features shown by the computed ionization probability seem in good agreement with what is measured in transitions between states of large principal quantum number in hydrogen atoms. For preliminary results in three dimensions see ([P]).
2.2 Three dimensional case
The theory of self-adjoint extensions of symmetric operators in Hilbert spaces allows one to classify all the non trivial extensions of the laplacian restricted to $C_0^\infty(R^d \setminus \{y\})$. It is well known that for $d > 3$ only the trivial extension exists. In $d = 2$ and $d = 3$ there are infinitely many extensions $H_{\alpha,y}$ indexed by the value of a strength parameter $\alpha \in R$ (see [AGH-KH] for details). The corresponding quadratic forms are recalled below. In dimension two $D(F_{\alpha,y}) = \{\psi \in L^2(R^2)$ s.t. $\exists$
C;${x) = <j>x{x) + qGx{x - y)}
F«,vm = HV^II2 + AH^II2 - A|M|2 + [a + log A ) |,|>
(6 )
In dimension three ([T]) D(Fa,v) = {V- 6 L2(R3) s.t. 3
6 C;rp(x) =
i;,a,„(V') = l|V
(7)
A > 0 is the integral kernel of ( - A + A ) - 1 in dimension
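Spectral properties and eigenfunctions of $H_{\alpha,y}$ in three dimensions are listed below
- the spectrum of $H_{\alpha,y}$ is
$$\sigma(H_{\alpha,y}) = [0, \infty) \quad (\alpha > 0), \qquad \sigma(H_{\alpha,y}) = \{-16\pi^2\alpha^2\} \cup [0, \infty) \quad (\alpha < 0)$$
- for $\alpha < 0$ the only eigenvalue is simple and the corresponding normalized eigenfunction is
$$\psi_\alpha(x) = \sqrt{-2\alpha}\;\frac{\exp(4\pi\alpha|x - y|)}{|x - y|}$$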
- for any $\alpha$, to each momentum $k \in R^3$ corresponds the generalized eigenfunction
$$\psi_\alpha(k, x) = \frac{1}{(2\pi)^{3/2}}\left(e^{ik\cdot x} + \frac{1}{\alpha - i|k|/(4\pi)}\,\frac{\exp(i|k||x|)}{4\pi|x|}\right).$$
Like in the one dimensional case, the solution of the Schroedinger equation
$$i\frac{\partial\psi_t}{\partial t} = H_{\alpha,y}\psi_t$$
can be given in dimension three via an explicit formula for the propagator of $H_{\alpha,y}$ ([ST]). The implicit characterization of the solution, given by formulas (2), (3) in $d = 1$, becomes in this case
$$\psi_t(x) = (U(t)\psi_0)(x) + i\int_0^t ds\,U(t - s, |x - y|)\,q(s) \tag{8}$$
where
$$q(t) + 4\sqrt{i\pi}\,\alpha\int_0^t ds\,\frac{q(s)}{\sqrt{t - s}} = 4\sqrt{i\pi}\int_0^t ds\,\frac{(U(s)\psi_0)(y)}{\sqrt{t - s}} \tag{9}$$
and where $U(t)$ is the free unitary group defined by the kernel
$$U(t; x - x') = e^{i\Delta t}(x - x') = \frac{e^{i|x - x'|^2/(4t)}}{(4\pi it)^{3/2}}.$$
To be precise, formulas (8) and (9) hold for initial data $\psi_0 \in C_0^\infty(R^3 \setminus \{y\})$. The evolution of any other initial vector in $L^2(R^3)$ is obtained by continuity. Formula (9) for the charges holds true for an initial function in the domain
of $H_{\alpha,y}$ if in the r.h.s. of (9) some oscillatory integrals are interpreted as boundary values of analytic functions. In dimension two the equation for the "charges" $q(t)$ is considerably more delicate to handle due to the log singularity of the Green's function. Correspondingly the generalization to dynamics generated by time dependent point interaction hamiltonians is more difficult. Here only the three dimensional case will be considered. Notice that the form domain in (7) changes if the position of the point $y$ varies. As a consequence the Schroedinger problem corresponding to a moving point interaction is more complicated than in dimension one. Yafaev ([Y1]) solved the case where the strength parameter $\alpha$ was varying in time (in which case the operator domain is varying but not the form domain). In the following theorem the results in the general case when strength and position change in time are summarized.
Theorem If $y(t)$ is a regular curve in $R^3$, $\alpha(t)$ a smooth function in $R$ and $f \in C_0^\infty(R^3 \setminus \{y(s)\})$ then there exists a unique $\psi_s(t) \in D(F_{\alpha(t),y(t)})$, $t \in R$, such that $\partial\psi_s(t)/\partial t$ is in the dual (with respect to $L^2(R^3)$) of $D(F_{\alpha(t),y(t)})$ and
$$i\left\langle v(t), \frac{\partial\psi_s(t)}{\partial t}\right\rangle = B_{\alpha(t),y(t)}(v(t), \psi_s(t)) \quad \forall v(t) \in D(F_{\alpha(t),y(t)}), \qquad \psi_s(s) = f \tag{10}$$
where $B_{\alpha(t),y(t)}$ is the bilinear form corresponding to the quadratic form $F_{\alpha(t),y(t)}$. Moreover $\psi_s(t)$ has the following representation
$$\psi_s(t) = U(t - s)f + i\int_s^t d\tau\,U(t - \tau; \cdot - y(\tau))\,q(\tau) \tag{11}$$
where the charges q(t) satisfy the Volterra integral equation 4
V^l J,
C(t,r)
=--
y/t-T
f
*" Jo
dz
v ^ f* ,JUo(r ~ s)f)(y(T)) \/t=T V-l Js (12)
JB
*
U(T
V ( l - z)z V
+{t-
T)Z,T)
+ B(T + {t-
T)Z,T)
+
B(T + ( f-r)z,r)-l\ 2(t - T)Z
)
W3{t,T) J0
2{t-T) 1
/■w(t.T)
B(t,T)=-7—r / v w(*,T)y 0
where B{t,r) 3
d*e , z
denotes the derivative with respect to the second argument.
3 Nonlinear Interactions
In this section we describe a class of nonlinear evolution problems with point interaction concentrated in a fixed point (which for sake of simplicity will be taken to be the origin) with a strength depending on the behaviour of the solution near the origin. Models of this kind, in space dimension one, appeared already in the physical literature ([J-LPS], [MA], [BKB]) to describe the resonant tunnelling of electrons through a double barrier heterostructure. The results presented in the previous section will be used to define models where the strengths of the interactions are coupled with the solution itself. Such models might be of interest in applied as well as in fundamental Quantum Mechanics. Let us start considering a one dimensional case introduced in ([AT1]).
3.1 One dimensional case
The nonlinear evolution problem is obtained by replacing $\alpha(t)$ in (4) with a function of the value at the origin of the solution itself. More precisely, for $\gamma \in R$, $\sigma > 0$, let us consider
$$\alpha(t) = \gamma\,|\psi_t(0)|^{2\sigma}. \tag{13}$$
The resulting nonlinear integral equation is
$$\psi_t(x) + i\gamma\int_0^t ds\,U(t - s, |x|)\,|\psi_s(0)|^{2\sigma}\psi_s(0) = (U(t)\psi_0)(x). \tag{14}$$
As in (4) the evaluation of (14) in $x = 0$ gives a closed equation for $\psi_t(0)$
$$\psi_t(0) + i\gamma\int_0^t ds\,\frac{|\psi_s(0)|^{2\sigma}\psi_s(0)}{\sqrt{4\pi i(t - s)}} = (U(t)\psi_0)(0). \tag{15}$$
Equation (15) is a nonlinear Abel integral equation, involving only the time variable. The search for the solution of (14) is then reduced to the solution of (15). The following proposition refers to results obtained in ([AT1], [AT2]) about uniqueness and existence of the solutions.
Proposition 1 For any $\psi_0 \in H^1(R)$, there exists $\bar t > 0$ s.t. problem (14) has a unique solution $\psi_t \in H^1(R)$, $t \in [0, \bar t)$. Moreover if $\gamma > 0$ or if $\gamma < 0$, $\sigma < 1$ the solution is global in time.
In fact, using the contraction principle, equation (15) can be uniquely solved for small times in the space of continuous functions (see e.g. [M]). Exploiting the smoothing properties of the Abel operator ([GoVe]) and classical results about the free propagator one can prove local existence. The result on global existence is a direct consequence of Sobolev inequalities and of the conservation of the $L^2$-norm and of the energy
$$E(\psi_t) = \int_R dx\,|\psi_t'(x)|^2 + \frac{\gamma}{\sigma + 1}\,|\psi_t(0)|^{2\sigma + 2} = E(\psi_0).$$
Notice that the evolution reduces to free propagation if the initial datum is odd. Only propagation of the even part is not trivial. According to the standard definition, one says that $\psi_t$ is a blow-up solution if there exists $t_0 < \infty$ s.t. $\|\psi_t'\| \to \infty$ for $t \to t_0$. Due to the conservation of the energy, $\psi_t$ is a blow-up solution if $|\psi_t(0)|$ diverges for $t \to t_0$. This means that, in this model, a blow-up solution $\psi_t$ can be equivalently defined by the condition $\|\psi_t\|_{L^\infty} \to \infty$ for $t \to t_0$. If the nonlinear interaction is sufficiently attractive, for suitable initial datum, one can prove that blow-up occurs.
Proposition 2 Assume $\gamma < 0$ and $\sigma > 1$. Then for any $\psi_0 \in H^1(R)$ s.t. $\|x\psi_0\| < \infty$ and $E(\psi_0) < 0$ the solution $\psi_t$ is a blow-up solution.
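To get a feeling for the hypothesis $E(\psi_0) < 0$, one can evaluate the energy on a simple trial family. The following sketch assumes the energy functional $E(\psi) = \int|\psi'|^2dx + \frac{\gamma}{\sigma+1}|\psi(0)|^{2\sigma+2}$ and a normalized Gaussian $\psi(x) = (2a/\pi)^{1/4}e^{-ax^2}$, which is an illustrative choice and not an initial datum discussed in the paper; for it $\int|\psi'|^2dx = a$ and $|\psi(0)|^2 = \sqrt{2a/\pi}$.

```python
import math

def gaussian_energy(a, gamma, sigma):
    """Energy of the normalized Gaussian (2a/pi)^(1/4) exp(-a x^2), width parameter a > 0."""
    kinetic = a                                        # integral of |psi'|^2
    density_at_origin = math.sqrt(2.0 * a / math.pi)   # |psi(0)|^2
    return kinetic + gamma / (sigma + 1.0) * density_at_origin ** (sigma + 1.0)

if __name__ == "__main__":
    # Critical exponent sigma = 1: E = a (1 + gamma/pi), so the sign does not depend on a.
    for a in (0.1, 1.0, 10.0):
        print("sigma=1, gamma=-4, a=%5.1f  E=%+.4f" % (a, gaussian_energy(a, -4.0, 1.0)))
    # Supercritical exponent sigma = 2: a narrow enough Gaussian has negative energy.
    for a in (1.0, 10.0, 100.0):
        print("sigma=2, gamma=-1, a=%5.1f  E=%+.4f" % (a, gaussian_energy(a, -1.0, 2.0)))
```

At $\sigma = 1$ the energy of this family scales like $a(1 + \gamma/\pi)$, so concentrating the datum does not change the sign of $E$, whereas for $\sigma > 1$ any attractive coupling gives negative energy once the Gaussian is narrow enough. Since $\|x\psi_0\|$ is finite for a Gaussian, such data satisfy the remaining hypothesis as well.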
The proof exploits standard technical tools in the analysis of blow-up of solutions of the nonlinear Schroedinger equation, in particular the computation of the second time-derivative of the moment of inertia $I(t)$
$$I(t) = \int_R dx\,x^2\,|\psi_t(x)|^2$$
and the uncertainty principle (see e.g. [RR]). The above results show that the critical exponent of the nonlinearity is $\sigma = 1$. The critical case is of special interest because of the existence of an additional symmetry (pseudo-conformal invariance). This symmetry can be expressed as follows: if $\psi_t(x)$ is a solution then
{ )s .-rf i/ i
a to>o
* ' ji^* " *A^)
'
(16)
is again a solution. Moreover, a stationary solution of the problem, i.e. a solution of the form $Q(x)e^{i\lambda t}$, is easily computed
*«(*)= « p ( - M + i | )
(17)
Combining (16) and (17), we explicitly construct a blow-up solution, namely , ,
x
_
1
(
\x\
.
\x\2
1
^
(18)
An advantage of having the explicit solution (18) is that one can directly check the properties of the solution near the blow-up (non existence of strong $L^2$-limit, $L^2$-concentration phenomenon, local behaviour, etc.).
3.2 Three dimensional case
In the following some preliminary results on the three dimensional version of the nonlinear point interaction model will be discussed. The same problem in dimension two will not be considered here. There are in fact peculiar features (e.g. the absence of a critical exponent) indicating that the two dimensional model requires a separate analysis. We intend to approach this problem in further work. Notice that even in the linear three dimensional case the solution of the evolution problem exhibits a singularity where the interaction is placed. This means that the nonlinear interaction cannot be introduced as in (14). The value of the function at the origin should be replaced in this case by the coefficient $q(t)$ of the singular part. This suggests the following formulation of the evolution problem (see (9))
$$\begin{aligned}
\psi_t(x) &= (U(t)\psi_0)(x) + i\int_0^t ds\,U(t - s, |x|)\,q(s),\\
q(t) + 4\sqrt{i\pi}\int_0^t ds\,\frac{\alpha(s)\,q(s)}{\sqrt{t - s}} &= 4\sqrt{i\pi}\int_0^t ds\,\frac{(U(s)\psi_0)(0)}{\sqrt{t - s}},\\
\alpha(t) &= \gamma\,|q(t)|^{2\sigma}, \qquad \gamma \in R, \quad \sigma > 0.
\end{aligned} \tag{19}$$
The natural space where one looks for the solution of a standard nonlinear Schroedinger equation is $H^1$, i.e. the form domain of the generator of the linear dynamics obtained setting to zero the exponent of the nonlinear term. This suggests to look for the solution of (19) in the form domain $D(F_{\alpha,0})$ of $H_{\alpha,0}$. It is easy to check that the following characterisation of $D(F_{\alpha,0})$ is equivalent to the one given in (7)
$$D(F_{\alpha,0}) = \{u \in L^2(R^3) \mid u = \phi^\lambda + qG^\lambda, \ \phi^\lambda \in H^1, \ q \in C\}$$
One can prove the following result
Proposition 3 Let $\psi_0 = \phi_0^\lambda + qG^\lambda \in D(F_{0,0})$, with
One expects global existence also for weakly attractive interactions. The proof is not trivial because one is not working in $H^1$ and standard Sobolev inequalities are not available.

Consider finally the problem of the existence of blow-up solutions. Since the solution lives in $D(F_{\alpha,0})$ (which is strictly larger than $H^1$) it is necessary to modify the usual notion of blow-up. A solution $\psi_t$ of problem (19) will be said to be a blow-up solution if there exists a $t_0 < \infty$ such that
$$\lim_{t \to t_0} \|\nabla\phi_t^\lambda\| = \infty$$
or equivalently
$$\lim_{t \to t_0} |q(t)| = \infty.$$
With this definition of blow-up one can prove

Theorem If $\psi_0 = \phi^\lambda + q(0)G^\lambda$ with $\phi^\lambda \in H^2(\mathbb{R}^3)$, $\|x\psi_0\| < \infty$, $\gamma < 0$ and $\sigma > 1$ then the solution of problem (19) is a blow-up solution.

Acknowledgements

With a delay of almost twenty years I want to thank S. Albeverio and L. Streit who put me in touch with point interactions (and much more). I also want to use this occasion to thank my co-workers G.F. Dell'Antonio and A. Teta, whose continuous help and friendship were the main motivation to continue to work on this subject, and my young colleagues R. Adami, G. Panati and F. Pisano on whom we can count for future results on point interactions.
THE COLE-HOPF AND MIURA TRANSFORMATIONS REVISITED

FRITZ GESZTESY
Department of Mathematics, University of Missouri, Columbia, MO 65211, USA
E-mail: fritz@math.missouri.edu
URL: http://www.math.missouri.edu/people/fgesztesy.html

HELGE HOLDEN
Department of Mathematical Sciences, Norwegian University of Science and Technology, N-7491 Trondheim, Norway
E-mail: holden@math.ntnu.no
URL: http://www.math.ntnu.no/~holden/

Dedicated with great pleasure to Ludwig Streit on the occasion of his 60th birthday.

An elementary yet remarkable similarity between the Cole-Hopf transformation relating the Burgers and heat equation and Miura's transformation connecting the KdV and mKdV equations is studied in detail. In the special (1+1)-dimensional case, our considerations apply to the entire hierarchy of Burgers evolution equations.

1 Introduction
Our aim in this note is to display the close similarity between the well-known Cole-Hopf transformation relating the Burgers and the heat equation, and the celebrated Miura transform connecting the Korteweg-de Vries (KdV) and the modified KdV (mKdV) equation. In doing so we will introduce an additional twist in the Cole-Hopf transformation (cf. (1.23)), which to the best of our knowledge, appears to be new. Moreover, we will reveal the history of this transformation and uncover several instances of its rediscovery (including those by Cole and Hopf).

We start with a brief introductory account of the KdV and mKdV equations. The KdV equation [43] was derived as an equation modeling the behavior of shallow water waves moving in one direction by Korteweg and his student de Vries in 1895.^a The landmark discovery of the inverse scattering method by Gardner, Greene, Kruskal, and Miura in 1967 [20] (cf. also [21]) brought the KdV equation to the forefront of mathematical physics, and started the phenomenal development involving multiple disciplines of science as well as several branches of mathematics. The KdV equation (in a setting convenient for our purpose) reads
$$\mathrm{KdV}(V) = V_t - 6VV_x + V_{xxx} = 0, \tag{1.1}$$
while its modified counterpart, the mKdV equation, equals
$$\mathrm{mKdV}(\phi) = \phi_t - 6\phi^2\phi_x + \phi_{xxx} = 0. \tag{1.2}$$
Miura's fundamental discovery [46] was the realization that if $\phi$ satisfies the mKdV equation (1.2), then
$$V_\pm(x,t) = \phi(x,t)^2 \pm \phi_x(x,t), \quad (x,t) \in \mathbb{R}^2, \tag{1.3}$$
both satisfy the KdV equation. The transformation (1.3) has since been called the Miura transformation. Furthermore, explicit calculations by Miura showed the validity of the identity
$$\mathrm{KdV}(V_\pm) = (2\phi \pm \partial_x)\,\mathrm{mKdV}(\phi). \tag{1.4}$$

^a But the equation had been derived earlier by Boussinesq [7] in 1871, see Heyerhoff [37] and Pego [50].
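The Miura transformation (1.3) was quite prominently used in the construction of an infinite series of conservation laws for the KdV equation, see [47], [16, Sect. 5.1]. Miura's identity (1.4) then demonstrates how to transfer solutions of the mKdV equation to solutions of the KdV equation, but due to the nontrivial kernel of $(2\phi \pm \partial_x)$, the converse direction is not immediate. Introducing the first-order differential expression
$$P(V) = 2V\partial_x - V_x, \tag{1.5}$$
one derives
$$\mathrm{mKdV}(\phi) = \partial_x\left(\frac{1}{\psi}\bigl(\psi_t - P(V_+)\psi\bigr)\right), \tag{1.6}$$
where
$$\phi = \psi_x/\psi, \quad \psi > 0, \quad V_\pm = \phi^2 \pm \phi_x. \tag{1.7}$$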
Next, let $V = V(x,t)$ be a solution of the KdV equation, $\mathrm{KdV}(V) = 0$, and $\psi > 0$ be a function satisfying
$$\psi_t = P(V)\psi, \quad -\psi_{xx} + V\psi = 0. \tag{1.8}$$
Then one immediately deduces that $\phi = \psi_x/\psi$ solves the mKdV equation, $\mathrm{mKdV}(\phi) = 0$, and hence the Miura transformation has been "inverted".

The KdV equation (1.1) and the mKdV equation (1.2) are just the first (nonlinear) evolution equations in a countably infinite hierarchy of such equations (the (m)KdV hierarchy). As we will indicate at the end of Section 2, the considerations (1.3)-(1.8) extend to the entire hierarchy of these equations.

Next we turn to the Cole-Hopf transformation and its history. The classical Cole-Hopf transformation [13], [42], covered in most textbooks on partial differential equations, states that
$$u(x,t) = -2\,\frac{\psi_x(x,t)}{\psi(x,t)}, \quad (x,t) \in \mathbb{R} \times (0,\infty), \tag{1.9}$$
where $\psi > 0$ is a solution of the heat equation
$$\psi_t - \psi_{xx} = 0, \tag{1.10}$$
satisfies the (viscous) Burgers equation
$$u_t + uu_x - u_{xx} = 0. \tag{1.11}$$
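However, already in 1906, Forsyth, in his multi-volume treatise on differential equations ([19, p. 100]), discussed the equation (in his notation)
$$\frac{\partial}{\partial x}\left(\frac{\gamma - \alpha^2 - \frac{\partial\alpha}{\partial x}}{2\beta}\right) = \frac{\partial\alpha}{\partial y}, \tag{1.12}$$
where $\alpha = \alpha(x,y)$. Hence there exists a function $\theta$ such that
$$\alpha = \frac{\partial\theta}{\partial x}, \quad \gamma - \frac{\partial\alpha}{\partial x} - \alpha^2 = 2\beta\,\frac{\partial\theta}{\partial y}. \tag{1.13}$$
Assuming the function $z$ satisfies
$$z_{xx} + 2\alpha z_x + 2\beta z_y + \gamma z = 0, \tag{1.14}$$
an easy calculation shows that
$$\partial_x^2\bigl(ze^\theta\bigr) + 2\beta\,\partial_y\bigl(ze^\theta\bigr) = 0. \tag{1.15}$$
Introducing new variables $t = -y$ and $u(x,t) = -2\alpha(x,y)$ as well as fixing $\beta = 1/2$, $\gamma = 0$, and $z = 1$, one concludes that (1.12) indeed reduces to the viscous Burgers equation
$$u_t + uu_x - u_{xx} = 0, \tag{1.16}$$
while (1.15) equals
$$(e^\theta)_t - (e^\theta)_{xx} = 0, \tag{1.17}$$
with solutions related by
$$u = -2\theta_x. \tag{1.18}$$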
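The reduction of Forsyth's relations to the Burgers and heat equations can be checked directly. The following sketch uses only (1.13) and (1.15) together with the stated choices $\beta = 1/2$, $\gamma = 0$, $z = 1$, $t = -y$, $u = -2\alpha$; the intermediate steps are ours and are not taken from Forsyth's text.

```latex
% With \beta = 1/2 and \gamma = 0, (1.13) reads
%   \alpha = \theta_x,   -\alpha_x - \alpha^2 = \theta_y.
\begin{align*}
% differentiate the second relation in x and use \theta_{xy} = \alpha_y
\alpha_y + \alpha_{xx} + 2\alpha\alpha_x &= 0,\\
% substitute \alpha = -u/2 and \partial_y = -\partial_t
\tfrac12 u_t - \tfrac12 u_{xx} + \tfrac12 u u_x &= 0
  \quad\Longleftrightarrow\quad u_t + u u_x - u_{xx} = 0,\\
% with z = 1, (1.15) is (e^\theta)_{xx} + (e^\theta)_y = 0, i.e.
(e^\theta)_t - (e^\theta)_{xx} &= 0,
  \qquad u = -2\theta_x = -2\,\frac{(e^\theta)_x}{e^\theta}.
\end{align*}
```

Thus $\psi = e^\theta$ plays the role of the positive heat-equation solution in (1.9).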
However, Forsyth did not study the ramifications of this transformation, and no applications are discussed. Shortly thereafter, in 1915, Bateman [1] introduced the model equation
$$u_t + uu_x - \nu u_{xx} = 0. \tag{1.19}$$
He was interested in the vanishing viscosity limit, that is, when $\nu \to 0$. By studying solutions of the form $u = F(x + Ut)$, he concluded that "the question of the limiting form of the motion of a viscous fluid when the viscosity tends to zero requires very careful investigation". Only in 1940 did Burgers ([9, p. 8]) introduce^b what has later been called the (viscous) Burgers equation, as a simple model of turbulence, and did some preliminary investigation on properties of the solution. Taking advantage of the later rediscovered Forsyth transformation by Cole and Hopf, Burgers continued the investigations of what he called the nonlinear diffusion equation, focusing mainly on statistical aspects of the equation. The results of these investigations were collected in his book [11].

In 1948, Florin [18], in the context of applications to water-saturated flow, rediscovered Forsyth's transformation, which would become well-known under the name Cole-Hopf transformation only some 44 years later.

Although the Cole-Hopf transformation had already been published in 1906, it was only with the seminal papers by Hopf [42]^c in 1950^d and by Cole [13] in 1951 that the full impact of the simple transformation was seen. In particular the careful study by Hopf concerning the vanishing viscosity limit represented a landmark in the emerging theory of conservation laws. Even though the Cole-Hopf transformation is restricted to the Burgers equation, the insight and the motivation from this analysis has been of fundamental importance in the theory of conservation laws. Furthermore, Cole states the generalization of the Cole-Hopf transformation to a particular multi-dimensional system. More precisely, if $\psi = \psi(x,t)$, $(x,t) \in \mathbb{R}^n \times (0,\infty)$, satisfies the $n$-dimensional heat equation
$$\psi_t - \nu\Delta\psi = 0, \quad \nu > 0, \tag{1.20}$$
and one defines
$$u = -2\nu\nabla\ln(\psi), \tag{1.21}$$
then $u$ satisfies
$$u_t + (u\cdot\nabla)u - \nu\Delta u = 0, \tag{1.22}$$
and the vector-valued function $u = u(x,t) \in \mathbb{R}^n$ has as many components (i.e., $n$) as the dimension of the underlying space. Observe, in particular, that $u$ is irrotational (i.e., $u = \nabla w$ for some $w$, or equivalently, $\operatorname{curl} u = 0$). The multi-dimensional extension was rediscovered by Kuznetsov and Rozhdestvenskii [44] in 1961.

^b Frequently Burgers' equation is quoted from his 1948 paper [10], but he had already introduced it in 1940.
^c With a misprint in the title, writing $u_t + uu_x = \mu_{xx}$ rather than $u_t + uu_x = \mu u_{xx}$.
^d Hopf [42] states in a footnote (p. 202) that he had the "Cole-Hopf transformation" already in 1946, but "it was not until 1949 that I became sufficiently acquainted with the recent development of fluid dynamics to be convinced that a theory based on (1.19) could serve as an instructive introduction into some of the mathematical problems involved".

In this note we show the following relation,
$$u_t + uu_x - \nu u_{xx} = 2\nu\left(-\frac{1}{\psi}\partial_x + \frac{\psi_x}{\psi^2}\right)(\psi_t - \nu\psi_{xx}) = -2\nu\,\partial_x\left(\frac{1}{\psi}(\psi_t - \nu\psi_{xx})\right), \tag{1.23}$$
whenever $u = -2\nu\psi_x/\psi$ for a positive function $\psi$. This clearly displays the nature of the Cole-Hopf transformation and closely resembles Miura's identity (1.4) and the relation (1.6). Even though identity (1.23) is an elementary observation, much to our surprise, it appears to have escaped notice in the extensive literature on the Cole-Hopf transformation thus far. While both the KdV and mKdV equations are nonlinear partial differential equations, the case of the Burgers and heat equations just considered is a bit different since it relates a nonlinear and a linear partial differential equation (see also [6, Sect. 6.4]).

One can also extend the Cole-Hopf transformation to the case of a potential term $F$ in the heat equation, see, for instance, [38]. Here the relation (1.23) reads as follows,
$$u_t + uu_x - \nu u_{xx} + 2\nu F_x = -2\nu\,\partial_x\left(\frac{1}{\psi}(\psi_t - \nu\psi_{xx} - F\psi)\right), \tag{1.24}$$
whenever $u = -2\nu\partial_x\ln(\psi)$ for a positive function $\psi$. The case of Burgers' equation externally driven by a random potential term recently generated particular interest, see, for instance, [3], [4], [36], [38], [39], [40] and the references therein. We also mention a very interesting application of the Cole-Hopf transformation to the pair of the telegraph and a nonlinear Boltzmann equation in [41], generalizing the pair of the heat and Burgers equation considered in this note. As in the KdV-mKdV context, the Burgers equation (1.19) is just the first in a hierarchy of nonlinear evolution equations, and we will show in Section 3 that identity (1.23) extends to the entire hierarchy of Burgers equations.

Equation (1.24) extends to the multi-dimensional case corresponding to (1.22) and one obtains
$$u_t + \alpha(u\cdot\nabla)u - \nu\Delta u + \frac{2\nu}{\alpha}\nabla F = -\frac{2\nu}{\alpha}\nabla\left(\frac{1}{\psi}(\psi_t - \nu\Delta\psi - F\psi)\right), \tag{1.25}$$
whenever $\alpha \in \mathbb{R}\setminus\{0\}$ and $u = -(2\nu/\alpha)\nabla\ln(\psi)$ for a positive function $\psi$.

Obviously there is a close similarity between the heat and the Burgers equation expressed by (1.23), and Miura's identity (1.4) relating the mKdV and the KdV equation. The principal idea underlying these considerations is that one (hierarchy of) evolution equation(s) can be represented as a linear differential expression acting on another (hierarchy of) evolution equation(s). As long as the null space of this linear differential expression can be analyzed in detail, it becomes possible to transfer solutions, in fact, entire classes of solutions (e.g., rational, soliton, algebro-geometric solutions, etc.) between these evolution equations. In concrete applications, however, it turns out to be simpler to rewrite a relationship between two evolution equations, such as (1.4) and (1.23), in a form analogous to (1.6) and the second relation in (1.23), rather than analyzing the nullspaces of (2
2 The Miura transformation
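We turn to the precise formulation of the relations between the KdV and the mKdV equation and omit details of a purely calculational nature.

Lemma 2.1 Let $\psi = \psi(x,t) > 0$ be a positive function such that $\psi \in C^{4,0}(\mathbb{R}\times\mathbb{R})$, $\psi_t \in C^{1,0}(\mathbb{R}\times\mathbb{R})$. Define $\phi = \psi_x/\psi$. Then $\phi \in C^{3,1}(\mathbb{R}\times\mathbb{R})$ and
$$\mathrm{mKdV}(\phi) = \pm\partial_x\Bigl(\psi^{\mp1}\bigl((\psi^{\pm1})_t - P_1(V_\pm)\psi^{\pm1}\bigr)\Bigr), \tag{2.1}$$
where
$$P_1(V) = 2V\partial_x - V_x \tag{2.2}$$
and
$$V_\pm = \phi^2 \pm \phi_x. \tag{2.3}$$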
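For orientation, the upper-sign case of (2.1) can be checked in a few lines. The sketch below introduces the auxiliary function $w = \ln\psi$ (our notation, not the authors'), so that $\phi = w_x$ and $V_+ = \phi^2 + \phi_x = \psi_{xx}/\psi$.

```latex
\begin{align*}
\psi^{-1}\bigl(\psi_t - P_1(V_+)\psi\bigr)
  &= w_t - 2V_+\phi + V_{+,x},\\
\partial_x\bigl(w_t - 2V_+\phi + V_{+,x}\bigr)
  &= \phi_t - 2\bigl(\phi^3 + \phi\phi_x\bigr)_x + (\phi^2)_{xx} + \phi_{xxx}\\
  &= \phi_t - 6\phi^2\phi_x - 2(\phi\phi_x)_x + 2(\phi\phi_x)_x + \phi_{xxx}\\
  &= \phi_t - 6\phi^2\phi_x + \phi_{xxx} = \mathrm{mKdV}(\phi).
\end{align*}
```

The lower-sign case follows by applying the same computation to $\psi^{-1}$ in place of $\psi$, which replaces $\phi$ by $-\phi$ and $V_+$ by $V_-$, together with $\mathrm{mKdV}(-\phi) = -\mathrm{mKdV}(\phi)$.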
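Proof. A straightforward calculation.

The application to the KdV equation then reads as follows.

Theorem 2.2 Let $V = V(x,t)$ be a solution of the KdV equation, $\mathrm{KdV}(V) = 0$, with $V \in C^{3,1}(\mathbb{R}\times\mathbb{R})$, and let $\psi > 0$ be a positive function satisfying $\psi \in C^{2,0}(\mathbb{R}\times\mathbb{R})$, $\psi_t \in C^{1,0}(\mathbb{R}\times\mathbb{R})$ and
$$\psi_t = P_1(V)\psi, \quad -\psi_{xx} + V\psi = 0, \tag{2.4}$$
with $P_1(V)$ given by (2.2). Define $\phi = \psi_x/\psi$ and $\tilde V = \phi^2 - \phi_x$. Then $\phi$ satisfies the mKdV equation,
$$\mathrm{mKdV}(\phi) = 0. \tag{2.5}$$
Moreover, $\tilde V$ satisfies $\tilde V \in C^{3,1}(\mathbb{R}\times\mathbb{R})$ and the KdV equation,
$$\mathrm{KdV}(\tilde V) = 0. \tag{2.6}$$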
Proof. A computation based on Lemma 2.1.

Originally, Theorem 2.2 was proved in [31] (see also [22], [23], [30], [32], [33]) using supersymmetric methods. The above arguments, following [34] in the context of the (modified) Kadomtsev-Petviashvili equation, result in considerably shorter calculations. The "if part" in Theorem 2.2 also follows from prolongation methods developed in [51]. A different approach to Theorem 2.2, assuming rapidly decreasing solutions of the KdV equation, can be found in Sect. 38 of [2].

Remark 2.3 The chain of transformations $V \to$
(2.7)
reveals a Bäcklund transformation between the KdV and mKdV equations ($V \to$
(2.8) (2.9)
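where
$$X_0 = 1, \quad X_{n+1,x} = -\tfrac12 X_{n,xxx} + 2VX_{n,x} + V_xX_n, \quad n \in \mathbb{N}_0, \tag{2.10}$$
$$Y_0 = 1, \quad Y_{n+1,x} = -\tfrac12 Y_{n,xxx} + 2\phi^2Y_{n,x} + 2\phi_x\bigl(\partial_x^{-1}(\phi Y_{n,x}) + c_n\bigr), \quad n \in \mathbb{N}_0, \tag{2.11}$$
where $c_n$, $n \in \mathbb{N}$, are the integration constants chosen to define $X_n$, $c_0 = 1$, and we denote
$$(\partial_x^{-1}f)(x) = \int^x dx'\,f(x'). \tag{2.12}$$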
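It can be shown that $X_n(V)$, $Y_n(\phi)$ are differential polynomials in $V$ and $\phi$, respectively (see, e.g., [17, Sect. 2.3]). In particular, one can show that $\phi Y_{n,x}$ is a total derivative and our notation in (2.11) implicitly assumes a homogeneous integration procedure with $c_n$ the only integration constant. Explicit computations then reveal
$$X_0 = c_0 = 1, \quad X_1 = V + c_1, \quad X_2 = -\tfrac12 V_{xx} + \tfrac32 V^2 + c_1V + c_2, \ \text{etc.}, \tag{2.13}$$
$$Y_0 = 1, \quad Y_1 = 2\phi + d_1, \quad Y_2 = -\phi_{xx} + 2\phi^3 + c_1 2\phi + d_2, \ \text{etc.}, \tag{2.14}$$
with $c_n$, $d_n$, $n \in \mathbb{N}$, integration constants, and thus,
$$\mathrm{KdV}_0(V) = V_t - 2V_x = 0, \quad \mathrm{KdV}_1(V) = V_t + V_{xxx} - 6VV_x - c_1 2V_x = 0, \ \text{etc.}, \tag{2.15}$$
$$\mathrm{mKdV}_0(\phi) = \phi_t - 2\phi_x = 0, \quad \mathrm{mKdV}_1(\phi) = \phi_t + \phi_{xxx} - 6\phi^2\phi_x - c_1 2\phi_x = 0, \ \text{etc.} \tag{2.16}$$
Moreover, one has
$$\mathrm{KdV}_n(V_\pm) = (2\phi \pm \partial_x)\,\mathrm{mKdV}_n(\phi), \quad n \in \mathbb{N}_0, \tag{2.17}$$
and
$$\mathrm{mKdV}_n(\phi) = \pm\partial_x\Bigl(\psi^{\mp1}\bigl((\psi^{\pm1})_t - P_n(V_\pm)\psi^{\pm1}\bigr)\Bigr), \quad n \in \mathbb{N}_0, \tag{2.18}$$
where
$$P_n(V) = 2X_n(V)\partial_x - X_{n,x}(V), \quad n \in \mathbb{N}_0. \tag{2.19}$$

3 The Cole-Hopf transformation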
Finally we return to relations (1.23) and (1.24). Since they are all proved by explicit calculations we may omit these details and focus on a precise formulation of the results instead.

Lemma 3.1 Let $\psi = \psi(x,t) > 0$ be a positive function with $\psi \in C^{3,0}(\mathbb{R}\times(0,\infty))$, $\psi_t \in C^{1,0}(\mathbb{R}\times(0,\infty))$. Define $u = -2\nu\psi_x/\psi$ with $\nu > 0$. Then $u \in C^{2,1}(\mathbb{R}\times(0,\infty))$ and
$$u_t + uu_x - \nu u_{xx} = 2\nu\left(-\frac{1}{\psi}\partial_x + \frac{\psi_x}{\psi^2}\right)(\psi_t - \nu\psi_{xx}) \tag{3.1}$$
$$= -2\nu\,\partial_x\left(\frac{1}{\psi}(\psi_t - \nu\psi_{xx})\right). \tag{3.2}$$
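Since the proof is left to the reader, here is one way to organize the calculation behind (3.2). The auxiliary function $w = \ln\psi$ is our notation and does not appear in the text; with it, $u = -2\nu w_x$.

```latex
\begin{align*}
\frac{\psi_t}{\psi} &= w_t, \qquad \frac{\psi_{xx}}{\psi} = w_{xx} + w_x^2,\\
-2\nu\,\partial_x\Bigl(\frac{1}{\psi}(\psi_t - \nu\psi_{xx})\Bigr)
  &= -2\nu\,\partial_x\bigl(w_t - \nu w_{xx} - \nu w_x^2\bigr)\\
  &= \underbrace{-2\nu w_{xt}}_{u_t}
     + \underbrace{2\nu^2 w_{xxx}}_{-\nu u_{xx}}
     + \underbrace{4\nu^2 w_x w_{xx}}_{u u_x}.
\end{align*}
```

Equality of (3.1) and (3.2) is the product rule $\partial_x(f/\psi) = f_x/\psi - f\psi_x/\psi^2$ applied to $f = \psi_t - \nu\psi_{xx}$.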
The extension to the case with a potential term $F$ in the heat equation reads as follows.

Lemma 3.2 Let $F \in C^{1,0}(\mathbb{R}\times(0,\infty))$ and assume $\psi = \psi(x,t) > 0$ to be a positive function such that $\psi \in C^{3,0}(\mathbb{R}\times(0,\infty))$, $\psi_t \in C^{1,0}(\mathbb{R}\times(0,\infty))$. Define $u = -2\nu\psi_x/\psi$ with $\nu > 0$. Then $u \in C^{2,1}(\mathbb{R}\times(0,\infty))$ and
$$u_t + uu_x - \nu u_{xx} + 2\nu F_x = -2\nu\,\partial_x\left(\frac{1}{\psi}(\psi_t - \nu\psi_{xx} - F\psi)\right). \tag{3.3}$$
We can exploit these relations as follows.

Theorem 3.3 Let $F \in C^{1,0}(\mathbb{R}\times(0,\infty))$ and $\nu > 0$.
(i) Suppose $u$ satisfies $u \in C^{2,1}(\mathbb{R}\times(0,\infty))$, and
$$u_t + uu_x - \nu u_{xx} + 2\nu F_x = 0 \tag{3.4}$$
for some $\nu > 0$. Define
$$\psi(x,t) = \exp\left(-\frac{1}{2\nu}\int^x dy\,u(y,t)\right). \tag{3.5}$$
Then $\psi$ satisfies $0 < \psi \in C^{3,1}(\mathbb{R}\times(0,\infty))$ and
$$\frac{1}{\psi}(\psi_t - \nu\psi_{xx} - F\psi) = C(t) \tag{3.6}$$
for some $x$-independent $C \in C(\mathbb{R})$.
(ii) Let $\psi > 0$ be a positive function satisfying $\psi \in C^{3,0}(\mathbb{R}\times(0,\infty))$, $\psi_t \in C^{1,0}(\mathbb{R}\times(0,\infty))$ and suppose
$$\psi_t - \nu\psi_{xx} - F\psi = 0 \tag{3.7}$$
for some $\nu > 0$. Define
$$u = -2\nu\,\frac{\psi_x}{\psi}. \tag{3.8}$$
Then $u \in C^{2,1}(\mathbb{R}\times(0,\infty))$ satisfies (3.4).

Remark 3.4 One can "scale away" $C(t)$ in Theorem 3.3 (i) by introducing a new function $\tilde\psi$. In fact, the function $\tilde\psi(x,t) = \psi(x,t)\exp\bigl(-\int_0^t ds\,C(s)\bigr)$ satisfies
$$\tilde\psi_t = \nu\tilde\psi_{xx} + F\tilde\psi. \tag{3.9}$$
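The claim of Remark 3.4 amounts to a one-line computation. The sketch below assumes that $C$ is integrable near $s = 0$, so that the exponent is well defined; this assumption is ours and is not spelled out in the remark. The abbreviation $E(t)$ is likewise our notation.

```latex
% E(t) = \exp(-\int_0^t ds\, C(s)), hence E' = -C E and \tilde\psi = \psi E.
\begin{align*}
\tilde\psi_t &= (\psi_t - C\psi)\,E
  = (\nu\psi_{xx} + F\psi)\,E            && \text{by (3.6)}\\
 &= \nu\tilde\psi_{xx} + F\tilde\psi,\\
\frac{\tilde\psi_x}{\tilde\psi} &= \frac{\psi_x}{\psi}
                                         && \text{since $E$ does not depend on $x$.}
\end{align*}
```

In particular $\tilde\psi$ and $\psi$ give rise to the same $u = -2\nu\psi_x/\psi$.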
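Remark 3.5 Using the standard representation of solutions of the heat equation initial value problem,
$$\psi_t - \nu\psi_{xx} = 0, \quad \psi(x,0) = \psi_0(x), \tag{3.10}$$
assuming
$$\psi_0 \in C(\mathbb{R}), \quad \psi_0(x) < C_1\exp\bigl(C_2|x|^{1+\gamma}\bigr) \ \text{for } |x| > R \tag{3.11}$$
for some $R > 0$, $C_j > 0$, $j = 1,2$, and $0 < \gamma < 1$, given by (cf. [12, Ch. 3], [14, Ch. V])
$$\psi(x,t) = \frac{1}{2(\pi\nu t)^{1/2}}\int_{\mathbb{R}} dy\,\exp\bigl(-(x-y)^2/4\nu t\bigr)\psi_0(y), \quad t > 0, \tag{3.12}$$
the solution of the corresponding initial value problem for the Burgers equation
$$u_t + uu_x - \nu u_{xx} = 0, \quad u(x,0) = u_0(x), \tag{3.13}$$
reads
$$u(x,t) = \frac{\int_{\mathbb{R}} dy\,(x-y)t^{-1}\exp\Bigl(-(2\nu)^{-1}\int_0^y d\eta\,u_0(\eta) - (x-y)^2(4\nu t)^{-1}\Bigr)}{\int_{\mathbb{R}} dy\,\exp\Bigl(-(2\nu)^{-1}\int_0^y d\eta\,u_0(\eta) - (x-y)^2(4\nu t)^{-1}\Bigr)}, \tag{3.14}$$
assuming $u_0 \in C(\mathbb{R})$ and
$$u_0(x) > 0 \quad \text{or} \quad \left|\int_0^x dy\,u_0(y)\right| \underset{|x|\to\infty}{=} O\bigl(|x|^{1+\gamma}\bigr) \ \text{for some } \gamma < 1. \tag{3.15}$$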
Before turning to multi-dimensional extensions of the Cole-Hopf transformation we briefly discuss the Burgers hierarchy of nonlinear evolution equations and derive the analog of identity (3.2) for the Burgers hierarchy. The higher-order Burgers equations (choosing $\nu = 1$ for simplicity) can be defined recursively by [48], [49, Sect. 5.2],
$$B_n(u) = u_t + 2Z_{n+1,x}(u) = 0, \quad n \in \mathbb{N}_0, \tag{3.16}$$
where
$$Z_0 = 1, \quad Z_{n+1,x} = Z_{n,xx} - \tfrac12 uZ_{n,x} - \tfrac12 u_xZ_n, \quad n \in \mathbb{N}_0. \tag{3.17}$$
Explicitly, one computes
$$Z_0 = c_0 = 1, \quad Z_1 = -\tfrac12 u + c_1, \tag{3.18}$$
$$Z_2 = -\tfrac12 u_x + \tfrac14 u^2 - c_1\tfrac12 u + c_2, \ \text{etc.}, \tag{3.19}$$
with integration constants $c_n$, $n \in \mathbb{N}$, and thus
$$B_0(u) = u_t - u_x = 0, \tag{3.20}$$
$$B_1(u) = u_t - u_{xx} + uu_x - c_1u_x = 0, \ \text{etc.} \tag{3.21}$$
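The first members of the hierarchy can be recovered by hand from (3.16) and (3.17). The sketch below rests on the observation (ours, and checked here only for the two steps shown) that the recursion (3.17) is the $x$-derivative of a first-order relation whose integration constant we label $c_{n+1}$.

```latex
% Integrated form of (3.17), used for n = 0, 1:
%   Z_{n+1} = Z_{n,x} - (1/2) u Z_n + c_{n+1}.
\begin{align*}
Z_1 &= Z_{0,x} - \tfrac12 uZ_0 + c_1 = -\tfrac12 u + c_1,\\
Z_2 &= Z_{1,x} - \tfrac12 uZ_1 + c_2
     = -\tfrac12 u_x + \tfrac14 u^2 - \tfrac12 c_1u + c_2,\\
B_0(u) &= u_t + 2Z_{1,x} = u_t - u_x,\\
B_1(u) &= u_t + 2Z_{2,x} = u_t - u_{xx} + uu_x - c_1u_x.
\end{align*}
```

For $c_1 = 0$ the equation $B_1(u) = 0$ is the Burgers equation (1.19) with $\nu = 1$.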
Introducing $\psi$ as in Lemma 3.1, that is, assuming
$$u = -2\psi_x/\psi, \tag{3.22}$$
the analog of relation (3.2) for the Burgers hierarchy now reads
$$B_n(u) = -2\partial_x\Bigl(\psi^{-1}\Bigl(\psi_t - \sum_{j=1}^{n+1}c_{n+1-j}\partial_x^j\psi\Bigr)\Bigr), \quad n \in \mathbb{N}_0. \tag{3.23}$$
Since clearly
$$u_t = -2\partial_x\bigl(\psi^{-1}\psi_t\bigr), \tag{3.24}$$
relation (3.23) is equivalent to
$$Z_{n+1,x}(u) = \partial_x\Bigl(\psi^{-1}\sum_{j=1}^{n+1}c_{n+1-j}\partial_x^j\psi\Bigr), \quad n \in \mathbb{N}_0. \tag{3.25}$$
Since (3.25) is easily verified for $n = 0, 1$ one can proceed inductively as follows. Assuming the homogeneous case where $c_j = 0$ for $j = 1,\dots,n$, and supposing (3.25) to hold in the homogeneous case, that is,
$$Z_{j+1}(u) = \psi^{-1}\partial_x^{j+1}\psi, \quad j = 0,\dots,n, \tag{3.26}$$
one computes
$$Z_{n+2,x}(u) = \partial_x^2\bigl(\psi^{-1}\partial_x^{n+1}\psi\bigr) + \psi^{-1}\psi_x\,\partial_x\bigl(\psi^{-1}\partial_x^{n+1}\psi\bigr) + \psi^{-2}\bigl(\psi\psi_{xx} - \psi_x^2\bigr)\bigl(\psi^{-1}\partial_x^{n+1}\psi\bigr). \tag{3.27}$$
Similarly, one calculates
$$\partial_x\bigl(\psi^{-1}\partial_x^{n+2}\psi\bigr) = \text{right-hand side of (3.27)}, \tag{3.28}$$
and hence
$$Z_{n+2}(u) = \psi^{-1}\partial_x^{n+2}\psi \tag{3.29}$$
in the homogeneous case, completing the proof.

Given the analog of identity (3.2) for the entire Burgers hierarchy in the form of (3.23), one can derive the analog of Theorem 3.3. We omit further details along these lines but note that the Cole-Hopf transformation was used by Blaszak [5] to transform the $n$th order homogeneous Burgers equation $B_n(u) = 0$ to the linear partial differential equation $\psi_t - \partial_x^{n+1}\psi = 0$. Identity (3.23) goes beyond this transformation as detailed in Theorem 3.3 in the case $n = 1$.

The multi-dimensional extension of Lemma 3.2 reads as follows.
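Lemma 3.6 Let $F \in C^{1,0}(\mathbb{R}^n\times(0,\infty))$ and assume $\psi = \psi(x,t) > 0$ to be a positive function such that $\psi \in C^{3,0}(\mathbb{R}^n\times(0,\infty))$, $\psi_t \in C^{1,0}(\mathbb{R}^n\times(0,\infty))$. Define $u = -(2\nu/\alpha)\nabla\ln(\psi)$ with $\alpha \in \mathbb{R}\setminus\{0\}$, $\nu > 0$. Then $u \in C^{2,1}(\mathbb{R}^n\times(0,\infty);\mathbb{R}^n)$ and
$$u_t + \alpha(u\cdot\nabla)u - \nu\Delta u + \frac{2\nu}{\alpha}\nabla F = -\frac{2\nu}{\alpha}\nabla\left(\frac{1}{\psi}(\psi_t - \nu\Delta\psi - F\psi)\right). \tag{3.30}$$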
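As in the one-dimensional case, (3.30) can be verified by passing to the logarithm. The sketch below writes $w = \ln\psi$ (our notation), so that $u = -(2\nu/\alpha)\nabla w$, and uses the identity $(\nabla w\cdot\nabla)\nabla w = \tfrac12\nabla|\nabla w|^2$, which holds because $\nabla w$ is a gradient field.

```latex
\begin{align*}
\frac{\psi_t}{\psi} &= w_t, \qquad \frac{\Delta\psi}{\psi} = \Delta w + |\nabla w|^2,\\
-\frac{2\nu}{\alpha}\nabla\Bigl(\frac{1}{\psi}(\psi_t - \nu\Delta\psi - F\psi)\Bigr)
  &= -\frac{2\nu}{\alpha}\nabla\bigl(w_t - \nu\Delta w - \nu|\nabla w|^2 - F\bigr)\\
  &= u_t - \nu\Delta u + \frac{2\nu^2}{\alpha}\nabla|\nabla w|^2 + \frac{2\nu}{\alpha}\nabla F,\\
\alpha(u\cdot\nabla)u
  &= \frac{4\nu^2}{\alpha}(\nabla w\cdot\nabla)\nabla w
   = \frac{2\nu^2}{\alpha}\nabla|\nabla w|^2.
\end{align*}
```

Substituting the last line into the previous one gives the left-hand side of (3.30).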
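Our final result shows how to transfer solutions between the multi-dimensional Burgers equation and the heat equation.

Theorem 3.7 Let $F \in C^{1,0}(\mathbb{R}^n\times(0,\infty))$, $\alpha \in \mathbb{R}\setminus\{0\}$, and $\nu > 0$.
(i) Assume that $u \in C^{2,1}(\mathbb{R}^n\times(0,\infty);\mathbb{R}^n)$ satisfies
$$u = \nabla\Phi \tag{3.31}$$
for some potential $\Phi \in C^{3,1}(\mathbb{R}^n\times(0,\infty))$ and
$$u_t + \alpha(u\cdot\nabla)u - \nu\Delta u + \frac{2\nu}{\alpha}\nabla F = 0. \tag{3.32}$$
Define
$$\psi = \exp\left(-\frac{\alpha}{2\nu}\Phi\right). \tag{3.33}$$
Then $\psi \in C^{3,1}(\mathbb{R}^n\times(0,\infty))$ and
$$\frac{1}{\psi}(\psi_t - \nu\Delta\psi - F\psi) = C(t), \tag{3.34}$$
for some $x$-independent $C \in C((0,\infty))$.
(ii) Let $\psi > 0$ be a positive function satisfying $\psi \in C^{3,0}(\mathbb{R}^n\times(0,\infty))$, $\psi_t \in C^{1,0}(\mathbb{R}^n\times(0,\infty))$ and suppose
$$\psi_t - \nu\Delta\psi - F\psi = 0. \tag{3.35}$$
Define
$$u = -\frac{2\nu}{\alpha}\nabla\ln(\psi). \tag{3.36}$$
Then $u \in C^{2,1}(\mathbb{R}^n\times(0,\infty);\mathbb{R}^n)$ satisfies (3.32).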
Acknowledgments Supported in part by the Research Council of Norway under grant 107510/410 and the University of Missouri Research Board grant RB-97-086. We thank Mehmet Unal and Karl Unterkofler for discussions on the Burgers and (m)KdV equations, respectively.
References

1. H. Bateman, Some recent researches on the motion of fluids, Monthly Weather Rev. 43, 163-170 (1915).
2. R. Beals, P. Deift, and C. Tomei, Direct and Inverse Scattering on the Line, Mathematical Surveys and Monographs, Vol. 28, Amer. Math. Soc., Providence, RI, 1988.
3. F. E. Benth and L. Streit, The Burgers equation with a non-Gaussian random force, Stochastic Analysis and Related Topics, L. Decreusefond, J. Gjerde, B. Øksendal, and A. S. Üstünel (eds.), Stochastic Monographs, Vol. 6, Birkhäuser, Basel, 1998, pp. 267-278.
4. F. E. Benth, T. Deck, J. Potthoff, and L. Streit, Nonlinear evolution equations with gradient coupled noise, Lett. Math. Phys. 43, 267-278 (1998).
5. M. Blaszak, Solving Lax pairs and perturbing time-dependent force for Burger's hierarchy, Acta Phys. Polonica A 70, 523-528 (1986).
6. G. W. Bluman and S. Kumei, Symmetries and Differential Equations, Springer, New York, 1989.
7. J. Boussinesq, Théorie de l'intumescence liquide appelée onde solitaire ou de translation, se propageant dans un canal rectangulaire, C. R. Acad. Sci. Paris Sér. I Math. 72, 755-759 (1871).
8. W. Bulla, F. Gesztesy, H. Holden, and G. Teschl, Algebro-geometric quasi-periodic finite-gap solutions of the Toda and Kac-van Moerbeke hierarchies, Mem. Amer. Math. Soc. 135, No. 641, 1-79 (1998).
9. J. M. Burgers, Application of a model system to illustrate some points of the statistical theory of free turbulence, Proc. Konink. Nederl. Akad. Wetensch. 43, 2-12 (1940).
10. J. M. Burgers, A mathematical model illustrating the theory of turbulence, Proc. Konink. Nederl. Akad. Wetensch. 1, 171-199 (1948).
11. J. M. Burgers, The Nonlinear Diffusion Equation, Reidel, Dordrecht, 1974.
12. J. R. Cannon, The One-Dimensional Heat Equation, Addison-Wesley, Reading, MA, 1984.
13. J. D. Cole, On a quasi-linear parabolic equation occurring in aerodynamics, Quart. Appl. Math. 9, 225-236 (1951).
14. E. DiBenedetto, Partial Differential Equations, Birkhäuser, Boston, 1995.
15. D. B. Dix, Nonuniqueness and uniqueness in the initial-value problem for Burgers' equation, SIAM J. Math. Anal. 27, 708-724 (1996).
16. P. G. Drazin and R. S. Johnson, Solitons: An Introduction, Cambridge University Press, Cambridge, 1989.
17. G. Eilenberger, Solitons, Springer, Berlin, 1983.
18. V. A. Florin, Some simple nonlinear problems of consolidation of water-saturated soils, Isvestia Akad. Nauk. SSR, Oth. Techn. Nauk, No. 9, 1389-1397 (1948). (In Russian.)
19. A. R. Forsyth, Theory of Differential Equations. Part IV - Partial Differential Equations, Cambridge University Press, Cambridge, 1906. Republished by Dover, New York, 1959.
20. C. S. Gardner, J. M. Greene, M. D. Kruskal, and R. M. Miura, Method for solving the Korteweg-de Vries equation, Phys. Rev. Lett. 19, 1095-1097 (1967).
21. C. S. Gardner, J. M. Greene, M. D. Kruskal, and R. M. Miura, Korteweg-de Vries equation and generalizations. VI. Methods for exact solution, Comm. Pure Appl. Math. 27, 97-133 (1974).
22. F. Gesztesy, Some applications of commutation methods, in Schrödinger Operators, H. Holden and A. Jensen (eds.), Lecture Notes in Physics 345, Springer, Berlin, 1989, pp. 93-117.
23. F. Gesztesy, On the modified Korteweg-de Vries equation, in Differential Equations with Applications in Biology, Physics, and Engineering, J. A. Goldstein, F. Kappel, and W. Schappacher (eds.), M. Dekker, New York, 1991, pp. 139-183.
24. F. Gesztesy and H. Holden, Hierarchies of Soliton Equations and their Algebro-Geometric Solutions, monograph in preparation.
25. F. Gesztesy, H. Holden, E. Saab, and B. Simon, Explicit construction of solutions of the modified Kadomtsev-Petviashvili equation, J. Funct. Anal. 98, 211-228 (1991).
26. F. Gesztesy, H. Holden, B. Simon, and Z. Zhao, On the Toda and Kac-van Moerbeke systems, Trans. Amer. Math. Soc. 339, 849-868 (1993).
27. F. Gesztesy, D. Race, K. Unterkofler, and R. Weikard, On Gelfand-Dickey and Drinfeld-Sokolov systems, Rev. Math. Phys. 6, 227-276 (1994).
28. F. Gesztesy, D. Race, and R. Weikard, On (modified) Boussinesq-type systems and factorizations of associated linear differential expressions, J. London Math. Soc. 47, 321-340 (1993).
29. F. Gesztesy, R. Ratnaseelan, and G. Teschl, The KdV hierarchy and associated trace formulas, in Recent Developments in Operator Theory and its Applications, I. Gohberg, P. Lancaster, and P. N. Shivakumar (eds.), Operator Theory: Advances and Applications, Vol. 87, Birkhäuser, Basel, 1996, pp. 125-163.
30. F. Gesztesy, W. Schweiger, and B. Simon, Commutation methods applied to the mKdV-equation, Trans. Amer. Math. Soc. 324, 465-525 (1991).
31. F. Gesztesy and B. Simon, Constructing solutions of the mKdV-equation, J. Funct. Anal. 89, 53-60 (1990).
32. F. Gesztesy and R. Svirsky, (m)KdV solitons on the background of quasi-periodic finite-gap solutions, Mem. Amer. Math. Soc. 118, No. 563, 1-88 (1995).
33. F. Gesztesy and K. Unterkofler, Isospectral deformations for Sturm-Liouville and Dirac-type operators and associated nonlinear evolution equations, Rep. Math. Phys. 31, 113-137 (1992).
34. F. Gesztesy and K. Unterkofler, On the (modified) Kadomtsev-Petviashvili hierarchy, Diff. Integral Eqs. 8, 797-812 (1995).
35. F. Gesztesy and Z. Zhao, On critical and subcritical Schrödinger operators, J. Funct. Anal. 98, 311-345 (1991).
36. M. Grothaus, Yu. G. Kondratiev, and L. Streit, Scaling limits for the solution of Wick type Burgers equation, BiBoS preprint 824/6/98, Univ. of Bielefeld, Germany, 1998.
37. M. Heyerhoff, Die frühe Geschichte der Solitonentheorie, Ph.D. thesis, Ernst-Moritz-Arndt-Universität Greifswald, Germany, 1997. (In German.)
38. H. Holden, B. Øksendal, J. Ubøe, and T. Zhang, Stochastic Partial Differential Equations, Birkhäuser, Basel, 1996.
39. H. Holden, T. Lindstrøm, B. Øksendal, J. Ubøe, and T.-S. Zhang, The Burgers equation with a noisy force, Communications in Partial Differential Equations 19, 119-142 (1994).
40. H. Holden, T. Lindstrøm, B. Øksendal, J. Ubøe, and T.-S. Zhang, The stochastic Wick-type Burgers equation, in Stochastic Partial Differential Equations (Edinburgh, 1994), London Mathematical Society Lecture Notes Series, Vol. 216, A. Etheridge (ed.), Cambridge University Press, Cambridge, 1995, pp. 141-161.
41. M.-O. Hongler and L. Streit, A probabilistic connection between the Burger and a discrete Boltzmann equation, Europhys. Lett. 12, 193-197 (1990).
42. E. Hopf, The partial differential equation $u_t + uu_x = \mu_{xx}$, Comm. Pure Appl. Math. 3, 201-230 (1950).
43. D. J. Korteweg and G. de Vries, On the change of form of long waves advancing in a rectangular canal, and on a new type of long stationary waves, Phil. Mag. 39(5), 422-443 (1895).
44. N. N. Kuznetsov and B. L. Rozhdestvenskii, The solution of Cauchy's problem of quasi-linear equations in many independent variables, Comput. Math. Math. Phys. 1, 241-248 (1961).
45. P. D. Lax, Periodic solutions of the KdV equations, Lectures Appl. Math. 15, 85-96 (1974).
46. R. M. Miura, Korteweg-de Vries equation and generalizations. I. A remarkable explicit nonlinear transformation, J. Math. Phys. 9, 1202-1204 (1968).
47. R. M. Miura, C. S. Gardner, and M. D. Kruskal, Korteweg-de Vries equation and generalizations. II. Existence of conservation laws and constants of motion, J. Math. Phys. 9, 1204-1209 (1968).
48. P. J. Olver, Evolution equations possessing infinitely many symmetries, J. Math. Phys. 18, 1212-1215 (1977).
49. P. J. Olver, Applications of Lie Groups to Differential Equations, 2nd ed., Springer, New York, 1993.
50. R. Pego, Origin of the KdV equation, Notices Amer. Math. Soc. 45, 358 (1998).
51. H. D. Wahlquist and F. B. Estabrook, Prolongation structures of nonlinear evolution equations, J. Math. Phys. 16, 1-7 (1975).
SOME MODELS OF NONIDEAL BOSE GAS WITH BOSE-EINSTEIN CONDENSATE

Dedicated with admiration to Prof. Ludwig Streit on his 60th birthday

ROMAN GIELERAK
Technical University of Zielona Gora, 65-246 Zielona Gora, Poland
E-mail: gielerak@proton.ift.uni.wroc.pl

Using functional integral techniques, homogeneous limits of the perturbations of thermal states (describing nonrelativistic Bose Matter at thermal equilibrium) by bounded cocycles are constructed rigorously. Additionally, some elementary properties of these limiting states are discussed and in particular the preservation of the nonpurity in the critical case is proved.

1 Introduction

1.1 The problem of Bose-Einstein condensate
One of the most spectacular achievements in the experimental low temperature physics of the past few years is the laboratory realization of the Bose-Einstein condensate (BEC) of cold atoms [1,2,3]. This bizarre quantum state of Bose Matter is formed at nanokelvin temperatures and also requires high atomic densities. The BEC is formed when the quantum wave packets of atoms overlap at low temperatures and the atoms condense, almost motionless, into the lowest quantum state. This means that the wavelength of the matter waves associated with the cold atoms [4], the de Broglie waves, becomes comparable in size to the mean atomic distances in a cold and dense sample. The phenomenon of BEC was predicted by Bose [4] and Einstein [5] already in 1924-25. But even for the case of the simplest ideal Bose Gas the complete mathematical proof took a long period of time, ending successfully only in the 1970s with the work of the Dublin group [7,8]. Concerning the standard, gauge-invariant many-body Hamiltonians with realistic pair interatomic potentials, the situation from the theoretical physics point of view is still very far from being clarified. It seems that the class of models which is closest to realistic systems while remaining mathematically tractable is that described by the so-called model systems with diagonal Hamiltonians [8].

In a series of papers [9,10,11,12] new mathematical technologies for studying the nonrelativistic Bose Matter at thermal equilibrium and at nonzero temperature were invented. They are based on the observation that the modular structure of the free Bose Matter has the so-called stochastic positivity property [11]. It is this property which enables us to use a certain functional integral/random field description of these genuinely noncommutative structures. Additionally, those commutative analysis methods open the door to applying the methods of classical statistical mechanics to the study of perturbations of the free thermal structures by the so-called thermo-field like perturbations. Such a programme was initiated in [9,10] and developed in certain directions in [11,12]. The present contribution continues the analysis of [9,10] for the so-called gentle perturbations of the free Bose Matter. A new, purely commutative strategy for studying the standard many-body Hamiltonians initiated in [13] has been extended recently in [12,16], see also the activity in [17].

1.2 The thermal structure of the Ideal Bose Gas (IBG)
Let $\mathcal W$ be a *-algebra version of the Weyl algebra over the one-particle Hilbert space $h = L^2(R^d, dx)$ with generic elements denoted by $W_f$, $f \in h$. The one-particle unitary dynamics $U_t^0$ in $h$ is generated by $h^0_\mu = -\Delta - \mu$, where $-\Delta$ is the (Friedrichs) Laplacian and $\mu$ is the chemical potential. The natural lifting $\alpha_t^0$ of $U_t^0$ is defined as $\alpha_t^0(W_f) = W_{U_t^0 f}$. The free thermal state $\omega_0$ on $\mathcal W$ is given by 18

$$\omega_0(W_f) = \exp\left\{-\tfrac{1}{4}\,\langle f \,|\, \coth\!\left(\tfrac{\beta}{2} h^0_\mu\right) f\rangle\right\} \tag{1}$$

and the corresponding free thermal state $\omega_0^{cr}$ for the IBG containing a Bose-Einstein condensate by

$$\omega_0^{cr}(W_f) = \exp\left\{-\tfrac{1}{4}\, c\, |\hat f(0)|^2\right\}\, \omega_0(W_f)\big|_{\mu=0} \tag{2}$$

which is well defined provided $d \ge 3$, and where $\hat f$ is the Fourier transform of $f \in L^1 \cap L^2(R^d, dx)$ and $c > 0$ is a constant that depends on the details of the thermodynamic passage and measures, to some extent, the size of the condensed fraction of the gas. Both states, $\omega_0$ and $\omega_0^{cr}$, are invariant under the action of $\alpha_t^0$ ($\mu = 0$ in the case of $\omega_0^{cr}$). Applications of the GNS construction lead to the well-known Araki-Woods thermal modules $(\mathcal H_0^{(cr)}, \Omega_0^{(cr)}, \pi_0^{(cr)}, U_t^{0,(cr)})$. We define the corresponding von Neumann algebras $\mathcal M_0$ (resp. $\mathcal M_0^{cr}$) as the weak closures of $\pi_0(\mathcal W)$ (resp. of $\pi_0^{(cr)}(\mathcal W)$). Then the systems

$$(\mathcal M_0, \alpha_t^0, \omega_0) \equiv \mathcal A_0, \qquad (\mathcal M_0^{cr}, \alpha_t^{0,cr}, \omega_0^{cr}) \equiv \mathcal A_0^{cr}$$

form W*-KMS systems (where the $\alpha_t$ are the corresponding extensions of $U_t^{0,(cr)}$). The presence of the Bose-Einstein condensate in $\mathcal A_0^{cr}$ is manifested through the nonpurity of $\mathcal A_0^{cr}$. The central decomposition of $\omega_0^{cr}$ has been well known since the work of Araki and Woods 19. In the present contribution we study the (homogeneous limits of) perturbations of $\mathcal A_0^{cr}$ by unitary cocycle perturbations 18; in particular, the question of whether the nonpurity of the limiting thermal states is preserved (in the critical regime) will be settled positively. The techniques employed are based on the functional integral approach of 9,10,11.
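The restriction on the dimension in (2) can be illustrated numerically. At $\mu = 0$ the symbol of $\coth(\beta h^0_\mu/2)$ behaves like $2/(\beta k^2)$ for small momenta $k$, so for a test function with $\hat f(0) \neq 0$ the exponent in (1) is controlled near $k = 0$ by the radial integral of $k^{d-1}\coth(\beta k^2/2)$. The sketch below is only an illustration under that assumption (the function names and the cutoff `delta` are hypothetical, not from the paper): it evaluates this integral over $[\delta, 1]$ and shows that it stabilizes as $\delta \to 0$ for $d = 3$ but keeps growing for $d = 2$.

```python
import math


def coth(x):
    """Hyperbolic cotangent for x > 0."""
    return 1.0 / math.tanh(x)


def infrared_integral(d, beta=1.0, delta=1e-6, steps=100_000):
    """Midpoint rule for the integral of k^(d-1) * coth(beta*k^2/2) over [delta, 1].

    The substitution k = exp(u) turns it into the integral of
    k^d * coth(beta*k^2/2) du over [log(delta), 0], which resolves
    the small-k region evenly.
    """
    u_min = math.log(delta)
    h = -u_min / steps
    total = 0.0
    for i in range(steps):
        k = math.exp(u_min + (i + 0.5) * h)
        total += k ** d * coth(0.5 * beta * k * k) * h
    return total


if __name__ == "__main__":
    for d in (2, 3):
        for delta in (1e-3, 1e-6):
            print(d, delta, infrared_integral(d, delta=delta))
```

For $d = 2$ the value grows like $(2/\beta)\log(1/\delta)$, which is the logarithmic infrared divergence excluded by the condition on $d$.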
2 The main result
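Let $(\chi_\epsilon)_{\epsilon > 0}$ be a net from $C_0(R^d)$ (the space of continuous functions with compact support), such that $\chi_\epsilon \ge 0$ pointwise and $\lim_{\epsilon \to 0} \chi_\epsilon = \delta$ in the weak sense. For $x \in R^d$, $\epsilon > 0$ and $\alpha \in R$ we define $W_\epsilon(\alpha, x) = W_{\alpha\chi_\epsilon(\cdot - x)}$, where $W_f$ is now the representative of $W_f$ in the corresponding free thermal module. Let $d\rho$ be a (complex in general) Borel measure on $R$ with compact support and such that $d\rho(-\alpha) = \overline{d\rho(\alpha)}$. For a bounded region $\Lambda \subset R^d$ we define

$$H^L_\Lambda = \lambda \int d\rho(\alpha) \int_\Lambda dx\, W_\epsilon(\alpha, x) \tag{3}$$

where $\lambda \in R$ is called the coupling constant, and

$$H^{nL}_\Lambda = \lambda \int d\rho(\alpha) \int d\rho(\beta) \int_\Lambda dx \int_\Lambda dy\, W_\epsilon(\alpha, x)\, T(x - y)\, W_\epsilon(\beta, y) \tag{4}$$

where $T \in L^1(R^d)$ and the integrals are defined in the $\sigma$-weak topology of the corresponding W*-algebras. From the simple estimates

$$\|H^L_\Lambda\| \le |\lambda|\, \mathrm{Var}(\rho)\, |\Lambda|, \qquad \|H^{nL}_\Lambda\| \le |\lambda|\, \mathrm{Var}(\rho)^2\, \|T\|_1\, |\Lambda|,$$

where $\mathrm{Var}(\rho)$ is the variation of $\rho$ and $|\Lambda|$ means the volume of $\Lambda$, it follows that the theory of unitary cocycle perturbations 18 for studying the perturbed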
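(by $H^\#_\Lambda$) free dynamics can be applied. For this goal we define:

• the unitary cocycle $\Gamma_t^{\#,\Lambda}$:

$$\Gamma_t^{\#,\Lambda} = 1 + \sum_{n \ge 1} i^n \int_0^t dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{n-1}} dt_n\; \alpha^{0,(cr)}_{t_n}(H^\#_\Lambda) \cdots \alpha^{0,(cr)}_{t_1}(H^\#_\Lambda) \tag{5}$$

where $\#$ stands for $L$ or $nL$ and the free dynamics $\alpha^{0,(cr)}$ acts as

$$\alpha^{0,(cr)}_t(H^L_\Lambda) = \lambda \int d\rho(\alpha) \int_\Lambda dx\, W_{\alpha U_t^0 \chi_\epsilon(\cdot - x)}$$

and similarly for $H^{nL}_\Lambda$,

• the perturbed dynamics $\alpha_t^{\#,\Lambda}$:

$$\alpha_t^{\#,\Lambda}(A) = \alpha_t^{0,(cr)}(A) + \sum_{n \ge 1} i^n \int_0^t dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{n-1}} dt_n\; \big[\alpha^{0,(cr)}_{t_n}(H^\#_\Lambda), \big[\cdots \big[\alpha^{0,(cr)}_{t_1}(H^\#_\Lambda), \alpha^{0,(cr)}_t(A)\big] \cdots \big]\big] \tag{6}$$

for $A \in \mathcal M_0^{(cr)}$,

• the perturbed thermal vacuum $\Omega_\Lambda^{\#,(cr)}$:

$$\Omega_\Lambda^{\#,(cr)} = \Omega_0^{(cr)} + \sum_{n \ge 1} (-1)^n \int_{0 \le s_1 \le \cdots \le s_n \le \beta/2} ds_1 \cdots ds_n\; e^{-s_1 H_0^{cr}} H^\#_\Lambda\, e^{-(s_2 - s_1) H_0^{cr}} H^\#_\Lambda \cdots H^\#_\Lambda\, \Omega_0^{(cr)} \tag{7}$$

where $H_0^{cr}$ is the generator of $\alpha_t^{0,(cr)}$ in $\mathcal H_0^{(cr)}$ and again the integrals and series are weakly convergent in $\mathcal H_0^{(cr)}$.

Remark 1. From the Araki theorem 18,20 it follows that there is a holomorphic map

$$R^n + iT^n_{\beta/2} \ni (z_1, \ldots, z_n) \mapsto e^{i z_1 H_0^{cr}} H^\#_\Lambda\, e^{i(z_2 - z_1) H_0^{cr}} \cdots H^\#_\Lambda\, \Omega_0^{(cr)} \in \mathcal H_0^{(cr)}$$

where

$$T^n_{\beta/2} = \{(s_1, \ldots, s_n) \mid 0 < s_1 < \cdots < s_n < \beta/2\},$$

continuous on $R^n + i\overline{T^n_{\beta/2}}$ and additionally obeying the estimate (uniformly on $R^n + i\overline{T^n_{\beta/2}}$)

$$\big\| e^{i z_1 H_0^{cr}} H^\#_\Lambda\, e^{i(z_2 - z_1) H_0^{cr}} \cdots H^\#_\Lambda\, \Omega_0^{(cr)} \big\| \le \begin{cases} |\lambda|^n\, \mathrm{Var}(\rho)^n\, |\Lambda|^n & \text{for } \# = L, \\ |\lambda|^n\, \mathrm{Var}(\rho)^{2n}\, \|T\|_1^n\, |\Lambda|^n & \text{for } \# = nL. \end{cases}$$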
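The algebraic content of the cocycle (5) and the perturbed dynamics (6) can be checked in a finite-dimensional toy model. The sketch below is an illustration only, assuming 2x2 matrices `H0` and `P` in place of the generator of the free dynamics and the bounded perturbation $H^\#_\Lambda$ (both matrices, and all helper names, are hypothetical). It uses the closed form $\Gamma_t = e^{it(H_0 + P)} e^{-itH_0}$, whose Dyson expansion is the series (5), and verifies unitarity, the cocycle identity $\Gamma_{t+s} = \Gamma_t\, \alpha^0_t(\Gamma_s)$, and that $\Gamma_t\, \alpha^0_t(A)\, \Gamma_t^*$ is the evolution generated by $H_0 + P$.

```python
IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def mat_scale(c, a):
    return [[c * a[i][j] for j in range(2)] for i in range(2)]


def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def mat_exp(a, squarings=10, terms=20):
    """exp(a) by scaling and squaring with a truncated Taylor series."""
    scaled = mat_scale(1.0 / 2 ** squarings, a)
    result = [row[:] for row in IDENTITY]
    term = [row[:] for row in IDENTITY]
    for n in range(1, terms + 1):
        term = mat_scale(1.0 / n, mat_mul(term, scaled))
        result = mat_add(result, term)
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


# Hypothetical toy data: a diagonal "free" generator and a Hermitian perturbation.
H0 = [[0j, 0j], [0j, 1 + 0j]]
P = [[0.3 + 0j, 0.2 - 0.1j], [0.2 + 0.1j, -0.4 + 0j]]
H = mat_add(H0, P)


def unitary(h, t):
    """exp(i t h) for a Hermitian matrix h."""
    return mat_exp(mat_scale(1j * t, h))


def alpha0(t, a):
    """Free dynamics: a -> exp(i t H0) a exp(-i t H0)."""
    return mat_mul(mat_mul(unitary(H0, t), a), unitary(H0, -t))


def cocycle(t):
    """Gamma_t = exp(i t (H0 + P)) exp(-i t H0)."""
    return mat_mul(unitary(H, t), unitary(H0, -t))


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


if __name__ == "__main__":
    t, s = 0.7, 1.1
    print(max_diff(cocycle(t + s), mat_mul(cocycle(t), alpha0(t, cocycle(s)))))
```

The toy model says nothing about the thermodynamic limit; it only makes the bookkeeping behind (5) and (6) concrete.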
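From the general theory 18 it follows that the systems

$$\mathcal A_\Lambda^{\#} = \big\{ \Omega_\Lambda^\# \|\Omega_\Lambda^\#\|^{-1},\ \mathcal M_0^{(cr)},\ \alpha_t^{\#,\Lambda} \big\}$$

form $\beta$-KMS structures such that the modular dynamics corresponding to the vector states $\omega_\Lambda^\#$ given by $\Omega_\Lambda^\#$ are exactly equal to $\alpha_t^{\#,\Lambda}$. In the following we shall sometimes drop the superscript $\#$ from the notation, and the following abbreviations will be used:

$$\int_\Lambda d(\tau, \alpha, x)_1^n = \int_\Lambda dx_1 \cdots \int_\Lambda dx_n \int d\tau_1 \cdots \int d\tau_n \int d\rho(\alpha_1) \cdots \int d\rho(\alpha_n), \quad \text{etc.}$$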
Now we are ready to formulate the main results of this contribution.

Theorem 1 (The noncritical case)
1. There exists a small $\lambda_0$ (depending on $\beta$, $\#$, $T$, $\rho$) such that for all $|\lambda| < \lambda_0$ the unique thermodynamic limit

$$\lim_{\Lambda \uparrow R^d} \omega_\Lambda^\#(\lambda) = \omega^\#(\lambda)$$

exists in the sense of weak convergence. The limiting state $\omega^\#(\lambda)$ is faithful on $\mathcal A_0$ and is ergodic with respect to the (natural) action of the Euclidean motions on $\mathcal A_0$. The limiting state is analytic in $\lambda$ and entire analytic on $\mathcal A_0$.
2. If additionally $d\rho(\alpha) = d\rho(-\alpha)$, $\lambda > 0$ and $T > 0$ (pointwise) in the case $\# = nL$, then the unique thermodynamic limit

$$\omega^\#(\lambda) = \lim_{\Lambda \uparrow R^d} \omega_\Lambda^\#(\lambda)$$

exists for all $\lambda > 0$ and is faithful and Euclidean invariant.

Here is the main result:

Theorem 2 (The critical case)
There exists $\lambda_0$ (depending on $\rho$, $\#$, $T$, ...) such that the unique thermodynamic limit

$$\lim_{\Lambda \uparrow R^d} \omega_\Lambda^{cr}(\lambda) = \omega^{cr}(\lambda)$$

exists for all $|\lambda| < \lambda_0$. The limiting state $\omega^{cr}(\lambda)$ is faithful, Euclidean invariant and satisfies: there exists a Borel measure $d\lambda_{ren}(r, \theta)$ on $[0, \infty) \times [0, 2\pi)$ and a family of faithful and ergodic (with respect to the translations) states $\omega^{r,\theta}(\lambda)$, indexed by $[0, \infty) \times [0, 2\pi)$, such that

$$\omega^{cr}(\lambda) = \int_0^\infty \int_0^{2\pi} d\lambda_{ren}(r, \theta)\, \omega^{r,\theta}(\lambda).$$

Moreover the measure $d\lambda_{ren}(r, \theta)$ is not concentrated on a single point.

3 Proofs: the main ideas
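For $f, g \in h$ such that $f = \bar f$, $g = \bar g$ and for any $\tau \in [0, \beta/2)$ it follows by straightforward computations that 9,10:

$$\big\langle \Omega_0^{(cr)} \,\big|\, W_f\, e^{-\tau H_0^{(cr)}} W_g\, \Omega_0^{(cr)} \big\rangle = \exp\left\{-\tfrac{1}{4}\, S_0^{\beta,(cr)}(\tau, f \otimes g)\right\} \tag{8}$$

where

$$S_0^{\beta,cr}(\tau, f \otimes g) = S_0^{\beta}(\tau, f \otimes g) + c \int f(x)\,dx \int g(x)\,dx \tag{9}$$

and

$$S_0^{\beta}(\tau, f \otimes g) = \left\langle f \,\middle|\, \frac{e^{-\tau(-\Delta - \mu)} + e^{-(\beta - \tau)(-\Delta - \mu)}}{1 - e^{-\beta(-\Delta - \mu)}}\, g \right\rangle_h \tag{10}$$

Extending $S_0^{\beta,(cr)}(\tau, \cdot, \cdot)$ to $R^1$ by reflection invariance and periodicity, it follows that $S_0^{\beta,(cr)}(\tau, \cdot, \cdot)$ is a stochastically positive and reflection positive (on $K_\beta = (-\beta/2, \beta/2)$) continuous kernel. Therefore, there exists a Gaussian process $\xi_\tau^{0,(cr)}$ with values in $S'(R^d)$, which is faithful, stochastically continuous, periodic and reflection positive, and satisfies

$$E\, \langle \xi_0^{0,(cr)}, \cdot \rangle \langle \xi_\tau^{0,(cr)}, \cdot \rangle = S_0^{\beta,(cr)}(\tau, \cdot, \cdot) \tag{11}$$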
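The periodicity and positivity of the kernel (10) can be made concrete for a single mode. Replacing the operator $-\Delta - \mu$ by a positive number $E$ (an assumption made only for this illustration; the function names are hypothetical), the kernel becomes $s(\tau) = \cosh((\beta/2 - \tau)E)/\sinh(\beta E/2)$, which equals $\coth(\beta E/2)$ at $\tau = 0$ and has the Matsubara expansion $s(\tau) = (2E/\beta)\sum_{n \in Z} e^{i\omega_n \tau}/(\omega_n^2 + E^2)$ with $\omega_n = 2\pi n/\beta$. All Fourier coefficients are positive, which is the elementary reason the periodic extension is a positive-definite kernel on the circle. The sketch compares the two expressions numerically.

```python
import math


def kernel_closed_form(tau, energy, beta):
    """Single-mode version of (10): (e^{-tau E} + e^{-(beta-tau) E}) / (1 - e^{-beta E})."""
    return ((math.exp(-tau * energy) + math.exp(-(beta - tau) * energy))
            / (1.0 - math.exp(-beta * energy)))


def kernel_matsubara(tau, energy, beta, cutoff=100_000):
    """Truncated Matsubara sum (2E/beta) * sum_n cos(w_n tau) / (w_n^2 + E^2)."""
    total = 1.0 / energy ** 2  # n = 0 term
    for n in range(1, cutoff + 1):
        w = 2.0 * math.pi * n / beta
        total += 2.0 * math.cos(w * tau) / (w * w + energy * energy)
    return 2.0 * energy / beta * total


if __name__ == "__main__":
    beta, energy = 2.0, 1.3
    for tau in (0.0, 0.5, 1.0):
        print(tau, kernel_closed_form(tau, energy, beta),
              kernel_matsubara(tau, energy, beta))
```

The truncation error of the sum decays only like the inverse of the cutoff at $\tau = 0$, so the agreement is checked to a modest tolerance.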
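The law of this process can be identified with the Gaussian random field $\mu_0^{(cr)}$ on the space $S'(K_\beta \times R^d)$ such that

$$E_{\mu_0^{(cr)}}\, \varphi(\tau, x)\, \varphi(0, y) = S_0^{\beta,(cr)}(\tau, x - y) \tag{12}$$

We define the following functionals of $\mu_0$:

$$U^L_\Lambda(\varphi) = \lambda \int_0^\beta d\tau \int d\rho(\alpha) \int_\Lambda dx\; {:}e^{i\alpha\varphi_\epsilon(\tau, x)}{:} \tag{13}$$

$$U^{nL}_\Lambda(\varphi) = \lambda \int_0^\beta d\tau \int d\rho(\alpha) \int d\rho(\beta) \int_\Lambda dx \int_\Lambda dy\; e^{i\alpha\varphi_\epsilon(\tau, x)}\, T(x - y)\, e^{i\beta\varphi_\epsilon(\tau, y)}$$

where $\varphi_\epsilon(\tau, x) = (\varphi * \chi_\epsilon)(\tau, x)$. The finite volume Gibbsian perturbed measures are defined as

$$d\mu_\Lambda^{(cr)}(\varphi) = \big(Z_\Lambda^{(cr)}\big)^{-1} \exp U^\#_\Lambda(\varphi)\, d\mu_0^{(cr)}(\varphi) \tag{14}$$

and their corresponding canonical correlation functions $\rho_\Lambda^{(cr)}(\tau_1, \alpha_1, x_1, \ldots, \tau_n, \alpha_n, x_n)$ as

$$\rho_\Lambda^{(cr)}(\tau_1, \alpha_1, x_1, \ldots, \tau_n, \alpha_n, x_n) = \lambda^n\, E_{\mu_\Lambda^{(cr)}}\, e^{i\alpha_1\varphi_\epsilon(\tau_1, x_1)} \cdots e^{i\alpha_n\varphi_\epsilon(\tau_n, x_n)}.$$

Recall that their thermodynamic limits in various ranges of couplings were controlled rigorously in previous publications 9,10. Using the definition of $U_\Lambda$ it follows by straightforward computations that

$$Z_\Lambda^{(cr)} = \big\langle \Omega_\Lambda^{(cr)} \,\big|\, \Omega_\Lambda^{(cr)} \big\rangle \tag{15}$$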
and for f E h "{Ar\Wl)
= (^ACr)\Wf^r))
= exp -l-S^r\0,f®
f)
• £ [ i /d(r>al«)yn«"m(/|BiX,1,r',(*i) n t e - i ^ - ' - ^ ^ - l ^ K a , * ) ? ]
(16)
>= i
(where $S_{0,\epsilon}^{(cr)} = (\chi_\epsilon \otimes \chi_\epsilon) * S_0^{(cr)}$) in the case of $U^L_\Lambda$, and similarly (albeit more complicated) in the case of $U^{nL}_\Lambda$:
$\sum_{N, M, K \ge 0}$
I d(r\ a\ x 2 ) f d{a\ /?2, y 2 ) f j d(r 3 , a 3 , x 3 )f d(
" r •J] ; =1
exp(i/m|aj*«,.>;(• - xj»exp(i/m(/|/?jx € | i < ,,(. - 2/j)r(xj - y)) L
[ e x p ( - a j S * ( e r > * /(*})) - 1] [ e x p ( - / ? X f ° * /(»))) - l ] } M
f
•J]
exp(iJm(/|a? X ( ,,>.(- - *?»exp(i7m(/|/J? X c,,vj(- - 2/;)r(x? - y))
; = 1 >>
[exp(-a2S0^/(x2))-l]} • f[ [ ex P (t7m(/|a 3 x e | 1 > ; (- - x?» exp(i/m|/3f ^ j ^
- yj >r(x 3 - y3)
[eXp(-^35jf)*/(y3))-l]} •^((rSaSx1^,...,^3,/?3,^)]
(17)
where X«.ir(--*) = ^ T X « ( - - * )
(18)
and
Aer)((ri,«1,*1)fr,-,(^,^,»3)f) =
M
i
x
Formulae (16) and (17) provide a link between the analysis on the abelian sectors described in the earlier papers 9,10 and the corresponding states $\omega_\Lambda^{(cr)}$ on the whole algebra(s) of observables $\mathcal M_0^{(cr)}$. This is why we propose to call them the reduction formula(e) for the state(s). The essential volume dependence of $\omega_\Lambda$ is that of the canonical correlation functions entering formulas (16) and (17). To control the thermodynamic limits $\lim_\Lambda \omega_\Lambda$ we have to divide the analysis into two parts, the first (easier) devoted to the noncritical case and the second dealing with the critical one.

3.1 The case of noncritical IBG
The following results have been proved in 9: There exists a small $\lambda_0$ (depending on $z$, $\beta$, ...) such that the unique thermodynamic limits

$$\lim_{\Lambda} \rho_\Lambda^{(n)}\big((\tau, \alpha, x)_1^n\big) = \rho_\lambda^{(n)}\big((\tau, \alpha, x)_1^n\big)$$

exist for all $n$ and $|\lambda| < \lambda_0$. The limiting correlation functions $\rho_\lambda^{(n)}((\tau, \alpha, x)_1^n)$ are Euclidean invariant, possess the cluster decomposition property and are analytic in the disc $|\lambda| < \lambda_0$. Moreover $\rho_\Lambda^{(n)} \to \rho_\lambda^{(n)}$ locally uniformly. See Prop. 3.2 and Prop. 2.4 in 9. Applying these results, the first part of Theorem 1 follows from the reduction formulae (16) and (17). If additionally we assume that the measure $d\rho$ is real and even, $\lambda > 0$ and $T > 0$ pointwise (in the case of $nL$), then we can use Theorem 3.9 of 9 to control the limit $\lim_\Lambda \rho_\Lambda = \rho_\lambda$. Modulo the cluster decomposition property, the limiting canonical correlation functions possess most of the properties they have in the case of local perturbations, and this enables us to prove the second part of Theorem 1 similarly as above.

3.2 The case of critical IBG
The main difficulty here is that $S_0^{cr}$ has no long range decay, and this is why the high temperature/low density methods (Kirkwood-Salsburg identities, cluster expansions) do not apply straightforwardly to the study of the limit(s) $\lim_\Lambda \rho_\Lambda^{cr}$. However, such an analysis is possible in the pure phases. For this goal, let $K_{2\pi}$ be the circle of radius $2\pi$ and let $d\lambda_0$ be the spectral measure of the state $\omega_0^{cr}$ on $K_{2\pi} \times (0, \infty)$, i.e.

$$\omega_0^{cr} = \int d\lambda_0(r, \theta)\, \omega_0^{r,\theta} \tag{20}$$

where the $\omega_0^{r,\theta}$ are pure states on $\mathcal M_0^{cr}$ characterised by

$$\omega_0^{r,\theta}(W_f) = e^{\,i c^{1/2} r^{1/2} \cos\theta\, \hat f(0)}\; e^{-\frac{1}{4} S_0^{\beta}(0, f \otimes f)} \tag{21}$$
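The way a mixture of the pure-phase functionals (21) reproduces the condensate factor in (2) can be checked numerically. The paper does not state the measure $d\lambda_0$ explicitly, so the sketch below assumes, purely for illustration, the form $d\lambda_0(r, \theta) = e^{-r}\, dr\, d\theta/(2\pi)$ and a real value of $\hat f(0)$; the parameter `b` stands for $c^{1/2}\hat f(0)$ and the function name is hypothetical. Under these assumptions the average of $e^{i b \sqrt{r}\cos\theta}$ equals $e^{-b^2/4}$, that is, a Gaussian factor of the form $\exp\{-\tfrac14 c\,|\hat f(0)|^2\}$.

```python
import math


def averaged_phase(b, radial_steps=4000, angular_steps=64, s_max=7.0):
    """Average of exp(i*b*sqrt(r)*cos(theta)) over e^{-r} dr dtheta/(2 pi).

    The substitution r = s^2 removes the square-root singularity at r = 0.
    The imaginary part cancels under theta -> pi - theta, so only the
    cosine is summed. Midpoint rules are used in both variables.
    """
    h = s_max / radial_steps
    total = 0.0
    for i in range(radial_steps):
        s = (i + 0.5) * h
        angular = sum(
            math.cos(b * s * math.cos(2.0 * math.pi * (m + 0.5) / angular_steps))
            for m in range(angular_steps)
        ) / angular_steps
        total += 2.0 * s * math.exp(-s * s) * angular * h
    return total


if __name__ == "__main__":
    for b in (0.0, 0.8, 1.3):
        print(b, averaged_phase(b), math.exp(-b * b / 4.0))
```

The angular average alone is the Bessel function $J_0(b\sqrt r)$, so no single pair $(r, \theta)$ yields the Gaussian factor; it only appears after integrating over the whole family of phases.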
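The explicit form of $d\lambda_0$ is well known 18,19. It was observed in 10 that the states $\omega_0^{r,\theta}$ are stochastically positive and the underlying periodic processes $\xi_\tau^{r,\theta}$ with values in $S'(R^d)$ are Gaussian processes with covariances given by (informally):

$$E\, \xi_0^{r,\theta}(x)\, \xi_\tau^{r,\theta}(y) = \frac{e^{-|\tau|(-\Delta)} + e^{-(\beta - |\tau|)(-\Delta)}}{1 - e^{-\beta(-\Delta)}}\,(x - y) \tag{22}$$

and means

$$E\, \xi_\tau^{r,\theta}(x) = c^{1/2} r^{1/2} \cos\theta \tag{23}$$

The corresponding random fields $\mu_0^{r,\theta}$ on the space $S'(K_\beta \times R^d)$ are Gaussian, with second moments given by (informally):

$$\int_{S'(K_\beta \times R^d)} \varphi(\tau, x)\, \varphi(0, y)\; d\mu_0^{r,\theta}(\varphi) = S_0^{\beta}(\tau, x - y) \tag{24}$$

and means:

$$\int_{S'(K_\beta \times R^d)} \varphi(\tau, x)\; d\mu_0^{r,\theta}(\varphi) = c^{1/2} r^{1/2} \cos\theta \tag{25}$$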
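The perturbed, finite-volume Gibbs measures $d\mu_\Lambda^{(r,\theta)}$ are defined as:

$$d\mu_\Lambda^{(r,\theta)}(\varphi) = \big(Z_\Lambda^{(r,\theta)}\big)^{-1} \exp U^\#_\Lambda(\varphi)\; d\mu_0^{(r,\theta)}(\varphi) \tag{26}$$

where

$$Z_\Lambda^{(r,\theta)} = \int_{S'(K_\beta \times R^d)} \exp U^\#_\Lambda(\varphi)\; d\mu_0^{(r,\theta)}(\varphi) \tag{27}$$

The thermodynamic limits

$$\lim_{\Lambda} \mu_\Lambda^{(r,\theta)} = \mu^{(r,\theta)}$$

were controlled rigorously in 10; in particular, the existence of the thermodynamic limits of the correlation functions $\rho_\Lambda^{(r,\theta)}((\tau, \alpha, x)_1^n)$, defined as

$$\rho_\Lambda^{(r,\theta)}\big((\tau, \alpha, x)_1^n\big) = \lambda^n \int_{S'(K_\beta \times R^d)} \prod_{j=1}^n e^{i\alpha_j \varphi_\epsilon(\tau_j, x_j)}\; d\mu_\Lambda^{(r,\theta)}(\varphi) \tag{28}$$

was proved in 10 for small values of the coupling constant $\lambda$.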
Similarly as in the noncritical case, the following reduction formula (written for the local perturbations only) holds:

$\omega_\Lambda^{(r,\theta)}(W_f)$ = similar as in (16) but with $\rho_\Lambda^{(r,\theta)}$ instead of $\rho_\Lambda^{(cr)}$ and with $S_0^{\beta}$ instead of $S_0^{\beta,(cr)}$ (29)

where $\omega_\Lambda^{(r,\theta)}$ is the state obtained from $\omega_0^{r,\theta}$ by the unitary cocycle perturbations as in the noncritical case. Using the reduction formula (29) and the results of 10, the existence of the unique limits

$$\lim_{\Lambda} \omega_\Lambda^{(r,\theta)} = \omega_\lambda^{(r,\theta)}$$

as weak limits on $\mathcal M_0^{cr}$ and for all $(r, \theta) \in (0, \infty) \times K_{2\pi}$ follows. In particular the limiting states $\omega_\lambda^{(r,\theta)}$ are Euclidean invariant, pure and faithful. Using the central decomposition (20), the following formula can be derived easily:

$$\omega_\Lambda^{cr}(W_f) = \int d\lambda_0(r, \theta)\; \frac{\|\Omega_\Lambda^{(r,\theta)}\|^2}{\|\Omega_\Lambda^{cr}\|^2}\; \omega_\Lambda^{(r,\theta)}(W_f) \tag{30}$$

where $\Omega_\Lambda^{(r,\theta)}$, $\Omega_\Lambda^{cr}$ are the corresponding vacuums given by the formula (7). From the reduction formulae (29) and (30) it follows easily that

$$\sup_{\Lambda}\, \omega_\Lambda^{cr}(W_f) < \infty$$

for any $f \in S(R^d)$. Using this observation it follows, by arguments similar to those presented in 10, that there exists

$$\lim_{\Lambda}\; d\lambda_0(r, \theta)\, \frac{\|\Omega_\Lambda^{(r,\theta)}\|^2}{\|\Omega_\Lambda^{cr}\|^2} = d\lambda_{ren}(r, \theta) \tag{31}$$

in the sense of measures. That the measure $d\lambda_{ren}(r, \theta)$ is not concentrated on a single point was proved in . Summarizing this discussion, we have obtained the following result:

$$\omega_\lambda^{cr}(W_f) = \int_{K_{2\pi} \times (0, \infty)} d\lambda_{ren}(r, \theta)\; \omega_\lambda^{(r,\theta)}(W_f) \tag{32}$$

for sufficiently small $\lambda$, and the measure $d\lambda_{ren}(r, \theta)$ is not concentrated on a single point, which means that the limiting state $\omega_\lambda^{cr}$ is not a pure state. Moreover the states $\omega_\lambda^{(r,\theta)}$ appearing in (32) are pure states on $\mathcal M_0^{cr}$.
4 Concluding remarks

From the Tomita-Takesaki theory it follows that there exists a canonical (modular) dynamics $\alpha_t^{\lambda}$ on $\mathcal M_0^{(cr)}$ such that the constructed states $\omega_\lambda^{(cr)}$ are KMS states with respect to $\alpha_t^{\lambda}$. On the other hand, from the corresponding abelian euclidean-time Green functions some (a priori) other W*-KMS structure(s) can be constructed 9,10. This raises the difficult (and therefore very interesting) question of whether these two a priori different W*-KMS structures coincide. In the finite volume perturbation case it has been proved in 15 that these W*-KMS structures coincide. Concerning the infinite volume situation, the following result has been proved in 15:

Theorem. Let $\xi_\tau^{\lambda}$ be the thermal processes corresponding to the infinite volume limits of the perturbations considered in this contribution. Then these processes are Markovian diffusions on the circle $K_\beta$, provided $\lambda$ is sufficiently small.

From the above theorem it follows that the corresponding thermal vacuum $\Omega_\lambda$ constructed from the abelian euclidean-time Green functions (see 9,10 for details) has the following cyclicity property: the linear hull of the vectors

$$W_f\, e^{-\tau H_\lambda}\, W_g\, \Omega_\lambda, \qquad f = \bar f,\ g = \bar g,\ \tau \in [0, \beta/2)$$

in the corresponding Hilbert space $\mathcal H_\lambda$ is dense. However, the still missing point in the identification of the canonical Tomita-Takesaki W*-KMS structure with those obtained in 9,10 is the question of whether the vacuum $\Omega_\lambda$ is cyclic under the action of the time-0 Weyl algebra $\mathcal W(h)$ in $\mathcal H_\lambda$.

Acknowledgements

I would like to express my deep gratitude to Prof. Ludwig Streit for giving me the opportunity, through the study of his papers and discussions with him, to understand the power of functional integral methods in quantum physics. I very much appreciate Robert Olkiewicz's agreement to my presenting some unpublished results from our joint work, which is still in progress.

References

1. M.H. Anderson, J.R. Ensher, M.R. Matthews, C.E. Wieman, E.A. Cornell, Science 269, 198 (1995)
2. K.B. Davis, M.-O. Mewes, M.R. Andrews, N.J. van Druten, D.S. Durfee, D.M. Kurn, W. Ketterle, Phys. Rev. Lett. 75, 3969 (1995)
3. C.C. Bradley, C.A. Sackett, R.G. Hulet, Phys. Rev. Lett. 78, 985 (1997)
4. S. Chu, Nobel Lecture, 8 December 1997, Stockholm and ref. therein
5. A. Einstein, Sitzungsberichte der Preussischen Akad. d. Wiss. 1, 3 (1925)
6. S.N. Bose, Z. Phys. 26, 178 (1924)
7. M. van den Berg, J.T. Lewis, J.V. Pulè, Helv. Phys. Acta 59, 1271 (1986)
8. T.C. Dorlas, J.T. Lewis, J.V. Pulè, Commun. Math. Phys. 156, 37 (1993) and ref. therein
9. R. Gielerak, R. Olkiewicz, J. Stat. Phys. 80, 875 (1995)
10. R. Gielerak, R. Olkiewicz, J. Math. Phys. 37, 1268 (1996)
11. R. Gielerak, L. Jakobczyk, R. Olkiewicz, J. Math. Phys. 39, 6291 (1998)
12. R. Gielerak, J. Damek, Unbounded perturbations of the Bose-Einstein condensate, subm. to Rep. Math. Phys.
13. R. Gielerak, A. Rebenko, Operator Theory: Adv. and Appl. 70, 219 (1994)
14. R. Gielerak, J. Damek, Polynomial, thermofield perturbations of the ideal Bose gas, in preparation
15. R. Gielerak, R. Olkiewicz, Gentle perturbations of the free Bose gas. III. The states and cyclicity of the thermal vacuum, in preparation
16. R. Gielerak, Gibbsian approach to quantum many body problems, papers in preparation
17. Yu. Kondratiev, E.W. Lytwynov, A. Rebenko, M. Rockner, G.V. Schepaniuk, BiBoS preprint 789/6/97
18. O. Bratteli, D.W. Robinson, Operator Algebras and Quantum Statistical Mechanics, II, Springer, 1981
19. H. Araki, E.J. Woods, J. Math. Phys. 4, 637 (1963)
20. H. Araki, Publ. RIMS 4, 361 (1968)
CONFIGURATION SPACES FOR SELF-SIMILAR RANDOM PROCESSES, AND MEASURES QUASI-INVARIANT UNDER DIFFEOMORPHISM GROUPS

GERALD A. GOLDIN
Rutgers University, SERC Bldg. Rm. 239, Busch Campus, 118 Frelinghuysen Road, Piscataway, NJ 08854, USA
E-mail: gagoldin@dimacs.rutgers.edu

In honor of Ludwig Streit

To describe a continuous unitary representation of the group of diffeomorphisms of physical space, we typically specify a quasi-invariant measure together with a unitary 1-cocycle on some configuration space. Lebesgue measure on the space of N-point configurations describes N-particle quantum mechanics, where noncohomologous cocycles (obtained from unitary representations of the symmetric group or the braid group) give the statistics of bosons, fermions, paraparticles, anyons, or plektons, and even suggest possibilities for quantum nonlinearity. Poissonian or more general measures on spaces of infinite but locally finite configurations describe the infinite-volume limit of a gas of particles. Then some cocycles of interest are associated with infinite permutations or braids. A class of quasi-invariant measures, obtained from self-similar random processes, requires spaces of configurations having accumulation points. A program of research on this topic is discussed, including ongoing joint work with Moschella.

1 Diffeomorphism Groups, Local Currents, and Quantum Physics
Motivated by the desire to make use of gauge-invariant currents as coordinates in nonrelativistic and relativistic quantum field theory, Dashen and Sharp proposed a singular local current algebra of mass and momentum densities. 1 When these are interpreted as operator-valued distributions, one obtains a natural, infinite-dimensional Lie algebra, modeled on spaces of $C^\infty$ scalar functions and vector fields on the physical space. Exponentiation of this Lie algebra yields a group G, the natural semidirect product of the group of smooth scalar functions (under pointwise addition) with the group of diffeomorphisms (under composition). 2,3 Distinct quantum systems in the nonrelativistic limit are described by unitary representations of G. In such a representation, interpretation of the self-adjoint generators as gauge-invariant mass and momentum density operators, obeying an equation of continuity, specifies the kinematics. 4
To establish notation, let X be the Euclidean manifold of physical space equipped with the (local) Lebesgue measure dx. Define as operator-valued distributions the mass density $\rho_{op}(x)$ and momentum density $J_{op}(x)$, $x \in X$,

$$\rho_{op}(x) = m\, \psi^*_{op}(x)\, \psi_{op}(x), \qquad J_{op}(x) = (\hbar/2i)\, \big[\psi^*_{op}(x)\, \nabla\psi_{op}(x) - (\nabla\psi^*_{op}(x))\, \psi_{op}(x)\big], \tag{1}$$

where $\psi_{op}(x)$ is a second-quantized, nonrelativistic field at fixed time t obeying the equal-time, canonical commutation (−) or anticommutation (+) relations

$$[\psi_{op}(x,t), \psi_{op}(y,t)]_\pm = [\psi^*_{op}(x,t), \psi^*_{op}(y,t)]_\pm = 0, \qquad [\psi_{op}(x,t), \psi^*_{op}(y,t)]_\pm = \delta(x - y). \tag{2}$$

For smooth test functions $f$, $h$ and vector fields $g$, $g_1$, $g_2$ on X, the averaged operators $\rho_{op}(f)$ and $J_{op}(g)$ then satisfy

$$[\rho_{op}(f), \rho_{op}(h)] = 0, \qquad [\rho_{op}(f), J_{op}(g)] = i\hbar\, \rho_{op}(g \cdot \nabla f), \qquad [J_{op}(g_1), J_{op}(g_2)] = -i\hbar\, J_{op}([g_1, g_2]), \tag{3}$$

where $[g_1, g_2]$ is the Lie bracket. Define $U(sf) = \exp[(is/m)\rho_{op}(f)]$ and $V(\phi_s^g) = \exp[(is/\hbar)J_{op}(g)]$, $s \in R$, where the flow $\phi_s^g$ obeys $\partial_s \phi_s^g = g \circ \phi_s^g$. Then

$$U(f_1)V(\phi_1)\,U(f_2)V(\phi_2) = U(f_1 + f_2 \circ \phi_1)\, V(\phi_1\phi_2), \tag{4}$$

where $\phi_1\phi_2 = \phi_2 \circ \phi_1$.
possible quantum system by means of the corresponding self-adjoint representation of (3). The latter informs us about the physical meaning of the theory, when we interpret $\rho_{op}$ and $J_{op}$ as mass and momentum density operators: e.g., the spectrum of $\rho_{op}$ describes the particle content. An early and rigorous prediction of anyon statistics in two space dimensions came from carrying out this program for Diff $R^2$, and turned out to confirm a possibility conjectured by Leinaas and Myrheim and afterward suggested by Wilczek. 5,6,7 This led to our understanding of how unitary representations of the braid group $B_N$, including its higher-dimensional representations, induce CURs of the diffeomorphism group that describe exotic quantum statistics. 4,8,9
The continuity equation, embodying probability conservation, imposes an important constraint on the time-evolution (thus, on the choice of Hamiltonian) in a quantum theory obtained this way. For $\Psi \in \mathcal H$ obeying Schrödinger's equation, the (scalar) mass density expectation value is $\rho_m(x,t) = (\Psi, \rho_{op}(x)\Psi)$, while that of the (vector) momentum density is $J_{mom}(x,t) = (\Psi, J_{op}(x)\Psi)$. Then we require that $\partial\rho_m/\partial t = -\nabla \cdot J_{mom}$.

2 Fields Intertwining N-Particle Representations
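The N-particle Bose or Fermi representations of (3) are given by

$$\rho^N_{op}(f)\, \Psi^{(s,a)}(x_1, \ldots, x_N) = m \sum_{j=1}^N f(x_j)\, \Psi^{(s,a)}(x_1, \ldots, x_N),$$
$$J^N_{op}(g)\, \Psi^{(s,a)}(x_1, \ldots, x_N) = (\hbar/2i) \sum_{j=1}^N \big\{ g(x_j) \cdot \nabla_j \Psi^{(s,a)}(x_1, \ldots, x_N) + \nabla_j \cdot [g(x_j)\, \Psi^{(s,a)}(x_1, \ldots, x_N)] \big\}, \tag{5}$$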
where the superscripts "s" or "a" of the square-integrable wave functions $\Psi$ denote respectively symmetric or antisymmetric functions of the N variables. These representations, in which there occur no Schwinger terms, follow from (1) in the "s" (−) or "a" (+) nonrelativistic Fock representations of (2), where the current algebra acts irreducibly on N-particle subspaces. So we know from the underlying field theory that each series, "s" or "a", N = 0, 1, 2, 3, ..., describes a hierarchy of particles (more generally, of quantum excitations) of the same type. When exponentiated, these representations give the N-particle representations of G (see below).
But suppose we do not know about underlying fields, and have only a collection of representations of G satisfying (3) or (4), indexed by N, in Hilbert spaces $\mathcal H_N$. When are we justified in concluding that they belong to a series describing a hierarchy of excitations? Sharp and I proposed the following criteria. 10 Let $\rho_{op}(f)$ and $J_{op}(g)$ be the direct sum over N of all the operators $\rho^N_{op}(f)$ and $J^N_{op}(g)$ respectively, in the given indexed collection. Then there should exist, intertwining the representations, fields $\psi^*: \mathcal H_N \to \mathcal H_{N+1}$ and $\psi: \mathcal H_{N+1} \to \mathcal H_N$, so that (averaged with a test function h on X) $\psi^*$ satisfies the brackets

$$[\rho_{op}(f), \psi^*(h)] = \psi^*(\rho^{N=1}_{op}(f)\, h), \qquad [J_{op}(g), \psi^*(h)] = \psi^*(J^{N=1}_{op}(g)\, h), \tag{6}$$

and $\psi$ satisfies the adjoint of (6). Note how, on the right-hand side, h is also interpreted as belonging to the Hilbert space for N = 1. Note also that only commutator brackets have been assumed. Since Eqs. (6) do hold in both the "s" and "a" Fock representations, they are reasonable requirements for the collection of representations indexed by N to be a hierarchy. In fact, they apply more generally than in these cases. At the level of the group, we have corresponding intertwining conditions,

$$U_{N+1}(f)\, \psi^*(h) = \psi^*(U_{N=1}(f)\, h)\, U_N(f), \qquad V_{N+1}(\phi)\, \psi^*(h) = \psi^*(V_{N=1}(\phi)\, h)\, V_N(\phi), \tag{7}$$

where again the adjoint of (7) gives the conditions for $\psi$. One can visualize $\psi^*(x)$ as creating a singular excitation labeled by $x \in X$, and h as averaging over such excitations to yield one that is more smooth. Then Eqs. (7) have a natural geometrical meaning. They assert first, that U and $\psi^*$ act locally in X; and second, that if we create a new (single) excitation labeled by h and then transform the state vector under a diffeomorphism of X, we achieve the same outcome as transforming X by the diffeomorphism and then creating the transformed (single) excitation. The reason for the generality of language is that we are thinking here not only of bosons and fermions, but of anyons, vortices, or more complicated extended quantum objects.
Sharp and I constructed nonrelativistic anyon creation and annihilation fields in a particle number representation, obeying the commutation relations above, and found that the fields $\psi$ and $\psi^*$ themselves satisfy q-commutation relations, where q is the anyonic phase shift. This is one of the places that the q-deformation enters quantum physics without having been introduced "by hand."

3 Configuration Spaces, Quasi-Invariant Measures, and Cocycles
Next we discuss CURs of G more broadly. In a quite general framework, a unitary representation of the diffeomorphism group Diff X can be written so as to act in a Hilbert space $\mathcal H = L^2_\mu(\Delta, \mathcal W)$. Here we need a configuration space $\Delta$, a continuous action of Diff X on $\Delta$, and a measure $\mu$ on $\Delta$ that is quasi-invariant under diffeomorphisms; i.e., diffeomorphisms must preserve the class of measure zero sets. Then $\mathcal H$ is the space of $\mu$-square-integrable functions $\Psi(\gamma)$, for $\gamma \in \Delta$, taking values in an inner product space $\mathcal W$. The representation V is given by

$$[V(\phi)\Psi](\gamma) = \chi_\phi(\gamma)\, \Psi(\phi\gamma)\, \sqrt{\frac{d\mu_\phi}{d\mu}(\gamma)}, \tag{8}$$

where $\phi\gamma$ indicates the action of the diffeomorphism $\phi$ on the configuration $\gamma$; $\mu_\phi$ is the measure transformed by $\phi$; $d\mu_\phi/d\mu$ is the Radon-Nikodym derivative of $\mu_\phi$ with respect to $\mu$; and the $\chi_\phi(\gamma): \mathcal W \to \mathcal W$ are unitary operators in $\mathcal W$. The latter are defined ($\forall \phi \in$ Diff X) almost everywhere in $\Delta$, and must satisfy ($\forall \phi_1, \phi_2 \in$ Diff X) the cocycle equation

$$\chi_{\phi_1\phi_2}(\gamma) = \chi_{\phi_1}(\gamma)\, \chi_{\phi_2}(\phi_1\gamma) \tag{9}$$
almost everywhere in $\Delta$. The quasi-invariance of $\mu$ is the condition that guarantees the existence of the Radon-Nikodym derivative in (8); the square root is the real 1-cocycle needed to make the representation unitary. While $\Delta$ together with $\mu$ tell us the quantum configurations, the unitary 1-cocycle $\chi$ gives information about (generalized) phases, including possible exchange statistics of particles. It is always allowed to set $\chi_\phi(\gamma) \equiv 1$. When $\Delta$ is embedded in the dual space $\mathcal D'(X)$, with pairing $\langle \gamma, f \rangle$, we also have

$$[U(f)\Psi](\gamma) = e^{i\langle \gamma, f \rangle}\, \Psi(\gamma), \tag{10}$$

which together with (8) defines the representation obeying (4).
With $X = R^d$, the CURs of G that result from exponentiating (5) fit this picture very nicely. They are obtained by letting $\Delta_N$ be the space of (unordered) N-point subsets of X, and $\mu$ be the (local) Lebesgue measure on this space. Then Diff X acts on $\Delta_N$ in the natural way: for $\gamma = \{x_1, \ldots, x_N\}$, we have $\phi\gamma = \{\phi(x_1), \ldots, \phi(x_N)\}$. The group action is transitive, and $\mu$ is quasi-invariant under it. Indeed the Radon-Nikodym derivative is given by the product $(d\mu_\phi/d\mu)(\gamma) = \prod_{j=1}^N J_\phi(x_j)$, where $J_\phi(x)$ is the Jacobian of $\phi$ at x. Furthermore, we embed $\Delta_N$ in $\mathcal D'(X)$ by defining $\langle \gamma, f \rangle = \sum_{j=1}^N f(x_j)$. This gives a class of CURs of the semidirect product group describing the quantum theories of N indistinguishable particles. To specify a particular CUR in the class, we must now choose a unitary cocycle $\chi_\phi(\gamma)$.
We can fix an arbitrary element $\gamma^0 = \{x_1^0, \ldots, x_N^0\} \in \Delta_N$, and associate noncohomologous unitary cocycles with inequivalent, unitary representations of the stability subgroup $\mathrm{Diff}_{\gamma^0} X$. The latter consists of all diffeomorphisms
$\phi$ such that $\phi\gamma^0 = \gamma^0$. Note that for $\phi \in \mathrm{Diff}_{\gamma^0} X$, Eq. (9) reduces directly to such a representation at $\gamma^0$. Under the right technical conditions, the CUR of Diff X described by the cocycle can be obtained directly by inducing from a CUR of $\mathrm{Diff}_{\gamma^0} X$. The method is similar to Mackey's construction, 11 except that our groups (being infinite-dimensional) are not locally compact; hence we do not have Haar measure with which to work, and must obtain $\mu$ and its good properties by other means. 12
Let us look a little more closely at $\mathrm{Diff}_{\gamma^0} X$, when $\gamma^0$ is an N-point configuration. A diffeomorphism $\phi$ can leave $\gamma^0$ fixed by leaving each point $x_j^0$ of $\gamma^0$ individually fixed; but for $X = R^d$, $d > 1$, $\phi$ can also permute the points. Thus we immediately have a natural, continuous homomorphism $h_{\gamma^0}$ from $\mathrm{Diff}_{\gamma^0} X$ to the symmetric group $S_N$. A unitary representation u of $S_N$ yields a CUR $u \circ h_{\gamma^0}$ of $\mathrm{Diff}_{\gamma^0} X$, which in turn induces a CUR of Diff X. In this way, the N-particle representations with different unitary cocycles describe not only Bose and Fermi particle statistics, but also parastatistics (induced from the higher-dimensional unitary representations u of $S_N$). The trivial representation u = 1 gives the cocycle $\chi_\phi(\gamma) \equiv 1$, which describes bosons. The alternating representation of $S_N$ leads to a 1-cocycle describing fermions.
When d = 2 more possibilities exist. There is now a natural homomorphism from $\mathrm{Diff}_{\gamma^0} R^2$ to the braid group $B_N$. Unitary representations of $B_N$ then provide CURs of $\mathrm{Diff}_{\gamma^0} R^2$ which induce CURs of Diff $R^2$. In this way one-dimensional unitary representations of $B_N$ lead to the intermediate statistics of anyons, and higher-dimensional representations of $B_N$ to plektons. 6,9,13
It is interesting to remark on what might seem a pathological feature. The measure zero set $Z_\phi \subset \Delta$ on which $\chi_\phi$
Diff X. Lebesgue measure on $\Delta_N$ is quasi-invariant under diffeomorphisms, but now the corresponding CURs of Diff X are reducible; diffeomorphisms alone do not connect the subspaces having different exchange symmetry. However, let us introduce a set of distinct particle masses $m_j \neq m_k$ for $j \neq k$, and define $\langle \gamma, f \rangle = \sum_{j=1}^N m_j f(x_j)$. Then the mass density operator acts as multiplication of $\Psi(\gamma)$ by $\langle \gamma, f \rangle$; i.e.,

$$\rho^N_{op;\, m_1, \ldots, m_N}(f)\, \Psi(x_1, \ldots, x_N) = \sum_{j=1}^N m_j f(x_j)\, \Psi(x_1, \ldots, x_N). \tag{11}$$
The corresponding CUR of the semidirect product group is now irreducible. We return to systems of distinguishable particles later.
There are some different ways to go from CURs of G describing N particles to those describing infinitely many particles. One way is to put N bosons in a compact, rectangular region of volume V, impose periodic boundary conditions on wave functions in the region, and then let N and V become infinite while their ratio N/V approaches a finite constant (the mean density). More precisely, let $U_{N,V}(f)V_{N,V}(\phi)$ be the N-particle Bose representation of $G = \mathcal D(X) \times \mathrm{Diff}\, X$, where X is now a torus of volume V. Let the (normalized) ground state wave function on this torus be $\Omega_{N,V} = V^{-N/2}$, and write the generating functional $L_{N,V}(f) = (\Omega_{N,V}, U_{N,V}(f)\, \Omega_{N,V})$. The limiting functional

$$L_\rho(f) = \lim_{N, V \to \infty,\ N/V \to \rho > 0} L_{N,V}(f) = \exp\Big\{ \rho \int [e^{i f(x)} - 1]\, dx \Big\} \tag{12}$$

is then the Fourier transform of a Poisson measure $\mu_\rho$, where $\rho > 0$ is the Poisson parameter. To elaborate, we obtain $\mu_\rho$ as a probability measure on the configuration space $\Gamma_X$ of countably infinite but locally finite subsets of X. The measure is concentrated on the subset consisting of all configurations that have average density $\rho$. For $\gamma = \{x_j \mid j = 1, 2, 3, \ldots\} \in \Gamma_X$, and for $f \in \mathcal D(X)$, we have $\langle \gamma, f \rangle = \sum_j f(x_j)$ and

$$L_\rho(f) = \int_{\Gamma_X} \exp i\langle \gamma, f \rangle\; d\mu_\rho(\gamma). \tag{13}$$
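The limit in (12) can be observed numerically in one dimension. Since the ground state is a product of constants, the finite-volume functional factorizes as $L_{N,V}(f) = (V^{-1}\int e^{if(x)}dx)^N = (1 + a/V)^N$ with $a = \int [e^{if(x)} - 1]\,dx$, and this tends to $e^{\rho a}$ as $N/V \to \rho$. The sketch below is an illustration only: the test function `bump`, supported in $[0, 1]$, and all function names are hypothetical choices, not taken from the paper.

```python
import cmath
import math


def bump(x):
    """Hypothetical test function supported in [0, 1]."""
    return math.sin(math.pi * x) ** 2 if 0.0 <= x <= 1.0 else 0.0


def one_particle_integral(f, steps=20_000):
    """Midpoint value of the integral of (e^{i f(x)} - 1) over [0, 1]."""
    h = 1.0 / steps
    return sum(cmath.exp(1j * f((k + 0.5) * h)) - 1.0 for k in range(steps)) * h


def finite_volume_functional(f, n_particles, volume):
    """L_{N,V}(f) = (1 + a/V)^N for N bosons in the constant ground state."""
    a = one_particle_integral(f)
    return (1.0 + a / volume) ** n_particles


def poisson_functional(f, density):
    """Limiting functional (12): exp(rho * integral of (e^{i f} - 1))."""
    return cmath.exp(density * one_particle_integral(f))


if __name__ == "__main__":
    density = 2.0
    limit = poisson_functional(bump, density)
    for volume in (10.0, 100.0, 1000.0):
        n = int(density * volume)
        print(volume, abs(finite_volume_functional(bump, n, volume) - limit))
```

The printed deviations shrink roughly like $1/V$, consistent with the expansion $N\log(1 + a/V) = \rho a - \rho a^2/(2V) + \cdots$.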
Now for $\phi \in$ Diff X, its natural action on $\Gamma_X$ is given by $\phi\gamma = \{\phi(x_j) \mid x_j \in \gamma\}$.
The more important fact is that $\mu_\rho$ is quasi-invariant under this action: the Radon-Nikodym derivative $(d\mu_\phi/d\mu)(\gamma)$ is the infinite product of the Jacobians $J_\phi(x_j)$, $x_j \in \gamma$, but as $\phi$ is compactly supported, only finitely many of the factors are unequal to one. Thus we have (for each value of $\rho$) a CUR of the semidirect product group G. This CUR describes the infinite free Bose gas. Another way to obtain the same representation is to start in the Fock representation of the canonical Bose fields $\psi^*_{op}$, $\psi_{op}$. Make the substitutions $\psi^*_{op} \to \psi^*_{op} + \sqrt{\rho}\, I$ and $\psi_{op} \to \psi_{op} + \sqrt{\rho}\, I$, which leave the canonical commutation relations manifestly invariant. Now Eq. (1) gives the self-adjoint representation of the current algebra associated with the Poisson measure. 14,15
The configuration space $\Gamma_X$, equipped with Poisson or Gibbs measure and carrying a CUR of the group Diff X, is of great current interest in both quantum and statistical physics, and the above development has been advanced considerably. 16,17 For the Bose case we have been discussing, $\chi_\phi(\gamma) = 1$. Evidently, in more than one space dimension, one can obtain additional, inequivalent representations from the cocycles associated with the nontrivial representations of a group of infinite permutations, or (in two space dimensions) infinite braids. I am currently pursuing an idea in this direction, in collaboration with Kondratiev, that we believe might give us a nice description of an "infinite Aharonov-Bohm gas."
Of long-term interest is also the question of how to construct measures that are quasi-invariant, under diffeomorphisms of X, on spaces of extended objects: configurations such as strings, loops, or ribbons embedded in X. A step in this direction is to consider the more elementary possibility of infinite configurations of points in X that are not locally finite, but have accumulation points. This is one of the key ideas behind my work with Moschella, which I shall describe next.

4 Measures from Self-Similar Random Processes
When $X$ is the real line, the half-line, or the circle, there are some special techniques for obtaining CURs of $\mathrm{Diff}\,X$ and its central extensions.18,19,20 Here, through a few elementary examples, I would like to explain a method that Moschella and I introduced21,22,23,24 for constructing measures quasi-invariant for $\mathrm{Diff}\,\mathbb{R}^1$. Let us work in the sequence space $X^\infty$, with $X = \mathbb{R}^1$. To start, draw a pair of points $x_1, x_2 \in \mathbb{R}^1$ from some nowhere vanishing probability density $f(x)$, and for convenience label the points so that $x_1 < x_2$. Now draw two more points from the uniform probability density on the interval $[x_1, x_2]$, and label these so that $x_3 < x_4$. Iterating this procedure, we have a Markov process in which $x_{2m+1}$ and $x_{2m+2}$ are drawn from the conditional probability density

$$f_{2m+1}(x \mid x_{2m-1}, x_{2m}) = f_{2m+2}(x \mid x_{2m-1}, x_{2m}) = \frac{I_{[x_{2m-1},\,x_{2m}]}(x)}{|x_{2m} - x_{2m-1}|} \qquad (14)$$
where $I_{[a,b]}$ denotes the indicator function of the interval $[a,b]$. To adapt the method to the circle $S^1$, begin by drawing $x_1$ and $x_2$ from a uniform density and continue with the shorter of the two resulting intervals of arc (which with probability one are of unequal length). It is not hard to show that with probability one there is precisely one cluster point for the sequence $(x_j)$, to which it converges. The sequence of conditional probability measures for obtaining the $x_j$ defines (using Kolmogorov's theorem) a probability measure $\mu$ on the space $X^\infty$ of infinite sequences $(x_j)$. Now let $\gamma = (x_j)$, and consider the Radon-Nikodym derivative $(d\mu_\phi/d\mu)(\gamma)$ of the transformed measure $\mu_\phi$ with respect to $\mu$, at $\gamma$. If it exists, it can be expressed as an infinite product: $(d\mu_\phi/d\mu)(\gamma) = \prod_{j=1}^{\infty} u_{j,\phi}(\gamma)$, where the $j$th factor $u_{j,\phi}(\gamma)$ arises from the step in which the point $x_j$ is drawn. Quasi-invariance of $\mu$ under diffeomorphisms means that this infinite product converges to a finite, non-zero limit. In general, supposing $x_j$ to be selected from the conditional probability density $f_j(x \mid x_1, \ldots, x_{j-1})$, we have

$$u_{j,\phi}(\gamma) = \frac{f_j(\phi(x_j) \mid \phi(x_1), \ldots, \phi(x_{j-1}))}{f_j(x_j \mid x_1, \ldots, x_{j-1})}\, J_\phi(x_j) \qquad (15)$$

The explicit expression for $u_{j,\phi}(\gamma)$ in the present example, using (14), is

$$u_{j=2m+1,\phi}(\gamma) = \frac{|x_{2m} - x_{2m-1}|}{|\phi(x_{2m}) - \phi(x_{2m-1})|}\, J_\phi(x_{2m+1}) \qquad (16)$$
and similarly for $j = 2m+2$. Since with probability 1 the sequence $x_j$ converges, we have $u_{j,\phi} \to 1$. Note how the nature of the conditional probability leads to this conclusion. We let the width of the probability distribution for each pair of points be established by the outcome in choosing the preceding pair. Then, as the $x_j$ approach a limit, the first factor in (16) tends toward the reciprocal of the Jacobian. The self-similarity built into the configurations selected by the measure is what gives us the quasi-invariance. Now, $u_{j,\phi} \to 1$ is necessary but not sufficient to ensure that the infinite product converges to a finite, non-zero limit. It is also necessary that the rate at which $u_{j,\phi} \to 1$ be sufficiently great. In the case at hand, it is not hard to show that (with probability 1) $\sum_{j=1}^{\infty} |x_{j+1} - x_j| < \infty$, which guarantees the desired convergence. So we do have a quasi-invariant measure, a CUR of $\mathrm{Diff}\,\mathbb{R}^1$, and a quantum system that the representation describes.
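The nested-interval process (14) and the factors (15)-(16) can be illustrated numerically. The following Python sketch draws a finite initial segment of the sequence and evaluates the corresponding factors $u_{j,\phi}$. The standard normal density for $f$, the diffeomorphism $\phi(x) = x + \tfrac{1}{2}\tanh x$ (which is not compactly supported), the seed and all function names are assumptions chosen only for illustration.

```python
import math
import random


def phi(x):
    """Hypothetical order-preserving diffeomorphism of R (an assumption)."""
    return x + 0.5 * math.tanh(x)


def jacobian(x):
    """Derivative of phi; it is >= 1, so phi is strictly increasing."""
    return 1.0 + 0.5 / math.cosh(x) ** 2


def normal_density(x):
    """Nowhere vanishing density f used for the first pair (an assumption)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def sample_sequence(n_pairs, rng):
    """Draw x_1 < x_2 from f, then nested uniform pairs as in (14)."""
    xs = sorted(rng.gauss(0.0, 1.0) for _ in range(2))
    for _ in range(n_pairs - 1):
        lo, hi = xs[-2], xs[-1]
        xs.extend(sorted(rng.uniform(lo, hi) for _ in range(2)))
    return xs


def rn_factors(xs):
    """Factors u_j of (15); from the third point on they take the form (16)."""
    factors = [normal_density(phi(x)) * jacobian(x) / normal_density(x)
               for x in xs[:2]]
    for j in range(2, len(xs)):
        m = j // 2                                # index of the current pair
        lo, hi = xs[2 * m - 2], xs[2 * m - 1]     # the preceding pair
        ratio = (hi - lo) / (phi(hi) - phi(lo))   # tends to 1 / J_phi
        factors.append(ratio * jacobian(xs[j]))
    return factors


rng = random.Random(1)
xs = sample_sequence(8, rng)
us = rn_factors(xs)
total_variation = sum(abs(b - a) for a, b in zip(xs, xs[1:]))

partial_product = 1.0
for j, u in enumerate(us, start=1):
    partial_product *= u
    print(f"j={j:2d}  x_j={xs[j - 1]: .10f}  u_j={u:.10f}  "
          f"product={partial_product:.10f}")
print("sum of |x_{j+1} - x_j| over the sampled segment:", total_variation)
```

In such a run the factors are expected to approach 1 as the intervals shrink and the partial products to settle down, in line with the convergence argument above. A finite sample can of course only illustrate, not prove, the almost-sure statements.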
Physically, this CUR may be interpreted as the quantum theory of a strictly confined cluster of infinitely many particles. Despite the fact that the $x_j$ are ordered by the index $j$, the component particles in the cluster are not truly (i.e., not intrinsically) distinguishable. This is because the index labeling of the points in $(x_j)$ coincides, with probability one, with the ordering obtained directly from the values of the $x_j$ (using the order relation in $\mathbb{R}^1$). We can simply label the odd points $x_{2m+1}$ as we come in from the left, and the even ones $x_{2m+2}$ as we come in from the right, and obtain the same indices that occurred during the random process just described. However, we can further introduce a mass distribution for the particle cluster. Choose a fixed sequence $(m_j)$, with each entry $m_j > 0$ and $\sum_j m_j = M$. For $f \in \mathcal{D}(\mathbb{R}^1)$ let $\langle \gamma, f \rangle = M^{-1} \sum_{j=1}^{\infty} m_j f(x_j) < \infty$. Then

$$U(f)V(\phi)\Psi(\gamma) = e^{i\langle \gamma, f \rangle}\, \Psi(\phi\gamma) \prod_{j=1}^{\infty} \sqrt{u_{j,\phi}(\gamma)} \qquad (17)$$
gives a CUR of the semidirect product group $G$.

The infinite free Bose gas in $\mathbb{R}^1$, described as in the previous section by a Poisson measure with parameter $p$, can also be constructed through a self-similar random process "reciprocal" to the one above, where the pairs of points are chosen outside rather than inside the intervals established by the preceding choices. We choose the first pair of points $x_1$ and $x_2$ to the left and the right of the origin (respectively) from exponential densities with parameter $p$. Then continue with densities $f_{2m+1}$ and $f_{2m+2}$ that vanish in the interior of the interval $[x_{2m-1}, x_{2m}]$, and are nonzero but fall off exponentially to the left and the right of the interval respectively (keeping $p$ independent of $j$).

There exist further interesting possibilities for quasi-invariant measures based on intervals. We can construct tensor product representations, for which the configurations are $N$ tightly-bound (possibly overlapping) clusters. We can also consider configurations that are the union of infinitely many clusters, where the accumulation points distribute according to a Poisson measure with fixed parameter. In such examples, we begin to deal with the possibility that distinct ordered sequences may lead to the same unordered configuration.

Measures on spaces of random Cantor sets, quasi-invariant for diffeomorphisms, were constructed by the present method, and independently by Neretin.24,25 As above, we draw $x_1$ and $x_2$ from nowhere vanishing densities $f_1(x) = f_2(x)$ on $\mathbb{R}^1$, or $S^1$, labeling so that $x_1 < x_2$, and draw $x_3$ and $x_4$ from a uniform density on $[x_1, x_2]$, labeling so that $x_3 < x_4$. Let $U_1$ be the relative complement of $[x_3, x_4]$ in $[x_1, x_2]$, the disjoint union of a pair of intervals (of different lengths). Now draw $x_5$ and $x_6$ from the uniform density on the lower
of these intervals, and $x_7$ and $x_8$ from the uniform density on the upper one. Again form the relative complement by removing the inner intervals, to obtain $U_2$ as the disjoint union of four intervals. At the $m$th stage we thus draw $2^m$ random points $(x_{2^m+1}, x_{2^m+2}, \ldots, x_{2^{m+1}})$ from a measurable set $U_{m-1}$, obtaining $U_m \subset U_{m-1}$ as the disjoint union of $2^m$ intervals. This procedure defines a measure $\mu$ on the space of infinite sequences $(x_j)$. With probability one (by this construction) the set of accumulation points in $\mathbb{R}^1$ of such a sequence is a random Cantor set, uncountably infinite but of Lebesgue measure zero. To demonstrate quasi-invariance of $\mu$ under a diffeomorphism
$0 < \kappa < \kappa_0$, $(x_j)$ converges to $x_0$ with probability one (condensed phase); and when $\kappa > \kappa_0$, $|x_j| \to \infty$ geometrically with probability one (rarefied phase). The associated measures on the sequence space are again quasi-invariant under elements of the group $\mathrm{Diff}\,\mathbb{R}^1$. Thus for all $\kappa \neq \kappa_0$, we have a unitary group representation and the possibility of its interpretation as a quantum system. In this construction, the use of normal densities is not really essential. We need only the scaling property, and applicability of the central limit theorem.

The next major step is to obtain some examples in $\mathbb{R}^d$ for $d > 1$. We conjecture that measures can be constructed through self-similar random processes that generalize the preceding example. In $\mathbb{R}^d$ we anticipate choosing the point $x_{m+1}$ from a density that depends on the outcome for $x_0$, establishing the mean, and on the preceding $d$ points $(x_{m-d+1}, \ldots, x_m)$, establishing the covariance matrix.

5. Configuration Spaces with Accumulation Points

Typically distinguishable particles in a manifold $X$ are described by ordered or labeled points in $X$. Indistinguishable particles are described by (unordered) subsets or generalized subsets of $X$, where in a generalized subset we permit the same element to occur more than once. Correspondingly, measures quasi-invariant under diffeomorphisms can be constructed on spaces of $N$-tuples or infinite sequences of points drawn from $X$, or alternatively on spaces of finite or countable subsets or generalized subsets. To relate the methods that Moschella and I are using to those of others, especially the cited papers by Albeverio et al.16,17 and Neretin,25 we need a framework for statistical physics with a configuration space larger than $\Gamma_X^\infty$. This is a direction of our ongoing research, about which I shall mention just a few ideas.27

We begin as usual with a smooth, noncompact manifold $X$, with all the needed nice topological properties. Let

$$\Sigma_X = \{\gamma \subset X : \gamma \text{ is finite or countably infinite}\} \qquad (18)$$
That is, with $\mathbb{N}_{\mathrm{ext}} = \mathbb{N} \cup \{\aleph_0\}$, we have $|\gamma| \in \mathbb{N}_{\mathrm{ext}}$ for all $\gamma \in \Sigma_X$. This is the largest "point" configuration space in which we now propose to work. At least in principle, $\Sigma_X$ allows the description of continuous, extended configurations such as submanifolds embedded in $X$ (strings, membranes, and so on), since these can occur as sets of accumulation points of sequences. We take "generalized configurations" to be generalized finite or countably infinite subsets of $X$; i.e., elements of $\Sigma_X$ each of whose members $x$ has a multiplicity $n(x)$ which is $1, 2, 3, \ldots$, or $\aleph_0$. Denote the set of generalized configurations by $\tilde\Sigma_X$.
Next we consider the natural projections from ordered $N$-tuples and sequences to unordered generalized finite and countable configurations. Let $X^N$ be the space of $N$-tuples drawn from $X$, where by convention $X^{N=0}$ contains one element, the empty sequence $()$; let $X^\infty$ be the space of infinite sequences of elements of $X$; and let $S(X) = (\bigcup_{N=0}^{\infty} X^N) \cup X^\infty$. Thus $S(X)$ includes both finite and infinite sequences $(x_j)$. We can now let $\tilde p : S(X) \to \tilde\Sigma_X$ be the map which assigns to a given sequence $(x_j) \in S(X)$ the corresponding generalized set of points; and let $p : S(X) \to \Sigma_X$ be the map which assigns to $\bar\gamma = (x_j)$ the corresponding set $\gamma = \{x_j\}$. Note that the image of $X^\infty$ under $p$ is actually all of $\Sigma_X$, as infinite sequences can consist of finitely many recurring points. Of course, this will typically occur with probability zero. Let $X_j^\infty \subset X^\infty$ be the set of infinite sequences having exactly $j$ accumulation points in $X$. Such sets will typically have positive measure. Note also that accumulation points in $X$ depend only on $\gamma$, and not on the particular sequence $\bar\gamma \in p^{-1}(\gamma)$.
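For finite sequences, the two projections just described can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: points of $X$ are represented by Python numbers, infinite sequences are replaced by finite truncations, and the function names `p` and `p_generalized` are hypothetical.

```python
from collections import Counter


def p(sequence):
    """Projection to an ordinary configuration: the set of points visited."""
    return frozenset(sequence)


def p_generalized(sequence):
    """Projection to a generalized configuration: points with multiplicities."""
    return Counter(sequence)


seq_a = (0.5, 0.25, 0.5, 0.125)
seq_b = (0.125, 0.5, 0.5, 0.25)   # the same points drawn in another order

same_set = p(seq_a) == p(seq_b)
same_multiset = p_generalized(seq_a) == p_generalized(seq_b)
multiplicity = p_generalized(seq_a)[0.5]

# A sequence made of finitely many recurring points has a finite image.
recurring = tuple(j % 3 for j in range(12))
finite_image = p(recurring)

print(same_set, same_multiset, multiplicity, sorted(finite_image))
```

Distinct ordered sequences thus collapse to one unordered configuration, while the generalized projection still remembers how often each point occurs.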
Let $\Gamma_X = \{\gamma \in \Sigma_X : |\gamma \cap K| \text{ is finite } (\forall \text{ compact } K \subset X)\}$. Then $\Gamma_X^\infty = \{\gamma \in \Gamma_X : |\gamma| = \aleph_0\}$. Define $\Sigma_X^\infty = \{\gamma \in \Sigma_X : |\gamma| = \aleph_0\}$, and let $\Sigma_{X,j}^\infty \subset \Sigma_X^\infty$ consist of configurations that have exactly $j$ accumulation points in $X$. So we have $\Gamma_X^\infty = \Sigma_{X,j=0}^\infty \subset \Sigma_X^\infty$, and $p(X_j^\infty) = \Sigma_{X,j}^\infty$. Subsets of the space $\tilde\Sigma_X$ of generalized configurations can be introduced along similar lines. Our goal is to move via $p$ from quasi-invariant measures on $S(X)$ or $X^\infty$, constructed through sequential, self-similar random processes, to corresponding quasi-invariant measures on configuration spaces of identical particles. This should also permit the subsequent consideration of interesting, nontrivial cocycles.

I would remark at this juncture that a locally finite configuration $\gamma$ defines a functional on $\mathcal{D}(X)$ as a sum of evaluation functionals, but this functional is not defined for a general element of $\Sigma_X$. Therefore measures on $\Sigma_X$ quasi-invariant for $\mathrm{Diff}\,X$, from which we obtain unitary representations of $\mathrm{Diff}\,X$, may not always yield unitary representations of the semidirect product group.

Now each of the quasi-invariant measures $\mu$ on $X^\infty$ described in the previous section is obtained, by a standard construction, through a compatible family of measures on cylinder sets with finite-dimensional Borel base. We wish to obtain from $\mu$ under these conditions a quasi-invariant measure on $\Sigma_X$ or one of its subsets; i.e., to be able to essentially disregard the order of the entries in the sequence, and look only at the target points. To accomplish this, the idea is to introduce the largest $\sigma$-field on $\Sigma_X$ for which the projection map $p$ is measurable, and obtain the projected measure $\hat\mu$. One nice result is that for any $j$, the set of all configurations with precisely $j$ accumulation
points in a given Borel set of $X$ belongs to the $\sigma$-field on $\Sigma_X$. Thus we can count the numbers of accumulation points. A second, essential result is that the construction respects the quasi-invariance of $\mu$ under $\mathrm{Diff}\,X$.

Acknowledgments

I would like to thank the Alexander von Humboldt Foundation for its support of the continuation of this work and related research during my 1998-99 sabbatical year in Germany. I am also grateful to the Institut für Angewandte Mathematik, Universität Bonn, Germany, the Dipartimento di Scienze Chimiche, Fisiche, e Matematiche, Università dell'Insubria, Como, Italy, and the Arnold Sommerfeld Institut für Mathematische Physik, Technische Universität Clausthal, Germany, for hospitality during my recent visits. I have benefitted from interesting conversations connected to this work with S. Albeverio, Y. G. Kondratiev, T. Kuna, U. Moschella, and M. Röckner.

References

1. R. Dashen and D. H. Sharp, Phys. Rev. 165, 1867 (1968).
2. G. A. Goldin and D. H. Sharp, Lie algebras of local currents and their representations. In 1969 Battelle Rencontres: Group Representations, Lecture Notes in Physics 6, ed. V. Bargmann (Springer, Berlin, 1970), p. 300.
3. G. A. Goldin, J. Math. Phys. 12, 462 (1971).
4. G. A. Goldin, R. Menikoff, and D. H. Sharp, Phys. Rev. Lett. 51, 2246 (1983).
5. J. M. Leinaas and J. Myrheim, Nuovo Cimento 37B, 1 (1977).
6. G. A. Goldin, R. Menikoff, and D. H. Sharp, J. Math. Phys. 22, 1664 (1981).
7. F. Wilczek, Phys. Rev. Lett. 48, 1144 (1982).
8. G. A. Goldin and D. H. Sharp, Phys. Rev. D 28, 830 (1983).
9. G. A. Goldin, R. Menikoff, and D. H. Sharp, Phys. Rev. Lett. 54, 603 (1985).
10. G. A. Goldin and D. H. Sharp, Phys. Rev. Lett. 76, 1183 (1996).
11. G. W. Mackey, Induced Representations of Groups and Quantum Mechanics (Benjamin, New York, 1968).
12. G. A. Goldin, R. Menikoff, and D. H. Sharp, J. Math. Phys. 21, 650 (1980).
13. G. A. Goldin, Parastatistics, $\theta$-statistics, and topological quantum mechanics from unitary representations of diffeomorphism groups, in Procs. of the XV. Int'l. Conf. on Differential Geometric Methods in Theoretical Physics, eds. H.-D. Doebner and J. D. Hennig (World Scientific, Singapore, 1987), p. 197.
14. G. A. Goldin, J. Grodnik, R. T. Powers, and D. H. Sharp, J. Math. Phys. 15, 88 (1974).
15. A. M. Vershik, I. M. Gelfand, and M. I. Graev, Dokl. Akad. Nauk. SSSR 232, 745 (1977).
16. S. Albeverio, Y. G. Kondratiev, M. Röckner, J. Funct. An. 154, 444 (1998); J. Funct. An. 157, 242 (1998).
17. S. Albeverio, Y. G. Kondratiev, M. Röckner, Diffeomorphism groups and current algebras: Configuration space analysis in quantum theory, Rev. Math. Phys. (in press).
18. G. Segal, Commun. Math. Phys. 80, 301 (1981).
19. R. S. Ismagilov, Representations of Infinite Dimensional Groups, AMS Translations of Math. Monographs 152 (Providence, RI: American Math. Society, 1996).
20. E. T. Shavgulidze, Proceedings of the Steklov Institute of Mathematics 217, 181-202 (1997).
21. G. A. Goldin and U. Moschella, Diffeomorphism groups, quasi-invariant measures, and infinite quantum systems. In Symmetries in Science VIII, ed. B. Gruber (New York, Plenum, 1995), p. 159.
22. G. A. Goldin and U. Moschella, J. Phys. A: Math. Gen. 28, L475 (1995).
23. G. A. Goldin and U. Moschella, Diffeomorphism group representations and quantum phase transitions in one dimension. In XIth International Congress on Mathematical Physics, ed. D. Iagolnitzer (Cambridge, MA: International Press, 1995), p. 445.
24. G. A. Goldin, Unitary representation of diffeomorphism groups from random fractal configurations. In Lie Theory and its Applications, eds. H. D. Doebner, V. Dobrev and J. Hilgert (Singapore, World Scientific, 1996), p. 82.
25. Y. A. Neretin, Sbornik: Mathematics 187, 857 (1996).
26. See also G. A. Goldin and U. Moschella, Random Cantor sets and measures quasi-invariant for diffeomorphism groups. Submitted for the Procs. of the 17th Workshop in Geometric Methods in Physics, Bialowieza, Poland, 1998 (1999, in press).
27. G. A. Goldin and U. Moschella, Generalized configuration spaces for quantum systems. Submitted for the proceedings of the International Conference on Infinite-Dimensional (Stochastic) Analysis and Quantum Physics, Leipzig, January 18-22, 1999 (in press).
STOCHASTIC PROCESSES AND THE FEYNMAN INTEGRAL

Z. HABA

Institute of Theoretical Physics, University of Wroclaw, Poland

We discuss a representation of quantum dynamics in terms of Markov processes. It is shown that a holomorphic continuation of the Schrödinger wave function $\psi_t(x) \to \psi_t(z)$ determines a complex Markov process $q_t(z)$. Feynman integration at a finite temperature is briefly discussed.
1 Introduction
An attempt to express the quantum dynamics by means of a stochastic process has a long history. We consider here some methods which relate the Feynman integral1 to a probability measure (for some other approaches see 2,3 and references quoted there). As is well known, the Feynman integral cannot be expressed as a complex measure on an infinite dimensional space of real paths. It can however be represented as a distribution on a space of Wiener functionals (Hida distribution; see 4,5). We discuss in this paper another approach which allows one to express the Feynman integral by means of a stochastic process running over a complexified configuration space (the process is a functional of the Brownian motion). Such an approach has been discussed earlier by Cameron,6 Doss7 and Azencott and Doss.8

We shall denote by $b(t) \in \mathbb{R}^n$ the Brownian motion, which is the Gaussian process with the covariance $E[b_j(t)b_k(s)] = \delta_{jk}\min(t,s)$ where $j,k = 1,2,\ldots,n$. The expectation value can be realized by the Wiener measure. A generalization of the Feynman formula of Cameron and Doss6,7 reads (we exclude some points from $\mathbb{R}^n$, then the admitted class of potentials is much larger than the one considered by previous authors).

Theorem 1 Assume that $\psi(x)$ and $V(t,x)$ (for each $t \in [0,\tau]$) are analytic functions and

$$\exp\Big(-\frac{i}{\hbar}\int_0^t V(t-s,\, x+\lambda\sigma b(s))\,ds\Big)\,\psi(x+\lambda\sigma b(t))$$

is integrable with respect to the Wiener measure for $t \in [0,\tau]$ and $x \in D$, where $D$ is a region in $\mathbb{R}^n$. Then

$$\psi(t,x) = E\Big[\exp\Big(-\frac{i}{\hbar}\int_0^t V(t-s,\, x+\lambda\sigma b(s))\,ds\Big)\,\psi(x+\lambda\sigma b(t))\Big] \qquad (1)$$

where

$$\lambda = \frac{1}{\sqrt{2}}(1+i) \quad\text{and}\quad \sigma = \sqrt{\hbar/m},$$

is the solution of the Schrödinger equation

$$i\hbar\,\partial_t\psi(t,x) = -\frac{\hbar^2}{2m}\Delta\psi(t,x) + V(t,x)\psi(t,x) \qquad (2)$$

(with the initial condition $\psi$) for $t \in [0,\tau]$ and $x \in D$.

In the proof of this theorem one shows through differentiation (applying elementary stochastic calculus) that the Schrödinger equation is satisfied pointwise. The assumption of integrability can be satisfied for a large class of potentials with $D$ which is the whole of $\mathbb{R}^n$ with a possible exception of arbitrarily small neighbourhoods of a finite number of points (see a discussion in 9). An analytic potential which is linearly bounded on the domain $\{z \in \mathbb{C}^n : z = x + y + iy,\ x \in D,\ y \in \mathbb{R}^n\}$ will satisfy our assumptions. In particular, we discussed in 9 the rational functions of the form $P(x)Q(x)^{-1}$ where $P$ is a polynomial of degree $k$ and $Q$ is a polynomial of degree greater than $k-2$. Then, the set $D$ will be $\mathbb{R}^n$ without small neighbourhoods of a finite number of points where the Brownian motion (multiplied by $(1+i)$) hits the complex roots of $Q$. Another class of admissible potentials includes rational functions of trigonometric and hyperbolic functions. For example we can treat $V(x) = \tan(x)$ if $x \neq (2k+1)\pi/2$ in $\psi_t(x)$.
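Formula (1) can be checked numerically in the simplest case. The sketch below is a Monte Carlo estimate in Python under loudly simplifying assumptions that are not taken from the text: one dimension, $V = 0$, units with $\hbar = m = 1$, and the analytic initial function $\psi(x) = \exp(-x^2/2)$, for which the free Schrödinger evolution is known in closed form. The function names, the sample size and the seed are hypothetical.

```python
import cmath
import math
import random

HBAR = 1.0                              # units chosen for the illustration
MASS = 1.0
LAM = (1 + 1j) / math.sqrt(2.0)         # lambda = (1 + i)/sqrt(2), lambda^2 = i
SIGMA = math.sqrt(HBAR / MASS)          # sigma = sqrt(hbar/m)


def psi0(z):
    """Analytic (unnormalised) initial wave function exp(-z^2/2)."""
    return cmath.exp(-0.5 * z * z)


def psi_exact(t, x):
    """Closed-form free Schrodinger evolution of psi0."""
    a = 1.0 + 1j * HBAR * t / MASS
    return cmath.exp(-0.5 * x * x / a) / cmath.sqrt(a)


def psi_monte_carlo(t, x, n_samples, rng):
    """Estimate E[psi0(x + lambda*sigma*b(t))] using b(t) ~ N(0, t)."""
    root_t = math.sqrt(t)
    total = 0j
    for _ in range(n_samples):
        b_t = root_t * rng.gauss(0.0, 1.0)
        total += psi0(x + LAM * SIGMA * b_t)
    return total / n_samples


rng = random.Random(12345)
estimate = psi_monte_carlo(1.0, 0.5, 100_000, rng)
exact = psi_exact(1.0, 0.5)
print("Monte Carlo:", estimate)
print("exact      :", exact)
```

With $V = 0$ only the endpoint $b(t)$ enters, so no path discretisation is needed. For a nonzero potential the time integral in (1) would have to be approximated along sampled paths, and the variance of the estimator can grow quickly with $t$.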
2 Representation of the quantum dynamics by a stochastic process
We show in this section that quantum dynamics can be represented by a Markov process. First, we show how to construct the process as a solution of a stochastic differential equation. Then, we express the correlation functions of the process and its characteristic function by means of the solution of the
corresponding Schrödinger problem. We assume here that the ground state $\chi$ is known (for another classical construction of the process see 9). We write the ground state $\chi$ of the Hamiltonian $H$ in the form $\chi = \exp(-S/\hbar)$ ($\chi$ is strictly positive, hence $\ln\chi$ is well-defined)

$$H\exp(-S/\hbar) = E_0\exp(-S/\hbar)$$

where

$$H = -\frac{\hbar^2}{2m}\Delta + V \qquad (3)$$
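We consider the initial condition for the Schrödinger equation (2) in the form $\psi = \chi\phi$ where both $\chi$ and $\phi$ are assumed to be analytic. $\psi_t = \exp(-iE_0t/\hbar)\,\chi\phi_t$ is the solution of the Schrödinger equation (2) with the initial condition $\psi = \chi\phi$ if $\phi_t$ solves the equation

$$\partial_t\phi_t = \frac{i\hbar}{2m}\Delta\phi_t - \frac{i}{m}\nabla S\cdot\nabla\phi_t \qquad (4)$$

with the initial condition $\phi$. Consider the stochastic equation

$$dq = -\frac{i}{m}\nabla S(q)\,ds + \lambda\sigma\,db(s) \qquad (5)$$

Let $q_s(x)$ be the solution of the stochastic equation (5) with the initial condition $x$ till the (random) explosion time $\tau(q)$; then

$$\phi_t(x) = E[\phi(q_t(x))] \qquad (6)$$

is the solution of the equation (4). It is assumed here that the expectation value (6) is taken over the paths $q$ which do not explode till the time $\tau(q) > t$ and moreover that the initial wave function …

$$\psi_t(z) = \exp(-iE_0t/\hbar)\,\chi(z)\,E[\phi(q_t(z))] \qquad (7)$$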
is the solution of the Schrödinger equation (2) on $\mathbb{C}^n$. $q_t(z)$ can be considered as a random analytic map $\mathbb{C}^n \to \mathbb{C}^n$. In general, in the stochastic equation (5) we could consider any solution $\chi_t$ of the Schrödinger equation. The solution may have nodes. In such a case the drift in the stochastic equation has poles at some points $z_k$. In the complex topology the poles at $z_k$ are treated in the same way as the one at $\infty$ (which appears even if $S(z)$ is analytic). Then, the problem of an explosion to infinity with probability 1 is the same as the problem that the solution $q_t$ hits a pole of the drift (this probability can be zero, see 10,11). In principle, the stochastic equation (5) can be decomposed into its real and imaginary parts. We obtain a degenerate (real) diffusion on $\mathbb{R}^{2n}$ (the degeneracy means that the dimension of the Brownian motion is a half of the dimension of the diffusion).

Correlation functions of a Markov process can be evaluated by means of the transition function (we use here one-dimensional notation in order to avoid indices). If $0 < t_1 < t_2 < \ldots < t_k$ then

$$E[q_{t_1}(x)\cdots q_{t_k}(x)] = \int dy_1\cdots dy_k\; p(0,x,t_1,y_1)\,p(t_1,y_1,t_2,y_2)\cdots p(t_{k-1},y_{k-1},t_k,y_k)\; y_1y_2\cdots y_k \qquad (8)$$
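We decompose the stochastic process $q_t \in \mathbb{C}^n$ into two real processes, $q = q_R + iq_I = (q_R, q_I)$. The representation (6) means that the solution of the Schrödinger equation can be defined by the transition function of the Markov process $q \in \mathbb{R}^{2n}$

$$\phi_t(x) = E[\phi(q_t(x))] = \int dy_R\,dy_I\; p(0,(x,0),t,(y_R,y_I))\,\phi(y_R + iy_I) \qquad (9)$$

On the other hand from eq.(6) it follows that we can express

$$\phi_t = \exp\Big(-\frac{i}{\hbar}t\tilde H\Big)\phi \qquad (10)$$

where $\tilde H = \exp(S/\hbar)\,H\exp(-S/\hbar)$ (energies being counted from the ground state energy $E_0$). Namely

$$\phi_t(x) = \int dx'\,K_t(x,x')\,\phi(x') \qquad (11)$$

As a consequence we can represent the correlation functions (8) in the form (again in the one-dimensional notation)

$$E[q_{t_1}(x)\cdots q_{t_k}(x)] = \int dx_1\cdots dx_k\; K_{t_1}(x,x_1)K_{t_2-t_1}(x_1,x_2)\cdots K_{t_k-t_{k-1}}(x_{k-1},x_k)\; x_1\cdots x_k \qquad (12)$$

Let us note that

$$\exp\Big(-\frac{i}{\hbar}t\tilde H\Big) = \exp\Big(\frac{S}{\hbar}\Big)\exp\Big(-\frac{i}{\hbar}tH\Big)\exp\Big(-\frac{S}{\hbar}\Big) \qquad (13)$$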
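Hence, we can express the correlation functions (8) and (12) directly by the eigenfunctions $\psi_n$ and eigenvalues $\epsilon_n$ of $H$, i.e., if (where $\psi_0 = \exp(-S/\hbar)$)

$$H\psi_n = \epsilon_n\psi_n$$

then (we assume for simplicity that the spectrum is purely discrete)

$$K_t(x,x') = \exp\Big(\frac{S(x)}{\hbar}\Big)\sum_n \psi_n(x)\psi_n(x')\exp\Big(-\frac{i}{\hbar}\epsilon_n t\Big)\exp\Big(-\frac{S(x')}{\hbar}\Big) \qquad (14)$$

Inserting this expression for the kernel $K$ into the formula (12) we can represent the correlation functions of the process $q_t$ by the solution of the quantum mechanical eigenvalue problem ($0 < t_1 < t_2 < \ldots < t_k$)

$$E[q_{t_1}(x)\cdots q_{t_k}(x)] = \exp\Big(\frac{S(x)}{\hbar}\Big)\sum_{n_1,\ldots,n_k} x_{n_1n_2}x_{n_2n_3}\cdots x_{n_k0}\,\psi_{n_1}(x)\exp(-it_1\nu_{n_1n_2} - it_2\nu_{n_2n_3} - \ldots - it_k\nu_{n_k0}) \qquad (15)$$

where

$$x_{nm} = \int dx\,\psi_n(x)\,x\,\psi_m(x) \qquad (16)$$

and

$$\nu_{nm} = (\epsilon_n - \epsilon_m)/\hbar \qquad (17)$$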
We can also express the characteristic function of the process $q_t$ by means of a solution of a quantum mechanical problem. For this purpose let us consider a perturbation of the Hamiltonian $H$ by a time-dependent electric field $f(t)$

$$i\hbar\,\partial_t\psi_t = (H + gf(t)x)\psi_t \qquad (18)$$

In an elementary approach to the problem one applies an expansion into the basis of unperturbed eigenstates $\psi_n$

$$\psi_t = \sum_n a_n(t)\psi_n$$

Then, eq.(18) leads to an equation for $a_n$

$$i\hbar\frac{da_n}{dt} = \epsilon_na_n + gf\sum_m x_{nm}a_m \qquad (19)$$

The solution of eq.(18) can also be expressed by the stochastic process $q_t(x)$. Assume that the initial wave function is $\psi = \exp(-S/\hbar)\phi$. Then, its time evolution reads

$$\psi_t(x) = \exp\Big(-\frac{iE_0t}{\hbar}\Big)\exp\Big(-\frac{1}{\hbar}S(x)\Big)E\Big[\exp\Big(-\frac{ig}{\hbar}\int_0^t f(t-s)\,q_s(x)\,ds\Big)\phi(q_t(x))\Big] \qquad (20)$$
If we consider an evolution of the ground state $\chi$ under the action of an external electric field $gf(t)$ then from eq.(20)

$$\chi_t^f(x) = \exp\Big(-\frac{iE_0t}{\hbar}\Big)\exp\Big(-\frac{1}{\hbar}S(x)\Big)E\Big[\exp\Big(-\frac{ig}{\hbar}\int_0^t f(t-s)\,q_s(x)\,ds\Big)\Big] \qquad (21)$$

Eq.(21) determines the characteristic function of the process by the time evolution of the ground state. We can apply for the computation of $\chi_t^f(x)$ the Feynman formula

$$\chi_t^f(x) = E\Big[\exp\Big(-\frac{i}{\hbar}\int_0^t V(x+\lambda\sigma b_s)\,ds\Big)\exp\Big(-\frac{ig}{\hbar}\int_0^t f(t-s)(x+\lambda\sigma b_s)\,ds - S(x+\lambda\sigma b_t)/\hbar\Big)\Big]$$

The correlation functions of $q$ at different times can be computed from $\chi_t^f$ by means of a functional differentiation. Note that these correlation functions can be calculated from the conventional perturbation theory, i.e., solving the differential equation (19) till the $k$-th order in $g$ with the initial condition $a_n(0) = \delta_{n0}$.

We can obtain a solution $\chi_t^f(z)$ of the Schrödinger equation (18) in an external field $f$ as well as the solution $q_t(z)$ of the stochastic equation (5) for a complex $z \in \mathbb{C}^n$. Hence, the random map $z \to q_t(z)$ is determined by a solution of the quantum mechanical problem. Note, however, that we are unable to derive from quantum mechanics either the correlation functions between the real and the imaginary parts of $q_t(z)$ or the correlation functions of $q_t(z)$ and $q_t(z')$ if $z \neq z'$. Hence, a solution of the stochastic equation (5) gives more information than results from quantum mechanics and (presumably) more than is needed for quantum mechanics.

3 Quantum oscillators
The Feynman integral (1) represents the quantum dynamics as a perturbation of the free motion. In many applications (in particular to quantum field theory13) it is useful to take the discrete spectrum explicitly into account. So, if the potential is growing to plus infinity then we separate its quadratic part writing the Hamiltonian in the form (for simplicity of the notation we restrict ourselves to one degree of freedom)

$$H = H_0 + V = \frac{1}{2m}p^2 + \frac{m\omega^2}{2}x^2 + V(x) \qquad (22)$$

where $H_0$ denotes the Hamiltonian of an oscillator.
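The ground state $\chi$ of $H_0$ is

$$\chi(x) = \exp\Big(-\frac{m\omega x^2}{2\hbar}\Big) \qquad (23)$$

The stochastic equation (5) for the harmonic oscillator reads

$$dq = -i\omega q\,dt + \lambda\sigma\,db \qquad (24)$$

Let $q_t(x)$ be the solution of the stochastic equation with the initial condition $x$. Then, the Schrödinger time evolution of the harmonic oscillator in the state $\psi = \chi\phi$ is

$$\psi_t(x) = \Big(\exp\Big(-\frac{i}{\hbar}tH_0\Big)\psi\Big)(x) = \chi(x)E[\phi(q_t(x))] \qquad (25)$$

where

$$q_t(x) = \exp(-i\omega t)\,x + \lambda\sigma\int_0^t \exp(-i\omega(t-s))\,db(s)$$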
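The complex process defined by (24) is easy to simulate. The following Python sketch integrates (24) with the Euler-Maruyama scheme and compares the sample mean and second moment of $q_t(x)$ with the values implied by the explicit Gaussian solution, namely $E[q_t] = e^{-i\omega t}x$ and $E[q_t^2] = e^{-2i\omega t}x^2 + \sigma^2(1 - e^{-2i\omega t})/(2\omega)$. The parameter values ($\omega = \sigma = 1$), the step and path counts, the seed and the function names are assumptions made only for this illustration.

```python
import cmath
import math
import random

OMEGA = 1.0                             # oscillator frequency (assumed value)
SIGMA = 1.0                             # sigma = sqrt(hbar/m) (assumed value)
LAM = (1 + 1j) / math.sqrt(2.0)         # lambda^2 = i


def simulate_q(x, t, n_steps, rng):
    """One Euler-Maruyama path of dq = -i*omega*q dt + lambda*sigma db."""
    dt = t / n_steps
    root_dt = math.sqrt(dt)
    q = complex(x)
    for _ in range(n_steps):
        db = root_dt * rng.gauss(0.0, 1.0)
        q += -1j * OMEGA * q * dt + LAM * SIGMA * db
    return q


def estimate_moments(x, t, n_paths, n_steps, rng):
    """Sample estimates of E[q_t(x)] and E[q_t(x)^2]."""
    first = 0j
    second = 0j
    for _ in range(n_paths):
        q = simulate_q(x, t, n_steps, rng)
        first += q
        second += q * q
    return first / n_paths, second / n_paths


def exact_mean(x, t):
    return cmath.exp(-1j * OMEGA * t) * x


def exact_second_moment(x, t):
    phase = cmath.exp(-2j * OMEGA * t)
    return phase * x * x + SIGMA ** 2 * (1.0 - phase) / (2.0 * OMEGA)


rng = random.Random(7)
mean, second = estimate_moments(1.0, 1.0, 5000, 100, rng)
print("mean   :", mean, " expected:", exact_mean(1.0, 1.0))
print("second :", second, " expected:", exact_second_moment(1.0, 1.0))
```

For $\phi(x) = x$ the mean reproduces the factor $e^{-i\omega t}$ by which the first excited state $x\chi(x)$ rotates relative to the ground state. The Euler-Maruyama scheme is one possible discretisation; since the equation is linear, the Gaussian solution could also be sampled exactly.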
We obtain now the following Feynman formula.

Theorem 2 Consider the Hamiltonian $H$ (eq.(22)). Assume that the initial state $\psi$ is of the form $\psi = \chi\phi$ where $\chi$ is the ground state of $H_0$ and $\phi$ is square integrable. Assume that

$$\exp\Big(-\frac{i}{\hbar}\int_0^t V(q_\tau(x))\,d\tau\Big)$$

is integrable. Then, the solution of the Schrödinger equation ($t > 0$)

$$i\hbar\,\partial_t\psi_t = H\psi_t$$

with the initial condition $\psi$ has the probabilistic representation

$$\psi_t(x) = (U_t\psi)(x) = \chi(x)E\Big[\exp\Big(-\frac{i}{\hbar}\int_0^t V(q_\tau(x))\,d\tau\Big)\phi(q_t(x))\Big] \qquad (26)$$

Proof: by a similarity transformation we obtain (with $d = d/dx$)

$$\tilde H = \chi^{-1}H\chi = \tilde H_0 + V = -\frac{\hbar^2}{2m}d^2 + \hbar\omega xd + V(x)$$

$\tilde H_0$ is (up to the factor $-i/\hbar$) the generator of the process $q_t$. Then, the formula (26) is the standard formula for a perturbation of the semigroup generated by $\tilde H_0$ by a potential $-\frac{i}{\hbar}V$ as expressed by a multiplicative functional. It can be checked by elementary direct differentiation that $\partial_t\psi_t|_{t=0} = -\frac{i}{\hbar}H\psi$. That $\psi_t$ satisfies the Schrödinger equation for arbitrary $t$ (under the assumption that the integrand
in eq.(26) is well-defined) follows from the composition law of multiplicative functionals (the Markov property).

We wish to derive an explicit Feynman integral formula for the evolution kernel of the operator (26). Let us start with a definition of the Gaussian process ($0 \le \tau \le t$)

$$Q^t(\tau) = \alpha^t(\tau) + \lambda\sigma\gamma^t(\tau) \qquad (27)$$

where $\alpha^t$ is the solution of the oscillator equation

$$\Big(\frac{d^2}{d\tau^2} + \omega^2\Big)\alpha^t(\tau) \equiv -\mathcal{L}\alpha^t(\tau) = 0 \qquad (28)$$

with the boundary conditions $\alpha^t(0) = y$ and $\alpha^t(t) = y'$, explicitly

$$\alpha^t(\tau) = \frac{\sin(\omega(t-\tau))}{\sin(\omega t)}\,y + \frac{\sin(\omega\tau)}{\sin(\omega t)}\,y'$$

$\gamma^t$ is the Gaussian process (satisfying the zero boundary conditions $\gamma^t(0) = \gamma^t(t) = 0$) with mean zero and the covariance (this is an analytic continuation in time of a formula of ref.12)

$$E[\gamma^t(\tau)\gamma^t(\tau')] = M(\tau,\tau') \qquad (29)$$

where

$$\mathcal{L}_\tau M(\tau,\tau') = \Big(-\frac{d^2}{d\tau^2} - \omega^2\Big)M(\tau,\tau') = \delta(\tau-\tau') \qquad (30)$$

i.e., $\mathcal{L}M = 1$. $M$ is the Green function of the operator $\mathcal{L}$ with the Dirichlet boundary conditions on the interval $[0,t]$. For $t < \pi/\omega$ the kernel $M$ is positive definite. The explicit formula for $M$ and $\tau' > \tau$ reads

$$M(\tau,\tau') = \big(\sin(\omega(t-\tau'))\cos(\omega(t-\tau)) - \cot(\omega t)\sin(\omega(t-\tau))\sin(\omega(t-\tau'))\big)/\omega \qquad (31)$$

The unitary time evolution can be expressed by the evolution kernel

$$\psi_t(y) = \int dy'\,K(t,y,y')\,\psi(y')$$

We can now represent the evolution kernel by the bridge $Q$. Let $0 < t < \pi/\omega$; then the integral kernel of the unitary evolution operator $U_t$ of Theorem 2 (under the assumptions of this theorem) has the representation

$$K(t,y,y') = K_0(t,y,y')\,E\Big[\exp\Big(-\frac{i}{\hbar}\int_0^t V(Q^t(\tau))\,d\tau\Big)\Big] \qquad (32)$$
where $K_0$ is the Feynman kernel of $\exp(-iH_0t/\hbar)$

$$K_0(t,x,y) = (2\pi i\sigma^2\sin(\omega t)/\omega)^{-1/2}\exp\Big(\frac{i\omega}{2\sigma^2}\big(x^2\cot(\omega t) + y^2\cot(\omega t) - 2xy/\sin(\omega t)\big)\Big) \qquad (33)$$

In order to obtain a formula for larger $t$ we may use the composition rule $U_{t+s} = U_tU_s$.
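The Green function (31) can be checked directly against its defining properties (30). The Python sketch below compares (31) with the equivalent product form $\sin(\omega\tau_<)\sin(\omega(t-\tau_>))/(\omega\sin(\omega t))$, verifies the Dirichlet boundary conditions, and estimates by finite differences the jump $-1$ of $\partial_\tau M$ across $\tau = \tau'$ that corresponds to the delta function in (30). The function names, the step size and the parameter values are assumptions of this illustration.

```python
import math


def green_31(tau, tau_p, t, omega):
    """Formula (31); written for tau_p >= tau and extended by symmetry."""
    if tau_p < tau:
        tau, tau_p = tau_p, tau
    first = math.sin(omega * (t - tau_p)) * math.cos(omega * (t - tau))
    second = (math.sin(omega * (t - tau)) * math.sin(omega * (t - tau_p))
              / math.tan(omega * t))
    return (first - second) / omega


def green_product(tau, tau_p, t, omega):
    """Equivalent product form of the Dirichlet Green function."""
    lo, hi = min(tau, tau_p), max(tau, tau_p)
    return (math.sin(omega * lo) * math.sin(omega * (t - hi))
            / (omega * math.sin(omega * t)))


def derivative_jump(tau_p, t, omega, h=1e-5):
    """Finite-difference jump of dM/dtau across tau = tau_p."""
    centre = green_31(tau_p, tau_p, t, omega)
    right = (green_31(tau_p + h, tau_p, t, omega) - centre) / h
    left = (centre - green_31(tau_p - h, tau_p, t, omega)) / h
    return right - left


t, omega = 1.0, 1.0                     # satisfies t < pi/omega
for tau, tau_p in [(0.2, 0.7), (0.7, 0.2), (0.5, 0.5), (0.1, 0.9)]:
    print(tau, tau_p, green_31(tau, tau_p, t, omega),
          green_product(tau, tau_p, t, omega))
print("jump of dM/dtau:", derivative_jump(0.4, t, omega))
```

The product form makes it evident that $M$ is positive on the open square $(0,t)^2$ when $t < \pi/\omega$, which is consistent with the positivity statement above.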
4 The Feynman integral at finite temperature
The quantum mechanics of an isolated system represented by a pure state describes an idealization of real systems. The physical systems encountered in applications usually are in equilibrium with an environment. The equilibrium is described by the Gibbs state

$$\rho_\beta = (\mathrm{Tr}(\exp(-\beta H)))^{-1}\exp(-\beta H)$$

where $1/\beta = kT$, $T$ is the temperature and $k$ is the Boltzmann constant. The quantum statistical correlation functions of observables $A$ and $B$ are expressed by a formula

$$\langle AB(t)\rangle_\beta = (\mathrm{Tr}[\exp(-\beta H)])^{-1}\,\mathrm{Tr}[A\exp(itH/\hbar)B\exp(-itH/\hbar)\exp(-\beta H)] \qquad (34)$$

We consider the Hamiltonian of the form (22). We can see that for a probabilistic representation of the correlations (34) we need a probabilistic representation of the operator $U_z = \exp(-zH/\hbar)$ where $z = \beta\hbar + it$ with $\beta > 0$. Let us denote $\psi_z = U_z\psi$; then $\psi_z$ satisfies the equation

$$-\hbar\,\partial_z\psi_z = H\psi_z \qquad (35)$$

We wish to express the solution of eq.(35) by the Wiener integral. We consider a complex path $z(\tau) = \beta(\tau)\hbar + it(\tau)$ in the right half-plane such that $z(0) = 0$ and $z(s) = z = \beta\hbar + it$. We define

$$\Big(\frac{dz}{d\tau}\Big)^{1/2} = \exp\Big(\frac{i\pi}{4}(1-\epsilon(\beta'))\Big)(\hbar|\beta'|)^{1/2}\Big(1 + \frac{i}{\hbar}\frac{dt}{d\beta}\Big)^{1/2} \qquad (36)$$
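where $\epsilon(\beta')$ is equal to $+1$ if the derivative $\beta'$ is positive and $-1$ in the opposite case (clearly $\frac{dt}{d\beta} = t'/\beta'$). The square root of the last term in eq.(36) is defined by an analytic continuation of the power series defining $(1+\zeta)^{1/2}$. We can generalize Theorem 2 to complex time. Let $\chi$ be the ground state of the oscillator and $\psi = \chi\phi$. Then

$$\psi_z(x) = \chi(x)E\Big[\exp\Big(-\frac{1}{\hbar}\int_0^s V(q_z(\tau))\,dz(\tau)\Big)\phi(q_z(s))\Big] \qquad (37)$$

where

$$q_z(\tau) = \exp(-\omega z(\tau))\,x + \sigma\int_0^\tau \exp(-\omega(z(\tau) - z(\tau')))\Big(\frac{dz}{d\tau'}\Big)^{1/2}db(\tau') \qquad (38)$$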
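If the exponential factor in eq.(37) is integrable then the proof of eq.(37) follows by a formal differentiation using the Ito formula (in the same way as the proof of Theorem 2). In fact, the formula (37) for $\beta > 0$ can be proved for a larger class of potentials than eq.(26) for $\beta = 0$ (see 14).

We can repeat now the derivation of the stochastic equation of sec.2. Let $\chi$ be the ground state of $H$. Assume the initial condition $\psi = \chi\phi$. If $\psi_z = U_z\psi$ satisfies the Schrödinger equation (35) then we obtain the following equation for $\phi_z = \chi_z^{-1}\psi_z$ (with $\chi_z = U_z\chi$)

$$\partial_z\phi_z = \frac{\hbar}{2m}\Delta\phi_z - \frac{1}{m}\nabla S\cdot\nabla\phi_z \qquad (39)$$

Let $q$ be the solution of the stochastic equation

$$dq(\tau) = -\frac{1}{m}\nabla S(q)\,dz(\tau) + \sigma\Big(\frac{dz}{d\tau}\Big)^{1/2}db(\tau) \qquad (40)$$

with the initial condition $q|_{z=0} = x$. The solution of eq.(39) gives

$$\psi_z(x) = \chi_z(x)E[\phi(q_z(x))] \qquad (41)$$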
The solution (38) for the harmonic oscillator is an example of an application of the formulae (40)-(41). In order to compute the expectation value (34) in the Gibbs state we need an evolution kernel $K(z,y,y')$ of $U_z$ (i.e., $\langle y|U_z|y'\rangle$). It can be obtained by means of an analytic continuation of the formula (32)

$$K(z,y,y') = K_0(z,y,y')\,E\Big[\exp\Big(-\frac{1}{\hbar}\int_0^s V(Q_z(\tau))\,dz(\tau)\Big)\Big] \qquad (42)$$

where $K_0$ is the Feynman kernel of $\exp(-H_0z/\hbar)$

$$K_0(z,x,y) = (2\pi\sigma^2\sinh(\omega z)/\omega)^{-1/2}\exp\Big(-\frac{\omega}{2\sigma^2}\big(x^2\coth(\omega z) + y^2\coth(\omega z) - 2xy/\sinh(\omega z)\big)\Big) \qquad (43)$$

and $Q$ is constructed from $\alpha$ and $\gamma$

$$Q_z(\tau) = \alpha^z(\tau) + \sigma\gamma^z(\tau)$$

with

$$\alpha^z(\tau) = \frac{\sinh(\omega(z-z(\tau)))}{\sinh(\omega z)}\,y + \frac{\sinh(\omega z(\tau))}{\sinh(\omega z)}\,y'$$

$\gamma^z$ is the Gaussian process fulfilling the zero boundary conditions $\gamma^z(0) = \gamma^z(s) = 0$ with mean zero and the covariance

$$E[\gamma^z(\tau)\gamma^z(\tau')] = M(\tau,\tau') \qquad (44)$$
where ( for r ' > r ) M{T, T') - sinh (w(z - z (r'))) cosh (UJ(Z - z (T))) /W , ._> — coth (wz) sinh (w (z — 2: (r)))sinh(ui (z — z(r'))) /w ^ We are interested in the computation of the correlation functions of the position operators x(i) (in the Heisenberg picture) (xx{t))0 = (Tr (exp (—/?#)))
(46)
j dxdyxyK(t, y, x)K(z, y, x)
For the harmonic oscillator we compute easily (using the formulae (33) and (43)) (x(O)x(t))0 = ^ j (coth.(/ftjft/2) cos(wt) - isin(wt)\
^ '
This is the famous formula for the frequency distribution of the quantum thermal noise 15 . In general we can write (A(x)B(x(t)))0 = (Tr (exp (-0H)))'11
dxdyA(x)B(y)K(t,
y, x)K(z, y, x)
{
™>
In the classical limit we obtain (by means of a formal saddle point method, see 16 ) {A(x)B(x(t)))0 = JdxdPA(x)B (yt (*,p))exp(-/H/(s,p))
l
*y;
where $y_t(x,p)$ is the solution of the Newton equations with the initial position $x$ and the initial momentum $p$, and $H(x,p)$ is the classical Hamiltonian. The problem (48) is interesting from the point of view of the relation between the classical tunneling at finite temperature and the quantum tunneling (relevant for a quantum theory of chemical reactions). For this purpose one has to take $A$ concentrated on one side of the potential barrier and $B$ concentrated on the other side. There is a finite tunneling in the classical limit because of the heat bath. The classical tunneling rate behaves as $\exp(-c\beta)$. The quantum tunneling rate at zero temperature behaves as $\exp(-c/\hbar)$. It seems that the quantum formula at finite temperature is unknown and could be estimated from the Feynman integral.

Acknowledgement
This paper is dedicated to Ludwig Streit on the occasion of his 60th birthday. It has been a wonderful experience supporting my interest in stochastic methods to visit him at the stochastic center (BiBoS) in Bielefeld.

References
1. R.P. Feynman, Phys. Rev. 80, 440 (1950)
2. S. Albeverio and R. Hoegh-Krohn, Mathematical Theory of Feynman Integrals, Springer, 1976
3. S. Albeverio and Z. Brzezniak, Journ. Func. Anal. 113, 117 (1993)
4. L. Streit and T. Hida, Stoch. Proc. Appl. 16, 55 (1983)
5. T. Hida, H.H. Kuo, J. Potthoff and L. Streit, White Noise Analysis, Kluwer, Dordrecht, 1993
6. R.H. Cameron, J. Math. and Phys. 39, 126 (1960)
7. H. Doss, Commun. Math. Phys. 73, 247 (1980)
8. R. Azencott and H. Doss, in Lecture Notes in Math., No. 1109, Springer, 1985
9. Z. Haba, Feynman Integral and Random Dynamics in Quantum Physics, Kluwer, Dordrecht, 1999
10. R. Carmona, in Proceedings of the Taniguchi Symposium, ed. K. Ito and N. Ikeda, Academic Press, New York, 1987
11. E. Carlen, Commun. Math. Phys. 94, 23 (1984)
12. I. Simao, Stochastic Analysis and Appl. 9, 85 (1991)
13. Z. Haba, Journ. Math. Phys. 39, 1766 (1998)
14. J. Feldman, Trans. Amer. Math. Soc. 108, 251 (1963)
15. H.B. Callen and T.A. Welton, Phys. Rev. 83, 34 (1951)
16. L. Dolan and J. Kiskis, Phys. Rev. D20, 505 (1979)
NON-MARKOVIAN RANDOM FLOWS AND HEDGING STOCKS IN MANUFACTURING PROCESSES
M.-O. HONGLER
Département de Microtechnique, Institut de Production Microtechnique, E.P.F.L., CH-1015 Lausanne, Switzerland
E-mail: [email protected]
We consider a single production device which delivers a single type of item. The operating policy is of the make-to-stock type with a single hedging point. We calculate the optimal position of the hedging level in two different contexts: a) a discrete flow model describing a reliable machine operating with a random cycle time and serving a non-markovian random demand and b) a fluid model (i.e. continuous flow) describing a failure prone machine with non-markovian dynamics serving a constant demand. The influence of the variability of the underlying stochastic processes on the position of the hedging level is explicitly observable, as closed form expressions can be derived for these simple configurations.
1 Introduction
To take the appropriate decisions in a random environment is a crucial problem in our everyday life. In the industrial context, where random flows of produced goods and their associated demands are ubiquitous, the production manager strongly relies on powerful and reliable decision strategies. In this manufacturing context, we will focus on one of the simplest cases, which consists of a machine M able to produce a single type of item. Even in this elementary configuration, several sources of randomness can be identified, namely: random time to complete the production of one item (also called random cycle time), random availability of the machine due to failures, randomness in the external demands for the item, etc. To absorb the effects of these fluctuations, it is common for production managers to regulate their production by using a make-to-stock (MTS) production policy. With MTS, one keeps inventories of finished items, and the strategy is to optimally determine the size of these stocks and to specify the associated feeding policy. The optimization is performed to minimize the production costs which arise, on one hand, from the very presence of the storage zones and, on the other hand, from the fact that the demands are not always immediately fulfilled. In the models studied below, we will always assume that unsatisfied demands are backlogged. Two situations will be analyzed:
• a) The demands arrive according to a general renewal process and the number of items produced during a time interval also follows a general renewal process. We assume that the machine M is here perfectly reliable. The sources of fluctuations in this case are entirely due to the random interarrival times between two successive demands and to the random cycle time. The dynamics is therefore intrinsically discrete and will be observed to be closely related to the G/G/1 queueing process. This model is discussed in section 2.
• b) In this second model, we assume that the production and demand flows are continuous (i.e. we have a fluid model). The machine is assumed to be failure prone. The random time interval between successive failures and the random time needed to repair a failure are the sources of fluctuations. The operating state of the machine is either "on" or "off" and the dynamics will be described by a general alternating renewal process. This is a non-markovian dam model whose behavior will be discussed in section 3.
When the underlying stochastic processes are markovian (i.e. M/M/1 queue and markovian alternating renewal process), both situations have been studied in the literature [Ake 86], [Bie 88] and [Vea 96]. It has been established that the optimal policy is of the hedging point type (H-type). Operating under an H-type policy means: produce until the level of finished items reaches the optimal level $z^*$ (i.e. the hedging point) and stop the production if the level exceeds $z^*$. Roughly speaking, the level $z^*$ acts as a reflecting barrier for the stochastic process which describes the level of items in the inventory. In this paper, we focus on one aspect of the generalization to non-markovian dynamics. When non-markovian processes are introduced, the simple H-type policy does not remain optimal. The underlying memory effects preclude the existence of a single hedging point (hedging lines need to be introduced to deal with the "aging" in the states of the underlying stochastic processes). Accordingly, the optimal policy becomes very complex, too complex in fact to be useful for actual implementations. Due to its simplicity, the single point H-type policy is very appealing and, despite the fact that it is only suboptimal for non-markovian dynamics, it is often adopted as a working tool for actual applications. When an H-type policy is adopted for non-markovian dynamics, it remains to determine the optimal position of the hedging level $z^*$; this is the aim of the present contribution.
2 Reliable machine and random demands - Discrete flow model
Assume that a machine M is able to produce a single type of item. The time interval between two successive completions of this item (i.e. the cycle time) is a (positive) random variable described by a probability distribution $G(x)$. The demands d for the produced items arrive at random times and the time interval between two consecutive demands is described by a probability distribution $F(x)$. For the distributions $F(x)$ and $G(x)$, we introduce the notations:

$$\frac{1}{\mu} = \int_0^\infty x\,dG(x), \quad \sigma_G^2 = \int_0^\infty x^2\,dG(x) - \frac{1}{\mu^2} \quad \text{and} \quad CV_G^2 = \mu^2\sigma_G^2 \qquad (1)$$

and

$$\frac{1}{\lambda} = \int_0^\infty x\,dF(x), \quad \sigma_F^2 = \int_0^\infty x^2\,dF(x) - \frac{1}{\lambda^2} \quad \text{and} \quad CV_F^2 = \lambda^2\sigma_F^2. \qquad (2)$$

Clearly the coefficients of variation $CV_G$ and $CV_F$, defined for positive random variables, directly measure the strength of the fluctuations (note indeed that $CV^2$ vanishes in deterministic cases). Let us denote by $X(t) \in \mathbb{Z}$ the surplus process. The process $X(t)$ describes the difference between the numbers of produced and delivered items. We assume that a finished item is immediately delivered when a demand occurs. Accordingly, when the process $X(t)$ is positive, we have stored items in the finished goods inventory. On the other hand, when $X(t)$ is negative, backlog is present. We shall restrict ourselves to traffic intensity below unity (i.e. $\rho = \lambda/\mu < 1$). This guarantees the ergodicity of the process. Indeed for $\rho < 1$, the flow of demand arrivals is, on average, smaller than the production flow of items; therefore a stationary regime will be reached. Assume now that we produce under a hedging type (H-type) policy with a hedging level at position $z \in \mathbb{N}$. Hence, the machine M is operating as long as the surplus level $X(t)$ is below the level $z$ and M is shut down whenever $X(t) \geq z$. Under this policy and when time $t \to \infty$ we clearly have $X(t) \in (-\infty, z]$. The costs incurred by the storage or the backlog of items will be modeled by a convex function $h(x)$ with $h(0) = 0$. This last property indicates that no cost is incurred when the production exactly matches the demand. For simplicity of the calculations, but without loss of generality, we shall choose $h(x)$ in the form:

$$h(x) = c^- x^- + c^+ x^+, \quad c^\pm > 0 \quad \text{and} \quad x^\pm = \max(0, \pm x). \qquad (3)$$
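To make the role of the squared coefficient of variation concrete, the following minimal sketch estimates $CV^2$ numerically for an Erlang distribution with $k$ phases and mean $1/\lambda$. The function names, the integration grid and the truncation of the integration range are illustrative assumptions, not part of the original text. For this family the estimate should come out close to $1/k$, so that $k = 1$ (exponential) gives $CV^2 = 1$ and large $k$ approaches the deterministic case.

```python
import math

def erlang_pdf(x, k, lam):
    """Density of an Erlang-k law with mean 1/lam (each phase has rate k*lam)."""
    rate = k * lam
    return rate**k * x**(k - 1) * math.exp(-rate * x) / math.factorial(k - 1)

def squared_cv(k, lam, n=100_000, span=40.0):
    """Estimate CV^2 = variance / mean^2 by the trapezoidal rule on [0, span/lam]."""
    h = span / lam / n
    m1 = m2 = 0.0
    for i in range(n + 1):
        x = i * h
        weight = 0.5 if i in (0, n) else 1.0   # trapezoidal end-point weights
        p = erlang_pdf(x, k, lam) * weight * h
        m1 += x * p          # first moment
        m2 += x * x * p      # second moment
    variance = m2 - m1**2
    return variance / m1**2

if __name__ == "__main__":
    for k in (1, 2, 5, 20):
        print(k, round(squared_cv(k, lam=2.0), 4))
```

The Erlang family is convenient here because a single integer parameter tunes the variability while the mean stays fixed.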
When running this system during a large production horizon $T$, a stationary regime is attained. For $T \to \infty$, the average total production costs $J(z)$ will be independent of the initial conditions. Moreover, ergodicity implies that we have:

$$J(z) = \lim_{T\to\infty} \frac{1}{T}\int_0^T h(X(t))\,dt = \sum_{x=-\infty}^{z} h(x)P_z(x), \qquad (4)$$

where $P_z(x)$ is the stationary probability measure of the process $X(t)$ having a hedging barrier at $z$. Knowing $P_z(x)$ in Eq.(4), it is possible to explicitly calculate $J(z)$, and the optimal hedging level $z^*$ which minimizes $J(z)$ will then be found by solving:

$$\frac{d}{dz}J(z)\Big|_{z=z^*} = 0. \qquad (5)$$
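The minimization of $J(z)$ can be illustrated by a direct search. The sketch below assumes, purely for illustration, the geometric stationary law $P_z(x) = (1-\rho)\rho^{z-x}$ of a markovian queue and the piecewise linear cost of Eq.(3); all function names are hypothetical. It compares a brute-force search over integer levels with the level obtained from the sign of the first difference $J(z+1) - J(z) = c^+ - (c^+ + c^-)\rho^{z+1}$, which follows from the assumed law.

```python
import math

def holding_backlog_cost(x, c_minus, c_plus):
    """Piecewise linear cost h(x) = c_minus * max(0, -x) + c_plus * max(0, x)."""
    return c_minus * max(0, -x) + c_plus * max(0, x)

def average_cost(z, rho, c_minus, c_plus, n_terms=5000):
    """J(z) = sum over xi >= 0 of h(z - xi) * (1 - rho) * rho**xi (truncated sum)."""
    return sum(holding_backlog_cost(z - xi, c_minus, c_plus) * (1 - rho) * rho**xi
               for xi in range(n_terms))

def best_level_by_search(rho, c_minus, c_plus, z_max=200):
    """Integer level in [0, z_max] with the smallest average cost."""
    return min(range(z_max + 1), key=lambda z: average_cost(z, rho, c_minus, c_plus))

def best_level_by_first_difference(rho, c_minus, c_plus):
    """Smallest z >= 0 such that c_plus - (c_plus + c_minus) * rho**(z + 1) >= 0."""
    ratio = c_plus / (c_plus + c_minus)
    return max(0, math.ceil(math.log(ratio) / math.log(rho)) - 1)

if __name__ == "__main__":
    print(best_level_by_search(0.8, c_minus=10.0, c_plus=1.0))
    print(best_level_by_first_difference(0.8, c_minus=10.0, c_plus=1.0))
```

Since $J$ is convex in $z$ under these assumptions, the first sign change of the difference locates the minimum, and the search is only a cross-check.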
To determine $P_z(x)$, we define a new stochastic process $\Xi(t) = z - X(t) \Rightarrow \Xi(t) \in [0, \infty)$. Clearly the dynamics of the process $\Xi(t)$ is identical to the queue length of the F/G/1 non-markovian model. Hence, the stationary measure $P_z(x)$ appearing in Eq.(4) derives directly from the stationary measure for the queue length of the F/G/1 process (see [Coh 82] for a detailed exposition devoted to queueing processes). In particular, for the markovian situation which is obtained when $F(x) = 1 - e^{-\lambda x}$ and $G(x) = 1 - e^{-\mu x}$, the stationary measure of the markovian queue length is given by: $\phi(\xi) = (1 - \rho)\rho^{\xi}$, $\xi \in \mathbb{N}$. This implies directly that $P_z(x) = (1 - \rho)\rho^{z-x}$. Using Eqs.(4) and (5), we shall end with the result:

$$z^* = \max\left\{0, \left\lceil \frac{\ln\Gamma}{\ln\rho} \right\rceil - 1\right\}, \qquad (6)$$

where $\lceil z \rceil \in \mathbb{N}$ stands for the upper integer part of $z$ and

$$\Gamma = \frac{c^+}{c^+ + c^-}. \qquad (7)$$

The expression Eq.(6) has been recently derived in [Vea 96].

2.1 Non-markovian queues
The stationary measure for the queue length of the general F/G/1 non-markovian queue is in general difficult to work out. In [Leb 88], the author develops a method to calculate this measure for PH/PH/1 queues, where PH stands for phase type distributions. The PH/PH/1 queues cover a very wide subclass of the F/G/1 dynamics of interest. When $F(x)$ is a phase type distribution, its Laplace transform can be written as:

$$f(s) = \int_0^\infty e^{-sx}\,dF(x) = \frac{f_N(s)}{f_D(s)}. \qquad (8)$$
According to [Leb 88], the stationary measure for the queue length of the PH/PH/1 queue with distributions $F(x)$ and $G(x)$ has the form:

$$\phi(0) = 1 - \rho = 1 - \frac{\lambda}{\mu} \qquad (9)$$
and *(0 = « 5 > i » ? | ,
Vf=(z-x)>l,
£€N,
(10)
j=0
with (11)
/o(*i) (l-*i)&v(*j)
n
(12)
jf=wi^"
z
*
(13)
K= p
and $z_j$, $j = 1, 2, \ldots, J$, are the complex solutions with positive real parts of the equation:

$$f_N(s)g_N(s) - f_D(s)g_D(s) = 0. \qquad (14)$$
Illustration Let us explicitly work out the case resulting when F(x) is an Erlang distri bution with k phases and mean ^ and G(x) = 1 - e~,lx. For this choice, we directly have: fN{s) = Afc, gN{s) = n, fD{s) = (A* - s)k and gD{s) = (/i - *)•
(15)
In this case, Eq.(14) admits only one solution. Hence J - 1 in Eqs.(lO), (12) and (13). Accordingly >(£) and therefore P*(x) take very simple forms which, once introduced into Eq.(5), give a closed form for the optimal hedging level z* (see [Cip 99] for details of the calculations), namely: CV^lnp z* = max < 0, [ ln(l + f i £ £ ) L
h^} InpJ
(16)
and z\ being in this case the single solution of the equation: \kH-(\k-z)k{ti
+ z) = 0.
(17)
From Eq.(16), the following observations can be made:
• a) The hedging level is increasing with the coefficient of variation $CV_F$. It is intuitively clear that an increase in the variability of the dynamics requires larger security stocks. Numerical illustrations can be found in [Cip 99].
• b) For $k = 1$, Eq.(17) implies $z_1 = \mu - \lambda$ and Eq.(16) directly reduces to Eq.(6).
• c) For certain values of the parameters, $z^* = 0$. It is therefore optimal to produce without hedging stocks. This policy is commonly known as the just-in-time policy.

3 Failure prone machine and constant demand - Continuous flow model
Let us now study the case where one has a failure prone machine M able to produce a single type of item for which the customers' demand flow rate $d$ is a constant. The output flow of M and the demand flow $d$ are here assumed to be continuous (i.e. we have a fluid model). Due to the failures in the production process, the output flow of M is a random process which we shall model by a general alternating renewal process (ARP) denoted by $\chi(t)$ (see [Ros 82] for details on ARPs). The process $\chi(t)$ takes values in the state space $\Omega = \{0, 1\}$, where the states {0} and {1} respectively indicate that the machine M is in the failed and operating states. When it is failed, the machine M is under repair during a random time. Hence, the sojourn times in the states {0} and {1} are (positive) independent random variables characterized by the distributions $G(x)$ and $F(x)$ respectively. We write $T_R$ and $T_{BF}$ for the random variables respectively describing the repair time and the time between successive failures. We have:

$$\text{Prob}\{0 < T_R < x\} = G(x) \qquad (18)$$

and similarly:

$$\text{Prob}\{0 < T_{BF} < x\} = F(x). \qquad (19)$$

The moments of the distributions $G(x)$ and $F(x)$ are defined as in Eqs.(1) and (2).
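When the machine M is in the operating state (i.e. $\chi(t) = 1$), its throughput (i.e. output rate) will be denoted by $u(t)$. The function $u(t)$ is controllable as $t$ grows and takes values in the range $[0, r]$. With the ARP defined above, the maximal average throughput $\langle r \rangle$ of the machine M will be asymptotically given by:

$$\langle r \rangle = r\,\frac{1/\lambda}{1/\lambda + 1/\mu} = \frac{r\mu}{\lambda + \mu}. \qquad (20)$$

From now on, we assume that the demand rate $d$ is feasible, namely:

$$\langle r \rangle = \frac{r\mu}{\lambda + \mu} > d \quad \Rightarrow \quad \rho = \frac{d}{\langle r \rangle} = \frac{d(\lambda + \mu)}{r\mu} < 1, \qquad (21)$$

where $\rho$ plays the role of the traffic and we have introduced the notation: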
As before, we consider the surplus process $X(t)$. This stochastic process can be represented here as the solution of the stochastic differential equation:

$$\frac{d}{dt}X(t) = u(t)\chi(t) - d, \quad X(t=0) = x_0, \quad \chi(t=0) = \chi_0. \qquad (23)$$
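The surplus dynamics are piecewise linear, so a sample path can be traced exactly once the sequence of operating and failed periods is given. The following minimal sketch does this for a hedging-type control with level $z$. The function name and the input format are hypothetical, and the convention of producing at rate $d$ once the level $z$ is reached (so that the surplus stays at $z$) is an assumption made for this illustration.

```python
def simulate_surplus(periods, r, d, z, x0):
    """Piecewise linear surplus path under a hedging-type control.

    periods: list of (state, duration) with state 1 = operating, 0 = failed.
    Returns the list of (time, surplus) breakpoints. Assumes x0 <= z and r > d.
    """
    t, x = 0.0, float(x0)
    path = [(t, x)]
    for state, duration in periods:
        if state == 1:
            time_to_z = (z - x) / (r - d)      # time needed to reach the hedging level
            if duration <= time_to_z:
                t, x = t + duration, x + (r - d) * duration
            else:
                if time_to_z > 0:
                    path.append((t + time_to_z, float(z)))
                t, x = t + duration, float(z)  # then produce at rate d: surplus stays at z
        else:
            t, x = t + duration, x - d * duration  # machine down: demand depletes the surplus
        path.append((t, x))
    return path

if __name__ == "__main__":
    sample = [(1, 2.0), (0, 1.0), (1, 0.5)]
    for point in simulate_surplus(sample, r=3.0, d=1.0, z=2.0, x0=0.0):
        print(point)
```

Feeding this routine with sojourn times drawn from $F$ and $G$ would give a Monte Carlo estimate of the average cost for any trial level $z$.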
Again, $X(t) > 0$ indicates that finished items are stored, while $X(t) < 0$ indicates that the production process fails to satisfy the existing demand (i.e. backlog is present). Storing or backlogging items incurs costs which are modeled by a cost function $h(x)$ given by Eq.(3). For a production time horizon $T$, the average cumulative cost can be written as:

$$J_u(x_0, \chi_0; T) = \frac{1}{T}\int_0^T dt\, h(X(t)) \quad \text{with } X(0) = x_0,\ \chi(0) = \chi_0. \qquad (24)$$

The problem is again to find an optimal policy $u^*(t)$ which minimizes the cumulative cost. Formally, we must have:

$$J_{u^*}(x_0, \chi_0; T) \leq J_u(x_0, \chi_0; T), \quad \forall u \text{ admissible}. \qquad (25)$$
An admissible policy is non-anticipating and feasible. For a large production horizon $T \to \infty$ and when the ARP $\chi(t)$ is markovian (i.e. when $G(x) = 1 - e^{-\mu x}$ and $F(x) = 1 - e^{-\lambda x}$), the optimal policy was shown to be of the hedging type (H-type) (see [Bie 88]). Hence, we have:

$$u^*(t) = \begin{cases} r & \text{if } X(t) < z^*, \\ 0 & \text{if } X(t) > z^*, \end{cases} \qquad (26)$$

where $z^*$ is the hedging level, which for a cost function of the form given by Eq.(3), has the form:
where $\Gamma$ has been defined in Eq.(7) and $I = \lambda/\mu$ (i.e. the ratio between the mean time to repair and the mean time between failures) is the unavailability factor of the machine M. As in Eq.(6), only the parameters $\mu$ and $\lambda$ appear in Eq.(27). Hence only the average values of $T_R$ and $T_{BF}$ characterize the position of the hedging level $z^*$. This reflects the fact that the coefficients of variation of exponential distributions identically equal unity; they cannot therefore be freely changed. To infer the role of the variability on the hedging level (i.e. changes in the coefficients of variation), we now examine the situation arising when non-markovian ARPs are used to model the operating states of the machine M.

3.1 General alternating renewal processes

As before, the presence of non-markovian dynamics implies that the optimal policy does not obey a simple H-type policy. Indeed, the optimal policy necessarily depends on the "aging" of the processes characterizing the $T_R$ and $T_{BF}$ random variables. This implies that the optimal policy cannot be defined in terms of a single hedging point. Hedging curves rather than single points will occur. The qualitative properties of these curves can be described [Hu 94], but their explicit forms are generally very hard to calculate. For actual applications however, the simplicity of a single point H-type policy, and hence its easy implementation, implies that, even for non-markovian dynamics, we often choose to operate under a simple H-type policy, despite the fact that only sub-optimality will result. This choice made, it remains to calculate the optimal position of the hedging level $z^*$. For simplicity we focus on the case where the variable $T_R$ remains exponentially distributed while $T_{BF}$ obeys a general distribution $F(x)$ with mean $1/\lambda$ and general coefficient of variation $CV_F$ as defined in Eq.(2) (other configurations can also be found in [Cip 98]).
Let us now introduce the following probability distributions:

$$P_0(x, t) = \text{Prob}\{X(t) \leq x,\ \chi(t) = 0\} \qquad (28)$$

and

$$P_1(x; y, t)\,dy = \text{Prob}\{X(t) \leq x,\ y < Y(t) \leq y + dy,\ \chi(t) = 1\}. \qquad (29)$$
The process $Y(t)$ is a supplementary variable which characterizes the "aging" of the operating state of the machine M (i.e. the amount of time that has elapsed since the last entry into the non-exponentially distributed operating state of M). We start by choosing $z$ to be a trial level for a hedging point. Under an H-type policy, the range of the surplus process is $X(t) \in (-\infty, z]$. By definition of the H-type policy, the process $X(t)$ has a "reflecting barrier" at position $z$. This type of random evolution has been considered in [Mil 63] and its result will be directly applicable. As before, we introduce the change of variable defined by:

$$\Xi(t) = z - X(t) \quad \Rightarrow \quad \Xi(t) \in [0, \infty). \qquad (30)$$

With the supplementary variable $Y(t)$, the process $(X(t), Y(t))$ and hence $(\Xi(t), Y(t))$ become markovian. Writing $\xi = (z - x)$, the associated Chapman-Kolmogorov equation takes the form [Mil 63]:
§tpoAZ,t) = -d^PoAU) - Aft,,K,*) + f™rPiAZ,y>t)dv
(3i)
for t ^ | and
^i.,K,»,«) = (r-d)j^.,K,v,*) - J£*U£,y,t) -rPiAtv,*), (32) for 0 < y < t and y ^ | f - 27r-
™-T%><
m
= T,FM-
< 33 >
The boundary conditions are: Pi.»K,V=0,t) = £ J M € , t )
(34)
and PoAW) = i. PiA^y,o) = o(35) For time t -¥ oo, the stationary measures associated with Eqs.(31) and (32) respectively read: lim PoAZ, t) = PoAO =
IK (l — e - °°( z_x )) L , , ,
(36)
and 1(1-
e -ao[(*-x)+y(r-(i)n
£mPllXtf,i/,t)=/i(l-F(y))-i where ao is the positive solution of
—
da0 = it [l - f([r - d]a0)]
i
(37)
(38)
with $f(s) = \int_0^\infty f(y)e^{-sy}\,dy$ being the Laplace transform. Note that Eq.(38) plays here a role similar to Eq.(14) of section 2. Using ergodicity and the cost function defined in Eq.(3), the time integral in Eq.(24) becomes:

$$\lim_{T\to\infty} J_u(x_0, \chi_0; T) = J(z) = c^+\int_0^{z} x\,dP_z(\xi) - c^-\int_{-\infty}^{0} x\,dP_z(\xi), \quad \xi = z - x, \qquad (39)$$
where $P_z(\xi)$ is the marginal probability measure defined by:

$$P_z(\xi) = P_{0,z}(\xi) + \int_0^\infty P_{1,z}(\xi, y)\,dy. \qquad (40)$$
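The decay rate $\alpha_0$ of the stationary measure generally has to be found numerically. The sketch below is a minimal illustration under stated assumptions: it reads Eq.(38) as $d\,\alpha_0 = \mu\,[1 - f((r-d)\alpha_0)]$ in unscaled parameters (the paper's own notation may absorb scale factors), takes the time between failures to be Erlang with $k$ phases and mean $1/\lambda$, and uses plain bisection. The function names are hypothetical. Under this reading the exponential case $k = 1$ has the closed form $\mu/d - \lambda/(r-d)$, which serves as a check.

```python
def erlang_laplace(s, k, lam):
    """Laplace transform of an Erlang-k density with mean 1/lam."""
    return (k * lam / (k * lam + s)) ** k

def decay_rate(r, d, mu, lam, k=1, tol=1e-12):
    """Positive root of d*alpha = mu*(1 - f((r - d)*alpha)), found by bisection."""
    if r * mu <= d * (lam + mu):
        raise ValueError("demand rate is not feasible")

    def gap(alpha):
        return mu * (1.0 - erlang_laplace((r - d) * alpha, k, lam)) - d * alpha

    hi = mu / d                 # gap(hi) < 0 because 1 - f < 1
    lo = hi
    while gap(lo) <= 0.0:       # gap is concave with gap(0) = 0 and positive slope at 0
        lo /= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        print(k, decay_rate(r=2.0, d=1.0, mu=1.0, lam=0.5, k=k))
```

A larger $k$ means less variable operating periods, and in this sketch it yields a larger decay rate, in line with the expectation that lower variability calls for a smaller hedging stock.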
In view of Eq.(39), the optimal position z* for the hedging level will be given
Note that when $F(x) = 1 - e^{-\lambda x}$, Eq.(38) implies that $\alpha_0 = \mu(1 - \rho)$ and the markovian situation given by Eq.(27) is indeed obtained directly. The result for $z^*$ given by Eq.(41) can now be discussed in two limiting situations:
• a) $\alpha_0 \ll 1$. This case corresponds to the heavy traffic regime $\rho \approx 1$ in which the machine M can barely follow the demand rate $d$. By expanding the function $f(\alpha_0)$ given in Eq.(38), we can approximately write:
(42) 2. = ^±i ln f£^)),M±imr 2A(l-p) + 7^ ;) 2A(l-p) — -p) A \ 11 + 2A(l-p) As intuitively expected, Eq.(42) exhibits that the level z* grows with the coefficient of variation CVj.. b) ao >> 1. Here the production rate of the machine M follows easily the demand rate d and hence the traffic p « 1. In this case the solution of Eq.(38) becomes insensitive to the detailed behavior of /(ao). Indeed
for large value of c*o the right hand side of Eq. (38) is approximately equal to p, and we have:
• c) When the argument of the logarithm is smaller than 1, the logarithmic contribution becomes negative and therefore the optimal hedging level is $z^* = 0$. Hence the just-in-time production policy is optimal.

4 Conclusion
To face the uncertainty of the environment (randomness in the demand and in the production processes) and to respond to the demand with a high reactivity, production managers often use make-to-stock policies. These consist of storing a certain amount of finished items to absorb the randomness in the demands. While failing to respond to the demands instantaneously usually incurs high costs, the costs of storage are also not negligible. We model these storage and shortage costs by means of a convex function of the surplus, the surplus being the instantaneous (algebraic) difference between the numbers of ordered and available items. The simplest policy which regulates the level of the finished items inventory is to use a hedging point. Under a hedging point policy, we produce until the inventory of finished items reaches a hedging level and we stop the production otherwise. While this hedging policy can be shown to be optimal when the underlying fluctuations obey markovian dynamics, it becomes only sub-optimal for non-markovian processes. Despite its suboptimality in general, the hedging point policy is nevertheless often selected on the basis of its simplicity. Once this choice is adopted, it remains to select the best hedging level. According to intuition, this level will increase with the variability of the underlying environment. We give here a quantitative confirmation of this assertion by explicitly solving simple configurations involving discrete and continuous flows of items and several sources of randomness, namely random demands and/or failure prone production processes.

Acknowledgments
I am deeply indebted to Ludwig Streit for numerous stimulating scientific and human discussions over many years. His enthusiasm, his "savoir vivre" and his exceptional sense of friendship have brightened my life each time I have met him. I also heartily thank the organizers of the Lisbon Conference held in honor of L. Streit's 60th birthday. This was an unforgettable and marvelous week.
References
[Ake 86] R. Akella and P.R. Kumar. Optimal control of production rate in a failure prone manufacturing system. IEEE Trans. Aut. Cont. AC-31, (1986), 116-126.
[Bie 88] T. Bielecki and P.R. Kumar. Optimality of zero inventory policies for unreliable manufacturing systems. Operations Research 36, (1988), 532-541.
[Cip 98] Ph. Ciprut, M.-O. Hongler and Y. Salama. Hedging point for non-markovian piecewise deterministic production processes. Discrete Event Dynamic Systems: Theory and Applications 8, (1998), 365-375.
[Cip 99] Ph. Ciprut and M.-O. Hongler. A dynamic allocation index policy for non-markovian multiclass production. Proceedings of the Internat. Conf. on Industrial Engineering and Production Management "fucam", Glasgow (July 1999), 2.348.
[Coh 82] J.W. Cohen. The single server queue. North Holland (1982), second edition.
[Hu 94] J.Q. Hu and D. Xiang. Structural properties of optimal production controllers in failure prone manufacturing systems. IEEE Trans. Aut. Cont. AC-39, (1994), 640-642.
[Leb 88] J.-Y. Le Boudec. Steady state probabilities of the PH/PH/1 queue. Queueing Systems 3, (1988), 73-88.
[Mil 63] R.G. Miller. Continuous time stochastic storage process with random linear inputs and outputs. J. of Mathematics and Mechanics 12, (1963), 275-291.
[Ros 82] S. Ross. Stochastic processes. J. Wiley (1982).
[Vea 96] M.H. Veatch and L.M. Wein. Scheduling a make to stock queue: index policies and hedging points. Operations Research 44, (1996), 634-647.
OPTIMAL PORTFOLIO IN A FRACTIONAL BLACK & SCHOLES MARKET*

YAOZHONG HU
Department of Mathematics, University of Kansas, 405 Snow Hall, Lawrence, Kansas 66045-2142, USA ([email protected])

BERNT ØKSENDAL
Department of Mathematics, University of Oslo, Box 1053 Blindern, N-0316 Oslo, Norway ([email protected])
Norwegian School of Economics and Business Administration, Helleveien 30, N-5035 Bergen-Sandviken, Norway

AGNES SULEM
INRIA, Domaine de Voluceau-Rocquencourt B.P. 105, F-78153 Le Chesnay Cedex, France ([email protected])

We use the martingale method of Cox and Huang to solve explicitly the optimal portfolio problem in a Black & Scholes type of market driven by fractional Brownian motion $B_H(t)$ with Hurst parameter $H \in (\frac12, 1)$. The results are compared to the corresponding well-known results in the standard Black & Scholes market.
1 Introduction
If $H$ is a constant, $0 < H < 1$, then the fractional Brownian motion with Hurst parameter $H$ is the Gaussian process $B_H(t) = B_H(t, \omega)$; $t \geq 0$, $\omega \in \Omega$ with mean $E[B_H(t)] = 0$ for all $t \geq 0$ and covariance

$$E[B_H(t)B_H(s)] = \tfrac{1}{2}\left(t^{2H} + s^{2H} - |t - s|^{2H}\right)$$

for all $s, t \geq 0$. Here $E$ denotes the expectation with respect to the probability law $\mu_H$ for $B_H$ on $\Omega$. We assume that $B_H(0) = 0$. If $H = \frac12$ then $B_H(t)$ coincides with the standard Brownian motion $B(t)$. If $H > \frac12$ then $B_H(t)$ has a long range dependence in the sense that if we let $\rho(n) = \text{cov}(B_H(1), B_H(n+1) - B_H(n))$ then

$$\sum_{n=1}^{\infty} \rho(n) = \infty.$$
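The long range dependence can be checked directly from the covariance function. The short sketch below (function names are illustrative) computes $\rho(n)$ by bilinearity of the covariance and sums it up to $N$. The sum telescopes to $\frac12\left((N+1)^{2H} - N^{2H} - 1\right)$, which vanishes for $H = \frac12$ and grows without bound for $H > \frac12$.

```python
def fbm_cov(s, t, H):
    """Covariance E[B_H(t) B_H(s)] of fractional Brownian motion."""
    return 0.5 * (t ** (2 * H) + s ** (2 * H) - abs(t - s) ** (2 * H))

def rho(n, H):
    """cov(B_H(1), B_H(n+1) - B_H(n)), by bilinearity of the covariance."""
    return fbm_cov(1, n + 1, H) - fbm_cov(1, n, H)

def partial_sum(N, H):
    """Sum of rho(n) for n = 1..N."""
    return sum(rho(n, H) for n in range(1, N + 1))

def telescoped_sum(N, H):
    """Closed form of the same partial sum."""
    return 0.5 * ((N + 1) ** (2 * H) - N ** (2 * H) - 1)

if __name__ == "__main__":
    for N in (10, 100, 1000, 10000):
        print(N, partial_sum(N, 0.5), partial_sum(N, 0.75))
```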
*Dedicated to Prof. Ludwig Streit on the occasion of his 60th birthday.
$B_H(t)$ is self-similar in the sense that $B_H(\alpha t)$ has the same law as $\alpha^H B_H(t)$, for any $\alpha > 0$. Because of these properties fractional Brownian motion has been suggested as a useful tool in finance and other applications. See e.g. [M]. However, $B_H(t)$ is neither a semimartingale nor a Markov process, so many of the powerful techniques from stochastic analysis are not available when dealing with $B_H(t)$. On the other hand, recently a white noise theory for $B_H(t)$ for $\frac12 < H < 1$ was developed [HØ] (see also [DHP]) and this was applied to prove that if the corresponding integration theory (in the Itô sense, i.e. based on the Wick product rather than the pathwise product) is used, then the corresponding fractional Black & Scholes market is without arbitrage, it is complete and an explicit fractional option pricing and hedging formula can be given. We now describe this fractional Black & Scholes market more precisely and refer to [HØ] and the references therein for more information. We assume throughout that $\frac12 < H < 1$. Suppose we have the following two investment possibilities:

(i) A bank account or a bond, where the price $A(t)$ at time $t \geq 0$ is given by

$$dA(t) = rA(t)\,dt; \quad A(0) = 1, \qquad (1.1)$$

where $r > 0$ is a constant.

(ii) A stock, where the price $S(t)$ at time $t \geq 0$ is given by

$$dS(t) = aS(t)\,dt + \sigma S(t)\,dB_H(t); \quad S(0) = s > 0, \qquad (1.2)$$

where $a > r > 0$ and $\sigma \neq 0$ are constants. Here the differential $dB_H(t)$ is the Itô type fractional Brownian motion differential used in [HØ]. Suppose an investor chooses a portfolio $\theta(t) = (\alpha(t), \beta(t))$ giving the number of units $\alpha(t), \beta(t)$ held at time $t$ of bonds and stocks, respectively. We assume that $\alpha(t), \beta(t)$ are $\mathcal{F}_t^{(H)}$-adapted processes, where $\mathcal{F}_t^{(H)}$ is the $\sigma$-algebra generated by $\{B_H(s)\}_{s \leq t}$. Let

$$Z(t) = Z^\theta(t, \omega) = \alpha(t)A(t) + \beta(t)S(t) \qquad (1.3)$$

be the value process corresponding to this portfolio. Following [HØ] we will assume that $\theta$ is self-financing, in the sense that

$$dZ(t) = \alpha(t)\,dA(t) + \beta(t)\,dS(t). \qquad (1.4)$$
We say that $\theta$ is admissible if, in addition, $Z^\theta(t)$ is nonnegative, a.s. We let $\mathcal{A}$ denote the set of admissible portfolios. With a given initial value $z > 0$ consider the problem to find $V(z)$ and $\theta^* \in \mathcal{A}$ such that

$$V(z) = V_H(z) = \sup_{\theta \in \mathcal{A}} E^z[u(Z^\theta(T))] = E^z[u(Z^{\theta^*}(T))], \qquad (1.5)$$

where $T > 0$ is a given constant, $E^z$ denotes expectation w.r.t. $\mu_H$ when $Z^\theta(0) = z$ and $u: (0, \infty) \to \mathbb{R}$ is a given utility function, assumed to be nondecreasing and concave. An example of such a utility function is

$$u(x) = \frac{1}{\gamma}x^\gamma \quad \text{where } \gamma \in (0, 1) \text{ is constant}. \qquad (1.6)$$

The constant $1 - \gamma$ is interpreted as the risk aversion. In the case of standard Brownian motion $B(t)$ a natural approach to this problem would be dynamic programming, which leads to the Hamilton-Jacobi-Bellman equation (see e.g. [Ø, Ch. 11]). However, with $B(t)$ replaced by $B_H(t)$ it is no longer possible to use such a method, because we cannot make the system Markovian. Another commonly used approach in the standard case ($H = \frac12$) is the martingale approach, introduced by Cox and Huang [CH1], [CH2]. See also the presentation in [KLS]. The purpose of this note is to show how this approach, in spite of the fact that $B_H(t)$ is not a martingale (not even a semimartingale), can be adapted to solve (1.5). We give an explicit solution and compare it to the well-known solution in the standard case.

2 Explicit solution of the optimal portfolio problem
We now show in detail how the martingale approach of Cox and Huang can be used to solve problem (1.5) for an Itô type fractional Black & Scholes market with Hurst parameter $H \in (\frac12, 1)$. To this end, note that if we substitute (from (1.3))

$$\alpha(t) = A(t)^{-1}\left(Z(t) - \beta(t)S(t)\right) \qquad (2.1)$$

into (1.4) we get

$$dZ(t) = rZ(t)\,dt + \sigma\beta(t)S(t)\left[\frac{a - r}{\sigma}\,dt + dB_H(t)\right]. \qquad (2.2)$$

As in [HØ] we rewrite this as

$$e^{-rt}Z(t) = z + \int_0^t e^{-rs}\sigma\beta(s)S(s)\,d\hat{B}_H(s), \qquad (2.3)$$
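where, by the fractional Girsanov theorem [HØ, Theorem 3.18], the process

$$\hat{B}_H(t) := \frac{a - r}{\sigma}\,t + B_H(t) \qquad (2.4)$$

is a fractional Brownian motion with respect to the measure $\hat{\mu}_H$ defined on $\mathcal{F}_T^{(H)}$ by

$$\frac{d\hat{\mu}_H}{d\mu_H} = \exp\left(-\int_0^T K(s)\,dB_H(s) - \frac{1}{2}|K|_\phi^2\right) =: \exp^{\diamond}\left(-\int_0^T K(s)\,dB_H(s)\right) =: \eta, \qquad (2.5)$$

where

$$|K|_\phi^2 = \int_0^T\!\!\int_0^T K(s)K(t)\phi(s, t)\,ds\,dt$$

and

$$K(s) = \frac{(a - r)(Ts - s^2)^{\frac12 - H}\,\mathbf{1}_{[0,T]}(s)}{2\sigma H \cdot \Gamma(2H) \cdot \Gamma(2 - 2H) \cdot \cos(\pi(H - \frac12))}, \qquad (2.6)$$

where $\Gamma$ is the gamma function. As proved in [HØ] any given $\mathcal{F}_T^{(H)}$-measurable $F(\omega) \geq 0$ can be achieved as the terminal value $Z^\theta(T, \omega)$ a.s. for some $\theta \in \mathcal{A}$ and with $Z^\theta(0) = z$ if and only if

$$z = E_{\hat{\mu}_H}[e^{-rT}F]. \qquad (2.7)$$

Therefore the problem (1.5) can be reformulated as follows

$$V(z) = \sup_F\left\{E[u(F)];\ E[e^{-rT}\eta F] = z\right\}, \qquad (2.8)$$

where the supremum is taken over all $\mathcal{F}_T^{(H)}$-measurable nonnegative random variables $F$. To solve this constrained maximization problem we consider, for a given $\lambda > 0$, the unconstrained problem

$$\Phi_\lambda(z) := \sup_F\left\{E[u(F)] - \lambda E[e^{-rT}\eta F]\right\}. \qquad (2.9)$$

Suppose $F_\lambda^*$ solves this unconstrained problem. Then for any $F$ such that $E[e^{-rT}\eta F] = z$ we have

$$E[u(F_\lambda^*) - \lambda e^{-rT}\eta F_\lambda^*] \geq E[u(F) - \lambda e^{-rT}\eta F] = E[u(F)] - \lambda z. \qquad (2.10)$$

In particular, if there exists $\lambda_0 > 0$ such that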
271
then we get from (2.10) E[u(F*Xo)}>E[u(F)}. Hence Fj^ solves the constrained problem (2.8). First let us assume that, as in (1.6), u(x) = -a; 7 for some 7 € (0,1). 7 Then the unconstrained problem (2.9) becomes
$\(z) = supE -F 7 (w)-Ae- r i »7(w)F(a;)
(2.12)
(2.13)
F
We solve this w-wise by maximizing -Xe-rTrjF
g(F):=-F^ 7 with respect to F > 0: We have
g'(F) = F 7 - 1 - Ae-rT77 = 0 for F = (Ae- rT 7?) ^
.
By concavity we conclude that g(F) is maximal when (Xe-rTr,)^.
F = F'x =
(2.14)
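The pointwise maximization leading to (2.14) can be checked numerically. The following is only a minimal sketch: the parameter values are hypothetical, `eta` is a fixed positive number standing in for one realization $\eta(\omega)$, and the helper names `g` and `f_star` are not taken from the paper.

```python
import math

def g(F, lam, eta, r, T, gamma):
    """Pointwise objective g(F) = F**gamma / gamma - lam * exp(-r*T) * eta * F."""
    return F ** gamma / gamma - lam * math.exp(-r * T) * eta * F

def f_star(lam, eta, r, T, gamma):
    """Candidate maximizer from (2.14): (lam * exp(-r*T) * eta) ** (1 / (gamma - 1))."""
    return (lam * math.exp(-r * T) * eta) ** (1.0 / (gamma - 1.0))

# Hypothetical illustrative values (assumptions, not from the source).
lam, eta, r, T, gamma = 0.8, 1.3, 0.05, 2.0, 0.4

F_opt = f_star(lam, eta, r, T, gamma)

# Candidates between 0.51 * F_opt and 1.50 * F_opt.
grid = [F_opt * (0.5 + 0.01 * k) for k in range(1, 101)]
best_on_grid = max(grid, key=lambda F: g(F, lam, eta, r, T, gamma))
```

Because $g$ is strictly concave for $\gamma \in (0,1)$, no grid point can beat `F_opt`, and the best grid point sits next to it.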
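We now seek $\lambda_0$ such that (2.11) holds, i.e.
$$E\big[e^{-rT}\eta\,(\lambda_0 e^{-rT}\eta)^{\frac{1}{\gamma-1}}\big] = z. \tag{2.15}$$
This gives
$$\lambda_0 = \frac{z^{\gamma-1}e^{\gamma rT}}{\big(E[\eta^{\frac{\gamma}{\gamma-1}}]\big)^{\gamma-1}}. \tag{2.16}$$
If we substitute $\lambda = \lambda_0$ in (2.14) we get
$$F_{\lambda_0}^* = \frac{z\,e^{rT}\,\eta^{\frac{1}{\gamma-1}}}{E[\eta^{\frac{\gamma}{\gamma-1}}]}. \tag{2.17}$$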
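This is the optimal value of $F$ in the constrained problem (2.8). Hence
$$V(z) = E\Big[\frac1\gamma (F_{\lambda_0}^*)^\gamma\Big] = \frac1\gamma z^\gamma e^{\gamma rT}\big(E[\eta^{\frac{\gamma}{\gamma-1}}]\big)^{1-\gamma}. \tag{2.18}$$
By (2.5) we have
$$\eta^{\frac{\gamma}{\gamma-1}} = \exp\Big(\frac{\gamma}{1-\gamma}\int_0^T K(s)\,dB_H(s) + \frac{\gamma}{2(1-\gamma)}|K|_\phi^2\Big) = \exp^\diamond\Big(\frac{\gamma}{1-\gamma}\int_0^T K(s)\,dB_H(s)\Big)\exp\Big(\frac{\gamma^2}{2(1-\gamma)^2}|K|_\phi^2 + \frac{\gamma}{2(1-\gamma)}|K|_\phi^2\Big) = \exp^\diamond\Big(\frac{\gamma}{1-\gamma}\int_0^T K(s)\,dB_H(s)\Big)\exp\Big(\frac{\gamma}{2(1-\gamma)^2}|K|_\phi^2\Big).$$
Since $E[\exp^\diamond(\int_0^T f(s)\,dB_H(s))] = 1$ for all $f \in L_\phi^2$ (see [HØ]) we get
$$E[\eta^{\frac{\gamma}{\gamma-1}}] = \exp\Big(\frac{\gamma}{2(1-\gamma)^2}|K|_\phi^2\Big). \tag{2.19}$$
Hence by (2.18)
$$V(z) = V_H(z) = \frac1\gamma z^\gamma\exp\Big(r\gamma T + \frac{\gamma}{2(1-\gamma)}|K|_\phi^2\Big). \tag{2.20}$$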
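The step to (2.19) only uses that $X = \int_0^T K(s)\,dB_H(s)$ is a centered Gaussian variable with variance $|K|_\phi^2$, so it can be checked with a one-dimensional quadrature. The following is a minimal sketch under that assumption; the values of `gamma` and `v` (standing in for $|K|_\phi^2$) are hypothetical, and `gaussian_expectation` is an illustrative helper, not part of the paper.

```python
import math

def gaussian_expectation(func, var, n=20001, width=12.0):
    """Trapezoid-rule approximation of E[func(X)] for X ~ N(0, var)."""
    sd = math.sqrt(var)
    lo, hi = -width * sd, width * sd
    h = (hi - lo) / (n - 1)
    norm = 1.0 / math.sqrt(2.0 * math.pi * var)
    total = 0.0
    for i in range(n):
        x = lo + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * func(x) * norm * math.exp(-x * x / (2.0 * var))
    return total * h

# Hypothetical values: gamma in (0, 1), v plays the role of |K|_phi^2.
gamma, v = 0.4, 0.7

def eta(x):
    """eta = exp(-X - v/2), the Wick exponential of -X, as in (2.5)."""
    return math.exp(-x - 0.5 * v)

power = gamma / (gamma - 1.0)

mean_eta = gaussian_expectation(eta, v)                       # should be close to 1
lhs = gaussian_expectation(lambda x: eta(x) ** power, v)      # E[eta^(gamma/(gamma-1))]
rhs = math.exp(gamma * v / (2.0 * (1.0 - gamma) ** 2))        # closed form (2.19)
```

The first quantity illustrates that a Wick exponential has expectation one; the second compares the quadrature with the closed form in (2.19).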
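Similarly, from (2.17) and (2.19) we get the corresponding optimal terminal value $Z^{\theta^*}(T)$:
$$Z^{\theta^*}(T) = F_{\lambda_0}^* = z\exp\Big(\frac{1}{1-\gamma}\int_0^T K(s)\,dB_H(s) + rT + \frac{1-2\gamma}{2(1-\gamma)^2}|K|_\phi^2\Big). \tag{2.21}$$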
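To compute $|K|_\phi^2$ we use that
$$\int_0^T K(s)\,\phi(s,t)\,ds = \frac{a-r}{\sigma} \quad\text{for } 0 \le t \le T \tag{2.22}$$
(see [HØ, (5.13)]). This gives, using (2.6),
$$|K|_\phi^2 = \int_0^T\!\!\int_0^T K(s)K(t)\,\phi(s,t)\,ds\,dt = \frac{a-r}{\sigma}\int_0^T K(t)\,dt \tag{2.23}$$
$$= \frac{(a-r)^2\int_0^T (Tt-t^2)^{\frac12-H}\,dt}{2\sigma^2 H\cdot\Gamma(2H)\cdot\Gamma(2-2H)\cdot\cos(\pi(H-\frac12))} = \frac{(a-r)^2\cdot\Gamma^2(\frac32-H)\cdot T^{2-2H}}{2\sigma^2 H\cdot(2-2H)\cdot\Gamma(2H)\cdot\Gamma^2(2-2H)\cdot\cos(\pi(H-\frac12))}, \tag{2.24}$$
where we have used that (see [HØ, Section 6])
$$\int_0^T (Tt-t^2)^{\frac12-H}\,dt = \frac{\Gamma^2(\frac32-H)\,T^{2-2H}}{\Gamma(3-2H)} \tag{2.25}$$
plus the basic identity $\Gamma(x+1) = x\Gamma(x)$. Therefore, if we put
$$\Lambda_H = \frac{\Gamma^2(\frac32-H)}{2H\cdot(2-2H)\cdot\Gamma(2H)\cdot\Gamma^2(2-2H)\cdot\cos(\pi(H-\frac12))} \tag{2.26}$$
we obtain from (2.24) that
$$|K|_\phi^2 = \Big(\frac{a-r}{\sigma}\Big)^2\Lambda_H\cdot T^{2-2H}. \tag{2.27}$$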
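Therefore, by (2.20) and (2.21) we get

Theorem 2.1 The value function $V(z)$ of the optimal portfolio problem (1.5)-(1.6) is given by
$$V(z) = V_H(z) = \frac1\gamma z^\gamma\exp\Big(r\gamma T + \frac{\gamma}{2(1-\gamma)}\Big(\frac{a-r}{\sigma}\Big)^2\Lambda_H\cdot T^{2-2H}\Big). \tag{2.28}$$
The corresponding optimal terminal value $Z^{\theta^*}(T)$ is given by:
$$Z^{\theta^*}(T) = z\exp\Big(\frac{1}{1-\gamma}\int_0^T K(s)\,dB_H(s) + rT + \frac{1-2\gamma}{2(1-\gamma)^2}\Big(\frac{a-r}{\sigma}\Big)^2\Lambda_H T^{2-2H}\Big). \tag{2.29}$$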
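Remark It is natural to ask how the value function $V = V_H(z)$ in (2.28) is related to the value function $V_{1/2}(z)$ for the corresponding problem for standard Brownian motion ($H = \frac12$). In this case it is well known that (see e.g. [Ø, (11.2.53)])
$$V_{1/2}(z) = \frac1\gamma z^\gamma\exp\Big(r\gamma T + \frac{\gamma}{2(1-\gamma)}\Big(\frac{a-r}{\sigma}\Big)^2 T\Big). \tag{2.30}$$
Since $\Lambda_H \to 1$ and $T^{2-2H} \to T$ as $H \to \frac12+$, we see that, as was to be expected,
$$\lim_{H\to\frac12+} V_H(z) = V_{1/2}(z). \tag{2.31}$$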
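The limit in the Remark is easy to inspect numerically. The sketch below evaluates the constant $\Lambda_H$ and the value function of Theorem 2.1 with the standard library gamma function. It assumes the form of $\Lambda_H$ that follows from (2.24), with $\Gamma(2-2H)$ squared in the denominator; the function names and the market parameters are hypothetical.

```python
import math

def lambda_H(H):
    """Lambda_H as obtained from (2.24)-(2.26); assumes Gamma(2-2H) enters squared."""
    G = math.gamma
    num = G(1.5 - H) ** 2
    den = 2.0 * H * (2.0 - 2.0 * H) * G(2.0 * H) * G(2.0 - 2.0 * H) ** 2 \
        * math.cos(math.pi * (H - 0.5))
    return num / den

def value_power(z, gamma, r, a, sigma, T, H):
    """V_H(z) from (2.28) for the power utility u(x) = x**gamma / gamma."""
    k2 = ((a - r) / sigma) ** 2 * lambda_H(H) * T ** (2.0 - 2.0 * H)   # |K|_phi^2, (2.27)
    return z ** gamma / gamma * math.exp(r * gamma * T + gamma / (2.0 * (1.0 - gamma)) * k2)

def value_power_classical(z, gamma, r, a, sigma, T):
    """V_{1/2}(z) from (2.30), the standard Brownian motion case."""
    k2 = ((a - r) / sigma) ** 2 * T
    return z ** gamma / gamma * math.exp(r * gamma * T + gamma / (2.0 * (1.0 - gamma)) * k2)

# Hypothetical market parameters.
z, gamma, r, a, sigma, T = 1.0, 0.5, 0.03, 0.08, 0.2, 1.0
v_half = value_power_classical(z, gamma, r, a, sigma, T)
v_near = value_power(z, gamma, r, a, sigma, T, 0.5 + 1e-6)
```

Evaluating `value_power` on a grid of $H \in (\frac12, 1)$ shows how the value function moves away from the classical value as the Hurst parameter grows.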
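Next we turn to the question of finding the corresponding optimal portfolio $\theta^* = (\alpha^*, \beta^*)$. Because the system is not Markovian we cannot use the traditional PDE method to find $\theta^*$. However, we can use the Clark-Ocone formula in [HØ, Theorem 4.15]. Applying this formula to our situation we get, with $Z = Z^{\theta^*}(T)$,
$$e^{-rT}Z(\omega) = E_{\hat\mu_H}[e^{-rT}Z] + \int_0^T \tilde E_{\hat\mu_H}\big[e^{-rT}\hat D_t Z \mid \mathcal F_t^{(H)}\big]\,d\hat B_H(t), \tag{2.32}$$
where $\hat D_t$ denotes the stochastic gradient with respect to the measure $\hat\mu_H$ and $\tilde E_{\hat\mu_H}[\cdot \mid \cdot]$ denotes the quasi-conditional expectation. If we compare (2.32) with (2.3) we get by uniqueness that
$$\exp(-rt)\,\sigma\beta^*(t)S(t) = e^{-rT}\,\tilde E_{\hat\mu_H}\big[\hat D_t Z \mid \mathcal F_t^{(H)}\big], \quad\text{i.e.}\quad \beta^*(t) = e^{-r(T-t)}\sigma^{-1}S^{-1}(t)\,\tilde E_{\hat\mu_H}\big[\hat D_t Z \mid \mathcal F_t^{(H)}\big]. \tag{2.33}$$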
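By (2.29) and (2.4) we have
$$Z = z\exp\Big(\frac{1}{1-\gamma}\int_0^T K(s)\,d\hat B_H(s) - \frac{1}{1-\gamma}\int_0^T K(s)\frac{a-r}{\sigma}\,ds + rT + \frac{1-2\gamma}{2(1-\gamma)^2}\Big(\frac{a-r}{\sigma}\Big)^2\Lambda_H T^{2-2H}\Big) \tag{2.34}$$
and therefore, by the chain rule,
$$\hat D_t Z = \frac{K(t)}{1-\gamma}\,Z. \tag{2.35}$$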
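To facilitate the computation of the quasi-conditional expectation we write, using (2.34) and (2.23),
$$Z = z\exp\Big(\frac{1}{1-\gamma}\int_0^T K(s)\,d\hat B_H(s) - \frac{1}{2(1-\gamma)^2}|K|_\phi^2 + rT\Big) = z\exp^\diamond\Big(\frac{1}{1-\gamma}\int_0^T K(s)\,d\hat B_H(s)\Big)\,e^{rT}. \tag{2.36}$$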
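Define
$$\rho_H(x) = \frac{H(2H-1)}{4H^2\,\Gamma^2(2H)\,\Gamma^2(2-2H)\cos^2(\pi(H-\frac12))}\int_0^x\!\!\int_0^x |u-v|^{2H-2}(u-u^2)^{\frac12-H}(v-v^2)^{\frac12-H}\,du\,dv, \quad 0 \le x \le 1,$$
so that
$$\int_0^t\!\!\int_0^t \phi(u,v)K(u)K(v)\,du\,dv = \Big(\frac{a-r}{\sigma}\Big)^2 T^{2-2H}\rho_H\Big(\frac tT\Big).$$
It is obvious that
$$\rho_H(1) = \Lambda_H. \tag{2.39}$$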
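Therefore,
$$\tilde E_{\hat\mu_H}\big[\hat D_t Z \mid \mathcal F_t^{(H)}\big] = \frac{z\,e^{rT}K(t)}{1-\gamma}\exp^\diamond\Big(\frac{1}{1-\gamma}\int_0^t K(s)\,d\hat B_H(s)\Big) = \frac{z\,e^{rT}K(t)}{1-\gamma}\exp\Big(\frac{1}{1-\gamma}\int_0^t K(s)\,d\hat B_H(s) - \frac{1}{2(1-\gamma)^2}\Big(\frac{a-r}{\sigma}\Big)^2 T^{2-2H}\rho_H\Big(\frac tT\Big)\Big). \tag{2.40}$$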
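Combining (2.40), (2.33) and (2.1)-(2.4) we get

Theorem 2.2 The optimal portfolio $\theta^* = (\alpha^*, \beta^*)$ for problem (1.5)-(1.6) is given by
$$\beta^*(t) = e^{-r(T-t)}\sigma^{-1}S^{-1}(t)\,\frac{z\,K(t)}{1-\gamma}\exp\Big\{\frac{1}{1-\gamma}\int_0^t K(s)\,d\hat B_H(s) - \frac{(a-r)^2\,T^{2-2H}}{2(1-\gamma)^2\sigma^2}\,\rho_H\Big(\frac tT\Big) + rT\Big\} \tag{2.41}$$
and
$$\alpha^*(t) = A^{-1}(t)\big(Z^{\theta^*}(t) - \beta^*(t)S(t)\big), \tag{2.42}$$
where
$$e^{-rt}Z^{\theta^*}(t) = z + \int_0^t \exp(-rs)\,\sigma\beta^*(s)S(s)\,d\hat B_H(s). \tag{2.43}$$
Now let us consider the following utility function
$$u(x) = \log x, \quad x > 0. \tag{2.44}$$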
This is called the Kelly criterion, see e.g. [A] and [Ø]. Then the unconstrained problem (2.9) becomes
$$\Phi_\lambda(z) = \sup_F E\big[\log F - \lambda e^{-rT}\eta F\big]. \tag{2.45}$$
Let $g(F) = \log F - \lambda e^{-rT}\eta F$. The maximum of $g$ is attained when
$$F = F_\lambda^* = \frac1\lambda e^{rT}\eta^{-1}.$$
In this case the equation (analogous to (2.15))
$$z = E[e^{-rT}\eta F_{\lambda_0}^*] = E\Big[e^{-rT}\eta\,\frac{1}{\lambda_0}e^{rT}\eta^{-1}\Big] = \frac{1}{\lambda_0}$$
has a solution $\lambda_0 = 1/z$. Thus the optimal value of (2.8) with (2.44) is attained when
$$F_{\lambda_0}^* = z\,e^{rT}\eta^{-1}.$$
The value function is given by
$$V(z) = E[\log F_{\lambda_0}^*] = \log z + rT - E[\log\eta] = \log z + rT + \frac12\Big(\frac{a-r}{\sigma}\Big)^2\Lambda_H T^{2-2H}.$$
When $H \to \frac12+$, we obtain
$$V_{1/2}(z) = \log z + rT + \frac12\Big(\frac{a-r}{\sigma}\Big)^2 T.$$
K(s)dBH(a))
=l + J
HtdBH(t),
where Ht = h„
\r>t (exp° (j\(s)dBH(s)\j
= K(t)exp<>if
\T{tH)
K(s)dBH(s)\
= K(t)expl^K{S)dBH(s)
- \
(±=^f*-™PH
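The optimal terminal value $Z^{\theta^*}(T)$ is
$$Z^{\theta^*}(T) = F_{\lambda_0}^* = z\exp\Big\{rT + \int_0^T K(s)\,dB_H(s) + \frac12|K|_\phi^2\Big\} = z\exp\Big\{rT + \int_0^T K(s)\,d\hat B_H(s) - \int_0^T K(s)\frac{a-r}{\sigma}\,ds + \frac12|K|_\phi^2\Big\} = z\,e^{rT}\exp\Big\{\int_0^T K(s)\,d\hat B_H(s) - \frac12|K|_\phi^2\Big\} = z\,e^{rT}\exp^\diamond\Big(\int_0^T K(s)\,d\hat B_H(s)\Big). \tag{2.51}$$
Thus by (2.50)
$$\tilde E_{\hat\mu_H}\big[\hat D_t Z^{\theta^*}(T) \mid \mathcal F_t^{(H)}\big] = z\,e^{rT}K(t)\exp\Big\{\int_0^t K(s)\,d\hat B_H(s) - \frac12\Big(\frac{a-r}{\sigma}\Big)^2 T^{2-2H}\rho_H\Big(\frac tT\Big)\Big\}.$$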
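By (2.33), we have

Theorem 2.3 The value function $V(z)$ for problem (1.5) and (2.44) is given by
$$V(z) = \log z + rT + \frac12\Big(\frac{a-r}{\sigma}\Big)^2\Lambda_H T^{2-2H}. \tag{2.52}$$
The optimal portfolio $\theta^* = (\alpha^*, \beta^*)$ for problem (1.5) and (2.44) is given by
$$\beta^*(t) = e^{-r(T-t)}\sigma^{-1}S^{-1}(t)\,z\,e^{rT}K(t)\exp\Big\{\int_0^t K(s)\,d\hat B_H(s) - \frac12\Big(\frac{a-r}{\sigma}\Big)^2 T^{2-2H}\rho_H\Big(\frac tT\Big)\Big\} \tag{2.53}$$
and
$$\alpha^*(t) = A^{-1}(t)\big(Z^{\theta^*}(t) - \beta^*(t)S(t)\big), \tag{2.54}$$
where
$$e^{-rt}Z^{\theta^*}(t) = z + \int_0^t \exp(-rs)\,\sigma\beta^*(s)S(s)\,d\hat B_H(s). \tag{2.55}$$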
Acknowledgements
This work was partially supported by the French-Norwegian cooperation project Stochastic Control and Applications Aur 99-050. A. Sulem and B. Øksendal wish to thank the French Research Council and the Norwegian Research Council for their support through this project.

References
[A] K.K. Aase: Optimum portfolio diversification in a general continuous time model. Stoch. Proc. and Their Appl. 18 (1984), 81-98.
[CH1] J. Cox and C.-F. Huang: Optimal consumption and portfolio policies when asset prices follow a diffusion process. Journal of Economic Theory 49 (1989), 33-83.
[CH2] J. Cox and C.-F. Huang: A variational problem arising in financial economics. J. Mathematical Economics 20 (1991), 465-487.
[DHP] T.E. Duncan, Y. Hu and B. Pasik-Duncan: Stochastic calculus for fractional Brownian motion. I. Theory. To appear in SIAM J. Control and Optimization.
[HØ] Y. Hu and B. Øksendal: Fractional white noise calculus and applications to finance. Preprint University of Oslo 1999.
[KLS] I. Karatzas, J. Lehoczky and S.E. Shreve: Optimal portfolio and consumption decisions for a small investor on a finite horizon. SIAM J. Control and Optimization 25 (1987), 1157-1186.
[M] B.B. Mandelbrot: Fractals and Scaling in Finance: Discontinuity, Concentration, Risk. Springer-Verlag 1997.
[Ø] B. Øksendal: Stochastic Differential Equations. 5th edition. Springer-Verlag 1998.
NONRENORMALIZABILITY AND NONTRIVIALITY
JOHN R. KLAUDER
Departments of Physics and Mathematics, University of Florida, Gainesville, Fl 32611, USA
E-mail: [email protected]
A redesigned starting point for covariant $\phi^4_n$, $n \ge 4$, models is suggested that takes the form of an alternative lattice action and which may have the virtue of leading to a nontrivial quantum field theory in the continuum limit. The lack of conventional scattering for such theories is understood through an interchange of limits.
Despite being perturbatively nonrenormalizable, the quantum theory of covariant scalar $\phi^4_n$ models has been shown to be trivial for all space-time dimensions $n \ge 5$, while for $n = 4$ it is widely believed to be trivial as well. 1 Triviality follows by showing that the conventionally lattice-regularized, Euclidean-space functional integral tends to a Gaussian distribution in the continuum limit independent of any choice of renormalizations for the mass, coupling constant, and field strength. Although mathematically sound, a trivial result is inconsistent in the sense that the classical limit of the quantized theory differs from the original (nontrivial) classical theory. In this article we reexamine this problem once again, and suggest an alternative formulation whereby quantum models for $\phi^4_n$ may be nontrivial. Generally, in what follows, we set $\hbar = 1$. We start with a lattice-regularized, Euclidean-space functional integral expressed as
$$S_a(h) = \Big\langle\exp\Big(\sum h_k\phi_k a^n\Big)\Big\rangle = N_a\int\exp\Big\{\sum h_k\phi_k a^n - \tfrac12 Z\sum(\phi_{k^*}-\phi_k)^2 a^{n-2} - \tfrac12 Z m_0^2\sum\phi_k^2 a^n - Z^2 g_0\sum\phi_k^4 a^n - \sum P(Z^{1/2}\phi_k, a)\,a^n\Big\}\prod d\phi_k. \tag{1}$$
Here $k = (k^0, \ldots, k^{n-1})$, $k^j \in \mathbb Z$ for all $j$, labels a lattice site; $k^*$ signifies one of the $n$ nearest neighbors to $k$ and the sums run over a large but finite hypercubic lattice; $a$ represents the lattice spacing; $h_k$ denotes the lattice-cell average of a smooth source function $h(x)$, $x \in \mathbb R^n$; $Z > 0$, $m_0^2$, and $g_0 \ge 0$ are functions of the cutoff $a$; and, for the present, the auxiliary term $P = 0$. We choose $N_a$ such that $S_a(0) = 1$, and let $\langle(\cdot)\rangle$ denote an average with respect to the resultant probability distribution. The continuum limit is defined as the limit $a \to 0$ in conjunction with a diverging number of lattice sites so that the space-time volume itself eventually tends to $\mathbb R^n$ in a suitable way. For $n \le 3$, the continuum limit leads to acceptable (nontrivial) results 2; for $n \ge 5$, on the other hand, the continuum limit has the form 1
$$\lim_{a\to0} S_a(h) = \exp\Big[\tfrac12\int h(x)\,C(x-y)\,h(y)\,d^nx\,d^ny\Big] \tag{2}$$
for a suitable covariance function $C(x-y)$, and all indications point to the same conclusion when $n = 4$. [If $C(x-y)$ is not locally integrable, then this condition is replaced by one in terms of correlation functions with noncoincident points.] Let us sketch one plausible argument that leads to trivial behavior. Consider the dimensionless and rescaling invariant correlation-function ratios, which also admit meaningful continuum limits, given, for $r \ge 1$, by
9{r)
X(4>o4>k2 ■ ■ ■ 4>kiT)T . r n 1) 2 - [E(^0 f c )] [Sfc2(
,„, ^'
by symmetry, all odd-order correlations vanish. For $n \ge 5$, mean field theory is generally accepted, and for small $a$ it leads to the behavior that $g_{(r)} \propto a^{(n-4)(r-1)}$. Thus for $n \ge 5$ and $r \ge 2$, $g_{(r)} \to 0$ in the continuum limit. The Lebowitz inequality, 2 which states that {
(4)
for $n = 4$ the right side of (4) should be replaced by $|\ln(a)|^{-1}$. With such a choice for $P$, the combined effects of the divergence due to long-range order and rescaled amplitude lead to $g_{(r)} \propto a^0 = 1$ for all $r \ge 1$, and for any $n \ge 4$. Choosing $Z \propto a^{n-4}$ [or $|\ln(a)|^{-1}$] properly rescales the correlation functions to macroscopic values. Such a theory would be non-Gaussian, hence nontrivial, in the continuum limit. Based on experience with related but soluble models, 4 we conjecture that a $P$ of the form
for suitably chosen $A$, $B$, and $C$ (which also depend on $n$), may do the job. As $a \to 0$, we expect that $A \to \infty$, $B \to 0$, and $C \to 0$. Thus the indicated expression is a regularized form of a formal continuum potential proportional to $1/\phi(x)^2$. However, just as the $1/r^2$ potential that arises from the kinetic energy in a spherically symmetric, quantum-mechanical situation necessarily carries a proportionality factor of $\hbar^2$, it is more proper to recognize that the formal auxiliary potential $P$ is proportional to $\hbar^2/\phi(x)^2$, i.e., $A \propto \hbar^2$. Carrying the analogy further, we observe that $P$ is not a counterterm for the quartic interaction but rather for the kinetic energy term. Thus we are led to propose that $P$ is a nonclassical, auxiliary potential, which explains its absence in a strictly classical limit in which $\hbar \to 0$. Based on related models, we are also led to conjecture that a $P$ having the desired properties leads to a quantum theory the classical limit of which agrees with the classical theory with which one started. Assuming that some such $P$ exists, we can proceed to derive, in a general fashion, certain additional facts. The nature of the nonclassical, auxiliary potential $P$ leads to a generalized Poisson distribution in the continuum limit. In Minkowski space-time the operator structure of such fields admits a relatively simple superstructure. Introduce the basic (Fock) operators $A_l$ and $A_l^\dagger$, $l \in \{0, 1, 2, \ldots\}$, where $[A_l, A_m] = 0$ and $[A_l, A_m^\dagger] = \delta_{lm}$ for all $l$ and $m$, and $A_l|0\rangle = 0$ for all $l$, with $|0\rangle$ unique. As usual, the Hilbert space is spanned by vectors of the form $|0\rangle$, $A_l^\dagger|0\rangle$, $A_l^\dagger A_m^\dagger|0\rangle$, etc. Furthermore, let $A_{lm}(x)$ [$= A_{ml}(x)^*$] and $C_l(x)$ denote a suitable set of complex fields. With summation implied, the Minkowski field operator has the representation 5
(6)
It follows from (6) that (0\
(7)
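assuming all odd-order correlation functions vanish. Since (0\
$$\le \langle 0|\varphi(f_1)\varphi(f_2)\varphi(f_2)\varphi(f_1)|0\rangle^T \times \langle 0|\varphi(g_1)\varphi(g_2)\varphi(g_2)\varphi(g_1)|0\rangle^T. \tag{8}$$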
Passing to asymptotic fields, $\varphi(f) \to \varphi_{\rm out}(f)$ and $\varphi(g) \to \varphi_{\rm in}(g)$, leads to
$$0 \le |\langle 0|\varphi_{\rm out}(f_1)\varphi_{\rm out}(f_2)\varphi_{\rm in}(g_2)\varphi_{\rm in}(g_1)|0\rangle^T|^2 \le \langle 0|\varphi_{\rm out}(f_1)\varphi_{\rm out}(f_2)\varphi_{\rm out}(f_2)\varphi_{\rm out}(f_1)|0\rangle^T \times \langle 0|\varphi_{\rm in}(g_1)\varphi_{\rm in}(g_2)\varphi_{\rm in}(g_2)\varphi_{\rm in}(g_1)|0\rangle^T = 0. \tag{9}$$
This behavior is consistent with the assumption that $A_{lm}(x) \to A_{lm}^{\rm out/in}(x) = 0$ and $C_l(x) \to C_l^{\rm out}(x) = C_l^{\rm in}(x)$. On the surface, it would appear that a quantum theory with no conventional scattering would be inconsistent with the original classical theory which is known to exhibit nontrivial classical scattering. 7 However, an interchange of limits is involved here, which, on closer inspection, shows that no inconsistency arises. If we assume the suggested form for the nonclassical, auxiliary potential $P$, there exists, in effect, an additional term in the operator energy density proportional to $\hbar^2/\varphi(x)^2$, or in the corresponding Heisenberg equation of motion an additional term proportional to $\hbar^2$/
great deal of energy and composed of a huge number of quanta. By the correspondence principle, the associated scattering behavior can be approximately treated classically and is therefore nonvanishing provided one sticks to possibly large, but finite, preparation and detection times in the past and future, respectively. In other words, such a field may not exhibit conventional quantum particle scattering in the strict sense, but under suitable conditions, an effective scattering theory may well exist that could help in recovering the nontrivial scattering of the original classical theory. Moreover, should there be any stable large-field configurations, such as solitons, then scattering between such entities may well exist. With $A(a) \to \infty$ sufficiently fast, the proposed form of $P$ given in (5) may avoid the fate (irrelevancy) of usual higher-order interactions in a renormalization-group treatment. Any analysis of these models should perhaps begin with the case $g_0 = 0$ for which, thanks to $P$, nontriviality is still expected.

Dedication
It is a pleasure to offer warm congratulations to Ludwig Streit on reaching his 60th birthday. He has been an exceptional friend and colleague for a great many years, and I wish him well in the future, scientifically and otherwise.

References
1. R. Fernandez, J. Froehlich, and A. Sokal, Random Walks, Critical Phenomena and Triviality in Quantum Field Theory (Springer-Verlag, New York, 1992).
2. J. Glimm and A. Jaffe, Quantum Physics (Springer-Verlag, New York, 1987), Second edition.
3. M. Fisher, Rep. Prog. Phys. 30, (1967) 615.
4. J.R. Klauder, Beyond Conventional Quantization (Cambridge University Press, Cambridge, 1999).
5. J.R. Klauder, Phys. Rev. Lett. 28, (1972) 769.
6. D. Buchholz, private communication.
7. M. Reed, Abstract Linear Wave Equations (Springer-Verlag, Berlin, 1976).
ON THE SPECTRUM OF LATTICE DIRAC OPERATORS
C. B. LANG
Institut für Theoretische Physik, Karl-Franzens-Universität Graz, A-8010 Graz, AUSTRIA, E-mail: Christian, [email protected]. at
With the Schwinger model as example I discuss properties of lattice Dirac operators, with some emphasis on Monte Carlo results for topological charge, chiral fermions and eigenvalue spectra.
1 Introduction
1.1 Continuum Concepts vs. Lattice Concepts
Relativistic particle physics is described by relativistic quantum field theory; the theories explaining the fundamental interactions are gauge theories. QFT as it is formulated in terms of Lagrangians and functional integration has to be regularized. The only known regularization scheme that retains the gauge symmetry is replacement of the space-time continuum by a space-time lattice. For convenience and other reasons this is done in a Euclidean world. The most prominent QFT is QCD, the SU(3) gauge theory of quarks and gluons. There various non-perturbative phenomena appear intertwined: confinement and chiral symmetry breaking. The classical continuum gauge fields $A$ that are continuous and differentiable, living on compact manifolds, may be classified by a topological quantum number (the Pontryagin index) $Q(A)$. Changing through continuous deformations of the field from one such topologically defined sector to another is impossible. Topology is closely related to the fermion zero modes; these are eigenstates (eigenvalue $E = 0$) of the Dirac operator
$$\gamma_\mu(\partial_\mu - i e A_\mu)\,\psi = E\psi = 0 \quad\to\quad \gamma_5\psi = \pm\psi. \tag{1}$$
Due to the anti-commutation of $\gamma_5$ with the Dirac operator the zero modes can be chosen as eigenstates of $\gamma_5$ with definite chirality. The Atiyah-Singer index theorem (ASIT) 1 relates the topological charge $Q(A)$ of the background gauge field to these modes,
$$Q(A) = \mathrm{index}(A) = n_+ - n_-, \tag{2}$$
where $n_\pm$ denotes the number of independent zero modes with positive or negative chirality, respectively. In two dimensions, another theorem of the continuum, the so-called Vanishing Theorem 2, ensures that only either positive or negative chirality zero modes occur. Quantization involves summation over non-differentiable fields and the lattice formulation does not provide for a unique definition of topological charge at all. Any lattice definition involves implicit or explicit assumptions, usually on the continuity and smoothness at the scale below one lattice spacing. Quantization usually means Monte Carlo integration on finite lattices and it is legitimate to question the ergodicity of the respective "simulation" with regard to the topological sectors. Even the task of putting fermions on the lattice introduces limitations. Various lattice Dirac operators have been proposed. It has been demonstrated early 3 that within quite general assumptions (like locality and reflection positivity) it is not possible to have single chiral fermions; chiral symmetry has to be broken explicitly. For the simple Wilson action, the breaking of the symmetry is so bad that no trace of the chiral properties of the continuum theory is kept in the lattice theory. If, however, chiral symmetry is broken already on the level of the action, how can one hope to identify the (expected) spontaneous chiral symmetry breaking of the full theory? Is there a lattice version of the ASIT? As we will discuss below, recent developments do allow us to construct lattice Dirac operators, which break chiral symmetry in a minimal way. Confinement, on the other hand, proved to be more straightforward. The gauge coupling $g$ enters the lattice gauge action in form of the multiplicative coupling $\beta = 1/g^2$. The lattice formulation works particularly well at strong coupling (small $\beta$), where confinement within a non-vanishing domain of the couplings was proved.
In the Monte Carlo calculations for lattice QCD up to now no signal was found that indicates a phase transition between that confinement phase and the weak coupling (perturbative) regime; for finite temperature such a deconfinement transition was established. In order to retrieve continuum QFT numbers for the physical quantities, like masses of hadronic bound states or certain matrix elements, one has to show that all dimensional physical quantities scale according to the scaling function of the lattice spacing $a(\beta)$, which in turn should asymptotically (in the continuum limit $\beta \to \infty$, $a \to 0$) agree with continuum renormalization group scaling. Current lattice studies have to live in that environment: Try to improve scaling properties in order to get reliable continuum results and try to deal with chirality without destroying it from the beginning!
1.2 Lattice Dirac Operators

All Dirac operators have $\gamma_5$-hermiticity,
$$\gamma_5\,\mathcal D\,\gamma_5 = \mathcal D^\dagger. \tag{3}$$
It follows that
• The eigenvalues are either real or come in complex conjugate pairs.
• The operator $\gamma_5\mathcal D$ has a real spectrum.
• We denote by $v_i$ the eigenvector of $\mathcal D$ for eigenvalue $\lambda_i$; then $\gamma_5 v_i$ is an eigenvector of $\mathcal D^\dagger$ for the same eigenvalue.
• The diagonal entries of the chiral density matrix vanish for non-real eigenvalues: $\lambda_i \notin \mathbb R \to \langle v_i|\gamma_5 v_i\rangle = 0$.
• The non-diagonal entries of the chiral density matrix vanish whenever their respective eigenvalues are not complex conjugate pairs: $\lambda_i \ne \bar\lambda_j \to \langle v_i|\gamma_5 v_j\rangle = 0$.
An important conclusion is that only real eigenvalues lead to contributions to the diagonal elements. Thus these modes are the only candidates for zero modes in the continuum limit: in the continuum theory the (normalized) zero modes contribute to the diagonal entries, $\langle\psi|\gamma_5\psi\rangle = \pm1$. Chirality is necessarily broken for lattice Dirac operators. However, barely noticed for almost two decades, Ginsparg and Wilson 4 formulated a condition (GWC) under which 5 remnants of the chiral symmetry survive in a lattice action for massless fermions. If the Dirac operator obeys
$$\tfrac12\{\gamma_5, \mathcal D\} = a\,\mathcal D\gamma_5 R\,\mathcal D, \tag{4}$$
where $R$ denotes a local matrix, then chiral symmetry is violated only by a local term $O(a)$. Dirac operators satisfying the GWC will be called GW Dirac operators. Lüscher 6 has pointed out the explicit form of the associated symmetry of the action,
$$\psi \to \exp[i\theta\gamma_5(1 - aR\mathcal D)]\,\psi, \qquad \bar\psi \to \bar\psi\,\exp[i\theta(1 - a\mathcal D R)\gamma_5]. \tag{5}$$
The GWC provides a sensible way (respecting the Nielsen-Ninomiya theorem) to construct suitable Dirac operators. As has been shown 7 such actions cannot be ultra-local, but may be local, i.e. the coupling exponentially damped in real space.
Figure 1. The FP action is the classical perfect action; the quantum perfect action may deviate from it away from the FP. [Labels in the figure: renormalized trajectory, quantum perfect action, fixed point action.]
For certain classes of Dirac operators one may derive stringent bounds on the shape of the eigenvalue spectrum. This has been discussed in the framework of so-called fixed point (FP) actions 8,5. On the lattice real space renormalization group (RG) transformations are realized through block spin transformations. The field variables over a localized region around a site $x$ are averaged to produce the field variable of a coarser lattice. This mapping procedure in the space of configuration ensembles may be associated to a parameter flow in the space of actions. It is implicitly assumed that indeed a Gibbsian measure is suitable to describe the ensemble of blocked configurations. The continuum limit of the quantum theory is obtained at a critical FP of the system (cf. Fig.1). The so-called renormalized trajectory in this space of actions leads to the FP and along its path there are no corrections to scaling. Ideally one could simulate the system at any point of the renormalized trajectory (with the so-called quantum perfect action) and obtain "perfect" continuum results. In reality that action may be quite complicated and has to be truncated in the number of coupling constants. Hasenfratz et al. 8 have suggested to determine instead the action at the FP of the RG transformation. For (asymptotically) free theories it may be determined from the classical field equations. The action has therefore been baptized "classical perfect action". Its classical predictions agree with those of the continuum action independent of the coarseness of the lattice. FP actions are solutions of the GWC 5. At the classical level, for FP actions, the Atiyah-Singer theorem finds correspondence on the lattice 5,9; at the quantum level, no fine tuning, mixing and current renormalization occur, and a natural definition for an order parameter of the spontaneous breaking of the chiral symmetry is possible 10.

Figure 2. For FP Dirac operators the eigenvalues are confined in the shaded area of two circles tangential to the imaginary axis. The thick line indicates the position of the spectrum for $R = 1/2$.

$R$ is then local and bounded and as a consequence the spectrum of $\mathcal D$ in complex space is confined between two circles 9 (cf. Fig.2),
$$|\lambda - r_{\min}| \ge r_{\min}, \qquad |\lambda - r_{\max}| \le r_{\max}, \tag{6}$$
v + £>* = r> t- p = r » r > t .
(7)
On the lattice, an index for V may be defined in a way analogous to the continuum, explicitly expressed by the relation9 index([/) = - t r ( 7 5 i ? P ) .
(8)
For GW Dirac operators and R = 1/2 this relation comes out trivially. Only the modes with real non-vanishing eigenvalues contribute to the trace in the r.h.s; since the overall chirality must be zero, it reproduces up to a sign index([/). This index can be used to define a fermionic lattice topological charge, Q(erm(U) = index(I/)> for which the ASIT is satisfied by definition. For the FP action that fermionic definition coincides9 with the pure-gauge quantity QFP{U), the FP topological charge13 of the configuration U: Q(erm(U) = index(C/) = QFp(U) .
(9)
The non-obviousness of this relation relies on the fact that QFP(U) can be defined in the pure gauge theory, without any regard to the fermion part. This result is particular for a FP action and has no counterpart for a general
290 (non-FP) G W action. Of course, in practical implementations one relies on approximate parametrizations of the FP Dirac operator and the strictness of the relation is lost. 1.3
Test Bed Schwinger Model
As our test bed we consider a 2-dimensional (2D) quantum field theory with U(1) gauge group and $N_f$ flavors of fermions. The action for the massless continuum model reads
S = JtPx l^F^F^ + f^jfVty]
.
(10)
For $N_f = 1$ this is the Schwinger model 14. This 2D version of QED resembles 4-dimensional QCD in various ways 15,16. Quarks are "trapped", i.e. in a mechanism superficially mimicking confinement we observe only bosonic asymptotic states. For $N_f = 1$ this is the Schwinger boson (called $\eta$ by analogy to 4D) with the mass $m_\eta = g/\sqrt\pi$ for the physical gauge coupling $g$. For the 2-flavor model one expects also a triplet of massless bosons (pions) 16, although in 2D there can be no spontaneous symmetry breaking due to the Mermin-Wagner-Coleman theorem 17. Finally, there is a non-vanishing condensate $\langle\bar\psi\psi\rangle$ due to an anomaly. In the lattice formulation the gauge action in the compact Wilson formulation is written
$$S_g = \beta\sum_{x\in\Lambda}(1 - \mathrm{Re}\,U_p), \tag{11}$$
where $\Lambda$ denotes the lattice $\mathbb Z_N^2$ and the plaquette variable $U_p$ is the oriented product of link variables $U_{x,\mu} \in U(1)$ at site $x$ in direction $\mu = 1, 2$. In the continuum limit $U_{x,\mu} \sim \exp(i a g A_\mu(x))$ where $A$ is the gauge field in the non-compact continuum formulation, $a$ denotes the lattice spacing, and $\beta = 1/(g^2a^2)$. A lattice version for the integer geometric topological charge of the gauge field may be defined,
$$Q(A) = \frac{g}{2\pi}\int d^2x\,F_{12}(x) \quad\to\quad Q(U) = \frac{1}{2\pi}\sum_p \mathrm{Im}\ln U_p. \tag{12}$$
In torus geometry 18 (i.e. periodic boundary conditions) this number may be non-zero for compact gauge field configurations.
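The geometric charge in (12) can be evaluated directly from the link angles of a compact U(1) configuration on a periodic lattice. The sketch below is illustrative only: the array layout `theta[mu][x][y]`, the orientation of the plaquette, and the function name `geometric_charge` are assumptions, and the overall sign of $Q$ depends on that orientation.

```python
import cmath
import math
import random

def geometric_charge(theta, N):
    """Q(U) = (1 / 2 pi) * sum_p Im ln U_p on an N x N torus.

    theta[mu][x][y] is the angle of the link leaving site (x, y)
    in direction mu (0 or 1), so that U = exp(i * theta).
    """
    total = 0.0
    for x in range(N):
        for y in range(N):
            xp, yp = (x + 1) % N, (y + 1) % N      # periodic boundary conditions
            angle = (theta[0][x][y] + theta[1][xp][y]
                     - theta[0][x][yp] - theta[1][x][y])
            U_p = cmath.exp(1j * angle)
            total += cmath.log(U_p).imag           # principal value in (-pi, pi]
    return total / (2.0 * math.pi)

# A random compact configuration (hypothetical test data).
random.seed(0)
N = 6
theta_random = [[[random.uniform(-math.pi, math.pi) for _ in range(N)]
                 for _ in range(N)] for _ in range(2)]
Q_random = geometric_charge(theta_random, N)
```

Each link angle enters two plaquettes with opposite sign, so the unwrapped plaquette angles sum to zero on the torus; taking the principal value of each plaquette shifts the sum by multiples of $2\pi$, which is why the result is an integer up to rounding.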
The lattice fermion action is formally
$$\bar\psi\,\mathcal D(m)\,\psi, \tag{13}$$
where $\mathcal D(m)$ denotes the lattice Dirac operator matrix (fermion mass $m$) and the fermions are Grassmann fields. In a 2D context the Dirac matrices $\gamma_\mu$ and $\gamma_5$ are to be replaced by $\sigma_\mu$ and $\sigma_3$. In the quantized theory the Grassmann integration over the fermions yields factors $(\det\mathcal D)$ and the expectation value of some operator $C$ may be written
$$\langle C\rangle = \frac1Z\int[dU]\,e^{-S_g(U)}\,(\det\mathcal D)^{N_f}\,C(U), \tag{14}$$
where one usually samples over the gauge fields with some Monte Carlo procedure. In the so-called quenched approximation ($N_f = 0$) the fermionic determinant is not included; this essentially neglects the fermionic vacuum loops. Dynamical fermions can be included either by incorporating them into the sampling probability measure for the gauge configurations or in the observable. The first case is realized in the so-called "Hybrid Monte Carlo" method. However, since the determinant may be negative, only even numbers of mass-degenerate fermions can be studied then. The second approach requires calculation of the determinant for each gauge configuration. It is plagued by the notorious sign problem, giving rise to possibly violent fluctuations and large statistical errors of the results. The gauge integral should sample over all topological sectors. For Dirac operators which allow exact zero modes the determinant weight (for massless fermions) removes the corresponding contribution to the integral.

2 Lattice Fermions and Topology

We have been studying the lattice Schwinger model for various lattice sizes and values of the gauge coupling, both quenched and with dynamical fermions. Here we discuss some of our results, with some emphasis on the properties of the spectra of GW operators.

2.1 The Wilson Dirac Operator

The original Wilson action 19 for fermions has the form
$$\mathcal D_{\rm Wi} = (m+2)\,\mathbb 1 - \tfrac12 M \tag{15}$$
Figure 3. Spectrum of the hopping matrix $M$ for an 8 x 8 lattice at various values of $\beta$, close to $\kappa_c(\beta)$; $Q$ denotes the (geometric definition of the) topological charge. The eigenvalues for free fermions are degenerate. [Panels: $\beta = 2.0$ and $\beta = 4.0$ in several topological sectors $Q$, and free fermions.]
*y=
5Z [(l + ^ ) ^ y ^ , „ - A , + (l-o-/1)C/Htx<5x,!/+^] .
(16)
The hopping parameter $\kappa$ defines the fermion mass, for free lattice fermions $\kappa = 1/(2m + 4)$. However, for the full theory with gauge interactions this relationship no longer holds. On one hand due to quark trapping there are no asymptotic fermions. On the other hand chiral symmetry is explicitly broken. Since this breaking is a local feature, we expect that at some $\kappa_c(\beta)$ the chiral symmetry is formally restored in the sense that the corresponding Ward identity is satisfied with a vanishing fermion mass parameter. This idea 20 has been pursued in QCD 21 and we have utilized it also in the Schwinger model with Wilson fermions in order to identify the position of the critical line, where the fermion mass parameter vanishes 22. In Fig.3 we compare 23 typical eigenvalue spectra of the hopping matrix $M$ at different values of the gauge coupling and in different topological sectors. We find strong correlation between the number of real eigenvalues and the topological charge. Closer inspection leads to the conclusion that indeed the real modes (counted according to their chirality 24) may be interpreted as the would-be zero modes. Actually one has to divide that number by a trivial multiplicity factor of 4, since only the rightmost modes become zero modes of $\mathcal D_{\rm Wi}$, whereas the other ones are doublers at other corners of the Brillouin zone. The agreement of this number with the topological charge improves towards the continuum limit. In that sense even the Wilson Dirac operator obeys the ASIT for $\beta \to \infty$; e.g. at $\beta = 4$ in 99% of the gauge configurations (sampled in a hybrid Monte Carlo simulation with dynamical fermions) the number of real modes (divided by 4) agrees with the topological charge. Also the Vanishing theorem is obeyed 23.

2.2 The Fixed Point Dirac Operator
Although it was possible to obtain FP actions for some scalar or fermionic systems (cf. the summary in Ref.5) it turned out to be be a formidable problem for realistic gauge-fermion systems like QCD. We have succeeded25 to find explicitly an approximate FP action for the lattice Schwinger model. The FP Dirac operator is parametrized by a set of 429 terms bilinear in the fermion and anti-fermion fields,
The sum runs over paths / connecting a central site x with sites in a 7 x 7 neighborhood; U(x, / ) denotes the product of link gauge variables along that path. Standard symmetries are taken into account. Altogether we considered 123 independent coupling parameters. The general idea is to start with a parametrization of the action, generate a set of gauge field configurations, perform the real space RG transformation and identify the blocked action. The matrix elements of the Dirac operator on the blocked configurations are then compared with the parametrization and the coupling constants adjusted. The whole procedure is iterated until the parameters converge. We studied samples of 50 gauge configurations and 14 x 14 lattices at large values of /?. We could show, that the resulting fermion action had couplings damped exponentially with their spatial extent; although the parametrized action is anyhow ultra-local by construction, this observation indicates that locality of an untruncated FP action seems feasible. Using the FP action we determined the bosonic bound state propagators for the Nf = 1 and 2 Schwinger model and found excellent rotational invariance and good scaling properties (cf. Fig.5), to be discussed below in context of other actions. For the eigenvalue spectrum of the FP operator we found the situation shown in Fig.4: The eigenvalues are distributed close to a unit circle, in partic ular towards the continuum limit (growing /3). Towards the "hot" region the
Figure 4. Eigenvalues for sets of 25 configurations (superimposed) for lattice size 16 x 16, sampled according to the compact gauge action (from Ref. 26). Axes: ${\rm Re}\,\lambda$ (horizontal), ${\rm Im}\,\lambda$ (vertical).
fuzziness increases. This demonstrates the predicted behavior for FP Dirac operators discussed in Sect. 1.2. The deviation from exact circularity is explained by the approximate nature of the action, which gives rise to fluctuations at smaller $\beta$. Most remarkable is that we do indeed find (within the numerical accuracy) vanishing eigenvalues. These are individual zero modes, and computing the matrix element $\langle v_i | \sigma_3 v_i \rangle$ from their corresponding eigenvectors we find values $\pm 1$. Thus these zero modes have definite chirality. The partner modes with opposite chirality are at the other edge of the spectrum, at $\lambda = 2$, irrelevant in the continuum limit. The zero modes occur whenever the geometric topological charge of the gauge configuration is non-zero, and its value agrees with the number of zero modes, even at finite $\beta$. This agreement with the ASIT is quantitatively much better than for the Wilson action.
2.3
Neuberger's Overlap Dirac Operator
Motivated by the overlap action 11, Neuberger proposed 12 to start with some Dirac operator with sufficiently negative mass, e.g. the Wilson operator at a value of $m$ corresponding to $\frac{1}{2D} < \kappa < \frac{1}{2D-2}$, and then construct
$$ \mathcal{D}_{\rm Ne} = 1 + \gamma_5\,\epsilon(\gamma_5 \mathcal{D}_{\rm Wi}) . \tag{18} $$
Although the actual value of $\kappa$ used in this definition is largely arbitrary, its choice may influence the approach to scaling in the continuum limit. The generalized sign function of the hermitian operator $\gamma_5 \mathcal{D}_{\rm Wi}$ may be interpreted
Figure 5. The dispersion relations $E(p)$ for the $\pi$ (left) and the $\eta$ (right) propagators, determined for three actions discussed in the text: Wilson (squares), fixed point (circles), Neuberger (diamonds) (from Ref. 27).
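as a realization of
$$ \epsilon(\gamma_5 \mathcal{D}_{\rm Wi}) = \frac{\gamma_5 \mathcal{D}_{\rm Wi}}{\sqrt{(\gamma_5 \mathcal{D}_{\rm Wi})^2}} = \gamma_5\, \frac{\mathcal{D}_{\rm Wi}}{\sqrt{\mathcal{D}_{\rm Wi}^\dagger \mathcal{D}_{\rm Wi}}} \tag{19} $$
and is determined through the eigenvectors and the eigenvalues,
$$ \epsilon(\gamma_5 \mathcal{D}_{\rm Wi}) = U\,{\rm Sign}(\Lambda)\,U^\dagger \quad \text{with} \quad \gamma_5 \mathcal{D}_{\rm Wi} = U \Lambda U^\dagger . \tag{20} $$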
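As a hedged illustration of the construction in Eqs. (18)-(20), the sketch below treats only the free case (all link variables set to 1) in two dimensions, where the Wilson operator is diagonal in momentum space and the sign function can be written in closed form. The mass value `m = -0.5`, the lattice size 8 and all function names are assumptions made for this example; they are not taken from the text.

```python
import math

def wilson_free_2d(p1, p2, m):
    """Free 2D Wilson operator at momentum (p1, p2), written as
    a*1 + i*(b1*sigma_1 + b2*sigma_2); returns a and (b1, b2)."""
    a = m + (1.0 - math.cos(p1)) + (1.0 - math.cos(p2))
    return a, (math.sin(p1), math.sin(p2))

def overlap_eigenvalues(p1, p2, m=-0.5):
    """Eigenvalues of 1 + gamma5 * sign(gamma5 * D_W) for one momentum.

    In the free case D_W^dagger D_W = (a^2 + b^2) * 1, so
    gamma5 * sign(gamma5 D_W) reduces to D_W / sqrt(a^2 + b^2),
    whose eigenvalues are (a +/- i|b|) / sqrt(a^2 + b^2).
    """
    a, (b1, b2) = wilson_free_2d(p1, p2, m)
    b = math.hypot(b1, b2)
    norm = math.hypot(a, b)
    return (1 + complex(a, b) / norm, 1 + complex(a, -b) / norm)

if __name__ == "__main__":
    size = 8  # illustrative lattice extent
    for k1 in range(size):
        for k2 in range(size):
            p = (2 * math.pi * k1 / size, 2 * math.pi * k2 / size)
            for ev in overlap_eigenvalues(*p):
                # every eigenvalue lies on the unit circle centred at 1
                assert abs(abs(ev - 1) - 1) < 1e-12
    print(overlap_eigenvalues(0.0, 0.0)[0])      # physical mode
    print(overlap_eigenvalues(math.pi, 0.0)[0])  # doubler momentum
```

In this free-field sketch the mode at zero momentum sits at the origin while the doubler momenta are moved to the opposite edge of the circle, at 2. With gauge fields the sign function would instead have to be obtained from a numerical eigendecomposition as in Eq. (20).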
(Sign($\Lambda$) denotes the diagonal matrix of signs of the eigenvalue matrix $\Lambda$.) Gauge configurations with non-zero topological charge imply 12 exact zero eigenvalues of $\mathcal{D}_{\rm Ne}$; therefore exact chiral modes are realizable. This was confirmed e.g. in Ref. 27; we found an exactly circular eigenvalue spectrum and zero modes. Due to the freedom of choice of the hopping parameter in the Wilson action used to construct $\mathcal{D}_{\rm Ne}$, it cannot be excluded that some real doubler modes of the Wilson action are implicitly mistaken for zero modes, leading to a violation of the ASIT at finite $\beta$. This situation can be efficiently improved if one starts with a better action, e.g. an approximate FP action. An immediate question concerns the locality of this operator. In 4D QCD it has been demonstrated 28 that for large enough $\beta$ one may expect locality, and for a Monte Carlo generated ensemble of gauge configurations an exponential fall-off of the effective coupling parameters has been shown. Neuberger's operator provides an explicit example of a GW action with optimal chirality properties. However, nothing can be concluded as to the scaling properties towards the continuum limit. In fact, in Fig. 5 we see that the spectral properties of the bound state propagators are not improved over
those of the original Wilson action. $\mathcal{D}_{\rm Ne}$ is automatically $O(a)$ corrected 29,30 and thus improves scaling for the on-shell quantities; without at least introducing improvement of the current operators one would not expect improvement for the propagators, as shown by our results. The two features, chirality and scaling, may seemingly be considered quite separate issues. Optimal actions should allow chiral fermions and have good scaling properties.
2.4
Spectral Distribution and chRMT
The limiting value of the spectral density for small eigenvalues and large volume,
$$ -\pi \lim_{\lambda \to 0}\, \lim_{V \to \infty} \rho(\lambda) = \langle \bar\psi \psi \rangle , \tag{21} $$
provides an estimate for the chiral condensate due to the Banks-Casher relation 31 . Exact zero modes are disregarded, only the density close to zero is of relevance. In order to study this property of the lattice spectra (for the GW Dirac operators) one should map the eigenvalues from the circular shape to the imaginary axis. This is done by a stereographic projection
$$ \tilde\lambda = \frac{\lambda}{1 - \lambda/2} . \tag{22} $$
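A minimal numerical sketch of such a projection, assuming the map $\lambda \to \lambda/(1 - \lambda/2)$ for eigenvalues on the unit circle centred at 1 (the function names and the parametrization by an angle `phi` are chosen here for illustration only):

```python
import cmath
import math

def circle_point(phi):
    """Point on the circle |lam - 1| = 1; phi = 0 corresponds to lam = 0."""
    return 1 - cmath.exp(1j * phi)

def project(lam):
    """Assumed stereographic projection of the circle onto the imaginary axis."""
    return lam / (1 - lam / 2)

if __name__ == "__main__":
    for phi in (0.01, 0.1, 0.5, 1.0, 2.0, 3.0):
        lam = circle_point(phi)
        # projected value is purely imaginary and close to lam itself for small phi
        print(phi, lam, project(lam))
```

For this parametrization the projected value equals $-2i\tan(\phi/2)$, so it is purely imaginary, tends to the circle point itself near the origin, and sends the opposite edge of the circle ($\lambda = 2$) to infinity.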
Near zero the resulting distribution on the (tangential) imaginary axis agrees with that on the circle. The generalization of such a projection for a general GW operator is the introduction of
$$ \tilde{\mathcal{D}} = \frac{\mathcal{D}}{1 - \mathcal{D}/2} , \tag{23} $$
which anti-commutes with $\gamma_5$, is anti-hermitian and has purely imaginary spectrum. A procedure to define a subtracted fermion condensate has been suggested by Hasenfratz 5 (within the overlap formalism cf. Ref. 32),
(see also Refs. 12, 26, 32, 33). The expectation value includes the weight due to the fermionic determinant. In Fig. 6 (from Ref. 27) the spectral density for the Neuberger operator is compared with that of the FP operator; near zero we find good agreement with each other and with the value expected from the continuum theory. Studying the spectra of the Dirac operators suggests comparison with Random Matrix Theory 35 (RMT). The spectrum is separated into a fluctuation
Figure 6. The unquenched ($N_f = 1$) eigenvalue density distribution (16 x 16, $\beta = 6$) projected from the circle onto the imaginary axis for $\mathcal{D}_{\rm FP}$ (thick lines) and $\mathcal{D}_{\rm Ne}$ (thin lines). The horizontal line denotes the continuum value at infinite volume.
part and a smooth background. The fluctuation part is conjectured to follow predictions in one of three universality classes. For chiral Dirac operators (chiral RMT 36) these are denoted by chUE, chOE and chSE (chiral unitary, orthogonal or symplectic ensemble, respectively). Several observables have been studied in this theoretical context. Comparison with the data should verify the conjecture and allows one to determine the chiral condensate. This information is contained both in the smooth average of the spectral distribution and in the fluctuating part. In particular the distribution for the smallest eigenvalue, $P(\lambda_{\min})$, contains this observable: its scaling properties with $V$ are given by unique functions of a scaling variable $z = \lambda V \Sigma$, depending on the corresponding universality class. Usually this is the most reliable approach to determine $\Sigma$, which then serves as an estimate for the infinite volume value of the condensate in the chiral limit. Within the Schwinger model we found 34 the universal properties of the (expected) chUE class, unless the physical lattice volume is too small. In Fig. 7 we show the distribution of the smallest eigenvalue for two of the actions studied. Further examples, also including dynamical fermions and 4D applications, are discussed in Refs. 34, 37, 38. It has become evident that chRMT describes the fluctuating part of the spectrum satisfactorily within the expected universality classes. It provides a means to separate the universal features from quantities like the chiral condensate, which have dynamic origin. Or, to say it (tongue in cheek) more provocatively, to separate the known and thus uninteresting universal properties from the unknown, physical properties of the system.
Figure 7. The distribution density $P(\lambda_{\min})$ for the smallest eigenvalue in the quenched ($N_f = 0$) case in the sector with topological charge zero. The full line gives the prediction for the chUE class, the broken line would correspond to the chSE universality class. Panels: FP, $L = 24$, $\beta = 4$, $N_f = 0$, $\nu = 0$; Neuberger, $L = 16$, $\beta = 2$, $N_f = 0$, $\nu = 0$.
3
Conclusion
The main messages I wanted to deliver here are
• Lattice gauge theory has made a big step forward in the understanding of chiral fermions and their lattice formulation. The Ginsparg-Wilson relation is a cornerstone in that development. Meanwhile U(1) chiral gauge theories with anomaly-free multiplets of Weyl fermions have been constructed properly 39.
• Neuberger's overlap action is a wonderful testing ground for chirality; its implementation in 4D is computationally expensive. Perfect actions ideally provide the best of both, chirality and scaling, but their construction in realistic 4D models seems to face insurmountable problems. In a 2D gauge theory, however, we were able to construct an explicit example, which has all the beautiful features expected.
• The spectra of Dirac operators have universal features, which are described by chiral RMT. This opens the path to efficient methods to extract from the data non-perturbative quantities like the chiral condensate.
• The Schwinger model has provided an excellent test case to study various features also observed in QCD. Although maybe of different dynamical origin, mechanisms like confinement, chirality and symmetry breaking can be nicely studied in their lattice realization. This may lead to a better understanding also for the 4D theories.
Acknowledgments
Ludwig Streit has been a colleague and friend for a long time; I learned a lot from him although we never collaborated (except for surfing in Santa Barbara). I want to thank F. Farchioni, C. Gattringer, I. Hip, T. Pany and M. Wohlgenannt for collaboration on various topics within the Schwinger model and for allowing me to present also some of their contributions in this text. This write-up tries to present the results available at the conference in Lisbon, Oct. 1998; only the references have been brought up to date. Support by Fonds zur Förderung der Wissenschaftlichen Forschung in Österreich, Project P11502-PHY, is gratefully acknowledged.
References
1. M. Atiyah and I. M. Singer, Ann. Math. 93, 139 (1971).
2. J. Kiskis, Phys. Rev. D 15, 2329 (1977); N. K. Nielsen and B. Schroer, Nucl. Phys. B 127, 493 (1977); M. M. Ansourian, Phys. Lett. 70B, 301 (1977).
3. H. Nielsen and M. Ninomiya, Nucl. Phys. B 185, 20 (1981); ibid. 193, 173 (1981); ibid. 195, 541 (1982).
4. P. H. Ginsparg and K. G. Wilson, Phys. Rev. D 25, 2649 (1982).
5. P. Hasenfratz, Nucl. Phys. B (Proc. Suppl.) 63A-C, 53 (1998).
6. M. Lüscher, Phys. Lett. B 428, 342 (1998).
7. I. Horvath, Phys. Rev. Lett. 81, 4063 (1998).
8. P. Hasenfratz and F. Niedermayer, Nucl. Phys. B 414, 785 (1994).
9. P. Hasenfratz, V. Laliena, and F. Niedermayer, Phys. Lett. B 427, 125 (1998).
10. P. Hasenfratz, Nucl. Phys. B 525, 401 (1998).
11. R. Narayanan and H. Neuberger, Phys. Lett. B 302, 62 (1993); Phys. Rev. Lett. 71, 3251 (1993); Nucl. Phys. B 412, 574 (1994); ibid. 443, 305 (1995).
12. H. Neuberger, Phys. Lett. B 417, 141 (1998); ibid. 427, 353 (1998).
13. M. Blatter, R. Burkhalter, P. Hasenfratz, and F. Niedermayer, Phys. Rev. D 53, 923 (1996).
14. J. Schwinger, Phys. Rev. 125, 397 (1962); ibid. 128, 2425 (1962).
15. J. H. Lowenstein and J. A. Swieca, Ann. Phys. 68, 172 (1971); S. Coleman, R. Jackiw, and L. Susskind, Ann. Phys. 93, 267 (1975); S. Coleman, Ann. Phys. 101, 239 (1976); J. Fröhlich and E. Seiler, Helv. Phys. Acta 49, 889 (1976); L. V. Belvedere, J. A. Swieca, K. D. Rothe, and B. Schroer, Nucl. Phys. B 153, 112 (1979); J. Challifour and D. Weingarten, Ann. Phys. (N.Y.) 123, 61 (1979); H. Joos and S. I. Azakov, Helv. Phys. Acta 67, 723 (1994); H. Dilger, Nucl. Phys. B 434, 321 (1995); H. Dilger and H. Joos, Nucl. Phys. B (Proc. Suppl.) 34, 195 (1994).
16. C. R. Gattringer and E. Seiler, Ann. Phys. 233, 97 (1994).
17. N. D. Mermin and H. Wagner, Phys. Rev. Lett. 17, 1133 (1966); S. Coleman, Commun. Math. Phys. 31, 259 (1973).
18. H. Joos, Helv. Phys. Acta 63, 670 (1990); I. Sachs and A. Wipf, Helv. Phys. Acta 65, 653 (1992).
19. K. G. Wilson, Phys. Rev. D 10, 2445 (1974).
20. M. Bochicchio et al., Nucl. Phys. B 262, 331 (1985).
21. K. Jansen et al., Phys. Lett. B 372, 275 (1996).
22. I. Hip, C. B. Lang, and R. Teppner, Nucl. Phys. (Proc. Suppl.) 63, 682 (1998).
23. C. R. Gattringer, I. Hip, and C. B. Lang, Nucl. Phys. B 508, 329 (1997).
24. P. Hernandez, Nucl. Phys. B 536, 345 (1998).
25. C. B. Lang and T. K. Pany, Nucl. Phys. B 513, 645 (1998); Nucl. Phys. B (Proc. Suppl.) 63A-C, 898 (1998).
26. F. Farchioni, C. B. Lang, and M. Wohlgenannt, Phys. Lett. B 433, 377 (1998).
27. F. Farchioni, I. Hip, and C. B. Lang, Phys. Lett. B 443, 214 (1998).
28. P. Hernandez, K. Jansen, and M. Lüscher, hep-lat/9808010, CERN-TH/98-250 (unpublished).
29. Y. Kikukawa, R. Narayanan, and H. Neuberger, Phys. Lett. B 399, 105 (1997).
30. F. Niedermayer, Nucl. Phys. B (Proc. Suppl.) 73, 105 (1999).
31. T. Banks and A. Casher, Nucl. Phys. 169, 103 (1980).
32. H. Neuberger, Phys. Rev. D 57, 5417 (1998).
33. T.-W. Chiu and S. V. Zenkin, Phys. Rev. D 59, 074501 (1999).
34. F. Farchioni, I. Hip, C. B. Lang, and M. Wohlgenannt, Nucl. Phys. B 549, 364 (1999).
35. T. Guhr, A. Müller-Groeling, and H. A. Weidenmüller, Phys. Rep. 299, 189 (1998).
36. H. Leutwyler and A. Smilga, Phys. Rev. D 46, 5607 (1992); E. V. Shuryak and J. J. M. Verbaarschot, Nucl. Phys. A 560, 306 (1993); J. J. M. Verbaarschot, Phys. Rev. Lett. 72, 2531 (1994); P. H. Damgaard, Phys. Lett. B 424, 322 (1998); P. H. Damgaard, J. C. Osborn, D. Toublan, and J. J. M. Verbaarschot, Nucl. Phys. B 547, 305 (1999); G. Akemann and P. H. Damgaard, Nucl. Phys. B 528, 411 (1998); J. C. Osborn, D. Toublan, and J. J. M. Verbaarschot, Nucl. Phys. B 540, 317 (1999).
37. M. E. Berbenni-Bitsch et al., Phys. Rev. Lett. 80, 1146 (1998); M. E. Berbenni-Bitsch, S. Meyer, and T. Wettig, Phys. Rev. D 58, 071502 (1998); M. E. Berbenni-Bitsch et al., Nucl. Phys. B (Proc. Suppl.) 73, 605 (1999).
38. R. G. Edwards, U. M. Heller, J. Kiskis, and R. Narayanan, Phys. Rev. Lett. 82, 4188 (1999).
39. M. Lüscher, Nucl. Phys. B 549, 295 (1999); and hep-lat/9904009 (unpublished).
INTEGRALS OF MOTION AND QUANTUM FLUCTUATIONS
V. I. MAN'KO
P.N. Lebedev Physical Institute, Leninskii Pr. 53, Moscow 117924, Russia
E-mail: [email protected]
The description of both classical and quantum states with fluctuations, in view of the concept of the tomographic-probability distribution, is presented. Time-dependent integrals of motion (invariants) for classical and quantum systems with noise are discussed. The relation of the invariants to the propagator of the classical Boltzmann equation and of the quantum evolution equation for the density operator, written in the tomographic-probability representation, is elucidated. Examples of a free particle and the quantum harmonic oscillator are given in detail.
1
Introduction
Fluctuations of physical observables and noises in physical processes1 play an important role in both classical and quantum domains. There is an essential difference in the nature of classical and quantum fluctuations. The latter cannot be annihilated by any means due to the quantum uncertainty principle (uncertainty relation).2-4 In contradistinction to quantum noise, the classical fluctuations can, in principle, be reduced to zero (by increasing the accuracy of measurement and by decreasing the temperature). Till now, in the description of classical and quantum fluctuations, similar but nevertheless different mathematical tools were employed. The presence of classical noise in the state of a physical system is described by a nonnegative probability-distribution function, which obeys a classical kinetic equation. The quantum fluctuations are associated with the description of a pure quantum state by a complex wave function, which obeys the Schrödinger equation,5 or, in the generic case, by a density matrix,6 which obeys a quantum kinetic equation. During the last 70 years, several attempts of different kinds were made to construct a bridge between the classical and quantum pictures.7 For example, Feynman introduced the path integral method.8 Physical and mathematical aspects of the Feynman path integral are presented in detail in reviews.9,10 Wigner proposed a real quasiprobability-distribution function (known as the Wigner function) to describe a quantum state.11 The quantum-evolution equation was given for the quasidistribution by Moyal.12 Recently, a solution of the problem of giving a unified description of classical and quantum states was found: the nonnegative tomographic-probability distribution, obeying a Fokker-Planck-like evolution equation, was suggested to describe the quantum states,13,14 along with the tomographic probability which was used to describe states in classical statistical mechanics.15 The invertible map of the tomographic probability onto the wave function of a pure state, for the description of analytic signals, was found in explicit form.16 The tomographic-probability description was successfully applied in quantum mechanics,17,18 and this approach was extended to systems with spin.19,20 In electronic-beam analysis21 and image processing,22 the tomographic-probability description was employed in the classical domain.
Other ingredients, common to both classical and quantum domains, are the integrals of motion or invariants which contain, in the generic case, an explicit dependence on time. They exist for both time-independent and time-dependent Hamiltonians but, for time-dependent Hamiltonians, the invariants have specific properties; for example, the time-dependent Hamiltonians with kicks are important in the study of different aspects of quantum chaos.23-25 Kicked systems with time-dependent Hamiltonians were considered within the framework of the path integral approach.26 The classical invariant for the parametric oscillator (quadratic in position and momentum) was found in 1880 by Ermakov,27 and a quantum analog of the Ermakov invariant was employed for plasma physics research.28 Invariants linear in position and momentum for the parametric oscillator were studied in a number of papers.23,29-31 New time-dependent integrals of motion for a free Dirac particle were found explicitly.32 A review of the quantum time-dependent invariants is presented in monographs.33 Solutions to the evolution equation are related to the integrals of motion in both classical and quantum domains.
In view of this, our aim is to discuss the properties of classical and quantum time-dependent invariants and to study the relation of the classical and quantum propagators to the classical and quantum invariants, respectively. We consider the invariants in view of the formalism of the density matrix, including the tomographic-probability representation. We illustrate the behavior of classical and quantum systems with fluctuations within the framework of the Boltzmann and Schrödinger (Moyal) equations (for a free particle and the harmonic oscillator).
2
Integrals of Motion in Classical Mechanics
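The total time-derivative of a function $f(q,p,t)$ for a system with a time-dependent Hamiltonian $H(q,p,t)$ reads
$$ \frac{df}{dt} = \frac{\partial f}{\partial t} + \{H, f\}, \tag{1} $$
where the Poisson bracket of the functions $f(q,p,t)$ and $H(q,p,t)$ has the form
$$ \{H, f\} = \frac{\partial H}{\partial p}\frac{\partial f}{\partial q} - \frac{\partial H}{\partial q}\frac{\partial f}{\partial p} . \tag{2} $$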
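The classical integral of motion $J(q,p,t)$ for the Hamiltonian system is defined by the equality
$$ \frac{dJ}{dt} = \frac{\partial J}{\partial t} + \{H, J\} = 0, \tag{3} $$
which means the invariance of the function $J$ on the trajectory, i.e., for $t > t_0$,
$$ J(q_0, p_0, t_0) = J\big(q(t), p(t), t\big). \tag{4} $$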
If the integral of motion $J(q,p)$ does not depend on time explicitly, i.e.,
$$ \frac{\partial J}{\partial t} = 0, \tag{5} $$
it is defined by the equality
$$ \{J, H\} = 0, \tag{6} $$
which holds both for time-dependent and time-independent Hamiltonians. Any function $F(J_1, J_2, \ldots, J_N)$ of the integrals of motion is the integral of motion, i.e., $dF/dt = 0$.
2.1
Example of Free Motion
The Hamiltonian of a free particle reads
$$ H_{\rm f} = \frac{p^2}{2m} . \tag{7} $$
One can check that the two functions linear in position and momentum,
$$ p_0(q,p,t) = p \tag{8} $$
and
$$ q_0(q,p,t) = -\frac{p}{m}\,t + q, \tag{9} $$
are integrals of motion. For example, the function quadratic in position and momentum,
$$ F(p_0, q_0) = q_0^2(q,p,t) = q^2 + \frac{p^2 t^2}{m^2} - \frac{2 q p t}{m}, \tag{10} $$
is the integral of motion for a free particle.
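The check suggested above can be sketched numerically. The snippet below evaluates the total time-derivative $\partial f/\partial t + \{H, f\}$ by central differences, with the bracket $\{H,f\} = \partial_p H\,\partial_q f - \partial_q H\,\partial_p f$, for the free-particle functions (8)-(10). The mass value, the sample point and the helper names are illustrative assumptions.

```python
M = 2.0  # illustrative mass (assumed value)

def poisson_bracket(H, f, q, p, t, h=1e-5):
    """{H, f} = dH/dp * df/dq - dH/dq * df/dp, by central differences."""
    dH_dp = (H(q, p + h, t) - H(q, p - h, t)) / (2 * h)
    dH_dq = (H(q + h, p, t) - H(q - h, p, t)) / (2 * h)
    df_dq = (f(q + h, p, t) - f(q - h, p, t)) / (2 * h)
    df_dp = (f(q, p + h, t) - f(q, p - h, t)) / (2 * h)
    return dH_dp * df_dq - dH_dq * df_dp

def total_derivative(H, f, q, p, t, h=1e-5):
    """df/dt = partial f / partial t + {H, f}."""
    df_dt = (f(q, p, t + h) - f(q, p, t - h)) / (2 * h)
    return df_dt + poisson_bracket(H, f, q, p, t, h)

def H_free(q, p, t):
    return p * p / (2 * M)

def p0(q, p, t):
    return p

def q0(q, p, t):
    return q - p * t / M

def q0_squared(q, p, t):
    return q0(q, p, t) ** 2

if __name__ == "__main__":
    point = (0.7, -1.3, 2.5)  # arbitrary (q, p, t)
    for f in (p0, q0, q0_squared):
        print(f.__name__, total_derivative(H_free, f, *point))
    # for comparison, the position q itself is not conserved
    print("q", total_derivative(H_free, lambda q, p, t: q, *point))
```

The three invariants give a vanishing total derivative up to rounding, while the bare position returns $p/m$, as expected for free motion.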
2.2
Example of Harmonic Oscillator
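The Hamiltonian of the harmonic oscillator reads
$$ H_{\rm os} = \frac{p^2}{2m} + \frac{m\omega^2 q^2}{2} . \tag{11} $$
One can check that the functions linear in position and momentum,
$$ p_0(q,p,t) = p\cos\omega t + m\omega q\sin\omega t \tag{12} $$
and
$$ q_0(q,p,t) = -\frac{p}{m\omega}\sin\omega t + q\cos\omega t, \tag{13} $$
are the integrals of motion for the harmonic oscillator. In the limit of free motion $\omega \to 0$, the integrals of motion (12) and (13) coincide with (8) and (9), respectively. The integrals of motion (8), (9), (12), and (13) can be described by the real symplectic matrix $\Lambda(t)$:
$$ \begin{pmatrix} p_0(q,p,t) \\ q_0(q,p,t) \end{pmatrix} = \Lambda(t) \begin{pmatrix} p \\ q \end{pmatrix} . \tag{14} $$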
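For free motion, the symplectic matrix $\Lambda(t)$ has the form
$$ \Lambda_{\rm f}(t) = \begin{pmatrix} 1 & 0 \\ -t/m & 1 \end{pmatrix}, \tag{15} $$
and for the harmonic oscillator it reads
$$ \Lambda_{\rm os}(t) = \begin{pmatrix} \cos\omega t & m\omega\sin\omega t \\ -\sin\omega t/(m\omega) & \cos\omega t \end{pmatrix} . \tag{16} $$
The real symplectic matrices satisfy the condition
$$ \Lambda \Sigma \Lambda^{\rm tr} = \Sigma, \tag{17} $$
where the matrix $\Sigma$ can be taken as
$$ \Sigma = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} . \tag{18} $$
The symplectic transform (being a particular case of the canonical transform) preserves the Poisson bracket
$$ \{p_0, q_0\} = \{p, q\} . \tag{19} $$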
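The statements of this subsection can be probed with a short numerical sketch. It assumes the oscillator matrix acting on the column $(p, q)$ with rows $(\cos\omega t,\ m\omega\sin\omega t)$ and $(-\sin\omega t/(m\omega),\ \cos\omega t)$, checks the symplectic condition (17), and then integrates Hamilton's equations with a standard fourth-order Runge-Kutta step to confirm that $q_0$ and $p_0$ return the initial data. Parameter values, step count and function names are assumptions for illustration.

```python
import math

M, OMEGA = 1.5, 0.8  # illustrative mass and frequency (assumed values)

def lam(t):
    """Oscillator matrix Lambda(t) acting on the column (p, q)."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return [[c, M * OMEGA * s], [-s / (M * OMEGA), c]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

SIGMA = [[0.0, -1.0], [1.0, 0.0]]

def symplectic_defect(t):
    """Largest entry of Lambda Sigma Lambda^tr - Sigma."""
    m = lam(t)
    prod = matmul(matmul(m, SIGMA), transpose(m))
    return max(abs(prod[i][j] - SIGMA[i][j]) for i in range(2) for j in range(2))

def rk4_step(q, p, dt):
    """One Runge-Kutta step for dq/dt = p/M, dp/dt = -M*OMEGA^2*q."""
    def rhs(q, p):
        return p / M, -M * OMEGA ** 2 * q
    k1 = rhs(q, p)
    k2 = rhs(q + 0.5 * dt * k1[0], p + 0.5 * dt * k1[1])
    k3 = rhs(q + 0.5 * dt * k2[0], p + 0.5 * dt * k2[1])
    k4 = rhs(q + dt * k3[0], p + dt * k3[1])
    q_new = q + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    p_new = p + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return q_new, p_new

def invariants_after(q, p, t_end, steps=1000):
    """Evolve (q, p) to t_end and evaluate (q0, p0) on the evolved point."""
    dt = t_end / steps
    for _ in range(steps):
        q, p = rk4_step(q, p, dt)
    m = lam(t_end)
    p0 = m[0][0] * p + m[0][1] * q
    q0 = m[1][0] * p + m[1][1] * q
    return q0, p0

if __name__ == "__main__":
    print(symplectic_defect(1.7))
    print(invariants_after(0.4, -0.9, 5.0))  # close to the initial (0.4, -0.9)
```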
3
Classical Statistical Mechanics
The state of a Hamiltonian classical system with fluctuations (noise) is described by a joint-probability-distribution function $\rho(q,p,t)$, which is nonnegative and normalized,
$$ \int \rho(q,p,t)\,dq\,dp = 1 . \tag{20} $$
Using the Fourier transform one can introduce the complex density-matrix-like function $R(x,x',t) = R^*(x',x,t)$, in view of the relationship
$$ R(x,x',t) = \int \rho\!\left(\frac{x+x'}{2}, p, t\right) e^{ip(x-x')}\,dp, \tag{21} $$
and the real nonnegative tomographic-probability function34
$$ w(X,\mu,\nu,t) = \int \rho(q,p,t)\, e^{-ik(X - \mu q - \nu p)}\,\frac{dk\,dq\,dp}{2\pi} . \tag{22} $$
Functions (21) and (22) determine the classical distribution function, in view of the relationships
$$ \rho(q,p,t) = \frac{1}{2\pi}\int R\!\left(q+\frac{u}{2},\, q-\frac{u}{2},\, t\right) e^{-ipu}\,du \tag{23} $$
and
$$ \rho(q,p,t) = \frac{1}{4\pi^2}\int w(X,\mu,\nu,t)\, e^{i(X - \mu q - \nu p)}\,dX\,d\mu\,d\nu . \tag{24} $$
The tomographic probability is a homogeneous function
$$ w(\lambda X, \lambda\mu, \lambda\nu, t) = |\lambda|^{-1}\, w(X,\mu,\nu,t) . \tag{25} $$
There exists a class of functions which are represented in terms of a complex wave-like function $\psi(x,t)$, such as $R_\psi(x,x',t) = \psi(x,t)\psi^*(x',t)$. For such functions, Eq. (23) takes the form
$$ \rho(q,p,t) = \frac{1}{2\pi}\int \psi\!\left(q+\frac{u}{2}, t\right)\psi^*\!\left(q-\frac{u}{2}, t\right) e^{-ipu}\,du . \tag{26} $$
The admissible functions $\psi(x,t)$ are the functions which give the nonnegative probability distribution $\rho(q,p,t)$ in (26). The following equality
$$ \int \rho^2(q,p,t)\,dp\,dq = \frac{\mu_0}{2\pi}, \qquad \mu_0(t) = 1 \tag{27} $$
is the necessary condition for the normalized distribution function to be presented in the factorized form. In the simple case, the probability distribution $\rho(q,p,t)$ satisfies the Boltzmann equation
$$ \frac{\partial\rho}{\partial t} + \frac{p}{m}\frac{\partial\rho}{\partial q} - \frac{\partial V}{\partial q}\frac{\partial\rho}{\partial p} = 0, \tag{28} $$
where the Hamiltonian reads
$$ H = \frac{p^2}{2m} + V(q), \tag{29} $$
with $V$ being the potential energy. The Boltzmann equation (28) can be rewritten in terms of the complex density-matrix-like function $R(x,x',t) = R^*(x',x,t)$ (we use dimensionless variables)
$$ \dot R(x,x',t) - \frac{i}{2}\left(\frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial x'^2}\right) R(x,x',t) + i(x-x')\,V'\!\left(\frac{x+x'}{2}\right) R(x,x',t) = 0 . \tag{30} $$
The Boltzmann equation has the form of Eq. (3) for the time-dependent integral of motion
$$ \frac{\partial\rho}{\partial t} + \{H, \rho\} = 0 . \tag{31} $$
This means that the classical-probability-distribution function is the integral of motion. It can be represented as a function $F(J_1, \ldots, J_N)$ of some integrals of motion $J_1(q,p,t), \ldots, J_N(q,p,t)$. Thus, for a free particle, a solution to the Boltzmann equation is a nonnegative normalized function $F(J_1, J_2)$ of the integrals of motion (8) and (9), i.e.,
$$ \rho_{\rm f}(q,p,t) = F\!\left(q - \frac{p}{m}\,t,\; p\right) . \tag{32} $$
For the harmonic oscillator, the solution to the Boltzmann equation is
$$ \rho_{\rm os}(q,p,t) = F\!\left(-\frac{p}{m\omega}\sin\omega t + q\cos\omega t,\; p\cos\omega t + m\omega q\sin\omega t\right) . \tag{33} $$
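A hedged numerical sketch of this statement: for the oscillator potential $V = m\omega^2 q^2/2$ the Boltzmann equation reads $\partial_t\rho + (p/m)\,\partial_q\rho - m\omega^2 q\,\partial_p\rho = 0$, and any smooth function of the two oscillator invariants should make the left-hand side vanish. The particular function `F` below (an unnormalized Gaussian), the parameter values and the finite-difference step are arbitrary choices made only for this example.

```python
import math

M, OMEGA = 1.0, 1.3  # illustrative mass and frequency (assumed values)

def F(a, b):
    """Arbitrary smooth test function of the two invariants (not normalized)."""
    return math.exp(-(a - 0.3) ** 2 - 2.0 * (b + 0.1) ** 2)

def rho(q, p, t):
    """rho(q, p, t) = F(q0(q, p, t), p0(q, p, t)) for the harmonic oscillator."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    q0 = -p / (M * OMEGA) * s + q * c
    p0 = p * c + M * OMEGA * q * s
    return F(q0, p0)

def boltzmann_residual(q, p, t, h=1e-4):
    """Left-hand side of the Boltzmann equation, by central differences."""
    d_t = (rho(q, p, t + h) - rho(q, p, t - h)) / (2 * h)
    d_q = (rho(q + h, p, t) - rho(q - h, p, t)) / (2 * h)
    d_p = (rho(q, p + h, t) - rho(q, p - h, t)) / (2 * h)
    return d_t + (p / M) * d_q - M * OMEGA ** 2 * q * d_p

if __name__ == "__main__":
    for point in [(0.2, -0.4, 0.7), (1.0, 0.5, 2.3), (-0.6, 0.9, 5.0)]:
        print(point, boltzmann_residual(*point))
```

The residuals are zero up to the discretization error of the finite differences, for any choice of `F`.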
Given an arbitrary initial distribution function
$$ \rho(q,p,t=0) = \rho_0(q,p), \tag{34} $$
the solution to the Boltzmann equation reads
$$ \rho(q,p,t) = \rho_0\big(q_0(q,p,t),\, p_0(q,p,t)\big), \tag{35} $$
i.e., in $\rho_0(q,p)$, we replace the arguments $q$ and $p$ by the integrals of motion $q_0(q,p,t)$ and $p_0(q,p,t)$, which have the initial values $q_0(q,p,t=0) = q$ and $p_0(q,p,t=0) = p$. Since the canonical transformation of position and momentum preserves the phase volume and in view of Eq. (35), for a generic potential $V(q)$ in (28), the parameter $\mu_0$ in Eq. (27) is invariant. The solution (35) can be represented in the integral form
$$ \rho(q,p,t) = \int \Pi(q,p,q',p',t)\,\rho_0(q',p')\,dq'\,dp', \tag{36} $$
where we use the propagator
$$ \Pi(q,p,q',p',t) = \delta\big(q' - q_0(q,p,t)\big)\,\delta\big(p' - p_0(q,p,t)\big) . \tag{37} $$
The propagator (37) satisfies the system of equations
$$ q_0(q,p,t)\,\Pi(q,p,q',p',t) = q'\,\Pi(q,p,q',p',t); \tag{38} $$
$$ p_0(q,p,t)\,\Pi(q,p,q',p',t) = p'\,\Pi(q,p,q',p',t) . \tag{39} $$
Given the integrals of motion $q_0(q,p,t)$ and $p_0(q,p,t)$, for the classical Boltzmann equation (28) one has the propagator of the form (37).
3.1
Example of Gaussian Distributions
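The Gaussian distribution is determined by the means $\langle p\rangle$ and $\langle q\rangle$, by the variances $\sigma_{pp}$ and $\sigma_{qq}$, and by the covariance $\sigma_{pq}$. For the initial Gaussian distribution,
$$ \rho_0(q,p) = \frac{1}{2\pi\sqrt{\det\sigma}} \exp\left[-\frac{1}{2}\,(\tilde p, \tilde q)\,\sigma^{-1}\begin{pmatrix}\tilde p \\ \tilde q\end{pmatrix}\right], \tag{40} $$
where the variables
$$ \tilde p = p - \langle p\rangle, \qquad \tilde q = q - \langle q\rangle \tag{41} $$
and the real symmetric dispersion matrix
$$ \sigma = \begin{pmatrix} \sigma_{pp} & \sigma_{pq} \\ \sigma_{pq} & \sigma_{qq} \end{pmatrix} \tag{42} $$
are used, the solution to the Boltzmann equation for a free particle or for the harmonic oscillator has the Gaussian form
$$ \rho(q,p,t) = \frac{1}{2\pi\sqrt{\det\sigma(t)}} \exp\left[-\frac{1}{2}\,\big(\tilde p(t), \tilde q(t)\big)\,\sigma^{-1}(t)\begin{pmatrix}\tilde p(t) \\ \tilde q(t)\end{pmatrix}\right], \tag{43} $$
with $\tilde p(t) = p - \langle p(t)\rangle$ and $\tilde q(t) = q - \langle q(t)\rangle$.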
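In (43), the time-dependent means $\langle p(t)\rangle$ and $\langle q(t)\rangle$ and the time-dependent dispersion matrix $\sigma(t)$ are expressed in terms of the initial means $\langle p\rangle$ and $\langle q\rangle$ and the dispersion matrix $\sigma$ by the relationships
$$ \sigma^{-1}(t) = \Lambda^{\rm tr}(t)\,\sigma^{-1}\,\Lambda(t) \tag{44} $$
and
$$ \begin{pmatrix}\langle p(t)\rangle \\ \langle q(t)\rangle\end{pmatrix} = -\Sigma\,\Lambda^{\rm tr}(t)\,\Sigma \begin{pmatrix}\langle p\rangle \\ \langle q\rangle\end{pmatrix}, \tag{45} $$
where $\Lambda(t)$ is the symplectic matrix which determines the integrals of motion, and the matrix $\Sigma$ is given by Eq. (18). For example, in the case of free motion, the initial Gaussian distribution (40) evolves as
$$ \rho(q,p,t) = \frac{1}{2\pi\sqrt{\det\sigma}} \exp\left[-\frac{1}{2}\left(p - \langle p\rangle,\; q - \frac{p}{m}t - \langle q\rangle\right)\sigma^{-1}\begin{pmatrix} p - \langle p\rangle \\ q - \frac{p}{m}t - \langle q\rangle \end{pmatrix}\right] . \tag{46} $$
This means that the variance of the position depends on time as
$$ \sigma_{qq}(t) = \sigma_{qq} + \frac{2t}{m}\,\sigma_{pq} + \frac{t^2}{m^2}\,\sigma_{pp}, \tag{47} $$
the variance of the momentum is preserved,
$$ \sigma_{pp}(t) = \sigma_{pp}, \tag{48} $$
and the covariance also depends on time,
$$ \sigma_{pq}(t) = \sigma_{pq} + \frac{t}{m}\,\sigma_{pp} . \tag{49} $$
The position and momentum means read
$$ \langle p(t)\rangle = \langle p\rangle; \qquad \langle q(t)\rangle = \langle q\rangle + \frac{\langle p\rangle}{m}\,t . \tag{50} $$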
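As a consistency sketch for the free-particle case, the matrix rule $\sigma^{-1}(t) = \Lambda^{\rm tr}\sigma^{-1}\Lambda$ is equivalent to $\sigma(t) = \Lambda^{-1}\sigma(\Lambda^{-1})^{\rm tr}$, which can be compared with the closed-form variances $\sigma_{qq}(t) = \sigma_{qq} + 2t\sigma_{pq}/m + t^2\sigma_{pp}/m^2$, $\sigma_{pp}(t) = \sigma_{pp}$ and $\sigma_{pq}(t) = \sigma_{pq} + t\sigma_{pp}/m$. The ordering (p, q) of rows and columns follows the dispersion matrix used in the text; the numerical values and helper names are illustrative assumptions.

```python
M = 2.0  # illustrative mass (assumed value)

def lam_free(t):
    """Free-motion matrix acting on the column (p, q)."""
    return [[1.0, 0.0], [-t / M, 1.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def inverse_2x2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def evolve_dispersion(sigma, t):
    """sigma(t) = Lambda^{-1} sigma (Lambda^{-1})^tr, with sigma ordered as (p, q)."""
    inv = inverse_2x2(lam_free(t))
    return matmul(matmul(inv, sigma), transpose(inv))

def closed_form(s_pp, s_pq, s_qq, t):
    """Closed-form free-particle variances and covariance at time t."""
    return [[s_pp, s_pq + t * s_pp / M],
            [s_pq + t * s_pp / M,
             s_qq + 2 * t * s_pq / M + t * t * s_pp / M ** 2]]

if __name__ == "__main__":
    s_pp, s_pq, s_qq, t = 0.5, 0.2, 1.0, 3.0
    print(evolve_dispersion([[s_pp, s_pq], [s_pq, s_qq]], t))
    print(closed_form(s_pp, s_pq, s_qq, t))
```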
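For the harmonic oscillator, one has the time-dependent distribution
$$ \rho(q,p,t) = \frac{1}{2\pi\sqrt{\det\sigma}} \exp\left[-\frac{1}{2}\,(\tilde p_0, \tilde q_0)\,\sigma^{-1}\begin{pmatrix}\tilde p_0 \\ \tilde q_0\end{pmatrix}\right], \tag{51} $$
where $\tilde p_0 = p\cos\omega t + m\omega q\sin\omega t - \langle p\rangle$ and $\tilde q_0 = -\frac{p}{m\omega}\sin\omega t + q\cos\omega t - \langle q\rangle$.
The Gaussian distribution (51) has the time-dependent means
$$ \langle p(t)\rangle = \langle p\rangle\cos\omega t - m\omega\langle q\rangle\sin\omega t; \qquad \langle q(t)\rangle = \frac{\langle p\rangle}{m\omega}\sin\omega t + \langle q\rangle\cos\omega t, \tag{52} $$
with the variances
$$ \sigma_{pp}(t) = \sigma_{pp}\cos^2\omega t + m^2\omega^2\sigma_{qq}\sin^2\omega t - 2m\omega\,\sigma_{pq}\sin\omega t\cos\omega t; $$
$$ \sigma_{qq}(t) = \sigma_{qq}\cos^2\omega t + \frac{\sigma_{pp}}{m^2\omega^2}\sin^2\omega t + \frac{2\sigma_{pq}}{m\omega}\sin\omega t\cos\omega t, \tag{53} $$
and the covariance
$$ \sigma_{pq}(t) = \sigma_{pq}\cos 2\omega t + \frac{1}{2}\left(\frac{\sigma_{pp}}{m\omega} - m\omega\,\sigma_{qq}\right)\sin 2\omega t . \tag{54} $$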
For the repulsive oscillator with the potential energy $V(q) = -m\omega^2 q^2/2$, all the former relationships for the integrals of motion (12) and (13), the propagator (37), and the Gaussian distribution (51) hold with the replacement $\omega \to i\omega$.
4
Integrals of Motion in Quantum Mechanics
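In the Schrödinger representation, the quantum integral of motion $\hat I(t)$ is determined by the equality
$$ \frac{d\hat I(t)}{dt} = \frac{\partial \hat I(t)}{\partial t} + \frac{i}{\hbar}\big[\hat H, \hat I(t)\big] = 0; \qquad \hat H = \frac{\hat p^2}{2m} + V(\hat q), \tag{55} $$
and this equality is the quantum counterpart of Eq. (1). The fact that the total time derivative of an observable is equal to zero corresponds to the invariance of the mean of the observable on the quantum trajectory, i.e.,
$$ \langle\psi(t)|\,\hat I(t)\,|\psi(t)\rangle = \langle\psi(0)|\,\hat I(0)\,|\psi(0)\rangle . \tag{56} $$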
This relation is the quantum analog of the classical relation (4). Any function $F\big(\hat I_1(t), \hat I_2(t), \ldots, \hat I_N(t)\big)$ of the integrals of motion is the integral of motion. For free motion with the Hamiltonian $\hat H_{\rm f} = \hat p^2/2m$, there exist two integrals of motion linear in position and momentum, $\hat p_0(t)$ and $\hat q_0(t)$, given by Eqs. (8) and (9) with the replacement of c-numbers by operators $q \to \hat q$ and $p \to \hat p$. For the harmonic oscillator with the Hamiltonian
$$ \hat H_{\rm os} = \frac{\hat p^2}{2m} + \frac{m\omega^2}{2}\,\hat q^2, $$
the integrals of motion linear in position and momentum, $\hat p_0(t)$ and $\hat q_0(t)$, are given by Eqs. (12) and (13) using the same replacement. Thus, for the two systems under consideration, both the quantum time-dependent integrals of motion and the classical integrals of motion are determined by the symplectic matrix $\Lambda(t)$. The integral of motion $\hat I(t)$ is related to the unitary evolution operator $\hat U(t)$, which connects wave functions at different moments of time, $|\psi(t)\rangle = \hat U(t)\,|\psi(0)\rangle$, by the relationship
$$ \hat I(t) = \hat U(t)\,\hat I(0)\,\hat U^{-1}(t) . \tag{57} $$
Matrix elements of the evolution operator (Green function)
$$ \langle x|\,\hat U(t)\,|y\rangle = G(x,y,t) \tag{58} $$
describe the evolution of the wave function $\langle x|\psi(t)\rangle = \psi(x,t)$,
$$ \psi(x,t) = \int G(x,y,t)\,\psi(y,0)\,dy, \tag{59} $$
and the evolution of the density matrix $\rho_\psi(x,x',t) = \psi(x,t)\psi^*(x',t)$ of the pure state with the density operator $\hat\rho_\psi(t) = |\psi(t)\rangle\langle\psi(t)|$,
$$ \rho_\psi(x,x',t) = \int G(x,y,t)\,G^*(x',y',t)\,\rho_\psi(y,y',0)\,dy\,dy' . \tag{60} $$
Since for an impure state $\hat\rho(t) = \sum_n w_n\,|\psi_n(t)\rangle\langle\psi_n(t)|$, where $w_n$ is the probability of the presence of the state $|\psi_n(t)\rangle\langle\psi_n(t)|$ in the mixed state $\hat\rho(t)$, the Green function $G(x,y,t)$ describes the evolution of an impure state as well. The density matrix $\rho(x,x',t)$ satisfies the evolution equation
$$ \dot\rho(x,x',t) - \frac{i\hbar}{2m}\left(\frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial x'^2}\right)\rho(x,x',t) + \frac{i}{\hbar}\big[V(x) - V(x')\big]\rho(x,x',t) = 0 . \tag{61} $$
Equation (61) written in the operator form
$$ \dot{\hat\rho} + \frac{i}{\hbar}\big[\hat H, \hat\rho\big] = 0 \tag{62} $$
demonstrates that the density operator is the quantum time-dependent integral of motion $\hat\rho(t) = \hat U(t)\,\hat\rho(0)\,\hat U^{-1}(t)$. The integrals of motion $\hat q_0(t) = \hat U(t)\,\hat q\,\hat U^{-1}(t)$ and $\hat p_0(t) = \hat U(t)\,\hat p\,\hat U^{-1}(t)$ describe the initial mean values of the position and momentum. Rewriting Eq. (57) in matrix form one obtains the equations for the Green function
$$ \hat q_0(t)(q)\,G(q,q',t) = q'\,G(q,q',t); \tag{63} $$
$$ \hat p_0(t)(q)\,G(q,q',t) = i\hbar\,\frac{\partial}{\partial q'}\,G(q,q',t) . \tag{64} $$
These equations are analogs of the classical equations (38) and (39). The density operator can be described in the Wigner-Weyl representation by the Wigner function
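$$ W(q,p,t) = \int \rho\!\left(q+\frac{u}{2},\, q-\frac{u}{2},\, t\right) e^{-ipu/\hbar}\,du; \qquad \int W(q,p,t)\,\frac{dq\,dp}{2\pi\hbar} = 1 . \tag{65} $$
For a pure state $|\psi(t)\rangle$, the purity parameter $\mu_0$ is
$$ \mu_0 = \int W^2(q,p,t)\,\frac{dq\,dp}{2\pi\hbar} = 1 . \tag{66} $$
The purity parameter $\mu_0 = {\rm Tr}\,\hat\rho^2(t)$ is an invariant of the quantum evolution. For an impure state, one has for the invariant purity parameter $0 < \mu_0 < 1$. The nonnegative tomographic-probability distribution in the quantum domain is connected with the Wigner function by the formulas (we use dimensionless units)
$$ w(X,\mu,\nu,t) = \int \exp\big[-ik(X - \mu q - \nu p)\big]\,W(q,p,t)\,\frac{dk\,dq\,dp}{(2\pi)^2} \tag{67} $$
and
$$ W(q,p,t) = \frac{1}{2\pi}\int w(X,\mu,\nu,t)\,\exp\big[-i(\mu q + \nu p - X)\big]\,d\mu\,d\nu\,dX, \tag{68} $$
which are analogs of (22) and (24), respectively. The property (25) and the normalization condition $\int w(X,\mu,\nu,t)\,dX = 1$ hold for both classical and quantum tomographic-probability distributions. For Gaussian states, the tomographic probability of a quantum system (as well as of a classical one) has the form
$$ w(X,\mu,\nu,t) = \frac{1}{\sqrt{2\pi\sigma_X(t)}}\exp\left\{-\frac{\big(X - \bar X(t)\big)^2}{2\sigma_X(t)}\right\}, \tag{69} $$
where
$$ \sigma_X(t) = \mu^2\sigma_{qq}(t) + \nu^2\sigma_{pp}(t) + 2\mu\nu\,\sigma_{pq}(t); \qquad \bar X(t) = \mu\langle q(t)\rangle + \nu\langle p(t)\rangle . \tag{70} $$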
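In the classical picture the tomogram is the distribution of the variable $X = \mu q + \nu p$, so its mean and variance for a Gaussian state can be estimated by sampling phase-space points and compared with $\bar X = \mu\langle q\rangle + \nu\langle p\rangle$ and $\sigma_X = \mu^2\sigma_{qq} + \nu^2\sigma_{pp} + 2\mu\nu\sigma_{pq}$. The sketch below does this with a hand-written Cholesky factor; all numerical values, the sample size and the function names are assumptions chosen for illustration.

```python
import math
import random

def sample_gaussian(mean_q, mean_p, s_qq, s_pp, s_pq, n, seed=0):
    """Draw n phase-space points (q, p) from a correlated Gaussian."""
    rng = random.Random(seed)
    # Cholesky factor of the covariance matrix [[s_qq, s_pq], [s_pq, s_pp]]
    a = math.sqrt(s_qq)
    b = s_pq / a
    c = math.sqrt(s_pp - b * b)
    points = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        points.append((mean_q + a * z1, mean_p + b * z1 + c * z2))
    return points

def tomogram_moments(points, mu, nu):
    """Sample mean and variance of X = mu*q + nu*p."""
    xs = [mu * q + nu * p for q, p in points]
    mean = sum(xs) / len(xs)
    var = sum((x - mean) ** 2 for x in xs) / len(xs)
    return mean, var

def predicted_moments(mean_q, mean_p, s_qq, s_pp, s_pq, mu, nu):
    """Mean and variance of X expected for a Gaussian state."""
    mean = mu * mean_q + nu * mean_p
    var = mu * mu * s_qq + nu * nu * s_pp + 2 * mu * nu * s_pq
    return mean, var

if __name__ == "__main__":
    state = dict(mean_q=0.5, mean_p=-1.0, s_qq=1.2, s_pp=0.8, s_pq=0.3)
    pts = sample_gaussian(n=50000, **state)
    print(tomogram_moments(pts, 0.6, -1.1))
    print(predicted_moments(mu=0.6, nu=-1.1, **state))
```

The sampled moments agree with the predicted ones within the statistical error of the Monte Carlo estimate.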
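The Gaussian tomographic probabilities for a free particle and for the harmonic oscillator are described by (70) with the fluctuation parameters given by (47)-(50) and (52)-(54), respectively. The tomographic probability of a quantum state satisfies the evolution equation (in dimensionless units)
$$ \dot w - \mu\frac{\partial w}{\partial\nu} - i\left[V\!\left(-\frac{1}{\partial/\partial X}\frac{\partial}{\partial\mu} - \frac{i\nu}{2}\frac{\partial}{\partial X}\right) - V\!\left(-\frac{1}{\partial/\partial X}\frac{\partial}{\partial\mu} + \frac{i\nu}{2}\frac{\partial}{\partial X}\right)\right] w = 0 . \tag{71} $$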
Equation (71) is the tomographic form of Eq. (61) for the density matrix. The tomographic form of the classical Boltzmann equation (31), written for the density-matrix-like function $R(x,x',t)$ and rewritten in terms of the probability distribution $w(X,\mu,\nu,t)$, reads
$$ \dot w - \mu\frac{\partial w}{\partial\nu} - \frac{\partial V}{\partial q}(\tilde q)\,\nu\,\frac{\partial w}{\partial X} = 0, \tag{72} $$
where the argument of the function $\partial V/\partial q$ is replaced by the operator
$$ \tilde q = -\frac{1}{\partial/\partial X}\,\frac{\partial}{\partial\mu} . $$
For the harmonic oscillator, there exist stationary solutions to the evolution equations (71) and (72) for the tomographic probability for the integers $n \geq 0$
$$ w_n(X,\mu,\nu) = \frac{1}{2^n\,n!\,\sqrt{\pi(\mu^2+\nu^2)}}\,\exp\left(-\frac{X^2}{\mu^2+\nu^2}\right) H_n^2\!\left(\frac{X}{\sqrt{\mu^2+\nu^2}}\right), \tag{73} $$
where $H_n(x)$ are Hermite polynomials. These solutions are admissible in the quantum domain, but they are nonadmissible in the classical domain, since the function $\rho_n(q,p,t)$ (24) takes negative values. The corresponding Wigner function (68) also takes negative values, but this is permitted in the quantum domain. For linear systems with time-dependent Hamiltonians
$$ \hat H = \frac{\hat p^2}{2m(t)} + \frac{m(t)\,\omega^2(t)\,\hat q^2}{2} + \frac{1}{2}\,k(t)\,(\hat p\hat q + \hat q\hat p) + d(t)\,\hat p + e(t)\,\hat q, \tag{74} $$
there exist time-dependent integrals of motion $\hat p_0(t)$ and $\hat q_0(t)$ of the form29-31
$$ \begin{pmatrix}\hat p_0(t) \\ \hat q_0(t)\end{pmatrix} = \Lambda(t)\begin{pmatrix}\hat p \\ \hat q\end{pmatrix} + \begin{pmatrix}\delta_1(t) \\ \delta_2(t)\end{pmatrix}, \tag{75} $$
where the symplectic matrix $\Lambda(t)$ and the vector $\Delta(t)$ with components $\delta_1(t)$ and $\delta_2(t)$ satisfy the equations $\dot\Lambda = i\Lambda B\sigma_y$, $\dot\Delta = i\Lambda\sigma_y C$ and the initial conditions $\Lambda(0) = 1$, $\Delta(0) = 0$; $\sigma_y$ being the Pauli matrix. In order to write these equations, the symmetric 2x2-matrix
$$ B = \begin{pmatrix} m^{-1}(t) & k(t) \\ k(t) & m(t)\,\omega^2(t) \end{pmatrix} \tag{76} $$
and the 2-vector $C = (d(t), e(t))$ were introduced. The propagator, which describes solutions of Eqs. (71) and (72),
$$ w(X,\mu,\nu,t) = \int \Pi(X,\mu,\nu,X',\mu',\nu',t)\,w(X',\mu',\nu',0)\,dX'\,d\mu'\,d\nu', \tag{77} $$
for both classical and quantum tomographic probabilities of linear systems, is the same and it is expressed in terms of the integrals of motion, i.e., in terms of the matrix $\Lambda$ and the vector $\Delta$,
$$ \Pi(X,\mu,\nu,X',\mu',\nu',t) = \delta\big(X - X' + \mathcal{N}\Lambda^{-1}\Delta\big)\,\delta\big(\mathcal{N}' - \mathcal{N}\Lambda^{-1}\big), \tag{78} $$
with the vectors J\f - (v,n) and A/' - (1/, n'). Formula (78) holds for the multimode system as well. In the case of linear systems, for the Gaussian solutions (69) of Eqs. (71) and (72), the tomographic probabilities are ad missible in classical domain for arbitrary values of the initial dispersion ma trix but, in quantum domain, they are admissible only for the initial disper sion matrix, which satisfies the Schrodinger-Robertson uncertainty relation a vv°ii ~ apq — 1/^ (' n d i m e n s i ° n l e s s units). 5
Conclusions
Let us extract some lessons from our discussion. We have shown that states of both classical and quantum systems with fluctuations can be described by one and the same probability distribution $w(X, \mu, \nu, t)$. This function is a nonnegative and normalized probability distribution of the position $X$ measured in different reference frames in the phase space, the reference frames being labeled by the real parameters $\mu$ and $\nu$.¹³,¹⁴ If one knows the tomographic probability, the standard probability distribution $\rho(q, p, t)$ in the phase space is reconstructed for classical systems and the Wigner quasidistribution is reconstructed for quantum systems.

In the classical and quantum domains, the tomographic-probability distribution obeys different evolution equations of the Fokker-Planck type. For linear nonstationary systems, the evolution equation for the tomographic probability has an identical appearance in the classical and quantum domains. Propagators for both classical and quantum tomographic probabilities are nonnegative transition-probability functions. The propagators are connected with different evolution equations (e.g., the Boltzmann equation and the von Neumann equation). The form of the propagator is determined by the time-dependent integrals of motion.

The difference between quantum and classical tomographic probabilities is connected not only with different dynamics. As we have shown, for linear systems, the dynamics is the same. The difference lies in the initial conditions imposed for solving the evolution equations. Some tomographic-probability distributions are admissible only in the quantum domain, some are admissible only in the classical domain. The tomographic probabilities that are admissible only in the classical domain have fluctuations $\sigma_X$ (70) of the variable $X$ which, for some reference-frame parameters $\mu$ and $\nu$, violate the uncertainty relation. The tomographic probabilities that are admissible only in the quantum domain provide a Wigner function which takes negative values at some points of the phase space.

One should point out that, in classical mechanics, a formalism of the density-matrix-like function $R(x, x', t)$ can be introduced. The evolution of this function is described by Eq. (30), which has the same form as the quantum evolution equation (61) but only for the quadratic potential $V = a(t)x^2 + b(t)x$. In the generic case, the evolution equation (30) differs from (61). For a subclass of "pure" classical states with fluctuations, a wave-like function can be introduced by the relationship $R_\psi(x, x', t) = \psi(x, t)\psi^*(x', t)$, in complete analogy with the quantum wave function. This construction was used for the description of the classical electronic beam within the framework of the thermal wave model.²¹

We have found a lot of similar (even identical) aspects of the two different worlds (classical and quantum). The common aspect of both worlds is the presence of fluctuations, which can always be described by a standard probability distribution (e.g., the tomographic probability). Propagators in both worlds can always be described by standard transition probabilities. These propagators are closely connected with the time-dependent integrals of motion. Although some clarity in understanding the relations between the classical and quantum pictures has emerged, there are still problems to be elucidated, not only technically but also conceptually.
Acknowledgments

The author would like to acknowledge Grupo de Física-Matemática, Complexo II, Universidade de Lisboa for kind hospitality and the Russian Foundation for Basic Research for partial support under Project No. 99-02-17753.

References

1. T. Hida, H.H. Kuo, J. Potthoff, and L. Streit, White Noise: An Infinite Dimensional Calculus (Kluwer Academic, Dordrecht, 1993).
2. W. Heisenberg, Z. Phys. 43, 172 (1927).
3. E. Schrödinger, Sitzungsber. Preuss. Acad. Wiss. 24, 296 (1930).
4. H.P. Robertson, Phys. Rev. 35, 667 (1930).
5. E. Schrödinger, Ann. d. Physik (Leipzig) 79, 489 (1926).
6. J. von Neumann, Mathematische Grundlagen der Quantenmechanik (Springer, Berlin, 1932).
7. J.S. Bell, Speakable and Unspeakable in Quantum Mechanics (Cambridge University Press, 1987).
8. R.P. Feynman, Rev. Mod. Phys. 20, 367 (1948).
9. S. Albeverio and R. Hoegh-Krohn, Mathematical Theory of Feynman Integrals, Lecture Notes in Mathematics (Springer, Berlin, 1976), Vol. 523.
10. M. Faria, J. Potthoff, and L. Streit, J. Math. Phys. 32, 2123 (1991); D.C. Khandekar and L. Streit, Ann. Phys. 1, 46 (1992).
11. E. Wigner, Phys. Rev. 40, 749 (1932).
12. J.E. Moyal, Proc. Cambridge Philos. Soc. 45, 99 (1949).
13. S. Mancini, V.I. Man'ko, and P. Tombesi, Phys. Lett. A 213, 1 (1996).
14. S. Mancini, V.I. Man'ko, and P. Tombesi, Found. Phys. 27, 81 (1997).
15. Olga Man'ko and V.I. Man'ko, J. Russ. Laser Res. (Plenum Publ.) 18, 407 (1997); J. Russ. Laser Res. (Kluwer Academic/Plenum Publ.) 20, 67 (1999).
16. V.I. Man'ko and R.V. Mendes, "Noncommutative time-frequency tomography of analytic signals," Eprint LANL Physics/9712022 Data Analysis, Statistics, and Probability; Phys. Lett. A (1999, in press).
17. V.I. Man'ko, L. Rosa, and P. Vitale, Phys. Rev. A 58, 3291 (1998); Phys. Lett. B 349, 328 (1998).
18. Vladimir Man'ko, Marcos Moshinsky, and Anju Sharma, Phys. Rev. A 59, 1809 (1999).
19. V.V. Dodonov and V.I. Man'ko, Phys. Lett. A 229, 335 (1997).
20. V.I. Man'ko and O.V. Man'ko, JETP 85, 430 (1997).
21. R. Fedele and V.I. Man'ko, Phys. Rev. E (1999, in press).
22. M.A. Man'ko, J. Russ. Laser Res. (Kluwer Academic/Plenum Publ.) 20, 226 (1999).
23. G. Karner, V.I. Man'ko, and L. Streit, Reports Math. Phys. 29, 177 (1991); "Quantum kicks, quasi-energy, and chaos," in: V.I. Man'ko and M.A. Markov (eds.) Theory of the Interaction of Multilevel Systems with Quantized Fields, Proceedings of the Lebedev Physical Institute (Nova Science, New York, 1996), Vol. 209, p. 191.
24. R. Vilela Mendes, Phys. Lett. A 171, 253 (1992).
25. R. Vilela Mendes and R. Coutinho, Phys. Lett. A 239, 239 (1998).
26. M. Grothaus, D.C. Khandekar, J.L. da Silva, and L. Streit, J. Math. Phys. 38, 3278 (1997).
27. P. Ermakov, Univ. Izv. (Kiev) 20, No. 9, 1 (1880).
28. H.R. Lewis, Phys. Rev. Lett. 18, 510; 636 (1967); H.R. Lewis and W. Riesenfeld, J. Math. Phys. 10, 1458 (1969).
29. I.A. Malkin and V.I. Man'ko, Phys. Lett. 32, 243 (1970).
30. I.A. Malkin, V.I. Man'ko, and D.A. Trifonov, Phys. Lett. 30, 414 (1969).
31. I.A. Malkin, V.I. Man'ko, and D.A. Trifonov, Phys. Rev. D 2, 1371 (1970).
32. V.I. Man'ko and R.V. Mendes, Phys. Scr. 56, 417 (1997).
33. I.A. Malkin and V.I. Man'ko, Dynamic Symmetries and Coherent States of Quantum Systems (Nauka, Moscow, 1979) [in Russian]; V.V. Dodonov and V.I. Man'ko, Invariants and Evolution of Nonstationary Quantum Systems, Proceedings of the Lebedev Physical Institute (Nova Science, New York, 1989), Vol. 183.
34. S. Mancini, V.I. Man'ko, and P. Tombesi, Quantum Semiclass. Opt. 7, 615 (1995).
GAUSSIAN DISTRIBUTIONS AND APPLICATIONS TO STOCHASTIC PARTIAL DIFFERENTIAL EQUATIONS
HABIB OUERDIANE
Département de Mathématiques, Faculté des Sciences de Tunis, Campus Universitaire, 1060 Tunis, Tunisie
e-mail: habib.Ouerdiane@fst.rnu.tn
1. Introduction

In white noise analysis (WNA) one uses the Gelfand triple
$$ S(\mathbb R) \to L^2(\mathbb R, dt) = \left(L^2(\mathbb R, dt)\right)' \to S'(\mathbb R), $$
where $S(\mathbb R)$ is the space of rapidly decreasing $C^\infty$ functions and its topological dual $S'(\mathbb R)$ is L. Schwartz's space of tempered distributions. Let $\gamma$ be the Gaussian measure on $S'(\mathbb R) = S'$ given by its Fourier transform (by the Bochner-Minlos theorem):
$$ \hat\gamma(\xi) = \int_{S'} e^{i\langle x, \xi\rangle}\,d\gamma(x) = e^{-\frac12\|\xi\|^2}, $$
where $\|\xi\|^2 = (\xi, \xi)_{L^2(\mathbb R, dt)}$ and $\langle x, \xi\rangle$ is the duality between $S'(\mathbb R)$ and $S(\mathbb R)$, which extends the scalar product of $L^2$.

Definition. The triple $(S'(\mathbb R), \mathcal B, \gamma)$ is called the white noise probability space, where $\mathcal B$ is the Borel $\sigma$-algebra on $S'(\mathbb R)$.

Link with Brownian motion $B(t)$. For every $\xi \in S(\mathbb R)$, $W_\xi = \langle\cdot, \xi\rangle$ is a random variable on $S'(\mathbb R)$, and $W_\xi \sim N(0, \|\xi\|^2)$. Set
$$ B(t, x) = \begin{cases} \langle x, 1_{[0,t]}\rangle, & \text{if } t \ge 0 \text{ and } x \in S'(\mathbb R), \\ -\langle x, 1_{[t,0]}\rangle, & \text{if } t < 0 \text{ and } x \in S'(\mathbb R), \end{cases} $$
where, since $S(\mathbb R)$ is dense in $L^2(\mathbb R, dt)$, one defines for every $f \in L^2(\mathbb R, dt)$
$$ \langle\cdot, f\rangle = \lim_{n\to\infty}\langle\cdot, \xi_n\rangle \quad\text{in } L^2(S', \gamma), $$
with $(\xi_n)$ a sequence in $S(\mathbb R)$ such that $\xi_n \to f$ in $L^2(\mathbb R, dt)$. Then $B(t, x)$ is the Wiener process, or Brownian motion, and
$$ \frac{dB}{dt}(t, x) = \dot B(t, x) = x(t) $$
in the sense of distributions.

From now on we work with a more general triple:
$$ X \to H \simeq H' \to X', \tag{1} $$
where $X$ is a real nuclear Fréchet space and $H$ a real Hilbert space. Then $X$ can be written as $X = \varprojlim_{p \ge 0} X_p$, where for every $p$, $X_p$ is a Hilbert space, and $X' = \varinjlim_{p \ge 0} X'_p$. Complexifying the triple (1), we obtain
$$ N \to Z \to N', \quad\text{where } N = X + iX. \tag{2} $$
As in L. Schwartz's general theory of distributions in finite dimension, we develop a theory of distributions in infinite dimension ($\mathbb R^n$ is replaced by $X$) by considering a triple centered on $L^2$:
$$ \mathcal T \to L^2(X', \gamma) \to \mathcal T', \tag{3} $$
where $\mathcal T$ is the space of test functions on $X'$, and $\mathcal T'$, the topological dual of $\mathcal T$, i.e. the set of continuous linear forms on $\mathcal T$, is the space of distributions.

2. Chaos transform

In finite dimension the Fourier transform (T.F) plays an important role in solving partial differential equations (PDEs). In infinite dimension it is a transform called the S-transform [4], or chaos transform (T.C), that plays the role of the (T.F). Moreover, the (T.C) extends to $\mathcal T'$ the Wiener-Itô-Segal isometry [3]
$$ L^2(X', \gamma) \xrightarrow{\ \sim\ } \mathrm{Fock}(Z), \tag{4} $$
with, for every $z$ in $Z = H + iH$,
$$ \hat f(z) = \int_{X'} f(x)\,e^{\langle x, z\rangle - z^2/2}\,d\gamma(x) = \int_{X'} f(x + z)\,d\gamma(x), $$
where
$$ \mathrm{Fock}(Z) = \Big\{ \vec f = (f_n)_{n \ge 0} :\ \forall n,\ f_n \in Z^{\odot n} \ \text{and}\ \|\vec f\|^2_{\mathrm{Fock}} = \sum_{n \ge 0} n!\,|f_n|^2 < \infty \Big\} $$
is a Hilbert space called the bosonic (or symmetric) Fock space.

Remarks

1. To every $\vec f = (f_n)_n \in \mathrm{Fock}(Z)$ one associates its holomorphic realization
$$ f(z) = \sum_{n \ge 0}\langle z^{\otimes n}, f_n\rangle. \tag{5} $$
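2. Let $h_n$ be the Hermite polynomial of order $n$:
$$ h_n(x) = (-1)^n e^{x^2/2}\left(\frac{d}{dx}\right)^n\left(e^{-x^2/2}\right). \tag{6} $$
- If $\alpha = (\alpha_1, \alpha_2, \dots, \alpha_n)$ is a multi-index, i.e. $\alpha_i \in \mathbb N$ for all $i$, set
$$ H_\alpha(x) = h_{\alpha_1}(x_1)\,h_{\alpha_2}(x_2)\cdots h_{\alpha_n}(x_n), $$
where $x = (x_1, \dots, x_n) \in \mathbb R^n$.
- If $x \in X'$, let $\{e_i\}$ be an orthonormal basis of $H$. Setting $x_i = \langle x, e_i\rangle$, we have
$$ H_\alpha(x) = \prod_{i=1}^n h_{\alpha_i}(\langle x, e_i\rangle). $$
Let $\Lambda$ be the set of multi-indices $\alpha = (\alpha_i)$ with $\alpha_i = 0$ except for finitely many $i$. Then $\{H_\alpha/\sqrt{\alpha!}\}_{\alpha \in \Lambda}$ is an orthonormal basis of $L^2(X', \gamma)$, and hence every $f \in L^2(X', \gamma)$ has the expansion
$$ f(x) = \sum_n \langle :x^{\otimes n}:, f_n\rangle, $$
where $f_n \in H^{\hat\otimes n}$ and $:x^{\otimes n}:$ is the Wick polynomial defined by
$$ \langle :x^{\otimes n}:, \xi^{\otimes n}\rangle = |\xi|^n\,h_n\!\left(\frac{\langle x, \xi\rangle}{|\xi|}\right). $$

3. Spaces of test functions and distributions

Consider the triple
$$ X \xrightarrow{\ i\ } H \simeq H' \xrightarrow{\ i'\ } X', \tag{7} $$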
where $i$ is the continuous injection with dense range. Let $N = X + iX$ be the complexification of $X$. It is a complete complex nuclear space whose topology can be defined by a directed increasing family of Hilbertian seminorms $(|\cdot|_p)_{p \in \mathbb N}$, and one has
$$ N = \varprojlim_{p \ge 0} N_p \quad\text{and}\quad N' = \varinjlim_{p \ge 0} N'_p. $$

Entire (holomorphic) functions of exponential growth. Let $B$ be a complex Banach space. Let $k > 1$ and $m > 0$ ($k$ is the order of growth and $m$ is the type of growth). Set
$$ \mathrm{Exp}(B, k, m) = \Big\{ f : B \to \mathbb C \text{ holomorphic} :\ \|f\|_{k,m} = \sup_{z \in B}|f(z)|\,e^{-m\|z\|^k} < \infty \Big\} $$
and
$$ \mathcal E_k(N') = \varprojlim_{m > 0,\ p \in \mathbb N}\mathrm{Exp}(N'_p, k, m) = \mathcal T. \tag{8} $$
Remark. If $k = 2$ and $X = S(\mathbb R)$, then $\mathcal E'_k(N') = \mathcal E'_2(S'(\mathbb R))$ is the space of Hida distributions [3] [4].

Theorem 1.
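1. For every $k > 1$, $\mathcal E_k(N')$ is a nuclear Fréchet space.
2. If $1 < k \le 2$, one has the following Gelfand triple:
$$ \mathcal E_k(N') \to L^2(X', \gamma) \to \mathcal E'_k(N')_{\text{strong}}. \tag{9} $$
3. The Laplace transform (L)
$$ T \in (\mathcal E_k(N'))' \mapsto (LT)(z) = \langle T, e^z\rangle, \tag{10} $$
where, for $z$ in $N$, $e^z$ is the map $N' \to \mathbb C$, $u \mapsto e^{\langle u, z\rangle}$, establishes a topological isomorphism between $(\mathcal E_k(N'))'_{\text{strong}}$ and
$$ \mathcal M_{k'}(N) := \varinjlim_{m > 0,\ p \in \mathbb N}\mathrm{Exp}(N_p, k', m), $$
where $\frac1k + \frac1{k'} = 1$ ($k > 1$).

Remark. Results similar to those of Theorem 1 are obtained in [6] [7]. Since the (T.C) of a distribution $S \in \mathcal E'_k(N')$ is given by
$$ \hat S(z) = (T.C)(S)(z) = \langle S, e^{\langle\cdot, z\rangle - z^2/2}\rangle, \tag{11} $$
so that
$$ \hat S(z) = e^{-z^2/2}(LS)(z), $$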
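we deduce the following.

Theorem 2 [14]. Let $(X \subset H \subset X', \gamma)$ be a nuclear Gaussian space and let $1 < k \le 2$. Then the chaos transform (T.C) establishes a topological isomorphism of triples:
$$ \begin{array}{ccccc} \mathcal E_k(N') & \to & L^2(X', \gamma) & \to & \mathcal E'_k(N') \\ \downarrow & & \downarrow {\scriptstyle (T.C)} & & \downarrow \\ \mathcal E_k(N') & \to & \mathrm{Fock}(Z) & \to & \mathcal M_{k'}(N) \end{array} \tag{12} $$
with $\frac1k + \frac1{k'} = 1$. In other words, $T \in \mathcal E'_k(N')$ if and only if
$$ \exists\,p \in \mathbb N,\ \exists\,m > 0,\ c > 0 :\quad |\hat T(z)| \le c\,e^{m|z|_p^{k'}}. $$

Theorem 3 (Integral representation).
1. Let $T$ be a positive continuous linear form on $\mathcal E_k(N')$, i.e. $T(g) \ge 0$ for every $g \in \mathcal E_k(N')$ with $g(x + i0) \ge 0$ for all $x \in X'$. Then there exists a unique Radon measure $\mu$ on $X'$ such that
$$ T(g) = \int_{X'} g(x)\,d\mu(x), \tag{13} $$
and moreover
$$ \hat\mu(\xi) = (T.F\,\mu)(\xi) = T(e^{i\xi}), \quad \forall\,\xi \in X. \tag{14} $$
2. A Radon measure $\mu$ on $X'$ represents a positive linear form on $\mathcal E_k(N')$ if and only if:
(i) $\mu$ comes from a measure $\mu_j$ on $X'_j$ (i.e. $\mu = s'_j(\mu_j)$, with $s_j : X \to X_j$ and $s'_j : X'_j \to X'$);
(ii) there exists $\alpha > 0$ such that
$$ \int_{X'_j}\exp\left(\alpha|x|_j^k\right)d\mu_j(x) < \infty, \tag{15} $$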
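an integrability condition which generalizes that of X. Fernique [15].

4. Kernels and symbols of operators

Let $G = L(\mathcal E_k(N'), \mathcal E'_k(N'))$ be the set of continuous operators from $\mathcal E_k(N')$ into $\mathcal E'_k(N')$. If $u \in G$, its kernel $u^k$ is defined by
$$ \forall\,f, g \in \mathcal E_k(N') :\quad \langle u(f), g\rangle = \langle u^k, f \otimes g\rangle. \tag{16} $$
Using the Schwartz-Grothendieck kernel theorem [15] [18], and since $\mathcal E_k(N')$ is a nuclear Fréchet space, one has
$$ L(\mathcal E_k(N'), \mathcal E'_k(N')) \simeq \mathcal E'_k(N')\,\hat\otimes\,\mathcal E'_k(N') \simeq \mathcal E'_k(N' \times N'), $$
and hence:

Theorem 4 [15]. The map which associates to every $u \in L(\mathcal E_k(N'), \mathcal E'_k(N'))$ its symbol
$$ \hat u(z, z') := (T.C)(u^k)(z, z') := \langle u^k, e^{\langle\cdot, z\rangle - z^2/2} \otimes e^{\langle\cdot, z'\rangle - z'^2/2}\rangle \tag{17} $$
establishes a topological isomorphism
$$ L(\mathcal E_k(N'), \mathcal E'_k(N')) \to \mathcal M_{k'}(N \times N). \tag{18} $$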
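5. Generalization of Theorems 1, 2, 3, 4 to spaces of type $\mathcal E_\theta(N')$

Definitions.
1. Let $\theta : \mathbb R_+ \to \mathbb R$ be an increasing convex function such that $\theta(0) = 0$ and $\theta(x)/x \to +\infty$ as $x \to +\infty$ ($\theta$ is called a Young function).
2. Let $B$ be a Banach space and $m > 0$. Set
$$ \mathrm{Exp}(B, \theta, m) := \Big\{ f \in H(B) :\ \|f\|_{\theta,m} = \sup_{z \in B}|f(z)|\,e^{-\theta(m\|z\|)} < \infty \Big\}, $$
where $H(B)$ is the space of entire functions on $B$.
3. Let
$$ \mathcal E_\theta(N') = \varprojlim_{m > 0,\ p \in \mathbb N}\mathrm{Exp}(N'_p, \theta, m) \quad\text{and}\quad \mathcal M_\theta(N) = \varinjlim_{m > 0,\ p \in \mathbb N}\mathrm{Exp}(N_p, \theta, m). $$
In view of the hypothesis $\lim_{x\to+\infty}\theta(x)/x = +\infty$, for every $u \in N$ the map $e^u : z \in N' \mapsto e^{\langle z, u\rangle}$ is an element of $\mathcal E_\theta(N')$. Hence, for every $T \in (\mathcal E_\theta(N'))'$, its Laplace transform is defined by
$$ (\mathcal L T)(z) = \langle T, e^z\rangle = T(e^z). $$
4. Let $\theta$ be a Young function. Its polar function is defined by
$$ \theta^*(x) := \sup\{tx - \theta(t),\ t \ge 0\}, \quad \forall\,x \ge 0; $$
it is also a Young function.

Theorem 5. The Laplace transform $\mathcal L$ establishes a topological isomorphism
$$ \mathcal E'_\theta(N') \to \mathcal M_{\theta^*}(N). $$

If one assumes moreover that $\lim_{x\to+\infty}\theta(x)/x^2 < \infty$, so that $\mathcal E_\theta(N')$ embeds continuously into $L^2(X', \gamma)$, one obtains:

Theorem 6. The chaos transform (or S-transform) realizes a topological isomorphism between the following triples:
$$ \begin{array}{ccccc} \mathcal E_\theta(N') & \to & L^2(X', \gamma) & \to & \mathcal E'_\theta(N') \\ \downarrow & & \downarrow {\scriptstyle (T.C)} & & \downarrow \\ \mathcal E_\theta(N') & \leftrightarrow & \mathrm{Fock}(Z) & \leftrightarrow & \mathcal M_{\theta^*}(N) \end{array} $$

Proof. One essentially uses the "Taylor series" map to associate with the space $\mathcal E_\theta(N')$ a space of formal series:
$$ F_\theta(N) = \varprojlim_{m > 0,\ p} F_{\theta,m}(N_p), $$
where
$$ F_{\theta,m}(N_p) = \Big\{ \vec f = (f_n)_{n \ge 0} \ \text{with}\ f_n \in N_p^{\odot n} \ \text{and such that}\ \sum_n \theta_n^{-2}\,m^{-n}\,|f_n|_p^2 < \infty \Big\}, \qquad \theta_n = \inf_{r > 0}\frac{e^{\theta(r)}}{r^n}, $$
which gives
$$ \mathcal E_\theta(N') \simeq F_\theta(N), \qquad \mathcal M_{\theta^*}(N) \simeq G_\theta(N'). $$

Remarks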
1. For $k > 1$ let $\theta(x) = \frac{x^k}{k}$; then $\theta^*(x) = \frac{x^{k'}}{k'}$ with $\frac1k + \frac1{k'} = 1$. One then recovers the usual case, i.e. for every $\vec f \in F_\theta(N)$ and all $m, p > 0$:
$$ \|\vec f\|^2_{\theta,m,p} = \sum_n (n!)^{2/k}\,m^{-n}\,|f_n|_p^2 < \infty $$
($\frac2k = 1 + \beta$, see [7]). In particular, for $k = 2$ one recovers the Hida-Kubo-Takenaka spaces [3], [4], [9].
2. Setting
$$ \theta^*(t) = \mathrm{Log}(G_\alpha(t^2)) \quad\text{with}\quad G_\alpha(t) = \sum_{n \ge 0}\frac{\alpha(n)}{n!}\,t^n, $$
one recovers the spaces of distributions $[E]^*_\alpha$ studied by Cochran-Kuo-Sengupta [1] (1998).
3. If $\theta$ and $\varphi$ are two functions equivalent at infinity, i.e.
$$ \lim_{x\to+\infty}\frac{\theta(x)}{\varphi(x)} = 1, $$
then $\mathcal E_\theta(N') = \mathcal E_\varphi(N')$ and $\mathcal M_\theta(N) = \mathcal M_\varphi(N)$.
4. One obtains Theorems 2bis, 3bis and 4bis by replacing the indices $k$ by $\theta$ and $k'$ by $\theta^*$.
5. The analyticity hypothesis on the function $G_\alpha(t)$ in [1] is not necessary in our case; moreover, one has explicitly the space of test functions as well as its characterization by the chaos transform.
6. If $\theta(x) = (x + 1)\mathrm{Log}(x + 1) - x$, then $\theta^*(x) = e^x - x - 1 \sim e^x$, and one recovers the space given in [1] using the Bell numbers:
$$ G_{\mathrm{Bell}(k)}(t) = \sum_n \frac{B_k(n)}{n!}\,t^n = \frac{\exp_k(t)}{\exp_k(0)}, $$
where $\exp_k$ is the exponential function composed with itself $k$ times.
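The polar function $\theta^*(x) = \sup\{tx - \theta(t),\ t \ge 0\}$ used in Remarks 1 and 6 can be checked numerically. The sketch below is only an illustration: the helper name `polar`, the grid search and its parameters are assumptions made here, not part of the paper.

```python
import math

def polar(theta, x, t_max=20.0, n=100001):
    """Grid approximation of theta*(x) = sup_{t >= 0} (t*x - theta(t)).

    Assumes the supremum is attained inside [0, t_max]."""
    best = -math.inf
    for i in range(n):
        t = t_max * i / (n - 1)
        best = max(best, t * x - theta(t))
    return best

def theta_power(k):
    """Young function theta(t) = t**k / k of Remark 1."""
    return lambda t: t ** k / k

def theta_log(t):
    """Young function theta(t) = (t + 1) * log(t + 1) - t of Remark 6."""
    return (t + 1.0) * math.log(t + 1.0) - t
```

With `k = 3` the conjugate exponent is `k' = 3/2`, so `polar(theta_power(3), 2.0)` should be close to `2**1.5 / 1.5`, and `polar(theta_log, 2.0)` should be close to `math.exp(2) - 2 - 1`.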
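6. Application to stochastic partial differential equations

Consider the heat equation $(E_0)$ on $X = \mathbb R^n$:
$$ \begin{cases} \dfrac{\partial}{\partial t}f(t, x) = \dfrac12\Delta_x f(t, x), & t > 0 \text{ and } x \in \mathbb R^n, \\ f(0, x) = g(x) & \text{(initial condition).} \end{cases} \tag{19} $$
Setting
$$ \hat f(\xi) = \left(\frac{1}{\sqrt{2\pi}}\right)^n\int_{\mathbb R^n} e^{-i\langle x, \xi\rangle}f(x)\,dx $$
for the Fourier transform of $f$ with respect to $x$, one obtains
$$ \hat f(t, \xi) = \hat g(\xi)\,e^{-\frac12 t|\xi|^2}, $$
and hence, by Fourier inversion,
$$ f(t, x) = \int_{\mathbb R^n} g(x + u)\,\frac{e^{-|u|^2/(2t)}}{(2\pi t)^{n/2}}\,du = \int_{\mathbb R^n} g(x + u)\,d\gamma_t(u), \tag{20} $$
$$ f(t, x) = \int_{\mathbb R^n} g(x + \sqrt t\,u)\,d\gamma(u), \tag{21} $$
where $\gamma_t$ is the centered Gaussian measure of variance $t$; hence the solution of $(E_0)$ is given by
$$ f(t, x) = (\gamma_t * g)(x). \tag{22} $$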
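Formula (21) can be tested numerically in dimension $n = 1$, where closed-form solutions are available for simple initial data. The following sketch uses a plain trapezoid rule; the function name `heat_solution` and the quadrature parameters are illustrative assumptions.

```python
import math

def heat_solution(g, t, x, u_max=10.0, n=4001):
    """Approximate f(t, x) = integral of g(x + sqrt(t)*u) against the standard
    Gaussian measure on the real line, i.e. Eq. (21) with n = 1."""
    h = 2.0 * u_max / (n - 1)
    root_t = math.sqrt(t)
    total = 0.0
    for i in range(n):
        u = -u_max + i * h
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * g(x + root_t * u) * math.exp(-0.5 * u * u)
    return total * h / math.sqrt(2.0 * math.pi)
```

For $g(x) = \cos x$ the exact solution of $\partial_t f = \frac12 f''$ is $f(t, x) = e^{-t/2}\cos x$, and for $g(x) = x^2$ it is $f(t, x) = x^2 + t$; both can be compared with the output of `heat_solution`.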
In 1967 L. Gross generalized equation $(E_0)$ to the infinite-dimensional case, working on an abstract Wiener space $(H \hookrightarrow B, \gamma_t)$, where $\gamma_t$ is the Wiener measure on $B$. Setting $\Delta_G$ = Gross Laplacian = $\int \partial_t^2\,dt$ (Obata's formalism [18]), where $\partial_t = D_{\delta_t}$ is the Hida derivative, one has $\Delta_G u(x) = \mathrm{trace}_H\,u''(x)$. He considers the equation
$$ (E_0)_\infty :\quad \begin{cases} \dfrac{\partial}{\partial t}f(t, x) = \dfrac12\Delta_G f(t, x), & t > 0,\ x \in B, \\ f(0, x) = g(x), \end{cases} \tag{23} $$
and shows that
$$ f(t, x) = \int_B g(x + u)\,d\gamma_t(u) = (\gamma_t * g)(x). \tag{24} $$
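Theorem 6. Let $g \in \mathcal E_k(N')$ be given. Then the equation
$$ \begin{cases} \dfrac{\partial}{\partial t}f(t, x) = \dfrac12\Delta_G f(t, x), & t > 0 \text{ and } x \in X', \\ f(0, x) = g(x), \end{cases} \tag{25} $$
admits a unique solution in $\mathcal E_k(N')$, given by
$$ f(t, x) = \int_{X'} g(x + \sqrt t\,y)\,d\gamma(y). \tag{26} $$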
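Stochastic Burgers equation (Wick-Burgers). Consider the following Cauchy problem:
$$ \begin{cases} \dfrac{\partial}{\partial t}Y(t, x) = a\,\Delta Y(t, x) + Y(t, x) \diamond G(t), \\ Y(0, x) = f(x), \end{cases} \qquad x \in \mathbb R^n,\ t > 0,\ \omega \in X', \tag{27} $$
where $Y(t, x) = Y(t, x, \omega)$ is a generalized process in $\mathcal E'_k(N')$, $f(x) = f(x, \omega) \in \mathcal E'_k(N')$ and $G(t) = G(t, \omega) \in \mathcal E'_k(N')$, and where the Wick product of two distributions is defined as follows: if $S$ and $T \in \mathcal E'_k(N')$, then $U = S \diamond T$ is the unique distribution whose chaos transform $\hat U$ is given by
$$ \hat U = \widehat{S \diamond T} = \hat S \cdot \hat T. \tag{28} $$
Setting $\hat Y(\xi) = (T.C)Y(\xi)$, $\xi \in X + iX = N$, we deduce from (27):
$$ \begin{cases} \partial_t\hat Y(t, x, \xi) = a\,\Delta\hat Y(t, x, \xi) + \hat Y(t, x, \xi)\cdot\hat G(t, \xi), \\ \hat Y(0, x, \xi) = \hat f(x, \xi). \end{cases} \tag{29} $$
Applying the (T.F) with respect to $x$, we get
$$ \begin{cases} \partial_t\tilde Y(t, p, \xi) = -a|p|^2\,\tilde Y(t, p, \xi) + \tilde Y(t, p, \xi)\,\hat G(t, \xi), \\ \tilde Y(0, p, \xi) = \tilde f(p, \xi), \end{cases} \tag{30} $$
where $\tilde Y(t, p, \xi) = (T.F)_x\,\hat Y(t, \cdot, \xi)(p)$, whence
$$ \hat Y(t, x, \xi) = \frac{1}{(4\pi a t)^{n/2}}\exp\left(\int_0^t \hat G(s, \xi)\,ds\right) \times \left[\int_{\mathbb R^n}\hat f(y, \xi)\,e^{-\frac{|x - y|^2}{4at}}\,dy\right]. \tag{31} $$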
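Theorem 7. The stochastic Wick-Burgers equation (27) admits a unique solution $Y(t, x) \in \mathcal E'_k(N')$ if $f$ and $G \in \mathcal E'_k(N')$. Its (T.C) is given by (31).

Special cases [16].
i) If $G(t, \omega) = W(t) = \dot B(t)$ is the white noise process, then
$$ \hat Y(t, x, \xi) = \frac{1}{(4\pi a t)^{n/2}}\exp\left(\int_0^t \xi(s)\,ds\right) \times \left[\int_{\mathbb R^n}\hat f(y, \xi)\,e^{-\frac{|x - y|^2}{4at}}\,dy\right]. \tag{32} $$
ii) If $G(t, \omega) = e^{\diamond W(t)}$, where $W$ is the white noise process, then
$$ \hat Y(t, x, \xi) = \frac{1}{(4\pi a t)^{n/2}}\exp\left(\int_0^t e^{\xi(s)}\,ds\right) \times \left[\int_{\mathbb R^n}\hat f(y, \xi)\,e^{-\frac{|x - y|^2}{4at}}\,dy\right]. \tag{33} $$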
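Example (studied by P. L. Chow [2] (1989) and J. Potthoff [20] (1994)). Consider the following Cauchy problem (Stratonovich variant):
$$ \begin{cases} \dfrac{\partial}{\partial t}u(t, x) - \dfrac12\left(\nu(t) + \sigma^2(t)\right)\Delta u(t, x) = \sigma(t)\,W(t) \diamond \nabla_x u(t, x), \\ u(0, x) = \delta_0(x), \end{cases} \tag{34} $$
with $(t, x) \in \mathbb R_+ \times \mathbb R$, $\nu > 0$, $\nu(t) \in L^1_{loc}(\mathbb R_+, dt)$ and $\sigma \in L^2_{loc}(\mathbb R_+, dt)$.

Proposition. The solution of the Cauchy problem (34) is given by
$$ u(t, x) = \frac{1}{\sqrt{2\pi\gamma(t)}}\exp\left\{-\frac{1}{2\gamma(t)}\left(x - \int_0^t \sigma(s)\,dB(s)\right)^2\right\}, \tag{35} $$
where $\gamma(t) = \int_0^t \nu(s)\,ds$.

Itô variant of the preceding problem (34):
$$ \begin{cases} \dfrac{\partial}{\partial t}u(t, x) - \dfrac12\nu(t)\,\Delta u(t, x) = \sigma(t)\,W(t) \diamond \nabla_x u(t, x), \\ u(0, x) = \delta_0(x), \end{cases} \tag{36} $$
with $\nu$ and $\sigma$ satisfying the same conditions as in (34).

Theorem 8. Three cases arise in solving the Cauchy problem (36):
i) If $\gamma'(t) = \int_0^t\left(\nu(s) - \sigma^2(s)\right)ds > 0$, one obtains an explicit solution of (36): it suffices to replace $\gamma$ by $\gamma'$ in the preceding formula (35).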
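ii) If $\gamma'(t) = 0$, i.e. $\int_0^t \nu(s)\,ds = \int_0^t \sigma^2(s)\,ds$, the solution is given by
$$ u(t, x) = \delta_x\left(\int_0^t \sigma(s)\,dB(s)\right). \tag{37} $$
iii) If $\gamma'(t) = \int_0^t\left(\nu(s) - \sigma^2(s)\right)ds < 0$, inverting the (T.C) gives divergent expressions, and so the solution is given by a distribution $g \in (\mathcal E_2(\mathbb R))'$. Thus, for every $f \in \mathcal E_2(\mathbb R)$, one has: $\langle g, f\rangle$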
=2^7T
X P [
-^
+
^
X
fe-^+^f(iy)dy
(38)
avec : a'(t) = / u(s)ds Jo
f.fCW-W)^ Jo u(s)ds
(39)
and
$$ \alpha = \left(\int_0^t \sigma^2(s)\,ds\right)^{1/2}. \tag{40} $$

Bibliography

1. W.G. Cochran, H.H. Kuo, A. Sengupta: A new class of white noise generalized functions. Infinite Dimensional Analysis, Quantum Probability and Related Topics, Vol. 1, No. 1 (1998) 43-67.
2. P.L. Chow: Generalized solution of some parabolic equations with a random drift. J. App. Math. Optimization 20, pp. 1-17 (1989).
3. T. Hida: Brownian motion. Springer, Berlin, Heidelberg, New York (1980).
4. T. Hida, H.H. Kuo, J. Potthoff, L. Streit: White noise: an infinite dimensional calculus. Maths. and its Applications. Kluwer Academic Publishers (1993).
5. H. Holden, T. Lindstrøm, B. Øksendal, J. Ubøe, T.S. Zhang: The stochastic Wick-type Burgers equation. London Math. Soc., 216 (1995), 141-161.
6. Y.G. Kondrat'ev: Nuclear spaces of entire functions in problems of infinite dimensional analysis. Soviet Math. Dokl. Vol. 22, pp. 588-592 (1980).
7. Y.G. Kondrat'ev, L. Streit: Spaces of white noise distributions: constructions, descriptions, applications I. Rep. Math. Phys. 33 (1993) 341-366.
8. P. Krée: Distributions, Sobolev spaces on Gaussian spaces and Ito's calculus. Stochastic Processes and their Applications, S. Albeverio et al. (eds.), Kluwer Academic Pub., pp. 203-225 (1990).
9. I. Kubo, S. Takenaka: Calculus on Gaussian white noise I. Proc. Japan Acad. 56, pp. 376-380 (1980).
10. H.H. Kuo, J. Potthoff, L. Streit: A characterization of white noise test functionals. Nagoya Math. J. 121, pp. 185-194 (1991).
11. H. Ouerdiane: Dualité et opérateurs de convolution dans certains espaces de fonctions entières nucléaires à croissance exponentielle. Abhandlungen aus dem Math. Seminar Univ. Hamburg, Band 54, pp. 276-283 (1983).
12. H. Ouerdiane: Application des méthodes d'holomorphie et de distributions en dimension quelconque à l'analyse sur les espaces gaussiens. BiBoS, No 491 (1991).
13. H. Ouerdiane: Extension de deux théorèmes de type Kondrat'ev-Yokoi. BiBoS, No 572 (1993).
14. H. Ouerdiane: Fonctionnelles analytiques avec condition de croissance et application à l'analyse gaussienne. Japanese Journal of Math. Vol. 20, No 1, pp. 187-198 (1994).
15. H. Ouerdiane: Noyaux et symboles d'opérateurs sur des fonctionnelles analytiques gaussiennes. BiBoS, No 634 (1994); Japanese Journal of Math. Vol. 21, No 1 (1995).
16. H. Ouerdiane: Algèbres nucléaires et équations aux dérivées partielles stochastiques. Nagoya Math. Journal Vol. 151 (1998), pp. 107-127.
17. N. Obata: An analytic characterization of symbols of operators on white noise functionals. J. Math. Soc. Japan, Vol. 45, No 3, pp. 421-445 (1993).
18. N. Obata: White noise calculus and Fock space. L.N.M. No 1577, Springer-Verlag (1994).
19. J. Potthoff: On positive generalized functionals. J. Funct. Anal. 74, pp. 81-95 (1987).
20. J. Potthoff: White noise approach to parabolic stochastic partial differential equations. In: A.I. Cardoso et al. (eds.): Stochastic Analysis and App. in Physics. NATO-ASI Series 499, Kluwer Academic Publishers (1994).
21. I. Yokoi: Positive generalized white noise functionals. Hiroshima Math. J. 20, pp. 137-157 (1990).
ON DONSKER'S DELTA FUNCTION IN WHITE NOISE ANALYSIS
J. POTTHOFF, E. SMAJLOVIĆ
Fakultät für Mathematik und Informatik, Universität Mannheim, D-68131 Mannheim, Germany
It is shown that Donsker's delta function can be defined via the properties of its integral against a smooth function. Moreover, the composites of tempered distributions with (non-degenerate) elements in the first chaos are constructed with the help of the characterization theorem.
1 Introduction

Donsker's delta function, that is, the composition $\delta_x \circ B_t$ of the Dirac distribution at $x \in \mathbb R$ with Brownian motion at time $t > 0$, has been a source of inspiration and a testing ground for new ideas for almost two decades. Its first mathematically rigorous constructions were given in four papers by T. Hida¹, I. Kubo², H.-H. Kuo³, and S. Watanabe⁴, which all appeared in 1983 in proceedings edited by G. Kallianpur⁵. In particular, S. Watanabe⁴ constructs the composite of any tempered distribution with a random variable whose Malliavin covariance matrix is non-degenerate in a space of generalized random variables (the Meyer-Watanabe space), which is densely and continuously embedded in the space of Hida distributions, and thereby obtains, in a sense, the most general result. The subsequent papers on Donsker's delta function are too numerous to be all cited here explicitly.

Recently, one of the authors of the present note began, together with Th. Deck and G. Våge⁶, a reformulation of white noise analysis which is based on a general probability space equipped with a Brownian motion. (The interested reader is also referred to a related paper by J. Potthoff⁷.) This set-up uses only the properties of Gaussian random variables, and unifies the bases of white noise analysis and the Malliavin calculus. The main technical difficulty is the loss of the linear structure of the underlying sample space. (Both calculi are usually formulated in the framework of a topological vector space which carries a Gaussian measure.)

In this paper we review the chaos expansion, the construction of spaces of generalized random variables, and the characterization theorem within the general set-up of white noise analysis. As an illustration we construct Donsker's delta function in terms of the structural property one expects it to have: its integral with respect to $x$ against any smooth function $f$ should give $f \circ B_t$. Furthermore, we review some results of I. Kubo² concerning the composition of tempered distributions with (non-degenerate) random variables in the first chaos.

2 Framework
Throughout the paper we use a similar framework as in Deck et al.⁶ and Potthoff⁷. We make the following basic assumptions:

(H.1) $(\Omega, \mathcal B, P)$ is a complete probability space on which there is a standard Brownian motion $B = (B_t,\ t \in \mathbb R_+)$;
(H.2) $\mathcal B$ is equal to the (completion of the) $\sigma$-algebra generated by $B$.
It is well known (e.g., Revuz and Yor⁸, Lemma V.3.1) that condition (H.2) implies that the algebra generated by the exponential functions in the variables $B_{t_1}, \dots, B_{t_n}$, $t_1, \dots, t_n \in \mathbb R_+$, $n \in \mathbb N$, is dense in $L^2(P)$. (We choose all Hilbert spaces to be real in this paper; the modifications for the complex cases are obvious.) In fact, both statements are equivalent, and it is obvious that the same holds as well for the algebra of polynomials. For example, one can choose for $(\Omega, \mathcal B, P)$ the standard Wiener space. Another choice is the white noise space over the half line $(\mathcal S'(\mathbb R_+), \mathcal B, \mu)$, where $\mu$ is the white noise measure on $(\mathcal S'(\mathbb R_+), \mathcal B)$ and $\mathcal B$ is the $\sigma$-algebra generated by the weak topology on $\mathcal S'(\mathbb R_+)$. This probability space is discussed in detail in Potthoff⁷. Consider the mapping
$$ X : L^2(\mathbb R_+) \to L^2(P), \qquad g \mapsto X_g := \int_{\mathbb R_+} g(t)\,dB(t). $$
It is clear that $\{X_g,\ g \in L^2(\mathbb R_+)\}$ is a centered Gaussian family with covariance given by the inner product $(\cdot, \cdot)_2$ of $L^2(\mathbb R_+)$. If $\{Y_i,\ i \in I\}$ is a set of random variables indexed by some index set $I$, we denote by $\mathcal P\{Y_i,\ i \in I\}$ and $\mathcal E\{Y_i,\ i \in I\}$, respectively, the (real) algebras of polynomials and of exponentials in the random variables in $\{Y_i,\ i \in I\}$. A routine approximation argument shows that $\mathcal P\{X_h,\ h \in C_c^\infty(\mathbb R_+)\}$ and $\mathcal E\{X_h,\ h \in C_c^\infty(\mathbb R_+)\}$ are dense in $L^2(P)$, too. Moreover, it is clear that if $(e_n,\ n \in \mathbb N)$ is a complete orthonormal set in $L^2(\mathbb R_+)$, then also $\mathcal P\{X_{e_n},\ n \in \mathbb N\}$ and $\mathcal E\{X_{e_n},\ n \in \mathbb N\}$ are dense in $L^2(P)$.

Remark. For white noise analysis with time parameter in $\mathbb R$, it is convenient to replace hypotheses (H.1), (H.2) by the following:
(H.1)' $(\Omega, \mathcal B, P)$ is a complete probability space, and there is a linear mapping from $C_c^\infty(\mathbb R)$ into $L^2(P)$ so that $\{X_h,\ h \in C_c^\infty(\mathbb R)\}$ forms a centered Gaussian family with covariance given by the inner product of $L^2(\mathbb R)$.
(H.2)' $\mathcal B$ is equal to the completion of the $\sigma$-algebra generated by the family $\{X_h,\ h \in C_c^\infty(\mathbb R)\}$.

As an example, one can choose the classical white noise probability space, e.g., Hida et al.⁹ In the sequel, we shall only consider the case of time parameter in $\mathbb R_+$, and use the hypotheses (H.1), (H.2). The S-transform of an element $Y \in L^2(P)$ is defined by
$$ SY(h) := e^{-\frac12|h|_2^2}\,E\left(Y e^{X_h}\right), \qquad h \in C_c^\infty(\mathbb R_+), $$
where $|\cdot|_2$ denotes the norm of $L^2(\mathbb R_+)$. Since the set $\{\exp(X_h),\ h \in C_c^\infty(\mathbb R_+)\}$ is total in $L^2(P)$, it is clear that the S-transform, as a mapping from $L^2(P)$ into the real-valued functions on $C_c^\infty(\mathbb R_+)$, is injective. Next we sketch the well-known chaos (or Fock) decomposition of $L^2(P)$, and we closely follow Nelson¹⁰. For $n \in \mathbb N_0$, let $\mathcal G_n$ denote the $L^2(P)$-closure of the set of all polynomials in the random variables $X_h$, $h \in L^2(\mathbb R_+)$, of degree less than or equal to $n$. For $n = 0$, let $\mathcal H_0 = \mathcal G_0 = \mathbb R$, and for $n \ge 1$, denote by $\mathcal H_n$ the orthogonal complement of $\mathcal G_{n-1}$ in $\mathcal G_n$. $\mathcal H_n$ is called the $n$-th homogeneous chaos. Because of the density of the polynomials in the variables $X_g$, $g \in L^2(\mathbb R_+)$, we obtain the following direct sum decomposition of $L^2(P)$:
$$ L^2(P) = \bigoplus_{n=0}^\infty \mathcal H_n. $$
For $g_1, \dots, g_n \in L^2(\mathbb R_+)$, we denote the orthogonal projection of the product $X_{g_1}\cdots X_{g_n}$ onto $\mathcal H_n$ by $:X_{g_1}\cdots X_{g_n}:$.

Assume that $e_1, \dots, e_m$ are orthonormal in $L^2(\mathbb R_+)$. Then the independence of the Gaussian random variables $X_{e_1}, \dots, X_{e_m}$, together with well-known properties of the Hermite polynomials and an elementary argument, leads to the following formula:
$$ :X_{e_1}^{n_1}\cdots X_{e_m}^{n_m}: \;=\; H_{n_1}(X_{e_1})\cdots H_{n_m}(X_{e_m}), $$
where $n_1, \dots, n_m \in \mathbb N$, $n_i \ne n_j$ for $i \ne j$, and $H_n$ is the $n$-th Hermite polynomial in the form
$$ H_n(x) = \frac{d^n}{d\lambda^n}\,e^{\lambda x - \lambda^2/2}\Big|_{\lambda = 0}. $$
Let $f$ be an element in the subspace $\hat L^2(\mathbb R_+^n)$ of $L^2(\mathbb R_+^n)$, consisting of elements which are a.e. symmetric in their $n$ arguments, and let $I_n(f)$ denote its $n$-fold Itô integral:
$$ I_n(f) := n!\int_{S_n} f(t_1, t_2, \dots, t_n)\,dB_{t_n}\cdots dB_{t_1}, $$
where $S_n$ is the simplex $\{0 < t_n < t_{n-1} < \dots < t_1\}$. The mapping
$$ I_n : \hat L^2(\mathbb R_+^n) \to \mathcal H_n, \qquad f \mapsto I_n(f), $$
is a bijection. Moreover, it is easy to compute the $L^2(P)$-norm of the multiple Itô integral $I_n(f)$ with the Itô isometry:
$$ \|I_n(f)\|^2_{L^2(P)} = n!\,|f|^2_{L^2(\mathbb R_+^n)}. $$
In summary, every element $Y \in L^2(P)$ is in one-to-one correspondence with a sequence $(f_n,\ n \in \mathbb N_0)$, $f_n \in \hat L^2(\mathbb R_+^n)$, so that
$$ Y = \sum_{n=0}^\infty I_n(f_n), \qquad \|Y\|^2_{L^2(P)} = \sum_{n=0}^\infty n!\,|f_n|^2_{L^2(\mathbb R_+^n)}. $$
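The factor $n!$ in the Itô isometry can be seen in the simplest case: for a unit vector $e$, the Wick power $:X_e^n:$ is the Hermite polynomial $H_n$ evaluated at the standard Gaussian variable $X_e$, so the relation reduces to $E[H_n(X_e)H_m(X_e)] = n!\,\delta_{nm}$. The sketch below checks this by quadrature; the three-term recurrence and the helper names are standard tools assumed here for illustration and are not taken from the paper.

```python
import math

def hermite(n, x):
    """Hermite polynomial H_n(x) with generating function exp(lam*x - lam**2/2),
    computed by the recurrence H_{k+1}(x) = x*H_k(x) - k*H_{k-1}(x)."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x
    for k in range(1, n):
        h_prev, h = h, x * h - k * h_prev
    return h

def gaussian_expectation(func, half_width=12.0, n=6001):
    """Trapezoid estimate of E[func(X)] for a standard Gaussian X."""
    step = 2.0 * half_width / (n - 1)
    total = 0.0
    for i in range(n):
        x = -half_width + i * step
        weight = 0.5 if i in (0, n - 1) else 1.0
        total += weight * func(x) * math.exp(-0.5 * x * x)
    return total * step / math.sqrt(2.0 * math.pi)

def chaos_inner_product(n, m):
    """E[H_n(X) H_m(X)], expected to equal n! if n == m and 0 otherwise."""
    return gaussian_expectation(lambda x: hermite(n, x) * hermite(m, x))
```

For instance, `chaos_inner_product(3, 3)` should be close to `3! = 6`, while `chaos_inner_product(2, 4)` should be close to zero, reflecting the orthogonality of the chaoses of different order.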
Given a closable operator $A$ on $L^2(\mathbb R_+)$, we can now define as usual a closable operator $\Gamma(A)$ on $L^2(P)$ by the recipe
$$ \Gamma(A)Y := \sum_{n=0}^\infty I_n\left(A^{\otimes n}f_n\right), $$
for $Y$ of the above form, with $(f_n,\ n \in \mathbb N_0)$ so that $f_n$ is in the domain of $A^{\otimes n}$, and such that $\Gamma(A)Y \in L^2(P)$. (For a detailed discussion of domain questions we refer the interested reader to, e.g., Reed and Simon¹², p. 298 ff.) We are particularly interested in the following differential operator
acting on $f \in C^2_0(\mathbb{R}_+)$, the space of twice continuously differentiable functions on $[0, +\infty)$ which vanish rapidly at infinity. This operator has been discussed in detail in Deck et al. 6, Potthoff 7. It has a unique self-adjoint extension in $L^2(\mathbb{R}_+)$, which will be denoted again by $A$, and $C^2_0(\mathbb{R}_+)$ is a domain of essential self-adjointness for $A$. (We follow here and in the sequel the custom not to distinguish between a function and its class in an $L^p$-space, unless there is any danger of confusion.) Moreover, the Laguerre functions $l_n$, $n \in \mathbb{N}_0$, are eigenfunctions of $A$ with eigenvalues $n + \tfrac32$ (of multiplicity one). In analogy to the Hamiltonian of the harmonic oscillator, which generates the nuclear triple $\mathcal{S}(\mathbb{R}) \subset L^2(\mathbb{R}) \subset \mathcal{S}'(\mathbb{R})$ of the usual Schwartz spaces (e.g., Reed and Simon 12, p. 141 ff), the operator $A$ defined above generates a nuclear triple
$$\mathcal{S}(\mathbb{R}_+) \subset L^2(\mathbb{R}_+) \subset \mathcal{S}'(\mathbb{R}_+)$$
of functions on $\mathbb{R}_+$. It has been shown in Deck et al. 6 that the functions in $\mathcal{S}(\mathbb{R}_+)$ are infinitely often differentiable on $[0, +\infty)$, and vanish together with all their derivatives faster than any power at infinity. (I.e., $f \in \mathcal{S}(\mathbb{R}_+)$ can be viewed as the restriction of a function in $\mathcal{S}(\mathbb{R})$ to $\mathbb{R}_+$.) Moreover, using the self-adjoint operator $\Gamma(A)$, we construct a nuclear triple of smooth and generalized random variables
$$\mathcal{R} \subset L^2(P) \subset \mathcal{R}^*$$
in the usual way: Denote by $\mathcal{R}_p$, $p \in \mathbb{N}_0$, the domain of $\Gamma(A)^p$, equipped with the graph norm, so that it becomes a Hilbert subspace of $L^2(P)$. Let $\mathcal{R}_{-p}$ be its dual, and identify $L^2(P)$ with its dual. $\mathcal{R}$ is the projective limit of the Hilbert spaces $\mathcal{R}_p$, $p \in \mathbb{N}_0$, and $\mathcal{R}^*$ is the dual of $\mathcal{R}$, so that $\mathcal{R}^*$ is the union of the Hilbert spaces $\mathcal{R}_{-p}$, $p \in \mathbb{N}_0$. (We chose the notation $\mathcal{R}$, $\mathcal{R}^*$ resp., in order to distinguish these spaces from the Hida spaces $(\mathcal{S})$, $(\mathcal{S})^*$, for white noise analysis with time parameter in $\mathbb{R}$.) The spaces $\mathcal{R}$, $\mathcal{R}^*$ of smooth and generalized random variables share almost all properties of the well-known spaces $(\mathcal{S})$ and $(\mathcal{S})^*$, and we do not list these properties here;
the interested reader is referred to Hida et al. 9, Kondratiev et al. 13, and the references quoted there. The results of the next section will be proved with the help of characterization theorems for $\mathcal{R}^*$, and they will be given next in a form which is convenient for this purpose. As usual, we define a space $\mathcal{U}$ of real-valued functions $F$ on $C_c^\infty(\mathbb{R}_+)$, which are such that $F$ has an everywhere ray entire extension (denoted again by $F$), and that there exist a constant $K_F > 0$ and a norm $|\cdot|_F$ on $C_c^\infty(\mathbb{R}_+)$ which is continuous with respect to the topology induced by $\mathcal{S}(\mathbb{R}_+)$ on $C_c^\infty(\mathbb{R}_+)$, so that for all $z \in \mathbb{C}$, $h \in C_c^\infty(\mathbb{R}_+)$, the following estimate holds: $|F(zh)|$ ...
$$S\Phi(h) := \langle \Phi, {:}e^{X_h}{:} \rangle, \qquad h \in C_c^\infty(\mathbb{R}_+),$$
for $\Phi \in \mathcal{R}^*$, where ${:}e^{X_h}{:}$ is a shorthand for $\exp(X_h - \tfrac12 |h|_2^2)$.
Theorem 1 The $S$-transform is a bijection from $\mathcal{R}^*$ onto $\mathcal{U}$.
The proof can be done word by word as in Potthoff and Streit 14, or Kondratiev et al. 13. (Note that there is a gap in the proof in Potthoff and Streit 14 which has been fixed in Kondratiev et al. 13.) We also want to mention some other papers in the same direction, such as Cochran et al. 15, Grothaus 16, Kondratiev 17, Lee 18, Leukert 19, Ouerdiane 20, 21, and the works quoted there. Keeping track of the constants appearing in the proof of Theorem 1, one obtains the analogue of Corollary 12 in Kondratiev et al. 13, which reads here as follows.
Theorem 2 Assume that $F$ is everywhere ray entire on $C_c^\infty(\mathbb{R}_+)$ and satisfies a bound of the form
$$|F(zh)| \le K_1 \exp\big(K_2 |z|^2 |h|_{2,p}^2\big), \qquad h \in C_c^\infty(\mathbb{R}_+),$$
for some $p, K_1, K_2 > 0$, and where $|h|_2$
then
||$|| 2 ,_ 9 <4v / 5/r 1 . The second half of Theorem 2 comes from the elementary facts that the operator norm of . 4 - 1 is equal to 2/3, while its Hilbert-Schmidt norm is less than A / 7 9 / 8 0 . Very often (generalized) random variables depend on additional parame ters. In order to control this, a rather general result was proven in Deck et al. 22 , which in the present context reads as follows. Theorem 3 Let X be a first countable topological space, and Y be an arbitrary set. Let F b e a mapping from X x Y into U with the following properties: 1. There is x0 e X so that for all h € C~(R+), the family (F(-,y; h), y £ Y) is equicontinuous at XQ. 2. There exist p, Kx, K2 > 0, so that for all x 6 X, y € Y, z 6 C, h E
IF^z/i^tfxexp^lzl2!/^). Let $(x, y) := S~1F(x,y), and q be as in Theorem 2. Then the family (^(i2/)i y 6 V) is a family of mappings from X into 7£_(, +1 ), which is equicontinuous at x0. In particular, as x tends to x0, $(x, y) tends to $ ( i o , y) in the strong topology of 1Z*, the convergence being uniform in y € Y. Remark Theorem 3 contains the following special case which was also proved in Potthoff and Streit 14 and in Kondratiev et al. 13 , and which is worth noting. Assume that one is given a sequence (Fn, n € N) of elements in U, which converges pointwise to F £ U, and which is such that the growth estimate in assumption 2 above holds uniformly in n, i.e., there exist p e No, Ki, K2 > 0, so that for all n e N, z 6 C, h 6 C£°(R+),
iF„(z/i)i
3
Donsker's Delta Function
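We begin by recalling quickly the concepts of Pettis and Bochner integration (e.g., Hille and Phillips 23) of $\mathcal{R}^*$-valued functions. Let $(X, \mathcal{A}, \mu)$ be a measure space with a $\sigma$-finite measure $\mu$. Consider a mapping $\Phi$ from $X$ into $\mathcal{R}^*$. It is called weakly measurable, if for every $\varphi \in \mathcal{R}$, the mapping from $(X, \mathcal{A})$ into $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ given by $x \mapsto \langle \Phi(x), \varphi \rangle$ is measurable. $\Phi$ is called Pettis-integrable, if it is weakly measurable, and if for every $\varphi \in \mathcal{R}$, $\int |\langle \Phi(x), \varphi \rangle|\, \mu(dx)$ ...
$$\Big\langle \int \Phi(x)\, \mu(dx), \varphi \Big\rangle = \int \langle \Phi(x), \varphi \rangle\, \mu(dx).$$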
Assume now that there exists $p \in \mathbb{N}_0$, so that $\Phi(x) \in \mathcal{R}_{-p}$ for all $x \in X$. Note that $\mathcal{R}_{-p}$ is a separable Hilbert space, with $\mathcal{R}$ as a dense subspace. Therefore, we have by Pettis' theorem (e.g., Hille and Phillips 23) that if $\Phi$ is weakly measurable, it is measurable as a mapping from $(X, \mathcal{A})$ into $(\mathcal{R}_{-p}, \mathcal{B}_{-p})$, where $\mathcal{B}_{-p}$ is the Borel $\sigma$-algebra of $\mathcal{R}_{-p}$. It is well-known that the last statement is equivalent to the fact that $\Phi$ is the pointwise limit of a sequence $(\Phi_n,\ n \in \mathbb{N})$ of measurable, countably valued mappings from $(X, \mathcal{A})$ into $(\mathcal{R}_{-p}, \mathcal{B}_{-p})$. (This is just the Lebesgue approximation of $\Phi$.) The Bochner integral of such a $\Phi_n$ is defined in the obvious way, and $\Phi$ is called Bochner integrable (w.r.t. $\mu$), if the sequence of Bochner integrals of the $\Phi_n$ converges in $\mathcal{R}_{-p}$. (In this case, it converges also in every $\mathcal{R}_{-q}$ with $q > p$, and because the injection from $\mathcal{R}_{-p}$ into $\mathcal{R}_{-q}$ is continuous, it converges in $\mathcal{R}_{-q}$ to the same element.) The limit is called the Bochner integral of $\Phi$. It turns out that $\Phi$ is Bochner integrable if and only if $\int \|\Phi(x)\|_{2,-p}\, \mu(dx)$ is finite. Moreover, if $\Phi$ is Bochner integrable, then it is Pettis integrable, and both integrals coincide.
Let $g \in L^2(\mathbb{R}_+)$ be non-zero, and consider the Gaussian random variable $X_g$.
Definition A weakly continuous mapping $\Phi_g$ from $\mathbb{R}$ into $\mathcal{R}^*$ with the property that for every $f \in \mathcal{S}(\mathbb{R})$,
$$f \circ X_g = \int f(x)\, \Phi_g(x)\, dx, \qquad (1)$$
holds, where the right hand side is a Pettis integral in $\mathcal{R}^*$, is called a Donsker delta function (of $X_g$).
Below we shall show that there exists a unique element in $\mathcal{R}^*$ with this property, and we shall use for it the conventional notation $\delta_x \circ X_g$. The case where $g$ is the indicator of $[0, t]$, $t > 0$, yields Donsker's delta function $\delta_x \circ B_t$ of Brownian motion. We want to remark here that equation (1) has been shown in Leukert 19 for $\delta_x \circ B_t$ constructed on the basis of the Fourier representation of $\delta_x$, and that this result inspired the approach in the present paper. The heart of the matter is the following rather trivial result. (It can also be found in Kubo 2 within the usual framework of white noise analysis. However, the proof given there uses the linear structure of the white noise sample space, and therefore another proof has to be given here.)
Lemma 1
For all $f \in \mathcal{S}(\mathbb{R})$, $g \in L^2(\mathbb{R}_+)$, $g \neq 0$, $h \in C_c^\infty(\mathbb{R}_+)$,
$$S(f \circ X_g)(h) = P_{|g|_2^2} f\big((g,h)_2\big), \qquad (2)$$
where $(P_s,\ s > 0)$ denotes the heat semigroup on the real line.
Proof We write $f$ in terms of its Fourier transform $\hat f$:
$$S(f \circ X_g)(h) = S\Big((2\pi)^{-1/2} \int \hat f(k)\, e^{ikX_g}\, dk\Big)(h),$$
and interchange the order of integration with the help of Fubini's theorem. Then the $S$-transform is readily calculated, with the result
$$S(f \circ X_g)(h) = (2\pi)^{-1/2} \int \hat f(k)\, e^{-\frac12 k^2 |g|_2^2 + ik(g,h)_2}\, dk.$$
Applying the Plancherel theorem to the last equality yields the claim.
□
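Equation (2) lends itself to a quick numerical sanity check in finite dimensions. The sketch below is only an illustration under stated assumptions: it models $(X_g, X_h)$ as a centred Gaussian pair with variances $|g|_2^2$, $|h|_2^2$ and covariance $(g,h)_2$, takes the $S$-transform to be $E[f(X_g)\exp(X_h - \tfrac12|h|_2^2)]$, and uses the heat semigroup with kernel variance $s$ (the convention consistent with equation (3)). The helper names and the numerical values are hypothetical.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Gauss-Hermite nodes/weights for the weight exp(-z^2/2); dividing by
# sqrt(2*pi) turns the quadrature into an expectation over Z ~ N(0, 1).
_Z, _W = hermegauss(60)
_W = _W / np.sqrt(2.0 * np.pi)

def gauss_expect(func):
    """Approximate E[func(Z)] for a standard Gaussian Z."""
    return float(np.sum(_W * func(_Z)))

def s_transform(f, g_norm2, gh, h_norm2):
    """E[f(X_g) exp(X_h - |h|^2/2)] for a Gaussian pair (X_g, X_h)."""
    a = np.sqrt(g_norm2)          # X_g = a * Z1
    b = gh / a                    # X_h = b * Z1 + c * Z2, Z2 independent of Z1
    c2 = h_norm2 - b * b          # c^2 >= 0 by the Cauchy-Schwarz inequality
    # The Z2-integral is done in closed form: E[exp(c * Z2)] = exp(c^2 / 2).
    return gauss_expect(lambda z: f(a * z) * np.exp(b * z + 0.5 * c2 - 0.5 * h_norm2))

def heat_semigroup(f, s, y):
    """(P_s f)(y) = E[f(y + sqrt(s) Z)], heat kernel with variance s."""
    return gauss_expect(lambda z: f(y + np.sqrt(s) * z))

# Hypothetical values for |g|^2, (g, h) and |h|^2.
g_norm2, gh, h_norm2 = 2.0, 0.7, 1.5
lhs = s_transform(np.cos, g_norm2, gh, h_norm2)     # left side of (2)
rhs = heat_semigroup(np.cos, g_norm2, gh)           # right side of (2)
closed_form = np.cos(gh) * np.exp(-0.5 * g_norm2)   # E[cos(y + sqrt(s) Z)]
```

For the test function $f = \cos$ both sides agree with the closed form $\cos((g,h)_2)\, e^{-|g|_2^2/2}$, which is the Cameron-Martin shift behind Lemma 1 in its simplest instance.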
Remark Observe that both sides of equation (2) extend continuously to $f \in L^2(\mathbb{R})$, whence (2) is valid for $f \in L^2(\mathbb{R})$ as well.
Theorem 4 There exists a unique weakly continuous mapping $\Phi_g$ from $\mathbb{R}$ into $\mathcal{R}^*$ satisfying (1) for all $f \in \mathcal{S}(\mathbb{R})$.
Proof We show first uniqueness. Assume that $x \mapsto \Phi_g(x)$ is weakly continuous, and that (1) holds. We take the $S$-transform of both sides of (1) at $h \in C_c^\infty(\mathbb{R}_+)$. By definition of the Pettis integral, we may interchange the $S$-transform with the integral on the right hand side, and obtain by Lemma 1 the equation
$$\int f(x)\, S(\Phi_g(x))(h)\, dx = (2\pi |g|_2^2)^{-1/2} \int f(x) \exp\Big(-\frac{1}{2|g|_2^2}\big(x - (g,h)_2\big)^2\Big)\, dx.$$
Since this has to hold for all $f \in \mathcal{S}(\mathbb{R})$, we can conclude that for all $h \in C_c^\infty(\mathbb{R}_+)$ and (Lebesgue-) a.e. $x \in \mathbb{R}$,
$$S(\Phi_g(x))(h) = (2\pi |g|_2^2)^{-1/2} \exp\Big(-\frac{1}{2|g|_2^2}\big(x - (g,h)_2\big)^2\Big). \qquad (3)$$
The assumption that $\Phi_g$ is weakly continuous implies that $x \mapsto S(\Phi_g(x))(h)$ is continuous for every $h \in C_c^\infty(\mathbb{R}_+)$, and hence (3) holds for all $x \in \mathbb{R}$, $h \in C_c^\infty(\mathbb{R}_+)$. But then $\Phi_g(x)$ is for every $x \in \mathbb{R}$ uniquely determined by its $S$-transform. For existence, we observe that the right hand side of (3) is, as a function of $h$, for every $x \in \mathbb{R}$ an element in $\mathcal{U}$. Therefore Theorem 1 shows that for every $x \in \mathbb{R}$ there is $\Phi_g(x) \in \mathcal{R}^*$ satisfying equation (3). Now we use Theorem 3 to argue that $x \mapsto \Phi_g(x)$ is strongly (and hence weakly) continuous: Consider the right hand side of (3), replace $h$ by $zh$, and estimate as follows
$$|S(\Phi_g(x))(zh)| \le (2\pi |g|_2^2)^{-1/2} \exp\big(|z|^2 |h|_2^2\big).$$
The last bound is uniform in $x$, and therefore we may apply Theorem 3 to obtain the claimed continuity. Finally, we prove equation (1). Note that by the injectivity of the $S$-transform (1) follows, once we have shown that $f\Phi_g$ is Pettis integrable. But the last estimate and Theorem 2 show that $(f(x)\Phi_g(x),\ x \in \mathbb{R})$ is for some $q \in \mathbb{N}_0$ bounded in $\mathcal{R}_{-q}$ by a constant times $|f(x)|$. Since $f$ is integrable, $x \mapsto f(x)\Phi_g(x)$ is Bochner integrable, and therefore also Pettis integrable. □
As remarked above, we shall write from now on $\delta_x \circ X_g$ for the unique element $\Phi_g(x)$ in $\mathcal{R}^*$. Let us denote the one-dimensional heat kernel by $p(t; x, y)$, and set for fixed $h \in C_c^\infty(\mathbb{R}_+)$, $g \in L^2(\mathbb{R}_+)$, $g \neq 0$,
$$k_{g,h}(x) := p\big(|g|_2^2;\, x,\, (g,h)_2\big),$$
and observe that $k_{g,h}$ belongs to the Schwartz space $\mathcal{S}(\mathbb{R})$. Then we may write equation (2) also as follows
$$S(f \circ X_g)(h) = (f, k_{g,h}), \qquad h \in C_c^\infty(\mathbb{R}_+), \qquad (4)$$
where we consider $f$ as an element in $\mathcal{S}'(\mathbb{R})$, and denote the dual pairing between $\mathcal{S}'(\mathbb{R})$ and $\mathcal{S}(\mathbb{R})$ also by $(\cdot,\cdot)$. (There will be no danger of confusion.) Also, equation (3) can be now written in the following way:
$$S(\delta_x \circ X_g)(h) = (\delta_x, k_{g,h}). \qquad (5)$$
Assume that $(\delta_x^n,\ n \in \mathbb{N})$ is a sequence in $L^2(\mathbb{R})$ which converges weakly to $\delta_x$ in $\mathcal{S}'(\mathbb{R})$. Then a glance at (4), (5), and the remark after Theorem 3 shows
that the random variables $\delta_x^n \circ X_g$ in $L^2(P)$ converge strongly in $\mathcal{R}^*$ to $\delta_x \circ X_g$. Therefore Donsker's delta function as defined by equation (1) coincides with the usual constructions via regularizations and limits. Equation (4) suggests to define the composite $T \circ X_g$ for $T \in \mathcal{S}'(\mathbb{R})$ as an element in $\mathcal{R}^*$ as the $S$-inverse of
$$F(h) := (T, k_{g,h}), \qquad h \in C_c^\infty(\mathbb{R}_+).$$
In fact, equation (4) has already been used in Kubo 2 for such a purpose. In that paper, I. Kubo uses methods from the theory of reproducing kernel Hilbert spaces (with proofs which are not completely explicit). Let us carry this out here with the help of the characterization theorems. Denote by $H$ the Hamiltonian of the harmonic oscillator, i.e.,
$$Hf(x) = -f''(x) + x^2 f(x), \qquad f \in \mathcal{S}(\mathbb{R}),\ x \in \mathbb{R}.$$
Let $\mathcal{S}_p(\mathbb{R})$, $p \in \mathbb{N}_0$, be the domain of $H^p$ in $L^2(\mathbb{R})$. We equip $\mathcal{S}_p(\mathbb{R})$ with its natural norm, which we also denote by $|\cdot|_{2,p}$ (again, there will be no danger of confusion), so that under this norm $\mathcal{S}_p(\mathbb{R})$ becomes a Hilbert space. The dual of $\mathcal{S}_p(\mathbb{R})$ is denoted by $\mathcal{S}_{-p}(\mathbb{R})$, its (Hilbertian) norm by $|\cdot|_{2,-p}$. It is well-known, e.g. Reed and Simon 12, that $\mathcal{S}(\mathbb{R})$ (with its usual topology) is the projective limit of the Hilbert spaces $\mathcal{S}_p(\mathbb{R})$, while $\mathcal{S}'(\mathbb{R})$ is the union of the spaces $\mathcal{S}_{-p}(\mathbb{R})$, $p \in \mathbb{N}_0$.
Let $T \in \mathcal{S}'(\mathbb{R})$, so that there is a $p \in \mathbb{N}_0$ with $T \in \mathcal{S}_{-p}(\mathbb{R})$. In other words, there is $p \in \mathbb{N}_0$, so that $H^{-p}T$ belongs to $L^2(\mathbb{R})$, and we may write for every $\varphi \in \mathcal{S}(\mathbb{R})$, $(T, \varphi) = (H^{-p}T, H^p\varphi)_{L^2(\mathbb{R})}$.
Lemma 2 For every $p \in \mathbb{N}_0$ there is a polynomial $\gamma_p$ of degree $2p$ in the variables $|g|_2^{-1}$, $(g,h)_2$, $x$ so that
$$H^p k_{g,h}(x) = \gamma_p\big(|g|_2^{-1}, (g,h)_2, x\big)\, k_{g,h}(x).$$
Proof A trivial induction. □
Corollary 1 For all $p \in \mathbb{N}_0$, $x \in \mathbb{R}$, the mapping
$$h \mapsto H^p k_{g,h}(x), \qquad h \in C_c^\infty(\mathbb{R}_+),$$
has an everywhere ray entire extension. The ray entire extension of $H^p k_{g,\cdot}(x)$ will be denoted in the sequel by the same symbol. It is straightforward to derive from Lemma 2 the following estimate:
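Lemma 3 For every $p \in \mathbb{N}_0$ there is a strictly positive function $K_1$ on $(0, +\infty)$, and a constant $K_2 > 0$, so that for all $x \in \mathbb{R}$, $z \in \mathbb{C}$, $l, h \in C_c^\infty(\mathbb{R}_+)$,
$$|H^p k_{g,l+zh}(x)| \le K_1\big(|g|_2^{-1}\big) \exp\Big(-\frac{1}{8|g|_2^2}\, x^2\Big) \exp\Big(K_2\big(|l|_2^2 + |z|^2 |h|_2^2\big)\Big). \qquad (6)$$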
Let /, h G C£°(R+). It is trivial to check, that the real and imagi nary parts of kg
+
i{T,lm(kg,l+zh)).
For every $T \in \mathcal{S}'(\mathbb{R})$, and all $l, h \in C_c^\infty(\mathbb{R}_+)$ the mapping $z \mapsto (T, k_{g,l+zh})$ is continuous on $\mathbb{C}$.
Proof Assume that $T \in \mathcal{S}_{-p}(\mathbb{R})$ for $p \in \mathbb{N}_0$. Let $z \in \mathbb{C}$, and let $(z_n,\ n \in \mathbb{N})$ be a sequence converging to $z$. Without loss of generality we may assume that $|z - z_n| < 1$ for all $n \in \mathbb{N}$. Then we have
$$|(T, k_{g,l+zh}) - (T, k_{g,l+z_n h})| \le |T|_{2,-p}\, \big|H^p(k_{g,l+zh} - k_{g,l+z_n h})\big|_2.$$
By Corollary 1, it is clear that $H^p(k_{g,l+zh}(x) - k_{g,l+z_n h}(x))$ converges to zero for every $x \in \mathbb{R}$ as $n$ tends to infinity. On the other hand, inequality (6) readily provides a uniform square integrable bound. Therefore, an application of the dominated convergence theorem ends the proof. □
Lemma 5
For every T G <S'(R), and all/, h G C ~ ( R + ) the mapping z •—► (T, k9ii+zh)
is entire. Proof By Lemma 4 and Morera's theorem, it suffices to show that for every (smooth) closed path C in the complex plane we have
But
£ f{T,kg
(T, k9ii+zh) dz = 0.
=j ,
{H-*T,H"kg,l+Zh)2dz p
= (»->*■ IH k
9tl+Zh
= 0,
dz) 2
where the interchange of the integrals is justified by Fubini's theorem (use the bound (6)), and the last line follows from Corollary 1 and Cauchy's theorem. □ Lemma 6 Let T £ 5_ P (R), p € N 0 . Then there is a strictly positive function K\ on (0, +oo) and a constant K2 > 0, so that for all h G C£°(R+), * (EC, |
This estimate follows directly from Lemma 3.
n
Theorem 5 Let $g \in L^2(\mathbb{R}_+)$, $g \neq 0$. There exists an injective, strongly sequentially continuous mapping
$$\Phi_g : \mathcal{S}'(\mathbb{R}) \longrightarrow \mathcal{R}^*, \qquad T \longmapsto \Phi_g(T),$$
so that the $S$-transform of $\Phi_g(T)$ is given by
$$S\Phi_g(T)(h) = (T, k_{g,h}), \qquad h \in C_c^\infty(\mathbb{R}_+). \qquad (7)$$
Proof Theorem 1, Lemmas 5 and 6 show that equation (7) defines indeed for every $T \in \mathcal{S}'(\mathbb{R})$ an element in $\mathcal{R}^*$. Consider now the mapping $\Phi_g : \mathcal{S}'(\mathbb{R}) \to \mathcal{R}^*$ defined this way. Formula (7), the injectivity of the $S$-transform, and the fact that the family $\{k_{g,h},\ h \in C_c^\infty(\mathbb{R}_+)\}$ separates the points of $\mathcal{S}'(\mathbb{R})$ (cf. Corollary A.2 in the appendix) show that $\Phi_g$ is injective. If $T_n \to T$ strongly in $\mathcal{S}'(\mathbb{R})$, then there is $p \in \mathbb{N}_0$ so that $T_n \to T$ in $\mathcal{S}_{-p}(\mathbb{R})$. Therefore Lemma 6 shows that $S\Phi_g(T_n)(h)$ converges to $S\Phi_g(T)(h)$ for every $h \in C_c^\infty(\mathbb{R}_+)$, and that there is a bound of exponential second order which is uniform in $n$. Therefore Theorem 3 can be applied, with the conclusion that $\Phi_g$ is strongly sequentially continuous. □
Let $T \in \mathcal{S}'(\mathbb{R})$. Then there is a sequence $(T_n,\ n \in \mathbb{N})$ in $\mathcal{S}(\mathbb{R})$ which converges strongly to $T$ in $\mathcal{S}'(\mathbb{R})$. For $T_n \circ X_g$, we know from Lemma 1 that
$$S(T_n \circ X_g)(h) = (T_n, k_{g,h}), \qquad h \in C_c^\infty(\mathbb{R}_+).$$
Since this sequence of $S$-transforms converges to the $S$-transform of $\Phi_g(T)$, it follows that $T_n \circ X_g$ converges strongly to $\Phi_g(T)$ in $\mathcal{R}^*$. Therefore it is reasonable to denote $\Phi_g(T)$ by $T \circ X_g$.
Remarks
1. The composition of a tempered distribution with $B_t$, $t > 0$, has also been discussed with a different method in Kallianpur and Kuo 24 (see also Hida et al. 9, Chapter 3).
2. We expect that our method can be extended to provide a construction of the composition of elements in $\mathcal{S}'(\mathbb{R})$ with solutions of (non-degenerate) Itô equations (with time dependent coefficients).
Appendix
In this appendix we prove that the family of translates of the Gauss function is total in $\mathcal{S}(\mathbb{R})$.
Lemma A.1 Let $f \in \mathcal{S}(\mathbb{R})$, $n \in \mathbb{N}_0$. For $a \neq 0$, let $f_a^{(n)}(x)$ be an $n$-th order difference quotient which converges to the $n$-th derivative $f^{(n)}(x)$ of $f$ at $x \in \mathbb{R}$ as $a$ tends to zero. Then $f_a^{(n)}$ converges to $f^{(n)}$ in $\mathcal{S}(\mathbb{R})$ as $a$ tends to zero.
Proof Without loss of generality we may choose the difference quotient approximation of $f^{(n)}(x)$ which is symmetric around $x \in \mathbb{R}$, and we may assume that $a < 1/n$. Then
/<">(*) = (2a)-" J2 (t){-l)kf(X k=o ^
+ {n
~ 2*)a)'
(A1)
'
The fundamental theorem of calculus gives the formula /J">(x) = (2o)- n /
/ ( n ) ( x + yi+...
+ yn)dy1...
dyn.
Let Q, P 6 N 0 . Then (1 + x 2 ) Q / 2 D"(/(") - / W ) ( x ) = (2a)-" (1 + x / •/[-o,o]»
(f{n+0\x)
V
x2)a'2
- f{n+0) {x + yi + ... + yn))dyi...
dyn,
'
where the interchange of the derivative D& with the integral over [-a, a] n is readily justified. For y G [-a,a]n, y = (yl}.. .y„), put y+ := yi + ■ ■ ■ + yn, and decompose [-a,o]" = Nal)Pa, where A^ := {y G [-a,a]n; y+ < 0}. The contribution from ./V0 to the above integral can be estimated in the following way: (2a)"" (1 + x 2 )°/ 2 /
(f(n+0\x)
- /<»+">(i + y+)) dVl...
dyn
< (2a)-" (1 + x2)a'2
I
\&+0+1\u)\
i
dudy,
...dyn
JNa Jx-\y+\
< na(2a)-n
(1 + x2)Q'2{\
+ (x -
l)2)~a/2\Na\
xsup(l+u2)Q/2|/
llsll^^su^Ki + x 2 )"/ 2 ^^)!, for the Schwartz space semi-norms, and C is a universal constant (which may be determined by elementary calculus to be (3 + \f%)/2). The contribution from the set Pa can estimated in a similar way. Therefore, we obtain for all a, P € N 0 a bound of the form
ll/ (n) -fin)\\a,0
< 2C Q / 2 na||/|| a ,„ + / 3 + 1 ,
which proves the Lemma.
□
Corollary A . l Assume that T is a translation invariant subset of <S(R). If T together with all the derivatives of its elements form a total subset of <S(R), then T is total in S(R). Proof Formula (A.l) shows that with / , fa belongs to T for all n € N, a > 0. By Lemma A.l, fa converges in <S(R) to the n-th derivative of / as a converges to zero, and hence the claim follows. D Lemma A.2 Assume that T is total in <S(R), and that ip € S(R) is strictly positive. Then
tp)fn\\a,0
1
= U V(/n - V" 9)\\a,0 + ||(1 " i>)fn\\a,0 < W*P\\*& ||/n - V" 1 g\\a"J3» + ||(1 - ^)/»||aJJ,
for some o', a", 0', /?" € N 0 , since rpip € <S(R) and <S(R) is a topological algebra. By construction, both terms of the last estimate converge to zero with n —> +00. n For / i € R , let us denote 4>»(x) := e - ( l - " ) 2 , L e m m a A.3
x € R.
The set {0„, /i € R} is total in 5(R).
Proo/ Let cf> = <j>0, and denote by e n , n 6 No, the n-th (unnormalized) Hermite function, i.e.,
en(x) =
(-l)ne^2
It is well-known, e.g. Reed and Simon total in S(R), and hence so is the set
12
, that the set of Hermite functions is
/•:={(-l)ne„,n6No}. Obviously we have $:={ 0, let «W(z) : = e x p ( - ^ ( z - A * ) 2 ) ,
x e R.
Then a simple scaling argument yields
Corollary A.2 For every $\sigma^2 > 0$, the set $\{\phi_\mu^{(\sigma^2)},\ \mu \in \mathbb{R}\}$ is total in $\mathcal{S}(\mathbb{R})$. In particular, this set separates the points of $\mathcal{S}'(\mathbb{R})$.
References
1. T. Hida, Generalized Brownian functionals, in reference 5
2. I. Kubo, Itô formula for generalized Brownian functionals, in reference 5
3. H.-H. Kuo, Donsker's delta function as a generalized Brownian functional and its application, in reference 5
4. S. Watanabe, Malliavin's calculus in terms of generalized Wiener functionals, in reference 5
5. G. Kallianpur, Theory and Applications of Random Fields, Springer (LNCIS 49), 1983
6. T. Deck, J. Potthoff, G. Våge, A review of white noise analysis from a probabilistic standpoint, Acta Appl. Math. 48, 91 (1997)
7. J. Potthoff, On differential operators in white noise analysis, Preprint (1999)
8. D. Revuz, M. Yor, Continuous martingales and Brownian motion, Springer, 1991
9. T. Hida, H.-H. Kuo, J. Potthoff, L. Streit, White Noise: An Infinite Dimensional Calculus, Kluwer, 1993
10. E. Nelson, Probability theory and Euclidean quantum field theory, in Constructive Quantum Field Theory, G. Velo, A. Wightman (eds.), Springer, 1973
11. H.P. McKean, Jr., Stochastic Integrals, Academic Press, 1969
12. M. Reed, B. Simon, Methods of Modern Mathematical Physics I: Functional Analysis, Academic Press, 1972
13. Yu.G. Kondratiev, P. Leukert, J. Potthoff, L. Streit, W. Westerkamp, Generalized functionals on Gaussian spaces - the characterization theorem revisited, J. Funct. Anal. 141, 301 (1996)
14. J. Potthoff, L. Streit, A characterization of Hida distributions, J. Funct. Anal. 101, 212 (1991)
15. G. Cochran, H.-H. Kuo, A. Sengupta, A new class of white noise generalized functions, IDAQPRT 1, 43 (1998)
16. M. Grothaus, New Results in Gaussian Analysis and Their Applications in Mathematical Physics, Ph.D. Thesis, Bielefeld, 1998
17. Yu.G. Kondratiev, Nuclear spaces of entire functions of an infinite number of variables, connected with the rigging of a Fock space, in: Spectral Analysis of Differential Operators, Math. Inst., Acad. Sci. Ukrainian SSR, p. 18, English translation: Selecta Math. Sovietica 10, 165 (1991)
18. Y.J. Lee, Analytic version of test functionals, Fourier transform and a characterization of measures in white noise calculus, J. Funct. Anal. 100, 359 (1991)
19. P. Leukert, Generalized Functions on Gaussian Spaces - Constructions, Characterizations, Calculus, Diploma Thesis, Bielefeld, 1994
20. H. Ouerdiane, Algèbres nucléaires de fonctions entières et équations aux dérivées partielles stochastiques, Nagoya Math. J. 151, 107 (1998)
21. H. Ouerdiane, Application des méthodes d'holomorphie et de distributions en dimension quelconque à l'analyse sur les espaces gaussiens, BiBoS 491 (1991)
22. Th. Deck, J. Potthoff, G. Våge, H. Watanabe, Stability and viscosity limit for parabolic stochastic differential equations, Preprint (1996), to appear in Appl. Math. Optim.
23. E. Hille, R.S. Phillips, Functional Analysis and Semigroups, American Math. Soc. Colloq. Publ., 1957
24. G. Kallianpur, H.-H. Kuo, Regularity properties of Donsker's delta function, Appl. Math. Optim. 12, 89 (1984)
$L^1$-UNIQUENESS ON MEASURABLE STATE SPACES: A CLASS OF EXAMPLES
MICHAEL RÖCKNER
Fakultät für Mathematik, Universität Bielefeld, Postfach 10 01 31, D-33501 Bielefeld, Germany
We give an analytic proof for $L^1$-uniqueness of a class of diffusion operators on arbitrary measurable state spaces. In particular, this generalizes a recent result, proved by probabilistic means by L. Wu, to the non-symmetric case. The method we use is based on the classical DuHamel formula which is applicable here due to results on (non-symmetric) Dirichlet forms on merely measurable state spaces, and a recent result by W. Stannat.
Mathematics Subject Classification (1991): Primary: 47D07, 31C25; Secondary: 35K05, 47B44, 60J35
Key Words and Phrases: $L^1$-uniqueness, maximal dissipativity, DuHamel formula, Dirichlet forms
Running head: $L^1$-uniqueness on measurable state spaces
1
Introduction and main result
Let $(E, \mathcal{B})$ be a measurable space and $(H_x)_{x \in E}$ a measurable Hilbert bundle over $E$ (cf. e.g. 6 or 15), where each Hilbert space $H_x$ is assumed to be separable and $(\cdot,\cdot)_x$ denotes the inner product on $H_x$. Set $|\cdot|_x := \sqrt{(\cdot,\cdot)_x}$. Let $\mu_0$ be a finite positive measure on $(E, \mathcal{B})$ and for $p \in (0, \infty]$ let $L^p(\mu_0) := L^p(E, \mathcal{B}, \mu_0)$ denote the corresponding (real) $L^p$-space. Let $D_1 \subset L^\infty(\mu_0)$ be an algebra of ($\mu_0$-classes of) functions, dense in $L^2(\mu_0)$, such that $1 \in D_1$. Assume we are given a gradient
$$\nabla : D_1 \to \int^{\oplus} H_x\, \mu_0(dx),$$
i.e., $\nabla$ satisfies the product rule. Here as usual $\int^{\oplus} H_x\, \mu_0(dx)$ denotes the space of $L^2(\mu_0)$-sections in $(H_x)_{x \in E}$. (See 6, 15. The reader who is unfamiliar with this notion should, for simplicity, think that $H_x = H$ for all $x \in E$ and then $\int^{\oplus} H_x\, \mu_0(dx) = L^2(E \to H, \mu_0)$.) Define for $u, v \in D_1$
$$\mathcal{E}(u,v) := \int (\nabla u(x), \nabla v(x))_x\, \mu_0(dx) \qquad (1)$$
and assume that $(\mathcal{E}, D_1)$ is closable on $L^2(\mu_0)$ (cf. e.g. 10 or 1 for the definition, easy to apply criteria, and examples). Let $(\mathcal{E}, D(\mathcal{E}))$ denote its closure. Clearly, $\nabla$ extends then to all $u \in D(\mathcal{E})$ and it is easy to see that for $u, v \in D(\mathcal{E})$ such that $u \cdot v$, $u|\nabla v|$, $v|\nabla u| \in L^2(\mu_0)$ we have $u \cdot v \in D(\mathcal{E})$ and
$$\nabla(uv) = u \nabla v + v \nabla u. \qquad (2)$$
Remark 1. $(\mathcal{E}, D(\mathcal{E}))$ is then a symmetric Dirichlet form on $L^2(\mu_0)$ (in the sense of e.g. ). But we shall not use this fact below. For explicit examples for $(\mathcal{E}, D(\mathcal{E}))$ above and for notions concerning sectorial forms used below we also refer to 10. We emphasize that, in particular, infinite dimensional state spaces are covered.
Let $(L_0, D(L_0))$ be the generator of $(\mathcal{E}, D(\mathcal{E}))$ and let $D \subset D(L_0)$ be a core for $(L_0, D(L_0))$, i.e., $D$ is a linear subspace of $D(L_0)$ which is dense with respect to the graph norm $\big(\|L_0\,\cdot\|^2_{L^2(\mu_0)} + \|\cdot\|^2_{L^2(\mu_0)}\big)^{1/2}$. Assume that $D \subset L^\infty(\mu_0)$. Let $\varphi \in D(\mathcal{E})$, $\varphi > 0$ $\mu_0$-a.e., such that $\varphi^2 \in D(\mathcal{E})$ (or equivalently,
(3)
L2(fio) C Ll(n) continuously.
(4)
that
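Let $B$ be a measurable section in $(H_x)_{x \in E}$ satisfying the following properties:
(B1) $|B|$ ...
(B3) $\int \big(B - \frac{\nabla \varphi^2}{\varphi^2}, \nabla u\big)\, d\mu = 0 \quad \forall u \in D.$
Define for $u \in D$
$$L_B u := L_0 u + (B, \nabla u). \qquad (5)$$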
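Then $(L_B, D)$ is a linear operator on $L^1(\mu)$. This is due to (B2) and the fact that $\varphi^2 \in L^2(\mu_0)$, since we have that
$$\int |L_0 u|\, d\mu + \int |(B, \nabla u)|\, d\mu \le \Big(\int |L_0 u|^2\, d\mu_0\Big)^{1/2} \Big(\int \varphi^4\, d\mu_0\Big)^{1/2} + \Big(\int |B|^2 \varphi^4\, d\mu_0\Big)^{1/2} \mathcal{E}(u,u)^{1/2} \quad \forall u \in D.$$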
Remark 2. It is trivial to check that (B3) is equivalent to
$$\int \big(L_0 u + (B, \nabla u)\big)\, d\mu = 0 \quad \forall u \in D, \qquad (6)$$
i.e., $\mu$ is an infinitesimally invariant measure for $(L_B, D)$. (B3) just means that $B - \frac{\nabla \varphi^2}{\varphi^2}$ has divergence zero with respect to $\mu$.
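A one-dimensional toy case may make the infinitesimal invariance (6) concrete. The sketch below is an illustration under assumptions that are not taken from the text: $E = \mathbb{R}$, $\mu_0 = N(0,1)$, so that $L_0 u = u'' - x u'$ is the Ornstein-Uhlenbeck generator associated with (1), the measure is $\mu = \rho\, \mu_0$ with a hypothetical positive density $\rho$ playing the role of $\varphi^2$, and $B = \rho'/\rho$. With this symmetric choice of $B$ the integral in (6) vanishes, while dropping the drift ($B = 0$) destroys the invariance.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Quadrature for expectations under mu_0 = N(0, 1).
x, w = hermegauss(80)
w = w / np.sqrt(2.0 * np.pi)

# Hypothetical test function u and its first two derivatives.
du = np.cos(x) + x            # u(x) = sin(x) + x^2 / 2
d2u = -np.sin(x) + 1.0

# Hypothetical density rho = phi^2 of mu w.r.t. mu_0 (positive everywhere).
rho = 1.0 + 0.5 * x + 0.5 * x**2
drho = 0.5 + x

L0u = d2u - x * du            # Ornstein-Uhlenbeck generator applied to u
B = drho / rho                # symmetric drift, B = grad(phi^2) / phi^2

# Integral of L_B u = L_0 u + B u' against mu = rho * mu_0, as in (6).
invariance_defect = float(np.sum(w * (L0u + B * du) * rho))

# Same integral with the drift switched off: mu is no longer invariant.
defect_without_drift = float(np.sum(w * L0u * rho))
```

Here `invariance_defect` is zero up to rounding, by Gaussian integration by parts, whereas `defect_without_drift` equals $-\int u' \rho'\, d\mu_0 \neq 0$.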
Example 1. (i) If B := ^ £ , then by (2) f\B\V
<*/*> = 4 f\Vcp\2
/"|B| V
dfi0=4
dno
and f
due to our assumptions on
= /"(W2,V(Viu))d/io = -
/Wi,V{ip2u))dno
LQ^I ipiudfio + / L 0 ^i foudno
= 0.
In particular, all assumptions on $B$ are satisfied if the triple $(E, H, \mu_0)$ forms an abstract Wiener space with $H_x := H$ $\forall x \in E$, and $\psi_1$, $\psi_2$ are two different eigenfunctions (which are known explicitly) for the same eigenvalue $\lambda \in \mathbb{N}$ of $L_0$, which is here just the Ornstein-Uhlenbeck operator on (abstract) Wiener space.
By (B3) the operator $(L_B, D)$ is always dissipative on $L^1(\mu)$ (cf. e.g. Lemma 1.8 in 7). So, since $D$ is dense in $L^2(\mu_0)$, hence by (4) dense in $L^1(\mu)$, it follows that $(L_B, D)$ is closable on $L^1(\mu)$ (cf. e.g. Sect. X.8 in 12). Let $\big(\overline{L}_B^{\mu,1}, D(\overline{L}_B^{\mu,1})\big)$ denote its closure on $L^1(\mu)$.
Now we can formulate the main result of this paper.
Theorem 1. (i) $\big(\overline{L}_B^{\mu,1}, D(\overline{L}_B^{\mu,1})\big)$ generates a $C_0$-semigroup (i.e., a strongly continuous operator semigroup) $(T_t^\mu)_{t \ge 0}$ on $L^1(\mu)$. Equivalently, it is the unique closed extension of $(L_B, D)$ with this property.
(ii) Each linear operator $T_t^\mu$ has norm less than or equal to one and is Markovian, i.e., $f \in L^1(\mu)$, $f \ge 0 \Rightarrow T_t^\mu f \ge 0$ and $T_t^\mu 1 = 1$.
The equivalence in Theorem 1(i) is a consequence of the Hille-Yosida theorem and a result by W. Arendt (cf. A-II, Theorem 1.33 in 5). The first part of Theorem 1(ii) is well-known to follow from the dissipativity of $(L_B, D)$ on $L^1(\mu)$ (cf. e.g. the Lumer-Phillips Theorem, e.g. in Chap. I of 11). The second part follows from Lemma 1.9 in 7, since by Lemma 1 below $1 \in D(\overline{L}_B^{\mu,1})$ and $\overline{L}_B^{\mu,1} 1 = 0$. So, it remains to prove Theorem 1(i), which will be done in the next section using the classical DuHamel formula and some results from 10. However, we need that there exists at least one closed extension which is a generator, i.e., we need the following result by W. Stannat:
Proposition 1. There exists a closed extension of $(L_B, D)$ on $L^1(\mu)$ that generates a $C_0$-semigroup on $L^1(\mu)$.
Proof. The proof is entirely analogous to that of Proposition 1.3(i) in Part II of 14. □
Theorem 1 generalizes a recent result by L. Wu (cf. 16) proved in a completely different way by probabilistic methods in the special case where $B = \frac{\nabla \varphi^2}{\varphi^2}$. We would like to emphasize here that any symmetric Dirichlet form on $L^2(\mu_0)$ admitting a square field operator $\Gamma$ is of gradient type, i.e. is of type (1). The construction of a corresponding measurable Hilbert bundle and of an appropriate gradient was carried out in detail in Subsections 1) and 2) of Appendix D in 7. So, indeed the case in 16, where the starting point was a Dirichlet form possessing a square field operator, is fully covered by Theorem 1. Finally we would like to refer to 13 for a recent survey of results related to $L^1$- (or even $L^p$-)uniqueness.
It is a pleasure to thank Liming Wu for a discussion of his new result and of a very rough draft of this paper during the 26th Conference on Stochastic Processes and their Applications of the Bernoulli Society in Beijing in June 1999.
2
Proof of Theorem 1(i)
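We adopt the notation of the previous section. We need two lemmas.
Lemma 1. We have $D(L_0) \subset D(\overline{L}_B^{\mu,1})$ and
$$\overline{L}_B^{\mu,1} u = L_0 u + (B, \nabla u) \quad \forall u \in D(L_0).$$
Proof. Let $u \in D(L_0)$; then there exist $u_n \in D$ such that $u_n \to u$ and $L_0 u_n \to L_0 u$ in $L^2(\mu_0)$, hence in $L^1(\mu)$, as $n \to \infty$. But since convergence in graph norm of a generator of a Dirichlet form implies convergence in Dirichlet norm, we also have
$$|\nabla u_n - \nabla u| \xrightarrow[n \to \infty]{} 0 \quad \text{in } L^2(\mu_0).$$
Hence by (B2)
$$\int |(B, \nabla u - \nabla u_n)|\, d\mu \le \Big(\int |B|^2 \varphi^4\, d\mu_0\Big)^{1/2} \Big(\int |\nabla u - \nabla u_n|^2\, d\mu_0\Big)^{1/2} \xrightarrow[n \to \infty]{} 0,$$
and the assertion follows. □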
Lemma 2. Suppose $|B| \in L^\infty(\mu_0)$. Define
$$\mathcal{E}_B(u,v) := \int (\nabla u, \nabla v)\, d\mu_0 + \int (B, \nabla u)\, v\, d\mu_0, \qquad u, v \in D(\mathcal{E}_B) := D(\mathcal{E}).$$
Then there exists $\alpha > 0$ such that $(\mathcal{E}_{B,\alpha}, D(\mathcal{E}))$ is a (non-symmetric) Dirichlet form on $L^2(\mu_0)$ (where $\mathcal{E}_{B,\alpha} := \mathcal{E}_B + \alpha(\cdot,\cdot)_{L^2(\mu_0)}$). Furthermore, if the corresponding ($L^2(\mu_0)$-)generator is denoted by $\big(\overline{L}_B^{\mu_0}, D(\overline{L}_B^{\mu_0})\big)$, then $D(\overline{L}_B^{\mu_0}) = D(L_0)$ and
$$\overline{L}_B^{\mu_0} u = L_0 u + (B, \nabla u) \quad \forall u \in D(\overline{L}_B^{\mu_0}) = D(L_0).$$
Proof. The same arguments as in the proof of Proposition 4.7(i) in 4 yield the assertion. □
So, to prove Theorem 1(i) let $(T_t)_{t \ge 0}$ be any $C_0$-semigroup on $L^1(\mu)$ with generator $(L, D(L))$ extending $(L_B, D)$, hence by its closedness also extending $\big(\overline{L}_B^{\mu,1}, D(\overline{L}_B^{\mu,1})\big)$. According to the discussion in the preceding section, we have to show that $(T_t)_{t \ge 0}$ is the only $C_0$-semigroup on $L^1(\mu)$ with this property. This will be done using a standard method (see e.g. 9, 3 or 8) based on DuHamel's formula, which works here because of Lemma 2 and results in 10.
To this end let $B_n$, $n \in \mathbb{N}$, be measurable sections in $(H_x)_{x \in E}$ such that $|B_n| \in L^\infty(\mu)$ and $|B - B_n| \to 0$ in $L^2(\mu)$ as $n \to \infty$. These exist since $|B| \in L^2(\mu)$ by (B2). Applying Lemma 2 to $B_n$, it follows by the general results in Chap. I, Sect. 2, and Chap. II, Sect. 3e) of 10 that each $\big(\overline{L}_{B_n}^{\mu_0}, D(\overline{L}_{B_n}^{\mu_0})\big)$ generates a Markovian $C_0$-semigroup $(T_t^{(n)})_{t \ge 0}$ in $L^2(\mu_0)$, which is analytic. In particular, for all $t > 0$
$$T_t^{(n)}(L^2(\mu_0)) \subset D(\overline{L}_{B_n}^{\mu_0}) = D(L_0) \subset D(\overline{L}_B^{\mu,1}) \subset D(L) \qquad (7)$$
where we used both Lemma 1 and Lemma 2. Therefore, since by (4) for all
It-J^—JLBZU
in L1^),
(8)
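we can apply DuHamel's formula on $L^1(\mu)$ to obtain that for all $t > 0$ and all $u \in D$
$$T_t u - T_t^{(n)} u = \int_0^t T_{t-s}\big(L - \overline{L}_{B_n}^{\mu_0}\big) T_s^{(n)} u\, ds = \int_0^t T_{t-s}\big(\overline{L}_B^{\mu,1} - \overline{L}_{B_n}^{\mu_0}\big) T_s^{(n)} u\, ds = \int_0^t T_{t-s}\big(B - B_n, \nabla T_s^{(n)} u\big)\, ds,$$
where besides (7) we used that $(L, D(L))$ extends $\big(\overline{L}_B^{\mu,1}, D(\overline{L}_B^{\mu,1})\big)$ and Lemmas 1 and 2. Since $(T_t)_{t \ge 0}$ is strongly continuous on $L^1(\mu)$, it follows that for some $c_1, c_2 \in (0, \infty)$
$$\|T_t u - T_t^{(n)} u\|_{L^1(\mu)} \le c_1 e^{c_2 t} \int_0^t \big\| |B - B_n| \big\|_{L^2(\mu)}\, \big\| |\nabla T_s^{(n)} u| \big\|_{L^2(\mu)}\, ds. \qquad (9)$$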
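The DuHamel identity used above has an elementary finite-dimensional counterpart, which can be verified numerically. The sketch below is only an analogy under stated assumptions: the generators $L$ and $\overline{L}_{B_n}^{\mu_0}$ are replaced by two hypothetical $3 \times 3$ Markov generator matrices `L` and `Ln`, the semigroups by matrix exponentials, and the integral by a composite Simpson rule. The helper `expm` is a simple scaling-and-squaring Taylor approximation written here for self-containment, not a library routine.

```python
import numpy as np

def expm(M, terms=30):
    """Matrix exponential via scaling and squaring with a Taylor series."""
    norm = np.linalg.norm(M, 1)
    k = int(np.ceil(np.log2(norm))) + 1 if norm > 1 else 0
    A = M / (2 ** k)                       # scaled matrix with small norm
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for j in range(1, terms):
        term = term @ A / j                # A^j / j!
        E = E + term
    for _ in range(k):                     # undo the scaling by squaring
        E = E @ E
    return E

def duhamel_rhs(L, Ln, t, steps=200):
    """Simpson approximation of int_0^t e^{(t-s)L} (L - Ln) e^{s Ln} ds."""
    s_grid = np.linspace(0.0, t, steps + 1)
    h = t / steps
    total = np.zeros_like(L)
    for i, s in enumerate(s_grid):
        weight = 1.0 if i in (0, steps) else (4.0 if i % 2 else 2.0)
        total += weight * expm((t - s) * L) @ (L - Ln) @ expm(s * Ln)
    return total * h / 3.0

# Hypothetical generators: off-diagonal entries >= 0, rows summing to zero.
L = np.array([[-1.0, 0.7, 0.3], [0.2, -0.5, 0.3], [0.4, 0.6, -1.0]])
Ln = np.array([[-0.9, 0.6, 0.3], [0.25, -0.5, 0.25], [0.4, 0.5, -0.9]])
t = 1.5

lhs = expm(t * L) - expm(t * Ln)           # T_t - T_t^{(n)}
rhs = duhamel_rhs(L, Ln, t)                # DuHamel integral
```

The two sides agree up to the quadrature error, and each `expm(t * L)` has rows summing to one, the matrix analogue of $T_t 1 = 1$.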
Now we proceed in exactly the same way as on p. 342 in 2. Indeed, we shall show that
Of course, (10) implies that the left hand side of (9) converges to zero as n -> oo. Hence Ttu is uniquely determined for all u G D, thus for all u G Ll{/i), and the uniqueness of (Tt)t>o follows. To prove (10) we observe that by (8), Lemma 2, and the product rule for V d_y\T^u\2d^=2JL^T^uT^udn ds
j{B-^-MTln)u?)d»
+
+ 2 f{Bn -
B,WT^u)T^udfi
= - 2 |(VTJ n > U , V ^ T f M ) dfM> 2 + 2: [(VTWu,V>p )T^udiM)
+ 2: J(Bn-
B,VTMv)TWudp
= - 2 /"(VTJn)u, VT,(n)u) d/x +
2J(Bn-B,VT^u)T^udfi,
where we used (B3) and the fact that for all i>i G D(L0),v2 G D(£) / Lo^i^d/io = -£{vi,V2) = -
{Vvi,Vv2)dfio.
Hence 2|||VT|")u|||^ (M)
< -±\\TSMW)
+
l|r.(n)«lll-(rt 111^. - ailku., + IH^-MIll^
which yields l||VT<»>«||ll.<,> < -l||T(-)u|| a 1 1 ( M ) + H l ^ l l l * . -
B\f„w
Integrating over s, we get
/■ f |||V2-i"> u ||[J 2(M) ^ < n^iii2(/J) -t-*n^ili~ (/i) |||s«
-S|||J2(M),
J 0
which in turn, immediately implies (10). □ Remark 3. Similarly, under corresponding assumptions on B and (fi, one can prove Lp(fi)-uniqueness for all p 6 [1,2). Acknowledgments We would like to thank the organizers for a very nice conference in Lisbon in honour of Ludwig Streit's 60th birthday. Financial support of the German Science Foundation (DFG) through the Sonderforschungsbereich 343 and the EU-TMR Project, No. ERB-FMRX-CT96-0075 is gratefully acknowledged. References 1. S. Albeverio, M. Rockner: Dirichlet forms on topological vector spaces— closability and a Cameron-Martin formula. J. Funct. Anal. 88, 395 - 436 (1990) 2. S. Albeverio, V.I. Bogachev, M. Rockner: On uniqueness of invariant measures for finite and infinite dimensional diffusions. Commun. Pure and Appl. Math. 52, 325-362 (1999) 3. S. Albeverio, Y.G. Kondratiev, M. Rockner: Addendum to: An approxi mate criterium of essential self-adjointness of Dirichlet operators. Poten tial Analysis 2, 195-198 (1993). 4. S. Albeverio, M. Rockner: Dirichlet form methods for uniqueness of martingale problems and applications. In: Stochastic Analysis. Pro ceedings of Symposia in Pure Mathematics. Vol. 57, 513-528. Editors: M.C. Cranston, M.A. Pinsky. Am. Math. Soc.:Providence, Rhode Island 1995. 5. W. Arendt: The abstract Cauchy problem, special semigroups and per turbation. In: One-parameter semigroups of positive operators, Edited by R. Nagel. Berlin: Springer 1986 6. J. Dixmier: Les algebres d'operateurs dans l'espace hilbertien. Paris: Gauthier-Villars 1969 7. A. Eberle: Uniqueness and non-uniqueness of singular diffusion operators. Doctor-Thesis, Bielefeled 1997, SFB-343-Preprint (1998), 291 pages, to appear as Springer Lecture Notes in Math.
8. V. Liskevich, M. Röckner: Strong uniqueness for a class of infinite dimensional Dirichlet operators and applications to stochastic quantization. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 27, no. 1, 69-91 (1998)
9. V.A. Liskevich, Y.A. Semenov: Dirichlet operators: A priori estimates and the uniqueness problem. J. Funct. Anal. 109, 199-213 (1992)
10. Z.M. Ma, M. Röckner: An introduction to the theory of (non-symmetric) Dirichlet forms. Berlin: Springer 1992
11. A. Pazy: Semigroups of linear operators and applications to partial differential equations. Berlin: Springer 1992
12. M. Reed, B. Simon: Methods of modern mathematical physics II. Fourier Analysis. New York - San Francisco - London: Academic Press 1975
13. M. Röckner: $L^p$-Analysis of Finite and Infinite Dimensional Diffusion Operators, SFB-343-Preprint (1998). To appear in C.I.M.E.-Summer School "Kolmogorov Equations in infinite dimensions". Lecture Notes in Math., Berlin: Springer
14. W. Stannat: (Nonsymmetric) Dirichlet operators on $L^1$: existence, uniqueness and associated Markov processes. Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4) 28, no. 1, 99-140 (1999)
15. M. Takesaki: Theory of operator algebras I. Berlin: Springer 1979
16. L. Wu: Uniqueness of Nelson's diffusions (II): infinite dimensional setting and applications. To appear in: Potential Analysis (in the end of 1999)
A GLAUBER DYNAMICS FOR THE SHERRINGTON-KIRKPATRICK AND HOPFIELD MODELS

E. SCACCIATELLI
Dipartimento di Matematica, Università "La Sapienza", P.le A. Moro 5, 00185 Rome, Italy
scacciatelli@axcasp.caspur.it

A stochastic Glauber dynamics is introduced in the two models, in the high temperature region. This dynamics drives the systems towards the equilibrium state. An estimate is given for the spectral gap of the generator.

1 The models

Both the Sherrington-Kirkpatrick and the Hopfield model are stochastic models of spins
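$$H^{(\mathrm{SK})}_N(\sigma) = -\frac{1}{\sqrt N}\sum_{1\le i<j\le N} J_{ij}\,\sigma_i\sigma_j, \qquad H^{(\mathrm{H})}_N(\sigma) = -\frac{1}{N}\sum_{\mu=1}^{p}\sum_{1\le i<j\le N}\xi^\mu_i\xi^\mu_j\,\sigma_i\sigma_j. \tag{1.1}$$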
In $(1.1)_1$ the $\{J_{ij}\}$ are i.i.d. random variables with zero mean and common variance $J^2$, and are symmetrically distributed. In $(1.1)_2$ the patterns $\xi^\mu$ are $p$ random vectors, $p = p(N)$ (usually $p(N) = \alpha N$), with $N$ components. The components of all the patterns are i.i.d. random variables which take values $\pm 1$ and are symmetrically distributed. Let us denote by
At this value of the parameters, as in the previous case, the variance of the fluctuation field of the free energy is diverging [2]. We shall be interested in the "ergodic" regions. In a previous paper it was proved that for any cylindrical function f depending on the spin variables 0,N = ^ . r i i i ^ , , , ^ ] . .. x . (H)_ Eg./(*A)«p[& E ; , T E i < 3
tfejWj]
(1.2)
^A = IligA °i converge almost surely, for 0J < 1, ^ £ < 1 respectively, to the trivial mean M/(aA)) = 2-lAl£/(
In [3] the speed of such convergence was determined. Namely, looking at the "rescaled" random variables <^a
A
>(
s
-*\
(1.3)
they converge a.s. to a joint gaussian distribution with variance
« [VN < on >%$? »=^
$£j,
(1.4)
respectively. Let us give an example of the limiting structure of such variables looking at the thermal expectations (in both models) of <7j
(1.5)
The random variables (1.5) appear, up a ^ correction, as the summation over all the "simple strings" (fig (1)) with fixed end points i j , where each edge (h,k), contains a factor th[-4-] Jhk
for
$SSfc»tf*
for the H model
the
S- K
model
-
In the second case two "near" edges have to have a different "frequency" $\mu$.
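The "simple strings" picture can be checked by hand on a very small system. The sketch below is only an illustration for the S-K case: it assumes the Gibbs weight $\exp[\frac{\beta}{\sqrt N}\sum_{i<j}J_{ij}\sigma_i\sigma_j]$ and the edge factor $\tanh(\beta J_{hk}/\sqrt N)$, and the couplings `J` and the value of `beta` are hypothetical. On three spins the two simple strings joining sites 0 and 1 are the edge 0-1 and the path 0-2-1, and the only correction comes from the single closed loop.

```python
import itertools
import math


def thermal_pair_expectation(J, beta, i, j):
    """Brute-force Gibbs average of sigma_i * sigma_j.

    Assumed weight: exp(beta / sqrt(N) * sum_{a<b} J[a][b] * s_a * s_b).
    """
    n = len(J)
    num = 0.0
    den = 0.0
    for sigma in itertools.product((-1, 1), repeat=n):
        pair_sum = sum(J[a][b] * sigma[a] * sigma[b]
                       for a in range(n) for b in range(a + 1, n))
        w = math.exp(beta * pair_sum / math.sqrt(n))
        num += sigma[i] * sigma[j] * w
        den += w
    return num / den


def string_sum_triangle(J, beta):
    """<sigma_0 sigma_1> on three spins from the strings 0-1 and 0-2-1."""
    n = 3

    def t(a, b):
        # Edge factor (assumed form): tanh(beta * J_ab / sqrt(N)).
        return math.tanh(beta * J[a][b] / math.sqrt(n))

    strings = t(0, 1) + t(0, 2) * t(2, 1)   # the two simple strings
    loop = t(0, 1) * t(0, 2) * t(2, 1)      # the only closed loop
    return strings / (1.0 + loop)


# Hypothetical symmetric couplings, for illustration only.
J = [[0.0, 0.9, -0.4],
     [0.9, 0.0, 0.7],
     [-0.4, 0.7, 0.0]]
beta = 0.6
exact = thermal_pair_expectation(J, beta, 0, 1)
expanded = string_sum_triangle(J, beta)
print(exact, expanded)
```

For larger $N$ the loop corrections are no longer a single term, which is why the text speaks of the string sum only up to a correction.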
Fig. 1. The thermal expectation $\langle\sigma_1\sigma_2\sigma_3\sigma_4\rangle$ appears as the summation of products of two simple strings with end points chosen among the vertices 1-4.

2 Doubly stochastic dynamics
We want now to define a (stochastic) Glauber dynamics over our stochastic model. We speak essentially about the S-K model. Looking at the Gibbs measures which are well defined for any realization {J} and finite N (but in the "ergodic" region also for N ->• oo as we saw), the "natural" operator we can choose to define our dynamics is 4N) =EH=iCh(g:,{J))Ah where ch(a, {J}) = l+e^Jau,iJH
(21)
over L2(fi0), Ahf{a) = / ( a w ) - /(a) being the usual spin-flip operator. We immediately get, L N)
0
= I 2 > + th(~m 1
h=i
V i V
E JH
(2-2)
&h
It is immediately seen that the operator $L^{(N)}$ is not a symmetric one with respect to the measure $\mu_0$, while, with respect to the "Gibbs measure"
(2 3)
* n - > = ^ «™J
-
it is a symmetric and non positive one, by the detailed balance principle. But we have in mind to use the measure $\mu_0$ as the initial one for our "process". Then, for instance, we have Mo (l
• it-paw)
= -i*[(th-fa Ek^i J*Oh)°i + {thjft E
W
JjVrM (2.4)
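The symmetry statement can be made concrete on a small system. The following sketch is an illustration under stated assumptions, not the operator of the text verbatim: it takes the Gibbs weight $\exp[\frac{\beta}{\sqrt N}\sum_{i<j}J_{ij}\sigma_i\sigma_j]$ and the heat-bath flip rate $c_h(\sigma) = 1/(1+e^{2\beta\sigma_h m_h})$ with local field $m_h = N^{-1/2}\sum_{k\ne h}J_{hk}\sigma_k$, a sign convention that need not coincide with the one in (2.1). The couplings `J` and `BETA` are hypothetical. It checks that detailed balance holds for the Gibbs weight and fails for the uniform measure $\mu_0$.

```python
import itertools
import math


def local_field(J, sigma, h):
    """m_h = (1/sqrt(N)) * sum_{k != h} J[h][k] * sigma[k]."""
    n = len(sigma)
    return sum(J[h][k] * sigma[k] for k in range(n) if k != h) / math.sqrt(n)


def flip_rate(J, sigma, h, beta):
    """Heat-bath rate for flipping spin h (assumed convention)."""
    return 1.0 / (1.0 + math.exp(2.0 * beta * sigma[h] * local_field(J, sigma, h)))


def gibbs_weight(J, sigma, beta):
    """Unnormalised Gibbs weight exp(beta/sqrt(N) * sum_{i<j} J_ij s_i s_j)."""
    n = len(sigma)
    pair_sum = sum(J[i][j] * sigma[i] * sigma[j]
                   for i in range(n) for j in range(i + 1, n))
    return math.exp(beta * pair_sum / math.sqrt(n))


def flipped(sigma, h):
    """Configuration with the spin at site h reversed."""
    return tuple(-s if k == h else s for k, s in enumerate(sigma))


def detailed_balance_defect(J, beta, weight):
    """max over (sigma, h) of |w(sigma) c_h(sigma) - w(sigma^h) c_h(sigma^h)|."""
    n = len(J)
    worst = 0.0
    for sigma in itertools.product((-1, 1), repeat=n):
        for h in range(n):
            tau = flipped(sigma, h)
            forward = weight(sigma) * flip_rate(J, sigma, h, beta)
            backward = weight(tau) * flip_rate(J, tau, h, beta)
            worst = max(worst, abs(forward - backward))
    return worst


# Hypothetical symmetric couplings, for illustration only.
J = [[0.0, 0.8, -0.5],
     [0.8, 0.0, 1.1],
     [-0.5, 1.1, 0.0]]
BETA = 0.7
gibbs_defect = detailed_balance_defect(J, BETA, lambda s: gibbs_weight(J, s, BETA))
uniform_defect = detailed_balance_defect(J, BETA, lambda s: 1.0)
print(gibbs_defect, uniform_defect)
```

With this convention the rate can also be written as $\frac12[1 - \sigma_h\,\mathrm{th}(\beta m_h)]$, which is the form in which the hyperbolic tangent enters the generator.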
Moreover we pay for our mean field theory: looking at any cylindrical function $f(\sigma_\Lambda)$, the action of the generator produces a much broader function (essentially the new function depends on all the variables): L{Pf{OK)
= I J > + th{^= J2 Jhk
(2.5)
Anyway we are not able to handle such an operator; we have in mind to study instead the operator
h
V
>#fc
that is, the operator obtained by keeping the first term of the expansion of the function th(·). Such a position should be in some sense meaningful only if the random function
is vanishingly small in the limit. Let us observe that the typical value of the spin variables {
(2.7)
where the subspaces lk{(*o) are generated by the vectors CTA
lAl
= lc
(2 8)
JO(M°) = constants
then the restrictions of l £ L°/N)aA
^
to the subspaces /*(/x0) are symmetric operators:
= [-|A|crA + ^
5 3 Jhj*\/huj}
7>€AjgA
J
~ $3 51
hjO\/{hj]
(2.9)
heAj^fcjeA
So the vector L°^ 'a\ has components in the subspaces
Z|A|(/JO)
and
*|A|-2(/*o): then for |A'| = |A|, Ho(
forA' = A =}>
=
H0{
(IOKL^VA)
= -|A|
forA' = A/ho U j 0 = > M * A ' L°^N)u\) for|A n A'I < |A| - 2 = ► MO(CTA- L°0'{N)aA)
= -Jh0j0 = 0
(2.10)
while for |A'| = |A| - 2: A = A' U {ij}, we get
y,0(<JkLf
VA<) = 0
Then we shall start with a measure with respect to which not only is our "generator" not symmetric, but even its restrictions to the "fixed particles" subspaces are, at least at first sight, not negative for each realization. Moreover, as we replace a random variable which does not exceed 1 in absolute value with one that looks at first sight "much larger" for finite N, we can expect some extra difficulties. Nevertheless we shall study the "semigroup" SN{t) = etLym
(2.12)
while, to avoid technical difficulties, we shall suppose that the random variables $J_{ij}$ are bounded.

3 Results

It is possible to prove, for $\beta J < 1$ (i.e. in the ergodic region), the following facts:
• for each cylindrical function $f(\sigma_\Lambda)$ and for each $t > 0$,
$$\lim_{N\to\infty}\mu_0(S_N(t)f(\sigma_\Lambda)) = \mu_0(f(\sigma_\Lambda)) \quad \text{a.s.} \tag{3.1}$$
• for each sufficiently large but finite $N$,
$$\lim_{t\to\infty}\mu_0(S_N(t)f(\sigma_\Lambda)) \tag{3.2}$$
exists over a set of realizations $\{J\}$ which is growing with $N$, and
$$\lim_{N\to\infty}\lim_{t\to\infty}\mu_0(S_N(t)f(\sigma_\Lambda)) = \mu_0(f(\sigma_\Lambda)) \quad \text{a.s.} \tag{3.3}$$
Moreover, for each monomialCTAwith |A| even limN->00Ho{SN(t)N[T[
(3.4)
converges a.s. to a joint Gaussian distribution similar to, but much more complex than, those described above.
4 Some spectral properties

Let us first study the one body spectral properties of our operator. The "zero body", i.e. the effect over the constants, is clearly trivial with the only eigenvalue equal to one. Let us consider the matrix elements $\mu_0(\sigma_i S_N(t)\sigma_j)$
= ZZop^i([L°0'{N)]k
(4.1)
J
= e-*[e^ ]« Then we can write, below RN(-; J) is the resolvent of the operator \-4% J^N>> with entries |-fe*Aj SN(t) = ^fdterW-VRNiXi
J)
(4.2)
where the integral has to be computed along a circuit in the complex $\lambda$-plane which includes "all the eigenvalues". As N is large but finite, the position of such eigenvalues depends on the realization. Nevertheless, for finite t we can take a circle large enough (we do not care that $|\lambda| > 1$). For the spectral density we get (the discussion could be made rigorous in an obvious way) pN(x) = 2^mfdXe~t{1~X)6(x
- *)TrRN{X\J)
(4.3)
For very large N, it is well known (Wigner law) that V5 > 0 JjTrRN(\;J)n
=
jj < TrRN(X;J) »
+o(N-U-s>)
2X{1
i^)+o(N-U-*)i
(4.4)
It follows, up a -4fi correction, P(«) = AifdXe-W-VSix
-&) - A)2A(1"^?
(4.5)
Namely we have, from the square root, a cut along the real axis between the points $(-\beta J, \beta J)$. If $\beta J < 1$, the contour surrounding the cut can avoid the point 1. Then we have a spectral gap for our operator as large as $1 - \beta J$.
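The role of the Wigner law here can be illustrated numerically. The sketch below rests on assumptions that the text does not fix: the couplings are taken as symmetric $\pm 1$ variables (so the variance parameter `J_STD` is 1), and the coupling matrix is multiplied by the hypothetical prefactor $\beta/(2\sqrt N)$, chosen only so that the semicircle support is $[-\beta J, \beta J]$ as in the cut described above. The largest eigenvalue is estimated by plain power iteration on a shifted matrix.

```python
import math
import random


def coupling_matrix(n, seed):
    """Symmetric matrix with zero diagonal and i.i.d. +-1 entries above it."""
    rng = random.Random(seed)
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = J[j][i] = rng.choice((-1.0, 1.0))
    return J


def top_eigenvalue(A, shift, iterations=200, seed=1):
    """Largest eigenvalue of symmetric A via power iteration on A + shift*I.

    The shift should be about the size of the spectral edge, so that the
    shifted matrix is (nearly) positive and its top eigenvalue dominates.
    """
    n = len(A)
    rng = random.Random(seed)
    v = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    rayleigh = 0.0
    for _ in range(iterations):
        Av = [sum(a * x for a, x in zip(row, v)) for row in A]
        rayleigh = sum(x * y for x, y in zip(v, Av))   # v is normalised
        w = [y + shift * x for x, y in zip(v, Av)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return rayleigh


N = 100
BETA, J_STD = 0.5, 1.0
PREFACTOR = BETA / (2.0 * math.sqrt(N))      # assumed normalisation
M = [[PREFACTOR * x for x in row] for row in coupling_matrix(N, seed=0)]
edge = BETA * J_STD                           # predicted end of the cut
top = top_eigenvalue(M, shift=edge)
gap = 1.0 - top                               # distance from the point 1
print(edge, top, gap)
```

For finite $N$ the top eigenvalue fluctuates around the edge, so the printed gap is only close to $1 - \beta J$, in line with the remark that the position of the eigenvalues depends on the realization.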
References

1. M. Aizenman, J.L. Lebowitz, D. Ruelle: Some Rigorous Results on the Sherrington-Kirkpatrick Spin Glass Model, Comm. Math. Phys. 112, 320, 1987.
2. B. Tirozzi, E. Scacciatelli: On the Fluctuation of the Free Energy in the Hopfield Model, J. Stat. Phys. 67, 981-1008, 1992.
3. E. Scacciatelli: On the essential uniqueness of the thermal state in the "ergodic" regions for the Sherrington-Kirkpatrick and Hopfield models. In progress.
EXTERNAL AND INTERNAL GEOMETRY ON CONFIGURATION SPACES

JOSÉ L. SILVA
CCM, University of Madeira, P-9000 Funchal, Portugal and BiBoS, Universität Bielefeld, D-33501 Bielefeld, Germany
E-mail: [email protected]
(joint work with Yu. G. Kondratiev and M. Röckner)

Dedicated to Professor Ludwig Streit on the occasion of his 60th birthday

In this talk we will show a transparent relation between the intrinsic pre-Dirichlet form $\mathcal{E}^\Gamma_\mu$ and the extrinsic one $\mathcal{E}^P_{\mu,H^X_\sigma}$ corresponding to the Gibbs measure $\mu$ on the configuration space $\Gamma_X$. This extends the result obtained in [1] (see also [2]) for the Poisson measure $\pi_\sigma$. As a consequence we prove the closability of $\mathcal{E}^\Gamma_\mu$ on $L^2(\Gamma_X,\mu)$ under very general assumptions on the interaction potential of the Gibbs measures $\mu$, see also [3].

1 Introduction

In the recent papers [2], [4], [1] and [5], analysis and geometry on configuration spaces $\Gamma_X$ over a Riemannian manifold $X$, i.e.,
$$\Gamma := \Gamma_X := \{\gamma \subset X \mid |\gamma \cap K| < \infty \text{ for any compact } K \subset X\}, \tag{1}$$
were developed. The authors realized that the Dirichlet form of the Poisson measure $\pi_\sigma$ with intensity measure $\sigma$ on $\mathcal{B}(X)$ describes the well-known equilibrium process on configuration spaces; moreover, this form is canonically associated with the introduced geometry on configuration spaces and is called the intrinsic Dirichlet form of the measure $\pi_\sigma$. On the other hand there is a well-known realization of the Hilbert space $L^2(\Gamma_X,\pi_\sigma)$ and the corresponding Fock space
$$\mathrm{Exp}L^2(X,\sigma) := \bigoplus_{n=0}^{\infty}\mathrm{Exp}_nL^2(X,\sigma), \tag{2}$$
where Exp n L 2 (X,
dient" operator in $L^2(\Gamma,\pi_\sigma)$. The differentiable structure in $L^2(\Gamma,\pi_\sigma)$ which appears in this way we consider as external. In [1] the authors showed that the intrinsic Dirichlet form $\mathcal{E}^\Gamma_{\pi_\sigma}$ of the measure $\pi_\sigma$ can be represented also in terms of the external Dirichlet form $\mathcal{E}^P_{\pi_\sigma,H^X_\sigma}$ with coefficient $H^X_\sigma$ (the Dirichlet operator associated with $\sigma$ on $X$) which uses this external differentiable structure, i.e.,
$$\mathcal{E}^\Gamma_{\pi_\sigma}(F,G) = \int_\Gamma \big(\nabla^P F(\gamma), H^X_\sigma \nabla^P G(\gamma)\big)_{L^2(\sigma)}\, d\pi_\sigma(\gamma). \tag{3}$$
If we change the Poisson measure $\pi_\sigma$ to a Gibbs measure $\mu$ on the configuration space $\Gamma$ which describes the equilibrium of interacting particle systems, the corresponding intrinsic Dirichlet form can still be used for constructing the corresponding stochastic dynamics (cf. [5]) or for constructing a quantum infinite particle Hamiltonian in models of quantum field theory, see [7]. The aim of this talk is to show that even for the interacting case there is a transparent relation between the intrinsic Dirichlet form and the extrinsic one, see Theorem 5.1. The proof is based on the Nguyen-Zessin characterization of Gibbs measures (cf. [8] or Proposition 5.2 below) which on a heuristic level can be considered as a consequence of the Mecke identity (cf. [9]). As a consequence of the mentioned relation we prove the closability of the pre-Dirichlet form $(\mathcal{E}^\Gamma_\mu, \mathcal{F}C_b^\infty(\mathcal{D},\Gamma))$ on $L^2(\Gamma_X,\mu)$, where $\mu$ is a tempered grand canonical Gibbs measure, see Section 2 for this notion. We would like to emphasize that we achieve this result under a general condition (see (35) below) on the potential $\Phi$ which is not covered by condition (A.6) in [10]. Finally we mention that the closability of the Dirichlet form $\mathcal{E}^\Gamma_\mu$ is crucial for physical reasons (see [7]) and for applying the general theory of Dirichlet forms, including the construction of a corresponding diffusion process (cf. [11]) which models an infinite particle system with (possibly) very singular interactions (cf. [5]).

2 Framework
In this section we describe some facts about probability measures on configuration spaces which are necessary later on. Let $X$ be a connected, oriented $C^\infty$ (non-compact) Riemannian manifold. For each point $x \in X$, the tangent space to $X$ at $x$ will be denoted by $T_xX$, and the tangent bundle will be denoted by $TX = \cup_{x\in X}T_xX$. The Riemannian metric on $X$ associates to each point $x \in X$ an inner product on $T_xX$ which we denote by $\langle\cdot,\cdot\rangle_{T_xX}$, and the associated norm will be denoted by $|\cdot|_{T_xX}$. Let $m$ denote the volume element. $\mathcal{O}(X)$ is defined as the family of all open subsets of $X$ and $\mathcal{B}(X)$ denotes the corresponding Borel $\sigma$-algebra. $\mathcal{O}_c(X)$ and $\mathcal{B}_c(X)$ denote the systems of all elements in $\mathcal{O}(X)$, $\mathcal{B}(X)$ respectively, which have compact closures. Let $\Gamma := \Gamma_X$ be the set of all locally finite subsets in $X$:
$$\Gamma_X := \{\gamma \subset X \mid |\gamma \cap K| < \infty \text{ for any compact } K \subset X\}. \tag{4}$$
We will identify $\gamma$ with the positive integer-valued measure $\sum_{x\in\gamma}\varepsilon_x$. Then for any $\varphi \in C_0(X)$ we have a functional $\Gamma \ni \gamma \mapsto \langle\varphi,\gamma\rangle = \sum_{x\in\gamma}\varphi(x) \in \mathbb{R}$. Here $C_0(X)$ is the set of all real-valued continuous functions on $X$ with compact support. The space $\Gamma$ is endowed with the vague topology. Let $\mathcal{B}(\Gamma)$ denote the corresponding Borel
$$= \exp\Big(\int_X (e^{\varphi(x)} - 1)\, d\sigma(x)\Big), \quad \varphi \in C_0(X), \tag{5}$$
see e.g.,1 12 13 . Let us mention that if p e Ll{X, m), then we have a finite intensity measure
EXh) := { [
(6) +00
otherwise,
where the sum of the empty set is defined to be zero.
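The exponential formula (5) for the Laplace transform of the Poisson measure can be checked on a discretized toy model. The sketch below is an illustration under assumptions that go beyond the text: $X$ is replaced by finitely many cells, a configuration by the vector of cell counts (independent Poisson variables), and the cell intensities `intensity` and the values `phi` of the test function are hypothetical. The left-hand side is summed directly over count vectors and compared with $\exp(\sum_x \sigma_x (e^{\varphi_x} - 1))$.

```python
import itertools
import math


def poisson_pmf(lam, kmax):
    """Probabilities P(N = k), k = 0..kmax, for N ~ Poisson(lam)."""
    p = [math.exp(-lam)]
    for k in range(1, kmax + 1):
        p.append(p[-1] * lam / k)
    return p


def laplace_lhs(intensity, phi, kmax=40):
    """E exp(sum_x phi[x] * N_x), summed directly over all count vectors.

    The sum is truncated at kmax points per cell; for the small
    intensities used here the neglected tail is far below double precision.
    """
    pmfs = [poisson_pmf(lam, kmax) for lam in intensity]
    total = 0.0
    for counts in itertools.product(range(kmax + 1), repeat=len(intensity)):
        prob = math.prod(pmf[k] for pmf, k in zip(pmfs, counts))
        total += prob * math.exp(sum(f * k for f, k in zip(phi, counts)))
    return total


def laplace_rhs(intensity, phi):
    """exp( sum_x sigma_x * (e^{phi_x} - 1) ), the discrete analogue of (5)."""
    return math.exp(sum(lam * (math.exp(f) - 1.0) for lam, f in zip(intensity, phi)))


# Hypothetical two-cell example.
intensity = [1.5, 0.7]
phi = [0.3, -1.2]
lhs = laplace_lhs(intensity, phi)
rhs = laplace_rhs(intensity, phi)
print(lhs, rhs)
```

In the continuum setting the cells shrink to points and configurations become simple (no multiple points), but the product structure behind (5) is the same.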
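Later on we will use conditional energies which satisfy an additional assumption, namely, the stability condition, i.e., there exists $B > 0$ such that for any $\Lambda \in \mathcal{B}_c(X)$ and for all $\gamma \in \Gamma_\Lambda$,
$$E^\Phi_\Lambda(\gamma) \ge -B|\gamma|.$$
Definition 2.1 For any $\Lambda \in \mathcal{O}_c(X)$ define for $\gamma \in \Gamma$ the measure $\Pi^{\sigma,\Phi}_\Lambda(\gamma,\cdot)$ by
$$\Pi^{\sigma,\Phi}_\Lambda(\gamma,\Delta) := 1_{\{Z^{\sigma,\Phi}_\Lambda < \infty\}}(\gamma)\,[Z^{\sigma,\Phi}_\Lambda(\gamma)]^{-1}\int_\Gamma 1_\Delta(\gamma_{X\setminus\Lambda} + \gamma'_\Lambda)\exp[-E^\Phi_\Lambda(\gamma_{X\setminus\Lambda} + \gamma'_\Lambda)]\, d\pi_\sigma(\gamma'), \quad \Delta \in \mathcal{B}(\Gamma), \tag{7}$$
where
$$Z^{\sigma,\Phi}_\Lambda(\gamma) := \int_\Gamma \exp[-E^\Phi_\Lambda(\gamma_{X\setminus\Lambda} + \gamma'_\Lambda)]\, d\pi_\sigma(\gamma'). \tag{8}$$
A probability measure $\mu$ on $(\Gamma,\mathcal{B}(\Gamma))$ is called a grand canonical Gibbs measure with interaction potential $\Phi$ if for all $\Lambda \in \mathcal{O}_c(X)$
$$\mu\,\Pi^{\sigma,\Phi}_\Lambda = \mu. \tag{9}$$
Let $\mathcal{G}_{gc}(\sigma,\Phi)$ denote the set of all such probability measures $\mu$. The equations (9) are called Dobrushin-Lanford-Ruelle (DLR) equations.

3 Intrinsic geometry on Poisson space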
Intrinsic geometry on Poisson space
We recall some results to be used below from 1 4 to which we refer for the corresponding proofs and more details. A homeomorphism tp : X —► X defines a transformation of V by ^(7) = {ip{x)\x € 7}. Any vector field v € Vo(X) (i.e., the set of all smooth vector fields on X with compact support) defines a one-parameter group ip", t € R, of diffeomorphisms on X. Definition 3.1 For F : T —> R we define the directional derivative along the vector field v as (provided the right hand side exists) ( V j F ) ( 7 ) := ! i W ( 7 ) ) l * = o . This definition applies to F in the following class !FC^°(T>, T) of so-called smooth cylinder functions. Let V := C™{X) (the set of all smooth functions on X with compact support). We define FC^iV, T) as the set of all functions on r of the form F{l) = 9F((l, ¥>I), • • ■, (7,
(10)
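where $\varphi_1,\dots,\varphi_N \in \mathcal{D}$ and $g_F$ is from $C_b^\infty(\mathbb{R}^N)$. Clearly, $\mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$ is dense in $L^2(\pi_\sigma) := L^2(\Gamma,\pi_\sigma)$. For any $F \in \mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$ we have
$$(\nabla^\Gamma_v F)(\gamma) = \sum_{i=1}^N \frac{\partial g_F}{\partial s_i}(\langle\varphi_1,\gamma\rangle,\dots,\langle\varphi_N,\gamma\rangle)\,\langle\nabla^X_v\varphi_i,\gamma\rangle, \tag{11}$$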
where x >—> (Vx
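$$\int_\Gamma \nabla^\Gamma_v F\, G\, d\pi_\sigma = -\int_\Gamma F\, \nabla^\Gamma_v G\, d\pi_\sigma - \int_\Gamma F G\, B^{\pi_\sigma}_v\, d\pi_\sigma, \tag{13}$$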
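or $(\nabla^\Gamma_v)^* = -\nabla^\Gamma_v - B^{\pi_\sigma}_v$, as an operator equality on the domain $\mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$ in $L^2(\pi_\sigma)$. Definition 3.4 We introduce the tangent space $T_\gamma\Gamma$ to the configuration space $\Gamma$ at the point $\gamma \in \Gamma$ as the Hilbert space of $\gamma$-square-integrable sections (measurable vector fields) $V : X \to TX$ with scalar product $\langle V^1,V^2\rangle_{T_\gamma\Gamma} = \int_X \langle V^1(x),V^2(x)\rangle_{T_xX}\, d\gamma(x)$, $V^1,V^2 \in T_\gamma\Gamma = L^2(X \to TX;\gamma)$. The corresponding tangent bundle is denoted by $T\Gamma$. The intrinsic gradient of a function $F \in \mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$ is a mapping $\Gamma \ni \gamma \mapsto (\nabla^\Gamma F)(\gamma) \in T_\gamma\Gamma$ such that $(\nabla^\Gamma_v F)(\gamma) = \langle\nabla^\Gamma F(\gamma),v\rangle_{T_\gamma\Gamma}$ for any $v \in V_0(X)$. Furthermore, by (11), if $F$ is given by (10), we have for $\gamma \in \Gamma$, $x \in X$
$$(\nabla^\Gamma F)(\gamma;x) = \sum_{i=1}^N \frac{\partial g_F}{\partial s_i}(\langle\varphi_1,\gamma\rangle,\dots,\langle\varphi_N,\gamma\rangle)\,\nabla^X\varphi_i(x). \tag{14}$$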
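Definition 3.5 For a measurable vector field $V : \Gamma \to T\Gamma$ the divergence $\mathrm{div}^\Gamma_{\pi_\sigma}V$ is defined via the duality relation for all $F \in \mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$ by
$$\int_\Gamma \langle V_\gamma, \nabla^\Gamma F(\gamma)\rangle_{T_\gamma\Gamma}\, d\pi_\sigma(\gamma) = -\int_\Gamma F(\gamma)\,(\mathrm{div}^\Gamma_{\pi_\sigma}V)(\gamma)\, d\pi_\sigma(\gamma), \tag{15}$$
provided it exists (i.e., provided $F \mapsto \int_\Gamma \langle V_\gamma,\nabla^\Gamma F(\gamma)\rangle_{T_\gamma\Gamma}\, d\pi_\sigma(\gamma)$ is continuous on $L^2(\pi_\sigma)$). Proposition 3.6 For any vector field $V = Gv$, where $G \in \mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$, $v \in V_0(X)$, we have
$$(\mathrm{div}^\Gamma_{\pi_\sigma}V)(\gamma) = \langle(\nabla^\Gamma G)(\gamma),v\rangle_{T_\gamma\Gamma} + G(\gamma)\,B^{\pi_\sigma}_v(\gamma). \tag{16}$$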
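For any $F,G \in \mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$ we introduce the pre-Dirichlet form which is generated by the intrinsic gradient $\nabla^\Gamma$ as
$$\mathcal{E}^\Gamma_{\pi_\sigma}(F,G) = \int_\Gamma \langle(\nabla^\Gamma F)(\gamma),(\nabla^\Gamma G)(\gamma)\rangle_{T_\gamma\Gamma}\, d\pi_\sigma(\gamma). \tag{17}$$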
We will also need the classical pre-Dirichlet form for the intensity measure $\sigma$ which is given as $\mathcal{E}^X_\sigma(\varphi,\psi) = \int_X \langle\nabla^X\varphi,\nabla^X\psi\rangle_{T_xX}\, d\sigma$ for any
$$(H^\Gamma_{\pi_\sigma}F)(\gamma) = -\Delta^\Gamma F(\gamma) - \langle\mathrm{div}^X_\sigma(\nabla^\Gamma F)(\gamma,\cdot),\gamma\rangle. \tag{18}$$
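Theorem 3.7 The operator $H^\Gamma_{\pi_\sigma}$ is associated with the intrinsic Dirichlet form $\mathcal{E}^\Gamma_{\pi_\sigma}$, i.e.,
$$\mathcal{E}^\Gamma_{\pi_\sigma}(F,G) = (H^\Gamma_{\pi_\sigma}F,G)_{L^2(\pi_\sigma)}, \tag{19}$$
or $H^\Gamma_{\pi_\sigma} = -\mathrm{div}^\Gamma_{\pi_\sigma}\nabla^\Gamma$ on $\mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$. We call $H^\Gamma_{\pi_\sigma}$ the intrinsic Dirichlet operator of the measure $\pi_\sigma$.

4 Extrinsic geometry on Poisson space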
We recall the extrinsic geometry on $L^2(\pi_\sigma)$ based on the isomorphism with the Fock space. Our approach is based on [18], but we should also mention [19], [20], [6], [21], [22] for related considerations and references therein. For proofs of the results stated below in this section, we refer to [1]. Let us define another "gradient" on functions $F : \Gamma \to \mathbb{R}$. This gradient $\nabla^P$ has specific useful properties on Poissonian spaces. We will call $\nabla^P$ the Poissonian gradient. To this end, as the tangent space to $\Gamma$ at any point $\gamma \in \Gamma$ we consider the same space $L^2(\sigma)$ and define a mapping $\mathcal{F}C_b^\infty(\mathcal{D},\Gamma) \ni F \mapsto \nabla^P F \in L^2(\pi_\sigma)\otimes L^2(\sigma)$ by
$$(\nabla^P F)(\gamma,x) := F(\gamma + \varepsilon_x) - F(\gamma), \quad \gamma \in \Gamma,\ x \in X. \tag{20}$$
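We stress that the transformation $\Gamma \ni \gamma \mapsto \gamma + \varepsilon_x \in \Gamma$ is $\pi_\sigma$-a.e. well-defined because $\pi_\sigma(\{\gamma \in \Gamma \mid x \in \gamma\}) = 0$ for any $x \in X$. The directional derivative is then defined as
$$(\nabla^P_\varphi F)(\gamma) = \int_X [F(\gamma + \varepsilon_x) - F(\gamma)]\,\varphi(x)\, d\sigma(x), \tag{21}$$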
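The Poissonian gradient $\nabla^P$ yields an orthogonal system of Charlier polynomials on the Poisson space $(\Gamma,\mathcal{B}(\Gamma),\pi_\sigma)$. For any $n \in \mathbb{N}$ and all $\varphi \in \mathcal{D}$ we introduce a function in $L^2(\pi_\sigma)$ by
$$Q^{\pi_\sigma}_n(\gamma;\varphi^{\otimes n}) := ((\nabla^{P*}_\varphi)^n 1)(\gamma), \tag{22}$$
and define $Q^{\pi_\sigma}_0 := 1$. Due to the kernel theorem [17] these functions have the representation $Q^{\pi_\sigma}_n(\gamma;\varphi^{\otimes n}) = \langle Q^{\pi_\sigma}_n(\gamma),\varphi^{\otimes n}\rangle$, with generalized symmetric kernels $\Gamma \ni \gamma \mapsto Q^{\pi_\sigma}_n(\gamma) \in \mathrm{Exp}_n\mathcal{D}'$, $n \in \mathbb{N}$. Here and below by $\mathrm{Exp}_nE$ we denote the $n$-th symmetric tensor power of a linear space $E$. Then for any smooth kernel $\varphi^{(n)} \in \mathrm{Exp}_n\mathcal{D}$ we introduce the function $Q^{\pi_\sigma}_n(\gamma;\varphi^{(n)}) := \langle Q^{\pi_\sigma}_n(\gamma),\varphi^{(n)}\rangle$ such that for all
$$\int_\Gamma Q^{\pi_\sigma}_n(\gamma;\varphi^{(n)})\,Q^{\pi_\sigma}_m(\gamma;\psi^{(m)})\, d\pi_\sigma(\gamma) = \delta_{nm}\, n!\, (\varphi^{(n)},\psi^{(n)})_{L^2(\sigma^{\otimes n})}. \tag{23}$$
Hence (22) extends to the case of kernels from the so-called $n$-particle Fock space $\mathrm{Exp}_nL^2(\sigma)$, $n \in \mathbb{N}$, and we set $\mathrm{Exp}_0L^2(\sigma) := \mathbb{R}$. As usual the symmetric Fock space over the Hilbert space $L^2(\sigma)$ is defined as $\mathrm{Exp}L^2(\sigma) := \bigoplus_{n=0}^\infty \mathrm{Exp}_nL^2(\sigma)$.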
The following proposition shows that the operators $\nabla^P_\varphi$ and $\nabla^{P*}_\varphi$ play the role of the annihilation resp. creation operators in the Fock space $\mathrm{Exp}L^2(\sigma)$. Proposition 4.1 For all
vpQZ'ir,v®n) n
(24)
= Ql'+iir,v®n<M, 7 e r,
(25) n
where ip®
Next we give an explicit expression for the adjoint of the Poissonian gradient $\nabla^{P*}$. Proposition 4.2 For any function $F \in L^1(\pi_\sigma)\otimes L^1(\sigma)$ we have $F \in D(\nabla^{P*})$ and the following equality holds
$$(\nabla^{P*}F)(\gamma) = \int_X F(\gamma - \varepsilon_x,x)\, d\gamma(x) - \int_X F(\gamma,x)\, d\sigma(x), \quad \gamma \in \Gamma, \tag{26}$$
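provided the right hand side of (26) is in $L^2(\pi_\sigma)$. Proof. For $X = \mathbb{R}^d$ this proposition was proved in [6]. The general case follows from (20) and the Mecke identity, see e.g. [9],
$$\int_\Gamma\Big(\int_X h(\gamma,x)\, d\gamma(x)\Big) d\pi_\sigma(\gamma) = \int_\Gamma\int_X h(\gamma + \varepsilon_x,x)\, d\pi_\sigma(\gamma)\, d\sigma(x), \tag{27}$$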
where $h$ is any non-negative, $\mathcal{B}(\Gamma)\times\mathcal{B}(X)$-measurable function. For any contraction $B$ in $L^2(\sigma)$ it is possible to define an operator $\mathrm{Exp}B$ as a contraction in $\mathrm{Exp}L^2(\sigma)$ which in any $n$-particle subspace $\mathrm{Exp}_nL^2(\sigma)$ is given by $B\otimes\cdots\otimes B$ ($n$ times). For any positive self-adjoint operator $A$ in $L^2(\sigma)$ (with $\mathcal{D} \subset D(A)$) we have a contraction semigroup $e^{-tA}$, $t > 0$, hence it is possible to introduce the second quantization operator $d\mathrm{Exp}A$ as the generator of the semigroup $\mathrm{Exp}(e^{-tA})$, $t > 0$, i.e., $\mathrm{Exp}(e^{-tA}) = \exp(-t\, d\mathrm{Exp}A)$, see e.g. [24]. We denote by $H^P_A$ the image of the operator $d\mathrm{Exp}A$ in the Poisson space $L^2(\pi_\sigma)$ under the described isomorphism. Proposition 4.3 Let $\mathcal{D} \subset D(A)$. Then the symmetric bilinear form corresponding to the operator $H^P_A$ has the following form ($F,G \in \mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$):
$$(H^P_AF,G)_{L^2(\pi_\sigma)} = \int_\Gamma \big(\nabla^PF(\gamma),A\nabla^PG(\gamma)\big)_{L^2(\sigma)}\, d\pi_\sigma(\gamma). \tag{28}$$
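The right hand side of (28) is called the "Poissonian pre-Dirichlet form" with coefficient operator $A$ and is denoted by $\mathcal{E}^P_{\pi_\sigma,A}$. Let us consider the special case of the second quantization operator $d\mathrm{Exp}A$, where the one-particle operator $A$ coincides with the Dirichlet operator $H^X_\sigma$ generated by the measure $\sigma$ on $X$. Then we have the following theorem which relates the intrinsic Dirichlet operator $H^\Gamma_{\pi_\sigma}$ and the operator $H^P_{H^X_\sigma}$. Theorem 4.4 $H^\Gamma_{\pi_\sigma} = H^P_{H^X_\sigma}$ on $\mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$. In particular, for all $F,G \in \mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$
$$\int_\Gamma \langle\nabla^\Gamma F(\gamma),\nabla^\Gamma G(\gamma)\rangle_{T_\gamma\Gamma}\, d\pi_\sigma(\gamma) = \int_\Gamma \big(\nabla^PF(\gamma),H^X_\sigma\nabla^PG(\gamma)\big)_{L^2(\sigma)}\, d\pi_\sigma(\gamma). \tag{29}$$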
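The objects of this section become completely explicit on a one-cell toy model, which may help to fix ideas. The sketch below is an illustration under assumptions not made in the text: $X$ is collapsed to a single cell of mass `LAM` (a hypothetical value), so that a configuration is just its number of points $N \sim$ Poisson(`LAM`), the Poissonian gradient (20) becomes a forward difference, and the adjoint (26) becomes $k\,F(k-1) - \lambda F(k)$. It then checks the Mecke identity (27), the duality between the gradient and its adjoint, the orthogonality relation (23) and the annihilation property of the resulting Charlier polynomials.

```python
import math
from functools import lru_cache

LAM = 1.3    # mass sigma(X) of the single cell (hypothetical value)
KMAX = 80    # truncation of the Poisson sums; the neglected tail is negligible


def expect(f):
    """E f(N) for N ~ Poisson(LAM), truncated at KMAX."""
    p = math.exp(-LAM)
    total = 0.0
    for k in range(KMAX + 1):
        total += p * f(k)
        p *= LAM / (k + 1)
    return total


def grad(F):
    """One-cell version of (20): add one point."""
    return lambda k: F(k + 1) - F(k)


def grad_adjoint(F):
    """One-cell version of (26): k * F(k - 1) - LAM * F(k)."""
    return lambda k: k * F(k - 1) - LAM * F(k)


@lru_cache(maxsize=None)
def charlier(n, k):
    """Q_n(k) = ((grad_adjoint)^n 1)(k), the one-cell version of (22)."""
    if n == 0:
        return 1.0
    return k * charlier(n - 1, k - 1) - LAM * charlier(n - 1, k)


def h(k):
    """An arbitrary non-negative test function for the Mecke identity."""
    return 1.0 / (1.0 + k * k)


mecke_lhs = expect(lambda k: k * h(k))            # E sum_{x in gamma} h(gamma)
mecke_rhs = LAM * expect(lambda k: h(k + 1))      # sigma(X) * E h(gamma + eps_x)

F = lambda k: math.sin(0.3 * k)
G = lambda k: 1.0 / (2.0 + k)
duality_lhs = LAM * expect(lambda k: grad(F)(k) * G(k))
duality_rhs = expect(lambda k: F(k) * grad_adjoint(G)(k))

norm_q2 = expect(lambda k: charlier(2, k) ** 2)   # expected: 2! * LAM**2
print(mecke_lhs, mecke_rhs, duality_lhs, duality_rhs, norm_q2)
```

In this toy model the number operator $d\mathrm{Exp}1$ acts as multiplication by $n$ on $Q_n$, which is the simplest instance of the second quantization used in Proposition 4.3.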
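5 Relation between intrinsic and extrinsic Dirichlet forms

Here we consider the class of measures $\mathcal{G}^1_{gc}(\sigma,\Phi)$ consisting of all $\mu \in \mathcal{G}_{gc}(\sigma,\Phi)$ such that
$$\int_\Gamma \gamma(K)\, d\mu(\gamma) < \infty \quad \text{for all compact } K \subset X.$$
We define for any $\mu \in \mathcal{G}^1_{gc}(\sigma,\Phi)$ the pre-Dirichlet form $\mathcal{E}^\Gamma_\mu$ by
$$\mathcal{E}^\Gamma_\mu(F,G) := \int_\Gamma \langle\nabla^\Gamma F(\gamma),\nabla^\Gamma G(\gamma)\rangle_{T_\gamma\Gamma}\, d\mu(\gamma), \quad F,G \in \mathcal{F}C_b^\infty(\mathcal{D},\Gamma). \tag{30}$$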
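After all our preparations we are now going to prove an analogue of (29) for $\mu \in \mathcal{G}^1_{gc}(\sigma,\Phi)$. We would like to emphasize that the corresponding formula (31) is not obtained from (29) by just replacing $\pi_\sigma$ by $\mu \in \mathcal{G}^1_{gc}(\sigma,\Phi)$. The essential difference is, in addition, an extra factor involving the conditional energy $E^\Phi_{\{x\}}$. Theorem 5.1 For any $\mu \in \mathcal{G}^1_{gc}(\sigma,\Phi)$, we have for all $F,G \in \mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$
$$\mathcal{E}^\Gamma_\mu(F,G) = \int_\Gamma\int_X \langle\nabla^X\nabla^PF(\gamma,x),\nabla^X\nabla^PG(\gamma,x)\rangle_{T_xX}\, e^{-E^\Phi_{\{x\}}(\gamma+\varepsilon_x)}\, d\sigma(x)\, d\mu(\gamma). \tag{31}$$
Proof. Let us take any $F \in \mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$ of the form (10). Then given $\gamma \in \Gamma$ and $x \in X$, (20) implies that
$$\nabla^X\nabla^PF(\gamma,x) = \sum_{i=1}^N \frac{\partial g_F}{\partial s_i}(\langle\varphi_1,\gamma+\varepsilon_x\rangle,\dots,\langle\varphi_N,\gamma+\varepsilon_x\rangle)\,\nabla^X\varphi_i(x).$$
Let us define $F_i(\gamma) := \frac{\partial g_F}{\partial s_i}(\langle\varphi_1,\gamma\rangle,\dots,\langle\varphi_N,\gamma\rangle)$, $i = 1,\dots,N$. Obviously, it is enough to prove the equality (31) for $F = G$. Thus, inserting the result of $\nabla^X\nabla^PF(\gamma,x)$ into the right hand side of (31) we obtain
$$\int_\Gamma\int_X \sum_{i,j=1}^N \langle\nabla^X\varphi_i(x),\nabla^X\varphi_j(x)\rangle_{T_xX}\, F_i(\gamma+\varepsilon_x)F_j(\gamma+\varepsilon_x)\, e^{-E^\Phi_{\{x\}}(\gamma+\varepsilon_x)}\, d\sigma(x)\, d\mu(\gamma). \tag{32}$$
Then we need the following useful proposition which generalizes the Mecke identity to measures in $\mathcal{G}_{gc}(\sigma,\Phi)$, see [8], [25]. Proposition 5.2 Let $h : \Gamma\times X \to \mathbb{R}_+$ be $\mathcal{B}(\Gamma)\times\mathcal{B}(X)$-measurable, and let $\mu \in \mathcal{G}_{gc}(\sigma,\Phi)$. Then we have
$$\int_\Gamma\Big(\int_X h(\gamma,x)\, d\gamma(x)\Big) d\mu(\gamma) = \int_\Gamma\int_X h(\gamma+\varepsilon_x,x)\, e^{-E^\Phi_{\{x\}}(\gamma+\varepsilon_x)}\, d\mu(\gamma)\, d\sigma(x). \tag{33}$$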
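Using this proposition we transform (32) into
$$\int_\Gamma \sum_{i,j=1}^N F_i(\gamma)F_j(\gamma)\,\big\langle\langle\nabla^X\varphi_i(\cdot),\nabla^X\varphi_j(\cdot)\rangle_{TX},\gamma\big\rangle\, d\mu(\gamma).$$
On the other hand using (14) we obtain
$$\langle\nabla^\Gamma F(\gamma),\nabla^\Gamma F(\gamma)\rangle_{T_\gamma\Gamma} = \sum_{i,j=1}^N F_i(\gamma)F_j(\gamma)\,\big\langle\langle\nabla^X\varphi_i(\cdot),\nabla^X\varphi_j(\cdot)\rangle_{TX},\gamma\big\rangle.$$
Therefore the equality holds on the dense set $\mathcal{F}C_b^\infty(\mathcal{D},\Gamma)$, which proves the theorem.

6 Closability of intrinsic Dirichlet forms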
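In this section we will prove the closability of the intrinsic Dirichlet form $(\mathcal{E}^\Gamma_\mu,\mathcal{F}C_b^\infty(\mathcal{D},\Gamma))$ on $L^2(\mu) := L^2(\Gamma,\mu)$ for all $\mu \in \mathcal{G}^1_{gc}(\sigma,\Phi)$, using the integral representation (31) in Theorem 5.1. The closability of $(\mathcal{E}^\Gamma_\mu,\mathcal{F}C_b^\infty(\mathcal{D},\Gamma))$ over $\Gamma$ is implied by the closability of an appropriate family of pre-Dirichlet forms over $X$. Let us describe this more precisely. We define new intensity measures on $X$ by $d\sigma_\gamma(x) := \rho_\gamma(x)\, dm(x)$, where
$$\rho_\gamma(x) := e^{-E^\Phi_{\{x\}}(\gamma+\varepsilon_x)}\rho(x), \quad x \in X,\ \gamma \in \Gamma. \tag{34}$$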
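It was shown in [26] (in the case $X = \mathbb{R}^d$) that the components of the Dirichlet form $(\mathcal{E}^X_{\sigma_\gamma},\mathcal{D}^{\sigma_\gamma})$ corresponding to the measure $\sigma_\gamma$ are closable on $L^2(\mathbb{R}^d,\sigma_\gamma)$ if and only if $\sigma_\gamma$ is absolutely continuous with respect to Lebesgue measure on $\mathbb{R}^d$ and the Radon-Nikodym derivative satisfies some condition, see (35) below for details. This result allows us to prove the closability of $(\mathcal{E}^\Gamma_\mu,\mathcal{F}C_b^\infty(\mathcal{D},\Gamma))$ on $L^2(\mu)$. Let us first recall the above mentioned result. Theorem 6.1 (cf. Theorem 5.3 in [26]) Let $\nu$ be a probability measure on $(\mathbb{R}^d,\mathcal{B}(\mathbb{R}^d))$, $d \in \mathbb{N}$, and let $\mathcal{D}^\nu$ denote the $\nu$-classes determined by $\mathcal{D}$. Then the forms $(\mathcal{E}_{\nu,i},\mathcal{D}^\nu)$ defined by
$$\mathcal{E}_{\nu,i}(u,v) := \int_{\mathbb{R}^d}\frac{\partial u}{\partial x_i}\frac{\partial v}{\partial x_i}\, d\nu, \quad u,v \in \mathcal{D},$$
are well-defined and closable on $L^2(\mathbb{R}^d,\nu)$ for $1 \le i \le d$ if and only if $\nu$ is absolutely continuous with respect to Lebesgue measure $\lambda^d$ on $\mathbb{R}^d$, and the Radon-Nikodym derivative $\rho = d\nu/d\lambda^d$ satisfies the condition: for any $1 \le i \le d$ and $\lambda^{d-1}$-a.e. $x \in \{y \in \mathbb{R}^{d-1} \mid \int \rho^{(i)}_y(s)\, d\lambda^1(s) > 0\}$,
$$\rho^{(i)}_x = 0 \quad \lambda^1\text{-a.e. on } \mathbb{R}\setminus R(\rho^{(i)}_x), \tag{35}$$
where $\rho^{(i)}_x(s) := \rho(x_1,\dots,x_{i-1},s,x_i,\dots,x_{d-1})$, $s \in \mathbb{R}$, if $x = (x_1,\dots,x_{d-1}) \in \mathbb{R}^{d-1}$, and where
$$R(\rho^{(i)}_x) := \Big\{t \in \mathbb{R} \;\Big|\; \int_{t-\varepsilon}^{t+\varepsilon}\frac{1}{\rho^{(i)}_x(s)}\, ds < \infty \text{ for some } \varepsilon > 0\Big\}. \tag{36}$$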
There is an obvious generalization of Theorem 6.1 to the case where a Riemannian manifold $X$ is replacing $\mathbb{R}^d$, to be formulated in terms of local charts. Since here we are only interested in the "if part" of Theorem 6.1, we now recall a slightly weaker sufficient condition for closability in the general case where $X$ is a manifold as before. Theorem 6.2 Suppose $\sigma_1 = \rho_1\cdot m$, where $\rho_1 : X \to \mathbb{R}_+$ is $\mathcal{B}(X)$-measurable such that
$$\rho_1 = 0 \quad m\text{-a.e. on } X\setminus\Big\{x \in X \;\Big|\; \int_{\Lambda_x}\frac{1}{\rho_1}\, dm < \infty \text{ for some open neighbourhood } \Lambda_x \text{ of } x\Big\}. \tag{37}$$
Then $(\mathcal{E}^X_{\sigma_1},\mathcal{D}^{\sigma_1})$ defined by
$$\mathcal{E}^X_{\sigma_1}(u,v) := \int_X \langle\nabla^Xu(x),\nabla^Xv(x)\rangle_{T_xX}\, d\sigma_1(x), \quad u,v \in \mathcal{D},$$
is closable on $L^2(\sigma_1)$. The proof is a straightforward adaptation of the line of arguments in [11] (Chap. II, Subsection 2a), see also Theorem 4.2 in [27] for details. We emphasize that (37) e.g. always holds if $\rho_1$ is lower semicontinuous, and that neither $\nu$ in Theorem 6.1 nor $\sigma_1$ in Theorem 6.2 is required to have full support, so e.g. $\rho_1$ is not necessarily strictly positive $m$-a.e. on $X$. We are now ready to prove the closability of $(\mathcal{E}^\Gamma_\mu,\mathcal{F}C_b^\infty(\mathcal{D},\Gamma))$ on $L^2(\mu)$ under the above assumption. Theorem 6.3 Let $\mu \in \mathcal{G}^1_{gc}(\sigma,\Phi)$. Suppose that for $\mu$-a.e. $\gamma \in \Gamma$ the function $\rho_\gamma$ defined in (34) satisfies (37) (resp. (35) in case $X = \mathbb{R}^d$). Then the form $(\mathcal{E}^\Gamma_\mu,\mathcal{F}C_b^\infty(\mathcal{D},\Gamma))$ is closable on $L^2(\mu)$. We address the interested reader to [3] for the details of the proof. Remark 6.4 The above method to prove closability of pre-Dirichlet forms on configuration spaces $\Gamma_X$ extends immediately to the case where $X$ is replaced by an infinite dimensional "manifold" such as the loop space (cf. [28]). Example 6.5 Let $X = \mathbb{R}^d$ with the Euclidean metric and $\sigma := z\cdot m$, $z \in (0,\infty)$. A pair potential is a $\mathcal{B}(\mathbb{R}^d)$-measurable function
and $(7) :—
Acknowledgments

We would like to thank Tobias Kuna for helpful discussions. Financial support of the INTAS-Project 378, PRAXIS Programme through CITMA, Funchal, and TMR Nr. ERB4001GT957046 are gratefully acknowledged.

References

1. S. Albeverio, Y. G. Kondratiev, and M. Röckner, J. Funct. Anal. 154, 444 (1998).
2. S. Albeverio, Y. G. Kondratiev, and M. Röckner, C. R. Acad. Sci. Paris Sér. I Math. 323, 1179 (1996).
3. Y. G. Kondratiev, M. Röckner, and J. L. Silva, Math. Nachr. (1998).
4. S. Albeverio, Y. G. Kondratiev, and M. Röckner, C. R. Acad. Sci. Paris Sér. I Math. 323, 1129 (1996).
5. S. Albeverio, Y. G. Kondratiev, and M. Röckner, J. Funct. Anal. 157, 242 (1998).
6. Y. G. Kondratiev, J. L. Silva, L. Streit, and F. G. Us, Infinite Dimensional Analysis, Quantum Probabilities and Related Topics 1, 91 (1998).
7. S. Albeverio, Y. G. Kondratiev, and M. Röckner, Rev. Math. Phys. 11, 1 (1999).
8. X. X. Nguyen and H. Zessin, Math. Nachr. 88, 105 (1979).
9. J. Mecke, Z. Wahrsch. verw. Gebiete 9, 36 (1967).
10. H. Osada, Comm. Math. Phys. 176, 117 (1996).
11. Z.-M. Ma and M. Röckner, Introduction to the Theory of (Non-Symmetric) Dirichlet Forms (Springer-Verlag, Berlin, 1992).
12. I. M. Gel'fand and N. Y. Vilenkin, Generalized Functions (Academic Press, New York and London, 1968), Vol. IV.
13. H. Shimomura, J. Math. Kyoto Univ. 34, 599 (1994).
14. J. C. Preston, Z. Wahrsch. verw. Gebiete 46, 125 (1979).
15. C. J. Preston, Random Fields, Vol. 534 of Lecture Notes in Math. (Springer-Verlag, Berlin Heidelberg New York, 1976).
16. O. Georgii, Canonical Gibbs Measures, Vol. 760 of Lecture Notes in Math. (Springer-Verlag, Berlin, 1979).
17. Y. M. Berezansky and Y. G. Kondratiev, Spectral Methods in Infinite-Dimensional Analysis (Kluwer Academic Publishers, Dordrecht, 1995).
18. Y. G. Kondratiev, J. L. Silva, and L. Streit, Methods of Functional Analysis and Topology 3, 28 (1997).
19. Y. M. Berezansky, V. O. Livinsky, and E. W. Lytvynov, Methods of Functional Analysis and Topology 1, 28 (1995).
20. Y. Ito and I. Kubo, Nagoya Math. J. 111, 41 (1988).
21. D. Nualart and J. Vives, in Seminar on Stochastic Analysis, Random Fields and Applications (Birkhäuser, Basel, 1995), Vol. 36, pp. 205-213.
22. N. Privault, J. Funct. Anal. 132, 335 (1995).
23. T. Hida, H. H. Kuo, J. Potthoff, and L. Streit, White Noise. An Infinite Dimensional Calculus (Kluwer, Dordrecht, 1993).
24. M. Reed and B. Simon, Methods of Modern Mathematical Physics (Academic Press, New York and London, 1975).
25. K. Matthes, J. Mecke, and W. Warmuth, Math. Nachr. 88, 117 (1979).
26. S. Albeverio and M. Röckner, J. Funct. Anal. 88, 395 (1990).
27. S. Albeverio, J. Brasche, and M. Röckner, in Schrödinger Operators, Vol. 345 of Lecture Notes in Phys., edited by H. Holden and A. Jensen (Springer-Verlag, Berlin, 1989), pp. 1-42.
28. Z.-M. Ma and M. Röckner, preprint (unpublished).
CIRCULAR INVARIANCE OF THE WEYL FORM OF THE CANONICAL COMMUTATION RELATION

JAN STOCHEL
Instytut Matematyki, Uniwersytet Jagielloński, ul. Reymonta 4, PL-30059 Kraków
E-mail: [email protected]

F.H. SZAFRANIEC
Instytut Matematyki, Uniwersytet Jagielloński, ul. Reymonta 4, PL-30059 Kraków
E-mail: [email protected]

The Schrödinger-Weyl representation of the canonical commutation relation is circularly invariant in a sense. Here, as a continuation of [5], we show the converse.
1 Introduction

The Weyl version of the canonical commutation relation, in its simplest form, can be written as follows
$$U(s)V(t) = e^{ist}\,V(t)U(s), \quad s,t \in \mathbb{R}, \tag{W}$$
where $\{U(s)\}_{s\in\mathbb{R}}$ and $\{V(t)\}_{t\in\mathbb{R}}$ are unitary $C_0$-groups [6] (see also [4] for more information concerning commutation relations in general). In particular, the pair $(\{U_{sw}(s)\}_{s\in\mathbb{R}},\{V_{sw}(t)\}_{t\in\mathbb{R}})$ of unitary $C_0$-groups defined in $\mathcal{L}^2(\mathbb{R})$ by
$$(U_{sw}(t)f)(x) = e^{itx}f(x), \quad (V_{sw}(t)f)(x) = f(x-t), \quad f \in \mathcal{L}^2(\mathbb{R}),\ t,x \in \mathbb{R},$$
satisfies the commutation relation (W). The infinitesimal generators $iQ_{sw}$ and $iP_{sw}$ of $\{U_{sw}(s)\}_{s\in\mathbb{R}}$ and $\{V_{sw}(t)\}_{t\in\mathbb{R}}$, respectively, are given by
$$Q_{sw} = x, \qquad P_{sw} = i\,\frac{d}{dx}.$$
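The relation (W) for the Schrödinger-Weyl pair can be verified pointwise on any test function. The sketch below is only an illustration: the operators are modelled as Python closures acting on callables, the test function `f` and the sample parameters are arbitrary choices, and `twisted` is a hypothetical helper that multiplies an operator by a scalar phase. The second check uses the phases $e^{i\xi s}$ and $e^{i\eta t}$ to show that the twisted pair still satisfies (W).

```python
import cmath


def U(s):
    """(U(s) f)(x) = e^{isx} f(x)."""
    return lambda f: (lambda x: cmath.exp(1j * s * x) * f(x))


def V(t):
    """(V(t) f)(x) = f(x - t)."""
    return lambda f: (lambda x: f(x - t))


def twisted(phase, op):
    """Multiply an operator by a scalar phase (hypothetical helper)."""
    return lambda f: (lambda x: phase * op(f)(x))


def weyl_defect(Us, Vt, s, t, f, points):
    """max_x |(U V f)(x) - e^{ist} (V U f)(x)| over the sample points."""
    lhs = Us(Vt(f))
    rhs = Vt(Us(f))
    return max(abs(lhs(x) - cmath.exp(1j * s * t) * rhs(x)) for x in points)


def f(x):
    """Arbitrary rapidly decaying test function."""
    return cmath.exp(-x * x / 2.0) * (1.0 + 0.5j * x)


points = [-2.0, -0.3, 0.0, 0.7, 1.9]
s, t = 0.7, -1.3
plain = weyl_defect(U(s), V(t), s, t, f, points)

xi, eta = 0.4, 2.1
Us_twisted = twisted(cmath.exp(1j * xi * s), U(s))
Vt_twisted = twisted(cmath.exp(1j * eta * t), V(t))
circular = weyl_defect(Us_twisted, Vt_twisted, s, t, f, points)
print(plain, circular)
```

The second check only uses that scalar phases commute with everything; the nontrivial question addressed in this note is when the twisted pair remains in a prescribed class of couples.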
The pair $(Q_{sw},P_{sw})$, the classical object of quantum mechanics, is composed of selfadjoint operators and is known as the Schrödinger couple. It justifies the following definition: any couple $(\{U(s)\}_{s\in\mathbb{R}},\{V(t)\}_{t\in\mathbb{R}})$ of unitary $C_0$-groups which is unitarily equivalent to $(\{U_{sw}(s)\}_{s\in\mathbb{R}},\{V_{sw}(t)\}_{t\in\mathbb{R}})$ will be called a Schrödinger-Weyl couple. The two facts which follow make the starting point for our further considerations:
(N) a couple $(\{U(s)\}_{s\in\mathbb{R}},\{V(t)\}_{t\in\mathbb{R}})$ of unitary $C_0$-groups satisfies (W) if and only if it is the orthogonal sum of Schrödinger-Weyl couples (this is the celebrated von Neumann uniqueness theorem [3]);
(O) if a couple $(\{U(s)\}_{s\in\mathbb{R}},\{V(t)\}_{t\in\mathbb{R}})$ of unitary $C_0$-groups satisfies (W), then so does the couple $(\{e^{i\xi s}U(s)\}_{s\in\mathbb{R}},\{e^{i\eta t}V(t)\}_{t\in\mathbb{R}})$ for all $\xi,\eta \in \mathbb{R}$ (a rather simple observation).
It turns out that the following circular invariance property (which resembles (O)) holds true: for every $(\xi,\eta) \in \mathbb{R}^2$, $(\{e^{i\xi s}U(s)\}_{s\in\mathbb{R}},\{e^{i\eta t}V(t)\}_{t\in\mathbb{R}})$ is a Schrödinger-Weyl couple provided so is $(\{U(s)\}_{s\in\mathbb{R}},\{V(t)\}_{t\in\mathbb{R}})$. Our goal in this note is to find out a pretty wide class $\mathcal{X}$ of couples of unitary $C_0$-groups within which the invariance property:
(I) for every $(\xi,\eta) \in \mathbb{R}^2$, $(\{e^{i\xi s}U(s)\}_{s\in\mathbb{R}},\{e^{i\eta t}V(t)\}_{t\in\mathbb{R}}) \in \mathcal{X}$,
identifies Schrödinger-Weyl couples.

2 The class W
Let us begin with a (formally) more general case. Assume that $\mathcal{H}$ is a separable complex Hilbert space. Given an orthonormal basis $e = \{e_n\}_{n=0}^{\infty}$ in $\mathcal{H}$ and a sequence $\sigma = \{\sigma_n\}_{n=0}^{\infty}$ of nonzero complex numbers, we denote by $W(e,\sigma)$ the set of all couples $(\{U(s)\}_{s\ge 0}, \{V(t)\}_{t\ge 0})$ of $C_0$-semigroups$^a$ of bounded linear operators in $\mathcal{H}$ whose infinitesimal generators, written as $(iQ, iP)$, satisfy the following conditions:

$Q$ and $P$ are symmetric operators and $\mathcal{D}_e \subset \mathcal{D}(Q) \cap \mathcal{D}(P)$,   (1)

$Q e_n = \sigma_{n-1} e_{n-1} + \sigma_n e_{n+1}, \quad n \ge 0,$   (2)

$P e_n = i\,(\sigma_{n-1} e_{n-1} - \sigma_n e_{n+1}), \quad n \ge 0,$   (3)
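where $\mathcal{D}_e$ stands for the linear span of $\{e_n\}_{n=0}^{\infty}$ and $\mathcal{D}(Q)$ denotes the domain of $Q$. Set $W = \bigcup_{e,\sigma} W(e,\sigma)$, where the union is taken over all possible pairs of $e$'s and $\sigma$'s. By $W_0$ we denote those members of $W$ which, when belonging to suitable $W(e,\sigma)$, are such that $\sigma_0 = \sqrt{1/2}$ and $\sigma_n > 0$ for $n = 1, 2, \ldots$ The orthonormal basis relevant to the Schrödinger-Weyl couple consists of the Hermite functions

$\pi^{-1/4} (-1)^n 2^{-n/2} (n!)^{-1/2} e^{x^2/2} \frac{d^n}{dx^n} e^{-x^2}, \quad x \in \mathbb{R}.$

$^a$ A semigroup approach to the CCR can be found in [2] as well as in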
Theorem 2.1 If $iQ$ and $iP$ are infinitesimal generators of $C_0$-semigroups $\{U(s)\}_{s\ge 0}$ and $\{V(t)\}_{t\ge 0}$, respectively, and $(\{U(s)\}_{s\ge 0}, \{V(t)\}_{t\ge 0}) \in W$, then both operators $Q$ and $P$ are selfadjoint.

Proof Since $iQ$ is an infinitesimal generator of a $C_0$-semigroup, the spectrum of $Q$ must be different from $\mathbb{C}$. This and the fact that $Q$ is symmetric imply that $Q$ is maximal symmetric. Denote by $Q_e$ the closure of the restriction of $Q$ to $\mathcal{D}_e$. Since $Q_e$ is an operator induced by a Jacobi matrix, its deficiency indices are equal to each other and do not exceed 1; denote their common value by $\kappa$. Now if $\kappa = 0$, then $Q_e$ is selfadjoint and consequently so is $Q$. If $\kappa = 1$, then, by maximality of $Q$, $Q_e \ne Q$. Since both deficiency indices of $Q_e$ are equal to 1, each proper symmetric extension of $Q_e$ must automatically be selfadjoint. In particular, $Q$ is selfadjoint. Likewise we treat $P$. ■

Because, by Theorem 2.1, if $(\{U(s)\}_{s\ge 0}, \{V(t)\}_{t\ge 0}) \in W$, then the $C_0$-semigroups $\{U(s)\}_{s\ge 0}$ and $\{V(t)\}_{t\ge 0}$ extend in a unique way to unitary $C_0$-groups $\{U(s)\}_{s\in\mathbb{R}}$ and $\{V(t)\}_{t\in\mathbb{R}}$, respectively, we preserve the same notation for both.

3 The main result
Our main result, which follows immediately, shows that the invariance property (I) distinguishes Schrödinger-Weyl couples $(\{U(s)\}_{s\in\mathbb{R}}, \{V(t)\}_{t\in\mathbb{R}})$ among all members of the class $X = W$.

Theorem 3.1 If $(\{U(s)\}_{s\ge 0}, \{V(t)\}_{t\ge 0}) \in W$, then the following conditions are equivalent:

(i) $(\{e^{i\xi s}U(s)\}_{s\ge 0}, \{e^{i\eta t}V(t)\}_{t\ge 0}) \in W$ for all $(\xi,\eta) \in \mathbb{R}^2$,

(ii) $(\{e^{i\xi s}U(s)\}_{s\ge 0}, \{e^{i\eta t}V(t)\}_{t\ge 0}) \in W$ for some $(\xi,\eta) \in \mathbb{R}^2 \setminus \{(0,0)\}$,

(iii) there exists $a \in \mathbb{R} \setminus \{0\}$ such that $(\{U(as)\}_{s\in\mathbb{R}}, \{V(at)\}_{t\in\mathbb{R}})$ is a Schrödinger-Weyl couple.
If in (i) and (ii) $W$ is replaced by $W_0$, then in (iii) $a$ turns out to be 1. Moreover, if $(\{U(s)\}_{s\in\mathbb{R}}, \{V(t)\}_{t\in\mathbb{R}})$ is a Schrödinger-Weyl couple, then so is $(\{e^{i\xi s}U(s)\}_{s\in\mathbb{R}}, \{e^{i\eta t}V(t)\}_{t\in\mathbb{R}})$ for all $\xi, \eta \in \mathbb{R}$.

Theorem 3.1 is an exponential version of a result of [5] and corresponds in a sense to the von Neumann uniqueness theorem (N). Following [5], we denote by $J(e,\sigma)$ ($e$ and $\sigma$ being as in Section 2) the set of all pairs $(Q,P)$ of closed operators in $\mathcal{H}$ which fulfil conditions (1), (2) and (3). Set $J = \bigcup_{e,\sigma} J(e,\sigma)$, where the union is taken over all possible pairs of $e$'s and $\sigma$'s. $J_0$ stands for the set of all members of $J$ which, when belonging to suitable $J(e,\sigma)$, are such that $\sigma_0 = \sqrt{1/2}$ and $\sigma_n > 0$ for $n = 1, 2, \ldots$ We say that $(Q,P)$ is a Heisenberg couple with respect to an orthonormal basis $\{e_n\}_{n=0}^{\infty}$ if $(Q,P) \in J(\{e_n\}_{n=0}^{\infty}, \sigma)$ with $\sigma_n = \sqrt{(n+1)/2}$, $n \ge 0$. Heisenberg couples are characterized (up to a multiplicative constant) by Theorem 8 (see also Remark 9) in [5]:

Theorem 3.2 If $(Q,P) \in J$, then the following conditions are equivalent:
(i) for all $\xi, \eta \in \mathbb{R}$, $(Q + \xi I, P + \eta I) \in J$,

(ii) there exist $\xi, \eta \in \mathbb{R}$ such that $\xi^2 + \eta^2 > 0$ and $(Q + \xi I, P + \eta I) \in J$,

(iii) there exists $a \in \mathbb{R} \setminus \{0\}$ such that $(aQ, aP)$ is a Heisenberg couple.

If in (i) and (ii) $J$ is replaced by $J_0$, then in (iii) $a$ turns out to be 1. Moreover, if $(Q,P)$ is a Heisenberg couple, then $(Q + \xi I, P + \eta I)$ is a Heisenberg couple for all $\xi, \eta \in \mathbb{R}$.

Proof of Theorem 3.1 Let us begin with stating two facts:

($\alpha$) the infinitesimal generator of the $C_0$-semigroup $\{e^{i\xi s}U(s)\}_{s\ge 0}$, $\xi \in \mathbb{R}$, is the translation of that of $\{U(s)\}_{s\ge 0}$ by $i\xi I$;

($\beta$) the infinitesimal generator of the $C_0$-group $\{U(as)\}_{s\in\mathbb{R}}$, $a \in \mathbb{R}$, is that of $\{U(s)\}_{s\in\mathbb{R}}$ multiplied by $a$.

Suppose (ii). Since $(Q,P) \in J$, by ($\alpha$), we have $(Q + \xi I, P + \eta I) \in J$. Due to Theorem 3.2, there is $a \in \mathbb{R} \setminus \{0\}$ such that $(aQ, aP)$ is unitarily equivalent to $(Q_{SW}, P_{SW})$. This and ($\beta$) lead to (iii). Now suppose (iii). By ($\beta$), the couple $(aQ, aP) \in J(e, \{\sqrt{(n+1)/2}\}_{n=0}^{\infty})$ with suitable $e$. Using Theorem 3.2 again, we get that $(Q + \xi I, P + \eta I)$ is in $J$. This, by ($\alpha$), immediately gives (i). The rest can be easily deduced from Theorem 3.2 via the same arguments. This completes the proof of the theorem. ■

Acknowledgment

The research resulting in this paper was supported by the KBN grant # 2 P03A 004 17

References

1. C. Foiaş and L. Gehér, Acta Math. Sci. (Szeged) 24, 97 (1963).
2. C. Foiaş, L. Gehér and B. Sz-Nagy, Acta Math. Sci. (Szeged) 21, 78 (1960).
3. J. von Neumann, Math. Annalen 104, 570 (1931).
4. C. R. Putnam, Commutation properties of Hilbert space operators and related topics, (Springer Verlag, Berlin, Heidelberg, New York, 1967).
5. Jan Stochel and F.H. Szafraniec, A peculiarity of the creation operator, submitted.
6. H. Weyl, Z. Phys. 46, 1 (1928).
NUMERICAL SOLUTION OF THE NONLINEAR SCHRÖDINGER EQUATION WITH NEWTON-TYPE METHODS
L. VAZQUEZ
Dept. Matemática Aplicada, Facultad de Informática, Universidad Complutense, 28040-Madrid, Spain
E-mail: [email protected]

D. N. KOZAKEVICH
Department of Mathematics, Physical and Mathematics Centre, Federal University of Santa Catarina, Campus Trindade, 88040-900 Florianópolis SC, Brazil.
E-mail: [email protected]

S. JIMENEZ
Dept. Matemática y Física Aplicadas, Universidad Alfonso X El Sabio, Avda. Universidad 1, 28691-Villanueva de la Cañada, Madrid, Spain
E-mail: [email protected]

Newton-type methods provide a significant computational economy for simulations done with totally implicit numerical schemes. Several numerical results are presented for the nonlinear Schrödinger equation, showing the efficiency and robustness of the methods.
1 Introduction

In the numerical simulation with finite difference methods of PDEs modelling nonlinear wave equations, implicit numerical schemes usually give better results as regards stability and accuracy [21]. Nevertheless, the complexity and the amount of CPU time involved in those methods usually make them less attractive than explicit or semi-implicit numerical schemes. The main drawback comes from the fact that in an implicit method the variables for one oscillator get coupled to those of every other oscillator in the discrete spatial mesh at the top time level, and some technique is needed to solve the nonlinear system of equations involved at every time step. This problem is significant when large numbers of oscillators have to be considered and/or long times are to be reached in the simulations. Gomes-Ruggiero [11] and Kozakevich [14] showed that Newton-type methods are very useful to simulate nonlinear boundary value problems. In our work, we are interested in using these methods on evolution problems, as an efficient way to implement implicit finite difference schemes. We have chosen the cubic Schrödinger equation in one space dimension since it is a model of physical interest for which the behaviour of the solutions is known$^a$. Many finite difference schemes have been developed for this equation, especially to avoid the computational complexity of implicit methods. We have chosen a second order, fully implicit scheme that is known to be among the most accurate ones [22].

This contribution is organized as follows: in Section 2 we introduce the continuous model and the numerical scheme used. In Section 3 we present different algorithms to solve the nonlinear systems, while the numerical simulations are treated in Section 4. A conclusion summarizes the results.

2 Continuous model and numerical scheme

2.1 Nonlinear Schrödinger equation
The Nonlinear (Cubic) Schrödinger equation (NLS) in one spatial dimension arises in a diversity of contexts: solid state physics, signal propagation in optical fibers, plasma physics, etc. [13]. The equation we are considering is

$i\psi_t + \psi_{xx} + 2|\psi|^2\psi = 0$   (1)

where $\psi(x,t)$ is a complex function. Associated to it we will consider an initial value problem and we will restrict ourselves to the study of solutions with compact spatial support $\Omega$, at least for some finite interval of time. This equation has soliton solutions and an infinite set of conservation laws [24]. Among those we have the charge and the energy:

$Q = \int_\Omega |\psi|^2 \, dx,$   (2)

$E = \int_\Omega \left( -|\psi_x|^2 + |\psi|^4 \right) dx.$   (3)
For the numerical tests we will use the one-soliton solution, a collision of two solitons and a bound state of three solitons, which are considered fairly standard tests [22,6]. In the first case the exact solution is

$\psi(x,t) = 2\eta \sqrt{\frac{2}{a}} \exp(2i\chi x)\, \mathrm{sech}\{2\eta[x - x_0 - c(t - t_0)]\}$   (4)

where $\eta$, $a$, $\chi$ and $c$ are parameters related to the width, center and velocity of the soliton and will be properly fixed. In the second case the analytical form of the solution is not known, but we can consider as initial condition a superposition of two solitons with incoming velocities, provided their spatial supports do not overlap significantly:

$\psi(x,0) = 2\eta_1 \sqrt{\frac{2}{a}} \exp(2i\chi_1 x)\, \mathrm{sech}\{2\eta_1(x - x_1)\} + 2\eta_2 \sqrt{\frac{2}{a}} \exp(2i\chi_2 x)\, \mathrm{sech}\{2\eta_2(x - x_2)\}.$   (5)

After the collision, both solitons emerge unchanged but for a shift in their positions. The third case corresponds to the initial condition

$\psi(x,0) = \mathrm{sech}(x - x_1).$   (6)

This case is very sensitive since very large fluctuations appear locally.

Figure 1: One soliton

$^a$ Applications to higher dimensions and to other nonlinear wave equations are under way and will be published elsewhere.
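The three initial conditions can be written down directly. The following is a minimal NumPy sketch, not the authors' FORTRAN code; the function names are hypothetical, and the amplitude factor $2\eta\sqrt{2/a}$ and the phase $\exp(2i\chi x)$ are taken from formulas (4) and (5) as read here, which is an assumption about the printed expressions.

```python
import numpy as np

def one_soliton(x, t, eta, chi, x0, c, a=2.0, t0=0.0):
    """Formula (4) as read here: 2*eta*sqrt(2/a)*exp(2i*chi*x)*sech(2*eta*(x - x0 - c*(t - t0)))."""
    arg = 2.0 * eta * (x - x0 - c * (t - t0))
    return 2.0 * eta * np.sqrt(2.0 / a) * np.exp(2j * chi * x) / np.cosh(arg)

def two_solitons(x, eta1, chi1, x1, eta2, chi2, x2, a=2.0):
    """Formula (5): superposition of two well separated solitons at t = 0."""
    return (one_soliton(x, 0.0, eta1, chi1, x1, 0.0, a)
            + one_soliton(x, 0.0, eta2, chi2, x2, 0.0, a))

def three_soliton_bound_state(x, x1):
    """Formula (6): psi(x, 0) = sech(x - x1)."""
    return 1.0 / np.cosh(x - x1)

# Illustrative grid: x in [0, 60] with mesh size 0.1; at t = t0 the velocity c plays no role.
dx = 0.1
x = np.linspace(0.0, 60.0, 601)
psi0 = one_soliton(x, 0.0, eta=0.5, chi=0.5, x0=20.0, c=0.0)
peak = np.abs(psi0).max()
charge = dx * np.sum(np.abs(psi0) ** 2)   # Riemann sum for the charge (2)
```

With $a = 2$ and $\eta = 0.5$ the profile reduces to $\mathrm{sech}(x - x_0)$ in modulus, so the peak is 1 and the charge is close to 2.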
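2.2 Numerical scheme

Several numerical schemes have been used to simulate equation (1) [3,22,6]. Among the schemes of order 2, we will consider

$i\,\frac{\psi_l^{n+1} - \psi_l^n}{\Delta t} + \frac{\psi_{l+1}^{n+1} - 2\psi_l^{n+1} + \psi_{l-1}^{n+1}}{2\Delta x^2} + \frac{\psi_{l+1}^{n} - 2\psi_l^{n} + \psi_{l-1}^{n}}{2\Delta x^2} + \left(|\psi_l^{n+1}|^2 + |\psi_l^n|^2\right) \frac{\psi_l^{n+1} + \psi_l^n}{2} = 0$   (7)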
which was used first by Delfour et al. [3]. Here $\psi_l^n$ is the discrete representation of $\psi(x_0 + l\Delta x, t_0 + n\Delta t)$, where $t_0$ is the initial time (usually $t_0 = 0$) and $x_0$ is the leftmost value of the spatial interval considered; $\Delta t$ and $\Delta x$ are respectively the time and spatial mesh sizes. The temporal index $n$ ranges from 0 to $N$ such that $t_0 + N\Delta t$ is the maximal time to be reached, while the spatial index $l$ ranges from 0 to $L$ such that $[x_0, x_0 + L\Delta x]$ is the maximal spatial support of the solution, and we will assume that $x_0$ and $L$ are chosen such that $\psi$ vanishes identically (for all the times considered) at the end points of the interval. At the discrete level this translates into

$\forall n = 0, \ldots, N, \quad \psi_0^n = \psi_L^n = 0.$   (8)
The initial conditions provide the values of $\psi_l^0$ and the scheme is used to compute the values of $\psi_l^{n+1}$ for $n \ge 0$ and $l = 1, \ldots, L-1$. Some estimate of $\psi_l^{n+1}$ must be given in order to start the iterative Newton-type methods, and we have found that linear estimates provide sufficiently accurate seeds. This implicit scheme is considered to give very good results but at a high computational cost [22], and it is usually implemented with some iterative method other than Newton-type ones [3,22]. It is a conservative scheme and has discrete counterparts of the conservation of the charge and of the energy. If we define

$Q^n = \sum_{l=1}^{L} \Delta x \, |\psi_l^n|^2,$   (9)

$E^n = \sum_{l=1}^{L} \Delta x \left( -\left| \frac{\psi_{l+1}^n - \psi_l^n}{\Delta x} \right|^2 + |\psi_l^n|^4 \right),$   (10)

we have that [3,12]

$\forall n = 1, \ldots, N, \quad Q^n = Q^0, \quad E^n = E^0.$   (11)
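The discrete invariants (9) and (10) are cheap to monitor during a run. Below is a minimal sketch in NumPy, assuming the grid function is stored as a complex array whose end values vanish as in (8); the function names are hypothetical, and the sums are taken over all available nodes and forward differences, which agrees with (9) and (10) when the boundary values are zero.

```python
import numpy as np

def discrete_charge(psi, dx):
    """Q^n of (9): dx * sum_l |psi_l|^2."""
    return dx * np.sum(np.abs(psi) ** 2)

def discrete_energy(psi, dx):
    """E^n of (10): dx * sum_l ( -|(psi_{l+1} - psi_l)/dx|^2 + |psi_l|^4 )."""
    forward_diff = (psi[1:] - psi[:-1]) / dx
    return dx * (np.sum(np.abs(psi) ** 4) - np.sum(np.abs(forward_diff) ** 2))

# Illustrative check on the profile sech(x - 20), for which the continuous
# values are Q = 2 and E = -2/3 + 4/3 = 2/3.
dx = 0.1
x = np.linspace(0.0, 60.0, 601)
psi = (1.0 / np.cosh(x - 20.0)).astype(complex)
Q0 = discrete_charge(psi, dx)
E0 = discrete_energy(psi, dx)
```

In a time loop one would evaluate both functions after each step and compare them with their values at $n = 0$, which is the kind of check the charge error reported later in the tables provides.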
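The conservation law for the charge, in both the continuous and the discrete case, is important since it gives bounds on the solutions. At the discrete level, both conservation laws provide tests for the good behaviour of the numerical solutions. In order to use the scheme we split the equation into its real and imaginary parts: defining

$\psi_l^n = a_l^n + i\, b_l^n$   (12)

we have the two real equations

$-\frac{b_l^{n+1} - b_l^n}{\Delta t} + \frac{a_{l+1}^{n+1} - 2a_l^{n+1} + a_{l-1}^{n+1}}{2\Delta x^2} + \frac{a_{l+1}^{n} - 2a_l^{n} + a_{l-1}^{n}}{2\Delta x^2} + \frac{(a_l^{n+1})^2 + (b_l^{n+1})^2 + (a_l^n)^2 + (b_l^n)^2}{2}\,(a_l^{n+1} + a_l^n) = 0,$   (13)

$\frac{a_l^{n+1} - a_l^n}{\Delta t} + \frac{b_{l+1}^{n+1} - 2b_l^{n+1} + b_{l-1}^{n+1}}{2\Delta x^2} + \frac{b_{l+1}^{n} - 2b_l^{n} + b_{l-1}^{n}}{2\Delta x^2} + \frac{(a_l^{n+1})^2 + (b_l^{n+1})^2 + (a_l^n)^2 + (b_l^n)^2}{2}\,(b_l^{n+1} + b_l^n) = 0.$   (14)

Let us denote, for the sake of simplicity, these two equations as

$F_{1l}(a_{l-1}^{n+1}, a_l^{n+1}, a_{l+1}^{n+1}; b_l^{n+1}) = 0,$   (15)

$F_{2l}(a_l^{n+1}; b_{l-1}^{n+1}, b_l^{n+1}, b_{l+1}^{n+1}) = 0,$   (16)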
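The residual of the scheme at one time step can be evaluated in a few lines. The following sketch assumes the scheme in the form of (7), that is, a time difference, the average of the two second differences and the averaged cubic term, with vanishing boundary values as in (8); the function name and the interleaved output layout $(F_{11}, F_{21}, F_{12}, F_{22}, \ldots)$ are illustrative choices.

```python
import numpy as np

def residual(psi_new, psi_old, dx, dt):
    """Real residual vector of scheme (7) at the interior nodes l = 1..L-1.

    psi_new and psi_old are complex arrays of length L+1 (nodes 0..L).
    The real part of each complex residual is F_1l, the imaginary part F_2l.
    """
    def second_diff(p):
        return (p[2:] - 2.0 * p[1:-1] + p[:-2]) / dx**2

    mid_new, mid_old = psi_new[1:-1], psi_old[1:-1]
    g = (1j * (mid_new - mid_old) / dt
         + 0.5 * (second_diff(psi_new) + second_diff(psi_old))
         + 0.5 * (np.abs(mid_new) ** 2 + np.abs(mid_old) ** 2) * (mid_new + mid_old))
    F = np.empty(2 * g.size)
    F[0::2] = g.real   # F_1l, equation (13)
    F[1::2] = g.imag   # F_2l, equation (14)
    return F

# Consistency check with psi = sech(x - 20) * exp(i t), an exact solution of (1):
# inserting it into the scheme should leave only a small truncation error.
dx, dt = 0.1, 0.02
x = np.arange(601) * dx
exact = lambda t: np.exp(1j * t) / np.cosh(x - 20.0)
F_exact = residual(exact(dt), exact(0.0), dx, dt)
max_residual = np.max(np.abs(F_exact))
```

The value of `max_residual` is of the order of the truncation error, not zero, because the exact solution satisfies the differential equation and not the difference equation.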
where we have considered the variables at temporal step $n$ as parameters. We want to solve this system of $2(L-1)$ equations with a Newton or Quasi-Newton method. Let us construct the Jacobian of this nonlinear system: if we order our variables at time step $n+1$ as

$(\ldots, a_{l-1}^{n+1}, b_{l-1}^{n+1}, a_l^{n+1}, b_l^{n+1}, a_{l+1}^{n+1}, b_{l+1}^{n+1}, \ldots)$

with $l = 1, 2, \ldots, L-1$, and the equations as

$F_{11}, F_{21}, F_{12}, F_{22}, \ldots, F_{1,L-1}, F_{2,L-1}$

we get a Jacobian which is a 5-diagonal matrix that has the following tridiagonal form using 2 × 2 blocks:
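$J = \begin{pmatrix} M_1 & D & 0 & \cdots & 0 \\ D & M_2 & D & \ddots & \vdots \\ 0 & D & M_3 & \ddots & 0 \\ \vdots & \ddots & \ddots & \ddots & D \\ 0 & \cdots & 0 & D & M_{L-1} \end{pmatrix}$   (17)

where $M_l$ are full matrices and $D$ is a diagonal matrix:

$M_l = \begin{pmatrix} -\frac{1}{\Delta x^2} + \frac{f_l}{2} + a_l^{n+1}(a_l^{n+1} + a_l^n) & -\frac{1}{\Delta t} + b_l^{n+1}(a_l^{n+1} + a_l^n) \\ \frac{1}{\Delta t} + a_l^{n+1}(b_l^{n+1} + b_l^n) & -\frac{1}{\Delta x^2} + \frac{f_l}{2} + b_l^{n+1}(b_l^{n+1} + b_l^n) \end{pmatrix}$   (18)

with $f_l = (a_l^{n+1})^2 + (b_l^{n+1})^2 + (a_l^n)^2 + (b_l^n)^2$, and

$D = \begin{pmatrix} \frac{1}{2\Delta x^2} & 0 \\ 0 & \frac{1}{2\Delta x^2} \end{pmatrix}.$   (19)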
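The block structure of the Jacobian can be checked numerically against finite differences. The sketch below is an illustration with hypothetical function names and a dense matrix (a production code would use sparse storage); the block entries are obtained by differentiating the real and imaginary parts of scheme (7) with respect to the unknowns $a_l^{n+1}$, $b_l^{n+1}$, with zero boundary values assumed.

```python
import numpy as np

def residual(u, a_old, b_old, dx, dt):
    """F(u) for u = (a_1, b_1, ..., a_{L-1}, b_{L-1}) at the new time level."""
    a, b = u[0::2], u[1::2]
    pad = lambda v: np.concatenate(([0.0], v, [0.0]))          # zero boundary values
    lap = lambda v: (pad(v)[2:] - 2.0 * v + pad(v)[:-2]) / (2.0 * dx**2)
    f = a**2 + b**2 + a_old**2 + b_old**2
    F = np.empty_like(u)
    F[0::2] = -(b - b_old) / dt + lap(a) + lap(a_old) + 0.5 * f * (a + a_old)
    F[1::2] = (a - a_old) / dt + lap(b) + lap(b_old) + 0.5 * f * (b + b_old)
    return F

def jacobian(u, a_old, b_old, dx, dt):
    """Dense Jacobian with 2x2 diagonal blocks M_l and off-diagonal blocks D."""
    a, b = u[0::2], u[1::2]
    m = a.size
    f = a**2 + b**2 + a_old**2 + b_old**2
    off = 1.0 / (2.0 * dx**2)
    J = np.zeros((2 * m, 2 * m))
    for l in range(m):
        i = 2 * l
        J[i, i] = -1.0 / dx**2 + 0.5 * f[l] + a[l] * (a[l] + a_old[l])
        J[i, i + 1] = -1.0 / dt + b[l] * (a[l] + a_old[l])
        J[i + 1, i] = 1.0 / dt + a[l] * (b[l] + b_old[l])
        J[i + 1, i + 1] = -1.0 / dx**2 + 0.5 * f[l] + b[l] * (b[l] + b_old[l])
        if l > 0:                       # block D below the diagonal
            J[i, i - 2] = off
            J[i + 1, i - 1] = off
        if l < m - 1:                   # block D above the diagonal
            J[i, i + 2] = off
            J[i + 1, i + 3] = off
    return J

def fd_jacobian(u, a_old, b_old, dx, dt, h=1e-6):
    """Central finite-difference approximation, used only as a cross-check."""
    n = u.size
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (residual(u + e, a_old, b_old, dx, dt)
                   - residual(u - e, a_old, b_old, dx, dt)) / (2.0 * h)
    return J

rng = np.random.default_rng(0)
m, dx, dt = 6, 0.1, 0.02
u = rng.normal(size=2 * m)
a_old, b_old = rng.normal(size=m), rng.normal(size=m)
J = jacobian(u, a_old, b_old, dx, dt)
max_err = np.max(np.abs(J - fd_jacobian(u, a_old, b_old, dx, dt)))
```

The nonzero entries lie on the five diagonals with offsets $-2, \ldots, 2$, which is what makes banded or sparse LU factorization attractive for this system.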
3 Algorithms for Solving Nonlinear Systems of Equations

3.1 Nonlinear Systems of Equations
Given $F : \mathbb{R}^n \to \mathbb{R}^n$, $F = (f_1(u), \ldots, f_n(u))^T$, if some $f_j$ is a nonlinear function of $u$ we say that

$F(u) = 0$   (20)

is a nonlinear system of equations. Our aim is to find solutions of (20). In the specific case of the numerical scheme defined in the previous section, $u$ corresponds to the solution at some given time step, say $n$:

$u = (a_1^n, b_1^n, \ldots, a_l^n, b_l^n, \ldots, a_{L-1}^n, b_{L-1}^n),$   (21)

while the nonlinear function $F$ is:

$F = (F_{11}, F_{21}, F_{12}, F_{22}, \ldots, F_{1,L-1}, F_{2,L-1}).$   (22)
Figure 2: Collision of two solitons
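We assume that $F$ is well defined and has continuous partial derivatives on an open set of $\mathbb{R}^n$. We denote by $J(u)$ the matrix of partial derivatives of $F$ (Jacobian matrix):

$J(u) = F'(u) = \begin{pmatrix} \frac{\partial f_1}{\partial u_1}(u) & \cdots & \frac{\partial f_1}{\partial u_n}(u) \\ \vdots & & \vdots \\ \frac{\partial f_n}{\partial u_1}(u) & \cdots & \frac{\partial f_n}{\partial u_n}(u) \end{pmatrix}.$   (23)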
We are mostly interested in problems where $n$ is large and $J(u)$ is structurally sparse. This means that most entries of $J(u)$ are zero for all $u$ in the domain of $F$. Sparsity is a particular case of the more general notion of structure. Jacobian matrices can be symmetric, antisymmetric, positive definite, combinations of other matrices with some particular structure, etc.

Most popular methods for solving nonlinear systems are local. A local method is an iterative scheme that converges if the initial approximation is close enough to a particular solution. Frequently, we are also able to prove rate of convergence results for these methods, which tell something about the asymptotic velocity of convergence of the process. Fortunately, in many practical cases the domain of convergence of local methods is large, so that these methods are useful. However, when the initial estimate of the solution is very poor, local methods must be modified in order to improve their global convergence properties. In the following subsections we survey local Newton-type methods, namely Newton's method and Quasi-Newton methods.

3.2 Newton's Method
Newton's method is the most widely used algorithm for solving nonlinear systems of equations. Given an initial estimate $u^0$ of the solution of (20), this method considers, at each iteration, the approximation

$F(u) \approx L_k(u) = F(u^k) + J(u^k)(u - u^k)$   (24)

and computes $u^{k+1}$ as a solution of the linear system $L_k(u) = 0$. This solution exists and is unique if $J(u^k)$ is nonsingular. Therefore, an iteration of Newton's method is described by

$J(u^k)\, s^k = -F(u^k),$   (25)

$u^{k+1} = u^k + s^k.$   (26)
At each iteration of Newton's method, we must compute the Jacobian $J(u^k)$ and solve the linear system (25). Quadratic convergence is the most attractive property of Newton's method. If, instead of the actual Jacobian in (24), which is generally expensive to calculate, we use an approximation of $J(u^k)$ by differences, we obtain the Finite-Difference Newton's Method, whose convergence properties are very similar to those of Newton's method.

Now, (25) is a linear system of equations. If $n$ is small, this system can be solved using the LU factorization with partial pivoting or the QR factorization (see Golub et al. [8]). Using these linear solvers, the cost of solving (25) is $O(n^3)$ floating point operations. If $n$ is large this cost becomes prohibitive. However, in many situations where the matrix $J(u^k)$ is sparse, we can still solve (25) using LU factorizations. In fact, the structure of the matrix is often such that the factors $L$ and $U$ of its factorization are also sparse, and can be computed using a moderate number of operations. Computer algorithms for sparse LU factorizations are surveyed in Duff et al. [5]. In Gomes-Ruggiero et al. [10] the first version of the NIGHTINGALE package for solving sparse nonlinear systems is described. In NIGHTINGALE, we use the sparse linear solver of George et al. [7].
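Iteration (25)-(26) translates almost literally into code. The following is a minimal dense sketch using `numpy.linalg.solve`, not the sparse NIGHTINGALE solver mentioned above; the function name `newton`, the stopping rule on the infinity norm of the residual and the two-equation test system are illustrative assumptions.

```python
import numpy as np

def newton(F, J, u0, tol=1e-12, max_iter=50):
    """Newton's method: solve J(u^k) s^k = -F(u^k), then u^{k+1} = u^k + s^k."""
    u = np.array(u0, dtype=float)
    for k in range(max_iter):
        Fu = F(u)
        if np.linalg.norm(Fu, np.inf) < tol:   # stop when ||F(u^k)||_inf < tol
            return u, k
        s = np.linalg.solve(J(u), -Fu)         # linear system (25)
        u = u + s                              # update (26)
    raise RuntimeError("Newton iteration did not converge")

# Toy system (not from the paper): x^2 + y^2 = 4, x*y = 1.
F = lambda u: np.array([u[0] ** 2 + u[1] ** 2 - 4.0, u[0] * u[1] - 1.0])
J = lambda u: np.array([[2.0 * u[0], 2.0 * u[1]], [u[1], u[0]]])
root, iters = newton(F, J, [2.0, 0.5])
```

The exact root near the starting point is $((\sqrt{6} + \sqrt{2})/2, (\sqrt{6} - \sqrt{2})/2)$, and the quadratic convergence shows up as a handful of iterations from a nearby seed.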
The George-Ng method performs the LU factorization with partial pivoting of a sparse matrix $A$ using a static data structure defined before beginning the numerical computations. In Newton's method we solve a sequence of linear systems with the same structure, so the symbolic phase that defines the data structure is executed only once.

The system (25) has a unique solution if and only if $J(u^k)$ is nonsingular. If the Jacobian is singular, the iteration must be modified. Moreover, if $J(u^k)$ is nearly singular, it is also convenient to modify the iteration in order to prevent numerical instability. Many modifications are possible to keep this phenomenon under control. In the NIGHTINGALE package, when a very small pivot (relative to the size of the matrix) occurs, it is replaced by a nonzero scalar whose modulus is sufficiently large (NIGHTINGALE has other ways to cope with convergence problems, but they were unnecessary in our case). Moreover, nearly singular or ill-conditioned matrices usually cause very large increments $s^k$, so $\|s^k\|$ must also be controlled.

The main convergence results relative to Newton's method are found in Ortega et al. [20], Dennis et al. [4], etc.

3.3 Quasi-Newton Methods
In this survey, we call Quasi-Newton methods those methods for solving (20) whose general form is

$u^{k+1} = u^k - B_k^{-1} F(u^k).$   (27)

Newton's method, studied in the last section, belongs to this family. Most Quasi-Newton methods use less expensive iterations than Newton's, but their convergence properties are not very different. In general, Quasi-Newton methods avoid either the necessity of computing derivatives, or the necessity of solving a full linear system per iteration, or both.

The simplest Quasi-Newton method is the Modified Newton's method or Stationary Newton's Method, where $B_k = J(u^0)$ for all $k \in \mathbb{N}$. In this method, derivatives are computed at the initial point and we only need the LU factorization of $J(u^0)$. A variation of this method is the Stationary Newton's Method with restarts, where $B_k = J(u^k)$ if $k$ is a multiple of a fixed integer $m$ and $B_k = B_{k-1}$ otherwise. The number of iterations used by this method tends to increase with $m$, but the average computer time per iteration decreases. In some situations we can determine an optimal choice for $m$.

An obvious drawback of the stationary Newton's methods is that, except when $k \equiv 0 \pmod m$, $B_k$ does not incorporate information about $u^k$ and $F(u^k)$. Therefore, the adequacy of the model $L_k(u) = F(u^k) + B_k(u - u^k)$ to the real function $F(u)$ can decrease rapidly as $k$ grows. Observe that, due to (27), in Quasi-Newton methods $u^{k+1}$ is defined as the solution of $L_k(u) = 0$, which exists and is unique if $B_k$ is nonsingular.

Figure 3: Bound state of three solitons

One way to incorporate new information about $F$ in the linear model is to impose the interpolating conditions

$L_{k+1}(u^k) = F(u^k),$   (28)

$L_{k+1}(u^{k+1}) = F(u^{k+1}).$   (29)

Defining

$y^k = F(u^{k+1}) - F(u^k)$   (30)

and subtracting (28) from (29) we obtain the Secant Equation

$B_{k+1} s^k = y^k.$   (31)

Reciprocally, if $B_{k+1}$ satisfies (31), $L_{k+1}$ interpolates $F$ at $u^k$ and $u^{k+1}$. We give the name Secant Methods to the family of Quasi-Newton methods based on (27) and (31).
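The Stationary Newton's Method with restarts described above differs from Newton's method only in how often the Jacobian is refreshed. The sketch below is an illustration under simplifying assumptions: the function name and the toy system are hypothetical, and the stored matrix is passed to `numpy.linalg.solve` at every step, whereas a practical code would factorize it once per restart and reuse the LU factors.

```python
import numpy as np

def stationary_newton(F, J, u0, m=3, tol=1e-12, max_iter=200):
    """Iteration (27) with B_k = J(u^k) when k is a multiple of m, else B_k = B_{k-1}."""
    u = np.array(u0, dtype=float)
    B = None
    for k in range(max_iter):
        Fu = F(u)
        if np.linalg.norm(Fu, np.inf) < tol:
            return u, k
        if k % m == 0:
            B = J(u)                      # restart: refresh the Jacobian
        u = u - np.linalg.solve(B, Fu)    # u^{k+1} = u^k - B_k^{-1} F(u^k)
    raise RuntimeError("stationary Newton iteration did not converge")

# Toy system (not from the paper): x^2 + y^2 = 4, x*y = 1.
F = lambda u: np.array([u[0] ** 2 + u[1] ** 2 - 4.0, u[0] * u[1] - 1.0])
J = lambda u: np.array([[2.0 * u[0], 2.0 * u[1]], [u[1], u[0]]])
root, iters = stationary_newton(F, J, [2.0, 0.5], m=3)
```

Setting `m=1` recovers Newton's method, while a very large `m` gives the Modified Newton's method with $B_k = J(u^0)$ throughout.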
If $n \ge 2$, there exist infinitely many possible choices of $B_{k+1}$ satisfying (31). If, in addition to (31), we impose that

$B_{k+1} s^{k-j} = y^{k-j}, \quad j = 1, \ldots, n-1,$   (32)

we obtain the Sequential Secant Method [23,9,15]. If the set of increments $\{s^k, s^{k-1}, \ldots, s^{k-n+1}\}$ is linearly independent there exists only one matrix $B_{k+1}$ that satisfies (32). In this case,

$B_{k+1} = (y^k, y^{k-1}, \ldots, y^{k-n+1})(s^k, s^{k-1}, \ldots, s^{k-n+1})^{-1}$   (33)

and

$B_{k+1}^{-1} = (s^k, s^{k-1}, \ldots, s^{k-n+1})(y^k, y^{k-1}, \ldots, y^{k-n+1})^{-1}.$   (34)

$B_{k+1}^{-1}$ can be obtained from $B_k^{-1}$ using $O(n^2)$ floating point operations. However, in order to ensure numerical stability, the definition of the increments $s^j$ that appear in (33) and (34) must sometimes be modified. When these modifications are not necessary, the Sequential Secant Method has the following interpolating property:

$L_{k+1}(u^{k+j}) = F(u^{k+j}), \quad j = 0, -1, \ldots, -n.$   (35)
The Sequential Secant Method is useful in many situations where $n$ is small. When $n$ is not very small, it is not worthwhile to waste time trying to preserve the interpolating condition (35) for $j \approx -n$. It is more profitable to maintain the Secant Equation (31), using the degrees of freedom inherent to this equation to guarantee numerical stability. Broyden's "good" method [2] and the Column Updating Method (COLUM) [19] are two examples of this idea. In both methods

$B_{k+1} = B_k + \frac{(y^k - B_k s^k)(z^k)^T}{(z^k)^T s^k},$   (36)

where

$z^k = s^k$   (37)

for Broyden's method and

$z^k = e^{j_k},$   (38)

$|(e^{j_k})^T s^k| = \|s^k\|_\infty$   (39)

for COLUM. Here $\{e^1, \ldots, e^n\}$ is the canonical basis of $\mathbb{R}^n$. Applying the Sherman-Morrison formula [8] to (36) we obtain

$B_{k+1}^{-1} = B_k^{-1} + \frac{(s^k - B_k^{-1} y^k)(z^k)^T B_k^{-1}}{(z^k)^T B_k^{-1} y^k}.$   (40)

Formula (40) shows that $B_{k+1}^{-1}$ can be obtained from $B_k^{-1}$ using $O(n^2)$ floating point operations in the dense case. Moreover,

$B_{k+1}^{-1} = \left(I + v^k (z^k)^T\right) B_k^{-1},$   (41)

where $v^k = (s^k - B_k^{-1} y^k) / \left((z^k)^T B_k^{-1} y^k\right)$, so

$B_k^{-1} = \left(I + v^{k-1}(z^{k-1})^T\right) \cdots \left(I + v^0 (z^0)^T\right) B_0^{-1}$   (42)

for $k = 1, 2, 3, \ldots$
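The inverse update in the form $(I + v^k (z^k)^T) B_k^{-1}$ can be sketched for both choices of $z^k$. The code below is an illustration under simplifying assumptions: it stores the approximate inverse as a dense matrix instead of the vector pairs of (42), the function names are hypothetical, and the test system is a toy example, not the discretized NLS.

```python
import numpy as np

def choose_z(s, method):
    """z^k = s^k for Broyden; z^k = e^{j_k} with |s_{j_k}| = ||s^k||_inf for COLUM."""
    if method == "broyden":
        return s.copy()
    z = np.zeros_like(s)
    z[np.argmax(np.abs(s))] = 1.0
    return z

def inverse_update(H, s, y, z):
    """Return (I + v z^T) H with v = (s - H y) / (z^T H y), H standing for B_k^{-1}."""
    Hy = H @ y
    v = (s - Hy) / (z @ Hy)
    return H + np.outer(v, z @ H)

def quasi_newton(F, H0, u0, method="broyden", tol=1e-10, max_iter=100):
    """Iteration (27) with the inverse secant update applied after every step."""
    u = np.array(u0, dtype=float)
    H = np.array(H0, dtype=float)
    Fu = F(u)
    for k in range(max_iter):
        if np.linalg.norm(Fu, np.inf) < tol:
            return u, k
        s = -H @ Fu                              # s^k = -B_k^{-1} F(u^k)
        F_next = F(u + s)
        H = inverse_update(H, s, F_next - Fu, choose_z(s, method))
        u, Fu = u + s, F_next
    raise RuntimeError("quasi-Newton iteration did not converge")

# Toy system (not from the paper): x^2 + y^2 = 4, x*y = 1, seeded with the exact initial Jacobian.
F = lambda u: np.array([u[0] ** 2 + u[1] ** 2 - 4.0, u[0] * u[1] - 1.0])
H0 = np.linalg.inv(np.array([[4.0, 1.0], [0.5, 2.0]]))   # inverse of J at (2, 0.5)
root, iters = quasi_newton(F, H0, [2.0, 0.5], method="broyden")

# Both choices of z reproduce the secant equation in inverse form: H_new y = s.
s_test, y_test = np.array([0.3, -0.1]), np.array([1.0, 0.4])
secant_err = max(
    np.max(np.abs(inverse_update(H0, s_test, y_test, choose_z(s_test, mth)) @ y_test - s_test))
    for mth in ("broyden", "colum")
)
```

Storing the vectors $v^j$, $z^j$ and applying the factors one after another, as in (42), avoids the dense $n \times n$ matrix and is the variant suited to large $n$.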
Formula (42) is used when $n$ is large. In this case, the vectors $v^0, z^0, \ldots, v^{k-1}, z^{k-1}$ are stored and the product $B_k^{-1} F(u^k)$ is computed using (42). In this way, the computer time of iteration $k$ is $O(kn)$ plus the computer time of computing $B_0^{-1} F(u^k)$. If $k$ is large the process must be periodically restarted taking $B_k \approx J(u^k)$.

Local superlinear convergence results also hold for suitable implementations of the Sequential Secant Method and its variations. Under slightly stronger assumptions, we can prove stronger convergence results for these methods.

4 Numerical simulations
We developed a very portable code for this problem, using FORTRAN 77 (double precision), and ran it on a Silicon Graphics Oxigen-2 workstation. As mentioned before, we simulated three different cases:

• One Soliton Case: we have chosen (4) with $a = 2$, $\eta = 0.5$, $\chi = 0.5$, $x_0 = 20.0$; $x \in [0,60]$, $t \in [0,15]$. The mesh parameters are $\Delta x = 0.1$ and $\Delta t = 0.02$. The result is plotted in Fig. 1. The soliton is stable and travels with constant speed.

• Two Solitons Case: we take the collision of two solitons (5) with $a = 2$, $\eta_1 = \eta_2 = 0.5$, $\chi_1 = 0.25$, $\chi_2 = 0.025$, $x_1 = 20.0$, $x_2 = 45.0$; $x \in [0,100]$, $t \in [0,60]$. The mesh parameters are $\Delta x = 0.25$ and $\Delta t = 0.125$. We have plotted the solution in Fig. 2, and we see the collision and the shift in the position of the otherwise undisturbed solitons.

• Three Solitons Case: we take the bound state of three solitons (6) with $a = 18$, $x_1 = 20.0$; $x \in [0,40]$, $t \in [0,2.5]$. The mesh parameters are $\Delta x = 0.03125$ and $\Delta t = 0.00625$. The solution is plotted in Fig. 3 (where only the significant part of the spatial interval is shown) and we can see the local fluctuations.

Using these three cases, we have tested four different methods: Newton's (as a reference for all the rest), Modified Newton's, Broyden and COLUM. All four methods are iterative and for all of them the sequence $\{u^k\}_{k=0}^{\infty}$ that at each time step converges to the solution must be truncated: the iterations stop when the infinity-norm of the residual value $\|F(u^k)\|_\infty$ is less than some tolerance parameter Tol. In order to fix the value of Tol we have used the conservation law for the discrete charge (9) as an indication of the accuracy of the solution: ideally the charge is constant at each time step, but due to the truncation and other possible numerical errors, there is a difference or error with respect to the initial value. We have taken Tol $= 10^{-15}$ in general; however, in the Three Solitons case, Tol $= 10^{-10}$ for Newton's method: with this tolerance the solution is of the same precision as the solution found by the other methods.

Table 1: Newton's method for the One Soliton case

Evolution time   Number of iterations   CPU time    Residual value   ΔQ
 0.02            6                      0.482E-01   0.937E-15        0.200E-14
 1.00            7                      0.562E-01   0.696E-15        0.444E-15
 2.00            6                      0.481E-01   0.760E-15        0.444E-15
 3.00            6                      0.481E-01   0.715E-15        0.444E-15
 4.00            7                      0.557E-01   0.659E-15        0.266E-14
 5.00            6                      0.478E-01   0.793E-15        0.222E-14
 6.00            7                      0.557E-01   0.708E-15        0.377E-14
 7.00            6                      0.482E-01   0.816E-15        0.133E-14
 8.00            6                      0.479E-01   0.767E-15        0.888E-15
 9.00            6                      0.477E-01   0.774E-15        0.444E-14
10.00            7                      0.560E-01   0.902E-15        0.200E-14
11.00            6                      0.479E-01   0.940E-15        0.888E-15
12.00            6                      0.478E-01   0.604E-15        0.111E-14
13.00            6                      0.478E-01   0.687E-15        0.666E-15
14.00            6                      0.478E-01   0.618E-15        0.200E-14
15.00            6                      0.478E-01   0.593E-15        0.333E-14
In Table 1, we summarize the performance of Newton's Method for the One Soliton case. For some time steps, labelled by the corresponding time value, we show the computational cost: the number of iterations of the method and the CPU time invested. We also present two values related to the quality of the solution: the infinity-norm of the residual value and the error of the charge (ΔQ) mentioned before. As we see, every time step is similar in cost and precision. The same happens if we consider the other methods and the Two Solitons case.

In Table 2 we present the performance of Newton's Method, now for the Three Solitons case. The time values correspond to those plotted in Fig. 3. As we see, whenever the solution is smooth, few iterations are needed. When large fluctuations appear, the number of iterations necessary to obtain a solution of the same precision increases significantly. This feature also appears for the other three methods.

Table 2: Newton's method for the Three Solitons case

Evolution time   Number of iterations   CPU time    Residual value   ΔQ
6.25E-3           3                     0.541E-01   0.851E-10        0.133E-14
0.125             8                     0.143E+00   0.108E-10        0.244E-14
0.375             7                     0.125E+00   0.118E-10        0.155E-14
0.500             6                     0.107E+00   0.705E-10        0.155E-14
0.625            10                     0.178E+00   0.233E-10        0.888E-15
0.750             4                     0.718E-01   0.325E-10        0.888E-15
0.875             6                     0.230E+00   0.985E-11        0.133E-14
1.000             9                     0.160E+00   0.829E-10        0.355E-14
1.125             7                     0.125E+00   0.122E-10        0.222E-15
1.250             6                     0.107E+00   0.510E-10        0.133E-14
1.375            11                     0.196E+00   0.308E-10        0.488E-14
1.500             5                     0.895E-01   0.239E-10        0.666E-15
1.625             4                     0.716E-01   0.873E-10        0.666E-15
1.750            11                     0.195E+00   0.409E-10        0.155E-14
1.875             6                     0.107E+00   0.368E-10        0.222E-14
2.000             7                     0.248E+00   0.131E-10        0.444E-14
2.125             8                     0.143E+00   0.173E-10        0.222E-15
2.250             6                     0.107E+00   0.911E-10        0.133E-14
2.375             4                     0.710E-01   0.359E-10        0.666E-15
2.500             8                     0.142E+00   0.120E-10        0.444E-15

The comparison of all four methods is summed up in Table 3: for all four methods, all three cases and the whole integration process, we present the accumulated values of the CPU time, the total number of iterations of the method, the average local computational cost (cpu_loc) and the accumulated error in the charge (ΔQ). The average local computational cost cpu_loc is the accumulated CPU time divided by the total number of unknowns $a_l^n$ and $b_l^n$, and represents the average computational cost invested to obtain the solution at one point of the discrete space-time mesh. We see that in all circumstances Quasi-Newton methods perform better than Newton's method: they are faster and require less computational effort. Although the Modified Newton's method does more iterations than any other method, it is the fastest and the one that invests the least effort per point. In the Three Solitons case, Newton's method needs fewer iterations than the other methods (due to the fact that its Tol is 5 orders of magnitude bigger), but it is still the slowest and the one that invests the most effort per point.
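The definition of cpu_loc can be checked against the mesh parameters of the three cases. The short sketch below assumes that the total number of unknowns is $2(L-1)N$, that is, two real unknowns per interior node and per time step; this count is an assumption, not a formula stated in the text. The accumulated CPU times are the rounded values reported for Newton's method in Table 3.

```python
def unknown_count(x_length, t_length, dx, dt):
    """Assumed count: 2 real unknowns at each of the L-1 interior nodes, for N time steps."""
    L = round(x_length / dx)
    N = round(t_length / dt)
    return 2 * (L - 1) * N

# (interval length, final time, dx, dt, accumulated CPU time, printed cpu_loc) for Newton's method
cases = {
    "One Soliton":    (60.0, 15.0, 0.1, 0.02, 42.0, 4.67e-05),
    "Two Solitons":   (100.0, 60.0, 0.25, 0.125, 32.0, 8.35e-05),
    "Three Solitons": (40.0, 2.5, 0.03125, 0.00625, 30.0, 2.93e-05),
}

cpu_loc = {
    name: cpu / unknown_count(x_len, t_len, dx, dt)
    for name, (x_len, t_len, dx, dt, cpu, _) in cases.items()
}
```

Under this assumption the computed ratios agree with the printed cpu_loc values to within the rounding of the CPU times.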
Table 3: Comparison of the four methods for the three cases

Case             Quantity    Newton's    Modified    Broyden     COLUM
One Soliton      CPU         42.         17.         21.         19.
                 No. Iter.   4649        4682        4610        4632
                 ΔQ          .11E-11     .10E-11     .82E-12     .15E-11
                 cpu_loc     4.67E-05    1.89E-05    2.33E-05    2.11E-05
Two Solitons     CPU         32.         12.         15.         13.
                 No. Iter.   5431        5707        4870        4865
                 ΔQ          .24E-11     .86E-12     .22E-11     .14E-11
                 cpu_loc     8.35E-05    3.13E-05    3.91E-05    3.39E-05
Three Solitons   CPU         30.         21.         26.         24.
                 No. Iter.   1470        2718        2502        2483
                 ΔQ          .19E-11     .28E-11     .93E-12     .59E-12
                 cpu_loc     2.93E-05    2.05E-05    2.54E-05    2.34E-05

5 Conclusions
The tests performed show that Newton-type methods and fully implicit schemes are an efficient way to simulate the nonlinear Schrödinger equation. All three Quasi-Newton methods performed better than Newton's method (used as a reference). Among them, the Modified Newton's method gives the best results. Broyden's method and COLUM perform similarly, although COLUM gives a slightly better performance.

Acknowledgments

We are grateful to the Optimization Group of Campinas, Department of Applied Mathematics of IMECC - UNICAMP, Brazil, for the use of the software NIGHTINGALE. L. Vazquez and S. Jimenez have been partially supported by the Comisión Interministerial de Ciencia y Tecnología of Spain under grant PB95-0426. D. N. Kozakevich has been supported by CAPES, Brazil, under grant BEX098/598-23.

References

1. J.G.P. Barnes, Computer Journal 8, pp. 66-72 (1965).
2. C.G. Broyden, Mathematics of Computation 19, pp. 577-593 (1965).
3. M. Delfour, M. Fortin, G. Payre, Journal of Comp. Phys. 44, pp. 277-288 (1981).
4. J.E. Dennis Jr., R.B. Schnabel, SIAM Review 21, pp. 443-459 (1979).
5. I.S. Duff, A.M. Erisman, J.K. Reid, Direct Methods for Sparse Matrices, Oxford Scientific Publications (1992).
6. Z. Fei, V.M. Perez Garcia, L. Vazquez, Applied Mathematics and Computation 71, pp. 165-177 (1995).
7. A. George, E. Ng, SIAM Journal on Scientific and Statistical Computing 8, pp. 877-898 (1987).
8. G.H. Golub, Ch.F. Van Loan, Matrix Computations, The Johns Hopkins University Press, Baltimore and London (1989).
9. W.B. Gragg, G.W. Stewart, SIAM Journal on Numerical Analysis 13, pp. 127-140 (1976).
10. M.A. Gomes-Ruggiero, J.M. Martinez, A.C. Moretti, SIAM Journal on Scientific and Statistical Computing 13, pp. 459-483 (1992).
11. M.A. Gomes-Ruggiero, D.N. Kozakevich, J.M. Martinez, Computers and Mathematics with Applications 32, pp. 1-13 (1996).
12. S. Jimenez, App. Math. and Comput. 64, pp. 13-45 (1994).
13. Y.S. Kivshar, B.A. Malomed, Rev. Mod. Phys. 61, 763 (1989).
14. D.N. Kozakevich, Doctoral Thesis, "Numerical Solution of Nonlinear Systems in Physics and Engineering", DMA, IME, UNICAMP, Campinas, SP, Brazil (1995).
15. J.M. Martinez, Mathematics of Computation, pp. 457-481 (1992).
16. J.M. Martinez, Mathematics of Computation 60, pp. 681-698 (1993).
17. J.M. Martinez, On the Convergence of the Column-Updating Methods, Technical Report, Department of Applied Mathematics, University of Campinas (1992).
18. J.M. Martinez, SIAM J. of Num. Anal. 29, pp. 1413-1434 (1992).
19. J.M. Martinez, Computing 33, pp. 353-362 (1984).
20. J.M. Ortega, W.G. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, NY (1970).
21. D. Potter, Computational Physics, John Wiley & Sons, NY (1977).
22. J.M. Sanz-Serna, J.G. Verwer, IMA J. of Num. Anal. 6, pp. 25-42 (1986).
23. P. Wolfe, Communications ACM 12, pp. 12-13 (1959).
24. V.E. Zakharov, A.B. Shabat, Sov. Phys. JETP 34, pp. 62-69 (1972).
SPINOR DESCRIPTION OF A GENERAL SPIN-J SYSTEM

V. R. VIEIRA, P. D. SACRAMENTO
Centro de Física das Interacções Fundamentais, Instituto Superior Técnico, Av. Rovisco Pais, 1096 Lisboa Codex, Portugal

We consider a spin coherent states description of a general quantum spin system. It is shown that it is possible to use the spin-1/2 representation to study the general spin-J case. We identify the 1/2 spinor components as the homogeneous coordinates of the projective space associated to the complex variable that labels the coherent states and establish a relation between the two-component spinor and the bosonic Schwinger representation of a spin operator. We rewrite the equations of motion, obtained from the path integral for the evolution operator or partition function, in terms of the 1/2 spinor and define the effective Hamiltonian of its evolution.
1 Introduction
Coherent states provide a convenient way to represent quantum operators in terms of c-number functions. They have been used extensively in the literature in various fields like quantum optics, condensed matter and quantum field theory [1]. They are closest to a classical description in the sense that they minimize the Heisenberg uncertainty relations. In particular, spin coherent states have proved useful to deal with spin systems [2]. Since the algebra of the spin operators is more involved than that for boson or fermion operators, the construction of the spin coherent states is somewhat more complicated. Recently, we have pursued the similarities to the boson coherent states and have shown that it is possible to establish a closer parallelism than previously realized. In particular, we obtained generalized representatives for spin operators [3,4] extending results obtained previously for bosons [5]. Also, we considered the path integral for the transition amplitude and the partition function [6] and considered quantum Monte Carlo algorithms using coherent states [7]. The handling of spin operators is traditionally a difficult problem implying, in general, unphysical states that have to be projected out. There are methods which do not include extra states [8,9] but they are not easily generalized to spin values greater than one. Another advantage of the spin coherent states is that the spin value J enters only as a parameter and therefore all spin values can be handled similarly. The price to pay is a description in terms of continuous variables instead of a discrete number of states. The spin coherent states are therefore a convenient basis to perform a semi-classical expansion for large values of the spin [10] and they appear naturally in the quantization of a semi-classical spin theory [11,12]. The fact that J is only a parameter in the formalism suggests that one can reformulate the spin-J case in terms of the J = 1/2 case. In this paper we emphasize that indeed most results obtained for a general J-value are simply related to the same expression for J = 1/2, which is in general simpler to obtain.

2 Spinor versus stereographic projection descriptions
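The spin coherent states can be defined by
$$|\alpha\rangle = R(\theta,\varphi)\,|JJ\rangle = \frac{1}{(1+|\alpha|^2)^{J}}\,e^{\alpha J_-}\,|JJ\rangle, \qquad (1)$$
where $R(\theta,\varphi)$ is a rotation through an angle $\theta$ about the axis $\vec n = (\sin\varphi, -\cos\varphi, 0)$, normal to the z-axis and to the vector $\vec m = (\sin\theta\cos\varphi, \sin\theta\sin\varphi, \cos\theta)$.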
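The overlap of two coherent states is
$$\langle\alpha'|\alpha\rangle = \frac{(1+\alpha'^*\alpha)^{2J}}{(1+|\alpha'|^2)^{J}\,(1+|\alpha|^2)^{J}} \qquad (2)$$
$$\phantom{\langle\alpha'|\alpha\rangle} = e^{iJ\Phi}\left[\tfrac{1}{2}(1+\cos\Theta)\right]^{J}, \qquad (3)$$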
where $\Phi$ is the area of the spherical triangle with vertices $\vec m$, $\vec m'$ and $\vec e_z$ and $\Theta$ is the angle between the two directions defined by $\vec m$ and $\vec m'$. The coherent state $|\alpha\rangle$ depends on the variables $\alpha$ and $\alpha^*$, which should be considered as independent. It is convenient to define matrix elements depending only on two complex variables (and not four) [3]. The same happens in the case of bosons where the so-called holomorphic representation is used [13,14]. The easiest way to use this type of representation is to define new states [3]
$$\|\alpha\rangle = e^{\alpha J_-}\,|JJ\rangle$$
which depend only on $\alpha$. Their overlap is $\langle\beta\|1\|\alpha\rangle = (1+\beta^*\alpha)^{2J}$ and the decomposition of the identity becomes
$$\frac{2J+1}{\pi}\int\frac{d^2\alpha}{(1+\alpha^*\alpha)^{2J+2}}\;\|\alpha\rangle\langle\alpha\| = 1. \qquad (4)$$
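The coherent states $\|\alpha\rangle$ and $\langle\alpha\|$ for spin $\tfrac12$ are given by the spinors
$$\Lambda(\alpha) = \begin{pmatrix} 1 \\ \alpha \end{pmatrix} \qquad (5)$$
$$\Lambda^\dagger(\alpha) = (1 \;\; \alpha^*), \qquad (6)$$
respectively, and one can write
$$\Lambda^\dagger(\alpha)\Lambda(\alpha) = 1+\alpha^*\alpha \qquad (7)$$
$$\langle\alpha\|1\|\alpha\rangle = [\Lambda^\dagger(\alpha)\Lambda(\alpha)]^{2J}. \qquad (8)$$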
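The overlap formula and eq. (8) can be checked numerically. The following is a minimal sketch, assuming the standard ladder-operator normalization, so that $e^{\alpha J_-}|JJ\rangle$ has components $\alpha^k\sqrt{\binom{2J}{k}}$ on the states $|J,J-k\rangle$; the function names are illustrative only and do not come from the paper.

```python
import math


def unnormalized_state(alpha, two_j):
    """Components of ||alpha> = exp(alpha J_-)|J,J> on |J, J-k>, k = 0..2J.

    Assumes J_-^k |J,J> = sqrt((2J)! k! / (2J-k)!) |J,J-k>, which gives the
    coefficient alpha^k * sqrt(binomial(2J, k)).
    """
    return [alpha ** k * math.sqrt(math.comb(two_j, k)) for k in range(two_j + 1)]


def overlap(beta, alpha, two_j):
    """<beta||alpha> computed component by component."""
    bra = unnormalized_state(beta, two_j)
    ket = unnormalized_state(alpha, two_j)
    return sum(b.conjugate() * k for b, k in zip(bra, ket))


alpha = 0.3 + 0.7j
beta = -0.5 + 0.2j
two_j = 5  # J = 5/2

direct = overlap(beta, alpha, two_j)
closed_form = (1 + beta.conjugate() * alpha) ** two_j
print(abs(direct - closed_form))  # difference between the two evaluations
```

For `two_j = 1` the same function reduces to the spinor product $\Lambda^\dagger(\beta)\Lambda(\alpha) = 1+\beta^*\alpha$, which is the spin-1/2 building block raised to the power 2J in the general case.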
The matrix element of an operator F is defined as $F(\beta^*,\alpha) = \langle\beta\|F\|\alpha\rangle$. It depends only on $\alpha$ and $\beta^*$, and it does not depend on $\alpha^*$ or $\beta$. One can then use $\alpha^*$ instead of $\beta^*$, and treat $\alpha$ and $\alpha^*$ as independent variables, without any loss of generality. For some operators like the density matrix it is convenient to use the so-called diagonal representative given by the weight function of the decomposition of the operator in a superposition of coherent states projectors [5]. In connection with the holomorphic representation it is convenient to define the diagonal representative $f(\alpha,\alpha^*)$ given by [3]
$$F = \frac{2J+1}{\pi}\int d^2\alpha\;\|\alpha\rangle\,f(\alpha,\alpha^*)\,\langle\alpha\|. \qquad (9)$$
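Since the spin coherent states are not eigenvectors of the angular momentum operators, it is important to know the action of these operators on them. These relations were obtained before [4]:
$$e^{\beta J_-}\|\alpha\rangle = \|\alpha+\beta\rangle, \qquad e^{\beta J_z}\|\alpha\rangle = e^{\beta J}\,\|e^{-\beta}\alpha\rangle. \qquad (10)$$
The action of the infinitesimal operators on the coherent states follows immediately [4] and can be summarized as
$$\vec J\,\|\alpha\rangle = \Big\{J\vec m + \vec h_-\Big[(1+\alpha^*\alpha)\frac{\partial}{\partial\alpha} - 2J\alpha^*\Big]\Big\}\|\alpha\rangle. \qquad (11)$$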
The matrix elements of the operators $J_\pm$ and $J_z$ are given by [1,2,3,4] $\langle\alpha|\vec J|\alpha\rangle = J\vec m$. Similarly, the diagonal representatives are given by $\vec j(\alpha,\alpha^*) = (J+1)\vec m$. From eq. (11) it follows that the coherent states minimize the uncertainty relation for any two spin components since, from the Schwarz inequality, in order that a state $|\phi\rangle$ minimizes the uncertainty relation, the states $(J_i - \langle J_i\rangle)|$
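$$e^{\,i\frac{\psi}{2}\,\vec n\cdot\vec\sigma} = \cos\frac{\psi}{2} + i\,\vec n\cdot\vec\sigma\,\sin\frac{\psi}{2} = \begin{pmatrix}\mu & -\nu^* \\ \nu & \mu^*\end{pmatrix} \qquad (12)$$
where $\mu = \cos\frac{\psi}{2} + i\cos\theta\,\sin\frac{\psi}{2}$ and $\nu = i\sin\theta\,e^{i\varphi}\sin\frac{\psi}{2}$. These two complex numbers are not independent, since they satisfy $|\mu|^2+|\nu|^2 = 1$. The inverse of an operator is obtained replacing $\mu$ by $\mu^*$ and $\nu$ by $-\nu$. Using these parameters, the disentangling theorem [1] is written as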
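$$D(g) = e^{-\frac{\nu^*}{\mu^*}J_+}\,e^{-2\ln\mu^*\,J_z}\,e^{\frac{\nu}{\mu^*}J_-} \qquad (13)$$
$$\phantom{D(g)} = e^{\frac{\nu}{\mu}J_-}\,e^{2\ln\mu\,J_z}\,e^{-\frac{\nu^*}{\mu}J_+}. \qquad (14)$$
The action of a general rotation on a coherent state is [4]
$$D(g)\|\alpha\rangle = (\mu-\nu^*\alpha)^{2J}\,\Big\|\frac{\nu+\mu^*\alpha}{\mu-\nu^*\alpha}\Big\rangle. \qquad (15)$$
The matrix element of the operator $D(g)$ is then given by
$$\langle\alpha'\|D(g)\|\alpha\rangle = \big[\mu + \alpha'^*\nu - \nu^*\alpha + \mu^*\alpha'^*\alpha\big]^{2J}. \qquad (16)$$
The matrix element of the spin 1/2 operator $V(g)$ between these spin $\tfrac12$ coherent states is given by
$$V(g;\alpha^*,\alpha) = \mu + \alpha^*\nu - \nu^*\alpha + \mu^*\alpha^*\alpha = \Lambda^\dagger(\alpha)\,V(g)\,\Lambda(\alpha). \qquad (17)$$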
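The relation between the rotated spinor, the transformed label and the spin-J matrix element can be illustrated numerically. This is a minimal sketch, assuming the spin-1/2 rotation matrix has rows $(\mu, -\nu^*)$ and $(\nu, \mu^*)$ as implied by eqs. (15)-(17); the angle names `psi`, `theta`, `phi` and all function names are hypothetical.

```python
import cmath
import math


def su2_parameters(psi, theta, phi):
    """mu, nu for a rotation by psi about the axis with polar angles (theta, phi)."""
    half = 0.5 * psi
    mu = math.cos(half) + 1j * math.cos(theta) * math.sin(half)
    nu = 1j * math.sin(theta) * cmath.exp(1j * phi) * math.sin(half)
    return mu, nu


def rotate_spinor(mu, nu, spinor):
    """Apply the 2x2 matrix [[mu, -nu*], [nu, mu*]] to a spinor (lam, a)."""
    lam, a = spinor
    return (mu * lam - nu.conjugate() * a, nu * lam + mu.conjugate() * a)


def rotate_label(mu, nu, alpha):
    """Bilinear transformation of the coherent-state label, as in eq. (15)."""
    return (nu + mu.conjugate() * alpha) / (mu - nu.conjugate() * alpha)


def half_matrix_element(mu, nu, beta_conj, alpha):
    """Spin-1/2 matrix element of eq. (17), with beta_conj standing for beta*."""
    return mu + beta_conj * nu - nu.conjugate() * alpha + mu.conjugate() * beta_conj * alpha


mu, nu = su2_parameters(psi=1.1, theta=0.6, phi=2.0)
alpha, beta = 0.4 - 0.3j, -0.2 + 0.9j
two_j = 3  # J = 3/2

lam_rot, a_rot = rotate_spinor(mu, nu, (1.0, alpha))
alpha_rot = rotate_label(mu, nu, alpha)

# Spin-J element from eq. (15) and the overlap (1 + beta* alpha')^(2J) ...
from_rotation = (mu - nu.conjugate() * alpha) ** two_j * (1 + beta.conjugate() * alpha_rot) ** two_j
# ... compared with the 2J-th power of the spin-1/2 element.
from_power = half_matrix_element(mu, nu, beta.conjugate(), alpha) ** two_j
print(abs(a_rot / lam_rot - alpha_rot), abs(from_rotation - from_power))
```

The ratio of the two rotated spinor components reproduces the transformed label, and the spin-J element is the 2J-th power of the spin-1/2 one.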
One concludes then that the matrix element for general spin J is simply
$$\langle\alpha\|D(g)\|\alpha\rangle = [V(g;\alpha^*,\alpha)]^{2J}, \qquad (18)$$
i.e. the power 2J of the corresponding spin $\tfrac12$ matrix element, with eq. (8) being a special case. These results are easily understood if one uses the tensorial product of 2J spins $\tfrac12$, $\vec J_a$, $a = 1,\ldots,2J$, and writes $|JJ\rangle = |\tfrac12\tfrac12\rangle\cdots|\tfrac12\tfrac12\rangle$ and
wTXTT •
(19)
which can be rewritten as
$$d(g;\alpha,\alpha^*) = \big[V(g^{-1};\alpha^*,\alpha)\big]^{-2(J+1)}, \qquad (20)$$
i.e. the diagonal representative of $D(g)$, for general spin J, is given by the power $-2(J+1)$ of the matrix element, for spin $\tfrac12$, of the operator $D^{-1}(g)$. A direct way to calculate the matrix elements of products of powers of spin operators consists in differentiating the appropriate generating function $\langle\alpha\|e^{uJ_a}e^{vJ_b}\cdots\|\alpha\rangle$ and using the reduction of this matrix element to the 2J power of the corresponding spin $\tfrac12$ matrix element. Repeated use of the identity $\sigma_a\sigma_b = \delta_{ab} + i\epsilon_{abc}\sigma_c$, where $\sigma_a$ are the Pauli matrices, reduces the calculation to $m_a = \langle\alpha\|\sigma_a\|\alpha\rangle/\langle\alpha\|1\|\alpha\rangle$. One concludes then that any correlation function is a function only of the vector $\vec J = J\vec m$. The same applies to diagonal representatives. In practice, however, correlation functions and cumulants, in particular, are most easily evaluated using the fact that the insertion at right (or at left) of additional spin operators into matrix elements or diagonal representatives of some operator is given by the differential operators $J\vec m + \vec h_-(1+\alpha^*\alpha)\frac{\partial}{\partial\alpha}$ and $(J+1)\vec m - \vec h_+(1+\alpha^*\alpha)\frac{\partial}{\partial\alpha^*}$ (or their complex conjugates), acting on those matrix elements or diagonal representatives [15]. If we consider the action of a rotation $V(g)$ on a general $\tfrac12$ spinor
$$\Lambda_\lambda(a) = \begin{pmatrix}\lambda \\ a\end{pmatrix} \qquad (21)$$
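we find that the relation between the components of the rotated spinor $\Lambda'_\lambda(a) = V(g)\Lambda_\lambda(a)$ and the original spinor $\Lambda_\lambda(a)$ can be written as
$$\frac{a'}{\lambda'} = \frac{\nu + \mu^*\frac{a}{\lambda}}{\mu - \nu^*\frac{a}{\lambda}}. \qquad (22)$$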
From eq. (15) it follows that the variable $\alpha$, labeling the spin coherent states, has the same transformation law as the ratio $a/\lambda$ of the two components of the spinor. This is the relation, found in geometry, between an affine space and the projective space associated with it. More precisely, we identify the 1/2 spinor components as the homogeneous coordinates of the projective space associated to the complex variable which labels the coherent states. This coordinate transforms according to a bilinear or Möbius transformation. One recovers the well known isomorphism between the SU(2) group and the group of the bilinear transformations. The variable $\lambda$ is the rescaling necessary to have the first component of the spinor equal to one. Under a transformation in which the spinors $\Lambda_\lambda(a)$ and $\Lambda^\dagger_\lambda(a)$ transform according to $\Lambda_{\lambda'}(a') = U\Lambda_\lambda(a)$ and $\Lambda^\dagger_{\lambda'}(a') = \Lambda^\dagger_\lambda(a)V$, where U and V are not necessarily related, the Jacobian for the change of variables is given by
$$(|\lambda'|^2)^2\,d^2\frac{a'}{\lambda'} = \det U\,\det V\,(|\lambda|^2)^2\,d^2\frac{a}{\lambda}. \qquad (23)$$
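The correspondence between SU(2) matrices and bilinear transformations can be made concrete by checking that composing two Möbius maps gives the map of the matrix product. A minimal sketch, assuming each group element is stored as a pair $(\mu,\nu)$ standing for the matrix with rows $(\mu,-\nu^*)$ and $(\nu,\mu^*)$; the helper names are illustrative.

```python
import cmath
import math


def su2_parameters(psi, theta, phi):
    """(mu, nu) for a rotation by psi about the axis with polar angles (theta, phi)."""
    half = 0.5 * psi
    mu = math.cos(half) + 1j * math.cos(theta) * math.sin(half)
    nu = 1j * math.sin(theta) * cmath.exp(1j * phi) * math.sin(half)
    return mu, nu


def mobius(g, alpha):
    """Bilinear map alpha -> (nu + mu* alpha) / (mu - nu* alpha)."""
    mu, nu = g
    return (nu + mu.conjugate() * alpha) / (mu - nu.conjugate() * alpha)


def compose(g2, g1):
    """(mu, nu) of the matrix product V(g2) V(g1), read off from its first column."""
    mu2, nu2 = g2
    mu1, nu1 = g1
    return (mu2 * mu1 - nu2.conjugate() * nu1, nu2 * mu1 + mu2.conjugate() * nu1)


g1 = su2_parameters(0.8, 1.2, -0.4)
g2 = su2_parameters(2.1, 0.3, 1.7)
alpha = 0.25 + 0.6j

step_by_step = mobius(g2, mobius(g1, alpha))
in_one_go = mobius(compose(g2, g1), alpha)
print(abs(step_by_step - in_one_go))
```

Working with the pair $(\mu,\nu)$, i.e. with the linear action on the spinor, avoids nesting fractions: the numerator and denominator of the bilinear map are just the two spinor components.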
The two components of the spinors transform separately as the numerator and the denominator of the bilinear transformations. This is very useful in practical calculations. For example, in the calculation of the integrals involving matrix elements or diagonal representatives, one can replace matrix elements between coherent states by matrix elements between spinors, and use rotations and not bilinear transformations, to simplify their evaluation. This, combined with the relation between the spin J and spin 1/2 quantities, allows a great simplification of the calculations. Also, the connection between rotations of vectors and rotations of spinors expressing the fact that the spin is a vector operator and given by
$$\mathcal R(\vec m)\cdot\vec\sigma = V\,\vec m\cdot\vec\sigma\,V^{-1} \qquad (24)$$
in terms of the spinors A and A*. Due to the homogeneity of the expression for m the vector / = J (A A
3 Path integral and equations of motion
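The transition amplitude
$$U(\alpha_f^*,t_f;\alpha_i,t_i) = \langle\alpha_f\|\,T_t\,e^{-\frac{i}{\hbar}\int_{t_i}^{t_f}dt\,\mathcal H(t)}\,\|\alpha_i\rangle \qquad (25)$$
can be obtained by the path integral expression [6]
$$U(\alpha_f^*,t_f;\alpha_i,t_i) = \int\mathcal D\mu\;\big(1+\alpha_f^*\alpha(t_f)\big)^{J}\;e^{\frac{i}{\hbar}\int_{t_i}^{t_f}dt\,\mathcal L}\;\big(1+\alpha^*(t_i)\alpha_i\big)^{J} \qquad (26)$$
where
$$\mathcal L = i\hbar\,\frac{J}{1+\alpha^*\alpha}\Big(\alpha^*\frac{\partial\alpha}{\partial t} - \frac{\partial\alpha^*}{\partial t}\alpha\Big) - \mathcal H_c(\alpha^*,\alpha,t) \qquad (27)$$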
is the Lagrangian, in which $\mathcal H_c(\alpha^*,\alpha,t)$ is the normalized matrix element of the Hamiltonian. The dependence on $\alpha$ and $\alpha^*$ is through $\vec m$ only, as explained above. The two boundary terms appear naturally from the construction of the path integral [6] and are necessary to have the boundary conditions $\alpha(t_i) = \alpha_i$ and $\alpha^*(t_f) = \alpha_f^*$, when varying the action. Note that $\alpha(t_f)$ is not the complex conjugate of $\alpha_f^*$, and that $\alpha^*(t_i)$ is not the complex conjugate of $\alpha_i$. They are the values taken by the variables $\alpha(t)$ and $\alpha^*(t)$, satisfying these boundary conditions, at the times $t = t_f$ and $t = t_i$, respectively. In particular, in the saddle-point approximation they are obtained solving the classical equations of motion for $\alpha(t)$ and $\alpha^*(t)$ and evaluating their values at $t_f$ and $t_i$, respectively. These terms have been shown to be necessary to obtain the exact solution using the saddle-point approximation in the case of a spin in a time-independent magnetic field, for instance [6]. In the discrete approximation the integration measure is $\mathcal D\mu = \prod_k\frac{2J+1}{\pi}\frac{d^2\alpha_k}{(1+\alpha_k^*\alpha_k)^2}$. It can be seen as a consequence of the quantization of a classical problem with constraints [12]. The usual transition amplitude is therefore given by
$$\langle\alpha_f|\,T_t\,e^{-\frac{i}{\hbar}\int_{t_i}^{t_f}dt\,\mathcal H(t)}\,|\alpha_i\rangle = \int\mathcal D\mu\left(\frac{1+\alpha_f^*\alpha(t_f)}{1+\alpha_f^*\alpha_f}\right)^{J}e^{\frac{i}{\hbar}\int_{t_i}^{t_f}dt\,\mathcal L}\left(\frac{1+\alpha^*(t_i)\alpha_i}{1+\alpha_i^*\alpha_i}\right)^{J} \qquad (28)$$
where $\alpha_f$ and $\alpha_i^*$ are the complex conjugates of $\alpha_f^*$ and $\alpha_i$, respectively. The path integral representation for the diagonal representative of the evolution operator is similar [6]. It can be checked that, both for matrix elements and diagonal representatives, the classical equations of motion, obtained varying the Lagrangian term and the boundary condition terms, are the classical limit of the Dyson-Schwinger equations, with the appropriate boundary conditions [6]. Also, the Poisson brackets, or more precisely, the Dirac brackets for the quantization of a theory with constraints, of the matrix elements or diagonal representatives of any two spin components satisfy the commutator algebra of the quantum spin operators. The path integral expression for the partition function can be easily obtained and one finds using matrix elements
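$$Z = \int\mathcal D\mu\;\exp\left\{-\int_0^\beta d\tau\left[\frac{J}{1+\alpha^*\alpha}\Big(\alpha^*\frac{\partial\alpha}{\partial\tau} - \frac{\partial\alpha^*}{\partial\tau}\alpha\Big) + \mathcal H(\alpha^*,\alpha)\right]\right\} \qquad (29)$$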
with periodic boundary conditions in the imaginary time. The expression for the case of the diagonal representative is similar [6]. Geometrically, the free term in the Lagrangian is the area in the sphere defined by the trajectory under consideration and by two great arcs from the north pole to the initial and final points of the trajectory. It can be rewritten as
$$\mathcal L_0 = \hbar\,\frac{d\vec J}{dt}\times\vec J\cdot\vec\phi \qquad (30)$$
where $\vec\phi = \frac{\vec u}{J+\vec J\cdot\vec u}$, with $\vec u = \vec e_z$, reflecting the choice of the z axis as the quantization axis in the construction of the coherent states. In general, $\vec u$ can be any arbitrary unit vector. Changing it, the free Lagrangian is modified by a term $\delta\mathcal L_0$ which is the total derivative of a function of the initial and final positions (the difference in the areas defined by each of these points and the old and new $\vec u$). Conversely, under a rotation, i.e., under a change of variables of the form $\alpha' = \frac{\nu+\mu^*\alpha}{\mu-\nu^*\alpha}$, the free Lagrangian changes by $\mathcal L_0' - \mathcal L_0 = i\hbar J\frac{d}{dt}\ln\Big(\frac{\mu^*-\nu\alpha^*}{\mu-\nu^*\alpha}\Big)$, which is also the total derivative of a function of the initial and final positions. Finally, under a spin inversion, in which $\vec J\to-\vec J$, $\alpha\to-\frac{1}{\alpha^*}$, $\alpha^*\to-\frac{1}{\alpha}$, the free Lagrangian transforms according to $\mathcal L_0\to-\mathcal L_0 + i\hbar J\frac{d}{dt}\ln\frac{\alpha}{\alpha^*} = -\mathcal L_0 - 2\hbar J\frac{d\varphi}{dt}$, with the geometrical interpretation that the sum of the areas defined by a closed trajectory encircling the poles and viewed from them is $4\pi$, the total area of the sphere. The Lagrangian is totally rewritten in terms of the vector $\vec J$. The integration on the sphere can also be written as $\frac{2}{J}\,d^3J\,\delta(\vec J^{\,2}-J^2)$. We arrive then at the path integral for spins
$$\int\mathcal D\vec J\;\delta(\vec J^{\,2}-J^2)\;e^{\frac{i}{\hbar}\int dt\,\mathcal L} \qquad (31)$$
Similarly, a formulation involving only the spinors can also be found. Making the rescaling $\alpha\to a/\lambda$ and $\alpha^*\to a^*/\lambda^*$, the integration measure becomes $d\Omega = |\lambda|^2\,d^2a/(\Lambda^\dagger\Lambda)^2$. In this expression the values of $\lambda$ and $\lambda^*$ are arbitrary but fixed. In order to integrate over $d^2\lambda$ one introduces two delta functions $\delta(\Re\lambda-\lambda_r)$ and $\delta(\Im\lambda-\lambda_i)$. Using the identity
$$\left|\frac{\partial(f,g)}{\partial(x,y)}\right|_{x=x_0,\,y=y_0}\delta(f(x,y))\,\delta(g(x,y)) = \delta(x-x_0)\,\delta(y-y_0) \qquad (32)$$
where $x_0$ and $y_0$ are such that $f(x_0,y_0) = g(x_0,y_0) = 0$, one is able to give other forms to the path integral. In particular defining
where $r(\lambda^*,\lambda)$ is an arbitrary but not constant function, one finds immediately the path integral expression found by Jevicki and Papanicolaou
Z =
n
ff[v2ai'[[S[A\t)A(t)-2J}5[
(34)
t
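derived using the Schwinger representation for spins and using the Dirac-Faddeev theory of constrained systems, extended to the case of second class constraints [12], to make the restriction to the physical space $\Lambda^\dagger\Lambda = N = 2J$ and where $\phi = a_1^\dagger + a_1$ is the gauge fixing condition. In terms of the function $r(\lambda^*,\lambda)$ this restriction and the gauge condition become $|r|^2 = 1$ and $r^* = r$, respectively. The classical equations of motion can be obtained varying the Lagrangian term and the boundary condition terms. In the case of the path integral for the matrix element, we find
$$i\hbar\,\frac{2J}{(1+\alpha^*\alpha)^2}\frac{d\alpha}{dt} = \frac{\partial\mathcal H}{\partial\alpha^*}, \qquad -i\hbar\,\frac{2J}{(1+\alpha^*\alpha)^2}\frac{d\alpha^*}{dt} = \frac{\partial\mathcal H}{\partial\alpha} \qquad (35)$$
with $\alpha(t_i) = \alpha_i$ and $\alpha^*(t_f) = \alpha_f^*$ as boundary conditions, as expected [16]. Using $(1+\alpha^*\alpha)\,\partial\vec m/\partial\alpha = 2\vec h_+$ and its complex conjugate [17], the equations of motion can be rewritten as
$$i\hbar\,\frac{2}{(1+\alpha^*\alpha)^2}\frac{d\alpha}{dt} = \frac{\partial\mathcal H}{\partial\vec J}\cdot\frac{2\vec h_-}{1+\alpha^*\alpha}, \qquad -i\hbar\,\frac{2}{(1+\alpha^*\alpha)^2}\frac{d\alpha^*}{dt} = \frac{\partial\mathcal H}{\partial\vec J}\cdot\frac{2\vec h_+}{1+\alpha^*\alpha} \qquad (36)$$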
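where $\vec J = J\vec m$. In these equations $\vec H_{eff} = \frac{\partial\mathcal H}{\partial\vec J}$ is the effective magnetic field acting on the spin. Making the rescaling $\alpha\to\frac{a}{\lambda}$ and $\alpha^*\to\frac{a^*}{\lambda^*}$ and defining the effective Hamiltonian $\mathcal K = \vec H_{eff}\cdot\frac{\vec\sigma}{2}$, acting on the spinors, one finds that these equations of motion are equivalent to
$$i\hbar\,\frac{d\Lambda_\lambda}{dt} = \mathcal K\,\Lambda_\lambda \qquad (37)$$
and its hermitian conjugate. Since $\mathcal K$ is a hermitian operator, the norm of the spinors, i.e. $\Lambda^\dagger_\lambda\Lambda_\lambda = \lambda^*\lambda + a^*a$, is conserved. The equation of motion of the moment $\vec J$, i.e., of the expectation value of the Pauli matrices in the spinor, is
$$\frac{d\vec J}{dt} = \frac{1}{\hbar}\,\frac{\partial\mathcal H}{\partial\vec J}\times\vec J \qquad (38)$$
which is the equation of precession in the effective field $\frac{\partial\mathcal H}{\partial\vec J}$. The equation of motion of the spinor can be formally solved by
$$\Lambda_\lambda(a) = U(t,t_0)\,\Lambda^0_\lambda(a) \qquad (39)$$
where
$$U(t,t_0) = T_t\,e^{-\frac{i}{\hbar}\int_{t_0}^{t}dt\,\mathcal K} \qquad (40)$$
is the evolution operator corresponding to the effective Hamiltonian $\mathcal K$. One can take advantage of the connection between vector rotations and spinor rotations by defining the operator in spinor space
$$\mathcal J = \vec J\cdot\vec\sigma \qquad (41)$$
having the property $\mathcal J^2 = J^2$. The equation of precession becomes
$$i\hbar\,\frac{d\mathcal J}{dt} = [\mathcal K,\mathcal J] \qquad (42)$$
and can be solved by
$$\mathcal J = U\,\mathcal J^0\,U^{-1}. \qquad (43)$$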
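The equivalence between the spinor evolution and the precession of the vector can be illustrated for a constant effective field, where the evolution operator is a plain matrix exponential. This is a minimal sketch under the assumptions that the effective Hamiltonian is $\vec H\cdot\vec\sigma/2$ with a time-independent field and that $\hbar = 1$ by default; the function names are illustrative.

```python
import math


def evolve_spinor(spinor, field, t, hbar=1.0):
    """Apply exp(-i K t / hbar) with K = field . sigma / 2 to a spinor (lam, a)."""
    hx, hy, hz = field
    h = math.sqrt(hx * hx + hy * hy + hz * hz)
    nx, ny, nz = hx / h, hy / h, hz / h
    c = math.cos(0.5 * h * t / hbar)
    s = math.sin(0.5 * h * t / hbar)
    lam, a = spinor
    # exp(-i x n.sigma) = cos x - i (n.sigma) sin x, written out component-wise
    new_lam = (c - 1j * nz * s) * lam + (-1j * nx - ny) * s * a
    new_a = (-1j * nx + ny) * s * lam + (c + 1j * nz * s) * a
    return new_lam, new_a


def direction(spinor):
    """Unit vector m = <sigma> / <1> carried by the spinor (lam, a)."""
    lam, a = complex(spinor[0]), complex(spinor[1])
    norm = abs(lam) ** 2 + abs(a) ** 2
    cross = lam.conjugate() * a
    return (2 * cross.real / norm, 2 * cross.imag / norm,
            (abs(lam) ** 2 - abs(a) ** 2) / norm)


def precess(vector, field, t, hbar=1.0):
    """Solve dm/dt = (1/hbar) field x m by a Rodrigues rotation about the field."""
    hx, hy, hz = field
    h = math.sqrt(hx * hx + hy * hy + hz * hz)
    n = (hx / h, hy / h, hz / h)
    angle = h * t / hbar
    c, s = math.cos(angle), math.sin(angle)
    dot = sum(ni * vi for ni, vi in zip(n, vector))
    cross = (n[1] * vector[2] - n[2] * vector[1],
             n[2] * vector[0] - n[0] * vector[2],
             n[0] * vector[1] - n[1] * vector[0])
    return tuple(vector[i] * c + cross[i] * s + n[i] * dot * (1 - c) for i in range(3))


spinor0 = (1.0, 0.4 + 0.3j)      # lam = 1, so the label is alpha = a
field = (0.3, -1.1, 0.8)
t = 2.7

from_spinor = direction(evolve_spinor(spinor0, field, t))
from_vector = precess(direction(spinor0), field, t)
print(max(abs(x - y) for x, y in zip(from_spinor, from_vector)))
```

The spin value J does not enter: it only rescales the vector $\vec J = J\vec m$, which is why the two-component evolution covers every J.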
The case of many spins can be easily treated, repeating this procedure for each spin. The path integral follows immediately, with the coupling between spins arising from the Hamiltonian $\mathcal H$. The saddle point equations follow from there and one finds the mean field approximation in which each spin i precesses in the effective field $\vec H_i^{eff} = \frac{\partial\mathcal H}{\partial\vec J_i}$ resulting, in particular, from the interaction with external fields and the other spins. One introduces Pauli matrices $\vec\sigma_i$ for each spin. The effective Hamiltonian for each spin is $\mathcal K_i = \vec H_i^{eff}\cdot\frac{\vec\sigma_i}{2}$. The total effective Hamiltonian is $\mathcal K = \sum_i\mathcal K_i$. It is the sum of single spin terms, reflecting the character of the mean field approximation in which each spin precesses in the mean field produced by the others. We note that in the treatment of continuous spin chains the matrix $\mathcal J$ is the central quantity for the Lax representation used by Takhtajan [18] and Jevicki and Papanicolaou [11] in the derivation of nontrivial solutions of the soliton or instanton type.

4 Summary
We have replaced a description of a quantum spin operator in terms of classical variables (defined by the stereographic projection of the sphere from its south pole to the equatorial plane) by a spinor description (two-component spinor) and replaced the action of a rotation on the spin coherent states (which involves bilinear transformations of the complex variables) by the action of a rotation (matrices operating in spinor space) on the two-component spinor. This reformulation appears naturally for J = 1/2 but we have shown that it generalizes in a straightforward way to all spin values. We have noted that this spinor description is related to the description of a spin system by Schwinger bosons. Both descriptions enlarge the space and a projection to the proper subspace is required. We have reexpressed the equations of motion in terms of the 1/2 spinor and obtained simple expressions that have a formal appearance familiar from standard dynamics of two-level systems. This enables a simpler treatment of a general spin-J system. Interacting spin systems are dealt with more conveniently in terms of an evolution operator that acts on spinor space and in terms of a transfer matrix that connects different points in real space. Also, we have made contact with previous treatments where the Lax representation is used in the context of the search for nontrivial solutions of the saddle-point equations via the inverse scattering method.

References
1. Klauder J R, Ann. Phys. 11, 123 (1960). Glauber R J, Phys. Rev. 130, 2529 (1963). Carruthers P and Nieto M M, Rev. Mod. Phys. 40, 411 (1968). Klauder J R and Skagerstam B S, Coherent States, Applications in Physics and Mathematical Physics (World Scientific, Singapore, 1985). Zhang W M, Feng D H and Gilmore R, Rev. Mod. Phys. 62, 867 (1990).
2. Radcliffe J M, J. Phys. A 4, 313 (1971). Arecchi F T, Courtens E, Gilmore R and Thomas H, Phys. Rev. A 6, 2211 (1972). Lieb E, Commun. Math. Phys. 31, 327 (1973). Kuratsuji H and Suzuki T, J. Math. Phys. 21, 472 (1980). Klauder J R, Phys. Rev. D 19, 2349 (1979). Fradkin E and Stone M, Phys. Rev. B 38, 7215 (1988).
3. Vieira V R and Sacramento P D, J. Phys. A 27, L783 (1994).
4. Vieira V R and Sacramento P D, Annals of Phys. 242, 188 (1995).
5. Cahill K E and Glauber R J, Phys. Rev. 177, 1857 (1969).
6. Vieira V R and Sacramento P D, Nucl. Phys. B 448, 331 (1995).
7. Vieira V R and Sacramento P D, Physica A 207, 584 (1994).
8. Berezin F A and Marinov M S, Annals of Phys. 104, 336 (1977). Vieira V R, Phys. Rev. B 23, 6043 (1981).
9. Popov V N and Fedotov S A, Sov. Phys. JETP 67, 535 (1988).
10. Haldane F D M, Phys. Lett. A 93, 464 (1983); Phys. Rev. Lett. 50, 1153 (1983).
11. Jevicki A and Papanicolaou N, Annals of Phys. 120, 107 (1979).
12. See Appendix A of ref. 6. Senjanovic P, Annals of Phys. 100, 227 (1976).
13. Faddeev L D and Slavnov A A, Gauge Fields: Introduction to Quantum Theory (Benjamin, Reading, 1980).
14. Schuster H G and Vieira V R, Phys. Rev. B 34, 189 (1986).
15. The vectors $\vec h_+$, $\vec h_-$ are the eigenvectors of the projectors $P^+_{ab} = \frac12(\delta^\perp_{ab} + i\epsilon_{abc}m_c) = 2h^a_+h^b_-$ and $P^-_{ab} = (P^+_{ab})^*$, where $\delta^\perp_{ab} = \delta_{ab} - \delta^\parallel_{ab}$ and $\delta^\parallel_{ab} = m_am_b$ are the transversal and longitudinal projectors relative to the vector $\vec m$. See ref. [4].
16. In some applications, involving circles or straight lines as trajectories, the mapping of circles and straight lines into circles and straight lines by the bilinear transformations becomes useful. The parameterization of the solutions and handling of the boundary conditions becomes easier using the invariance of the cross ratio of four points under the bilinear transformations, or in a circle, under complex conjugation. The same can be done using the spinors in projective space.
17. The variations of $\vec h_\pm$ with $\alpha$ and $\alpha^*$ are given by $(1+\alpha^*\alpha)\,\partial\vec h_-/\partial\alpha = -\vec m + \alpha^*\vec h_-$ and $(1+\alpha^*\alpha)\,\partial\vec h_+/\partial\alpha = -\alpha^*\vec h_+$ and their complex conjugates.
18. Takhtajan L A, Phys. Lett. A 64, 235 (1977).
GLUON CONDENSATE AND A VACUUM STRUCTURE FOR NONABELIAN GAUGE THEORY
R. VILELA MENDES
Grupo de Física-Matemática, Complexo Interdisciplinar, Univ. de Lisboa, Av. Gama Pinto, 2, 1699 Lisboa Codex, Portugal
(Dedicated to Ludwig Streit on the occasion of his 60th birthday)
Phenomenological evidence and analytic approximations to the QCD ground state suggest a complex gluon condensate structure. Exclusion of elementary fermion excitations by the generation of infinite mass corrections is a consequence. In addition, the existence of vacuum condensates in unbroken non-abelian gauge theories endows SU(3) and higher order groups with a non-trivial structure in the manifold of possible vacuum solutions, which is not present in SU(2). This may be related to the existence of particle generations.
1 Vacuum condensates in QCD. Phenomenological and theoretical evidence
Reconstruction from the ground state is a powerful technique leading to the rigorous construction of quantum theories which cannot be adequately described by potentials. It is also a field where Ludwig Streit has made important contributions [1,2,3]. Here I will sketch some of the results that may be obtained when this technique is applied to the gluon sector of QCD. The basic idea is to use a path-integral expansion to guess a non-trivial vacuum structure and then to use this one as the defining ground state functional for a quantum theory. There is now ample phenomenological evidence for the existence of a non-trivial structure in the QCD vacuum, containing both quark [4] and gluon [5] condensates. A good description of many hadronic quantities is obtained in the framework of the QCD sum rules [6], using as input the vacuum expectation values $\langle\bar qq\rangle$ and $\langle F^a_{\mu\nu}F^{a\mu\nu}\rangle$. Analytical approximations to the QCD ground state also provide theoretical evidence for the existence of the condensates and, in addition, supply some additional information on the nature of the condensates. In Ref. 7, for example, a path-integral representation of the ground state is used to obtain a systematic expansion in which even the leading term contains non-perturbative information. For the gluon sector of QCD (in the temporal gauge), the leading term is
*W=^(-S/*^WU(4«)UMM))IB";W)
(1)
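with
$$B^a_i = \epsilon_{ijk}\Big(\partial_jA^a_k - \tfrac12\,g\,f_{abc}A^b_jA^c_k\Big) \qquad (2)$$
$$R(A)^{aa'}_{nn'} = \epsilon_{nmn'}\Big(\partial_m\delta_{aa'} - g\,f_{aa'c}A^c_m\Big) \qquad (3)$$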
In the long-wavelength limit, the $\Psi_0\{A\}$ state of Eq.(1) bears some resemblance to the ansatz proposed by Greensite [8]; however, the power dependence on the chromomagnetic fields is different. By considering either fast varying potentials $A^a_i$ or constant field configurations, one sees that (1) interpolates nicely between an abelian-type vacuum for the high frequencies and a configuration of random magnetic fluxes at low frequencies. The non-perturbative nature of this analytical approximation to the (gluon sector) vacuum is made apparent when one considers the effect of long-wavelength (constant $A^a_i$) field fluctuations on the effective mass and the propagation of elementary fermions. In addition some information may be gathered from (1) concerning the group-theoretical structure of the QCD vacuum (Sect.3). This, however, is likely to be more general than the approximation described in Eq.(1) and to depend only on the octet structure of the operators that appear in the composite condensates $\langle A^a_\mu A^{a\mu}\rangle$ and $\langle F^a_{\mu\nu}F^{a\mu\nu}\rangle$. Greensite [8] and Feynman [9], inspired by the form of the abelian ground state and the requirements of gauge invariance, have conjectured a form
$$\Psi_0\{A\} = \exp\Big(-\int d^3x\,d^3y\;\mathrm{Tr}\big(B(x)\cdot S_{xy}\cdot B(y)\cdot S_{yx}\big)\,f(|x-y|)\Big) \qquad (4)$$
for the (gluonic) QCD ground state. $S_{xy}$ is the gauging factor
and for the kernel $f(|x-y|)$, which contains all the non-trivial dynamical information, a few requirements were discussed by these authors. However, comparing Eq.(4) with the leading path-integral approximation in Eq.(1), the conclusion is that a simple coordinate dependence for the kernel is unlikely. Instead one obtains a complex dependence on the dynamical variables which seems difficult to guess from qualitative considerations. One also notices that Eq.(1) may be written as
exp(-\Jdizt(x)»t{x?)
90{A} =
(5)
where
«(*) = ((W*M(*))) 1/4 )1° **'{X)
W
the last equality being obtained from a standard representation [10] for fractional powers of positive operators. Eq.(6) puts into evidence the highly non-local nature of the effective QCD coordinates. The plan of the present paper is as follows: One works in the framework of the Schrödinger functional formulation of quantum field theory (SFQFT) [11,12,13]. Choosing a local representation for the Lie algebra-valued one-forms
$$A = A^a_k(x)\,dx^k\,S_a \qquad (7)$$
the coordinate set for SFQFT is a collection of functions $A^a_k(x)$ defined on $R^3$ or, whenever convenient, on a three-dimensional compact manifold. The labels k and a take the values k = 1,2,3 and a = 1,...,n, n being the number of generators of the Lie algebra $\mathcal G$ of the internal symmetry group G. In the Schrödinger formulation, the functionals involve products of fields and functional derivatives. Regularization of these quantities is needed to obtain well defined quantities. Two approaches are possible. The first one consists in restricting the field configurations by Sobolev norms and the second one relies on a lattice regularization. In the first case let the space spanned by the coordinates $\{A^a_k(x)\}$ be denoted by V. Using the G-invariant scalar product (.,.) in $\mathcal G$, a scalar product is defined in V by
$$(A,B) = \int d^3x\sum_{k=1}^{3}\big(A_k(x),B_k(x)\big) \qquad (8)$$
and, with the covariant derivative
$$D_j(A)_{\alpha\beta} = \partial_j\delta_{\alpha\beta} - g\,f_{\alpha\beta\gamma}A^\gamma_j(x) \qquad (9)$$
Sobolev norms of class r are defined on V. $V_r$ is then the Hilbert space completion of V with respect to the norm $(.,.)_r$ and, in each case, the appropriate Sobolev order r must be chosen depending on the Schrödinger functional to be evaluated. The alternative approach of using a lattice regularization will be considered in Sect.2, where the effect of a ground state of the form (1) on the effective mass of elementary fermion excitations is considered. The quantum theory that is studied in this paper is the one that is defined by the functional (1), in the sense of reconstruction from the ground state [14,1,2,3]. The theory is invariant under time-independent gauge transformations and, with the adequate regularization, some precise statements will be made about it in Sects.2 and 3. However, how well it approximates the gauge sector of the actual QCD ground state is a question about which a rigorous statement cannot, at this stage, be made in the framework of the path-integral expansion leading to the functional (1).

2 The effective mass of elementary fermion excitations
As will be seen later on, the nature of the gauge group determines the multiplicity of vacuum configurations, the SU(2) and SU(3) groups behaving differently in this aspect. However, dynamical features like the coupling constant dependence of the condensates and their effect on the propagation of elementary excitations are expected to depend mostly on the non-Abelian character and not so much on the order of the gauge group. Therefore, for simplicity, calculations in this section will be carried out for the SU(2) group. A lattice regularization will be used with the continuum limit obtained when the lattice spacing $a\to0$. As shown in [7], for the quantum theory constructed from (1), the existence of a finite mass gap implies a running coupling constant behavior for $g^2(a)$. Therefore $g(a)\to0$ when $a\to0$, which justifies the use of the small noise Wentzell-Freidlin technique [15] to analyze the mass-gap. In the lattice regularization one makes the substitutions
$$g\,a\,A^\alpha_i(x)\to\theta^\alpha(x,x+\hat\imath) = \theta^\alpha_i(x) \qquad (11)$$
$$g\,a^2B^\alpha_i(x)\to\beta^\alpha_i(x) = \tfrac14\gamma_{ijk}\Big(\theta^\alpha(x+\hat\jmath,x+\hat\jmath+\hat k) - \tfrac12 f_{\alpha\beta\gamma}\theta^\beta(x,x+\hat\jmath)\,\theta^\gamma(x,x+\hat k)\Big) \qquad (12)$$
$$(D_i)_{\alpha\beta}\,v^\beta(x)\to\frac1a(\nabla_i)_{\alpha\beta}\,v^\beta(x) = \frac1a\Big\{\tfrac12\big(v^\alpha(x+\hat\imath)-v^\alpha(x-\hat\imath)\big) - f_{\alpha\beta\gamma}\theta^\gamma(x,x+\hat\imath)\,v^\beta(x)\Big\} \qquad (13)$$
where $\gamma_{ijk} = (\mathrm{sign}\,i)(\mathrm{sign}\,j)(\mathrm{sign}\,k)\,\epsilon_{|i||j||k|}$ and $\hat\imath$ denotes the unit lattice vector along the i-direction. Then (1) becomes
*«=^(-55»£r**"j*w(A+i£^L'H (14)
where $R(\theta)^{\alpha\alpha'}_{nn'} = \epsilon_{nmn'}\nabla_m(\theta)_{\alpha\alpha'}$ and a standard integral representation [10] was used for the fractional power of the operator. To study the long-wavelength contribution to the vacuum condensates, consider constant gauge potentials. Because $M_{ij} = \sum_\alpha\theta^\alpha_i\theta^\alpha_j$ is a symmetric matrix it may be diagonalized by a space rotation and, in the new coordinates, the three SU(2) vectors $\theta_1$, $\theta_2$ and $\theta_3$ are orthogonal. Without losing generality, SU(2) coordinates may be chosen such that
$$\theta_1 = (a_1,0,0);\quad \theta_2 = (0,a_2,0);\quad \theta_3 = (0,0,a_3) \qquad (15)$$
$$\beta_1 = (-a_2a_3,0,0);\quad \beta_2 = (0,-a_3a_1,0);\quad \beta_3 = (0,0,-a_1a_2) \qquad (16)$$
Then (14) becomes tf0 {6} = exp (a (01,02,03)) with a(ai,02,03) = -2^y/ 0 °°dAA-i{(oi,O2,O3) 2 (a? + o | + 0$) +A(af ( a | + 4) + <4(al + 0%) + ai{a1 + a!)) +\>(alal + aW3+al4)} x{4(aia 2 a 3 ) 2 + A(A -I- a\ + o | + o f ) 2 } - 1
(u) {
'
N is the number of sites in the regularizing lattice. The state $\Psi_0\{\theta\} = \exp(\sigma(a_1,a_2,a_3))$, as it stands, is not normalizable. This has two origins. First, as one sees from (16), there still is a gauge freedom on the planes $\{(a_i,a_j): a_k = 0\}$, where one of the arguments vanishes. This is corrected by integrating not on $da_1da_2da_3$ but on the components of the chromomagnetic field. This is equivalent to introducing the Jacobian factor $\sqrt{a_1^2a_2^2a_3^2}$. Even then the state is not normalizable unless one restricts the large potential fluctuations. This is done by multiplying the integration measure by $\exp\{-\mu(a_1^2+a_2^2+a_3^2)\}$ with $\lim_{\mu\to0}$ being taken in the end. Consistently regularized expectation values of operators are therefore obtained from
$$\langle O\rangle = \lim_{\mu\to0}\frac{\int\sqrt{a_1^2a_2^2a_3^2}\;e^{-\mu(a_1^2+a_2^2+a_3^2)}\,da_1da_2da_3\;\Psi_0\{\theta\}\,O\,\Psi_0\{\theta\}}{\int\sqrt{a_1^2a_2^2a_3^2}\;e^{-\mu(a_1^2+a_2^2+a_3^2)}\,da_1da_2da_3\;\Psi_0^2\{\theta\}} \qquad (18)$$
Because $g(a)\to0$ when $a\to0$, these integrals may be evaluated by asymptotic expansion methods, namely the Laplace method. The minimum of $\sigma(a_1,a_2,a_3)$ is zero and is obtained when any two of its arguments vanish.
Fixing each time one of the arguments and applying the Laplace method in the plane of the other two arguments, one obtains three similar contributions. For example for fixed 03 one computes the second derivatives in the plane (ai,o 2 ) <7(0,0,a3) = 0 « a(0,0,a 3 ) = 5^7(0,0, a3) = 0 £I<J{Q,0, a3) = ga(0,0,03) = 2 * ^
35^(0,0,03) = o
^
a
Using $\int_0^\infty z^k\exp(-az^2)\,dz = \Gamma\big(\tfrac{k+1}{2}\big)\big/\big(2a^{\frac{k+1}{2}}\big)$ and the generalized Laplace method one obtains for the normalization integral at small g
/aV(o 1 ,oa,a 8 )*gW = 3/-°L*i ( - f e ) 2 | « 3 | e - ^ ~ - 9 ( ^ ) 2 l n M
(20)
where I have denoted by $d\nu(a_1,a_2,a_3)$ the measure
$$d\nu(a_1,a_2,a_3) = \sqrt{a_1^2a_2^2a_3^2}\;e^{-\mu(a_1^2+a_2^2+a_3^2)}\,da_1da_2da_3 \qquad (21)$$
and the last expression in (20) is the asymptotic behavior for small /j. Likewise / aV(0l, oa, a3)n {9} (a2 + a2 + a32) ~ 3 ( # ) ' I /aV(ai,a 2 ) a 3 )* 2 {6} (01 + a2 + a 3 ) 2 ~ 3 ( # ) ' I /di/(a 1) a 2 ,a 3 )*g{^} ((a lfl2 ) 2 + (a2a3)2 + (a 3 ai) 2 ) ~ 3 ( # ) *
(22) fi
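The half-line Gaussian moment formula used in the Laplace evaluation, $\int_0^\infty z^k e^{-az^2}dz = \Gamma(\frac{k+1}{2})/(2a^{\frac{k+1}{2}})$, can be checked against a direct quadrature. A minimal sketch, assuming a simple midpoint rule on a truncated interval is accurate enough; the function names and the cutoff are illustrative choices.

```python
import math


def half_line_moment_exact(k, a):
    """Closed form Gamma((k+1)/2) / (2 a^((k+1)/2))."""
    return math.gamma((k + 1) / 2) / (2 * a ** ((k + 1) / 2))


def half_line_moment_numeric(k, a, steps=20000):
    """Midpoint rule on [0, 12/sqrt(a)], beyond which the integrand is negligible."""
    upper = 12.0 / math.sqrt(a)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        z = (i + 0.5) * h
        total += z ** k * math.exp(-a * z * z)
    return total * h


for k, a in [(0, 1.0), (1, 0.5), (2, 3.0), (3, 2.0)]:
    exact = half_line_moment_exact(k, a)
    numeric = half_line_moment_numeric(k, a)
    print(k, a, abs(exact - numeric))
```

Odd powers matter here because the Jacobian factor in the measure contributes $|a_i|$ rather than an even power of each variable.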
Therefore for $\langle a_1^2+a_2^2+a_3^2\rangle$, which is the long-wavelength contribution to $g^2a^2\langle A^\alpha_iA^\alpha_i\rangle$, one obtains
lim ——-—
M-fO - 3 / x l n / i
the same result applying for $\langle (a_1 + a_2 + a_3)^2\rangle$. Therefore the regularized long-wavelength contribution of the vacuum background to these expectation values diverges for all values of the coupling constant. These results may now be used to find the effect of the gluon background, associated to the ground state (14), on the mass and propagation of elementary fermion excitations. The eigenvalue equation for the Dirac Hamiltonian of a fermion minimally coupled to this constant background is
$$H_1(p)\,\psi(p) = \left(-\alpha^i p_i + g\left(A_0^a + \alpha^i A_i^a\right)\frac{\lambda_a}{2} + \gamma^0 m\right)\psi(p) = E\,\psi(p)$$
To find the contribution of the background to the fermion mass consider a Lorentz frame where $p = 0$. In this frame, without loss of generality, one makes $A_0^a = 0$ by a gauge transformation and diagonalizes $A_i^a A_j^a$ by space rotations obtaining, as above A = -(Au0,0)
;A% = 1(0,A2,0)
, A% =
1(0,0,A3)
With these choices the eight eigenvalues of $H_1$ are
$$\pm\sqrt{m^2 + (A_1 \pm A_2 \pm A_3)^2}$$
for all possible sign choices. Hence the background contribution to the squared mass of the single fermion excitations is $\langle (A_1 + A_2 + A_3)^2\rangle$, the same for all eigenstates because of the symmetry of the state (14). As seen above, this vacuum expectation value diverges for all values of the coupling constant and single fermion excitations acquire an infinite mass correction from the ground state background. Hence the conclusion is that:

# Single fermion excitations cannot propagate in the background created by the lattice regularized ground state in Eq. (1).

Remark: However, for a composite state with overall neutral color, the total contribution of the constant background vanishes and its effect on the mass can only come from background modifications of the many-body interactions. For a fermion-antifermion state the total energy is [17] $H_1(p_1) + H_2(p_2) + V_{12}$, where $V_{12}$ stands for the two-body interactions and $H_2$ acts on the adjoint fermions by
$$H_2(p)\,\psi(p) = \left(-\alpha^i p_i - g\left(A_0^a + \alpha^i A_i^a\right)\frac{\lambda_a}{2} + \gamma^0 m\right)\psi(p)$$
Hence, the leading order effect of the constant background cancels for the color neutral combination $\bar\psi(p_2)\psi(p_1)$.

3
Group structure of the SU(3) ground state
A gluonic ground state, of the type of Eq.(l), defines a probability distribution for chromomagnetic field fluctuations around zero mean in such a way that
$$\langle A_\mu^a\rangle = \langle F_{\mu\nu}^a\rangle = 0 \tag{23}$$
as required by Lorentz and SU(3) invariance, while quantities like $\langle A_\mu^a A^{a\mu}\rangle$ and $\langle F_{\mu\nu}^a F^{a\mu\nu}\rangle$ may be different from zero. The integration measure that controls the ground state fluctuations is $|\Psi_0\{A\}|^2 \prod_{x,i,a} dA_i^a(x)$. The domain of integration of this measure is the space of all possible ground state fluctuations, which should be defined in such a way as to preserve the symmetries of the theory. In the temporal gauge, one may, by space rotations, transform the gauge potential $A_i^a(x)$ at each point $x$ into a triplet of SU(n) orthogonal directions. Then, the directions to include in the domain of integration are those that can be obtained from this triplet by arbitrary SU(n) rotations. If the gauge group is SU(2), these three orthogonal directions in the SU(2) space may, by SU(2) transformations, be brought to any other orthogonal directions. This is because the Lie algebra of SU(2) coincides with the Lie algebra of SO(3). It means that, to implement full SU(2)-symmetry, all possible directions have to be included in the domain of integration. Hence the state is unique. For SU(3), however, the situation is different. By SU(3) transformations one cannot rotate an octet vector and, even less, a triplet of octets, to all possible directions in $R^8$. This is the well-known fact that the SU(3) group has a non-trivial structure in the octet space [16]. To preserve SU(3) invariance it suffices to include in the vacuum the fluctuating directions that can be reached from a particular one by SU(3) transformations. Denoting by $VO_8$ the space of orthogonal frames in $R^8$ we conclude that there are as many nonequivalent ways to construct ground states compatible with SU(3) invariance as there are SU(3) orbits in $VO_8$. Therefore, because there is no special reason to prefer any particular orbit, one concludes that a gluonic vacuum, with a non-trivial gluon condensate, has a larger SO(8) symmetry, because SO(8) is the smallest group that rotates any three-frame in $VO_8$ into any other.
Hence, the possible vacuum structures correspond to distinct representations of the homogeneous coset space SO(8)/SU(3). This reasoning holds independently of the particular form of the ground state measure density $|\Psi_0\{A\}|^2$. It seems to be a general consequence of the existence of a non-trivial gluon condensate structure in the vacuum. To characterize the representations of the coset space one has to specify the imbedding of SU(3) in SO(8) because, as is well known, branching rules depend on the nature of the subgroup imbedding. In the case that is studied here, the SO(8) group is the group of rotations in the octet representation of SU(3), the imbedding of SU(3) in SO(8) being uniquely defined (see the Appendix). The next step is the classification of the possible ground state structures through the irreducible representations of SO(8) and their reduction under SU(3). The lowest dimensional representations of SO(8) are the trivial one and those associated to the fundamental weights $\Lambda_1$, $\Lambda_2$, $\Lambda_3$ and $\Lambda_4$ [18]. The trivial [0,0,0,0] representation would correspond to a vacuum without a condensate structure and therefore is of no interest. $\Lambda_2 = [0,1,0,0]$ is the 28-dimensional adjoint representation, which under the SU(3) color subgroup (26) reduces into $10 + \overline{10} + 8$. The most interesting case corresponds to the other three representations $\Lambda_1 = [1,0,0,0]$, $\Lambda_3 = [0,0,1,0]$ and $\Lambda_4 = [0,0,0,1]$, which are 8-dimensional representations that are also irreducible octets under this SU(3) subgroup. They are one-dimensional representations of the homogeneous coset space SO(8)/SU(3). Therefore one finds 3 identical ground state structures which are only distinguished by SO(8) quantum numbers (the maximal weights or the values of the Casimir operators). The conclusion is that:

# For the SU(3) group, a ground state of the form (1) (or in general any state with a similar gluon background structure) has a three-fold multiplicity.

Remark: SU(3) is the dynamical invariance of the theory that we start with. Therefore it is this group that should control all the dynamical features of the theory: selection rules, mass matrix structure, etc. The extra SO(8) symmetry emerges only as a consequence of the realization of the vacuum through the composite condensates $\langle A_\mu^a A^{a\mu}\rangle$ and $\langle F_{\mu\nu}^a F^{a\mu\nu}\rangle$. Hence the only place where the SO(8) extra quantum numbers naturally appear is on the labelling of the vacuum classes, which however are equivalent from the point of view of QCD interactions. In particular, if the mass matrix is SO(8)-blind, one would obtain an example of a democratic mass matrix with all elements equal, as discussed by a number of authors [19, 20, 21]. When diagonalized it leads (in leading order) to one massive state and two massless ones. Therefore the SO(8)-induced background multiplicity is suggestive of a generation type mechanism.

4
Appendix. SO(8) rotations in the octet space
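The antihermitean generators for the Lie algebra of SO(8) have the commutation relations
$$[M_{pq}, M_{rs}] = \delta_{qr}M_{ps} - \delta_{qs}M_{pr} - \delta_{pr}M_{qs} + \delta_{ps}M_{qr}, \qquad p,q,r,s = 1,\dots,8 \tag{24}$$
The generator $M_{pq}$, with matrix elements
$$(M_{pq})_{jk} = \delta_{pj}\delta_{qk} - \delta_{pk}\delta_{qj} \tag{25}$$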
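in the defining representation, generates rotations in the plane $(pq)$. The structure constants of a Lie algebra are the matrix elements of the adjoint representation. Therefore, from the usual structure constants $f_{abc}$ of SU(3), one reads the representation of the antihermitean SU(3) generators $F_a$ as functions of the SO(8) rotations in the octet space:
$$\begin{aligned}
F_1 &= -M_{23} - \tfrac12 M_{47} + \tfrac12 M_{56}\\
F_2 &= M_{13} - \tfrac12 M_{46} - \tfrac12 M_{57}\\
F_3 &= -M_{12} - \tfrac12 M_{45} + \tfrac12 M_{67}\\
F_4 &= \tfrac12 M_{17} + \tfrac12 M_{26} + \tfrac12 M_{35} - \tfrac{\sqrt3}{2} M_{58}\\
F_5 &= -\tfrac12 M_{16} + \tfrac12 M_{27} - \tfrac12 M_{34} + \tfrac{\sqrt3}{2} M_{48}\\
F_6 &= \tfrac12 M_{15} - \tfrac12 M_{24} - \tfrac12 M_{37} - \tfrac{\sqrt3}{2} M_{78}\\
F_7 &= -\tfrac12 M_{14} - \tfrac12 M_{25} + \tfrac12 M_{36} + \tfrac{\sqrt3}{2} M_{68}\\
F_8 &= -\tfrac{\sqrt3}{2} M_{45} - \tfrac{\sqrt3}{2} M_{67}
\end{aligned} \tag{26}$$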
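The relations (24)-(26) lend themselves to a direct numerical check. The sketch below is an illustration only: it assumes the standard Gell-Mann values of the structure constants $f_{abc}$ and the sign convention $F_a = -\sum_{b<c} f_{abc} M_{bc}$, which reproduces the first and last lines of (26); all function names are hypothetical. It verifies the commutators (24) for the matrices (25) and the closure $[F_a, F_b] = f_{abc}F_c$.

```python
import math

N = 8

def zeros():
    return [[0.0] * N for _ in range(N)]

def M(p, q):
    """Matrix (25): (M_pq)_jk = d_pj d_qk - d_pk d_qj, with 1-based p, q."""
    m = zeros()
    m[p - 1][q - 1] += 1.0
    m[q - 1][p - 1] -= 1.0
    return m

def lin(*terms):
    """Linear combination of (coefficient, matrix) pairs."""
    out = zeros()
    for c, a in terms:
        for i in range(N):
            for j in range(N):
                out[i][j] += c * a[i][j]
    return out

def comm(a, b):
    """Commutator ab - ba of two 8x8 matrices."""
    ab = [[sum(a[i][k] * b[k][j] for k in range(N)) for j in range(N)] for i in range(N)]
    ba = [[sum(b[i][k] * a[k][j] for k in range(N)) for j in range(N)] for i in range(N)]
    return lin((1.0, ab), (-1.0, ba))

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(N) for j in range(N))

def d(i, j):
    return 1.0 if i == j else 0.0

def check_so8():
    """Check relation (24) on all pairs of independent generators M_pq, p < q."""
    pairs = [(p, q) for p in range(1, 9) for q in range(p + 1, 9)]
    for p, q in pairs:
        for r, s in pairs:
            rhs = lin((d(q, r), M(p, s)), (-d(q, s), M(p, r)),
                      (-d(p, r), M(q, s)), (d(p, s), M(q, r)))
            if not close(comm(M(p, q), M(r, s)), rhs):
                return False
    return True

S3 = math.sqrt(3) / 2
# Standard Gell-Mann structure constants (assumed convention).
BASE = {(1, 2, 3): 1.0, (1, 4, 7): 0.5, (1, 5, 6): -0.5, (2, 4, 6): 0.5,
        (2, 5, 7): 0.5, (3, 4, 5): 0.5, (3, 6, 7): -0.5, (4, 5, 8): S3, (6, 7, 8): S3}

def f(a, b, c):
    """Totally antisymmetric extension of BASE."""
    for (x, y, z), v in BASE.items():
        if (a, b, c) in ((x, y, z), (y, z, x), (z, x, y)):
            return v
        if (a, b, c) in ((y, x, z), (x, z, y), (z, y, x)):
            return -v
    return 0.0

def F(a):
    """SU(3) generator as a combination of SO(8) rotations: (F_a)_bc = -f_abc."""
    return lin(*[(-f(a, b, c), M(b, c)) for b in range(1, 9) for c in range(b + 1, 9)])

def check_su3():
    """Check [F_a, F_b] = sum_c f_abc F_c."""
    gens = {a: F(a) for a in range(1, 9)}
    for a in range(1, 9):
        for b in range(1, 9):
            rhs = lin(*[(f(a, b, c), gens[c]) for c in range(1, 9)])
            if not close(comm(gens[a], gens[b]), rhs):
                return False
    return True

if __name__ == "__main__":
    print("SO(8) relations (24):", check_so8())
    print("SU(3) closure of (26):", check_su3())
```

Pure Python lists are used instead of a numerical library so that the check has no dependencies; for 8x8 matrices the cost is negligible.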
To make the Cartan algebra of SU(3), $\{F_3, F_8\}$, a subalgebra of the Cartan algebra of SO(8) it is convenient to choose this latter as
$$\{h_i\} = \{M_{12}, M_{45}, M_{67}, M_{38}\} \tag{27}$$
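The three octet representations [1,0,0,0], [0,0,1,0] and [0,0,0,1] of SO(8) [18] are also irreducible octets of the SU(3) subgroup defined in (26). The correspondence between states is the following:

$\Lambda_1 = [1,0,0,0]$
$$\begin{array}{ccc}
SU(3) & & SO(8)\\
|\tfrac12\,\tfrac12\,1\rangle & \leftrightarrow & |0\,i\,0\,0\rangle\\
|1\,1\,0\rangle & \leftrightarrow & |i\,0\,0\,0\rangle\\
|1\,0\,0\rangle & \leftrightarrow & \tfrac{1}{\sqrt2}\left(|0\,0\,0\,{-i}\rangle - |0\,0\,0\,i\rangle\right)\\
|1\,{-1}\,0\rangle & \leftrightarrow & |{-i}\,0\,0\,0\rangle\\
|0\,0\,0\rangle & \leftrightarrow & \tfrac{1}{\sqrt2}\left(|0\,0\,0\,{-i}\rangle + |0\,0\,0\,i\rangle\right)\\
|\tfrac12\,\tfrac12\,{-1}\rangle & \leftrightarrow & |0\,0\,{-i}\,0\rangle\\
|\tfrac12\,{-\tfrac12}\,{-1}\rangle & \leftrightarrow & |0\,{-i}\,0\,0\rangle
\end{array} \tag{28}$$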
A3 = [0,0,1,0] SU(3)
50(8) 2 2?X/ - m i l
(29)
2.523/
|H0) 1100) |1 - 10) |000)
L
(li="
I I - I I '
72 I 2 2 2 :
^(IJ¥H) 2 - 5 |VW^» *
l
IS •
•
•
J 1
A4 = [0,0,0,1] 5f/(3)
50(8)
12 2 V
1110) 1100) |1 - 10) |000) 111 _ l \ 2 2
I2 2
/
V
(30)
^mfej+fe&i))
iiiii
where the SU(3) quantum numbers are $|I\,I_3\,Y\rangle$, with $I_3 = iF_3$ and $Y = \frac{2i}{\sqrt3}F_8$, and the SO(8) quantum numbers are the eigenvalues of the antihermitean generators of the Cartan algebra $\{h_i\}$.

References
1. S. Albeverio, R. Hoegh-Krohn and L. Streit; J. Math. Phys. 18 (1977) 907.
2. S. Albeverio, R. Hoegh-Krohn and L. Streit; J. Math. Phys. 21 (1980) 1636.
3. S. Albeverio, S. Fukushima, W. Karwowski and L. Streit; Commun. Math. Phys. 81 (1981) 501.
4. M. Gell-Mann, R. Oakes and B. Renner; Phys. Rev. 175 (1968) 2195.
5. M. Shifman, A. Vainshtein and V. Zakharov; Phys. Lett. B77 (1978) 80; Nucl. Phys. B147 (1979) 385, 448, 519.
6. M. Shifman (Ed.); Vacuum structure and QCD sum rules, North Holland, Amsterdam 1992.
7. R. Vilela Mendes; Z. Phys. C - Particles and Fields, 54 (1992) 273.
8. J. P. Greensite; Nucl. Phys. B158 (1979) 469.
9. R. P. Feynman; Nucl. Phys. B188 (1981) 479.
10. K. Yosida; Functional analysis, Springer, Berlin 1974.
11. K. Symanzik; Nucl. Phys. B190 [FS3] (1981) 1.
12. M. Lüscher; Nucl. Phys. B254 (1985) 52.
13. R. Jackiw; Analysis on infinite-dimensional manifolds - Schrödinger representation for quantum fields, MIT preprint CPT 1632, 1988.
14. F. Coester and R. Haag; Phys. Rev. 117 (1960) 1137.
15. M. I. Freidlin and A. D. Wentzell; Random perturbations of dynamical systems, Springer, Berlin 1984.
16. L. Michel and L. A. Radicati; Ann. Inst. Henri Poincaré A18 (1973) 185.
17. H. Ito; "Two-body Dirac equation and its wave function at the origin", LANL hep-ph/9708268.
18. J. F. Cornwell; Group Theory in Physics, vol. 2, Academic Press, London 1984.
19. H. Fritzsch; Phys. Lett. 73B (1978) 317; 166B (1986) 424; 184B (1987) 391.
20. H. Georgi and C. Jarlskog; Phys. Lett. 86B (1979) 297.
21. P. Ramond, R. G. Roberts and G. G. Ross; Nucl. Phys. B406 (1993) 19.
ON THE INVARIANCE PRINCIPLE AND THE LAW OF ITERATED LOGARITHM FOR STATIONARY PROCESSES

DALIBOR VOLNÝ
UPRES-A 6085 CNRS, Université de Rouen, F-76821 Mont-Saint-Aignan Cedex, France

PAVEL SAMEK
Xemax a.s., Branická 43, 147 00 Prague 4, Czech Republic
E-mail: [email protected]

Let $T$ be an ergodic, measurable and measure preserving automorphism of a probability space $(\Omega, \mathcal A, \mu)$, $f = m + g - g\circ T$ where $m \in L^2$, $g \in L^p$, $g - g\circ T \in L^r$, $0 < p < 2 < r$; $(m\circ T^i)$ is a martingale difference sequence. We shall study for which values of $p$, $r$ the (Donsker) invariance principle and the law of iterated logarithm hold for the process $(f\circ T^i)$.
1
Introduction
Let $(\Omega, \mathcal A, \mu)$ be a probability space, $T : \Omega \to \Omega$ a one-to-one bimeasurable and measure preserving transformation. For any measurable function $f$ on $\Omega$, the process $(f\circ T^i)$ is strictly stationary and vice versa, for every strictly stationary process $(X_i)$ we can find $(\Omega, \mathcal A, \mu)$, $T$ and $f$ such that $(f\circ T^i)$ and $(X_i)$ have the same distributions. We shall suppose that the measure $\mu$ is ergodic with respect to $T$, i.e. every set $A \in \mathcal A$ with $A = T^{-1}A$ has measure 0 or measure 1. A classical method for proving the central limit theorem is to find a decomposition
$$f = m + g - g\circ T \tag{1}$$
where the CLT holds true for the process $(m\circ T^i)$ and the function $g$ is measurable. The difference $g - g\circ T$ is called a coboundary and $g$ is called a transfer function. We shall study the integrability conditions for $g$ and $g - g\circ T$ which guarantee also the preservation of the (Donsker) invariance principle and the law of iterated logarithm. All the limit laws concerned are guaranteed for a square integrable (and ergodic) martingale difference sequence. Classically, if $g$ is square integrable, the invariance principle and the law of iterated logarithm are preserved [9]. In this paper we shall give values of $0 < p < 2 < r$ such that if $g \in L^p$ and $g - g\circ T \in L^r$ then the perturbation of $m$ by the coboundary $g - g\circ T$ preserves the limit laws, and we also give values of $0 < p < 2 < r$ such that a counterexample exists.

As shown by Volný [13], (1) with $m, g \in L^p$, $p = 1, 2$, where $(m\circ T^i)$ is a martingale difference sequence, is equivalent to the convergence of the sums
$$\sum_{i=0}^{\infty} E(f\circ T^i \mid \mathcal M), \qquad \sum_{i=0}^{\infty}\left(f\circ T^{-i} - E(f\circ T^{-i}\mid \mathcal M)\right) \tag{2}$$
in $L^p$ where $\mathcal M \subset T^{-1}\mathcal M \subset \mathcal A$ ($(T^{-i}\mathcal M)$ is then a filtration for $(m\circ T^i)$). The proof was given for $p = 1, 2$ but it works for all $0 < p < \infty$. Using (2), the condition (1) with $g \in L^2$ can be verified in many important cases, for example if
- $(f\circ T^i)$ is a stationary linear process with $f = \sum_{i=-\infty}^{\infty} a_{-i}e_i$ where $e_i = m\circ T^i$ is a sequence of square integrable and ergodic martingale differences and $\sum_{n=1}^{\infty}\left(\sum_{i=n}^{\infty} a_i\right)^2 + \sum_{n=1}^{\infty}\left(\sum_{i=n}^{\infty} a_{-i}\right)^2 < \infty$ (cf. Hall-Heyde [9])
- $(f\circ T^i)$ is a strictly stationary $\varphi$-mixing process with $\sum_{k=1}^{\infty}(\varphi(k))^{1/2} < \infty$ (cf. Hall-Heyde [9] or Dürr-Goldstein [4])
- $(X_i)$ is a stationary and ergodic Markov chain with Markov kernel $P$, $f\circ T^i = h(X_i)$, and $h \in \mathrm{Im}(P - I)$ (cf. Gordin-Lifšic [8])
- $(\Omega, \mathcal A, \mu)$ is the interval [0,1] with the Borel
$$\sum_{j=0}^{\infty}\left(|E(f\mid T^{j}\mathcal M)| + |f - E(f\mid T^{-j}\mathcal M)|\right) < \infty \tag{3}$$
and
$$\limsup_{n\to\infty} E|S_n(f)/\sqrt n| < \infty \tag{4}$$
then there exists a limit $\sigma = \limsup_{n\to\infty} E|S_n(f)/\sqrt n|$. If

The condition (3) implies (1) with $m$, $g$ integrable (in fact, (2) is a generalization of (3)); (4) then guarantees that $m$ is square integrable. In the papers by Gordin and Lifšic [8], by Donsker and Varadhan [3], by Liverani [11] and by Derriennic and Lin [2], the central limit theorem for Markov chains was studied via the martingale-coboundary decomposition. In the last three contributions, the case of a not square integrable transfer function $g$ occurs, and especially the results of Derriennic and Lin give examples where the $L^p$ norm of $g$ can be calculated.
2
Equivalence Theorem
Recall that by $S_n(f)$ we denote the sum $\sum_{i=0}^{n-1} f\circ T^i$; we define
$$S_n^*(f,t) = S_{[nt]}(f) + (nt - [nt])\, f\circ T^{[nt]}.$$
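The interpolated process $S_n^*(f,t)$ is simply the piecewise linear path through the points $(k/n, S_k(f))$. A minimal sketch of the definition follows, assuming the values $f\circ T^i(\omega)$ along one orbit are given as a list; the function names are illustrative.

```python
import math

def partial_sum(values, k):
    """S_k = values[0] + ... + values[k-1], with S_0 = 0."""
    return sum(values[:k])

def interpolated_partial_sum(values, n, t):
    """S*_n(f, t) = S_[nt](f) + (nt - [nt]) * f(T^[nt] w) for 0 <= t <= 1.

    values[i] stands for f(T^i w); at least n values are required.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    k = math.floor(n * t)
    frac = n * t - k
    # At a node t = k/n the fractional part vanishes and no further term is needed.
    next_term = values[k] if frac > 0 else 0.0
    return partial_sum(values, k) + frac * next_term

if __name__ == "__main__":
    sample = [1.0, -2.0, 3.0, 0.5]
    for t in (0.0, 0.5, 0.625, 1.0):
        print(t, interpolated_partial_sum(sample, 4, t))
```

At the nodes $t = k/n$ the function returns $S_k(f)$, and between nodes it interpolates linearly, so its supremum over $[0,1]$ is attained at a node.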
For the process $(f\circ T^i)$ the (weak, Donsker) invariance principle holds true iff the $C[0,1]$-valued random variables $S_n^*(f,t)/(\sigma\sqrt n)$ weakly converge to a Brownian motion process, $\sigma = \lim_{n\to\infty}\left(E(S_n^2/n)\right)^{1/2}$. For simplicity we shall suppose that $\sigma = 1$. The law of iterated logarithm takes place iff for almost all $\omega \in \Omega$, $\limsup_{n\to\infty} S_n(f)/\sqrt{n\log\log n} = 1$ and $\liminf_{n\to\infty} S_n(f)/\sqrt{n\log\log n} = -1$. The functional law of iterated logarithm takes place iff the sequence of $S_n^*(f,t)/\sqrt{n\log\log n}$ is relatively compact and the set of its limit points coincides with the set of all absolutely continuous functions $x \in C[0,1]$ such that $x(0) = 0$ and $\int_0^1 (x'(t))^2\,dt \le 1$ (where $x'$ denotes the derivative of $x$ determined almost everywhere with respect to the Lebesgue measure).

Theorem 1. (The equivalence theorem) Let us suppose that for the process $(m\circ T^i)_{i\in\mathbb Z}$ the invariance principle, the law of iterated logarithm (functional law of iterated logarithm) respectively, hold true. Let $g$ be a measurable function and $f = m + g - g\circ T$. Then for the process $(f\circ T^i)_{i\in\mathbb Z}$
- the invariance principle holds if and only if
$$\frac{1}{\sqrt n}\max_{1\le k\le n}|g\circ T^k| \to 0 \quad\text{in probability} \tag{5}$$
- the law of iterated logarithm as well as the functional law of iterated logarithm holds if and only if
$$\frac{1}{\sqrt{n\log\log n}}\; g\circ T^n \to 0 \quad\text{a.s.} \tag{6}$$
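Proof. By $\|\cdot\|$ we denote the supremum norm in $C[0,1]$. From
$$S_n^*(f,t) = S_n^*(m,t) + S_n^*(g - g\circ T, t), \qquad t \in [0,1]$$
it follows
$$\|S_n^*(f) - S_n^*(m)\| = \|S_n^*(g - g\circ T)\| = \max_{1\le k\le n}|S_k(g - g\circ T)| = \max_{1\le k\le n}|g - g\circ T^k|$$
hence
$$|g\circ T^n| - |g| \le \max_{1\le k\le n}|g\circ T^k| - |g| \le \|S_n^*(f) - S_n^*(m)\| \le \max_{1\le k\le n}|g\circ T^k| + |g| \tag{7}$$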
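The identity behind (7) is the telescoping of the coboundary sums, $S_k(g - g\circ T) = g - g\circ T^k$. The sketch below illustrates it numerically on an irrational rotation of the unit interval; the rotation, the function `g_example` and the starting point are assumptions chosen only for illustration.

```python
import math

def orbit(T, x, n):
    """Return [x, T x, ..., T^n x]."""
    pts = [x]
    for _ in range(n):
        pts.append(T(pts[-1]))
    return pts

def coboundary_partial_sums(g, pts):
    """S_k(g - g∘T) for k = 1..n, summed term by term along the orbit."""
    sums, running = [], 0.0
    for i in range(len(pts) - 1):
        running += g(pts[i]) - g(pts[i + 1])
        sums.append(running)
    return sums

def rotation(alpha):
    """Rotation x -> x + alpha (mod 1), a measure preserving map of [0, 1)."""
    return lambda x: (x + alpha) % 1.0

def g_example(x):
    return math.cos(2 * math.pi * x) + x * x

n = 200
pts = orbit(rotation(math.sqrt(2) - 1), 0.1, n)
sums = coboundary_partial_sums(g_example, pts)
telescoped = [g_example(pts[0]) - g_example(pts[k]) for k in range(1, n + 1)]

sup_norm = max(abs(s) for s in sums)            # ||S*_n(g - g∘T)||
max_g = max(abs(g_example(pts[k])) for k in range(1, n + 1))
g0 = abs(g_example(pts[0]))

if __name__ == "__main__":
    print("largest telescoping error:", max(abs(a - b) for a, b in zip(sums, telescoped)))
    print("bounds in (7):", max_g - g0, "<=", sup_norm, "<=", max_g + g0)
```

Because the interpolated path is piecewise linear, its supremum norm equals the maximum over the nodes, which is why only the values $S_k$, $1 \le k \le n$, enter the computation.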
Recall [1] that the invariance principle holds for $(f\circ T^i)$ if and only if the random functions $S_n^*(f,t)/\sqrt n$ converge in finite dimensional distributions to the Brownian motion process, and for every $\varepsilon > 0$,
$$P\left(\sup_{|s-t|<\delta}\frac{1}{\sqrt n}\,|S_n^*(f,s) - S_n^*(f,t)| > \varepsilon\right) \to 0 \quad\text{as } \delta \searrow 0 \text{ uniformly for all } n. \tag{8}$$
Suppose that (5) takes place. By Hall-Heyde [9] the invariance principle holds for the process $(m\circ T^i)$, hence (8) holds for $(m\circ T^i)$, too. (7) implies that $\|S_n^*(f) - S_n^*(m)\|/\sqrt n \to 0$ in probability. Therefore, (8) takes place for $(f\circ T^i)$. Because
$$\frac{1}{\sqrt n}\,S_n(g - g\circ T) = \frac{1}{\sqrt n}\,(g - g\circ T^n) \to 0 \quad\text{in probability,} \tag{9}$$
$S_n^*(f,t)/\sqrt n$ converge in finite dimensional distributions to the Brownian motion process. The invariance principle thus holds for the process $(f\circ T^i)$ [1]. If the (functional) law of iterated logarithm holds true for the process $(m\circ T^i)$, (6) and (7) imply the (functional) law of iterated logarithm for the process $(f\circ T^i)$. This proves the sufficiency.
Suppose that for $(f\circ T^i)$ the invariance principle takes place. Because it takes place also for the process $(m\circ T^i)$, we get (8) for $S_n^*(g - g\circ T, t)/\sqrt n$. (9) implies that the random functions $S_n^*(g - g\circ T, t)/\sqrt n$ converge in finite dimensional distributions to a function which is identically zero, hence the random functions $S_n^*(g - g\circ T, t)/\sqrt n$ weakly converge to a zero function [1]. The random variables $\sup_{0\le t\le 1}|S_n^*(g - g\circ T, t)|/\sqrt n$ thus converge in probability to zero, which implies (5). Let us suppose that (6) does not hold. We shall use

Lemma. Let $g$ be a measurable function and $\psi : \mathbb N \to \mathbb R_+$ a positive increasing function such that inf
777^
=
a > 1
-
Then either
*£-.(> a.,
(10)
i>(n)
or Ppimsup ^ " T " 1 = oo) = 1. n->oo
(11)
1p(n)
If (6) does not hold, then by (7) and the Lemma we get w ' lim sup v/nloglogn n—foo
y/nloglogn
= lim sup . . . = = oo a. s. n-+oo y/n log log n
Since for (m o T*) the (functional) law of iterated logarithm holds true and the elements of C[0,1] are bounded, limsup,,^„> |S„(/)|/VnIogTogn = oo and the set of the functions S'*(/)/v/nloglogn is not relatively compact for a.e. weft. D
So, the last step which remains to be done is Proof of Lemma. For c > 0 we denote r \g ° Tn\ i A. = < limsup ' . ' > c> oo
n n i){ifl°:r*i^cw*)}GA c'
k>n
For every $c$, $A_c$ is an invariant set: $\psi$ is increasing, hence $A_c \subset T^{-1}A_c$. Therefore, $A_c$ is an invariant set. $T$ is ergodic, hence
P(Ac) = 0 or P{Ac) = l. Let us suppose (10) does not hold. We shall prove (11). There is a CQ such that P{ACo) > 0, i.e. P(Ac0) = 1 Let 0 < c < Co be a fixed number such that ac > CQ. For TV e N we define BN = {u:
\g{Tku)\>catp{k)}.
3fc >N
Let us choose an arbitrary UJ € ACo. There exists a sequence {TIJ }^2. j , tij > 27V, Wj t oo, such that \g(Tn'u)\ hence for [^-]
> ciliirij),
j = 1,2,...,
- N and j = 1,2,... we have
19(^-^^)1
>aP{nj) >ca^([y]) >ca^(nj-t).
For [%] < i < Uj - N thus T'w 6 BN hence E ^ J ^ ^ ^ - T V ,
U<EAC0,
J =
1,2,...
Because P(A C0 ) = 1, we deduce 1 "_1 1 Um sup n- V XBN ° Tl > 2- a.s. n-»oo
^
By the Birkhoff ergodic theorem 1 "_1 lim -YiXBll°Ti
= P(BN)
a.s.
«=0
hence P(BN)
> \ for all TV > 1. By the definition BN c
P ( D N > I ^ N ) > 1/2. Therefore
P(A CQ )>p(f| ^ ) > ^ , N>1
BJV+I,
hence
hence P(Aca)
= 1.
We deduce that the set of all c such that P(AC) > 0 cannot have a finite supremum hence (11) takes place. □
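The proof of the Lemma relies on the Birkhoff ergodic theorem, which identifies the time average of an indicator $\chi_B\circ T^i$ with $P(B)$. A small numerical illustration of that step follows; the irrational rotation, the set $B = [0, 0.3)$ and the starting point are assumptions of this sketch, not objects from the proof.

```python
import math

def time_average(indicator, T, x, n):
    """(1/n) * sum_{i<n} indicator(T^i x)."""
    total = 0
    for _ in range(n):
        total += indicator(x)
        x = T(x)
    return total / n

# Rotation by the golden mean: ergodic for Lebesgue measure on [0, 1).
alpha = (math.sqrt(5) - 1) / 2
T = lambda x: (x + alpha) % 1.0
in_B = lambda x: 1 if x < 0.3 else 0   # indicator of B = [0, 0.3), so P(B) = 0.3

avg = time_average(in_B, T, 0.2, 100_000)

if __name__ == "__main__":
    print("time average:", avg, "measure of B:", 0.3)
```

In the proof the same principle is applied to $\chi_{B_N}$: a lower bound on the time averages along one typical orbit transfers to a lower bound on $P(B_N)$.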
3
Sufficient conditions
Theorem 2. Let $0 < p < 2 < r$, $g \in L^p$, $g - g\circ T \in L^r$. If
$$p \ge \frac{r+2}{r}$$
then the invariance principle and the law of iterated logarithm take place.

1. Proof of the law of iterated logarithm. By Theorem 1 we have to prove
$$\frac{1}{\sqrt{n\log\log n}}\; g\circ T^n \to 0 \quad\text{almost surely,}$$
i.e. that for any $\varepsilon > 0$, $|g\circ T^n|/\sqrt{n\log\log n} > \varepsilon$ for at most finitely many $n$. Let $\alpha > 0$ (to be fixed later), $[x]$ denote the integer part of $x$,
$$m_j = \sum_{i=1}^{j-1}[i^{\alpha}], \qquad j = 1, 2, \dots$$
JT.P (30 < i < \3%
l
|g°T^+'| > e)
is finite. For any integers m,j > 1 we have 3 * {30e}c[j[j{\goTm+l-l-goTm+l\>1}
e, i'
>=i1=1
c\\{\g°Tm+i-l-goTm+i\>e-:}, iTi
3
hence
p 30s,SJ
(
-7S15B?Sl"rw,|><)
(12)
SP
(vmloglogm"'°r*'>s)
+
SF
(^^logml!">l)+iP(v/mlo6lo5ml9-''or|>^)'
therefore f > (30 < t < [T],
^
\goTm'+i\
/
^mjloglogmj
00
/
i=i
>e ]
2y
1 2(j a ]
^mjloglogmj
A direct calculation shows that p > ^
is equivalent to
2~P
p(ao
-
igor"*-*!^
- • " Vmloglogm'^
'
/ ~
\mP/2
There exist c 1 (e), 02(e) such that P (\g\ > — VmloglogmJ < c i ( e ) j j ^ j and
mTl2J
K
'
hence, if we put c = max(ci(e), 02(e)), (13) follows from (12). By (13) for any e > 0 there exist constants c = c(e),C = C(e) > 0 such that
f > f 30 < i < [ja], ]^[
V
\goTm^\ > e)
\
^/mjloglogmj
+
- U [mf
\rri)
1
J
l +\j°r \
U ^(Q+1)P/2 i (a+1)r/2 ) '
The last sum is finite hence by the Borel-Cantelli lemma the law of iterated logarithm takes place. Let us suppose 2-p _ r - 2 ~p r + 2' We put a = ^
= ^f. We shall directly prove that the sums
p
151
>?V £ (^r log log— m_,- 2 p^(i»-.^i> 6N/;i y i7 ) are finite. We have mj = Jj(ll [1% j = 1,2,... hence £ p (\g\ > ^v/m.loglogm,) < £ P ( | 5 | > c j 1 / " ) for some constant c > 0; because g E V, the sum on the right converges. It remains to show that for a = £=§ the sum
Y^a\P [\9-9°T\> 2^\A"7logtagm,J converges. We denote h = \g - g o T\r; there exists a constant c > 0 such that the sum then can be estimated by 00
Y,jaP(h>c-ja+1)
=
^ P ( c - j Q + 1
j=\
^ - l - ( j + i r + 1 P ( c . r + 1 < / l < c . ( ; + l)«+1); because r > 2 and h is integrable, the sum converges.
□
2. Proof of the invariance principle. By Theorem 1 we have to show that for every $g \in L^p$ with $g - g\circ T \in L^r$
$$\frac{1}{\sqrt n}\max_{1\le k\le n}|g\circ T^k| \to 0 \quad\text{in probability;}$$
it suffices to show
$$\frac{1}{\sqrt n}\max_{1\le k\le n}|g - g\circ T^k| \to 0 \quad\text{in probability.}$$
The assumption r+2
-1 r
We thus have
l-a-f<0, 1 - r- + ar < 0. Let 3 G L p and g-goT P
eLT.
Then for any e > 0 and 1 < k < n
(^^lff-5°T
j=0 «=1
= Ei j (i«/-«7°^*i>^) + ([^] + i)E p (i 5 i(iff-p°ri)>^)
<^
+ l)p{\9\>l^)
+ (l +
<(? + i n 2 - ^ - i t f
l)t,p(\g-goT\>±V*)
\g\pdP + kT+1
-^ e n
I
\9-9°T\*dp\.
' J\g-goT\>t^i/2k
J
Let us choose k = [na). Then the last expression is maj orated by a multiple of A(n) = nl~a
( 2—n-p/2 / \g\P dP + V eP J\g\>eVX/4
(H)\ a ^)-r,2 f
\g-goT\*dp).
From the assumptions $1 - \alpha - \frac p2 \le 0$ and $1 - \frac r2 + \alpha r \le 0$ it follows that $A(n) \to 0$. □
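The link between the two exponent conditions and the hypothesis on $p$ and $r$ can be made explicit by a short computation. The following is a sketch under the assumption that the conditions are read in their non-strict form and that $k = [n^{\alpha}]$ as chosen in the proof; it is not part of the original argument.

$$
1 - \alpha - \frac p2 \le 0 \iff \alpha \ge 1 - \frac p2,
\qquad
1 - \frac r2 + \alpha r \le 0 \iff \alpha \le \frac12 - \frac1r .
$$

$$
\text{Such an } \alpha \text{ exists} \iff 1 - \frac p2 \le \frac12 - \frac1r \iff p \ge 1 + \frac2r = \frac{r+2}{r}.
$$

With strict inequalities the same steps give $p > \frac{r+2}{r}$. In the boundary case $p = \frac{r+2}{r}$ the only admissible choice is $\alpha = 1 - \frac p2 = \frac12 - \frac1r$, both powers of $n$ in $A(n)$ are then bounded, and $A(n) \to 0$ rests on the two tail integrals tending to zero; this holds because $|g|^p$ and $|g - g\circ T|^r$ are integrable and $\sqrt n / k \to \infty$ for $\alpha < \frac12$.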
4
Counterexamples
Theorem 3. Let 0 < p < 2 < r, r-1 '
2
T/ien f/iene exists a function g G IP such that g - goT € V and —= max \goTk\-/*0 ■y/n i < * < n
in probability,
= |5oT*| 7^- 0 s/n log log n
o. s.
Vh port«cu/or, t/iere citsis a function g £ L1 such that and —y= max koT^I^AO y/n l<*
in probability,
g£L2,g-gaT€L2,
435 r
.
.
i
.
Proof. Notice that p < ^ 4 - is equivalent to *—f < ^ — ±; there thus exists an a such that i - 1
Fix e > 0, e < 1. Let {rij = c'}j>i, {k{ = [nf]}j>i be two sequences of positive integers where c > 2 is an integer and [x] is the integer part o f i e R For i > 1, using the Rokhlin lemma, we find a set Ai € A such that sets Ai,TAi,...
,Tni~xAi
are pairwise disjoint and
(14)
n,-l
P{[}TiAi)>\-e.
(15)
We define functions ^ ^ l o g l o g n , .x{foTni-JAi}
(16)
i=i
**
and gt = hi + hi o T + ... + hi o T ^ - 1 = ^ /i{ o T j i=o
(17)
(see the picture; here we denote d = V".'°K'°K".),
\ hd
hi Ai
TAi
<e}d
... ki times Q
The function g is defined by setting oo
a. s. «=1
We have ^
9 = 1.
- / *
Vfijlogloga
...
'
\ ni j
\Y,JX{T - Ai} +
2
£ ( *<--J)X{T"-^} .
From (14), (15) it follows ±>P{Ai)>'L-£,
»>1,
(18)
hence ki
< 5 > j + l)nf Vloglogni =i oo
T < 2 ] ^ ^a+i-i "\/loglognj < oo.
«=i
By (16), (17), we have \9i-9i°T\ = \hi-hioTki\
=
Vn
°fcgi0gn
X{T-*'
(J
T^Ai}\
hence (cf. (18)) oo
\\9-9°T\\r
=
^l/r y , nf " log log n<
oo
< 4^n?
r
v
T
' < oo.
From & = v/niloglognj
on T" 4 - **^,
0 < g* < v^loglognj
we derive P ( - ^ m « > o l * | > 1) = P ( Q T-"{g > 0 ^ } ) > P ( Q T-*{ 5 i > Vn7})
= P(\J Tni-ki-kAi) = P{\J Tn'-kAi) = mP{Ai) > l - e for i > 1, and 1
|g°r*|^o a. s.
v/nloglogn
□

The function can be constructed such that we have the invariance principle without the law of iterated logarithm, or the law of iterated logarithm without the invariance principle, while the Gordin martingale approximation [7] takes place.

References
1. P. Billingsley, Convergence of probability measures, ch. 8, J. Wiley, New York, 1968.
2. Y. Derriennic and M. Lin, in preparation.
3. M. Donsker and S.R. Varadhan.
4. D. Dürr and S. Goldstein, Remarks on the CLT for weakly dependent random variables, Stochastic Processes - Mathematics and Physics (Bielefeld) (S. Albeverio, Ph. Blanchard, and L. Streit, eds.), Lecture Notes in Maths., vol. 1158, 1984.
5. C.G. Esseen and S. Janson, On moment conditions for sums of independent variables and martingale differences, Stochastic Process. Appl. 19 (1985), 557-562.
6. M.I. Gordin, Abstracts of Communications, International Conference on Probability (Vilnius), vol. T.1:A-K.
7. M.I. Gordin, The central limit theorem for stationary processes, Soviet Math. Doklady 10 (1969), 1174-1176.
8. M.I. Gordin and V. Lifšic, A central limit theorem for Markov processes, Soviet Math. Doklady 19 (1978), 392-394.
9. P. Hall and C.C. Heyde, Martingale limit theory and its application, Academic Press, New York, 1980.
10. M. Kac, Probability methods in some problems of analysis and number theory, Bull. Amer. Math. Soc. 55 (1949), 641-665.
11. C. Liverani, Central limit theorem for deterministic systems, International Conference on Dynamical Systems (Montevideo, 1995) (Harlow), Pitman Res. Notes Math. Ser., vol. 362, Longman, 1996, pp. 56-75.
12. D. Volný, in preparation.
13. D. Volný, Approximating martingales and the central limit theorem for strictly stationary processes, Stochastic Processes and their Applications 44 (1993), 41-74.