Recent Development in Stochastic Dynamics and Stochastic Analysis
INTERDISCIPLINARY MATHEMATICAL SCIENCES Series Editor: Jinqiao Duan (Illinois Inst. of Tech., USA) Editorial Board: Ludwig Arnold, Roberto Camassa, Peter Constantin, Charles Doering, Paul Fischer, Andrei V. Fursikov, Sergey V. Lototsky, Fred R. McMorris, Daniel Schertzer, Bjorn Schmalfuss, Yuefei Wang, Xiangdong Ye, and Jerzy Zabczyk Published Vol. 1: Global Attractors of Nonautonomous Dissipative Dynamical Systems David N. Cheban Vol. 2: Stochastic Differential Equations: Theory and Applications A Volume in Honor of Professor Boris L. Rozovskii eds. Peter H. Baxendale & Sergey V. Lototsky Vol. 3: Amplitude Equations for Stochastic Partial Differential Equations Dirk Blömker Vol. 4: Mathematical Theory of Adaptive Control Vladimir G. Sragovich Vol. 5: The Hilbert–Huang Transform and Its Applications Norden E. Huang & Samuel S. P. Shen Vol. 6: Meshfree Approximation Methods with MATLAB Gregory E. Fasshauer Vol. 7: Variational Methods for Strongly Indefinite Problems Yanheng Ding Vol. 8: Recent Development in Stochastic Dynamics and Stochastic Analysis eds. Jinqiao Duan, Shunlong Luo & Caishi Wang Vol. 9: Perspectives in Mathematical Sciences eds. Yisong Yang, Xinchu Fu & Jinqiao Duan
Interdisciplinary Mathematical Sciences – Vol. 8
Recent Development in Stochastic Dynamics and Stochastic Analysis Editors
Jinqiao Duan Illinois Institute of Technology, USA
Shunlong Luo Chinese Academy of Sciences, China
Caishi Wang Northwest Normal University, China
World Scientific NEW JERSEY
•
LONDON
•
SINGAPORE
•
BEIJING
•
SHANGHAI
•
HONG KONG
•
TA I P E I
•
CHENNAI
Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE
British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.
RECENT DEVELOPMENT IN STOCHASTIC DYNAMICS AND STOCHASTIC ANALYSIS Interdisciplinary Mathematical Sciences — Vol. 8 Copyright © 2010 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN-13 978-981-4277-25-9 ISBN-10 981-4277-25-8
Printed in Singapore.
v
Professor Zhi-Yuan Huang
Editorial Foreword
This volume of Interdisciplinary Mathematical Sciences collects invited contributions giving timely surveys on a diverse range of topics on stochastic dynamics and stochastic analysis. This includes dynamics under random boundary conditions, decoherent information analysis, stabilization by noise, stochastic parameterization, white noise theory, self-similar processes and colored noise, non-Gaussian noise, stochastic evolutionary equations, infinite dimensional rotation group and Lie algebra, Weyl processes, random fields, Malliavin calculus, and stochastic integral. The ordering of the chapters follows the alphabetic order of the last names of the first authors of the contributions. We are grateful to all the authors for their contributions to this volume. We thank Dr. Chujin Li, Huazhong University of Science and Technology, Wuhan, China, for technical support. We would also like to thank Rok Ting Tan and Rajesh Babu at World Scientific Publishing, for their professional editorial assistance. We are privileged and honored to dedicate this volume to Professor Zhi-Yuan Huang, on the occasion of his 75th birthday, in celebration of his achievements in mathematical sciences. Jinqiao Duan, Illinois Institute of Technology, Chicago, USA Shunlong Luo, Chinese Academy of Sciences, Beijing, China Caishi Wang, Northwest Normal University, Lanzhou, China September 1, 2009
vii
Preface
This festschrift volume is dedicated to professor Zhi-Yuan Huang on the occasion of his 75th birthday. Zhi-Yuan Huang was born on June 2, 1934, in Nanchang, the capital city of Jiangxi province of southern China. He entered Wuhan University, a prestigious university in China, in 1956 and graduated in 1960. After graduation, he remained as a faculty member at the Department of Mathematics. In 1962 he went to Zhongshan University for advanced studies on theory of stochastic processes and published his first research papers. Two years later, he returned to Wuhan University, where he spent more than thirty years as a faculty member. From 1982 to 1983, he worked with Professor S. Orey as a senior visiting scholar at the University of Minnesota, USA. During that time, he proposed a theory of stochastic integration over a general topological measurable space, which included the well-known Itˆo integral as well as other existing stochastic integrals as special cases. At the end of 1983, he returned to Wuhan University and gave lectures to graduate students on stochastic analysis. In 1984, he was appointed as associate director of the Research Institute for Mathematics at Wuhan University and the following year he was promoted to full professor. He continued his research on stochastic analysis and visited Kyoto University, Japan for one month in 1987 as visiting professor supported by JSPS. In 1988, his book Foundations of Stochastic Analysis was published, which was the first monograph systematically presenting Malliavin calculus in Chinese. In fact he had got interested in Malliavin calculus earlier than 1985, when he published his paper Malliavin calculus and its applications. At the end of 1988, he also wrote an article introducing the quantum stochastic calculus, a new theory created by R. L. Hudson and K. R. Parthasarathy. In 1990, he received the Natural Science Prize of Ministry of Education, China. The same year he was appointed as a doctoral advisor, a higher position at universities in China. In 1992, he moved to Huazhong University of Science and Technology (HUST), also in Wuhan, and since then he has been working there as a professor of mathematics, dean of the School of Science (1994-2000) and Vice President of Academic Committee of HUST (2000-2004). He also held a concurrent position of research fellow in the Institute of Applied Mathematics, Chinese Academy of Sciix
x
Preface
ences (1996-98). Currently, he is Director of the Research Center for Stochastics at HUST. The white noise theory initiated by T. Hida is essentially an infinite dimensional analog of Schwartz distribution theory. Zhi-Yuan Huang was the first one who noted the potential role of the white noise theory in developing the quantum stochastic calculus. In 1993, he published his celebrated research paper Quantum white noises: White noise approach to quantum stochastic calculus. As is suggested by the title, he applied the white noise theory to the quantum stochastic calculus and proposed the notion of quantum white noises. He showed that the quantum white noises were pointwise-defined creation and annihilation operators on the boson Fock space and could be used to extend the Hudson and Parthasarathy’s quantum stochastic integral to the non-adapted situation. Following this work, together with Shunlong Luo, he further developed the Wick calculus for generalized operators and successfully applied it to quantum field theory. Professor Hida said: Huang’s calculus “has become a very powerful and important tool in the theory of quantum stochastic analysis”. Professor Parthasarathy also highly appraised Huang’s work and wrote: “the contributions of Professor Huang are very original and have a great potential for further developments in our understanding of rigorous quantum field theory.” From 1993 to 1999, Huang was invited to give talks about his calculus at several international conferences. In 1997, he and Jia-an Yan coauthored a new book entitled Introduction to Infinite Dimensional Stochastic Analysis, which was the first monograph dealing with Malliavin calculus and white noise theory in a unified framework (the English version appeared in 2000). The same year he was one more time awarded the Natural Science Prize of Ministry of Education, China. In 1998, he became an editor of the journal Infinite Dimensional Analysis, Quantum Probability and Related Topics. In 1999, he and Jia-an Yan shared a prize awarded by the Chinese Government for their joint monograph promoting mathematical learning. Since 2000, Huang has been devoting himself to the study of L´evy white noise as well as fractional noises. He has published a series of research papers independently or jointly with his students and much progress has been made. He, together with his Ph.D. students, gave an interacting Fock expansion of L´evy white noise functionals and developed a white noise approach to analysis for fractional L´evy processes. In 2004, he published his third book Quantum White Noise Analysis (in Chinese, with Caishi Wang and Guanglin Rang). In 2006 and 2009, he received the Natural Science Prize and Teaching Achievement Prize of Hubei Province, China, respectively.
Preface
xi
Ph.D. Graduate Students of Zhi-Yuan Huang Shunlong Luo (Wuhan University, 1995) Zongxia Liang (Institute of Applied Mathematics CAS, 1996) Mingli Zheng (Institute of Applied Mathematics CAS, 1996) Qingquan Lin (Institute of Applied Mathematics CAS, 1998) Caishi Wang (Huazhong University of Science and Technology, 1999) Xiangjun Wang (Huazhong University of Science and Technology, 1999) Shaopu Zhou (Huazhong University of Science and Technology, 2000) Xiaoshan Hu (Huazhong University of Science and Technology, 2002) Jihui Hu (Huazhong University of Science and Technology, 2002) Guanglin Rang (Huazhong University of Science and Technology, 2003) Ying Wu (Huazhong University of Science and Technology, 2005) Chujin Li (Huazhong University of Science and Technology, 2005) Guanghui Huang (Huazhong University of Science and Technology, 2006) Peiyan Li (Huazhong University of Science and Technology, 2007) Xuebin L¨ u (Huazhong University of Science and Technology, 2009) Junjun Liao (Huazhong University of Science and Technology, 2010)
Publications of Zhi-Yuan Huang Since 1980 Books (1) Foundations of Stochastic Analysis (in Chinese), Wuhan Univ. Press (1988); Second edition, Science Press (2001) (2) Introduction to Infinite Dimensional Stochastic Analysis (in Chinese, with JiaAn Yan), Science Press (1997); English version, Kluwer (2000) (3) Quantum White Noise Analysis (in Chinese, with Caishi Wang and Guanglin Rang), Hubei Sci. Tech. Publ. (2004) Papers and Invited Lectures (1) On the generalized sample solutions of stochastic differential equations, Wuhan Univ. J., No. 2 (1981), 11-21 (in Chinese, with M. Xu and Z. Hu) (2) Martingale measures and stochastic integrals on metric spaces, Wuhan Univ. J. (Special issue for Math.), No. 1 (1981), 89-102 (3) Stochastic integrals on general topological measurable spaces, Z. Wahrs. verw. Gebiete, Vol. 66 (1984), 25-40 (4) On the generalized sample solutions of stochastic boundary value problems, Stochastics, Vol. 11 (1984), 237-248
xii
Preface
(5) A comparison theorem for solutions of stochastic differential equations and its applications, Proc. Amer. Math. Soc., Vol. 91 (1984), 611-617 (6) The weak projection theory and decompositions of quasi-martingale measures, Chinese Ann. Math. (Ser. B), Vol. 6 (1985), 395-399 (7) The Malliavin calculus and its applications, J. Applied Probab. Statist., Vol. 1, No. 2 (1985), 161-172 (in Chinese) (8) On the product martingale measure and multiple stochastic integral, Chinese Ann. Math. (Ser. B), Vol. 7 (1986), 207-210 (9) Spectral analysis for stochastic integral operators, Wuhan Univ. J., No. 4 (1986), 17-24 (in Chinese, with Y. Liao) (10) Functional integration and partial differential equations, Adv. Math. (China), Vol. 15, No. 2 (1986), 131-174 (based on a series of lectures given by Prof. P. Malliavin in Wuhan Univ. in June 1984) (11) Spectral analysis for stochastic integral operators, invited lecture given in Kyoto Univ. (1987) (12) An introduction to quantum stochastic calculus, Adv. Math. (China), Vol. 17, No. 4 (1988), 360-378 (13) Quasi sure stochastic flows, Stochastics, Vol. 33 (1990), 149-157 (with J. Ren) (14) Some recent development of stochastic calculus in China, Contemporary Math., Vol. 118 (1991), 177-185 (15) Stochastic calculus of variation on Gaussian space and white noise analysis, in “Gaussian Random Fields”, K. Ito and T. Hida eds., World Scientific (1991) (16) Quantum white noises - White noise approach to quantum stochastic calculus, Nagoya Math. J., Vol. 129 (1993), 23-42 (17) Quantum white noises analysis - a new insight into quantum stochastic calculus, invited lecture of 9th Conference on Quantum Probability and Applications, Nottingham (1993) (18) P-adic valued white noise functionals, Quantum Prob. Related Topics, Vol. IX, (1994), 273-294 (with Khrennikov) (19) An extension of Hida’s distribution theory via analyticity spaces, in “Dirichlet Forms and Stochastic Processes”, Walter de Gruyter, 1995 (with H. Song) (20) Generalized functionals of a p-adic white noise, Doklady Mathematics, Vol. 52 (1995), 175-178 (with Khrennikov) (21) Quantum white noises and quantum fields, 23rd SPA Conference, Singapore (1995) (with S. Luo) (22) A model for white noise analysis in p-adic number fields, Acta Math. Sci., Vol. 16 (1996), 1-14 (with Khrennikov) (23) Quantum white noise analysis, invited lecture of Conference on Stochastic Differential Geometry and Infinite Dimensional Analysis, Hangzhou (1996) (24) Analytic functionals and a new distribution theory over infinite dimensional spaces, Chinese Ann. Math. (Ser. B), Vol. 17 (1996), 507-514 (with J. Ren) (25) Quantum white noises, Wick calculus and quantum fields, invited lecture of
Preface
(26) (27) (28) (29) (30) (31) (32) (33) (34) (35) (36) (37)
(38) (39)
(40) (41) (42) (43)
(44)
xiii
Conference on Infinite Dimensional Analysis and Quantum Probability, Rome (1997) The weak solution for SDE with terminal conditions, Math. Appl., Vol. 10, No. 4 (1997), 60-64 (with Q. Lin) Quantum white noises and free fields, IDAQP, Vol. 1, No. 1 (1998), 69-82 (with S. Luo) D∞ -Approximation of quadratic variations of smooth Ito processes, Chinese Ann. Math. (Ser. B), Vol. 19 (1998), 305-310 (with J. Ren) Wick calculus of generalized operators and its applications to quantum stochastic calculus, IDAQP, Vol. 1, No. 3 (1998), 455-466 (with S. Luo) Positivity-preservingness of differential second quantization, Math. Appl., Vol. 11 (1998), 31-32 (with C. Wang) Quantum integral equation of Volterra type with generalized operator-valued kernels, IDAQP, Vol. 3 (2000), 505-517 (with C. Wang and X. Wang) Quantum cable equations in terms of generalized operators, Acta Appl. Math., Vol. 63 (2000), 151-164 (with C. Wang and X. Wang) Wick tensor products in Levy white noise spaces, Second Sino-French Colloquium in Probability and Applications, Invited Lecture, Wuhan (2001) Explicit forms of Wick powers in general white noise spaces, IJMMS, Vol. 31 (2002), 413-420 (with X. Hu and X. Wang) L2 (E ∗ , µ)-Weyl representations, IDAQP, Vol. 5 (2002), 581-592 (with X. Hu and X. Wang) A white noise approach to quantum stochastic cable equations, Acta Math. Sinica, Vol. 45 (2002), 851-862 (with C. Wang) Quantum integral equations with kernels of quantum white noise in space and time, Advances in Math. Research Vol. 1 (2002), G.Oyibo eds. NOVA SCI. PUBL., 97-108 (with C. Wang and T-S. Chew) Quadratic covariation and extended Ito formula for continuous semimartingales, Math. Appl., Vol. 15 (2002), 81-84 (with J. Hu) White noise approach to interacting quantum field theory, in “Recent Developments in Stochastic Analysis and Related Topics”, eds. S. Albeverio et al., World Scientific, 2004 (with G. Rang) Quantum stochastic differential equations in terms of generalized operators, Adv. Math. (China), Vol. 32 (2003), 53-62 (with C. Wang and X. Wang) White noise approach to the construction of Φ44 quantum fields, Acta Appl. Math., Vol. 77 (2003), 299-318 (with G. Rang) Generalized operators and operator-valued distribution in quantum field theory, Acta. Math. Scientia, Vol. 23(B) (2003), 145-154 (with X. Wang and C. Wang) A W-transform-based criterion for the existence of bounded extensions of Eoperators, J. Math. Anal. Appl., Vol. 288 (2003), 397-410 (with C. Wang and X. Wang) The non-uniform Riemann approach to anticipating stochastic integrals, Stoch.
xiv
Preface
Anal. Appl. Vol. 22 (2004), 429-442 (with T-S. Chew and C. Wang) (45) Generalized operators and P (Φ)2 quantum fields, Acta Math. Scientia, Vol. 24(B) (2004), 589-596 (with G. Rang) (46) Analytic characterization for Hilbert-Schmidt operators on Fock space, Acta Math. Sin. (Engl. Ser.), Vol. 21 (2005), 787-796 (with C. Wang and X. Wang) (47) A filtration of Wick algebra and its application to quantum SDE, Acta Math. Sin. (Engl. Ser.), Vol. 20 (2004), 999-1008 (with C. Wang) (48) A moment characterization of B-valued generalized functionals of white noise, Acta Math. Sin. (Engl. Ser.), Vol. 22 (2006), 157-168 (with C. Wang) (49) δ-Function of an operator: A white noise approach, Proc. Amer. Math. Soc., Vol. 133, No. 3 (2005), 891-898 (with C. Wang and X. Wang) (50) Fractional Brownian motion and sheet as white noise functionals, Acta Math. Sin. (Engl. Ser.), Vol. 22, No. 4 (2006), 1183-1188 (with C. Li, J. Wan and Y. Wu) (51) Interacting Fock expansion of L´evy white noise functionals, Acta Appl. Math., Vol. 82 (2004), 333-352 (with Y. Wu) (52) L´evy white noise calculus based on interaction exponents, Acta Appl. Math., Vol. 88 (2005), 251-268 (with Y. Wu) (53) Anisotropic fractional Brownian random fields as white noise functionals, Acta Math. Appl. Sinica, Vol. 21, No. 4 (2005), 655-660 (with C. Li) (54) White noise approach to the construction of Φ44 quantum fields (II), Acta Math. Sin. (Engl. Ser.), Vol. 23, No. 5 (2007), 895-904 (with G. Rang) (55) On fractional stable processes and sheets: white noise approach, J. Math. Anal. Appl., Vol. 325, No. 1 (2007), 624-635 (with C. Li) (56) Explicit forms of q-deformed L´evy-Meixner polynomials and their generating functions, Acta Math. Sin. (Engl. Ser.), Vol. 24, No. 2 (2008), 201-214 (with P. Li and Y. Wu) (57) Generalized fractional L´evy processes: A white noise approach, Stochastics and Dynamics, Vol. 6, No. 4 (2006), 473-485 (with P. Li) (58) Fractional generalized L´evy random fields as white noise functionals, Front. Math. China, Vol. 2, No. 2 (2007), 211-226 (with P. Li) (59) Fractional L´evy processes on Gel’fand triple and stochastic integration, Front. Math. China, Vol. 3, No. 2 (2008), 287-303 (with X. L¨ u and J. Wan) (60) Fractional noises on Gel’fand triples, Invited lecture of International Conference on Stochastic Analysis and Related Fields, Wuhan (2008) (with X. L¨ u).
Contents
Editorial Foreword
vii
Preface
ix
1. Hyperbolic Equations with Random Boundary Conditions
1
Zdzislaw Brze´zniak and Szymon Peszat 2. Decoherent Information of Quantum Operations
23
Xuelian Cao, Nan Li and Shunlong Luo 3. Stabilization of Evolution Equations by Noise
43
Tom´ as Caraballo and Peter E. Kloeden 4. Stochastic Quantification of Missing Mechanisms in Dynamical Systems
67
Baohua Chen and Jinqiao Duan 5. Banach Space-Valued Functionals of White Noise
77
Yin Chen and Caishi Wang 6. Hurst Index Estimation for Self-Similar Processes with Long-Memory
91
Alexandra Chronopoulou and Frederi G. Viens 7. Modeling Colored Noise by Fractional Brownian Motion Jinqiao Duan, Chujin Li and Xiangjun Wang xv
119
xvi
8.
Contents
A Sufficient Condition for Non-Explosion for a Class of Stochastic Partial Differential Equations
131
Hongbo Fu, Daomin Cao and Jinqiao Duan 9.
The Influence of Transaction Costs on Optimal Control for an Insurance Company with a New Value Function
143
Lin He, Zongxia Liang and Fei Xing 10. Limit Theorems for p-Variations of Solutions of SDEs Driven by Additive Stable L´evy Noise and Model Selection for Paleo-Climatic Data
161
Claudia Hein, Peter Imkeller and Ilya Pavlyukevich 11. Class II Semi-Subgroups of the Infinite Dimensional Rotation Group and Associated Lie Algebra
177
Takeyuki Hida and Si Si 12. Stopping Weyl Processes
185
Robin L. Hudson 13. Karhunen-Lo´eve Expansion for Stochastic Convolution of Cylindrical Fractional Brownian Motions
195
Zongxia Liang 14. Stein’s Method Meets Malliavin Calculus: A Short Survey With New Estimates
207
Ivan Nourdin and Giovanni Peccati 15. On Stochastic Integrals with Respect to an Infinite Number of Poisson Point Process and Its Applications
237
Guanglin Rang, Qing Li and Sheng You 16. L´evy White Noise, Elliptic SPDEs and Euclidean Random Fields
251
Jiang-Lun Wu 17. A Short Presentation of Choquet Integral Jia-An Yan
269
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 1–21
Chapter 1 Hyperbolic Equations with Random Boundary Conditions
Zdzislaw Brze´zniak and Szymon Peszat Department of Mathematics, University of York, York, YO10 5DD, UK,
[email protected] and ´ Tomasza 30/7, 31-027 Institute of Mathematics, Polish Academy of Sciences, Sw. Krak´ ow, Poland, e-mail:
[email protected] Following Lasiecka and Triggiani an abstract hyperbolic equation with random boundary conditions is formulated. As examples wave and transport equations are studied.
Contents 1
Introduction . . . . . . . . . . . . . . . . . . . . . 1.1 The Wave equation . . . . . . . . . . . . . . 1.2 The Transport equation . . . . . . . . . . . 2 Abstract formulation . . . . . . . . . . . . . . . . . 3 The Wave equation - introduction . . . . . . . . . 4 Weak solution to the wave equation . . . . . . . . 5 Mild formulations . . . . . . . . . . . . . . . . . . 6 Scales of Hilbert spaces . . . . . . . . . . . . . . . 6.1 Application to the boundary value problem 7 Equivalence of weak and mild solutions . . . . . . 8 The Fundamental Solution . . . . . . . . . . . . . 9 Applications . . . . . . . . . . . . . . . . . . . . . 10 The Transport equation . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
. . . . . . . . . . . . . .
1 2 3 3 4 6 7 8 10 11 13 14 16 20
1. Introduction Assume that A is the generator of a C0 -group U = (U (t))t∈R of bounded linear operators on a Hilbert space H. Let U be another Hilbert space and let u ∈ L2loc (0, +∞; U). Typical examples of H and U will be spaces L2 (O) and L2 (∂O). Research of the second named author was supported by Polish Ministry of Science and Higher Education Grant PO3A 034 29 “Stochastic evolution equations driven by L´evy noise”. Research of the first named author was supported by an EPSRC grant number EP/E01822X/1. In addition the authors acknowledge the support of EC FP6 Marie Curie ToK programme SPADE2, MTKDCT-2004-014508 and Polish MNiSW SPB-M 1
2
Zdzislaw Brze´ zniak and Szymon Peszat
Lasiecka and Triggiani, see Refs. 14–17, 4 and 18, discovered that for a large class of boundary operators τ , the hyperbolic initial value problem d X(t) = AX(t), dt
t ≥ 0,
X(0) = X0 ,
considered with a non-homogeneous boundary condition τ (X(t)) = u(t),
t ≥ 0,
can be written in the form of the homogeneous boundary problem d X(t) = AX(t) + (λ − A)Eu(t), dt
X(0) = X0 ,
(1.1)
by choosing properly the space U, an operator E ∈ L(U, H) and a scalar λ from the resolvent set ρ(A) of A. In this chapter u will be the time derivative of a U-valued c` adl` ag process ξ. We will discuss the existence and regularity of a solution to the problem (1.1). The abstract framework will be illustrated by the wave and the transport equations. Let us describe briefly the history of the problem studied in this chapter. To our (very limited) knowledge, the first paper which studied evolution problems with boundary noise was a paper3 by Balakrishnan. The equation studied in that paper was first order in time and fourth order in space with Dirichlet boundary noise. Later Sowers28 investigated general reaction diffusion equation with Neumann type boundary noise. Da Prato and Zabczyk in their second monograph,12 see also Ref. 11, have explained the difference between the problems with Dirichlet and Neumann boundary noises. In particular, the solution to the former is less regular in space than the solution to the latter. Maslowski23 studied some basic questions such as exponential stability in the mean of the solutions and the existence and uniqueness of an invariant measure. Other related works for parabolic problems with boundary noise are E. Al`os and S. Bonaccorsi1,2 and Brze´zniak et. al.5 Similar question have also been investigated in the case of hyperbolic SPDEs with the Neumann boundary conditions, see for instance Mao and Markus,22 Dalang and L´evˆeque.7–9,19 Moreover, some authours, see for example Chueshov, Duan and Schmalfuss6,13 have studied problems in which deterministic partial differential equations are coupled to stochastic ones by some sort of boundary conditions. To our best knowledge, our paper is the first one in which the hyperbolic SPDEs with Dirichlet boundary conditions are studied. 1.1. The Wave equation Let ∆ denote the Laplace operator, τ be a boundary operator, O be a domain in Rd with smooth boundary ∂O, and let S (∂O) denote the space of distributions on ∂O. We assume that ξ take values in S (∂O). In the first part of the paper, see
Hyperbolic Equations with Random Boundary Conditions
Sections from 3 to 9 we will
3
be concerned with the following initial value problem
∂2u = ∆u ∂t2 dξ τu = dt u(0, ·) = u 0 ∂u (0, ·) = u0,1 ∂t
on (0, ∞) × O, on (0, ∞) × ∂O,
(1.2)
on O, on O.
In fact, we will only consider the Dirichlet and the Neumann boundary conditions. In the case of the Dirichlet boundary conditions we put τ = τD , where τD ψ(x) = ψ(x) for x ∈ ∂O, whereas in case of the Neumann boundary conditions we put τ = τN , where τN ψ(x) = ∂ψ ∂n (x), x ∈ ∂O and n is the exterior unit normal vector field on ∂O. 1.2. The Transport equation In Section 10 we will consider the following stochastic generalization of the boundary value problem associated to the following simple transport equation introduced in Ref. 4 [Example 4.1, p. 466], ∂u ∂u = on (0, ∞) × (0, 2π), ∂t ∂x (1.3) on (0, 2π), u(0, ·) = u0 τ (u(t, ·)) = dξ (t) for t ∈ (0, ∞), dt where τ (ψ) = ψ(2π) − ψ(0). This paper is organized as follows. The next section is devoted to an abstract framework. This framework is adapted from works by Lasiecka and Triggiani and also the book by Bensoussan et al.4 In the following section we study the wave equation with the Dirichlet and the Neumann boundary conditions. In particular we will investigate the concepts of weak and mild solutions, their equivalence, their relation with the abstract framework and their regularity. The final section (Section 10) is devoted to the transport equation. 2. Abstract formulation We will identify the Hilbert space H with its dual H . Therefore the adjoint operator A∗ is bounded from its domain D(A∗ ), equipped with the graph norm, into H and hence it has a bounded dual operator (A∗ ) : H → (D(A∗ )) . It is easy to see that the latter operator is a bounded linear extension of a linear map A : D(A) → H. Since A is a closed operator, D(A) endowed with a graph norm is a Hilbert space.
4
Zdzislaw Brze´ zniak and Szymon Peszat
Alternatively, we can take κ ∈ ρ(A) and endow D(A) with the norm (κ − A)f H . Clearly these two norms are equivalent. We assume that for any T > 0 there exists a constant K > 0 such that T E ∗ A∗ U (t)∗ f 2H dt ≤ K f 2H , f ∈ D(A∗ ). (2.1) 0
Assume that X is a mild solution of (1.1) with u being the weak time derivative of ξ, i.e. t dξ(r) , t ≥ 0. (2.2) X(t) = U (t)X0 + U (t − r)(λ − A)E dr 0 Then, integrating by parts we see that the mild form of (1.1) is X(t) = U (t)X0 + (λ − (A∗ ) ) [Eξ(t) − U (t)Eξ(0)] t (A∗ ) U (t − r)(λ − A)Eξ(r)dr, t ≥ 0. +
(2.3)
0
In other words,
X(t) = U (t)x0 + (λ − (A∗ ) ) [Eξ(t) − U (t)Eξ(0)] + (A∗ ) Y (t), where
Y (t) :=
t
U (t − r)(λ − A)Eξ(r)dr,
t ≥ 0,
t ≥ 0.
0
Since the trajectories of the process ξ are c`adl` ag, they are also locally bounded and hence locally square integrable. Hence, by4 [Proposition 4.1], the process Y has trajectories in C(R+ ; H). Since (A∗ ) is a bounded operator from H to (D(A∗ )) we have the following result. Theorem 2.1. Assume (2.1). If ξ is an U-valued c` adl` ag process, then the process X defined by (2.3) has c` adl` ag trajectories in (D(A∗ )) . 3. The Wave equation - introduction In the next section we will introduce a concept of weak, in the PDEs sense, solution to the boundary problem for wave equation (1.2). Next, denoting by τ the appropriate boundary operator, by Dτ the boundary map with certain parameter κ and by ∆τ the Laplace operator with homogeneous boundary conditions, in Section 5 we will show in a heuristic way that problem (1.2) can be written as follows dX =Aτ Xdt + (κ − A2τ )Dτ dξ, †
=Aτ Xdt + ((κ − ∆τ ) Dτ ) dξ,
(3.1)
Hyperbolic Equations with Random Boundary Conditions
where
X=
Note that
0 I 0 , Dτ = , Aτ = , ∂u ∆τ 0 Dτ ∂t 0 † ((κ − ∆τ ) Dτ ) = . (κ − ∆τ ) Dτ u
0 I 0 κ0 0 I − ∆τ 0 Dτ 0κ ∆τ 0 κ − ∆τ 0 0 0 = = 0 κ − ∆τ Dτ (κ − ∆τ ) Dτ
κ − A2τ Dτ =
5
(3.2)
(3.3)
= ((κ − ∆τ ) Dτ )† . Let Uτ be the semigroup generated by the operator Aτ . In Section 7 we will show that a weak solution to problem (1.2) exists and moreover, it is given by the following formula t
X(t) = Uτ (t)X(0) + Uτ (t − r) κ − A2τ Dτ dξ(r), t ≥ 0. (3.4) 0
In other words, a weak solution to problem (1.2) is the mild solution to problem (3.1). In (3.4), the integrals are defined by integration by parts. Thus t Uτ (t − r)(κ − A2τ )Dτ dξ(r) 0
= (κ − A2τ )Dτ ξ(t) − Uτ (t)(κ − A2τ )Dτ ξ(0) t
Aτ Uτ (t − r) κ − A2τ Dτ ξ(r)dr, +
(3.5) t ≥ 0.
0
√ From now on we will assume that κ > 0 is such that both κ and κ belong to the resolvent set of Aτ . Then √
√ Dτ κ −I Dτ √ √ √ = κ − Aτ κDτ κDτ −∆τ κ 0 † = = ((κ − ∆τ ) Dτ ) , (κ − ∆τ ) Dτ √ and hence we see that our case fits into the abstract framework with λ = κ, and Dτ . E= √ κDτ Since condition (2.1) is satisfied4 , we get the following corollary to Theorem 2.1. Proposition 3.1. The process X defined by formula (3.4), is (D(A∗ )) -valued c` adl` ag.
6
Zdzislaw Brze´ zniak and Szymon Peszat
Alternatively, one can give a proper meaning of the term (κ − A2τ )Dτ by using the scales of Hilbert spaces τ Hs+1 Hsτ := × , Hsτ = D(κ − ∆τ )s/2 , s ∈ R, τ Hs where κ belongs to the resolvent set of ∆τ , see Section 6 for more details. It turns out that if ξ is sufficiently regular in space variable, then Dτ ξ takes values in the domain of ∆τ considered on Hτ,−s for s large enough, so the term appearing in the utmost right hand side of (3.1) is well defined. 4. Weak solution to the wave equation We will introduce a notion of weak solution to the wave equation and we will discuss their uniqueness. By S(O) we will denote the class of restrictions of all test function ϕ ∈ S(Rd ) to O and we will denote by (·, ·) the duality forms on both S (O) × S(O) and S (∂O) × S(∂O). We will always assume that any S (O)-valued process v is weakly measurable, that is Ω × [0, ∞) (t, ω) → (v(t)(ω), ϕ) ∈ R is measurable for any ϕ ∈ S(O). Assume now that ξ is an S (∂O)-valued process. Taking into account the Green formula, see e.g. the monograph20 by Lions and Magenes, we arrive at the following definitions of a weak solution. Definition 4.1. We will say that an S (O) × S (O)-valued process (u, v) is a weak solution to (1.2) considered with the Dirichlet boundary condition, i.e. τ = τD , iff (aD) for all t > 0, t (v(r), ϕ)dr, P − a.s. ∀ ϕ ∈ S(O), (4.1) (u(t), ϕ) = (u0 , ϕ) + 0
and (bD) for all t ≥ 0, P-a.s. for all ψ ∈ S(O) satisfying ψ = 0 on ∂O, t ∂ψ . (u(r), ∆ψ)dr + ξ(t) − ξ(0), (v(t), ψ) = (u0,1 , ψ) + ∂n 0
(4.2)
We will call an S (O) × S (O)-valued process (u, v) a weak solution to (1.2) considered with the Neumann boundary condition, i.e. τ = τN , iff (aN) equality (4.1) holds and (bN) for all t > 0 and ψ ∈ S(O) satisfying ∂ψ ∂n = 0 on ∂O, P-a.s. t (u(t), ψ) = (u0 , ψ) + (u(r), ∆ψ)dr − (ξ(t) − ξ(0), ψ) . (4.3) 0
Hyperbolic Equations with Random Boundary Conditions
7
Let κ ≥ 0. For the future consideration we will need also a concept of a weak solution to the (deterministic) elliptic problem ∆u(x) = κu(x), x ∈ O, τ u(x) = γ(x), x ∈ ∂O.
(4.4)
Definition 4.2. Let γ ∈ S (∂O). We call u ∈ S (O) the weak solution to (4.4) considered with the Dirichlet boundary condition τ = τD iff ∂ψ = κ(u, ψ), ∀ ψ ∈ S(O) : ψ = 0 on ∂O. (u, ∆ψ) + γ, (4.5) ∂n We will call u ∈ S (O) a weak solution to (4.4) considered with the Neumann boundary condition (τ = τN ) iff (u, ∆ψ) − (γ, ψ) = κ (u, ψ),
∀ ψ ∈ S(O) :
∂ψ = 0 on ∂O. ∂n
(4.6)
For the completeness of our presentation we present the following result on the uniqueness of solutions.5 Proposition 4.2. (i) For any u0 , u0,1 and γ problem (1.2) considered with Dirichlet or Neumann boundary conditions has at most one solution. (ii) For any γ ∈ S (∂O) and κ ≥ 0 problem (4.4) with Dirichlet boundary condition has at most one solution. (iii) For any γ ∈ S (∂O) and κ > 0 problem (4.4) with Neumann boundary condition has at most one solution. We will denote by DD γ and DN γ the solution to (4.4) with Dirichlet and Neumann boundary conditions. We call DD and DN the Dirichlet and Neumann boundary maps. Note that both these maps depend on the parameter κ and hence should κ κ and DN . However, we have decided to use less cumbersome be denoted by DD notation. 5. Mild formulations In this section we will heuristically derive a mild formulation of the solution to the stochastic nonhomogeneous boundary value problems to the wave equation. In Section 7 we will show that a mild solution is in fact a weak solution. Assume now that a process u solves wave problem (1.2). As in Ref. 11 we consider a new process y := u − Dτ ∂ξ ∂t . Clearly τ y(t) = 0 for t > 0 and ∂ξ ∂ξ ∂y ∂ (0), (0) = u0,1 − Dτ (0). ∂t ∂t ∂t ∂t Next, by the definition of the map Dτ , we have y(0) = u0 − Dτ
∂2y ∂ξ ∂ξ ∂2 − = ∆y + κD Dτ . τ 2 2 ∂t ∂t ∂t ∂t
(5.1)
8
Zdzislaw Brze´ zniak and Szymon Peszat
Let (Uτ (t))t∈R be the group generated by the operator Aτ defined by equality (3.2). Let us put z = ∂y ∂t . Then, for t ≥ 0, y y(0) (t) = Uτ (t) z z(0) t 0 dr. Uτ (t − r) + ∂ξ ∂ξ ∂2 (r) − ∂r κDτ ∂r 2 Dτ ∂r (r) 0 On the other hand, by the integration by parts formula, we have t 0 0 dr = − ∂ Uτ (t − r) ∂ 2 − ∂ξ ∂ξ 0 ∂t Dτ ∂t (t) ∂r 2 Dτ ∂r (r) t 0 0 + Uτ (t) ∂ − dr. Aτ Uτ (t − r) ∂ ∂ξ ∂ξ 0 ∂t Dτ ∂t (0) ∂r Dτ ∂r (r) Set
Z(t) := −Uτ (t)
Then, for t ≥ 0, t Aτ U (t − r) 0
t
= Z(t) + t
= Z(t) + 0
0 ∂ξ ∂ ∂r Dτ ∂r (r)
Aτ Uτ (t − r)
0
Dτ ∂ξ ∂t (0) 0
+
t
dr =
0
0 ∂ξ (r) Dτ ∂r
Dτ ∂ξ ∂t (t) 0
Uτ (t − r)
0 ∂ξ (r) Dτ ∂r dr
A2τ Uτ (t − r)
.
∂ξ ∂ ∂r Dτ ∂r (r) dr 0
dr.
Hence, in view of the equalities appearing in (5.1), we arrive at the following identity y (t) Dt ∂ξ u0 ∂t (t) = Uτ (t) − ∂ ∂ξ z u0,1 ∂t Dτ ∂t (t) t ∂ξ 0 (r)dr, t ≥ 0. (κ − A2τ )Uτ (t − r) + D ∂r τ 0 Putting v(t) :=
∂u(t) ∂t
we observe that
∂ξ ∂ξ ∂ ∂ ∂y(t) + Dτ (t) = z(t) + Dτ (t), ∂t ∂t ∂t ∂t ∂t and hence we obtain (3.4). v(t) =
6. Scales of Hilbert spaces Let (A, D (A)) be the infinitesimal generator of an analytic semigroup S on a real separable Hilbert space H. Let κ ˜ belongs to the resolvent set of A. Then, see e.g. Ref. 21, the fractional power operators (˜ κ − A)s , s ∈ R, are well defined.
Hyperbolic Equations with Random Boundary Conditions
9
In particular, for s < 0, (˜ κ − A)s is a bounded linear operator and for s > 0,
s −s −1 (˜ κ −A) := (˜ κ −A) . For any s ≥ 0 we set Hs := D (˜ κ −A)s/2 = R (˜ κ −A)−s/2 . We equip the space Hs with the norm κ − A)s/2 f H , f Hs := (˜
f ∈ Hs .
Note that for all s, r ≥ 0, (˜ κ − A)r/2 : Hs+r → Hs is an isometric isomorphism. We also introduce the spaces Hs for s < 0. To do this let us fix s < 0. Note that the operator (˜ κ − A)s/2 : H → Hs is an isometric isomorphism. Hence we can define κ − A)s/2 f H . We Hs as the completion of H with respect to the norm f Hs := (˜ have (˜ κ − A)−s/2 f Hs = f H
for f ∈ H−s .
−s/2
κ − A) can be uniquely extended to the linear Thus, since H−s is dense in H, (˜ isometry denoted also by (˜ κ − A)−s/2 between Hs and H. Assume now that A is a self-adjoint non-positive definite linear operator in H. It is well known (and easy to see) that A considered on any Hs with s < 0 is essentially self-adjoint. We denote by As its unique self-adjoint extension. Note that D (As ) = Hs+2 . Finally, for any s ≥ 0, the restriction As of A to Hs+2 is a self-adjoint operator on Hs . The spaces Hs , s < 0, can be chosen in such a way that Hs → Hr → H → H−r → H−s ,
∀ s ≥ r ≥ 0,
with all embedding dense and continuous. Identifying H with its dual space H we obtain Hs → H ≡ H → Hs ,
s ≥ 0.
Remark 6.1. Under the identification above we have Hs = H−s . Moreover, As f, g = f, A−s g ,
∀ f ∈ D (As ), g ∈ D (A−s ),
where ·, · is the bilinear form on Hs × H−s whose restriction to (Hs ∩ H) × H is the scalar product on H. Given s ∈ R, define
Hs+1 Hs := × . Hs
On Hs we consider an operator As defined by the following formulas 0 I , D (As ) = Hs+1 . As := As 0 By the Lumer–Philips theorem24 , As generates an unitary group Us on Hs .
10
Zdzislaw Brze´ zniak and Szymon Peszat
6.1. Application to the boundary value problem Let −∆D and −∆N be the Laplace operators on H = L2 (O) with the homogeneous Dirichlet and Neumann boundary conditions, respectively. The corresponding scales of Hilbert spaces will be denoted by (HsD ) and (HsN ) and the restriction (or, if s < 0, N the unique self-adjoint extension) of ∆D and ∆N to HsD and HsN by ∆D s and ∆s . Finally, we write HsD
D Hs+1 := × HsD
and
AD s := AN s :=
and
0 I , ∆D s 0 0 I , ∆N s 0
HsN
N Hs+1 := × HsN
D D (AD s ) = Hs+1 ,
N D (AN s ) = Hs+1 .
Example 6.1. (i) Let O = (0, 1) and κ = 0. Define functions ψi , i = 1, 2 by ψ1 (x) = 1 and ψ2 (x) = x for x ∈ [0, 1]. Then, see e.g. Ref. 5, DD : ∂O ≡ R2 → C([0, 1]) is given by a = aψ1 + (b − a)ψ2 , DD b d d d D and, by Remark 6.1, see also Ref. 5, ∆D s ψ1 = dx δ0 − dx δ1 and ∆s ψ2 = − dx δ1 for s ≤ −1. Taking into account (3.3), for s ≥ 1 we have
† a a 0 2 D = −∆ = . ) D D −(AD D −s −s D d d δ0 + b dx δ1 b b −a dx
(ii) For the Neumann boundary conditions on O = (0, 1) we take κ = 1. Then b − ae b − ae a a = ψ ψ + a + , ∈ R2 ≡ ∂O, DN 1 2 b b e − e−1 e − e−1 where ψ1 (x) = e−x and ψ2 (x) = ex . Then, by Remark 6.1, for any φ ∈ D (∆1N ), and s ≥ 1, 1
N
d2 φ ∆−s ψi , φ = ∆N ψ , φ = ψi (x) 2 (x)dx. −1 i dx 0 Since
1
ψ1 (x) 0
= e−1
d2 φ (x)dx = dx2
0
1
e−x
d2 φ (x)dx dx2
dφ dφ (1) − (0) + e−1 φ(1) − φ(0) + dx dx
0
1
e−x φ(x)dx
Hyperbolic Equations with Random Boundary Conditions
and
1
0
d2 φ ψ2 (x) 2 (x)dx = dx
ex 0
=e and since
dφ dx (0)
=
dφ dx (1)
1
d2 φ (x)dx dx2
dφ dφ (1) − (0) − eφ(1) + φ(0) + dx dx
11
1
ex φ(x)dx, 0
= 0, it follows −1 δ1 − δ0 + ψ1 , ∆N −s ψ1 = e
∆N −s ψ2 = −eδ1 + δ0 + ψ2 . Consequently,
1 − ∆N −s DN
a b
b − ae
b − ae −1 (eδ1 − δ0 ) = δ0 − e δ1 + a + e − e−1 e − e−1 = −aδ0 + (b − ae) δ1 , and hence
2 1 − (AN DN −s )
† a a = 1 − ∆N D N −s b b 0 . = −aδ0 + (b − ae) δ1 ,
(iii) Let O = (0, ∞) and κ = 1. Then, see e.g. Ref. 5, DD : ∂O ≡ R → C([0, ∞)) is given by DD a = aψ, where ψ(x) = e−x . Next, again by Remark 6.1, see also d Ref. 5, ∆D −s ψ = dx δ0 + ψ, s ≥ 1, and consequently for any s ≥ 1,
†
0 D 2 D 1 − (A−s ) DD (a) = 1 − ∆−s DD (a) = . d δ0 −a dx (iv) For the Neumann boundary problem on (0, ∞) with κ = 1 we have DN a = −aψ, where ψ(x) = e−x . Then, ∆N −s ψ = δ0 + ψ. Consequently,
†
0 N 2 N 1 − (A−s ) DN (a) = 1 − ∆−s DN (a) = . −aδ0 7. Equivalence of weak and mild solutions We denote by τ either the Dirichlet or the Neumann boundary condition. We assume that Dτ is a corresponding boundary map, i.e. it satisfies ∆Dτ ψ = κDτ ψ,
τ Dτ ψ = ψ on ∂O.
(7.1)
12
Zdzislaw Brze´ zniak and Szymon Peszat
τ Recall that Hsτ is the domain of (˜ κ −∆τ )s , hence in particular,
τHs+1 is the domain of τ τ the Laplace operator ∆s considered on Hs . We denote by Us (t) t∈R the C0 -group on Hsτ generated by Aτs . The following existence theorem is the main result of this section.
adl` ag process in a space Hsτ for some s ∈ Theorem 7.1. Assume that Dτ ξ is a c` T τ R. Then for any X0 := (u0 , u0,1 ) ∈ Hs−4 there exists a unique weak solution adl` ag, with the boundary u to problem (1.2) whose trajectories are Hs−3 -valued c`
∂u T ˙ condition τ u = ξ. Moreover, the process X = u, ∂t is given by t †
τ τ (t)X0 + Us−3 (t − r) (κ − ∆τs−2 )Dτ dξ(r) Us−3 0 t †
(7.2) τ τ Aτs−3 Us−2 (t − r) (κ − ∆τs−2 )Dτ ξ(r)dr = Us−3 (t)X0 + 0
† †
τ (t) (κ − ∆τs−2 )Dτ ξ(0), + (κ − ∆τs−2 )Dτ ξ(t) − Us−2
t ≥ 0.
τ . We will Proof. Clearly the process X defined by formula (7.2) is c` adl` ag in Hs−3 show that it is a weak solution to problem (1.2). well-known equivalence By the ∗ , result, see e.g. Ref. 10 or Ref. 25, for any h ∈ D Aτs−3
X(t), h
Hτs−3
= X(0), h
Hτs−3
t
+ 0
∗
X(r), Aτs−3 h Hτs−3 dr
†
+ h, κ − ∆τs−2 Dτ (ξ(t) − ξ(0)) Hτs−3 , Clearly,
Aτs−3
∗
=
0 ∆τs−3 I0
(7.3) t ≥ 0.
.
Let X = (u, v)T , and let ϕ ∈ S(O) and ψ ∈ S(O)be such that τ ψ = 0 on ∂O. ∗
. Applying equality (7.3) to Note that h1 := (ϕ, 0)T , h2 := (0, ψ)T ∈ D Aτs−3 h = h1 we obtain t v(r), ϕ Hs−2 dr, t ≥ 0. u(t), ϕ Hs−2 = u0 , ϕ Hs−2 + 0 s−2
Consequently, for ϕ˜ := (κ − ∆)
ϕ,
t
(u(t), ϕ) ˜ = (u0 , ϕ) ˜ +
(v(r), ϕ)dr, ˜
t ≥ 0.
0
Next, for h = h2 ,
t
v(t), ψ Hs−3 = u0,1 , ψ Hs−3 +
u(r), ∆ψ Hs−3 dr + Rτ (t), t ≥ 0,
0
where, for t ≥ 0,
Rτ (t) := ψ, κ − ∆τs−2 Dτ ξ(t) Hs−3 − ψ, κ − ∆τs−2 Dτ ξ(0) Hs−3 .
Hyperbolic Equations with Random Boundary Conditions s−3
Let ψ˜ := (˜ κ − ∆) operator,
13
ψ. It remains to show that in the case of the Dirichlet boundary Rτ (t) =
∂ ψ˜ ξ(t) − ξ(0), ∂n
and, in the case of the Neumann boundary operator, Rτ (t) = − ξ(t) − ξ(0), ψ˜ . These two identities follow from an observation that by Definition 4.2, if z is such that Dτ z ∈ Hs , then for any ψ ∈ S(O) satisfying τ ψ = 0, z(t), ∂ψ if τ is Dirichlet,
∂n ψ, λ − ∆τs−2 Dτ z(t) = − ξ(t) − ξ(0), ψ˜ if τ is Neumann. 8. The Fundamental Solution 2
Let Gτ be the fundamental solution to the Cauchy problem for ∂∂t2u = ∆u associated with the boundary operator τ . In other words, Gτ : (0, ∞) × O × O → R satisfies τ τ Gτ (t, x, y) = 0 with respect to x and y variables, Gτ (0, x, y) = 0 and ∂G ∂t (0, x, y) = δx (y), ∂ 2 Gτ (t, x, y) = ∆x Gτ (t, x, y) = ∆y Gτ (t, x, y), t > 0, x, u ∈ O. ∂t2 Then the wave semigroup is given by, for x ∈ O, u0 (x) Uτ (t) u0,1 ∂ ∂t Gτ (t, x, y)u0 (y) + Gτ (t, x, y)u0,1 (y) dy O . = ∂2 ∂ G (t, x, y)u0 (y) + ∂t Gτ (t, x, y)u0,1 (y) dy O ∂t2 τ Hence, for x ∈ O, Uτ (t) ((κ − ∆τ ) Dτ )† v(x) Gτ (t, x, y)(κ − ∆τ )Dτ v(y)dy . = O∂ G (t, x, y)(κ − ∆τ )Dτ v(y)dy O ∂t τ
(8.1)
Let us now denote by σ the surface measure on ∂O. The following result gives the formula for the solution to wave problem (1.2) in terms of the fundamental solution. Theorem 8.1. Assume that u is a solution to wave problem (1.2), where for simplicity u0 = 0 = u0,1 . (i) If τ is the Dirichlet boundary operator, then the solution is given by t ∂Gτ u(t, x) = − (t − s, x, y)v(y)dξ(s)(y)σ(dy), t ≥ 0, x ∈ O. ∂O ∂ny 0
14
Zdzislaw Brze´ zniak and Szymon Peszat
(ii) If τ is the Neumann boundary operator, then t u(t, x) = Gτ (t − s, x, y)dξ(s)(y)σ(dy), 0
t ≥ 0, x ∈ O.
∂O
Proof. Let us assume that u0 = u0,1 = 0. Then by (8.1), the solution to wave problem (1.2) is given by t Tτ (t − s)dξ(s), t ≥ 0, x ∈ O, u(t, x) = 0
where
Tτ (t)v(x) :=
O
Gτ (t, x, y)(κ − ∆τ )Dτ v(y)dy,
t ≥ 0, x ∈ O.
Note that, first by Remark 6.1, and then by the fact that Gτ satisfies the boundary condition τ Gτ (t, x, y) = 0 with respect to y-variable, we obtain, for t ≥ 0, x ∈ O, Gτ (t, x, y)(κ − ∆τ )Dτ v(y)dy Tτ (t)v(x) := O = (κ − ∆τ )y Gτ (t, x, y)Dτ v(y)dy. O
The Green formula and the fact that (κ − ∆)Dτ v = 0, yield then ∂G ∂Dτ v − Tτ (t)v(x) = (t, x, y)Dτ v(y) + G(t, x, y) (y) σ(dy) ∂ny ∂ny ∂O for all t ≥ 0 and x ∈ O. Hence if τ is the Dirichlet boundary operator, then ∂Gτ (t, x, y)v(y)σ(dy), t ≥ 0, x ∈ O. Tτ (t)v(x) = − ∂O ∂ny If τ is the Neumann boundary operator, then repeating the previous argument we obtain Gτ (t, x, y)v(y)σ(dy), t ≥ 0, x ∈ O. Tτ (t)v(x) = ∂O 9. Applications Assume that the process ξ(t)(x), t ≥ 0, x ∈ ∂O, is of one of the following two forms λk Wk (t)ek (x), t ≥ 0, x ∈ O, (9.1) ξ(t)(x) = k
or ξ(t)(x) =
Zk (t)ek (x),
t ≥ 0, x ∈ O,
(9.2)
k
where (λk ) is a sequence of real numbers, (ek ) is a sequence of measurable functions on ∂O, (Wk ) is a sequence of independent real-valued standard Wiener processes,
Hyperbolic Equations with Random Boundary Conditions
15
and (Zk ) is a sequence of uncorrelated real-valued pure jump L´evy processes, that is t t z πk (ds, dz) + π(ds, dz), t ≥ 0, Zk (t) = ak t + {|z|≤1}
0
0
{|z|>1}
where πk are Poisson random measures each with the jump measure νk . In the jump case, let us set λk,R := ak + νk {1 < |z| ≤ R}, Then, for all k and R,
˜ k,R := λk,R + νk {0 < |z| ≤ R}. λ t
Zk (t) = λk,R t + Mk,R (t) +
{|z|>R}
0
where
t Mk,R (t) := 0
z πk (ds, dz),
{|z|≤R}
t ≥ 0,
π(ds, dz),
t ≥ 0.
Let τk,R := inf{t ≥ 0 : |Zk (t) − Zk (t−)| ≥ R}. Then τk,R ↑ +∞ as R ↑ +∞. Moreover, Zk (t) = λk,R t + Mk,R (t) on {τk,R ≥ t}. Recall, see e.g. Ref. 25 [Lemma 8.22] or Ref. 26, that Mk,R are square integrable martingales and that for any predictable process f and any T > 0, t T f (s)dMk,R (s) ≤ C νk {0 < |z| ≤ R} E |f (s)|ds, E sup t∈[0,T ]
and
0
0
t 2 E sup f (s)dMk,R (s) ≤ C t∈[0,T ]
{0<|z|≤R}
0
z 2 νk (dz)
T
E |f (s)|2 ds,
0
where C is a certain (independent of f, T, R and πk ) universal constant. Recall that Gτ is the fundamental solution to the Cauchy problem for the wave equation with boundary operator τ . It turns out that the Dirichlet problem has a function valued solution if ξ is absolutely continuous. Therefore we present below the result on the Neumann problem. For specific examples see e.g. Refs. 7–9,19. Theorem 9.1. Assume that τ is the Neumann boundary operator. (i) If ξ is given by (9.1), then the solution u to (1.2) is a square integrable random field on [0, +∞)× O if and only if 2 t ds < ∞, t ≥ 0, x ∈ O. λ2k G (s, x, y)e (y)σ(dy) I(t, x) := τ k k
0
∂O
16
Zdzislaw Brze´ zniak and Szymon Peszat
Moreover, u(t, x) =
t λk 0
k
Gτ (t − s, x, y)ek (y)dσ(y)dWk (s), t ≥ 0, x ∈ O
∂O
and E |u(t, x)|2 ≤ I(t, x) for all t ≥ 0 and x ∈ O. (ii) If ξ is given by (9.2), then for all R > 0, t ˜ Gτ (s, x, y)ek (y)σ(dy) ds < ∞. λk,R JR (t, x) := 0
k
∂O
and the solution u to (1.2) is a random field if τR → +∞ as R → +∞. Moreover, t Gτ (t − s, x, y)ek (y)σ(dy)dZk (s), t ≥ 0, x ∈ O u(t, x) = − k
0
∂O
and E |u(t, x)|χ{t≤τR } ≤ CJR (t, x) for all t > 0, x ∈ O, and R > 0. Example 9.2. As an example of the wave equation with stochastic Dirichlet boundary condition, consider d = 1 and O = (0, +∞). Then the fundamental solution G is given by 1
χ{|x−y|
0
In general, process u = u(t) t≥0 takes values in a proper space of distributions. In fact, for a test function ϕ, t (u(t), ϕ) = ϕ(t − s)dξ(s), t ≥ 0. 0
For similar results in the case of the transport equation see Examples 10.3 and 10.4. 10. The Transport equation Let us describe an abstract framework in which we will study problem (1.3). Namely, in this case we put H = L2 (0, 2π) and U = R. We will identify H with the space of all locally square integrable 2π periodic functions f : R → R. Alternatively, we can identify H with the space L2 (S 1 ), where S 1 is the standard unit circle equipped
Hyperbolic Equations with Random Boundary Conditions
17
with the Haar measure (multiplied by 2π). In the space H we consider an operator A defined by 1,2 (0, 2π), D(A) = Hper
Au =
du , dx
(10.1)
where H 1,2 (0, 2π) is the Sobolev space of functions u ∈ L2 (0, 2π) with the weak 2 derivative du dx ∈ L (0, 2π), and 1,2 (0, 2π) = {u ∈ H 1,2 (0, 2π) : u(0+) = u(2π−)}. Hper
It is easy to see that A generates a C0 -group (U (t))t∈R in H. In fact, this group is the standard translation group defined by ˙ U (t)u(x) = u(t+x),
t ∈ R, x ∈ (0, 2π),
(10.2)
˙ is the addition modulo 2π. Let τ be the boundary operator defined by where + H 1,2 (0, 2π) u → τ u = u(2π−) − u(0+) ∈ R.
(10.3)
Note that by the Sobolev embedding theorem H 1,2 (0, 2π) → C([0, 2π]) and hence 1,2 u(2π−) and u(0+) make sense for each u ∈ H (0, 2π). Finally, we assume that ξ = adl` ag process defined on some complete filtered probability (ξ(t) t≥0 is an R-valued c` space (Ω, F , (Ft ), P). With all these notation we can now present an abstract form of problem (1.3), i.e. ∂u(t) = Au(t) for t ∈ (0, ∞), ∂t (10.4) u(0, ·) = u0 on (0, 2π), dξ(t) τ u(t) = for t ∈ (0, ∞). dt The problem needs to be reformulated as,for example, on the one hand there is an expression Au(t) and on the other hand τ u(t) may be different from 0, and hence u(t) may not belong to D(A). In order to give a proper definition of a mild solution to the above problem we will argue heuristically as in Section 5. For this we need to introduce a counterpart of the Dirichlet map Dτ from Section 4. To this aim let z0 ∈ H 1,2 (0, 2π) be the unique solution of the following problem d z0 (x) = z0 (x), x ∈ (0, 2π), dx
z0 (2π) − z0 (0) = 1.
Thus z0 (x) = (e2π − 1)−1 ex , x ∈ (0, 2π). Then we define a map Dτ : R α → αz0 ∈ H 1,2 (0, 2π). Note that the map Dτ satisfies the following u ∈ H 1,2 (0, 2π),
du =u dx
on (0, 2π) and τ (u) = α ⇐⇒ Dτ (α) = u.
In other words, τ ◦ Dτ is the identity operator on R.
18
Zdzislaw Brze´ zniak and Szymon Peszat
Let u be a solution to (10.1). As in Section 5 we consider a new process y defined by the following formula y(t) := u(t) − Dτ
dξ (t), dt
t > 0.
(10.5)
Clearly τ y(t) = 0 for t > 0 and y(0) = u0 − Dτ dξ dt (0). Next we have dξ ∂u ∂ ∂y = − Dτ (t) ∂t ∂t ∂t dt dξ dξ ∂ Dτ (t) = A y(t) + Dτ (t) − dt ∂t dt dξ dξ ∂ Dτ (t) . = Ay(t) + Dτ (t) − dt ∂t dt Continuing as in Section 5 we obtain t dξ ∂ dξ y(t) = U (t)y(0) + Dτ (r) dr. U (t − s) Dτ (r) − dr ∂r dr 0 On the other hand by integration by parts formula t dξ dξ ∂ − Dτ (r) dr = −Dτ (t) U (t − r) ∂r dr dt 0 t dξ dξ +U (t)Dτ (0) − AU (t − r)Dτ (r)dr. dt dr 0 Hence, we infer that
dξ dξ y(t) = U (t) y(0) + Dτ (0) − Dτ (t) dt dt t dξ + (I − A) U (t − r) Dτ (r) dr, t ≥ 0. dr 0
Therefore, in view of (10.5) we arrive at the following heuristic formula t dξ u(t) = U (t)u0 + (I − A) U (t − r) Dτ (r) dr, t ≥ 0. dr 0 Since (A∗ ) is an extension of A, to obtain a meaningful version of the above equation one should write t dξ (10.6) (I − (A∗ ) ) U (t − r) Dτ (r) dr, t ≥ 0, u(t) = U (t)u0 + dr 0 where U is the extension of the original group to (D(A∗ )) . Summing up, we have shown (for more details see Ref. 4) that the transport problem can be written in the abstract form (1.1) with E = Dτ and λ = 1. It is known (see Ref. 4) that the condition (2.1) is satisfied. It follows from Ref. 4 [Proposition 1.1, p.459] that the above integral defines 2 a function belonging to C([0, ∞); H) provided dξ dr ∈ Lloc (0, ∞). However, we are
Hyperbolic Equations with Random Boundary Conditions
19
interested in cases when this condition is no longer satisfied. Hence, if we perform integration by parts in the integral in (10.6) we obtain the following Anzatz for the solution to problem (10.1): u(t) = U (t)u0 + (I − (A∗ ) ) Dτ ξ(t) − U (t)Dτ ξ(0) t (10.7) (I − A∗ ) ) AU (t − r) Dτ ξ(r) dr, t ≥ 0. + 0
Let us observe that formula (10.7) is a counterpart of formula (3.5) from Section 3. In some sense, this could be seen as a stochastic counterpart of a deterministic result from Ref. 4, see Proposition 1.1 on p. 459. As far as the problem (10.1) is concerned, it remains to identify the space (D(A∗ )) with an appropriate space of distributions. We have the following. Proposition 10.3. The space (D(A∗ )) is equal to H −1,2 (0, 2π) and the operator (A∗ ) is equal to the weak derivative. In particular, dϕ , u ∈ H, ϕ ∈ H 1,2 (0, 2π). ((A∗ ) u, ϕ) = − u, dx
−1 x Recall that Dτ (α) = αz0 , where z0 (x) = e2π − 1 e , x ∈ (0, 2π). Thus, by the proposition above, for any ϕ ∈ H 1,2 (0, 2π), 2π dϕ ∗ ((A ) z0 , ϕ) = − z0 (x) (x)dx dx 0 2π = −z0 (2π)ϕ(2π) + z0 (0)ϕ(0) + z0 (x)ϕ(x)dx, 0
and hence
∗
(I − (A ) ) Dτ [α] =
1 e2π δ2π − 2π δ0 α. e2π − 1 e −1
Next not that, for any test function ϕ ∈ H 1,2 (0, 2π), t t
˙ δ2π , ϕ(t − s+·) U (t − s)δ2π dξ(s), ϕ = dξ(s) 0
0
t
˙ ϕ(t − s+2π)dξ =
t
˙ ϕ(t − s+0)dξ t U (t − s)δ0 dξ(s), ϕ . =
=
0
0
0
Therefore, by (10.8), t t ˙ U (t − s) (I − (A∗ ) ) Dτ dξ(s), ϕ = ϕ(t − s+0)dξ(s), 0
and in other words, we have the following result.
0
(10.8)
20
Zdzislaw Brze´ zniak and Szymon Peszat
Proposition 10.4. The solution u to problem (1.3) is an H −1,2 (0, 2π)-valued process such that for any test function ϕ ∈ H 1,2 (0, 2π), t ˙ ϕ(t − s+0)dξ(s), t ≥ 0. (u(t), ϕ) = (u0 , ϕ) + 0
Example 10.3. Assume that ξ is a compound Poisson process defined on a probability space (Ω, F , P) with the jump measure ν; that is
Π(t)
ξ(t) =
Xk ,
t ≥ 0,
k=1
where Π is Poisson process with intensity ν(R) and Xk are independent random variable with the distribution ν/ν(R). Let τk be the moments of jumps of Π. Then t ˙ ˙ ϕ(t − s+0)dξ(s) = ϕ(t − τk +0)X k. 0
τk ≤t
Since each τk has a absolutely continuous distribution (exponential), the formula can be extended to any ϕ ∈ L2 (0, 2π), and hence the solution is a cylindrical process in L2 (0, 2π); that is for each t > 0, u(t) is a bounded linear operator from L2 (0, 2π) to L2 (Ω, F , P). Example 10.4. Let ξ be a Wiener process. Then again u is a distribution valued process and a cylindrical Gaussian random process in L2 (0, 2π). References 1. E. Al` os and S. Bonaccorsi Stability for stochastic partial differential equations with Dirichlet white-noise boundary conditions, Infin. Dimens. Anal. Quantum Probab. Relat. Top. 5, 465–481 (2002) 2. E. Al` os and S. Bonaccorsi, Stochastic partial differential equations with Dirichlet whitenoise boundary conditions, Ann. Inst. H. Poincar´e Probab. Statist. 38, 125–154 (2002) 3. A. V. Balakrishnan, Identification and stochastic control of a class of distributed systems with boundary noise. Control theory, numerical methods and computer systems modelling (Internat. Sympos., IRIA LABORIA, Rocquencourt, 1974), pp. 163–178, Lecture Notes in Econom. and Math. Systems, Vol. 107, Springer, Berlin, 1975. 4. A. Bensoussan, G. Da Prato, M.C. Delfour, and S.K. Mitter, Representation and Control of Infinite Dimensional Systems, Second edition. Systems & Control: Foundations & Applications. Birkh¨ auser Boston, Inc., Boston, MA, 2007. 5. Z. Brze´zniak, B. Goldys, G. Fabri, S. Peszat, and F. Russo, Second order PDEs with Dirichlet white noise boundary conditions, in preparation. 6. I. Chueshov and B. Schmalfuss, Qualitative behavior of a class of stochastic parabolic PDEs with dynamical boundary conditions, Discrete Contin. Dyn. Syst. 18, 315–338 (2007) 7. R. Dalang and O. L´evˆeque, Second order linear hyperbolic SPDE’s driven by isotropic Gaussian noise on a sphere, Ann. Probab. 32, 1068–1099 (2004)
Hyperbolic Equations with Random Boundary Conditions
21
8. R. Dalang and O. L´evˆeque, Second-order hyperbolic SPDE’s driven by homogeneous Gaussian isotropic noise on a hyperplane, Trans. Amer. Math. Soc. 358, 2123–2159 (2006) 9. R. Dalang and O. L´evˆeque, Second-order hyperbolic SPDE’s driven by boundary noises, Seminar on Stochastic Analysis, Random Fields and Applications IV, pp. 83–93, Progr. Probab., 58, Birkhuser, Basel, 2004. 10. G. Da Prato and J. Zabczyk, Stochastic Equations in Infinite Dimensions, Cambridge Univ. Press, Cambridge, 1992. 11. G. Da Prato and J. Zabczyk, Evolution equations with white-noise boundary conditions, Stochastics and Stochastics Rep. 42, 167–182 (1993) 12. G. Da Prato and J. Zabczyk, Ergodicity for Infinite Dimensional Systems, Cambridge Univ. Press, Cambridge, 1996. 13. J. Duan and B. Schmalfuss, The 3D quasigeostrophic fluid dynamics under random forcing on boundary, Commun. Math. Sci. 1, 133–151 (2003) 14. I. Lasiecka and R. Triggiani, A cosine operator approach to modeling L2 (0, T ; L2 (Γ))boundary input hyperbolic equations, Appl. Math. Optim. 7, 35–93 (1981) 15. I. Lasiecka and R. Triggiani, Regularity of hiperbolic equations under L2 (0, T ; L2 (Γ))Dirichlet boundary terms, Appl. Math. Optim. 10, 275–286 (1983) 16. I. Lasiecka and R. Triggiani, Differential and algebraic Riccati equations with applications to boundary/pioint control problems, Lecture Notes in Control and Inform. Sci., vol. 165, Springer-Verlag, Berlin, Heidelberg, New York, 1991. 17. I. Lasiecka and R. Triggiani, Control Theory for Partial Differential Equations: Continuous and Approximation Theories. II. Abstract hyperboliclike systems over a finite time horizon, Encyclopedia of Mathematics and its Applications, 75. Cambridge Univ. Press, Cambridge, 2000. 18. I. Lasiecka, J. L. Lions, and R. Triggiani, Non homogeneous boundary value problems for second order hyperbolic operators, J. Math. Pures Appl. 65, 149–192 (1986) 19. O. L´evˆeque, Hyperbolic SPDE’s driven by a boundary noise, PhD Thesis 2452 (2001), EPF Lausanne. 20. J.L. Lions and E. Magenes, Non-Homogeneous Boundary Value Problems and Applications I, Springer-Verlag, Berlin Heidenberg New York, 1972. 21. A. Lunardi, Analytic Semigroups and Optimal Regularity in Parabolic Problems, Birkhauser, 1995. 22. X. Mao and L. Markus, Wave equations with stochastic boundary values, J. Math. Anal. Appl. 177, 315–341 (1993) 23. B. Maslowski, Stability of semilinear equations with boundary and pointwise noise, Ann. Scuola Norm. Sup. Pisa 22, no. 1, 55–93 (1995) 24. A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations, Springer, New York, 1983. 25. S. Peszat and J. Zabczyk, Stochastic Partial Differential Equations Driven by L´ evy Processes, Cambridge Univ. Press, Cambridge, 2007. ´ 26. E. Saint Loubert Bi´e, Etude d’une EDPS conduite par un bruit poissonien, Probab. Theory Related Fields 111, 287–321 (1998) 27. L. Schwartz, Th´ eorie des distributions I, II, Hermann & Cie., Paris, 1950, 1951. 28. R.B. Sowers, Multidimensional reaction-diffusion equations with white noise boundary perturbations, Ann. Probab. 22, 2071–2121 (1994)
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 23–42
Chapter 2 Decoherent Information of Quantum Operations
Xuelian Cao, Nan Li∗ and Shunlong Luo† School of Mathematics and Statistics, Huazhong University of Science and Technology, Wuhan 430074, China Quantum operations (channels) are natural generalizations of transition matrices in stochastic analysis. A quantum operation usually causes decoherence of quantum states. Speaking in a broad sense, decoherence is the loss of some kind of correlations (in particular, entanglement). This is also recognized as the origin of the emergence of classicality. While this decoherence has been widely studied and characterized from various qualitative perspectives, there are relatively fewer quantitative characterizations. In this work, motivated by the notion of coherent information and consideration of correlating capability and transferring of correlations, we study an informational measure of decoherent capability of quantum operations in terms of quantum mutual information. The latter is usually regarded as a suitable measure of total correlations in a bipartite system. This measure possesses a variety of desirable properties required for quantitative characterizations of decoherent information, and is complementary to the coherent information introduced by Schumacher and Nielsen (Phys. Rev. A, 54, 2629, 1996) in a loose sense. Apart from the significance in its own right, the decoherent information also provides an alternative and simple interpretation of, and sheds new light on, the coherent information. Several examples are worked out. A quantum Fano type inequality, an informational no-broadcasting result for bipartite correlations, and a continuity estimate for the decoherent information, are also established.
Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Quantum mutual information and purification of mixed states 3 Decoherent information . . . . . . . . . . . . . . . . . . . . . 4 No-broadcasting in terms of decoherent information . . . . . 5 Continuity of the decoherent information . . . . . . . . . . . 6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ∗ Academy
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
24 25 26 36 38 40 41
of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China † Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing 100190, China,
[email protected] 23
24
Xuelian Cao, Nan Li and Shunlong Luo
1. Introduction Quantum states and quantum operations (channels) are fundamental objects in quantum theory,20,22 and they are the natural quantum counterparts of probability densities and transition matrices in classical probability theory. The coupling between states, measurements and quantum operations is the basic starting point for generating probabilities of measurement outcomes and usually causes decoherence of quantum features. This decoherence is also the central phenomenon relating the quantum world to the classical realm. Given a mixed state ρ, which is mathematically represented by a non-negative operator with unit trace on a Hilbert space, and a quantum operation E, which is described by a trace-preserving completely positive linear mapping on quantum states,20 in order to quantify the decoherence caused by the quantum operation, one naturally asks how the output state E(ρ) is different from the input state ρ, and furthermore, how the relations with their respective outside environments are changed. We will distinguish two kinds of differences: The first is physical and the second is informational. As for the physical difference, there are many distance measures, such as the trace distance, the Hilbert-Schmidt distance, the Bures distance (a simple function of fidelity), etc.,19,20 which quantify the formal difference between any two states. While these distance measures capture the formal difference of quantum states and play a significant role in the study of state disturbance, they are hardly directly useful in characterizing the intrinsic difference of informational contents of quantum states in the context of correlations, which lies in the heart of decoherence. To take an extreme example, any two unitarily equivalent states of the same quantum system have the same amount of informational contents and can be converted to each other by a unitary operation without loss of information, and thus should be considered informationally equivalent, but the above various distance measures fail to characterize this phenomenon. As for the informational difference between quantum states, a seemingly necessary requirement for this kind of measure is that it should not distinguish any two states which can be converted to each other by quantum operations while preserving their correlations with other systems. In particular, the informational distance should be zero for any two states which are related by unitary operations. In this work, we are concerned with informational difference and interested in how the operation E causes informational decoherence of a quantum state ρ in the context of correlations. The setup is as follows: From a fundamental point of view, any mixed state ρ can be interpreted as a marginal state of a pure state in a larger system (purification), that is, arising from the entanglement of the quantum system with an auxiliary system, and the informational content of ρ quantifies the correlating capability of the quantum system to the auxiliary system. Consider the task of sending the system state through a noisy channel in order to transmit the
Decoherent Information
25
initial correlations (not just the initial system state). Since the auxiliary system is isolated and left undisturbed, the joint operation on the initial purified state is the tensor product of the system evolution E and the identity operation on the auxiliary system, and the final bipartite state will usually have less correlations. The purpose is to maintain correlations between the final system state and the auxiliary system as many as possible. In such a scenario, a fundamental and key quantity is the difference between the initial correlations and the final correlations. We interpret this quantity as a measure of decoherent information, with the correlations quantified by the quantum mutual information. It turns out that this measure is somewhat complementary to the celebrated coherent information introduced by Schumacher and Nielsen.27 As a simple consequence, our approach simplifies their derivations and intuitions considerably, and furthermore, clarifies the true meaning of the coherent information as a hybrid quantity (which is however ambiguously interpreted in the literature). Moreover, the decoherent information, though rather trivially related to the coherent information from the mathematical viewpoint, is conceptually more intuitive, and can be more easily manipulated. Among other applications of the notion of the decoherent information, we evaluate the decoherent information for several important quantum operations, establish a quantum Fano type inequality, an informational no-broadcasting theorem, and a continuity estimate for the decoherent information. The remaining part of this work is structured as follows. In Sect. 2, we review briefly the notions of quantum mutual information and purification, which will play crucial roles in our approach. In Sect. 3, working in the framework of Schumacher and Nielsen,27 and Adami and Cerf,1 we introduce a measure of decoherent information and enumerate its fundamental properties, which are essentially reformulations of the results in Refs. 1 and 27. However, the simple and intuitive derivations here do not depend on their results. We will emphasize the complementary relation between the decoherent information and the coherent information. Our main results in this section are the explicit evaluations of the decoherent information for several important quantum operations (channels). A quantum Fano type inequality is also established. As an interesting application of the decoherent information, we establish in Sect. 4 an informational no-broadcasting theorem. We provide a continuity estimate for the decoherent information in Sect. 5. Finally, we conclude with discussion in Sect. 6. 2. Quantum mutual information and purification of mixed states For any bipartite quantum state ρab of a composite quantum system H a ⊗ H b with marginal states (partial traces) ρa = trb ρab ,
ρb = tra ρab ,
26
Xuelian Cao, Nan Li and Shunlong Luo
its quantum mutual information is defined as1,30 I(ρab ) := S(ρa ) + S(ρb ) − S(ρab ). Here S(ρa ) := −trρa logρa is the quantum entropy (von Neumann entropy) and the logarithm may be taken to any base larger than 1. Quantum mutual information is widely recognized and used as a natural measure of total correlations in a bipartite state, and there are a variety of mathematical, informational as well as physical arguments supporting this belief and usage.1,10,16,29,30 In particular, see Ref. 16 for a concise and informative review. A fundamental property of the quantum mutual information is the decreasing property under local quantum operations. More precisely, let E a and E b be quantum operations on the systems H a and H b , respectively, and put E ab := E a ⊗ E b , then I(ρab ) ≥ I(E ab (ρab )). This is a particular instance of the monotonicity of quantum relative entropy, which is an extremely important and fundamental property.13,30 In quantum mechanics, there are several interpretations of the notion of a density operator (mixed state). We will regard it as the marginal state of a pure state in a larger system. This has both the mathematical convenience and physical significance, and is the departure of our study of decoherence (of correlations) in this chapter. See Ref. 18 for a concise review of the status of density operators. For any state ρb of a quantum system H b (the unusual superscript here is for latter convenience), there is a pure state |Ψab of a composite system H a ⊗ H b such that ρb = tra |Ψab Ψab |. Here H a is an auxiliary system. Moreover, ρb and ρa = trb |Ψab Ψab | have the same quantum entropy. The most natural construction of a purification is as follows:20 Let the spectral decomposition of ρb be λi |ψib ψib |, ρb = i
then |Ψab =
λi |ψia ⊗ |ψib i
is a particular purification of ρ . Here |ψia = |ψib . Notice that purification is not unique, but this will have no consequence here because our approach will be independent of the purifications. b
3. Decoherent information Now consider a state ρb of a quantum system H b and a quantum operation E b , thus with the input state ρb , we get the output state E b (ρb ). In order to quantify the
Decoherent Information
27
informational decoherence caused by E b on the state ρb , we compare the correlating capability of the original state ρb with an auxiliary system and that of the final state after the operation. For this purpose, we put the quantum system in the purified context, and consider the quantum state ρb as the marginal state of a larger system.1,27 Thus let |Ψab be a pure state in a composite quantum system H a ⊗ H b which purifies ρb , that is, ρb = tra |Ψab Ψab | (here H a is an auxiliary system). Let I a be the identity operation acting on the auxiliary system H a . After the joint operation I a ⊗ E b acting on the joint pure state |Ψab , the output state is
ρa b := I a ⊗ E b (|Ψab Ψab |) with marginal states
ρb := tra ρa b = E b (ρb ),
ρa := trb ρa b .
Before the action of the quantum operation E b , the correlating capability of ρb is quantified by the quantum mutual information I(ρab ) in the bipartite state ρab := |Ψab Ψab |,
and after the action, the correlating capability of the output state ρb = E b (ρb ) in this context is quantified by the quantum mutual information I(ρa b ). Consequently, the difference
D(ρb , E b ) := I(ρab ) − I(ρa b ) quantifies the loss of correlating capability of ρb due to the operation (decoherence map) E b , and thus serves as a natural measure of the decoherence. We will call D(ρb , E b ) the decoherent information (as opposed to the coherent information), for its own sake and further for a reason that will be transparent after we establish the complementary nature between it and the coherent information of Schumacher and Nielsen27 in the subsequent Eq. (3.6). The decoherent information is an intrinsic quantity, depending only on ρb and E b , and is independent of the purifications. To gain an intuition about this notion, let us first consider some extreme examples. Example 1. Let U be any unitary operator on H b and consider the unitary operation E b (ρb ) = U ρb U † . Because the evolution is unitary, there is not any decoherence here. Consequently, we expect that the decoherent information in this situation should be zero. This is indeed the case since
I(ρa b ) = I(1a ⊗ U ρab 1a ⊗ U † ) = 2S(ρa ) = I(ρab ), from which we obtain D(ρb , E b ) = 0. Here 1a is the identity operator on H a .
28
Xuelian Cao, Nan Li and Shunlong Luo
Example 2. Let E b be the completely depolarizing operation which transforms every state into the maximally mixed one:
1b E (σ) = d b
(here 1b is the identity operator on H b , and d is the dimension of H b ). Then since this is the complete decoherence and we expect the decoherent information equals to the correlating capability of the original state, that is, 2S(ρb ) (total correlations of the original purified state). This is indeed the case because ρ
a b
1b =ρ ⊗ d a
and thus I(ρa b ) = 0, from which we obtain
D(ρb , E b ) = I(ρab ) − I(ρa b ) = 2S(ρb ), and we see that all correlations are lost. Now, we list some of the main properties of the decoherent information. (1) For any quantum state ρb , it holds that 0 ≤ D(ρb , E b ) ≤ 2S(ρb ). (2) D(ρb , E b ) = 0 if and only if E b is invertible on ρb , that is, there exists a quantum operation E b such that E b ◦ E b (ρb ) = ρb . (3) D(ρb , E b ) = 2S(ρb ) if E b is a completely depolarization operation. (4) Let E b and E b be two quantum operations acting on the H b system and the H b system, respectively, then
D(ρb , E b ) ≤ D(ρb , E b ◦ E b ). For item (1), the first inequality follows from the monotonicity of the quantum mutual information under local operations,13,30 and the second inequality is trivial since
D(ρb , E b ) = I(ρab ) − I(ρa b ) ≤ I(ρab ) = 2S(ρb ). Item (2) is also a direct consequence of the monotonicity of the quantum mutual information and the equality condition as specified by Petz et al.11,23 Item (3) follows from direct calculations, as shown in Example 2. Item (4) also follows immediately from the monotonicity of quantum mutual information. This may be interpreted as a data processing inequality. Following Schumacher and Nielsen,27 let us recall the coherent information
C(ρb , E b ) := S(ρb ) − S(ρa b ),
(3.1)
which is an intrinsic quantity, depending only on ρb and E b , and is actually the minus of the quantum conditional entropy S(ρa b |ρb ) := S(ρa b ) − S(ρb ). The coherent information is often interpreted as a measure of degree of entanglement
Decoherent Information
29
retained by the systems H a and H b ,27 a measure of quantumness of the correlations in the final joint state. However, these interpretations are ambiguous and cannot be taken too seriously for two reasons: First, it can be negative, and second, as we will see after we establish Eq. (3.6), the coherent information, as the difference of the entanglement entropy and the decoherent information, is actually a hybrid quantity. Nevertheless, the coherent information is a fundamental quantity and plays a significant role in the study of quantum error corrections and quantum channel capacities.3,5,7,8,12,17,26–28 We will also need the notion of entropy exchange,26,27 which is defined as the von Neumann entropy
Se (ρb , E b ) := S(ρa b )
(3.2)
of the final state ρa b . This is an intrinsic quantity measuring the information exchanged between the system and its exterior world. To see this more clearly, let us introduce an environment system H c with an initial pure state |ψ c in order to unitarily dilate the quantum operation E b as E b (ρb ) = trc (V (ρb ⊗ |ψ c ψ c |)V † ).
(3.3)
Here V is a unitary operator on the composite system H b ⊗H c (system plus environment). Then as a whole combining the quantum system H b , the auxiliary system H a purifying the quantum state ρb , and the environment H c dilating the quantum operation E b , we have a tripartite system H a ⊗ H b ⊗ H c and the initial tripartite pure state |Ψabc := |Ψab ⊗ |ψ c , which is driven by the unitary operator 1a ⊗ V to the final pure state
|Ψa b c := 1a ⊗ V (|Ψab ⊗ |ψ c ). See Figure 2.1 for a schematic illustration. Because ρa b c := |Ψa b c Ψa b c | is pure, for the marginal states, we have S(ρa b ) = S(ρc ), and consequently,
Se (ρb , E b ) = S(ρa b ) = S(ρc )
(3.4)
is the entropy of the final state of the environment, which in turn can be interpreted as the net increase of the entropy of the environment since the initial entropy of the environment is zero (note that the initial state of the environment |ψ c is pure). In this context, we may also rewrite the coherent information as
C(ρb , E b ) = S(ρb ) − S(ρc ). This equation corroborates in a loose sense the interpretation of the coherent infor mation as a measure of retained entanglement since S(ρb ) is an upper bound of the entanglement of H b with other system and S(ρc ) quantifies the net information flow to the environment.
30
Xuelian Cao, Nan Li and Shunlong Luo
U a'
Ua
Ub
U b'
V U c'
\c
Figure 2.1. Tripartite purification of the quantum state ρb and the quantum operation E b . This is a combination of the purification of ρb and the unitarization of E b . Note that ρabc := |Ψabc Ψabc |, ρab := |Ψab Ψab |, ρa b c := |Ψa b c Ψa b c |, ρa b := (I a ⊗ E b)(ρab ), and various marginal states are obtained by taking partial traces, such as ρab = trc ρabc , ρa = trb ρab = trbc ρabc , ρa = trb ρa b = trb c ρa b c , etc..
Combining Eqs. (3.1) and (3.2), we obtain
C(ρb , E b ) + Se (ρb , E b ) = S(ρb ).
(3.5)
Eq. (3.5) indicates that the coherent information and the entropy exchange are complementary to each other with respect to the final system state ρb = E b (ρb ). In contrast, it turns out that the decoherent information is complementary to the coherent information with respect to the initial system state ρb , thus indeed is a measure of lost coherent information. This is summarized in the following equality C(ρb , E b ) + D(ρb , E b ) = S(ρb ). b
(3.6)
Since the von Neumann entropy S(ρ ) has a natural interpretation as the entanglement entropy of the initial purified state |Ψab ,4,24 the above equation exhibits a decomposition of the entanglement quantity into two complementary parts: the decoherent information may be roughly interpreted as the lost, and the coherent information as the retained (which is in agreement with the original interpretation of Schumacher and Nielsen27 ). However, an irritating issue concerning the coherent information is that it may be negative, and thus the above interpretations of the
Decoherent Information
31
coherent information are somewhat vague and ambiguous. In sharp contrast, the meaning of the decoherent information is clear: it is the loss of total correlations which may contain both classical and quantum parts. To establish Eq. (3.6), note that S(ρa ) = S(ρb ) (since ρab = |Ψab Ψab | is pure) and ρa = ρa (since the evolution on the auxiliary system H a is the identity), we have
D(ρb , E b ) = I(ρab ) − I(ρa b )
= 2S(ρb ) − (S(ρa ) + S(ρb ) − S(ρa b )) = S(ρb ) − C(ρb , E b ). It is interesting to compare Eqs. (3.5) and (3.6). The former exhibits a decomposition of the final state entropy and the latter exhibits a decomposition of the initial state entropy. We also observe that the decoherent information and entropy exchange exhibit similar properties. Consider the representation of the operation E b in Eq. (3.3), we have a complementary quantum operation (channel) Ecb (ρb ) := trb (V (ρb ⊗ |ψ c ψ c |)V † ). Due to the complementary nature of the two quantum operations E b and Ecb , we expect that the decoherent information D(ρb , E b ) and D(ρb , Ecb ) should be complementary to each other in some sense. Indeed, we have the following identity D(ρb , E b ) + D(ρb , Ecb ) = 2S(ρb ),
(3.7)
which may be interpreted as an information conservation principle for the decoherent information of two complementary quantum operations. To establish Eq. (3.7), note that S(ρa ) = S(ρb ) = S(ρa ), and since ρa b c is pure, we also have S(ρa b ) = S(ρc ) and S(ρb c ) = S(ρa ). Now by the definition of the decoherent information, we have
D(ρb , E b ) = I(ρab ) − I(ρa b )
= 2S(ρb ) − (S(ρa ) + S(ρb ) − S(ρa b ))
= S(ρb ) − S(ρb ) + S(ρc ), and similarly,
D(ρb , Ecb ) = 2S(ρb ) − I(ρa c )
= 2S(ρb ) − (S(ρa ) + S(ρc ) − S(ρa c ))
= S(ρb ) − S(ρc ) + S(ρb ). Summing up the above equations yields the desired Eq. (3.7). A particular interesting instance of Eq. (3.7) is that when D(ρb , E b ) = 2S(ρb ). In such a case, the H b system is completely decoupled from the H a system (the correlations between them are completely lost). However, we have D(ρb , Ecb ) = 0, which indicates that there is no decoherence between the H a system and the
32
Xuelian Cao, Nan Li and Shunlong Luo
environment H c , or equivalently, the lost correlations between H a and H b reemerge as the correlations between H a and the environment H c . In other words, the correlations, like incompressible fluid, are not destroyed, but only transferred to different systems. Combining Eqs. (3.6) and (3.7), we have the following information conservation relation for the coherent information of two complementary quantum operations C(ρb , E b ) + C(ρb , Ecb ) = 0. We next relate the decoherent information to the entanglement fidelity, another important quantity.26,27 Recall that the entanglement fidelity is defined as
Fe (ρb , E b ) := Ψab |ρa b |Ψab . This quantity also characterizes how much entanglement is retained after the quantum operation. The quantum Fano inequality states that20,26,27 Se (ρb , E b ) ≤ H(Fe ) + (1 − Fe )log(d2 − 1).
(3.8)
Here H(p) := −plogp − (1 − p)log(1 − p) is the binary Shannon entropy function, Fe := Fe (ρb , E b ) and d is the dimension of H b . Now, we can formulate a quantum Fano type inequality for the decoherent information as
D(ρb , E b ) ≤ S(ρb ) − S(ρb ) + H(Fe ) + (1 − Fe )log(d2 − 1) ≤ 2 H(Fe ) + (1 − Fe )log(d2 − 1) .
(3.9) (3.10)
In particular, if the quantum operation E b is a von Neumann projective measurement, then D(ρb , E b ) ≤ H(Fe ) + (1 − Fe )log(d2 − 1).
(3.11)
Inequality (3.9) follows from the combination of Eqs. (3.5) and (3.6) and inequality (3.8). To prove inequality (3.10), note that by Eq. (3.4) and the quantum Fano inequality, we have
S(ρc ) ≤ H(Fe ) + (1 − Fe )log(d2 − 1). Now by the fact that
S(ρb ) = S(ρa ) = S(ρa ) = S(ρb c )
and the subadditivity of the quantum entropy S(ρb c ) ≤ S(ρb ) + S(ρc ), we obtain the desired result. When E b is a von Neumann projective measurement, we further have S(ρb ) ≤ S(ρb ), and the desired inequality (3.11) now follows from inequality (3.9). By use of the decoherent information, we may recast the perfect error correction condition of Schumacher and Nielsen27 in a more elegant way as D(ρb , E b ) = 0.
Decoherent Information
33
Here perfect error correction means that there exists a further quantum operation E b which takes ρb := E b (ρb ) to ρb such that the overall entanglement fidelity
Fe (ρb , E b ◦ E b ) = 1. We can give a simple proof of the above result without reliance on the original result of Schumacher and Nielsen,27 whose proof is rather ingenious. First, if Fe (ρb , E b ◦ E b ) = 1, then S(ρb ) = S(ρb ), and by Eq. (3.10) (with E b ◦ E b playing the role E b there), we immediately obtain D(ρb , E b ◦ E b ) = 0, which implies that D(ρb , E b ) = 0 by the data processing inequality. Conversely, if D(ρb , E b ) = 0, then
I(ρab ) = I(ρa b ) = I(I a ⊗ E b (ρab )). Now by the equality condition of the monotonicity of relative entropy11,23 (in par ticular, see Theorem 3 in Ref. [11]), there exists a quantum operation E b such that
ρa
b
:= (I a ⊗ E b ) ◦ (I a ⊗ E b )(ρab ) = ρab ,
which implies that Fe (ρb , E b ◦ E b ) = 1. We remark that by use of the decoherent information, it is natural to define the following quantity D(E b ) := D(1b /d, E b ) as the decoherent information of the quantum operation E b . Here d is the dimension of H b . The quantity D(E b ) varies from zero for unitary operations to 2S(ρb ) for the completely depolarizing channel. This notion should be useful in studying the decoupling capability of local quantum operations. Let us evaluate D(E b ) for several widely used quantum operations (channels) on a qubit. In the following situations, we always have ρa = ρb = 12 and we may take |Ψab = √12 (|00 + |11 ) such that 1001 1 0 0 0 0 . ρab = |Ψab Ψab | = 2 0 0 0 0 1001 For any p ∈ [0, 1], recall that H(p) = −plogp−(1−p)log(1−p) is the Shannon binary entropy function. The logarithm is taken to base 2 in the following calculations. (1). Following Buscemi et al.,6 a complete decohering channel acting on a qubit system is defined by the following quantum operation E b (σ) = M ◦ σ. Here M is a correlation matrix, i.e., a non-negative definitive matrix with all diagonal elements being 1, and the product ◦ here denotes the Hadamard product (also called Schur product, entry-wise product of matrices). In particular, 12 M may be regarded as a quantum state.
34
Xuelian Cao, Nan Li and Shunlong Luo
The most general 2 × 2 correlation matrix can be written as 1α M= , |α| < 1. α ¯ 1 Now the final state
ρa b
10 1 00 := I a ⊗ E b (|Ψab Ψab |) = 20 0 α ¯0
0α 0 0 . 0 0 01
The eigenvalues of this ρa b are λ1 = λ2 = 0, λ3 = 1−|α| 2 , λ4 = 1 − |α| , D(E b ) = H 2
1+|α| 2 .
Consequently,
which turns out to be equal to the quantum entropy S( 12 M ). (2). The Pauli channel is defined as E b (σ) = p0 σ + p1 XσX + p2 Y σY + p3 ZσZ, where X, Y, Z are the Pauli spin matrices, and p = (p0 , p1 , p2 , p3 ) is a probability distribution. We have p0 + p3 0 0 p0 − p3 1 0 p1 + p2 p1 − p2 0 . ρa b := I a ⊗ E b (|Ψab Ψab |) = 0 0 p1 − p2 p1 + p2 2 0 0 p0 + p3 p0 − p3
The eigenvalues of ρa b are p0 , p1 , p2 , p3 , consequently, D(E ) = − b
3
pi logpi .
i=0
The bit-flip channel and the depolarizing channel are special instances of the Pauli channel. In particular, the depolarizing channel 1 E b (σ) = pσ + (1 − p) trσ, 2
−
1 ≤ p ≤ 1, 3
which is defined for any 2 × 2 matrix σ, can be viewed as a Pauli channel with 1 + 3p 1 − p 1 − p 1 − p , , , . p= 4 4 4 4 In this instance, D(E b ) = 2 −
3 − 3p 1 + 3p log(1 + 3p) − log(1 − p). 4 4
This will be compared with the next example.
Decoherent Information
35
(3). For the transpose depolarizing channel 1 E b (σ) = pσ T + (1 − p) trσ, 2 we have
−1 ≤ p ≤
1+p 2
ρa b = I a ⊗ E b (|Ψab Ψab |) =
1 0 2 0 0
0 1−p 2
p 0
1 , 3 0 0 . 0
0 p 1−p 2
0
1+p 2
The eigenvalues of ρa b are λ1 = λ2 = λ3 =
1+p 1 − 3p , λ4 = , 4 4
and therefore 3 + 3p 1 − 3p log(1 − 3p) − log(1 + p). 4 4 (4). For the amplitude damping channel D(E b ) = 2 −
E b (σ) = E1 σE1† + E2 σE2† √ 1 0 0 p √ , E2 = , 0 ≤ p ≤ 1, we have with E1 = 0 0 0 1−p √ 1 0 0 1−p 1 0 00 0 ρa b = I a ⊗ E b (|Ψab Ψab |) = 0p 0 2 0 √ 1−p 0 0 1−p with marginal states
ρa =
1 2
10 01
,
ρb =
1 2
1+p 0 0 1−p
.
The eigenvalues of ρa b are λ1 = λ2 = 0, λ3 = p2 , λ4 = 1 − p2 . Consequently, p 1 + p +H . D(E b ) = 1 − H 2 2 (5). For the phase damping channel E b (σ) = E1 σE1† + E2 σE2† 1 0 0 0 √ , E2 = , 0 ≤ p ≤ 1, we have with E1 = √ 0 1−p 0 p √ 1 0 0 1−p 1 0 00 0 . ρa b = I a ⊗ E b (|Ψab Ψab |) = 00 0 2 0 √ 1−p 0 0 1
36
Xuelian Cao, Nan Li and Shunlong Luo √
The eigenvalues of ρa b are λ1 = λ2 = 0, λ3 = 1− 21−p , λ4 = 1 − √1 − p b . D(E ) = H 2
√ 1+ 1−p . 2
Therefore,
We compare the decoherent information between the depolarizing channel and the transpose depolarizing channel (see Figure 2.2), and that between the amplitude damping channel and the phase damping channel (see Figure 2.3). It is interesting to note that the amplitude damping channel is more decoherent than the phase damping channel with the same parameter p.
Graphs of the decoherent information D(E b ) for the depolarizing channel and the transpose depolarizing channel versus the parameter p.
Figure 2.2.
4. No-broadcasting in terms of decoherent information In this section, we establish an informational no-broadcasting result for correlations. Just like no-cloning,9,25,31 no-broadcasting is a fundamental characteristic of the quantum world and there are various characterizations.2,14 In the framework of the previous section, let n ≥ 2 be any natural number and consider a quantum operation E b which takes a mixed state ρb to an n-partite state ρb := E b (ρb ) on the composite system H b1 ⊗ H b2 ⊗ · · · ⊗ H bn . By taking partial traces over various subsystems of this composite state, we obtain n reduced states
ρbk := trk ρb ,
k = 1, 2, · · · , n.
Here trk means taking partial trace with respect to all subsystems except the k-th
Decoherent Information
37
Graphs of the decoherent information D(E b ) for the amplitude damping channel and the phase damping channel versus the parameter p. Figure 2.3.
one. Then we have n induced quantum operations E bk := trk ◦ E b ,
k = 1, 2, · · · , n.
Theorem 1. For n ≥ 2, it holds that D(ρb , E b1 ) + D(ρb , E b2 ) + · · · + D(ρb , E bn ) ≥ nS(ρb ).
(4.1)
To see this, by the fact S(ρa ) = S(ρa ) and the definition of the decoherent information, we have D(ρb , E b1 ) + D(ρb , E b2 ) + · · · + D(ρb , E bn ) n I(ρab ) − I(ρa bk ) = k=1
= nI(ρab ) −
n
I(ρa bk )
k=1 n
S(ρa ) − S(ρa bk |ρbk )
= 2nS(ρa ) −
k=1
= nS(ρa ) +
n
S(ρa bk |ρbk ),
k=1 a bk
bk
a bk
where S(ρ |ρ ) := S(ρ ) − S(ρbk ) is the quantum conditional entropy (which may be negative). To prove inequality (4.1), it suffices to show that n S(ρa bk |ρbk ) ≥ 0. k=1
38
Xuelian Cao, Nan Li and Shunlong Luo
But this follows from the fact that n n S(ρa bk |ρbk ) + S(ρa bn+1−k |ρbn+1−k ) , S(ρa bk |ρbk ) = 2 k=1
k=1
and each summand in the right hand side is nonnegative. The latter is due to the general inequality S(ρab |ρb ) + S(ρac |ρc ) ≥ 0, which holds for any tripartite state ρabc and is actually equivalent to the strong subadditivity of the quantum entropy. The above theorem puts an informational constraint on the ability of locally broadcasting the bipartite correlations between H a and H b to the n bipartite cor relations between H a and H bk , k = 1, 2, · · · , n. To gain an intuitive understanding of this, let n = 2 and consider the case D(ρb , E b1 ) = 0, which indicates that the correlations are perfectly transferred from H a ⊗H b to H a ⊗H b1 . Now by the above theorem, we obtain D(ρb , E b2 ) ≥ 2S(ρb ), but we clearly have D(ρb , E b2 ) ≤ 2S(ρb ). Consequently, D(ρb , E b2 ) = 2S(ρb ), and this means that no correlations in H a ⊗ H b are transferred to H a ⊗ H b2 . 5. Continuity of the decoherent information We establish a continuity estimate for the decoherent information of quantum operation D(E b ) in this section: If two quantum operations are close enough in some sense to be specified late, then their magnitudes of decoherent information are also close enough. For any quantum operation
E b : S(H b ) → S(H b ), its dual map
E b∗ : B(H b ) → B(H b ) is defined via the duality relation
ρ ∈ S(H b ), X ∈ B(H b ).
trE b (ρ)X = trρE b∗ (X),
Here B(H b ) denotes the space of observables on the Hilbert space H b . Let
Eib : S(H b ) → S(H b ),
i = 1, 2,
be two quantum operations, and put 1 δ := ||E1b∗ − E2b∗ ||cb , 2 where the so-called norm of complete boundedness (cb-norm, for short) is defined as21 ||E1b∗ − E2b∗ ||cb := sup ||In∗ ⊗ (E1b∗ − E2b∗ )||∞ . n
(5.1)
Decoherent Information
39
Here In denotes the identity operation on the state space of an n-dimensional Hilbert space, and the norm || · ||∞ is the conventional operator norm for the dual maps of quantum operations considered as operators on the space of observables. In particular, from the duality relation between the operator norm || · ||∞ for any linear operator K on the space of observables B(H) and the trace norm || · ||1 on S(H), we have ||K||∞ = sup ||K∗ (ρ)||1 .
(5.2)
ρ∈S(H)
Theorem 2. Let d be the dimension of H b and δ := (0, 1 − 1d ), then
1 b∗ 2 ||E1
− E2b∗ ||cb . If δ ∈
|D(E1b , ρb ) − D(E2b , ρb )| ≤ 2H(δ) + δ log(d − 1)(d2 − 1).
(5.3)
Here H(x) := −xlogx − (1 − x)log(1 − x) is the binary Shannon entropy function. To establish the above result, let ρab = |Ψab Ψab | with |Ψab ∈ H a ⊗ H b a purification of ρb = tra ρab (here H a = H b ). Let ρabi = I a ⊗ Eib (ρab ), i = 1, 2. By the definition, we have D(E1b , ρb ) = S(ρb ) − S(ρb1 ) + S(ρab1 ), D(E2b , ρb ) = S(ρb ) − S(ρb2 ) + S(ρab2 ), from which we obtain |D(E1b , ρb ) − D(E2b , ρb )| = |S(ρb2 ) − S(ρb1 ) + S(ρab1 ) − S(ρab2 )| ≤ |S(ρb1 ) − S(ρb2 )| + |S(ρab1 ) − S(ρab2 )|. Let γ1 = 12 tr|ρb1 − ρb2 |, γ2 = 12 tr|ρab1 − ρab2 |. Applying the result in Zhang,32 which states that |S(ρ) − S(σ)| ≤ H(γ) + γ log(m − 1), where γ = 12 tr|ρ − σ|, and m is the dimension of the system, we have |S(ρb1 ) − S(ρb2 )| ≤ H(γ1 ) + γ1 log(d − 1), and |S(ρab1 ) − S(ρab2 )| ≤ H(γ2 ) + γ2 log(d2 − 1). Let f (γ1 ) = H(γ1 ) + γ1 log(d − 1),
g(γ2 ) = H(γ2 ) + γ2 log(d2 − 1),
then |D(E1b , ρb ) − D(E2b , ρb )| ≤ f (γ1 ) + g(γ2 ). Note that f (γ1 ) is monotone increasing when γ1 ∈ (0, 1 − 1d ), and from the monotonicity of the trace distance, we have γ1 ≤ γ2 . Consequently, f (γ1 ) ≤ f (γ2 ),
40
Xuelian Cao, Nan Li and Shunlong Luo
and when γ2 ∈ (0, 1 − 1d ), we have |D(E1b , ρb ) − D(E2b , ρb )| ≤ f (γ2 ) + g(γ2 ) = 2H(γ2 ) + γ2 log(d − 1)(d2 − 1), From γ2 = = = = = ≤ ≤
1 tr|ρab1 − ρab2 | 2 1 ab1 ||ρ − ρab2 ||1 2 1 a ||I ⊗ E1b (ρab ) − I a ⊗ E2b (ρab )||1 2 1 ||(I a ⊗ E1b − I a ⊗ E2b )(ρab )||1 2 1 a ||I ⊗ (E1b − E2b )(ρab )||1 2 1 a∗ ||I ⊗ (E1b∗ − E2b∗ )||∞ (by Eq. (5.2)) 2 1 b∗ ||E − E2b∗ ||cb (by Eq. (5.1)), 2 1
we conclude that γ2 ≤ δ. Let h(γ2 ) = 2H(γ2 ) + γ2 log(d − 1)(d2 − 1). Since h(γ2 ) is increasing in the interval (0, 1 − √ √
1 , (d−1)(d2 −1)+1
we have h(γ2 ) ≤ h(δ) when δ ∈
1 ), and 1 − d1 ≤ 1 − (d−1)(d2 −1)+1 (0, 1 − d1 ). Thus when δ ∈ (0, 1 − d1 ),
the desired inequality (5.3) follows. 6. Discussion By exploiting the difference between the quantum mutual information before and after a quantum operation, we have introduced the notion of decoherent information, which quantifies how much correlation information is lost, and which in turn characterizes certain aspect of decoherence caused by a quantum operation. The decoherent information is somewhat complementary to the coherent information with respect to the initial state. The informational meaning and properties of the former are more transparent and can be more simply derived than the latter from the monotonicity of quantum mutual information. By use of the decoherent information, we have not only provided an alternative and intrinsic interpretation of the coherent information, but have also gained some new insight. We have also established a quantum Fano type inequality for the decoherent information, an informational no-broadcasting result and a continuity estimate for the decoherent information. It is hoped that the notion of decoherent information may be useful in studying decoherence phenomena and quantum channel capacities.
Decoherent Information
41
Acknowledgement This work was supported by NSFC, Grant No. 10771208, and by the science Fund for Creative Research Groups, Grant No. 10721101. References 1. C. Adami and N. J. Cerf, von Neumann capacity of noisy quantum channels, Phys. Rev. A 56, 3470 (1997). 2. H. Barnum, C. M. Caves, C. A. Fuchs, R. Jozsa, and B. Schumacher, Noncommuting mixed states cannot be broadcast, Phys. Rev. Lett. 76, 2818 (1996). 3. H. Barnum, M. A. Nielsen, and B. Schumacher, Information transmission through a noisy quantum channel, Phys. Rev. A 57, 4153 (1998). 4. C. H. Bennett, H. J. Bernstein, S. Popescu, and B. Schumacher, Concentrating partial entanglement by local operations, Phys. Rev. A 53, 2046 (1996). 5. C. H. Bennett, P. W. Shor, J. A. Smolin, and A. V. Thapliyal, Entanglement-assisted classical capacity of noisy quantum channels, Phys. Rev. Lett. 83, 3081 (1999); Entanglement-assisted capacity of a quantum channel and the reverse Shannon theorem, IEEE Trans. Inform. Theory 48, 2637 (2002). 6. F. Buscemi, G. Chiribella, and G. M. D’Ariano, Inverting quantum decoherence by classical feedback from the environment, Phys. Rev. Lett. 95, 090501 (2005). 7. I. Devetak and P. W. Shor, The capacity of a quantum channel for simultaneous transmission of classical and quantum information, Commun. Math. Phys. 256, 287 (2005). 8. I. Devetak, The private classical capacity and quantum capacity of a quantum channel, IEEE Trans. Inform. Theory 51, 44 (2005). 9. D. Dieks, Communication by EPR devices, Phys. Lett. A 92, 271 (1982). 10. B. Groisman, S. Popescu, and A. Winter, Quantum, classical, and total amount of correlations in a quantum state, Phys. Rev. A 72, 032317 (2005). 11. P. Hayden, R. Jozsa, D. Petz, and A. Winter, Structure of states which satisfy strong subadditivity of quantum entropy with equality, Commun. Math. Phys. 246, 359 (2004). 12. A. S. Holevo, On entanglement-assisted classical capacity, J. Math. Phys. 43, 4326 (2002). 13. B. Ibinson and A. Winter, All inequalities for the relative entropy, Commun. Math. Phys. 269, 223 (2007). 14. A. Kalev and I. Hen, No-broadcasting theorem and its classical counterpart, Phys. Rev. Lett. 100, 210502 (2008). 15. D. Kretschmann, D. Schlingermann, and R. F. Werner, The information-disturbance tradeoff and the continuity of Stinespring’s representation, IEEE Trans. Inform. Theory 54, 1708 (2008). 16. N. Li and S. Luo, Total versus quantum correlations in quantum states, Phys. Rev. A 76, 032327 (2007). 17. S. Lloyd, Capacity of the noisy quantum channel, Phys. Rev. A 55, 1613 (1997). 18. S. Luo, Quantum versus classical uncertainty, Theore. Math. Phys. 143, 681 (2005). 19. S. Luo and Q. Zhang, Informational distance on quantum-state space, Phys. Rev. A 69, 032106 (2004). 20. M. A. Nielsen and I. L. Chuang, Quantum Computation and Quantum Information (Cambridge University Press, Cambridge, UK, 2000).
42
Xuelian Cao, Nan Li and Shunlong Luo
21. V. I. Paulsen, Completely Bounded Maps and Operator Algebras (Cambridge University Press, Cambridge, 2002). 22. A. Peres, Quantum Theory: Concepts and Methods (Kluwer, Dordrecht, 1993). 23. D. Petz, Monotonicity of quantum relative entropy revisited, Rev. Math. Phys. 15, 79 (2003). 24. S. Popescu and D. Rohrlich, Thermodynamics and the measure of entanglement, Phys. Rev. A 56, R3319 (1997). 25. V. Scarani, S. Iblisdir, and N. Gisin, Quantum cloning, Rev. Mod. Phys. 77, 1225 (2005). 26. B. Schumacher, Sending entanglement through noisy quantum channels, Phys. Rev. A 54, 2614 (1996). 27. B. Schumacher and M. A. Nielsen, Quantum data processing and error correction, Phys. Rev. A 54, 2629 (1996). 28. B. Schumacher and M. D. Westmoreland, Quantum privacy and quantum coherence, Phys. Rev. Lett. 80, 5695 (1998). 29. B. Schumacher and M. D. Westmoreland, Quantum mutual information and the onetime pad, Phys. Rev. A 74, 042305 (2006). 30. V. Vedral, The role of relative entropy in quantum information theory, Rev. Mod. Phys. 74, 197 (2002). 31. W. K. Wootters and W. H. Zurek, A single quantum cannot be cloned, Nature 299, 802 (1982). 32. Z. Zhang, Uniform estimates on the Tsallis entropies, Lett. Math. Phys. 80, 171-181 (2007).
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 43–66
Chapter 3 Stabilization of Evolution Equations by Noise
Tom´ as Caraballo1 and Peter E. Kloeden2 1
Dpto. Ecuaciones Diferenciales y An´ alisis Num´erico, Universidad de Sevilla, 41080–Sevilla, Spain, [email protected] 2
Institut f¨ ur Mathematik, Goethe Universit¨ at, D-60054 Frankfurt am Main, Germany, [email protected]
Some recent results on stabilization by noise in systems modeled by evolution equations is reviewed, mostly for partial differential equations but also for delay differential equations and systems without uniqueness.
Keywords: Stochastic partial differential equations. Itˆ o noise, Stratonovich noise, exponential stochastic stability, stabilization, destabilization. 2000 AMS Subject Classification: 35R10 35B40 47H20 58F39 73K70 Contents 1 2
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Linear PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.1 Persistence of stability and stabilization by Itˆ o noise . . . . . . . . 2.2 Destabilization by Itˆ o noise . . . . . . . . . . . . . . . . . . . . . . 3 Linear PDEs without fully commuting noise . . . . . . . . . . . . . . . . . 3.1 Stabilization by simple multiplicative Itˆ o noise . . . . . . . . . . . . 3.2 Stabilization by Stratonovich noise . . . . . . . . . . . . . . . . . . 4 Nonlinear PDEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4.1 Stabilization by Itˆ o noise . . . . . . . . . . . . . . . . . . . . . . . . 4.2 Stabilization by Stratonovich noise . . . . . . . . . . . . . . . . . . 5 Other types of evolution equations and models . . . . . . . . . . . . . . . 5.1 Delay differential equations . . . . . . . . . . . . . . . . . . . . . . 5.2 Stabilization of evolution inclusions and PDEs without uniqueness 5.3 Stabilization of stationary solutions of a stochastic PDE . . . . . . 5.4 Other types of problems . . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
43
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . .
44 46 48 49 50 50 51 53 54 59 60 60 60 63 63 64
44
Tom´ as Caraballo and Peter E. Kloeden
1. Introduction Stabilization by noise has a long history dating back, in engineering practice, to the early days of the industrial revolution. Attempts to understand the phenomenon mathematically in terms of stochastic differential equations (SDEs) began in the 1960s, at first mainly for finite dimensional systems and in the past decade for infinite dimensional systems, that is for evolution equations, by which is meant mainly partial differential equations (PDEs), but also includes delay differential equations (DDEs). The mechanism behind stabilization is intuitively quiet simple. Consider a linear system for which the zero steady state solution is unstable, but not unstable in all directions. The idea is to apply noise in an appropriate way to drive the dynamics in the unstable directions into the stable directions. Stabilization thus requires the system to be sufficiently strongly stable in certain directions to overcome the effects of the unstable directions. This restricts, in general, the classes of systems which can be stabilized. Another important consideration is whether the SDEs are interpreted in the Itˆ o and Stratonovich senses. This is essentially a modelling issue, but has significant consequences. For example, the scalar SDE dXt = aXt dt + bXt dWt , where a > 0 and Wt is a standard Wiener process, has the Itˆo and Stratonovich solutions 1 2 Xt = e(a− 2 b )t+bWt X0
and Xt = eat+bWt X0 .
It follows by the properties of the Wiener process (see Arnold3 or Kloeden & Platen36 ) that the zero solution is pathwise exponentially stable for the Itˆo SDE for b2 large enough, while the same is not true for the Stratonovich SDE. This seems to imply that Itˆ o noise has a more profound stabilizing effect than Stratonovich noise. However, this argument is somewhat misleading. Indeed, the above system has no stable directions and the interplay between stable directions and Stratonovich noise can lead to stabilization in higher dimensional systems, as will be seen in some examples below. See Caraballo10 for a detailed discussion on the effects of Itˆo and Stratonovich formalisms on stabilization. There is an extensive literature about such problems for finite-dimensional systems (see for example Arnold,5 Arnold et al.,7 Arnold & Kloeden,8 Mao,43 Scheuto zow46 ). Many results on the stabilization and destabilization produced by both Itˆ and Stratonovich noise have been obtained, and these have also been applied to construct feedback stabilizers, which are an important tool in control problems. Although in each particular situation one or other choice of the noise may be more appropriate, stabilization by Stratonovich noise might be considered more
Stabilization of Evolution Equations by Noise
45
significant and has a non-trivial literature with both mathematical and engineering contributions (see Arnold,5 Arnold et al.7 and the references therein). Since such noise behaves like a periodic zero-mean feedback control, its stabilizing effect is unexpected and very intriguing. In the finite-dimensional case, Arnold and his collaborators proved that the linear differential system dx = Ax, (1.1) dt can be stabilized by the addition of a collection of multiplicative noisy terms, dXt = AXt dt +
d
Bi Xt ◦ dWti ,
(1.2)
i=1
where the Wti are mutually independent Wiener processes and the Bi are suitable skew-symmetric matrices, if and only if tr A < 0.
(1.3)
Since the trace of the matrix equals the sum of its eigenvalues, this indicates that the system must have stable directions which are sufficiently strongly attracting to overcome the effects of the unstable ones. (The stabilization and destabilization of nonlinear ordinary differential equations has been considered by Appleby et al.2 and the papers cited therein). The corresponding problem for linear partial differential equations remained open for a long time (and in fact is still open in the most general case) because it needs a version of the renowned Oseledec Multiplicative Ergodic Theorem for infinite-dimensional spaces, which has not yet been established as far as we know. However, Caraballo & Robinson26 were able to circumvent this difficult by stabilizing the linear PDE with a finite sum of Stratonovich terms as in (1.2), which allowed them to project onto a finite-dimensional subspace and thus use finitedimensional arguments. From an engineering perspective finite-dimensional noise is not necessarily a restriction and, indeed, is possibly more realistic. To the best of our knowledge, the problem of stabilization of nonlinear PDEs by Stratonovich noise is still an unsolved problem in general, although it has been shown to be possible in a number of interesting applications by using Itˆ o’s noise (see Caraballo & Langa,11 Caraballo et al.,14,20,23,24 Kwiecinska,39 and Leha et al.41 amongst many others). Our aim in this article is to present some of the results that have been obtained and to indicate the basic techniques used. To be more precise, we consider a linear evolution equation on a separable Hilbert space H given by du = Au, (1.4) dt where A : D(A) ⊂ H → H is a linear operator which has a sequence of eigenvalues λj with associated eigenfunctions ej . We assume that these eigenfunctions form an orthonormal basis of H and that the eigenvalues λj are bounded above (but not
46
Tom´ as Caraballo and Peter E. Kloeden
necessarily below), so that they can be ordered in the form λ1 ≥ λ2 ≥ . . . . In addition, we denote by |·| the norm in H and by (·, ·) its associated scalar product. We will consider the stochastically perturbed evolution equation dUt = AUt dt +
d
Bi Ut ◦ dWti ,
(1.5)
i=1
where the Bi : D(Bi ) ⊂ H → H are linear operators and the Wti are mutually independent Wiener process on the same probability space (Ω, F , P) . We will report in the following sections the differences on the behaviour of problems (1.4) and (1.5) In fact, most of the analysis carried out in this field is concerned only with the stabilization of the trivial solution. Nevertheless, going deeper into the investigation of the nonlinear models, we can find in the stochastic models some special solutions called stationary but which are not stationary in the deterministic sense, i.e., steady state solutions. These solutions sometimes become random attractors for some systems, so their existence and properties are very important. Some preliminary results have been obtained in Caraballo et al.16,17 2. Linear PDEs We first consider some properties of the solutions of a linear stochastic PDE and a result on the exponential stability of its zero solution. This will allow us to point out the different effects that the interpretation of the noise may produce in the final result as well as to characterize the stabilization of linear PDEs by Stratonovich noise. To start we consider the Itˆo formulation of the linear SPDE. We can thus apply a result due to Da Prato & Zabczyk31 which ensures the equivalence of the stochastic PDE to a pathwise nonautonomous deterministic PDE, i.e. a random PDE. Then, we transform our Stratonovich model to an equivalent Itˆo model and apply this result. (This equivalence has been proved by Kunita37 for suitable partial differential operators and we implicitly assume that we are considering this case). Consider the Cauchy problem for the linear Itˆ o SPDE, dUt = AUt dt +
d
Bk Ut dWtk ,
U0 = u0 ∈ H,
(2.1)
k=1
where A : D(A) ⊂ H → H and the Bk : D(Bk ) ⊂ H → H, k = 1, · · · , d, are generators of C0 -semigroups SA (t) and Sk , respectively, and the Wt1 ,· · · ,Wtd are independent real Wiener processes on the same probability space (Ω, F , P). We also need the following assumptions: Assumption 1 (1ex). (A1) , The operators B1 ,· · · ,Bd generate mutually commuting C0 -groups Sk . d (A2) D(Bk2 ) ⊃ D(A) for k = 1,· · · ,d and k=1 D((Bk∗ )2 ) is dense in H, where Bk∗ denotes the adjoint operator of Bk .
Stabilization of Evolution Equations by Noise
(A3) C = A −
1 2
d k=1
47
Bk2 generates a C0 -semigroup SC .
Given a realization of the Wiener processes Wtk (ω) for fixed ω ∈ Ω we define Tω (t) =
d
Sk (Wtk (ω)) and v(t) = Tω−1 (t) u(t),
t ≥ 0,
(2.2)
k=1
and we consider the auxiliary deterministic system dv (t) = Tω−1 (t) C Tω (t) v(t), v(0) = u0 , (2.3) dt which is a deterministic Cauchy problem depending on the parameter ω. The following result, along with the definition of a strong solution, can be found in Da Prato & Zabczyk:31 Proposition 2.1. Suppose that Assumptions (A1)–(A3) are satisfied. Then, if Ut is a strong solution of (2.1), the process v(t, ω) defined by (2.2) satisfies (2.3). Conversely, if v is a predictable process whose trajectories are continuously differentiable and satisfy (2.3) , P-a.s., then the process Ut (ω) := Tω (t)v(t, ω) takes values in D(C), P-a.s., and for almost all t, and is a strong solution to (2.1). A sufficient condition ensuring the solvability of (2.3) which is useful in applications can also be found in Da Prato & Zabczyk31 , pp. 177–179. Now consider the Stratonovich version of the problem, i.e, dUt = AUt dt +
d
Bk Ut ◦ dWtk ,
U0 = u0 ∈ H.
(2.4)
k=1
To obtain existence of its solutions we consider its equivalent Itˆo version d d 1 2 dUt = (A + Bk Ut dt + Bk Ut dWtk , U0 = u0 ∈ H. 2 k=1
(2.5)
k=1
Instead of (A3), we assume Assumption 2. (A3’) C = A +
1 2
d k=1
Bk2 generates a C0 -semigroup SC ,
as well as (A1) and (A2), then, thanks to Proposition 2.1, problem (2.4) can be rewritten equivalently as dv (t) = Tω−1 (t)ATω (t) v(t), v(0) = u0 , (2.6) dt The following result of Caraballo & Robinson26 (see also Caraballo & Langa11 ) characterizes the asymptotic stability of the Stratonovich model (2.4) under the assumption that all the operators involved in the equation mutually commute. We call this the fully commuting case and show how, under the same assumptions, the Itˆo equation (2.1) may exhibit very different asymptotic behavior.
48
Tom´ as Caraballo and Peter E. Kloeden
Theorem 2.1. In addition to Assumptions (A1)–(A2) and (A3’), suppose that A commutes with each Sk (t). Then, the strongly continuous semigroup SA (t) generated by A is exponentially stable, i.e., there exist M0 , γ > 0 such that |SA (t)| ≤ M0 e−γt
for all t > 0,
if and only if there exist α, C > 0 and Ω0 ⊂ Ω with P(Ω0 ) = 0 such that for any ω ∈ / Ω0 there exists T (ω) > 0 such that the solution Ut of (2.4) satisfies |Ut | ≤ C|u0 | e−αt for t ≥ T (ω). In the fully commuting case, the stability properties of the deterministic problem (1.4) and the stochastic (2.4) are thus equivalent, so we can ensure the adequacy of the deterministic model to the stochastic real phenomenon. However, if we interpret the noise in the sense of Itˆo, then we may have very different results. For example, it may happen that (1.4) is stable and (2.1) remains stable (persistence of stability from the deterministic to the stochastic model), or that (1.4) is unstable and (2.1) becomes stable (stabilization produced by the noise), or that (1.4) is stable and (2.1) becomes unstable (destabilization). We will illustrate these scenarios in the following subsections. 2.1. Persistence of stability and stabilization by Itˆ o noise Let O be a bounded domain in Rd (d ≤ 3) with C ∞ -boundary, and consider the noisy reaction-diffusion equation ∂u(t, x) ˙ t, = ∆u(t, x) + αu(t, x) + γu(t, x) W ∂t (2.7) u(t, x) = 0, t > 0, x ∈ ∂O, u(0, x) = u0 (x), x ∈ O, ˙t where ∆ denotes the Laplacian operator and Wt is a scalar Wiener process and W the corresponding Gaussian white noise. To set this problem in our framework as an Itˆ o equation, we take H = L2 (O), 1 A = ∆+αI and B = γI. Then D(A) = H0 (O)∩H 2 (O). Let λ1 > 0 denote the first eigenvalue of −∆. Then, as a consequence of Theorem 3.1 in Subsection 3.1 (see also Kwiecinska38 ), it is easy to check that the null solution of (2.7) is exponentially stable, P-a.s., if the parameters in the equation satisfy 2(α − λ1 ) − γ 2 < 0. Let us now discuss what this condition means. First, notice that when α < λ1 , the deterministic equation (i.e. Eq. (2.7) with γ = 0) is exponentially stable. Then, for any γ ∈ R (in other words, no matter how large or small the intensity of the noise might be), the stochastic equation (2.7) remains exponentially stable, P-a.s. Consequently, the stability persists in the presence of noise.
Stabilization of Evolution Equations by Noise
49
When α > λ1 , however, the deterministic equation is not stable (see, for instance, Example 2.2 below for a more detailed analysis in a case of one spatial dimension). But, if we choose γ large enough so that 2(α − λ1 ) − γ 2 < 0, then the stochastic equation becomes exponentially stable, P-a.s. Consequently, noise with large intensity stabilizes the system. 2.2. Destabilization by Itˆ o noise Consider again the deterministic heat equation, now in one spatial dimension: 2 ∂u(t, x) = ∂ u(t, x) + αu(t, x), t > 0, 0 < x < π, ∂x2 ∂t (2.8) u(t, 0) = u(t, π) = 0, t > 0, u(0, x) = u0 (x), x ∈ [0, π]. 2
∂ 1 2 Set H = L2 ([0, π]) and A = ∂x 2 + α, so D(A) = H0 ([0, π]) ∩ H ([0, π]). This system has the explicit solution
u(t, x) =
∞
2
an e−(n
−α)t
sin nx
n=1
with initial value u0 (x) = ∞ n=1 an sin nx, and the zero solution is exponential stable if and only if α < n2 for all n ∈ N, ie., if and only if α < 1. Consider now the problem dTt = A Ut t + B Ut dWt ,
(2.9)
where B is defined by Bu(x) = δ
∂u(x) ∂x
for any u ∈ H01 ([0, π]) and some δ ∈ R. If we choose δ such that δ2 < 1, 2 then the stochastic problem becomes unstable. Indeed, denoting by C = A − 12 B 2 , the stability of problem (2.9) is equivalent to the stability of 1−α≤
dUt = C Ut dt + B Ut ◦ dW (t),
U0 = u 0 .
(2.10)
But, due to the commutativity property of these operators, Theorem 2.1 ensures that the stability of (2.10) is equivalent to the stability of the deterministic problem ∂u(t, x) δ 2 ∂ 2 u(t, x) = 1− + αu(t, x), 2 ∂x2 ∂t u(t, 0) = u(t, π) = 0, t > 0, u(0, x) = u0 (x), x ∈ [0, π].
50
Tom´ as Caraballo and Peter E. Kloeden
The zero solution of this system is exponentially stable if and only if α<1−
δ2 . 2
Since our constants satisfy the opposite inequality, we see that the noise destabilizes the deterministic exponentially stable system. In conclusion, it is clear in this fully commuting case that we should be very careful how we interpret the noise since the behavior of the resulting stochastic models may then be completely different. More precisely, the Stratonovich noise does not modify the stability properties of the deterministic model, while Itˆ o noise can produce very different effects. 3. Linear PDEs without fully commuting noise Under fully commuting assumptions in the previous subsection, the zero solution of the deterministic problem is exponentially stable if and only if the stochastically perturbed equation (Stratonovich sense) has the same property. However, an immediate question arises. When no commutativity holds between A and some Bk , then there are some sufficient conditions in Caraballo & Robinson,26 which ensure the persistence of exponential stability of the deterministic system in the stochastic model. 3.1. Stabilization by simple multiplicative Itˆ o noise A simple multiplicative noise in the Itˆ o sense will stabilize the deterministic linear partial differential equation (1.4) in many cases, so we do not need to worry too much about looking for a very complicate expression of the noise. A term like ˙ (t) σuW can produce that effect. This stabilization can be produced for more general terms and, in some cases, we can even determine the decay rate of the solutions (exponential, sub- or super-exponential), see Caraballo et al.14 The following result is a particular situation of a much more general nonlinear theorem (see Section 4 for more details). First, recall that a linear operator A generates a strongly continuous semigroup SA (t) satisfying |SA (t)| ≤ eαt ,
(3.1)
for some α ∈ R, if and only if (Au, u) ≤ α|u|2 for all u ∈ D(A). Theorem 3.1. Assume that A generates a strongly continuous semigroup SA (t) satisfying (3.1) for some α ∈ R, and that B : D(B) ⊂ H → H is a linear (bounded or unbounded) operator with D(A) ⊂ D(B). Suppose also that the two following hypotheses hold:
Stabilization of Evolution Equations by Noise
51
i) There exists β ∈ R such that 1 (Au, u) + |Bu|2 ≤ β|u|2 , 2
∀u ∈ D(A)
(3.2)
(which is immediately fulfilled for β = α + 12 B2 , if B is bounded). ii) There exist b, b ∈ R with 0 ≤ b ≤ b such that b|u|2 ≤ (u, Bu) ≤ b|u|2 ,
∀u ∈ D(B).
(3.3)
Then, for every u0 ∈ D(A),u0 = 0, the solution Ut := u(t, ω; 0, u0 ) to the problem dUt = AUt dt + BUt dWt ,
U0 = u0 ∈ H,
satisfies lim sup t→+∞
1 log |Ut |2 ≤ −(b2 − β), t
P − a.s.
In the particular case that B is defined by Bu = bu with b ∈ R, then β = α + 12 b2 and, hence, b2 − β = 12 b2 − α, which is positive when b2 is large enough. Thus the zero solution will be stabilized by Itˆ o noise of the above type with sufficiently large intensity. 3.2. Stabilization by Stratonovich noise To obtain the same effect using Stratonovich noise turns out, however, to be a completely different and much more difficult problem as mentioned in the Introduction. But, surprisingly, a very simple trick (discovered long time after the results in the finite-dimensional case were obtained) allows one to prove that the negative trace assumption (1.3) is a necessary and sufficient condition for the stabilization of a linear PDE by a suitable Stratonovich noise (see Caraballo & Robinson26 for a detailed exposition on this problem). Instead of presenting this stabilization result here, we will motivate the problem with an example, which provides the basic idea of the proof of the theorem. Consider the following one-dimensional heat equation ∂ 2 u(t, x) ∂u(t, x) = + 2u(t, x), t > 0, 0 < x < π, ∂x2 ∂t (3.4) u(t, 0) = u(t, π) = 0, t > 0, u(0, x) = u0 (x), x ∈ [0, π]. This problem can be formulated in our framework by setting H = L2 ([0, π]) and A ∂2 1 2 = ∂x 2 + 2I, so D(A) = H0 ([0, π]) ∩ H ([0, π]). Recall from Section 2.2 that this problem has the explicit solution u(t, x) =
∞ n=1
2
an e−(n
−2)t
sin nx
52
Tom´ as Caraballo and Peter E. Kloeden
for the initial value u0 (x) = ∞ n=1 an sin nx. Hence, it is clear that the zero solution of the problem (3.4) is not stable. However, we will see that by an appropriate choice of operators Bk : H → H, k = 1, · · · , d, the zero solution of the system dUt = AUt dt +
d
Bk Ut ◦ dWtk ,
(3.5)
k=1
is exponentially stable with probability one. Note that the operators Bk cannot commute with A here. 2 The operator ! A has eigenvalues λn = 2 − n , n ≥ 1, with associated eigenfunc-
2 tions en = π sin nx, which form an orthonormal basis of the Hilbert space H. Hence any u ∈ H can be represented in the form (u, ek )ek = u k ek . u= k≥1
k≥1
We define B : H →H via Be1 := −σe2 , Be2 := σe1 and Ben := 0 for any n ≥ 3. This is a linear operator, which does not commute with A. Using the Fourier representation for the solution u(t) to (3.5), this problem can be re-written as k≥1 duk (t) ek = k≥1 λk uk (t)ek dt + [σu2 (t)e1 − σu1 (t)e2 ] ◦ dW (t) (3.6) u(0) = u0 = k≥1 u0,k ek . Identifying the coefficients gives two coupled problems, the first being a 2dimensional stochastic ordinary differential system and the second an infinitedimensional system which is exponentially stable (since λn < 0 for all n ≥ 3), namely 0 σ u1 (t) u1 (t) d u1 (t) = λ1 0 dt + ◦ dWt −σ 0 u2 (t) 0 λ2 u2 (t) u2 (t) (3.7) u1 (0) = u0,1 , u2 (0) = u0,2 and
k≥3
The matrix
duk (t) ek =
λk uk (t) ek dt,
k≥1
k≥3
0 1 −1 0
uk (0) ek =
u0,k ek .
(3.8)
k≥3
is a basis for the linear space of skew symmetric 2 × 2 matrices, so results in Arnold et al.7 show that the leading Lyapunov exponent of solutions to (3.7) tends to 1 2 (λ1 + λ2 ) = −1/2 as the intensity parameter σ grows to +∞. Moreover, the leading Lyapunov exponent for the solutions to (3.8) is λ3 = −7, so we can ensure that the top Lyapunov exponent for the solutions of (3.6) is negative. The main idea for the stabilization of evolution equations is thus to decompose the problem into two new problems: a finite-dimensional one which can be stabilized
Stabilization of Evolution Equations by Noise
53
by using previously available methods from the finite dimensional framework, and an infinite-dimensional system which is already exponentially stable. This idea can be extended in a general way to enable the stabilization of a wide class of deterministic PDEs, which appear very frequently in applications. Consider again the deterministic infinite-dimensional linear system (1.4). The main stabilization result of this article is the following one from Caraballo & Robinson26 . Theorem 3.2. Assume that the trace of A is negative, i.e., tr A :=
∞
λj < 0.
(3.9)
j=1
Then, there exist linear operators Bk : H → H, k = 1, · · · , d, such that the zero solution of dUt = AUt dt +
d
Bk Ut ◦ dWtk ,
(3.10)
j=1
is exponentially stable, P-a.s. The operators Bk are such that for some N > 0, the N × N matrices D1 , · · · , Dk defined by
(Bk e1 , e1 ) (Bk e2 , e1 ) · · · (Bk eN , e1 ) (Bk e1 , e2 ) (Bk e2 , e2 ) · · · (Bk eN , e2 ) Dk = .. . : : : (Bk e1 , eN ) (Bk e2 , eN ) · · · (Bk eN , eN ) are skew-symmetric. Conversely, if there exist linear operators Bk : H → H, k = 1, · · · , d, with the above properties, for which the zero solution of (3.10) is exponentially stable with probability one, then the trace of A is negative. 4. Nonlinear PDEs The objective of this section is twofold. First, we will show that there is a well developed theory concerning the stabilization of nonlinear PDEs by Itˆ o noise with applications to several interesting examples. On the other hand, since not much is known about the same topic involving Stratonovich noise, we will analyze a particular example (which can, in a sense, be considered as canonical) in which the previous Theorem 3.2 combined with some order preserving properties allow one to establish stabilization for the Chafee-Infante equation by Stratonovich noise. A more complete study for more general nonlinear equations is a topic for further research.
54
Tom´ as Caraballo and Peter E. Kloeden
4.1. Stabilization by Itˆ o noise Let H be a real separable Hilbert space and let V a real reflexive and separable Banach space such that V → H ≡ H → V , where the injections are continuous and dense, and both V and V are uniformly convex. Further, denote by · , | · | and · ∗ the norms in V , H and V , respectively; by · , · the duality product between V , V , and by (· , ·) the scalar product in H. Finally, let a1 be the constant of the injection V → H, i.e, a1 |u|2 ≤ u2 for all u ∈ V. Now consider the Cauchy problem du = F (t, u), dt
u(0) = u0 ∈ H,
(4.1)
where F (t, ·) : V →V , t ∈ R+ , is a family of (nonlinear) operators satisfying F (t, 0) = 0 and the following hypothesis: Assumption 3. There exist a continuous function ν(·) and a real number ν0 ∈ R such that 2 u, F (t, u) ≤ ν(t)|u|2 , for all u ∈ V, where lim sup t→∞
1 t
t
ν(s) ds ≤ ν0 .
(4.2)
(4.3)
0
Moreover, assume that for each u0 ∈ H, there exists a unique strong solution u(t) := u(t; u0 ) to (4.1) with u(t; u0 ) ∈ L2 (0, T ; V ) ∩ C 0 ([0, T ]; H). Observe that, when F (t, ·) satisfies a coercivity condition of the type 2 u, F (t, u) ≤ −εup + α|u|2 , ∀u ∈ V, for certain parameters ε > 0, α ∈ R,p > 1, and a monotonicity hypothesis, there exists a unique strong solution u =u(t; u0 ) to (4.1) in Lp (0, T ; V ) ∩ C 0 ([0, T ]; H), see Lions.42 This coercivity assumption obviously implies (4.2). We will see that (4.1) can be stabilized by using a stochastic perturbation of the kind g(t, u(t)) dWt , where Wt is (for simplicity) a standard real Wiener process defined on a certain complete probability space (Ω, F , P) with filtration (Ft )t≥0 , and g(t, ·) : H → H satisfies g(t, 0) = 0 and the following condition Assumption 4. |g(t, u) − g(t, v)|2 ≤ λ(t)|u − v|2 ∀t ∈ R+ , ∀u, v ∈ H, where λ(·) is a nonnegative continuous function such that 1 t λ(s) ds ≤ λ0 ∈ R+ . lim sup t→∞ t 0
(4.4)
(4.5)
Stabilization of Evolution Equations by Noise
55
We suppose that for each u0 ∈ H the stochastically perturbed problem u(0) = u0 ∈ H,
dUt = F (t, Ut ) dt + g(t, Ut ) dWt ,
(4.6)
has a unique strong solution to (4.6) in I p (0, T ; V ) ∩ L2 (Ω; C 0 ([0, T ]; H)) for all T > 0 and a certain p > 1, where I p (0, T ; V ) denotes the space of all V -valued measurable processes Ut satisfying T Ut p dt < +∞ E 0 45
(see for instance Pardoux for conditions under which there exists a unique solution for each u0 ∈ L2 (Ω, F0 , P; H)). Finally, we assume the existence of a Lyapunov-like functional W : R+ × H → R+ which is a C 1,2 -positive functional such that Wu (t, u) ∈ V for any u ∈ V and t ∈ R+ and we define operators L and Q as 1 (t, u)g(t, u), g(t, u) LW (t, u) = Wt (t, u)+ < Wu (t, u), F (t, u) > + Wu,u 2 and 2
QW (t, u) = Wu (t, u), g(t, u) for each u ∈ V and t ∈ R+ . The following theorem is from Caraballo et al.23 Theorem 4.1. Assume that the solution of (4.6) satisfies that |u(t)| = 0 for all t ≥ 0, P-a.s., provided |u0 | = 0, P−a.s. Let V ∈ C 2 (H; R+ ) and let ψ1 and ψ2 be two real-valued continuous functions on R+ with ψ2 ≥ 0. Assume that there exist p > 0, γ ≥ 0 and θ ∈ R such that ∀u ∈ V ; (a). |u|p ≤ W (u), ∀u ∈ V, t ∈ R+ ; (b). LW t, u ≤ ψ1 (t)W (u), ∀u ∈ V, t ∈ R+ ; (c). QW t, u ≥ ψ2 (t)W 2 (u), t t ψ2 (s)ds 0 ψ1 (s)ds ≤ θ, lim inf 0 ≥ 2γ. (d). lim sup t→∞ t t t→∞ Then, the unique strong solution Ut of (4.6) satisfies log Ut γ−θ lim sup ≤− , P − a.s., t p t→∞ whenever U0 = u0 ∈ H is an F0 -measurable random vector such that |u0 | = 0 a.s. In particular, if γ > θ, then the solution is P-a.s. exponentially stable. A direct application of Theorem 4.1 with the function W (t, u) = |u|2 gives the following result: Theorem 4.2. Assume that the solution of (4.6) satisfies |u(t, u0 )| = 0 for all t ≥ 0, P-a.s., provided |u0 | = 0 , P-a.s. In addition to hypotheses (4.2) − (4.5), assume that (g(t, u), u)2 ≥ ρ(t)|u|4 ,
∀ u ∈ H,
(4.7)
56
Tom´ as Caraballo and Peter E. Kloeden
where ρ(·) is a nonnegative continuous function such that 1 t lim inf ρ(s) ds ≥ ρ0 , ρ0 ∈ R+ . t→∞ t 0
(4.8)
Then, the solution Ut = u(t, u0 ) of (4.6) satisfies lim sup t→∞
1 log |Ut |2 ≤ −(2ρ0 − ν0 − λ0 ), t
P − a.s.
(4.9)
for any such u0 ∈ H. In particular, if 2ρ0 > ν0 + λ0 , then the zero solution of equation (4.6) is P-a.s. exponentially stable. 4.1.1. A general nonlinear example We are now going to apply Theorem 4.2 to analyze the pathwise stability of a nonlinear stochastic partial differential equation, which has been investigated by Pardoux45 and Caraballo & Liu22 amongst others, namely dUt = A(t, Ut ) dt + B(t, Ut ) dWt ,
u(0) = u0 ∈ H,
(4.10)
where A(t, ·) : V → V is a family of nonlinear operators defined for almost every t satisfying A(t, 0) = 0 for t ∈ R+ , and B(t, ·) : V → H satisfies (b.1) B(t, 0) = 0; (b.2) There exists k > 0 such that |B(t, y) − B(t, x)| ≤ ky − x,
∀x, y ∈ V, t-a.e.
The following result is proved in Caraballo & Liu.22 Theorem 4.3. Assume, in addition to (b.1)–(b.2), the following coercivity condition: there exist α > 0, p > 1 and λ ∈ R such that 2 x, A(t, x) + |B(t, x)|2 ≤ −αxp + λ|x|2 .
(4.11)
for almost all t ∈ R+ and x ∈ V . Then, there exists r > 0 such that E|Ut |2 ≤ E|u0 |2 e−rt ,
∀t ≥ 0,
where Ut = u(t; u0 ), if at least one of the following conditions hold: (i) λ < 0 or (ii) λβ 2 − α < 0 and p = 2. Furthermore, under the same assumptions, the solution is P-a.s. exponentially stable, i.e., there exist positive constants ξ, η and a subset Ω0 ⊂ Ω with P(Ω0 ) = 0 such that, for each ω ∈ Ω0 , there exists a positive random number T (ω) satisfying |Ut (ω)|2 ≤ η|u0 |2 e−ξt ,
∀ t ≥ T (ω).
Stabilization of Evolution Equations by Noise
57
Observe that, in many applications, conditions (i) and (ii) mean that the term containing B must be small enough with respect to A. For example, consider the Sobolev spaces V = W01,p (O) and H = L2 (O) with their usual inner products for some 2 ≤ p < +∞, where O is an open bounded subset in Rd with regular boundary. Suppose that the operator A : V → V is defined by N ∂u(x) p−2 ∂u(x) ∂v(x) v, Au = − dx + au(x)v(x) dx, ∀u, v ∈ V, ∂xi ∂xi ∂xi O i=1 O where a ∈ R, and consider B of the form B(t, u) ≡ bu, where b ∈ R. Finally, let Wt be a standard real Wiener process. Then, 2 x, A(t, x) + |B(t, x)|2 = −2xp + 2a|x|2 + b2 |x|2 , ∀x ∈ V,
(4.12)
so (4.11) holds with equality for α=2
and λ = 2a + b2 .
Condition (i) requires 2a + b2 < 0, so a < 0 and b2 < −2a. On the other hand, 2 condition (ii) holds whenever (2a + b2 )a−1 1 − 2 < 0, that is, when b < 2a1 − 2a. Thus, Theorem 4.3 guarantees the exponential stability of paths, P-a.s., only for these values of a and b, which means that the zero solution of the deterministic system du = A(t, u(t)) (4.13) dt is exponentially stable and the random perturbation is small enough. However, Theorem 4.2 ensures exponential stability for sufficiently large perturbations even though the deterministic system is unstable. In this case, it is not difficult to see that 2a|x|2 if p > 2, p 2 2 x, A(t, x) = −2x + 2a|x| ≤ (4.14) (2a − 2a )|x|2 if p = 2, 1 so λ0 = ρ0 = b2 ,
2a, ν(t) = ν0 =
if p > 2,
2a − 2a , if p = 2. 1
Thus, Theorem 4.2 yields
−(b2 − 2a) if p > 2, 1 lim sup log |Ut |2 ≤ −(b2 − 2a + 2a ) if p = 2. t→∞ t 1
Consequently, we have pathwise exponential stability, P-a.s., if 2a if p > 2, b2 > 2a − 2a if p = 2. 1
(4.15)
58
Tom´ as Caraballo and Peter E. Kloeden
In general, we have the following result: Theorem 4.4. Assume (b.1), (b.2), (4.11) and that there exists a nonnegative continuous function b = b(t) such that (B(t, x), x)2 ≥ b(t)|x|4 with 1 lim inf t→+∞ t Then, P-a.s.,
t
∀x ∈ V,
b(s) ds ≥ b0 ∈ R+ .
(4.16)
(4.17)
0
−(2b0 − λ) if p > 1, 1 lim sup log |Ut |2 ≤ −(2b − λ + αa ) if p = 2, t→+∞ t 0 1
(4.18)
for any solution Ut with U0 = u0 ∈ L2 (Ω, F0 , P; H) such that |u0 | = 0, P-a.s. Observe that if λ < 0, then (4.10) is pathwise exponentially stable, P-a.s., for all p > 1 and all b0 ∈ R+ . However, when λ > 0, then (4.10) is stable if 2b0 > λ (for p = 2) or 2b0 > λ − αa1 (for p = 2). Taking into account our previous theorems, we can summarize the analysis for the preceding example: Case 1: The nonlinear problem, i.e., p > 2. In this case the problem is exponentially stable for all b ∈ R when a ≤ 0. However, if a > 0 Theorem 4.3 gives stability provided b2 > 2a. We do not know what happens here if a > 0 and b2 ≤ 2a. Case 2: The linear problem, i.e., p = 2. As in the preceding case, when a ≤ 0 the system is P-a.s. exponentially stable for all b ∈ R. But, if a > 0, then we need to check (ii), which requires b2 < 2a1 − 2a, or we have b2 > 2a − 2a1 . Hence, if a ≤ a1 , then P-a.s. exponential stability follows for all b ∈ R. However, when a > a1 , we can only ensure stability for b2 > 2a − 2a1 and we do not know what happens for b2 ≤ 2a − 2a1 . Remark 4.1. Theorem 4.3 is a particular case of a more general result which ensures stabilization with general decay rate (super- or sub-exponential). These results can be used to construct stabilizers of PDEs, see Caraballo et al.14 for more details on these topics. In conclusion, our results guarantee exponential stability for a wide range of values of a and b. Of course, this means that, given the deterministic system (4.13), the perturbed system becomes exponentially stable when the parameter of the noise is of the type bx(t) dWt and satisfies the conditions above. Otherwise, when we do not know whether the system is stable or not, what can we say? Is it possible to add another stochastic term in order to stabilize the stochastic PDE? The answer is positive and some results on this direction can be found in Caraballo et al.23
Stabilization of Evolution Equations by Noise
59
4.2. Stabilization by Stratonovich noise There are no general results that we know of concerning the stabilization of nonlinear PDEs by Stratonovich noise. The problem appears to be both difficult and challenging. The only work in this direction involves a canonical model whose dynamics is very well known in the deterministic case. This is the Chafee-Infante equation ∂u = ∆u + βu − u3 , x ∈ O, u|∂O = 0, (4.19) ∂t where O is a smooth bounded domain in Rn . It was shown in Caraballo et al.13 that the nonlinear equation (4.19) can be stabilized by adding a collection of noisy terms similar to the linear case in Section 2, leading to the system d dUt = ∆Ut + βUt − Ut3 dt + Bk Ut ◦ dWtk .
(4.20)
k=1
Essentially, it is shown that solutions of (4.20) can be bounded using appropriate positive solutions of the linear equation dUt = [∆Ut + βUt ] dt +
d
Bk u ◦ dWtk .
(4.21)
k=1
Since (4.21) can be stabilized via a suitable choice of Bk , so can (4.20). The proof makes essential and repeated use of the order-preserving properties of (4.20). To set this problem in a suitable context, we choose H = L2 (O), denote by −A the linear operator in H associated to the Laplacian, and then take A = A + βI, which clearly satisfies the conditions of Theorem 3.2. Finally, we let N be the N smallest integer such that j=1 (β − λj ) < 0. It follows that there exist linear operators Bk : H → H such that the zero solution of dUt = [−AUt + βUt ] dt +
d
Bk Ut ◦ dWtk
(4.22)
k=1
is exponentially stable, P-a.s. This fact can be used to deduce the stabilization of the nonlinear equation via the addition of the same noisy terms. The proof of the next result can be found in Caraballo et al.:13 Theorem 4.5. There exist bounded linear operators Bk : H → H and independent real Wiener processes W k , k = 1, . . . d, such that the zero solution of d dUt = −Au + βUt − Ut3 dt + Bk Ut ◦ dWtk k=1
is exponentially stable, P-a.s.
(4.23)
60
Tom´ as Caraballo and Peter E. Kloeden
This simple, but illustrative example may help to solve the stabilization problem for more general nonlinear PDEs that arise in applications. To the best of our knowledge, this is still an open problem. 5. Other types of evolution equations and models In this final section we briefly comment on some other types of evolution models whose stabilization by noise have been or are currently under investigation. 5.1. Delay differential equations Some models require the inclusion of delay terms in the equations. The problem of stabilization of delay (ordinary or partial) differential equations is also an important task. In the finite dimensional context, there are only a few results on the stabilization by the Itˆ o noise when the delay is small enough (see Appleby & Mao1 ). So far nothing is known for systems with arbitrary delay (finite or infinite), either with Itˆo or Stratonovich noise. 5.2. Stabilization of evolution inclusions and PDEs without uniqueness It often happens in applications that a model is better described if we use a setvalued differential equation, i.e., a differential inclusion. In addition, there are other situations in which we cannot ensure uniqueness of solutions for certain evolution equations. These two cases yield to the framework of set-valued dynamical systems. The stabilization analysis carried out in the single-valued case is also important and interesting in both of these situation. For the sake of brevity, we will only exhibit some of the few results which are known on this field, so far (see Caraballo et al.21 for more details). Let V be a separable and reflexive Banach space (with norm || · || and inner product ((·, ·))), and consider a Hilbert space H (with norm | · | and inner product (·, ·)). If we identify H with its dual space, then we can identify H with a subspace of V , so that we have V → H → V , where the previous inclusions are continuous and dense. We will denote by || · ||∗ the norm in V and by ·, · the duality product between V and V . Let us first consider the following stochastic evolution inclusion in the Itˆo sense: du (t) ∈ Au (t) + F (u (t)) + d Bi u(t) d W i , 0 ≤ t < +∞, i=1 dt dt t (5.1) u (0) = u0 ∈ H, where Wt1 , Wt2 , · · · , Wtd are mutually independent standard Wiener processes over the same filtered probability space (Ω, F , {Ft }t≥0 , P), Bi : H → H is a linear operator for i = 1,. . ., d, A : V → V is a linear A operator which is the infinitesimal
Stabilization of Evolution Equations by Noise
61
generator of a strongly continuous semigroup (i.e., of class C0 ) denoted by S (t). As we are interested in analyzing the behaviour of the variational solutions of (5.1) (see below for the definition), we need to assume some additional hypotheses ensuring their existence. To be more precise, we need the following assumptions: Assumption 5. (A4) Coercivity: There exist α > 0, λ ∈ R such that −2 Au, u + λ|u|2 ≥ α||u||p ,
for all u ∈ V,
(5.2)
where p > 1 is fixed. Assumption 6. (A5) Boundedness: There exists β > 0 such that ||Au||∗ ≤ β||u||p−1 , for all u ∈ V.
(5.3)
Notice that, in the case p = 2, condition (5.2) implies that the operator A is the generator of a strongly continuous semigroup (see Dautray & Lions [32, page 388]). Assumption 7. The set-valued term F : H → 2H satisfies: (F1) F has closed, bounded, convex, non-empty values. (F2) There exists C > 0 such that distH (F (u) , F (v)) ≤ C|u − v|, ∀u, v ∈ H, where distH (·, ·) denotes the Hausdorff distance between bounded sets. (F3) F (0) = 0. Under the preceding assumptions (in fact without assuming (5.2), (5.3) and (F3)), Theorem 2.1 in Da Prato & Frankowska30 ensures the existence of at least one solution u (·) of (5.1) for any random variable u0 ∈ Lp (Ω, F0 , P, H) with some p > 2. By such a solution we mean an adapted process u(·) taking values in H and such that: (1) u (·, ω) is continuous for P-a.a. ω ∈ Ω. (2) For any T > 0, u(·) is a mild solution, on the interval [0, T ] , of the problem du (t) = Au(t) dt + f (t) dt + di=1 Bi u(t )dWti , (5.4) u(0) = u , 0 i.e., , we have for all t ∈ [0, T ], t d t u(t) = S(t)u0 + S(t − s)f (s) ds + S(t − s)Bi u(s) dWsi , 0
i=1
0
62
Tom´ as Caraballo and Peter E. Kloeden
where f (·) an adapted process such that f (s, ω) ∈ F (u (s, ω)) , for a.a. (s, ω) ∈ (0, T ) × Ω, with
T
E
|f (s) | ds 2
< ∞.
0
Observe that, for the selection f (·), the unique mild solution to (5.4) is given by u(·). However, in order to apply Itˆo’s formula (or to make an appropriate change of variable) we need to handle a stronger concept of solution, say, either the so-called strong solution or the variational concept of solution. We will consider the latter here. In addition to the assumptions just used we now also assume the coercivity (5.2) and boundedness (5.3) assumptions. Then, (see Pardoux45), for any u0 ∈ Lp (Ω, F0 , P, H) (p > 2), there exists a unique variational solution of problem (5.4). In other words, there exists a stochastic process v(·) which belongs to Lp (Ω × (0, T ); V ) ∩ L2 (Ω; C(0, T ; H)) and that satisfies the equation in (5.4) in the sense of V , i.e., t d t (Av(s) + f (s)) ds + Bi v(s) dWsi , P − a.a. ω ∈ Ω, v(t) = u0 + 0
i=1
0
for all t ∈ [0, T ], where the equality is understood in the sense of V . Now, taking into account that the variational solution (when it exists) is also a mild solution (see, e.g. Caraballo9) it follows that for an initial datum u0 ∈ Lp (Ω, F0 , P, H) and for the selection f (·) in (5.4), there exists a unique variational solution which is also a solution of (5.1) in the sense of Da Prato and Frankowska. Henceforth, when we talk about a solution of (5.1) we will be always referring to this one. As was seen in the single-valued case, in order to produce a stabilization effect on deterministic (and even stochastic) systems one does not need to perturb the model with a very general noise (provided it is considered in the Itˆ o sense). In fact, a very simple multiplicative one is enough and the following stabilisation result holds (see Caraballo et al.21 ). Theorem 5.1. Assume that B2 = · · · = Bd = 0 and B1 is given by B1 v = σv for v ∈ H and σ ∈ R. Under the preceding assumptions, for σ 2 large enough so that γ = σ 2 − λ − 2C > 0 (whereλ and C are the constants appearing in (5.2) and (F2)), there exists Ω0 ⊂ Ω with P(Ω0 ) = 0 and a random variable T (ω) ≥ 0 such that for any initial datum u0 ∈ Lp (Ω, F0 , P, H) (p > 2), any of its corresponding solutions u(·) of problem (5.1) satisfies |u(t, ω)| ≤ e−γt/2 |u0 (ω)| 2
2
for all t ≥ T (ω), a.s.
(5.5)
Stabilization of Evolution Equations by Noise
63
Remark 5.2. It is worth mentioning that the stabilization of this evolution inclusion by Stratonovich noise has not been proved yet. One can also find a result on the stabilization of set-valued dynamical systems generated by evolution partial differential equations without uniqueness. The main tool for the proof is the theory of random dynamical systems and random attractors. Specifically, in Caraballo et al.21 is proved that some kind of high intensity noise, in the sense of Itˆ o, will ensure the existence of a random attractor for the perturbed problem, while the deterministic one does not have an attractor (or it is not known whether the deterministic attractor exists). We will not include more details on this problem, since it is not our aim to introduce the theory of random attractors here. 5.3. Stabilization of stationary solutions of a stochastic PDE The analysis in the preceding sections was concerned with the stabilization of the null solution of an evolution equation or inclusion by introducing some kind of noise in the deterministic model. However, it may happens that zero is not solution of the unperturbed equation, or may not be solution of the stochastic model when the noise is present. In this case, Caraballo et al.17 showed the existence of an exponentially stable stationary stochastic solution when a suitable Itˆ o noise was included, i.e., where the stationarity is understood in the sense of stochastic processes. Again, it is an open problem to analyze the same stabilization effect produced by Stratonovich noise (see Caraballo et al.16,17 for more details). 5.4. Other types of problems Another problem that may be even closer to reality is related to the effect produced by the noise when it acts only on (part of) the boundary of the domain and not in the forcing term in the equation. For instance, if we are considering an oceanic model, the stochastic disturbances may appear on the ocean surface and not in the equations driving the system. To the best of our knowledge, no papers have been published on this topic, although we have started to investigate this problem and some preliminary results will appear in the future. On the other hand, the synchronization of systems modelled by differential equations has received much attention over the last years, for example, see the review articles Caraballo et al.15 and Kloeden & Pavani.35 For instance, Chueshov & Rekalo27,27 analyzed the synchronization of a deterministic model concerning a reaction diffusion equation on either side of a permeable membrane. The synchronization proved by Chueshov and Rekalo is at the level of global attractors. However, the addition of a non-degenerate noise reduces the attractors to pathwise asymptotically stable stochastic stationary solutions and, in this way, almost all solutions starting in two points on both sides of the membrane, have to evolve in a synchronized way which is determined by the random attractor (which is given
64
Tom´ as Caraballo and Peter E. Kloeden
by a stationary solution) on the membrane. This can be regarded as a form of stabilization (see see Caraballo et al.12 for more details). Acknowledgement This work has been partly supported by Ministerio de Ciencia e Innovaci´on, Spain, under the grant MTM2008-00088, and Junta de Andaluc´ıa grant P07-FQM-02468. References 1. J. Appleby and X. Mao, Stochastic stabilization of functional differential equations, Systems and Control Letters 54(11) (2005), 1069–1081. 2. J. Appleby, X. Mao and A. Rodkina, Stabilization and destabilization of nonlinear differential equations by noise, IEEE Trans. Automatic Control 53(3) (2008), 683–691. 3. L. Arnold, Stochastic Differential Equations: Theory and Applications, Wiley & Sons, New York, (1974). 4. L. Arnold, Random Dynamical Systems. Springer, New York, 1998. 5. L. Arnold, Stabilization by noise revisited, Z. Angew. Math. Mech. 70(1990), 235-246. 6. L. Arnold and I. Chueshov, Order-preserving random dynamical systems: Equilibria, attractors, applications, Dyn. Stab. Sys. 13 (1998), 265–280. 7. L. Arnold, H. Crauel and V. Wihstutz, Stabilization of linear systems by noise, SIAM J. Control Optim. 21(1983), 451-461. 8. L. Arnold and P. E. Kloeden, Lyapunov exponents and rotation number of twodimensional systems with telegraphic noise, SIAM J. Applied Math. 49 (1989), 12421274. 9. T. Caraballo, PhD. Thesis, University of Sevilla, Spain (1988). 10. T. Caraballo, Recent results on stabilization of PDEs with noise, Bol. Soc. Esp. Mat. Apl. 37 (2006), 47-70. 11. T. Caraballo and J.A. Langa, Comparison of the long-time behavior of linear Ito and Stratonovich partial differential equations, Stoch. Anal. Appl. 19(2) (2001), 183-195. 12. T. Caraballo, I. Chueshov and P.E. Kloeden, Synchronization of a stochastic reactiondiffusion system on a thin two-layer domain, SIAM J. Math. Anal. 38 (2007), 1489– 1507. 13. T. Caraballo, H. Crauel, J.A. Langa and J.C. Robinson, The effect of noise on the Chafee-Infante equation: a nonlinear case study, Proc. Amer. Math. Soc., in press. 14. T. Caraballo, M.J. Garrido-Atienza and J. Real, Stochastic stabilization of differential systems with general decay rate, Systems & Control Letters 48(5) (2003), 397-406. 15. T. Caraballo, P.E. Kloeden, A. Neuenkirch and R. Pavani, Synchronization of dissipative systems with additive and linear noise, in Festschrift in Celebration of Prof. Dr. Wilfried Grecksch’s 60th Birthday, Christiane Tammer and Frank Heyde (Editors), Shaker-Verlag, Aachen, 2008, pp. 25–48. (ISBN 978-3-8322-7500-6 ISSN 0954-0882) 16. T. Caraballo, P.E. Kloeden and B. Schmalfuss, Exponentially stable starionary solutions for stochastic evolution equations and their perturbations, Appl. Math. Optim. 20(2004), 183–207 17. T. Caraballo, P.E. Kloeden and B. Schmalfuß, Stabilization of stationary solutions of evolution equations by noise, Discrete Conts. Dyn. Systems, Series B. 6 (2006) 1199-1212. 18. T. Caraballo, J.A. Langa and J.C. Robinson, Stability and random attractors for
Stabilization of Evolution Equations by Noise
19. 20. 21.
22. 23. 24. 25.
26. 27. 28.
29.
30. 31. 32. 33. 34. 35. 36.
37. 38. 39.
65
a reaction-diffusion equation with multiplicative noise, Discrete Cont. Dyn. Sys. 6 (2000), 875–892. T. Caraballo, J.A. Langa and J.C. Robinson, A stochastic pitchfork bifurcation in a reaction-diffusion equation, R. Soc. Lond. Proc. Ser. A 457 (2001), 2041–2061. T. Caraballo, J.A. Langa and T. Taniguchi, The exponential behaviour and stabilizability of stochastic 2D-Navier-Stokes equations, J. Diff. Eqns. 179(2002), 714-737. T. Caraballo, J.A. Langa and J. Valero, Stabilisation of differential inclusions and PDEs without uniqueness by noise, Comm. Pure and Applied Analysis 7 (2008), 13751392. T. Caraballo and K. Liu, On exponential stability criteria of stochastic partial differential equations, Stoch. Proc. & Appl. 83 (1999), 289-301. T. Caraballo, K. Liu and X.R. Mao, On stabilization of partial differential equations by noise, Nagoya Math. J. 161(2) (2001), 155-170. T. Caraballo, A.M. M´ arquez-Dur´ an and J. Real, On the asymptotic behaviour of a stochastic 3D-Lans-alpha model, Appl. Math. Optim. 53(2006), 141-161. T. Caraballo, J. Real and T. Taniguchi, On the existence and uniqueness of solutions to stochastic 3-dimensional Lagrangian averaged Navier-Stokes equations, R. Soc. Lond. Proc. Ser. A 462(2006), 459-479. T. Caraballo and J.C. Robinson, stabilization of linear PDEs by Stratonovich noise, Systems & Control Letters 53(2004), 41-50. I.D. Chueshov and A.M. Rekalo, Global attractor of contact parabolic problem on thin two-layer domain, Sbornik: Mathematics, 195 No. 1 (2004), 103–128. I.D. Chueshov and A.M. Rekalo, Long-time dynamics of reaction-diffusion equations on thin two-layer domains, in EQUADIFF-2003 Proceedings, edited by F. Dumortier, H. Broer, J. Mawhin, A. Vanderbauwhede and S.V. Lunel, World Scientific Publishing, Singapore 2005, pp. 645–650. G. Da Prato, A. Debussche, and B. Goldys, Some properties of invariant measures of non symmetric dissipative stochastic systems, Prob. Theor. Relat. Fields 123 (2002), 355-380. G. Da Prato and H. Frankowska, A stochastic Filippov theorem, Stochastic Anal. Appl. 12 (4) (1994), 409-426. G. Da Prato and J. Zabczyk, Stochastic Equations in Infinite Dimensions, Cambridge University Press, (1992). R. Dautray and J.L. Lions, “Analyse Math`ematique et Calcul Num´erique por les Sciences et les Techniques”, Masson, Paris (1984). M. Hairer, Exponential mixing properties of stochastic PDEs through asymptotic coupling, Prob. Theory Relat. Fields 124 (2002), 345–380. R. Has’minskii, Stochastic Stability of Differential Equations, Sijthoff and Noordhoff, Netherlands, (1980). P.E. Kloeden and R. Pavani, Dissipative synchronization of nonautonomous and random systems, GAMM-Mitt. 32 (2009), No. 1, 80 92. P.E. Kloeden and E. Platen, The Numerical Solution of Stochastic Differential Equations, Springer–Verlag, 1992 (revised reprinting 1995, 3rd revised and updated printing 1999) H. Kunita, Stochastic Partial Differential Equations connected with Non-Linear Filtering, in Lecture Notes in Mathematics 972, pp. 100-169, (1981) A.A. Kwiecinska, Stabilization of partial differential equations by noise, Stoch. Proc.& Appl. 79 (1999), no. 2, 179–184 A.A. Kwiecinska, Stabilization of evolution equations by noise, Proc. Amer. Math. Soc. 130(2002), No. 10, 3067-3074.
66
Tom´ as Caraballo and Peter E. Kloeden
40. J.A. Langa and J.C. Robinson, Upper box-counting dimension of a random invariant set, J. Math. Pures App., 85 (2006), no. 2, 269–294. 41. G. Leha, B. Maslowski and G. Ritter, Stability of solutions to semilinear stochastic evolution equations, Stoch. Anal. Appl. 17(1999), No. 6, 1009-1051. 42. J.L. Lions, Quelque m´ethodes de r´esolution des probl` emes aux limites non lin´eaires, Dunod, Gauthier- Villars, Paris, 1969. 43. X.R. Mao, Stochastic stabilization and destabilization, Systems & Control Letters 23 (1994), 279-290. 44. A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations, Springer-Verlag, New York Inc., 1983. ´ 45. E. Pardoux, Equations aux D´eriv´ees Partielles Stochastiques non Lin´eaires Monotones, Thesis Univ. Paris XI, 1975. 46. M. Scheutzow, Stabilization and destabilization by noise in the plane, Stoch. Anal. Appl. 11(1) (1993), 97-113. 47. H.J. Sussmann, On the gap between deterministic and stochastic ordinary differential equations, The Annals of Probability 6(1978) , No. 1, 19-41. 48. E. Wong & M. Zakai, On the relationship between ordinary and stochastic differential equations and applications to stochastic problems in control theory, Proc. Third IFAC Congress, paper 3B, 1966.
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 67–76
Chapter 4 Stochastic Quantification of Missing Mechanisms in Dynamical Systems Baohua Chen and Jinqiao Duan∗ Department of Applied Mathematics, Illinois Institute of Technology, Chicago, IL 60616, USA [email protected] Complex systems often involve multiple scales and multiple physical, chemical and biological mechanisms. Due to the lack of scientific understanding, some mechanisms are not represented (i.e., “unresolved”) in mathematical models for these complex systems. The impact of these unresolved processes on the resolved ones may be delicate and needs to be quantified. A stochastic dynamical approach is proposed to parameterize these missing mechanisms or unresolved scales. An example is briefly discussed to demonstrate this stochastic approach. Key Words: Stochastic parameterization; impact of unresolved scales or missing mechanisms; fractional Brownian motion; correlated noise; stochastic partial differential equations
Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . 2 Stochastic analysis and stochastic parameterizations 3 An example . . . . . . . . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
67 68 70 75
1. Introduction In building mathematical models for complex dynamical systems in science and engineering, not all mechanisms are taken into account, due to the lack of scientific understanding of or difficulty in representing such mechanisms. These missing mechanisms, although perhaps small in spatial scales or fast in temporal scales, may have delicate impact on the overall, either in transient or in long time, evolution of the systems. This is especially the case when a deterministic model could predict some dynamical behaviors of a complex phenomenon, but fails to capture other system features. The missing mechanisms may appear as random fluctuations, which are thus more amenable to stochastic representations. ∗ Partially
supported by NSF grants 0620539 and 0731201, the Cheung Kong Scholars Program and the K. C. Wong Education Foundation 67
68
Baohua Chen and Jinqiao Duan
We propose a stochastic approach for representing missing mechanisms in dynamical systems, as mathematical models for complex phenomena. This is a stochastic analysis-based, data-driven method for stochastic parameterizations. In §2, we present a stochastic parameterization approach, and then apply it to a geophysical model in §3. 2. Stochastic analysis and stochastic parameterizations We review a few stochastic concepts. • Brownian motion: Brownian motion (also known as Wiener process) is among the simplest of the continuous-time stochastic processes. Brownian motion Bt on a probability space (Ω, F, P) is characterized by the following facts: – B0 = 0 almost surely. – Bt is almost surely continuous. – Bt has independent and stationary increments with distribution Bt − Bs ∼ N(0, t − s) for 0 ≤ s < t. The most important property of Brownian motion is that its successive increments are uncorrelated: each displacement is independent of the former, in direction as well as in amplitude. • Fractional Brownian motion: Mandelbrot and van Ness (1968) defined a family of stochastic processes they called fractional Brownian motion. The main difference with ordinary Brownian motion is that in a fractional Brownian motion successive increments are correlated. Fractional Brownian motion has for long served as the archetype of a process with long range dependence (LRD).18 A fractional Brownian motion B H (t), t ∈ R, is a continuous-time Gaussian process starting at zero, with mean zero, and having the following correlation function: 2H 1 2H 2H , E B H (t)B H (s) = t + s − t − s 2 where 0 < H < 1, called the self-similar parameter or Hurst parameter. The Hurst parameter may be interpreted as a measure of the roughness. Figure 4.1 show the sample paths of fractional Brownian motion with various parameter values H. The larger the parameter H is, the smoother the path is. Some properties of the fractional Brownian motion are as follows. – Fractional Brownian motion is a Gaussian process. – Fractional Brownian motion is statistically self-similar B H (λt) = |λ|H B H (t),
0 < H < 1, λ ∈ R.
Stochastic Quantification of Missing Mechanisms
69
– Fractional Brownian motion has dependent and stationary increments. – For H = 12 , the process is the ordinary Brownian motion, which has independent increments. – For H > 12 , the increments of the process are positively correlated. – For H < 12 , the increments of the process are negatively correlated. The Hurst parameter H characterizes the important properties of fractional Brownian motion. Several statistics have been introduced to estimate H, such as wavelets, k-variations, maximum likelihood estimator, and spectral methods. A recent reference for these various approaches is represented by Ref. 2. • Colored noise: Fractional Brownian motion has dependent and stationary increments. Its generalized time derivative is a mathematic model of colored noise. By the way, white noise is modeled as the generalized time derivative of Brownian motion.
Consider a dynamical system modeled by the following deterministic partial differential equation for a state variable q(x, t) qt = Aq + N (q),
(2.1)
where A is a (unbounded) linear differential operator and N (·) is a nonlinear operator. If the model output matches well (in certain metric) with observation for q, there is not much need to go to stochastic dynamical modeling. However, when there is no such good match, this deterministic model (sometimes, a simplified or idealized conceptual model) needs to be improved. In fact, mathematical modeling is a process of obtaining more and more accurate descriptions for complex systems. To improve this deterministic model (2.1), we first figure out the model error or model uncertainty. Namely, we use observational data for q to feed into an appropriately space-time discretized version of (2.1). Because of the mis-match mentioned above, there will be a residual term F in the discretized version of (2.1). This residual term, defined on a space-time grid, is the model uncertainty. It contains information for the unaccounted, missing dynamical mechanisms in the original deterministic model. Note that the observational data is usually fluctuating or random, the missing mechanisms F (x, t, ω) as represented as data on the space-time grid are also random, i.e., with different realizations or samples. This calls for a stochastic parameterization for the missing mechanisms F (x, t, ω). For example, we may try this approximation F (x, t, ω) = σ(x, t, q) B˙ tH ,
(2.2)
where σ is an empirical formula describing fluctuating amplitude or noise intensity for F , and BtH is a fractional Brownian motion with Husrt parameter H. Here σ
70
Baohua Chen and Jinqiao Duan
may depend on parameters. These parameters and H may be estimated using data for F , with help of stochastic analysis. Thus we obtain a new, hopefully improved, dynamical model, i.e., a stochastic partial differential equation qt = Aq + N (q) + σ(x, t, q) B˙ tH .
(2.3)
For more information on stochastic parameterization, see Refs. 8–11,16,23,30 and references therein. 3. An example We now consider an example to demonstrate the above stochastic parameterization method. We consider a model for water vapor dynamics in the climate system. An advection-diffusion-condensation equation is an idealized model for specific humidity.20,21,25 In the absence of diabetic effects, parcel motion is restricted to two dimensional surfaces of the constant potential temperature, i.e., to isentropic surface (We consider θ = 315K here).14 The specific humidity q, subject to an advection, diffusion and condensation, is governed by the following equation: qt + v · ∇q = η∆q + S(q, qs ) + F,
(3.1)
where v = (u, v) is advection velocity field, η is diffusivity of condensable substance, and S(q, qs ) represents the sources and sinks of water vapor for the parcel of air considered. A frequent simplification is made in modeling that the water vapor falls out in the form of precipitation as soon as it condenses. The sink is thus represented as 1 S = − (q − qs ) if q > qs , τ =0 if q ≤ qs . Saturation specific humidity is qs = es /p, with water vapor saturation vapor pressure es , which is a function of temperature. Saturation vapor pressure is calculated according to the Clausius-Clapeyron relation. The empirical calculating formula is introduced in Ref. 3. The missing mechanism F includes the small scale convective moistening, which is also called “convective forcing” thereinafter. It consists of contributions owing to subgrid-scale moist convection and to other subgrid scale turbulence. They are vital to the water vapor distribution but are not resolved explicitly in the model. We analyzed the mean fields of water vapor in ECMWF Re-Analysis (ERA-40) 4-time daily data for the years 1975-2000, level is fixed at one isentropic surface θ = 315K. Water vapor content is measured by specific humidity q. Dynamical fields in the reanalysis data are represented spectrally, with triangular truncation at total wavenumber 159. Specific humidity is represented on a corresponding reduced Gaussian grid at N80 resolution. For the computation of water vapor and other flow fields, we transformed spectral fields to the corresponding regular Gaussian
Stochastic Quantification of Missing Mechanisms
71
grid and interpolated specific humidity to the same grid. The resolution of the data is 2.5◦ × 2.5◦ in longitude and latitude. The ERA-40 data currently represents the higher resolution data set that provides water vapor and dynamical fields with continuous long-term coverage. Convective moistening is deduced by equation (3.1): F = qt + v · ∇q − η∆q − S(q, qs ).
(3.2)
Discritizing this equation by utilizing up-wind scheme, we obtain qi+1,j,k − qi,j,k qi,j+1,k − qi,j,k qi,j,k+1 − qi,j,k + (ui,j,k + vi,j,k ) Fi,j,k = ∆t ∆x ∆y qi+1,j,k + qi−1,j,k − 2qi,j,k qi,j+1,k + qi,j−1,k − 2qi,j,k −η( + ) 2 (∆x) (∆y)2 (3.3) −Si,j,k . Plugging data of v = (u, v), q and S(q, qs ) into this discretized equation, convective forcing F (x, y, t) can be estimated at each time and each grid, for various realizations or samples. Autocorrelation of F shows that it is temporally correlated, so colored noise is needed to parameterize it:4 dBtH , (3.4) dt where BtH is the fractional Brownian motion with Hurst parameter H, and σ(x) ≥ 0 is the deterministic noise intensity which usually depends on space. The noise σ(x)(q − α · qs )B˙ tH is a multiplicative colored noise with tuning parameter α, which is determined by empirical testing. F (x, t) = σ(x)(q − α · qs )
Now we estimate the Hurst parameter H and noise intensity σ. Integrating the equation (3.4) on the observational time interval [0, T ], we get T T F (x, t)dt = σ(x) (q(x, t) − α · qs (x, t))dBtH . (3.5) 0
0
In this way, the mean of this integral is not zero (different from Ito integral a ), but in consistent with the mean of the convective forcing F . Therefore, there is no need to find an expression for the mean of F , as its impact is included in F . Let us denote Zt =
t 0
F ds, then equation (3.5) can be written:
ZT = σ(x) 0
T
(q(x, t) − α · qs (x, t))dBtH .
We have the following property for fractional stochastic integrals:6 a Ito
Integral’s properties: if f ∈ ν(0, T ), E[
T 0
f dBt ] = 019
(3.6)
72
Baohua Chen and Jinqiao Duan
1 Theorem 3.1 (Convergence of Variation6 ). For a given p < 1−H , as n → ∞, we have the (uniform) convergence in probability: T 1−pH n p n Vp (Z)T → cp σ (x) |q(x, τ ) − α · qs (x, τ )|p dτ, (3.7) 0
Γ( p+1 2 ) Γ( 12 )
p/2
2
where cp = E(|B1H |p ) = tion Vpn (Z)T is defined as
with Γ the Gamma function, the p-power varia-
Vpn (Z)T =
n
|Zi/n − Z(i−1)/n |p ,
i=1
and finally n =
T n
and
tni
= in .
Thus, we have an estimator for the noise intensity when n is large. σ ˜ (x) ≈
" cp
T 0
1−pH Vpn (Z)T n |q(x, τ ) − α · qs (x, τ )|p dτ
# p1
.
(3.8)
Since σ ˜ (x) is sample-wise, we take the mean $" σ(x) ≈ E
cp
T 0
1−pH Vpn (Z)T n |q(x, τ ) − α · qs (x, τ )|p dτ
# p1 %
.
(3.9)
ˆ n is taken as the minimizer ˆ n for H? Estimator H How to find the estimator H for the optimization problem: T n p V (Z) − c σ ( x ) |q(x, τ ) − α · qs (x, τ )|p dτ , (3.10) min 1−pH T p n p 0
0
where σ is estimated as (3.9). See Ref. 4 for more details. Thus we obtain a stochastic model for water vapor evolution: qt + v · ∇q = η∆q + S(q, qs ) + σ · (q − α · qs )B˙ tH .
(3.11)
Figure 4.2 shows a reasonable agreement between observation, and simulation of the stochastic model (3.11), for the El Ni˜ no-associated eastward shift in the moister regions from the western Pacific warm pool to the central eastern Pacific during 1997/1998 winter.
Acknowledgements We are very grateful to Ray Pierrehumbert, Joe Tribbia, and Grant Branstator for their constructive suggestions and insights. We also thank Xiaofan Li for numerous valuable comments with numerical simulations, and thank Chi-fan Shih for providing access to data sets.
Stochastic Quantification of Missing Mechanisms
Figure 4.1. (bottom)
73
Fractional Brownian motion with Hurst parameters H = 0.25 (top) and H = 0.75
74
Baohua Chen and Jinqiao Duan
Figure 4.2. Seasonally-averaged specific humidity in El Ni˜ no winter: (a) Observation; (b) Stochastic model simulation
Stochastic Quantification of Missing Mechanisms
75
References 1. Bender, C. “An Ito Formula for Generalized Functionals of a Fractional Brownian Motion with Arbitrary Hurst Parameter”. Stoch. Proc. Appl. 104(2003): 81-106. 2. Beran, J. “Statistics for Long-Memory Processes”. Chapman and Hall, 1994. 3. Buck, A. L. “New Equations for Computing Vapor Pressure and Enhancement Factor”. Journal of Applied Meteorology 20(1981): 1527-1532. 4. Chen, B. Stochastic Dynamics of Water Vapor in the Climate System. Ph.D. Thesis, 2009, Illinois Institute of Technology, Chicago, USA. 5. Coeurjolly, J.-P. “Estimating the Parameters of a Fractional Brownian Motion by Discrete Variations of Its Sample Paths”. Stat. Inference for Stoch. Proc. 4(2001): 199-227. 6. Corcuera, J. M., D. Nualart, and J. H. C. Woerner. “Power Variation of Some Integral Fractional Processes”. Bernoulli 12(2006): 713-735. 7. Dai, A. “Recent Climatology, Variability, and Trends in Global Surface Humidity”. J. Clim. 19(2006): 3589-3606. 8. Du, A. and J. Duan. “A Stochastic Approach for Parametering Unresolved Scales in a System with Memory”. Journal of Algorithms and Computational Technology 3(2009): 393-405. 9. Duan, J. and B. Nadiga. “Stochastic Parameterization of Large Eddy Simulation of Geophysical Flows”. Proc. American Math. Soc. 135(2007): 1187-1196. 10. Duan, J. and B. Chen. “Quantifying Model Uncertainty by Fractional Brownian Motions”. Oberwolfach Reports 5(2008). 11. Duan, J. “Stochastic Modeling of Unresolved Scales in Complex Systems”. Frontiers of Math. in China 4 (2009), 425-436. 12. Duan, J. “Predictability in Nonlinear Dynamical Systems with Model Uncertainty”. In Stochastic Physics and Climate Modeling, T. N. Palmer and P. Williams (eds.), Cambridge Univ. Press, 2009. 13. Held I. M. and B. J. Soden. “Water Vapor Feedback and Global Warming”. Annual Review Energy Environment 25(2000): 441-475. 14. Hoskins, B. “Towards a PV-θ View of the General Circulation”. Tellus 43AB(1991): 27-35. 15. Kolmogorov, A. N. “The Local Structure of Turbulence in Incompressible Fluid at Very High Reymond Numbers”. Dokl. Acad. Sci., USSR, 30(1941): 299-303. 16. Lin, J. W.-B. and J. D. Neelin. Considerations for Stochastic Convective Parameterization. J. Atmos. Sci. 59(2002): 959-975. 17. Marshak, A., A. Davis, R. Sahalan, and W. Wiscombe. “Bounded Cascade Models as Nonstationaty Multifractals”. Phys. Rev. E49(1994). 18. Nualart, D. “Stochastic Calculus with respect to the Fractional Brownian Motion and Applications”. Contemporary Mathematics 336 (2003), 3-39. 19. Oksendal, B. “ Stochastic Differenntial Equations”. Sixth Ed., Springer-Verlag, New York, 2003. 20. Pierrehumbert, R. T., Brogniez H, and Roca R, 2005: “‘On the Relative Humidity of the Earth’s Atmosphere”. In The General Circulation of the Atmosphere, 143-185, T. Schneider and A. Sobel, eds. Princeton University Press, 2007. 21. Pierrehumbert, R. T. “Subtropical Water Vapor As a Mediator of Rapid Global Climate Change”. In Clark PU, Webb RS and Keigwin LD eds. Mechanisms of Global Change at Millennial Time Scales. American Geophysical Union: Washington, D.C. Geophysical Monograph Series 112: 177-201. 22. Pierrehumbert, R. T. “Anomalous Scaling of High Cloud Variability in the Tropical
76
Baohua Chen and Jinqiao Duan
Pacific”. Geophysical Research Letters, 23(1996): 1095-1098. 23. Palmer, T. N., G. J. Shutts, R. Hagedorn, F. J. Doblas-Reyes, T. Jung and M. Leutbecher. “Representing Model Uncertainty in Weather and Climate Prediction”. Annu. Rev. Earth Planet. Sci. 33(2005): 163-193. 24. Sui, H., W. G. Read, J. H. Jiang, and et.al, “Enhanced Positive Water Vapor Feedback Associated With Tropical Deep Convection: New Evidence From Aura MLS”. Gephys. Res. Lett. 33(2006). 25. Sukhatme, J. and R. T. Pierrehumbert. “Statistical Equilibria of Uniformly Forced Advection Condensation”. (posted on ArXiV), 2005. 26. Sun, D.-Z. and R. S. Lindzen, “Distribution of Tropical Tropospheric Water Vapor”. J. Atmos. Sci. 50(1993a): 1643-1660. 27. Teixeira, J. and Carolyn A. Reynolds. “Stochastic Nature of Physical Parameterizations in Ensemble Predictions: A stochsatic Convection Approach”. Monthly Weather Review 136(2008): 483-496. 28. Tudor, C. A. and F. G. Viens. “Variations and Estimators for the Self-similarity Order through Malliavan Calculus”. Submitted, 2007. 29. Vainshtein, S. I., K. R. Sreenivasan, R. T. Pierrehumbert and et al, “Scaling Exponents for Turbulence and Other Random Process and Their Relationships with Multifractal Structure”. Phys. Rev. E 50(1994). 30. Williams, P. D. “Modelling Climate Change: the Role of Unresolved Processes”. Phil. Trans. R. Soc. A 363(2005): 2931-2946.
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 77–89
Chapter 5 Banach Space-Valued Functionals of White Noise
Yin Chen and Caishi Wang∗ School of Mathematics and Information Science, Northwest Normal University, Lanzhou 730070, China Let X be a complex Banach space (not necessary to be reflexive). In this chapter, we prove a moment characterization theorem and a convergent theorem for X-valued generalized functionals of white noise, which refine the corresponding results recently obtained by Wang and Huang (Acta Math. Sin. Engl. Ser. 22 (2006), 157–168).
Contents 1 Introduction . . 2 Kernel theorems 3 Main theorems . References . . . . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
. . . .
77 78 81 89
1. Introduction Let X be a complex Banch space. By X-valued generalized functionals of white noise we mean continuous linear mappings defined on the testing functionals of white noise and valued in X. Such mappings form an important subset of vectorvalued functionals of white noise and may find their applications in the study of stochastic evolution equations in Banach spaces. Kondratiev and Streit5 introduced the moment approach to characterize scalarvalued generalized functionals of white noise. It turns out that the moment approach is independent of the choice of the framework of white noise analysis. In Ref. 9, the authors applied the moment approach to X-valued generalized functionals of white noise and obtained some interesting results under the condition that X is reflexive. The reflexive condition on X, however, appears somewhat unnatural. The main purpose of the present paper is to refine the work of Ref. 9. More precisely, we will show that the moment characterization theorem and the convergent theorem given in Ref. 9 remain true without the reflexive condition on X. ∗ Corresponding
author ([email protected]) 77
78
Yin Chen and Caishi Wang
2. Kernel theorems Let X be a complex Banach space (not necessary to be reflexive). In this section, we prove some kernel theorems for X-valued multilinear mappings. These theorems will play a crucial role in proving our main theorems. Let H be a separable complex Hilbert space with inner product ·, · and norm | · |. We denote by ·, · X ∗ ×X the canonical bilinear form on X ∗ × X, where X ∗ is the dual of X. Let n ≥ 1 and M : Hn −→ X an n-linear mapping. M is called bounded if M < ∞, where M is defined by & M = sup M (h1 , h2 , · · · , hn )X |h1 | ≤ 1, |h2 | ≤ 1, · · · , |hn | ≤ 1, ' (h1 , h2 , · · · , hn ) ∈ Hn . In that case, M is called the norm of M . Definition 2.1. Let n ≥ 1. A bounded n-linear mapping M : Hn −→ X is said to be strongly bounded if there exists an orthonormal basis { ek }k≥1 of H such that ( ) g, M (ej1 , ej2 , · · · , ejn ) ∗ 2 < ∞ M 2s ≡ sup (2.1) X ×X where of M .
g =1,g∈X ∗ j ,j ,··· ,j 1 2 n
j1 ,j2 ,··· ,jn
≡
∞
j1 ,j2 ,··· ,jn =1 .
In that case, M s is called the strong norm
As is shown below, M s is actually independent of the choice of the orthonormal basis { ek }k≥1 . Let H⊗n be the n-fold Hilbert tensor product of H. By convention, the inner product and norm of H⊗n are still denoted by ·, · and | · |, respectively. Theorem 2.1. Let n ≥ 1. If M : Hn −→ X is a strongly bounded n-linear mapping, then there exists a unique bounded linear operator TM : H⊗n −→ X such that M (h1 , h2 , · · · , hn ) = TM (h1 ⊗ h2 ⊗ · · · ⊗ hn ), (h1 , h2 , · · · , hn ) ∈ Hn ,
(2.2)
and moreover, TM = M s , where TM stands for the usual operator norm. Proof. Obviously, TM is unique if it exists. To prove the existence, we define a mapping M+ : X ∗ −→ (H⊗n )∗ as follows ( ) M+ g = g, M (ej1 , ej2 , · · · , ejn ) X ∗ ×X R(ej1 ⊗ ej2 ⊗ · · · ⊗ ejn ), g ∈ X ∗ , j1 ,j2 ,··· ,jn
(2.3) where { ek }k≥1 is an orthonormal basis of H and R : H⊗n −→ (H⊗n )∗ is the Riesz mapping. It can be easily verified that M+ : X ∗ −→ (H⊗n )∗ is a bounded linear operator and ( ) g, M (ej1 , ej2 , · · · , ejn ) ∗ 2 , g ∈ X ∗ , (2.4) M+ g2(H⊗n)∗ = X ×X j1 ,j2 ,··· ,jn
Banach Space-Valued Functionals of White Noise
79
which means M+ = M s . For (h1 , h2 , · · · , hn ) ∈ Hn and g ∈ X ∗ , it follows that ( ) M+ g, h1 ⊗ h2 ⊗ · · · ⊗ hn (H⊗n )∗ ×H⊗n ( ( )( ) ( ) ) g, M (ej1 , ej2 , · · · , ejn ) X ∗ ×X h1 , ej1 h2 , ej2 · · · hn , ejn = j1 ,j2 ,··· ,jn
=
(
) g, M h1 , ej1 ej1 , h2 , ej2 ej2 , · · · , hn , ejn ejn X ∗ ×X
j1 ,j2 ,··· ,jn
( ) = g, M (h1 , h2 , · · · , hn ) X ∗ ×X . Now let J1 : H⊗n −→ (H⊗n )∗∗ and J2 : X −→ X ∗∗ be the natural embedding ∗ the adjoint of M+ . Then, for (h1 , h2 , · · · , hn ) ∈ Hn mappings and denote by M+ ∗ and g ∈ X , we have ( ∗
) M+ J1 h1 ⊗ h2 ⊗ · · · ⊗ hn , g X ∗∗ ×X ∗ (
) = J1 h1 ⊗ h2 ⊗ · · · ⊗ hn , M+ g (H⊗n )∗∗ ×(H⊗n )∗ ) ( = M+ g, h1 ⊗ h2 ⊗ · · · ⊗ hn (H⊗n )∗ ×H⊗n ( ) = g, M (h1 , h2 , · · · , hn ) X ∗ ×X ( ) = J2 M (h1 , h2 , · · · , hn ), g X ∗∗ ×X ∗ , which implies that for each (h1 , h2 , · · · , hn ) ∈ Hn , it holds that
∗ J1 h1 ⊗ h2 ⊗ · · · ⊗ hn = J2 M (h1 , h2 , · · · , hn ) ∈ J2 (X). M+
(2.5)
Since { h1 ⊗ h2 ⊗ · · · ⊗ hn | (h1 , h2 , · · · , hn ) ∈ Hn } is total in H⊗n and J2 (X) is a closed subspace of X ∗∗ , it follows that
∗ J1 H⊗n ⊂ J2 (X). M+ ∗ J1 is a bounded linear operator from H⊗n to X. It follows Hence TM ≡ J2−1 M+ from (2.5) that
M (h1 , h2 , · · · , hn ) = TM h1 ⊗ h2 ⊗ · · · ⊗ hn , (h1 , h2 , · · · , hn ) ∈ Hn .
Finally, TM = =
sup
TM uX =
sup
∗ M+ J1 uX ∗∗ =
|u|=1,u∈H⊗n |u|=1,u∈H⊗n
sup
|u|=1,u∈H⊗n
∗ J2−1 M+ J1 uX
sup
v =1,v∈(H⊗n )∗∗
∗ M+ vX ∗∗
∗ = M+ = M+ = M s .
This completes the proof.
Remark 2.1. According to Theorem 2.1, if M : Hn −→ X is a strongly bounded n-linear mapping, then M ≤ M s .
80
Yin Chen and Caishi Wang
Let H⊗n be the n-fold symmetric Hilbert tensor product of H, which is a closed = C. By convention, H⊗n is endowed with the subspace of H⊗n . Note that H⊗0 inner product n! ·,√· instead, which is equivalent to the inner product ·, · of H⊗n . = n! | · |. Hence · H⊗n
Theorem 2.2. Let n ≥ 1. If M : Hn −→ X be a strongly bounded symmetric n −→ X linear mapping, then there exists a unique bounded linear operator LM : H⊗n such that 2⊗ · · · ⊗h n ), (h1 , h2 , · · · , hn ) ∈ Hn , M (h1 , h2 , · · · , hn ) = LM (h1 ⊗h
(2.6)
and moreover, 1 LM = √ M s . n!
(2.7)
Proof. By Theorem 2.1, there exists a unique bounded linear operator TM : H⊗n −→ X such that M (h1 , h2 , · · · , hn ) = TM (h1 ⊗ h2 ⊗ · · · ⊗ hn ), (h1 , h2 , · · · , hn ) ∈ Hn .
⊗n −→ X is a bounded Put LM = TM |H⊗n . Then, it is easy to verify that LM : H linear operator and moreover, LM satisfies equality (2.6). Taking an orthonormal basis { ek }k≥1 of H, we have ( ) g, M (ej1 , ej2 , · · · , ejn ) ∗ 2 . M 2s = sup X ×X
g =1,g∈X ∗ j ,j ,··· ,j 1 2 n
It is known that with 1 ≤ i1 < i2 < · · · < ik , 1 ≤ r1 , r2 , · · · , rk ≤ n, r1 + r2 + · · · + rk = n and 1 ≤ k ≤ n, the following vector set constitutes an orthonormal basis of the symmetric Hilbert tensor H⊗n * ⊗r + 2 ⊗r ⊗rk ei1 1 ⊗e i2 ⊗ · · · ⊗eik √ . r1 !r2 ! · · · rk ! Hence LM 2 = L∗M 2 = = = = =
sup
g =1,g∈X
sup
g =1,g∈X ∗
k , ⊗r2 ⊗ · · · ⊗e ⊗r e⊗r1 ⊗e ik L∗M g, i1 √ i2 ∗ r !r ! · · · r !
sup
g =1,g∈X ∗
sup
g =1,g∈X ∗
sup
g =1,g∈X ∗
1 = M 2s , n!
L∗M g2(H⊗n )∗
1
n
2
k
(H⊗n )∗ ×H
2 ⊗n
( 2 ) 1 L∗M g, ej1 ⊗e j2 ⊗ · · · ⊗e jn ∗ ×H⊗n ⊗n (H ) n! j ,j ,··· ,j 1 2 n ( 2
) 1 g, LM ej1 ⊗e j2 ⊗ · · · ⊗e jn ∗ ×X X n! j ,j ,··· ,j 1
2
n
1
2
n
( 2
) 1 g, M ej1 , ej2 , · · · , ejn ∗ ×X X n! j ,j ,··· ,j
Banach Space-Valued Functionals of White Noise
81
where n denotes the following relation: 1 ≤ i1 < i2 < · · · < ik , 1 ≤ r1 , r2 , · · · , rk ≤ n, r1 + r2 + · · · + rk = n √ and 1 ≤ k ≤ n. Hence LM = M s / n!.
3. Main theorems Let X be a complex Banach space (not necessary to be reflexive). In the present section, we present a moment characterization theorem and a convergent theorem for X-valued generalized functionals of white noise, which refine that of Ref. 9. As is seen, their proofs depend on the kernel theorems given in Section 2 We first outline the framework of white noise analysis where we work. Let H be a real separable Hilbert space with norm | · |0 and inner product ·, · . Let A be a positive self-adjoint operator in H such that there exists an orthonormal basis {ei }i≥1 for H satisfying the following conditions (1) A ei = λi ei , i = 1, 2, · · · , (2) 1 < λ1 ≤ λ2 ≤ · · · ≤ λn ≤ · · · , ∞ −α < ∞ for some positive constant α. (3) i=1 λi For each p ∈ R, define | · |p ≡ |Ap · |0 and let Ep be the completion of DomAp with respect to | · |p . Then Ep is a real Hilbert space for each p ∈ R and moreover Ep and E−p can be viewed as each other’s topological dual. Let E be the projective limit of {Ep | p ≥ 0} and E ∗ the inductive limit of {E−p | p ≥ 0}. Then we get the following inclusion relation E ⊂ Eq ⊂ Ep ⊂ H ⊂ E−p ⊂ E−q ⊂ E ∗ where 0 ≤ p ≤ q. Moreover E and E ∗ can be regarded as each other’s topological dual and E ⊂ H ⊂ E ∗ constitutes a Gel’fand triple. We denote by ·, · the canonical bilinear form on E ∗ × E which is consistent with the inner product of H. By the Minlos theorem,3 there exists a Gaussian measure µ on E ∗ such that 2 ei x,ξ µ(dx) = e−|ξ|0 /2 , ξ ∈ E. (3.1) E∗
The measure space (E ∗ , µ) is known as the white noise space. Let (L2 ) ≡ L2 (E ∗ , µ) be the complex Hilbert space of µ-square integrable functions on E ∗ with inner product ((·, ·)) and norm · 0 . Let Hc be the complexification of H and (·, ·) the inner product of Hc . For denote the symmetric n-fold Hilbert tensor product of Hc with n ≥ 1,√ let Hc⊗n norm n!| · |0 , where | · |0 stands for the norm of Hc⊗n . It is known that, for each −→ (L2 ) such that integer n ≥ 0, there exists a linear isometry In : Hc⊗n In (ξ ⊗n ) = : ·⊗n : , ξ ⊗n , ξ ∈ Ec
(3.2)
82
Yin Chen and Caishi Wang
where Ec means the complexification of E and ·, · the canonical bilinear form on ) ⊥ In (Hc⊗n ) whenever m, n ≥ 0 and m = n. (Ec⊗n )∗ × Ec⊗n . In addition, Im (Hc⊗m The next lemma is known as the Wiener-Itˆ o-Segal isomorphism theorem. Lemma 3.1. Let Γ(Hc ) be the symmetric Fock space over Hc . Then there exists an isometric isomorphism I : Γ(Hc ) −→ (L2 ) such that . ∞ ∞ ∞ . fn = In (fn ), fn ∈ Γ(Hc ) (3.3) I n=0
n=0
n=0
where the series on the righthand side converges in the norm of (L2 ). Now let Γ(A) be the second quantization operator of A defined by Γ(A)ϕ =
∞
In (A⊗n fn ), ϕ ∈ DomΓ(A)
(3.4)
n=0
/ 2 where ϕ = I( ∞ n=0 fn ). Then Γ(A) is a self-adjoint operator in (L ) with inverse −1 p Γ(A ). Similarly, for each p ∈ R, define · p ≡ Γ(A) · 0 and let (Ep ) be the completion of DomΓ(A)p with respect to norm · p . Then (Ep ) is complex Hilbert space for each p ∈ R and (Ep ) and (E−p ) can be viewed as each other’s dual. Let (E) be the projective limit of {(Ep ) | p ≥ 0} and (E)∗ the inductive limit of {(E−p ) | p ≥ 0}. Then we have the following inclusion relation (E) ⊂ (Eq ) ⊂ (Ep ) ⊂ (L2 ) ⊂ (E−p ) ⊂ (E−q ) ⊂ (E)∗
(3.5)
where 0 ≤ p ≤ q. Moreover (E) is a countably Hilbertian nuclear space and (E)∗ can be regarded as the topological dual of (E). Hence we come to a second Gel’fand triple (E) ⊂ (L2 ) ⊂ (E)∗
(3.6)
which is known as the framework of white noise analysis over E ⊂ H ⊂ E ∗ . Usually, elements of (E) are called testing functionals while elements of (E)∗ are referred to as generalized functionals. We denote by ·, ·
the canonical bilinear form on (E)∗ × (E). /∞ /∞ Let ϕ ∈ (L2 ) with ϕ = I( n=0 fn ), n=0 fn ∈ Γ(Hc ). Then ϕ ∈ (E) if and ⊗n only if fn ∈ Ec for each n ≥ 0 and ∞
2 n!fn p < ∞, ∀ p ≥ 0.
(3.7)
n=0
For ϕ, ψ ∈ (E), their Wick product ϕ ψ is defined as . ∞ m fl ⊗g ϕψ =I n=0 l+m=n
(3.8)
/∞ /∞ where ϕ = I( n=0 fn ) and ψ = I( n=0 gn ). It is known that ϕψ ∈ (E) whenever ϕ, ψ ∈ (E). Moreover the mapping (ϕ, ψ) ∈ (E)×(E) −→ ϕψ ∈ (E) is continuous.
Banach Space-Valued Functionals of White Noise
83
For ϕ, ψ ∈ (E), let ϕψ be the usual product of ϕ and ψ. Then ϕψ ∈ (E) and moreover the mapping (ϕ, ψ) ∈ (E) × (E) −→ ϕψ ∈ (E) is continuous. The following lemma shows the relationship between the Wick and the usual products. See Ref. 4 or Ref. 6 for its proof. Lemma 3.2. Thereexists a linear homeomorphism Θ : (E) −→ (E) such that for any ξ ∈ Ec , Θ I1 (ξ) = I1 (ξ) and moreover Θ(ϕψ) = Θ(ϕ) Θ(ψ),
ϕ, ψ ∈ (E).
(3.9)
The linear homeomorphism Θ : (E) −→ (E) is known as the renormalization operator Recall that X is a complex Banach space (not necessary to be reflexive). By an X-valued generalized functional we mean a continuous linear mapping from (E) into X. As usual we denote by L[(E), X] the space of X-valued generalized functionals. Definition 3.2. Let T ∈ L[(E), X] be an X-valued generalized functional. Define M0T = T 1 and MnT (ξ1 , ξ2 , · · · , ξn ) = T I1 (ξ1 )I1 (ξ2 ) · · · I1 (ξn ) , ξ1 , ξ2 , · · · , ξn ∈ Ec (3.10) for n ≥ 1. We call MnT the moment of order n of the X-valued generalized functional T. It is easy to see that the moment of order n of an X-valued generalized functional is a symmetric n-linear mapping from Ecn to X. In the following, for an n-linear mapping M : Ecn −→ X, we use the following notation 0(ξ) = M (ξ, ξ, · · · , ξ), M
ξ ∈ Ec
(3.11)
where ξ appears n times on the righthand side. The next proposition shows that an X-valued generalized functional is uniquely determined by its moment sequence. Proposition 3.1. Let T1 , T2 ∈ L[(E), X]. Then T1 = T2 if and only if 1 1 T1 T2 M n (ξ) = Mn (ξ),
ξ ∈ Ec , n ≥ 0.
(3.12)
Proof. We need only to prove the if part. Let Λn = { In (ξ ⊗n ) | ξ ∈ Ec }. Then the set span{∪n≥0 Λn } is dense in (E). For each n ≥ 0 and any ξ ∈ Ec , we have 1 1 T1 T2 −1 T1 Θ−1 [In (ξ ⊗n )] = M [In (ξ ⊗n )] n (ξ) = Mn (ξ) = T2 Θ which implies that T1 Θ−1 (ϕ) = T2 Θ−1 (ϕ),
# " ϕ ∈ span ∪n≥0 Λn .
Hence, by the continuity of T1 Θ−1 and T2 Θ−1 , we come to that T1 Θ−1 = T2 Θ−1 which implies that T1 = T2 .
84
Yin Chen and Caishi Wang
Theorem 3.1. Let T ∈ L[(E), X] be an X-valued generalized functional and { MnT }n≥0 the moment sequence of T . Then there exist constants K > 0 and p ≥ 0 such that √ 2 2 1T (ξ)2 ≤ K n! |ξ|n , ξ ∈ Ec , n ≥ 0. 2M (3.13) p n X Proof.
By Lemma 3.2, we have 1T (ξ) = T I1 (ξ)n = T Θ−1 In (ξ ⊗n ), M n
ξ ∈ Ec , n ≥ 0.
On the other hand, we see that T Θ−1 ∈ L[(E), X], which implies that there exist constants K > 0 and p ≥ 0 such that 2 2 2T Θ−1 (ϕ)2 ≤ Kϕp , ϕ ∈ (E). X Hence, for any ξ ∈ Ec and n ≥ 0, it holds that √ 2 2 1T (ξ)2 ≤ KIn (ξ ⊗n )p = K n! |ξ|n . 2M p n X
This completes the proof.
Theorem 3.2. Let M0 ∈ X. For each n ≥ 1, let Mn : Ecn −→ X be a symmetric n-linear mapping. Assume that there exist constants K > 0 and p ≥ 0 such that √ 2 2 21 (3.14) Mn (ξ)2X ≤ K n! |ξ|np , ξ ∈ Ec , n ≥ 0. Then there exists a unique X-valued generalized functional T ∈ L[(E), X] such that MnT (ξ1 , ξ2 , · · · , ξn ) = Mn (ξ1 , ξ2 , · · · , ξn ), Moreover for q ≥ p +
α 2
ξ1 , ξ2 , · · · , ξn ∈ Ec .
(3.15)
with e2 A−(q−p) 2HS < 1 we have
K ϕq , T Θ−1 (ϕ)X ≤ 2 1 − e A−(q−p) 2HS
ϕ ∈ (E).
(3.16)
Proof. T is obviously unique if it exists. Below we verify the existence of T . Take q ≥ p + α2 such that e2 A−(q−p) 2HS < 1. Firstly, by (3.14) and the polarization formula, we get nn Mn (ξ1 , ξ2 , · · · , ξn )X ≤ K √ |ξ1 |p |ξ2 |p · · · |ξn |p , n!
ξ1 , ξ2 , · · · , ξn ∈ Ec , n ≥ 0. (3.17)
Since | · |p ≤ | · |q , we then come to nn Mn (ξ1 , ξ2 , · · · , ξn )X ≤ K √ |ξ1 |q |ξ2 |q · · · |ξn |q , n!
ξ1 , ξ2 , · · · , ξn ∈ Ec , n ≥ 0
(3.18) n , where Eq,c is which imply that each Mn has a unique bounded extension to Eq,c the complexification of Eq .
Banach Space-Valued Functionals of White Noise (q)
85 (q)
n Let Mn be the bounded extension of Mn to Eq,c for n ≥ 0. Then Mn remains −q symmetric. It is known that { A ek }k≥1 is an orthonormal basis of Eq,c . And moreover, for n ≥ 1 we have
Mn(q) (A−q ej1 , A−q ej2 , · · · , A−q ejn )2X
j1 ,j2 ,··· ,jn
=
Mn (A−q ej1 , A−q ej2 , · · · , A−q ejn )2X
j1 ,j2 ,··· ,jn
≤
j1 ,j2 ,··· ,jn
= K2
K2
n2n −q |A ej1 |2p |A−q ej2 |2p · · · |A−q ejn |2p n!
n2n −(q−p) 2n A HS n!
< ∞.
(q)
n Hence, for n ≥ 1, the symmetric n-linear mapping Mn : Eq,c −→ X is strongly bounded and
Mn(q) 2s ≤ K 2
n2n −(q−p) 2n A HS . n!
(3.19)
⊗n , X] such that By Theorem 2.2, for each n ≥ 1 there exists an Ln ∈ L[Eq,c (q)
Mn(q) (ξ1 , ξ2 , · · · , ξn ) = L(q) n (ξ1 ⊗ξ2 ⊗ · · · ⊗ξn ),
(q)
(q)
ξ1 , ξ2 , · · · , ξn ∈ Eq,c
(q)
(3.20)
(q)
1 and Ln 2 = n! Mn 2s , where Ln is the usual operator norm of Ln as a ⊗n bounded operator from Eq,c to X. (q) Define a mapping L : Γ(Eq,c ) −→ X as follows
∞
L(q) F = L(q) n (fn ), n=0
(q)
F =
∞ .
fn ∈ Γ(Eq,c )
(3.21)
n=0
(q)
where L0 : C −→ X is defined as L0 (z) = zM0 . We assert that L(q) is well defined and moreover L(q) ∈ L[Γ(Eq,c ), X].
86
Yin Chen and Caishi Wang
In fact, for F = ∞
/∞ n=0
fn ∈ Γ(Eq,c ), we have
L(q) n (fn )X ≤
n=0
≤
∞
√ L(q) n n! |fn |q
n=0 3 ∞
2 L(q) n
41/2 3 ∞
n=0
41/2 n!|fn |2q
n=0
41/2 2 . ∞ 2 1 2 2 (q) 2 Mn s = fn 2 2 n! Γ(Eq,c ) n=0 n=0 4 3 ∞ 1/2 n2n A−(q−p) 2n ≤K F Γ(Eq,c ) HS n!n! n=0 41/2 3 ∞ ≤K e2n A−(q−p) 2n F Γ(Eq,c ) HS 3 ∞
n=0
K ≤ F Γ(Eq,c ) 2 1 − e A−(q−p) 2HS <∞ (q) which implies that the series ∞ n=0 Ln (fn ) is absolutely convergent in X. Hence the mapping L(q) : Γ(Eq,c ) −→ X is well defined. Clearly L(q) is a linear mapping. Moreover, from the above estimate, we see that K L(q) (F )X ≤ F Γ(Eq,c ) , 1 − e2 A−(q−p) 2HS
F ∈ Γ(Eq,c )
(3.22)
which implies that L(q) ∈ L[Γ(Eq,c ), X]. −1 Now let T = L(q) I (q) Θ, where I (q) = I Γ(Eq,c ) which is an isometric isomorphism from Γ(Eq,c ) to (Eq ). Then T ∈ L[(E), X] and moreover for any ξ1 , ξ2 , · · · , ξn ∈ Ec and n ≥ 0 MnT (ξ1 , ξ2 , · · · , ξn ) = T [I1 (ξ1 )I1 (ξ2 ) · · · I1 (ξn )] −1
2⊗ · · · ⊗ξ n )] = L(q) I (q) [In (ξ1 ⊗ξ
= L(q) n ξ1 ⊗ξ2 ⊗ · · · ⊗ξn
= Mn(q) ξ1 , ξ2 , · · · , ξn
= Mn ξ1 , ξ2 , · · · , ξn . Clearly (3.22) implies (3.16).
Remark 3.2. The combination of Theorems 3.1 and 3.2 forms a moment characterization theorem for X-valued generalized functionals of white noise. It was originally proved in Ref. 9 under the condition of X being reflexive. As an immediate consequence of Proposition 3.1 and Theorem 3.2, the next theorem gives a useful norm estimate for X-valued generalized functionals.
Banach Space-Valued Functionals of White Noise
Theorem 3.3. Let T ∈ L[(E), X] and K > 0, p ≥ 0. Assume that √ 2 2 1T (ξ)2 ≤ K n! |ξ|n , ξ ∈ Ec , n ≥ 0. 2M p n X Then for q ≥ p +
α 2
87
(3.23)
with e2 A−(q−p) 2HS < 1 it holds that
K ϕq , T Θ−1 (ϕ)X ≤ 2 1 − e A−(q−p) 2HS
ϕ ∈ (E).
(3.24)
Let T ∈ L[(E), X] and { Tk | k ≥ 1 } ⊂ L[(E), X]. The sequence { Tk | k ≥ 1 } is said to converge strongly to T if for each ϕ ∈ (E) we have Tk ϕ −→ T ϕ in the norm of X. Using the above results, we can prove the next theorem, which offers a necessary and sufficient condition for a sequence of X-valued generalized functionals to converge strongly. Theorem 3.4. T ∈ L[(E), X] and { Tk | k ≥ 1 } ⊂ L[(E), X]. Then { Tk | k ≥ 1 } converges strongly to T if and only if the following
two conditions
are satisfied (1) For each ξ ∈ Ec and each n ≥ 0, Tk Wξn −→ T Wξn (k −→ ∞) in the norm of X. (2) There exist K > 0, p ≥ 0 such that √
(3.25) sup Tk Wξn X ≤ K n! |ξ|np , ξ ∈ Ec , n ≥ 0. k≥1
Here, by convention, Wξ = I1 (ξ) for ξ ∈ Ec . Proof. We first prove the if part. Let Sk = Tk Θ−1 for k ≥ 1 and S = T Θ−1 . Then we see that S, Sk ∈ L[(E), X] and moreover { Tk } converges strongly to T if and only if { Sk } converges strongly to S. Hence it suffices to show that { Sk } converges strongly to S. Take q ≥ p + α2 with e2 A−(q−p) 2HS < 1. Write T0 = T and S0 = S. Then by (3.25) we have √ 2 2 Tk n 2 ξ ∈ Ec , n ≥ 0. sup 2M n (ξ) X ≤ K n! |ξ|p , k≥0
Hence by Theorem 3.3 we get K ϕq , sup Sk (ϕ)X = sup Tk Θ−1 (ϕ)X ≤ 1 − e2 A−(q−p) 2HS k≥0 k≥1
ϕ ∈ (E).
(q)
Let Sk be the bounded extension of Sk to (Eq ) for k ≥ 0. Then from the above inequality we get K (q) sup Sk ≤ 2 1 − e A−(q−p) 2HS k≥0 (q)
(q)
where Sk denotes the usual operator norm of Sk
as an element of L[(Eq ), X].
88
Yin Chen and Caishi Wang
On the other hand, for any n ≥ 0 and ξ ∈ Ec , we have
Sk In (ξ ⊗n ) = Tk Wξn −→ T0 Wξn = S0 In (ξ ⊗n ) which implies that (q)
(q)
Sk (ϕ) = Sk (ϕ) −→ S0 (ϕ) = S0 (ϕ),
ϕ ∈ span{∪n≥0 Λn }
where Λn = { In (ξ ⊗n ) | ξ ∈ Ec }. It is known that the set span{∪n≥0 Λn } is dense in (Eq ). Therefore, by the (q) (q) Banach-Steinhaus theorem,2 we know that { Sk } converges strongly to S0 = S (q) . In particular { Sk } converges strongly to S0 = S. We now prove the only if part. Obviously,
the property that Tk converges strongly to T implies that Tk Wξn −→ T Wξn (k −→ ∞) in the norm of X for each ξ ∈ Ec and each n ≥ 0. To verify what remains, we put UN = { ϕ | ϕ ∈ (E), sup Tk Θ−1 (ϕ)X ≤ N },
N ≥1
k≥1
We can see that each UN is a closed subset of (E) since UN =
∞ 5
{ ϕ | ϕ ∈ (E), Tk Θ−1 (ϕ)X ≤ N }.
k=1
6
Moreover (E) = N ≥1 UN since { Tk Θ−1 } converges strongly to T Θ−1 . It is known that (E) is a Fr´echet space whose distance ρ can be taken as ρ(ϕ, ψ) =
∞ 1 ϕ − ψp . p 1 + ϕ − ψ 2 p p=0
Hence, by the well known Baire’s category theorem,2 there exists N0 ≥ 1 such that UN0 has an interior point ϕ0 ∈ UN0 . Noting that on (E) it holds that ·0 ≤ ·1 ≤ · 2 ≤ · · · , we find that there exist p ≥ 0 and δ > 0 such that Bp (ϕ0 , δ) ⊂ UN0 , where Bp (ϕ0 , δ) = { ϕ | ϕ ∈ (E), ϕ − ϕ0 p < δ }. It is easy to see that for any ϕ ∈ (E) with ϕp = 0 δϕ + ϕ0 ∈ Bp (ϕ0 , δ). 2ϕp Hence 2 δϕ 2 sup 2Tk Θ−1 + ϕ0 2X ≤ N0 , ϕ ∈ (E), ϕp = 0 2ϕ p k≥1 which implies that sup Tk Θ−1 (ϕ)X ≤ k≥1
4N0 ϕp , ϕ ∈ (E). δ
Banach Space-Valued Functionals of White Noise
In particular, we have 2
2 2
2 4N0 √ sup 2Tk Wξn 2X = sup 2Tk Θ−1 In ξ ⊗n 2X ≤ n! |ξ|np , δ k≥1 k≥1 This completes the proof.
89
ξ ∈ Ec , n ≥ 0.
Acknowledgement This work is supported by National Natural Science Foundation of China (10571065), Natural Science Foundation of Gansu Province (0710RJZA106) and NWNU-KJCXGC, China. References 1. D.M. Chung, T.S. Chung and U.C. Ji, A characterization theorem for operators on white noise functionals, J. Math. Soc. Japan 51 (1999), 437–447. 2. J.B. Conway, A Course in Functional Analysis, 2nd edition, Spinger-Verlag, New York, 1990. 3. T. Hida, H.H. Kuo, J. Potthoff and L. Streit, White Noise–An Infinite Dimensional Calculus, Kluwer Academic, Dordrecht, 1993. 4. Z.Y. Huang and J.A. Yan, Introduction to Infinite Dimensional Stochastic Analysis, Kluwer Academic, Dordrecht, 1999. 5. Yu.G. Kondratiev and L. Streit, Spaces of white noise distributions: constructions, applications, I. Rep. Math. Phys. 33 (1993), 341–366. 6. H.H. Kuo, White Noise Distribution Theory, CRC Press, 1996. 7. N. Obata, Operator calculus on vector-valued white noise functionals, J. Funct. Anal. 121 (1994), 185–208. 8. J. Potthoff and L. Streit, A characterization of Hida distributions, J. Funct. Anal. 101 (1991), 212–229. 9. C.S. Wang and Z.Y. Huang, A moment characterization of B-valued generalized functionals of white noise, Acta Math. Sin. Engl. Ser. 22 (2006), 157–168
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 91–117
Chapter 6 Hurst Index Estimation for Self-Similar Processes with Long-Memory Alexandra Chronopoulou∗ and Frederi G. Viens∗ Department of Statistics, Purdue University 150 N. University St. West Lafayette IN 47907-2067, USA. [email protected], [email protected] The statistical estimation of the Hurst index is one of the fundamental problems in the literature of long-range dependent and self-similar processes. In this article, the Hurst index estimation problem is addressed for a special class of self-similar processes that exhibit long-memory, the Hermite processes. These processes generalize the fractional Brownian motion, in the sense that they share its covariance function, but are non-Gaussian. Existing estimators such as the R/S statistic, the variogram, the maximum likelihood and the wavelet-based estimators are reviewed and compared with a class of consistent estimators which are constructed based on the discrete variations of the process. Convergence theorems (asymptotic distributions) of the latter are derived using multiple Wiener-Itˆ o integrals and Malliavin calculus techniques. Based on these results, it is shown that the latter are asymptotically more efficient than the former.
Keywords: self-similar process, parameter estimation, long memory, Hurst parameter, multiple stochastic integral, Malliavin calculus, Hermite process, fractional Brownian motion, non-central limit theorem, quadratic variation. 2000 AMS Classification Numbers: Primary 62F12; Secondary 60G18, 60H07, 62M09. Contents 1
2
3
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . 1.2 Mathematical Background . . . . . . . . . . . . . . Most Popular Hurst Parameter Estimators . . . . . . . . 2.1 Heuristic Estimators . . . . . . . . . . . . . . . . . 2.2 Maximum Likelihood Estimation . . . . . . . . . . 2.3 Wavelet Estimator . . . . . . . . . . . . . . . . . . Multiplication in the Wiener Chaos & Hermite Processes 3.1 Basic Tools on Multiple Wiener-Itˆ o Integrals . . . . 3.2 Main Definitions . . . . . . . . . . . . . . . . . . .
∗ Both
. . . . . . . . . .
. . . . . . . . . .
authors’ research partially supported by NSF grant 0606615. 91
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
92 92 93 96 96 98 99 101 101 103
92
4
Alexandra Chronopoulou and Frederi G. Viens
Hurst Parameter Estimator Based on Discrete Variations 4.1 Estimator Construction . . . . . . . . . . . . . . . ˆN . . . . . . . . . . . . 4.2 Asymptotic Properties of H 5 Comparison & Conclusions . . . . . . . . . . . . . . . . . 5.1 Variations Estimator vs. mle . . . . . . . . . . . . 5.2 Variations’ vs. Wavelet Estimator . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
105 105 106 111 112 114 116
1. Introduction 1.1. Motivation A fundamental assumption in many statistical and stochastic models is that of independent observations. Moreover, many models that do not make this assumption have the convenient Markov property, according to which the future of the system is not affected by its previous states but only by the current one. The phenomenon of long memory has been noted in nature long before the construction of suitable stochastic models: in fields as diverse as hydrology, economics, chemistry, mathematics, physics, geosciences, and environmental sciences, it is not uncommon for observations made far apart in time or space to be non-trivially correlated. Since ancient times the Nile River has been known for its long periods of dryness followed by long periods of floods. The hydrologist Hurst ([13]) was the first one to describe these characteristics when he was trying to solve the problem of flow regularization of the Nile River. The mathematical study of long-memory processes was initiated by the work of Mandelbrot [16] on self-similar and other stationary stochastic processes that exhibit long-range dependence. He built the foundations for the study of these processes and he was the first one to mathematically define the fractional Brownian motion, the prototype of self-similar and long-range dependent processes. Later, several mathematical and statistical issues were addressed in the literature, such as derivation of central (and non-central) limit theorems ([5], [6], [10], [17], [27]), parameter estimation techniques ([1], [7], [8], [27]) and simulation methods ([11]). The problem of the statistical estimation of the self-similarity and/or longmemory parameter H is of great importance. This parameter determines the mathematical properties of the model and consequently describes the behavior of the underlying physical system. Hurst ([13]) introduced the celebrated rescaled adjusted range or R/S statistic and suggested a graphical methodology in order to estimate H. What he discovered was that for data coming from the Nile River the R/S statistic behaves like a constant times k H , where k is a time interval. This was called later by Mandelbrot the Hurst effect and was modeled by a fractional Gaussian noise (fGn). One can find several techniques related to the Hurst index estimation problem in the literature. There are a lot of graphical methods including the R/S statistic, the correlogram and partial correlations plot, the variance plot and the variogram,
Hurst Index Estimation for Self-Similar Processes with Long-Memory
93
which are widely used in geosciences and hydrology. Due to their graphical nature they are not so accurate and thus there is a need for more rigorous and sophisticated methodologies, such as the maximum likelihood. Fox and Taqqu ([12]) introduced the Whittle approximate maximum likelihood method in the Gaussian case which was later generalized for certain non-Gaussian processes. However, these approaches were lacking computational efficiency which lead to the rise of wavelet-based estimators and discrete variation techniques. 1.2. Mathematical Background Let us first recall some basic definitions that will be useful in our analysis. Definition 1.1. A stochastic process {Xn ; n ∈ N} is said to be stationary if the vectors (Xn1 , . . . , Xnd ) and (Xn1 +m , . . . , Xnd +m ) have the same distribution for all integers d, m ≥ 1 and n1 , . . . , nd ≥ 0. For Gaussian processes this is equivalent to requiring that Cov(Xm , Xm+n ) := γ(n) does not depend on m. These two notions are often called strict stationarity and second-order stationarity, respectively. The function γ(n) is called the autocovariance function. The function ρ(n) = γ(n)/γ(0) is the called autocorrelation function. In this context, long memory can be defined in the following way: Definition 1.2. Let {Xn ; n ∈ N} be a stationary process. If n ρ (n) = +∞ then Xn is said to exhibit long memory or long-range dependence. A sufficient condition for this is the existence of H ∈ (1/2, 1) such that ρ(n) > 0. n2H−2 Typical long memory models satisfy the stronger condition limn→∞ ρ(n)/n2H−2 > 0, in which case H can be called the long memory parameter of X. lim inf n→∞
A process that exhibits long-memory has an autocorrelation function that decays very slowly. This is exactly the behavior that was observed by Hurst for the first time. In particular, he discovered that the yearly minimum water level of the Nile river had the long-memory property, as can been seen in Figure 6.1. Another property that was observed in the data collected from the Nile river is the so-called self-similarity property. In geometry, a self-similar shape is one composed of a basic pattern which is repeated at multiple (or infinite) scale. The statistical interpretation of self-similarity is that the paths of the process will look the same, in distribution, irrespective of the distance from which we look at. The rigorous definition of the self-similarity property is as follows: Definition 1.3. A process {Xt ; t ≥ 0} is called self-similar with self-similarity parameter H, if for all c > 0, we have the identity in distribution " −H #D c Xc t : t ≥ 0 ∼ {Xt : t ≥ 0} .
94
Alexandra Chronopoulou and Frederi G. Viens
1300 1000
dataa
Yearly Minimum Water Level for Nile River
0
100
200
300
400
500
600
Years
0.4 0.0
ACF
0.8
ACF for Nile River Data
0
20
40
60
80
Lag
Figure 6.1. Yearly minimum water levels of the Nile √ River at the Roda Gauge (622-1281 A.D.). The dotted horizontal lines represent the levels ±2/ 600. Since our observations are above these levels, it means that they are significantly correlated with significance level 0.05.
In Figure 6.2, we can observe the self-similar property of a simulated path of the fractional Brownian motion with parameter H = 0.75. In this chapter, we concentrate on a special class of long-memory processes which are also self-similar and for which the self-similarity and long-memory parameters coincide, the so-called Hermite processes. This is a family of processes parametrized by the order q and the self-similarity parameter H. They all share the same covariance function Cov(Xt , Xs ) =
1 2H t + s2H − |t − s|2H . 2
(1.1)
From the structure of the covariance function we observe that the Hermite processes have stationary increments, they are H-self-similar and they exhibit long-range dependence as defined in Definition 1.2 (in fact, limn→∞ ρ (n) /n2H−2 = H(2H − 1)). The Hermite process for q = 1 is a standard fractional Brownian motion with Hurst parameter H, usually denoted by B H , the only Gaussian process in the Hermite class. A Hermite process with q = 2 known as the Rosenblatt process. In the sequel, we will call H either long-memory parameter or self-similarity parameter or Hurst parameter. The mathematical definition of these processes is given in Definition 3.5.
95
B_H(t)
−0.2 0.2
0.6
Hurst Index Estimation for Self-Similar Processes with Long-Memory
0
2
4
6
8
10
3
4
5
0.6
0.8
1.0
0.2 −0.2
B_H(t)*2^H
t
0
1
2
0.00 −0.10
B_H(t)*10^H
t
0.0
0.2
0.4 t
Figure 6.2. Self-similarity property for the fractional Brownian motion with H = 0.75. The first graph shows the path from time 0 to 10. The second and third graph illustrate the normalized sample path for 0 < t < 5 and 0 < t < 1 respectively.
Another class of processes used to model long-memory phenomena are the fractional ARIMA (Auto Regressive, Integrated, Moving Average) or FARIMA processes. The main technical difference between a FARIMA and a Hermite process is that the first one is a discrete-time process and the second one a continuous-time process. Of course, in practice, we can only have discrete observations. However, most phenomena in nature evolve continuously in time and the corresponding observations arise as samplings of continuous time processes. A discrete-time model depends heavily on the sampling frequency: daily observations will be described by a different FARIMA model than weekly observations. In a continuous time model, the observation sampling frequency does not modify the model. These are compelling reasons why one may choose to work with the latter. In this article we study the Hurst parameter estimation problem for the Hermite processes. The structure of the paper is as follows: in Section 2, we provide a survey of the most widely used estimators in the literature. In Section 3 we describe the main ingredients and the main definitions that we need for our analysis. In Section 4, we construct a class of estimators based on the discrete variations of the process and describe their asymptotic properties, including a sketch of the proof of the main theoretical result, Theorem 4.2, which summarizes the series of papers [6], [7], [27] and [28]. In the last section, we compare the variations-based estimators with the existing ones in the literature, and provide an original set of practical recommendations based on theoretical results and on simulations.
96
Alexandra Chronopoulou and Frederi G. Viens
2. Most Popular Hurst Parameter Estimators In this section we discuss the main estimators for the Hurst parameter in the literature. We start with the description of three heuristic estimators: the R/S estimator, the correlogram and the variogram. Then, we concentrate on a more traditional approach: the maximum likelihood estimation. Finally, we briefly describe the wavelet-based estimator. The description will be# done in the case of the fractional Brownian mo" that it #is observed in discrete times tion (fBm) BtH ; t ∈ [0, 1] . We assume " {0, 1, . . . , N − 1, N }. We denote by XtH ; t ∈ [0, 1] the corresponding increment process of the fBm (i.e. X Hi = B Hi − B H i−1 ) , also known as fractional Gaussian N N N noise. 2.1. Heuristic Estimators R/S Estimator: The most famous among these estimators is the so-called R/S estimator that was first proposed by Hurst in 1951, [13], in the hydrological problem regarding the storage of water coming from the Nile river. We start by dividing our data in K non-intersecting blocks, each one of which contains M = [ N K ] elements. The rescaled adjusted range is computed for various values of N by Q := Q(ti , N ) =
R(ti , N ) S(ti , n)
at times ti = M (i − 1), i = 1, . . . , K. For k−1 n−1 1 XtHi +j − k XtHi +j , k = 1, . . . , n Y (ti , k) := n j=0 j=0 we define R(ti , n) and S(ti , n) to be R(ti , n) := max {Y (ti , 1), . . . , Y (ti , n)} − min {Y (ti , 1), . . . , Y (ti , n)} and 7 2 8 8 n−1 n−1 81 H 2 1 X − XH . S(ti , n) :=9 n j=0 ti +j n j=0 ti +j Remark 2.1. It is interesting to note that the numerator R(ti , n) can be computed only when ti + n ≤ N . In order to compute a value for H we plot the logarithm of R/S (i.e log Q) with respect to log n for several values of n. Then, we fit a least-squares line y = a + b log n to a central part of the data, that seem to be nicely scattered along a straight line. The slope of this line is the estimator of H.
Hurst Index Estimation for Self-Similar Processes with Long-Memory
97
This is a graphical approach and it is really in the hands of the statistician to determine the part of the data that is “nicely scattered along the straight line”. The problem is more severe in small samples, where the distribution of the R/S statistic is far from normal. Furthermore, the estimator is biased and has a large standard error. More details on the limitations of this approach in the case of fBm can be found in [2]. Correlogram: Recall ρ(N ) the autocorrelation function of the process as in Definition 1.1. In the Correlogram approach, it is sufficient to plot the sample autocorrelation function ρˆ(N ) =
γˆ (N ) γˆ (0)
√ against N . As a rule of thumb we draw two horizontal lines at ±2/ N . All observations outside the lines are considered to be significantly correlated with significance level 0.05. If the process exhibits long-memory, then the plot should have a very slow decay. The main disadvantage of this technique is its graphical nature which cannot guarantee accurate results. Since long-memory is an asymptotic notion, we should analyze the correlogram at high lags. However, when for example H = 0.6 it is quite hard to distinguish it from short-memory. To avoid this issue, a more suitable plot will be this of log ρˆ(N ) against log N . If the asymptotic decay is precisely hyperbolic, then for large lags the points should be scattered around a straight line with negative slope equal to 2H − 2 and the data will have long-memory. On the other hand when the plot diverges to −∞ with at least exponential rate, then the memory is short. Variogram: The variogram for the lag N is defined as 2 % 1 $
H . V (N ) := E BtH − Bt−N 2 Therefore, it suffices to plot V (N ) against N . However, we can see that the interpretation of the variogram is similar to that of the correlogram, since if the process is stationary (which is true for the increments of fractional Brownian motion and all other Hermite processes), then the variogram is asymptotically finite and V (N ) = V (∞)(1 − ρ(N )). In order to determine whether the data exhibit short or long memory this method has the same problems as the correlogram.
98
Alexandra Chronopoulou and Frederi G. Viens
The main advantage of these approaches is their simplicity. In addition, due to their non-parametric nature, they can be applied to any long-memory process. However, none of these graphical methods are accurate. Moreover, they can frequently be misleading, indicating existence of long-memory in cases where none exists. For example, when a process has short-memory together with a trend that decays to zero very fast, a correlogram or a variogram could show evidence of long-memory. In conclusion, a good approach would be to use these methods as a first heuristic analysis to detect the possible presence of long-memory and then use a more rigorous technique, such as those described in the remainder of this section, in order to estimate the long-memory parameter. 2.2. Maximum Likelihood Estimation The Maximum Likelihood Estimation (mle) is the most common technique of parameter estimation in Statistics. In the class of Hermite processes, its use is limited to fBm, since for the other processes we do not have an expression for their distribution function. The mle estimation is done in the spectral domain using the spectral density of fBm
as follows. H the vector of the fractional Gaussian noise Denote by X H = X0H , X1H , . . . , XN (increments of fBm) and by X H the transposed (column) vector; this is a Gaussian vector with covariance matrix ΣN (H) = [σij (H)]i,j=1,...,N ; we have 1 2H
i + j 2H − |i − j|2H . σij := Cov XiH ; XjH = 2 Then, the log-likelihood function has the following expression: log f (x; H) = −
1 N 1 −1 log 2π − log [det (ΣN (H))] − X H (ΣN (H)) X H . 2 2 2
ˆ mle , the mle for H, we need to maximize the log-likelihood In order to compute H equation with respect to H. A detailed derivation can be found in [3] and [9]. The ˆ mle is described in the following theorem. asymptotic behavior of H 2 π ∂ 1 Theorem 2.1. Define the quantity D(H) = 2π dx. Then −π ∂H log f (x; H) under certain regularity conditions (that can be found in [9]) the maximum likelihood estimator is weakly consistent and asymptotically normal: ˆ mle → H , as N → ∞ in probability; (i) H √ ˆ mle − H → N (0, 1) in distribution, as N → ∞. (ii) N 2 D(H) H In order to obtain the mle in practice, in almost every step we have to maximize a quantity that involves the computation of the inverse of Σ(H), which is not an easy task. In order to avoid this computational burden, we approximate the likelihood function with the so-called Whittle approximate likelihood which can be proved to
Hurst Index Estimation for Self-Similar Processes with Long-Memory
99
converge to the true likelihood, [29]. In order to introduce Whittle’s approximation we first define the density on the spectral domain. Definition 2.4. Let Xt be a process with autocovariance function γ(h), as in Definition 1.1. The spectral density function is defined as the inverse Fourier transform of γ(h) f (λ) :=
∞ 1 −iλh e γ(h). 2π h=−∞
In the fBm case the spectral density can be written as 3 4 π 1 1 exp − f (λ; H) = log f1 dλ , where 2π 2π −π ∞ 1 f1 (λ; H) = Γ(2H + 1) sin(πH)(1 − cos λ) |2πj + λ|−2H−1 π j=−∞ The Whittle method approximates each of the terms in the log-likelihood function as follows: π 1 (i) limN →∞ log det(ΣN (H)) = 2π −π log f (λ; H)dλ. −1 (ii) The matrix ΣN (H) itself is asymptotically equivalent to the matrix A(H) = [α(j − )]j , where π −i(j− )λ 1 e dλ α(j − ) = (2π)2 −π f (λ; H) Combining the approximations above, we now need to minimize the quantity π n 1 N 1 (log f (λ; H))∗ = − log 2π − log f (λ; H)dλ − X A(H)X . 2 2 2π −π 2 The details in the Whittle mle estimation procedure can be found in [3]. For the Whittle mle we have the following convergence in distribution result as N → ∞ : N D ˆ W mle − H → H N (0, 1) (2.1) −1 [2 D(H)] It can also be shown that the Whittle approximate mle remains weakly consistent. 2.3. Wavelet Estimator Much attention has been devoted to the wavelet decomposition of both fBm and the Rosenblatt process. Following this trend, an estimator for the Hurst parameter based on wavelets has been suggested. The details of the procedure for the constructing this estimator, and the underlying wavelets theory, are beyond the scope of this article. For the proofs and the detailed exposition of the method the reader can refer to [1], [11] and [14]. This section provides a brief exposition.
100
Alexandra Chronopoulou and Frederi G. Viens
Let ψ : R → R be a continuous function with support in [0, 1]. This is also called the mother wavelet. Q ≥ 1 is the number of vanishing moments where tp ψ(t)dt = 0, for p = 0, 1, . . . , Q − 1, R tQ ψ(t)dt = 0. R
∗
For a “scale” α ∈ N the corresponding wavelet coefficient is given by ∞ 1 t − i ZtH dt, d(α, i) = √ ψ α −∞ α for i = 1, 2, . . . , Nα with Nα = N α − 1, where N is the sample size. Now, for (α, b) we define the approximate wavelet coefficient of d(α, b) as the following Riemann approximation 1 H e(α, b) = √ Zk ψ α N
k=1
k −b , α
where Z H can be either fBm or Rosenblatt process. Following the analysis by J.M. Bardet and C.A. Tudor in [1], the suggested estimator can be computed by performing a log-log regression of Nαi (N ) 1 e2 (αi (N ), j) Nαi (N ) j=1 1≤i≤
against (i αi (N ))1≤i≤ , where α(N ) is a sequence of integer numbers such that N α(N )−1 → ∞ and α(N ) → ∞ as N → ∞ and αi (N ) = iα(N ). Thus, the obtained estimator, in vectors notation, is the following Nαi (N ) −1 1 1 1 ˆ wave := ,0 (2.2) Z , Z H Z −1 e2 (αi (N ), j) − , 2 2 j=1 2 1≤i≤
where Z (i, 1) = 1, Z (i, 2) = log i for all i = 1, . . . , , for ∈ N {1}. Theorem 2.2. Let α(N ) as above. Assume also that ψ ∈ C m with m ≥ 1 and ψ is supported on [0, 1]. We have the following convergences in distribution. (1) Let Z H be a fBm; assume N α(N )−2 → 0 as N → ∞ and m ≥ 2; if Q ≥ 2, or if Q = 1 and 0 < H < 3/4, then there exists γ 2 (H, , ψ) > 0 such that : N ˆ D Hwave − H → N (0, γ 2 (H, , ψ)), as N → ∞. (2.3) α(N )
Hurst Index Estimation for Self-Similar Processes with Long-Memory 5−4H
101
3−2H+m 3−2H
(2) Let Z H be a fBm; assume N α(N )− 4−4H → 0 as N α(N )− Q = 1 and 3/4 < H < 1, then 2−2H N D ˆ wave − H → H L, as N → ∞ α(N )
where the distribution law L depends on H, and ψ. 2−2H (3) Let Z H is be Rosenblatt process; assume N α(N )− 3−2H N α(N )−(1+m) → 0; then 1−H N D ˆ wave − H → H L, as N → ∞ α(N )
→ 0; if
(2.4) →
0 as
(2.5)
where the distribution law L depends on H, and ψ. The limiting distributions L in the theorem above are not explicitly known: they come from a non-trivial non-linear transformation of quantities which are asympˆ wave totically normal or Rosenblatt-distributed. A very important advantage of H over the mle for example is that it can be computed in an efficient and fast way. On the other hand, the convergence rate of the estimator depends on the choice of α(N ). 3. Multiplication in the Wiener Chaos & Hermite Processes 3.1. Basic Tools on Multiple Wiener-Itˆ o Integrals In this section we describe the basic framework that we need in order to describe and prove the asymptotic properties of the estimator based on the discrete variations of the process. We denote by {W "t :Ht ∈[ 0, 1]} a# classical Wiener process on a standard Wiener space (Ω, F , P ). Let Bt ; t ∈ [0, 1] be a fractional Brownian motion with Hurst parameter H ∈ (0, 1) and covariance function ) ( 1 2H t + s2H − |t − s|2H . (3.1) 1[0,s] , 1[0,t] = RH (t, s) := 2 1
We denote by H its canonical Hilbert space. When H = 12 , then B 2 is the standard funcBrownian motion on L2 ([0, 1]). Otherwise, H is a Hilbert space ) ( which contains tions on [0, 1] under the inner product that extends the rule 1[0,s] , 1[0,t] . Nualart’s textbook (Chapter 5, [19]) can be consulted for full details. We will use the representation of the fractional Brownian motion B H with respect to the standard Brownian motion W : there exists a Wiener process W and a deterministic kernel K H (t, s) for 0 ≤ s ≤ t such that 1
K H (t, s)dWs = I1 K H (·, t) , (3.2) B H (t) = 0
o integral with respect to W . Now, let In (f ) be the where I1 is the Wiener-Itˆ multiple Wiener-Itˆ o integral, where f ∈ L2 ([0, 1]n ) is a symmetric function. One
102
Alexandra Chronopoulou and Frederi G. Viens
can construct the multiple integral starting from simple functions of the form f := i1 ,...,in ci1 ,...in 1Ai1 ×...×Ain where the coefficient ci1 ,..,in is zero if two indices are equal and the sets Aij are disjoint intervals by ci1 ,...in W (Ai1 ) . . . W (Ain ), In (f ) := i1 ,...,in
where W 1[a,b] = W ([a, b]) = Wb − Wa . Using a density argument the integral can be extended to all symmetric functions in L2 ([0, 1]n ). The reader can refer to Chapter 1 [19] for its detailed construction. Here, it is interesting to observe that this construction coincides with the iterated Itˆ o stochastic integral 1 tn t2 ... f (t1 , . . . , tn )dWt1 . . . dWtn . (3.3) In (f ) = n! 0
0
0
The application In is extended to non-symmetric functions f via
In (f ) = In f˜ where f˜ denotes the symmetrization of f defined by f˜(x1 , . . . , xN ) = 1 σ∈Sn f (xσ(1) , . . . , xσ(n) ). n! n equipped with the scaled norm In is an isometry between the Hilbert space H" # 1 √ ||·||H⊗n . The space of all integrals of order n, In (f ) : f ∈ L2 ([0, 1]n ) , is called n! nth Wiener chaos. The Wiener chaoses form orthogonal sets in L2 (Ω): E (In (f )Im (g)) = n! f, g L2 ([0,1]n ) =0
if m = n,
(3.4)
if m = n.
The next multiplication formula will plays a crucial technical role: if f ∈ L2 ([0, 1]n ) and g ∈ L2 ([0, 1]m ) are symmetric functions, then it holds that In (f )Im (g) =
m∧n
!Cm Cn Im+n−2 (f ⊗ g),
(3.5)
=0
where the contraction f ⊗ g belongs to L2 ([0, 1]m+n−2 ) for = 0, 1, . . . , m ∧ n and is given by (f ⊗ g)(s1 , . . . , sn− , t1 , . . . , tm− ) = f (s1 , . . . , sn− , u1 , . . . , u )g(t1 , . . . , tm− , u1 , . . . , u )du1 . . . du . [0,1]
Note that the contraction (f ⊗ g) is not necessarily symmetric. We will denote its ˜ g). symmetrization by (f ⊗ We now introduce the Malliavin derivative for random variables in a finite chaos. The derivative operator D is defined on a subset of L2 (Ω), and takes values in L2 (Ω × [0, 1]). Since it will be used for random variables in a finite chaos, it is sufficient to know that if f ∈ L2 ([0, 1]n ) is a symmetric function, DIn (f ) exists and it is given by Dt In (f ) = n In−1 (f (·, t)),
t ∈ [0, 1].
Hurst Index Estimation for Self-Similar Processes with Long-Memory
103
D. Nualart and S. Ortiz-Latorre in [21] proved the following characterization of convergence in distribution for any sequence of multiple integrals to the standard normal law. Proposition 3.1. Let n be a fixed integer. Let FN = In (fN ) be a sequence of square integrable random variables in the nth Wiener chaos such that limN →∞ E FN2 = 1. Then the following are equivalent: (i) The sequence (FN )N ≥0 converges to the normal law N (0, 1). 1 (ii) DFN 2L2 [0,1] = 0 |Dt In (f )|2 dt converges to the constant n in L2 (Ω) as N → ∞. There also exists a multidimensional version of this theorem due to G. Peccati and C. Tudor in [22]. 3.2. Main Definitions The Hermite processes are a family of processes parametrized by the order and the self-similarity parameter with covariance function given by (3.1). They are wellsuited to modeling various phenomena that exhibit long-memory and have the self(q,H) )t∈[0,1] the similarity property, but which are not Gaussian. We denote by (Zt Hermite process of order q with self-similarity parameter H ∈ (1/2, 1) (here q ≥ 1 is an integer). The Hermite process can be defined in two ways: as a multiple integral with respect to the standard Wiener process (Wt )t∈[0,1] ; or as a multiple integral with respect to a fractional Brownian motion with suitable Hurst parameter. We adopt the first approach throughout the paper, which is the one described in the following Definition 3.5. (q,H)
)t∈[0,1] of order q ≥ 1 and with selfDefinition 3.5. The Hermite process (Zt similarity parameter H ∈ ( 12 , 1) for t ∈ [0, 1] is given by t t t (q,H) H H Zt = d(H) ... dWy1 . . . dWyq ∂1 K (u, y1 ) . . . ∂1 K (u, yq )du , 0
y1 ∨...∨yq
0
(3.6) where K is the usual kernel of the fractional Brownian motion, d(H) a constant depending on H and H
H = 1 +
H −1 ⇐⇒ (2H − 2)q = 2H − 2. q
(3.7)
Therefore, the Hermite process of order q is defined as a q th order Wiener-Itˆo integral of a non-random kernel, i.e. (q,H)
Zt
= Iq (L(t, ·)) ,
where L(t, y1 , . . . , yq ) = ∂1 K H (u, y1 ) . . . ∂1 K H (u, yq )du. The basic properties of the Hermite process are listed below:
104
Alexandra Chronopoulou and Frederi G. Viens
• the Hermite process Z (q,H) is H-selfsimilar and it has stationary increments; • the mean square of its increment is given by 2 (q,H) − Zs(q,H) = |t − s|2H ; E Zt as a consequence, it follows from the Kolmogorov continuity criterion that, almost surely, Z (q,H) has H¨older-continuous paths of any order δ < H; • Z (q,H) exhibits long-range dependence in the sense of Definition 1.2. In fact, the autocorrelation function ρ (n) of its increments of length 1 is asymptotically equal to H(2H − 1)n2H−2 . This property is identical to that of fBm since the processes share the same covariance structure, and the property is well-known for fBm with H > 1/2. In particular for Hermite processes, the self-similarity and long-memory parameter coincide. In the sequel, we will also use the filtered process to construct an estimator for H. Definition 3.6. A filter α of length ∈ N and order p ∈ N \ 0 is an ( + 1)dimensional vector α = {α0 , α1 , . . . , α } such that
αq q r = 0,
for 0 ≤ r ≤ p − 1, r ∈ Z
q=0
αq q p = 0
q=0
with the convention 00 = 1. We assume that we observe the process in discrete times {0, N1 , . . . , NN−1 , 1}. The filtered process Z (q,H) (α) is the convolution of the process with the filter, according to the following scheme:
i−q (q,H) , for i = , . . . , N − 1 (3.8) αq Z Z(α) := N q=0 Some examples are the following: (1) For α = {1, −1}
Z (q,H) (α) = Z (q,H)
i N
− Z (q,H)
i−1 N
.
This is a filter of length 1 and order 1. (2) For α = {1, −2, 1} i i−1 i−2 (q,H) (q,H) (q,H) (q,H) − 2Z +Z . (α) = Z Z N N N This is a filter of length 2 and order 2.
Hurst Index Estimation for Self-Similar Processes with Long-Memory
105
(3) More generally, longer filters produced by finite-differencing are such that the coefficients of the filter α are the binomial coefficients with alternating signs. Borrowing the notation ∇ from time series analysis, ∇Z (q,H) (i/N ) = Z (q,H) (i/N ) − Z (q,H) ((i − 1) /N ), we define ∇j = ∇∇j−1 and we may write the jth-order finite-difference-filtered process as follows i . Z (q,H) (α) := ∇j Z (q,H) N 4. Hurst Parameter Estimator Based on Discrete Variations The estimator based on the discrete variations of the process is described by Coeurjolly in [8] for fractional Brownian motion. Using previous results by Breuer and Major, [5], he was able to prove consistency and derive the asymptotic distribution for the suggested estimator in the case of filter of order 1 for H < 3/4 and for all H in the case of a longer filter. Herein we see how the results by Coeurjolly are generalized: we construct consistent estimators for the self-similarity parameter of a Hermite process of order q based on the discrete observations of the underlying process. In order to determine the corresponding asymptotic behavior we use properties of the Wiener-Itˆ o integrals as well as Malliavin calculus techniques. The estimation procedure is the same irrespective of the specific order of the Hermite process, thus in the sequel we denote the process by Z := Z (q,H) . 4.1. Estimator Construction Filter of order 1: α = {−1, +1}. We present first the estimation procedure for a filter of order 1, i.e. using the increments of the process. The quadratic variation of Z is 2 N i−1 i 1 −Z Z SN (α) = . N i=1 N N
(4.1)
We know that the expectation of SN (α) is E [SN (α)] = N −2H ; thus, given good concentration properties for SN (α), we may attempt to estimate SN (α)’s expectation by its actual value, i.e. E [SN (α)] by SN (α); suggesting the following estimator for H: ˆ N = − log SN (α) . H 2 log N
(4.2)
Filter of order p: In this case we use the filtered process in order to construct the estimator for H. Let α be a filter (as defined in (3.6)) and the corresponding filtered process Z(α) as in (3.8): First we start by computing the quadratic variation of the
106
Alexandra Chronopoulou and Frederi G. Viens
filtered process 2 N 1 i−q αq Z . SN (α) = N N q=0
(4.3)
i=
Similarly as before, in order to construct the estimator, we estimate SN by its −2H 2H . Thus, expectation, which computes as E [SN ] = − N 2 q,r=0 αq αr |q − r| ˆ we can obtain HN by solving the following non-linear equation with respect to H SN
N −2H =− αq αr |q − r|2H . 2 q,r=0
(4.4)
2x ˆ N = g −1 (SN ), where g(x) = − N −2x We write that H q,r=0 αq αr |q − r| . In 2 this case, it is not possible to compute an analytical expression for the estimator. However, we can show that there exists a unique solution for H ∈ [ 12 , 1] as long as + * 2H q,r=0 αq αr log |q − r| |q − r| . exp N > max 2H H∈[ 12 ,1] q,r=0 αq αr |q − r|
This restriction is typically satisfied, since we work with relatively large sample sizes. ˆN 4.2. Asymptotic Properties of H The first question is whether the suggested estimator is consistent. This is indeed true: if sampled sufficiently often (i.e. as N → ∞), the estimator converges to the true value of H almost surely, for any order of the filter. Theorem 4.1. Let H ∈ ( 12 , 1). Assume we observe the Hermite process Z of order ˆ N is strongly consistent, i.e. q with Hurst parameter H. Then H ˆ N = H a.s. lim H
N →∞
ˆ N log N = 0 a.s. In fact, we have more precisely that limN →∞ H − H Remark 4.2. If we look at the above theorem more carefully, we observe that this is a slightly different notion of consistency than the usual one. In the case of the mle, for example, we let N tend to infinity which means that the horizon from which we sample goes to infinity. Here, we do have a fixed horizon [0, 1] and by letting N → ∞ we sample infinitely often. If we had convergence in distribution this would not be an issue, since we could rescale the process appropriately by taking advantage of the self-similarity property, but in terms of almost sure or convergence in probability it is not exactly equivalent.
Hurst Index Estimation for Self-Similar Processes with Long-Memory
107
ˆ N . Obviously, it The next step is to determine the asymptotic distribution of H should depend on the distribution of the underlying process, and in our case, on q and H. We consider the following three cases separately: fBm (Hermite process of order q = 1), Rosenblatt process (q = 2), and Hermite processes of higher order q > 2. We summarize the results in the following theorem, where the limit notation % $ L2 (Ω) 2 Xn → X denotes convergence in the mean square limN →∞ E (XN − X) = 0, D
and → continues to denote convergence in distribution. Theorem 4.2. (1) Let H ∈ (0, 1) and B H be a fractional Brownian motion with Hurst parameter H. (a) Assume that we use the filter of order 1. i. If H ∈ (0, 34 ), then as N → ∞
√ 2 ˆ D HN − H → N (0, 1), N log N √ c1,H 2 ∞
where c1,H := 2 + k=1 2k 2H − (k − 1)2H − (k + 1)2H . ii. If H ∈ ( 34 , 1), then as N → ∞ L2 (Ω) 2 ˆ N 1−H log N √ HN − H → Z (2,H) , c2,H
(4.5)
(4.6)
2
(2H−1) where c2,H := 2H4H−3 . 3 iii. If H = 4 , then as N → ∞
where c3,H :=
2 ˆ D HN − H → N (0, 1), N log N √ c3,H
(4.7)
9 16 .
(b) Now, let α be of any order p ≥ 2. Then, √ 1 ˆ D HN − H → N (0, 1), N log N c6,H 1 α 2 where c6,H = 2 i∈Z ρH (i) .
(4.8)
(2) Suppose that H > 12 and the observed process Z 2,H is a Rosenblatt process with Hurst parameter H. (a) If α is a filter of order 1, then N 1−H log N where c4,H := 16d(H)2 .
L2 (Ω) 1 ˆ HN − H → Z (2,H) , 2c4,H
(4.9)
108
Alexandra Chronopoulou and Frederi G. Viens
(b) If α is a filter of order p > 1, then
2 (Ω) −1/2 ˆ N − H L→ 2c7,H N 1−H log N H Z (2,H)
where c7,H = *
64 c(H)2
2H − 1 H (H + 1)2 $
(4.10)
bq br |1 + q − r|
2H
×
+ |1 − q + r|
2H
− 2|q − r|
2H
+ % 2
q,r=0
q with bq = r=0 αr . Here is the length of the filter, which is related to the order p, see Definition 3.6 and examples following. (3) Let H ∈ ( 12 , 1) and q ∈ N {0}, q ≥ 2. Let Z (q,H) be a Hermite process of and a filter order q and self-similarity parameter H. Then, for H = 1 + H−1 q of order 1, L2 (Ω) 2 ˆ HN (α) − H → Z (2,2H −1) , (4.11) N 2−2H log N c5,H
where c5,H :=
4q!d(H)4 (H (2H −1))2q−2 . (4H −3)(4H −2)
Remark 4.3. In the notation above, Z (2,K) denotes a standard Rosenblatt random variable, which means a random variable that has the same distribution as the Hermite process of order 2 and parameter K at t=1. Before continuing with a sketch of proof of the theorem, it is important to discuss the theorem’s results. (1) In most of the cases above, we observe that the order of convergence of the estimator depends on H, which is the parameter that we try to estimate. This is not a problem, because it has already been proved, in [6], [7], [27] and [28], ˆ N in the rate that the theorem’s convergences still hold when we replace H by H of convergence. (2) The effect of the use of a longer filter is very significant. In the case of fBm, when we use a longer filter, we no longer have the threshold of 3/4 and the suggested estimator is always asymptotically normal. This is important for the following reason: when we start the estimation procedure, we do not know beforehand the value of H and such a threshold would create confusion in choosing the ˆ N appropriately. Finally, the fact correct rate of convergence in order to scale H that we have asymptotic normality for all H allows us to construct confidence intervals and perform hypothesis testing and model validation. (3) Even in the Rosenblatt case the effect of the filter is significant. This is not obvious here, but we will discuss it later in detail. What actually happens is that by filtering the process asymptotic standard error is reduced, i.e. the longer the filter the smaller the standard error.
Hurst Index Estimation for Self-Similar Processes with Long-Memory
109
(4) Finally, one might wonder if the only reason to chose to work with the quadratic variation of the process, instead of a higher order variation (powers higher than 2 in (4.1) and (4.3)), is for the simplicity of calculations. It turns out that there are other, better reasons to do so: Coeurjolly ([8]) proved that the use of higher order variations would lead to higher asymptotic constants and thus to larger standard errors in the case of the fBm. He actually proved that the optimal variation respect to the standard error is the second (quadratic ). Proof. [Sketch of proof of Theorem 4.2] We present the key ideas for proving the consistency and asymptotic distribution results. We use a Hermite process of order q and a filter of order 1. However, wherever it is necessary we focus on either fBm or Rosenblatt in order to point out the corresponding differences. The ideas for the proof are very similar in the case of longer filters so the reader should refer to [7] for the details in this approach. It is convenient to work with the centered normalized quadratic variation defined as 2 (q,H) (q,H) N −1 Z i+1 − Z i 1 N N . (4.12) VN := −1 + N i=0 N −2H It is easy to observe that for SN defined in (4.1), SN = N −2H (1 + VN ) . ˆ N − H log N , therefore in Using this relation we can see that log (1 + VN ) = 2 H order to prove consistency it suffices to show that VN converges to 0 as N → ∞ ˆ N depends on the asymptotic behavior of VN . and the asymptotic distribution of H By the definition of the Hermite process in Definition 3.5, we have that (q,H)
Z i+1 N
(q,H)
−Zi
N
= Iq (fi,N )
where we denoted
fi,N (y1 , . . . , yq ) = 1[0, i+1 ] (y1 ∨ . . . ∨ yq )d(H) N
− 1[0, i ] (y1 ∨ . . . ∨ yq )d(H) N
i+1 N
y1 ∨...∨yq i N
y1 ∨...∨yq
∂1 K H (u, y1 ) . . . ∂1 K H (u, yq )du
∂1 K H (u, y1 ) . . . ∂1 K H (u, yq )du.
Now, using the multiplication property (3.5) of multiple Wiener-Itˆ o integrals we can derive a Wiener chaos decomposition of VN as follows: VN = T2q + c2q−2 T2q−2 + . . . + c4 T4 + c2 T2 (4.13)
q 2 where c2q−2k := k! k are the combinatorial constants from the product formula for 0 ≤ k ≤ q − 1, and N −1 2H−1 I2q−2k fi,N ⊗k fi,N , T2q−2k := N i=0
110
Alexandra Chronopoulou and Frederi G. Viens
where fi,N ⊗k fi,N is the k th contraction of fi,N with itself which is a function of 2q − 2k parameters. To determine the magnitude of this Wiener chaos decomposition VN , we study each of the terms appearing in the decomposition separately. If we compute the L2 norm of each term, we have 2 s 22 −1 2 2 N 2 2 2 4H−2 E T2q−2k = N (2q − 2k)! 2 fi,N ⊗k fi,N 2 2 2 2 2q−2k i=0
= N 4H−2 (2q − 2k)!
N −1
L ([0,1]
)
˜ k fi,N , fj,N ⊗ ˜ k fj,N L2 ([0,1]2q−2k ) fi,N ⊗
i,j=0
Using properties of the multiple integrals we have the following results 4 (H (2H −1))2q−2 N 2(2H −2) • For k = q − 1, E T22 ∼ 4d(H) (4H −3)(4H −2) • For k = 0, . . . , q − 2 % $ 2 = O N −2(2−2H )2(q−k−1) . E N 2(2−2H ) T2q−2k Thus, we observe that the term T2 is the dominant term in the decomposition of the variation statistic VN . Therefore, with c1,1,H = it holds that
4d(H)4 (H (2H − 1))2q−2 , (4H − 3)(4H − 2)
% $ (2−2H )2 −2 2 lim E c−1 c2 VN = 1. 1,1,H N
N →∞
Based on these results we can easily prove that VN converges to 0 a.s. and then ˆ N is strongly consistent. conclude that H Now, in order to understand the asymptotic behavior of the renormalized sequence VN it suffices to study the limit of the dominant term N −1 2H−1 (2−2H ) N fi,N ⊗q−1 fi,N I2 N i=0
When q = 1 (the fBm case), we can use the Nualart–Ortiz-Latorre criterion (Proposition 3.1) in order to prove convergence to Normal distribution. However, in the general case for q > 1, using the same criterion, we can see that convergence to a Normal law is no longer true. Instead, a direct method can be employed to determine the asymptotic distribution of the above quantity. Let N −1 N 2H−1 N (2−2H ) i=0 fi,N ⊗q−1 fi,N = f2N + r2 , where r2 is a remainder term and
f2N (y, z) := N 2H−1 N (2−2H ) d(H)2 a(H )q−1 N −1 1[0, Ni ] (y ∨ z) dvdu∂1 K(u, y)∂1 K(v, z)|u − v|(2H −2)(q−1) . i=0
Ii
Ii
Hurst Index Estimation for Self-Similar Processes with Long-Memory
111
It can be shown that the term r2 converges to 0 in L2 ([0, 1]2 ), while f2N converges in L2 ([0, 1]2 ) to the kernel of the Rosenblatt process at time 1, which is by definition
(H (2H − 1))(q−1) d(H)2 N 2H−1 N 2−2H N −1 × |u − v|(2H −2)(q−1) ∂1 K H (u, y)∂1 K H (v, z). i=0
Ii
Ii
This implies, by the isometry property (3.4) between double integrals and L2 ([0, 1]2 ), that the dominant term in VN , i.e. the second-chaos term T2 , converges in L2 (Ω) to the Rosenblatt process at time 1. The reader can consult [6], [7], [27] and [28] for all details of the proof. 5. Comparison & Conclusions In this section, we compare the estimators described in Sections 2 and 4. The performance measure that we adopt is the asymptotic relative efficiency, which we now define according to [24]: Definition 5.7. Let Tn be an estimator of θ for all n and {αn } a sequence of positive numbers such that αn → +∞ or αn → α > 0. Suppose that for some probability law Y with positive and finite second moment, D
αn (Tn − θ) → Y, (i) The asymptotic mean square error of Tn (amseTn (θ)) is defined to be the asymptotic expectation of (Tn − θ)2 , i.e. amseTn (θ) =
EY 2 . αn
(ii) Let Tn be another estimator of θ. The asymptotic relative efficiency of Tn with respect to Tn is defined to be amseTn (θ) eTn ,Tn (θ) = . (5.1) amseTn (θ)
(iii) Tn is said to be asymptotically more efficient than Tn if and only if lim sup eTn ,Tn (θ) ≤ 1, for all θ and n
lim sup eTn ,Tn (θ) < 1, for some θ. n
Remark 5.4. These definitions are in the most general setup: indeed (i) they are not restricted by the usual assumption that the estimators converge to a Normal distribution; moreover, (ii) the asymptotic distributions of the estimators do not have to be the same. This will be important in our comparison later. Our comparative analysis focuses on fBm and the Rosenblatt process, since the maximum likelihood and the wavelets methods cannot be applied to higher order Hermite processes.
112
Alexandra Chronopoulou and Frederi G. Viens
5.1. Variations Estimator vs. mle We start with the case of a filter of order 1 for fBm. Since the asymptotic behavior of the variations estimator depends on the true value of H we consider three different cases: • If H ∈ (0, 3/4), then 2+
∞
k=1 (2k
eHˆ N (α), Hˆ mle (H) =
2H
−(k−1)2H −(k+1)2H )2 √ 2 N log N [2 D(H)]−1 √ N
≈
1 . log N
This implies that lim sup eHˆ N (α),Hˆ mle (H) = 0, ˆ N (α) is asymptotically more efficient than H ˆ mle . meaning that H • If H ∈ (3/4, 1), then eHˆ N (α), Hˆ mle (H) =
4
2H 2 (2H−1) 4H−3 N 1−H log N
[2
D(H)]−1 √ N
≈
N H−1/2 log N
This implies that lim sup eHˆ N (α), Hˆ mle (H) = ∞, ˆ mle is asymptotically more efficient than H ˆ N (α). meaning that H • If H = 3/4, then 16
eHˆ N (α), Hˆ mle (H) = N
√ 9 4 N log N [2 D(H)]−1 √ N
1 ≈ √ log N
Similarly, as in the first scenario the variations estimator is asymptotically more efficient than the mle. ˆ mle we mean either the exact mle or the Whittle approximate Remark 5.5. By H mle, since both have the same asymptotic distribution. Before discussing the above results let us recall the Cram´er-Rao Lower Bound theory (see [24]). Let X = (X1 , . . . , XN ) be a sample (i.e. identically distributed random variables) with common distribution PH and corresponding density function fH . If T is an estimator of H such that E (T ) = H, then V ar (T ) ≥ [I(H)]
−1
where I(H) is the Fisher information defined by * 2 + ∂ I (H) := E log fH (X) . ∂H
(5.2)
(5.3)
Hurst Index Estimation for Self-Similar Processes with Long-Memory
Asymptotic Relative Efficiency (H=0.7)
3.8 3.6 ARE 3.2
3.4
2e−14 0e+00
3.0
1e−14
Variance
3e−14
4.0
4e−14
Asymptotic Variance (H=0.7)
113
0
20
40
60
80
N
ˆ N (bold) ˆ mle (dotted), H H Figure 6.3.
100
0
20
40
60
80
100
N
Asymptotic Relative Efficiency
Comparison of the variations’ estimator and mle for a filter of order p =1.
The inverse of the Fisher information is called Cram´er-Rao Lower Bound. Remark 5.6. It has been proved by Dahlhaus, [9], that the asymptotic variance of both the approximate and exact mle converges to the Cram´er-Rao Lower Bound and consequently both estimators are asymptotically efficient according to the Fisher criterion. Thus how can the variations estimator be more efficient in some cases? The variations-based estimator is computed using data coming from a fixed time horizon and more specifically [0, 1], i.e. data such as Xa = (X0 , X N1 , . . . , X1 ), while the mle is computed using data of the form Xb = (X0 , X1 , . . . , XN ). The timescaling makes a big difference since the vectors Xa and Xb do not have the same distribution. The construction of the Fisher information (and accordingly the asymptotic Cram´er-Rao Lower Bound) depends on the underlying distribution of the sample and it is going to be different for Xa and Xb . This implies that the Cram´er-Rao Lower Bound attained by the mle using Xb is not the same as the Cram´er-Rao Lower Bound attained by the mle using Xa . By the self-similarity property we can derive that Xa =D N H Xb , which indicates that if we want to compute the information matrix for the rescaled data, the scaling contains the parameter H and this will alter the information matrix and its rate of convergence. We begin by observing what happens in practice for a filter of order 1. In the following graphs, we compare the corresponding variations estimator with the mle, in the case of a simulated fractional Brownian motion with H = 0.65, by plotting the asymptotic variance against the sample size N . As we observe in Figure 6.3, the mle performs better than the estimator based on the variations of the process with filter order p = 1. It has a smaller asymptotic variance and the asymptotic relative efficiency seems to converge to zero extremely ˆ N is faster only by slowly, even for a very large sample size N . This is because H
114
Alexandra Chronopoulou and Frederi G. Viens
Asymptotic Relative Efficiency (H=0.7, p=3)
0.11 ARE
0e+00
0.09
0.10
2e−14 1e−14
Variance
3e−14
4e−14
0.12
Asymptotic Variance (H=0.7, p=3)
0
20
40
60
80
100
0
20
N
60
80
100
N
ˆ N (bold) ˆ mle (dotted), H H Figure 6.4.
40
Asymptotic Relative Efficiency
Comparison of the variations’ estimator and mle for a filter of order p = 10.
ˆ N are quite the factor log N (which is quite slow) and the constants in the case of H large. Let us consider now the case of longer filters (p ≥ 2). Using the results proved in the previous sections (esp. Theorem 4.2 part (1.b)), we have that for all H ∈ (0, 1) eHˆ N (α), Hˆ mle (H) ≈ N
1 log N
and from this we conclude that the variations estimator is always asymptotically more efficient than the mle. If we do the same plots as before we can see (Figure 6.4) that the constant is now significantly smaller. 5.2. Variations’ vs. Wavelet Estimator In this subsection we compare the variations and the wavelets estimators for the both the fBm and the Rosenblatt process. fBm: (1) Let 0 < H < 3/4, then for a filter of order p ≥ 1 in the variations estimator, and for any Q ≥ 1 in the wavelets estimator, we have 1 eHˆ N (α), Hˆ wave (H) ≈ . α(N ) log N Based on the properties of α(N ) as stated before (Theorem 2.2), we conclude that lim e ˆ (H) ˆ N →0 HN (α), Hmle
= 0,
Hurst Index Estimation for Self-Similar Processes with Long-Memory
115
which implies that the variations estimator is asymptotically more efficient than the wavelets estimator. (2) When 3/4 < H < 1, then for a filter of order p = 1 in the variations estimator, and Q = 1 for the wavelets estimator, we have eHˆ N (α), Hˆ wave (H) ≈
N 2−2H α(N )2−2H log N
If we choose α(N ) to be the optimal as suggested by Bardet and Tudor in (1−H)(1−2δ) , [1] , i.e. α(N ) = N 1/2+δ for δ small, then eHˆ N (α), Hˆ wave (H) ≈ N log N which implies that the wavelet estimator performs better. (3) When 3/4 < H < 1, then for a filter of order p ≥ 2 in the variations estimator and Q = 1 for the wavelets estimator, using again the optimal choice of α(N ) as proposed in [1], we have 1
eHˆ N (α), Hˆ wave (H) ≈
N ( 2 −H)−2δ(1−H) , log N
so the variations estimator is asymptotically more efficient than the wavelets one. Rosenblatt process: Suppose that 1/2 < H < 1, then for any filter of any order p ≥ 1 in the variations estimator, and any Q ≥ 1 for the wavelets based estimator, we have eHˆ N (α), Hˆ wave (H) ≈
1 α(N )1−H
log N
.
Again, with the behavior of α(N ) as stated in Theorem 2.2, we conclude that the variations estimator is asymptotically more efficient than the wavelet estimator. Overall, it appears that the estimator based on the discrete variations of the process is asymptotically more efficient than the estimator based on wavelets, in most cases. The wavelets estimator does not have the problems of computational time which plague the mle: using efficient techniques, such as Mallat’s algorithm, the wavelets estimator takes seconds to compute on a standard PC platform. However, the estimator based on variations is much simpler, since it can be constructed by simple transformation of the data. Summing up, the conclusion is that the heuristic approaches (R/S, variograms, correlograms) are useful for a preliminary analysis to determine whether long memory may be present, due to their simplicity and universality. However, in order to estimate the Hurst parameter it would be preferable to use any of the other techniques. Overall, the estimator based on the discrete variations is asymptotically more efficient than the estimator based on wavelets or the mle. Moreover, it can be applied not only when the data come from a fractional Brownian motion, but also when they come from any other non-Gaussian Hermite process of higher order.
116
Alexandra Chronopoulou and Frederi G. Viens
Finally, when we apply a longer filter in the estimation procedure, we are able to reduce the asymptotic variance and consequently the standard error significantly. The benefits of using longer filters needs to be investigated further. It would be interesting to study the choice of different types of filters, such as wavelet-type filters versus finite difference filters. Specifically, the complexity introduced by the construction of the estimator based on a longer filter, which is not as straightforward as in the case of filter of order 1, is something that will be investigated in a subsequent article. References 1. J-M. Bardet and C.A. Tudor (2008): A wavelet analysis of the Rosenblatt process: chaos expansion and estimation of the self-similarity parameter. Preprint. 2. J.B. Bassingthwaighte and G.M. Raymond (1994): Evaluating rescaled range analysis for time series, Annals of Biomedical Engineering, 22, 432-444. 3. J. Beran (1994): Statistics for Long-Memory Processes. Chapman and Hall. 4. J.-C. Breton and I. Nourdin (2008): Error bounds on the non-normal approximation of Hermite power variations of fractional Brownian motion. Electronic Communications in Probability, 13, 482-493. 5. P. Breuer and P. Major (1983): Central limit theorems for nonlinear functionals of Gaussian fields. J. Multivariate Analysis, 13 (3), 425-441. 6. A. Chronopoulou, C.A. Tudor and F. Viens (2009): Application of Malliavin calculus to long-memory parameter estimation for non-Gaussian processes. Comptes rendus Mathematique 347, 663-666. 7. A. Chronopoulou, C.A. Tudor and F. Viens (2009): Variations and Hurst index estimation for a Rosenblatt process using longer filters. Preprint. 8. J.F. Coeurjolly (2001): Estimating the parameters of a fractional Brownian motion by discrete variations of its sample paths. Statistical Inference for Stochastic Processes, 4, 199-227. 9. R. Dahlhaus (1989): Efficient parameter estimation for self-similar processes. Annals of Statistics, 17, 1749-1766. 10. R.L. Dobrushin and P. Major (1979): Non-central limit theorems for non-linear functionals of Gaussian fields. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 50, 27-52. 11. P. Flandrin (1993): Fractional Brownian motion and wavelets. Wavelets, Fractals and Fourier transforms. Clarendon Press, Oxford, 109-122. 12. R. Fox, M. Taqqu (1985): Non-central limit theorems for quadratic forms in random variables having long-range dependence. Probab. Th. Rel. Fields, 13, 428-446. 13. Hurst, H. (1951): Long Term Storage Capacity of Reservoirs, Transactions of the American Society of Civil Engineers, 116, 770-799. 14. A.K. Louis, P. Maass, A. Rieder (1997): Wavelets: Theory and applications Pure & Applied Mathematics. Wiley-Interscience series of texts, monographs & tracts. 15. M. Maejima and C.A. Tudor (2007): Wiener integrals and a Non-Central Limit Theorem for Hermite processes, Stochastic Analysis and Applications, 25 (5), 1043-1056. 16. B.B. Mandelbrot (1975): Limit theorems of the self-normalized range for weakly and strongly dependent processes. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 31, 271285. 17. I. Nourdin, D. Nualart and C.A Tudor (2007): Central and Non-Central Limit Theorems for weighted power variations of the fractional Brownian motion. Preprint.
Hurst Index Estimation for Self-Similar Processes with Long-Memory
117
18. I. Nourdin, G. Peccati and A. R´eveillac (2008): Multivariate normal approximation using Stein’s method and Malliavin calculus. Ann. Inst. H. Poincar´ e Probab. Statist., 18 pages, to appear. 19. D. Nualart (2006): Malliavin Calculus and Related Topics. Second Edition. Springer. 20. D. Nualart and G. Peccati (2005): Central limit theorems for sequences of multiple stochastic integrals. The Annals of Probability, 33, 173-193. 21. D. Nualart and S. Ortiz-Latorre (2008): Central limit theorems for multiple stochastic integrals and Malliavin calculus. Stochastic Processes and their Applications, 118, 614-628. 22. G. Peccati and C.A. Tudor (2004): Gaussian limits for vector-valued multiple stochastic integrals. S´eminaire de Probabilit´es, XXXIV, 247-262. 23. G. Samorodnitsky and M. Taqqu (1994): Stable Non-Gaussian random variables. Chapman and Hall, London. 24. J. Shao (2007): Mathematical Statistics. Springer. 25. M. Taqqu (1975): Weak convergence to the fractional Brownian motion and to the Rosenblatt process. Z. Wahrscheinlichkeitstheorie verw. Gebiete, 31, 287-302. 26. C.A. Tudor (2008): Analysis of the Rosenblatt process. ESAIM Probability and Statistics, 12, 230-257. 27. C.A. Tudor and F. Viens (2008): Variations and estimators for the selfsimilarity order through Malliavin calculus. Annals of Probability, 34 pages, to appear. 28. C.A. Tudor and F. Viens (2008): Variations of the fractional Brownian motion via Malliavin calculus. Australian Journal of Mathematics, 13 pages, to appear. 29. P. Whittle (1953): Estimation and information in stationary time series. Ark. Mat., 2, 423-434.
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 119–129
Chapter 7 Modeling Colored Noise by Fractional Brownian Motion
Jinqiao Duan1,2 , Chujin Li2 and Xiangjun Wang2
∗
1. Department of Applied Mathematics Illinois Institute of Technology Chicago, IL 60616, USA E-mail: [email protected] 2. School of Mathematics and Statistics Huazhong University of Science and Technology Wuhan 430074, China E-mail: [email protected]; [email protected] Complex systems are usually under the influences of noises. Appropriately modeling these noises requires knowledge about generalized time derivatives and generalized stochastic processes. To this end, a brief introduction to generalized functions theory is provided. Then this theory is applied to fractional Brownian motion and its derivative, both regarded as generalized stochastic processes, and it is demonstrated that the “time derivative of fractional Brownian motion” is correlated and thus is a mathematical model for colored noise. In particular, the “time derivative of the usual Brownian motion” is uncorrelated and hence is an appropriate model for white noise.
Keywords: White noise, colored noise, stochastic differential equations (SDEs); generalized time derivative, generalized stochastic processes, stationary processes 2000 AMS Subject Classification: 60G20, 60H40, 60H10 Contents 1 What is noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Generalized time derivative and generalized stochastic processes 3 White noise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Colored noises and fractional Brownian motion . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
∗ The
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
120 121 122 125 126
authors gratefully acknowledge the support by the Cheung Kong Scholars Program, the K. C. Wong Education Foundation, Hong Kong, and the NSF grant 0620539. 119
120
Jinqiao Duan, Chujin Li and Xiangjun Wang
1. What is noise Complex systems in science and engineering are often subject to random fluctuations. Although these fluctuations arise from various situations, they appear to share some common features.16,21,26,29–31 They are generally regarded as stationary stochastic processes, with zero mean and with special correlations at different time instants. We assume that stochastic processes are defined for all real time t ∈ (−∞, ∞). Let Xt (ω) be a real-valued stochastic process defined in a probability space (Ω, F , P). We say that Xt is an stationary stochastic process if, for any integer k and any real numbers t1 < t2 < · · · < tk , the distribution of (Xt1 +t , Xt2 +t , · · · , Xtk +t ) does not depend on t, i.e. P ({(ω : Xt1 +t (ω), · · · , Xtk +t (ω)) ∈ A}) = P ({ω : (Xt1 (ω), · · · , Xtk (ω)) ∈ A}) , for every open interval A and all t. Moreover, we say that Xt has stationary increments if, for any integer k and any real numbers t0 < t1 < · · · < tk , the distribution of Xtj − Xtj−1 depends on tj and tj−1 only through the difference tj − tj−1 where
j = 1, ...,k. It means that if tj −tj−1 = ti −ti−1 for some i, j ∈ {1, · · · , k}, then Xtj − Xtj−1 =d Xti − Xti−1 , i.e., the both sides have the same distributions. Such a stationary process Xt is also called a strongly stationary process or a first order stationary process. It is called a weakly stationary process or the second order stationary process if, for any integer k and any real numbers t1 < t2 < · · · < tk , the mean and covariance matrix of (Xt1 +t , Xt2 +t , · · · , Xtk +t ) does not depend on t. Noise is a special stationary stochastic process ηt (ω). Its mean Eηt = 0 and its covariance E(ηt ηs ) = K c(t−s) for all t and s, with some constant K and a function c(·). When c(t − s) is the Dirac Delta function δ(t − s), the noise ηt is called a white noise, otherwise it is a colored noise. For example, Gaussian white noise may be molded in terms of “time derivative” of Brownian motion. Let us first discuss this formally.21,26 Recall that a scalar Brownian motion Bt is a Gaussian process with stationary (and also independent) increments, together with mean zero EBt = 0 and covariance E(Bt Bs ) = t ∧ s = min{t, s}. By the formal formula E(X˙ t X˙ s ) = ∂ 2 E(Xt Xs )/∂t∂s, we see that E(B˙ t B˙ s ) = ∂ 2 E(Bt Bs )/∂t∂s = ∂ 2 (t ∧ s)/∂t∂s = δ(t − s). So the spectral density function for B˙ t , i.e., the Fourier transform F for its covariance function E(B˙ t B˙ s ), is constant 1 . F (E(B˙ t B˙ s )) = F (δ(t − s)) = 2π Moreover, the increments like Bt+∆t − Bt ≈ B˙ t are stationary, and formally, EB˙ t ≈ E Bt+∆t −Bt = 0 = 0. Thus ηt = B˙ t is taken as a mathematical model ∆t
∆t
Modeling Colored Noise by Fractional Brownian Motion
121
for white noise. Note that Brownian motion does not have usual time derivative. It is necessary to interpret B˙ t as a generalized time derivative and make the above argument rigorous. This chapter is organized as follows. In §2, we discuss generalized time derivatives for stochastic processes. We then consider white noise and colored noise in §3 and §4, respectively. 2. Generalized time derivative and generalized stochastic processes We first consider the Heaviside function 3 1 if t ≥ 0, H(t) = 0 otherwise .
(2.1)
This function is certainly not C 1 (R), but if it had a (generalized) derivative H (t), then by formal integration by parts, H (t) should satisfy +∞ +∞ +∞ H (t)v(t)dt = − H(t)v (t)dt = − v (t)dx = v(0) − v(+∞) = v(0) −∞
−∞
0
for every test function v ∈ C01 (R) (i.e., smooth functions with compact support on R). Since +∞H (t) ≡ 0 for t = 0, this suggests that “H (0) = ∞” in such a way that −∞ H (t)v(t)dt = v(0). Of course, no true function behaves like this, so we call H (t) a generalized function. Notice that this process enables us to take more derivatives of H(t): +∞ +∞ +∞ H (t)v(t)dt = H(t)v (t) = v (t)dt = −v (0), −∞
+∞
−∞
−∞
H (t)v(t)dt = −
+∞
−∞
0
+∞
H(t)v (t) =
v (t)dt = −v (0),
0
provided v is sufficiently smooth and has compact support, say v ∈ C0∞ (R). The above computation of the derivatives is via formal integration by parts against a function in C0∞ (R). This is called generalized differentiation, and the function in C0∞ (R) are called test functions. We define the delta function, as a generalized function, to be the object δ(t) so that formally +∞ δ(t)v(t)dt = v(0), (2.2) −∞
C0∞ (R).
for every test function v ∈ Hence we find H (t) = δ(t). Also note that we can view the generalized function δ(t) as a linear functional on the test space C0∞ (R), which assigns each v a real value:
122
Jinqiao Duan, Chujin Li and Xiangjun Wang
F (v) :=
+∞
−∞
δ(t)v(t)dt = v(0).
Notice that we can take any order of generalized derivatives of δ(t). In fact, for any positive integer m, Dm δ(t) is the object satisfying Dm δ(t)v(t)dt = (−1)m δ(t)Dm v(t)dt = (−1)m Dm v(0). Also, we can translate the singularity in δ(t) to any point µ by letting δµ (t) = δ(t − µ) so that a change of variables y = t − µ yields +∞ +∞ +∞ δµ (t)v(t)dt = δ(t − µ)v(t)dt = δ(y)v(y + µ)dy = v(µ). −∞
−∞
−∞
We have thus defined a special generalized function, the delta function, and its generalized derivatives through their actions on test functions. In fact, this procedure applies to other generalized functions as well. A generalized function F is a linear functional on the test space, i.e., a linear mapping F : C0∞ (R) → R such that F (vj ) → 0 for every sequence vj ⊂ C0∞ (R) with support in a fixed compact set K ⊂ R and whose derivatives Dm vj → 0 uniformly in K, as j → ∞. If F and Fj are generalized functions in R, then Fj → F as generalized functions provided that Fj (v) → F (v) for every v ∈ C0∞ (R). The support of a generalized function F in R is the smallest closed set K ⊂ R such that F (v) = 0 whenever v ≡ 0 in a neighborhood of K. For example, F (v) = R v(t)f (t)dt with f (t) given and v ∈ C0∞ (R), is a generalized function. There are other generalized functions not defined by such integrals; see28 for more information. A generalized stochastic process ηt (ω) is a generalized function in time t, almost surely. In the next two sections, we will consider generalized stochastic processes as mathematical models for white noise and colored noise. 3. White noise In engineering, white noise is generally understood as a stationary process ξt , with zero mean Eξt = 0 and constant spectral density f (λ) on the entire real axis. See21,31 for discussions of white noise in engineering and see7,8 for more applications. If Eξs ξt+s = C(t) is the covariance function of ξt , then the spectral density is defined as Fourier transform of the covariance function C(t) +∞ 1 K f (λ) = e−iλt C(t)dt = 2π −∞ 2π where K is a positive constant. This relation holds for a (generalized) stochastic process ξt with covariance function C(t) = δ(t), the Dirac delta function. This
Modeling Colored Noise by Fractional Brownian Motion
123
says that white noise is a stationary stochastic process that has zero mean and is uncorrelated at different time instants. We will see that ξt = B˙ t is such a stochastic process. If white noise ξt ’s covariance Cov(ξt , ξs ), and B˙ t ’s covariance Cov(B˙ t , B˙ s ) are the same, then we can take B˙ t as a mathematical model for white noise ξt . This is indeed the case, but we have to verify this in the context of generalized functions, t because dB dt has no meaning in the sense of ordinary functions. In fact, white noise was first correctly described in connection with the theory of generalized functions.2 From the last section, we know that +∞ ϕ(t)f (t)dt, (3.1) Φf (ϕ) = −∞
defines a generalized function for a given f . The function Φf depends linearly and continuously on test functions ϕ. It is the generalized function corresponding to f . With this representation we regard f as a generalized function. In fact, we may identify f with this linear functional Φf . In particular, the generalized function defined by Φ(ϕ(t)) = ϕ(t0 ), with a fixed t0 , for ϕ ∈ C0∞ (R), is called the Dirac delta function, and is also symbolically denoted by δ(t−t0 ). In contrast with classical functions, generalized functions always have derivatives of every order, which again are generalized functions. By ˙ of Φ, we mean the generalized function defined by the derivative Φ ˙ Φ(ϕ) = −Φ(ϕ). ˙ A generalized stochastic process is now simply a random generalized function in the following sense: for every test function ϕ, a random variable Φ(ϕ) is assigned such that the functional Φ is linear and continuous. A generalized stochastic process is said to be Gaussian if, for arbitrary linearly independent functions ϕ1 , · · · ϕn ∈ K, the random variable (Φ(ϕ1 ), · · · Φ(ϕn )) is normally distributed. Just as in the classical case, a generalized Gaussian process is uniquely defined by the continuous linear mean functional EΦ(ϕ) = m(ϕ) and the continuous bilinear positive-definite covariance functional E(Φ(ϕ) − m(ϕ))(Φ(ψ) − m(ψ)) = C(ϕ, ψ). One of the important advantages of a generalized stochastic process is the fact that its derivative always exists and is itself a generalized stochastic process. In fact, the ˙ of Φ is the process defined by setting derivative Φ ˙ Φ(ϕ) = −Φ(ϕ). ˙
124
Jinqiao Duan, Chujin Li and Xiangjun Wang
The derivative of a generalized Gaussian process with mean m(ϕ) and covariance C(ϕ, ψ) is again a generalized Gaussian process and it has mean m(ϕ) ˙ = −m(ϕ) ˙ ˙ and covariance C(ϕ, ˙ ψ). Now let us look at Brownian motion Bt and its derivative. The generalized stochastic process corresponding to Bt is the following linear functional +∞ ϕ(t)Bt dt, Φ(ϕ) = −∞
C0∞ (R).
With this representation we regard Bt as a generalized stochastic for ϕ ∈ process. In fact, we may identify Bt with this linear functional Φ. We conclude that the mean functional m(ϕ) ≡ 0 and the covariance functional
∞
∞
min(t, s) ϕ(t)ψ(s)dtds.
C(ϕ, ψ) = 0
0
After some elementary manipulations and integration by parts, we get ∞ ˆ − ψ(∞))dt, ˆ C(ϕ, ψ) = (ϕ(t) ˆ − ϕ(∞))( ˆ ψ(t) 0
where
t
ϕ(t) ˆ =
∞
ˆ = ϕ(s)ds and ψ(t)
0
t
ψ(s)ds. 0
0
+∞ The derivative of Bt , i.e., derivative of Φ(ϕ) = −∞ ϕ(t)Bt dt, is also a generalized Gaussian process with mean m(ϕ) ˙ = 0 = −m(ϕ) ˙ and covariance ∞ ˙ = ˙ ϕ(t)ψ(s)dt. C(ϕ, ψ) = C(ϕ, ˙ ψ) 0
This formula can be put in the form ∞ ˙ C(ϕ, ψ) = 0
∞
δ(t − s)ϕ(t)ψ(s)dtds.
0
Therefore, the covariance of the derivative of Brownian motion Bt is the generalized stochastic process with mean zero and covariance ˙ t) = δ(t − s). C(s, But this is the covariance for white noise ξt . Thus, B˙ t is taken as a mathematical model for white noise ξt . This justifies the notation ξt = B˙ t
Modeling Colored Noise by Fractional Brownian Motion
125
frequently used in engineering literature and occasionally in stochastic differential equations. We may also write t ξt ds. Bt = 0
We may say that a Gaussian white noise ξt is a generalized Gaussian stochastic process Φξ with mean zero and covariance functional +∞ Cξ (ϕ, ψ) = ϕ(t)ψ(t)dt. −∞
4. Colored noises and fractional Brownian motion Random fluctuations in complex systems may not be uncorrelated (i.e., may not be white noise). In fact, most fluctuations ξt are correlated5,13,14 and thus the covariance E(ξt ξs ) = c(t − s) is usually not a delta function. In this case we call the stationary process ξt a colored noise. We usually take its mean to be zero, otherwise we define a new stationary process by subtracting the mean. Since covariance function c(t − s) may be arbitrary, there are many colored noises in principle. For example, the Ornstein-Uhlenbeck process, as the solution of a linear Langevin equation, is often used as a model for colored noise. We here discuss a model of colored noise in terms of fractional Brownian motion (fBM). It has been known that such colored noise arise in mathematical modeling of missing mechanisms or unresolved scales in various complex systems.4–6 The fractional Brownian motion B H (t), indexed by a so called Hurst parameter H ∈ (0, 1), is a generalization of the more well-known process of the usual Brownian motion B(t). It is a zero-mean Gaussian process with stationary increments. However, the increments of the fractional Brownian motion are not independent, except in the usual Brownian motion case (H = 12 ). For more details, see.10,22–24 Definition of fractional Brownian motion: For H ∈ (0, 1), a Gaussian process B H (t) is a fractional Brownian motion if it starts at zero B H (0) = 0, a.s., has mean zero E[B H (t)] = 0, and has covariance E[B H (t)B H (s)] = 12 (|t|2H + |s|2H − |t − s|2H ) for all t and s. The standard Brownian motion is a fractional Brownian motion with Hurst parameter H = 12 . Some properties of fractional Brownian motion: A fractional Brownian motion B H (t) has the following properties: (i) It has stationary increments; (ii) When H = 1/2, it has independent increments; (iii) When H = 1/2, it is neither Markovian, nor a semimartingale. As for the covariance for the generalized derivative of fractional Brownian noise, H ˙ Bt , it is complicated due to its non-Markovian nature. Recall that a function f in
126
Jinqiao Duan, Chujin Li and Xiangjun Wang
C ∞ (R) is a Schwartz function if it goes to zero as |x| → ∞ faster than any inverse H H Bt (here and below M− and power of x, as do all its derivatives. Since B˙ tH = M− H 23 M+ are operators defined in ), for any Schwartz functions f and g, we have EB˙ H B˙ H , f ⊗ g = E f (t)M H B˙ t dt g(s)M H B˙ s ds t
s
R
−
R
−
H H = E M+ f (t)dBt M+ g(s)dBs R R H H = M+ f (t)M+ g(t)dt R
H H = δ(t − s), M+ f ⊗ M+ g H H ⊗ M− δ(t − s), f ⊗ g . = M−
This implies that B˙ tH is correlated or colored noise. We use the Weierstrass-Mandelbrot function to approximate the fractional Brownian motion. The basic idea is to simulate fractional Brownian motion by randomizing a representation due to Weierstrass. Given the Hurst parameter H with 0 < H < 1, we define the function w(t) to approximate the fractional Brownian motion: ∞ Cj rjH sin(2πr−j ti + dj ) w(ti ) = j=−∞
where r = 0.9 is a constant, Cj ’s are normally distributed random variables with mean 0 and standard deviation 1, and the dj ’s are uniformly distributed random variables in the interval 0 ≤ dj < 2π. The underlying theoretical foundation for this approximation can be found in.27 Three figures here show a few sample paths of the fractional Brownian motion with Hurst parameters H = 0.25, 0.5, 0.75, respectively. References 1. L. Arnold, Random Dynamical Systems, Springer, New York,1998. 2. L. Arnold, Stochastic DifferentiL Equations, John Wiley & Sons, New York, 1974. 3. A. Berlinet and C. Thomas-Agnan, Reproducing Kernel Hilbert Spaces in Probability and Statistics, Kluwer Academic Publishers, 2004. 4. A. Du and J. Duan. A stochastic approach for parameterizing unresolved scales in a system with memory. Journal of Algorithms & Computational Technology 3(2009), 393-405. 5. J. Duan, Stochastic Modeling of Unresolved Scales in Complex Systems. Frontiers of Math. in China 4 (2009), 425-436. 6. J. Duan, Quantifying model uncertainty by correlated noise, Oberwolfach Reports, vol. 5, (2008). 7. J. Duan, Predictability in Spatially Extended Systems with Model Uncertainty I & II: Engineering Simulation p17–32, No. 2 & p21–35, No. 3, 31 (2009). 8. J. Duan, Predictability in Nonlinear Dynamical Systems with Model Uncertainty. In Stochastic Physics and Climate Modeling, T. N. Palmer and P. Williams (eds.), Cambridge Univ. Press, 2009.
Modeling Colored Noise by Fractional Brownian Motion
127
9. J. Duan, X. Kan and B. Schmalfuss, Canonical Sample Spaces for Stochastic Dynamical Systems. In “Perspectives in Mathematical Sciences”, Interdisciplinary Math. Sci. Vol. 9, 2009, pp.53-70. 10. T. E. Duncan, Y. Z. Hu, B. Pasik-Duncan, Stochastic Calculus for Fractional Brownian Motion. I: Theory. SIAM Journal on Control and Optimization 38 (2000), 582-612. 11. C. W. Gardiner, Handbook of Stochastic Methods. Second Ed., Springer, New York, 1985. 12. J. Garcia-Ojalvo and J. M. Sancho, Noise in Spatially Extended Systems. SpringerVerlag, 1999. 13. P. Hanggi, Colored Noise in Dynamical Systems: A Functional Calculus Approach, in: Noise in Nonlinear Dynamical Systems, vol. 1, F. Moss and P. V. E. McClintock, eds., chap. 9, pp. 307328, Cambridge University Press,1989. 14. P. Hanggi and P. Jung, Colored Noise in Dynamical Systems. Advances in Chem. Phys., 89(1995), 239-326. 15. T. Hida. Brownian Motion. Springer, New York, 1974. 16. W. Horsthemke and R. Lefever, Noise-Induced Transitions, Springer-Verlag, Berlin, 1984. 17. Z. Huang and J. Yan, Introduction to Infinite Dimensional Stochastic Analysis. Science Press/Kluwer Academic Pub., Beijing/New York, 1997. 18. M. James, F. Moss and P. Hanggi, Switching in the Presence of Colored Noise: The Decay of an Unstable State, Phys. Rev. A 38, 46904695 1988. 19. W. Just, H. Kantz, C. Rodenbeck and M. Helm, Stochastic modelling: replacing fast degrees of freedom by noise. J. Phys. A: Math. Gen., 34 (2001),3199–3213. 20. I. Karatzas and S. E. Shreve, Brownian Motion and Stochastic Calculus 2nd, Springer 1991. 21. V. Krishnan, Nonlinear Filtering and Smoothing: An Introduction to Martingales, Stochastic Integrals and Estimation. John Wiley & Sons, New York, 1984. 22. B. Maslowski and B. Schmalfuss, Random dynamical systems and stationary solutions of differential equationsdriven by the fractional Brownian motion. Stoch. Anal. Appl., Volume 22, Issue 6 January 2005, pages 1577 - 1607. 23. Y. S. Mishura, Stochastic calculus for fractional Brownian motion and related processes, Springer, New York, 2008. 24. D. Nualart, Stochastic calculus with respect to the fractional Brownian motion and applications. Contemporary Mathematics 336, 3-39, 2003. 25. B. Oksendal, Stochastic differential equations: An introduction with applications, Sixth edition, Springer, New York, 2003. 26. A. Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill Companies; 2nd edition, 1984. 27. V. Pipiras and M. S. Taqqu, Convergence of the Wererstrass-Mandelbrot process to fractinal Brownian motion. Fractals Vol. 8, No.4, (2000), 369-384 . 28. M. Renardy and R. Rogers, Introduction to Partial Differential Equations, SpringerVerlag, 1993. 29. N. G. Van Kampen, How do stochastic processes enter into physics? Lecture Note in Phys. 1250 (1987) 128–137. 30. N. G. Van Kampen, Stochastic Processes in Physics and Chemistry. North-Holland, New York, 1981. 31. E. Wong and B. Hajek. Stochastic Processes in Engineering Systems. Spring-Verlag, New York, 1985.
128
Jinqiao Duan, Chujin Li and Xiangjun Wang
2 1.5 1
BH
0.5 t
0 −0.5 −1 −1.5 −2
Figure 7.1.
0
5
10
15 t
20
25
30
A sample path of fractional Brownian motion B H (t), with H = 0.25
2
15
1
Bt
05
0
−0 5
−1
−1 5
−2
0
5
10
15 t
20
25
30
A sample path of Brownian motion B(t); namely, fractional Brownian motion with H = 0.5
Figure 7.2.
Modeling Colored Noise by Fractional Brownian Motion
129
2 1.5 1
BH
0.5 t
0 −0.5 −1 −1.5 −2
Figure 7.3.
0
5
10
15 t
20
25
30
A sample path of fractional Brownian motion B H (t), with H = 0.75
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 131–142
Chapter 8 A Sufficient Condition for Non-Explosion for a Class of Stochastic Partial Differential Equations Hongbo Fu1 , Daomin Cao2 and Jinqiao Duan3,1
∗
1. School of Mathematics and Statistics, Huazhong University of Science and Technology Wuhan 430074, China E-mail: hbfu [email protected] 2. Institute of Applied Mathematics, Chinese Academy of Sciences Beijing, 100190, China E-mail: [email protected] 3. Department of Applied Mathematics, Illinois Institute of Technology Chicago, IL 60616, USA E-mail: [email protected] To facilitate random dynamical systems approach for stochastic partial differential equations arising as mathematical models for complex systems under fluctuations in science and engineering, global existence and uniqueness of mild solutions for a class of stochastic partial differential equations with local Lipschitz coefficients are considered. A sufficient condition for non-explosion, or global existence and uniqueness, of mild solutions is provided and a few examples are presented to demonstrate the result.
Keywords : Stochastic partial differential equations (SPDEs); Mild solution; Wellposedness; Local Lipschitz condition 2000 AMS Subject Classification : 60H40 Contents 1 Introduction 2 Preliminaries 3 Main results . 4 Examples . . References . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
∗ The
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
132 134 136 140 142
authors gratefully acknowledge the support by K. C. Wong Education Foundation, Hong Kong, the Science Fund for Creative Research Groups of Natural Science Foundation of China (No.10721101), and the NSF grant 0620539. 131
132
Hongbo Fu, Daomin Cao and Jinqiao Duan
1. Introduction Mathematical models for scientific and engineering systems are often subject to uncertainties, such as fluctuating forcing, random boundary conditions and uncertain parameters. Moreover, some biological, chemical or physical processes are not well understood, not well-observed, and thus are difficult to be represented in the mathematical models. These missing processes or mechanisms may be very small in spatial scale and fast in temporal scale, but their impact on overall system evolution may be delicate or uncertain [1]. Therefore it is important to take such uncertainties into account. Mathematical models for randomly influenced spatio-temporal dynamical systems are usually in the form of stochastic partial differential equations (SPDEs). To facilitate dynamical systems approach for SPDEs, we need to establish global existence and uniqueness of solutions for SPDEs. One method is the variational approach as presented in [13, 11]. Another approach is the semigroup approach as in [7, 3]. In this chapter, we present the semigroup approach for global existence and uniqueness of mild solutions for SPDEs, through a few examples. We provide a sufficient condition for global existence, the Assumption (B) and Theorem 3.2 in §3. This condition is different from those in Da Prato-Zabczyk [7] or Chow [3]. For deterministic partial differential equations, the semigroup approach for wellposedness is presented in, for example, [10, 4, 6]. Let D ⊂ Rd (d ≥ 1) be a bounded domain with smooth boundary ∂D. We consider the following nonlinear stochastic partial differential equations(SPDEs) ∂ ∂ ∂t u(t, x) = (κ∆ − α)u(t, x) + f (u(t, x)) + σ(u(t, x)) ∂t w(t, x), (1.1) u| = 0, u(x, 0) = h(x). ∂D
where t > 0, x ∈ D, κ and α are positive, ∆ =
d i=1
∂2 ∂x2i
is the Laplace operator and
h(x) is a given function in L (D) which be denoted by H with inner product (·, ·) and norm · . The coefficients f , σ : R → R are given measurable functions and w(t, x) is a H-valued R-Wiener process to be defined below. Such equations model a variety of phenomena in many fields, such as biology, quantum field, neurophysiology and so on, see [9, 14, 8]. Throughout this paper, we shall assume a complete probability space (Ω, F , {Ft }t≥0 , P) equipped with the filtration {Ft }t≥0 , which satisfies the usual conditions, such that F0 contains all P-null sets. For our system we present some information on R-Wiener process [3]. Let R be a linear integral operator defined, for all φ ∈ H, by r(x, y)φ(y)dy x ∈ D, R(φ)(x) = 2
D
Sufficient Condition for Non-Explosion for SPDEs
133
nuclear r(x, y) = r(y, x) is positive and square integral such that the integral where 2 |r(x, y)| dxdy < ∞. Then the eigenvalues {µk }k≥1 of R are positive and the D D normalized eigenfunctions {φk }k≥1 form a complete orthonormal basis for H. In ∞ the case that eigenvalues satisfy µk < ∞ , the R-Wiener process in H has an k=1
infinite series representation w(t, x) =
∞ √
µk wtk φ(x),
k=1
where {wtk }k≥1 is an independent sequence of real-valued Wiener process. Now let us state the formal problem in Eq. (1.1) in its rigorous meaning. A predictable random field (see [5] for definition) {u(t, x), t ≥ 0, x ∈ D} is called a mild solution of the Eq. (1.1) if t G(t, x, y)h(y)dy + G(t − s, x, y)f (u(s, y))dyds D D 0 t G(t − s, x, y)σ(u(s, y))dyw(y, ds), P − a.s. +
u(t, x) =
0
(1.2)
D
for each t ≥ 0, where G(t, x, y) stands for the fundamental solution of the heat ∂ equation of ∂t u(t, x) = (κ∆ − α)u(t, x) with the boundary conditions specified ∞ e−λk t ek (x)ek (y), here before. In fact, G(t, x, y) can be expressed as G(t, x, y) = k=1
{ek (x)}k≥1 denote the complete orthornormal system of eigenfunctions in H such that, for k = 1, 2 · · · , (−κ∆ + α)ek = λk ek , ek |∂D = 0,
(1.3)
with α ≤ λ1 ≤ λ2 ≤ · · · λk ≤ · · · . If we set (Gt g)(x) = D G(t, x, y)g(y)dy, g ∈ H. Then Gt is a semigroup on H; see [10] for details. Let us also write u(t) = u(t, ·) and dwt = w(·, dt), then Eq. (1.2) can be written as u(t) = Gt h +
t
Gt−s f (u(s))ds + 0
t
Gt−s σ(u(s))dws , P − a.s.
(1.4)
0
Global existence and uniqueness for mild solution of the SPDE Eq. (1.4) under global Lipschitz condition and linear growth on the coefficients f and σ have be studied in [7]. Here we present global existence and uniqueness results for mild solutions of above mentioned SPDE under local Lipschitz condition, mainly following [3] but with a different sufficient condition to guarantee global existence; see the Assumption (B) in §3. After some preliminaries in §2, we present a global wellposedness result in §3, and discuss a few examples in §4.
134
Hongbo Fu, Daomin Cao and Jinqiao Duan
2. Preliminaries Let T > 0 be a fixed number and denote by L2 (Ω, C([0, T ]; H)) the space of all H-valued Ft -adapt processes X(t, ω) defined on [0, T ] × Ω which are continuous in 1 t for a.e. fixed ω ∈ Ω and for which X(·, ·)T = {E sup X(t, ω)2} 2 < ∞, then 0≤t≤T
the space L2 (Ω, C([0, T ]; H)) is a Banach space with the norm · T . Let L2 (Ω×[0, T ]) be the space of all H-valued predictable process X(t, ω) defined 1 T on [0, T ] × Ω and for which X(·, ·)2 = {E 0 X(t, ω)2dt} 2 , then the space L2 (Ω × [0, T ]; H) is also a Banach space with the norm · 2 . Set A := −κ∆+α and we can use (1.3) to define its fractional power for γ ∈ (0, 1) by natural expression (see [12]) Aγ u =
∞
λγi (u, ei )ei ,
i=1
when the right hand is well defined. For more information on fractional power we infer to [10] or [4]. Let H γ denote the domain of Aγ in H, i.e. H γ = {u ∈ ∞ H; λγi (u, ei )ei convergences in H}, then H γ is a Banach space endowed with i=1
norm uγ := {
∞
1
j=1
2 2 γ1 λ2γ ⊂ H γ2 continuously for any j (u, ej ) } . It is clear that H
0 < γ2 < γ1 < 1. 1 Set σR := ( D r(x, x)σ 2 (x)dx) 2 and assume that r(x, y) ≤ r0 for some positive number r0 , where r(x, y) is the integral nuclear of integral operator R. Before starting to prove our main theorem, for the reader’s convenience, we shall formulate some foundamental inequalities with γ ∈ (0, 12 ], which will be used in the proofs (See [3]). Lemma T 2.1. Suppose h ∈ H and v(t, ·) is a predictable random field in H such that E 0 v(t, ·)2 dt < ∞, then the following inequalities hold: sup Gt h ≤ h, 0≤t≤T
E sup 0≤t≤T
t
(2.1)
T
Gt−r v(r, ·)dr ≤ T E 2
0
v(t, ·)2 dt,
(2.2)
0
Lemma 2.2. Suppose that σ(t, ·) is a predictable random field in H such that T T E 0 D r(x, x)σ 2 (t, x)dxdt = E 0 σ(t)2R dt < ∞, then we have t T 2 Gt−s σ(s, ·)dws ≤ 16E σ(t)2R dt. (2.3) E sup 0≤t≤T
0
0
By the same techniques to proof of Lemma 2.1 and Lemma 2.2, we have the following two Lemmas. Lemma h ∈ H and v(t, ·) is a predictable random field in H such T 2.3. Suppose 1 1 t 2 that E 0 v(t, ·) dt < ∞, then Gt h ∈ H 2 , 0 Gt−s v(s, ·)ds is a H 2 -valued random
Sufficient Condition for Non-Explosion for SPDEs
135
field and the following inequalities hold: 1 Gt h2γ ≤ 1−2γ h2 , α T 1 Gt h2γ dt ≤ h2 , 1−2γ 2α 0 T t Gt−s v(s, ·)ds2γ dt ≤ E
(2.4) (2.5)
T
T
E v(t, ·)2 dt. 2α1−2γ 0 0 0 T t 1 Gt−s v(s, ·)ds2γ ≤ E v(t, ·)2 dt. E sup 2 0 0≤t≤T 0
(2.6) (2.7)
Lemma 2.4. Suppose that σ(t, ·) is a predictable random field in H such that T T t E 0 D r(x, x)σ 2 (t, x)dxdt = E 0 σ(t)2R dt < ∞, then 0 Gt−s σ(s, ·)dws is a 1 H 2 -valued process and we have
T
E
0
0
t
1
Gt−s σ(s, ·)dws 2γ dt ≤
E 2α1−2γ
T
σ(t)2R dt.
(2.8)
σ(s)2γ ds, 0 ≤ t ≤ T.
(2.9)
0
Moreover, if σ(t, ·) is a H γ -valued process, then we have E 0
t
Gt−s σ(s, ·)dws 2γ ≤ r0 E
t
0
We point out that the proofs for Lemma 2.3 and Lemma 2.4 are straightforward. For instance, consider the inequality (2.8) as follows: E 0
T
0
t
Gt−s σ(s, ·)dws 2γ dt
∞ T
=E 0
T
= 0
=
j=1
T
0
≤ =
∞
t
∞
1 E 2α1−2γ E 2α1−2γ
Gt−s σ(s, ·)dws , ej )2 dt
0
2 t λ2γ e−λj (t−s) (ej , σ(s, ·))dws dt j E 0
j=1
j=1
1
λ2γ j (
λ2γ−1 E j
∞ T 0
λj e−2λj (t−s) (Qs ej , ej )ds dt
0
(Qt ej , ej )dt
j=i T
0
t
σ(t)2R dt.
Here Qs denotes the local characteristic operator (see [3] Chapter 3, Sect σ(s)dw tion 2) for H-valued martingale s , which is defined by (Qt g, h) = 0 D D r(x, y)σ(t, x)σ(t, y)g(x)h(y)dxdy, for any g, h ∈ H.
136
Hongbo Fu, Daomin Cao and Jinqiao Duan
3. Main results We consider the SPDE (1.1) or its equivalent mild form (1.2), in two separate cases: global Lipschitz coefficients and local Lipschitz coefficients. First we consider the global Lipschitz case. We make following global Lipschitz assumption on coefficients f and σ in Eq. (1.2): (H) Global Lipschitz condition: f (r) and σ(r) are real-valued, measurable functions defined on R, and there exist positive constants β, r1 and r1 such that for any u, v ∈ H γ f (u) − f (v)2 ≤ βu − v2 + r1 u − v2γ , σ(u) − σ(v)2R ≤ βu − v2 + r2 u − v2γ . In fact from (H) it is clear that there exits a constant C such that for any u ∈ H γ , the following sublinear growth condition holds: f (u)2 + σ(u)2R ≤ C(1 + u2 + u2γ ).
(3.1)
Remark 3.1. Note that for non-autonomous SPDEs, i.e., when f and σ in SPDE (1.1) explicitly depend on time t, the global Lipschitz condition (H) does not usually imply the sublinear growth condition. In that case, we need to impose the additional sublinear growth condition as in [3, 7]. Remark 3.2. Since H γ ⊂ H continuously, (3.1) and the inequalities in (H) can be rewritten to concise forms, which the norm · is dominated by the norm · γ . Remark 3.3. In case of γ = 12 , it is proved in [3] that the assumption (H) on f and σ ensure the global existence and uniqueness of the solution of Eq. (1.2) if r1 + r2 < 1 holds. From now, we shall restrict that 0 < γ < 12 , then we have the following theorem when we discuss Eq. (1.2) on a finite time interval [0, T ] for any fixed T > 0. The following result is essentially in [7, 3], but since we consider more regular mild solutions, the proof is thus modified. Theorem 3.1. [Wellposedness under global Lipschitz condition] Assume that the global Lipschitz condition (H) holds and consider the SPDE (1.2) with initial data h ∈ H γ for 0 < γ < 12 . Then there exists a unique solution u, as an adapt, continuous process in H. Moreover for any T > 0, u belongs to L2 (Ω; C([0, T ]; H)) L2 (Ω × [0, T ]; H γ ) such that T E{ sup u(t)2 + u(t)2γ dt} < ∞. 0≤t≤T
0
Sufficient Condition for Non-Explosion for SPDEs
137
Proof. We choose some sufficiently small T0 < T and denote by YT0 the set of predictable random field {u(t)}0≤t≤T belong to L2 (Ω; C([0, T0 ]; H)) L2 (Ω × [0, T0 ]; H γ ) and for which T0 1 uT0 = {E( sup u(t)2 + u(t)2γ dt)} 2 < ∞, 0≤t≤T0
0
then YT0 is a Banach space with norm · T0 . Let Γ denote a mapping in YT0 defined by t t Gt−s f (u(s))ds + Gt−s σ(u(s))dws , t ∈ [0, T0 ]. Γt u = Gt h + 0
0
We first verify that Γ : YT0 → YT0 is well-defined and bounded. In the following, C will denotes a positive constant whose values might change from line to line. It follows form (2.1) and (2.5) that T0 Gt h2γ dt ≤ C h2 . (3.2) E sup Gt h2 + E 0≤t≤T0
0
Let {u(s)}t∈[0,T0 ] ∈ YT0 and use (2.2) ,(2.6) firstly, then (3.1). We obtain t T0 t 2 Gt−s f (u(s))ds + E Gt−s f (u(s))ds2γ dt E sup 0≤t≤T0
0
0
≤ CE
0
T0
f (u(s))2 ds
0
≤ CE
0
T0
(1 + u(s)2 + u(s)2γ )ds
≤ C (1 + u(s)2T0 ).
(3.3)
Similarly, by making use of (2.3), (2.8) and (3.1), it is easy to check t T0 t Gt−s σ(u(s))dws 2 + Gt−s σ(u(s))dws 2γ dt ≤ C (1+u(s)2T0 ). E sup 0≤t≤T0
0
0
0
(3.4) From (3.2), (3.3) and (3.4), it follows that Γ : YT0 → YT0 is well-defined and bounded. To show Γ is a contraction operator in YT0 , we introduce another equivalent norm · µ,YT0 in YT0 as follow: T0 1 2 u(t)2γ dt)} 2 , uµ,T0 = {E( sup u(t) + µ 0≤t≤T0
0
where µ is a parameter, then for u, v ∈ YT0 , Γu − Γv2µ,T0 = E{ sup Γt u − Γt v2 + µ 0≤t≤T0
0
T0
Γt u − Γt v2γ dt}.
138
Hongbo Fu, Daomin Cao and Jinqiao Duan
By making use of (2.2), (2.3) and simple inequality (a + b)2 ≤ C a2 + (1 + )b2 with C = 1+ for any > 0, we get t E sup Γt (u) − Γt (v)2 = E sup { Gt−s (f (u(s)) − f (v(s)))ds 0≤t≤T0
0≤t≤T0 t
+
0
Gt−s (σ(u(s)) − σ(v(s)))dws 2 }
0
≤ C T0 E
T0
0
f (u(s)) − f (v(s))2 dt
+ 16(1 + )E 0
T0
σ(u(s)) − σ(v(s))2R dt.
Similarly, by making use of (2.6), (2.8) and the simple inequality mentioned above, we obtain T0 T0 T0 C E Γt u − Γt v2γ dt ≤ E f (u(s)) − f (v(s))2 dt 1−2γ 2α 0 0 T0 1+ E σ(u(s)) − σ(v(s))2R dt. + 2α1−2γ 0 Hence, by assumption (H), we get T0 C T0 µ )E (βu(s) − v(s)2 + r1 u(s) − v(s)2γ )ds 2α1−2γ 0 T0 (1 + )µ )E (βu(s) − v(s)2 + r2 u(s) − v(s)2γ )ds + (16(1 + ) + 2α1−2γ 0 T0 2 u(s) − v(s)2γ ds, ≤ ρ1 E sup u(t) − v(t) + ρ2 µE
Γ(u) − Γ(v)2µ,T0 ≤ (C T0 +
0≤t≤T0
0
here T0 µ µ T0 + + ), 2 2 T0 16 r1 + r2 1 T0 ρ2 = (1 + )( + + + )( 1−2γ ) 2 µ 2 µ α ρ1 = β(1 + )T0 (16 +
with µ =
µ α1−2γ
.
+r2 < 1. If not that, choose M > 0 such Note that we can always assume that αr11−2γ r1 +r2 that (α+M)1−2γ < 1, and rewrite Equation (1.1) as
∂ ∂ u(t, x) = [κ∆ − (α + M )]u(t, x) + [f (u(t, x)) + M · u(t, x)] + σ(u(t, x)) w(t, x), ∂t ∂t it is clear that (H) holds with β, replaced by β + M 2 . So it is possible to choose µ sufficiently large and , T0 sufficiently small such that ρ = ρ1 ∨ ρ2 < 1, which implies that Γ is a contraction operator in YT0 . This means that there exists a
Sufficient Condition for Non-Explosion for SPDEs
139
unique local solution of the Eq. (1.2) over [0, T0 ], the solution can be extend over the finite interval [0, T ] by standard arguments. The proof of the theorem is completed. Remark 3.4. To show that it is possible to choose the parameters µ , and T0 to 1 and µ = 642 , then we have ρ2 ≤ (1 + )( 12 + 6413 + make ρ < 1, we first let T0 < 64 1 16 3 1 128 + 643 ), this yields ρ2 < 4 if we choose = 64 . It is clear that ρ1 can be make less than 34 by taking T0 to be sufficiently small. In Theorem 3.1, if the global Lipschitz condition on coefficients is relaxed to hold locally, then we only obtain a local solution which may blow up (or explode or have explosion) in finite time. To get global solution, we impose the following conditions: (Hn) Local Lipschitz condition: f (r) and σ(r) are real-valued, measurable functions defined on R and there exit constants rn > 0 such that f (u) − f (v)2 ∨ σ(u) − σ(v)2 ≤ rn u − v2γ , for all u, v ∈ H γ with uγ ∨ vγ ≤ n, n = 1, 2, · · · . Here we use the notation a ∨ b = max(a, b). (B) A priori estimate: For the solution u(t), u(t)γ is continuous a.s for all t > 0 and satisfies a priori bound Eu(t)2γ ≤ K(t), 0 ≤ t < ∞, where K(t) is defined and finite for all t > 0. Theorem 3.2. [Wellposedness under local Lipschitz condition] Assume that the local Lipschitz condition (Hn) and the finite time a priori estimate (B) hold and consider the Eq. (1.2) with initial data h ∈ H γ for 0 < γ < 12 . Then there exists a unique solution u as an adapt, continuous process in H. Moreover, for any T > 0, u belongs to L2 (Ω; C([0, T ]; H)) L2 (Ω × [0, T ]; H γ ) such that T E{ sup u(t)2 + u(t)2γ dt} < ∞. 0≤t≤T
0
For any integer n ≥ 1, let ηn : [0, ∞) → [0, 1] is a C ∞ -function such that * 1, 0 ≤ r ≤ n, ηn (r) = 0, r ≥ 2n.
Proof.
We will consider the truncated systems ∂ ∂ ∂t u(t, x) = (κ∆ − α)u(t, x) + fn (u(t, x)) + σn (u(t, x)) ∂t w(t, x), u|
∂D
(3.5) = 0, u(x, 0) = h(x),
140
Hongbo Fu, Daomin Cao and Jinqiao Duan
where fn (u) = f (ηn (uγ ) · u), σn (u) = σ(ηn (uγ · u)). Then the assumption (Hn) implies that fn and σn satisfy the global conditions (H). Hence, by Theorem 3.1 the system (3.5) has a unique solution un (t) ∈ L2 (Ω; C([0, T ]; H)) L2 (Ω × [0, T ]; H γ ). Define a increasing sequence of stopping time {τn }n≥1 by τn = inf{t > 0; un (t)γ > n} if it exits, and τn = ∞ otherwise. Let τ∞ = lim τn a.s. and set uτn (t) = un (t ∧ τn ), n→∞
then uτn (t) is a local solution of Equation (1.2). By assumption (B), we have, for any T > 0, Euτn (T )2γ ≤ K(T ).
(3.6)
Since Eun (T ∧ τn )2γ = Euτn (T )2γ ≥ E{1{τn ≤T } un (T ∧ τn )2γ } ≥ P{τn ≤ T }n2. In view of (3.6) and (3.7), we get P{τn ≤ T } ≤ Cantelli Lemma,
K n2 ,
(3.7) which, by invoking the Borel-
P{τ∞ > T } = 1 is obtained. Hence u(t) := lim un (t) is a global solution. n→∞ The proof of the theorem is completed.
Remark 3.5. Our framework is mainly adapted from works by P. L. Chow [3], which also deals with, in chapter 6, the wellposedness of strong solutions for stochastic evolution equations by Galerkin approximate under local Lipschitz condition, coercivity condition and monotonicity condition. In addition, we point out that the prior continuity of u(t)γ in assumption (B) is not easy to check, thus we will not discuss it in this article. 4. Examples Let us look at a couple of examples. Let D ⊂ R2 be a bounded domain with smooth boundary ∂D and denote H = L2 (D) as before. Example 4.1. Global Lipschitz case. Consider the following SPDE on D for t > 0: ∂ ∂ ∂t u(t) = (∆ − 1)u(t) + sin u(t) + cos u(t) ∂t w(t), u|
∂D
(4.1) = 0, u(x, 0) = h(x).
Sufficient Condition for Non-Explosion for SPDEs
141
It is easy to check that global Lipschitz condition (H) holds for the SPDE (4.1). Therefore, by Theorem 3.1, the SPDE (4.1) has a unique mild solution {u(t, x)}t≥0 with given h ∈ H γ for γ ∈ (0, 12 ). Example 4.2. Local Lipschitz case. Now consider the following SPDE on D for t > 0: ∂ ∂ u(t) = ∆u(t) + u(t) − u3 (t) + u(t) ∂t w(t), ∂t (4.2)
u|
∂D
= 0, u(x, 0) = h(x).
7 We choose γ = 16 for simplicity. To apply Theorem 3.2, we can set α = κ = 1, 7 3 f (u) = 2u − u and σ(u) = u. Note that since H 16 is continuously imbedded in 7 7 such that for any u ∈ H 16 , L8 (D) (see [4]), there exists a constant L 16
u2L8 + |u2L6 + u2L4 ≤ L 167 u27 ,
(4.3)
16
where · Lp denotes the usual Lp (D) norm. Then, using the H¨ older inequality and 7 (4.3), for any u, v ∈ H 16 we have f (u) − f (v)2 + σ(u) − σ(v)2R ≤ 4u − v2L4 u2 + v 2 + 12L4 + r0 u − v27
16
2 7 , v 7 )u − v 7 , ≤ C(u 16 16 16
where C = C(u , v ) depending on u and v . Thus, condtion (Hn) holds for Eq. (4.2). To check condition (B), suppose a H-valued process {u(t)}t≥0 is the unique solution for Eq. (4.2) such that t t 3 Gt−s (2u(s) − u (s))ds + Gt−s u(s)dws , t ≥ 0. u(t) = Gt h + 7 16
7 16
7 16
0
7 16
0
With the aid of (2.4), (2.7), (2.9), and (4.3) we can get, for any fixed T > 0, t 2 Eu(t) 7 ≤ M (T ) + 3r0 E u(s)27 ds, 0 ≤ t ≤ T, 16
0
16
where M = M (T ) depending on T . It then follows from the Gronwall inequality that Eu(t)27 ≤ M (T ) exp{3r0 T } < ∞, 0 ≤ t ≤ T, 16
which means priori bound is satisfied. If we can check the prior continuity of u(t)γ , then we can conclude the Eq. (4.2) has a unique mild solution {u(t, x)}t≥0 with 7 given h(x) ∈ H 16 .
142
Hongbo Fu, Daomin Cao and Jinqiao Duan
Acknowledgements We would like to thank Hongjun Gao, Jicheng Liu and Wei Wang for helpful discussions and comments. References 1. L. Arnold. Random Dynamical Systems. Springer, New York, 1998. 2. D. Barbu, Local and global existence for mild soltions of stochastic differential equations, Port. Math. 55 (4) (1998) 411-424. 3. P. L. Chow, Stochastic Partial Differential Equations. Chapman & Hall/CRC, New York, 2007. 4. D. Henry, Geometric Theory of Semilinear Parabolic Equations,. Springer-Verlag, Berlin, 1981. 5. H. Kunita, Stochastic Flows and Stochastic Differential Equations, Cambridge University Press, Cambridge, UK, 1990. 6. R.C McOwen. Partial Differential Equantions. Pearson Education, New Jetsy, 2003. 7. G. Da Prato, J. Zabczyk, Stochastic Equations in Infinte Dimensions. Cambridge University Press, Cambridge, UK, 1992. 8. R. Marcus, stochastic diffusion on an unbounded domain, Pacific J. Math. 84 (4) (1979) 143-153. 9. E. Pardoux, Stochastic partial differential equations, a review, Bull. Sci. Math. 117 (1) (1993) 29-47. 10. A. Pazy, Semigroups of Linear Operators and Applications to Partial Differential Equations, Springer, Berlin, 1985. 11. C. Prevot and M. Rockner, A Concise Course on Stochastic Partial Differential Equations, Lecture Notes in Mathematics, Vol. 1905. Springer, New York, 2007. 12. J. C. Robinson, Infinite Dimensional Dynamical Systems, Cambridge University Press, Cambridge, UK, 2001, pp.83-84. 13. B. L. Rozovskii, Stochastic Evolution Equations. Kluwer Academic Publishers, Boston, 1990. 14. T. Shiga, Two contrasting properties of solutions for one-dimension stochastic partial differntial equations, Canad. J. Math. 46 (2) (1994) 415-437.
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 143–160
Chapter 9 The Influence of Transaction Costs on Optimal Control for an Insurance Company with a New Value Function Lin He, Zongxia Liang∗ and Fei Xing† Zhou Pei-Yuan Center for Applied Mathematics, Tsinghua University, Beijing 100084, China. Email: [email protected] In this chapter, we consider the optimal proportional reinsurance policy for an insurance company and focus on the case that the reinsurer asks for positive transaction costs from the insurance company. That is, the reserve of the insurance company {Rt } is governed by a SDE dRtπ = (µ−(1−aπ (t))λ)dt+aπ (t)σdWt, where Wt is a standard Brownian motion, π denotes the admissible proportional reinsurance policy, µ, λ and σ are positive constants with µ ≤ λ. The aim of"this # τ paper is to find a policy that maximizes the return function J(x, π) = x + E 0 π e−ct dRtπ , where c > 0, τπ is the time of bankruptcy and x represents the initial reserve. We find the optimal policy and the optimal return function for the insurance company via stochastic control theory. We also give transparent economic interpretation of the return function J(x, π) and show that it could be a better interpretation than the traditional one. Finally, we make some numerical calculations and give some economic analysis on the influence of the transaction costs.
Contents 1 Introduction . . . . . . . . . . . . . . . . . . . 2 Stochastic Control Model . . . . . . . . . . . . 3 Comparisons of the Two Models . . . . . . . . 4 HJB Equation and Verification Theorem . . . . 5 Construction of Solution to the HJB Equation 6 Economic Analysis . . . . . . . . . . . . . . . . References . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
143 144 146 147 150 157 159
1. Introduction In this chapter, we consider the optimal risk control policy for an insurance company which uses proportional reinsurance policy to reduce its risk. In our model, the reinsurance company asks for some extra ‘transaction costs’ from the company. ∗ Department
of Mathematical Sciences, Tsinghua University, Beijing 100084, China. Email: [email protected] † Department of Mathematical Sciences, Tsinghua University, Beijing 100084, China. Email: fi[email protected] 143
144
Lin He, Zongxia Liang and Fei Xing
That is, the reinsurer uses a safety loading not smaller than the insurer. Our target is to find the appropriate reinsurance rate to maximize the total value of the company. We will work out the optimal control policy via HJB methods. The liquid reserves of the insurance company is governed by a Brownian motion with constant drift and diffusion coefficient. One common control policy for the management of the company is the reinsurance policy. They use it to reduce risk by simultaneously reducing drift and the diffusion coefficient. We refer the readers to Refs. 1, 4 and 9. The objective of the management is to maximize the total value of the company. τ The traditional value function of the insurance company is often interpreted as 0 e−ct Rt dt, where τ is the bankruptcy time, c is the discount rate and Rt is the liquid reserve of the company. This kind of model is solved in Ref. 5. Unfortunately, the model is opaquely defined. It has some flaws both interpreted as the return of shareholders and the value of the company. We propose a new value model inRef. 2 and we consider it to be a better interpretation of company value. τ Indeed, 0 e−ct dRt stands for the total discounted liquid reserve changes during the life time of the company. It could also beexplained as the total discounted net τ incomes of the company. We attribute R0 + 0 e−ct dRt as a better value function of the insurance company. Our work are all based on this model. Based on the above model, we consider the case that the reinsurer asks more risk premium from the insurance company than what the insurer asks from their insured. Our objective is to find the optimal control policy to maximize the value of the insurance company. Højgaard, Taksar have solved this kind of optimal stochastic control problem under the traditional value function, one can refer to Ref. 3. In this paper, we discuss the influence of transaction costs on the insurance company based on the new value function. With the help of the HJB methods, we solve the problem effectively and get some results which are quite helpful for the companies to make their choices. The paper is organized as follows: In section 2, we introduce the stochastic control model of the insurance company. The transparent economic interpretation of the new model is given in section 3. In section 4, we construct the HJB equation for this stochastic control model and prove the validation theorem. The most important results are given in section 5. We solve the HJB equation and give the analytical solutions of the optimal control policy and the optimal value function. We give some economic explanation and numerical calculations in the last section.
2. Stochastic Control Model In this chapter, we consider the situation that an insurance company uses proportional reinsurance policy to control its risk while the reinsurance company chooses a safety loading not smaller than the insurer, that is, the total premium that the insurer pays to the reinsurer is not smaller than the premium he gets from the insured.
Influence of Transaction Costs on an Insurance Company
145
In this model, if there is no dividends payments and we suppose that only the proportional reinsurance policy is used to control the risk, then the liquid reserve of the insurance company is modeled by the following stochastic differential equation,
dRt = µ − (1 − a(t))λ dt + σa(t)dWt , where Wt is a standard Brownian Motion, a(t) ∈ [0, 1] is the proportional reinsurance rate and the constants µ and λ are regarded as the safety loadings of the insurer and the reinsurer, respectively. To give a rigor mathematical foundation of this optimization problem, we fix a filtered probability space (Ω, F , {Ft }t≥0 , P ), {Wt } is a standard Brownian motion on this probability space, {Ft }t≥0 is a family of sub-σ- algebras of F satisfying the usual conditions. Ft represents the information available at time t and all decisions are made based on this information. A control policy π is described by π = {aπ (t)}. Given a control policy π, we assume the liquid reserve of the insurance company is modeled by the following stochastic differential equations,
(2.1) dRtπ = µ − (1 − aπ (t))λ dt + σaπ (t)dWt , R0π = x, where x is the initial reserve of the company. The time of bankruptcy or the ruin time is defined by τπ = inf{t : Rtπ = 0}.
(2.2)
The objective of the insurance company is to maximize the company value by choosing control policy π, i.e., we want to find the optimal value function V (x) and the optimal policy π ∗ such that V (x) = J(x, π ∗ ) where V (x) is defined by " # V (x) = sup J(x, π) π
and J(x, π) = x + E
"
τπ 0
e−ct dRtπ
#
(2.3)
where x is the initial reserve, c denotes the discount rate. We explain the right hand side of Eq.(2.3) as the value of the company. The first term is the initial reserve. While the second term is the total discounted liquid reserve changes during the life time of the company. Another way of explanation is the total discounted net incomes until bankruptcy. Therefore, it could be a better interpretation of the company value than the traditional one. We will give the transparent economic interpretation of the new model in next section.
146
Lin He, Zongxia Liang and Fei Xing
3. Comparisons of the Two Models In this section, we compare the new model with the traditional models. In order to simplify the problem, we only consider the differences between the two models under the cheap reinsurance condition, i.e., λ = µ. Suppose the liquid reserves of the company is modeled by dRtπ = µaπ (t)dt + σaπ (t)dWt ,
R0π = x.
In the traditional model, the value of the company is defined as " τπ −ct # Jold (x, π) = E e Rt dt ,
(3.1)
(3.2)
0
we call it model 1. In the new model, the value of the company is defined as " τπ −ct Jnew (x, π) = x + E e dRt }, (3.3) 0
we call it model 2. We think it’s a better interpretation of value of the company than model 1 and we will explain the reasons as follows. Model 1 is widely used as the value of the company. Unfortunately, its economic interpretation is opaquely defined. We find an explanation in Ref. 10. They supposed the dividends are paid off at the rate proportional to the current surplus. Then (3.2) stands for the total discounted return of the shareholders. Unfortunately, the reserve process can not be defined as in (3.1) and it should be rewritten as dRtπ = µaπ (t)dt + σaπ (t)dWt − bRtπ dt,
R0π = x,
(3.4)
b is the constant dividends payout rate. Obviously, this will lead to a totally different solution. In fact, the company seldom uses the proportional dividends payout strategy as mentioned in Ref. 10. Since the strategy will lead to fluctuated dividends payout, this seems to be a bad management of the company. Another interpretation of model 1 is that it stands for the value of the company. But, there are also some flaws in the model. In fact, it’s the integral of value with respect to time t and it has no meanings. In order to compare the two models clearly, we consider the following extreme cases. Firstly, we set up the discrete forms of the two different value functions. Suppose that 0 = t0 < t1 < ... < tn < ..., Jold (x, π) = E
"
lim
∞
∆ →0
J(x, π) = x + E
"
where ∆ = max{ti − ti−1 }.
# e−cti ∧τπ Rti ∧τπ (ti+1 ∧ τπ − ti ∧ τπ )
(3.5)
i=0
lim
∆ →0
∞ i=0
# e−cti ∧τπ (Rti+1 ∧τπ − Rti ∧τπ ) ,
(3.6)
Influence of Transaction Costs on an Insurance Company
147
Clearly, model 1 stands for the total discounted future liquid reserves and model 2 stands for the total discounted liquid reserve changes before bankruptcy. If the time value is not taken into consideration, i.e., c = 0, the value function Jold (x, π) > 0 and J(x, π) = 0. In fact, at the time of bankruptcy, all the profit equals to the loss and the company worths nothing neglecting the time value. In our case, bankruptcy is not the optional choice but the compulsory one which only happens when Rtπ = 0. So, J(x, π) ≡ 0 as c = 0. On the contrary, Jold (x, π) > 0, we believe that model 1 gives too much privilege for the former performance of the company. In order to make the economic interpretation more clear, we consider another extreme case. Suppose the management of the company always chooses the strategy π = {a ≡ 0}. Then the company will survive for infinite time and the reserve of the company is x determinately. Using the integral in (3.5) and (3.6), we get Jold (x, π) = xc and J(x, π) = x. Since the company chooses such a risk free strategy, the discount rate c should be zero. Then Jold (x, π) → ∞ in this case. In model 1, they treat this case as a good one and its value could be huge. In fact, this is merely passive management, the company worths nothing except for the initial value. So, model 1 is not effective in this extreme case. According to the above extreme examples, we can see that model 1 gives too much privilege of the former performance of the company and it is not a good representation of the company value. In the meanwhile, model 2 calculates the total liquid reserve changes of the company before bankruptcy and it is perfectly reasonable in the extreme cases. Secondly, model 2 can be interpreted as the total discounted net incomes during the lifetime of the insurance company. Rti+1 −Rti is the net income happened in the interval [ti , ti+1 ]. According to the valuation of the company, the company value is the total discounted net incomes before bankruptcy. So, the new model is a good interpretation of the company value. 4. HJB Equation and Verification Theorem Theorem 4.1. Assume f ∈ C 2 is a solution of the following HJB equation: " σ 2 a2 f (x) + (µ − (1 − a)λ)f (x) − cf (x) a∈[0,1] 2 # +µ − (1 − a)λ = 0 max
(4.1)
with boundary condition f (0) = 0,
(4.2)
and lim sup x→∞
|f (x)| < ∞. x
Then for any admissible control π, g(x) = f (x) + x ≥ J(x, π).
(4.3)
148
Lin He, Zongxia Liang and Fei Xing
Proof. Fix a policy π. Choose ε > 0 and let τπε = inf{t : Rtπ ≤ ε}. Then, by Itˆ o’s formula ;
e−c(t
τπε )
f (Rtπ; τπε )
t
;
τπε
= f (x) + 1 + 2 +
t
;
0 τπε
e t
0 ;
τπε
0
Since
t
;
τπε
e
−cs
0
−cs 2 2
a σ f
e
;
t
;
τπε
=
τπε )
= f (x) +
t
;
e
−cs
t
0 t
;
τπε
τπε
+
−c
;
τπε
0
e−cs f (Rsπ )ds
(µ − (1 − a)λ)ds +
e
0 τπε
t
;
τπε
e−cs aσdWs ,
0
f (Rtπ; τπε ) ; t τπε −cs e
0 ;
t
0
1 + 2 +
(Rsπ )ds
we have −c(t
e−cs aσf (Rsπ )dWs .
dRsπ
e−cs (µ − (1 − a)λ)f (Rsπ )ds
t
;
τπε
+ 0
e−cs dRsπ
(µ − (1 − a)λ)f (Rsπ )ds
−cs 2 2
a σ f
(Rsπ )ds
e−cs aσf (Rsπ )dWs +
−c
t
0 ; t τπε
;
τπε
e−cs f (Rsπ )ds
e−cs (µ − (1 − a)λ)ds
0
e−cs aσdWs .
0
Taking expectations at both sides of the last equations, we have t ; τπε ; ε e−cs dRsπ + Ee−c(t τπ ) f (Rtπ; τπε ) ≤ f (x) E
+E
t
0 ;
τπε
e
−cs
0
aσf
(Rsπ )dWs
+E
t
;
τπε
e−cs aσdWs
(4.4)
0
due to the fact
σ 2 a2 f (x) + µ − (1 − a)λ f (x) − cf (x) + µ − (1 − a)λ ≤ 0 2 for all a ∈ [0, 1]. By concavity of f , 0 ≤ f (Rsπ ) ≤ f (ε) on (0, τπ ), the second term in the right hand side of (4.4) is a zero-mean square integrable martingale, i.e., t ; τπε E{ e−cs aσf (Rsπ )dWs } = 0. 0
Influence of Transaction Costs on an Insurance Company
149
Similarly,
t
;
τπε
E{
e−cs aσdWs } = 0.
0
So (4.4) becomes E
t
;
0
τπε
e−cs dRsπ + E{e−c(t
;
τπε )
f (Rtπ; τπε )} ≤ f (x).
(4.5)
Letting ε → 0 and noticing that ; # " lim E e−c(t τπ ) f (Rtπ; τπ )I{τπ <∞} t→∞ " # = E e−cτπ f (Rτππ )I{τπ <∞} " # = E e−cτπ f (0)I{τπ <∞} =0 and
(4.6)
; # " lim E e−c(t τπ ) f (Rtπ; τπ )I{τπ =∞} t→∞ # " = lim E e−ct f (Rt )I{τπ =∞} t→∞ # " ≤ lim ke−ct E Rt I{τπ =∞} t→∞
k(x + λt) = 0, t→∞ ect
≤ lim
(4.7)
where k := lim sup |f (x)| < ∞, we obtain x x→∞
; " # lim E e−c(t τπ ) f (Rtπ; τπ ) = 0.
t→∞
Putting (4.5) and (4.8) together implies that ; " t τπ −cs π # e dRs ≤ f (x). J(x, τπ ) − x = E
(4.8)
(4.9)
0
Thus we obtain g(x) = f (x) + x ≥ J(x, π)
(4.10)
2
for any admissible control π.
Theorem 4.2. Let π ∗ = a∗ (x) be the maximizer of the left hand side of (4.1) and ∗ Rtπ is a solution to the following stochastic differential equation ∗
∗
∗
dRtπ = (µ − (1 − a∗ (Rtπ ))λ)dt + a∗ (Rtπ )σdWt , ∗
with the boundary condition R0π = x. Then V (x) = J(x, π ∗ ) = g(x).
150
Lin He, Zongxia Liang and Fei Xing
Proof. We follow the same procedures in Theorem 4.1, and all the inequities in (4.4)(4.5)(4.9)(4.10) could become equalities. It is easy to get g(x) = J(x, π ∗ ), i.e., g(x) ≤ V (x). Combined with g(x) ≥ V (x), we get the conclusion V (x) = 2 J(x, π ∗ ) = g(x). 5. Construction of Solution to the HJB Equation According to Theorem 4.1, we only need to find a C 2 function f satisfying the HJB equation (4.1) with the boundary conditions (4.2) and (4.3). The case λ = µ This case is also known as cheap reinsurance and has already been solved in Lin He and Zongxia Liang (2007), where a solution f to (4.1) can be found as follows: x k1 Q(G−1 ( )), 0 ≤ x < x0 , k1 f (x) = k2 ed(x−x0) + µ , x ≥ x0 , c where k1 , k2 , x0 , d are constants determined via exogenous parameters of the problem and G, Q are special functions given by G(x) = Q(x) =
x
x
γ
e 1+y y γ (1 + y)−2−γ dy, γ
e 1+y y γ−1 (1 + y)−2−γ dy.
The maximizing function is x x k1 µ −1 x (G ( ) + (G−1 ( ))2 )g(G−1 ( )), 0 ≤ x < x0 , a(x) = σ2 k1 k1 k1 1, x ≥ x , 0 γ
where g is the integral kernel of G, that is, g(x) = e 1+x xγ (1 + x)−2−γ . The case λ > µ First we guess that a(x) = 1 for all x, where (4.1) becomes σ 2 f (x) + µf (x) − cf (x) + µ = 0. 2 Using the concavity of f as well as f (0) = 0, we get the solution of (5.1), µ µ f (x) = − edx + , c c where −µ − µ2 + 2σ 2 c d= σ2
(5.1)
(5.2)
(5.3)
Influence of Transaction Costs on an Insurance Company
151
is a negative solution to the following equation σ2 2 y + µy − c = 0. 2
(5.4)
Then we can get the following important result. Proposition 5.1. Let f be defined by (5.2). Then f is a solution to (4.1) if and only if λ ≥ 2µ. Proof. First we prove the sufficiency. To show that f is a concave solution to (4.1), we only need to show that for any a ∈ [0, 1]
F (a, x) =
(1 − a2 )σ 2 f (x) + (1 − a)λf (x) + (1 − a)λ ≥ 0. 2
(5.5)
In fact, we have for any a ∈ [0, 1], x ∈ [0, +∞) (1 − a2 )σ 2 µd2 dx µd (− e ) + (1 − a)λ(− edx + 1) 2 c c 2 2 µd µd (1 + a)σ (− ) + λ(− + e−dx )] = (1 − a)edx [ 2 c c µd (1 + a)σ 2 µd2 (− ) + λ(− + 1)] ≥ (1 − a)edx [ 2 c c λ µ = (1 − a)edx [− (1 + a)(c − µd) + (−µd + c)] c c c − µd = (1 − a)edx [λ − µ(1 + a)] c ≥0
F (a, x) =
(5.6) (5.7)
(5.8)
where (5.6) satisfies because of d < 0 and x ≥ 0, which implies that e−dx ≥ 1; (5.7) satisfies since 12 σ 2 d2 + µd − c = 0 and (5.8) satisfies since c − µd > 0 and λ ≥ 2µ. So we have proved sufficiency. As to the necessity, assume µ < λ < 2µ. Then there exists a a0 ∈ (0, 1) such that m := λ − µ(1 + a0 ) < 0. Let x0 = − d1 log(1 − m(c−µd) ). Clearly, x0 ∈ (0, ∞). Then we can compute 2cλ F (a0 , x0 ) =
c − µd m(1 − a0 )edx0 < 0, 2c
which contradicts the assumption. Therefore we obtain the necessity.
2
Now assume µ < λ < 2µ. Suppose that the concave function f is found and there exists x1 such that a(x) satisfies 0 < a(x) < 1 for all 0 < x < x1 . When x < x1 , we can find a(x) by 2 2 differentiating the expression Φ(a) ≡ σ 2a f (x) + (µ − (1 − a)λ)f (x) − cf (x) + µ −
152
Lin He, Zongxia Liang and Fei Xing
(1 − a)λ in bracket {·} of Eq.(4.1) with respect to a and letting Φ (a) = 0. Here x1 is an unknown variable to be specified later and a(x) is given as
λ(1 + f ) . a(x) = − σ 2 f
(5.9)
Substituting (5.9) into (4.1), f satisfies
−
λ2 (f + 1)2 + (µ − λ)f − cf + µ − λ = 0, for 0 ≤ x < x1 . 2σ 2 f
(5.10)
By our assumption, f (x) is concave for 0 ≤ x < x1 , so there exists a function X(z) : R −→ [0, +∞) such that −ln(f (X(z)) + 1) = z. Here z ≤ 0. In addition, the following two equations are valid,
f (X(z)) = e−z − 1, −e−z . f (X(z)) = X (z)
(5.11)
Define M such that X(−M ) = 0, i.e., f (0) = eM − 1. Then X : [−M, 0) → [0, ∞). Putting x = X(z) into (5.10), we obtain that λ2 X (z)e−z − cf (X(z)) + (µ − λ)e−z = 0. 2σ 2
(5.12)
Differentiating (5.12) w.r.t z and using (5.11), it is easy to get λ2 λ2 −z X (z)e − ( + c − cez )X (z)e−z − (µ − λ)e−z = 0. 2 2 2σ 2σ
Defining γ =
2σ2 λ2 ,
(5.13)
(5.13) can be rewritten as
X (z) − (1 + cγ − cγez )X (z) = (µ − λ)γ. So X (z) = γ(µ − λ)
z
exp (−(1 + cγ)y + cγey )dy +k1 exp ((1 + cγ)z − cγez ),
−M
where k1 is a free parameter. Let g be the p.d.f of Gamma distribution with parameters (cγ + 1, 1/cγ), we obtain z 1 z z dy + k1 ez g(ez ) X (z) = γ(µ − λ)e g(e ) y g(ey ) e −M = γ(µ − λ)ez g(ez )H(ez ) + k1 ez g(ez ), where
H(z) =
z e−M
1 dy, y 2 g(y)
∀z ≥ e−M .
(5.14)
Influence of Transaction Costs on an Insurance Company
153
Since X(−M ) = 0, we have z
γ(µ − λ)ey g(ey )H(ey ) + k1 ey g(ey ) dy X(z) = −M ez
=
γ(µ − λ)H(y)g(y) + k1 g(y) dy
e−M
= K(ez ), where
K(z) :=
z
(5.15)
e−M
γ(µ − λ)H(y)g(y) + k1 g(y) dy,
∀z ≥ e−M .
(5.16)
Because X(z) and ez is monotone increasing in z, K(z) is also monotone increasing in z. Therefore we can calculate the inverse function of K: 1 , K −1 (X(z)) = f (X(z)) + 1 that is,
f (x) = Consequently,
f (x) = 0
x
1 − 1. K −1 (x)
1 dy − x, K −1 (y)
∀x ∈ [0, x1 ].
(5.17)
(5.18)
Thus
a(x) = − where
λ λ(f (x) + 1) = 2 K −1 (x)k(K −1 (x)), σ 2 f (x) σ
k(y) := K(y) = γ(µ − λ)H(y)g(y) + k1 g(y).
(5.19)
(5.20)
Now for x > x1 we have a(x) = 1. Letting a(x) = 1 in (4.1) and then using the concavity, we obtain the following solution: µ ∀x ≥ x1 f (x) = k2 edx + , c with d given in (5.3). Therefore we obtain the following solution: * x 1 0 ≤ x < x1 , 0 K −1 (y) dy − x, (5.21) f (x) = µ dx k2 e + c , x ≥ x1 , where k1 , k2 , x1 , M are constants to be determined. To ensure f to be twice continuously differentiable at x1 we have
f (x1 −) = k2 dedx1 ,
f (x1 −) = k2 d2 edx1 ,
154
Lin He, Zongxia Liang and Fei Xing
that is, 1 − 1 = k2 dedx1 , K −1 (x1 ) λ 1 = k2 d2 edx1 , − 2 −1 σ K (x1 )
(5.22) (5.23)
where the left hand side of (5.23) is obtained via using
a(x1 ) = −
λ(f (x1 ) + 1) = 1, σ 2 f (x1 )
which means that
f (x1 ) = −
λ λ 1 . (f (x1 ) + 1) = − 2 −1 2 σ σ K (x1 )
Let α = K −1 (x1 ),
β = k2 edx1 .
We see from (5.22) and (5.23) that 1 − 1 = βd, α
−
λ 1 = βd2 . σ2 α
These equations have a solution (α, β) =
λ σ2 1 . + 1, − dσ 2 λ + dσ 2 d
(5.24)
By (5.3),(5.4) and the assumption λ < 2µ, we have β < 0 and 0 < α < 1. Using (5.19) and the condition a(x1 ) = 1, we have λα
γ(µ − λ)H(α)g(α) + k1 g(α) = 1. 2 σ Therefore, (x1 , k1 ) = K(α),
σ2 + γ(λ − µ)H(α) . λαg(α)
(5.25)
So f (x) must be the following form, * x 1 f (x) =
dy − x, 0 ≤ x < x1 , 0 K −1 (y) βed(x−x1 ) + µc , x ≥ x1 ,
and the maximizing function has the following form, 3 λ −1 K (x)k(K −1 (x)), 0 ≤ x < x1 , a(x) = σ2 1, x ≥ x1 ,
(5.26)
(5.27)
where K is given by (5.16). The constant x1 , k1 and β are given by (5.24) and (5.25), and d is given by (5.3). Now we only need to determine the constant M
Influence of Transaction Costs on an Insurance Company
155
such that f (0+) is a solution to (5.10) in the limit. Letting x → 0 in (5.10) we get the equation λa(0+) (f (0+) + 1) + (µ − λ)f (0+) + (µ − λ) = 0. 2 Due to the fact that f (0+) ≥ 0, this is satisfied if and only if
a(0+) =
2(λ − µ) := B λ
(5.28)
Since µ < λ < 2µ, B ∈ (0, 1). By (5.27) and the fact that K(e−M ) = X(−M ) = 0, we obtain
λ σ2 a(0+) = 2 e−M g(e−M ) γ(λ − µ)(H(α) − H(e−M )) + σ λαg(α) −M −M g(e ) e + Be−M g(e−M )H(α). = (5.29) αg(α) By (5.28) and (5.29), we have e−M g(e−M ) + Be−M g(e−M )H(α) = B. αg(α) Denote yg(y) + Byg(y) F (y) := αg(α)
y
α
1 dz, z 2 g(z)
(5.30)
y ∈ [0, α].
(5.31)
To ensure the existence of M in (5.31), we need only to prove there exists y0 ∈ [0, α], 2 such that F (y0 ) = B. Then M = − ln y0 is what we need. Lemma 5.1. Suppose f (x) ∈ C 2 [a, b] and f (a) < f (b). If for any x ∈ [a, b], f (x) = 0 leads to f (x) < 0, then for any y0 ∈ (f (a), f (b)), there exists a unique x0 ∈ [a, b] such that f (x0 ) = y0 . Proof. Suppose there exist x1 , x2 ∈ [a, b], x1 < x2 , such that y0 := f (x1 ) = f (x2 ) ∈ (f (a), f (b)). Then we claim that y1 := min f (x) ≥ y0 . Since if it is not the case, x∈[x1 ,x2 ]
then we can find c ∈ (x1 , x2 ) such that f (c) = y1 , which implies that f (c) = 0 and f (c) ≥ 0. Contradicting to the assumption of this lemma. Using the same method we obtain y2 := min f (x) ≥ y0 . Then we obtain that y0 = min f (x), which x∈[x2 ,b]
x∈[x1 ,b]
implies that f (x2 ) = 0 and f (x2 ) ≥ 0. Contradicting to the assumption of this lemma. Therefore, we have finished the proof. 2 Proposition 5.2. F (x) = B has a unique solution in [0, α]. Proof. First, F (α) = 1 > B. By L’Hospital rule and (5.31) it can be seen that F (x) → B/(1 + cγ) < B when x → 0. Next, notice that (xg(x)) = g(x)(cγ − cγx + 1).
(5.32)
156
Lin He, Zongxia Liang and Fei Xing
Using this we can obtain F (x) =
1
F (x)(cγ − cγx + 1) − B x
and F (x) =
cγ cγF (x) F (x)(1 − x) − . x x
Assume there exists x ˆ ∈ (0, α) satisfying F (ˆ x) = 0, we have F (ˆ x) = −
cγF (ˆ x) < 0. x ˆ
(5.33)
By Lemma 5.1 we obtain the result. 2 Recall that we assume a(x) < 1 for all x < x1 in the beginning of this section. Now we are going to check this case. First we need to prove a lemma. Lemma 5.2. Suppose f (x) ∈ C 2 [a, b], f (a) < f (b) and f (b) > 0. If for any x ∈ [a, b], f (x) = 0 leads to f (x) < 0, then for any x ∈ (a, b) we have f (x) < f (b). Proof. Suppose there exists x0 ∈ (a, b) satisfying x0 ≥ f (b). Since f (b−) > 0, then there exists x1 ∈ (x0 , b) such that f (x1 ) < f (b) ≤ f (x0 ). Therefore, we can find c ∈ (x0 , b) such that f (c) = min f (x). Since f (x) ∈ C 2 [a, b], then we have x∈[x0 ,b]
f (c) = 0 and f (c) ≥ 0, which contradicts the assumption.
2
Proposition 5.3. Let a(x) be defined by (5.27). Then a(x) < 1 for all x < x1 . Proof. We only need to consider a(y) =
λ yk(y), σ2
e−M < y < α.
Using (5.32) again it follows that a (y) =
1
a(y)(1 + cγ − cγy) + B y
(5.34)
and cγ 1 a (y) = cγa (y)(1 − ) − a(y). y y
(5.35)
By (5.34) and (5.35) if a (y) = 0, we have a (y) < 0. Since a(α) = 1, it follows from (5.34) that a (α−) = α1 1 + cγ − cγy + B > 0, then by Lemma 5.2 we complete the proof. 2 Theorem 5.1. Let f be defined by (5.26) where e−M is the unique solution to (5.31). Then f ∈ C 2 and is a concave solution to (4.1) for µ < λ < 2µ.
Influence of Transaction Costs on an Insurance Company
157
Proof. By the construction of f we obtain that f is concave and f ∈ C 2 , so we only need to prove that f satisfies (4.1). By Proposition 5.3, we know that the maximizing function a(x) satisfies a(x) < 1 for all x < x1 , then it satisfies (4.1) for all x < x1 from the construction of f . Now for x ≥ x1 , equality holds with a(x) = 1. So we need only to prove that for all 0 ≤ a ≤ 1 (1 − a2 )σ 2 f (x) + (1 − a)λf (x) + (1 − a)λ ≥ 0. 2 In fact, we can compute that
F (a, x) =
(1 − a2 )σ 2 f (x) + (1 − a)λf (x) + (1 − a)λ ≥ 0 2 1 = σ 2 βd2 (1 − a)2 ed(x−x1 ) + (1 − a)λβded(x−x1 ) + (1 − a)λ 2 1 = (1 − a)ed(x−x1 ) (1 + a)σ 2 βd2 + λβd + λe−d(x−x1 ) 2 1 ≥ (1 − a)ed(x−x1 ) (1 + a)σ 2 βd2 + λβd + λ , (5.36) 2
F (a, x) =
where (5.36) satisfies since d < 0 and x ≥ x1 . Notice that (1 − a)ed(x−x1 ) ≥ 0 for all a ≤ 0, so it suffices to show that 12 (1 + a)σ 2 βd2 + λβd + λ ≥ 0. Since σ 2 βd2 < 0, using (5.24) we have 1 (1 + a)σ 2 βd2 + λβd + λ 2 ≥ σ 2 βd2 + λβd + λ
σ2 d −1 +λ = (σ 2 d + λ) λ + σ2 d ≥ 0, so F (a, x) ≥ 0 for all a ≤ 1. Thus we complete the proof.
2
6. Economic Analysis In this section, we calculate the optimal return function V (x), the optimal control strategy a∗ and the switch point x1 for µ = 1, σ = 1, c = 0.1 and different values of λ between 1 and 2. From Figure 1, we see that the valuation function V (x) is a monotone decreasing function with respect to λ. Obviously, higher risk premium charged by the reinsurance company reduces the profit of the insurance company. In the meanwhile, we find that the optimal value is about ten percent of the optimal value obtained via the traditional model. We think the new model is more realistic. (See Ref. 3 for details) From Figure 2, we see that when λ is more than two folds of µ, the management of the insurance company would like to take full responsibility of the risk. This means huge risk premium charged by the reinsurance company leads to no business
158
Lin He, Zongxia Liang and Fei Xing
12
V(x)
10
8
6 λ=1.1 4
λ=1.3 λ=1
2
λ=2
0 0
0.5
1
x
1.5
2
Figure 1. The optimal return function V (x) for µ = 1, σ = 1, c = 0.1. a(x) 1
0.8
0.6
0.4
λ=1.1 λ=1.3 λ=1
0.2
λ=2 0 0
0.5
1
1.5
x
2
∗
Figure 2. The optimal control policy a for µ = 1, σ = 1, c = 0.1. at all. The other interesting point is that when the liquid reserves approaches 0, the reinsurance rate is 1 in the cheap reinsurance case. So the company never goes bankruptcy in the case of λ = µ. (See Ref. 2 for details) In the non-cheap reinsurance case, the management of the insurance company chooses to take some risk when the reserves approaches 0. When the reserves is small, the management of the company can’t afford the transaction costs generated from the reinsurance procedure and have to take some risk. From Figure 3, we see that x(1) is a concave function of λ. First, x1 increases with respect to λ, then decreases in λ. This is quite interesting. x1 stands for the switch point that the management of the company would like to take full responsibility of the risk. The company reduces its risk appetite when the λ becomes larger
Influence of Transaction Costs on an Insurance Company
159
x(1) 1.1
1
0.9
0.8
0.7
1.1
1.2
1.3
1.4
1.5
1.6
1.7
λ 1.8
Figure 3. The switch point x1 of λ. at first. The company enlarges its responsibility of the risk slowly with respect to the liquid reserve level. This may be induced by the balance between the cost reduction and the risk aversion. The company would like to pay some cost to reduce the possibility of bankruptcy when the cost is small. When λ is large enough, the management of the company would to take more risk when λ becomes larger. The incentive for cost reduction overcomes the motivation for risk aversion at last. Acknowledgement This work is supported by NSFC No. 10771114, and SRF for ROCS, SEM, and the Korea Foundation for Advanced Studies. We would like to thank for generous financial support. References 1. Dayananda, P.W.A.: Optimal reinsurance, Journal of Applied Probability 7, 134-156, 1970. 2. He, Lin and Liang, Zongxia: Optimal Proportional Reinsurance Policies for the Insurance Company with a New Value Function, Preprint(2007). 3. Højgaard, B., Taksar: Optimal Proportional Reinsurance Policies for Diffusion Models with Transaction Costs, Insurance: Mathematics and Economics, Vol. 22, 41-51, 1998. 4. Højgaard, B., Taksar, M.: Controlling Risk Exposure and Dividend Payout Schemes: Insurance Company Example, Mathematical Finance, Vol. 9, No. 2, 153-182, 1999. 5. Højgaard, B., Taksar, M.: Optimal Proportional Reinsurance Policies for Diffusion Models, Insurance: Mathematics and Economics, Volume 23, 303, 1998. 6. Øksendal, B.: Stochastic Differential Equations, Berlin: Springer Verlag, 1985 7. Samuel, k., Taylor, H. M.: A Second Course in Stochastic Processes, ISBN 0123986508, New York: Academic Press, 1981. 8. Sethi, S.: Optimal Consumption and Investment with Bankruptcy, Kluwer, 1997.
160
Lin He, Zongxia Liang and Fei Xing
9. Taskar M.: Optimal Risk and Dividend Distribution Control Models for an Insurance Company, Mathematical Methods of Operations Research, Vol. 51, 1-42, 2000. 10. Taksar, M., Hunderup, C., L.: The Influence of Bankruptcy Value on Optimal Risk Control for Diffusion Models with Proportional Reinsurance, Mathematics and Economics, Vol. 40, 311-321, 2007. 11. Wendell H. Fleming, H. Mete Soner: Controlled Markov Processes and Viscosity Solutions, New York : Springer-Verlag, ISBN 0387979271, 1993. 12. Whittle,P.: Optimization over Time-Dynamic Programming and Stochastic Control, New York: Wiley , 1983. 13. Yong Jiong Min, Zhou Xun Yu: Stochastic Controls: Hamiltonian Systems and HJB Equations, ISBN 0387987231, New York: Springer-Verlag, 1999.
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 161–175
Chapter 10 Limit Theorems for p-Variations of Solutions of SDEs Driven by Additive Stable L´ evy Noise and Model Selection for Paleo-Climatic Data
Claudia Hein, Peter Imkeller and Ilya Pavlyukevich Institut f¨ ur Mathematik, Humboldt Universit¨ at zu Berlin Unter den Linden 6, 10099 Berlin, Germany [email protected] (Claudia Hein) [email protected] (Peter Imkeller) [email protected] (Ilya Pavlyukevich) In this chapter asymptotic properties of the p-variations of stochastic processes of the type X = Y + L are studied, where L is an α-stable L´evy process, and Y a perturbation which satisfies some mild Lipschitz continuity assumptions. Local functional limit theorems for the power variation processes of X are established. In case X is a solution of a stochastic differential equation driven by the process L, these limit theorems provide estimators of the stability index α. They are applied to paleo-climatic temperature time series taken from the Greenland ice core, describing the fine structure of temperature variability during the last ice age, in particular exhibiting the intermediate warming periods known as Dansgaard–Oeschger events. Our results provide the best fitting α in a parameter test in which the time series are modeled by stochastic differential equations with α-stable noise component.
Keywords: p-variation; stable L´evy process; tightness; Skorokhod topology; stability index; model selection; estimation; Kolmogorov–Smirnov distance; paleoclimatic time series; Greenland ice core data 2000 AMS Subject Classification: primary 60G52, 60F17; secondary 60H10, 62F10, 62M10, 86A40 Contents 1 2 3 4 5
Introduction . . . . . . . . . . . . . Object of study and main results . . Applications to real data . . . . . . Functional convergence for Vpn (L) . Generalisation to sums of processes . 5.1 Equivalence for p ≤ 1 . . . . . 5.2 Equivalence for p > α . . . . . 5.3 Equivalence for α ∈ (1, 2), p ∈
. . . . . . . . . . . . . . . . . . . . . (1, α]
. . . . . . . .
. . . . . . . .
161
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
162 164 166 169 170 170 170 171
162
Claudia Hein, Peter Imkeller and Ilya Pavlyukevich
References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
175
1. Introduction The research leading to this paper was inspired and triggered by two papers by Ditlevsen1,2 which will be in its focus. In his work, Ditlevsen uses simple stochastic differential equations with additive noise as a model fit for temperature data (yearly averages) obtained from the Greenland ice core describing aspects of the evolution of the Earth’s climate during the last Ice Age, which extended over about 100,000 years. This time series features in particular the catastrophic warmings and coolings in the Northern hemisphere, the so-called Dansgaard–Oeschger events.3 The dynamical systems component of the modeling stochastic differential equation describes the evolution of temperature as a function of time, thus lives in one dimensional Euclidean space, and can therefore be chosen to be given by the gradient of a double well potential (climatic quasi-potential), in which the local minima correspond to cold and warm meta-stable climate states. In order to find a good fit for the noise component, Ditlevsen performs a histogram analysis for the residuals of the ice core time series, the temperature increments measured between adjacent data points, i.e. years. He conjectures that the noise may contain a strong α–stable component with α ≈ 1.75, and plots an estimate for the drift term assuming the stationarity of the solution. Stochastic differential equations have been used for quite a while as meso-scopic descriptions of models for natural phenomena. In their simplest variants, they consist of deterministic differential equations describing dynamical systems perturbed by a noise term. The subclass in which the noise source is Gaussian arises for instance from microscopic models of coupled systems of deterministic differential equations on different time scales, in the limit of infinite speed of the fast scale, and describe the fluctuations of the slow scale component around its averaged version with the fast scale component as a stochastic perturbation. With a view in particular towards the mathematical interpretation of financial time series, the theory for stochastic differential equations the noise term of which is given by more general (discontinuous and non-Gaussian) semimartingales has received considerable attention during the recent years. It is reasonable in a quite general framework to model real data by stochastic differential equations. Usually neither their dynamical systems nor their noise component can be deduced from first principles for instance from microscopic models. However, they may be selected by statistical inference or model fit from the time series they are supposed to interpret. The central question of the corresponding model selection problem in Ditlevsen’s papers motivating the study which led to this paper asks for the best choice of the noise term. More formally, suppose we wish to model a real time series by the dynamics
Limit Theorems for p-Variations of Solutions of SDEs, Model Selection
X = (Xt )t≥0 of a real valued process of the type t f (s, Xs ) ds + ηt , Xt = x +
t ≥ 0,
163
(1.1)
0
where the process component η describes the noise perturbation. Then the problem of model fit consists in the choice of a drift term f and a noise term η, so that the solution of (1.1) is in the best possible agreement with the data of the given time series. We resume Ditlevsen’s model selection problem for the fit of the noise component from the perspective of a new testing method details of which have still to be developed. Following Ditlevsen, we work under the model assumption that the noise η has an α-stable L´evy component which may be symmetric or skewed. We search for a test statistics discriminating well between different α, and capable of testing for the right one. We shall show that this job is well done by the equidistant p-variation — in the sequel called p-variation — of the process X defined for all p > 0 as Vpn (X)t :=
[nt]
|∆ni X|p ,
(1.2)
i=1
where ∆ni X := X ni − X i−1 for 1 ≤ i ≤ n, n ≥ 1. We first observe that for those n values of p relevant for our analysis the main contribution to the p-variation of X comes from the α-stable component of the noise. We next prove local limit theorems which hold under very mild assumptions on the drift term f and allow to determine the stability index α asymptotically. We finally use these limit theorems in Section 3 below to analyze the real data from the Greenland ice core with our methods, and come to an estimate α ≈ 0.7, surprisingly quite different from the one obtained by Ditlevsen.1,2 In fact, the diagram of distances between observed and theoretical levels α taken in the Kolmogorov–Smirnov sense, exhibits two local minima which change their shape and position as a scale parameter of the laws is modified. The deeper of the two corresponds to the estimate just quoted, while the shallower one could correspond to Ditlevsen’s findings. The paper is organized as follows. In section 2 we set the stage for stating our main results, which are applied to the Greenland ice core data in section 3. In section 4, the convergence of the finite dimensional laws of renormalized processes of p-variations is shown. By independence of increments, this immediately implies functional convergence, as can be seen by classical results. In section 5, the functional convergence of laws is extended to sums of L´evy processes and processes of finite variation, i.e. to processes typical for the structure of an SDE perturbed by L´evy noise. D d In the following, ‘→’ denotes convergence in the Skorokhod topology, ‘→’ deu.c.p. notes convergence of finite-dimensional (marginal) distributions, and ‘ → ’ stands
164
Claudia Hein, Peter Imkeller and Ilya Pavlyukevich
for uniform convergence on compacts in probability. We denote the indicator function of a set A by IA , and A¯ denotes the complement of the set A. 2. Object of study and main results Let (Ω, F , (Ft )t≥0 , P) be a filtered probability space. We assume that the filtration satisfies the usual hypotheses in the sense of the book,4 i.e. it is right continuous, and F0 contains all the P-null sets of F . For α ∈ (0, 2] let L = (Lt )t≥0 be an α-stable L´evy process, i.e. a process with right continuous trajectories possessing left side limits (rcll) and stationary independent increments whose marginal laws satisfy πα −tC α |λ|α 1 − iβ sgn(λ) tan , α = 1, 2 t ≥ 0, (2.1) ln EeiλLt = 2 −tC|λ| 1 − iβ sgn(λ) log |λ| , α = 1, π C > 0 being the scale parameter and β ∈ [−1, 1] the skewness. We adopt the standard notation from Ref. 5 and write L1 ∼ Sα (C, β, 0). We also make use of the L´evy–Khinchin formula for L which takes the following form in our case (see Ref. ?Chapter XVII.3]Feller-71 for details): $1 − β % dx 1+β CF I{x<0} + I{x>0} (eiλx − 1 − iλxI{|x|<1} , α = 1, 2 2 |x|1+α R\{0} % dx $1−β ln EeiλL1 = 1 + β I{x<0} + I{x>0} (eiλx − 1 − iλ sin x) , α = 1, CF 1+α R\{0}
2
2
|x|
(2.2) where CF denotes the scale parameter in Feller’s notation and equals −1 C α cos πα Γ(α) , α = 1, 2 CF = 2 C , α = 1. π
(2.3)
Recall that a totally asymmetric process L with β = 1 (β = −1) is called spectrally positive (negative). A spectrally positive α-stable process with α ∈ (0, 1) has a.s. increasing sample paths and is called subordinator. The main results of this paper are presented in the following three theorems. The first theorem deals with the asymptotic behavior of the p-variation for a stable L´evy process itself. As we will see later, this behavior does not change under perturbations by stochastic processes that satisfy some mild conditions. Theorem 2.1. Let (Lt )t≥0 be an α-stable L´evy process with L1 ∼ Sα (C, β, 0). If p > α/2 then
n D Vp (L)t − nt Bn (α, p) t≥0 → (Lt )t≥0
as n → ∞,
(2.4)
Limit Theorems for p-Variations of Solutions of SDEs, Model Selection
where L1 ∼ Sα/p (C , 1, 0) with the scale α p/α cos( πα 2p )Γ(1 − p ) C p , πα cos( 2 )Γ(1 − α) C = π 1 , C p πα 2 cos( 2 )Γ(1 − α)
165
α = p, (2.5) α = p.
The normalising sequence (Bn (α, p))n≥1 is deterministic and given by −p/α E|L1 |p , p ∈ (α/2, α), n
−1 α Bn (α, p) = E sin n |L1 | , p = α, 0, p > α.
(2.6)
We remark that the skewness parameter β of L does not influence the convergence of Vpn (L)t and does not appear in the limiting process since the p-variation depends only on the absolute values of the increments of L. Moreover, for p > α the limiting process L is a subordinator. We next perturb L by some other process Y . We impose no restrictions on dependence properties of Y and L. The only conditions on Y concern the behavior of its p-variation. We formulate two theorems, the first for p ∈ (α/2, 1) ∪ (α, ∞), and the second for p ∈ (1, α]. Theorem 2.2. Let (Lt )t≥0 be an α-stable stochastic process, with L1 ∼ Sα (C, β, 0) and (Yt )t≥0 be another stochastic process that satisfies u.c.p.
Vpn (Y ) → 0,
n → ∞,
(2.7)
for some p ∈ (α/2, 1) ∪ (α, ∞). Then D
(Vpn (L + Y )t − nt Bn (α, p))t≥0 → (Lt )t≥0
as n → ∞,
(2.8)
with L and (Bn (α, p))n≥1 defined in (2.5) and (2.6). The methods used to prove Theorem 2.2 do not work for p ∈ (1, α], and in this case we have to impose stronger conditions on the process Y . Theorem 2.3. Let (Lt )t≥0 be an α-stable stochastic process with L1 ∼ Sα (C, β, 0), α ∈ (1, 2) and let (Yt )t≥0 be another stochastic process. Let p ∈ (1, α] and T > 0. If Y is such that for every δ > 0 there exists K(δ) > 0 that satisfies P(|Ys (ω) − Yt (ω)| ≤ K(δ)|s − t| for all s, t ∈ [0, T ]) ≥ 1 − δ,
(2.9)
the process Y does not contribute to the limit of Vpn (L + Y ), i.e. D
(Vpn (L + Y )t − nt Bn (α, p))0≤t≤T → (Lt )0≤t≤T , with L and (Bn (α, p))n≥1 defined in (2.5) and (2.6).
n→∞
(2.10)
166
Claudia Hein, Peter Imkeller and Ilya Pavlyukevich
To be able to study models of the type (1.1) we formulate the following corollary of Theorems 2.3 and 2.2 which takes into account that Lebesgue integral processes are absolutely continuous w.r.t. the time variable and thus qualify as small process perturbations in the sense of the Theorems. Corollary 2.1. Let (Lt )t≥0 be an α-stable stochastic process, with L1 ∼ Sα (C, β, 0), and let f : R+ × R → R be a locally bounded function such that for some x ∈ R and T > 0 the unique solution for t f (s, Xs ) ds + Lt (2.11) Xt = x + 0
exists on the time interval [0, T ]. Then for p > max{1, α/2} we have D
(Vpn (X)t − nt Bn (α, p))0≤t≤T → (Lt )0≤t≤T
(2.12)
as n → ∞, with L and (Bn (α, p))n≥1 defined in (2.5) and (2.6). The functional convergence of power variations of symmetric stable L´evy processes to stable processes was first studied by Greenwood in Ref. 7, where more general non-equidistant power variations were considered, and in particular for p > α the convergence to subordinators was proved. Further, more general results on power variations of L´evy processes are obtained by Greenwood and Fristedt.8 Lepingle9 proves convergence of power variations for semimartingales in probability. In Refs. 10 and 11, Jacod proves convergence results for p-variations of general L´evy processes and semimartingales. In particular, several laws of large numbers and central limit theorems are established. Our results are different from Jacod’s because we consider processes possessing no second moments so that only the generalized central limit theorem can apply. Moreover, we consider in addition convergence of perturbed processes. Corcuera, Nualart and Woerner12 consider · p-variations of a (perturbed) integrated α-stable process of the type X = Y + 0 us− dLs with some cadlag adapted process u. For u = 1, our setting results. The paper12 contains a law of large numbers for 0 < p < α and a functional central limit theorem for 0 < p ≤ α/2. However, very restrictive conditions on possible perturbation processes Y are imposed, so that the results are not applicable to processes of the type (1.1). 3. Applications to real data In this section we illustrate our convergence results and show how they can be applied to the estimation of the stability index α. We emphasize that the conclusions obtained are somewhat heuristic. Additional work has to be done to provide more precise statistical properties of the p-variation processes as estimators for the stability index, and to describe the decision rule of the testing procedure along with its quality features.
Limit Theorems for p-Variations of Solutions of SDEs, Model Selection
167
We first work with simulated data. Assume they are realizations of the SDE (2.11) where L is a stable process with unknown stability index α and scale C > 0. From the data set we extract m samples with n data points each by taking adjacent non-overlapping groups of n consecutive points. This way we get the (i) (i) (i) samples (X (i) )1≤i≤m with X (i) = (X 1 , X 2 , . . . , X1 ), 1 ≤ i ≤ m. Assume for a n n moment that p = 2α and p > 1. Then Corollary 2.1 can be applied with Bn (α, p) = 0. Along each sample we calculate the p-variation Vpn (X (i) )1 =
n
|∆nj X (i) |p ,
1 ≤ i ≤ m,
(3.1)
j=1
which is close in law to the stable random variable L1 , possessing stability index α/p = 1/2 and the scale C = C (C) connected with the scale parameter C of L by the relation (2.5). The probability distribition function F1/2,C of the limiting spectrally positive 1/2-stable random variable L1 is known explicitly: < C x e−C /2y (3.2) dy, x ≥ 0. F1/2,C (x) = 2π 0 y 3/2 Convergence in law is in turn equivalent to the convergence of the correponding Kolmogorov–Smirnov distance. Thus calculating from the data the empirical distribution function of (Vpn (X (i) )1 )1≤i≤n given by 1 I (V n (X (i) )1 ), m i=1 (−∞,x] p m
Gp,n (x) :=
x ∈ R,
(3.3)
the Corollary 2.1 implies that for n big enough Dn (C, p) = sup |Gp,n (x) − F1/2,C (C) (x)| ≈ 0. x≥0
(3.4)
This argument allows us to estimate the unknown values α and C. Indeed, calculating numerically the empirical distribution functions Gp,n for different values p ∈ [p1 , p2 ], 1 < p1 < p2 , we minimize the Kolmogorov–Smirnov distance Dn (C, p) over the parameter domain p ∈ [p1 , p2 ] and C ∈ [C1 , C2 ]. If Dn (C, p) attains its unique minimum at C = C ∗ and p = p∗ we conclude that L has the stability index α∗ = p∗ /2 and the scale C ∗ . To test this method, we simulate m = 200 samples of the data from equation (2.11) with f (·, x) = cos x, x ∈ R, and L1 ∼ S0.75 (6.35, 0, 0), n = 200. We find that the Kolmogorov–Smirnov distance Dn (C, p) attains a unique global minimum at C ∗ ≈ 6.35 and p∗ ≈ 1.5 corresponding to the true values of α and C (see Fig. 10.1). We next study the real ice-core data, analysed earlier by Ditlevsen.1,2 The logcalcium signal covers the time period from approximately 90 150 to 10 150 years before present. We divide it into m = 282 samples, each containing n = 282 data points. Then the Kolmogorov–Smirnov distance is minimized numerically over p and C according to (3.4).
168
Claudia Hein, Peter Imkeller and Ilya Pavlyukevich 0.6 0.5 0.4 0.3 0.2 0.1
1.0
Figure 10.1.
1.5
2.0
2.5
3.0
3.5
Dn (C ∗ , p) for the simulated data, L is a 0.75-stable L´evy process, n = m = 200 0.6 0.5 0.4 0.3 0.2 0.1
1.0
Figure 10.2.
1.5
2.0
2.5
3.0
3.5
Dn (C ∗ , p) for the ice-core data set, C ∗ ≈ 7.2, α∗ ≈ 0.75, n = m = 282. 0.6 0.5 0.4 0.3 0.2 0.1
1.0
Figure 10.3.
1.5
2.0
2.5
3.0
3.5
4.0
Dn (C, p) for the ice-core data set, C = 3.28, n = m = 282.
It turns out that Dn (C, p) for the real data also exhibits a unique global minimum in the (C, p)-domain, which yields the estimate α∗ ≈ 0.75 for C ∗ ≈ 7.2, Dn (C ∗ , p∗ ) ≈ 0.1 (see Fig. 10.2). It is striking that our estimate differs from Ditlevsen’s by a quantity very close to 1. This discrepancy can be explained as follows. It turns out that the function p → Dn (C, p) has two local minima for some values of C different from the optimal value C ∗ . For example, for C = 3.28 there
Limit Theorems for p-Variations of Solutions of SDEs, Model Selection
169
are two local minima corresponding to the stability indices α1 ≈ 1.02 and α2 ≈ 1.75 with the distances Dn (C, 2α1 ) ≈ 0.18 and Dn (C, 2α2 ) ≈ 0.27. Unfortunately the paper2 only contains the estimated value of the stability index α ≈ 1.75 of the (symmetric) forcing L, and not its scale. It is possible that under some a priori assumptions on C, Ditlevsen’s method provides a locally best fit which is not globally optimal (see Fig. 10.3). 4. Functional convergence for Vpn (L) To prove the convergence of the marginal distributions we use the following theorem which is a direct result of the well known generalized central limit theorem for i.i.d. random variables with infinite variance (e.g. see Theorem 3 in Feller [6, Chapter XVII.5]). Prop 10.1. Let (ηi )i≥1 be a sequence of non-negative i.i.d. random variables with a regularly varying tail such that P(η1 > x) ≈ C
2 − α −α/p x α
as x → +∞.
(4.1)
for some α ∈ (0, 2), p > α/2 and C > 0. Then for any t > 0 we have n t p/α
n with
d
ηi − bt,n (α, p) → tp/α Z,
as n → ∞,
(4.2)
i=1
p/α t n Eη1 , n tη 1 bt,n (α, p) = nE sin , n 0, p > α.
p ∈ (α/2, α) (4.3)
p = α,
where Z ∼ Sα/p (C , 1, 0) with C as defined in (2.5). Let L be an α-stable L´evy process as defined in (2.1) and let p > α/2. To study the finite dimensional distributions of Vpn (L)t we note that due to the independence of increments of L it suffices to establish the convergence of marginal laws for a fixed t > 0. Further, the stationarity and independence of increments of L and the selfd similarity property Lt = t1/α L1 implies that Vpn (L)t
=
[nt] i=1
d
d |∆ni L|p =
[nt] |∆1 L|p i
i=1
np/α
,
with ∆1i L = L1 ∼ Sα (C, β, 0) being i.i.d. random variables.
(4.4)
170
Claudia Hein, Peter Imkeller and Ilya Pavlyukevich
It is easy to see that the random variables |∆1i L|p have a distribution function with a regularly varying tail, namely P(|∆1i L|p > x) = P(|L1 | > x1/p ) ≈ CF
2 − α −α/p x α
as x → +∞,
(4.5)
and thus we can apply Proposition 10.1 to the sum (4.4). Taking into account that nt/[nt] → 1 as n → ∞, we obtain convergence of the finite dimensional laws in Theorem 2.1. Again as these are sums of i.i.d. random variables, we can combine the results from Theorems VII.2.35 and VII.2.29 in Ref. 13 to obtain functional convergence for the process.
5. Generalisation to sums of processes We finally discuss the situation of Theorem 2.3, where besides the L´evy process L another process Y is given. To see that Vpn (L) and Vpn (L + Y ) have equivalent asymptotic behaviour we apply Lemma VI.3.31 in Ref. 13. Under the conditions u.c.p. of Theorems 2.2 and 2.3 it is enough to show that Vpn (L + Y ) − Vpn (L) → 0 as n → ∞.
5.1. Equivalence for p ≤ 1 Let L be an α-stable L´evy process, p ∈ (α/2, 1] and let Y be such that for u.c.p. Vpn (Y ) → 0, as n → ∞. Note that due to the monotonicity properties of Vpn (Y ), P
the latter convergence condition is equivalent to Vpn (Y )N → 0 for any N ≥ 1. Then a simple application of the triangle inequality yields the proof. In fact, for any N ≥ 1 we have
sup |Vpn (L + Y )t − Vpn (L)t | ≤
0≤t≤N
≤
nN
nN n |∆ (L + Y )|p − |∆n L|p i
i
i=1
(5.1) P
|∆ni (L + Y ) − ∆ni L|p = Vpn (Y )N → 0,
n → ∞.
i=1
5.2. Equivalence for p > α P
Assume again that Vpn (Y )t → 0, n → ∞, for any t ≥ 0. Denote m := [p], so that p = m + q with q ∈ [0, 1). Then for any N ≥ 1 we have
Limit Theorems for p-Variations of Solutions of SDEs, Model Selection
171
[nt] sup Vpn (L + Y )t − Vp (L)t = sup |∆ni (L + Y )|m+q − |∆n Li |m+q
t≤N
t≤N
i=1
[nt] m m n q = sup |∆n Li |k |∆ni Y |m−k − |∆ni L|m+q |∆i (L + Y )| k t≤N i=1 k=0 m−1 nN m n |∆i (L + Y )|q |∆ni L|k |∆ni Y |m−k ≤ k i=1 k=0
nN n + |∆i (L + Y )|q |∆ni L|m − |∆ni L|m+q i=1
≤
m−1
nN m−1 nN m n k+q n m−k m n k n m−k+q |∆i L| |∆i Y | + |∆ L| |∆i Y | k i=1 k i=1 i
k=0
+
nN
k=0
|∆ni L|m |∆ni (L + Y )|q − |∆ni L|q
i=1
≤
(5.2)
m−1
nN m−1 nN m n k+q n m−k m n k n m−k+q |∆ L| |∆i Y | + |∆ L| |∆i Y | k i=1 i k i=1 i
k=0
k=0
+ I{q>0}
nN
|∆ni L|m |∆ni Y |q .
i=1
The right-hand side of the latter inequality is essentially a sum of 2m + 1 terms of nN the type i=1 |∆ni L|k |∆ni Y |p−k , where k ∈ [0, p] ∩ N. Applying H¨ older’s inequality we get [nt] i=1
|∆n Li |k |∆n Yi |p−k ≤
[nt]
|∆n Li |p
[nt] k/p
i=1
|∆n Yi |p
(p−k)/p .
(5.3)
i=1
The first factor in the latter formula converges in probability to a finite limit, since p > α. The second factor converges to 0 in probability due to the assumption on Y , and Theorem 2.2 is proven. 5.3. Equivalence for α ∈ (1, 2), p ∈ (1, α] The main technical difficulty of this case arises from the fact that the p-variation of L for p < α does not exist. In particular, the events when increments of the stable process L become very large have to be considered carefully. For T > 0 and some c > 0 which will be specified later define the following sets: Jcn (ω) := {i ∈ [0, [nT ]] : |∆ni L(ω)| > c}, Anc (j) := {ω ∈ Ω : |Jcn (ω)| = j},
j = 0, . . . , [nT ].
(5.4)
172
Claudia Hein, Peter Imkeller and Ilya Pavlyukevich
The set Jcn contains the time instants (in the scale n1 ), where the increments of the process L are ‘large’, i.e. exceed c. The set Anc (j) describes the event that the number of large increments equals j. Let δ > 0, ε > 0. According to the conditions of Theorem (2.3), let K := δ K( 2 ) > 0, and set B = {ω : |Ys (ω) − Yt (ω)| ≤ K |s − t| for all s, t ∈ [0, T ]},
(5.5)
so that P(B) ≥ 1 − 2δ . We estimate P sup Vpn (L + Y )t − Vpn (L)t > ε 0≤t≤T
' & [nT ] ¯ ≤P |∆ni (L + Y )|p − |∆ni L|p > ε ∩ B + P(B) i=1
' δ & [nT ] P |∆ni (L + Y )|p − |∆ni L|p > ε ∩ Anc (j) ∩ B + 2 j=0 i=1
[nT ]
≤
ε' & ∩ Anc (j) ∩ B P ≤ |∆ni (L + Y )|p − |∆ni L|p > 2 n j=0 [nT ]
(5.6)
i∈Jc
ε' δ & ∩ Anc (j) ∩ B + P + |∆ni (L + Y )|p − |∆ni L|p > 2 2 n j=0 [nT ]
i∈Jc
δ =: D(1) (n, c) + D(2) (n, c) + . 2 In the following two steps we show that for appropriately chosen c = c(ε) > 0 and n big enough, D(1) (n, c) = 0 and D(2) (n, c) < δ/2. This will finish the proof. Step 1. To estimate D(1) (n, c), let ω ∈ B. Using the elementary inequality p |a − bp | ≤ max{a, b}p−1 p|a − b| which holds for a, b ≥ 0 and p ≥ 1 we estimate |∆ni (L(ω) + Y (ω))|p − |∆ni L(ω)|p i∈Jcn (ω)
≤
i∈Jcn (ω)
≤
i∈Jcn (ω)
K p−1 n p c+ |∆i (L(ω) + Y (ω))| − |∆ni L(ω)| n K p−1 n p c+ |∆i Y | n
K p−1 K p c+ n n i∈Jcn (ω) p−1 K K T, ≤p c+ n ≤
(5.7)
Limit Theorems for p-Variations of Solutions of SDEs, Model Selection
173
where we have used that the ‘small’ increments of L are bounded by c. So we have ε' & ∩ Anc (j) ∩ B P |∆ni L + ∆ni Y |p − |∆ni L|p > 2 n j=0
[nT ]
D(1) (n, c) =
i∈Jc
& K p−1 ε' ∩ Anc (j) ∩ B P p c+ KT > n 2 j=0 p−1 K ε =0 ≤P p c+ K T > n 2
[nT ]
≤
for all n ≥ n1 with 1 % 1 2pK T p−1 $ 1 ε p−1 and c = c(ε) := . n1 := 2K ε 4 2pK T
(5.8)
(5.9)
Step 2. To estimate D(2) (n, c), let n ≥ Kc so that for ω ∈ B and i ∈ Jcn (ω) we have ∆n Y (ω) K i ≤ 1. (5.10) n ≤ ∆i L(ω) nc By means of the elementary inequality (1+|x|)p −1 ≤ 3|x| which holds for 1 ≤ p ≤ 2 and |x| ≤ 1 this implies that |∆ni (L(ω) + Y (ω))|p − |∆ni L(ω)|p i∈Jcn (ω)
≤
|∆ni L(ω)|p
i∈Jcn (ω)
≤
i∈Jcn (ω)
≤
$
∆n Y (ω) p % 1 + in −1 ∆i L(ω)
∆n Y (ω) |∆ni L(ω)|p 3 in ∆i L(ω) |∆ni L(ω)|p
i∈Jcn (ω)
(5.11)
3K . nc
This in turn immediately yields & 3K ε ' > P Anc (j) ∩ |∆ni L|p nc 2 n j=1
[nT ]
D
(2)
(n, c) ≤
i∈Jc
= & εnc ' . |∆ni L|p > P Anc (j) ∩ ≤ 6K j n j=1 [nT ]
(5.12)
i∈Jc
Since all ∆ni L, i = 1, . . . , [nT ], are i.i.d., and only j of them exceed the threshold c, we can estimate the probability for this event precisely. Indeed, denoting pn := P(|∆n1 L| > c) = P(|L1 | > cn1/α ),
(5.13)
174
Claudia Hein, Peter Imkeller and Ilya Pavlyukevich
we continue the estimates in (5.12) to get [nT ]
D
(2)
[nT ] P {|∆ni L| > c, i = 1, . . . , j} (n, c) ≤ j j=1 ∩ {|∆ni L| ≤ c, i = j + 1, . . . , [nT ]} ∩
[nT ]
[nT ] jP {|∆ni L| > c, i = 1, . . . , j} = j j=1
j & = i=1
|∆ni L|p >
εnc ' 6jK (5.14)
& εnc ' ∩ {|∆ni L| ≤ c, i = j + 1, . . . , [nT ]} ∩ |∆n1 L|p > 6jK [nT ] [nT ] εnc [nT ]−j jpj−1 ≤ . P |∆n1 L|p > n (1 − pn ) 6jK j j=1 From (4.5) we know that there is a c˜ such that pn ≤
c˜ cα n
(5.15)
and εnc 1/p c˜ εnc −α/p εnc = P |L1 | > P |∆n1 L|p > n1/α ≤ 6jK 6K j n 6K j
(5.16)
holding for some c˜ > 0, for all n ≥ n2 ≥ Kc and 1 ≤ j ≤ [nT ]. Combining (5.14) and (5.16), denoting the constant pre-factor by C, and recalling that αp ≤ 2, we obtain for n ≥ n2 that D
(2)
[nT ] [nT ] C j 1+α/p pjn (1 − pn )[nT ]−j (n, c) ≤ j pn n1+α/p j=1 [nT ] [nT ] C j 3 pjn (1 − pn )[nT ]−j . ≤ j pn n1+α/p j=1
(5.17)
The sum in the previous formula represents the third moment of a binomial distribution, and thus can be calculated explicitly. By means of the asymptotic behaviour (4.5) we get (nT − 2)(nT − 1)nT p3n + 3(nT − 1)nT p2n + nT pn pn n1+α/p CT ≤ α/p ((nT pn )2 + 3nT pn + 1). n
D(2) (n, c) ≤ C
(5.18)
Now choose n ≥ n3 ≥ n2 big enough to ensure that this expression is smaller than δ 2 . This completes the proof of Theorem 2.3.
Limit Theorems for p-Variations of Solutions of SDEs, Model Selection
175
Acknowledgement P.I. and I.P. thank DFG SFB 555 Complex Nonlinear Processes for financial support. C.H. thanks DFG IRTG Stochastic Models of Complex Processes for financial support. C.H. is grateful to R. Schilling for his valuable comments. The authors thank P. Ditlevsen for providing the ice-core data. References 1. P. D. Ditlevsen, Anomalous jumping in a double-well potential, Physical Review E. 60(1), 172–179, (1999). 2. P. D. Ditlevsen, Observation of α-stable noise induced millenial climate changes from an ice record, Geophysical Research Letters. 26(10), 1441–1444 (May, 1999). 3. W. Dansgaard, S. J. Johnsen, H. B. Clausen, D. Dahl-Jensen, N. S. Gundestrup, C. V. Hammer, C. S. Hvidberg, J. P. Stefensen, A. E. Sveinbjornsdottir, J. Jouzel, and G. Bond, Evidence for general instability of past climate from 250 kyr ice-core record, Nature. 364, 218–220, (1993). 4. P. E. Protter, Stochastic integration and differential equations. vol. 21, Applications of Mathematics, (Springer, 2004), second edition. 5. G. Samorodnitsky and M. S. Taqqu, Stable non-Gaussian random processes. (Chapman&Hall/CRC, 1994). 6. W. Feller, An introduction to probability theory and its applications: volume II. (John Wiley & Sons, 1971). 7. P. E. Greenwood, The variation of a stable path is stable, Zeitschrift f¨ ur Wahrscheinlichkeitstheorie und verwandte Gebiete. 14(2), 140–148, (1969). 8. P. E. Greenwood and B. Fristedt, Variations of processes with stationary, independent increments, Zeitschrift f¨ ur Wahrscheinlichkeitstheorie und verwandte Gebiete. 23(3), 171–186, (1972). 9. D. Lepingle, La variation d’ordre p des semi-martingales, Zeitschrift f¨ ur Wahrscheinlichkeitstheorie und verwandte Gebiete. 36, 295–316, (1976). 10. J. Jacod, Asymptotic properties of power variations of L´evy processes, ESAIM: Probability and Statistics. 11, 173–196, (2007). 11. J. Jacod, Asymptotic properties of realized power variations and related functionals of semimartingales., Stochastic Processes and their Applications. 118(4), 517–559, (2008). 12. J. M. Corcuera, D. Nualart, and J. H. C. Woerner, A functional central limit theorem for the realized power variation of integrated stable processes, Stochastic Analysis and Applications. 25, 169–186, (2007). 13. J. Jacod and A. N. Shiryaev, Limit theorems for stochastic processes. vol. 288, Grundlehren der Mathematischen Wissenschaften, (Springer, 2003), second edition.
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 177–183
Chapter 11 Class II Semi-Subgroups of the Infinite Dimensional Rotation Group and Associated Lie Algebra Takeyuki Hida and Si Si Professor Emeritus, Nagoya University and Meijo University Nagoya, Japan. Faculty of Information Science and Technology Aichi Prefectural University, Aichi Prefecture, Japan. The infinite dimensional rotation group plays important roles in white noise analysis. In particular, the class II subgroup describes significant probabilistic properties. We shall find a local Lie group involving some half whiskers in the class II and discuss its Lie algebra.
Keywords: white noise, infinite dimensional rotation group, half whisker 2000 AMS Subject Classification: 60H40 Contents 1 Introduction . . . . . . . . 2 Class II Subgroups of O(E) 3 Half whiskers . . . . . . . . 4 Lie algebra . . . . . . . . . 5 Concluding remarks . . . . References . . . . . . . . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
177 178 180 181 182 182
1. Introduction There are two main characteristics in white noise analysis. (1). Generalized white noise functionals. It has reasonably been developed as a typical theory of infinite dimensional calculus. There should be reminded an important view point for the discussion of finite dimensional approximation, that is, in terms of Volterra-L´evy le passage du fini a ` l’infini. We emphasize the theory dealing with essentially infinite dimensional calculus, where we have a base involving continuously many (independent) vectors. We, therefore, come naturally to a space of generalized white noise functionals. 177
178
Takeyuki Hida and Si Si
(2). Infinite dimensional rotation group O(E). Following H. Yoshizawa we start with the rotation group O(E) of E (⊂ L2 (R)). The group consists of homeomorphisms of E which are orthogonal in L2 (R). It is topologized by the compact-open topology. The collection O∗ (E ∗ ) of the adjoint operators g ∗ of g ∈ O(E) forms a group which is isomorphic to O(E). The significance of the group O∗ (E ∗ ) is that every g ∗ in O∗ (E ∗ ) keeps the white noise measure µ to be invariant: g ∗ µ = µ. By using the group O(E), isomorphic to O∗ (E ∗ ), we can carry on, as it were, an infinite dimensional harmonic analysis, which should be a part of the white noise analysis. We now have a brief observation of the group O(E) as a preliminary of the present report. One might think that O(E) is a limit of the finite dimensional rotation groups SO(n) as n → ∞, but not quite. The limit can occupy a very minor part of O(E) : of course it is almost impossible to measure the size of the limit occupied in the entire group O(E). Our idea of investigating the group is as follows: i) Since O(E) is very big (neither compact nor locally compact), we take subgroups that can be managed. First the entire group is divided into two parts: Class I and Class II. The Class I involves members that can be determined by using a base, say {ξn } of E and any member of the class II comes from a diffeomorphism of the parameter space R1 . ii) Each class has subgroups. We are now interested in new subsemigroups in the class II that are isomorphic to local Lie groups and have certain probabilistic meanings. The present report shall discuss particularly subgroups belonging to the class II. The class I has been discussed extensively, while we know only few of subgroups belonging to the class II. Of course, known subgroups are quite important, however we are sure that there should be some more (certainly quite many) significant subgroups that are important for our white noise analysis; actually we have been in search of finding such subgroups. Now we shall show some examples in the class II. 2. Class II Subgroups of O(E) This section is devoted to a brief review of the known results and find some hints to find new good subgroups. Each member of this class, say {gt , t : real}, should be defined by a system of ¯ = R ∪ ∞ : one point compactification. parameterized diffeomorphisms {ψt (u)} of R Namely, ξ(u) → (gξ(ψt (u)) |ψt (u)|, (2.1)
Class II Semi-Subgroups
179
where ψt (u) is the derivative of ψt (u) in the variable u. We are interested in a subgroup which can be made to be a local Lie group embedded in O(E). In what follows the basic nuclear space is specified to Do (see [3]). More practically, we restrict our attention to the case where gt , t ∈ R forms a one-parameter group such that gt is continuous in t. Assume further that there d gt |t=0 of this group gt . exists the (infinitesimal) generator A = dt There exits, by the assumptions of the group property and continuity, a family {ψt (u), t ∈ R} such that ψt (u) is measurable in (t, u) and satisfies ψt · ψt = ψt+s ψ0 (u) = u. Such a one-parameter group is called whisker. Let gt∗ be the adjoint operator to gt . Then, the system {gt∗ } again forms a oneparameter group of µ (the white noise measure) preserving transformations gt∗ . The system is a flow on the white noise space E ∗ (µ). By J. Aez´el [1], we have an expression for ψt (u): ψt (u) = f (f −1 (u) + t)
(2.2)
where f is continuous and strictly monotone. Its (infinitesimal) generator α, if f is differentiable, can now be expressed in the form α = a(u)
1 d + a (u), du 2
(2.3)
where a(u) = f (f −1 (u)).
(2.4)
See e.g. [4], [5]. We have already established the results that there exists a three dimensional subgroup of class II with significant probabilistic meanings. The group consists of three one-parameter subgroups, the generators of which are expressed by a(u) = 1, a(u) = u, a(u) = u2 , respectively. Namely, we show a list: d , du d 1 τ =u + , du 2 d κ = u2 + u. du One of the interesting interpretations may be said that they put together describe the projective invariance of Brownian motion. Those generators form a base of a three dimensional Lie algebra under the Lie product [α, β] = αβ − βα, satisfying the Jacobi identity. s=
180
Takeyuki Hida and Si Si
The algebra given above is isomorphic to sl(2, R). This fact can easily be seen by the commutation relations: [τ, s] = −s,
[τ, κ] = κ,
[κ, s] = 2τ.
There is a remark that the shift with generator s is sitting as a key member of the generators. It corresponds to the flow of Brownian motion, significance of which is quite clear. Also, one can take τ to be another key generator. The τ describes the OrnsteinUhlenbeck Brownian motion which is Gaussian and simple Markov. We are now in search of new whiskers that show some significant probabilistic properties as above three whiskers under somewhat general setup. There a whisker may be changed to a half-whisker. 3. Half whiskers First we recall the notes [11] p. 60, section O∞ 1, where a new whisker with generator αp = up
d p + up−1 du 2
(3.1)
is suggested to be investigated, where p is not necessarily be integer. (The power p was written as α in [11], but to avoid notational confusion, we write p instead of α.) Since fractional power p is involved, we tacitly assume that u is positive, We, therefore, take a white noise with time-parameter [0, ∞). The basic nuclear space E is chosen to be D00 which is isomorphic to D0 , eventually isomorphic to C ∞ (S 1 ). We are now ready to state an answer. As was remarked in the last section, the power p = 1 is the key number and, in fact, it is exceptional. In this case the variable u can run through R, that is, corresponds to a whisker with generator τ . In what follows we escape from the case p = 1. We remind the relationship between f and a(u) that appear in the expressions of ψt (u) and α, respectively. The related formulas are the same as the case where u runs through R. Assuming differentiability of f we have the formula (2.4). For a(u) = up , the corresponding f (u) is determined. Namely, up = f (f −1 (u)). An additional requirement for f is that the domain of f should be the entire [0.∞). Hence, we have 1
f (u) = cp u 1−p ,
(3.2)
Class II Semi-Subgroups
181
where cp = (1 − p)1/(1−p) . Summing up, we define ψt (u) for u > 0 by (2.2) such that ψt (u) = f −1 (f (u) + t) with the special choice of f given by (3.2). We, therefore, have f −1 (u) = (1 − p)−1 u1−p . We are ready to define a transformation gtp acting on D00 by : u1−p cp u1−p 1/(1−p) + t) ( + t)p/(1−p) u−p . (gt ξ)(u) = ξ(cp ( 1−p 1−p 1−p
(3.3)
(3.4)
Note that f is always positive and maps (0, ∞) onto itself, in the ordinary order in the case p < 1, and in the reciprocal order in the case p > 1. The exceptional case p = 1 is refered to the literature [5]. It has been well defined. Then, we claim, still assuming p = 1, the following theorem. Theorem 3.1. i) gtp is a member of O(D00 ) for every t > 0. ii) The collection {gtp , t ≥ 0} forms a continuous semi-group with the product gtp · p for t, s ≥ 0. gsp = gt+s iii) The generator αp of gtp is αp given by (3.1). Proof. Assertion i) comes from the structure of D00 . Assertions ii) and iii) can be proved by the actual computations. Definition. A continuous semi-group gt , t ≥ 0, each member of which comes from ψt (u) is called a half whisker. Theorem 3.2. The collection of half whiskers gtp , t ≥ 0, p ∈ R, generates a local Lie semi-group GL : GL = generated by {gtp11 · · · gtpnn } The definition of a local Lie group is found, e.g. in W. Miller, Jr. [10]. A semi-group is defined similarly. 4. Lie algebra The collection {αp ; p ∈ R} generates a vector space gL . There is introduced the Lie product [·, ·]. Note that the exceptional power p = 1 is now included. The whiskers
182
Takeyuki Hida and Si Si
introduced in Section 1.2 are considered as half whiskers by letting the parameter t run through [0, ∞). In this sense, we identify in such a manner that α0 = s,
α1 = τ,
α2 = κ.
With these understanding, we have Theorem 4.1. The space gL is a Lie algebra parameterized by p ∈ R. It is associated with the local Lie semi-group GL . Proof.
We have
d 1 + (q − p)u(p+q−2) . du 2 The result is a(u) = (q − p)up+q−1 . This proves the theorem. [αp , αq ] = (q − p)up+q−1
(4.1)
In fact, we have an infinite dimensional Lie algebra, the base of which consists of one-parameter generators of half whiskers. 5. Concluding remarks We shall propose a more general theory, where it is possible to propose many kinds of half whiskers. Namely, we may consider general infinitesimal generators, where the functions a(u) in (2.3) or f in (2.2) are restricted so as to define subgroups of O(D00 ). For every p, the (gtp )∗ is a semigroup of µ-measure preserving transformations. We may, therefore, define a Gaussian process X p (t) in such a manner that X p (t) = (gtp )∗ x, ξ , where x ∈ E ∗ (µ). We have much freedom to choose ξ, in fact, we may choose the indicator function χ[0,1] (u). Acknowledgement We are grateful to Professor I. Volovich who told us to remind Virasoro type Lie algebras in connection with the group O(E). We are encouraged to think of the problem raised in [11] and the whiskers discussed in [4] and [5]. References 1. J. Acz´el, Vorlesungen u ¨ber Funktionalgleichungen und ihre Anwendungen, Birkh¨ user, 1960. 2. I.M. Gel’fand. Russian Math. Survey 29 (1974) 3-16. 3. I.M. Gel’fand, M.I. Graev and N.Ya. Vilenkin, Generalized functions. vol. 5, 1962 (Russian original), Academic Press, 1966. 4. T. Hida, Stationary stochastic processes, Princeton Univ. Press, 1970.
Class II Semi-Subgroups
183
5. T. Hida, Brownian motion. Iwanami Pub. Co. 1975, in Japanese; english transl. Springer-Verlag, 1980. 6. T. Hida and Si Si, Lectures on white noise functionals. World Sci. Pub. Co. 2008. 7. A.A. Kirillov, K¨ ahler structures on K-orbits of the group of diffepmorphisms of a circle. Functional Analysis and its Applications. vol.21, no.2. 42-45 (1986; english. 1987). 8. A.A. Kirillov and D.V. Yur’ev, K¨ ahler geometry of the infinite dimensional homogeneous space M = Dif f+ (S 1 )/Rot(S 1 ). Functional analysis and its Applications vol.21, no.4, (1987; english. 284-294). 9. P.A. Meyer et J.A. Yan, A propos des distributions. LNM 1247 (1987), 8-26. 10. W.Miller, Jr. Lie theory and special functions, Academic Press. 1968. 11. T. Shimizu and Si Si, Professor Takeyuki Hida’s mathematical notes. informal publication 2004.
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 185–193
Chapter 12 Stopping Weyl Processes
R. L. Hudson Mathematics Department, Loughborough University, Loughborough, Leicestershire LE11 3TU, Great Britain [email protected] It is shown that, in order that the multiplicativity rule Wab Wbc = Wac , which essentially characterizes Weyl processes, continue to hold when the sure times a < b < c are replaced by stop times, it is sufficient to use left stopping at the lower limit a and right stopping at the upper limit b in Wab .
Contents 1 Introduction . . . . . . . . . . . . . . . . . . 2 Characterization of unitary product integrals 3 Stop times . . . . . . . . . . . . . . . . . . . 4 Factorizing product integrals at stop times . References . . . . . . . . . . . . . . . . . . . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
. . . . .
185 187 190 191 192
1. Introduction The quantum notion of stop time was introduced in Ref. 3, where it was called Markov time, in the context of both Fock and non-Fock extremal universally invariant quantum stochastic calculus. It has been developed in the more general context of a filtered von Neumann algebra by a number of authors beginning with Barnett and Lyons2 (see Ref. 8 for more recent references) and in the context of Fock quantum stochastic calculus by Parthasarathy and Sinha.9 In all these works except Ref. 3 a distinction is made, when stopping a quantum process at a stop time, between right and left stopping. The stop time is in effect a projection-valued measure and a process is an operator-valued function of time. To stop the process one evaluates it at the stop time as a spectral integral against the projection valued measure as integrator, with the operator-valued function as integrand. If the latter does not commute with the former the integrator may be placed on either side of the integrand resulting in two different values for the integral. This problem is avoided in Ref. 3 by formulating the strong Markov property in such a way that the 185
186
Robin L. Hudson
relevant integrand commutes with the integrator; see also Ref. 4 where this aspect is emphasised. The Weyl operators W (f ), f ∈ L2 (R+ ), forming a representation of the canonical commutation relations, 1 iIm f, g W (f + g), f, g ∈ L2 (R+ ), W (f )W (g) = exp 2 have from the beginning7 played a motivating role in both Fock and non-Fock quantum stochastic calculus. Each Weyl operator W (f ) generates a Weyl process (W (f )t )t∈R+ with W (f )t = W (f χ[0,t] ), where χ[0,t] is the indicator function of [0, t], which is the unique solution of the quantum stochastic differential equation σ2 2 |f | dT ), (1.1) 2 with initial condition W (f )0 = 1, where the variance parameter σ = 1 in the Fock case and σ > 1 in the case. It follows that if, for a < b we define
non-Fock Wab (f ) = W (f χ]a,b] ), then Wab (f ) 0≤a≤b satisfies the evolution equation dW (f ) = W (f )(f dA† − f¯dA −
Wab Wbc = Wac whenever a < b < c.
(1.2)
A more suggestive notation for Wab (f ) is as a product integral, Wab (f ) =
b σ2 2 |f | dT ) (1 + f dA† − f¯dA − 2 a
meaning that Wab (f ) is the solution at b > a of the differential equation (1.1) with initial condition W (f )a = 1. Note that Wab (f ) is strongly continuous in a, b ∈ R+ ; this follows from the strong continuity of the Weyl operators W (f ) in f ∈ L2 (R+ ). It is well defined when f is only locally square integrable; f ∈ L2loc (R+ ). More generally given such f and also a real-valued locally integrable function h on R+ , then the family of unitary operators b σ2 2 |f | )dT 1 + f dA† − f¯dA + (ih − Wab (f ; h) = 2 a (1.3) = Wab (f )γab > b where γab = ba (1 + ihdT ) = exp i a h(x)dx also satisfies (1.2). let there be given a strongly continuous family of unitary operators
Conversely b Wa a
and such that each Wab belongs to the von Neumann algebra Nab generated by the Weyl operators W (f ).with f ∈ L2 (R+ ) supported by the interval ]a, b]. Then in the non-Fock case σ > 1 there exists a (locally) square integrable complex-valued function f on R+ and a locally integrable real-valued function h on R+ such that each Wab = Wab (f ; h). This is not true in the Fock case, when product integrals involving the conservation process can also arise. Although it is surely well known
Stopping Weyl Processes
187
to be true in the non-Fock case, I do not know of a proof so one is given in Section 2. For a related result proved by similar methods see Ref. 6. The main purpose of this paper is to obtain a generalisation of the evolution property (1.2) which will hold when the real numbers a, b and c are replaced by stop times. We shall see that (1.2) will hold for stop times a, b and c satisfying a < b < c, where a < b means that every point in the spectrum of a does not exceed every point of the spectrum of b, if and only if Wab is defined for stop times a < b in this sense by using left stopping at a and right stopping at b. 2. Characterization of unitary product integrals From now on we take the quantum stochastic calculus to be the non-Fock extremal universally invariant version of Ref. 5 of variance σ > 1, so that the quantum Itˆ o table is †
dA dA dT
dA† 0 α2 dT 0
dA β 2 dT 0 0
dT 0 . 0 0
(2.1)
where α and β are real numbers satisfying α2 + β 2 = σ 2 , α2 − β 2 = 1.
Theorem 2.1. Let there be given a family Wab a
that each Wab ∈ Nab , the map [0, b[ a → Wab is strongly continuous for each b > 0, the map ]a, ∞[ b → Wab is strongly continuous and the evolution equation (1.2) holds. Then there exists a locally square-integrable complex-valued function f and a locally integrable real valued function h such that each Wab is given by (1.3). Proof. The von Neumann algebras Nab are ampliations of von Neumann algebras in tensor factor Hilbert spaces in which factorisations the cyclic separating vacuum vector Ω is a product vector. Thus, taking vacuum expectations in the (evolution) equation Wab Wbc = Wac , we deduce that the complex numbers Φ(a, b) = Ω, Wab Ω satisfy the functional equation Φ(a, b)Φ(b, c) = Φ(a, c) whenever a < b < c. Moreover the complex-valued function Φ inherits continuity in its two arguments separately from the strong continuity of the family Wab . It follows Ref. 1 that Φ is of the form b φ(x) dx (2.2) Φ(a, b) = exp a
for some integrable complex-valued function φ. Since, in view of the unitarity of the Wab , each |Φ(a, b)| ≤ 1, we can put φ = −F + ih where F is nonnegative and h is real valued.
188
Robin L. Hudson
Now consider, for b fixed a, the operator valued function Ma on [a, ∞[ where Ma (b) = Wab − 1 − a Wax φ(x) dx. The vacuum expectation of Ma (b) is ? @ E [Ma (b)] =
Ω, Wab − 1 −
= Φ(a, b) − 1 −
b
a
Wax φ(x)dx Ω
b
Φ(a, x)φ(x)dx. a
It follows from the fundamental theorem of calculus that E [Ma (b)] = 0 for arbitrary a < b. For a < b < c we have c c Wax φ(x)dx Ma (c) = Wa − 1 − a
= Wab Wbc − 1 −
b
Wax φ(x)dx − Wab
a
= Wab (Wbc − 1 −
c
b
b
c
Wbx φ(x)dx
Wbx φ(x)dx) + Wab − 1 −
a
b
Wax φ(x)dx
= Wab Mb (c) + Ma (b) Taking the vacuum conditional expectation at time b we find that E Ma (c) Nab = E Wab Mb (c) + Ma (b). Nab = Wab E [Mb (c)] + Ma (b) = Ma (b). Hence the process (Ma (b))b≥a is a martingale, and we can apply the martingale representation theorem of Ref. 5 and the fact that Ma (b) = 0 when b = a to write each Ma (b) in the form b
Ea (s) dA† (s) + Fa (s)dA(s) Ma (b) = a
for unique adapted, locally square-integrable processes Ea and Fa on [a, ∞[. Equivalently we can write b
Wab = 1 − Ea dA† + Fa dA + Wax φ(x) dx . (2.3) a
Comparing two expressions Wac = 1 −
c
Ea dA† + Fa dA + Wax φ(x) dx ,
a
Wac = Wab Wbc = Wab −
b
c
Wab Eb dA† + Wab Fb dA + Wax φ(x) dx
we see that, for x > b > a Ea (x) = Wab Eb (x), Fa (x) = Wab Fb (x).
(2.4)
Stopping Weyl Processes
189
Equivalently, for a < b < c
b −1
−1 Wa Ea (c) = Eb (c), Wab Fa (c) = Fb (c). Consider, for n ∈ N, −1
(Wac )
−1 c− 1 c Ea (c) = Wa n Wc− Ea (c) 1 n −1 c− 1 −1 c Wa n = Wc− Ea (c) 1 n −1 c = Wc− Ec− n1 (c). 1 n
c Hence the former also belongs to Nc− The latter is an element of 1 and hence n. ∞ c c to n=1 Nc− 1 . = C1. Thus we can write Ea (c) = Wa fa (c) and similarly Fa (c) = n Wac ga (c) for some complex-valued functions fc and gc .Making these substitutions into (2.4) we find that c Nc− 1 n.
Wax fa (x) = Wab Wbx fb (x) = Wax fb (x), Wax ga (x) = Wab Wbx gb (x) = Wax gb (x), hence we can write fa = f and ga = g independently of a. By local square integrability of the processes Ea and Fa , f and g are locally square-integrable. Thus (2.3) becomes Wab = 1 −
a
b
Wax f (x) dA† (x) + Wax g(x) dA(x) + Wax φ(x) dx
and so the process Wa satisfies dWa = Wa (f dA† + g dA + φ dT ) with initial condition Waa = 1. Finally, since each Wab is unitary, differentiating the process Wa Wa∗ = 1 we find that 0 = d (Wa Wa∗ ) = Wa (f dA† (x) + g dA + φ dT )Wa∗ +Wa (¯ g dA† + f¯ dA + φ¯ dT )W ∗ a
+Wa (f dA† + g dA + φ dT )(¯ g dA† (x) + f¯ dA + φ¯ dT )Wa∗ 2 2 = Wa Wa∗ (f + g¯)dA† + (g + f¯)dA + (φ + φ¯ + β 2 |f | + α2 |g| )dT and so, again using the fact that Wa Wa∗ = 1,we must have g = −f¯ and φ = −
σ2 2 |f | + ih 2
for some real valued integrable h. Then Wab is given by (1.3).
190
Robin L. Hudson
3. Stop times A stop time is a right-continuous increasing family of projections (E(λ))λ∈R+ labelled by the nonnegative real numbers with the property that each & ' (3.1) E(λ) ∈ N λ = (W (f )t )t≤λ : f ∈ L2 (R+ ) , the von Neumann algebra generated by the Weyl processes (W (f )t )t≤λ up to time λ. We shall always assume that the strong limit limλ→∞ E(λ) = 1. Thus a stop time ∞ is charaterised by the nonnegative self-adjoint operator τ = 0− λdE(λ) which is affiliated to N . Stop times may be partially ordered either by prescribing that τ1 ≤ τ2 holds in the usual sense for self adjoint operators, that ψ, τ1 ψ ≤ ψ, τ2 ψ for arbitrary ψ in the domain of τ1 , or, more usefully,2 that E2 (λ) ≤ E1 (λ) for all λ ∈ R+ . In this chapter we shall use a partial ordering that is stronger than either of these, namely τ1 ≤ τ2 will mean that there is a real number µ ≥ 0 such that E1 (µ) = 1, E2 (µ) = 0;
(3.2)
equivalently every element of the spectrum of τ1 is majorised by every element of the spectrum of τ2 . Following the intuition that a function f (τ ) of the self-adjoint operator τ is ∞ defined as the spectral integral 0− f (λ)dE(λ), given a process X = (X(t))t∈R+ we may try to define the observable or random variable X(τ ) got by stopping the process at τ as ∞ X(τ ) = X(λ)dE(λ) 0−
whenever this exists in a suitable analytic sense, for example as the strong limit lim X(λj )(E(λj+1 ) − E(λj )) over partitions P = (0, λ1 , λ2 , ..., λN = λP ) of interP
vals [0, λP ] with λP → ∞ on the domain for which such limits exist. A problem which arises with this definition, of the right-stopped process, is that it fails to be equal to the similarly defined left-stopped process ∞ dE(λ)X(λ) = lim (E(λj ) − E(λj−1 ))X(λj ) P
0−
or to the double-stopped process ∞ dE(λ)X(λ)dE(λ) = lim (E(λj ) − E(λj−1 ))X(λj )(E(λj ) − E(λj−1 )). 0−
P
Compounding this problem of nonuniqueness, all three definitions have the defect that in general stopping the product of two commuting processes is not the same thing as the product of the stopped processe. These problems are avoided in the situation that each X(t) is affiliated to the commutant Nt of Nt so that each X(λj )(E(λj ) − E(λj−1 ) = (E(λj ) − E(λj−1 ))X(λj ) = (E(λj ) − E(λj−1 ))X(λj )(E(λj ) − E(λj−1 )),
Stopping Weyl Processes
191
and in fact the stopped process can then be given a more direct definition in terms of factorising vectors ψ = ψt ⊗ ψ t , χ = χt ⊗ χt , t ∈ R+ , namely the double exponential vectors e(f, g¯) = e(f ) ⊗ e(¯ g), as ∞ ( t ) ψ , X(τ )χt d ψt , E(t)χt . ψ, X(τ )χ = 0
Though this procedure may seem artificial it allows a precise formulation of the strong Markov property,3 that the processes A† and A begin anew independently of the past at each stop time. Parthasarathy and Sinha9 refined this idea in the case of Fock quantum stochastic calculus to show that the usual Fock space splitting at each sure time t ∈ R+ , H = Ht ⊗ Ht extends to stop times, H = H τ ⊗ H τ in which each exponential vector e(f ) is a product vector. Their treatment can be extended straightforwardly to the non-Fock case. 4. Factorizing product integrals at stop times We consider a family of product integrals b σ2 2 |f | )dT , a < b, 1 + f dA† − f¯dA + (ih − Wab (f ; h) = 2 a where f is complex-valued square-integrable and h is real-valued and integrable, equivalently, by Theorem 1, a strongly continuous family of unitary operators Wab ∈ Nab satisfying the evolution equation (1.2). Our purpose is to study to what extent (1.2) when a, b and c are replaced by stop times. ∞ ∞ For stop times τ1 = 0− λdE1 (λ), τ2 = 0− λdE2 (λ) with τ1 < τ2 in the sense (3.2) we define Wττ12 informally using left stopping at the lower limit τ1 and right stopping at the upper limit τ2 . Then ∞ ∞ dE1 (λ1 )Wλλ12 dE2 (λ2 ) Wττ12 = 0− 0− µ ∞ dE1 (λ1 )Wλλ12 dE2 (λ2 ) = 0− µ
µ ∞
0− µ
dE1 (λ1 )W λµ2 Wλµ1 dE2 (λ2 ) ∞ dE1 (λ)W λµ W µλ dE2 (λ),
= = 0−
µ
(4.1)
µ
are using the factorisation (1.2) with a = λ1 , b = µ and c = λ2 . The latter
integrals well defined in view of the unitarity and continuity of the family Wab a≤b so we may take (4.1) as the rigorous definition of Wττ12 . It is easy to see using (1.2) that this definition is independent of the choice of µ satisfying (3.2). Also it is clear that (1.2) holds for stop times a, b and c satisfying a ≤ b ≤ c, provided that b is sure. Theorem 4.1. (1.2) holds for arbitrary stop times a, b and c satisfying a ≤ b ≤ c in the sense of (3.2).
192
Robin L. Hudson
Proof. By replacing a by a sure time µ for which (3.2) holds with τ1 = a and τ2 = b and using the fact that (1.2) holds when b is sure we may assume without loss of generality that a is sure. Similarly we may assume that c is sure. Then, with ∞ c b= λdE(λ) = λdE(λ) 0−
we have
Wab Wbc
a
c
Waλ1 dE(λ1 )
=
a
a
c a
dE(λ2 )Wλc2
c
Waλ dE(λ)W cλ
= c
= ac
Waλ W cλ dE(λ)
Wac dE(λ) a c = Wac dE(λ)
=
a
= Wac where we used (3.1) and the fact that W cλ ∈ N cλ , together with the fact that (1.2) holds with b replaced by a sure time λ. Acknowledgement Conversations with Sylvia Pulmannov´a are acknowledged. References 1. J Aczel, Lectures on functional equations and their applications, Academic Press (1966). 2. C Barnett and T Lyons, Stopping noncommutative processes, Math. Proc. Cambridge Philos. Soc 99, 151-161 (1984) 3. R L Hudson, The strong Markov property for canonical Wiener processes, Jour. Funct. Anal. 34 266-281 (1979). 4. R L Hudson, Stop times in Fock space quantum probability, Stochastics 79, 383-391 (2007). 5. R L Hudson and J M Lindsay, A non-commutative martingale representation theorm for non-Fock quantum stochastic calculus, Jour. Funct. Anal 61, 202-221 (1985). 6. R L Hudson and J M Lindsay, Uses of non-Fock quantum Brownian motion and a quantum martingale representation theorem, pp276-305 in Quantum Probability II, Proceedings, Heidelberg 1984, Springe Lecture Notes in mathematics 1136 (1985). 7. R L Hudson and R F Streater, Noncommutative martingales and stochastic integrals in Fock space, pp216-227, in Stochastic processes in quantum theory and statistical physics, Proceedings, Marseilles, 1982, Springer Lecture Notes in Physics, 173 (1983)
Stopping Weyl Processes
193
8. A Luczak and A A A Mohammed, Stochastic integration in finite von Neumann algebras, Studia Sci. Math. Hungar. 44, 233-244 (2007) 9. K R Parthasarathy and K B Sinha, Stop times in Fock space stochastic calculus, PTRF 73, 317-349 (1987).
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 195–206
Chapter 13 Karhunen-Lo´ eve Expansion for Stochastic Convolution of Cylindrical Fractional Brownian Motions Zongxia Liang Department of Mathematical Sciences, Tsinghua University, Beijing 100084, China [email protected] This chapter aims at firstly establishing a Karhunen-Lo´eve expansion for stochastic convolution of cylindrical fractional Brownian motion, then studying asymptotic behavior of conditional exponential moments of the stochastic convolution via this expansion.
Keywords: Cylindrical Brownian motion, Cylindrical fractional Brownian motion, Stochastic convolution, Conditional exponential moments 2000 AMS Subject Classification: Primary 60H05, 60G15; Secondary 60J65, 60F10
Contents 1 2
Introduction . . . . . . . . . . . . . . . . . . . Preliminaries . . . . . . . . . . . . . . . . . . . 2.1 Fractional Brownian motion . . . . . . . 2.2 Hypotheses on operators A and Φ . . . H of cylindrical 2.3 Stochastic convolution WA H . . . . . . . 3 Karhunen-Lo´ eve expansion of WA 4 Evaluation of conditional exponential moments References . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . fractional . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . Brownian motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
195 197 197 198 198 199 203 206
1. Introduction It is well-known that one of important questions related to the diffusion processes investigated by physicists in the context of statistics mechanics and quantum theory ( see [8, 12]) is to compute the following Onsager-Machlup functional J(φ) = log lim
ε→0
P(x − φ ≤ ε) P(W ≤ ε)
195
196
Zongxia Liang
under a suitable norm · and some regularity conditions imposed on deterministic function φ, where x(t) is a n-dimensional diffusion process driven by a standard Brownian motion W . A rigorous mathematical treatment of this question was initiated by Stratonovich [15] and carried out by Ikeda and Watanabe [9], Takahashi and Watanabe [16], Fujita and Kotani [7], Zeitouni [18], Shepp and Zeitouni [14], Lyons and Zeitouni [10], Capitaine [4], Chaleyat-Maurel and Nualart [5] and other authors in various degrees of generality. They proved that the function J(φ) has the following expression 1 1 1 1 ˙ 2 |φ(t) − f (φ(t))| dt − ∇f (φ(t))dt. (1.1) J(φ) = − 2 0 2 0 We noticed that in order to get this expression (1.1) one has to evaluate the following conditional exponential expectation 1 lim E exp{ f (s)dW (s)}W ≤ ε = 1 (1.2) ε→0
0
for any function f ∈ L2 ([0, 1]) (see [10,14]). Therefore the problem (1.2) has attracted a lot of interest recently. Bardina, Rovira and Tindel [2, 3], Moret and Nualart [11] extended the problem above to cylindrical Brownian motion W(see [2, 3, 13]) and one-dimensional fractional Brownian motion B H with Hurst parameter H ∈ (0, 1) (cf. [11] ). In this chapter we generalize the problem (1.2) to a cylindrical fractional Brownian motion BH (see section 2 below). More precisely, we investigate the following limit 1 lim E exp{ < f (s), dW >}WAH 2 ≤ ε (1.3) ε→0
0
t where f ∈ L2 ([0, 1]; U), WAH (t) = 0 exp{(t − s)A}ΦdBH (s) is a stochastic convolution of a cylindrical Brownian motion BH , A and Φ are operators on a real separable Hilbert space U(see section 2 below). To the author’s knowledge, there is no answer about this problem up to now. We will show that the limit (1.3) is not one in general, i.e., for any L2 ([0, 1]; U)-valued function φ there is an orthogonal subset {hkj ⊗ ej , k ≥ 1, j ≥ 1} of L2 ([0, 1]; U) such that 1 lim E exp{ < φ(s), dW >}WAH 2 ≤ ε ε→0
0
∞ 1 = exp{ < φ, hkj ⊗ ej >2L2 } ([0,1];U) 2 k,j=1
under some conditions (H1) and (H2) imposed on A and Φ (see section 2 below) and eve expansion norm · 2 in L2 ([0, 1]; U). We prove this result via a Karhunen-Lo´ H of WA . The problem (1.1) in this case remains unsolved.
Karhunen-Lo´ eve Expansion for Stochastic Convolution
197
This paper is organized as follows. In section 2 we collect basic facts about fractional Brownian motion that will be used in this paper. In section 3 we give the KarhunenLo´ eve expansion of WAH . In section 4 we devote to proving main result of this paper. 2. Preliminaries 2.1. Fractional Brownian motion In this subsection we collect some basic facts about fractional Brownian motion as well as stochastic integral w.r.t. it. We refer the readers to [1, 17]. Let B H = {BtH , t ∈ [0, 1]} be an one-dimensional fractional Brownian motion with Hurst parameter H ∈ (0, 1). B H has the following Wiener integral representation, t H Bt = K(t, s)dWs , (2.1) 0
where W = {Ws , s ∈ [0, 1]} is a standard Brownian motion, K(t, s) is a kernel given by t t 1 1 1 K(t, s) = CH ( )H− 2 (t − s)H− 2 + s 2 −H F ( ), s s z−1 3 1 CH is a constant and F (z) = CH ( 12 −H) 0 θH− 2 (1−(θ+1)H− 2 )dθ. In particular, 1 B 2 is a standard Brownian motion. We denote by E the linear space of step functions on [0, 1]. Let H be an Hilbert space of defined as the closure of E with respect to the scalar product < I[0,t] , I[0,s] >H = R(t, s), where R(t, s) = 12 (t2H + s2H − |t − s|2H ). n If φ(t) = i=1 ai I(ti ,ti+1 ] (t) ∈ E, t1 ,· · · , tn ∈ [0, 1], n ∈ N, ai ∈ , then we define its Wiener integral with respect to the fraction Brownian motion as follows, 1 n ϕ(s)dBsH = ai (BtHi+1 − BtHi ). (2.2) 0
i=1
1 The mapping φ(t) = i=1 ai I(ti ,ti+1 ] → 0 φ(s)dBsH is an isometry between E and the linear space span {BtH , t ∈ [0, 1]} ⊂ L2 (Ω) and it can be extended to an isometry between H and the first Wiener chaos generated by {BtH , t ∈ [0, 1]}. The image on an element φ ∈ H by this isometry is called the Wiener integral of φ with respect to B H . n
Let s < t and consider an operator Kt in L2 ([0, 1]), t ∂K (Kt φ)(s) = K(t, s)φ(s) + (φ(r) − φ(s)) (r, s)dr. ∂r s
198
Zongxia Liang
Then Kt is an isometry between H and L2 ([0, 1]), t t H ϕ(s)dBs = (Kt ϕ)(s)dWs 0
0
for every t ∈ [0, 1], and ϕI[0,t] ∈ H iff Kt ϕ ∈ L2 [0, 1] (see [1, 17]). Moreover, if tt H ≥ 12 , d φ, ϕ ∈ H are such that 0 0 |φ(s)ϕ(t)||t − s|2H−2 dsdt < +∞, then t t < φ, ϕ >H = H(2H − 1) φ(u)ϕ(v)|u − v|2H−2 dudv 0 0 t t φ(s)dBsH ϕ(s)dBsH ) = E( 0 1
=E = 0
0 1
0
(Kt φ)(s)dWs
0
1
(Kt ϕ)(s)dWs
(Kt φ)(s)(Kt ϕ)(s)ds.
(2.3)
2.2. Hypotheses on operators A and Φ In this subsection we introduce hypotheses (H1) and (H2) on the operators A and Φ. Let U be a real separable Hilbert space and A : D(A) ⊂ U → U an unbounded operator on U. The norm and scalar product in U will be denoted by |·|U and < ·, · > respectively. The L2 -norm in L2 ([0, 1]; U) will be denoted by · 2 . Throughout this paper we assume that (H1) The operator A generates a self adjoint C0 −semigroup {exp(tA); t ≥ 0}, of negative type. Moreover, there exists a complete orthogonal system {ej ; j ≥ 1} which diagonalizes A. We will denote by {−αj , j ≥ 1} the corresponding set of eigenvalues and we also suppose that 0 < α1 < α2 < · · · < αn → +∞ as n → +∞ for convenience. (H2) The operator Φ is a bounded linear operator on U. Φ is of non-negative type, and is diagonal when expressed in the orthogonal basis {ej ; j ≥ 1}. Denote by {βj ; j ≥ 1} the corresponding set of eigenvalues. {βj ; j ≥ 1} satisfies ∞
βj2
j=1
α2H−1 (1 + 2αj ) j
< +∞,
(2.4)
and the Hurst parameter H is in [ 12 , 1). H of cylindrical fractional Brownian 2.3. Stochastic convolution WA motion
In this subsection we introduce cylindrical fractional Brownian motion and its stochastic convolution.
Karhunen-Lo´ eve Expansion for Stochastic Convolution
199
Let (Ω, F, Ft , P) be a probability space. {W i (t); t ∈ [0, 1], i ≥ 1} is a sequence of mutually independent Brownian motions on this probability space and is adapted to Ft . The cylindrical Brownian motion W and cylindrical fractional Brownian motion BH ( see [2, 13, 17] ) on U are defined by the following formal series’
W(t) =
∞
W i (t)ei
i=1 ∞
BH (t) =
and
BiH (t)ei
(2.5) (2.6)
i=1
respectively, where {BiH , i ≥ 1} is a family of mutually independent fractional Brownian motions with the Hurst parameter H ∈ (0, 1), that is, for i ≥ 1 BiH (t)
t
K(t, s)dW i (s).
= 0
We define the stochastic convolution WAH of cylindrical fractional Brownian motion BH by WAH (t) = ≡
t
0 ∞ j=1
exp{(t − s)A}ΦdBH (s) βj
0
t
exp{−(t − s)αj }dBjH (s) ej .
(2.7)
We will prove the series (2.7) is convergent in L2 (Ω×[0, 1]×U) under the assumption (H1) and (H2) in next section.
H 3. Karhunen-Lo´ eve expansion of WA
t Let X(t) = 0 exp{−α(t − s)}dWsH with α > 0, and W H is an one-dimensional fractional Brownian motion with Hurst parameter H ∈ [ 12 , 1). Then by (2.3) EX(t) = 0, {X(t), t ∈ [0, 1]} is a Gaussian process, t t Γ(2H − 1) EX 2 (t) = exp{−α(2t − u − v)}|u − v|2H−2 dudv ≤ < +∞, α2H 0 0 (3.1)
200
Zongxia Liang
where Γ denotes the Euler function. Moreover t E|X(t) − X(s)|2 ≤ 2E[ exp{−α(t − u)}dWuH ]2 s s +2E[ (exp{−α(t − u)} − exp{−α(s − u)})dWuH ]2
1
=2 0
0
1
I[s,t]×[s,t] (u, v) exp{−α(2t − 0 1 1
u − v)}|u − v|2H−2 dudv
I[0,s]×[0,s] (u, v)(exp{−α(t − u)} − exp{−α(s − u)})
+2 0
0
×(exp{−α(t − v)} − exp{−α(s − v)})|u − v|2H−2 dudv →0
as t − s → 0.
(3.2)
Its covariance function B(t, s) = E[X(t)X(s)] is thus a continuous, symmetric, non-negative definite kernel on [0, 1]2 . The following integral operator 1 B(t, s)f (s)ds T(f )(t) = 0
is compact and self adjoint, and admits a complete orthogonal base of L2 ([0, 1]) consisting of eigenfunctions φn (t) of T with corresponding eigenvalues λn . By Mercer’s theorem (see [6], p.117) B(t, s) =
∞
λj φj (t)φj (s).
j=1
Let ξn (ω) =
1 0
X(t)φn (t)dt. Then
1
1
E[ξn ξm ] =
E[X(t)X(s)]φn (t)φm (s)dtds 0
0
1
1
B(t, s)φn (t)φm (s)dtds
= 0
0 1
φn (t)Tφm (t)dt
= 0
1
φn (t)φm (t)dt
= λm 0
= δ{n=m} λm .
(3.3)
Therefore {ξn , n ≥ 1} forms a sequence of zero-mean, independent Gaussian random variables with variance λn . Further E|X(t) −
n k=1
ξk φk (t)|2 = B(t, t) −
n j=1
λj φ2j (t) → 0(as n → +∞).
Karhunen-Lo´ eve Expansion for Stochastic Convolution
201
That is, X(t) =
∞
ξk φk (t) =
k=1
∞
λk zk φk (t),
(3.4)
k=1
where zk = √1λ ξk are standard normal random variables. Without generality we k assume that n → 0 as n → +∞. Hence if we replace X by t λ1 ≥ λ2 ≥ · · · ≥ λ Xj (t) = 0 exp{−αj (t − s)}dBjH (s), then by (3.4) we can define the stochastic convolution WAH of cylindrical fractional Brownian motion B H by WAH (t) =
∞
βj
λk jzkj (φkj (t) ⊗ ej )
(3.5)
j,k=1
if the series (3.5) is convergent, where λkj = Eξkj 1 ξkj = Xj (t)φkj (t)dt 0
1 ξkj zkj = λkj φkj is an eigenfunction corresponding to λkj . The main result of this section is the following. Theorem 3.1. Assume that the (H1) and (H2) hold. Then the stochastic convolution WAH of cylindrical fractional Brownian motion B H has Karhunen-Lo´ eve expansion (3.5). That is, the series (3.5) is convergent in L2 (Ω × [0, 1] × U). Proof. Since {φkj ⊗ ej , k ≥ 1, j ≥ 1} forms a complete orthogonal base of L2 ([0, 1]; U) such that for any k, j ≥ 1 Cov(< WAH , φkj ⊗ ej >L2 ([0,1];U) ) = βj2 λkj , we have ∞
WAH 22 =
2 βj2 λkj zkj ,
k,j=1
WAH 2L2 (Ω×[0,1]×U) =
∞
βj2 λkj .
(3.6)
k,j=1
Therefore, we only need to prove that ∞ k,j=1
βj2 λkj < +∞.
(3.7)
202
Zongxia Liang
For every j ≥ 1 , noting that {φkj , k ≥ 1} is a complete orthogonal base of L2 ([0, 1]), by (2.4) and the Parseval equality we have ∞
∞ λkj = E[ < Xj , φkj >2 ]
k=1
k=1
= EXj 2L2 ([0,1]) 1 = EXj2 (t)dt 0
1
=
t t [ exp{−αj (2t − u − v)}|u − v|2H−2 dudv]dt.
0
0
(3.8)
0
We now calculate the last integral. Using the Fubini Theorem t 0
t
exp{−αj (2t − u − v)}|u − v|2H−2 dudv
0 t t
I{u≥v} exp{−αj (2t − u − v)}|u − v|2H−2 dudv
= 0 t
0 t
0
0
+
I{u≤v]} exp{−αj (2t − u − v)}|u − v|2H−2 dudv
≡ J1 (t) + J2 (t),
t
(3.9)
t1
e−αj (2t−t1 −s1 ) (t1 − s1 )2H−2 ds1 t1 −2αj t αj t1 =e e [ eαj s1 (t1 − s1 )2H−2 ds1 ]dt1 0 0 t αj t1 −2αj t 2αj t1 1−2H =e e αj [ e−x x2H−2 dx]dt1 0 t ≤ e−2αj t Γ(2H − 1) e2αj t1 dt1 · α1−2H j
J1 (t) =
dt1
0
0 t
0
Γ(2H − 1) = (1 − e−2αj t ). 2α2H j
(3.10)
Similarly J2 (t) ≤
Γ(2H − 1) (1 − e−2αj t ). 2α2H j
(3.11)
Karhunen-Lo´ eve Expansion for Stochastic Convolution
203
By the inequalities (3.8)-(3.11) 1 ∞ λkj = [J1 (t) + J2 (t)]dt 0
k=1
Γ(2H − 1) ≤ α2H j
1
(1 − e−2αj t )dt
0
Γ(2H − 1) (2αj − 1 + e−2αj )/2αj α2H j 1 , ≤ 2Γ(2H − 1) 2H−1 αj (1 + 2αj ) =
where we have used an elementary inequality: x − 1 + e−x ≤ ∞
βj2 λkj ≤ 2Γ(2H − 1)
k,j=1
∞
βj2
j=1
α2H−1 (1 + 2αj ) j
x2 1+x
(3.12) for x ≥ 0. Hence
< +∞
(3.13)
by (2.4). We thus complete the proof. 4. Evaluation of conditional exponential moments In this section we will use the Karhunen-Lo´ eve expansion (3.5) of WAH to compute 1 the limit limε→0 E{exp{[ 0 < l(s), dW(s) >]WAH 2 ≤ ε} for any function l ∈ L2 ([0, 1]; U). Where W is cylindrical Brownian motion. Before doing this we first present a lemma that we will use in this section. We denote here by 2 a set of sequences of real numbers {ai , i ≥ 1} such that 2 i≥1 ai < +∞. Lemma 4.1. (see [2] Lemma 2.4) Let {zn , n ≥ 1} be a sequence of independent N (0, 1) random variables defined on (Ω, F, P), and {ηi , i ≥ 1} and {vi , i ≥ 1} two 2 sequences of real numbers such that ηi > 0 for any i ≥ 1.Then ∞ ∞
lim E exp zi vi | ηi2 zi2 ≤ ε2 = 1.
ε→0
i=1
(4.1)
i=1
Now we state the main result of this paper. Theorem 4.1. Let l ∈ L2 ([0, 1]; U), the (H1) and (H2) hold. Then there exists an orthogonal subset {hkj } of L2 ([0, 1]) such that 1 lim E{exp{[ < l(s), dW(s) >]}WAH 2 ≤ ε} ε→0
0
∞ 1 = exp{ < l, hkj ⊗ ej >2L2 ([0,1];U) }. 2 k,j=1
(4.2)
204
Zongxia Liang
Proof. For λkj = 0 we let 1 hkj (s) ≡ λkj
1
s
Kt (e−αj (t−·) )(s)φkj (t)dt,
s ∈ [0, 1].
Then {hkj , k ≥ 1} is an orthogonal subset of L2 ([0, 1]). Indeed, for n, m ≥ 1, by (2.4) < hnj , hmj >L2 ([0,1]) 1 1 1 1 −αj (t−·) = [ Kt (e )(s)φnj (t)dt] · [ Kt (e−αj (t−·) )(s)φmj (t)dt]ds λnj λmj 0 s s 1 1 t∧u 1 [ Kt (e−αj (t−·) )(s)Ku (e−αj (u−·) )(s)ds]φnj (t)φmj (u)dtdu = λnj λmj 0 0 0 1 1 1 E[Xj (t)Xj (u)]φnj (t)φmj (u)dtdu = λnj λmj 0 0 : : λmj λmj < φnj , φmj >L2 ([0,1]) = δ{n=m} , (4.3) = λnj λnj that is, < hnj , hmj >L2 ([0,1]) = 0 for n = m and hnj L2 ([0,1]) = 1. If we denote by I(f ) the Ito integral of f , then we have zkj = I(hkj ). In fact, 1 ξkj λkj 1 t 1 = [ K (e−αj (t−·) )(s)dW j (s)]φkj (t)dt λkj 0 0 t 1 1 1 Kt (e−αj (t−·) )(s)φkj (t)dt dW j (s) = λkj s 0 1 hkj (s)dW j (s) =
zkj =
0
= I(hkj ).
(4.4)
2
Since L ([0, 1]) is a real separable Hilbert space, we can find an orthogonal subset {hkj , k ≥ 1} of L2 ([0, 1]) such that {hkj , hnj , k ≥ 1, n ≥ 1} becomes a complete orthogonal base of L2 ([0, 1]). Hence {hkj ⊗ ej , hnj ⊗ ej , k ≥ 1, n ≥ 1, j ≥ 1} is a complete orthogonal base of L2 ([0, 1]; U). And for any l ∈ L2 ([0, 1]; H) we have l(s) = < l, hkj ⊗ ej > hkj (s) ⊗ ej + < l, hnj ⊗ ej > hnj (s) ⊗ ej . n,j=1
k,j=1
So
1
< l, dW(s) >= 0
k,j=1
< l, hkj ⊗ ej > zkj +
n,j=1
< l, hnj ⊗ ej > znj , (4.5)
Karhunen-Lo´ eve Expansion for Stochastic Convolution
205
where zkj = I(hkj ). Moreover, Ezmk znj 1 k = E[ hmk (s)dW (s) 0
1
hnj (s)dW j (s)] 0
1
hmk (s)hnj (s)ds
= 0
=0
(4.6)
for n, m, k, j ≥ 1. Therefore {zmk , znj } is a family of independent standard normal random variables. Consequently, 1 < l(s), dW(s) >]}WAH 2 ≤ ε} E{exp{[ 0 < l, hkj ⊗ ej > zkj } = E exp{ k,j=1
× exp{
2 2 < l, hnj ⊗ ej > znj } βj λkj zkj ≤ ε2
n,j=1
= E[exp{
k,j=1
< l, hnj ⊗ ej > znj }]
n,j=1
2 2 < l, hkj ⊗ ej > zkj } βj λkj zkj ≤ ε2 ×E exp{ k,j=1
k,j=1
(4.7) By using the Bessel inequality and (3.13) < l, hkj ⊗ ej >2 ≤ l2L2([0,1];U) and k,j=1 ∞
βj2 λkj ≤ 2Γ(2H − 1)
k,j=1
∞
βj2
j=1
α2H−1 (1 + 2αj ) j
< +∞.
By Lemma 4.1 it follows that 2 2 lim E exp{ < l, hkj ⊗ ej > zkj } βj λkj zkj ≤ ε2 = 1. ε→0
k,j=1
(4.8)
(4.9)
k,j=1
A combination of (4.7),(4.8) and (4.9) yields 1 lim E{exp{[ < l(s), dW(s) >]}WAH 2 ≤ ε} ε→0 0 < l, hnj ⊗ ej > znj }] = E[exp{ = exp{
n,j=1 ∞
1 < l, hnj ⊗ ej >2L2 ([0,1];U) }. 2 n,j=1
The proof is thus complete.
(4.10)
206
Zongxia Liang
Acknowledgement This work is supported by Project 10771114 of NSFC, Project 20060003001 of SRFDP, and SRF for ROCS, SEM, and the Korea Foundation for Advanced Studies. The author would like to thank the institutions for the generous financial support. He is also very grateful to College of Social Sciences and College of Engineering at Seoul National University for providing excellent working conditions for him. References 1. Alos, E., Mazet,O., Nualart, D.: Stochastic calculus with respect to Gaussian processes. Ann.Probab. 29, 766-801(1999). 2. Bardina,X., Rovira,C., Tindel,S.: Onsager-Machlup functional for stochastic evolution equations. Ann.I. Poincare-PR. 39, 69-93(2003). 3. Bardina,X., Rovira,C., Tindel,S.: Onsager-Machlup functional for stochastic evolution equations in a class of norms. Stochastic analysis and applications. 21, 12311253(2002). 4. Capitaine, M.: Onsager-Machlup functional for some smoo th norms on Wiener space. Probab. Theory Related Fields. 102, 189-201(1995). 5. Chaleyat-Maurel, M., Nualart,D.: The Onsager-Machlup functional for a class of anticipating processes. Probab. Theory Related Fields. 94, 247-270(1992). 6. Courant,R., Hilbert, D.: Methods of mathematical physics. John Wiley Inc., 1989. 7. Fujita, T. and Kotani. S.: The Onsager-Machlup function for diffusion processes. J.Math. Kyoto Univ. 22, 115-130( 1982). 8. Graham, R. : Path integral formulation of general diffusion process. Z. Phys. B. 26, 281-290(1979). 9. Ikeda, N., Watanabe, S.: Stochastic differential equations and diffusion processes. Wiley, New York, 1980. 10. Lyons, T. and Zeitouni O.: Conditional exponential moments for iterated wiener integrals. Ann.Probab. 27 , 1738-1749(1999). 11. Moret, S., Nualart, D: Onsager- Machlup functional for the fractional Brownian motion.Probab. Theory Related Fields. 124, 227-260(2002). 12. Onsager, L., Machlup, S. : Fluctuations and irreversible processes. I.II. Phys.Rev. 91, 1505-1512, 1512-1515(1953). 13. Prato,G.D., Zabczyk, J.: Stochastic equations in infinite dimensions. Cambridge University Press, 1992. 14. Shepp, L.A., Zeitouni, O.: A note on conditional exponential moments and OnsagerMachlup functionals. Ann.Probab. 20, 652-654(1992). 15. Stratanovich, R.L.: On the probability function of diffusion processes. Selected Transl. Math. Statist. Probab. 10, 273-286. 16. Takahashi, Y., Watanabe, S.: The probability functionals ( Onsager- Machlup functions) of diffusion processes. Stochastic Integrals. Lecture Notes in Math., 851, 433463(1981). Springer, Berlin. 17. Tindel,S., Tudor, C.A., Viens, F.: Stochastic evolution equations with fractional Brownian motion. Probab. Theory Related Fields. 127, 186-204(2003). 18. Zeitouni, O. : On the Onsager-Machlup functional of diffusion processes around non C 2 curves. Ann.Probab. 17, 1037-1054(1989).
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 207–236
Chapter 14 Stein’s Method Meets Malliavin Calculus: A Short Survey With New Estimates Ivan Nourdin∗ and Giovanni Peccati† Universit´e Paris VI and Universit´e Paris Ouest This is an overview of some recent techniques involving the Malliavin calculus of variations and the so-called ‘Stein’s method’ for the Gaussian approximations of probability distributions. Special attention is devoted to establishing explicit connections with the classic method of moments: in particular, interpolation techniques are used in order to deduce some new estimates for the moments of random variables belonging to a fixed Wiener chaos. As an illustration, a class of central limit theorems associated with the quadratic variation of a fractional Brownian motion is studied in detail.
Keywords: Central limit theorems, fractional Brownian motion, isonormal Gaussian processes, Malliavin calculus, multiple integrals, Stein’s method 2000 AMS Subject Classification: 60F05, 60G15, 60H05, 60H07
Contents
1 Introduction
  1.1 Stein's heuristic and method
  1.2 The role of Malliavin calculus
  1.3 Beyond the method of moments
  1.4 An overview of the existing literature
2 Preliminaries
  2.1 Isonormal Gaussian processes
  2.2 Chaos, hypercontractivity and products
  2.3 The language of Malliavin calculus
3 One-dimensional approximations
  3.1 Stein's lemma for normal approximations
  3.2 General bounds on the Kolmogorov distance
  3.3 Wiener chaos and the fourth moment condition
  3.4 Quadratic variation of the fractional Brownian motion, part one
  3.5 The method of (fourth) moments: explicit estimates via interpolation
4 Multidimensional case
  4.1 Main bounds
  4.2 Quadratic variation of fractional Brownian motion, continued
References
∗ Laboratoire de Probabilités et Modèles Aléatoires, Université Pierre et Marie Curie, Boîte courrier 188, 4 Place Jussieu, 75252 Paris Cedex 5, France. Email: [email protected]
† Équipe Modal'X, Université Paris Ouest – Nanterre la Défense, 200 Avenue de la République, 92000 Nanterre, and LSTA, Université Paris VI, France. Email: [email protected]
1. Introduction This survey deals with the powerful interaction of two probabilistic techniques, namely the Stein’s method for the normal approximation of probability distributions, and the Malliavin calculus of variations. We will first provide an intuitive discussion of the theory, as well as an overview of the literature developed so far. 1.1. Stein’s heuristic and method We start with an introduction to Stein’s method based on moments computations. Let N ∼ N (0, 1) be a standard Gaussian random variable. It is well-known that the (integer) moments of N , noted µp := E(N p ) for p ≥ 1, are given by: µp = 0 if p is odd, and µp = (p − 1)!! := p!/(2p/2 (p/2)!) if p is even. A little inspection reveals that the sequence {µp : p ≥ 1} is indeed completely determined by the recurrence relation: µ1 = 0, µ2 = 1,
and µ_p = (p − 1) × µ_{p−2}, for every p ≥ 3.   (1.1)
Now (for p ≥ 0) introduce the notation f_p(x) = x^p, so that it is immediate that the relation (1.1) can be restated as
E[N × f_{p−1}(N)] = E[f′_{p−1}(N)], for every p ≥ 1.   (1.2)
By using a standard argument based on polynomial approximations, one can easily prove that relation (1.2) continues to hold if one replaces f_p with a sufficiently smooth function f (e.g. any C¹ function with a sub-polynomial derivative will do). Now observe that a random variable Z verifying E[Z f_{p−1}(Z)] = E[f′_{p−1}(Z)] for every p ≥ 1 is necessarily such that E(Z^p) = µ_p for every p ≥ 1. Also, recall that the law of a N(0, 1) random variable is uniquely determined by its moments. By combining these facts with the previous discussion, one obtains the following characterization of the (standard) normal distribution, which is universally known as 'Stein's Lemma': a random variable Z has a N(0, 1) distribution if and only if
E[Zf(Z) − f′(Z)] = 0,   (1.3)
for every smooth function f. Of course, one needs to better specify the notion of 'smooth function' – a rigorous statement and a rigorous proof of Stein's Lemma are provided at Point 3 of Lemma 3.1 below. A far-reaching idea developed by Stein (starting from the seminal paper [36]) is the following: in view of Stein's Lemma and given a generic random variable Z, one can measure the distance between the laws of Z and N ∼ N(0, 1), by assessing the distance from zero of the quantity E[Zf(Z) − f′(Z)], for every f belonging to
a 'sufficiently large' class of smooth functions. Rather surprisingly, this somewhat heuristic approach to probabilistic approximations can be made rigorous by using ordinary differential equations. Indeed, one of the main findings of [36] and [37] is that bounds of the following type hold in great generality:
d(Z, N) ≤ C × sup_{f∈F} |E[Zf(Z) − f′(Z)]|,   (1.4)
where: (i) Z is a generic random variable, (ii) N ∼ N (0, 1), (iii) d(Z, N ) indicates an appropriate distance between the laws of Z and N (for instance, the Kolmogorov, or the total variation distance), (iv) F is some appropriate class of smooth functions, and (v) C is a universal constant. The case where d is equal to the Kolmogorov distance, noted dKol , is worked out in detail in the forthcoming Section 3.1: we anticipate that, in this case, one can take C = 1, and F equal to the collection of all bounded Lipschitz functions with Lipschitz constant less or equal to 1. Of course, the crucial issue in order to put Stein-type bounds into effective use, is how to assess quantities having the form of the right-hand side of (1.4). In the last thirty years, an impressive panoply of approaches has been developed in this direction: the reader is referred to the two surveys by Chen and Shao [7] and Reinert [33] for a detailed discussion of these contributions. In this chapter, we shall illustrate how one can effectively estimate a quantity such as the right-hand side of (1.4), whenever the random variable Z can be represented as a regular functional of a generic and possibly infinite-dimensional Gaussian field. Here, the correct notion of regularity is related to Malliavin-type operators. 1.2. The role of Malliavin calculus All the definitions concerning Gaussian analysis and Malliavin calculus used in the Introduction will be detailed in the subsequent Section 2. Let X = {X(h) : h ∈ H} be an isonormal Gaussian process over some real separable Hilbert space H. Suppose Z is a centered functional of X, such that E(Z) = 0 and Z is differentiable in the sense of Malliavin calculus. According to the Stein-type bound (1.4), in order to evaluate the distance between the law of Z and the law of a Gaussian random variable N ∼ N (0, 1), one must be able to assess the distance between the two quantities E[Zf (Z)] and E[f (Z)]. The main idea developed in [18], and later in the references [19,20,22,23], is that the needed estimate can be realized by using the following consequence of the integration by parts formula of Malliavin calculus: for every f sufficiently smooth (see Section 2.3 for a more precise statement), E[Zf (Z)] = E[f (Z) DZ, −DL−1 Z H ],
(1.5)
where D is the Malliavin derivative operator, L−1 is the pseudo-inverse of the Ornstein-Uhlenbeck generator, and ·, · H is the inner product of H. It follows from (1.5) that, if the derivative f is bounded, then the distance between E[Zf (Z)] and E[f (Z)] is controlled by the L1 (Ω)-norm of the random variable
1 − DZ, −DL−1 Z H . For instance, in the case of the Kolmogorov distance, one obtains that, for every centered and Malliavin differentiable random variable Z, dKol (Z, N ) ≤ E|1 − DZ, −DL−1 Z H |.
(1.6)
We will see in Section 3.3 that, in the particular case where Z = Iq (f ) is a multiple Wiener-Itˆo integral of order q ≥ 2 (that is, Z is an element of the qth Wiener chaos of X) with unit variance, relation (1.6) yields the neat estimate < q−1 × |E(Z 4 ) − 3|. (1.7) dKol (Z, N ) ≤ 3q Note that E(Z 4 ) − 3 is just the fourth cumulant of Z, and that the fourth cumulant of N equals zero. We will also show that the combination of (1.6) and (1.7) allows to recover (and refine) several characterizations of CLTs on a fixed Wiener chaos – as recently proved in [26] and [27]. 1.3. Beyond the method of moments The estimate (1.7), specially when combined with the findings of [23] and [31] (see Section 4), can be seen as a drastic simplification of the so-called ‘method of moments and cumulants’ (see Major [13] for a classic discussion of this method in the framework of Gaussian analysis). Indeed, such a relation implies that, if {Zn : n ≥ 1} is a sequence of random variables with unit variance belonging to a fixed Wiener chaos, then, in order to prove that Zn converges in law to N ∼ N (0, 1), it is sufficient to show that E(Zn4 ) converges to E(N 4 ) = 3. Again by virtue of (1.7), one also has that the rate of convergence of E(Zn4 ) to 3 determines the ‘global’ rate convergence in the Kolmogorov distance. In order to further characterize the connections between our techniques and moments computations, in Proposition 3.1 we will deduce some new estimates, implying that (for Z with unit variance and belonging to a fixed Wiener chaos), for every integer k ≥ 3 the quantity |E[Z k ] − E[N k ]| is controlled (up to an explicit universal multiplicative constant) by the square root of |E[Z 4 ] − E[N 4 ]|. This result is obtained by means of an interpolation technique, recently used in [22] and originally introduced by Talagrand – see, e.g., [38]. 1.4. An overview of the existing literature The present survey is mostly based on the three references [18], [22] and [23], dealing with upper bounds in the one-dimensional and multi-dimensional approximations of regular functionals of general Gaussian fields (strictly speaking, the papers [18] and [22] also contain results on non-normal approximations, related, e.g., to the Gamma law). However, since the appearance of [18], several related works have been written, which we shall now shortly describe.
- Our paper [19] is again based on Stein’s method and Malliavin calculus, and deals with the problem of determining optimal rates of convergence. Some results bear connections with one-term Edgeworth expansions. - The paper [3], by Breton and Nourdin, completes the analysis initiated in Section 4 of [18], concerning the obtention of Berry-Ess´een bounds associated with the so-called Breuer-Major limit theorems (see [5]). The case of non-Gaussian limit laws (of the Rosenblatt type) is also analyzed. - In [20], by Nourdin, Peccati and Reinert, one can find an application of Stein’s method and Malliavin calculus to the derivation of second order Poincar´e inequalities on Wiener space. This also refines the CLTs on Wiener chaos proved in [26] and [27]. - One should also mention our paper [16], where we deduce a characterization of non-central limit theorems (associated with Gamma laws) on Wiener chaos. The main findings of [16] are refined in [18] and [22], again by means of Stein’s method. - The work [24], by Nourdin and Viens, contains an application of (1.5) to the estimate of densities and tail probabilities associated with functionals of Gaussian processes, like for instance quadratic functionals or suprema of continuous-time Gaussian processes on a finite interval. - The findings of [24] have been further refined by Viens in [40], where one can also find some applications to polymer fluctuation exponents. - The paper [4], by Breton, Nourdin and Peccati, contains some statistical applications of the results of [24], to the construction of confidence intervals for the Hurst parameter of a fractional Brownian motion. - Reference [2], by Bercu, Nourdin and Taqqu, contains some applications of the results of [18] to almost sure CLTs. - In [21], by Nourdin, Peccati and Reinert, one can find an extension of the ideas introduced in [18] to the framework of functionals of Rademacher sequences. To this end, one must use a discrete version of Malliavin calculus (see Privault [32]). - Reference [29], by Peccati, Sol´e, Taqqu and Utzet, concerns a combination of Stein’s method with a version of Malliavin calculus on the Poisson space (as developed by Nualart and Vives in [28]). - Reference [22], by Nourdin, Peccati and Reinert, contains an application of Stein’s method, Malliavin calculus and the ‘Lindeberg invariance principle’, to the study of universality results for sequences of homogenous sums associated with general collections of independent random variables.
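Before turning to the preliminaries, the following small Monte Carlo experiment (our own addition, not part of the original text) illustrates the heuristic of Section 1.1: the Stein quantity E[Zf(Z) − f′(Z)] is close to zero when Z is standard Gaussian and noticeably different from zero otherwise. The test function f = tanh, the sample size and the random seed are arbitrary choices made purely for illustration.

```python
# Monte Carlo check of Stein's characterization E[Z f(Z) - f'(Z)] = 0
# (a minimal sketch; f = tanh is an arbitrary smooth, bounded test function).
import numpy as np

rng = np.random.default_rng(0)
n = 10**6

def stein_discrepancy(z):
    # empirical E[Z f(Z) - f'(Z)] with f = tanh, so f' = 1 - tanh^2
    return np.mean(z * np.tanh(z) - (1.0 - np.tanh(z)**2))

gaussian = rng.standard_normal(n)
non_gaussian = rng.exponential(1.0, n) - 1.0   # centered, unit variance, not normal

print(stein_discrepancy(gaussian))      # close to 0
print(stein_discrepancy(non_gaussian))  # clearly away from 0: the law is not N(0,1)
```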
2. Preliminaries We shall now present the basic elements of Gaussian analysis and Malliavin calculus that are used in this chapter. The reader is referred to the monograph by Nualart
[25] for any unexplained definition or result. 2.1. Isonormal Gaussian processes Let H be a real separable Hilbert space. For any q ≥ 1, we denote by H⊗q the qth tensor product of H, and by Hq the associated qth symmetric tensor product; plainly, H⊗1 = H1 = H. We write X = {X(h), h ∈ H} to indicate an isonormal Gaussian process over H. This means that X is a centered Gaussian family, defined on some probability space (Ω, F , P ), and such that E [X(g)X(h)] = g, h H for every g, h ∈ H. Without loss of generality, we also assume that F is generated by X. The concept of an isonormal Gaussian process dates back to Dudley’s paper [10]. As shown in the forthcoming five examples, this general notion may be used to encode the structure of many remarkable Gaussian families. Example 2.1 (Euclidean spaces). Fix an integer d ≥ 1, set H = Rd and let (e1 , ..., ed ) be an orthonormal basis of Rd (with respect to the usual Euclidean inner product). Let (Z1 , ..., Zd ) be a Gaussian vector whose components are i.i.d. d N (0, 1). For every h = j=1 cj ej (where the cj are real and uniquely defined), set # " X (h) = dj=1 cj Zj and define X = X (h) : h ∈ Rd . Then, X is an isonormal Gaussian process over Rd endowed with its canonical inner product. Example 2.2 (Gaussian measures). Let (A, A, ν) be a measure space, where ν is positive, σ-finite and non-atomic. Recall that a (real) Gaussian random measure over (A, A), with control ν, is a centered Gaussian family of the type G = {G (B) : B ∈ A, ν(B) < ∞} , satisfying the relation: for every B, C ∈ A of finite ν-measure, E[G(B)G(C)] = ν(B ∩ C). Now consider the Hilbert space H = L2 (A, A, ν), with inner product h, h H = A h(a)h (a)ν(da). For every h ∈ H, define X (h)"= A h(a)G(da) to be the # Wiener-Itˆo integral of h with respect to G. Then, X = X (h) : h ∈ L2 (Z, Z, ν) defines a centered Gaussian family with covariance given by E[X(h)X(h )] = h, h H , thus yielding that X is an isonormal Gaussian process over L2 (A, A, ν). For instance, by setting A = [0, +∞) and ν equal to the Lebesgue measure, one obtains that the process Wt = G([0, t)), t ≥ 0, is a standard Brownian motion started from zero (of course, in order to meet the usual definition of a Brownian motion, one has also to select a continuous version of W ), and X coincides with the L2 (Ω)-closed linear Gaussian space generated by W . Example 2.3 (Isonormal spaces derived from covariances). Let Y = {Yt : t ≥ 0} be a real-valued centered Gaussian process indexed by the positive axis, and set R (s, t) = E [Ys Yt ] to be the covariance function of Y . One can embed Y into some isonormal Gaussian process as follows: (i) define E as the collection of all finite linear combinations of indicator functions of the type 1[0,t] ,
t ≥ 0; (ii) define H = HR to be the Hilbert space given by the closure of E with respect to the inner product f, h R := ai cj R (si , tj ) ,
i,j
where f = i ai 1[0,si ] and h = j cj 1[0,tj ] are two generic elements of E ; (iii) for h = j cj 1[0,tj ] ∈ E, set X (h) = j cj Ytj ; (iv) for h ∈ HR , set X (h) to be the L2 (P ) limit of any sequence of the type X (hn ), where {hn } ⊂ E converges to h in HR . Note that such a sequence {hn } necessarily exists and may not be unique (however, the definition of X (h) does not depend on the choice of the sequence {hn }). Then, by construction, the Gaussian space {X (h) : h ∈ HR } is an isonormal Gaussian process over HR . See Chapter 1 of Janson [12] or Nualart [25], as well as the forthcoming Section 3.4, for more details on this construction. Example 2.4 (Even functions and symmetric measures). Other classic examples of isonormal Gaussian processes (see, e.g., [6,11,13]) are given by objects of the type Xβ = {Xβ (ψ) : ψ ∈ HE,β } , where β is a real non-atomic symmetric measure on (−π, π] (that is, β (dx) = β (−dx)), and HE,β = L2E ((−π, π] , dβ)
(2.1)
stands for the collection of all real linear combinations of complex-valued even functions that are square-integrable with respect to β (recall that a function ψ is even if ψ (x) = ψ (−x)). The class HE,β is a real Hilbert space, endowed with the inner product π ψ1 (x) ψ2 (−x) β (dx) ∈ R. (2.2) ψ1 , ψ2 β = −π
This type of construction is used in the spectral theory of time series. Example 2.5 (Gaussian Free Fields). Let d ≥ 2 and let D be a domain in Rd . Denote by Hs (D) the space of real-valued continuous and continuously differentiable functions on Rd that are supported on a compact subset of D (note that this implies that the first derivatives of the elements of Hs (D) are squareintegrable with respect to the Lebesgue measure). Write H(D) in order to indicate real Hilbert space obtained as the closure of Hs (D) with respect to the inner product f, g = Rd ∇f (x) · ∇g(x)dx, where ∇ is the gradient. An isonormal Gaussian process of the type X = {X(h) : h ∈ H(D)} is called a Gaussian Free Field (GFF). The reader is referred to the survey by Sheffield [35] for a discussion of the emergence of GFFs in several areas of modern probability. See, e.g., Rider and Vir´ag [34] for a connection with the ‘circular law’ for Gaussian non-Hermitian random matrices.
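As an entirely elementary complement to Example 2.1, the sketch below (our own, not from the original text) builds a finite-dimensional isonormal Gaussian process over R^d and checks the defining isometry E[X(g)X(h)] = ⟨g, h⟩ by simulation; the dimension, vectors and seed are arbitrary.

```python
# A finite-dimensional isonormal Gaussian process as in Example 2.1 (sketch).
import numpy as np

rng = np.random.default_rng(1)
d, n_samples = 3, 10**6

Z = rng.standard_normal((n_samples, d))   # i.i.d. N(0,1) coordinates Z_1, ..., Z_d

def X(h):
    # X(h) = sum_j h_j Z_j defines an isonormal Gaussian process over R^d
    return Z @ h

g = np.array([1.0, -2.0, 0.5])
h = np.array([0.3, 1.0, 2.0])

print(np.mean(X(g) * X(h)), g @ h)   # empirical covariance vs. the inner product <g, h>
```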
Remark 2.1. An isonormal Gaussian process is simply an isomorphism between a centered L2 (Ω)-closed linear Gaussian space and a real separable Hilbert space H. Now, fix a generic centered L2 (Ω)-closed linear Gaussian space, say G. Since G is itself a real separable Hilbert space (with respect to the usual L2 (Ω) inner product) it follows that G can always be (trivially) represented as an isonormal Gaussian process, by setting H = G. Plainly, the subtlety in the use of isonormal Gaussian processes is that one has to select an isomorphism that is well-adapted to the specific problem one wants to tackle. 2.2. Chaos, hypercontractivity and products We now fix a generic isonormal Gaussian process X = {X(h), h ∈ H}, defined on some space (Ω, F, P ) such that σ(X) = F . Wiener chaos. For every q ≥ 1, we write Hq in order to indicate the qth Wiener chaos of X. We recall that Hq is the closed linear subspace of L2 (Ω, F , P ) generated by the random variables of the type Hq (X(h)), where h ∈ H is such that hH = 1, and Hq stands for the qth Hermite polynomial, defined as dq − x2 e 2 , x ∈ R, q ≥ 1. dxq We also use the convention H0 = R. For any q ≥ 1, the mapping Hq (x) = (−1)q e
x2 2
Iq (h⊗q ) = q!Hq (X(h))
(2.3)
(2.4)
can be extended to a linear isometry between the symmetric tensor product Hq √ equipped with the modified norm q! ·H⊗q and the qth Wiener chaos Hq . For q = 0, we write I0 (c) = c, c ∈ R. Remark 2.2. When H = L2 (A, A, ν), the symmetric tensor product Hq can be identified with the Hilbert space L2s (Aq , Aq , ν q ), which is defined as the collection of all symmetric functions on Aq that are square-integrable with respect to ν q . In this case, it is well-known that the random variable Iq (h), h ∈ Hq , coincides with the (multiple) Wiener-Itˆo integral, of order q, of h with respect to the Gaussian measure B → X(1B ), where B ∈ A has finite ν-measure. See chapter 1 of [25] for more details on this point. Hypercontractivity. Random variables living in a fixed Wiener chaos are hypercontractive. More precisely, assume that Z belongs to the qth Wiener chaos Hq (q ≥ 1). Then, Z has a finite variance by construction and, for all p ∈ [2, +∞), one has the following estimate (see Theorem 5.10 in [12] for a proof):
E[|Z|^p] ≤ (p − 1)^{pq/2} [E(Z²)]^{p/2}.   (2.5)
In particular, if E(Z 2 ) = 1, one has that E |Z|p ≤ (p − 1)pq/2 . For future use, we also observe that, for every q ≥ 1, the mapping p → (p − 1)pq/2 is strictly increasing on [2, +∞).
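To make the hypercontractivity estimate (2.5) concrete, here is a small numerical check (our own, not from the original text) for the unit-variance second-chaos variable Z = H₂(N)/√2 = (N² − 1)/√2, so that q = 2; the sample size and the values of p are arbitrary.

```python
# Checking the hypercontractivity bound (2.5) on a unit-variance element of the
# second Wiener chaos, Z = (N^2 - 1)/sqrt(2)  (illustrative sketch).
import numpy as np

rng = np.random.default_rng(2)
N = rng.standard_normal(10**7)
Z = (N**2 - 1.0) / np.sqrt(2.0)        # E(Z^2) = 1, Z lives in the 2nd chaos (q = 2)

q = 2
for p in (4, 6):
    lhs = np.mean(np.abs(Z)**p)        # empirical E|Z|^p
    rhs = (p - 1)**(p * q / 2.0)       # hypercontractivity bound (2.5) with E(Z^2) = 1
    print(p, lhs, rhs, lhs <= rhs)
```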
Chaotic decompositions. It is well-known (Wiener chaos decomposition) that the space L2 (Ω, F , P ) can be decomposed into the infinite orthogonal sum of the spaces Hq . It follows that any square-integrable random variable Z ∈ L2 (Ω, F , P ) admits the following chaotic expansion Z=
Σ_{q=0}^{∞} I_q(f_q),   (2.6)
where f0 = E(Z), and the kernels fq ∈ Hq , q ≥ 1, are uniquely determined. For every q ≥ 0, we also denote by Jq the orthogonal projection operator on Hq . In particular, if Z ∈ L2 (Ω, F , P ) is as in (2.6), then Jq (Z) = Iq (fq ) for every q ≥ 0. Contractions. Let {ek , k ≥ 1} be a complete orthonormal system in H. Given f ∈ Hp and g ∈ Hq , for every r = 0, . . . , p ∧ q, the contraction of f and g of order r is the element of H⊗(p+q−2r) defined by f ⊗r g =
Σ_{i₁,…,i_r=1}^{∞} ⟨f, e_{i₁} ⊗ … ⊗ e_{i_r}⟩_{H^{⊗r}} ⊗ ⟨g, e_{i₁} ⊗ … ⊗ e_{i_r}⟩_{H^{⊗r}}.   (2.7)
Notice that f ⊗r g is not necessarily symmetric: we denote its symmetrization by f ⊗r g ∈ H(p+q−2r) . Moreover, f ⊗0 g = f ⊗ g equals the tensor product of f and g while, for p = q, one has that f ⊗q g = f, g H⊗q . In the particular case where H = L2 (A, A, ν), one has that Hq = L2s (Aq , Aq , ν q ) (see Remark 2.2) and the contraction in (2.7) can be written in integral form as (f ⊗r g)(t1 , . . . , tp+q−2r ) = f (t1 , . . . , tp−r , s1 , . . . , sr ) Ar
× g(tp−r+1 , . . . , tp+q−2r , s1 , . . . , sr )dν(s1 ) . . . dν(sr ). Multiplication. The following multiplication formula is well-known: if f ∈ Hp and g ∈ Hq , then p∧q p q r! Ip+q−2r (f ⊗r g). (2.8) Ip (f )Iq (g) = r r r=0 Note that (2.8) gives an immediate proof of the fact that multiple Wiener-Itˆ o integrals have finite moments of every order. 2.3. The language of Malliavin calculus We now introduce some basic elements of the Malliavin calculus with respect to the isonormal Gaussian process X. Malliavin derivatives. Let S be the set of all cylindrical random variables of the type Z = g (X(φ1 ), . . . , X(φn )) ,
(2.9)
where n ≥ 1, g : Rn → R is an infinitely differentiable function with compact support and φi ∈ H. The Malliavin derivative of Z with respect to X is the element of L2 (Ω, H) defined as DZ =
Σ_{i=1}^{n} ∂g/∂x_i (X(φ₁), …, X(φ_n)) φ_i.
By iteration, one can define the mth derivative Dm Z, which is an element of L2 (Ω, Hm ), for every m ≥ 2. For m ≥ 1 and p ≥ 1, Dm,p denotes the closure of S with respect to the norm · m,p , defined by the relation Zpm,p = E [|Z|p ] +
Σ_{i=1}^{m} E[ ‖D^i Z‖^p_{H^{⊗i}} ].
The chain rule. The Malliavin derivative D verifies the following chain rule. If ϕ : Rd → R is continuously differentiable with bounded partial derivatives and if Z = (Z1 , . . . , Zd ) is a vector of elements of D1,2 , then ϕ(Z) ∈ D1,2 and D ϕ(Z) =
Σ_{i=1}^{d} ∂ϕ/∂x_i (Z) DZ_i.   (2.10)
A careful application, e.g., of the multiplication formula (2.8) shows that (2.10) continues to hold whenever the function ϕ is a polynomial in d variables. Note also ∞ that a random variable Z as in (2.6) is in D1,2 if and only if q=1 qJq (Z)2L2 (Ω) < ∞
∞ and, in this case, E DZ2H = q=1 qJq (Z)2L2 (Ω) . If H = L2 (A, A, ν) (with ν non-atomic), then the derivative of a random variable Z as in (2.6) can be identified with the element of L2 (A × Ω) given by Dx Z =
Σ_{q=1}^{∞} q I_{q−1}(f_q(·, x)),   x ∈ A.   (2.11)
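The following finite-dimensional sketch (our own; the functional g is an arbitrary choice) illustrates the derivative operator D together with the Gaussian integration-by-parts identity underlying the duality relation (2.12) introduced just below: for Z = g(X(e₁), …, X(e_d)) and a deterministic h, one has E[⟨DZ, h⟩_H] = E[Z X(h)].

```python
# Integration by parts for a cylindrical functional in finite dimensions (sketch).
import numpy as np

rng = np.random.default_rng(3)
d, n = 3, 10**6
Ndraws = rng.standard_normal((n, d))          # coordinates X(e_1), ..., X(e_d)

g  = lambda x: np.sin(x[:, 0]) * np.cos(x[:, 1]) + x[:, 2]**2
Dg = lambda x: np.stack([np.cos(x[:, 0]) * np.cos(x[:, 1]),
                         -np.sin(x[:, 0]) * np.sin(x[:, 1]),
                         2.0 * x[:, 2]], axis=1)   # gradient of g = Malliavin derivative of Z

h = np.array([0.5, -1.0, 2.0])
lhs = np.mean(Dg(Ndraws) @ h)                 # E[<DZ, h>_H]
rhs = np.mean(g(Ndraws) * (Ndraws @ h))       # E[Z X(h)], i.e. E[Z delta(h)]
print(lhs, rhs)                               # the two estimates agree up to Monte Carlo error
```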
The divergence operator. We denote by δ the adjoint of the operator D, also called the divergence operator. A random element u ∈ L2 (Ω, H) belongs to the domain of δ, noted Domδ, if and only if it verifies |E DZ, u H | ≤ cu ZL2(Ω) for any Z ∈ D1,2 , where cu is a constant depending only on u. If u ∈ Domδ, then the random variable δ(u) is defined by the duality relationship
(2.12) E(Zδ(u)) = E DZ, u H , which holds for every Z ∈ D1,2 . Ornstein-Uhlenbeck operators. The operator L, known as the generator of the Ornstein-Uhlenbeck semigroup, is defined as L = ∞ q=0 −qJq . The domain of L is DomL = {Z ∈ L2 (Ω) :
Σ_{q=1}^{∞} q² ‖J_q(Z)‖²_{L²(Ω)} < ∞} = D^{2,2}.
There is an important relation between the operators D, δ and L (see, e.g., Proposition 1.4.3 in [25]): a random variable Z belongs to D^{2,2} if and only if Z ∈ Dom(δD) (i.e. Z ∈ D^{1,2} and DZ ∈ Dom δ) and, in this case,
δDZ = −LZ.   (2.13)
For any Z ∈ L²(Ω), we define L⁻¹Z = Σ_{q=1}^{∞} −(1/q) J_q(Z). The operator L⁻¹ is called the pseudo-inverse of L. For any Z ∈ L²(Ω), we have that L⁻¹Z ∈ DomL, and
L L⁻¹Z = Z − E(Z).   (2.14)
An important string of identities. Finally, let us mention a chain of identities playing a crucial role in the sequel. Let f : R → R be a C 1 function with bounded derivative, and let F, Z ∈ D1,2 . Assume moreover that E(Z) = 0. By using successively (2.14), (2.13) and (2.10), one deduces that
E[Z f(F)] = E[L L⁻¹Z × f(F)] = E[δD(−L⁻¹Z) × f(F)]
= E[⟨Df(F), −DL⁻¹Z⟩_H]
= E[f′(F) ⟨DF, −DL⁻¹Z⟩_H].   (2.15)
We will shortly see that the fact E Zf (F ) = E f (F ) DF, −DL−1 Z H constitutes a fundamental element in the connection between Malliavin calculus and Stein’s method. 3. One-dimensional approximations 3.1. Stein’s lemma for normal approximations Originally introduced in the path-breaking paper [36], and then further developed in the monograph [37], Stein’s method can be roughly described as a collection of probabilistic techniques, allowing to characterize the approximation of probability distributions by means of differential operators. As already pointed out in the Introduction, the two surveys [7] and [33] provide a valuable introduction to this very active area of modern probability. In this section, we are mainly interested in the use of Stein’s method for the normal approximation of the laws of realvalued random variables, where the approximation is performed with respect to the Kolmogorov distance. We recall that the Kolmogorov distance between the laws of two real-valued random variables Y and Z is defined by dKol (Y, Z) = sup P (Y ≤ z) − P (Z ≤ z). z∈R
The reader is referred to [18] for several extensions of the results discussed in this survey to other distances between probability measures, such as, e.g., the total variation distance, or the Wasserstein distance. The following statement, containing all the elements of Stein’s method that are needed for our discussion, can be traced back to Stein’s original contribution [36]. Lemma 3.1. Let N ∼ N (0, 1) be a standard Gaussian random variable.
1. Fix z ∈ R, and define fz : R → R as x
a2 x2 2 fz (x) = e 1(−∞,z] (a) − P (N ≤ z) e− 2 da, −∞
Then, fz is continuous on R, bounded by verifies moreover
x ∈ R.
(3.1)
√ 2π/4, differentiable on R \ {z}, and
fz (x) − xfz (x) = 1(−∞,z] (x) − P (N ≤ z)
for all x ∈ R \ {z}.
(3.2)
One has also that fz is Lipschitz, with Lipschitz constant less or equal to 1. 2. Let Z be a generic random variable. Then, dKol (Z, N ) ≤ sup |E[Zf (Z) − f (Z)]|,
(3.3)
f
where the supremum is taken over the class of all Lipschitz functions that are √ bounded by 2π/4 and whose Lipschitz constant is less or equal to 1. 3. Let Z be a generic random variable. Then, Z ∼ N (0, 1) if and only if E[Zf (Z)−f (Z)] = 0 for every continuous and piecewise differentiable function f verifying the relation E|f (N )| < ∞. Proof: (Point 1) We shall only prove that fz is Lipschitz and we will evaluate its constant (the proof of the remaining properties is left to the reader). We have, for x ≥ 0, x = z: x 2
− a2 fz (x) = 1(−∞,z] (x) − P (N ≤ z) + xe x2 1(−∞,z] (a) − P (N ≤ z) e 2 da −∞ +∞
a2 x2 1(−∞,z] (a) − P (N ≤ z) e− 2 da = 1(−∞,z] (x) − P (N ≤ z) − xe 2 (∗)
x
2 2 x2 2 2 2 ≤ 1(−∞,z] (·) − P (N ≤ z) ∞ 1 + xe
≤ 1+e
x2 2
+∞
2
e
− a2
da
x
+∞
ae−
a2 2
da = 2.
x
Observe that identity (∗) holds since
1 0 = E 1(−∞,z] (N ) − P (N ≤ z) = √ 2π
+∞
−∞
a2 1(−∞,z] (a) − P (N ≤ z) e− 2 da.
For x ≤ 0, x = z, we can write x 2
− a2 f (x) = 1(−∞,z] (x) − P (N ≤ z) + xe x2 1(−∞,z] (a) − P (N ≤ z) e 2 da z −∞ x 2 2 x2 a2 ≤ 21(−∞,z] (·) − P (N ≤ z)2∞ 1 + |x|e 2 e− 2 da −∞ x x2 a2 |a|e− 2 da = 2. ≤ 1+e 2 −∞
Hence, we have shown that fz is Lipschitz with Lipschitz constant bounded by 2. For the announced refinement (that is, the constant is bounded by 1), we refer the reader to Lemma 2.2 in Chen and Shao [7]. (Point 2) Take expectations on both sides of (3.2) with respect to the law of Z. Then, take the supremum over all z ∈ R, and exploit the properties of fz proved at Point 1. (Point 3) If Z ∼ N (0, 1), a simple application of the Fubini theorem (or, equivalently, an integration by parts) yields that E[Zf (Z)] = E[f (Z)] for every smooth f . Now suppose that E[Zf (Z) − f (Z)] = 0 for every function f as in the statement, so that this equality holds in particular for f = fz and for every z ∈ R. By integrating both sides of (3.2) with respect to the law of Z, this yields that P (Z ≤ z) = P (N ≤ z) for every z ∈ R, and therefore that Z and N have the same law. Remark 3.3. Formulae (3.2) and (3.3) are known, respectively, as Stein’s equation and Stein’s bound. As already evoked in the Introduction, Point 3 in the statement of Lemma 3.1 is customarily referred to as Stein’s lemma. 3.2. General bounds on the Kolmogorov distance We now face the problem of establishing a bound on the normal approximation of a centered and Malliavin-differentiable random variable. The next statement contains one of the main findings of [18]. Theorem 3.1. (See [18]) Let Z ∈ D1,2 be such that E(Z) = 0 and Var(Z) = 1. Then, for N ∼ N (0, 1), !
d_Kol(Z, N) ≤ √( Var(⟨DZ, −DL⁻¹Z⟩_H) ).   (3.4)
Proof. In view of (3.3), it is enough to prove that, for every Lipschitz function f with Lipschitz constant less than or equal to 1, the quantity |E[Zf(Z) − f′(Z)]| is less than or equal to the right-hand side of (3.4). Start by considering a function f : R → R which is C¹ and such that ‖f′‖_∞ ≤ 1. Relation (2.15) yields
E[Zf(Z)] = E[f′(Z) ⟨DZ, −DL⁻¹Z⟩_H],
so that
|E[f′(Z)] − E[Zf(Z)]| = |E[f′(Z)(1 − ⟨DZ, −DL⁻¹Z⟩_H)]| ≤ E|1 − ⟨DZ, −DL⁻¹Z⟩_H|.
By a standard approximation argument (e.g. by using a convolution with an approximation of the identity), one sees that the inequality |E[f′(Z)] − E[Zf(Z)]| ≤ E|1 − ⟨DZ, −DL⁻¹Z⟩_H| continues to hold when f is Lipschitz with constant less than or equal to 1. Hence, by combining the previous estimates with (3.3), we infer that
d_Kol(Z, N) ≤ E|1 − ⟨DZ, −DL⁻¹Z⟩_H| ≤ √( E[(1 − ⟨DZ, −DL⁻¹Z⟩_H)²] ).
Finally, the desired conclusion follows by observing that, if one chooses f(z) = z in (2.15), then one obtains E(⟨DZ, −DL⁻¹Z⟩_H) = E(Z²) = 1, so that
E[(1 − ⟨DZ, −DL⁻¹Z⟩_H)²] = Var(⟨DZ, −DL⁻¹Z⟩_H).   (3.5)
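To see the bound (3.4) at work on an explicit example (our own illustration, not part of the original text), take Z_n = (2n)^{−1/2} Σ_{i≤n}(N_i² − 1) with N₁, …, N_n i.i.d. N(0, 1). Then Z_n belongs to the second Wiener chaos, ⟨DZ_n, −DL⁻¹Z_n⟩_H = n⁻¹ Σ_i N_i², and (3.4) gives d_Kol(Z_n, N) ≤ √(2/n). The sketch below (Python with NumPy/SciPy; sample sizes and seed are arbitrary) compares an empirical Kolmogorov distance with this bound.

```python
# Empirical Kolmogorov distance vs. the Stein-Malliavin bound (3.4) for
# Z_n = (2n)^(-1/2) sum_{i<=n} (N_i^2 - 1)  (sketch).
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(4)
n_mc = 10**6
grid = np.linspace(-4.0, 4.0, 801)

for n in (10, 100, 1000):
    # Z_n has the law of a centered, normalized chi-square with n degrees of freedom
    Zn = (rng.chisquare(df=n, size=n_mc) - n) / np.sqrt(2.0 * n)
    ecdf = np.searchsorted(np.sort(Zn), grid, side="right") / n_mc
    dkol = np.abs(ecdf - norm.cdf(grid)).max()
    print(n, dkol, np.sqrt(2.0 / n))   # observed distance vs. the bound sqrt(2/n)
```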
Remark 3.4. By using the standard properties of conditional expectations, one sees that (3.4) also implies the ‘finer’ bound !
(3.6) dKol (Z, N ) ≤ Var g(Z) , where g(Z) = E[ DZ, −DL−1 Z H |Z]. In general, it is quite difficult to obtain an explicit expression of the function g. However, if some crude estimates on g are available, then one can obtain explicit upper and lower bounds for the densities and the tail probabilities of the random variable Z. The reader is referred to Nourdin and Viens [24] and Viens [40] for several results in this direction, and to Breton et al. [4] for some statistical applications of these ideas. 3.3. Wiener chaos and the fourth moment condition In this section, we will apply Theorem 3.1 to chaotic random variables, that is, random variables having the special form of multiple Wiener-Itˆ o integrals of some fixed order q ≥ 2. As announced in the Introduction, this allows to recover and refine some recent characterizations of CLTs on Wiener chaos (see [26,27]). We begin with a technical lemma. Lemma 3.2. Fix an integer q ≥ 1, and let Z = Iq (f ) (with f ∈ Hq ) be such that Var(Z) = E(Z 2 ) = 1. The following three identities are in order: 2 q−1 1 q−1 2 DZH − 1 = q (r − 1)! I2q−2r (f ⊗r f ), q r−1 r=1 Var
1 DZ2H q
=
4 q r! (2q − 2r)!f ⊗r f 2H⊗2q−2r , q2 r
q−1 2 r r=1
2
(3.7)
(3.8)
and 4 q−1 3 2 q E(Z ) − 3 = rr! (2q − 2r)!f ⊗r f 2H⊗2q−2r . q r=1 r 4
In particular,
Var
1 DZ2H q
≤
q − 1
E(Z 4 ) − 3 . 3q
(3.9)
(3.10)
Proof. Without loss of generality, we can assume that H is equal to L2 (A, A, ν), where (A, A) is a measurable space
andν a σ-finite measure without atoms. For any a ∈ A, we have Da Z = qIq−1 f (·, a) so that
2 1 Iq−1 f (·, a) ν(da) DZ2H = q q A 2 q−1
q−1 r! I2q−2−2r f (·, a) ⊗r f (·, a) ν(da) by (2.8) =q r A r=0 q−1 2 q−1 =q r! I2q−2−2r f (·, a) ⊗r f (·, a)ν(da) r A r=0 2 q−1 q−1 =q r! I2q−2−2r (f ⊗r+1 f ) r r=0 2 q q−1 (r − 1)! I2q−2r (f ⊗r f ). =q r−1 r=1 2 q−1 q−1 = q!f 2H⊗q + q (r − 1)! I2q−2r (f ⊗r f ). r−1 r=1 Since E(Z 2 ) = q!f 2H⊗q , the proof of (3.7) is finished. The identity (3.8) follows from (3.7) and the orthogonality properties of multiple stochastic integrals. Using (in order) formula (2.13) and the relation D(Z 3 ) = 3Z 2 DZ, we infer that E(Z 4 ) =
1
3
1
E δDZ × Z 3 = E DZ, D(Z 3 ) H = E Z 2 DZ2H . q q q
Moreover, the multiplication formula (2.8) yields 2 q q 2 2 s! I2q−2s (f ⊗s f ). Z = Iq (f ) = s s=0
(3.11)
(3.12)
By combining this last identity with (3.7) and (3.11), we obtain (3.9) and finally (3.10). □
As a consequence of Lemma 3.2, we deduce the following bound on the Kolmogorov distance – first proved in [22].
Theorem 3.2. (See [22]) Let Z belong to the qth chaos H_q of X, for some q ≥ 2. Suppose moreover that Var(Z) = E(Z²) = 1. Then
d_Kol(Z, N) ≤ √( (q − 1)/(3q) ) √( E(Z⁴) − 3 ).   (3.13)
Proof. Since L⁻¹Z = −(1/q) Z, we have ⟨DZ, −DL⁻¹Z⟩_H = (1/q) ‖DZ‖²_H. So, we only need to apply Theorem 3.1 and formula (3.10). □
The estimate (3.13) allows to deduce the following characterization of CLTs on Wiener chaos. Note that the equivalence of Point (i) and Point (ii) in the next statement was first proved by Nualart and Peccati in [27] (by completely different techniques based on stochastic time-changes), whereas the equivalence of Point (iii) was first obtained by Nualart and Ortiz-Latorre in [26] (by means of Malliavin calculus, but not of Stein’s method). Theorem 3.3. (see [26,27]) Let (Zn ) be a sequence of random variables belonging to the qth chaos Hq of X, for some fixed q ≥ 2. Assume that Var(Zn ) = E(Zn2 ) = 1 for all n. Then, as n → ∞, the following three assertions are equivalent: Law
(i) Zn −→ N ∼ N (0, 1); (ii) E(Zn4 ) → E(N 4) = 3; (iii) Var
1 2 q DZn H
→ 0.
Proof. For every n, write Zn = Iq (fn ) with fn ∈ Hq uniquely determined. The implication (iii) → (i) is a direct application of Theorem 3.2, and of the fact that the topology of the Kolmogorov distance is stronger than the topology of the convergence in law. The implication (i) → (ii) comes from a bounded convergence argument (observe that supn≥1 E(Zn4 ) < ∞ by the hypercontractivity relation (2.5)). Finally, let us prove the implication (ii) → (iii). Suppose that (ii) is in order. Then, by virtue of (3.9), we have that fn ⊗r fn H⊗2q−2r tends to zero, as n → ∞, for all (fixed) r ∈ {1, . . . , q − 1}. Hence, (3.8) allows to conclude that (iii) is in order. The proof of Theorem 3.3 is thus complete. Remark 3.5. Theorem 3.3 has been applied to a variety of situations: see, e.g., (but the list is by no means exhaustive) Barndorff-Nielsen et al. [1], Corcuera et al. [8], Marinucci and Peccati [14], Neuenkirch and Nourdin [15], Nourdin and Peccati [17] and Tudor and Viens [39], and the references therein. See Peccati and Taqqu [30] for several combinatorial interpretations of these results. By combining Theorem 3.2 and Theorem 3.3, we obtain the following result. Corollary 14.1. Let the assumptions of Corollary 3.3 prevail. As n → ∞, the following assertions are equivalent: Law
(a) Zn −→ N ∼ N (0, 1); (b) dKol (Zn , N ) → 0. Proof. Of course, only the implication (a) → (b) has to be proved. Assume that (a) 1 2 is in order. By Corollary 3.3, we have that Var q DZn H → 0. Using Theorem 3.2, we get that (b) holds, and the proof is done.
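As a simple numerical companion to Theorem 3.2 and Theorem 3.3 (our own illustration, not from the original text), consider again Z_n = (2n)^{−1/2} Σ_{i≤n}(N_i² − 1), an element of the second chaos with E(Z_n²) = 1 and, exactly, E(Z_n⁴) = 3 + 12/n (the fourth cumulant of (N² − 1)/√2 equals 12). The few lines below estimate the fourth moment and evaluate the right-hand side of (3.13), making the n^{−1/2} rate visible; the Monte Carlo size is an arbitrary choice.

```python
# The fourth moment condition in action for a second-chaos sequence (sketch).
import numpy as np

rng = np.random.default_rng(5)
n_mc, q = 2_000_000, 2

for n in (10, 100, 1000):
    Zn = (rng.chisquare(df=n, size=n_mc) - n) / np.sqrt(2.0 * n)
    m4_emp = np.mean(Zn**4)                    # empirical E(Z_n^4)
    m4_exact = 3.0 + 12.0 / n                  # exact value: fourth cumulant is 12/n
    bound = np.sqrt((q - 1) / (3.0 * q) * (m4_exact - 3.0))   # right-hand side of (3.13)
    print(n, m4_emp, m4_exact, bound)          # bound equals sqrt(2/n)
```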
3.4. Quadratic variation of the fractional Brownian motion, part one In this section, we use Theorem 3.1 in order to derive an explicit bound for the second-order approximation of the quadratic variation of a fractional Brownian motion. Let B = {Bt : t ≥ 0} be a fractional Brownian motion with Hurst index H ∈ (0, 1). This means that B is a centered Gaussian process, started from zero and with covariance function E(Bs Bt ) = R(s, t) given by 1 2H t + s2H − |t − s|2H , s, t ≥ 0. R(s, t) = 2 The fractional Brownian motion of index H is the only centered Gaussian processes normalized in such a way that Var(B1 ) = 1, and such that B is selfsimilar with index H and has stationary increments. If H = 1/2 then R(s, t) = min(s, t) and B is simply a standard Brownian motion. If H = 1/2, then B is neither a (semi)martingale nor a Markov process (see, e.g., [25] for more details). As already explained in the Introduction (see Example 2.3), for any choice of the Hurst parameter H ∈ (0, 1) the Gaussian space generated by B can be identified with an isonormal Gaussian process X = {X(h) : h ∈ H}, where the real and separable Hilbert space H is defined as follows: (i) denote by E the set of all Rvalued step functions on [0, ∞), (ii) define H as the Hilbert space obtained by closing E with respect to the scalar product ) ( 1[0,t] , 1[0,s] H = R(t, s). In particular, with such a notation, one has that Bt = X(1[0,t] ). Set n−1 Law n2H n−1 1 2 Zn = (Bk+1 − Bk ) − 1 = (B(k+1)/n − Bk/n )2 − n−2H σn σn k=0
where σ_n > 0 is chosen so that E(Z_n²) = 1. It is well-known (see, e.g., [5]) that, for every H ≤ 3/4 and for n → ∞, one has that Z_n converges in law to N ∼ N(0, 1). The following result uses Stein's method in order to obtain an explicit bound for the Kolmogorov distance between Z_n and N. It was first proved in [18] (for the case H < 3/4) and [3] (for H = 3/4).
Theorem 3.4. Let N ∼ N(0, 1) and assume that H ≤ 3/4. Then, there exists a constant c_H > 0 (depending only on H) such that, for every n ≥ 1,
d_Kol(Z_n, N) ≤ c_H × { 1/√n if H ∈ (0, 1/2];  n^{2H−3/2} if H ∈ [1/2, 3/4);  1/√(log n) if H = 3/4 }.   (3.14)
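Theorem 3.4 can be probed numerically. The sketch below (our own, not part of the original text; the path length n, the Monte Carlo size and the seed are arbitrary) simulates fractional Gaussian noise from the increment covariance ρ(r) = ½(|r + 1|^{2H} + |r − 1|^{2H} − 2|r|^{2H}) (this is (3.15) below) via a Cholesky factorization, forms the normalized quadratic variation Z_n, and reports a rough empirical Kolmogorov distance to N(0, 1) for a few values of H ≤ 3/4.

```python
# Simulating the normalized quadratic variation Z_n of a fractional Brownian motion (sketch).
import numpy as np
from scipy.stats import norm

def rho(r, H):
    return 0.5 * (abs(r + 1)**(2 * H) + abs(r - 1)**(2 * H) - 2 * abs(r)**(2 * H))

rng = np.random.default_rng(6)
n, n_mc = 200, 5000
grid = np.linspace(-4.0, 4.0, 401)

for H in (0.3, 0.6, 0.75):
    C = np.array([[rho(k - l, H) for l in range(n)] for k in range(n)])
    L = np.linalg.cholesky(C + 1e-12 * np.eye(n))      # fractional Gaussian noise generator
    increments = L @ rng.standard_normal((n, n_mc))    # n unit increments per Monte Carlo path
    sigma_n = np.sqrt(2.0 * sum(rho(k - l, H)**2 for k in range(n) for l in range(n)))
    Zn = (increments**2 - 1.0).sum(axis=0) / sigma_n
    ecdf = np.searchsorted(np.sort(Zn), grid, side="right") / n_mc
    print(H, np.abs(ecdf - norm.cdf(grid)).max())      # rough empirical Kolmogorov distance
```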
Remark 3.6. (1) By inspection of the forthcoming proof of Theorem 3.4, one sees that σ2 limn→∞ nn = 2 r∈Z ρ2 (r) if H ∈ (0, 3/4), with ρ given by (3.15), and σ2
n limn→∞ n log n = 9/16 if H = 3/4. (2) When H > 3/4, the sequence (Zn ) does not converge in law to N (0, 1). ActuLaw ally, Zn −→ Z∞ ∼ ‘Hermite random variable’ and, using a result by Davydov
n→∞
and Martynova [9], one can also associate a bound to this convergence. See [3] for details on this result. (3) More generally, and using the analogous computations, one can associate bounds with the convergence of sequence Zn(q) =
n−1 1 (q) σn k=0
Law
Hq (Bk+1 − Bk ) =
n−1 1 (q) σn k=0
Hq (nH (B(k+1)/n − Bk/n )
towards N ∼ N (0, 1), where Hq (q ≥ 3) denotes the qth Hermite polynomial (q) (as defined in (2.3)), and σn is some appropriate normalizing constant. In this case, the critical value is H = 1 − 1/(2q) instead of H = 3/4. See [18] for details. In order to show Theorem 3.4, we will need the following ancillary result, whose proof is obvious and left to the reader. Lemma 3.3. 1. For r ∈ Z, let ρ(r) =
1
|r + 1|2H + |r − 1|2H − 2|r|2H . 2
(3.15)
If H = 12 , one has ρ(r) ∼ H(2H − 1)|r|2H−2 as |r| → ∞. If H = 12 and |r| ≥ 1, one has ρ(r) = 0. Consequently, r∈Z ρ2 (r) < ∞ if and only if H < 3/4. n−1 2. For all α > −1, we have r=1 rα ∼ nα+1 /(α + 1) as n → ∞. We are now ready to prove the main result of this section.
Proof of Theorem 3.4. Since 1[k,k+1] 2H = E (Bk+1 − Bk )2 = 1, we have, by (2.4), (Bk+1 − Bk )2 − 1 = I2 (1⊗2 [k,k+1] ) ⊗2 2 so that Zn = I2 (fn ) with fn = σ1n n−1 . Let us compute the exact k=0 1[k,k+1] ∈ H
value of σn . Observe that 1[k,k+1] , 1[l,l+1] H = E (Bk+1 −Bk )(Bl+1 −Bl ) = ρ(k−l)
with ρ given by (3.15). Hence 2 n−1 E (Bk+1 − Bk )2 − 1 k=0
2 n−1 n−1 ⊗2 = I2 (1⊗2 ) E I2 (1⊗2 = E [k,k+1] [k,k+1] )I2 (1[l,l+1] ) k=0 n−1
=2
k,l=0
1[k,k+1] , 1[l,l+1] 2H = 2
k,l=0
n−1
ρ2 (k − l).
k,l=0
That is,
σn2 = 2
n−1
ρ2 (k − l) = 2
k,l=0
n−1 n−1−l
ρ2 (r) = 2 n
ρ2 (r) −
|r|
l=0 r=−l
2 |r| + 1 ρ (r) .
|r|
Assume that H < 3/4. Then, we have |r| + 1 σn2 2 =2 1{|r|
Since
r∈Z ρ
2
(r) < ∞, we obtain, by bounded Lebesgue convergence: σn2 =2 ρ2 (r). n→∞ n lim
(3.16)
r∈Z
Assume that H = 3/4. We have ρ2 (r) ∼ n
ρ2 (r) ∼
|r|
9n 64
9 64|r|
as |r| → ∞. Therefore, as n → ∞,
0<|r|
1 9n log n ∼ |r| 32
and
|r|
9 9n . |r| + 1 ρ2 (r) ∼ 1∼ 64 32 |r|
We deduce that σn2 9 = . n→∞ n log n 16 lim
(3.17)
Now, we have, see (3.8) for the first equality, 2 22 2 n−1 2 2 2
1 1 1 2 ⊗2 ⊗2 2 Var DZn 2H = fn ⊗1 fn 2H⊗2 = 1 ⊗ 1 1 [l,l+1] 2 [k,k+1] 2 4 2 2 2σn 2 2 k,l=0 H 2 22 2 n−1 2 2 1 2 2 = ρ(k − l)1[k,k+1] ⊗ 1[l,l+1] 2 2 2 4 2σn 2 2 k,l=0 H
=
1 2σn4
1 ≤ 4σn4
n−1
ρ(k − l)ρ(i − j)ρ(k − i)ρ(l − j)
i,j,k,l=0 n−1
|ρ(k − i)||ρ(i − j)| ρ2 (k − l) + ρ2 (l − j)
i,j,k,l=0
n−1 n−1 1 |ρ(k − i)||ρ(i − j)| ρ2 (r) ≤ 2σn4 r=−n+1 i,j,k=0 n−1 2 n−1 n ≤ |ρ(s)| ρ2 (r). 4 2σn s=−n+1 r=−n+1
If H ≤ 1/2 then s∈Z |ρ(s)| < ∞ and r∈Z ρ2 (r) < ∞ so that, in view of (3.16),
1 n−1 Var 2 DZn 2H = O(n−1 ). If 1/2 < H < 3/4 then s=−n+1 |ρ(s)| = O(n2H−1 ) 2 (see Lemma 3.3) and r∈Z ρ (r) < ∞ so that, in view of (3.16), one has
1 n−1 √ Var 2 DZn 2H = O(n4H−3 ). If H = 3/4 then s=−n+1 |ρ(s)| = O( n) and n−1 2 n) (indeed, by Lemma 3.3, ρ2 (r) ∼ cst r=−n+1 ρ (r) = O(log |r| as |r| → ∞) so that,
1 2 in view of (3.17), Var 2 DZn H = O(1/ log n). Finally, the desired conclusion follows from Theorem 3.2. 2
3.5. The method of (fourth) moments: explicit estimates via interpolation It is clear that the combination of Theorem 3.2 and Theorem 3.3 provides a remarkable simplification of the method of moments and cumulants, as applied to the derivation of CLTs on a fixed Wiener chaos (further generalizations of these results, concerning in particular multi-dimensional CLTs, are discussed in the forthcoming Section 4). In particular, one deduces from (3.13) that, for a sequence of chaotic random variables with unit variance, the speed of convergence to zero of the fourth cumulants E(Zn4 ) − 3 also determines the speed of convergence in the Kolmogorov distance. In this section, we shall state and prove a new upper bound, showing that, for a normalized chaotic sequence {Zn : n ≥ 1} converging in distribution to N ∼ N (0, 1), the convergence to zero of E(Znk ) − E(N k ) is always dominated by the
speed of convergence of the square root of E(Zn4 ) − E(N 4 ) = E(Zn4 ) − 3. To do this, we shall apply a well-known Gaussian interpolation technique, which has been essentially introduced by Talagrand (see, e.g., [38]); note that a similar approach has recently been adopted in [22], in order to deduce a universal characterization of CLTs for sequences of homogeneous sums. Remark 3.7. 1. In principle, one could deduce from the results of this section that, for every k ≥ 3, the speed of convergence to zero of kth cumulant of Zn is always dominated by the speed of convergence of the fourth cumulant E(Zn4 ) − 3. 2. We recall that the explicit computation of moments and cumulants of chaotic random variables is often performed by means of a class of combinatorial devices, known as diagram formulae. This tools are not needed in our analysis, as we rather rely on multiplication formulae and integration by parts techniques from Malliavin calculus. See section 3 in [30] for a recent and self-contained introduction to moments, cumulants and diagram formulae. Proposition 3.1. Let q ≥ 2 be an integer, and let Z be an element of the qth chaos Hq of X. Assume that Var(Z) = E(Z 2 ) = 1, and let N ∼ N (0, 1). Then, for all integer k ≥ 3, E(Z k ) − E(N k ) ≤ ck,q E(Z 4 ) − E(N 4 ), (3.18) where the constant ck,q is given by : < kq 5 q − 1 (2k − 4)! k− 2 −q + (2k − 5) 2 . ck,q = (k − 1)2 3q 2k−2 (k − 2)! Proof. Without loss of generality, we can assume that N is independent of the underlying isonormal √ Gaussian process X. Fix an integer k ≥ 3. By denoting √ k Ψ(t) = E ( 1 − tZ + tN ) , t ∈ [0, 1], we have 1 E(Z k ) − E(N k ) = Ψ(1) − Ψ(0) ≤ |Ψ (t)|dt, 0
where the derivative Ψ is easily seen to exist on (0, 1), and moreover one has √ √ √ √ k k E ( 1 − tZ + tN )k−1 Z . Ψ (t) = √ E ( 1 − tZ + tN )k−1 N − √ 2 1−t 2 t By integrating by parts and by using the explicit expression of the Gaussian density, one infers that % $ √ √ √ √ E ( 1 − tZ + tN )k−1 N = E E ( 1 − tz + tN )k−1 N |z=Z % √ $ √ √ = (k − 1) t E E ( 1 − tz + tN )k−2 |z=Z √ √ √ = (k − 1) t E ( 1 − tZ + tN )k−2 .
Similarly, using this time (2.15) in order to perform the integration by parts and taking into account that DZ, −DL−1 Z H = 1q DZ2H because Z ∈ Hq , we can write √ √ E ( 1 − tZ + tN )k−1 Z % $ √ √ = E E ( 1 − tZ + tx)k−1 Z |x=N √ k−2 1 √ √ 2 DZH |x=N = (k − 1) 1 − t E E ( 1 − tZ + tx) q √ √ √ k−2 1 2 DZH . = (k − 1) 1 − t E ( 1 − tZ + tN ) q Hence, Ψ (t) =
k(k − 1) E 2
and consequently k(k − 1) Ψ (t) ≤ 2
√ √ 1 1 − DZ2H ( 1 − tZ + tN )k−2 , q
7 E < $ 2 F % 8 8 √ √ 1 . E ( 1 − tZ + tN )2k−4 × 9E 1 − DZ2H q
By (3.5) and (3.10), we have E 2 F 1 q − 1
1 2 2 DZH ≤ E(Z 4 ) − 3 . = Var E 1 − DZH q q 3q √ √ √ Using succesively (x+y)2k−4 ≤ 22k−5 (x2k−4 +y 2k−4 ), x + y ≤ x+ y, inequality (2.5) and E(N 2k−4 ) = (2k − 4)!/(2k−2 (k − 2)!), we can write < $ % √ √ E ( 1 − tZ + tN )2k−4 ! ! 5 k 5 k ≤ 2k− 2 (1 − t) 2 −1 E(Z 2k−4 ) + 2k− 2 t 2 −1 E(N 2k−4 ) : kq k 5 k (2k − 4)! k− 52 −1 −q k− −1 ≤2 (1 − t) 2 (2k − 5) 2 +2 2 t 2 2k−2 (k − 2)! so that : E F 1< $ % k− 32 √ √ kq 2 (2k − 4)! −q (2k − 5) 2 . E ( 1 − tZ + tN )2k−4 dt ≤ + k 2k−2 (k − 2)! 0 Putting all these bounds together, one deduces the desired conclusion. 4. Multidimensional case Here and for the rest of the section, we consider as given an isonormal Gaussian process {X(h) : h ∈ H}, over some real separable Hilbert space H.
4.1. Main bounds We shall now present (without proof) a result taken from [23], concerning the Gaussian approximation of vectors of random variables that are differentiable in the Malliavin sense. We recall that the Wasserstein distance between the laws of two Rd -valued random vectors X and Y , noted dW (X, Y ), is given by E[g(X)] − E[g(Y )], sup dW (X, Y ) := g∈H ; g Lip ≤1
where H indicates the class of all Lipschitz functions, that is, the collection of all functions g : Rd → R such that gLip := sup x=y
|g(x) − g(y)| <∞ x − yRd
(with · Rd the usual Euclidian norm on Rd ). Also, we recall that the operator norm of a d × d matrix A over R is given by Aop := sup x Rd =1 AxRd . Note that, in the following statement, we require that the approximating Gaussian vector has a positive definite covariance matrix. Theorem 4.1. (See [23]) Fix d ≥ 2 and let C = (Cij )1≤i,j≤d be a d × d positive definite matrix. Suppose that N ∼ Nd (0, C), and assume that Z = (Z1 , . . . , Zd ) is a Rd -valued random vector such that E[Zi ] = 0 and Zi ∈ D1,2 for every i = 1, . . . , d. Then, 7 8 d 8 −1 1/2 9 E[(Cij − DZi , −DL−1 Zj H )2 ]. dW (Z, N ) ≤ C op Cop i,j=1
In what follows, we shall use once again interpolation techniques in order to partially generalize Theorem 4.1 to the case where the approximating covariance matrix C is not necessarily positive definite. This additional difficulty forces us to work with functions that are smoother than the ones involved in the definition of the Wasserstein distance. To this end, we will adopt the following simplified notation: for every ϕ : Rd → R of class C 2 , we set 2 ∂ ϕ (z) . ϕ ∞ = max sup i,j=1,...,d z∈Rd ∂xi ∂xj Theorem 4.2. (See [22]) Fix d ≥ 2, and let C = (Cij )1≤i,j≤d be a d × d covariance matrix. Suppose that N ∼ Nd (0, C) and that Z = (Z1 , . . . , Zd ) is a Rd -valued random vector such that E[Zi ] = 0 and Zi ∈ D1,2 for every i = 1, . . . , d. Then, for every ϕ : Rd → R belonging to C 2 such that ϕ ∞ < ∞, we have d E[ϕ(Z)] − E[ϕ(N )] ≤ 1 ϕ ∞ E Ci,j − DZj , −DL−1 Zi H . 2 i,j=1
(4.1)
230
Ivan Nourdin and Giovanni Peccati
Proof. Without loss of generality, we assume that N is independent of the underlying a C 2 -function such that ϕ ∞ < isonormal Gaussian process X. Let ϕ : √ Rd → R be √ ∞. For any t ∈ [0, 1], set Ψ(t) = E ϕ 1 − tZ + tN , so that 1 E[ϕ(Z)] − E[ϕ(N )] = Ψ(1) − Ψ(0) ≤ |Ψ (t)|dt. 0
We easily see that Ψ is differentiable on (0, 1) with d √ 1 ∂ϕ √ 1 √ Ni − √ Ψ (t) = Zi . E 1 − tZ + tN ∂xi 2 1−t 2 t i=1 By integrating by parts, we can write √ ∂ϕ √ E 1 − tZ + tN Ni ∂xi + * √ ∂ϕ √ 1 − tz + tN Ni =E E ∂xi |z=Z + * d 2 √ √ ∂ ϕ √ = t Ci,j E E 1 − tz + tN ∂xi ∂xj |z=Z j=1 =
d √ t Ci,j E j=1
√ ∂ 2 ϕ √ 1 − tZ + tN . ∂xi ∂xj
By using (2.15) in order to perform the integration by parts, we can also write √ ∂ϕ √ 1 − tZ + tN Zi E ∂xi + * √ ∂ϕ √ 1 − tZ + tx Zi =E E ∂xi |x=N + * d √ √ ∂ 2 ϕ √ −1 = 1−t E E 1 − tZ + tx DZj , −DL Zi H ∂xi ∂xj |x=N j=1 =
d √ 1−t E j=1
√ ∂ 2 ϕ √ 1 − tZ + tN DZj , −DL−1 Zi H . ∂xi ∂xj
Hence
2 d √
1 ∂ ϕ √ −1 Ψ (t) = E 1 − tZ + tN Ci,j − DZj , −DL Zj H , 2 i,j=1 ∂xi ∂xj
so that
0
1
|Ψ (t)|dt ≤
d 1 ϕ ∞ E Ci,j − DZj , −DL−1 Zi H 2 i,j=1
and the desired conclusion follows.
2
We now aim at applying Theorem 4.2 to vectors of multiple stochastic integrals. Corollary 14.2. Fix integers d ≥ 2 and 1 ≤ q1 ≤ . . . ≤ qd . Consider a vector Z = (Z1 , . . . , Zd ) := (Iq1 (f1 ), . . . , Iqd (fd )) with fi ∈ Hqi for any i = 1 . . . , d. Let N ∼ Nd (0, C), with C = (Cij )1≤i,j≤d a d × d covariance matrix. Then, for every ϕ : Rd → R belonging to C 2 such that ϕ ∞ < ∞, we have d 1 E[ϕ(Z)] − E[ϕ(N )] ≤ 1 ϕ ∞ E Ci,j − DZj , DZi H . 2 di i,j=1 Proof. We have −L−1 Zi =
1 di
(4.2)
Zi so that the desired conclusion follows from (4.1). 2
When one applies Corollary 14.2 in concrete situations, one can use the following result in order to evaluate the right-hand side of (4.2). Proposition 4.2. Let F = Ip (f ) and G = Iq (g), with f ∈ Hp and g ∈ Hq (p, q ≥ 1). Let a be a real constant. If p = q, one has the estimate: E 2 F 1 E a − DF, DG H ≤ (a − p! f, g H⊗p )2 p 4 p−1
p−1 p2 + (r − 1)!2 (2p − 2r)! f ⊗p−r f 2H⊗2r + g ⊗p−r g2H⊗2r . 2 r=1 r−1 On the other hand, if p < q, one has that E 2 F 2 1 q−1 E a − DF, DG H ≤ a2 + p!2 (q − p)!f 2H⊗p g ⊗q−p gH⊗2p q p−1 2 2 p−1 q−1 p2 2 p−1 + (r − 1)! (p + q − 2r)! 2 r=1 r−1 r−1
× f ⊗p−r f 2H⊗2r + g ⊗q−r g2H⊗2r . Remark 4.8. When bounding the right-hand side of (4.2), we see that it is sufficient to asses the quantities fi ⊗r fi H⊗2(qi −r) for all i = 1, . . . , d and r = 1, . . . , qi − 1 on the one hand, and E(Zi Zj ) = qi ! fi , fj H⊗qi for all i, j = 1, . . . , d such that qi = qj on the other hand. In particular, this fact allows to recover a result first proved by Peccati and Tudor in [31], namely that, for vectors of multiple stochastic integrals whose covariance matrix is converging, the componentwise convergence to a Gaussian distribution always implies joint convergence. Proof of Proposition 4.2. Without loss of generality, we can assume that H = L2 (A, A , µ), where (A, A ) is a measurable space, and µ is a σ-finite and non-atomic
measure. Thus, we can write DF, DG H = p q Ip−1 (f ), Iq−1 (g) H = p q
Ip−1 f (·, t) Iq−1 g(·, t) µ(dt)
A
p−1 q−1 r! Ip+q−2−2r f (·, t)⊗r g(·, t) µ(dt) = pq r r A r=0 p∧q−1 p−1 q−1 = pq r! Ip+q−2−2r (f ⊗r+1 g) r r r=0 p∧q p−1 q−1 (r − 1)! Ip+q−2r (f ⊗r g). = pq r−1 r−1 r=1
p∧q−1
It follows that E 2 F 1 (4.3) E a − DF, DG H q
p 2 2 2 p−1 2 q−1 2 2 a + p r=1 (r − 1)! r−1 r−1 (p + q − 2r)!f ⊗r gH⊗(p+q−2r) if p < q, =
4 p−1 2 (a − p! f, g H⊗p )2 + p2 r=1 (r − 1)!2 p−1 r−1 (2p − 2r)!f ⊗r gH⊗(2p−2r) if p = q. If r < p ≤ q then f ⊗r g2H⊗(p+q−2r) ≤ f ⊗r g2H⊗(p+q−2r) = f ⊗p−r f, g ⊗q−r g H⊗2r ≤ f ⊗p−r f H⊗2r g ⊗q−r gH⊗2r 1
f ⊗p−r f 2H⊗2r + g ⊗q−r g2H⊗2r . ≤ 2 If r = p < q, then f ⊗p g2H⊗(q−p) ≤ f ⊗p g2H⊗(q−p) ≤ f 2H⊗p g ⊗q−p gH⊗2p . If r = p = q, then f ⊗p g = f, g H⊗p . By plugging these last expressions into (4.3), we deduce immediately the desired conclusion. 4.2. Quadratic variation of fractional Brownian motion, continued In this section, we continue the example of Section 3.4. We still denote by B a fractional Brownian motion with Hurst index H ∈ (0, 3/4]. We set nt−1 1 (Bk+1 − Bk )2 − 1 , t ≥ 0, Zn (t) = σn k=0
where σn > 0 is such that E Zn (1)2 = 1. The following statement contains the multidimensional counterpart of Theorem 3.4, namely a bound associated with the convergence of the finite dimensional distributions of {Zn (t) : t ≥ 0} towards a standard Brownian motion. A similar result can be of course recovered from Theorem 4.1 – see again [23].
Theorem 4.3. Fix d ≥ 1, and consider 0 = t0 < t1 < . . . < td . Let N ∼ Nd (0, Id ). There exists a constant c (depending only on d, H and t1 , . . . , td ) such that, for every n ≥ 1: E F Zn (ti ) − Zn (ti−1 ) √ − E ϕ(N ) sup E ϕ ti − ti−1 1≤i≤d 1 √ if H ∈ (0, 12 ] n 3 ≤ c × n2H− 2 if H ∈ [ 12 , 34 ) √1 if H = 34 log n where the supremum is taken over all C 2 -function ϕ : Rd → R such that ϕ ∞ ≤ 1. Proof. We only make the proof for H < 3/4, the proof for H = 3/4 being similar. Fix d ≥ 1 and t0 = 0 < t1 < . . . < td . In the sequel, c will denote a constant independent of n, which can differ from one line to another. First, see e.g. the proof of Theorem 3.4, observe that Zn (ti ) − Zn (ti−1 ) (n) √ = I2 (fi ) ti − ti−1 with fn(i) =
1 √ σn ti − ti−1
nti −1
1⊗2 [k,k+1] .
k=nti−1
In the proof of Theorem 3.4, it is shown that, for any fixed i ∈ {1, . . . , d} and r ∈ {1, . . . , qi − 1}: 1 if H ∈ (0, 12 ] √n fn(i) ⊗1 fn(i) H⊗2 ≤ c × . (4.4) n2H− 32 if H ∈ [ 1 , 3 ) 2 4 Moreover, when 1 ≤ i < j ≤ d, we have, with ρ defined in (3.15), nti −1 ntj −1 (i) (j) 1 2 fn , fn H⊗2 = √ ρ (l − k) √ σ 2 ti − ti−1 tj − tj−1 n k=nti−1 l=ntj−1 ntj −nti−1 −1 c = 2 gi,j,n (r)ρ2 (r) σn |r|=ntj−1 −nti +1 "nti # − "nti−1 # − 1 ≤c ρ2 (r) (4.5) σn2 |r|≥ntj−1 −nti +1
= O n4H−3 , as n → ∞,
where
gi,j,n (r) = ("ntj # − 1 − r) ∧ ("nti # − 1) − ("ntj−1 # − r) ∨ ("nti−1 #) ,
the last equality coming from (3.16) and ρ2 (r) = O( |r|4H−4 ) = O(N 4H−3 ), |r|≥N
as N → ∞.
|r|≥N
Finally, by combining (4.4), (4.5), Corollary 14.2 and Proposition 4.2, we obtain the desired conclusion. References 1. O. Barndorff-Nielsen, J. Corcuera, M. Podolskij and J. Woerner (2009). Bipower variations for Gaussian processes with stationary increments. J. Appl. Probab. 46, no. 1, 132-150. 2. B. Bercu, I. Nourdin and M.S. Taqqu (2009). A multiple stochastic integral criterion for almost sure limit theorems. Preprint. 3. J.-C. Breton and I. Nourdin (2008). Error bounds on the non-normal approximation of Hermite power variations of fractional Brownian motion. Electron. Comm. Probab. 13, 482-493. 4. J.-C. Breton, I. Nourdin and G. Peccati (2009). Exact confidence intervals for the Hurst parameter of a fractional Brownian motion. Electron. J. Statist. 3, 416-425 (Electronic) 5. P. Breuer et P. Major (1983). Central limit theorems for non-linear functionals of Gaussian fields. J. Mult. Anal. 13, 425-441. 6. D. Chambers et E. Slud (1989). Central limit theorems for nonlinear functionals of stationary Gaussian processes. Probab. Theory Rel. Fields 80, 323-349. 7. L.H.Y. Chen and Q.-M. Shao (2005). Stein’s method for normal approximation. In: An Introduction to Stein’s Method (A.D. Barbour and L.H.Y. Chen, eds), Lecture Notes Series No.4, Institute for Mathematical Sciences, National University of Singapore, Singapore University Press and World Scientific 2005, 1-59. 8. J.M. Corcuera, D. Nualart et J.H.C. Woerner (2006). Power variation of some integral long memory process. Bernoulli 12, no. 4, 713-735. 9. Y.A. Davydov and G.V. Martynova (1987). Limit behavior of multiple stochastic integral. Preila, Nauka, Moscow 55-57 (in Russian). 10. R.M. Dudley (1967). The sizes of compact subsets of Hilbert space and continuity of Gaussian processes. J. Funct. Anal. 1, 290-330. 11. L. Giraitis and D. Surgailis (1985). CLT and other limit theorems for functionals of Gaussian processes. Zeitschrift f¨ ur Wahrsch. verw. Gebiete 70, 191-212. 12. S. Janson (1997). Gaussian Hilbert Spaces. Cambridge University Press, Cambridge. 13. P. Major (1981). Multiple Wiener-Itˆ o integrals. LNM 849. Springer-Verlag, Berlin Heidelberg New York. 14. D. Marinucci and G. Peccati (2007). High-frequency asymptotics for subordinated stationary fields on an Abelian compact group. Stochastic Process. Appl. 118, no. 4, 585-613. 15. A. Neuenkirch and I. Nourdin (2007). Exact rate of convergence of some approximation schemes associated to SDEs driven by a fractional Brownian motion. J. Theoret. Probab. 20, no. 4, 871-899.
16. I. Nourdin and G. Peccati (2009). Non-central convergence of multiple integrals. Ann. Probab. 37, no. 4, 1412-1426. 17. I. Nourdin and G. Peccati (2008). Weighted power variations of iterated Brownian motion. Electron. J. Probab. 13, no. 43, 1229-1256 (Electronic). 18. I. Nourdin and G. Peccati (2009). Stein’s method on Wiener chaos. Probab. Theory Rel. Fields 145, no. 1, 75-118. 19. I. Nourdin and G. Peccati (2008). Stein’s method and exact Berry-Ess´een asymptotics for functionals of Gaussian fields. Ann. Probab., to appear. 20. I. Nourdin, G. Peccati and G. Reinert (2009). Second order Poincar´e inequalities and CLTs on Wiener space. J. Func. Anal. 257, 593-609. 21. I. Nourdin, G. Peccati and G. Reinert (2008). Stein’s method and stochastic analysis of Rademacher functionals. Preprint. 22. I. Nourdin, G. Peccati and G. Reinert (2009). Invariance principles for homogeneous sums: universality of Gaussian Wiener chaos Preprint. 23. I. Nourdin, G. Peccati and A. R´eveillac (2008). Multivariate normal approximation using Stein’s method and Malliavin calculus. Ann. Inst. H. Poincar´ e Probab. Statist., to appear. 24. I. Nourdin and F. Viens (2008). Density estimates and concentration inequalities with Malliavin calculus. Electron. J. Probab., to appear. 25. D. Nualart (2006). The Malliavin calculus and related topics of Probability and Its Applications. Springer Verlag, Berlin, Second edition, 2006. 26. D. Nualart and S. Ortiz-Latorre (2008). Central limit theorems for multiple stochastic integrals and Malliavin calculus. Stochastic Process. Appl. 118 (4), 614-628. 27. D. Nualart and G. Peccati (2005). Central limit theorems for sequences of multiple stochastic integrals. Ann. Probab. 33 (1), 177-193. 28. D. Nualart and J. Vives (1990). Anticipative calculus for the Poisson space based on the Fock space. S´eminaire de Probabilit´es XXIV, LNM 1426. Springer-Verlag, Berlin Heidelberg New York, pp. 154-165. 29. G. Peccati, J.-L. Sol´e, F. Utzet and M.S. Taqqu (2008). Stein’s method and normal approximation of Poisson functionals. Ann. Probab., to appear. 30. G. Peccati and M.S. Taqqu (2008). Moments, cumulants and diagram formulae for non-linear functionals of random measures (Survey). Preprint. 31. G. Peccati and C.A. Tudor (2005). Gaussian limits for vector-valued multiple stochastic integrals. S´eminaire de Probabilit´es XXXVIII, LNM 1857. Springer-Verlag, Berlin Heidelberg New York, pp. 247-262. 32. N. Privault (2008). Stochastic analysis of Bernoulli processes. Probability Surveys 5, 435-483. 33. G. Reinert (2005). Three general approaches to Stein’s method. In: An introduction to Stein’s method, 183-221. Lect. Notes Ser. Inst. Math. Sci. Natl. Univ. Singap. 4, Singapore Univ. Press, Singapore. 34. B. Rider and B. Vir´ ag (2007). The noise in the circular law and the Gaussian free field. Int. Math. Res. Not. 2, Art. ID rnm006. 35. S. Sheffield (1997). Gaussian free field for mathematicians. Probab. Theory Rel. Fields 139(3-4), 521-541 36. Ch. Stein (1972). A bound for the error in the normal approximation to the distribution of a sum of dependent random variables. In: Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics and Probability, Vol. II: Probability theory, 583-602. Univ. California Press, Berkeley, CA. 37. Ch. Stein (1986). Approximate computation of expectations. Institute of Mathematical Statistics Lecture Notes - Monograph Series, 7. Institute of Mathematical Statistics,
236
Ivan Nourdin and Giovanni Peccati
Hayward, CA. 38. M. Talagrand (2003). Spin Glasses: A Challenge for Mathematicians. Cavity and Mean Fields Models. Springer, Berlin. 39. C.A. Tudor and F. Viens (2008). Variations and estimators for the selfsimilarity order through Malliavin calculus. Ann. Probab., to appear. 40. F. Viens (2009). Stein’s lemma, Malliavin calculus and tail bounds, with applications to polymer fluctuation exponents. Stochastic Process. Appl., to appear.
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 237–250
Chapter 15 On Stochastic Integrals with Respect to an Infinite Number of Poisson Point Process and Its Applications Guanglin Rang1 , Qing Li2 and Sheng You2
∗
1. School of Mathematics and Statistics Wuhan University Wuhan 430072, China E-mail: [email protected] 2. Faculty of Mathematics and Computer Science Hubei University Wuhan 430062, China E-mail: [email protected] This chapter investigates stochastic integrals with respect to an infinite number of Poisson point processes and its’ corresponding martingale representations. Furthermore, with non-Markovian and non-Lipschitz coefficients, stochastic differential equations driven by sequence of Poisson point processes are discussed, where the results are extensions of the linear continuous cases.
Keywords: stochastic integral, infinite number of Poisson point processes, nonMarkovian coefficients, integral inequalities 2000 AMS Subject Classification: 60H05, 60H10 Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 Stochastic integral with respect to the infinite number of Poisson point processes 3 Martingale representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Non-Markovian SDE driven by countably many Poisson point processes . . . . . References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . .
. . . . .
. . . . .
237 238 241 246 250
1. Introduction Since financial market contains infinite number of assets, where each asset price is driven by an idiosyncratic random source as well as by a systematic noise term, it is not enough to describe this market by finite number of stochastic processes. So it necessitates to develop stochastic calculus with respect to a sequence of stochastic ∗ The
authors gratefully acknowledge the support by Natural Science Foundation of China (No.10871153). 237
238
Guanglin Rang, Qing Li and Sheng You
processes, semi-martingales [5], or slight restrictively speaking, independent increment processes. Such a calculus , in fact, a special case of the theory of cylindrical stochastic calculus, was presented first by M. Hitsuda and H. Watanabe in [6], there stochastic integral, Ito formula and Girsanov transformation with respect to an infinite number of Brownian motions were given systematically, and then applied conveniently to a causal and causally invertible representation of equivalent Gaussian processes. Cao and He in [2] obtained the existence and uniqueness of solution of stochastic differential equation (SDE in short) driven by a sequence of Brownian motion under non-Lipschtiz conditions by a successive approximation (also see [3]). Using a similar method they also got the same results of H-valued SDE and backward SDE in a general setup, i.e., driven by cylindrical Brownian motion with Poisson point process. All considerations above are based one Poisson process, which as a source of uncertainty are a standard tool for modeling rare and randomly occurring events. These processes can be found, among others, in quality-ladder models of growth, in the endogenous fluctuations and growth literature with uncertainty, in the labor market matching literature, in monetary economics (see [1,13] and references therein). So stochastic calculus related to one jump source might be inappropriate for the use in economic modeling. Our aim, in this chapter, is to establish systematically the theory of stochastic calculus with respect to a sequence of Poisson processes (for simplicity we exclude the term driven by Brownian motion). In most cases Poisson processes affect the concerned variables through a stochastic differential equation, we shall explore the existence and uniqueness of SDE driven by countably many Poisson point processes under non-Markovin coefficients conditions, i.e., the coefficients satisfy integral non-Lipschtiz conditions. The results is obtained by a generalized Biharyi type inequality, so far as we know, this is not discussed yet. 2. Stochastic integral with respect to the infinite number of Poisson point processes Let(Ω, F , P ) be the underlying space, with a filtration {Ft , t ≥ 0} satisfying the usual conditions. Let (Ui , Bi , ni ), i = 1, 2, · · · be a sequence of σ-finite measure spaces (not necessary common). For i ∈ N, qi = (qi (tik ), tik ∈ Di , k = 1, 2, · · · ) is a Poisson point process defined on (Ω, F , P ) taking values in Ui with domain Di . Ni (dtdx) is the counting measure associated with pi , that is, for U ∈ Ui ,
1U pi (tik ) 1{tik t} , Ni [0, t], U = tik ∈Di
where 1U ·) is the indicator of set U . Furthermore assume pi stationary, hence ˆ N Ni (dtdx) admits a compensator i (dtdx) = dt ni(dx), such that, for every U ∈ Ui ,withn(Ui ) < ∞, Ni (0, t], U = Ni (0, t], U − Ni (0, t), U is Ft -square
Stochastic Integrals with Respect to Poisson Point Process
239
integrable martingale. A real-valued predictable function f (t, x, w) : R+ × Ui × Ω → R means P ⊗ Bi -measurable, where P is a σ−algebra generated by all left limited adapted processes. Let t+ & ' ˆi (ds, dz) < ∞ , |f (s, z, w)|N Fi1 = f : f predictable with 0
Ui
t+ & Fi2 = f : f predictable with 0
For f ∈
Fi1 ,
' f 2 (s, z, w)ds ni (dz) < ∞ .
Ui
define stochastic integrals of f with respect to Ni by t+ f (s−, z, w)N (ds, dz) 0
=
Ui
f (tik , p(tik ), ω) −
tik t
t f (s, z, ω)ds ni (dz), 0 Ui
it is an Ft -martingale. For f ∈ Fi2 , we can find predictable functions sequences 2 t 1 {fn } ⊂ Fi Fi , with 0 Ui |fn − fm |2 ds ni (dz) → 0 (as n, m → ∞), by a familiar procedure, stochastic integral of such an f w. r. t. Ni is obtained, denoted also by t+ t+ f (s, z, ω)N (ds, dz) = f (s, z, ω)[N (dsdz) − ds ni (dz)], 0
0
Ui
Ui
which is an Ft -square integrable martingale. Notice that, in this case, because of the possibility of both integrals being divergent, it is nonsense to t+ t+ take 0 Ui f (t, z, ω)Ni (ds, dz) as the difference of 0 Ui f (t, z, ω)Ni (ds, dz) and t+ 0 Ui f (s, z, ω)ds ni (dz). ' &
For U ⊂ Ui , with ni (U ) < ∞. Ni = Ni (t, U ) = Ni (0, t], U , t 0 is a pure & ' jump Ft -adapted increasing processes. Let [Ni ] = [Ni ]t , t 0 be the quadratic variation processes of Ni , then (Ni (t))2 = Ni (t). [Ni ]t = 0<st
Actually, if f ∈ Fi2 , denoted by Mt the stochastic integral of f ω.r.t. N (dt, ds) then [M ]t = (Mt )2 = f 2 (s, p(s), ω), 0<st
0<st
t
with predictable dual projection M t = 0 Ui f 2 (s, z)ds ni (dz). Next we present the stochastic integral with respect to a sequence q of independent Poisson Point processes {qi }∞ i=1 . Let &
H2q = Y = y1 , y2 , · · · : yi (s, zi , ω) ∈ Fi2 . i = 1, 2, · · ·
240
Guanglin Rang, Qing Li and Sheng You
with
∞ t i=1
0
E|yi (s, zi , ω)|2 ds ni (dzi ) < ∞. for any 0 t T
'
Ui
Now, We put Y n = (y1 , y2 , · · · , yn , 0, 0, · · · ) for Y = (y1 , y2 , · · · , yn , · · · ) ∈ H2q . Then, we can define a sequence of square integrable martingale. I(t, ω, Y n ) =
n t i=1
0
yi (s, zi , ω)Ni (ds dzi ).
Ui
By independence hypothesis, for any Ai ⊂ Ui with ni (Ai ) < ∞, We know Ni (·, Ai ), Nj (·, Aj ) t = 0, and
E
i = 1, 2, · · · , n.
sup |I(t, ω, Yn ) − I(t, ω, Ym )|2 0≤t≤T
% $ 4E |I(T, ω, Y n ) − I(T, ω, Y m )|2 T m = 4 E yi2 (s, zi , ω)ds ni (dzi ) 0
i=n+1
Ui
→ 0, as m, n → +∞. Therefore, we see easily that I(t, ω, Y n ) converges uniformly in t in L2 , thus, we can define the stochastic integral for Y with respect to the countably many orthogonal martingale measure related to Poisson point processes {qi }∞ i=1 , . And write denoted by I(t, ω, Y ), as the limit of {I(t, ω, Y n )}∞ n=1 I(t, ω, Y ) =
∞ t i=1
0
Ui
yi2 (s, zi , ω)Ni (dsdzi )
0 t T.
(2.1)
Obviously, I = {I(t, ω, Y ), t ≥ 0} is an Ft square integrable martingale with Y ∈ H2q fixed. Theorem 2.1. Let F (t, x1 , x2 , · · · , xn ) be a continuous function defined in ∂F ∂2F [0, T ] × Rn such that partial derivatives Ft = ∂F ∂t , Fxi = ∂xi , Fxi xj = ∂xi ∂xj are all continuous. Then, the differential of X(t, ω) = F (t, I1 , · · · In ) = F (t, I) is given by
F t, I(t, ω, y) − F 0, I(0) t ∞ t+ ∂F
∂F
I(s, ω, y)ds + I(s−, ω, y) dIi (s) = ∂xi 0 ∂s i=1 0 +
∞ ' &
∂F
F I(s) − F I(s−) − I(s) Ii (s) , ∂xi i=1
0<st
where, Ii = I(·, ω, Yi ) with Yi ∈ H2q , i = 1, 2, . . . , n, are given by equation (2.1).
Stochastic Integrals with Respect to Poisson Point Process
241
Especially, I 2 (t, ω, y) ∞ = I 2 (0) + = I 2 (0) +
i=1
0
i=1
0
∞ t+
∞ T
+
i=1
t+
0
Ui
2(I(s− , ω, y)dIi (s) +
∞ (Ii (s))2 i=1 1t
2(I(s− , ω, y)dIi (s)
Yi2 (s, zi , ω)ds ni (dzi ).
Note that the continuous part of quadratic covariation [Ni , Nj ]t of Ni and Nj is zero when i = j, i.e. , [Ni , Nj ]ct = 0. Proof. Combining the Itˆ o formula for semimartingale (see [11]) and limit procedure yields the result above. 3. Martingale representation Since an important role is played in stochastic analysis and its applications, such as in BSDE, in this section, martingale presentation properties will be presented, i.e., square integrable martingale with the predictable representation properties of this point processes, that is, Theorem 3.1. Let M = {Mt , t 0} be an Ftq adapted square integrable martingale, where {Ftq , t 0} is the increasing family of σ-algebras generated by Poisson point processes q = (q1 , q2 , · · · ) up to time t. Then there exists a unique sequence of predictable process Y = {Y (t, z, ω) = (y1 (t, z1 , ω), y2 (t, z2 , ω), · · · ), yi ∈ Fi2 , i = 1, · · · } satisfying
∞ t i=1 0
Ui
yi2 (s, xi , ω)ds ni (dxi ) < ∞}, such that
Mt = M0 +
∞ i=1
0
t+
Yi (s, zi , ω)Ni (dsdzi ). Ui
To prove this theorem we adopt the method in [7] (also see [9]), there the continuous case is considered only (see [11] for semi-martingale case). First we formulate an Itˆo formula for exponential functions of Poisson point processes. n Let f (ξ1 , ξ2 , · · · , ξn ) = exp{− i=1 θi ξi }, θi , ξi > 0, Ni (t) = Ni (t, Ai ) = Ni (0, t] × Ai , Ai ⊂ Ui with ni (Ai ) < ∞, i = 1, 2, · · · , n. Then, by Itˆo formula
242
Guanglin Rang, Qing Li and Sheng You
for semimartingale, we have f (N1 (t), N2 (t), · · · , Nn (t)) $ f N1 (s), N2 (s), · · · , Nn (s) = st
% −f N1 (s−), N2 (s−) · · · , Nn (s−) =
e
−
n
θi Ni (s−)
i=1
n $ − % θi ∆Ni (s) · e i=1 −1
st
Since Ni (s) takes values 1 or 0, we can show, by induction, that
f N1 (t), N2 (t), · · · , Nn (t) n (e−θi − 1) Γ(s− )Ni (s) = i=1
st
(e−θi − 1)(e−θj − 1) Γ(s−)Ni (s)Nj (s) + i<j
st
+···+
n (e−θi − 1) Γ(s−) Ni (s)
n i=1
where, Γ(s−) = e
n
−
θi Ni (s−)
i=1
i=1
st
. If put Ni1 , Ni2 , · · · , Nil t = Ni1 (s) · · · Nil (s), st
then
f N1 (t), N2 (t), · · · , Nn (t) n t+ (e−θi − 1) 0 Γ(s−)Ni (s)(ds, Ui ) = i=1 t+ + (e−θi − 1)(e−θj − 1) 0 Γ(s−)d[Ni , Nj ]s + · · · +
i<j n >
(e−θi − 1)
t+
Γ(s−)d[N1 , N2 , · · · , Nn ]t .
0
i=1
Proof. [Proof of Theorem 1.2:] Without loss of generality suppose M0 = 0. First we prove by induction on m, that there exist predictable processes y1 , y2 , . . . ,ym with yi ∈ Fi2 , such that Ztm
Mt −
m i=1
0
t+
yi (s, zi , ω)Ni (ds, dzi ) Ui
is orthogonal to every martingale of the from m i=1
0
t+
ri (s, xi , ω)Ni (ds, dxi ) Ui
Stochastic Integrals with Respect to Poisson Point Process
243
with ri ∈ Fi2 , i = 1, 2, . . . , m. If m = 1, this is a direct consequence of Lemma 2.3 in [14]. Suppose such processes exist for m − 1, that is Ztm−1 is orthogonal to
, Mt − m−1 P R
m−1 X Z t+Z 0
i=1
t+R 0 Ui
i=1 {Ztm−1 , t
Ui
ei (ds, dzi ) yi (s, zi , ω)N
06t6T
ei (ds, dzi ), for ri ∈ F 2 , i = 1, 2, . . . , m−1. ri (s, zi , ω)N i
Here Z m−1 = > 0} is still a square integrable Martingale , by Lemma 2.3 2 in [14] again, we get ym ∈ Fm , such that Mt − =
Z
0
t+Z
m−1 X Z t+Z i=1
Um
0
Ui
ei (ds, dzi ) yi (s, zi , ω)N
em (ds, dzm ) + Zt , ym (s, zm , ω)N
where Zt is orthogonal to all martingale Z t+Z em (ds, dzm ) rm (s, zm , ω)N 0
Um
R t+R
2 em (ds, dzm ) is orthogonal to the martinfor any rm ∈ Fm , and 0 Um ym (s, zm , ω)N m−1 R R P t+ e gale 0 Ui yi (s, zi , ω)Ni (ds, dzi ). i=1
Furthermore, by orthogonality, we have
X
Z ·+Z
m
e
Yi (s, xi , ω)Ni (ds, dxi )
=
i=1 0 m X Z TZ i=1
so M
m
=
m R R P t+
i=1
0
0
Ui
Ui
M2
EYi2 (s, xi , ω)ds ni (dxi ) 6 E[MT2 ],
ei (ds, dzi ); y (s, zi , ω)N Ui i
t>0
converges to a limit, then let
m → ∞, it justify the following equality: ∞ Z t+Z X ei (ds, dzi ) + Zt∞ Mt = yi (s, zi , ω)N i=1
0
Ui
for any t ∈ [0, T ] with Z ∞ ∈ M2 . In addition to this, we have also hM· ,
Z
0
·+Z
yi (s, zi , ω)Ni (ds, dzi )it = Ui
Z tZ
yi (s, zi , ω)ds ni (dzi )
(3.1)
0 Ui
Next we show Z ∞ ≡ 0 P a.s.. Since Z ∞ is right continuous left limit (RCLL in short), it is sufficient to prove Zt = 0 P a.s. for any t ∈ [0, T ], therefore we show
244
Guanglin Rang, Qing Li and Sheng You
for any bounded measurable functions fk : R∞ → C, 0 ≤ k ≤ n, such that n Y E Zt · fk (qsk ) = 0,
(3.2)
k=0
where 0 = s0 ≤ s1 ≤ · · · ≤ sn ≤ t is a partition of interval [0, t], and qsk = (q1 (sk ), · · · qn (sk ), · · · ). In fact we only take into account the case where fi (i = 1, 2, · · · , k) depends on arbitrary finite coordinates. Without loss of generality, assume fi = fi (x1 , x2 , · · · , xik ) (ik = 1, 2, · · · ). We prove equation (3.3) true by induction on n. When n = 0 this is an obvious result. Suppose equation (3.3) holds for n−1, and let fn = f (x1 , x2 , · · · , xin ) = in P exp{− θj xj } with θj > 0, xj > 0, j = 1, 2, · · · , in . j=1
Let Ai ⊂ Ui with ni (Ai ) < ∞, then by Itˆ o formula (3.1) we have e
−
ik P
θj Nj [0,sn ],Ai
j=1
= exp{−
ik X j=1
+
ik X j=1
θj Nj [0, sn−1 ], Ai }
(e−θj − 1)
Z
sn +
Γ(s−)Nj (ds, dxj ) sn−1
Z X −θi −θj + (e − 1)(e − 1) i<j
sn +
Γ(s−)d[Ni , Nj ]s , sn−1
where Γ(s−) was givend in equation (3.1). Put "
ϕ(sn ) = E Zt∞
n−1 Y
fi (qsi ) exp{−
i=1
"
= E Zs∞n
n−1 Y
ik X j=1
fi (qsi ) exp{−
i=1
ik X j=1
θj Nj [0, sn ], Aj }
θj Nj [0, sn ], Aj }
#
#
" # ik n−1 Y X ∞ = E E Z sn fi (qsi ) exp{− θj Nj [0, sn ], Aj } Fsn−1 i=1
=E
=E
" n−1 Y
"
i=1
j=1
# ik X ∞ fi (qsi ) E Zsn exp{− θj Nj [0, sn ], Aj } Fsn−1
Zs∞n−1
j=1
n−1 Y i=1
fi (qsi ) exp{−
ik X j=1
|
θj Nj [0, sn−1 ], Aj }
#
Stochastic Integrals with Respect to Poisson Point Process
+E
E n−1
fi (qsi ) E Zs∞n
i=1
+E
E n−1
(e−θl − 1)
sn
F
|
Γ(s−)Nj (ds, Aj ) Fsn−1
sn−1
j=1
fi (qsi ) E Zs∞n
i=1
ik
245
(e−θl − 1)(e−θm − 1)
sn
|
Γ(s−)d[Nl , Nm ]s Fsn−1
sn−1
l<m
F
= I1 + I2 + I3 . By hypothesis, the first term I1 above is zero. Since ik (e−θj − 1) E ZS∞n =
(e−θj
− 1)E Zs∞n
Sn
(e−θj
− 1)E ZS∞n
=
sn
|
Γ(s−)nj (Aj )ds Fsn−1 ,
|
sn−1
j=1
Γ(s−)(Nj (ds, Aj ) + nj (Aj )ds) Fsn−1
sn−1
j=1 ik
Γ(s−)Nj (ds, Aj ) Fsn−1
sn−1
j=1 ik
|
sn
we have the identity I2 = E
Zs∞n−1
n−1
fi (qsi )
i=1
=
ik
(e
−θj
ik
− 1)nj (Aj )
=
sn
sn−1
(e−θj − 1)nj (Aj )
− 1)
sn
|
Γ(s−)nj (Aj )ds Fsn−1
sn−1
j=1
j=1 ik
(e
−θj
n−1 ∞ E Zsn−1 fi (qsi )Γ(s) ds i=1
sn
ϕ(s)ds. sn−1
j=1
Therefore the following integral equation is obtained: ϕ(sn ) =
ik j=1
(e
−θj
− 1)nj (Aj )
Sn
ϕ(s)ds, Sn−1
from which we know ϕ ≡ 0. Then a familiar arguments about Laplacian transformation leads to equation (3.3) holding for all bounded functions fn . Thus we complete the proof. Remark 3.1. We can use such a martingale presentation property to get the existence and uniqueness theory of BSDE driven by countably many Brownian motion with jump, e.g. [12].
246
Guanglin Rang, Qing Li and Sheng You
4. Non-Markovian SDE driven by countably many Poisson point processes In this section we shall focus on Non-Markovian SDE driven by countably many Poisson point processes of the form t ∞ t ˜i (dsdzi ). b(s, x)ds + σi (s, x−, zi )N (4.1) xt = x0 + 0
0
i=1
Ui
Here we only consider this equation in R1 for simplicity and follow the notations in the previous sections. We also use the notation D = D([0, T ]) denoting the space of all RCLL functions on [0, T ] with sup norm |x|T = sup0≤t≤T |xt |, and L2 = L2 (Ω, D) denoting the space of square integral functionals from Ω to D with norm X = (E|X|2T )1/2 . For x ∈ D, x− is the left limit of x. Now we will give some assumptions on coefficients b and σi , i = 1, 2, · · · (A1): b(t, x) : [0, T ] × D −→ R1 and σi (t, x, zi ) : [0, T ] × D × Ui −→ R1 , i = 1, 2, · · · , are deterministic measurable functions. (A2): b(t, x) and σi (t, x, zi ), i = 1, 2, · · · , satisfy integral non-Lipschitz conditions: for any x, y ∈ D, t ∈ [0, T ], t |b(t, x) − b(t, y)|2 ≤ L1 |xt − yt |2 + L2 0 ρ(|xs − ys |2 )dAs , ∞ i=1
|σi (t, x−, zi ) − σ(t, y−, zi )|2 ni (dzi )
Ui
≤ L1 |xt − yt |2 + L2
t
ρ(|xs − ys |2 )dAs ,
(4.2)
0
for some RCLL increasing functions A(s) on [0, T ] and some positive constants L1 , L2 . ρ is an increasing concave functions satisfying 0+ ρ−1 (r)dr = +∞ with ρ(0) = 0. Because of concavity of ρ, integral non-Lipschitz condition (4.2) implies the following linear growth condition (maybe some modifications for constants occur), note that σi , i = 1, 2, . . . , are compelled to be subject to this condition. |b(t, x)| + 2
∞ i=1
≤ L1 (1 +
x2t )
Ui
σi2 (s, x−, zi )ni (dzi )
+ L2 0
t
(1 + x2s )dAs .
(4.3)
Non-Markovian type SDEs can be used in the theory of transmission of messages in noise channel, such as coding and decoding (see [8] and references thererin), also used in the theory of stochastic optimal control. The corresponding equations are all, usually, with feedback, i.e., the input at time t may include all the past information of the output up to t. The SDEs or BSDEs of this type have been
Stochastic Integrals with Respect to Poisson Point Process
247
discussed in [3] in a general setup—Hilbert-valued processes driven by cylindrical Brownian motion with jump (also see [9] in real-valued Brownian motion case). In spite of generality in [3], one may be dazzled by the stack of conditions. Here we give a result in a relatively concise form. First, we formulate the definition of the existence and uniqueness for equation (4.1). Definition 4.1. There exists a RCLL Ft −adapted process x(t, ω) satisfying equation (4.1) and if two RCLL Ft −adapted processes x1 , x2 satisfy equation (4.1) with E sup0≤t≤T (|x1 (t) − x2 (t)|)2 =0, we say that equation (4.1) has a unique solution. Theorem 4.1. Suppose x0 ∈ L2 (Ω), conditions (A1)and (A2) hold. Then equation (4.1) admits a unique solution. Proof. We first show uniqueness. To this end, let x, y be two solutions of eq.(4.1), then we have t 2 [b(s, x) − b(s, y)]2 ds |xt − yt | ≤ 2T 0
+2
E∞
t+
0
i=1
F2 ˜ (σi (s, x−, zi ) − σi (s, y−, zi ))Ni (dsdzi ) ,
Ui
thus, by Doob’s inequalities E sup |xs − ys |2 0≤s≤t
≤ 2T E + 2T E
t
[b(s, x) − b(s, y)]2 ds
0 ∞ t+ i=1
0
[σi (s, x1 −, zi ) − σi (s, x2 −, zi )]2 dsni (dzi ).
(4.4)
Ui
By integral Lipschitz condition (2.1) plus the concave properties of ρ, we get the following inequality: E sup |xs − ys |2
(4.5)
0≤s≤t
t
≤ L1
E sup |xu − yu |2 ds + L2 0≤u≤s
0
+L1 0
t
t
s
ρ(E sup |xv − yv |2 )dAu ds 0≤v≤u
0 0
E sup |xu − yu |2 ds + L2 0≤u≤s
t 0
s
0
E sup |xv − yv |2 dAu ds. 0≤v≤u
Put H(t) = E sup0≤s≤t |xs − ys | , we have the integral equation t t s H(s)ds + L2 ρ(H(u))dAu ds H(t) ≤ L1 2
0
t
0
H(u)dAu ds.
+L2 0
0
0
s
(4.6)
248
Guanglin Rang, Qing Li and Sheng You
Therefore, by Lemma 1.1 below we know that H(t) = 0 for all 0 ≤ t ≤ T , i.e., P (x = y) = 1. The proof for uniqueness is complete. As far as existence is concerned, we will proceed by a Picard iteration procedure as follows: for 0 ≤ t ≤ T x0 = x0
xn+1 (t) = x0 +
t
b(s, xn )ds + 0
∞ t 0
i=1
˜i (dsdzi ). σi (s, xn −, zi )N
(4.7)
Ui
Since ρ can be dominated by ax + b in R for some a, b, and by linear growth, we know definition (4.7) reasonable, and we can estimate E sup0≤s≤t |xns |2 ≤ C(x2 , T, L) independent of n and 0 ≤ t ≤ T (see [9]). Also, for n, m ≥ 1, 0 ≤ t ≤ T , we have as equation (4.4) t n m 2 2 E sup |xs − xs | ≤ 2L1 E sup |xnu − xm u | ds 0≤s≤t
0≤u≤s
0
t
s
+ L2 0
t + L2 0 n,m
If we set G lim sup Gn,m (t) = n,m→∞
have
0
2 ρ(E sup |xnv − xm v | )dAu ds 0≤v≤u
0 s
2 E sup |xnv − xm v | dAu ds. 0≤v≤u
2 (t) = E sup0≤s≤t |xns − xm s | , n m 2 E sup0≤s≤t |xs − xs | exists
then by remarks above we know and denote it by G(t), hence we
t G(t) ≤ 2L1 G(s)ds 0 t s t s ρ(G(u))dAu ds + L2 G(u)dAu ds + L2 0 0 0 0 t t s G(s)ds + L2 ρ1 (G(u))dAu ds, ≤ 2L1 0
0
(4.8)
0
where ρ1 (z) = ρ(z) + z. Again by Lemma 1.1 we have G ≡ 0, which implies that {xn }∞ n=1 is a Cauchy sequence in L2 (Ω, D) with respect to the norm (E sup0≤s≤t | · |2 )1/2 . We denote by x the limit, then by a limit procedure we have the desired result. Lemma 4.1. Let f (t) be Borel measurable bounded left limit and nonnegative function on [0, T ], H(z) be a continuous increasing function with the property that u−1 H(z) ≤ H(u−1 z) for z ≥ 0, u > 0. If t t s f (t) ≤ m(s)f (s)ds + n(s) l(τ )H(f (τ ))dAτ ds, (4.9) 0
0
0
and m, n are two continuous functions and can be comparable each other. At is a nondecreasing function on [0, T ]. Then f (t) ≡ 0 for t ∈ [0, T ].
Stochastic Integrals with Respect to Poisson Point Process
Proof.
First let c0 > 0, and consider the following inequality: t t s m(s)f (s)ds + n(s) l(τ )H(f (τ ))dAτ ds, f (t) ≤ c0 + 0
or
0
g(t) ≤ 1 +
(4.10)
0
t
249
m(s)g(s)ds + 0
t
s
n(s)
l(τ )H(g(τ ))dAτ ds,
0
(4.11)
0
with g = f /c0 . Denote by v(t) the right hand of equation (4.11) and differentiate, we find g(t) ≤ v(t) and t v (t) ≤ m(t)g(t) + n(t) l(τ )H(g(τ ))dAτ 0 (4.12) t ≤ m(t)v(t) + n(t) l(τ )H(v(τ ))dAτ . Put W (t) = v(t) +
t 0
0
l(τ )H(v(τ ))dAτ , then v(t) ≤ W (t) and t l(τ )H(W (τ ))dAτ , W (t) ≤ v(t) + 0
i.e., W (t) ≤ 1+ v(t)
t
l(τ )H(W (τ )/v(τ ))dAτ , 0
An extension of Bihari’s inequality to Lebesgue-Stieltjes integral (see [10]) justifies the following inequality, t −1 l(τ )dAτ ), W (t) ≤ v(t)Ψ (Ψ(1) + t
0 1 −1 H(s) ds, Ψ
is the inverse of Ψ. If assume m(t) ≤ n(t), then we where Ψ(t) = t0 have, by inequality (4.12), v (t) ≤ g(t)W (t), and t v (t) ≤ m(t)v(t)Ψ−1 (Ψ(1) + l(τ )dAτ ). 0
Thus, since v(0) = 1, we have
v(t) ≤ exp
t
m(s)Ψ−1 (Ψ(1) +
0
that is ,
s
l(s)dAs ), 0
f (t) ≤ c0 exp
t
m(s)Ψ−1 (Ψ(1) +
0
Finally, let c0 → 0, the desired result is followed.
s
l(s)dAs ). 0
Remark 4.1. This lemma is an extension of theorem 1 in [15]. Here LebesgueStieljes integral is involved, the corresponding difficulty is that the differentiation can not be implemented.
250
Guanglin Rang, Qing Li and Sheng You
Remark 4.2. If f in the first term of the right hand of inequality (4.9) is replaced by γ(f ) for some nonlinear functions γ, the conclusion may not hold (see [4]). References 1. T. Bj¨ ork and B. N¨ aslund, Diversified portfolios in continuous time. European Financial Review, 1(1998), 361-387. 2. G. Cao and K. He, On a type of stochastic differential equations driven by countably many Brownian motions. Journal of Functional Analysis, 203(2003),262-285. 3. G. Cao, K. He and X. Zhang, Successive approximations of infinite dimensional SDEs with jump. Stoch. Dyn. 5 (4)(2005), 609-619. 4. M. Dannan, Integral Inequalities of Gronwall-Bellman-Bihari Type and Asymptotic Behavior of Certain Second Order Nonlinear Differential Equations. J. Math. Anal. Appl. 108(1985), 151-164. 5. M. Donno and M. Pratelli, Stochastic Integration with Respect to a sequence of semimartingales. LNM, Springer-Verlag, 2006. 6. M. Hitsuda and H. Watanabe, On Stochastic Integrals with Respect to an Infinite Nunber of Brownian Motions and Its Applications. Proc.of Ineqn.Symp.SDE, 5774.Kyoto:John Wiley&Sons,Inc, 1976. 7. I. Karatzas and E. Shreve, Brownian Motion and Stochastic Calculus. Springer, 1988. 8. P. Katyshev, Uniform Optimal Transmission of Gaussian Messages, From Stochastic Calculus to Mathematical Finance –The Shiryaev Festschrift, Yuri Kabanov, Robert Liptser and Jordan Stoyanov (eds), Springer 369-384,2006. 9. R. Lipster and A. Shiryaev, Statistics of Stochastic Processes. Moscow, ”Naukau” (1974). 10. X. Mao, Lebesgue-Stieljes integral inequlities in several variables with retardation. Proc. Indian. Acad. Sci. (Math. Sci.), 100(3)(1990), 231-243. 11. E. Protter, Stochastic Integration and Differential Equation. Second Edition, Springing, 2004. 12. G. Rang and H. Jiao, The existence and uniquness of the solution of backward stochastic differential equation driven by countably many Brownian motions with jump. J. Hubei University, 2009. 13. K. Sennewald and K. W¨ alde, ”Ito’s Lemma” and the Bellman Equation for Poisson Processes: An Applied View. Journal of Economics, 89(1)(2006),1-36. 14. S. Tang and X. Li, Necessary conditions for optimal control of stochastic systems with random jumps. SIMA J. Control Optim, 32(1994), 1447-1475. 15. C. Young, ON BELLMAN-BIHARI INTEGRAL INEQUALITIES. Intern. J. Math. & Math. Sci., 5(1)(1982),97-103.
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 251–268
Chapter 16 L´ evy White Noise, Elliptic SPDEs and Euclidean Random Fields
Jiang-Lun Wu Department of Mathematics, Swansea University, Singleton Park, Swansea SA2 8PP, United Kingdom In this article, we start with a briefly survey of the recent development on Euclidean random fields along with constructive, relativistic quantum field theory. We then present a unified account of L´evy white noise and related elliptic SPDEs driven by L´evy white noise. We explicate the link, via analytic continuation, from the Euclidean random fields obtained as solutions of the elliptic SPDEs to local, relativistic quantum fields with indefinite metric. By comparing the derived vector and scalar relativistic quantum field models, we reformulate the elliptic SPDE for the scalar model with a new Gaussian white noise term so that the associated Euclidean random field possesses a feature of avoiding the re-definition of the two point Schwinger functions needed for nontrivial scattering in the relativistic model, which then leads to a scalar model of local, relativistic quantum field theory in indefinite metric with nontrivial scattering behavior. Finally, we demonstrate a lattice approximation for the induced Euclidean random field.
Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 L´ evy white noise . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 L´ evy white noise and random fields . . . . . . . . . . . . . . . . . 4 Comparison of vector and scalar models . . . . . . . . . . . . . . 5 New formulation of elliptic SPDEs and the lattice approximation References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
251 253 256 258 263 265
1. Introduction Since the pioneer works of Symanzik51 and Nelson,46,47 the construction of local, relativistic quantum fields via analytic continuation from Euclidean random fields has been the most vital and productive paradigm in constructive quantum field theory (QFT), see e.g. the by now classical expositions in Ref. 31. This Euclidean strategy has been completed successfully in d = 2 space-time dimensions49 (cf. also Ref. 1) and partial results have been obtained for d = 3 (see Refs. 21 and 30). In the physical space-time dimension d = 4, however, the standard approach to the definition of local potentials via renormalization up to now is plagued by seemingly 251
252
Jiang-Lun Wu
incurable ultra-violet divergences, and no construction of a non-trivial (interacting) quantum field is known within that approach. In the series of papers by Albeverio and Høegh–Krohn12–14 (see also Ref. 11), a different approach for d = 4 was started of construction Euclidean covariant vector Markov random fields as solutions of stochastic quaternionic Cauchy-Riemann equations with multiplicative white noise. This has been associated, in Ref. 15, with some mass zero local relativistic quantum filed models of gauge type by obtaining their Wightman functions via performing the analytic continuation of the corresponding Schwinger functions (moments) of these Euclidean covariant random fields. However, the peculiarities of mass zero and gauge invariance in the relativistic vector models rise difficulties of physical interpretation, especially concerning to the construction of the “physical Hilbert space”, in QFT. But remarkably in Ref. 5, it has been proved that these Wightman functions satisfy the so-called “Hilbert space structure condition” which permits the construction of (non unique) physical Hilbert spaces associated to the set of Wightman functions and hence leads, by Refs. 44 and 50, to local relativistic vector field models for indefinite metric QFT. Furthermore, in Ref. 7, explicit formulae for the (gauge invariant) scattering amplitudes for these local relativistic vector fields with indefinite metric have been carried out, which shows that such models have nontrivial scattering behaviour. Moreover, Euclidean covariant (elliptic) SPDEs with multiplicative white noise have been systematically studied in Refs. 22, 8 and 9, leading to local relativistic vector models which include massive quantum vector fields. Further in Ref. 8, necessary and sufficient conditions on the mass spectrum of the given covariantly differential operators (with constant coefficients) have been presented. Such conditions imply nontrivial scattering behaviour of the relativistic models. In Ref. 9, in order to avoid renormalising the two point Schwinger functions and to keep the induced relativistic fields with nontrivial scattering, a revised formulation of the covariant SPDEs with a newly added Gaussian white noise term is proposed. Furthermore, asymptotic states and the S-matrix are constructed. The scattering amplitudes can be explicitly calculated out and the masses of particles are then determined by the mass spectrum. Thus one can have a nice particle interpretation picture of the obtained vector models (Ref. 35). In Ref. 23, however, a no-go theorem for Euclidean random vector fields has been established, showing why Euclidean field theory has been less successful in the vector case than in the scalar case. For example, as a consequence of the no-go theorem, it follows that there is no simple vector analogue of scalar P (φ)2 -theory. On the other hand, by considering stochastic pseudo-differential equations with generalized white noise, scalar models have been started in Ref. 17 (massless) and in Ref. 4 (massive). In the latter case, a better approach of anayltic continuation has been developed via truncation techniques. Again, in Ref. 5, it has been proved that the obtained Wightman functions satisfy the “Hilbert space structure condition” which then permits the construction of (non unique) physical Hilbert spaces and
L´ evy White Noise, Elliptic SPDEs and Euclidean Random Fields
253
hence leads to local relativistic scalar field models for indefinite metric QFT, as in the vector case. However, there is no scattering theory (neither Lehmann-SymanzikZimmerman theory nor Haag-Ruelle theory) to the scalar case. In Ref. 6, the spectral condition on the translation group for the scalar models has been proved, which is an important step towards Haag-Ruelle scattering theory in the usual standard (namely, positive definite metric) QFT, cf. e.g. Ref. 36. In Ref. 2, axiomatically scattering theory for local relativisitic QFT with indefinite metric has been established. Let us also mention Refs. 10, 20, 24–28, 32–34, 37–40, 43, 45, 48 for further investigations of the scalar models. The rest of the paper is organised as follows. In the next section, we shall briefly introduce L´evy white noise with several concrete examples of L´evy type noise in SPDEs. In Section 3, we establish a link of L´evy white noise with generalized random fields (i.e., multiplicative white noise). In Section 4, starting with elliptic SPDEs driven by L´evy white noise, we explicate the link, via analytic continuation, from the Euclidean random fields obtained as solutions of the elliptic SPDEs to local, relativistic quantum fields with indefinite metric. By comparing the derived vector and scalar relativistic quantum field models, we then reformulate the elliptic SPDE for the scalar model with a new Gaussian white noise term so that the associated Euclidean random field possesses a feature of avoiding the re-definition of the two point Schwinger function needed for nontrivial scattering in the relativistic model, which then leads to a scalar model of local, relativistic quantum field theory in indefinite metric with nontrivial scattering behavior. Section 5, the final section, is devoted to demonstrate a lattice approximation for the induced Euclidean random field. 2. L´ evy white noise This section is devoted to a brief account to L´evy white noise. We start with Poisson white noise. Let (Ω, F , P ) be a given complete probability space with a filtration {Ft }t∈[0,∞) and let (U, B(U ), ν) be an arbitrary σ-finite measure space. Definition 2.1. Let (E, E, µ) be a σ-finite measure space. By a Poisson white noise on (E, E, µ) we mean an integer-valued random measure N : (E, E, µ) × (U, B(U ), ν) × (Ω, F , P ) → N ∪ {0} ∪ {∞} =: N with the following properties: (i) for A ∈ E and B ∈ B(U ), N (A, B, ·) : (Ω, F , P ) → N is a Poisson distributed random variable with P {ω ∈ Ω : N (A, B, ω) = n} =
e−µ(A)ν(B) [µ(A)ν(B)]n n!
for each n ∈ N. (Here we take the convention that when µ(A) = ∞ or ν(B) = ∞, N (A, B, ·) = ∞, P -a.s.);
254
Jiang-Lun Wu
(ii) for any fixed B ∈ B(U ) and any n ≥ 2, if A1 , . . . , An ∈ E are pairwise disjoint, then N (A1 , B, ·), . . . , N (An , B, ·) are mutually independent random variables such that n N (∪nj=1 Aj , B, ·) = N (Aj , B, ·) , P − a.s. j=1
Clearly, the mean measure of N is E[N (A, B, ·)] = µ(A)ν(B) ,
A ∈ E , B ∈ B(U ) .
N is nothing but a Poisson random measure on the product measure space (E × U, E × B(U ), µ ⊗ ν) and can be constructed canonically as
N (A, B, ω) :=
n (ω) η
(n)
1(A∩En )×(B∩Un ) (ξj (ω))1{ω∈Ω:ηn (ω)≥1} (ω)
(2.1)
n∈N j=1
for A ∈ E, B ∈ B(U ) and ω ∈ Ω, where (a) {En }n∈N ⊂ E is a partition of E with 0 < µ(En ) < ∞, n ∈ N, and {Un }n∈N ⊂ B(U ) is a partition of U with 0 < ν(Un ) < ∞, n ∈ N; (n) (b) ∀n, j ∈ N, ξj : Ω → En × Un is F /En × B(Un )-measurable with (n)
P {ω ∈ Ω : ξj (ω) ∈ A × B} =
µ(A)ν(B) , µ(En )ν(Un )
A ∈ En , B ∈ B(Un ),
where En := E ∩ En and B(Un ) := B(U ) ∩ Un ; (c) ∀n ∈ N, ηn : Ω → N is Poisson distributed with P {ω ∈ Ω : ηn (ω) = k} = (n)
(d) ξj
e−µ(En )ν(Un ) [µ(En )ν(Un )]k ˜ , k ∈ N; k!
and ηn are mutually independent for all n, j ∈ N.
Next, let us briefly recall the notion of Gaussian white noise on (E, E, µ). It is a random measure W : (E, E, µ) × (Ω, F, P ) → [0, ∞) such that {W (A, ·)}A∈EF is a Gaussian family of random variables with E[W (A, ·)] = µ(A) and E[W (A1 , ·)W (A2 , ·)] = µ(A1 ∩ A2 ), where EF := {A ∈ E : µ(A) < ∞}. Definition 2.2. By a L´evy white noise on (E, E, µ), we mean a random measure L : (E, E, µ) × (Ω, F, P ) → [0, ∞) having the following expression L(A, ω) := W (A, ω) + a(x, y)N (dx, dy, ω) A U + b(x, y)µ(dx)ν(dy) (2.2) A
U
for (A, ω) ∈ E × Ω, where a, b : E × U → R are measurable.
L´ evy White Noise, Elliptic SPDEs and Euclidean Random Fields
255
Let us end this section with some examples of L´evy white noise used in SPDEs. Example 1. Take (E, E) = ([0, ∞), B([0, ∞))) and µ to be Lebesgue measure on the Borel σ-algebra B([0, ∞)). Let the Poisson white noise N on ([0, ∞), B([0, ∞)), µ) be constructed canonically by (2.1). Such a random measure N is called an extended Poisson measure on [0, ∞) × U in Ref. 42. Alternatively, N can be constructed as the Poisson random measure on [0, ∞) × U associated with an {Ft }-Poisson point process as in Ref. 41. In the sense of Schwartz distributions, we then define N (dt, B, ω) (t) , dt We call Nt Poisson time white noise. Nt (B, ω) :=
(t, B, ω) ∈ [0, ∞) × B(U ) × Ω .
Example 2. Take (E, E, µ) = ([0, ∞) × Rd , B([0, ∞)) × B(Rd ), dt ⊗ dx) and let the Poisson white noise N be constructed canonically by (2.1). Define (again in the sense of Schwartz distributions) the Radon-Nikodym derivative Nt,x (B, ω) :=
N (dt, dx, B, ω) (t, x) , dtdx
(B, ω) ∈ B(U ) × Ω
for t ∈ [0, ∞) and x ∈ Rd . We call Nt,x Poisson space-time white noise. Accordingly, we can define the compensating martingale measure M (t, A, B, ω) := N ([0, t], A, B, ω) − t|A|ν(B) for any (t, A, B) ∈ [0, ∞) × B(Rd ) × B(U ) with |A|ν(B) < ∞ (where |A| stands for the Lebesgue measure of A). Then we have E[M (t, A, B, ·)] = 0,
E([M (t, A, B, ·)]2 ) = t|A|ν(B) .
Moreover, we can define the Radon-Nikodym derivative Mt,x (dy, ω) :=
M (dt, dx, dy, ω) (t, x) . dtdx
For a Brownian sheet {W (t, x, ω)}(t,x,ω)∈[0,∞)×Rd×Ω on [0, ∞) × Rd (cf. e.g. Ref. 52), we define Wt,x (ω) :=
∂ d+1 W (t, x, ω) . ∂t∂x1 . . . ∂xd
Let the given N and W be independent and let U0 ∈ B(U ) with ν(U \ U0 ) < ∞ be arbitrarily given. Set c1 (t, x; y)Mt,x (dy, ω) Lt,x (ω) = Wt,x (ω) + U0 + c2 (t, x; y)Nt,x (dy, ω) U\U0
for (t, x, ω) ∈ [0, ∞) × R × Ω, where c1 , c2 : [0, ∞) × [0, L] × U → R are measurable. Lt,x is called L´evy space-time white noise. d
256
Jiang-Lun Wu
Example 3. Take (E, E, µ) = ([0, ∞), B([0, ∞)), dt) and (U, B(U ), ν) = (Rd \ {0}, B(Rd \ {0}), |x|dx d+p ), p ∈ (0, 2). Let the Poisson random measure N and the compensating martingale measure M be defined in the same manner as in previous examples. Let us restrict ourselves to the nonnegative valued M . Namely, we consider those M whose Laplace transform is given by EeθM([0,t],A,·) = eθ
p
t|A|
for t ≥ 0, θ ≥ 0, A ∈ B(Rd \ {0}) with |A| < ∞. Notice that in this case, M ([0, t], A, ω) is nonnegative valued and its Fourier transform is given by EeiθM([0,t],A,·) = e(iθ)
p
t|A|
for t ≥ 0, θ ∈ R, A ∈ B(Rd \ {0}) with |A| < ∞. From this it is easy to see that when p = 2, M then determines a Brownian sheet. We call the (distributional) Radon-Nikodym derivative Mt,x (ω) =
M (dt, dx, ω) (t, x) dtdx
a p-stable space-time white noise. 3. L´ evy white noise and random fields Here we want to give a link between L´evy white noise and random fields. First of all, we have the following definition for random fields Definition 3.1. Let (Ω, F , P ) be a probability space and (V, T ) be a (real) topological vector space. By a random field X on (Ω, F , P ) with parameter space V , we mean a system {X(f, ω), ω ∈ Ω}f ∈V of random variables on (Ω, F , P ) having the following properties: (1) P {ω ∈ Ω : X(c1 f1 + c2 f2 , ω) = c1 X(f1 , ω) + c2 X(f2 , ω)} = 1 , for c1 , c2 ∈ R, f1 , f2 ∈ V ; (V,T )
(2) fn −→ f
⇒
in law
X(fn , ·) −→ X(f, ·) .
Now let S(Rd ) (d ∈ N) be the Schwartz space of rapidly decreasing (real) C ∞ functions on Rd and S (Rd ) the topological dual of S(Rd ). Denote by < ·, · > the natural dual pairing between S(Rd ) and S (Rd ). Let B be the σ-algebra generated by cylinder sets of S (Rd ). Then (S (Rd ), B) is a measurable space. Let ψ : R → C be a continuous, negative definite function having the following L´evy-Khinchine representation σ 2 t2 + ψ(t) = iat − 2
R\{0}
eist − 1 −
ist 1 + s2
dM (s) ,
t∈R
(3.1)
L´ evy White Noise, Elliptic SPDEs and Euclidean Random Fields
257
where a, σ ∈ R and the measure M (L´evy measure) satisfies min(1, s2 )dM (s) < ∞. R\{0}
Hereafter, we call ψ a L´evy-Khinchine function. From Gelfand and Vilenkin,29 the following functional 4 3 ψ(f (x))dx , f ∈ S(Rd ) C(f ) = exp Rd
is a characteristic functional on S(Rd ). Thus by Bochner-Minlos theorem (Ref. 29) there exists a unique probability measure Pψ on (S (Rd ), B) such that 3 4 ei dPψ (ω) = exp ψ(f (x))dx , f ∈ S(Rd ). S (Rd )
Rd
We call Pψ the L´evy white noise measure and (S (Rd ), B, Pψ ) the L´evy white noise space associated with ψ. The associated (coordinate) canonical process F : S(Rd ) × (S (Rd ), B, Pψ ) → R defined by F (f, ω) =< f, ω >,
f ∈ S(Rd ), ω ∈ S (Rd )
is a random field on (S (Rd ), B, Pψ ) with parameter space S(Rd ). Such kind of (Euclidean) random fields has been used substantially in recent years in constructive (indefinite metric) quantum field theory. Remark 3.1. In the terminology of Gelfand and Vilenkin, F is called a generalized random process with independent value at every point, namely, the random variables < f1 , · > and < f2 , · > are independent whenever f1 (x)f2 (x) = 0 for f1 , f2 ∈ S(Rd ). Moreover, F is also named multiplicative white noise or generalized white noise in the literature. In order to present the relation between L´evy white noise L and the random field F , we need to define F (1A , ω) for A ∈ B(Rd ). Notice that 1A ∈ L2 (Rd ), thus we need to extend F to the parameter space L2 (Rd ) from S(Rd ). This can be done by the following argument (cf. Ref. 17). Remarking that S(Rd ) is dense in L2 (Rd ), for any f ∈ L2 (Rd ), there exists a sequence {fn }n∈N ⊂ S(Rd ) converging to f in L2 (Rd ). Since F (fn , ω) is well-defined for each n ∈ N, one can define F (f, ω) := lim F (fn , ω) n→∞
where the limit is understood in law. Then the linear operator F (·, ω) : S(Rd ) → L(S (Rd ), B, Pψ ) can be extended uniquely to a continuous linear operator F (·, ω) : L2 (Rd ) → L(S (Rd ), B, Pψ ) ,
258
Jiang-Lun Wu
where L(S (Rd ), B, Pψ ) is the Fr´echet space of real random variables on (S (Rd ), B, Pψ ) with quasi-norm ||ξ||0 := EPψ (|ξ| ∧ 1). Now in Definition 2.2 by taking (E, E, µ) = (Rd , B(Rd ), dx) and (Ω, F , P ) = (S (Rd ), B, Pψ ), we have L´evy white noise L(A, ω), for (A, ω) ∈ B(Rd) × Ω. By virtue of the Fourier transform, F (1A , ω)
in law
=
L(A, ω),
(A, ω) ∈ B(Rd) × Ω .
4. Comparison of vector and scalar models In this section, let us briefly introduce both the vector and scalar models and then give a comparison between them. We start with the construction of covariant Euclidean random fields via the quaternionic Cauchy-Riemann operator. To this end, we begin with quaternions and the quaternionic Cauchy-Riemann operator. Let Q be the skew field of all quaternions and let {1, e1 , e2 , e3 } be its canonical basis with multiplication rules e1 e1 = e2 e2 = e3 e3 = −1 and e1 e2 = −e2 e1 = e3 . A quaternion x ∈ Q is represented by x = x0 1 + x1 e2 + x2 e2 + e3 x3 ,
(x0 , x1 , x2 , x3 ) ∈ R4 .
Thus, Q is isomorphic to R4 , the isomorphism being given by x ∈ Q → (x0 , x1 , x2 , x3 ) ∈ R4 , regarding Q as a real vector space. Furthermore, R can be imbedded into Q by identifying x0 ∈ R with x0 1 ∈ Q. Hence, Q forms a real associative algebra with identity 1 under the multiplication rules of the canonical basis. In fact, Q is a Clifford algebra. There is a distinct automorphism of Q called conjugation which is defined by x ¯ := x0 1 − x1 e2 − x2 e2 − e3 x3 ,
x ∈ Q.
One can then define a norm || · || and a scalar inner product (·, ·)Q on Q: 1
1
||x|| := (x¯ x) 2 = (x20 + x21 + x22 + x23 ) 2 ,
x ∈ Q.
and 1 1 (||x + y||2 − ||x − y||2 ) 2 = xy 4 = x0 y0 + x1 y1 + x2 y2 + x3 y3 , x, y ∈ Q.
(x, y)Q :=
The inverse x−1 of a quaternion x ∈ Q \ {0} with respect to the multiplication x ¯ . Setting Sp(1) := {x ∈ Q : ||x|| = 1}, we notice that is given by x−1 = ||x|| Sp(1) is a subgroup of the multiplicative group Q \ {0} and is isomorphic to SU (2). By x ∈ Q → uxv −1 ∈ Q for u, v ∈ Sp(1), we have a surjective homomorphism Sp(1) × Sp(1) → SO(4), whose kernel is given by {(1, 1), (−1, −1)} ∼ = Z2 , and hence 4 ∼ [Sp(1) × Sp(1)]/Z2 = SO(4). Now by identifying Q with R , we have the following
L´ evy White Noise, Elliptic SPDEs and Euclidean Random Fields
259
two distinct Sp(1) × Sp(1) actions on the collection X (R4 , Q) of all 4-vector fields on R4 : the first one is given by A(x) −→ uA(u−1 (x − y)v)v −1 , where x, y ∈ R4 ,
(u, v) ∈ Sp(1) × Sp(1),
A ∈ X (R4 , Q),
and A obeying this rule is called a covariant 4-vector field; the second one is given by A(x) −→ vA(u−1 (x − y)v)v −1 , where x, y ∈ R4 ,
(u, v) ∈ Sp(1) × Sp(1),
A ∈ X (R4 , Q),
and A obeying this rule is called a covariant scalar 3-vector field. Let C0∞ (R4 , Q) be the space of all Q-valued smooth functions with compact supports (where the smoothness of Q-valued functions is defined in terms of the four real components). We define a bilinear form, for f ∈ C0∞ (R4 , Q) and A ∈ X (R4 , Q), via (f (x), A(x))Q dx < f, A >:= R4
and then extend this relation to a distributional pairing in the natural manner. We remark that < ·, · > is invariant under the above two Sp(1) × Sp(1) actions. We can now define the quaternionic differential operators ∂, ∂¯ and ∆ as ∂ := 1∂0 − e1 ∂1 − e2 ∂2 − e3 ∂3 , ∂¯ := 1∂0 + e1 ∂1 + e2 ∂2 + e3 ∂3 , ¯ = ∂2 + ∂2∂2 + ∂2 , ∆ := ∂ ∂¯ = ∂∂ 0 1 2 3 where ∂k := ∂x∂ k , k = 0, 1, 2, 3. The operator ∂ is called the quaternionic CauchyRiemann operator. Moreover, we consider the two variable transformation x → x = u−1 xv for some (u, v) ∈ Sp(1) × Sp(1) and define the corresponding quaternionic ¯ Cauchy-Riemann operator ∂ and its conjugate ∂¯ in the same manner as ∂ and ∂, −1 −1 ¯ Let then ∂ = v ∂u and ∂¯ = u ∂v. σ := A0 dx0 + A1 dx1 + A2 dx2 + A3 dx3 be a 1-form with the orientation adapted to the canonical basis {1, e1 .e2 , e3 } and ∗ σ be the associated Hodge dual. Further by identifying anti-selfdual 2-forms with 3-vector fields, we get < ∗ d∗ σ, dσ − ∗ σ >=< F0 , F1 e1 + F2 e2 + F3 e3 >,
260
Jiang-Lun Wu
where F0 := ∂0 A0 + ∂1 A1 + ∂2 A2 + ∂3 A3 , F1 := −∂1 A0 + ∂0 A1 + ∂3 A2 − ∂2 A3 , F2 := −∂2 A0 − ∂3 A1 + ∂0 A2 + ∂1 A3 , F3 := −∂3 A0 + ∂2 A1 − ∂1 A2 + ∂0 A3 . Therefore, if A is a covariant 4-vector field, then ∂A is a covariant scalar 3-vector filed. However, ∂A is not covariant under reflections since F1 e1 + F2 e2 + F3 e3 corresponds to an anti-selfdual 2-form. Remarking that the Green’s function for the operator −∆ is given by g(x) :=
1 , 2π 2 ||x||2
¯ Thus, we define S(x) := −∂g(x) = then see that
x π 2 ||x||4
x ∈ Q \ {0}.
¯ and S(x) :=
¯ ∂S(x) = −∂ ∂g(x) = −∆g(x) = δ(x),
x ¯ π 2 ||x||4
for x ∈ Q \ {0}. We
x∈Q
where δ stands for the Dirac distribution with support at the origin. The Euclidean vector (massless) field models considered in Ref. 5, 7, 15 are the solutions of the following covariant SPDEs: ∂X = F
(4.1)
where F : S(R4 , Q) × (S (R4 , Q), P ) → R is a L´evy white noise whose Fourier transform is given by 3 4 iF (f,ω) e P (dω) = exp ψ(f (x))dx , f ∈ S(R4 , Q) S (R4 ,Q)
R4
with ψ being a L´evy-Khinchine function on Q with the following expression a0 a ψ(x) := iβx0 − x20 − ||x − x0 ||2 2 2 ixy e − 1 − ixy1(0,1) (||y||) λ(dy), + Q\{0}
x∈Q
(4.2)
where β ∈ R, a0 , a ∈ (0, ∞) and λ is a Sp(1) adjoint invariant L´evy measure on Q \ {0}, namely, λ(udyu−1 ) = λ(dy) for u ∈ Sp(1). The function ψ on Q is invariant under SO(Q) = SO(R4 ). The SPDEs can be solved by the convolution ¯ ∗ F . Thus, the solution to the above SPDE is clearly related to mass X = (−∂g) zero. The Schwinger functions associated to X can be calculated explicitly and the corresponding Wightman functions are obtained explicitly by performing analytic
L´ evy White Noise, Elliptic SPDEs and Euclidean Random Fields
261
continuation. The explicit formulae for the truncated Wightman functions can be found in Ref. 7. It is proved in Ref. 5 that the Wightman functions of this model fulfil the modified Wightman axioms (i.e., Poincar´e invariance, locality, hermiticity, spectral condition and Hilbert space structure condition) of Morchio and Strocchi.44 Thus, these Wightman functions are associated with a local relativistic quantum field theory with indefinite metric (this is different from the free electromagnetic potential field unless the noise F is purely Gaussian). Furthermore, explicit formulae for scattering amplitudes have been obtained in Ref. 7 which show a convergence to free asymptotic in- and out fields of mass zero and nontrivial scattering behaviour. Moreover, the following generalized covariant SPDEs (in any space-time dimension d) have been considered in Ref. 22 and Ref. 8: DX = F
(4.3)
where F is a L´evy white noise transforming covariantly under a representation τ of SO(Rd ) acting on the spin components of F , i.e. τ (Λ)F (Λ−1 x) = F (x) in law, and D is a τ -covariant differential operator with constant coefficients whose Fourier transformed Green’s function has the form QE (k) ˆ −1 (k) = > D N 2 νl 2 l=1 (|k| + ml ) with ml ∈ C (the complex mass parameters), νl ∈ N, j ∈ N, and mj = ml if j = l. QE (k) is certain matrix with polynomial entries of order less or equal to N κ := 2( l=1 νl − 1) which fulfils the Euclidean transformation law τ (Λ)QE (k)τ (Λ−1 ) = QE (Λk),
∀Λ ∈ SO(d).
The covariant SPDE is solved by setting X = D−1 ∗ F . Again, as has been carried out in Ref. 8, one can compute explicitly the Schwinger functions of X and the associated Wightman functions which determine a massive local relativistic quantum vector field model with indefinite metric. Furthermore, necessary and sufficient conditions in terms of the mass spectrum of D are obtained for the massive vector model having nontrivial scattering behaviour (cf. Theorem 2 in Ref. 8). Thus, the vector models have an advantage that they possess nontrivial scattering behaviour in which the masses of the asymptotic fields can be obtained from the mass-spectrum of the Green’s function D−1 . While their disadvantage is that the two point Wightman function in general is ill-defined and has to be “renormalized” (cf. Ref. 15 for m = 0), otherwise the two point function cannot be associated to free field and the scattering behaviour is unclear, cf. Formula (8) in Ref. 8. This procedure of “renormalisation”, however, does not have a clear interpretation in terms of the models. This makes it unclear whether the physical insight, gained e.g. by the calculation of lattice action functionals on the Euclidean domain, is still meaningful in the “renormalised” models.
262
Jiang-Lun Wu
Now let us turn to the scalar case. The stochastic (elliptic) pseudo-differential equations for local scalar models with sharp mass (in any space-time dimension d) are considered in the following form (Ref. 4) (−∆ + m2 )α X = F ,
α ∈ (0, 1/2]
(4.4)
where ∆ is the Laplace operator on Rd , and the mass m > 0 if d = 1, 2 and m ≥ 0 if d ≥ 3, F : S(Rd ) × (S (Rd ), P ) → R is a L´evy white noise determined by 3 4 eiF (f,ω) P (dω) = exp ψ(f (x))dx , f ∈ S(Rd ) S (Rd )
Rd
with ψ being a L´evy-Khinchine function on R as given by (3.1) in Section 3 having the following representation σ 2 t2 ist ist + e −1− dM (s) , t ∈ R ψ(t) = iat − 2 1 + s2 R\{0} for a ∈ R, σ ≥ 0 and M satisfying R\{0} min(1, s2 )dM (s) < ∞. The truncated Wightman functions are derived whose Fourier transforms ˆ T Wn , n ∈ N, are given by the following explicit formulae: ˆT ρ⊗n (dm2 ), n ≥ 3, WˆnT = cn Wm,n (4.5) (R+ )n
where cn , n ∈ N, are constants depending on the probability law of the noise F, ρ(dµ) = 2 sin(πα)1{µ>m2 } /(µ − m2 )α dµ is a Borel measure on R+ with suppρ ⊂ [m2 , ∞), m = (m1 , . . . , mn ) ∈ (R+ )n , and n j−1 n n (−1) − + ˆ T = (2π)d−1− dn 2 W δ (k ) δ (k ) δ( kl ) (4.6) m,n ml l ml l kj2 − m2j j=1 l=1
l=j+1
l=1
and ˆ 2T = 2(2π)−d/2 sin(2πα) W
1{k10 <0,k12 >m2 } (k12 − m2 )2α
δ(k1 + k2 )
(4.7)
− + Here δm (kl ) := 1{kl0 <0} (kl )δ(kl2 − m2l ), δm (kl ) := 1{kl0 >0} (kl )δ(kl2 − m2l ), kl2 := l l 2 k 0 − k 2 for kl = (k 0 , kl ) ∈ R × Rd−1 . l
l
l
From the above formulae, one sees that the two point Wightman function is a regular two point function of a generalized free field and hence no “renormalisation” is required. However, psuedo-differential operators with continuous mass distribution are involved (leading to an “infraparticle”-interpretation of the model), which rise difficulties for scattering theory in the scalar case. In other words, there is no clear picture for particle interpretation. Comparing both vector and scalar cases, we see that in order to get rid of psuedo-differential operators and to consider scattering theory for the scalar case, it is worthwhile to reform the Euclidean (scalar) random fields by modifying the corresponding elliptic SPDEs.
L´ evy White Noise, Elliptic SPDEs and Euclidean Random Fields
263
5. New formulation of elliptic SPDEs and the lattice approximation First of all, let us consider the following elliptic SPDE (−∆ + m2 )X = F .
(5.1)
It is not hard to see that the two point Schwinger function of X is c2 (−∆ + m2 )−2 instead of c2 (−∆ + m2 )−1 , where the latter is the right one we expect to have for the two point Schwinger function of a QFT-model of mass m. In order to have that, we propose to consider the following elliptic SPDE (−∆ + m2 )A = F + G
(5.2)
where G : S(Rd ) × (S (Rd ), P ) → R is a ultralocal Gaussian white noise, which is independent of F , with Fourier transform 3 4 c2 iG(f,ω) e P (dω) = exp f (x)(∆f )(x)dx , f ∈ S(Rd ). m2 R d S (Rd ) c2 Obviously, the covariance of G is m 2 (−∆)δ(x). Moreover, the two point Schwinger function of A can be then calculated as follows
Cov(A)(x) = (−∆ + m2 )−2 [Cov(F ) + Cov(G)](x) = c2 [(−∆ + m2 )−2 − (−∆ + m2 )−2 ∆/m2 ](x) = (c2 /m2 )(−∆ + m2 )−1 (x). The truncated n-point Schwinger functions of A remain the same as those of X for n ≥ 3 and the associated truncated Wightman functions are (up to a constant) given by formula (4.6) for the case ml = m, l = 1, . . . , n. Thus, Euclidean random field A has the “right” Schwinger functions, and the associated Wightman functions possess a well-defined (nontrivial) scattering behaviour and permit the construction of associated relativistic quantum field to asymptotic in- and out- fields of mass m. Therefore, it is possible to develop a properly nontrivial scattering behaviour of relativistic quantum field just starting from A (cf. e.g. Ref. 2). In order to understand the “physics” behind the structure functions, it seems thus to be reasonable to study the lattice approximation of A so that one can see its action functional. The lattice approximation for Euclidean random fields has been discussed in Ref. 16 and Ref. 18 for vector models and in Ref. 17 for scalar models. In Ref. 19 (and most recently Ref. 53), a nonstandard lattice formulation has been set up which gives a rigorous functional integration formula for the Euclidean random field measures (namely the inverse Fourier transform formula for the measures). Let ε > 0 be arbitrarily fixed and define the lattice Lε with spacing ε to be the set Lε := εZd = {εz : z ∈ Zd }.
264
Jiang-Lun Wu
Denote for z = (z1 , . . . , zd ) ∈ Zd [εz, ε(z + 1)) :=
d
[εzj , ε(zj + 1))
j=1
then the indicator function 1[εz,ε(z+1)) ∈ L2 (Rd ) for any z ∈ Zd . Remarking that S(Rd ) is dense in L2 (Rd ), there exists a sequence {hn }n∈N ⊂ S(Rd ) converging to 1[εz,ε(z+1)) in L2 (Rd ). We can then define Fεz (·) := F (1[εz,ε(z+1)) , ·) = P − lim F (hn , ·) ε→0
and Gεz (·) := G(1[εz,ε(z+1)) , ·) = P − lim G(hn , ·) . ε→0
Clearly the random variables Fεz , Gεz , εz ∈ Lε , are mutually independent. Let ∆ε be the discretized Laplace operator on Lε and Kε the lattice Green’s function of −∆ε + m2 : (−∆ε + m2 )f (εz) = εd Kε (εz − εz )f (εz ) . (5.3) εz ∈Lε
Setting
Aεz (·) :=
εd Kε (εz − εz ) [Fεz (·) + Gεz (·)] ,
(5.4)
εz ∈Lε
then following the proofs of Theorems 5.1 and 5.1 in Ref. 17 and Theorem 4.3 in Ref. 18, we can show Theorem 5.1. {Aεz , εz ∈ Lε } approximates A in the sense that f ∈ S(Rd )
lim CAεz (f ) = CA (f ),
ε→0
where CA (f ) := S (Rd ) eiA(f,ω) P (dω) is the characteristic functional of A. Namely the convergence is in the sense of characteristic functionals or in law. Let PFε , PGε , and PAε be the probability measures on RLε associated with Fε = (Fεz , εz ∈ Lε ), Gε = (Gεz , εz ∈ Lε ), and Aε = (Aεz , εz ∈ Lε ), respectively. Then one can carry out that ∀q = (qεz ) ∈ RLε ∩ S(Rd ), d dqεz (5.5) PFε (dq) = e− εz∈Lε ε W (qεz ) εz∈Lε
and 1
PGε (dq) = Z −1 e− 2
εz,εz ∈Lε
B(εz−εz )qεz qεz
dqεz
εz∈Lε
where
d W (qεz ) = −ε−d log (−2π)−d eε (−iqεz t+ψ(t)) dt , R
(5.6)
L´ evy White Noise, Elliptic SPDEs and Euclidean Random Fields
265
(B(εz −εz ))εz,εz ∈Lε is a symmetric, positive definite matrix determined by (−∆ε + m2 ) and the lattice Green’s function of −∆ε , and Z is the normalization constant. Furthermore, PAε (dq)
= PFε ∗ PGε (εd Kε (εz − εz ))−1 εz,εz ∈Lε dq 1 = Z −1 [ exp{− B(εz − εz )(qεz − pεz )2 2 L ε R εz,εz ∈Lε d − ε W ((−∆ε + m2 )pεz )} dpεz ] dqεz . εz∈Lε
εz∈Lε
(5.7)
εz∈Lε
Finally from Theorem 5.1, we get w
Corollary 16.1. PAε =⇒ PA as ε → 0. More details and proofs regarding to the lattice approximation of A will appear in our forthcoming work. Acknowledgement I would like to thank Sergio Albeverio for introducing me to the wonderful world of constructive quantum field theory and for his deep insight discussion. I also thank Hanno Gottschalk for the joyful collaboration on topics related to this article. References 1. S. Albeverio, J. E. Fenstad, R. Høegh-Krohn, T. Lindstrøm: Nonstandard Methods in Stochastic Analysis and Mathematical Physics. Pure and Applied Mathematics, 122, Academic Press, Inc., Orlando, FL, 1986; MIR, Moscow, 1988 (in Russian). 2. S. Albeverio, H. Gottschalk: Scattering theory for local relativistic QFT with indefinite metric. Commun. Math. Phys. 216 (2001), 491–513. 3. S. Albeverio, H. Gottschalk, J.-L. Wu: Euclidean random fields, pseudodifferential operators, and Wightman functions. Proc. Gregynog Symposium Stochastic Analysis and Applications (eds. I.M. Davies, A. Truman and K.D. Elworthy), pp20–37, World Scientific, Singapore, 1996. 4. S. Albeverio, H. Gottschalk, J.-L. Wu: Convoluted generalized white noise, Schwinger functions and their analytic continuation to Wightman functions. Rev. Math. Phys. 8 (1996), 763–817. 5. S. Albeverio, H. Gottschalk, J.-L. Wu: Models of local relativistic quantum fields with indefinite metric (in all dimensions). Commun. Math. Phys. 184 (1997), 509–531. 6. S. Albeverio, H. Gottschalk, J.-L. Wu: Remarks on some new models of interacting quantum fields with indefinite metric. Reports Math. Phys. 40 (1997), 385–394. 7. S. Albeverio, H. Gottschalk, J.-L. Wu: Nontrivial scattering amplitudes for some local relativistic quantum field models with indefinite metric. Phys. Lett. B 405 (1997), 243–248.
266
Jiang-Lun Wu
8. S. Albeverio, H. Gottschalk, J.-L. Wu: Scattering behaviour of relativistic quantum vector fields obtained from Euclidean covariant SPDEs. Reports Math. Phys. 44 (1999), 21–28. 9. S. Albeverio, H. Gottschalk, J.-L. Wu: SPDEs leading to local, relativistic quantum vector fields with indefinite metric and nontrivial S-matrix. Stochastic Partial Differential Equations and Applications (eds. G. Da Prato and L. Tubaro.), pp21–38, Lecture Notes in Pure and Appl. Math., 227, Dekker, New York, 2002. 10. S. Albeverio, H. Gottschalk, M.W. Yoshida: Systems of classical particles in the grand canonical ensemble, scaling limits and quantum field theory. Rev. Math. Phys. 17 (2005), 175–226. 11. S. Albeverio, H. Holden, R. Høegh–Krohn, T. Kolsrud: Representation and construction of multiplicative noise. J. Funct. Anal. 87 (1989), 250–272. 12. S. Albeverio, R. Høegh–Krohn: Euclidean Markov fields and relativistic quantum fields from stochastic partial differential equations. Phys. Lett. B177 (1986), 175–179. 13. S. Albeverio, R. Høegh–Krohn: Quaternionic non–abelian relativistic quantum fields in four space–time dimensions. Phys. Lett. B189 (1987), 329–336. 14. S. Albeverio, R. Høegh–Krohn: Construction of interacting local relativistic quantum fields in four space–time dimensions. Phys. Lett. B200 (1988), 108–114; with erratum in ibid. B202 (1988), 621. 15. S. Albeverio, K. Iwata, T. Kolsrud: Random fields as solutions of the inhomogeneous quaternionic Cauchy–Riemann equation.I.Invariance and analytic continuation. Commun. Math. Phys. 132 (1990), 555–580. 16. S. ALbeverio, K. Iwata, M. Schmidt: A convergent lattice approximation for nonlinear electromagnetic fields in four dimensions. J. Math. Phys. 34 (1993), 3327–3342. 17. S. Albeverio, J.-L. Wu: Euclidean random fields obtained by convolution from generalized white noise. J. Math. Phys. 36 (1995), 5217–5245. 18. S. Albeverio, J.-L. Wu: On the lattice approximation for certain generalized vector Markov fields in four space-time dimensions. Acta Appl. Math. 47 (1997), 31–48. 19. S. Albeverio, J.-L. Wu: On nonstandard construction of stable type Euclidean random field measures and large deviation. Reuniting the Antipodes: Constructive and Nonstandard Views of the Continuum (eds. U. Berger, H. Osswald, P. Schuster), pp1–18, Synthese Library, vol. 306, Kluwer Academic Publishers, Boston, Dordrecht, London, 2001. 20. S. Albeverio, M.W. Yoshida: H-C 1 maps and elliptic SPDEs with polynomial and exponential perturbations of Nelson’s Euclidean free field. J. Funct. Anal. 196 (2002), 265–322. 21. T. Balaban, Ultra violet stability in field theory. The φ43 model. Scaling and SelfSimilarity in Physics (ed. J. Fr¨ ohlich), pp297–319, Birkh¨ auser, Boston, Basel, Stuttgart, 1983. 22. C. Becker, R. Gielerak, P. L ugiewicz: Covariant SPDEs and quantum field structures. J. Phys. A 31 (1998), 231–258. 23. C. Becker, H. Gottschalk, J.-L. Wu: Generalized random vector fields and Euclidean quantum vector fields.Second Seminar on Stochastic Analysis, Random Fields and Applications (eds. R. Dalang, M. Dozzi, F. Russo), pp15–24, Progr. Probab., 45, Birkh¨ auser, Basel, 1999. 24. T. Constantinescu, A. Gheondea: On L. Schwartz’s boundedness condition for kernels. Positivity 10 (2006), 65–86. 25. T. Constantinescu, A. Gheondea: Invariant Hermitian kernels and their Kolmogorov decompositions. C. R. Acad. Sci. Paris Sr. I Math. 331 (2000), 797–802. 26. T. Constantinescu, A. Gheondea: Representations of Hermitian kernels by means of
L´ evy White Noise, Elliptic SPDEs and Euclidean Random Fields
267
Krein spaces. II. Invariant kernels. Commun. Math. Phys. 216 (2001), 409–430. 27. S.H. Djah, H. Gottschalk, H. Ouerdiane: Feynman graph representation of the perturbation series for general functional measures. J. Funct. Anal. 227 (2005), 153–187. 28. C. H. Eab, S. C. Lim, L. P. Teo: Finite temperature Casimir effect for a massless fractional Klein-Gordon field with fractional Neumann conditions. J. Math. Phys. 48 (2007), no. 8, 082301, 24 pp. 29. I.M. Gelfand, N.Ya. Vilenkin: Generalized Functions, IV. Some Applications of Harmonic Analysis. Academic Press, New York, London, 1964. 30. J. Glimm, A. Jaffe: Positivity of the ϕ43 Hamiltonian. Fortschritte der Physik 21 (1973), 327–376. 31. J. Glimm, A. Jaffe: Quantum physics: A Functional Integral Point of View. 2nd ed., Springer-Verlag, New York, Berlin, Heidelberg, 1987. 32. H. Gottschalk, H. Thale: An indefinite metric model for interacting quantum fields on globally hyperbolic space-times. Ann. Henri Poincar 4 (2003), 637–659. 33. H. Gottschalk, B. Smii: How to determine the law of the solution to a stochastic partial differential equation driven by a L´evy space-time noise? J. Math. Phys. 48 (2007), no. 4, 043303, 22 pp. 34. M. Grothaus, L. Streit: Construction of relativistic quantum fields in the framework of white noise analysis. J. Math. Phys. 40 (1999), 5387–5405. 35. R. Haag: Quantum fields with composite particles and asymptotic conditions. Phys. Rev. 112 (1958), 669–673. 36. K. Hepp: On the connection between the LSZ and Wightman quantum field theory. Commun. Math. Phys. 1 (1965), 95–111. 37. G. Hofmann: The Hilbert space structure condition for quantum field theories with indefinite metric and transformations with linear functionals. Lett. Math. Phys. 42 (1997), 281–295. 38. G. Hofmann: On inner characterizations of pseudo-Krein and pre-Krein spaces. Publ. Res. Inst. Math. Sci. 38 (2002), 895–922. 39. Z. Huang, C. Li: On fractional stable processes and sheets: white noise approach. J. Math. Anal. Appl. 325 (2007), 624–635. 40. Z. Huang, Y. Wu: Interacting Fock expansion of L´evy white noise functionals. Acta Appl. Math. 82 (2004), 333–352. 41. N. Ikeda, S. Watanabe: Stochastic Differential Equations and Diffusion Processes. 2nd. North-Holland, Kodansha, 1989. 42. J. Jacod, A.N. Shiryaev: Limit Theorems for Stochastic Processes. Springer-Verlag, Berlin, 1987. 43. S. C. Lim, L. P. Teo: Sample path properties of fractional Riesz-Bessel field of variable order. J. Math. Phys. 49 (2008), no. 1, 013509, 31 pp. 44. G. Morchio, F. Strocchi: Infrared singularities, vacuum structure and pure phases in local quantum field theory. Ann. Inst. H. Poincar´e A33 (1980), 251–282. 45. G. Morchio, F. Strocchi: Representation of ∗ -alegbras in indefinite inner product spaces. Stochastic Processes, Physics and Geometry: New Interplays, II (a volume in honor of Sergio Albeverio), pp 491–503, CMS Conf. Proc., 29, Amer. Math. Soc., Providence, RI, 2000. 46. E. Nelson: Construction of quantum fields from Markoff fields. J. Funct. Anal. 12 (1973), 97–112. 47. E. Nelson: The free Markoff field. J. Funct. Anal. 12 (1973), 211–227. 48. W.G. Ritter: Description of noncommutative theories and matrix models by Wightman functions. J. Math. Phys. 45 (2004), 4980–5002. 49. B. Simon: The P (φ)2 Euclidean (Quantum) Field Theory. Princeton University Press,
268
Jiang-Lun Wu
Princeton, 1975. 50. F. Strocchi: Selected Topics on the General Properties of Quantum Field Theory. Lect. Notes in Physics 51. World Scientific, Singapore, 1993. 51. K. Symanzik: Euclidian quantum field theory I. Equations for a scalar model. J. Math. Phys. 7 (1966), 510–525. ´e 52. J.B. Walsh: An introduction to stochastic partial differential equations. Ecole d’Et´ de Probabilit´ es de St. Flour XIV, pp. 266–439, Lect. Notes in Math. 1180, SpringerVerlag, Berlin, 1986. 53. J.-L. Wu: A hyperfinite flat integral for generalized random fields. J. Math. Anal. Appl. 330 (2007), 133–143.
Interdisciplinary Mathematical Sciences, Volume 8, 2009 pp. 269–291
Chapter 17 A Short Presentation of Choquet Integral
Jia-an Yan∗ Institute of Applied Mathematics Academy of Mathematics and Systems Science Chinese Academy of Sciences, Beijing 100190, P. R. China Email: [email protected] The paper provides a short representation of Choquet integral. The main content is from the book D. Denneberg (1994) and the paper Zhou (1998).
Contents 1 2 3 4 5
Introduction . . . . . . . . . . . . . . . . . . . . . . Integration of Monotone Functions . . . . . . . . . . Monotone Set Functions, Measurability of Functions Comonotonicity of Functions . . . . . . . . . . . . . The Choquet Integral . . . . . . . . . . . . . . . . . 5.1 Definition and basic properties . . . . . . . . 5.2 Example 1: Distorted probability measures . 5.3 Example 2: λ-fuzzy measures . . . . . . . . . 6 The Subadditivity Theorem . . . . . . . . . . . . . . 7 Representing Functionals as Choquet Integrals . . . References . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
. . . . . . . . . . .
269 270 271 275 277 277 279 280 281 285 291
1. Introduction The Choquet integral was introduced by Choquet (1953), and originally used in statistical mechanics and potential theory. After the work of Dempster (1967), later developed by Shafer (1976), it was applied to uncertainty and the representation of beliefs. The interest of statistics for the subject was started with the work on robust Bayesian inference by Huber (1973). Some years later integral representation theorems based on Choquet integral were established by Greco (1982) and Schmeidler (1986). Yaari (1987) establishes a dual theory for risk and uncertainty, in which the certainty equivalent of a uniformly bounded economic prospect can be represented as a Choquet integral. ∗ This
work was supported by the National Natural Science Foundation of China (No. 10571167), the National Basic Research Program of China (973 Program, No.2007CB814902), and the Science Fund for Creative Research Groups (No.10721101). 269
270
Jia-An Yan
The purpose of this note is to give a short representation of some basic results about the Choquet integral. The main content is from the book “Non-additive Measure and Integral” by D. Denneberg (1994) and the paper Zhou(1998). In fact, this note has been presented by the author at a seminar of the Department of Systems Engineering and Engineering Management, Chinese University of Hong Kong, during his visit to CUHK in 2005. 2. Integration of Monotone Functions Let I be an (open, closed or semiclosed )interval of IR. Let f : I → IR be a decreasing function on I. Put a = inf{x : x ∈ I}, and J = [inf x∈I f (x), supx∈I f (x)]. There always exists a decreasing function g : J → IR such that a ∨ sup{x|f (x) > y} ≤ g(y) ≤ a ∨ sup{x|f (x) ≥ y}. We call such a g a pseudo-inverse of f , and denoted by fˇ. Note that fˇ is unique except on an at most countable set (e.c. for short). We have (fˇ)ˇ = f, e.c., and f ≤ g, e.c. is equivalent fˇ ≤ gˇ, e.c.. If f (x) is a continuity point of fˇ, then fˇ(f (x)) = x. Proposition 2.1 For a decreasing function f : IR+ → IR+ with limx→∞ f (x) = 0 and any pseudo-inverse fˇ of f , we have ∞ ∞ f (x)dx, fˇ(y)dy = 0
0
where we extended fˇ from [0, f (0)] to IR+ by letting fˇ(x) = 0 for x > f (0). For a decreasing function f : [0, b] → IR with 0 < b < ∞, and any pseudo-inverse fˇ of f , we have ∞ 0 b f (x)dx = (fˇ(y) − b)dy, fˇ(y)dy + 0
0
−∞
where we extended fˇ from [f (b), f (0)] to IR by letting fˇ(x) = 0 for x > f (0) and fˇ(x) = f (b) for x < f (b). Proof. In order to prove the first result we put Sf := {(x, y) ∈ IR2+ | 0 ≤ y ≤ f (x), x ∈ IR+ }, Sfˇ := {(x, y) ∈ IR2+ | 0 ≤ x ≤ fˇ(y), y ∈ IR+ }. Then the closures S f and S fˇ of Sf and Sfˇ in IR2 are the same. However the integrals f and fˇ are the areas of S f and S fˇ, so they are equal. Now assume f : [0, b] → IR with 0 < b < ∞ is a decreasing function. There is a point a ∈ [0, b] such that f (x) ≥ 0 for x < a and f (x) ≤ 0 for x > a. Define g(x) = f (x)I[0,a) (x),
h(x) = −f (b − x)I(0,b−a) (x), x ∈ [0, ∞].
A Short Presentation of Choquet Integral
271
ˇ Then fˇ = gˇ, ˇ h(x) = b − f(−x) on IR+ , e.c., and applying the above proved result gives a ∞ ∞ ∞ f (x)dx = g(x)dx = gˇ(y)dy = fˇ(y)dy, 0
and
b
0
0
f (x)dx = − 0
a
=−
∞
∞
0
h(x)dx = −
∞
0
ˇ h(x)dx
(b − fˇ(−x))dx =
0
0 −∞
(fˇ(y) − b)dy. 2
Adding two equalities gives the desired result. 3. Monotone Set Functions, Measurability of Functions
Let Ω be a non-empty set. We denote by 2Ω the family of all subsets of Ω. By a set system, we mean any sub-family of 2Ω containing ∅ and Ω. By a set function on a set system S, we mean a function µ : S → IR+ = [0, ∞] with µ(∅) = 0. Definition 3.1 A set function µ on S is called monotone, if µ(A) ≤ µ(B) whenever A ⊂ B, A, B ∈ S. µ is called submodular (resp. supermodular), if A, B ∈ S such that A ∪ B, A ∩ B ∈ S implies µ(A ∪ B) + µ(A ∩ B) ≤ (resp.≥)µ(A) + µ(B). µ is called modular if it is sub- and supermodular. µ is called subadditive(resp., superadditive), if A, B ∈ S such that A ∪ B ∈ S, A ∩ B = ∅ implies µ(A ∪ B) ≤ (resp.≥)µ(A) + µ(B). If S is an algebra then µ is modular iff it is additive. If S is a σ-algebra then µ is σ-additive iff it is additive and continuous from below. For a system S the closure from below S of S is defined by + * ∞ = S := A ⊂ Ω ∃ increasing sequence An ∈ S such that A = An . n=1
A set system S is called closed from below if S = S. Definition 3.2 or B ⊂ A.
A set system S is called a chain, if A, B ∈ S implies A ⊂ B
Proposition 3.1 Let S ⊂ 2Ω be a chain and µ a monotone set function on S. We denote by A the algebra generated by S. Then µ is a modular and there exists a unique modular, i.e. additive extension α : A → IR+ of µ on A. Proof It is easy to prove that *n + = A= (Ai \ Bi ) n ∈ IN , Ai , Bi ∈ S, Bi ⊂ Ai , Ai+1 ⊂ Bi , 1 ≤ i ≤ n . i=1
272
Jia-An Yan
For a set in A as above, if we require that Ai = Bi and Bi = Ai+1 then this representation is unique, and we define a set function on A by n n
= (Ai \ Bi ) := (µ(Ai ) − µ(Bi )). α i=1
i=1
Here we understand ∞ − ∞ to be 0. Obviously, α is an additive extension of µ on A. 2 Definition 3.3 Let µ be a monotone set function on 2Ω and X : Ω → IR be a function on Ω. Put Gµ,X (x) := µ(X > x). We call Gµ,X the (decreasing) distribution function of X w.r.t. µ, and call the ˇ µ,X the quantile function of X w.r.t. µ. Since 0 ≤ Gµ,X ≤ pseudo-inverse function G ˇ µ(Ω), Gµ,X is defined on [0, µ(Ω)]. Proposition 3.2 Let µ be a monotone set function on 2Ω and X : Ω → IR be a function on Ω. If u is an increasing function and u and Gµ,X have no common discontinuities, then ˇ µ,X . ˇ µ,u(X) = u ◦ G G Proof.
Let u−1 (y) = inf{x | u(x) > y}. Then
{x | u(x) > y} = {x | x > u−1 (y)} ∪ {x | x = u−1 (y), u(x) > y} ⊂ {x | x ≥ u−1 (y)}. Consequently, if [X = u−1 (y), u(X) > y] = ∅, then it holds that [u(X) > y] = [X > u−1 (y)]; otherwise u−1 (y) is a discontinuity point of u so that Gµ,X is continuous at u−1 (y). In that case we have µ([X > u−1 (y)]) = µ([X ≥ u−1 (y)]) which implies µ([u(X) > y]) = µ([X > u−1 (y)]), i.e., Gµ,u(X) = Gµ,X ◦ u−1 . In order to prove the proposition, we only need to show that sup{x | Gµ,X ◦ u−1 (x) > y} ˇ µ,X (y) ≤ u◦G ≤ sup{x | Gµ,X ◦ u−1 (x) ≥ y}. We first show the left inequality. Assume Gµ,X ◦ u−1 (x) > y, then u−1 (x) ≤ ˇ µ,X (y), then x < ˇ µ,X (y). We consider separately two cases: when u−1 (x) < G G −1 ˇ ˇ µ,X (y), so that ˇ u◦Gµ,X (y); when u (x) = Gµ,X (y), then Gµ,X is discontinuous at G −1 ˇ µ,X (y). ˇ u is continuous at Gµ,X (y). In the latter case we have x = u(u (x)) = u ◦ G This proves the left inequality. ˇ µ,X (y). ˇ µ,X (y), then u−1 (x) ≤ G Now we show the right inequality. If x < u ◦ G −1 ˇ We consider separately two cases: when u (x) < Gµ,X (y), then Gµ,X ◦ u−1 (x) >
A Short Presentation of Choquet Integral
273
ˇ µ,X (y), then u is discontinuous at G ˇ µ,X (y), so that Gµ,X is y; when u−1 (x) = G ˇ µ,X (y) = ˇ µ,X (y). In the latter case we have Gµ,X ◦u−1 (x) = Gµ,X ◦ G continuous at G y. This proves the right inequality. 2 Now we consider a monotone set function on a set system S ⊂ 2Ω . For any A ⊂ Ω we define µ∗ (A) := inf{µ(B) | A ⊂ B, B ∈ S}, µ∗ (A) := sup{µ(C) | C ⊂ A, C ∈ S}. We call the set functions µ∗ and µ∗ the outer and inner set function of µ, respectively. The following two results are typical ones from measure theory. We omit their proofs. Proposition 3.3 Let µ be a monotone set function on S ⊂ 2Ω . (i) µ∗ , µ∗ are monotone. (ii) Let S be closed under union and intersection. µ∗ is submodular if µ is; µ∗ is supermodular (superadditive) if µ is. When S is an algebra, µ∗ is subadditive if µ is. (iii) Let S be closed under union and intersection and closed from below. µ∗ is submodular and continuous from below if µ is. Proposition 3.4 Let µ be an arbitrary set function on 2Ω . Put Aµ := {A ⊂ Ω | µ(C) = µ(A ∩ C) + µ(Ac ∩ C) for all C ⊂ Ω}. Then Aµ is an algebra and µ(A ∪ B) + µ(A ∩ B) = µ(A) + µ(B), for all A ∈ Aµ , B ⊂ Ω. In particular, µ is additive on Aµ . We call Aµ the Caratheodory algebra of µ. Corollary 3.1 If µ is a monotone set function on 2Ω and is subadditive and continuous from below, then Aµ is a σ-algebra and µ is σ-additive on Aµ Definition 3.4 A function X : Ω → IR is called upper µ-measurable if Gµ∗ ,X = Gµ∗ ,X , e.c.. We denote this function, unique e.c., by Gµ,X , and call it the (decreasing) distribution function of X w.r.t µ on S. Definition 3.5 A function X : Ω → IR is called lower µ-measurable, if −X is upper µ-measurable. X is called µ-measurable, if it is lower and upper µ-measurable. X is called (upper, lower) S-measurable, if it is (lower, upper) µmeasurable for any monotone set function µ on S. X is called strongly S-measurable, if MX , M−X ∈ S, where MX the so-called upper set system of X, i.e. MX = {[X > x], [X ≥ x], x ∈ IR}.
274
Jia-An Yan
If S is a σ-algebra, then X is strongly S-measurable iff it is S-measurable in the usual sense. The following hereditary properties for measurability are an immediate consequence of Proposition 3.2. Proposition 3.5 Let µ be a monotone set function on S ⊂ 2Ω and X a upper µmeasurable function on Ω. If u : IR → IR is increasing (decreasing) and continuous, then u ◦ X is upper (lower) µ-measurable. In particular, X + c, X ∧ c, X ∨ c are upper µ-measurable for c ∈ IR, and cX is upper (lower) µ-measurable for c > 0 (c < 0). The following proposition, due to Greco (1982), gives a necessary and sufficient condition for upper S-measurability. Proposition 3.6 A function X : Ω → IR is upper S-measurable iff for every pair a, b ∈ IR, a < b, there exists a set S ∈ S so that [X > b] ⊂ S ⊂ [X > a]. Proof. Sufficiency. Let µ be a monotone on S and x ∈ IR a continuous point of Gµ∗ ,X . It is sufficient to show that Gµ∗ ,X (x) = Gµ∗ ,X (x). Let b > x. The assumption implies Gµ∗ ,X (b) = inf{µ(A) | [X > b] ⊂ A, A ∈ S} ≤ sup{µ(B) | B ⊂ [X > x], B ∈ S} = Gµ∗ ,X (x). Letting b → x gives Gµ∗ ,X (x) ≤ Gµ∗ ,X (x). Since µ∗ ≤ µ∗ the reversed inequality holds and we are done. Necessity. Let a < b. We put µ(A) = inf{(b − a ∨ x)+ | A ⊂ [X > x], x ∈ IR}, A ∈ S. Here we make a convention that inf ∅ := b − a. By assumption X is upper µmeasurable. So we can find a real number x, a < x < b, such that Gµ∗ ,X (x) = Gµ∗ ,X (x). If [X > x] ∈ S, we can take S = [X > x] so that [X > b] ⊂ S ⊂ [X > a]. If [X > x] ∈ / S, then we have (b − a ∨ x)+ ≥ sup{µ(A) | A ∈ S, A ⊂ [X > x]} = inf{µ(B) | B ∈ S, [X > x] ⊂ B} ≥ (b − a ∨ x)+ . Hence equality holds and we can find S ∈ S so that (b − a ∨ x)+ ≤ µ(S) < b − a. Consequently, we must have [X > b] ⊂ [X > x] ⊂ S ⊂ [X > a]. The proposition is proved.
2
A Short Presentation of Choquet Integral
275
Remark 3.1 If S is a σ-algebra, then by Proposition 3.6 a function X : Ω → IR is upper S-measurable iff it is S-measurable. The following important result is also due to Greco (1982). Proposition 3.7 Let S ⊂ 2Ω be a set system which is closed under union and intersection. If X, Y : Ω → IR are upper S-measurable functions, which are bounded below, then X + Y is upper S-measurable, too. Proof. Instate of adding a constant to X andY we may assume X, Y ≥ 0. Given a < b we have to find a set S so that [X + Y > b] ⊂ S ⊂ [X + Y > a]. b First we select n ∈ IN so large that n−4 n b > a. Let ai = (i − 1) n , i = 0, . . . , n. Since n−1 ai+1 + an−i = n b < b, it is easy to see that = [X > ai ] ∩ [Y > aj ]. [X + Y > b] ⊂ i+j=n
On the other hand, by Proposition 2.10 there exist Si , Tj ∈ S so that [X > ai ] ⊂ Si ⊂ [X > ai−1 ], [Y > aj ] ⊂ Tj ⊂ [Y > aj−1 ]. Consequently, [X > ai ] ∩ [Y > aj ] ⊂Si ∩ Tj ⊂ [X > ai−1 ] ∩ [Y > aj−1 ] ⊂[X + Y > ai−1 + aj−1 ]. Since ai−1 + aj−1 = n−4 b > a, we have that the last set is a subset of [X + Y > a] n 6 2 if i + j = n. Thus S := i+j=n Si ∩ Tj is the desired set. 4. Comonotonicity of Functions 6 A class C of functions Ω → IR is called comonotonic if X∈C MX is a chain. Clearly, a class C of functions is comonotonic iff each pair of functions in C is comonotonic. The following proposition gives equivalent conditions for a pair of functions to be comonotonic. Proposition 4.1 For two functions X, Y : Ω → IR the following conditions are equivalent: (i) X, Y are comonotonic. (ii) There is no pair ω1 , ω2 ∈ Ω such that X(ω1 ) < X(ω2 ) and Y (ω1 ) > Y (ω2 ). 2 (iii) The set {(X(ω), Y (ω)) | ω ∈ Ω} ⊂ IR is a chain w.r.t. the ≤-relation in 2 IR . If X, Y are real valued, the above conditions and the following two conditions are equivalent: (iv) There exists a function Z : Ω → IR and increasing functions u, v on IR such that X = u(Z), Y = v(Z).
276
Jia-An Yan
(v) There exist continuous, increasing functions u, v on IR such that u(z)+v(z) = z, z ∈ IR, and X = u(X + Y ), Y = v(X + Y ). Proof. The equivalences (i) ⇔(ii)⇔(iii) are easy to check. For real valued X, Y , the implications (v)⇒(iv)⇒(ii) are trivial. We only need to prove (ii)⇒(v). Now assume (ii) is valid and X, Y are real valued. Let Z = X + Y . Then from (ii) it is easy to see that any z ∈ Z(Ω) possesses a unique decomposition z = x + y with z = Z(ω), x = X(ω), y = Y (ω) for some ω ∈ Ω. We denote x and y by u(z) and v(z). By (ii) it is easy to check that u and v are increasing on Z(Ω). Now we prove that u, v are continuous on Z(Ω). First notice that for z, z + h ∈ Z(Ω) with h > 0 we have z + h = u(z + h) + v(z + h) ≥ u(z + h) + v(z) = u(z + h) + z − u(z). Thus we have u(z) ≤ u(z + h) ≤ u(z) + h. Similarly, for z, z − h ∈ Z(ω) with h > 0 we have u(z) − h ≤ u(z − h) ≤ u(z). These two inequalities together imply the continuity of u. By the symmetry of the roles of u and v, v is continuous, too. It remains to show that u, v can be extended continuously from Z(Ω) to IR. Fist extend to the closure Z(Ω). If z ∈ ∂Z(Ω) is only one sided boundary point, there is no problem, because u, v are increasing functions. If z is two sided limiting point of Z(Ω), then the above inequalities imply that two sided continuous extensions coincide. Finally, the extension of u, v from Z(Ω) to IR is done linearly on each connected component of IR\Z(Ω) in order to maintain the condition u(z)+v(z) = z. 2 Corollary 4.1 Let µ be a monotone set function on 2Ω . If X, Y are real valued comotononic functions on Ω, then ˇ µ,X+Y = G ˇ µ,X + G ˇ µ,Y , e.c.. G Proof Using the above notations in (v), we have X = u(X +Y ), Y = v(X +Y ). By Proposition 3.2 we get ˇ µ,X+Y = (u + v) ◦ G ˇ µ,X+Y = u ◦ G ˇ µ,X+Y + v ◦ G ˇ µ,X+Y = G ˇ µ,X + G ˇ µ,Y , e.c.. G 2
A Short Presentation of Choquet Integral
277
5. The Choquet Integral In this section, we define the Choquet integral of functions w.r.t. a monotone set function, and show their basic properties. As two examples of monotone set functions, the distorted probability measure and the λ-fuzzy measure are studied. 5.1. Definition and basic properties Let µ be a monotone set function on a set system S ⊂ F , and X : Ω → IR an upper µ-measurable function. If the following Lebesgue integral µ(Ω) ˇ µ,X (t)dt G 0
ˇ µ,X is the quantile function of X, then we say that X is integrable exists, where G w.r.t. µ, and define it as the Choquet integral of X w.r.t. µ. We denote it by Xdµ or µ(X). By Proposition 2.1 and using the fact that (fˇ)ˇ = f, e.c., we have 0 ∞ Gµ,X (x)dx + (Gµ,X (x) − µ(Ω))dx, if µ(Ω) < ∞. µ(X) = −∞
0
Recall that if µ is a probability measure and X is a random variable, then the expectation of X w.r.t. µ can be expressed by 0 ∞ µ(X ≥ t)dt + (µ(X ≥ t) − 1)dt. µ(X) = 0
−∞
So the Choquet integral of a real valued function X w.r.t. a probability measure µ coincide with its expectation. n If X is a simple function of the form X = i=1 xi IAi , where A1 , · · · , An are disjoint and enumerated so that (xi ) are in descending order, i.e. x1 ≥ · · · ≥ xn , then n n (xi − xi+1 )µ(Si ) = xi (µ(Si ) − µ(Si−1 )), µ(X) = i=1
i=1
where Si = A1 ∪ · · · ∪ Ai , i = 1, · · · , n, S0 = ∅, and xn+1 = 0. Now we investigate the basic properties of the Choquet integral. Proposition 5.1 If µ is a monotone set function on S ⊂ 2Ω and X, Y : Ω → IR are upper µ-measurable functions, then (i) IA dµ = µ(A), A ∈ S. (ii) (positive homogeneity) cXdµ = c Xdµ, if c ≥ 0. (iii) (asymmetry) If µ is finite then Xdµ = − (−X)d¯ µ, where µ ¯(A) = µ(Ω) − µ(A). (iv) (monotonicity) If X ≤ Y then Xdµ ≤ Y dµ. (v) (X + c)dµ = Xdµ + cµ(Ω), c ∈ IR.
278
Jia-An Yan
(vi) (comonotonic additivity) If X, Y are comonotonic and real valued then (X + Y )dµ = Xdµ + Y dµ. (vii) (transformation rule) For a T : Ω → Ω with T −1 (S ) ⊂ S, let µT (A) = µ(T −1 (A)), A ∈ S . Then for a function Z : Ω → IR, we have Gµ,Z◦T = GµT ,Z and ZdµT = Z ◦ T dµ. Proof.
Since for upper µ-measurable functions X we have Xdµ = Xdµ∗ = Xdµ∗ ,
and µ∗ , µ∗ are monotone set functions defined on 2Ω , instate of replacing µ by µ∗ or µ∗ , we may assume that µ is a monotone set function on 2Ω . ˇ µ,X for c > 0 (Proposition 3.2). (iii) is ˇ µ,cX = cG (i) is trivial. (ii) follows from G due to the fact that Gµ¯ ,X (x) = µ(Ω) − Gµ,−X (−x). (vi) is derived from Corollary 4.1. Other properties are easy to check. 2 Proposition 5.2 Let X : Ω → IR be a S-measurable function and µ, ν Ω monotone set functions on S ⊂ 2 . Then Xd(cµ) = c Xdµ, if c > 0. (i) Gcµ,X = cGµ,X , (ii) If µ and ν are finite and S is closed under union and intersection, then Xd(µ + ν) = Xdµ + Xdν. Gµ+ν,X = Gµ,X + Gν,X , e.c., (iii) If µ(Ω) = ν(Ω) < ∞ or X ≥ 0 then µ ≤ ν implies Xdµ ≤ Xdν. Gµ,X ≤ Gν,X , e.c., (iv) If µn is a sequence of monotone set function on S with µn ≤ µn+1 and limn→∞ µn (A) = µ(A), A ∈ S, then for bounded below X lim Xdµn = Xdµ. n→∞
Proof. (i) and (iii) are trivial. (ii) is trivial, too, if S = 2Ω . In the general case, (ii) is also true, because we can show that (µ + ν)∗ = µ∗ + ν ∗ . In order to prove (iv), notice that lim(µn )∗ (A) = sup(µn )∗ (A) = sup n
n
=
sup
n B∈S,B⊂A
sup
sup µn (B) =
B∈S,B⊂A n
µn (B)
sup B∈S,B⊂A
µ(B) = µ∗ (A), A ∈ 2Ω ,
we may assume S = 2Ω . If X ≥ 0, then Xdµn = Gµn ,X (x)dx and the monotone convergence theorem gives the desired assertion. Subtraction a constant shows that the assertion is true for bounded below function X. 2
A Short Presentation of Choquet Integral
279
Remark 5.1 If X is µ-measurable then X + and −X − are upper µ-measurable (Proposition 3.5). Since X + , −X − are comonotonic functions we have, if X is real valued, + Xdµ = X dµ + (−X − )dµ, and
Proposition 5.3 q < µ(Ω), define
X + dµ −
Xdµ =
X − d¯ µ, if µ(Ω) < ∞.
Let µ be a monotone set functions on 2Ω . For any q, 0 < µq (A) := q ∧ µ(A), A ∈ 2Ω .
µq is monotone and for an arbitrary function X : Ω → IR lim Xdµq = Xdµ. q→µ(Ω)
Proof.
Since Gµq ,X (x) = µq (X > x) = q ∧ µ(X > x) = q ∧ Gµ,X ,
ˇ µq ,X coincide on [0, q). Hence we have ˇ µ,X and G G q ˇ µq ,X (t)dt = G Xdµq = →
0 µ(Ω)
ˇ µ,X (t)dt = G
q
ˇ µ,X (t)dt G
0
Xdµ.
0
2
5.2. Example 1: Distorted probability measures Let P be a probability measure on a measurable space (Ω, F ) and γ : [0, 1] → [0, 1] an increasing function with γ(0) = 0, γ(1) = 1. Then µ = γ ◦ P is a monotone set function. µ is called a distorted probability and γ the corresponding distortion. If γ is a concave (convex) function then γ ◦ P is a submodular (supermodular) set function. This assertion is also valid for a normalized additive set function on an algebra, instate of a probability measure on a σ-algebra. We only consider the concave case, the convex case being similar. Let A, B ∈ S. Assume a := P (A) ≤ P (B) =: b. Denote c = P (A∩B), d = P (A∪B). Then c ≤ a ≤ b ≤ d. By modularity of P we have c + d = a + b. Thus concavity of γ implies γ(c) + γ(d) ≤ γ(a) + γ(b). This proves submodularity of γ ◦ P . For a distortion g ◦ P the Choquet integral (g ◦ P )(X) of X w.r.t. g ◦ P can be expressed in the following form: 1 1 qX (1 − x)dg(x) = qX (t)dγ(t), (g ◦ P )(X) = 0
0
280
Jia-An Yan
where qX (t) is the right-continuous inverse of the distribution function FX of X, and γ(t) = 1 − g(1 − t). 5.3. Example 2: λ-fuzzy measures Let λ ∈ (−1, ∞). A normalized monotone set function µλ defined on an algebra S ⊂ 2Ω is called a λ-fuzzy measure on S, if for every pair of disjoint subsets A and B of Ω µλ (A ∪ B) = µλ (A) + µλ (B) + λµλ (A)µλ (B). If λ = 0, µ0 is additive. For λ ∈ (−1, ∞) and λ = 0, we define ψλ (r) = log(1+λ) (1 + λr). The inverse of ψλ is ψλ−1 =
1 [(1 + λ)r − 1]. λ
It is easy to check that ψλ ◦ µλ is additive. Since ψλ−1 is a concave (resp. convex) function for λ > 0 (resp. λ ∈ (−1, 0)), µλ is submodular (resp. supermodular) if λ > 0 (resp. λ ∈ (−1, 0)). For every finite sequence of mutually disjoint subsets A1 , A2 , . . . , An of Ω, n E n F = −1 µλ Ai = ψλ ψλ (µλ (Ai )) . i=1
Thus,
µλ
n =
i=1
Ai
= ψλ−1
i=1
1 = λ
E n
i=1 n
F log(1+λ) (1 + λµλ (Ai ))
[1 + λµλ (Ai )] − 1 .
i=1
Let P be a probability measure on a measurable space (Ω, F ). Then the set function ψλ−1 ◦ P is a λ-fuzzy measure . For every F -measurable function X on Ω, we define its λ-expectation Eλ (X) as Eλ [X] = Xd(ψλ−1 ◦ P ). The λ-expectation has the following properties: (i) If λ ≤ λ , then Eλ [X] ≥ Eλ [X]. (ii) limλ→−1 Eλ [X] = esssupω∈Ω X(ω). (iii) limλ→∞ Eλ [X] = essinf ω∈Ω X(ω). For decision problem, the region of λ representing risk proneness is (−1, 0) and the one representing risk aversion is (0, ∞). When λ = 0, the decision maker is risk neutral.
A Short Presentation of Choquet Integral
281
¯ = − λ , then the two λ-fuzzy measures ψ −1 ◦ P and For λ ∈ (−1, ∞), let λ λ 1+λ ψλ−1 ¯ ◦ P are conjugate to each other, i.e., −1 c ψλ−1 ¯ ◦ P (A) = 1 − ψλ ◦ P (A ),
and that Eλ (X) = −Eλ¯ (−X). 6. The Subadditivity Theorem Let µ be a monotone set functions on 2Ω . The Choquet integral w.r.t. µ is called subadditive if for upper µ-measurable functions X and Y (X + Y )dµ ≤ Xdµ + Y dµ. A necessary condition for the Choquet integral w.r.t. µ be subadditive is submodularity of µ, because IA∪B and IA∩B are comonotonic, and we have (IA + IB )dµ = (IA∪B + IA∩B )dµ = IA∪B dµ + IA∩B dµ = µ(A ∪ B) + µ(A ∩ B), A, B ⊂ Ω. We shall prove that submodularity of the set function is also sufficient for subadditivity of the Choquet integral. The following lemma contains the core of the proof. Lemma 6.1 Let Ω be the disjoint union of the sets A1 , . . . , An . Let A be the algebra generated by {A1 , . . . , An } and µ : A → [0, 1] be a monotone set function with µ(Ω) = 1. For any permutation π of {1, . . . , n} define Siπ :=
i =
Aπj , i = 1, . . . , n,
S0π := ∅.
j=1
We define a probability measure P π on A through π P π (Aπi ) := µ(Siπ ) − µ(Si−1 ), i = 1, . . . , n.
Now let X : Ω → IR be A-measurable, i.e. constant on each Ai . If µ is submodular then Xdµ ≥ XdP π , and equality holds if X(Aπ1 ) ≥ X(Aπ2 ) ≥ · · · ≥ X(Aπn ). Proof. It suffices to prove the case π = id. We denote Siid by Si , P id by P , and let xi := X(Ai ). We first prove the assertion on equality. Assume
282
Jia-An Yan
that x1 ≥ x2 ≥ · · · ≥ xn . Since S1 ⊂ S2 ⊂ · · · ⊂ Sn , the class {IS1 , · · · ISn } is comonotonic. Thus we have (letting xi+1 := 0, S0 := ∅) n n Xdµ = xi IAi dµ = (xi − xi+1 )ISi i=1
i=1
n n = (xi − xi+1 )µ(Si ) = xi (µ(Si ) − µ(Si−1 ) = XdP. i=1
i=1
Now assume that for some i < n we have xi < xi+1 . Let ϕ be the permutation ϕ ϕ = Si−1 = Siϕ ∩ Si , Si+1 = Si+1 = which just interchanges i and i + 1. Then Si−1 ϕ Si ∪ Si . Submodularity of µ implies ϕ ) = P ϕ (Aϕi ) = P ϕ (Ai+1 ). P (Ai+1 ) = µ(Si+1 ) − µ(Si ) ≤ µ(Siϕ ) − µ(Si−1
Multiplying by xi+1 − xi > 0 gives (xi+1 − xi )P (Ai+1 ) ≤ (xi+1 − xi )P ϕ (Ai+1 ). On the other hand, we have ϕ ϕ ) − µ(Si−1 ) P (Ai ) + P (Ai+1 ) = µ(Si+1 ) − µ(Si−1 ) = µ(Si+1
= P ϕ (Ai+1 ) + P ϕ (Ai ). Multiplying by xi and adding to the last inequality gives xi P ϕ (Ai ) + xi+1 P ϕ (Ai+1 ) ≥ xi P (Ai ) + xi+1 P (Ai+1 ), which implies
XdP ≥ ϕ
XdP.
By induction, we can construct from finitely many permutation of type ϕ a permutation θ with X(Aθ1 ) ≥ X(Aθ2 ) ≥ · · · ≥ X(Aθn ) and
XdP ≥ θ
XdP.
Since we have proved that the left hand side integral is Xdµ, we conclude the proof of the desired result. 2 For convenience we will say that a property of an upper µ-measurable function ˇ µ,X e.c.. X holds µ-essentially if the same property holds for the quantile function G ˇ For example, we say X is µ-essentially > −∞, if Gµ,X (t) > −∞ for all t ∈ [0, µ(Ω)], e.c.. The following is the subadditivity theorem.
A Short Presentation of Choquet Integral
283
Theorem 6.1 Let µ be a monotone, submodular set functions on 2Ω and X, Y upper µ-measurable functions on Ω. If X, Y are µ-essentially > −∞, i.e., lim Gµ,X (x) = µ(Ω),
x→−∞
then
lim Gµ,Y (x) = µ(Ω),
x→−∞
(X + Y )dµ ≤
Xdµ +
Y dµ.
If µ is continuous from below the assumption on X, Y can be dropped. Proof. First of all, we assume µ(Ω) = 1. If X, Y are simple functions, the Z := X + Y is also a simple functions. Let A1 , A2 , · · · , An be a partition of Ω such that X and Y are constant on each Ai , and Z(A1 ) ≥ Z(A2 ) ≥ · · · ≥ Z(An ). By Lemma 6.1 there is a probability measure on A, the algebra generated by A1 , A2 , · · · , An such that Zdµ = ZdP = XdP + Y dP. Once again, Lemma 6.1 implies (X + Y )dµ ≤ Xdµ + Y dµ. Now assume that X, Y are bounded. Let Z := X + Y and Xn := un (X), Yn := un (Y ), Zn := un (Z), where 4 3 k k | k ∈ Z, l ≥ x , n ∈ IN . un := inf n n Then Xn , Yn , Zn are sequences of simple functions, and X ≤ Xn ≤ X + Xn + Yn −
1 1 , Y ≤ Yn ≤ Y + , n n
2 ≤ Zn ≤ Xn + Yn . n
Proposition 3.1 (iv) and (v) imply 1 Xdµ ≤ Xn dµ ≤ Xdµ + . n Hence
lim
n→∞
Xn dµ =
Xdµ.
The same is valid for Y and Z. However, monotonicity of the integral the subadditivity for simple functions imply Zn dµ ≤ (Xn + Yn )dµ ≤ Xn dµ + Yn dµ, from which we get the desired inequality.
284
Jia-An Yan
Assume µ(Ω) = 1 and that X, Y are bounded below. By adding a constant we may assume X, Y ≥ 0. Let Xn := n ∧ X, Yn := n ∧ Y . Since the increasing sequence Gµ,Xn +Yn converges to Gµ,X+Y , we have ∞ ∞ Gµ,Xn +Yn (x)dx → (X + Y )dµ. (Xn + Yn )dµ = 0
0
On the other hand, monotonicity of the integral the subadditivity for bounded functions imply (Xn + Yn )dµ ≤ Xn dµ + Yn dµ ≤ Xdµ + Y dµ, from which we get the desired inequality. ˇ µ,X (t) and G ˇ µ,Y are bounded below e.c.. In this Assume µ(Ω) = 1 and that G case there is an a ∈ IR so that Gµ,X (a) = 1, Gµ,Y (a) = 1. Define X := a ∨ X, which is bounded below. Then Gµ,X = Gµ,X , hence Xdµ = Xdµ. Doing the same for Y we get int(X + Y )dµ ≤ (X + Y )dµ ≤ Xdµ + Y dµ = Xdµ + Y dµ. Now we come to the general case. First of all Proposition 5.2 (i) extends the desired inequality from normalized µ to finite µ. We will use µq = q ∧ µ, 0 < q < µ(Ω) to extend the assertion on unbounded X, Y and infinite µ(Ω). In fact, since limt→−∞ Gµ,X (t) = µ(Ω) > q, we can find an a ∈ IR with Gµ,X (a) ≥ q so that ˇ µq ,X (t), and similarly G ˇ µq ,Y , are Gµq ,X = q ∧ Gµ,X (a) = q = µq (Ω). That means G bounded below e.c.. From the above proved result we have (X + Y )dµq ≤ Xdµq + Y dµq . Letting q → µ(Ω), Proposition 5.3 implies (X + Y )dµ ≤ Xdµ + Y dµ. Finally, we come to the case where µ is continuous below. We treat two cases separately. In case µ(X + Y > −∞) < µ(Ω), either (X + Y )dµ does not exist or is −∞. So nothing has to be proved or the assertion is trivial. Now assume µ(X + Y > −∞) = µ(Ω). Since {X + Y > −∞} = {X > −∞} ∩ {Y > −∞}, monotonicity of µ implies µ(X > −∞) = µ(Ω) and µ(Y > −∞) = µ(Ω). Then it is ˇ µ,X (t) > −∞, G ˇ µ,Y (t) > −∞ for all t ∈ [0, µ(Ω)] e.c.. So we are easy to see that G in the situation already proved. 2 Corollary 6.1 Let µ be a monotone, submodular set functions on 2Ω and X, Y upper µ-measurable functions on Ω. Moreover, if X, Y, X − Y and Y − X are
A Short Presentation of Choquet Integral
285
µ-essentially > −∞, then Xdµ − Y dµ ≤ |X − Y |dµ. Especially, we have
Proof.
Xdµ ≤ |X|dµ.
We may assume Xdµ ≥ Y dµ. By Theorem 6.1 we have Xdµ = (X − Y + Y )dµ ≤ (X − Y )dµ + Y dµ
and, using X − Y ≤ |X − Y |, 0 ≤ Xdµ − Y dµ ≤ (X − Y )dµ ≤ |X − Y |dµ, 2
the latter is the desired inequality. 7. Representing Functionals as Choquet Integrals
Given a family F of functions X : Ω → IR and a functional Γ : F → IR, we are interested in conditions under which Γ can be represented as a Choquet integral: Γ(X) = Xdγ, X ∈ F , where γ is a monotone set function on 2Ω . To begin with we prepare a lemma. Lemma 7.1 Let µ be a monotone set function on S ⊂ 2Ω and X : Ω → IR+ a upper µ-measurable function. Then at any continuity point x ≥ 0 of Gµ,X the function g : x → X ∧ xdµ is differentiable with derivative Gµ,X (x). If Gµ,X is right continuous then Gµ,X is the derivative from right of X ∧ xdµ at all points x ≥ 0. Proof.
Since for x, y ≥ 0 Gµ,X∧x (y) = µ∗ (X ∧ x > y) = Gµ,X (y)I[0,x) (y),
we have
g(x) :=
X ∧ xdµ =
∞
Gµ,X∧x (y)dy = 0
x
Gµ,X (y)dy, x ≥ 0,
0
whence the desired assertion.
2
The following representation theorem is due to Greco (1982). Theorem 7.1 Let F be a family of functions which has the following properties: a) X ≥ 0 for all X ∈ F ,
286
Jia-An Yan
b) aX, X ∧ a, X − X ∧ a ∈ F , if X ∈ F , a ∈ IR+ . Assume that the functional Γ : F → IR satisfies the following conditions: (i) (positive homogeneity): Γ(cX) = cΓ(X) for X ∈ F , a ∈ IR+ , (ii) (monotonicity): X, Y ∈ F , X ≤ Y imply Γ(X) ≤ Γ(Y ), (iii) (comonotonic additivity):Γ(X+Y ) = Γ(X)+Γ(Y ) for comonotonic X, Y ∈ F withX + Y ∈ F , (iv) (lower marginal continuity): lima→0 Γ(X − X ∧ a) = Γ(X) for X ∈ F , (V) (upper marginal continuity): limb→∞ Γ(X ∧ b) = Γ(X) for X ∈ F . Put α(A) := sup{Γ(X) | X ∈ F , X ≤ IA }, β(A) := inf{Γ(Y ) | X ∈ F , Y ≥ IA }, A ∈ 2Ω . Then α ≤ β and α, β are monotone. Let γ be a monotone set function on 2Ω so that α ≤ γ ≤ β. Then γ represents Γ. Let X ∈ F . For n ∈ IN we define n2n −1 1 un (x) := n I i . 2 i=1 x> 2n
Proof.
Since
i+1 i i i−1 n 2 x ∧ n − x ∧ n ≤ Ix> in ≤ 2 x ∧ n − x ∧ n , 2 2 2 2 2 n
we have
i i+1 i 2n Γ X ∧ n − X ∧ n ≤ α X > n 2 2 2 i i ≤γ X > n ≤β X > n 2 2 i i − 1 . ≤ 2n Γ X ∧ n − X ∧ n 2 2
Summing up these inequalities with running index i and observing that the functions X ∧ 2in − X ∧ i−1 2n are commonotonic (Proposition 4.1) we get, using comonotonic additivity of Γ, 1 1 ≤ Γ(X). Γ X ∧ n − X ∧ n ≤ un (X)dγ ≤ Γ X ∧ n − n 2 2 In order to prove that lim un (X)dγ = Γ(X), we only need to show n→∞ 1 lim Γ X ∧ n − X ∧ n = Γ(X). n→∞ 2 To this end we rewrite the functions on the left hand side as follows: 1 1 X ∧ n − X ∧ n = (X ∧ n − X ∧ 1) + (X ∧ 1 − X ∧ n ) 2 2 1 = (X − X ∧ 1) ∧ (n − 1) + (X ∧ 1 − (X ∧ 1) ∧ n ). 2
A Short Presentation of Choquet Integral
287
The last two functions being comonotonic we derive from (iii)-(v) 1 lim Γ X ∧ n − X ∧ n = Γ(X − X ∧ 1) + Γ(X ∧ 1) = Γ(X). n→∞ 2 Finally, to conclude the proof it remains to show un (X)dγ = Xdγ. lim n→∞
This can be shown as follows: X ∧n−X ∧ implies
n 1 2n
1 ≤ un (X) ≤ X 2n
1 Gγ,X (x)dx = X ∧ n − X ∧ n dγ ≤ un (X)dγ 2 ∞ ≤ Xdγ = Gγ,X (x)dx, 0
from which letting n → ∞ we get the desired result.
2
Remark 7.1 1) In the statement of theorem condition (i) is implied by conditions (ii) and (iii). In fact, for a positive rational number c, Γ(cX) = cΓ(X) is implied by comonotonic additivity. The monotonicity assumption which is also a continuity assumption implies the above equality for all non-negative numbers c. 2) The set functions α, β are the smallest and the largest monotone set functions, respectively, which represent Γ. 3) Conditions (i) through (v) are not only sufficient for representing Γ as an integral but necessary, too. Corollary 7.1 Let F be a family of functions which has the properties b), c) (or c)’) and d), where c) X + 1 ∈ F for all X ∈ F , c)’ 1 ∈ F and a) is true, d) X is bounded for X ∈ F . Given a real functional Γ on F satisfying properties (i)-(iii), then there is a monotone finite set function γ on 2Ω representing Γ. Proof. First we assume c)’ and d) are true. d) implies (v), and (iv) follows from the fact that Γ(X −X ∧a) = Γ(X)−Γ(X ∧a) and 0 ≤ Γ(X ∧a) ≤ Γ(a) = aΓ(1) → 0, as a → 0. Thus all assumptions of Theorem 7.1 are valid. Now we assume c) and d) are valid. Since b) implies 0 ∈ F, by c) 1 must in F . Let F+ = {X ∈ F | X ≥ 0. Then according to the above proved result F+ matches all assumptions of Theorem 7.1. Now for X ∈ F there is, according to d), a constant c > 0 such that X + c ≥ 0 and X + c = c( 1c X + 1) ∈ F+ and, since the assertion is valid for X + c, Γ(X) = Γ(X + c) − cΓ(1) = (X + c)dγ − cγ(Ω) = Xdγ. 2
288
Jia-An Yan
The following corollary gives sufficient conditions on F and Γ under which a submodular (resp. supermodular) γ representing Γ exists. Corollary 7.2 In Theorem 7.1 if F further has the lattice property e) X ∧ Y, X ∨ Y ∈ F if X, Y ∈ F , and Γ further has the following property (vi) (submodularity): Γ(X ∨ Y ) + Γ(X ∧ Y ) ≤ Γ(X) + Γ(Y ) if X, Y ∈ F , or (vii) (supermodularity):Γ(X ∨ Y ) + Γ(X ∧ Y ) ≥ Γ(X) + Γ(Y ) if X, Y ∈ F , then β (resp. α) defined in the proof of Theorem 7.1 is a monotone, submodular( resp. supermodular) which represents Γ. Proof.
For any X : Ω → IR+ let SX := {(ω, x) ∈ Ω × IR+ | x < X(ω)}
be the subgraph of X. Then by e) the system S := {SX | X ∈ F is closed under union and intersection. We introduce an auxiliary set function ν on S by ν(SX ) := Γ(X), X ∈ F . Since SX ⊂ SY iff X ≤ Y , ν is monotone. It is easy to see that (vi) implies that ν is submodular, and the outer set function ν ∗ of ν is submodular, too, by Proposition 5.2. Now we return from Ω × IR+ to Ω in defining γ(A) := ν ∗ (SIA ), A ∈ 2Ω . Clearly γ is monotone and submodular since SIA∪B = SIA ∪ SIB , SIA∩B = SIA ∩ SIB . Now we show that γ = β. In fact, for any A ∈ 2Ω , β(A) = inf{Γ(Y ) | IA ≤ Y ∈ F} = inf{ν(SY ) | SIA ⊂ SY ∈ S} = ν ∗ (SIA ) = γ(A). Similarly, we can treat with the case where (vii) holds.
2
A normalized monotone set function is called a capacity. If a capacity is continuous from above we call it a upper-continuous capacity. Let F be a collection of bounded real-valued functions on Ω. F is called a Stone vector lattice if : (i) F is a vector space; (ii) F is a lattice, i.e., X ∨ Y, X ∧ Y ∈ F for all X, Y ∈ F ; and (iii) F contains all constant functions on Ω. Let I be a (realvalued) function from F to IR. I is called a quasi-integral if: I is comonotonically additive, monotonic and continuous in the sense that lim I(Xn ) = I(X),
n→∞
A Short Presentation of Choquet Integral
289
if X, Xn ∈ F , n ≥ 1, (Xn ) is deccreasing and tends to X. According to Remark 7.1.1), a quasi-integral I is always positive homogeneous. The following theorem (due to Zhou( 1998)) establishes a one-to-one correspondence between upper-continuous capacities and quasi-integrals. Theorem 7.2 Assume that I is a quasi-integral on a Stone lattice F on Ω and I(1) = 1. Then there exists a unique upper-continuous capacity µ on S := {[X ≥ c], X ∈ F , c ∈ IR} which represents I. On the other hand, for any upper-continuous capacity µ, the functional I defined on F by the Choquet integral I(X) := Xdµ is a quasi-integral. Before proving Theorem 7.2 we prepare a lemma. Lemma 7.2 Let I be a functional on a lattice F that is comonotonically additive and monotonic. Then I is continuous iff for any decreasing sequence (Xn ) and X in F such that for all ω ∈ Ω, there is an nω ∈ IN with Xn (ω) ≤ X(ω) for all all n ≥ nω , limn→∞ I(Xn ) ≤ I(X). Proof. Necessity. Assume that I is continuous. Take any decreasing sequence (Xn ) and X in F satisfying the required conditions. Since Xn ∨ X decreasingly tends to X, we have lim I(Xn ) ≤ lim I(Xn ∨ X) = I(X).
n→∞
n→∞
Sufficiency. Take any decreasing sequence (Xn ) and X in F such that limn→∞ Xn (ω) = X(ω) for all ω. Fix an > 0. Since for all ω there is an nω with Xn (ω) ≤ X(ω) + , by assumption limn→∞ I(Xn ) ≤ I(X + ) = I(X) + I(1). Let go to zero, we have limn→∞ I(Xn ) ≤ I(X). But the inverse inequality is also 2 true by monotonicity. Hence, limn→∞ I(Xn ) = I(X). Proof of Theorem 7.2 Suppose that I is a quasi-integral on F . Let A = [X ≥ c] ∈ S. Put XnA = (1 − n(c − X)+ )+ , then XnA ∈ F and XnA decreasingly tends to IA . Since I is monotone, we denote by µ(A) the limit of I(XnA ). We will show that the definition of µ(A) in independent of the expression of A. In fact, if A = [Y ≥ b] with Y ∈ F and let YnA = (1 − n(b − Y )+ )+ , by Lemma 5.6 one can show that for any fixed m, lim I(XnA ∨ YmA ) ≤ I(YmA ).
n→∞
Consequently we have limn→∞ I(XnA ) ≤ limm→∞ I(YmA ). By symmetry the equality holds. Thus the set function µ on S is well-defined. It is easy to check that µ is indeed a capacity. We are going to show that µ is upper-continuous. Let (An ) be a decreasing sequence of sets in S and A ∈ S with ∩∞ n=1 An = A. By definition of S there are a sequence of functions (Xn ) and X ∈ F such that An = [Xn ≥ cn ] and [X ≥ c]. Since F is a Stone lattice, we may assume, without loss of generality, that (Xn )
290
Jia-An Yan
is a decreasing sequence, that Xn ≤ 1 and X ≤ 1, and that An = [Xn = 1] and A = [X = 1]. Let > 0. By the definition of µ, there are some m and, without loos of generality, an increasing sequence (mn ) of integers with mn ≥ n such that A An )| < , |µ(An ) − I(Xm )| < , for all n. |µ(A) − I(Xm n An A ) and Xm satisfy the condition in Lemma 7.2. If We claim that the sequence (Xm n A / A, since ∩∞ ω ∈ A, then Xm (ω) = 1; and if ω ∈ n=1 An = A, there is an nω such / An that ω ∈ / An for all n ≥ nω . By the definition of XkAn , for any fixed n, if ω ∈ An then for sufficiently large kn one has XkAn (ω) = 0 for all k ≥ kn . Since (Xm ) is a n An ) = 0 for all n ≥ max{n , k }. By decreasing sequence, for ω ∈ / A one has (Xm ω nω n Lemma 5.6 we obtain An A ) ≤ I(Xm ). lim I(Xm n
n→∞
Since is arbitrary, we must have limn→∞ µ(An ) ≤ µ(A). By monotonicity of µ this implies upper-continuity of µ. Now we show that I satisfies conditions (iv) and (v) of Theorem 7.1 for X ∈ F with X ≥ 0. In fact, (iv) follows from I(X − X ∧ a) = I(X) − I(X ∧ a) (due to comonotonic additivity of I) and continuity of I. (v) is trivial, because X is bounded by a constant K > 0 and X ∧ b = X for b ≥ K. Now assume that c > 0, A = [X ≥ c] and Z ≤ IA ≤ W with Z, W ∈ F , Z, W ≥ 0. We will show that I(Z) ≤ µ(A) ≤ I(W ). Since (1 − n(c − X)+ )+ ≥ IA ≥ Z, we have that µ(A) = limn→∞ I((1 − n(c − X)+ )+ ) ≥ I(Z). On the other hand, by Lemma 7.1 we have µ(A) = I(XnA ) ≤ I(W ). Consequently, for any X ∈ F with X ≥ 0, if we approximate X with functions un (X), where n2 −1 1 I i , un (x) := n 2 i=1 x≥ 2n n
then, using the fact that I(Y ) ≤ µ([X ≥ c]) ≤ I(Z) for Y ≤ I[X≥c] ≤ Z with Y, Z ∈ F , Y, Z ≥ 0, the same proof as in Theorem 7.1 gives that I(X) is the Choquet integral of X w.r.t. µ. Now assume X ∈ F. Let K > 0 such that K + X ≥ 0. Then by comonotonic additivity of I and Choquet integral we have K + I(X) = I(K + X) = (K + X)dµ = K + Xdµ. Thus I(X) = Xdµ. Using the expression of Choquet integral in terms of Gµ,X and the up-continuity of µ one can show the uniqueness of µ representing of I. On the other hand, suppose that a functional I is defined by the Choquet integral wr.r.t. an upper-continuous capacity µ. It is clear that I is monotonic and comonotonically additive. So we only have to prove that I is continuous. Let (Xn ) be a decreasing sequence in F with the limit X ∈ F . Since [Xn ≥ t] decreasingly tends to [X ≥ t] and µ is upper-continuous, we have limn→∞ µ([Xn ≥ t]) = µ([X ≥ t]).
A Short Presentation of Choquet Integral
291
Using the expression of Choquet integral in terms of Gµ,X we can apply the mono2 tone convergence theorem to conclude limn→∞ Xn dµ = Xdµ.
References 1. Dempster, A. Upper and lower probability induced by a multi-valued mapping. Annals of Mathematical Statistics 38, 325-339, (1967). 2. Denneberg, D. Non-Additive Measure and Integral. Kluwer Academic Publishers, Boston, (1994). 3. Greco, G. Sulla rappresentazione di funzionali mediante integrali. Rend. Sem. Math. Univ. Padova 66, 21-42, (1982). 4. Huber, P. The use of Choquet capacity in statistics. Bull. Institut. Internat. Statist. 45, 181-191, (1973). 5. Schmeidler, D. Integral representation without additivity. Proceedings of the American Mathematical Society 97,255-261, (1986). 6. Shafer, G. A Mathematical Theory of Evidence. Princeton Univ. Press. Princeton, (1976). 7. Choquet, G. Theory of Capacity. Ann. Inst. Fourier. (Grenoble) 5, 131-295, (1953). 8. Zhou, L. Integral representation of continuous comonotonically additive functionals. Transactions of American Mathematical Society 350(5), 1811-1822, (1998). 9. Yaari, M.E. The dual theory of choice under risk. Econometrica 55, 95-115, (1987).