Problem Books in Mathematics
Edited by P. Winkler
For further volumes: http://www.springer.com/series/714
Dmytro Gusak · Alexander Kukush · Alexey Kulik· Yuliya Mishura · Andrey Pilipenko
Theory of Stochastic Processes With Applications to Financial Mathematics and Risk Theory
Dmytro Gusak, Institute of Mathematics, Ukrainian National Academy of Sciences, Kyiv 01601, Ukraine, [email protected]
Alexander Kukush, Department of Mathematical Analysis, Faculty of Mechanics and Mathematics, National Taras Shevchenko University of Kyiv, Kyiv 01033, Ukraine, [email protected]
Alexey Kulik, Institute of Mathematics, Ukrainian National Academy of Sciences, Kyiv 01601, Ukraine, [email protected]
Yuliya Mishura, Department of Probability Theory and Mathematical Statistics, Faculty of Mechanics and Mathematics, National Taras Shevchenko University of Kyiv, Kyiv 01033, Ukraine, [email protected]
Andrey Pilipenko, Institute of Mathematics, Ukrainian National Academy of Sciences, Kyiv 01601, Ukraine, [email protected]
Series Editor Peter Winkler Department of Mathematics Dartmouth College Hanover, NH 03755-3551 USA
[email protected] ISSN 0941-3502 ISBN 978-0-387-87861-4 e-ISBN 978-0-387-87862-1 DOI 10.1007/978-0-387-87862-1 Springer New York Dordrecht Heidelberg London Library of Congress Control Number: 2009939131 Mathematics Subject Classification (2000): 60-xx:60Gxx 60G07 60H10 91B30 c Springer Science+Business Media, LLC 2010 All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights. Printed on acid-free paper Springer is part of Springer Science+Business Media (www.springer.com)
To our families
Preface
This collection of problems is planned as a textbook for university courses in the theory of stochastic processes and related special courses. The problems in the book cover a wide spectrum of difficulty levels and can be useful for readers at various levels of mastery of the theory of stochastic processes. Together with technical and illustrative problems intended for beginners, the book contains a number of problems of a theoretical nature that can be useful for undergraduate and postgraduate students who pursue advanced studies in the theory of stochastic processes and its applications. Another important aim of the book is to provide teaching staff with an efficient tool for preparing seminar studies, tests, and exams for university courses in the theory of stochastic processes and related topics.

While composing the book, the authors have partially used the collections of problems in probability theory [16, 65, 75, 83]. Some exercises and problems from the monographs and textbooks [4, 9, 19, 22, 82] were also used. At the same time, a large part of the problem book contains original material.

The book is organized as follows. The problems are collected into chapters, each chapter being devoted to a certain topic. At the beginning of each chapter, the theoretical grounds for the corresponding topic are given briefly, together with a bibliography that the reader can use in order to study the topic in more detail. For most of the problems, either hints or complete solutions (or answers) are given, and some of the problems are provided with both hints and solutions (answers). However, the authors do not recommend that a reader use the hints systematically, because solving a problem without assistance is much more useful than using a ready-made idea. Some statements of particular theoretical interest are formulated in the theoretical grounds, and their proofs are given as problems for the reader. Such problems are supplied with either complete solutions or detailed hints.

In order to work with the problem book efficiently, a reader should be acquainted with probability theory, calculus, and measure theory within the scope of the respective university courses. Standard notions, such as random variable, measurability, independence, Lebesgue measure and integral, and so on, are used without additional discussion. All the new notions and statements required for solving the problems are given either in the theoretical grounds or directly in the formulations of the problems. However, a notion is sometimes used in the text before its formal definition. For instance, the Wiener and Poisson processes are processes with independent increments and thus are formally introduced in the theoretical grounds for Chapter 5, but these processes are used widely in the problems of Chapters 2 to 4. The authors recommend that a reader who comes across an unknown notion or object use the Index in order to find the corresponding formal definition. The same recommendation concerns the standard abbreviations and symbols listed at the end of the book.

Some problems in the book form cycles: the solutions of some of them rely on statements of other problems or on auxiliary constructions described in preceding solutions. Sometimes, on the contrary, it is proposed to prove the same statement within different problems using essentially different techniques. The authors recommend that a reader pay particular attention to these fruitful internal links between various topics of the theory of stochastic processes.

Every part of the book was composed substantially by one author. Chapters 1–6 and 16 were composed by A. Kulik; Chapters 7, 12–15, 18, and 19 by Yu. Mishura; Chapters 8–10 by A. Pilipenko; Chapter 17 by A. Kukush; and Chapter 20 by D. Gusak. Chapter 11 was prepared jointly by D. Gusak and A. Pilipenko. At the same time, every author has contributed to other parts of the book by proposing separate problems or cycles of problems, improving preliminary versions of the theoretical grounds, and editing the final text.

The authors would like to express their deep gratitude to M. Portenko and A. Ivanov for their careful reading of a preliminary version of the book and valuable comments that led to significant improvement of the text. The authors are also grateful to T. Yakovenko, G. Shevchenko, O. Soloveyko, Yu. Kartashov, Yu. Klimenko, A. Malenko, and N. Ryabova for their assistance in translation, preparing files and pictures, and composing the subject index and references.

The theory of stochastic processes is an extensive discipline, and the authors understand that the problem book in its current form may attract critical remarks from readers, concerning either the structure of the book or the content of separate chapters. While publishing the problem book in its current form, the authors remain open to remarks, comments, and suggestions, and express in advance their gratitude to all their correspondents.

Kyiv, December 2008
Dmytro Gusak Alexander Kukush Alexey Kulik Yuliya Mishura Andrey Pilipenko
Contents
1. Definition of stochastic process. Cylinder σ-algebra, finite-dimensional distributions, the Kolmogorov theorem
2. Characteristics of a stochastic process. Mean and covariance functions. Characteristic functions
3. Trajectories. Modifications. Filtrations
4. Continuity. Differentiability. Integrability
5. Stochastic processes with independent increments. Wiener and Poisson processes. Poisson point measures
6. Gaussian processes
7. Martingales and related processes in discrete and continuous time. Stopping times
8. Stationary discrete- and continuous-time processes. Stochastic integral over measure with orthogonal values
9. Prediction and interpolation
10. Markov chains: Discrete and continuous time
11. Renewal theory. Queueing theory
12. Markov and diffusion processes
13. Itô stochastic integral. Itô formula. Tanaka formula
14. Stochastic differential equations
15. Optimal stopping of random sequences and processes
16. Measures in functional spaces. Weak convergence, probability metrics. Functional limit theorems
17. Statistics of stochastic processes
18. Stochastic processes in financial mathematics (discrete time)
19. Stochastic processes in financial mathematics (continuous time)
20. Basic functionals of the risk theory
Appendix: List of abbreviations. List of probability distributions. List of symbols. References
Index

Each chapter consists of the sections Theoretical grounds, Bibliography, Problems, Hints, and Answers and Solutions.
1 Definition of stochastic process. Cylinder σ -algebra, finite-dimensional distributions, the Kolmogorov theorem
Theoretical grounds

Let (Ω, F, P) be a probability space, (X, X) be a measurable space, and T be some set.

Definition 1.1. A random function X with phase space X and parameter set T is a function X : T × Ω ∋ (t, ω) → X(t, ω) ∈ X such that for any t ∈ T the mapping X(t, ·) : ω → X(t, ω) is F–X-measurable.

Hereinafter the mapping X(t, ·) is denoted by X(t). According to commonly accepted terminology, it is a random element taking values in X. The definition introduced above is obviously equivalent to the following one.

Definition 1.2. A random function X with phase space X and parameter set T is a family of random elements {X(t), t ∈ T} with values in X, indexed by points of T.

A random function, as defined in Definitions 1.1 and 1.2, is sometimes also called a stochastic (random) process, but usually the term stochastic process is reserved for the case where T is an interval or a ray on the real axis R. A random sequence (or stochastic process with discrete time) is a random function defined on T ⊂ Z. A random field is a random function defined on T ⊂ R^d, d > 1. For a fixed ω ∈ Ω, the function X(·, ω) : T ∋ t → X(t, ω) is called a trajectory or a realization of the random function X.

Denote by X^{⊗m} = X ⊗ ··· ⊗ X the product of m copies of the σ-algebra X, that is, the least σ-algebra of subsets of X^m that contains every set of the form A_1 × ··· × A_m, A_1, ..., A_m ∈ X.

Definition 1.3. For given values m ≥ 1 and t_1, ..., t_m ∈ T, the (m-dimensional) finite-dimensional distribution P^X_{t_1,...,t_m} of the random function X is the joint distribution of the random elements X(t_1), ..., X(t_m) or, equivalently, the distribution of the vector (X(t_1), ..., X(t_m)) considered as a random element with values in (X^m, X^{⊗m}). The set {P^X_{t_1,...,t_m}, t_1, ..., t_m ∈ T, m ≥ 1} is called the set (or the family) of finite-dimensional distributions of the random function X.
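As a small illustration (added here, not part of the original text), take the process X(t) = ηt of Problem 1.1(a) with a single driving random variable η ~ N(0, 1). Its one-dimensional distributions are P^X_t = N(0, t²), and every m-dimensional distribution is the (degenerate) Gaussian distribution of the vector (ηt_1, ..., ηt_m), which is concentrated on the line {(x_1, ..., x_m) : x_j = c t_j, c ∈ R}.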
Theorem 1.1. (The Kolmogorov theorem on finite-dimensional distributions) Let X be a complete separable metric space and X be its Borel σ-algebra. Suppose a family {P_{t_1,...,t_m}, t_1, ..., t_m ∈ T, m ≥ 1} is given such that, for any m ≥ 1 and t_1, ..., t_m ∈ T, P_{t_1,...,t_m} is a probability measure on (X^m, X^{⊗m}). The following consistency conditions are necessary and sufficient for a random function to exist such that the family {P_{t_1,...,t_m}, t_1, ..., t_m ∈ T, m ≥ 1} is the family of finite-dimensional distributions for this function.
(1) For any m ≥ 1, t_1, ..., t_m ∈ T, B_1, ..., B_m ∈ X and an arbitrary permutation π : {1, ..., m} → {1, ..., m},
P_{t_1,...,t_m}(B_1 × ··· × B_m) = P_{t_π(1),...,t_π(m)}(B_π(1) × ··· × B_π(m)) (permutation invariance).
(2) For any m > 1, t_1, ..., t_m ∈ T, B_1, ..., B_{m−1} ∈ X,
P_{t_1,...,t_m}(B_1 × ··· × B_{m−1} × X) = P_{t_1,...,t_{m−1}}(B_1 × ··· × B_{m−1}) (projection invariance).

The random function provided by the Kolmogorov theorem can be constructed on a special probability space. Further on, we describe the construction of this probability space. Let Ω = X^T be the space of all functions ω : T → X.

Definition 1.4. A set A ⊂ Ω is called a cylinder set if it has the representation
A = {ω ∈ Ω | (ω(t_1), ..., ω(t_m)) ∈ B}    (1.1)
for some m ≥ 1, t_1, ..., t_m ∈ T, B ∈ X^{⊗m}.

A cylinder set has many representations of the form (1.1). A set B in any representation (1.1) of the cylinder set A is called a base (or basis) of A. The class of all cylinder sets is denoted by C(X, T) or simply by C. This class is an algebra, but, in general, it is not a σ-algebra (see Problem 1.35). The minimal σ-algebra σ(C) that contains this class is called the σ-algebra generated by cylinder sets, or the cylinder σ-algebra. The random function in the Kolmogorov theorem can be defined on the space Ω = X^T with the σ-algebra F = σ(C), X(t, ω) = ω(t), t ∈ T, ω ∈ Ω, and probability P constructed in some special way (see, for instance, [79], Chapter 2 or [9], Appendix 1).

The cylinder σ-algebra has the following useful characterization (see Problem 1.29).

Theorem 1.2. A set A ⊂ X^T belongs to σ(C(X, T)) if and only if there exist a sequence (t_n)_{n=1}^∞ ⊂ T and a set B ∈ σ(C(X, N)) such that A = {ω | (ω(t_n))_{n=1}^∞ ∈ B}.
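To make the consistency conditions concrete, here is a brief worked check (added for illustration; it is not part of the original text) for the family generated by the process X(t) = ηt from the example above: for Borel sets B_1, ..., B_m,
P_{t_1,...,t_m}(B_1 × ··· × B_m) = P(ηt_1 ∈ B_1, ..., ηt_m ∈ B_m).
Permutation invariance holds because the event on the right-hand side does not change when the pairs (t_j, B_j) are permuted simultaneously, and projection invariance holds because the condition ηt_m ∈ R is always satisfied:
P_{t_1,...,t_m}(B_1 × ··· × B_{m−1} × R) = P(ηt_1 ∈ B_1, ..., ηt_{m−1} ∈ B_{m−1}) = P_{t_1,...,t_{m−1}}(B_1 × ··· × B_{m−1}).
By Theorem 1.1, any consistent family of this kind is the family of finite-dimensional distributions of some random function.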
Bibliography [9], Chapter I; [24], Volume 1, Chapter I, §4; [25], Chapter II, §2; [15], Chapter II, §§1,2; [79], Chapters 1,2.
Problems

1.1. Let η be a random variable with the distribution function F. Prove that X(t) is a stochastic process if (a) X(t) = ηt; (b) X(t) = min(η, t); (c) X(t) = max(η, t²); (d) X(t) = sign(η + t), where sign x = 1 for x ≥ 0 and sign x = −1 for x < 0. Draw the trajectories of the process X. Find the one-dimensional distributions of the process X.

1.2. Let τ be a random variable with uniform distribution on [0, 1] and {X(t), t ∈ [0, 1]} be the waiting process corresponding to this variable; that is, X(t) = 1I_{t≥τ}, t ∈ [0, 1]. Find all (a) one-dimensional; (b) two-dimensional; (c) m-dimensional distributions of the process X.

1.3. Two devices start to operate at the instant of time t = 0. They operate independently of each other for random periods of time and after that they shut down. The operating time of the ith device has a distribution function F_i, i = 1, 2. Let X(t) be the number of operating devices at the instant t. Find one- and two-dimensional distributions of the process {X(t), t ∈ R_+}.

1.4. Let ξ_1, ..., ξ_n be independent identically distributed random variables with distribution function F, and
X(x) = (1/n) #{k | ξ_k ≤ x} = (1/n) ∑_{k=1}^{n} 1I_{ξ_k ≤ x}, x ∈ R
(remark that X(·) ≡ F_n^*(·) is the empirical distribution function based on the sample ξ_1, ..., ξ_n). Find all (a) one-dimensional; (b) two-dimensional; (c) m-dimensional distributions of the process X. Here and below, # denotes the number of elements of a set.

1.5. Is it possible for stochastic processes {X_1(t), X_2(t), t ≥ 0} to have (a) identical one-dimensional distributions, but different two-dimensional ones; (b) identical two-dimensional distributions, but different three-dimensional ones?

1.6. Let Ω = T = R_+, F = B(R_+), A ⊂ R_+. Here and below, B(X) denotes the Borel σ-algebra in X. Prove that: (1) X_A(t, ω) = 1I_{t=ω} · 1I_{ω∈A} is a stochastic process for an arbitrary set A. (2) Y_A(t, ω) = 1I_{t≥ω} · 1I_{ω∈A} is a random process if and only if A ∈ B(R_+). Depict all possible realizations of the processes X_A, Y_A.
1.7. At the instant of failure of some unit in a device, this unit is immediately replaced by a reserve one. Nonfailure operating times for each unit are random variables, jointly independent and exponentially distributed with parameter α > 0. Let X(t) be the number of failures up to the time moment t. Find the finite-dimensional distributions of the process X(t), if there are (a) n reserve units; (b) an infinite number of reserve units.

1.8. Let a random variable ξ have distribution function F. Denote
F^{[−1]}(x) = inf{y | F(y) > x}, x ∈ [0, 1]
(the function F^{[−1]} is called the generalized inverse function for F, or the quantile transformation of F), and set ζ = F^{[−1]}(ε), where ε is a random variable uniformly distributed on [0, 1]. Prove that ζ has the same distribution as ξ.

1.9. Prove that it is possible to construct a sequence of independent identically distributed random variables defined on the probability space Ω = [0, 1], F = B([0, 1]), P = λ¹|_{[0,1]}, which (a) take the values 0 and 1 with probabilities ½; (b) are uniformly distributed on [0, 1]; (c) have an arbitrary distribution function F. Here and below, λ¹|_{[0,1]} denotes the restriction of the Lebesgue measure λ¹ to [0, 1].

1.10. Prove that it is impossible to construct on the probability space Ω = [0, 1], F = B([0, 1]), P = λ¹|_{[0,1]} a family of independent identically distributed random variables {ξ_t, t ∈ [0, 1]} with a nondegenerate distribution.

1.11. Let μ, ν be distributions on R² such that μ(R × A) = ν(A × R) for every A ∈ B(R). Prove that it is possible to construct random variables ξ, η, ζ defined on some probability space in such a way that the joint distribution of ξ and η equals μ, and the joint distribution of η and ζ equals ν.

1.12. Assume a two-parameter family of distributions {μ_{m,n}, m, n ≥ 1} on R² is given, consistent in the sense that (a) for any A, B ∈ B(R) and m, n ≥ 1, μ_{m,n}(A × B) = μ_{n,m}(B × A); (b) for any A ∈ B(R) and l, m, n ≥ 1, μ_{n,m}(A × R) = μ_{n,l}(A × R). Is it true that for any such family there exists a sequence of random variables {ξ_n, n ≥ 1} satisfying the relations μ_{m,n}(C) = P((ξ_m, ξ_n) ∈ C) for any m, n ≥ 1 and C ∈ B(R²)?

1.13. Let {X_n, n ≥ 1} be a random sequence. Prove that the following extended random variables (i.e., variables with possible values +∞ and −∞) are measurable with respect to the σ-algebra generated by cylinder sets: (a) sup_n X_n; (b) lim sup_n X_n; (c) the number of partial limits of the sequence {X_n}.

1.14. In the previous problem, let the random variables in the sequence {X_n, n ≥ 1} be independent. Which ones among the extended variables presented in items (a)–(c) are degenerate, that is, take some value from R ∪ {−∞, +∞} with probability 1?
1.15. Suppose that in Problem 1.13 the random variables {X_n, n ≥ 1} may be dependent, but for some m ≥ 2 and for any l = 1, ..., m the random variables {X_{nm+l}, n ≥ 0} are independent. Which ones among the extended variables presented in items (a)–(c) of Problem 1.13 are degenerate?

1.16. Let {ξ_n, n ≥ 1} be a sequence of i.i.d. random variables. Indicate such a sequence {a_n, n ≥ 1} ⊂ R_+ that lim sup_{n→+∞} ξ_n/a_n = 1 almost surely, if (a) ξ_n ∼ N(0, σ²); (b) ξ_n ∼ Exp(λ); (c) ξ_n ∼ Pois(λ).

1.17. Let {X(t), t ∈ R_+} be a stochastic process with right continuous trajectories, and let a ∈ C(R_+) be some deterministic function. Prove that the following variables are extended random variables (that is, measurable functions from Ω to R ∪ {−∞, +∞}): (a) sup_{t∈R_+} X(t)/a(t); (b) lim sup_{t→+∞} X(t)/a(t); (c) the number of partial limits of the function X(·)/a(·) as t → +∞; (d) Var(X(·), [a, b]) (the variation of X(·) on the interval [a, b]).

1.18. Are the random variables presented in the previous problem measurable for an arbitrary stochastic process without any additional conditions on its trajectories?

1.19. Suppose that a random process {X(t), t ∈ [0, 1]} has continuous trajectories. Prove that the following sets are measurable.
A = {ω ∈ Ω | X(t, ω), t ∈ [0, 1] satisfies the Lipschitz condition}.
B = {ω ∈ Ω | min_{t∈[0,1]} X(t, ω) < 7}.
C = {ω ∈ Ω | ∫_0^1 X²(s, ω) ds > 3 max_{s∈[0,1]} X(s, ω)}.
D = {ω ∈ Ω | ∃ t ∈ [0, 1) : X(t, ω) = 1}.
E = {ω ∈ Ω | X(1/2, ω) + 3 sin X(1, ω) ≤ 0}.
F = {ω ∈ Ω | X(t, ω), t ∈ [0, 1] is monotonically nondecreasing}.
G = {ω ∈ Ω | ∃ t_1, t_2 ∈ [0, 1], t_1 ≠ t_2 : X(t_1, ω) = X(t_2, ω) = 0}.
H = {ω ∈ Ω | X(t, ω), t ∈ [0, 1) is monotonically increasing}.
I = {ω ∈ Ω | at some point the trajectory X(·, ω) is tangent from above to the axis Ox; that is, there exists an interval [τ_1, τ_2] ⊂ [0, 1] such that X(t, ω) ≥ 0 for t ∈ [τ_1, τ_2] and min_{t∈[τ_1,τ_2]} X(t, ω) = 0}.

1.20. Let {X_n(t), n ≥ 1, t ∈ [0, 1]} be a sequence of random processes with continuous trajectories. Prove that the set {ω ∈ Ω | ∑_n X_n(t, ω) converges uniformly on [0, 1]} is a random event.

1.21. Let Γ ⊂ R be an open set and suppose that the trajectories of a process {X(t), t ∈ R_+} are right continuous and have left limits. (1) Are the following functions extended random variables: (a) τ^Γ ≡ sup{t : ∀s ≤ t, X(s) ∉ Γ}; (b) τ_Γ ≡ inf{t : X(t) ∈ Γ}? (2) Prove that τ^Γ = τ_Γ (this value is called the hitting time of the set Γ by the process X).
1.22. Solve the previous problem assuming Γ is a closed set. 1.23. Let Γ ⊂ R be some set and τΓ ≡ inf{t : X(t) ∈ Γ } be a hitting time of Γ by a process {X(t),t ∈ R+ } with right continuous trajectories that have left limits. Is the variable τΓ an extended random variable? 1.24. Solve the previous problem assuming Γ is a closed set and trajectories of the process {X(t),t ∈ R+ } do not satisfy any continuity conditions. 1.25. Suppose that trajectories of the process {X(t),t ∈ [0, 1]} are right continuous and have left limits. Prove that for any ω ∈ Ω the trajectory X(·, ω ) is Riemann integrable on [0, 1], and 01 X(t) dt is a random variable. 1.26. Suppose that trajectories of the process {X(t),t ∈ [0, 1]} are right continuous. Is it true that for any ω ∈ Ω trajectory X(·, ω ) is Riemann integrable on [0, 1]? 1.27. Let trajectories of the process {X(t),t ∈ [0, 1]} be right continuous, and τ be a random variable with values in [0, 1]. Prove that the function X(τ ) : Ω ω → X(τ (ω ), ω ) is a measurable function; that is, X(τ ) is a random variable. 1.28. Present an example of random process {X(t),t ∈ [0, 1]} and random variable τ taking values in [0, 1] such that X(τ ) is not a random variable. 1.29. Prove Theorem 1.2. 1.30. Prove that the following subsets of R[0,1] do not belong to the σ -algebra generated by cylinder sets. (a) The set of all continuous functions (b) The set of all bounded functions (c) The set of all Borel functions 1.31. Construct a random process {X(t),t ∈ [0, 1]} defined on probability space Ω = [0, 1], F = B([0, 1]), P = λ 1 |[0,1] in such a way that the set {ω ∈ Ω | function X(·, ω ) is continuous} is not a random event. 1.32. Construct a random process {X(t),t ∈ [0, 1]} defined on probability space Ω = [0, 1], F = B([0, 1]), P = λ 1 |[0,1] in such a way that the set {ω ∈ Ω | function X(·, ω ) is bounded} is not a random event. 1.33. Construct a random process {X(t),t ∈ [0, 1]} defined on probability space Ω = [0, 1], F = B([0, 1]), P = λ 1 |[0,1] in such a way that the set {ω ∈ Ω |function X(·, ω ) is measurable} is not a random event. 1.34. Prove that there exist subsets of the set {0, 1}N that do not belong to the σ algebra generated by cylinder sets (suppose that X = 2{0,1} ). 1.35. Prove that the class C(X, T) of cylinder sets is an algebra; if T is an infinite set and X contains at least two points, then the class is not a σ -algebra.
1.36. Let X = {X(t),t ∈ T} be a random function defined on some probability space (Ω , F, P) with phase space X. Prove that for any subset A ⊂ XT that belongs to the cylinder σ -algebra we have: {ω ∈ Ω |X(·, ω ) ∈ A} ∈ F. 1.37. Let X = {X(t),t ∈ T} be a random function defined on some probability space (Ω , F, P) with phase space X and let {ω ∈ Ω |X(·, ω ) ∈ A} ∈ F for some subset A ⊂ XT . Can we assert that A belongs to cylinder σ -algebra? Compare with the previous problem.
Hints

1.2. For any t_1 < ··· < t_m the random variables X(t_1), ..., X(t_m) can take the values 0 and 1 only. Moreover, if X(t_j) = 1, then X(t_k) = 1 for k = j + 1, ..., m. Therefore the joint distribution of X(t_1), ..., X(t_m) is concentrated at the points z_0 = (1, ..., 1), z_1 = (0, 1, ..., 1), ..., z_{m−1} = (0, ..., 0, 1), z_m = (0, ..., 0) (there are m + 1 such points). The fact that (X(t_1), ..., X(t_m)) = z_j (j = 2, ..., m − 1) means that X(t_{j−1}) = 0 and X(t_j) = 1, which gives τ ∈ (t_{j−1}, t_j].

1.6. (1) X(t, ·) ≡ 0 if t ∉ A and X(t, ·) = 1I_{{t}}(·) if t ∈ A; that is, in both these cases we have a measurable function. (2) {ω | X(t, ω) = 1} = {ω ≤ t, ω ∈ A}.

1.8. {F^{[−1]}(ε) ≤ x} = {inf{y | F(y) > ε} ≤ x} = ⋂_{n=1}^{∞} {inf{y | F(y) > ε} < x + 1/n} = ⋂_{n=1}^{∞} ⋃_{y∈Q, y<x+1/n} {F(y) > ε} = ⋂_{n=1}^{∞} {ε < F((x + 1/n)−)} = {ε ≤ F(x)}.

1.9. (a) Let ε_k(ω) be the kth digit after the point in the binary notation of the number ω ∈ [0, 1]. Then {ε_k, k ∈ N} are i.i.d. random variables that take the values 0 and 1 with probabilities ½ (prove it!); a small numerical illustration of this construction is sketched at the end of the Hints section. (b) The sets N and N² are equinumerous; therefore on [0, 1] there exists a double sequence {ε_{k,j}, (k, j) ∈ N²} of i.i.d. random variables that take the values 0 and 1 with probabilities ½. Take ξ_k = ∑_{j=1}^{∞} 2^{−j} ε_{k,j}. (c) Use item (b) and Problem 1.8.

1.10. Suppose that the set A ∈ B(R) is such that P(ξ_t ∈ A) ∈ (0, 1); then for any t ≠ s the distance in L_2(Ω, F, P) between 1I_{ξ_t∈A} and 1I_{ξ_s∈A} is equal to some constant c_A > 0; that is, the space L_2(Ω, F, P) is not separable. Compare with the properties of the space L_2([0, 1]).

1.11. Let ε_1, ε_2 be independent variables with uniform distribution on [0, 1]; let ξ̃, η̃ be random variables with joint distribution μ; let {F(y), y ∈ R} be the distribution function of η̃; and let {F(x/y), x, y ∈ R} be the conditional distribution function of ξ̃ under the condition {η̃ = y} (i.e., P(ξ̃ ≤ a, η̃ ≤ b) = ∫_{−∞}^{b} F(a/y) dF(y), a, b ∈ R). Denote by F^{[−1]} and F^{[−1]}(·/y) the generalized inverse functions of F and F(·/y), y ∈ R, take ξ = F^{[−1]}(ε_1, ε_2), η = F^{[−1]}(ε_2), and prove that ξ, η have the joint distribution μ (see Problem 1.8). After that, repeat the same procedure (with the same ε_1, ε_2) for variables ζ̃, η̃ such that the joint distribution of η̃, ζ̃ equals ν.
1.12. Find a symmetric matrix of size 3 × 3 which is not nonnegatively defined, but such that all three matrices obtained by deleting the ith row and the ith column (i = 1, 2, 3) of this matrix are nonnegatively defined. Does there exist a Gaussian three-dimensional vector with such a covariance matrix?

1.13. (c) Let N(ω) be the number of partial limits of the sequence {X_n(ω), n ≥ 1}; then {ω ∈ Ω | N(ω) ≥ k} = ⋃_{α_1,...,α_{2k}∈Q, α_1<···<α_{2k}} ⋂_{j=1}^{k} {ω ∈ Ω | {X_n(ω), n ∈ N} ∩ (α_{2j−1}, α_{2j}) is an infinite set}. Prove this and use it for solving the problem.

1.16. The relation lim sup_{n→+∞} ξ_n/a_n = 1 a.s. is equivalent to the following: for every ε > 0, P(lim sup_n {ξ_n ≥ (1 + ε)a_n}) = 0 and P(lim sup_n {ξ_n ≥ (1 − ε)a_n}) = 1 (prove it). Use the Borel–Cantelli lemma. In item (c), for the estimation of the probability of the event {ξ_n ≥ m}, prove that ∑_{k≥m} λ^k/k! ∼ λ^m/m!, m → +∞, and use the Stirling formula.

1.17. This problem can be reduced to Problem 1.13 by using discretization of time; for example,
sup_{t∈R_+} X(t)/a(t) = sup_{n≥1} sup_{m≥1} X(m/n)/a(m/n).

1.21. X(t, ω) ∈ Γ if and only if there exists ε(ω) > 0 such that X(s, ω) ∈ Γ, s ∈ [t, t + ε(ω)). Therefore {τ^Γ < a} = ⋃_{b∈Q, b<a}
1.30. Check that for arbitrary set T ⊂ [0, 1] with no more then countable complement, the continuous (bounded, measurable) function x may be modified on the set T in such a way that it will not be continuous (correspondingly, bounded or measurable) anymore. Use Theorem 1.2. 1.31. Use the process from item (1) of Problem 1.6. 1.32,1.33. Modify the process given in item (1) of Problem 1.6, assuming X(t, ω ) = ∑k≥1 1It=ω /k 1Iω ∈Ak and choosing the sets Ak in the proper way. 1.34. Sequences of 0 and 1 that contain an infinite number of zeros and units can be put into one-to-one correspondence with the numbers from the interval [0, 1], which are not binary-rational. This correspondence is given by [0, 1] x = ∑k 2−k εk , {εk } ∈ {0, 1}N and is measurable together with inverse mapping with respect to B([0, 1]) and σ (C({0, 1}, N)). Now one can use the fact that there exist subsets of [0, 1] which are not Borel sets. 1.35. In order to prove that C(X, T) is an algebra, use the considerations presented in the solution of Problem 1.29. For proving that C(X, T) is not a σ -algebra, use the same considerations as in the hint of Problem 1.30 with a finite set T . 1.36. Let K be a class of such A ⊂ XT , that {ω ∈ Ω | X(·, ω ) ∈ A} ∈ F. Then K ⊃ C(X, T) and K is a σ -algebra. Thus, K ⊃ σ (C(X, T)).
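The following short simulation (added here as an illustration; it is not part of the original text, and the sample size and number of digits are arbitrary) sketches the construction from the hint to Problem 1.9(a): the binary digits of a uniformly distributed ω behave as i.i.d. random variables taking the values 0 and 1 with probabilities ½.

```python
import numpy as np

rng = np.random.default_rng(0)

def binary_digits(omega, k):
    """First k binary digits eps_1, ..., eps_k of omega in [0, 1)."""
    digits = []
    for _ in range(k):
        omega *= 2
        d = int(omega)      # next binary digit
        digits.append(d)
        omega -= d
    return digits

omega = rng.uniform(size=100_000)                      # omega ~ Uniform[0, 1)
eps = np.array([binary_digits(w, 3) for w in omega])   # eps_1, eps_2, eps_3 for each omega

# Each digit should have mean close to 1/2 ...
print(eps.mean(axis=0))
# ... and the eight patterns (eps_1, eps_2, eps_3) should each have frequency close to 1/8.
patterns, counts = np.unique(eps, axis=0, return_counts=True)
print(dict(zip(map(tuple, patterns), np.round(counts / len(eps), 3))))
```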
Answers and Solutions 1.2. For every m ≥ 1, 0 ≤ t1 < · · · < tm ≤ 1, the m-dimensional distribution Pt1 ,...,tm assigns to the points (1, . . . , 1), (0, 1, . . . , 1), (0, . . . , 0, 1), (0, . . . , 0) the weights t1 , (t2 − t1 ), . . . , (tm − tm−1 ), (1 − tm ) correspondingly. 1.4. For every m ≥ 1, 0 ≤ t1 < · · · < tm ≤ 1, the m-dimensional distribution Pt1 ,...,tm is concentrated at the points (l1 /n, . . . , lm /m), where l1 ≤ l2 ≤ · · · ≤ lm and l1 , . . . lm ∈ {0, . . . , n} (we suppose 0 ≤ t1 < · · · < tm ≤ 1). The weight of the point (l1 /n, . . . , lm /m) equals n![F(t1 )]l1 [F(t2 ) − F(t1 )]l2 . . . [F(tm ) − F(tm−1 )]lm [1 − F(tm )]n−l1 −···−lm . l1 !l2 ! . . . lm !(n − l1 − · · · − lm )! 1.5. (a) Yes; (b) yes. 1.12. Not for any. 1.14. Variables given in items (b), (c). 1.15. Variables given in items (b), (c). √ 1.16. (a) an = σ 2 ln n. (b) an = λ −1 ln n. (c) an = φ −1 (ln n), where φ (t) = t lnt, t ∈ [1, +∞).
1.18. No. 1.23. No. 1.26. No. 1.29. Use the “principle of the fitting sets”. That is, consider the class K of the sets A ∈ σ (C(X, T)), for which there exist such (tn )∞ n=1 and B ∈ σ (C(X, N)) that A = {ω | (ω (tn ))∞ n=1 ∈ B}. If we prove that K is a σ -algebra which contains a class that generates σ (C(X, T)), then we prove that K = σ (C(X, T)). It is obvious that K ⊃ C(X, T), thus the second requirement is met. Next, ∅, XT ∈ K because for representation of these sets it is sufficient to choose an arbitrary sequence (tn )∞ n=1 and T put B = ∅ or B = XN , correspondingly. Let A = {ω | (ω (tn ))∞ n=1 ∈ B}, then X \A = ∞ N ∞ {ω | (ω (tn ))n=1 ∈ X \B}. Finally, if A = {ω | (ω (tn ))n=1 ∈ B}, and (s j , j ∈ J) is some at most countable family of points from the parameter set, then A may be preJ ˜ ˜ sented in the form A = {ω | ((ω (tn ))∞ n=1 , (ω (s j )) j∈J ) ∈ B} with B = B × X . Thus, if k k ∞ k {A = {ω | (ω (tn ))n=1 ∈ B }, k ≥ 1} is a countable collection of sets from the class and writing the sets Ak in the form K, then assuming {sn , n ≥ 1} = k {tnk , n ≥ 1} k k ∞ k ˜ ˜k A = {ω | (ω (sn ))n=1 ∈ B }, k ≥ 1, we obtain k A = {ω | (ω (sn ))∞ n=1 ∈ k B } ∈ K. 1.37. No, we can not. For example, a trajectory of the waiting process is bounded for every ω (Problem 1.2), but the set of bounded functions does not belong to σ (C(R, [0, 1])) (Problem 1.30).
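As a numerical cross-check of the answer to Problem 1.2 (this sketch is an addition, not part of the original text; the time points and sample size are arbitrary), one can simulate the waiting process X(t) = 1I_{t≥τ} and compare empirical frequencies with the stated weights; in particular, P(X(t) = 1) = t for t ∈ [0, 1].

```python
import numpy as np

rng = np.random.default_rng(1)

n_paths = 200_000
tau = rng.uniform(size=n_paths)          # tau ~ Uniform[0, 1]
ts = np.array([0.25, 0.5, 0.9])          # arbitrary time points

# Waiting process X(t) = 1 if t >= tau, else 0, evaluated at the chosen time points.
X = (ts[None, :] >= tau[:, None]).astype(int)

print("P(X(t) = 1), empirical:", X.mean(axis=0))   # should be close to ts
print("P(X(t) = 1), theory   :", ts)

# Two-dimensional check at (t1, t2) = (0.25, 0.5): the pair (X(t1), X(t2)) takes
# the values (1,1), (0,1), (0,0) with probabilities t1, t2 - t1, 1 - t2.
pair = X[:, :2]
for pattern, p in [((1, 1), 0.25), ((0, 1), 0.25), ((0, 0), 0.5)]:
    freq = np.mean((pair == pattern).all(axis=1))
    print(pattern, "empirical:", round(freq, 3), "theory:", p)
```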
2 Characteristics of a stochastic process. Mean and covariance functions. Characteristic functions
Theoretical grounds

In this chapter we consider random functions with the phase space being either the real line R or the complex plane C.

Definition 2.1. Assume that E|X(t)| < +∞, t ∈ T. The function {a_X(t) = EX(t), t ∈ T} is called the mean function (or simply the mean) of the random function X. The function X̃(t) = X(t) − a_X(t), t ∈ T, is called the centered (or compensated) function corresponding to the function X.

Recall that the covariance of two real-valued random variables ξ and η, both having the second moment, is defined as cov(ξ, η) = E(ξ − Eξ)(η − Eη) = Eξη − EξEη. If ξ, η are complex-valued and E|ξ|² < +∞, E|η|² < +∞, then cov(ξ, η) = E(ξ − Eξ)(η̄ − Eη̄) = Eξη̄ − EξEη̄ (here the overbar denotes complex conjugation).

Definition 2.2. Assume that E|X(t)|² < +∞, t ∈ T. The function
R_X(t, s) = cov(X(t), X(s)), t, s ∈ T,
is called the covariance function (or simply the covariance) of the random function X. If X, Y are two functions with E|X(t)|² < +∞, E|Y(t)|² < +∞, t ∈ T, then {R_{X,Y}(t, s) = cov(X(t), Y(s)), t, s ∈ T} is called the mutual covariance function of the functions X, Y.

Definition 2.3. Let T be some set and let the function K be defined on T × T and take values in C. The function K is nonnegatively defined if
∑_{j,k=1}^{m} K(t_j, t_k) c_j c̄_k ≥ 0
for any m ∈ N and any t_1, ..., t_m ∈ T, c_1, ..., c_m ∈ C.
This definition is equivalent to the following one.

Definition 2.4. A function K : T × T → C is nonnegatively defined if for any m ∈ N and any t_1, ..., t_m ∈ T the matrix K_{t_1...t_m} = {K(t_j, t_k)}_{j,k=1}^{m} is nonnegatively defined.

Proposition 2.1. The covariance R_X of an arbitrary stochastic process X is nonnegatively defined. And vice versa, if a : T → C and K : T × T → C are some functions and K is nonnegatively defined, then on some probability space there exists a random function X such that a = a_X, K = R_X.

Remark 2.1. Recall that the mean vector and the covariance matrix of a random vector ξ = (ξ_1, ..., ξ_m) are a_ξ = (Eξ_j)_{j=1}^{m} and R_ξ = (cov(ξ_j, ξ_k))_{j,k=1}^{m}, respectively. If the conditions of Proposition 2.1 hold, then for any m ∈ N, t_1, ..., t_m ∈ T the covariance matrix of the vector (X(t_1), ..., X(t_m)) is equal to K_{t_1...t_m} (see Definition 2.4) and the mean vector is equal to a_{t_1...t_m} = (a(t_j))_{j=1}^{m}.

Recall that for a random vector ξ = (ξ_1, ..., ξ_m) with real-valued components, its characteristic function (or, equivalently, the common characteristic function of the random variables ξ_1, ..., ξ_m) is defined by
φ_ξ(z) = E e^{i(ξ,z)_{R^m}} = E e^{i ∑_{j=1}^{m} ξ_j z_j}, z = (z_1, ..., z_m) ∈ R^m.

Theorem 2.1. (The Bochner theorem) An arbitrary function φ : R^m → C is a characteristic function of some random vector if and only if the following three conditions are satisfied.
(1) φ(0) = 1.
(2) φ is continuous in a neighborhood of 0.
(3) For any m ∈ N and z_1, ..., z_m ∈ R^m, c_1, ..., c_m ∈ C,
∑_{j,k=1}^{m} φ(z_j − z_k) c_j c̄_k ≥ 0.

Definition 2.5. Let X be a real-valued random function. For fixed m ≥ 1 and t_1, ..., t_m ∈ T, the common characteristic function of X(t_1), ..., X(t_m) is denoted by φ^X_{t_1,...,t_m} and is called the (m-dimensional) characteristic function of the random function X. The set {φ^X_{t_1,...,t_m}, t_1, ..., t_m ∈ T, m ≥ 1} is called the set (or the family) of finite-dimensional characteristic functions of the random function X.

The mean and covariance functions of a random function do not determine the finite-dimensional distributions of this function uniquely (e.g., see Problem 6.7). On the other hand, the family of finite-dimensional characteristic functions of the random function X is in one-to-one correspondence with its finite-dimensional distributions, because the characteristic function of a random vector determines the distribution of this vector uniquely. The following theorem is a reformulation of the Kolmogorov theorem (Theorem 1.1) in terms of characteristic functions.
Theorem 2.2. Consider a family {φ_{t_1,...,t_m} : R^m → C, t_1, ..., t_m ∈ T, m ≥ 1} such that for any m ≥ 1, t_1, ..., t_m ∈ T the function φ_{t_1,...,t_m} satisfies the conditions of the Bochner theorem. The following consistency conditions are necessary and sufficient for a random function X to exist such that the family {φ_{t_1,...,t_m} : R^m → C, t_1, ..., t_m ∈ T, m ≥ 1} is the family of its finite-dimensional characteristic functions.
(1) For any m ≥ 1, t_1, ..., t_m ∈ T, z_1, ..., z_m ∈ R and any permutation π : {1, ..., m} → {1, ..., m},
φ_{t_1,...,t_m}(z_1, ..., z_m) = φ_{t_π(1),...,t_π(m)}(z_π(1), ..., z_π(m)).
(2) For any m > 1, t_1, ..., t_m ∈ T, z_1, ..., z_{m−1} ∈ R,
φ_{t_1,...,t_m}(z_1, ..., z_{m−1}, 0) = φ_{t_1,...,t_{m−1}}(z_1, ..., z_{m−1}).
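The nonnegative definiteness required by Definition 2.4 and the Bochner theorem can be probed numerically: evaluate the matrix {K(t_j, t_k)} on a finite set of points and look at its smallest eigenvalue. The sketch below is an added illustration (not part of the original text); the chosen grid and test functions are arbitrary, and a finite grid can only refute nonnegative definiteness, never prove it.

```python
import numpy as np

def gram_matrix(K, ts):
    """Matrix {K(t_j, t_k)}_{j,k} from Definition 2.4, evaluated at the points ts."""
    return np.array([[K(t, s) for s in ts] for t in ts])

def refuted_on_grid(K, ts, tol=1e-9):
    """True if the points ts already witness that K is NOT nonnegatively defined."""
    M = gram_matrix(K, ts)
    if not np.allclose(M, M.T):     # a real-valued covariance function must be symmetric
        return True
    return np.linalg.eigvalsh(M).min() < -tol

ts = np.linspace(0.1, 5.0, 40)
print(refuted_on_grid(lambda t, s: min(t, s), ts))              # t ∧ s (Wiener covariance): False
print(refuted_on_grid(lambda t, s: np.exp(-abs(t - s)), ts))    # e^{-|t-s|}: False
print(refuted_on_grid(lambda t, s: np.sin(t + s), ts))          # sin(t+s): True (cf. Problem 2.15)
print(refuted_on_grid(lambda t, s: t**2 + s**2, ts))            # t^2 + s^2: True (cf. Problem 2.15)
```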
Bibliography [9], Chapter II; [24], Volume 1, Chapter IV, §1; [25], Chapter I, §1; [79], Chapter 16.
Problems 2.1. Find the covariance function for (a) the Wiener process; (b) the Poisson process. 2.2. Let W be the Wiener process. Find the mean and covariance functions for the process X(t) = W 2 (t),t ≥ 0. 2.3. Let W be the Wiener process. Find the covariance function for the process X if (a) X(t) = W (1/t) , t > 0. t (b) X(t) = W (e R. ), t2∈ (c) X(t) = W 1 − t , t ∈ [−1, 1]. 2.4. Let W be the Wiener process. Find the characteristic function for W (2)+2W (1). 2.5. Let N be the Poisson process with intensity λ . Find the characteristic function for N(2) + 2N(1). 2.6. Let W be the Wiener process. Find: (a) E(W (t))m , m ∈ N. (b) E exp(2W (1) +W (2)). (c) E cos(2W (1) +W (2)). 2.7. Let N be the Poisson process with intensity λ . Find: (a) P(N(1) = 2, N(2) = 3, N(3) = 5). (b) P(N(1) ≤ 2, N(2) = 3, N(3) ≥ 5). (c) E(N(t) + 1)−1 . (d) EN(t)(N(t) − 1) · · · · · (N(t) − k), k ∈ Z+ .
2.8. Let W be the Wiener process and f ∈ C([0, 1]). Find the characteristic function of the random variable ∫_0^1 f(s)W(s) ds (the integral is defined for every ω in the Riemann sense; see Problem 1.25). Prove that this random variable is normally distributed.

2.9. Let W be the Wiener process, f ∈ C([0, 1]), and X(t) = ∫_0^t f(s)W(s) ds, t ∈ [0, 1]. Find R_{W,X}.
2.10. Let N be the Poisson process, f ∈ C([0, 1]). Find the characteristic functions of the random variables: (a) ∫_0^1 f(s)N(s) ds; (b) ∫_0^1 f(s) dN(s) ≡ ∑ f(s), where the summation is taken over all s ∈ [0, 1] such that N(s) ≠ N(s−).

2.11. Let N be the Poisson process, f, g ∈ C([0, 1]), X(t) = ∫_0^t f(s)N(s) ds, Y(t) = ∫_0^t g(s) dN(s), t ∈ [0, 1]. Find: (a) R_{N,X}; (b) R_{N,Y}; (c) R_{X,Y}.
2.12. Find all one-dimensional and m-dimensional characteristic functions: (a) for the process introduced in Problem 1.2; (b) for the process introduced in Problem 1.4. 2.13. Find the covariance function of the process X(t) = ξ1 f1 (t) + · · · + ξn fn (t), t ∈ R, where f1 , . . . , fn are nonrandom functions, and ξ1 , . . . , ξn are noncorrelated random variables with variances σ12 , . . . , σn2 . 2.14. Let {ξn , n ≥ 1} be the sequence of independent square integrable random variables. Denote an = Eξn , σn2 = Var ξn . (1) Prove that series ∑n ξn converges in the mean square sense if and only if the series ∑n an and ∑n σn2 are convergent. (2) Let { fn (t), t ∈ R}n∈N be the sequence of nonrandom functions. Formulate the necessary and sufficient conditions for the series X(t) = ∑n ξn fn (t) to converge in the mean square for every t ∈ R. Find the mean and covariance functions of the process X. 2.15. Are the following functions nonnegatively defined: (a) K(t, s) = sint sin s; (b) K(t, s) = sin(t + s); (c) K(t, s) = t 2 + s2 (t, s ∈ R)? 2.16. Prove that for α > 2 the function K(t, s) = 12 (t α + sα − |t − s|α ) , t, s ∈ Rm is not a covariance function. 2.17. (1) Let {X(t), t ∈ R+ } be a stochastic process with independent increments and E|X(t)|2 < +∞, t ∈ R+ . Prove that its covariance function is equal to RX (t, s) = F(t ∧ s), t, s ∈ R+ , where F is some nondecreasing function. (2) Let {X(t),t ∈ R+ } be a stochastic process with RX (t, s) = F(t ∧ s), t, s ∈ R+ , where F is some nondecreasing function. Does it imply that X is a process with independent increments? 2.18. Let N be the Poisson process with intensity λ . Let X(t) = 0 when N(t) is odd and X(t) = 1 when N(t) is even. (1) Find the mean and covariance of the process X. (2) Find RN,X .
2.19. Let W and N be the independent Wiener process and Poisson process with intensity λ , respectively. Find the mean and covariance of the process X(t) = W (N(t)). Is X a process with independent increments? 2.20. Find RX,W and RX,N for the process from the previous problem. 2.21. Let N1 , N2 be two independent Poisson processes with intensities λ1 , λ2 , respectively. Define X(t) = (N1 (t))N2 (t) ,t ∈ R+ if at least one of the values N1 (t), N2 (t) is nonzero and X(t) = 1 if N1 (t) = N2 (t) = 0. Find: (a) The mean function of the process X (b) The covariance function of the process X 2.22. Let X,Y be two independent and centered processes and c > 0 be a constant. Prove that RX+Y = RX + RY , R√cX = cRX , RXY = RX RY . 2.23. Let K1 , K2 be two nonnegatively defined functions and c > 0. Prove that the following functions are nonnegatively defined: (a) R = K1 + K2 ; (b) R = cK1 ; (c) R = K1 · K2 . 2.24. Let K be a nonnegatively defined function on T × T. (1) Prove that for every polynomial P(·) with nonnegative coefficients the function R = P(K) is nonnegatively defined. (2) Prove that the function R = eK is nonnegatively defined. (3) When it is additionally assumed that for some p ∈ (0, 1) K(t,t) < p−1 , t ∈ T, prove that the function R = (1 − pK)−1 is nonnegatively defined. 2.25. Give the probabilistic interpretation of items (1)–(3) of the previous problem; that is, construct the stochastic process for which R is the covariance function. 2.26. Let K(t, s) = ts,t, s ∈ R+ . Prove that for an arbitrary polynomial P the function R = P(K) is nonnegatively defined if and only if all coefficients of the polynomial P are nonnegative. Compare with item (1) of Problem 2.24. 2.27. Which of the following functions are nonnegatively defined: (a) K(t, s) = sin(t − s); (b) K(t, s) = cos(t − s); (c) K(t, s) = e−(t−s) ; (d) K(t, s) = e−|t−s| ; 2 4 (e) K(t, s) = e−(t−s) ; (f) K(t, s) = e−(t−s) ? 2.28. Let K ∈ C ([a, b] × [a, b]). Prove that K is nonnegatively defined if and only if the integral operator AK : L2 ([a, b]) → L2 ([a, b]), defined by AK f (t) =
∫_a^b K(t, s) f(s) ds, f ∈ L_2([a, b]),
is nonnegative.
16
2 Mean and covariance functions. Characteristic functions
2.30. Let K(s,t) = F(t − s), t, s ∈ R, where the function F is periodic with period 2π and F(x) = π − |x| for |x| ≤ π . Construct the Gaussian process with covariance K of the form ∑n εn fn (t), where {εn , n ≥ 1} is a sequence of the independent normally distributed random variables. 2.31. Solve the previous problem assuming that F has period 2 and F(x) = (1 − x)2 , x ∈ [0, 1]. 2.32. Denote {τn , n ≥ 1} the jump moments for the Poisson process N(t), τ0 = 0. Let {εn , n ≥ 0} be i.i.d. random variables that have expectation a and variance σ 2 . Consider the stochastic processes X(t) = ∑nk=0 εk , t ∈ [τn , τn+1 ), Y (t) = εn , t ∈ [τn , τn+1 ), n ≥ 0. Find the mean and covariance functions of the processes X,Y. Exemplify the models that lead to such processes. 2.33. A radiation measuring instrument accumulates radiation with the rate that equals a Roentgen per hour, right up to the failing moment. Let X(t) be the reading at point of time t ≥ 0. Find the mean and covariance functions for the process X if X(0) = 0, the failing moment has distribution function F, and after the failure the measuring instrument is fixed (a) at zero point; (b) at the last reading. 2.34. The device registers a Poisson flow of particles with intensity λ > 0. Energies of different particles are independent random variables. Expectation of every particle’s energy is equal to a and variance is equal to σ 2 . Let X(t) be the readings of the device at point of time t ≥ 0. Find the mean and covariance functions of the process X if the device shows (a) Total energy of the particles have arrived during the time interval [0,t]. (b) The energy of the last particle. (c) The sum of the energies of the last K particles. 2.35. A Poisson flow of claims with intensity λ > 0 is observed. Let X(t),t ∈ R be the time between t and the moment of the last claim coming before t. Find the mean and covariance functions for the process X.
Hints 2.1. See the hint to Problem 2.17. 2.4. Because the variables (W (1),W (2)) are jointly Gaussian, the variable W (2) + 2W (1) is normally distributed. Calculate its mean and variance and use the formula for the characteristic function of the Gaussian distribution. Another method is proposed in the following hint. 2.5. N(2) + 2N(1) = N(2) − N(1) + 3N(1). The values N(2) − N(1) and N(1) are Poisson-distributed random variables and thus their characteristic functions are known. These values are independent, that is, the required function can be obtained as a product.
2 Mean and covariance functions. Characteristic functions
17
2.6. (a) If η ∼ N(0, 1), then Eη 2k−1 = 0, Eη 2k = (2k − 1)!! = (2k − 1)(2k − 3) · · · 1 for k ∈ N. Prove and use this for the calculations. (b) Use the explicit formula density. for the Gaussian (c) Use formula cos x = 12 eix + e−ix and Problem 2.4. 2.10. (a) Make calculations similar to those of Problem 2.8. (b) Obtain the characteristic functions of the integrals of piecewise constant functions f and then uniformly approximate the continuous function by piecewise constant ones. 2.17. (1) Let s ≤ t; then values X(t) − X(s) and X(s) are independent which means that they are uncorrelated. Therefore cov(X(t), X(s)) = cov(X(t) − X(s), X(s)) + cov(X(s), X(s)) = cov(X(t ∧ s), X(t ∧ s)). The case t ≤ s can be treated similarly. 2.23. Items (a) and (b) can be proved using the definition. In item (c) you can use the previous problem. 2.24. Proof of item (1) can be directly obtained from the previous problem. For the proof of items (2) and (3) use item (1), Taylor decomposition of the functions x → ex , x → (1 − px)−1 and a fact that the pointwise limit of a sequence of nonnegatively defined functions is also a nonnegatively defined function. (Prove this fact!).
Answers and Solutions 2.1. RW (t, s) = t ∧ s, RN (t, s) = λ (t ∧ s). 2.2. aX (t) = t, RX (t, s) = 2(t ∧ s)2 . 2.3. For arbitrary f : R+ → R+ , the covariance function for the process X(t) = W ( f (t)),t ∈ R+ is equal to RX (t, s) = RW ( f (t), f (s)) = f (t) ∧ f (s). 2.8. Let In = n−1 ∑nk=1 f (k/n)W (k/n). Because the process W a.s. has continuous trajectories and the function f is continuous, the Riemann integral sum In converges to I = 01 f (t)W (t) dt a.s. Therefore φIn (z) → φI (z), n → +∞, z ∈ R. Hence, EeizIn = Eeizn n
−1
∑nk=1 f (k/n)W (k/n)
2 −(2n)−1 zn−1 ∑nj=k f ( j/n)
= ∏e
i ∑nk=1 zn−1 ∑nj=k f ( j/n) (W (k/n)−W ((k−1)/n))
= Ee
→ e−(z
2 /2 1 1 0 t
) (
2
f (s) ds) dt
k=1
Thus I is a Gaussian random variable with zero mean and variance 2.9. RW,X (t, s) =
s
f (r)(t ∧ r) dr. 1
2.10. (a) φ (z) = exp λ 01 eiz t f (s) ds − 1 dt .
(b) φ (z) = exp λ 01 eiz f (t) − 1 dt . 0
,
n → ∞.
1 1 0
t
2
f (s) ds
dt.
18
2 Mean and covariance functions. Characteristic functions
2.11. RN,X (t, s) = λ 2 0s f (r)(t ∧ r) dr, RN,Y (t, s) = λ 2
u∧s t g(r) dr du. 0 f (u) 0
t∧s 0
g(r) dr, RX,Y (t, s) = λ 2 ×
2.12. (a) Let 0 ≤ t1 < · · · < tn ≤ 1; then φt1 ,...,tm (z1 , . . . , zm ) = t1 eiz1 +···+izm + (t2 − t1 )eiz2 +···+izm + · · · + (tm − tm−1 )eizm + (1 − tm ). (b) Let 0 ≤ t1 < · · · < tn ≤ 1, then −1 −1 −1 −1 φt1 ,...,tm (z1 , . . . , zm ) = F(t1 )eiz1 n +···+izm n + (F(t2 ) − F(t1 ))eiz2 n +···+izm n + · · · + (F(tm ) − F(tm−1 ))eizm n
−1
n + (1 − F(tm )) .
2.13. RX (t, s) = ∑nk=1 σk2 fk (t) fk (s). 2.15. (a) Yes; (b) no; (c) no. 2.17. (2) No, it does not. 2.18. (1) aX (t) = 12 1 + e−2λ t , RX (t, s) = 14 e−2λ |t−s| − e−2λ (t+s) . (2) RN,X (t, s) = −λ (t ∧ s)e−2λ s . 2.19. aX ≡ 0, RX (t, s) = λ (t ∧ s). X is the process with independent increments. 2.20.
k(λ t)k (λ t)k ∑ k! + s · ∑ k! , RX,N ≡ 0. k<s k≥s
RX,W (t, s) = E[N(t) ∧ s] = e
−λ t
2.21. aX (t) = exp λ1teλ2 t − (λ1 + λ2 )t ; function RX is not defined because EX 2 (t) = +∞,t > 0. 2.25. There exist several interpretations, let us give two of them. m + The first one: let R = f (K) and f (x) = ∑∞ m=0 cm x with cm ≥ 0, m ∈ Z . Let the radius of convergence of the series be equal to r f > 0 and K(t,t) < r f ,t ∈ R+ . Consider a triangular array {Xm,k , 1 ≤ k ≤ m} of independent centered identically distributed processes with the covariance function K. In addition, let random variable √ ξ be independent of {Xm,k } and Eξ = 0, Dξ = 1. Then the series X(t) = c0 ξ + √ m ∑∞ m=1 cm ∏k=1 Xm,k (t) converges in the mean square for any t and the covariance function of the process X is equal to R. The second one: using the same notations, denote c = ∑∞ k=0 ck , pk = ck /c, k ≥ 0. Let {Xm , m ≥ 1} be a sequence of independent identically distributed centered processes with the covariance function K, and ξ be as above. Let η be {Xm , m ≥ 1}, with the random variable, independent both on ξ and the processes √ P(η = k) = pk , k ∈ Z+ . Consider the process X(t) = c ∏ηk=1 Xk (t) assuming that ∏0k=1 Xk (t) = ξ . Then the covariance function of the process X is equal to R. In particular, the random variable η should have a Poisson distribution in item (2) and a geometric distribution in item (3).
2 Mean and covariance functions. Characteristic functions
19
2.26. Consider the functions Rk = (∂ 2k /∂ t k ∂ sk )R, k ≥ 0. These functions are nonnegatively defined (one can obtain this fact by using either Definition 2.3 or Theorem 4.2). Function Rk can be represented in the form Rk = Pk (K), where the absolute term of the polynomial Pk equals the kth coefficient of the polynomial P multiplied by (k!)2 . Now, the required statement follows from the fact that Q(t,t) ≥ 0 for any nonnegatively defined function Q. 2.27. Functions from the items (b), (d), (e) are nonnegatively defined; the others are not. 2.28. Let K be nonnegatively defined. Then for any f ∈ C([a, b]), (AK f , f )L2 ([a,b]) =
b b
K(t, s) f (t) f (s) dsdt n k(b − a) b−a 2 j(b − a) ,a+ ≥0 K a+ = lim ∑ n→∞ n n n j,k=1 a
a
because every sum under the limit sign is nonnegative. Because C([a, b]) is a dense subset in L2 ([a, b]) the above inequality yields that (AK f , f )L2 ([a,b]) ≥ 0, f ∈ L2 ([a, b]). On the other hand, let (AK f , f )L2 ([a,b]) ≥ 0 for every f ∈ L2 ([a, b]), and let points t1 , . . . ,tm and constants z1 , . . . , zm be fixed. Choose m sequences of continuous functions { fn1 , n ≥ 1}, . . . , { fnm , n ≥ 1} such that, for arbitrary function φ ∈ C([a, b]), ab φ (t) fnj (t) dt → φ (t j ), n → ∞, j = 1, . . . , m. Putting fn = ∑mj=1 z j fnj , we obtain that ∑mj,k=1 z j zk K(t j ,tk ) = limn→∞ ab ab K(t, s) fn (t) fn (s) dsdt = limn→∞ (AK fn , fn ) ≥ 0. 2.29. Statement (a) is a particular case of the theorem on the spectrum of a compact operator. Statement (b) follows from the previous problem and theorem on spectral decomposition of a compact self-adjoint operator.
3 Trajectories. Modifications. Filtrations
Theoretical grounds Definition 3.1. Random functions {X(t),t ∈ T} and {Y (t),t ∈ T}, defined on the same probability space, are called equivalent (or stochastically equivalent), if P(X(t) = Y (t)) = 1 for any t ∈ T. Random functions {X(t),t ∈ T} and {Y (t),t ∈ T}, possibly defined on different probability spaces, are called stochastically equivalent in a wide sense if their corresponding finite-dimensional distributions coincide. A random function Y equivalent to X is called a modification of the random function X. Definition 3.2. Let T be a linearly ordered set. A filtration (or a flow of σ -algebras) on a probability space (Ω , F, P) is a family of σ -algebras F = {Ft ,t ∈ T} that satisfies the condition Fs ⊂ Ft ⊂ F, s,t ∈ T, s ≤ t. Filtration is called complete if every σ -algebra Ft includes all null probability sets from F.
Denote Ft+ = s>t Fs , Ft− = s
α ∈A Gα
denotes the least
Definition 3.3. Filtration F = {Ft ,t ∈ T} is left-hand continuous if Ft− = Ft ,t ∈ T and right-hand continuous if Ft+ = Ft ,t ∈ T. Filtration that is both left-hand and right-hand continuous is continuous. An important class of filtrations contains filtrations generated by stochastic processes (or, more generally, by random functions). Let {X(t),t ∈ T} be a random function and a set T be linearly ordered. For every t ∈ T denote FtX,0 = σ (X(s), s ≤ t) and define σ -algebra FtX as the augmentation of FtX,0 w.r.t. the measure P; that is, FtX = σ (FtX,0 ∪ NP ) with NP = {A ∈ F| P(A) = 0}. Definition 3.4. Filtration FX = {FtX ,t ∈ T} is called the filtration generated by the random function X, or the natural filtration for the function X.
D. Gusak et al., Theory of Stochastic Processes, Problem Books in Mathematics, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-87862-1 3,
21
22
3 Trajectories. Modifications. Filtrations
Sometimes, in the definition of the natural filtration, the augmentation operation is not involved, and one calls natural filtration the family FX,0 = {FtX,0 ,t ∈ T}. The Problems 3.37, 3.44 explain the role of the augmentation operation. We remark that the augmentation of a σ -algebra G differs slightly from the completion of G (the latter is defined as the least σ -algebra that contains all sets from G and all subsets of the sets from NP ), but if the probability P is complete then these two σ -algebras coincide. Definition 3.5. Let the parametric set T be endowed by a σ -algebra T. A random function {X(t),t ∈ T} is called measurable if X is measurable as a function T× Ω → X; that is, {(t, ω )| X(t, ω ) ∈ B} ∈ T ⊗ F, B ∈ X. Further on we assume T, X to be metric spaces with the metrics d and ρ , respectively, and T, X to be corresponding Borel σ -algebras. Theorem 3.1. Assume that the space T is separable, the space X is complete, and random function {X(t),t ∈ T} is continuous in probability; that is, for any t ∈ T and ε > 0, P(ρ (X(t), X(s)) > ε ) → 0, s → t. Then there exists a modification of the function X that is measurable (i.e., a measurable modification). Remark 3.1. In most textbooks the statement of Theorem 3.1 is formulated and proved under the condition that the space T is compact or σ -compact (e.g., T = R, T = Rd , etc.). For the separable space T, this statement still holds true. This follows from the result of Problem 3.45, published primarily in the paper [86] (see also Problem 3.46). Definition 3.6. A random function X is called separable if there exist a countable dense subset T0 ⊂ T and a set N ∈ F with P(N) = 0 such that, for every open set G ⊂ T and closed set F ⊂ X, {ω | X(t, ω ) ∈ F, t ∈ G}{ω | X(t, ω ) ∈ F, t ∈ G ∩ T0 } ⊂ N. The set T0 is called the set of separability for the function X. Theorem 3.2. Let the space T be separable and the space X be compact. Then every random function {X(t),t ∈ T} has a modification being a separable random function (i.e., a separable modification). Let us consider the question of existence of a modification of a random function with all its trajectories being continuous functions (i.e., a continuous modification). Theorem 3.3. Assume that the space T is compact, the space X is complete, and a random function {X(t),t ∈ T} is continuous in probability. Then the continuous modification for the function X exists if and only if the following condition holds true,
3 Trajectories. Modifications. Filtrations
⎛ P⎝
∞ ∞
23
⎞
n=1 m=1 s,t∈T0 ,d(t,s)<
1 ⎠ = 1, ρ (X(t), X(s)) < n 1
m
where T0 is an arbitrary countable dense subset of T. In what follows, we assume the space X to be complete. The following theorem gives a sufficient condition for a stochastic process to possess a continuous modification, formulated in terms of two-dimensional distributions of the process. Theorem 3.4. Let {X(t),t ∈ [0, T ]} be a stochastic process. Suppose there exist a nondecreasing function {g(h), h ∈ [0, T ]} and some function {q(c, h), c ∈ R+ , h ∈ [0, T ]} such that P(ρ (X(t), X(t + h)) > cg(h)) ≤ q(c, h), h > 0, t ∈ [h, T − h], ∞
∞
n=0
n=1
∑ g(2−n T ) < +∞, ∑ 2n q(c, 2−n T ) < +∞, !
c ∈ R+ .
(3.1)
Then the process X has a continuous modification. As a corollary of Theorem 3.4, one can obtain the well-known sufficient Kolmogorov condition for existence of a continuous modification. Theorem 3.5. Let {X(t),t ∈ [0, T ]} be a stochastic process satisfying Eρ α (X(t), X(s)) ≤ C|t − s|1+β , t, s ∈ [0, T ] with some positive constants α , β ,C. Then the process X possesses a continuous modification. Under the sufficient Kolmogorov condition, the properties of the trajectories of the process X can be specified in more detail. Recall that function f (t),t ∈ [0, T ] is said to satisfy the H¨older condition with index γ (γ > 0) if sup |t − s|−γ ρ ( f (t), f (s)) < +∞. t=s
Theorem 3.6. Under conditions of Theorem 3.5, for arbitrary γ < β /α the process X has a modification with the trajectories satisfying the H¨older condition with index γ . Analogues of the sufficient Kolmogorov condition are available for random functions defined on parametric sets that may have more complicated structure than an interval. Let us give a version of this condition for random fields. Theorem 3.7. Let {ξ (x), x = (x1 , . . . , xd ) ∈ D ⊂ Rd } be a random field such that Eρ α (ξ (x), ξ (y)) ≤ Cx − yd+β , x, y ∈ D with some positive constants α , β ,C. Then the field ξ possesses a continuous modification.
24
3 Trajectories. Modifications. Filtrations
Remark 3.2. In numerous models and examples, there arises a wide class of stochastic processes that do not possess continuous modification because of the jump discontinuities of their trajectories. The most typical and important example here is the Poisson process. This leads to the following definitions and notation. A function f : [a, b] → X is called c`adl`ag if it is right continuous and has left-hand limits in every point of [a, b]. This notation is the abbreviation for the French phrase continue a` droite, limite a` gauche. Similarly, a function that is left continuous and has right-hand limits in every point is called c`agl`ad. Analogous English abbreviations rcll and rllc are used less frequently. The set of all c`adl`ag functions f : [a, b] → X is denoted D([a, b], X) and is called the Skorohod space. The short notation for D([a, b], R) is D([a, b]). The following theorem gives sufficient conditions for a stochastic process to possess a c`adl`ag modification, formulated in terms of three-dimensional distributions of the process. Theorem 3.8. Let {X(t),t ∈ [0, T ]} be a continuous in probability stochastic process. Suppose that there exist a nondecreasing function {g(h), h ∈ [0, T ]} and a function {q(c, h), c ∈ R+ , h ∈ [0, T ]} such that (3.1) holds true and P({ρ (X(t − h), X(t)) > cg(h)} ∩ {ρ (X(t), X(t + h)) > cg(h)}) ≤ q(c, h), h > 0,t ∈ [h, T − h]. Then the process X has c`adl`ag modification. Also, for existence of either c`adl`ag or continuous modifications sufficient conditions are available, formulated in terms of conditional probabilities. Theorem 3.9. Let {X(t), t ∈ [0, T ]} be a stochastic process, and {α (ε , δ ), ε , δ > 0} be a family of constants such that P(ρ (X(t), X(s)) > ε /FsX ) ≤ α (ε , δ ) a.s., 0 ≤ s ≤ t ≤ s + δ ≤ T, ε > 0. Then (1) If limδ →0+ α (ε , δ ) = 0 for any ε > 0, then the process X has c`adl`ag modification. (2) If limδ →0+ δ −1 α (ε , δ ) = 0 for any ε > 0, then the process X has continuous modification.
Bibliography [9], Chapter I; [24], Volume 1, Chapter III, §2–5; [25], Chapter IV, §2–5; [15], Chapter II, §2; [79], Chapters 8–11.
Problems 3.1. Prove that if the domain T of a random function X is countable and σ -algebra T includes all one-point sets, then the random function X is measurable.
3 Trajectories. Modifications. Filtrations
25
3.2. (1) Prove that if a process is measurable then each of its trajectories is a measurable function. (2) Give an example of a nonmeasurable stochastic process with all its trajectories being measurable functions. (3) Give an example of a stochastic process with all its trajectories being nonmeasurable functions. 3.3. Let Ω = [0, 1], and σ -algebra F consist of all the subsets of [0, 1] having their Lebesgue measure equal either 0 or 1. Let X(t, ω ) = 1It=ω , ω ∈ [0, 1],t ∈ [0, 1]. Prove that (a) X is a stochastic process; (b) X is not measurable. 3.4. Prove that stochastic process {X(t),t ∈ R+ } is measurable assuming its trajectories are: (a) right continuous; (b) left continuous. 3.5. Assume it is known that every trajectory of the process {X(t),t ∈ R+ } is either right continuous or left continuous. Does it imply that this process is measurable? Compare with the previous problem. 3.6. Prove that if a process {X(t), t ∈ R+ } is measurable and a random variable τ possesses its values in R+ , then X(τ ) is random variable. Compare this problem and Problem 3.4 with Problems 1.27 and 1.28. 3.7. A process {X(t), t ∈ R+ } is called progressively measurable if for every T > 0 the restriction of the function X to [0, T ] × Ω is B([0, T ]) ⊗ FTX − X-measurable. Construct a process X that is measurable, but not progressively measurable. 3.8. Let stochastic process {X(t), t ∈ R+ } be continuous in probability. Prove that this process has: (a) measurable modification; (b) progressively measurable modification. 3.9. Let all values of a process {X(t), t ∈ R+ } be independent and uniformly distributed on [0, 1]. (a) Does this process have a c`adl`ag modification? (b) Does this process have a measurable modification? 3.10. Let T = R+ , (Ω , F, P) = (R+ , B(R+ ), μ ), where μ is a probability measure on R+ that does not have any atoms. Introduce the processes X,Y by X(t, ω ) = 1I{t=ω } , Y (t, ω ) = 0, t ∈ T, ω ∈ Ω . (1) Prove that X and Y are stochastically equivalent; that is, Y is a modification of X. (2) Check that all the trajectories of Y are continuous and all the trajectories of X are discontinuous. 3.11. Prove that if a stochastic process {X(t), t ∈ R} has a continuous modification, then every process Y , stochastically equivalent to X in a wide sense, has a continuous modification too.
26
3 Trajectories. Modifications. Filtrations
3.12. Assume that the random field {ξ (x), x ∈ Rd } is such that, for every n ∈ N, the field {ξ (x), x ≤ n} has a continuous modification. Prove that the field {ξ (x), x ∈ Rd } itself has a continuous modification. 3.13. Let {X(t),t ∈ R+ } be a Gaussian process with zero mean and covariance function RX (t, s) that is equal to (1) exp[−|t − s|] (Ornstein–Uhlenbeck process). (2) 12 (t 2H + s2H − |t − s|2H ), H ∈ (0, 1] (fractional Brownian motion). Prove that the process X(·) has a continuous modification. Find values of γ such that these processes have modifications with their trajectories satisfying the H¨older condition with index γ . 3.14. Prove that if a function f : [0, 1] → R satisfies the H¨older condition with index γ > 1, then f is a constant function. Derive that K(t, s) = 12 (t 2H + s2H − |t − s|2H ) is not a covariance function for H > 1. Compare with Problem 2.16. 3.15. Let Gaussian process {X(t),t ∈ R} be stochastically continuous, and t0 ∈ R be fixed. Prove that the process Y (t) = E[X(t)/X(t0 )],t ∈ R has a continuous modification. 3.16. Prove that the Wiener process has a modification with every trajectory satisfying the H¨older condition with arbitrary index γ < 12 . √ 3.17. Let W be the Wiener process. Prove that lim supt→0+ W (t)/ t = +∞ with probability one. In particular, almost every trajectory of the Wiener process does not satisfy the H¨older condition with index γ = 12 . 3.18. Prove that for any α >
1 2
W (t) P lim α = 0 = 1. t→+∞ t 3.19. Let {W (t),t ≥ 0} be the Wiener process. Prove that there exists the limit in probability 2 n k k+1 −W , P-lim ∑ W n→∞ n n k=0 and find this limit. Prove that
k k+1 −W P-lim ∑ W = ∞. n→∞ n n k=0 n
3.20. Prove that almost all trajectories of the Wiener process have unbounded variation on [0, 1]. 3.21. Let {W (t), t ∈ R+ } be the Wiener process. Prove that, with probability one, λ 1 {t ≥ 0| W (t) = 0} = 0.
3 Trajectories. Modifications. Filtrations
27
3.22. Prove that, with probability one, the Wiener process attains its maximum value on [0, 1] only once. 3.23. Let W be the Wiener process. For a, b ∈ R+ , a < b denote Cab = {x ∈ C(R+ )| x(t) = x(a), t ∈ [a, b]}.
Prove that the set {ω |W (·, ω ) ∈ a 0,Y (0) = X(0). Prove that Y is a stochastic process with c`agl`ad trajectories. 3.27. Let τ be a random variable taking values in R+ and X(t) = 1It>τ , t ≥ 0. Find a condition on the distribution of τ that would be necessary and sufficient for the process X to have a c`adl`ag modification. 3.28. Let {X(t), t ∈ [0, T ]} be a continuous in probability stochastic process with independent increments. Prove that X has a c`adl`ag modification. 3.29. Let f ∈ D([a, b]). Prove that (a) Function f is bounded. (b) The set of discontinuities of the function f is at most countable. 3.30. Let the trajectories of a stochastic process X belong to the space D([0, T ]) and let c > 0 be given. Prove that τc = inf{t| |X(t) − X(t−)| ≥ c} is a random variable. 3.31. Let the trajectories of a stochastic process X belong to the space D([0, T ]) and let Γ ∈ B(R) be given. Prove that τ = inf{t| X(t) −X(t−) ∈ Γ } is a random variable. 3.32. Let {X(t), t ∈ R} be a separable stochastic process taking its values in a complete metric space. Prove that if the process X has a continuous modification then ˜ = 0 such that the trajectory X(·, ω ) is continuous there exists a set N˜ ∈ F with P(N) ˜ for any ω ∈ N. 3.33. Let {X(t), t ∈ R} be a separable stochastic process taking its values in a complete metric space. Prove that if the process X has a c`adl`ag modification then there ˜ = 0 such that the trajectory X(·, ω ) is c`adl`ag for any exists a set N˜ ∈ F with P(N) ˜ ω ∈ N.
28
3 Trajectories. Modifications. Filtrations
3.34. Let {X(t), t ∈ R} be a separable stochastic process taking its values in a complete metric space. Assume that the process X has a measurable modification. Does ˜ = 0 that restriction of the function this imply existence of such a set N˜ ∈ F with P(N) ˜ X(·, ·) to R × (Ω \N) is a measurable function? 3.35. Let {X(t), t ∈ R} be a stochastic process taking its values in X = [0, 1]. Does separability of X imply separability of the subspace L2 (Ω , F, P) generated by the family {X(t), t ∈ R} of the values of this process? 3.36. Prove the following characterization of the σ -algebra FtX : it includes all A ∈ F for which there exists a set A0 ∈ FtX,0 such that AA0 ∈ NP . 3.37. (1) Let X,Y be two stochastically equivalent processes. Prove they generate the same filtration. (2) Make an example of two stochastically equivalent processes X,Y such that the corresponding filtrations {FtX,0 ,t ∈ T} and {FtY,0 ,t ∈ T} do not coincide. 3.38. Let
0, X(t) = (t − 12 )η
t ∈ [0, 12 ], , t ∈ ( 12 , 1]
where P(η = ±1) = 12 . Describe explicitly the natural filtration for the process X. Is this filtration: (a) right continuous? (b) left continuous? 3.39. Let τ be a random variable uniformly distributed on [0, 1], F = σ (τ ) (σ -algebra, generated by random variable τ ), X(t) = 1It>τ , t ∈ [0, 1]. Describe explicitly the natural filtration for the process X. Is this filtration: a) right continuous? b) left continuous? 3.40. Let stochastic process {X(t),t ∈ R+ } have continuous trajectories. (1) Prove that its natural filtration is left continuous. (2) Provide an example that this filtration is not necessarily right continuous. 3.41. Provide an example of a process having c`adl`ag trajectories and generating filtration that is neither left continuous nor right continuous. 3.42. Is a filtration generated by a Wiener process: (a) left continuous? (b) right continuous? 3.43. Is a filtration generated by a Poisson process: (a) left continuous? (b) right continuous? 3.44. Let {W (t), t ∈ R+ } be a Wiener process and assume that all its trajectories are continuous. (1) Is it necessary for a filtration {FtW,0 , t ∈ R+ } to be: (a) left continuous? (b) right continuous? (2) Answer the same questions for the filtration {FtN,0 , t ∈ R+ }, where {N(t), t ∈ R+ } is the Poisson process with c`adl`ag trajectories.
3 Trajectories. Modifications. Filtrations
29
3.45. Let (Ω , F , P) be a probability space, Y be a separable metric space, and (A, A) be a measurable space. Consider the sequence of F ⊗ A-measurable random elements {Xn = Xn (a), a ∈ A, n ≥ 1} taking their values in Y. Assume that for every a ∈ A there exists a limit in probability of the sequence {Xn (a), n ≥ 1}. Prove that there exists an F ⊗ A-measurable random element X(a), a ∈ A, takP
ing its values in Y , such that Xn (a) −→ X(a), n → ∞ for every a ∈ A. 3.46. Prove Theorem 3.1.
Hints
3.1. {(t, ω )| X(t, ω ) ∈ B} = t∈T {t} × {ω | X(t, ω ) ∈ B}; the union is at most countable. Every set {t} × {ω | X(t, ω ) ∈ B} belongs to T ⊗ F because X(t) is a random element by the definition of a random function and therefore {ω | X(t, ω ) ∈ B} ∈ F. 3.4. You can use Problem 3.1 and relation X(t, ω ) = limn→∞ X (([tn] + 1)/n, ω ) in item (a) or X(t, ω ) = limn→∞ X ([tn]/n, ω ) in item (b). 3.5. Consider a sum of a process XA from item (1) of Problem 1.6 and a process Y (t, ω ) = 1It>ω . 3.11. Use Theorem 3.3. 3.13, 3.16. Use Theorem 3.6 with α = 2n, n ∈ N. 3.15. If DX(t0 ) > 0, then E[X(t)/X(t0 )] = EX(t) + (X(t0 ) − EX(t0 )) ×
cov(X(t), X(t0 )) . DX(t0 )
If DX(t0 ) = 0, then Y (t) = EX(t). 3.18. Use Problems 3.16 and 6.5 (d). 3.17. Consider a sequence of random variables ξk = 2−(k+1)/2 (W (2−k )−W (2−k−1 )), k ∈ N and prove that P(lim supk→∞ |ξk | = +∞) = 1. Use this identity. 3.21. Introduce a function 1I{(t,ω )|W (t,ω )=0} and use the Fubini theorem. 3.22. It is sufficient to prove that for any a < b probability of the event {maxt∈[0,a] W (t) = maxt∈[b,1] W (t)} is equal to zero. One has max W (t) = W (a) + (W (b) −W (a)) + max (W (t) −W (b))
t∈[b,1]
t∈[b,1]
and the variables W (b) −W (a) and maxt∈[b,1] (W (t) −W (b)) are jointly independent with the σ -algebra FW a . In addition, the distribution of W (b) − W (a) is absolutely continuous. Therefore, the conditional distribution of maxt∈[b,1] W (t) w.r.t. FW a is absolutely continuous. This implies the needed statement.
30
3 Trajectories. Modifications. Filtrations
3.23. The fact that P(W ∈ Cab ) = 0 for fixed a < b, can be proved similarly to the previous hint. Then you can use that a 0 ∀δ > 0 ∃u, v ∈ [0,t − ε ] ∩ Q : |u − v| < δ , |X(u) − X(v)| > c − δ . 3.31. Consider a sequence θk = τ1/k , k ≥ 1 (see Problem 3.30); then τ = min{θk | X(θk ) − X(θk −) ∈ Γ }. 3.34. Put Ω = [0, 1], F is the σ -algebra of Lebesgue measurable sets, P is the Lebesgue measure, X(t, ω ) = 1It=ω 1It∈A + 1It>ω , where A is a Lebesgue nonmeasurable set. 3.35. Consider a stochastic process {X(t),t ∈ R} with i.i.d. values uniformly distributed on [0, 1] (see Problem 1.10 and corresponding hint). This process has a separable modification due to Theorem 3.2. 3.36. Use the “principle of the fitting sets”: prove that the class given in the formulation of the problem is a σ -algebra containing FtX,0 and NP . Prove that every σ -algebra containing both FtX,0 and NP should also contain this class. 3.42. (a) Use Problem 3.40. (b) Use Problem 5.54. 3.43. (a) For any t > 0, B ∈ B(R), one has {N(t) ∈ B}{N(t−) ∈ B} ⊂ {N(t−) = that N has c`adl`ag trajectories). Therefore {N(t) ∈ B} ∈ N(t)} P (assuming ∈N N . σ NP ∪ s 2−k < 2−k . Check that there exists an F ⊗ A-measurable random element X(a), a ∈ A such that Xnk (a) (a) → X(a), k → ∞ almost surely for every a ∈ A. 3.46. Choose a sequence of measurable subsets {Un,k } of the space T such that for any n ≥ 1: (1) T = ∪kUn,k . (2) The sets Un,k and Un, j are disjoint for k = j and their diameters do not exceed 1/n. Let tn,k ∈ Un,k . Then Xn (t) = ∑k X(tnk )1IUn,k (t) is measurable. Prove that for any t sequence {Xn (t), n ≥ 1} converges to X(t) in probability as n → ∞ and use Problem 3.45.
3 Trajectories. Modifications. Filtrations
31
Answers and Solutions 3.2. Item (1) immediately follows from the Fubini theorem. As an example for item (2), one can use the process from Problem 1.6 with a Borel nonmeasurable set. In item (3), one of the possible examples is: Ω = T = [0, 1], T = F = B([0, 1]), X(t, ω ) = 1It∈A , where A is a Borel nonmeasurable set. 3.3. Every variable X(t, ·) is an indicator function for a one-point set and, obviously, is measurable w.r.t. F. This proves (a). In order to prove that the process X is not measurable, let us show that the set {(t, ω )| t = ω } does not belong to B([0, 1]) ⊗ F. Denote K the class of sets C ⊂ [0, 1]2 satisfying the following condition. There exists a set ΔC with its Lebesgue measure equal to 0 such that for arbitrary t, ω1 , ω2 ∈ [0, 1] it follows from (t, ω1 ) ∈ C, (t, ω2 ) ∈ C that at least one of the points ω1 , ω2 belongs to ΔC . Then K is a σ -algebra (prove this!). In addition, K contains all the sets of the type C = A × B, B ∈ F. Then, by the “principle of the fitting sets”, class K contains B([0, 1]) ⊗ F. On the other hand, the set {(t, ω )| t = ω } does not belong to K (verify this!). 3.5. It does not. 3.6. Because a superposition of two measurable mappings is also measurable, it is sufficient to prove that the mapping Ω ω → (τ (ω ), ω ) ∈ R+ × Ω is F − B(R+ ) ⊗ F-measurable. For C = A × B, A ∈ B(R+ ), B ∈ F we have that {ω | (τ (ω ), ω ) ∈ C} = {ω | τ (ω ) ∈ A} ∩ B ∈ F. Because “rectangles” C = A × B generate B(R+ ) ⊗ F, this proves the needed measurability. 3.7. Take Ω = [0, 1], F = B([0, 1]), 1It=ω , t ∈ [0, 1] . X(t, ω ) = ω, t ∈ (1, 2] Then the process X is measurable (verify this!). On the other hand, the σ -algebra F1X is degenerate, that is, contains only the sets of Lebesgue measure 0 or 1. Then the process X is not progressively measurable since its restriction to the time interval [0, 1] is not measurable (see Problem 3.3). 3.9. (a) No, because it would be right continuous in probability. (b) No. Assume that exists. Then for any t ∈ R+ , ω ∈ Ω there t such modification 1 exists Y (t, ω ) = 0 X(s, ω ) − 2 ds (Lebesgue integral). By the Fubini theorem, EY 2 (t) = 0t 0t E X(s1 ) − 12 X(s2 ) − 12 ds1 ds2 = 0t 0t 1Is1 =s2 ds1 ds2 = 0, and then Y (t) = 0 a.s. Every trajectory of the process Y is continuous as a Lebesgue integral with varying upper bound, therefore Y (·, ω ) ≡ 0 for almost all ω . Then X(·, ω ) ≡ 12 for the same ω . It is impossible. 3.13. a) γ < 12 ; b) γ < H.
32
3 Trajectories. Modifications. Filtrations
3.27. Denote by Kτ the set of atoms of distribution for τ (i.e., the points t ∈ R+ for which P(τ = t) > 0). The required condition is as follows. For any t ∈ Kτ there exists ε > 0 such that P(τ ∈ (t,t + ε )) = 0. 3.31. According to Problem 3.30, θk = τ1/k , k ∈ N are random variables. Both the process X and the process Y (t) = X(t−) are measurable (see Problems 3.4 and 3.26), therefore Zk = X(θk ) − X(θk −), k ∈ N are random variables (see Problem 3.6). Thus τ = ∑∞ k=1 θk · 1IZk ∈Γ · ∏m>k 1IZm ∈Γ is a random variable too. 3.34. Not true. 3.35. Not true. 3.37. (1) For any t ∈ T, B ∈ X one has {X(t) ∈ B}{Y (t) ∈ B} ⊂ {X(t) = Y (t)}, and then {X(s) ∈ B} ∈ FtY , s ≤ t (see Problem 3.36). Because NP ⊂ FtY , one has FtX ⊂ FtY . Similarly, it can be proved that FtY ⊂ FtX . (2) Let Ω = [0, 1], F = B([0, 1]), P is the Lebesgue measure, T = {t}, X(t, ω ) ≡ 0,Y (t, ω ) = 1Iω =1 . Then FtX,0 = {∅, [0, 1]}, FtY,0 = {∅, {1}, [0, 1), [0, 1]}. 3.38. (a) No; (b) yes. 3.39. FtX contains all sets of the form {ω : τ (ω ) ∈ A}, where A is an arbitrary Borel set such that either P(A ∩ (t, 1]) = 0 or P(A ∩ (t, 1]) = 1 −t. Filtration FX is both right and left continuous. 3.42. (a) Yes; (b) yes. 3.43. (a) Yes; (b) yes. 3.44. (1) (a) Yes; (b) no. (2) (a) No; (b) no.
4 Continuity. Differentiability. Integrability
Theoretical grounds Definition 4.1. Let T and X be metric spaces with the metrics d and ρ , respectively. A random function {X(t),t ∈ T} taking values in X is said to be (1) Stochastically continuous (or continuous in probability) at point t ∈ T if ρ (X(t), X(s)) → 0 in probability as s → t. (2) Continuous with probability one (or continuous almost surely, a.s.) at point t ∈ T if ρ (X(t), X(s)) → 0 with probability one as s → t. (3) continuous in the L p sense, p > 0 (or mean continuous of the power p ) at point t ∈ T if Eρ p (X(t), X(s)) → 0 as s → t. If a random function is stochastically continuous (continuous with probability one, continuous in the L p sense) at every point of the parametric set T then it is said to be stochastically continuous (respectively, continuous with probability one or continuous in the L p sense) on this set. Note that sometimes, while dealing with continuity of the function X in the L p sense, one assumes the additional condition Eρ p (X(t), x) < +∞,t ∈ T, where x ∈ X is a fixed point. Continuity in the L1 sense is called mean continuity, and continuity in the L2 sense is called mean square continuity. Theorem 4.1. Let {X(t), t ∈ [a, b]} be a real-valued stochastic process with EX 2 (t) < +∞ for t ∈ [a, b]. The process X is mean square continuous if and only if aX ∈ C([a, b]), RX ∈ C([a, b] × [a, b]). Definition 4.2. A real-valued stochastic process {X(t), t ∈ [a, b]} is said to be stochastically differentiable (differentiable with probability one, differentiable in the L p sense) at point t ∈ [a, b] if there exists a random variable η such that X(t) − X(s) → η, s → t t −s in probability (with probability one or in L p , respectively). The random variable η is called the derivative of the process X at the point t and is denoted by X (t). D. Gusak et al., Theory of Stochastic Processes, Problem Books in Mathematics, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-87862-1 4,
33
34
4 Continuity. Differentiability. Integrability
If a stochastic process is differentiable (in any sense introduced above) at every point t of the parametric set T then it is said to be differentiable (in this sense) on this set. If, in addition, its derivative X = {X (t),t ∈ T} is continuous (in the same sense), then the process X is said to be continuously differentiable (in this sense). Theorem 4.2. Let {X(t), t ∈ [a, b]} be a real-valued stochastic process with EX 2 (t) < +∞ for t ∈ [a, b]. The process X is continuously differentiable in the 1 mean square if and only if aX ∈ C1 ([a, b]), RX ∈ C ([a, b] × [a, b]) and there ex2 ists derivative ∂ /∂ t ∂ s RX . In this case, aX (t) = aX (t), RX (t, s) = 2 the continuous ∂ /∂ t ∂ s RX (t, s), RX, X (t, s) = (∂ /∂ s) RX (t, s). Definition 4.3. Let {X(t), t ∈ [a, b]} be a real-valued stochastic process. Assume there exists the random variable η such that, for any partition sequence {λ n = {a = n ) → 0 and for any sequence of t0n < t1n < · · · < tnn = b}, n ∈ N} with maxk (tkn − tk−1 n n n n suites {θn = {θ1 , . . . , θn }, n ∈ N} with θk ∈ [tk−1 ,tkn ], k ≤ n, n ∈ N, the following convergence takes place, n
∑ X (θkn )
n n tk − tk−1 → η , n → ∞
k=1
either in probability, with probability one, or in the L p sense. Then the process X is said to be integrable (in probability, with probability one or in the L p sense, respectively) on [a, b]. The random variable η is denoted ab X(t) dt and called the integral of the process X over [a, b]. Theorem 4.3. Let {X(t),t ∈ [a, b]} be a real-valued stochastic process with EX 2 (t) < +∞ for t ∈ [a, b]. The process X is mean square integrable on [a, b] if and only if the functions aX and RX are Riemann integrable on [a, b] and [a, b] × [a, b], respectively. In this case, E
b a
X(t) dt =
b a
aX (t) dt, D
b a
X(t) dt =
b b a
a
RX (t, s) dtds.
Bibliography [24], Volume 1, Chapter IV, §3; [25], Chapter V, §1; [79], Chapter 14.
Problems 4.1. Let {X(t), t ∈ [0, 1]} be a waiting process; that is, X(t) = 1It≥τ ,t ∈ [0, 1], where τ is a random variable taking its values in [0, 1]. Prove that the process X is continuous (in any sense: in probability, L p , a.s.) if and only if the distribution function of τ is continuous on [0, 1].
4 Continuity. Differentiability. Integrability
35
4.2. Give an example of a process that is (a) Stochastically continuous but not continuous a.s. (b) Continuous a.s. but not mean square continuous. (c) For given p1 < p2 , continuous in L p1 sense but not continuous in L p2 sense. 4.3. Suppose that all values of a stochastic process {X(t), t ∈ R+ } are independent and uniformly distributed on [0, 1]. Prove that the process is not continuous in probability. 4.4. Let {X(t), t ∈ [a, b]} be continuous in probability process. Prove that this process is (a) Bounded in probability: ∀ε > 0 ∃C ∀t ∈ [a, b] :
P (|X(t)| ≥ C) < ε ;
(b) Uniformly continuous in probability: ∀ε > 0 ∃δ > 0 ∀t, s ∈ [a, b], |t − s| < δ :
P (|X(t) − X(s)| ≥ ε ) < ε .
4.5. Prove that the necessary and sufficient condition for stochastic continuity of a real-valued process X at a point t is the following. For any x, y ∈ R, x < y, P(X(t) ≤ x, X(s) ≥ y) + P(X(t) ≥ y, X(s) ≤ x) → 0, s → t. 4.6. Prove that the following condition is sufficient for a real-valued process {X(t), t ∈ R} to be stochastically continuous: for any points t0 , s0 ∈ R, x, y ∈ R, P(X(t) ≤ x, X(s) ≤ y) → P(X(t0 ) ≤ x, X(s0 ) ≤ y), t → t0 , s → s0 . 4.7. (1) Let stochastic process {X(t), t ∈ [a, b]} be mean square differentiable at the point t0 ∈ [a, b]. Prove that cov (η , X (t0 )) = (d/dt)|t=t0 cov(η , X(t)) for any η ∈ L2 (Ω , F, P). (2) Let {X(t),t ∈ [a, b]} bea mean square stochastic process and integrable η ∈ L2 (Ω , F, P). Prove that cov η , ab X(s) ds = ab cov(η , X(s)) ds. 4.8. Let {X(t),t ∈ R} be a real-valued centered process with independent increments and EX 2 (t) < +∞,t ∈ R. Prove that for every t ∈ R there exist the following mean square limits: X(t−) = lims→t− X(s), X(t+) = lims→t+ X(s). 4.9. Let {N(t),t ∈ R+ } be the Poisson process. (1) Prove that N is not differentiable in any point t ∈ R+ in the L p sense for any p ≥ 1. (2) Prove that N is differentiable in an arbitrary point t ∈ R+ in the L p sense for every p ∈ (0, 1). (3) Prove that N is differentiable in an arbitrary point t ∈ R+ in probability and with probability 1. Find the derivatives in items (2) and (3).
36
4 Continuity. Differentiability. Integrability
4.10. Let {X(t), t ≥ 0} be a real-valued homogeneous process with independent increments and D[X(1) − X(0)] > 0. Prove that the processes X(t) and Y (t) := X(t + 1) − X(t) are mean square continuous but not mean square differentiable. 4.11. Consider a mean square continuous process {X(t),t ∈ R} such that D(X(t) − X(s)) = F(t − s),t, s ∈ R for some function F and F (0) = 0. Describe all such processes. 4.12. Let the mean and covariance functions of the process X be equal aX (t) = t 2 , RX (t, s) = ets . Prove that X is mean square differentiable and find : (a) E[X(1) + X(2)]2 . (b) EX(1)X (2).
2 (c) E X(1) + 01 X(s) ds . (d) cov X (1), 01 X(s) ds .
4 (e) E X(1) + 01 X(s) ds assuming additionally that the process X is Gaussian. = N(t) − t be the 4.13. Let N be the Poisson process with intensity λ = 1 and N(t) corresponding compensated process. Find: (a) E 13 N(t) dt · 24 N(t) dt. dt · 2 N(t) dt · 3 N(t) dt. (b) E 01 N(t) 1 2 4.14. Let X(t) = α sin(t + β ), t ∈ R, where α and β are independent random variables, α is exponentially distributed with parameter 2, and β is uniformly distributed on [−π , π ]. (1) Is the process X continuous or differentiable in the mean square? (2) Prove using the definition that the mean square derivative X (t) is equal to α cos(t + β ). (3) Find RX , RX,X , RX , aX , aX . 4.15. Suppose that the trajectories of a process {X(t), t ∈ R} are differentiable at the point t0 , and the process X is mean square differentiable at this point. Prove that the derivative of the process at the point t0 in the mean square sense coincides almost surely with the ordinary derivative of the trajectory of the process. 4.16. Consider a mean square continuously differentiable stochastic process {X(t), t ∈ R}. (1) Prove that the processes X and X have measurable modifications. Further on, assume X and X be measurable. (2) Prove that X(t) = X(0) +
t 0
X (s)ds a.s., t ∈ R.
(4.1)
Consider both the mean square integral and the Lebesgue integral for a fixed ω (by convention, 0t is equal to − t0 for t < 0).
4 Continuity. Differentiability. Integrability
37
(3) Prove that X has a continuous modification. (4) Assume ∞ −∞
EX 2 (t) dt < ∞,
∞ −∞
E(X (t))2 dt < ∞.
Prove that for almost all ω the function t → X(t, ω ) belongs to the Sobolev space W21 (R), that is, a space of square-integrable functions with their weak derivatives being also square-integrable. Prove that, for P-almost all ω and λ 1 -almost all t, the Sobolev derivative coincides with the mean square derivative. (5) Will the previous statements still hold true if X is supposed to be mean square differentiable but not necessarily continuously mean square differentiable? 4.17. (1) Verify whether a process {X(t),t ∈ R} is either mean square continuous or mean square differentiable if (a) X(t) = min(t, τ ), where the random variable τ is exponentially distributed. (b) X has periodic trajectories with period 1 and X(t) = t − τ , t ∈ [τ , τ + 1], where the random variable τ is uniformly distributed on [0; 1]. (2) Find RX . If X is mean square differentiable then find X and RX , RX ,X . 4.18. Let X(t) = f (t − τ ), where τ is exponentially distributed and 1 − |x|, |x| ≤ 1, f (x) = 0, |x| > 1. Is the process X mean square differentiable? If so, find X and E(X (1))2 . 4.19. Let random variable τ have the distribution density (1−x)α α +1 , x ∈ [0, 1] p(x) = 0, x∈ / [0, 1] with α > 0. Is the stochastic process ⎧ ⎪ ⎨0, X(t) = 1, ⎪ ⎩ t−τ
t ≤ τ, t ≥ 1, , t ∈ (τ ; 1) 1−τ
either mean square continuous or mean square differentiable (a) on R. (b) On [0, 1]? 4.20. Assume that function f is bounded and satisfies the Lipschitz condition. Let random variable τ be continuously distributed. Prove that the stochastic process X(t) = f (t − τ ),t ∈ R is mean square differentiable. 4.21. (The fractional effect process). Let {εn , n ≥ 1} be nonnegative nondegen−1 erate i.i.d. random variables. Define Sn = ∑nk=1 ξk , f (x) = 1 + x4 . Prove that the stochastic process X(t) = ∑∞ n=1 f (t + Sn ), t ∈ R is mean square continuously differentiable.
38
4 Continuity. Differentiability. Integrability
4.22. Is it possible that a stochastic process (a) Has continuously differentiable trajectories but is not mean square continuously differentiable? (b) Is mean square continuously differentiable but its trajectories are not continuously differentiable? + process then the process 4.23. Prove t that if X(t), t ∈ R is a mean square continuous Y (t) = 0 X(s)ds is mean square differentiable and Y (t) = X(t).
4.24. Let {W (t), t ∈ R+ } be the Wiener process. (1) Prove that, for a given δ > 0, stochastic process Wδ (t) = (1/δ ) tt+δ W (s)ds, t ∈ R+ is mean square continuously differentiable. Find its derivative. (2) Prove that l.i.m.δ →0+ Wδ (t) = W (t), where l.i.m. denotes the limit in the mean square sense.
Hints 4.4. Item (a) follows from item (b). Assume that the statement of item (b) is not true and prove that there exist ε > 0 and sequences {tn } and {sn } converging to some point t ∈ [a, b] such that P(|X(tn ) − X(sn )| ≥ ε ) ≥ ε . Show that this contradicts the stochastic continuity of X at the point t. 4.6. Use Problem 4.5. 4.8. Prove and use the following fact. If H is a Hilbert space and {hk , k ∈ N} is an orthogonal system of elements from this space with ∑k hk 2H < +∞, then there exists h = ∑k hk ∈ H, where the series converges in norm in H. In this problem, one should consider the increments of the process as elements in the Hilbert space H = L2 (Ω , F, P). 4.9. (1), (2) Show that N(t) − N(s) p ∼ λ |t − s|1−p , s → t. E t −s (3)
N(t) − N(s) → 0, s → t t −s
⊂ {N(t−) = N(t)} .
Derivatives of N in probability, a.s. and in the L p , p ∈ (0, 1) sense are zero. 4.10. Use Problem 2.17 and Theorem 4.2. 4.12. (d) Use Problem 4.7. (e) If ξ ∼ N(a, σ 2 ) then Eξ 4 = a4 + 6a2 σ 2 + 3σ 4 . Prove this formula and use it. 4.16. (1) Both X and X are stochastically continuous (prove this!), thus existence of their measurable modifications is provided by Theorem 3.1.
4 Continuity. Differentiability. Integrability
39
(2) Let ϕ ∈C0∞ (R) be a compactly supported nonnegative infinitely differentiable ∞ ∞ ϕ (x)dx = 1. Define ϕ (x) = n ϕ (nx), X (t) := X(t − s)ϕn (s) function with n n −∞ −∞ ∞ X(s)ϕn (t − s)ds. Check that the trajectories of the process X are infinitely ds = −∞ n ∞ ∞ X (s)ϕn (t − s)ds = −∞ X(s)ϕn (t − s)ds, and for any t differentiable a.s., Xn (t) = −∞ lim X (t) = X (t) n→∞ n
lim Xn (t) = X(t),
n→∞
(4.2)
in the mean square. Because the trajectories of Xn are continuously differentiable a.s., the Newton–Leibnitz formula implies Xn (t) = Xn (0) + One has
t 0
Xn (s)ds a.s.
(4.3)
t t t E Xn (s) − X (s) ds E Xn (s) ds − X (s) ds ≤ 0
0
E|Xn (s) − X (s)|
E|X (s)| +
0
∞
−∞ E|X (s1 )|ϕn (s − s1 )ds1 .
and ≤ Pass to the limit in (4.3) using (4.2) and the Lebesgue dominated convergence theorem. (3) Follows immediately from the item (2). (4) It is well known that if a function f ∈ L2 (R) has the form f (t) = c +
t 0
g(s)ds, t ∈ R,
(4.4)
with some c ∈ R, g ∈ L2 (R), then f ∈ W21 (R) and g is equal to its Sobolev derivative. Now use the result from the item (2). Note that, moreover, the function f appears to be absolutely continuous. In particular, f is differentiable for almost all t and g is its ordinary derivative. (5) Statement (1) still holds true. this, consider the sequence of In order to show measurable processes Yn (t) := n X t + n−1 − X(t) (X is mean square continuous and thus can be supposed to be measurable). Then Yn (t) → X (t), n → ∞ in probability. Use Problem 3.45. Statement (2) still holds true assuming that T −T
E(X (t))2 dt < ∞, T > 0.
(4.5)
This condition is needed in order to construct the approximating sequence {Xn } and justify passing to the limit in (4.3). Note that under condition (4.5) the integral in the right-hand side of (4.1) is well defined as the Lebesque integral for a.s. ω but not necessarily as the mean square integral. Statement (4) still holds true and statement (3) holds true assuming (4.5). 4.17. (1) (a) Use the Lebesgue dominated convergence theorem and check that X (t) = 1It≤τ . Furthermore, EX(t) =
∞ 0
min(t, x)e−x dx =
t 0
xe−x dx +
∞ t
te−x dx = 1 − e−t .
40
4 Continuity. Differentiability. Integrability
Let s ≤ t. Then EX(s)X(t) = s 0
x2 e−x dx +
t s
∞ 0
sxe−x dx +
min(s, x) min(t, x)e−x dx =
∞ t
ste−x dx = −e−s (s + 2) + 2 − se−t .
In order to obtain the covariance function of the derivative, use Theorem 4.2. (b) Similarly to item (a), the process X(t) is mean square continuous. If there exists X (t), then X (t) = 1 a.s. (Problem 4.15) and the process Y (t) = X(t) − t must be mean square continuously differentiable with zero derivative. That is, Y (t) = Y (0) a.s. (Problem 4.16). But this is not correct. 4.19. The process X(t) is mean square differentiable at every point t except possibly t = 1. At the same time, X (t) = 0 when t ≤ τ or t > 1, and X (t) = 1/(1 − τ ) when t ∈ (τ , 1]. Because X(t) − X(1) X(t) − X(1) 1 = a.s., = 0 = lim t→1− t→1+ t −1 1−τ t −1 lim
the process {X(t),t ∈ R} is not differentiable at t = 1. Now, consider the restriction of X(t) to [0, 1]. Check that the mean square derivative X (1) exists if and only if E(1 − τ )−2 < ∞; that is, α > 1. Check that X (1) = (1 − τ )−1 . 4.24. Use the considerations from the solution to Problem 4.23.
Answers and Solutions 4.5. For any x < y, P(X(t) ≤ x, X(s) ≥ y) + P(X(t) ≥ y, X(s) ≤ x) ≤ P(|X(t) − X(s)| > y − x) → 0 as s → t in the case when X is stochastically continuous at point t. On the other hand, let ε , δ > 0 be fixed and we choose C so that P(|X(t)| > C) < δ . Consider the sets (k + 1)ε (k − 1)ε kε kε , Bk = (x, y) | x ≥ , y ≤ , Ak = (x, y) | x ≤ , y ≥ 2 2 2 2 k∈ Z. Let us assume that m = [2C/ε ] + 1, then {(x, y)| |x − y| > ε , |x| ≤ C} ⊂ m k=−m (Ak ∪ Bk ). Under the assumptions made above P((X(t), X(s)) ∈ Ak ∪ Bk ) → 0 as s → t for every k. Therefore lim sups→t P(|X(t) − X(s)| > ε ) ≤ P(|X(t)| > C) < δ . Because δ is arbitrary, we have P(|X(t) − X(s)| > ε ) → 0, s → t. 4.11. We get from the independence of increments that D(X(t) − X(0)) = nF(t/n) for every n ∈ N, t ∈ R. It follows from the equality F(z) = F(−z) that F (0) = 0. Obviously, F(0) = D(X(t) − X(t)) = 0. Thus D(X(t) − X(s)) = o(|t − s|2 ) as |t − s| → 0. Therefore D(X(t) − X(0)) = 0, t ∈ R and thus X(t) = X(0) + aX (t) − aX (0), t ∈ R.
4 Continuity. Differentiability. Integrability
41
4.12. (a) e4 + 2e2 + e + 25. (b) e2 + 4. t (c) 3e + 29 + 01 e −1 t dt. (d) 1. 2 2 4 t 1 et −1 4 3e − 2 + 01 e −1 dt + 3 3e − 2 + (e) 43 + 6 43 0 t dt . t 4.13. (a) 34 + 13 ; (b) 12 . 4.14. (1) Yes, it is. (2) By the Lebesgue dominated convergence theorem lim E
ε →0
2 α sin(t + ε + β ) − α sin(t + β ) − α cos(t + β ) = 0. ε
The Lagrange theorem implies that the expression in parentheses is dominated by the quantity α 2 . (3) RX (s,t) = 12 cos(t − s); RX,X (s,t) = − 12 sin(t − s); RX (s,t) = 12 cos(t − s), aX = aX ≡ 0. 4.15. Mean square convergence of (X(t) − X(t0 ))/(t −t0 ) to some random variable ξ as t → t0 implies convergence in probability. Thus, we can select a sequence tk → t0 , k → ∞ such that X(tk ) − X(t0 ) → ξ a.s., k → ∞. tk − t0 Since the trajectories of X(t) are differentiable at the point t0 , the limit ξ coincides a.s. with ordinary derivative. 4.18. Similarly to item (a) of Problem 4.17, the process X is continuously differentiable in m.s. and ⎧ ⎪ t ∈ [τ − 1, τ ], ⎨1, X (t) = −1, t ∈ (τ , τ + 1], ⎪ ⎩ 0, t∈ / [τ − 1, τ + 1]. 2 E X (1) = E1I1∈[τ −1;τ +1] = P(τ ∈ [0, 2]) = 1 − e2α . 4.20. The Lipschitz continuity of the function f implies its absolute continuity and therefore its differentiability at almost every point w.r.t. the Lebesgue measure. Let U be the set of the points where the derivative of the function f exists. Then, for every t0 ∈ R X(t) − X(t0 ) ≥ P(t0 − τ ∈ U) = 1. P ∃ lim t→t0 t − t0 On the other hand, for every t,t0 ,t = t0 , the absolute value of the fraction (X(t) − X(t0 ))/(t − t0 ) does not exceed the Lipschitz constant of the function f . Therefore, by the Lebesgue theorem on dominated convergence, the process {X(t), t ∈ R} is mean square differentiable and X (t) = g(t − τ ), where
42
4 Continuity. Differentiability. Integrability
g(t) =
f (t), t ∈ U . 0, t∈ U
4.22. (a) Yes, if E(X(t))2 = +∞. (b) Yes. Look at the process from Problem 4.17. 4.23. Let ε > 0 be arbitrary. Choose δ > 0 such that E(X(t) − X(s))2 < ε as soon as |t − s| < δ . Thus, for every s ∈ (t − δ ,t + δ ), s = t : t E
s
2 2 t X(z)dz (X(z) − X(t))dz − X(t) = E s t −s t −s
≤
E st (X(z) − X(t))2 dz < ε. t −s
5 Stochastic processes with independent increments. Wiener and Poisson processes. Poisson point measures
Theoretical grounds Definition 5.1. Let T ⊂ R be an interval. A stochastic process {X(t), t ∈ T} taking values in Rd is a process with independent increments if for any m ≥ 1 and t0 , . . . ,tm ∈ T, t0 < · · · < tm the random vectors X(t0 ), X(t1 ) − X(t0 ), . . . , X(tm ) − X(tm−1 ) are jointly independent. A process with independent increments is said to be homogeneous if, for any t, s, v, u ∈ T such that t − s = v − u, the increments X(t) − X(s) and X(v) − X(u) have the same distribution. The following theorem shows that all the finite-dimensional distributions for the process with independent increments on T = R+ are uniquely determined by the starting distribution (i.e., the distribution of X(0)) and distributions of the increments (i.e., the distributions of X(t) − X(s), t > s ≥ 0). Theorem 5.1. The finite-dimensional distributions of the process with independent increments {X(t),t ∈ R+ } taking values in Rd are uniquely determined by the family of the characteristic functions # " φ0 (·) = E exp{i(·, X(0))Rd }, φs,t (·) = E exp{i(·, X(t) − X(s))Rd }, 0 ≤ s < t . On the other hand, in order for a family of the functions {φ0 , φs,t , 0 ≤ s < t} to determine a process with independent increments, it is necessary and sufficient that (1) Every function φ0 , φs,t , 0 ≤ s < t is a characteristics function of a random vector in Rd . (2) The following consistency condition is fulfilled
φs,t φt,u = φs, u , 0 ≤ s < t < u. Theorem 5.1 is a version of the Kolmogorov theorem on finite-dimensional distributions (see Theorems 1.1, 2.2 and Problem 5.2).
D. Gusak et al., Theory of Stochastic Processes, Problem Books in Mathematics, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-87862-1 5,
43
44
5 Stochastic processes with independent increments
Definition 5.2. The (one-dimensional) Wiener process W is the real-valued homogeneous process with independent increments on R+ such that W (0) = 0 and for any t > s the increment W (t) − W (s) has the distribution N(0,t − s). The multidimensional Wiener process is the m-dimensional process W (t) = (W 1 (t), . . . ,W m (t)), where {W i (t),t ≥ 0} are jointly independent Wiener processes. Let T ⊂ R be an interval and κ be a locally finite measure on B(T) (i.e., κ possesses finite values on bounded intervals). Definition 5.3. The Poisson process X with intensity measure κ on T is the process with independent increments such that the increment X(t) − X(s) for any t > s has the distribution Pois(κ((s,t])). The Poisson process N with parameter λ > 0 is a homogeneous process with independent increments defined on T = R+ , such that N(0) = 0 and for any t > s the increment N(t) − N(s) has the distribution Pois(λ (t − s)). The Poisson process with parameter λ is, obviously, the Poisson process with intensity measure κ = λ · λ 1 |R+ on T = R+ (λ 1 is a Lebesgue measure, λ 1 |R+ is its restriction to R+ ). Note that this form of the measure κ is implied by homogeneity of the process. Further on, we use the short name the Poisson process for the Poisson process with some parameter λ . Let 0 ≤ τ1 ≤ τ2 ≤ · · · be a sequence of random variables and X(t) = ∑∞ k=1 1I τk ≤t ,t ∈ + R . The process X is called the registration process associated with the sequence {τk }. The terminology is due to the following widely used model. Assume that some sequence of events may happen at random (e.g., particles are registered by a device, claims are coming into a telephone exchange or a server, etc.). If the variable τk is interpreted as the time moment when the kth event happens, then X(t) counts the number of events that happened until time t. Proposition 5.1. The Poisson process with parameter λ is the registration process associated with the sequence {τk } such that τ1 , τ2 − τ1 , τ3 − τ2 , . . . are i.i.d. random variables with the distribution Exp(λ ). Let us give the full description of the characteristic functions for the increments of stochastically continuous homogeneous processes with independent increments (such processes are called L´evy processes). Theorem 5.2. (The L´evy–Khinchin formula) Let {X(t),t ≥ 0} be a stochastically continuous homogeneous processes with independent increments taking values in Rd . Then φs,t (z) = exp[(t − s)ψ (z)], s ≤ t, z ∈ Rd , 1 ψ (z) = i(z, a)Rd − (Bz, z)Rd + 2
Rd
ei(z,u)Rd − 1 − i(z, u)Rd 1Iu
≤1 Rd
Π (du), (5.1)
where a ∈ Rd , the matrix B ∈ Rd×d is nonnegatively defined, and the measure Π satisfies the relation
5 Stochastic processes with independent increments Rd
45
(u2Rd ∧ 1)Π (du) < +∞.
The function ψ is called the cumulant of the process X; the measure Π is called the L´evy measure of the process X. Consider a “multidimensional segment” T = T1 × · · · × Td ⊂ Rd with some segments Ti ⊂ R, i = 1, . . . , d. On the set T, the partial order is naturally defined: t = (t 1 , . . . ,t d ) ≥ s = (s1 , . . . , sd ) ⇔ t i ≥ si , i = 1, . . . , d. For a function X : T → R and s ≤ t, we denote by Δs,t (X) the increment of X on the “segment” (s,t] = (s1 ,t 1 ] × · · · × (sd ,t d ]; that is, ε1 +···+εd 1 1 1 d d d Δs,t (X) = (−1) X t − ε (t − s ), . . . ,t − ε (t − s ) . 1 d ∑ ε1 ,...,εd ∈{0,1}
Definition 5.4. Let κ be a locally finite measure on B(T). A real-valued random field {X(t), t ∈ T} is called the Poisson random field with intensity measure κ, if for any t0 , . . . ,tm ∈ T, t0 ≤ t1 ≤ · · · ≤ tm the variables X(t0 ), Δt0 ,t1 (X), . . . , Δtm−1 ,tm (X) are jointly independent and for every s ≤ t the increment Δs,t (X) has the distribution Pois(κ((s, t])). Let (E, E, μ ) be a space with σ -finite measure. Denote Eμ = {A ∈ E | μ (A) < +∞}. Definition 5.5. A random point measure on a ring K ⊂ Eμ is the mapping ν that associates with every set A ∈ K a Z+ -valued random variable ν (A), and satisfies the following condition. For any Ai ∈ K, i ∈ N such that Ai ∩ A j = ∅, i = j, i Ai ∈ K, ν Ai = ∑ ν (Ai ) a.s. i
i
The random point measure is called the Poisson point measure with intensity measure μ , if for any A1 , . . . , Am ∈ Eμ , Ai ∩ A j = ∅, i = j the values ν (A1 ), . . . , ν (Am ) are jointly independent and for every A ∈ Eμ the value ν (A) has the distribution Pois(μ (A)). The term “random point measure” can be explained by the following result (see Problem 5.41) which shows that this object has a natural interpretation as a collection of measures indexed by ω ∈ Ω and concentrated on the countable subsets of E (“point measures”). Proposition 5.2. Let ν be a point measure on a ring K. Assume that σ (K) contains all one-point sets and there exists a countable ring K0 ⊂ K such that σ (K0 ) ⊃ K. Then there exists the mapping νˆ : Ω × K → R+ such that: (1) For every A ∈ σ (K) the function νˆ (·, A) is an extended random variable. (2) For every ω ∈ Ω the mapping A → νˆ (ω , A) is the σ -finite measure concentrated at some enumerable set and taking natural values on the points of this set; (3) νˆ (·, A) is equal to ν (A) a.s. for every A ∈ K.
If σ(K) = E then the Poisson point measure with intensity measure μ defined on K can be extended to a Poisson point measure defined on E_μ, and such an extension is unique. For a given Poisson point measure ν with intensity μ, the corresponding centered (or compensated) Poisson point measure is defined as ν̃(A) = ν(A) − μ(A), A ∈ E_μ.

Let f : E → R be measurable w.r.t. σ(K) and { f ≠ 0 } ⊂ A for some A ∈ K. Then the integral of f over the random point measure ν is naturally defined by

( ∫_E f(z) ν(dz) )(ω) = ∫_E f(z) ν̂(ω, dz),  ω ∈ Ω,   (5.2)
where ν̂ is the collection of measures given by Proposition 5.2 (see Problem 5.42). If ν is a Poisson measure, ν̃ is the corresponding compensated measure, and f ∈ L^2(E, μ), then the integral ∫_E f(z) ν̃(dz) is well defined as the stochastic integral over an orthogonal random measure (see Chapter 8 and Problem 5.43). The two definitions of the integral mentioned above are consistent in the sense that if f ∈ L^2(E, μ) and { f ≠ 0 } ∈ E_μ then

∫_E f(z) ν(dz) − ∫_E f(z) μ(dz) = ∫_E f(z) ν̃(dz)   (5.3)
(see Problem 5.44).

Frequently, one needs to consider Poisson point measures defined on a product of spaces, for instance, E = R+ × R^d, E = B(E). In this case, the above-defined integrals of a function (s, u) → f(s, u) 1I_{s ≤ t, u ∈ A} are denoted

∫_0^t ∫_A f(s, u) ν(ds, du),   ∫_0^t ∫_A f(s, u) ν̃(ds, du).
Theorem 5.3. Let {X(t), t ≥ 0} be a stochastically continuous homogeneous process with independent increments taking its values in R^d. Let a, B, Π be, respectively, the vector, the matrix, and the measure appearing in the Lévy–Khinchin formula for the cumulant of this process (see Theorem 5.2). Then there exist a d-dimensional Wiener process W and a Poisson point measure ν on E = R+ × R^d with intensity measure μ = λ^1|_{R+} × Π, independent of each other, such that

X(t) = at + B^{1/2} W(t) + ∫_0^t ∫_{ {||u||_{R^d} > 1} } u ν(ds, du) + ∫_0^t ∫_{ {||u||_{R^d} ≤ 1} } u ν̃(ds, du),   (5.4)

t ∈ R+. And vice versa, let X be determined by equality (5.4) with arbitrary a, B, and an independent d-dimensional Wiener process W and Poisson point measure ν with intensity measure μ = λ^1|_{R+} × Π. Then X is a Lévy process and its cumulant is determined by equality (5.1).
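The following sketch is not part of the original text. It is a simplified, hedged illustration of the decomposition (5.4) in dimension one, under the extra assumption that the Lévy measure is finite (so the jump part is a plain compound Poisson process and no compensated small-jump integral is needed). All function names and parameter choices are ours.

```python
import numpy as np

rng = np.random.default_rng(1)

def levy_path(a, b, lam, jump_sampler, T, n_steps, rng):
    """Toy version of (5.4): drift a*t + b*W(t) + compound Poisson jumps,
    for a finite Levy measure Pi = lam * (law of a single jump)."""
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    dW = rng.normal(0.0, np.sqrt(dt), n_steps)          # Brownian increments
    n_jumps = rng.poisson(lam * T)                      # number of jumps on [0, T]
    jump_times = np.sort(rng.uniform(0.0, T, n_jumps))
    jump_sizes = jump_sampler(n_jumps, rng)
    X = np.zeros(n_steps + 1)
    for k in range(1, n_steps + 1):
        jumps_so_far = jump_sizes[jump_times <= t[k]].sum()
        X[k] = a * t[k] + b * dW[:k].sum() + jumps_so_far
    return t, X

t, X = levy_path(a=0.5, b=1.0, lam=3.0,
                 jump_sampler=lambda n, r: r.normal(0.0, 0.2, n),
                 T=1.0, n_steps=1000, rng=rng)
print(X[-1])
```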
Bibliography [9], Chapter II; [24], Volume 1, Chapter III, §1; [25], Chapter VI; [15], Chapter VIII; [78]; [79], Chapters 2, 27, 28.
Problems

5.1. Verify that the consistency condition (condition 2 of Theorem 5.1) holds true for the characteristic functions of the increments of: (a) the Wiener process; (b) the Poisson process with intensity measure κ. Such verification is necessary for Definitions 5.2 and 5.3 to be formally correct.
5.2. Let the family of functions {ψ_0, ψ_{s,t}, 0 ≤ s < t} be given, and let every function of the family be a characteristic function of some random variable. For any m ≥ 1, t_1, ..., t_m ∈ R+, z_1, ..., z_m ∈ R, put
φ_{t_1,...,t_m}(z_1, ..., z_m) = ψ_0(z_{π(1)} + ··· + z_{π(m)}) ψ_{0, t_{π(1)}}(z_{π(1)} + ··· + z_{π(m)}) × ψ_{t_{π(1)}, t_{π(2)}}(z_{π(2)} + ··· + z_{π(m)}) ··· ψ_{t_{π(m−1)}, t_{π(m)}}(z_{π(m)}),

where the permutation π is such that t_{π(1)} ≤ ··· ≤ t_{π(m)}. Prove that the necessary and sufficient condition for the family {φ_{t_1,...,t_m}, t_1, ..., t_m ∈ T, m ≥ 1} to satisfy the consistency conditions of Theorem 2.2 is that the family {ψ_0, ψ_{s,t}, 0 ≤ s < t} satisfies the consistency condition of Theorem 5.1. Prove also that if these conditions hold then the process X with the finite-dimensional characteristic functions {φ_{t_1,...,t_m}} is a process with independent increments.
5.3. Let {μ_t, t > 0} be a family of probability measures on R such that, for any s, t > 0, μ_{t+s} equals the convolution of μ_t and μ_s. Prove that there exists a homogeneous process with independent increments {X(t), t > 0} such that for every t the distribution of X(t) equals μ_t. Describe the finite-dimensional distributions of the process {X(t), t > 0}.
5.4. Let N be the Poisson process with intensity λ. Find:
(a) P(N(1) = 2, N(2) = 3, N(3) = 5);
(b) P(N(1) ≤ 2, N(2) = 3, N(3) ≥ 5);
(c) P(N(√2) = 3);
(d) P(N(3) = 2);
(e) P(N(4) = 3, N(1) = 2);
(f) P(N(2)N(3) = 2);
(g) P(N^2(2) ≥ 3N(2) − 2);
(h) P(N(2) + N(3) = 1).
5.5. Find E(X(1) + 2X(2) + 3X(3))^2, E(X(1) + 2X(2))^3, E(X(1) + 2X(2) + 1)^3, where X is: (a) the Wiener process; (b) the Poisson process.
5.6. Specify the finite-dimensional distributions for the Poisson process. 5.7. Assume that stochastic process {X(t),t ≥ 0} satisfies the conditions: (a) X takes values in Z+ and X(0) = 0. (b) X has independent increments. (c) P(|X(t + h) − X(t)| > 1) = o(h), P(X(t + h) − X(t) = 1) = λ h + o(h), h → 0 with some given λ > 0. Prove that X is the Poisson process with intensity λ . 5.8. Assume that stochastic process {X(t), t ≥ 0} satisfies conditions (a),(b) from the previous problem and condition (c ) P(|X(t + h) − X(t)| > 1) = o(h), P(X(t + h) − X(t) = 1) = λ (t)h + o(h), h → 0, where λ (·) is some continuous nonnegative function. Find the finite-dimensional distributions for the process X. 5.9. Let {N(t), t ∈ R+ } be the Poisson process, τ1 be the time moment of its first jump. Find the conditional distribution of τ1 given that the process has on [0, 1]: (a) Exactly one jump (b) At most one jump (c) At least m jumps (m ∈ N). 5.10. Prove that the distribution of the Poisson process defined on [0, 1] conditioned that the process has m jumps (m ∈ N) on [0, 1] is equal to the distribution of the process X(t) = ∑m k=1 1Iξk ≤t , t ∈ [0, 1], where ξ1 , . . . , ξm are i.i.d. random variables uniformly distributed on [0, 1]. 5.11. Let τ be a random variable, and X(t) = 1It≥τ ,t ∈ R be the corresponding waiting process. What distribution should the random variable τ follow in order for X to be a process with independent increments? 5.12. Prove Proposition 5.1. 5.13. Let τn be the time moment of the nth jump for the Poisson process. Prove that the distribution density of τn equals
λ^n x^{n−1} e^{−λx} / (n − 1)!,  x ≥ 0.
5.14. Let N be the Poisson process, τ_n be the time moment of its nth jump, and let
X(t) = N(t) for t ∈ [τ_{2n}, τ_{2n+1}), X(t) = N(t) − 1 for t ∈ [τ_{2n−1}, τ_{2n}).
Draw the trajectories of the process X. Calculate P(X(3) = 2), P(X(3) = 2, X(5) = 4). Is the process X a process with independent increments?
5.15. Let the number of signals transmitted via a communication channel during the time interval [0, t] be the Poisson process N with intensity λ > 0. Every signal is successfully received with probability p ∈ (0, 1), independently of the process N and of the other signals. Let {X(t), t ≥ 0} be the number of signals received successfully. Find: (a) one-dimensional; (b) multidimensional distributions of X.
5.16. The numbers of failures in work for plants A and B during the time period [0, t] are described by two independent Poisson processes with intensities λ_1 and λ_2, respectively. Find: (a) one-dimensional; (b) multidimensional distributions of the total number of failures for both plants A and B during the time period [0, t].
5.17. Let {ξ_n, n ≥ 1} be i.i.d. random variables with the distribution function F and ν be a Poisson distributed random variable with parameter λ independent of {ξ_n}. Prove that the process {X_ν(t) = ∑_{n=1}^{ν} 1I_{ξ_n ≤ t}, t ∈ R} is the Poisson process with the intensity measure κ determined by the relation κ((a, b]) = λ(F(b) − F(a)), a ≤ b.
5.18. Within the conditions of the previous problem, prove that all finite-dimensional distributions of the process λ^{−1} X_ν weakly converge, as λ → +∞, to the corresponding finite-dimensional distributions of a (nonrandom) process that is equal to the deterministic function F.
5.19. Let {ξ_n, n ≥ 1} be i.i.d. random variables with the distribution function F and {N(t), t ∈ R+} be the Poisson process with intensity λ independent of {ξ_n}. Prove that {X(x, t) = ∑_{n=1}^{N(t)} 1I_{ξ_n ≤ x}, (x, t) ∈ R × R+} is the Poisson random field with intensity measure κ × λ^1|_{R+}, where κ is the measure defined in Problem 5.17.
5.20. The compound Poisson process is a process of the form X(t) = ∑_{k=1}^{N(t)} ξ_k, t ∈ R+, where ξ_k, k ∈ N, are i.i.d. random variables and {N(t), t ∈ R+} is a Poisson process independent of {ξ_n}. Prove that the compound Poisson process is a homogeneous process with independent increments.
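The sketch below is not part of the original problem set; it merely illustrates the compound Poisson process defined in Problem 5.20 and checks its first two moments against the standard formulas EX(t) = λt Eξ_1 and DX(t) = λt Eξ_1^2. The function names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def compound_poisson(t, lam, xi_sampler, rng):
    """X(t) = xi_1 + ... + xi_{N(t)}, N(t) ~ Pois(lam * t), as in Problem 5.20."""
    n = rng.poisson(lam * t)
    return xi_sampler(n, rng).sum()

lam, t = 1.5, 2.0
vals = np.array([compound_poisson(t, lam, lambda n, r: r.exponential(1.0, n), rng)
                 for _ in range(10000)])
# For Exp(1) jumps: E X(t) = lam * t and D X(t) = lam * t * E xi^2 = 2 * lam * t
print(vals.mean(), lam * t, vals.var(), 2 * lam * t)
```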
5.21. Prove that the sum of two processes with independent increments, which are independent of each other, is again a process with independent increments.
5.22. Let W be the Wiener process, N_1, ..., N_m be Poisson processes with parameters λ_1, ..., λ_m, and let the processes W, N_1, ..., N_m be jointly independent. Let also c_0, ..., c_m ∈ R. Prove that X = c_0 W + c_1 N_1 + ··· + c_m N_m is a homogeneous process with independent increments and find the parameters a, B, Π in the Lévy–Khinchin formula for X (Theorem 5.2).
5.23. Let W be the Wiener process, N^λ and N^μ be two Poisson processes with intensities λ and μ, respectively. Let also {η_i, i ≥ 1} be i.i.d. random variables exponentially distributed with parameter α and {ζ_k, k ≥ 1} be i.i.d. random variables exponentially distributed with parameter β. The processes W, N^λ, N^μ and the sequences {η_i}, {ζ_k} are assumed to be jointly independent. Prove that

X(t) = bW(t) + ∑_{i=1}^{N^λ(t)} η_i − ∑_{i=1}^{N^μ(t)} ζ_i,  t ∈ R+,

is a homogeneous process with independent increments and find the parameters a, B, Π in the Lévy–Khinchin formula for X. Find the characteristic function of the variable X(t), t > 0, and express the distribution density of this variable in integral form.
5.24. Prove that the m-dimensional Wiener process is a process with independent increments and each of its increments W(t) − W(s), t > s, has the distribution N(0, (t − s)I_{R^m}).
5.25. Let {W(t), t ≥ 0} be the two-dimensional Wiener process, B(0, r) = {x ∈ R^2 | ||x|| ≤ r}, r > 0. Find P(W(t) ∈ B(0, r)).
5.26. Let {W(t) = (W_1(t), ..., W_m(t)), t ∈ R+} be an m-dimensional Wiener process and let a set A ⊂ R^m have zero Lebesgue measure. Prove that the total time spent by W in the set A equals zero a.s. (compare with Problem 3.21).
5.27. Let {W(t) = (W_1(t), ..., W_m(t)), t ∈ R+} be an m-dimensional Wiener process and x = (x_1, ..., x_m) ∈ R^m be a point such that ∑_{i=1}^m x_i^2 = 1. Prove that Y(t) := ∑_{i=1}^m x_i W_i(t), t ∈ R+, is the Wiener process.
5.28. Let {W(t) = (W_1(t), ..., W_m(t)), t ∈ R+} be an m-dimensional Wiener process and let an (m × m)-matrix U have real-valued entries and be orthogonal (i.e., UU^T = E). Prove that {W̃(t) = UW(t), t ≥ 0} is again an m-dimensional Wiener process.
5.29. Prove that there exists a homogeneous process {X(t), t > 0} with independent increments and distribution density

p_t(x) = (1/Γ(t)) x^{t−1} e^{−x} 1I_{x>0}.
5.30. Prove that there exists a homogeneous process {X(t), t > 0} with independent increments and the characteristic function E e^{izX(t)} = e^{−t|z|}, t > 0.
5.31. Find the parameters a, B, Π in the Lévy–Khinchin formula for the process from: (a) Problem 5.29; (b) Problem 5.30.
5.32. Prove that there does not exist a process with independent increments X such that: (a) X(0) has a continuous distribution but the distribution of X(1) has an atom; (b) the distribution of X(0) is absolutely continuous but the distribution of X(1) is not.
5.33. Prove that there does not exist a process with independent increments X such that X(0) is uniformly distributed on [0, 1] and X(1) is exponentially distributed.
5.34. Prove that there does not exist a homogeneous process with independent increments X with P(X(1) − X(0) = ±1) = 1/2.
5.35. Give examples of homogeneous processes X with independent increments defined on [0, 1] for which X(0) = 0 and the distribution of X(1) is: (a) discrete; (b) absolutely continuous; (c) continuous singular.
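The following sketch is not part of the original text; it numerically illustrates Problems 5.27 and 5.28: a unit-vector projection of an m-dimensional Wiener process is a one-dimensional Wiener process, and an orthogonal transformation preserves the law of the increments (here checked through the Euclidean norms of the increments). The names and the random orthogonal matrix construction are our own choices.

```python
import numpy as np

rng = np.random.default_rng(3)

m, n_steps, dt = 3, 1000, 0.01
dW = rng.normal(0.0, np.sqrt(dt), size=(n_steps, m))   # increments of an m-dim Wiener process
W = np.cumsum(dW, axis=0)

x = np.ones(m) / np.sqrt(m)          # unit vector, Problem 5.27
Y = np.cumsum(dW @ x)                # a one-dimensional Wiener process on the grid

U, _ = np.linalg.qr(rng.normal(size=(m, m)))            # a random orthogonal matrix, Problem 5.28
print(np.allclose(np.linalg.norm(dW, axis=1),
                  np.linalg.norm(dW @ U.T, axis=1)))    # True: norms of increments preserved
```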
5.36. Prove that the process with independent increments {X(t), t ∈ R+} is stochastically continuous if and only if its one-dimensional characteristic function φ_t(z) = E e^{izX(t)}, t ∈ R+, z ∈ R, is a continuous function of t for every fixed z.
5.37. Assume {X(t), t ∈ [0, 1]} is a process with independent increments. Does it imply that the process Y(t) = X(−t), t ∈ [−1, 0], has independent increments? Compare with Problem 12.20.
5.38. Let {X(t), t ∈ R+} be a nondegenerate homogeneous process with independent increments. Prove that P(|X(t)| > A) > 0 for any t > 0 and A > 0.
5.39. Let {X(t), t ∈ R+} be a nondegenerate homogeneous process with independent increments. Prove that for every a > 0, b > 0 there exist random variables τ_n, σ_n, n ≥ 1, such that almost surely τ_n < σ_n < τ_{n+1}, σ_n − τ_n ≤ a, and |X(σ_n) − X(τ_n)| > b for every n ≥ 1.
5.40. Let {X(t), t ∈ R+} be a homogeneous process with independent increments and piecewise constant trajectories. Prove that X is a compound Poisson process (see Problem 5.20).
5.41. Prove Proposition 5.2.
5.42. Prove that formula (5.2) defines a random variable (i.e., a measurable function of ω).
5.43. Prove that the compensated Poisson point measure ν̃ = ν − μ is the centered orthogonal measure with structural measure μ (here ν is a Poisson point measure with the intensity measure μ).
5.44. Prove equality (5.3).
5.45. Let ν be the Poisson point measure with intensity measure μ. Prove that:
(1) The characteristic function of the random variable ∫_E f(z) ν(dz) equals
φ(t) = exp[ ∫_E ( e^{itf(z)} − 1 ) μ(dz) ].
(2) The characteristic function of the random variable ∫_E g(z) ν̃(dz) equals
φ̃(t) = exp[ ∫_E ( e^{itg(z)} − 1 − itg(z) ) μ(dz) ]
(the functions f, g are such that the corresponding integrals are correctly defined).
5.46. Let {X(t), t ∈ T_1 × ··· × T_d} be the Poisson random field with intensity measure κ. Define the mapping ν on the ring K = { ∪_{i=1}^m (s^i, t^i], s^i, t^i ∈ T, s^i ≤ t^i, i = 1, ..., m, m ∈ N } by the equality

ν( ∪_{i=1}^m (s^i, t^i] ) = ∑_{i=1}^m Δ_{s^i, t^i}(X),  if (s^i, t^i] ∩ (s^j, t^j] = ∅, i ≠ j.
Prove that ν is the Poisson point measure with intensity measure κ.
5.47. Let T ⊂ R be an interval and let ν be a Poisson point measure with intensity measure κ defined on the ring K = { ∪_{i=1}^m (s^i, t^i], s^i, t^i ∈ T, s^i ≤ t^i, i = 1, ..., m, m ∈ N }. Assume that {X(t), t ∈ T} is a stochastic process such that X(t) − X(s) = ν((s, t]) for any s, t ∈ T, s ≤ t. Does it imply that X is the Poisson process with intensity measure κ? Compare with the previous problem.
5.48. Let {X(t) = ∑_{k=1}^{N(t)} ξ_k, t ∈ R+} be a compound Poisson process. Define the point measure ν on the ring K = { ∪_{i=1}^m (s^i, t^i], s^i, t^i ∈ R+, s^i ≤ t^i, i = 1, ..., m, m ∈ N } by the equality ν( ∪_{i=1}^m (s^i, t^i] ) = ∑_{i=1}^m Δ_{s^i, t^i}(X), where (s^i, t^i] ∩ (s^j, t^j] = ∅, i ≠ j. What distribution should the random variables {ξ_k} follow in order for ν to be a Poisson point measure? What is its intensity measure in that case?
5.49. Let {X(t) = ∑_{k=1}^{N(t)} ξ_k, t ∈ R+} be a compound Poisson process, E = R+ × R, K = { ∪_{i=1}^m (a^i, b^i], a^i, b^i ∈ E, a^i ≤ b^i, i = 1, ..., m, m ∈ N }. For a = (s, x), b = (t, y) ∈ E, a ≤ b, we define ν((a, b]) = ∑_{k=N(s)}^{N(t)} 1I_{ξ_k ∈ (x, y]} and put ν( ∪_{i=1}^m (a^i, b^i] ) = ∑_{i=1}^m ν((a^i, b^i]), where (a^i, b^i] ∩ (a^j, b^j] = ∅, i ≠ j. Prove that ν is a Poisson point measure with intensity measure λ(λ^1 × μ), where λ is the parameter of the process N, λ^1 is the Lebesgue measure, and μ is the distribution of the variable ξ_1.
5.50. Let (E, E) be a measurable space, {X_n, n ≥ 1} be a sequence of i.i.d. random elements with values in E, and ζ be a random variable following the distribution Pois(λ) and independent of {X_n, n ≥ 1}. Prove that the mapping ν : E ∋ A → ν(A) = ∑_{k=1}^{ζ} 1I_{X_k ∈ A} is the Poisson point measure with the intensity measure λμ, where μ is the distribution of X_1.
5.51. Let {X(t) = ∑_{k=1}^{N(t)} ξ_k, t ∈ R+} be a compound Poisson process, and α > 0 be a fixed number. Prove that E|X(t)|^α < +∞ for any t > 0 if and only if E|ξ_1|^α < +∞.
5.52. Let X be a Lévy process with the Lévy measure Π, and α > 0 be a fixed number. Prove that E||X(t)||^α_{R^d} < +∞ for any t > 0 if and only if ∫_{ ||u||_{R^d} > 1 } ||u||^α_{R^d} Π(du) < +∞.
5.53. (General "0 and 1" rule) Let {G_α, α ∈ A} be independent σ-algebras, A ⊃ A_1 ⊃ A_2 ⊃ ···, ∩_{n=1}^∞ A_n = ∅, B_k = σ( ∪_{α ∈ A_k} G_α ), k ∈ N. Prove: if A ∈ ∩_{k=1}^∞ B_k then P(A) = 0 or 1.
5.54. Let the process {X(t), t > 0} with independent increments have right-hand continuous trajectories. Prove that every random variable, measurable w.r.t. the σ-algebra ∩_{t>0} σ(X(s), s ≤ t), is degenerate, that is, takes a nonrandom value with probability one.
5.55. Describe all centered continuous Gaussian processes with independent increments whose trajectories have bounded variation on any segment.
Hints

5.1. Write down the explicit expressions for the characteristic functions of the increments.
5.3. Use Theorem 5.1.
5.4. Express the events in terms of increments of the process N. For example, {N(2) + N(3) = 1} = {N(2) = 0, N(3) = 1} = {N(2) − N(0) = 0, N(3) − N(2) = 1}. This implies that P(N(2) + N(3) = 1) = P(N(2) − N(0) = 0) P(N(3) − N(2) = 1) = e^{−2λ} · [e^{−λ} λ] = λ e^{−3λ}.
5.7. Write down the differential equations for the functions f_k(t) = P(X(t) = k), k ∈ Z+. Prove that the solution to this system of differential equations with f_0(0) = 1, f_k(0) = 0, k ≥ 1, is unique. Verify that the corresponding probabilities for the Poisson process satisfy this system.
5.9. The corresponding conditional distribution functions equal: (a) F_1(y) = P(τ_1 ≤ y / τ_1 ≤ 1, τ_2 > 1); (b) F_2(y) = P(τ_1 ≤ y / τ_2 > 1); (c) F_3(y) = P(τ_1 ≤ y / τ_m ≤ 1). Calculate these conditional probabilities using the identity {τ_m ≤ y} = {N(y) ≥ m}.
5.10. Use Problem 1.2.
5.12. For a given m ≥ 1 and 0 < a_1 < b_1 < a_2 < ··· < b_{m−1} < a_m, calculate the probability P(τ_1 ∈ (a_1, b_1], ..., τ_{m−1} ∈ (a_{m−1}, b_{m−1}], τ_m > a_m).
5.13. Differentiate by x the equality

P(τ_n ≤ x) = P(N(x) ≥ n) = 1 − ∑_{k=0}^{n−1} ((λx)^k / k!) e^{−λx},  x ≥ 0.
5.17 – 5.20. Calculate the joint characteristic functions of the increments.
5.21. Use the following general fact: if {η_α, α ∈ A} are jointly independent random variables, A_1, ..., A_n are disjoint subsets of A, and ζ_i ∈ σ(η_α, α ∈ A_i), i = 1, ..., n, then the variables ζ_1, ..., ζ_n are jointly independent.
5.23. In order to obtain the characteristic function φ_{X(t)}, use considerations similar to those used in the proof of Problem 5.17. In order to express the distribution density of X(t), use the inversion formula for the characteristic function: p_{X(t)}(x) = (2π)^{−1} ∫_R e^{−izx} φ_{X(t)}(z) dz.
5.28. Verify that the process UW(t) is also a process with independent increments and UW(t) − UW(s) follows the distribution N(0, (t − s)I_{R^m}).
5.29, 5.30. Use Theorem 5.1.
5.32. Assuming ζ = ξ + η with independent ξ and η, prove that: (a) if the distribution of ξ does not have atoms then the distribution of ζ does not have them either; (b) if the distribution of ξ has a density then the distribution of ζ has one, too.
5.33. Write down the characteristic function φ of the uniform distribution and the characteristic function ψ of the exponential distribution. Check that ψ/φ is not a characteristic function of a random variable.
5.38. If P(|X(t)| > A) = 0 then P(|X(t/2)| > A/2) = 0 (prove it!). Conclude that DX(t/2^k) ≤ A^2/2^{2k}. After that, deduce that DX(t) = 0.
5.42. For a simple function f, the integral on the right-hand side of (5.2) is equal to the sum of the values of ν̂ on a finite collection of sets A ∈ σ(K) with some nonrandom weights, and thus it is a random variable by statement (1) of Proposition 5.2. Any nonnegative measurable function can be approximated monotonically by simple ones.
5.44. First prove formula (5.3) for simple functions and then approximate (both pointwise and in the L^2(μ) sense) an arbitrary function by a sequence of simple ones.
5.45. For a simple function f, the integrals are sums of (independent) values of the Poisson measure ν or of the compensated Poisson measure ν̃ with some nonrandom weights. Use the explicit formula for the characteristic function of a Poisson random variable. Approximate a measurable function by a sequence of simple ones.
5.49, 5.50. Calculate the joint characteristic functions of the values of ν on disjoint sets A_1, ..., A_n.
5.54. Use Problem 5.53, putting A = N, G_α = σ(X(t) − X(s), 2^{−α−1} ≤ s < t ≤ 2^{−α}), α ∈ N, A_k = {k, k+1, ...}.
5.55. The process is identically equal to zero almost surely. In order to show this, make the appropriate change of the time variable and use Problem 3.19.
Answers and Solutions

5.6. P^N_{t_1,...,t_m}(A) = ∑_{(u_1,...,u_m) ∈ A} P(N(t_1) = u_1, ..., N(t_m) = u_m). For 0 < t_1 < ··· < t_m and u_1, ..., u_m ∈ Z+ such that u_1 ≤ ··· ≤ u_m,
P(N(t_1) = u_1, ..., N(t_m) = u_m) = P(N(t_1) = u_1) P(N(t_2) − N(t_1) = u_2 − u_1) × ··· × P(N(t_m) − N(t_{m−1}) = u_m − u_{m−1})
= e^{−λ t_m} (λ t_1)^{u_1} ··· (λ t_m − λ t_{m−1})^{u_m − u_{m−1}} / ( u_1! ··· (u_m − u_{m−1})! ).
5.8. P(X(t_i) = k_i, i = 1, ..., m) = P(N(Λ(t_i)) = k_i, i = 1, ..., m), where Λ(t) = ∫_0^t λ(s) ds, t ≥ 0, and N is the Poisson process with parameter λ = 1.
5.10. Let 0 ≤ t_1 < ··· < t_n ≤ 1 and u_1 ≤ ··· ≤ u_n ≤ m; then
P(N(t_1) = u_1, ..., N(t_n) = u_n / N(1) = m)
= ( λ^m e^{−λ} / m! )^{−1} · e^{−λ t_1} (λ t_1)^{u_1} / u_1! · e^{−λ(t_2 − t_1)} (λ(t_2 − t_1))^{u_2 − u_1} / (u_2 − u_1)! × ··· × e^{−λ(1 − t_n)} (λ(1 − t_n))^{m − u_n} / (m − u_n)!.
We finish the proof by using Problem 1.2.
5.11. The distribution should be degenerate.
5.12. Let m ≥ 1 be fixed. For 0 < a_1 < b_1 < a_2 < ··· < b_{m−1} < a_m we have that
P(τ_1 ∈ (a_1, b_1], ..., τ_{m−1} ∈ (a_{m−1}, b_{m−1}], τ_m > a_m) = P(N(a_1) = 0, N(b_1) − N(a_1) = 1, ..., N(a_m) − N(b_{m−1}) = 0)
= e^{−λ a_1} λ(b_1 − a_1) e^{−λ(b_1 − a_1)} ··· λ(b_{m−1} − a_{m−1}) e^{−λ(b_{m−1} − a_{m−1})} × e^{−λ(a_m − b_{m−1})}
= ∫_{(a_1, b_1] × ··· × (a_{m−1}, b_{m−1}] × (a_m, +∞)} λ^m e^{−λ x_m} dx_1 ... dx_m.
From the same formula with a_m replaced by an arbitrary b_m > a_m, we get

P( (τ_1, ..., τ_m) ∈ A ) = ∫_A λ^m e^{−λ x_m} dx_1 ... dx_m   (5.5)

for every set A of the form

A = (a_1, b_1] × ··· × (a_m, b_m],  a_1 < b_1 < ··· < a_m < b_m.   (5.6)

The joint distribution of the variables τ_1, ..., τ_m is concentrated on the set Δ_m := { (x_1, ..., x_m) | 0 ≤ x_1 ≤ ··· ≤ x_m }. Because the family of sets of the type (5.6) is a semiring that generates the Borel σ-algebra in Δ_m, relation (5.5) implies that the joint distribution density of the variables τ_1, ..., τ_m is equal to p(x_1, ..., x_m) = λ^m e^{−λ x_m} 1I_{0 ≤ x_1 ≤ ··· ≤ x_m}. On the other hand, for independent Exp(λ) random variables ξ_1, ..., ξ_m, the joint distribution density of the variables ξ_1, ξ_1 + ξ_2, ..., ξ_1 + ··· + ξ_m is equal to

λ e^{−λ x_1} 1I_{x_1 ≥ 0} ∏_{k=2}^m λ e^{−λ(x_k − x_{k−1})} 1I_{x_k − x_{k−1} ≥ 0} = p(x_1, ..., x_m),

that is, (τ_1, ..., τ_m) =^d (ξ_1, ..., ξ_1 + ··· + ξ_m). This proves the required statement, because the finite-dimensional distributions of a registration process are uniquely determined by the finite-dimensional distributions of the associated sequence {τ_k} (prove the latter statement!).
5.15. X is the Poisson process with intensity pλ.
5.16. X is the Poisson process with intensity λ_1 + λ_2.
5.17. Take x_1 < ··· < x_n, u_1, ..., u_n ∈ R and denote ΔX_i = X(x_i) − X(x_{i−1}), ΔF_i = F(x_i) − F(x_{i−1}). We have
E exp[ i(u_1 X(x_1) + u_2 ΔX_2 + ··· + u_n ΔX_n) ]
= ∑_{k=0}^∞ E( exp[ i(u_1 X(x_1) + u_2 ΔX_2 + ··· + u_n ΔX_n) ] / ν = k ) · λ^k e^{−λ} / k!
= ∑_{k=0}^∞ (λ^k e^{−λ} / k!) E exp[ i ∑_{j=1}^k ( u_1 1I_{ξ_j ≤ x_1} + u_2 1I_{x_1 < ξ_j ≤ x_2} + ··· + u_n 1I_{x_{n−1} < ξ_j ≤ x_n} ) ]
= ∑_{k=0}^∞ (λ^k e^{−λ} / k!) ( e^{iu_1} F(x_1) + e^{iu_2} ΔF_2 + ··· + e^{iu_n} ΔF_n + (1 − F(x_n)) )^k
= e^{λ F(x_1)[e^{iu_1} − 1]} e^{λ(F(x_2) − F(x_1))[e^{iu_2} − 1]} ··· e^{λ(F(x_n) − F(x_{n−1}))[e^{iu_n} − 1]}.
Thus, the variables X(x_1), ΔX_2, ..., ΔX_n are independent and follow the Poisson distributions with parameters λF(x_1), λΔF_2, ..., λΔF_n, respectively.
5.22. B = c_0^2, Π = ∑_{k=1}^m λ_k δ_{c_k}, a = ∑_{k: |c_k| ≤ 1} λ_k c_k.
5.23. B = b^2; Π(du) = λ α e^{−αu} 1I_{u > 0} du + μ β e^{βu} 1I_{u < 0} du; a = ∫_{|u| ≤ 1} u Π(du).
E e^{izX(t)} = exp[ −tb^2 z^2 / 2 + tλ( α/(α − iz) − 1 ) + tμ( β/(β + iz) − 1 ) ],
p_{X(t)}(x) = (1/2π) ∫_R exp[ −tb^2 z^2 / 2 + tλ( α^2/(α^2 + z^2) − 1 ) + tμ( β^2/(β^2 + z^2) − 1 ) ] cos( zx − tλαz/(α^2 + z^2) + tμβz/(β^2 + z^2) ) dz.
5.25. P(W(t) ∈ B(0, r)) = 1 − e^{−r^2/(2t)}.
5.26. E ∫_0^∞ 1I_{W(t) ∈ A} dt = ∫_0^∞ P(W(t) ∈ A) dt = ∫_0^∞ (2πt)^{−m/2} ∫_A e^{−||y||^2/(2t)} dy dt = 0.
5.34. Assume such a process to exist. Denote by φ the characteristic function of X(1/2) − X(0). One has E(X(1) − X(0))^2 < +∞ and therefore E(X(1/2) − X(0))^2 < +∞ (prove this, using that X has independent increments). Thus φ should be at least twice differentiable on the whole real line. On the other hand, φ^2(z) = cos z, z ∈ R, hence |φ(z) − φ(π/2)| = √(cos z) ∼ √(π/2 − z), z → (π/2)−. Therefore φ is not differentiable at the point π/2, which contradicts the assumption made above.
5.35. (a) The Poisson process. (b) The processes from Problems 5.29, 5.30. (c) The process X(t) = ∑_{k=1}^∞ (k!)^{−1} N_k(t), where N_k, k ≥ 1, are jointly independent Poisson processes with equal intensities.
5.36. If the process X is stochastically continuous then E e^{izX(t)} → E e^{izX(s)} as t → s by the dominated convergence theorem. On the other hand, if for every z the function t → E e^{izX(t)} is continuous at a point t = 0, then the continuity theorem for characteristic functions implies that X(t) − X(0) → 0 in distribution as t → 0+ and thus X(t) − X(s) → 0 in distribution as t → s. Now we can use the fact that convergence in distribution to a constant implies convergence in probability.
5.37. Not necessarily. For instance, the process N(−t) is not a process with independent increments because the values N(1) − N(0) and N(1) are not independent.
5.39. The random events A_n = {|X((n + 1)a) − X(na)| > b}, n ≥ 1, are jointly independent and have the same positive probability (Problem 5.38). By the Borel–Cantelli lemma, an infinite number of the events A_n, n ≥ 1, occur with probability 1.
5.41. Denote by C the family of triplets of sets A, B, C ∈ K_0 such that C = A ∪ B, A ∩ B = ∅. This family is countable since the family of all triplets A, B, C ∈ K_0 is countable. Therefore, the sets

Ω_0 = ∩_{(A,B,C) ∈ C} { ω | ν(A)(ω) + ν(B)(ω) = ν(C)(ω) },
Ω_1 = ∩_{(A,B,C) ∈ C} { ω | ν(A)(ω), ν(B)(ω), ν(C)(ω) ∈ Z+ }

are random events of probability 1. For a function defined on sets and taking nonnegative integer values, σ-additivity is equivalent to additivity (prove this!). Thus, for any ω ∈ Ω_0 ∩ Ω_1 the function K_0 ∋ A → ν(A)(ω) is σ-additive. By the Carathéodory theorem, this function can be extended to a σ-finite measure on σ(K_0) = σ(K). We denote this measure by ν̂(ω, ·). For ω ∉ Ω_0 ∩ Ω_1 we put ν̂(ω, ·) ≡ 0. Under this construction, for any ω, any A ∈ σ(K) with ν̂(ω, A) < +∞, and any ε > 0, there exists A_ε ∈ K_0 such that ν̂(ω, A △ A_ε) < ε. This implies that ν̂ takes nonnegative integer values and ν̂(A) = ν(A), A ∈ K, a.s. Therefore, because every one-point set belongs to σ(K), the measure ν̂ is a sum of δ-measures. Properties (2), (3) have been proved. To prove property (1), we use the "principle of the fitting sets". Consider the class A of the sets A ∈ σ(K) for which ν̂(A) is an extended random variable. Then A is a monotone class: A_1 ⊂ A_2 ⊂ ..., A = ∪_n A_n, A_n ∈ A, n ≥ 1, implies that ν̂(ω, A_n) → ν̂(ω, A) for every ω and thus ν̂(A) is an extended random variable; that is, A ∈ A. Furthermore, K_0 ⊂ A and thus A = σ(K_0).
5.48. The distribution of {ξ_k} should be degenerate, that is, P(ξ_k = a) = 1 for some a ∈ R. The intensity measure in this case is equal to λ(λ^1|_{R+} × δ_a) (λ is the intensity of the process N).
5.51. Necessity follows from the estimate E|X(t)|^α ≥ E|ξ_1|^α 1I_{N(t)=1} = λt e^{−λt} E|ξ_1|^α. To prove sufficiency, we use the inequalities

|x_1 + ··· + x_n|^α ≤ n^{α−1}( |x_1|^α + ··· + |x_n|^α ) for α ≥ 1,
|x_1 + ··· + x_n|^α ≤ |x_1|^α + ··· + |x_n|^α for α ∈ (0, 1).

The first one is the Jensen inequality; the second one can be verified straightforwardly. Then

E|X(t)|^α ≤ E|ξ_1|^α ∑_{n=1}^∞ (n^{α−1} ∨ 1) (λt)^n e^{−λt} / n! < +∞.
5.52. It can be verified easily that the characteristic function of the last term in (5.4) is infinitely differentiable. Thus, the norm of this term has a finite moment of an arbitrary order. The norms of the first two terms in (5.4) also have finite moments of an arbitrary order. Thus, E||X(t)||^α < +∞ if and only if E||Y(t)||^α < +∞, where Y(t) is the third term in (5.4). The process Y is a compound Poisson process with the distribution of its jump equal to [Π( ||u||_{R^d} > 1 )]^{−1} Π( · ∩ { ||u||_{R^d} > 1 } ) (verify this!). Now we can deduce the required statement using considerations analogous to those used in the previous solution.
5.53. Denote C_k = σ( ∪_{α ∈ A \ A_k} G_α ). Any set A that belongs to all B_k, k ∈ N, is independent of the σ-algebra C_k for any k ∈ N. Because ∪_{k=1}^∞ (A \ A_k) = A, this set is independent of σ( ∪_{k=1}^∞ C_k ) = σ( ∪_{α ∈ A} G_α ). But it is obvious that A belongs to σ( ∪_{α ∈ A} G_α ), and thus A is independent of A. This implies P(A) = P(A ∩ A) = P^2(A) and therefore P(A) = 0 or 1.
6 Gaussian processes
Theoretical grounds

Definition 6.1. Random variables ξ_1, ..., ξ_m are called jointly Gaussian if the characteristic function of their joint distribution has the form

E e^{i ∑_{k=1}^m z_k ξ_k} = e^{i(z, a) − (Bz, z)/2},  z = (z_1, ..., z_m) ∈ R^m,   (6.1)

where a ∈ R^m, B is a symmetric nonnegatively defined matrix, and (·, ·) denotes the scalar product in R^m. A random vector ξ = (ξ_1, ..., ξ_m) with jointly Gaussian coordinates is said to be Gaussian (or to follow the Gaussian distribution). The vector a in relation (6.1) is the mean vector of the random vector ξ and the matrix B is the covariance matrix of ξ. The distribution of the Gaussian vector with the characteristic function (6.1) is called the Gaussian measure with mean a and covariance B, and is denoted N(a, B). This distribution is uniquely determined by the mean vector a and the covariance matrix B, and for any a ∈ R^m and symmetric nonnegatively defined matrix B there exists a random vector following the distribution N(a, B). Definition 6.1 is equivalent to the following one.
Definition 6.2. A vector ξ = (ξ_1, ..., ξ_m) is called Gaussian if for any nonrandom vector z = (z_1, ..., z_m) ∈ R^m the product (z, ξ)_{R^m} = ∑_{j=1}^m z_j ξ_j is a Gaussian random variable.
Proposition 6.1. (1) Let ξ ∼ N(a, B) be an m-dimensional Gaussian vector, b ∈ R^n, and A be an n × m matrix. Then η = b + Aξ is an n-dimensional Gaussian vector, η ∼ N(b + Aa, ABA*).
(2) Let {ξ^n, n ∈ N} be a sequence of Gaussian random vectors weakly convergent to a vector ξ. Then ξ is a Gaussian vector.
(3) Let ξ_1, ..., ξ_m, η_1, ..., η_n be jointly Gaussian random variables and cov(ξ_j, η_k) = 0, j = 1, ..., m, k = 1, ..., n. Then the random vectors ξ = (ξ_1, ..., ξ_m) and η = (η_1, ..., η_n) are independent.
For two random vectors ξ = (ξ_1, ..., ξ_m) and η = (η_1, ..., η_n), the joint covariance matrix is the m × n matrix R_{ξη} consisting of the covariances of the elements of the vectors:

(R_{ξη})_{jk} = cov(ξ_j, η_k),  j = 1, ..., m, k = 1, ..., n.

Remark that, in this notation, R_{ξξ} is the covariance matrix R_ξ of the vector ξ.
Theorem 6.1. (On normal correlation) Let random variables ξ_1, ..., ξ_m, η_1, ..., η_n be jointly Gaussian. Then the conditional distribution of ξ = (ξ_1, ..., ξ_m) with respect to the σ-algebra generated by η = (η_1, ..., η_n) is Gaussian. If the matrix R_{ηη} is nondegenerate, then the mean vector of this conditional distribution equals a_{ξ|η} = a_ξ + R_{ξη} R_{ηη}^{−1} (η − a_η), and the covariance matrix equals R_{ξ|η} = R_{ξξ} − R_{ξη} R_{ηη}^{−1} R_{ηξ}.
Let us emphasize that, in the formulation of the theorem on normal correlation, it is crucial that the vectors ξ and η are parts of one Gaussian vector of length m + n. The statement of this theorem is often interpreted as follows. The conditional distribution of ξ given η = y is equal to

N( a_ξ + R_{ξη} R_{ηη}^{−1} (y − a_η),  R_{ξξ} − R_{ξη} R_{ηη}^{−1} R_{ηξ} ).

Definition 6.3. The stochastic process {X(t), t ∈ T} is called Gaussian if for any m ≥ 1 and any points {t_1, ..., t_m} ⊂ T the vector (X(t_1), ..., X(t_m)) follows the Gaussian distribution.
Theorem 6.2. (1) Let a : T → R be an arbitrary function, and R : T × T → R be a nonnegatively defined function. Then there exist a probability space (Ω, F, P) and a Gaussian stochastic process {X(t), t ∈ T}, defined on this space, for which a and R are the mean and covariance functions, respectively.
(2) The mean and covariance functions uniquely determine the finite-dimensional distributions of the Gaussian process.
The Wiener process is Gaussian (see Problem 6.4). According to Theorem 6.2, the following characterization is available for this process.
Proposition 6.2. A real-valued stochastic process {X(t), t ∈ R+} is the Wiener process if and only if it is a Gaussian process with a_X ≡ 0, R_X(t, s) = t ∧ s, t, s ∈ R+.
Let us give some examples of the most important Gaussian processes.
Example 6.1. The Brownian bridge is the centered Gaussian process {B(t), t ∈ [0, 1]} with covariance R_B(t, s) = t ∧ s − st, s, t ∈ [0, 1]. Sometimes it is called the Brownian bridge of length 1, starting from the point 0 and arriving at the point 0 (see Problem 6.21).
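The following numpy sketch is not part of the original text; it merely evaluates the formulas of Theorem 6.1 for a concrete joint Gaussian vector. The function name and the example (the Wiener process at times 0.5 and 1 conditioned on W(1) = 0, which reproduces the Brownian bridge variance t(1 − t)) are our own illustrative choices.

```python
import numpy as np

def conditional_gaussian(a, R, idx_xi, idx_eta, y):
    """Theorem 6.1: conditional mean and covariance of the sub-vector xi given eta = y,
    for a joint Gaussian vector with mean a and covariance R (R_eta,eta nondegenerate)."""
    a = np.asarray(a, dtype=float)
    R = np.asarray(R, dtype=float)
    a_xi, a_eta = a[idx_xi], a[idx_eta]
    R_xx = R[np.ix_(idx_xi, idx_xi)]
    R_xe = R[np.ix_(idx_xi, idx_eta)]
    R_ee_inv = np.linalg.inv(R[np.ix_(idx_eta, idx_eta)])
    mean = a_xi + R_xe @ R_ee_inv @ (np.asarray(y, dtype=float) - a_eta)
    cov = R_xx - R_xe @ R_ee_inv @ R_xe.T
    return mean, cov

# (W(0.5), W(1)) with R_W(t, s) = t ^ s, conditioned on W(1) = 0
a = np.zeros(2)
R = np.array([[0.5, 0.5],
              [0.5, 1.0]])
print(conditional_gaussian(a, R, idx_xi=[0], idx_eta=[1], y=[0.0]))
# expected: mean 0 and variance 0.5 - 0.25 = 0.25 = t(1 - t) at t = 0.5
```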
Example 6.2. The Ornstein–Uhlenbeck process is the centered Gaussian process {X(t), t ∈ R} with covariance R_X(t, s) = e^{−|t−s|}, s, t ∈ R.
Example 6.3. The fractional Brownian motion with Hurst index H ∈ (0, 1) is the centered Gaussian process {B^H(t), t ∈ R} with covariance R_{B^H}(t, s) = (1/2)( |t|^{2H} + |s|^{2H} − |t − s|^{2H} ), t, s ∈ R.
Remark 6.1. For t, s ≥ 0, one has R_{B^{1/2}}(t, s) = t ∧ s. Therefore, the process B^{1/2} restricted to R+ coincides with the Wiener process (see Proposition 6.2). Sometimes, the process B^{1/2} is called the two-sided Wiener process (see also Problem 6.8).
Proposition 6.3. The process

X^H(t) = ∫_R ( (t − s)_+^{1/2−H} − (−s)_+^{1/2−H} ) dW(s),  t ∈ R,

is the fractional Brownian motion with Hurst index H ∈ (0, 1).
Remember that the positive part of a number b ∈ R is b_+ = b 1I_{b>0}. The proposition presented above shows that the function R_{B^H} is nonnegatively defined.
Simple conditions are available that are sufficient for the existence of continuous and Hölder modifications of a centered Gaussian process.
Theorem 6.3. Let {X(t), t ∈ [0, T]} be a Gaussian process with a_X ≡ 0. Denote

σ_X^2(t, h) = E(X(t + h) − X(t))^2 = R_X(t + h, t + h) − 2R_X(t, t + h) + R_X(t, t).

(1) If σ_X^2(t, h) ≤ c |ln|h||^{−p}, t ∈ [0, T], h > 0, for some p > 3 and c > 0, then the process X has a continuous modification X̃.
(2) If there exist p > 0 and c > 0 such that σ_X^2(t, h) ≤ c|h|^p, t ∈ [0, T], h > 0, then for any ε > 0 there exist Ω′ ⊂ Ω, P(Ω′) = 1, and a function c = c(ω) : Ω′ → R such that for any t, s ∈ [0, T],

|X̃(t, ω) − X̃(s, ω)| ≤ c(ω) |t − s|^{p/2} |ln|t − s||^{(1+ε)/2}.

Corollary 6.1. The trajectories of the Wiener process for any ε > 0 a.s. satisfy the following inequality:

sup_{t, s ∈ [0,T], s ≠ t} |W(t) − W(s)| / ( |t − s| |ln|t − s||^{1+ε} )^{1/2} < +∞.
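The following sketch is not part of the original text. It illustrates the existence statement of Theorem 6.2 on a finite grid: a centered Gaussian process with a prescribed covariance can be sampled by a Cholesky factorization of the covariance matrix. The covariances used are those of Examples 6.1 to 6.3; the small diagonal "jitter" is a purely numerical device we add for stability, not part of the theory.

```python
import numpy as np

rng = np.random.default_rng(4)

def sample_gaussian_process(cov_fn, grid, rng, jitter=1e-10):
    """Sample a centered Gaussian process on a grid from its covariance function:
    on a finite grid this is simply a draw from N(0, R) via Cholesky."""
    R = np.array([[cov_fn(t, s) for s in grid] for t in grid])
    L = np.linalg.cholesky(R + jitter * np.eye(len(grid)))
    return L @ rng.normal(size=len(grid))

grid = np.linspace(0.01, 0.99, 200)
brownian_bridge = sample_gaussian_process(lambda t, s: min(t, s) - t * s, grid, rng)
ornstein_uhlenbeck = sample_gaussian_process(lambda t, s: np.exp(-abs(t - s)), grid, rng)
H = 0.7
fbm = sample_gaussian_process(
    lambda t, s: 0.5 * (abs(t) ** (2 * H) + abs(s) ** (2 * H) - abs(t - s) ** (2 * H)),
    grid, rng)
print(brownian_bridge[-1], ornstein_uhlenbeck[0], fbm[-1])
```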
Bibliography [9], Chapter II; [24], Volume 1, Chapter III, §1; [25], Chapter I, §2; [15], Chapter II, §3; [36].
Problems

6.1. Prove that Definitions 6.1 and 6.2 are equivalent.
6.2. Prove Proposition 6.1.
6.3. Let ξ_1, ..., ξ_m, η_1, ..., η_n be jointly Gaussian random variables; denote ζ_i = η_i − E[η_i / ξ_1, ..., ξ_m], i = 1, ..., n. Prove that the random vectors ξ = (ξ_1, ..., ξ_m) and ζ = (ζ_1, ..., ζ_n) are independent. Write the covariance matrix for the vector ζ assuming that the matrix R_ξ is nondegenerate.
6.4. Prove that the Wiener process is Gaussian.
6.5. Let {W(t), t ∈ R+} be a Wiener process. Verify that the following processes are Wiener processes as well (c, T > 0 are arbitrary constants).
(a) W^c(t) = √c W(t/c), t ∈ R+.
(b) W_c(t) = W(t + c) − W(c), t ∈ R+.
(c) W_T(t) = W(t), t ≤ T; W_T(t) = 2W(T) − W(t), t > T.
(d) Ŵ(t) = tW(1/t), t > 0, Ŵ(0) = 0.
(e) ←W_T(t) = W(T) − W(T − t), t ∈ [0, T].
Draw a trajectory of the process W and corresponding trajectories of the processes W^c, W_c, W_T, ←W_T.
6.6. Let W be a Wiener process. Does there exist a nonrandom function c(t) such that the process (a) X(t) = c(t)W(2t); (b) X(t) = c(t)W(1/t), t > 0, X(0) = 0; (c) X(t) = c(t)W(e^t) is a Wiener process? If the answer is positive, find the function c(t).
6.7. Give an example of a stochastic process, not being a Wiener process, with its mean and covariance functions equal to a_X ≡ 0, R_X(t, s) = t ∧ s, t, s ∈ R+.
6.8. (1) Let {W_1(t), t ∈ R+} and {W_2(t), t ∈ R+} be two independent Wiener processes. Prove that X(t) = W_1(t) for t ≥ 0 and X(t) = W_2(−t) for t ≤ 0 is a two-sided Wiener process. (2) Is the two-sided Wiener process a process with independent increments?
6.9. Prove that the fractional Brownian motion (in particular, the Wiener process) is a process with stationary increments; that is, the process {B̃^H(s) := B^H(t + s) − B^H(t), s ≥ 0} has the same distribution for all t ≥ 0.
6.10. Let {X(t), t ∈ R} be the Ornstein–Uhlenbeck process. Prove that {Y(t) = t^{1/2} X((1/2) ln t), t > 0} is the Wiener process.
6.11. Let {X(t), t ∈ [0, 1]} be the Brownian bridge. Prove that {Y (t) = (1 + t)X(1 − (t + 1)−1 ), t ∈ R+ } is the Wiener process. 6.12. Let {W (t), t ∈ R+ } be the Wiener process. Prove that {X(t) = e−t W (e2t ), t ∈ R} is the Ornstein–Uhlenbeck process. 6.13. Let {W (t), t ∈ [0, 1]} be the Wiener process. Prove that the process {X(t) = W (t) − tW (1), t ∈ [0, 1]} is the Brownian bridge. 6.14. Let {W (t), t ∈ [0, 1]} be the Wiener process. Prove that all (a) one-dimensional (b) m-dimensional conditional distributions of the process W given that {W (1) = 0} are equal to the corresponding (unconditional) distributions of the Brownian bridge. 6.15. Let {X(t), t ∈ [0, 1]} be the Brownian bridge and ξ be a random variable independent of X and following the distribution N(0, 1). Prove that W (t) = X(t) + t ξ is the Wiener process. 6.16. Construct a stochastic process {X(t), t ∈ R+ } such that all its one-dimensional distributions are Gaussian but the process is not a Gaussian one. 6.17. Let X be a centered Gaussian process with covariance RX . Find mean and covariance functions for the process Y (t) = X 2 (t). 6.18. Let 0 < s1 < s2 < s3 . Find the conditional distribution of X(s3 ) given {X(s1 ) = x1 , X(s2 ) = x2 } if X is (a) The Wiener process. (b) The Brownian bridge (we suppose in this case that s3 < 1). (c) The Ornstein–Uhlenbeck process. (d) The fractional Brownian motion with Hurst index H = 12 . What is a difference between items (a) – (c) and item (d)? 6.19. Let 0 < s1 < s2 < s3 . Find the conditional distribution of W (s1 ) given {W (s2 ) = x2 ,W (s3 ) = x3 }. 6.20. Let X be the Gaussian process, and s1 , s2 , s3 be the points from its domain. Consider the following property. The conditional distribution X(s3 ) w.r.t. the σ algebra σ (X(s1 ), X(s2 )) equals the conditional distribution X(s3 ) w.r.t. the σ -algebra σ (X(s2 )). (1) Prove that this property takes place if and only if RX (s1 , s2 )RX (s2 , s3 ) = RX (s1 , s3 )RX (s2 , s2 )
(6.2)
and, additionally, RX (s1 , s3 ) = 0 as soon as RX (s2 , s2 ) = 0. (2) Give an example such that the equality (6.2) holds but the property described above does not hold true. Compare with Problem 6.18. See also the definition of the Markov process in Chapter 12 and Problems 12.10,12.11.
6.21. The Brownian bridge of length l, starting from point x and arriving at point y, is the process B_{l,x,y}, defined on [0, l], for which the corresponding finite-dimensional distributions are equal to the conditional distributions of x + W(t_1), ..., x + W(t_m) given {x + W(l) = y} (W is the Wiener process). Prove that B_{l,x,y} is a Gaussian process and find its mean and covariance functions.
6.22. Let W be the Wiener process. Show that the process

X(t) = x + ((y − x)/l) t + ((l − t)/l) W( lt/(l − t) ),  0 ≤ t ≤ l,

is the Brownian bridge of length l starting from x and arriving at y.
6.23. The Ornstein–Uhlenbeck bridge of length l, starting from point x and arriving at point y, is the process U_{l,x,y}, defined on R, for which the corresponding finite-dimensional distributions are equal to the conditional distributions of the values U(t_1), ..., U(t_m) of the Ornstein–Uhlenbeck process given {U(0) = x, U(l) = y}. Prove that U_{l,x,y} is a Gaussian process and find its mean and covariance functions.
6.24. The Ornstein–Uhlenbeck process with the initial value x is the process U_x, defined on R, for which the corresponding finite-dimensional distributions are equal to the conditional distributions of the values U(t_1), ..., U(t_m) of the Ornstein–Uhlenbeck process given {U(0) = x}. Prove that U_x is a Gaussian process and find its mean and covariance functions.
6.25. Verify that the conditions on the correlation function R_X formulated in Problem 6.20 hold true for any s_1 < s_2 < s_3 if X is:
(1) The Ornstein–Uhlenbeck process with the initial value x (see Problem 6.24);
(2) The Ornstein–Uhlenbeck bridge of length l, starting from point x and arriving at point y (see Problem 6.23).
6.26. Let K ∈ C([a, b] × [a, b]) be a nonnegatively defined function, {λ_k, k ≥ 1} and {φ_k, k ≥ 1} be the eigenvalues and corresponding orthonormal eigenfunctions of the integral operator A_K (see Problem 2.28). Let {ξ_k} be i.i.d. random variables following the standard normal distribution. Prove that:
(1) The series X(t) = ∑_{k=1}^∞ √λ_k ξ_k φ_k(t) converges in mean square for any t ∈ [a, b].
(2) The process {X(t), t ∈ [a, b]} is a Gaussian one with a_X ≡ 0, R_X = K.
6.27. Find the eigenvalues {λ_k, k ≥ 1} and orthonormal eigenfunctions {φ_k, k ≥ 1} of the operator A_K in the case when K is the covariance of: (a) the Wiener process on [0, 1]; (b) the Brownian bridge.
6.28. Using Problem 6.26 and item (a) of Problem 6.27, prove that the Wiener process on [0, 1] has the following representation:

W(t) = ∑_{k=1}^∞ ξ_k · ( √2 / (π(k − 1/2)) ) · sin[ (k − 1/2) π t ],  t ∈ [0, 1],
where {ξ_k, k ≥ 1} are i.i.d. random variables following the standard normal distribution.
6.29. Prove that the Wiener process on [0, T] can be written as

W(t) = ∑_{k=1}^∞ ξ_k · ( √(2T) / (π(k − 1/2)) ) · sin[ (π/T)(k − 1/2) t ],  t ∈ [0, T],

where {ξ_k, k ≥ 1} are i.i.d. random variables following the standard normal distribution. If T ≠ 1 then, on [0, 1 ∧ T], this representation differs from the one given in the previous problem.
6.30. Using Problem 6.26 and item (b) of Problem 6.27, prove that the Brownian bridge has the following representation:

X(t) = ∑_{k=1}^∞ ξ_k · ( √2 / (πk) ) · sin[πk t],  t ∈ [0, 1],

where {ξ_k, k ≥ 1} are i.i.d. random variables following the standard normal distribution.
6.31. Using the previous problem, prove that the Wiener process on [0, 1] can be written as

W(t) = ξ_0 · t + ∑_{k=1}^∞ ξ_k · ( √2 / (πk) ) · sin[πk t],  t ∈ [0, 1],

where {ξ_k, k ≥ 0} are i.i.d. random variables following the standard normal distribution.
6.32. (1) Let an operator B : L^2([0, T]) → L^2([0, T]) be defined by the equality (Bx)(t) = ∫_0^t x(r) dr, t ∈ [0, T]. Find the adjoint operator B* and prove that A_K = BB*, where K(t, s) = t ∧ s, t, s ∈ [0, T].
(2) Let {e_k, k ≥ 1} be some orthonormal basis (ONB) in L^2([0, T]). Take f_k = Be_k, k ≥ 1, and put

W(t) = ∑_{k=1}^∞ ξ_k · f_k(t),  t ∈ [0, T],   (6.3)

where {ξ_k, k ≥ 1} are i.i.d. random variables following the standard normal distribution. Prove that the series converges for any t in the mean square sense and its sum is the Wiener process on [0, T].
(3) Let T = 1. Specify the bases {e_k, k ≥ 1} such that the corresponding representations (6.3) coincide with those given in Problems 6.28, 6.31.
6.33. Let a stochastic process {X(t), t ∈ R+} be defined by the stochastic integral X(t) = ∫_0^t f(s) dW(s), where W is the Wiener process and f ∈ L^2([0, T]) for every T > 0. Prove that X is a Gaussian process. Find its mean and covariance functions.
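The following sketch is not part of the original text; it numerically illustrates the series representation of Problem 6.28 by summing a truncated Karhunen–Loève expansion of the Wiener process on [0, 1] and checking that the variance of the partial sum at a fixed t approaches t. The function name and the truncation level are our own choices.

```python
import numpy as np

rng = np.random.default_rng(5)

def wiener_kl_partial_sum(t, xi):
    """Truncated series of Problem 6.28:
    W(t) ~ sum_k xi_k * sqrt(2) / (pi (k - 1/2)) * sin((k - 1/2) pi t)."""
    k = np.arange(1, len(xi) + 1)
    coeff = np.sqrt(2.0) / (np.pi * (k - 0.5))
    return np.sum(xi * coeff * np.sin((k - 0.5) * np.pi * t))

n_terms, t = 500, 0.37
xi = rng.normal(size=n_terms)
k = np.arange(1, n_terms + 1)
# variance of the partial sum: sum of coeff_k^2 * sin^2(...), which tends to t = E W(t)^2
var_partial = np.sum((2.0 / (np.pi * (k - 0.5)) ** 2) * np.sin((k - 0.5) * np.pi * t) ** 2)
print(wiener_kl_partial_sum(t, xi), var_partial, t)
```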
6.34. (1) Prove that the process X from the previousproblem has the same finitedimensional distributions with the process Y (t) = W 0t f 2 (s) ds , t ∈ R+ . (2) Prove that the process X has a continuous modification. 6.35. Let K(t, s), t, s ∈ [0, T ] be a symmetric nonnegatively defined function. Assume there exists a function Q ∈ L2 ([0, T ]2 ) such that K(t, s) =
T
Q(t, r)Q(s, r) dr, 0
t, s ∈ [0, T ].
(6.4)
Prove that X(t) = 0T Q(t, s)dW (s), t ∈ [0, T ] is a centered Gaussian process and its covariance function is equal to K. 6.36. Let, in the previous problem, K(t, s) = t ∧ s − ts,t, s ∈ [0, 1]. Give an example of a function Q satisfying (6.4). Write the corresponding integral representation for the Brownian bridge. 6.37. Let K(t, s) = t ∧ s, t, s ∈ [0, 1]. Give two different functions Q satisfying (6.4). Compare the corresponding representations for the Wiener process with Problem 6.5. 6.38. Let X(t) = process.
t
s−t dW (s),t −∞ e
∈ R. Prove that X is the Ornstein–Uhlenbeck
6.39. Let {X(t), t ∈ [0, T ]} be a Gaussian process and E|X(t + h) − X(t)|2 ≤ ψ 2 (h) 2 with R+ ψ (e−x )dx < ∞. Prove that the process X has a continuous modification.
Hints 6.3. Prove that every variable ζi is uncorrelated with ξ j and use Proposition 6.1. 6.5. Prove that every process from items (a)–(e) is a Gaussian one and find the mean and covariance functions. 6.10 — 6.13,6.15. Prove that the process in the formulation of the problem is a Gaussian one, and find its mean and covariance functions. 6.14. The first version of the solution: use the theorem on normal correlation; the second version: let W be the Wiener process,and X(t) = W (t) − tW (1),t ∈ [0, 1] be the Brownian bridge (see Problem 6.13). We have cov(X(t),W (1)) = 0 (check this!), thus the process X and the random variable W (1) are independent (prove this!). This implies the required statement. 6.21, 6.23, 6.24. Use the theorem on normal correlation. 6.27. (1) The equation for eigenvalues and eigenfunctions of AK has the form λ f (t) = 0t s f (s) ds + t t1 f (s) ds,t ∈ [0, 1]. If λ = 0 then after differentiation by t we get t1 f (s) ds = 0,t ∈ [0, 1] and thus f ≡ 0. Therefore, λ > 0 (recall that AK ≥ 0; see Problems 2.28, 2.29). Then f is differentiable and λ f (t) = t1 f (s) ds; furthermore, f (0) = f (1) = 0. Thus the eigenvalues and eigenfunctions of the operator AK satisfy the boundary problem λ f = − f , f (0) = f (1) = 0.
(2) Similar arguments lead to the following boundary problem for the eigenvalues and eigenfunctions of the operator AK : λ f = − f , f (0) = f (1) = 0. 6.29. Use similar arguments as in Hint 6.27, item (a) on a segment [0, T ]. Another opportunity is to use Problem 6.28 and Problem 6.5, item (a). 6.33. Use Proposition 6.1 and the definition of the stochastic integral as the limit of the integrals of stepwise functions. 6.34. (1) Use Theorem 6.2. (2) Use Theorem 3.2. 6.39. Use Theorem 3.3.
Answers and Solutions 6.1. If ξ satisfies Definition 6.1 then for any z ∈ Rm the characteristic function of the variable (z, ξ )Rm is equal to ψz (t) = E exp{it(z, ξ )Rm } = exp[itaz − 12 t 2 σz2 ], t ∈ R, where az = (a, z)Rm , σz2 = (Bz, z)Rm . So, (z, ξ )Rm ∼ N(az , σz2 ). If ξ satisfies Definition 6.2 then E exp{i(z, ξ )Rm } = exp[iE(z, ξ )Rm − 12 D(z, ξ )Rm ], z ∈ Rm . Taking into account that E(z, ξ )Rm = (z, aξ )Rm , D(z, ξ )Rm = (Bξ z, z)Rm we obtain (6.1). 6.2. (1) If ξ is a Gaussian vector and z ∈ Rn , then (z, η )Rn = (z, b)Rn + (A∗ z, ξ )Rm = c + (z1 , ξ )Rm is also Gaussian (c = (z, b)Rn ∈ R, z1 = A∗ z ∈ Rm , and (z1 , ξ )Rm is a Gaussian random variable). The expressions for aη , Bη are obtained from the following relations: (z, aη )Rn = E(z, b + Aξ )Rn = (z, b)Rn + E(A∗ z, ξ )Rm = (z, b)Rn + (A∗ z, a)Rm , (Bη z, z)Rn = D(z, η )Rn = D(A∗ z, ξ )Rm = (BA∗ z, A∗ z)Rm , z ∈ Rn . (2) If ξ n → ξ weakly then φξ n (z) → φξ (z), z ∈ Rm . Therefore, there exist functions ψ1,2 such that φξ (z) = exp[iψ1 (z)− ψ2 (z)], (aξ n , z)Rm → ψ1 (z), 12 (Bξ n z, z)Rm → ψ2 (z) for every z ∈ Rm . It follows from the last two relations that aξ n → a, Bξ n → B, where a is a vector and B is a symmetric nonnegatively defined matrix. This implies (6.1) for the characteristic function of ξ . (3) The characteristic function of the vector ζ = (ξ1 , . . . , ξm , η1 , . . . , ηn ), due to the formula (6.1), can be written in the form
φζ (z1 , . . . , zn+m ) = φξ (z1 , . . . , zm )φη (zm+1 , . . . , zn+m ), z = (z1 , . . . , zn+m ) ∈ Rn+m . The independence of ξ and η follows from this expression. 6.3. Without loss of generality we can assume the variables ξi , η j to be centered. Using item (1) of Proposition 6.1, we obtain that ξ and ζ = η − Rηξ R−1 ξ are jointly ξξ
R . Using item (3) of the same Gaussian, uncorrelated, and Rζ = Rη − Rηξ R−1 ξξ ξη proposition we get the independence of ξ and ζ . 6.4. Let t1 < · · · < tm . The random variables ξ1 = W (t1 ), ξ2 = W (t2 )−W (t1 ), . . . , ξm = W (tm )−W (tm−1 ) are Gaussian and independent, and thus, the vector ξ = (ξ1 , . . . , ξm ) is also Gaussian (this can be easily obtained from Definition 6.1). The vector (W (t1 ), . . . ,W (tm )) is the image of the vector ξ under the linear mapping A : (x1 , . . . , xm ) → (x1 , x1 + x2 , . . . , x1 + · · · + xm ) and thus, is Gaussian (Proposition 6.1, item (1)).
√ 6.6. (a) c(t) = 1/ 2. (b) c(t) = t. (c) Such a function does not exist. 6.7. The compensated Poisson process with intensity λ = 1. 6.17. aX 2 (t) = RX (t,t), RX 2 (t, s) = 2R2X (t, s). 6.18. By the theorem on normal correlation, every required conditional distribution is a Gaussian one with the constant variance σ 2 and the mean a being a linear combination of the values x1 and x2 . Further on, we give the values a, σ 2 for every item: (a) x2 , s3 − s2 . b) 1 − s3 x2 , 1 − s2
(b)
(1 − s3 )(s2 − s1 ) . 1 − s2
(c) es2 −s3 x2 , 1 − e2(s2 −s3 ) . a=
d)
r32 r11 − r31 r12 r31 r22 − r32 r12 x1 + x2 , 2 2 r11 r22 − r12 r11 r22 − r12
σ 2 = r33 −
2 + r r 2 − 2r r r r11 r32 22 31 12 31 32 , 2 r11 r22 − r12
2H 2H where ri j = 12 (s2H i + s j − |si − s j | ). In items (a) – (c), in contrast to item (d), the conditional distribution does not depend on the value of X at the point s1 . 6.19. s1 s1 x2 , (s2 − s1 ) . N s2 s2 2 > 0 (i.e., the covariance matrix of the 6.20. (1) Denote ri j = RX (si , s j ). If r11 r22 − r12 vector (X(s1 ), X(s2 )) is nondegenerate), then using the theorem on normal correlation we obtain the explicit expressions for the conditional distributions X(s3 ) w.r.t. σ (X(s2 )), σ (X(s1 ), X(s2 )); these distributions are Gaussian with the means
a3|2 =
r32 r31 r22 − r32 r12 r32 r11 − r31 r12 X(s2 ), a3|12 = X(s1 ) + X(s2 ) 2 2 r22 r11 r22 − r12 r11 r22 − r12
and the variances 2 σ3|2 = r33 −
2 2 + r r 2 − 2r r r r32 r11 r32 22 31 12 31 32 2 , σ3|12 = r33 − . 2 r22 r11 r22 − r12
When the required property holds, the coefficient near X(t1 ) in the expression for a3|12 has to be zero, and thus the condition r31 r22 − r32 r12 = 0 is necessary for this property to hold. One can verify straightforwardly that, under this condition, 2 = σ2 σ3|2 3|12 and r32 r11 − r31 r12 r32 = . 2 r22 r11 r22 − r12
Therefore, this condition is also sufficient. 2 = 0 should be considered separately. If, in addition, The case r11 r22 − r12 r22 > 0 then X(s1 ) = cX(s2 ) + d for some constants c, d and thus σ (X(s1 ), X(s2 )) = σ (X(s2 )) and the needed property holds true automatically. At the same time r31 r22 = cr32 r22 = r32 (cr22 ) = r32 r12 . If r22 = 0 then the σ -algebra σ (X(s2 )) is degenerate and thus the conditional distribution of X(s3 ) w.r.t. it being equal to the ordinary distribution of X(s3 ). Consequently, when r22 = 0, we can reformulate the required property in the following form. The conditional distribution X(s3 ) w.r.t. σ (X(s1 )) equals to the ordinary distribution of X(s3 ). The latter claim is known to be equivalent to the independence of the random variables X(s1 ) and X(s3 ), and consequently (because X(s1 ), X(s3 ) are jointly Gaussian) to their noncorrelatedness. (2) X(s2 ) = 0, X(s1 ) = X(s3 ) ∼ N(0, 1). 6.21. aBl,x,y (t) = [(y − x)/l]t, RBl,x,y (t, s) = t ∧ s − (ts/l). ⎧ t ⎪ t ≤ 0, ⎨xe , sh(l−t) sht 6.23. aUl,x,y (t) = sh l x + sh l y, t ∈ [0, l], ⎪ ⎩ l−t t ≥l ye , ⎧ ⎪ s,t ≤ 0, e−|t−s| − es+t , ⎪ ⎪ ⎪ ⎨e−|t−s| − e2l−s−t , s,t ≥ l, RUl,x,y (t, s) = −|t−s| − 1 − e−2l sh(l−s) sh(l−t) , s,t ∈ [0, l], ⎪ e ⎪ (sh l)2 ⎪ ⎪ ⎩0 in other cases. 6.24. aUx (t) = xe−|t| , RUx (t, s) = e−|t−s| − e−|t|−|s| . 6.26. By the Mercer theorem, K(t, s) = ∑∞ k=1 λk φk (t)φk (s), s,t ∈ [a, b] and the series converges uniformly on [a, b] × [a, b]. Thus, for any t ∈ [a, b] we have that ∑k≥n λk φk2 (t) → 0, n → ∞ and we obtain the statement of item (1). (2) The series expansion for the kernel K yields K = RX . Proposition 6.1 provides that the process X is a Gaussian one. √ −2
, φk (t) = 2 sin π k − 12 · t , k ≥ 1. 6.27. (a) λk = π k − 12 √ ( b) λk = [π k]−2 , φk (t) = 2 sin[π k · t] , k ≥ 1. 6.32. (1) For any f , g ∈ L2 ([0, 1]), T t T T f (s) ds g(t) dt = g(t) dt f (s) ds, (B f , g)L2 ([0,T ]) = 0
and thus B∗ g(t) =
0
0
s
T
g(s) ds. Then t T ∗ f (r) dr ds = BB f (t) = t
0
s
0
T
0
t∧r
ds f (r) dr = AK f (t)
(2) For any f , g ∈ L2 ([0, T ]) we have E(W, f ) = ∑k ( fk , f )Eξk = 0,
) cov ((W, f ), (W, g)) = cov
*
∑( f j , f )ξ j , ∑( fk , g)ξk j
k
= ∑( f j , f )( fk , g) cov (ξ j , ξk ) = ∑( fk , f )( fk , g) k, j
k
= ∑(ek , B∗ f )(ek , B∗ g) = (B∗ f , B∗ g) = (AK f , g) k
(here (·, ·) denotes the inner product in L2 ([0, T ])). This yields aW = 0, RW = K. Proposition 6.1 implies that the √process
X is Gaussian. (3) In Problem 6.28: ek (t) = 2 cos π (k − 12 ) · t , k ≥ 1. In Problem 6.31: e1 ≡ 1, √ ek (t) = 2 cos[π (k − 1) ·t] , k ≥ 2. 6.33. aX ≡ 0, RX (t, s) = 0t∧s f 2 (r) dr. 6.35. The process is Gaussian due to Proposition 6.1 and the definition of the stochastic integral as the limit of integrals of stepwise functions. By the properties of the stochastic integral (see Theorem 13.1), EX(t) = 0, % $ T T Q(t, r)dW (r) Q(s, r)dW (r) cov(X(t), X(s)) = E =
T 0
0
0
Q(t, r)Q(s, r) dr = K(t, s), t, s ∈ [0, T ].
6.36. One can take Q(t, s) = 1Is≤t − t, s,t ∈ [0, 1]. The corresponding representation has the form X(t) = 01 1Is≤t dW (s) −t 01 dW (s) = W (t) −tW (1), t ∈ [0, 1] (compare with Problem 6.13). 6.37. One can take Q1 (t, s) = 1Is≤t , Q2 (t, s) = 1Is≥1−t , s,t ∈ [0, 1]. Then the processes obtained as the integral transformations of the given Wiener process W ← − with the kernels Q1 , Q2 are equal to X1 = W , X2 = W = W (1) − W (1 − ·). See Problem 6.5 (e).
7 Martingales and related processes in discrete and continuous time. Stopping times
Theoretical grounds

Let T be a set with linear order, for instance, T = R+ or T = Z+ := N ∪ {0} with the usual relation ≤. Let also {Ω, F, {F_t}_{t∈T}, P} be a probability space with a complete right-hand continuous filtration (such a space is sometimes called a stochastic basis).
Definition 7.1. A stochastic process {X(t), t ∈ T} is said to be a martingale if it satisfies the following three conditions.
(1) For any t ∈ T the random variable X(t) ∈ L^1(P) (i.e., E|X(t)| < ∞; sometimes we say that the process X is integrable on T).
(2) For any t ∈ T the random variable X(t) is F_t-measurable (sometimes we say that the process X(t) is F_t-adapted).
(3) For any s ≤ t, s, t ∈ T, it holds that E(X(t)/F_s) = X(s) P-a.s.
If we replace the sign = in condition (3) by ≥, so that E(X(t)/F_s) ≥ X(s) P-a.s. for any s ≤ t, s, t ∈ T, we obtain the definition of a submartingale; if E(X(t)/F_s) ≤ X(s) P-a.s. for any s ≤ t, s, t ∈ T, then we have a supermartingale. In what follows, a property that takes place P-a.s. is denoted simply "a.s." A vector process is called a (sub-, super-) martingale if each of its components has the corresponding property. We denote the fact that a stochastic process {X(t), t ∈ T} is F_t-adapted as {X(t), F_t, t ∈ T}.
Definition 7.2. A mapping τ : Ω → T ∪ {∞} is said to be a Markov moment if for any t ∈ T the event A := {ω ∈ Ω | τ(ω) ≤ t} ∈ F_t. A stopping time is a Markov moment τ for which τ < ∞ a.s. The σ-algebra generated by the Markov moment τ is the class of events F_τ := {A ∈ F | A ∩ {τ ≤ t} ∈ F_t, t ∈ R+} (for continuous time) and F_τ := {A ∈ F | A ∩ {τ ≤ n} ∈ F_n, n ≥ 0} (for discrete time).
Definition 7.3. The Markov moment τ(ω) is called predictable if there exists a sequence {τ_n, n ≥ 1} of Markov moments such that:
(1) τ_n(ω) is an increasing sequence a.s. and lim_{n→∞} τ_n(ω) = τ(ω) a.s.
(2) For any n ≥ 1 it holds that τ_n(ω) < τ(ω) a.s. on the set {τ(ω) > 0}.
One sometimes says that the sequence τ_n predicts the Markov moment τ.
Definition 7.4. A σ-algebra on [0, ∞) × Ω is called predictable if it is generated by the random intervals [τ, σ) := {(t, ω) | τ(ω) ≤ t < σ(ω)}, where τ and σ are predictable Markov moments.
Definition 7.5. A stochastic process {A(t), t ∈ T} defined on (Ω, F) and with values in a measurable space (X, X) is said to be predictable if the mapping A: [0, ∞) × Ω → X is measurable with respect to the predictable σ-algebra on [0, ∞) × Ω. Usually we consider X = R or R^m.
In the case T = Z+, Definition 7.5 takes the following form: a discrete-time process A(t) is predictable if A(0) is a constant and A(t) is an F_{t−1}-measurable r.v. for t ∈ N. If T = R+, then left-hand continuous (in particular, continuous) processes are predictable.
Below in this chapter discrete-time processes are denoted by X_n, M_n, and so on (with the time index written as a subscript), and continuous-time processes, as before, are denoted by X(t), M(t), and so on (with the time index inside the parentheses).
Theorem 7.1. (1) A supermartingale {X(t), F_t, t ∈ R+} has a càdlàg modification if and only if EX(t) is a right-hand continuous function of t ∈ R+ (the definition of a càdlàg modification can be found in the Theoretical grounds of Chapter 3).
(2) Let {X(t), F_t, t ∈ R+} be a right-hand continuous supermartingale. Then X(t) has left-hand limits a.s., and almost all its trajectories are bounded on every segment [0, T], T > 0.
Definition 7.6. A family of random variables {ξ_α, α ∈ A} is said to be uniformly integrable if
lim_{C→∞} sup_{α∈A} ∫_{{|ξ_α|>C}} |ξ_α| dP = 0.
A stochastic process {X(t), t ∈ T} is said to be uniformly integrable if the family of random variables {X(t), t ∈ T} is uniformly integrable.
Definition 7.7. A right-hand continuous uniformly integrable (sub-, super-)martingale {X(t), t ∈ R+} belongs to the class D if the family of random variables {X(τ), τ is a Markov moment} is uniformly integrable.
In what follows we use X(∞) to denote the limit of a stochastic process: X(∞) = lim_{t→∞} X(t), provided the limit exists a.s.
Theorem 7.2. (Doob–Meyer decomposition for supermartingales from the class D) Let {X(t), F_t, t ∈ R+} be a right-hand continuous supermartingale from the class D. Then there exists a unique predictable right-hand continuous nondecreasing process {A(t), F_t, t ∈ R+} such that A(0) = 0, EA(∞) < ∞, and the process M(t) := X(t) + A(t) is a uniformly integrable martingale.
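The discrete-time counterpart of this statement is the Doob decomposition (see Problem 7.62), where the compensator can often be written down explicitly. The following Python sketch is an illustrative addition, not part of the original text: for the submartingale X_n = S_n² built from a symmetric ±1 random walk, the conditional increment E(X_n − X_{n−1}/F_{n−1}) equals 1, so the compensator is A_n = n; the simulation checks empirically that M_n = X_n − A_n has constant mean. The sample sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n_steps, n_paths = 50, 100000

# Symmetric +/-1 random walk S_n; X_n = S_n^2 is a submartingale.
steps = rng.choice([-1, 1], size=(n_paths, n_steps))
S = np.cumsum(steps, axis=1)
X = S ** 2

# Compensator computed by hand: E(X_n - X_{n-1} | F_{n-1}) = E(2 S_{n-1} xi + xi^2) = 1,
# hence A_n = n and M_n = X_n - n should be a martingale.
A = np.arange(1, n_steps + 1)
M = X - A

print("E M_n for n = 1, 10, 25, 50:",
      [round(M[:, k].mean(), 3) for k in (0, 9, 24, 49)])  # all close to 0
```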
Definition 7.8. A stochastic process {M(t), F_t, t ∈ R+} is said to be a
(1) Local martingale if there exists a sequence of Markov moments {τ_n, n ≥ 1} such that: (a) 0 ≤ τ_n ≤ τ_{n+1} a.s., (b) τ_n → ∞ a.s., (c) for any n ≥ 1 the stopped process M^{τ_n}(t) := M(t ∧ τ_n) is an F_t-martingale;
(2) Square-integrable martingale if it is a martingale and EM²(t) < ∞ for any t ∈ R+;
(3) Locally square-integrable martingale if it is a local martingale and every stopped process M^{τ_n}(t) is a square-integrable martingale.
We say that such a sequence of Markov moments localizes the corresponding process.
Theorem 7.3. (Doob–Meyer decomposition for general supermartingales) Let {X(t), F_t, t ∈ R+} be a right-hand continuous supermartingale. Then there exists a unique decomposition of the form X(t) = M(t) − A(t), t ∈ R+, where {A(t), F_t, t ∈ R+} is a nondecreasing predictable process with A(0) = 0 a.s. and {M(t), F_t, t ∈ R+} is a local martingale.
Let us introduce the following notation: M is the class of all martingales defined on some fixed stochastic basis; M_loc is the class of all local martingales; M² is the class of square-integrable martingales; and, finally, M²_loc is the class of locally square-integrable martingales.
A stopping time or Markov moment τ is called bounded if there exists a constant C > 0 such that τ ≤ C a.s.
Theorem 7.4. (Doob's stopping theorem, or the optional sampling theorem for discrete time) Let {X_n, F_n, n ∈ Z+} be an integrable stochastic process. Then the following conditions are equivalent.
(1) {X_n, F_n, n ∈ Z+} is a martingale (submartingale).
(2) E(X_τ/F_σ) = (≥) X_{τ∧σ} for any bounded Markov moment τ and any Markov moment σ.
(3) EX_τ = (≥) EX_σ for all bounded Markov moments τ and σ such that τ ≥ σ a.s.
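As an illustrative addition (not part of the original text), the following Python sketch checks the conclusion of Theorem 7.4 numerically: for a symmetric random walk and the bounded Markov moment "first exit from (−a, b), capped at N", the empirical mean EX_τ stays close to EX_0 = 0. The parameters a, b, N and the sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, N, a, b = 50000, 200, 5, 7

steps = rng.choice([-1, 1], size=(n_paths, N))
X = np.concatenate([np.zeros((n_paths, 1), dtype=int),
                    np.cumsum(steps, axis=1)], axis=1)

# tau = first exit time from (-a, b), capped at N: a bounded Markov moment.
exited = (X <= -a) | (X >= b)
tau = np.where(exited.any(axis=1), exited.argmax(axis=1), N)

X_tau = X[np.arange(n_paths), tau]
print("E X_tau ≈", round(X_tau.mean(), 4), "(should be close to E X_0 = 0)")
```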
Corollary 7.1. Let {X_n, F_n, n ∈ Z+} be a martingale, and let σ and τ be stopping times for which E|X_σ| < ∞, E|X_τ| < ∞, and lim inf_{n→∞} ∫_{{σ≥n}} |X_n| dP = lim inf_{n→∞} ∫_{{τ≥n}} |X_n| dP = 0. Then E(X_τ/F_σ) · 1I_{τ≥σ} = X_σ · 1I_{τ≥σ} a.s. In particular, if σ ≤ τ a.s., then EX_τ = EX_σ.
Theorem 7.5. (A version of Doob's optional sampling theorem for continuous time) Let {X(t), F_t, t ∈ R+} be a submartingale with right-hand continuous trajectories such that for some random variable η with E|η| < ∞ the inequality X(t) ≤ E(η/F_t) holds a.s. for all t ∈ R+. Then for any Markov moments σ and τ it holds that X(σ ∧ τ) ≤ E(X(τ)/F_σ).
Corollary 7.2. Let {X(t), F_t, t ∈ R+} be a uniformly integrable martingale with right-hand continuous trajectories. Then, for some r.v. η with E|η| < ∞, it holds that X(t) = E(η/F_t) a.s., t ∈ R+, and thus, for Markov moments σ and τ we obtain the equality (see Problem 7.98) X(σ ∧ τ) = E(X(τ)/F_σ).
Now, let T = Z+.
Definition 7.9. (1) The stochastic process [M, M]_n := ∑_{k=1}^n (M_k − M_{k−1})² is called the quadratic variation of a martingale {M_n, F_n, n ∈ Z+}.
(2) The stochastic process [M, N]_n := ∑_{k=1}^n (M_k − M_{k−1})(N_k − N_{k−1}) is called the joint quadratic variation of two martingales {M_n, N_n, F_n, n ∈ Z+}.
(3) Let {M_n, N_n, F_n, n ∈ Z+} be square-integrable martingales; that is, for all n ∈ Z+ it holds that E|M_n|² < ∞, E|N_n|² < ∞. The process ⟨M, M⟩_n := ∑_{k=1}^n E((M_k − M_{k−1})²/F_{k−1}) is called the quadratic characteristic of the martingale M, and the process ⟨M, N⟩_n := ∑_{k=1}^n E((M_k − M_{k−1})(N_k − N_{k−1})/F_{k−1}) is called the joint quadratic characteristic of the martingales M and N.
In many cases the notations [M] and ⟨M⟩ are used instead of [M, M] and ⟨M, M⟩, correspondingly.
Suppose now that T = R+. Denote by M̂² the class of all square-integrable martingales {M(t), F_t, t ∈ R+} such that their trajectories belong a.s. to the space D(R+) of functions without discontinuities of the second kind (right-hand continuous with left-hand limits at every point), and such that sup_{t∈R+} EM²(t) < ∞.
If M ∈ M̂², then the process M²(t) is a submartingale from the class D (see Problem 7.90). According to the Doob–Meyer decomposition for submartingales from the class D (an evident modification of Theorem 7.2), there exist a unique integrable nondecreasing predictable process A(t) and a martingale L(t) such that
M²(t) = A(t) + L(t), t ∈ R+.   (7.1)
Definition 7.10. (1) The predictable process A(t) in decomposition (7.1) is said to be the quadratic characteristic of the martingale M and is denoted by ⟨M, M⟩(t) or ⟨M⟩(t).
(2) The predictable process ⟨M, N⟩(t) = ½(⟨M + N⟩(t) − ⟨M⟩(t) − ⟨N⟩(t)) is the joint quadratic characteristic of two martingales M and N from M̂².
Note that the space M̂² is a Hilbert space with the inner product (M, N) := EM(∞)N(∞) (see Problem 7.89), and the subset M^{2,c} of continuous square-integrable martingales is a closed subspace of M̂². The orthogonal complement to M^{2,c} in M̂² is called the space of "purely discontinuous" martingales and is denoted by M^{2,d}.
Theorem 7.6. Any martingale M ∈ M̂² admits a unique decomposition of the form M(t) = M^c(t) + M^d(t), where M^c ∈ M^{2,c} is a continuous martingale and M^d ∈ M^{2,d} is a "purely discontinuous" martingale.
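To make the distinction in Definition 7.9 concrete in discrete time, here is a short Python sketch (an illustrative addition, not part of the original text). It builds a martingale M_n = ∑ c_k ε_k with a predictable coefficient c_k and Gaussian noise ε_k, accumulates both the realized quadratic variation [M]_n and the quadratic characteristic ⟨M⟩_n, and checks that EM_n² ≈ E[M]_n ≈ E⟨M⟩_n. The specific coefficient c_k = 1 + 0.5|M_{k−1}| and the sample sizes are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps = 100000, 30

M = np.zeros(n_paths)
QV = np.zeros(n_paths)   # [M, M]_n : sum of squared realized increments
QC = np.zeros(n_paths)   # <M, M>_n : sum of conditional second moments

for _ in range(n_steps):
    c = 1.0 + 0.5 * np.abs(M)        # predictable (F_{n-1}-measurable) coefficient
    eps = rng.normal(size=n_paths)   # fresh independent N(0,1) noise
    dM = c * eps
    QV += dM ** 2                    # realized squared increment
    QC += c ** 2                     # E((dM)^2 | F_{n-1}) = c^2
    M += dM

# For a square-integrable martingale, E M_n^2 = E[M]_n = E<M>_n.
print(round((M ** 2).mean(), 2), round(QV.mean(), 2), round(QC.mean(), 2))
```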
Definition 7.11. The nondecreasing integrable stochastic process
[M](t) := ⟨M^c⟩(t) + ∑_{s≤t} (Δ_s M^d)²
is called the quadratic variation of a martingale M ∈ M̂². Here Δ_s M = Δ_s M^d = M^d(s) − M^d(s−) is the jump of the martingale M^d at the point s ∈ R+.
The series above converges a.s. for all t ∈ R+. Moreover, E([M](t) − ⟨M⟩(t)/F_s) = [M](s) − ⟨M⟩(s) for all s ≤ t. In particular, E[M](t) = E⟨M⟩(t), and M²(t) − [M](t) is a martingale. Evidently, [M](t) = ⟨M⟩(t) for M ∈ M^{2,c}.
The definition of the quadratic variation [M] can be extended to processes M ∈ M²_loc. The point is that M ∈ M²_loc admits a unique decomposition M = M^c + M^d into continuous and "purely discontinuous" processes M^c, M^d ∈ M²_loc. So it is possible to define
[M](t) = ⟨M^c⟩(t) + ∑_{s≤t} (Δ_s M)²,
and the series converges a.s. for all t ∈ R+.
Let us present general inequalities for martingales.
Theorem 7.7. (Doob's inequalities for discrete-time sub- and supermartingales)
(1) Let {X_n, F_n, n ∈ Z+} be a submartingale. Then for any C > 0, N ∈ Z+ it holds that
P(max_{0≤n≤N} X_n ≥ C) ≤ EX_N^+ / C.
(2) Let {X_n, F_n, n ∈ Z+} be a supermartingale. Then for any C > 0, N ∈ Z+ it holds that
P(max_{0≤n≤N} X_n ≥ C) ≤ (2/C) max_{0≤n≤N} E|X_n|.
(3) Let {X_n, F_n, n ∈ Z+} be a positive supermartingale. Then for any C > 0, N ∈ Z+ it holds that
P(max_{0≤n≤N} X_n ≥ C) ≤ EX_0 / C.
Corollary 7.3. Let {X_n, F_n, n ∈ Z+} be a martingale and p ≥ 1; then
P(max_{0≤n≤N} |X_n| ≥ C) ≤ E|X_N|^p / C^p, C > 0, N ∈ Z+.
Recall the notation ‖X‖_p := (E|X|^p)^{1/p} for any p ≥ 1 and r.v. X.
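The following Python sketch is an illustrative addition (not part of the original text): it checks Theorem 7.7(1) by Monte Carlo for the nonnegative submartingale X_n = |S_n| built from a symmetric random walk. The threshold C, horizon N, and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(4)
n_paths, N, C = 100000, 100, 12.0

# Nonnegative submartingale: X_n = |S_n| for a symmetric +/-1 random walk S.
S = np.cumsum(rng.choice([-1, 1], size=(n_paths, N)), axis=1)
X = np.abs(S)

lhs = (X.max(axis=1) >= C).mean()            # P(max_{n <= N} X_n >= C)
rhs = np.maximum(X[:, -1], 0.0).mean() / C   # E X_N^+ / C
print(f"P(max X_n >= C) = {lhs:.4f} <= E X_N^+ / C = {rhs:.4f}")
```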
Theorem 7.8. (Maximum integral Doob's inequalities for discrete-time submartingales) Let {X_n, F_n, n ∈ Z+} be a submartingale.
(1) If p > 1, then
‖sup_{0≤n≤N} X_n^+‖_p ≤ (p/(p − 1)) ‖X_N^+‖_p.
(2) If p = 1, then
‖sup_{0≤n≤N} X_n^+‖_1 := E sup_{0≤n≤N} X_n^+ ≤ (e/(e − 1)) (1 + EX_N^+ (ln^+ X_N^+)).
Definition 7.12. Let {X_n, F_n, n ∈ Z+} be a submartingale and let (a, b) be an interval with a < b. Define the Markov moments τ_k, k ≥ 0, in the following way: τ_0 = 0; τ_{2m−1} = min{n : n > τ_{2m−2}, X_n ≤ a}; τ_{2m} = min{n : n > τ_{2m−1}, X_n ≥ b}, m ∈ N. If the set in braces is empty, the corresponding τ_k and all τ_j with j > k are set equal to infinity. Let us also define
β_N(a, b) := 0 if τ_2 > N, and β_N(a, b) := max{m : τ_{2m} ≤ N} if τ_2 ≤ N.
This value is the number of "bottom-up crossings" (upcrossings) of the strip (a, b) by the process X during the time period from 0 until N.
Theorem 7.9. (Doob's theorem on the number of crossings) Let {X_n, F_n, n ∈ Z+} be a submartingale and a < b. Then it holds that
Eβ_N(a, b) ≤ E(X_N − a)^+ / (b − a) ≤ (E|X_N| + |a|) / (b − a).
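As an illustrative addition (not part of the original text), the following Python sketch counts the upcrossings of Definition 7.12 along simulated random-walk paths and compares Eβ_N(a, b) with the bound of Theorem 7.9. The strip, horizon, and sample size are arbitrary choices.

```python
import numpy as np

def upcrossings(x, a, b):
    """Number of upcrossings of the strip (a, b) by the finite path x (Definition 7.12)."""
    count, waiting_for_low = 0, True
    for v in x:
        if waiting_for_low and v <= a:
            waiting_for_low = False      # tau_{2m-1}: the path has dropped to level a
        elif not waiting_for_low and v >= b:
            count += 1                   # tau_{2m}: the path has climbed back to level b
            waiting_for_low = True
    return count

rng = np.random.default_rng(5)
n_paths, N, a, b = 20000, 200, -2.0, 2.0

S = np.cumsum(rng.choice([-1, 1], size=(n_paths, N)), axis=1)  # martingale, hence submartingale
beta = np.array([upcrossings(path, a, b) for path in S])

lhs = beta.mean()
rhs = np.maximum(S[:, -1] - a, 0.0).mean() / (b - a)
print(f"E beta_N(a,b) = {lhs:.3f} <= E(X_N - a)^+ / (b - a) = {rhs:.3f}")
```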
Corollary 7.4. (Doob’s theorem on convergence of the discrete-time submartingale) Let {Xn , Fn , n ∈ Z+ } be a submartingale for which supn E|Xn | < ∞. Then there exists the limit limn→∞ Xn = X∞ with probability one, and E|X∞ | < ∞. (This statement evidently holds true for a discrete-time supermartingale as well.) Theorem 7.10. Let {Xn , Fn , n ∈ Z+ } be a martingale. The following conditions are equivalent. (1) Xn converges in L1 (P) as n → ∞. (2) supn E|Xn | < ∞ (due to this condition there exists a limit limn→∞ Xn = X∞ a.s.) and Xn = E(X∞ /Fn ) a.s. (3) There exists an integrable random variable η such that Xn = E(η /Fn ), n ∈ Z+ . (4) The martingale Xn is uniformly integrable. Let us present inequalities for martingale transformations (you can find the notation for ξ ◦ M in Problem 7.12).
Theorem 7.11. Let {M_n, F_n, n ∈ Z+} be a martingale, {ξ_n, F_n, n ∈ Z+} be a predictable sequence, and |ξ_n| ≤ 1 a.s. for any n ∈ N. Then the following inequalities hold.
(1) P(sup_n |(ξ ◦ M)_n| ≥ C) ≤ 17 C^{−1} sup_n E|M_n|, C > 0.
(2) E|(ξ ◦ M)_n|^p ≤ C_p E|M_n|^p, p > 1, n ≥ 0.
Theorem 7.12. Let {X_n, F_n, n ∈ Z+} be either a martingale or a nonnegative submartingale, let [X]_n be its quadratic variation, and let [X] = lim_{n→∞} [X]_n (the limit exists a.s. because the stochastic process [X]_n is nondecreasing in n). Then for any C > 0 the inequality P([X] > C) ≤ 3 C^{−1} sup_{n≥0} E|X_n| holds true.
Theorem 7.13. (Burkholder–Davis inequalities for discrete-time martingales) If {X_n, F_n, n ∈ Z+} is a martingale and X_0 = 0, then for all p ≥ 1 there exist constants c_p and C_p such that
c_p E[X]^{p/2} ≤ E sup_{n≥0} |X_n|^p ≤ C_p E[X]^{p/2}.
Theorem 7.13 admits the following generalization. Let a function Φ: R+ → R+ be nondecreasing, Φ(0) = 0, and let Φ be a function of bounded growth; that is, there exists c_0 > 0 such that Φ(2x) ≤ c_0 Φ(x), x ∈ R+.
Theorem 7.14. Let, in addition to the bounded growth, Φ be a convex function. Then there exist positive constants c and C, independent of X, such that
c EΦ([X]^{1/2}) ≤ EΦ(sup_{n≥0} |X_n|) ≤ C EΦ([X]^{1/2}),
where {X_n, F_n, n ∈ Z+} is a martingale with X_0 = 0.
Theorem 7.15. Let {X_n, F_n, n ∈ Z+} be a martingale with X_0 = 0. Then:
(1) For all 0 < p ≤ 2 there exists C_p > 0 such that
E sup_{n≥0} |X_n|^p ≤ C_p E⟨X⟩^{p/2},
where ⟨X⟩ = lim_{n→∞} ⟨X⟩_n;
(2) For all p ≥ 2 there exists c_p > 0 such that
c_p E⟨X⟩^{p/2} ≤ E sup_{n≥0} |X_n|^p.
Theorem 7.16. (Doob’s inequalities for continuous-time martingales and submartingales) (1) Let {X(t), Ft , t ∈ [0, T ]} be a continuous submartingale. Then for any C > 0, P( sup X(t) ≥ C) ≤ 0≤t≤T
EX + (T ) . C
(2) Let {X(t), Ft , t ∈ [0, T ]} be a continuous square integrable martingale. Then for any C > 0, EX 2 (T ) . P( sup |X(t)| ≥ C) ≤ C2 0≤t≤T
Theorem 7.17. (Burkholder–Davis inequalities for continuous-time martingales) Let {X(t), F_t, t ∈ R+} be a martingale with X(0) = 0. Then for all p ≥ 1 there exist c_p > 0 and C_p > 0 such that
c_p E[X]^{p/2} ≤ E sup_{t∈R+} |X(t)|^p ≤ C_p E[X]^{p/2},
where [X] = lim_{t→∞} [X](t); the existence of the limit follows from the fact that the process [X](t) is the quadratic variation of the martingale X and is monotonically nondecreasing in t.
For p > 1 the inequalities of Theorem 7.17 were first proved by Burkholder; for p = 1 they were proved by Davis. For discrete time we have the following version of the Burkholder inequalities. If {X_n, F_n, n ∈ Z+} is a martingale with X_0 = 0, then for all p > 1 and all n ≥ 1
c_p E[X]_n^{p/2} ≤ E|X_n|^p ≤ C_p E[X]_n^{p/2},
where c_p = (18 p^{3/2}/(p − 1))^{−p} and C_p = (18 p^{3/2}/(p − 1)^{1/2})^p. It is evident that E|X_n|^p can be replaced by E sup_{n≥1} |X_n|^p. If p = 1, then the left-hand part of this inequality does not hold (see Problem 7.67, item (3)).
The Burkholder–Davis inequalities are generalizations of the Khinchin and Marcinkiewicz–Zygmund inequalities for sums of independent random variables.
Khinchin inequalities. Let {ξ_i, i ≥ 1} be i.i.d. random variables with P(ξ_i = 1) = P(ξ_i = −1) = 1/2, and let {c_i, i ≥ 1} be a real sequence. Then for any p ∈ (0, +∞) there exist A_p and B_p such that for any n ≥ 1
A_p (∑_{i=1}^n c_i²)^{p/2} ≤ E|∑_{i=1}^n c_i ξ_i|^p ≤ B_p (∑_{i=1}^n c_i²)^{p/2}.
Marcinkiewicz–Zygmund inequalities. If {ξ_i, i ≥ 1} are independent integrable random variables with Eξ_i = 0, then for any p ≥ 1 there exist A_p and B_p, independent of the ξ_i, such that
A_p E(∑_{i=1}^n ξ_i²)^{p/2} ≤ E|∑_{i=1}^n ξ_i|^p ≤ B_p E(∑_{i=1}^n ξ_i²)^{p/2}.
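As a purely illustrative addition (not part of the original text), the following Python sketch estimates the middle quantity of the Khinchin inequalities for a fixed coefficient vector and reports its ratio to (∑ c_i²)^{p/2}; by the inequality this ratio must lie between A_p and B_p. The value of p, the coefficients, and the sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(6)
p, n, n_samples = 3.0, 50, 200000

c = rng.uniform(-1.0, 1.0, size=n)                   # fixed real coefficients
xi = rng.choice([-1.0, 1.0], size=(n_samples, n))    # Rademacher signs
sums = xi @ c

moment = np.mean(np.abs(sums) ** p)                  # E|sum c_i xi_i|^p
scale = (c @ c) ** (p / 2)                           # (sum c_i^2)^{p/2}
ratio = moment / scale
# By the Khinchin inequalities, A_p <= ratio <= B_p; heuristically (CLT) the ratio
# for p = 3 should be near E|N(0,1)|^3 = 2*sqrt(2/pi) ≈ 1.60.
print("ratio:", round(ratio, 3))
```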
Theorem 7.18. Let M ∈ M²_loc and let its quadratic characteristic ⟨M⟩ be continuous. Then for all ε > 0, N > 0, T > 0,
P(sup_{0≤t≤T} |M(t)| ≥ ε) ≤ N/ε² + P(⟨M⟩_T ≥ N).
Theorem 7.18 is a particular case of the Lenglart inequality [58]. The following theorem is useful on numerous occasions.
Theorem 7.19. (Lévy) Let (Ω, F, {F_n}_{n∈Z+}, P) be a probability space equipped with a filtration F_n ⊂ F. Let also F_∞ = σ{∪_{n∈Z+} F_n} and let ξ be a r.v. with E|ξ| < ∞. Then the sequence E(ξ/F_n) is uniformly integrable and converges: E(ξ/F_n) → E(ξ/F_∞) as n → ∞, a.s. and in the space L1(P).
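As an illustrative addition (not part of the original text), the following Python sketch shows the convergence asserted in Theorem 7.19 for F_n generated by the dyadic partition of [0, 1]: here ξ = f(U) with U uniform, F_∞ = σ(U), so E(ξ/F_∞) = ξ, and the binwise averages approximating E(ξ/F_n) approach ξ in L1 as n grows. The function f and the sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(7)
U = rng.uniform(size=200000)            # omega ~ Uniform(0, 1)
xi = np.sin(2 * np.pi * U) + U ** 2     # an integrable r.v. xi = f(U)

def cond_exp_dyadic(values, u, n):
    """Monte Carlo estimate of E(xi / F_n), F_n generated by dyadic intervals of length 2^-n."""
    bins = np.minimum((u * 2 ** n).astype(int), 2 ** n - 1)
    means = (np.bincount(bins, weights=values, minlength=2 ** n)
             / np.bincount(bins, minlength=2 ** n))
    return means[bins]

for n in (1, 2, 4, 8, 12):
    err = np.mean(np.abs(cond_exp_dyadic(xi, U, n) - xi))   # L1 distance to E(xi/F_infty) = xi
    print(f"n = {n:2d}   E|E(xi/F_n) - xi| ≈ {err:.4f}")
```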
Bibliography [9], Chapter IV; [90], Chapter 7; [24], Volume 1, Chapter II, §2; Volume 3, Chapter I, §1; [25], Chapter III, §1; [51], Chapter 13; [57]; [58], Part I; [79], Chapters 4 and 12; [22], Chapter VI; [23], Chapter 5; [82], Volume 2, Chapter VII; [20], Chapters 3 – 10; [8], Chapters 3 and 4; [46], Chapter 12, §12.3, Chapter 13; [54], Chapter 1, §1.2, Chapter 3, §3.3; [68], Chapters 9 and 10; [85], Chapters 2, 4, and 14.
Problems
7.1. Prove the following. If {ξ_n, n ≥ 1} is a square integrable sequence of random variables on a probability space (Ω, F, P) and ξ_n → ξ P-a.s., where ξ is also a random variable, then Eξ_n → Eξ and for any σ-algebra G ⊂ F it holds that E(ξ_n/G) → E(ξ/G). Extend this statement to a uniformly integrable sequence of random variables.
7.2. Prove that every martingale is still a martingale with respect to its natural filtration (see Definition 3.4).
7.3. Prove that a process {X(t), F_t, t ∈ T} is a submartingale if and only if {−X(t), F_t, t ∈ T} is a supermartingale.
7.4. Prove that a process {X(t), F_t, t ∈ T} is a martingale if and only if for all s, t ∈ T, s ≤ t, and any event A ∈ F_s,
∫_A X(s) dP = ∫_A X(t) dP.
7.5. Let {Xn , Fn , n ∈ Z+ } be an integrable-adapted sequence. Prove that Xn is a martingale if and only if for any n ∈ Z+ it holds that E(Xn+1 /Fn ) = Xn . 7.6. Prove that a linear combination of any finite number of martingales is a martingale as well. 7.7. Prove that for a (sub-, super-) martingale {X(t), Ft , t ∈ T} the expectation is a constant (nondecreasing, nonincreasing) function in t ∈ T. 7.8. Let T = {0, 1, . . . , N} or T = [0, T ], {Ft }t∈T be a filtration on a probability space (Ω , F, P), and X ∈ L1 (P) be an integrable r.v. Prove that the stochastic process {X(t) := E(X/Ft ),t ∈ T} is a martingale (this is the so-called L´evy martingale). 7.9. Let {X(t), Ft , t ∈ T}, T = Z+ or R+ , be an integrable process with independent increments. Prove that {X(t) − EX(t), Ft , t ∈ T} is a martingale with respect to its natural filtration (see Definition 3.4). In particular a Wiener process is a martingale. If {N(t), Ft , t ∈ R+ } is a homogeneous Poisson process with intensity λ , then {N(t) − λ t, Ft , t ∈ R+ } is a martingale.
7.10. (Martingale generated by a random walk, particular case of Problem 7.9) Let {ξn , n ∈ Z+ } be a sequence of integrable independent random variables with Eξn = 0, n ∈ Z+ . Prove that the sequence {Xn := ∑nk=0 ξk , Fn , n ∈ Z+ } is a martingale; moreover the equality holds, Fn := σ {ξ0 , . . . , ξn } = σ {S0 , . . . , Sn }. 7.11. Let {ξk , k ∈ Z+ } be a sequence of independent integrable random variables with Eξk = 1. Prove that the process Mn := ∏nk=0 ξk is a martingale with respect to both its natural filtration Fn = σ {Mk , 0 ≤ k ≤ n} and filtration σ {ξk , 0 ≤ k ≤ n}. Explain why these two filtrations can be different. 7.12. Let {ξn , Fn , n ∈ N} be a predictable bounded sequence, {Mn , Fn , n ∈ Z+ } be a martingale, and Yn := ∑nk=1 ξk (Mk − Mk−1 ). A process {Yn , Fn , n ∈ N} is said to be a martingale transformation of the martingale M or a discrete stochastic integral with respect to M, and denoted as Yn = (ξ ◦ M)n . (1) Prove that {Yn , Fn , n ∈ N} is a martingale. (2) Let ξk ≥ 0 a.s. and M be a submartingale. Prove that Y is a submartingale as well. 7.13. Let Q be a probability measure on {Ω , F, {Ft }t∈T }. The measure Q is locally absolutely continuous with respect to a measure P (it is denoted as Qloc P ) if for every t ∈ T the restriction Qt = Q|Ft of the measure Q on Ft is absolutely continuous with respect to the restriction Pt = P|Ft of the measure P on Ft . Let Qloc P and X(t) := dQt /dPt , t ∈ T is the Radon–Nikodym derivative. Prove that the process {X(t), Ft , t ∈ T} is a P-martingale, that is, a martingale provided all conditional expectations are computed with respect to the measure P. 7.14. (1) Let {X(t), Ft , t ∈ T} be a martingale, h : R → R be a convex function, and E|h(X(t))| < ∞, t ∈ T. Prove that {h(X(t)), Ft , t ∈ T} is a submartingale. (2) Let {X(t), Ft , t ∈ T} be a submartingale, the function h be convex as above, and, in addition, nondecreasing. Moreover, let E|h(X(t))| < ∞, t ∈ T. Prove that {h(X(t)), Ft , t ∈ T} is a submartingale. (3) Let {X(t),Y (t), Ft , t ∈ T} be martingales with EX 2 (t) < ∞. Prove that X 2 (t), |X(t)|, and |X(t)| ∨ |Y (t)| are submartingales. 7.15. (Kendall’s example) We suppose that an alarm clock should ring at 6 a.m. but the person wakes up in the middle of the night and cannot sleep any more. He or she doesn’t know what time it is and does not want to look at the clock. Let X(t) be the conditional probability that the alarm clock will ring at the latest, 60 minutes after awakening, computed after t minutes after awakening. The σ -algebra Ft , t ≥ 0 is assumed to be a σ -algebra generated by the information on the ringing of the clock from the moment of awakening till the moment t. Show that {X(t), Ft , t ∈ R+ } is a martingale. 7.16. Let {ξn , n ∈ Z+ } be a sequence of integrable random variables, Fn = σ {ξ0 , . . . , ξn } and Xn = ∑nk=0 ξk , n ∈ Z+ . Prove that {Xn , Fn , n ∈ Z+ } is a martingale if and only if E (ξn+1 fn (ξ0 , . . . , ξn )) = 0 for any bounded Borel function fn : Rn → R and all n ≥ 0.
7.17. Let {ξ_n, F_n, n ≥ 1} be a sequence of independent, centered, and integrable random variables and F_n = σ{ξ_1, . . . , ξ_n}, n ≥ 1. Prove that for every m ≥ 1 the stochastic process {X_{n,m}, F_n, n ≥ m} of the form
X_{n,m} = ∑_{1≤i_1<···<i_m≤n} ξ_{i_1} ··· ξ_{i_m}, n ≥ m,
is a martingale with zero mean.
7.18. Let {X_n, F_n, n ∈ Z+} be a supermartingale, sup_n E(X_n^−) < ∞, and X_n^− = −X_n + X_n^+ = −X_n · 1I_{X_n<0}. (1) Prove that there exists lim_{n→∞} X_n =: X_∞ a.s. (2) Prove that E|X_∞| < ∞. (3) Prove that the conditions sup_n E(X_n^−) < ∞ and sup_n E|X_n| < ∞ are equivalent.
7.19. (Polya scheme) An urn initially contains b black and r red marbles. One is chosen randomly. Then it is put back in the urn along with another c marbles of the same color. Let Y_0 = b/(b + r) and let Y_n be the fraction of black marbles in the urn after n iterations of the procedure. Prove that {Y_n, n ∈ Z+} is a martingale with respect to a natural filtration.
7.20. (Probability ratio as a martingale) Let {X_n, n ≥ 1} be a sequence of random variables, and it is known that the joint distribution of the random variables (X_1, . . . , X_n) has either densities p_n or densities q_n, but it is not known which of the two. Consider a new sequence of random variables
Y_n = q_n(X_1, . . . , X_n) / p_n(X_1, . . . , X_n),
with the assumption that the real densities are pn and they are positive and continuous. Prove that {Yn , Fn , n ≥ 1} is a martingale with Fn = σ {X1 , . . . , Xn }. The r.v. Yn is said to be the probability ratio. 7.21. Let ρ = (ρt , t ∈ T = {0, 1, . . . , T }) be a sequence of i.i.d. random variables taking two values b and a with probabilities p and q, respectively. Let Eρ0 = r and −1 < a < r < b. (1) Check that p = (r − a)/(b − a). (2) Specify that the stochastic process m(t) = ∑tk=0 (ρk − r) is a martingale with respect to the flow of σ -algebras Ft = σ {ρ0 , . . . , ρt } (the so-called “basic” martingale). (3) Prove that every martingale {X(t), Ft ,t ∈ T} with EX(t) = 0 admits such a decomposition regarding basic martingale: X(t) = ∑tk=0 αk Δ mk , where α0 = 0, αk is Fk−1 -measurable, k ≥ 1, and Δ mk = ρk − r. 7.22. (Snell envelope) Let {Xn , Fn , 0 ≤ n ≤ N} be a random sequence of integrable random variables. Denote YN := XN , Yn = Xn ∨ E(Yn+1 /Fn ), 1 ≤ n < N. The process {Yn , Fn , 0 ≤ n ≤ N} is the Snell envelope of the process X. (1) Prove that the Snell envelope {Yn , Fn , 0 ≤ n ≤ N} is a supermartingale. (2) Prove that it is the least supermartingale that dominates X. Namely, if Zn ≥ Xn , 0 ≤ n ≤ N and {Zn , Fn , 0 ≤ n ≤ N} is a supermartingale then Zn ≥ Yn .
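As an illustrative addition (not part of the original text), the following Python sketch computes the Snell envelope of Problem 7.22 by backward induction for X_n = g(S_n), where S_n is a symmetric ±1 random walk; the Markov property gives E(Y_{n+1}/F_n) as an average of the two successor values on the lattice. The horizon N and the payoff function g are arbitrary assumptions of the sketch.

```python
import numpy as np

# Snell envelope for X_n = g(S_n), S_n a symmetric +/-1 random walk:
# Y_N = X_N,  Y_n = max(X_n, E(Y_{n+1} | F_n)),  and on {S_n = s} the conditional
# expectation equals (Y_{n+1}(s + 1) + Y_{n+1}(s - 1)) / 2.

N = 10
g = lambda s: np.maximum(3 - s, 0)   # sample "put"-like payoff; an assumption of this sketch

# values on the lattice of reachable states s = -n, -n+2, ..., n at each time n
Y = {N: {s: float(g(s)) for s in range(-N, N + 1, 2)}}
for n in range(N - 1, -1, -1):
    Y[n] = {}
    for s in range(-n, n + 1, 2):
        cont = 0.5 * (Y[n + 1][s + 1] + Y[n + 1][s - 1])  # continuation value E(Y_{n+1} | F_n)
        Y[n][s] = max(float(g(s)), cont)                  # least supermartingale dominating X

print("Snell envelope at time 0:", round(Y[0][0], 4), " immediate payoff X_0 =", float(g(0)))
```

The printed value dominates the immediate payoff, in line with item (2) of Problem 7.22.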
7.23. Let for a random sequence {Xn , Fn , n ∈ Z+ } there exists an integrable r.v. ξ such that Xn = E(ξ /Fn ), n ∈ Z+ . Prove that Xn → E(ξ /F∞ ) a.s. and in L1 (P) as n → ∞. Here, as above, F∞ := σ { n∈Z+ Fn } (compare with Theorem 7.10). 7.24. Let {Fn , n ∈ Z+ } be a flow of σ -algebras on (Ω , F, P) and Y ∈ L1 (Ω , F∞ , P), Xn = E(Y /Fn ). Prove that {Xn , Fn , n ∈ Z+ } is a uniformly integrable martingale and limn→∞ Xn = Y a.s. 7.25. (An example of a nonuniformly integrable martingale) Let {ξk , k ≥ 1} be i.i.d. random variables with P(ξ1 = 0) = P(ξ1 = 2) = 12 . Set Xn = ∏nk=1 ξk , n ≥ 1. Prove that {Xn , Fn , n ≥ 1} is a nonuniformly integrable martingale. 7.26. Let (Ω , F, P) = ([0, 1], B([0, 1]), λ 1 |[0,1] ) and {πn , n ≥ 1} be a sequence of partitions of the segment [0, 1] such that
πn = {0 = t0n < · · · < tknn = 1} and πn ⊂ πn+1 , n ≥ 1. Let Fn be a σ -algebra on [0, 1] generated by the sets Δn,1 = [0,t1n ], Δn,2 = (t1n ,t2n ], . . . , Δn,kn = (tknn −1 ,tknn ]. For a function f : [0, 1] → R we put
X_n(ω) = ∑_{k=1}^{k_n} (f(t_k^n) − f(t_{k−1}^n)) (t_k^n − t_{k−1}^n)^{−1} 1I_{ω∈Δ_{n,k}}, n ≥ 1. Prove that {X_n, F_n, n ≥ 1} is a martingale.
7.27. Find the values of a and b, under which the stochastic processes aW 2 (t) + bt and exp{aW (t) + bt} are (sub-, super-) martingales with respect to the filtration generated by a Wiener process {W (t), t ∈ R+ }. 7.28. It is said that a discrete-time martingale {Xn , Fn , n ≥ 1} is bounded in L2 (P) if supn≥1 EXn2 < ∞. Assume now that the martingale {Xn , Fn , n ≥ 1} is square inte2 grable. Show that it is bounded in L2 (P) if and only if ∑∞ n=1 E(Xn − Xn−1 ) < ∞. 7.29. (Random signs) Let {Xn , n ≥ 1} be a sequence of independent Bernoulli variables with P(Xn = 1) = P(Xn = −1) = 12 , n ≥ 1, and {αn , n ≥ 1} be a sequence of ∞ 2 real numbers. Show that a series ∑∞ n=1 αn Xn converges a.s. if ∑n=1 αn < ∞. 7.30. (1) Let σ and τ be Markov moments w.r.t. a filtration {Fn , n ∈ Z+ }. Prove that σ + τ , σ ∧ τ , ατ for integer α ≥ 2, and σ ∨ τ are Markov moments as well with respect to this filtration. (2) Let {τk , k ≥ 1} be the Markov moments with respect to the filtration {Fn , n ∈ Z+ }. Prove that supk≥1 τk , infk≥1 τk , lim supk→∞ τk , and lim infk→∞ τk are Markov moments as well with respect to this filtration. (3) Generalize the previous statements to the case of Markov moments with respect to the filtration {Ft , t ∈ R+ }. (4) Let τ be a Markov moment. Prove that the random variables τ − 1 and [τ /2] are not Markov moments w.r.t. the same filtration. 7.31. Let {Xn , Fn , 1 ≤ n ≤ N} be an integrable adapted process. Prove that the following statements are equivalent. (1) X is a martingale. (2) For any Markov moment the stopped process Xnτ := Xn∧τ is a martingale. (3) For any Markov moment τ it holds that EXτ ∧N = EX0 .
7.32. Let {Xn , Fn , n ∈ Z+ } be a martingale and τ be a stopping time such that |Xτ ∧n | ≤ Y a.s. for some r.v. Y ∈ L1 (P) and all n ≥ 0. Prove that EXτ = EX0 . 7.33. Let {Xn , Fn , n ∈ Z+ } be a martingale and τ be a stopping time such that E|Xτ | < ∞ and lim infn→∞ E(|Xn |1Iτ >n ) = 0. (1) Prove that for any Markov moment σ E(Xτ /Fσ ) = Xτ ∧σ . (2) Prove: if σ is a stopping time such that σ ≤ τ a.s. and E|Xσ | < ∞, then EXτ = EXσ . (3) State and prove the corresponding statements in the case where X is a submartingale. 7.34. Let {Xn , Fn , n ∈ Z+ } be a martingale with Fn = σ {X0 , . . . , Xn } and τ be a stopping time with Eτ < ∞. Moreover, let there exist C > 0 such that for any n ≥ 0, E |Xn+1 − Xn |/Fn · 1Iτ >n ≤ C a.s. Prove: (a) E|Xτ | < ∞; (b) EXτ = EX0 . 7.35. Prove that for any submartingale {Xn , Fn , n ∈ Z+ } and any stopping time τ E|Xτ | ≤ lim infn→∞ E|Xn |. 7.36. Let {Xn , Fn , n ≥ 0} be a supermartingale and Xn ≥ E(ξ /Fn ), n ≥ 0 a.s. where E|ξ | < ∞. Let also σ and τ be stopping times with σ ≤ τ a.s. Prove that Xσ ≥ E(Xτ /Fσ ) a.s. State and prove the corresponding statements for the submartingale. 7.37. Let {Xn , Fn , n ∈ Z+ } be a real random sequence and a set B ∈ B(R). Prove that a moment of the first visit of the set B :
σB := inf{n ≥ 0| Xn ∈ B}, σB = ∞,
if Xn ∈ / B for any n ≥ 0,
and the moment of the first departure from the set B :
τB := inf{n ≥ 0| Xn ∈ / B}, τB = ∞,
if Xn ∈ B for any n ≥ 0,
are Markov moments (it is evident that σB = τBc , where Bc = R\B). 7.38. Let {Xn , Fn , n ∈ Z+ } be a martingale (submartingale) and τ be a Markov moment with respect to a filtration {Fn , n ∈ Z+ }. Prove that the stopped process Xnτ := Xτ ∧n is an Fn -martingale (submartingale) as well. 7.39. Let {Xn , Fn , n ∈ Z+ } be a martingale, constants N ≥ 1 and C > 0 be fixed, and ν = min{k ≥ 0 | Xk > C} ∧ N with ν = N, if Xk ≤ C, k ≤ N. Prove that ν is a stopping time, and a random sequence {Yn = Xn · 1In≤ν + (2Xν − Xn ) · 1In>ν , n ∈ Z+ } is an Fn -martingale as well. 7.40. Let {Mn , Fn , n ∈ Z+ } be a process such that E|Mn | < ∞ and EMτ = 0 for any stopping moment τ . Prove that {Mn , Fn , n ∈ Z+ } is a martingale.
7.41. Let τ be a Markov moment and Fτ be a σ -algebra generated by τ (see Definition 7.2). In this problem as well as in the next one we assume that time is discrete. (1) Prove that Fτ is indeed a σ -algebra. (2) Prove that Fτ can be defined in the following way: Fτ := {A ∈ F| A ∩ {τ = n} ∈ Fn , n ≥ 0}. (3) Prove that τ is an Fτ -measurable r.v. (4) Prove that if {Xn , Fn , n ∈ Z+ } is a real-valued random sequence, then Xτ , max Xk and X1 + X2 + · · · + Xk , 1≤k≤τ
are Fτ -measurable random variables. (5) Let σ and τ be Markov moments. Prove that events {σ = τ }, {σ ≤ τ }, and {σ < τ } belong to Fσ ∧τ . (6) Prove that Fσ ⊂ Fτ if σ ≤ τ a.s., and σ and τ are Markov moments. (7) Let X ∈ L1 (P) and τ be a stopping time on a set {0, . . . , N}. Prove that E(X/Fτ ) = ∑Nj=0 E(X/F j )1Iτ = j . (8) Show that E(E(Y /Fτ )/Fσ ) = E(E(Y /Fσ )/Fτ ) = E(Y /Fτ ∧σ ). 7.42. Let σ and τ be Markov moments. Prove that events {σ = τ } and {σ ≤ τ } belong to Fσ Fτ . 7.43. Let τ be a Markov moment on a probability space (Ω , F, P), and a set A ∈ F. The restriction of the moment τ to the set A we denote by τA and define as follows. τ (ω ), ω ∈ A, τA (ω ) := ω∈ / A. ∞, Prove that the restriction τA is a Markov moment if and only if A ∈ Fτ . 7.44. Let {X(t), Ft ,t ≥ 0} be a continuous real-valued stochastic process, A ⊂ R be a closed set, and τ = inf{t ≥ 0| X(t) ∈ A} ∪ {+∞} be a hitting moment of the set A. Prove that τ is a Markov moment. 7.45. (1) Prove that for any stopping time τ and any constant r > 0 a r.v. τ + r is a predictable stopping time and it is predicted by the sequence τn := τ + r(1 − 1/n). (2) Prove that any stopping time is a limit of a decreasing sequence of predictable stopping times. 7.46. The σ -algebra Fτ − , generated by all elements from F0 and all sets of a form A {t < τ }, where t ∈ R+ , A ∈ Ft , is called the σ -algebra of events that are strictly prior to Markov moment τ . Prove the following statements. (1) Fτ − ⊂ Fτ . (2) τ is an Fτ − -measurable r.v. . (3) If τ ≤ σ , τ , σ are Markov moments then Fτ − ⊂ Fσ − (4) For any A ∈ Fσ the following inclusion holds true A {σ < τ } ∈ Fτ − .
7.47. Let Y1 ⊃ Y2 ⊃ · · · be a nonincreasing family of σ -algebras and ξ be an integrable r.v. Prove that a sequence {Xn , Yn , n ≥ 1} with Xn = E(ξ /Yn ) produces an inverse martingale; that is, E(Xn /Yn+1 ) = Xn+1 a.s. for any n ≥ 1. 7.48. Let {Xk , k ≥ 1} be i.i.d. random variables, Zn = ∑nk=1 Xk , and Gn = σ {Zn , Zn+1 , . . .} . Prove that the random sequence {Zn /n, Gn , n ≥ 1} produces an inverse martingale. 7.49. (The first Wald identity) Let {ξk , k ≥ 1} be i.i.d. random variables, Sn = ∑nk=1 ξk , τ be a Markov moment with respect to the filtration Fn = σ {ξ1 , . . . , ξn }, τ (ω ) E|ξ1 | < ∞, and Eτ < ∞. Prove that ESτ = Eτ Eξ1 , where Sτ (ω ) (ω ) := ∑k=1 ξk (ω ) as ω ∈ {τ < ∞} and Sτ (ω ) (ω ) := 0 as ω ∈ {τ = ∞}. 7.50. (Corollary to the first Wald identity) Let {ξi , i ≥ 1} be i.i.d. random variables with P(ξi = 1) = P(ξi = −1) = 12 . Consider the random walk S0 = 0, Sn = ∑ni=1 ξi , n ≥ 1, τ = inf{n ≥ 1 : Sn = 1}. (1) Prove that P(τ < ∞) = 1; that is, τ is a stopping time. (2) Prove that P(Sτ = 1) = 1 and ESτ = 1. (3) Derive from the first Wald identity that Eτ = ∞. 7.51. (The second Wald identity) Let the conditions of Problem 7.49 hold true and additionally Dξ1 < ∞. Prove that E(Sτ − τ Eξ1 )2 = Eτ Dξ1 . 7.52. (The fundamental Wald identity) Let {ξi , i ≥ 1} be a sequence of i.i.d. random variables, Sn = ∑ni=1 ξi , n ≥ 1, a function ϕ (t) = Eet ξ1 , t ∈ R, and for some t0 = 0 the following relation holds, 1 ≤ ϕ (t0 ) < ∞. Let also Fn = σ {ξ1 , .., ξn }, n ≥ 1, τ be a stopping time with respect to {Fn , n ≥ 1} and such that Eτ < ∞, and |Sn | · 1Iτ >n ≤ C a.s. for a constant C > 0. Prove that E
e^{t_0 S_τ} / (ϕ(t_0))^τ = 1.
7.53. (The generalized Wald identity) Let {ξ_i, i ≥ 1} be independent random variables (not necessarily identically distributed), Eξ_i = 0, Eξ_i² = σ_i², S_n = ∑_{i=1}^n ξ_i, F_n = σ{ξ_1, . . . , ξ_n}, and let τ be a stopping time. Prove the following. (1) If E∑_{i=1}^τ |ξ_i| < ∞, then ES_τ = 0. (2) If E∑_{i=1}^τ ξ_i² < ∞, then ES_τ² = E∑_{i=1}^τ σ_i².
7.54. (Galton–Watson process) Let {ξ_k^{(n)}, k, n ≥ 1} be i.i.d. random variables taking their values in Z+ with Eξ_1^{(1)} = μ > 0. We set S_0 = 1, S_1 = ξ_1^{(1)}, and S_n = ∑_{k=1}^{S_{n−1}} ξ_k^{(n)} for n ≥ 2 (here ∑_{k=1}^{0} ξ_k^{(n)} := 0, n ≥ 1). The sequence {S_n, n ∈ Z+} is said to be the branching process describing a population that develops in the following manner. At the starting moment n = 0 there is one population representative, who gives birth to a random number ξ_1^{(1)} of successors. Every successor gives birth, independently of the others, to a random number of successors, and so on. Degeneration of the population is possible at some step.
(1) Prove that the random sequence M_n := μ^{−n} S_n, n ≥ 0, is a martingale with respect to the natural filtration. (2) Prove that sup_n E|M_n| = 1. Derive that M_n → M_∞ a.s. as n → ∞ and EM_∞ < ∞. (3) Prove that for μ < 1, S_n → 0 a.s.; that is, the population degenerates asymptotically with probability one.
7.55. Let {ξ_k, k ≥ 1} be i.i.d. random variables with P(ξ_k = 1) = p, P(ξ_k = −1) = q = 1 − p, 0 < p < 1. We put X_0 = x, x ∈ Z, X_n = x + ∑_{k=1}^n ξ_k, n ≥ 1. Let a and b be integers with a < x < b, τ_a = inf{n ≥ 1 | X_n = a}, τ_b = inf{n ≥ 1 | X_n = b}, and τ = inf{n ≥ 1 | X_n ∉ (a, b)}.
(1) Prove that τ_a, τ_b, and τ are Markov moments with respect to the filtration F_n = σ{ξ_1, . . . , ξ_n}.
(2) Using the properties of random walks (see Problem 7.49), prove that τ is a stopping time.
(3) Show that the process M_n := (q/p)^{X_n} is a martingale with respect to {F_n, n ≥ 0}, where F_0 = {∅, Ω}.
(4) Using Problem 7.32 or item (2), prove that for p ≠ q,
P(X_τ = a) = P(τ_a < τ_b) = ((q/p)^x − (q/p)^b) / ((q/p)^a − (q/p)^b).
(5) Prove that for p = q = 1/2,
P(τ_a < τ_b) = (b − x)/(b − a).
(6) Calculate Eτ separately for p ≠ q and for p = q = 1/2. (Reaching the level a can be interpreted as the bankruptcy of a gambler with initial capital x, and reaching the level b as the gambler's win, respectively.)
7.56. Let {X_n, F_n, n ∈ Z+} be a supermartingale, and let σ and τ be bounded stopping times with σ ≤ τ a.s. Prove that E(X_τ/F_σ) ≤ X_σ a.s.
7.57. Let {ξ_i, i ≥ 1} be i.i.d. random variables, P(ξ_i = 1) = P(ξ_i = −1) = 1/2, 0 < a < b, X_n = a∑_{i=1}^n 1I_{ξ_i=1} − b∑_{i=1}^n 1I_{ξ_i=−1}, and τ_r = inf{n ≥ 1 | X_n ≤ −r}, r > 0. Prove that Ee^{λτ_r} < ∞ if λ ≤ α and Ee^{λτ_r} = ∞ if λ > α, where
α = (a/(a + b)) ln(2a/(a + b)) + (b/(a + b)) ln(2b/(a + b)).
7.58. Let {X_n, F_n, n ∈ Z+} be a square integrable martingale, and let τ and σ be bounded stopping times with σ ≤ τ. Show that EX_τ² and EX_σ² are finite and E((X_τ − X_σ)²/F_σ) = E(X_τ² − X_σ²/F_σ).
7.59. Let {X_n, F_n, n ∈ Z+} be a square integrable martingale with sup_{n≥1} EX_n² < ∞ and X_0 = 0. Let τ be a stopping time and lim inf_{n→∞} EX_n² 1I_{τ>n} = 0. Prove that in this case EX_τ² = E⟨X⟩_τ. Here ⟨X⟩_τ = ∑_{i=1}^τ E((X_i − X_{i−1})²/F_{i−1}).
7.60. Let {X_n, Y_n, F_n, n ∈ Z+} be square integrable martingales. Prove that {X_nY_n − ⟨X, Y⟩_n, F_n, n ∈ Z+} is a martingale.
7.61. Consider martingales X_n = ∑_{i=1}^n ξ_i, Y_n = ∑_{i=1}^n η_i, where {(ξ_i, η_i), i ≥ 1} is a sequence of independent random vectors with Eξ_i = 0, Eη_i = 0, Eξ_i² < ∞, and Eη_i² < ∞. Prove that ⟨X, Y⟩_n = ∑_{i=1}^n cov(ξ_i, η_i).
7.62. (Doob's decomposition for discrete-time stochastic processes) Let {X_n, F_n, n ∈ Z+} be an integrable random sequence. (1) Prove that X admits a unique decomposition of the form X_n = M_n + A_n, where {M_n, F_n, n ∈ Z+} is a martingale, A_0 = 0, and {A_n, F_n, n ∈ Z+} is a predictable process. (2) Prove that the process A is nondecreasing (nonincreasing) if and only if X is a discrete-time submartingale (supermartingale).
7.63. Let {X_n, F_n, n ∈ Z+} be a nonnegative submartingale with X_0 = 0 and let X_n = M_n + A_n be its Doob decomposition. Prove that for any C > 0 and N ∈ Z+
P(sup_{0≤n≤N} X_n ≥ C) ≤ EA_N / C.
7.64. Prove the following Krickeberg decomposition: every martingale with sup_{n≥1} E|X_n| < ∞ can be represented as a difference of two nonnegative martingales.
7.65. (Riesz decomposition) Let X = {X_n, F_n, n ∈ Z+} be a uniformly integrable supermartingale. Prove that X can be represented in the form X_n = M_n + P_n, where {M_n, F_n, n ∈ Z+} is a uniformly integrable martingale and {P_n, F_n, n ∈ Z+} is a potential. A potential is defined as a uniformly integrable nonnegative supermartingale such that P_n → 0 a.s. as n → ∞.
7.66. Using the Khinchin inequalities, prove the Burkholder inequalities.
7.67. Let {ξ_i, i ≥ 1} be i.i.d. random variables with P(ξ_i = 1) = P(ξ_i = −1) = 1/2, and let X_n = ∑_{i=1}^{τ∧n} ξ_i, where τ = inf{n ≥ 1 | ∑_{i=1}^n ξ_i = 1}.
(1) Prove that E|X_n| → 2 as n → ∞.
(2) Prove that E√([X]_n) = E√(τ ∧ n) → ∞ as n → ∞.
(3) Derive from this that the Davis inequality in the form cE√([X]_n) ≤ E|X_n|, n ≥ 1, with some fixed positive constant c, does not hold true.
7.68. Prove Theorem 7.7, statement (2).
7.69. Let {X_n, F_n, n ∈ Z+} be a submartingale. Prove that for any C > 0 and any N ≥ 0
C · P(min_{0≤n≤N} X_n ≤ −C) ≤ E(X_N − X_0) − ∫_{{min_{0≤n≤N} X_n ≤ −C}} X_N dP ≤ EX_N^+ − EX_0.
7.70. Let h: R → R+ be a nondecreasing convex function and let {X_n, F_n, 1 ≤ n ≤ N} be a submartingale. Prove that for any u ∈ R and r > 0, P(max_{1≤n≤N} X_n ≥ u) ≤ Eh(rX_N)/h(ru).
7.71. Let {M(t), F_t, t ∈ R+} be a uniformly integrable supermartingale (martingale) with right-hand continuous trajectories. Prove that {M(t ∧ τ), F_{t∧τ}, t ∈ R+} is a supermartingale (martingale) as well for any Markov moment τ.
7.72. Let {W(t), F_t, t ∈ R+} be a Wiener process and τ be a stopping time with respect to the filtration {F_t, t ∈ R+}. Prove that {W(t ∧ τ), F_t, t ∈ R+} is a square integrable martingale. (We mention that a Wiener process is not a uniformly integrable martingale; i.e., one cannot use the previous problem to prove the present statement.)
7.73. Let {W(t), F_t, t ∈ R+} be a Wiener process. For α, x ∈ R and t ∈ R+ define the function h(α, x, t) := exp{αx − α²t/2} and put
h_k(x, t) = ∂^k h(α, x, t)/∂α^k |_{α=0}, k ≥ 1.
Prove that {h_k(W(t), t), F_t, t ∈ R+} is a martingale for every k ≥ 1. (According to Problem 7.27, {h(α, W(t), t), F_t, t ∈ R+} is a martingale for any α ∈ R.)
7.74. Let {W(t), F_t, t ∈ R+} be a Wiener process. Consider the Markov moment τ_a := inf{t ∈ R+ : |W(t)| = a}. Prove that Eτ_a = a² and Eτ_a² = 5a⁴/3.
7.75. Let X(t) = γt + σW(t), t ∈ R+, and σ_a := inf{t ∈ R+ | X(t) = a}. Prove that for γ > 0, σ ≠ 0, and a < 0 the equality P(σ_a < ∞) = e^{2γa/σ²} holds.
7.76. A function f: R^m → R is said to be superharmonic if for any x ∈ R^m and r > 0 it holds that f(x) ≥ λ_{m−1}(S(x, r))^{−1} ∫_{S(x,r)} f(y) dy, where S(x, r) is the sphere of radius r centered at x and λ_{m−1}(S(x, r)) is its surface Lebesgue measure. Let {W(t), F_t^W, t ∈ R+} be m-dimensional Brownian motion and let f be a continuous superharmonic function with E|f(W(t))| < ∞ for any t ∈ R+. Prove that {f(W(t)), F_t^W, t ∈ R+} is a supermartingale.
7.77. Let {W(t), F_t^W, t ∈ R+} be m-dimensional Brownian motion and let ‖W(t)‖ be its Euclidean norm in R^m. (This is an m-dimensional Bessel process.) Prove that {‖W(t)‖, F_t^W, t ∈ R+} is a submartingale. Prove also that for any t > 0 and x > √(mt),
P(sup_{s∈[0,t]} ‖W(s)‖ ≥ x) ≤ (x²/(mt))^{m/2} exp(m/2 − x²/(2t)).
7.78. Let {W (t), FtW , t ∈ R+ } be m-dimensional Brownian motion, m ≥ 3. (1) Prove that W (t) → ∞ a.s. as t → ∞. (2) Prove that the process W does not return to 0 with probability 1. (3) Prove that for m = 1 the statements (1) and (2) are wrong. (4) Prove that for m = 2 statement (1) is wrong, whereas statement (2) is still correct.
7.79. Prove that every continuous-time martingale is a local martingale. 7.80. Prove that any linear combination of a finite number of local martingales is still a local martingale. 7.81. Prove that any nonnegative local martingale is a supermartingale. 7.82. Let X ∈ Mloc with X(t) ≥ 0, X(0) = 0. Prove that for every t ∈ R+ it holds that X(t) = 0 a.s. 7.83. Prove that the sequence {Xn , Fn , n ∈ Z+ } is a local martingale provided that E(|Xn+1 |/Fn ) < ∞ a.s. and E(Xn+1 /Fn ) = Xn a.s. for every n ≥ 0. 7.84. Let X ∈ Mloc and E supt∈R+ |X(t)| < ∞. Prove that for any Markov moment τ it holds that EX(τ ) = EX(0). 7.85. Let p ∈ [1, ∞), M be a local martingale with a localizing sequence {τn , n ≥ 1}, and for every t ∈ R+ the sequence of the random variables {|M(t ∧ τn )| p , n ≥ 1} is uniformly integrable. Prove that {M(t), Ft , t ∈ R+ } is a martingale and E|M(t)| p < ∞ for all t ∈ R+ . 7.86. Let {M(t), Ft ,t ≥ 0} be a local martingale, and for every t > 0 it holds that EM ∗ (t) < ∞ where M ∗ (t) = sups≤t |M(s)|. Prove that {M(t), Ft ,t ≥ 0} is a martingale. 7.87. Let {M(t), Ft ,t ≥ 0} be a locally square integrable martingale and there exist C > 0 and a localizing sequence of stopping times {τn , n ≥ 1}, such that τn+1 ≥ τn → ∞ as n → ∞ a.s. and EM 2 (t ∧ τn ) ≤ C. Prove that {M(t), Ft ,t ≥ 0} is a square integrable martingale. 7.88. Prove that every continuous martingale is a locally square integrable martingale. +is a set of right-hand continuous, uniformly integrable martin7.89. Assume that M + gales determined on a stochastic basis {Ω , F, {Ft }t∈R+ , P}. For martingale M ∈ M and p ∈ [1, ∞] we put MH p := M ∗ (∞) p , where M ∗ (t) = sup0≤s≤t |M(t)|, · p is a norm in L p (Ω , F, P). Denote by H p the +with MH < ∞. Prove the following statements. space of M ∈ M p (1) If we identify martingales that are equal a.s., that is, P(ω ∈ Ω | sup |M(t) − N(t)| > 0) = 0, 0≤t≤∞
then H p becomes Banach space equipped with the norm · H p . (2) If 1 ≤ p ≤ p then H p ⊂ H p . +, then M(∞) p ≤ M ∗ (∞) p ≤ qM(∞) p , where (3) If 1 < p < ∞ and M ∈ M +2 . 1/p + 1/q = 1. Deduce from here that H2 = M
+, then M(∞)∞ = M ∗ (∞)∞ . (4) If p = ∞ and M ∈ M (5) If 1 < p ≤ ∞ and {M n (t), n ≥ 1} is a sequence of martingales converging to a martingale M in the norm of the space H p , then there exists a subsequence {M nk (t), k ≥ 1}, that uniformly converges to M(t) on [0, ∞) a.s. (6) Prove that the limit in H p , 1 < p ≤ ∞, of a sequence of continuous martingales is again a continuous martingale. +2 (or H2 , see Problem 7.89). Prove that M 2 is a right7.90. Let martingale M ∈ M hand continuous uniformly integrable submartingale that belongs to the class D (see Definition 7.7). +are said to be orthogonal if their product MN ∈ M + 7.91. Two martingales M, N ∈ M and M(0)N(0) = 0. (1) Let M and N belong to H2 and be orthogonal martingales. Prove that for any stopping time τ it holds that EM(τ )N(τ ) = 0 and MN ∈ H1 . (2) Prove the inverse statement: if M and N belong to H2 , M(0)N(0) = 0 a.s., and EM(τ )N(τ ) = 0 for any stopping time τ , then M and N are orthogonal martingales. 7.92. Define a conditional covariance of two square integrable random variables X and Y by cov(X,Y /F) := E(XY /F) − E(X/F)E(Y /F). Now, let {Mn , Nn , Fn , n ∈ Z+ } be two square integrable discrete-time martingales. We say that they are orthogonal if their product MN is a martingale and M0 N0 = 0 a.s. Prove the equivalence of the following statements. (1) M and N are orthogonal. (2) M0 N0 = 0 and cov(Mn+1 − Mn , Nn+1 − Nn /Fn ) = 0 a.s., for any n ∈ Z+ . 7.93. Consider discrete time T = {0, 1, . . . , T }. Denote by H2,T a space of square integrable martingales {Mn , Fn , n ∈ T}. A subspace S ⊂ H2,T is said to be stable if N τ ∈ S for all N ∈ S and for all stopping times τ on T. For a stable subspace S and martingale M ∈ H2,T with M0 = 0 prove the equivalence of the following statements. (1) EMT NT = 0 for any N ∈ S. (2) For any N ∈ S the equality cov(Mn+1 − Mn , Nn+1 − Nn /Fn ) = 0 holds true a.s. for any n ∈ T. (3) The product MN is a martingale for any N ∈ S. 7.94. Prove the following discrete version of the Kunita–Watanabe decomposition. If {Xn , Fn , n ∈ T} is a square integrable martingale, then any martingale M ∈ H2,T can be expanded in the following way. n
Mn = M0 + ∑ ξk (Xk − Xk−1 ) + Ln = M0 + (ξ ◦ X)n + Ln , k=1
where {ξk , Fk , 1 ≤ k ≤ T } is a predictable stochastic process, ξk (Xk − Xk−1 ) ∈ L2 (Ω , F, P) for any 1 ≤ k ≤ T , and L is a square integrable martingale, which is orthogonal to X and such that L0 = 0.
7.95. Prove the following statements. (1) Let P and Q be probability measures on a measurable space (Ω , F), and Q P on F with density ρ . If F0 is a σ -algebra contained in F, then Q P on F0 and the corresponding density can be defined by equality (dQ/dP) |F0 = E(ρ /F0 ) P-a.s. (2) Let Q P on F with density ρ and F0 ⊂ F. Then for any F-measurable nonnegative r.v. ξ the equality holds true P-a.s., EQ (ξ /F0 ) =
E(ξρ/F_0) / E(ρ/F_0).
(3) Let T = {0, 1, . . . , T }, {Ω , F, {Ft }t∈T , P} be a probability space with filtration, and Q be a probability measure with Q ∼ P. Prove that an adapted process M is a Q-martingale if and only if the process Mt · E((dQ/dP) | Ft ), t ∈ T is a P-martingale. 7.96. (Discrete version of stochastic Doleans–Dade exponent) Let, as in the previous problem, T = {0, 1, . . . , T }, {Ω , F, {Ft }t∈T , P} be a probability space with filtration, and Q be a probability measure with Q ∼ P. (1) Prove that there exists such a P-martingale L that: L(0) = 1, and L(t + 1) − L(t) > −1 P-a.s. for all t ∈ {0, 1, . . . , T − 1}, and the martingale Z(t) := E((dQ/dP)/Ft ), t ∈ T can be expanded as t
Z(t) = ∏_{s=1}^{t} (1 + L(s) − L(s − 1)), t ∈ T.   (7.2)
(2) Prove the inverse statement: if L is a P-martingale, L(0) = 1, and L(t + 1) − L(t) > −1 P-a.s., and moreover equality (7.2) determines the P-martingale Z, then the equality dQ := Z(T) dP determines a probability measure Q ∼ P.
7.97. (Discrete version of the Girsanov formula) Let the conditions of item (1) of Problem 7.96 hold and let L be the martingale from relation (7.2) for the sequence of densities Z(t) := E((dQ/dP)/F_t), t ∈ T. If M̃ is a Q-martingale such that M̃(t) ∈ L1(P) for all t ∈ T, then the process
M(t) := M̃(t) + ∑_{k=1}^{t} E((L(k) − L(k − 1))(M̃(k) − M̃(k − 1))/F_{k−1})   (7.3)
is a P-martingale. 7.98. Prove: if {M(t), Ft , t ∈ R+ } is a uniformly integrable martingale with righthand continuous trajectories, whereas σ and τ are Markov moments with σ ≤ τ , then M(σ ) = E(M(τ )/Fσ ) and E|M(τ )| < ∞. 7.99. Let {X(t), Ft , t ∈ [0, T ]} be a supermartingale and EX(0) = EX(T ). Prove that X is a martingale. 7.100. Prove: if η is an integrable r.v., X(t) = E(η /Ft ) and τ is a Markov moment, then X(τ ) = E(η /Fτ ).
7.101. (1) Let {X_n, F_n, 0 ≤ n ≤ N} be an integrable stochastic process with X_0 = 0 and EX_τ = 0 for any stopping time τ. Prove that X is a martingale. (2) Let {X(t), F_t, t ∈ R+} be an integrable càdlàg stochastic process with X(0) = 0, EX(τ) = 0 for any stopping time τ, and such that for any T > 0 the family of random variables {X(s), 0 ≤ s ≤ T} is uniformly integrable. Prove that X is a martingale.
7.102. Let {X(t), F_t, t ∈ R+} be a nonnegative right-hand continuous supermartingale. We put τ = inf{t ∈ R+ | X(t) = 0 or X(t−) = 0}, with τ = ∞ if the corresponding set is empty. Prove that X(t) = 0 a.s. for all t ≥ τ on the event {τ(ω) < ∞}.
7.103. Let {M_n, F_n, n ∈ Z+} be a square integrable martingale. Prove that for any ε > 0, C > 0, and N ≥ 1,
P(max_{0≤n≤N} |M_n| ≥ ε) ≤ (C + E max_{1≤n≤N}(⟨M⟩_n − ⟨M⟩_{n−1}))/ε² + P(⟨M⟩_N ≥ C).
7.104. Prove Theorem 7.18. 7.105. Let {N(t), Ft , t ∈ R+ } be a Poisson process with intensity λ . Prove that a square integrable martingale M(t) := N(t) − λ t has a quadratic characteristic M(t) = λ t. 7.106. Let M ∈ H 2 , and M(0) = 0. Prove that EM 2 (τ ) = E[M](τ ) = EM(τ ) for any Markov moment τ . 7.107. Let {M(t), Ft , t ∈ [0, T ]} be a continuous square integrable martingale. (1) Prove that for any u > 0 it holds that u · P(M ∗ (T ) ≥ u) ≤ E(|M(T )| 1IM∗ (T )>u ), where M ∗ (T ) = sup0≤t≤T |M(t)|. (2) Prove that for any A > 0 it holds that E (M ∗ (T ) ∧ A)2 ≤ 2E ((M ∗ (T ) ∧ A)|M(T )|) . (3) Prove that EM ∗ (T ) < ∞ and E(M ∗ (T ))2 ≤ 4EM 2 (T ). 7.108. (Joint distribution for W (t) and sups≤t W (s)) Let {W (t), Ft ,t ∈ R+ } be a Wiener process. (1) Assume that τ is a bounded stopping time. Prove that for all u < v we have the equality E(exp{iz(W (v + τ ) −W (u + τ ))}/Fu+τ ) = exp{−z2 (v − u)/2}. (2) Prove that W τ (u) := W (u+ τ )−W (u) is an Fu+τ -Wiener process independent of the σ -algebra Fτ . (3) Let {Y (t), t ∈ R+ } be a continuous stochastic process, independent of σ A -measurable algebra A and such that E sup0≤s≤T |Ys | < ∞. Let T1 be a nonnegative r.v. that is bounded from above. Prove that E (YT1 /A ) = EY (t) T =t . (4) Let τ λ = inf{s ≥ 0| W (s) ≥ λ }. (a) Prove that for a bounded Borel function f ,
E f (W (t))1Iτ λ ≤t = Eφ (t − τ λ )1Iτ λ ≤t ,
where φ(u) = Ef(W(u) + λ). (b) Show that Ef(W(u) + λ) = Ef(−W(u) + λ) and derive that Ef(W(t))1I_{τ^λ≤t} = Ef(2λ − W(t))1I_{τ^λ≤t}.
(5) Let W*(t) = sup_{s≤t} W(s). Show that for any λ ≥ 0, P(W(t) ≤ λ, W*(t) ≥ λ) = P(W(t) ≥ λ, W*(t) ≥ λ) = P(W(t) ≥ λ). Derive that W*(t) and |W(t)| are equally distributed.
(6) Prove that for λ ≥ (μ ∨ 0), P(W(t) ≤ μ, W*(t) ≥ λ) = P(W(t) ≥ 2λ − μ, W*(t) ≥ λ) = P(W(t) ≥ 2λ − μ), and for 0 ≤ λ ≤ μ, P(W(t) ≤ μ, W*(t) ≥ λ) = 2P(W(t) ≥ λ) − P(W(t) ≥ μ).
(7) Prove that the joint distribution of (W(t), W*(t)) is given by the expression
(2(2y − x)/√(2πt³)) exp(−(2y − x)²/(2t)) 1I_{0∨x≤y}.
7.109. (Reflection principle for the Wiener process) Let {W(t), F_t, t ∈ R+} be a Wiener process and let τ be a stopping time w.r.t. {F_t}. Also, let W_τ(t) = W(t)1I_{t≤τ} + (2W(τ) − W(t))1I_{t>τ}, t ∈ R+. Prove that {W_τ(t), F_t, t ∈ R+} is a Wiener process. Represent in diagram form a trajectory of the process W and the corresponding trajectory of the process W_τ.
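As an illustrative addition (not part of the original text), the following Python sketch constructs, for each simulated discretized Wiener path, the reflected trajectory W_τ of Problem 7.109 with τ the first passage time of a fixed level, and compares the empirical distribution of the endpoints of W and W_τ. The level, grid size, and number of paths are arbitrary choices of the sketch.

```python
import numpy as np

rng = np.random.default_rng(8)
n, n_paths, level = 500, 20000, 1.0
dt = 1.0 / n

dW = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n))
W = np.concatenate([np.zeros((n_paths, 1)), np.cumsum(dW, axis=1)], axis=1)

# tau = first time the discretized path reaches the level (a stopping time);
# paths that never reach it are left unchanged.
hit = W >= level
idx = np.where(hit.any(axis=1), hit.argmax(axis=1), n)

W_tau = W.copy()
for i in range(n_paths):               # reflect the trajectory after tau: 2 W(tau) - W(t)
    k = idx[i]
    if k < n:
        W_tau[i, k:] = 2.0 * W[i, k] - W[i, k:]

# Both processes should be Wiener processes: compare the endpoint statistics.
print("W(1):     mean", round(W[:, -1].mean(), 3), " var", round(W[:, -1].var(), 3))
print("W_tau(1): mean", round(W_tau[:, -1].mean(), 3), " var", round(W_tau[:, -1].var(), 3))
```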
Hints
7.3–7.13. Use the definition of a (sub-, super-) martingale.
7.14. Use the definition of a (sub-) martingale and Jensen's inequality.
7.15. Use Problem 7.8.
7.16. Prove that E(ξ_{n+1}/F_n) = 0 if and only if E(ξ_{n+1} · f_n(ξ_0, . . . , ξ_n)) = 0 for any bounded Borel function f_n: R^n → R.
7.17. Rewrite the r.v. X_{n,m} for n > m as
∑_{1≤i_1<···<i_m≤n−1} ξ_{i_1} ··· ξ_{i_m} + ∑_{1≤i_1<···<i_{m−1}≤n−1} ξ_{i_1} ··· ξ_{i_{m−1}} ξ_n,
where the first term is equal to Xn−1,m . Prove that the conditional expectation of the second term with respect to the σ -algebra Fn−1 is zero. 7.18. Use Corollary 7.4. 7.19. Let Xn be a number of black marbles inside the urn after the nth step, and Yn = Xn /(b + r + cn) be a fraction for which we are searching. Write Xn = (Xn−1 +
c)1IA + Xn−1 1IA , where the event A consists in the choice of the black marble on the nth step, and calculate the conditional probability of the event A with respect to the σ -algebra Fn−1 = σ {X1 , . . . , Xn−1 } = σ {Y1 , . . . ,Yn−1 }. 7.22. (1) Use the “inverse” induction. It means to start with E(YN /FN−1 ). (2) It is evident that ZN ≥ YN . Prove that Zn ≥ E(Yn+1 /Fn ) using the “inverse” induction. 7.23. Due to Theorem 7.10, there exists limn→∞ Xn =: X∞ . Furthermore, you should use the L´evy theorem: if E|ξ | < ∞, then E(ξ /Fn ) → E(ξ /F∞ ), n → ∞ a.s. Convergence in L1 (P) follows now from the fact that X∞ = E(ξ /F∞ ) a.s. and from Theorem 7.10. 7.24. Use Problem 7.23 and Theorem 7.10. 7.25. Define events Ak = {ξk = 2} and use the Borel–Cantelli lemma: because ∑k≥1 P(Ak ) = ∞ and events Ak are mutually independent, then P(limn→∞ Xn = ∞) = 1. And finally, use Theorem 7.10. 7.27. Use the facts that increments of a Wiener process are independent and that for 2 2 the r.v. ξ ∼ N(0, σ 2 ) Eeαξ = e(α σ )/2 . For example, aW 2 (t) + bt is a martingale if and only if b = −a. 7.30. (1) Use the definition of Markov moment. (2) For example, an event {lim supk→∞ τk ≤ n} = {limk→∞ supm≥k τm ≤ n} = ∞ ∞ ∞ r=1 k=r m=k {τm ≤ n} ∈ Fn . 7.35. Write |Xτ | = lim infn→∞ |Xτ ∧n | and then use the Fatou lemma and Theorem 7.4. 7.37. It follows directly from the definition of the Markov moment. 7.38. See Problem 7.31. 7.39. Use the fact that |Xν | ≤ C, which implies that E|Xν | < ∞. Now, the conditional expectations can be calculated. 7.40 Put τ := n1IA + N1IAc , where A ∈ Fn and 0 ≤ n ≤ N, and prove that {Mn , Fn , n ∈ Z+ } is a martingale. 7.41. (1) Prove that for Fτ all the properties of a σ -algebra are fulfilled. (2) Consider the event {τ = k} and prove that it belongs to Fτ . For this purpose and n. consider an event {τ = k} ∩ {τ = n} for different k 3) Let B ∈ B(R). Represent the event {Xτ ∈ B} as ∞ n=0 {Xn ∈ B, τ = n}, and prove that this event belongs to Fτ . 7.42. Use Problem 7.41, item (4). 7.43. Write the event {τA ≤ t} as {τ ≤ t} ∩ A, 0 ≤ t < ∞. 7.44. {τ ≤ t} = infs∈Q∩[0,t] dist(X(s), A) = 0, where dist(x, A) = infy∈A |x − y|. 7.45. (1) Follows directly from the corresponding definitions; (1) implies (2). 7.46. (3) Consider the event A ∩ {t < τ } ∈ Ft ∩ Fτ − for A ∈ Ft ,t ∈ R+ , and represent it as A ∩ {t < τ } ∩ {σ < τ }. (4) Represent A ∩ {σ < τ } as a union r∈Q+ A ∩ {σ < r} ∩ {r < τ }, where Q+ is a set of nonnegative rational numbers. 7.47. Follows directly from the definition of the inverse martingale and from the fact that σ -algebras do not increase. 7.49. Represent Sτ as Sτ =
S_τ = ∑_{n=1}^∞ ∑_{k=1}^n ξ_k(ω) 1I_{τ=n} = ∑_{k=1}^∞ ξ_k(ω) ∑_{n=k}^∞ 1I_{τ=n} = ∑_{k=1}^∞ ξ_k(ω) 1I_{τ≥k},
and take into account the fact that the event {τ ≥ k} = Ω \ {τ < k} ∈ Fk−1 , that is, does not depend on ξk . k 7.51. Write the identity Sτ − τ Eξ1 = ∑∞ k=1 ∑i=1 (ξi − Eξi )1Iτ =k and use a transformation similar to the one used for proving the first Wald identity. 7.59. Prove that there exists a r.v. ξ ∈ L2 (P) such that Xn = E(ξ /Fn ). Using the maximum Doob’s inequality prove that E supn≥1 Xn2 < ∞. Derive that EXτ2 < ∞. Prove that the process Yn := Xn2 − Xn is a martingale. Use the problem situation and the statement of Problem 7.36, transformed for a submartingale, and prove that for any n ≥ 1 it holds that EXτ2∧n ≤ EXτ2 . Derive that EXτ ≤ EXτ2, and the opposite inequality. 7.60. Transform the increment XnYn − Xn−1Yn−1 − E (Xn − Xn−1 )(Yn − Yn−1 )/Fn−1 . 7.61. Use the definition of X,Y n . 7.62. (1) Put A0 = 0, Δ An := E(Xn − Xn−1 /Fn−1 ). (2) Use the shape and uniqueness of the decomposition obtained in the item (1). 7.63. Consider a stopping time τ = inf{n ≤ N| Xn ≥ C} ∧ N and use the fact that EXτ = EMτ + EAτ = EAτ . 7.64. Use Theorem 7.10 and write Xn as E(ξ /Fn ); then, decompose the r.v. ξ as ξ + − ξ −. 7.65. Prove that there exists limn→∞ Xn =: X∞ a.s. Put Mn = E(X∞ /Fn ) and Pn = Xn − E(X∞ /Fn ). 7.67. (1) The symmetry implies that E|Xn | = 2EXn+ ; τ is the stopping time (see Problem 7.55). Hence, Xn+ → Xτ+ = 1 a.s., and it is possible to use the Lebesgue dominated convergence theorem. √ So, E|Xn | → 2, n → ∞. (2) Check that [Xn ]1/2 = τ ∧ n. Furthermore, it follows from the theory of random walks that (−1)k−1 1/2(1/2 − 1) . . . (1/2 − k + 1) , k! √ and it is evident that P(τ = 2k) = 0. Derive the equality E τ = ∞ ( Eτ = ∞ was already proved in a different way in Problem 7.55). 7.68. Consider the stopping time τ = inf{0 ≤ n ≤ N| Xn ≥ C} ∧ N and use the fact that EX0 ≥ EXτ ≥ C · P(max0≤n≤N Xn ≥ C) − EXN− . 7.69. Consider the stopping time τ = inf{0 ≤ n ≤ N| Xn ≤ −C} ∧ N and use the inequality EXτ ≥ EX0 . 7.70. Use the facts that {h(tXn ), 1 ≤ n ≤ N} is a submartingale and the function h is nonnegative. Also use the statement (1) of Theorem 7.7. 7.71. Perform a discretization of both time and Markov moment (note that there exists a limit X(∞) = limt→∞ X(t) and you can put X(τ ) = X(∞) if the Markov moment τ = ∞). Then, you can pass to the limit assuming uniform integrability and right-hand continuity. 7.73. Prove by induction that (∂ k h(α ,W (t),t))/∂ α k is a martingale for any k ≥ 1 and α ∈ R. For this purpose you should write a derivative as a limit, taking into account that the prelimit expression is a martingale and check whether the prelimit values are uniformly integrable. Fix some α0 ∈ R, consider its neighborhood (e.g., |α − α0 | < 1), and prove that inside the neighborhood P(τ = 2k − 1) =
$$\sup_{|\alpha-\alpha_0|<1} E\left|\frac{\partial^{k} h(\alpha, W(t), t)}{\partial\alpha^{k}}\right|^{2} < \infty
\quad\text{for all } k\ge 1 \text{ and } t\in\mathbb{R}_+.$$
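The martingales appearing in Problems 7.73 and 7.74 can be checked numerically. The sketch below is an illustration only: it assumes the standard choice h(α, x, t) = exp(αx − α²t/2) (this explicit form is an assumption made for the example) and verifies by Monte Carlo that the processes obtained by differentiating in α at α = 0, namely W(t), W²(t) − t, W³(t) − 3tW(t) and W⁴(t) − 6tW²(t) + 3t², have expectation zero at a fixed time.

```python
import numpy as np

# Monte Carlo sanity check (illustration): the polynomial processes obtained by
# differentiating h(alpha, x, t) = exp(alpha*x - alpha**2*t/2) in alpha at 0 are
# martingales started at 0, so their expectation at any fixed time should be 0.
rng = np.random.default_rng(0)
n_paths, T = 200_000, 2.0
W = rng.normal(0.0, np.sqrt(T), size=n_paths)      # W(T) ~ N(0, T)

for name, M in [
    ("W",                  W),
    ("W^2 - t",            W**2 - T),
    ("W^3 - 3tW",          W**3 - 3 * T * W),
    ("W^4 - 6tW^2 + 3t^2", W**4 - 6 * T * W**2 + 3 * T**2),
]:
    se = M.std() / np.sqrt(n_paths)                # Monte Carlo standard error
    print(f"{name:20s} sample mean {M.mean():+.3f}  (std. error {se:.3f})")
```

Each sample mean should lie within a few standard errors of zero.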
7.74. Derive from the statement of Problem 7.73 that the processes {W (t),W 2 (t) − t,W 3 (t) − 3tW (t),W 4 (t) − 6tW 2 (t) + t 2 , t ≥ 0} are martingales. Prove that τa is a that EW (τa ) = 0 with EW 2 (τa ) = Eτa , and stopping 4 time, |W (τ2a )| = a a.s.2 Prove E W (τa ) − 6EτaW (τa ) + 3Eτa = 0. 7.75. First, find such a λ ∈ R that a process exp{λ X(t)} is a martingale. Then, it holds that E exp{λ X(t ∧ σa )} = 1. Furthermore, pass to a limit as t → ∞ and obtain the inequality E exp{ λ X(σa )}1Iσa <∞ ≤ 1 which implies the statement. 7.76. Transform E f (W (t))/FW s as E ( f (W (t))/W (s)) (which Wiener process property should you use?). Furthermore, E ( f (W (t))/W (s)) = E ( f (W (t) −W (s) +W (s))/W (s)) . Use the following equality. If ξ and η are independent random vectors and a Borel function g is such that E|g(ξ , η )| < ∞, then E (g(ξ , η )/η ) = E f (ξ , y)y=η . And finally, prove that E f (W (t) − W (s) + y) ≤ f (y) using the superharmonic property of f . For this you should transform E f (W (t) − W (s) + y) as Rm f (x + y)pt−s,m (x)dx, of the distribution of m-measurable Gaussian vector where pt−s,m (x) is the density W (t) −W (s), and rewrite Rm = 0∞ S(y,r) . 7.77. The fact that {W (t), FtW ,t ∈ R+ } is a submartingale follows from the convexity of the function f : Rm → R+ , f (x) = x (check whether this function is indeed convex). Then use the statement of Problem 7.70 that is valid as well for 2 2 continuous-time submartingales. For this purpose put h(rx) = er x and choose r2 =
1 x2 + . 2s2 m 2s
7.79. Consider any sequence of stopping times τ1 ≤ τ2 < · · · a.s., such that τn → ∞ a.s., and use the fact that {X(t ∧ τ ),t ≥ 0} is a martingale if {X(t), ≥ 0} is a martingale (the proof of this is similar to Theorem 7.4). 7.81. Use Fatou’s lemma directly for the equality that defines the local martingale. 7.82. Direct corollary of Problem 7.81. 7.83. First prove that E(|Xτ ∧(n+1) |/Fn ) < ∞ for any Markov moment τ , and then use similar reasoning as in the proof of the implication (1) =⇒ (2) in Problem 7.31. 7.84. Let {τn , n ≥ 1} be a localizing sequence. First prove that a martingale X(t ∧ τn ) is uniformly integrable for every n ≥ 1. And then apply Corollary 7.2 for it. Finally, tend n → ∞ using the Lebesgue dominated convergence. 7.85. Because limn→∞ M(t ∧ τn ) = M(t), derive from the uniform integrability that the convergence holds in L p (Ω , F, P) too. You can derive the inequality E|Mt | p < ∞ for all t ∈ R+ . Check whether it is possible to tend n → ∞ in equation E(M(t ∧ τn )/Fs ) = M(s ∧ τn ). 7.86. As in Problems 7.87 and 7.85, the equality E(M(t ∧ τn )/Fs ) = M(s ∧ τn ) holds a.s., and you need to prove the uniform integrability of the sequence {M(t ∧ τn ),
n ≥ 1}. For this purpose you can use the evident inequality |M(t ∧ τn )| ≤ M ∗ (t) and prove that P(|M(t ∧ τn )| ≥ C) → 0 as C → ∞. 7.87. Write the relation E(M(t ∧ τn )/Fs ) = M(s ∧ τn ) and prove that {M(t ∧ τn ), n ≥ 1} is a uniformly integrable sequence and tend n → ∞. You can use Problem 7.1. Use Fatou’s lemma to prove the square integrability. 7.88. Consider a sequence of stopping times τn = inf{t ∈ R+ |Mt | ≥ n} ∧ n. 7.89. (1) Check directly that H p is Banach space. (2) Use the H¨older inequality. (3) The first inequality is evident. The second one can be derived from the maximal integral Doob’s inequality that holds in the continuous time case as well. (4) You can derive the inequality M(∞)∞ ≤ M ∗ (∞)∞ from the following: |M(t)| ≤ M ∗ (∞)∞ for all t ∈ R+ . For the inverse inequality consider for any ε > 0 a Markov moment τε = inf{t ∈ R+ |M(t)| ≥ M ∗ (∞)∞ − ε } and use the fact that P(|M ∗ (∞)| > M ∗ (∞)∞ − ε ) > 0 (check this), and equality M(τε ) = E(M(∞)/Fτε ) which implies that M(τε )∞ ≤ M(∞)∞ . (5) It is sufficient to ensure that limn→∞ (M n − M)∗ (∞) p = 0, and derive the existence of the sequence for which we are searching. (6) Derive from item (5). 7.90. First prove that E supt≥0 M 2 (t) < ∞; that is, M ∈ H2 . Next, supτ EM 2 (τ ) × 1IM2 (τ )≥C ≤ supτ EM 2 (∞)1IM2 (τ )≥C , where τ runs through the family of all Markov moments. Prove that the latter expression tends to zero as C → ∞. 7.92. It can be checked via direct computation. 7.94. Consider a family G of all processes of the form {Y (t) := ∑tk=1 ξk · (Xk − Xk−1 ),t ∈ T} where ξk are predictable random variables and
ξk (Xk − Xk−1 ) ∈ L2 (Ω , F, P). Prove that it is a subspace in H2,T . Use the projection theorem in Hilbert space H2,T . 7.95. (1) The fact that Q P on F0 can be derived directly from the definition of absolute continuity. Now, let A ∈ F0 . Check the equalities Q(A) = A ρ dP = E( ρ /F0 )dP and derive that E(ρ /F0 ) is a required density. A (2) Let a r.v. ξ0 be nonnegative and F0 -measurable. Prove equalities EQ (ξ0 ξ ) = E(ξ0 ξ ρ ) = E(ξ0 E(ξ ρ /F0 )). Denote ρ0 = E(ρ /F0 ). Derive from item (1) that ρ0 > 0 Q-a.s. So, we can assume that P-a.s. ξ0 = 0 on the set {ρ0 = 0} and obtain the equality E(ξ0 E(ξ ρ /F0 )) = EQ (ξ0 · (1/(E(ρ /F0 )))E(ξ ρ /F0 )). Derive from this the required statement. (3) Denote Zt = E((dQ/dP) |Ft ). Check that Mt ∈ L1 (Q) if and only if Mt Zt ∈ L1 (P) and also that the process Z is positive P-a.s. Then, derive from item (2) the equality Zt ·EQ (Mt+1 /Ft ) = E(Mt+1 Zt+1 /Ft ). And finally, obtain that EQ (Mt+1 /Ft ) = Mt if and only if the equality E(Mt+1 Zt+1 /Ft ) = Mt Zt holds true. 7.98. Use the corresponding result for discrete-time martingales, make a discretization of stopping times and use the uniform integrability. 7.99. You can assume that for some 0 ≤ s < t ≤ T the following inequalities hold: E(X(t)/Fs ) ≥ X(s) and P(E(X(t)/Fs ) > X(s)) > 0. Consider an event A = {ω ∈ Ω |E(X(t)/Fs ) > X(s)} ∈ Fs and produce a chain of the inequalities EX(T ) ≥ EX(t) = E(E(X(t)/Fs )) > EX(s)1IA + EX(s)1IA = EX(s) ≥ EX(0).
7.100. Check whether the process X is uniformly integrable. Next, make a discretization of the stopping time and pass to the limit using the uniform integrability. 7.101. (1) Write Doob’s decomposition for the process X and consider stopping times τ := inf{n| An+1 > 0} and σ := inf{n| An+1 < 0}, where A is a predictable process in the Doob’s decomposition. Prove that A = 0. (2) Use item (1) and the uniform integrability. 7.102. Assume that X(∞) = 0 a.s. First prove that this supermartingale is uniformly integrable and satisfies the conditions of Theorem 7.5. Then, use this theorem for Markov moments ν := τ + t with rational t > 0 and σ := τn where τn = inf{s ∈ R+ | X(s) ≤ 1/n}. 7.105. Find a decomposition of the process {(N(t) − λ t)2 ,t ≥ 0} into a martingale and a nonrandom continuous nondecreasing function. 7.106. To prove the identity EM 2 (τ ) = EM(τ ) we note that {M 2 (t) − M(t),t ≥ 0} is a uniformly integrable martingale with initial value zero. In order to prove the equality EM 2 (τ ) = E[M](τ ) consider separately continuous and discontinuous components and take into account their orthogonality. 7.107. (1) Consider τ = inf{t ∈ R+ |Mt | > u} ∧ T. (2) Use the fact that ∗
$$(M^{*}\wedge A)^{p} = \int_{0}^{M^{*}\wedge A} p\,x^{p-1}\,dx, \qquad p = 1, 2.$$
(3) The proof is similar to the one of the maximal integral Doob’s inequality in case of discrete time (see, e.g., the proof of Theorem 5, Chapter IV [9]). 7.108. (1) Use Doob’s theorem of optional sampling for the martingale Mt = exp{izW (t) + (z2t)/2}, where z ∈ R. (2) Follows from item (1). (3) First consider T1 = ∑ni=1 ti 1IAi where Ai ∈ A , Ai are disjoint, and 0 < t1 < · · · < tn = T. 7.109. Use items (2) and (3) of Problem 7.108. The trajectory of Wτ can be created from the trajectory of W using the reflection of that part of the latter that corresponds to the values of t > τ with respect to the line y = W (τ ).
Answers and Solutions 7.18. (1) Because Xn = Xn+ −Xn− , |Xn | = Xn+ +Xn− , and EXn ≤ EX1 < ∞ for any supermartingale {Xn , n ∈ Z+ }, then supn EXn+ < ∞; moreover supn E|Xn | < ∞. It follows now from Corollary 7.4 that there exists a limit limn→∞ Xn = X∞ a.s. (2) According to Fatou’s lemma E|X∞ | = E lim infn→∞ |Xn | ≤ lim infn→∞ E|Xn | ≤ supn E|Xn | < ∞. (3) One-way implication is proved (see item (1) ). Because Xn− ≤ |Xn |, the opposite implication is evident. 7.20. It is evident that if the true densities are pn , then the conditional density of the distribution Xn given X1 , .., Xn−1 is equal to (pn+1 )/pn , and then
$$E\big(Y_n\,/\,X_1=x_1,\dots,X_{n-1}=x_{n-1}\big)
= \int_{\mathbb{R}} \frac{q_n(x_1,\dots,x_{n-1},x)}{p_n(x_1,\dots,x_{n-1},x)}\cdot\frac{p_n(x_1,\dots,x_{n-1},x)}{p_{n-1}(x_1,\dots,x_{n-1})}\,dx
= \int_{\mathbb{R}} q_n(x_1,\dots,x_{n-1},x)\,dx\cdot\big(p_{n-1}(x_1,\dots,x_{n-1})\big)^{-1}
= \frac{q_{n-1}(x_1,\dots,x_{n-1})}{p_{n-1}(x_1,\dots,x_{n-1})}.$$
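As a numerical aside to Problem 7.20 (a sketch, not a statement from the problem set), take p to be the joint density of i.i.d. N(0,1) variables and q the joint density of i.i.d. N(θ,1) variables; then the likelihood ratio Y_n is a product of factors exp(θX_i − θ²/2) and EY_n = 1 for every n.

```python
import numpy as np

# Likelihood-ratio martingale Y_n = q_n(X_1,...,X_n)/p_n(X_1,...,X_n) for
# p = density of i.i.d. N(0,1) and q = density of i.i.d. N(theta,1):
# each factor equals exp(theta*X_i - theta**2/2), so E Y_n = 1 for all n.
rng = np.random.default_rng(1)
theta, n_paths, n = 0.5, 200_000, 20
X = rng.normal(0.0, 1.0, size=(n_paths, n))     # data generated under p
factors = np.exp(theta * X - theta**2 / 2.0)
Y = np.cumprod(factors, axis=1)                 # Y_1, ..., Y_n along axis 1
print("E Y_1, E Y_5, E Y_20 =", Y[:, 0].mean(), Y[:, 4].mean(), Y[:, 19].mean())
# All three averages should be close to 1: the martingale keeps mean 1.
```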
7.26. Let us calculate ξ_{n−1,k} := E(1I_{ω∈Δ_{n,k}} /F_{n−1}). For any event A = Δ_{n−1,k} of the partition π_{n−1} the following equality holds:
$$\int_A \xi_{n-1,k}\,dP \;=\; \int_A \mathrm{1I}_{\omega\in\Delta_{n,k}}\,d\lambda^{1} \;=\;
\begin{cases}\lambda^{1}(\Delta_{n,k}), & \text{if } \Delta_{n,k}\subset\Delta_{n-1,k},\\ 0, & \text{if } \Delta_{n,k}\cap\Delta_{n-1,k}=\emptyset.\end{cases}$$
That is why
$$\xi_{n-1,k} \;=\; \frac{\lambda^{1}(\Delta_{n,k})}{\lambda^{1}(\Delta_{n-1,k})}\,\mathrm{1I}_{\omega\in\Delta_{n-1,k}}.$$
n This immediately implies the proof, if to decompose ∑kk=1 in the definition of Xn into sums with respect to the partition πn−1 . 7.31. Implication (1) ⇒ (2) follows from the next chain of equalities: E(Xnτ − τ /F Xn−1 n−1 ) = E(Xτ ∧n − Xτ ∧(n−1) /Fn−1 ) = E(Xn − Xn−1 /Fn−1 )1Iτ >n−1 = 0. Implication (2) ⇒ (3) follows from Theorem 7.12, item (3). Implication (3) ⇒ (1): first we put τ = N; then EXN = EX0 . Let an event A ∈ Fn , and τ = n1IA + N1IA (check that τ is a stopping time). Then item 2) implies that EXτ = EXN ; that is, EXn 1IA +EXN 1IA = EXN , thus, EXn 1IA = EXN 1IA or A Xn dP = A E(XN /Fn )dP, which is equivalent to the martingale property of the process X. 7.32. According to Theorem 7.4, EXτ ∧n = EX0 . The Lebesgue dominated convergence theorem allows us to go to the limit in the integral as n → ∞. 7.33. (1) Theorem 7.4 implies that for any n ∈ Z+ it holds that E(Xτ ∧n /Fσ ) = Xτ ∧n∧σ . Note that Xτ ∧n∧σ → Xτ ∧σ a.s. as n → ∞. Further more, we choose such a sequence nk → ∞ that limnk →∞ E|Xnk |1Iτ >nk = 0. Then E|E(Xτ ∧nk /Fσ ) − E(Xτ /Fσ )| ≤ E|E((Xnk − Xτ )1Iτ >nk /Fσ )| ≤ E|Xnk |1Iτ >nk + E|Xτ |1Iτ >nk → 0 as nk → ∞, because |Xτ |1Iτ >nk → 0 a.s. and Xτ is an integrable r.v. As E(Xτ ∧nk /Fσ ) → E(Xτ /Fσ ) in L1 (P), then there exists nk j → ∞ such that E(Xτ ∧nk j /Fσ ) → E(Xτ /Fσ ) a.s., and the first statement follows. Item (2) follows immediately from item (1), and item (3) we leave for you to do on your own. 7.34. First we prove that E|Xτ | < ∞. Indeed, |Xτ | ≤ |X0 | + ∑τk=1 |Xk − Xk−1 |1Iτ >0 , and
$$\sum_{n=1}^{\infty}\sum_{k=1}^{n} E\big(|X_k - X_{k-1}|\,\mathrm{1I}_{\tau=n}\big)
= \sum_{k=1}^{\infty}\sum_{n=k}^{\infty} E\big(|X_k - X_{k-1}|\,\mathrm{1I}_{\tau=n}\big)
= \sum_{k=1}^{\infty} E\big(|X_k - X_{k-1}|\,\mathrm{1I}_{\tau\ge k}\big).$$
Because 1I{τ ≥k} = 1 − 1I{τ ≤k−1} ∈ Fk−1 , then E (|Xk − Xk−1 |1Iτ ≥k ) = E (1Iτ >k−1 E(|Xk − Xk−1 |/Fk−1 )) ≤ CP(τ ≥ k) and therefore E ∑τk=1 |Xk − Xk−1 |1Iτ >0 ≤ C ∑∞ k=1 P(τ ≥ k) = CEτ < ∞, and thus E|Xτ | < ∞. Now E|Xn |1Iτ >n ≤ E (|X0 | + ∑τk=1 |Xk − Xk−1 |1Iτ >0 ) 1Iτ >n and we just proved that a r.v. |X0 | + ∑τk=1 |Xk − Xk−1 |1Iτ >0 is integrable and 1Iτ >n → 0, n → ∞ a.s. So, E|Xn |1Iτ >n → 0; that is, we obtain the conditions of Problem 7.33. According to item (2) of that problem EXτ = EX0 as σ = 0.
7.36. Let Yn = −Xn . Then Yn is a submartingale, Yn ≤ E(−ξ /Fn ), and E|ξ | < ∞. According to Theorem 7.4, for any n ≥ 0 it holds that Yσ ∧n ≤ E(Yτ ∧n /Fσ ). Note that Yσ ∧n → Yσ , Yτ ∧n → Yτ , as n → ∞ a.s. The only thing we need to ground is the limit change for the conditional mathematical expectation. In order to do this, we need to prove that the family of random variables {Yτ ∧n , n ≥ 1} is uniformly integrable. But E|Yτ ∧n |1IYτ ∧n ≥C ≤ ∑nk=1 E(E(|ξ |/Fk )1Iτ =k 1IE(|ξ |/Fk )≥C ) +E(E(|ξ |/Fn )1Iτ >n 1IE(|ξ |/Fn )≥C ). Note that for any set A ∈ Fk
$$\int_A E(|\xi|/F_k)\,\mathrm{1I}_{\tau=k}\,dP
= \int_{A\cap\{\tau=k\}}|\xi|\,dP
= \int_{A\cap\{\tau=k\}}E(|\xi|/F_\tau)\,dP
= \int_A E(|\xi|/F_\tau)\,\mathrm{1I}_{\tau=k}\,dP,$$
because A ∩ {τ = k} ∈ Fτ . So, E(|ξ |/Fk )1Iτ =k = E(|ξ |/Fτ )1Iτ =k and supn≥1 ∑nk=1 E(E(|ξ |/Fk )1Iτ =k 1IE(|ξ |/Fk )≥C ) = E(|ξ |1IE(|ξ |/Fτ )≥C 1Iτ ≤n ) ≤ E(|ξ |1IE(|ξ |/Fτ )≥C ) → 0, C → ∞. We now put Zn = E(|ξ |/Fn ). This is a uniformly integrable martingale according to Theorem 7.10, thus limC→∞ supn E(Zn 1IZn ≥C ) = 0. The obtained relations mean that {Yτ ∧n , n ≥ 1} is a uniformly integrable sequence. 7.39. It is evident that 1In≤ν = 1 − 1Iν ≤n−1 ∈ Fn−1 . Thus, E(Yn /Fn−1 ) = E(Xn /Fn−1 )1In≤ν + E(2Xν − Xn /Fn−1 )1In>ν = Xn−1 1In−1≤ν − Xn−1 1Iν =n−1 + 2E(Xν 1Iν t then C = A ∩ {t < τ ≤ s} = A ∩ ({τ ≤ s}\{τ ≤ t}) ∈ Fs . (2) We need to prove that for any s ∈ R+ an event A := {τ ≤ s} ∈ Fτ − . It is evident that this event belongs to Fs . Its complement Ac = {τ > s} belongs to Fs as well. So, Ac = Ac ∩ {s < τ } ∈ Fτ − according to the definition of Fτ − . Thus, A ∈ Fτ − .
7.48. It follows directly from the symmetry that E(Zn /Gn+1 ) = (1/n) ∑nk=1 E(Xk /Gn+1 ) = E(X1 /Gn+1 ) = E(1/(n + 1) ∑n+1 k=1 Xk /Gn+1 ) = Zn+1 . 7.50. (1) Denote pk := P(Sn ≤ k, n ≥ 1/S0 = 0). Then p0 = 12 P(Sn ≤ 0, n ≥ 1/ξ1 = −1) = 12 p1 . Furthermore, it is easy to check that p1 = 12 p0 + 12 p2 and, in general, for k > 1 the equality pk = 12 pk−1 + 12 pk+1 implies p0 = 1k pk . Because pk ≤ 1 then p0 = p1 = · · · = 0. It means that P(τ < ∞) = 1. Item (2) is an immediate corollary of item (1). (3) Because Eξ1 = 0 and ESτ = 1, then the first Wald identity implies that Eτ = ∞, otherwise we would obtain that 1 = c · 0 with certain c ∈ R+ . 7.52. Put Xn = et0 Sn /((ϕ (t0 ))n ). Then * ) et0 Sn−1 et0 ξn = Xn−1 , E(Xn /Fn−1 ) = and E (ϕ (t0 ))n−1 ϕ (t0 ) that is, {Xn , Fn , n ∈ Z+ } is a martingale with EXn = 1, and tξ n E (|Xn − Xn−1 |/Fn−1 ) 1Iτ >n−1 = Xn−1 1Iτ >n−1 E ϕe (t0 ) − 1 0 t C t ξn ≤ (ϕe(t0 ))n · E ϕe (t0 ) − 1 ≤ 2et0C . 0
It follows from Problem 7.48 that EX_τ = EX_0 = 1. Therefore, E[e^{t_0 S_τ}/(ϕ(t_0))^τ] = 1.
7.53. We prove only item (1) (item (2) can be proved in a similar way). As above, S_τ = ∑_{i=1}^{∞} ξ_i 1I_{τ≥i}. Here ξ_i and 1I_{τ≥i} are independent, and Eξ_i 1I_{τ≥i} = Eξ_i · P(τ ≥ i) = 0. In addition, E|S_τ| ≤ E∑_{i=1}^{τ}|ξ_i| < ∞. So, we only need to justify the change of the order of summation and expectation. It is sufficient to prove that E∑_{i=1}^{∞}|ξ_i| 1I_{τ≥i} < ∞. But ∑_{i=1}^{∞}|ξ_i| 1I_{τ≥i} = ∑_{i=1}^{τ}|ξ_i| and, according to the condition, E∑_{i=1}^{τ}|ξ_i| < ∞.
7.54. We prove only item (1). It is evident that
$$E(M_n/F_{n-1}) = \mu^{-n} E\Bigl(\sum_{N=1}^{\infty}\sum_{k=1}^{N}\xi_k^{(n)}\mathrm{1I}_{S_{n-1}=N}\Big/F_{n-1}\Bigr)
= \mu^{-n} E\Bigl(\sum_{k=1}^{S_{n-1}}\xi_k^{(n)}\Big/F_{n-1}\Bigr)
= \mu^{-n} E\Bigl(\sum_{k=1}^{\infty}\xi_k^{(n)}\mathrm{1I}_{S_{n-1}\ge k}\Big/F_{n-1}\Bigr).$$
(We change the order of summation and conditional expectation and take into account that S_{n−1} is F_{n−1}-measurable and ξ_k^{(n)} is independent of F_{n−1}.) So,
$$E(M_n/F_{n-1}) = \mu^{-n}\sum_{k=1}^{\infty}\mathrm{1I}_{S_{n-1}\ge k}\,E\xi_k^{(n)} = \mu^{-n+1}\cdot S_{n-1} = M_{n-1}.$$
The order of summation and conditional expectation can be changed because the ξ_k^{(n)} are nonnegative.
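The martingale of Problem 7.54 can be illustrated by a short simulation. The sketch below assumes a Poisson(μ) offspring distribution (any nonnegative offspring law with mean μ would do) and checks that E[μ^{−n}S_n] stays equal to S_0.

```python
import numpy as np

# Galton-Watson branching process S_n with Poisson(mu) offspring numbers.
# Since M_n = mu**(-n) * S_n is a martingale, E M_n should equal S_0 for all n.
rng = np.random.default_rng(2)
mu, S0, n_gen, n_paths = 1.3, 5, 12, 50_000
S = np.full(n_paths, S0)
means = []
for n in range(1, n_gen + 1):
    # The sum of S independent Poisson(mu) offspring counts is Poisson(mu * S).
    S = rng.poisson(mu * S)
    means.append(S.mean() / mu**n)
print("E[mu^-n S_n] for n = 1..%d:" % n_gen)
print(np.round(means, 3))      # all values should be close to S_0 = 5
```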
7.55. (2) We need to prove that P(τ < ∞/X0 = x) = 1. Consider the events A1 = {τ < ∞, Xτ = b} and A2 = {τ < ∞, Xτ = a}. It is evident that the event {τ < ∞} = A1 ∪ A2 . Now, denote α (x) = P(A1 /X0 = x) and β (x) = P(A2 /X0 = x). In this case α (x) and β (x) satisfy the following difference equations with corresponding boundary conditions.
α(x) = pα(x+1) + qα(x−1), α(b) = 1, α(a) = 0;  β(x) = pβ(x+1) + qβ(x−1), β(b) = 0, β(a) = 1.
Let p ≠ q. The equation for α(x) has two obvious solutions α₁(x) = c₁ and α₂(x) = c₂(q/p)^x, where c₁ and c₂ are some nonnegative constants. If we look for a solution of the form α(x) = c₁ + c₂(q/p)^x, then, taking into account the boundary conditions, we obtain
$$\alpha(x) = \frac{(q/p)^{x} - (q/p)^{a}}{(q/p)^{b} - (q/p)^{a}}.$$
Similarly,
$$\beta(x) = \frac{(q/p)^{b} - (q/p)^{x}}{(q/p)^{b} - (q/p)^{a}}.$$
In particular, this implies P(τ < ∞/X_0 = x) = 1. It only remains to prove that the solution of each difference equation with the corresponding boundary conditions is unique. Indeed, let α̃(x) be some solution with α̃(b) = 1, α̃(a) = 0. Then we can find two constants c₁ and c₂ such that c₁ + c₂(q/p)^a = α̃(a) and c₁ + c₂(q/p)^{a+1} = α̃(a+1). So, we obtain from the difference equation for α̃(x) that α̃(a+2) = c₁ + c₂(q/p)^{a+2} and, in general, α̃(x) = c₁ + c₂(q/p)^x, a ≤ x ≤ b. Consequently, the solution α(x) is unique; β(x) can be treated in a similar way. If p = q = 1/2, then the solutions α₁(x) and α₂(x) coincide, but there exists one more obvious solution α(x) = c₂x. Taking into account the boundary conditions we obtain that
$$\alpha(x) = \frac{x-a}{b-a}, \qquad \beta(x) = \frac{b-x}{b-a},$$
and again the solution is unique. Thus, P(τ < ∞/X_0 = x) = 1 for any a ≤ x ≤ b. Statements (4) and (5) have in fact been proved within the proof of statement (2). Another way to prove the statement in the case p ≠ q: the martingale M_{n∧τ} = (q/p)^{X_{n∧τ}} is bounded, because |X_{n∧τ}| ≤ |a| + |b|; so, according to Problem 7.32, EM_τ = EM_0 = (q/p)^x. But EM_τ = (q/p)^a P(τ = τ_a/X_0 = x) + (q/p)^b P(τ = τ_b/X_0 = x), and P(τ = τ_a/X_0 = x) + P(τ = τ_b/X_0 = x) = 1, hence
$$P(\tau=\tau_a/X_0=x) = \frac{(q/p)^{x} - (q/p)^{b}}{(q/p)^{a} - (q/p)^{b}}.$$
(6) Because the r.v. X_τ is bounded and Eξ₁ = 0 when p = q = 1/2, it follows from the first Wald identity that Eτ = ∞ in this case. If p ≠ q, then Eξ₁ = p − q ≠ 0, and
$$EX_\tau = a\,\frac{(q/p)^{x} - (q/p)^{b}}{(q/p)^{a} - (q/p)^{b}} \;+\; b\,\frac{(q/p)^{a} - (q/p)^{x}}{(q/p)^{a} - (q/p)^{b}}.$$
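The absorption probabilities and EX_τ from Problem 7.55 are easy to compare with a simulation. The following sketch uses arbitrarily chosen parameters p, a, b, x (an illustration only) and checks the empirical frequencies against the (q/p)-formulas above.

```python
import numpy as np

# Asymmetric random walk started at x, stopped on hitting a or b (a < x < b).
# Empirical results are compared with the formulas from the solution of 7.55.
rng = np.random.default_rng(3)
p, a, b, x, n_paths = 0.45, -3, 4, 0, 20_000
q, r = 1.0 - p, (1.0 - p) / p            # r = q/p
hit_a, total_XT = 0, 0.0
for _ in range(n_paths):
    pos = x
    while a < pos < b:
        pos += 1 if rng.random() < p else -1
    hit_a += (pos == a)
    total_XT += pos
prob_a = (r**x - r**b) / (r**a - r**b)
EX_tau = a * prob_a + b * (1.0 - prob_a)
print("P(hit a first): empirical %.4f, formula %.4f" % (hit_a / n_paths, prob_a))
print("E X_tau       : empirical %.4f, formula %.4f" % (total_XT / n_paths, EX_tau))
```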
7.66. Let {rn (t), n ≥ 1,t ∈ [0, 1]} be a sequence of Rademaher functions. It means that rn (t) = ±1 and rn (t) are mutually independent with respect to Lebesgue measure on [0, 1] (we consider the probability space Ω = [0, 1], F consisting of Lebesgue measurable sets from [0, 1] and P = λ 1 |[0,1] ). This sequence satisfies the Khinchin inequality where expectation should be treated as an integral on Lebesgue measure. Consider the martingale transformation (r(t) ◦ X)n and put r0 = 0. For this transformation we have r(t) ◦ (r(t) ◦ X)n = (r2 (t) ◦ X)n = Xn . According to the second statement of Theorem 7.11, it holds that E|Xn | p ≤ Cp E|(r(t) ◦ X)n | p ≤ C2p E|Xn | p as p > 1 and t ∈ [0, 1]. (In this formula and below we deal with ordinary expecp/2 tation). The following estimates are valid due to the Khinchin inequality: c p [X]n ≤ 1 p/2 p for p > 0. It follows from these inequalities that for 0 |(r(t) ◦ X)n | dt ≤ C p [X]n p/2 p/2 p > 1 it holds c p E[X]n ≤ E|Xn | p ≤ Cp E[X]n . + 7.72. It is obvious that {W (t), Ft ,t ∈ R } is a martingale. The integral Doob’s inequality holds true for continuous-time martingales: for any t > 0 and p > 1 it holds that E sup0≤s≤t |X(s)| p ≤ (p/(1 − p)) p E|X(t)| p (see Theorem 7.8 for discrete time and,e.g., [57] for continuous time). Thus, E|W (t ∧ τ )|2 ≤ E sup0≤s≤t |W (s)|2 ≤ 4E|W (t)|2 = 4t < ∞, and then W (t ∧ τ ) ∈ L2 (P) ⊂ L1 (P). Now, we need to use a version of Theorem 7.5 but instead of ∞ we consider any T > 0 and prove the martingale property on [0, T ]. 7.91. (1) According to the problem situation, M ∗ (∞) and N ∗ (∞) belong to L2 (Ω , F, P). That is why the product M ∗ (∞)N ∗ (∞) belongs to L1 (Ω , F, P). Furthermore, (MN)∗ (∞) = sup |M(t)N(t)| ≤ M ∗ (∞)N ∗ (∞) t∈R+
+loc and E(MN)∗ (∞) < ∞. We show that in and then the product MN belongs to M this case MN is a uniformly integrable martingale. Indeed, there exists a sequence τn ↑ ∞ of stopping times such that E ((MN)(τn ∧ t)/Fs ) = (MN)(τn ∧ s), and we are able to tend n → ∞, because there exists an integrable dominant (MN)∗ (∞). Thus, E ((MN)(t)/Fs ) = (MN)(s) and MN is a martingale. We can derive its integrability from the fact that E(MN)∗ (∞) < ∞, because in the case E supn |ξn | < ∞, it holds that supn E|ξn |1I|ξn |>C ≤ E supn |ξn |1Isupn |ξn |>C → 0 as n → ∞, and it can be easily generalized for any set of parameters. Now, to prove that EM(τ )N(τ ) = EM(0)N(0) = 0, we write E(MN)(τ ∧ n) = E(MN)(0) = 0 and tend t → ∞. It is possible due to the uniform integrability. (2) In this case for any stopping time τ we have that M(τ )N(τ ) ∈ L1 (Ω , F, P). Thus, E|M(τ )N(τ )| < ∞ and EM(τ )N(τ ) = 0. It follows from Theorem 7.1 and from the generalization of Problem 7.40 to continuous-time processes (check whether it
indeed holds true), that MN has the right-hand continuous modification with finite limits from the left at every point. Now, we put τ (ω ) := t1IA + T 1IA for any t ∈ [0, T ) and any A ∈ Ft and obtain that EM(τ )N(τ ) = EM(t)N(t)1IA + EM(T )N(T )1IA = 0 = EM(T )N(T ). So, EM(t)N(t)1IA = EM(T )N(T )1IA , thus, MN is a martingale. The uniform integrability follows if we note that under our conditions (MN)∗ (∞) ∈ H1 (see also the proof of the first statement of this problem). 7.93. The equivalence of statements (2) and (3) has been proved in Problem 7.92. It is evident that (3) implies (2) if we remember that M0 = 0. Let us assume that statement (1) holds. Because N τ ∈ S for any τ on T then EMT NTτ = EMT Nτ = EMτ Nτ = 0. Now, we can use a discrete-time version of Problem 7.91. 7.96. (1) We define a process L for a given measure Q ∼ P by the equations L(0) = 1 and L(t + 1) = L(t) + (Z(t + 1) − Z(t))/Z(t), t ∈ {0, 1, . . . , T − 1}. It is obvious that equality (7.2) holds within this choice of L. Moreover, L satisfies the condition L(t + 1) − L(t) > −1 because the equivalence of the measures P and Q implies the positivity P-a.s. of random variables Z(t) for all t ∈ T. Now, by induction in t we show that L(t) ∈ L1 (P). It is evident for t = 0. Assume that L(t) ∈ L1 (P). As the process Z is nonnegative, the conditional expectation of the r.v. (Z(t + 1))/Z(t) is correctly defined and P-a.s. satisfies the equality Z(t + 1) 1 /Ft = E(Z(t + 1)/Ft ) = 1. E Z(t) Z(t) It means that (Z(t + 1))/Z(t) ∈ L1 (P) and hence, L(t + 1) = L(t) − 1 +
Z(t + 1)/Z(t) ∈ L1(P).
Now we can derive the martingale property of the process L: because the r.v. Z is positive, we can divide both parts of the equation E(Z(t + 1)/Ft ) = Z(t) by Z(t) and obtain the equality E(L(t + 1) − L(t)/Ft ) = 0. (2) If L satisfies the indicated conditions and the equality (7.2) determines a positive P-martingale Z, then the equalities EZ(t) = Z(0) = 1 are obvious. 7.97. First note that - − M(t - − 1)) (L(t) − L(t − 1))(M(t) =
1 - − M(t - − 1)) − (M(t) - − M(t - − 1)). · Z(t)(M(t) Z(t − 1)
(7.4)
- − Furthermore, note that EZ(t)|M(t)| = EQ |M(t)| < ∞ and the same is true for Z(t)|M(t - − M(t - − 1)) ∈ L1 (P). We put τn = inf{t ∈ T| Z(t) < 1/n} ∧ T , 1)|. Hence, Z(t)(M(t) - − M(t - − 1))1Iτn ≥t ∈ L1 (P). In particular, the conn ∈ N. Then (L(t) − L(t − 1))(M(t) ditional expectations from (7.3) are P-a.s. correctly defined. Furthermore, it follows from (7.4) that the following equalities hold true P-a.s. on the set {τn ≥ t}. - − M(t - − 1)/Ft−1 E M(t) 1 - − M(t - − 1))/Ft−1 E Z(t)(M(t) = Z(t−1) - − M(t - − 1))/Ft−1 . − E (L(t) − L(t − 1))(M(t)
The equality means that the martingale component in the Doob decomposition of the process M̃ with respect to the measure P equals
$$\widetilde M(t) - \sum_{k=1}^{t} E\big((L(k)-L(k-1))(\widetilde M(k)-\widetilde M(k-1))\,/\,F_{k-1}\big), \qquad t\in T.$$
7.103. Consider the stopping time τ = inf{n ≥ 1 : ⟨M⟩_n ≥ C} ∧ N. Then, according to the maximal Doob inequality,
$$P\Big(\max_{0\le n\le N}|M_n|\ge\varepsilon\Big)
\le P(\tau<N) + P\Big(\max_{0\le n\le N}|M_{\tau\wedge n}|\ge\varepsilon\Big)
\le P(\langle M\rangle_N\ge C) + P\Big(\max_{0\le n\le N}|M_{\tau\wedge n}|\ge\varepsilon\Big)
\le P(\langle M\rangle_N\ge C) + \frac{E|M_{\tau\wedge N}|^{2}}{\varepsilon^{2}}
= P(\langle M\rangle_N\ge C) + \frac{E|M_{\tau}|^{2}}{\varepsilon^{2}}.$$
Now, E|M_τ|² = E⟨M⟩_τ ≤ C + E max_{1≤n≤N}(⟨M⟩_n − ⟨M⟩_{n−1}).
7.104. It is sufficient to prove the required inequality only for martingales from the class M², and then to generalize it to M²_loc by passing to the limit. Put τ_N = inf{t : ⟨M⟩(t) ≥ N} ∧ T. Then, due to continuity, ⟨M⟩(τ_N) ≤ N a.s. On the other hand, P(sup_{0≤t≤T}|M(t)| ≥ ε) = P((sup_{0≤t≤T}|M(t)| ≥ ε) ∩ (τ_N < T)) + P((sup_{0≤t≤T}|M(t)| ≥ ε) ∩ (τ_N = T)) ≤ P(τ_N < T) + P(sup_{0≤t≤τ_N}|M(t)| ≥ ε). Taking into account that P(τ_N < T) ≤ P(⟨M⟩(T) ≥ N) and P(sup_{0≤t≤τ_N}|M(t)| ≥ ε) ≤ Nε^{−2}, we obtain the required inequality.
8 Stationary discrete- and continuous-time processes. Stochastic integral over measure with orthogonal values
Theoretical grounds In this chapter we consider complex-valued stochastic processes. Let us recall the definition of covariance for complex-valued random variables X,Y : cov(X,Y ) = E(X − EX)(Y − EY ). Definition 8.1. A stochastic process {X(t),t ∈ R}, E|X(t)|2 < ∞ is called a widesense stationary process if EX(t) = EX(0) for all t ∈ R, and cov(X(t + s), X(s)) = cov(X(t), X(0)) for all s,t ∈ R. The function RX (t) := cov(X(t), X(0)) is called the covariance function for the process {X(t),t ∈ R}. A definition of a wide-sense stationary random sequence {Xn , n ∈ Z} can be given in a similar way. The following two theorems state that the covariance function of a wide-sense stationary process or sequence is a Fourier transform of a finite measure. Theorem 8.1. (Bochner–Khinchin theorem) Assume that the covariance function RX of a wide-sense stationary stochastic process {X(t),t ∈ R} is continuous. Then there exists a finite measure FX on (R, B(R)) such that RX (t) =
$$\int_{-\infty}^{\infty} e^{itu}\,F_X(du), \qquad t\in\mathbb{R}.$$
Theorem 8.2. (Herglotz theorem) Assume that {X_n, n ∈ Z} is a wide-sense stationary sequence. Then there exists a finite measure F_X on ((−π, π], B((−π, π])) such that
$$R_X(n) = \int_{-\pi}^{\pi} e^{inu}\,F_X(du), \qquad n\in\mathbb{Z}.$$
The measure F_X from the Bochner–Khinchin (Herglotz) theorem is called the spectral measure of the process {X(t)} (sequence {X_n}). Its distribution function, which we also denote by F_X, is called the spectral function. If the spectral function F_X is absolutely continuous, then its derivative p_X(x) = F_X′(x) is called the spectral density of the process.
It turns out that wide-sense stationary random sequences and processes can be represented as a Fourier transform of processes with orthogonal increments. In order to formulate the corresponding result, we present a construction of the stochastic integral.
Definition 8.2. Let (Ω, F, P) be a probability space, (E, E) be a space with a σ-finite measure μ, and E_μ = {A ∈ E : μ(A) < +∞} be the ring of sets of finite measure μ. A function Z : E_μ → L²(Ω, F, P) is said to be an orthogonal stochastic measure with the structural measure μ if:
(1) ∀ Δ₁, Δ₂ ∈ E_μ, Δ₁ ∩ Δ₂ = ∅ : Z(Δ₁ ∪ Δ₂) = Z(Δ₁) + Z(Δ₂) a.s.
(2) ∀ Δ ∈ E_μ : E|Z(Δ)|² = μ(Δ).
(3) ∀ Δ₁, Δ₂ ∈ E_μ, Δ₁ ∩ Δ₂ = ∅ : E Z(Δ₁)\overline{Z(Δ₂)} = 0.
Define the stochastic integral of a simple function of the type f = ∑_{k=1}^{n} c_k 1I_{Δ_k}, where c_k ∈ C, Δ_k ∈ E_μ, as
$$\int_{E} f(\zeta)\,Z(d\zeta) \;=\; \sum_{k=1}^{n} c_k\,Z(\Delta_k). \qquad\qquad (8.1)$$
Theorem 8.3. The stochastic integral defined by (8.1) can be uniquely extended to a continuous linear operator which acts from L²(E, E, μ) to L²(Ω, F, P). Moreover, this extension is an isometry; that is, for all f, g ∈ L²(E, E, μ),
$$E\left(\int_{E} f(\zeta)\,Z(d\zeta)\;\overline{\int_{E} g(\zeta)\,Z(d\zeta)}\right) \;=\; \int_{E} f(\zeta)\,\overline{g(\zeta)}\,\mu(d\zeta).$$
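The definition of the stochastic integral and the isometry of Theorem 8.3 can be illustrated with the orthogonal measure generated by a Wiener process (its structural measure is Lebesgue measure, cf. Theorem 8.4 and Problem 8.13(a)). The intervals and coefficients below are arbitrary choices made for the example.

```python
import numpy as np

# Orthogonal stochastic measure generated by a Wiener process on [0, 1]:
# Z((a, b]) = W(b) - W(a), structural measure = Lebesgue measure.
# For the simple function f = sum_k c_k 1I_{(a_k, b_k]} with disjoint intervals,
# (8.1) gives  int f dZ = sum_k c_k (W(b_k) - W(a_k)),  and the isometry predicts
# E |int f dZ|^2 = int |f|^2 d(Lebesgue) = sum_k |c_k|^2 (b_k - a_k).
rng = np.random.default_rng(4)
intervals = [(0.0, 0.2), (0.3, 0.55), (0.7, 1.0)]        # disjoint Delta_k
coeffs = np.array([1.5, -0.7, 2.0])                      # c_k

n_paths = 200_000
integrals = np.zeros(n_paths)
for (a, b), c in zip(intervals, coeffs):
    # W(b) - W(a) ~ N(0, b - a), independent across disjoint intervals.
    integrals += c * rng.normal(0.0, np.sqrt(b - a), size=n_paths)

lhs = (integrals**2).mean()                              # E |int f dZ|^2
rhs = sum(c**2 * (b - a) for (a, b), c in zip(intervals, coeffs))
print("E|int f dZ|^2 = %.4f,  int |f|^2 dmu = %.4f" % (lhs, rhs))
```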
If the space (E, E) is a real line with the Borel σ -algebra, then there exists a one-to-one correspondence between a process with orthogonal increments which is right-continuous in the mean square and an orthogonal stochastic measure with locally finite structural measures. Namely, consider a process {X(t),t ∈ R}, such that E|X(t)|2 < ∞ and: (1) ∀ t ∈ R : lim E|X(s) − X(t)|2 = 0. s→t+
(2) ∀ t1 ≤ t2 ≤ t3 ≤ t4 : E(X(t2 ) − X(t1 ))(X(t4 ) − X(t3 )) = 0. It can be checked easily that the set function FX ((a, b]) := E|X(b) − X(a)|2 , a < b, can be uniquely extended to a locally finite measure on the Borel σ -algebra on R. This measure (and, correspondingly, its distribution function) is said to be the structural measure (structural function) of the stochastic process with orthogonal increments {X(t),t ∈ R}. Theorem 8.4. A mapping ZX ((a, b]) := X(b) − X(a), defined on intervals can be uniquely extended to the orthogonal stochastic measure on R with structural function FX . Conversely, if an orthogonal stochastic measure Z on R and its structural measure are locally finite, then the stochastic process
$$X(t) := \begin{cases} Z((0,t]), & t\ge 0,\\ -Z((t,0]), & t<0, \end{cases}$$
is a process with orthogonal increments and X is right-continuous in the mean square. Because of Theorem 8.4, an orthogonal stochastic measure sometimes is identified with the corresponding orthogonal process, that is a process with orthogonal increments. The following result is one of the most important in the theory of stationary processes. Theorem 8.5. (Spectral representation) Let {X(t),t ∈ R} be a wide-sense stationary continuous in mean square process, EX(t) = 0. Then there exists an orthogonal stochastic measure ZX on R such that X(t) =
$$\int_{\mathbb{R}} e^{i\zeta t}\,Z_X(d\zeta), \qquad t\in\mathbb{R}.$$
If {X_n, n ∈ Z} is a wide-sense stationary sequence with EX_n = 0, then it holds
$$X_n = \int_{-\pi}^{\pi} e^{i\zeta n}\,Z_X(d\zeta), \qquad n\in\mathbb{Z},$$
for some orthogonal stochastic measure ZX on (−π , π ]. Moreover, the spectral measure of the process (or sequence) X coincides with the structural measure of the corresponding orthogonal stochastic measure. In what follows, only wide-sense stationary stochastic processes and sequences, which satisfy conditions of Theorem 8.5, are considered. Definition 8.3. A stationary sequence {εn , n ∈ Z} is called a white noise if Eεn = 0, n ∈ Z and Rε (n) = 0, n = 0, Rε (0) = 1. Let us introduce one more type of stationarity of random sequences. Definition 8.4. A random sequence {Xn , n ≥ 0} is said to be strictly stationary if for any m the distribution of sequences {Xn , n ≥ 0} and {Xn+m , n ≥ 0} is the same. The definition of a strictly stationary stochastic process can be given in a similar way. Remark 8.1. Unless otherwise specified, by the term “stationary process” we mean a wide-sense stationary process. The following construction provides one of the most important examples of a strictly stationary sequence. Example 8.1. Let (Ω , F, P) be a probability space. Assume that T is a measurable transformation of Ω preserving a measure P. That is, an image of the measure P under the transformation T equals P : ∀A ∈ F : P(T −1 (A)) = P(A). Let ξ = ξ (ω ) be a random variable. Then a random sequence {ξn (ω ) = ξ (T n (ω )), n ≥ 0} is strictly stationary.
Let’s introduce a σ -algebra I consisting of sets A ∈ F that are invariant under T ; that is, A = T −1 (A). Definition 8.5. A measure-preserving transformation T is ergodic if every invariant set A has probability either 0 or 1. Theorem 8.6. (Birkhoff–Khintchin theorem) Consider a measure-preserving transformation T . Then for any integrable random variable ξ it holds k ∑n−1 k=0 ξ T (ω ) = E (ξ /I) lim n→∞ n for almost all ω and in the mean as well. If T is ergodic then the corresponding limit almost everywhere is equal to Eξ . If Ω = R∞ and a stationary random sequence is the coordinate sequence (i.e., ξn (x) = xn where x = (x0 , x1 , . . . ) ∈ R∞ ), then ξn can be expressed as a composition of an initial random variable ξ0 and a measure-preserving mapping T : (x0 , x1 , . . .) → (x1 , x2 , . . .) (compare with Example 8.1). In a case of a general strictly stationary random sequence {ξn , n ≥ 0}, there is no measure-preserving mapping T such that ξn (ω ) = ξ0 (T n (ω )) for all ω . Therefore, the following construction which is close to Example 8.1 is proposed. Define shift operator on the set of functions { f (ξ0 , . . . , ξn )| f is bounded and measurable , n ≥ 0} by U f (ξ0 , . . . , ξn ) = f (ξ1 , . . . , ξn+1 ). Let η be a random variable measurable with respect to σ (ξk , k ≥ 0), the σ -algebra generated by the random sequence {ξk , k ≥ 0}. Then there exists a sequence of random variables {ηn := fn (ξ0 , . . . , ξn ), n ≥ 1} converging to η in probability. It can be checked that the sequence of shifts U ηn = fn (ξ1 , . . . , ξn+1 ) is also convergent in probability, and its limit depends on η but does not depend on the approximating sequence {ηn , n ≥ 1}. Denote this limit by U η . Definition 8.6. Let Iξ be the σ -algebra of sets A ∈ σ (ξk , k ≥ 0) for which 1IA = U1IA a.s. A stationary random sequence ξ = {ξn , n ≥ 0} is called ergodic if every event from Iξ has probability either 0 or 1. Theorem 8.7. Let η be a σ (ξk , k ≥ 0)-measurable and integrable random variable. Then ∑n−1 U k η = E η /Iξ for almost all ω . lim k=0 n→∞ n If {ξk , k ≥ 0} is ergodic then the corresponding limit a.s. equals Eη . In particular, if E|ξ0 | < ∞ then limn→∞ n−1 ∑n−1 k=0 ξk = Eξ0 .
Bibliography [82] Chapters V,VI; [24], volume 1, Chapter II §8 and Chapter IV §1–7; [79] Chapters VI, VII, XV, XVI; [72]; [69] Chapter III §6,7; [90] Chapter V; [9] Chapters X and XI; [15] Chapters X and XI, [49], Chapters 15 and 16.
Problems
8.1. Let {ε_n, n ≥ 0} be i.i.d. random variables with Eε_n = 0, Dε_n = 1. Define the stochastic process
$$X(t) = \sum_{n} c_n \varepsilon_n e^{i\lambda_n t}, \qquad t\in\mathbb{R},$$
where {cn , n ≥ 0} ⊂ C, ∑n |cn < ∞, {λn , n ≥ 0} ⊂ R. Prove that {X(t),t ∈ R} is a wide-sense stationary stochastic process. Find the covariance function RX . |2
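A quick numerical check of Problem 8.1 (an illustration with arbitrarily chosen real coefficients and frequencies): the empirical covariance of X should agree with ∑_n |c_n|² e^{iλ_n t}.

```python
import numpy as np

# X(t) = sum_n c_n eps_n e^{i lambda_n t} with i.i.d. centred unit-variance eps_n
# is wide-sense stationary with covariance R_X(t) = sum_n |c_n|^2 e^{i lambda_n t}.
rng = np.random.default_rng(6)
c = np.array([1.0, 0.5, 0.3])
lam = np.array([0.7, -1.3, 2.0])
n_samples = 200_000
eps = rng.choice([-1.0, 1.0], size=(n_samples, 3))   # E eps = 0, D eps = 1

def X(t):
    # All time points are evaluated on the same realizations of eps.
    return (c * eps * np.exp(1j * lam * t)).sum(axis=1)

t, s = 0.8, 0.3
emp = np.mean(X(t + s) * np.conj(X(s)))              # cov(X(t+s), X(s)), mean is 0
theory = (c**2 * np.exp(1j * lam * t)).sum()
print("empirical cov:", np.round(emp, 3), " theoretical:", np.round(theory, 3))
```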
8.2. Let {W (t),t ∈ R} be a Wiener process. Prove that X(t) = W (t + 1) −W (t),t ∈ R is a wide-sense stationary process. Find RX and FX . 8.3. Let {X(t),t ∈ R} be a wide-sense stationary process with the spectral function FX . Denote Y1 (t) =
$$\sum_{k=1}^{n} c_k X(t+\lambda_k), \qquad Y_2(t) = \sum_{j=1}^{m} d_j X(t+\mu_j),$$
where ck , d j ∈ C and λk , μ j ∈ R. Prove that the processes {Y1 (t),t ∈ R} and {Y2 (t), t ∈ R} are wide-sense stationary. Find their covariance and spectral functions. Find also the joint covariance function RY1 ,Y2 (s,t) = cov(Y1 (s),Y2 (t)). 8.4. Let {Xn , n ∈ Z} be a wide-sense stationary sequence with zero mean and covariance function RX (n) = 2−|n| . Find cov(X(3), X(5)), E|X(3)|2 , cov(2X(1) + −k 2 3X(2), 3X(1) − 2iX(3)), and E ∑∞ k=0 3 Xk . 8.5. Let {X(t),t ∈ R} be a wide-sense stationary measurable stochastic process and f : R → C be an integrable function. (1) Prove that the stochastic process Y (t) := R f (t −s)X(s)ds is correctly defined and stationary. (2) Express FY in terms of FX and an orthogonal random measure ZY in terms of ZX . 8.6. Assume that {Xn , n ∈ Z} is a wide-sense stationary random sequence with zero mean and the spectral function FX . Denote Yn = ∑k∈Z cn−k Xk , where {cn } ⊂ C with ∑n |cn | < ∞. (1) Prove that the sequence {Yn , n ∈ Z} is wide-sense stationary. Express the spectral function FY in terms of FX . (2) Prove that if {Xn , n ∈ Z} is a white noise and ∑n |cn |2 < ∞ then the series ∑k cn−k Xk is convergent in the mean square and stationary. 8.7. Assume that the covariance function RX of a wide-sense stationary random sequence satisfies ∑n∈Z |RX (n)|2 < ∞. Prove that the spectral measure has a density 1 RX (n)e−inζ , ζ ∈ (−π , π ], pX (ζ ) = 2π ∑ n where the series converges in L2 ((−π , π ]).
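Problem 8.7 can be illustrated by summing the Fourier series of the covariances numerically. The sketch below assumes the covariance R_X(n) = a^{|n|} with 0 < a < 1, for which the sum of the series has a closed form (it agrees with the spectral density given later in the answer to 8.9(c)).

```python
import numpy as np

# Problem 8.7: p_X(zeta) = (1/2 pi) sum_n R_X(n) exp(-i n zeta).
# For R_X(n) = a^{|n|}, 0 < a < 1, the series sums to
# (1 - a^2) / (2 pi |1 - a e^{-i zeta}|^2).
a = 0.6
zeta = np.linspace(-np.pi, np.pi, 7)
n = np.arange(-200, 201)                                  # truncation of the series
R = a ** np.abs(n)
p_series = (R[None, :] * np.exp(-1j * n[None, :] * zeta[:, None])).sum(axis=1).real / (2 * np.pi)
p_closed = (1 - a**2) / (2 * np.pi * np.abs(1 - a * np.exp(-1j * zeta))**2)
print(np.allclose(p_series, p_closed, atol=1e-10))        # True
```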
8.8. Let X(t) = α(−1)^{N(t)}, where N is a Poisson process and the random variable α is independent of N. Which conditions should α satisfy in order for the process X to be: (a) wide-sense stationary; (b) strictly stationary? Find R_X and F_X in case (a).
8.9. Assume that {X_n, n ∈ Z} is a wide-sense stationary random sequence with zero mean and the covariance function:
(a) R_X(n) = 1 for n = 0, and R_X(n) = 0 for n ≠ 0.
(b) R_X(n) = 4 for n = 0; R_X(n) = 1 for |n| = 1; R_X(n) = 0 for |n| > 1.
(c) R_X(n) = a^n, where a ∈ C, |a| < 1.
(d) R_X(n) = a^n, where a ∈ C, |a| = 1.
(e) R_X(n) = 1 if n is even, and R_X(n) = 0 if n is odd.
(f) R_X(n) = 3 for n = 3k, and R_X(n) = 1 for n ≠ 3k.
(g) R_X(n) = 1/(1+|n|).
(h) R_X(n) = 10 for n = 0, and R_X(n) = 1/(|n|)! for n ≠ 0.
Find the spectral measure FX . Describe the structure of {Xn , n ∈ Z} in items (d), (e), and (f). 8.10. Prove that R(n) = 1/(|n|)!, n ∈ Z cannot be a covariance function of either wide-sense stationary sequence. 8.11. A covariance function of a wide-sense stationary stochastic process {X(t), t ∈ R} is equal to (a) e−|t| . 2 (b) e−t /2 . 1 (c) 1+t 2 . (d) eiλ t . (e) cost. (f) cos2 2t + 1. it (g) eλ (e −1) . (h) R(t) = 1 − |t|, |t| ≤ 1 and R has a period T = 2. (i) sinatat .
1−cost . t2 eiat −1 (k) iat .
(j)
Find the spectral function FX . 8.12. Let {X(t),t ∈ R} be a process with orthogonal increments and its structural measure be Lebesgue measure. Find 2 π (a) E sintdX(t) . 0
1
1
0
0
(b) E tdX(t) (2 + t 2 ) dX(t). 2
3
0
1
(c) E (3 + t)dX(t) t 2 dX(t). 8.13. Prove that {X(t),t ≥ 0} is a process with orthogonal increments and find its structural measure if: (a) X(t) = W (t) is a Wiener process. (b) X(t) = N(t) − λ t, where N is a Poisson process with intensity λ > 0. (c) X(t) = W 2 (t) − t, where W is a Wiener process. 8.14. Let {W (t),t ∈ R} be a Wiener process and h1 , . . . , hn ∈ L2 (R). Prove that (s), i = 1, n} is a Gaussian vector with zero mean and the covariance { R hi (s)dW matrix R hi (s)h j (s)dsni, j=1 .
t 8.15. (a) Prove that a stochastic process {X(t),t ∈ R}, where X(t) = −∞ eα (s−t) dW (s), α > 0, is stationary. (b) Find its covariance and spectral measure. (c) Prove that the increments of the corresponding process with orthogonal increments are Gaussian.
8.16. Find 1
(a) E
4 tdW (t) .
0
(b) EW (3) 1
π/2 0
sin sdW (s). 1
(c) E sdW (s) W (s)ds. 0
0
8.17. Assume that {X(t), t ∈ R} is a process with orthogonal increments and the structural measure FX . Let ϕ : R → R be a nondecreasing function. Put Y (t) = X(ϕ (t)). Prove that {Y (t)} is the process with orthogonal increments too. Find its structural measure. 8.18. Let {X(t),t ≥ 0}, X(0) = 0 be a Gaussian process with zero mean and orthogonal increments, and {W (t),t ≥ 0} be a Wiener process.
(1) Find a function ϕ : [0; ∞) → [0; ∞) such that both stochastic processes {X(t),t ≥ 0} and {W (ϕ (t)),t ≥ 0} have the same distributions. (2) Assume that {X(t),t ≥ 0} is stochastically continuous and limt→+∞ EX 2 (t) = on the ∞. Prove that there exists a nonrandom function ϕ and a Wiener process W initial probability space such that X(t) = W (ϕ (t)),t ≥ 0. (3) Prove that, in the general case, there exist a nonrandom function ψ : [0; ∞) → (t) on some extension of the probability space, for [0; ∞) and a Wiener process W which X(t) = W (ψ (t)), t ≥ 0. (4) Solve the problem for a Gaussian process X(t) defined for all t ∈ R. 8.19. Let {X(t),t ∈ R} be a stochastic process with orthogonal increments and the t f (s)dX(s) structural measure FX . Suppose that f ∈ L2 (R, FX ). Prove that Y (t) := −∞ has orthogonal increments as well. Find its structural function. 8.20. Let f ∈ L2 (R). Prove that X(t) := Find its spectral function.
∞
−∞
f (t − s)dW (s) is a stationary process.
8.21. Let {c j , δ j , j = 1, n} be real numbers. Find a constant c ∈ R for which the process m
X(t) = cW (t) + ∑ c jW (t + δ j ) , t ∈ R j=1
is stationary. Find its spectral function. 8.22. Let {Xn , n ≥ 0} be a white noise. A sequence {Yn , n ≥ 0} is defined by the recurrence relation 1 Yn+1 = Yn + Xn , n ≥ 0, 2 where the random variable Y0 doesn’t depend on {Xn , n ≥ 0}. What should the expectation and variance of Y0 be to ensure the wide-sense stationarity of the sequence {Yn , n ≥ 0}? Find E(Y5Y 3 + 2Y1Y 2 + |Y3 |2 ) in this case. 8.23. Let {Xn , n ∈ Z} be a white noise. (1) Prove that the equation Yn+1 = α Yn + Xn , n ∈ Z has only one stationary solution if |α | = 1. (2) Express Yn as the series ∑k ck Xn−k . Consider the cases when: (a) |α | < 1; (b) |α | > 1. 8.24. The covariance function of a stationary sequence {Xn , n ∈ Z} is ⎧ ⎪ n = 0, ⎨5, RX (n) = 2, |n| = 1, ⎪ ⎩ 0, |n| > 1. (1) Represent Xn in the form ∑k ck εn−k , where {εn , n ∈ Z} is a white noise. (2) Express εn in terms of ZX and also represent it in the form ∑k ak Xn−k , where the series converge in the mean-square.
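Problems 8.22 and 8.23 can be checked with a short simulation. The sketch below assumes a Gaussian white noise and |α| < 1, for which the stationary solution of Y_{n+1} = αY_n + X_n has variance 1/(1 − α²) and lag-one covariance α/(1 − α²).

```python
import numpy as np

# Stationary AR(1) sequence Y_{n+1} = alpha*Y_n + X_n driven by white noise.
rng = np.random.default_rng(5)
alpha, n = 0.5, 200_000
X = rng.normal(0.0, 1.0, size=n)                      # real-valued white noise
Y = np.zeros(n)
for i in range(1, n):
    Y[i] = alpha * Y[i - 1] + X[i - 1]
Y = Y[1000:]       # start at 0 and discard a burn-in; the rest is close to stationary
print("sample variance:", Y.var(), " theory 1/(1-alpha^2):", 1 / (1 - alpha**2))
print("lag-1 covariance:", np.mean(Y[1:] * Y[:-1]),
      " theory alpha/(1-alpha^2):", alpha / (1 - alpha**2))
```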
8.25. Prove that
$$X(t) = \sin t\int_{t-\pi}^{t+\pi}\cos s\,dW(s) \;-\; \cos t\int_{t-\pi}^{t+\pi}\sin s\,dW(s)$$
is a stationary stochastic process. 8.26. Solve Problem 8.23 for the equations: (a) 6Yn+2 − 5Yn+1 +Yn = Xn . (b) 2Yn+1 − 5Yn + 2Yn−1 = Xn . (c) 4Yn + 4Yn−1 +Yn = 3Xn + 2Xn−1 . 8.27. Let {Xn , n ∈ Z} be a white noise. Prove that the equation m
r
k=0
j=0
∑ akYn−k = ∑ b j Xn− j
has unique solution for any collection b0 , . . . , br if and only if the absolute value of k every root of the polynomial P(z) = ∑m k=0 ak z is not equal to 1. And in this case, the solution can be expressed in a form of moving average Yn = ∑k cn−k Xk , where {cn , n ∈ Z} ⊂ C is a summable sequence. 8.28. Prove that for any wide-sense stationary random sequence {Xn , n ∈ Z} the condition |P(z)| = 1 as |z| = 1 ensures the existence of a stationary solution to the equation from Problem 8.27. 8.29. Prove that the process X(t) = W (t + 1) −
t+1 t
W (s)ds is stationary.
8.30. Let {X(t),t ∈ R} be a wide-sense stationary stochastic process. Does the stationary solution exist for the following equations? If it does, is it unique? (a) Y (t) +Y (t) = X(t), dFX (ζ ) = 1Iζ ∈[−5,5] d ζ . (b) Y (t) + 4Y (t) = X(t) − X (t), dFX (ζ ) = (ζ − 2)2 1Iζ ∈[0,5] d ζ . 8.31. Let {Xn , n ∈ Z} be a wide-sense stationary random sequence. Does the stationary solution exist for the following equations? If it does, is it unique? (a) Yn+2 − 2iYn+1 sin α −Yn = Xn , dFX (ζ ) = 1Iζ ∈[−π /2,π /2] d ζ . 2 (b) ∑∞ k=1 (Yn−k )/k! = Xn , dFX (ζ ) = ζ 1Iζ ∈(−π ,π ] d ζ . 8.32. Represent stationary solutions of the equations in Problem 8.26 in the form Yn =
∞
∑ ck Xn−k ,
k=0
where {Xn , n ∈ Z} is a white noise. Evaluate Xn in terms of {Yk , k ∈ Z}. 8.33. Let {X(t),t ∈ R} be a wide-sense stationary stochastic process. Prove that the equation Y (t) + α Y (t) = X(t),t ∈ R, has a stationary solution if Re α = 0, and solutions to this equation are dX(s), − t∞ e−α (t−s) dX(s) when Re α > 0 and Re α < 0, respectively.
t
−α (t−s) −∞ e
8.34. Prove that the equation d n−1Y (t) d nY (t) + a1 + · · · + anY (t) = X(t) n dt dt n−1 has a stationary solution for any stationary process {X(t),t ∈ R} if and only if the polynomial P(z) = zn + a1 zn−1 + · ·· + an does not have roots on the imaginary axis. ∞ f (t − s)X(s)ds. Represent the solution in the form −∞ 8.35. Let {X(t), Y (t),t ∈ (−π , π ]} be processes with orthogonal increments. Assume that a function f ∈ L2 ((−π , π ], FY ) is not vanishing at any point and it holds
(−π ,π ]
eint f (t)ZY (dt) =
(−π ,π ]
eint ZX (dt),
n ∈ Z.
Prove that 1/ f ∈ L2 ((−π , π ], FX ) and for all a, b ∈ (−π , π ]: 1 ZX (dt) . ZY ((a, b]) = (a,b] f (t) 8.36. Let {X(t), Y (t), t ∈ R} be processes with orthogonal increments. Assume that the function f ∈ L2 (R, FY ) is not vanishing at any point and it holds
R
eiut f (t)ZY (dt) =
Prove that for all a, b ∈ R: ZY ((a, b]) =
R
eiut ZX (dt), u ∈ R.
(a,b]
1 ZX (dt) . f (t)
8.37. Let Z be an orthogonal measure with the structural measure μ . Prove that: (a) Z(0) / = 0 a.s. (b) If {Δn , n ≥ 1} is a sequence of measurable disjoint sets with μ (∪n Δn ) < ∞, then the series ∑n Z(Δn ) converges to Z(∪n Δn ) in the mean square. 8.38. Let Z be an orthogonal stochastic measure on [0, 1] generated by a Wiener process {W (t),t ∈ [0, 1]}. Prove that for almost all ω the mapping Z(ω , ·) : B([0, 1]) → R is not a signed measure. (Compare with Problem 8.37.) 8.39. Prove that a wide-sense stationary centered stochastic process X(t) is continu ously differentiable in the mean square if and only if R s2 FX (ds) < ∞. And in this case X (t) = R iζ eiζ t ZX (d ζ ). 8.40. (1) Let {Xn , Yn , n ∈ Z} be wide-sense stationary random sequences. Prove that cov(Xn ,Ym ) = 0 for all n, m if, and only if, for any Borel sets Δ1 and Δ2 it holds EZX (Δ1 )ZY (Δ2 ) = 0. (2) Let {X(t), Y (t), t ∈ R} be continuous in the mean square wide-sense stationary stochastic processes. Prove that cov(X(t),Y (s)) = 0 for all t, s if, and only if, for any Borel sets Δ1 and Δ2 it holds EZX (Δ1 )ZY (Δ2 ) = 0. 8.41. (a) Can a wide-sense stationary sequence not be strictly stationary? (b) Can a strictly stationary sequence not be wide-sense stationary?
8.42. (1) Let {Xn , n ≥ 0} be a homogeneous Markov chain. Assume that X0 has a stationary distribution. Prove that {Xn , n ≥ 0} is a strictly stationary random sequence. (2) Generalize the statement to the case of a continuous-time Markov process. 8.43. Let {Xn , n ≥ 0} be a strictly stationary random sequence. Prove that there exists (possibly, at another probability space) a strictly stationary sequence {X˜n , n ∈ Z} such that the distributions of {Xn , n ≥ 0} and {X˜n , n ≥ 0} coincide. 8.44. Prove that processes (a) X(t) = W (t + 1) −W (t),t ≥ 0, (b) X(t) = N(t + 1) − N(t),t ≥ 0 are strictly stationary. 8.45. Prove that the notions of the strict and wide-sense stationarity coincide for Gaussian random sequences or processes. 8.46. Let Ω = R2 and P be a normal distribution in R2 with zero mean and identity matrix of covariances. Assume that a transformation T : Ω → Ω acts in polar coordinates as T ((r, φ )) = (r, 2φ ), r ≥ 0, 0 ≤ φ < 2π . (a) Prove that T preserves the measure P. (b) Find the limit
1 f (x) + f (T (x)) + · · · + f (T n−1 (x)) , x ∈ R2 lim n→∞ n for ⎞ ⎛ x1 ⎠ , x = (x1 , x2 ) ∈ R2 . f (x) = x12 , f (x) = x1 x2 and f (x) = arccos⎝ . 2 2 x1 + x2 8.47. Let {Xn , n ≥ 0} be a strictly stationary random sequence. Prove that there ˜ P), ˜ a measure P˜ preserving mapping T : Ω˜ → Ω˜ , exists a probability space (Ω˜ , F, and a random variable ξ˜ (in Ω˜ ) such that {Xn , n ≥ 0} and {ξ˜ (T n−1 ), n ≥ 0} are stochastically equivalent in the general sense. 8.48. Let Ω = R∞ , F be a σ -algebra generated by the cylindric sets, and T be a function transforming the sequence (x1 , x2 , . . .) into (x2 , x3 , . . .). Assume that P is a measure corresponding to a sequence of independent identically distributed random variables. Prove that T is the measure P preserving transformation and T is ergodic. 8.49. Assume that a strictly stationary sequence {ξn , n ≥ 0} is m-dependent; that is, families of random variables {ξk , k ≤ n} and {ξ j , j ≥ n + m} are independent for any n. Prove that the sequence {ξn , n ≥ 0} is ergodic. 8.50. Assume that a mapping T is measure-preserving. Prove that T is ergodic if and only if for any random variable ξ with E|ξ | < ∞ it holds E(ξ /I) = Eξ , a.s. Prove a similar statement about the ergodicity of a strictly stationary random sequence. 8.51. Let {Xn , n ≥ 0} be a homogeneous Markov chain with a finite number of states. Assume that all states of {Xn , n ≥ 0} are connected and have unity period, and X0 has a stationary distribution. Prove that {Xn , n ≥ 0} is an ergodic sequence.
8.52. Let {ξn , n ≥ 0} be a strictly stationary sequence, and f : R∞ → R be a measurable function. Prove that the random sequence
ηn := f (ξn+1 , ξn+2 , . . .) , n ≥ 0 is strictly stationary as well. Prove that if {ξn , n ≥ 0} is ergodic then the sequence {ηn , n ≥ 0} is ergodic too. Is the opposite statement correct? 8.53. Let {Xn , n ∈ Z} be a wide-sense stationary random sequence with zero mean. Prove that n−1 ∑n−1 k=0 Xk → ZX ({0}) as n → ∞ in the mean square. 8.54. Assume that a sequence {Xn , n ∈ Z} is both strictly stationary and wide-sense stationary. Suppose additionally that the spectral measure has a singular component. Prove that {Xn , n ∈ Z} is not ergodic. 8.55. (1) Assume that T is a measure-preserving transformation. Prove that T is ergodic if and only if for some p ≥ 1 there exists a total set S in L p (Ω , F, P) (i.e., the completion of the linear hull is the whole space L p ) such that for any random k variable ξ ∈ S the sequence n−1 ∑n−1 k=0 ξ (T ) converges a.s. to a constant. (2) Prove a similar statement about the ergodicity of a strictly stationary random sequence. 8.56. Let Ω = [0, 1), P be a Lebesgue measure in Ω , and T (x) = x + α (mod 1), where α is fixed. (1) Prove that T is a measure-preserving transformation. (2) Prove that T is ergodic if and only if α is an irrational number. (3) Describe the invariant sets if α is rational. 8.57. Let Ω = [0, 1), and P be Lebesgue measure in Ω . Assume that the mapping T transforms the number x = 0.x1 x2 x3 . . . into T (x) = 0.x2 x3 x4 . . .. Prove that T is a measure-preserving transformation. Is T ergodic? 8.58. Let {ξ (t), t ∈ R} be a continuous, strictly stationary stochastic process, and a : R2 → R be a continuous and bounded function. Assume that a satisfies the Lipschitz condition with respect to the first coordinate with a constant L, and α > L is some constant. Prove that there exists a unique strictly stationary solution to the differential equation dX(t) = −α X(t) + a(X(t), ξ (t)), t ∈ R. dt 8.59. Let {ξn , n ∈ Z} be a strictly stationary random sequence, and a : R2 → R be a continuous and bounded function. Assume that a satisfies the Lipschitz condition with respect to the first coordinate with constant L, and α ∈ (−1, 1) is a constant with |α | + L < 1. Prove that there exists a unique strictly stationary sequence {Xn , n ∈ Z} satisfying the equation Xn+1 = α Xn + a(Xn , ξn ), n ∈ Z.
8.60. (Poincar´e theorem on returns) Let (Ω , F, P) be a probability space, T be a measure-preserving mapping, and A ∈ F. Prove that for almost all points ω ∈ A the number of members of the sequence {T n (ω ), n ≥ 1} belonging to A, is infinite. 8.61. Let {X(t),t ≥ 0} be a continuous strictly stationary process. Assume that E maxt∈[0,1] |X(t)| < ∞. (a) Prove that limt→+∞ e−α t X(t) = 0 a.s. for any α > 0. (b) Is it correct that limt→+∞ (X(t))/t 3 = 0 a.s.? 8.62. Give an example of a continuous and strictly stationary process {X(t),t ≥ 0} such that for any α > 0 it holds lim supt→+∞ X(t)e−α t = +∞ a.s. 8.63. Let a function a : Rm → Rm satisfy the Lipschitz condition. Assume that there exists a point x0 such that solution to the equation dy(t) = a(y(t)), t ≥ 0, (8.2) dt with initial condition y(0) = x0 is bounded. Prove that there exists a strictly stationary stochastic process {X(t),t ∈ R}, satisfying (8.2) for all ω . 8.64. Let functions a, b : Rm → Rm satisfy the Lipschitz condition. Assume that there exists a point x0 such that the solution of the stochastic differential equation dy(t) = a(y(t))dt + b(y(t))dW (t), t ≥ 0,
(8.3)
with initial condition y(0) = x0 is bounded in the mean square. Prove that there exists a strictly stationary stochastic process {X(t),t ≥ 0} satisfying (8.3).
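Before the hints, a numerical illustration of Problem 8.15 (a sketch only): the process X(t) = ∫_{−∞}^{t} e^{α(s−t)} dW(s) is stationary with covariance e^{−α|t|}/(2α), and it can be sampled with the exact one-step recursion below (the grid step and α are assumptions for the example).

```python
import numpy as np

# Stationary Ornstein-Uhlenbeck-type process X(t) = int_{-inf}^t e^{alpha(s-t)} dW(s).
# Exact discretisation on a grid of step dt:
# X(t+dt) = e^{-alpha dt} X(t) + N(0, (1 - e^{-2 alpha dt}) / (2 alpha)).
rng = np.random.default_rng(7)
alpha, dt, n = 2.0, 0.05, 200_000
rho = np.exp(-alpha * dt)
sd = np.sqrt((1 - rho**2) / (2 * alpha))
X = np.empty(n)
X[0] = rng.normal(0.0, np.sqrt(1 / (2 * alpha)))     # start in the stationary law
for k in range(1, n):
    X[k] = rho * X[k - 1] + sd * rng.normal()
lag = 10                                             # time lag = lag * dt = 0.5
print("Var X:", X.var(), " theory:", 1 / (2 * alpha))
print("Cov(X(t+0.5), X(t)):", np.mean(X[lag:] * X[:-lag]),
      " theory:", np.exp(-alpha * lag * dt) / (2 * alpha))
```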
Hints 8.14. Prove the corresponding statement for simple functions and then pass to a limit. 8.25. Observe that X(t) =
$$\int_{-\infty}^{\infty} \mathrm{1I}_{t-s\in[-\pi,\pi]}\,\sin(t-s)\,dW(s)$$
and use Problem 8.20.
8.26. See the solution to Problem 8.27.
8.27. Similarly to the solution of Problem 8.24 it can be proved that
$$\sum_{k=0}^{m} a_k e^{-ik\zeta}\,Z_Y(d\zeta) \;=\; \sum_{j=0}^{r} b_j e^{-ij\zeta}\,Z_X(d\zeta). \qquad\qquad (8.4)$$
Denote P(z) = ∑_{k=0}^{m} a_k z^k and Q(z) = ∑_{j=0}^{r} b_j z^j. The equality for the structural measures follows from (8.4):
$$\big|P(e^{-i\zeta})\big|^{2}\,F_Y(d\zeta) \;=\; \frac{1}{2\pi}\,\big|Q(e^{-i\zeta})\big|^{2}\,d\zeta.$$
The measure F_Y must be finite, and the assumptions of the problem imply that the solution {Y_n} exists for any coefficients {b_j}. Thus the polynomial P(z) does not vanish when |z| = 1. It follows from (8.4) that
$$Z_Y(d\zeta) \;=\; \frac{Q(e^{-i\zeta})}{P(e^{-i\zeta})}\,Z_X(d\zeta).$$
Let us represent Q(z)/P(z) as a sum:
$$\frac{Q(z)}{P(z)} \;=\; \sum_{k=0}^{l}\alpha_k z^{k} \;+\; \sum_{j=1}^{m_1}\sum_{k=1}^{m_2}\frac{\beta_{jk}}{(\gamma_{jk}-z)^{j}},$$
(8.5)
where αk , β jk , γ jk ∈ C. Because P doesn’t have roots on the unit circumference, then |γ jk | = 1 for all j, k. Let us expand every term of the second sum (8.5) into a uniformly convergent power series about z in the neighborhood of the unit circumference. Use spectral representation to complete the proof. 8.28. See the hint to Problem 8.27. 8.29. Use the result of Problem 13.2 and verify that X(t) =
t+1 t
(t − s)dW (s) =
∞
−∞
f (t − s)dW (s),
where f (t) = t1It∈[0,1] . 8.32. The following can be obtained similarly to the solution of Problem 8.27: π ∏k βk − e−iζ ZX (d ζ ) , α ei(n+m)ζ Yn = −π ∏ j γk − e−iζ where |γ j | = 1, m ∈ Z, and α , βk , γ j ∈ C. Note the following two facts. (1) For any γ ∈ C it holds |γ − eiζ | = |γ − e−iζ |. (2) If {Xn } is a white noise then for any measurable function f with | f (ζ )| = 1 as ζ ∈ (−π , π ], the sequence Xn =
π
−π
einζ f (ζ ) ZX (d ζ ) , n ∈ Z
is a white noise as well. Let Λ be a set of indices for which |γ j | < 1. Check that the function g(z) = α
∏k (βk − z) ∏ j∈/Λ (γ j − z) ∏ j∈Λ γ j − z−1
can be expanded into the Taylor series in a neighborhood of the unit circle, and observe that ∏k βk − e−iζ α ei(n+m)ζ ∏ j γk − e−iζ can be represented as g(e−iζ ) f (ζ ), where | f (ζ )| = 1 as ζ ∈ (−π , π ]. 8.34. If the solution exists then P(eiζ )ZY (d ζ ) = ZX (d ζ ). The stochastic measure ZY (d ζ ) = P(eiζ )−1 ZX (d ζ ) has the finite structural measure for any wide-sense stationary process X if and only if P(z) = 0 as |z| = 1. Rewrite P(z)−1 as n1 n2
αk j
j=1 k=1
βk j − z
∑∑
j
and use Problem 8.5 (see also the solution to Problem 8.33). 8.35. At first check that it holds
(−π ,π ]
g(t) f (t)ZY (dt) =
(−π ,π ]
g(t)ZX (dt),
for any g ∈ C([−π , π ]). Then prove this formula for every g, such that f g ∈ L2 ([−π ; π ], FY ). 8.43. Use the Kolmogorov theorem on finite-dimensional distributions (Theorem 1.1). 8.45. Finite-dimensional distributions of a Gaussian process are uniquely determined by the mean and covariance functions. 8.46. The corresponding limits of averages are equal to the conditional expectation of initial functions with respect to the σ -algebra generated by the random variable ϕ . At first, prove this fact for the functions of the form f (r, ϕ ) = ∑m k=0 ck 1Iϕ ∈[αk ,βk ] 1Ir∈[xk ,yk ] , and then pass to a limit. coincide with the = R∞ , and let the finite-dimensional distributions of P 8.47. Take Ω finite-dimensional distributions of {Xn , n ≥ 0}. In this case the shift-transformation T : (x1 , x2 , . . .) → (x2 , x3 , . . .) satisfies the condition of the problem. 8.48. To prove ergodicity use the result of Problem 8.55 and the law of large numbers. 8.49. See the hint to Problem 8.55. 8.51. Use Theorem 10.3 and the result of Problem 8.55. 8.54. The sequence {Xn , n ∈ Z} is not regular (see Theorem 9.3), therefore it cannot be ergodic. 8.56. (2) Use Problem 8.55. Take exponential functions {einx , n ∈ Z} as a total set. 8.57. To prove the ergodicity of T use Problem 8.48 (the coordinates are independent random variables). 8.58. Uniqueness. Let X(t) and Y (t) be solutions to the initial equation, t > s. Check that (X(t) −Y (t))2 eα t ≤ (X(s) −Y (s))2 eα s +
t s
L(X(z) −Y (z))2 eα z dz.
The Gronwall–Bellman lemma (see Problem 14.17) implies that (X(t) − Y (t))2 ≤ (X(s) −Y (s))2 exp{−(α − L)(t − s)}. Tending s → −∞ prove that X(t) = Y (t) a.s. Existence. Denote the solution to the following equation by Xs (t),t ≥ s. dXs (t) dt = a(Xs (t), ξ (t)), t ≥ s, Xs (s) = 0. Use the same bounds as in the proof of uniqueness and check that for every ω the processes Xs (t) uniformly converge on any interval [a, b] as s → −∞, and the limiting process is stationary. 8.59. Check that for any ξ the function x → α x + a(x, ξ ) is a contractive mapping. Furthermore, use reasoning similar to the hint of Problem 8.58. 8.60. Let N = A ∩ ∩n≥1 T −n (Ω \ A). Check that for any numbers m = n it holds T −n (N) ∩ T −m (N) = 0/ and P(N) = P(T −n (N)). Conclude that P(N) = 0, that is, almost all points return to the set A at least once. 8.61. Let f be a nonnegative and decreasing function. Then
E sup |X(t)| f (t) ≤ t≥0
∞
∞
n=0
n=0
max |X(t)| f (n) = E max |X(t)| ∑ f (n). ∑ E t∈[n,n+1] t∈[0,1]
8.62. Denote X(t) = f (W (t + 1) − W (t)), where f is a continuous function with 2 P( f (W (1)) > en ) > 1/n, n > 2. To prove the statement use the Borel–Cantelli lemma. 8.63. Denote by y(t, x) the solution to (8.2) with initial condition y(0, x) = x. Let μT be the image of Lebesgue measure on [0;T] under the mapping y(·, x0 ). Because the function y(t, x0 ),t ≥ 0 is bounded, the family of measures (1/T )μT is relatively compact. Let ν be a limit point of this family, and ξ be a random variable with distribution ν . Check that X(t) = y(t, ξ ) is the desired process. 8.64. See the hint to Problem 8.63.
Answers and Solutions 2 iλn t 8.1. RX (t) = ∑ n |cn | e . 1 − |t|, |t| < 1, 8.2. RX (t) = 0, |t| ≥ 1.
1 dFX (ζ ) = p(ζ )d ζ , where p(ζ ) = (1/2π ) −1 (1 − |t|)e−iζ t dt = (1 − cos ζ )/πζ 2 . 8.4. 0.25; 1; 6RX (0) + 4iRX (−2) + 9RX (1) + 6iRX (−1) = 10, 5 + 4i; 63/40. 8.5. Justify the transformations: f (t − s)X(s)ds = f (s)X(t − s)ds = f (s) eiζ (t−s) ZX (d ζ ) ds R R R R f (s)e−iζ s ds ZX (d ζ ) = eiζ t fˇ (ζ ) ZX (d ζ ) , = eiζ t R
R
R
where fˇ is inverse Fourier transform of f . Therefore, 2 FY (d ζ ) = fˇ (ζ ) ZX (d ζ ) . 2 8.6. FY (d ζ ) = ∑k ck e−ikζ FX (d ζ ) .
8.7. Verify that ∫_{−π}^{π} e^{inζ} p_X(ζ) dζ = R_X(n). Because p_X is a function from L²((−π, π]), p_X(ζ)dζ is a finite signed measure on (−π, π]. It follows from
∫_{−π}^{π} e^{inζ} p_X(ζ) dζ = ∫_{−π}^{π} e^{inζ} F_X(dζ), n ∈ Z,
that p_X(ζ)dζ = F_X(dζ); thus p_X is a spectral density.
8.8. (a) Eα(−1)^{N(t)} = Eα · E(−1)^{N(t)} = e^{−2λt} Eα. Thus Eα = 0. Let t ≥ s and σ² = Eα² < ∞. Then EX(t)X(s) = σ² E(−1)^{N(t−s)} = σ² e^{−2λ(t−s)}. So the process X is stationary and R_X(t) = σ² e^{−2λ|t|}. The spectral density is
p_X(ζ) = (σ²/2π) ∫_{−∞}^{∞} e^{−2λ|t|} e^{−itζ} dt = 2σ²λ / (π(4λ² + ζ²)).
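For readers who want a quick numerical sanity check of the last Fourier integral, the following sketch (not part of the original solution; the values of σ², λ and the truncation are arbitrary) compares a truncated numerical integral with the closed form.

```python
import numpy as np

# Sanity check (sketch): truncated Fourier integral of sigma^2/(2*pi)*exp(-2*lam*|t|)
# versus the closed form 2*sigma^2*lam / (pi*(4*lam^2 + zeta^2)).
sigma2, lam = 1.0, 0.7                                  # arbitrary sample values
t, dt = np.linspace(-60, 60, 240001, retstep=True)      # ad hoc truncation and grid
for zeta in (0.0, 0.5, 2.0):
    integrand = sigma2 / (2 * np.pi) * np.exp(-2 * lam * np.abs(t) - 1j * t * zeta)
    numeric = (integrand.sum() * dt).real               # simple Riemann sum
    closed = 2 * sigma2 * lam / (np.pi * (4 * lam**2 + zeta**2))
    print(zeta, numeric, closed)                        # the two columns should agree closely
```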
(b) α must have a symmetric distribution.
8.9. (a) (1/2π) dζ.
(b) ((2 + cos ζ)/π) dζ.
(c) (1/2π) · (1 − |a|²)/|1 − a e^{−iζ}|² dζ.
(d) The unit mass concentrated at the point ϕ with e^{iϕ} = a. In this case X_n = a^n X_0.
(e) Check that X_{n+2} = X_n. Thus, for any n ∈ Z:
∫_{−π}^{π} e^{iζ(n+2)} Z_X(dζ) = ∫_{−π}^{π} e^{iζn} Z_X(dζ).
Therefore (e^{2iζ} − 1) Z_X(dζ) = 0. So the measure F_X(dζ) is concentrated at the points ζ with e^{2iζ} = 1; that is, F_X(dζ) = c_1 δ_{0} + c_2 δ_{π}. Substitute n = 0 and n = 1 into the equality R_X(n) = ∫_{−π}^{π} e^{iζn} F_X(dζ) and obtain the system of linear equations c_1 + c_2 = 1, c_1 − c_2 = 0. So c_1 = c_2 = 1/2.
(f) Check that X_{n+3} = X_n and use the reasoning of the previous item.
(g) p_X(ζ) = (1/2π) ∑_n e^{−iζn}/(1 + |n|) = (1/2π)( −e^{−iζ} ln(1 − e^{iζ}) − e^{iζ} ln(1 − e^{−iζ}) − 1 ).
(h) p_X(ζ) = (1/2π)( e^{e^{iζ}} + e^{e^{−iζ}} + 9 ).
8.10. Assume the contrary. In this case the spectral density is
p_X(ζ) = (1/2π)( e^{e^{iζ}} + e^{e^{−iζ}} − 1 ).
This function is negative in some neighborhood of the point π. We get a contradiction.
8.11. (a) (1/π) · 1/(1 + ζ²) dζ.
(b) (1/√(2π)) e^{−ζ²/2} dζ.
(c) (e^{−|ζ|}/2) dζ (see item (a)).
(d) δ_{λ}.
(e) cos t = (e^{it} + e^{−it})/2, thus F_X(dζ) = (δ_{1} + δ_{−1})/2.
(f) cos² 2t + 1 = (cos 4t + 3)/2 = (e^{4it} + e^{−4it} + 6)/4, thus F_X(dζ) = (δ_{4} + δ_{−4} + 6δ_{0})/4.
(g) e^{λ(e^{it}−1)} = e^{−λ} ∑_{n=0}^{∞} λ^n e^{int}/n!. So F_X is the Poisson distribution with parameter λ.
(h) Express R_X(t) as the series R_X(t) = ∑_n c_n e^{πint} with
c_n = (1/2) ∫_{−1}^{1} R_X(t) e^{−πint} dt, which equals 1/√2 for n = 0, equals 0 for even n ≠ 0, and equals 2√2/(π²n²) for odd n.
So,
F_X(dζ) = ∑_n c_n δ_{πn}.
(i) Uniform distribution on [−a, a].
(j) ((1 − |ζ|)/2) 1I_{ζ∈[−1,1]} dζ.
(k) Uniform distribution on [0, a].
8.12. (a) ∫_0^π sin²t dt = π/2.
(b) ∫_0^1 t(2 + t²) dt = 1.25.
(c) E( ∫_0^2 (3+t) dX(t) ∫_1^3 t² dX(t) ) = E( ∫_0^3 (3+t) 1I_{t∈[0,2]} dX(t) ∫_0^3 t² 1I_{t∈[1,3]} dX(t) ) = ∫_1^2 (3+t) t² dt = 11.75.
8.13. (a) F_X(dt) = dt, t ≥ 0. (b) F_X(dt) = λ dt, t ≥ 0. (c) F_X(dt) = 4t dt, t ≥ 0.
8.15. (a) R_X(t) = (1/2α) e^{−α|t|}, F_X(dζ) = α/(π(α² + ζ²)) dζ.
(b) Use the solution to Problem 8.5 and check that for any a, b ∈ R the random variable Z_X(b) − Z_X(a) = ∫_{−∞}^{∞} 1I_{ζ∈[a,b]} Z_X(dζ) can be expressed as a limit of Gaussian random variables of the form
lim_{n→∞} ∫_{−∞}^{∞} f_n(t−s) X(s) ds,
where {f_n, n ≥ 1} is a sequence of integrable functions.
8.16. (a) It follows from Problem 8.14 that ∫_0^1 t dW(t) ∼ N(0, 1/3). So E( ∫_0^1 t dW(t) )⁴ = 3σ⁴ = 1/3.
(b) Because W(3) = ∫_0^3 1 dW(s), it holds
E( ∫_0^3 1 dW(s) · ∫_0^{π/2} sin s dW(s) ) = ∫_0^{π/2} sin s ds = 1.
(c) E( ∫_0^1 s dW(s) · ∫_0^1 W(s) ds ) = E( ∫_0^1 z dW(z) · ∫_0^1 ( ∫_0^1 1I_{z∈[0,s]} dW(z) ) ds ) = ∫_0^1 ( ∫_0^s z dz ) ds = 1/6.
8.17. F_Y(t) = F_X(ϕ(t)).
8.18. (a) Let σ²(t) = EX²(t). Verify that the function σ² is monotonically decreasing. Put ϕ(t) = σ²(t).
(b) W̃(t) = X(ϕ^{(−1)}(t)), where ϕ^{(−1)}(t) = inf{s : ϕ(s) = t}.
(c) Define the monotone function ψ(t) = inf{s : ϕ(s) ≥ t}. There exists an enumerable set of disjoint intervals (a_n, b_n) on which the function ψ is constant. Let {Y_n(t), t ∈ [a_n, b_n]} be a sequence of independent Brownian bridges on [a_n, b_n], independent also of the process X. Then
W̃(t) := X(ψ(t)) if t ∉ (a_n, b_n) for all n, and W̃(t) := X(a_n) + Y_n(t) + ((t − a_n)/(b_n − a_n))(X(b_n) − X(a_n)) if t ∈ (a_n, b_n) for some n,
is the desired Wiener process.
8.19. F_Y(dt) = |f(t)|² F_X(dt).
8.20. R_X(t) = ∫_R f(t−s) f(−s) ds, F_X(dζ) = |f̂(ζ)|² dζ.
8.21. Let 0 < δ_1 < ··· < δ_n. The variance of the process X is constant only for c = −(c_1 + ··· + c_n). In order to check that this implies the stationarity, observe that
X(t) = ∑_{k=1}^{n} (c_k + ··· + c_n)(W(t + δ_k) − W(t + δ_{k−1})) = ∫_R f(t−s) dW(s),
where δ_0 = 0 and
f(t) = ∑_{k=1}^{n} (c_k + ··· + c_n) 1I_{t∈[δ_{k−1},δ_k]}.
To complete the proof, use the result of Problem 8.20.
8.22. If {Y_n, n ≥ 0} is stationary and a = EY_n, σ² = DY_n, then
a = EY_{n+1} = (1/2)EY_n + EX_n = (1/2)EY_n = (1/2)a.
Thus a = 0. Moreover,
DY_1 = (1/2)DY_0 + DX_0 = (1/2)σ² + 1.
Therefore σ² = 2. In this case R_Y(n) = 2^{−|n|+1}, n ∈ Z, and
E( Y_5 Ȳ_3 + 2Y_1 Ȳ_2 + |Y_3|² ) = R_Y(2) + 2R_Y(−1) + R_Y(0) = 5.
8.23. Solution 1. Let |α| < 1. Then
Y_{n+1} = αY_n + X_n = α²Y_{n−1} + αX_{n−1} + X_n = ··· = α^{m+1} Y_{n−m} + ∑_{k=0}^{m} α^k X_{n−k},  m ∈ N.   (8.6)
Because E|Y_{n−m}|² = E|Y_0|² = const, the first term in (8.6) tends to zero in the mean square as m → ∞. This implies the equality
Y_n = ∑_{k=0}^{∞} α^k X_{n−k−1}.
Solution 2. It follows from the spectral representation that for any n ∈ Z
∫_{−π}^{π} e^{i(n+1)ζ} Z_Y(dζ) = α ∫_{−π}^{π} e^{inζ} Z_Y(dζ) + ∫_{−π}^{π} e^{inζ} Z_X(dζ),
and thus
∫_{−π}^{π} e^{inζ} (e^{iζ} − α) Z_Y(dζ) = ∫_{−π}^{π} e^{inζ} Z_X(dζ), n ∈ Z.
Problem 8.35 implies that Z_Y(dζ) = (e^{iζ} − α)^{−1} Z_X(dζ). Therefore
Y_n = ∫_{−π}^{π} (e^{inζ}/(e^{iζ} − α)) Z_X(dζ).
If |α| < 1, then
e^{inζ}/(e^{iζ} − α) = e^{i(n−1)ζ} ∑_{m=0}^{∞} α^m e^{−imζ}.
Because the series converges in L²((−π, π], F_X) = L²((−π, π], (2π)^{−1} dζ), we get
Y_n = ∑_{k=0}^{∞} α^k X_{n−k−1}.
If |α| > 1, then
e^{inζ}/(e^{iζ} − α) = −∑_{k=0}^{∞} α^{−k−1} e^{iζ(n+k)}
and
Y_n = −∑_{k=0}^{∞} α^{−k−1} X_{n+k}.
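The series representation is easy to test numerically. The sketch below (an illustration only; α, the sample size, and the truncation level K are arbitrary choices) checks that the truncated series Y_n = ∑_k α^k X_{n−k−1} satisfies the recursion Y_{n+1} = αY_n + X_n up to an error of order α^K.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, N, K = 0.5, 2000, 60          # |alpha| < 1; K is an ad hoc truncation level
X = rng.standard_normal(N)           # simulated white noise path

# Truncated series Y_n = sum_{k=0}^{K-1} alpha^k X_{n-k-1}
Y = np.array([sum(alpha**k * X[n - k - 1] for k in range(K)) for n in range(K + 1, N)])
n0 = K + 1                            # index of X corresponding to Y[0]

# Check the recursion Y_{n+1} = alpha * Y_n + X_n along the simulated path.
residual = Y[1:] - (alpha * Y[:-1] + X[n0:n0 + len(Y) - 1])
print(np.max(np.abs(residual)))       # of order alpha**K, i.e. practically zero
```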
8.24. Observe that (see Problem 8.7)
F_X(dζ) = |2 + e^{iζ}|² dζ/(2π), ζ ∈ (−π, π].
So (Problem 8.19) the orthogonal random measure Z_ε(dζ) := (2 + e^{iζ})^{−1} Z_X(dζ) has the structural density 1/(2π). It means that
ε_n := ∫_{−π}^{π} e^{inζ} (2 + e^{iζ})^{−1} Z_X(dζ) = ∑_{k=0}^{∞} (−1)^k 2^{−k−1} X_{n+k}
is a white noise. Therefore
X_n = ∫_{−π}^{π} (2 + e^{iζ}) e^{inζ} Z_ε(dζ) = 2ε_n + ε_{n+1}.
8.30. (a) If a solution exists then (see Problem 8.39) the following equalities hold:
(1 − ζ²) Z_Y(dζ) = Z_X(dζ), |1 − ζ²|² F_Y(dζ) = F_X(dζ) = 1I_{ζ∈[−5,5]} dζ.   (8.7)
No finite measure F_Y satisfies relation (8.7). Therefore there is no stationary solution.
(b) (4 − ζ²) Z_Y(dζ) = (1 − iζ) Z_X(dζ). It is easy to check that the orthogonal random measure (1 − iζ)(4 − ζ²)^{−1} Z_X(dζ) is correctly defined and the process
Y(t) = ∫_{−∞}^{∞} e^{itζ} ((1 − iζ)/(4 − ζ²)) Z_X(dζ)
is a stationary solution to the initial equation. This solution is not unique. A general stationary solution has the form α_1 e^{2it} + α_2 e^{−2it} + Y(t), where the random variables α_1 and α_2 satisfy E|α_i|² < ∞, Eα_i = 0, and α_1, α_2 are orthogonal in L² and orthogonal to {X(t), t ∈ R}.
8.31. (a) It follows from the spectral representation that
(e^{2iζ} − 2i e^{iζ} sin α − 1) Z_Y(dζ) = Z_X(dζ),
or
(e^{iζ} − e^{iα})(e^{iζ} + e^{−iα}) Z_Y(dζ) = Z_X(dζ).
So
|e^{iζ} − e^{iα}|² |e^{iζ} + e^{−iα}|² F_Y(dζ) = 1I_{[−π/2, π/2]}(ζ) dζ.
There is no finite measure F_Y satisfying the last equality. Answer: there is no stationary solution.
(b) ∑_{k=1}^{∞} e^{−iζk}/k! = e^{e^{−iζ}} − 1. Because e^{e^{−iζ}} − 1 ≠ 0 for ζ ∈ (−π, π], there exists a unique stationary solution to the equation:
Y_n = ∫_{−π}^{π} e^{inζ} (e^{e^{−iζ}} − 1)^{−1} Z_X(dζ).
8.33. Consider only the case Re α > 0. Then ∫_{−∞}^{t} e^{−α(t−s)} X(s) ds = ∫_{−∞}^{∞} f(t−s) X(s) ds, where f(s) = e^{−αs} 1I_{s∈[0,∞)}. It follows from the solution to Problem 8.5 that
Y(t) = ∫_{−∞}^{∞} e^{iζt} f̌(ζ) Z_X(dζ),
where f̌(ζ) = ∫_{−∞}^{∞} e^{−iζt} f(t) dt = 1/(α + iζ). To complete the proof, use the result of Problem 8.39.
8.38. Use the fact that trajectories of a Wiener process have infinite variation a.s. on any interval (Problem 3.19).
8.41. (a) Yes. (b) Yes, if the sequence doesn't have a finite second moment.
8.55. The transformation U: L_p(Ω, F, P) ∋ ξ → ξ(T) ∈ L_p(Ω, F, P) is an isometry; therefore the norm in L_p(Ω, F, P) of the operator P_n: ξ → (1/n) ∑_{k=0}^{n−1} ξ(T^k) is bounded by unity. The Birkhoff–Khinchin theorem implies that the sequence P_n strongly converges to the conditional expectation. So, to prove that the limit is a constant a.s., it is sufficient to check this on a total set.
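As a numerical illustration of the Birkhoff–Khinchin averages invoked in Problems 8.48 and 8.55–8.57 (a sketch; the irrational rotation and the test function are arbitrary choices, not taken from the book), one can compare the time average along a single orbit with the space average.

```python
import numpy as np

# Ergodic time averages for the rotation T(x) = x + theta (mod 1) with irrational theta.
# For an ergodic transformation the time average of f along one orbit converges to the
# space average of f; theta and f below are arbitrary choices for the illustration.
theta = np.sqrt(2) - 1
f = lambda x: np.cos(2 * np.pi * x) ** 2

x0, n = 0.3, 200000
orbit = (x0 + theta * np.arange(n)) % 1.0
time_avg = f(orbit).mean()
space_avg = 0.5                      # integral of cos^2(2*pi*x) over [0, 1]
print(time_avg, space_avg)           # the two numbers should be close
```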
9 Prediction and interpolation
Theoretical grounds Let {Xn , n ∈ Z} be a mean zero wide-sense stationary random sequence, Λ ⊂ Z be a set of indices, and HΛ := L cl (Xn , n ∈ Λ ) be a closure in L2 (Ω , F, P) of a linear hull of the system of random variables {Xn , n ∈ Λ }. In this chapter we study methods to find the projection of some element Xk on the space HΛ in L2 (Ω , F, P), that is, to find the best approximation of Xk by a linear combination of variables {Xn , n ∈ Λ }. Denote this projection by πHΛ (Xk ). The most important cases for technical applications are Λ = {−n, n ≥ 0} and Λ = Z \ {n1 , . . . , nm }. The first case is called the prediction problem, when we need to approximate in the best way the “future” Xn via “past”, that is, via observations X0 , X−1 , X−2 , . . . . The second case is called the interpolation problem and the task is to reconstruct in the best way some “lost” elements of the sequence {Xn , n ∈ Z}. Consider the prediction problem first. Introduce the spaces: Hk (X) = L cl (Xn , n ≤ k); S(X) = ∩k∈Z Hk (X);
H(X) = L cl (X_n, n ∈ Z); R(X) = H(X) ⊖ S(X).
Let π_k = π_{H_k(X)} be the projection in L²(P) on H_k(X), and π_{−∞} = π_{S(X)}.
Definition 9.1. A sequence {X_n, n ∈ Z} is called regular if H(X) = R(X) and singular if H(X) = S(X).
Write X_n as a sum X_n^r + X_n^s, where X_n^s = π_{−∞}(X_n) and X_n^r = X_n − X_n^s. It can be proved that the sequences {X_n^s} and {X_n^r} are mutually orthogonal and wide-sense stationary. Furthermore, the sequence {X_n^r} is regular and {X_n^s} is singular. Because π_0(X_n^s) = X_n^s, it makes sense to consider the prediction problem for regular sequences only. The following result is a key tool for solving the prediction problem.
Theorem 9.1. (Wald decomposition) Let X = {X_n, n ∈ Z} be a nondegenerate wide-sense stationary sequence. It is regular if and only if there exist a white noise ε = {ε_n, n ∈ Z} and a sequence {a_n, n ∈ Z_+} ⊂ C with ∑_{n∈Z_+} |a_n|² < ∞, such that:
(a) X_n = ∑_{k=0}^{∞} a_k ε_{n−k}, and the series converges in the mean square.
(b) H_n(X) = H_n(ε), n ∈ Z.
Theorem 9.2. Let X_n = ∑_{k=0}^{∞} a_k ε_{n−k} be the Wald decomposition of a regular wide-sense stationary sequence {X_n, n ∈ Z}. The solution to the prediction problem is given by the formula
π_0(X_n) = ∑_{k=n}^{∞} a_k ε_{n−k}.
The following statement provides a criterion for regularity in terms of the spectral function.
Theorem 9.3. (Kolmogorov theorem) A nondegenerate wide-sense stationary random sequence is regular if, and only if, it has a spectral density p with
∫_{−π}^{π} log p(ζ) dζ > −∞.
In particular cases it is possible to express π_0(X_n) in terms of the spectral representation.
Theorem 9.4. Assume that the spectral density of a wide-sense stationary sequence {X_n, n ∈ Z} is of the form |Φ(e^{−iζ})|², where the function Φ(z) = ∑_{k=0}^{∞} a_k z^k is analytic in the circle {|z| ≤ 1} and Φ(z) ≠ 0 for |z| ≤ 1. Denote Φ_n(z) = ∑_{k=n}^{∞} a_k z^k. Then
π_0(X_n) = ∫_{−π}^{π} e^{inζ} (Φ_n(e^{−iζ})/Φ(e^{−iζ})) Z_X(dζ).   (9.1)
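To see formula (9.1) and Theorem 9.2 at work numerically, the sketch below simulates the sequence of Problem 9.4(a) with R_X(n) = a^{|n|} (the AR(1) recursion with step √(1−a²) is an assumption made only for this illustration) and checks that π_0(X_n) = a^n X_0 is indeed the orthogonal projection.

```python
import numpy as np

# Monte Carlo check of the predictor pi_0(X_n) = a^n X_0 for a sequence with R_X(n) = a^{|n|}.
rng = np.random.default_rng(1)
a, n, m = 0.8, 5, 200000

X0 = rng.standard_normal(m)                      # stationary start, Var X_0 = 1
X = X0.copy()
for _ in range(n):                               # X_{k+1} = a X_k + sqrt(1-a^2) * noise
    X = a * X + np.sqrt(1 - a**2) * rng.standard_normal(m)

err = X - a**n * X0
print(np.mean(err * X0))                         # ~ 0: the prediction error is orthogonal to the "past"
print(np.mean(err**2), 1 - a**(2 * n))           # empirical and theoretical prediction error
```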
In the case where Λ consists of a single point, the solution to the interpolation problem is provided by the following theorem.
Theorem 9.5. Assume that the spectral density p of a regular sequence {X_n, n ∈ Z} satisfies the condition ∫_{−π}^{π} (1/p(λ)) dλ < +∞. Then the projection π_{L cl (X_n, n≠0)}(X_0) can be found by the formula ∫_{−π}^{π} ϕ(ζ) Z_X(dζ), where
ϕ(ζ) = 1 − (2π/p(ζ)) / ∫_{−π}^{π} (dλ/p(λ)).
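The interpolation weight ϕ of Theorem 9.5 can be evaluated numerically for a concrete density. In the sketch below the density p(ζ) = |2 − e^{−iζ}|²/(2π) and the grid size are arbitrary choices for the illustration; the Fourier coefficients of ϕ are the weights of X_n in the best interpolation of X_0.

```python
import numpy as np

# Numerical version of Theorem 9.5 (illustration only; the density is an assumed example).
M = 200000
zeta = np.linspace(-np.pi, np.pi, M, endpoint=False)
dz = 2 * np.pi / M
p = np.abs(2 - np.exp(-1j * zeta)) ** 2 / (2 * np.pi)   # assumed spectral density

I = np.sum(dz / p)                                      # integral of 1/p over (-pi, pi]
phi = 1 - (2 * np.pi / p) / I                           # interpolation weight function

def c(n):
    # n-th Fourier coefficient of phi = weight of X_n in the best interpolation of X_0
    return np.sum(phi * np.exp(-1j * n * zeta)) * dz / (2 * np.pi)

print(abs(c(0)))                          # ~ 0 by construction
print(c(1).real, c(-1).real, c(2).real)   # nonzero weights of the neighbouring observations
```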
Bibliography [82] Chapter VI; [24], Volume 1, Chapter V §8-9; [79] Chapters XVII and XVIII; [72]; [69] Chapter V; [90] Chapter V; [15] Chapter XII.
Problems 9.1. Consider a random vector (X1 , X2 , X3 ) with zero mean and the covariance matrix , , , 1 1 −1, , , , cov(Xi , X j ) = , , 1 4 −1, . ,−1 −1 10 , (1) Find the best in mean square linear approximation X-3 of a random variable X3 via X1 and X2 . (2) Find E(X3 − X-3 )2 . (3) Find the best in mean square linear approximation of a random variable X3 − 2X1 via X1 + X2 . 9.2. Prove Theorem 9.4. 9.3. Let {Xn , n ∈ Z} be a white noise, and Y-n := π0 (Yn ), n ≥ 1. Express Y-n via {Yk , k ≤ 0}, if {Yk , k ∈ Z} is a stationary solution to the following equations. (a) Yn+1 = 12 Yn + Xn , n ∈ Z. (b) Yn+1 = 2Yn + Xn . (c) Yn+1 = 3Xn+1 + Xn . (d) 2Yn+2 − 3Yn+1 − 2Yn = Xn . (e) Yn+3 − 3Yn+2 − 4Yn+1 + 12Yn = Xn . (f) 4Yn+2 + 4Yn+1 +Yn = Xn + Xn+1 . (g) Yn = 2Xn+2 − 3Xn+1 − 2Xn . (h) Yn+2 + 2Yn+1 + 10Yn = Xn . (i) Yn = 4Xn + 4Xn−1 + Xn−2 . (j) Yn = 4Xn+1 + 4Xn + Xn−1 . (k) Yn+2 + 4Yn+1 +Yn = Xn . (l) Yn = 10Xn + 3Xn−1 − Xn−2 . (m) Yn = 3Xn + 11Xn−1 − 4Xn−2 . (n) Yn = 2Xn+1 + 13Xn − 7Xn−1 . (o) 4Yn+2 − 7Yn+1 − 2Yn = −2Xn + Xn−1 . (p) Yn+1 +Yn = Xn+1 + Xn . (q) Yn+1 + 2Yn = Xn+1 + Xn . (r) 3Yn+1 + 2Yn = Xn . Find E(Yn − Y-n )2 for items (a) – (c). 9.4. Express X-n := π0 (Xn ), n ≥ 1 via {Xk , k ≤ 0} if: (a) RX (n) = a⎧|n| , |a| < 1. ⎪ n = 0, ⎨5, (b) RX (n) = −2, |n| = 1, ⎪ ⎩ 0, |n| > 1. (c) RX (n) = 1, n ∈ Z. (d) RX (n) = eiλ n , λ ∈ (−π , π ], n ∈ Z. (e) RX (n) = 2 + einλ , λ ∈ (−π , π ], n ∈ Z. Is the sequence {Xn , n ∈ Z} regular? Singular?
9.5. Assume that {Xn , n ∈ Z} is a stationary sequence with FX (d ζ ) = δ{1} + 1Iζ ∈[0,π ] d ζ , where δ{1} is a unity mass concentrated at point 1. Find the regular and singular components of the sequence {Xn , n ∈ Z}. 9.6. Assume that FX (d ζ ) = δ{0} + (1/2π )|2 − e−iζ |2 d ζ . (1) Prove that X0 + · · · + X−n+1 L2 (P) −→ ZX ({0}) as n → ∞. n (2) Find the covariance function for the sequence Xn − ZX ({0}), n ∈ Z. (3) Describe the decomposition Xn = Xns + Xnr . 9.7. Assume that FX (d ζ ) = δ{1} + (1/2π )|3 − eiζ |2 d ζ . (1) Find a sequence of numbers {cn,k , k = 0, n, n ∈ N}, such that n
2 ZX ({1}) ∑ cn,k X−k −→
L
as n → ∞.
k=0
(2) Specify the decomposition Xn = Xns + Xnr . 9.8. Let {Xn , n ∈ Z} be a wide-sense stationary random sequence with spectral function FX . Denote by L2I the closure of a linear hull of {einζ , n ∈ I} in L2 ((−π , π ], FX ) and HI the closure of a linear hull of {Xn , n ∈ I} in L2 (Ω , F, P), where I is a subset of Z. (1) Prove that η ∈ HI if, and only if, there exists a function f ∈ L2I such that
η=
π
−π
f (ζ )ZX (d ζ ).
(2) Prove that η = πHI (X) (projection of X on HI ) if and only if η ∈ HI and E(η − X)X n = 0, n ∈ I. 9.9. Assume that polynomials P(z) = ∑rk=0 ak zk and Q(z) = ∑mj=0 b j z j do not have roots on the unit circumference. Prove that the equation r
m
k=0
j=0
∑ akYn−k = ∑ b j Xn− j ,
n∈Z
has a stationary solution, where {Xn , n ∈ Z} is a white noise. Prove that expressed as the series ∑∞ (a) Y-n := π0 (Yn ) can be( k=0 ckY−k convergent in the mean square with lim supk→∞ k |ck | < 1. (b) If Q(z) ≡ 1 then this series has only a finite number of nonvoid terms. 9.10. Prove Theorem 9.5 in the case where c1 ≤ p(ζ ) ≤ c2 , ζ ∈ (−π , π ], for some positive numbers c1 , c2 . 9.11. Let a wide-sense stationary random sequence {Yn , n ∈ Z} satisfy the relations: (a) Yn+1 = 2Yn + Xn , (b) Yn = 2Xn + Xn−1 , n ∈ Z, where {Xn , n ∈ Z} is a white noise. Find πH (Y0 ) with H = L cl (Yn , n = 0).
9.12. Suppose that the assumptions of Problem 9.10 are satisfied. Construct an algo/ I), where I ⊂ Z is a finite set, rithm for πH (X0 ) calculation with H = L cl (Xk , k ∈ containing 0. 9.13. Find πH (Y0 ) with H = L cl (Yn , n ∈ / {0; 1}) for the sequence {Yn , n ∈ Z} from Problem 9.11 (a). 9.14. Let {Xn , n ∈ Z} be a nondegenerate stationary sequence, and its spectral density be of the form pX (ζ ) =
m
∑
ck eiζ k , ζ ∈ (−π , π ],
k=−m
where m ∈ N. Prove that: (a) {Xn , n ∈ Z} is a regular sequence. (b) Xn = ∑m k=0 ak εn−k , where {εn , n ∈ Z} is a white noise. 9.15. Let {Xn , n ∈ Z} be a nondegenerate stationary sequence. Prove the following. If RX (n) = 0 as |n| ≥ m, then the sequence {Xn , n ∈ Z} is regular. Is this statement correct if |RX (n)| ≤ ca−|n| , n ∈ Z, where a ∈ (0, 1)? 9.16. Can a sum of orthogonal regular and singular sequences be: (a) regular; (b) singular; (c) neither a regular nor singular sequence? 9.17. Can a sum of orthogonal singular sequences be a regular sequence? 9.18. Is it possible that there exist different sequences {Xn1 }, {Xn2 }, {Yn1 }, and {Yn2 } such that Xn1 + Xn2 = Yn1 + Yn2 and the sequences {Xn1 }, {Yn1 } are regular, and the sequences {Xn2 }, {Yn2 } are singular? 9.19. Assume that {Xn , n ∈ Z} is a white noise, and Y is a random variable, with EY = 0, E|Y |2 < ∞, EY X¯n = 0, n ∈ Z. (1) Find the decomposition Zn = Zns + Znr for the sequence Zn := Xn +Y. (2) May another decomposition as a sum of regular and singular sequences exist for the sequence {Zn , n ∈ Z}? 9.20. Let {Xn , n ∈ Z} be a sequence of square integrable random variables (not necessarily stationary). Assume that EXm X¯k = 0, k ≤ 0 for some fixed m ∈ N. Is it always the case that the projection in L2 (Ω , F, P) of a random variable Xm on L cl (Xn , n < m) coincides with the projection on L (Xn , n = 1, . . . , m − 1)? 9.21. Prove Theorem 9.2.
Hints 9.2. Check that a function (zn Φn (z))/(Φ (z)) is analytic inside the circle |z| ≤ 1 and, therefore, it can be expanded into the uniformly convergent power series
k ∑∞ k=0 αk z , |z| ≤ 1. Conclude that the right-hand side of (9.1) belongs to the space H0 (X). Verify that
cov ξk , ξn −
Φn (e−iζ ) ZX (d ζ ) = 0, k ≤ 0. −i ζ Φ (e ) −π 9.3. Obtain the following representation for the spectral measure of the solution: π
einζ
P(e−iζ ) ZX (d ζ ), Q(e−iζ ) where P, Q are polynomials. Assume that Q(z) = 0 as |z| = 1. The spectral density is 2 1 P(e−iζ ) pY (ζ ) = . 2π Q(e−iζ ) ZY (d ζ ) =
Write the function pY as |Φ (e−iζ )|2 , where Φ (z) = 0 as |z| ≤ 1 and Φ is analytic inside a circle |z| ≤ 1. In order to do this observe that, if P(z) ∏n (z − αk ) = γ k=1 , γ , αk , βk ∈ C, Q(z) ∏mj=1 (z − βk ) then
where
P(z) P(z) Q(z) = , |z| = 1, Q(z) = P(z) = Q(z)
∏ (z − αk ) ∏ (1 − α k z),
k∈Λ1
k∈ / Λ1
∏ (z − β j ) ∏ (1 − β j z).
j∈Λ2
j∈ / Λ2
Here Λ1 is the index set for which |αk | > 1, and Λ2 is the index set for which |β j | > 1. Consider 1 P(z) Φ (z) = √ 2π Q(z) and use Theorem 9.4. 9.10. Check that the following two statements are true. (1) For any n = 0 it holds (Y − X0 , Xn ) = 0, where Y = −ππ ϕ (ζ )ZX (d ζ ). (2) The function ϕ can be presented as the convergent in the L2 ((−π , π ], FX ) series ϕ (ζ ) = ∑n=0 cn einζ . To prove this use the equivalence of the norms of the spaces L2 ((−π , π ], FX ) and L2 ((−π ,π ], λ 1 ). 9.12. Search the answer in the form −ππ ϕ (ζ )ZX (d ζ ), where
ϕ (ζ ) = 1 + (p(ζ ))−1 ∑ αk eikζ , k∈I
with
π −π
ϕ (ζ )e−inζ d ζ = 0, n ∈ I.
9.15. See the solution to Problem 9.14 (a).
9 Prediction and interpolation
135
9.21. Because Hk (X) = Hk (ε ) then ∑∞ k=n ak εn−k ∈ H0 (X). a ε ⊥ H ( ε ) = H Check that ∑n−1 0 0 (X). k n−k k=0
Answers and Solutions 9.1. (1) If X-3 = α X1 + β X2 then (X3 − X-3 , X1 ) = 0, (X3 − X-3 , X2 ) = 0. We obtain the system of linear equations −1 − α − β = 0, −1 − α − 4β = 0. Thus, α = −1, β = 0, X-3 = −X1 . The Gramm–Schmidt orthogonalization algorithm can also be used for obtaining X-3 . (2) E(X3 − X-3 )2 = E(X3 + X1 )2 = 10 − 2 + 1 = 9. (3) The covariance matrix of the vector (Y1 ,Y2 ) := (X1 + X2 , X2 + 2X3 ) equals , , , 7 −6, , ,. RY = , −6 16 , If Y-2 = α Y1 then (Y2 − Y-2 ,Y1 ) = 0; that is, −6 − 7α = 0. So, Y-2 = −(6/7)Y1 . 9.4. (a) The spectral density is equal(to p(ζ ) = (1 − |a|2 )/(2π |1 − ae−iζ |2 ). It follows from Theorem 9.4 that Φ (z) = (1 − |a|2 )/(2π )(1 − ae−iζ )−1 and X-n = an X0 . The sequence is regular. 0, n ≥ 2, 1 −i ζ 2 (b) p(ζ ) = 2π |2 − e | , Xn = ∞ X−k n = 1. ∑k=0 2k , The sequence is regular. (c) X-n = Xn = X0 . The sequence is singular. (d) X-n = Xn = X0 einλ . The sequence is singular. (e) The sequence Xn is of the type Xn = 2ZX ({0}) + einλ ZX ({λ }). Express ZX ({0}), ZX ({λ }) via X0 and X−1 , and find the representation of Xn in terms of X0 , X−1 . The sequence {Xn , n ∈ Z} is singular. 9.5. Because the regular and singular components are mutually orthogonal, their structural measures are also orthogonal (see Problem 8.40). That is why the measure FX has to be equal to the sum of structural measures FX r + FX s . Theorem 9.3 implies that FX r = 0; that is, {Xn , n ∈ Z} is the singular sequence. 9.6. (2) 1/(2π )|2 − e−iζ |2 d ζ . (3) Xns = ZX ({0}), Xnr = Xn − ZX ({0}). 9.7. (1) (1/n)∑nk=0 eik X−k → ZX ({1}) as n → ∞. (2) Xns = ZX ({1})ein . 9.9. (a) See Problem 9.3 hint. (b) If Q(z) = 1 then the function Φ (z) has a form (see Problem 9.3 hint) α , Φ (z) = r ∏k=1 (βk − z) where α , βk ∈ C with |βk | > 1. Let us observe that
136
9 Prediction and interpolation
Φn (z) Φn (z) − Φ (z) = 1+ . Φ (z) Φ (z) Because Φn (z)− Φ (z) and 1/Φ (z) are polynomials, a function Φn (z)/Φ (z) is a polynomial as well. Therefore, Theorem 9.4 implies that Y-n = can be expressed as a sum 9.11. (a) 4(Y−1 +Y1 ).
π
−π
einζ
∑2r k=0 ckY−k ,
Φn (e−iζ ) ZY (d ζ ) Φ (e−iζ )
where ck are complex numbers.
∞
(b) ∑ (−1/2)n (Yn −Y−n ). n=1
9.13. The desired random variable Y can be expressed as (see Problem 9.12 hint): −π ϕ (ζ )ZX (d ζ ), where
π
α0 α1 eiζ + p(ζ ) p(ζ ) and coefficients α0 , α1 satisfy linear equations: ⎧ ⎨2π + α0 π d ζ + α1 π eiζ d ζ = 0, −π p(ζ ) −π p(ζ ) ⎩α0 π eiζ d ζ + α1 π d ζ = 0. −π −π ϕ (ζ ) = 1 +
p(ζ )
p(ζ )
Answer: −(20/9)Y−1 + (16/9)Y2 . 9.14. (a) Because the spectral density pX (ζ ) is a nondegenerate analytic function in a neighborhood of the interval [−π , π ], then it can have only a finite number of zeros on [−π , π ], and every zero root has finite multiplicity. Use Theorem 9.3. (b) It follows from the assumptions that RX (k) = 0 as |k| > m. Thus, Xn ⊥ Hn−k (X) as k > m. Let Xn = ∑∞ k=0 ak εn−k , where {εk , k ∈ Z} is the white noise from the Wald decomposition, and ∑k |ak |2 < ∞. Because Hn−k (X) = Hn−k (ε ), we have Xn ⊥ εn−k as k > m. Therefore, ak = EXn ε n−k = 0, k > m, which was to be demonstrated. 9.16. (a) Yes. Assume that the spectral density of the sequence {Xn } equals 1Ix∈(−π ,π ] , and the spectral density of the sequence {Yn } equals 1Ix∈[0,π ] . Therefore (see Theorem 9.3) the sequence {Xn } is regular, and {Yn } is singular. The spectral density {Xn +Yn } is equal to 1Ix∈(−π ,π ] + 1Ix∈[0,π ] , and Theorem 9.3 implies that this sequence is regular. (b) No. (c) Yes. 9.17. Yes. For instance, if {Xn } and {Yn } are independent with spectral densities 1Ix∈[0,π ] and 1Ix∈[−π ,0] , respectively. See the solution of Problem 9.19(2). 9.18. Yes. 9.19. (1) Znr = Xn , Zns = Y. (2) Yes. Let Xn = Xn1 +Xn2 , where the sequences {Xn1 } and {Xn2 } are stationary and orthogonal, FX 1 (d ζ ) = (1/4π )1Iζ ∈[0,π ] d ζ , and FX 2 (d ζ ) = (1/2π )d ζ − FX 1 (d ζ ). Thus, {Xn2 } is regular, and {Xn1 +Y } is singular. 9.20. No. Consider m = 2 and the next sequence: X2 = Y2 , X1 = Y1 +Y2 , Xk = Yk + Y1 , k ≤ 0, where {Yn , n ∈ Z} is a white noise.
10 Markov chains: Discrete and continuous time
Theoretical grounds
Let the phase space X of a random sequence {X_n, n ∈ Z_+} be enumerable. The sequence {X_n, n ∈ Z_+} is called a Markov chain if ∀n ∈ N, ∀i_1, ..., i_n, i_{n+1} ∈ X, ∀t_1 ≤ ··· ≤ t_n ≤ t_{n+1} ∈ Z_+:
P(X_{t_{n+1}} = i_{n+1} / X_{t_1} = i_1, ..., X_{t_n} = i_n) = P(X_{t_{n+1}} = i_{n+1} / X_{t_n} = i_n).
The system P(X_{n+1} = j / X_n = i), i, j ∈ X, n ∈ Z_+, is called the system of transition probabilities. If these conditional probabilities are independent of n, the Markov chain is said to be homogeneous. Further on we consider only homogeneous Markov chains.
The matrix P = (p_{ij}), p_{ij} := P(X_1 = j / X_0 = i) (respectively, P_n = (p_{ij}^{(n)}), p_{ij}^{(n)} := P(X_n = j / X_0 = i)) is called the transition matrix (the n-step transition matrix). Transition probabilities satisfy the Kolmogorov–Chapman equations:
∀i, j ∈ X, ∀n, m ∈ Z_+: p_{ij}^{(n+m)} = ∑_{k∈X} p_{ik}^{(n)} p_{kj}^{(m)}.
The Kolmogorov–Chapman equations can be reformulated as follows.
Proposition 10.1. The n-step transition matrix is equal to the matrix P raised to the nth power.
Definition 10.1. A state j is said to be accessible from a state i (i → j) if there is positive probability that in a finite number of steps the Markov chain moves from i to j; that is, p_{ij}^{(n)} > 0 for some n ≥ 0. A state i is said to communicate with a state j (i ↔ j) if i → j and j → i. A state i is inessential if there exists a state j such that i → j but j ↛ i. Otherwise, a state i is said to be essential.
Let τ_i = inf{n ≥ 1 | X_n = i} be the moment of the first visit to i. If P(τ_i < ∞ / X_0 = i) > 0, then the period d(i) of the state i is the greatest common divisor of the numbers n such that p_{ii}^{(n)} > 0. If d(i) = 1, then the state i is called aperiodic.
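Proposition 10.1 makes n-step probabilities easy to compute: the n-step matrix is a matrix power. A small sketch (using the 2×2 matrix of Problem 10.7 purely as an example):

```python
import numpy as np

# n-step transition probabilities via Proposition 10.1: P_n = P^n.
P = np.array([[0.2, 0.8],
              [0.4, 0.6]])          # the transition matrix of Problem 10.7
P2 = np.linalg.matrix_power(P, 2)   # two-step transition matrix
P5 = np.linalg.matrix_power(P, 5)
print(P2)                            # e.g. p_{11}^{(2)} = 0.2*0.2 + 0.8*0.4 = 0.36
print(P5.sum(axis=1))                # each row still sums to 1
```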
Denote by f_{ij}^{(n)} = P(τ_j = n / X_0 = i) the probability that the chain makes its first visit to the state j on the nth step given that it started from the state i. A state i is said to be recurrent if P(τ_i < ∞ / X_0 = i) = 1 or, equivalently, if ∑_{n=1}^{∞} f_{ii}^{(n)} = 1. If a state is not recurrent, then it is said to be transient.
Theorem 10.1. (Recurrence criterion) A state i is recurrent if and only if ∑_n p_{ii}^{(n)} = +∞.
An important class of Markov chains is random walks on the lattice Z^d, that is, random sequences of the type X_n = x + ε_1 + ··· + ε_n, where {ε_k, k ≥ 1} are i.i.d. random variables with values in Z^d. The following criterion provides the necessary and sufficient recurrence condition for a random walk.
Theorem 10.2. Assume that all states in Z^d of a random walk communicate. Let ϕ(u) = E{exp i(u, ε)}, u ∈ R^d, be the characteristic function of the jump of the random walk {X_n, n ≥ 0}. The sequence {X_n, n ≥ 0} is recurrent if, and only if,
∫_{(−π,π)^d} Re (1 − ϕ(u))^{−1} du = ∞.
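Theorem 10.1 can be illustrated for the simple symmetric walk on Z, where p_{00}^{(2n)} = C(2n, n) 2^{−2n} ∼ 1/√(πn). The sketch below (the cutoffs are arbitrary) shows that the partial sums of the return probabilities grow without bound, so the walk is recurrent.

```python
from math import pi, sqrt

# Theorem 10.1 for the simple symmetric walk on Z: p_00^{(2n)} = C(2n,n)/4^n ~ 1/sqrt(pi*n),
# so the series of return probabilities diverges and the state 0 is recurrent.
p, s = 1.0, 0.0                      # p holds p_00^{(2n)}, starting from n = 0
for n in range(1, 10**6 + 1):
    p *= (2 * n - 1) / (2 * n)       # C(2n,n)/4^n = C(2n-2,n-1)/4^{n-1} * (2n-1)/(2n)
    s += p
    if n in (10**2, 10**4, 10**6):
        print(n, round(s, 2), round(2 * sqrt(n / pi), 2))  # partial sum grows like 2*sqrt(n/pi)
```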
A system of nonnegative numbers π_i, i ∈ X, is called a distribution if ∑_i π_i = 1.
Definition 10.2. A distribution π_i = P(X_0 = i), i ∈ X, is said to be a stationary (or invariant) distribution of a Markov chain if π_i = P(X_n = i) for all n ∈ N, i ∈ X.
Theorem 10.3. (Ergodic theorem) Assume that all states of a Markov chain communicate and have period 1. Then for all i, j ∈ X
p_{ij}^{(n)} → 1/μ_j, n → ∞,
where μ_j = E(τ_j / X_0 = j) ∈ (0, ∞] is the average recurrence time of the state j.
The ergodic theorem can be proved by renewal theory methods (see Chapter 11, Problem 11.29).
Theorem 10.4. Suppose that the assumptions of Theorem 10.3 are satisfied. Denote π_j = 1/μ_j. Then there are two possibilities: either all π_j = 0 and the stationary distribution does not exist, or (π_1, π_2, ...) is the only stationary distribution.
Theorem 10.5. (Strong Markov property) Let τ be a stopping time. Then ∀m ≥ 1, ∀n_1 ≤ ··· ≤ n_m ∈ N, ∀j_1, ..., j_m ∈ X:
P(X_{τ+n_1} = j_1, ..., X_{τ+n_m} = j_m / F_τ) = p_{X_τ j_1}^{(n_1)} p_{j_1 j_2}^{(n_2−n_1)} ··· p_{j_{m−1} j_m}^{(n_m−n_{m−1})}.   (10.1)
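Theorems 10.3 and 10.4 can be checked numerically for a concrete finite chain; the sketch below uses the matrix of Problem 10.8 purely as an example and computes the stationary distribution, the limit of the rows of P^n, and the mean recurrence times 1/π_j.

```python
import numpy as np

# Stationary distribution and convergence of P^n for the chain of Problem 10.8.
P = np.array([[0.1, 0.2, 0.7],
              [0.3, 0.4, 0.3],
              [0.5, 0.4, 0.1]])

# Stationary distribution: left eigenvector of P for eigenvalue 1, normalized to sum 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmin(np.abs(w - 1))])
pi = pi / pi.sum()

print(pi)                                    # invariant distribution (pi P = pi)
print(np.linalg.matrix_power(P, 50)[0])      # a row of P^50: practically equal to pi
print(1 / pi)                                # mean recurrence times mu_j = 1 / pi_j (Theorem 10.3)
```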
Continuous-time Markov chains Assume that a phase space X of a stochastic process {X(t),t ≥ 0} is an enumerable set. The process {X(t),t ≥ 0} is called a continuous-time Markov chain if
∀n ∈ N, ∀i_1, ..., i_n, i_{n+1} ∈ X, ∀t_1 ≤ ··· ≤ t_n ≤ t_{n+1} ∈ R_+:
P(X(t_{n+1}) = i_{n+1} / X(t_1) = i_1, ..., X(t_n) = i_n) = P(X(t_{n+1}) = i_{n+1} / X(t_n) = i_n).
Further on we consider only continuous-time Markov chains that are time-homogeneous and stochastically continuous. That is, the transition probability p_{ij}(t) = P(X(t+s) = j / X(s) = i) does not depend on s and is continuous in t.
Theorem 10.6. For any i, j there exist the limits
α_{ij} := lim_{t→0+} p_{ij}(t)/t for i ≠ j, and α_{ii} := lim_{t→0+} (p_{ii}(t) − 1)/t.
The limits α_{ij} are finite for i ≠ j, and α_{ii} can take the value −∞. The limits α_{ij}, i ≠ j, are called the transition intensities from the state i to the state j, and the matrix A = (α_{ij}) is called the generator of {X(t), t ≥ 0}.
Proposition 10.2. If α_{ii} ≠ −∞ for all i, then {X(t), t ≥ 0} has a right-continuous modification.
We assume that the right-continuous modification has already been chosen for such processes.
Definition 10.3. A state i is regular if α_{ii} > −∞ and ∑_j α_{ij} = 0.
Theorem 10.7. (First, or backward, Kolmogorov system of equations) Assume that i is a regular state. Then the transition probabilities p_{ij}(t) are differentiable in t and satisfy the system of differential equations
p′_{ij}(t) = ∑_k α_{ik} p_{kj}(t), t ≥ 0.   (10.2)
In addition, the transition probabilities satisfy the relations
p_{ij}(0) = 1I_{i=j}, p_{ij}(t) ≥ 0,   (10.3)
∑_j p_{ij}(t) = 1, t > 0.   (10.4)
Definition 10.4. A continuous-time Markov chain {X(t), t ≥ 0} is said to be regular if
(a) X is a right-continuous process;
(b) for any initial distribution, the probability that an infinite number of jumps occurs in finite time is equal to zero.
Theorem 10.8. A continuous-time Markov chain is regular if and only if one of the following conditions holds true.
(1) There exists a unique solution to the system of equations (10.2) with initial condition (10.3) satisfying (10.4).
(2) For any λ > 0 the system of equations λ g_i = ∑_j α_{ij} g_j, i ∈ X, does not have any bounded solution {g_i} except the trivial zero one.
Theorem 10.9. (Second, or forward, Kolmogorov system of equations) If X is a regular continuous-time Markov chain, then the transition probabilities satisfy the system of differential equations p′_{ij}(t) = ∑_k p_{ik}(t) α_{kj}, t ≥ 0, p_{ij}(0) = 1I_{i=j}.
Theorem 10.10. (Strong Markov property) Let X be a regular continuous-time Markov chain. Then for any stopping time τ: ∀m ≥ 1, ∀t_1, ..., t_m with 0 < t_1 < ··· < t_m, ∀j_1, ..., j_m ∈ X:
P(X(τ + t_1) = j_1, ..., X(τ + t_m) = j_m / F_τ) = p_{X(τ) j_1}(t_1) p_{j_1 j_2}(t_2 − t_1) ··· p_{j_{m−1} j_m}(t_m − t_{m−1}).   (10.5)
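For a finite regular chain both Kolmogorov systems are solved by the matrix exponential P(t) = e^{tA}. The sketch below (using the generator of Problem 10.77 only as an example) computes P(1) numerically.

```python
import numpy as np
from scipy.linalg import expm

# Transition matrices of a finite continuous-time chain: P(t) = exp(tA) solves both
# Kolmogorov systems with P(0) = I.  A is the generator appearing in Problem 10.77.
A = np.array([[-4.0,  1.0,  1.0,  2.0],
              [ 1.0, -3.0,  2.0,  0.0],
              [ 0.0,  5.0, -6.0,  1.0],
              [ 1.0,  1.0,  1.0, -3.0]])

P1 = expm(1.0 * A)                  # P(1)
print(P1)
print(P1.sum(axis=1))               # each row sums to 1, relation (10.4)
print(np.allclose(P1 @ A, A @ P1))  # P(t) commutes with A, consistent with (10.2) and Theorem 10.9
```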
Bibliography [82] Chapter I §12, Chapter VIII; [69] Chapter III §2,3; [24], Volume 1, Chapter II §4–7; [79] Chapters XXI, XXIII; [22] Volume 1, Chapters 14–17; [9] Chapters 7–9; [15] Chapters 5,6; [89] Chapters 3-5; [80]; [12].
Problems 10.1. Prove that if i → j, j → k, then i → k. 10.2. Prove that a “communicate” relation is an equivalence relation. That is, it is reflective, symmetric, and transitive. 10.3. Prove that if a state i is recurrent then it is essential. Is the inverse statement true when the phase space is (a) finite; (b) countable? 10.4. Assume that i ↔ j. Prove that the state i is recurrent if and only if the state j is recurrent. 10.5. Prove that a state i is recurrent if and only if P(Xn = i infinitely often/X0 = i) = 1. 10.6. Suppose that i → j. Is it correct that: (a) d(i) ≥ d( j); (b) d(i) ≤ d( j)? 10.7. The transition matrix of a Markov chain {Xn , n ≥ 0} is of the following type , , ,0.2 0.8, , P=, ,0.4 0.6, . The initial distribution is: P(X0 = 1) = 0, 3; P(X0 = 2) = 0, 7. (a) Find the distribution of X1 . (b) Find the probabilities P(X0 = 1, X1 = 2); P(X0 = 1, X1 = 2, X2 = 2); P(X1 = 2, X2 = 2/X0 = 1); P(X1 = 1, X2 = 2); P(X2 = 1); P(X2 = 2/X0 = 1); P(X1 = 1, X3 = 2); P(X1 = 1, X2 = 1, X4 = 2, X6 = 1, X5 = 1/X0 = 1).
10.8. The transition matrix of the a Markov chain {Xn , n ≥ 0} is , , ,0.1 0.2 0.7, , , , P=, ,0.3 0.4 0.3, . ,0.5 0.4 0.1, The initial distribution is: P(X0 = 1) = 0.6;
P(X0 = 2) = 0.3;
P(X0 = 3) = 0.1.
(a) Find the distribution of X1 . (b) Find the probabilities P(X2 = 1); P(X1 = 1, X2 = 2, X3 = 3/X0 = 1); P(X1 = 1, X3 = 1, X4 = 3/X0 = 2); P(X1 = 1, X3 = 2, X4 = 1, X6 = 1, X8 = 1/X0 = 1); P(X0 = 1, X1 = 2); P(X1 = 1, X2 = 2); P(X4 = 1, X3 = 2, X1 = 1, X2 = 4, X5 = 1/X0 = 2). (c) Find the distribution of the random variable τ1 = inf{n ≥ 0| Xn = 1}. 10.9. Let {Xn , n ≥ 0} be a Markov chain with a finite phase space, and A be a set of its essential states. Let τ be a time of the first visit of the set A by the Markov chain. Prove that (a) There exist M > 0 and c ∈ [0, 1) such that P(τ > n) ≤ Mcn . (b) Eτ < ∞. Are these statements correct for a Markov chain with countable phase space? 10.10. Let a transition matrix be , , ,0.1 0.2 0.3 0.2 0.1 0.1, , , ,0.3 0.1 0.1 0.2 0.1 0.2, , , , 0 0 0.4 0.2 0.3 0.1, , P=, , 0 0 0.5 0.1 0.1 0.3, . , , , 0 0 0 0 0.6 0.4, , , , 0 0 0 0 0.2 0.8, (a) Classify the states. (b) Find the expectation of the time when the chain reaches some essential state given X0 = 1. (c) Let τ = inf{n ≥ 1| Xn = 3}. Find the probability P(τ < ∞/ X0 = 1). 10.11. John and Peter play a ”pennies matching” game till one is ruined. At the beginning John has $3 and Peter has $1. One bet costs $1 per game. The coin is symmetric. (a) Find the probability that John loses all his money. (b) Find the expectation of the number of games played till the one’s ruin. (c) How it affects the answer if the bet cost changes to 50 cents? 10.12. Solve the previous problem if at the beginning of the game John has $n, Peter has $m, and probability that John wins in each game is equal to p ∈ (0; 1). 10.13. Assume that a sequence {pn , n ≥ 1} of nonnegative numbers satisfies the relation p1 + p2 + · · · = 1. Describe the dynamics of the Markov chain and classify the states if the transition matrix is equal to
142
10 Markov chains: Discrete and continuous time
, , , , , p1 p2 p3 . . ., , p1 p2 p3 . . . , , , , , , 0 1 0 0 . . ., , 1 0 0 ... , , , , , 0 0 1 0 . . ., (a) , 1 0 0 ,; (b) , ,; , , , , , 0 0 0 1 . . . , .. .. , .. , , , ,. . . , . . . . . . . . . . . . ., , , , , , p1 p2 p3 . . ., , ,(1 − p1 ) p1 0 0 . . ., , , , , 1 0 0 . . ., , ,(1 − p2 ) 0 p2 0 . . ., , ,; , , , (c) , 0 1 0 . . .,; (d) , , ,(1 − p3 ) 0 0 p3 . . ., , 0 0 1 . . ., , ,. . . . . . . . . . . . . . . . . . ., , ,. . . . . . . . . . ., , , ,(1 − p1 ) p1 0 0 0 . . ., , , , 0 p2 0 0 . . ., (1 − p2 ) ,. (e) , , 0 0 (1 − p3 ) p3 0 . . ., , , , . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ., (n)
Calculate f11 . Find the stationary distribution if it exists. 10.14. Consider transition matrices from the previous problem. Assume additionally (n) (5) that p1 = 0 in item (a). Find p1k in items (a), (b) and p34 in items (c)–(e). 10.15. Let {Xn , n ≥ 0} be a Markov chain, and τ = inf{n ≥ 0| Xn ∈ A} be a time of the first visit of a set A by the chain. Denote ri = P(τ < ∞/X0 = i), ρi = E(τ /X0 = i) ∈ (0, ∞]. Prove that: (a) Probabilities {ri } satisfy the system of linear equations ⎧ ⎪ i∈ / A, / pi j r j , ⎨ri = ∑ j∈A i ∈ A, ri = 1, ⎪ ⎩ i A and i ∈ / A. ri = 0, (b) Expectations {ρi } satisfy the following system of equations ρk = 1 + ∑ j pk j ρ j , k∈ / A, ρk = 0, k ∈ A. (c) If the phase space is finite then the expectation ρi is finite if and only if for any state j for which i → j there exists a state k ∈ A accessible from j ( j → k). 10.16. Find limn→∞ P(Xn = i) in Problems 10.8 and 10.10 if X0 = 1. 10.17. The lifetime of a detail is equal to 1 day with probability p ∈ (0, 1) and 2 days with probability q = 1 − p. At the moment of failure the detail is replaced immediately. Denote by Xn ∈ {1; 2} a total lifetime of the detail that is processing at the nth day. Let pn = P(Xn = 1). (a) Find p, if limn→∞ pn = 12 . (b) Is {Xn , n ≥ 0} a Markov chain? 10.18. Let {ξn , n ≥ 1} be i.i.d. random variables, and f : X × R → X be a measurable function. Suppose that a random variable X0 does not depend on {ξn , n ≥ 1}. Prove that a sequence Xn = f (Xn−1 , ξn ), n ≥ 1 is a Markov chain. Find a function f for a Markov chain from Problem 10.8 if {ξn , n ≥ 1} has a uniform distribution on [0, 1].
10 Markov chains: Discrete and continuous time
143
10.19. Let {Xn ,Yn , n ≥ 1} be Markov chains with values in Z and for any x ∈ Z, ∀i, j : P(Xn+1 − Xn ≥ x/Xn = i) ≥ P(Yn+1 −Yn ≥ x/Yn = j), P(X0 ≥ x) ≥ P(Y0 ≥ x). Prove that for all n ∈ N, x ∈ Z P(Xn ≥ x) ≥ P(Yn ≥ x). 10.20. Let {Xn , n ∈ Z} be a homogeneous Markov chain. Prove that {X−n , n ∈ Z} is also a Markov chain. Is it always homogeneous? Find the transition matrix of the chain {X−n , n ∈ Z} if the chain {Xn , n ∈ Z} is stationary and has a transition matrix P = pi j . 10.21. Let {Xn , n ≥ 0} be a Markov chain. Is it necessary that for any moments of time k < m < n and for any A, B,C ⊂ X the following equality holds true, P(Xn ∈ A| Xm ∈ B, Xk ∈ C) = P(Xn ∈ A| Xm ∈ B)? 10.22. Let {εn , n ≥ 0} be a sequence of independent Bernoulli random variables, P(εn = 1) = p, P(εn = −1) = 1 − p. For which p is a sequence Xn := εn+1 · εn , n ≥ 0 a Markov chain? 10.23. Give an example of a Markov chain {Xn } and a subset A ⊂ X for which the sequence {Yn = 1IXn ∈A } (a) Is a Markov chain (b) Is not a Markov chain. 10.24. Can all states of a Markov chain be inessential if the phase space is (a) finite; (b) infinite? 10.25. Is it possible that all states of a Markov chain are transient if the phase space is (a) finite; (b) infinite? 10.26. Let {εn , n ≥ 1} be a sequence of i.i.d. random variables, P(εn = 0) = P(εn = 1) = 12 . A sequence Xn ∈ X ≡ {0, 1}m , n ≥ 1 is built in a recurrent way as follows: X0 = (0, . . . , 0), kth coordinate of Xn is equal to the (k − 1)th coordinate of Xn−1 (k > 1) and the first coordinate Xn is equal to εn . Is the sequence {Xn } a Markov chain? If so, indicate the essential and inessential states, number of essential state classes, and period of every class. 10.27. Let {Xn , n ≥ 0} be i.i.d. random variables, P(Xn = 1) = 13 , P(Xn = 0) = 23 . Define Yn as a number of units among numbers Xn , Xn+1 , Xn+2 . Is the sequence {Yn } a Markov chain? If so, indicate essential and unessential states, number of essential state classes, and period of every class. 10.28. Let {Xn } be a random walk in Z, P(Xn+1 = i + 1|Xn = i) = p, P(Xn+1 = i − 1|Xn = i) = q = 1 − p, i ∈ Z, where p ∈ (0, 1). Find the n-step transition matrix. For which p is the walk recurrent?
144
10 Markov chains: Discrete and continuous time
10.29. A particle jumps 1 step up or down with probability 12 at every even moment of time independently of other steps, and 1 step to the right or to the left with probability 1 2 at every odd moment of time. Let Xn be a location of the particle at moment n. Find P(Xn = (k, l)/X0 = (i, j)). 10.30. Let {Yn } be a random walk in Z2 , P(Yn+1 − Yn = e) = 14 , where e ∈ {(0, 1), (1, 0), (−1, 0), (0, −1)}. Find the transition probabilities P(Yn = (k, l)/Y0 = (i, j)). Is this random walk recurrent? 10.31. Let {Xn } be a symmetric random walk on Z, P(Xn+1 = i±1|Xn = i) = 12 ; {Yn } be a symmetric random walk on Z+ with reflection at zero, P(Yn+1 = i ± 1|Yn = i) = 12 , i ∈ N, P(Yn+1 = 1|Yn = 0) = 1. Prove that the distributions of the sequences {|Xn |} and {Yn } coincide assuming that the distributions of random variables |X0 | and Y0 coincide. 10.32. Let {Xn } be the symmetric random walk from Problem 10.31, X0 = j ∈ N. (a) Prove that the probability of the event Xm = −k, k ∈ N is equal to the probability of the event that on time-interval 0, . . . , m the sequence {Xn } visits zero and Xm = k. That is, P(Xm = −k|X0 = j) = P({∃ l ∈ 0, m − 1 : Xl = 0} ∩ {Xm = k}|X0 = j). (b) Find the probability that Xm = k, k ∈ N and the sequence {Xn } has never been to zero till the moment m. (c) Find the probability that the sequence {Xn } visits zero the first time at the moment m. (d) Find P(X1 ≥ 0, . . . , Xm−1 ≥ 0, Xm = k|X0 = j), j, k ∈ Z+ . (e) Find P(X1 > 0, . . . , Xm−1 > 0, Xm = k|X0 = 0, Xm = k), k ∈ N. 10.33. Let {Yn } be a symmetric random walk in Z+ with capture in zero. That is, (n) P(Yn+1 = i ± 1|Yn = i) = 12 , i ∈ N and P(Yn+1 = 0|Yn = 0) = 1. Find pi j . 10.34. Let {Yn } be a symmetric random walk in Z+ reflecting at zero. That is, 1 P(Yn+1 = i ± 1|Yn = i) = , i ≥ 1, P(Yn+1 = 1|Yn = 0) = 1. 2 (n)
Find pi j . (n)
10.35. Find pi j for a symmetric random walk {Yn } on Z with the elastic barrier at 0: P(Yn+1 = i ± 1|Yn = i) = 12 , i = 0, P(Yn+1 = 1|Yn = 0) = p ∈ (0, 1), P(Yn+1 = −1|Yn = 0) = q = 1 − p. 10.36. Let {Xn , n ≥ 0} be a nonsymmetric random walk on Z+ reflecting at 0. That is, P(Xn+1 = i + 1/ Xn = i) = p and P(Xn+1 = max(i − 1, 0)| Xn = i) = (1 − p), i ∈ Z+ . (n) (a) Find f00 . (b) Give the values of p for which the chain X is recurrent. (c) Find the probability for {Xn } to visit zero assuming X0 = m, m ∈ N.
10 Markov chains: Discrete and continuous time
145
10.37. A device can break down at moments n = 1, 2, . . . ; a broken device is replaced immediately by a new one. A failure probability for the device of age k is equal to pk , k ≥ 1. Consider a Markov chain Xn = (the age of the device functioning at the moment n), n ≥ 1. Find: (n) (n) (1) f00 , f11 , n ≥ 1 and condition on {pk } under which the chain is recurrent. (2) Condition on {pk } under which the chain is positively recurrent, and find the invariant distribution of the chain. 10.38. A package of requests arrives at the device buffer. The device processes each request in one second. At the moment when the last request has been served the new package arrives and so on. The sizes of incoming packages are i.i.d. random variables {ξm , m ≥ 1}, P(ξm = k) = pk , k ≥ 1. Answer the questions formulated in the previous problem for the Markov chain Xn =(number of requests being in the buffer at the moment n), n ≥ 1. 10.39. Transition matrix P of a Markov , , chain is equal to , , ,0.1 0.2 0.3 0.1 0.1 0.2, ,0.2 0.8 0 0 0 , , , , , ,0.1 0.1 0.1 0.1 0.2 0.4, ,0.4 0.6 0 0 0 , , , , , , 0 0 0.4 0.6 0 0 , , ,; , , (a) ,0.1 0.2 0.3 0.2 0.2,; (b) , , , 0 0 0 0.1 0.9, , 0 0 0.3 0.7 0 0 , , , , 0 0 0 0 0.5 0.5, , 0 0 0 0.6 0.4, , , , 0 0 0 0 0.5 0.5, , , , , ,0.3 0.7 0 0 , ,1 0 0 0, , , , , ,0.5 0.5 0 0 , ,0.3 0.3 0.3 0.1, , , , , (c) , ,; (d) , 0 0 0.6 0.4,; ,0.2 0.3 0.3 0.2, , , ,0 0 0 1, , 0 0 0.4 0.6, , , ,1 0 0 0 0 0, , , , 0 0.5 0.5 0 0 0 , , , , 0 0.4 0.6 0 0 0 , , (e) , , 0 0 0 1 0 0 ,. , , , 0 0 0 0 0.2 0.8, , , ,0.1 0.2 0.3 0.1 0.1 0.2, Classify the states. Find P∞ = limn→∞ Pn . For any unessential state i, find the mean time before the chain visits an essential state given X(0) = i. Find the mean time before the chain reaches any state with even number if X(0) = 1 for the matrix from item (b). 10.40. Let {Xn , n ≥ 1} be a random walk with the step ξ1 = η − ζ , where: (a) η ∼ ζ ∼ Geom(p); (b) η ∼ Pois(1), ζ = 1. Prove that this random walk is recurrent. 10.41. N white and N black balls are put into two boxes containing N balls each. At the moments n = 1, 2, . . . one ball from each box is chosen at random and placed in the other box (Laplace diffusion model). Define a Markov chain Xn = (number of balls inside the first box at the moment n), n ≥ 1. (a) Find the transition probabilities matrix P and classify the states. (b) Find P∞ = limn→∞ Pn .
146
10 Markov chains: Discrete and continuous time
10.42. N balls are distributed between two boxes. At the time moments n = 1, 2, . . . a ball is chosen at random and is shifted into another box (P. and T. Ehrenfest diffusion model). Define a Markov chain Xn = (number of balls inside the first box at the moment n), n ≥ 1. (a) Find the transition probabilities matrix P and classify the states. (b) Find P∞ = limn→∞ Pn . 10.43. (Coding by pile of books method) Suppose that every second a router accepts letters from some alphabet A = {a1 , . . . , am } with probabilities p1 , . . . , pm respectively. A letter ak at the moment n is coded by a number xk (n) ∈ {1, . . . , m} (different letters are coded by different codes). The coding algorithm is built in a recurrent way. Assume that at the moment n the letter ak arrives. Then xk (n + 1) = 1. Letters encoded at the moment n by numbers 1, . . . , (xk (n) − 1), are re-encoded by numbers 2, . . . , xk (n), respectively. The codes of the other letters are left unchanged. Is a letter’s a1 code at the moment n a Markov chain? Find the limits for probabilities of the following event as n → ∞. (a) A letter a1 at the moment n is encoded by 1. (b) Letters a1 , a2 at the moment n are encoded by 1 and 2, respectively. Assume that letters come independently. 10.44. (A birth-and-death process). Let a phase space of a Markov chain {Xn } be Z+ . Assume that the transition matrix is equal to , , , r0 p0 0 0 0 . . . , , , ,q1 r1 p1 0 0 . . . , , , , P=, , 0 q2 r2 p2 0 . . . , , , 0 0 q3 r3 p3 . . ., , , ,. . . . . . . . . . . . . . . ., where qi + ri + pi = 1, qi > 0, pi > 0. (1) Classify the states of the Markov chain. (2) Prove that either the chain is recurrent or limn→∞ Xn = ∞ with probability one. Find the recurrence conditions in terms of {qi , ri , pi }. (3) Find the invariant distribution if it exists. 10.45. (A cyclic birth-and-death process). Let a Markov chain have the phase space {0, . . . , N} and the transition matrix , , , r0 p0 0 . . . q0 , , , , q1 r1 p1 . . . 0 , , , , P=, , . . . . . . . . . . . . . . . . . . . . ., , , 0 . . . qN−1 rN−1 pN−1 , , , , pN . . . 0 qN rN , where qi + ri + pi = 1, qi > 0, pi > 0. Find the invariant distribution of this chain if (a) p0 · · · pN = q0 · · · qN . (b) There exists μ = 1 for which (pi /qi ) = μ , i = 0, . . . , N. (c) p0 · · · pN = q0 · · · qN .
10 Markov chains: Discrete and continuous time
147
10.46. (A continuous-time birth-and-death process). Let a phase space of a continuous-time Markov chain {X(t),t ≥ 0} be Z+ . Suppose that a generator is , , ,−λ0 λ0 0 0 0... , , , , μ1 −(μ1 + λ1 ) λ1 0 0... , , , μ2 −(μ2 + λ2 ) λ2 0... , A=, ,, , 0 , , 0 μ −( μ + λ ) λ . . . 0 3 3 3 3 , , ,. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ., where μi > 0, λi > 0. Prove that the invariant distribution exists if, and only if, λ0 · · · λn−1 Λ = 1+ ∑ < ∞. n≥0 μ1 · · · μn In this case
πk = Λ −1
λ0 · · · λk−1 , k ≥ 1 and π0 = Λ −1 . μ1 · · · μk
10.47. A student tosses a die and makes the corresponding number of steps towards the finish line. The distance left to the finish equals six steps. Find the expected duration of the walk until reaching the finish in the following cases. (1) The step is not made if the number on the die is greater than the distance to the finish. (2) The student stops at the finish if the number on the die is more than or equal to the distance to the finish. 3) The random walk reflects from the finish if the number on the die is greater than the distance to the finish. 10.48. A package of requests arrives at a device buffer every second. The device processes 1 request per second. The sizes of incoming packages are i.i.d. random variables {ξm , m ≥ 1}, P(ξm = k) = pk , k ≥ 0. Describe the transition probabilities of the Markov chain {Xn = (number of requests in the buffer at time n), n ≥ 1}. Study the chain for recurrence. 10.49. Let {Xn , n ≥ 0} be a random walk in Zd . Assume that the step of the walk is uniformly distributed on {x ∈ Zd , x = 1}. For which d is this random walk recurrent? 10.50. Let the expectation of the step of a random walk be finite and nonzero. Prove that the random walk is not recurrent. 10.51. John and Peter play a game: they toss a coin till either three heads or two tails appear. In the first case John wins $3 and in the second case he loses $2. Find the expected amount of John’s winnings. 10.52. Suppose that a Markov chain {Xn } visits the set A ⊂ X infinitely many times with probability 1. Let τm be the mth moment. That is, τm = inf{k > τm−1 |Xk ∈ A}, where τ−1 ≡ 0. Prove that Yn = Xτn , n ≥ 0 is a Markov chain.
148
10 Markov chains: Discrete and continuous time
10.53. Assume that {Xn } is an aperiodic Markov chain, and all states communicate. Prove that for any d ∈ N a sequence Yn = Xnd , n ≥ 0 is an aperiodic Markov chain, and all states of {Yn } communicate. 10.54. Let the transition matrix of a Markov chain {Xn } be , , ,0 1 0, ,1 1, , P=, , 2 01 12 , . ,0 , 2 2 Find all possible distributions of X0 such that the distributions of X0 and X24 are equal. 10.55. The transition matrix of a Markov chain {Xn , n ≥ 0} equals , , , 0 0.5 0 0.5, , , ,0.3 0 0.7 0 , , P=, , 0 0.4 0 0.6, . , , ,0.1 0 0.9 0 , Classify the states and find the invariant distribution. Prove that Yn = X2n , n ≥ 0 is a Markov chain. Find all invariant distributions of {Yn }. 10.56. Suppose that {Xn ,Yn , n ≥ 1} are independent homogeneous Markov chains with the same transition probabilities. Let τ = inf{n| Xn = Yn }. (1) Prove that sequences (Xn ,Yn ), n < τ , Yn , n < τ , Vn := (Xn ,Yn ); Wn := Un := (Xn , Xn ), n ≥ τ ; Xn , n ≥ τ are homogeneous Markov chains. (2) Assume that a phase space is finite, all states communicate, and states are aperiodic. Prove that P(τ < ∞) = 1. (3) Generalize previous results for regular continuous-time Markov chains {X(t), Y (t), t ≥ 0}. 10.57. Assume that for every couple of states i, j of the regular Markov chain {X(t), t ≥ 0} with finite phase space there exists a moment ti j such that pi j (ti j ) > 0. Prove that there exists a unique stationary distribution, and the distribution of X(t) converges as t → ∞ to the stationary distribution whatever the initial distribution X(0) is. 10.58. Let the phase space X of the stochastically continuous Markov chain {X(t), t ≥ 0} be finite. Prove that X has c`adl`ag modification, and this modification is a regular process. 10.59. Let all states of a Markov chain {X(t), t ≥ 0} be regular and its intensities satisfy the condition supi αii > −∞. Prove that the chain X is regular. 10.60. Let a Markov chain {X(t),t ≥ 0} have a finite phase space and be stochastically continuous. Prove that for any t > 0 the determinant of the matrix P(t) is positive.
10 Markov chains: Discrete and continuous time
149
10.61. Let a stochastically continuous Markov chain {X(t),t ≥ 0} have a finite phase space and for some i, j there exists a time moment t such that pi j (t) > 0. Prove that pi j (t) > 0 for all t > 0. 10.62. Let {X(t),t ≥ 0} be a regular Markov chain, X0 = i, τ1 be a moment of the first jump. Prove that τ1 has exponential distribution with parameter (−αii ) and αi j P(X(τ1 ) = j/X(0) = i) = − , i = j. αii 10.63. Suppose that a homogeneous continuous-time Markov chain {X(t),t ≥ 0} is regular and αii = 0 for any i ∈ X. Let {τn , n ≥ 0} be a moment of the nth jump of {X(t),t ≥ 0}; that is, τn+1 = inf{t > τn |X(t) = X(τn )}. Prove that P(τn = +∞) = 0, n ∈ N and Yn = X(τn ), n ≥ 0 is a homogeneous Markov chain. 10.64. Prove that the Poisson process is a homogeneous continuous-time Markov chain. Find its transition matrices P(t),t ≥ 0 and the generator A. Is this chain regular? 10.65. Let a phase space of a Markov chain {X(t),t ≥ 0} be {1, 2}. A jump 1 → 2 intensity is a. No other jumps occur. Find P(t),t ≥ 0. 10.66. Let a phase space of a Markov chain {X(t),t ≥ 0} be {1, 2, 3}. The jumps 1 → 2, 2 → 3 intensities are a, b, respectively. No other jumps occur. Find P(t),t ≥ 0. Consider the cases a = b and a = b. 10.67. Let a phase space of a Markov chain {X(t),t ≥ 0} be {1, 2, 3, 4}. The jumps 1 → 2, 2 → 3, 3 → 4 intensities are equal to a. No other jumps occur. Find P(t),t ≥ 0. Generalize this problem for the case when the phase space consists of n elements. 10.68. Let a phase space of a Markov chain {X(t),t ≥ 0} be {1, 2, 3}. The jumps 1 → 2, 1 → 3 intensities equal a, b, respectively. No other jumps occur. Find P(t), t ≥ 0. 10.69. Let a phase space of a Markov chain {X(t),t ≥ 0} be {1, 2, 3, 4}. The jumps 1 → 2, 1 → 3, 2 → 4, 3 → 4 intensities equal a, b, c, d, respectively. No other jumps occur. Find P(t),t ≥ 0. 10.70. Let a phase space of a Markov chain {X(t),t ≥ 0} be {1, 2, 3, 4}. The jumps 1 → 2, 1 → 3, 2 → 4 intensities equal a, b, c, respectively. No other jumps occur. Find P(t),t ≥ 0. 10.71. Let a phase space of a Markov chain {X(t),t ≥ 0} be {1, 2, 3, 4}. The jumps 1 → 2, 2 → 3, 2 → 4 intensities equal a, b, c, respectively. No other jumps occur. Find P(t),t ≥ 0. 10.72. Let a Markov chain, and {X(t),t ≥ 0} be a birth-and-death process with μi = iμ , λi = iλ , i ∈ Z+ . Find EX(t),t ≥ 0, if X(0) = n.
150
10 Markov chains: Discrete and continuous time
10.73. Let {X(t),t ≥ 0} be a Markov process on Z with intensities αk,k+1 = λk > 0, k ∈ Z+ and αk,0 = μk > 0, k ∈ N (all other intensities are equal to zero). Find the conditions on {λk } and {μk } under which the process is recurrent. When does {X(t),t ≥ 0} possess the invariant distribution? Find this distribution. 10.74. Let {X(t),t ≥ 0} be a Markov process on Z+ with intensities αk,k−1 = μk > 0, k ∈ Z+ and α0,k = λk > 0, k ∈ N. Assume that ∑k λk < +∞ (all other intensities are equal to zero). Find the conditions on {λk } and {μk } under which the process is recurrent. When does {X(t),t ≥ 0} possess the invariant distribution? Find this distribution. 10.75. Let {X(n), n ≥ 0} be a Markov chain, and {N(t),t ≥ 0} be a Poisson process independent on it. Prove that a stochastic process {X(N(t)),t ≥ 0} is a Markov process. Express its transition function via transition probabilities of X. Find the generator. 10.76. Let {X(t),t ≥ 0} be a continuous-time Markov chain with a finite phase space. Assuming that X is stochastic continuous, prove that there exist an independent discrete-time Markov chain {Y (n), n ≥ 0} and the Poisson process N(t),t ≥ 0 with intensity λ such that a process {Y (N(t)),t ≥ 0} has the same distribution as {X(t),t ≥ 0}. What is the least value λ can have? 10.77. Find representations of the form {Y (N(t)),t ≥ 0} from Problem 10.76 if the generator of the process {X(t),t ≥ 0} is , , ,−4 1 1 2 , , , , 1 −3 2 0 , , , , 0 5 −6 1 , . , , , 1 1 1 −3, 10.78. Let {X(t),t ≥ 0} be a Markov process with the generator from the previous problem. Denote by τn the moment of the nth jump of X. Find the probability P(X(τ1 ) = 1, X(τ2 ) = 4, X(τ3 ) = 3/ X(0) = 2). 10.79. Packages arrive at a device at random with intensity λ and are processed with intensity μ , λ < μ . If the device is busy then a new package is added at the end of the queue. Denote by X(t) the total number of packages waiting in a queue or processing at the moment t. (1) Write the Kolmogorov equations for transition probabilities of the Markov process X(t). (2) Find the expectation of the queue’s length and time of processing a package that has just come if the system is in the stationary state. Assume that the number of places in the queue is infinite. 10.80. A device accepts requests with intensity λ and processes requests with intensity μ , λ ≤ μ . If the device is busy then a new package is added at the end of the queue. The capacity of the queue is n. If the queue is full then any other pending request at the moment is discarded. Let X(t) be defined as in Problem 10.79.
10 Markov chains: Discrete and continuous time
151
(1) Write the Kolmogorov equations for the transition probabilities for the Markov process X(t). (2) Find the expectation of the queue’s length and the probability that the arrived request is discarded, if the system is in the stationary state. 10.81. A switch unites n channels transferring signals. Signals arrive with intensity λ , and each signal can be transferred by some channel with intensity μ . If all channels are busy then any other pending signal at the moment is discarded. (1)Write the Kolmogorov equations for the transition probabilities of the Markov process X(t) = (number of busy channels at the moment t). (2) Find the expectation of the number of busy channels if the system is in the stationary state. (3) Find the probability that the arrived signal is discarded if the system is in the stationary state. 10.82. Consider the queueing system from the previous problem. Assume that at the initial moment of time all channels were free. Find the probability that the signal that arrived at moment t is discarded if the number of channels is equal to: (a) n = 1; (b) n = 2. Find the expectation of the number of lost signals until moment t. 10.83. A router accepts packets with intensity λ , and it processes packets with intensity μ . If the router is busy then a new packet is added at the end of the queue. The queue capacity is n. Let pn (t) be the probability that the packet arrived at the moment t is discarded. Find limn→∞ limt→∞ pn (t). 10.84. Requests of two types A and B arrive at the device at random with intensities λ1 , λ2 and are processed with intensities μ1 , μ2 , respectively. If the device is busy then any other pending request at the moment is discarded. (1) Define the Markov process describing such a queueing system. Write the Kolmogorov equations for the transition probabilities. (2) Find the probability that an arriving request is discarded if the system is in the stationary state. 10.85. Consider the queueing system from the previous problem, but assume that the type A request has higher priority than the type B request. That is, if the system is busy with processing the type B request, and the type A request arrives, then request B is discarded and request A starts the process immediately. Suppose that the system is in the stationary state. (1) Find the probability that the arrived request is discarded if it is of: (a) type A; (b) type B. (2) Find the probability that the request B will be accepted and processed. 10.86. The queueing system consists of the main and standby devices. If the main device is free then the request is processed by it. Otherwise, if the standby device is free then the request is processed to the end by the standby device. The intensities of processing by the main and standby devices are equal to μ1 and μ2 , respectively. The request arriving intensity equals λ .
(1) Characterize the Markov process describing this queueing system. Write the Kolmogorov equations for the transition probabilities. (2) Solve this problem under the following two arrangements: if both devices are busy then a new request (a) is added at the end of the (infinite capacity) queue; (b) is discarded.

10.87. A device accepts requests with intensity λ, and it processes requests with intensity μ. If the device is busy then the new request is added at the end of the queue. The capacity of the queue is n. The intensity of discarding a request during the processing or waiting in the queue is equal to ν. Find the probability that the request will be accepted and processed if the system is in the stationary state.

10.88. The queueing system consists of n devices. Requests arrive at random with intensity λ. Every device processes a request with intensity μ. If all devices are busy, then the new request is added at the end of the common queue. The capacity of the queue is m. Find the stationary distribution.

10.89. The queueing system consists of two devices. Requests of types A and B arrive at the device with intensities λ_1, λ_2, respectively. Every device processes the request A (B) with intensity μ_1 (μ_2). If both devices are busy then a new request is discarded. Characterize the Markov process describing such a queueing system. Write the Kolmogorov equations for the transition probabilities.

10.90. Requests of types A and B arrive at the device with intensities λ_1, λ_2 and are processed with intensities μ_1, μ_2, respectively. The capacity of the queue is equal to 1. Request A has higher priority than B. That is, if the device is busy processing a type B request, another request B is waiting in the queue, and a type A request arrives, then the request B being processed moves to the queue, the waiting request B is discarded, and request A starts processing. (1) Characterize a Markov process describing such a queueing system. Write the Kolmogorov equations for the transition probabilities. (2) Assume that at the initial moment of time a request B is being processed. Find the probability that it will be processed sooner or later. (3) Find the probability that a request A starts processing immediately on arrival. Suppose that the system is in the stationary state.
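The birth-and-death structure behind Problems 10.79–10.80 is easy to check numerically. The following sketch is not part of the original problem set; it assumes NumPy is available, builds the generator of the M/M/1 system with n waiting places, solves πQ = 0 for the stationary distribution, and compares it with the truncated geometric answer expected in Problem 10.80.

```python
import numpy as np

def mm1n_generator(lam, mu, n):
    """Generator Q of the queue from Problems 10.79-10.80: states 0..n+1,
    arrivals with intensity lam, services with intensity mu."""
    size = n + 2
    Q = np.zeros((size, size))
    for k in range(size):
        if k < size - 1:
            Q[k, k + 1] = lam          # arrival
        if k > 0:
            Q[k, k - 1] = mu           # service completion
        Q[k, k] = -Q[k].sum()          # rows of a generator sum to zero
    return Q

def stationary(Q):
    """Solve pi Q = 0 together with sum(pi) = 1 (least squares)."""
    size = Q.shape[0]
    A = np.vstack([Q.T, np.ones(size)])
    b = np.zeros(size + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

lam, mu, n = 1.0, 2.0, 20
pi = stationary(mm1n_generator(lam, mu, n))
rho = lam / mu
answer = rho ** np.arange(n + 2)
answer /= answer.sum()                  # pi_k expected in Problem 10.80
print(np.max(np.abs(pi - answer)))      # should be ~1e-15
print((pi * np.arange(n + 2)).sum())    # expected number of requests in the system
```

For λ < μ and large n the same vector is close to the geometric distribution (λ/μ)^k (1 − λ/μ) asked for in Problem 10.79.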
Hints

10.4. States i and j communicate. So, there exist n and m such that p_{ij}^{(n)} > 0 and p_{ji}^{(m)} > 0. Prove that p_{ii}^{(n+k+m)} ≥ p_{ij}^{(n)} p_{jj}^{(k)} p_{ji}^{(m)} for any k ∈ N. Therefore, if the series ∑_k p_{jj}^{(k)} is divergent, then the series ∑_k p_{ii}^{(k)} is divergent as well. Apply Theorem 10.1.

10.5. Let v be the probability of return to the initial state. Use Theorem 10.5 and prove that the probability to return to the initial state at least n times is equal to v^n. For the transient state use the Borel–Cantelli lemma.
10.9. Let n_0 be a number such that α := min_j max_{1≤n≤n_0} P(X_n ∈ A / X_0 = j) > 0. Then
P(X_k ∉ A, k = 1, …, nd) ≤ P(X_{ld} ∉ A, l = 1, …, n) ≤ α^n.
For a Markov chain with a countable set of states the corresponding statements, in general, are not correct.

10.19. Construct chains {X̄_n, n ≥ 0}, {Ȳ_n, n ≥ 0} that are stochastically equivalent in the wide sense to {X_n, n ≥ 0}, {Y_n, n ≥ 0}, respectively, and are of the following type: X̄_{n+1} = f(X̄_n, ε_n), Ȳ_{n+1} = g(Ȳ_n, ε_n), where X̄_0 ≥ Ȳ_0, {ε_n, n ≥ 0} are i.i.d. random variables, and f(x, ε) ≥ g(y, ε) for all ε > 0, x, y with x ≥ y.

10.27. Consider the conditional probabilities P(Y_2 = 3 / Y_1 = 2, Y_0 = 3) and P(Y_2 = 3 / Y_1 = 2).

10.29. P(X_n = (k, l) / X_0 = (i, j)) equals the product p_{ik}^{[n/2]} · p_{jl}^{n−[n/2]}, where p_{ik}^{(r)} is the transition probability for the symmetric random walk on Z (see Problem 10.28 with p = 1/2).

10.30. Let {X_n = (X_n^1, X_n^2)} be the sequence from Problem 10.29. Prove that the sequences {Y_n} and
Ỹ_n = ((X_{2n}^1 + X_{2n}^2)/2, (X_{2n}^1 − X_{2n}^2)/2)
have the same distributions. Thus,
P(Y_n = (k, l) / Y_0 = (i, j)) = P(X_{2n}^1 = k + l − i − j, X_{2n}^2 = k + j − i − l / X_0 = (0, 0)).
In particular,
P(Y_{2m} = (0, 0) / Y_0 = (0, 0)) = (C_{2m}^m / 2^{2m})^2,  P(Y_{2m−1} = (0, 0) / Y_0 = (0, 0)) = 0.
Because the series ∑_{m=1}^∞ 2^{−2m} C_{2m}^m diverges, (0, 0) is recurrent. By the same reasoning show that any other state is also recurrent.

10.32. (a) Determine the one-to-one correspondence between the route from point j to (−k) and the route from j to k that visits zero.

10.35. The desired probability equals the sum of the transition probabilities from i to j in n steps without visiting zero plus the corresponding probability with a visit to zero. Apply the results of Problems 10.32, 10.34.

10.37. See Problem 10.13 (d).
10.38. See Problem 10.13 (c).
10.40. Use Theorem 10.2.

10.45. Prove that for the invariant distribution (π_0, …, π_N) the following equality holds: q_i π_i − p_{i−1} π_{i−1} = c, i = 0, …, N, where c is some constant (the notation 0 − 1 = N is used). Check that c = 0 ⇔ p_0 ··· p_N = q_0 ··· q_N. Every π_i can be expressed either via π_0 in the case p_0 ··· p_N = q_0 ··· q_N, or via c in the case p_0 ··· p_N ≠ q_0 ··· q_N. The unknown value π_0 or c can be found from the condition π_0 + ··· + π_N = 1.

10.48. Let ε be the number of requests in one package. Use the law of large numbers and prove that the chain is recurrent if Eε < 1 and the chain is not recurrent if Eε > 1.
When Eε = 1, the chain is recurrent if and only if the random walk on Z with the step equal to ε − 1 is recurrent. Use Theorem 10.2 and prove the recurrence.

10.49. Use Theorem 10.2.
10.50. Use the law of large numbers.
10.52. Prove that τ_n is a Markov moment and use the strong Markov property.

10.53. To prove the Markov property of {Y_n} use the definition. Check that for any i ∈ X there exists a number n_0 such that p_{ii}^{(n)} > 0, n ≥ n_0. It follows from this that all states of the chain {Y_n} communicate and have period 1.

10.54. {X_n} is an aperiodic Markov chain and all states of {X_n} communicate. Thus, Y_n = X_{24n}, n ≥ 0, is also an aperiodic Markov chain all states of which communicate (see 10.53). Thus, {Y_n} has a unique invariant distribution. The invariant distribution of {X_n} is also the invariant distribution of {Y_n}, so this distribution is the unique solution of the problem.

10.59. Check that the generator A is a bounded linear operator in the space l_∞. That is why the first Kolmogorov system of equations has a unique solution. Use Theorem 10.8.

10.63. To prove that P(τ_n = +∞) = 0 use Problem 10.62. To prove the Markov property for {Y_n} verify that {τ_n} are Markov moments and use the strong Markov property.

10.69. p_{11}(t) = e^{−(a+b)t}, p_{22}(t) = e^{−ct}, p_{24}(t) = 1 − e^{−ct}, p_{33}(t) = e^{−dt}, p_{34}(t) = 1 − e^{−dt}, p_{44}(t) = 1. Use the Kolmogorov equations for p_{12}(t) and p_{13}(t), and obtain differential equations with one variable. Then find p_{14}(t), p_{13}(t). Other probabilities are equal to zero.

10.70. p_{11}(t) = e^{−(a+b)t}, p_{22}(t) = e^{−ct}, p_{24}(t) = 1 − e^{−ct}, p_{13}(t) = 1 − e^{−bt}, p_{33}(t) = p_{44}(t) = 1, p_{14}(t) = 1 − p_{13}(t) − p_{11}(t) − p_{12}(t). Use the Kolmogorov equations for p_{12}(t) and obtain differential equations with one variable.

10.71. p_{11}(t) = e^{−at}, p_{22}(t) = e^{−(b+c)t}, p_{24}(t) = (c/(b + c))(1 − e^{−(b+c)t}), p_{23}(t) = (b/(b + c))(1 − e^{−(b+c)t}), p_{33}(t) = p_{44}(t) = 1, p_{14}(t) = 1 − p_{13}(t) − p_{11}(t) − p_{12}(t). Use the Kolmogorov equations for p_{13}(t) and obtain differential equations with one variable. All other probabilities are equal to zero.
Answers and Solutions

10.3. (a) Yes. (b) No.
10.6. If i and j do not communicate, then d(i), d(j) can be arbitrary. If i ↔ j then d(i) = d(j).
10.11. (a) 0.75; (b) 3; (c) 0.75, 12.
10.12. (a) If p = 1/2 then the ruin probability for John is equal to m/(m + n). Otherwise it is equal to
((q/p)^{n+m} − (q/p)^n) / ((q/p)^{n+m} − 1).
(b) If p = 1/2 then the expectation of the game duration is equal to nm, and
n/(q − p) − ((n + m)/(q − p)) · (1 − (q/p)^n)/(1 − (q/p)^{n+m})
otherwise.
10.13. (a) Let all p_i > 0. Then π_1 = 1/(2 − π_1), π_k = p_k/(2 − π_1), k ≥ 2. (b) Any distribution for which p_1 = 0 is stationary. (c) Assume that all p_i > 0. Denote a_{n+1} = 1 − p_1 − ··· − p_n, n ≥ 1, a_1 = 1. The stationary distribution exists if and only if ∑_n a_n < ∞. At the same time π_k = a_k / ∑_n a_n. (d) Let all p_i > 0. Denote a_{n+1} = p_1 ··· p_n, n ≥ 1, a_0 = 1. The chain is recurrent if and only if lim_{n→∞} a_n = 0. The stationary distribution exists if and only if ∑_n a_n < ∞, and π_k = a_k / ∑_n a_n.

10.17. (a) p = 2/3. (b) No. However, the sequence Y_n = (X_{n−1}, X_n) is a Markov chain.
10.21. No.
10.22. p = 0.5; p = 0; p = 1.
10.24. (a) No. (b) Yes.
10.25. (a) No. (b) Yes.
10.26. The aperiodic Markov chain with one class of communicating states.
10.27. No.

10.28. p_{ij}^{(n)} = C_n^{(n+j−i)/2} p^{(n+j−i)/2} q^{(n+i−j)/2}, if n + j − i is an even number and |j − i| ≤ n. Stirling's formula implies that
p_{ii}^{(2m)} ∼ (4p(1 − p))^m / √(πm), m → ∞.
Thus, the random walk is recurrent if, and only if, the series ∑_m (4p(1 − p))^m / √(πm) diverges, that is, p = 1/2.

10.32. (b) P(X_1 ≠ 0, …, X_{m−1} ≠ 0, X_m = k | X_0 = j) = P(X_m = k | X_0 = j) − P(∃ l ∈ {1, …, m} : X_l = 0, X_m = k | X_0 = j) = C_m^{(m+k−j)/2} 2^{−m} − P(X_m = −k | X_0 = j) = 2^{−m}(C_m^{(m+k−j)/2} − C_m^{(m−k−j)/2}),
if |k − j| ≤ m and m + k − j is an even number.
(c) P(X_1 ≠ 0, …, X_{m−1} ≠ 0, X_m = 0 | X_0 = j) = P(X_1 ≠ 0, …, X_{m−2} ≠ 0, X_{m−1} = 1, X_m = 0 | X_0 = j) = (1/2) P(X_1 ≠ 0, …, X_{m−2} ≠ 0, X_{m−1} = 1 | X_0 = j).
(d) P(X_1 ≥ 0, …, X_{m−1} ≥ 0, X_m = k | X_0 = j) = P(X_1 ≠ 0, …, X_{m−1} ≠ 0, X_m = k + 1 | X_0 = j + 1).
(e) k/m.

10.33. The required probability coincides with the answer to Problem 10.32 (b) if i = 0, j = 0.
10.34. Due to Problem 10.31:
p_{ij}^{(n)} = (C_n^{(n+j−i)/2} + C_n^{(n−i−j)/2}) / 2^n,
where i, j ∈ N, n + j − i is an even number, |j − i| ≤ n.

10.35. Let i ≥ 0. Then:
p_{ij}^{(n)} = 2^{−n}(C_n^{(n+j−i)/2} − C_n^{(n−i−j)/2}) + 2^{−n+1} p C_n^{(n−i−j)/2},  j ≥ 0;
p_{ij}^{(n)} = 2^{−n+1} q C_n^{(n−i−j)/2},  j < 0.
10.36. The chain is recurrent if p ≤ 0.5. 10.41. pk,k+1 = (N − k)2 /N 2 ; pk,k−1 = k2 /N 2 ; pk,k = 2k(N − k)/N 2 , k = 1, . . . , N − 1; p0,1 = pN,N−1 = 1. The stationary distribution is πk = CNk /2k , k = 0, . . . , N. All states communicate and have period 1. 10.43. The code of the letter a1 at the moment n is not a Markov chain generally, but the vector X(n) = (x1 (n), . . . , xm (n)) is an aperiodic Markov chain all states of which communicate. That is why there exists only one stationary distribution. It is also assumed that X(n) is already stationary distributed. (a) Check that the probability that a1 is encoded by 1 at the moment n + 1 is equal to p1 . (b) Let r be a sought probability. Then r = P(x1 (n + 1) = 1, x2 (n + 1) = 2) = P(x1 (n) = 1, x2 (n) = 2, and a letter a1 has arrived at the moment n) + P(x2 (n) = 1, and a letter a1 has arrived at the moment n) = rp1 + p2 p1 . Therefore, r = p1 p2 / (1 − p1 ). 10.44. Denote by yn,k , k = 1, . . . , n the probability that the chain visits the point 0 earlier than n + 1, yn,0 = 1, yn,n+1 = 0. Then yn,k = qk yn,k−1 + rk yn,k + pk yn,k+1 .
(10.6)
Taking into account that rk = 1 − qk − pk we obtain: qk q1 qk yn,k − yn,k−1 = (yn,k−1 − yn,k−2 ) = · · · = × · · · × (yn,1 − 1). pk p1 pk We add these identities and obtain yn,n − yn,1 = (yn,1 − 1) ∑nk=1 ∏ki=1 pi q−1 i . Due to −1 n,1 (10.6) we have that −yn,n = ∏n+1 p q (y − 1) for k = n + 1. i=1 i i Thus, −1 k ∑n+1 k=1 ∏i=1 pi qi yn,1 = . −1 k 1 + ∑n+1 k=1 ∏i=1 pi qi −1 k The chain is recurrent if and only if limn→∞ yn,1 = 1 or ∑∞ k=1 ∏i=1 pi qi = +∞. The −1 ∞ k stationary distribution exists if A := 1 + ∑k=1 ∏i=1 pi−1 qi < +∞. In this case k
π0 = A−1 , πk = A−1 ∏ pi−1 q−1 i , k ≥ 1. i=1
10.45. (a) πi = σi /(∑Nj=0 σ j ), where σ0 = 1, σ j = (p j−1 . . . p0 )/(q j . . . q1 ), j = 1, . . . , N. −1 −1 N (b) πi = p−1 i (∑ j=0 p j ) , i = 0, . . . , N. (c) Denote
ρ_i = p_i/q_i, θ_i = [q_i(1 − ρ_0 ··· ρ_N)]^{−1} (1 + ρ_{i−1} + ρ_{i−1}ρ_{i−2} + ··· + ρ_{i−1} ··· ρ_0 ρ_N ··· ρ_{i+1}); then π_i = θ_i/(∑_{j=0}^N θ_j), i = 0, …, N.

10.47. (a), (c) 6. (b) Let r_k, k = 1, …, 6, be the expected time to finish if the distance to finish equals k. Then r_k can be found from the recurrent formula: r_1 = 1, r_{k+1} = 1 + (1/6) ∑_{j=1}^k r_j.

10.49. The chain is recurrent if and only if d ≤ 2.
10.54. P(X_0 = 1) = 0.2, P(X_0 = 2) = P(X_0 = 3) = 0.4.

10.60. det P(t) = det exp{At} = (det exp{At/n})^n. If n is large, then the matrix exp{At/n} is close to the identity matrix, and therefore its determinant is positive. Furthermore, it is known from the theory of differential equations that det P(t) = exp{(tr A)t} > 0.

10.64. The Poisson process is a regular Markov chain with
p_{ij}(t) = ((λt)^{j−i}/(j − i)!) e^{−λt} 1I_{j≥i},  A = (α_{ij}),
where α_{ii} = −λ, α_{i(i+1)} = λ and α_{ij} = 0 for all other i, j.

10.72. EX(t) = n e^{(λ−μ)t}.
10.75. P(t) = ∑_{n≥0} e^{−λt}((λt)^n/n!) P^n, A = λ(P − 1I).
10.76. P_λ = λ^{−1}A + 1I, where λ ≥ max_{1≤i≤n}(−λ_{ii}).
10.78. See Problem 10.62: (1/3) · (2/4) · (1/3) = 1/18.

10.79. X(t) is the number of packages in the system; transition intensities α_{n,n+1} = λ, α_{n,n−1} = μ. The stationary distribution is
π_k = (λ/μ)^k (1 − λ/μ),  k ∈ Z+.

10.80. The stationary distribution is
π_k = (λ/μ)^k (∑_{j=0}^{n+1} (λ/μ)^j)^{−1},  k = 0, …, n + 1.

10.81. The stationary distribution is
π_k = ((1/k!)(λ/μ)^k) (∑_{j=0}^{n} (1/j!)(λ/μ)^j)^{−1},  k = 0, …, n.

10.83. (λ − μ)/λ if λ > μ; 0 if λ ≤ μ.
10.84. Introduce the states: A = "the request A is processed"; B = "the request B is processed"; 0 = "the system is free". The transition intensities 0 → A, 0 → B, A → 0, B → 0 are equal to λ_1, λ_2, μ_1, μ_2, respectively. The probability for a request to be rejected equals (λ_1μ_2 + μ_1λ_2)/(λ_1μ_2 + μ_1λ_2 + μ_1μ_2).

10.85. π_A = λ_1/(λ_1 + μ_1),  π_B = μ_1λ_2/((λ_1 + λ_2 + μ_2)(λ_1 + μ_1)).
The probability that the request B is accepted and processed is equal to (1 − π_B)μ_2/(λ_1 + μ_2).

10.87. Let X(t) ∈ {0, 1, …, n + 1} be the number of requests in the system (either being processed or waiting in the queue) at moment t. To find the stationary distribution {π_k} observe that X(t) is the birth-and-death process with the birth intensity λ_k = λ, k = 0, …, n, and the death intensity μ_j = μ + jν. The probability that the request is accepted and processed is equal to ∑_{k=0}^n π_k ρ_k, where ρ_k is the probability that a request which has k requests in the queue before it will eventually be processed. In order to find ρ_k, consider the additional construction. Let {τ_j, j ≥ 1} be the time moments when the requests in the queue before the initial request (inclusively) leave the system (i.e., they either have been processed or lost), τ_0 = 0. Denote by Y_m the position of the initial request in the queue at the time moment τ_m. We assume that Y_m = ∞ if the initial request is lost and Y_m = 0 if it was processed at the moment τ_m. The sequence Y_m, m ≥ 0, is a Markov chain with the transition probabilities
p_{i,i−1} = (μ + (i − 1)ν)/(μ + iν),  p_{i,∞} = ν/(μ + iν),
where i = 1, …, n + 1. Thus, ρ_k is the probability that the sequence {Y_m, m ≥ 0} hits 0 earlier than ∞ given Y_0 = k. This probability is equal to p_{k,k−1} p_{k−1,k−2} ··· p_{1,0} = μ/(μ + kν).

10.88. Let
A = ∑_{j=0}^{n−1} (λ/μ)^j/j! + ((λ/μ)^n/n!) ∑_{k=0}^{m} (λ/(nμ))^k.
The stationary distribution is
π_k = A^{−1}(λ/μ)^k/k!,  k = 0, …, n;
π_k = A^{−1}(λ/μ)^k/(n! n^{k−n}),  k = n + 1, …, n + m.
10.90. (1) States: 0 = “the system is free”; A = “there is only request A in the system”; B = “there is only request B in the system”; AB = “the request A is processing, the request B is waiting in the queue”; AA = “the request A is processing, another request A is waiting in the queue”; BB = “the request B is processing, another request B is waiting in the queue”. (2) This probability is equal to the probability of hitting 0 for the following discrete-time Markov chain. States set: 0, B, AB, C (“cemetery”, meaning the request is lost). The transition probabilities are p0,0 = pC,C = 1, pB,0 = μ2 /(λ1 + μ2 ), pB,AB = λ1 /(λ1 + μ2 ), pA,B = μ2 /(λ1 + μ2 ), pA,C = λ1 /(λ1 + μ2 ). The initial state X0 = B. (3) The required probability is equal to π2 = λ12 /(λ12 + λ1 μ1 + μ12 ); see the system from Problem 10.80 if n = 1.
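As a cross-check of the loss-system answers above (Problems 10.81–10.82), here is a small Monte Carlo sketch; it is only an illustration (standard library only, arbitrary parameter values). The simulated fraction of discarded signals should be close to the stationary blocking probability π_n from the answer to Problem 10.81.

```python
import math, random

def erlang_loss(lam, mu, n):
    """Stationary blocking probability pi_n from the answer to Problem 10.81."""
    a = lam / mu
    terms = [a ** k / math.factorial(k) for k in range(n + 1)]
    return terms[-1] / sum(terms)

def simulate(lam, mu, n, horizon=50000.0, seed=0):
    """Crude event-driven simulation of the n-channel loss system (10.81-10.82)."""
    rng = random.Random(seed)
    t, busy, arrivals, lost = 0.0, 0, 0, 0
    while t < horizon:
        rate = lam + busy * mu               # total intensity of the next event
        t += rng.expovariate(rate)
        if rng.random() < lam / rate:        # the next event is an arrival
            arrivals += 1
            if busy == n:
                lost += 1                    # all channels busy: signal discarded
            else:
                busy += 1
        else:                                # one of the busy channels finishes
            busy -= 1
    return lost / arrivals

lam, mu, n = 3.0, 1.0, 4
print(simulate(lam, mu, n), erlang_loss(lam, mu, n))
```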
11 Renewal theory. Queueing theory
Theoretical grounds Queueing theory was founded by Danish scientist A. Erlang who worked for the Copenhagen Telephone Exchange for many years at the beginning of the twentieth century. The works of F. Pollaczek, A. Khinchin, L. Takacs, and also B. Gnedenko and his school had significant influence on the development of queueing theory. The main model of queueing theory can be described as follows. Assume that one or several devices (machines, routers, cash desks) accept and process some requests (components, packets, clients) that arrive at random. The processing time of each request may also be random. There is a large variability of rules and orders of processing, which have an influence on the quality of service. Examples of such rules could be the following: existence or nonexistence of a queue, different priorities of the requests, and so on. We also have to know what happens to request processing at the current instant of time when a new request with higher priority arrives. In this case the first request may be discarded or may be added to the queue for later processing. Moreover, processing can start from the very beginning or may take into account the time of the initial processing. The main objects of queueing theory investigations, among others, are the following characteristics: average time that a request spends from arrival until processing; average number of requests that have been processed during a certain period of time; distribution of time until the system becomes free from requests; the distribution of a busy period (i.e., the length of time from the instant the request arrives at an empty system until the instant when the system is empty again); probability that the arrived request is discarded (this happens if the queue is overfull); and the limit or stationary distributions for queueing systems. Compound Poisson processes with jumps having the same sign are widely used to describe queueing systems in renewal theory:
ξ(t) = t − S(t),   S(t) = ∑_{k=1}^{N(t)} ξ_k,
ζ(t) = S(t) − t,   P(ξ_k > 0) = 1,
where N(t) is a Poisson process with rate λ > 0, and {ξ_k, k ≥ 1} is a sequence of i.i.d. random variables with distribution function F(x) = P(ξ_k ≤ x). The processes ξ(t) and ζ(t) have a temporal interpretation in queueing theory. The jump moments of the process N(t) are associated with the moments of request arrivals, the jumps ξ_k of the process S(t) are associated with the time necessary for the service of the kth request, and S(t) = ∑_{k=1}^{N(t)} ξ_k is the total time necessary for the service of all requests arrived before t. The process ζ(t) = S(t) − t describes the additional time necessary for service after t if the device is not idle in [0, t]. The process ξ(t) = t − S(t) is called a controlling process. The process α(t) = ξ⁺(t) = sup_{0≤t′≤t} ξ(t′) is called a nonbusy process or idle process. This is a continuous nondecreasing process. The process β(t) = t − α(t) is called a busy process. Denote by τ_k the difference between the kth and (k − 1)th jumps of the process N(t); {τ_k, k ≥ 1} is a sequence of independent exponentially distributed random variables with rate λ. Introduce the virtual waiting time w(t), t ≥ 0, or the virtual waiting process, as follows:
w(t) = 0 for 0 ≤ t < τ_1 = t_1, with w(t_1) = ξ_1;
w(t) = (ξ_1 − t + τ_1)⁺ for t ∈ [t_1, t_2);   (11.1)
w(t) = (w(t_n) − t + t_n)⁺ for t ∈ [t_n, t_{n+1}), n > 1,
where t_n = ∑_{k≤n} τ_k, w(t_1) = ξ_1, Δ_n w = w(t_n) − w(t_n − 0) = ξ_n, n > 1.
Graphs of ξ(t), α(t), β(t), and w(t) are shown on plot 1, page 360. The processes α(t), β(t) are piecewise linear functions. The intervals of linear growth of β(t) are called busy periods {θ_k, k ≥ 1} (they have the same distribution). The intervals where the function β(t) is constant form idle periods. Comparing the graphs of the processes ξ(t), α(t), and w(t), it is easy to see that at busy periods
w(t) = α(t) − ξ(t) > 0,  τ_k < t < τ_k + θ_k (k ≥ 1),   (11.2)
where τ_k is the initial time of the kth busy period. Outside busy periods w(t) ≡ 0. If w(0) = 0 and Eξ(1) = 1 − λμ < 0 (μ = Eξ_k > 0) then the following formula holds for the generating function of w(t):
ω(z, t) = E e^{−zw(t)} = e^{tk(z)} (1 − z ∫_0^t e^{−uk(z)} P_0(u) du).   (11.3)
Here k(z) is the cumulant function of the process ζ(t); P_0(u) = P(w(u) = 0) is the probability that the system is free of requests at the time moment u. The function k(z) is defined by
k(z) = ln E e^{−zζ(1)} = z + λ(E e^{−zξ_1} − 1) = z(1 − λ ∫_0^∞ e^{−zx} F̄(x) dx),   (11.4)
where F̄(x) := P(ξ_1 > x),
p_0(s) = s ∫_0^∞ e^{−sx} P_0(x) dx = s (s + λ − λΠ̃(s))^{−1},  Π̃(s) = E e^{−sθ_1},   (11.5)
and Π̃ is the moment generating function of the first busy period θ_1. There are three possible modes for a queueing system with one device and a Poisson stream of requests, depending on the sign of Eζ(1) = λμ − 1 (ζ(t) = ∑_{k=1}^{N(t)} ξ_k − t). (1) Eζ(1) = λμ − 1 > 0 (λμ > 1). Then the average service time is greater than the average interarrival time: μ > 1/λ. This mode is called above-critical. (2) Eζ(1) = 0 (μ = 1/λ). This mode is called critical. (3) Eζ(1) < 0 (λμ < 1, μ < 1/λ). In this case the equilibrium mode exists, and the random variables w(t) converge weakly as t → ∞. Note that the distribution function of the busy period Π(x) = P(θ_1 < x) tends to 1 as x → ∞ only in the last two cases. In the first case
Π (+∞) = (λ μ )−1 < 1. Usually the third (equilibrium) case μ < λ −1 is the most interesting from the point of view of queueing theory. Classical results of renewal theory cited below are widely used for queueing systems study. Definition 11.1. Assume that {Tn , n ≥ 1} are nonnegative i.i.d. random variables with a distribution function F, F(0) < 1. Let a sequence of nonnegative random variables {Sn , n ≥ 0} be constructed recurrently Sn+1 = Sn + Tn+1 , n ≥ 0, and S0 does not depend on {Tn , n ≥ 1}. The variables Sn are called renewal epochs. A random sequence {Sn , n ≥ 0} is said to be a pure renewal process if S0 = 0, and a delayed renewal process otherwise. For example, Tn may be successive service times of requests, intervals between request arrivals, and so on. Let z : R → R be a measurable function that equals zero when x < 0. The renewal equation is an equation of the form Z(x) = z(x) +
∫_{[0;x]} Z(x − y) F(dy), x > 0.   (11.6)
The renewal equation can be written briefly as Z = z + F ∗ Z, where "∗" is the convolution sign; that is, F ∗ Z(x) = ∫_{[0;x]} Z(x − y) F(dy). Let us introduce the renewal function
U = ∑_{n=0}^∞ F^{∗n},   (11.7)
where F^{∗n} is the n-fold convolution of the distribution function F, and F^{∗0} is the distribution function of the unit measure concentrated at 0.
Theorem 11.1. If z is a bounded function then
Z = U ∗ z   (11.8)
is a solution of equation (11.6). This solution is unique in the class of functions that are bounded on finite intervals and vanish on (−∞, 0).

Theorem 11.2. Assume that the conditions of the previous theorem are satisfied, z is a Riemann integrable function, and
∑_n sup_{x∈[n,n+1]} |z(x)| < ∞.
If F is not arithmetic, that is, ∀ h > 0 : P(T_1 ∈ hN) < 1, then the following equality holds:
lim_{t→+∞} Z(t) = μ^{−1} ∫_0^∞ z(y) dy,   (11.9)
where μ ∈ (0, ∞] is the expectation of T_1. If T_1 is arithmetic then
∀ x ≥ 0 : lim_{n→∞} Z(x + nh) = (h/μ) ∑_k z(x + kh)
for all h multiples of the span of the distribution, that is, multiples of the largest g such that P(T_1 ∈ gN) = 1.
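The renewal function (11.7) and the limit (11.9) are easy to explore numerically. Below is a minimal sketch (not part of the original text; it assumes NumPy, and the exponential choice of F is only for illustration, since then U(t) = 1 + t and μ = 1). It accumulates the discretized convolutions F^{∗n} and checks that U(t + h) − U(t) ≈ h/μ, in agreement with Problem 11.1.

```python
import numpy as np

dt, t_max = 0.01, 15.0
grid = np.arange(0.0, t_max, dt)
f_mass = np.exp(-grid) * dt            # mass of F = Exp(1) on each cell; mu = 1
U = np.zeros_like(grid)                # accumulates sum_{n>=1} F^{*n}(t);
conv = f_mass.copy()                   # the omitted F^{*0} term does not affect increments
for _ in range(100):                   # enough convolution terms for this horizon
    U += np.cumsum(conv)               # add the cdf of F^{*n}
    conv = np.convolve(conv, f_mass)[:len(grid)]   # mass of F^{*(n+1)}
h = 1.0
i, j = int(5.0 / dt), int((5.0 + h) / dt)
print(U[j] - U[i], "vs h/mu =", h)     # increments of U approach h/mu
```

Repeating the computation with a nonexponential F illustrates Theorem 11.2: only the mean μ of F enters the limiting increment.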
Bibliography [22] Volume 1, Chapter 13, Volume 2, Chapters 6, 11, 13, 14; [29]; [6]; [32]; [28]; [47].
Problems

For problems on Markov models in queueing theory see Chapter 10 (Problems 10.76–10.90).

11.1. (Renewal theorem). Assume that the conditions of Theorem 11.1 are satisfied. Prove that
(1) If F is a nonarithmetic distribution then for any h > 0
lim_{t→∞} (U(t + h) − U(t)) = h/μ.   (11.10)
(2) If F is an arithmetic distribution then (11.10) is satisfied when h is a multiple of the span of F.
11.2. Let {S_n, n ≥ 0} be a delayed renewal process, F_0 be the d.f. of S_0. Denote by V(t) the expected number of renewal epochs until the moment of time t. Prove that
(a) V(t) = ∑_{n=0}^∞ P(S_n ≤ t).
(b) V = F_0 + F ∗ V.
(c) V = F_0 ∗ U, where U is given by (11.7).
(d) lim_{t→∞}(V(t)/t) = μ^{−1}.
(e) The renewal rate is constant, that is, V(t) = μ^{−1} t, if and only if
F_0(t) = μ^{−1} ∫_0^t (1 − F(y)) dy.
11.3. Let ν(t) be the number of renewal epochs within [0, t]. Prove that
(a) There exists θ > 0 such that E e^{θν(t)} < ∞.
(b) If S_0 = 0 then Eν(t) = U(t), where the function U is given by (11.7).

11.4. A device accepts requests. The processing times for different requests are i.i.d. random variables with distribution function F, expectation μ, and variance σ². The processing of a new request starts immediately after the previous one has been processed. The first request starts processing at the moment of time t = 0. Let:
H(t, x) be the distribution function of the residual processing time for the request processed at the moment t,
G(t, x) be the distribution function of the total time this request had been processing before t,
L(t, x) be the distribution function of the full processing time of the request.
(a) Express the functions H, G, L in terms of F and U.
(b) Find the limits of the distributions H(t, x), G(t, x), L(t, x) as t → ∞ in the case when the distribution F is nonarithmetical.
(c) Express the first moments of the limit distributions in terms of the moments of F assuming μ < ∞.

11.5. Let V be the renewal function of a delayed renewal process (see Problem 11.2). Assume that F is nonarithmetical, and a function z satisfies the conditions of Theorem 11.2. Consider Z = V ∗ z. Prove that
lim_{t→∞} Z(t) = μ^{−1} ∫_0^∞ z(y) dy.
11.6. Solve Problem 11.4 assuming that the first request was being processed before moment 0, and its residual processing time has a distribution function F_0.

11.7. Let F be the distribution function of a nonnegative random variable ξ. The function
F̃(s) := ∫_0^∞ e^{−st} dF(t) = E e^{−sξ}, s ≥ 0,
is called the Laplace–Stieltjes transform of F. Prove that Eξ^k < ∞ if and only if the function F̃(s) is k times differentiable at 0. Moreover, Eξ^k = (−1)^k (d^k F̃/ds^k)|_{s=0}.
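For a concrete distribution the statement of Problem 11.7 can be verified symbolically. A short SymPy sketch (an illustration, not part of the original problem): for the exponential law with rate λ the transform is F̃(s) = λ/(λ + s), and (−1)^k F̃^{(k)}(0) reproduces the moments k!/λ^k.

```python
import sympy as sp

s, t, lam = sp.symbols('s t lam', positive=True)
# Laplace-Stieltjes transform of the Exp(lam) law: integrate e^{-st} against its density.
F_tilde = sp.simplify(sp.integrate(sp.exp(-s * t) * lam * sp.exp(-lam * t), (t, 0, sp.oo)))
print(F_tilde)                                   # lam/(lam + s)
for k in range(1, 4):
    moment = sp.simplify((-1) ** k * sp.diff(F_tilde, s, k).subs(s, 0))
    print(k, moment, sp.factorial(k) / lam ** k) # both columns equal k!/lam^k
```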
11.8. Assume that a machine processes details in consecutive order. The distribution function of a service time is F. Find the Laplace–Stieltjes transform of (a) A moment of time when nth detail will be processed. (b) A moment of time when a defective detail starts processing, if the probability that any detail is defective is p. (c) Time duration until ξ details are processed, where ξ has a Poisson distribution with parameter λ . Suppose that details are processed independently of each other, and independently on ξ . 11.9. Assume that ζ , η1 , η2 , . . . are independent random variables, ζ takes values in Z+ , P(ζ = k) = pk , and random variables {ηk , k ∈ N} are i.i.d. and nonnegative. ζ Find the Laplace–Stieltjes transform of ∑k=1 ηk . 11.10. A router has accepted n independent packets, and processes them in consecutive order. Assume that each packet has a type A with probability p and type B with probability q = 1 − p, and processing times have distribution functions FA and FB , respectively. Find the Laplace–Stieltjes transform of a time duration necessary for all packets’ service. Particularly, find answers in the cases when (a) FA and FB have exponential distribution with parameters α and β , respectively; (b) FA and FB are distribution functions of constant random variables equal to α and β , respectively. 11.11. Generalize a result of Problem 11.10 to the case when m types of packets arrive with probabilities p1 , . . . , pm , and distribution functions of service times are F1 , . . . , Fm respectively. 11.12. Customers in a store are conventionally divided into m categories. Probability that a customer is of the kth category is equal to pk , and distribution function of his purchase equals Fk in this case. Assume that a cash desk breaks down with probability α while serving each customer. Find the Laplace–Stieltjes transform of money that a cash desk obtains until first breakdown. 11.13. Service time of a detail is a random variable with distribution function F1 . After a detail is processed, the machine does not work during a random time with distribution function F2 . This time is needed for machine supervision and loading a new detail. Let Z(t) be the probability that a machine is processing some detail at the moment of time t. - = ∞ Z(t) (a) Express Z(t) in terms of F1 , F2 . Find the Laplace transform Z(s) 0 e−st dt. (b) Find lim Z(t). t→∞
(c) Let L1 (t), L2 (t) be expectations of processed detail numbers and idle periods until the moment of time t, respectively. Express L1 , L2 and their Laplace–Stieltjes
11 Renewal theory. Queueing theory
165
-i (s) = ∞ e−st dFi (t) in terms of F1 , F2 and their Laplace–Stieltjes transtransforms L 0 forms. Suppose that the first detail started processing at the moment of time t = 0, intervals of service and idle times are independent, and F1 , F2 are nonarithmetical. 11.14. Consider the queueing system described in Problem 11.13. Let: W1 (t, x) be the distribution function of the residual time from t until the first moment when some detail was completely processed, W2 (t, x) be the distribution function of time from t until the moment when the detail arrived first after t was completely served, W3 (t, x) be the distribution function of time from t until the first start of service after t. (a) Express Wi (t, x) in terms of F1 , F2 . (b) Find limt→∞ Wi (t, x). (c) Find limit distributions in item (1) assuming that F1 and F2 have exponential distributions with parameters α1 , α2 , and in the case when F1 (x) = 1Ix≥β , F2 (x) = (1 − e−α x )1Ix≥0 . 11.15. Assume that α ∈ [0, 1) and a function z satisfies conditions of Theorem 11.2. Prove that equation Z(t) = z(t) + α
t 0
Z(t − u)F(du)
has a solution, this solution is unique in a class of locally bounded functions, and is of the form Z(t) =
t
0
z(t − u)U(du),
n ∗n where U = ∑∞ n=0 α F . Express the Laplace–Stieltjes transform of Z in terms of Laplace transform z(s) = 0∞ e−st z(t)dt and the Laplace–Stieltjes transform ∞ −st - = dF(t). F(s) 0 e
11.16. A device processes details in consecutive order. The time of each detail processing is a random variable with distribution function F. A probability that the detail is defective equals p ∈ (0, 1). Denote by H(t) a probability that a device has not processed any defective detail before a moment of time t. Let L1 (t), L2 (t), L3 (t) be the expectations of the number of details processed by a device before t if defective detail: (a) stops a device, and new details are not accepted, (b) is eliminated immediately, (c) is processed (with the same distribution as a quality detail), respectively. Find functions H, L1 , L2 , L3 and their Laplace transforms. Suppose that quality and processing of different details are independent, and the processing of the first detail started at moment zero. How the answer would change if the residual processing time of the first detail had distribution function F1 ? 11.17. Requests of two types arrive at one device. The distribution functions of the service times for these types of requests are G1 and G2 , respectively. At the instant
when the device becomes free, either a type 1 request arrives with probability p ∈ (0, 1) or a type 2 request arrives with probability q = 1 − p. Let: L(t) be an expectation of type 1 requests that have arrived before the moment of time t, H(t, x) be a distribution function of a time interval from t until the first arrival of a new request after t, H1 (t, x), H2 (t, x) be distribution functions of a time interval from t until the first arrival of a new request (type 1 or 2, respectively) after t. Find L, H, H1 , H2 and their Laplace–Stieltjes transforms. Suppose that the first request arrives at the moment of time 0. Type and service time are independent for different requests. 11.18. Service time of a detail has an exponential distribution with a parameter α . A processing of a new detail starts immediately after the previous detail was completely processed. However, if some detail is processing for more than a period of time β , then the machine overheats and stops. Let ξ be a moment when the machine stops. Find Eξ , Dξ , if the service of the first detail starts at the moment t = 0. 11.19. Solve Problem 11.18 if a service time has uniform distribution on [0, α ], α > β. 11.20. Patients with diseases of types 1, . . . , n come to the doctor and their arrivals form independent Poisson streams with rates λ1 , . . . , λn . Service time of a patient with a kth disease is a random variable with expectation βk . Prove that (a) if ∑nk=1 λk βk < 1, then with probability 1 there exists a moment of time when there are no patients in the queue (assume that the doctor is an “ideal” one, that is, works without breaks, days off, etc.) (b) The answer remains the same if patients with some diseases have higher priority than others (if a new patient with an urgent diseases arrives then a patient with lower priority who already started the examination can either be examined completely or re-examined later taking into account initial examination time). (c) Moments when no patients remain form a renewal process. Generalize the problem to the case when there are several doctors with the same qualification and specialization at the hospital. 11.21. A Poisson stream of requests arrives at the device. The rate of the stream equals λ > 0. The distribution function of service time of each request equals F, and the expectation equals μ . Assume that λ μ < 1. If the device is busy then the new request is added at the end of the queue. Denote by Π a distribution function of the busy period, that is, the length of time from the instant the request arrives at an empty system until the instant when the system is empty again. Prove that
- (s) = ∞ e−st d Π (t) satisfies the equation Π - (s) = (a) Laplace–Stieltjes transform Π 0 - (s)), Re s > 0. - +λ −λΠ F(s (b) 0∞ td Π (t) = μ/(1 − λ μ ). (c) 0∞ t 2 d Π (t) = 0∞ t 2 dF(t)(1 − λ μ )−3 . 11.22. Consider a queueing system from Problem 11.21, where service time is (a) Nonrandom and equals μ . (b) Exponentially distributed with parameter 1/μ . Find the first three moments and variance of the busy period. 11.23. Patients’ arrival to the doctor forms a Poisson stream with rate λ > 0. The expectation of the service time equals μ , λ μ < 1. Find a probability that a new patient catches a doctor free, if the doctor started her work a long enough time ago. 11.24. Assume that a device accepts n independent Poisson streams of requests with rates λ1 , . . . , λn respectively. Assume that a distribution function of the service time of requests from the kth stream equals Fk , ∑nk=1 λk μk < 1. Suppose also that all requests have the same priority. Prove that Laplace–Stieltjes transformation of the busy period satisfies the equation - (s) = λΠ
n
∑ λk F-k (s + λ − λ Π- (s)),
k=1
where λ = λ1 + · · · + λn . Find the expectation of the busy period. 11.25. Consider a queueing system from Problem 11.24 under the assumption that requests from different streams have different priorities. Consider the following cases. (a) A higher priority request interrupts a lower priority request, and the lower priority request is added to the queue. The service time of the lower priority request resumes at the point where it was interrupted. (b) If the service of a lower priority request has been already started, then a higher priority request does not interrupt it. The service of a higher priority request starts before all lower priority requests from the queue. Write an equation for the Laplace–Stieltjes transformation of the busy period distribution function. Find the expectation of the busy period. 11.26. Solve Problem 11.25 if the service time of a lower priority request does not resume at the point where it was interrupted, and service starts from the very beginning. 11.27. A telephone exchange has an infinite number of channels. Assume that request arrivals follow a Poisson process with rate λ . The service time for each request has distribution function F. Let X(t) be the number of busy channels at the time moment t, X(0) = 0. Find P(X(t) = k), EzX(t) , k ≥ 0, |z| < 1. 11.28. A device accepts a Poisson stream of requests with a rate λ > 0. The service time for each request has the distribution function F. If a device is busy then a new
request is added at the end of the queue. Let tn , n ≥ 1 be the moment when the nth request is processed; Xn is a number of requests in the system just after tn . Prove that {Xn , n ≥ 1} is a homogeneous Markov chain with transition matrix , , , p0 p1 p2 . . ., , , , p0 p1 p2 . . ., , , , P=, , 0 p0 p1 . . ., , , 0 0 p0 . . ., , , ,. . . . . . . . . . ., where pk =
∞ (λ x)k −λ x e dF(x). 0
k!
∞
Prove that if λ μ < 1, where μ = 0 xdF(x), then this Markov chain is ergodic n and generating the function π (z) = ∑∞ n=0 πn z , |z| < 1 of the stationary distribution equals - λ (1 − z)) (1 − λ μ ) (1 − z)F( π (z) = . - λ (1 − z)) − z F( 11.29. Denote by Sn the nth hitting moment of some state i by a Markov chain (possible Sn = ∞). (a) Prove that {Sn } is a delayed renewal process. (b) Prove the ergodic theorem for the Markov chain (Theorem 10.3). 11.30. A device accepts a Poisson stream of requests with a rate λ . The expected service time is μ , λ μ < 1. Prove that a generating function of virtual waiting time w(θs ) stopped at the moment θs , where θs has exponential distribution with rate s > 0, is of the form: ∞ s − z p0 (s) (z, s) = Ee−zw(θs ) = s . ω e−st ω (z,t)dt = s − k(z) 0 Use (11.5) and prove identity
- (s) 1 − λ Π s − z + s
, P(w(θs ) = 0) = p0 (s). (z, s) = ω s − k(z) s + λ 1 − Π - (s) 11.31. Assume that λ μ < 1 in Problem 11.30. Denote by w∗ the weak limit of virtual waiting time limt→∞ w(t). Use the last identity of the previous problem and prove that the generating function of the limit virtual waiting time is z(1 − λ μ ) 1−λμ ∞ ∗ (z) := Ee−zw∗ = w , = 1 − λ 0∞ e−zx F(x)dx z − λ (1 − 0 e−zx dF(x)) where F(x) = P(ξk > x), x > 0. Find probability p+ = P(w∗ = 0) and show that the last identity is equivalent to the classical Pollaczek–Khinchin formula −1 ∞ −zw∗ −1 −zx = p+ 1 − q+ μ e F(x)dx , q+ = 1 − p+ = λ μ . Ee 0
11.32. A server accepts a Poisson stream of packets with rate λ > 0. The service time of each packet is nonrandom and equals τ , λ τ < 1. If the server is busy then the new packet is added at the end of the queue. Find the Laplace–Stieltjes transformation and expectation of the waiting time of an arbitrary packet in stationary conditions. 11.33. A router accepts a Poisson stream of requests with rate λ > 0. Each request contains 1 symbol with probability p and (m + 1) symbols with probability q = 1 − p. The service time of one symbol is nonrandom and equals τ . Find the Laplace–Stieltjes transformation and expectation of the waiting time experienced by an arbitrary request in the stationary mode. It is assumed that the capacity of the queue is infinite. 11.34. Find a cumulant function ψ (α ) = ln Eeiαζ (1) for a service process N(t) ζ (t) = ∑k=1 ξk − t, where the characteristic function of service time equals ϕ (α ) = 1/(1 − iα )2 . Find Eζ (1). For which λ does the stationary mode exist? 11.35. Consider the queueing system with a process ζ (t) from the previous problem. Use the Pollaczek–Khinchin formula (see Problem 11.31) to find the generating function of stationary virtual waiting time ω∗ when λ < 12 , (F(x) = 12 (1 + x)e−x , x > 0). If λ = 14 then expand w˜ ∗ (z) into linear-fractional functions, find Laplace transformations, and find the distribution of w∗ .
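Problems 11.31–11.32 can also be probed by simulation. The sketch below is illustrative only (the parameter values λ = 0.5, τ = 1 are an assumption): it runs the standard Lindley recursion for the M/D/1 queue and compares the average waiting time with the mean obtained from the Pollaczek–Khinchin formula, λτ²/(2(1 − λτ)).

```python
import random

def average_wait_md1(lam, tau, n=200000, seed=1):
    """Lindley recursion W_{k+1} = (W_k + tau - A_{k+1})^+ for the M/D/1 queue
    of Problem 11.32 (Poisson arrivals with rate lam, constant service tau)."""
    rng = random.Random(seed)
    w, total = 0.0, 0.0
    for _ in range(n):
        w = max(0.0, w + tau - rng.expovariate(lam))  # wait of the next packet
        total += w
    return total / n

lam, tau = 0.5, 1.0                               # lam * tau = 0.5 < 1
pk_mean = lam * tau ** 2 / (2 * (1 - lam * tau))  # stationary mean wait, = 0.5 here
print(average_wait_md1(lam, tau), pk_mean)
```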
Hints

11.1. Take z = 1I_{[0,h]} in Theorem 11.2.

11.2. Use item (c) and the previous problem in the solution of item (d). (e) If V(t) = μ^{−1}t, then (b) implies
F_0(t) = t/μ − (1/μ) ∫_0^t (t − y) F(dy).
Now integrate by parts. 11.5. Use the Lebesgue dominated convergence theorem. 11.6. (a) All formulas have the same form as in Problem 11.4 where a function U is replaced by V. (b) Use Problem 11.5. - −1 . 11.15. Use the Banach fixed point theorem. Z- = z(1 − α F) 11.19. See solution of Problem 11.18. 11.20. Use the strong law of large numbers. 11.21. Prove that Π (+∞) = 1 (see Problem 11.20), and prove that Π ∗n is a distribution function of a busy period if service of one request just started and there are n − 1 requests in the queue. Assume that the service of the first request has finished at the time moment u. A probability that n requests arrived during this period of time is equal to
e^{−λu}((λu)^n/n!). Thus Π(x) = ∑_{n=0}^∞ ∫_0^x e^{−λu}((λu)^n/n!) Π^{∗n}(x − u) F(du). All summands are nonnegative, so we may change the order of summation and integration:
Π̃(s) = ∑_{n=0}^∞ ∫_0^∞ e^{−su} e^{−λu} ((λu)^n/n!) Π̃(s)^n F(du) = ∫_0^∞ e^{−u(s+λ−λΠ̃(s))} F(du) = F̃(λ + s − λΠ̃(s)).   (11.11)
Note that we have not proved uniqueness of the solution (in fact, such uniqueness also holds). The functions Π̃ and F̃ are analytical when Re s > 0, and equalities (11.11) hold true for all real positive s. Therefore (11.11) holds true for all s, Re s > 0. Differentiate (11.11), pass to the limit as s → 0+, and find the moments of Π.

11.24. Use the reasoning of Problem 11.21 and the following fact. The probability that the request from the kth stream arrived earlier than the other requests equals λ_k/(λ_1 + ··· + λ_n).

11.29. Let z(n) = f_{ij}^{(n)}, F be the distribution function of the return time from a point j to itself. Use Theorem 11.2 with x = 0, h = 1.

11.30. P(w(θ_s) = 0) = lim_{z→∞} ω(z, s).

11.31. To find the generating function for w∗, pass to the limit as s → 0 in the relation for ω(z, s).

11.32, 11.33. Apply the Pollaczek–Khinchin formula (Problem 11.31).
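The moment formula of Problem 11.21(b) can also be checked by simulating busy periods directly. A rough sketch for the M/M/1 case (exponential service with mean μ, i.e. rate 1/μ); the code below is an illustration and not part of the original hints:

```python
import random

def mean_busy_period(lam, mu_mean, n=20000, seed=2):
    """Average of n simulated M/M/1 busy periods: Poisson arrivals with rate lam,
    exponential service with mean mu_mean (rate 1/mu_mean)."""
    rng = random.Random(seed)
    service_rate = 1.0 / mu_mean
    total = 0.0
    for _ in range(n):
        t, in_system = 0.0, 1          # a busy period starts with one request
        while in_system > 0:
            rate = lam + service_rate
            t += rng.expovariate(rate)
            if rng.random() < lam / rate:
                in_system += 1         # a new request joins the queue
            else:
                in_system -= 1         # the current request is served
        total += t
    return total / n

lam, mu_mean = 0.5, 1.0
print(mean_busy_period(lam, mu_mean), mu_mean / (1 - lam * mu_mean))  # both ~2
```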
Answers and Solutions 11.3. (a) Let ε > 0 and δ > 0 be such that P(T1 ≥ δ ) > ε . Use a sequence of i.i.d. random variables Tn = δ 1I{Tn ≥δ } and construct a renewal process Sn (S0 = 0) and process ν(t). Then P (ν(mδ ) = k) = Ckm ε m+1 (1 − ε )k−m ≤ const · km · (1 − ε )k . So the expectation Eeθ ν(mδ ) is finite when θ < − ln(1 − ε ). To complete the proof, observe that ν(t) is a monotone process and ν(t) ≥ ν (t). 11.4. (a) Assume that the last request before time moment t was served at the moment u. Then the conditional probability that the next request is served on the time interval (t,t + x] equals F(t + x − u) − F(t − u). So H(t, x) = ∑n 0t (F(t + x − u) − F(t − u))P(Sn ∈ du) = 0t (F(t + x − u) − F(t − u))U(du), t t (1 − F(t − u))U(du), L(t, x) = t−x (F(x) − F(t − u))U(du). G(t, x) = t−x (b) Due to Theorem 11.2 with z(u) = F(u + x) − F(u), z(u) = (1 − F(u))1Iu∈[0,x) , z(u) = (F(x) − F(u))1Iu∈[0,x) , we obtain the following limit distributions for H, G, L respectively.
μ^{−1} ∫_0^x (1 − F(u)) du;   μ^{−1} ∫_0^x (1 − F(u)) du;   μ^{−1} ∫_0^x u dF(u).
(c) (μ 2 + σ 2 )/(2μ ); (μ 2 + σ 2 )/(2μ ); (μ 2 + σ 2 )/μ . n. 11.8. (a) (F(s)) (b) p(1 − qF(s))−1 with q = 1 − p. - − 1)}. (c) exp{λ (F(s) ∞ k = G(ln - F(s)), 11.9. ∑k=0 pk (F(s)) where F is the distribution function of ηk , G is the distribution function of ζ . n 11.10. (pFA (s) +qFB (s)) . (a)
pα α +s
β + βq+s
n
.
(b) (pe−sα + qe−sβ )n . n 11.11. ∑m k=1 pk Fk (s) .
ξ
11.12. Profit equals ∑ j=1 η j , (∑01 = 0), where η j has the distribution function ∑k pk Fk , P(ξ = n) = αβ n , n ∈ Z+ β = 1 − α . Therefore ∞
ξ
Me−s ∑ j=1 ηs =
α
∑ Me−s ∑ j=1 η j αβ n = 1 − β ∑ n
n=0
k
pk F-k (s)
.
11.13. (a) Let F = F1 ∗ F2 be the distribution function of the time between starts (or ∞ ∗n endings) of processing t of two successive details, U = ∑n=0 F -. Then 1 − F1 (s) - = . Z(t) = (1 − F1 (t − u))U(du), Z(s) 0 s(1 − F1 (s)F-2 (s)) (b) Take z(x) = 1 − F1 (x) in Theorem 11.2 ∞ and obtain μ1 −1 lim Z(t) = (μ1 + μ2 ) (1 − F1 (u))du = . t→∞ μ 0 1 + μ2 (c) A function L1 is a renewal function of a pure renewal process with F = F1 ∗ F2 ; ∗n that is, L1 = ∑∞ n=0 F . A function L2 is a renewal function for a delayed renewal process, where the distribution function of delay equals F1 . 1 F-1 -2 = -1 = ; L . L 1 − F-1 F-2 1 − F-1 F-2 11.14. Epochs when details end their processing form a delayed renewal process. The first renewal epoch is the moment when the first detail finished processing, its distribution function equals F1 . Thus W1 (t, x) =
t 0
(F(t + x − u) − F(t − u))V (du),
where V = F1 ∗ U, U and F are defined in a solution of Problem 11.13 (see also Problems 11.4, 11.6). Analogously, W3 (t, x) = 0t (F(t + x − u) − F(t − u))U(du). −1
lim W1 (t, x) = lim W3 (t, x) = (μ1 + μ2 )
t→∞
t→∞
x 0
(1 − F(u))du =: W (x).
It is easy to see that W2 = W3 ∗ F1 . So, limt→∞ W2 (t, x) = W ∗ F1 (x). n ∗n (see Problem 11.15). Then 11.16. Denote U(t) := ∑∞ n=0 q F H(t) = q
t 0
(1 − F(t − u))U(du), L1 (t) = 1 − H(t).
172
11 Renewal theory. Queueing theory
Let {Sn , n ≥ 0} be epochs when the processing of details starts. Assume that u is the last instant prior to t when the processing of some detail started. Then the conditional expectation of a number of defective details arrived n at instant u equals ∑∞ n=0 np q = p/q (for L2 (t)), or p (for L3 (t)). So L2 (t) = t ∗n = ∑∞ (p/q) 0 (1−F(t −u))U(du), L3 (t) = p 0t (1−F(t −u))U(du), where U n=0 F is a renewal function for a process with nondefective details only. 11.17. Let G be the distribution function of the first arrival of the type 1 request. Then n ∗n G(t) = p + q 0t G(t − s)G2 (ds); that is, G = p ∑∞ n=0 q G2 , G = p/(1 − qG2 ). Denote by F = G1 ∗ G a distribution function of the time between two successive ∗n arrivals of type 1 requests, U = ∑∞ n=0 F . A problem to calculate L(t) is equivalent to the following one. Service time of a request has a distribution function F. After the service is complete a device is idle during a time with distribution function G. Then (see solution of Problem 11.13) p -= L = G ∗U, L -1 −qG -2 . 1−pG Analogously, to find H1 or H2 use the solution of Problem 11.14. Function H is defined by the formula from Problem 11.4 (b) with distribution function between successive requests equal to F = pG1 + qG2 . 11.18. Probability that a machine overheats during the first detail processing is equal to e−αβ . Therefore the distribution function of the machine’s stopping time satisfies the equation Z(x) = z(x) +
x 0
Z(x − y)α e−α y 1Iy∈[0,β ) dy
= z + (1 − e−αβ )Z ∗ F,
(11.12)
= α e−α x (1 − e−αβ )−1 1Ix∈[0,β ) dx. where z(x) = e−αβ 1Ix≥β , d F(x) Let us integrate the left-hand and right-hand sides of (11.12) with respect to x. Then e2αβ − 1 − 2αβ eαβ eαβ − 1 , DX = . EX = α α2 11.22. (a) μ /(1 − λ μ ); μ 2 /(1 − λ μ )3 ; (2λ μ + 1)μ 3 /(1 − λ μ )5 ;σ = λ μ 3 /(1 − λ μ )3 . (b) μ (1 − λ μ )−1 ; 2μ 2 (1 − λ μ )−3 ; 6μ 3 (1 + λ μ )(1 − λ μ )−5 ; σ 2 = (1 + λ μ )μ 2 (1 − λ μ )−3 . 11.23. One can use the result of Problem 11.13 (b), where F1 has exponential distribution with a rate λ , and F2 is a busy interval. Then the corresponding probability equals (see also Problem 11.21 (b)) ∞
∞ 0
xdF1 (x) = xdF1 (x) + 0∞ xdF2 (x)
1 λ
0
1 λ
+ 1−μλ μ
= 1 − λ μ.
11.24. (∑nk=1 λk μk ) (∑nk=1 λk )−1 (1 − ∑nk=1 λk μk )−1 . 11.25. The answer is the same as in Problem 11.24. 11.27. Let N(t) be the number of requests that arrived before t. It can be proved (see Problem 5.17) that conditional distribution of call moments τ1 , . . . , τn given N(t) = n is equal to the distribution of division points of [0,t] by a system of n independent uniformly distributed in [0,t] random variables.
11 Renewal theory. Queueing theory
173
The probability that a call arrived at the moment u ∈ [0,t] will be serving at moment t equals 1 − F(t − u). So t k t n−k k −n (1 − F(u))du F(u)du , P(X(t) = k/N(t) = n) = Cn t 0
n ≥ k. This implies that
0
t
P(X(t) = k) =
k
0 (1 − F(u))du
k!
EzX(t) = e−λ (1−z)
t
e−λ
t
0 (1−F(u))du
0 (1−F(u))du
;
.
11.34. ψ (α ) = λ (ϕ (α ) − 1) − iα = iα λ (2 − iα )(1 − iα )−2 − 1 ; E ζ (1) = 2λ − 1. %−1 $ 11.35. Ee−zω∗ = (1 − 2λ ) 1 − λ (2 + z)(1 + z)−2 , q+ = 1 − p+ = 2λ as λ < 12 . If λ = 14 , then p+ = 12 , $ % √ √ 5 1 5 −x(7− 17)/8 −x(7+ 17)/8 1+ √ = + 1− √ e e P(ω∗ > x) = 4 17 17 * ) √ √ √ 17x 5 17 17x −7x/8 1 ch + sh , x > 0. =e 2 8 34 8
12 Markov and diffusion processes
Theoretical grounds Let T ⊂ R, (Ω , F, {Ft }t∈T , P) be a probability space with complete filtration. Let X = {X(t),t ∈ T} be an adapted stochastic process taking values in some metric space (X, X), which sometimes is called the phase space of the process X. Definition 12.1. Process X is called a Markov process (MP) with respect to filtration {Ft }t∈T if for any s ∈ T and for any event A ∈ F≥s := σ {X(u), u ≥ s, u ∈ T} the following equality holds P(A/Fs ) = P(A/X(s)) P-a.s.
(12.1)
If equality (12.1) holds for some filtration then it holds for natural filtration X can be interpreted as the {FtX }t∈T as well. The sets from σ -algebras FsX and F≥s events from the “past” and from the “future” of the process X (with respect to the “present” moment s of time). There exist some equivalent characterizations of the Markov property (12.1). Theorem 12.1. A stochastic process X is a Markov process if and only if one of the following properties holds. (1) For any s ∈ T and for any bounded F≥s -measurable real random variable F : Ω → R P-a.s. the following equality holds, E(F/Fs ) = E(F/X(s)). (2) For any s ≤ t, s,t ∈ T and for any bounded B(X)-measurable function f : X → R P-a.s. the following equality holds, E( f (X(t))/Fs ) = E( f (X(t))/X(s)). (3) For any s ≤ t, s,t ∈ T and for any B ∈ B(X), P-a.s. the following equality holds, P(X(t) ∈ B/Fs ) = P(X(t) ∈ B/X(s)). The stochastic process X is Markov with respect to natural filtration if and only if one of the following equivalent properties holds,
(4) For any m ≥ 1 and any s1 < · · · < sm < s ≤ t, s1 , . . . , sm , s,t ∈ T, and for any bounded X-measurable function f P-a.s. the following equality holds, / / E f (X(t)) X(s1 ), . . . , X(sm ), X(s) = E f (X(t)) X(s) . (5) For any s ≤ t, s,t ∈ T and for any set B ∈ X P-a.s. the following equality holds, / / P X(t) ∈ B X(s1 ), . . . , X(sm ), X(s) = P X(t) ∈ B X(s) . (6) For any s ∈ T and for any sets A ∈ Fs , B ∈ F≥s P-a.s. the following equality holds, P(A ∩ B/X(s)) = P(A/X(s))P(B/X(s)). Theorem 12.2. Let {X(t),t ∈ T} be an n-dimensional stochastic process with independent increments. Then it is a Markov process with respect to natural filtration. Definition 12.2. A function P(s, x,t, B), s ≤ t, x ∈ X, B ∈ X taking values in [0, 1] is called a transition function, or a Markov transition function, if it satisfies the following conditions. (1) For s, x,t fixed the function P(s, x,t, · ) is a probability measure on (X, X), (2) For s,t, B fixed the function P(s, · ,t, B) is X – B(R)-measurable, (3) P(s, x, s, B) = 1Ix∈B for any s ∈ T, x ∈ X, B ∈ X, (4) For any s < u < t (s, u,t ∈ T), x ∈ X, B ∈ X this function satisfies the Kolmogorov–Chapman equations P(s, x,t, B) =
X
P(s, x, u, dy)P(u, y,t, B).
Remark 12.1. It follows from condition (3) that the transition function satisfies the Kolmogorov–Chapman equations for any s ≤ u ≤ t. Definition 12.3. We say that Markov process X = {X(t),t ∈ T} has transition function, or transition probability P(s, x,t, B), if for any s ≤ t, s,t ∈ T and any B ∈ X, P(X(t) ∈ B/X(s)) = P(s, X(s),t, B) P-a.s., or, what is the same, P(s, x,t, B) = P(X(t) ∈ B/X(s) = x) for almost all x with respect to the distribution X(s). In the case when the phase space is countable, the transition function exists for an arbitrary Markov process (see Problems 12.3 and 12.4). Consider one more form of the Kolmogorov–Chapman equations: for s ≤ u ≤ t, B ∈ X, and for almost all x with respect to the distribution of X(s), P(s, x,t, B) = E P(u, X(u),t, B)/X(s) = x . Definition 12.4. The Markov process X = {X(t),t ∈ T} with the values in the space (X, X) and with transition function P(s, x,t, B) is called a homogeneous Markov process (HMP) if P(s, x,t, B) depends only on the difference t − s for any B ∈ X, x ∈ X.
For homogeneous Markov processes with T = R+ or Z+ , it is sufficient to consider only the functions P(x,t, B) := P(0, x,t, B), t ∈ T. If we have HMP, then the conditions (1)–(4) from Definition 12.2 of the transition function can be rewritten in the form (1 ) For any fixed x,t the function P(x,t, · ) is a probability measure on X, (2 ) For any fixed t, B the function P( · ,t, B) is X – B(R)-measurable, (3 ) P(x, 0, B) = δx (B) for any x, B, (4 ) For any x, s,t, B this function satisfies the Kolmogorov–Chapman equations: P(x, s + t, B) =
X
P(x, s, dy)P(y,t, B).
Definition 12.5. The transition probability P(t, x, B) of HMP X is called measurable if for any B ∈ X it is a measurable function of the pair of variables (t, x) on the product of the spaces (R+ , B(R+ )) × (X, X). HMP X is called weakly measurable if its transition function is measurable. Denote by B = B(X) the Banach space of all real-valued Borel functions f : X → R with the norm f = supx∈X | f (x)|. Define on B the family of operators {Tt ,t > 0} via the formula f (y)P(t, x, dy), T0 f = f . (12.2) Tt f (x) = X
Then it follows from the measurability of P(t, x, · ) with respect to x that the inclusion holds Tt f ∈ B, for f ∈ B; that is, Tt : B → B. The family of linear operators {Tt ,t ≥ 0} creates the semigroup; that is, for any t, s ≥ 0, Tt+s = Tt Ts
(12.3)
(see Problem 12.31). In what follows, we consider only weakly measurable HMP. Define the resolvent operator {Rλ , Re λ > 0} of HMP X as the family of operators on B of the form Rλ f (x) =
∞ 0
e−λ t
X
f (y)P(t, x, dy)dt =
∞ 0
e−λ t Tt f (x)dt.
The measurability of the transition probability provides that the integral is well defined. If we denote BC the space of complex-valued bounded Borel functions f : X → C with the uniform norm f = supx∈X | f (x)|, then Rλ : B → BC . The principal properties of resolvent operators are the following ones. (1) For any λ = μ , the resolvent equation holds: Rμ − Rλ Rλ Rμ = , λ −μ (2) Rλ 1I = (1/λ )1I and for any λ > 0, here 1I stands for the function that equals 1 identically; for f ∈ B+ := { f ∈ B f ≥ 0 on X} we have Rλ f ∈ B+ , (3) Rλ ≤ 1/Re λ . Denote B0 the set of those functions from the space B, for which limt↓0 Tt f − f = 0.
Lemma 12.1. (1) The set B0 is a closed linear subspace in the B. (2) The operators Tt transfer B0 into B0 . (3) For any f ∈ B we have that Rλ f ∈ B0 . (4) For any f ∈ B0 we have that limλ →+∞ λ Rλ f − f = 0. Definition 12.6. The operator Aϕ := limh↓0 h−1 (Th ϕ − ϕ ), defined for those functions ϕ , for which this limit exists in the sense of strong convergence in B, is called the infinitesimal operator, or generator of the semigroup {Tt ,t ≥ 0}, related to HMP X. The set of the functions ϕ mentioned above is denoted as DA . Evidently, DA ⊂ B0 . Theorem 12.3. For any f ∈ B0 the following equality holds, ARλ f = λ Rλ f − f , and for any ϕ ∈ DA the following equality holds, Rλ Aϕ = λ Rλ ϕ − ϕ . Remark 12.2. Definition 12.6 is a general definition of the generator of a semigroup. The generator A of semigroup {Tt ,t ≥ 0}, defined on B, satisfies the relation dTt f = ATt f , f ∈ B, (12.4) dt dTt f = Tt A f , f ∈ DA . dt
(12.5)
Theorem 12.4. (Hille–Yosida) Let B be a real Banach space, BC be its complex extension, and on BC the family of operators Rλ (Re λ > 0) be defined that satisfy the following assumptions. (1) Rλ ≤ 1/Re λ , (2) For λ > 0, Rλ (B) ⊂ B, (3) For any λ , μ , the resolvent equation holds: Rλ − Rμ = (μ − λ )Rλ Rμ , (4) For some λ > 0 the set Rλ (B) is dense in B. Then there exists the semigroup of linear operators {Tt ,t ≥ 0} on B, satisfying the conditions: (1) Tt Ts = Tt+s , t, s ≥ 0, (2) For any x ∈ B, Tt x − x → 0, t ↓ 0, (3) Tt ≤ 1, (4) For any x ∈ B, Rλ x = 0∞ e−λ t Tt xdt. Remark 12.3. The Hille–Yosida theorem asserts that any family of operators which has resolvent properties corresponds to some semigroup. In the general case this semigroup does not correspond to the transition function of some MP. For the assumptions on phase space and on resolvent operators that are sufficient for some semigroup to create the transition function of MP, see, for example [24], Vol. 2. Now, let T = R+ or Z+ , (X, X) be measurable space, and P(t, x, B) be the function satisfying the assumptions (1 )–(3 ). Let us have the stochastic process X(t, ω ), defined on Ω and with values in X. Denote σ -algebras F≥0 F≤t , generated by the sets of the form {X(s) ∈ B}, s ∈ T, and s ≤ t, correspondingly. Let {Px , x ∈ X} be the family of probability measures defined on the σ -algebra F = F≥0 .
Definition 12.7. The pair (X, P_·) is called a homogeneous Markov family (HMF) with transition function P(t, x, B) if for any t, h ∈ T, x ∈ X, B ∈ X we have P_x-a.s. that P_x(X(t + h) ∈ B / F_{≤t}) = P(h, X(t), B).
The Kolmogorov–Chapman equations (4′) are a consequence of the property mentioned above. For a given HMF (X, P_·) and every x_* ∈ X, the process X, considered on the probability space (Ω, F, P_{x_*}), is a HMP with the transition function P(t, x, B) such that X(0) = x_* a.s. Moreover, for a given probability measure μ on X, define the measure P_μ by
P_μ(A) = \int_X P_x(A)\, μ(dx), \qquad A ∈ F.
Then the process X, considered on the probability space (Ω , F, Pμ ), is a HMP with the transition function P(t, x, B) such that the distribution of X(0) is equal to μ . Therefore, the notion of HMF allows one to consider sets of Markov processes with the same transition function and various probability laws of the initial value X(0). When HMF has the same transition function with some HMP X, we say that this HMF corresponds to the process X. The expectation w.r.t. Px or Pμ is denoted by Ex or Eμ , respectively. In general, it is not a trivial problem to construct, for a given HMP, a HMF with the same transition function. Fortunately, for typical and most important processes this problem can be solved explicitly (see, e.g., Problem 12.23). In particular, the construction described in Problem 12.23 allows one to extend the definition of a Wiener process and consider a Wiener process with arbitrary distribution of its initial value W (0). Let (X, P· ) be HMF, and let filtration {Ft ,t ∈ T}, Ft ⊆ F≥0 be defined on the space Ω . Definition 12.8. The stochastic process X is called progressively measurable, if for any s ∈ T the restriction of X(·, ω ) on [0, s] × Ω is B([0, s]) × Fs -measurable. (See Problem 3.7.) Definition 12.9. HMF (X, P· ) is called a strong Markov family with respect to the filtration {Ft ,t ∈ T}, if the following conditions hold. (1) Stochastic process X is progressively measurable. (2) For any stopping time τ and for any t ∈ T, any x ∈ X and B ∈ X we have that Px (X(τ + t) ∈ B/Fτ ) = P(t, X(τ ), B). If, for a Markov process X, the corresponding HMF is a strong Markov family, then the process X is said to have a strong Markov property. Theorem 12.5. Let the process X(t) with HMF (X, P· ) have the trajectories that are continuous from the right, and the function F(t, x) := Ex f (X(t)) satisfy the relation: F(s, y) → F(t, x) for s ↑ t, y → x and for any continuous bounded function f . Then HMF (X, P· ) is a strong Markov family.
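Theorem 12.5 applies, in particular, to the Wiener process (see the next paragraph and Problem 12.24). The following simulation sketch, added here purely as an illustration and not part of the problem collection, probes the strong Markov property empirically at the hitting time τ of the level a = 0.5: the increment W(τ + t) − W(τ) should be N(0, t) and uncorrelated with τ. The grid, level, and sample sizes are arbitrary choices, so only rough agreement is expected.

```python
import numpy as np

rng = np.random.default_rng(0)
n_steps, dt, a, t_after, n_paths = 5_000, 1e-3, 0.5, 0.3, 2_000
k = int(t_after / dt)

increments, taus = [], []
for _ in range(n_paths):
    W = np.concatenate([[0.0], np.cumsum(np.sqrt(dt) * rng.standard_normal(n_steps))])
    hit = int(np.argmax(W >= a))               # first grid point with W >= a (0 if never hit)
    if W[hit] < a or hit + k >= len(W):
        continue                               # level not reached early enough; skip this path
    taus.append(hit * dt)
    increments.append(W[hit + k] - W[hit])

increments, taus = np.array(increments), np.array(taus)
print("mean, var of W(tau+t)-W(tau):", increments.mean(), increments.var(),
      "(expect 0 and", t_after, ")")
print("correlation with tau:", np.corrcoef(increments, taus)[0, 1], "(expect about 0)")
```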
For example, the m-dimensional Wiener process W(t), t ∈ R_+, has a strong Markov property (see Problem 12.24).
Definition 12.10. HMF (X, P_·) in the phase space (R^m, B(R^m)) is called a diffusion process if the following conditions hold.
(1) For the semigroup {T_t, t ≥ 0} generated by the transition probability P(t, x, B) of the process, the generator A is well defined on the space C^2_{fin}(R^m) of twice continuously differentiable functions with compact supports in R^m,
(2) There exist a continuous vector function (drift vector, or drift coefficient) b(x) = (b^i(x), 1 ≤ i ≤ m) and a continuous, symmetric, and positive definite matrix function (diffusion matrix, or diffusion coefficient) a(x) = (a^{ij}(x), 1 ≤ i, j ≤ m) for which
A f(x) = L f(x) := \frac{1}{2} \sum_{i,j=1}^m a^{ij}(x) \frac{\partial^2 f(x)}{\partial x_i \partial x_j} + \sum_{i=1}^m b^i(x) \frac{\partial f(x)}{\partial x_i}, \qquad f ∈ C^2_{fin}(R^m),
(3) The trajectories of X are continuous a.s.
Theorem 12.6. Let (X, P_·) be such a HMF on (R^m, B(R^m)) that for any ε > 0, uniformly in x ∈ R^m, the following conditions hold.
(1) P(t, x, R^m \ B(x, ε)) = o(t), t → 0,
(2) There exists a continuous vector function b(x) = (b^i(x), 1 ≤ i ≤ m) for which \int_{B(x,ε)} (y_i − x_i) P(t, x, dy) = b^i(x)\, t + o(t), t → 0,
(3) There exists a continuous matrix function a(x) = (a^{ij}(x), 1 ≤ i, j ≤ m) for which \int_{B(x,ε)} (y_i − x_i)(y_j − x_j) P(t, x, dy) = a^{ij}(x)\, t + o(t), t → 0.
(Here B(x, ε) denotes a ball of radius ε with the center at the point x.) Then the process X is a diffusion process, its generator A is defined on the space C^2_{uni}(R^m) of the bounded functions that are uniformly continuous together with their partial derivatives of the first and the second order, and on this space the generator equals L f.
Theorem 12.7. Let HMF (X, P_·) satisfy conditions (1)–(3) of Theorem 12.6 uniformly in x from any bounded set, and for any bounded set B ⊂ R^m let there exist a bounded set B′ ⊃ B such that P(t, x, B) = o(t) as t → 0 uniformly in x ∈ R^m \ B′. Then X is a diffusion process.
Theorem 12.8. Assume the generator A of some diffusion process is well defined on the subset of functions f ∈ C^2(R^m) for which
|f(x)| + \left| \frac{\partial f(x)}{\partial x_i} \right| + \left| \frac{\partial^2 f(x)}{\partial x_i \partial x_j} \right| ≤ φ(x), \qquad 1 ≤ i, j ≤ m,
where φ(x) → 0 as ‖x‖ → ∞, and coincides on this set with L. Also, assume the transition probability has the form P(t, x, B) = \int_B p(t, x, y)\, dy, where the density p : (0, ∞) × R^m × R^m → R_+ is continuous in all its variables together with ∂p/∂t, ∂p/∂x_i and ∂^2 p/∂x_i ∂x_j. Let, moreover,
|p(t, x, y)| + \left| \frac{\partial p}{\partial t}(t, x, y) \right| + \left| \frac{\partial p}{\partial x_i}(t, x, y) \right| + \left| \frac{\partial^2 p}{\partial x_i \partial x_j}(t, x, y) \right| ≤ c(t, y)\, φ(x),
where c : (0, ∞) × Rm → R+ is a continuous positive function. Then the transition probability density p satisfies the equation
\frac{\partial p(t, x, y)}{\partial t} = \frac{1}{2} \sum_{i,j=1}^m a^{ij}(x) \frac{\partial^2 p(t, x, y)}{\partial x_i \partial x_j} + \sum_{i=1}^m b^i(x) \frac{\partial p(t, x, y)}{\partial x_i}, \qquad (12.6)
or, in a brief form, ∂ p/∂ t = Lx p. Here index x means that the operator L is applied to the density p considered as a function of x with t, y fixed. Now, let the functions ai j ∈ C2 (Rm ), bi ∈ C1 (Rm ). Define the operator L ∗ , formally conjugate to L , by the formula L ∗ g(x) :=
L^* g(x) := \frac{1}{2} \sum_{i,j=1}^m \frac{\partial^2}{\partial x_i \partial x_j} \left( a^{ij}(x) g(x) \right) − \sum_{i} \frac{\partial}{\partial x_i} \left( b^i(x) g(x) \right),
and this equality relates to those smooth functions g for which L^* is correctly defined.
Theorem 12.9. Let (X, P_·) be a diffusion process with the generator L. Also, let the density p(t, x, y) of its transition probability have continuous partial derivatives ∂p/∂t, ∂p/∂y_i, and ∂^2 p/∂y_i ∂y_j. Then this density satisfies the equation
\frac{\partial p(t, x, y)}{\partial t} = \frac{1}{2} \sum_{i,j=1}^m \frac{\partial^2 \left( a^{ij}(y) p(t, x, y) \right)}{\partial y_i \partial y_j} − \sum_{i=1}^m \frac{\partial}{\partial y_i} \left( b^i(y) p(t, x, y) \right), \qquad (12.7)
or, in other terms, ∂p/∂t = L^*_y p.
Remark 12.4. Equation (12.6) is called the backward Kolmogorov equation; equation (12.7) is called the forward Kolmogorov equation, or the Fokker–Planck equation. Equation (12.6) is in fact equation (12.4), and (12.7) is in fact equation (12.5), rewritten in terms of parabolic partial differential equations.
Now, consider the m-dimensional stochastic differential equation
dX(t) = b(X(t))\, dt + σ(X(t))\, dW(t), \qquad X(0) = x ∈ R^m, \quad t ∈ R_+,
(12.8)
and suppose that its (homogeneous in t) coefficients satisfy conditions of Theorem 14.1, that are sufficient for existence and uniqueness of its strong solution (the elements of the theory of stochastic differential equations are given in Chapter 14). Let X = X(t, ω , x) be the solution of equation (12.8). Define the function P(t, x, B) := P(X(t, ω , x) ∈ B),
B ∈ B(Rm ).
Theorem 12.10. (1) For any t, h ∈ R_+, x ∈ R^m, and B ∈ B(R^m), P-a.s. the following equality holds: P(X(t + h, ω, x) ∈ B / F_t) = P(h, X(t, ω, x), B).
(2) If we put Ω̃ := Ω × R^m, X̃(t) = X̃(t, ω̃) = X(t, ω, x), F̃_{≥0} := σ{X̃(t), t ≥ 0}, F̃_{≤t} := σ{X̃(s), s ≤ t}, and define for measurable sets C ⊂ R^m × Ω, C ∈ F̃_{≥0}, the function P_x(C) := P{ω | (x, ω) ∈ C}, then the pair (X̃, P_·) is a HMF with transition probability P(t, x, B).
(3) Let A be the infinitesimal operator of the semigroup generated by HMF (X̃, P_·). Then C^2_{fin}(R^m) ⊂ D_A, and for f ∈ C^2_{fin}(R^m) we have the equality
A f(x) = L f(x) = \frac{1}{2} \sum_{i,j=1}^m a^{ij}(x) \frac{\partial^2 f(x)}{\partial x_i \partial x_j} + \sum_{i=1}^m b^i(x) \frac{\partial f(x)}{\partial x_i},
where ai j (x) = σ (x)σ T (x). Therefore, our HMF is a diffusion process.
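As a concrete, purely illustrative complement (not part of the book), the next sketch simulates a one-dimensional equation of the form (12.8) with the arbitrarily chosen coefficients b(x) = −x and σ(x) = 0.5 by the Euler scheme, and checks the generator numerically through the finite difference (E_x f(X(h)) − f(x))/h ≈ L f(x) for f(x) = x².

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative coefficients (arbitrary choices): b(x) = -x, sigma(x) = 0.5, and f(x) = x^2.
b = lambda x: -x
sigma = lambda x: 0.5
f = lambda x: x**2
Lf = lambda x: b(x) * 2 * x + 0.5 * sigma(x)**2 * 2    # L f = b f' + (1/2) sigma^2 f''

def euler_maruyama(x0, h, n_steps, n_paths):
    """Euler scheme for dX = b(X) dt + sigma(X) dW on [0, h], started at x0."""
    x = np.full(n_paths, x0, dtype=float)
    dt = h / n_steps
    for _ in range(n_steps):
        x += b(x) * dt + sigma(x) * np.sqrt(dt) * rng.standard_normal(n_paths)
    return x

x0, h = 0.7, 0.01
Xh = euler_maruyama(x0, h, 50, 400_000)
approx = (f(Xh).mean() - f(x0)) / h            # (T_h f(x0) - f(x0)) / h
print(approx, "vs  L f(x0) =", Lf(x0))         # the two numbers should be close
```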
Bibliography [9], Chapter VI; [38], Chapter IV, §5–6; [90], Chapters 8–11; [24], Volume 2; [25], Chapter I, §4, Chapter VIII; [27], Chapter 5; [18]; [19]; [40]; [51], Chapters 14 and 15; [70]; [79], Chapters 19, 25, 26, 29; [22], Chapter X; [46], Chapter 12, §12.1; [61], Chapters VII–VIII; [68], Chapters 8 and 13; [87].
Problems 12.1. Prove that a process, being a Markov one with respect to some filtration to which it is adapted, is also a Markov process with respect to its natural filtration (see Definition 3.4). 12.2. Prove that real-valued stochastic process X = {X(t), FtX ,t ∈ T} which consists of independent random variables (process with independent values) is a Markov process with respect to natural filtration. 12.3. Let phase space X of Markov process {Xt ,t ≥ 0} be at most a countable set; that is, let Xt be a Markov chain with continuous time. For such a process, put Xt := {i ∈ X| P(Xt = i) = 0},t ≥ 0; pi j (s,t) = P(Xt = j/Xs = i), s ≤ t, i ∈ Xs , j ∈ Xt . Prove that its transition probabilities pi j (s,t) satisfy the conditions: (a) pi j (s,t) ≥ 0, i ∈ Xs , j ∈ Xt , s ≤ t. (b) For any i ∈ Xs , 0 ≤ s ≤ t ∑ j∈Xt pi j (s,t) = 1. (c) pi j (s, s) = δi j , i, j ∈ Xs . (d) For any i ∈ Xs , j ∈ Xt , 0 ≤ s ≤ u ≤ t the equations hold, pi j (s,t) =
\sum_{k ∈ X_u} p_{ik}(s, u)\, p_{kj}(u, t).
12.4. In the framework of Problem 12.3 put pi j (s,t) = 0 for i ∈ Xs , j ∈ Xt and pi j (s,t) = pi0 j (s,t) for i ∈ Xs , j ∈ Xt , where i0 ∈ Xs is chosen arbitrarily. Prove that the properties (a)–(d) of transition probabilities mentioned in Problem 12.3 will still hold for such an “extension” of pi j (s,t). 12.5. Consider the “random broken line” Y (t) = (t − k)X(k) + (k + 1 − t)X(k + 1), where {X(k), k ≥ 0} is a real-valued Markov chain. Is it a Markov process? 12.6. Prove that the Wiener process with values in Rm is MP having a transition function and determine this function.
12.7. Prove that the Poisson process is MP having a transition function and determine this function. 12.8. Let {X(t),t ≥ 0} be a real-valued MP. Are the processes (a){X([t]),t ≥ 0}; (b) {[X(t)],t ≥ 0} Markov ones? 12.9. Let {X(t),t ≥ 0} be a real-valued MP with a transition function. Prove that Y (t) := (X(t),t) is a homogeneous MP. Determine the transition function of Y (t). 12.10. Prove that Gaussian process X = {X(t),t ≥ 0} is a Markov process if and only if for arbitrary 0 ≤ s1 < s2 < t conditional distributions of X(t) with respect to σ -algebras σ (X(s1 )) and σ (X(s1 ), X(s2 )) coincide. 12.11. Prove that Gaussian process X = {X(t),t ≥ 0} is a Markov process if and only if its covariance function satisfies the following conditions. (i) RX (s1 , s2 )RX (s2 , s3 ) = RX (s1 , s3 )RX (s2 , s2 ) for arbitrary 0 ≤ s1 < s2 < s3 , (ii) If RX (s, s) = 0 for some s ≥ 0, then RX (s1 , s2 ) = 0 for arbitrary s1 ≤ s ≤ s2 . 12.12. Let the homogeneous MP take its values in Rm and its transition function be such that P(x,t, B) = P(x + y,t, B + y) for any x, y ∈ Rm , t ≥ 0, B ∈ B(Rm ). Prove that this process has independent increments. 12.13. Let the transition function of HMP be equal to " −(y − m(s)x)2 # 1 dy. P(s, x, A) = √ exp 2σ 2 (s) 2πσ (s) A
Prove that Kolmogorov–Chapman equations hold if and only if m(t + s) = m(t)m(s) and σ 2 (t + s) = m2 (t)σ 2 (s) + σ 2 (t). (Note that if m, σ are continuous then either m(t) = eat , σ (t) = b(e2at − 1), a = 0, b > 0, or m(t) = 1, σ (t) = bt, b > 0). 12.14. Let {ξi , i ≥ 0} be i.i.d.r.v., ξi = ±1 with probabilities 12 , S0 = 0, Sn = ∑ni=1 ξi , Xn = max0≤k≤n Sk . Prove that X is not a Markov process with respect to the filtration generated by the process S. 12.15. Let {W (t),t ≥ 0} be a Wiener process, α , β > 0. The Ornstein–Uhlenbeck process is defined as Vt := e−β t W (α e2β t ), t ≥ 0. Is it a Markov process? A homogeneous Markov process? As for the Ornstein– Uhlenbeck process see also Example 6.9, Problems 6.12, 14.7, and 14.8. 12.16. Prove that the Brownian bridge is a nonhomogeneous Markov process and determine its transition function. 12.17. (Telegraph signal) Let {N(t),t ≥ 0} be the Poisson process with intensity λ > 0. Let the random variable ζ be independent of N and P{ζ = −1} = P{ζ = 1} = 12 . Put X(t) = ζ (−1)N(t) , t ≥ 0. Is this process Markov?
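A quick simulation of the telegraph signal from Problem 12.17 (added for illustration only; the intensity λ, the time points, and the sample size are arbitrary choices): it estimates E X(s)X(t), which for this process equals e^{−2λ(t−s)}.

```python
import numpy as np

rng = np.random.default_rng(7)
lam, s, t, n_paths = 1.5, 0.4, 1.0, 200_000

zeta = rng.choice([-1, 1], size=n_paths)             # the symmetric initial sign
N_s = rng.poisson(lam * s, size=n_paths)             # N(s)
N_inc = rng.poisson(lam * (t - s), size=n_paths)     # N(t) - N(s), independent of N(s)
X_s = zeta * (-1) ** N_s
X_t = X_s * (-1) ** N_inc                            # X(t) = zeta * (-1)^{N(t)}

print(np.mean(X_s * X_t), "vs", np.exp(-2 * lam * (t - s)))
```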
12.18. Let N be the Poisson process. Prove that the process X(t) = N(−t),t ∈ (−∞, 0] is Markov and determine its transition function. 12.19. Let W be the Wiener process. Prove that the process X(t) = W (−t),t ∈ (−∞, 0] is Markov and determine its transition function. 12.20. Let {X(t),t ≥ 0} be a Markov process. For T > 0 define the process {Y (s) = X(T − s), s ∈ [0, T ]}. Is the process Y Markov? Is Y homogeneous if X is homogeneous? Compare with Problem 5.37. 12.21. Let {X(t),t ≥ 0} and {Y (t),t ≥ 0} be real-valued Markov processes. Are the following processes Markov: {X(t) +Y (t),t ≥ 0} {X(t)Y (t),t ≥ 0}? 12.22. Give an example of a martingale that is not a Markov process and an example of a Markov process that is not a martingale. 12.23. Let W be an m-dimensional Wiener process. Put Ω = C(R+ , Rm ), F = B(C(R+ , Rm )), and for x ∈ Rm define the measure Px as the distribution in (Ω , F) of the random element x +W (·) (see Definition 16.1 and Problem 16.1). Verify that (W, P· ) is a HMF with the transition function P(t, x, B) = (2π t)−(m/2)
\int_B e^{-\|y-x\|^2/(2t)}\, dy.
12.24. (1) Prove the strong Markov property of the Wiener process. (2) Prove the following stronger version of the previous statement: if τ is a stopping time w.r.t. natural filtration {FtW } generated by the Wiener process W , then the stochastic process {V (t) := W (t + τ ) − W (τ ), t ≥ 0} is a Wiener process as well, and it is independent of σ -algebra FτW . 12.25. Let W (t) = (W1 (t), . . . ,Wm (t)) be an m-dimensional Wiener process, m ≥ 1, 1/2 2 and Xm (t) := ∑m be a Bessel process, that is, the radial component of k=1 Wk (t) the process W . Is this process Markov? 12.26. Let W be a Wiener process. (1) Assume that g : R → R is a bijection, and both g and g−1 are measurable functions. Prove that the process X(t) = g(W (t)), t ≥ 0 is a homogeneous Markov process and determine its transition function. (2) Prove that X(t) = |W (t)|,t ≥ 0 is a homogeneous Markov process and determine its transition function. (3) Are the processes from the previous two items strong Markov processes? 12.27. Let
P(t, x, A) = \begin{cases} P_W(t, x, A), & x ≠ 0, \\ δ_0(A), & x = 0, \end{cases} \qquad t ≥ 0, \ A ∈ B(R),
where PW is the transition function of Wiener process. Check that P satisfies the Kolmogorov–Chapman equations. Prove that the Markov process with transition function P is not a strong Markov process.
12.28. Let X be a homogeneous Markov process, and φ : X → R be a bounded measurable function. Prove that the two-component process Y(t) = (X(t), \int_0^t φ(X(s))\, ds), t ≥ 0, is also a homogeneous Markov process.
12.29. Let X be a homogeneous Markov process, and φ : X → R be a nonnegative measurable bounded function,
Q(t, x, A) := E_x \exp\Big\{ -\int_0^t φ(X(s))\, ds \Big\} 1I_A(X(t)), \qquad x ∈ X, \ A ∈ B(X).
Prove that Q is a substochastic transition function; that is, for Q all the conditions (1 )–(4 ) from the definition of a transition function are satisfied except the condition Q(t, x, X) = 1,t ≥ 0, x ∈ X, instead of which the following condition holds, Q(t, x, X) ≤ 1,t ≥ 0, x ∈ X. 12.30. Let X = R+ . Prove the following statements. (1) The function
Q_+(t, x, A) = \frac{1}{\sqrt{2π t}} \int_A \left( e^{-(y-x)^2/(2t)} + e^{-(y+x)^2/(2t)} \right) dy, \qquad t ∈ R_+, \ x ∈ X, \ A ∈ B(X),
is a transition function,
(2) The function
Q_-(t, x, A) = \frac{1}{\sqrt{2π t}} \int_A \left( e^{-(y-x)^2/(2t)} − e^{-(y+x)^2/(2t)} \right) dy, \qquad t ∈ R_+, \ x ∈ X, \ A ∈ B(X),
is a substochastic transition function.
The functions Q_+, Q_- are transition functions for the processes called the Brownian motion on R_+ with reflection and the Brownian motion on R_+ with absorption at the zero point, correspondingly.
12.31. Prove that the family of operators {T_t, t ≥ 0} defined by the equality (12.2) forms a semigroup, and moreover, ‖T_t‖ = 1.
12.32. (1) Prove that the resolvent R_λ is an operator-valued analytic function of λ for Re λ > 0, and \frac{d}{dλ} R_λ f = −R_λ^2 f. (2) Prove that for any λ with Re λ > 0 and any n ≥ 1, \frac{d^n}{dλ^n} R_λ f = (−1)^n n! \, R_λ^{n+1} f.
12.33. Prove Lemma 12.1.
12.34. Determine the semigroup generated by the m-dimensional Wiener process. Prove that the space C^2_{uni}(R^m) is contained in D_A, and for those functions A f = \frac{1}{2} Δ f, where Δ is the Laplace operator (Laplacian).
12.35. Let N be the Poisson process, X(t) = i^{N(t)}, t ≥ 0, where i = \sqrt{-1}. Prove that X is a homogeneous Markov process. Determine its semigroup, infinitesimal operator, and resolvent.
12.36. Determine the resolvent of the Poisson process. 12.37. Determine the resolvent of the Wiener process. 12.38. Let the diffusion process {X(t), Ft ,t ∈ R+ } with continuous drift coefficient μ (x) and continuous diffusion coefficient σ (x) = 0 start from the point x ∈ (a, b), and let τa (τb ) be the hitting time of the point a (point b) by this process. Denote u(x) = P(τb < τa X0 = x), a < x < b. We know that u ∈ C2 (R). Prove that the function u satisfies the following differential equation.
\frac{σ^2(x)}{2} u''(x) + μ(x) u'(x) = 0, \qquad a < x < b,
with boundary conditions u(a) = 0, u(b) = 1.
12.39. Consider a HMF corresponding to the m-dimensional Wiener process W (see the discussion after Definition 12.7 and Problem 12.23).
(1) Prove that for any random variable ξ for which the expectations below are correctly defined, the equality holds E_μ ξ = \int_{R^m} (E_x ξ)\, μ(dx).
(2) Let B(x, r) be the ball with the center x ∈ R^m and of radius r. Prove that the Wiener process starting from the point x exits the ball B(x, r) with probability 1.
(3) Let τ_r denote the exit time for the process W from the ball B(x, r). Find the distribution of the random variable W(τ_r) assuming that W(0) = x, and prove that E_x τ_r < ∞.
(4) Prove that E_x τ_r = r^2/m.
(5) Let φ be a bounded measurable function, f(x) = E_x φ(W(τ_r)). Using the strong Markov property of the Wiener process (Problem 12.24), prove that f(x) = \int_{S(x,r)} f(y)\, μ(dy), where S(x, r) is the spherical surface of the ball B(x, r), and μ is the unit mass uniformly distributed on S(x, r).
12.40. (Example of a local martingale that is not a martingale.) Let {W(t), t ∈ R_+} be a Wiener process with values in R^3, and W(0) = x_0 ≠ 0. Also, let the function h : R^3 \ {0} → R, h(x) = \|x\|^{-1}, x ∈ R^3 \ {0}.
(1) Prove that h(W(t)) is a local martingale but is not a martingale with respect to the natural filtration.
(2) Prove that E_{x_0}(h(W(t)))^2 < ∞ for all t ∈ R_+ and x_0 ∈ R^3, and moreover, sup_{t≥0} E_{x_0}(h(W(t)))^2 < ∞, where the expectation is under the condition that the process starts from the point x_0.
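Several of the statements above admit quick numerical sanity checks. For instance, the claim E_x τ_r = r²/m of Problem 12.39(4) can be tested by simulation; the sketch below is an added illustration with an arbitrary time step and sample size (discrete monitoring of the exit introduces a small upward bias), estimating E_0 τ_1 for m = 1, 2, 3.

```python
import numpy as np

rng = np.random.default_rng(2)

def exit_time_estimate(m, r=1.0, dt=1e-3, n_paths=2000):
    """Monte Carlo estimate of E_0 tau_r for an m-dimensional Wiener process."""
    x = np.zeros((n_paths, m))
    t = np.zeros(n_paths)
    alive = np.ones(n_paths, dtype=bool)
    while alive.any():
        k = alive.sum()
        x[alive] += np.sqrt(dt) * rng.standard_normal((k, m))
        t[alive] += dt
        alive[alive] = np.linalg.norm(x[alive], axis=1) < r   # still inside the ball?
    return t.mean()

for m in (1, 2, 3):
    print("m =", m, " estimate:", exit_time_estimate(m), " expected r^2/m =", 1.0 / m)
```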
Hints 12.1. Use Definition 12.1. 12.2. Use item (4) or (5) of Theorem 12.1. 12.6, 12.7. Use independence of increments of corresponding processes. 12.9. Consider the case when Ω = XT is the set of all the functions defined on T and taking their values in X, F is the σ -algebra generated by cylinder sets. Define a
:= R+ × X, and a new measurable space Ω = {ω = x(t) x(t) = new phase space X Define (t0 + t, x(t))}, where x ∈ XT , t0 ∈ R+ , with the corresponding σ -algebra F. on F the probability function Px(B), x = (t0 , x) as the product of measures λt0 and Pt0 ,x , where λt0 is a probability measure on the space of functions of the form gs (t) = s + t, s,t ∈ R+ , concentrated on the function gt0 (t), and Pt0 ,x are the measures on F corresponding to the original nonhomogeneous process that starts from the point x at t0 . Now it is necessary to check that the constructed process is homogeneous and determine its transition probabilities. 12.11. Use Problems 12.10 and 6.20. 12.15, 12.16. Use Problems 12.11 and 6.18. 12.18, 12.19. To prove the Markov property, use the symmetry of the property from item (6) of Theorem 12.1 with respect to time reversion. 12.24. In order to shorten the notation, consider the one-dimensional case; the multidimensional case can be treated quite analogously. (1) Use Theorem 12.5: the function F(t, x) from the formulation of this theorem 2 can be given explicitly, F(t, x) = (2π t)−(1/2) R f (y)e−(((y−x) )/2t) dy. Prove that this function is continuous in (t, x) ∈ (0, +∞) × R for every continuous bounded function f (in fact, even for every measurable bounded f ). (2) At first, prove that E1IA g(V ) = E1IA Eg(V ) for any k ≥ 1, 0 ≤ t1 ≤ t2 ≤ · · · ≤ tk , continuous function g : Rk → R, event A ∈ Fτ , and vector V := (V (t1 ),V (t2 ), . . . ,V (tk )). In this order consider the discrete approximations of the stopping time τ by the stopping times τn (ω ) = ∑∞j=1 ( j/2n )1IA j,n ,where A j,n = {( j − 1)2−n ≤ τ < j2−n )}. Deduce from here that for any event B ∈ B(Rm ) the equality P(A ∩ {V ∈ B}) = P(A)P(V ∈ B) holds. Check that the proof follows from the last equality. 12.27. Consider Markov moment τ = inf{t| X(t) = 0}. 12.28. For arbitrary s ≤ t we can decompose the variable I(t) := 0t φ (X(r)) dr into t the sum I(s) + I(s,t), in which the second term I(s,t) := s φ (X(r)) dr is an F≥s measurable variable. 12.29. Prove that the function Q admits the following probabilistic representation, t φ (X(s)) ds , Q(t, x, A) = Px X(t) ∈ A, ζ > 0
where ζ is a random variable independent of the process X and with a distribution Exp(1). Use this representation to prove the required statement. 12.30. The function Q+ is a transition function of the process |W |; see Problem 12.26. The function Q− admits the representation Q− (t, x, A) = Px (W (t) ∈ A, τ0 > t), where τ0 = inf{r|W (r) = 0} (prove this fact using the reflection principle for the Wiener process; see Problem 7.109). Use the above-mentioned probabilistic representations of the functions Q+ , Q− in order to verify the required properties of these functions. 12.31. Use the Kolmogorov–Chapman equation (4 ). 12.32. Use the resolvent equation. 12.33. Items (1), (2) follow directly from the definition of B0 . (3) Prove the following equality Th Rλ f −Rλ f = (eλ h −1)Rλ f −eλ h 0h Ts f e−λ s ds, whence Th Rλ f − Rλ f ≤ (eλ h − 1)Rλ f + h f , h ≥ 0.
(4) Use the Lebesgue dominated theorem and the fact that Tt = 1. 12.34. Use Theorem 12.6. 12.38. Let h > 0. Prove that u(x) = E u(Xh ) X0 = x + o(h), h → 0 + . Then assume that u ∈ C2 (R) and establish the following expansion with the help of the Taylor formula. u(Xh ) = u(x) + u (x)μ (x)h +
σ 2 (x) u (x)h + α (h), 2
where E(α (h)|X0 = x) → 0, h → 0+. 12.39. (4) Check that it is enough to consider the case x = 0. At first, use the self (t) := rW (t/r2 ) is a similar property of the Wiener process: a stochastic process W Wiener process as well. Consider the moment τ1 of the first exit of the trajectory of the process W from the ball B(0, 1) and the moment τr of the first exit of the trajectory from the ball B(0, r). Verify that τr = r2 τ1 , whence obtain the relaof the process W tion E0 τr = C · r2 , where C = E0 τ1 is some constant. This constant is finite; it follows from item (3). Calculate it for m = 2; for higher m the proof is similar but more technical. Let σ1 be the moment of the first hit of the surface x12 + x22 = r2 by the process W , and σ2 be the moment of the first hit of one of the lines x1 = ±1 by the process W . Prove that E0 σ1 = E0 σ2 − Eμ σ2 , where μ is the uniform distribution on the surface x12 + x22 = r2 . Use the fact that σ2 is the moment of the first hit of the boundary of the interval [−r, r] by the one-dimensional Wiener process (more exactly, by the second coordinate of W ). Prove that Ex σ2 = r2 − x2 . Use item (1) and prove that C = 12 for m = 2.
Answers and Solutions 12.3. We prove only item (d). We have the following equalities P(Xt = j,Xs =i) P(Xs =i) P(Xt = j,Xu =k,Xs =i) = ∑k∈Xu P(Xs =i)
pi j (s,t) = P(Xt = j/Xs = i) =
u =k,Xs =i) = ∑k:P(Xu =k,Xs =i)>0 P(Xt = j/Xu = k, Xs = i) P(XP(X s =i) = ∑k∈Xu P(Xt = j/Xu = k)P(Xu = k/Xs = i) = ∑k∈Xu pik (s, u)pk j (u,t).
12.5. No. √ 2 12.6. P(t, x, A) = P(x +W (t) ∈ A) = (1/ 2π t) A e−(((y−x) )/2t) dy. 12.7. P(t, x, A) = P(x + N(t) ∈ A) = ∑k:k+x∈A e−λ t (((λ t)k )/k!). 12.8. (a) Yes; (b) no. 12.10. Necessity follows from the definition of a Markov property. Let’s prove sufficiency. Denote for the fixed s < t the random variable Δ (s,t) = X(t) − E[X(t)|X(s)]. The theorem on normal correlation provides that E[X(t)|X(s)] is a linear function of X(s); that is, for arbitrary s1 , . . . , sm ∈ [0, s) the random variables Δ (t, s), X(s1 ), . . .
X(sm ) create a Gaussian system. Moreover, by our assumption, for arbitrary k = 1, . . . , m the random variables Δ (t, s) and X(sk ) are noncorrelated. At last, Δ (s,t) and X(s) are also noncorrelated. Therefore (Proposition 6.3) the random variable Δ (s,t) is independent of the vector (X(s1 ), . . . , X(sm ), X(s)). So, the random variable X(t) can be presented as the sum of random variable Δ (s,t), independent of X(s1 ), . . . , X(sm ), X(s), and the random variable E[X(t)|X(s)], which is measurable with respect to the σ -algebra σ (X(s1 ), . . . , X(sm ), X(s)). It means that the conditional distribution of X(t) with respect to σ (X(s1 ), . . . , X(sm ), X(s)) is a Gaussian distribution with the mean value E[X(t)|X(s)] + EΔ (s,t) and variance DΔ (s,t). This distribution equals the conditional distribution of X(t) with respect to σ -algebra σ (X(s)). 12.12. Let’s write the sequence of equalities that hold true for all the necessary = values of s,t, A: P(X(t + s) − X(t) ∈ A/Ft ) = P(X(t + s) − X(t) ∈ A/X(t)) P(X(t + s) − x ∈ A/X(t) = x){X(t)=x} = P(X(t + s) ∈ A + x/X(t) = x){X(t)=x} = P(x, s, x + A) = P(0, s, A), and the last probability is constant, whence the {X(t)=x}
proof follows. 12.13. Denote p(s, x, y) the density of transition probability and obtain the following relations.
I := p(s, x, z)p(t, z, y)dz R (y − m(t)z)2 (z − m(s)x)2 1 exp − dz exp − = 2πσ (s)σ (t) R 2σ 2 (s) 2σ 2 (t) ⎧ ⎫ ⎨ ( y − z)2 ⎬ 1 1 (z − m(s)x)2 m(t) = exp exp − − dz. σ 2 (t) ⎩ ⎭ m(t) 2πσ (s) σ (t) R 2σ 2 (s) 2 2 m(t) m (t) The right-hand side, up to the term 1/m(t), is the density at the point 0 of the sum of two independent Gaussian random variables y σ 2 (t) , . N m(s)x, σ 2 (s) and N − m(t) m2 (t) Therefore 4 I=
1 3 1 m(t) √ σ 2 (t) 2π σ 2 (s)+ 2
m (t)
=
√
2π
√
1
m2 (t)σ 2 (s)+σ 2 (t)
exp −
y ( m(t) −m(s)x)2
σ 2 (s)+
σ 2 (t) m2 (t)
# " 2 exp − m(y−m(s)m(t)x) 2 (t)σ 2 (s)+σ 2 (t) ,
whence the required statement follows immediately. 12.14. It is sufficient to check the following relations: P(X3 = 1/X2 = 0, S2 = 0) = 1 2 = P(X3 = 1/X2 = 0, S2 = −2) = 0, but for a Markov process these probabilities must be equal. 12.15. The Ornstein–Uhlenbeck process is a homogeneous Markov process. 12.16. The transition probability density equals 1/2 " (x(1 − s) − y(1 − t))2 # 1−s . p(s, x,t, y) = exp − 2(1 − t)(1 − s)(t − s) 2π (1 − t)(t − s)
12.17. The telegraph signal is a Markov process. 12.18. The transition probability density equals 4 3 s(y − st x)2 s exp − , s < t < 0, x, y ∈ R. p(s, x,t, y) = 2π t(t − s) 2t(t − s) 12.19.
t y
x! t x−y , s < t < 0, x, y ∈ Z+ , x ≥ y. s s y!(x − y)! 12.20. The answer for the first question is positive, and for the second one it is, generally speaking, negative. 12.21. The answers for both questions are, generally speaking, negative. 12.22. The example of a martingale that is not a Markov process can be described as X0 = X1 = 1, Xn+1 = Xn + Xn−1 · (εn /2n ), where εn are i.i.d. Bernoulli random variables; the example of a Markov process that is not a martingale can be described as W (t) + t, where W (t) is a Wiener process. 12.25. Yes. 12.26. Denote by {Ft } the filtration generated by the process W . For the arbitrary measurable function g we have the following equalities. P(s, x,t, {y}) =
1−
E[1IA (X(t))/Fs ] = E[1Ig−1 (A) (W (t)/Fs ] = PW (t − s,W (s), g−1 (A)), where PW is the transition function of a Wiener process (see Problem 12.6). In item (1), we have that W (s) = g−1 (X(s)), therefore the transition function of the process X equals P(x,t, A) = PW (g−1 (x),t, g−1 (A)), x ∈ R, t ≥ 0, A ∈ B(R). In item (2), the function g(x) = |x| is not a bijection, nevertheless, for any Borel set A ⊂ R+ and arbitrary t ∈ R+ the values of transition probability satisfy the relations PW (t,W (s), g−1 (A)) = PW (t,W (s), A) + PW (t,W (s), −A) =
−(((y−W (s))2 )/2t) 2 √1 e dy + √21π t −A e−(((y−W (s)) )/2t) dy 2π t A
=
√1 2π t A
−(((y−W (s))2 )/2t) 2 e + e−(((y+W (s)) )/2t) dy
and thus depend only on |W (s)|. It means that the transition function of the process |W | equals
2 2 1 e−(((y−x) )/2t) + e−(((y+x) )/2t) dy, x ∈ R+ , t ≥ 0, A ∈ B(R+ ). P(t, x, A) = √ 2π t A The processes g(W ) and |W | are strong Markov processes. 12.35. Let λ be the intensity of Poisson process, then ∞
(λ t)n k+n f (i ), t ≥ 0, A f (ik ) = λ [ f (ik+1 ) − f (ik )], n! n=0 Rμ f (ik ) = [(λ + μ )4 − λ 4 ]−1 (λ + μ )3 f (ik ) + (λ + μ )2 λ f (ik+1 ) + (λ + μ )λ 2 f (ik+2 ) + λ 3 f (ik+3 ) , μ > 0, k = 0, 1, 2, 3.
Tt f (ik ) =
∑ e−λ t
12.36. Let λ be the intensity of Poisson process, then Rμ f (k) = 12.37. Rλ f (x) =
∞
λn
∑ (λ + μ )n+1 f (k + n), μ > 0, k ∈ Z+ .
n=0
R Qλ (x, y) f (y) dy,
∞
√ 1 e−(((y−x) )/2t) √ dt = √ e− 2λ |y−x| . 0 2π t 2λ 12.39. (2) If the trajectory of a Wiener process stays inside the ball B(x, r) till the moment t = n, then all the increments W (1)−W (0),W (2)−W (1), . . . ,W (n)−W (n−1) do not exceed 2r in absolute value. These increments are jointly independent and Gaussian. Therefore, if τr is the moment of the first exit from the ball B(x, r), then Px (τr ≥ n) ≤ (P(|W (1) − W (0)| < 2r))n = pn , where p < 1. It means that P(τr = ∞) = 0. (3) It follows with evidence from item (1) that the random point W (τr ) is situated on the spherical surface S(x, r) that restricts the ball B(x, r). Because the distribution density of any increment W (t) −W (s) depends only on t − s and on the modulus of the vector W (t) − W (s), we have that the distribution of the Wiener trajectory does not change under any rotation of the whole space around the center of the ball for any angle. So, the distribution of the random point W (τr ) is invariant with respect to all the rotations of the spherical surface S(x, r). The unique distribution with this property is the uniform distribution for which the probability for W (τr ) to hit within some domain of the spherical surface is proportional to the area of this domain. It means that the distribution of W (τr ) is uniform on S(x, r). Let F(t) be the distribution function of τr . Then, according to item (2),
Qλ (x, y) =
2
e−λ t
∞ n 0 tdF(t) ≤ ∑n=1 n−1 tdF(t) n ∞ ≤ ∑∞ n=1 n n−1 dF(t) ≤ ∑n=1 nPx (τr
E x τr =
∞
n−1 < ∞. ≥ n − 1) ≤ ∑∞ n=1 np
(5) Let a Wiener process start from the point x. At the moment τr , according to item (2), the distribution of W (τr ) is uniform on S(x, r). According to the strong Markov property, the process W can be considered as a Wiener process with uniform initial distribution. Therefore, due to item (1), we obtain that Ex ϕ (W (τ )) = Eμ ϕ (W (τ )) = E ϕ (W (τ ))μ (dy). S(x,r) y 12.40. (1) Denote the increasing sequence of stopping times τk := inf{t > 0| W (t) ≤ 1/k}, k ≥ 1. Then τk → ∞ a.s., because in R3 P(W (t) = 0 for some t > 0) = 0. The function h is harmonic in R3 {0} and for any k ≥ 1 Dk := {x ∈ R3 | x > 1/k} ⊂ D(h). Consider the closure Dk and define on it the function gk (x) := Ex h(W (τk )), where Ex is an expectation under initial condition W (0) = x. Using the strong Markov property (12.6) of the Wiener process and Problem 12.39, item (6), we obtain that gk (x) = S(x,r) gk (y)dm(y) for all sufficiently small balls B(x, r) with the centers in the point x ∈ Dk and spherical surfaces S(x, r), where the measure m has unit mass and is concentrated on S(x, r) . Therefore, gk is harmonic in Dk . Moreover, it is continuous in Dk and gk |∂ Dk = h, where ∂ Dk = Dk Dk . From the principle of the maximum for harmonic functions we obtain that gk = h in Dk for any k. Furthermore, for any k ≥ 1, x ∈ Dk and t > 0
192
12 Markov and diffusion processes
Ex (h(W (τk ))/Ft ) = 1Iτk ≤t h(W (t ∧ τk )) + 1Iτk >t Ex (W (τk )/Ft ). (Here {Ft } is the filtration generated by the process W .) It follows from the strong Markov property that on the set {τk > t} we have the equality Ex (h(W (τk ))/Ft ) = EW (t) (h(W (τk ))) = gk (W (t)). Because gk = h in Dk , then, combining previous relations, we obtain that Ex (W (τk )/Ft ) = h(W (t ∧ τk )). Furthermore, W (0) = x0 , and x0 ∈ Dk for all sufficiently large k, and we obtain that {h(W (t ∧ τk )),t ∈ R+ } is a bounded martingale, whence {h(W (t)),t ∈ R+ } is a local martingale. But it is not a martingale. Indeed, for t > 0 and R > 2x0 we have that Ex0 h(W (t)) = (2π t)−3/2 ≤ (2π t)−3/2
R3
y−1 e−((y−x0
y≤R
y−1 dy +
2 )/2t)
dy
y>R
y−1 e−((y
2 )/8t)
dy
C1 R2 C2 → 0, ≤ + 3/2 R (2π t) if, at first, t → ∞, and then R → ∞. It means that Ex0 h(W (t)) = Ex0 (h(W (0))) = x0 −1 for large t. (2) Please, prove this statement on your own, similarly to previous estimates.
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
Theoretical grounds Let {W (t),t ∈ R+ } be a Wiener process, and {g(t), FtW ,t ∈ R+ } a stochastic process (recall that the previous notation means that g is adapted to a natural filtration {FtW } of the Wiener process). Let F W = σ {Wt ,t ≥ 0}. A process g is said to belong to the class Lˆ2 ([a, b]) if it is measurable and E ab g2 (s)ds < ∞. A process g belongs to the class Lˆ2 if it belongs to Lˆ2 ([0,t]) for all t ∈ R+ . Assume process g ∈ Lˆ2 ([a, b]) is simple, that is, has the form g(s) = g(tk ), s ∈ (tk ,tk+1 ], a = t0 < t1 < · · · < tn = b, g(a) = g0 . Then the stochastic integral of g on [a, b] with respect to the Wiener process is defined as ab g(s)dW (s) := ˆ b]) then there exists a sequence of sim∑n−1 k=0 g(tk )[W (tk+1 ) − W (tk )]. If g ∈ L2 ([a, ple processes gn ∈ Lˆ2 ([a, b]) such that E ab [g(s) − gn (s)]2 ds → 0, n → ∞. The Itˆo stochastic integral of a process g with respect to the Wiener process is defined by the formula I[a,b] (g) := ab g(s)dW (s) := l.i.m. ab gn (s)dW (s). Let g ∈ Lˆ2 ([0, T ]). Then g(·)1I·∈[0,t] ∈ Lˆ2 ([0, T ]) for every t ∈ [0, T ]. Define a collection of stochastic integrals It (g) = 0t g(s)dW (s) when t ∈ [0, T ] by a formula It (g) = I[0,T ] (g(·)1I·∈[0,t] ). Denote I[s,t] (g) = I[0,T ] (g(·)1I·∈[s,t] ), [s,t] ⊂ [0, T ]. (See Problem 13.7.) Theorem 13.1. Stochastic integral {It (g), t ∈ [0, T ]} of a process g ∈ Lˆ2 ([0, T ]) with respect to a Wiener process has the following properties. (1) It is linear with respect to the integrand: It (a f + bg) = aIt ( f ) + bIt (g), f , g ∈ Lˆ2 ([0, T ]). (2) It is additive with respect to the interval of integration: It ( f ) = Iu ( f )+I[u,t] ( f ) for all 0 ≤ u ≤ t. (3) It has a modification continuous in t ∈ [0, T ]. (4) It is a square integrable martingale with respect to a flow {Ft , t ∈ R+ }, and is isometric It (g)L2 (P) = gL2 ([0,t]) , g ∈ Lˆ2 .
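Before turning to the extensions below, here is a small numerical illustration (not part of the book; step and sample sizes are arbitrary) of the definition of the Itô integral as a limit of left-point sums: for g = W the sums converge to \int_0^T W dW = (W(T)^2 − T)/2, and the isometry of Theorem 13.1(4) gives E(\int_0^T W dW)^2 = \int_0^T s\, ds = T^2/2.

```python
import numpy as np

rng = np.random.default_rng(3)
T, n, n_paths = 1.0, 1000, 4000
dt = T / n
dW = np.sqrt(dt) * rng.standard_normal((n_paths, n))
W = np.cumsum(dW, axis=1)
W_left = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]])      # left endpoints W(t_k)

ito_sum = (W_left * dW).sum(axis=1)        # sum_k W(t_k) (W(t_{k+1}) - W(t_k))
exact = 0.5 * (W[:, -1]**2 - T)            # int_0^T W dW = (W(T)^2 - T)/2
print("mean square discretization error:", np.mean((ito_sum - exact)**2))

# Ito isometry: E (int_0^T W dW)^2 = int_0^T E W^2(s) ds = T^2 / 2
print(np.mean(ito_sum**2), "vs", T**2 / 2)
```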
It is known ([60] Chapter IV, §3) that each measurable adapted process has a progressively measurable modification. In the following we always consider a progressive modification for measurable adapted processes. The Itô integral can be extended to the class of measurable adapted processes with square integrable trajectories by the locality principle (Problem 13.10). We also call this extension the Itô integral. It keeps properties (1)–(3) of Theorem 13.1.
Let a process X be of the form
X(t) = X(0) + \int_0^t μ(s)\, ds + \int_0^t σ(s)\, dW(s), \qquad t ∈ R_+,
where σ is a measurable adapted process such that P(\int_0^t σ^2(s)\, ds < ∞) = 1, t ∈ R_+, and μ is measurable and adapted with P(\int_0^t |μ(s)|\, ds < ∞) = 1, t ∈ R_+. Then the process X(t) is said to have the stochastic differential dX(t) = μ(t)\, dt + σ(t)\, dW(t).
Theorem 13.2. (The Itô formula) Let a function F : [0, ∞) × R → R belong to the class C^1([0, ∞)) × C^2(R) of functions continuously differentiable in t ∈ [0, ∞) and twice continuously differentiable in x ∈ R. Suppose that the process {X(t), F_t, t ∈ R_+} has a stochastic differential dX(t) = f(t)\, dW(t) + g(t)\, dt. Then the following formula holds true P-a.s. for all t ∈ R_+:
F(t, X(t)) = F(0, X_0) + \int_0^t \frac{\partial F(s, X(s))}{\partial x} f(s)\, dW(s) + \int_0^t \left[ \frac{\partial F(s, X(s))}{\partial s} + \frac{\partial F(s, X(s))}{\partial x} g(s) + \frac{1}{2} \frac{\partial^2 F(s, X(s))}{\partial x^2} f^2(s) \right] ds.
The statement of Theorem 13.2 can be written in the following form. The process Y (t) = {F(t, X(t)),t ∈ R+ } has a stochastic differential
∂F ∂F 1 ∂ 2F (t, X(t))dt + (t, X(t))dX(t) + (t, X(t))(dX(t))2 , ∂t ∂x 2 ∂ x2 where (dX(t))2 is defined via the following formal “operating with differentials” rules: dt · dt = dt · dW (t) = dW (t) · dt = 0, dW (t) · dW (t) = dt. dY (t) =
Theorem 13.3. (The multidimensional Itô formula) Assume that X(t) = (X_1(t), …, X_m(t)) is a stochastic process, each coordinate X_i(t) has a stochastic differential dX_i(t) = μ_i(t)\, dt + σ_i(t)\, dW_i(t), where the W_i(t) are correlated Wiener processes with \langle W_i, W_j \rangle(t) = \int_0^t ρ_{ij}(s)\, ds, and F : [0, ∞) × R^m → R belongs to the class C^1([0, ∞)) × C^2(R^m). Then
F(t, X(t)) = F(0, X_0) + \int_0^t \Big[ \frac{\partial F(s, X(s))}{\partial t} + \sum_{i=1}^m \frac{\partial F(s, X(s))}{\partial x_i} μ_i(s) + \frac{1}{2} \sum_{i,j=1}^m \frac{\partial^2 F(s, X(s))}{\partial x_i \partial x_j} σ_i(s) σ_j(s) ρ_{ij}(s) \Big] ds + \sum_{i=1}^m \int_0^t \frac{\partial F(s, X(s))}{\partial x_i} σ_i(s)\, dW_i(s).
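The Itô formula is easy to test numerically along a single simulated path. The sketch below (an added illustration with F(x) = e^x and an arbitrary grid, not part of the book) checks the one-dimensional formula of Theorem 13.2 for X = W, i.e. f ≡ 1 and g ≡ 0, up to discretization error.

```python
import numpy as np

rng = np.random.default_rng(4)
T, n = 1.0, 200_000
dt = T / n
dW = np.sqrt(dt) * rng.standard_normal(n)
W = np.concatenate([[0.0], np.cumsum(dW)])

# Ito formula for F(x) = e^x:  e^{W(T)} - 1 = int_0^T e^{W} dW + (1/2) int_0^T e^{W} ds
lhs = np.exp(W[-1]) - 1.0
stoch = np.sum(np.exp(W[:-1]) * dW)            # left-point Ito sum
drift = 0.5 * np.sum(np.exp(W[:-1])) * dt
print(lhs, "vs", stoch + drift)                # agree up to discretization error
```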
Definition 13.1. Let {W (t), t ∈ R+ } be a one-dimensional Wiener process. The limit in probability
L(t, x) := \lim_{ε↓0} (2ε)^{-1} \int_0^t 1I_{W(s) ∈ (x−ε, x+ε)}\, ds = \lim_{ε↓0} (2ε)^{-1} λ^1\{ s ∈ [0, t] \,|\, W(s) ∈ (x − ε, x + ε) \}
(13.1)
is called the local time of the process W at the point x on the time interval [0, t]. The limit in (13.1) in fact exists both with probability 1 and in the mean square (see Problems 13.47 and 13.52). The notion of a local time was introduced by Lévy [56]; see also [13], [87]. Often it is said that the local time describes the relative time spent by the trajectory of the process W near the point x. Recall (see Problem 3.21) that the total time spent by this trajectory at the point x is a.s. equal to zero.
Theorem 13.4. (The Tanaka formula) For every (t, x) ∈ R_+ × R,
|W(t) − x| − |W(0) − x| = \int_0^t \mathrm{sign}(W(s) − x)\, dW(s) + L(t, x) \quad a.s.
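Both sides of the Tanaka formula can be approximated along one simulated trajectory; the sketch below (an added illustration, with arbitrary choices of t, x, ε, and grid) compares the occupation-time estimate (13.1) of L(t, x) with the quantity |W(t) − x| − |x| − \int_0^t sign(W(s) − x) dW(s).

```python
import numpy as np

rng = np.random.default_rng(5)
t_end, n, x, eps = 1.0, 400_000, 0.2, 0.01
dt = t_end / n
dW = np.sqrt(dt) * rng.standard_normal(n)
W = np.concatenate([[0.0], np.cumsum(dW)])

# occupation-time approximation of the local time, as in (13.1)
occupation = (np.abs(W[:-1] - x) < eps).sum() * dt / (2 * eps)
# Tanaka-formula approximation of the same quantity
tanaka = np.abs(W[-1] - x) - np.abs(0.0 - x) - np.sum(np.sign(W[:-1] - x) * dW)
print(occupation, "vs", tanaka)    # both approximate L(t_end, x), so they should be close
```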
The Tanaka formula can be considered as a generalization of the Itô formula for the function f(x) = |x|, which is not twice differentiable (see also [66] for further generalizations).
Theorem 13.5. (The Fubini theorem for stochastic integrals) Let {W(t), t ∈ R_+} be a Wiener process, let a function f = f(x, s, ω) : [a, b] × [c, d] × Ω → R be jointly measurable with respect to all arguments, f(x, s) ∈ L̂_2([a, b]) for all s ∈ [c, d], f(x, s) ∈ L̂_2([c, d]) for all x ∈ [a, b], and \int_a^b f^2(x, s)\, dx ∈ L̂_1([c, d]), that is, E \int_c^d \int_a^b f^2(x, s)\, dx\, ds < ∞. Then the stochastic integral \int_c^d f(x, s)\, dW(s) exists for each x ∈ [a, b], and, considered as a function of x, it is Lebesgue integrable on [a, b] with probability 1. Also the stochastic integral \int_c^d \big( \int_a^b f(x, s)\, dx \big)\, dW(s) exists, and a.s. the following equality holds true:
\int_a^b \left( \int_c^d f(x, s)\, dW(s) \right) dx = \int_c^d \left( \int_a^b f(x, s)\, dx \right) dW(s).
Assume that {M(t), Ft , t ∈ [0, T ]} is a square integrable martingale with trajectories in D([0, T ]) (see Theoretical grounds in Chapter 3). Let M(t) be its quadratic characteristic. Denote by Πb ([0, T ]) a space of simple functions g such that g(s) = g(tk ), s ∈ (tk ,tk+1 ] with some partition {0 = t0 < t1 < · · · < tn = T } and g(tk ) being a bounded Ftk -measurable random variable for every k = 0, . . . , n − 1. For g ∈ Πb ([0, T ]), the Itˆo stochastic integral of g w.r.t. martingale M is defined ) − M(tk )). as 0T g(s)dM(s) := ∑n−1 k=0 g(tk )(M(tk+1 Introduce a norm gM := (E 0T g2 (s)dM(s))1/2 in Πb ([0, T ]). The just defined correspondence I : g → 0T g(s)dM(s) is a linear and isometric mapping from Πb ([0, T ]) to L2 (Ω , F, P). Define the space Lˆ2 ([0, T ], M) as the completion of Πb ([0, T ]) in the norm · M , and extend the correspondence I to the whole space Lˆ2 ([0, T ], M). This extension is called the Itˆo stochastic integral w.r.t. M and is denoted by I. For a process g ∈ Lˆ2 ([0, T ], M) the Itˆo stochastic integral of g w.r.t. M is defined as the image of g under I, and is denoted by I(g).
A process I_t(g), t ∈ [0, T], can be introduced similarly to the Wiener case. This process satisfies properties (1) and (2) of Theorem 13.1. In addition,
E I_t(g_1) I_t(g_2) = E \int_0^t g_1(s) g_2(s)\, d\langle M \rangle(s), \qquad g_1, g_2 ∈ L̂_2([0, T], M),
E(I_t^2(g) − I_s^2(g) / F_s) = E\big( \int_s^t g^2(u)\, d\langle M \rangle(u) / F_s \big),
and if M ∈ M^{2,c} then I_·(g) ∈ M^{2,c}. For continuous square integrable martingales, the following version of the Itô formula holds true.
Theorem 13.6. Let M ∈ M^{2,c} and f ∈ C^2(R). Then
f(M(t)) = f(M(0)) + \int_0^t f'(M(s))\, dM(s) + \frac{1}{2} \int_0^t f''(M(s))\, d\langle M \rangle(s).
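Theorem 13.6 can be illustrated numerically in the same spirit as before. The sketch below (illustration only; the martingale M(t) = \int_0^t s dW(s) with ⟨M⟩(t) = t³/3 and the grid are arbitrary choices) checks the case f(x) = x², for which the formula reads M(t)² = 2\int_0^t M dM + ⟨M⟩(t).

```python
import numpy as np

rng = np.random.default_rng(6)
T, n = 1.0, 200_000
dt = T / n
s = np.linspace(0.0, T, n, endpoint=False)       # left endpoints of the grid
dW = np.sqrt(dt) * rng.standard_normal(n)

dM = s * dW                                      # M(t) = int_0^t s dW(s)
M = np.concatenate([[0.0], np.cumsum(dM)])
qv_T = np.sum(s**2) * dt                         # <M>(T) = int_0^T s^2 ds = T^3 / 3

# Theorem 13.6 with f(x) = x^2:  M(T)^2 = 2 int_0^T M dM + <M>(T)
lhs = M[-1]**2
rhs = 2 * np.sum(M[:-1] * dM) + qv_T
print(lhs, "vs", rhs)                            # agree up to discretization error
```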
Bibliography [9], Chapter VIII; [38], Chapter II; [64]; [13]; [90], Chapter 12, §12.1–12.3; [24], Volume 3, Chapter I, §2–3; [25], Chapter VIII, §1; [51], Chapter 19; [57], Chapters 4 and 5; [79], Chapters 30 and 32; [20], Chapters 11 and 12; [8], Chapter 7, §7.1–7.4; [46], Chapter 12, §12.4 and §12.6; [54], Chapter 3, §3.4; [61], Chapters III–IV; [68], Chapter 13, §13.1.1; [85], Chapters 6–8; [66].
Problems 13.1. Let f ∈ L2 ([0, T ]) be a real-valued nonrandom function, and {W (t), t ∈ [0, T ]} be the Wiener process. (1) Prove that I[0,T ] ( f ) = 0T f (s)dW (s) has a Gaussian distribution. Find its mean and variance. The integral of a nonrandom function is also called the Wiener integral, or the Itˆo–Wiener integral. (2) Prove that E exp{iuI[0,T ] ( f )} = exp{−(u2 /2) 0T f 2 (t)dt}. (3) Verify that for a nonrandom f ∈ L2 ([0, T ]) the Wiener integral 0T f (s)dW (s) coincides with the stochastic integral w.r.t. the process W considered as a process with orthogonal increments (Chapter 8, Problems 8.13 and 8.14) if f ∈ L2 ([0, T ]). 13.2. (1) Let {W (s), s ∈ [0,t]} be a Wiener process. Prove the following integration by parts formula. If a function f ( · , ω ) ∈ C1 [0, T ] for a.a. ω ∈ Ω , and f ∈ Lˆ2 , then t 0
(2) Prove that
f (s, ω )dW (s) = f (t, ω )W (t) − t
t 0 sdW (s) = tW (t) − 0 W (s)ds.
13.3. Prove that the random variables dent.
t
1 0
0
tdW (t) and
W (s) f (s, ω )ds. 1 0
(2t 2 − 1)dW (t) are indepen-
13.4. (1) Find 0t W (s)dW (s), where {W (s), s ∈ [0,t]} is a Wiener process. (2) Find variances the integrals 0t |W (s)|1/2 dW (s) and 0t (W (s) + s)2 dW (s). t of (3) Prove that 0 W 2 (s)dW (s) = 13 W 3 (t) − 0t W (s)ds.
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
197
13.5. Assume that process g ∈ Lˆ2 ([0, T ]) is continuous in the mean square sense. Prove that T n−1 (n) (n) (n) W ti+1 −W ti , g(t)dW (t) = l.i.m. ∑ g ti n→∞
0
i=0
(n)
(n)
where a sequence of partitions πn := {0 = t0 < · · · < tn = T } is such that |πn | → 0 as n → ∞. 13.6. Assume that process g ∈ Lˆ2 ([0, T ]). Prove that T 0
n−1
g(t)dW (t) = l.i.m. ∑ n n→∞
k=1
kT /n
(k−1)T /n
g(s)ds(W ((k + 1)T /n) −W (kT /n)).
13.7. Let g ∈ Lˆ2 ([0, T ]). (1) Prove that for each t ∈ [0, T ] a restriction of the process g on [0,t] lies in a class Lˆ2 ([0,t]). (2) Prove that 0t g(s)dW (s) = 0T 1I[0,t] (s)g(s) dW (s), t ∈ [0, T ] a.s. 13.8. (Localization of the stochastic integral) Let f , g ∈ Lˆ2 ([0, T ]). Denote A = {ω ∈ Ω | 0T ( f (t, ω ) − g(t, ω ))2 dt = 0}. Prove that 0t f (s)dW (s) = 0t g(s)dW (s), t ∈ [0, T ] almost surely on the set A. 13.9. Let f ∈ Lˆ2 ([0, T ]), τ be a stopping time. (1) Prove that the process fτ := f (·)1I[0,τ ] belongs to Lˆ2 ([0, T ]). (2) Denote J(t) = 0t f (s)dW (s), t ∈ [0, T ]. Prove that J(τ ∧T ) = 0T 1It≤τ f (t)dt := I[0,T ] ( fτ ) a.s. (in the left-hand side, the stopping time τ ∧T is substituted into the continuous stochastic process J). 13.10. Let the measurable adapted stochastic process f (t), t ∈ [0, T ] have square integrable trajectories. Prove that there exists a unique process I(t), t ∈ [0, T ] such that for any g ∈ Lˆ2 ([0, T ]) the equality I(t) = 0t g(s)dW (s), t ∈ [0, T ] holds almost & ' surely on the set ω ∈ Ω | 0T ( f (t, ω ) − g(t, ω ))2 dt = 0 . Stochastic process I(t), t ∈ [0, T ] is called the stochastic Itˆo integral of f w.r.t. W , and is denoted I(t) = 0t f (s)dW (s). 13.11. Assume that g(t), t ∈ [0, T ] is a measurable adapted process with square in kT /n tegrable trajectories. Prove that ∑n−1 k=1 n (k−1)T /n g(s)ds(W ((k + 1)T /n) −W (kT /n))
converges in probability as n → ∞ to
T 0
g(t)dW (t). Compare with Problem 13.6.
13.12. Assume that { f (t), t ∈ [0, T ]} is a continuous adapted process. Prove that the following convergence in probability holds T n−1 (n) (n) (n) W ti+1 −W ti , f (t)dW (t) = P-lim ∑ f ti 0
n→∞
i=0
(n)
(n)
where a sequence of partitions πn := {0 = t0 < · · · < tn = T } is such that |πn | → 0 as n → ∞. Compare with Problem 13.5.
198
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
13.13. Let πn = {0 = tn,0 < tn,1 < · · · < tn,n = 1} be a sequence of partitions such that |πn | → 0 n → ∞. Find limits in probability. (a) limn→∞ ∑n−1 k=0 W (tn,k )(W (tn,k+1 ) −W (tn,k )). )(W (tn,k+1 ) −W (tn,k )). (b) limn→∞ ∑n−1 k=0 W (t n,k+1 (c) limn→∞ ∑n−1 W (t n,k+1 + tn,k )/2 (W (tn,k+1 ) −W (tn,k )). k=0 13.14. Assume that a sequence of progressively measurable stochastic processes bn (t), t ∈[0, T ] is such that
(1) P 0T b2n (t)dt < ∞ = 1, n ≥ 0. (2) P lim 0T (bn (t) − b0 (t))2 dt = 0 = 1. n→∞
Prove that
T 0
P
bn (t)dW (t) −→
T 0
b0 (t)dW (t), n → ∞.
13.15. Let {Wn (t), t ∈ [0, T ], n ≥ 0} be a sequence of Wiener processes such that P
Wn (t) −→ W0 (t), n → ∞ for each t ∈ [0, T ]. Assumethat stochastic processes ξn (t), t ∈ [0, 1] are σ {Wn (s), s ≤ t}-adapted, respectively, P lim E
n→∞
1 0
1 2 0 ξn (t)dt < ∞
= 1, n ≥ 0 and
(ξn (t) − ξ0 (t))2 dt = 0. Prove the convergence of stochastic integrals 1
lim
n→∞ 0
ξn (t)dWn (t) =
1 0
ξ0 (t)dW0 (t)
in probability. T
13.16. Calculate E(
0
W n (t)dW (t))2 , n > 1.
13.17. Apply the Itˆo formula and find stochastic differentials of the processes. (a) X(t) = W 2 (t). b) X(t) = sint + eW (t) . (c)X(t) = t ·W (t). (d) X(t) = W12 (t)+W22 (t), where W (t) = (W1 (t),W2 (t)) is a two-dimensional Wiener process. (e) X(t) = W1 (t) ·W2 (t). (f) X(t) = (W1 (t)+W2 (t)+W3 (t))(W22 (t)−W1 (t)W3 (t)), where (W1 (t),W2 (t),W3 (t)) is a three-dimensional Wiener process. 13.18. Let f ∈ Lˆ2 , X(t) = 0t f (s)dW (s). Prove that process M(t) := X 2 (t) − t 2 0 f (s)ds, t ≥ 0 is a martingale.
13.19. Let W (t) = (W1 (t), . . . ,Wm (t)), t ≥ 0 be an m-dimensional Wiener process, f : Rm → R, f ∈ C2 (Rm ). Prove that t 1 t f (W (s))ds, f (W (t)) = f (0) + ∇ f (W (s))dW (s) + 2 0 0
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
199
2 2 where ∇ = ((∂ /∂ x1 ), . . . , (∂ /∂ xm )) is the gradient, and = ∑m i=1 (∂ /∂ xi ) is the Laplacian.
13.20. Find the stochastic differential dZ(t) of a process Z(t), if: (a) Z(t) = 0t f (s)dW (s), where f ∈ Lˆ2 ([0,t]). (b) Zt = exp{α W (t)}. (c) Zt = exp{α X(t)}, where process {X(t), t ∈ R+ } has a stochastic differential dX(t) = α dt + β dW (t), α , β ∈ R, β = 0. (d) Zt = X n (t), n ∈ N, dX(t) = α X(t)dt + β X(t)dW (t), α , β ∈ R, β = 0. (e) Zt = X −1 (t), dX(t) = α X(t)dt + β X(t)dW (t), α , β ∈ R, β = 0. 13.21. Let hn (x), n ∈ Z+ be the Hermite polynomial of nth order: 2 2 dn hn (x) = (−1)n e(x /2) n e−(x /2) . dx (1) Verify that h0 (x) = 1, h1 (x) = x, h2 (x) = x2 − 1, h3 (x) = x3 − 3x. (2) Prove that the multiply Itˆo stochastic integrals t t tn−1 1 ··· dW (tn ) · · · dW (t2 )dW (t1 ) , n ≥ 1 In (t) := 0
0
0
are well-defined. (3) Prove that
n!In (t) = t
n/2
hn
W (t) √ , t > 0. t
13.22. Let Δn (T ) = {(t1 , . . . ,tn )| 0 ≤ t1 ≤ · · · ≤ tn ≤ T }. For f ∈ L2 (Δn (T )), denote T t t2 n f ··· f (t1 , . . . ,tn )dW (t1 ) · · · dW (tn ) . In (T ) := 0
(1) Prove
0
0
that EInf (T )Img (T ) = 0 for any n = m, f ∈ L2 (Δn (T )), g ∈ L2 (Δm (T )), T tn t2 EInf (T )Ing (T ) = ··· f (t1 , . . . ,tn )g(t1 , . . . ,tn )dt1 · · · dtn 0 0 0
and
for any n ≥ 1, f , g ∈ L2 (Δn (T )). (2) Let 0 = a0 ≤ a1 ≤ · · · ≤ am ≤ T, ki ∈ Z+ , i = 0, . . . , m − 1, n = k0 + · · · + km−1 . Put f (t1 , . . . ,tn ) =
m−1 k0 +···+ki+1 −1
∏ i=0
Prove that Inf (T ) =
m−1
∏ i=0
∏
j=k0 +···+ki
(ai+1 − ai )ki /2 hki ki !
1I[ai ,ai+1 ) (t j ).
W (ai+1 ) −W (ai ) √ . ai+1 − ai
(3) Prove that for any m ∈ N, ai ∈ [0, T ], ki ∈ Z+ , i = 1, . . . , m there exist functions f j ∈ L2 ([0, T ] j ), 0 ≤ j ≤ n = k0 + · · · + km−1 and a constant c such that n
W (a1 )k1 · · ·W (am )km = c + ∑ I j j (T ). j=0
f
200
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
(4) Use the completeness of polynomial functions in any space L2 (Rn , γ ) where γ is a Gaussian measure (see, e.g., [5]), and prove that for any random variable ξ measurable w.r.t. σ (W (a1 ), . . . ,W (am )) there exists a unique sequence of functions { fn ∈ L2 (Δn (T )), n ∈ Z+ } such that ∞
ξ = f0 + ∑ I j j (T ).
(13.2)
f0 = Eξ , Eξ 2 = f02 + ∑ fn 2L2 (Δn (T )) .
(13.3)
f
j=0
Moreover
n≥1
(5) Prove that for any random variable ξ ∈ L2 (Ω , FTW , P) there exists a unique sequence of functions { fn ∈ L2 (Δn (T )), n ∈ Z+ } such that (13.2), (13.3) are satisfied. 13.23. Let X(t) be a.s. a positive stochastic process, dX(t) = X(t)(α dt + β dW (t)), α , β ∈ R. Prove that d ln X(t) = (α − 12 β 2 )dt + β dW (t), t ∈ R+ . 13.24. Assume that processes X(t) and Y (t), t ∈ R+ have stochastic differentials dX(t) = a(t)dt + b(t)dW (t), dY (t) = α (t)dt + β (t)dW (t). Prove that d(XY )t = X(t)dY (t) +Y (t)dX(t) + dX(t)dY (t) = (α (t)X(t) + a(t)Y (t) + b(t)β (t))dt + (β (t)X(t) + b(t)Y (t))dW (t). 13.25. Assume that processes X(t) and Y (t), t ∈ R+ have stochastic differentials dX(t) = adt + bdW (t), dY (t) = Y (t)( α dt + β dW (t)), α , β , a, b ∈R, and Y (0) > −1 (t) = Y −1 (t) (−α + β 2 )dt − β dW (t) , d X(t)Y −1 (t) = 0 a.s. Prove that dY
Y −1 (t) (a − bβ + X(t)(β 2 − α ))dt + (b − β X(t))dW (t) . (Note that Y (t) > 0,t ≥ 0 a.s. if Y (0) > 0, see Problem 14.2.) 13.26. Let {W (t), Ft , t ∈ [0, T ]} be a Wiener process, { f (t), FtW , t ∈ [0, T ]} be a bounded process, | f (t)| ≤ C, t ∈ [0, T ]. Prove that E| 0t f (s)dW (s)|2m ≤ C2mt m (2m − 1)!! 13.27. Assume that a process X has a stochastic differential dX(t) = U(t)dt +dW (t), where U is bounded process. Define a process Y (t) = X(t)M(t), where M(t) = exp {− 0t U(s)dW (s) − 12 0t U 2 (s)ds}. Apply the Itˆo formula and prove that Y (t) is an Ft -martingale where Ft = σ {U(s),W (s), s ≤ t}. 13.28. Let {W (t), t ∈ [0, T ]} be a Wiener process. Assume process { f (t), FtW , t ∈ t T 2m [0, T ]} is such that 0 E| f (t)| dt < ∞. Prove that E| 0 f (s)dW (s)|2m ≤ [m(2m − 1)]mt m−1 0t E| f (s)|2m ds. 13.29. Prove that for any p ≥ 1 there exist positive constants c p ,Cp such that c p E| max
t
t∈[0,T ] 0
f (s)dW (s)|2p ≤ E|
T 0
f 2 (s)ds)| p ≤ Cp E| max
t
t∈[0,T ] 0
f (s)dW (s)|2p .
ˆ , and a pro13.30. Let { f (t), FtW , t ∈ R+ } be a bounded stochastic process, f ∈ L t2 cess X(t) be the stochastic exponent, that is, have the form X(t) = exp{ 0 f (s)dW (s)− 1 t 2 f (s)ds}, where {W (t), Ft , t ∈ R+ } is a Wiener process. Prove that dX(t) = 0 2 X(t) f (t)dW (t).
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
201
13.31. Let (Ω , F, {Ft }t∈[0,T ] , P) be a filtered probability space, {W (t), Ft , t ∈ [0, T ]} be a Wiener process,{γ (t), Ft , t ∈ [0, T ]} be a progressively measurable stochastic ∞} = 1, and {ξ (t), Ft , t ∈ [0, T ]} be a stochastic process such that P{ 0T γ 2 (s)ds < process of the form ξ (t) = 1 + 0t γ (s)dW (s), t ∈ [0, T ] (its stochastic integrability was grounded in Problem 13.10). Assume that ξ (t) ≥ 0, t ∈ [0, T ] P-a.s. Prove that ξ is a nonnegative supermartingale and Eξ (t) ≤ 1, t ∈ [0, T ]. 13.32. Assume that a process {X(t), t ∈ R+ } has a stochastic differential dX(t) = α X(t)dt + σ (t)dW (t), X(0) = X0 ∈ R, where α ∈ R, σ ∈ L2 ([0,t]) for all t ∈ R+ . Find m(t) := EX(t). 13.33. Assume that a process {X(t), t ∈ R+ } has a stochastic differential dX(t) = μ (t)dt + σ (t)dW (t), where μ ∈ L1 ([0,t]), σ ∈ L2 ([0,t]) for all t ∈ R+ , and μ (t) ≥ 0 for all t ∈ R+ . Prove that {X(t), FtW , t ∈ R+ } is a submartingale, where FtW is a flow of σ -algebras generated by a Wiener process {W (t), t ∈ R+ }.
13.34. Let X(t) = exp{ 0t μ (s)ds + 0t σ (s)dW (s)}, and μ ∈ L1 ([0,t]), σ ∈ L2 ([0,t]) for all t ∈ R+ . Find conditions on functions μ and σ under which the process X(t) is (a) a martingale; (b) a submartingale; (c) a supermartingale. 13.35. Apply the Itˆo formula and prove that the following processes are martingales with respect to natural filtration: (a) Xt = et/2 cosW (t). (b) Xt = et/2 sinW (t). (c) Xt = (W (t) + t) exp{−W (t) − 12 t}. 13.36. Let {W (t), Ft , t ∈ R+ } be the Wiener process. Prove that W 4 (1) = 3 + 1 3 0 (12(1 − t)W (t) + 4W (t))dW (t). 13.37. Let {W (t), Ft , t ∈ R+ } be the Wiener process, and τ be a stopping time such that Eτ < ∞. Prove that EW (τ ) = 0, EW 2 (τ ) = Eτ . 13.38. Let a < x < b, {X(t) = x +W (t), t ∈ R+ } where W is the Wiener process. Denote ϕ (x) := P(X(τ ) = a) where τ is the exit moment of the process X from the interval (a, b). Prove that ϕ (x) ∈ C∞ [a, b] and that ϕ satisfies differential equation ϕ (x) = 0, x ∈ [a, b], and ϕ (a) = 1, ϕ (b) = 0. Prove that P(X(τ ) = a) = (b − x)/(b − a). This is a particular case of Problem 14.31.
13.39. Let {H(t), FtW , t ∈ [0, T ]} be a stochastic process with 0T H 2 (t)dt < ∞ a.s. t Put M(t) := 0 H(s)dW (s). (1) Prove that M(t) is a local square integrable martingale. (2) Let E supt≤T M 2 (t) < ∞. Prove that E 0T H 2 (t)dt < ∞ and the process M(t) is a square integrable martingale. 13.40. Let
x2 1 , 0 ≤ t < 1, x ∈ R, and p(1, x) = 0. exp − p(t, x) = √ 2(1 − t) 1−t
202
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
Put M(t) := p(t,W (t)), where W (t) is a Wiener process. (a) Prove that M(t) = M(0) + 0t (∂ p/∂ x)(s,W (s))dW (s). (b) Let H(t) = (∂ p/∂ x)(t,W (t)). Prove that 01 H 2 (t)dt < ∞ a.s., and 1 2 E 0 H (t)dt = +∞. 13.41. (1) Let {W (t) = (W1 (t), . . . ,Wm (t)), Ft ,t ≥ 0} be an m-dimensional Wiener 2 process. Put M(t) := ∑m i=1 (Wi (t)) . Prove that {M(t) − mt, Ft ,t ≥ 0} is a martingale and its quadratic characteristic equals M(t) = (2) Let N(t) = (M(t))1/2 ,
ϕ (x) =
t
4M(u) du. 0
ln |x|, m = 2, |x|2−m , m ≥ 3.
Prove that {ϕ (N(t)), Ft ,t ≥ 0} is a local martingale.
13.42. Let {Mt , Ft , t ∈ [0, T]} be a martingale of the form 0t H(s)dW (s) + 0t K(s)ds, where 0t H 2 (s)ds < ∞ a.s, 0t |K(s)|ds < ∞ a.s. Prove that K(t) = 0 λ 1 |[0,T ] × P a.s. 13.43. Let M ∈ M 2, c . Prove that M 2 (t) − M 2 (s) = 2
t 0
M(s)dM(s) + M(t) − M(s), s < t.
13.44. Let {M(t), Ft , t ∈ R+ } be a continuous local martingale, M(0) = 0, and limt→∞ M(t) = ∞ a.s. Denote
τt := inf{s > 0| M(s) > t}, t > 0. Prove that a stopped stochastic process {M(τt ), t ≥ 0} is a Wiener process with respect to the filtration {Fτt , t ∈ R+ }. 2, c 13.45. Let M ∈ Mloc , M(0) = 0 and EM(t) < ∞ for all t > 0. Prove that M ∈ M 2, c .
13.46. (The Dynkin formula) (1) Assume that a stochastic process {X(t), Ft , t ∈ R+ } has a stochastic differential dX(t) = μ (t, X(t))dt + σ (t, X(t))dW (t), X(0) = x ∈ R, where μ and σ are continuous bounded functions. Let a function f be bounded and f = f (t, x) ∈ C1 (R+ ) ×C2 (R), τ be a bounded stopping time, and a second-order differential operator L be of the form ∂f 1 2 ∂2 f + σ (t, x) 2 . L f (t, x) = μ (t, x) ∂x 2 ∂x (This operator is similar to the operator L introduced in Definition 12.16, but now its coefficients are nonhomogeneous in time.) Prove that τ ∂f + L f (u, Xu )du. E f (τ , Xτ ) = f (0, X0 ) + E ∂t 0
13 Itˆo stochastic integral. Itˆo formula. Tanaka formula
203
(2) Prove the following version of the Dynkin formula for unbounded stopping times. Let C02 (Rn ) be the class of twice continuously differentiable functions on Rn with compact support, and τ be a stopping time such that Ex τ < ∞. Then Ex f (Xτ ) = f (x) + E
τ 0
L f (Xu )du.
(3) Let an n-dimensional stochastic process {X(t), Ft, t ∈ R+} have a stochastic differential of the form dXi(t) = μi(t, X(t))dt + ∑_{j=1}^m b_{i,j}(t, X(t))dWj(t), X(0) = x ∈ Rn, where W(t) = (W1(t), W2(t), . . . , Wm(t)) is an m-dimensional Wiener process. Assume that all components of μ and b are bounded continuous functions, a function f is bounded, f = f(t, x) ∈ C1(R+) × C2(Rn), and τ is a bounded stopping time. Write down and prove the multidimensional Dynkin formula for X.
13.47. Let {W(t), t ∈ R+} be a Wiener process. Prove that the limit (13.1) exists in L2(P) and for all t ∈ R+ and x ∈ R,
(W(t) − x)^+ − (W(0) − x)^+ = ∫_0^t 1I_{W(s)∈[x,∞)} dW(s) + (1/2)L(t, x) a.s.
13.48. Prove Theorem 13.4 (the Tanaka formula).
13.49. Prove that L(t, x) has a modification that is continuous in (t, x).
13.50. Let {W(t), t ∈ R+} be a Wiener process. Prove that for all t ∈ R+ and a ≤ b,
∫_a^b L(t, x)dx = ∫_0^t 1I_{W(s)∈(a,b)} ds a.s.
13.51. Let f ∈ L1(R). Prove that for each t ∈ R+,
∫_{−∞}^{∞} L(t, x) f(x)dx = ∫_0^t f(W(s))ds a.s.
13.52. Prove that the limit in (13.1) exists with probability 1.
13.53. (1) Let {L(t, y), t ≥ 0, y ∈ R} be a local time of a Wiener process. Prove that Ex(L(τa ∧ τb, y)) = 2u(x, y), a ≤ x ≤ y ≤ b, where τz := inf{s ≥ 0 | W(s) = z}, z = a, b, and u(x, y) = ((x − a)(b − y))/(b − a). (2) Prove that
Ex ∫_0^{τa∧τb} f(W(s))ds = 2∫_a^b u(x, y) f(y)dy
for any nonnegative bounded Borel function f : R → R+.
13.54. Let W(t) = (W1(t), . . . , Wn(t)) be an n-dimensional Wiener process, X(t) = a + W(t) where a = (a1, . . . , an) ∈ Rn (n ≥ 2), and let R > ‖a‖. Denote τR := inf{t ≥ 0 | X(t) ∉ B(0, R)}, where B(0, R) = {x ∈ Rn | ‖x‖ < R}. Find EτR.
13.55. Let X be the process from Problem 13.54, but with ‖a‖ > R. Find the probability for this process to hit the ball B(0, R).
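A numerical illustration is also possible for Problem 13.54. The Python sketch below uses illustrative parameters (n = 3, R = 1, a = (0.2, 0, 0)); it estimates EτR by simulating a + W(t) on a time grid and compares the estimate with the value (R² − ‖a‖²)/n obtained in the Answers section.

import numpy as np

# Monte Carlo estimate of the expected exit time of X(t) = a + W(t)
# from the ball B(0, R); parameters below are illustrative only.
rng = np.random.default_rng(5)
n, R, dt, n_paths = 3, 1.0, 1e-3, 10_000
a = np.array([0.2, 0.0, 0.0])

x = np.tile(a, (n_paths, 1))
steps = np.zeros(n_paths)
alive = np.ones(n_paths, dtype=bool)
while alive.any():
    x[alive] += np.sqrt(dt) * rng.standard_normal((alive.sum(), n))
    steps[alive] += 1
    alive &= (np.linalg.norm(x, axis=1) < R)

print("estimated E tau_R:", (steps * dt).mean(),
      "  (R^2 - |a|^2)/n =", (R**2 - a @ a) / n)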
13.56. Find some function f = (f1, f2) such that for any process X(t) = (X1(t), X2(t)) satisfying the equations dX1(t) = −X2(t)dW(t) + f1(X1(t), X2(t))dt, dX2(t) = X1(t)dW(t) + f2(X1(t), X2(t))dt, the equality X1²(t) + X2²(t) = X1²(0) + X2²(0), t ≥ 0, holds true a.s.
13.57. (1) Prove the Lévy theorem. A stochastic process {W(t), Ft, t ≥ 0} is a Wiener process if and only if it is a square integrable martingale, W(0) = 0, and E((W(t) − W(s))²/Fs) = t − s for any s < t.
(2) Prove the following generalization of the Lévy theorem. For any square integrable martingale {M(t), Ft, t ≥ 0} with the quadratic characteristic (see Definition 7.10) of the form ⟨M⟩(t) = ∫_0^t α(s)ds, where α ∈ L1([0,t]) for any t > 0 and α(s) > 0 for all s > 0, there exists a Wiener process {W(t), Ft, t ≥ 0} such that M(t) = ∫_0^t (α(s))^{1/2} dW(s).
(3) Prove the multidimensional version of the Lévy theorem. An n-dimensional stochastic process {W(t) = (W1(t), W2(t), . . . , Wn(t)), Ft, t ≥ 0} is an n-dimensional Wiener process if and only if it is a square integrable martingale, W(0) = 0, and the processes Wi(t)Wj(t) − δij t are martingales for any 1 ≤ i, j ≤ n. Another formulation of the latter claim: for any 1 ≤ i, j ≤ n, the joint quadratic characteristic ⟨Wi, Wj⟩(t) equals δij t.
13.58. Prove that the stochastic process Y(t) := ∫_0^t sign(W(s))dW(s), t ≥ 0, is a Wiener process.
13.59. Let W(t) = (W1(t), . . . , Wm(t)) be a Wiener process. Assume that a progressively measurable process U(t), t ≥ 0, takes values in the space of n × m matrices and U(t)U*(t) = id_n a.s., where id_n is the identity n × n matrix. Prove that W̃(t) = ∫_0^t U(s)dW(s) is an n-dimensional Wiener process.
13.60. Assume that a progressively measurable stochastic process β(t), t ≥ 0, satisfies the condition: ∃ c, C > 0 ∀ t ≥ 0: c ≤ β(t) ≤ C a.s. Let
A(t) = ∫_0^t β(s)ds, A^{−1}(t) := inf{s ≥ 0 | ∫_0^s β(z)dz = t}.
Prove that A^{−1}(t) is a stopping time. Consider the filtration {F_t^A := F_{A^{−1}(t)}, t ≥ 0}. Prove that the process
W̃(t) = ∫_0^{A^{−1}(t)} β^{1/2}(s)dW(s)
is an {F_t^A, t ≥ 0}-Wiener process. Prove the following change of variables formula:
∫_0^{A^{−1}(t)} b(s)dW(s) = ∫_0^t b(A^{−1}(z))β^{−1/2}(A^{−1}(z))dW̃(z),
where b(t), t ≥ 0, is a σ{W(s), s ≤ t}-adapted process such that P(∫_0^T b²(s)ds < ∞) = 1
for each T > 0.
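Problem 13.58 also lends itself to a quick numerical check of the Lévy characterization used in Problem 13.57. In the Python sketch below the Itô integral is approximated by its left-endpoint integral sums; the number of paths and the time grid are arbitrary illustrative choices.

import numpy as np

# Simulate Y(1) = int_0^1 sign(W(s)) dW(s) by left-point Ito sums and check
# that it looks standard normal, as the Levy theorem (Problem 13.57) predicts.
rng = np.random.default_rng(1)
n_paths, n_steps = 50_000, 1_000
dt = 1.0 / n_steps

dW = np.sqrt(dt) * rng.standard_normal((n_paths, n_steps))
W = np.cumsum(dW, axis=1)
W_prev = np.hstack([np.zeros((n_paths, 1)), W[:, :-1]])  # W at left endpoints
Y1 = np.sum(np.sign(W_prev) * dW, axis=1)                # Ito sums for Y(1)

print("mean (should be near 0):", Y1.mean())
print("variance (should be near 1):", Y1.var())
print("P(Y(1) <= 0) (should be near 0.5):", (Y1 <= 0).mean())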
Hints 13.1. Represent the integral as a mean-square limit of the corresponding integral sums. 13.2. See the hint to Problem 13.1. The Itˆo formula can also be applied. 13.3. Prove that these random variables are jointly Gaussian (see Problem 13.1), and calculate their covariance. 13.4. (3) Apply the Itˆo formula to W 3 (t). 13.5. Apply the definition of a stochastic integral and properly approximate a continuous process f ∈ Lˆ2 ([0, T ]). kT /n 13.6. Prove that process gn = ∑n−1 k=1 n (k−1)T /n g(s)ds1I[kT /n,(k+1)T /n) belongs to Lˆ2 ([0, T ]), and gn → g, n → ∞ in Lˆ2 ([0, T ]). In order to prove this verify that linear kT /n operator Pn : f → ∑n−1 n f (s)ds1I[kT /n,(k+1)T /n) in Lˆ2 ([0, T ]) has a norm 1, k=1
(k−1)T /n
and a sequence {Pn , n ≥ 1} strongly converges to the identity operator in Lˆ2 ([0, T ]). 13.8. Prove that a sequence of simple processes fn (t) = ∑ n k
k/n
(k−1)/n
f (s)ds1It∈[k/n,(k+1)/n)
converges to a process f in Lˆ2 ([0, T ]). Construct a similar sequence gn for g and ob serve that the Itˆo integrals of these processes 0t fn (s)dW (s), 0t gn (s)dW (s) coincide on the set A for any fixed t ∈ [0, T ]. In order to prove that t t f (s)dW (s) = g(s)dW (s) = 0 P ω ∈ A| ∃t ∈ [0; T ] t
0
0
t
notice that processes 0 f (s)dW (s), 0 g(s)dW (s) are continuous in t. 13.9. Prove the required statements for a stopping time that take values in a finite set. Then approximate arbitrary stopping time τ by a sequence of stopping times τn = ∑k≥1 k/n1Iτ ∈[(k−1)/n, k/n) . See also the solution to Problem 13.37. 13.10. Put τn = inf{t ≥ 0 | 0t f 2 (s)ds = n}∧T. Then τn is a stopping time, a sequence {τn } is nondecreasing, and for a.a. ω there exists n0 = n0 (ω ) such that τn (ω ) = T, n ≥ n0 . Set fn (t) = f (t)1Iτn >t . Then 0T ( fn (t, ω ) − f (t, ω ))2 dt = 0 if τn (ω ) = T. Denote I(t) = 0t fn (s)dW (s) for {ω | τn (ω ) = T } and apply the results of the previous two problems. 13.11. Let
τm = inf{t ≥ 0 | ∫_0^t g²(s)ds = m} ∧ T,  gn(t) = ∑_{k=1}^{n−1} n ∫_{(k−1)T/n}^{kT/n} g(s)ds 1I_{[kT/n,(k+1)T/n)}(t).
Then (see Problems 13.6, 13.10)
∫_0^T g(s)dW(s) = ∫_0^T g(s)1I_{τm=T}dW(s),
∫_0^T gn²(t)dt ≤ ∫_0^T g²(t)dt for all ω ∈ Ω, n ∈ N, and
∑_{k=1}^{n−1} n ∫_{(k−1)T/n}^{kT/n} g(s)ds (W((k+1)T/n) − W(kT/n)) = ∫_0^T gn(t)1I_{τm=T}dW(t)
for a.a. ω such that τm(ω) = T. Tend n → ∞ and apply Problems 13.6, 13.10.
13.12. Apply the results of Problems 13.5–13.10.
13.13. The first sum converges to ∫_0^1 W(t)dW(t) = (W²(1) − 1)/2 in L2(P). Compare the sums in (b), (c) with the sum in (a) and apply the reasoning of Problem 3.19 to estimate the differences between the sums.
13.14. Introduce stopping times
τN,n = inf({t ≥ 0 | ∫_0^t bn²(s)ds ≥ N} ∪ {T}).
Then
∫_0^T (bn(t)1I_{t≤τN,n} − b0(t)1I_{t≤τN,0})² dt → 0, n → ∞.
Therefore ∫_0^T bn(t)1I_{t≤τN,n}dW(t) converges in L2(P) to ∫_0^T b0(t)1I_{t≤τN,0}dW(t) as n → ∞, and so converges in probability. The localization property of the stochastic integral (Problem 13.8) implies that
∫_0^T bn(t)1I_{t≤τN,n}dW(t) = ∫_0^T bn(t)dW(t)
for a.a. ω from the set {τN,n = T}. So, to complete the proof, it suffices to observe that ∀ ε > 0 ∃ N : lim_{n→∞} P(τN,n < T) < ε.
13.15. Introduce auxiliary processes
ξ_{n,m}(t) = ∑_{k=0}^{m−2} m ∫_{k/m}^{(k+1)/m} ξn(s)ds 1I_{((k+1)/m,(k+2)/m)}(t).
Then
|∫_0^1 ξn(t)dWn(t) − ∫_0^1 ξ0(t)dW0(t)|² ≤ 3(|∫_0^1 (ξn(t) − ξ_{n,m}(t))dWn(t)|² + |∫_0^1 ξ_{n,m}(t)dWn(t) − ∫_0^1 ξ_{0,m}(t)dW0(t)|² + |∫_0^1 (ξ0(t) − ξ_{0,m}(t))dW0(t)|²).   (13.4)
It is easy to see that for each m we have the following convergence in probability,
lim_{n→∞} ∫_0^1 ξ_{n,m}(t)dWn(t) = ∫_0^1 ξ_{0,m}(t)dW0(t).
It can be proved that the operator Pm : f → ∑_{k=0}^{m−2} m ∫_{k/m}^{(k+1)/m} f(s)ds 1I_{((k+1)/m,(k+2)/m)}(t) in
L̂2([0, T]) has norm 1, and the sequence {Pm, m ≥ 1} converges strongly to the identity operator in L̂2([0, T]). So,
lim_{m→∞} E|∫_0^1 (ξ0(t) − ξ_{0,m}(t))dW0(t)|² = lim_{m→∞} E∫_0^1 (ξ0(t) − ξ_{0,m}(t))² dt = 0.
Observe that
E|∫_0^1 (ξn(t) − ξ_{n,m}(t))dWn(t)|² = E∫_0^1 (ξn(t) − ξ_{n,m}(t))² dt ≤ 3(E∫_0^1 (ξn(t) − ξ0(t))² dt + E∫_0^1 (ξ0(t) − ξ_{0,m}(t))² dt + E∫_0^1 (ξ_{0,m}(t) − ξ_{n,m}(t))² dt) ≤ 6E∫_0^1 (ξn(t) − ξ0(t))² dt + 3E∫_0^1 (ξ0(t) − ξ_{0,m}(t))² dt.
Therefore the first term in the right-hand side of (13.4) converges in the mean square to 0 as n, m → ∞. So, the left-hand side of (13.4) converges to 0 in probability as n → ∞. 13.19. Apply the multidimensional Itˆo formula. 13.20. Apply the Itˆo formula. 13.21. (2), (3) Use the √ method of mathematical induction. In particular, in item (3), let a(s) := (W (s))/ s. Then da(s) = s−1/2 dW (s) − s−3/2W (s)ds/2. Due to the Itˆo formula d sn/2 hn (a) = n2 sn/2−1 hn (a) + sn/2 hn (a(s)) s−1/2 dW (s) − s−3/2W (s)ds/2 −1 ds = sn/2−1 nh (a(s)) − a(s)h (a(s)) + 12 sn/2 hn (a(s))s n n 2 + hn (a(s)) ds + s(n−1)/2 hn (a(s))dW (s). The following property of Hermite polynomials is well known: nhn (x) − xhn (x) + hn (x) = 0, and also hn (x) = nhn−1 (x). That is d sn/2 hn (a(s)) = ns(n−1)/2 hn−1 (a(s))dW (s), that which was to be demonstrated (this is the inductive step). 13.22. (1) Use properties of stochastic integrals. (2) Apply Problem 13.21. (4) Uniqueness follows from item (1). To prove existence, use items (1) and (3). f f (5) Random variables In n (T ), Imm (T ) are orthogonal in L2 (Ω , FTW , P) if m = n (see fn item (1)), so {In (T )| fn ∈ L2 (Δn (T ))} are orthogonal subspaces of L2 (Ω , FTW , P). Therefore, it suffices to verify that linear combinations of multiple Itˆo integrals are dense in L2 (Ω , FTW , P). As mentioned, the set of polynomials is dense in any space L2 (Rn , γ ), where γ is a Gaussian measure. So any square integrable random variable
of the form g(W (s1 ), . . . ,W (sn )) can be approximated in L2 (P) by linear combinations of multiple stochastic integrals. Prove now that a set of square integrable random variables having the form g(W (s1 ), . . . ,W (sn )), sk ∈ [0, T ], n ∈ N is dense in L2 (Ω , FTW , P). 13.23.–13.25. Apply the Itˆo formula. In Problem 13.24 this can be made straightforwardly; in Problems 13.23 and 13.25 an additional limit procedure should be used because the functions x → ln x and (x, y) → x/y do not belong to the classes C2 (R) and C2 (R2 ), respectively. For instance, for any given c > 0 there exists a function Fc ∈ C2 (R) such that Fc (x) = ln x, x ≥ c. Write the Itˆo formula for Fc (X(t)) and then tend c → 0+. 13.26. Let X(t) := 0t f (s)dW (s). Put τN := inf{t ∈ R+ | sup0≤s≤t |X(s)| ≥ N} ∧ T, apply the Itˆo formula to |X(t ∧ τN )|2m , obtain the estimate E|X(t)|2m ≤ C2 m(2m − t 1) 0 E|X(s)|2m−2 ds, and apply mathematical induction. 13.28. the Itˆo formula and obtain the equality E|X(t ∧ τN )|2m = m(2m − t∧τApply N |X(s)|2m−2 f 2 (s)ds, where X(t) and τN are the same as in Problem 13.26. 1)E 0 Apply the H¨older inequality with p = m/(m − 1), q = m to the right-hand side. Verify and then use the following facts: E|X(t ∧ τN )|2m < ∞ and E|X(t ∧ τN )|2m is nondecreasing in t. 13.29. Prove that the quadraticcharacteristic of the square integrable martingale t t 2 0 f (s)dW (s), t ∈ [0, T ] equals 0 f (s)ds, t ∈ [0, T ]. Apply the Burkholder–Davis inequality for a continuous-time martingale (Theorem 7.17). 13.30. Apply the Itˆo formula. 13.31. Consider the following stopping times: τn = inf{t ∈ [0, T ] | 0t γ 2 (s)ds ≥ n}, T 2 where we set τn = T if 0 γ (s)ds < n. Use properties of the Itˆo integral and verify that process ξ (t ∧ τn ) is a continuous nonnegative Ft -martingale. Check that τn → T a.s. as n → ∞. Apply the Fatou lemma and deduce both statements. 13.32. Write the equation for m(t). 13.33. Prove that 0t σ (t)dW (t) is a martingale with respect to the indicated flow and apply the definition of a submartingale. 13.34. Apply the Itˆo formula. 13.39. (1) Consider a sequence of stopping times τn = inf{t ∈ R+ 0t H 2 (s)ds = n} ∧ T and prove that EM 2 (τn ) = E 0τn H 2 (s)ds. (2) Apply the Lebesgue monotone convergence theorem and relation E 0τn H 2 (s)ds = EM 2 (τn ) ≤ E supt≤T M 2 (t). 13.41. (1) To calculate a quadratic characteristic apply the Itˆo formula to the difference M 2 (t) − M 2 (s), 0 ≤ s < t. (2) Apply the multidimensional version of the Itˆo formula. 13.42. Consider a sequence of stopping times τn = inf{t ∈ R+ | 0t H 2 (s)ds = n} ∧ T. 13.43. Apply the generalized Itˆo formula. 13.46. Apply the Itˆo formula and Doob’s optional sampling theorem. 13.49. Apply Theorem 3.7. 13.51. Use approximation and Problem 13.50. 13.53. (1) Due to Problem 13.38, P{τa < τb } = (b − x)/(b − a). Prove that Ex
∫_0^{τa∧τb} sign(W(s) − y)dW(s) = 0
and apply Theorem 13.4 to deduce the identity Ex (L(τa ∧ τb , y)) = |b − y| − |x − y| + (b − x)/(b − a)(|a − y| − |b − y|) and then the required statement. (2) Follows from item (1) and Problem 13.51. 13.57. (1) Apply the generalization of the Itˆo formula to a function F(W (t)) = exp {iuW (t)}. Denote I(t) := E(F(W (t))/Fs ). Then I(t) satisfies the equation I(t) = F(W (s)) − (u2 /2) st I(θ ) d θ . Therefore, I(t) = F(W (s)) exp {−(u2 /2)(t − s)}, and so E(exp {iu(W (t) −W (s))}/Fs ) = exp {−(u2 /2)(t − s)}. t (2) Put W (t) := 0 ((dM(s))/(α (s))). Verify that {W (t), t ≥ 0} is a Wiener process. 13.58. A process Yt is a square integrable martingale with quadratic characteristic Y t = 0t sign 2 (W (s))ds. Use Problem 3.21 and prove that Y t = t. Finally apply Problem 13.57 (L´evy theorem). 13.59. See hint to Problem 13.58. 13.60. Due to Problem 7.44, a random variable A−1 (t) is a stopping time. Check that (t) = 0, EW 2 (t) = t, and use the L´evy (t) is a continuous FA−1 (t) -martingale, EW W theorem (Problem 13.57). To prove the change of variable formula, approximate a process b(t) by simple processes, prove the formula for them and apply the result of Problem 13.14.
Answers and Solutions
13.13. (a) (W²(1) − 1)/2; (b) (W²(1) + 1)/2; (c) W²(1)/2.
13.16. ((2n − 1)!!/(n + 1)) T^{n+1}.
13.32. m(t) = X0 e^{αt}.
13.34. For almost all s ∈ R+ and ω ∈ Ω, the value μ(s, ω) − σ²(s, ω)/2 should (a) equal 0; (b) be ≥ 0; (c) be ≤ 0.
13.36. Put X(t) := E(W^4(1)/F_t^W), 0 ≤ t ≤ 1. Due to the Markov property of a Wiener process, X(t) = E(W^4(1)/W(t)). The conditional distribution of W(1) given W(t) is Gaussian, N(W(t), 1 − t), thus
X(t) = E((W(1) − W(t) + W(t))^4/W(t)) = E((W(1) − W(t))^4/W(t)) + 4E((W(1) − W(t))³W(t)/W(t)) + 6E((W(1) − W(t))²W²(t)/W(t)) + 4E((W(1) − W(t))W³(t)/W(t)) + W^4(t) = 3(1 − t)² + 6(1 − t)W²(t) + W^4(t).
Hence, due to the Itˆo formula X(s) = X(0) + 0s (12(1 − t)W (t) + 4W (t)3 )dW (t) (check this). Finally, X(0) = EW 4 (1) = 3. 13.37. Consider the adapted function f (s, ω ) := 1Iτ (ω )≥s . Then P 0∞ f 2 (s)ds < ∞ t n = P(τ < ∞) = 1. Let us show that 0 f (s)dW (s) = W (t ∧ τ ) a.s. t Put τn = k/2 n n if (k − 1)/2 ≤ τ ≤ k/2 , τn = ∞ if τ = ∞. Consider integrals 1I dW (s) = ∞ ∞ 0 τn ≥s n for some i, then t 1I 1 I dW (s). If t = i/2 dW (s) = 1 I dW (s) = τ ∧t≥s τ ≥s τ ∧t≥s n n n 0 0 0 stochastic integral and Wiener process, the last equalWτn ∧t . Due to continuity of the ity is satisfied for all t. Next, 0∞ E(1Iτn ≥s −1Iτ ≥s )2 ds = 0∞ (P(s ≤ τn ) − P(s ≤ τ )) ds = t Eτn − Eτ ≤ 1/2n → 0, n → ∞, so 0t 1Is≤τ dW (s) = l.i.m. τ n→∞ 0 1Is≤τn dW ∞ (s) = W ( τ ∧t) = W ( τ ∧t). Then P-a.s. W ( τ ) = 1 I dW (s) = l.i.m.n→∞ n s≤ τ 0 1Is≤τ dW (s), 0 and E 0∞ 1I2s≤τ ds = Eτ < ∞. So, EW (τ ) = E 0∞ 1Is≤τ dW (s) = 0, and EW 2 (τ ) = E( 0∞ 1Is≤τ dW (s))2 = 0∞ E1I2s≤τ ds = Eτ . 13.44. One can verify that random variable τt is a stopping time for each t > 0. One can also check that Mτt = t a.s. Let us show that {Mτt , Fτt , t ∈ R+ } is a square integrable martingale. Define a localizing sequence
σk := inf{t > 0 | |Mt| > k}.
Then {M_{t∧σk}, Ft, t ∈ R+} is a bounded martingale, and hence, by Theorem 7.5, the process {M_{τt∧σk}, F_{τt}, t ∈ R+} is a bounded martingale as well. According to Problem 13.43,
M²(τt ∧ σk) = 2∫_0^{τt∧σk} M(s)dM(s) + ⟨M⟩(τt ∧ σk).
+2 it holds that It was mentioned in Theoretical grounds in Chapter 7 that for M ∈ M 2 2 M (t) − M(t) is a martingale. Therefore, M (t ∧ σk ) −M(t ∧ σk ), and so M 2 (τt ∧ τ ∧σ σk ) − M(τt ∧ σk ) are martingales. That is, a process 0 t k M(s)dM(s) is a martingale, moreover a bounded martingale. Thus its expectation is zero and EM 2 (τt ∧ σk ) = EM(τt ∧ σk ) ≤ EM(τt ) ≤ t. Due to Fatou’s lemma {M(τt ), Fτt , t ∈ R+ } is a square integrable martingale. Due to now to prove Problem 13.57, it suffices that M(τt ) is a continuous process and E (M(τt ) − M(τs ))2 /Fτs = t − s. Let us first prove the last relation. It is easy to see that E (M(τt ) − M(τs ))2 /Fτs = M τ (t) − M τ (s), where M τ is a quadratic characteristic of a martingale M(τt ). Due to the generalization of the Itˆo formula, M 2 (τt ) = 2 0τt M(s)dM(s) + t. We can con2 sider the τtlast relation as a Doob–Meyer decomposition for supermartingale M (τt ), where 0 M(s)dM(s) is a local martingale, and A(t) = t is nonrandom and thus a predictable nondecreasing process. Because of uniqueness of such decomposition M τ (t) = t. Finally, let us prove the continuity M(τt ). A function τt is right continuous. Therefore M(τt ) has right continuous trajectories a.s. Note that M(τt − ) = M(τt ) = t and M is a continuous process. Denote by A ⊂ Ω the set on which the trajectories are not continuous. Then A = {ω ∈ Ω | there exists t > 0 such that τt − = τt , and M(τt − ) = M(τt )} ⊂ r,s∈Q {ω ∈ Ω | there exists t > 0 such that τt − < r < s < τt , M(r) = M(s), M(r) = M(s)} ⊂ r,s∈Q, 0 M(r)}. Then {M(r) = M(s), M(r) = M(s)} = {σ ≥ s, M(r) = M(σ ∧s)} and M(σ ∧s∧ σk )−M(r ∧ σk ) = 0. Then (M(σ ∧s∧ σk ))2 −
(M(r ∧ σk))² = 2∫_{r∧σk}^{σ∧s∧σk} M(u)dM(u), and the right-hand side has zero expectation,
and the expectation of the left-hand side equals E (M(σ ∧ s ∧ σk ) − M(r ∧ σk ))2 . Tend k → ∞ and apply the Fatou lemma: E(M(σ ∧ s) − M(r))2 = 0. It is easy to deduce now that P(A) = 0. 13.45. Let us write down a generalization of the Itˆo formula for a localizing sequence {τn , n ≥ 1} (see also Problem 13.43), M 2 (t ∧ τn ) = 2
∫_0^{t∧τn} M(s)dM(s) + ⟨M⟩(t ∧ τn),
and M(s)dM(s) is a martingale (see Theoretical grounds to Chapter 7). So 0 E 0t∧τn M(s)dM(s) = 0, hence EM 2 (t ∧ τn ) ≤ EM(t) < ∞. The application of Fatou’s lemma completes the proof. n μ (t, 13.46. (3) E f (τ , Xτ ) = f (0, X0 )+E 0τ (L f + ∂ f /∂ t) (u, Xu )du, where L = Σi=1 i 1 n 2 m x)(∂ f (t, x))/∂ xi + 2 Σi, j=1 σi j (t, x)(∂ f (t, x))/∂ xi ∂ x j with σi j := Σk=1 bik b jk . 13.47. Put fx (y) = (y − x)+ . Define approximations fxε (y)(ε > 0) of the function fx (y): ⎧ if y ≤ x − ε , ⎨ 0, fxε (y) = (y − x + ε )2 /4ε , if x − ε ≤ y ≤ x + ε , ⎩ y − x, if y ≥ x + ε . There exists a sequence ϕn ∈ C∞ (R) of functions, with compact supports that contract to {0}, and such that gn := ϕn ∗ fxε (i.e., gn (y) = R fxε (y−z)ϕn (z)dz) satisfy relations: gn → fxε and gn → fxε uniformly on R, and gn → fxε pointwise except at points x± ε . Notice that gn ∈ C∞ (R). For example, we can put ϕn (y) = nϕ (ny), where ϕ (y) = c exp{−(1 − y2 )−1 } for |y| < 1 and ϕ (y) = 0 for |y| ≥ 1; and a constant c is 1 such that c −1 ϕ (y)dy = 1. Apply the Itˆo formula to gn : t
1 t g (W (s))ds. 2 0 n 0 For all t a sequence 1Is∈[0,t] gn (W (s)) converges as n → ∞ to 1Is∈[0,t] fxε (W (s)) uni formly on R+ × Ω . So, 0t gn (W (s))dW (s) converges in L2 (P) to 0t fxε (W (s))dW (s). Observe that limn→∞ gn (W (s)) = fxε (W (s)) a.s. for each s ∈ R+ because P(W (s) = x ± ε ) = 0 for all ε > 0. Due to Fubini’s theorem (applyied to a product of measures P × λ 1 ) this limit relation holds a.s. for almost all s ∈ R+ with respect to Lebesgue measure λ 1 . Because |gn | ≤ (2ε )−1 , the Lebesgue dominated theorem implies the convergence 0t gn (W (s))ds → 0t fxε (W (s))ds in L2 (P) and a.s. So, for each x and t t 1 t 1 fxε (W (s))dW (s) + 1I ds. a.s. fxε (W (t)) − fxε (W (0)) = 2 0 2ε W (s)∈(x−ε ,x+ε ) 0 (13.5) + as (0)−x) A sequence fxε (W (t))− fxε (W (0)) converges L2 (P) to (W (t)−x)+ −(W t ε ↓ 0 because | fxε (W (t))− fxε (W (0))| ≤ |W (t)−W (0)|. Moreover E 0 ( fxε (W (s))− √ 1IW (s)∈[x,∞) )2 ds ≤ E 0t 1IW (s)∈(x−ε ,x+ε ) ds ≤ 0t (2ε / 2π s)ds → 0 as ε → 0. Theregn (W (t)) − gn (W (0)) =
gn (W (s))dW (s) +
fore 0t fxε (W (s))dW (s) converges in L2 (P) to required statement.
1 0
1IW (s)∈[x,∞) dW (s). This implies the
13.48. A process (−W ) is also a Wiener process. So it possesses a local time at the point (−x). Let us denote it by L− (t, −x). Applying Definition 13.1 to L− (t, −x) we obtain that L− (t, −x) = L(t, x) a.s. Combine the last equality and the application of the result of Problem 13.47 to (−W ) and (−x) instead of W and x. Then we get t 1 (W (t) − x)− − (W (0) − x)− = − 1IW (s)∈(−∞,x] dW (s) + L(t, x) 2 0 (check the last equality). Add this equality to the equality from Problem 13.47, and obtain the required statement because 01 1IW (s)=x dW (s) = 0 a.s. 13.50. Without loss of generality we may assume that L(t, x) is continuous in (t, x) (see Problem 13.49). Denote I(t, x) = 0t 1IW (s)∈[x,∞) dW (s), I(t, x) is continuous in (t, x). Then (Problem 13.47) 1 L(t, x) = (W (t) − x)+ − (W (0) − x)+ − I(t, x). (13.6) 2 x+ε Let fxε (z) = (1/2ε ) x− ε 1Iz∈[y,∞) dy (see solution of Problem 13.47). Due to the stochastic Fubini theorem (Theorem 13.5), we obtain t 1 t x+ε 0 f xε (W (s))dW (s) = 2ε 0 x−ε 1IW (s)∈[y,∞) dy dW (s) 1 x+ε t
=
2ε x−ε
0 1IW (s)∈[y,∞) dW (s) dy
=
1 x+ε 2ε x−ε I(t, y)dy
a.s.
(Verify that conditions of the stochastic Fubini theorem are really satisfied.) Substituting the received identity into formula (13.5) we obtain that a.s. 1 x+ε 1 t 1 fxε (W (t)) − fxε (W (0)) − I(t, y)dy = 1I ds. (13.7) 2ε x−ε 2 0 2ε W (s)∈(x−ε ,x+ε ) Let us integrate the last equality with respect to x. Then b" # 1 x+ε fxε (W (t)) − fxε (W (0)) − I(t, y)dy dx 2ε x−ε a 1 t b 1I dxds a.s. (13.8) = 4ε 0 a x−ε <W (s)<x+ε For any z ∈ R
1 b 1I(x−ε ,x+ε ) (z)dx = 1Iz∈(a,b) + 1Iz=a + 1Iz=b . (13.9) ε ↓0 2ε a Make ε ↓ 0 in (13.8). A function I is continuous, so identity (13.9) implies that b 1 1 (W (t) − x)+ − (W (0) − x)+ − I(t, x) dx = 1I ds a.s. 2 0 W (s)∈(a,b) a The required statement follows from the last identity and (13.6). 13.52. Let us apply equality (13.8). Its left-hand side is continuous in ε > 0 and the right-hand side is left-continuous in ε > 0. So, identity (13.8) holds for all ε > 0 simultaneously, for a.a. ω . The left-hand side converges to 12 L(t, x) as ε ↓ 0 (see (13.6)) because I(t, · ) is continuous. That is what was to be demonstrated. 13.54. Let m ∈ N. Apply the multidimensional Dynkin formula to the process X, τ = τm = τR ∧m and a bounded function f ∈ C2 (R) such that f (x) = x2 as x ≤ R. Observe that L f (x) = 12 f (x) = n when x ≤ R. Therefore lim
E f(X(τm)) = f(a) + (1/2)E∫_0^{τm} Δf(X(s))ds = ‖a‖² + E∫_0^{τm} n ds = ‖a‖² + nEτm.
So Eτm ≤ (1/n)(R² − ‖a‖²) for all m ∈ N. Thus τR = lim_{m→∞} τm < ∞ a.s. and EτR = (1/n)(R² − ‖a‖²).
13.55. Let σk be the first exit time from the ring Ak = {x ∈ Rn | R ≤ ‖x‖ ≤ 2^k R}, k ∈ N. Let also fn,k ∈ C²(Rn) have a compact support and
fn,k(x) = −ln‖x‖ for n = 2, fn,k(x) = ‖x‖^{2−n} for n > 2,   (13.10)
as R ≤ ‖x‖ ≤ 2^k R. Because L fn,k = 0 on the set Ak, Dynkin's formula implies E fn,k(X(σk)) = fn,k(a) for all k ∈ N. We put pk := P(‖X(σk)‖ = R), qk := P(‖X(σk)‖ = 2^k R), and consider the cases n = 2 and n > 2 separately. If n = 2 then, due to (13.10),
− ln R · pk − (ln R + k ln 2)qk = − ln a, k ∈ N. So qk → 0 as k → ∞, and P(σk < ∞) = 1. If n > 2, then (13.10) implies pk · R2−n + qk (2k · R)2−n = a2−n . Because 0 ≤ qk ≤ 1, a 2−n . lim pk = P(σk < ∞) = k→∞ R 13.56. f1 (x1 , x2 ) = −0, 5x1 ; f2 (x1 , x2 ) = −0, 5x2 .
14 Stochastic differential equations
Theoretical grounds Consider a complete filtration {Ft ,t ∈ [0, T ]} and an m-dimensional Wiener process {W (t),t ∈ [0, T ]} with respect to it. By definition, a stochastic differential equation (SDE) is an equation of the form dX(t) = b(t, X(t))dt + σ (t, X(t))dW (t), 0 ≤ t ≤ T,
(14.1)
with X0 = ξ , where ξ is an F0 -measurable random vector, b = b(t, x) : [0, T ] × Rn → Rn , and σ = σ (t, x) : [0, T ] × Rn → Rn×m are measurable functions. Equality (14.1) is simply a formal writing of the stochastic integral equation X(t) = X0 +
∫_0^t b(s, X(s))ds + ∫_0^t σ(s, X(s))dW(s), 0 ≤ t ≤ T.   (14.2)
Definition 14.1. A strong solution to stochastic differential equation (14.1) on the interval 0 ≤ t ≤ T is an Ft -adapted Rm -value process {X(t), 0 ≤ t ≤ T } with a.s. continuous paths and such that after its substitution into the left- and right-hand sides of relation (14.2), for each 0 ≤ t ≤ T the equality holds with probability 1. Definition 14.2. Equation (14.1) has a unique strong solution in the interval [0, T ] if the fact that processes X = X(t) and Y = Y (t), t ∈ [0, T ], are strong solutions to the given equation with the same initial condition, implies that X is a modification of Y (then these continuous processes do not differ, i.e., P(X(t) = Y (t),t ∈ [0, T ]) = 1). Theorem 14.1. Assume both Lipschitz and linear growth conditions |b(t, x) − b(t, y)| + |σ (t, x) − σ (t, y)| ≤ L|x − y|, x, y ∈ Rn ,t ∈ [0, T ]; |b(t, x)|2 + |σ (t, x)|2 ≤ L(1 + |x|2 ), t ∈ [0, T ], x ∈ Rn , where L > 0 is a constant. Then the stochastic differential equation has a unique strong solution. Here we denote Euclidean norm of both vector and matrix by the symbol | · |; that is,
|b(t, x)| = (∑_{k=1}^n |bk(t, x)|²)^{1/2},  |σ(t, x)| = (∑_{k=1,j=1}^{n,m} |σkj(t, x)|²)^{1/2}.
Theorem 14.1 gives the simplest conditions for existence and uniqueness of a strong solution; those are the classical Lipschitz and linear growth conditions. There exist generalizations of the theorem to the case where the Lipschitz condition is replaced by a weaker one. Theorem 14.2. Consider a scalar equation with homogeneous coefficients X(t) = X0 +
∫_0^t b(X(s))ds + ∫_0^t σ(X(s))dW(s), 0 ≤ t ≤ T,   (14.3)
and assume that its coefficients satisfy the next conditions.
(1) The functions b(x) and σ(x) are bounded.
(2) There exists a strictly increasing function ρ(u) on [0, ∞) such that ρ(0) = 0, ∫_{0+} ρ^{−2}(u)du = ∞, and |σ(x) − σ(y)| ≤ ρ(|x − y|) for all x, y ∈ R (this is Yamada's condition [91], see also [38]).
(3) There exists an increasing convex function ς(u) on [0, ∞) such that ς(0) = 0, ∫_{0+} ς^{−1}(u)du = ∞, and |b(x) − b(y)| ≤ ς(|x − y|) for all x, y ∈ R.
Then equation (14.3) has a unique strong solution. In particular, one may take ρ(u) = u^α, α ≥ 1/2, and ς(u) = Cu.
Theorem 14.3. Consider a scalar equation (14.3) with homogeneous coefficients and assume that its coefficients satisfy the Lipschitz and linear growth conditions. Then the process X has the strong Markov property.
Now, we pass to the definition of a weak solution. Assume that only nonrandom coefficients b(t, x) and σ(t, x) are given, and at the moment there is no stochastic object at hand.
Definition 14.3. If on a certain probability space (Ω, F, P) one can construct a filtration {Ft, t ∈ [0, T]} and two processes {(X̃(t), W̃(t)), t ∈ [0, T]}, which are adapted to the filtration and such that {W̃(t), t ∈ [0, T]} is a Wiener process, and X̃(t) is a solution to equation (14.1) in which W is changed for W̃, then it is said that equation (14.1) has a weak solution.
Theorem 14.4. Let the coefficients b(t, x) and σ(t, x) be measurable locally bounded functions, continuous in x for each t ∈ [0, T], and moreover |b(t, x)|² + |σ(t, x)|² ≤ L(1 + |x|²), t ∈ [0, T], x ∈ Rn. Then equation (14.1) has a weak solution.
Remark 14.1. Throughout this and the next chapters, by a Wiener process we mean a process W that satisfies the usual definition of a Wiener process, except, maybe, for the condition W(0) = 0. However, if no initial condition is specified, it is assumed that W(0) = 0.
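For readers who wish to experiment with equation (14.1) numerically, the following Euler–Maruyama sketch in Python may be used. It is only an illustration: the function names and parameters are ours, and under the Lipschitz and linear growth conditions of Theorem 14.1 the scheme approximates the unique strong solution as the step size decreases.

import numpy as np

def euler_maruyama(b, sigma, x0, T, n_steps, seed=0):
    """Euler-Maruyama approximation of dX = b(t, X)dt + sigma(t, X)dW on [0, T]."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = np.linspace(0.0, T, n_steps + 1)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dW = np.sqrt(dt) * rng.standard_normal()
        x[k + 1] = x[k] + b(t[k], x[k]) * dt + sigma(t[k], x[k]) * dW
    return t, x

# Illustrative coefficients b(t, x) = -x, sigma(t, x) = 1 (an equation of
# Ornstein-Uhlenbeck type); both satisfy the conditions of Theorem 14.1.
t, x = euler_maruyama(lambda t, x: -x, lambda t, x: 1.0, x0=1.0, T=5.0, n_steps=5_000)
print(x[-1])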
Bibliography [9], Chapter VIII; [38], Chapter IV; [90], Chapter 12, §§12.4–12.5; [24], Volume 3, Chapters II and III; [25], Chapter VIII, §§2–4; [26]; [51], Chapter 19; [57], Chapter 4; [79], Chapter 31; [20], Chapter 14; [8], Chapter 7, §7.5; [46], [49], Chapter 21; Chapter 12, §12.5; [54], Chapter 3, §3.5; [61], Chapter V; [68], Chapter 13, §13.1; [85], Chapters 9, 10, and 15.
Problems 14.1. Let {W (t), t ∈ R+ } be a one-dimensional Wiener process. Prove that the next processes are the solutions to corresponding stochastic differential equations. (a) X(t) = eW (t) is a solution to an SDE dX(t) = 12 X(t)dt + X(t)dW (t). (b) X(t) = (W (t))/(1 + t) with W (0) = 0 is a solution to an SDE dX(t) = −(1/(1 + t))X(t)dt + (1/(1 + t))dW (t), X(0) = 0. (c) X(t) = sinW (t) with W (0) = a ∈ (−(π /2).(π /2)) is a solution to an SDE dX(t) = − 12 X(t)dt +(1−X 2 (t))1/2 dW (t) for t < τ (ω ) = inf{s > 0|W (s) ∈ [−(π /2), (π /2)]}. (d) (X1 (t), X2 (t)) = (t, et W (t)) is a solution to an SDE 0 dX1 (t) 1 = dt + X1 (t) dW (t), X2 (t) dX2 (t) e (e) (X1 (t), X2 (t)) = (chW (t), shW (t)) is a solution to an SDE 1 X1 (t) dX1 (t) X2 (t) = dt + dW (t). dX2 (t) X1 (t) 2 X2 (t) 14.2. Prove that the process X(t) = X0 exp{(r − (σ 2 /2))t + σ W (t)} is a strong solution to an SDE dX(t) = rX(t)dt + σ X(t)dW (t), X(0) = X0 , t ∈ R+ , and find an equation that is satisfied by the process X(t) = X0 exp{rt + σ W (t)}. 14.3. Let the process {X(t),t ∈ R+ } be a solution to an SDE dX(t) = (μ1 X(t) + μ2 )dt + (σ1 X(t) + σ2 )dW (t), X0 = 0, t ∈ R+ . (1) Find an explicit& form for X(t). ' (2) Let S(t) = exp (μ1 − (σ12 /2))t + σ1W (t) , where W is the same Wiener process that is written in the equation for X. (a) Prove that the process {S(t)} is a strong solution to an SDE dS(t) = μ1 S(t)dt + σ1 S(t)dW (t), t ∈ R+ . (b) Find a stochastic differential equation that is satisfied by the process {S−1 (t)}. (3) Prove that d(X(t)S−1 (t)) = S−1 (t) ((μ2 − σ1 σ2 )dt + σ2 dW (t)) . 14.4. Let {W (t) = (W1 (t), . . . ,Wn (t)), t ∈ R+ } be an n-dimensional Wiener process. Find a solution to SDE n dX(t) = rX(t)dt + X(t) ∑ αk dWk (t) , X(0) > 0. k=1
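The closed form in Problem 14.2 can be compared path by path with an Euler–Maruyama approximation driven by the same Brownian increments. The parameters r = 0.05, σ = 0.2, X0 = 1 in the Python sketch below are illustrative only.

import numpy as np

# Pathwise check of Problem 14.2: the explicit solution
# X(t) = X0*exp((r - sigma^2/2)t + sigma*W(t)) versus the Euler-Maruyama
# scheme built from the same Brownian increments.
rng = np.random.default_rng(2)
r, sigma, x0, T, n = 0.05, 0.2, 1.0, 1.0, 2_000
dt = T / n
dW = np.sqrt(dt) * rng.standard_normal(n)
W = np.cumsum(dW)

x_exact = x0 * np.exp((r - 0.5 * sigma**2) * np.arange(1, n + 1) * dt + sigma * W)

x_em = np.empty(n + 1)
x_em[0] = x0
for k in range(n):
    x_em[k + 1] = x_em[k] + r * x_em[k] * dt + sigma * x_em[k] * dW[k]

print("terminal values:", x_exact[-1], x_em[-1])   # close for small dt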
14.5. (1) Prove that the process
X(t) = α(1 − t/T) + βt/T + (T − t)∫_0^t dW(s)/(T − s), 0 ≤ t ≤ T,
is a solution to the SDE
dX(t) = ((β − X(t))/(T − t))dt + dW(t), t ∈ [0, T], X(0) = α.
(2) Prove that X(t) → β as t → T−, a.s. (3) Prove that the process X is a Brownian bridge over the interval [0, T] with fixed endpoints X(0) = α and X(T) = β. (A standard Brownian bridge is obtained with T = 1, α = β = 0; see Example 6.1 and Problems 6.13, 6.21.)
14.6. Solve differential the next stochastic equations. dX1 (t) 1 1 0 dW1 (t) (a) = dt + , 0 0 X1 (t) dX2 (t) dW2 (t) where W (t) = (W1 (t),W2 (t)) is a two-dimensional Wiener process. (b) dX(t) = X(t)dt + dW (t). (c) dX(t) = −X(t)dt + e−t dW (t). 14.7. (1) Solve the Ornstein–Uhlenbeck equation (or the Langevin equation) dX(t) = μ X(t)dt + σ dW (t), μ , σ ∈ R. The solution is called the Ornstein–Uhlenbeck process (cf. Problem 6.12). (2) Find EX(t) and DX(t) (cf. Example 6.2). 14.8. (1) Solve the mean-reverting Ornstein–Uhlenbeck equation dX(t) = (m − X(t))dt + σ dW (t), μ , σ ∈ R. Here the coefficient m is the “mean value”. Respectively, a solution to this equation is called the mean-reverting Ornstein–Uhlenbeck process. (2) Find EX(t) and its asymptotic behavior as t → ∞, and also find DX(t). 14.9. Solve an SDE dX(t) = rdt + α X(t)dW (t), r, α ∈ R. 14.10. Solve the next stochastic differential equations: . . (1) dX(t) = ( 1 + X 2 (t) + 1/2X(t))dt + 1 + X 2 (t)dW (t), 2 X(t) − a(1 + t)2 dt + a(1 + t)2 dW (t). (2) dX(t) = 1+t 14.11. Consider a linear SDE dX(t) = A(t)x(t)dt + σ dW (t), X(0) = x, where A(t) is a nonrandom continuous on R+ function, moreover A(t) ≤ −α < 0 for all t ≥ 0. Prove that σ2 σ2 σ 2 −2α t e (1 − e−2α t ) ≤ + x2 − . EX 2 (t) ≤ e−2α t x2 + 2α 2α 2α
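The moments asked for in Problem 14.7 can be checked by simulation. The Python sketch below uses illustrative parameters μ = −0.8, σ = 0.5, X(0) = 1; the closed-form mean and variance of the Ornstein–Uhlenbeck process are quoted in the comments only as a benchmark for the Monte Carlo estimate.

import numpy as np

# Simulate dX = mu*X dt + sigma dW and compare the sample mean and variance
# at time t with  E X(t) = X(0) e^{mu t}  and
# D X(t) = sigma^2 (e^{2 mu t} - 1) / (2 mu).
rng = np.random.default_rng(3)
mu, sigma, x0, t_end, n_steps, n_paths = -0.8, 0.5, 1.0, 2.0, 2_000, 100_000
dt = t_end / n_steps

x = np.full(n_paths, x0)
for _ in range(n_steps):
    x += mu * x * dt + sigma * np.sqrt(dt) * rng.standard_normal(n_paths)

print("sample mean:", x.mean(), " theory:", x0 * np.exp(mu * t_end))
print("sample var :", x.var(),
      " theory:", sigma**2 * (np.exp(2 * mu * t_end) - 1) / (2 * mu))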
14.12. Solve a two-dimensional SDE, dX1 (t) = X2 (t)dt + α dW1 (t), dX2 (t) = X1 (t)dt + β dW2 (t), where (W1 (t),W2 (t)) is a two-dimensional Wiener process and α , β ∈ R. 14.13. (1) Solve a system of stochastic differential equations dX(t) = Y (t)dt, dY (t) = −β X(t)dt − α Y (t)dt + σ dW (t), where α , β , and σ are positive constants. (2) Show that in the case where the vector (X(0),Y (0)) has a joint Gaussian distribution, the vector process (X,Y ) is Gaussian. Find its covariance function. 14.14. (Feynman–Kac formula) Let {X(t), Ft ,t ∈ R+ } be a diffusion process that admits a stochastic differential dX(t) = μ (t, X(t))dt + σ (t, X(t))dW (t), 0 ≤ t ≤ T, and let there exist a solution f (t, x), (t, x) ∈ [0, T ] × R, to a partial differential equation ∂f + L f (t, x) = r(t, x) f (t, x), 0 ≤ t ≤ T, ∂t with boundary condition f (T, x) = g(x). (Here r(t, x) ∈ C([0, T ] × R), r ≥ 0, and the operator L is introduced in Problem 13.46.) Prove that T / f (t, x) = E g(X(T ))e− t r(u,X(u))du X(t) = x . 14.15. Prove that the next one-dimensional SDE has a unique strong solution: dX(t) = log(1 + X 2 (t))dt + 1IX(t)>0 X(t)dW (t), X0 = a ∈ R. 14.16. Let a, c, d be real constants and a > 0. Consider a one-dimensional SDE, dX(t) = (cX(t) + d)dt + (2aX(t) ∨ 0)1/2 dW (t).
(14.4)
(1) Prove that for any initial value X(0), the equation has a unique strong solution. (2) Prove that in the case d ≥ 0 and X(0) ≥ 0, the solution is nonnegative; that is, X(t) ≥ 0 for all t ≥ 0, a.s. 14.17. (Gronwall–Bellman lemma) Let {x(t), 0 ≤ t ≤ T } be a nonnegative contin uous on [0, T ] function that satisfies the inequality x(t) ≤ a + b 0t x(s)ds, t ∈ [0, T ], a, b ≥ 0. Prove that x(t) ≤ aebt , t ∈ [0, T ]. 14.18. Let the assumptions of Theorem 14.1 hold. Prove that under an additional assumption E|ξ |2 < ∞, the unique solution X to equation (14.1) satisfies the inequality E|X(t)|2 ≤ k1 exp(k2t), 0 ≤ t ≤ T with some constants k1 and k2 . 14.19. Let {Xs,x (t),t ≥ s}, x ∈ Rd be a solution to SDE (14.1) for t ≥ s, with initial condition Xs,x (s) = x. Assume that the coefficients of the equation satisfy both Lipschitz and linear growth conditions. Prove that for any T > 0 and p ≥ 2, there exists a constant c such that for Xs,x (t) the following moment bounds are valid.
(1) ∀ s,t ∈ [0, T ], s ≤ t ∀ x ∈ Rd : E|Xs,x (t)| p ≤ c(1 + |x| p ). (2) ∀ s,t, 0 ≤ s ≤ t ≤ T ∀ x1 , x2 : E|Xs,x1 (t) − Xs,x2 (t)| p ≤ c|x1 − x2 | p . (3) ∀ x ∀ s,t1 ,t2 ∈ [0, T ], s ≤ t1 ∧ t2 : E|Xs,x (t1 ) − Xs,x (t2 )| p ≤ c(1 + |x| p )|t1 − t2 | p/2 . (4) ∀ x ∀ s1 , s2 ,t ∈ [0, T ], t ≥ s1 ∨ s2 : E|Xs1 ,x (t) − Xs2 ,x (t)| p ≤ c(1 + |x| p )|s1 − s2 | p/2 . 14.20. Assume the conditions of Problem 14.19. Prove that the process {Xs,x (t), t ≥ s}, x ∈ Rn has a modification that is continuous in (s,t, x). 14.21. Let (Ω , F, {Ft }t∈[0,T ] , P) be a filtered probability space, {W (t), Ft ,t ∈ [0, T ]} be a Wiener process, and {γ (t), Ft ,t ∈ [0, T ]} be a progressively measurable stochastic process with P{ 0T γ 2 (s)ds < ∞} = 1. Consider an SDE dX(t) = γ (t)X(t)dW (t), X(0) = 1. (1) Prove that there exists a nonnegative continuous solution to the equation, which is unique and given by the formula X(t) = exp{ 0t γ (s)dW (s) − 12 0t γ 2 (s)ds}. The process X(t) is called a stochastic exponent (see Problem 13.30). (2) Prove that the process X is a supermartingale and EX(t) ≤ 1. 14.22. (Novikov’s condition for martingale property of stochastic exponent) Let the conditions of Problem 14.21 hold. Then under E exp{ 12 0T γ 2 (s)ds} < ∞, a supermartingale X(t, γ ) := exp{ 0t γ (s)dW (s) − 12 0t γ 2 (s)ds} is a martingale and EX(t, γ ) = 1. 14.23. Assume the conditions of Problem 14.21 and for a certain δ > 0 let it hold that sup E exp{δ γ 2 (t)} < ∞. t≤T
Prove that EX(T, γ ) = 1, t ∈ [0, T ]. 14.24. Let under the conditions of Problem 14.21, γ be a Gaussian process with supt≤T E|γ (t)| < ∞ and supt≤T Dγ (t) < ∞. Prove that EX(T, γ ) = 1. 14.25. (Girsanov theorem for continuous time) Let under the conditions of item (1) of Problem 14.21, it hold EX(t, γ ) = 1. Define a probability measure Q by an equality dQ= X(t, γ )dP. Prove that on the probability space (Ω , F,Q) a stochastic process (t) := Wt − t γ (s)ds is Wiener with respect to the filtration {Ft }t∈[0,T ] . W 0 14.26. Let a : Rn → Rn be a bounded measurable function. Construct a weak solution {X(t) = Xx (t)} to an SDE dX(t) = a(X(t))dt + dW (t), X(0) = x ∈ Rn . 2 14.27. Let W be a Wiener ( process. Prove that X(t) := W (t) is a weak solution to an SDE dX(t) = dt + 2 |X(t)|dW (t) where W is another Wiener process.
14.28. Let X(t) = x0 + ∫_0^t a(s)dL(s) + ∫_0^t
b(s)dW (s),
where L(t),t ≥ 0 is a continuous nondecreasing adapted process. Consider the pro from Problem 13.60 and introduce processes X(t) = X(A−1 (t)) cesses β , A−1 , and W −1 and L(t) = L(A (t)). Prove that + b(A−1 (t))β −1/2 (A−1 (t))dW (t). = a(A−1 (t))d L(t) d X(t) In particular, if β (t) = c(X(t)), a(t) = α (X(t)), and b(t) = σ (X(t)), where α , σ , c is a solution are nonrandom functions, c(x) > 0, and L(t) = t, then the process X(t) to SDE, −1 −1/2 (t). = α (X(t))c (X(t))dt + σ (X(t))c (X(t))dW d X(t) 14.29. Let measurable processes α (t) and β (t), t ≥ 0 be adapted to the σ -algebra generated by a Wiener process and ∃ c,C > 0 ∀ t ≥ 0 : |α (t)| ≤ C, c ≤ β (t) ≤ C. Prove that for a process X with dX(t) = α (t)dt + β (t)dW (t), ≥ 0, it holds with probability 1 that lim sup |X(t)| = +∞ a.s. t→∞
14.30. Let X(t),t ≥ 0 satisfy SDE, dX(t) = a(X(t))dt + b(X(t))dW (t), t ≥ 0, where a, b : R → R satisfy the Lipschitz condition. Let b(x) > 0 for all x ∈ [x1 , x2 ] and / X(0) = x0 ∈ [x1 , x2 ]. Prove that with probability 1 the exit time τ = inf{t ≥ 0| X(t) ∈ (x1 , x2 )} of the process X from the interval [x1 , x2 ] is finite and Eτ m < ∞ for all m > 0. 14.31. Let {X(t),t ≥ 0} be a solution to SDE (14.3) with initial condition X(0) = x, where b, σ : R → R satisfy the Lipschitz condition and σ (x) = 0, x ∈ R. For the process X, prove that the probability pab (x), x ∈ (a, b), to hit the point a before the point b equals (s(b) − s(x))/(s(b) − s(a)), where y x 2b(z) dz dy exp − s(x) = 2 c1 c2 σ (z) with arbitrary constants c1 and c2 . 14.32. Find the probability pab (x), x ∈ (a, b) to hit the point a before the point b for a process X that satisfies the next stochastic differential equations with initial condition X(0) = x. (a) dX(t) = dW (t). (b) dX(t) = dW (t) + Kdt. (c) dX(t) = (2 + sin X(t))dW (t). (d) dX(t) = AX(t)dt + BX(t)dW (t) with B = 0 and A > 0. (e) dX(t) = (A/(X(t)))dt + dW (t), where A > 0.
222
14 Stochastic differential equations
14.33. Find the probability that a path of a Wiener process intersects a straight line y = kt + l with t ≥ 0. 14.34. Prove that with probability 1 the process X(t) from Problem 14.32 (e) does not hit the origin in finite time for A ≥ 1/2. 14.35. Let W be a Wiener process on Rn with n > 1, and x ∈ Rn \ {0}. Prove that the stochastic process X(t) = x +W (t) satisfies SDE, n−1 dt + dB(t), dX(t) = 2X(t) where B(t) is a one-dimensional Wiener process. Use Problem 14.34 and check that P(∃ t > 0| X(t) = 0) = 0. 14.36. Let coefficients of an SDE and a function s satisfy the conditions of Problem 14.31. Prove that: (a) If limx→+∞ s(x) = +∞ and limx→−∞ s(x) = −∞, then P(sup X(t) = +∞) = P(inf X(t) = −∞) = 1. t≥0
t≥0
(b) If limx→+∞ s(x) = +∞ and limx→−∞ s(x) = c ∈ R, then P(sup X(t) < ∞) = P(inf X(t) = −∞) = P( lim X(t) = −∞) = 1. t→+∞
t≥0
t≥0
(c) If there exist finite limits limx→−∞ s(x) = s(−∞) and limx→+∞ s(x) = s(+∞), then P(supt≥0 X(t) < ∞/X(0) = x) = P(inft≥0 X(t) = −∞/X(0) = x) s(+∞)−s(x) = s(+∞)−s(−∞) . 14.37. Assume that coefficients of an SDE satisfy the conditions of Problem 14.31. Prove that: (a) P(lim sup |X(t)| = ∞) = 1. t→∞
(b) P(lim inf X(t) = −∞, lim sup X(t) ∈ R) = 0. t→∞
t→∞
(c) With probability 1 one of the three disjoint events occurs: either limt→∞ X(t) = +∞, or limt→∞ X(t) = −∞, or {lim inft→∞ X(t) = −∞, lim sup X(t) = +∞}; that is, t→∞
P(lim inf X(t) = −∞, lim sup X(t) = +∞) t→∞
t→∞
+ P(lim X(t) = −∞) + P(lim X(t) = +∞) = 1. t→∞
t→∞
14.38. Let the coefficients of an SDE satisfy the conditions of Problem 14.31. Denote by τ[a,b] = inf{t ≥ 0 | X(t) ∉ [a, b]} the first exit time of the process X from an interval [a, b]. Prove that the function v(x) = E(τ[a,b]/X(0) = x) is finite and equal to
v(x) = −∫_a^x 2ϕ(y)(∫_a^y dz/(σ²(z)ϕ(z)))dy + (∫_a^b 2ϕ(y)(∫_a^y dz/(σ²(z)ϕ(z)))dy) · (∫_a^x ϕ(z)dz)/(∫_a^b ϕ(z)dz),
where
ϕ(x) = exp{−∫_a^x (2b(z)/σ²(z))dz}.
14.39. Let τ be the exit time of a Wiener process from an interval [−a, b] with a > 0 and b > 0. Find Eτ .
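Before solving Problem 14.39 analytically (via Problem 14.38), one may estimate Eτ by simulation; the answer Eτ = ab given in the Answers section serves as the benchmark. The choice a = 1, b = 2 in the Python sketch below is illustrative.

import numpy as np

# Monte Carlo estimate of the mean exit time of a Wiener process from [-a, b].
rng = np.random.default_rng(4)
a, b, dt, n_paths = 1.0, 2.0, 1e-3, 20_000

w = np.zeros(n_paths)
steps = np.zeros(n_paths)
alive = np.ones(n_paths, dtype=bool)
while alive.any():
    w[alive] += np.sqrt(dt) * rng.standard_normal(alive.sum())
    steps[alive] += 1
    alive &= (w > -a) & (w < b)

print("estimated E tau:", (steps * dt).mean(), "  a*b =", a * b)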
Hints 14.1. Apply the Itˆo formula. 14.2. Apply the Itˆo formula to X(t) = (r − (σ 2 /2))t + σ W (t) and F(x) = ex . 14.3. (1) Note that it is a linear heterogeneous equation. Apply a method similar to the variation of constants method. (2) Apply the Itˆo formula. 14.4. Look for a solution in the form X(t) = X(0) exp{at + ∑nk=1 bkWk (t)}, apply the Itˆo formula, and take into account the independence of components of Wk . t (T −t) ((dW (s))/(T − s)) = 0 a.s., set M(t) = 14.5. In order to prove that lim t→T − 0 t ((dW (s))/(T − s)), apply modified Theorem 7.16 (some analogue of Doob’s mar0 tingale inequality for continuous time), and prove that P(supT (1−2−n )≤t≤T (1−2−n−1 ) (T − t)|M(t)| > ε ) ≤ 2ε −2 2−n . Apply the Borel–Cantelli lemma and obtain that for a.a. ω there exists n(ω ) < ∞ such that for all n ≥ n(ω ) it holds ω ∈ An , where An = {ω |
sup
(T − t)|M(t)| > 2−(n/4) }.
T (1−2−n )≤t≤T (1−2−n−1 )
14.6. (b) Multiply both sides of the equation by e−t and compare with d(e−t W (t)). 14.7, 14.8. Use Problem 14.1. 14.9. Multiply both sides by exp{−α W (t) + 12 α 2t}. 14.10. (1) First solve an SDE, . 1 dX(t) = 1 + X 2 (t)dW (t) + X(t)dt. 2 For this purpose write the Itˆo formula for a function f which is unknown at the moment: ( t f (X(s)) 1 +X 2 (s)dW (s) f (X(t)) = f (X(0)) + 0 t + 0 f (X(s))X(s) ds + 1/2 0t f (X(s))(1 + X 2 (s))ds, and find the function f such that the integrand expression in the Lebesgue integral in the latter equality is identical zero. Then use the fact that in the initial equation the first summands in the drift and diffusion coefficients coincide. (2) For the most part, the reasoning is similar. 14.11. Use the Itˆo formula for X 2 (t), compose an ordinary differential equation for EX 2 (t), and solve it. 14.15. Use Theorem 14.1. 14.16. (1) Check that the conditions of Theorem 14.2 hold true. (2) Separately consider the cases d = 0 and d > 0. In the first case, based on uniqueness of the solution to equation (14.4), prove that X(t) ≡ 0 if X(0) = 0; and if X(0) > 0 then set σ = inf{t| X(t) = 0} and show that X(t) = X(t ∧ σ ). Let d > 0. We set σ−ε = inf{t| X(t) = −ε } where ε > 0 satisfies −cε + d > 0. Suppose that P(σ−ε < ∞) > 0. Then with probability 1, if to choose any r < σ−ε such that X(t) < 0
224
14 Stochastic differential equations
for t ∈ (r, σ−ε ), we have dX(t) = (cX(t) + d)dt in the interval (r, σ−ε ); that is, X(t) is growing in this interval, which is impossible. 14.18. Use the Gronwall–Bellman lemma. 14.20. Use the results of the previous problem and check that ∀ R > 0, T > 0, p ≥ 2 ∃c > 0 ∀ s1 , s2 ,t1 ,t2 ∈ [−T, T ], s1 ≤ t1 , s2 ≤ t2 ∀ x1 , x2 ∈ Rn , x2 ≤ R, x2 ≤ R : EXs1 ,x1 (t1 ) − Xs2 ,x2 (t2 ) p ≤ c(|s1 − s2 | p/2 + |t1 − t2 | p/2 + x1 − x2 p ). Use Problem 3.12 and Theorem 3.7. 14.21. (1) Existence of the solution of the given form can be derived from the Itˆo formula. Let Y be another continuous solution. Prove by the Itˆo formula that d((Y (t))/(X(t))) = 0. (2) It follows from Problem 13.31. 14.24. Use Problem 14.23. 14.25. Similarly to Problem 7.96, check that the process {Mt } is a Q-local martinNext, use the Itˆo formula and gale if and only if Mt X(t, γ ) is a local P-martingale. (t) · X(t, γ ) = t X(s, γ )dW (s) + t W (s)X(s, γ ) f (s)dW (s); that is, this obtain that W 0 0 (t) is a local Q-martingale. Now, because process is a local P-martingale. Thus, W is a square ]t = t, obtain, based on Theorem 7.17, that W the quadratic variation [W integrable Q-martingale. 14.27. Write the Itˆo formula for X and compare it with the required equation. Use Problem 13.58. 14.28. Make an ordinary change of variables in the first integral: A−1 (t) 0
a(s)dL(s) =
t 0
a(A−1 (z))d L(z).
For another integral, use Problem 13.60. 14.33. The desired probability equals the probability that the process satisfying dX(t) = dW (t)−k dt and X(0) = −l hits the point 0. Denote τ = inf{t ≥ 0|X(t) = 0}. If k = 0 then P(τ < ∞) = 1. Let, to be specific, k < 0. Then with probability 1 (see Problem 3.18), limt→+∞ X(t) = +∞. Therefore, P(τ < ∞) = 1 if l ≥ 0. For l < 0, P(τ < ∞) = limn→∞ P(X(t) hits 0 before the point n) −kn −kl = limn→∞ e e−kn−e−1 = e−kl . 14.34. Use the reasoning from Problem 14.36. 14.35. Let ζ = inf{t ≥ 0| X(t) = 0} ∪ {+∞}. Then by the Itˆo formula dX(t) =
n Wk (t) n−1 dt + ∑ dWk (t), t ≤ ζ . 2X(t) W (t) k=1
From Problem 13.59 it follows that the process B(t) =
n
∑
t Wk (s)
k=1 0
W (s)
dWk (s)
is Wiener. 14.36. (a) Denote by Px the distribution of X under the condition X(0) = x. Then for any x1 , x2 , x1 ≤ x ≤ x2 : Px (supt≥0 X(t) ≥ x2 ) ≥ Px (the process X hits x2 before x1 ) = (s(x) − s(x1 ))/(s(x2 ) − s(x1 )) (see Problem 14.31). Let x1 → −∞.
14 Stochastic differential equations
225
(b) For any x2 > x Px (supt≥0 X(t) > x2 ) ≤ Px (there exists x1 ≤ x such that the process X hits x2 before x1 ) = limx1 →−∞ Px (X(t) hits x2 before x1 ) = (s(x) − s(−∞))/ (s(x2 ) − s(−∞)). Let x2 → +∞ and use the result of Problem 14.37. (c) Use reasoning of item (b) and the result of Problem 14.37. 14.37. (a) With probability 1 the process X(t) exits from any interval (see Problem 14.30). (b) Use the strong Markov property of a solution to SDE (see Theorem 14.3), the Borel–Cantelli lemma, and Problem 14.31, and check the following. If for some c ∈ R there exists a sequence of Markov times {τn } such that limn→∞ τn = +∞ and X(τn ) = c, then P(lim inft→∞ X(t) = −∞)= P(lim supt→∞ X(t) = +∞) = 1. If P(lim inft→∞ X(t) = −∞ and lim supt→∞ X(t) > c) > 0, then take τn = inf{t ≥ σn : X(t) = c} with σn+1 = inf{t ≥ τn : X(t) = c − 1} and σ0 = 0. (c) See (a) and (b). 14.38. Notice that the function v is twice continuously differentiable and Lv(x) = −1, x ∈ [a, b], v(a) = v(b) = 0, where Lv(x) = b(x)v (x) + 12 σ 2 (x)v (x). Then by the Itˆo formula and the properties of a stochastic Itˆo integral one has that for any n ∈ N, E(v(X(n ∧ τ[a,b] ))/X(0) = x) n∧τ
= v(x) + E 0 [a,b] Lv(X(s))ds/X(0) = x = v(x) − E(n ∧ τ[a,b] /X(0) = x).
Because τ[a,b] < ∞ a.s. (see Problem 14.30), by the dominated convergence theorem the left-hand side of the equality tends to 0, whereas by the monotone convergence theorem the expectation on the right-hand side converges to E(τ[a,b] /X(0) = x). 14.39. Use the result of Problem 14.38 with the process dX(t) = dW (t).
Answers and Solutions 14.14. By the Itˆo formula
∂f + L f (t, X(t)) dt + dP(t), d f (t, X(t)) = ∂t where dP(t) = (∂ f /∂ x)σ (t, x)dW (t). Taking into account that f is a solution to the given partial differential equation, we obtain
d f (t, X(t)) = r(t, X(t)) f (t, X(t))dt + dP(t), 0 ≤ t ≤ T. Because this SDE is linear in f , its solution has a form T T r(u,X(u))du − ts r(u,X(u))du t f (t, X(t)) + e dP(s) . f (T, X(T )) = e t
Taking the expectation under the condition X(t) = x, accounting for the boundary condition, and using the martingale property of the stochastic process P, then we obtain the desired statement.
226
14 Stochastic differential equations
14.17. Let a > 0. Then for t > 0 t = log a + b x(s)ds 0
bx(t) ≤ b. a + b 0t x(s)ds
Integrating from 0 up to t we obtain t log a + b x(s)ds − log a ≤ bt, t ∈ [0, T ], t
0
x(s)ds ≤ aebt .
Consider the case a = 0 yourself. and a + b 0 14.19. (1) Let τN = inf{t ≥ s| Xs,x (t) ≥ N} ∧ T. For simplicity we consider only the case m = n = 1. The Itˆo formula implies that t∧τN |Xs,x (z)| p−1 sign (Xs,x (z)) b(z, Xs,x (z))dz |Xs,x (t ∧ τN )| p = |x| p + p
s
t∧τN 1 +σ (z, Xs,x (z))dW (z) + p(p − 1) |Xs,x (z)| p−2 σ 2 (z, Xs,x (z))dz. 2 s Therefore, E|Xs,x (t ∧ τn )| p ≤ K1 |x| p + E st∧τN |Xs,x (z)| p−2 (1 + |Xs,x (z)| + ξsz2 )dz ≤ K2 |x| p + E st∧τN (1 + |Xs,x (z)| p )dz ≤ K3 |x| p + st (1 + E|Xs,x (z ∧ τN )| p )dz ≤ K3 |x| p + T + st E|Xs,x (z ∧ τN )| p dz .
Now, the Gronwall–Bellman lemma implies the inequality E|Xs,x (t ∧ τN )| p ≤ c(1 + |x| p ), 0 ≤ s ≤ t ≤ T, x ∈ R, where the constant c does not depend on N. Use Fatou’s lemma to prove the desired inequality. Items (2) to (4) are proven similarly to (1) based on the Itˆo formula and Gronwall–Bellman lemma. t t 2 14.22. Let a> 0, and σa = inf{t ] | 0 γ (s)dW (s) − 0 γ (s)ds = −a}, σa = T , t 2 ∈ [0, T t if inft∈[0,T ] 0 γ (s)dW (s) − 0 γ (s)ds > −a. Let also λ ≤ 0. Show that EX(σa , λ γ ) = 1. According to Problem 14.21, X(σa , λ γ ) = 1 + λ
σa 0
X(s, λ γ )γ (s)dW (s).
Thus, it is enough to show that E 0σa X 2 (s, λ γ )γ 2 (s)ds < ∞. But this relation is im plied by the next two bounds: E 0σa γ 2 (s)ds ≤ E 0T γ 2 (s)ds ≤ E exp{ 12 0T γ 2 (s)ds} < ∞, and s s X(s, λ γ ) = exp λ ( γ (u)dW (u) − γ 2 (u)du 0 0 s 2 λ × exp (λ − ) γ 2 (u)du ≤ exp{|λ |a}, for all λ ≤ 0 and s ≤ σa . 2 0
14 Stochastic differential equations
227
Now, we show that EX(σa , λ γ ) = 1 for 0 < λ ≤ 1. Define
ρ (σa , λ γ ) = eλ a X(σa , λ γ ), A(ω ) = B(ω ) =
σa
σa
σa 0
γ 2 (s)ds,
γ (s)dW (s) − γ 2 (s)ds + a ≥ 0, 0 0 √ is clear that 0 ≤ λ ≤ 1 if and only and let u(z) = ρ (σa , λ γ ), where λ = 1 − 1 − z. It √ if 0 ≤ z ≤ 1. Besides, u(z) = exp{(z/2)A(ω )+(1− 1 − z)B(ω )}. For 0 ≤ z < 1, one k ω ), where can P-a.s. expand the function u(z) into a series: u(z) = ∑∞ k=0 (z /k!)pk (√ pk (ω ) ≥ 0 P-a.s., for all k ≥ 0. Problem 13.31 implies that Eu(z) ≤ ea(1− 1−z) < ∞. k If 0 ≤ z0 < 1 and |z| ≤ z0 , then E ∑∞ k=0 (|z| )/(k!)pk (ω ) ≤ Eu(z0 ) < ∞. Therefore, k due to the Fubini Theorem for any |z| < 1 we have Eu(z) = ∑∞ k=0 (z /k!)Epk (ω ). √ k On the other hand, for −∞ ≤ z < 1 we have ea(1− 1−z) = ∑∞ k=0 (z /k!)ck , where ck ≥ 0, k ≥ 0. From this, and also from the equality Eρ (σa , λ γ ) = eλ a , λ ≤ 0, we ∞ k k Epk (ω ) = ck , obtain for −1 < z ≤ 0 that ∑∞ k=0 (z /k!)Epk (ω ) = ∑k=0 (z /k!)ck , thus, √ ∞ k k ≥ 0, and then for 0 ≤ z < 1 we have Eu(z) = ∑k=0 (z /k!)ck = ea(1− 1−z) . Because A(ω ) and B(ω ) are nonnegative P-a.s., then ρ (σa , λ γ ) ↑ ρ (σa , γ ) for λ ↑ 1. Due to the monotone convergence theorem, ea = lim Eρ (σa , λ γ ) = Eρ (σa , γ ). λ ↑1
Evidently, 1 = EX(σa , γ ) = EX(σa , γ )1Iσa 2δ . Represent X(T, γ ) as a product: X(T, γ ) = ∏n−1 j=0 X(t j ,t j+1 , γ ), where 0 = t0 < t1 < · · · < tn = T t
t
and X(t j ,t j+1 , γ ) = exp{ t jj+1 γ (t)dW (t) − 12 t jj+1 γ 2 (t)dt}, t j+1 − t j ≤ 2δ , 0 ≤ j ≤ n − 1. Then EX(t j ,t j+1 , γ ) = 1 E(X(t j ,t j+1 , γ )/Ft j ) = 1 P-a.s. Thus,EX(T, γ ) = E(E(X(T, γ )/Ftn−1 )) = EX(tn−1 , γ ) = · · · = EX(t1 , γ ) = 1. 14.26. Define a martingale M(t) = exp{ 0t a(W (s))dW (s)− 0t a2 (W (s))ds/2}, where W is a Wiener process. Fix T > 0 and define a probability measure Q by relation on FW dQ = MT dP T . Then, due to the Girsanov theorem, the stochastic process t ˆ W (t) := − 0 a(W (s))ds + W (t) is a Wiener process with respect to the measure Q
228
14 Stochastic differential equations
for t ≤ T, at that dW (t) = a(W (t))dt + dWˆ (t). If one sets W (0) = x, then the couple (W, Wˆ ) forms the desired weak solution. 14.29. Perform the time change ξ (t) = X(A−1 (t)), where A(t) = 0t β 2 (s)ds (see (t), where α (t)dt + dW (t) = α (A−1 (t)) × Problem 14.28). Then d ξ (t) = α −2 −1 2 (t)| ≤ K := C/c . Use Problem 5.39 with a = 1 and b = 2K +2m+1. β (A (t)), |α 14.30. Let bounded functions a and b satisfy the Lipschitz condition, b(x) ≥ c > 0, x ∈ R, where c is a constant, and a(x) = a(x), b(x) = b(x) for all x ∈ [x1 , x2 ]. Denote by X(t) a solution to SDE with coefficients a and b. Then P(X(t) = X(t), t ≤ τ ) = 1 and the process X satisfies the condition of Problem 14.29, whence P(τ < ∞) = 1. One can see from the solution of Problems 14.29 and 5.39 that the probabilities P(τ ≥ n) do not exceed αβ n , where α > 0 and β ∈ (0; 1) are constants. This implies that all moments of τ are finite. 14.31. Notice that the function s satisfies the equation Ls(x) = 0, x ∈ R, where L = b(x)(d/dx) + 12 σ 2 (x)(d 2 /dx2 ). Therefore, the Itˆo formula implies that ds(X(t)) = f (X(t))dW (t), where f (x) = exp{− cx2 ((2b(z))/(σ 2 (z)))dz}σ (x). Because f is a bounded in (a, b) function and the exit time ζab of the process X(t) from the inter2 < ∞ (see Problem 14.30), then val (a, b) is finite with probability 1, we have E ζab Es(X(ζab )) = s(x). Thus, s(a)P(X(ζab ) = a) + s(b)P(X(ζab ) = b) = s(x), and therefore, P(X(ζab ) = a) = (s(b) − s(x))/(s(b) − s(a)). 14.32. pab (x) = (s(b) − s(x))/(s(b) − s(a)), where (see Problem 14.31): (a) s(x) = x. (b) s(x) = −e−kx . (c) s(x) = x. (d) s(x) =
− 2A +1
x B2 , log x,
x−2A+1 , (e) s(x) = log x, −kl e , kl > 0 14.33. 1, kl ≤ 0. 14.39. Eτ = ab.
2A = B2 2A = B2 .
A = 1/2 A = 1/2.
15 Optimal stopping of random sequences and processes
Theoretical grounds The optimal stopping problem can be considered for nearly any stochastic process, and its formulation will be similar in each case. But its solution will be relatively simple only for a few processes. One class of such processes consists of discrete-time Markov chains. Let{Xn , n ∈ Z+ } be a Markov chain with a finite or countable phase space (X, X) and one-step transition probabilities {pxy , x, y ∈ X}. Consider also a bounded function f : X → R+ . Let X∞ := x0 and f (x0 ) = 0. We need the following: (1) to compute v(x) := supτ Ex f (Xτ ), where sup is taken over all Markov moments τ , and the chain starts from the point x ∈ X; and (2) to find a Markov moment τ0 , for which v(x) = Ex f (Xτ0 ). As in game theory, f (x) is called a payoff function, or a premium function, v(x) is a price of the game, and τ0 is an optimal strategy. Often only stopping times τ are considered and in this case τ0 is said to be an optimal stopping time. Definition 15.1. A function g : X → R+ is said to be excessive (with respect to a chain X) if for any x ∈ X it holds Pg(x) ≤ g(x), where P is the operator defined by Pg(x) = ∑y∈X pxy g(y). Lemma 15.1. If a function f is excessive, then for any Markov moment τ the inequality f (x) ≥ Ex f (Xτ ) holds. Remark 15.1. In fact, an optimal strategy for an excessive function is to stop immediately. Lemma 15.2. The price of the game is an excessive function. Furthermore, the price of the game v(x) is the least among all excessive functions h(x) with h(x) ≥ f (x) for any x ∈ X. The price of the game is said to be the excessive majorant for a function f or the least excessive majorant for f . D. Gusak et al., Theory of Stochastic Processes, Problem Books in Mathematics, 229 c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-87862-1 15,
230
15 Optimal stopping of random sequences and processes
Corollary 15.1. If #X = n, then the price of the game v(x) is a minimal function satisfying the system of 3n inequalities v(x) ≥
∑ pxy v(y), v(x) ≥ f (x), v(x) ≥ 0, x ∈ X.
y∈X
Definition 15.2. A set Γ of points x where the payoff f (x) is equal to its excessive majorant v(x), is called a supporting set or a stopping set. A set Γ c of points x where the payoff function f (x) is less than its excessive majorant v(x), is said to be a continuation set. Theorem 15.1. If the phase space X is finite, the time τ0 of the first hit of a stopping set by a chain X visits a stopping set at the first time is an optimal strategy (and it is also an optimal stopping). Let now {X(t), t ≥ 0} be a diffusion process defined on Rn . That is, X(t) satisfies an SDE dX(t) = b(t, X(t))dt + σ (t, X(t))dW (t), where X(t) ∈ Rn , b(t, x) : R+ × Rn → Rn , σ (t, x) : R+ × Rn → Rn×m and {W (t),t ∈ R+ } is an m-dimensional Wiener process. Let also f : Rn → R+ be a continuous payoff function. The optimal stopping problem in this case consists in finding a stopping time τ0 with respect to the filtration {FtW ,t ∈ R+ } generated by W , such that Ex f (X(τ0 )) = supτ Ex f (X(τ )), x ∈ Rn where sup is taken over all Markov stopping times is considered and the process starts at t = 0 from x. We also have to find v f (x) := Ex f (X(τ0 )). Let us introduce some notions. All functions are supposed to be measurable. Definition 15.3. A function f : Rn → R+ is said to be lower semicontinuous if for any point x ∈ Rn and for any sequence xn → x as n → ∞, it holds f (x) ≤ lim inf f (xn ). n→∞
Definition 15.4. A lower semicontinuous function f : Rn → R+ is said to be superharmonic with respect to a diffusion process X if f (x) ≥ Ex f (Xτ ) and stopping times τ . If the lower semicontinuity is not required, then for all x ∈ the function is called excessive (cf. Lemma 15.1). Rn
Definition 15.5. Let h : Rn → R+ be a measurable function and f be a superharmonic function with f ≥ h. Then f is said to be a superharmonic majorant for h. If, in addition, for any superharmonic majorant f1 for the function h it holds f1 ≥ f , then f is called the least superharmonic majorant for h and denoted by h. Concerning a construction of a superharmonic majorant, see Problem 15.14. If “superharmonic function” in Definition 15.5 is replaced by “excessive function”, then one gets definitions of excessive majorant and the least excessive majorant. Denote Γ = {x| f (x) = v f (x)}, Γ c = {x| f (x) < v f (x)}, τΓ = inf{t ∈ R+ | X(t) ∈ Γ }, and ΓN = {x| f (x) ∧ N = f (x) ∧ N}.
Theorem 15.2. Let f : Rn → R+ be a continuous payoff function. Then
(1) vf(x) = f̂(x), the least superharmonic majorant of f.
(2) If τΓ < ∞ Px-a.s. and the sequence {f(XτΓN), N ≥ 1} is uniformly integrable, then vf(x) = Ex f(XτΓ) and τ0 = τΓ is an optimal stopping time.
(3) If τ1 is an optimal stopping time, then τ1 ≥ τΓ, Px-a.s.
Theorem 15.3. (Construction of the least superharmonic majorant) Let g0 be a nonnegative lower semicontinuous function defined on Rn. Define inductively
gn(x) = sup_{t∈Sn} Ex gn−1(X(t)),
where Sn := {k 2^{−n} | 0 ≤ k ≤ 4^n}, n ∈ N. Then gn ↑ ĝ0, where ĝ0 is the least superharmonic majorant for the function g0.
Now let a premium function g ∈ C²(Rn) and let L be the generator of a homogeneous diffusion process X satisfying the SDE dX(t) = b(X(t))dt + σ(X(t))dW(t); that is,
L f(x) = ∑_{i=1}^{n} bi(x) ∂f/∂xi + (1/2) ∑_{i,j=1}^{n} (σσ⊤)ij(x) ∂²f/(∂xi ∂xj).
Introduce the set
U = {x ∈ Rn | L g(x) > 0}.    (15.1)
Then U ⊂ Γc.
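As a quick added illustration of (15.1), one can apply the generator of a one-dimensional diffusion to a smooth premium symbolically and read U off from the sign of Lg. The sketch below is not from the original text; the drift b(x) = −x, diffusion σ(x) = 1, and premium g(x) = x² are arbitrary choices, and sympy is assumed to be available.

import sympy as sp

x = sp.symbols('x', real=True)
b = -x                      # drift of the hypothetical diffusion dX = b(X)dt + sigma(X)dW
sigma = sp.Integer(1)
g = x**2                    # premium function, g in C^2

# one-dimensional generator: L g = b g' + (1/2) sigma^2 g''
Lg = sp.simplify(b*sp.diff(g, x) + sp.Rational(1, 2)*sigma**2*sp.diff(g, x, 2))
U = sp.solveset(Lg > 0, x, domain=sp.S.Reals)
print(Lg)    # 1 - 2*x**2
print(U)     # the open interval where Lg > 0; by (15.1) it lies in the continuation set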
Bibliography [19], Chapter III; [67]; [23], Chapter 6; [81]; [54], Chapter 2; [61], Chapter X; [85].
Problems 15.1. Prove Lemma 15.1. 15.2. Prove: if a function f is excessive and τ ≥ σ are two Markov moments, then Ex f (Xσ ) ≥ Ex f (Xτ ). 15.3. Let a function f be excessive, and τΓ be a time of the first entry into a set Γ ⊂ X by a Markov chain {Xn , n ∈ Z+ }. Prove that the function h(x) := Ex f (XτΓ ) is excessive as well. 15.4. Prove: if a payoff function f is excessive, then the price of the game v equals f . 15.5. Prove: if an excessive function g dominates a payoff function f then it also dominates the price of the game v. 15.6. Prove Lemma 15.2.
15.7. Prove the following properties of excessive functions defined on X. (1) If a function f is excessive and α > 0, then the function αf is excessive as well. (2) If functions f1, f2 are excessive, then the sum f1 + f2 is excessive as well. (3) If {fα, α ∈ A} is a family of excessive functions, then the function f(x) := infα∈A fα(x) is excessive as well. (4) If {fn, n ≥ 1} are excessive functions and fn ↑ f pointwise, then f is an excessive function as well.
15.8. Let a Markov chain {Xn, n ∈ Z+} be a symmetric random walk with absorption points 0 and N. That is, X = {0, 1, . . . , N}, pxy = 1/2 if x = 1, 2, . . . , N − 1, y = x ± 1, pNN = 1, and p00 = 1. Prove: the class of functions f : X → R+ which are excessive w.r.t. this random walk coincides with the class of concave functions.
15.9. Let a Markov chain {Xn, n ∈ Z+} be the same as in Problem 15.8. (1) Prove that the price of the game v(x) is the least concave function for which v(x) ≥ f(x), x ∈ X = {0, 1, . . . , N}. (2) Prove that to stop at the time τ0 when the chain first enters any of the points x with f(x) = v(x) is an optimal strategy.
15.10. Let a Markov chain {Xn, n ∈ Z+} be a random walk over the set X = {0, 1, . . . , N} with px(x+1) = p, px(x−1) = q = 1 − p ≠ p if x = 1, 2, . . . , N − 1 and pNN = 1, p00 = 1 (nonsymmetric random walk with absorption at the points 0 and N). (1) Describe the class of functions f : X → R+ which are excessive w.r.t. the random walk. (2) Let a premium function be f(x) = x. Find an optimal strategy and calculate the price of the game if: (a) q > p; (b) q < p.
15.11. Let a Markov chain {Xn, n ∈ Z+} be a random walk over the set X = {0, 1, . . . , N} with px(x+1) = p, px(x−1) = q = 1 − p if x = 1, 2, . . . , N − 1 and pN(N−1) = 1, p01 = 1 (nonsymmetric random walk with reflection at the points 0 and N). Describe the class of functions f : X → R+ that are excessive w.r.t. the random walk.
15.12. (Optimal stopping for a Wiener process with absorption) Consider a Wiener process {W(t), t ∈ R+} with W(0) = x ∈ [0, a]. Furthermore, if the process visits the point 0 or a, then it stays there forever; that is, X = [0, a] (such a process is called a Wiener process with absorption at the points 0 and a). Let a function f ∈ C([0, a]) be nonnegative. Find the price of the game v(x) = supτ Ex f(W(τ)) and construct a Markov moment τ0 for which v(x) = Ex f(W(τ0)).
15.13. (Optimal stopping for a two-dimensional Wiener process) Let W(t) = {(W1(t), W2(t)), t ∈ R+} be a Wiener process in R2. (1) Prove that the only nonnegative superharmonic functions with respect to W are the constant functions. (2) Prove that there is no optimal stopping for an unbounded function f. (3) Prove that the continuation set is Γc = {x | f(x) < ‖f‖∞}, where ‖f‖∞ = supx∈R2 |f(x)|.
(4) Prove that if the logarithmic capacity cap(∂Γc) = 0, then τΓ = ∞ a.s. (The logarithmic capacity of a planar compact set E is the value γ(E) = exp{−V(E)}, where V(E) = infP ∫∫_{E×E} ln |u − v|^{−1} dP(u) dP(v) and the infimum is taken over all probability measures on E. The value V(E) is called the Robin constant for the set E, and the set E is called polar if V(E) = +∞ or, equivalently, if γ(E) = 0.)
(5) Prove that if cap(∂Γc) > 0, then τΓ < ∞ a.s. and vf(x) = ‖f‖∞ = Ex f(WτΓ). This means that τΓ is an optimal stopping time.
15.14. Suppose that there exists a Borel set H such that gH(x) := Ex g(X(τH)) dominates a function g and, at the same time, gH(x) ≥ Ex g(X(τ)) for all stopping times τ and all x ∈ Rn. Prove that in this case vg(x) = gH(x); that is, τH is an optimal stopping time.
15.15. (Optimal stopping for an n-dimensional Wiener process, n ≥ 3) Let W(t) = {(W1(t), . . . , Wn(t)), t ∈ R+} be a Wiener process in Rn with n ≥ 3.
(1) Let a premium function be g(x) = |x|^{−1} if |x| ≥ 1 and g(x) = 1 if |x| < 1, x ∈ R3. Prove that this function is superharmonic in R3. Furthermore, it is such that vg = g, and it is optimal to stop immediately regardless of the initial point.
(2) Let a premium function be h(x) = |x|^{−α} if |x| ≥ 1 and h(x) = 1 if |x| < 1, with α > 1; let H = {x ∈ Rn | |x| ≤ 1} and h̃(x) = Ex h(W(τH)) (recall that τH is the moment of the first hit on the set H).
(a) Show that h̃(x) = Px(τH < ∞).
(b) Show that h̃(x) = 1 if |x| < 1 and h̃(x) = |x|^{−1} if |x| ≥ 1. This means that the function h̃ coincides with the function g from item (1), which is a superharmonic majorant for h.
(c) Prove that vh = h̃ = g and that the moment τH is an optimal stopping time.
15.16. Let {W(t), t ∈ R+} be a one-dimensional Wiener process. Find vg and an optimal stopping time τ0, where it exists, if:
(a) vg(x) = supτ Ex |W(τ)|^p with p > 0.
(b) vg(x) = supτ Ex e^{−W(τ)²}.
(c) vg(s, x) = supτ Es,x(e^{−ρτ} ch W(τ)), where ρ > 0 and ch z := (e^z + e^{−z})/2. (Here the expectation is taken under the condition W(s) = x.)
15.17. Prove that U ⊂ Γc (see formula (15.1)).
15.18. (Optimal stopping for a time-dependent premium function) Let a premium function g be of the following form: g = g(t, x) : R × Rn → R+, g ∈ C(R × Rn). Find g0(x) and τ0 such that
g0(x) = supτ Ex g(τ, X(τ)) = Ex g(τ0, X(τ0)),
where X is a diffusion process with dX(t) = b(X(t))dt + σ(X(t))dW(t), t ∈ R+, and X(0) = x; here b : Rn → Rn, σ : Rn → Rn×m are measurable functions and W is an m-dimensional Wiener process.
15.19. Let X(t) = W(t), t ≥ 0, be a one-dimensional Wiener process and let a premium function be g(t, x) = e^{−αt+βx}, x ∈ R, where α, β ≥ 0 are some constants.
(1) Prove that the generator L̃ of the process Y(s,x)(t) = (s + t, W(t) + x) is given by
L̃ f(s, x) = ∂f/∂s + (1/2) ∂²f/∂x², f ∈ C²(R²).
(2) Deduce that L̃ g = (−α + β²/2) g, and that the identity vg = g holds true when β² ≤ 2α, in which case the optimal strategy is to stop instantly; if β² > 2α, then an optimal moment τ0 does not exist and vg = +∞.
15.20. Let {Yn , Fn , 0 ≤ n ≤ N} be the Snell envelope for a nonnegative process {Xn , Fn , 0 ≤ n ≤ N} (see Problem 7.22). Let Tn,N be the set of stopping times taking values in the set {n, n + 1, . . . , N}. (1) Prove that the r.v.
τ0 := inf{0 ≤ n ≤ N| Yn = Xn } is a stopping time and the stopped stochastic process {Ynτ0 = Yn∧τ0 , Fn , 0 ≤ n ≤ N} is a martingale. (2) Prove that Y0 = E(Xτ0 /F0 ) = supτ ∈T0,N E(Xτ /F0 ) (i.e., in this sense τ0 is an optimal stopping). (3) Generalizing the statements (1) and (2), prove that the r.v.
τn := inf{n ≤ j ≤ N| Y j = X j } is a stopping time and Yn = sup E(Xτ /Fn ) = E(Xτn /Fn ). τ ∈Tn,N
15.21. Prove that a stopping τ is optimal if and only if Yτ = Xτ and the stopped process {Ynτ , Fn , 0 ≤ n ≤ N} is a martingale. This statement means that τ0 from Problem 15.20 is the least optimal stopping time.
15.22. Let Yn = Mn − An be the Doob–Meyer decomposition of a Snell envelope (see Problem 7.62). Put τ1 = inf{0 ≤ n ≤ N | An+1 ≠ 0} ∧ N. Prove that τ1 is an optimal stopping time and τ1 ≥ τ for any optimal stopping time τ. This means that τ1 is the largest optimal stopping time.
15.23. Let a sequence {Xn, 0 ≤ n ≤ N} be a homogeneous Markov chain taking values in a finite set X with transition matrix P. Also let a function ψ = ψ(n, x) : {0, 1, . . . , N} × X → R be measurable. Prove that the Snell envelope for the sequence Zn := ψ(n, Xn) is determined by the formula Yn = u(n, Xn), where the function u is given by the relations u(N, x) = ψ(N, x) for any x ∈ X and, for 0 ≤ n ≤ N − 1, u(n, ·) = max(ψ(n, ·), Pu(n + 1, ·)).
15.24. Let {Yn, 0 ≤ n ≤ N} be the Snell envelope for a sequence {Xn, 0 ≤ n ≤ N}. Prove that for every n, EYn = supτ∈Tn,N EXτ. In particular, EY0 = supτ∈T0,N EXτ.
15.25. Prove that τ0 is optimal if and only if EXτ0 = supτ∈T0,N EXτ.
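As an added illustration (not part of the problem set), the backward recursion of Problem 15.23 can be run directly for a finite chain: it returns the Snell envelope u(n, x) of Problems 15.20–15.24 together with the stopping rule {ψ(n, x) = u(n, x)}. The transition matrix, horizon, and payoff below are arbitrary toy choices; numpy is assumed.

import numpy as np

def snell_envelope(P, psi):
    """Backward recursion u(N,.) = psi(N,.), u(n,.) = max(psi(n,.), P u(n+1,.));
    then Y_n = u(n, X_n) is the Snell envelope of Z_n = psi(n, X_n)."""
    N = psi.shape[0] - 1
    u = np.zeros_like(psi, dtype=float)
    u[N] = psi[N]
    for n in range(N - 1, -1, -1):
        u[n] = np.maximum(psi[n], P @ u[n + 1])
    return u

# toy data: 3-state chain, horizon N = 4, payoff psi(n, x) = x - 0.1 * n
P = np.array([[0.50, 0.50, 0.00],
              [0.25, 0.50, 0.25],
              [0.00, 0.50, 0.50]])
psi = np.array([[x - 0.1 * n for x in range(3)] for n in range(5)], dtype=float)
u = snell_envelope(P, psi)
stop_now = np.isclose(u, psi)   # tau_0 = first n with Y_n = Z_n (Problem 15.20)
print(u)
print(stop_now)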
Hints 15.2. Write formula (15.3) (see solution to Problem 15.1) for σ and τ . Deduce that Ex α σ f (Xσ ) ≥ Ex α τ f (Xτ ). Next, let α → 1. 15.4. Derive from the definition of v(x) that v(x) ≥ f (x) and from Problem 15.20 (or Lemma 15.1) that f (x) ≥ v(x). 15.7. Use the definition of excessive function or Lemma 15.1. 15.8. Use the definition of excessive function. 15.9. Use Theorem 15.1 and Problem 15.8. 15.12. I method. At first, one can prove that for the Wiener process with absorption, the class of functions g : [0, a] → R+ satisfying the condition g(x) ≥ Ex g(W (τ )) for all Markov moments τ coincides with the class of all nonnegative concave functions. The proof of the statement that any concave function satisfies this inequality is rather technically complicated (see, e.g., [19]). To prove the concavity of the function g satisfying the inequality g(x) ≥ Ex g(W (τ )), establish the inequality Ex g(W (τ )) = g(x1 )
x2 − x x − x1 + g(x2 ) , x2 − x1 x2 − x1
where τ is the moment of the first exit from the interval [x1 , x2 ] ∈ [0, a]. In this order check that the probability P(x, x1 , x2 ) to start from a point x and to get to x1 earlier than to x2 for 0 ≤ x1 ≤ x2 ≤ a is equal to (x2 − x)/(x2 − x1 ). (See Problem 14.31.) Next, prove that the price of the game is the least concave dominant of the function f . Consider a strategy τ that consists in waiting till the moment when process W visits point x1 or point x2 and then the strategies τ1 or τ2 , respectively, are used. Here
τi , i = 1, 2 are strategies leading under initial states x1 and x2 to the average payoff which is more than v(x1 ) − ε or v(x2 ) − ε . (The existence of such strategies follows from the definition of supremum.) Derive that Ex f (W (τ )) ≥
x2 − x x − x1 v(x1 ) + v(x2 ) − ε , x2 − x1 x2 − x1
and that
x − x1 x2 − x v(x1 ) + v(x2 ). x2 − x1 x2 − x1 Thus, the function v is concave. It means that the price of the game is the least nonnegative, concave dominant for the function f . Indeed, v obviously dominates f and is concave. Moreover, it follows from the above considerations that for any other concave dominant z of f we have that z(x) ≥ Ex z(W (τ )) ≥ Ex f (W (τ )) = v(x). And finally, prove that for f ∈ C([0, a]) the optimal Markov moment equals τ0 = inf{t| W (t) ∈ Γ }, where Γ = {x ∈ [0, a]| f (x) = v(x)}. II method. Use the fact that a Wiener process is a diffusion process. Take into account the absorption at 0 and a. 15.13. (1) Suppose that f is a nonnegative superharmonic function with regard to W and there exist two points x, y ∈ R2 with f (x) < f (y). Consider Ex f (W (τ )), where τ is the time of the first visit by the Markov process Wt small disk with center at y. Use a multidimensional version of the Dynkin formula (Problem 13.46) and Problem 13.55. (2) Follows directly from item (1). (3) Follows from item (1) and the definition of continuation set. (4) and (5) follow from item (3). (See also [63] and [43].) Indeed, according to [43] and [63], if a set B is compact and τB = inf{t > 0| W (t) ∈ B} then P(τB < ∞) equals 0 or 1 depending on whether the set B has zero or positive logarithmic capacity; in our case B = R2 \D is compact. v(x) ≥
15.15. (1) Follows obviously from the definitions. (2)(b) Use Problems 13.54 and 13.55. (c) Use item (b) and Problem 15.14. 15.19. The first statement is evident. For the second one consider only a case where +g(s, x) > 0} = R2 in this case, thus, Γ c = R2 . β 2 > 2α . First, the set U := {(s, x)| L It means that τ0 does not exist. Second, use Theorem 15.3 in order to construct the least superharmonic majorant: Ess,x g(Y (t)) = supt∈Sn Ee−α (s+t)+β W (t) = supt∈Sn e−α (s+t) · eβ x+(1/2)β t 2 = supt∈Sn g(s, x)e(−α +(1/2)β )t = g(s, x) exp((−α + 12 β 2 )2n ), x
2
therefore, gn (s, x) → ∞ as n → ∞. 15.23. Use the definition of a Snell envelope and the fact that for a bounded and measurable function f : X → R and a homogeneous Markov chain {Zn , 0 ≤ n ≤ N} it holds
E ( f (Zn+1 )/Fn ) = P f (Zn ), where P f (x) = ∑y∈X pxy f (y), {pxy }x,y∈X are entries of the transition matrix P. 15.24. Use Problem 15.20. 15.25. Let E(Xτ0 /F0 ) = supτ ∈T0,N E(Xτ /F0 ). Calculate the mathematical expectation for both parts and prove that EXτ0 = supτ ∈T0,N EXτ . On the contrary, let EXτ0 = supτ ∈T0,N EXτ . Prove that in this case Yτ = Xτ and Y τ are martingales.
Answers and Solutions 15.1. Put ϕ (x) = f (x)− α P f (x), 0 < α < 1. Then f (x) = (ϕ + α Pϕ +· · ·+ α n Pn ϕ + α n+1 Pn+1 f )(x) and ϕ (x) ≥ 0, x ∈ E. Furthermore, 0 ≤ Pn f = Pn−1 (P f ) ≤ Pn−1 f n n which means that α n Pn f → 0 as n → ∞. This implies that f (x) = ∑∞ n=0 α P ϕ (x) 0 n where P = I is the identity operator. Check that P ϕ (x) = Ex ϕ (Xn ). Then * ) f (x) = Ex
∞
∑ α n ϕ (Xn )
.
(15.2)
n=0
And again, similarly to (15.2), prove that Ex α τ f (Xτ ) = Ex (α τ ϕ (Xτ ) + α τ +1 ϕ (Xτ +1 ) + · · · ).
(15.3)
Comparing (15.2) with (15.3) we can conclude that f (x) ≥ Ex α τ f (Xτ ). Now, let α to 1. 15.3. Let τΓ = inf{n ≥ 1| Xn ∈ Γ }. Then, τΓ ≥ τΓ . It follows from Problem 15.2 that Ex f (Xτ ) ≤ Ex f (XτΓ ) = h(x). But, if the first step leads the Markov chain from x to Γ y then Ex f (Xτ ) = Ey f (XτΓ ) = h(y). So, Ex f (Xτ ) = ∑y∈X pxy h(y) = Ph(x). Thus, Γ Γ Ph(x) ≤ h(x), x ∈ X. 15.5. If g ≥ f and g is an excessive function, then for any strategy τ Ex f (Xτ ) ≤ Ex g(Xτ ) ≤ g(x), which implies that v(x) = supτ Ex f (Xτ ) ≤ g(x). 15.6. Because for τ = ∞ the relation Ex f (X∞ ) = 0 holds true, then v(x) ≥ 0. Fix ε > 0. It follows from the definition of supremum that for every y ∈ X there exists a strategy τε ,y such that Ey f (Xτε ,y ) ≥ v(y) − ε . Now let the strategy τ consist in making one step and then, if this step leads to the point y, we continue with strategy τε ,y . It is evident that τ is a Markov moment (check this), and for this τ
Ex f (Xτ ) =
∑ pxy Ey f (Xτε ,y ) ≥ ∑ pxy (v(y) − ε ) ≥ Pv(x) − ε .
y∈X
y∈X
Now pass to the limit as ε → 0. 15.10. (1) Put yk = (q/p)k , k = 0, . . . , N, Y = {y0 , . . . , yN }. Put a function f : {0, . . . , N} → R+ . It is excessive if and only if the function g : Y → R+ determined by the identities g(yk ) = f (k), k ∈ {0, . . . , N}, is concave. (2) For f (x) = x the function g defined above is equal to g(y) = logq/p y. It is concave if q > p and convex if q < p. For the first case a concave majorant for g is this function itself, and for the second case such majorant is a linear function w with w(y0 ) = g(y0 ) = 0, w(yN ) = g(yN ) = N; that is, w(y) = ((y − y0 )/(yN − y0 )) N. So, (a) an optimal strategy consists in the immediate stopping, and v(x) = x. (b) An optimal strategy consists in the stopping at the first of one of the points 0 or N, and x q x −1 p q = N N. v(x) = w p q −1 p 15.11. Prove that every excessive function f is constant. For any x consider the moment τx of the first entry into the state x. Because all states are connected with each other and the total number of states is finite, then for any y it holds Py (τx < +∞) = 1, and thus, Ey f (τx ) = f (x). Considering f as a payoff function, we obtain that the corresponding price of the game v(y) ≥ f (x). Because f is excessive, then v = f , and it means that f (y) ≥ f (x). Changing the roles for the points x and y we obtain the opposite inequality, f (x) ≥ f (y). So, f (x) = f (y). 15.14. It is evident that if g(x) is the least excessive majorant for the function g, then g(x) ≤ gH (x). On the other hand, gH (x) ≤ supτ Ex g(X(τ )) = vg (x), and it follows from Theorems 15.2 (1) and (15.3) that vg = gH . 15.16. (a) vg (x) = +∞, and τ0 does not exist. (b) vg (x) = 1, and τ0 = inf{t > 0|W (t) = 0}. (c) if ρ < 12 , then vg (s, x) = +∞, and τ0 does not exist; if ρ ≥ 12 , then vg (s, x) = g(s, x). 15.17. Let a point x ∈ V ⊂ U and τ0 be the time of the first exit from a bounded open set V . According to the Dynkin formula (see Problem 13.46, the multidimensional version) for any v > 0 Ex g(X(τ0 ∧ v)) = g(x) + Ex
τ0 ∧v 0
L g(Xs )ds > g(x).
It means that g(x) < vg (x); that is, x ∈ Γ c . 15.18. Reduce the nonhomogeneous in time problem to a homogeneous one, the solution of which is determined by Theorem 15.2, as follows. Define a new diffusion process
Y (t) = Ys,x (t) as Y (t) =
s+t , t ∈ R+ , Xx (t)
where Xx (t) is a diffusion starting from the point x, and s ∈ R+ . Then 1 0 (Y (t))dW (t), dW (t) = b(Y (t))dt + σ dY (t) = dt + σ (X(t)) b(X(t)) with b(y) = b(t, x) =
1 b(x)
⎞ 0...0 (y) = σ (t, x) = ⎝ . . . . . . ⎠ ∈ R(n+1)×m , ∈ Rn+1 , σ σ (x) ⎛
where y = (t, x) ∈ R × Rn . We see that Y is the diffusion process starting from the point y = (s, x). Let Py = Ps,x be the distribution of Y and Ey = E(s,x) mean the expectation with respect to the measure Py . The problem can be written in terms of Y (t) as follows: to find g0 and τ0 such that g0 (x) = vg (0, x) = sup E0,x g(Y (τ )) = E0,x g(Y (τ0 )) τ
which is a particular case for the problem of finding vg (s, x) and τ with vg (s, x) = sup Es,x g(Y (τ )) = Es,x g(Y (τ0 )). τ
This problem is standard if we replace X(t) with Y (t). 15.20. (1) Because YN = XN , τ0 is correctly defined. Besides, theevents {τ0 = 0} = {Y0 = X0 } ∈ F0 , {τ0 = n} = {Y0 > X0 } ∩ · · · ∩ {Yn−1 > Xn−1 } {Yn = Xn } ∈ Fn , 1 ≤ n ≤ N; that is, τ0 is a stopping time. Furthermore, Y(n+1)∧τ0 − Yn∧τ0 = (Yn+1 − Yn )1Iτ0 >n . Because Yn > Xn on the set {τ0 > n} then Yn = E(Yn+1 /Fn ). So, τ0 τ − Yn 0 /Fn ) = E(Y(n+1)∧τ0 − Yn∧τ0 /Fn ) = 1Iτ0 >n E (Yn+1 − E(Yn+1 /Fn )/Fn ) = E(Yn+1 τ 0. This implies that {Yn 0 , Fn , 0 ≤ n ≤ N} is a martingale. τ (2) Because Y 0 is a martingale then Y0 = Y0τ0 = E(YNτ0 /F0 ) = E(Yτ0 /F0 ) = E(Xτ0 /F0 ). From the other side, if τ ∈ T0,N , then the stopped process Y τ is a supermartingale. So, Y0τ = Y0 ≥ E(YNτ /F0 ) = E(Yτ /F0 ) ≥ E(Xτ /F0 ). Thus, E(Xτ0 /F0 ) = supτ ∈T0,N E(Xτ /F0 ). The proof of item (3) is similar to the proofs of (1) and (2), if we replace 0 with n. 15.21. Let the stopped process Y τ be a martingale. Prove, taking into account the first condition, that Y0 = E(Xτ /F0 ), and deduce the optimality of τ from item 2) of Problem 15.20. And vice versa, let a stopping τ be optimal. Prove the following sequence of inequalities using Problem 15.20 (2). Y0 = E(Xτ /F0 ) ≤ E(Yτ /F0 ) ≤ Y0 . Based on this and the inequality Xτ ≤ Yτ derive that Xτ = Yτ . Then prove the inequalities E(Yτ /F0 ) = Y0 ≥ E(Yτ ∧n /F0 ) ≥ E(Yτ /F0 ) and deduce that E(Yτ ∧n /F0 ) =
E(Yτ /F0 ) = E(E(Yτ /Fn )/F0 ). Because Yτ ∧n ≥ E(Yτ /Fn ) then Yτ ∧n = E(Yτ /Fn ) and this means that {Ynτ , Fn , 0 ≤ n ≤ N} is a martingale. 15.22. Because the process {An , 1 ≤ n ≤ N} is predictable, then τ1 is a stopping time. It is evident that Y τ1 = M τ1 as Aτn1 = An∧τ1 ≡ 0. It means that Y τ1 is a martingale. According to Problem 15.21, we need to prove that Yτ1 = Xτ1 . For ω such that τ1 = N the identities Yτ1 = YN = XN = Xτ1 hold true. If {τ1 = n}, 0 ≤ n < N, then the following identities take place: Yτ1 1Iτ1 =n = Yn 1Iτ1 =n = Mn 1Iτ1 =n = E(Mn+1 /Fn )1Iτ1 =n = E(Yn+1 + An+1 /Fn )1Iτ =n > E(Yn+1 /Fn )1Iτ =n ; that is, Yτ1 1Iτ1 =n = Yn 1Iτ1 =n = Xn 1Iτ1 =n = Xτ1 1Iτ1 =n . Thus, τ1 is an optimal stopping, τ ≥ τ1 , and P(τ > τ1 ) > 0. So, EAτ > 0 and EYτ = EMτ − EAτ = EM0 − EAτ = EY0 − EAτ < EY0 , and the stopped process M τ is a martingale (check why it is true), and thus, τ is not an optimal stopping.
16 Measures in functional spaces. Weak convergence, probability metrics. Functional limit theorems
Theoretical grounds In this chapter, we consider random elements taking values in metric spaces and their distributions. The definition of a random element taking values in X involves the predefined σ -algebra X of subsets of X. The following statement shows that in a separable metric space, in fact, the unique natural choice for the σ -algebra X is the Borel σ -algebra B(X). Lemma 16.1. Let X be a separable metric space, and X be a σ -algebra of subsets of X that contains all open balls. Then X contains every Borel subset of X. Further on, while dealing with random elements taking values in a separable metric space X, we assume X = B(X). For a nonseparable space, a σ -algebra X is given explicitly. Theorem 16.1. (The Ulam theorem) Let X be a Polish space and μ be a finite measure on the Borel σ -algebra B(X). Then for every ε > 0 there exists a compact set Kε ⊂ X such that μ (X\Kε ) < ε . Generally, we deal with functional spaces X; that is, spaces of the functions or the sequences with a given parametric set. Let us give an (incomplete) list of such spaces with the corresponding metrics. In all the cases mentioned below we consider x, y ∈ X. (1) X = C([0, T ]), ρ (x, y) = maxt∈[0,T ] |x(t) − y(t)|.
(2) X = C([0, +∞)), ρ(x, y) = ∑_{k=1}^{∞} 2^{−k} (max_{t∈[0,k]} |x(t) − y(t)| ∧ 1).
(3) X = Lp([0, T]), p ∈ [1, +∞), ρ(x, y) = (∫_0^T |x(t) − y(t)|^p dt)^{1/p}.
(4) X = L∞([0, T]), ρ(x, y) = ess sup_{t∈[0,T]} |x(t) − y(t)|.
(5) X = ℓp, p ∈ [1, +∞), ρ(x, y) = [∑_{k=1}^{∞} |xk − yk|^p]^{1/p}.
(6) X = ℓ∞, ρ(x, y) = sup_{k∈N} |xk − yk|.
(7) X = c0 = {(xk)_{k∈N} : lim_{k→∞} xk = 0}, ρ(x, y) = sup_{k∈N} |xk − yk|.
(8) X = D([a, b]) (the Skorokhod space, see Chapter 3, Remark 3.2); the metrics in this space are given below.
Definition 16.1. Let X = {X(t), t ∈ T} be a real-valued random process with T = [0, T] or T = [0, +∞). If there exists a modification X̃ of this process with all its trajectories belonging to some functional space X (e.g., one of the spaces from items (1)–(4), (8) of the list given above), and the mapping X̂ : Ω ∋ ω → X̃(·, ω) ∈ X is F–X measurable for a certain σ-algebra X in X, then the process X is said to generate the random element X̂ in (X, X). If a process X generates a random element X̂ in (X, X), then its distribution in (X, X) is the probability measure μX ≡ P ∘ X̂^{−1}, μX(A) = P(X̂ ∈ A), A ∈ X. The notions of the random element generated by a random sequence and of the corresponding distribution are introduced analogously.
Example 16.1. The Wiener process generates a random element in C([0, T]) (see Problem 16.1). The distribution of the Wiener process in C([0, T]) is called the Wiener measure.
Definition 16.2. A sequence of probability measures {μn} defined on the Borel σ-algebra of the metric space X weakly converges to a measure μ if
∫_X f dμn → ∫_X f dμ, n → ∞,    (16.1)
for an arbitrary continuous bounded function f : X → R (notation: μn ⇒ μ). If the sequence of distributions of random elements X̂n converges weakly to the distribution of the element X̂, then the sequence of random elements X̂n is said to converge weakly, or by distribution, to X̂ (notation: X̂n ⇒ X̂ or X̂n →d X̂). If processes Xn generate random elements in a functional space X and these elements converge weakly to the element generated by a process X, then the sequence of the processes Xn is said to converge to X by distribution in X.
A set A ∈ B(X) is called a continuity set for the measure μ if μ(∂A) = 0 (∂A denotes the boundary of the set A).
Theorem 16.2. All the following statements are equivalent.
(1) μn ⇒ μ.
(2) Relation (16.1) holds for every bounded function f satisfying the Lipschitz condition: there exists L such that |f(x) − f(y)| ≤ Lρ(x, y), x, y ∈ X (here ρ is the metric in X).
(3) lim sup_{n→∞} μn(F) ≤ μ(F) for every closed set F ⊂ X.
(4) lim inf_{n→∞} μn(G) ≥ μ(G) for every open set G ⊂ X.
(5) lim_{n→∞} μn(A) = μ(A) for every continuity set A for the measure μ.
Theorem 16.3. (1) Let X, Y be metric spaces and F : X → Y be an arbitrary function. Then the set DF of the discontinuity points of this function is a Borel set (moreover, a countable union of closed sets).
(2) Let random elements X̂n, n ≥ 1, in (X, B(X)) converge weakly to a random element X̂ and P(X̂ ∈ DF) = 0. Then the random elements Ŷn = F(X̂n), n ≥ 1, in (Y, B(Y)) converge weakly to the random element Ŷ = F(X̂).
Consider the important partial case X = C([0, T]). If X is a random process that generates a random element in C([0, T]), then for every m ≥ 1, t1, . . . , tm ∈ [0, T] the finite-dimensional distribution P^X_{t1,...,tm} can be represented as the image of the distribution of X in C([0, T]) under the mapping
πt1,...,tm : C([0, T]) ∋ x(·) → (x(t1), . . . , x(tm)) ∈ Rm
(the “finite-dimensional projection”). Because every function πt1,...,tm is continuous, Theorem 16.3 yields that, for every sequence of random processes Xn that converge by distribution in C([0, T]) to a process X, for every m ≥ 1, t1, . . . , tm ∈ [0, T] the finite-dimensional distributions P^{Xn}_{t1,...,tm} converge weakly to the finite-dimensional distribution P^X_{t1,...,tm}. It should be mentioned that the inverse implication does not hold true: convergence of the finite-dimensional distributions of random processes does not provide their convergence by distribution in C([0, T]) (see Problem 16.13).
Definition 16.3. (1) A family of measures {μα, α ∈ A} is called weakly (or relatively) compact if every sequence of its elements contains a weakly convergent subsequence. (2) A family of measures {μα, α ∈ A} is called tight if for every ε > 0 there exists a compact set Kε ⊂ X such that μα(X\Kε) < ε, α ∈ A.
Theorem 16.4. (The Prokhorov theorem) (1) If a family of measures {μα, α ∈ A} is tight, then it is weakly compact. (2) If a family of measures {μα, α ∈ A} is weakly compact and X is a Polish space, then this family is tight.
It follows from the definition that a sequence of measures {μn} converges weakly if and only if (a) this family is weakly compact, and (b) this family has at most one weak partial limit (i.e., if two of its subsequences converge weakly to measures μ′ and μ″, then μ′ = μ″).
The statements given above provide the following criterion, which is very useful for investigating convergence of random processes by distribution in C([0, T]).
Proposition 16.1. In order for random processes Xn to converge by distribution in C([0, T]) to a random process X it is necessary and sufficient that
(a) The sequence of their distributions in C([0, T]) is tight.
(b) All the finite-dimensional distributions of the processes Xn converge weakly to the corresponding finite-dimensional distributions of the process X.
For tightness of a sequence of distributions in C([0, T]) of stochastic processes, a wide choice of sufficient conditions is available (see [25], Chapter 9; [4], Chapter 2; [9], Chapter 5). Here we formulate one such condition that is analogous to the Kolmogorov theorem (Theorem 3.12).
Theorem 16.5. Let a sequence of random processes Xn = {Xn(t), t ∈ [0, T]}, n ≥ 1, be such that, for some constants α, β, C > 0,
E|Xn(t) − Xn(s)|^α ≤ C|t − s|^{1+β}, t, s ∈ [0, T], n ∈ N.
Then the sequence of the distributions of these processes in C([0, T]) is tight.
A random walk is a sequence of sums Sn = ∑_{k=1}^{n} ξk, n ≥ 1, where {ξk, k ∈ N} are independent random variables (in general, these variables can have various distributions, but we assume further that they are identically distributed). For random walks, see also Chapters 10, 11, and 15.
Theorem 16.6. (The Donsker theorem) Let {Sn, n ≥ 1} be a random walk with Eξk² < +∞. Then the random processes
Xn(t) = (S[nt] − [nt]Eξ1)/√(nDξ1) + (nt − [nt])(ξ[nt]+1 − Eξ1)/√(nDξ1), t ∈ [0, 1], n ≥ 1,
converge by distribution in C([0, 1]) to the Wiener process.
Corollary 16.1. Let F : C([0, 1]) → R be a functional with its discontinuity set DF having zero Wiener measure. Then F(Xn) ⇒ F(W).
Note that Corollary 16.1 does not involve any assumptions on the structure of the laws of the summands ξk, and thus the Donsker theorem is frequently called the invariance principle: the limit distribution is invariant w.r.t. the choice of the law of ξk. Another name for this statement is the functional limit theorem.
Processes with continuous trajectories have a natural interpretation as random elements valued in C([0, T]). This allows one to study efficiently the limit behavior of the distributions of functionals of such processes. In order to extend this construction to processes with càdlàg trajectories, one has to endow the set D([0, T], Y) with the structure of a metric space, and this space should be separable and complete (see Problem 16.17, where an example of an inappropriate metric structure is given). Below, we describe the metric structure on D([0, T], Y) introduced by A. V. Skorokhod.
Let Y be a Polish space with the metric ρ. Denote by Λ the class of strictly monotone mappings λ : [0, T] → [0, T] such that λ(0) = 0, λ(T) = T. Denote
‖λ‖ = sup_{s≠t} |ln((λ(t) − λ(s))/(t − s))|.
Definition 16.4. For x, y ∈ D([0, T], Y), denote
d(x, y) = inf{ε | ∃λ ∈ Λ : sup_{t∈[0,T]} ρ(x(λ(t)), y(t)) ≤ ε, sup_{t∈[0,T]} |λ(t) − t| ≤ ε},
d0(x, y) = inf{ε | ∃λ ∈ Λ : sup_{t∈[0,T]} ρ(x(λ(t)), y(t)) ≤ ε, ‖λ‖ ≤ ε}.
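As an added numerical illustration of the Donsker theorem (Theorem 16.6 above, not part of the original text), one can simulate the broken-line processes Xn and compare, for example, the tail of their maximum with the reflection-principle limit of Problem 16.35. The summand law (centered exponential) and the parameters below are arbitrary choices; numpy and scipy are assumed.

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)

def donsker_grid(xi, mean, var):
    """Values X_n(k/n), k = 0..n, of the polygonal process from Theorem 16.6;
    between grid points X_n is linear, so its maximum is attained on the grid."""
    n = xi.size
    s = np.concatenate(([0.0], np.cumsum(xi - mean)))
    return s / np.sqrt(n * var)

n, n_paths, z = 1000, 2000, 1.0
hits = 0
for _ in range(n_paths):
    xi = rng.exponential(scale=1.0, size=n)     # E xi = 1, D xi = 1 (example law)
    hits += donsker_grid(xi, mean=1.0, var=1.0).max() >= z
print(hits / n_paths)       # empirical P(max_t X_n(t) >= z)
print(2 * norm.sf(z))       # limit P(max_{t<=1} W(t) >= z) = 2 P(W(1) >= z), cf. Problem 16.35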
Theorem 16.7. (1) The functions d, d0 are metrics on D([0, T], Y). (2) The space (D([0, T], Y), d) is separable but is not complete. (3) The space (D([0, T], Y), d0) is both separable and complete. (4) A sequence {xn} ⊂ D([0, T], Y) converges to some x ∈ D([0, T], Y) in the metric d if and only if this sequence converges to x in the metric d0.
The last statement in Theorem 16.7 shows that the classes of closed sets in the spaces (D([0, T], Y), d) and (D([0, T], Y), d0) coincide, and therefore the definitions of weak convergence in these spaces are equivalent.
For tightness of a sequence of distributions in D([0, T]) of stochastic processes, a wide choice of sufficient conditions is available (see [25], Chapter 9; [4], Chapter 3). We formulate one of them.
Theorem 16.8. Let a sequence of random processes Xn = {Xn(t), t ∈ [0, T]}, n ≥ 1, be such that, for some constants α, β, C > 0,
E|Xn(t) − Xn(s)|^α |Xn(r) − Xn(t)|^α ≤ C|r − s|^{1+β}, s < t < r, n ∈ N.
Then the sequence of the distributions of these processes in D([0, T]) is tight.
Consider a triangular array of random variables {ξnk, 1 ≤ k ≤ n}, where for every n the random variables ξn1, . . . , ξnn are independent and identically distributed. Consider the random walk Snk = ∑_{j=1}^{k} ξnj, 1 ≤ k ≤ n, corresponding to this triangular array.
Theorem 16.9. Let the central limit theorem hold for the array {ξnk, 1 ≤ k ≤ n}; that is, there exists a random variable η such that Snn ⇒ η. Then the random processes Xn(t) := Sn,[nt], t ∈ [0, 1], n ≥ 1, converge by distribution in D([0, 1]) to the stochastically continuous homogeneous process with independent increments Z such that Z(1) =d η.
Note that, under an appropriate choice of the array {ξnk}, any infinitely divisible distribution can occur as the distribution of the variable η. Correspondingly, any Lévy process can occur as the limiting process Z.
Together with a qualitative statement about convergence of a sequence of distributions, frequently (especially in applications) explicit estimates for the rate of convergence are required. The rate of convergence for a sequence of probability distributions can be naturally controlled by a distance between the prelimit and limit distributions w.r.t. some probability metric; that is, a metric on the family of probability measures. Below, we give a list of the most important and frequently used probability metrics. The class of probability measures on a measurable space (X, X) will be denoted by P(X).
Consider first the case X = R, X = B(R). In this case, every measure μ ∈ P(R) is uniquely defined by its distribution function Fμ.
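For instance, with the Bernoulli array of Problems 16.49–16.50 (ξn1 taking the value 1 with probability λ/n and 0 otherwise), the limiting process Z in Theorem 16.9 is the Poisson process with intensity λ. The sketch below is an added illustration, not part of the original text; it builds one trajectory of Xn on the grid and compares the terminal value Snn with the Poisson law. numpy and scipy are assumed.

import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(1)
lam, n, n_paths = 3.0, 500, 20000

# one trajectory of X_n(t) = S_{n,[nt]} evaluated on the grid k/n, k = 0..n
one_path = np.concatenate(([0], np.cumsum(rng.random(n) < lam / n)))

# terminal values S_{nn}: they converge in distribution to eta ~ Pois(lam)
terminal = rng.binomial(n, lam / n, size=n_paths)
for k in range(5):
    print(k, (terminal == k).mean(), poisson.pmf(k, lam))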
Definition 16.5. The uniform metric (or the Kolmogorov metric) is the function
dU(μ, ν) = sup_{x∈R} |Fμ(x) − Fν(x)|, μ, ν ∈ P(R).
Definition 16.6. The Lévy metric is the function
dL(μ, ν) = inf{ε | Fν(x − ε) − ε ≤ Fμ(x) ≤ Fν(x + ε) + ε, x ∈ R}, μ, ν ∈ P(R).
Definition 16.7. The Kantorovich metric is the function
dK(μ, ν) = ∫_R |Fμ(x) − Fν(x)| dx, μ, ν ∈ P(R).    (16.2)
Note that the integral in the right-hand side of (16.2) can diverge, and thus the Kantorovich metric can take the value +∞.
Next, let (X, ρ) be a metric space, X = B(X).
Definition 16.8. The Lévy–Prokhorov metric is the function
dLP(μ, ν) = inf{ε | μ(A) ≤ ν(Aε) + ε, A ∈ B(X)}, μ, ν ∈ P(X),
where Aε = {y | ρ(y, A) < ε} is the open ε-neighborhood of the set A. It requires some effort to prove that dLP is a metric indeed; see Problem 16.52.
For a Lipschitz function f : X → R, denote by Lip(f) its Lipschitz constant; that is, the infimum of L such that |f(x) − f(y)| ≤ Lρ(x, y), x, y ∈ X.
Definition 16.9. The Lipschitz metric is the function
dLip(μ, ν) = sup_{f : Lip(f)≤1} |∫_X f dμ − ∫_X f dν|, μ, ν ∈ P(X).
The Lipschitz metric is closely related to the Kantorovich metric; see Theorem 16.12. Some authors use the term “Kantorovich metric” for the metric dLip.
For μ, ν ∈ P(X), denote by C(μ, ν) the class of all random elements Z = (X, Y) in (X × X, X ⊗ X) such that the first component X has distribution μ and the second component Y has distribution ν. Such a random element is called a coupling for the measures μ, ν.
Definition 16.10. The Wasserstein metric of the power p ∈ [1, +∞) is the function
dW,p(μ, ν) = inf_{(X,Y)∈C(μ,ν)} [Eρ^p(X, Y)]^{1/p}, μ, ν ∈ P(X).
In general, the Wasserstein metric can take the value +∞.
We remark that some authors insist that, from the historical point of view, the correct name for dW,p is the Kantorovich metric (or the Kantorovich–Rubinstein metric). We keep the term “Wasserstein metric”, which is now used more frequently.
The Wasserstein metric is a typical example of a coupling (or minimal) probability metric. The general definition of a coupling metric has the form
inf_{(X,Y)∈C(μ,ν)} H(X, Y),    (16.3)
where H is some metric on the set of random elements (see [92], Chapter 1). In the definition of the Wasserstein metric, H is equal to the Lp-distance H(X, Y) = ‖ρ(X, Y)‖_{Lp}.
Under quite general assumptions, the infimum in the definition of the Wasserstein metric is attained; that is, for given μ, ν ∈ P(X), p ∈ [1, +∞) there exists an element Z* = (X*, Y*) ∈ C(μ, ν) such that
Eρ^p(X*, Y*) = d^p_{W,p}(μ, ν)    (16.4)
(see Problem 16.55). Any element Z* satisfying (16.4) is called an optimal coupling for the measures μ, ν w.r.t. the metric dW,p.
In the important particular cases, explicit formulae are available both for the Wasserstein metric and for the corresponding optimal couplings.
Proposition 16.2. Let X = R, ρ(x, y) = |x − y|. For arbitrary μ, ν ∈ P(X) define the vector
Zμ,ν = (Fμ^{[−1]}(U), Fν^{[−1]}(U)),
where U is a random variable uniformly distributed on [0, 1] and F^{[−1]}(x) = inf{y | F(y) > x} is the quantile transformation for the function F. Then for every p ∈ [1, +∞) the random vector Zμ,ν is an optimal coupling for the measures μ, ν w.r.t. the metric dW,p. In particular (see Problem 16.57),
d^p_{W,p}(μ, ν) = ∫_0^1 |Fμ^{[−1]}(x) − Fν^{[−1]}(x)|^p dx, p ∈ [1, +∞).    (16.5)
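Formula (16.5) makes the one-dimensional Wasserstein distance easy to evaluate numerically: it suffices to integrate |Fμ^{[−1]} − Fν^{[−1]}|^p over (0, 1). The sketch below is an added example, not from the original text; the uniform-versus-exponential pair mimics Problem 16.58, scipy's ppf plays the role of the quantile transformation, and the integral is approximated by a midpoint rule.

import numpy as np
from scipy.stats import uniform, expon

def wasserstein_p(q_mu, q_nu, p=2, m=200000):
    """Approximate d_{W,p}(mu, nu) via (16.5):
    the p-th power equals the integral of |q_mu(x) - q_nu(x)|^p over (0, 1)."""
    x = (np.arange(m) + 0.5) / m            # midpoints of a uniform grid on (0, 1)
    return (np.mean(np.abs(q_mu(x) - q_nu(x)) ** p)) ** (1.0 / p)

lam = 2.0
d = wasserstein_p(uniform(0, 1).ppf, expon(scale=1 / lam).ppf, p=2)
print(d)    # d_{W,2}(U(0,1), Exp(lam)) for this particular lam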
Now, let (X, X) be an arbitrary measurable space. Recall that, by the Hahn theorem, for any σ-finite signed measure κ there exists a set C ∈ X such that κ(A) ≥ 0 for any A ∈ X, A ⊂ C, and κ(B) ≤ 0 for any B ∈ X, B ⊂ X\C. The measure |κ|(·) := κ(· ∩ C) − κ(· ∩ (X\C)) is called the variation of the signed measure κ, and |κ|(X) is called the total variation of κ.
Definition 16.11. The total variation metric (or the total variation distance) is the function dTV(μ, ν) = ‖μ − ν‖var, μ, ν ∈ P(X), where ‖μ − ν‖var is the total variation of the signed measure μ − ν.
Definition 16.12. The Hellinger metric is the function
dH(μ, ν) = [ ∫_X ( √(dμ/dλ) − √(dν/dλ) )² dλ ]^{1/2}, μ, ν ∈ P(X),
where λ is an arbitrary σ-finite measure such that μ ≪ λ, ν ≪ λ. The value dH(μ, ν) does not depend on the choice of λ (see Problem 16.65). The Hellinger metric is closely related to the Hellinger integrals.
Definition 16.13. The Hellinger integral of the power θ ∈ [0, 1] is the function
Hθ(μ, ν) = ∫_X (dμ/dλ)^θ (dν/dλ)^{1−θ} dλ, μ, ν ∈ P(X).
Here, as in the previous definition, λ is a measure such that μ ≪ λ, ν ≪ λ. For θ = 0 or 1, the notational convention 0⁰ = 1 is used. The Hellinger integral H1/2(μ, ν) is also called the Hellinger affinity. The values Hθ(μ, ν), θ ∈ [0, 1], do not depend on the choice of λ (see Problem 16.65). Hellinger integrals appear to be a useful tool for investigating the properties of absolute continuity and singularity of the measures μ and ν (see Definitions 17.1, 17.2 and Problems 16.68, 16.69).
In order to estimate how close two probability distributions are to each other, some “distance” functions are also used which are not metrics in the true sense. These functions can be nonsymmetric w.r.t. μ, ν, fail to satisfy the triangle inequality, and so on. Here we give one such function that is used most frequently.
Definition 16.14. For μ, ν ∈ P(X), let λ be a σ-finite measure such that μ ≪ λ, ν ≪ λ. Denote f = dμ/dλ, g = dν/dλ. The relative entropy (or the Kullback–Leibler distance) for the measures μ, ν is defined by
E(μ‖ν) = ∫_X f ln(f/g) dλ.
Here, the notational conventions 0 ln(0/p) = 0, p ≥ 0, and p ln(p/0) = +∞, p > 0, are used. The relative entropy can take the value +∞. Its value for given μ, ν does not depend on the choice of λ (see Problem 16.65).
Let us formulate the most important properties of the probability metrics introduced above. Let X be a Polish space.
Theorem 16.10. (1) A sequence {μn} ⊂ P(X) converges weakly to μ ∈ P(X) if and only if dLP(μn, μ) → 0, n → +∞. (2) The set P(X) with the Lévy–Prokhorov metric dLP forms a Polish metric space.
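For distributions on the real line with densities, the Hellinger integral, the Hellinger distance, and the relative entropy reduce to ordinary integrals and can be checked numerically. The sketch below is an added example assuming two Gaussian densities (cf. Problem 16.71); it takes λ to be the Lebesgue measure and evaluates the quantities by quadrature. scipy is assumed.

import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

f = norm(0.0, 1.0).pdf     # d mu / d lambda
g = norm(1.0, 2.0).pdf     # d nu / d lambda

H = lambda theta: quad(lambda x: f(x) ** theta * g(x) ** (1 - theta),
                       -np.inf, np.inf)[0]
d_H = np.sqrt(quad(lambda x: (np.sqrt(f(x)) - np.sqrt(g(x))) ** 2,
                   -np.inf, np.inf)[0])
kl = quad(lambda x: f(x) * np.log(f(x) / g(x)), -np.inf, np.inf)[0]

print(H(0.5), 1 - d_H ** 2 / 2)   # Hellinger affinity equals 1 - d_H^2 / 2
print(kl)                         # relative entropy E(mu || nu)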
Theorem 16.11. Assume the metric ρ on the set X is bounded. Then for every p ∈ [1, +∞),
(1) The set P(X) with the Wasserstein metric dW,p forms a Polish metric space.
(2) A sequence {μn} ⊂ P(X) converges weakly to μ ∈ P(X) if and only if dW,p(μn, μ) → 0, n → +∞.
Theorem 16.12. (The Kantorovich–Rubinstein theorem) The Wasserstein metric dW,1 coincides with the Lipschitz metric dLip. Furthermore, in the case X = R, ρ(x, y) = |x − y|, both these metrics are equal to the Kantorovich metric dK.
Convergence of a sequence of measures w.r.t. the total variation metric dTV is called convergence in variation (notation: μn →var μ). This convergence is stronger than the weak convergence; that is, μn →var μ implies μn ⇒ μ, but the inverse implication, in general, does not hold. The following statement, in particular, shows that convergence in the Hellinger metric is equivalent to convergence in variation.
Proposition 16.3. For the Hellinger metric dH and the total variation metric dTV, the following relations hold: dH² ≤ dTV ≤ 2dH.
Let us give one more property of the total variation metric, which has a wide range of applications in ergodic theory for Markov processes with a general phase space. The following statement, by different authors, is named the coupling lemma or the Dobrushin lemma.
Proposition 16.4. For any μ, ν ∈ P(X),
dTV(μ, ν) = 2 inf_{(X,Y)∈C(μ,ν)} P(X ≠ Y).
The coupling lemma states that the total variation metric, up to the multiplier 2, coincides with the coupling metric that corresponds to the “indicator distance” H(X, Y) = P(X ≠ Y).
The properties given above show that there exist close connections between various probability metrics. The variety of probability metrics used in the literature is caused by the fact that every such metric arises naturally from a certain class of models and problems. On the other hand, some of the metrics have additional properties that appear to be useful because they provide a more convenient and easy analysis involving these metrics. One such property is called the tensorization property and means that some characteristics of the metric are preserved under the operation of taking a tensor product. Let us give two examples of statements of this kind (see Problems 16.60, 16.67).
Proposition 16.5. Let X = X1 × X2 and let the metric ρ on X have the form
ρ(x, y) = (ρ1^p(x1, y1) + ρ2^p(x2, y2))^{1/p}, x = (x1, x2), y = (y1, y2) ∈ X,
where p ∈ [1, +∞) and ρ1, ρ2 are the metrics in X1, X2. Then the distance between arbitrary product-measures μ = μ1 × μ2, ν = ν1 × ν2 w.r.t. the Wasserstein metric of the power p is equal to
dW,p(μ, ν) = (d^p_{W,p}(μ1, ν1) + d^p_{W,p}(μ2, ν2))^{1/p}.
Proposition 16.6. Let X = X1 × X2. Then the distance between arbitrary product-measures μ = μ1 × μ2, ν = ν1 × ν2 w.r.t. the Hellinger metric satisfies
(1/2) dH²(μ, ν) = 1 − (1 − (1/2) dH²(μ1, ν1)) (1 − (1/2) dH²(μ2, ν2)).
In particular, dH²(μ, ν) ≤ dH²(μ1, ν1) + dH²(μ2, ν2).
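The coupling lemma (Proposition 16.4 above) is easy to check by hand for discrete distributions: on a finite set, dTV(μ, ν) = ∑_x |μ(x) − ν(x)|, and a maximal coupling keeps X = Y with the overlap mass ∑_x μ(x) ∧ ν(x). The following sketch is an added illustration, not taken from the original text; the two distributions are arbitrary.

import numpy as np

mu = np.array([0.5, 0.3, 0.2])
nu = np.array([0.2, 0.3, 0.5])

d_tv = np.sum(np.abs(mu - nu))            # total variation of the signed measure mu - nu
overlap = np.sum(np.minimum(mu, nu))      # mass a maximal coupling keeps on the diagonal X = Y
# For a maximal coupling P(X != Y) = 1 - overlap, so Proposition 16.4 reads
# d_TV(mu, nu) = 2 * inf P(X != Y) = 2 * (1 - overlap).
print(d_tv, 2 * (1 - overlap))            # the two numbers coincide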
Bibliography [4]; [9], Chapter V; [17]; [25], Chapter IX; [88]; [92], Chapter I.
Problems 16.1. Let {X(t),t ∈ [0, T ]} be a process that has a continuous modification. Prove that the process X generates a random element in C([0, T ]). 16.2. Let {X(t),t ∈ [0, T ]} be a process that has a measurable modification and such that E 0T X 2 (t) dt < +∞. Prove that the process X generates a random element in L2 ([0, T ]). 16.3. Let {X(t),t ∈ [0, T ]} be a process that has a c`adl`ag modification. Prove that the process X generates a random element in D([0, T ]). 16.4. Let X be a metric space and μ be a finite measure on B(X). Prove that (1) For any A ∈ B(X), ε > 0 there exist a closed set Fε and open set Gε such that Fε ⊂ A ⊂ Gε and μ (Gε \Fε ) < ε (the regularity property for a measure on a metric space). (2) If X is a Polish space then for any A ⊂ B(X), ε > 0 there exists a compact set Kε ⊂ A such that μ (A\Kε ) < ε (a refinement of the Ulam theorem). (3) For any p ∈ [1, +∞) the set Cb,u (X) of all bounded uniformly continuous functions on X is dense in L p (X, μ ). 16.5. Let (X, ρ ) be a Polish space, and μ : B(X) → [0, 1] be an additive set function. Prove that μ is σ -additive (i.e., is a measure) if and only if μ (A) = sup{μ (K)| K ⊂ A, K is a compact set}, A ∈ B(X).
16.6. Let X be the σ -algebra of subsets of ∞ generated by the open balls, and {ξk , k ≥ 1} are i.i.d. random variables with P(ξk = ±1) = 12 . Prove: (1) The sequence {ξk } generates a random element ξ in (∞ , X). (2) Every compact subset K ⊂ ∞ belongs to the σ -algebra X, and for every such set P(ξ ∈ K) = 0. 16.7. Let {ξk , k ≥ 1} be the sequence of i.i.d. random variables that have standard normal distribution. Prove: # " ( (1) The sequence ζk = ξk / ln(k + 1) does not generate a random element in c0 . (2) The sequence {ζk } generates a random element ζ in the space ∞ with the σ -algebra X generated by the open balls. (3) Every compact subset K ⊂ ∞ belongs to the σ -algebra X, and for every such set P(ζ ∈ K) = 0. Thereby, for the distributions of the element ζ and the element ξ introduced in the previous problem, the statement of the Ulam theorem fails. 16.8. Let Xn , X be the random variables, Xn ⇒ X, and the distribution function FX be continuous in every point of some closed set K. Prove that supx∈K |FXn (x) − FX (x)| → 0, n → +∞. 16.9. Give an example of the random vectors Xn = (Xn1 , Xn2 ), n ≥ 1, X = (X 1 , X 2 ) such that Xn ⇒ X and (a) For every n there exists a function fn ∈ C(R) such that Xn2 = fn (Xn1 ) a.s.. (b) There does not exist a measurable function f such that X 2 = f (X 1 ) a.s. Such an example demonstrates that functional dependence is not preserved under weak convergence. 16.10. Let {Xn } be a sequence of random elements in (X, B(X)) with a tight family of the distributions. Prove that for every f ∈ C(X, Y) the family of the distributions in (Y, B(Y)) of the elements Yn = f (Xn ) is also tight. 16.11. Let X, Y be metric spaces, X × Y be their product, and {μn } be a sequence of measures on the Borel σ -algebra in X × Y. Prove that the sequence {μn } is tight if and only if both the sequences {μn1 }, {μn2 } of the marginal distributions for the measures {μn } are tight. The marginal distributions for a measure μ on B(X × Y) are defined by
μ 1 (A) = μ (A × Y), A ∈ B(X),
μ 2 (B) = μ (X × B), B ∈ B(Y).
16.12. Consider the" following functional spaces: # (a) Lip([0, 1]) = f f Lip := | f (0)| + sups,t∈[0,1],s=t | f (t) − f (s)|/|t − s| < +∞ (Lipschitz functions). " # (b) Hγ ([0, 1]) = f f Hγ ≡ | f (0)| + sups,t∈[0,1],s=t | f (t) − f (s)|/|t − s|γ < +∞
(H¨older functions with the index γ ∈ (0, 1)).
Are they Banach spaces w.r.t. norms · Lip and · Hγ , respectively? Which of these spaces are separable? 16.13. Let
t
Xn (t) = nt1I[0,1/(2n)) + 1 − 1I[1/(2n),1/n) , t ∈ [0, 1], n ≥ 1, X ≡ 0. n Prove that (a) All the finite-dimensional distributions of the process Xn converge weakly to the corresponding finite-dimensional distributions of the process X. (b) The sequence {Xn } does not converge to X by distribution in C([0, 1]). 16.14. Let {a, b, c, d} ⊂ (0, 1) and a < b, c < d. Calculate the distance between the functions x = 1[a,b) and 1I[c,d) , considered as elements of D([0, 1]), w.r.t. the metric (a) d; (b) d0 . 16.15. Let xn (t) = 1I[1/2,1/2+1/n) (t),t ∈ [0, 1], n ≥ 2. (1) Prove that the sequence {xn } is fundamental in D([0, 1]) w.r.t. the metric d, but this sequence does not converge. (2) Check directly that this sequence is not fundamental in D([0, 1]) w.r.t. the metric d0 . 16.16. Is the set D([0, 1]) a closed subset of the space B([0, 1]) of all bounded functions on [0, 1] with the uniform metric? 16.17. Prove that the space D([0, 1]), endowed with the uniform metric, is complete but is not separable. 16.18. Prove that if xn → x in the metrics d of the space D([0, 1]) and the function x is continuous, then xn − x∞ → 0. 16.19. Prove that: (1) C([0, 1]) is a closed subset of D([0, 1]). (2) C([0, 1]) is a nowhere dense subset of D([0, 1]); that is, for every nonempty open ball B ⊂ D([0, 1]) there exists a nonempty open ball B ⊂ B such that B ∩ C([0, 1]) = ∅. 16.20. Give an example of sequences {xn }, {yn } in D([0, 1]) such that the sequences themselves converge in D([0, 1]) but the sequence of the R2 -valued functions {zn (t) = (xn (t), yn (t))} does not converge in D([0, 1], R2 ). 16.21. Is the mapping S : (x, y) → x + y, x, y ∈ D([0, 1]) continuous as a function D([0, 1]) × D([0, 1]) → D([0, 1])? 16.22. For a given a < b consider the following functional on C([0, 1]): Iab (x) =
1 0
1Ix(s)∈[a,b] ds,
x ∈ C([0, 1]).
Prove that this functional is not continuous, but the set of its discontinuity points has zero Wiener measure.
16 Weak convergence, probability metrics. Functional limit theorems
253
16.23. Let {X(t) ∈ [0, 1]} be a process with continuous trajectories. For a ∈ R, denote by DXa the set of t ∈ [0, 1] such that the corresponding one-dimensional distribution has an atom in the point a; that is, DXa = {t ∈ [0, 1]| P(X(t) = a) > 0}. Prove that: (1) DXa ∈ B([0, 1]). (2) The set of discontinuity points for the functional Iab introduced in the previous problem has zero measure w.r.t. distribution of the process X if and only if the set DXa ∪ DXb has zero Lebesgue measure. 16.24. For a given a < b, consider the functional Iab (Problem 16.22) on the space D([0, 1]). Prove that this functional is not continuous, but for every c = 0 the set of its discontinuity points has zero measure w.r.t. distribution of the process X(t) = N(t) + ct,t ∈ [0, 1], where N is the Poisson process. Does the last statement remain true for c = 0? 16.25. For a given z ∈ R, consider the functional on C(R+ ) inf{t | x(t) = z}, {t | x(t) = z} = ∅, τ (z, x) = +∞, {t | x(t) = z} = ∅. Prove that τ (z, ·) is not a continuous functional, but the set of its discontinuity points has zero Wiener measure. 16.26. On the space C([0, 1]), consider the functionals M(x) = max x(t), ϑ (x) = min{t | x(t) = M(x)}, x ∈ C([0, 1]). t∈[0,1]
Prove that (a) the functional M is continuous; (b) the functional ϑ is not continuous, but the set of its discontinuity points has zero Wiener measure. 16.27. Prove that the following functionals on C([0, 1]) are not continuous, but the sets of their discontinuity points have zero Wiener measure. κ(x) = min{t ∈ [0, 1]| x(t) = x(1)},
χ (x) = max{t ∈ [0, 1]| x(t) = 0},
x ∈ C([0, 1]). 16.28. Let a : [0, 1] → R be a positive continuous function. Prove that the functional Ma : D([0, 1]) → R, Ma (x) = supt∈[0,1] x(t)/(a(t)) is continuous. 16.29. Denote T (x) = inf{t ∈ [0, 1] | x(t−) = x(t)}, x ∈ D([0, 1]). Is T (·) a continuous functional on D([0, 1])? 16.30. For c > 0, denote Tc (x) = inf{t ∈ [0, 1] | |x(t−) − x(t)| > c}, x ∈ D([0, 1]), that is, the moment of the first jump of the function x with the jump size exceeding c. Prove that for any measure μ on D([0, 1]) there exists at most countable set Aμ such that for arbitrary c ∈ Aμ the set of discontinuity points for Tc has zero measure μ .
16.31. Denote τa (x) = inf{t ∈ [0, 1]| x(t) ≥ a}, x ∈ D([0, 1]), that is, the moment of the first passage of x over the level a (if the set is empty, then put τa (x) = 1). Describe the set of values a ∈ R such that the set of discontinuity points for τa has zero measure w.r.t. the distribution of the Poisson process in D([0, 1]). 16.32. Prove that for arbitrary a ∈ R the set of discontinuity points for the functional τa introduced in the previous problem has zero measure w.r.t. the distribution of the process X(t) = N(t) − t,t ∈ [0, 1], where N is the Poisson process. 16.33. Let {Sn = ∑k≤n ξk , n ∈ Z+ } be a random walk with Eξk = 0, Eξk2 = 1. Prove that for any a < b √ √ 1 P(Sn ∈ [a N, b N]) → ∑ N n≤N
1 b 0
a
1 −y2 /(2s) √ e dyds, N → ∞. 2π s
16.34. Let √{Sn , n ∈ Z+ } be as in the previous problem. Denote HSN (z) = #{n ≤ N| Sn ≥ z · N}, z ∈ R. Prove that P
1 N H (0) ≤ α N S
→
√ 2 arcsin α , N → ∞, α ∈ (0, 1). π
16.35. (1) Let {Sn = ∑k≤n ξk , n ∈ Z+ } be the random walk with P(ξk = ±1) = 12 . Prove that P(max Sn ≥ z) = 2P(SN > z) + P(SN = z), z ∈ Z+ . n≤N
(2) Let W be the Wiener process. Prove that P(max W (s) ≥ z) = 2P(W (t) ≥ z), s≤t
z ≥ 0.
(3) Let {Sn = ∑k≤n ξk , n ∈ Z+ } be a random walk with Eξk = 0, Eξk2 = 1. Prove that
√ P(maxn≤N Sn ≥ z · n) √ → 2, N → ∞, z ≥ 0. P(SN ≥ z · n)
16.36. Let W be the Wiener process, z > 0. (1) Find the distribution density of the random variable τ (z,W ) (the functional τ (·, ·) is defined in Problem 16.25). (2) Prove that Eτ α (z,W ) = +∞ for α ≥ 12 and Eτ α (z,W ) < +∞ for α ∈ (0, 12 ). 16.37. Prove that {Y (z) = τ (z,W ), z ∈ R+ } is a stochastically continuous homogeneous process with independent increments. 16.38. Find the cumulant and the L´evy measure of the process Y from the previous problem.
16.39. Let {Sn = ∑k≤n ξk } be the random walk with P(ξk = ±1) = 12 , and HSN be the function defined in Problem 16.34. Prove that for z < 0, α ∈ (0, 1),
1 N H (z) ≤ α P N S
→
1−α 3 2 0
·
|z|
−z 3 e
π3 s 2
2 /(2s)
3 · arcsin
α ds, N → ∞. 1−s
Give the formula for limN→∞ P N −1 HSN (z) ≤ α when z > 0. 16.40. Let {Sn } be as in the previous problem. (1) For a given n, N (n < N) find P(Sm ≤ Sn , m ≤ N). (2) Denote ϑ N (S) ≡ min{m : Sm = maxn≤N Sn }. Prove that P(ϑ N (S) ≤ α · N) →
√ 2 arcsin α , N → ∞, α ∈ (0, 1). π
16.41. Let {Sn = ∑k≤n ξk , n ∈ Z+ } be the random walk with P(ξk = ±1) = 12 . Find P(max Sn = m, ϑ N (S) = k, SN = r). n≤N
16.42. Find the joint distribution of the variables maxt∈[0,1] W (t), ϑ (W ),W (1), where W is a Wiener process, and the functional ϑ is defined in Problem 16.26. 16.43. Let {Sn = ∑k≤n ξk , n ∈ Z+ } be the random walk with P(ξk = ±1) = 12 . Find P(maxn≤N Sn = m, minn≤N Sn = k, SN = r). 16.44. Find the joint distribution of the variables maxt∈[0,1] W (t), mint∈[0,1] W (t),W (1) (W is a Wiener process). Compare with Problem 7.108. 16.45. Let {Sn = ∑k≤n ξk , n ∈ Z+ } be a random walk with Eξk = 0, Eξk2 = 1. Prove that (2m+1)z √ 2 1 √ e−y /2 dy P(max |Sn | ≤ z · N) → ∑ (−1)m n≤N (2m−1)z 2 π m∈Z π 2 (2m + 1)2 4 ∞ (−1)m exp − = 1− ∑ , N → ∞, z > 0. π m=1 m + 1 8z2 d
16.46. Without an explicit calculation of the distribution, show that κ(W ) = χ (W ) (the functionals κ, χ are defined in Problem 16.27). 16.47. Prove that P(κ(W ) ≤ α ) = P(χ (W ) ≤ α ) =
√ 2 arcsin α = lim P(Sn = 0, n ≥ α · N), N→∞ π
α ∈ (0, 1), where {Sn } is the random walk with P(ξk = ±1) = 12 .
16.48. Without passing to the limit, find the distribution of the variables κ(W ), χ (W ). Compare with the previous problem. 16.49. In the array {ξnk , 1 ≤ k ≤ n} let the random variables ξn1 , . . . , ξnn be i.i.d. with P(ξn1 = 1) = λ /n, P(ξn1 = 0) = 1 − λ /n. Prove that for any a > 0, P(Snk ≤ ak, k = 1, . . . , n) → P( max (N(t) − at) ≤ 0), n → ∞, t∈[0,1]
where N is the Poisson process with intensity λ . 16.50. In the situation of the previous problem, prove that for arbitrary a > 0 the distributions of the variables n−1 #{k : Snk > ak} weakly converge to the distribution of the variable 01 1IN(t)>at dt. 16.51. Verify the following relations between the uniform metric dU and and L´evy metric dL . (1) dU (μ , ν ) ≥ dL (μ , ν ), μ , ν ∈ P(R). (2) If the measure ν possesses a bounded density pν , then dU (μ , ν ) ≤ 1 + sup pν (x) dL (μ , ν ), μ ∈ P(R). x∈R
In particular, if ν ∼ N(0, 1), then dL (μ , ν ) ≤ dU (μ , ν ) ≤ 1 + (2π )−1/2 dL (μ , ν ),
μ ∈ P(R).
16.52. Verify the metric axioms for the L´evy–Prokhorov metric dLP : (a) dLP (μ , ν ) = 0 ⇔ μ = ν . (b) dLP (μ , ν ) = dLP (ν , μ ). (c) dLP (μ , ν ) ≤ dLP (μ , π ) + dLP (π , ν ) for any μ , ν , π ∈ P. 16.53. Prove the triangle inequality for the Wasserstein metric. 16.54. Let X be a Polish space, μ , ν ∈ P(X), and let {Zn , n ≥ 1} ⊂ C(μ , ν ) be an arbitrary sequence. Prove that the sequence of distributions of Zn , n ≥ 1 in X × X is weakly compact. 16.55. Let X be a Polish space, μ , ν ∈ P(X), p ∈ [1, +∞). Prove the existence of an optimal coupling for the measures μ , ν , that is, of such an element Z ∗ = (X ∗ ,Y ∗ ) ∈ p (μ , ν ). C(μ , ν ) that Eρ p (X ∗ ,Y ∗ ) = dW,p 16.56. Let X be a Polish space, p ∈ [1, +∞). Prove that the class of optimal couplings for the measures μ , ν depends on μ , ν continuously in the following sense. For any sequences {μn } ⊂ P(X) and {νn } ⊂ P(X), convergent in the metric dW,p to measures μ and ν , respectively, and any sequence Zn∗ , n ≥ 1 of optimal couplings for μn , νn , n ≥ 1 weakly convergent to an element Z ∗ , the element Z ∗ is an optimal coupling for μ , ν .
16.57. Prove formula (16.5) (a) for discrete measures μ , ν ; (b) in the general case. 16.58. Calculate the Wasserstein distance dW,2 (μ , ν ) for μ ∼ U(0, 1), ν ∼ Exp(λ ). For what λ is this distance minimal; that is, which exponential distribution gives the best approximation for the uniform one? 16.59. Calculate the Wasserstein distance dW,2 (μ , ν ) for μ ∼ N(a1 , σ12 ), ν ∼ N(a2 , σ22 ). 16.60. Prove Proposition 16.5. 16.61. Let {λk }, {θk } be a given sequences of nonnegative real numbers such that ∑k λk < +∞, ∑k θk < +∞, and {ξk }, {ηk } are sequences of independent centered Gaussian random variables with the variances {λk } and {θk }, respectively. Find the Wasserstein distance dW,2 between the distributions of the random elements generated by these two sequences in the space 2 . 16.62. Let {X(t),Y (t),t ∈ [a, b]} be centered Gaussian processes, and let their covariance functions RX , RY be continuous on [a, b]2 . (1) Prove that the processes X,Y generate random elements in the space L2 ([a, b]). (2) Prove the following estimate for the Wasserstein distance between the distributions μX , μY of the random elements generated by the processes X,Y in L2 ([a, b]), 9 dW,2 (μX , μY ) ≤
( ∫∫_{[a,b]²} (Q_X(t, s) − Q_Y(t, s))² ds dt )^{1/2},
where Q_X, Q_Y ∈ L_2([a, b]²) are arbitrary kernels satisfying
∫_a^b Q_X(t, r) Q_X(s, r) dr = R_X(t, s),  ∫_a^b Q_Y(t, r) Q_Y(s, r) dr = R_Y(t, s),
t, s ∈ [a, b].
16.63. Prove that the Wasserstein distance d_{W,2} between the distributions of the random elements generated by the Wiener process and the Brownian bridge in L_2([0, 1]) is bounded by 1/√3 − √2/3 from below and by 1/√3 from above.
16.64. Prove that creating convex combinations of probability measures does not enlarge the Wasserstein distance d_{W,p}; that is, for any μ_1, . . . , μ_m, ν_1, . . . , ν_m ∈ P(X) and α_1, . . . , α_m ≥ 0 with ∑_{k=1}^m α_k = 1,
d_{W,p}( ∑_{k=1}^m α_k μ_k, ∑_{k=1}^m α_k ν_k ) ≤ max_{k=1,...,m} d_{W,p}(μ_k, ν_k).
Does this property hold for other coupling metrics?
16.65. A σ-finite measure λ is said to dominate a measure μ ∈ P(X) if μ ≪ λ. In Definitions 16.11–16.13, the values of the Hellinger metric d_H(μ, ν), the Hellinger integrals H_θ(μ, ν), θ ∈ [0, 1], and the relative entropy E(μ‖ν) are defined in terms of a measure λ that dominates both μ and ν. Prove that, for given μ, ν ∈ P(X): (1) There exists at least one such measure λ. (2) The values of d_H(μ, ν), H_θ(μ, ν), θ ∈ [0, 1], and E(μ‖ν) do not depend on the choice of λ.
16.66. Verify that Hθ (μ1 × μ2 , ν1 × ν2 ) = Hθ (μ1 , ν1 )Hθ (μ2 , ν2 ), θ ∈ [0, 1]. Use this relation for proving Proposition 16.6. 16.67. Prove Proposition 16.3. 16.68. Let μ , ν ∈ P(X). Prove the following statements. (1) H0 (μ , ν ) = H1 (μ , ν ) = 1 and Hθ (μ , ν ) ≤ 1 for every θ ∈ (0, 1). If Hθ (μ , ν ) = 1 for at least one θ ∈ (0, 1), then μ = ν . (2) The function Hμ ,ν : [0, 1] θ → Hθ (μ , ν ) is log-convex; that is, α Hμ ,ν (αθ1 + (1 − α )θ2 ) ≤ Hμα,ν (θ1 )Hμ1− ,ν (θ2 ), θ1 , θ2 ∈ [0, 1], α ∈ (0, 1).
(3) The function Hμ ,ν is continuous on the interval (0, 1). (4) The measure μ is absolutely continuous w.r.t. ν if and only if the function Hμ ,ν is continuous at the point 1. (5) In order for the measures μ and ν to be mutually singular it is necessary that for every θ ∈ (0, 1), and it is sufficient that for some θ ∈ (0, 1), the Hellinger integral Hθ (μ , ν ) equals 0. :
16.69. (Kakutani alternative). Let (X, X) = (∏_{n∈N} X_n, ⊗_{n∈N} X_n) be a countable product of measurable spaces (X_n, X_n), n ∈ N, and let μ and ν be the product measures on this space: μ = ∏_{n∈N} μ_n, ν = ∏_{n∈N} ν_n, where μ_n, ν_n ∈ P(X_n), n ∈ N. Assuming that for every n ∈ N the measure μ_n is absolutely continuous w.r.t. ν_n, prove that for the measures μ, ν only the two following relations are possible: (a) μ ≪ ν; (b) μ ⊥ ν. Prove that the second relation holds if and only if ∏_{n∈N} H_{1/2}(μ_n, ν_n) = 0.
16.70. Prove that the Hellinger integrals are continuous w.r.t. the total variation convergence; that is, as soon as μ_n →^{var} μ, ν_n →^{var} ν, n → ∞, one has H_θ(μ_n, ν_n) → H_θ(μ, ν), n → ∞, θ ∈ [0, 1].
16.71. Calculate H_θ(μ, ν), θ ∈ [0, 1], for (a) μ ∼ N(a_1, σ²), ν ∼ N(a_2, σ²); (b) μ ∼ N(a, σ_1²), ν ∼ N(a, σ_2²); (c) μ the uniform distribution on [a_1, b_1] and ν the uniform distribution on [a_2, b_2] (a_1 < b_1, a_2 < b_2).
16.72. Let μ, ν be the distributions of Poisson random variables with the parameters λ and ρ, respectively. Find H_θ(μ, ν), θ ∈ [0, 1]. In the case λ ≠ ρ, find θ_* such that H_{θ_*}(μ, ν) = min_{θ∈[0,1]} H_θ(μ, ν).
16.73. Let μ and ν be the distributions of the random vectors (ξ_1, . . . , ξ_m) and (η_1, . . . , η_m). Assuming that the components of the vectors are independent and ξ_k ∼ Pois(λ_k), η_k ∼ Pois(ρ_k), k = 1, . . . , m, find H_θ(μ, ν), θ ∈ [0, 1].
16.74. Let μ and ν be the distributions of the random elements in L_2([0, T]) defined by the Poisson processes with the intensity measures κ_1 and κ_2, respectively (see Definition 5.3). Find H_θ(μ, ν), θ ∈ [0, 1].
16.75. Prove that the relative entropy E(μ‖ν) is equal to the left derivative at the point 1 of the function θ → H_θ(μ, ν):
E(μ‖ν) = lim_{θ→1−} (H_θ(μ, ν) − 1)/(θ − 1).
16.76. Prove that E(μ_1 × μ_2 ‖ ν_1 × ν_2) = E(μ_1‖ν_1) + E(μ_2‖ν_2).
16.77. (Variational formula for the entropy). Prove that, for an arbitrary measure ν ∈ P(X) and an arbitrary nonnegative function h ∈ L_1(X, ν),
ln ∫_X h dν = max_{μ∈P(X)} ( ∫_X ln h dμ − E(μ‖ν) ).
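The variational formula of Problem 16.77 can be checked directly on a finite space, where all integrals become finite sums and the maximizer is the measure μ proportional to hν. A minimal Python sketch (an added illustration; the dimension and the random choices of ν and h are arbitrary):

import numpy as np

rng = np.random.default_rng(1)
n = 5
nu = rng.random(n); nu /= nu.sum()          # reference probability measure
h = rng.random(n) + 0.1                     # nonnegative function h

def objective(mu):
    # integral of ln h with respect to mu, minus the relative entropy E(mu || nu)
    return np.sum(mu * np.log(h)) - np.sum(mu * np.log(mu / nu))

mu_star = h * nu / np.sum(h * nu)           # the maximizing measure
print(np.log(np.sum(h * nu)), objective(mu_star))   # equal
print(objective(nu))                                 # any other mu gives a smaller value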
Hints ω ) ∈ B} ∈ F for arbitrary open (or closed) ball B ⊂ X 16.1–16.3. Prove that {ω : X(·, and respective modification X of the process X. Use Lemma 16.1. 16.4. (1) Use the “principle of the fitting sets”. Prove that the class of the sets described in the formulation of the problem is a σ -algebra that contains all open sets. (2) If Fε /2 is a closed set from the previous statement and Kε /2 is a compact set from ε = Fε /2 ∩ Kε /2 is the required compact. the statement of the Ulam theorem, then K (3) Consider the following classes of functions: K0 = Cb,u (X); K1 = {the functions of the form f = 1IG , G is an open set}; K2 = {the functions of the form f = 1IA , A is a Borel set}; K3 = L p (X, μ ). Prove that every function from the class Ki (i = 1, 2, 3) can be obtained as an L p -limit of a sequence of linear combinations of the functions from Ki−1 . 16.7. (1) Use statement (a) of Problem 1.16. (2), (3) Use reasoning analogous to that given in the proof of Problem 16.6. 16.10. Use that the image of a compact set under a continuous mapping is also a compact set. 16.11. Use the previous problem and the following two statements. (1) The functions πX : X × Y (x, y) → x ∈ X, πY : X × Y (x, y) → y ∈ Y are continuous; (2) if K1 , K2 are the compact sets in X, Y then K1 × K2 is a compact set in X × Y. 16.13. (a) For a given t1 , . . . ,tm and n greater than some n0 = n0 (t1 , . . . ,tm ), PtX1n,...,tm = PtX1 ,...,tm . ' & (b) For an open set G = y| supt |y(t)| < 12 , limn P(Xn ∈ G) = 0 < 1 = P(X ∈ G). 16.14. If the function λ does not satisfy conditions λ (c) = a, λ (d) = b, then supt |x(λ (t)) − y(t)| = 1.
16.15. Use the previous problem. 16.17. Let xa = 1It∈[a,1] ∈ D([0, 1]), a ∈ [0, 1]. Then, for every a1 = a2 , the uniform distance between xa1 and xa2 is equal to 1. 16.22, 16.23. Verify that x ∈ C([0, 1]) is a continuity point for the functional Iab if and only if 01 1I{a}∪{b} (x(t)) dt = 0. Use the hint to Problem 3.21. 16.24. Verify that x ∈ D([0, 1]) is a continuity point for the functional Iab if and only if 01 1I{a}∪{b} (x(t)) dt = 0. 16.25. Verify that x ∈ C([0, 1]) is a discontinuity point for the functional τ (·, z) in the following cases. (1) {x(t) = z} = ∅ and at least one of the sets {x(t) < z}, {x(t) > z} is not empty. (2) There exists a nonempty interval (a, b) ⊂ R+ such that x(t) = z,t ∈ (a, b). Prove that otherwise x ∈ C([0, 1]) is a continuity point for τ (·, z) and use Problem 3.23. 16.26. (a) | maxt x(t) − maxt y(t)| ≤ supt |x(t) − y(t)|. (b) Verify that x ∈ C([0, 1]) is a continuity point for the functional ϑ if and only if the function x takes its maximum value on [0, 1] in a unique point, and use Problem 3.22. 16.27. Describe explicitly the sets of discontinuity points for the functionals κ, χ (see the Hint to Problem 16.25) and use Problem 3.23. 16.33. Use the invariance principle, Theorem 16.3, and Problem 16.22. 16.34. Use Problem 10.32 and the strong Markov property for the random walk (see also [22], Vol. 1, Chapter III, §4). 16.35. (1) Use the reflection principle (see Problem 10.32 or [22], Vol. 1, Chapter III, §1). (2), (3) Use the invariance principle and item 1). In item (2), you can also use the reflection principle for the Wiener process; see Problem 7.109. 16.36. (1) P(τ (z,W ) ≤ x) = P(maxs≤x W (s) ≥ z). (2) Use item (1). 16.37. Use the strong Markov property for the Wiener process (see Definition 12.9 and Theorem 12.5). 16.38. Let n ∈ N, and denote η = τ (1,W ), ηn = τ (1/n,W ). By Problem 16.36, d
d
η = n2 ηn . Thus, for every n ≥ 1, ηn = n−2 (η1 + · · · + ηn ), where η1 , . . . , ηn are the independent random variables identically distributed with η . Therefore, η has a stable distribution with the parameter α = 12 . In addition, η ≥ 0. For the description of a characteristic function of a stable distribution, see [22] Vol. 2, Chapter XVII, §§3,5. 16.39. Prove the relation P(HSN (z) = m) =
∑_{k=1}^N P(H_S^{N−k}(0) = m) P(τ = k), where τ = min{l : S_l ≥ z√N}. Use this relation and Problems 16.34, 16.36. 16.40. See [22], Vol. 1, Chapter III, §7. 16.41–16.44. See [4], Chapter 2, §7. 16.46. Use Problem 6.5, item (e). 16.47. Use Problems 16.27, 16.40, and the invariance principle.
16.48. P(χ(W) < x) = P(W(s) ≠ 0, s ∈ [x, 1]) = P(W(x) > 0, min_{s∈[x,1]}(W(s) − W(x)) ≥ −W(x)) + P(W(x) < 0, max_{s∈[x,1]}(W(s) − W(x)) ≤ −W(x)) = ∫_R (e^{−y²/(2x)}/√(2πx)) ( 1 − 2 ∫_{|y|}^∞ (e^{−z²/(2(1−x))}/√(2π(1−x))) dz ) dy.
16.49. Use Theorem 16.9 and Problem 16.28.
16.50. Use Theorem 16.9 and Problem 16.24.
16.51. (1) If ε ≥ dU(μ, ν), then F_μ(x) ≤ F_ν(x) + ε ≤ F_ν(x + ε) + ε for every x ∈ R. (2) If ε > dL(μ, ν), then F_μ(x) ≤ F_ν(x + ε) + ε = F_ν(x) + ∫_x^{x+ε} p_ν(y) dy + ε, x ∈ R.
16.53. For random elements taking values in a Polish space X, prove the statement analogous to the one given in Problem 1.11. Then use this statement in order to solve the problem. 16.54. See Problem 16.11. 16.55. Use Problem 16.54 and the Fatou lemma. 16.56. Use the triangle inequality for the Wasserstein metric and, analogously to the solution of Problem 16.55, the Fatou lemma. 16.57. If X is a random variable and Xn = [nX]/n, n ∈ N is its discrete approximation, then E|X − Xn | p ≤ n−p → 0, n → ∞, and therefore the distributions of the variables Xn converge to the distribution of X in the metric dW,p . Therefore the statement of item (b) can be proved using item (a) and Problem 16.56. 16.58. Use formula (16.5). 16.59. Use formula (16.5) and the fact that the distribution function for N(a, σ 2 ) has the form F(x) = Φ ((x − a)/σ ) , where Φ denotes the distribution function for N(0, 1). 16.61. Use Problem 16.59 and Proposition 16.34. 16.62. Use Problem 6.13 in item (1) and Problems 6.28, 6.30 in item (2). 16.63. Use Problems 6.35 and 16.59 in order to obtain the upper and the lower estimates, respectively. 16.64. Let X1 , . . . , Xm be the random elements with distributions μ1 , . . . , μm , respectively, and θ be a random variable, independent of X1 , . . . , Xm and taking values 1, . . . , m with probabilities α1 , . . . , αm . Then the random element ⎧ ⎪ ⎨X1 , θ = 1 Xθ = . . . ⎪ ⎩ Xm , θ = m has the distribution α1 μ1 + · · · + αm μm .
16.66. d(μ_1 × μ_2)/d(λ_1 × λ_2) = (dμ_1/dλ_1) · (dμ_2/dλ_2).
The Hellinger metric and the Hellinger affinity satisfy the relation H_{1/2}(μ, ν) = 1 − d_H²(μ, ν)/2. 16.69. Use Problem 16.68. 16.75. Use item (4) of Problem 16.68, the Fatou lemma, and the Lebesgue dominated convergence theorem. 16.77. Use Jensen's inequality.
Answers and Solutions 16.1. Let B be a closed ball with & the center x ∈ C([0, T ]) and the radius r. Then ω ) ∈ B} = ∩t∈Q∩[0,T ] ω | |X(t, ω ) − x(t)| ≤ r} ∈ F. {ω | X(·, 16.2. Let B be a closed ball with the center x ∈ L2 ([0, T ]) and the radius r. Then {ω | ω ) − x(t))2 dt ≤ r}. Because the process X(t) − x(t) is ω ) ∈ B} = {ω | T (X(t, X(·, 0 T 2 − x(t)) dt is a random variable and thus {ω | X(·, ω ) ∈ B} ∈ F. measurable, 0 (X(t) 16.3. See [4], Theorem 14.5. 16.6. (1) Let B be a closed ball with the center x and radius r, then {ξ ∈ B} = ∩k {|ξk − xk | ≤ r} ∈ F. (2) Every compact set can be represented as an intersection of a countable family of sets, each one being a finite union of the open balls. Therefore, every compact set belongs to X. Let us prove that every open ball B with the radius 1 has zero measure w.r.t. distribution of the element ξ ; because every compact set is covered by a finite union of such balls, this will provide the required statement. Let the center of the ball B be a sequence x = (xk )k∈N . Then, for every k ∈ N, at least one of the inequalities holds true: |xk − 1| ≥ 1, |xk + 1| ≥ 1. Consider the following sequence (yk )k∈N : if for the given k the first relation holds, then yk = −1; otherwise yk = 1. Then {ξ ∈ B} ⊂ k {ξk = yk } and P(ξ ∈ B) ≤ ∏k∈N 12 = 0. 2,n 2 16.9. Consider the points xnjk = (x1,n jk , x jk ) ∈ [0, 1] , j, k = 1, . . . , n such that j−1 j k−1 k 1,n 2,n x jk ∈ , , x jk ∈ , , k, j = 1, . . . , n, n n n n r,n and xr,n jk = xil , r = 1, 2 for every j, k, i, l such that ( j, k) = (i, l). There exists a Borel
(and even a continuous) function fn : [0, 1] → [0, 1] such that fn (xk1,nj ) = xk2,nj , k, j = 1, . . . , n. Define the distribution of the random vector Xn = (Xn1 , Xn2 ) in the following way. Xn takes values xnjk , j, k = 1, . . . , n with the probabilities n−2 . By the definition, fn (Xn1 ) = Xn2 . On the other hand, (Xn1 , Xn2 ) weakly converges to the vector X = (X 1 , X 2 ) with independent components uniformly distributed on [0, 1]. For arbitrary Borel function f : [0, 1] → [0, 1], one has cov( f (X 1 ), X 2 ) = 0 and therefore relation f (X 1 ) = X 2 does not hold. 16.12. Both the spaces Lip([0, 1]) and Hγ ([0, 1]) with arbitrary γ ∈ (0, 1) are Banach. None of these spaces is separable.
16.14. d(x, y) = max |a − c|, |b − d| , % $ a b−a 1−b . , ln d0 (x, y) = min 1, max ln , ln c d −c 1−d 16.16. Yes, it is. 16.18. If xn → x in the metric d of the space D([0, 1]), then there exists a sequence λn ∈ Λ such that supt |λn (t) − t| → 0, supt |xn (t) − x(λn (t))| → 0. As soon as x is continuous, it is uniformly continuous, and thus supt |x(t) − x(λn (t))| → 0. This gives the required convergence supt |xn (t) − x(t)| → 0. 16.20. xn (t) = 1It∈[1/2−1/(2n),1] , yn (t) = 1It∈[1/2−1/(3n),1] . 16.21. No, it is not. Consider the functions xˆn = xn , yˆn = −yn , where xn , yn are the functions from the previous solution. Then xˆn → 1It∈[1/2,1] , yˆn → −1It∈[1/2,1] , but xˆn + yˆn → 0. 16.28. If xn → x in the metric d of the space D([0, 1]) then there exists a sequence λn ∈ Λ , n ≥ 1 such that supt |λn (t) − t| → 0, supt |xn (t) − x(λn (t))| → 0. Because a is continuous, supt |a(λn (t)) − a(t)| → 0. Thus sup xn (t) − sup x(t) = sup xn (t) − sup x(λn (t)) → 0. t a(t) t a(t) t a(t) t a(λn (t)) 16.29. No, it is not. 16.31. For a ∈ Z+ . 16.36. (a) $ % ∞ 2 2 1 d −y2 /(2x) √ e dy = √ x−3/2 e−z /(2x) . p(x) = dx 2π x z 2π ( √ 16.38. Π (du) = (1/ 2π )1Iu>0 u−3/2 du, ψ (z) = − 2|z|. 16.39. For z > 0, α ∈ (0, 1), 3 1−α 3 1 N z −z2 /(2s) 2 α HS (z) ≤ α → ds+ P · e · arcsin N π 3 s3/2 1−s 0 ∞
2 1 z √ · 3/2 e−z /(2s) ds, N → ∞. 2π s 16.52. Statement (c) (the triangle inequality) follows immediately from the rela> 0 (we leave details for the reader). It is obvious that tion (Aε )δ ⊂ Aε +δ , ε , δ dLP (μ , μ ) = 0. Because ε >0 Aε = A for any closed set A, it follows from the relation dLP (μ , ν ) = 0 that μ (A) ≤ ν (A) for every closed A. Then for every continuous function f taking values in (0, 1) one has
∫_X f(x) μ(dx) = lim_{n→+∞} ∑_{k=1}^n f(t_{n,k}) μ({x | f(x) ∈ [t_{n,k−1}, t_{n,k}]}) ≤ lim_{n→+∞} ∑_{k=1}^n f(t_{n,k}) ν({x | f(x) ∈ [t_{n,k−1}, t_{n,k}]}) = ∫_X f(x) ν(dx),
where the sequence of partitions πn = {0 = tn,0 < · · · < tn,n = 1} is chosen in such a way that |πn | → ∞, n → ∞ and μ ( f = tn,k ) = ν ( f = tn,k ) = 0, 1 ≤ k ≤ n. This inequality and the analogous inequality for f = 1 − f provide that X f d μ = X f d ν . Because f is arbitrary, this yields μ = ν and completes the proof of statement (a). Let us prove (b). It follows from (a) that dLP (μ , ν ) = 0 if and only if dLP (ν , μ ) = 0. Assume that dLP (μ , ν ) > 0 and take t ∈ (0, dLP (μ , ν )). By definition, there exists a set A ∈ B(X) such that μ (A) > ν (At ) +t. Denote B = X\At ; then the latter inequality can be written as μ (A) > 1 − ν (B) + t or, equivalently, ν (B) > μ (X\A) + t. If x ∈ Bt then there exists some y ∈ X\At such that ρ (x, y) < t and thus x ∈ A. Therefore, Bt ⊂ X\A and we have the inequality
ν (B) > μ (X\A) + t ≥ μ (Bt ) + t, which yields t < dLP (ν , μ ). Because t ∈ (0, dLP (μ , ν )) is arbitrary, we obtain that dLP (μ , ν ) ≤ dLP (ν , μ ). By the same arguments applied to the pair (ν , μ ) instead of (μ , ν ), we have dLP (μ , ν ) ≥ dLP (ν , μ ) and thus dLP (μ , ν ) = dLP (ν , μ ). 16.55. If dW,p (μ , ν ) = +∞ then one can take arbitrary coupling Z, thus only the case dW,p (μ , ν ) < +∞ needs detailed consideration. Take a sequence Z n = (X n ,Y n ) ∈ p (μ , ν ), n → ∞. By Problem 16.54 the family C(μ , ν ) such that Eρ p (X n ,Y n ) → dW,p n of distributions of Z , n ≥ 1 is tight. Using the Prokhorov theorem and passing to a limit, we may assume that Z n , n ≥ 1 converges weakly to some Z ∗ = (X ∗ ,Y ∗ ). Because the projection in X × X on one component is a continuous function, by Theorem 16.3 we have X n ⇒ X ∗ ,Y n ⇒ Y ∗ . Therefore, X ∗ and Y ∗ have distributions μ and ν , respectively; that is, Z ∗ ∈ C(μ , ν ). Consider a sequence of continuous functions fk : R+ → R+ , k ∈ N such that ∞ ∑k=1 fk ≡ 1 and fk (t) = 0 t ∈ [k−1, k]. Every function φk (z) = ρ p (x, y) fk (ρ (x, y)), z = (x, y) ∈ X × X is continuous and bounded, and thus Eφk (Z n ) → Eφk (Z ∗ ), n → ∞, k ∈ N. Therefore, by the Fatou lemma, Eρ p (X ∗ ,Y ∗ ) =
∑_{k=1}^∞ Eφ_k(Z*) ≤ lim sup_{n→∞} ∑_{k=1}^∞ Eφ_k(Z^n) = lim sup_{n→∞} Eρ^p(X^n, Y^n) = d_{W,p}^p(μ, ν).
Hence Z ∗ is an optimal coupling. 16.57. (a) Consider the case p > 1. Denote by T = {tk , k ∈ N} the set of all the points that have a positive measure μ or ν . Then the distribution of any vector Z ∈ C(μ , ν ) is defined by the matrix {z jk = P(Z = (t j ,tk ))} j,k∈N that satisfies relations
∑_i z_{ik} = ν({t_k}) =: y_k,  ∑_i z_{ji} = μ({t_j}) =: x_j,  k, j ∈ N.
We denote the class of such matrices also by C(μ , ν ). By the definition of the Wasserstein metric and Problem 16.55,
d_{W,p}^p(μ, ν) = inf_{{z_{jk}}∈C(μ,ν)} ∑_{j,k∈N} z_{jk} c_{jk} = ∑_{j,k∈N} z*_{jk} c_{jk},  c_{jk} := |t_j − t_k|^p, k, j ∈ N,
where the matrix {z*_{jk}} corresponds to the distribution of an optimal coupling Z* ∈ C(μ, ν). We write j ≺ k if t_j < t_k. Let us show that for arbitrary j_1 ≺ j_2, k_1 ≺ k_2 at least one of the numbers z*_{j_1 k_2}, z*_{j_2 k_1} is equal to 0. This would be enough for solving the problem, because this condition on the matrix {z*_{jk}}, together with the conditions
∑_i z*_{ik} = y_k,  ∑_i z*_{ji} = x_j,  k, j ∈ N,
defines this matrix uniquely, and, on the other hand, this condition is satisfied for the matrix corresponding to the coupling described in Proposition 16.2. Assume that for some j1 ≺ j2 , k1 ≺ k2 the required condition fails, put a = z jk } by min(z∗j1 k2 , z∗j2 k1 ) > 0 and define the new matrix { ⎧ ∗ ⎪ ( j, k) ∈ {( jl , kr ), l, r = 1, 2}, ⎨z jk , ∗ z jk = z jk + a, ( j, k) = ( j1 , k1 ) or ( j2 , k2 ), ⎪ ⎩∗ z jk − a, ( j, k) = ( j1 , k2 ) or ( j2 , k1 ). By the construction, { z jk } ∈ C(μ , ν ) and
∑_{j,k∈N} z̃_{jk} c_{jk} = a[c_{j_1 k_1} + c_{j_2 k_2} − c_{j_1 k_2} − c_{j_2 k_1}] + ∑_{j,k∈N} z*_{jk} c_{jk}.
It can be checked directly that for every s1 < s2 , r1 < r2 , |r1 − s1 | p + |r2 − s2 | p < |r1 − s2 | p + |r2 − s1 | p , whence c j1 k1 + c j2 k2 − c j1 k2 − c j2 k1 < 0. This means that
∑_{j,k∈N} z̃_{jk} c_{jk} < ∑_{j,k∈N} z*_{jk} c_{jk};
consequently, {z*_{jk}} does not correspond to an optimal coupling. This contradiction shows that the above assumption is impossible; that is, the matrix {z*_{jk}} satisfies the required condition. The case p = 1 can be obtained by an appropriate limit procedure for p → 1+.
16.58. d_{W,2}²(μ, ν) = ∫_0^1 (x + λ^{−1} ln(1 − x))² dx = 1/3 − (3/2)λ^{−1} + 2λ^{−2}; λ_min = 8/3.
266
16 Weak convergence, probability metrics. Functional limit theorems
16.60. Consider a random element Z = (Z1 , Z2 , Z3 , Z4 ) taking values in X×X = X1 × X2 × X1 × X2 . If Z ∈ C(μ , ν ), then the components Z1 , Z2 , Z3 , Z4 have distributions μ1 , ν1 , μ2 , and ν2 , respectively. Thus Eρ p (Z1 , Z2 ), (Z3 , Z4 ) = Eρ1p (Z1 , Z3 ) + Eρ2p (Z2 , Z4 ) ≥
inf
(X1 ,Y1 )∈C(μ1 ,ν1 )
Eρ1p (X1 ,Y1 ) +
inf
(X2 ,Y2 )∈C(μ2 ,ν2 )
Eρ1p (X2 ,Y2 ),
p p and, by the definition of the Wasserstein metric, we have dW,p (μ , ν ) ≥ dW,p (μ1 , ν1 )+ p dW,p (μ2 , ν2 ). Next, let ε > 0 be fixed. Take such random elements (X1ε ,Y1ε ) ∈ C(μ1 , ν1 ), (X2ε ,Y2ε ) ∈ C(μ2 , ν2 ) that p p (μ1 , ν1 ) + ε , Eρ2p (X2ε ,Y2ε ) ≤ dW,p ( μ 2 , ν2 ) + ε . Eρ1p (X1ε ,Y1ε ) ≤ dW,p
Such elements exist by the definition of the Wasserstein metric. Construct the rand d dom element Z ε = (Z1ε , Z2ε , Z3ε , Z4ε ) with (Z1ε , Z3ε ) =(X1ε ,Y1ε ), (Z2ε , Z4ε ) =(X2ε ,Y2ε ), and ε ε ε ε ε elements (Z1 , Z3 ), (Z2 , Z4 ) being independent. By construction, Z1 and Z2ε are independent. In addition, Z1ε has distribution μ1 and Z2ε has distribution μ2 . Thus, (Z1ε , Z2ε ) has distribution μ . Analogously, (Z3ε , Z4ε ) has distribution ν . Therefore Z ε ∈ C(μ , ν ) and p p p (μ , ν ) ≤ Eρ p (Z1ε , Z2ε ), (Z3ε , Z4ε ) ≤ dW,p (μ1 , ν1 ) + dW,p (μ2 , ν2 ) + 2ε . dW,p This gives . the required statement because ε > 0 is arbitrary. 1/2
1/2
16.61. ∑k (λk − θk )2 . 16.62. (1) The processes X,Y are mean square continuous (Theorem 4.1), and thus have measurable modification (Theorem 3.1). Hence these processes generate random elements in L2 ([a, b]) (Problem 16.2). (2) Let W be the Wiener process. Put X(t) =
b a
QX (t, s) dW (s), Y (t) =
b a
QY (t, s) dW (s), t ∈ [a, b].
The processes X,Y are centered and their covariance functions equal RX and RY , respectively (Problem 6.35). Therefore (X,Y ) ∈ C(μ , ν ) and 2 (μ , ν ) ≤ EX dW,2
=
b b a
E
a
−Y 2L2 ([a,b]
=E
b a
2 (QX (t, s) − QY (t, s))dW (s) dt =
(X(t) −Y (t))2 dt
[a,b]2
(QX (t, s) − QY (t, s))2 dsdt.
16.63. It follows from Problem 6.13 that the pair of the processes W (t),W (t) − t,t ∈ [0, 1] is a coupling for μ , ν (here W is the Wiener process). Then 2 (μ , ν ) ≤ dW,2
1 0
1 t 2 dt = . 3
16 Weak convergence, probability metrics. Functional limit theorems
267
On the other hand, for arbitrary coupling (X,Y ) ∈ C(μ , ν ), EX −Y 2L2 ([0,1]) ≥ E
$
1
0
%2 (X(t) −Y (t)) dt
= E[ξ − η ]2 ,
where we have used the notation ξ = 01 X(t) dt, η = 01 Y (t) dt. The variables ξ and η are centered Gaussian ones with the variances a = 01 01 (t ∧ s) dtds = 13 and b = 01 01 (t ∧ s − ts) dtds = 29 , respectively. Then Problem 16.59 yields that . EX −Y 2L
2 ([0,1])
≥
√
√ √ 2 1 . a− b = √ − 3 3
16.64. Let X1 , . . . , Xm ,Y1 , . . . ,Ym be random elements with distributions μ1 , . . . , μm , ν1 , . . . , νm , and θ be an independent random variable that takes values 1, . . . , m with probabilities α1 , . . . , αm . Then the variables Xθ ,Yθ (see the Hint for the notation) have distributions ∑k αk μk , ∑k αk νk . For every ε > 0 the elements X1 , . . . , Xm ,Y1 , . . . ,Ym can be constructed in such a way that p Eρ p (X j ,Y j ) ≤ ε + max dW,p (μk , νk ), k
j = 1, . . . , m.
Then p p dW,p (μ , ν ) ≤ Eρ p (X,Y ) = ∑ α j Eρ p (X j ,Y j ) ≤ ε + max dW,p (μk , νk ). k
j
Because ε > 0 is arbitrary, this finishes the proof. Literally the same arguments show that, for arbitrary H from (16.3) (not necessarily a metric), the function dH,min := inf(X,Y )∈C(μ ,ν ) H(X,Y ) has the same property ) dH,min
m
m
k=1
k=1
*
∑ αk μk , ∑ αk νk
≤ max dH,min (μk , νk ) k=1,...,m
as soon as, in the previous notation, H(Xθ ,Yθ ) ≤ ∑ αk H(Xk ,Yk ). k
16.65. (a) λ = 12 (μ + ν ) is a probability measure that dominates both μ and ν . (b) Let λ1 , λ2 dominate μ , ν simultaneously. Assume first that λ1 λ2 . Then d μ d λ1 dμ = d λ2 d λ1 d λ2
λ2 − a.s.,
and thus d μ θ d μ 1−θ d μ θ d μ 1−θ d λ1 θ d λ1 1−θ d λ2 = d λ2 d λ2 d λ1 d λ2 d λ2 X d λ2 X d λ1
268
16 Weak convergence, probability metrics. Functional limit theorems
=
X
dμ d λ1
θ
dμ d λ1
1−θ
d λ1 d λ2
d λ2 =
X
dμ d λ1
θ
dμ d λ1
1−θ
d λ1 .
Now, let λ1 , λ2 do not satisfy any additional assumption. Then the measure λ3 = 1 2 (λ1 + λ2 ) dominates both λ1 and λ2 , and therefore, taking into account previous considerations, we get d μ θ d μ 1−θ d μ θ d μ 1−θ d λ1 = d λ3 d λ1 d λ3 X d λ1 X d λ3 d μ θ d μ 1−θ = d λ2 . d λ2 X d λ2 Invariance of Definitions 16.11, 16.13 with respect to the choice of λ can be proved analogously. 16.67. Denote f =d μ /d λ , g = d ν /d λ , then (μ − ν )(A) = A ( f − g) d λ , A ∈ X, and thus μ − ν var = X | f − g| d λ . Hence ( √ √ ( √ f − g)2 d λ ≤ | f − g|( f + g) d λ X X %1/2 $ %1/2 $ ( ( √ √ = dTV (μ , ν ) ≤ ( f − g)2 d λ ( f + g)2 d λ X X . = dH (μ , ν ) 2 + 2H1/2 (μ , ν ) ≤ 2dH (μ , ν ).
dH2 (μ , ν ) =
(
(
16.68. Denote f = d μ /d λ , g = d ν /d λ . Then H0 (μ , ν ) = X g d λ = 1, H1 (μ , ν ) = older inequality with p = 1/θ X f d λ = 1. Inequality Hθ ( μ , ν ) ≤ 1 comes from the H¨ applied to f θ , g1−θ . Log-convexity of the function Hμ ,ν is provided by the relation
f αθ1 +(1−α )θ2 g1−αθ1 −(1−α )θ2 d λ α 1−α α f θ1 g1−θ1 f θ2 g1−θ2 = d λ ≤ Hμα,ν (θ1 )Hμ1− ,ν (θ2 );
Hμ ,ν (αθ1 + (1 − α )θ2 ) =
X X
in the last inequality we have used the H¨older inequality with p = 1/α . This proves the statements (1), (2). The measures μ and ν are mutually singular if and only if μ (A) = 1, ν (A) = 0 for some set A ∈ X. This condition is equivalent to the condition for the product f g to be equal to zero λ -a.s. (verify this!), whence the statement (5) follows. Statement (3) follows from the statements (2) and (5): if Hμ ,ν takes value 0 at some point, then Hμ ,ν is zero identically (and thus is continuous) on (0, 1). If, otherwise, all the values of Hμ ,ν are positive, then θ → ln Hμ ,ν (θ ) is a bounded convex function on [0, 1], and thus is continuous on (0, 1). statement (4). If μ ν then one can assume λ = ν and Hθ (μ , ν ) = Consider θ d ν . We have f θ → f , θ → 1− pointwise. In addition, f θ ≤ f ∨ 1 ∈ L (X, ν ) for f 1 X every θ ∈ (0, 1), and, combined with the dominated convergence theorem, it implies that Hθ (μ , ν ) → X f d ν = 1 = H1 (μ , ν ), θ → 1−. On the other hand, if μ ν , then there exists A ∈ X with μ (A) > 0, ν (A) = 0. Hence, for every θ ∈ (0, 1)
16 Weak convergence, probability metrics. Functional limit theorems
Hθ (μ , ν ) =
f θ g1−θ d λ ≤ [μ (X\A)]θ .
X\A
Taking a limit as θ → 1−, we get lim sup Hθ (μ , ν ) ≤ μ (X\A) < 1 = H1 (μ , ν ); θ →1−
that is, the function Hμ ,ν is discontinuous at the point 1. 16.69. Denote hn (θ ) = Hθ (μn , νn ), HN (θ ) = ∏∞ n=N hn (θ ); then H1 (θ ) = Hθ ( μ , ν ) (verify this!). By the assumption, μn νn , and thus every function hn is continuous at the point 1 (Problem 16.68, item 4)). Assume μ ν . Then H1 is discontinuous at the point 1 and γ := lim infθ →1− H1 (θ ) < 1. Therefore, by statement (2) from the same problem, H1 ( 12 ) ≤ γ 1/2 . Then there exists N1 ∈ N such N1 −1 hn ( 12 ) ≤ γ 1/3 . Because hn (θ ) → 1, θ → 1− for any n ∈ N, we have that ∏n=1 together with statement (2) of Probthe relation lim infθ →1− HN1 (θ ) = γ, which, 1 1/2 lem 16.68, provides an estimate HN1 2 ≤ γ . Therefore, there exists N2 ∈ N such N2 −1 h 1 ≤ γ 1/3 . Repeating these arguments, we obtain a sequence Nk , k ≥ 1 that ∏n=N 1 n 2 Nk −1 hn 12 ≤ γ 1/3 < 1, k ∈ N (here we denote N0 = 1). Thus such that ∏n=N k−1 * ∞ 1 ∏ hn 2 ≤ ∏ γ 1/3 = 0, n=Nk−1 k=1
)
∞
H 1 (μ , ν ) = ∏ 2
k=1
Nk −1
and therefore μ ⊥ ν . −n 16.70. Take λ = 14 (μ + ν + ∑∞ n=1 2 ( μn + νn )) ; then the measure λ dominates every measure μ , ν , μn , νn , n ≥ 1. Denote f=
dμ dν , g= , dλ dλ
fn =
d μn d νn , gn = . dλ dλ
Then X
| fn − f |d λ = μn − μ var → 0,
X
|gn − g|d λ = νn − ν var → 0, n → ∞.
By the H¨older and Minkowski inequalities, θ |Hθ (μ , ν ) − Hθ (μn , νn )| = f θ g1−θ d λ − fnθ g1− d λ n X
≤ +
n → ∞, θ ∈ (0, 1). 2 2 16.71. (a) e−(θ (1−θ )(a1 −a2 ) )/(2σ ) .
X
X
X
| f − fn | d λ |g − gn | d λ
θ
X
g dλ
1−θ X
1−θ
fn d λ
1−θ
→ 0,
(b) σ_1^{1−θ} σ_2^{θ} / √((1 − θ)σ_1² + θσ_2²).
(c) One can assume that a_1 ≤ a_2. In this case, H_θ(μ, ν) = (b_1 − a_2)_+ / ((b_1 − a_1)^θ (b_2 − a_2)^{1−θ}), θ ∈ (0, 1).
16.72. H_θ(μ, ν) = exp[λ^θ ρ^{1−θ} − θλ − (1 − θ)ρ]. θ_* = log_{λ/ρ}[(λ/ρ − 1)/ln(λ/ρ)].
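This answer can be checked by summing the Poisson probabilities directly. A small Python sketch (an added illustration; SciPy is assumed for the Poisson pmf, and the truncation of the series at k = 200 is an arbitrary cutoff):

import numpy as np
from scipy.stats import poisson

lam, rho, theta = 3.0, 5.0, 0.4
k = np.arange(0, 200)
direct = np.sum(poisson.pmf(k, lam) ** theta * poisson.pmf(k, rho) ** (1 - theta))
closed = np.exp(lam ** theta * rho ** (1 - theta) - theta * lam - (1 - theta) * rho)
print(direct, closed)   # the two values coincide up to the truncation error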
16.73. H_θ(μ, ν) = exp[ ∑_{k=1}^m (λ_k^θ ρ_k^{1−θ} − θλ_k − (1 − θ)ρ_k) ].
16.74. Denote κ_i(T) = κ_i([0, T]), κ̂_i = κ_i/κ_i(T), i = 1, 2. Then
H_θ(μ, ν) = exp[ κ_1(T)^θ κ_2(T)^{1−θ} H_θ(κ̂_1, κ̂_2) − θκ_1(T) − (1 − θ)κ_2(T) ].
16.77. Let us prove that X ln h d μ − E(μ ν ) ≤ ln X h d ν for arbitrary measure μ ∈ P(X). If μ ν , then E(μ ν ) = +∞ and the required inequality holds true. If μ ν , denote f = d μ /d ν . Applying Jensen’s inequality to the concave function ln(·), we get X
ln h d μ − E(μ ν ) =
X
≤ ln
(ln h − ln f ) d μ =
X
h d μ = ln f
X
X
(ln h − ln f ) d μ
h dν .
On the other hand, if h/ f = const, then this inequality becomes an equality. Put d μ = f d ν , where f = h/( X h d ν ) if X h d μ > 0 and f ≡ 1 otherwise. Then μ is a probability measure with X ln h d μ − E(μ ν ) = ln X h d ν .
17 Statistics of stochastic processes
Theoretical grounds General statement of the problem of testing two hypotheses Let the trajectory x(·) of a stochastic process {X(t), t ∈ [0, T ]} be observed. It is known that the paths of the process belong to a metric space of functions F[0,T ] defined on [0, T ]. For example, it can be the space of continuous functions C([0, T ]) or Skorokhod space D([0, T ]); see Chapter 16. On F[0,T ] the Borel σ -field B is considered. Two hypotheses concerning finite-dimensional distributions of the process X are given. According to the hypothesis Hk , k = 1, 2, a probability measure μk on the σ -field B corresponds to the process {X(t)}; that is, under the hypothesis Hk the equality holds P(X(·) ∈ A) = μk (A), A ∈ B. Based on the observations, one has to select one of the hypotheses. It can be done on the basis of either randomized or nonrandomized decision rule, as we show below. It is said that a randomized decision rule R is given, if for each possible trajectory x(·) ∈ F[0,T ] the probability p(x(·)) is defined (here p is a measurable functional on F[0,T ] ) to accept H1 if the path x(·) is observed, and 1 − p(x(·)) is the probability to accept the alternative hypothesis H2 . The rule is characterized by the probabilities of Type I and Type II errors: α12 = P(H1 |H2 ) to accept H1 when H2 is true, and α21 = P(H2 |H1 ) to accept H2 when H1 is true. The error probabilities are expressed by integrals
α_{12} = ∫_{F_{[0,T]}} p(x) μ_2(dx),  α_{21} = ∫_{F_{[0,T]}} (1 − p(x)) μ_1(dx).
It is natural to look for the rules minimizing the error probabilities. In many cases it is enough to content oneself with nonrandomized decision rules, for which p(x(·)) takes only the values 0 and 1. Then F_{[0,T]} is partitioned into two measurable sets G_1 and G_2 := F_{[0,T]} \ G_1; if x(·) ∈ G_1 then H_1 is accepted, whereas if x(·) ∈ G_2 then H_2 is accepted. The set G_1 is called the critical region for testing H_1. The error probabilities are calculated as α_{ij} = μ_j(G_i), i ≠ j.
Absolutely continuous measures on function spaces Let μ1 and μ2 be two finite measures on the σ -field B in F[0,T ] . Definition 17.1. The measures μ1 and μ2 are singular if there exists a partition of the total space F[0,T ] into two sets Q1 and Q2 such that μ1 (Q2 ) = 0 and μ2 (Q1 ) = 0. Notation: μ1 ⊥ μ2 . Definition 17.2. The measure μ2 is absolutely continuous with respect to μ1 if for each A ∈ B such that μ1 (A) = 0, it holds μ2 (A) = 0. Notation: μ2 μ1 . If μ2 μ1 then by the Radon–Nikodim theorem there exists a B-measurable nonnegative function ρ (x) such that for all A ∈ B
μ_2(A) = ∫_A ρ(x) μ_1(dx).  (17.1)
The function is called the density of the measure μ2 with respect to μ1 . Notation:
ρ(x) = (dμ_2/dμ_1)(x).
Definition 17.3. If μ2 μ1 and μ1 μ2 simultaneously, then the measures μ1 and μ2 are called equivalent. Notation: μ1 ∼ μ2 . Measures μ1 and μ2 are equivalent if and only if the function ρ from the equality (17.1) is positive a.e. with respect to μ1 . In this case
μ_1(A) = ∫_A (1/ρ(x)) μ_2(dx), A ∈ B.
For any finite measures μ1 and μ2 one can find pairwise disjoint sets Δ1 , Δ2 , and Δ such that μ1 (Δ2 ) = 0 and μ2 (Δ1 ) = 0, and on Δ the measures are equivalent; that is, there exists a measurable function ρ : Δ → (0, +∞) such that for all A ∈ B,
μ_2(A ∩ Δ) = ∫_{A∩Δ} ρ(x) μ_1(dx),  μ_1(A ∩ Δ) = ∫_{A∩Δ} (1/ρ(x)) μ_2(dx).  (17.2)
Let H be a real separable infinite-dimensional Hilbert space. Consider a finite measure μ on the Borel σ-field B(H).
Definition 17.4. The mean value of a measure μ is the vector m_μ ∈ H such that
(m_μ, x) = ∫_H (z, x) dμ(z), x ∈ H.
The correlation operator of a measure μ is the linear operator S_μ in H such that
(S_μ x, y) = ∫_H (z − m_μ, x)(z − m_μ, y) dμ(z), x, y ∈ H.
It is known that the correlation operator S_μ of a measure μ, if it exists, is a continuous self-adjoint operator. Moreover it is positive; that is, (S_μ x, x) ≥ 0, x ∈ H.
Definition 17.5. A measure μ is called a Gaussian measure on H if for each linear continuous functional f on H, the induced measure μ ∘ f^{−1} is a Gaussian measure on the real line.
The correlation operator S_μ of a Gaussian measure μ always exists; moreover ∑_{i=1}^∞ λ_i(S_μ) < ∞, where λ_i(S_μ) are the eigenvalues of S_μ counted according to their multiplicity. Let {e_i, i ≥ 1} be the corresponding orthonormal eigenvectors. They form a basis in H. Define the operator √S_μ in H by
√S_μ x = ∑_{i=1}^∞ √(λ_i(S_μ)) (x, e_i) e_i, x ∈ H.
Theorem 17.1. (Hajek–Feldman theorem) Let μ and ν be two Gaussian measures in H, with common correlation operator S and mean values m_μ = 0 and m_ν = a. If a ∈ √S(H) then μ ∼ ν, and if a ∉ √S(H) then μ ⊥ ν. In the case μ ∼ ν, the Radon–Nikodim derivative is
dν/dμ(x) = exp{ −(1/2) ‖(√S)^{−1} a‖² + ∑_{k=1}^∞ x_k a_k / λ_k }.
Here λ_k are the positive eigenvalues of the operator S, ϕ_k are the corresponding eigenvectors, the coefficients x_k = (x, ϕ_k), a_k = (a, ϕ_k), k ≥ 1, and the series converges for μ-almost all x.
The Neyman–Pearson criterion
Fix ε ∈ (0, 1). The Neyman–Pearson criterion presents a randomized rule for hypothesis testing which, for a given upper bound ε of the Type I error (that is, when α_{12} ≤ ε), minimizes the Type II error α_{21}. Consider three cases concerning the measures μ_1 and μ_2 related to the hypotheses H_1 and H_2.
(1) μ_1 ⊥ μ_2. Then there exists a set G_1 such that μ_1(G_1) = 1 and μ_2(G_1) = 0. If x(·) ∈ G_1 then we accept H_1; otherwise, if x(·) ∉ G_1, then H_2 is accepted. We have
α12 = μ2 (G1 ) = 0, α21 = μ1 (F[0,T ] \ G1 ) = 0. Thus, in this case one can test the hypotheses without error. (2) μ1 ∼ μ2 . Let ρ be the density of μ2 with respect to μ1 . For λ > 0 denote Rλ = {x ∈ F[0,T ] : ρ (x) < λ }, Γ λ = {x ∈ F[0,T ] : ρ (x) = λ }. Then there exists λ¯ such that
μ_2(R^λ̄) ≤ ε,  μ_2(R^λ̄ ∪ Γ^λ̄) ≥ ε.
Consider three options.
(2a) μ_2(R^λ̄) = ε. Then set G_1 = R^λ̄. We have α_{12} = ε, α_{21} = 1 − μ_1(R^λ̄).
(2b) μ_2(R^λ̄) < ε and μ_2(R^λ̄ ∪ Γ^λ̄) = ε. Then set G_1 = R^λ̄ ∪ Γ^λ̄. We have α_{12} = ε, α_{21} = 1 − μ_1(R^λ̄) − μ_1(Γ^λ̄).
(2c) μ_2(R^λ̄) < ε and μ_2(R^λ̄ ∪ Γ^λ̄) > ε. Then we construct a randomized rule by means of the probability functional p(x): p(x) = 1 if x ∈ R^λ̄; p(x) = 0 if x ∈ F_{[0,T]} \ (R^λ̄ ∪ Γ^λ̄); p(x) = (ε − μ_2(R^λ̄)) / μ_2(Γ^λ̄) if x ∈ Γ^λ̄.
If the measure μ_2 of any single path equals 0, then in case (2c) we define a nonrandomized rule as follows. There exists D ⊂ Γ^λ̄ such that μ_2(D) = ε − μ_2(R^λ̄); we set G_1 = R^λ̄ ∪ D and obtain the decision rule with α_{12} = ε and minimal α_{21}.
(3) Now let μ_1 and μ_2 be neither singular nor equivalent. There exist pairwise disjoint sets Δ_1, Δ_2, and Δ such that (17.2) holds and μ_2(Δ_1) = μ_1(Δ_2) = 0. Let R^λ = {x ∈ Δ : ρ(x) < λ} ∪ Δ_1, Γ^λ = {x ∈ Δ : ρ(x) = λ}. Consider two options.
(3a) ε ≥ 1 − μ_2(Δ_2). We set G_1 = Δ_1 ∪ Δ; then
α12 = 1 − μ2 (Δ2 ) ≤ ε , α21 = 0. (3b) ε < 1 − μ2 (Δ2 ). Choose λ¯ such that
μ_2(R^λ̄) ≤ ε,  μ_2(R^λ̄ ∪ Γ^λ̄) ≥ ε,
and construct the rule as in case (2). Therefore, in order to construct an optimal criterion one has to find the sets Δk on which the singular measures are concentrated, or in the case of equivalent measures to find the relative density of measures, wherein the probability law of ρ (x(·)) is needed for each hypothesis. Hypothesis testing for diffusion processes The case of different diffusion matrices Let x(t), t ∈ [0, T ], be a path of a diffusion process in Rm , and under a hypothesis Hk the drift vector of the diffusion process is ak (t, x), and its diffusion matrix is Bk (t, x) (all the functions are continuous in both arguments); k = 1, 2. This means that under Hk the observed diffusion process is a weak solution to the stochastic integral equation
x(t) = x_0 + ∫_0^t a_k(s, x(s)) ds + ∫_0^t B_k^{1/2}(s, x(s)) dW(s), t ∈ [0, T].  (17.3)
Here W is an m-dimensional Wiener process; that is, W(t) = (W_1(t), . . . , W_m(t))', t ∈ [0, T], where W_i, i = 1, . . . , m, are independent scalar Wiener processes, and B_k^{1/2} is a positive semidefinite matrix such that its square is the positive semidefinite matrix B_k. For equation (17.3) the analogue of Theorem 14.5 about the existence of a weak solution holds true. Having the path x(t) one can find B_k(t, x(t)), t ∈ [0, T], provided the hypothesis H_k is true. This can be done as follows. For z ∈ R^m we set
λ(t, z) = lim_{n→∞} ∑_{k=0}^{2^n−1} ( x((k+1)t/2^n) − x(kt/2^n), z )².  (17.4)
The limit in (17.4) exists a.s. under each hypothesis H_k, k = 1, 2, and
λ(t, z) = ∫_0^t (B_k(s, x(s))z, z) ds, t ∈ [0, T],  (17.5)
if H_k is true. If on the observed path, for some t ∈ [0, T] and z ∈ R^m,
∫_0^t (B_1(s, x(s))z, z) ds ≠ ∫_0^t (B_2(s, x(s))z, z) ds,  (17.6)
then the equality (17.5) is correct only for a single value of k. Due to the continuity of the integrand functions, (17.6) holds true if and only if for some t ∈ [0, T] and z ∈ R^m, (B_1(t, x(t))z, z) ≠ (B_2(t, x(t))z, z). Thus, under this condition we accept H_k if (17.5) holds for that k, and finally obtain the error-free decision rule.
Condition for equivalence of measures, and distribution of density under various hypotheses
Now let, along the observed path, for all z ∈ R^m:
(B1 (t, x(t))z, z) = (B2 (t, x(t))z, z).
Then B1 (t, x(t)) ≡ B2 (t, x(t)). Therefore, one can assume that ∀t ∈ [0, T ], x ∈ Rm :
B1 (t, x) = B2 (t, x).
Assume that the distribution of x(0) is given and does not depend on the choice of the hypothesis. Denote
B(t, x) = B1 (t, x) = B2 (t, x), a(t, x) = a2 (t, x) − a1 (t, x). Let μk be a measure generated by the observed process on the space C([0, T ]) under the hypothesis Hk ; k = 1, 2. Theorem 17.2. For the equivalence of measures μ1 ∼ μ2 , the next condition is sufficient: for each t, x there exists b(t, x) ∈ Rm such that the next two conditions hold: (1)
a(t, x) = B(t, x) b(t, x);
(2) ∫_0^T (a(t, x(t)), b(t, x(t))) dt < ∞
for almost every x(·) with respect to the measure μ_2. Therein the density of μ_2 with respect to μ_1 is
ρ(x(·)) = exp{ ∫_0^T (b(t, x(t)), dx(t)) − (1/2) ∫_0^T (b(t, x(t)), a_1(t, x(t)) + a_2(t, x(t))) dt }.  (17.7)
Here the differential dx(t) is written based on the stochastic equation (17.3), and the first integral on the right-hand side of (17.7) is understood, respectively, as a sum of the Lebesgue integral and the stochastic Ito integral. Homogeneous in space processes Let ak (t, x) = ak (t) and Bk (t, x) = Bk (t), k = 1, 2; that is, the coefficients of the diffusion process do not depend on the spatial variable. As above we assume that all the coefficients are continuous functions. Then the process {x(t), t ∈ [0, T ]} has independent increments. From (17.5) it follows that
λ(t, z) = ∫_0^t (B_k(s)z, z) ds
if H_k is true. Therefore, the hypotheses are tested without error if there exists t such that B_1(t) ≠ B_2(t). Let B_1(t) = B_2(t) = B(t) and a(t) = a_2(t) − a_1(t). Denote by L_t the range {B(t)z : z ∈ R^m} and by E the set of those t ∈ [0, T] for which a(t) does not belong to L_t. Let P(t) be the projection operator onto L_t. If λ¹(E) > 0 then the hypotheses are tested without error:
I(x(·)) := ∫_0^T ‖P(t)(x(t) − a_1(t))‖² dt = 0
under the hypothesis H1 , and I(x(·)) > 0 under the hypothesis H2 . Now, let a(t) ∈ Lt , t ∈ [0, T ]; that is, ∀t ∃b(t) ∈ Rm : a(t) = B(t)b(t).
(17.8)
In order for the vector b(t) in (17.8) to be uniquely defined, we select it from the subspace L_t; this is possible because the matrix B(t) is symmetric. Note that then (a(t), b(t)) ≥ 0. Under condition (17.8), the necessary and sufficient condition for the absolute continuity of the measures μ_1 and μ_2 is
∫_0^T (a(t), b(t)) dt < ∞.  (17.9)
Under the conditions (17.8) and (17.9), the density of the measure μ_2 with respect to μ_1 in the space C([0, T]) is
ρ(x(·)) = exp{ ∫_0^T (b(t), dx(t)) − (1/2) ∫_0^T (b(t), a_1(t) + a_2(t)) dt }.  (17.10)
Under the hypothesis Hk it holds log ρ (x(·)) ∼ N(mk , σ 2 ) with
σ² = ∫_0^T (a(t), b(t)) dt,  m_k = (−1)^k σ²/2;  k = 1, 2.
This makes it possible to construct the Neyman–Pearson criterion. Next, we construct an error-free test under the assumption (17.8) and the condition
∫_0^T (a(t), b(t)) dt = +∞.  (17.11)
Select a sequence of continuous functions b_n(t) such that
n ≤ ∫_0^T (B(t)b_n(t), b_n(t)) dt = ∫_0^T (a(t), b_n(t)) dt < ∞
(this is possible due to the imposed assumptions). Then we accept the hypothesis H_1 if
lim_{n→∞} ( ∫_0^T (B(t)b_n(t), b_n(t)) dt )^{−1} ∫_0^T b_n(t) d(x(t) − a_1(t)) = 0;
otherwise the hypothesis H_2 is accepted.
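As an illustration of the Neyman–Pearson construction above, consider the simplest homogeneous case a_1 ≡ 0, a_2 ≡ γ, B ≡ 1 on [0, T] (this is the setting of Problem 17.1): then b ≡ γ, σ² = γ²T and log ρ(x) = γ x(T) − σ²/2. The Python sketch below (an added illustration; the constants are arbitrary) computes the threshold giving Type I error ε, the resulting Type II error, and checks both by Monte Carlo.

import numpy as np
from scipy.stats import norm

gamma, T, eps = 0.8, 4.0, 0.05
sigma2 = gamma ** 2 * T
# accept H1 when log rho < c, with c chosen so that P_{H2}(log rho < c) = eps
c = sigma2 / 2 + np.sqrt(sigma2) * norm.ppf(eps)
alpha21 = 1 - norm.cdf((c + sigma2 / 2) / np.sqrt(sigma2))      # Type II error
print("type II error:", alpha21)

rng = np.random.default_rng(2)
xT_H1 = np.sqrt(T) * rng.standard_normal(100000)                # x(T) under H1
xT_H2 = gamma * T + np.sqrt(T) * rng.standard_normal(100000)    # x(T) under H2
logrho = lambda xT: gamma * xT - sigma2 / 2
print(np.mean(logrho(xT_H2) < c), np.mean(logrho(xT_H1) >= c))  # approx eps and alpha21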
Hypothesis testing about the mean of Gaussian process Let x(t), t ∈ [0, T ], be a path of a scalar Gaussian process with given continuous correlation function R(t, s). Under the hypothesis H1 the mean of the process equals 0, whereas under the alternative hypothesis H2 the mean is equal to a given continuous function a(t). Condition for singularity of measures Introduce a linear operator R in X := L2 ([0, T ]), (Rg)(t) =
T
R(t, s)g(s)ds, g ∈ X, t ∈ [0, T ].
0
This is a Hilbert–Schmidt integral operator. Its eigenspace which corresponds to a zero eigenvalue is the kernel of the operator R. Also the operator has a sequence of positive eigenvalues and corresponding normalized eigenfunctions {λk , ϕk ; k ≥ 1} with ∑k≥1 λk < ∞, the functions ϕk are pairwise orthogonal, and their linear combinations are dense in the range of the operator R. Consider two cases. (1) In the space X the function a(·) has no series expansion in the functions ϕk . Then the measures μ1 and μ2 on the space X that correspond to the hypotheses H1 and H2 , are singular. We describe a decision rule. Let a(·) ˆ = a(·) − ∑ (a, ϕk )ϕk (·).
(17.12)
k≥1
Hereafter (·, ·) is the inner product in X, and in the case of an infinite number of ϕk , the series in (17.12) converges in the norm in X. If T
ˆ I(x(·)) :=
x(t)a(t)dt ˆ =0
0
then we accept H1 ; otherwise we accept H2 . (2) In the space X the function a(·) has a Fourier expansion in the functions ϕk : a(t) =
∑ ak ϕk (t),
ak := (a, ϕk ).
(17.13)
k≥1
(In the case of an infinite number of ϕk , the series in (17.13) converges in the norm in X). Assume additionally that ∞
a2
∑ λkk = ∞.
(17.14)
k=1
Then μ_1 ⊥ μ_2. A decision rule is constructed as follows. Select a sequence {m_n} such that
∀ n ≥ 1 : ∑_{k=1}^{m_n} a_k²/λ_k ≥ n.
If
lim_{n→∞} ( ∑_{k=1}^{m_n} a_k²/λ_k )^{−1} ∑_{k=1}^{m_n} (a_k/λ_k) ∫_0^T x(t) ϕ_k(t) dt = 0  (17.15)
then we accept the hypothesis H_1; otherwise we accept H_2.
Condition for equivalence of measures
In the notations of the previous subsection, the criterion of the equivalence of the measures μ_1 and μ_2 is the condition
∑_{k≥1} a_k²/λ_k < ∞.  (17.16)
Under this condition the density of μ_2 with respect to μ_1 is
ρ(x) = exp{ ∑_{k≥1} x_k a_k/λ_k − (1/2) ∑_{k≥1} a_k²/λ_k },  (17.17)
where x_k := (x, ϕ_k), and the first series under the exponent converges for μ_1-almost all x. Under the hypothesis H_k,
log ρ(x(·)) ∼ N(m_k, σ²),  σ² = ∑_{k≥1} a_k²/λ_k,  m_k = (−1)^k σ²/2;  k = 1, 2.
For the condition (17.16) it is sufficient that in X = L_2([0, T]) there exists a solution b(·) to the Fredholm equation of the first kind
a(t) = ∫_0^T R(t, s) b(s) ds, 0 ≤ t ≤ T.  (17.18)
Via this solution the density ρ can be written differently:
ρ(x) = exp{ ∫_0^T x(s) b(s) ds − (1/2) ∫_0^T a(s) b(s) ds }.  (17.19)
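The quantity σ² = ∑_k a_k²/λ_k can be computed numerically once the eigenpairs of R are available. The Python sketch below (an added illustration, not part of the text) uses the classical eigenpairs of the Wiener kernel R(t, s) = min(t, s) on [0, 1], namely λ_k = ((k − 1/2)π)^{−2} and ϕ_k(t) = √2 sin((k − 1/2)πt), together with the mean a(t) = t; in this case condition (17.16) holds and the sum equals ∫_0^1 (a'(t))² dt = 1.

import numpy as np

dt = 1.0 / 2000
t = (np.arange(2000) + 0.5) * dt       # midpoint grid on [0, 1]
a = t
total = 0.0
for k in range(1, 200):
    w = (k - 0.5) * np.pi
    phi = np.sqrt(2) * np.sin(w * t)
    ak = np.sum(a * phi) * dt          # a_k = (a, phi_k) in L_2([0, 1])
    total += ak ** 2 * w ** 2          # a_k^2 / lambda_k, since lambda_k = w^{-2}
print(total)                           # close to 1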
Parameter estimation of distributions of stochastic process Let x(t), 0 ≤ t ≤ T, be the observed path of a stochastic process that generates a probability measure μθ on the function space F[0,T ] . A parameter θ is to be estimated and belongs to a parameter set Θ , which is a complete separable metric space or a Borel subset in such a space. Definition 17.6. A function θ (x) : F[0,T ] → Θ which is B − B(Θ ) measurable is called the estimator θ (x) of the parameter θ for any family of measures μθ .
Assume that there exists a σ -finite measure ν on Borel σ -field B in F[0,T ] , with respect to which all the measures μθ are absolutely continuous and d μθ (x) = ρ (θ , x), θ ∈ Θ , x ∈ F[0,T ] . dν Then the family of measures {μθ , θ ∈ Θ } is called regular. Definition 17.7. An estimator θ (x) of the parameter θ for a regular family of measures {μθ } is called strictly consistent under increasing T, if θ (x) → θ as T → ∞, a.s. For a regular family of measures, the estimator can be found by the maximum likelihood method via maximization of the function ρ (θ , x(·)) on Θ . A real parameter θ for a regular family of measures {μθ , θ ∈ Θ } can be estimated by the Bayes method as well. Let Θ be a finite or infinite interval on the real line, on which a pdf is given. We call it the prior density of the parameter θ . Based on the path x = x(·) one can compute the posterior density
ρ(θ | x) := ρ(θ, x) ρ(θ) / ∫_Θ ρ(θ, x) ρ(θ) dθ,  θ ∈ Θ.
It is correctly defined if for the observed path it holds ρ (θ , x) > 0, for a.e. θ ∈ Θ . For an estimator θ (x), we introduce two loss functions: quadratic L2 (θ (x), θ ) := (θ (x) − θ )2 and all-or-nothing loss function L0 (θ (x), θ ) := 1Iθ (x)=θ . The latter is approximated by the functions Lε (θ (x), θ ) := 1I|θ (x)−θ |>ε as ε → 0+. Under the quadratic loss function, the Bayes estimator θˆ2 (x) of the parameter θ is defined as a minimum point of the next function (we suppose that the posterior density possesses a finite second moment):
Q(θ̂) := ∫_Θ L_2(θ̂, θ) ρ(θ | x) dθ,  θ̂ ∈ Θ.
This implies that θ̂_2(x) coincides with the expectation of the posterior distribution; that is, θ̂_2(x) = ∫_Θ θ ρ(θ | x) dθ.
Under the all-or-nothing loss function, we have the approximating cost functions Q_ε(θ̂) := ∫_Θ L_ε(θ̂, θ) ρ(θ | x) dθ, θ̂ ∈ Θ.
Their minimum points, under unimodal and smooth posterior density, tend to the mode of this density. Therefore, the mode is taken as the Bayes estimator θˆ0 (x) under a given loss function,
θ̂_0(x) = argmax_{θ∈Θ} ρ(θ | x).
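For a concrete example of the two Bayes estimators, take the setting of Problem 17.20, assuming the prior Γ(α, β) is parametrized by shape α and rate β: observing N(T) jumps of a homogeneous Poisson process on [0, T], the posterior is Γ(α + N(T), β + T), the quadratic-loss estimator is its mean, and the all-or-nothing estimator is its mode. A minimal Python sketch under these assumptions:

import numpy as np

alpha, beta, T = 2.0, 1.0, 10.0
true_lambda = 3.0
rng = np.random.default_rng(3)
NT = rng.poisson(true_lambda * T)                        # number of jumps on [0, T]

post_shape, post_rate = alpha + NT, beta + T             # posterior Gamma parameters
bayes_quadratic = post_shape / post_rate                 # posterior mean
bayes_all_or_nothing = (post_shape - 1) / post_rate      # posterior mode (for shape >= 1)
print(NT, bayes_quadratic, bayes_all_or_nothing)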
The case of a pairwise singular family {μθ , θ ∈ Θ } is more specific for statistics of stochastic processes, in contrast to classical mathematical statistics. It is natural to expect in this case, that the parameter θ can be estimated without error by a single path x(t), 0 ≤ t ≤ T . Definition 17.8. An estimator θ (x) of the parameter θ for a pairwise singular family of measures {μθ } is called consistent if ∀θ ∈ Θ :
μθ {x ∈ F[0,T ] : θ (x) = θ } = 1.
Thus, the consistent estimator makes it possible to find the parameter without error for a singular family of measures, which is impossible for a regular family of measures.
Bibliography [51], Chapter 24; [57], Chapters 7, 17; [31], Chapter 4; [37], Chapters 2–4.
Problems
17.1. On [0, T] a process is observed which is the Wiener process {W(t)} under the hypothesis H_1, and is the process {γt + W(t)} with given γ ≠ 0 under the hypothesis H_2. Construct the Neyman–Pearson test.
17.2. On [0, T] a process is observed which is a homogeneous Poisson process with intensity λ_k under the hypothesis H_k; k = 1, 2. (a) Prove that the corresponding measures in the space E = D([0, T]) are equivalent with density
dμ_2/dμ_1(x) = ρ(x(·)) = (λ_2/λ_1)^{x(T)} e^{(λ_1 − λ_2)T}, x ∈ E.
(b) Construct the Neyman–Pearson criterion to test the hypotheses. 17.3. Let {x(t), t ∈ [0, T ]} and μ1 , μ2 be the objects described in the subsection of Theoretical grounds Hypothesis testing about the mean of Gaussian process. Condition for singularity of measures. Prove that: (a) In cases (1) and (2) of the above-mentioned subsection the measures μ1 and μ2 are singular. (b) Under condition (17.16) it holds μ1 ∼ μ2 , and the density of μ2 with respect to μ1 is given in (17.17).
17.4. Prove that the decision rule described in the subsection of Theoretical grounds Hypothesis testing about the mean of Gaussian process. Condition for singularity of measures, case (1), tests the hypotheses without error. 17.5. Prove that the decision rule described in the subsection of Theoretical grounds Hypothesis testing about the mean of Gaussian process. Condition for singularity of measures, case (2), tests the hypotheses without error. 17.6. On [0, 1] a path x(·) of a Gaussian process with correlation function e−|t−s| is observed. Under the hypothesis H1 the mean of the process equals 0, whereas under the hypothesis H2 it is equal to a given function a ∈ C2 ([0, 1]) with a (0) = a(0), a (1) = −a(1). Prove that the Neyman–Pearson criterion is constructed as follows. If 1 a(s) − a (s) ds < σ Φ −1 (ε ) + σ 2 x(s) 2 0
then H_1 is accepted; otherwise H_2 is accepted. Here σ > 0 and 2σ² = ‖a‖²_{L_2} + ‖a'‖²_{L_2} + a(0)² + a(1)²,
Φ is the cdf of the standard normal law, and Φ^{−1} is the inverse function to Φ.
17.7. On [0, T] a path x(·) of a zero mean Gaussian process is observed. Under the hypothesis H_1 the correlation function of the process is R(t, s), whereas under the hypothesis H_2 it is equal to σ²R(t, s) with unknown positive σ² ≠ 1. Because σ² is unknown, the hypothesis H_2 is composite. Here R(t, s) is a given continuous function such that the integral operator in L_2([0, T]),
(Ag)(t) = ∫_0^T R(t, s) g(s) ds, t ∈ [0, T], g ∈ L_2([0, T]),
has an infinite number of positive eigenvalues. Construct an error-free criterion to test the hypotheses. 17.8. On [0, 2] a path x(·) of a scalar diffusion process is observed. Under the hypothesis H1 the diffusion coefficient b1 (t, x) = 1, whereas under the hypothesis H2 the diffusion coefficient b2 (t, x) = t. Under each hypothesis Hk the drift coefficient is the unknown continuous function ak (t, x), k = 1, 2. Construct an error-free test. 17.9. On [0, T ] a path x(·) of a scalar diffusion process starting from 0 is observed. Under both hypotheses H1 and H2 its diffusion coefficient is t, t ∈ [0, T ], and under the hypothesis H1 its drift coefficient is 0. (a) Let T < 1 and under H2 the drift coefficient is | logt|−1/2 for t ∈ (0, T ] and 0 for t = 0. Construct an error-free test. √ (b) Let under H2 the drift coefficient be t, t ∈ [0, T ]. Construct the Neyman– Pearson criterion to test the hypotheses.
17.10. On [0, T ] a path x(·) of a two-dimensional diffusion process starting from the origin is observed. Under the hypotheses H1 and H2 its diffusion matrix is diagonal with entries 1 and t on√the√diagonal. Under H1 its drift vector is 0, whereas under H2 the drift vector is ( 4 t; 4 t)' . Construct the Neyman–Pearson criterion to test the hypotheses. 17.11. On [0, T ] a path N(·) of a homogeneous Poisson process with intensity λ is observed. (a) Based on Problem 17.2 (a), show that the maximum likelihood estimator of the parameter λ is λˆ T = N(T )/T (more precisely the maximum in λ > 0 of the density at the observed path is attained if N(T ) > 0, and in the case N(T ) = 0 we set λˆ T = 0). λ ; it is a strongly (b) Prove the next: it is an unbiased estimator; that is, Eλˆ T = √ consistent estimator under increasing T ; the normalized estimator T (λˆ T − λ ) converges in distribution to the normal law as T → ∞; that is, λˆ T is an asymptotically normal estimator. 17.12. On [0, T ] a path N(·) is observed of a nonhomogeneous Poisson process with intensity function λ t (it is a density of the intensity measure with respect to Lebesgue measure). (a) Show that the maximum likelihood estimator of the parameter λ is λˆ T = 2N(T )/T 2 (more precisely, the maximum in λ > 0 of the density on the observed path is attained if N(T ) > 0, and in the case N(T ) = 0 we set λˆ T = 0). (b) Prove that λˆ is an unbiased and strongly consistent estimator (see the corresponding definitions in Problem 17.11). (c) Prove that T (λˆ T − λ ) converges in distribution to the normal law as T → ∞; that is, λˆ T is an asymptotically normal estimator. 17.13. Let f , g ∈ C(R+ ); f (t) ≥ 0, t > 0; g(t) > 0, t > 0. Nonhomogeneous Poisson processes {N f (t), Ng (t), t ≥ 0} are given with intensity functions f and g (these functions are the densities of the intensity measures with respect to Lebesgue measure). Let μ1 be the measure generated by the process Ng on D([0, T ]), and μ2 be the similar measure for N f . Prove that μ2 μ1 and T
dμ_2/dμ_1(x) = ∏_i (f(t_i)/g(t_i))^{1I{x(t_i) − x(t_i−) = 1}} exp{ ∫_0^T (g(t) − f(t)) dt },  (17.20)
x ∈ D([0, T ]). Here ti are jump points of the function x, and if x ∈ C([0, T ]) then the product in (17.20) is set to be equal to 1. 17.14. On [0, T ] a path N(·) is observed of a nonhomogeneous Poisson process with intensity function 1 + λ0t, λ0 > 0 (it is a density of the intensity measure with respect to Lebesgue measure). (a) Write an equation for the maximum likelihood estimator λˆ T of the parameter λ0 and show that with probability 1 this equation has a unique positive root for all T ≥ T0 (ω ). (b) Prove that λˆ T is strongly consistent; that is, λˆ T → λ as T → ∞, a.s.
17.15. On [0, T ] a path x(·) is observed of a mean square continuous stochastic process with given correlation function r(s,t). For the integral operator J on L2 ([0, T ]) with the kernel r(t, s) it holds that Ker J = {0}. A mean value m of the process is estimated, and m does not depend on t. Let ⎫ ⎧ T ⎬ ⎨ M = mˆ = f (t)x(t)dt f ∈ C([0, T ]); ∀ m ∈ R : Em mˆ = m ⎭ ⎩ 0
(that is, M is a certain class of linear unbiased estimators). Here the integral is a mean square limit of integral Riemann sums, and E_m is a standard notation for the expectation with respect to the distribution μ_m of the observed process with mean m. Prove that:
(a) inf_{m̂∈M} D m̂ = ( ∑_{n=1}^∞ λ_n^{−1} a_n² )^{−1},
where {λn , ϕn , n ≥ 1} are all the eigenvalues of J and corresponding orthonormal eigenfunctions, and an = 0T ϕn (t)dt, n ≥ 1. (b) In particular if the series in (a) diverges then ∃ {mˆ k , k ≥ 1} ⊂ M : mˆ k → m as k → ∞, a.s.; that is, then mˆ k is strictly consistent in the sense of Definition 17.7. 17.16. On [0, T ] a path x(·) is observed of a mean square continuous stochastic process with given correlation function r(s,t). A mean value m of the process is estimated, and m does not depend on t. Let M = {mˆ F =
T 0
x(t)dF(t) | F is a function of bounded variation;
∀ m ∈ R : Em mˆ F = m}. Here the integral is a mean square limit of integral Riemann sums. For the notation Em see the previous problem. Suppose that there exists an estimator mˆ F0 ∈ M such that for all s ∈ [0, T ], T 0 r(s,t)dF0 (t) = C. Prove that min Dmˆ H = Dmˆ F0 = C.
mˆ H ∈M
17.17. On [0, T ] a path x(·) of the process with given correlation function r(s,t) is observed . A mean value m of the process is estimated, and m does not depend on t. Let M be the class of estimators from Problem 17.16. Prove that the next estimator has the least variance in M. (a) mˆ 1 = (2 + β T )−1 x(0) + x(T ) + β 0T x(t)dt , if r(s,t) = exp{−β |t − s|} with β > 0. (b) mˆ 2 = x(0) if r(s,t) = min(s + 1,t + 1). Dm, ˆ where mˆ G ∈ M and G (c) In cases (a) and (b) prove that Dmˆ G > minm∈M ˆ is an absolutely continuous function of bounded variation (that is, G(t) = G(0) + t 0 f (s)ds, t ∈ [0, T ], with f ∈ L1 ([0, T ])).
17.18. On [0, T] a path x(·) of the process {μt + σW(t)} is observed, where W is a separable Wiener process with unknown parameters μ ∈ R and σ² > 0. (a) Construct an error-free estimate of the parameter σ². (b) For a fixed σ², prove that the maximum likelihood estimator of the parameter μ is μ̂_T = x(T)/T. (c) Prove that the expectation of μ̂_T is μ, that μ̂_T → μ as T → ∞ a.s., and that √T(μ̂_T − μ) ∼ N(0, σ²).
where W is a Wiener process, the unknown function ϕ belongs to a fixed subspace K ⊂ L2 ([0, T ]); {gi } are given functions from L2 ([0, T ]) that are linearly independent modulus K; that is, a linear combination of these functions which belongs to K is always a combination with zero coefficients; and θ = (θ1 , . . . , θm )' ∈ Rm and σ > 0 are unknown parameters. Let M = ⎫ ⎧ T ⎬ ⎨ θˆ = f (t)dx(t) f ∈ L2 ([0, T ], Rm ); ∀ θ ∈ Rm ∀ ϕ ∈ K : Eϕ ,θ θˆ = θ . ⎭ ⎩ 0
Prove that there exists a unique estimator θˆ ∗ ∈ M such that for any estimate θˆ ∈ M the matrix S − S∗ is positive semidefinite. Here S and S∗ are covariance matrices of the estimators θˆ and θˆ ∗ . 17.22. Let X = [0, 1]2 , and Θ = [0, 1]∪[2, 3], and for θ ∈ Θ μθ be a measure on B(X). If θ ∈ [0, 1] then μθ is Lebesgue measure on [0, 1] × {θ }, whereas for θ ∈ [2, 3], μθ is Lebesgue measure on {θ − 2} × [0, 1]. (a) Check that the measures {μθ } are pairwise singular. (b) Prove that there is no consistent estimator θ (x), x ∈ X, of the parameter θ .
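Editorial illustration for Problems 17.18(a) and (b), not part of the original text: the quadratic variation of a simulated path of μt + σW(t) recovers σ^2 (error-free in the limit of dyadic refinement), while x(T)/T estimates μ. Grid size, seed, and parameter values are arbitrary choices.

```python
import math, random

def sample_path(mu, sigma, T, n, rng):
    """Values of x(t) = mu*t + sigma*W(t) on the dyadic grid k*T/2**n."""
    dt = T / 2 ** n
    x, path = 0.0, [0.0]
    for _ in range(2 ** n):
        x += mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

rng = random.Random(0)
mu, sigma, T = 0.5, 2.0, 100.0
path = sample_path(mu, sigma, T, 14, rng)

# (a) quadratic variation divided by T approximates sigma^2
qv = sum((b - a) ** 2 for a, b in zip(path, path[1:]))
print("sigma^2 estimate:", qv / T)        # close to 4.0

# (b) maximum likelihood estimator of mu for known sigma
print("mu estimate:", path[-1] / T)        # close to 0.5, with sd sigma/sqrt(T)
```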
Hints
17.1. Both processes are diffusion ones with continuous paths, and the density is ρ(x) = (dμ_2/dμ_1)(x), x ∈ C([0, T]).
17.2. (a) Use the representation of a homogeneous Poisson process given in Problem 5.17. (b) The Neyman–Pearson criterion for equivalent measures can be applied.
17.3. The μ_1 and μ_2 are Gaussian measures on the Hilbert space X = L_2([0, T]). Use the Hajek–Feldman theorem.
17.4. Let L be the closure of the set R(X), where R is an integral operator with kernel R(t, s). If a vector h is orthogonal to R(X) then under the hypothesis H_1 the variance of the r.v. (x(·), h(·)) is 0; therefore, the r.v. is equal to 0, a.s. Then μ_1(L) = 1.
17.5. Under the hypothesis H_1, {(x(·), ϕ_k(·)), k ≥ 1} is a sequence of independent Gaussian random variables with distributions N(0, λ_k), k ≥ 1.
17.6. Solve the integral equation (17.18) where T = 1, a(·) is the function from the problem situation, and R(t, s) = e^{−|t−s|}, t, s ∈ [0, 1].
17.7. Let {ϕ_n, n ∈ N} be an orthonormal system of eigenfunctions of the operator A with corresponding eigenvalues λ_n, n ∈ N. Then under both hypotheses x_n := ∫_0^T x(t)ϕ_n(t)dt, n ≥ 1, is a sequence of centered independent Gaussian random variables.
17.8. Because the diffusion coefficients are different, singular measures on C([0, 2]) correspond to the hypotheses.
17.9. (a) The equality (17.11) holds true. (b) The density of μ_2 with respect to μ_1 can be found by the formula (17.10).
17.10. The condition (17.9) holds.
17.11. (a) Let ρ_λ(x) = (dμ_λ/dμ_1)(x), x ∈ D([0, T]). Here μ_λ, λ > 0, is the measure generated by a homogeneous Poisson process with intensity λ. Then λ̂_T is a point of maximum in λ > 0 of the log-density L(λ; N) := log ρ_λ(N). (b) For T ∈ N use the SLLN and CLT.
17.12. (a) Let ν_λ be the measure on D([0, T]) generated by the given process. The formula for the density ρ_λ(x) := (dν_λ/dν_1)(x) is derived similarly to Problem 17.2 (a). (b), (c) The process {N_1(t) := N(√(2t)), t ≥ 0} is a homogeneous process with intensity λ.
17.13. Use Problem 5.17 and generalize the solution of Problem 17.2 (a).
17.14. (a) Use Problem 17.13. The derivative in λ of the log-density L(λ, N) at the observed path is a strictly decreasing function in λ. (b) Investigate the behavior of the function ϕ(λ, T) := T^{−2} ∂L(λ, N)/∂λ as T → ∞, when λ is from the complement of a fixed neighborhood of λ_0.
17.15. (a) Expand f in a Fourier series with respect to the basis {ϕ_n}. (b) Use the Riesz lemma about a subsequence of random variables that converges a.s.
17.16. Let the minimum of the variance be attained at m̂_F ∈ M. For α, β ∈ [0, T] introduce G(t) = 1I_{t≥α} − 1I_{t≥β}, t ∈ [0, T]. Then for all δ ∈ R it holds m̂_{F+δG} ∈ M.
17.17. Use Problem 17.16.
17.18. (a) Use Problem 17.4. (b) Let the measure μ_1 correspond to the process {σW(t)}, and the measure μ_2 be generated by the given process. Use the formula (17.7). (c) Use Problem 3.18.
17.19. Use the density ρ(x) from the solution of Problem 17.18.
17.20. The density of the distribution of the process is derived in Problem 17.2.
17.21. Reformulate this problem in terms of vectors in the space H = L_2([0, T]).
17.22. (b) Argue by contradiction. Let θ(x) be a consistent estimator. Introduce A_1 = {x : θ(x) ∈ [0, 1]}. Then for all x_2 ∈ [0, 1] it holds λ^1({x_1 ∈ [0, 1] : (x_1, x_2) ∈ A_1}) = 1.
Answers and Solutions
17.1. Under the hypothesis H_k the observed process {x(t), t ∈ [0, T]} generates a measure μ_k on the space E = C([0, T]); k = 1, 2. By Theorem 17.2 we have μ_1 ∼ μ_2, and by formula (17.7) it holds

ρ(x) = exp{ γx(T) − (γ^2/2)T }, x ∈ E.

Without loss of generality we can assume that γ > 0 (for γ < 0 one should consider the process y(t) := −x(t)). Then for λ > 0

R_λ := {x ∈ E : ρ(x) < λ} = { x ∈ E : x(T) < (1/γ)(log λ + (γ^2/2)T) }.

Under the hypothesis H_1, x(T) ∼ N(0, T), whereas under the hypothesis H_2, x(T) ∼ N(γT, T). Fix ε ∈ (0, 1). We are looking for c = c(ε) such that μ_2({x : x(T) < c}) = ε. We have

μ_2({x : x(T) < c}) = μ_2( { x : (x(T) − γT)/√T < (c − γT)/√T } ) = Φ( (c − γT)/√T ) = ε

for c(ε) = Φ^{−1}(ε)√T + γT. According to the Neyman–Pearson criterion we accept H_1 if x(T) < c(ε); otherwise we accept H_2. At that

α_{21} = 1 − μ_1({x : x(T) < c(ε)}) = Φ̄( c(ε)/√T ),

where Φ̄ := 1 − Φ.
17.2. (a) Let {ξ_n, n ≥ 1} be independent random variables, uniformly distributed on [0, T], and ν_i ∼ Pois(λ_i T), ν_i independent of {ξ_n}, i = 1, 2. According to Problem 5.17 the process

X_i(t) = ∑_{n=1}^{ν_i} 1I_{ξ_n ≤ t}, t ∈ [0, T],
is a homogeneous Poisson process on [0, T] with intensity λ_i. A measure μ_i on E that is generated by the process X_i is concentrated on the set of functions of the form f_0(t) = 0 and

f_k(x, t) = ∑_{n=1}^k 1I_{x_n ≤ t}, t ∈ [0, T],

where k ≥ 1 and x = (x_1, . . . , x_k) is a vector of k distinct points from the interval (0, T). Let F_k = { f_k(x, ·) | x = (x_1, . . . , x_k) ∈ A_k }, where A_k is a symmetric Borel set in (0, T)^k, and F_0 = { f_0 }. Then

μ_i(F_k) = P{ ∑_{n=1}^{ν_i} 1I_{ξ_n ≤ t} ∈ F_k, ν_i = k } = P{(ξ_1, . . . , ξ_k) ∈ A_k} · P{ν_i = k},

where k ≥ 1 and i = 1, 2. Hence for k ≥ 0 we have

μ_2(F_k)/μ_1(F_k) = P{ν_2 = k}/P{ν_1 = k} = (λ_2/λ_1)^k e^{(λ_1 − λ_2)T}.

Therefore, for any Borel set B ⊂ D([0, T]),

∫_B (λ_2/λ_1)^{x(T)} e^{(λ_1 − λ_2)T} dμ_1(x) = ∑_{k=0}^∞ ∫_{{x ∈ B : x(T) = k}} (λ_2/λ_1)^{x(T)} e^{(λ_1 − λ_2)T} dμ_1(x) = ∑_{k=0}^∞ μ_2({x ∈ B : x(T) = k}) = μ_2(B).
(b) Suppose that λ_2 > λ_1. For λ > 0 we have

R_λ := {x ∈ E : ρ(x) < λ} = {x ∈ E : x(T) < y}, y = (log λ + (λ_2 − λ_1)T) / (log λ_2 − log λ_1).

Given ε ∈ (0, 1) we are looking for y = n_ε ∈ Z_+ such that μ_2(R_λ) < ε and μ_2(R_λ ∪ Γ_λ) ≥ ε. Under the hypothesis H_2 we have x(T) ∼ Pois(λ_2 T). The desired n_ε can be found uniquely from the condition

∑_{0 ≤ k < n_ε} ((Tλ_2)^k / k!) e^{−Tλ_2} < ε ≤ ∑_{0 ≤ k ≤ n_ε} ((Tλ_2)^k / k!) e^{−Tλ_2}.

If x(T) < n_ε then we accept the hypothesis H_1, and if x(T) > n_ε then we accept H_2. In the case x(T) = n_ε we accept H_1 with probability p and accept H_2 with probability 1 − p. Here

p = (ε − μ_2(R_λ)) / μ_2(Γ_λ) = ( ε − ∑_{k < n_ε} ((Tλ_2)^k / k!) e^{−Tλ_2} ) ( ((Tλ_2)^{n_ε} / n_ε!) e^{−Tλ_2} )^{−1}.
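A small numerical companion to this solution (an editorial addition, not from the original text): the sketch below computes the critical value n_ε and the boundary randomization probability p for given λ_2, T, and ε, so that the probability under H_2 of deciding in favour of H_1 is exactly ε.

```python
import math

def poisson_cdf_pmf(mean, k):
    """Return (P{X < k}, P{X = k}) for X ~ Pois(mean)."""
    pmf = math.exp(-mean)
    cdf_below = 0.0
    for j in range(k + 1):
        if j == k:
            return cdf_below, pmf
        cdf_below += pmf
        pmf *= mean / (j + 1)

def critical_value(lam2, T, eps):
    """Smallest n with P{Pois(lam2*T) <= n} >= eps, plus the probability p of
    accepting H1 at the boundary x(T) = n (cf. the solution of Problem 17.2(b))."""
    mean = lam2 * T
    n = 0
    while True:
        below, at = poisson_cdf_pmf(mean, n)
        if below + at >= eps:
            p = (eps - below) / at    # makes P_{H2}(accept H1) equal to eps
            return n, p
        n += 1

print(critical_value(lam2=2.0, T=5.0, eps=0.05))
```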
17.3. We have R^{1/2} g = ∑_{k≥1} λ_k^{1/2} (g, ϕ_k) ϕ_k, g ∈ X. The measure μ_1 is Gaussian with zero mean and the correlation operator R, and the measure μ_2 is Gaussian as well with the same correlation operator and the mean a = a(t) ∈ X. By the Hajek–Feldman theorem, in the case a(·) ∈ R^{1/2}(X) the measures are equivalent; otherwise they are singular. In the case (1) from the above-mentioned subsection, a(·) does not belong to the closure L of the set R^{1/2}(X), and in the case (2) it holds a(·) ∈ L \ R^{1/2}(X). Therefore, in both cases the measures are singular. Under the condition (17.16) it holds a(·) ∈ R^{1/2}(X), and then μ_1 ∼ μ_2. The desired density is found by the Hajek–Feldman theorem.
17.4. The function â(·) from (17.12) is a nonzero vector orthogonal to L. Under the hypothesis H_1 it holds x(·) ∈ L a.s. (see Hints), hence I(x(·)) = (x, â) = 0 a.s. Under the hypothesis H_2 we have x(t) = a(t) + x_0(t) where x_0(·) ∈ L a.s., and then I(x(·)) = (a, â) + (x_0, â) = (a, â) a.s. Moreover,

(a, â) = ||a||^2 − ∑_{k≥1} (a, ϕ_k)^2 > 0

because a ∉ L by the problem situation.
17.5. Let z_n be the Gaussian r.v. under the limit in (17.15). Under the hypothesis H_1 we have Ez_n = 0,

Dz_n = ( ∑_{k=1}^{m_n} a_k^2/λ_k )^{−1} ≤ 1/n, E z_n^4 ≤ 3/n^2,

therefore,

∑_{n=1}^∞ E z_n^4 < ∞.

For ε > 0 by the Chebyshev inequality we have P{|z_n| ≥ ε} ≤ ε^{−4} E z_n^4, thus

∑_{n=1}^∞ P{|z_n| ≥ ε} < ∞,

and by the Borel–Cantelli lemma with probability 1 for n ≥ n_0(ε, ω) it holds |z_n| < ε. Therefore, under the hypothesis H_1 it holds z_n → 0, a.s. Next, under the hypothesis H_2,

z_n = 1 + ( ∑_{k=1}^{m_n} a_k^2/λ_k )^{−1} ∑_{k=1}^{m_n} (a_k/λ_k)(x − a, ϕ_k) → 1, a.s.
17.6. We are looking for a continuous solution b(·) to the integral equation

a(t) = ∫_0^1 e^{−|t−s|} b(s)ds, t ∈ [0, 1].

Rewrite the equation in the form

a(t) = ∫_0^t e^{s−t} b(s)ds + ∫_t^1 e^{t−s} b(s)ds,

whence

a'(t) = −∫_0^t e^{s−t} b(s)ds + ∫_t^1 e^{t−s} b(s)ds,
a''(t) = ∫_0^t e^{s−t} b(s)ds + ∫_t^1 e^{t−s} b(s)ds − 2b(t),

b(t) = (a(t) − a''(t))/2, t ∈ [0, 1].   (17.21)

Integrating by parts we verify that this continuous function does satisfy the given integral equation. It is essential that a ∈ C^2([0, 1]), a'(0) = a(0), and a'(1) = −a(1). Then the density ρ = dμ_2/dμ_1 can be written in the form (17.19), with the function b(·) given in (17.21). We have

∫_0^1 a(s)b(s)ds = (1/2)( ∫_0^1 a^2(s)ds − ∫_0^1 a(s)a''(s)ds ) = (1/2)( ||a||^2_{L_2} + ||a'||^2_{L_2} + a^2(0) + a^2(1) ),

which is denoted by σ^2 in the problem situation. Then

log ρ(x) = ∫_0^1 x(s)b(s)ds − σ^2/2, x ∈ C([0, 1]),   (17.22)

and under both hypotheses

D log ρ(x(·)) = ∫_0^1 ( ∫_0^1 e^{−|t−s|} b(s)ds ) b(t)dt = ∫_0^1 a(t)b(t)dt = σ^2.

Next, under H_1,

E log ρ(x(·)) = −σ^2/2 and log ρ(x(·)) ∼ N(−σ^2/2, σ^2).

According to representation (17.22) we have under H_2 that

E log ρ(x(·)) = ∫_0^1 a(s)b(s)ds − σ^2/2 = σ^2/2 and log ρ(x(·)) ∼ N(σ^2/2, σ^2).

We use the Neyman–Pearson criterion to construct the decision rule. Fix ε ∈ (0, 1). We accept H_1 if log ρ(x(·)) < C, where the threshold C is found from the equation

μ_2({x : log ρ(x) < C}) = P{ N(σ^2/2, σ^2) < C } = ε,

and C = σΦ^{−1}(ε) + σ^2/2. Thus, H_1 is accepted if

∫_0^1 x(s)b(s)ds < σΦ^{−1}(ε) + σ^2;

otherwise we accept H_2. Here the Type II error is

α_{21} = μ_1({x : log ρ(x) ≥ C}) = P{ N(−σ^2/2, σ^2) ≥ C } = Φ̄( Φ^{−1}(ε) + σ ).

17.7. Under the hypothesis H_1 we have Dx_n = λ_n, whereas under the hypothesis H_2 it holds Dx_n = σ^2 λ_n. By the SLLN, under H_1 we have

y_n := (1/n) ∑_{k=1}^n x_k^2/λ_k → 1, a.s.,

whereas under the hypothesis H_2 it holds y_n → σ^2 ≠ 1, a.s. Thus, we accept H_1 if

lim_{n→∞} (1/n) ∑_{k=1}^n (1/λ_k) ( ∫_0^T x(t)ϕ_k(t)dt )^2 = 1;

otherwise we accept H_2.
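Editorial illustration of this criterion (not from the original text), using the Wiener process as a concrete example: its Karhunen–Loève eigenfunctions ϕ_k(t) = √(2/T) sin((k − 1/2)πt/T) with eigenvalues λ_k = (T/((k − 1/2)π))^2 are assumed, and the averaged normalized squared coefficients estimate σ^2.

```python
import math, random

def wiener_path(sigma, T, m, rng):
    """sigma * W(t) on the grid t_j = j*T/m."""
    dt = T / m
    w, path = 0.0, [0.0]
    for _ in range(m):
        w += sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        path.append(w)
    return path

def estimate_sigma2(path, T, n):
    """y_n = (1/n) * sum_{k<=n} x_k^2 / lambda_k for the Karhunen-Loeve basis
    of the Wiener process on [0, T]; tends to sigma^2 as n grows."""
    m = len(path) - 1
    dt = T / m
    total = 0.0
    for k in range(1, n + 1):
        lam = (T / ((k - 0.5) * math.pi)) ** 2
        xk = sum(x * math.sqrt(2.0 / T) * math.sin((k - 0.5) * math.pi * (j * dt) / T) * dt
                 for j, x in enumerate(path))
        total += xk * xk / lam
    return total / n

rng = random.Random(3)
T = 1.0
print(estimate_sigma2(wiener_path(1.0, T, 4000, rng), T, 50))  # near 1: accept H1
print(estimate_sigma2(wiener_path(2.0, T, 4000, rng), T, 50))  # near 4: accept H2
```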
17.8. For t ∈ [0, 2] we set

λ(t) = lim_{n→∞} ∑_{k=0}^{2^n − 1} ( x((k+1)t/2^n) − x(kt/2^n) )^2.

Under the hypothesis H_k this limit exists a.s., and by the formula (17.5)

λ(t) = ∫_0^t b_k(s, x(s))ds; k = 1, 2.

In particular λ(t) = t for k = 1, and λ(t) = t^2/2 for k = 2. The values of these functions differ, for example, at t = 1. Therefore, we calculate

λ(1) = lim_{n→∞} ∑_{k=0}^{2^n − 1} ( x((k+1)/2^n) − x(k/2^n) )^2.
We accept the hypothesis H1 if λ (1) = 1; otherwise we accept H2 . 17.9. (a) In the notations of subsection Homogeneous in space processes from Theoretical grounds, we have a1 (t) = 0, B(t) = t; a2 (t) = | logt|−1/2 for t ∈ (0, T ], a2 (0) = 0; a(t) = a2 (t). There exists a solution b(t) to the equation (17.8) and it is equal to ⎧ t =0 ⎨ 0, 1 b(t) = , 0 < t ≤ T. ⎩ ( t | logt| Integral (17.11) is divergent: T
a(t)b(t)dt =
0
T 0
dt = +∞. t| logt|
That is why measures in the space C([0, T ]) that correspond to the distribution of the process under the hypotheses H1 and H2 are singular. In order to construct an error-free criterion we have to find a sequence of continuous functions bn (t), n ≥ 1, t ∈ [0, T ], such that T
lim
n→∞
a(t)bn (t)dt = +∞.
0
The functions can be defined as follows: T b(t), T n ≤ t ≤ T, bn (t) = b n , 0 ≤ T < Tn .
If
T
lim (
n→∞
tb2n (t)dt)−1
0
T
bn (t)dx(t) = 0
0
then we accept H1 ; otherwise we accept H2 . (b) In the notations of subsection √ Homogeneous in space processes from Theoretical grounds, we have a2 (t) = t = a(t); b(t) = t −1/2 for t ∈ (0, T ] and b(0) = 0. Condition (17.9) holds, therefore, a density of μ2 with respect to μ1 in the space C([0, T ]) is equal to T
ρ (x) = exp{
0
dx(t) 1 √ − 2 t
log ρ (x) =
T 0
T 0
1 √ √ tdt}, t
dx(t) T √ − . 2 t
Under the hypothesis Hk we have 1 T log ρ (x) ∼ N(mk , σ 2 ), σ 2 = T , mk = (−1)k σ 2 = (−1)k ; 2 2
k = 1, 2.
Fix a bound ε for the Type I error, ε ∈ (0, 1). We accept the hypothesis H1 if log ρ (x(·)) < L := σ Φ −1 (ε ) +
T 2
which is equivalent to T 0
dx(t) √ √ < T Φ −1 (ε ) + T. t
Otherwise we accept H2 . Then α12 = ε ,
L − m1 σ √ α21 = Φ¯ Φ −1 (ε ) + T .
α21 = P{N(m1 , σ 2 ) ≥ L} = Φ¯
,
17.10. In the notations of subsection Homogeneous in space processes from Theo√ √ retical grounds we have a1 (t) = 0,√ B(t) = diag(1,t), a2 (t) = a(t) = ( 4 t; 4 t)' . From equation (17.8) we find b(t) = ( 4 t; b2 (t))' with b2 (t) = t −3/4 for t ∈ (0, T ] and b2 (0) = 0. Check the condition (17.9): T 0
(a(t), b(t))dt =
T √ 0
1 t+√ t
2 dt = T 3/2 + 2T 1/2 < ∞. 3
Then by the formula (17.10) log ρ (x) =
T √
T
0
0
4
t dx1 (t) +
dx2 (t) − t 3/4
)
* T 3/2 1/2 +T , 3
where (x1 (t), x2 (t))' = x(t) is the observed vector path. Under the hypothesis Hk , k = 1, 2, * ) σ2 T 3/2 2 2 1/2 +T , mk = (−1)k . log ρ (x(·)) ∼ N(mk , σ ), σ = 2 3 2 Let the bound ε of the Type I error be given, ε ∈ (0, 1). We accept the hypothesis H1 if σ2 , log ρ (x(·)) < L := σ Φ −1 (ε ) + 2 which is equivalent to T √
T
0
0
4
t dx1 (t) +
dx2 (t) < σ 2; t 3/4
otherwise we accept H2 . Then α12 = ε and L − m1 2 ¯ = Φ¯ Φ −1 (ε ) + σ . α21 = P{N(m1 , σ ) ≥ L} = Φ σ 17.11. (a) The log-density is L(λ ; N) = N(T ) log λ + (1 − λ )T, λ > 0. In the case N(T ) > 0 it attains its maximum at λ = λˆ T := T −1 N(T ). (c) Introduce an i.i.d. sequence ni = N(i) − N(i − 1), i ≥ 1; n1 ∼ Pois(λ ). For N = k ∈ N consider as k → ∞ : N(k) n1 + · · · + nk = → En1 = λ , a.s. λˆ k = k k
(17.23)
Next, for any real T ≥ 1 consider N([T ]) [T ] N(T ) − N([T ]) · + δT , δT := . λˆ T = [T ] T T
(17.24)
As a result of (17.23) the first summand in (17.24) tends to λ , a.s. We have 0 ≤ δT ≤
N([T ] + 1) − N([T ]) nm+1 = , m := [T ]. [T ] m
It remains to prove that as m → ∞, nm+1 → 0, a.s. m
(17.25)
Consider
∞
∑E
m=1
n
m+1
m
2
=
∞
λ +λ2 < ∞. 2 m=1 m
∑
For ε > 0 we have by the Chebyshev inequality that # " n 1 nm+1 2 m+1 , P >ε ≤ 2 m ε m therefore, a series of these probabilities converges. Then by the Borel–Cantelli lemma |nm+1 /m| ≤ ε for all m ≥ m0 (ω ), a.s. This implies (17.25). (d) For T = k ∈ N we have as k → ∞: √ (n1 − λ ) + · · · + (nk − λ ) d √ k(λˆ k − λ ) = → N(0, Dn1 ) = N(0, λ ). k
(17.26)
According to the expansion (17.24), for any T ≥ 1 we have √ N([T ]) − λ T √ + T δT . T (λˆ T − λ ) = T Now, (17.26) implies that the first summand converges in distribution to N(0, λ ) as T → ∞. The second summand is estimated as √ nm+1 0 ≤ T δT ≤ √ , T √ √ P where m = [T ]. Then E| T δT | → 0 as T → ∞, and T δT → 0 as T → ∞. Finally the Slutzky lemma implies that √ d T (λˆ T − λ ) → N(0, λ ), T → ∞. 17.12. (a) The density mentioned in Hints is equal to
ρλ (x) = λ x(T ) e(1−λ )T
2 /2
, x ∈ D([0, T ]),
and this implies the desired relation. (b), (c) For the process N1 introduced in Hints, we have N1 (T 2 /2) , λˆ T = T 2 /2 hence the desired relation follows from Problem 17.1. In particular 3 T2 ˆ d d (λT − λ ) → N(0, λ ), T (λˆ T − λ ) → N(0, 2λ ), 2 as T → ∞. 17.13. Let {ξn1 , n ≥ 1} be independent random variables distributed on [0, T ] −1 with a density gT (t) = g(t) 0T g(s)ds , and {ξn2 , n ≥ 1} be an i.i.d. sequence
with similar density fT generated by the function f ; ν1 ∼ Pois 0T g(t)dt and ν2 ∼ Pois 0T f (t)dt , and νi is independent of {ξni , n ≥ 1}, i = 1, 2. According to Problem 5.17 the processes 4 Xi (t) =
νi
∑ 1Iξni ≥t ,
t ∈ [0, T ]
n=1
are nonhomogeneous Poisson processes on [0, T ] with intensity functions g (for i = 1) and f (for i = 2). In the notations from the solution to Problem 17.2 (a), we have for the measures μ1 and μ2 for k ≥ 1: μ2 (Fk ) = P{(ξ12 , . . . , ξk2 ) ∈ Ak } · P{ν2 = k} =
k fT (ti )
k
∏ gT (ti ) · ∏ gT (ti )dt1 . . . dtk · P{ν1 = k} k=1 Ak i=1 ⎛ ⎞k T
⎜ × ⎝ 0T
g(t)dt f (t)dt
T ⎟ ⎠ exp{ (g(t) − f (t))dt}. 0
0
Here ∏ki=1 gT (ti ) is a density of random vector (ξ11 , . . . , ξk1 ). Then
μ2 (Fk ) =
Ak
T
f (ti ) ∏ g(ti ) × exp{ (g(t) − f (t))dt}d μ1 (x), i=1 k
0
where ti = ti (x) are jump points of a step-function x(·). As in the solution to Problem 17.2 (a), this implies that for any Borel set B ⊂ D([0, T ]),
μ2 (B) =
B
T
f (ti ) ∏ g(ti ) 1Ix(ti )−x(ti −)=1 exp{ (g(t) − f (t))dt}d μ1 (x). i 0
17.14. (a) Based on Problem 17.13 the estimator λˆ T is found as a maximum point of the function N(T ) λT2 , λ > 0, L0 (λ , N) = ∑ log(1 + λ ti ) − 2 i=1 or (for N(T ) ≥ 1) as a solution to the equation hT (λ ) :=
1 N(T ) ti 1 = , λ > 0. ∑ 2 T i=1 1 + λ ti 2
The function h_T is strictly decreasing and continuous in λ ≥ 0. We have h_T(+∞) = 0 < 1/2. For the existence of a unique solution one has to ensure that h_T(0) =
1 N(T ) 1 ti > . ∑ 2 T i=1 2
(17.27)
We have Eh_T(0) = T^{−2} EN(T) · E(t_i | t_i ≤ T). Here t_i is any jump point of the observed path, and under the condition t_i ≤ T its density is equal to (1 + λ_0 t)/(T + λ_0 T^2/2). Then

Eh_T(0) = (1/T^2) ∫_0^T t(1 + λ_0 t)dt = 1/2 + λ_0 T/3 > 1/2.
It is straightforward to check that hT (0) − EhT (0) → 0 as T → ∞, a.s. Therefore, (17.28) implies that the inequality (17.27) holds with probability 1 for all T ≥ T0 (ω ). For such T ≥ T0 (ω ) there exists a unique maximum point of the function L0 (λ , N). (b) Notice that EhT (λ0 ) =
1 T2
T 0
t(1 + λ0t) 1 dt = . 1 + λ0 t 2
Fix 0 < ε < λ0 . For 0 < λ ≤ λ0 − ε we have hT (λ ) ≥ hT (λ0 − ε ) = EhT (λ0 − ε ) + o(1) ≥
1 + δ1 (ε ) + o(1). 2
Here δ1 (ε ) > 0 and o(1) is a r.v. tending to 0 as T → ∞, a.s. In a similar way for λ ≥ λ0 + ε we have hT (λ ) ≤ hT (λ0 + ε ) = EhT (λ0 + ε ) + o(1) ≤
1 − δ2 (ε ) + o(1). 2
Because hT (λˆ T ) = 12 , then with probability 1 there exists Tε (ω ) such that for all T ≥ Tε (ω ) it holds |λˆ T − λ | < ε . This proves the strong consistency of λˆ T . 17.15. (a) Let f generate mˆ ∈ M. The unbiasedness of mˆ is equivalent to the condition T 0 f (t)dt = 1. Then ∞
Dmˆ = (J f , f ) = ∑ λn c2n , cn := ( f , ϕn ), n ≥ 1, i=1
at that
∑∞ n=1 cn an
= 1. By the Cauchy–Schwartz inequality ) 1=
∞
∑ cn an
n=1
*2 ≤
∞
∞
n=1
n=1
∑ λn c2n · ∑ λn−1 a2n ,
(17.29)
−1 2 −1 . and Dmˆ ≥ ∑∞ n=1 λn an Let the latter series converge (the divergency case is treated in solution (b) below). The equality in (17.29) is attained if cn is proportional to λn−1 an (though −1 ∑∞ n=1 λn an ϕn is not necessarily a continuous function). Introduce a continuous function
)
N
∑
fN (t) =
*−1
λn−1 a2n
n=1
N
· ∑ λn−1 an ϕn (t), N ≥ 1, t ∈ [0, T ], n=1
and the corresponding estimator mˆ N =
T
) fN (t)x(t)dt = m +
N
∑
*−1
λn−1 a2n
n=1
0
Here 1 xn = √ λn
T
N
∑
.
λn−1 an xn .
n=1
ϕn (t)(x(t) − m) dt, n ≥ 1,
0
is a sequence of uncorrelated random variables with zero mean and unit variance. Then ) *−1 ) *−1 N
∑ λn−1 a2n
lim Dmˆ N = lim
N→∞
N→∞
=
n=1
∞
∑ λn−1 a2n
.
n=1
Moreover there exists the mean square limit of mˆ N as N → ∞, and this is an −1 2 −1 . This estimator can be out unbiased estimator mˆ ∗ with variance ∑∞ n=1 λn an of M. (b) In this case Dmˆ N → 0 as N → ∞. Now, from the unbiasedness of mˆ N it follows P that mˆ N → m as N → ∞, and by the Riesz lemma there exists a subsequence of estimators {mˆ N(k) , k ≥ 1} that converges to m, a.s. 17.16. Let mˆ F ∈ M. Then 0T dF(t) = m and Dmˆ F =
T T
r(s,t)dF(s)dF(t) =: Φ (F).
0 0
Suppose that the minimum of the variance is attained at mˆ F , and G is a function introduced in Hints. For each δ ∈ R we have
Φ (F + δ G) = Φ (F) + δ Φ (G) + 2δ
T
2
R(t)dG(t) ≥ Φ (F)
0
where R(s) =
T 0
r(s,t)dF(t), s ∈ [0, T ]. This implies that β
R(t)dG(t) = R(α ) − R(β ) = 0.
α
Therefore, R(s) ≡ C and Dmˆ F = 0T R(s)dF(s) = C. Vice versa, let F be a function of bounded variation such that mˆ F ∈ M and R(s) ≡ C. Let mˆ H ∈ M; then for G := H − F we have 0T dG(t) = 0 and
Dmˆ H = Φ (F + G) = Φ (F) + Φ (G) + 2
T
R(t)dG(t) 0
= Φ (F) + Φ (G) ≥ Φ (F) = Dmˆ F . 17.17. (a) mˆ 1 = 0T x(t)dF(t), F(t) = (2 + β T )−1 (1It>0 + 1It≥T + β t) . The equality holds 0T dF(t) = F(T ) − F(0) = 1, therefore, mˆ 1 ∈ M. Next, it is straightforward that for each s ∈ [0, T ], T
e−β |t−s| dF(t) =
0
2 = const, 2+βT
thus, mˆ 1 has the least variance in M. (b) mˆ 2 = 0T x(t)dF(t), F(t) = 1It>0 . This estimator belongs to M, and for each s ∈ [0, T ] it holds T
min(s + 1,t + 1)dF(t) = 1 = const.
0
That is why mˆ 2 has the least variance in M. (c) Content ourself with case (a). Based on Problem 17.16 it is enough to show there is no such function f ∈ L1 ([0, T ]) that T
e−β |t−s| f (t)dt ≡ 1, s ∈ [0, T ].
(17.30)
0
To the contrary, suppose that (17.30) holds. Differentiating we obtain that for almost every s ∈ [0, T ], −β s
−β e
s
βt
βs
e f (t)dt + β e
T
e−β t f (t)dt = 0.
s
0
The last two equalities imply that −β s
s
2e
eβ t f (t)dt = 1
0
for almost every s ∈ [0, T ]. Both parts of the equation are continuous in s; then
s βt βs 0 e f (t)dt ≡ e /2. Differentiating this identity we obtain f (s) = β /2 for almost
every s ∈ [0, T ]. But this function does not satisfy (17.30), because the integral I(s) =
T
−β |t−s|
e 0
dt =
T −s
e−β |u| du,
−s
is not a constant. We came to contradiction.
s ∈ [0, T ],
17.18. (a)

σ^2 = T^{−1} lim_{n→∞} ∑_{k=0}^{2^n − 1} ( x((k+1)T/2^n) − x(kT/2^n) )^2, a.s.
(b) ρ(x) = (dμ_2/dμ_1)(x) = exp{ σ^{−2}( μx(T) − μ^2 T/2 ) }, x ∈ C([0, T]). Hence, μ̂_T = argmax_μ log ρ(x(·)) = T^{−1} x(T). (c) It holds μ̂_T = μ + σW(T)/T, which implies the desired relations. For that Problem 3.18 is used.
17.19. (a) Write down the prior density up to multipliers that do not depend on μ:
ρ(μ) ∼ exp{ −μ^2/(2σ_0^2) + μμ_0/σ_0^2 }.

Then the posterior density ρ(μ|x) is proportional to the expression

ρ(μ|x) ∼ ρ(μ)ρ(x|μ) ∼ exp{ −Aμ^2/2 + Bμ },

with A := T/σ^2 + 1/σ_0^2 and B := x(T)/σ^2 + μ_0/σ_0^2. Therefore,

ρ(μ|x) ∼ exp{ −(μ − BA^{−1})^2 / (2A^{−1}) }.

The posterior distribution will be the normal law N(μ_T, σ_T^2) with parameters

μ_T = BA^{−1} = (μ_0 σ^2 + σ_0^2 x(T)) / (Tσ_0^2 + σ^2) and σ_T^2 = σ_0^2 σ^2 / (Tσ_0^2 + σ^2).

(b) The Bayes estimator is

μ_T = K_T μ_0 + (1 − K_T) x(T)/T with K_T = σ^2 / (σ^2 + Tσ_0^2).

Thus, the estimator is a convex combination of the prior estimator μ_0 and the maximum likelihood estimator T^{−1} x(T). The coefficient K_T is called the confidence factor. As T → ∞ it tends to 0 (that is, for large T we give credence to the maximum likelihood estimator), while as T → 0 it tends to 1 (that is, for small T we give more credence to the prior information rather than to the data). Answer: the posterior distribution is N(μ_T, σ_T^2) with

μ_T = K_T μ_0 + (1 − K_T) x(T)/T, K_T = σ^2 / (σ^2 + Tσ_0^2), σ_T^2 = σ_0^2 σ^2 / (Tσ_0^2 + σ^2),
and μ_T is the Bayes estimator of the parameter μ.
17.20. (a) Up to multipliers that do not depend on λ, ρ(λ) ∼ λ^{α−1} e^{−βλ}. According to Problem 17.2, the density of the distribution of the process is ρ(N|λ) ∼ λ^{N(T)} e^{−λT}. Then ρ(λ|N) ∼ ρ(λ)ρ(N|λ) ∼ λ^{α+N(T)−1} e^{−(β+T)λ}. The posterior distribution is the gamma distribution Γ(α + N(T), β + T).
(b) In the first case the Bayes estimator is

λ̂_{T1} = (α + N(T))/(β + T) = K_1 (α/β) + (1 − K_1) (N(T)/T).

Here K_1 = β/(β + T) is the confidence factor (see the discussion in the solution of Problem 17.19). In the second case the Bayes estimator is

λ̂_{T2} = (α + N(T) − 1)/(β + T).

It exists if α + N(T) > 1. Under the additional constraint α > 1 it holds

λ̂_{T2} = K_2 (α − 1)/β + (1 − K_2) (N(T)/T) with K_2 = K_1.

This estimator is a convex combination of the prior estimator (α − 1)/β (under the same loss function) and the maximum likelihood estimator. Answer: the gamma distribution Γ(α + N(T), β + T),

λ̂_{T1} = K (α/β) + (1 − K) (N(T)/T) with K = β/(β + T),
λ̂_{T2} = K (α − 1)/β + (1 − K) (N(T)/T).

17.21. The unbiasedness condition E_{ϕ,θ} θ̂ = θ means the following:

∀ ϕ ∈ K : < f, ϕ > = 0; < f, g' > = I_m.
Here g = (g1 , . . . , gm )' ; < f , ϕ > is a column vector with components ( f , ϕi ), and < f , g' > is a matrix with entries ( fi , g j ); Im stands for the unit matrix of size m. Looking for an optimal estimator in the class M is reduced to the following problem. Find a vector function f = ( f1 , . . . , fm )' such that: (1) { f1 , . . . , fm } ⊂ K ⊥ . (2) < f , g' >= Im . (3) For each vector function h that satisfies the conditions (1) and (2), the matrix < h, h' > − < f , f ' > is positive semidefinite.
Let P be the projection operator on K, and Pg = (Pg1 , . . . , Pgm )' , Φ =< Pg, (Pg)' >. The matrix Φ is nonsingular as a result of linear independence of the functions {Pgi }. The desired vector is unique and has a form f = f ∗ = Φ −1 Pg. The covariance matrix S∗ of the corresponding estimator is S∗ = E[(θˆ ∗ − θ )(θˆ ∗ − θ )' ] = σ 2 Φ −1 . 17.22. (b) Continue the reasoning from Hints. By the Fubini theorem
λ^2(A_1) = ∫_{[0,1]} λ^1({x_1 : (x_1, x_2) ∈ A_1}) dλ^1(x_2) = 1.

Here λ^2 is Lebesgue measure on the plane. Next, A_2 := X \ A_1 = {x : θ(x) ∈ [2, 3]} and the consistency of the estimator implies that for all x_1 ∈ [0, 1] it holds λ^1({x_2 ∈ [0, 1] : (x_1, x_2) ∈ A_2}) = 1. Therefore,

λ^2(A_2) = ∫_{[0,1]} 1 · dλ^1(x_1) = 1.
But due to additivity of a measure, 1 = λ 2 (X) = λ 2 (A1 ) + λ 2 (A2 ) = 2. We came to a contradiction.
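Before leaving this chapter, a short editorial illustration (not from the original text) of the two conjugate updates obtained in Problems 17.19 and 17.20; the numerical inputs are arbitrary.

```python
def normal_posterior(x_T, T, sigma2, mu0, sigma0_2):
    """Posterior N(mu_T, sigma_T^2) for the drift mu of mu*t + sigma*W(t)
    observed on [0, T], with prior N(mu0, sigma0_2); cf. Problem 17.19."""
    K = sigma2 / (sigma2 + T * sigma0_2)            # confidence factor
    mu_T = K * mu0 + (1.0 - K) * x_T / T
    sigma_T2 = sigma0_2 * sigma2 / (T * sigma0_2 + sigma2)
    return mu_T, sigma_T2

def gamma_posterior(N_T, T, alpha, beta):
    """Posterior Gamma(alpha + N(T), beta + T) for a Poisson intensity with a
    Gamma(alpha, beta) prior, and the two Bayes estimators of Problem 17.20."""
    a, b = alpha + N_T, beta + T
    return (a, b), a / b, (a - 1.0) / b             # posterior, quadratic loss, all-or-nothing loss

print(normal_posterior(x_T=12.0, T=20.0, sigma2=4.0, mu0=0.0, sigma0_2=1.0))
print(gamma_posterior(N_T=37, T=25.0, alpha=2.0, beta=1.0))
```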
18 Stochastic processes in financial mathematics (discrete time)
Theoretical grounds Consider a model of a financial market with a finite number of periods (i.e., of the moments of time) at which it is possible to trade, consume, spend, or receive money or other valuables. The model consists of the following components. There exist d + 1 financial assets, d ≥ 1, and the prices of these assets are available at moments t ∈ T = {0, 1, . . . , T }. The price of the ith asset at moment t is a nonnegative r.v. Si (t) defined on the fixed probability space (Ω , F, P). This space is assumed to support some filtration {Ft }t∈T , and we suppose that the random vector St = (S0 (t), S(t)) = (S0 (t), S1 (t), . . . , Sd (t)) is measurable with respect to the σ -field Ft . With the purpose of technical simplifying we assume in what follows that F0 = {∅, Ω } and FT = F. In the most applications the asset S0 (t) is considered as a risk-free (riskless) bond (num´eraire), and sometimes it is supposed that S0 (t) = (1 + r)t , where r > −1 is a risk-free interest rate. In real situations r > 0, but it is not obligatory. Other assets are considered as risky ones, for example, stocks, property, currency, and so on. Definition 18.1. A predictable d + 1-dimensional stochastic process ξ = (ξ 0 , ξ ) = {(ξ 0 (t), ξ 1 (t), . . . , ξ d (t)), t ∈ T} is called the trading strategy (portfolio) of a financial investor. A coordinate ξ i (t) of the strategy ξ corresponds to the quantity of units of the ith asset during the tth trading period between the moments t − 1 and t. Therefore, ξ i (t)Si (t − 1) is the sum invested into the ith asset at the moment t − 1, and ξ i (t)Si (t) is the corresponding sum at the moment t. The total value of the portfolio at the moment t − 1 equals (ξ (t), S(t − 1)) = ∑di=0 ξ i (t)Si (t − 1), and at the moment t this value can be equated to (ξ (t), S(t)) = ∑di=0 ξ i (t)Si (t) ( ( · , · ) is, as always, the symbol of the inner product in Euclidean space). The predictability of the strategy reflects the fact that the distribution of resources happens at the beginning of each trading period, when the future prices are unknown. Definition 18.2. The strategy ξ is called self-financing, if the investor’s capital V (t) satisfies the equality V (t) = (ξ (t), S(t)) = (ξ (t + 1), S(t)), t ∈ {1, 2, . . . , T − 1}. D. Gusak et al., Theory of Stochastic Processes, Problem Books in Mathematics, 303 c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-87862-1 18,
The self-financing property of the trading strategy means that the portfolio is always redistributed in such a way that its total value is preserved. The strategy is self-financing if and only if for any t ∈ {1, 2, . . . , T},

V(t) = (ξ(t), S(t)) = (ξ(1), S(0)) + ∑_{k=1}^t (ξ(k), S(k) − S(k − 1)).
The value (ξ (1), S(0)) is the initial investment that is necessary for the purchasing of the portfolio ξ (1). Below we assume that S0 (t) > 0 P-a.s., for all t ∈ T. In this case it is possible to define the discounted prices of the assets X i (t) := (Si (t))/(S0 (t)), t ∈ T, i = 0, 1, . . . , d. Evidently, after the discounting we obtain that X 0 (t) ≡ 1, and X(t) = (X 1 (t), . . . , X d (t)) is the value of the vector of risk assets in terms of units of the asset S0 (t), which is the discounting factor. Despite the fact that the asset S0 (t) is called risk-free and the vector X(t) is called the vector of risk assets, these notions are relative, to some extent. Introduce the extended vector of discounted prices X(t) = (1, X 1 (t), . . . , X d (t)). Definition 18.3. A stochastic process of the form {V (t), Ft ,t ∈ T} where V (t) = (ξ (t), X(t)) =
(ξ(t), S(t)) / S^0(t) is called the discounted capital of the investor. If a strategy is self-financing, the equality

V(t) = (ξ(1), X(0)) + ∑_{k=1}^t (ξ(k), X(k) − X(k − 1))

holds true for all t ∈ T. Here ∑_{k=1}^0 := 0.
Definition 18.4. A self-financing strategy is called the arbitrage possibility if its capital V satisfies inequalities V (0) ≤ 0, V (T ) ≥ 0 P-a.s., and P(V (T ) > 0) > 0. Definition 18.5. A probability measure Q on (Ω , F) is called the martingale measure, if the vector-valued discounted price process {X(t), Ft ,t ∈ T} is a d-dimensional Q-martingale; that is, EQ X i (t) < ∞ and X i (s) = EQ (X i (t)/Fs ), 0 ≤ s ≤ t ≤ T , 1 ≤ i ≤ d. Theorem 18.1. A financial market is free of arbitrage if and only if the set P of all martingale measures, which is equivalent to measure P, is nonempty. In this case there exists a measure P∗ ∈ P with bounded density dP∗ /dP. Definition 18.6. A nonnegative r.v. C on (Ω , F, P) is called the European contingent claim (payoff). The European contingent claim can be interpreted as an asset that guarantees to its owner the payment C(ω ) at moment T . The moment T is called the expiration date, or maturity date of the claim C. The corresponding discounted contingent claim has a form H = C/ST0 . If this discounted contingent claim can be presented in a functional form, namely, H = f (X(·), where f : Rd(T +1) → R+ is a measurable function, then it is called the derivative, or derivative security, of the vector of primary financial assets X(t),t ∈ T.
Definition 18.7. A contingent claim C is called attainable (replicable, redundant), if there exists a self-financing strategy ξ such that the value of portfolio at the maturity date equals C; that is, C = (ξ T , ST ) P-a.s. In this case we say that a strategy ξ creates a replicating portfolio for C (replicates C, is a hedging strategy for C). A contingent claim is attainable if and only if the corresponding discounted contingent claim has a form T
H = (ξ T , X T ) = VT = V0 + ∑ (ξk , (Xk − Xk−1 )), k=1
for some self-financing strategy ξ . Definition 18.8. An arbitrage-free financial market is called complete if on this market any contingent claim is attainable. Theorem 18.2. A financial market is complete if and only if there exists and is unique the equivalent martingale measure. Definition 18.9. A number π (H) is called the arbitrage-free price (fair price) of discounted European contingent claim H if there exists a nonnegative adapted stochastic process X d+1 = {X d+1 (t), Ft ,t ∈ T} such that X d (0) = π (H), X d (T ) = H, and the extended financial market (X 0 (t), . . . , X d+1 (t)) is arbitrage-free. Theorem 18.3. If a contingent claim is attainable then it has a unique arbitrage-free price EP∗ (H). If a contingent claim is not attainable, then the set of its arbitragefree prices is an interval of the form (π ↓ (H), π ↑ (H)) on nonnegative axis (possibly π ↑ (H) = ∞). Definition 18.10. A nonnegative adapted stochastic process C = {C(t), Ft ,t ∈ T} is called the American contingent claim. An American contingent claim is a contract that is issued at moment t = 0 and obliges the writer to pay a certain amount C(t), provided the buyer decides at moment t to exercise this contract. The contract is exercised only once. If the buyer has not decided to exercise the contract till the maturity date T , then at this moment the contract is automatically exercised. The buyer has a possibility to exercise the contract not only at nonrandom moment t ∈ T, but at any stopping time τ ∈ T. The aim of the buyer is to find an optimal stopping time τ0 in the sense that EC(τ0 ) = sup0≤τ ≤T EC(τ ), where τ are stopping times, or, in terms of the corresponding discounted contingent claim H(t) = (C(t))/(S0 (t)), EH(τ0 ) = sup0≤τ ≤T EH(τ ). Definition 18.11. An American call option on an asset S is the derivative that can be exercised at any stopping time τ ∈ T, and in this case the payment is (S(τ ) − K)+ , where S(t) is the price at moment t of the underlying asset. An American put option is defined similarly. The strategies of the buyer and writer of an American option are different: the buyer wants to exercise the option at that moment τ0 where the mean value of the payment is the biggest, and the writer wants to create his portfolio in order to have a possibility to exercise the option whenever the buyer comes.
Bibliography [55], Chapter IV; [23], Chapters 1,5,6; [84], Volume 2, Chapters V and VI; [21], Chapters I and II; [46], Chapters 4–8; [54], Chapter IX; [62], Chapters 1–4.
Problems
18.1. The owner of a European call option has a right, but not an obligation, to buy some asset, for example, some stock S, at moment T at price K, which is fixed initially. This price is called the strike price. Similarly, the owner of a European put option has a right, but not an obligation, to sell some asset, for example, the same stock S, at moment T at the price K, which is fixed initially. (1) Prove that the value of the call option C = C_call equals C = C(T) = (S(T) − K)^+, and the value of the put option P = P_put equals P = P(T) = (K − S(T))^+. (2) Let a financial market be arbitrage-free, π(C) be the arbitrage-free price of a European call option, and π(P) be the arbitrage-free price of the European put option, both with strike price K. Prove that π(C) ≤ S(0) and π(P) ≤ K, where S(0) is the initial price of the risk asset.
18.2. (1) We know that the financial market is arbitrage-free, the price of an asset at moment 0 equals S(0), and at moment T the possible values of this asset are S(ω_i), i = 1, . . . , M. Also, let the risk-free interest rate at any moment equal r. What is the risk-free price of a European call option on this asset if the strike price equals K with K < min_{1≤i≤M} S(ω_i)? (2) What is the risk-free price of a European call option on this asset if the strike price is zero?
18.3. We know that the financial market is arbitrage-free, the interest rate equals r at any period, and T is the expiration date. (1) Prove the following inequalities by constructing an explicit arbitrage strategy in the opposite case: (a) π(P) ≥ K(1 + r)^{−T} − S(0). (b) π(C) ≥ S(0) − K(1 + r)^{−T}. (2) Using the definition of the martingale measure prove the following specifications of the inequalities from item (1). (a) Prove that the arbitrage-free price of a European put option admits the bounds max(0, (1 + r)^{−T}K − S(0)) ≤ π(P) ≤ (1 + r)^{−T}K. (b) Prove that the price of the corresponding call option admits the bounds max(0, S(0) − (1 + r)^{−T}K) ≤ π(C) ≤ S(0).
18.4. We know that a financial market is arbitrage-free, the interest rate equals r at any period, T is the expiration date, and K is the strike price of all the options mentioned below.
(1) (a) Prove that under the conditions for absence of arbitrage, the put–call parity holds between arbitrage-free prices of call and put options: S(0) + π (P) − π (C) = K(1 + r)−T . (b) Prove the following generalization of the put–call parity to any intermediate moment: S(t) + P(t) −C(t) = K(1 + r)−T +t , where P(t) and C(t) are arbitrage-free prices of put and call options at moment t, respectively. (2) Prove that the selling one asset, selling one put option, and buying one call option yields positive profit with vanishing risk (the arbitrage) under the assumption S(0) + π (P) − π (C) > K(1 + r)−T . (3) Prove that the buying one asset, buying one put option, and selling one call option provides the arbitrage under the assumption S + π (P) − π (C) < K(1 + r)−T . 18.5. Let the price of an asset (e.g., stock) at moment t equal S(t). All the options under consideration are supposed to have the expiration date T and the strike price K, unless otherwise specified. The interest rate equals r at any period between buying and exercising options. Calculate the capital at moment T of the investor whose activity at moment t can be described as follows. (a) She has one call option and one put option. (b) She has one call option with strike price K1 and sells one put option with strike price K2 . (c) She has two call options and sells one asset. (d) She has one asset and sells one call option. 18.6. (Law of one price) Let the financial market be arbitrage-free, C be an attainable contingent claim, and ξ = {ξ (t),t ∈ T} be any replicating portfolio for C. Prove that the initial capital V (0) = (ξ (1), S(0)) is the same for any such portfolio. 18.7. (Binomial model, or Cox–Ross–Rubinstein model) Assume that there is one riskless asset (a bond) {Bn = (1 + r)n , 0 ≤ n ≤ N} with the interest rate r > −1 and one risky asset (a stock) {Sn , 0 ≤ n ≤ N} within the financial market. The price Sn can be calculated as follows. S0 > 0 is a given value, Sn+1 is equal either to Sn (1 + a) or Sn (1 + b), where −1 < a < b. Hence, Ω = {1 + a, 1 + b}N . We put F0 = {∅, Ω } and Fn = σ {S1 , . . . , Sn }, 1 ≤ n ≤ N. Assume that every element of Ω has positive probability. Let Rn = Sn /Sn−1 , 1 ≤ n ≤ N. If {y1 , . . . , yn } is some element of Ω then P ({y1 , . . . , yn }) = P(R1 = y1 , . . . , Rn = yn ). (1) Show that Fn = σ {R1 , . . . , Rn }, 1 ≤ n ≤ N. (2) Show that the discounted stock price Xn := Sn /(1 + r)n is a P∗ -martingale if and only if EP∗ (Rn+1 /Fn ) = 1 + r, 0 ≤ n ≤ N − 1. (3) Prove that the condition r ∈ (a, b) is necessary for the market to be arbitragefree. (4) Prove that under the condition r ∈ (a, b) a random sequence {Xn , Fn , 0 ≤ n ≤ N} is a P∗ -martingale if and only if random variables R1 , . . . , Rn are mutually independent and identically distributed and P∗ (R1 = 1 + b) = (r − a)/(b − a) =: p∗ . Show that the market is complete in this case.
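An editorial sketch (not part of the original problem set) of the risk-neutral probability in Problem 18.7(4): with p* = (r − a)/(b − a) the one-step expected return under P* equals 1 + r, which is exactly the martingale property of the discounted price; no arbitrage requires r ∈ (a, b).

```python
def crr_martingale_check(r, a, b):
    """In the Cox-Ross-Rubinstein model of Problem 18.7 the probability
    p* = (r - a)/(b - a) of the up move 1 + b gives E*[R_{n+1}] = 1 + r."""
    assert a < r < b, "no-arbitrage requires r in (a, b)"
    p_star = (r - a) / (b - a)
    expected_return = p_star * (1 + b) + (1 - p_star) * (1 + a)
    return p_star, expected_return, 1 + r

print(crr_martingale_check(r=0.05, a=-0.1, b=0.2))   # (0.5, 1.05, 1.05)
```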
18.8. In the framework of an arbitrage-free and complete binomial model consider a discounted derivative security of the form H = f (X0 , . . . , XN ), where f : RN+1 → R+ is a measurable function. (1) Prove that H is integrable with respect to the martingale measure P∗ . (2) Prove that the capital Vn of any hedging strategy for H can be presented in the form Vn = EP∗ (H/Fn ) = vn (X0 , X1 (ω ) . . . , Xn (ω )), where the function vn (x0 , . . . , xn ) = EP∗ f x0 , . . . , xn , xn (X1 /X0 ), . . . , xn ((XN−n )/(X0 )) . (3) Prove that a self-financing strategy ξ = (ξ 0 , ξ ), which is a replicating strategy for H, has the form ξn (ω ) = Δn (X0 , X1 (ω ), . . . , Xn−1 (ω )), where
Δ_n(x_0, x_1, . . . , x_{n−1}) = ( v_n(x_0, . . . , x_{n−1}, x_{n−1} b̂) − v_n(x_0, . . . , x_{n−1}, x_{n−1} â) ) / ( x_{n−1}(b̂ − â) ), â = (1 + a)/(1 + r), b̂ = (1 + b)/(1 + r),

and ξ_1^0(ω) = E_{P*}(H) − ξ_1(ω)X_0, ξ_{n+1}^0(ω) − ξ_n^0(ω) = −(ξ_{n+1}(ω) − ξ_n(ω))X_n.
(4) Let H = f(X_N). Prove that in this case the functions v_n(x_n) can be presented in the form v_n(x_n) = ∑_{k=0}^{N−n} f(x_n â^{N−n−k} b̂^k) C_{N−n}^k (p*)^k (1 − p*)^{N−n−k}; in particular, the unique arbitrage-free price of the contingent claim H can be presented in the form π(H) = v_0(X_0) = ∑_{k=0}^N f(X_0 â^{N−k} b̂^k) C_N^k (p*)^k (1 − p*)^{N−k}.
(5) Denote by C_n the price at moment n of a European call option with the expiration date N and the strike price K. Prove that under the conditions of item (4) of Problem 18.7 it holds that C_n = c(n, S_n), where

c(n, x)(1 + r)^{n−N} = E_{P*}( x ∏_{i=n+1}^N R_i − K )^+ = ∑_{j=0}^{N−n} ( (N − n)! / ((N − n − j)! j!) ) (p*)^j (1 − p*)^{N−n−j} ( x(1 + a)^{N−n−j}(1 + b)^j − K )^+.
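The recursion behind items (2) and (4) of Problem 18.8 can be run numerically; the following editorial sketch (parameter values are arbitrary) prices a European claim f(S_N) by backward induction in the CRR model.

```python
def crr_price(f, S0, r, a, b, N):
    """Backward induction for a European claim with payoff f(S_N) in the
    Cox-Ross-Rubinstein model: v_N = f, and one step back
    v_n = (p* v_{n+1}(up) + (1 - p*) v_{n+1}(down)) / (1 + r)."""
    p = (r - a) / (b - a)
    # terminal stock prices S0 (1+a)^(N-k) (1+b)^k, k = number of up moves
    values = [f(S0 * (1 + a) ** (N - k) * (1 + b) ** k) for k in range(N + 1)]
    for n in range(N, 0, -1):
        values = [(p * values[k + 1] + (1 - p) * values[k]) / (1 + r) for k in range(n)]
    return values[0]

# European call, strike 100, in a 4-period CRR market
print(crr_price(lambda s: max(s - 100.0, 0.0), S0=100.0, r=0.02, a=-0.1, b=0.15, N=4))
```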
18.9. (Trinomial model) Let a financial market consist of one risky asset (stock) {Sn , 0 ≤ n ≤ N} and one riskless asset (bond) {Bn = (1 + r)n , 0 ≤ n ≤ N} with the interest rate r > −1. The price of Sn is defined as follows. S0 > 0 is a fixed number, and Sn+1 equals either Sn (1 + a), or Sn (1 + b), or Sn (1 + c), where −1 < a < b < c. Therefore, Ω = {1 + a, 1 + b, 1 + c}N ; that is, any element of Ω can be presented as ω = {y1 , . . . , yN }, where yn = 1 + a, or 1 + b, or 1 + c. Similarly to the binomial model, put F0 = {∅, Ω } and Fn = σ {S1 , . . . , Sn }, 1 ≤ n ≤ N. Suppose that any element of Ω has positive probability. (1) How many equivalent martingale measures exist in this model? Is this model arbitrage-free? complete? (2) If we consider two martingale measures, do they lead to the same price of attainable payoff; that is, does the law of one price hold in the trinomial model? 18.10. Denote by C(t) and Y (t), t ∈ T the price which will pay the buyer of a European and American option, respectively, if he buys them at moment t. It is supposed that the options are derivatives at the same asset. Prove that Y (t) ≥ C(t).
18.11. Consider an American option with payoffs {Y(t), t ∈ T}. Let a European option at moment T have the same payoff Y(T). Prove the following statement. If C(t) ≥ Y(t) for any t ∈ T and any ω ∈ Ω, then C(t) = Y(t) and the optimal strategy for the buyer is to wait until moment T and then exercise the option.
18.12. (Hedging of American option) Assume an American option can be exercised at any moment n = 0, 1, . . . , N, and let {S_n, F_n, 0 ≤ n ≤ N} be a stochastic process that is equal to the profit, provided the option is exercised at moment n. Denote by {S_n^0, 0 ≤ n ≤ N} the price of a risk-free asset (the discounting factor), which is supposed to be nonrandom, and let X_n = S_n/S_n^0. Denote also by {Y_n, 0 ≤ n ≤ N} the price (value) of the option at moment n. The market is supposed to be arbitrage-free and complete, and let P* be the unique martingale measure. (1) Prove that Y_N = S_N. (2) Prove that for any 0 ≤ n ≤ N − 1 the equality holds

Y_n = max( S_n, S_n^0 E_{P*}( (Y_{n+1}/S_{n+1}^0) / F_n ) ).
(3) Prove that in the case Sn0 = (1 + r)n , the price of the American option can be presented as 1 EP∗ (Yn+1 /Fn )). Yn = max(Sn , 1+r (4) Prove that the discounted price of the American option Zn := Yn /Sn0 is a P∗ supermartingale; moreover it is the smallest P∗ -supermartingale dominating the sequence {Xn , 0 ≤ n ≤ N}. Prove that Zn is a Snell envelope of the sequence {Xn }. (5) Prove that the following equalities hold: Zn = supτ ∈Tn,N EP∗ (Xτ /Fn ), 0 ≤ n ≤ N. (See Problems 7.22 and 15.20 for the corresponding definitions.) 18.13. (Price of an American put option in the context of the Cox–Ross–Rubinstein model) Let a financial market consist of one stock {Sn , 0 ≤ n ≤ N}, with the price defined by the Cox–Ross–Rubinstein model and one bond with the interest rate r > −1 (see Problem 18.7). (1) Prove that at moment n with 0 ≤ n ≤ N the price Pn of the American put option with the maturity date N and the strike price K equals Pn = P(n, Sn ), where P(n, x) can be found as follows. P(N, x) = (K − x)+ , and for 0 ≤ n ≤ N − 1 it holds P(n, x) = max((K − x)+ , (( f (n + 1, x))/(1 + r))), where f (n + 1, x) = pP (n + 1, x(1 + a)) + (1 − p)P (n + 1, x(1 + b)) , with p = (b − r)/(b − a). (2) Prove that P(0, x) = supτ ∈T0,N EP∗ ((1 + r)−τ (K − xVτ )+ ) , where the sequence {Vn , 0 ≤ n ≤ N} is given by V0 = 1,Vn = ∏ni=1 Ui , 1 ≤ n ≤ N, and Ui are some random variables. Determine their simultaneous distribution with respect to the measure P∗ . (3) Use item (2) and prove that the function P(0, x) : R+ → R+ is convex. (4) Let a < 0. Prove that there exists a real number x∗ ∈ [0, K] such that for x ≤ x∗ it holds P(0, x) > (K − x)+ . (5) Let the owner have an American put option at moment t = 0. What are the values of S0 for which it is the most profitable to exercise this option at the same moment?
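An editorial numerical sketch of the backward induction from Problems 18.12 and 18.13 for an American put; the recursion below uses the probability p* of the up move 1 + b, that is, 1 − p in the notation of Problem 18.13, and the parameter values are arbitrary.

```python
def crr_american_put(K, S0, r, a, b, N):
    """American put value via P(N, x) = (K - x)^+ and
    P(n, x) = max((K - x)^+, [p* P(n+1, x(1+b)) + (1 - p*) P(n+1, x(1+a))]/(1 + r))."""
    p = (r - a) / (b - a)                      # risk-neutral probability of the up move 1+b
    prices = [S0 * (1 + a) ** (N - k) * (1 + b) ** k for k in range(N + 1)]
    values = [max(K - s, 0.0) for s in prices]
    for n in range(N, 0, -1):
        prices = [s / (1 + b) for s in prices[1:]]           # stock prices at time n-1
        cont = [(p * values[k + 1] + (1 - p) * values[k]) / (1 + r) for k in range(n)]
        values = [max(max(K - s, 0.0), c) for s, c in zip(prices, cont)]
    return values[0]

print(crr_american_put(K=100.0, S0=100.0, r=0.02, a=-0.1, b=0.15, N=4))
```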
18.14. An American option is called attainable if for any stopping time 0 ≤ τ ≤ T there exists a self-financing strategy such that the corresponding capital V satisfies the equality V (τ ) = Y (τ ). Let an American option be attainable, and the corresponding discounted stochastic process H(t) := Y (t)/S0 (t) be a submartingale with respect to some martingale measure P∗ . Prove that the optimal stopping time τ coincides with the exercise date (i.e., τ = T ), and the price of this American option coincides with the price of a European option, C(T ) = Y (T ). 18.15. Consider an American call option that can be exercised at any moment t ∈ T, and in the case where it is exercised at moment t ∈ T, the strike price equals K(1 + q)t , where q is a fixed number; that is, the payoff at moment t equals (S(t) − K(1 + q)t )+ . For q ≤ r, where r is the interest rate, prove that the option will not be exercised before moment t = T . 18.16. Let a risky asset at moment T have the price S(T ), where S(T ) is a r.v. with the distribution determined by the measure P. At moment T, let the option on this asset have the price C(T ). Consider a portfolio consisting of ξ0 units of a riskless asset and ξ units of a risky asset, and the portfolio is constant during all trading periods; such a strategy is called “buy and hold”; it is not obligatory self-financing. We suppose that the interest rate is zero, and let the initial capital equal V (0). (1) Prove that the additional costs that must be invested by the owner of this portfolio with the purpose to be able at moment T to exercise the contingent claim C(T ), can be calculated by the formula D := C(T ) −V (0) − ξ (S(T ) − S(0)). (2) In terms of ES(T ), EC(T ), DS(T ), and cov(S(T ),C(T )), determine those values of V (0) and ξ that minimize E(D2 ), and prove that under these values of V (0) and ξ it holds ED = 0. (3) Prove that in the case of a complete market the option C(T ) linearly depends on S(T ) − S(0), and that V (0) and ξ can be chosen in such a way that D = 0.
Hints 18.7. (1) Express {S1 , . . . , Sn } via {R1 , . . . , Rn } and vice versa. (2) Write the equation EQ (Xn+1 /Fn ) = Xn in the equivalent form EQ (((Xn+1 )/(Xn ))/ Fn ) = 1. (3) Let the market be arbitrage-free. Then there exists a measure P∗ ∼ P such that Xn is a P∗ -martingale. Furthermore, use the statement of item (2). (4) If Rn are mutually independent and P∗ (Ri = 1 + b) = p∗ then EP∗ (Rn+1 /Fn ) = 1+r. Check this and use the statement (2). And conversely, let EP∗ (Rn+1 /Fn ) = 1+r. Derive from here that P∗ (Rn+1 = 1 + b/Fn ) = p∗ and P(Rn = 1 + a/Fn ) = 1 − p∗ . Prove by induction that P∗ (R1 = x1 , . . . , Rn = xn ) = ∏ni=1 pi , where pi = p∗ , if xi = 1 + b, and pi = 1 − p∗ , if xi = 1 + a. Note that the P∗ -martingale property of Xn uniquely determines the distribution (R1 , . . . , RN ) with respect to the measure P∗ , so, uniquely determines the measure P∗ itself. That is why the market is complete. 18.8. (1) Integrability of H is evident, because all the random variables take only a finite number of values.
(2) To prove the equality Vn = EP∗ (H/Fn ) it is necessary to prove at first, using backward induction, that all the values Vn of the capital are nonnegative. Then, with the help of the formula for the capital of a self-financing strategy, it is possible to prove that Vn is a martingale with respect to the measure P∗ . To prove the second inequality you can write Xk , n ≤ k ≤ N in the form Xk = Xn (Xk /Xn ) and use the fact that the r.v. Xk /Xn is independent of Fn and has the same distribution as (Xk−n )/X0 . (3) Write the equality ξn (ω )(Xn (ω )−Xn−1 (ω )) = Vn (ω )−Vn−1 (ω ), in which ξn (ω ), Xn−1 (ω ), Vn−1 (ω ) depend only on the first n − 1 components of the vector ω . Denote ω a := (y1 , . . . , yn−1 , 1 + a, yn−1 , . . . , yN ), ω b := (y1 , . . . , yn−1 , 1 + b, yn−1 , . . . , yN ) and obtain the equalities ξn (ω )(Xn−1 (ω )bˆ − Xn−1 (ω )) = Vn (ω b ) − Vn−1 (ω ) and ξn (ω )(Xn−1 (ω )aˆ − Xn−1 (ω )) = Vn (ω a ) − Vn−1 (ω ). Derive from here the formulae for ξn (ω ). The formulae for ξn0 (ω ) can be derived from a definition of the selffinancing strategy. (4) Use item (2). 18.12. (1) Apply backward induction starting at moment N. Use the fact that at moment N it is necessary to pay the price for an option that equals his benefit at this moment (i.e., YN = SN ), and at moment N − 1 it is necessary to have the capital that is sufficient for buying at that moment (i.e., SN−1 ), and also have the sum that is sufficient to buy it at moment N, but the cost ofthis sum at moment N − 1 equals 0 0 EP∗ (XN /FN−1 ) = SN−1 EP∗ (SN /SN0 )/FN−1 . SN−1 Statements (2)–(5) follow from item (1) if using Problems 15.21–15.23. 18.13. Apply Problem 18.12. 18.15. Apply Problem 18.11.
Answers and Solutions 18.1. (1) This statement is evident. (2) If π (C) > S(0) then at moment t = 0 the seller of the call option sells it and buys the stock at the price S(0). Therefore, at moment t = T he can pay for the claim concerning the call option, for any market scenario and any strike price. In this case he will obtain a guaranteed profit π (C) − S(0) > 0. Next, if π (P) > K then the seller of the put option sells it at the price π (P) and at moment t = T has a possibility to buy the stock for the stock price K < π (P) in the case where the option will be exercised. In this case he will obtain a guaranteed profit π (P) − K > 0. 18.2. (1) Nonarbitrage price of such an option equals π (C) = S(0) − (K/((1 + r)T )). (2) S(0). 18.3. (1)(a) Suppose that π (P) < K(1 + r)−T − S(0); that is, (π (P) + S(0))(1 + r)T < K. At moment t = 0 it is possible to borrow the sum π (P) + S(0), buy the stock at the price S(0), and buy the option with the strike price K at the price π (P). At moment T we sell the stock at the price K (or even at a higher price, if its market price exceeds K) and return (π (P) + S(0))(1 + r)T as a repayment for the borrowed sum. So, we will have a guaranteed profit not less than K − (π (P) + S(0))(1 + r)T > 0. (b) Suppose that π (C) < S(0) − K(1 + r)−T ; that is, (S(0) − π (C))(1 + r)T > K. At moment t = 0 it is possible to make a short sale of the stock (short sale of a
stock is an immediate sale without real ownership) at the price S(0) and buy the option with the strike price K at the price π (C). At moment T we have the sum (S(0) − π (C))(1 + r)T , so we can buy the stock at the price K (or even at a lower price if its market price is lower than K) and return the borrowed sum. So, we obtain a guaranteed profit not less than (S(0) − π (C))(1 + r)T − K > 0. 18.4. (1) Both relations of the put–call parity are direct consequences of a definition of the martingale measure and the evident equality C(T ) − P(T ) = S(T ) − K. (2) At moment t = 0 we act as proposed, obtain the sum S(0) + π (P) − π (C), which we put into a bank account and obtain at moment t = T the sum (S(0) + π (P) − π (C))(1 + r)T . If S(T ) ≥ K we use the call option, buy the stock at the price K, and return the borrowed sum. Possible exercising of the put option will not lead to losses. If S(T ) < K, then after exercising the put option we buy the stock at the price K, and return our debt with the help of this stock. We do not exercise the call option. In both cases we have a profit not less than (S(0) + π (P) − π (C))(1 + r)T − K > 0. (3) At moment t = 0 we borrow the sum S + π (P) − π (C) and act as mentioned in the problem situation. If S(T ) < K then we use the put option, sell the stock at the price K, and return the borrowed sum which size is now (S(0) + π (P) − π (C))(1 + r)T . Possible exercising of a call option will not lead to losses. If S(T ) ≥ K then the call option will be exercised, we will sell the stock at the price K, will not exercise the call option, and also can return the borrowed sum which size is now (S(0) + π (P) − π (C))(1 + r)T . In both cases we have a profit at least K − (S(0) + π (P) − π (C))(1 + r)T > 0. 18.5. (a) ((S(T ) − K)+ + (K − S(T ))+ = S(T ) − K. (b) (S(T ) − K1 )+ + π (P)(1 + r)T −t − (K2 − S(T ))+ . (c) 2(S(T ) − K)+ + S(t)(1 + r)T −t . (d) S(T )1IS(T )
b−r c − r − p1 (c − a) c − r − p1 (b − a) c−r , ), with < p1 < . c−b c−b b−a c−a
The market will be arbitrage-free and incomplete. (2) The law of one price holds. 18.10. The solution is based on the fact that it is possible to postpone exercising the American option until moment T . 18.11. If the buyer requires the American option and C(t) ≥ Y (t), it is not reasonable to exercise it at moment t < T and get the sum Y (t), because it is possible at that moment to guarantee instead the payoff C(t). For instance, it is possible to sell the European option or take a short position with respect to the portfolio that hedges such an option. So, it is necessary to wait until moment T , but it means that the prices of both options coincide. 18.14. If Y (t)/S0 (t) is a submartingale, then by Theorem 7.5 (a version of Doob’s optional sampling theorem for continuous time), EP∗ (Y (τ )/S0 (τ )) ≤ EP∗ (Y (T )/S0 (T )), therefore, max EP∗ [Y (τ )/S0 (τ )) = EP∗ (Y (T )/S0 (T )), 0≤τ ≤T
but this is the price of the corresponding European option that is exercised at moment τ = T . Because the option is attainable, there exists a strategy that hedges this option; that is, the strategy where capital at moment T equals the value of the option.
19 Stochastic processes in financial mathematics (continuous time)
Theoretical grounds Let t ∈ T, where T = R+ or T = [0, T ]. Also, let us have a filtration {Ft , t ∈ T}. Consider a financial market with one risk-free asset (bond) B(t) and one risk asset (stock) S(t), adapted to this filtration. Definition 19.1. A couple of stochastic processes {ϕ (t)} and {ψ (t)}, where {ϕ (t)} is a number of bonds and {ψ (t)} is a number of stocks, is called a portfolio. The process ϕ is supposed to be F-adapted, and the process ψ is supposed to be Fpredictable; that is, ψ is adapted to the information that comes strictly before the moment t (an exact definition of the predictable process with continuous time parameter is contained, e.g., in the book [84]; all the processes with continuous or continuous from the left trajectories are predictable). The investor’s capital, that corresponds to the portfolio mentioned above, equals V (t) = ψ (t)S(t) + ϕ (t)B(t). Definition 19.2. A portfolio (ϕ , ψ ) is called self-financing if dV (t) = ψ (t) dS(t) + ϕ (t) dB(t) (it means that the changes of capital occur only as a consequence of changes in bond and stock prices without any external entry or departure of the capital; also, we suppose that the stochastic process which describes the stock price admits a stochastic differential). Let T = [0, T ] and X be an FT -adapted r.v. (the contingent claim). A selffinancing portfolio is the replicating strategy for X if the following equality holds, V (T ) = ψ (T )S(T ) + ϕ (T )B(T ) = X. A probability measure P∗ ∼ P is called the martingale measure if the corresponding discounted process {B−1 (t)S(t)} is a P∗ martingale. The existence of the measure P∗ is equivalent to the arbitrage-free property of the market, and the uniqueness of such a measure is equivalent to the market completeness. In turn, the completeness of the market means that any FT -measurable integrable contingent claim X is attainable; that is, there exists a replicating portfolio for such a claim. D. Gusak et al., Theory of Stochastic Processes, Problem Books in Mathematics, 315 c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-87862-1 19,
316
19 Stochastic processes in financial mathematics (continuous time)
Let W be a Wiener process that is adapted to the filtration F, and a financial market consist of two assets with prices described by the formulas B(t) = exp{rt} (bond price) and S(t) = S0 exp{(μ − σ 2 /2)t + σ W (t)} (stock price; this model is called geometrical Brownian motion), t ≥ 0. Such a model is called a (B, S)-model, or Black–Scholes model. Consider a European call option with the strike price K, the exercise date T , and with the payoff of the form C(S(T ), T ) = (S(T ) − K)+ . Denote by C(S,t) the arbitrage-free (fair) price of the option at moment t under the condition that the stock price equals S. (Here C(S,t) is called the price function, and π (t) = C(S(t),t) is called the price process.) Then the function C(S,t) satisfies Black–Scholes equation
∂ C 1 2 2 ∂ 2C ∂C + σ S − rC = 0 + rS ∂t 2 ∂ S2 ∂S with boundary conditions C(0,t) = 0, C(S, T ) = (S − K)+ . The solution to this equation has a form C(S,t) = SΦ (d1t ) − Ke−r(T −t) Φ (d2t ) (Black–Scholes formula), where Φ (x) =
x
√ 2π is standard Gaussian distribution function, and
−(u2 /2) du/ −∞ e
d1t =
log KS + r + 12 σ 2 (T − t) t log KS + r − 12 σ 2 (T − t) √ √ , d2 = . σ T −t σ T −t
In particular, for t = 0 we obtain the Black–Scholes formula for the arbitragefree (fair) price of the call option at initial moment:
π (C(S(T ), T )) := C(S, 0) = SΦ (d10 ) − Ke−rT Φ (d20 ). How can the Black–Scholes equation be deduced? If the investor’s portfolio consists of one call option and of (−Δ ) stocks, then its cost equals π = C − Δ · S, and the change of this cost equals d π = dC − Δ · dS, but from reasoning based on the absence of arbitrage, it must be equal to d π = rπ dt, whence 1 ∂ C 1 2 2 ∂ 2C + σ S , π= r ∂t 2 ∂ S2 and we can easily deduce the Black–Scholes equation. The values Δ = ∂ C(S, t)/∂ S, γ = ∂ 2C(S, t)/∂ S2 , θ = −∂ C(S, t)/∂ t, ρ = ∂ C(S, t)/∂ r, and V = ∂ C(S, t)/∂ σ are called Delta, Gamma, Theta, Rho, and Vega options, respectively, and as a whole they are called Greeks, although “vega” does not correspond to a letter of the Greek alphabet. The prices of put and call options, C(S,t) and P(S,t), correspondingly, satisfy the following put–call parity relation, C(S,t) − P(S,t) = S − Ke−r(T −t) . Consider a stock, on which some dividends are paid. Suppose that they are paid with a constant rate that is proportional to the stock price, so that the owner of a(t) units of this stock, t ∈ [α , β ], obtains the total quantity of dividends that is equal to
19 Stochastic processes in financial mathematics (continuous time)
317
β
rD
α
a(t)S(t)dt.
The coefficient rD is called the dividend yield. If C(S,t) denotes the price of the call option on the stock with dividend payments, then C(S,t) is a solution to the modified Black–Scholes equation ∂ C 1 2 2 ∂ 2C ∂C + σ S − rC = 0. + (r − rD )S ∂t 2 ∂S ∂ S2
Bibliography [55], Chapter IV; [84], Volume 2, Chapters VII and VIII; [21], Chapters IV-VII; [46], Chapters 11–13; [54]; [61], Chapter X; [68], Chapter 13; [85], Chapters 10–14.
Problems 19.1. Suppose that the price of a discounted asset at moment T has lognormal distribution with parameters a and σ 2 ; that is, log((S(T ))/(S(0))) has normal distribution with parameters mentioned above. Calculate ES(T ). 19.2. In the framework of an arbitrage-free market and the Black–Scholes model consider two European call options with the same strike price and on the same underlying asset. Is it true that the option with a longer time to maturity has a larger arbitrage-free price? What can you say in this connection concerning a European put option? 19.3. Let a market be arbitrage-free, dividends not be paid, and C = C(S,t) be the price of a European call option at moment t under the condition that the stock price equals S. Prove the following inequalities. (a) C(S, t) ≤ S. (b) C(S, t) ≥ S − Ke−r(T −t) . (c) If we consider two call options with the same exercise date and on the same stock but with different strike prices K1 and K2 , then 0 ≤ C(S, t; K1 ) −C(S, t; K2 ) ≤ K2 − K1 . (d) If we consider two call options with the same strike price and on the same stock but with different exercise dates T1 < T2 , then C(S, t; T1 ) ≤ C(S, t; T2 ). 19.4. Prove that the functions C(S, t) = AS and C(S, t) = Aert , where A is an arbitrary constant, are solutions to the Black–Scholes equation. 19.5. Find on your own the solution to the Black–Scholes equation using the following steps. (1) Change the variables S = Kex , t = T − 2τ /σ 2 , and C = Kv(x, τ ), and reduce the Black–Scholes equation to the equation of the form
318
19 Stochastic processes in financial mathematics (continuous time)
∂ v ∂ 2v ∂v = 2 + (k − 1) − kv, ∂τ ∂x ∂x where k = 2r/σ 2 . A new boundary condition is v(x, 0) = (ex − 1)+ . (2) Put v(x, τ ) = eα x+β τ u(x, τ ) with unknown coefficients α and β . Get an equation for v, put β = α 2 + (k − 1)α − k, 2α + (k − 1) = 0 in it, and reduce the equation for u to the form ∂ u/∂ τ = ∂ 2 u/∂ x2 , x ∈ R, τ > 0 (the heat equation, or the diffusion equation), where u(x, 0) = u0 (x) = max{e(k+1)x/2 − e(k−1)x/2 , 0}, v(x, τ ) = exp{−(k − 1)x/2 − (k + 1)2 τ /4}u(x, τ ). Check that the function 1 u(x, τ ) = √ 2 πτ
R
u0 (y)e−(((x−y)
2 )/4τ )
dy
is the unique solution to the diffusion equation (Cauchy problem) with initial condition u(x, 0) = u0 (x). (3) Obtain the Black–Scholes formula by the inverse change of variables. 19.6. In the framework of the Black–Scholes model consider the discounted capital V (t) = ψ (t)Z(t) + ϕ (t), where Z(t) = B−1 (t)S(t). Prove that the portfolio (ϕ , ψ ) is self-financing if and only if dV (t) = ψ (t)dZ(t). 19.7. Let a market consist of one stock S(t) = S(0) exp{σ W (t) + μ t} and one bond B(t) = exp{rt}. Also, suppose that a filtration F is natural, that is, generated by a Wiener process {W (t),t ∈ [0, T ]}. Prove that this market is arbitrage-free, and for any nonnegative FT -measurable claim X such that EX 2+α < ∞ for some α > 0, there exists a replicating portfolio (ϕ , ψ ). Also, prove that the arbitrage-free price of X at moment t equals
π (X)(t) = B(t) EP∗ (B−1 (T )X|Ft ) = e−r(T −t) EP∗ (X|Ft ), where P∗ is the equivalent martingale measure with respect to which the discounted process {B−1 (t)S(t)} is a martingale. 19.8. Let a stochastic process X be the nominal return, dX(t) = X(t)(α dt + σ dW (t)), (t)), and a stochastic process Y describe the inflation, dY (t) = Y (t)(γ dt + δ dW is a Wiener process, independent of W . We suppose that the coefficients where W α , σ , γ , and δ are constant. Derive the SDE for real return Z(t) := X(t)/Y (t). 19.9. In the framework of the Black–Scholes model consider a European contingent claim of the form
19 Stochastic processes in financial mathematics (continuous time) ⎧ ⎪ if S(T ) ≤ A, ⎨K, H = K + A − S(T ), if A < S(T ) < K + A, ⎪ ⎩ 0, if S(T ) > K + A.
319
The expiration date of H is supposed to equal T . Define a portfolio consisting of bonds, stocks, and a European call option, that will be constant in time and replicates the claim H. Define the arbitrage-free price of H. 19.10. (1) Using Problem 19.5, choose such a change of variables that permits the reduction of the equation
∂ u ∂ 2u ∂u = 2 + a + bu, a, b ∈ R ∂t ∂x ∂x to the diffusion one. (2) Choose such a change of time that permits the reduction of the equation c(t)
∂ u ∂ 2u = 2 , c(t) > 0, t > 0 ∂t ∂x
to the diffusion one. (3) Suppose that σ 2 (·) and r(·) in the Black–Scholes equation are the functions of t, however, (r(t))/(σ 2 (t)) does not depend on t. Rewrite the Black–Scholes formula for this case. 19.11. Suppose that in the Black–Scholes equation the functions r(·) and σ 2 (·) are known nonrandom functions of t. Prove that the following steps reduce the Black– Scholes equation to the diffusion one. (1) Put S = Kex , C = Kv, and t = T − t , and obtain the equation
∂v 1 2 ∂ 2v 1 2 ∂v − r(t )v. = σ (t ) + r(t ) − σ (t ) ∂ t 2 ∂ x2 2 ∂x (2) Change the time variable as τ-(t ) =
t 1 2 0 2 σ (s)ds and obtain the equation
∂ v ∂ 2v ∂v = 2 + a(τ-) − b(τ-)v, ∂τ ∂x ∂x where a(τ-) = 2r/σ 2 − 1, b(τ-) = 2r/σ 2 . (3) Prove that the general solution to the first-order partial differential equation of the form ∂v ∂v = a(τ-) − b(τ-)v ∂ τ∂x can be presented as v(x, τ-) = F(x+A(τ-))e−B(-τ ) , where dA(τ-)/d τ- = a(τ-), dB(τ-)/d τ- = b(τ-), and F(·) is an arbitrary function. (4) Prove that the solution to the second-order partial differential equation from item (2) has a form x, τ-), v(x, τ-) = e−B(-τ )V (-
320
19 Stochastic processes in financial mathematics (continuous time)
where x- = x + A(τ-), A(τ-) B(τ-) are the functions of τ-, taken from item (3), and V is a solution to the diffusion equation ∂ V /d τ- = ∂ 2V /∂ x2 . (5) Transform the initial data correspondingly to the change of variables. 19.12. Let C(S, t) and P(S, t) be the prices at moment t of a European call and put option, correspondingly, with the same strike price and exercise date. (1) Prove that both P and C − P satisfy the Black–Scholes equation; moreover, the boundary condition for C − P is extremely simple: C(S, T ) − P(S, T ) = S − K. (2) Deduce from the put–call parity that S − Ke−r(T −t) is a solution to the Black– Scholes equation with the same boundary condition. 19.13. Use the exact solution to the diffusion equation to find the Black–Scholes price P(S, t) of a put option P(S, T ) = (K − S)+ without using the put–call parity. 19.14. (1) Prove that in the case where the initial condition of the boundary value problem for the heat equation is positive, then u(x, τ ) > 0 for any τ > 0. (2) Deduce from here that for any option with positive payoff its price is also positive, if it satisfies the Black–Scholes equation. 19.15. (1) In the framework of the Black–Scholes model find the arbitrage-free option price with the payoff f (S(T )), where the function f ∈ C(R) and increases at infinity not faster than a polynomial. (2) Find the arbitrage-free option price with the payoff of the form BH (K − S(T )), where H (s) = 1Is≥0 is a Heaviside function and B > 0 is some constant (the option “cash-or-nothing”). (3) The European digital call option of the kind asset-or-nothing has the payoff S(T ) in the case S(T ) > K, and zero payoff in the case S(T ) ≤ K. Find its price. 19.16. What is a probability that a European call option will expire in-the-money? 19.17. Calculate the price of a European call option on an asset with dividends, if the dividend yield equals rD on the interval [0, T ]. 19.18. What is the put–call parity relation for options on the asset with dividends? 19.19. What is the delta for the call option with the continuous and constant dividend yield rD ? 19.20. Calculate Δ , Γ , θ , ρ , and V for put and call options. 19.21. On the Black–Scholes market a company issued an asset “Golden logarithm” (briefly GLO). The owner of GLO(T ) with the expiration date T receives at moment T the sum log S(T ) (in the case S(T ) < 1 the owner pays the corresponding sum to the company). Define the price process for GLO(T ). 19.22. Let the functions r(x) and u(x) denote the interest rate and the process (flow) of the cash receipt, correspondingly, under the condition that the initial value of a risk asset X(0) = x, where X is a time-homogeneous diffusion process with the drift
19 Stochastic processes in financial mathematics (continuous time)
321
coefficient μ = μ (x) and diffusion σ = σ (x); μ and σ are continuous functions, and σ (x) = 0 for all x ∈ R. Suppose that u is bounded and continuous. (1) Write the stochastic differential of the process X. (2) Check that the function t u(t, x) := E e− 0 r(X(s))ds u(X(t))X(0) = x , t ∈ R+ can be considered as the expected discounted cash flow at moment t under the condition that X(0) = x. (3) Find a partial derivative equation for which the function u(t, x) is a solution. 19.23. (When is the right time to sell the stocks?) Suppose that the stock price {S(t),t ∈ R+ } is a diffusion process of the form dS(t) = rS(t)dt + σ S(t)dW (t), S(0) = x > 0 (for the explicit form of S(t) see Problem 14.3). Here W is a onedimensional Wiener process, r > 0, and σ = 0. Suppose that there is a fixed transaction cost a > 0, connected to the sale of the asset. Then, regarding inflation, the discounted asset price at moment t equals e−ρ t (S(t) − a). Find the optimal stopping time τ0 , for which Es,x e−ρτ0 (S(τ0 ) − a) = sup Es,x e−ρτ (S(τ ) − a) = sup Es,x g(τ , S(τ )), τ
τ
where g(t, y) = e−ρ t (y − a). 19.24. (Vasicek stochastic model of interest rate) According to the Vasicek model, the interest rate r(·) satisfies a SDE, dr(t) = (b − ar(t))dt + σ dW (t), where W is a Wiener process. (1) Find an explicit form of r(·). (2) Find the limit distribution of r(t) as t → ∞. 19.25. (Cox–Ingersoll–Ross stochastic model of interest rate) According to the Cox– Ingersoll–Ross model, the interest rate r(·) satisfies a SDE ( dr(t) = (α − β r(t))dt + σ r(t)dW (t), where W is a Wiener process, α > 0, and β > 0. The process {r(t)} is also called the square of the Bessel process. (Concerning the existence and uniqueness of the strong solution to this equation see ( Problem 14.16.) (1) Define the SDE for { r(t)} in the case α = 0. (2) Suppose that a nonrandom function u(·) satisfies the ordinary differential equation u (t) = −β u(t) − (σ 2 /2)u2 (t), u(0) = θ ∈ R. Fix T > 0 and assume that α = 0. Find the differential equation for the function G(t) = E exp{−u(T − t)r(t)}. Calculate the mean value and variance of r(T ). (3) In the general case, calculate the density and moment-generating function for the distribution of r(t).
322
19 Stochastic processes in financial mathematics (continuous time)
Hints 19.2. Yes, for a call option it is true. To check this statement it is necessary to prove that if we fix all other parameters, then the stochastic process {Yt := e−rt (St − K)+ ,t ≥ 0} becomes a submartingale with respect to the natural filtration and to the risk-neutral measure. For put options the situation becomes more complicated, and the answer is negative in the general case. 19.3. I method. Prove that under opposite inequalities the arbitrage is possible. II method. Directly use the form of the solution to the Black–Scholes equation. 19.6. Verify directly the definition of the self-financing property. 19.10. (3) Choose a change of time in order to reduce the equation to the diffusion one, and then use the usual Black–Scholes formula or apply Problem 19.11. 19.15. (2), (3) Substitute the corresponding function f into the formula obtained in item (1). 19.16. This is the probability of the event {S(T ) ≥ K}, and the distribution of log S(T ) is Gaussian. 19.18. Solve the Black–Scholes equation for C − P (with dividends), using the boundary condition C(S, T ) − P(S, T ) = S − K. 19.22. Apply Problem 14.14. 19.24. (2) Use the equality (19.3) (see Answers and Solutions to this chapter) and the fact that the integral on the right-hand side has a Gaussian distribution.
Answers and Solutions 19.1. ES(T ) =
2 1 √ ea+(σ /2) . σ 2π
19.7. It is possible to construct the martingale measure P∗ by the Girsanov theorem (see Problems 14.25 and 14.22). For this purpose it is necessary to put 2 4 1 μ −r 1 μ −r 1 dP∗ := exp − − σ W (t) − − σ t . dP σ 2 2 σ 2 Then the Novikov’s condition evidently holds on the finite interval [0, T ], stochastic process W˜ (t) := W (t) + ((μ − r)/(σ ) − 12 σ is a Wiener process with respect to the same filtration, and the discounted process (S(t))/(B(t)) is a martingale with respect to the measure P∗ and the same filtration. Now, because EX 2+α < ∞ for some
19 Stochastic processes in financial mathematics (continuous time)
323
α > 0, it is easy to check with the help of the H¨older inequality that the claim X is square integrable with respect to a measure P∗ . Put V (t) = EP∗ (B−1 (T )X|Ft ). Because the filtration F is generated by a Wiener process, then, for example, by Theorem 5.13 [57], the representation holds V (t) = EV (0) + 0t β (s)dW (s). Now, it is necessary to put ψ (t) = μ −1 β (t)B(t)S−1 (t) and ϕ (t) = V (t) − ψ (t)S(t)B−1 (t). Furthermore you can check it on your own with the help of the Itˆo formula that the following equations hold: B(t)V (t) = ϕ (t)B(t) + ψ (t)S(t), whence, in particular, X = ϕ (T )B(T ) + ψ (T )S(T ); that is, our portfolio replicates X, and also B(t)V (t) = B(t)dV (t)+V (t)dB(t) = ϕ (t)dB(t)+ ψ (t)dS(t); it is equivalent to the self-financing property of the strategy (ϕ , ψ ). (t)). 19.8. dZ(t) = Z(t)((α − γ + δ 2 )dt + σ dW (t) − δ dW 19.9. Write H in the form H = K · 1 − (S(T ) − A)+ + (S(T ) − A − K)+ . Thus, the desired portfolio can be constructed from K bonds of the price 1 each, short position in option (S(T ) − A)+ (i.e., this option must be sold), and the long position in option (S(T ) − A − K)+ (i.e., this option must be bought). Hence the arbitrage-free price of the claim H at moment t equals π (H)(t) = Ke−r(T −t) − π (S(T ) − A)+ + π (S(T ) − A − K)+ , where the arbitrage-free prices of the options mentioned above have to be defined by the Black–Scholes formula. 19.15. (1) The required arbitrage-free price equals −(y − μ )2 1 √ dy. f (ey ) exp 2σ 2 σ 2π R 19.21. The required price process has a form π (t) = log S(t) + (r − σ 2 /2)(T − t). of the process Y (t) = (s + t, S(t)) is given by the 19.23. The infinitesimal operator L formula 2 f (s, x) = ∂ f + rx ∂ f + 1 σ 2 x2 ∂ f , f ∈ C2 (R2 ). L ∂s ∂x 2 ∂ x2
Therefore, in our case Lg(s, x) = e−ρ s ((r − ρ )x + ρ a), whence if r ≥ ρ , R × R+ , U := {(s, x)| Lg(s, x) > 0} = ρ }, if r < ρ . {(s, x)|x < ρa−r So, if r ≥ ρ , then U = Γ c = R × R+ , and there is no optimal stopping time. If r > ρ , then vg = ∞, and for r = ρ it holds vg (s, x) = xe−ρ s (prove these statements). Consider the case r < ρ and prove that the set Γ c is invariant in t; that is, Γ c + (t0 , 0) = Γ c for all t0 . Indeed,
324
19 Stochastic processes in financial mathematics (continuous time)
Γ c + (t0 , 0) = {(t + t0 , x)| (t, x) ∈ Γ c } = {(s, x)| (s − t0 , x) ∈ Γ c } = {(s, x)| g(s − t0 , x) < vg (s − t0 , x)} = {(s, x)| eρ t0 g(s, x) < eρ t0 vg (s, x)} = {(s, x)| g(s, x) < vg (s, x)} = Γ c . Here the equalities vg (s − t0 , x) = supτ Es−t0 ,x e−ρτ (S(τ ) − a) = supτ Ee−ρ (τ +(s−t0 )) (S(τ ) − a) = eρ t0 supτ Ee−ρ (τ +s) (S(τ ) − a) = eρ t0 vg (s, x) were used. Therefore, a connected component of the set Γ c containing U must have the form Γ c (x0 ) = {(t, x)| 0 < x < x0 }, for some x0 > aρ /(ρ − r). Note that Γ c cannot have any other components, because another component V of the set Γ c must < 0 in V , and then for y ∈ V satisfy the relation Lg Ey g(Y (τ )) = g(y) + Ey
τ 0
Lg(Y (t))dt < g(y),
for all stopping times bounded by the exit time from a strip in V . So, it follows from Theorem 15.2, item (2), that vg (y) = g(y), and then V = ∅. Put τ (x0 ) = τΓ c (x0 ) and calculate g(s, x) = gx0 (s, x) := Es,x g (Y (τ (x0 ))) . This function is a solution to the boundary value problem ∂f ∂s
+ rx ∂∂ xf + 12 σ 2 x2 ∂∂ x2f = 0, 0 < x < x0 , 2
f (s, x0 ) = e−ρ s (x0 − a).
(19.1)
If we try a solution of (19.1) of the form f (s, x) = e−ρ s ϕ (x), we get the following one-dimensional problem
−ρϕ + rxϕ (x) + 12 σ 2 x2 ϕ (x) = 0, 0 < x < x0 ,
ϕ (x0 ) = x0 − a.
(19.2)
The general solution to the equation (19.2) has a form
ϕ (x) = C1 xγ1 +C2 xγ2 , where Ci , i = 1, 2 are arbitrary constants, and ) * 3 1 2 2 −2 1 2 2 γi = σ σ − r ± (r − σ ) + 2ρσ , γ2 < 0 < γ1 . 2 2 Because the function ϕ (x) is bounded as x → 0, it should hold C2 = 0, and the −γ boundary requirement gives C1 = x0 1 (x0 − a). Hence,
19 Stochastic processes in financial mathematics (continuous time)
gx0 (s, x) = f (s, x) = e−ρ s (x0 − a)
x x0
γ1
325
.
If we fix (s, x), then the maximal value of gx0 (s, x) is attained at x0 = xmax = aγ1 /(γ1 − 1) (here γ1 > 1 if and only if r < ρ ). At last, vg (s, x) = sup Es,x g τ (x0 ), X(τ (x0 )) = sup gx0 (s, x) = gxmax (s, x). x0
x0
The conclusion is that one should sell the stock at the first moment when the price of it reaches the value xmax = aγ1 /(γ1 − 1). The expected discounted profit obtained from this strategy equals γ1 − 1 γ1 −1 x γ1 . vg (s, x) = e−ρ s a γ1 19.24. (1) r(t) =
b b + r(0) − e−at + σ a a
t 0
e−a(t−s) dW (s).
(19.3)
( r(t). Then β σ 1 σ2 q(t) + dt + dW (t). dq(t) = − 2 8 q(t) 2
19.25. (1) Denote q(t) =
(2) The function G satisfies the differential equation (in the integral form) G(t) = exp{−θ r(T )} − tT G(s)r(s)(u (T − s) + β u(T − s) + (σ 2 /2)u2 (T − s))ds = E exp {−θ r(T )}; that is, in fact, it does not depend on t. Now, it is easy to prove (please do it yourself) that θ β e−β t . u(t) = 2 β + σ2 θ (1 − e−β t ) Now, it is necessary to write the equality G(0) = E exp{−u(T )r(0)} = E exp{−θ r(T )} and to take the derivative of the left-hand and right-hand sides of this equality in θ to find the corresponding moments. (3) The density of distribution of r(t) equals ft (x) = ce−c(u+x) q=
2α σ2
q/2 x u
√ Iq (2c xr), x ≥ 0, c =
− 1,
and the function Iq (x) =
∞
2β , σ 2 (1−e−β t )
u = r(0)e−β t ,
(x/2)2k+q
∑ k!Γ (k + q + 1)
k=0
is a modified Bessel function of the first kind and order q. This is a noncentral χ 2 distribution with 2(q + 1) degrees of freedom and the skew coefficient 2cu. The density of distribution of r(t) can be also presented in the form
326
19 Stochastic processes in financial mathematics (continuous time)
ft (x) =
∞
∑ (((cu)k )/k!)e−cu gk+q+1,c (x),
k=0
where the function gγ ,λ (x) = (1/(Γ (γ )))λ γ xγ −1 e−λ x is the density of Gamma distribution Γ (γ , λ ). The moment generating function equals m(ν ) =
c q+1 " cuν # . exp c−ν c−ν
(Noncentral χ 2 distributions are considered in detail in [42].)
20 Basic functionals of the risk theory
Theoretical grounds Mathematical foundations of investigating of the risk process in insurance were created by Swedish mathematician Filip Lundberg in 1903–1909. For a long time this theory had been developed by mostly Nordic mathematicians, such as Cram´er, Segerdal, Teklind, and others. Later on risk theory started to develop not only with connection to insurance but also as the method of solving different problems in actuarial and financial mathematics, econometrics. In the second half of the twentieth century the applied area of risk theory was expanded significantly. Processes with independent increments play a very important role in risk theory. The definitions and the characteristics of the homogeneous processes with independent increments are presented in Chapter 5. Not general multidimensional processes but rather real-valued ones with independent increments and with the jumps of the same sign are used in queueing and risk theory. In particular, stepwise processes ξ (t) with jumps ξk satisfying one of the following conditions, E eiαξ1 /ξ1 > 0 =
c , c − iα
E eiαξ1 /ξ1 < 0 =
b , b + iα
(20.1)
have a range of application. Theorem 20.1. The L´evy process {ξ (t), t ≥ 0} is piecewise constant (stepwise) with probability one if and only if its L´evy measure Π satisfies the condition Π (R\{0}) < ∞ and the characteristic function in the L´evy–Khinchin formula (Theorem 5.2) of an increment ξ (t) − ξ (0) is as follows, Eeiα (ξ (t)−ξ (0)) = et ψ (α ) ,
(20.2)
with the cumulant function ψ (α ) determined by the relation
ψ (α ) =
∞ eiα x − 1 Π (dx). −∞
(20.3)
D. Gusak et al., Theory of Stochastic Processes, Problem Books in Mathematics, 327 c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-87862-1 20,
328
20 Basic functionals of the risk theory
Theorem 20.2. The L´evy process {ξ (t),t ≥ 0} is a nondecreasing function of time with probability one if and only if ∞ eiα x − 1 Π (dx), (20.4) Eeiα (ξ (t)−ξ (0)) = et ψ (α ) , ψ (α ) = iα a + 0
where a ≥ 0 and the measure Π satisfies the condition
1 0
xΠ (dx) < ∞.
Theorem 20.3. The process {ξ (t),t ≥ 0} with independent increments has a bounded variation with probability one on any bounded interval if and only if eiα x − 1 Π (dx), (20.5) Eeiα (ξ (t)−ξ (0)) = et ψ (α ) , ψ (α ) = iα a + where a ∈ R,
R
1
−1 |x|Π (dx) < ∞.
Theorem 20.4. The characteristic function of the compound Poisson process N(t)
ξ (t) = at + ∑ ξk , (ξ0 = ξ (0) = 0),
(20.6)
k=0
where {ξk , k ≥ 1} are i.i.d. random variables independent of the simple Poisson process N(t) with intensity λ > 0, can be expressed in the form of identity (20.2) with the following cumulant function, eiα x − 1 dF(x), F(x) = P(ξ1 < x), x ∈ R. ψ (α ) = iα a + λ (20.7) R
Definition 20.1. The compound Poisson (20.6) process with jumps of the same sign is said to be: (1) Upper continuous if a > 0, P(ξk < 0) = 1. (2) Lower continuous if a < 0, P(ξk > 0) = 1. We call such kinds of processes semicontinuous. Definition 20.2. The compound Poisson process (20.6) with a ≤ 0 is said to be almost upper semicontinuous if the first condition in (20.1) is satisfied with c > 0. The processes (20.6) with a ≥ 0 are said almost lower semicontinuous if the second condition holds true in (20.1) with b > 0. If for these processes a = 0, then they are called stepwise almost upper or lower semicontinuous. Let us introduce the basic notions connected with risk processes and their basic characteristics. Definition 20.3. The classic risk process (or the reserve process) is the process N(t)
Ru (t) = ξu (t) = u +Ct − ∑ ξk , C > 0, u > 0, P(ξk > 0) = 1,
(20.8)
k=0
which describes the reserved capital of an insurance company at time t. Here u is N(t) an initial capital, C is a gross risk premium rate. The process S(t) = ∑k=0 ξk (ξ0 = S(0) = 0) determines the outpayments of claims with mean value 0 < μ = Eξ1 < ∞.
20 Basic functionals of the risk theory
329
Definition 20.4. The safety security loading is the number
δ=
Eξ (1) C − λ μ = > 0, ξ (t) = ξ0 (t), ES(1) λμ
(20.9)
(Here Eξ (1) = m := C − λ μ .) Definition 20.5. The claim surplus process is the process
ζ (t) = u − ξu (t) = S(t) −Ct.
(20.10)
Let us denote the extremums of the processes ξ (t) and ζ (t) as
ξ ± (t) = sup (inf) ξ (t ); 0≤t ≤t
±
ξ = sup (inf) ξ (t);
ζ ± (t) = sup (inf) ζ (t ); 0≤t ≤t
±
ζ = sup (inf) ζ (t);
0≤t<∞
(20.11)
0≤t<∞
τ + (u) = inf{t ≥ 0 | ζ (t) > u} = inf{t ≥ 0 | ξu (t) < 0}. Definition 20.6. The ultimate ruin probability is
Ψ (u) = P(ξu (t) < 0 for some t > 0).
(20.12)
It can be written in the terms of distributions of the extremums as follows:
Ψ (u) = P(ζ + > u) = P(τ + (u) < ∞) = P(ξ − < −u).
(20.13)
Definition 20.7. The ruin probability with finite horizon [0, T ] is the probability
Ψ (u, T ) = P(ζ + (T ) > u) = P(τ + (u) < T ).
(20.14)
Besides the extreme values (20.11), other boundary functionals are also used in risk theory. In particular, the following overjump functionals are used:
γ + (u) = ζ (τ + (u)) − u − the value of overjump; γ+ (u) = u − ζ (τ + (u) − 0) − the value of lowerjump; γu+
(20.15)
+
= γ (u) + γ+ (u), u > 0;
here γu+ is the value of the jump covering the level u. Let γ+ (u) be the lowerjump of the stepwise process ζ (t) with a = 0 under the condition that the first jump ξ1 > u crossed the level u ≥ 0. Then γ+ (u) takes up the fixed value u with positive probability P(γ+ (u) = u, ζ + > u) = F(u) = P(ξ1 > u) > 0. Let us also mention that all boundary functionals (20.11), (20.15) have their own interpretation in risk theory, namely: τ + (u) is the ruin time. γ + (u) is the security of ruin. γ+ (u) is the surplus ζ (t) prior to ruin. γu+ is the claim causing ruin.
330
20 Basic functionals of the risk theory
(Figures 2 and 3, page 360 contain the graphs of the process ξu (t) and ζu (t), and of the functionals mentioned above). Denote by τ (u) = inf{t > τ + (u) | ξu (t) > 0} the time of returning of ξu (t) after the ruin into the half-plane Π + = {y ≥ 0}, τ (u) − τ + (u), τ + (u) < ∞, (20.16) T (u) = τ + (u) = ∞. ∞, T (u) is said to be the first “red period”, determining the first duration of ξu (t) being in the risk zone Π − = {y < 0} or ζ (t) being in the risk zone Πu+ = {x > u}. (Figure 6, page 363 contains the graphs of these functionals for the classic risk process). The risk zone Πu+ = {x > u} and the survival zone {x ≤ u} for the process ζ (t) are divided by the “critical” boundary x = u. Let Z + (u) = Z1+ (u) =
ζ (t),
sup
τ + (u)≤t<∞
sup
(20.17)
ζ (t).
τ + (u)≤t<τ (u)
Z + (u) determines the the total maximal deficit; Z1+ (u) is the maximal deficit during a period T (u). The duration of the process ζ (t) being over “critical” level u > 0 is defined by the integral functional Qu (t) =
t
1Iζ (s)>u ds,
0
(20.18)
which determines for t → ∞ the total duration of the “red period” Qu (∞) =
∞
1Iζ (s)>u ds.
0
Let us note that the functionals (20.16)–(20.18) are needed to study the behavior of the risk processes after the ruin. This need is explained by the possibility for the insurance agency to function even after the ruin. It can borrow some capital. In order to estimate the predicted loan, it is important to know the distribution of these functionals. To study the distributions of the functionals from (20.11), (20.15)–(20.18), we need the results from the theory of boundary problems for the processes with independent increments which can be found in [7, 33, 50]. Let {ξ (t), t ≥ 0} (ξ (0) = 0) be a general real-valued homogeneous process with independent increments that has the following characteristic function in the L´evy– Khinchin form: Eeiαξ (t) = et ψ (α ) ,
σ2 ψ (α ) = iαγ − α 2 + 2
∞ −∞
iα x
e
iα x −1− 1 + x2
Π (dx),
(20.19)
20 Basic functionals of the risk theory 0<|x|≤1
331
x2 Π (dx) < ∞.
We denote as θs an exponentially distributed random variable which is independent of ξ (t), P(θs > t) = e−st ,
s > 0, t > 0,
ϕ (s, α ) = Eeiαξ (θs ) ,
ϕ± (s, α ) = Eeiαξ
± (θ
s)
,
and consider a randomly stopped process ξ (θs ). The introduction of θs allows us to write in the short form the Laplace–Karson transform of the distributions of ξ (t), ξ ± (t) and their characteristic function. In particular, P(s, x) := P(ξ (θs ) < x) = s Eeiαξ (θs ) = s
∞ 0
∞ 0
e−st P(ξ (t) < x) dt, x ∈ R,
e−st Eeiαξ (t) dt, Eeiαξ
± (θ
s)
=s
∞ 0
e−st Eeiαξ
± (t)
dt.
It is easy to prove that
ϕ (s, α ) = Eeiαξ (θs ) =
s . s − ψ (α )
(20.20)
Theorem 20.5. The following main factorization identity holds true for the characteristic function of ξ (θs )(see Theorem 2.2 in [33]):
ϕ (s, α ) = ϕ+ (s, α )ϕ− (s, α ), ±∞ iα x ± e − 1 dNs (x) , ϕ± (s, α ) = exp ± Ns+ (x) = − Ns− (x) =
0
∞ 0 ∞
(20.21) (20.22)
0
e−st t −1 P(ξ (t) > x) dt,
x > 0,
e−st t −1 P(ξ (t) < x) dt, x < 0.
The relations (20.22) are called the Spitzer–Rogozin identities. The characteristic functions ϕ± (s, α ) in them are expressed via the complicated transformations of the distributions for the positive (negative) values of ξ (·). Let us mention that the following supplements to the extremums of the process
ξˆ ± (t) = ξ (t) − ξ ∓ (t), ξˆ ± (θs ) = ξ (θs ) − ξ ∓ (θs ) satisfy the following relations, d ξˆ ± (θs ) = ξ ± (θs );
that is,
E exp{iα ξˆ ± (θs )} = ϕ± (s, α ).
332
20 Basic functionals of the risk theory
It means that the components of the main factorization identity (20.22) can be interpreted as the characteristic functions of the supplements ξˆ ± (θs ). Later in the text we use the following notations, P+ (s, x) := P(ξ + (θs ) < x), x > 0; P− (s, x) := P(ξ − (θs ) < x), x < 0. We denote as θv the exponentially distributed random variable (independent of θs and ξ (t) ) with a parameter v > 0. The following statement on the second factorization identity holds true. Theorem 20.6. The joint distribution of the pair {τ + (·), γ + (·)} is determined by the moment generating function +
+
Ee−sτ (θv )−zγ (θv ) 1Iτ + (θv )<∞ ϕ+ (s, iv) v 1− , (s, z, v > 0). = v−z ϕ+ (s, iz)
(20.23)
It is easy to determine the moment generating functions inverting (20.23) in v: Ee−sτ
+ (x)−zγ
+ (x)
−zγ+ (x)
Ee where
1Iτ + (x)<∞ =
1 ϕ+ (s, iz)
1 1Iτ + (x)<∞ = ϕ+ (iz)
∞
∞ x
x
ez(x−y) dP+ (s, y), m ∈ R,
ez(x−y) dP+ (y), m < 0,
P+ (y) = P(ζ + < y).
Theorem 20.7. If the pair {τ + (0), γ + (0)} satisfies the condition P(τ + (0) = γ + (0) = 0) = 0, then the joint moment generating function {τ + (0), γ + (0)} is as follows p+ (s) , ϕ+ (s, iz) p+ (s) = P(ξ + (θs ) = 0), q+ (s) = 1 − p+ (s), + p+ (s) p+ (s) = , Ee−zξ (θs ) = 1 − f+ (s, z) 1 − q+ (s) gs (z) f+ (s, z) = Ee−zγ
+ (0)−sτ + (0)
1Iτ + (0)<∞ = 1 −
(20.24)
(20.25)
+
f+ (s, z) = Ee−zγ (0) 1Iξ + (θs )>0 = q+ (s) gs (z), / + gs (z) = E e−zγ (0) ξ + (θs ) > 0 . If m = Eξ (1) ≥ 0, then f+ (s, z) → Ee−zγ s→0
+ (0)
1Iξ + >0 = Ee−zγ
+ (0)
(because P(ξ + >
0) = P(ξ + = +∞) = 1), and the distributions γ + (0) and γ + (∞) = limu→∞ γ + (u) are connected by the relation + 1 −zγ + (0) 1 − Ee . (20.26) Ee−zγ (∞) = zEγ + (0)
20 Basic functionals of the risk theory
333
If m < 0, then the moment generating function of the absolute maximum can be expressed by the generalized Pollaczek–Khinchin formula / + + p+ , g0 (z) = E e−zγ (0) ξ + > 0 , (20.27) Ee−zξ = 1 − q+ g0 (z) q+ (s) = P(ξ + (θs ) > 0) → q+ = P(ξ + > 0), 0 < q+ < 1, s→0
p+ = 1 − q+ .
The complicated dependence of the distribution of the positive (negative) values of the process for the positive (negative) components ϕ± (s, α ) of the main factorization identity becomes considerably simpler for the semicontinuous and almost semiconthe jumping part of tinuous processes. Later we consider only the processes ξ (t), which has the bounded variation. For such kind of processes |x|≤1 |x|Π (dx) < ∞ and the cumulant is as follows.
ψ (α ) = iα a −
σ2 2 α + 2
eiα x − 1 Π (dx), R
σ 2 ≥ 0.
(20.28)
Thus, the drift coefficient a = 0 and jumps of the process have different signs for the semicontinuous processes with σ 2 = 0. For almost semicontinuous processes with σ 2 = 0 only the exponentially distributed jumps and drift coefficient a have different signs. Let us denote k(r) := ψ (−ir) and write the Lundberg equation k(r) − s = 0, s ≥ 0.
(20.29)
For upper (lower) semicontinuous and almost semicontinuous processes due to the convexity of k(r) in the neighborhood of r = 0 the equation (20.29) has only one positive (negative) root rs = ±ρ± (s) which completely determines ϕ± (s, α ). Theorem 20.8. The following relations hold for the upper continuous nonmonotonic process ξ (t) with the cumulant (20.28), where Π (dx) = 0 for x > 0 and m = k (0):
ϕ+ (s, α ) =
ρ+ (s) , k(ρ+ (s)) = s, ρ+ (s) − iα
P+ (s, x) := P(ξ + (θs ) > x) = e−ρ+ (s)x , x ≥ 0,
ρ+ (s) → 0 for m ≥ 0; s→0
ρ+ (0) = m−1
(20.30)
for m > 0.
The distribution of ξ − (θs ) can be determined by the relation P− (s, x) =
1 P (s, x) + P(s, x), x < 0. ρ+ (s)
(20.31)
If σ 2 > 0, then the derivative P (s, x) (in x) exists for all x ∈ R1 . If σ 2 = 0, a > 0, and ξ (t) have the bounded variation, then the derivative P (s, x) exists only for x = 0 and
334
20 Basic functionals of the risk theory
s P (s, +0) − P (s, −0) = , a p− (s) := P(ξ − (θs ) = 0) =
s > 0, aρ+ (s)
(20.32)
1 m , p− (0) = , m > 0, m a where a is a constant drift from (20.28).
ρ+ (0) =
Theorem 20.9. The following relations hold for the lower continuous nonmonotonic process ξ (t) with the cumulant (20.28), where Π (dx) = 0 for x < 0 and m = k (0):
ϕ− (s, α ) =
ρ− (s) , k(−ρ− (s)) = s, ρ− (s) + iα
ρ− (s) → 0 for m ≤ 0; ρ− (0) = m−1 for m < 0,
(20.33)
s→0
P− (s, x) = eρ− (s)x , x ≤ 0. The distribution of ξ + (θs ) is determined by the relation P+ (s, x) =
1 P (s, x) + P(s, x), x > 0. ρ− (s)
If σ 2 = 0, a < 0, then p+ (s) = P(ξ + (θs ) = 0) = p+ (s) → p+ = s→0
s > 0, |a|ρ− (s)
(20.34)
|m| λμ , q+ = , m < 0. |a| |a|
Theorem 20.10. For the almost upper semicontinuous process ξ (t) satisfying the first condition in (20.1) with c > 0 and with cumulant function of the form
ψ (α ) = cλ1
∞ 0 eiα x − 1 e−cx dx + (eiα x − 1)Π (dx), λ1 > 0, −∞
0
the following relations holds:
ϕ+ (s, α ) =
p+ (s)(c − iα ) , ρ+ (s) = cp+ (s), k(ρ+ (s)) = s, ρ+ (s) − iα P+ (s, x) = q+ (s)e−ρ+ (s)x , x ≥ 0,
ρ+ (s) → 0 if m ≥ 0; s→0
P− (s, x) =
ρ+ (0) = m−1 ,
p+ (0) = (cm)−1 ,
(20.35)
m = Eξ (1) = k (0) > 0.
∞ 1 P(s, x) − cq+ (s) e−cy P(s, x − y) dy , x < 0. p+ (s) 0
(20.36)
20 Basic functionals of the risk theory
335
Theorem 20.11. For the almost lower semicontinuous process ξ (t) satisfying the second condition in (20.1) with b > 0 and with cumulant function of the form
ψ (α ) = bλ2
0 ∞ eiα x − 1 ebx dx + (eiα x − 1)Π (dx), λ2 > 0, −∞
0
the following relations hold true.
ϕ− (s, α ) =
p− (s)(b + iα ) , ρ− (s) = bp− (s), k(−ρ− (s)) = s, ρ− (s) + iα
ρ− (0) = |m|−1 ,
p− (0) = (b|m|)−1 , m = k (0) < 0;
(20.37)
ρ− (s)x
P− (s, x) = q− (s)e , x < 0, ρ− (s) → 0 if m ≤ 0. s→0 0 1 P+ (s, x) = P(s, x) − bq− (s) eby P(s, x − y) dy , x > 0. p− (s) −∞
(20.38)
The corresponding results for the distributions of the absolute extremums are simple corollaries from Theorems 20.8–20.11. Corollary 20.1. The following relations are true for the upper continuous processes ξ (t) in accordance to the sign of m = Eξ (1) (or to the sign of safety security loading δ = m/(λ μ )): (1) If m > 0 then ρ+ (s) → 0, ρ+ (s)s−1 → m−1 , P(ξ + = +∞) = 1. It follows s→0
s→0
from (20.31) that for s → 0 we obtain ∞ − P(ξ < x) = m P(ξ (t) < x) dt , x < 0. 0
(20.39)
x
( √ (2) If m = 0 then we have for s → 0 that ρ+ (s) ≈ 2sσ1−1 , σ1 = Dξ (1), P(ξ ± = ±∞) = 1. (3) If m < 0 then ρ+ (s) → ρ+ > 0 and, according to (20.30), we obtain the following relation as s → 0,
s→0
P(ξ + > x) = e−ρ+ x , x ≥ 0, P(ξ − = −∞) = 1.
(20.40)
Corollary 20.2. The following is true for the lower continuous processes in accordance with the sign of m (or with the sign of δ ). (1) If m < 0, then ρ− (s) → 0, ρ− (s)s−1 → |m|−1 and P(ξ − = −∞) = 1. Acs→0
s→0
cording to (20.34) we obtain the following relations as s → 0: ∞ + P(ξ > x) = m P(ξ (t) > x) dt , x > 0. 0
(20.41)
x
√ (2)(If m = 0, then we obtain the following relations as s → 0: ρ± (s) ≈ 2sσ1−1 , σ1 = Dξ (1), P(ξ ± = ±∞) = 1. (3) If m > 0, then ρ− (s) → ρ− > 0 and according to (20.33) we obtain the following relations as s → 0:
s→0
P(ξ − < x) = eρ− x , x ≤ 0, P(ξ + = +∞) = 1.
(20.42)
336
20 Basic functionals of the risk theory
Note that the process ξ (t) = at + σ W (t) is both upper and lower continuous. Thus, its characteristic functions ξ ± (θs ) are determined by the first formulas in (20.30) and (20.33). Corollary 20.3. The following relations hold true for the almost upper semicontinuous process ξ (t) with a = 0 in accordance with the sign of m (or with the sign of δ ). (1) If m > 0 then ρ+ (s) → 0, ρ+ (s)s−1 → ρ+ (0) = m−1 , P(ξ + = +∞) = 1. s→0
s→0
According to (20.36) for x < 0
lim P− (s, x) = P(ξ − < x) ∞ = cm P(ξ (t) < x) dt − c
s→0
0
∞ ∞
−cy
e
0
0
P(ξ (t) < x − y) dt dy . (20.43)
√ (2) If m = 0 then tending s → 0 we obtain that ρ+ (s) ≈ 2sσ1−1 , P(ξ ± = ±∞) = 1. (3) If m < 0 then ρ+ (s) → ρ+ = cp+ , p+ = P(ξ + = 0) = c|m|/λ > 0, q+ = 1 − p+ s→0
and according to (20.35) P(ξ + > x) = q+ e−ρ+ x , x > 0, P(ξ − = −∞) = 1.
(20.44)
Corollary 20.4. The following relations are true for the almost lower semicontinuous process ξ (t) with a = 0. (1) If m < 0 then ρ− (s) → 0, ρ− (s)s−1 → |m|−1 , P(ξ − = −∞) = 1, p+ = s→0
b|m|/|λ |, q+ = 1 − p+ . According to (20.38) we have for x > 0:
s→0
lim P+ (s, x) = P(ξ + > x) ∞ P(ξ (t) > x) dt − b = b|m|
s→0
0
0
∞ ∞ x
eb(x−y) P(ξ (t) > y) dy dt . (20.45)
√ (2) If m = 0 then given s → 0, ρ− (s) ≈ 2sσ1−1 , P(ξ ± = ±∞) = 1. (3) If m > 0 then ρ− (s) → ρ− = bp− , p− = P(ξ − = 0) = (bm)/λ > 0 and s→0
according to (20.37) for x < 0
P(ξ − < x) = q− eρ− x , q− = 1 − p− , P(ξ + = +∞) = 1.
(20.46)
Let us define the Laplace–Karson transform of the ruin probability with finite horizon (20.14):
Ψs (u) = s
Then, according to (20.34),
∞
0
e−stΨ (u,t) dt.
20 Basic functionals of the risk theory
Ψs (u) =
1 P (s, −u) + P(s, −u), u > 0 ρ− (s)
337
(20.47)
for the upper continuous classic ruin process ξu (t) assigned by the formula (20.8). And it follows from (20.39) that for m > 0 and s → 0, ∞ Ψ (u) = lim Ψs (u) = m P(ξ (t) < x) dt = P(ξ − < −u). (20.48) s→0
0
x=−u
We should take into account that the jumps ξk are positive for the lower continuous claim surplus process ζ (t) = S(t) −Ct (see (20.10)). On the other hand, for the almost lower semicontinuous risk process we have that
ζ (t) = S(t) −C(t), C(t) =
N2 (t)
∑ ξk ,
Eeiαξk =
k=0
b , b > 0, b − iα
(20.49)
N (t)
1 the jumps of the process S(t) = ∑k=0 ξk , that is, the claims ξk are also positive (the Poisson processes N1,2 (t) with intensities λ1 > 0, λ2 > 0 are independent). The cumulant of ζ (t) can be written in two ways:
ψ (α ) = λ1 (ϕ1 (α ) − 1) + λ2 (
b − 1), ϕ1 (α ) = Eeiαξ1 b + iα
or
ψ (α ) = λ (ϕ (α ) − 1), ϕ (α ) = λ
pϕ1 (α ) + q
b b + iα
,
where λ = λ1 + λ2 , p = λ1 /λ , q = 1 − p, δ = (λ |μ |)/(λ1 μ1 ) > 0, λ μ = λ1 μ1 − λ2 b−1 , μ1 = Eξ1 . Let also P(ξ1 > 0) = F 1 (0) = 1, F 1 (x) = P(ξ1 > x), F(x) = pF 1 (x), x > 0. For both processes (20.10) and (20.49) the conditional moment generating function for γ + (0) in Pollaczek–Khinchin formula (see (20.27)) is as follows. / + E e−zγ (0) ζ + > 0 = −1 ∞ −zx (20.50) ϕ0 (z) = 0∞ F(x) dx F(x) dx, for ζ (t) in (20.10) 0 e = g0 (z) = q1+ 0∞ e−yz (dF(y) + bF(y)dy), for ζ (t) in (20.49). Furthermore, the moment generating function for ζ + is determined by the classic Pollaczek–Khinchin formula. This implies the following decomposition, +
Ee−zζ =
∞ p+ = p+ ∑ (q+ ϕ0 (z))n . 1 − q+ ϕ0 (z) n=0
(20.51)
Let us denote Sn = ∑k≤n ξk , where ξk are i.i.d. random variables with the moment generating function ϕ0 (z). Inverting (20.51) in the variable z, we obtain
Ψ (u) = P(ζ + > u) = p+
∞
∑ qn+ P(Sn > u).
n=1
(20.52)
338
20 Basic functionals of the risk theory
It means that ζ + = Sν . Here the random variable ν follows the geometric distribution with parameter q+ = P(ζ + > 0) < 1. d
Definition 20.8. The process (20.49) is called the risk process with random (exponentially distributed) claims (Figures 4 and 5, page 362 contain the graphical images of the functionals of the process (20.49)).
Let us mention that if Eeiαξk = c/(c − iα ), Eeiαξk = b/(b − iα ) then the process (20.49) is both the almost upper and almost lower semicontinuous. The joint moment generating function of the overjump functionals {τ + (x), γk (x), k = 1, 3}, where γ1 (x) = γ + (x), γ2 (x) = γ+ (x), γ3 (x) = γx+ , that is, V (s, x, u1 , u2 , u3 ) = Ee−sτ
+ (x)− 3 u γ (x) ∑k=1 k k
1Iτ + (x)<∞ ,
can be determined by some integral equation on the semi-axis x > 0. For the lower continuous (almost lower semicontinuous) processes ζ (t) this equation is as follows. (s + λ )V (s, x, u1 , u2 , u3 ) − λ
x
V (s, x − y, u1 , u2 , u3 )dF(y) = A(x, u1 , u2 , u3 ),
−∞
where A(x, u1 , u2 , u3 ) = λ
∞ x
e(u1 −u2 )x−(u1 +u2 )z dF(z), x > 0.
Let us denote the convolution A(·) with dP− (·) (which is the exponential distribution of ζ − (θs ); see (20.33), (20.37)), as G(·). Then G(s, x, u1 , u2 , u3 ) =
+0 −∞
A(x − y, u1 , u2 , u3 ) dP− (s, y)
= p− (s)A(x, u1 , u2 , u3 ) + q− (s)ρ− (s)
0 −∞
(20.53)
A(x − y, u1 , u2 , u3 ) dP− (s, y),
p− (s) ≥ 0. Note that for the lower continuous processes ζ (t) in (20.53) p− (s) = 0, q− (s) = 1. Theorem 20.12. The joint moment generating function of the overjump functionals for the lower semicontinuous (almost semicontinuous) risk processes ζ (t) is determined by the relation V (s, u, u1 , u2 , u3 ) = s−1
u −0
G(s, u − y, u1 , u2 , u3 ) dP+ (s, y).
(20.54)
The moment generating function of the pair {τ + (u), γk (u)}, (k = 1, 3) is determined by the relation Vk (s, u, uk ) : = Ee−sτ = s−1
+ (u)−u
u −0
k γk (u)
1Iτ + (u)<∞
Gk (s, u − y, uk ) dP+ (s, y), k = 1, 3,
(20.55)
20 Basic functionals of the risk theory
339
where p+ (s) > 0 for the processes (20.10) and (20.49), P± (s, x) = P(ζ ± (θs ) < x)
(±x > 0),
G1 (s, u, u1 ) = G(s, u, u1 , 0, 0), G2 (s, u, u2 ) = G(s, u, 0, u2 , 0), G3 (s, u, u3 ) = G(s, u, 0, 0, u3 ). In turn, functions Gk corresponding to the process ζ (t) from (20.10) have a form ∞ λ ρ− (s) e−u1 y − e−ρ− (s)y dF(u + y), G1 (s, u, u1 ) = ρ− (s) − u1 0 ∞
e−u2 (u+z)−ρ− (s)z F(u + z) dz, ∞ e−u3 z 1 − eρ− (s)(u−z) dF(z). G3 (s, u, u3 ) = λ
G2 (s, u, u2 ) = λ ρ− (s)
(20.56)
0
u
The relations (20.54) and (20.55) can be easily inverted in uk (k = 1, 3). Denote
φ (s, k, u, x) =
∂ P(γk (u) < x, ζ + (θs ) > u). ∂x
This value tends to a limit as s → 0. We denote this limit φk (u, x). The limit distributions of the overjump functionals and marginal ruin functions can be found using these relations. Theorem 20.13. The first two ruin functions for the lower continuous ruin processes ζ (t) in the case when m < 0 are determined by the relations
Φ1 (u, x) := P(ζ + > u, γ + (u) > x) = =
λ λ F(u + x) + c |m|
u 0
x
φ1 (u, z)dz
F(u + x − z) dP+ (z), F(y) =
Φ2 (u, y) := P(ζ + > u, γ+ (u) > y) = =
∞
⎧ ⎨
∞ y
(20.57)
∞
F(z) dz, y
φ2 (u, z)dz
λ (u)F(y), |m| P + u u−y λ ⎩ p F(u) + F(y) dP (z) + F(u − z) dP (z) , + + + u−y 0 |m|
(20.58) y > u, 0 < y < u,
where P+ (u) = P(ζ + < u), p+ = |m|/C. The distribution density of the claim γu+ that caused the ruin (i.e., of the third ruin function) is as follows.
φ3 (u, y) := =
∂ ∂ P(γu+ < y, ζ + > u) = − Φ3 (u, y) ∂z ∂y
u λ |m| F (y) 0 (y − u + z) dP+ (z), 0 λ |m| F (y) −y (z + y) dP+ (y + u),
y > u, 0 < y < u.
(20.59)
340
20 Basic functionals of the risk theory
In order to study the distribution of the first duration and the total duration of “red period” it should be taken into account that T (u) = τ − (−γ + (u)),
u > 0,
−
x < 0.
d
τ (x) = inf{t > 0| ζ (t) < x},
(20.60)
The statement below follows from (20.55), (20.56), and (20.60) (if k = 1, s → 0). Theorem 20.14. Consider the lower continuous risk process ζ (t) (see (20.10)). The following relations hold true for m = Eζ (1) < 0. g+ (u, z) := Ee−zγ
+ (u)
1Iτ + (u)<∞
(20.61) u ∞ λ ∞ −zx λ = e F(u + x) dx + e−zx F(u − y + x) dx dP+ (y). c 0 |m| 0+ 0 The following formula is a consequence of (20.60). It is called the corrected dos Reis formula. (20.62) gu (s) := Ee−sT (u) 1Iτ + (u)<∞ = g+ (u, ρ− (s)). The integral transform of the moment generating function for the total duration of the “red period” is determined by the following relation (m < 0). d+ (u, μ ) :=
∞ 0
eiα u du Ee−μ Qu (∞) =
+ ϕ+ (α ) , ϕ+ (α ) = Eeiαζ . ϕ+ (μ , α )
(20.63)
The following analogue of (20.61) is true for the almost lower semicontinuous risk process (20.49) with arbitrary distributed claims and exponentially distributed premiums in the case when b > 0, m = λ (pμ1 − qb−1 ) < 0, λ = λ1 + λ2 . g+ (u, z) : = +
∞ 0
e−zx φ1 (u, x)dx =
λ b|m|
where g˜0 (z) = Moreover,
u ∞ 0+ 0
∞ 0
e−yz dF∗ (u + y)
(20.64)
e−yz F∗ (u − y + x) dx dP+ (y) →
∞
u→0 0
e−yz dF∗ (y) = q+ g˜0 (z),
1 ∞ −yz dF∗ (y) = (F ∗ (0))−1 0∞ e−yz dF∗ (y). q+ 0 e
b|m| . λ The moment generating function of T (u) is determined by the generalization of the formula (20.62): (20.65) gu (s) = q− (s)g+ (u, ρ− (s)). The integral Fourier–Stieltjes transform for Qu (∞) is determined by the relation (20.63) the same way as for lower continuous processes. The moment generating function for the total deficit maximum Z + (u) (see (20.17)) is determined for the processes (20.10) and (20.49) by the relation F∗ (y) := F (y) + bF(y), y > 0, p+ =
κu (z) := Ee−zZ
+ (u)
+
1Iζ + >u = e−uz Ee−zζ .
(20.66)
20 Basic functionals of the risk theory
341
Theorem 20.15. Let ζ (t) be the lower almost semicontinuous stepwise risk process. The limiting ruin densities for m < 0 are determined by the relations
φ1 (u, x) = F∗ (u + x) +
λ b|m|
u
F∗ (u + x − y)dP+ (y), x ≥ 0,
0
(20.67)
F∗ (x) = F (x) + bF(x), x ≥ 0, F(0) = p > 0. If y = u ≥ 0 then P(γ2 (u) = γ+ (u) = u) = F(u) = pF 1 (u), and for y = u λ F(y)P+ (u), y > u, u φ2 (u, y) = |m| λ −1 |m| F(y) u−y dP+ (z) + b P+ (u − y) , 0 < y < u,
φ3 (u, y) =
u λ −1 |m| F (y) 0− (z + y − u + b )dP+ (z), 0 λ −1 |m| F (y) −y (z + y + b )dP+ (z + u),
y > u, 0 < y < u.
(20.68)
(20.69)
Let us remark that (20.67) follows from (20.64) inverting it on u1 = z and (20.68) and (20.69) follow from (20.55) using the boundary transition as s → 0 and inversion on u2,3 . At that time the integral in the first row in (20.69) can be written as u 0− −1
z + y − u + b−1 dP+ (y)
= p+ (z − u + b ) +
u
z + y − u + b−1 dP+ (y), z > u.
0+
The relations (20.57) and (20.58) as u → 0 imply the following. Corollary 20.5. Let ζ (t) be the lower continuous risk process (see (20.10)). Then for + m < 0 the following relations are true for Ee−sτ (0) 1Iγk (0)>z = P(ζ + (θs ) > θ , γk (0) > z), k = 1, 3.
λ ∞ (z−y)ρ− (s) e F(y) dy, C z ∞ λ e−yρ− (s) F(y) dy, P(γ+ (0) > z, ζ + (θs ) > 0) = C z ∞ λ 1 − e−yρ− (s) dF(y). P(γ0+ > z, ζ + (θs ) > 0) = Cρ− (s) z
P(γ + (0) > z, ζ + (θs ) > 0) =
(20.70)
If m < 0 then for s → 0 and the lower continuous process ζ (t) the formula (20.70) implies p+ = |m|/C, P(γ + (0) > z, ζ + > 0) = P(γ+ (0) > z, ζ + > 0) = P(γ0+ > z, ζ + > 0) =
λ C
∞ z
y dF(y), F(z) =
P(γk (0) > 0, ζ + > 0) = P(ζ + > 0) = q+ =
λμ , C
λ F(z), C
∞
F(y) dy, z
μ = F(0).
(20.71)
342
20 Basic functionals of the risk theory
Corollary 20.6. If ζ (t) is the lower almost semicontinuous risk process (see (20.49)) then the following relations are true (F(x) = pF 1 (x), x > 0, p = F(0)) p± (s) = P(ζ ± (θs ) = 0) > 0,
p+ (s)ρ− (s) =
sb , s+λ
∞ λ F(x) + q− (s)b e(x−y)ρ− (s) F(y) dy , s+λ x ∞ λ b q− (s) e−vρ− (s) F(v) dv, (20.72) P(γ+ (0) > y, ζ + (θs ) > 0) = s+λ y λ bq− (s) ∞ 1 − e−yρ− (s) dF(y) . F(z) + P(γ0+ > z, ζ + (θs ) > 0) = s+λ ρ− (s) z
P(γ + (0) > x, ζ + (θs ) > 0) =
If s → 0 and m < 0 then it follows from (20.72) for the lower almost semicontinuous process ζ (t) that p+ = (b|m|)/λ and P(γ + (0) > x, ζ + > 0) = p F 1 (x) + bF 1 (x) → q+ = p(1 + bμ1 ), x→0
P(γ+ (0) > y, ζ + > 0) = bpF 1 (y), F 1 (y) =
∞
(20.73)
F 1 (x) dx,
y
P(γ+ (0) > 0, ζ + > 0) < q+ , P(γ+ (0) = 0, ζ + > 0) = pF 1 (0), ∞ P(γ0+ > z, ζ + > 0) = p F 1 (z) + b y dF1 (y) → q+ . z
z→0
In order to calculate the moments mk = Eζ (1)k (given mk < ∞, k = 1, 4) of the risk process ζ (t) = S(t) − Ct (or ζ (t) = S(t) − C(t)) the derivatives of its cumulant k(r) = ln Eerζ (1) = ψ (−ir) at zero point (r = 0) are used. They are said to be semiinvariants k (0) =: κ1 = m1 k (0) =: κ2 = Dζ (1) = m2 − m21 ,
(20.74)
k (0) =: κ3 = m3 − 3m1 m2 + 2m31 , k(4) (0) =: κ4 = m4 − 3m1 − 4m1 m3 + 12m21 m2 − 6m41 . This implies that m1 = κ1 , m2 = κ2 + κ12 , m3 = κ3 + 3κ1 κ2 + κ13 ,
(20.75)
m4 = κ4 + 3κ22 + 4κ1 κ3 + 6κ12 κ2 + κ14 . And let us finally mention that it is not always possible to find the ruin probabilities Ψ (T, u) and Ψ (u) in an explicit form from the integro-differential equation derived for Ψ (T, u). Most often the Laplace–Karson transform on T or Laplace or Fourier transform on u are used. That is why the approximating estimates of these probabilities are often used in risk theory. They could be found in [30, 33].
20 Basic functionals of the risk theory
343
Bibliography [7, 33, 47, 50]; [55] Chapter III; [1, 10, 30, 68].
Problems 20.1. Let us consider the process ξ (t) = at + σ W (t) with a characteristic function Eeiαξ (t) = et ψ (α ) , ψ (α ) = iaα − 12 σ 2 α 2 . Write the characteristic function for ξ (θs ), express the components of the main factorization identity (characteristic functions of ξ ± (θs )) in terms of roots rs = ±ρ± (s) of the Lundberg equation. Find the distributions of the extremums P± (s, x) = P(ξ ± (θs ) < x), (±x > 0). For Eξ (1) = a < 0, (a > 0) find the distributions of the absolute extremums. Find out the shape of the distribution for γk (x) (γ1 (x) = γ + (x), γ2 (x) = γ+ (x)) and for the first duration of T (x) being over the level x. If a < 0 find the moment generating function for the total duration of being over the level x > 0. Show that Q0 (t) satisfies the arcsine law for a = 0. 20.2. Let ζ (t) = S(t) −C(t) be a risk process with claims ξk and premiums ξk both following the exponential distributions, ξk with parameter c and ξk with parameter 1. Furthermore, let N1 (t) c , S(t) = ∑ ξk , Eeiαξk = c − iα k=0 C(t) =
N2 (t)
∑ ξk ,
k=0
Eeiαξk =
1 , 1 − iα
where N1,2 (t) are independent Poisson processes with λ1 = λ2 = 1. Find ψ (α ), ϕ (s, α ), and ϕ± (s, α ) using the roots of the Lundberg equation. If ±m < 0 (m = Eζ (1)), find the characteristic function of ζ ± . Find the joint moment generating function of {τ + (x), γ + (x)} relying on the second factorization identity (20.23). If m < 0, find the moment generating function for the distribution of the first duration of “red period” T (x) and the moment generating function for the total duration Qx (∞) of being over the level x > 0. N(t)
20.3. Let ζ (t) = −t + ∑k=0 ξk be a claim surplus process following the exponential distribution ϕ (α ) = Eeiαξk = c(c − iα )−1 , c > 0, and N(t) be a Poisson process with intensity λ > 0. Find ψ (α ), ϕ (s, α ) and express ϕ± (s, α ) via roots of the Lundberg equation. Find the characteristic function of ζ ± and the ruin probability for ±m < 0 (m = Eζ (1) = (λ − c)/c) if the initial capital u > 0 and m < 0. Write the formulas for the densities of the ruin functions φk (u, z) = (∂ /∂ z)P(γ k (u) < z, ζ + > u), k = 1, 2, 3. 20.4. Calculate all three densities of the ruin functions for the process ζ (t) from Problem 20.3 taking into account that
344
20 Basic functionals of the risk theory
P+ (y) = P(ζ + < y) = 1 − q+ e−ρ+ y , y > 0. Prove also that the first density function in the solution of Problem 20.3 can be simplified (see (20.93) below):
∂ P(γ + (u) < x, ζ + > u) = λ e−ρ+ u−cx , ∂x λ P(γ + (u) > x, ζ + > u) = e−ρ+ u−cx . c Find the moment generating function for γ + (x) and the moment generating function for the first duration of the “red period” T (u). 20.5. Let us consider the risk processes with random premiums from Problem 20.2 and the classical risk process with linear premium function from Problem 20.4 the claims of which follow the exponential distribution with parameter c > 0. It was shown that the moment generating functions of the distribution γ + (u) have the same form (see the expressions for g+ (u, z) in the end of the solutions of Problems 20.2 and 20.4). Find the moment generating function of the total deficit Z + (u). N(t)
20.6. Let ζ (t) = (∑k=0 ξk ) − t be a claim surplus process where ξk have the characteristic function 3 1 7 , ϕ (α ) = Eeiαξk = + 2 3 − iα 7 − iα where N(t) is a Poisson process with intensity λ = 3, (m = Eζ (1) = −2/7). Prove that −(iα )3 + 7(iα )2 − 6iα , ψ (α ) = (3 − iα )(7 − iα ) s(3 − iα )(7 − iα ) s = , ϕ (s, α ) = s − ψ (α ) P3 (s, iα ) where ϕ (s, α ) is a fractional rational function, and P3 (s, r) is a cubic polynomial: P3 (s, r) = r3 + r2 (s − 7) + r(6 − 10s) + 21, s > 0. Find the roots of the equation P3 (0, r) = 0 and show that the negative root r1 (s) = −ρ− (s) → 0 and the positive roots r2 (s) = ρ+ (s) < r3 (s) stay positive as s → 0: s→0
ρ+ (s) = r2 (s) → 1, r3 (s) → 6. Express ϕ± (s, α ) via roots found. Find the distribution of ζ + and the ruin probability Ψ (u). 20.7. Find the first two ruin functions Φ1,2 (u, x) for the process from Problem 20.6 taking into account that 1 −3x e + e−7x , x > 0, 2 24 −x 1 −6x + P(ζ > x) = e + e , x > 0. 35 35 F(x) =
20 Basic functionals of the risk theory
345
20.8. Find the characteristic function of the absolute maximum ζ + for the process ζ (t) from Problem 20.6 using the Pollaczek–Khinchin formula (see (20.50) and (20.51)) and show that the denominator of the characteristic +
Eeiαζ =
(3 − iα )(7 − iα ) 2 7 (3 − iα )(7 − iα ) − 3(5 − iα )
coincides with P2 (0, r) after the substitution r = iα . As a result the identity of the last characteristic function obtained by the Pollaczek– Khinchin formula and the characteristic function for ζ + obtained from Problem 20.6 using factorization is assigned. 20.9. Let ζ (t) = S(t)−C(t) be a risk process with exponentially distributed premiums S(t) =
N1 (t)
∑
ξk , C(t) =
k=0
N2 (t)
∑ ξk ,
k=0
where N1,2 (t) are independent Poisson processes with λ1 = λ2 = 1,
ϕ1 (α ) = Eeiαξk =
1 b , ϕ2 (α ) = Eeiαξk = . 2 (1 − iα ) b − iα
Prove that ψ (α ) = λ1 (ϕ1 (α ) − 1) + λ2 (ϕ2 (−α ) − 1),
ϕ (s, α ) = and find
s s − ψ (α )
ϕ± (s, α ) = Eeiαζ
± (θ
s)
.
20.10. Consider the process from Problem 20.9 (given that b = 1/14, m = 2 − b−1 = −12 < 0) taking into account that s−1 ρ− (s) → |m|−1 = 1/12, s−1 ρ− (s) → s→0
(b|m|)−1 = 7/6. Show that
ϕ+ (α ) = lim
s→0
s→0
(1 − iα )2 s (1 − iα )2 = b|m| . p− (s) P2 (s, iα ) P2 (0, iα )
Show that the fractional rational characteristic function ϕ+ (α ) allows the decomposition 3 27 25 1 1 . (20.76) ϕ+ (α ) = + + 7 41 1 − 4iα 287 7iα − 12 Invert (20.76) in α and find the distribution for ζ + and Ψ (u). Find the densities φ1,2 (x), using the formula (20.67) which follows from (20.64) after inversion on z, and (20.68). 20.11. For the above problem find the moment generating function ζ + using the Pollaczek–Khinchin formula (20.51), taking into account the equalities F 1 (x) = (1 + x)e−x , μ1 = 0∞ F 1 (x) dx = 2.
346
20 Basic functionals of the risk theory
20.12. Calculate the ruin functions for the process from Problem 20.9 for u = 0 and m = 2 − b−1 < 0, using formula (20.73) and the relations F 1 (x) = (1 + x)e−x and F 1 (x) = (2 + x)e−x . 20.13. Find the moment generating function T (0) for the duration of the first “red period” for the ruin process ζ (t) from Problem 20.6 (given u = 0, m < 0) using the formula (20.50) for the moment generating function of γ + (0): g+ (0, z) = Ee−zγ
+ (0)
1Iζ + >0 = q+ ϕ0 (z).
Use formula (20.62). 20.14. Consider the process ζ (t) from Problem 20.9. The moment generating function for γ + (0) is determined by the formula (20.50). Find the moment generating function of the first duration of “red period” given u = 0, b = 1/14 (m = −12), using the formula (20.65). For the calculation of the moment generating function ϕ0 (z) it should be taken into account that F 1 (x) = (1 + x)e−x , x > 0, μ = 2,
ϕ0 (z) =
1 2
∞ 0
e−zx F 1 (x) dx.
20.15. Find the moment generating function of γ + (u) for the risk process from Problem 20.6 (given u > 0, λ = 3, m = −2/7 < 0) . Use the solution of Problem 20.7 for Φ1 (u, x). Use the obtained expression g+ (u, z) = Ee−zγ
+ (u)
1Iτ + (u)<∞ =
∞ 0
e−xz d Φ1 (u, x)
to determine the moment generating function of the total deficit Z + (u) by the formula (20.66). 20.16. Consider the risk process with random premiums from Problem 20.9 (given u > 0, b = 1/14, m = −12 < 0). Find the moment generating function of γ + (u) using (20.64) and the relations F(x) = (1 + x)e−x , (x > 0), P+ (y) =
27 −y/4 25 −12y/7 e e − , (y > 0). 41 287
Or, in order to do it, you can use the density φ1 (u, x) found in Problem 20.10 and calculate ∞ e−zx φ1 (u, x)dx. g+ (u, z) = 0
Use this relation for g+ (u, z) in order to determine the moment generating function of the total deficit Z + (u) by the formula (20.66). 20.17. The risk process ζ (t) = S(t) −Ct δ = (λ μ )−1 (C − λ μ ) > 0 has the cumulant
20 Basic functionals of the risk theory ∞
k(z) :=
1 ln Ee−zζ (t) = C z + λ ( f (z) − 1), t
f (z) = Ee−zξ1 =
Rewrite it as the queueing process η (t) with cumulant k1 (z) = z + λ1 ( f (z) − 1), λ1 = λ C−1 , 1 − f (z) = z
∞
347
e−zx dF(x).
0
e−zx F(x) dx.
0
Investigate the virtual time waiting process w(t) (w(0) = 0) for η (t) using formula (z, s) := Ee−zw(θs ) by rewriting this expectation via the probability ω (11.3). Find ∞ −su p0 (s) = s 0 e P0 (u) du of the system to be free of claims in the exponentially distributed moment of time θs . Using the boundary transition as z → ∞ show that the atomic probability of w(θs ) being in 0 is positive: (z, s) = p0 (s) > 0. P(w(θs ) = 0) = p+ (s) = lim ω z→0
20.18. It can be identified based on the Figure 1 (page 360) that d θ1 = τη− (−ξ1 ), τη− (−x) = sup{t| η (t) < x}, x < 0.
for the queueing process η (t) from the previous problem. Find the moment generating function θ1 using the average in ξ1 and prove that
−
π (s) = Ee−sθ1 1Iθ <∞ = Ee−sτη (−ξ1 ) 1Iτη− (−ξ1 )<∞ = f (ρ (s)), 1
where ρ (s) is the positive root of the Lundberg equation k1 (ρ ) = s. 20.19. Consider the process η (t) from Problem 20.17 given δ > 0. Find the moment generating function for the distribution of stationary waiting time w∗ = limt→∞ w(t); that is, find ω∗ (z) = Ee−zw∗ . 20.20. Prove the identity of the moment generating functions for w(θs ), η + (θs ), and −ξ-1− (θs ), and also for w∗ , η + and −ξ-1− using the remark of Theorem 20.5 and notations ξ (t) = −ζ (t) (ξ1 (t) = −η (t)) within Problem 20.17. The Pollaczek– Khinchin formula can be established in such a way for the moment generating function of the virtual waiting time: p+ ∞ . Ee−zw∗ = −1 −zx F(x) dx 1 − q+ μ 0 e Show the following relations for the risk process ζ (t). Ee−zζ
+ (θ
s)
+
(Cz, s), Ee−zζ = ω∗ (Cz), ω (z, s) = Ee−zw(θ1 ) . =ω
20.21. Prove that the cumulant for the risk process ζ (t) from Problem 20.9 is the fractional rational function of the form k(r) = ψ (α )|iα =r = r ·
λ1 (2 − r)(b + r) − λ2 (1 − r)2 . (1 − r)2 (b + r)
Calculate the first three moments mk = Eζ (1)k (k = 1, 3) using (20.74) and (20.75) for λ1 = λ2 = 1, b = 1/14.
348
20 Basic functionals of the risk theory N(t)
20.22. Consider the risk process ζ (t) = ∑k=1 ξk − t (c = 1) with characteristic function of claims δ2 ϕ (α ) = Eeiαξk = , δ > 0, (δ − iα )2 Find m = Eξ (1), ψ (α ), ϕ (s, α ), ϕ± (s, α ), and write the main factorization identity. Using the second factorization identity (20.23) find the moment generating function for pairs {τ + (θμ ), γ + (θμ )}, {τ + (x), γ + (x)}. If m = 2λ δ −1 − 1 < 0(λ < δ /2) find the characteristic function for ϕ+ (α ) and compare it with one determined by the Pollaczek–Khinchin formula. Find the distribution function of ζ + if λ = δ /4. 20.23. Consider the process ζ (t) from the previous problem with λ = δ /4. Find g+ (u, z) with help of formula (20.61), d+ (α , μ ) using formula (20.63), gu (s) with help of formula (20.62), and κu (z) using formula (20.66). 20.24. Using the equality (20.57) with m < 0, prove for the process ζ (t) from the formula (20.10) the following equality Eγ + (u)1Iζ + >u = where F-3 (u) =
∞ u
∞ 0
Φ1 (u, x)dx =
λλ F3 (u) + c |m|
u 0
F-3 (u − z)dP+ (z), u ≥ 0,
F(x)dx is the tail of the third order of the d.f. F(x), x > 0.
20.25. Using the equality (20.67) with m < 0, prove for the process ζ (t) from the formula (20.49) the following equality Eγ + (u)1Iζ + >u = where F ∗ (u) =
∞ u
∞ 0
xφ1 (u, x)dx = F ∗ (u) +
λ b|m|
u 0
F ∗ (u − z)dP+ (z), u ≥ 0,
F ∗ (x)dx, F ∗ (x) = F(x) + bF(x), x > 0.
Hints 20.4. To obtain the simplified relation for the first duration of “red period” it is sufficient to calculate the corresponding integral taking into account that for m < 0, ρ+ = c|m|, p+ = (c − λ )/c, ρ+ = cp+ , u
∞ λ q+ ρ+ e−c(u+x−z)−ρ+ z dz |m| 0 0 u ecq+ z dz = λ e−cp+ u−cx − e−c(u+x) . = λ cq+ ρ+ e−c(u+x)
λ |m|
e−c(u+x−z) dP+ (z) =
0
The negative part in the mentioned integral compensates the first term of the first density P(γ + (u) < x, ζ + > u) and, thus, its simple exponential expression is fulfilled. It implies that the moment generating function for the security of ruin γ + (u) has a form
20 Basic functionals of the risk theory
g+ (u, z) = Ee−zγ
+ (u)
1Iζ + >u =
349
cq+ −ρ+ u e . c+z
Thus, the moment generating function T (u) according to (20.62) is determined by gu (s) = g+ (u, ρ− (s)). The dual relations for the second and third densities can be simplified in a similar way. 20.5. Use the moment generating function g+ (u, z) from Problem 20.4 and the formula (20.66). 20.6. The process ζ (t) is lower continuous. The fractional rational expressions for ψ (α ) and ϕ (s, α ) can be found calculating the cumulant ψ (α ) = −iα + λ (ϕ (α ) − 1). The lower continuity of ζ (t) implies that
ϕ− (s, α ) =
ρ− (s) , P(ζ − (θs ) < x) = eρ− (s)x , x < 0. ρ− (s) + iα
Dividing P3 (s, r) by (r + ρ− (s)) we obtain that P2 (s, r) = r2 + r(s − 7 − ρ− (s)) + 21sρ−−1 (s) = (ρ+ (s) − r)(r+ (s) − r).
Furthermore, the main factorization identity implies that
ϕ+ (s, α ) =
s (3 − iα )(7 − iα ) . ρ− (s) P2 (s, iα )
Because m < 0, s−1 ρ− (s) → |m|−1 = 7/2, r2 (s) = ρ+ (s) → ρ+ = 1, r3 (s) → 6 s→0
s→0
then
ϕ+ (α ) = lim ϕ+ (s, α ) = s→0
s→0
2 (3 − iα )(7 − iα ) . 7 P2 (0, iα )
Decompose the fractional rational function of the second order into fractionally linear parts (because P2 (0, r) can be decomposed as (r − 1)(r − 6)) and invert ϕ+ (α ). 20.7. Before calculating the first two ruin functions
Φ1 (u, x) := P(γ + (u) > x, ζ + > u), x ≥ 0, Φ2 (u, x) := P(γ+ (u) > x, ζ + > u), x ≥ 0, use the formulas (20.57) and (20.58) for x = 0 and corresponding conditions from the previous problem: λ = 3, c = 1, m = −2/7, p+ = 2/7, 1 −3x 1 1 −3x 1 −7x e + e−7x , F(x) = e + e , x > 0, F(x) = 2 2 3 7 P+ (z) =
24 −z 6 −6z e + e , z > 0, 35 35
350
20 Basic functionals of the risk theory
and show that
Φ1 (u, 0) = Φ2 (u, 0) = P+ (u) =
24 −u 1 −6u e + e , u > 0. 35 35
The ruin functions Φ1 (u, x) and Φ2 (u, x) can be found by formulas (20.57) and (20.58). 20.9. Find ψ (α ) and show (for m = 2 − b−1 = (2b − 1)/b) that
ϕ (s, α ) =
s(b + iα )(1 − iα )2 , P3 (s, iα )
P3 (s, r) = r3 (s + 2) + r2 (s(b − 2) + b − 4) − r(s + 1)mb + bs = 0. The negative root r1 (s) = −ρ− (s) of the cubic equation P3 (s, r) = 0 can be used for the determination of ϕ− (s, α ), and P2 (s, r) = P3 (s, r)(r + ρ− (s))−1 = r2 (s + 2) + (b − 4 + s(b − 2) − (2 + s)ρ− (s))r + bsρ−−1 (s) for ϕ+ (s, α ). 20.10. For calculation φ1,2 (u, x) by the formulas (20.67)–(20.68) it should be taken into account that under conditions of the problem λ 1 7 1 = , λ = λ1 + λ2 = 1, p = F(0) = , b = , m = −12, 2 14 b|m| 31 1 1 F(x) = (1 + x)e−x , F (x) = xe−x , x > 0, 2 2 15x + 1 −x e , x > 0, F∗ (x) = F (x) + bF(x) = 28 1 25 P+ (y) = 1 − (27e−y/4 − e−12y/7 ), y > 0. 41 7
Notice that for y = u (see (16.47) in [33]) P(γ+ (u) = u, ζ+ > u) = pF¯1 (u) > 0 in order to calculate φ2 (u, y).
Answers and Solutions 20.1. It was mentioned before that ξ (t) is both an upper and lower continuous process for which 2s ϕ (s, α ) = Eeiαξ (θs ) = , (20.77) 2s − 2iα a − (iα )2 σ 2 and the characteristic functions of ξ ± (θs ) can be expressed by formulas (20.30) and (20.33). According to (20.77), the Lundberg equation can be reduced to a quadratic one: σ 2 r2 + 2ar − 2s = 0, r1,2 (s) = ±ρ± (s),
20 Basic functionals of the risk theory
351
√ 2sσ 2 + a2 ∓ a ρ± (s) = , s ≥ 0. σ2 Thus, the main factorization identity and its components have the form:
ϕ (s, α ) = ϕ+ (s, α )ϕ− (s, α ), ϕ± (s, α ) =
ρ± (s) , s ≥ 0. ρ± (s) ∓ iα
(20.78)
The characteristic functions from (20.78) can be easily inverted in α and the densities of ξ ± (θs ) can be found:
∂ P(ξ ± < x) = ρ± (s)e∓ρ± (s)x , (±x > 0). (20.79) ∂x (1) For a < 0 we have that ρ+ (s) → ρ+ = 2|a|σ −2 > 0, ρ− (s) → 0. So, the p± (s, x) =
s→0
s→0
characteristic function and the distribution of ξ + are +
ϕ+ (α ) := Eeiαξ =
ρ+ , P(ξ + > x) = e−ρ+ x , x ≥ 0, ρ + − iα
(20.80)
P(ξ − = −∞) = 1. √ (2) For a = 0 we have that ρ± (s) = 2s/σ → 0, P(ξ ± = ±∞) = 1. s→0
(3) For a > 0 we have that ρ− (s) → ρ− = 2aσ −2 > 0, ρ+ (s) → 0 therefore s→0
P(ξ + = +∞) = 1, −
ϕ− (α ) := Eeiαξ =
s→0
ρ− , P(ξ − < x) = eρ− x , x ≤ 0. ρ − + iα
(20.81)
Because the process ξ (t) = at + σ W (t) is continuous then P(γk (x) = 0) = 1, (k = 1, 3). The formula (20.62) implies Ee−sT (x) 1Iτ + (x)<∞ = Ee−ρ+ (s)γ
+ (x)
1Iτ + (x)<∞ = P(τ + (x) < ∞).
Thus, P(T = 0/τ + (x) < ∞) = 1; that is, T is a degenerating random variable. The moment generating function of the total duration of being over the level x, ∞ that is, 1Iξ (t)>x dt, Dx (μ ) = Ee−μ Qx (∞) , Qx (∞) = 0
for m = a < 0 is determined, according to (20.63), by the integral transform d+ (α , μ ) =
∞ 0
eiα x dx Dx (μ ) =
ϕ+ (α ) . ϕ+ (μ , α )
Furthermore, the following relation holds true 2|a| ρ+ ρ+ (μ ) − iα d+ (α , μ ) = , ρ+ = 2 , ρ+ ( μ ) = ρ + ( μ ) ρ + − iα σ Inverting it in α we can find the moment generating function
(
(20.82)
2μσ 2 + a2 − a . σ2
352
20 Basic functionals of the risk theory
Dx (μ ) = 1 −
ρ+ (μ ) − ρ+ −ρ+ x e , x > 0. ρ+ ( μ )
(20.83)
The integral transform for Dx (s, μ ) = Ee−μ Qx (θs ) (x ≥ 0) can be defined in a similar way to (20.82), according to (2.70) in [7]: d+ (s, α , μ ) = =
∞ 0
eiα x dx Dx (s, μ )
ϕ+ (s, α ) ρ+ (s) ρ+ (s + μ ) − iα = . ϕ+ (s + μ , α ) ρ+ (s + μ ) ρ+ (s) − iα
(20.84)
After inversion in α we can find, similarly to (20.83), that Dx (s, μ ) = 1 −
ρ+ (s + μ ) − ρ+ (s) −ρ+ (s)x e , x ≥ 0. ρ+ (s + μ )
For x = 0 D0 (s, μ ) = Ee−μ Q0 (θs ) =
(20.85)
ρ+ (s) . ρ+ (s + μ )
√ 2s/σ . So, for ξ (t) = σ w(t) we get 3 ∞ s s e−st Ee−μ Q0 (t) dt = Ee−μ Q0 (θs ) = . s+μ 0
If a = 0, then ρ± (s) =
After inversion in s we obtain the well-known result for the distribution Q0 (t): 3 2 x P(Q0 (t) < x) = arcsin , (0 ≤ x ≤ t). π t 20.2. It was mentioned above that the considering process is both upper and lower almost semicontinuous. We have c (c − 1)iα − (iα )2 1 , + 1− = −ψ (α ) = 1 − 1 + iα c − iα (1 + iα )(c − iα ) (20.86) (c − iα )(1 + iα ) s = . ϕ (s, α ) = s − ψ (α ) (s − 2)α 2 + (1 − c)(1 − s)iα + sc Hence, the characteristic function ϕ (s, α ) is a rational function of the second order. Let us decompose ϕ (s, α ) into a product of a fractional linear multipliers that determine ϕ± (s, α ). After substitution r = iα (making the denominator in (20.86) equal to 0) we obtain the Lundberg equation which is quadratic: −(2 + s)r2 + rcm(1 − s) − sc = 0, m = (1 − c)c−1 .
(20.87)
It follows for s = 0 that 2r2 + rcm = 0 and the roots are r0 = 0, r10 = −cm/2, (r10 > 0 if m < 0, r10 < 0 if m > 0). If s > 0 it is possible to find the roots of the equation (20.87). In particular, for ( m = 0 the roots are of very simple form: r1,2 (s) = ± s/(2 + s) = ±ρ± (s). If m = 0
20 Basic functionals of the risk theory
353
then the roots are r1 (s) = |r2 (s)|, ρ+ (s) = r1 (s), ρ− (s) = −r2 (s). For s > 0 these roots determine the characteristic function for the distribution of ζ ± (θs ), according to (20.35) and (20.37),
ϕ+ (s, α ) =
p+ (s)(c − iα ) , ρ+ = cp+ (s), ρ+ (s) − iα
P(ζ + (θs ) = 0) = p+ (s),
ϕ− (s, α ) =
p+ (s)p− (s) =
s , s+λ
(20.88)
p− (s)(1 + iα ) , ρ− (s) = p− (s) = P(ζ − (θs ) = 0). ρ− (s) + iα
If m = (1 − c)/c < 0, that is equivalent to c > 1, then ρ+ = c|m|/2 = (c − 1)/2 > 0, thus + p+ (c − iα ) ϕ+ (α ) = Eeiαζ = , ρ+ = cp+ . (20.89) ρ + − iα If m > 0 , that is equivalent to c < 1, then ρ− = cm/2 = (1 − c)/2 > 0, thus −
ϕ− (α ) = Eeiαζ =
p− (1 + iα ) , ρ− = p− . ρ − + iα
(20.90)
According to (20.23) it is easy to calculate the joint moment generating function of {τ + (x), γ + (x)}: Ee−sτ
+ (θ )−uγ + (θ ) μ μ
1Iτ + (θμ )<∞ =
μ q+ (s) c , ρ+ (s) + μ c + u
which after inversion in μ is Ee−sτ
+ (x)−uγ + (x)
1Iτ + (x)<∞ = q+ (s)e−ρ+ (s)x
c . c+u
(20.91)
This implies the following relation (after the inversion in u), Ee−sτ
+ (x)
1Iτ + (x)<∞ P(γ + (x) > z) = q+ (s)e−ρ+ (s)x e−cz , z > 0,
(20.92)
Thus, the overjump γ + (x) follows the exponential distribution with the same parameter c > 0 as claims ξk . If s → 0 then it follows from (20.90) that g+ (x, z) = Ee−zγ
+ (x)
1Iτ + (x)<∞ =
cq+ −ρ+ x e , c+z
and according to the formula (20.65) we could find Ee−sT (x) 1Iτ + (x)<∞ = q− (s)g+ (x, ρ− (s)) = q− (s)
cq+ e−ρ+ x . c + ρ− (s)
If m < 0 then the moment generating function Qx (∞) is determined in a similar way to the (20.79) and (20.80) relation but with other values for the roots ρ+ and ρ+ (μ ). They can be determined by the equation (20.87) for s = 0 and s = μ .
354
20 Basic functionals of the risk theory
20.3. Let us mention that the process ζ (t) is lower continuous and almost upper semicontinuous. Its cumulants ψ (α ) and ϕ (s, α ) can be easily calculated: (iα )2 + cmiα λ iα − iα = , c − iα c − iα s(c − iα ) s = ϕ (s, α ) = . s − ψ (α ) cs − iα (s + cm) − (iα )2
ψ (α ) = λ (ϕ (α ) − 1) − iα =
The Lundberg equation can be reduced to a quadratic one r2 + (s + cm)r − cs = 0, (Ds = (s + cm)2 + 4cs > 0), with roots ±r1,2 (s) > 0 r1,2 (s) =
√ 1 −(s + cm)2 ± Ds , ρ+ (s) = r1 (s), ρ− (s) = −r2 (s). 2
These roots determine the components of the main factorization identity (ρ+ (s) = cp+ (s) < c)
ϕ± (s, α ) =
p+ (s)(c − iα ) ρ− (s) , ϕ− (s, α ) = . ρ+ (s) − iα ρ− (s) + iα
(1) For m < 0 we have that ρ+ (s) → ρ+ = c|m| > 0, ρ− (s) → 0, thus P(ζ − = − s→0
∞) = 1 and +
ϕ+ (α ) = Eeiαζ =
s→0
p+ (c − iα ) , P(ζ + > x) = q+ e−ρ+ x , x > 0. ρ + − iα
So, the ruin probability for m < 0 and u > 0 is equal to
Ψ (u) = P(ζ + > u) = q+ e−ρ+ u . (2) For m > 0 we have that ρ+ (s) → 0, ρ− (s) → ρ− = cm = cp− , thus s→0
P(ξ + = ∞) = 1, and −
ϕ− (α ) = Eeiαζ =
s→0
ρ− , P(ζ − < x) = eρ− x , x ≤ 0. ρ − + iα
(3) For m = 0 we have that ρ± (s) → 0, thus P(ζ ± = ±∞) = 1. s→0
Because F(x) = e−cx , then the following relations take place according to the formulas (20.57)–(20.58) for the marginal densities of the ruin functions (for γ1 (x) = γ + (x), γ2 (x) = γ+ (x) and γ3 (x) = γx+ , respectively):
20 Basic functionals of the risk theory
∂ P(γ + (u) < x, ζ + > u) ∂x u λ e−c(u+x−z) dP+ (z), = λ e−c(x+u) + |m| 0 ∂ P(γ+ (u) < y, ζ + > u) φ2 (u, y) := ∂y λ −cy e P+ (u), P+ (u) = P(ζ + < u), y > u, = |m| λ −cy P(u − y < ζ + < y), 0 < y < u, |m| e
355
φ1 (u, x) :=
(20.93)
∂ φ3 (u, z) := P(γu+ < z, ζ + > u) ∂z λ c −cz u z > u, 0 (z − u + y) dP+ (y), |m| e = λ c −cz 0 −z (z + y) dP+ (y + u), 0 < z < u. |m| e 20.4. φ1 (u, x) = (∂ /∂ x)P(γ + (u) < x, ζ + > u) = λ e−ρ+ u−cx , x > 0;
∂ P(γ+ (u) < y, ζ + > u) ∂y ⎧ ⎨ λ e−cy (1 − e−ρ+ u ) , |m| = λ ⎩ |m| q+ e−cy e−ρ+ (u−y) − e−ρ+ u ,
φ2 (u, y) =
y > u, 0 < y < u;
∂ P(γu+ < z, ζ + > u) ∂ z ⎧ ⎨ λ cq+ e−cz (1 − e−ρ+ u ) (z + ρ+−1 ) − u , |m| = λ cq+ −cz −1 −ρ (u−z) ⎩ |m| e ρ+ e + − e−ρ+ u (z + ρ+−1 )
φ3 (u, z) =
g+ (u, z) =
z > u, 0 < z < u;
cq+ −ρ+ z cq+ e e−ρ+ u . , gu (s) = c+z c + ρ− (s)
20.5. Ee−zZ
+ (u)
1Iζ + >u =
cp+ q+ −(ρ+ +z)u e . ρ+ + z
20.6.
ϕ− (s, α ) =
s ρ− (s) (3 − iα )(7 − iα ) . , ϕ+ (s, α ) = ρ− (s) + iα ρ− (s) (ρ+ (s) − iα )(r2 (s) − iα )
24 1 1 1 6 ; + , ρ− (s)s−1 → ρ− (0) = s→0 35 1 − iα 35 6 − iα |m| 24 1 Ψ (u) = P(ζ + > u) = e−u + e−6u . 35 35
ϕ+ (α ) =
356
20.7.
20 Basic functionals of the risk theory
3 −7x 1 −3x 3 3 1 e − e , x ≥ 0. Φ1 (u, x) = e−u e−3x + e−7x + e−6u 5 7 10 7 3 ⎧ −u 3 1 −3y 1 −7y ⎪ 24e − e−6u , y > u, ⎨ 20 3 e + 7 e 3 −7y Φ2 (u, y) = 10 e−u 6e−2y + 2e−6y − 4e−3y − 12 7 e ⎪ ⎩ −6u 1 −y 1 3y −3y − 1 e−7y , 0 < y < u. +e 2e − 6 e +e 14
20.9.
p− (s)(b + iα ) s (1 − iα )2 . , ϕ+ (s, α ) = ρ− (s) + iα ρ− (s) P2 (s, iα )
ϕ− (s, α ) = 20.10.
1 25 −12u/7 −u/4 27e , Ψ (u) = P(ζ > u) = − e 41 7 +
φ1 (u, x) =
e−x 25 9 (3x − 4)e−−12u/7 + (5x + 7)e−u/4 , x ≥ 0; 41 7 4
If y = u P(γ2 (u) = u, ζ + > u) = 12 (1 + u)e−u , if y = u ⎧ 1 27 −u/4 25 −12u/7 −y ⎪ , y > u; + 287 e ⎨ 12 (1 + y)e 1 − 41 e −y 25 −12u/7 1 e 243 −u/4 φ2 (u, y) = 12 (1 + y) 41 7 e − 27e + 2 e−(u−y)/4 ⎪ ⎩ 625 −12(u−y)/7 , 0 < y < u. − 7 e 20.11. +
Ee−zζ =
2p+ (1 + z)2 . 2p+ + 2z2 + (4 − q+ )z
20.12.
Φ1 (0, x) = p(F 1 (x) + bF 1 (x)) 1 4 1 = e−x (1 + x + b(2 + x)) → (1 + 2b) |b=1/14 = ; x→0 2 2 7 1 1 1 Φ2 (0, y) = pbF 1 (y) = b(2 + y)e−y → b |b=1/14 = ; y→0 2 2 14 1 1 P(ζ + > 0, γ+ (0) = 0) = , φ2 (0, 0) = . 2 14 Φ3 (0, z) = p F 1 (z) + b
∞ z
y dF1 (y)
4 1 1 = e−z (1 + z + b(z2 + 2z + 2)) → (1 + 2b) |b=1/14 = . z→0 2 2 7 20.13. −zT (0)
g0 (s) = Ee
3 1Iζ + >0 = 2
1 1 + . 3 + ρ− (s) 7 + ρ− (s)
20 Basic functionals of the risk theory
20.14. g0 (s) = Ee−sT (0) 1Iζ + >0 = q+ q− (s) 20.15. g+ (u, z) =
357
2 + ρ− (s) . 2(1 + ρ− (s))2
3 4(12 + 3z)e−u + (2 − z)e−6u . 10 (3 + z)(7 + z)
20.16. g+ (u, z) =
1 1 63 −u/4 100 −12u/7 15 5 −12u/7 3 −u/4 e e e . − + + e 41 1 + z 4 7 1+z 7 4
(z, s) = (s − k1 (z))−1 (s − z p0 (s)), where p0 (s) is defined in (11.5). 20.17. ω −1 (z, s) = p+ 1 − λ1 0∞ e−zx F(x) dx , 20.20. ω∗ (z) = lims→0 ω p+ = P(w∗ = 0) = 1 − λ1 μ , q+ = 1 − p+ = λ1 μ . 20.22. m = λ μ − 1 = 2λ δ −1 − 1, 2 α )2 ψ (α ) = (δ λ−iδα )2 −iα ; ϕ (s, α ) = s(Pδ −i (s,α ) , 3
r2 +(s−2δ + λ − ρ− (s))r +sδ 2 ρ−−1 (s), ρ− (s) α )2 ϕ+ (s, α ) = ρ−s(s) (Pδ −i ρ− (s)+iα , (s,i α) . 2
P3 (s, r) = P2 (s, r)(ρ− (s)+r),
ϕ (s, α ) = ϕ+ (s, α )ϕ− (s, α );
P2 (s, r) =
ϕ− (s, α ) =
For λ < δ /2 (m < 0)
ϕ+ (α ) = lim ϕ+ (s, α ) = |m| s→0
(δ − iα )2 , P2 (0, iα )
P2 (0, r) = r2 + (λ − 2δ )r + |m|δ 2 = (r − r1 )(r − r2 ), ( 1 r1,2 = (2δ − λ ∓ λ (4δ + λ )) > 0. 2 After decomposition of ϕ+ (α ) into linear-fractional functions we obtain (r1 − δ )2 1 (r2 − δ )2 1 . ϕ+ (α ) = |m| 1 + + r 2 − r 1 r 1 − iα r 1 − r 2 r 2 − iα Thus, the distribution of ζ + can be expressed via e−xr1,2 inverting the previous rewe can lation on α . According to (20.52) and using the Pollacek–Khinchin formula √ 1 obtain the similar result for ϕ+ (α ) if λ = δ /4 p+ = q+ = 2 , r1,2 = (7∓ 17)δ /8 . Inverting it on α we obtain $ % √ √ 5 1 5 + −(7− 17)δ x/8 −(7+ 17)δ x/8 √ √ 1+ P{ζ > x} = . + 1− e e 4 17 17
A Appendix
D. Gusak et al., Theory of Stochastic Processes, Problem Books in Mathematics, c Springer Science+Business Media, LLC 2010 DOI 10.1007/978-0-387-87862-1,
359
360
A Appendix
A Appendix
361
362
A Appendix
A Appendix
363
364
A Appendix
List of abbreviations c`adl`ag c`agl`ad cdf CLT HMF HMP i.i.d. i.i.d.r.v. pdf r.v. SDE SLLN
Right continuous having left-hand limits (p. 24) Left continuous having right-hand limits (p. 24) Cumulant distribution function Central limit theorem Homogeneous Markov family (p. 179) Homogeneous Markov process (p. 176) Independent identically distributed Independent identically distributed random variables Probability density function Random variable Stochastic differential equation Strong law of large numbers
List of probability distributions Be(p) Bi(n, p) Geom(p) Pois(λ ) U(a, b) N(a, σ 2 ) Exp(λ ) Γ (α , β )
Bernoulli, P(ξ = 1) = p, P(ξ = 0) = 1 − p Binomial, P(ξ = k) = Cnk pk (1 − p)n−k , k = 0, . . . , n Geometric, P(ξ = k) = pk−1 (1 − p), k ∈ N Poisson, P(ξ = k) = (λ k /k!)e−λ , k ∈ Z+ Uniform on (a, b), P(ξ ≤ x) = 1 ∧ ((x − a)/(b − a))+ , x ∈ R x −(y−a)2 /2σ 2 Normal (Gaussian), P(ξ ≤ x) = (2πσ 2 )−1/2 −∞ e dy, x∈R Exponential, P(ξ ≤ x) = [1 − e−λx ]+ , x ∈ R Gamma, P(ξ ≤ x) = (β α /Γ (α )) 0x yα −1 e−β y dy, x ∈ R+
A Appendix
List of symbols aξ aX Aϕ B(X) B(X) C([0, T ]) C([0, +∞)) 2 (Rm ) Cuni 2 Cfin C(X, T) c0 cap cov(ξ , η ) D([a, b], X) DXa DF ∂A Eμ F F F [−1] Ft+ Ft− FtX,0 FtX Fτ FX (n) fi j F ∗n F ∗0 Hp H(X) Hk (X) HSN Hγ HΛ I[a,b] ( f ) Iab It ( f ) Iξ L(t, x)
12 11 178 177 3 241 241 180 180 2 241 233 11 24 253 242 242 45,108 329 339 4 21 21 21 21 71 21 138 161 161 89 129 129 254 251 129 193 252 193 110 195
L∗ Lˆ2 ([a, b]) Lˆ2 L cl ∞ L∞ ([0, T ]) p L p ([0, T ]) l.i.m. Lip M M c [M] M, N [M, N] Mc Md M∗ M Mloc M2 2 Mloc M 2,c M 2,d + M +loc M +2 M
M τn NP N(a, B) P(s, x,t, B) pi j pi j (t) (n) pi j PtX1 ,...,tm Rλ RX RX,Y Rξ Tt f Uf
181 193 193 129 241 241 241 241 38 251 74 75 75 74 74 74 74 89 73 73 73 73 74 74 89 89 74 73 21 59 176 137 139 137 1 177 11,107 11 12 177 110
365
366
A Appendix
XH Xnr Xns XT (X, X) ZX βN (a, b) Δs M d ϑ N (S) Λ
61 129 129 2 1 108 76 75 255 129,244
τΓ τΓ Φ Φ¯ Φ −1 φξ φtX1 ,...,tm (Ω , F, P) # ⇒
λ 1 |[0,1] λ 1 |R+ πHΛ
4 44 129
→
d
α ∈A Gα
5 5 261,282 288 282 12 12 1 3 242 242 21
References
367
References 1. Asmussen S (2000) Ruin Probability. World Scientist, Singapore 2. Bartlett MS (1978) An Introduction to Stochastic Processes with Special Reference to Methods and Applications. Cambridge University Press, Cambridge, UK. 3. Bertoin J (1996) Levy Processes. Cambridge University Press, Cambridge, UK. 4. Billingsley P (1968) Convergence of Probability Measures. Wiley Series in Probability and Mathematical Statistics, John Wiley, New York 5. Bogachev VI (1998) Gaussian Measures. Mathematical Surveys and Monographs, vol. 62, American Mathematical Society, Providence, RI 6. Borovkov AA (1976) Stochastic Processes in Queueing Theory. Springer-Verlag, Berlin 7. Bratijchuk NS, Gusak DV (1990) Boundary problems for processes with independent increments [in Russian]. Naukova Dumka, Kiev 8. Brzezniak Z, Zastawniak T (1999) Basic Stochastic Processes. Springer-Verlag, Berlin 9. Bulinski AV, Shirjaev AN (2003) Theory of Random Processes [in Russian]. Fizmatgiz, Laboratorija Bazovych Znanij, Moscow 10. B¨uhlmann H (1970) Mathematical Methods in Risk Theory Springer-Verlag, New-York 11. Chaumont L, Yor M (2003) Exercises in Probability: A Guided Tour from Measure Theory to Random Processes, Via Conditioning. Cambridge University Press, Cambridge, UK 12. Chung KL (1960) Markov Chains with Stationary Transition Probabilities. Springer, Berlin 13. Chung KL, Williams RJ, (1990) Introduction to Stochastic Integration. Springer-Verlag New York, LLC 14. Cram´er H, Leadbetter MR (1967) Stationary and Related Stochastic Processes. Sample Function Properties and Their Applications. John Wiley, New York 15. Doob JL (1990) Stochastic Processes. Wiley-Interscience, New York 16. Dorogovtsev AY, Silvesrov DS, Skorokhod AV, Yadrenko MI (1997) Probability Theory: Collection of Problems. American Mathematical society, Providence, RI 17. Dudley, RM (1989) Real Analysis and Probability. Wadsworth & Brooks/Cole, Belmont, CA 18. Dynkin EB (1965) Markov processes. Vols. I, II. Grundlehren der Mathematischen Wissenschaften, vol. 121, 122, Springer-Verlag, Berlin 19. Dynkin EB, Yushkevich AA (1969) Markov Processes-Theorems and Problems. Plenum Press, New York 20. Elliot RJ (1982) Stochastic Calculus and Applications. Applications of Mathematics 18, Springer-Verlag, New York 21. Etheridge A (2006) Financial Calculus. Cambridge University Press, Cambridge, UK 22. Feller W (1970) An Introduction to Probability Theory and Its Applications (3rd ed.). Wiley, New York 23. F¨ollmer H, Schied A (2004) Stochastic Finance: An Introduction in Discrete Time. Walter de Gruyter, Hawthorne, NY 24. Gikhman II, Skorokhod AV (2004) The Theory of Stochastic Processes: Iosif I. Gikhman, Anatoli V. Skorokhod. In 3 volumes, Classics in Mathematics Series, Springer, Berlin 25. Gikhman II, Skorokhod AV (1996) Introduction to the Theory of Random Processes. Courier Dover, Mineola 26. Gikhman II, Skorokhod AV (1982) Stochastic Differential Equations and Their Applications [in Russian]. Naukova dumka, Kiev 27. Gikhman II, Skorokhod AV, Yadrenko MI (1988) Probability Theory and Mathematical Statistics [in Russian] Vyshcha Shkola, Kiev
368
References
28. Gnedenko BV (1973) Priority queueing systems [in Russian]. MSU, Moscow 29. Gnedenko BV, Kovalenko IN (1989) Introduction to Queueing Theory. Birkhauser Boston, Cambridge, MA 30. Grandell J (1993) Aspects of Risk Theory. Springer-Verlag, New York 31. Grenander U (1950) Stochastic Processes and Statistical Inference. Arkiv fur Matematik, Vol. 1, no. 3:1871-2487, Springer, Netherlands 32. Gross D, Shortle JF, Thompson JM, Harris CM (2008) Fundamentals of Queueing Theory (4th ed.). Wiley Series in Probability and Statistics, Hoboken, NJ 33. Gusak DV (2007) Boundary Value Problems for Processes with Independent Increments in the Risk Theory. Pratsi Instytutu Matematyky Natsional’no¨ı Akademi¨ı Nauk Ukra¨ıny. Matematyka ta ¨ı¨ı Zastosuvannya 65. Instytut Matematyky NAN Ukra¨ıny, Ky¨ıv 34. Hida T (1980) Brownian Motion. Applications of Mathematics, 11, Springer-Verlag, New York 35. Ibragimov IA, Linnik YuV (1971) Independent and Stationary Sequences of Random Variables. Wolters-Noordhoff Series of Monographs and Textbooks on Pure and Applied Mathematics, Wolters-Noordhoff, Groningen 36. Ibragimov IA, Rozanov YuA (1978) Gaussian Random Processes. Applications of Math., vol. 9, Springer-Verlag, New York 37. Ibramkhalilov IS, Skorokhod AV (1980) Consistent Estimates of Parameters of Random Processes [in Russian]. Naukova dumka, Kyiv 38. Ikeda N, and Watanabe S (1989) Stochastic Differential Equations and Diffusion Processes, Second edition. North-Holland/Kodansya, Tokyo 39. Ito K (1961) Lectures on Stochastic Processes. Tata Institute of Fundamental Research, Bombay 40. Ito K, McKean H (1996) Diffusion Processes and Their Sample Paths. Springer-Verlag, New York 41. Jacod J, Shiryaev AN (1987) Limit Theorems for Stochastic Processes. Grundlehren der Mathematischen Wissenschaften, vol. 288, Springer-Verlag, Berlin 42. Johnson NL, Kotz S (1970) Distributions in Statistics: Continuous Univariate Distributions. Wiley, New York 43. Kakutani S (1944) Two-dimensional Brownian motion and harmonic functions Proc. Imp. Acad., Tokyo 20:706–714 44. Karlin S (1975) A First Course in Stochastic Processes. Second edition, Academic Press, New York 45. Karlin S (1966) Stochastic Service Systems. Nauka, Moscow 46. Kijima M (2003) Stochastic Processes with Application to Finance. Second edition. Chapman and Hall/CRC, London 47. Klimov GP (1966) Stochastic Service Systems [in Russian]. Nauka, Moscow 48. Kolmogorov AN (1992) Selected Works of A.N. Kolmogorov, Volume II: Probability theory and Mathematical statistics. Kluwer, Dordrecht 49. Koralov LB, Sinai YG (2007) Theory of Probability and Random Processes, Second edition. Springer-Verlag, Berlin 50. Korolyuk VS (1974) Boundary Problems for a Compound Poisson Process. Theory of Probability and its Applications 19, 1-14, SIAM, Philadelphia 51. Korolyuk VS, Portenko NI, Skorokhod AV, Turbin AF (1985) The Reference Book on Probability Theory and Mathematical Statistics [in Russian]. Nauka, Moscow 52. Krylov NV (2002) Introduction to the Theory of Random Processes. American Mathematical Society Bookstore, Providence, RI 53. Lamperti J (1977) Stochastic Processes. Applied Mathematical Sciences, vol. 23, Springer-Verlag, New York
References
369
54. Lamberton D, Lapeyre B (1996) Introduction to Stochastic Calculus Applied to Finance. Chapman and Hall/CRC, London 55. Leonenko MM, Mishura YuS, Parkhomenko VM, Yadrenko MI (1995) Probabilistic and Statistical Methods in Ecomometrics and Financial Mathematics. [in Ukrainian] Informtechnika, Kyiv 56. L´evy P (1948) Processus Stochastiques et Mouvement Brownien. Gauthier-Villars, Paris 57. Liptser RS, Shiryaev AN (2008) Statistics Of Random Processes, Vol. 1. Springer-Verlag New York 58. Liptser RS, Shiryaev AN (1989) Theory of Martingales. Mathematics and Its Applications (Soviet Series), 49, Kluwer Academic, Dordrecht 59. Lifshits MA (1995) Gaussian Random Functions. Springer-Verlag, New York 60. Meyer PA (1966) Probability and Potentials. Blaisdell, New York 61. Øksendal B (2000) Stochastic Differential Equations, Fifth edition. Springer-Verlag, Berlin 62. Pliska SR (1997) Introduction to Mathematical Finance. Discrete Time Models. Blackwell, Oxford 63. Port S, Stone C (1978) Brownian Motion and Classical Potential Theory. Academic Press, New York 64. Protter P (1990) Stochastic Integration and Differential Equations. A New Approach Springer-Verlag, Berlin 65. Prokhorov AV, Ushakov VG, Ushakov NG (1986) Problems in Probability Theory. [in Russian] Nauka, Moscow 66. Revuz D, Yor M (1999) Continuous martingales and Brownian Motion. Third edition. Springer-Verlag, Berlin 67. Robbins H, Sigmund D, Chow Y (1971) Great Expectations: The Theory of Optimal Stopping. Houghton Mifflin, Boston 68. Rolski T, Schmidli H, Schmidt V, Teugels J (1998) Stochastic Processes for Insurance and Finance. John Wiley and Sons, Chichester 69. Rozanov YuA (1977) Probability Theory: A Concise Course. Dover, New York 70. Rozanov YuA (1982) Markov Random Fields. Springer-Verlag, New York 71. Rozanov YuA (1995) Probability Theory, Random Processes and Mathematical Statistics. Kluwer Academic, Boston 72. Rozanov YuA (1967) Stationary Random Processes. Holden-Day, Inc., San Francisco 73. Sato K, Ito K (editor), Barndorff-Nielsen OE (editor) (2004) Stochastic Processes. Springer, New York 74. Sevast’yanov BA (1968) Branching Processes. Mathematical Notes, Volume 4, Number 2 / August, Springer Science+Business Media, New York 75. Sevastyanov BA, Zubkov AM, Chistyakov VP (1988) Collected Problems in Probability Theory. Nauka, Moscow 76. Skorokhod AV (1982) Studies in the Theory of Random Processes. Dover, New York 77. Skorokhod AV (1980) Elements of the Probability Theory and Random Processes [in Russian]. Vyshcha Shkola Publ., Kyiv 78. Skorohod AV (1991) Random Processes with Independent Increments. Mathematics and Its Applications, Soviet Series, 47 Kluwer Academic, Dordrecht 79. Skorohod AV (1996) Lectures on the Theory of Stochastic Processes. VSP, Utrecht 80. Spitzer F (2001) Principles of Random Walk. Springer-Verlag New York 81. Shiryaev AN (1969) Sequential Statistical Analysis. Translations of Mathematical Monographs 38, American Mathematical Society, Providence, RI 82. Shiryaev AN (1995) Probability. Vol 95. Graduate Texts in Mathematics, Springer-Verlag New York
370
References
83. Shiryaev AN (2004) Problems in Probability Theory. MCCME, Moscow 84. Shiryaev AN (1999) Essentials of Stochastic Finance, in 2 vol. World Scientific, River Edge, NJ 85. Steele JM (2001) Stochastic Calculus and Financial Applications. Springer-Verlag, New York 86. Striker C, Yor M (1978) Calcul stochastique dependant d’un parametre. Z. Wahrsch. Verw. Gebiete, 45: no. 2: 109–133. 87. Stroock DW, Varadhan SRS (1979) Multidimensional Diffusion Processes SpringerVerlag, New York 88. Vakhania NN, Tarieladze VI, Chobanjan SA (1987) Probability Distributions on Banach Spaces. Mathematics and Its Applications (Soviet Series), 14. D. Reidel, Dordrecht 89. Ventsel’ ES and Ovcharov LA (1988). Probability Theory and Its Engineering Applications [in Russian]. Nauka, Moscow 90. Wentzell AD (1981) A Course in the Theory of Stochastic Processes. McGraw-Hill, New York 91. Yamada T, Watanabe S (1971) On the uniquenes of solutions of stochastic differential equations J. Math. Kyoto Univ., 11: 155–167 92. Zolotarev VM (1997) Modern Theory of Summation of Random Variables. VSP, Utrecht
Index
Symbols
σ –algebra cylinder, 2 generated by Markov moment, 71 predictable, 72 “0 and 1” rule, 52 B Bayes method, 280 boundary functional, 329 Brownian bridge, 60 C call (put) option American, 305 claim causing ruin, 329 classic risk process, 328 Coding by pile of books method, 146 contingent claim American, 305 attainable, 305 European, 304 continuity set, 242 convergence of measures weak, 242 of random elements by distribution, 242 weak, 242 correlation operator of measure, 272 coupling, 246 optimal, 247 covariance, 11 criterion
for the regularity, 130 Neyman–Pearson, 273 recurrence, 138 critical region, 271 cumulant, 45 cumulant function, 327 D decision rule nonrandomized, 271 randomized, 271 decomposition Doob’s for discrete-time stochastic processes, 87 Doob–Meyer for supermartingales, 72 for the general supermartingales, 73 Krickeberg, 87 Kunita–Watanabe, 90 Riesz, 87 Wald, 129 density of measure, 272 posterior, 280 prior, 280 spectral, 107 diffusion process, 180 distribution finite-dimensional, 1 Gaussian, 59 marginal, 251 371
372
Index
of Markov chain invariant, 138 stationary, 138 E equation Black–Sholes, 316 Fokker–Planck, 181 Langevin, 218 Lundberg, 333 Ornstein–Uhlenbeck, 218 ergodic transformation, 110 errors of Type I and II, 271 estimator Bayes, 280 consistent for singular family of measures, 281 strictly for regular family of measures, 280 of parameter, 279 excessive majorant, 229 F fair price, 305 family of measures tight, 243 weakly compact, 243 filtration, 21 complete, 21 continuous, 21 left-hand continuous, 21 natural, 21 right-hand continuous, 21 financial market, 303 Black–Scholes/(B,S) model, 316 complete, 305 dividend yield, 317 Greeks, 316 flow of σ -algebras, 21 formula Black–Sholes, 316 Dynkin, 202 Feynman–Kac, 219 Itˆo, 194 multidimensional, 194 L´evy–Khinchin, 44 Pollaczek–Khinchin classic, 337 generalized, 333 Tanaka, 195
fractional Brownian motion, 61 function bounded growth, 77 characteristic, 12 m-dimensional, 12 common, 12 covariance, 11, 107 excessive, 229, 230 generalized inverse, 4 H¨older, 23, 251 Lipschitz, 251 lower semicontinuous, 230 mean, 11 mutual covariance, 11 nonnegatively defined, 11, 12 payoff, 229 continuous, 230 premium, 229, 230 renewal, 161 spectral, 107 structural, 108 superharmonic, 230 G generator, 178 Gronwall–Bellman lemma, 219 I inequality Burkholder, 78 Burkholder–Davis, 77, 78 Doob’s, 75, 77 integral, 76 Khinchin, 78 Marcinkievich–Zygmund, 78 infinitesimal operator, 178 K Kakutani alternative, 258 Kolmogorov equation backward, 181 forward, 181 Kolmogorov system of equations first (backward), 139 second (forward), 140 Kolmogorov–Chapman equations, 137, 176
Index L local time, 195 loss function all-or-nothing, 280 quadratic, 280 M main factorization identity, 331 Markov homogeneous family, 179 Markov chain, 137 continuous-time, 138 homogeneous, 137 regular, 139 Markov moment, 71 predictable, 71 Markov process, 175 homogeneous, 176 weakly measurable, 177 Markov transition function, 176 martingale, 71 inverse, 85 L´evy, 79 local, 73 martingale measure, 304 martingale transformation, 80 martingales orthogonal, 90 matrix covariance, 12 joint, 60 maximal deficit during a period, 330 mean value of measure, 272 mean vector, 12 measure absolutely continuous, 272 Gaussian, 59 on Hilbert space, 273 intensity, 45 L´evy, 45 locally absolutely continuous, 80 locally finite, 44 random point, 45 Poisson point, 45 spectral, 107 stochastic orthogonal, 108 structural, 108 Wiener, 242
measures equivalent, 272 singular, 272 model binomial (Cox–Ross–Rubinstein), 307 Ehrenfest P. and T., 146 Laplace, 145 modification continuous, 22 measurable, 22 N number of crossings of a band, 76 O outpayments, 328 overjump functional, 329 P period of a state, 137 Polya scheme, 81 portfolio, 315 self-financing, 315 price of game, 229 principle invariance, 244 of the fitting sets, 10 reflection, 260 process adapted, 71 almost lower semicontinuous, 328 almost upper semicontinuous, 328 Bessel, 88 birth-and-death, 147 claim surplus, 329 differentiable in L p sense, 33 in probability, 33 with probability one, 33 discounted capital, 304 Galton—Watson, 85 geometrical Brownian motion, 316 integrable, 71 L p sense, 34 in probability, 34 with probability one, 34 L´evy, 44 lower continuous, 328 nonbusy, 160
373
374
Index
of the fractional effect, 37 Ornstein–Uhlenbeck, 61, 183 Poisson, 44 compound, 49, 328 with intensity measure κ, 44 with parameter λ , 44 predictable, 72 discrete-time, 72 progressively measurable, 25 registration, 44 renewal delayed, 161 pure, 161 semicontinuous, 328 stepwise, 327 stochastic, 1 uniformly integrable, 72 upper continuous, 328 Wiener, 44 two-sided, 61 with discrete time, 1 with independent increments, 43 homogeneous, 43 Q quadratic characteristic joint, 74 quantile transformation, 4 R random element, 1 generated by a random process, 242 distribution, 242 generated by a random sequence, 242 distribution, 242 random field, 1 Poisson, 45 random function, 1 centered, 11 compensated, 11 continuous a.s., 33 in mean, 33 in mean square, 33 in probability, 22, 33 in the L p sense, 33 with probability one, 33 measurable, 22 separable, 22
stochastically continuous, 33 random functions stochastically equivalent, 21 in a wide sense, 21 random walk, 138 realization, 1 red period, 330 renewal epoch, 161 equation, 161 theorem, 162 representation spectral, 109 reserve process, 328 resolvent operator, 177 risk process, 328 risk zone, 330 ruin probability with finite horizon, 329 ruin time, 329 S safety security loading, 329 second factorization identity, 332 security of ruin, 329 sequence ergodic, 110 random, 1 regular, 129 singular, 129 set continuation, 230 cylinder, 2 of separability, 22 stopping, 230 supporting, 230 total, 118 shift operator, 110 Snell envelope, 81 space functional, 241 Skorohod, 24 Spitzer–Rogozin identity, 331 square variation, 74 square characteristic, 74 square variation joint, 74 State regular, 139
Index state accessible, 137 communicable, 137 essential, 137 inessential, 137 recurrent, 138 transient, 138 stationarity in wide sense, 107 strictly, 109 stationary sequence interpolation, 129, 130 prediction, 129, 130 stochastic basis, 71 stochastic differential, 194 stochastic differential equation, 215 strong solution, 215 weak solution, 216 stochastic integral Itˆo, 193 discrete, 80 over orthogonal measure, 108 stopping optimal, 230 stopping time, 71 optimal, 229 strategy optimal, 229 strong Markov family, 179 submartingale, 71 superharmonic majorant, 230 least, 230 supermartingale, 71 surplus prior to ruin, 329 T telegraph signal, 183 theorem Birkhoff–Khinchin, 110 Bochner, 12 Bochner–Khinchin, 107 Donsker, 244 Doob’s on convergence of submartingale, 76 on number of crossings, 76 optional sampling, 73 ergodic, 138
375
Fubini for stochastic integrals, 195 functional limit, 244 Hajek–Feldman, 273 Herglotz, 107 Hille –Yosida, 178 Kolmogorov on finite-dimensional distributions, 2 on continuous modification, 23 on regularity, 130 L´evy, 204 on normal correlation, 60 Poincare on returns, 119 Prokhorov, 243 Ulam, 241 total maximal deficit, 330 trading strategy, 303 arbitrage possibility, 304 self-financing, 303 trajectory, 1 transform Fourier–Stieltjes, 340 Laplace, 164 Laplace–Karson, 336 Laplace–Stieltjes, 163 transition function, 176 substochastic, 185 transition intensity, 139 transition probabilities matrix, 137 U ultimate ruin probability, 329 uniform integrability of stochastic process, 72 of totality of random variables, 72 V vector Gaussian, 59 virtual waiting time, 160 W waiting process, 3 Wald identity first, 85 fundamental, 85 generalized, 85 second, 85 white noise, 109