Arjun K. Gupta
Wei-Bin Zeng
Yanhong Wu
Probability and Statistical Models: Foundations for Problems in Reliability and Financial Mathematics
Birkhäuser
Arjun K. Gupta Department of Mathematics and Statistics Bowling Green State University Bowling Green, OH 43403 USA
[email protected]
Yanhong Wu Department of Mathematics California State University Stanislaus One University Circle Turlock, CA 95382 USA
[email protected]
Wei-Bin Zeng Department of Mathematics University of Louisville 328 Natural Sciences Building Louisville, KY 40292 USA
[email protected]
ISBN 978-0-8176-4986-9
e-ISBN 978-0-8176-4987-6
DOI 10.1007/978-0-8176-4987-6
Springer New York Dordrecht Heidelberg London
Library of Congress Control Number: 2010934423
Mathematics Subject Classification (2010): Primary: 60-01, 62-01, 91-01; Secondary: 60K05, 60K10, 90B25, 91B30, 91B24, 91G40, 91G20
© Springer Science+Business Media, LLC 2010
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.
Printed on acid-free paper
www.birkhauser-science.com
Preface
Probability models are now a vital component of every scientific investigation. This book is intended to introduce basic ideas in stochastic modeling, with emphasis on models and techniques. These models lead to well-known parametric lifetime distributions, such as exponential, Weibull, and gamma distributions, as well as the change-point and mixture models. They also motivate us to consider more general notions of nonparametric lifetime distribution classes. Particular attention has been paid to their applications in reliability, insurance mathematics, and economics. The following topics are the focus in this volume:
1. Exponential Distributions and the Poisson Process;
2. Parametric Lifetime Distributions;
3. Nonparametric Lifetime Distribution Classes;
4. Multivariate Exponential Extensions;
5. Association and Dependence;
6. Renewal Theory;
7. Applications to Reliability, Insurance, Finance, and Credit Risk.
Chapter 1 provides notation and basic results in probability theory that are needed in the subsequent chapters. Chapters 2 and 3 are devoted to models related to the exponential distribution and Poisson processes. Particular attention is paid to the characterizations of the exponential distribution and the Poisson process. Two of the most important properties that characterize the exponential distribution, the lack-of-memory property and the constant failure rate, are discussed in detail. Then the generalizations of the exponential distribution are examined in three directions: through its parametric form, which leads to parametric families of lifetime distributions; via notions of aging (such as monotone failure rate), which lead to a variety of lifetime distribution classes; and through lifetime distributions of multiple-component systems, which lead to multivariate (mainly bivariate) exponential extensions. These three generalizations are treated in Chaps. 4, 5, and 6, respectively. In Chap. 7, we deal with various concepts of association and dependence, which extend and generalize the results in Chaps. 4, 5, and 6. Chapter 8 introduces renewal theory, which plays a key role in applied probability techniques. Applications to insurance, finance, and credit risk are discussed in Chaps. 9, 10, and 11.
A series of questions of three types is provided at the end of each chapter. The first type consists of direct numerical applications of basic concepts and results. The second type involves some theoretical calculations. The third type needs theoretical proofs. The second and third types of questions extend the concepts and related theoretical results. Answers and brief solutions are provided for the first two types of questions at the end of the book. Hints and brief steps are provided for some third-type questions if special techniques are needed. A relatively large collection of important books and research articles for further reading is given in the bibliography.
The book differs from traditional probability textbooks in several aspects. First, it does not cover the central limit theorem and normal theory-related topics. In this sense, it can be treated as a textbook for a second course in probability models. Second, it puts more emphasis on applied probability techniques and models with recent applications in reliability, insurance, and economics. Third, no measure-theory knowledge is necessary to understand the material, and thus the book can be used as a one-semester senior undergraduate or first-year graduate textbook in courses on applied probability models for students majoring in mathematics, statistics, engineering, and economics. It can also be used as supplemental reading for a variety of other courses.
An important feature is that after the discussion of the exponential distribution and the Poisson process (Chaps. 1–3), each of the subsequent chapters (Chaps. 4–11) covers one of the important topics in applied probability and can be read independently. This gives more flexibility for instructors and readers. Also, the material in each chapter is carefully selected and covers the basic concepts and techniques on the topic, so that readers can move on to more advanced material smoothly. Further reading material and important references are listed in the bibliographical notes and given in the bibliography.
As the material is presented in a concise way, some classical topics such as queueing theory, network theory, and Markov chains and dynamic systems are omitted. Many references, including books and research articles, are not included in the bibliography for the same reason.
The original project for this book started when the second author was visiting Bowling Green State University. Since then the first two authors have given one-semester courses on this topic: the first author at Bowling Green State University and the second author at the University of Louisville. The book was completed after the third author, who is currently teaching at California State University Stanislaus, made some major revisions on parametric and nonparametric life distribution classes and added the material on association and dependence, renewal theory, and applications to insurance and economics.
The authors are thankful to their colleagues and students who have contributed directly or indirectly. Special thanks from the first two authors are due to Professors K.S. Lau and Samuel Kotz for helpful discussions on many relevant topics. Editorial help from Tom Grasso of Birkhäuser and the comments from several anonymous reviewers are gratefully appreciated. Finally, we thank our families for their tolerance and patience throughout the completion of this project.
August, 2010
AK Gupta;
[email protected] W-B Zeng;
[email protected] Y Wu;
[email protected]
Contents

1 Preliminaries
  1.1 Introduction
  1.2 Notations
  1.3 Random Variable and Distribution Function
  1.4 Mean and Variance
  1.5 Joint and Conditional Distributions
    1.5.1 Joint Distribution
    1.5.2 Independent Sums and Laws
    1.5.3 Conditional Distribution and Mean
  1.6 Survival Function and Failure Rate
    1.6.1 Survival Function and Failure Rate
    1.6.2 Mean and Mean Residual Life
    1.6.3 Cauchy Functional Equation
  Problems

2 Exponential Distribution
  2.1 Introduction
  2.2 Exponential Distribution
  2.3 Characterization of Exponential Distribution
    2.3.1 Memoryless Property
    2.3.2 Constant Failure Rate Function
    2.3.3 Extreme Value Distribution
  2.4 Order Statistics and Exponential Distribution
    2.4.1 Some Properties of Order Statistics
    2.4.2 Characterization Based on Order Statistics
    2.4.3 Record Values
  2.5 More Applications
  Problems

3 Poisson Process
  3.1 Poisson Process as a Counting Process
  3.2 Characterization of Poisson Processes as Counting Processes
  3.3 Poisson Process as a Renewal Process
  3.4 Further Properties of Poisson Process
    3.4.1 Superposition Process
    3.4.2 Decomposition of Poisson Process
  3.5 Examples of Poisson Process
  Problems

4 Parametric Families of Lifetime Distributions
  4.1 Weibull Distribution
  4.2 Gamma Distribution
  4.3 Change-Point Model
  4.4 Mixture Exponential Distribution
  4.5 IFR (DFR) and Mixture Erlang Distribution
  Problems

5 Lifetime Distribution Classes
  5.1 IFR and DFR
    5.1.1 IFR and PF2
    5.1.2 Smoothness of IFR Distribution
    5.1.3 A Sufficient Condition
  5.2 IFRA and DFRA Classes
  5.3 Several Lifetime Distribution Classes
  5.4 Preservation of Lifetime Distributions Under Reliability Operations
    5.4.1 Independent Sums
    5.4.2 Mixture of Lifetime Distributions
  5.5 Shock Models and Lifetime Distribution Classes
    5.5.1 IFRA Property of Shock Model
    5.5.2 Extension of Cumulative Damage Model
    5.5.3 General Cumulative Damage Model
    5.5.4 Shock Models Leading to Other Lifetime Distributions
  Problems

6 Multivariate Lifetime Distributions
  6.1 Basic Properties of Bivariate Distributions
  6.2 Bivariate Memoryless Property
  6.3 Properties of the BVE
  6.4 A Nonfatal Shock Model
  6.5 Absolutely Continuous Bivariate Exponential Extensions
  Problems

7 Association and Dependence
  7.1 Several Concepts of Association
  7.2 MTP2 Distribution
  7.3 Multivariate Failure Rate and Distribution Class
  7.4 Negative Association
  Problems

8 Renewal Theory
  8.1 Renewal Theorem
  8.2 High-Order Approximations and Bounds
  8.3 Delayed Renewal Process
  8.4 Defective Renewal Process
  Problems

9 Risk Theory
  9.1 Classical Risk Model
  9.2 Approximation and Bounds for Ruin Probability
  9.3 Deficit at Ruin
  9.4 Large Claim Case
    9.4.1 Bounds in terms of NWU (NBU) Distribution Classes
    9.4.2 Subexponential Classes
  9.5 Risk Sharing and Stop-Loss Reinsurance
  Problems

10 Asset Pricing Theory
  10.1 Utility, Risk, and Pricing Kernel
    10.1.1 Utility and Risk
    10.1.2 Asset Pricing Formula and Pricing Kernel
  10.2 Models for Returns
    10.2.1 β-Representation
    10.2.2 Frontier Expression
    10.2.3 Log-Normal Model
  10.3 Examples of Risk Assets
  10.4 Risk-Neutral Probabilities
  10.5 Option Pricing for Binomial Model
    10.5.1 Pricing Formula for Multiple Stages
    10.5.2 Binomial Model
  10.6 Portfolio Management
    10.6.1 Discrete Financial Market
    10.6.2 Risk Management
    10.6.3 Hedging Options
  10.7 Black–Scholes Formula
  Problems

11 Credit Risk Modeling
  11.1 Two Models for Default Probability
    11.1.1 Basic Notation
    11.1.2 Reduced Form
    11.1.3 Structural Model
  11.2 Valuation of Default Risk
    11.2.1 No Recovery Zero-Coupon Defaultable Bond
    11.2.2 Non-Zero Recovery
    11.2.3 Actual and Risk Neutral Default Intensity
  11.3 Credit Rating: Default and Transition
    11.3.1 Credit Rating
    11.3.2 Rating Assignment
    11.3.3 Rating Transition
  11.4 Correlated Defaults
    11.4.1 Credit Metrics
    11.4.2 Correlated Default Intensities
    11.4.3 Copula-Based Correlation Modeling
  11.5 Credit Derivatives
    11.5.1 Credit Default Swaps
    11.5.2 Collateral Debt Obligations
  Problems

Bibliographical Notes and Further Reading
References
Answers and Solutions to Selected Problems
Index
Chapter 1
Preliminaries
1.1 Introduction
There are phenomena from an experiment on a target “population” that are deterministic, in the sense that when we observe them, we are absolutely sure about their outcomes. However, there are other phenomena that are nondeterministic. More often than not, the quantitative or categorical outcomes of interest induced from the phenomena will not be predictable. Probability and statistics are the branches of knowledge dealing with such stochastic phenomena and uncertainties. Probability theory provides the main mathematical tools for studying stochastic phenomena, whereas statistics involves collection, analysis, and interpretation of data and drawing inferences about the population from observing stochastic phenomena. They represent two aspects of the statistical science: probability theory is deductive as a branch of mathematics, and statistics is inductive as a part of every quantitative science.
Here is a simple example. On the one hand, if it is known that in a high school of 1,200 students, 900 students like mathematics, it is possible to compute the probability that 24 out of a random sample of 30 students would like mathematics. This would be a problem in probability theory. On the other hand, if it is known that out of a sample of 30 chosen students, 24 like mathematics, inference about the total number or proportion of students who like mathematics in the whole population of 1,200 students would become a problem in statistics.
To study a stochastic phenomenon of an experiment, random variables are usually formulated as characteristics of outcomes in its sample space, then data are collected as observations of the random variables, and statistical properties are found from the analysis of the data. Just imagine the best scenario in the study: if, from the statistical properties, we can obtain the (joint) probability distribution of the relevant random variables, then we can have a better understanding of the characteristics of the phenomenon.
This is exactly the subject of probabilistic modeling, also often called stochastic modeling. Starting with some statistical properties, we establish a probability model for the stochastic phenomenon of interest and try to find the (joint) probability distribution of the random variables. As a bridge connecting the two aspects in statistical science, probabilistic modeling lies on the borderline between probability and statistics.
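The probability half of the school example can be computed directly. The sketch below assumes that the 30 students are sampled without replacement, so that the number who like mathematics follows a hypergeometric distribution; the helper name `hypergeom_pmf` is introduced here for illustration only and is not from the text.

```python
from fractions import Fraction
from math import comb


def hypergeom_pmf(k, population, successes, sample):
    """P[exactly k successes in a sample drawn without replacement]."""
    return Fraction(
        comb(successes, k) * comb(population - successes, sample - k),
        comb(population, sample),
    )


# School example: 1,200 students, 900 like mathematics, random sample of 30.
p_24 = hypergeom_pmf(24, population=1200, successes=900, sample=30)
print(float(p_24))
```

Exact rational arithmetic is used so that the probabilities over all possible counts add up to exactly one; if sampling with replacement were intended instead, a binomial probability would be the appropriate model.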
A.K. Gupta et al., Probability and Statistical Models: Foundations for Problems in Reliability and Financial Mathematics, DOI 10.1007/978-0-8176-4987-6_1, © Springer Science+Business Media, LLC 2010
1.2 Notations
Throughout the book, we will use the following notations, most of which are standard and are used in textbooks such as Ross (2005). Some of the terms will be defined in the subsequent sections.

  $\Omega$                                      Sample space
  $A, B, C, \ldots$                             Event
  $P(\cdot)$                                    Probability measure
  $X, Y, Z, \ldots$                             Random variable
  $F_X(x) = P[X \le x]$                         Cumulative distribution function of $X$
  $\bar{F}_X(x) = 1 - F_X(x)$                   Survival function of $X$
  $f(x)$                                        Probability density function
  $r(x) = f(x)/\bar{F}(x)$                      Failure rate function (also called hazard rate, or force of mortality)
  $E(X)$                                        Expected value or mean of $X$
  $\mathrm{Var}(X)$                             Variance of $X$
  $M_X(t)$                                      Moment generating function
  $P(A \mid B) = P(A \cap B)/P(B)$              Conditional probability of $A$ given $B$
  $E[X; A]$                                     The expectation of $X$ when condition $A$ also occurs
  $\mathbb{N}, \mathbb{Z}, \mathbb{Q}, \mathbb{R}^+, \mathbb{R}$   Sets of natural numbers, integers, rational numbers, nonnegative reals, and reals, respectively
  $\mathbb{Z}^+$                                Set of positive natural numbers

Notation for Distribution

  $\beta(m, n)$                 Beta distribution with parameters $m$ and $n$
  $\mathrm{Bin}(n, p)$          Binomial distribution with parameters $n$ and $p$
  $\mathrm{Exp}(\lambda)$       Exponential distribution with parameter $\lambda$
  $f(x; \theta)$                Probability density function with parameter $\theta$
  $F(x; \theta)$                Cumulative distribution function with parameter $\theta$
  $\phi(x)$                     Standard normal density function
  $\Phi(x)$                     Standard normal cumulative distribution function
  $\mathrm{WEI}(\lambda, \alpha)$   Weibull distribution with parameters $\lambda$ and $\alpha$
  $\mathrm{Gam}(\lambda, \alpha)$   Gamma distribution with parameters $\lambda$ and $\alpha$
  $N(\mu, \sigma^2)$            Normal distribution with mean $\mu$ and variance $\sigma^2$
  $\mathrm{Pois}(\lambda)$      Poisson distribution with mean $\lambda$
  $\chi^2(r)$                   Chi-square distribution with $r$ degrees of freedom
  $U(a, b)$                     Uniform distribution over $(a, b)$
Abbreviations

  d.f.               Degrees of freedom
  p.d.f.             Probability density function
  BVE                Bivariate exponential distribution
  c.d.f.             Cumulative distribution function
  r.v.               Random variable
  m.g.f.             Moment generating function
  IFR (DFR)          Increasing (decreasing) failure rate
  PF2                Polya frequency function of order 2
  MTP2               Multivariate total positivity
  LTD (RTI)          Left tail decreasing (right tail increasing)
  PQD                Positive quadrant dependent
  PRD                Positive regression dependent
  IFRA (DFRA)        Increasing (decreasing) failure rate in average
  NBU (NWU)          New better (worse) than used
  DMRL (IMRL)        Decreasing (increasing) mean residual life
  NBUC (NWUC)        New better (worse) than used in convex order
  NBUE (NWUE)        New better (worse) than used in expectation
  HDMRL (HIMRL)      Harmonic decreasing (increasing) mean residual life
  HNBUE (HNWUE)      Harmonic new better (worse) than used in expectation
  ACBVE              Absolutely continuous bivariate exponential distribution
Notations $O(\cdot)$ and $o(\cdot)$
For studying the limiting behavior of a function $f(x)$ as $x$ tends to zero, or infinity, or some other special limit, it is helpful to compare it with some known function.
(a) $f(x)$ is said to be at most of order $g(x)$ when $f(x)/g(x)$ remains bounded as $x$ tends to its limit. We write $f(x) = O(g(x))$. For example, $ax + b = O(x)$ as $x$ tends to $\infty$.
(b) $f(x)$ is said to be of smaller order than $g(x)$ when $f(x)/g(x)$ tends to zero as $x$ tends to its limit. We write $f(x) = o(g(x))$. For example, $x^n = o(e^x)$ as $x$ tends to $\infty$.
Notations $O(x)$, $O(1)$, and $o(1)$ will also be used without referring to a specified function $f(x)$. For example, $O(x)$ will mean any function which is at most of order $x$, $O(1)$ will mean any bounded function, and $o(1)$ will mean any function tending to zero.
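The two order relations can be inspected numerically by tabulating the ratio $f(x)/g(x)$ for growing $x$. The following is only an illustrative sketch: the constants $a = 3$, $b = 7$, $n = 5$ and the helper `ratio` are arbitrary choices made here, and a finite table suggests, but does not prove, the limiting behavior.

```python
import math


def ratio(f, g, xs):
    """Return f(x)/g(x) at each x, to inspect the limiting behaviour."""
    return [f(x) / g(x) for x in xs]


xs = [10.0, 50.0, 100.0, 200.0]

# (a) a*x + b = O(x): the ratio stays bounded (it approaches a = 3 here).
big_o = ratio(lambda x: 3.0 * x + 7.0, lambda x: x, xs)

# (b) x**n = o(e**x): the ratio tends to zero (n = 5 here).
little_o = ratio(lambda x: x**5, math.exp, xs)

print(big_o)
print(little_o)
```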
1.3 Random Variable and Distribution Function
For convenience, we first introduce the following basic concepts of probability theory. We always denote the underlying probability space generated from the stochastic phenomenon by $(\Omega, \mathcal{A}, P(\cdot))$, where $\Omega$ is the sample space, which consists of all possible outcomes, $\mathcal{A}$ is the set of events (theoretically, it is a $\sigma$-algebra of subsets of $\Omega$), and $P(\cdot)$ is the probability measure which calculates the probability $P(A)$ of any given event $A$.
The most important notion of probability theory is that of a random variable, which represents a numerical characteristic of the stochastic phenomenon and is denoted by a capital letter, e.g., $X, Y, Z, \ldots$. In notation, a random variable $X$ is a mapping $X: \Omega \to \mathbb{R}$ such that for all $a \in \mathbb{R}$, the subset $A = \{\omega \in \Omega \mid X(\omega) \le a\}$ is an event in $(\Omega, \mathcal{A}, P)$, i.e., $A \in \mathcal{A}$. Usually, we abbreviate the event $A$ by $A = \{X \le a\}$.
For a given random variable $X$, the function $F(x) = P[X \le x]$ is called its cumulative distribution function. When necessary, we emphasize $X$ by adding a subscript such as $F_X(x)$. A distribution function $F(x)$ is nondecreasing, continuous from the right, and $F(+\infty) = 1$, $F(-\infty) = 0$. We call a distribution function $F(x)$ degenerate at a point $t$ if
$$F(x) = \begin{cases} 1 & \text{if } x \ge t, \\ 0 & \text{otherwise.} \end{cases}$$
If $F(x)$ is a step function, that is, $X$ takes only countably many values $\{x_1, \ldots, x_k, \ldots\}$, then it is said to be a discrete distribution function, and we denote by $p_k = P[X = x_k]$, $k = 1, 2, \ldots$, the probability distribution function of $X$. The most well-known discrete random variable is the binomial random variable $X$, and its probability distribution is defined as
$$P[X = k] = \frac{n!}{k!\,(n-k)!}\, p^k (1-p)^{n-k},$$
for $k = 0, 1, \ldots, n$, where $X$ represents the total number of successes in $n$ independent Bernoulli trials with success probability $p$.
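The binomial formula and the step-function shape of a discrete distribution function can be checked with a few lines of code. This is a minimal sketch: the values $n = 3$ and $p = 1/6$ are illustrative choices, and the function names are introduced here rather than taken from the text.

```python
import math
from fractions import Fraction


def binomial_pmf(k, n, p):
    """P[X = k] = n! / (k! (n - k)!) * p^k * (1 - p)^(n - k)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(x, n, p):
    """F(x) = P[X <= x], a right-continuous step function with jumps at 0..n."""
    top = min(n, math.floor(x))
    return sum((binomial_pmf(k, n, p) for k in range(top + 1)), Fraction(0))


n, p = 3, Fraction(1, 6)
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
print(pmf)
print(binomial_cdf(1.5, n, p))
```

Passing `p` as a `Fraction` keeps every probability exact, so the properties $F(-\infty) = 0$ and $F(+\infty) = 1$ can be verified without rounding error.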
If $F(x)$ is continuous, then it is a continuous distribution function. For a continuous distribution function $F(x)$, if there exists a nonnegative function $f(x)$ such that
$$F(x) = \int_{-\infty}^{x} f(t)\,dt,$$
then $F(x)$ is called absolutely continuous. In this case, $\frac{d}{dx}F(x) = f(x)$ for almost all $x$, and $f(x)$ is called the probability density function of $F(x)$. It follows that
$$\int_{-\infty}^{\infty} f(x)\,dx = 1.$$
In fact, if $f(x) \ge 0$ and $\int_{-\infty}^{\infty} f(x)\,dx = 1$, then the function $F(x) = \int_{-\infty}^{x} f(t)\,dt$ is an absolutely continuous distribution function. The most used standard normal density is defined as
$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$
for $-\infty < x < \infty$, and the standard normal distribution function is denoted as $\Phi(x) = \int_{-\infty}^{x} \phi(z)\,dz$.
Let $A$ and $B$ be two events, and let $A \cap B$ denote the intersection of $A$ and $B$. The conditional probability of $A$ given $B$ is defined by
$$P(A \mid B) = \frac{P(A \cap B)}{P(B)}, \quad \text{for } P(B) > 0.$$
For definitions, examples, and a list of frequently used random variables, the readers are referred to textbooks such as A First Course in Probability by Ross (2005).
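As a numerical illustration of the relation between a density and its distribution function, the sketch below integrates the standard normal density with a simple trapezoidal rule and compares the result with a closed-form expression for $\Phi$ in terms of the error function. The truncation of the integration range at $-10$ and $10$, the step count, and all function names are assumptions made for this example.

```python
import math


def std_normal_pdf(x):
    """Standard normal density phi(x) = exp(-x^2 / 2) / sqrt(2 pi)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(x):
    """Phi(x) written with the error function: (1 + erf(x / sqrt 2)) / 2."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def integrate(f, a, b, steps=20_000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


# The density integrates to (numerically) one; the tails beyond +-10 are negligible.
total_mass = integrate(std_normal_pdf, -10.0, 10.0)

# Phi(1) obtained by integrating the density, compared with the erf formula.
approx_cdf_at_1 = integrate(std_normal_pdf, -10.0, 1.0)
print(total_mass, approx_cdf_at_1, std_normal_cdf(1.0))
```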
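1.4 Mean and Variance
Given a random variable $X$, the mean (expected value) of $X$ is defined as
$$\begin{aligned}
\mu = E[X] &= \int_{-\infty}^{\infty} x\,dF(x) \\
&= \sum_{k=1}^{\infty} x_k P[X = x_k], && \text{if } X \text{ is discrete;} \\
&= \int_{-\infty}^{\infty} x f(x)\,dx, && \text{if } X \text{ is absolutely continuous.}
\end{aligned}$$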
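Generally, for $Y = g(X)$ defined as a function of $X$,
$$E[Y] = E[g(X)] = \int_{-\infty}^{\infty} g(x)\,dF(x).$$
In particular, if $Y = a + bX$, then
$$E[Y] = a + bE[X]; \quad \mathrm{Var}(Y) = b^2\,\mathrm{Var}(X).$$
When $X$ is discrete,
$$E[g(X)] = \sum_{k=1}^{\infty} g(x_k) P[X = x_k];$$
and when $X$ is absolutely continuous,
$$E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x)\,dx.$$
In particular, the variance of $X$ is calculated as
$$\begin{aligned}
\mathrm{Var}(X) = E\left[(X - E(X))^2\right] &= \int_{-\infty}^{\infty} (x - \mu)^2\,dF(x) \\
= E\left[X^2\right] - (E(X))^2 &= \int_{-\infty}^{\infty} x^2\,dF(x) - \mu^2.
\end{aligned}$$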
A useful tool for calculating the moments of $X$ is its moment generating function, which is defined as
$$M(t) = E\left[e^{tX}\right] = \int_{-\infty}^{\infty} e^{tx}\,dF(x),$$
if it exists for $|t| < c$ for a constant $c > 0$. For example,
$$E\left[X^k\right] = \frac{d^k}{dt^k} M(t)\Big|_{t=0}.$$
For notational convenience, we shall use $E[g(X); A] = E[g(X) I_A]$ to denote the expectation (mean) of $g(X)$ when the event $A$ also occurs, where $I_A$ is the indicator function of $A$. Consequently,
$$E[g(X) \mid A] = \frac{E[g(X); A]}{P(A)}, \quad P(A) > 0,$$
denotes the conditional mean of $g(X)$ given that the event $A$ has occurred. For example, if $X$ has density function $f(x)$,
$$E[g(X) \mid X \le c] = \frac{E[g(X); X \le c]}{P[X \le c]} = \frac{\int_{-\infty}^{c} g(x) f(x)\,dx}{F(c)}, \quad \text{if } F(c) > 0.$$
Example 1.1. Consider an experiment in which the host rolls a die three times. Suppose you bet 1 dollar on face 6: you win $k$ dollars if exactly $k$ of the three rolls show a 6, for $k = 1, 2, 3$, and you lose your bet if no roll shows a 6.

(a) Writing O if the face is 6 and N if the face is not 6, the sample space is
$$\Omega = \{OOO, OON, ONO, NOO, ONN, NON, NNO, NNN\}$$
with corresponding probabilities $1/216$, $5/216$, $5/216$, $5/216$, $25/216$, $25/216$, $25/216$, $125/216$, respectively.

(b) Let $A$ denote the event that there are exactly two 6s. Then
$$P(A) = \frac{5}{216} + \frac{5}{216} + \frac{5}{216} = \frac{15}{216}.$$

(c) If the random variable $X$ denotes the total number of rolls showing a 6, then $X$ follows the binomial distribution with $n = 3$ and $p = 1/6$:
$$\begin{array}{c|cccc} k & 0 & 1 & 2 & 3 \\ \hline P[X = k] & 125/216 & 75/216 & 15/216 & 1/216 \end{array}$$

(d) The mean of $X$ is $E[X] = 108/216 = 1/2$.

(e) Now define $Y = g(X)$ as the amount you win, so that $g(0) = -1$ and $g(k) = k$ for $k = 1, 2, 3$. Then $E[Y] = (-125 + 75 + 30 + 3)/216 = -17/216$.

Example 1.2. (a) For the standard normal random variable $Z$ with density $\phi(z)$, $E[Z] = 0$, $\operatorname{Var}(Z) = 1$, and
$$E\,e^{tZ} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{tz - \frac{1}{2}z^2}\,dz = e^{\frac{1}{2}t^2}.$$

(b) For a general normal random variable $X = \mu + \sigma Z$, $E[X] = \mu$, $\operatorname{Var}(X) = \sigma^2$, and
$$M(t) = E\,e^{tX} = E\,e^{t\mu + t\sigma Z} = e^{t\mu + \frac{1}{2}\sigma^2 t^2}.$$
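The arithmetic in Example 1.1 can be checked by brute-force enumeration of the eight outcomes. The following is a minimal sketch using exact fractions; the helper name `payoff` is an illustrative choice, not notation from the text.

```python
from fractions import Fraction
from itertools import product

# Each roll shows a 6 ("O") with probability 1/6, otherwise "N".
p = {"O": Fraction(1, 6), "N": Fraction(5, 6)}

# Distribution of X = number of sixes in three rolls, by enumeration.
pmf = {k: Fraction(0) for k in range(4)}
for outcome in product("ON", repeat=3):
    prob = p[outcome[0]] * p[outcome[1]] * p[outcome[2]]
    pmf[outcome.count("O")] += prob

def payoff(k):
    """Amount won: g(0) = -1 (the stake is lost), g(k) = k for k = 1, 2, 3."""
    return -1 if k == 0 else k

mean_X = sum(k * q for k, q in pmf.items())
mean_Y = sum(payoff(k) * q for k, q in pmf.items())
print(mean_X, mean_Y)
```

Because the computation uses `Fraction`, the results are exact and can be compared directly with parts (c) to (e) of the example.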
1.5 Joint and Conditional Distributions

1.5.1 Joint Distribution

For two random variables $X$ and $Y$ defined on the same probability space, we denote by
$$F(x, y) = P[X \le x; Y \le y]$$
the joint distribution function of $X$ and $Y$. The marginal distributions of $X$ and $Y$ are
$$F_X(x) = P[X \le x] = F(x, \infty) \quad \text{and} \quad F_Y(y) = P[Y \le y] = F(\infty, y).$$
$X$ and $Y$ are called independent if for all $x, y \in \mathbb{R}$,
$$P[X \le x, Y \le y] = P[X \le x]\,P[Y \le y].$$
For any general function $g(x, y)$, the mean of $g(X, Y)$ is calculated as
$$E[g(X, Y)] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} g(x, y)\,dF(x, y).$$
In particular, the covariance of $X$ and $Y$ is defined as
$$\operatorname{Cov}(X, Y) = E[XY] - E[X]E[Y].$$
When $X$ and $Y$ are independent, for any functions $g(x)$ and $h(y)$,
$$E[g(X)h(Y)] = E[g(X)]\,E[h(Y)].$$
This implies $\operatorname{Cov}(g(X), h(Y)) = 0$.

If both $X$ and $Y$ are discrete, we denote by
$$p(x, y) = P[X = x; Y = y]$$
the joint probability distribution function of $X$ and $Y$, and their marginal probability distributions are thus
$$p_X(x) = P[X = x] = \sum_{y} p(x, y) \quad \text{and} \quad p_Y(y) = P[Y = y] = \sum_{x} p(x, y),$$
respectively. $X$ and $Y$ are called jointly absolutely continuous if there exists a joint density function $f(x, y)$ such that
$$F(x, y) = \int_{-\infty}^{x} \int_{-\infty}^{y} f(u, v)\,dv\,du,$$
for all $x, y \in \mathbb{R}$. For a transformation $U = g(X, Y)$ and $V = h(X, Y)$, suppose the determinant of the Jacobian matrix satisfies
$$J(x, y) = \begin{vmatrix} \dfrac{\partial g}{\partial x} & \dfrac{\partial g}{\partial y} \\[2ex] \dfrac{\partial h}{\partial x} & \dfrac{\partial h}{\partial y} \end{vmatrix} \ne 0.$$
Then the joint density function of $U$ and $V$ is
$$f_{(U,V)}(u, v) = f_{(X,Y)}(x, y)\,|J(x, y)|^{-1},$$
where $x$ and $y$ on the right-hand side are written as inverse functions of $u$ and $v$. Joint distributions for several random variables can be defined similarly. More detailed properties of joint distributions can be found in Chap. 6.
1.5.2 Independent Sums and Laws

When $X$ and $Y$ are independent, the moment generating function of $X + Y$ can be written as the product of their moment generating functions:
$$M_{X+Y}(t) = E\,e^{t(X+Y)} = M_X(t)\,M_Y(t).$$
This property can be used to verify the Poisson law for the binomial random variable and the normal law for sample means.

Theorem 1.1. Let $X$ be a binomial random variable from $n$ independent Bernoulli trials with success probability $p$. Suppose $n \to \infty$ and $p \to 0$ such that $np \to \lambda$. Then the distribution of $X$ approaches the Poisson distribution defined by
$$P[X = k] = \frac{\lambda^k}{k!}\, e^{-\lambda},$$
for $k = 0, 1, 2, \ldots$.

Proof. We prove the result using the moment generating function. First, note that the binomial random variable $X$ can be written as the sum of $n$ independent Bernoulli random variables $X_1, \ldots, X_n$ with the same success probability $p$, where $P[X_1 = 1] = 1 - P[X_1 = 0] = p$. Thus, its moment generating function is equal to
$$M_X(t) = M_{X_1}(t) \cdots M_{X_n}(t) = \left[1 + p(e^t - 1)\right]^n.$$
As $n \to \infty$ and $p \to 0$ such that $np \to \lambda$,
$$\left[1 + p(e^t - 1)\right]^n = \left[1 + \frac{np(e^t - 1)}{n}\right]^n \to e^{\lambda(e^t - 1)},$$
which is the moment generating function of the Poisson distribution since
$$\sum_{k=0}^{\infty} e^{tk}\, \frac{\lambda^k}{k!}\, e^{-\lambda} = e^{-\lambda} \sum_{k=0}^{\infty} \frac{(\lambda e^t)^k}{k!} = e^{\lambda(e^t - 1)}. \qquad \square$$
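The Poisson law of Theorem 1.1 can be illustrated numerically. The sketch below compares the binomial and Poisson probabilities for a fixed $\lambda$ and increasing $n$; the function names, the value $\lambda = 3$, and the cutoff `k_max` are illustrative assumptions rather than anything prescribed by the text.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P[X = k] for a binomial random variable with n trials and success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P[X = k] for a Poisson random variable with parameter lam."""
    return lam**k * exp(-lam) / factorial(k)

def max_gap(n, lam, k_max=10):
    """Largest pointwise difference between the two pmfs for k = 0, ..., k_max, with p = lam / n."""
    p = lam / n
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(k_max + 1))

for n in (10, 100, 1000):
    print(n, max_gap(n, 3.0))
```

The printed gaps should shrink as $n$ grows with $np = \lambda$ held fixed, which is the content of the theorem.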
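The following theorem is the central limit theorem, or the normal law for the mean of a random sample.

Theorem 1.2. Let $X_1, \ldots, X_n$ be independent and identically distributed random variables with mean $\mu$, variance $\sigma^2$, and finite third moment. Then as $n \to \infty$, the distribution of $(\bar{X} - \mu)/(\sigma/\sqrt{n})$ approaches the standard normal distribution $\Phi(x)$.

Proof. We again use the moment generating function. Let $M(t)$ denote the moment generating function of $X_1$. Using the Taylor expansions
$$\ln(1 + x) = x - \frac{x^2}{2} + o\left(x^2\right)$$
as $x \to 0$, and
$$M(t) = 1 + \mu t + \frac{E[X^2]}{2}\, t^2 + o\left(t^2\right)$$
as $t \to 0$, we have, as $n \to \infty$,
$$\begin{aligned} E\left[e^{t(\bar{X} - \mu)/(\sigma/\sqrt{n})}\right] &= \exp\left(-\frac{t\sqrt{n}\,\mu}{\sigma}\right) \left[M\left(\frac{t}{\sigma\sqrt{n}}\right)\right]^n \\ &= \exp\left(-\frac{t\sqrt{n}\,\mu}{\sigma} + n \ln M\left(\frac{t}{\sigma\sqrt{n}}\right)\right) \\ &= \exp\left(-\frac{t\sqrt{n}\,\mu}{\sigma} + n \ln\left(1 + \frac{\mu t}{\sigma\sqrt{n}} + \frac{t^2}{2\sigma^2 n}\, E\left(X^2\right) + o\left(\frac{1}{n}\right)\right)\right) \\ &\to \exp\left(\frac{t^2}{2}\right), \end{aligned}$$
which is the moment generating function of the standard normal distribution. $\square$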
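Theorem 1.2 can be illustrated without simulation when the $X_i$ are fair Bernoulli trials, because the distribution of the sample mean is then an explicit binomial sum. The sketch below is only an illustration under that assumption ($\mu = 1/2$, $\sigma = 1/2$); the function names and the evaluation grid are hypothetical choices.

```python
from math import comb, erf, sqrt

def std_normal_cdf(x):
    """Standard normal distribution function, written via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def standardized_mean_cdf(x, n):
    """Exact P[(Xbar - 1/2) / (0.5 / sqrt(n)) <= x] for n fair Bernoulli trials."""
    # Xbar = S / n with S binomial(n, 1/2); the event is S <= n/2 + x * sqrt(n) / 2.
    threshold = n / 2 + x * sqrt(n) / 2
    return sum(comb(n, k) for k in range(n + 1) if k <= threshold) / 2**n

def max_gap(n):
    """Largest gap between the exact and the normal distribution functions on a grid in [-3, 3]."""
    grid = [i / 10 for i in range(-30, 31)]
    return max(abs(standardized_mean_cdf(x, n) - std_normal_cdf(x)) for x in grid)

for n in (25, 100, 400):
    print(n, max_gap(n))
```

The gap does not vanish for any finite $n$, since the exact distribution is a step function, but it should decrease as $n$ grows.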
1.5.3 Conditional Distribution and Mean

For discrete random variables $X$ and $Y$ with joint probability distribution function $p(x, y)$, we call
$$p_{X|Y}(x \mid y) = \frac{P[X = x; Y = y]}{P[Y = y]} = \frac{p(x, y)}{p_Y(y)}$$
the conditional probability distribution function of $X$ given $Y = y$. For a function $g(x)$, the conditional mean of $g(X)$ given $Y = y$ is calculated as
$$E[g(X) \mid Y = y] = \sum_{x} g(x)\, p_{X|Y}(x \mid y).$$
If $X$ and $Y$ are jointly absolutely continuous with density function $f(x, y)$, we call
$$f_{X|Y}(x \mid y) = \frac{f(x, y)}{f_Y(y)}$$
the conditional density function of $X$ given $Y = y$. The conditional mean of $g(X)$ given $Y = y$ is calculated as
$$E[g(X) \mid Y = y] = \int_{-\infty}^{\infty} g(x)\, f_{X|Y}(x \mid y)\,dx.$$
Example 1.3. Suppose the joint density function for $X$ and $Y$ is
$$f(x, y) = \frac{1}{2}\, y e^{-xy}, \quad \text{for } 0 < x < \infty,\ 0 < y < 2.$$

(a) The marginal density for $Y$ is equal to
$$f_Y(y) = \int_0^{\infty} \frac{1}{2}\, y e^{-xy}\,dx = \frac{1}{2}, \quad \text{for } 0 < y < 2,$$
which is uniform on $[0, 2]$.

(b) The conditional density of $X$ given $Y = y$ is thus
$$f_{X|Y}(x \mid y) = y e^{-xy}, \quad \text{for } 0 < x < \infty.$$

(c) The conditional mean of $X$ given $Y = y$ is
$$E[X \mid Y = y] = \int_0^{\infty} x y e^{-xy}\,dx = \frac{1}{y} \int_0^{\infty} x e^{-x}\,dx = \frac{1}{y}.$$

(d) Given $Y = 1$, the conditional mean of $g(X) = e^{X/2}$ is
$$E\left[e^{X/2} \mid Y = 1\right] = \int_0^{\infty} e^{x/2} e^{-x}\,dx = 2.$$
Three useful tools to calculate the mean and variance of g.X / as well as Cov.X; Y / are given in the following theorem.
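Theorem 1.3.
(a) $E[g(X)] = E[E[g(X) \mid Y]]$;
(b) $\operatorname{Var}(g(X)) = E[\operatorname{Var}(g(X) \mid Y)] + \operatorname{Var}(E[g(X) \mid Y])$;
(c) $\operatorname{Cov}(X, Y) = \operatorname{Cov}(E[X \mid Z], E[Y \mid Z]) + E[\operatorname{Cov}(X, Y \mid Z)]$.

Proof. For (a), we look at the discrete case:
$$\begin{aligned} E[E[g(X) \mid Y]] &= \sum_{y} E[g(X) \mid Y = y]\, p_Y(y) \\ &= \sum_{y} \sum_{x} g(x)\, \frac{p(x, y)}{p_Y(y)}\, p_Y(y) \\ &= \sum_{y} \sum_{x} g(x)\, p(x, y) \\ &= E[g(X)]. \end{aligned}$$
For (b), note that
$$\operatorname{Var}(g(X) \mid Y = y) = E\left[(g(X) - E[g(X) \mid Y = y])^2 \mid Y = y\right] = E\left[g^2(X) \mid Y = y\right] - (E[g(X) \mid Y = y])^2.$$
Thus,
$$E[\operatorname{Var}(g(X) \mid Y)] = E\left[g^2(X)\right] - E\left[(E[g(X) \mid Y])^2\right].$$
However,
$$\operatorname{Var}(E[g(X) \mid Y]) = E\left[(E[g(X) \mid Y])^2\right] - (E[E[g(X) \mid Y]])^2 = E\left[(E[g(X) \mid Y])^2\right] - (E[g(X)])^2.$$
Adding the above two equations, we get the desired result. The proof of (c) is left as an exercise. $\square$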
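The following example demonstrates a typical application.

Example 1.4. Let $\{X_i\}$, $i = 1, 2, \ldots$, be i.i.d. random variables with mean $\mu$ and variance $\sigma^2$, and let $N$ be an integer-valued random variable that is independent of the $X_i$'s. We consider the random sum $\sum_{i=1}^{N} X_i$.

(a) By conditioning on the value of $N$, we can calculate the mean of $\sum_{i=1}^{N} X_i$ as
$$E\left[\sum_{i=1}^{N} X_i\right] = E\left[E\left[\sum_{i=1}^{N} X_i \,\Big|\, N\right]\right] = E[\mu N] = \mu E[N].$$

(b) By using the above theorem, we can calculate its variance as
$$\begin{aligned} \operatorname{Var}\left(\sum_{i=1}^{N} X_i\right) &= E\left[\operatorname{Var}\left(\sum_{i=1}^{N} X_i \,\Big|\, N\right)\right] + \operatorname{Var}\left(E\left[\sum_{i=1}^{N} X_i \,\Big|\, N\right]\right) \\ &= E\left[\sigma^2 N\right] + \operatorname{Var}(\mu N) \\ &= \sigma^2 E[N] + \mu^2 \operatorname{Var}(N). \end{aligned}$$

(c) Similarly, since $\operatorname{Cov}\left(N, \sum_{i=1}^{N} X_i \mid N\right) = 0$, we can calculate
$$\operatorname{Cov}\left(N, \sum_{i=1}^{N} X_i\right) = \operatorname{Cov}\left(E[N \mid N],\ E\left[\sum_{i=1}^{N} X_i \,\Big|\, N\right]\right) = \operatorname{Cov}(N, \mu N) = \mu \operatorname{Var}(N). \qquad \square$$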
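The three formulas of Example 1.4 can be verified exactly on a small case by enumerating the joint distribution of $N$ and the random sum. In the sketch below, the distributions `p_N` and `p_X` are hypothetical choices made only for illustration, and `S` stands for $\sum_{i=1}^{N} X_i$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical distributions chosen only for illustration.
p_N = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}  # P[N = n]
p_X = {1: Fraction(1, 3), 3: Fraction(2, 3)}                     # P[X_i = x]

def mean_var(pmf):
    """Mean and variance of a finite distribution given as {value: probability}."""
    mean = sum(v * q for v, q in pmf.items())
    var = sum(v * v * q for v, q in pmf.items()) - mean**2
    return mean, var

mu, sigma2 = mean_var(p_X)
mean_N, var_N = mean_var(p_N)

# Exact joint distribution of (N, S), where S = X_1 + ... + X_N.
joint = {}
for n, q_n in p_N.items():
    for xs in product(p_X, repeat=n):
        q = q_n
        for x in xs:
            q *= p_X[x]
        key = (n, sum(xs))
        joint[key] = joint.get(key, Fraction(0)) + q

mean_S = sum(s * q for (n, s), q in joint.items())
var_S = sum(s * s * q for (n, s), q in joint.items()) - mean_S**2
cov_NS = sum(n * s * q for (n, s), q in joint.items()) - mean_N * mean_S

print(mean_S == mu * mean_N)                     # part (a)
print(var_S == sigma2 * mean_N + mu**2 * var_N)  # part (b)
print(cov_NS == mu * var_N)                      # part (c)
```

Exact rational arithmetic makes the comparisons equalities rather than approximate checks.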
1.6 Survival Function and Failure Rate

1.6.1 Survival Function and Failure Rate

In this book, we concentrate on variables representing lifetimes of mechanical or biological components, or amounts of demands or claims. We therefore always assume that $X$ is a nonnegative random variable, and hence $F(x) = 0$ for $x < 0$. The right-tail probability of the distribution,
$$\bar{F}(x) = 1 - F(x) = P[X > x],$$
is called the survival function of $X$. It represents the probability that the lifetime (from the moment of birth) exceeds $x$. Clearly, the survival function $\bar{F}(x)$ is nonincreasing, $\bar{F}(\infty) = 0$, and $\bar{F}(x) = 1$ for $x < 0$.

For $t > 0$, the residual life $X - t$ given $X > t$, denoted by $X - t \mid X > t$, has the distribution function
$$F(x \mid t) = P[X \le t + x \mid X > t] = \frac{P[t < X \le t + x]}{P[X > t]} = \frac{F(t + x) - F(t)}{\bar{F}(t)},$$
for $\bar{F}(t) > 0$, and the conditional survival function is given by
$$\bar{F}(x \mid t) = P[X > t + x \mid X > t] = \frac{P[X > t + x]}{P[X > t]} = \frac{\bar{F}(t + x)}{\bar{F}(t)}.$$
If the probability density function $f(x)$ exists, then the probability density function of the residual life $X - t \mid X > t$ is given by
$$f(x \mid t) = \frac{d}{dx}\, \frac{F(t + x) - F(t)}{\bar{F}(t)} = \frac{f(t + x)}{\bar{F}(t)}.$$
In particular, its density $f(0 \mid t)$ at $x = 0$ is called the failure (hazard) rate function of $X$, denoted by
$$r(t) = \frac{f(t)}{\bar{F}(t)},$$
as it gives the instantaneous failure rate at time $t$ given that $X$ has survived until time $t$. When the failure rate function $r(t)$ exists, we note that
$$r(t) = \frac{f(t)}{\bar{F}(t)} = -\frac{d}{dt} \ln \bar{F}(t).$$
This gives $-\ln \bar{F}(t) = \int_0^t r(s)\,ds$, and thus
$$\bar{F}(t) = e^{-\int_0^t r(s)\,ds}.$$
Consequently, we can write the conditional survival function for the residual life as
$$\bar{F}(x \mid t) = \frac{\bar{F}(x + t)}{\bar{F}(t)} = \frac{e^{-\int_0^{t+x} r(s)\,ds}}{e^{-\int_0^{t} r(s)\,ds}} = e^{-\int_t^{t+x} r(s)\,ds}.$$
Example 1.5. Suppose $r(t) = a + bt$ for $t \ge 0$, where $a, b \ge 0$. Then
$$\bar{F}(t) = e^{-at - \frac{1}{2}bt^2},$$
$$f(t) = r(t)\bar{F}(t) = (a + bt)\, e^{-at - \frac{1}{2}bt^2},$$
and
$$\bar{F}(x \mid t) = e^{-(a + bt)x - \frac{1}{2}bx^2}.$$
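The closed forms in Example 1.5 can be cross-checked by integrating the failure rate numerically. The following sketch uses the trapezoid rule, which is exact for a linear rate up to rounding; the parameter values `a = 0.5` and `b = 0.2` and all function names are hypothetical.

```python
from math import exp

def survival_from_rate(rate, t, steps=1000):
    """exp(-integral_0^t rate(s) ds), with the integral computed by the trapezoid rule."""
    h = t / steps
    total = 0.5 * (rate(0.0) + rate(t))
    for i in range(1, steps):
        total += rate(i * h)
    return exp(-h * total)

a, b = 0.5, 0.2  # hypothetical parameter values

def r(t):
    """Linear failure rate r(t) = a + b t."""
    return a + b * t

def closed_form_survival(t):
    """Survival function from Example 1.5."""
    return exp(-a * t - 0.5 * b * t * t)

def conditional_survival(x, t):
    """Conditional survival function of the residual life from Example 1.5."""
    return exp(-(a + b * t) * x - 0.5 * b * x * x)

print(survival_from_rate(r, 2.0), closed_form_survival(2.0))
print(closed_form_survival(3.0) / closed_form_survival(1.0), conditional_survival(2.0, 1.0))
```

The second line compares the ratio $\bar{F}(t + x)/\bar{F}(t)$ with the closed form for $\bar{F}(x \mid t)$ at $t = 1$, $x = 2$.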
1.6.2 Mean and Mean Residual Life

Theorem 1.4. Let $X$ be a nonnegative random variable with distribution function $F(x)$ and $E(X) < \infty$. Then
$$E(X) = \int_0^{\infty} (1 - F(x))\,dx.$$

Proof. We prove this theorem in the case when $X$ is absolutely continuous with probability density function $f(x)$. Let $r > 0$. Then integration by parts yields
$$\int_0^r x f(x)\,dx = -x[1 - F(x)]\Big|_0^r + \int_0^r [1 - F(x)]\,dx = -r[1 - F(r)] + \int_0^r [1 - F(x)]\,dx.$$
Note that
$$E(X) = \int_0^{\infty} x f(x)\,dx = \int_0^r x f(x)\,dx + \int_r^{\infty} x f(x)\,dx < \infty,$$
and
$$\lim_{r \to \infty} \int_r^{\infty} x f(x)\,dx = 0.$$
Furthermore,
$$\int_r^{\infty} x f(x)\,dx \ge r \int_r^{\infty} f(x)\,dx = r[1 - F(r)].$$
Hence we have $\lim_{r \to \infty} r[1 - F(r)] = 0$, and
$$E(X) = \int_0^{\infty} x f(x)\,dx = \lim_{r \to \infty} \int_0^r x f(x)\,dx = \int_0^{\infty} [1 - F(x)]\,dx. \qquad \square$$

When $f(x)$ does not necessarily exist, Theorem 1.4 can also be derived by observing that
$$E(X) = \int_0^{\infty} x\,dF(x) = \int_0^{\infty} \int_0^{x} dy\,dF(x) = \int_0^{\infty} \int_y^{\infty} dF(x)\,dy = \int_0^{\infty} [1 - F(y)]\,dy.$$
Remark. For any general continuous random variable $X$ with $E(|X|) < \infty$, this result extends to
$$E(X) = -\int_{-\infty}^{0} F(x)\,dx + \int_0^{\infty} [1 - F(x)]\,dx.$$

The following result gives the expression for the mean residual life.

Corollary 1.1. If $E[X] < \infty$, then for all $t$ such that $\bar{F}(t) > 0$, the mean residual life is
$$\mu(t) = E[X - t \mid X > t] = \frac{1}{\bar{F}(t)} \int_t^{\infty} \bar{F}(s)\,ds.$$
In particular, $\mu = \mu(0) = E[X]$.
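Theorem 1.4 and Corollary 1.1 can be checked numerically for a concrete lifetime. The sketch below assumes, purely for illustration, the survival function $\bar{F}(x) = e^{-x^2}$, for which $E[X] = \sqrt{\pi}/2$ and $\int_t^{\infty} \bar{F}(s)\,ds = \frac{\sqrt{\pi}}{2}\operatorname{erfc}(t)$; the helper `integrate` and the truncation point `UPPER` are hypothetical choices.

```python
from math import erfc, exp, pi, sqrt

def integrate(func, lo, hi, steps=20000):
    """Midpoint rule on [lo, hi]."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))

def survival(x):
    """Hypothetical survival function exp(-x^2) for x >= 0."""
    return exp(-x * x)

def density(x):
    """Density f(x) = -d/dx survival(x)."""
    return 2.0 * x * exp(-x * x)

UPPER = 10.0  # survival(10) = exp(-100), so truncating the integrals here is harmless

mean_from_density = integrate(lambda x: x * density(x), 0.0, UPPER)
mean_from_survival = integrate(survival, 0.0, UPPER)

def mean_residual_life(t):
    """mu(t) = (1 / survival(t)) * integral_t^infinity survival(s) ds, as in Corollary 1.1."""
    return integrate(survival, t, UPPER) / survival(t)

print(mean_from_density, mean_from_survival, sqrt(pi) / 2)
print(mean_residual_life(1.0), exp(1.0) * sqrt(pi) / 2 * erfc(1.0))
```

Both ways of computing the mean should agree with $\sqrt{\pi}/2$ up to the quadrature error, and the mean residual life at $t = 1$ should agree with the closed form built from `erfc`.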
1.6.3 Cauchy Functional Equation

The next theorem gives a characterization of the exponential function, which will be used in the next two chapters.

Theorem 1.5. Let $g : \mathbb{R}_+ \to \mathbb{R}_+$ be continuous and satisfy the Cauchy functional equation
$$g(x + y) = g(x)\,g(y), \quad x, y \in \mathbb{R}_+.$$
Then either $g(x) = 0$ or $g(x) = e^{\alpha x}$, for some $\alpha \in \mathbb{R}$.

Proof. By induction, the Cauchy functional equation implies that for any $x_j \ge 0$, $1 \le j \le n$,
$$g(x_1 + \cdots + x_n) = g(x_1)\,g(x_2) \cdots g(x_n).$$
If we choose $x_j = x = \frac{1}{n}$ for all $j$, we have
$$g(1) = \left[g\left(\frac{1}{n}\right)\right]^n, \quad \text{or} \quad g\left(\frac{1}{n}\right) = (g(1))^{1/n}.$$
By choosing $x_j = x = \frac{1}{m}$, we have
$$g\left(\frac{n}{m}\right) = \left[g\left(\frac{1}{m}\right)\right]^n = [g(1)]^{n/m}, \quad \text{for any } m, n \in \mathbb{N}.$$
Therefore, from the continuity of $g(x)$, it follows that
$$g(x) = (g(1))^x, \quad \text{for } x \in \mathbb{R}_+.$$
If $g(1) = 0$, then $g(x) = 0$. If $g(1) \ne 0$, then $g(1) > 0$. By letting $\alpha = \ln g(1)$, we have
$$g(x) = e^{\alpha x}, \quad \text{for } x \in \mathbb{R}_+. \qquad \square$$
Problems

1. A filling station is supplied with gasoline once a week. Its weekly volume of sales, in thousands of gallons, is a random variable with density function
$$f(x) = 5(1 - x)^4, \quad \text{for } 0 < x < 1,$$
and is zero otherwise.
(a) Find the cumulative distribution function $F(x)$.
(b) Find the mean and variance of $F(x)$.
(c) If the probability of the supply being exhausted in a given week is at most 0.01, find the capacity of the tank.

2. A certain kind of insurance claim follows the distribution function
$$F(x) = 1 - \frac{1}{(1 + x/200)^4},$$
for $0 < x < \infty$.
(a) Find the density function of $F(x)$.
(b) Find the mean and variance of the claims.
(c) Find the conditional distribution of claims given $X > 200$.
(d) Find the mean residual claim $E[X - 200 \mid X > 200]$.

3. The lifetime of a radar system (in months) has the distribution function
$$F(t) = \frac{e^t - 1}{e^t + 1}, \quad 0 < t < \infty.$$
(a) What is the probability that it will fail within the first month?
(b) What is the probability that it will fail during the second month given that it has survived the first month?

4. A failure rate function is $r(t) = 1/\sqrt{t}$. Deduce
(a) the survival function;
(b) the density function;
(c) the mean and variance.

5. Prove Corollary 1.1.

6. For a nonnegative integer-valued random variable $N$ (that is, $N$ only takes values in $\mathbb{Z}_+$),
(a) show that
$$E[N] = \sum_{k=1}^{\infty} k P[N = k] = \sum_{k=1}^{\infty} P[N \ge k];$$
(b) by noting that $P[N > k] = \sum_{i=k+1}^{\infty} P[N = i]$, show that
$$2 \sum_{k=1}^{\infty} k P[N > k] = E\left[N^2\right] - E[N].$$

7. Let $X$ follow the uniform density on $[0, 1]$,
$$f(x) = 1, \quad 0 \le x \le 1.$$
Find $M(t) = E[e^{tX}]$.
8. The lung cancer failure rate of a $t$-year-old smoker is such that
$$r(t) = 0.027 + 0.00025(t - 40)^2, \quad t \ge 40,$$
and 0 otherwise. Assuming that a 40-year-old male smoker survives all other hazards, what is the probability that he survives to (a) age 50 and (b) age 60 without contracting lung cancer?

9. Show that for a nonnegative random variable $X$,
$$E\left[X^k\right] = \int_0^{\infty} k x^{k-1} P[X > x]\,dx.$$

10. The number of years that a washing machine functions is a random variable with failure rate function
$$r(t) = \begin{cases} 0.2, & 0 < t < 2, \\ 0.2 + 0.3(t - 2), & 2 \le t < 5, \\ 1.1, & t > 5. \end{cases}$$
(a) What is the probability that the machine will still be working six years after being purchased?
(b) If it is still working six years after being purchased, what is the conditional probability that it will fail within the succeeding two years?
(c) If it is still working six years after being purchased, what is the mean residual life?

11. Suppose a certain type of insurance claim $X$ follows the distribution function $F(x)$.
(a) Suppose the deductible is $D$. That is, the insurer only pays the part of the claim that exceeds the deductible $D$, i.e., $X_D = (X - D)^+$. Find the expected value $E[X_D]$.
(b) A similar idea is called the franchise deductible. That is, the insurer pays the whole claim only if the claim size is larger than $D$, i.e., $X_{DF} = X I_{[X > D]}$. Find the mean $E[X_{DF}]$.

12. Show that the continuity condition in Theorem 1.5 can be weakened to right continuity.

13. For a lifetime $X$ with distribution $F(x)$, calculate $E[X \mid X > t]$ and $E[X \mid X \le t]$ for $0 < P[X \le t] < 1$.
14. For a lifetime distribution $F(x)$ with density function $f(x)$ for $0 \le x < \infty$, show that $t + \mu(t)$ is an increasing function of $t$, where $\mu(t)$ is the mean residual life.

15. Let $X$ be a binomial random variable with $n$ trials and success probability $p$.
(a) Using the binomial law, show that the moment generating function of $X$ is given by
$$E\left[e^{tX}\right] = \left(pe^t + 1 - p\right)^n.$$
(b) Show that $E[X] = np$ and $\operatorname{Var}(X) = np(1 - p)$.

16. Suppose a lifetime has density function
$$f(x) = ce^{-2x}, \quad \text{for } 0 \le x < \infty.$$
(a) Find the value of $c$.
(b) Calculate $P[X > 2]$.
(c) Calculate $P[X > 3 \mid X > 2]$.

17. For a random loss $L$ following distribution $F(x)$, we call $l$ the Value at Risk (VaR) at level 99% if $F(l) = 0.99$, that is, $1 - F(l) = 0.01$. Suppose $F(x)$ is normal with mean $\mu$ and standard deviation $\sigma$.
(a) Show that the Value at Risk at level 99% is $l \approx \mu + 2.326\sigma$.
(b) Find the corresponding expected total loss
$$T_l = E[L \mid L \ge l].$$

18. Suppose the joint density function of $X$ and $Y$ is
$$f(x, y) = \frac{1}{y}\, e^{-y}, \quad \text{for } 0 < x < y,\ 0 < y < \infty.$$
(a) Find the marginal density function for $Y$.
(b) Find the conditional density function of $X$ given $Y$.
(c) Compute $E[X \mid Y = y]$.
(d) Compute $\operatorname{Var}(X \mid Y = y)$.

19. Show that in both the discrete and continuous cases, if $X$ and $Y$ are independent, then
$$E[X \mid Y = y] = E[X].$$

20. Let $X$ be a uniform random variable over $(0, 1)$ with density $f(x) = 1$ for $0 < x < 1$. Find $E\left[X \mid X > \frac{1}{2}\right]$.
21. Show that if $X$ and $Y$ are jointly absolutely continuous, then
$$E[X] = \int_{-\infty}^{\infty} E[X \mid Y = y]\, f_Y(y)\,dy.$$

22. Show that
$$\operatorname{Cov}(X, Y) = \operatorname{Cov}(X, E[Y \mid X]).$$

23. Show that if $E[Y \mid X] = a + bX$, then
$$b = \frac{\operatorname{Cov}(X, Y)}{\operatorname{Var}(X)}.$$

24. Prove
$$\operatorname{Cov}(X, Y) = \operatorname{Cov}(E[X \mid Z], E[Y \mid Z]) + E[\operatorname{Cov}(X, Y \mid Z)].$$

25. For the independent random sum $Y = \sum_{i=1}^{N} X_i$, where $X_1, \ldots, X_n, \ldots$ are independent and identically distributed random variables with moment generating function $M_X(t)$ and $N$ is a positive integer-valued random variable independent of the $X_i$'s with moment generating function $M_N(t)$, show that the moment generating function of $Y$ is given by
$$M_Y(t) = M_N(\ln M_X(t)).$$

26. Show that $\operatorname{Var}(E[X \mid Y]) \le \operatorname{Var}(X)$.
Chapter 2
Exponential Distribution
2.1 Introduction

In establishing a probability model for a real-world phenomenon, it is always necessary to make certain simplifying assumptions to render the mathematics tractable. However, we cannot make too many simplifying assumptions, for then our conclusions, obtained from the probability model, would not be applicable to the real-world phenomenon. Thus, we must make enough simplifying assumptions to enable us to handle the mathematics but not so many that the model no longer resembles the real-world phenomenon.

The reliability of instruments and systems can be measured by survival probabilities such as $P[X > x]$. During the early 1950s, Epstein and Sobel, and Davis, analyzed statistical data on the operating time of an instrument up to failure and found that in many cases the lifetime has an exponential distribution. Consequently, the exponential distribution became the underlying life distribution in research on reliability and life expectancy in the 1950s. Although further research revealed that the exponential distribution is inappropriate for modeling the lifetime in a number of problems in reliability theory, it can still be useful as a first approximation (see Barlow and Proschan 1975). Exponential distributions are also used to model the length of telephone calls and the time between successive impulses in the spinal cords of various mammals.

This chapter is devoted to the study of the exponential distribution: its properties and characterizations, the models that lead to it, and examples that illustrate its applications.
2.2 Exponential Distribution

A continuous nonnegative random variable $X$ ($X \ge 0$) is said to have an exponential distribution with parameter $\lambda$, $\lambda > 0$, if its probability density function is given by
$$f(x) = \lambda e^{-\lambda x}, \quad \text{for } x \ge 0;$$
or equivalently, if its distribution function is given by
$$F(x) = \int_{-\infty}^{x} f(t)\,dt = 1 - e^{-\lambda x}, \quad \text{for } x \ge 0.$$
It follows that the survival function $\bar{F}(x)$ is given by
$$\bar{F}(x) = 1 - F(x) = e^{-\lambda x}, \quad \text{for } x \ge 0.$$
We list several properties of the exponential distribution:
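Property 1.
$$E(X) = \frac{1}{\lambda}, \quad \operatorname{Var}(X) = \frac{1}{\lambda^2},$$
and the moment generating function is given by
$$M(t) = E\left[e^{tX}\right] = \frac{\lambda}{\lambda - t},$$
for $t < \lambda$. In fact,
$$M(t) = E\left[e^{tX}\right] = \int_0^{\infty} e^{tx}\, \lambda e^{-\lambda x}\,dx = \frac{\lambda}{\lambda - t} \int_0^{\infty} (\lambda - t)\, e^{-(\lambda - t)x}\,dx = \frac{\lambda}{\lambda - t},$$
since the last integrand is an exponential density with parameter $\lambda - t$. The first two moments of $X$ can be calculated as
$$E[X] = \frac{d}{dt} M(t)\Big|_{t=0} = \frac{\lambda}{(\lambda - t)^2}\Big|_{t=0} = \frac{1}{\lambda},$$
$$E\left[X^2\right] = \frac{d^2}{dt^2} M(t)\Big|_{t=0} = \frac{2\lambda}{(\lambda - t)^3}\Big|_{t=0} = \frac{2}{\lambda^2},$$
and thus
$$\operatorname{Var}(X) = E\left[X^2\right] - (E[X])^2 = \frac{2}{\lambda^2} - \left(\frac{1}{\lambda}\right)^2 = \frac{1}{\lambda^2}.$$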
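The moment calculation in Property 1 can be reproduced numerically by differentiating the moment generating function with finite differences. This is only a sketch: the value $\lambda = 2$, the step size `h`, and the function names are illustrative assumptions.

```python
def mgf(t, lam):
    """Moment generating function lam / (lam - t) of the exponential distribution, for t < lam."""
    if t >= lam:
        raise ValueError("M(t) is finite only for t < lambda")
    return lam / (lam - t)

def moments_from_mgf(lam, h=1e-4):
    """Approximate M'(0) and M''(0) by central finite differences."""
    first = (mgf(h, lam) - mgf(-h, lam)) / (2 * h)
    second = (mgf(h, lam) - 2 * mgf(0.0, lam) + mgf(-h, lam)) / h**2
    return first, second

lam = 2.0
m1, m2 = moments_from_mgf(lam)
variance = m2 - m1**2
print(m1, m2, variance)
```

For $\lambda = 2$ the three printed values should be close to $1/\lambda = 0.5$, $2/\lambda^2 = 0.5$, and $1/\lambda^2 = 0.25$, respectively.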
Property 2. The failure rate function is $r(t) = \lambda$, i.e., a constant. In fact,
$$r(t) = \frac{f(t)}{\bar{F}(t)} = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda.$$

Property 3 (Lack of Memory). The residual lifetime $(X - t) \mid X > t$ has the same distribution as $X$. That is, for all $x, t \ge 0$,
$$P[X > x + t \mid X > t] = P[X > x].$$
This is clear by observing that
$$P[X > x + t \mid X > t] = \frac{\bar{F}(x + t)}{\bar{F}(t)} = \frac{e^{-\lambda(x + t)}}{e^{-\lambda t}} = e^{-\lambda x} = P[X > x].$$
In reliability terms, if we think of $X$ as the lifetime of some instrument, then the memoryless property implies that if the instrument is working at time $t$, then the distribution of the residual lifetime is the same as the original lifetime distribution.
In particular, the mean residual lifetime $\mu(t) = E[X - t \mid X > t]$ is the same as the original mean $\mu = E[X]$:
$$\mu(t) = E[X - t \mid X > t] = E[X] = \mu = \frac{1}{\lambda}.$$

Property 4 (Extreme Value). Suppose $X_1, \ldots, X_n$ are independent, each following the same distribution as $X$. Then $n \min(X_1, \ldots, X_n)$ also has the same exponential distribution. In fact,
$$\begin{aligned} P[n \min(X_1, \ldots, X_n) > x] &= P\left[\min(X_1, \ldots, X_n) > \frac{x}{n}\right] \\ &= P\left[X_1 > \frac{x}{n}\right] \cdots P\left[X_n > \frac{x}{n}\right] \\ &= \left(P\left[X > \frac{x}{n}\right]\right)^n \\ &= \left(e^{-\lambda x/n}\right)^n = e^{-\lambda x}. \end{aligned}$$

Example 2.1. Suppose the lifetime of certain items follows the exponential distribution with parameter $\lambda = 0.1$.

(a) From Property 1, $\mu = \frac{1}{0.1} = 10$ and $\operatorname{Var}(X) = \frac{1}{0.1^2} = 100$.

(b) The density function is
$$f(x) = 0.1 e^{-0.1x},$$
and the survival function is
$$\bar{F}(x) = e^{-0.1x}.$$
Therefore, as stated in Property 2, the failure rate is equal to
$$r(t) = \frac{f(t)}{\bar{F}(t)} = 0.1.$$

(c) Suppose one item has survived 10 units of time. Then from Property 3, the conditional survival function given $X > 10$ is
$$P[X > x + 10 \mid X > 10] = P[X > x] = e^{-0.1x}.$$
In particular, the mean residual life is $\mu(t) = \mu = 10$.

(d) (Serial system) Consider a serial system consisting of five identical items that work independently. The system fails as soon as one item fails. Then the lifetime of the system is $X = \min(X_1, X_2, \ldots, X_5)$, where $X_i$ denotes the lifetime of the $i$-th item. From Property 4, $X$ follows the exponential distribution with parameter $0.5$, so the mean system lifetime is $\frac{1}{0.5} = 2$.

Example 2.2 (Location-transformed exponential distribution). Suppose the time headway $X$ between consecutive cars on a highway during a period of heavy traffic follows the location-transformed exponential density function
$$f(x) = 0.15 e^{-0.15(x - 0.5)}, \quad \text{for } x \ge 0.5.$$

(a) $X - 0.5$ is a regular exponential random variable with parameter $\lambda = 0.15$.

(b) $E[X] = 0.5 + \frac{1}{0.15}$ and $\operatorname{Var}(X) = \frac{1}{0.15^2}$.

(c) The survival function can be calculated as
$$\bar{F}(t) = P[X > t] = P[X - 0.5 > t - 0.5] = e^{-0.15(t - 0.5)}, \quad \text{for } t \ge 0.5.$$

(d) The memoryless property still holds with a minor modification. For $t \ge 0.5$ and $x > 0$,
$$P[X > t + x \mid X > t] = P[X - 0.5 > t - 0.5 + x \mid X - 0.5 > t - 0.5] = e^{-0.15x}.$$

(e) Consequently, the mean residual life is still constant. For $t \ge 0.5$,
$$E[X - t \mid X > t] = \frac{1}{0.15}.$$
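Properties 3 and 4 can be illustrated with a small Monte Carlo experiment based on the numbers of Example 2.1 ($\lambda = 0.1$, five items in series). The sketch below is a simulation under an arbitrary seed and sample size, so its output only approximates the exact values $e^{-0.5}$ and $2$.

```python
import random
from math import exp

random.seed(0)          # arbitrary seed, fixed only for reproducibility
lam = 0.1
samples = [random.expovariate(lam) for _ in range(100_000)]

# Property 3: P[X > 15 | X > 10] should be close to P[X > 5] = exp(-0.5).
survivors = [x for x in samples if x > 10]
cond_tail = sum(x > 15 for x in survivors) / len(survivors)

# Example 2.1(d): lifetime of a series system of five independent items.
system = [min(random.expovariate(lam) for _ in range(5)) for _ in range(100_000)]
mean_system = sum(system) / len(system)

print(cond_tail, exp(-0.5))
print(mean_system)
```

`random.expovariate(lam)` draws from the exponential distribution with rate `lam`, that is, with mean `1 / lam`.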
2.3 Characterization of Exponential Distribution

It turns out that Properties 2 to 4 can all be used to characterize the exponential distribution, in the sense that if a distribution possesses one of these properties, it must be exponential.

2.3.1 Memoryless Property

Theorem 2.1. Let $X$ be a nondegenerate lifetime random variable whose distribution has the memoryless property, i.e.,
$$P[X > x + t \mid X > t] = P[X > x], \quad \text{for } x, t \ge 0.$$
Then $X$ has an exponential distribution.

Proof. The memoryless property implies that
$$\frac{\bar{F}(x + t)}{\bar{F}(t)} = \bar{F}(x), \quad \text{for } x, t \ge 0,$$
or equivalently,
$$\bar{F}(x + t) = \bar{F}(x)\bar{F}(t), \quad \text{for } x, t \ge 0.$$
By following the lines of the proof of Theorem 1.5, this implies that for any rational number $r = m/n$,
$$\bar{F}(r) = [\bar{F}(1)]^r = e^{-\lambda r},$$
where $\lambda = -\ln \bar{F}(1)$. Since $X$ is not degenerate, $\bar{F}(1) > 0$. (Otherwise, $\bar{F}(x) = 0$ for all $x > 0$, a contradiction.) For any irrational $x$, let $\{r_n\}$ be a sequence of rational numbers such that $x < r_n$ and $\lim_{n \to \infty} r_n = x$. Since $\bar{F}(x)$ is right continuous, we have
$$\bar{F}(x) = \lim_{n \to \infty} \bar{F}(r_n) = \lim_{n \to \infty} e^{-\lambda r_n} = e^{-\lambda \lim r_n} = e^{-\lambda x}. \qquad \square$$

Remark. A more delicate analysis shows that we only have to check the memoryless property at two points $t_1$ and $t_2$ such that $t_1/t_2$ is irrational and
$$P[X > x + t_i \mid X > t_i] = P[X > x], \quad \text{for } x \ge 0,\ i = 1, 2.$$
In fact, from
$$\bar{F}(x + t_i) = \bar{F}(x)\bar{F}(t_i), \quad \text{for } i = 1, 2,$$
we see that
$$\bar{F}(x + mt_1 + nt_2) = \bar{F}(x)\bar{F}(mt_1 + nt_2), \quad \text{for all } m, n \in \mathbb{Z}.$$
Without loss of generality, we assume $t_1 < t_2$. Then we can find a sequence $\{x_i\}$, $i = 1, 2, \ldots$, such that
$$\begin{aligned} t_2 &= n_0 t_1 + x_1, \\ t_1 &= n_1 x_1 + x_2, \\ x_i &= n_{i+1} x_{i+1} + x_{i+2}, \quad \text{for } i = 1, 2, \ldots. \end{aligned}$$
Note that each $x_i$ is of the form $mt_1 + nt_2$, as can be seen by reading the equations backward, and more importantly,
$$x_1 > x_2 > \cdots \to 0.$$
Thus, for any given $\epsilon > 0$, there exists a $k$ such that $0 < x_k < \epsilon$. This implies that for any $x > 0$, there exists a positive integer $j$ such that
$$(j - 1)x_k < x \le jx_k.$$
Let $z_j = jx_k$; then $z_j$ is of the form $mt_1 + nt_2$. We see that
$$0 \le z_j - x \le x_k < \epsilon.$$
Hence, we can find a monotone decreasing sequence
$$z_1 > z_2 > \cdots \to x.$$
By the right continuity of $\bar{F}(x)$, we have
$$\bar{F}(x + y) = \bar{F}(x)\bar{F}(y) \quad \text{for all } x, y \ge 0.$$
That means $\bar{F}(x)$ is exponential.

Corollary 2.1. If $X$ is a nondegenerate lifetime random variable such that $X$ has constant mean residual lifetime, i.e.,
$$\mu(t) = E[X - t \mid X > t] = \frac{1}{\bar{F}(t)} \int_t^{\infty} \bar{F}(x)\,dx = \mu, \quad \text{for } \bar{F}(t) > 0,$$
then $X$ is exponential.

Proof. The constant mean residual lifetime implies
$$\frac{1}{\mu} = \frac{\bar{F}(t)}{\int_t^{\infty} \bar{F}(x)\,dx} = -\frac{d}{dt} \ln \int_t^{\infty} \bar{F}(x)\,dx, \quad \text{for } \bar{F}(t) > 0.$$
Thus,
$$\ln \int_t^{\infty} \bar{F}(x)\,dx = -\frac{t}{\mu} + C.$$
By taking $t = 0$, we get $C = \ln \mu$. Thus,
$$\int_t^{\infty} \bar{F}(x)\,dx = \mu e^{-t/\mu},$$
and hence
$$\bar{F}(t) = e^{-t/\mu}. \qquad \square$$
2.3.2 Constant Failure Rate Function

Theorem 2.2. Let $X$ be a nonnegative random variable with probability density function $f(x)$. If $X$ has a constant failure rate function $r(t) = \lambda$, then $X$ is exponentially distributed.

This follows directly from the fact that
$$\bar{F}(x) = e^{-\int_0^x r(t)\,dt} = e^{-\lambda x}.$$
2.3.3 Extreme Value Distribution

Let $\{X_1, \ldots, X_n\}$ be independent random variables distributed as $X$, which may represent the lifetimes of $n$ similar components that work independently. Denote by $X_{k,n}$ the $k$-th order statistic, which represents the $k$-th failure time among the $n$ components.

Theorem 2.3. Let $X$ be a nondegenerate nonnegative random variable. If $nX_{1,n}$ has the same distribution as $X$ for every positive integer $n$, then $X$ has an exponential distribution.

Proof.
$$\begin{aligned} P[nX_{1,n} > x] &= P\left[X_{1,n} > \frac{x}{n}\right] \\ &= P\left[\min(X_1, \ldots, X_n) > \frac{x}{n}\right] \\ &= P\left[X_1 > \frac{x}{n}, \ldots, X_n > \frac{x}{n}\right] \\ &= \left(P\left[X > \frac{x}{n}\right]\right)^n. \end{aligned}$$
If $nX_{1,n}$ has the same distribution as $X$, then
$$\bar{F}\left(\frac{x}{n}\right) = (\bar{F}(x))^{1/n}, \quad \text{for } n = 1, 2, \ldots.$$
By an argument similar to that in Theorem 2.1, $\bar{F}(x) = e^{-\lambda x}$ for some $\lambda > 0$. $\square$

Remark 1. Similar to the remark after Theorem 2.1, we can show that if $\bar{F}(x/n) = (\bar{F}(x))^{1/n}$ holds for $n_1$ and $n_2$ such that $\frac{\ln n_1}{\ln n_2}$ is irrational, then $X$ is exponential. That means we only need two samples.
In fact, from $\bar{F}(x/n_i) = (\bar{F}(x))^{1/n_i}$ for $i = 1, 2$, we have
$$\bar{F}(x) = \left[\bar{F}\left(n_1^j n_2^k x\right)\right]^{1/(n_1^j n_2^k)},$$
for all integers $j, k \in \mathbb{Z}$. This implies that
$$\bar{F}\left(n_1^j n_2^k\right) = [\bar{F}(1)]^{n_1^j n_2^k}.$$
For any $x > 0$, we have $\ln x = y \in \mathbb{R}$. Just as in the remark following Theorem 2.1, since $\frac{\ln n_1}{\ln n_2}$ is irrational, we can find a sequence of real numbers $y_m > y$ that converges to $y$, where all $y_m$ are of the form $j \ln n_1 + k \ln n_2$. By the right continuity of $\bar{F}(x)$, we have
$$\bar{F}(x) = [\bar{F}(1)]^x.$$

Remark 2. Under some extra condition on $F(x)$ as $x \to 0$, we only need one sample to characterize the exponential distribution. The following is a typical result.

Theorem 2.4. Suppose that for some $n$, $nX_{1,n}$ has the same distribution as $X$. If
$$0 < \lim_{x \to 0^+} \frac{F(x)}{x} = \lambda < \infty,$$
then $F(x) = 1 - e^{-\lambda x}$, for $x \ge 0$.

Proof. For this sample size $n$, we have $\bar{F}(x) = (\bar{F}(x/n))^n$. Inductively, this implies that for any integer $k$,
$$\bar{F}(x) = \left(\bar{F}\left(x/n^k\right)\right)^{n^k}, \quad \text{for } x \ge 0.$$
That means
$$F(x) = 1 - \bar{F}(x) = 1 - \left(\bar{F}\left(x/n^k\right)\right)^{n^k} = 1 - \left(1 - F\left(x/n^k\right)\right)^{n^k}.$$
Since
$$\lim_{x \to 0^+} \frac{F(x)}{x} = \lambda > 0,$$
we have
$$F\left(x/n^k\right) = \lambda x/n^k + o\left(1/n^k\right),$$
as $k \to \infty$. Consequently,
$$F(x) = 1 - \lim_{k \to \infty} \left(1 - \frac{\lambda x}{n^k}\right)^{n^k} = 1 - e^{-\lambda x}. \qquad \square$$

Remark. The probability interpretation of the above inductive proof is that we can first take $n$ independent and identically distributed samples of size $n$ (a total of $n^2$ random variables), say $\left\{X_1^{(i)}, \ldots, X_n^{(i)}\right\}$, as copies of $\{X_1, \ldots, X_n\}$ for $i = 1, 2, \ldots, n$. This gives a sequence of independent variables, say $nX_{1,n}^{(i)}$ for $i = 1, 2, \ldots, n$, with the same distribution as $nX_{1,n}$. This process can be repeated for total sample sizes $n^3$, $n^4$, $\ldots$.
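The limit used at the end of the proof of Theorem 2.4 can be illustrated numerically. The sketch below drops the $o(1/n^k)$ term, that is, it assumes $F(x/n^k) = \lambda x/n^k$ exactly, and simply watches $1 - (1 - \lambda x/n^k)^{n^k}$ approach $1 - e^{-\lambda x}$; the values of `lam`, `x`, and `n` are arbitrary illustrative choices.

```python
from math import exp

def iterate_bound(lam, x, n, k):
    """1 - (1 - lam * x / n**k) ** (n**k), the k-th approximation with the o-term dropped."""
    m = n**k
    return 1.0 - (1.0 - lam * x / m) ** m

lam, x, n = 0.5, 2.0, 3   # hypothetical values
target = 1.0 - exp(-lam * x)
errors = [abs(iterate_bound(lam, x, n, k) - target) for k in range(1, 11)]

for k, err in enumerate(errors, start=1):
    print(k, err)
```

The errors should shrink roughly in proportion to $1/n^k$, reflecting how quickly $(1 - c/m)^m$ approaches $e^{-c}$.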
2.4 Order Statistics and Exponential Distribution

In life testing, when all $n$ items are tested starting from the same time, the lifetimes $X_1, \ldots, X_n$ are recorded as the order statistics $X_{1,n} \le \cdots \le X_{n,n}$. We call
$$d_{r,n} = X_{r+1,n} - X_{r,n}$$
the spacing statistics for $r = 0, 1, \ldots, n - 1$ (with the convention $X_{0,n} = 0$), and
$$D_{r,n} = (n - r)\,d_{r,n}$$
the normalized spacing statistics. Can any of the spacing statistics characterize the exponential distribution?

2.4.1 Some Properties of Order Statistics

To find the distribution function $F_{k,n}(y)$ of $X_{k,n}$, we note that the event $\{X_{k,n} \le y\}$ is equivalent to the event that at least $k$ of the $X_i$'s are $\le y$. Given that there are exactly $j$ failures at $y$ for $j \ge k$ (and thus $n - j$ survivals at $y$), there are in total $\binom{n}{j}$ combinations of $j$ out of $n$. Thus,
$$\begin{aligned} F_{k,n}(y) = P[X_{k,n} \le y] &= \sum_{j=k}^{n} P[\text{exactly } j \text{ failures before } y] \\ &= \sum_{j=k}^{n} \binom{n}{j} F^j(y)\,[1 - F(y)]^{n-j} \\ &= k \binom{n}{k} \int_0^{F(y)} t^{k-1} (1 - t)^{n-k}\,dt, \end{aligned}$$
where the last equation is obtained by integrating by parts repeatedly for $j \ge k$.
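The equality between the binomial sum and the integral form of $F_{k,n}(y)$ can be checked numerically as a function of the value $F = F(y) \in [0, 1]$. The sketch below evaluates the integral with composite Simpson's rule; the function names and the choice $n = 5$ are illustrative assumptions.

```python
from math import comb

def cdf_binomial_sum(k, n, F):
    """P[X_{k,n} <= y] as the probability of at least k failures, with F = F(y)."""
    return sum(comb(n, j) * F**j * (1 - F) ** (n - j) for j in range(k, n + 1))

def cdf_beta_integral(k, n, F, steps=2000):
    """k * C(n, k) * integral_0^F t^(k-1) (1-t)^(n-k) dt by composite Simpson (steps must be even)."""
    h = F / steps

    def g(t):
        return t ** (k - 1) * (1 - t) ** (n - k)

    total = g(0.0) + g(F)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * g(i * h)
    return k * comb(n, k) * total * h / 3

n = 5
for k in range(1, n + 1):
    print(k, cdf_binomial_sum(k, n, 0.3), cdf_beta_integral(k, n, 0.3))
```

Since the integrand is a polynomial of degree $n - 1$, Simpson's rule is accurate to many digits here, so the two columns should agree closely for every $k$.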
If $F(x)$ is absolutely continuous with density function $f(x)$, then $F_{k,n}(y)$ has density
$$f_{k,n}(y) = k \binom{n}{k} F^{k-1}(y)\,[1 - F(y)]^{n-k} f(y) = n!\, f(y)\, \frac{F^{k-1}(y)}{(k - 1)!}\, \frac{[1 - F(y)]^{n-k}}{(n - k)!}.$$
This form can also be explained as follows. There are in total $n!$ permutations of the $n$ failures. One fails at time $y$, $k - 1$ fail before $y$, and $n - k$ survive at $y$. Thus, there are in total $\frac{n!}{1!\,(k-1)!\,(n-k)!}$ combinations of observations, and each occurs with the same probability $f(y)\,[F(y)]^{k-1}\,[1 - F(y)]^{n-k}$.

More generally, using a similar explanation, the joint density of $\{X_{r_1,n}, X_{r_2,n}, \ldots, X_{r_k,n}\}$ for $1 \le r_1 < r_2 < \cdots < r_k \le n$ with $1 \le k \le n$ is given by
$$f_{(r_1, \ldots, r_k)}(y_1, y_2, \ldots, y_k) = n! \left[\prod_{i=1}^{k} f(y_i)\right] \prod_{i=0}^{k} \frac{[F(y_{i+1}) - F(y_i)]^{r_{i+1} - r_i - 1}}{(r_{i+1} - r_i - 1)!},$$
where
$$y_0 = 0,\ y_{k+1} = \infty,\ r_0 = 0,\ r_{k+1} = n + 1, \quad \text{and} \quad y_1 \le y_2 \le \cdots \le y_k.$$
In particular, the density function of the spacing statistic $d_{r,n}$ is
$$f_{d_{r,n}}(x) = \frac{n!}{(r - 1)!\,(n - r - 1)!} \int_{-\infty}^{\infty} F^{r-1}(y)\,[1 - F(x + y)]^{n-r-1} f(y) f(x + y)\,dy.$$
If $F(x)$ is exponential, we have the following explicit result:

Theorem 2.5. Let $d_{r,n}$ be the $r$-th spacing statistic from the exponential distribution $F(x) = 1 - e^{-\lambda x}$. Then
(1) $F_{d_{r,n}}(x) = P[d_{r,n} \le x] = 1 - e^{-(n-r)\lambda x}$;
(2) $E(d_{r,n}) = \frac{1}{(n-r)\lambda}$, $\operatorname{Var}(d_{r,n}) = \frac{1}{(n-r)^2 \lambda^2}$, $r = 1, 2, \ldots, n - 1$;
(3) $d_{1,n}, d_{2,n}, \ldots, d_{n-1,n}$ are mutually independent.

Proof. (1) The joint density function of $X_{r,n}$ and $X_{r+1,n}$ is
$$\begin{aligned} f_{r,r+1}(s, t) &= n!\, f(s) f(t)\, \frac{(F(s))^{r-1}}{(r - 1)!}\, \frac{(1 - F(t))^{n-r-1}}{(n - r - 1)!} \\ &= n!\, \lambda e^{-\lambda s}\, \lambda e^{-\lambda t}\, \frac{\left(1 - e^{-\lambda s}\right)^{r-1}}{(r - 1)!}\, \frac{e^{-(n-r-1)\lambda t}}{(n - r - 1)!}, \quad 0 < s < t. \end{aligned}$$
From the transformation $(X_{r,n}, X_{r+1,n}) \to (X_{r,n}, d_{r,n})$, we obtain the joint density function of $X_{r,n}$ and $d_{r,n}$, with $s = x$, $t = x + y$, and $|J| = 1$ (the Jacobian is unity), as

$$f_{X_{r,n},d_{r,n}}(x,y) = n!\, \lambda e^{-\lambda x}\, \lambda e^{-\lambda(x+y)}\, \frac{(1-e^{-\lambda x})^{r-1} e^{-(n-r-1)\lambda(x+y)}}{(r-1)!\,(n-r-1)!},$$

for $0 < x, y < \infty$. Since the form is separable in $x$ and $y$, the marginal density function for $d_{r,n}$ is

$$f_{d_{r,n}}(y) \propto e^{-(n-r)\lambda y},$$

where $\propto$ means equality up to a constant factor. Hence $F_{d_{r,n}}(x) = 1 - e^{-(n-r)\lambda x}$.

(2) follows from (1).

For (3), we note that the joint density function of $\{X_{1,n}, \ldots, X_{n,n}\}$ is

$$f_{(X_{1,n},\ldots,X_{n,n})}(x_1, \ldots, x_n) = n! \prod_{i=1}^{n} f(x_i) = n! \prod_{i=1}^{n} \lambda e^{-\lambda x_i}.$$

Consider the transformation $(X_{1,n}, \ldots, X_{n,n}) \to (X_{1,n}, d_{1,n}, \ldots, d_{n-1,n})$ with Jacobian $|J| = 1$. The joint density function of $(X_{1,n}, d_{1,n}, \ldots, d_{n-1,n})$ is

$$f_{(X_{1,n},d_{1,n},\ldots,d_{n-1,n})}(y_1, y_2, \ldots, y_n) = n!\,\lambda^n \exp[-\lambda(y_1 + (y_1+y_2) + \cdots + (y_1 + \cdots + y_n))]$$
$$= n!\,\lambda^n \exp[-\lambda(n y_1 + (n-1) y_2 + \cdots + y_n)] = n! \prod_{i=1}^{n} \lambda e^{-(n-i+1)\lambda y_i}.$$

It follows that $X_{1,n}$ and $d_{1,n}, \ldots, d_{n-1,n}$ are mutually independent. □
Corollary 2.2. For the exponential distribution, the normalized spacings $D_{r,n} = (n-r)\,d_{r,n}$, for $r = 1, 2, \ldots, n-1$, are independent and identically distributed with common distribution function $F(x) = 1 - e^{-\lambda x}$.

Proof. The independence of $D_{r,n}$ for $r = 1, \ldots, n-1$ follows from the last theorem. Since the density function of $d_{r,n}$ is $(n-r)\lambda e^{-(n-r)\lambda x}$, the density function of $D_{r,n} = (n-r)\,d_{r,n}$ is $\lambda e^{-\lambda x}$. □
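Corollary 2.2 lends itself to a quick simulation check. The following is a minimal sketch, not part of the original text: the function names, the seed, and the choices $n = 5$, $\lambda = 2$ are illustrative assumptions. It draws exponential samples, forms the normalized spacings, and averages each one; every average should be close to $1/\lambda$.

```python
import random


def normalized_spacings(n, lam, rng):
    """Return D_r = (n - r) * (X_{r+1,n} - X_{r,n}) for r = 1, ..., n-1."""
    xs = sorted(rng.expovariate(lam) for _ in range(n))  # order statistics
    # xs[r - 1] is X_{r,n} and xs[r] is X_{r+1,n} (0-based list indexing)
    return [(n - r) * (xs[r] - xs[r - 1]) for r in range(1, n)]


def spacing_means(n, lam, trials, seed=0):
    """Monte Carlo estimate of E[D_r] for each r (seed chosen arbitrarily)."""
    rng = random.Random(seed)
    totals = [0.0] * (n - 1)
    for _ in range(trials):
        for i, d in enumerate(normalized_spacings(n, lam, rng)):
            totals[i] += d
    return [s / trials for s in totals]


# Illustrative parameters: each estimated mean should be near 1 / lam = 0.5.
means = spacing_means(n=5, lam=2.0, trials=20000)
print(means)
```

A check of the means alone does not establish independence or the full distribution; it is only a sanity test of the corollary's first moment.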
Example 2.3. Consider a life test of $n$ identical components which work independently, each with an exponential life distribution with parameter $\lambda$.

(a) $X_{1,n}$ is exponential with mean $\frac{1}{n\lambda}$.

(b) In general, we can write

$$X_{k,n} = X_{1,n} + (X_{2,n} - X_{1,n}) + \cdots + (X_{k,n} - X_{k-1,n})$$
$$= \frac{1}{n}\, n X_{1,n} + \frac{1}{n-1}\,(n-1)(X_{2,n} - X_{1,n}) + \cdots + \frac{1}{n-k+1}\,(n-k+1)(X_{k,n} - X_{k-1,n}),$$

which is a weighted sum of independent exponential random variables. In particular,

$$E[X_{k,n}] = \frac{1}{\lambda}\left[\frac{1}{n} + \frac{1}{n-1} + \cdots + \frac{1}{n-k+1}\right],$$

and

$$\mathrm{Var}[X_{k,n}] = \frac{1}{\lambda^2}\left[\frac{1}{n^2} + \frac{1}{(n-1)^2} + \cdots + \frac{1}{(n-k+1)^2}\right].$$

(c) The total time on test (TTT) up to the $k$-th failure $X_{k,n}$ can be written as

$$\mathrm{TTT}(X_{k,n}) = X_{1,n} + X_{2,n} + \cdots + X_{k,n} + (n-k) X_{k,n} = n X_{1,n} + (n-1)(X_{2,n} - X_{1,n}) + \cdots + (n-k+1)(X_{k,n} - X_{k-1,n}),$$

which is a sum of $k$ independent exponential random variables with parameter $\lambda$.

(d) In general, denote by $R$ the total number of failures before time $t$. The total time on test up to time $t$ can be written as

$$\mathrm{TTT}(t) = X_{1,n} + X_{2,n} + \cdots + X_{R,n} + (n-R)\,t = n X_{1,n} + (n-1)(X_{2,n} - X_{1,n}) + \cdots + (n-R+1)(X_{R,n} - X_{R-1,n}) + (n-R)(t - X_{R,n}).$$
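The mean and variance formulas in part (b) of Example 2.3 are easy to evaluate exactly. The sketch below is an illustration only: the function name `order_stat_moments` and the use of exact rational arithmetic are assumptions, not something the text prescribes.

```python
from fractions import Fraction


def order_stat_moments(k, n, lam=1):
    """Mean and variance of X_{k,n} for n i.i.d. exponential(lam) lifetimes."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    lam = Fraction(lam)
    # The weights 1/n, 1/(n-1), ..., 1/(n-k+1) from the spacing decomposition.
    weights = [Fraction(1, n - i) for i in range(k)]
    mean = sum(weights) / lam
    var = sum(w * w for w in weights) / lam**2
    return mean, var


# With two components, the second failure time has mean 3/(2*lam)
# and variance 5/(4*lam^2); here lam = 1.
print(order_stat_moments(2, 2))
```

For $k = 1$ the result reduces to part (a): mean $1/(n\lambda)$ and variance $1/(n\lambda)^2$.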
2.4.2 Characterization Based on Order Statistics

We first give a characterization of the exponential distribution based on the first two spacing statistics.

Theorem 2.6. Let $X_1, X_2$ be independently distributed random variables with common continuous distribution $F(x)$. If $X_{1,2}$ and $X_{2,2} - X_{1,2}$ are independent, then $F(x) = 1 - e^{-\lambda x}$.
Proof. The independence of $X_{1,2}$ and $d_{1,2}$ implies

$$P[X_{2,2} - X_{1,2} > x \mid X_{1,2} = y] = P[X_{2,2} - X_{1,2} > x],$$

for all $y \ge 0$, which is free of $y$. However, the distribution of $X_{2,2}$ given $X_{1,2} = y$ is equivalent to the survival distribution of $X$ given $X > y$. Thus, for $y \ge 0$,

$$P[X_{2,2} - X_{1,2} > x \mid X_{1,2} = y] = P[X_{2,2} > x + y \mid X_{1,2} = y] = P[X > x + y \mid X > y] = \frac{\bar{F}(x+y)}{\bar{F}(y)}.$$

This implies

$$\frac{\bar{F}(x+y)}{\bar{F}(y)} = \bar{F}(x), \quad \text{for all } x, y \ge 0.$$

That means $F(x)$ is exponential, by the memoryless property. □
Remark. If we know that the normalized spacings are all exponential with $F(x) = 1 - e^{-\lambda x}$, does it imply that the population distribution is also exponential without additional assumptions? The answer is no.

Example 2.4. Let $X_1, X_2$ be independent and identically distributed, with common distribution function

$$F(x) = 1 - e^{-x}\left[1 + \frac{4}{b^2}\,(1 - \cos(bx))\right], \quad \text{for } x \ge 0,\ b \ge 2\sqrt{2}.$$

Then it can be shown that $d_{1,2} = X_{2,2} - X_{1,2} = |X_1 - X_2|$ has an exponential distribution.

We further give a result without proof.

Theorem 2.7. Let the population distribution $F(x)$ be such that $F(0^+) = 0$ and $F(x) > 0$ for all $x > 0$. Assume that

$$u_r(s) = \int_0^{\infty} e^{-sx}\, dF^r(x) \ne 0$$

for all $s$ such that $\mathrm{Re}(s) \ge 0$. If for some $r \ge 1$,

$$P[d_{r,n} \le x] = 1 - e^{-(n-r)\lambda x}, \quad \text{for } x \ge 0,$$

then $F(x) = 1 - e^{-\lambda x}$ for $x \ge 0$.
2.4.3 Record Values

Closely related to the order statistics are the record values. Let $\{X_1, \ldots, X_n, \ldots\}$ be a sequence of nonnegative random variables which are observed one by one in time order (longitudinal observations). Let $F(x)$ be their distribution function with density $f(x)$.

Definition 2.1. $X_j$ is called a record value if $X_j > \max(X_1, \ldots, X_{j-1})$.

Denote by $R_1 < R_2 < \cdots$ the successive record values, with $R_1 = X_1$. The following simple result is due to Tata (1969).

Theorem 2.8. Let $\{X_1, X_2, \ldots\}$ be a sequence of independent and identically distributed nonnegative random variables. The distribution function $F(x)$ is exponential if, and only if, $R_1 = X_1$ and $R_2 - R_1$ are independent.

Proof. Notice that given $R_1 = X_1 = x$, $R_2 - R_1 = R_2 - x$ is equivalent to the residual life $X - x \mid X > x$. Thus, the conditional distribution of $R_2 - R_1$ given $R_1 = x$ is just

$$P[R_2 - R_1 > y \mid R_1 = x] = P[X - x > y \mid X > x] = \frac{\bar{F}(x+y)}{\bar{F}(x)},$$

for $x, y \ge 0$. Thus, $R_1$ and $R_2 - R_1$ are independent if, and only if, $\bar{F}(x+y)/\bar{F}(x)$ is free of $x$. That means

$$\frac{\bar{F}(x+y)}{\bar{F}(x)} = \bar{F}(y),$$

by letting $x = 0$. □
2.5 More Applications

Example 2.5 (Time of accident). Insurance companies collect accident records (histories) of drivers. A driver is placed in the "good" category if his probability of having an accident (in the statistical sense) remains the same small number regardless of the passage of time. Therefore, if $X$ denotes the random time period up to the first accident of the driver in question, then

$$P[X > x + y \mid X > x] = P[X > y].$$

This is exactly the memoryless property. In this case, the insurance company does not need to guess the risk of such a driver, but can go ahead with a well-defined model for setting the premium based on the driver's record.

Example 2.6. Suppose a store serves a community of $n$ persons. These persons visit the store independently of each other, and their actual times of entering the
store have the same distribution. Therefore, the $n$ individuals can be associated with $n$ independent and identically distributed random variables, say $\{X_1, \ldots, X_n\}$. The store owner observes the order statistics $\{X_{1,n}, \ldots, X_{n,n}\}$ as the successive arrival times at the store. Assume the store owner observes that the spacing statistics $\{X_{1,n}, d_{1,n}, \ldots, d_{n-1,n}\}$ are also independent. Then from Theorem 2.9, $X_j$ has an exponential distribution. This also provides a well-defined model and a basis for decisions on the number of employees, availability of items, etc.

Example 2.7 (Geometric Sums). Consider a system with exponential lifetime with parameter $\lambda$. Upon a failure, a repair restores the system to like-new condition with probability $p$, and the failure is irreparable with probability $1-p$. Denote by $N$ the total number of working periods, which is a geometric random variable (Problem 2.8) with success probability $1-p$. That means

$$P[N = k] = (1-p)\,p^{k-1}, \quad \text{for } k = 1, 2, \ldots.$$

Denote by $X_1, \ldots, X_N$ the corresponding working period lengths, which are independent exponential random variables. Then the total lifetime of the system is

$$Y = \sum_{i=1}^{N} X_i.$$

(a) By conditioning on the values of $N$, we can calculate the mean of $Y$ as

$$E[Y] = \sum_{k=1}^{\infty} E\left[\sum_{i=1}^{k} X_i\right] P[N = k] = \sum_{k=1}^{\infty} \frac{k}{\lambda}\, P[N = k] = \frac{1}{\lambda(1-p)}.$$

(b) Generally, we can calculate the moment generating function of $Y$ as

$$E\left[e^{tY}\right] = \sum_{k=1}^{\infty} E\left[e^{t\sum_{i=1}^{k} X_i}\right] P[N = k] = \sum_{k=1}^{\infty} \left(\frac{\lambda}{\lambda - t}\right)^{k} (1-p)\,p^{k-1}$$
$$= \frac{(1-p)\lambda}{\lambda - t} \sum_{k=1}^{\infty} \left(\frac{p\lambda}{\lambda - t}\right)^{k-1} = \frac{(1-p)\lambda}{(1-p)\lambda - t}.$$
(c) By matching the moment generating functions, we see that $Y$ is also an exponential random variable, with parameter $(1-p)\lambda$. In other words, a geometric sum of independent and identically distributed exponential random variables is still exponential.

Example 2.8 (Valuation of defaultable Zero-Coupon Bonds).

(a) Consider a unit of a bond of face value \$1.00 with maturity time $T$. Suppose the interest (yield) rate is a constant $r$. Then at time $t$, the value of the bond is $e^{-r(T-t)}$ (discounted to time $t$) if no default is assumed.

(b) Suppose the bond's default time $X$ follows an exponential distribution with parameter $\lambda$ and there is no recovery at default. Then the value of this defaultable bond at time $t$, given that it has not defaulted, is

$$e^{-r(T-t)}\, P[X > T \mid X > t] = e^{-(r+\lambda)(T-t)}.$$

(c) Furthermore, we consider the partial recovery case by assuming that a proportion $w$ of the market value will be recovered at default. Thus, the (discounted) value at time $t$, given no default up to time $t$, can be evaluated as

$$e^{-r(T-t)}\, P[X > T \mid X > t] + E\left[e^{-r(X-t)}\, w\, e^{-r(T-X)}\, I[X < T] \mid X > t\right]$$
$$= e^{-(r+\lambda)(T-t)} + w\, e^{-r(T-t)}\, P[X < T \mid X > t]$$
$$= e^{-(r+\lambda)(T-t)} + w\, e^{-r(T-t)}\left(1 - e^{-\lambda(T-t)}\right)$$
$$= (1-w)\, e^{-(r+\lambda)(T-t)} + w\, e^{-r(T-t)}.$$

Thus, the value at time $t$ is a mixture of the values under no recovery and under full recovery (no default).

Example 2.9 (Two-Unit Reliability Systems).

(a) (Serial System) Suppose the two units are connected in a serial system and work independently. That means the system fails as soon as one unit fails. Suppose the lifetimes $X_1$ and $X_2$ follow exponential distributions with parameters $\lambda_1$ and $\lambda_2$, respectively. Then the system has an exponential lifetime $\min(X_1, X_2)$ with parameter $\lambda_1 + \lambda_2$.

(b) (Parallel System or Warm Standby System) Suppose the system works as long as at least one unit is working. Then the system's lifetime $\max(X_1, X_2)$ follows the distribution

$$F(x) = P[\max(X_1, X_2) \le x] = (1 - e^{-\lambda_1 x})(1 - e^{-\lambda_2 x}).$$

(c) (Cold-Redundant System) Suppose the second unit is in cold standby and is put into operation as soon as the first unit fails. Therefore, the system lifetime is $X_1 + X_2$ and its distribution function is the convolution of the two exponential distributions:
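$$P[X_1 + X_2 \le y] = \int_0^y P[X_2 \le y - x]\, dP[X_1 \le x] = \int_0^y \lambda_1 e^{-\lambda_1 x}\left(1 - e^{-\lambda_2 (y-x)}\right) dx$$
$$= 1 - e^{-\lambda_1 y} - \frac{\lambda_1}{\lambda_1 - \lambda_2}\, e^{-\lambda_2 y}\left(1 - e^{-(\lambda_1 - \lambda_2) y}\right).$$

In particular, when $\lambda_1 = \lambda_2 = \lambda$,

$$P[X_1 + X_2 \le y] = 1 - (1 + \lambda y)\, e^{-\lambda y}.$$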
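The closed form for the cold-standby lifetime in Example 2.9(c) can be compared against a direct numerical evaluation of the convolution integral. The sketch below is illustrative: the function names, the midpoint rule, and the step count are assumptions made for this check.

```python
import math


def cold_standby_cdf(y, lam1, lam2):
    """Closed-form P[X1 + X2 <= y] for independent exponential lifetimes."""
    if y <= 0:
        return 0.0
    if math.isclose(lam1, lam2):
        # Equal-rate case: 1 - (1 + lam*y) * exp(-lam*y)
        return 1.0 - (1.0 + lam1 * y) * math.exp(-lam1 * y)
    return (1.0 - math.exp(-lam1 * y)
            - lam1 / (lam1 - lam2) * math.exp(-lam2 * y)
            * (1.0 - math.exp(-(lam1 - lam2) * y)))


def convolution_cdf(y, lam1, lam2, steps=20000):
    """Midpoint-rule value of the integral of
    lam1*exp(-lam1*x) * (1 - exp(-lam2*(y - x))) over 0 <= x <= y."""
    h = y / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += lam1 * math.exp(-lam1 * x) * (1.0 - math.exp(-lam2 * (y - x)))
    return total * h


# The two evaluations should agree closely for unequal and equal rates.
print(cold_standby_cdf(2.0, 1.0, 3.0), convolution_cdf(2.0, 1.0, 3.0))
print(cold_standby_cdf(2.0, 1.5, 1.5), convolution_cdf(2.0, 1.5, 1.5))
```

The equal-rate branch is handled separately because the general expression has $\lambda_1 - \lambda_2$ in a denominator.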
Problems

1. Suppose the lifetime of a certain model of car battery follows an exponential distribution with a mean lifetime of 5 years. (a) Write down the survival function. (b) What is the probability that the lifetime will be over 2 years? (c) What is the probability that the battery will work more than 4 years given that it has worked at least 2 years?

2. The time to repair a machine is an exponentially distributed random variable with parameter $\lambda = 1/2$. What is (a) the probability that the repair time exceeds 2 h; (b) the conditional probability that a repair takes at least 10 h, given that its duration exceeds 9 h?

3. Let $X$ and $Y$ be independent $\mathrm{Exp}(\lambda)$. Prove that the density function of $Z = X/Y$ is given by $h(z) = (1+z)^{-2}$, $z > 0$.

4. Let $X$ be a nonnegative random variable such that $P[X > 0] > 0$. Then $X$ is exponential if and only if $E[X \mid X > a] = a + E[X]$ for all $a \ge 0$.

5. Let $X$ be an exponential random variable with rate $\lambda$. (a) Use the definition of conditional expectation to determine $E[X \mid X \le c]$. (b) Now determine $E[X \mid X \le c]$ by using the following identity: $E[X] = E[X \mid X \le c]\, P[X \le c] + E[X \mid X > c]\, P[X > c]$.
6. Let $X_1$ and $X_2$ be independent exponential random variables with the same rate $\lambda$. Let $X_{(1)} = \min(X_1, X_2)$ and $X_{(2)} = \max(X_1, X_2)$ be the order statistics. Find (a) the mean and variance of $X_{(1)}$; (b) the mean and variance of $X_{(2)}$.

7. Let $X_1, \ldots, X_n$ be $n$ independent and identically distributed exponential variables with rate $\lambda$, and let $X_{1,n}, \ldots, X_{n,n}$ be the order statistics. (a) What are the mean and variance of $X_{1,n}$? (b) Using the property of spacing statistics, find $E[X_{k,n}]$ and $\mathrm{Var}(X_{k,n})$.

8. Suppose that independent trials, each having a success probability $p$ ($0 < p < 1$), are performed until a success occurs. Denote by $X$ the number of trials needed. Then $P[X = k] = (1-p)^{k-1} p$, $k = 1, 2, \ldots$, is called the geometric probability distribution. (a) Calculate the mean and variance of $X$. (b) Calculate $P[X > k]$. (c) Show that $X$ has a similar memoryless property in the sense that $P[X > k + m \mid X > k] = P[X > m]$, $k, m \ge 1$. (d) Calculate the mean and variance of $X$.

9. Suppose a certain kind of insurance claim follows an exponential distribution with rate $\lambda = 0.02$. Assume a deductible $D = 400$. Refer to Problem 11 of Chap. 1. (a) Find the expected payment $E[X_D]$ from the insurer. (b) Find the expected payment $E[X_{DF}]$ under the franchise deductible.

10. By conditioning on the value of the first record value $R_1$, find the conditional distribution function of the second record value $R_2$.

11. If $X$ is an exponential random variable with parameter $\lambda = 1$, compute the probability density function of the random variable $Y$ defined by $Y = \log X$.

12. If $X$ is uniformly distributed over $(0, 1)$, find the density function of $Y = e^X$.

13. The magnitude $M$ of an earthquake, as measured on the Richter scale, is a random variable. Suppose the excess $M - 3.25$ for magnitudes bigger than 3.25 is roughly exponential with mean 0.59 (or $\lambda = 1/0.59$). (a) Find the probability that an earthquake has magnitude larger than 5. (b) Given that an earthquake has magnitude larger than 5, what is the conditional probability that its magnitude is larger than 7?
14. The duration of pauses (and the duration of vocalizations) that occur in a monologue follows an exponential distribution with mean 0.70 s. (a) What is the variance of the duration of pauses? (b) Given that the duration of a pause is longer than 1.0 s, what is the expected total duration?

15. Suppose the time $X$ between two successive arrivals at the drive-up window of a local bank is an exponential random variable with $\lambda = 0.2$. (a) What is the expected time between two successive arrivals? (b) Find the probability $P[X > 4]$. (c) Find the probability $P[2 < X < 6]$.

16. Consider a parallel system of two similar units, that is, the two units follow the same exponential distribution with parameter $\lambda$. (a) What is the distribution function of the system's lifetime? (b) What is the density function? (c) Calculate the failure rate.

17. Consider a cold standby system of two similar units, that is, the two units have the same exponential distribution with parameter $\lambda$. (a) What is the distribution function of the system's lifetime? (b) What is the density function of the system's lifetime? (c) Calculate the failure rate and verify that it is monotone increasing.

18. Suppose $X$ and $Y$ are two independent exponential random variables with parameters $\lambda$ and $\delta$, respectively. (a) Show that $\min(X, Y)$ is exponential with parameter $\lambda + \delta$. (b) Show that $P[X > Y] = \frac{\delta}{\lambda + \delta}$. (c) Show by the memoryless property that given $X > Y$, $Y$ and $X - Y$ are independent. Thus, $E[Y \mid X > Y] = E[\min(X, Y)]$. (d) By extending (c), show that $\min(X, Y)$ and $|X - Y|$ are independent.

19. Suppose a bank branch has two tellers and their service times for each customer are exponential with parameters 0.2 and 0.25, respectively. When a customer arrives at the branch, he finds that both tellers are busy and no customers are waiting. (a) What is the distribution of his waiting time? (b) What is the probability that he will be served by the first teller?

20. In deciding upon the appropriate premium to charge, insurance companies sometimes use the exponential principle, defined as follows. If $X$ denotes the amount of claims which the company has to pay, then the premium charged by the insurance company will be
$$P = \frac{1}{a} \ln E\left[e^{aX}\right],$$

where $a$ is some specified positive constant. Suppose $X$ is exponential with parameter $\lambda$ and let $a = \alpha\lambda$ for some $0 < \alpha < 1$. Calculate the value of $P$.

21. Refer to Problem 1.16. Suppose $F(x)$ is exponential with parameter $\lambda$. (a) Find the Value at Risk at level 99%; (b) Find the corresponding expected total loss $T_l = E[L \mid L \ge l]$.

22. Let $Y$ be an exponential random variable and let $X_1$ and $X_2$ be two independent positive random variables. Show by the memoryless property of $Y$ that $P[Y > X_1 + X_2] = P[Y > X_1]\, P[Y > X_2]$.

23. Let $X_1, \ldots, X_n, \ldots$ be a sequence of nonnegative independent and identically distributed random variables. Define $N$ as the first time the sequence stops decreasing, i.e., $N = \inf\{n \ge 2 : X_1 \ge X_2 \ge \cdots \ge X_{n-1} \le X_n\}$. (a) Show that for $n \ge 2$, $P[N \ge n] = P[X_1 \ge X_2 \ge \cdots \ge X_{n-1}] = \frac{1}{(n-1)!}$. (b) Find $E[N]$.
Chapter 3
Poisson Process
3.1 Poisson Process as a Counting Process

Consider a certain event that occurs consecutively at random time points $0 < T_1 < T_2 < \cdots < T_n < \cdots$. For convenience, we write $T_0 = 0$. Denote by $X_i = T_i - T_{i-1}$, $i = 1, 2, \ldots$, the interarrival times. Let $N(t)$ be the total number of occurrences of the event in the time interval $(0, t]$, $t > 0$, i.e.,

$$N(t) = \sup\{n : T_n \le t\}, \quad \text{for } t > 0.$$

We call $\{N(t), t \ge 0\}$ a counting process. For each $t > 0$, $N(t)$ could represent the number of accidents at a particular intersection, the number of times a computer breaks down, or similar counts. For example:

• If we let $N(t)$ be the number of persons who have entered a particular store at or prior to time $t$, then $\{N(t), t \ge 0\}$ is a counting process.
• If $N(t)$ equals the number of goals that a given soccer player has scored by time $t$, then $\{N(t), t \ge 0\}$ is a counting process. An event of this process occurs whenever the soccer player scores a goal.

A counting process $N(t)$ has the following properties:

1. The state space is the set of nonnegative integers $\{0, 1, 2, 3, \ldots\}$.
2. It is nondecreasing with time, that is, if $s < t$, then $N(s) \le N(t)$.
3. It increases by 1 whenever an event occurs.
4. For $s < t$, $N(t) - N(s)$ is equal to the number of events that have occurred in the interval $(s, t]$.

There are a few important notions of counting process we need to introduce.
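As a small illustration of the definition $N(t) = \sup\{n : T_n \le t\}$ and of property 4, the sketch below counts events from a sorted list of arrival times. The function name and the sample arrival times are hypothetical, chosen only for this example.

```python
from bisect import bisect_right


def count_events(arrival_times, t):
    """N(t): number of arrival times T_n with T_n <= t (list must be sorted)."""
    return bisect_right(arrival_times, t)


# Hypothetical arrival times T_1 < T_2 < T_3 < T_4.
arrivals = [0.7, 1.9, 2.4, 5.0]

print(count_events(arrivals, 0.5))   # no event yet
print(count_events(arrivals, 0.7))   # the event at t = 0.7 is included
# Number of events in the half-open interval (1.9, 5.0]:
print(count_events(arrivals, 5.0) - count_events(arrivals, 1.9))
```

Using `bisect_right` makes the count right-continuous, which matches the convention that an event occurring exactly at time $t$ is counted in $N(t)$.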
A.K. Gupta et al., Probability and Statistical Models: Foundations for Problems in Reliability and Financial Mathematics, DOI 10.1007/978-0-8176-4987-6 3, c Springer Science+Business Media, LLC 2010
Definition 3.1. Let $\{N(t), t \ge 0\}$ be a counting process.

1. $\{N(t), t \ge 0\}$ has independent increments if the numbers of events that occur in disjoint time intervals are independent, i.e., for any $t_0 < t_1 < \cdots < t_n$, the random variables $N(t_i) - N(t_{i-1})$ and $N(t_j) - N(t_{j-1})$ are independent for $i \ne j$.
2. $\{N(t), t \ge 0\}$ has stationary increments if the distribution of the number of events that occur in any interval of time depends only on the length of the time interval, i.e., $N(t_2 + s) - N(t_1 + s)$ and $N(t_2) - N(t_1)$ have identical distributions for all $t_1 < t_2$ and $s > 0$.

If we look at the two examples above, the independent increments assumption seems reasonable in both cases, as long as the soccer player's chance of scoring does not depend on "how he has been going" that day. The stationary increments assumption needs further examination to be justified.

Now we are ready to define the Poisson process.

Definition 3.2. A counting process $\{N(t), t \ge 0\}$ is said to be a Poisson process with rate $\lambda$, $\lambda > 0$, if

1. $N(0) = 0$.
2. The process has stationary and independent increments.
3. For any $t \ge 0$, $N(t)$ has a Poisson distribution with mean $\lambda t$, i.e.,

$$P[N(t) = k] = \frac{(\lambda t)^k}{k!}\, e^{-\lambda t}, \quad k = 0, 1, 2, \ldots.$$

Let us look at several properties of the Poisson process.

Theorem 3.1. Let $\{N(t), t \ge 0\}$ be a Poisson process with rate $\lambda$. Then

1. $\displaystyle \lim_{t \to 0} \frac{1 - P[N(t) = 0]}{t} = \lambda$.
2. $\displaystyle \lim_{t \to 0} \frac{P[N(t) = 1]}{t} = \lambda$.
3. $P[N(t) \ge 2] \le (\lambda t)^2$.
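Proof. Let $\{N(t), t \ge 0\}$ be a Poisson process with rate $\lambda$. Then

$$\lim_{t \to 0} \frac{1 - P[N(t) = 0]}{t} = \lim_{t \to 0} \frac{1 - e^{-\lambda t}\,\frac{(\lambda t)^0}{0!}}{t} = \lim_{t \to 0} \frac{1 - e^{-\lambda t}}{t} = \lim_{t \to 0} \lambda e^{-\lambda t} = \lambda,$$

and

$$P[N(t) \ge 2] = \sum_{k=2}^{\infty} P[N(t) = k] = \sum_{k=2}^{\infty} \frac{e^{-\lambda t} (\lambda t)^k}{k!} \le e^{-\lambda t} (\lambda t)^2 \sum_{k=2}^{\infty} \frac{(\lambda t)^{k-2}}{(k-2)!}$$
$$= e^{-\lambda t} (\lambda t)^2 \sum_{j=0}^{\infty} \frac{(\lambda t)^j}{j!} = e^{-\lambda t} (\lambda t)^2 e^{\lambda t} = (\lambda t)^2.$$

The second property follows from the fact that

$$P[N(t) = 1] = 1 - P[N(t) = 0] - P[N(t) \ge 2].$$ □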
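The three statements of Theorem 3.1 can be checked numerically for a small interval length. The following sketch is illustrative only; the function name and the parameter values $\lambda = 3$, $t = 10^{-4}$ are assumptions.

```python
import math


def small_interval_summary(lam, t):
    """Return ((1 - p0)/t, p1/t, P[N(t) >= 2], (lam*t)^2) for Poisson mean lam*t."""
    m = lam * t
    one_minus_p0 = -math.expm1(-m)      # 1 - exp(-m), computed stably
    p1 = m * math.exp(-m)               # P[N(t) = 1]
    p2_plus = one_minus_p0 - p1         # P[N(t) >= 2]
    return one_minus_p0 / t, p1 / t, p2_plus, m * m


ratio0, ratio1, tail, bound = small_interval_summary(lam=3.0, t=1e-4)
print(ratio0, ratio1)   # both should be close to lam = 3
print(tail, bound)      # the tail probability should not exceed the bound
```

For small $t$ the tail probability behaves like $(\lambda t)^2 / 2$, so the bound in part 3 is off by roughly a factor of two in this regime.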
3.2 Characterization of Poisson Processes as Counting Processes

We first show that to determine whether a counting process with stationary and independent increments and $N(0) = 0$ is a Poisson process, we only need to check whether properties (2) and (3) in Theorem 3.1 hold. Then we will examine the conditions of "stationary and independent increments" to see whether they can be further weakened.

Theorem 3.2. Let $\{N(t), t \ge 0\}$ be a counting process that satisfies

1. $N(0) = 0$.
2. The process has stationary and independent increments.
3. $\displaystyle \lim_{t \to 0} \frac{P[N(t) = 1]}{t} = \lambda$, and $\lambda > 0$.
4. $P[N(t) \ge 2] \le (\lambda t)^2$.

Then $\{N(t), t \ge 0\}$ is a Poisson process.

Proof. From the definition of the Poisson process, we only need to show that for each $t > 0$, $N(t)$ has a Poisson distribution with mean $\lambda t$. To this end, let

$$p_n(t) = P[N(t) = n].$$
We will show by induction that

$$p_n(t) = e^{-\lambda t}\, \frac{(\lambda t)^n}{n!},$$

for all $t > 0$ and $n = 0, 1, 2, \ldots$. When $n = 0$, the event $N(s+t) = 0$ means that there is no occurrence in the interval $(0, s+t]$, where $s, t \ge 0$. This is equivalent to $N(s) = 0$ and $N(s+t) - N(s) = 0$. Since the counting process $\{N(t), t \ge 0\}$ has stationary and independent increments, the random variables $N(s+t) - N(s)$ and $N(t)$ are identically distributed, and hence

$$P[N(s+t) = 0] = P[N(s) = 0,\ N(s+t) - N(s) = 0] = P[N(s) = 0]\, P[N(s+t) - N(s) = 0] = P[N(s) = 0]\, P[N(t) = 0].$$

It follows that

$$p_0(s+t) = p_0(s)\, p_0(t),$$

for all $s, t \ge 0$. From this Cauchy functional equation, we have $p_0(t) = e^{-\delta t}$ for some $\delta > 0$. Since

$$p_0(t) = 1 - \sum_{k=1}^{\infty} p_k(t),$$

conditions (3) and (4) imply that

$$e^{-\delta t} = 1 - \lambda t + o(t), \quad \text{as } t \to 0,$$

and thus $\delta = \lambda$. Note that $p_0(t) = e^{-\lambda t}$ means that the time $T_1$ of the first occurrence is exponentially distributed, i.e.,

$$P[T_1 < t] = 1 - e^{-\lambda t}.$$

Suppose for $t > 0$,

$$p_n(t) = e^{-\lambda t}\, \frac{(\lambda t)^n}{n!}.$$

Then by conditioning on the time $T_1 = s < t$ and using the independent and stationary increments property, we have

$$p_{n+1}(t) = P[N(t) = n+1] = \int_0^t P[N(t) = n+1 \mid T_1 = s]\, \lambda e^{-\lambda s}\, ds$$
$$= \int_0^t P[N(t) - N(s) = n \mid T_1 = s]\, \lambda e^{-\lambda s}\, ds = \int_0^t P[N(t-s) = n]\, \lambda e^{-\lambda s}\, ds$$
$$= \int_0^t \frac{(\lambda(t-s))^n}{n!}\, e^{-\lambda(t-s)}\, \lambda e^{-\lambda s}\, ds = \lambda^{n+1} e^{-\lambda t} \int_0^t \frac{(t-s)^n}{n!}\, ds = \frac{(\lambda t)^{n+1}}{(n+1)!}\, e^{-\lambda t}.$$
□

Remark. Condition (4) can be weakened to

$$P[N(t) \ge 2] = o(t) \quad \text{as } t \to 0.$$

If $N(t)$ has a Poisson distribution with mean $\lambda t$, then $P[N(t) = 0] = e^{-\lambda t}$, so that $P[N(0) = 0] = 1$. That means $N(0) = 0$ with probability 1. Therefore, the condition $N(0) = 0$ is unnecessary. Condition (3) is only a local condition. It means that under the property of stationary and independent increments, only local conditions are necessary to characterize the Poisson process.

Let us now turn to the conditions of stationary and independent increments. The property of independent increments implies that for any $t_1 < t_2 < t_3$ and nonnegative integers $m$ and $n$,

$$P[N(t_3) - N(t_2) = m,\ N(t_2) - N(t_1) = n] = P[N(t_3) - N(t_2) = m]\, P[N(t_2) - N(t_1) = n].$$

The following result shows that we just need to assume the equality for $m = n = 0$. For notation, we will use $N(I)$ to denote the number of occurrences in the interval $I$ and $|I|$ the length of $I$.

Theorem 3.3. Let $\{N(t), t \ge 0\}$ be a counting process that satisfies

$$P[N(I_i) = 0;\ i = 1, 2, \ldots, n] = \prod_{i=1}^{n} P[N(I_i) = 0],$$

for any disjoint intervals $I_1, I_2, \ldots, I_n$, and

$$P[N(s+t) - N(s) \ge 2] \le (\lambda t)^2$$

for some $\lambda > 0$ and all $s, t \ge 0$. Then $\{N(t), t \ge 0\}$ has independent increments.
Proof. We will prove that for any $0 \le t_1 < t_2 < \cdots < t_k < t_{k+1}$ and intervals $I_i = (t_i, t_{i+1}]$, $1 \le i \le k$,

$$P[N(I_i) = n_i,\ 1 \le i \le k] = \prod_{i=1}^{k} P[N(I_i) = n_i],$$

where the $n_i$ are nonnegative integers, $1 \le i \le k$. Let us divide each interval $I_i$ into $r$ equal parts, denoted by $I_{i,j}$, $1 \le i \le k$, $1 \le j \le r$. Define $e_{i,j}$ to be the indicator random variable of $N(I_{i,j})$. That means

$$e_{i,j} = \begin{cases} 0 & \text{if } N(I_{i,j}) = 0, \\ 1 & \text{if } N(I_{i,j}) \ge 1. \end{cases}$$

Then the $\{e_{i,j}\}$ are all mutually independent for $1 \le i \le k$, $1 \le j \le r$. If we let

$$E_i = \sum_{j=1}^{r} e_{i,j}$$

be the number of subintervals of $I_i$ which contain at least one event, then $E_1, E_2, \ldots, E_k$ are mutually independent and each takes values in $\{0, 1, 2, \ldots, r\}$. This implies that

$$P[E_i = n_i,\ 1 \le i \le k] = \prod_{i=1}^{k} P[E_i = n_i].$$

Since $N(I_{i,j}) \ge e_{i,j}$, and $N(I_{i,j}) \ne e_{i,j}$ only if $N(I_{i,j}) \ge 2$, we have $N(I_i) \ge E_i$, and $N(I_i) \ne E_i$ only if $N(I_{i,j}) \ge 2$ for at least one $j$, $1 \le j \le r$. Thus,

$$|P[N(I_i) = n_i] - P[E_i = n_i]| \le P[N(I_i) \ne E_i] \le P[N(I_{i,j}) \ge 2 \text{ for at least one } 1 \le j \le r]$$
$$\le \sum_{j=1}^{r} P[N(I_{i,j}) \ge 2] \le \lambda^2 \sum_{j=1}^{r} |I_{i,j}|^2 = \lambda^2 \sum_{j=1}^{r} \frac{|I_i|^2}{r^2} = \frac{\lambda^2 |I_i|^2}{r}.$$
It follows that

$$|P[N(I_i) = n_i,\ 1 \le i \le k] - P[E_i = n_i,\ 1 \le i \le k]| \le \sum_{i=1}^{k} |P[N(I_i) = n_i] - P[E_i = n_i]| \le \frac{\lambda^2}{r} \sum_{i=1}^{k} |I_i|^2.$$

Letting $r \to \infty$ yields that

$$|P[N(I_i) = n_i] - P[E_i = n_i]| = 0,$$

and

$$|P[N(I_i) = n_i,\ 1 \le i \le k] - P[E_i = n_i,\ 1 \le i \le k]| = 0.$$

Therefore,

$$P[N(I_i) = n_i,\ 1 \le i \le k] = P[E_i = n_i,\ 1 \le i \le k] = \prod_{i=1}^{k} P[E_i = n_i] = \prod_{i=1}^{k} P[N(I_i) = n_i].$$
□

Remark. From the above theorem, the conditions for $\{N(t), t \ge 0\}$ to be a Poisson process can be weakened to:

1. $\{N(t), t \ge 0\}$ has stationary increments.
2. For any $0 \le t_1 < t_2 < \cdots < t_{k+1}$,
$$P[N(t_{i+1}) - N(t_i) = 0;\ 1 \le i \le k] = \prod_{i=1}^{k} P[N(t_{i+1}) - N(t_i) = 0].$$
3. $\displaystyle \lim_{t \to 0} \frac{P[N(t) = 1]}{t} = \lambda$ and $P[N(t) \ge 2] \le (\lambda t)^2$.

In fact, condition (2) and the second part of condition (3) above imply that $N(t)$ has independent increments, while condition (3) together with condition (1) and independent increments yields that $N(t)$ has a Poisson distribution with mean $\lambda t$.

We now turn to the condition of stationary increments. The property of stationary increments means that for any $s$, $t$, and $n$,

$$P[N(s+t) - N(s) = n] = P[N(t) = n].$$

In fact, we only need the above equality to hold for $n = 0$.
Theorem 3.4. A counting process $\{N(t), t \ge 0\}$ is a Poisson process if it satisfies

1. $P[N(s+t) - N(s) = 0] = P[N(t) = 0]$, $s, t \ge 0$.
2. For disjoint intervals $I_i$, $i = 1, 2, \ldots, k$,
$$P[N(I_i) = 0,\ 1 \le i \le k] = \prod_{i=1}^{k} P[N(I_i) = 0].$$
3. For any interval $I$, $P[N(I) \ge 2] \le b\,|I|^2$ for some $b > 0$, where $|I|$ is the length of the interval $I$.

Proof. From Theorem 3.3, conditions (2) and (3) here imply that $N(t)$ has independent increments. In the following, we shall show by induction that

$$P[N(s+t) - N(s) = n] = P[N(t) = n] = e^{-\lambda t}\, \frac{(\lambda t)^n}{n!}, \quad n = 0, 1, 2, \ldots,\ s, t > 0.$$

For $n = 0$, conditions (1) and (2) imply that

$$P[N(s+t) = 0] = P[N(s) = 0,\ N(s+t) - N(s) = 0] = P[N(s) = 0]\, P[N(s+t) - N(s) = 0] = P[N(s) = 0]\, P[N(t) = 0].$$

This Cauchy functional equation for $g(t) = P[N(t) = 0]$ gives

$$P[N(t) = 0] = e^{-\lambda t},$$

for some $\lambda > 0$. (Here, we implicitly exclude all the trivial cases, which implies that $N(0) = 0$.) In particular, this implies that the time to the first occurrence after any specific point is exponential with rate $\lambda$.

Suppose the conclusion is true for $n$. Then for $n + 1$, by conditioning on the time, say $s + x$, of the first occurrence after time $s$, we have

$$P[N(t+s) - N(s) = n+1] = \int_0^t P[N(t+s) - N(s) = n+1 \mid \text{first occurrence at } s+x]\, \lambda e^{-\lambda x}\, dx$$
$$= \int_0^t P[N(t+s) - N(s+x) = n]\, \lambda e^{-\lambda x}\, dx = \int_0^t P[N(t-x) = n]\, \lambda e^{-\lambda x}\, dx$$
$$= \int_0^t \frac{(\lambda(t-x))^n}{n!}\, e^{-\lambda(t-x)}\, \lambda e^{-\lambda x}\, dx = \int_0^t \frac{\lambda\,(\lambda(t-x))^n}{n!}\, dx\; e^{-\lambda t} = \frac{(\lambda t)^{n+1}}{(n+1)!}\, e^{-\lambda t}.$$
□

Before we close this section, let us compare the conditions in Theorem 3.2 and Theorem 3.4. Given the "stationary increments" in condition (2) of Theorem 3.2, condition (4) there is equivalent to condition (3) in Theorem 3.4. In that sense, Theorem 3.4 is the strongest characterization theorem of all, since it has the weakest conditions. From an application point of view, the conditions in Theorem 3.4 are all reasonable in the sense that they can be verified in, or derived from, practical models, as will be seen in a later section.
3.3 Poisson Process as a Renewal Process

A counting process $\{N(t)\}$ is called a renewal (counting) process if the interarrival times $\{T_i - T_{i-1} = X_i\}$, $i = 1, 2, \ldots$, are independent and identically distributed with distribution function $F(x)$, called the underlying distribution. Renewal processes have been widely used to model the failure times of repairable systems and claim arrival times in insurance risk models. A more systematic discussion of renewal theory is given in Chap. 8. We first show that the Poisson process is a special case of a renewal process.

Theorem 3.5. If $\{N(t), t \ge 0\}$ is a Poisson process with rate $\lambda$, then it is a renewal process with exponential underlying distribution with mean $1/\lambda$.

Proof. It is clear that

$$P[X_1 > t] = P[T_1 > t] = P[N(t) = 0] = e^{-\lambda t}.$$

Generally, given $T_1 = t_1, \ldots, T_n = t_n$, the stationary and independent increments property gives

$$P[X_{n+1} > t \mid T_1 = t_1, \ldots, T_n = t_n] = P[N(t + t_n) - N(t_n) = 0 \mid N(t_n) = n] = P[N(t) = 0] = e^{-\lambda t},$$

which is free of $T_1, \ldots, T_n$. That means $\{X_i\}$, $i = 1, 2, \ldots$, are independent and identically distributed exponential random variables with mean $1/\lambda$. □
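Theorem 3.5 suggests a simple way to generate a path of a Poisson process: add up independent exponential interarrival times until time $t$ is passed. The sketch below is an illustration under assumed names, seed, and parameters ($\lambda = 2$, $t = 3$). As a sanity check, the sample mean and sample variance of the simulated counts should both be close to $\lambda t$, as they would be for a Poisson distribution.

```python
import random


def simulate_count(lam, t, rng):
    """Number of arrivals in (0, t] with i.i.d. exponential(lam) interarrival times."""
    n = 0
    clock = rng.expovariate(lam)        # time of the first arrival, T_1
    while clock <= t:
        n += 1
        clock += rng.expovariate(lam)   # advance to the next arrival time
    return n


def sample_mean_var(lam, t, trials, seed=1):
    """Sample mean and unbiased sample variance of N(t) over repeated runs."""
    rng = random.Random(seed)
    counts = [simulate_count(lam, t, rng) for _ in range(trials)]
    mean = sum(counts) / trials
    var = sum((c - mean) ** 2 for c in counts) / (trials - 1)
    return mean, var


# Both values should be near lam * t = 6 for these illustrative parameters.
mean, var = sample_mean_var(lam=2.0, t=3.0, trials=20000)
print(mean, var)
```

Matching the first two moments is only a partial check; it does not by itself verify the full Poisson distribution or the increment properties.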
The following theorem shows that the converse is also true.

Theorem 3.6. If $\{N(t), t \ge 0\}$ is a renewal counting process with exponential underlying distribution with mean $1/\lambda$, then $\{N(t), t \ge 0\}$ is a Poisson process with rate $\lambda$.

Proof. For $t > 0$,

$$P[N(t) = 0] = P[X_1 > t] = e^{-\lambda t}.$$

Thus, $P[N(0) = 0] = 1$. The stationary and independent increments property can be easily shown from the memoryless property of the exponential distribution. Finally, to prove that $N(t)$ has a Poisson distribution with mean $\lambda t$, we can use induction as in the proof of Theorem 3.4; the details are omitted. □

Before we investigate the characterization of the Poisson process among renewal processes, we first study the renewal process in a little more detail. It is easy to note that

$$\{N(t) \ge n\} \iff \{T_n \le t\}.$$

If we do not consider $T_0 = 0$ to be a renewal point, then we define the renewal function $U(t)$ as the expected number of renewals up to time $t$:

$$U(t) = E[N(t)] = \sum_{n=1}^{\infty} P[N(t) \ge n] = \sum_{n=1}^{\infty} P[T_n \le t] = \sum_{n=1}^{\infty} P[X_1 + \cdots + X_n \le t] = \sum_{n=1}^{\infty} F^{(n)}(t),$$

where $F^{(n)}(t)$ is the $n$-fold convolution of $F(t)$. At the current time $t$, we define the age and residual life in the current renewal cycle from $T_{N(t)}$ to $T_{N(t)+1}$ as

$$A(t) = t - T_{N(t)} \quad \text{and} \quad R(t) = T_{N(t)+1} - t.$$

Our main purpose is to investigate the properties of the distributions of $A(t)$ and $R(t)$ and then find characterizations of the Poisson process.

We first give a standard renewal argument for the distribution $P[R(t) > x]$. We use the total probability law by separating whether $N(t) = 0$ or $N(t) > 0$
and then condition on the first renewal time $T_1 = s$. The following renewal equation follows:

$$P[R(t) > x] = P[T_{N(t)+1} - t > x]$$
$$= P[T_1 - t > x;\ N(t) = 0] + \int_0^t P[T_{N(t)+1} - t > x \mid T_1 = s]\, dP[T_1 \le s]$$
$$= P[T_1 - t > x] + \int_0^t P[T_{N(t-s)+1} - (t-s) > x]\, dF(s)$$
$$= 1 - F(t+x) + \int_0^t P[R(t-s) > x]\, dF(s).$$

In particular, by integrating both sides of the above equation with respect to $x$ from $0$ to $\infty$, we have the following renewal equation for the mean residual life:

$$E[R(t)] = \int_0^{\infty} (1 - F(t+x))\, dx + \int_0^t \int_0^{\infty} P[R(t-s) > x]\, dx\, dF(s) = \int_t^{\infty} (1 - F(x))\, dx + \int_0^t E[R(t-s)]\, dF(s).$$

To find the solution of the above renewal equation, we extend the technique by conditioning on the time of the last renewal and the number of renewals before time $t$, and get

$$P[R(t) > x] = P[T_1 - t > x;\ N(t) = 0] + \sum_{n=1}^{\infty} P[T_{n+1} - t > x;\ N(t) = n]$$
$$= P[T_1 - t > x] + \sum_{n=1}^{\infty} P[T_{n+1} - t > x;\ T_n \le t]$$
$$= 1 - F(t+x) + \sum_{n=1}^{\infty} \int_0^t P[X_{n+1} > t + x - T_n \mid T_n = s]\, dP[T_n \le s]$$
$$= 1 - F(t+x) + \sum_{n=1}^{\infty} \int_0^t P[X_{n+1} > t + x - s]\, dF^{(n)}(s)$$
$$= 1 - F(t+x) + \int_0^t (1 - F(t+x-s))\, d\left(\sum_{n=1}^{\infty} F^{(n)}(s)\right)$$
$$= 1 - F(t+x) + \int_0^t (1 - F(t+x-s))\, dU(s).$$

Similarly, for $z < t$ and $x < t$, the joint distribution of $A(t)$ and $R(t)$ can be obtained as

$$P[A(t) < z;\ R(t) > x] = P[A(t) < z;\ R(t) > x;\ N(t) = 0] + \sum_{n=1}^{\infty} P[A(t) < z;\ R(t) > x;\ N(t) = n]$$
$$= \sum_{n=1}^{\infty} P[t - T_n < z;\ T_{n+1} - t > x;\ T_n \le t] = \sum_{n=1}^{\infty} P[T_{n+1} - t > x;\ t - z < T_n \le t]$$
$$= \sum_{n=1}^{\infty} \int_{t-z}^{t} P[X_{n+1} > t + x - s]\, dP[T_n \le s] = \int_{t-z}^{t} (1 - F(t+x-s))\, dU(s),$$

where we note that if $N(t) = 0$, then $A(t) = t$. In particular, the marginal distribution of $A(t)$ can be obtained by letting $x = 0$:

$$P[A(t) < z] = \int_{t-z}^{t} [1 - F(t-s)]\, dU(s).$$

In the Poisson process case, we have $U(t) = \lambda t$ and $1 - F(t) = e^{-\lambda t}$. Thus

$$P[A(t) < z;\ R(t) > x] = \int_{t-z}^{t} e^{-\lambda(t+x-s)}\, \lambda\, ds = e^{-\lambda x} \int_{t-z}^{t} \lambda e^{-\lambda(t-s)}\, ds = e^{-\lambda x} \int_0^z \lambda e^{-\lambda s}\, ds = e^{-\lambda x}\left[1 - e^{-\lambda z}\right].$$

That means $A(t)$ and $R(t)$ are not only independent, but are also exponential random variables with mean $1/\lambda$. In particular, their distributions are free of the time $t$. It turns out that these results also characterize the Poisson process among the renewal processes.

Theorem 3.7. Let $\{N(t)\}$ be a renewal counting process with continuous underlying distribution with finite mean. Then it is Poisson if either

1. $E[R(t)]$ does not depend on $t$; or
2. $A(t^*)$ and $R(t^*)$ are independent for some $t^* > 0$.
Proof. From the renewal equation for $E[R(t)]$, if $E[R(t)] = \mu$ is free of $t$, then

$$\mu = \int_t^{\infty} [1 - F(s)]\, ds + \mu \int_0^t dF(s) = \int_t^{\infty} [1 - F(s)]\, ds + \mu F(t).$$

This implies

$$\mu = \frac{1}{1 - F(t)} \int_t^{\infty} [1 - F(s)]\, ds = \mu(t).$$

That means the mean residual life is free of $t$. So $F(x)$ is exponential.

For (2), from the joint distribution function of $A(t^*)$ and $R(t^*)$, the independence implies that for all $0 < z, x < t^*$,

$$\int_{t^*-z}^{t^*} [1 - F(t^* + x - s)]\, dU(s) = P[R(t^*) > x] \int_{t^*-z}^{t^*} [1 - F(t^* - s)]\, dU(s).$$

Since $F(x)$ is continuous, so is $U(x)$. By differentiating both sides with respect to $z$, we get

$$[1 - F(x+z)] = P[R(t^*) > x]\,[1 - F(z)], \quad \text{for all } x, z < t^*.$$

That means the conditional survival distribution $(1 - F(x+z))/(1 - F(z))$ is free of $z$. By letting $z = 0$, we have

$$1 - F(x+z) = (1 - F(x))(1 - F(z)),$$

and thus $F(x)$ must be exponential. □

Remark. From the proof, the independence of $A(t^*)$ and $R(t^*)$ for some $t^*$ implies a certain kind of memoryless property for the residual life. This memoryless property further characterizes the Poisson process among the renewal processes.
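The behavior of the age and residual life in the Poisson case can be explored by simulation. The sketch below is illustrative only: the function names, the seed, and the parameters $\lambda = 1$, $t = 10$ are assumptions. It estimates $E[A(t)]$, $E[R(t)]$, and the sample covariance of the pair; for a Poisson process the means should be close to $1/\lambda$ (the age is capped at $t$, which matters little when $\lambda t$ is large) and the covariance close to zero.

```python
import random


def age_and_residual(lam, t, rng):
    """Return (A(t), R(t)) for a renewal process with exponential(lam) interarrivals."""
    last = 0.0                          # T_0 = 0 plays the role of the last renewal
    nxt = rng.expovariate(lam)          # T_1
    while nxt <= t:
        last, nxt = nxt, nxt + rng.expovariate(lam)
    return t - last, nxt - t            # age and residual life at time t


def summarize(lam, t, trials, seed=2):
    """Sample means of A(t), R(t) and their sample covariance."""
    rng = random.Random(seed)
    pairs = [age_and_residual(lam, t, rng) for _ in range(trials)]
    mean_a = sum(a for a, _ in pairs) / trials
    mean_r = sum(r for _, r in pairs) / trials
    cov = sum((a - mean_a) * (r - mean_r) for a, r in pairs) / (trials - 1)
    return mean_a, mean_r, cov


mean_a, mean_r, cov = summarize(lam=1.0, t=10.0, trials=20000)
print(mean_a, mean_r, cov)
```

A near-zero covariance is consistent with independence but does not prove it; replacing the exponential draws with another interarrival distribution is one way to see the dependence that Theorem 3.7 rules out for non-Poisson renewal processes.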
3.4 Further Properties of Poisson Process In this section, we consider superposition of two independent Poisson processes and decomposition of a Poisson process into independent counting processes.
3.4.1 Superposition Process

Let $\{N_1(t), t \ge 0\}$ and $\{N_2(t), t \ge 0\}$ be two independent Poisson processes with rates $\lambda_1$ and $\lambda_2$, respectively. The superposition process, denoted by $N_1(t) + N_2(t)$, results from pooling together the times of occurrences in each of the separate Poisson processes.

Theorem 3.8. The superposition process of two independent Poisson processes $N_1(t)$ and $N_2(t)$ with rates $\lambda_1$ and $\lambda_2$, respectively, is itself a Poisson process with rate $\lambda = \lambda_1 + \lambda_2$.

Proof. Let $N(t)$ be the number of occurrences of the event observed during the time interval $(0, t]$. Since each of $N_1(t)$ and $N_2(t)$ has stationary and independent increments, so does $N(t) = N_1(t) + N_2(t)$. For any $s, t \ge 0$,
$$N(s+t) - N(s) = [N_1(s+t) - N_1(s)] + [N_2(s+t) - N_2(s)]$$
is the sum of two independent Poisson random variables with means $\lambda_1 t$ and $\lambda_2 t$, respectively. Thus
$$\begin{aligned}
P[N(s+t) - N(s) = n] &= P[(N_1(s+t) - N_1(s)) + (N_2(s+t) - N_2(s)) = n] \\
&= \sum_{k=0}^{n} P[N_1(s+t) - N_1(s) = k,\ N_2(s+t) - N_2(s) = n-k] \\
&= \sum_{k=0}^{n} P[N_1(s+t) - N_1(s) = k]\, P[N_2(s+t) - N_2(s) = n-k] \\
&= \sum_{k=0}^{n} e^{-\lambda_1 t} \frac{(\lambda_1 t)^k}{k!}\, e^{-\lambda_2 t} \frac{(\lambda_2 t)^{n-k}}{(n-k)!} \\
&= e^{-(\lambda_1+\lambda_2)t} \sum_{k=0}^{n} \frac{1}{k!\,(n-k)!} (\lambda_1 t)^k (\lambda_2 t)^{n-k} \\
&= e^{-(\lambda_1+\lambda_2)t}\, \frac{[(\lambda_1+\lambda_2)t]^n}{n!}.
\end{aligned}$$
Therefore, $N(s+t) - N(s)$ has a Poisson distribution with mean $\lambda t$, where $\lambda = \lambda_1 + \lambda_2$, and the theorem follows. □
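The convolution identity at the heart of this proof can be checked numerically. The following is a minimal sketch, not part of the original text: the rates, the interval length, and the function names are illustrative assumptions. It compares the convolution sum with the Poisson probability with mean $(\lambda_1 + \lambda_2)t$.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """P[X = k] for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def superposed_pmf(n: int, lam1: float, lam2: float, t: float) -> float:
    """Convolution sum from the proof: sum over k of P[N1 = k] * P[N2 = n - k]."""
    return sum(
        poisson_pmf(k, lam1 * t) * poisson_pmf(n - k, lam2 * t)
        for k in range(n + 1)
    )


# Illustrative (assumed) rates and interval length.
lam1, lam2, t = 1.5, 0.7, 2.0

# Largest gap between the convolution and the Poisson((lam1 + lam2) t) pmf.
max_gap = max(
    abs(superposed_pmf(n, lam1, lam2, t) - poisson_pmf(n, (lam1 + lam2) * t))
    for n in range(15)
)
print(f"largest discrepancy for n = 0..14: {max_gap:.2e}")
```

The two sides agree up to floating-point rounding, which is the binomial theorem step of the proof in numerical form.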
3.4.2 Decomposition of Poisson Process

Consider a Poisson process $\{N(t), t \ge 0\}$ with rate $\lambda$. Suppose that each time an event occurs, it can be classified as either type I with probability $p$ or type II with probability $1 - p$, independently of all other occurrences. For example, the arriving customers could be either male or female with probability 1/2 each. Let $N_1(t)$ and $N_2(t)$ denote the number of occurrences that are of type I and type II, respectively. Then $N(t) = N_1(t) + N_2(t)$.
Theorem 3.9. The two component processes $\{N_1(t), t \ge 0\}$ and $\{N_2(t), t \ge 0\}$ are independent, and both are Poisson processes with rates $\lambda p$ and $\lambda(1-p)$, respectively.

Proof. For any $s, t > 0$, using the stationary and independent increments property of $N(t)$, we have
$$\begin{aligned}
&P[N_1(t+s) - N_1(s) = n,\ N_2(t+s) - N_2(s) = m] \\
&\quad = P[N_1(t+s) - N_1(s) = n,\ N_2(t+s) - N_2(s) = m \mid N(t+s) - N(s) = n+m]\; P[N(t+s) - N(s) = n+m] \\
&\quad = \binom{n+m}{n} p^n (1-p)^m\, e^{-\lambda t} \frac{(\lambda t)^{n+m}}{(n+m)!} \\
&\quad = e^{-\lambda p t} \frac{(\lambda p t)^n}{n!}\; e^{-\lambda(1-p)t} \frac{[\lambda(1-p)t]^m}{m!}.
\end{aligned}$$
Hence, $N_1(t+s) - N_1(s)$ and $N_2(t+s) - N_2(s)$ are independent and have Poisson distributions with means $\lambda p t$ and $\lambda(1-p)t$, respectively. Both have stationary increments.

Similarly, let $0 = t_0 \le t_1 \le \cdots \le t_k$, and $I_j = (t_{j-1}, t_j]$. Writing $N(I_j)$ for the increment over $I_j$, we have
$$\begin{aligned}
&P[N_1(I_j) = n_j,\ N_2(I_j) = m_j,\ 1 \le j \le k] \\
&\quad = P[N_1(I_j) = n_j,\ N_2(I_j) = m_j,\ 1 \le j \le k \mid N(I_j) = n_j + m_j,\ 1 \le j \le k]\; P[N(I_j) = n_j + m_j,\ 1 \le j \le k] \\
&\quad = \prod_{j=1}^{k} \binom{n_j + m_j}{n_j} p^{n_j} (1-p)^{m_j}\, e^{-\lambda(t_j - t_{j-1})} \frac{[\lambda(t_j - t_{j-1})]^{n_j + m_j}}{(n_j + m_j)!} \\
&\quad = \prod_{j=1}^{k} e^{-\lambda p (t_j - t_{j-1})} \frac{[\lambda p (t_j - t_{j-1})]^{n_j}}{n_j!}\; e^{-\lambda(1-p)(t_j - t_{j-1})} \frac{[\lambda(1-p)(t_j - t_{j-1})]^{m_j}}{m_j!} \\
&\quad = \prod_{j=1}^{k} P[N_1(I_j) = n_j]\, P[N_2(I_j) = m_j],
\end{aligned}$$
and hence the property of independent increments follows. □
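The key algebraic step in the proof of Theorem 3.9, that a binomial split of a Poisson count factorizes into a product of two Poisson probabilities, can be verified numerically. This is a minimal sketch under assumed parameter values; the function names are hypothetical and not from the original text.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """P[X = k] for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean**k / math.factorial(k)


def joint_via_conditioning(n: int, m: int, lam: float, p: float, t: float) -> float:
    """Binomial split of n + m events times the Poisson probability of n + m events."""
    return math.comb(n + m, n) * p**n * (1 - p) ** m * poisson_pmf(n + m, lam * t)


def product_of_marginals(n: int, m: int, lam: float, p: float, t: float) -> float:
    """Product of Poisson(lam p t) and Poisson(lam (1 - p) t) probabilities."""
    return poisson_pmf(n, lam * p * t) * poisson_pmf(m, lam * (1 - p) * t)


# Illustrative (assumed) rate, type I probability, and interval length.
lam, p, t = 2.0, 0.3, 1.5

worst = max(
    abs(joint_via_conditioning(n, m, lam, p, t) - product_of_marginals(n, m, lam, p, t))
    for n in range(10)
    for m in range(10)
)
print(f"largest discrepancy on the 10 x 10 grid: {worst:.2e}")
```

Agreement on every pair $(n, m)$ reflects both conclusions at once: each count is Poisson, and the joint probability is the product of the marginals.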
Intuitively, Theorem 3.9 can be viewed as follows. If we divide the interval $(0, t]$ into $n$ subintervals of equal length $t/n$, where $n$ is very large, then each subinterval will have a small probability $\lambda t/n$ of containing a single occurrence (neglecting multiple occurrences, which have probability $o(t/n)$). As each occurrence has probability $p$ of being type I, each of the $n$ subintervals will have either no occurrence, a type I event, or a type II event with probabilities
$$1 - \frac{\lambda t}{n}, \quad p\,\frac{\lambda t}{n}, \quad \text{and} \quad (1-p)\,\frac{\lambda t}{n},$$
respectively. Since the Poisson distribution is the limit distribution of the binomial distribution, we can conclude that $N_1(t)$ and $N_2(t)$ have Poisson distributions with means $\lambda p t$ and $\lambda(1-p)t$, respectively. Their independence is proved by a similar limiting argument and by observing that
$$P[\text{type II} \mid \text{no type I}] = \frac{\lambda t}{n}(1-p) + o(t/n).$$
In the proofs of the above two theorems, we note the fact that if $X_1$ and $X_2$ are independent and Poisson distributed with respective means $\lambda_1$ and $\lambda_2$, then $X = X_1 + X_2$ is also Poisson distributed with mean $\lambda = \lambda_1 + \lambda_2$. Conversely, if $X$ has a Poisson distribution and $X = X_1 + X_2$, where $X_1$ and $X_2$ are independent, can we conclude that $X_1$ and $X_2$ are both Poisson? In the setting of Theorem 3.9, the answer is affirmative. In general, it is also affirmative, and the result is known as Raikov's theorem.

Theorem 3.10 (Raikov). If $X_1$ and $X_2$ are independent nonnegative integer-valued random variables such that $X_1 + X_2$ is Poisson distributed, then $X_1$ and $X_2$ both have Poisson distributions.

We can raise several questions.
(a) How can Raikov's theorem be proved?
(b) If $X_1$, $X_2$, and $X = X_1 + X_2$ are all Poisson distributed, must $X_1$ and $X_2$ be independent?
(c) If $X_1 + X_2$ is Poisson distributed, but $X_1$ and $X_2$ are not independent, is it still true that $X_1$ and $X_2$ are Poisson distributed?
Question (a) can be handled with moment generating functions. For counterexamples to (b) and (c), see, for instance, Counterexamples in Probability by Stoyanov (1997).
3.5 Examples of Poisson Process

There are countless examples where probability models lead to the Poisson process. The examples shown in this section are of two kinds: the first kind arises naturally in reliability and life testing situations when the underlying distribution is exponential; the second kind contains further analysis of the models using our main theorems or properties of the Poisson process.

Example 3.1 (Maintained Unit). A unit is put into operation at time zero. Each time a unit fails, it is replaced by a new unit of the same type. Assume that the life lengths are independently and identically exponentially distributed with failure rate $\lambda$. Then the number of failures observed during $(0, t]$, $\{N(t), t \ge 0\}$, is a Poisson process with rate $\lambda$, and the interarrival times have underlying distribution $F(x) = 1 - e^{-\lambda x}$.

Example 3.2 (Life Testing). It is desirable to estimate the unknown failure rate $\lambda$ of an exponential life distribution. Suppose $n$ similar units are put into testing. Now the failed units are not replaced, and the successive failures are observed at the times of the order statistics $X_{1,n} \le X_{2,n} \le \cdots \le X_{n,n}$. Suppose the experiment ends in one of the following three ways:
Case 1 (Type I censoring). At a specified time $t_0$.
Case 2 (Type II censoring). At the $r$th failure time $X_{r,n}$, where $r$ is specified in advance, $1 \le r \le n$.
Case 3 (Mixed censoring). At the time $\min(t_0, X_{r,n})$, where $t_0$ and $r$ $(1 \le r \le n)$ are specified in advance.

The total time on test during $(0, t]$, for $X_{i,n} \le t < X_{i+1,n}$, is given by
$$\tau(t) = nX_{1,n} + (n-1)(X_{2,n} - X_{1,n}) + \cdots + (n-i+1)(X_{i,n} - X_{i-1,n}) + (n-i)(t - X_{i,n}).$$
In fact,
$$\tau(t) = X_{1,n} + X_{2,n} + \cdots + X_{i,n} + (n-i)t.$$
Notice that $nX_{1,n}$ represents the total test time observed between 0 and the first failure $X_{1,n}$, and the normalized spacing $D_{i,n} = (n-i)(X_{i+1,n} - X_{i,n})$ represents the total test time observed between $X_{i,n}$ and $X_{i+1,n}$ for $i = 1, \ldots, n-1$. Now we define a counting process $N(\tau)$ as the number of failures observed when the total time on test (not elapsed time) has reached $\tau$, by considering $\tau$ as the time scale. Suppose the experiment ends at $\tau = \tau_0$. So $\tau_0 = \tau(t_0)$ in Case 1, $\tau_0 = \tau(X_{r,n})$ in Case 2, and $\tau_0 = \min(\tau(t_0), \tau(X_{r,n}))$ in Case 3. Theorem 2.8 implies that these normalized spacings are independent and identically distributed with exponential distribution $F(x) = 1 - e^{-\lambda x}$. Therefore $\{N(\tau), 0 \le \tau \le \tau_0\}$ is a Poisson process with rate $\lambda$.

Example 3.3 (The Spare Parts Problem I). Consider a system with $k$ component positions using parts of a single type, where the parts have life distribution $F(x) = 1 - e^{-\lambda x}$, and all the life lengths are stochastically independent. When a part fails, it is immediately replaced by a spare, if available. Component position $j$ is required to operate for $t_j$ hours during the mission, $j = 1, 2, \ldots, k$. Note that the $t_j$'s may differ. We want to determine the number $n$ of spares in stock to achieve assurance (probability) $\alpha$ that the system will operate throughout the mission without shutdown due to shortage of spares.
Let $N_j(t)$ be the total number of failures (or replacements) occurring in the $j$th component position, assuming that spares are available when needed. Then the $N_j(t)$ are independent Poisson processes for $j = 1, \ldots, k$. Given $t_1, t_2, \ldots, t_k$, the counts $N_1(t_1), N_2(t_2), \ldots, N_k(t_k)$ are independent Poisson variables with means $\lambda t_1, \lambda t_2, \ldots, \lambda t_k$, respectively. Hence, $N_1(t_1) + N_2(t_2) + \cdots + N_k(t_k)$ has a Poisson distribution with mean
$$\mu = \lambda(t_1 + t_2 + \cdots + t_k).$$
The number $n$ of spares in stock is the smallest $r$ such that
$$P[N_1(t_1) + N_2(t_2) + \cdots + N_k(t_k) \le r] = \sum_{j=0}^{r} e^{-\mu} \frac{\mu^j}{j!} \ge \alpha.$$
With given $\alpha$ and $\mu$, $r$ can be found from the table for the Poisson distribution.

Example 3.4 (The Spare Parts Problem II). Consider a system with $k$ component positions using parts of a single type, where all the life lengths are stochastically independent. However, due to the environment and stress at each component position, each part may have a different life distribution. Assume that the part used in the $j$th component position has life distribution $F_j(x) = 1 - e^{-\lambda_j x}$. Then the number of parts used in the $j$th position, $N_j(t)$, is a Poisson process with rate $\lambda_j$, and the total number of failures up to time $t$, $N(t)$, is a Poisson process with rate
$$\lambda = \lambda_1 + \lambda_2 + \cdots + \lambda_k.$$
For example, assume that a silicon diode having exponential life length is used in a number of component positions in an electronic circuit. Its failure rate is 0.000002 in the first five positions, 0.000001 in the next three positions, and 0.000003 in the last four positions. A supply depot is required to provide spares to keep 100 such circuits in operation during a period containing 1,000 operating hours. How many spares should the depot carry so that the probability of shortage is less than 0.05?

The total number of silicon diode failures experienced during the 1,000 operating hours is a Poisson random variable with expected value
$$(5 \times 0.000002 + 3 \times 0.000001 + 4 \times 0.000003) \times 100 \times 1{,}000 = 2.5.$$
From the table of the Poisson distribution,
$$P[N > 5] = 0.0420 \quad \text{and} \quad P[N > 4] = 0.1088.$$
Thus, the smallest number of spare silicon diodes to stock so as to achieve an assurance of at least 0.95 of no shortage is 5.
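The table lookup in the diode example can be reproduced with a short computation. The sketch below is illustrative only: the helper names `poisson_cdf` and `spares_needed` are assumptions, not from the original text. It searches for the smallest $r$ whose Poisson cumulative probability reaches the required assurance.

```python
import math


def poisson_cdf(r: int, mean: float) -> float:
    """P[N <= r] for a Poisson random variable with the given mean."""
    return sum(math.exp(-mean) * mean**j / math.factorial(j) for j in range(r + 1))


def spares_needed(mean: float, assurance: float) -> int:
    """Smallest r with P[N <= r] >= assurance."""
    r = 0
    while poisson_cdf(r, mean) < assurance:
        r += 1
    return r


# Failure rate per hour and number of positions, as in the diode example.
rates_and_counts = [(0.000002, 5), (0.000001, 3), (0.000003, 4)]
circuits, hours = 100, 1000

mean_failures = sum(rate * count for rate, count in rates_and_counts) * circuits * hours
stock = spares_needed(mean_failures, 0.95)

print(f"expected failures: {mean_failures:.4f}")
print(f"spares to stock:   {stock}")
print(f"P[N > 5] = {1 - poisson_cdf(5, mean_failures):.4f}")
print(f"P[N > 4] = {1 - poisson_cdf(4, mean_failures):.4f}")
```

The linear search is adequate here because the mean is small; for large means one would typically use a library quantile function instead.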
Example 3.5 (Biological Survey). Suppose that in a certain area plants are randomly dispersed with a mean density of $\lambda$ per square yard. If a biologist randomly locates 100 2-square-yard sampling quadrats on the area, how many of them can the biologist expect to contain no plants?

The first problem we have to settle is to establish a model. If we look at a randomly selected $t$-square-yard quadrat and count the number $N(t)$ of plants on the quadrat, regarding each plant as an event occurring, then we can reasonably assume that as the area $t$ increases, (a) the number of plants will increase one at a "time", namely, two or more plants would not occur precisely at the same "time", and (b) the number of plants counted as the area increases from $t$ to $t + s$ is independent of the count when the area is $t$. These are justified since the plants are randomly dispersed with mean density $\lambda$. The characterization of the Poisson process yields that we can view the number of plants in a randomly selected quadrat of area $t$ square yards, $\{N(t), t \ge 0\}$, as a Poisson process with rate $\lambda$. With this model, the probability of getting an empty quadrat of 2 square yards is $P[N(2) = 0] = e^{-2\lambda}$. For the biologist, the expected number of quadrats containing no plants is
$$100\, P[N(2) = 0] = 100\, e^{-2\lambda}.$$
We should mention that this model is often used in environmental and ecological research.

Example 3.6 (An Immigration Model). Suppose that people immigrate into a territory such that (a) the immigrants arrive one at a time, namely, no two immigrants arrive at the same time; (b) the probability of no one arriving over a certain period depends only on the length of the period; (c) the events that no one arrives over disjoint periods are independent. Clearly, such a model leads to a Poisson process with a certain rate $\lambda$ per day. For example, if the rate is $\lambda = 10/7$ per day, then the expected time until the tenth immigrant arrives is
$$E[T_{10}] = 10/\lambda = 10 \times \frac{7}{10} = 7 \text{ (days)},$$
and the probability that the elapsed time between the tenth and the eleventh arrivals exceeds two days equals
$$P[T_{11} - T_{10} > 2] = e^{-2\lambda} = e^{-20/7} \approx 0.06.$$
If we assume that each immigrant is of English descent with probability $1/12$, then the decomposition theorem (Theorem 3.9) shows that the number of Englishmen who immigrated into the territory during February (four weeks) is Poisson distributed with mean
$$4 \times 10 \times \frac{1}{12} = \frac{10}{3}.$$
Let us extend the model further. Assume that the immigrants arrive at the territory at a Poisson rate $\lambda$, and after they live in the territory, they may move out of the territory to another area. Suppose that the amount of time they spend in the territory before leaving has common distribution $F(x)$. Let $N_1(t)$ denote the number of immigrants who have moved out of the territory by time $t$, and $N_2(t)$ the number of immigrants living in the territory at time $t$. Then $N_1(t)$ and $N_2(t)$ are independent Poisson random variables with means determined by the distribution function $F(x)$. An equivalent model is given in the next example.

Example 3.7 (An Infinite Server Queue with Poisson Arrivals). Suppose that customers arrive at a service station in accordance with a Poisson process $\{N(t), t \ge 0\}$ with rate $\lambda$. Upon arrival, a customer is immediately served by one of the (infinite number of possible) servers, and the service times are assumed to be independent with a common distribution $F(x)$. Let $N_1(t)$ be the number of customers, called type I, who have completed service by time $t$, and $N_2(t)$ the number of customers, called type II, who are being served at time $t$.

Given $N(t) = n$, denote the first $n$ arrival times by $T_1, T_2, \ldots, T_n$. Let $0 < t_1 < t_2 < \cdots < t_n < t_{n+1} = t$, and let $h_j$ $(> 0)$ be such that $t_j + h_j < t_{j+1}$. Then
$$\begin{aligned}
&P[t_j \le T_j \le t_j + h_j,\ j = 1, 2, \ldots, n \mid N(t) = n] \\
&\quad = \frac{P[\text{exactly one in } [t_j, t_j + h_j],\ 1 \le j \le n,\ \text{no other in } (0, t]]}{P[N(t) = n]} \\
&\quad = \frac{\lambda h_1 e^{-\lambda h_1}\, \lambda h_2 e^{-\lambda h_2} \cdots \lambda h_n e^{-\lambda h_n}\, e^{-\lambda(t - h_1 - h_2 - \cdots - h_n)}}{e^{-\lambda t} (\lambda t)^n / n!} \\
&\quad = \frac{n!}{t^n}\, h_1 h_2 \cdots h_n,
\end{aligned}$$
and hence
$$f_{T_1, T_2, \ldots, T_n \mid N(t) = n}(t_1, t_2, \ldots, t_n) = \frac{n!}{t^n}.$$
That means $T_1, T_2, \ldots, T_n$ are distributed as the order statistics of $n$ independent uniform random variables on $(0, t]$. Thus, the arrival time of an arbitrary one of the $n$ customers is uniform on $(0, t]$ with density $1/t$. Moreover, given a customer's arrival time $x$ $(x < t)$, the probability that he is still in service at time $t$ is $1 - F(t - x)$. Thus, the probability of such an arbitrary customer being in service at time $t$ is
$$p = \int_0^t [1 - F(t - x)]\, \frac{dx}{t} = \int_0^t [1 - F(y)]\, \frac{dy}{t}.$$
Hence, we have for $j \le n$,
$$P[N_2(t) = j \mid N(t) = n] = \binom{n}{j} p^j (1-p)^{n-j},$$
and thus for any nonnegative integers $x$ and $y$,
$$\begin{aligned}
P[N_1(t) = x,\ N_2(t) = y] &= P[N_1(t) = x,\ N_2(t) = y \mid N(t) = x + y]\; P[N(t) = x + y] \\
&= \binom{x+y}{y} p^y (1-p)^x\, e^{-\lambda t} \frac{(\lambda t)^{x+y}}{(x+y)!} \\
&= e^{-\lambda t (1-p)} \frac{(\lambda t (1-p))^x}{x!}\; e^{-\lambda t p} \frac{(\lambda t p)^y}{y!}.
\end{aligned}$$
This means that $N_1(t)$ and $N_2(t)$ are independent Poisson random variables with means
$$\lambda t (1 - p) = \lambda \int_0^t F(x)\, dx \quad \text{and} \quad \lambda t p = \lambda \int_0^t [1 - F(x)]\, dx,$$
respectively. Note that $t$ here is only one fixed time. Both $N_1(t)$ and $N_2(t)$ can be considered as counting processes. However, we did not conclude that $N_1(t)$ and $N_2(t)$ are Poisson processes. In fact, they need not be Poisson processes.

Example 3.8 (Optimization). Suppose that items arrive at a processing plant in accordance with a Poisson process with rate $\lambda$. At a fixed time $T$, all the items are dispatched from the system. The problem is to choose an intermediate time $t \in (0, T)$, at which all items then in the system are also dispatched, so as to minimize the total expected waiting time of all items.

If we dispatch at time $t$, $0 < t < T$, then the expected number of arrivals in $(0, t]$ is $\lambda t$. Each arrival is uniformly distributed on $(0, t]$, and so its expected waiting time is $t/2$. Thus, the expected total waiting time of items arriving in $(0, t]$ is
$$(\lambda t)(t/2) = \lambda t^2 / 2.$$
Similarly, the expected total waiting time of items arriving in $(t, T]$ is $\lambda (T - t)^2 / 2$. The expected total waiting time of all items will be
$$\frac{\lambda t^2}{2} + \frac{\lambda (T - t)^2}{2}.$$
To minimize the total expected waiting time, we compute
$$\frac{d}{dt}\left[\frac{\lambda t^2}{2} + \frac{\lambda (T - t)^2}{2}\right] = \lambda t - \lambda (T - t).$$
Equating it to zero yields $t = T/2$, so the intermediate dispatch time that minimizes the expected total waiting time is $t = T/2$.

Example 3.9. To sell a used car, one pays $c$ per unit time for advertisement. Suppose the offers come in at time points $\{T_i = \sum_{j=1}^{i} X_j\}$ for $i = 1, 2, \ldots$, following a Poisson process with rate $\lambda$, and the consecutive offer prices $\{Y_i\}$ are independent and identically distributed with distribution $F(x)$. One wants to sell the car as soon as an offer price is higher than $y$.

(a) The number of offers until an offer price higher than $y$, denoted by $N$, follows the geometric distribution with success probability $\bar{F}(y) = 1 - F(y)$. Thus, the total cost of advertisement until the sale can be written as
$$c \sum_{i=1}^{N} X_i.$$
The total expected cost of advertisement until the sale is
$$c\, E[N]\, E[X_i] = \frac{c}{\lambda \bar{F}(y)}.$$
(b) The mean selling price is
$$E[Y_i \mid Y_i > y] = y + \mu(y),$$
where $\mu(y)$ is the mean residual life function of $F$.
(c) The total expected return will be
$$y + \mu(y) - \frac{c}{\lambda \bar{F}(y)}.$$
(d) If $F(x)$ has density $f(x)$, it can be shown that the derivative of the total expected return has the same sign as
$$\int_y^{\infty} \bar{F}(x)\, dx - \frac{c}{\lambda}.$$
Thus, as long as $\lambda \mu > c$, where $\mu = \mu(0)$ is the mean offer price, the optimal value $y^*$ uniquely exists and satisfies
$$\int_{y^*}^{\infty} \bar{F}(x)\, dx = \frac{c}{\lambda}.$$
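To make the optimality condition in (d) concrete, the sketch below assumes, purely for illustration, that offer prices are exponential with mean $\theta$. This distribution is an assumption that the example itself does not make, and the function names and parameter values are hypothetical. In that case $\int_y^\infty \bar{F}(x)\,dx = \theta e^{-y/\theta}$, so the condition has the closed-form solution $y^* = \theta \ln(\lambda\theta/c)$, which the bisection search should reproduce.

```python
import math


def tail_integral(y: float, theta: float) -> float:
    """Integral of the survival function from y to infinity for Exp(mean theta)."""
    return theta * math.exp(-y / theta)


def expected_return(y: float, lam: float, c: float, theta: float) -> float:
    """y + mu(y) - c / (lam * Fbar(y)); for exponential offers mu(y) = theta."""
    return y + theta - c / (lam * math.exp(-y / theta))


def optimal_threshold(lam: float, c: float, theta: float, tol: float = 1e-10) -> float:
    """Solve tail_integral(y) = c / lam by bisection (requires lam * theta > c)."""
    if lam * theta <= c:
        raise ValueError("need lam * theta > c for an interior optimum")
    lo, hi = 0.0, theta
    while tail_integral(hi, theta) > c / lam:  # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tail_integral(mid, theta) > c / lam:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Assumed values: 2 offers per week, advertising cost 50 per week, mean offer 1000.
lam, c, theta = 2.0, 50.0, 1000.0
y_star = optimal_threshold(lam, c, theta)
closed_form = theta * math.log(lam * theta / c)
print(f"bisection: {y_star:.4f}, closed form: {closed_form:.4f}")
```

Bisection is used because the tail integral is monotone decreasing in $y$, so the root is unique once it is bracketed; the same routine would work for any $F$ whose tail integral can be evaluated.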
Example 3.10. Suppose insurance claims arrive at times $\{T_i\}$ following a Poisson process with rate $\lambda$, and the successive claims $\{C_i\}$ are independent random variables with distribution function $F(x)$ and mean $\mu$, independent of the claim arrival times. Let $N(t)$ denote the total number of claims by time $t$ and $\alpha$ be the discount factor. Then the total discounted cost up to time $t$ is
$$D(t) = \sum_{i=1}^{N(t)} e^{-\alpha T_i} C_i.$$
Given $N(t) = n$, $T_1, \ldots, T_n$ follow the same distribution as the order statistics $U_{1,n} \le \cdots \le U_{n,n}$ from $n$ uniform random variables $U_1, \ldots, U_n$ on $[0, t]$. Thus,
$$\begin{aligned}
E[D(t) \mid N(t) = n] &= \sum_{i=1}^{n} E[e^{-\alpha U_{i,n}}]\, E[C_i] \\
&= \mu \sum_{i=1}^{n} E[e^{-\alpha U_i}] \\
&= n \mu\, \frac{1}{t\alpha}\left(1 - e^{-\alpha t}\right).
\end{aligned}$$
Thus,
$$\begin{aligned}
E[D(t)] &= E[E[D(t) \mid N(t)]] \\
&= \frac{\mu}{t\alpha}\left(1 - e^{-\alpha t}\right) E[N(t)] \\
&= \frac{\lambda \mu}{\alpha}\left(1 - e^{-\alpha t}\right).
\end{aligned}$$
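The closed form for $E[D(t)]$ rests on the identity $E[e^{-\alpha U}] = (1 - e^{-\alpha t})/(\alpha t)$ for $U$ uniform on $[0, t]$. The sketch below, with assumed parameter values and hypothetical function names, checks that identity by a midpoint-rule integration and assembles the expected discounted cost both ways.

```python
import math


def expected_discounted_claims(lam: float, mu: float, alpha: float, t: float) -> float:
    """Closed form E[D(t)] = lam * mu * (1 - exp(-alpha * t)) / alpha."""
    return lam * mu * (1.0 - math.exp(-alpha * t)) / alpha


def mean_discount_factor_numeric(alpha: float, t: float, steps: int = 100_000) -> float:
    """Midpoint-rule approximation of E[exp(-alpha * U)] for U uniform on [0, t]."""
    h = t / steps
    return sum(math.exp(-alpha * (i + 0.5) * h) for i in range(steps)) * h / t


# Assumed values: 3 claims per year, mean claim 1200, discount rate 5%, 10 years.
lam, mu, alpha, t = 3.0, 1200.0, 0.05, 10.0

closed = expected_discounted_claims(lam, mu, alpha, t)
# E[N(t)] * mu * E[exp(-alpha * U)], following the conditioning argument.
numeric = lam * t * mu * mean_discount_factor_numeric(alpha, t)
print(f"closed form: {closed:.4f}, via numerical integration: {numeric:.4f}")
```

As $\alpha \to 0$ the closed form tends to $\lambda \mu t$, the undiscounted expected total claim amount, which gives a quick sanity check on the formula.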
Problems

1. A discrete random variable $X$ follows the Poisson probability distribution with rate $\lambda$ if
$$P[X = k] = e^{-\lambda} \frac{\lambda^k}{k!}, \quad \text{for } k = 0, 1, \ldots.$$
(a) Find the mean and variance of $X$.
(b) Show that $P[X = k] = \dfrac{\lambda}{k} P[X = k-1]$, for $k \ge 1$.
(c) Show that $E[X(X-1) \cdots (X-i)] = \lambda^{i+1}$.
2. Show that if $X$ and $Y$ are independent Poisson random variables with rates $\lambda_1$ and $\lambda_2$, respectively, then $X + Y$ is also Poisson with rate $\lambda_1 + \lambda_2$.
3. Events occur according to a Poisson process with rate $\lambda = 2$ per hour.
(a) What is the probability that no events occur between 8 P.M. and 9 P.M.?
(b) Starting at noon, what is the expected time at which the fourth event occurs?
(c) What is the probability that two or more events occur between 6 P.M. and 8 P.M.?
4. Cars pass a point on the highway at a Poisson rate of one per minute. Suppose 5% of the cars on the road are vans.
(a) What is the probability that at least one van passes by during an hour?
(b) Given that ten vans have passed by in an hour, what is the expected number of total cars to have passed by in that time?
(c) If 50 cars have passed by in an hour, what is the probability that five of them are vans?
5. Cars pass a certain street location according to a Poisson process with rate $\lambda$. A woman who wants to cross the street at that location waits until she sees that no cars will come by in the next $T$ time units.
(a) Find the probability that her waiting time is 0.
(b) Find the expected waiting time. (Hint: Condition on the time of the first car.)
6. Let $\{N(t), t \ge 0\}$ be a Poisson process with rate $\lambda$. Let $S_n$ denote the time of the $n$th event. Find
(a) $E[S_4]$;
(b) $E[S_4 \mid N(1) = 2]$;
(c) $E[N(4) - N(2) \mid N(1) = 3]$.
7. Pulses arrive at a Geiger counter in accordance with a Poisson process at a rate of three arrivals per minute. Each particle arriving at the counter has a probability $2/3$ of being recorded. Let $X(t)$ denote the number of pulses recorded by time $t$ minutes.
(a) Find $P[X(t) = 0]$.
(b) Find $E[X(t)]$.
8. Consider an infinite server queueing system (no waiting) in which customers arrive in accordance with a Poisson process with rate $\lambda$ and where the service time distribution is exponential with rate $\mu$. Let $X(t)$ denote the number of customers in the system at time $t$. Find
(a) $E[X(t+s) \mid X(s) = n]$;
(b) $\mathrm{Var}(X(t+s) \mid X(s) = n)$.
(Hint: Divide the customers in the system at time $t + s$ into two groups, one consisting of "old" customers who arrived before time $s$ and the other of "new" customers who arrived after time $s$.)
9. Customers arrive at a bank at a Poisson rate $\lambda$. Suppose two customers arrived during the first hour. What is the probability that
(a) both arrived during the first 20 min?
(b) at least one arrived during the first 20 min?
10. Suppose that people arrive at a bus station in accordance with a Poisson process $N(t)$ with rate $\lambda$. The bus departs at time $t$. Let $X$ denote the total amount of waiting time of all those that get on the bus at time $t$.
(a) What is $E[X \mid N(t) = n]$?
(b) Argue that $\mathrm{Var}[X \mid N(t) = n] = nt^2/12$.
(c) What is $\mathrm{Var}(X)$?
11. We continue the previous question. Suppose the bus departure time is uniformly distributed on $[0, T]$ and the people arrive at the station in accordance with a Poisson process $N(t)$ with rate $\lambda$. Denote by $N$ the total number of people who get on the bus. By conditioning on the departure time being $t$, calculate
(a) $E[N]$;
(b) $\mathrm{Var}(N)$;
(c) Denote by $X$ the total waiting time of all those who get on the bus. Using the results of the previous question, calculate $E[X]$ and $\mathrm{Var}[X]$.
12. For a Poisson process $N(t)$ with rate $\lambda$, show that
(a) given $N(t) = 1$, the conditional distribution of the arrival time $T_1$ is a uniform distribution on $(0, t)$;
(b) generally, given $N(t) = n$, the conditional distribution of the arrival times $T_1, \ldots, T_n$ is the same as the distribution of the order statistics of $n$ uniform random variables on $[0, t]$.
13. For a Poisson process $N(t)$ with rate $\lambda$, show that given $N(t) = n$, the conditional distribution of $N(s)$ for $0 < s < t$ is the binomial distribution $\mathrm{Bin}(n, s/t)$ with success probability $s/t$.
14. Show that for a Poisson process $N(t)$ with rate $\lambda$, the renewal function is $U(t) = \lambda t$.
15. Let $X$ be a Poisson random variable with parameter $\lambda$. Show that $P[X = i]$ increases monotonically and then decreases monotonically as $i$ increases, reaching its maximum when $i$ is the largest integer not exceeding $\lambda$. (Hint: Consider $P[X = i]/P[X = i-1]$.)
16. Let $X$ be a Poisson random variable with parameter $\lambda$. What value of $\lambda$ maximizes $P[X = k]$ for a given $k$?
17. If $X$ is a Poisson random variable with parameter $\lambda$, show that
$$E[X^n] = \lambda E[(X+1)^{n-1}].$$
Use this result to compute $E[X^3]$.
18. Let $X$ be a Poisson random variable with parameter $\lambda$. Using the Taylor expansion for $e^{-\lambda} + e^{\lambda}$, verify that
$$P[X \text{ is even}] = \frac{1}{2}\left[1 + e^{-2\lambda}\right].$$
19. Suppose the number of eggs laid on a tree leaf by an insect of a certain type is a Poisson random variable $X$ with parameter $\lambda$. However, $X$ can only be observed if it is positive. Let $Y$ denote the observed number of eggs; then
$$P[Y = k] = P[X = k \mid X > 0].$$
We call $Y$ the truncated Poisson random variable. Find $E[Y]$.
20. Let $X$ be a Poisson random variable with parameter $\lambda$. Show that its moment generating function is given by
$$M(z) = e^{\lambda(e^z - 1)}.$$
21. Let $N(t)$ for $t > 0$ be a Poisson process with intensity $\lambda$, and let $X_i$ for $i \ge 1$ be independent and identically distributed random variables, independent of $N(t)$, with mean $\mu$ and moment generating function $M(z)$.
(a) Find the moment generating function of $Y(t) = \sum_{i=1}^{N(t)} X_i$.
(b) Calculate the mean $E[Y(t)]$.
(c) Calculate the variance $\mathrm{Var}(Y(t))$.
(d) Calculate $\mathrm{Cov}(N(t), Y(t))$.
22. Suppose $T$ is a nonnegative random variable which is independent of a Poisson process $N(t)$ for $t \ge 0$ with intensity $\lambda$.
(a) Find $\mathrm{Cov}(T, N(T))$.
(b) Find $\mathrm{Var}(N(T))$.
Chapter 4
Parametric Families of Lifetime Distributions
In this chapter, we present four parametric families of lifetime distributions: Weibull distribution, gamma distribution, change-point model, and mixture exponential distribution. They represent parametric extensions of exponential distribution under its two fundamental characteristics: constant failure rate and memoryless property. The notions of increasing (decreasing) failure rate along with a more general mixture Erlang distribution family are also introduced.
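4.1 Weibull Distribution

A random variable $X$ is said to have a Weibull distribution with parameters $\lambda$ and $\alpha$, denoted by $\mathrm{WEI}(\lambda, \alpha)$, if its distribution function has the form
$$F_{\lambda,\alpha}(x) = 1 - e^{-(\lambda x)^{\alpha}}, \quad x \ge 0,$$
where $\lambda$ and $\alpha$ are positive real numbers. Its density function is
$$f_{\lambda,\alpha}(x) = \alpha \lambda^{\alpha} x^{\alpha-1} e^{-(\lambda x)^{\alpha}}, \quad x > 0,$$
and its failure rate function is given by
$$r(t) = \frac{f(t)}{1 - F(t)} = \frac{\alpha \lambda^{\alpha} t^{\alpha-1} e^{-(\lambda t)^{\alpha}}}{e^{-(\lambda t)^{\alpha}}} = \alpha \lambda^{\alpha} t^{\alpha-1},$$
for $t > 0$. Thus, if $\alpha > 1$, the failure rate is increasing; and if $0 < \alpha < 1$, the failure rate is decreasing. Here, increasing means nondecreasing and decreasing means nonincreasing.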
A.K. Gupta et al., Probability and Statistical Models: Foundations for Problems in Reliability and Financial Mathematics, DOI 10.1007/978-0-8176-4987-6_4, © Springer Science+Business Media, LLC 2010
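The parameter $\alpha$ is a shape parameter that determines the shape of the density function. Three basic shapes of the density function are represented by $\alpha < 1$, $\alpha = 1$, or $\alpha > 1$, respectively, and $\alpha = 1$ corresponds to the exponential model. The parameter $\lambda$ is a scale parameter, as every Weibull distribution is related to $\mathrm{WEI}(1, \alpha)$ by rescaling:
$$F_{\lambda,\alpha}(x) = 1 - e^{-(\lambda x)^{\alpha}} = F_{1,\alpha}(\lambda x).$$
That means, if $X$ has Weibull distribution $\mathrm{WEI}(\lambda, \alpha)$, then $\lambda X$ has Weibull distribution $\mathrm{WEI}(1, \alpha)$. Hence, $\mathrm{WEI}(1, \alpha)$ can be regarded as the "standardized" Weibull distribution. The special case $\alpha = 2$ is known as the Rayleigh distribution.

Theorem 4.1. If $X$ has a Weibull distribution $\mathrm{WEI}(\lambda, \alpha)$, then its mean $E[X]$, variance $\mathrm{Var}[X]$, and the $r$th moment $E[X^r]$, $r > 0$, are given by
$$E[X] = \frac{1}{\lambda}\, \Gamma\!\left(1 + \frac{1}{\alpha}\right),$$
$$\mathrm{Var}[X] = \frac{1}{\lambda^2}\left[\Gamma\!\left(1 + \frac{2}{\alpha}\right) - \Gamma^2\!\left(1 + \frac{1}{\alpha}\right)\right],$$
$$E[X^r] = \frac{1}{\lambda^r}\, \Gamma\!\left(1 + \frac{r}{\alpha}\right),$$
where $\Gamma(t)$ denotes the gamma function
$$\Gamma(t) = \int_0^{\infty} x^{t-1} e^{-x}\, dx, \quad t > 0.$$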
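Before we prove Theorem 4.1, we point out the following properties of the gamma function.

Lemma 4.1.
1. $\Gamma(t) = (t-1)\Gamma(t-1)$, $t > 1$.
2. $\Gamma(n) = (n-1)!$, $n = 1, 2, \ldots$.
3. $\Gamma\!\left(\frac{1}{2}\right) = \sqrt{\pi}$.

Proof. For $t > 1$, integration by parts gives
$$\begin{aligned}
\Gamma(t) &= \int_0^{\infty} x^{t-1} e^{-x}\, dx \\
&= -x^{t-1} e^{-x}\Big|_0^{\infty} + (t-1) \int_0^{\infty} x^{(t-1)-1} e^{-x}\, dx \\
&= (t-1)\Gamma(t-1).
\end{aligned}$$
It follows from $\Gamma(1) = 1$ and induction that $\Gamma(n) = (n-1)!$. To show (3), let $t^2 = x$; the change of variable leads to $\Gamma\!\left(\frac{1}{2}\right) = \int_0^{\infty} 2e^{-t^2}\, dt$. Thus,
$$\begin{aligned}
\Gamma^2\!\left(\frac{1}{2}\right) &= \int_0^{\infty} 2e^{-s^2}\, ds \int_0^{\infty} 2e^{-t^2}\, dt \\
&= 4 \int_0^{\infty}\!\!\int_0^{\infty} e^{-(s^2 + t^2)}\, ds\, dt \\
&= 4 \int_0^{\pi/2}\!\!\int_0^{\infty} e^{-\rho^2} \rho\, d\rho\, d\theta \quad \text{(in polar coordinates)} \\
&= \pi.
\end{aligned}$$
Therefore, $\Gamma\!\left(\frac{1}{2}\right) = \sqrt{\pi}$. □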
Proof of Theorem 4.1. We only need to show that for $r > 0$,
$$E[X^r] = \frac{1}{\lambda^r}\, \Gamma\!\left(1 + \frac{r}{\alpha}\right).$$
In fact,
$$\begin{aligned}
E[X^r] &= \int_0^{\infty} x^r \alpha \lambda^{\alpha} x^{\alpha-1} e^{-(\lambda x)^{\alpha}}\, dx \\
&= \int_0^{\infty} \left(\frac{y^{1/\alpha}}{\lambda}\right)^{r} e^{-y}\, dy \quad \text{(letting } y = (\lambda x)^{\alpha}\text{)} \\
&= \frac{1}{\lambda^r} \int_0^{\infty} y^{(1 + r/\alpha) - 1} e^{-y}\, dy \\
&= \frac{1}{\lambda^r}\, \Gamma\!\left(1 + \frac{r}{\alpha}\right).
\end{aligned}$$
It follows that $E[X] = \frac{1}{\lambda} \Gamma\!\left(1 + \frac{1}{\alpha}\right)$, and
$$\mathrm{Var}[X] = E[X^2] - (E[X])^2 = \frac{1}{\lambda^2}\left[\Gamma\!\left(1 + \frac{2}{\alpha}\right) - \Gamma^2\!\left(1 + \frac{1}{\alpha}\right)\right].$$
□

The moment generating function of the Weibull distribution is too complicated to be tractable.

The Weibull distribution has been used successfully to describe fatigue failure (Weibull 1939; the distribution is named after him), vacuum tube failure (Kao 1956), and ball-bearing failure (Lieblein and Zelen 1956). The simplicity of the distribution function of the Weibull distribution and the parameters $\lambda$ and $\alpha$ provide considerable flexibility in applications. It is perhaps the most popular parametric family of lifetime (failure) distributions at the present time in reliability of electronic and mechanical systems and components, but not so much in biological components and systems. This will be seen in the next section.

It has also been shown that the Weibull distribution is one of the limiting (extreme value) distributions of the minimum of independent random variables. The special case of the exponential distribution has been studied in Chap. 2, where $nX_{1,n}$ has the same exponential distribution as the independent and identically distributed exponential random variables $X_1, \ldots, X_n$. In fact, the Weibull distribution has a similar property. If $X_1, \ldots, X_n$ are independent with Weibull distribution
$$F_{\lambda,\alpha}(x) = 1 - \exp(-(\lambda x)^{\alpha}),$$
then $X_{1,n}$ has the Weibull distribution $F_{n^{1/\alpha}\lambda,\,\alpha}(x)$. In reliability terms, $X_{1,n}$ is the lifetime of a series system with $n$ similar components. Thus, the Weibull family is closed under series system formation (the minimum operation).
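The moment formula of Theorem 4.1 and the closure of the Weibull family under minima are easy to check numerically. The following is a minimal sketch using Python's standard `math.gamma`; the function names and parameter values are illustrative assumptions rather than anything prescribed by the text.

```python
import math


def weibull_moment(r: float, lam: float, alpha: float) -> float:
    """E[X^r] = Gamma(1 + r/alpha) / lam^r for X ~ WEI(lam, alpha)."""
    return math.gamma(1.0 + r / alpha) / lam**r


def weibull_failure_rate(t: float, lam: float, alpha: float) -> float:
    """r(t) = alpha * lam^alpha * t^(alpha - 1)."""
    return alpha * lam**alpha * t ** (alpha - 1.0)


def weibull_survival(x: float, lam: float, alpha: float) -> float:
    """1 - F(x) = exp(-(lam * x)^alpha)."""
    return math.exp(-((lam * x) ** alpha))


# Assumed parameters: a Rayleigh case (alpha = 2) and a series system of n = 4 units.
lam, alpha, n = 0.5, 2.0, 4

mean = weibull_moment(1, lam, alpha)
variance = weibull_moment(2, lam, alpha) - mean**2
print(f"mean = {mean:.6f}, variance = {variance:.6f}")

# Closure under the minimum: P[min > x] = (1 - F(x))^n should equal the
# survival function of WEI(n^(1/alpha) * lam, alpha).
x = 1.3
survival_of_min = weibull_survival(x, lam, alpha) ** n
rescaled = weibull_survival(x, n ** (1.0 / alpha) * lam, alpha)
print(f"series-system survival at x = {x}: {survival_of_min:.6f} vs {rescaled:.6f}")
```

With $\alpha = 1$ the moment function returns $1/\lambda$, the exponential mean, and the failure rate function is constant, which matches the remark that $\alpha = 1$ corresponds to the exponential model.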
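4.2 Gamma Distribution

A random variable $X$ is said to have a gamma distribution with parameters $\lambda$ and $\alpha$, denoted by $\mathrm{Gam}(\lambda, \alpha)$, if its density function is given by
$$f_{\lambda,\alpha}(x) = \frac{\lambda^{\alpha} x^{\alpha-1}}{\Gamma(\alpha)}\, e^{-\lambda x}, \quad x > 0,$$
where the parameters $\lambda$ and $\alpha$ are positive real numbers. For an arbitrary $\alpha$, there is no closed form for the distribution function $F_{\lambda,\alpha}(x)$. When $\alpha$ is an integer, $F_{\lambda,\alpha}(x)$ is called an Erlang distribution, and its form can be obtained by integration by parts:
$$F_{\lambda,\alpha}(x) = 1 - \sum_{k=0}^{\alpha-1} \frac{(\lambda x)^k}{k!}\, e^{-\lambda x}, \quad x \ge 0.$$
To calculate the failure rate function $r(t)$, we consider
$$\begin{aligned}
\frac{1}{r(t)} = \frac{1 - F(t)}{f(t)} &= \frac{\displaystyle\int_t^{\infty} \frac{\lambda^{\alpha} x^{\alpha-1}}{\Gamma(\alpha)}\, e^{-\lambda x}\, dx}{\displaystyle\frac{\lambda^{\alpha} t^{\alpha-1}}{\Gamma(\alpha)}\, e^{-\lambda t}} \\
&= \int_t^{\infty} \left(\frac{x}{t}\right)^{\alpha-1} e^{-\lambda(x-t)}\, dx \\
&= \int_0^{\infty} \left(\frac{u+t}{t}\right)^{\alpha-1} e^{-\lambda u}\, du \\
&= \int_0^{\infty} \left(1 + \frac{u}{t}\right)^{\alpha-1} e^{-\lambda u}\, du.
\end{aligned}$$
Thus
$$r(t) = \left[\int_0^{\infty} \left(1 + \frac{u}{t}\right)^{\alpha-1} e^{-\lambda u}\, du\right]^{-1}.$$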
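If $\alpha > 1$, $\left(1 + \frac{u}{t}\right)^{\alpha-1}$ is decreasing in $t$, and hence $r(t)$ is increasing. If $0 < \alpha < 1$, $\left(1 + \frac{u}{t}\right)^{\alpha-1}$ is increasing in $t$, and so $r(t)$ decreases. When $\alpha = 1$, the distribution reduces to the exponential with constant failure rate $\lambda$. Thus, the gamma distribution generalizes the exponential model in a way similar to the Weibull distribution in terms of the failure rate function. For any positive integer $n$, $F_{\lambda,n}(x)$ is simply the distribution of the sum of $n$ independent and identically distributed exponential random variables with parameter $\lambda$.

As with the Weibull distribution, the parameter $\alpha$ is a shape parameter, and $\lambda$ is a scale parameter. The three basic shapes of the density function are given according to $\alpha < 1$, $\alpha = 1$, and $\alpha > 1$. Every gamma distribution $\mathrm{Gam}(\lambda, \alpha)$ is related to $\mathrm{Gam}(1, \alpha)$ by rescaling, i.e., if $X$ has gamma distribution $\mathrm{Gam}(\lambda, \alpha)$, then $\lambda X$ has gamma distribution $\mathrm{Gam}(1, \alpha)$. Thus, $\mathrm{Gam}(1, \alpha)$ can be viewed as the standard gamma distribution. The special case of the gamma distribution with $\lambda = 1/2$ and $\alpha = n/2$ is referred to as the chi-squared distribution with $n$ degrees of freedom, denoted by $\chi^2(n)$.

Theorem 4.2. If $X$ has a gamma distribution $\mathrm{Gam}(\lambda, \alpha)$, then its mean $E[X]$, variance $\mathrm{Var}[X]$, and moment generating function $M_X(t) = E[e^{tX}]$ are given by
$$E[X] = \frac{\alpha}{\lambda}, \quad \mathrm{Var}[X] = \frac{\alpha}{\lambda^2}, \quad \text{and} \quad M_X(t) = \left(\frac{\lambda}{\lambda - t}\right)^{\alpha}, \quad \text{for } t < \lambda.$$

Proof. We start with the moment generating function $M_X(t)$:
$$\begin{aligned}
M_X(t) = E[e^{tX}] &= \int_0^{\infty} e^{tx}\, \frac{\lambda^{\alpha} x^{\alpha-1}}{\Gamma(\alpha)}\, e^{-\lambda x}\, dx \\
&= \left(\frac{\lambda}{\lambda - t}\right)^{\alpha} \int_0^{\infty} \frac{(\lambda - t)^{\alpha} x^{\alpha-1}}{\Gamma(\alpha)}\, e^{-(\lambda - t)x}\, dx \\
&= \left(\frac{\lambda}{\lambda - t}\right)^{\alpha} \int_0^{\infty} f_{\lambda - t,\alpha}(x)\, dx \\
&= \left(\frac{\lambda}{\lambda - t}\right)^{\alpha}.
\end{aligned}$$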
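It follows that
$$M_X'(t) = \alpha \lambda^{\alpha} (\lambda - t)^{-\alpha-1} \quad \text{and} \quad M_X''(t) = \alpha(\alpha+1) \lambda^{\alpha} (\lambda - t)^{-\alpha-2}.$$
Hence,
$$E[X] = M_X'(0) = \frac{\alpha}{\lambda},$$
$$\mathrm{Var}[X] = M_X''(0) - \left[M_X'(0)\right]^2 = \frac{\alpha(\alpha+1)}{\lambda^2} - \frac{\alpha^2}{\lambda^2} = \frac{\alpha}{\lambda^2}.$$
□

In general, we have the following result for any moments of $X$.

Theorem 4.3. Let $X$ have a gamma distribution $\mathrm{Gam}(\lambda, \alpha)$. For any real number $r > -\alpha$,
$$E[X^r] = \frac{\Gamma(\alpha + r)}{\Gamma(\alpha)\lambda^r}.$$

Proof.
$$\begin{aligned}
E[X^r] &= \int_0^{\infty} x^r\, \frac{\lambda^{\alpha} x^{\alpha-1}}{\Gamma(\alpha)}\, e^{-\lambda x}\, dx \\
&= \int_0^{\infty} \frac{\lambda^{\alpha} x^{\alpha+r-1}}{\Gamma(\alpha)}\, e^{-\lambda x}\, dx \\
&= \frac{\Gamma(\alpha + r)}{\Gamma(\alpha)\lambda^r} \int_0^{\infty} \frac{\lambda^{\alpha+r} x^{\alpha+r-1}}{\Gamma(\alpha + r)}\, e^{-\lambda x}\, dx \\
&= \frac{\Gamma(\alpha + r)}{\Gamma(\alpha)\lambda^r},
\end{aligned}$$
since, again, the integrand is the density function of $\mathrm{Gam}(\lambda, \alpha + r)$. □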
The usefulness of Theorem 4.3 can be illustrated by the following example. Let $X$ have a $\chi^2(4)$ distribution, i.e., the distribution of $X$ is $\mathrm{Gam}\left(\frac12, 2\right)$. Then we have
$$E\left[\frac{1}{X}\right] = \frac{\Gamma(2-1)}{\Gamma(2)\left(\frac12\right)^{-1}} = \frac12;$$
$$E\left[\sqrt{X}\right] = \frac{\Gamma\left(2+\frac12\right)}{\Gamma(2)\left(\frac12\right)^{1/2}} = \frac32\cdot\frac12\,\Gamma\left(\frac12\right)\sqrt{2} = \frac34\sqrt{2\pi}.$$

Theorem 4.4. Let $X_i$ have gamma distribution $\mathrm{Gam}(\lambda, \alpha_i)$, $1 \le i \le n$, and let $X_1, X_2, \ldots, X_n$ be independent. Then $Y = X_1 + X_2 + \cdots + X_n$ has gamma distribution $\mathrm{Gam}(\lambda, \alpha_1 + \alpha_2 + \cdots + \alpha_n)$.

Proof. Since $X_1, X_2, \ldots, X_n$ are independent, the moment generating function of $Y$ is
$$M_Y(t) = M_{X_1}(t)M_{X_2}(t)\cdots M_{X_n}(t) = \left(\frac{\lambda}{\lambda - t}\right)^{\alpha_1+\alpha_2+\cdots+\alpha_n}.$$
The uniqueness of the moment generating function yields that $Y$ has gamma distribution $\mathrm{Gam}(\lambda, \alpha_1 + \alpha_2 + \cdots + \alpha_n)$.
□

In reliability terms, suppose $X_i$ represents the lifetime of the $i$th component for $i = 1, \ldots, n$, and the system is formed as a cold-redundant system; that is, when the $i$th component fails, the $(i+1)$th component replaces it as a new one. Then the lifetime of this system of $n$ cold-standby components is $X_1 + \cdots + X_n$. Thus, the gamma family is closed under the formation of cold-standby systems (i.e., under convolution) with the same scale parameter $\lambda$.

Corollary 4.1. Let $X_1, X_2, \ldots, X_n$ be a random sample from a population, and $Y = X_1 + X_2 + \cdots + X_n$.
1. If the population has gamma distribution $\mathrm{Gam}(\lambda, \alpha)$, then $Y$ has gamma distribution $\mathrm{Gam}(\lambda, n\alpha)$.
2. If the population has $\chi^2$ distribution $\chi^2(\nu)$, then $Y$ has $\chi^2$ distribution $\chi^2(n\nu)$.

Proof. By letting $\alpha_1 = \alpha_2 = \cdots = \alpha_n = \alpha$ in Theorem 4.4, we get (1), and (2) follows from $\lambda = 1/2$ and $\alpha = \nu/2$.
□
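A quick simulation can illustrate Theorem 4.4 for a cold-standby system. The sketch below is an illustration under stated assumptions, not a proof: the function name `simulate_cold_standby` and the parameter values are hypothetical, and the check only compares the first two sample moments with those of $\mathrm{Gam}(\lambda, \alpha_1 + \cdots + \alpha_n)$, namely $\sum\alpha_i/\lambda$ and $\sum\alpha_i/\lambda^2$.

```python
import random
import statistics

def simulate_cold_standby(lam, alphas, n_samples=50_000, seed=0):
    """Draw system lifetimes Y = X_1 + ... + X_n with X_i ~ Gam(lam, alpha_i)."""
    rng = random.Random(seed)
    # random.gammavariate takes (shape, scale); the scale equals 1/lam here.
    return [sum(rng.gammavariate(a, 1.0 / lam) for a in alphas)
            for _ in range(n_samples)]

lam, alphas = 2.0, [0.5, 1.0, 1.5]      # illustrative values only
samples = simulate_cold_standby(lam, alphas)
total_shape = sum(alphas)
sample_mean = statistics.fmean(samples)
sample_var = statistics.pvariance(samples)
print(sample_mean, total_shape / lam)        # sample mean vs. mean of Gam(lam, 3)
print(sample_var, total_shape / lam ** 2)    # sample variance vs. variance of Gam(lam, 3)
```

Matching moments is only a sanity check; the full distributional statement rests on the moment generating function argument in the proof.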
Example 4.1. Let $X_1, X_2, \ldots, X_n$ be a random sample from a normal population $N(\mu, \sigma^2)$. When $\mu$ is known,
$$Z_i = \frac{X_i - \mu}{\sigma}$$
has the $N(0, 1)$ distribution, so that
$$Z_i^2 = \frac{(X_i - \mu)^2}{\sigma^2}$$
has the $\chi^2(1)$ distribution. Thus,
$$\sum_{i=1}^{n} \frac{(X_i - \mu)^2}{\sigma^2}$$
has the $\chi^2(n)$ distribution. When $\mu$ is unknown, it is known that
$$\frac{(n-1)S^2}{\sigma^2} = \frac{1}{\sigma^2}\sum_{i=1}^{n} (X_i - \bar X)^2$$
follows the $\chi^2(n-1)$ distribution.

On the one hand, the Weibull and gamma distributions with shape parameter $\alpha > 1$ are of great interest. For example, they may be more suitable as models for the life length of electronic components, where few will have very short life lengths, many will have something close to an average life length, and very few will have extraordinarily long life lengths. In applications, Weibull probability paper is used to plot a data set, in the same way as normal probability paper, to check whether the data are drawn from a Weibull population. On the other hand, the Weibull distribution with shape parameter $\alpha < 1$ has been used to model large claims in insurance risk models (Chap. 9).
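4.3 Change-Point Model

A lifetime $X$ with failure rate function $r(t)$ follows a change-point model if
$$r(t) = \begin{cases} \lambda_0, & \text{for } t \le \tau; \\ \lambda_1, & \text{for } t > \tau, \end{cases}$$
where $\tau$ is called the change-point, $\lambda_0$ the pre-change failure rate, and $\lambda_1$ the post-change failure rate. Obviously, when $\lambda_0 = \lambda_1 = \lambda$, it reduces to the exponential distribution. If $\lambda_1 > \lambda_0$, the failure rate is increasing, and if $\lambda_1 < \lambda_0$, the failure rate is decreasing. Under the change-point model,
$$\bar F(x) = e^{-\int_0^x r(t)\,dt} = \begin{cases} e^{-\lambda_0 x}, & \text{for } x \le \tau; \\ e^{-\lambda_0\tau - \lambda_1(x-\tau)}, & \text{for } x > \tau. \end{cases}$$
Consequently, the probability density function is given by
$$f(x) = \begin{cases} \lambda_0 e^{-\lambda_0 x}, & \text{for } x \le \tau; \\ \lambda_1 e^{-\lambda_0\tau - \lambda_1(x-\tau)}, & \text{for } x > \tau. \end{cases}$$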
The mean of $X$ can be calculated as
$$E[X] = \int_0^\infty (1 - F(x))\,dx = \int_0^\tau e^{-\lambda_0 x}\,dx + e^{-\lambda_0\tau}\int_\tau^\infty e^{-\lambda_1(x-\tau)}\,dx$$
$$= \frac{1}{\lambda_0}\left(1 - e^{-\lambda_0\tau}\right) + e^{-\lambda_0\tau}\int_0^\infty e^{-\lambda_1 x}\,dx = \frac{1}{\lambda_0}\left(1 - e^{-\lambda_0\tau}\right) + \frac{1}{\lambda_1}e^{-\lambda_0\tau}.$$
That means the mean is a weighted average of $1/\lambda_0$ and $1/\lambda_1$, with weights $1 - e^{-\lambda_0\tau}$ and $e^{-\lambda_0\tau}$. The calculation of the variance is left as an exercise.

The change-point model has been widely used in online quality control, where the change represents a sudden change in the dynamic structure of the system. More recent applications can be found in survival analysis, where the baseline hazard function may follow a change-point model (Matthews and Farewell 1982). Further extensions of the single change-point model to multiple change-point models have been used as alternatives to bathtub failure rate functions and epidemic models.
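The closed form for the mean can be compared with a direct numerical integration of the survival function. This is a minimal sketch under stated assumptions: the function names and the parameter values $\lambda_0 = 1$, $\lambda_1 = 3$, $\tau = 0.5$ are hypothetical choices for illustration, and the midpoint rule on a truncated range is only an approximation.

```python
import math

def change_point_survival(x, lam0, lam1, tau):
    """Survival function of the change-point model with rates lam0 (before tau) and lam1 (after)."""
    if x <= tau:
        return math.exp(-lam0 * x)
    return math.exp(-lam0 * tau - lam1 * (x - tau))

def change_point_mean(lam0, lam1, tau):
    """Closed-form mean: weighted average of 1/lam0 and 1/lam1."""
    w = math.exp(-lam0 * tau)   # probability of surviving past the change-point
    return (1.0 - w) / lam0 + w / lam1

def mean_by_integration(lam0, lam1, tau, upper=50.0, steps=200_000):
    """Midpoint-rule approximation of the integral of the survival function over [0, upper]."""
    h = upper / steps
    return sum(change_point_survival((i + 0.5) * h, lam0, lam1, tau)
               for i in range(steps)) * h

closed = change_point_mean(1.0, 3.0, 0.5)
numeric = mean_by_integration(1.0, 3.0, 0.5)
print(closed, numeric)
```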
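4.4 Mixture Exponential Distribution

The simple two-component mixture exponential distribution is defined as
$$F(x) = (1-\alpha)\left(1 - e^{-\lambda_0 x}\right) + \alpha\left(1 - e^{-\lambda_1 x}\right),$$
where $\alpha \in [0, 1]$ is the mixing proportion. The corresponding mixture exponential density function is given by
$$f(x) = (1-\alpha)\lambda_0 e^{-\lambda_0 x} + \alpha\lambda_1 e^{-\lambda_1 x},$$
and its failure rate function is equal to
$$r(x) = \frac{f(x)}{\bar F(x)} = \frac{(1-\alpha)\lambda_0 e^{-\lambda_0 x} + \alpha\lambda_1 e^{-\lambda_1 x}}{(1-\alpha)e^{-\lambda_0 x} + \alpha e^{-\lambda_1 x}}.$$
By taking the derivative of $r(x)$, we have
$$r'(x) = -\frac{(1-\alpha)\lambda_0^2 e^{-\lambda_0 x} + \alpha\lambda_1^2 e^{-\lambda_1 x}}{(1-\alpha)e^{-\lambda_0 x} + \alpha e^{-\lambda_1 x}} + \frac{\left((1-\alpha)\lambda_0 e^{-\lambda_0 x} + \alpha\lambda_1 e^{-\lambda_1 x}\right)^2}{\left((1-\alpha)e^{-\lambda_0 x} + \alpha e^{-\lambda_1 x}\right)^2} \le 0,$$
where the inequality follows from the Cauchy-Schwarz inequality. Thus, its failure rate function is always decreasing.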
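Theorem 4.5. The mean and variance of the mixture exponential distribution are given by
$$E(X) = \frac{1-\alpha}{\lambda_0} + \frac{\alpha}{\lambda_1};$$
$$\mathrm{Var}(X) = \frac{1-\alpha}{\lambda_0^2} + \frac{\alpha}{\lambda_1^2} + \alpha(1-\alpha)\left(\frac{1}{\lambda_0} - \frac{1}{\lambda_1}\right)^2.$$

Proof. We only give the details for the variance and leave the proof for the mean as an exercise. First, the second moment of $X$ can be calculated as
$$E(X^2) = \int_0^\infty x^2\left[(1-\alpha)\lambda_0 e^{-\lambda_0 x} + \alpha\lambda_1 e^{-\lambda_1 x}\right]dx$$
$$= (1-\alpha)\int_0^\infty x^2\lambda_0 e^{-\lambda_0 x}\,dx + \alpha\int_0^\infty x^2\lambda_1 e^{-\lambda_1 x}\,dx$$
$$= (1-\alpha)\left(\frac{1}{\lambda_0^2} + \frac{1}{\lambda_0^2}\right) + \alpha\left(\frac{1}{\lambda_1^2} + \frac{1}{\lambda_1^2}\right).$$
Thus,
$$\mathrm{Var}(X) = E(X^2) - (EX)^2 = (1-\alpha)\left(\frac{1}{\lambda_0^2} + \frac{1}{\lambda_0^2}\right) + \alpha\left(\frac{1}{\lambda_1^2} + \frac{1}{\lambda_1^2}\right) - \left(\frac{1-\alpha}{\lambda_0} + \frac{\alpha}{\lambda_1}\right)^2$$
$$= \frac{1-\alpha}{\lambda_0^2} + \frac{\alpha}{\lambda_1^2} + \frac{1-\alpha}{\lambda_0^2} + \frac{\alpha}{\lambda_1^2} - \frac{(1-\alpha)^2}{\lambda_0^2} - \frac{2\alpha(1-\alpha)}{\lambda_0\lambda_1} - \frac{\alpha^2}{\lambda_1^2}$$
$$= \frac{1-\alpha}{\lambda_0^2} + \frac{\alpha}{\lambda_1^2} + \frac{\alpha(1-\alpha)}{\lambda_0^2} - \frac{2\alpha(1-\alpha)}{\lambda_0\lambda_1} + \frac{\alpha(1-\alpha)}{\lambda_1^2}$$
$$= \frac{1-\alpha}{\lambda_0^2} + \frac{\alpha}{\lambda_1^2} + \alpha(1-\alpha)\left(\frac{1}{\lambda_0} - \frac{1}{\lambda_1}\right)^2.$$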
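□

Generally, let $H(\lambda)$ be a mixing distribution for $0 < \lambda < \infty$. We call
$$F(x) = \int_0^\infty \left(1 - e^{-\lambda x}\right)dH(\lambda)$$
a mixture exponential distribution. Its failure rate function can be calculated as
$$r(x) = -\frac{d}{dx}\ln \bar F(x) = \frac{\int_0^\infty \lambda e^{-\lambda x}\,dH(\lambda)}{\int_0^\infty e^{-\lambda x}\,dH(\lambda)}.$$
To analyze $r(x)$, we note that for each fixed $x$,
$$dH_x(\lambda) = \frac{e^{-\lambda x}\,dH(\lambda)}{\int_0^\infty e^{-\lambda x}\,dH(\lambda)}$$
defines a new distribution function $H_x(\lambda)$ for $\lambda$. Its $k$th moment can be calculated as
$$\int_0^\infty \lambda^k\,dH_x(\lambda) = \frac{\int_0^\infty \lambda^k e^{-\lambda x}\,dH(\lambda)}{\int_0^\infty e^{-\lambda x}\,dH(\lambda)}.$$
The derivative of $r(x)$ is equal to
$$r'(x) = -\frac{\int_0^\infty \lambda^2 e^{-\lambda x}\,dH(\lambda)}{\int_0^\infty e^{-\lambda x}\,dH(\lambda)} + \left(\frac{\int_0^\infty \lambda e^{-\lambda x}\,dH(\lambda)}{\int_0^\infty e^{-\lambda x}\,dH(\lambda)}\right)^2 = -\int_0^\infty \lambda^2\,dH_x(\lambda) + \left(\int_0^\infty \lambda\,dH_x(\lambda)\right)^2 \le 0,$$
since the last expression is the negative of the variance of $H_x(\lambda)$. Thus, the mixture exponential distribution always has a decreasing failure rate function.

The mixture exponential distribution has been used as a model for claim amounts in insurance mathematics, as a lifetime distribution for systems under shock models when there are multiple shock sources, and as a lifetime distribution for components that are produced on different assembly lines.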
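For the two-component case, both the variance formula of Theorem 4.5 and the decreasing failure rate can be checked numerically. The following is a minimal sketch; the function names and the values $\alpha = 0.4$, $\lambda_0 = 1$, $\lambda_1 = 4$ are hypothetical and chosen only for illustration. The cross-check uses the second moment $E(X^2) = 2(1-\alpha)/\lambda_0^2 + 2\alpha/\lambda_1^2$ from the proof.

```python
import math

def mixture_exp_mean_var(alpha, lam0, lam1):
    """Mean and variance from Theorem 4.5 for the two-component mixture."""
    mean = (1 - alpha) / lam0 + alpha / lam1
    var = ((1 - alpha) / lam0 ** 2 + alpha / lam1 ** 2
           + alpha * (1 - alpha) * (1 / lam0 - 1 / lam1) ** 2)
    return mean, var

def mixture_exp_failure_rate(x, alpha, lam0, lam1):
    """r(x) = f(x) / (1 - F(x)) for the two-component mixture."""
    w0 = (1 - alpha) * math.exp(-lam0 * x)
    w1 = alpha * math.exp(-lam1 * x)
    return (lam0 * w0 + lam1 * w1) / (w0 + w1)

alpha, lam0, lam1 = 0.4, 1.0, 4.0   # illustrative values only
mean, var = mixture_exp_mean_var(alpha, lam0, lam1)
second_moment = 2 * (1 - alpha) / lam0 ** 2 + 2 * alpha / lam1 ** 2
# Failure rate on the grid x = 0.0, 0.1, ..., 5.0.
rates = [mixture_exp_failure_rate(0.1 * i, alpha, lam0, lam1) for i in range(51)]
print(mean, var, second_moment - mean ** 2)
print(rates[0], rates[-1])
```

The failure rate starts at the mixture average $(1-\alpha)\lambda_0 + \alpha\lambda_1$ and decreases toward the smaller of the two rates, since the component with the larger rate is progressively filtered out of the surviving population.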
4.5 IFR (DFR) and Mixture Erlang Distribution

The four parametric families above mainly generalize the exponential distribution by extending its constant failure rate to increasing or decreasing failure rate functions. To understand how they also generalize the exponential distribution in terms of the memoryless property, we recall that the residual life has the distribution function
$$F(x|t) = P(X \le t + x \mid X > t) = \frac{F(x+t) - F(t)}{1 - F(t)} = 1 - \exp\left\{-\int_t^{x+t} r(s)\,ds\right\}.$$
Since
$$\frac{d}{dt}\int_t^{t+x} r(s)\,ds = r(t+x) - r(t),$$
$r(t)$ is increasing if and only if $\int_t^{t+x} r(s)\,ds$ is increasing in $t$ for any $x > 0$. So we have the following result.
Theorem 4.6. If the failure rate function $r(t)$ for $X$ exists, then $r(t)$ is increasing (decreasing) if and only if the residual life distribution $F(x|t) = \frac{F(x+t) - F(t)}{1 - F(t)}$ is increasing (decreasing) in $t$ for any $x > 0$.

Therefore, the IFR (DFR) class (Increasing Failure Rate (Decreasing Failure Rate)) can be defined based on the residual life distribution $F(x|t)$ being increasing or decreasing in $t$, whether or not the failure rate function $r(t)$ exists.

Remark. A typical probability model popular in reliability and biostatistics is a combination of IFR and DFR. There are many data sets in reliability testing, clinical trials, and biostatistics surveys that reveal the so-called "bathtub" shape failure rate. The failure rate is initially decreasing during the "infant mortality" phase, then remains relatively constant during the "useful life" phase, and finally reaches the "wear-out" phase, i.e., the failure rate increases. The following is a typical example:

Example 4.2 (Mixture Erlang Distribution). Let
$$f_k(x) = \frac{\lambda(\lambda x)^{k-1}e^{-\lambda x}}{(k-1)!}, \quad \text{for } k = 1, 2, \ldots,$$
be the Erlang density function. The Erlang density can be seen as a special case of the gamma distribution with integer shape parameter, and it can also be seen as a convolution of $k$ identical exponential density functions. Its cumulative distribution function can be calculated by integrating by parts $k$ times as
$$F_k(x) = \int_0^x f_k(y)\,dy = \int_0^x \frac{\lambda(\lambda y)^{k-1}e^{-\lambda y}}{(k-1)!}\,dy = 1 - e^{-\lambda x}\sum_{j=0}^{k-1}\frac{(\lambda x)^j}{j!}.$$
Obviously, every Erlang distribution is IFR. Our interest is in the following mixture Erlang density:
$$f(x) = \sum_{k=1}^{r} q_k f_k(x),$$
where $q_1, \ldots, q_r$ with $q_1 + \cdots + q_r = 1$ are the mixing proportions. Denote
$$Q_j = q_1 + \cdots + q_j, \quad \text{and} \quad \bar Q_j = q_{j+1} + \cdots + q_r,$$
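with $\bar Q_0 = 1$. Then the corresponding survival function $\bar F(x)$ can be calculated as
$$\bar F(x) = \sum_{k=1}^{r} q_k \bar F_k(x) = e^{-\lambda x}\sum_{k=1}^{r} q_k \sum_{j=0}^{k-1}\frac{(\lambda x)^j}{j!} = e^{-\lambda x}\sum_{j=0}^{r-1}\frac{(\lambda x)^j}{j!}\sum_{k=j+1}^{r} q_k = e^{-\lambda x}\sum_{j=0}^{r-1}\bar Q_j\frac{(\lambda x)^j}{j!}.$$
We shall denote by $E_{1,2,\ldots,r}$ the above mixture Erlang distribution of order $r$. Its failure rate function can thus be written as
$$r(x) = \frac{f(x)}{\bar F(x)} = \lambda\,\frac{\sum_{j=0}^{r-1} q_{j+1}(\lambda x)^j/j!}{\sum_{j=0}^{r-1} \bar Q_j(\lambda x)^j/j!} = \lambda\left[1 - \frac{\sum_{j=0}^{r-2} \bar Q_{j+1}(\lambda x)^j/j!}{\sum_{j=0}^{r-1} \bar Q_j(\lambda x)^j/j!}\right],$$
where the last step uses $q_{j+1} = \bar Q_j - \bar Q_{j+1}$ and $\bar Q_r = 0$.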
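By taking the derivative, we obtain
$$r'(x) = -\lambda^2\,\frac{\left(\sum_{j=0}^{r-3}\bar Q_{j+2}\frac{(\lambda x)^j}{j!}\right)\left(\sum_{j=0}^{r-1}\bar Q_j\frac{(\lambda x)^j}{j!}\right) - \left(\sum_{j=0}^{r-2}\bar Q_{j+1}\frac{(\lambda x)^j}{j!}\right)^2}{\left(\sum_{j=0}^{r-1}\bar Q_j\frac{(\lambda x)^j}{j!}\right)^2}.$$
That means $r'(x)$ has the same sign as the term
$$-\left[\left(\sum_{j=0}^{r-3}\bar Q_{j+2}\frac{(\lambda x)^j}{j!}\right)\left(\sum_{j=0}^{r-1}\bar Q_j\frac{(\lambda x)^j}{j!}\right) - \left(\sum_{j=0}^{r-2}\bar Q_{j+1}\frac{(\lambda x)^j}{j!}\right)^2\right].$$
Let us consider two special cases:

(a) $r = 2$. The above term becomes $\bar Q_1^2 > 0$. Therefore, $E_{1,2}$ is always IFR.

(b) $r = 3$. The sign of $r'(x)$ is the same as the sign of
$$-\left[\bar Q_2 - \bar Q_1^2 - \bar Q_2\bar Q_1(\lambda x) - \bar Q_2^2(\lambda x)^2/2\right] = -q_3 + (q_2+q_3)^2 + q_3(q_2+q_3)(\lambda x) + q_3^2(\lambda x)^2/2.$$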
Thus, we see that $E_{1,2,3}$ is IFR if and only if
$$-q_3 + (q_2 + q_3)^2 \ge 0.$$
That means, if $q_3 > (q_2 + q_3)^2$, then $r'(x)$ changes from negative to positive; that is, $r(x)$ is first decreasing and then increasing.

Mixture Erlang distributions define a very broad parametric distribution class, since any absolutely continuous distribution on $(0, \infty)$ may be approximated arbitrarily accurately by a distribution of this type. In fact, for $F(0) = 0$ and arbitrary $\Delta > 0$, the distribution function $F_\Delta(x)$ defined by the density function
$$F_\Delta'(x) = \sum_{k=1}^{\infty} p_k(\Delta)\,\frac{(1/\Delta)^k x^{k-1} e^{-x/\Delta}}{(k-1)!},$$
where $p_k(\Delta) = F(k\Delta) - F((k-1)\Delta)$, satisfies $\lim_{\Delta\to 0} F_\Delta(x) = F(x)$ at any continuity point $x$ of $F$ (Tijms 1994). Thus, mixture Erlang distributions can be used to approximate any continuous distribution by matching a certain number of finite moments.

We conclude this section by noting that if $X$ has a distribution that is both IFR and DFR, then $X$ must have an exponential distribution. A more systematic discussion of various nonparametric lifetime distribution classes based on the two fundamental properties of the exponential distribution will be given in the next chapter.
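The bathtub behavior of $E_{1,2,3}$ described in Example 4.2 can be seen numerically. The sketch below is illustrative only: the function name `mixture_erlang_failure_rate` and the two sets of mixing proportions are hypothetical. The first set satisfies $q_3 > (q_2+q_3)^2$, the second satisfies $-q_3 + (q_2+q_3)^2 \ge 0$.

```python
import math

def mixture_erlang_failure_rate(x, q, lam=1.0):
    """Failure rate of the mixture Erlang density sum_k q[k-1] * f_k(x)."""
    r = len(q)
    # tail[j] = q_{j+1} + ... + q_r, i.e. Q-bar_j for j = 0, ..., r-1.
    tail = [sum(q[j:]) for j in range(r)]
    terms = [(lam * x) ** j / math.factorial(j) for j in range(r)]
    numerator = lam * sum(q[j] * terms[j] for j in range(r))
    denominator = sum(tail[j] * terms[j] for j in range(r))
    return numerator / denominator

grid = [0.05 * i for i in range(101)]            # x in [0, 5]
q_bathtub = (0.5, 0.0, 0.5)                      # q3 = 0.5 > (q2 + q3)^2 = 0.25
q_ifr = (0.2, 0.6, 0.2)                          # q3 = 0.2 <= (q2 + q3)^2 = 0.64
bathtub_rates = [mixture_erlang_failure_rate(x, q_bathtub) for x in grid]
ifr_rates = [mixture_erlang_failure_rate(x, q_ifr) for x in grid]
turning_index = bathtub_rates.index(min(bathtub_rates))
print(grid[turning_index], bathtub_rates[turning_index])
print(ifr_rates[0], ifr_rates[-1])
```

For the first set of proportions the failure rate falls to an interior minimum on the grid and then rises, while for the second it increases throughout.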
Problems

1. Consider the following two Weibull distributions as survival models:
(a) $\lambda = 1.0$, $\alpha = 0.5$
(b) $\lambda = 0.5$, $\alpha = 2.0$
For each distribution, find
(a) The survival distribution
(b) The failure rate function
(c) The mean and variance.
Which distribution gives the larger survival probability of at least three units of time?

2. Suppose the pain relief time follows the gamma distribution with $\lambda = 1$, $\alpha = 0.5$. Find the mean and variance.
3. Suppose that D 1, ˛ D 2:0 for the gamma distribution. Find (a) The survival function (b) The failure rate function (c) The mean and variance. 4. Consider a life-testing distribution with density function of the form f .x/ D 2 2xex for x > 0. (a) Find the cumulative distribution function F(x) and the survival function. (b) Find the failure rate function r.x/. (c) Does this model have the IFR property? 5. Suppose items are produced from two different assembly lines. An item is produced from Line 1 with probability 0.7 and its lifetime distribution is exponential with mean 1.0. It is produced from Line 2 with probability 0.3 and its lifetime distribution is exponential with mean 1.5. (a) (b) (c) (d)
Find the mixture exponential distribution for the lifetime. Find its corresponding density function. Calculate the failure rate function. Show that the failure rate function is always decreasing.
6. Suppose the survival times for a group of leukemia patients have a failure rate following a change-point model: $r(t) = 0.2$ for $t \le 0.5$ and $r(t) = 0.5$ for $t > 0.5$.
(a) Write down the survival function.
(b) Find the probability that a patient will survive more than 1.0.
(c) What is the mean survival time?

7. Suppose the "O-rings" on the spaceship Columbia have the Weibull lifetime distribution with failure rate function $r(t) = t^{1/2}$. What is the probability that this O-ring is still in use after two years?

8. The lifetime, $T$, of a semiconductor device has a gamma distribution with $\lambda = 0.01$ and $\alpha = 5$.
(a) Find the probability that the device fails before 500 months.
(b) What is the failure rate of the device at age $t = 500$?
(c) What are the mean and standard deviation of the life distribution?

9. A large number of identical relays have times to first failure that follow a Weibull distribution with parameters $\lambda = 0.1$ and $\alpha = 0.5$ years. What is the probability that a relay will survive (a) 1 year, (b) 5 years, and (c) 10 years without failure, and what is the mean time to the first failure?

10. The variation in the output power of a motor is found to follow a gamma distribution with $\lambda = 0.01$ and $\alpha = 3$. What is the probability that the power output is less than 200 kW?
11. Suppose a component with mean operating time of 100 h has five spares.
(a) What is the expected operation time to be obtained from the component and spares?
(b) If the component lifetimes are exponentially distributed, find the reliability $R(200)$, the probability that the system is still in operation after 200 h.
(c) If we want the reliability $R(200) = 0.95$, how many spares are needed to achieve it?

12. Prove that a necessary and sufficient condition for a random variable with support $[0, \infty)$ to be distributed with $F(x) = 1 - \exp(-(x/\beta)^{\alpha})$ is that
$$E[X^{\alpha} \mid X > y] = y^{\alpha} + \beta^{\alpha}; \quad \alpha > 0, \ 0 \le y < \infty.$$

13. Prove that the failure rate function is $r(t) = \alpha\lambda^{\alpha}t^{\alpha-1}$, $t > 0$, if and only if the lifetime distribution is $\mathrm{WEI}(\lambda, \alpha)$.

14. Find the density function of the minimum of $n$ i.i.d. Weibull random variables.

15. Let $X$ and $Y$ be independent, nondegenerate, and positive random variables. Then $X + Y$ and $X/Y$ are independent if and only if both $X$ and $Y$ have gamma distributions with the same scale parameter.

16. Let $X_i$, $i = 1, \ldots, n$, be i.i.d. random variables. Prove that if the distribution of $X_i$ is IFR, then so is the distribution of the $i$th order statistic $X_{i,n}$.

17. Let $X$ and $Y$ be independent exponential random variables with the same rate $\lambda$.
(a) Show that the conditional density function of $X$, given $X + Y = c$ for $c > 0$, is uniform:
$$f_{X|X+Y}(x|c) = \frac{1}{c}, \quad 0 < x < c.$$
(b) What is $E[X \mid X + Y = c]$?

18. Let $X$ and $Y$ be independent exponential random variables with rates $\lambda$ and $\delta$, respectively. Calculate the conditional density function and conditional mean of $X$ given $X + Y = c$ for $c > 0$.

19. Show that the mixture Erlang distribution $E_{1,2}$ is always IFR.

20. Show that the mixture Erlang distribution $E_{1,2,3}$ with $q_2 = 0$ has a failure rate which is first decreasing and then increasing.

21. Show that a plot of $\log\{\log[(1 - F(x))^{-1}]\}$ against $\log x$ will be a straight line with slope $\alpha$ for a Weibull distribution.

22. Calculate the variance of $X$ whose failure rate follows the change-point model.
Chapter 5
Lifetime Distribution Classes
At the end of Chap. 4, we introduced the IFR and DFR lifetime distribution classes based on the failure rate and the residual life distribution. The IFR and DFR properties are quite strong, as they require full knowledge of the residual life distribution $F(x|t) = P[X \le t + x \mid X > t]$ for all $t, x \ge 0$. In many practical situations, we may not need or have such full knowledge. For example, in reliability maintenance we may be interested only in the mean residual life in order to determine the maintenance schedule. In insurance mathematics, we may be interested only in the mean stop-loss claims. In this chapter, we shall introduce several lifetime distribution classes that have gained wide application in reliability, insurance, and economics. Since these notions originated in reliability, their main properties will be discussed in reliability terminology. More applications to other areas can be seen in later chapters.

5.1 IFR and DFR

In this section, we look at the IFR and DFR classes in more detail, including their distributional properties.
5.1.1 IFR and PF2

Definition 5.1. A nonnegative function $h(x)$ is said to be in the class of Pólya frequency functions of order 2 ($\mathrm{PF}_2$) if for all $-\infty < x_1 < x_2 < \infty$ and $-\infty < y_1 < y_2 < \infty$,
$$\begin{vmatrix} h(x_1 - y_1) & h(x_1 - y_2) \\ h(x_2 - y_1) & h(x_2 - y_2) \end{vmatrix} \ge 0.$$
This simply means that for any $x_1 < x_2$ and $y_1 < y_2$,
$$h(x_1 - y_1)h(x_2 - y_2) - h(x_1 - y_2)h(x_2 - y_1) \ge 0.$$

A.K. Gupta et al., Probability and Statistical Models: Foundations for Problems in Reliability and Financial Mathematics, DOI 10.1007/978-0-8176-4987-6_5, © Springer Science+Business Media, LLC 2010
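Lemma 5.1. Let $h(x)$ be a nonnegative function on $(-\infty, \infty)$. The following are equivalent:
1. $h(x)$ is $\mathrm{PF}_2$.
2. $h(x)$ is log-concave, i.e., $\log[h(x)]$ is concave in the sense that
$$\frac12\{\log[h(x)] + \log[h(y)]\} \le \log\left[h\left(\frac{x+y}{2}\right)\right], \quad x, y \in (-\infty, \infty).$$
3. For any fixed $\delta > 0$, $\frac{h(x+\delta)}{h(x)}$ is decreasing in $x$ for $a < x < b$, where $a = \inf\{y \mid h(y) > 0\}$ and $b = \sup\{y \mid h(y) > 0\}$.

Remark. Pólya first introduced the notion of $\mathrm{PF}_2$ for other purposes. Due to the above equivalence, it turns out to be another characterization of the IFR property. If $h(x)$ is continuous, the concavity above is equivalent to
$$\alpha\log[h(x)] + (1-\alpha)\log[h(y)] \le \log[h(\alpha x + (1-\alpha)y)], \quad \alpha \in (0, 1).$$

Proof. (1)$\Rightarrow$(2) Suppose $h(x)$ is $\mathrm{PF}_2$. Then for any $x_1 < x_2$ and $y_1 < y_2$,
$$h(x_1 - y_1)h(x_2 - y_2) - h(x_1 - y_2)h(x_2 - y_1) \ge 0.$$
Let $\delta, \epsilon > 0$ and $x \in (-\infty, \infty)$. If we let
$$x_1 = x + \delta, \quad x_2 = x + \delta + \epsilon, \quad y_1 = 0, \quad \text{and} \quad y_2 = \delta,$$
then
$$x_1 - y_1 = x + \delta, \quad x_2 - y_2 = x + \epsilon, \quad x_1 - y_2 = x, \quad \text{and} \quad x_2 - y_1 = x + \delta + \epsilon,$$
so that
$$h(x + \delta)h(x + \epsilon) - h(x)h(x + \delta + \epsilon) \ge 0.$$
This implies that
$$\log[h(x)] + \log[h(x + \delta + \epsilon)] \le \log[h(x + \delta)] + \log[h(x + \epsilon)].$$
By letting $\delta = \epsilon$, we get
$$\frac12\{\log[h(x)] + \log[h(x + 2\delta)]\} \le \log\left[h\left(\frac{x + (x + 2\delta)}{2}\right)\right],$$
and hence (2) holds.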
(2)$\Rightarrow$(3) Assume (2). Let $\delta > 0$ be fixed, and $x < y$. Then $x < x + \delta$ and $y < y + \delta$. From the property of a concave function (see Hardy et al. 1952), the graph of $\log[h(x)]$ is above the secant line joining the points $(x, \log[h(x)])$ and $(y + \delta, \log[h(y + \delta)])$. Thus, noticing that there is an $\alpha \in (0, 1)$ such that
$$\alpha x + (1-\alpha)(y + \delta) = x + \delta \quad \text{and} \quad (1-\alpha)x + \alpha(y + \delta) = y,$$
we have
$$\alpha\log[h(x)] + (1-\alpha)\log[h(y + \delta)] \le \log[h(x + \delta)],$$
and
$$(1-\alpha)\log[h(x)] + \alpha\log[h(y + \delta)] \le \log[h(y)].$$
Hence
$$\log[h(x)] + \log[h(y + \delta)] \le \log[h(x + \delta)] + \log[h(y)].$$
This is equivalent to
$$\frac{h(x + \delta)}{h(x)} \ge \frac{h(y + \delta)}{h(y)}, \quad x < y,$$
and we have (3).

(3)$\Rightarrow$(1) Assume that for each given $\delta > 0$, $\frac{h(x + \delta)}{h(x)}$ is decreasing in $x$. Then for $x < y$, we have
$$h(x + \delta)h(y) \ge h(y + \delta)h(x), \quad \text{or} \quad h(x + \delta)h(y) - h(y + \delta)h(x) \ge 0.$$
For any $x_1 < x_2$ and $y_1 < y_2$, if we let
$$\delta = y_2 - y_1, \quad x = x_1 - y_2, \quad \text{and} \quad y = x_2 - y_2,$$
then $\delta > 0$ and $x < y$. We have
$$h(x_1 - y_1)h(x_2 - y_2) - h(x_1 - y_2)h(x_2 - y_1) = h(x + \delta)h(y) - h(x)h(y + \delta) \ge 0,$$
i.e.,
$$\begin{vmatrix} h(x_1 - y_1) & h(x_1 - y_2) \\ h(x_2 - y_1) & h(x_2 - y_2) \end{vmatrix} \ge 0,$$
and $h(x)$ is $\mathrm{PF}_2$.
□
We can see that it is the logarithmic concavity that provides a way to verify the IFR property.

Theorem 5.1. Let $F(x)$ be a lifetime distribution function. $F(x)$ has IFR if and only if the survival function $\bar F(x) = 1 - F(x)$ is $\mathrm{PF}_2$.

Proof. Let $F(x)$ have IFR. Then for each $x$,
$$\frac{1 - F(x + t)}{1 - F(t)} = \frac{\bar F(x + t)}{\bar F(t)}$$
is decreasing in $t$, and Lemma 5.1 implies that $\bar F(x)$ is $\mathrm{PF}_2$. The converse follows from Lemma 5.1 as well.
□
5.1.2 Smoothness of IFR Distribution

In Theorem 5.1, we did not assume that $F(x)$ has a density function. However, the IFR property of $F(x)$ implies that $\bar F(x)$ is $\mathrm{PF}_2$, which is equivalent to $\log[\bar F(x)]$ being concave, and hence $F(x)$ has a certain smoothness.

Theorem 5.2. If $F(x)$ has IFR and $F(z) < 1$, then $F(x)$ is absolutely continuous (i.e., has a density function) on $(-\infty, z)$.

Proof. We define the hazard function as $R(x) = -\log[\bar F(x)]$. Since $\log[\bar F(x)]$ is concave and decreasing, $R(x)$ is convex and increasing. Thus,
$$\frac{R(z + h) - R(z)}{h}$$
is nonnegative and decreases as $h$ decreases to 0, so its limit as $h \to 0^+$ exists. Let
$$r_+(z) = \lim_{\delta \to 0^+}\frac{R(z + \delta) - R(z)}{\delta}.$$
Notice that a convex function has an increasing right derivative. For any $\alpha < \beta < z$, we have
$$|R(\beta) - R(\alpha)| \le r_+(z)|\beta - \alpha|.$$
Given $\epsilon > 0$, if $\alpha_1 < \beta_1 < \alpha_2 < \beta_2 < \cdots < \alpha_m < \beta_m < z$ with
$$\sum_{i=1}^{m}(\beta_i - \alpha_i) < \frac{\epsilon}{r_+(z)},$$
then
$$\sum_{i=1}^{m}|R(\beta_i) - R(\alpha_i)| \le \sum_{i=1}^{m} r_+(z)(\beta_i - \alpha_i) < \epsilon.$$
Therefore, $F(x)$ is absolutely continuous on $(-\infty, z)$.
□
Theorem 5.2 does not imply that $F(x)$ has a density function on $(-\infty, \infty)$, since an IFR distribution may have a jump at the right endpoint of its interval of support. For example, if we consider a truncated exponential distribution with distribution function $F(x) = 1 - e^{-\lambda x}$ if $0 \le x < 10$, and $F(x) = 1$ if $x \ge 10$, then $F(x)$ has IFR, but it does not have a density function.
5.1.3 A Sufficient Condition

When the density function $f(x)$ exists, the following theorem provides a sufficient condition based on $f(x)$ for $F(x)$ to be IFR.

Theorem 5.3. If $f(x)$ is a $\mathrm{PF}_2$ density function on $[0, \infty)$, then the corresponding distribution function $F(x) = \int_0^x f(t)\,dt$ has IFR.

Proof. To show that $r(t) = \frac{f(t)}{\bar F(t)}$ is increasing in $t$, consider for $t_1 < t_2$,
$$f(t_1)\bar F(t_2) - f(t_2)\bar F(t_1) = \int_0^\infty \begin{vmatrix} f(t_1) & f(t_2) \\ f(t_1 + x) & f(t_2 + x) \end{vmatrix} dx.$$
Since $f(t)$ is $\mathrm{PF}_2$, Lemma 5.1 implies that for each $x$,
$$\frac{f(t_2 + x)}{f(t_2)} \le \frac{f(t_1 + x)}{f(t_1)},$$
and hence,
$$\begin{vmatrix} f(t_1) & f(t_2) \\ f(t_1 + x) & f(t_2 + x) \end{vmatrix} = f(t_1)f(t_2 + x) - f(t_2)f(t_1 + x) \le 0.$$
We have
$$\int_0^\infty \begin{vmatrix} f(t_1) & f(t_2) \\ f(t_1 + x) & f(t_2 + x) \end{vmatrix} dx \le 0,$$
and
$$\frac{f(t_1)}{1 - F(t_1)} \le \frac{f(t_2)}{1 - F(t_2)}.$$
This completes the proof.
□
Example 5.1. Let $f(x)$ be the probability density function of the truncated normal distribution, given by
$$f(x) = \begin{cases} \dfrac{1}{a\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}, & 0 \le x < \infty, \\ 0, & x < 0, \end{cases}$$
where $\sigma > 0$, $-\infty < \mu < \infty$, and $a = \int_0^\infty \frac{1}{\sigma\sqrt{2\pi}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx$. After examining failure data for a wide variety of items, Davis (1952) has shown that items manufactured and tested under close control may be fitted nicely with a truncated normal life distribution. If we look at $\log f(x)$, we have
$$\log f(x) = -\log\left(a\sigma\sqrt{2\pi}\right) - \frac{(x - \mu)^2}{2\sigma^2}, \quad x \ge 0.$$
This is a concave function on $[0, \infty)$. Therefore, $f(x)$ is $\mathrm{PF}_2$. Now Theorem 5.3 implies that the truncated normal distribution has IFR.

Similar to the IFR class, we can establish parallel results for DFR distributions. We present one such result here. We will use the fact that DFR corresponds to logarithmic convexity.

Theorem 5.4. If $f(x)$ is a density function on $[0, \infty)$ such that $\log f(x)$ is convex on $[0, \infty)$, then the corresponding distribution function $F(x)$ has DFR.

Proof. By an argument similar to the proof of (2)$\Rightarrow$(3) in Lemma 5.1, for $x < y$ and $\delta > 0$, we see that
$$f(x + \delta)f(y) - f(x)f(y + \delta) \le 0.$$
This implies that
$$\begin{vmatrix} f(y) & f(x) \\ f(y + \delta) & f(x + \delta) \end{vmatrix} \le 0,$$
so that
$$\begin{vmatrix} f(y) & f(x) \\ \bar F(y) & \bar F(x) \end{vmatrix} = \int_0^\infty \begin{vmatrix} f(y) & f(x) \\ f(y + \delta) & f(x + \delta) \end{vmatrix} d\delta \le 0.$$
Hence,
$$\frac{f(y)}{\bar F(y)} \le \frac{f(x)}{\bar F(x)},$$
and $F(x)$ has DFR.
□
Logarithmic concavity or convexity is easier to verify than the properties of IFR or DFR.
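The remark about log-concavity can be illustrated with the truncated normal distribution of Example 5.1. This is a minimal sketch, not part of the original text: the function names, the grid, and the parameters $\mu = 1$, $\sigma = 1$ are hypothetical. The normalizing constant $a$ cancels in the ratio $f/\bar F$ and does not affect log-concavity, so it is omitted.

```python
import math

def truncated_normal_failure_rate(t, mu, sigma):
    """r(t) = f(t) / (1 - F(t)) for a N(mu, sigma^2) density truncated to [0, inf), t >= 0."""
    z = (t - mu) / sigma
    density = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    survival = 0.5 * math.erfc(z / math.sqrt(2.0))
    # The normalizing constant a cancels between density and survival.
    return density / survival

def is_log_concave_on_grid(values, tol=1e-12):
    """Check log v[i-1] + log v[i+1] <= 2 log v[i] on an evenly spaced grid."""
    logs = [math.log(v) for v in values]
    return all(logs[i - 1] + logs[i + 1] <= 2.0 * logs[i] + tol
               for i in range(1, len(logs) - 1))

mu, sigma = 1.0, 1.0                       # illustrative parameters
grid = [0.1 * i for i in range(61)]        # t in [0, 6]
# Density up to a positive constant factor, which does not matter for log-concavity.
density_shape = [math.exp(-0.5 * ((t - mu) / sigma) ** 2) for t in grid]
rates = [truncated_normal_failure_rate(t, mu, sigma) for t in grid]
log_concave = is_log_concave_on_grid(density_shape)
rate_increasing = all(a < b for a, b in zip(rates, rates[1:]))
print(log_concave, rate_increasing)
```

The grid check is only a necessary condition evaluated at finitely many points; the analytic argument in Example 5.1 is what establishes the IFR property.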
5.2 IFRA and DFRA Classes

We first give a generalization of IFR (DFR) based on the monotone property of the failure rate function. If $r(t)$ is the failure rate function of a distribution $F(x)$, we define the hazard function as
$$R(t) = \int_0^t r(u)\,du = -\log[1 - F(t)].$$
Thus, with the survival function $\bar F(x) = 1 - F(x)$, the average of the failure rate function on the interval $(0, t)$ is $\frac{-\log \bar F(t)}{t}$. With this observation, we introduce the following definitions.

Definition 5.2. A distribution function $F(x)$ is said to have
(a) increasing failure rate average (IFRA) if $\frac{-\log \bar F(t)}{t}$ is increasing in $t$, i.e., the average failure rate function on $(0, t)$ is increasing in $t$;
(b) decreasing failure rate average (DFRA) if $\frac{-\log \bar F(t)}{t}$ is decreasing in $t$.

It should be noted that if $F(x)$ has IFRA or DFRA, then $1 - F(x) = \bar F(x) > 0$ for all $x$ necessarily.
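Theorem 5.5. A distribution function $F(x)$ is IFRA (DFRA) if and only if its survival function $\bar F(x)$ satisfies
$$\bar F(\alpha t) \ge (\le)\ [\bar F(t)]^{\alpha}, \quad \text{for } 0 < \alpha < 1 \text{ and } t > 0.$$

Proof. Let $F(x)$ be a distribution function. We have the following equivalent statements:
$F(x)$ is IFRA (DFRA)
$\iff \frac{-\log \bar F(t)}{t}$ is increasing (decreasing) in $t$
$\iff [\bar F(t)]^{1/t}$ is decreasing (increasing) in $t$
$\iff [\bar F(\alpha t)]^{1/(\alpha t)} \ge (\le)\ [\bar F(t)]^{1/t}$ for $0 < \alpha < 1$ and $t > 0$
$\iff \bar F(\alpha t) \ge (\le)\ [\bar F(t)]^{\alpha}$ for $0 < \alpha < 1$ and $t > 0$.
The proof is completed.
□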
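The criterion of Theorem 5.5 lends itself to a direct numerical check. The sketch below assumes a Weibull survival function of the form $\exp\{-(\lambda t)^{\alpha}\}$; this parametrization and all function names are assumptions made for illustration. With shape greater than 1 the inequality $\bar F(at) \ge [\bar F(t)]^{a}$ should hold on the grid, and with shape less than 1 it should fail.

```python
import math

def weibull_survival(t, lam, shape):
    """Survival function exp(-(lam * t)^shape); this parametrization is an assumption."""
    return math.exp(-(lam * t) ** shape)

def ifra_inequality_holds(survival, fractions, times, tol=1e-12):
    """True if survival(a * t) >= survival(t) ** a for every supplied a and t."""
    return all(survival(a * t) >= survival(t) ** a - tol
               for a in fractions for t in times)

def wear_out(t):
    return weibull_survival(t, 1.0, 2.0)     # shape > 1

def early_failure(t):
    return weibull_survival(t, 1.0, 0.5)     # shape < 1

fractions = [0.1 * k for k in range(1, 10)]  # a = 0.1, ..., 0.9
times = [0.25 * k for k in range(1, 41)]     # t = 0.25, ..., 10
wear_out_is_ifra = ifra_inequality_holds(wear_out, fractions, times)
early_failure_is_ifra = ifra_inequality_holds(early_failure, fractions, times)
print(wear_out_is_ifra, early_failure_is_ifra)
```

A finite grid can refute the inequality but cannot prove it for all $a$ and $t$; here it merely agrees with what the hazard function $(\lambda t)^{\alpha}$ implies analytically.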
Remark. Equivalently, we can say that the distribution function $F(x)$ has IFRA if and only if the hazard function $R(t) = -\log \bar F(t)$ satisfies
$$R(\alpha x) \le \alpha R(x), \quad 0 \le \alpha \le 1, \ x \ge 0$$
(such a function is called star-shaped). This will be quite useful in the models that we will discuss later.

Theorem 5.6. A distribution function $F(x)$ is IFRA (DFRA) if and only if for each $\lambda > 0$, $\bar F(x) - e^{-\lambda x}$ has at most one change of sign, and if one change of sign actually occurs, it is from $+$ to $-$ (from $-$ to $+$).

Proof. Let $F(x)$ be IFRA. Then $\frac{-\log \bar F(x)}{x}$ is increasing in $x$, and hence
$$-\log \bar F(\alpha x) \le -\alpha\log \bar F(x), \quad \text{for } \alpha \in [0, 1] \text{ and } x \ge 0.$$
This simply means that the graph of $-\log \bar F(x)$ on any interval $[0, t]$ is below the line segment between $(0, 0)$ and $(t, -\log \bar F(t))$. However, $-\log(e^{-\lambda x}) = \lambda x$ is linear. Both functions $-\log \bar F(x)$ and $-\log(e^{-\lambda x})$ pass through the origin. Thus, the graph of $-\log \bar F(x)$ crosses the graph of $-\log(e^{-\lambda x})$ at most once, and if a crossing does occur, $-\log \bar F(x)$ would cross $-\log(e^{-\lambda x})$ from below. (The special case when $\bar F(x) = e^{-\lambda_0 x}$ yields no crossing at all.) This implies that the graph of $\bar F(x)$ crosses the graph of $e^{-\lambda x}$ at most once, and if it crosses, it does so from above. This is exactly the "only if" part.

Conversely, assume that, for each $\lambda > 0$, $\bar F(x) - e^{-\lambda x}$ has at most one change of sign, and if one change of sign actually occurs, it occurs from $+$ to $-$. Then the function $\frac{\bar F(x)}{e^{-\lambda x}}$ passes 1 at most once, and so the graph of $-\log \bar F(x)$ crosses that of $-\log(e^{-\lambda x})$ at most once, from below. It follows that $-\log \bar F(x)$ satisfies
$$-\log \bar F(\alpha x) \le -\alpha\log \bar F(x), \quad \alpha \in [0, 1], \ x \ge 0.$$
Hence, $\frac{-\log \bar F(x)}{x}$ is increasing in $x \ge 0$, and $F(x)$ is IFRA.
□
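Furthermore, we should point out that a convex function passing through the origin is star-shaped, so that every IFR distribution function is IFRA. However, the converse is false, i.e., not every IFRA distribution is IFR.

Example 5.2. Let $F(x)$ be the lifetime distribution of a parallel system of two independent components with respective life distributions
$$F_1(x) = 1 - e^{-\lambda_1 x} \quad \text{and} \quad F_2(x) = 1 - e^{-\lambda_2 x}.$$
This implies that
$$F(x) = F_1(x)F_2(x) = \left(1 - e^{-\lambda_1 x}\right)\left(1 - e^{-\lambda_2 x}\right),$$
since a parallel system stops operating only when both components have stopped operating. Thus,
$$\bar F(x) = 1 - F(x) = 1 - \left(1 - e^{-\lambda_1 x}\right)\left(1 - e^{-\lambda_2 x}\right).$$
So the failure rate of the system is
$$r(t) = \frac{\lambda_1 e^{-\lambda_1 t} + \lambda_2 e^{-\lambda_2 t} - (\lambda_1 + \lambda_2)e^{-(\lambda_1 + \lambda_2)t}}{e^{-\lambda_1 t} + e^{-\lambda_2 t} - e^{-(\lambda_1 + \lambda_2)t}}.$$
It can be shown that $r(t)$ is increasing on $[0, t_0)$ and decreasing on $(t_0, \infty)$, where $t_0$ depends on $\lambda_1$ and $\lambda_2$. Note that the lifetime distribution of each component is exponential, so that each has IFR. But $F(x)$ is not IFR. However, we observe that for $\alpha \in (0, 1)$, the function $f(u) = u^{\alpha}$ is concave on $(0, \infty)$, so that if $u_1 < u_2$, then
$$f(u_1 + \delta) - f(u_1) \ge f(u_2 + \delta) - f(u_2) \quad \text{for } \delta > 0.$$
If we let $\delta = \theta(u - v)$, $u_1 = \theta v$, and $u_2 = v$, where $\theta \in [0, 1]$ and $u > v$, then
$$u_1 + \delta = \theta u, \quad u_2 + \delta = \theta u + (1 - \theta)v,$$
and
$$\theta^{\alpha}u^{\alpha} - \theta^{\alpha}v^{\alpha} \ge [\theta u + (1 - \theta)v]^{\alpha} - v^{\alpha},$$
or
$$\theta^{\alpha}u^{\alpha} + (1 - \theta^{\alpha})v^{\alpha} \ge [\theta u + (1 - \theta)v]^{\alpha}.$$
Since
$$\bar F(x) = e^{-\lambda_1 x} + e^{-\lambda_2 x} - e^{-(\lambda_1 + \lambda_2)x},$$
with $u = 1$, $v = e^{-\lambda_2 x}$, and $\theta = e^{-\lambda_1 x}$, we have
$$\bar F(\alpha x) = e^{-\lambda_1\alpha x} + e^{-\lambda_2\alpha x} - e^{-(\lambda_1 + \lambda_2)\alpha x} = \left(e^{-\lambda_1 x}\right)^{\alpha} + \left(e^{-\lambda_2 x}\right)^{\alpha}\left[1 - \left(e^{-\lambda_1 x}\right)^{\alpha}\right]$$
$$\ge \left[e^{-\lambda_1 x} + \left(1 - e^{-\lambda_1 x}\right)e^{-\lambda_2 x}\right]^{\alpha} = \left[e^{-\lambda_1 x} + e^{-\lambda_2 x} - e^{-(\lambda_1 + \lambda_2)x}\right]^{\alpha} = [\bar F(x)]^{\alpha}.$$
Therefore, $F(x)$ has IFRA. The point is that the lifetime distribution of a parallel system with IFR components is IFRA, but not necessarily IFR. This provides a good model for an IFRA system.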
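The two claims of Example 5.2 can be explored numerically: the system failure rate is not monotone, while the average failure rate $-\log \bar F(t)/t$ is increasing. This is a minimal sketch; the function names, the rates $\lambda_1 = 1$, $\lambda_2 = 5$, and the grid are hypothetical choices for illustration.

```python
import math

def parallel_survival(x, lam1, lam2):
    """Survival function of a parallel system of two independent exponential components."""
    return math.exp(-lam1 * x) + math.exp(-lam2 * x) - math.exp(-(lam1 + lam2) * x)

def parallel_failure_rate(x, lam1, lam2):
    """Failure rate r(x) = f(x) / survival(x) of the parallel system."""
    density = (lam1 * math.exp(-lam1 * x) + lam2 * math.exp(-lam2 * x)
               - (lam1 + lam2) * math.exp(-(lam1 + lam2) * x))
    return density / parallel_survival(x, lam1, lam2)

lam1, lam2 = 1.0, 5.0                         # illustrative rates
grid = [0.1 * i for i in range(1, 41)]        # t = 0.1, ..., 4.0
rates = [parallel_failure_rate(t, lam1, lam2) for t in grid]
averages = [-math.log(parallel_survival(t, lam1, lam2)) / t for t in grid]
peak = rates.index(max(rates))
print(grid[peak], rates[peak], rates[-1])     # interior peak, then decline
print(all(a < b for a, b in zip(averages, averages[1:])))
```

On this grid the failure rate rises to an interior peak and then declines toward the smaller component rate, while the averaged failure rate keeps increasing.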
5.3 Several Lifetime Distribution Classes

In this section, we generalize the IFR (DFR) properties based on the distribution of residual life. Since the results for IFR and DFR are parallel, we will mainly concentrate on the IFR series and point out the related DFR results. We first give a series of lifetime distribution classes based on certain monotone properties of the residual life distribution.
Definition 5.3. A lifetime distribution $F(x)$ is called
(a) IFR if the conditional survival distribution function
$$\bar F(x|t) = P[X > x + t \mid X > t] = \frac{\bar F(x + t)}{\bar F(t)}$$
is decreasing in $t$ for any $x \ge 0$;
(b) IFRC (Increasing Failure Rate in Convex Order) if
$$\int_y^\infty P[X > t + x \mid X > t]\,dx = \int_y^\infty \frac{\bar F(x + t)}{\bar F(t)}\,dx$$
is decreasing in $t$ for any $y \ge 0$;
(c) DMRL (Decreasing Mean Residual Life) if
$$\mu(t) = \int_0^\infty P[X > t + x \mid X > t]\,dx = \int_0^\infty \frac{\bar F(x + t)}{\bar F(t)}\,dx$$
is decreasing in $t$;
(d) HDMRL (Harmonic Decreasing Mean Residual Life) if
$$\left[\frac{1}{t}\int_0^t \frac{1}{\mu(s)}\,ds\right]^{-1}$$
is decreasing in $t$.

As the conditions get weaker in the definition, it is obvious that IFR $\Rightarrow$ IFRC $\Rightarrow$ DMRL $\Rightarrow$ HDMRL. That means the classes get bigger. The dual DFR series can be defined by changing "decreasing" to "increasing".

Remark. A detailed analysis shows that IFRC is equivalent to DMRL when the density function $f(x)$ exists. In fact, we first note that $F(x)$ is IFRC if and only if for any $y \ge 0$,
$$\frac{d}{dt}\left[\frac{1}{\bar F(t)}\int_y^\infty \bar F(x + t)\,dx\right] \le 0,$$
which is equivalent to
$$-\frac{\int_y^\infty f(t + x)\,dx}{\bar F(t)} + \frac{f(t)\int_y^\infty \bar F(t + x)\,dx}{(\bar F(t))^2} \le 0,$$
or
$$r(t) = \frac{f(t)}{1 - F(t)} \le \frac{1 - F(t + y)}{\int_y^\infty (1 - F(t + x))\,dx} = \frac{1 - F(t + y)}{\int_{y+t}^\infty (1 - F(x))\,dx} = \frac{1}{\mu(t + y)}.$$
Similarly, by letting $y = 0$, $F(x)$ is DMRL if and only if
$$r(t) \le \frac{1}{\mu(t)}.$$
Now from the definition of DMRL, if $F(x)$ is DMRL, then $\mu(t) \ge \mu(t + y)$ for $t, y \ge 0$. This implies
$$r(t) \le \frac{1}{\mu(t)} \le \frac{1}{\mu(t + y)},$$
for $t, y \ge 0$. So $F(x)$ is also IFRC.

Next, we give another series of lifetime distribution classes by comparing the residual life distribution with the life distribution of a new item, rather than through a monotone property.

Definition 5.4. A lifetime distribution $F(x)$ is called
(a) NBU (New Better Than Used) if for $x, t \ge 0$,
$$P[X > x + t \mid X > t] = \frac{1 - F(t + x)}{1 - F(t)} \le P[X > x] = 1 - F(x);$$
(b) NBUC (New Better Than Used in Convex Order) if for $y, t \ge 0$,
$$\int_y^\infty P[X > t + x \mid X > t]\,dx = \int_y^\infty \frac{1 - F(t + x)}{1 - F(t)}\,dx \le \int_y^\infty P[X > x]\,dx = \int_y^\infty (1 - F(x))\,dx;$$
(c) NBUE (New Better Than Used in Expectation) if
$$\mu(t) = \int_0^\infty P[X > t + x \mid X > t]\,dx = \int_0^\infty \frac{1 - F(t + x)}{1 - F(t)}\,dx \le \mu = \int_0^\infty P[X > x]\,dx = \int_0^\infty (1 - F(x))\,dx;$$
(d) HNBUE (Harmonic New Better Than Used in Expectation) if
$$\left[\frac{1}{t}\int_0^t \frac{1}{\mu(s)}\,ds\right]^{-1} \le \mu.$$
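Obviously, NBU $\Rightarrow$ NBUC $\Rightarrow$ NBUE $\Rightarrow$ HNBUE.

Remark 1. IFRA $\Rightarrow$ NBU. In fact, if $F(x)$ is IFRA, then for $x, y \ge 0$,
$$-\log \bar F(x) = -\log \bar F\left(\frac{x}{x + y}(x + y)\right) \le -\frac{x}{x + y}\log \bar F(x + y),$$
and
$$-\log \bar F(y) = -\log \bar F\left(\frac{y}{x + y}(x + y)\right) \le -\frac{y}{x + y}\log \bar F(x + y).$$
Hence,
$$-\log \bar F(x) - \log \bar F(y) \le -\log \bar F(x + y),$$
and the NBU property follows.

Remark 2. HDMRL $\Rightarrow$ NBUE. In fact, $F(x)$ is HDMRL if and only if
$$\frac{d}{dt}\left[\frac{1}{t}\int_0^t \frac{1}{\mu(s)}\,ds\right] \ge 0,$$
which is equivalent to
$$\frac{1}{\mu(t)} \ge \frac{1}{t}\int_0^t \frac{1}{\mu(s)}\,ds.$$
However, by letting $t \to 0$, we have
$$\frac{1}{t}\int_0^t \frac{1}{\mu(s)}\,ds \ge \lim_{t \to 0}\frac{1}{t}\int_0^t \frac{1}{\mu(s)}\,ds = \frac{1}{\mu}.$$
Thus, HDMRL implies
$$\frac{1}{\mu(t)} \ge \frac{1}{t}\int_0^t \frac{1}{\mu(s)}\,ds \ge \frac{1}{\mu}.$$
Thus, $F(x)$ is also NBUE, and is also HNBUE.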
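The mean residual life $\mu(t)$ used in these definitions can be evaluated numerically for a concrete distribution. The sketch below uses the Erlang distribution $\mathrm{Gam}(\lambda, 2)$, whose survival function is $e^{-\lambda x}(1 + \lambda x)$; the function names and the closed form $\mu(t) = (2 + \lambda t)/(\lambda(1 + \lambda t))$, obtained by direct integration for this example, are not from the text and are labeled here as assumptions of the illustration.

```python
import math

def erlang2_survival(x, lam):
    """Survival function of Gam(lam, 2): exp(-lam x) (1 + lam x)."""
    return math.exp(-lam * x) * (1.0 + lam * x)

def mean_residual_life(survival, t, horizon=40.0, steps=40_000):
    """mu(t) = integral over x >= 0 of survival(t + x) / survival(t), by the midpoint rule."""
    h = horizon / steps
    area = sum(survival(t + (i + 0.5) * h) for i in range(steps)) * h
    return area / survival(t)

lam = 1.0                                    # illustrative rate

def survival(x):
    return erlang2_survival(x, lam)

times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
mrl = [mean_residual_life(survival, t) for t in times]
# Closed form derived for this example only (an assumption of the sketch).
closed_form = [(2.0 + lam * t) / (lam * (1.0 + lam * t)) for t in times]
is_dmrl_on_grid = all(a > b for a, b in zip(mrl, mrl[1:]))
is_nbue_on_grid = all(m <= mrl[0] + 1e-9 for m in mrl)
print(mrl)
print(is_dmrl_on_grid, is_nbue_on_grid)
```

Here `mrl[0]` approximates the mean $\mu = 2/\lambda$, so the NBUE check compares each $\mu(t)$ with the mean life of a new item.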
Remark 3. Further, we notice that
$$\int_0^t \frac{1}{\mu(s)}\,ds = \int_0^t \frac{1 - F(s)}{\int_s^\infty (1 - F(x))\,dx}\,ds = -\log\left[\int_s^\infty (1 - F(x))\,dx\right]\Bigg|_0^t = -\log\left[\frac{1}{\mu}\int_t^\infty (1 - F(x))\,dx\right].$$
Thus, HNBUE is equivalent to
$$\frac{1}{\mu}\int_t^\infty (1 - F(x))\,dx \le e^{-t/\mu}.$$

By combining the above two definitions, we have the following relationships between the IFR series of eight lifetime distribution classes:
$$\begin{array}{ccccccccc} & & \mathrm{IFR} & \to & \mathrm{DMRL} & \to & \mathrm{HDMRL} & & \\ & \swarrow & \downarrow & & \downarrow & & \downarrow & \searrow & \\ \mathrm{IFRA} & \to & \mathrm{NBU} & \to & \mathrm{NBUC} & \to & \mathrm{NBUE} & \to & \mathrm{HNBUE} \end{array}$$

In the next section, we shall first study the preservation properties of these lifetime distribution classes under two simple reliability operations: one is the formation of independent sums (cold-standby systems), and the other is the mixture operation.
5.4 Preservation of Lifetime Distributions Under Reliability Operations

5.4.1 Independent Sums

When a failed component is replaced by a spare, the total life accumulated is obtained by the addition of the two life lengths. This is essential in the study of maintenance policies. Here, we assume that the life lengths of the first component and of the spare are independent. If the component's life distribution is IFR, it is desirable that the accumulated life length is also IFR. We would therefore like to apply addition to the life distributions introduced here and see whether their properties are preserved.

It is known that if $X_1$ and $X_2$ are independent and have respective distribution functions $F_1(x)$ and $F_2(x)$, the distribution function $F(x)$ of $X = X_1 + X_2$ is given by
$$F(x) = \int_0^x F_2(x - z)\,dF_1(z) = \int_0^x F_1(x - z)\,dF_2(z),$$
which is usually called the convolution of $F_1(x)$ and $F_2(x)$, denoted by $F(x) = F_1 * F_2(x)$.
Theorem 5.7. If $F_1(x)$ and $F_2(x)$ are IFR, then $F(x) = F_1 * F_2(x)$ is IFR.

Proof. We will prove the case where $F_1(x)$ and $F_2(x)$ have density functions $f_1(x)$ and $f_2(x)$, respectively. For notational convenience, we denote $G_i(x) = \bar F_i(x)$ and $G(x) = \bar F(x)$. By Theorem 5.1, we need to show that $G(x)$ is $\mathrm{PF}_2$, i.e., for any $t_1 < t_2$ and $u_1 < u_2$,
$$D = \begin{vmatrix} G(t_1 - u_1) & G(t_1 - u_2) \\ G(t_2 - u_1) & G(t_2 - u_2) \end{vmatrix} \ge 0.$$
Since $F(x) = \int_0^x F_1(x - t)f_2(t)\,dt$, we have
$$G(x) = 1 - F(x) = \int_0^\infty G_1(x - t)f_2(t)\,dt,$$
where $G_1(x) = 1$ for $x \le 0$. Hence,
$$D = \begin{vmatrix} G(t_1 - u_1) & G(t_1 - u_2) \\ G(t_2 - u_1) & G(t_2 - u_2) \end{vmatrix} = \begin{vmatrix} \int_0^\infty G_1(t_1 - u_1 - t)f_2(t)\,dt & \int_0^\infty G_1(t_1 - u_2 - t)f_2(t)\,dt \\ \int_0^\infty G_1(t_2 - u_1 - t)f_2(t)\,dt & \int_0^\infty G_1(t_2 - u_2 - t)f_2(t)\,dt \end{vmatrix}$$
$$= \begin{vmatrix} \int_{u_1}^\infty G_1(t_1 - s_1)f_2(s_1 - u_1)\,ds_1 & \int_{u_2}^\infty G_1(t_1 - s_2)f_2(s_2 - u_2)\,ds_2 \\ \int_{u_1}^\infty G_1(t_2 - s_1)f_2(s_1 - u_1)\,ds_1 & \int_{u_2}^\infty G_1(t_2 - s_2)f_2(s_2 - u_2)\,ds_2 \end{vmatrix}$$
$$= \iint_{\{s_1 < s_2\}} \begin{vmatrix} G_1(t_1 - s_1) & G_1(t_1 - s_2) \\ G_1(t_2 - s_1) & G_1(t_2 - s_2) \end{vmatrix} \begin{vmatrix} f_2(s_1 - u_1) & f_2(s_1 - u_2) \\ f_2(s_2 - u_1) & f_2(s_2 - u_2) \end{vmatrix} ds_2\,ds_1$$
$$= \iint_{\{s_1 < s_2\}} \begin{vmatrix} G_1(t_1 - s_1) & f_1(t_1 - s_2) \\ G_1(t_2 - s_1) & f_1(t_2 - s_2) \end{vmatrix} \begin{vmatrix} f_2(s_1 - u_1) & f_2(s_1 - u_2) \\ G_2(s_2 - u_1) & G_2(s_2 - u_2) \end{vmatrix} ds_2\,ds_1.$$
The last step was obtained by integration by parts with respect to $s_2$. Since $F_1(x)$ is IFR, $\frac{f_1(t)}{G_1(t)}$ is increasing, so that
$$\frac{f_1(t_1 - s_2)}{G_1(t_1 - s_2)} \le \frac{f_1(t_2 - s_2)}{G_1(t_2 - s_2)}.$$
By Theorem 5.1, $G_1(x)$ is $\mathrm{PF}_2$, and
$$\frac{G_1(t_1 - s_2)}{G_1(t_1 - s_1)} \le \frac{G_1(t_2 - s_2)}{G_1(t_2 - s_1)}.$$
It follows that
$$\frac{f_1(t_1 - s_2)}{G_1(t_1 - s_1)} \le \frac{f_1(t_2 - s_2)}{G_1(t_2 - s_1)},$$
i.e.,
$$\begin{vmatrix} G_1(t_1 - s_1) & f_1(t_1 - s_2) \\ G_1(t_2 - s_1) & f_1(t_2 - s_2) \end{vmatrix} \ge 0.$$
Similarly, we can show that
$$\begin{vmatrix} f_2(s_1 - u_1) & f_2(s_1 - u_2) \\ G_2(s_2 - u_1) & G_2(s_2 - u_2) \end{vmatrix} \ge 0.$$
This implies that $D \ge 0$ and $F(x)$ is IFR.
□
Remark 1. A much more involved argument leads to the conclusion that if $F_1(x)$ and $F_2(x)$ are IFRA, so is $F(x)$ (Block and Savits 1976).

Remark 2. If we let $F_1(x)$ and $F_2(x)$ be exponential with parameter $\lambda$, then $F_1(x)$ and $F_2(x)$ are both DFR. However, $F(x) = F_1 * F_2(x)$ is Gam$(\lambda, 2)$, which is IFR. It follows that the DFR series of lifetime distribution classes is not preserved under addition.
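Remark 2 can be checked numerically. The following is a minimal sketch, assuming the standard closed forms for the sum of two independent exponential lifetimes with rate $\lambda$: density $\lambda^2 t e^{-\lambda t}$ and survival function $(1+\lambda t)e^{-\lambda t}$. The function name and the value of the rate are illustrative choices, not part of the text.

```python
import math

def gamma2_hazard(t, lam):
    """Failure rate of the sum of two independent Exp(lam) lifetimes."""
    density = lam**2 * t * math.exp(-lam * t)        # Gam(lam, 2) density
    survival = (1.0 + lam * t) * math.exp(-lam * t)  # Gam(lam, 2) survival function
    return density / survival

lam = 1.5  # illustrative rate (assumption, not from the text)
grid = [0.1 * k for k in range(1, 101)]
rates = [gamma2_hazard(t, lam) for t in grid]

# Each exponential component has constant failure rate lam;
# the failure rate of the sum increases and stays below lam on this grid.
is_increasing = all(a < b for a, b in zip(rates, rates[1:]))
print(is_increasing, max(rates) < lam)
```

The ratio simplifies to $\lambda^2 t/(1+\lambda t)$, which is increasing in $t$, in agreement with the remark.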
5.4.2 Mixture of Lifetime Distributions

Mixtures of distributions arise naturally in a number of reliability models. For example, suppose a manufacturer produces 60% of a certain product in Assembly Line 1 and 40% in Assembly Line 2. Because of differences in production conditions, such as machines, personnel, and so on, the life length of a unit produced in Line 1 has a distribution $F_1(x)$, whereas the life length of a unit produced in Line 2 has a distribution $F_2(x)$ that might be different from $F_1(x)$. After production, units from both assembly lines flow into a common shipping room, so that the outgoing lots consist of a random mixture of the output from the two assembly lines. If a unit is selected at random from a lot, it has the lifetime distribution
\[
F(x) = 0.6 F_1(x) + 0.4 F_2(x),
\]
which is a mixture of the two underlying distributions $F_1(x)$ and $F_2(x)$. More generally, there may be infinitely many underlying distributions involved in a mixture. As an example, consider an important quality characteristic of the product that depends on the amount $\alpha$ of impurity present in the raw material, so that the probability distribution of the quality characteristic is $F_\alpha(x)$. Suppose $\alpha$ itself is random with distribution function $K(\alpha)$. The resulting distribution function $F(x)$ of the quality characteristic is
\[
F(x) = \int_{-\infty}^{\infty} F_\alpha(x)\,dK(\alpha).
\]
This leads to the following definition.

Definition 5.5. Let $\{F_\alpha(x)\}$ be a family of probability distributions, where the index $\alpha$ is governed by a distribution function $K(\alpha)$. The mixture distribution $F(x)$ of $\{F_\alpha(x)\}$ according to $K(\alpha)$ is given by
\[
F(x) = \int_{-\infty}^{\infty} F_\alpha(x)\,dK(\alpha).
\]
Definition 5.6. For the mixture distribution
\[
F(x) = \int_{-\infty}^{\infty} F_\alpha(x)\,dK(\alpha),
\]
the hazard transform of the mixture is
\[
\eta(\vec{u}) = -\log\left[\int e^{-u_\alpha}\,dK(\alpha)\right],
\]
where $\vec{u} = (u_\alpha)$, $0 \le u_\alpha < \infty$, $-\infty < \alpha < \infty$.

For instance, if the family of distributions is $\{F_1, F_2, \ldots, F_n\}$, then $\alpha = 1, 2, \ldots, n$, and $\vec{u} = (u_1, u_2, \ldots, u_n)$. The hazard transform is
\[
\eta(u_1, u_2, \ldots, u_n) = -\log\left[\sum_{i=1}^{n} e^{-u_i} p_i\right],
\]
where $p_i = P[\alpha = i]$. If $K(\alpha)$ is continuous, then $\vec{u}$ is in fact a mapping from the index set $\{\alpha\}$ to $[0, \infty)$. It follows from Definition 5.6 that the hazard function of the mixture distribution $F(t)$ is given by
\[
R(t) = -\log[1 - F(t)] = -\log\left[\int [1 - F_\alpha(t)]\,dK(\alpha)\right] = \eta\big(\vec{R}(t)\big),
\]
where
\[
\vec{R}(t) = (R_\alpha(t)) = \big(-\log[1 - F_\alpha(t)]\big).
\]
The hazard transform has the following property.
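Theorem 5.8. The hazard transform of a mixture is concave, i.e., if $\eta$ is a hazard transform of a mixture, then for any $\theta \in [0, 1]$,
\[
\eta\big(\theta\vec{u} + (1-\theta)\vec{v}\big) \ge \theta\,\eta(\vec{u}) + (1-\theta)\,\eta(\vec{v}),
\]
where $\vec{u} = (u_\alpha)$, $\vec{v} = (v_\alpha)$, with $u_\alpha, v_\alpha \in [0, \infty)$ for each index $\alpha$.

Proof. By Hölder's inequality
\[
\int |f(x) g(x)|\,d\mu(x) \le \left[\int |f(x)|^p\,d\mu(x)\right]^{1/p} \left[\int |g(x)|^q\,d\mu(x)\right]^{1/q},
\]
where
\[
\frac{1}{p} + \frac{1}{q} = 1,
\]
we have
\[
\int e^{-\theta u_\alpha} e^{-(1-\theta) v_\alpha}\,dK(\alpha) \le \left[\int e^{-u_\alpha}\,dK(\alpha)\right]^{\theta} \left[\int e^{-v_\alpha}\,dK(\alpha)\right]^{1-\theta},
\]
where $\theta \in (0, 1)$. Taking logarithms on both sides and changing sign gives the concavity of $\eta$. □

Theorem 5.9. Let $F(x)$ be a mixture of $\{F_\alpha(x)\}$ governed by $K(\alpha)$.
1. If each $F_\alpha(x)$ is DFR, then $F(x)$ is DFR.
2. If each $F_\alpha(x)$ is DFRA, then $F(x)$ is DFRA.

To prove this theorem, we need the following result.

Lemma 5.2. Let $h(\vec{u})$ be concave (convex) and increasing in each argument. If each $u_\alpha(t)$ is concave (convex), then the function $g_{\vec{u}}(t) = h(\vec{u}(t))$ is concave (convex) in $t$.

Proof. Since $u_\alpha(t)$ is concave,
\[
u_\alpha[\theta s + (1-\theta) t] \ge \theta u_\alpha(s) + (1-\theta) u_\alpha(t).
\]
It follows from $h(\vec{u})$ being concave and increasing in each argument that
\[
h\big(\vec{u}(\theta s + (1-\theta) t)\big) \ge \theta\,h(\vec{u}(s)) + (1-\theta)\,h(\vec{u}(t)).
\]
Hence, for any $\theta \in [0, 1]$,
\[
g_{\vec{u}}[\theta s + (1-\theta) t] \ge \theta\,g_{\vec{u}}(s) + (1-\theta)\,g_{\vec{u}}(t).
\]
The other case with convexity follows similarly. □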
Proof of Theorem 5.9. (1) To show that $F(x)$ is DFR, we only need to show that $R(t) = -\log[1 - F(t)]$ is concave. (This is the counterpart of Theorem 5.1 for DFR, by noticing the equivalent condition in Lemma 5.1.) But this follows from Theorem 5.8 and Lemma 5.2.

(2) Since each $F_\alpha(x)$ is DFRA and $\eta$ is increasing,
\[
R(\theta t) = \eta\big(\vec{R}(\theta t)\big) \ge \eta\big(\theta\vec{R}(t)\big),
\]
for $\theta \in [0, 1]$. By Theorem 5.8 (with $\vec{v} = 0$), we have
\[
\eta\big(\theta\vec{R}(t)\big) \ge \theta\,\eta\big(\vec{R}(t)\big),
\]
so $R(\theta t) \ge \theta R(t)$ for $\theta \in [0, 1]$, and hence by Theorem 5.5, $F(x)$ is DFRA. □
Remark 1. We should mention that a mixture of IFR distributions may not be IFR. In fact, we have already seen in Chap. 4 that a mixture of two nonidentical exponential distributions has decreasing failure rate and thus is DFR. Therefore, the IFR series of lifetime distribution classes is not preserved under mixture operations.

Remark 2. The situations for DMRL, NWU, NWUC, NWUE, HIMRL, and HNWUE are more involved and we will not discuss them here.
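The claim in Remark 1 about exponential mixtures can be illustrated with a short numerical sketch. The mixing weights follow the two-assembly-line example; the two exponential rates and the function name are hypothetical choices made only for this illustration.

```python
import math

def mixture_hazard(t, weights, rates):
    """Failure rate of a finite mixture of exponential lifetimes."""
    survival = sum(w * math.exp(-r * t) for w, r in zip(weights, rates))
    density = sum(w * r * math.exp(-r * t) for w, r in zip(weights, rates))
    return density / survival

weights = [0.6, 0.4]  # mixing proportions of the two assembly lines
rates = [1.0, 3.0]    # hypothetical exponential rates (assumption)
grid = [0.05 * k for k in range(0, 101)]
hazards = [mixture_hazard(t, weights, rates) for t in grid]

# Each component has a constant failure rate, yet the mixture's rate decreases.
is_decreasing = all(a > b for a, b in zip(hazards, hazards[1:]))
print(hazards[0], hazards[-1], is_decreasing)
```

At $t = 0$ the failure rate equals the weighted average of the two rates; as $t$ grows, the surviving units come predominantly from the line with the smaller rate, so the failure rate of the mixture drifts down toward that smaller rate.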
5.5 Shock Models and Lifetime Distribution Classes

5.5.1 IFRA Property of Shock Model

There are several shock models that lead to the lifetime distributions we discuss in this chapter. A typical situation in a shock model is the following: a device is subject to shocks, the shocks occur according to a Poisson process in time, and each shock independently causes random damage to the device. The lifetime, or the time of failure of the device, is the key quantity in describing such models.

Consider a device that is subject to shocks occurring randomly in time according to a Poisson process with rate (intensity) $\lambda$. The $i$th shock causes a random amount $X_i$ of damage, where $X_1, X_2, \ldots$ are independently distributed with common distribution $F(x)$. The device fails when the total accumulated damage exceeds a specified capacity or threshold value $x$. Let $\bar{H}_F(t)$ be the probability that the device survives $[0, t]$. Then
\[
\bar{H}_F(t) = \sum_{k=0}^{\infty} P[\text{The device survives exactly } k \text{ shocks during } [0, t]]
\]
\[
= \sum_{k=0}^{\infty} P[\text{Exactly } k \text{ shocks in } [0, t]]\; P[X_1 + \cdots + X_k \le x]
= \sum_{k=0}^{\infty} \frac{e^{-\lambda t}(\lambda t)^k}{k!} F^{(k)}(x),
\]
where $F^{(k)}(x) = F * F * \cdots * F(x)$ is the $k$-fold convolution of $F(x)$, representing the distribution function of $X_1 + X_2 + \cdots + X_k$. We are going to establish in the following theorem that for any damage distribution $F(x)$, the survival probability $\bar{H}_F(t)$ is IFRA.

Theorem 5.10. Let
\[
\bar{H}_F(t) = \sum_{k=0}^{\infty} \frac{e^{-\lambda t}(\lambda t)^k}{k!} F^{(k)}(x)
\]
represent the survival probability in the cumulative damage model, where nonnegative damage follows an arbitrary distribution function $F(x)$. Then the life distribution of the device (given by $H_F(t) = 1 - \bar{H}_F(t)$) is IFRA.

Proof. We prove the theorem in two steps.

Step 1. If $F(x)$ is a distribution function of a nonnegative random variable, then
for each $x$, the sequence $[F^{(k)}(x)]^{1/k}$ is decreasing in $k = 1, 2, \ldots$. In fact, if $k = 2$, then
\[
F^{(2)}(x) = \int_0^x F(x-z)\,dF(z) \le \int_0^x F(x)\,dF(z) = [F(x)]^2,
\]
and therefore, $[F^{(2)}(x)]^{1/2} \le F(x)$. Now, suppose that
\[
\big[F^{(k)}(x)\big]^{1/k} \le \big[F^{(k-1)}(x)\big]^{1/(k-1)},
\]
then $F^{(k-1)}(x) \ge [F^{(k)}(x)]^{(k-1)/k}$, and hence,
\[
\big[F^{(k)}(x)\big]^{k+1} = F^{(k)}(x)\left[\int_0^x F^{(k-1)}(x-z)\,dF(z)\right]^k
\]
\[
\ge F^{(k)}(x)\left[\int_0^x \big[F^{(k)}(x-z)\big]^{(k-1)/k}\,dF(z)\right]^k
\]
\[
= \left[\int_0^x \big[F^{(k)}(x)\big]^{1/k}\big[F^{(k)}(x-z)\big]^{(k-1)/k}\,dF(z)\right]^k
\]
\[
\ge \left[\int_0^x F^{(k)}(x-z)\,dF(z)\right]^k = \big[F^{(k+1)}(x)\big]^k.
\]
This is exactly
\[
\big[F^{(k)}(x)\big]^{1/k} \ge \big[F^{(k+1)}(x)\big]^{1/(k+1)}.
\]
Step 2. $[\bar{H}_F(t)]^{1/t}$ is decreasing in $t > 0$, i.e., $H_F(t)$ is IFRA. To show this, we need two results. The proofs of these results are beyond the scope of this book, so we state them without proof. (For reference, see Chap. 5 in Karlin, Total Positivity, 1968.)

1. If $K(x, y)$ is a totally positive function of order $n$, in the sense that for all $x_1 < x_2 < \cdots < x_r$, $y_1 < y_2 < \cdots < y_r$, $1 \le r \le n$,
\[
\begin{vmatrix}
K(x_1, y_1) & K(x_1, y_2) & \cdots & K(x_1, y_r) \\
K(x_2, y_1) & K(x_2, y_2) & \cdots & K(x_2, y_r) \\
\vdots & \vdots & \ddots & \vdots \\
K(x_r, y_1) & K(x_r, y_2) & \cdots & K(x_r, y_r)
\end{vmatrix} \ge 0,
\]
and $g(x) = \int K(x, y) f(y)\,dy$, then the number of sign changes of $g(x)$ is less than or equal to the number of sign changes of $f(y)$, provided that the number of sign changes of $f(y)$ is less than or equal to $n - 1$. We note, in particular, that the conclusion is valid if the variable $y$ is discrete and $g(x) = \sum_y K(x, y) f(y)$.

2. The Poisson frequency function $K(t, k) = \dfrac{e^{-\lambda t}(\lambda t)^k}{k!}$ is a totally positive function of order 2.

Now let $\theta \in (0, 1)$ be given. Consider the function $f(k) = F^{(k)}(x) - \theta^k$. For the fixed value of $x$, since $[F^{(k)}(x)]^{1/k}$ is decreasing in $k$, the function $[F^{(k)}(x)]^{1/k} - \theta$ has at most one sign change from $+$ to $-$, and so the number of sign changes of $f(k) = F^{(k)}(x) - \theta^k$ is less than or equal to 1. Moreover,
\[
\bar{H}_F(t) - e^{-(1-\theta)\lambda t} = \sum_{k=0}^{\infty} \big[F^{(k)}(x) - \theta^k\big] \frac{e^{-\lambda t}(\lambda t)^k}{k!} = \sum_{k=0}^{\infty} K(t, k) f(k).
\]
It follows from (1) and (2) that $\bar{H}_F(t) - e^{-(1-\theta)\lambda t}$ has at most one sign change from $+$ to $-$. Since for a given $\lambda > 0$, $(1-\theta)\lambda$ runs through the interval $(0, \lambda)$ as $\theta$ ranges in $(0, 1)$, $\bar{H}_F(t) - e^{-\xi t}$ has at most one sign change from $+$ to $-$, where $\xi \in (0, \lambda)$. Notice that
\[
\bar{H}_F(t) = \sum_{k=0}^{\infty} \frac{e^{-\lambda t}(\lambda t)^k}{k!} F^{(k)}(x) = e^{-\lambda t} + \sum_{k=1}^{\infty} \frac{e^{-\lambda t}(\lambda t)^k}{k!} F^{(k)}(x),
\]
and
\[
\bar{H}_F(t) \ge e^{-\lambda t}.
\]
If $\xi \ge \lambda$, then
\[
\bar{H}_F(t) \ge e^{-\xi t}
\]
for all $t \ge 0$. Therefore, for all $\xi > 0$, $\bar{H}_F(t) - e^{-\xi t}$ has at most one sign change in $t$ from $+$ to $-$. By Theorem 5.6, $H_F(t)$ is IFRA. □

We should note that IFRA is quite a large family of lifetime distributions. Theorem 5.10 specifies a nonparametric feature of the distribution $H_F(t)$, but its actual distribution and further properties are determined by the damage distribution function $F(x)$.
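As a numerical illustration of Theorem 5.10, the sketch below evaluates the survival series for one particular damage distribution and checks both steps of the proof on a grid. The choice of Exp(1) damage, the parameter values, the truncation lengths, and the function names are assumptions made for this example only. For Exp(1) damage, the $k$-fold convolution is the Erlang distribution function, computed here through the identity $P[X_1 + \cdots + X_k \le x] = P[N \ge k]$ with $N$ Poisson with mean $x$.

```python
import math

def erlang_cdf(k, x, extra=60):
    """k-fold convolution of the Exp(1) distribution function at x,
    computed as P[Poisson(x) >= k] with a truncated tail sum."""
    term = math.exp(-x)
    for j in range(1, k + 1):
        term *= x / j                  # term = e^{-x} x^k / k!
    total = 0.0
    for j in range(k, k + extra):
        total += term
        term *= x / (j + 1)
    return total

def survival(t, lam, x, terms=80):
    """Truncated series for the survival probability of the cumulative damage model."""
    total = 0.0
    poisson = math.exp(-lam * t)       # P[exactly 0 shocks in [0, t]]
    for k in range(terms):
        total += poisson * erlang_cdf(k, x)
        poisson *= lam * t / (k + 1)   # P[exactly k + 1 shocks in [0, t]]
    return total

lam, x = 2.0, 3.0                      # illustrative shock rate and threshold
# Step 1: [F^(k)(x)]^(1/k) is decreasing in k.
roots = [erlang_cdf(k, x) ** (1.0 / k) for k in range(1, 16)]
step1_holds = all(a >= b for a, b in zip(roots, roots[1:]))
# Step 2: the failure rate average -log(survival(t)) / t is increasing in t.
grid = [0.25 * k for k in range(1, 41)]
avg_rates = [-math.log(survival(t, lam, x)) / t for t in grid]
is_ifra = all(a <= b + 1e-12 for a, b in zip(avg_rates, avg_rates[1:]))
print(step1_holds, is_ifra)
```

A grid check of this kind is only a sanity test for one damage distribution; the theorem itself covers arbitrary nonnegative damage.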
5.5.2 Extension of Cumulative Damage Model

A more realistic extension of the cumulative damage model can be formulated as follows. As usual, shocks to the device occur in time according to a Poisson process with rate $\lambda$. Now, successive shocks become increasingly effective in causing damage or wear of the device, even though they are independent. This means that the amount of damage $X_{i+1}$ caused by the $(i+1)$st shock is greater than the damage $X_i$ caused by the $i$th shock, i.e., $X_i \le X_{i+1}$ stochastically, and hence, the damage distribution $F_i(x)$ caused by the $i$th shock satisfies
\[
F_i(z) = P[X_i \le z] \ge P[X_{i+1} \le z] = F_{i+1}(z).
\]
For a given value of $x$, we have a decreasing sequence $\{F_i(x)\}_{i=1}^{\infty}$. The probability $p_k$ of surviving $k$ shocks is then given by
\[
p_k = P[X_1 + X_2 + \cdots + X_k \le x] = F_1 * F_2 * \cdots * F_k(x).
\]
Let
\[
\bar{H}(t) = \sum_{k=0}^{\infty} \frac{e^{-\lambda t}(\lambda t)^k}{k!}\,p_k, \qquad p_0 = 1,
\]
be the survival function of the lifetime of the device. We have the following result.

Theorem 5.11. $\bar{H}(t)$ is IFRA.

Proof. From the proof of Theorem 5.10, we only need to show that $(p_k)^{1/k}$ is a decreasing sequence. In fact, since $F_i(x) \ge F_{i+1}(x)$,
\[
p_2 = \int_0^x F_2(x-z)\,dF_1(z) \le \int_0^x F_1(x-z)\,dF_1(z) \le F_1(x)\int_0^x dF_1(z) = p_1^2,
\]
so that $p_2^{1/2} \le p_1$. Inductively, we have
\[
[p_{k+1}]^k = [F_1 * F_2 * \cdots * F_k * F_{k+1}(x)]^k
\]
\[
= \left[\int_0^x F_{k+1}(x-z)\,d[F_1 * F_2 * \cdots * F_k(z)]\right]^k \quad \text{(by definition)}
\]
\[
\le \left[\int_0^x F_k(x-z)\,d[F_1 * F_2 * \cdots * F_k(z)]\right]^k \quad \text{(since } F_{k+1} \le F_k\text{)}
\]
\[
= \left[\int_0^x F_1 * F_2 * \cdots * F_k(x-z)\,dF_k(z)\right]^k \quad \text{(by symmetry of convolution)}
\]
\[
= \left[\int_0^x [F_1 * F_2 * \cdots * F_k(x-z)]^{1/k}\,[F_1 * F_2 * \cdots * F_k(x-z)]^{(k-1)/k}\,dF_k(z)\right]^k
\]
\[
\le p_k \left[\int_0^x F_1 * F_2 * \cdots * F_{k-1}(x-z)\,dF_k(z)\right]^k \quad \text{(by induction hypothesis)}
\]
\[
= p_k\,p_k^k = p_k^{k+1},
\]
and hence,
\[
p_k^{1/k} \ge p_{k+1}^{1/(k+1)}.
\]
The rest follows exactly as Step 2 in the proof of Theorem 5.10. □
5.5.3 General Cumulative Damage Model

We assume more realistically that an accumulation of damage may result in a loss of resistance to further damage, and for any given accumulation of damage, later shocks are apt to be more severe. In this case, the magnitudes (or amounts) of successive damages are not necessarily independent. With this in mind, we consider the following model.

Shocks to a device occur in time according to a Poisson process with rate $\lambda$. The damage $X_k$ caused by the $k$th shock satisfies the following conditions:
1. $P[X_k > u \mid X_1, \ldots, X_{k-1}] = P[X_k > u \mid Z_{k-1} = X_1 + \cdots + X_{k-1}]$, i.e., the conditional probability depends on $X_1, \ldots, X_{k-1}$ only via the total damage $Z_{k-1} = X_1 + \cdots + X_{k-1}$.
2. $P[X_k > u \mid Z_{k-1} = z]$ is increasing in $z \ge 0$, i.e., the more damage accumulated, the weaker the device.
3. $P[X_k > u \mid Z_{k-1} = z] \le P[X_{k+1} > u \mid Z_k = z]$, $z \ge 0$, $k = 1, 2, 3, \ldots$, i.e., given the amount of damage ($Z_k = z$ or $Z_{k-1} = z$), later shocks are apt to be more severe.

Let $p_0(x) = 1$, and $p_k(x) = P[X_1 + \cdots + X_k \le x]$, $k = 1, 2, 3, \ldots$; then the survival probability is
\[
\bar{H}(t) = \sum_{k=0}^{\infty} p_k(x)\,\frac{e^{-\lambda t}(\lambda t)^k}{k!}.
\]
We point out that $\bar{H}(t)$ is again IFRA.

Theorem 5.12. If $\bar{H}(t)$ is the survival probability of the life distribution in the general cumulative damage model as above, with random damages $X_1, X_2, \ldots$, such that the conditions (1)-(3) are satisfied, then $\bar{H}(t)$ is IFRA.

Proof. We only need to show that $\{[p_k(x)]^{1/k}\}_{k=1}^{\infty}$ is a decreasing sequence. First, observe that
\[
p_k(x) = P[X_1 + X_2 + \cdots + X_k \le x] = \int_0^x P[X_k \le x - z \mid Z_{k-1} = z]\,dp_{k-1}(z),
\]
where $Z_k = X_1 + X_2 + \cdots + X_k$, and $p_k(x) = F_{Z_k}(x)$. With $z = 0$ and $k = 1$, we have
\[
p_2(x) = \int_0^x P[X_2 \le x - z \mid X_1 = z]\,dp_1(z)
\]
\[
\le \int_0^x P[X_2 \le x - z \mid X_1 = 0]\,dp_1(z) \quad \text{(by condition (2))}
\]
\[
\le \int_0^x P[X_1 \le x - z]\,dp_1(z) \quad \text{(by condition (3))}
\]
\[
\le \int_0^x P[X_1 \le x]\,dp_1(z) = [p_1(x)]^2.
\]
To complete the induction, suppose that $[p_k(z)]^{1/k} \le [p_{k-1}(z)]^{1/(k-1)}$ for $z \ge 0$; then for $z \le x$,
\[
p_k(z) = [p_k(z)]^{1/k}\,[p_k(z)]^{(k-1)/k} \le [p_k(z)]^{1/k}\,p_{k-1}(z) \le [p_k(x)]^{1/k}\,p_{k-1}(z).
\]
Hence,
\[
[p_{k+1}(x)]^k = \left[\int_0^x P[X_{k+1} \le x - z \mid Z_k = z]\,dp_k(z)\right]^k
\]
\[
\le \left[\int_0^x P[X_k \le x - z \mid Z_{k-1} = z]\,[p_k(x)]^{1/k}\,dp_{k-1}(z)\right]^k
\]
\[
= p_k(x)\left[\int_0^x P[X_k \le x - z \mid Z_{k-1} = z]\,dp_{k-1}(z)\right]^k
= p_k(x)\,[p_k(x)]^k = [p_k(x)]^{k+1}.
\]
Therefore,
\[
[p_{k+1}(x)]^{1/(k+1)} \le [p_k(x)]^{1/k},
\]
and $\bar{H}(t)$ is IFRA. □
If we reexamine the models introduced above, we notice that the survival probability $\bar{H}(t)$ is always of the form
\[
\bar{H}(t) = \sum_{k=0}^{\infty} p_k(x)\,\frac{e^{-\lambda t}(\lambda t)^k}{k!},
\]
where $p_k(x) = P[X_1 + X_2 + \cdots + X_k \le x]$ is the probability that the device survives $k$ shocks. The properties of $\bar{H}(t)$ are largely determined by the properties of the sequence $\{p_k(x)\}_{k=0}^{\infty}$. This means that by modifying the condition on $p_k(x)$, we can establish models leading to lifetime distributions with other aging properties, such as NBU and NBUE, as well as NWU and NWUE.
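5.5.4 Shock Models Leading to Other Lifetime Distributions

Consider the cumulative shock model with probabilities of surviving $k$ shocks $p_0 > p_1 > p_2 > \cdots$, where $p_0 = 1$. The survival probability of the device is
\[
\bar{H}(t) = \sum_{k=0}^{\infty} p_k\,\frac{e^{-\lambda t}(\lambda t)^k}{k!}.
\]
If we assume that when the device has already absorbed one or more shocks, the probability of surviving $k$ additional shocks is smaller than the surviving probability of the device before it has absorbed any shock, then we may conclude that the life distribution $H(t)$ is NBU. That is:

Theorem 5.13. If the sequence $\{p_k\}_{k=1}^{\infty}$ satisfies
\[
p_{i+j} \le p_i\,p_j, \quad \text{for } i, j = 1, 2, 3, \ldots
\]
(that means $\{p_j\}$ is discrete NBU), then $\bar{H}(t)$ is NBU.

Proof. We need to show that $\bar{H}(s+t) \le \bar{H}(s)\bar{H}(t)$. Since $p_0 = 1$, the inequality $p_{i+j} \le p_i p_j$ also holds (with equality) when $i = 0$ or $j = 0$, so that
\[
\bar{H}(s)\bar{H}(t) = \left(\sum_{i=0}^{\infty} p_i\,\frac{e^{-\lambda s}(\lambda s)^i}{i!}\right)\left(\sum_{j=0}^{\infty} p_j\,\frac{e^{-\lambda t}(\lambda t)^j}{j!}\right)
\]
\[
= \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} p_i\,p_j\,\frac{e^{-\lambda(s+t)}(\lambda s)^i(\lambda t)^j}{i!\,j!}
\ge \sum_{i=0}^{\infty}\sum_{j=0}^{\infty} p_{i+j}\,\frac{e^{-\lambda(s+t)}(\lambda s)^i(\lambda t)^j}{i!\,j!}
\]
\[
= \sum_{n=0}^{\infty} \frac{p_n}{n!}\,e^{-\lambda(s+t)} \sum_{j=0}^{n} \binom{n}{j}(\lambda s)^j(\lambda t)^{n-j}
= \sum_{n=0}^{\infty} \frac{p_n}{n!}\,e^{-\lambda(s+t)}\,[\lambda(s+t)]^n
= \bar{H}(s+t).
\]
Therefore, $\bar{H}(t)$ is NBU. □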
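The inequality in Theorem 5.13 can be checked on a grid for a concrete sequence. This is a minimal sketch under stated assumptions: the sequence $p_k = q^{k(k+1)/2}$ with $q = 0.8$ is a hypothetical example chosen because it satisfies $p_{i+j} \le p_i p_j$ with $p_0 = 1$; the rate, the grid, and the function names are likewise illustrative.

```python
import math

def shock_survival(t, lam, p, terms=60):
    """Truncated series sum_k p(k) e^{-lam t} (lam t)^k / k!."""
    total, poisson = 0.0, math.exp(-lam * t)
    for k in range(terms):
        total += p(k) * poisson
        poisson *= lam * t / (k + 1)   # Poisson probability of k + 1 shocks
    return total

def p(k, q=0.8):
    """Hypothetical discrete NBU sequence with p(0) = 1."""
    return q ** (k * (k + 1) / 2)

lam = 1.0
points = [0.5 * k for k in range(0, 9)]

# Discrete NBU condition on the sequence itself.
discrete_nbu = all(p(i + j) <= p(i) * p(j) + 1e-15
                   for i in range(10) for j in range(10))
# NBU property of the resulting survival function on the grid.
nbu_holds = all(
    shock_survival(s + t, lam, p)
    <= shock_survival(s, lam, p) * shock_survival(t, lam, p) + 1e-12
    for s in points for t in points
)
print(discrete_nbu, nbu_holds)
```

Replacing `p` by a geometric sequence $q^k$ gives equality throughout, which corresponds to an exponential lifetime.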
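By reversing the direction of the inequalities, the model leads to NWU. Similarly, we have

Theorem 5.14. If the sequence $\{p_k\}_{k=1}^{\infty}$ satisfies
\[
p_k \sum_{i=0}^{\infty} p_i \ge \sum_{j=k}^{\infty} p_j, \quad \text{for } k = 1, 2, 3, \ldots
\]
(which is the NBUE property $\mu\,G(t) \ge \int_t^{\infty} G(x)\,dx$ in discrete form), then $\bar{H}(t)$ is NBUE.

Proof. We need to show that
\[
\mu\,\bar{H}(t) \ge \int_t^{\infty} \bar{H}(x)\,dx,
\]
where $\mu = \int_0^{\infty} \bar{H}(x)\,dx$. In fact,
\[
\mu\,\bar{H}(t) - \int_t^{\infty} \bar{H}(x)\,dx = \bar{H}(t)\int_0^{\infty} \bar{H}(x)\,dx - \int_t^{\infty} \bar{H}(x)\,dx
\]
\[
= \sum_{j=0}^{\infty} p_j\,\frac{e^{-\lambda t}(\lambda t)^j}{j!} \int_0^{\infty} \sum_{k=0}^{\infty} p_k\,\frac{e^{-\lambda x}(\lambda x)^k}{k!}\,dx - \int_t^{\infty} \sum_{k=0}^{\infty} p_k\,\frac{e^{-\lambda x}(\lambda x)^k}{k!}\,dx
\]
\[
= \sum_{j=0}^{\infty} p_j\,\frac{e^{-\lambda t}(\lambda t)^j}{j!}\cdot\frac{1}{\lambda}\sum_{k=0}^{\infty} p_k - \frac{1}{\lambda}\sum_{k=0}^{\infty} p_k \sum_{j=0}^{k} \frac{e^{-\lambda t}(\lambda t)^j}{j!} \quad \text{(integration by parts)}
\]
\[
= \frac{1}{\lambda}\left(\sum_{j=0}^{\infty} p_j\,\frac{e^{-\lambda t}(\lambda t)^j}{j!}\sum_{k=0}^{\infty} p_k - \sum_{j=0}^{\infty}\sum_{k=j}^{\infty} p_k\,\frac{e^{-\lambda t}(\lambda t)^j}{j!}\right)
\]
\[
= \frac{1}{\lambda}\sum_{j=0}^{\infty} \frac{e^{-\lambda t}(\lambda t)^j}{j!}\left(p_j \sum_{k=0}^{\infty} p_k - \sum_{k=j}^{\infty} p_k\right) \ge 0.
\]
We have
\[
\mu\,\bar{H}(t) \ge \int_t^{\infty} \bar{H}(x)\,dx,
\]
and $\bar{H}(t)$ is NBUE. □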
Again, a similar result can be derived for NWUE distributions. Similar results are also true for DMRL, HDMRL, NBUC, and HNBUE classes. We shall not give detailed proofs.
Problems

1. A random variable $X$ is called stochastically larger than another random variable $Y$, denoted as $X \ge_{st} Y$, if
\[
P[X > x] \ge P[Y > x], \quad \text{for all } x.
\]
Show that $X \ge_{st} Y$ if and only if for any increasing function $g(x)$, $E[g(X)] \ge E[g(Y)]$.

2. Show that $X$ is IFR if and only if $(X - s) \mid X > s \;\ge_{st}\; (X - t) \mid X > t$ for $s \le t$. That means the residual life gets stochastically smaller.

3. Show that $X$ is NBU if and only if $X \ge_{st} (X - t) \mid X > t$ for any $t$.

4. A function $\varphi(x)$ is called star-shaped if for each $0 \le \alpha \le 1$, $\varphi(\alpha x) \le \alpha\varphi(x)$. Show that $F(x)$ is IFRA if and only if $-\ln \bar{F}(x)$ is star-shaped.

5. A function $\varphi(x)$ is called super-additive if $\varphi(x + y) \ge \varphi(x) + \varphi(y)$. Show that $F(x)$ is NBU if and only if $-\ln \bar{F}(x)$ is super-additive.

6. Show that if $X$ is NBU, then $E X^s < \infty$ for all $s > 0$.

7. Show that if two independent lifetime variables $X$ and $Y$ are NBU, then $X + Y$ is also NBU.

8. Suppose $X$ has survival function $\bar{F}(t)$ with failure rate $r(t)$ and $X_{k,n}$ is the $k$th order statistic from a sample of size $n$ with distribution $F(t)$.
(a) Show that $X_{k,n}$ has survival function
\[
P[X_{k,n} > t] = \frac{n!}{(k-1)!\,(n-k)!} \int_0^{\bar{F}(t)} x^{k-1}(1-x)^{n-k}\,dx.
\]
(b) Show that $X_{k,n}$ has failure rate
\[
r(t)\left[\int_0^1 y^{k-1}\left(\frac{1 - y\bar{F}(t)}{1 - \bar{F}(t)}\right)^{n-k} dy\right]^{-1}.
\]
(c) Show that if $X$ is IFR, so is $X_{k,n}$.

9. Show that if the distribution function $F(x)$ of $X$ has the failure rate $r(x)$, then $E[X] = E[1/r(X)]$.

10. Show that if $F(x)$ has failure rate $r(x)$ and $\lim_{x\to\infty} r(x) = r(\infty)$ exists, then
\[
\mu(\infty) = \lim_{y\to\infty} \mu(y) = \lim_{y\to\infty} \frac{1}{r(y)} = \frac{1}{r(\infty)}.
\]

11. Show that if $X$ is NBUE, then $\mathrm{Var}(X) \le (E(X))^2$.

12. Find a counterexample to show that the NWU class is not closed under mixture operation.

13. Construct an IFRA life distribution which is not IFR.

14. Let $\{p_i = (1-\theta)\theta^{i-1}\}$ for $i \ge 1$ denote the geometric distribution, and let $\{X_1, X_2, \ldots\}$ be i.i.d. random variables with distribution function $F(x)$. We call
\[
C(x) = (1-\theta)\sum_{i=1}^{\infty} \theta^{i-1} F^{(i)}(x)
\]
the geometric compound distribution of $F(x)$. Show that if $F(x)$ is HNWUE (HNBUE), so is $C(x)$.

15. For two independent lifetime random variables $X_1$ and $X_2$ with failure rates $r_1(t)$ and $r_2(t)$, respectively, show that
\[
P[X_1 < X_2 \mid \min(X_1, X_2) = t] = \frac{r_1(t)}{r_1(t) + r_2(t)}.
\]

16. For a lifetime $X$ following distribution $F(x)$ with mean $\mu$, we call
\[
\tilde{F}(x) = \frac{1}{\mu}\int_0^x \bar{F}(u)\,du
\]
the equilibrium distribution of $F(x)$, and the corresponding lifetime is denoted by $\tilde{X}$. Show that
(a) $X$ is NBUE if and only if $X \ge_{st} \tilde{X}$;
(b) $X$ is HNBUE if and only if $\tilde{X} \le_{st} \mathrm{Exp}(\mu)$, where $\mathrm{Exp}(\mu)$ denotes an exponential variable with mean $\mu$;
(c) $X$ is HDMRL if and only if $\tilde{X}$ is DMRL.

17. A nonnegative random variable $X$ is called stochastically smaller in convex order than another nonnegative random variable $Y$, denoted by $X \le_{cx} Y$, if
\[
\int_x^{\infty} P[X > u]\,du \le \int_x^{\infty} P[Y > u]\,du.
\]
(a) Show that $X$ is NBUC if and only if $X_t \le_{cx} X$.
(b) Show that $X$ is DMRL if and only if $X_t \le_{cx} X_s$ for $t \ge s$.
(c) Show that $X$ is HNBUE if and only if $X_t \le_{cx} \mathrm{Exp}(\mu)$, where $\mathrm{Exp}(\mu)$ denotes an exponential random variable with mean $\mu$.

18. By noting that
\[
\int_y^{\infty} x\,dF(x) = \bar{F}(y)\,(y + \mu(y)),
\]
show that
\[
\bar{F}(y) \le \frac{\mu}{y + \mu(y)}.
\]

19. Show that if $F(x)$ is NWUE, then
\[
\bar{F}(y) \le \frac{\mu}{y + \mu}.
\]

20. Suppose $X_i$, $Z_i$ for $i = 1, 2$ are independent lifetime random variables, and $X_i \ge_{st} Z_i$ for $i = 1, 2$. Show that $X_1 + X_2 \ge_{st} Z_1 + Z_2$.

21. Suppose two lifetime random variables $X$ and $Y$ have densities $f(x)$ and $g(y)$, respectively, and $f(x)/g(x)$ is increasing in $x$. Show that $X \ge_{st} Y$.
Chapter 6
Multivariate Lifetime Distributions
In most reliability analyses, components of a system are assumed to have independent lifetime distributions. However, in many reliability situations, it is more realistic to assume some form of dependence among components. For instance, we may observe that one component's survival increases the chance of another component's survival, which is termed "positive dependence" in reliability. This positive dependence among component life lengths arises from common environmental stress and shocks, from components depending on common sources of power, and so on. Multivariate lifetime distributions are used to describe the life length of multicomponent systems, taking component dependence into consideration.

The development of multivariate lifetime distributions has attracted significant interest recently. We are mainly interested in the multivariate distributions whose marginal distributions are typical univariate life distributions, such as exponential, Weibull, and gamma distributions. Partially due to the universal belief in correlation and regression techniques, coupled with the unjustified popularity of normal distributions (univariate and multivariate), the overwhelming predominance of normality in statistical practice and theory lasted over 80 years unchallenged, until recently. Most of the results in multivariate lifetime distributions were obtained during the 1960s and 1970s, and so the standard multivariate analysis texts are usually confined to multivariate normal distributions. In this chapter, we consider several basic bivariate parametric families of lifetime distributions, mainly bivariate extensions of exponential distributions, and the shock models that give rise to them. In addition, we study various notions of dependence and the relationships between them.
6.1 Basic Properties of Bivariate Distributions

Let $X$, $Y$ be nonnegative random variables (lifetimes of components) with distribution functions $F_1(x)$ and $F_2(y)$, respectively. The joint distribution function $F(x, y)$ of $(X, Y)$ is defined by
\[
F(x, y) = P[X \le x, Y \le y],
\]
and the survival function of $(X, Y)$ is defined by
\[
G(x, y) = P[X > x, Y > y].
\]
If $X$ and $Y$ are independent, we have
\[
F(x, y) = F_1(x) F_2(y) \quad \text{and} \quad G(x, y) = G_1(x) G_2(y),
\]
where $G_1(x)$ and $G_2(y)$ are the marginal survival functions of $X$ and $Y$, respectively. Throughout, we consider continuous random variables $X$ and $Y$ only, since we are mainly interested in lifetime distributions. Therefore, the bivariate distribution functions $F(x, y)$ and survival functions $G(x, y)$ concerned are continuous. We summarize the properties of bivariate distributions below.

(1) Let $F(x, y)$ be a bivariate distribution function with marginal distribution functions $F_1(x)$ and $F_2(y)$. Then
\[
F_1(x) = F(x, \infty), \quad F_2(y) = F(\infty, y), \quad F(\infty, \infty) = 1, \quad F(x, -\infty) = F(-\infty, y) = 0,
\]
and for any $x_1 \le x_2$, $y_1 \le y_2$,
\[
F(x_2, y_2) - F(x_2, y_1) - F(x_1, y_2) + F(x_1, y_1) \ge 0.
\]
The probabilistic interpretation of the last inequality is that the probability content of the rectangle $(x_1, x_2] \times (y_1, y_2]$ is nonnegative.

(2) Conversely, if $F(x, y)$ is a function of $x$ and $y$, with range $[0, 1]$, such that all the conditions in (1) are satisfied, and $F(x, y)$ is increasing in each variable (i.e., for each $x$, $F(x, y)$ is increasing in $y$), then $F(x, y)$ is a bivariate distribution function for some $(X, Y)$. Note that the inequality in (1) is the most crucial property.

(3) The following identities hold:
\[
F(x, y) = 1 - G_1(x) - G_2(y) + G(x, y), \qquad G(x, y) = 1 - F_1(x) - F_2(y) + F(x, y).
\]

(4) The Fréchet bounds between $F(x, y)$ and its marginal distributions are
\[
\max\{F_1(x) + F_2(y) - 1,\, 0\} \le F(x, y) \le \min\{F_1(x), F_2(y)\}.
\]
The probabilistic interpretation of the right-hand side is
\[
P[X \le x, Y \le y] \le \min\{P[X \le x],\, P[Y \le y]\}.
\]
(5) If the partial derivative $\dfrac{\partial^2}{\partial x\,\partial y} F(x, y)$ exists everywhere, then $F(x, y)$ is said to be absolutely continuous with density function
\[
f(x, y) = \frac{\partial^2}{\partial x\,\partial y} F(x, y).
\]
We have
\[
F(x, y) = \int_{-\infty}^{y}\int_{-\infty}^{x} f(s, t)\,ds\,dt.
\]
Notice that not every continuous distribution function $F(x, y)$ has a density function. Therefore, we need the next property.

(6) Every continuous bivariate distribution function $F(x, y)$ is a mixture of an absolutely continuous bivariate distribution $F_a(x, y)$ and a singular continuous bivariate distribution function $F_s(x, y)$, i.e.,
\[
F(x, y) = \alpha F_a(x, y) + (1 - \alpha) F_s(x, y),
\]
where $\alpha \in [0, 1]$, and
\[
\frac{\partial^2}{\partial x\,\partial y} F_s(x, y) = 0
\]
for almost all $x$ and $y$. The part $\alpha F_a(x, y)$ is called the absolutely continuous part of $F(x, y)$, and $(1 - \alpha) F_s(x, y)$ the singular continuous part.

The rigorous proof of (6) involves results in integration. We sketch the main idea here to illustrate how to find $\alpha$ and determine $F_a(x, y)$. If $F(x, y)$ is a continuous distribution function, the second order partial derivative
\[
f(x, y) = \frac{\partial^2}{\partial x\,\partial y} F(x, y)
\]
of $F(x, y)$ exists for almost all $x$ and $y$. (The term "almost all" means that the set where the partial derivative does not exist has measure zero in the plane.) We can show that
\[
\int_{-\infty}^{y}\int_{-\infty}^{x} f(s, t)\,ds\,dt \le F(x, y).
\]
If we let
\[
\alpha = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(s, t)\,ds\,dt,
\]
then, for $\alpha > 0$, $\frac{1}{\alpha} f(x, y)$ is a bivariate density function. It follows that
\[
F(x, y) = \alpha F_a(x, y) + (1 - \alpha) F_s(x, y),
\]
where $F_a(x, y)$ is an absolutely continuous bivariate distribution with density function $\frac{1}{\alpha} f(x, y)$, and, for $\alpha < 1$, the singular continuous part is
\[
F_s(x, y) = \frac{1}{1 - \alpha}\,[F(x, y) - \alpha F_a(x, y)].
\]
We should notice that this important observation provides a way to determine whether $F(x, y)$ has a density function. Also, the above properties hold for any bivariate distributions, regardless of whether they are lifetime distributions (nonnegative) or not. We conclude this section with an example illustrating a singular continuous distribution function.

Example 6.1. Single Fatal Shock Model

Consider a system with two components, which is subject to shocks from a single source governed by a Poisson process with rate $\lambda > 0$. Each shock causes fatal damage to either component, and when one of the components ceases to operate, the other component stops, so the system fails. The lifetime of the system can be best described by the joint distribution of the components' lifetimes $X$ and $Y$. Notice that the fatal shock stops the system no matter which component it hits. The system survives at the "time" $(x, y)$ if and only if both components survive at $\max\{x, y\}$. This simply means that there is no shock during $[0, \max\{x, y\}]$. Hence,
\[
G(x, y) = P[X > x, Y > y] = P[\text{No shock occurs during } [0, \max\{x, y\}]] = P[N(\max\{x, y\}) = 0] = e^{-\lambda\max\{x, y\}},
\]
where $\{N(t), t \ge 0\}$ is the Poisson process governing the occurrence of fatal shocks. We can write $G(x, y)$ as
\[
G(x, y) = \begin{cases} e^{-\lambda x} & \text{if } x \ge y, \\ e^{-\lambda y} & \text{if } x \le y. \end{cases}
\]
The marginal distribution functions of $X$ and $Y$ are both exponential distributions with the same parameter $\lambda$, and the bivariate distribution function of $(X, Y)$ is
\[
F(x, y) = 1 - e^{-\lambda x} - e^{-\lambda y} + e^{-\lambda\max\{x, y\}}.
\]
Clearly, $F(x, y)$ is a continuous bivariate distribution. Moreover, for $x \ne y$,
\[
\frac{\partial^2}{\partial x\,\partial y} F(x, y) = 0.
\]
Intuitively, since the "area" of the line $x = y$ is zero in the plane, we conclude that the life distribution function $F(x, y)$ is singular continuous.
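The singular nature of the distribution in Example 6.1 can be seen by computing the probability content of rectangles from $F(x, y)$. The sketch below is illustrative only: the shock rate and the function names are assumptions. A rectangle that misses the line $x = y$ carries no probability, while a square straddling the line carries the probability that the single shock time falls in the corresponding interval.

```python
import math

def survival(x, y, lam):
    """G(x, y) = exp(-lam * max(x, y)) for the single fatal shock model."""
    return math.exp(-lam * max(x, y))

def cdf(x, y, lam):
    """F(x, y) = 1 - G1(x) - G2(y) + G(x, y) with exponential marginals."""
    return 1.0 - math.exp(-lam * x) - math.exp(-lam * y) + survival(x, y, lam)

def rectangle_mass(x1, x2, y1, y2, lam):
    """Probability that (X, Y) falls in the rectangle (x1, x2] x (y1, y2]."""
    return (cdf(x2, y2, lam) - cdf(x2, y1, lam)
            - cdf(x1, y2, lam) + cdf(x1, y1, lam))

lam = 0.7  # illustrative shock rate (assumption)
off_diagonal = rectangle_mass(2.0, 3.0, 0.5, 1.0, lam)  # misses the line x = y
on_diagonal = rectangle_mass(1.0, 2.0, 1.0, 2.0, lam)   # straddles the line x = y
shock_in_interval = math.exp(-lam * 1.0) - math.exp(-lam * 2.0)
print(off_diagonal, on_diagonal, shock_in_interval)
```

Up to rounding, the first printed value is zero and the last two agree, which matches the statement that all the mass sits on the line $x = y$.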
6.2 Bivariate Memoryless Property

We start with bivariate extensions of exponential distributions. A distribution function $F(x, y)$ is said to be a bivariate exponential extension if its marginals $F_1(x) = F(x, \infty)$ and $F_2(y) = F(\infty, y)$ are both exponential distributions. The marginals may or may not be identical. It should be mentioned here that given two univariate distribution functions $F_1(x)$ and $F_2(y)$, there are infinitely many bivariate distributions $F(x, y)$ with $F_1(x)$ and $F_2(y)$ as marginal distributions. In other words, a bivariate distribution is not uniquely determined by its marginals. Therefore, there can be more than one bivariate extension of the exponential distribution.

There are three basic characterizations of the univariate exponential distribution: the lack of memory property, constant failure rate, and characterization by order statistics. It is natural to consider the lack of memory property first. The univariate version of the lack of memory property is
\[
P[X > x + y \mid X > y] = P[X > x],
\]
for $x, y \ge 0$. One obvious bivariate extension of the lack of memory property is
\[
P[X > x + s, Y > y + t \mid X > s, Y > t] = P[X > x, Y > y],
\]
for $s, t, x, y \ge 0$. That means
\[
G(x + s, y + t) = G(x, y)\,G(s, t),
\]
for $s, t, x, y \ge 0$. Unfortunately, this condition is so strong that it only leads to the bivariate exponential extension with independent components, as the next theorem shows.

Theorem 6.1. The only bivariate exponential extension $F(x, y)$ whose survival function $G(x, y)$ satisfies
\[
G(x + s, y + t) = G(x, y)\,G(s, t),
\]
for $s, t, x, y \ge 0$, is
\[
F(x, y) = 1 - e^{-\lambda_1 x} - e^{-\lambda_2 y} + e^{-\lambda_1 x - \lambda_2 y}, \quad x, y \ge 0,
\]
for some $\lambda_1, \lambda_2 > 0$.

Proof. By setting $t = y = 0$, we obtain
\[
G_1(x + s) = G(x + s, 0) = G(x, 0)\,G(s, 0) = G_1(x)\,G_1(s),
\]
for all $s, x \ge 0$, which implies that $G_1(x) = e^{-\lambda_1 x}$ for some $\lambda_1 > 0$. Similarly, $G_2(y) = e^{-\lambda_2 y}$ for some $\lambda_2 > 0$. Now setting $s = y = 0$ yields
\[
G(x, t) = G(x, 0)\,G(0, t) = G_1(x)\,G_2(t) = e^{-\lambda_1 x - \lambda_2 t}.
\]
It follows that $F(x, y)$ is the joint distribution of independent exponential random variables. □

In other words, we only obtain the trivial bivariate exponential extension in this way. Therefore, we need to relax the condition on $(s, t)$. Viewing the condition $[X > s, Y > t]$ as the "age" of the system, we may consider the generalization
\[
P[X > x + t, Y > y + t \mid X > t, Y > t] = P[X > x, Y > y],
\]
for $x, y, t \ge 0$. This means that the joint survival probability of a pair of components each at age $t$ is the same as that of a pair of new components. In terms of the joint survival function $G(x, y)$, we may write it as
\[
G(x + t, y + t) = G(x, y)\,G(t, t),
\]
for $x, y, t \ge 0$. We call this the Bivariate Lack of Memory Property.

Lemma 6.1. If $F(x, y)$ is a bivariate distribution with the bivariate lack of memory property, then its survival function $G(x, y)$ is of the form
\[
G(x, y) = \begin{cases} e^{-\lambda y}\,G_1(x - y), & \text{if } x \ge y \ge 0, \\ e^{-\lambda x}\,G_2(y - x), & \text{if } y \ge x \ge 0, \end{cases}
\]
for some $\lambda > 0$, where $G_1(x) = G(x, 0)$ and $G_2(y) = G(0, y)$ are the marginal survival functions.

Proof. Letting $x = y = s$ in the equation that defines the bivariate lack of memory property, we have
\[
G(s + t, s + t) = G(s, s)\,G(t, t), \quad s, t \ge 0.
\]
This implies that $G(s, s) = e^{-\lambda s}$ for some $\lambda > 0$. Next, if $y = 0$, we have
\[
G(x + t, t) = G(x, 0)\,G(t, t) = G_1(x)\,e^{-\lambda t}.
\]
It follows that
\[
G(x, y) = G_1(x - y)\,e^{-\lambda y}, \quad x \ge y.
\]
By a similar argument, we also have
\[
G(x, y) = G_2(y - x)\,e^{-\lambda x}, \quad y \ge x,
\]
and hence the lemma. □
This leads to the remarkable result due to Marshall and Olkin (1967), which stimulated extensive study of multivariate exponential extensions.

Theorem 6.2 (Marshall and Olkin's BVE). The only bivariate exponential extension that has the bivariate lack of memory property is of the form
\[ G(x,y) = \exp[-\lambda_1 x - \lambda_2 y - \lambda_{12}\max(x,y)], \]
where $\lambda_1$, $\lambda_2$, and $\lambda_{12}$ are nonnegative.

Remark. If $\lambda_{12} = 0$, then we have the bivariate exponential extension with independent marginal distributions, whereas if $\lambda_1 = \lambda_2 = 0$, then we have the singular continuous bivariate exponential extension as in Example 6.1.

Proof. Clearly, the survival function $G(x,y)$ given in the theorem satisfies the bivariate lack of memory property. Conversely, suppose that $G(x,y)$ is the survival function of a bivariate exponential extension with the bivariate lack of memory property, i.e.,
\[ G(x+t, y+t) = G(x,y)\,G(t,t), \quad x, y, t \ge 0. \]
Then Lemma 6.1 implies that
\[ G(x,y) = \begin{cases} e^{-\theta y}\,G_1(x-y), & \text{if } x \ge y \ge 0, \\ e^{-\theta x}\,G_2(y-x), & \text{if } y \ge x \ge 0. \end{cases} \]
Since $G_1(x)$ and $G_2(y)$ are both exponential survival functions, $G_1(x) = e^{-\delta_1 x}$ and $G_2(y) = e^{-\delta_2 y}$; thus
\[ G(x,y) = \begin{cases} e^{-\theta y}\,e^{-\delta_1 (x-y)}, & \text{if } x \ge y \ge 0, \\ e^{-\theta x}\,e^{-\delta_2 (y-x)}, & \text{if } y \ge x \ge 0. \end{cases} \]
For each $x \ge 0$, $G(x,y)$ is decreasing in $y$, so we must have $\delta_1 \le \theta$. Similarly, $\delta_2 \le \theta$. Let
\[ \lambda_1 = \theta - \delta_2, \quad \lambda_2 = \theta - \delta_1, \quad \text{and} \quad \lambda_{12} = \delta_1 + \delta_2 - \theta. \]
We claim that $\lambda_{12} \ge 0$. In fact, since $F(x,y) = 1 - G_1(x) - G_2(y) + G(x,y)$, the function
\[ H(x) = F(x,x) = 1 - e^{-\delta_1 x} - e^{-\delta_2 x} + e^{-\theta x} \]
defines a univariate distribution function (notice the geometric meaning of the distribution), and its density function is
\[ h(x) = \delta_1 e^{-\delta_1 x} + \delta_2 e^{-\delta_2 x} - \theta e^{-\theta x}. \]
It follows from $h(x) \ge 0$ that
\[ \lim_{x \to 0^+} h(x) = \delta_1 + \delta_2 - \theta \ge 0. \]
With $\theta = \lambda_1 + \lambda_2 + \lambda_{12}$, $\delta_1 = \lambda_1 + \lambda_{12}$, and $\delta_2 = \lambda_2 + \lambda_{12}$, we have
\[ G(x,y) = \begin{cases} e^{-\lambda_1 x}\,e^{-\lambda_2 y}\,e^{-\lambda_{12} x}, & \text{if } x \ge y \ge 0, \\ e^{-\lambda_1 x}\,e^{-\lambda_2 y}\,e^{-\lambda_{12} y}, & \text{if } y \ge x \ge 0, \end{cases} \]
which is exactly of the form required. $\square$
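As a quick numerical illustration of Theorem 6.2, the following sketch evaluates the Marshall and Olkin survival function and checks the bivariate lack of memory property at a few points. The function names and the parameter values are illustrative assumptions, not part of the text.

```python
import math


def bve_survival(x, y, lam1, lam2, lam12):
    """Marshall-Olkin survival function G(x, y) for x, y >= 0."""
    return math.exp(-lam1 * x - lam2 * y - lam12 * max(x, y))


def lack_of_memory_gap(x, y, t, lam1, lam2, lam12):
    """|G(x+t, y+t) - G(x, y) G(t, t)|, which should vanish for the BVE."""
    lhs = bve_survival(x + t, y + t, lam1, lam2, lam12)
    rhs = bve_survival(x, y, lam1, lam2, lam12) * bve_survival(t, t, lam1, lam2, lam12)
    return abs(lhs - rhs)


def strong_memoryless_gap(x, y, s, t, lam1, lam2, lam12):
    """|G(x+s, y+t) - G(x, y) G(s, t)|; nonzero in general when lam12 > 0."""
    lhs = bve_survival(x + s, y + t, lam1, lam2, lam12)
    rhs = bve_survival(x, y, lam1, lam2, lam12) * bve_survival(s, t, lam1, lam2, lam12)
    return abs(lhs - rhs)


if __name__ == "__main__":
    params = (0.2, 0.3, 0.1)  # hypothetical lam1, lam2, lam12
    print(lack_of_memory_gap(0.5, 1.5, 2.0, *params))
    print(strong_memoryless_gap(0.0, 1.0, 1.0, 0.0, *params))
```

The second function shows why the condition had to be relaxed: with unequal shifts $s \ne t$ the factorization fails as soon as $\lambda_{12} > 0$.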
Remark 1. Observe that if $(X,Y)$ has the above BVE, then
\[ P[\min(X,Y) > s] = G(s,s) = e^{-\lambda s}, \]
where $\lambda = \lambda_1 + \lambda_2 + \lambda_{12}$. If $X$ and $Y$ are independent, then $\lambda_{12} = 0$. Thus, we should view $\lambda_{12}$ as representing the correlation between $X$ and $Y$.

Remark 2. The bivariate lack of memory property
\[ P[X > x+t,\ Y > y+t \mid X > t,\ Y > t] = P[X > x,\ Y > y] \]
is equivalent to the equation on the survival function $G(x,y)$
\[ G(x+t, y+t) = G(x,y)\,G(t,t). \]
This is, in turn, equivalent to
\[ P[X > x+t,\ Y > y+t \mid X > x,\ Y > y] = P[X > t,\ Y > t], \]
which states that the survival probability of a series system of two components of ages $x$ and $y$, respectively, is the same as that of a new system. The probability $P[X > t, Y > t] = P[\min(X,Y) > t]$ is the survival probability of a series system. In other words, we can conclude that the life distribution of a series system of two used components, each having a marginal exponential life distribution, is independent of the component ages if and only if the joint distribution of the two components is the BVE. This is another model of the BVE.

Remark 3 (Fatal Shock Model). Consider a system of two components, with life lengths $X$ and $Y$, subject to fatal shocks. Suppose that there are three independent sources of shocks present in the environment. A shock from source 1 destroys component 1 only, a shock from source 2 destroys component 2 only, and a shock from source 3 destroys both components simultaneously. The three kinds of shocks occur according to three independent Poisson processes with rates $\lambda_1$, $\lambda_2$, and $\lambda_{12}$, respectively. If we let $U_1$, $U_2$, and $U_{12}$ denote the times at which the first shocks from the three sources occur, then $U_1$, $U_2$, and $U_{12}$ are independent and exponentially distributed. Moreover, the life lengths of the components are $X = \min\{U_1, U_{12}\}$ and $Y = \min\{U_2, U_{12}\}$. Thus, the survival function $G(x,y)$ of $(X,Y)$ is
\[ G(x,y) = P[X > x, Y > y] = P[\min\{U_1, U_{12}\} > x,\ \min\{U_2, U_{12}\} > y] = P[U_1 > x,\ U_2 > y,\ U_{12} > \max\{x,y\}], \]
which leads to
\[ G(x,y) = e^{-\lambda_1 x}\,e^{-\lambda_2 y}\,e^{-\lambda_{12}\max\{x,y\}}. \]
This is the BVE of Marshall and Olkin. Conversely, if we have a BVE, there must be independent exponential random variables $U_1$, $U_2$, and $U_{12}$ such that the marginals are $X = \min\{U_1, U_{12}\}$ and $Y = \min\{U_2, U_{12}\}$. Again, if $\lambda_{12} \ne 0$, $X$ and $Y$ are dependent via $U_{12}$ with parameter $\lambda_{12}$. This dependence structure can be seen through the above model. There are other distinct plausible models that lead to the above BVE. We will first examine properties of the BVE, and then look at those models.
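The fatal shock construction translates directly into a sampler. The sketch below, with hypothetical function names and parameter values, draws $(X,Y) = (\min\{U_1,U_{12}\}, \min\{U_2,U_{12}\})$ and compares the empirical joint survival frequency with the closed form; it assumes all three rates are strictly positive.

```python
import math
import random


def bve_survival(x, y, lam1, lam2, lam12):
    """Closed-form BVE survival function."""
    return math.exp(-lam1 * x - lam2 * y - lam12 * max(x, y))


def sample_bve(lam1, lam2, lam12, rng):
    """One draw of (X, Y) from the fatal shock model (all rates > 0)."""
    u1 = rng.expovariate(lam1)    # first shock hitting component 1 only
    u2 = rng.expovariate(lam2)    # first shock hitting component 2 only
    u12 = rng.expovariate(lam12)  # first shock hitting both components
    return min(u1, u12), min(u2, u12)


def empirical_survival(x, y, lam1, lam2, lam12, n=200_000, seed=0):
    """Monte Carlo estimate of P[X > x, Y > y]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        sx, sy = sample_bve(lam1, lam2, lam12, rng)
        if sx > x and sy > y:
            hits += 1
    return hits / n


if __name__ == "__main__":
    args = (1.0, 2.0, 0.2, 0.3, 0.1)
    print(empirical_survival(*args), bve_survival(*args))
```

Because the common shock time $U_{12}$ enters both coordinates, the simulated pairs satisfy $X = Y$ with positive probability, which is the singular part discussed later in the chapter.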
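6.3 Properties of the BVE

Let $F(x,y)$ be the distribution function of a BVE with survival function
\[ G(x,y) = e^{-\lambda_1 x - \lambda_2 y - \lambda_{12}\max(x,y)}, \quad x, y \ge 0. \]
We list the following basic properties of the BVE.

Property (1). The marginal distributions of $X$ and $Y$ are exponential with parameters $\lambda_1 + \lambda_{12}$ and $\lambda_2 + \lambda_{12}$, respectively. The expected values of the marginals are $\frac{1}{\lambda_1 + \lambda_{12}}$ and $\frac{1}{\lambda_2 + \lambda_{12}}$, respectively.

Property (2). The following inequalities hold:
\[ F(x,y) \ge F_1(x)F_2(y), \qquad G(x,y) \ge G_1(x)G_2(y). \]
In fact, since
\[ \lambda_1 x + \lambda_2 y + \lambda_{12}(x+y) \ge \lambda_1 x + \lambda_2 y + \lambda_{12}\max(x,y), \]
we have
\[ G(x,y) = e^{-\lambda_1 x - \lambda_2 y - \lambda_{12}\max(x,y)} \ge e^{-\lambda_1 x - \lambda_2 y - \lambda_{12}(x+y)} = G_1(x)G_2(y). \]
Also,
\[ F(x,y) - F_1(x)F_2(y) = G(x,y) - G_1(x)G_2(y), \]
and hence the other inequality is valid.

Property (3). $X$ and $Y$ are positively correlated (in fact, associated) in the sense that $\mathrm{cov}(X,Y) \ge 0$.

There are two ways of showing this. First, since
\[ \mathrm{cov}(X,Y) = E[XY] - E[X]E[Y] = \int_0^\infty\!\!\int_0^\infty [G(x,y) - G_1(x)G_2(y)]\,dx\,dy, \]
and $G(x,y) - G_1(x)G_2(y) \ge 0$, we have $\mathrm{cov}(X,Y) \ge 0$. This is easy once we have the above formula for the expectation.

Second, we can also calculate $\mathrm{cov}(X,Y)$ explicitly. By the fatal shock model, the marginals are $X = \min(U_1, U_{12})$ and $Y = \min(U_2, U_{12})$, where $U_1$, $U_2$, and $U_{12}$ are independent exponential random variables. We only need to find
\[ E[XY] = E[\min(U_1, U_{12})\min(U_2, U_{12})]. \]
Conditioned on $U_{12} = t$, $X$ and $Y$ become (conditionally) independent. Hence, we have
\begin{align*}
E[X \mid U_{12} = t] &= E[\min(U_1, U_{12}) \mid U_{12} = t] \\
&= \int_0^\infty \min(u,t)\,\lambda_1 e^{-\lambda_1 u}\,du \\
&= \int_0^t u\,\lambda_1 e^{-\lambda_1 u}\,du + \int_t^\infty t\,\lambda_1 e^{-\lambda_1 u}\,du \\
&= \frac{1}{\lambda_1}\left(1 - e^{-\lambda_1 t}\right).
\end{align*}
Similarly,
\[ E[Y \mid U_{12} = t] = \frac{1}{\lambda_2}\left(1 - e^{-\lambda_2 t}\right). \]
Hence,
\begin{align*}
E[XY] &= E[\min(U_1, U_{12})\min(U_2, U_{12})] \\
&= E\{E[\min(U_1, U_{12})\min(U_2, U_{12}) \mid U_{12}]\} \\
&= E\{E[\min(U_1, U_{12}) \mid U_{12}]\;E[\min(U_2, U_{12}) \mid U_{12}]\} \\
&= E\left[\frac{1}{\lambda_1}\left(1 - e^{-\lambda_1 U_{12}}\right)\frac{1}{\lambda_2}\left(1 - e^{-\lambda_2 U_{12}}\right)\right] \\
&= \int_0^\infty \frac{1}{\lambda_1\lambda_2}\left(1 - e^{-\lambda_1 u}\right)\left(1 - e^{-\lambda_2 u}\right)\lambda_{12} e^{-\lambda_{12} u}\,du \\
&= \frac{\lambda_1 + \lambda_2 + 2\lambda_{12}}{(\lambda_1 + \lambda_{12})(\lambda_2 + \lambda_{12})(\lambda_1 + \lambda_2 + \lambda_{12})}
\end{align*}
and
\begin{align*}
\mathrm{cov}(X,Y) &= E[XY] - E[X]E[Y] \\
&= \frac{\lambda_1 + \lambda_2 + 2\lambda_{12}}{(\lambda_1 + \lambda_{12})(\lambda_2 + \lambda_{12})(\lambda_1 + \lambda_2 + \lambda_{12})} - \frac{1}{\lambda_1 + \lambda_{12}}\cdot\frac{1}{\lambda_2 + \lambda_{12}} \\
&= \frac{(\lambda_1 + \lambda_2 + 2\lambda_{12}) - (\lambda_1 + \lambda_2 + \lambda_{12})}{(\lambda_1 + \lambda_{12})(\lambda_2 + \lambda_{12})(\lambda_1 + \lambda_2 + \lambda_{12})} \\
&= \frac{\lambda_{12}}{(\lambda_1 + \lambda_{12})(\lambda_2 + \lambda_{12})(\lambda_1 + \lambda_2 + \lambda_{12})} \ge 0.
\end{align*}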
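The conditioning argument for the covariance can be checked numerically. The sketch below (function names, integration limits, and step counts are illustrative assumptions) evaluates the one-dimensional integral for $E[XY]$ by the midpoint rule and compares the resulting covariance with the closed form $\lambda_{12}/[(\lambda_1+\lambda_{12})(\lambda_2+\lambda_{12})(\lambda_1+\lambda_2+\lambda_{12})]$. It assumes $\lambda_1, \lambda_2, \lambda_{12} > 0$.

```python
import math


def bve_cov_closed(lam1, lam2, lam12):
    """Closed-form covariance of the BVE."""
    lam = lam1 + lam2 + lam12
    return lam12 / ((lam1 + lam12) * (lam2 + lam12) * lam)


def bve_exy_numeric(lam1, lam2, lam12, upper=400.0, steps=100_000):
    """Midpoint-rule value of E[XY], conditioning on the common shock time U12 = u."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        ex = (1.0 - math.exp(-lam1 * u)) / lam1  # E[X | U12 = u]
        ey = (1.0 - math.exp(-lam2 * u)) / lam2  # E[Y | U12 = u]
        total += ex * ey * lam12 * math.exp(-lam12 * u)
    return total * h


def bve_cov_numeric(lam1, lam2, lam12):
    """Numerical covariance: E[XY] minus the product of the marginal means."""
    exy = bve_exy_numeric(lam1, lam2, lam12)
    return exy - 1.0 / ((lam1 + lam12) * (lam2 + lam12))


if __name__ == "__main__":
    print(bve_cov_closed(0.2, 0.3, 0.1), bve_cov_numeric(0.2, 0.3, 0.1))
```

The truncation point `upper` must be large relative to $1/\lambda_{12}$ for the neglected tail to be negligible; the default is chosen with the parameter values of the Problems section in mind.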
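We can see the crucial role that $\lambda_{12}$ plays here.

Property (4). The random variable $T = \min(X,Y)$ is exponentially distributed with parameter $\lambda = \lambda_1 + \lambda_2 + \lambda_{12}$. In fact,
\[ P[T > x] = P[\min(X,Y) > x] = P[X > x, Y > x] = e^{-(\lambda_1 + \lambda_2 + \lambda_{12})x}. \]

Property (5). If we let $D = X - Y$, then $D$ and $T = \min(X,Y)$ are independent.

We need to show that for all $x \ge 0$ and $y \in (-\infty, \infty)$,
\[ P[\min(X,Y) > x,\ X - Y > y] = P[\min(X,Y) > x]\,P[X - Y > y]. \]
Again, we reduce everything back to the building blocks $U_1$, $U_2$, and $U_{12}$. Observe that for every $y > 0$, $U_{12} < U_{12} + y$, so $\min(U_1, U_{12}) < U_{12} + y$; hence
\[ \min(U_1, U_{12}) > \min(U_2, U_{12}) + y \]
if and only if $\min(U_1, U_{12}) > U_2 + y$ (in which case $\min(U_2, U_{12}) = U_2$). It follows that
\begin{align*}
P[\min(X,Y) > x,\ X - Y > y] &= P[X > x,\ Y > x,\ X - Y > y] \\
&= P[\min(U_1, U_{12}) > x,\ \min(U_2, U_{12}) > x,\ \min(U_1, U_{12}) > \min(U_2, U_{12}) + y] \\
&= P[\min(U_2, U_{12}) > x,\ \min(U_1, U_{12}) > \min(U_2, U_{12}) + y] \\
&= P[U_2 > x,\ \min(U_1, U_{12}) > U_2 + y] \\
&= \int_x^\infty P[\min(U_1, U_{12}) > U_2 + y \mid U_2 = u]\,\lambda_2 e^{-\lambda_2 u}\,du \\
&= \int_x^\infty P[U_1 > u + y,\ U_{12} > u + y]\,\lambda_2 e^{-\lambda_2 u}\,du \\
&= \int_x^\infty e^{-(\lambda_1 + \lambda_{12})(u+y)}\,\lambda_2 e^{-\lambda_2 u}\,du \\
&= e^{-(\lambda_1 + \lambda_{12})y}\int_x^\infty \lambda_2 e^{-(\lambda_1 + \lambda_2 + \lambda_{12})u}\,du \\
&= e^{-(\lambda_1 + \lambda_{12})y}\,\frac{\lambda_2}{\lambda_1 + \lambda_2 + \lambda_{12}}\,e^{-(\lambda_1 + \lambda_2 + \lambda_{12})x} \\
&= \frac{\lambda_2}{\lambda_1 + \lambda_2 + \lambda_{12}}\,e^{-(\lambda_1 + \lambda_{12})y}\,P[T > x].
\end{align*}
Similarly, for $y < 0$,
\begin{align*}
P[\min(X,Y) > x,\ X - Y > y] &= P[X > x,\ Y > x,\ X - Y > y] \\
&= P[X > x,\ Y > x] - P[X > x,\ Y > x,\ X \le Y + y] \\
&= P[\min(U_1, U_{12}) > x,\ \min(U_2, U_{12}) > x] \\
&\quad - P[\min(U_1, U_{12}) > x,\ \min(U_2, U_{12}) > x,\ \min(U_1, U_{12}) \le \min(U_2, U_{12}) + y] \\
&= P[U_1 > x,\ U_2 > x,\ U_{12} > x] - P[U_1 > x,\ \min(U_2, U_{12}) \ge \min(U_1, U_{12}) - y] \\
&= e^{-(\lambda_1 + \lambda_2 + \lambda_{12})x} - P[U_1 > x,\ \min(U_2, U_{12}) \ge U_1 - y] \\
&= e^{-(\lambda_1 + \lambda_2 + \lambda_{12})x} - \int_x^\infty P[U_2 \ge u - y,\ U_{12} \ge u - y]\,\lambda_1 e^{-\lambda_1 u}\,du \\
&= e^{-(\lambda_1 + \lambda_2 + \lambda_{12})x} - \int_x^\infty e^{-(\lambda_2 + \lambda_{12})(u-y)}\,\lambda_1 e^{-\lambda_1 u}\,du \\
&= e^{-(\lambda_1 + \lambda_2 + \lambda_{12})x} - \frac{\lambda_1}{\lambda_1 + \lambda_2 + \lambda_{12}}\,e^{(\lambda_2 + \lambda_{12})y}\,e^{-(\lambda_1 + \lambda_2 + \lambda_{12})x} \\
&= \left(1 - \frac{\lambda_1}{\lambda_1 + \lambda_2 + \lambda_{12}}\,e^{(\lambda_2 + \lambda_{12})y}\right)P[T > x].
\end{align*}
Notice that with $G_1(x) = P[T > x] = e^{-(\lambda_1 + \lambda_2 + \lambda_{12})x}$ (here $G_1$ and $G_2$ are used as temporary notation, not for the marginal survival functions), and if we let
\[ G_2(y) = \begin{cases} 1 - \dfrac{\lambda_1}{\lambda_1 + \lambda_2 + \lambda_{12}}\,e^{(\lambda_2 + \lambda_{12})y}, & y < 0, \\[2mm] \dfrac{\lambda_2}{\lambda_1 + \lambda_2 + \lambda_{12}}\,e^{-(\lambda_1 + \lambda_{12})y}, & y > 0, \end{cases} \]
we have $G_2(y) = P[X - Y > y]$ for $y \ne 0$, and hence
\[ P[T > x,\ X - Y > y] = G_1(x)\,G_2(y). \]
This proves (5).

Property (6). $T$ is independent of $|X - Y| = \max(X,Y) - \min(X,Y)$.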
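This is basically a restatement of (5): since $T$ and $X - Y$ are independent, so are $T$ and any function of $X - Y$. We can also observe that
\[ P[T > x,\ |X - Y| < y] = P[T > x,\ -y < X - Y < y] = P[T > x]\,P[-y < X - Y < y]. \]

Property (7). $T$ is independent of the events $[X < Y]$, $[X > Y]$, and $[X = Y]$.

This follows from (5) by observing that $[X < Y] = [X - Y < 0]$, $[X > Y] = [X - Y > 0]$, and $[X = Y] = [X - Y = 0]$.

Now, we have several further results about the BVE.

Theorem 6.3. Let $X$ and $Y$ be exponential random variables with joint bivariate distribution $F(x,y)$. Then $F(x,y)$ is the BVE of Marshall and Olkin (i.e., $F(x,y)$ has the bivariate lack of memory property) if and only if
1. $T = \min(X,Y)$ is exponentially distributed with parameter $\lambda$,
2. $T$ and $D = X - Y$ are independent.
In this case, the parameters of $X$, $Y$, and $T = \min(X,Y)$ are $\lambda_1 + \lambda_{12}$, $\lambda_2 + \lambda_{12}$, and $\lambda_1 + \lambda_2 + \lambda_{12}$, respectively.

Proof. The necessity follows from properties (4) and (5). To show the sufficiency, we will show that if conditions (1) and (2) hold, then $F(x,y)$ has the bivariate lack of memory property. Let $X$ and $Y$ be exponential random variables satisfying conditions (1) and (2). For $x \le y$,
\begin{align*}
G(x,y) &= P[X > x,\ Y > y] \\
&= P[\min(X,Y) > y] + P[x < X \le y,\ Y > y] \\
&= P[T > y] + P[x < X \le y,\ Y > y] \\
&= P[T > y] + \int_x^y P[x < X \le y,\ Y > y \mid T = u]\,\lambda e^{-\lambda u}\,du \\
&= e^{-\lambda y} + \int_x^y P[X - Y \le u - y]\,\lambda e^{-\lambda u}\,du \\
&= e^{-\lambda y} + \int_x^y P[D \le u - y]\,\lambda e^{-\lambda u}\,du.
\end{align*}
Thus,
\[ \frac{\partial}{\partial x}G(x,y) = -P[D \le x - y]\,\lambda e^{-\lambda x}, \]
for almost all $0 \le x \le y$. With $y$ and $t$ fixed,
\[ \frac{\partial}{\partial x}G(x+t, y+t) = -P[D \le x - y]\,\lambda e^{-\lambda(x+t)} = G(t,t)\,\frac{\partial}{\partial x}G(x,y). \]
Integrating with respect to $x$, and using $G(y+t, y+t) = e^{-\lambda(y+t)} = G(t,t)\,G(y,y)$ at the endpoint $x = y$, yields
\[ G(x+t, y+t) = G(t,t)\,G(x,y), \]
for $0 \le x \le y$, $t \ge 0$. A similar result holds for the case $0 \le y \le x$. This implies that $F(x,y)$ has the bivariate lack of memory property, so that $F(x,y)$ is the BVE. $\square$

Theorem 6.4. Let $G(x,y)$ be the survival function of the BVE with $\lambda = \lambda_1 + \lambda_2 + \lambda_{12}$. Then
\[ G(x,y) = \frac{\lambda_1 + \lambda_2}{\lambda}\,G_a(x,y) + \frac{\lambda_{12}}{\lambda}\,G_s(x,y), \]
where $G_a(x,y)$ is an absolutely continuous survival function with density function
\[ g_a(x,y) = \begin{cases} \dfrac{\lambda\lambda_1(\lambda_2 + \lambda_{12})}{\lambda_1 + \lambda_2}\,e^{-\lambda_1 x - (\lambda_2 + \lambda_{12})y}, & 0 \le x < y, \\[2mm] \dfrac{\lambda\lambda_2(\lambda_1 + \lambda_{12})}{\lambda_1 + \lambda_2}\,e^{-(\lambda_1 + \lambda_{12})x - \lambda_2 y}, & 0 \le y < x, \end{cases} \]
and
\[ G_s(x,y) = e^{-\lambda\max(x,y)} \]
is a singular continuous survival function.

Proof. Let $G(x,y)$ be the survival function of the BVE; then
\[ G(x,y) = \begin{cases} e^{-\lambda_1 x}\,e^{-(\lambda_2 + \lambda_{12})y}, & x \le y, \\ e^{-(\lambda_1 + \lambda_{12})x}\,e^{-\lambda_2 y}, & y \le x. \end{cases} \]
For $x < y$,
\[ \frac{\partial^2}{\partial x\,\partial y}G(x,y) = \lambda_1(\lambda_2 + \lambda_{12})\,e^{-\lambda_1 x}\,e^{-(\lambda_2 + \lambda_{12})y}, \]
and for $y < x$,
\[ \frac{\partial^2}{\partial x\,\partial y}G(x,y) = (\lambda_1 + \lambda_{12})\lambda_2\,e^{-(\lambda_1 + \lambda_{12})x}\,e^{-\lambda_2 y}. \]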
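It follows that
\begin{align*}
\int_0^\infty\!\!\int_0^\infty \frac{\partial^2}{\partial x\,\partial y}G(x,y)\,dx\,dy
&= \int_0^\infty\!\!\int_0^y \lambda_1(\lambda_2 + \lambda_{12})\,e^{-\lambda_1 x}\,e^{-(\lambda_2 + \lambda_{12})y}\,dx\,dy \\
&\quad + \int_0^\infty\!\!\int_0^x (\lambda_1 + \lambda_{12})\lambda_2\,e^{-(\lambda_1 + \lambda_{12})x}\,e^{-\lambda_2 y}\,dy\,dx \\
&= \int_0^\infty (\lambda_2 + \lambda_{12})\left(1 - e^{-\lambda_1 y}\right)e^{-(\lambda_2 + \lambda_{12})y}\,dy \\
&\quad + \int_0^\infty (\lambda_1 + \lambda_{12})\left(1 - e^{-\lambda_2 x}\right)e^{-(\lambda_1 + \lambda_{12})x}\,dx \\
&= \int_0^\infty (\lambda_2 + \lambda_{12})\,e^{-(\lambda_2 + \lambda_{12})y}\,dy - \int_0^\infty (\lambda_2 + \lambda_{12})\,e^{-\lambda y}\,dy \\
&\quad + \int_0^\infty (\lambda_1 + \lambda_{12})\,e^{-(\lambda_1 + \lambda_{12})x}\,dx - \int_0^\infty (\lambda_1 + \lambda_{12})\,e^{-\lambda x}\,dx \\
&= 1 - \frac{\lambda_2 + \lambda_{12}}{\lambda} + 1 - \frac{\lambda_1 + \lambda_{12}}{\lambda} \\
&= \frac{\lambda_1 + \lambda_2}{\lambda}.
\end{align*}
If we let
\[ p(x,y) = \frac{\lambda}{\lambda_1 + \lambda_2}\,\frac{\partial^2}{\partial x\,\partial y}G(x,y), \]
then $p(x,y)$ is a bivariate density function with survival function
\[ G_a(x,y) = \frac{\lambda}{\lambda_1 + \lambda_2}\,e^{-\lambda_1 x - \lambda_2 y - \lambda_{12}\max(x,y)} - \frac{\lambda_{12}}{\lambda_1 + \lambda_2}\,e^{-\lambda\max(x,y)}. \]
Hence,
\[ G(x,y) = \frac{\lambda_1 + \lambda_2}{\lambda}\,G_a(x,y) + \frac{\lambda_{12}}{\lambda}\,G_s(x,y), \]
where $G_s(x,y) = e^{-\lambda\max(x,y)}$. $\square$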
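The decomposition in Theorem 6.4 is easy to verify pointwise. The following sketch, with assumed function names and parameter values, codes $G$, $G_a$, and $G_s$ and reports the largest deviation between $G$ and the mixture over a few test points; it assumes $\lambda_1 + \lambda_2 > 0$.

```python
import math


def bve_survival(x, y, lam1, lam2, lam12):
    """BVE survival function G."""
    return math.exp(-lam1 * x - lam2 * y - lam12 * max(x, y))


def ac_part(x, y, lam1, lam2, lam12):
    """Absolutely continuous component G_a (requires lam1 + lam2 > 0)."""
    lam = lam1 + lam2 + lam12
    w = lam1 + lam2
    m = max(x, y)
    return (lam / w) * math.exp(-lam1 * x - lam2 * y - lam12 * m) \
        - (lam12 / w) * math.exp(-lam * m)


def singular_part(x, y, lam1, lam2, lam12):
    """Singular component G_s, concentrated on the diagonal x = y."""
    lam = lam1 + lam2 + lam12
    return math.exp(-lam * max(x, y))


def mixture_gap(points, lam1, lam2, lam12):
    """Largest |G - (weighted G_a + weighted G_s)| over the given points."""
    lam = lam1 + lam2 + lam12
    worst = 0.0
    for x, y in points:
        mix = ((lam1 + lam2) / lam) * ac_part(x, y, lam1, lam2, lam12) \
            + (lam12 / lam) * singular_part(x, y, lam1, lam2, lam12)
        worst = max(worst, abs(bve_survival(x, y, lam1, lam2, lam12) - mix))
    return worst


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 2.0), (2.0, 1.0), (3.0, 3.0)]
    print(mixture_gap(pts, 0.2, 0.3, 0.1))
```

The mixing weight $\lambda_{12}/\lambda$ of the singular part equals $P[X = Y]$, the probability that the common shock arrives first.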
Remark 1. As long as $\lambda_{12} \ne 0$, the BVE is not absolutely continuous, so it has no density function. In applications, this creates some inconvenience. We shall study absolutely continuous bivariate extensions.

Remark 2. Two useful formulas on conditional probability related to the BVE are given below:
\[ P[X > x \mid Y = y] = \begin{cases} e^{-\lambda_1 x}, & y > x, \\[1mm] \dfrac{\lambda_2}{\lambda_2 + \lambda_{12}}\,e^{-\lambda_{12}(x-y) - \lambda_1 x}, & y < x, \end{cases} \]
and
\[ E[X \mid Y = y] = \frac{1}{\lambda_1} - \frac{\lambda_{12}\lambda}{\lambda_1(\lambda_1 + \lambda_{12})(\lambda_2 + \lambda_{12})}\,e^{-\lambda_1 y}, \quad y \ge 0. \]
They are useful in statistical analysis of the BVE, and reveal how one component depends on the other.
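Since $X \ge 0$, the conditional mean is the integral of the conditional survival function over $x \ge 0$, which gives a direct consistency check between the two formulas of Remark 2. The sketch below uses hypothetical function names and a plain midpoint rule; the truncation point and step count are assumptions suited to rates of the order used in the Problems section.

```python
import math


def cond_survival(x, y, lam1, lam2, lam12):
    """P[X > x | Y = y] for the BVE (the single point x = y is immaterial here)."""
    if x < y:
        return math.exp(-lam1 * x)
    return lam2 / (lam2 + lam12) * math.exp(-lam12 * (x - y) - lam1 * x)


def cond_mean_closed(y, lam1, lam2, lam12):
    """Closed-form E[X | Y = y]."""
    lam = lam1 + lam2 + lam12
    coef = lam12 * lam / (lam1 * (lam1 + lam12) * (lam2 + lam12))
    return 1.0 / lam1 - coef * math.exp(-lam1 * y)


def cond_mean_numeric(y, lam1, lam2, lam12, upper=200.0, steps=200_000):
    """Midpoint-rule integral of the conditional survival function over [0, upper]."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        total += cond_survival((i + 0.5) * h, y, lam1, lam2, lam12)
    return total * h


if __name__ == "__main__":
    print(cond_mean_closed(2.0, 0.2, 0.3, 0.1), cond_mean_numeric(2.0, 0.2, 0.3, 0.1))
```

The closed form is increasing in $y$ whenever $\lambda_{12} > 0$, which is the content of Problem 6 below.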
6.4 A Nonfatal Shock Model

We present another shock model leading to the BVE. Consider a system with two components, having respective life lengths $X$ and $Y$. Suppose now that there are three sources of shocks that are not necessarily fatal. A shock from source 1 causes the failure of component 1 with probability $q_1$ and is nonfatal with probability $1 - q_1$; a shock from source 2 causes the failure of component 2 with probability $q_2$ and is nonfatal with probability $1 - q_2$; finally, a shock from source 3 affects both components and can be fatal or nonfatal to either or both of them. More precisely, there are four cases:

1. Failure of both components, with probability $q_{11}$,
2. Failure of component 1 only, with probability $q_{10}$,
3. Failure of component 2 only, with probability $q_{01}$,
4. Nonfatal to both components, with probability $q_{00}$,

where $q_{00} + q_{10} + q_{01} + q_{11} = 1$. Assume, as usual, that shocks from the three sources arrive according to three independent Poisson processes $N_1(t)$, $N_2(t)$, and $N_{12}(t)$ with rates $\lambda_1$, $\lambda_2$, and $\lambda_{12}$, respectively.

Now, let $U_1$ and $U_2$ denote the times of the first fatal shock due to $N_1(t)$ and $N_2(t)$, and let $U_{12}$ and $U_{12}'$ denote the times of the first shock from $N_{12}(t)$ that is fatal to component 1 and to component 2, respectively. Then,
\[ (X,Y) = \left(\min(U_1, U_{12}),\ \min(U_2, U_{12}')\right). \]
Note that the time until the first fatal shock from $N_1(t)$ is exponentially distributed with parameter $\lambda_1 q_1$, since
\begin{align*}
P[U_1 > t_1] &= P[\text{all shocks in } [0, t_1] \text{ are nonfatal}] \\
&= \sum_{k=0}^\infty P[N_1(t_1) = k,\ \text{all shocks are nonfatal}] \\
&= \sum_{k=0}^\infty e^{-\lambda_1 t_1}\,\frac{(\lambda_1 t_1)^k}{k!}\,(1 - q_1)^k \\
&= e^{-\lambda_1 q_1 t_1},
\end{align*}
for $t_1 \ge 0$. Similarly,
\[ P[U_2 > t_2] = e^{-\lambda_2 q_2 t_2}. \]
This implies that
\begin{align*}
G(x,y) &= P[X > x,\ Y > y] \\
&= P[\min(U_1, U_{12}) > x,\ \min(U_2, U_{12}') > y] \\
&= P[U_1 > x,\ U_2 > y,\ U_{12} > x,\ U_{12}' > y] \\
&= P[U_1 > x]\,P[U_2 > y]\,P[U_{12} > x,\ U_{12}' > y] \\
&= e^{-\lambda_1 q_1 x}\,e^{-\lambda_2 q_2 y}\,P[U_{12} > x,\ U_{12}' > y].
\end{align*}
Let us turn our attention to $P[U_{12} > x, U_{12}' > y]$. Since $U_{12}$ is the time until the first shock from $N_{12}(t)$ that destroys the first component, whereas $U_{12}'$ is the time until the first shock from $N_{12}(t)$ that destroys the second component, for $x < y$ we have
\[ P[U_{12} > x,\ U_{12}' > y] = e^{-\lambda_{12} q_{10} x}\,e^{-\lambda_{12}(1 - q_{00} - q_{10})y}, \]
and for $x > y$,
\[ P[U_{12} > x,\ U_{12}' > y] = e^{-\lambda_{12}(1 - q_{00} - q_{01})x}\,e^{-\lambda_{12} q_{01} y}. \]
Combining the two yields
\[ G(x,y) = \begin{cases} e^{-\lambda_1 q_1 x}\,e^{-\lambda_2 q_2 y}\,e^{-\lambda_{12} q_{10} x}\,e^{-\lambda_{12}(1 - q_{00} - q_{10})y}, & x < y, \\ e^{-\lambda_1 q_1 x}\,e^{-\lambda_2 q_2 y}\,e^{-\lambda_{12}(1 - q_{00} - q_{01})x}\,e^{-\lambda_{12} q_{01} y}, & y < x. \end{cases} \]
Let
\[ \lambda_1' = \lambda_1 q_1 + \lambda_{12} q_{10}, \quad \lambda_2' = \lambda_2 q_2 + \lambda_{12} q_{01}, \quad \text{and} \quad \lambda_{12}' = \lambda_{12} q_{11}. \]
Since $1 - q_{00} - q_{10} = q_{01} + q_{11}$ and $1 - q_{00} - q_{01} = q_{10} + q_{11}$, $G(x,y)$ can be written as
\[ G(x,y) = \begin{cases} e^{-\lambda_1' x}\,e^{-\lambda_2' y}\,e^{-\lambda_{12}' y}, & x \le y, \\ e^{-\lambda_1' x}\,e^{-\lambda_2' y}\,e^{-\lambda_{12}' x}, & y \le x, \end{cases} \]
that is,
\[ G(x,y) = e^{-\lambda_1' x - \lambda_2' y - \lambda_{12}'\max(x,y)}, \quad x, y \ge 0. \]
The method used here is typical. It can be used to derive the bivariate Poisson process and bivariate gamma and Weibull distributions. Interested readers are referred to the book of Barlow and Proschan (1975).
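The reduction of the nonfatal shock model to a BVE amounts to a reparametrization. The sketch below, with hypothetical names and illustrative values, computes the effective rates $\lambda_1' = \lambda_1 q_1 + \lambda_{12} q_{10}$, $\lambda_2' = \lambda_2 q_2 + \lambda_{12} q_{01}$, $\lambda_{12}' = \lambda_{12} q_{11}$ and checks that the piecewise survival function obtained from the thinning argument agrees with the BVE form in these rates.

```python
import math


def effective_rates(lam1, lam2, lam12, q1, q2, q10, q01, q11):
    """Effective BVE parameters of the nonfatal shock model."""
    return (lam1 * q1 + lam12 * q10,
            lam2 * q2 + lam12 * q01,
            lam12 * q11)


def survival_piecewise(x, y, lam1, lam2, lam12, q1, q2, q00, q10, q01, q11):
    """Survival function written case by case, as in the thinning derivation."""
    own = math.exp(-lam1 * q1 * x - lam2 * q2 * y)
    if x <= y:
        shared = math.exp(-lam12 * q10 * x - lam12 * (1 - q00 - q10) * y)
    else:
        shared = math.exp(-lam12 * (1 - q00 - q01) * x - lam12 * q01 * y)
    return own * shared


def survival_bve(x, y, r1, r2, r12):
    """BVE survival function with rates r1, r2, r12."""
    return math.exp(-r1 * x - r2 * y - r12 * max(x, y))


if __name__ == "__main__":
    lam = (1.0, 2.0, 1.5)          # illustrative shock rates
    q1, q2 = 0.3, 0.4              # illustrative fatality probabilities
    q00, q10, q01, q11 = 0.4, 0.1, 0.2, 0.3
    rates = effective_rates(*lam, q1, q2, q10, q01, q11)
    print(rates)
    print(survival_piecewise(0.5, 1.2, *lam, q1, q2, q00, q10, q01, q11),
          survival_bve(0.5, 1.2, *rates))
```

Thinning a Poisson process of rate $\lambda$ with retention probability $q$ gives a Poisson process of rate $\lambda q$, which is why each effective rate is a sum of products of a shock rate and a fatality probability.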
6.5 Absolutely Continuous Bivariate Exponential Extensions

As we mentioned in a remark in Sect. 6.3, the fact that the BVE lacks a density function limits its applications. There are many possible ways to define bivariate exponential extensions. The bivariate lack of memory property is only one of them, and thus it is worthwhile to see how far we can proceed with it. We will study the absolutely continuous part of the BVE, and show that if we require a bivariate exponential extension to be absolutely continuous and to satisfy the bivariate lack of memory property, then we end up with only the trivial extension: the bivariate exponential extension with independent marginals.

Definition 6.1. A bivariate distribution function $F(x,y)$ is called an absolutely continuous bivariate exponential extension (ACBVE) if it has density function
\[ f(x,y) = \begin{cases} \dfrac{\lambda_1\lambda(\lambda_2 + \lambda_{12})}{\lambda_1 + \lambda_2}\,e^{-\lambda_1 x - \lambda_2 y - \lambda_{12}\max(x,y)}, & 0 < x < y, \\[2mm] \dfrac{\lambda_2\lambda(\lambda_1 + \lambda_{12})}{\lambda_1 + \lambda_2}\,e^{-\lambda_1 x - \lambda_2 y - \lambda_{12}\max(x,y)}, & 0 < y < x, \end{cases} \]
and survival function
\[ G(x,y) = \frac{\lambda}{\lambda_1 + \lambda_2}\,e^{-\lambda_1 x - \lambda_2 y - \lambda_{12}\max(x,y)} - \frac{\lambda_{12}}{\lambda_1 + \lambda_2}\,e^{-\lambda\max(x,y)}, \]
where $\lambda_1, \lambda_2, \lambda_{12} > 0$ and $\lambda = \lambda_1 + \lambda_2 + \lambda_{12}$.

The ACBVE is the absolutely continuous part of the BVE, but it can be derived independently as well.
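Model 1. Failure Affected Model

Consider a two-component system where the failure of one component places strain on the surviving component. Let $X_1$ and $Y_1$ be independent exponential lifetimes with parameters $\alpha$ and $\beta$, which are the initial lifetimes of the components, respectively. Let $X_2$ and $Y_2$ be independent of $X_1$ and $Y_1$, exponentially distributed with parameters $\alpha' > \alpha$ and $\beta' > \beta$, representing the residual lifetimes of the components under the additional strain. Notice that a greater parameter means a smaller expected value, i.e., a lifetime that tends to be shorter. The lifetime of the system is
\[ (X,Y) = \begin{cases} (X_1,\ X_1 + Y_2), & \text{if } X_1 < Y_1, \\ (Y_1 + X_2,\ Y_1), & \text{if } X_1 > Y_1. \end{cases} \]
Let
\[ \alpha = \lambda_1 + \lambda_{12}\,\frac{\lambda_1}{\lambda_1 + \lambda_2}, \quad \beta = \lambda_2 + \lambda_{12}\,\frac{\lambda_2}{\lambda_1 + \lambda_2}, \quad \alpha' = \lambda_1 + \lambda_{12}, \quad \text{and} \quad \beta' = \lambda_2 + \lambda_{12}. \]
We can show that the survival function $G(x,y)$ of $(X,Y)$ is exactly the ACBVE in Definition 6.1.

Theorem 6.5. The distribution of $(X,Y)$ in the failure affected model is ACBVE.

Proof. Let $G(x,y)$ be the survival function of $(X,Y)$. Then,
\[ G(x,y) = P[X_1 > x,\ X_1 + Y_2 > y,\ X_1 < Y_1] + P[Y_1 + X_2 > x,\ Y_1 > y,\ X_1 > Y_1]. \]
We consider the two cases $y < x$ and $y \ge x$. If $y < x$, then $X_1 > x$ implies that $X_1 + Y_2 > y$, so $[X_1 > x,\ X_1 + Y_2 > y] = [X_1 > x]$, and
\[ P[X_1 > x,\ X_1 + Y_2 > y,\ X_1 < Y_1] = \int_x^\infty P[Y_1 > t]\,\alpha e^{-\alpha t}\,dt = \frac{\alpha}{\alpha + \beta}\,e^{-(\alpha+\beta)x}. \]
Next,
\begin{align*}
P[Y_1 + X_2 > x,\ Y_1 > y,\ X_1 > Y_1] &= \int_y^\infty P[X_2 > x - t,\ X_1 > t]\,\beta e^{-\beta t}\,dt \\
&= \int_y^x e^{-\alpha t}\,e^{-\alpha'(x-t)}\,\beta e^{-\beta t}\,dt + \int_x^\infty e^{-\alpha t}\,\beta e^{-\beta t}\,dt \\
&= \int_y^x \beta e^{-\alpha' x}\,e^{-(\alpha+\beta-\alpha')t}\,dt + \int_x^\infty \beta e^{-(\alpha+\beta)t}\,dt \\
&= \frac{\beta e^{-\alpha' x}}{\alpha+\beta-\alpha'}\left(e^{-(\alpha+\beta-\alpha')y} - e^{-(\alpha+\beta-\alpha')x}\right) + \frac{\beta}{\alpha+\beta}\,e^{-(\alpha+\beta)x} \\
&= \frac{\beta}{\alpha+\beta-\alpha'}\,e^{-\alpha' x}\,e^{-(\alpha+\beta-\alpha')y} + \left(\frac{\beta}{\alpha+\beta} - \frac{\beta}{\alpha+\beta-\alpha'}\right)e^{-(\alpha+\beta)x}.
\end{align*}
Hence,
\[ G(x,y) = \frac{\beta}{\alpha+\beta-\alpha'}\,e^{-\alpha' x}\,e^{-(\alpha+\beta-\alpha')y} + \left(1 - \frac{\beta}{\alpha+\beta-\alpha'}\right)e^{-(\alpha+\beta)x}. \]
If $x \le y$, then we have
\begin{align*}
P[X_1 > x,\ X_1 + Y_2 > y,\ X_1 < Y_1] &= \int_x^\infty P[Y_2 > y - t,\ Y_1 > t]\,\alpha e^{-\alpha t}\,dt \\
&= \int_x^y \alpha e^{-\beta' y}\,e^{-(\alpha+\beta-\beta')t}\,dt + \int_y^\infty \alpha e^{-(\alpha+\beta)t}\,dt \\
&= \frac{\alpha}{\alpha+\beta-\beta'}\,e^{-\beta' y}\,e^{-(\alpha+\beta-\beta')x} + \left(\frac{\alpha}{\alpha+\beta} - \frac{\alpha}{\alpha+\beta-\beta'}\right)e^{-(\alpha+\beta)y},
\end{align*}
and
\[ P[Y_1 + X_2 > x,\ Y_1 > y,\ X_1 > Y_1] = \int_y^\infty P[X_1 > t]\,\beta e^{-\beta t}\,dt = \frac{\beta}{\alpha+\beta}\,e^{-(\alpha+\beta)y}, \]
so
\[ G(x,y) = \frac{\alpha}{\alpha+\beta-\beta'}\,e^{-\beta' y}\,e^{-(\alpha+\beta-\beta')x} + \left(1 - \frac{\alpha}{\alpha+\beta-\beta'}\right)e^{-(\alpha+\beta)y}. \]
With $\alpha$, $\beta$, $\alpha'$, and $\beta'$ chosen as above and $\lambda = \lambda_1 + \lambda_2 + \lambda_{12}$, we have
\[ \alpha + \beta = \lambda_1 + \lambda_2 + \lambda_{12} = \lambda, \quad \alpha + \beta - \alpha' = \lambda_2, \quad \text{and} \quad \alpha + \beta - \beta' = \lambda_1. \]
Therefore
\begin{align*}
G(x,y) &= \begin{cases} \dfrac{\beta}{\alpha+\beta-\alpha'}\,e^{-\alpha' x - (\alpha+\beta-\alpha')y} + \left(1 - \dfrac{\beta}{\alpha+\beta-\alpha'}\right)e^{-(\alpha+\beta)x}, & y < x, \\[2mm] \dfrac{\alpha}{\alpha+\beta-\beta'}\,e^{-(\alpha+\beta-\beta')x - \beta' y} + \left(1 - \dfrac{\alpha}{\alpha+\beta-\beta'}\right)e^{-(\alpha+\beta)y}, & x < y, \end{cases} \\
&= \begin{cases} \dfrac{\lambda}{\lambda_1+\lambda_2}\,e^{-(\lambda_1+\lambda_{12})x - \lambda_2 y} - \dfrac{\lambda_{12}}{\lambda_1+\lambda_2}\,e^{-\lambda x}, & y < x, \\[2mm] \dfrac{\lambda}{\lambda_1+\lambda_2}\,e^{-\lambda_1 x - (\lambda_2+\lambda_{12})y} - \dfrac{\lambda_{12}}{\lambda_1+\lambda_2}\,e^{-\lambda y}, & x < y, \end{cases}
\end{align*}
and the theorem follows. $\square$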
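Theorem 6.5 can be illustrated by simulation. The sketch below, with hypothetical function names and parameter values, samples the failure affected model with the parameter choice $\alpha, \beta, \alpha', \beta'$ given above and compares the empirical joint survival frequency with the ACBVE survival function. As a simplification, only the residual lifetime that is actually needed is drawn, which does not change the distribution of $(X,Y)$.

```python
import math
import random


def acbve_survival(x, y, lam1, lam2, lam12):
    """ACBVE survival function of Definition 6.1."""
    lam = lam1 + lam2 + lam12
    w = lam1 + lam2
    m = max(x, y)
    return (lam / w) * math.exp(-lam1 * x - lam2 * y - lam12 * m) \
        - (lam12 / w) * math.exp(-lam * m)


def sample_failure_affected(lam1, lam2, lam12, rng):
    """One draw of (X, Y) from the failure affected model."""
    w = lam1 + lam2
    alpha = lam1 + lam12 * lam1 / w   # initial rate of component 1
    beta = lam2 + lam12 * lam2 / w    # initial rate of component 2
    alpha_p = lam1 + lam12            # rate of component 1 under strain
    beta_p = lam2 + lam12             # rate of component 2 under strain
    x1 = rng.expovariate(alpha)
    y1 = rng.expovariate(beta)
    if x1 < y1:
        return x1, x1 + rng.expovariate(beta_p)
    return y1 + rng.expovariate(alpha_p), y1


def empirical_survival(x, y, lam1, lam2, lam12, n=200_000, seed=1):
    """Monte Carlo estimate of P[X > x, Y > y]."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        sx, sy = sample_failure_affected(lam1, lam2, lam12, rng)
        if sx > x and sy > y:
            hits += 1
    return hits / n


if __name__ == "__main__":
    args = (1.0, 2.0, 0.2, 0.3, 0.1)
    print(empirical_survival(*args), acbve_survival(*args))
```

Unlike the fatal shock sampler, this construction never produces ties, in line with the absolute continuity of the ACBVE.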
Model 2. Lack of Memory Property

It was shown that if a bivariate distribution has the bivariate lack of memory property, its survival function is of the form
\[ G(x,y) = \begin{cases} e^{-\theta y}\,G_1(x-y), & 0 \le y \le x, \\ e^{-\theta x}\,G_2(y-x), & 0 \le x \le y, \end{cases} \]
for some $\theta > 0$. Notice that the marginal survival functions $G_1(x)$ and $G_2(y)$ of the ACBVE are (by letting $y = 0$ or $x = 0$ in $G(x,y)$)
\[ G_1(x) = \frac{\lambda}{\lambda_1+\lambda_2}\,e^{-(\lambda_1+\lambda_{12})x} - \frac{\lambda_{12}}{\lambda_1+\lambda_2}\,e^{-\lambda x} \]
and
\[ G_2(y) = \frac{\lambda}{\lambda_1+\lambda_2}\,e^{-(\lambda_2+\lambda_{12})y} - \frac{\lambda_{12}}{\lambda_1+\lambda_2}\,e^{-\lambda y}. \]
By choosing $\theta = \lambda$, we can see that $G(x,y)$ is exactly of the above form, i.e., the ACBVE satisfies the bivariate lack of memory property. However, we also notice that $G_1(x)$ and $G_2(y)$ are not survival functions of exponential distributions. This is clarified in the next theorem.

Theorem 6.6. The only absolutely continuous bivariate distribution with the lack of memory property and exponential marginals is the bivariate distribution with independent exponential marginals.

Proof. The bivariate lack of memory property yields that
\[ G(x,y) = \begin{cases} e^{-\theta y}\,G_1(x-y), & 0 \le y \le x, \\ e^{-\theta x}\,G_2(y-x), & 0 \le x \le y. \end{cases} \]
With exponential marginals, we have
\[ G(x,y) = e^{-\lambda_1 x - \lambda_2 y - \lambda_{12}\max(x,y)}. \]
If $G(x,y)$ is absolutely continuous, it follows from Theorem 6.4 that its singular continuous part should be zero, i.e.,
\[ \frac{\lambda_{12}}{\lambda}\,G_s(x,y) = \frac{\lambda_{12}}{\lambda}\,e^{-\lambda\max(x,y)} = 0. \]
But this holds only if $\lambda_{12} = 0$, that is, when $G(x,y)$ is the bivariate distribution with independent exponential marginals. $\square$

Here is the conclusion: the ACBVE has the lack of memory property and is absolutely continuous, but does not have exponential marginals unless it is the bivariate distribution of two independent exponential random variables. This also suggests that extensions via properties other than the lack of memory property are worth exploring. In fact, there are plenty of new and interesting generalizations in other directions.
Problems

Let $(X,Y)$ follow the BVE distribution with $\lambda_1 = 0.2$, $\lambda_2 = 0.3$, and $\lambda_{12} = 0.1$.

1. Write down the joint distribution function $F(x,y)$ of $(X,Y)$.
2. What are the marginal distributions $F_1(x)$ and $F_2(y)$ of $X$ and $Y$?
3. What is the covariance $\mathrm{Cov}(X,Y)$?
4. Find the conditional survival function $P[X > x \mid Y = y]$ of $X$ given $Y = y$.
5. What is the conditional mean $E[X \mid Y = y]$ of $X$ given $Y = y$?
6. Verify that the conditional mean of $X$ given $Y = y$ is increasing in $y$.
7. Verify that the conditional survival function of $X$ given $Y = y$ is increasing in $y$.
8. What is the corresponding ACBVE?
9. What is the singular distribution corresponding to the BVE?
10. What is the distribution of $\min(X,Y)$?
Chapter 7
Association and Dependence
In this chapter, we shall first generalize the concepts of association and dependence in the bivariate case. We shall see that there are a variety of ways of extending the concept of association. Then, we shall consider its applications to lifetime distribution and see how we can generalize the concepts of failure rates and residual life, and thus the notions of multivariate lifetime distribution classes. Finally, we shall discuss the concept of negative association.
7.1 Several Concepts of Association

We shall use the same notation as in Chap. 6. Let two nonnegative random variables $(X,Y)$ have the joint distribution function $F(x,y) = P[X \le x, Y \le y]$ with marginal distributions $F_1(x) = P[X \le x]$ and $F_2(y) = P[Y \le y]$, respectively. Besides the memoryless property, which is the most important property of the bivariate exponential distribution, another notable property is the positive correlation between $X$ and $Y$. We first note the following expression for the covariance between $X$ and $Y$.

Lemma 7.1. For two nonnegative random variables $X$ and $Y$,
\[ \mathrm{Cov}(X,Y) = \int_0^\infty\!\!\int_0^\infty \mathrm{Cov}\left(I_{[X>s]}, I_{[Y>t]}\right)ds\,dt, \]
where $I_{[\cdot]}$ represents the indicator function.

Proof. We first note that
\[ \mathrm{Cov}(I_{[X>s]}, I_{[Y>t]}) = E[I_{[X>s]}I_{[Y>t]}] - E[I_{[X>s]}]\,E[I_{[Y>t]}] = P[X > s, Y > t] - P[X > s]\,P[Y > t]. \]
Next we have
\[ \int_0^\infty\!\!\int_0^\infty E[I_{[X>s]}]\,E[I_{[Y>t]}]\,ds\,dt = \int_0^\infty P[X > s]\,ds\int_0^\infty P[Y > t]\,dt = E[X]\,E[Y], \]
and, by interchanging the order of integration, we have
\begin{align*}
\int_0^\infty\!\!\int_0^\infty E[I_{[X>s]}I_{[Y>t]}]\,ds\,dt &= \int_0^\infty\!\!\int_0^\infty\!\!\int_s^\infty\!\!\int_t^\infty dF(x,y)\,ds\,dt \\
&= \int_0^\infty\!\!\int_0^\infty\!\!\int_0^x\!\!\int_0^y ds\,dt\,dF(x,y) \\
&= \int_0^\infty\!\!\int_0^\infty xy\,dF(x,y) \\
&= E[XY],
\end{align*}
which completes the proof. $\square$
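Lemma 7.1 can be tried out on the BVE of Chap. 6, where both sides are available. The sketch below (function names, the truncation point, and the grid size are illustrative assumptions) approximates the double integral of $P[X>s, Y>t] - P[X>s]P[Y>t]$ by a midpoint rule and compares it with the closed-form covariance $\lambda_{12}/[(\lambda_1+\lambda_{12})(\lambda_2+\lambda_{12})(\lambda_1+\lambda_2+\lambda_{12})]$.

```python
import math


def bve_survival(s, t, lam1, lam2, lam12):
    """Joint survival function of the BVE."""
    return math.exp(-lam1 * s - lam2 * t - lam12 * max(s, t))


def covariance_from_survival(survival, upper=60.0, steps=400):
    """Midpoint-rule value of the double integral in Lemma 7.1.

    `survival(s, t)` must return P[X > s, Y > t]; marginals are read off
    as survival(s, 0) and survival(0, t) for nonnegative lifetimes.
    """
    h = upper / steps
    grid = [(i + 0.5) * h for i in range(steps)]
    g1 = [survival(s, 0.0) for s in grid]  # P[X > s]
    g2 = [survival(0.0, t) for t in grid]  # P[Y > t]
    total = 0.0
    for i, s in enumerate(grid):
        for j, t in enumerate(grid):
            total += survival(s, t) - g1[i] * g2[j]
    return total * h * h


if __name__ == "__main__":
    lam1, lam2, lam12 = 0.2, 0.3, 0.1
    approx = covariance_from_survival(lambda s, t: bve_survival(s, t, lam1, lam2, lam12))
    exact = lam12 / ((lam1 + lam12) * (lam2 + lam12) * (lam1 + lam2 + lam12))
    print(approx, exact)
```

The integrand is nonnegative everywhere for the BVE, which is the positive quadrant dependence discussed next.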
Therefore, we see that a sufficient condition for $\mathrm{Cov}(X,Y) \ge 0$ is
\[ P[X > s, Y > t] \ge P[X > s]\,P[Y > t] \]
for all $s, t$. Along this line, we first give several definitions of bivariate dependence.

Definition 7.1.
(a) $(X,Y)$ is called positive quadrant dependent (PQD) if
\[ P[X \le x, Y \le y] \ge P[X \le x]\,P[Y \le y] \]
for all $x, y$.
(b) $(X,Y)$ is called left tail decreasing (LTD) if for $s \le t$,
\[ P[Y \le y \mid X \le s] \ge P[Y \le y \mid X \le t], \]
i.e., $P[Y \le y \mid X \le x]$ is decreasing in $x$ for any $y$.
(c) $(X,Y)$ is called right tail increasing (RTI) if for $s \le t$,
\[ P[Y > y \mid X > s] \le P[Y > y \mid X > t], \]
i.e., $P[Y > y \mid X > x]$ is increasing in $x$ for any $y$.
(d) $(X,Y)$ is said to be positive regression dependent (PRD) if for $s \le t$,
\[ P[Y \le y \mid X = s] \ge P[Y \le y \mid X = t], \]
i.e., $P[Y \le y \mid X = x]$ is decreasing in $x$ for any $y$.

Next, we list the relationships between these concepts of dependence.

(1) In the bivariate case, PQD is equivalent to
\[ P[X > s, Y > t] \ge P[X > s]\,P[Y > t]. \]
This follows from the fact that
\[ P[X > x, Y > y] = 1 - P[X \le x] - P[Y \le y] + P[X \le x, Y \le y]. \]
(2) RTI or LTD implies PQD. This follows by taking $s < 0$ in the RTI inequality, so that the event $[X > s]$ is certain, or by letting $t \to \infty$ in the LTD inequality.
(3) PRD implies RTI and LTD.

Proof. We first give an equivalent condition for RTI. For $s < t$, the RTI property can be rewritten as
\[ P[Y > y \mid X > t] \ge P[Y > y \mid X > s] = \frac{P[Y > y, X > s]}{P[X > s]} = \frac{P[Y > y, X > t] + P[Y > y, s < X \le t]}{P[X > t] + P[s < X \le t]}. \]
This is equivalent to
\[ P[Y > y, X > t]\,P[s < X \le t] \ge P[Y > y, s < X \le t]\,P[X > t]. \]
That means
\[ P[Y > y \mid X > t] \ge P[Y > y \mid s < X \le t], \]
for $s < t$. Now if $(X,Y)$ are PRD, then for any $z \ge t \ge x \ge s$, we have
\[ P[Y > y \mid X = z] \ge P[Y > y \mid X = x]. \]
By integrating both sides with respect to $dP[X \le z]$ for $\infty > z > t$, we get
\[ P[Y > y, X > t] = \int_t^\infty P[Y > y \mid X = z]\,dP[X \le z] \ge P[Y > y \mid X = x]\,P[X > t]. \]
By further integrating both sides with respect to $dP[X \le x]$ for $s < x \le t$, we have
\[ P[Y > y, X > t]\,P[s < X \le t] \ge P[X > t]\int_s^t P[Y > y \mid X = x]\,dP[X \le x] = P[X > t]\,P[Y > y, s < X \le t]. \]
This means that $(X,Y)$ are RTI. $\square$

Similarly, we can show that LTD is equivalent to
\[ P[Y \le y \mid X \le s] \ge P[Y \le y \mid s < X \le t], \]
for all $s < t$. A similar technique can be used to show that PRD implies LTD.

However, positive covariance alone cannot characterize the association between two variables, as it is based only on the first joint moment of $X$ and $Y$, i.e., on linear functions of $X$ and $Y$. Along this line, we can define the association of $X$ and $Y$ based on the positivity of joint moments of more general functions of $X$ and $Y$. The following is the formal definition of association.

Definition 7.2. $X$ and $Y$ are called associated if for any bivariate increasing functions $g(x,y)$ and $h(x,y)$,
\[ \mathrm{Cov}(g(X,Y), h(X,Y)) \ge 0. \]
We list some properties of association.

Property (1) If $(X,Y)$ are associated, then they are PQD.

In fact, by letting $g(x,y) = I_{[x>s]}$ and $h(x,y) = I_{[y>t]}$, the association implies PQD. But the converse is not true.

Property (2) $X$ is always associated with itself.

In fact, for any increasing functions $g(x)$ and $h(x)$ of $x$, we define
\[ x^* = \inf\{x : g(x) \ge E g(X)\}. \]
Then, we have
\begin{align*}
\mathrm{Cov}[g(X), h(X)] &= E[g(X)h(X)] - E[g(X)]\,E[h(X)] \\
&= E[(g(X) - E[g(X)])\,h(X)] \\
&= E[(g(X) - E[g(X)])(h(X) - h(x^*))].
\end{align*}
Since $g$ and $h$ are increasing, $g(x) > E g(X)$ implies $x \ge x^*$ and hence $h(x) \ge h(x^*)$, while $g(x) < E g(X)$ implies $x \le x^*$ and hence $h(x) \le h(x^*)$. Thus,
\[ (g(x) - E g(X))(h(x) - h(x^*)) \ge 0 \]
for any $x$, which implies $\mathrm{Cov}[g(X), h(X)] \ge 0$.

Property (3) If $X$ and $Y$ are independent, then $X$ and $Y$ are associated.

This is a special case of Theorem 7.1 below, and so we omit the proof.

Property (4) If $X$ and $Y$ are associated and $g(x,y)$ and $h(x,y)$ are increasing functions of $x$ and $y$, then $g(X,Y)$ and $h(X,Y)$ are associated.

This holds because a composition of increasing functions is still an increasing function.

Property (5) If $X$ and $Y$ are associated, then $X$ and $Y$ are independent if and only if $\mathrm{Cov}(X,Y) = 0$.

This is a corollary of the expression
\[ \mathrm{Cov}(X,Y) = \int_0^\infty\!\!\int_0^\infty \left(P[X > x, Y > y] - P[X > x]\,P[Y > y]\right)dx\,dy. \]
Indeed, association implies PQD, so the integrand is nonnegative; if the covariance vanishes, the integrand must vanish for almost all $(x,y)$, which gives independence.
The next theorem gives a relationship between association and PRD.

Theorem 7.1. If $X$ and $Y$ are PRD, then $X$ and $Y$ are associated.

Proof. Denoting by $E_X[\cdot]$ and $E_{Y|X}[\cdot]$ the marginal expectation with respect to $X$ and the conditional expectation given $X$, respectively, we have
\begin{align*}
\mathrm{Cov}[g(X,Y), h(X,Y)] &= E[g(X,Y)h(X,Y)] - E[g(X,Y)]\,E[h(X,Y)] \\
&= E_X[E_{Y|X}(g(X,Y)h(X,Y))] - E_X[E_{Y|X}g(X,Y)]\,E_X[E_{Y|X}h(X,Y)] \\
&= E_X[E_{Y|X}(g(X,Y)h(X,Y))] - E_X[(E_{Y|X}g(X,Y))(E_{Y|X}h(X,Y))] \\
&\quad + E_X[(E_{Y|X}g(X,Y))(E_{Y|X}h(X,Y))] - E_X[E_{Y|X}g(X,Y)]\,E_X[E_{Y|X}h(X,Y)] \\
&= E_X[\mathrm{Cov}_{Y|X}(g(X,Y), h(X,Y))] + \mathrm{Cov}_X[E_{Y|X}g(X,Y), E_{Y|X}h(X,Y)].
\end{align*}
Since $g(x,y)$ and $h(x,y)$ are increasing, we see that given $X = x$, $g(x,y)$ and $h(x,y)$ are still increasing in $y$. Thus, from Property (2),
\[ \mathrm{Cov}_{Y|X=x}[g(x,Y), h(x,Y)] \ge 0. \]
This implies
\[ E_X[\mathrm{Cov}_{Y|X}(g(X,Y), h(X,Y))] \ge 0. \]
To show
\[ \mathrm{Cov}_X[E_{Y|X}g(X,Y), E_{Y|X}h(X,Y)] \ge 0, \]
we first note that from the PRD property, $P[Y > y \mid X = x]$ is increasing in $x$. Thus, since $g(x,y)$ is increasing in $y$, for any given $x'$, $P[g(x', Y) > z \mid X = x]$ is increasing in $x$, which in turn, since $g$ is also increasing in its first argument, implies that $P[g(x, Y) > z \mid X = x]$ is increasing in $x$ upon replacing $x'$ with $x$. Thus, for nonnegative $g$,
\[ E_{Y|X=x}\,g(x,Y) = \int_0^\infty P[g(x,Y) > z \mid X = x]\,dz \]
is increasing in $x$. The same is true for $E_{Y|X=x}\,h(x,Y)$. Therefore, again by Property (2),
\[ \mathrm{Cov}_X[E_{Y|X}g(X,Y), E_{Y|X}h(X,Y)] \ge 0. \quad \square \]

The generalization to the multivariate case can be naturally defined as follows and has been used to model the lifetimes of components in coherent reliability systems.
Definition 7.3. A random vector $\mathbf{X} = (X_1, \ldots, X_n)$ is called associated if for any increasing functions $g(\mathbf{x})$ and $h(\mathbf{x})$ of $\mathbf{x} = (x_1, \ldots, x_n)$,
\[ \mathrm{Cov}[g(\mathbf{X}), h(\mathbf{X})] \ge 0. \]
We add two more properties for associated random vectors; their proofs are left as exercises.

Property (6) Any subset of associated random variables is associated.

Property (7) If two sets of associated random variables are independent of one another, then their union is a set of associated variables.

Example 7.1. If $(X,Y)$ follows the bivariate exponential distribution (BVE), then $(X,Y)$ are associated. In fact, suppose $Y_1$, $Y_2$, and $Y_3$ are three independent exponential random variables with parameters $\lambda_1$, $\lambda_2$, and $\lambda_{12}$, respectively. Then $Y_1$, $Y_2$, and $Y_3$ are associated. Moreover, $(X,Y)$ has the same distribution as $(\min(Y_1, Y_3), \min(Y_2, Y_3))$, whose components are increasing functions of $(Y_1, Y_2, Y_3)$, and thus are associated.
7.2 MTP2 Distribution

It is usually difficult to show association directly from the definition. Some sufficient conditions for association lead to a smaller class of associated random variables, but these conditions are much easier to verify. The MTP$_2$ property introduced by Karlin and Rinott (1980) is one of these conditions. For simplicity, we limit our presentation to the 2-dimensional case.

Let $(X,Y)$ be a 2-dimensional continuous random variable with density $f(x,y)$.

Definition 7.4. $(X,Y)$ is called MTP$_2$ if for any $x_i$ and $y_i$, $i = 1, 2$,
\[ f(x_1, y_1)\,f(x_2, y_2) \le f(x_1 \wedge x_2, y_1 \wedge y_2)\,f(x_1 \vee x_2, y_1 \vee y_2), \]
where $x \wedge y = \min(x,y)$ and $x \vee y = \max(x,y)$.

We first present two lemmas and a theorem before we show the association of MTP$_2$ variables.

Lemma 7.2. Let $f_i(x)$ be univariate nonnegative functions for $i = 1, 2, 3, 4$ satisfying
\[ f_1(x_1)\,f_2(x_2) \le f_3(x_1 \wedge x_2)\,f_4(x_1 \vee x_2). \]
Then
\[ \int f_1(x)\,dx\int f_2(x)\,dx \le \int f_3(x)\,dx\int f_4(x)\,dx. \]

Proof. We first write
\begin{align*}
\int f_1(x)\,dx\int f_2(x)\,dx &= \iint f_1(x_1)f_2(x_2)\,dx_1\,dx_2 \\
&= \iint_{x_1 < x_2} f_1(x_1)f_2(x_2)\,dx_1\,dx_2 + \iint_{x_1 > x_2} f_1(x_1)f_2(x_2)\,dx_1\,dx_2 \\
&= \iint_{x_1 < x_2} [f_1(x_1)f_2(x_2) + f_1(x_2)f_2(x_1)]\,dx_1\,dx_2.
\end{align*}
Similarly,
\[ \int f_3(x)\,dx\int f_4(x)\,dx = \iint_{x_1 < x_2} [f_3(x_1)f_4(x_2) + f_3(x_2)f_4(x_1)]\,dx_1\,dx_2. \]
From the assumed inequality, we know that for $x_1 \le x_2$,
\[ f_1(x_1)f_2(x_2) \le f_3(x_1)f_4(x_2), \qquad f_1(x_2)f_2(x_1) \le f_3(x_1)f_4(x_2). \]
By letting $x_2 = x_1$ in the first inequality and $x_1 = x_2$ in the second and multiplying the two sides, we have
\[ f_1(x_1)f_2(x_1)f_1(x_2)f_2(x_2) \le f_3(x_1)f_4(x_1)f_3(x_2)f_4(x_2). \]
Thus,
\begin{align*}
0 &\le [f_3(x_1)f_4(x_2) - f_1(x_2)f_2(x_1)]\,[f_3(x_1)f_4(x_2) - f_1(x_1)f_2(x_2)] \\
&= f_3(x_1)f_4(x_2)[f_3(x_1)f_4(x_2) - f_1(x_1)f_2(x_2)] - f_1(x_2)f_2(x_1)[f_3(x_1)f_4(x_2) - f_1(x_1)f_2(x_2)] \\
&= f_3(x_1)f_4(x_2)[f_3(x_1)f_4(x_2) - f_1(x_1)f_2(x_2)] - f_1(x_2)f_2(x_1)f_3(x_1)f_4(x_2) \\
&\quad + f_1(x_2)f_2(x_1)f_1(x_1)f_2(x_2) \\
&\le f_3(x_1)f_4(x_2)[f_3(x_1)f_4(x_2) - f_1(x_1)f_2(x_2)] - f_1(x_2)f_2(x_1)f_3(x_1)f_4(x_2) \\
&\quad + f_3(x_1)f_4(x_1)f_3(x_2)f_4(x_2) \\
&= f_3(x_1)f_4(x_2)[f_3(x_1)f_4(x_2) - f_1(x_1)f_2(x_2)] + f_3(x_1)f_4(x_2)[f_3(x_2)f_4(x_1) - f_1(x_2)f_2(x_1)] \\
&= f_3(x_1)f_4(x_2)[f_3(x_1)f_4(x_2) + f_3(x_2)f_4(x_1) - f_1(x_1)f_2(x_2) - f_1(x_2)f_2(x_1)].
\end{align*}
If $f_3(x_1)f_4(x_2) = 0$, then both $f_1(x_1)f_2(x_2)$ and $f_1(x_2)f_2(x_1)$ vanish as well; otherwise we may divide by $f_3(x_1)f_4(x_2) > 0$. That means,
\[ f_3(x_1)f_4(x_2) + f_3(x_2)f_4(x_1) \ge f_1(x_1)f_2(x_2) + f_1(x_2)f_2(x_1), \]
which completes the proof. $\square$

Lemma 7.3. Let $f_i(x,y)$ be nonnegative functions for $i = 1, 2, 3, 4$ satisfying
\[ f_1(x_1, y_1)\,f_2(x_2, y_2) \le f_3(x_1 \wedge x_2, y_1 \wedge y_2)\,f_4(x_1 \vee x_2, y_1 \vee y_2). \]
Then
\[ \iint f_1(x,y)\,dx\,dy\iint f_2(x,y)\,dx\,dy \le \iint f_3(x,y)\,dx\,dy\iint f_4(x,y)\,dx\,dy. \]

Proof. Define the marginal integral as
\[ \varphi_i(x) = \int f_i(x,y)\,dy, \]
for $i = 1, 2, 3, 4$. Then we can show that
\[ \varphi_1(x_1)\,\varphi_2(x_2) \le \varphi_3(x_1 \wedge x_2)\,\varphi_4(x_1 \vee x_2), \]
which is equivalent to
\begin{align*}
&\iint_{y_1 < y_2} \left(f_1(x_1, y_1)f_2(x_2, y_2) + f_1(x_1, y_2)f_2(x_2, y_1)\right)dy_1\,dy_2 \\
&\quad \le \iint_{y_1 < y_2} \left(f_3(x_1 \wedge x_2, y_1)f_4(x_1 \vee x_2, y_2) + f_3(x_1 \wedge x_2, y_2)f_4(x_1 \vee x_2, y_1)\right)dy_1\,dy_2.
\end{align*}
Indeed, similar to the proof of the previous lemma, we can show that the inequality holds for the integrands for any fixed $x_1, x_2$. Then from the previous lemma, we have
\[ \int \varphi_1(x)\,dx\int \varphi_2(x)\,dx \le \int \varphi_3(x)\,dx\int \varphi_4(x)\,dx, \]
which is equivalent to the desired result. $\square$

Theorem 7.2. If $f_1(x,y)$ and $f_2(x,y)$ are two density functions such that
\[ f_1(x_1, y_1)\,f_2(x_2, y_2) \le f_1(x_1 \wedge x_2, y_1 \wedge y_2)\,f_2(x_1 \vee x_2, y_1 \vee y_2), \]
then for any increasing function $g(x,y)$,
\[ E_1 g(X,Y) \le E_2 g(X,Y), \]
where $E_i[\cdot]$ denotes the expectation under density $f_i$, for $i = 1, 2$.
7.3 Multivariate Failure Rate and Distribution Class
149
Proof. Let f1 .x; y/ D g.x; y/f1 .x; y/;
f2 .x; y/ D f2 .x; y/;
f3 .x; y/ D f1 .x; y/; and f4 .x; y/ D g.x; y/f2 .x; y/: Then, fi for i D 1; 2; 3; 4 satisfy the condition of the previous lemma and the result follows. t u Now we are ready to show that the MTP2 property is sufficient for association. Theorem 7.3. If .X; Y / are MTP2 , then .X; Y / are associated. Proof. For another increasing function h.x; y/, let f1 .x; y/ D f .x; y/ and f2 .x; y/ D
1 h.x; y/f .x; y/; Eh.X; Y /
where Eg.X; Y / denotes the expected value under density f .x; y/. Then f1 and f2 satisfy the condition for the previous theorem. Thus, Eg.X; Y / D E1 g.X; Y / E2 g.X; Y / D
EŒg.X; Y /h.X; Y / ; Eh.X; Y /
that means, EŒg.X; Y /h.X; Y / Eg.X; Y /Eh.X; Y /: t u Remark. From the above discussion, we see all the results hold for the absolute continuous part of BVE. Since the singular part only involves the case when x D y, which does not affect the inequalities, thus, we see that the results hold for the BVE as well. That means, the BVE is MTP2 and thus are associated. The details are omitted here.
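As a numerical illustration of Theorem 7.3, the following minimal sketch checks the MTP2 inequality and the sign of a covariance on a small grid. It assumes a discrete analogue of the density setting: the probability mass function $p(x, y) \propto e^{\theta x y}$ on $\{0, \ldots, 3\}^2$, the value $\theta = 0.5$, and the two increasing test functions are illustrative choices, not taken from the text.

```python
import itertools
import math

def is_mtp2(p, n):
    """Check p(u) p(v) <= p(u ^ v) p(u v v) for all grid points u, v."""
    pts = list(itertools.product(range(n), repeat=2))
    for (x1, y1), (x2, y2) in itertools.product(pts, repeat=2):
        lo = (min(x1, x2), min(y1, y2))
        hi = (max(x1, x2), max(y1, y2))
        # Small relative slack guards against floating-point ties.
        if p[(x1, y1)] * p[(x2, y2)] > p[lo] * p[hi] * (1 + 1e-12):
            return False
    return True

def covariance(p, g, h):
    """Cov[g(X, Y), h(X, Y)] under the pmf p."""
    eg = sum(w * g(x, y) for (x, y), w in p.items())
    eh = sum(w * h(x, y) for (x, y), w in p.items())
    egh = sum(w * g(x, y) * h(x, y) for (x, y), w in p.items())
    return egh - eg * eh

n, theta = 4, 0.5                      # assumed grid size and dependence parameter
weights = {(x, y): math.exp(theta * x * y) for x in range(n) for y in range(n)}
total = sum(weights.values())
p = {k: w / total for k, w in weights.items()}

g = lambda x, y: x + y                 # increasing in both arguments
h = lambda x, y: max(x, y)             # increasing in both arguments
cov_gh = covariance(p, g, h)
```

Under these assumptions `is_mtp2(p, n)` holds and `cov_gh` is nonnegative, in line with the inequality $E[gh] \ge Eg\,Eh$.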
7.3 Multivariate Failure Rate and Distribution Class

In this section, we restrict our discussion to the case when $(X, Y)$ are two possibly dependent lifetimes which start from the same original time. For notational convenience, we denote for $0 < s < t$
\[ h_t^{(0)} = \{X > t, Y > t\}, \qquad h_t^{(1)}(s) = \{X = s, Y > t\}, \qquad h_t^{(2)}(s) = \{X > t, Y = s\}, \]
which represent the three possible forms of history up to time $t$ if at least one component is still alive.
We only consider the case when $(X, Y)$ are continuous variables with joint density function $f(x, y)$. We first introduce the hazard rates. Define
\[ r_2(t) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} P[Y \in (t, t + \Delta t] \mid h_t^{(0)}] = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \frac{P[X > t;\, Y \in (t, t + \Delta t]]}{P[X > t;\, Y > t]} = \frac{\int_t^\infty f(x, t)\,dx}{\int_t^\infty \int_t^\infty f(x, y)\,dx\,dy}, \]
and for $t > s$,
\[ r_2(t|s) = \lim_{\Delta t \to 0} \frac{1}{\Delta t} P[Y \in (t, t + \Delta t] \mid h_t^{(1)}(s)] = \lim_{\Delta t \to 0} \frac{1}{\Delta t} \frac{P[X = s;\, Y \in (t, t + \Delta t]]}{P[X = s;\, Y > t]} = \frac{f(s, t)}{\int_t^\infty f(s, y)\,dy}. \]
Similarly, we can define $r_1(t)$ and $r_1(t|s)$.

Definition 7.5.
(a) $Y$ is called having hazard increased by failure (HIF) of $X$, if for any $s_1 \le s_2 \le t$,
\[ r_2(t) \le r_2(t|s_2) \le r_2(t|s_1); \]
(b) $X$ is called having hazard increased by failure of $Y$, if for any $s_1 \le s_2 \le t$,
\[ r_1(t) \le r_1(t|s_2) \le r_1(t|s_1); \]
(c) $(X, Y)$ are called having hazard increased by failures if both (a) and (b) hold.

That means, the hazard rate of one component increases if the other component fails earlier. The conditions in this definition are quite strong as all histories up to time $t$ are involved. The following definition gives another class which is only based on the history at time $t$.

Definition 7.6.
(a) $Y$ is called weakened by failure of $X$ (WBF) if for any $t, y > 0$,
\[ P[Y \ge t + y \mid h_t^{(0)}] \ge P[Y \ge t + y \mid h_t^{(1)}(t)] = P[Y \ge t + y \mid X = t;\, Y > t]; \]
(b) $X$ is called weakened by failure of $Y$ if for any $t, x > 0$,
\[ P[X \ge t + x \mid h_t^{(0)}] \ge P[X \ge t + x \mid h_t^{(2)}(t)] = P[X \ge t + x \mid X > t;\, Y = t]; \]
(c) $(X, Y)$ are called weakened by failures if both (a) and (b) hold.

Obviously, HIF implies WBF.

Theorem 7.4. If $(X, Y)$ are MTP2, then $(X, Y)$ are HIF.

Proof. The MTP2 property states that for any $t \le x$, $s \le y$,
\[ f(x, s) f(t, y) \le f(t, s) f(x, y). \]
By letting $s_1 = s \le s_2 = y$, we have
\[ f(t, s_1) f(x, s_2) \ge f(t, s_2) f(x, s_1). \]
Integrating both sides with respect to $x$ from $t$ to $\infty$, we get
\[ \int_t^\infty f(t, s_1) f(x, s_2)\,dx \ge \int_t^\infty f(t, s_2) f(x, s_1)\,dx. \]
This is equivalent to
\[ \frac{f(t, s_1)}{\int_t^\infty f(x, s_1)\,dx} \ge \frac{f(t, s_2)}{\int_t^\infty f(x, s_2)\,dx}, \]
which is $r_1(t|s_1) \ge r_1(t|s_2)$. Similarly, for $s < t$, by integrating $x$ and $y$ from $t$ to $\infty$, we get
\[ \int_t^\infty \int_t^\infty f(x, s) f(t, y)\,dx\,dy \le \int_t^\infty \int_t^\infty f(t, s) f(x, y)\,dx\,dy, \]
which is equivalent to
\[ \frac{\int_t^\infty f(t, y)\,dy}{\int_t^\infty \int_t^\infty f(x, y)\,dx\,dy} \le \frac{f(t, s)}{\int_t^\infty f(x, s)\,dx}. \]
That means, $r_1(t) \le r_1(t|s)$. The corresponding inequalities for $r_2$ follow by symmetry, so $f(x, y)$ is HIF. □

The last theorem gives the relationship between WBF and association; the proof will not be given here.

Theorem 7.5. If $(X, Y)$ are WBF, then $(X, Y)$ are associated.
7.4 Negative Association

The generalization of association to negative association needs more careful treatment. We directly give the definition in the multivariate case. For a random vector $\mathbf{X} = (X_1, \ldots, X_n)$, let $A$ denote a subset of the whole index set $\{1, \ldots, n\}$, i.e., $A \subset \{1, \ldots, n\}$, and let $A^C$ denote the complement of $A$ in $\{1, \ldots, n\}$.
Definition 7.7. $\mathbf{X}$ is called negatively associated (NA) if
\[ \mathrm{Cov}[g(X_i, i \in A),\, h(X_j, j \in A^C)] \le 0, \]
for every $A \subset \{1, \ldots, n\}$, where $g(\cdot)$ and $h(\cdot)$ are increasing functions.

The following simple properties are of special interest.

Property 1. Two variables $X$ and $Y$ are negatively associated if and only if
\[ P[X \le x, Y \le y] \le P[X \le x]\, P[Y \le y]. \]
That means, $X$ and $Y$ are negatively quadrant dependent. Indeed, the negative association of $X$ and $Y$ is equivalent to
\[ \mathrm{Cov}[g(X), h(Y)] \le 0, \]
for all increasing functions $g(x)$ and $h(y)$. Since any increasing function is a limit of mixtures of indicator functions of the type $I_{[x > a]}$, the negative association is equivalent to
\[ \mathrm{Cov}[I_{[X > x]}, I_{[Y > y]}] \le 0, \]
or
\[ P[X > x, Y > y] \le P[X > x]\, P[Y > y] \]
for all $x, y$, which is the expected result.

Property 2. For disjoint subsets $A_1, \ldots, A_m$ of $\{1, \ldots, n\}$ and increasing functions $g_1, \ldots, g_m$, $\mathbf{X}$ being NA implies
\[ E\left[\prod_{i=1}^m g_i(\mathbf{X}_{A_i})\right] \le \prod_{i=1}^m E g_i(\mathbf{X}_{A_i}), \]
where $\mathbf{X}_{A_i} = (X_j, j \in A_i)$. This follows by induction.

Property 3. If $\mathbf{X}$ is NA, then it is negatively orthant dependent, i.e.,
\[ P[X_1 > x_1, \ldots, X_n > x_n] \le P[X_1 > x_1] \cdots P[X_n > x_n]. \]

Property 4. A subset of NA random variables is NA.

Property 5. If $\mathbf{X}$ consists of independent components, then it is NA.

Property 6. Increasing functions defined on disjoint subsets of a set of NA random variables are NA.
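Property 7. If $\mathbf{X}$ is NA, $\mathbf{Y}$ is NA, and $\mathbf{X}$ and $\mathbf{Y}$ are independent, then the enlarged random vector $(\mathbf{X}, \mathbf{Y})$ is NA.

Before we state a sufficient condition for negative association, we first present a lemma for the conditional covariance, defined as
\[ \mathrm{Cov}[X, Y \mid F] = E[XY \mid F] - E[X \mid F]\, E[Y \mid F], \]
where $F$ is a given condition generated from some known events.

Lemma 7.4. Let $X$ and $Y$ be two real variables and let $F_1 \subset F_2$ be two conditions defined on $\Omega$ such that Condition $F_1$ implies Condition $F_2$. Then
\[ \mathrm{Cov}[X, Y \mid F_1] = E[\mathrm{Cov}[X, Y \mid F_2] \mid F_1] + \mathrm{Cov}[E[X \mid F_2], E[Y \mid F_2] \mid F_1]. \]
In particular, if $F_1$ is the trivial condition generated by only the whole sample space itself, then
\[ \mathrm{Cov}[X, Y] = E[\mathrm{Cov}[X, Y \mid F_2]] + \mathrm{Cov}[E[X \mid F_2], E[Y \mid F_2]]. \]

Proof. The right-hand side of the equality can be rewritten as
\begin{align*}
& E[E[XY \mid F_2] - E[X \mid F_2] E[Y \mid F_2] \mid F_1] + E[E[X \mid F_2] E[Y \mid F_2] \mid F_1] - E[E[X \mid F_2] \mid F_1]\, E[E[Y \mid F_2] \mid F_1] \\
&\quad = E[E[XY \mid F_2] \mid F_1] - E[E[X \mid F_2] \mid F_1]\, E[E[Y \mid F_2] \mid F_1] \\
&\quad = E[XY \mid F_1] - E[X \mid F_1]\, E[Y \mid F_1] \\
&\quad = \mathrm{Cov}[X, Y \mid F_1],
\end{align*}
where in the last two equalities we use the fact that $E[E[XY \mid F_2] \mid F_1] = E[XY \mid F_1]$, and similarly for $X$ and $Y$. □

Theorem 7.6. Let $X_1, \ldots, X_n$ be independent, and suppose that
\[ E\left[g(\mathbf{X}_A) \,\Big|\, \sum_{i \in A} X_i\right] \]
is increasing in $\sum_{i \in A} X_i$ for every increasing function $g(\cdot)$ and every $A \subset \{1, \ldots, n\}$. Then the conditional distribution of $(\mathbf{X} \mid \sum_{i=1}^n X_i = s)$ is NA, for almost all $s$.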
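Proof. Let $A \subset \{1, \ldots, n\}$,
\[ S_1 = \sum_{i \in A} X_i, \quad S_2 = \sum_{i \in A^C} X_i, \quad \text{and} \quad S = S_1 + S_2. \]
For given $S = s$ and increasing functions $g(\mathbf{x}_A)$ and $h(\mathbf{x}_{A^C})$, we have from Lemma 7.4
\begin{align*}
\mathrm{Cov}[g(\mathbf{X}_A), h(\mathbf{X}_{A^C}) \mid S = s] &= \mathrm{Cov}[E[g(\mathbf{X}_A) \mid S_1, S_2],\, E[h(\mathbf{X}_{A^C}) \mid S_1, S_2] \mid S = s] \\
&\quad + E[\mathrm{Cov}[g(\mathbf{X}_A), h(\mathbf{X}_{A^C}) \mid S_1, S_2] \mid S = s].
\end{align*}
Since $\mathbf{X}_A$ and $\mathbf{X}_{A^C}$ are independent,
\[ \mathrm{Cov}[g(\mathbf{X}_A), h(\mathbf{X}_{A^C}) \mid S_1, S_2] = 0. \]
By defining $\phi(S_1) = E[g(\mathbf{X}_A) \mid S_1]$ and $\psi(S_1) = E[h(\mathbf{X}_{A^C}) \mid S_2 = s - S_1]$, we have
\[ \mathrm{Cov}[g(\mathbf{X}_A), h(\mathbf{X}_{A^C}) \mid S = s] = \mathrm{Cov}[\phi(S_1), \psi(S_1) \mid S = s]. \]
From the condition of the theorem, $\phi$ is increasing in $S_1$ and $\psi$ is decreasing in $S_1$. Note that a single variable is always associated with itself. Thus, we see that
\[ \mathrm{Cov}[g(\mathbf{X}_A), h(\mathbf{X}_{A^C}) \mid S = s] \le 0. \] □

The importance of the above theorem lies in the following important property of PF2 density functions (Efron 1965). The definition of PF2 is given in Chap. 5.

Theorem 7.7. Let $X_1, \ldots, X_n$ be independent random variables with PF2 density functions $f_1(x), \ldots, f_n(x)$. Then for every increasing function $g(\mathbf{x})$, $E[g(\mathbf{X}) \mid S = s]$ is increasing in (almost all) $s$, where $S = \sum_{i=1}^n X_i$, and thus $(\mathbf{X} \mid S = s)$ is NA for almost all $s$.

Proof. We first prove the result for the two-variable case; the proof for the general case will be completed by induction. For $n = 2$ and $0 < \alpha < 1$, let $x_{\alpha,s}$ and $z_{\alpha,s} = s - x_{\alpha,s}$ be defined through the $100\alpha\%$-quantile of the conditional distribution of $X_1$ given $X_1 + X_2 = s$. That means,
\[ \frac{\int_0^{x_{\alpha,s}} f_1(x) f_2(s - x)\,dx}{\int_0^\infty f_1(x) f_2(s - x)\,dx} = \alpha. \]
Or equivalently,
\[ \frac{\int_0^{x_{\alpha,s}} f_1(x) f_2(s - x)\,dx}{\int_{x_{\alpha,s}}^\infty f_1(x) f_2(s - x)\,dx} = \frac{\alpha}{1 - \alpha}. \]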
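From the PF2 property, we have for $\delta \ge 0$ and $x \le x_{\alpha,s}$,
\[ \frac{f_2(s + \delta - x)}{f_2(s + \delta - x_{\alpha,s})} \le \frac{f_2(s - x)}{f_2(s - x_{\alpha,s})}, \]
with the inequality reversed for $x \ge x_{\alpha,s}$. Thus,
\begin{align*}
\frac{\int_0^{x_{\alpha,s}} f_1(x) f_2(s + \delta - x)\,dx}{\int_{x_{\alpha,s}}^\infty f_1(x) f_2(s + \delta - x)\,dx}
&= \frac{\int_0^{x_{\alpha,s}} f_1(x) \dfrac{f_2(s + \delta - x)}{f_2(s + \delta - x_{\alpha,s})}\,dx}{\int_{x_{\alpha,s}}^\infty f_1(x) \dfrac{f_2(s + \delta - x)}{f_2(s + \delta - x_{\alpha,s})}\,dx} \\
&\le \frac{\int_0^{x_{\alpha,s}} f_1(x) \dfrac{f_2(s - x)}{f_2(s - x_{\alpha,s})}\,dx}{\int_{x_{\alpha,s}}^\infty f_1(x) \dfrac{f_2(s - x)}{f_2(s - x_{\alpha,s})}\,dx}
= \frac{\alpha}{1 - \alpha}.
\end{align*}
Due to the monotone property of $\alpha/(1 - \alpha)$ in $\alpha$, we get $x_{\alpha,s+\delta} \ge x_{\alpha,s}$, and by the same argument applied to $f_1$, $z_{\alpha,s+\delta} \ge z_{\alpha,s}$. By making a change of variable from $x$ to $\alpha$, we have
\begin{align*}
E[g(X_1, s + \delta - X_1) \mid S = s + \delta]
&= \frac{\int_0^{s+\delta} g(x, s + \delta - x) f_1(x) f_2(s + \delta - x)\,dx}{\int_0^{s+\delta} f_1(x) f_2(s + \delta - x)\,dx} \\
&= \int_0^1 g(x_{\alpha,s+\delta},\, s + \delta - x_{\alpha,s+\delta})\,d\alpha \\
&\ge \int_0^1 g(x_{\alpha,s},\, s - x_{\alpha,s})\,d\alpha \\
&= E[g(X_1, s - X_1) \mid S = s].
\end{align*}
Suppose the result is true for $n - 1$. From the closure property of PF2 density functions (IFR property), the marginal density function of $\sum_{i=1}^{n-1} X_i$,
\[ f^{(n-1)}(t) = \int_0^\infty \cdots \int_0^\infty f_1(x_1) \cdots f_{n-1}\left(t - \sum_{i=1}^{n-2} x_i\right) dx_1 \cdots dx_{n-2}, \]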
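is also PF2. Define the conditional mean of $g(\mathbf{X})$ given $T = \sum_{i=1}^{n-1} X_i = t$ and $X_n = u$ as
\begin{align*}
h(t, u) &= E[g(\mathbf{X}) \mid T = t, X_n = u] \\
&= \frac{1}{f^{(n-1)}(t)} \int_0^\infty \cdots \int_0^\infty g\left(x_1, \ldots, x_{n-2},\, t - \sum_{i=1}^{n-2} x_i,\, u\right) f_1(x_1) \cdots f_{n-1}\left(t - \sum_{i=1}^{n-2} x_i\right) dx_1 \cdots dx_{n-2}.
\end{align*}
By induction, $h(t, u)$ is increasing in $t$, and it is also increasing in $u$ by definition. Thus, by the argument in the case $n = 2$, we see that
\[ \frac{\int_0^s h(t, s - t) f^{(n-1)}(t) f_n(s - t)\,dt}{\int_0^s f^{(n-1)}(t) f_n(s - t)\,dt} \]
is increasing in $s$. This means that
\begin{align*}
E[h(T, X_n) \mid T + X_n = s] &= E\left[E\left[g(X_1, \ldots, X_n) \,\Big|\, \sum_{i=1}^{n-1} X_i = T, X_n\right] \,\Big|\, T + X_n = s\right] \\
&= E[g(X_1, \ldots, X_n) \mid X_1 + \cdots + X_n = s]
\end{align*}
is increasing in $s$. □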
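The negative association asserted by Theorem 7.7 can be checked by exact enumeration in a small discrete case. The sketch below assumes a discrete analogue of the theorem: three independent binomial variables (whose probability mass functions are log-concave) conditioned on their sum. The binomial parameters, the conditioning value $s = 4$, and the two increasing test functions are illustrative assumptions.

```python
import itertools
import math

def binom_pmf(k, n, p):
    """Binomial(n, p) probability of the value k."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def conditional_pmf(params, s):
    """Joint pmf of independent binomials (X_1, ..., X_m) given sum = s."""
    ranges = [range(n + 1) for n, _ in params]
    weights = {}
    for x in itertools.product(*ranges):
        if sum(x) == s:
            weights[x] = math.prod(
                binom_pmf(k, n, p) for k, (n, p) in zip(x, params)
            )
    total = sum(weights.values())
    return {x: w / total for x, w in weights.items()}

def covariance(pmf, g, h):
    """Cov[g(X), h(X)] under a joint pmf given as {point: probability}."""
    eg = sum(w * g(x) for x, w in pmf.items())
    eh = sum(w * h(x) for x, w in pmf.items())
    egh = sum(w * g(x) * h(x) for x, w in pmf.items())
    return egh - eg * eh

params = [(3, 0.3), (3, 0.5), (3, 0.7)]   # assumed (n_i, p_i) for X_1, X_2, X_3
pmf = conditional_pmf(params, s=4)

g = lambda x: x[0] ** 2                   # increasing function of X_A, A = {1}
h = lambda x: max(x[1], x[2])             # increasing function of X_{A^C}
cov_gh = covariance(pmf, g, h)
```

Here `cov_gh` is the covariance between increasing functions of the disjoint blocks $\{X_1\}$ and $\{X_2, X_3\}$ under the conditional law, and it should be nonpositive.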
Problems

1. For two points $(x_1, y_1)$ and $(x_2, y_2)$ such that $x_i > 0$, $y_i > 0$ for $i = 1, 2$, we say $(x_2, y_2)$ majorizes $(x_1, y_1)$ if $\max(x_1, y_1) \le \max(x_2, y_2)$, while $x_1 + y_1 = x_2 + y_2$. A function $h(x, y)$ is called Schur convex if $h(x_1, y_1) \le h(x_2, y_2)$ when $(x_2, y_2)$ majorizes $(x_1, y_1)$. Show that the following functions are Schur convex:
(a) $h(x, y) = x \ln x + y \ln y$;
(b) $h(x, y) = -(\ln x + \ln y)$;
(c) $h(x, y) = x^{-1} + y^{-1}$;
(d) $h(x, y) = [\tfrac{1}{2}(x^2 + y^2)]^{1/2}$.

2. Let $X_1, \ldots, X_n$ be independent and $S_k = \sum_{i=1}^k X_i$ for $k = 1, 2, \ldots, n$. Show that $(S_1, \ldots, S_n)$ are associated.

3. If $X_1, X_2$ are identically and independently distributed, then the order statistics $X_{1,2}, X_{2,2}$ are MTP2.

4. Show that if the joint density function $f(x, y)$ of $(X, Y)$ is MTP2, then $E[h(X, Y) \mid Y = y]$ is increasing in $y$ when $h(x, y)$ is increasing in $x, y$.

5. $(X_1, Y_1)$ is called stochastically smaller than $(X_2, Y_2)$, if for any function $h(x, y)$ increasing in $x, y$,
\[ E h(X_1, Y_1) \le E h(X_2, Y_2). \]
Show that if $(X_1, Y_1)$ is stochastically smaller than $(X_2, Y_2)$, then
(a) $P[X_1 > s, Y_1 > t] \le P[X_2 > s, Y_2 > t]$ for all $s, t$;
(b) $P[X_1 \le s, Y_1 \le t] \ge P[X_2 \le s, Y_2 \le t]$ for all $s, t$.

6. Let $X$ denote a lifetime with distribution $F(x)$. Denote by $X_1 = \min(X, M)$ and $X_2 = \max(0, X - M)$ for some constant $M > 0$.
(a) Show that $X_1$ and $X_2$ are associated.
(b) Find the covariance $\mathrm{Cov}(X_1, X_2)$.

7. Show that RTI or LTD implies PQD.

8. Show that PRD implies LTD.

9. Show that if $X$ and $Y$ are independent, then they are associated.

10. Show that if two sets of associated variables are independent of each other, then their union is a set of associated variables.

11. Show that any subset of associated variables is associated.

12. Show that if $\mathbf{X}$ is NA, then it is negatively orthant dependent, i.e.,
\[ P[X_1 > x_1, \ldots, X_n > x_n] \le P[X_1 > x_1] \cdots P[X_n > x_n]. \]

13. Let $(X_1, Y_1)$ have density $f(x, y)$ and $(X_2, Y_2)$ have density $g(x, y)$. Suppose $(X_1, Y_1)$ are positively associated and $g(x, y)/f(x, y)$ is increasing in $(x, y)$. Show that $(X_1, Y_1) \le_{st} (X_2, Y_2)$.
Chapter 8
Renewal Theory
The renewal theory plays a key role in many applied probability areas, such as replacement policies in reliability, ruin probability in insurance mathematics, and system analysis in queueing theory. In this chapter, we shall first introduce the renewal theorem and its extensions including the key renewal theorem. Then we shall study some extended renewal processes, such as the delayed renewal process and defective renewal process. Meanwhile, we study the properties of the renewal functions under some special underlying distribution function classes.
8.1 Renewal Theorem

Let $\{X_1, \ldots, X_n, \ldots\}$ be a sequence of independently and identically distributed nonnegative random variables with distribution function $F(x)$, mean $\mu$, and variance $\sigma^2$. Define for $t > 0$,
\[ N(t) = \sup\{n : S_n = X_1 + \cdots + X_n \le t\}, \]
the renewal point process, and
\[ \tau(t) = 1 + N(t) = \inf\{n : S_n = X_1 + \cdots + X_n > t\}. \]
Then the renewal function is defined by
\[ U(t) = E[N(t)] = \sum_{k=1}^\infty F^{(k)}(t), \]
where $F^{(k)}(t) = P[X_1 + \cdots + X_k \le t]$ is the $k$th convolution of $F(x)$. By conditioning on whether $X_1 > t$ or $X_1 \le t$, we have the following standard renewal equation for $U(t)$:
\[ U(t) = F(t) + \int_{[0,t]} U(t - x)\,dF(x). \]
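The renewal equation can be solved numerically on a grid. The following is a minimal sketch, assuming $F(0) = 0$ and using a trapezoid-type discretisation of the Stieltjes integral; the function name `renewal_function`, the grid size, and the exponential test case are illustrative choices. For exponential interarrival times with rate 1 the renewal process is a Poisson process, so $U(t) = t$ provides a reference value.

```python
import math

def renewal_function(F, t_max, n):
    """Approximate U(kh), k = 0..n, h = t_max/n, from
    U(t) = F(t) + int_[0,t] U(t - x) dF(x), assuming F(0) = 0."""
    h = t_max / n
    Fv = [F(k * h) for k in range(n + 1)]
    # dF[j] is the mass that F puts on the interval ((j-1)h, jh].
    dF = [0.0] + [Fv[j] - Fv[j - 1] for j in range(1, n + 1)]
    U = [0.0] * (n + 1)
    for k in range(1, n + 1):
        # On ((j-1)h, jh], replace U(t_k - x) by the average of its end values.
        # The j = 1 term contains U[k] itself, which is moved to the left side.
        acc = Fv[k] + 0.5 * dF[1] * U[k - 1]
        for j in range(2, k + 1):
            acc += 0.5 * dF[j] * (U[k - j] + U[k - j + 1])
        U[k] = acc / (1.0 - 0.5 * dF[1])
    return U

exp_cdf = lambda x: 1.0 - math.exp(-x)        # exponential(1) distribution
U_grid = renewal_function(exp_cdf, t_max=5.0, n=500)
U_at_5 = U_grid[-1]                           # reference value: U(5) = 5
```

The same routine can be applied to any distribution function with $F(0) = 0$; only the exponential case is checked here.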
A.K. Gupta et al., Probability and Statistical Models: Foundations for Problems in Reliability and Financial Mathematics, DOI 10.1007/978-0-8176-4987-6 8, c Springer Science+Business Media, LLC 2010
Definition 8.1. A random variable $X$ with distribution function $F(x)$ is called arithmetic if for some constant $d > 0$, $P[X \in \{\ldots, -2d, -d, 0, d, 2d, \ldots\}] = 1$. The largest of such constants $d$ is called the span of $X$.

Let $\mathcal{F}_n$ denote the class of random variables determined by $X_1, \ldots, X_n$. That means a random variable $Y \in \mathcal{F}_n$ if and only if $Y = g(X_1, \ldots, X_n)$ for some function $g$ of $n$ variables. For an event $A$, the notation $A \in \mathcal{F}_n$ means that the indicator variable $I_A \in \mathcal{F}_n$. A random variable $T$ with values in $\{1, 2, \ldots, \infty\}$ is called a stopping time if $\{T = n\} \in \mathcal{F}_n$. A random variable $Y$ is said to be prior to a stopping time $T$ if $Y I_{[T = n]} \in \mathcal{F}_n$ for all $n$, or equivalently $Y I_{[T \le n]} \in \mathcal{F}_n$ for all $n$. Thus, $\tau(t)$ is a stopping time, while $N(t)$ is not a stopping time.

Theorem 8.1. (Wald's identity) Let $T$ be a stopping time with respect to $\mathcal{F}_n$ (and is independent of $X_{n+1}, \ldots$) and assume $ET < \infty$. Then $E[\sum_{k=1}^T X_k] = \mu ET$.

Proof. We first write $\sum_{k=1}^T X_k = \sum_{k=1}^\infty I_{[T \ge k]} X_k$, and note that $I_{[T \ge k]} = 1 - I_{[T \le k-1]}$ is determined by $X_1, \ldots, X_{k-1}$ and is therefore independent of $X_k$. Thus,
\[ E\left[\sum_{k=1}^T X_k\right] = \sum_{k=1}^\infty E[X_k;\, T \ge k] = \mu \sum_{k=1}^\infty P[T \ge k] = \mu E[T]. \] □

In the following discussion, for the sake of convenience we shall assume further that $P[X_i > 0] = 1$. The next result is called the elementary renewal theorem:

Lemma 8.1. $E[\tau(t)] < \infty$ for all $t$, and $\lim_{t \to \infty} t^{-1} E[\tau(t)] = \mu^{-1}$.

Proof. Suppose initially that $X_1$ is bounded such that $P[X_1 \le c] = 1$ for some $c$. By Wald's identity,
\[ \mu E[\tau(t) \wedge n] = E[S_{\tau(t) \wedge n}] \le t + c. \]
Letting $n \to \infty$ proves that
\[ \mu E[\tau(t)] \le t + c < \infty, \]
and letting $t \to \infty$ proves that
\[ \limsup_{t \to \infty} t^{-1} E[\tau(t)] \le \mu^{-1}. \]
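The reverse inequality,
\[ t^{-1} E[\tau(t)] = t^{-1} \mu^{-1} E[S_{\tau(t)}] \ge \mu^{-1}, \]
follows by ignoring the overshoot.

In general, let $c > 0$ be large and define $X_n^{(c)} = X_n I_{[X_n \le c]} + c I_{[X_n > c]}$ such that $\mu^{(c)} = E X_1^{(c)} > 0$. Define $\tau^{(c)}(t)$ by analogy with $\tau(t)$. Since $\tau(t) \le \tau^{(c)}(t)$, it follows that $E\tau(t) < \infty$, and
\[ \limsup_{t \to \infty} t^{-1} E\tau(t) \le (\mu^{(c)})^{-1}. \]
Since $\mu^{(c)} \uparrow \mu$ as $c \uparrow \infty$, this finishes the proof. □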
The following result gives the renewal theorem due to Blackwell (1948). The proof is lengthy and will not be provided here.

Theorem 8.2. Suppose $P[X_1 > 0] = 1$. If $X_1$ is nonarithmetic, then for any $h > 0$,
\[ U(t + h) - U(t) \to h/\mu, \quad \text{as } t \to \infty. \]
If $X_1$ is arithmetic with span $d$, then
\[ U((k + 1)d) - U(kd) \to d/\mu, \quad \text{as } k \to \infty. \]

Suppose $g(x)$ is a positive and decreasing function and $\int_{[0,\infty)} g(x)\,dx < \infty$. We consider the general renewal equation:
\[ m(t) = g(t) + \int_0^t m(t - x)\,dF(x), \]
which has the following solution:
\[ m(t) = g(t) + \int_0^t g(t - x)\,dU(x). \]
The next theorem gives the key renewal theorem.

Theorem 8.3. Assume $P[X_1 > 0] = 1$.
(a) If $X_1$ is nonarithmetic, then
\[ \lim_{t \to \infty} \int_{[0,t)} g(t - x)\,dU(x) = \mu^{-1} \int_{[0,\infty)} g(x)\,dx. \]
(b) If $X_1$ is arithmetic with span $d$ and $t \to \infty$ through multiples of $d$, then the corresponding result is
\[ \lim_{t \to \infty} \sum_{j \le t/d} g(t - jd)\,[U(jd) - U((j - 1)d)] = d\mu^{-1} \sum_{j=0}^\infty g(jd). \]
Proof. To prove (a), let $m = 1, 2, \ldots$ and $h > 0$ be fixed. Observe that $U(x + h) - U(x) \le U(h) + 1 < \infty$ because the number of visits to $(x, x + h]$ equals at most one (the first visit if there is a visit) plus a random variable whose distribution is stochastically smaller than the number of visits to $(0, h]$. Since $g$ is decreasing,
\[ \int_{(0,t]} g(t - x)\,dU(x) \le \sum_{i=1}^m [U(t - (i - 1)h) - U(t - ih)]\, g((i - 1)h) + [U(h) + 1] \sum_{i=m}^\infty g(ih). \]
From the renewal theorem,
\[ \limsup_{t \to \infty} \int_{[0,t)} g(t - x)\,dU(x) \le \mu^{-1} \sum_{i=0}^{m-1} g(ih)\,h + [U(h) + 1] \sum_{i=m}^\infty g(ih). \]
Putting $m = c/h$ and first letting $c \to \infty$, and then $h \to 0$, yields
\[ \limsup_{t \to \infty} \int_{[0,t)} g(t - x)\,dU(x) \le \mu^{-1} \int_{[0,\infty)} g(x)\,dx. \]
A similar but easier lower bound completes the proof of (a). The proof of (b) is similar and is omitted. □

The following theorem gives the corresponding results for the residual life at time $t$, defined as $R(t) = S_{N(t)+1} - t$.

Theorem 8.4. Assume $P[X_1 > 0] = 1$.
(a) If $X_1$ is nonarithmetic, then
\[ \lim_{t \to \infty} P[R(t) > y] = \mu^{-1} \int_{(y,\infty)} (1 - F(x))\,dx, \]
and if $EX_1^2 < \infty$, then
\[ \lim_{t \to \infty} E[R(t)] = EX_1^2/(2\mu). \]
(b) If $X_1$ is arithmetic with span $d$ and $t \to \infty$ through multiples of $d$, then
\[ \lim_{t \to \infty} P[R(t) = jd] = d\mu^{-1} P[X_1 \ge jd], \]
and if $EX_1^2 < \infty$,
\[ \lim_{t \to \infty} E[R(t)] = EX_1^2/(2\mu) + d/2. \]

Proof. From Chap. 3, by conditioning on $X_1 > t$ or $X_1 \le t$, $P[R(t) > y]$ satisfies the following renewal equation:
\[ P[R(t) > y] = 1 - F(t + y) + \int_0^t P[R(t - x) > y]\,dF(x). \]
Further, by conditioning on the number of renewals in the interval $[0, t]$ and the time of the last renewal point, the solution of the above renewal equation is given by
\[ P[R(t) > y] = 1 - F(t + y) + \int_{[0,t)} (1 - F(t + y - x))\,dU(x). \]
As $t \to \infty$, letting $g(x) = 1 - F(x + y)$, from the key renewal theorem we have
\[ \lim_{t \to \infty} P[R(t) > y] = \mu^{-1} \int_{[0,\infty)} (1 - F(x + y))\,dx = \mu^{-1} \int_{(y,\infty)} (1 - F(x))\,dx, \]
which proves (a). The proof for (b) is similar and left as an exercise. □
8.2 High-Order Approximations and Bounds

In this section, we shall concentrate our discussion on the nonarithmetic case. With the help of the limiting result for the residual life, we have the following second-order result for the renewal function due to Smith (1958).

Theorem 8.5. Assume $P[X_1 > 0] = 1$ and $X_1$ is nonarithmetic. Then as $t \to \infty$,
\[ U(t) = \frac{t}{\mu} + \frac{EX_1^2}{2\mu^2} - 1 + o(1). \]

Proof. From Wald's identity and Theorem 8.4, we have
\begin{align*}
U(t) &= E[\tau(t)] - 1 \\
&= \frac{1}{\mu} E S_{\tau(t)} - 1 \\
&= \frac{1}{\mu}[t + ER(t)] - 1 \\
&= \frac{t}{\mu} + \frac{EX_1^2}{2\mu^2} - 1 + o(1).
\end{align*} □
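The second-order expansion can be compared with an exact renewal function in a case where a closed form is available. The sketch below assumes Erlang interarrival times with shape 2 and rate $\lambda$, for which the standard closed form is $U(t) = \lambda t/2 - 1/4 + e^{-2\lambda t}/4$; the function names and the value $\lambda = 1$ are illustrative.

```python
import math

def erlang2_renewal_function(t, lam):
    """Closed-form U(t) for Erlang(shape 2, rate lam) interarrival times."""
    return lam * t / 2.0 - 0.25 + 0.25 * math.exp(-2.0 * lam * t)

def second_order_approx(t, mean, second_moment):
    """t/mu + E[X^2]/(2 mu^2) - 1, the expansion without its o(1) term."""
    return t / mean + second_moment / (2.0 * mean**2) - 1.0

lam = 1.0
mean = 2.0 / lam                 # E[X] for Erlang(2, lam)
second_moment = 6.0 / lam**2     # E[X^2] = Var + mean^2 = 2/lam^2 + 4/lam^2

t = 10.0
exact = erlang2_renewal_function(t, lam)
approx = second_order_approx(t, mean, second_moment)
gap = exact - approx             # equals exp(-2 lam t)/4 in this example
```

In this example the remainder term is exactly $e^{-2\lambda t}/4$, so the expansion is already very accurate for moderate $t$.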
From the proof of the above result, we see that the evaluation of the renewal function $U(t)$ is equivalent to the evaluation of the mean of the residual life $R(t)$. The following result gives a simple general bound for the renewal function (Lorden 1970).

Theorem 8.6. Assume $F(x)$ is nonlattice and $F(0) = 0$. Then
\[ ER(t) \le 2 \lim_{t \to \infty} ER(t) = EX_1^2/\mu, \]
and thus
\[ t/\mu - 1 \le U(t) \le t/\mu + EX_1^2/\mu^2 - 1. \]

Proof. First note that $U(t) + 1$ has the following subadditive property,
\[ U(t) + 1 \le U(x) + 1 + U(t - x) + 1, \]
for $x \le t$, as in the proof of the key renewal theorem. Thus, using $U(t) + 1 = \mu^{-1}[t + ER(t)]$ from the proof of Theorem 8.5, we get
\[ ER(t) \le ER(x) + ER(t - x). \]
Therefore,
\[ \frac{1}{2} t\, ER(t) \le \frac{1}{2} t \inf_{0 \le x \le t/2} (ER(x) + ER(t - x)) \le \int_0^{t/2} (ER(x) + ER(t - x))\,dx = \int_0^t ER(x)\,dx. \]
However, by considering the residual life $R(x) = S_{\tau(x)} - x$ as a stochastic function of $x$ for $0 \le x \le t$, we see that $R(x)$ is a piecewise linear function between the time points $0, S_1, \ldots, S_{N(t)}, S_{\tau(t)}$ with slope $-1$. Thus,
\[ \int_0^t R(x)\,dx = \int_0^{S_{\tau(t)}} R(x)\,dx - \int_t^{S_{\tau(t)}} R(x)\,dx = \frac{1}{2} \sum_{i=1}^{\tau(t)} X_i^2 - \frac{1}{2} R^2(t). \]
By taking the expectation on both sides and using Wald's identity, we have
\begin{align*}
\int_0^t ER(x)\,dx &= \frac{1}{2} EX_1^2\, E\tau(t) - \frac{1}{2} ER^2(t) \\
&= \frac{EX_1^2}{2\mu}[t + ER(t)] - \frac{1}{2} ER^2(t) \\
&\le \frac{EX_1^2}{2\mu}[t + ER(t)] - \frac{1}{2} (ER(t))^2.
\end{align*}
Combining the two inequalities, we obtain
\[ \frac{1}{2} t\, ER(t) \le \frac{EX_1^2}{2\mu}[t + ER(t)] - \frac{1}{2}(ER(t))^2, \]
which is equivalent to
\[ (t + ER(t))\,[ER(t) - EX_1^2/\mu] \le 0. \]
Thus, $ER(t) \le EX_1^2/\mu$. □
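As a worked check of this bound, consider (as an illustrative assumption, not an example from the text) Erlang interarrival times with shape 2 and rate $\lambda$, for which the renewal function has the standard closed form used below. The mean residual life then follows from $U(t) + 1 = \mu^{-1}[t + ER(t)]$:

```latex
\begin{align*}
\mu &= \frac{2}{\lambda}, \qquad EX_1^2 = \frac{6}{\lambda^2}, \qquad
U(t) = \frac{\lambda t}{2} - \frac{1}{4} + \frac{1}{4} e^{-2\lambda t}, \\
ER(t) &= \mu\,[U(t) + 1] - t
       = \frac{3}{2\lambda} + \frac{1}{2\lambda} e^{-2\lambda t}, \\
\lim_{t \to \infty} ER(t) &= \frac{3}{2\lambda} = \frac{EX_1^2}{2\mu}, \qquad
\sup_{t \ge 0} ER(t) = ER(0) = \frac{2}{\lambda} \le \frac{3}{\lambda} = \frac{EX_1^2}{\mu}.
\end{align*}
```

In this case the bound $ER(t) \le EX_1^2/\mu$ holds with room to spare, and the limit agrees with Theorem 8.4(a).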
From the above proof and the limit theorem for $P[R(t) > y]$, we have the following interesting result:

Corollary 8.1. Under the same condition as in the last theorem,
\[ \int_0^\infty \left[\frac{EX_1^2}{2\mu} - ER(x)\right] dx = \frac{EX_1^3}{6\mu} - \left(\frac{EX_1^2}{2\mu}\right)^2. \]

The following corollary gives an easier bound when the underlying distribution $F(x)$ belongs to certain distribution classes.

Corollary 8.2. Assume $F(x)$ is NBUE (NWUE) with $F(0) = 0$. Then
\[ ER(t) \le (\ge)\ \mu, \]
and thus
\[ U(t) \le (\ge)\ \frac{t}{\mu}. \]

Proof. From the renewal equation for $P[R(t) > y]$, we have
\[ E[R(t)] = \int_0^\infty (1 - F(t + x))\,dx + \int_0^t \int_0^\infty (1 - F(t - s + x))\,dx\,dU(s). \]
If $F(x)$ is NBUE, then
\[ \int_0^\infty (1 - F(t + x))\,dx = \int_t^\infty (1 - F(x))\,dx \le \mu (1 - F(t)). \]
Thus,
\[ E[R(t)] \le \mu \left[1 - F(t) + \int_0^t (1 - F(t - s))\,dU(s)\right] = \mu P[R(t) > 0] = \mu, \]
which gives the result for the NBUE case. The NWUE case is left as an exercise. □
8.3 Delayed Renewal Process

In an ordinary renewal process, we can treat time zero, $X_0 = 0$, as the first renewal epoch. In general, we suppose the first renewal epoch $X_0'$ follows distribution $G(t)$. We can define $S_0' = X_0'$ and $S_n' = X_0' + X_1' + \cdots + X_n'$ and
\[ N'(t) = \sup\{n \ge 0 : S_n' \le t\}. \]
Therefore, the renewal function $U'(t) = E[N'(t)]$ will still be the mean number of renewals before time $t$. We call $N'(t)$ a delayed renewal process. By conditioning on the time of $X_0'$, we can see that $U'(t)$ can be calculated as follows:
\[ U'(t) = G(t) + \int_0^t U(t - x)\,dG(x). \]

In this section, we consider two special classes of delayed renewal process. In the first case, we treat the item at time zero as having a random age $T$, say, which follows distribution $H(t)$. That means, when $T = t$, $X_0'$ has the same distribution as the residual lifetime $X - t \mid X > t$ with distribution function
\[ F_t(x) = P[t < X \le t + x \mid X > t] = (F(t + x) - F(t))/(1 - F(t)). \]
Thus, $X_0'$ has the mixture distribution $G(x) = \int_0^\infty F_t(x)\,dH(t)$.

In the second case, we assume $X_0'$ has the stationary residual life distribution $G(x) = \mu^{-1} \int_0^x \bar{F}(y)\,dy$, where $\bar{F} = 1 - F$, and $N'(t)$ is called the stationary renewal process. In this case, counting the renewal at $X_0'$, the renewal function is linear:
\[ U'(t) = t/\mu. \]

An important application of the stationary (or delayed) renewal process is to study the monotone property of the renewal function of ordinary renewal processes. The following is a typical result (Brown 1980).

Recall that $F(x)$ is IMRL (increasing mean residual life) if $\mu_t = \int_t^\infty (1 - F(y))\,dy / (1 - F(t))$ is increasing in $t$, or equivalently $(1 - F(t)) / \int_t^\infty (1 - F(y))\,dy$ is decreasing in $t$. This is equivalent to saying that the stationary distribution $\mu^{-1} \int_0^t (1 - F(y))\,dy$ has DFR (decreasing failure rate).

Theorem 8.7. Let $N(t)$ be an ordinary renewal process with underlying distribution $F(x)$ which is IMRL. Then $E[R(t)]$ is monotone increasing to the limit $ER(\infty) = EX_1^2/(2\mu)$.

The proof involves constructing two renewal processes on the same sample space: one is the ordinary renewal process $\{N(t)\}$ and the other is a delayed renewal process such that $S_i' = S_{N+i}$ for $i \ge 0$ and $X_0' = S_N$, where $N$ is a properly defined integer variable with mean $EN = EX_1^2/(2\mu^2)$. The details are spelled out as follows.
Step 1. Construct two independent random variables $Z_1$ and $W_1$ such that $Z_1$ follows $G(t)$ and $W_1$ follows $J(t)$, where
\[ \bar{J}(t) = \bar{F}(t)/\bar{G}(t). \]
Since $F(t)$ is IMRL, $\bar{J}(t)$ is decreasing with $\bar{J}(0) = 1$ and $\bar{J}(\infty) = \mu/\mu_\infty$, which may be larger than zero. Thus, $J(t)$ might be defective. If $Z_1 \le W_1$, let $X_0' = X_1 = Z_1$ and $X_{j-1}' = X_j = Y_{j-1}$ for $j = 2, 3, \ldots$, where $Y_j$ for $j = 1, 2, \ldots$ are a sequence of identically and independently distributed random variables with distribution $F(x)$. If $Z_1 > W_1$, set $X_1 = W_1$ and go to Step 2.

Step 2. Given $X_1 = W_1 = v$, we construct two conditionally independent random variables $Z_2$ and $W_2$ as follows. Let $Z_2 \mid W_1 = v$ follow distribution $G_v(t)$, where
\[ \bar{G}_v(t) = \frac{\bar{G}(t + v)}{\bar{G}(v)}, \]
and $W_2 \mid W_1 = v$ follow distribution $J_v(t)$, where
\[ \bar{J}_v(t) = \frac{\bar{F}(t)}{\bar{G}_v(t)}. \]
Since $F(t)$ is IMRL, $G_v(t)$ is a distribution function and may be defective. Also, we can write
\[ \bar{J}_v(t) = \frac{\bar{F}(t)}{\bar{G}(t)} \cdot \frac{\bar{G}(t)}{\bar{G}(t + v)} \cdot \bar{G}(v). \]
From the IMRL property of $F(x)$, both $\bar{F}(t)/\bar{G}(t)$ and $\bar{G}(t)\bar{G}(v)/\bar{G}(t + v)$ are decreasing in $t$, and thus $J_v(t)$ is a possibly defective distribution as well. If $Z_2 \le W_2$, let $X_0' = W_1 + Z_2$, $X_2 = Z_2$, and $X_{j-2}' = X_j = Y_{j-2}$ for $j = 3, 4, \ldots$. If $Z_2 > W_2$, set $X_2 = W_2$ and go to the next step.

Step m. We reach Step $m$ if and only if $Z_i > W_i$ for $i = 1, 2, \ldots, m - 1$, in which case $X_i = W_i$ for $i = 1, 2, \ldots, m - 1$. Now we construct $Z_m$ and $W_m$ conditionally independent given $\sum_{i=1}^{m-1} W_i = \sum_{i=1}^{m-1} X_i = v$ as follows: $Z_m \mid \sum_{i=1}^{m-1} W_i = v$ follows distribution $G_v(t)$ and $W_m \mid \sum_{i=1}^{m-1} W_i = v$ follows distribution $J_v(t)$.
If $Z_m \le W_m$, set $X_m = Z_m$, $X_0' = \sum_{i=1}^{m-1} W_i + Z_m$, and $X_{j-m}' = X_j = Y_{j-m}$ for $j = m + 1, \ldots$. If $Z_m > W_m$, set $X_m = W_m$ and repeat the step with $m$ replaced by $m + 1$. The procedure stops at
\[ N = \min\{i : Z_i \le W_i\}. \]
Based on the above construction procedure, we have the following result:

Theorem 8.8.
(a) $\{X_i\}$ for $i = 1, 2, \ldots$ are identically and independently distributed following distribution $F(t)$.
(b) $X_0'$ follows distribution $G(t)$ and $X_j'$ are independent following distribution $F(t)$ for $j \ge 1$.
(c) $S_i' = S_{N+i}$ and $E[N] = EX_1^2/(2\mu^2)$.

Proof. For (a), from the definition of $X_i$, we only have to show that for $i \le N$, $X_i$ follows $F(t)$. Indeed, given $(W_j, Z_j) = (w_j, z_j)$ for $j = 1, \ldots, i - 1$ and $i \le N$, the conditional distribution of $X_i$ only depends on $v = \sum_{j=1}^{i-1} w_j$ and is the same as the distribution of $\min(Z_v, W_v)$, where $Z_v$ follows the distribution $G_v(t)$, $W_v$ follows the distribution $J_v(t)$, and $Z_v$ and $W_v$ are conditionally independent. However,
\[ \bar{G}_v(t) \bar{J}_v(t) = \bar{F}(t), \]
which is free of $v$. Thus, $X_i$ follows the distribution $F(t)$ and is independent of $X_1, \ldots, X_{i-1}$.

For (b), since we generated $W_1, \ldots, W_N$, it will now be convenient to continue constructing $W_j$ for $j > N$. At stage $j$, construct $W_j$ to be conditionally independent of $W_1, \ldots, W_{j-1}$ given $\sum_{i=1}^{j-1} W_i = v$, following distribution $J_v(t)$. Since $G(t)$ is DFR,
\[ \inf_i P[W_i > t] \ge \inf_v P[W_v > t] = \frac{\bar{F}(t)}{\lim_{v \to \infty} \bar{G}_v(t)} > 0. \]
Therefore, $P[\sum_{i=1}^\infty W_i = \infty] = 1$, so that given $t$, there exists $j$ such that $\sum_{i=1}^{j-1} W_i < t \le \sum_{i=1}^j W_i$. Thus, for given $W_1 = w_1, W_2 = w_2, \ldots$,
\[ P[X_0' > t \mid w_1, w_2, \ldots] = \left[\prod_{l=1}^{j-1} \bar{G}_{\sum_{i=1}^{l-1} w_i}(w_l)\right] \bar{G}_{\sum_{i=1}^{j-1} w_i}\left(t - \sum_{i=1}^{j-1} w_i\right) = \bar{G}(t), \]
which is free of all $w_1, w_2, \ldots$. Thus, $X_0'$ follows distribution $G(t)$. That means the second process is a stationary renewal process.

For (c), since $P[\sum_{i=1}^\infty W_i = \infty] = 1$, $X_0' < \infty$ if and only if $N < \infty$. By (b), $X_0'$ follows distribution $G(t)$, and thus $P[X_0' < \infty] = 1$. Therefore, $P[N < \infty] = 1$. By construction $S_i' = S_{N+i}$, and in particular, $X_0' = S_N$. From Wald's identity, we have
\[ \frac{EX_1^2}{2\mu} = E[X_0'] = \mu E[N], \]
that means, $E[N] = EX_1^2/(2\mu^2)$. □

Now we are ready to finish the proof of the main theorem. Since
\[ ER(t)/\mu = U(t) - t/\mu + 1 = E[N(t) - N'(t)] + 1 \]
and $N(t) - N'(t) + 1$ is monotone increasing to $N$, $ER(t)/\mu$ is monotone increasing to $EN$. □
8.4 Defective Renewal Process

In this section, we assume that the interarrival times $\{X_1, \ldots, X_n, \ldots\}$ may have a defective distribution such that $P[X_1 < \infty] = \rho < 1$. We denote the conditional distribution by $F(x) = P[X_1 \le x \mid X_1 < \infty]$. Define
\[ M = \inf\{k > 0 : X_k = \infty\} \]
as the terminating arrival time. Then $M$ is a geometric random variable with
\[ P[M = k] = \rho^{k-1}(1 - \rho). \]
We still define the renewal process as
\[ N(t) = \sup\left\{n > 0 : \sum_{i=1}^n X_i \le t\right\}. \]
Then $N(t)$ is a terminating renewal process with
\begin{align*}
P[N(t) \ge k] &= P\left[\sum_{i=1}^k X_i \le t\right] \\
&= P\left[\sum_{i=1}^k X_i \le t \,\Big|\, X_1 < \infty, \ldots, X_k < \infty\right] (P[X_1 < \infty])^k \\
&= F^{(k)}(t)\, \rho^k.
\end{align*}
Thus, the renewal function can be defined similarly as
\[ U(t) = E[N(t)] = \sum_{k=1}^\infty P[N(t) \ge k] = \sum_{k=1}^\infty \rho^k F^{(k)}(t). \]
Obviously,
\[ U(\infty) = \lim_{t \to \infty} U(t) = \sum_{k=1}^\infty \rho^k = \frac{\rho}{1 - \rho}. \]
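On the other hand, note that the geometric compound sum $\sum_{i=1}^{M-1} X_i$ has the distribution
\begin{align*}
P\left[\sum_{i=1}^{M-1} X_i \le t\right] &= P[M = 1] + \sum_{k=1}^\infty P\left[\sum_{i=1}^k X_i \le t \,\Big|\, M = k + 1\right] P[M = k + 1] \\
&= (1 - \rho) + \sum_{k=1}^\infty P\left[\sum_{i=1}^k X_i \le t \,\Big|\, X_1 < \infty, \ldots, X_k < \infty\right] \rho^k (1 - \rho) \\
&= (1 - \rho)[1 + U(t)],
\end{align*}
where we agree that $\sum_{i=1}^0 X_i = 0$. That means,
\[ P\left[\sum_{i=1}^{M-1} X_i > t\right] = 1 - (1 - \rho)[1 + U(t)]. \]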
Therefore, the evaluation of the tail probability of the geometric compound sum is equivalent to the calculation of the renewal function.

Denote by $G(t) = P[\sum_{i=1}^{M-1} X_i > t]$. Then $G(\infty) = 0$. We now study the rate of this convergence. By conditioning on the first renewal time $X_1$ after time 0, it can be seen that $G(t)$ satisfies
\begin{align*}
G(t) &= P\left[\sum_{i=1}^{M-1} X_i > t;\ t < X_1 < \infty\right] + P\left[\sum_{i=1}^{M-1} X_i > t;\ X_1 \le t\right] \\
&= \rho(1 - F(t)) + \rho \int_0^t P\left[\sum_{i=1}^{M-1} X_i > t \,\Big|\, X_1 = x\right] dF(x) \\
&= \rho(1 - F(t)) + \rho \int_0^t G(t - x)\,dF(x),
\end{align*}
where in the last equation we use the memoryless property of the geometric distribution for $M$. This is no longer an ordinary renewal equation due to the factor $\rho$; it is called a defective renewal equation.

We consider the following general defective renewal equation:
\[ m(t) = g(t) + \rho \int_0^t m(t - x)\,dF(x). \]
We first assume that there exists a positive constant $\kappa$ satisfying
\[ \int_0^\infty e^{\kappa x}\,dF(x) = 1/\rho. \]
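When the moment generating function of $F$ is available, the positive constant in this condition can be found numerically. The following minimal sketch uses bisection; it writes `rho` for the defect probability $P[X_1 < \infty]$ and `kappa` for the constant sought, and it assumes the moment generating function is increasing and finite on $[0, \texttt{upper})$. The function name and the gamma test case (shape 2, rate $\lambda$, for which the root is $\lambda(1 - \sqrt{\rho})$) are illustrative.

```python
import math

def adjustment_coefficient(mgf, rho, upper, tol=1e-12):
    """Solve mgf(kappa) = 1/rho on (0, upper) by bisection.

    Assumes mgf is increasing on [0, upper), mgf(0) = 1, and that mgf
    exceeds 1/rho somewhere below `upper`."""
    target = 1.0 / rho
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)        # always strictly below `upper`
        if mgf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

lam, rho = 2.0, 0.25                                 # assumed example values
gamma2_mgf = lambda k: (lam / (lam - k)) ** 2        # valid for k < lam
kappa = adjustment_coefficient(gamma2_mgf, rho, upper=lam)
kappa_exact = lam * (1.0 - math.sqrt(rho))           # closed form for this case
```

For distributions without a closed-form moment generating function, `mgf` could be replaced by a numerical integral, at the cost of extra care near the upper end of the search interval.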
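Under this condition, we can define a new proper distribution function $F^*(x)$ as
\[ dF^*(x) = \rho e^{\kappa x}\,dF(x). \]
Multiplying by $e^{\kappa t}$ on both sides of the defective renewal equation, we have
\[ e^{\kappa t} m(t) = e^{\kappa t} g(t) + \int_0^t e^{\kappa (t - x)} m(t - x)\,dF^*(x). \]
Thus, $m^*(t) = e^{\kappa t} m(t)$ satisfies a standard renewal equation with underlying distribution function $F^*(x)$. From the key renewal theorem, we have the following result:

Theorem 8.9. Suppose $e^{\kappa t} g(t)$ is decreasing in $t$ and $\int_0^\infty e^{\kappa t} g(t)\,dt < \infty$. Then
\[ e^{\kappa t} m(t) \to \frac{1}{\mu^*} \int_0^\infty e^{\kappa t} g(t)\,dt, \]
where $\mu^* = \int_0^\infty x\,dF^*(x) = \rho \int_0^\infty x e^{\kappa x}\,dF(x)$.

For the tail probability of the geometric compound sum $G(t)$, we have the following corollary:

Corollary 8.3. Suppose $e^{\kappa t}(1 - F(t))$ is decreasing in $t$. Then
\[ G(t) \sim \frac{1 - \rho}{\kappa \mu^*}\, e^{-\kappa t}. \]

Proof. Letting $g(t) = \rho(1 - F(t))$, we find
\begin{align*}
\int_0^\infty e^{\kappa t}(1 - F(t))\,dt &= \frac{1}{\kappa} e^{\kappa t}(1 - F(t)) \Big|_0^\infty + \frac{1}{\kappa} \int_0^\infty e^{\kappa t}\,dF(t) \\
&= \frac{1}{\kappa}(1/\rho - 1).
\end{align*}
The result follows by some simplification. □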
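The asymptotic result does not provide an estimate for small values of $t$. In the following discussion, we provide lower and upper bounds for $m(t)$ for all values of $t$ (Willmot et al. 2001). First, we define
\[ \alpha(t) = \inf_{0 \le z \le t} \frac{g(z) e^{\kappa z}}{\rho \int_z^\infty e^{\kappa y}\,dF(y)} \quad \text{and} \quad \beta(t) = \sup_{0 \le z \le t} \frac{g(z) e^{\kappa z}}{\rho \int_z^\infty e^{\kappa y}\,dF(y)}, \]
where $1 - F(t) > 0$.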
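Theorem 8.10. For all $t$,
\[ \alpha(t) e^{-\kappa t} \le m(t) \le \beta(t) e^{-\kappa t}. \]

Proof. We first derive the upper bound. Using an induction method, it is easy to see that $m(t)$ can be defined as the limit of the following monotone increasing functions:
\[ m_k(t) = g(t) + \rho \int_0^t m_{k-1}(t - x)\,dF(x), \]
where $m_0(t) = g(t)$. Indeed, $m_k(t)$ can be written as
\[ m_k(t) = g(t) + \int_0^t g(t - x)\,d\left[\sum_{j=1}^k \rho^j F^{(j)}(x)\right]. \]
We shall show by induction that for all $0 \le x \le t$,
\[ m_k(x) \le \beta(t) e^{-\kappa x}. \]
Obviously,
\begin{align*}
m_0(x) = g(x) &= \frac{g(x) e^{\kappa x}}{\rho \int_x^\infty e^{\kappa y}\,dF(y)}\, e^{-\kappa x} \rho \int_x^\infty e^{\kappa y}\,dF(y) \\
&\le \beta(t) e^{-\kappa x} \rho \int_0^\infty e^{\kappa y}\,dF(y) \\
&= \beta(t) e^{-\kappa x}.
\end{align*}
Suppose the upper bound is true for $k$. Then for $k + 1$,
\begin{align*}
m_{k+1}(x) &= g(x) + \rho \int_0^x m_k(x - y)\,dF(y) \\
&\le \beta(t) e^{-\kappa x} \rho \int_x^\infty e^{\kappa y}\,dF(y) + \rho \int_0^x \beta(t) e^{-\kappa (x - y)}\,dF(y) \\
&= \beta(t) e^{-\kappa x} \left[\rho \int_x^\infty e^{\kappa y}\,dF(y) + \rho \int_0^x e^{\kappa y}\,dF(y)\right] \\
&= \beta(t) e^{-\kappa x}.
\end{align*}
By letting $k \to \infty$, we obtain the expected result by taking $x = t$.

For the lower bound, we use a technique developed by Cai and Wu (1997). Since $F^*(x)$ is a proper distribution function, it implies that $U^*(t) = \sum_{k=1}^\infty F^{*(k)}(t) < \infty$. That means, $F^{*(k)}(t) \to 0$ as $k \to \infty$ for any finite $t$. We shall show by induction that
\[ m_k(x) \ge \alpha(t) e^{-\kappa x}[1 - F^{*(k+1)}(x)], \]
for all $0 \le x \le t$.
Obviously, for $k = 0$,
\begin{align*}
m_0(x) = g(x) &= \frac{g(x) e^{\kappa x}}{\rho \int_x^\infty e^{\kappa y}\,dF(y)}\, e^{-\kappa x} \rho \int_x^\infty e^{\kappa y}\,dF(y) \\
&\ge \alpha(t) e^{-\kappa x}[1 - F^*(x)].
\end{align*}
Suppose it is true for $k$. Then for $k + 1$,
\begin{align*}
m_{k+1}(x) &= g(x) + \rho \int_0^x m_k(x - y)\,dF(y) \\
&\ge \alpha(t) e^{-\kappa x} \rho \int_x^\infty e^{\kappa y}\,dF(y) + \rho \int_0^x \alpha(t) e^{-\kappa (x - y)}[1 - F^{*(k+1)}(x - y)]\,dF(y) \\
&= \alpha(t) e^{-\kappa x} \left[1 - F^*(x) + \int_0^x [1 - F^{*(k+1)}(x - y)]\,dF^*(y)\right] \\
&= \alpha(t) e^{-\kappa x}[1 - F^{*(k+2)}(x)],
\end{align*}
where in the last step, we use the equality
\[ 1 - F^*(x) + \int_0^x [1 - F^{*(k+1)}(x - y)]\,dF^*(y) = 1 - F^{*(k+2)}(x). \]
By letting $k \to \infty$, we have
\[ m(x) = \lim_{k \to \infty} m_k(x) \ge \lim_{k \to \infty} \alpha(t) e^{-\kappa x}[1 - F^{*(k+1)}(x)] = \alpha(t) e^{-\kappa x}. \]
The lower bound follows by letting $x = t$. □

In the geometric compound sum case, $g(t) = \rho(1 - F(t))$. We have the following simpler form of bound:

Corollary 8.4. Let $g(t) = \rho(1 - F(t))$.
(a) If $F(t)$ is NWUC (new worse than used in convex order), then
\[ G(t) \le \rho e^{-\kappa t}. \]
(b) If $F(t)$ is NBUC (new better than used in convex order), then
\[ G(t) \ge \rho e^{-\kappa t}. \]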
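Proof. If $F(t)$ is NWUC, then for all $y, z \ge 0$,
\[ \int_y^\infty \frac{1 - F(x + z)}{1 - F(z)}\,dx \ge \int_y^\infty (1 - F(x))\,dx. \]
Thus,
\[ \int_0^\infty e^{\kappa y} \int_y^\infty \frac{1 - F(x + z)}{1 - F(z)}\,dx\,dy \ge \int_0^\infty e^{\kappa y} \int_y^\infty (1 - F(x))\,dx\,dy. \]
By integrating by parts twice, we get
\[ \int_0^\infty e^{\kappa y}\, \frac{dF(y + z)}{1 - F(z)} \ge \int_0^\infty e^{\kappa y}\,dF(y) = 1/\rho. \]
Thus, we have
\begin{align*}
\beta(t) &= \sup_{0 \le z \le t} \frac{(1 - F(z)) e^{\kappa z}}{\int_z^\infty e^{\kappa y}\,dF(y)} \\
&= \left[\inf_{0 \le z \le t} \frac{\int_0^\infty e^{\kappa x}\,dF(x + z)}{1 - F(z)}\right]^{-1} \\
&\le \left[\int_0^\infty e^{\kappa x}\,dF(x)\right]^{-1} \\
&= \rho.
\end{align*}
Similarly, we can show $\alpha(t) \ge \rho$ if $F(t)$ is NBUC. □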
The following is a typical example of a defective renewal equation.

Example 8.1 (Age-dependent branching processes). Suppose that an organism at the end of its lifetime produces a random number of offspring with the probability distribution $P[\text{number of offspring} = j] = p_j$ for $j = 0, 1, 2, \ldots$. Also assume that the offspring act independently of each other and produce their own offspring with the same probability distribution. Denote the mean number of offspring by $\mu < 1$. Assume that the lifetimes of the organisms are independent random variables with common distribution $F(x)$ with $F(0) = 0$. Let $X(t)$ denote the number of organisms alive at time $t$. The process $X(t)$ is called an age-dependent branching process. Denote by $m(t) = E[X(t)]$ the expected number of organisms alive at time $t$. Conditioning on whether the lifetime of the first organism is larger than $t$ (which keeps one organism) or less than $t$ (which produces an expected $\mu$ offspring), we have the following defective renewal equation:
\[ m(t) = 1-F(t) + \mu\int_0^t m(t-x)\,dF(x). \]
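Suppose the conditions in the theorem are true. Then we have the following asymptotic result for $m(t)$:
\[ m(t) \sim \frac{\int_0^\infty e^{\gamma t}(1-F(t))\,dt}{\mu\int_0^\infty x e^{\gamma x}\,dF(x)}\,e^{-\gamma t} = \frac{1-\mu}{\gamma\mu^2\int_0^\infty x e^{\gamma x}\,dF(x)}\,e^{-\gamma t}. \]
The lower and upper bounds for $m(t)$ can be obtained as
\[ \frac{\alpha(t)}{\mu}\,e^{-\gamma t} \le m(t) \le \frac{\beta(t)}{\mu}\,e^{-\gamma t}. \]
When $F(t)$ is NWUC, $m(t) \le e^{-\gamma t}$; and when $F(t)$ is NBUC, $m(t) \ge e^{-\gamma t}$.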
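The defective renewal equation of Example 8.1 can be checked numerically. The following is a minimal sketch, assuming exponential lifetimes $F(x) = 1-e^{-\lambda x}$; this choice, the grid step `h`, and the function name `solve_defective_renewal` are illustrative assumptions, not part of the text. For this choice a Laplace transform of the equation gives $m(t) = e^{-\gamma t}$ with $\gamma = \lambda(1-\mu)$, so the grid solution should be close to that curve.

```python
import math

def solve_defective_renewal(lam, mu, t_max, h):
    """Approximate m(t) = 1 - F(t) + mu * int_0^t m(t - x) dF(x) on a grid of step h,
    assuming exponential lifetimes F(x) = 1 - exp(-lam * x)."""
    n = int(round(t_max / h))
    F = [1.0 - math.exp(-lam * i * h) for i in range(n + 1)]
    # dF[j] is the probability mass of the interval ((j-1)h, jh], placed at jh.
    dF = [0.0] + [F[j] - F[j - 1] for j in range(1, n + 1)]
    m = [1.0]  # m(0) = 1 - F(0) = 1
    for i in range(1, n + 1):
        conv = sum(m[i - j] * dF[j] for j in range(1, i + 1))
        m.append(1.0 - F[i] + mu * conv)
    return m

lam, mu, h = 1.0, 0.5, 0.01
m = solve_defective_renewal(lam, mu, t_max=5.0, h=h)
gamma = lam * (1.0 - mu)  # solves mu * lam / (lam - gamma) = 1
for t in (1.0, 3.0, 5.0):
    i = int(round(t / h))
    print(t, m[i], math.exp(-gamma * t))
```

The scheme is first order in `h`, so the two printed columns agree only approximately; halving `h` should roughly halve the gap.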
Problems

1. Let $\{N(t)\}$ be a renewal process with underlying interarrival distribution function $F(x)$. Let $W$ denote the waiting time until the current age $A(t)$ after the last renewal has exceeded $s$. That is,
\[ W = \inf\{t > 0 : A(t) > s\}. \]
(a) Show by the renewal argument that
\[ P[W \le x] = 1-F(s) + \int_0^s P[W \le x-u]\,dF(u) \]
for $x \ge s$; and $0$ for $x < s$.
(b) When $\{N(t)\}$ is Poisson with rate $\lambda$, show that
\[ E[W] = \frac{e^{\lambda s}-1}{\lambda}. \]
(Hint: Find the moment generating function of $W$ from (a).)

2. (Alternating Renewal Process) A machine breaks down repeatedly. After the $n$th breakdown, the repair takes time $Y_n$ and makes the machine like new. Subsequently, the machine runs for a period of length $Z_n$ before it breaks down for another repair.
Assume $Y_m$ and $Z_n$ are independent of each other for all $m \ge 0$, $n > 0$, the $Y_m$ having common distribution $F_Y(y)$ and the $Z_n$ having common distribution $F_Z(z)$. Denote $X_n = Z_{n-1} + Y_n$ and $\{N(t)\}$ as the renewal process with interarrival times $X_n$ with underlying distribution
\[ F(x) = \int_0^x F_Y(x-y)\,dF_Z(y). \]
Then $N(t)$ is the number of completed repairs by time $t$. Denote by $p(t)$ the probability that the machine is working at time $t$.
(a) Show by a renewal argument that
\[ p(t) = 1-F_Z(t) + \int_0^t p(t-x)\,dF(x). \]
(b) Show that the solution of the above renewal equation is given by
\[ p(t) = 1-F_Z(t) + \int_0^t (1-F_Z(t-x))\,dm(x), \]
where $m(x)$ is the renewal function of $N(t)$.
(c) As $t\to\infty$, show by the Key Renewal Theorem that
\[ p(t) \to \frac{E[Z]}{E[Z]+E[Y]}, \]
which is called the availability.

3. (Renewal-Reward Process) Let $(X_i, R_i)$, $i \ge 1$, be independent and identically distributed pairs of random variables, where the $X_i$ are the interarrival times of a renewal process $N(t)$. We call
\[ C(t) = \sum_{i=1}^{N(t)} R_i \]
a cumulative renewal-reward process with rewards $R_i$. Using Wald's identity, show that the long-run average reward rate has the limit
\[ \frac{E\,C(t)}{t} \to \frac{E[R]}{E[X]}. \]

4. (Age Replacement Maintenance Policy) A vital component of an airplane is replaced at a cost $b$ whenever it reaches the given age $T$. If it fails earlier, the cost of replacement is $a$. Suppose the lifetime distribution $F(x)$ has density $f(x)$.
(a) Let $X_1, X_2, \ldots$ be the consecutive runtimes of the component. Show that the mean runtime per cycle is
\[ E[X] = \int_0^T (1-F(x))\,dx. \]
(b) Let $R_1, R_2, \ldots$ be the corresponding replacement costs. Show that the mean cost per replacement is
\[ E[R] = aF(T) + b(1-F(T)). \]
(c) Using the renewal-reward theorem, show that the long-run cost rate per unit time is
\[ \frac{E[R]}{E[X]} = \frac{aF(T) + b(1-F(T))}{\int_0^T (1-F(x))\,dx}. \]
(d) Show that if $a > b$ and the failure rate $r(t) = f(t)/(1-F(t))$ is increasing to $\infty$, then the optimal maintenance time $T$ exists uniquely.

5. (Block Replacement Maintenance Policy) Suppose a group of similar items (e.g., road lamps) are inspected at the fixed inspection points $\{kT\}$ for $k = 1, 2, \ldots$. At each inspection point, if an item has failed, it is replaced by a new item with cost $a$; and if it is still working, it is also replaced by a new one with cost $b$. Suppose the lifetime distribution is $F(x)$.
(a) Using the elementary renewal theorem, find the long-run availability for an item.
(b) Find the long-run average cost rate for an item.

6. Let $N(t)$ be a Poisson process with rate $\lambda$.
(a) Show that the total life $D(t) = A(t) + R(t)$ at time $t$ has the distribution
\[ P[D(t) \le x] = 1-(1+\lambda\min(t,x))e^{-\lambda x}, \quad x \ge 0. \]
(Hint: Note that $A(t)$ and $R(t)$ are independent, $A(t)$ has the same distribution as $\min(t, X)$ (by looking backward from time $t$) where $X$ is exponential, and $R(t)$ is exponential.)
(b) Show that
\[ E\,D(t) = \frac{1}{\lambda}\,(2-e^{-\lambda t}). \]

7. Let $N(t)$ be a renewal process with underlying distribution $\mathrm{Gam}(\lambda, 2)$. Show that the renewal function is given by
\[ m(t) = \frac{1}{2}\lambda t - \frac{1}{4}\,(1-e^{-2\lambda t}). \]
(Hint: Note that $N(t)$ has the same distribution as $[\tfrac{1}{2}M(t)]$, where $[x]$ denotes the largest integer $\le x$ and $M(t)$ is a Poisson process with rate $\lambda$. Thus, $m(t) = \tfrac{1}{2}E[M(t)] - \tfrac{1}{2}P[M(t)\text{ is odd}]$. Then use Problem 18 of Chap. 3.)

8. Show that if the underlying distribution $F(x)$ is NWUE, then
\[ U(t) \ge \frac{t}{\mu}. \]

9. Let $Y$ be an exponential random variable with parameter $\lambda$ which is independent of the renewal process $N(t)$ with interarrival times $\{X_i\}$ for $i = 1, 2, \ldots$.
(a) Show by the memoryless property of $Y$ that
\[ P[X_1 + \cdots + X_n \le Y] = (P[X_1 < Y])^n. \]
(b) Calculate $E[N(Y)]$.
Chapter 9
Risk Theory
9.1 Classical Risk Model

Suppose the claims arrive according to a Poisson process $\{N(t)\}$ with rate $\lambda$ and interarrival intervals $\{X_1, \ldots, X_n, \ldots\}$. The consecutive claim amounts $\{Y_1, \ldots, Y_n, \ldots\}$ are independent and identically distributed with continuous distribution $F(y)$ with mean $\mu > 0$, and are independent of the arrival times. The premium rate $c$ satisfies $c/\lambda > \mu$, and we write $c = \lambda\mu(1+\theta)$ with $\theta > 0$. This guarantees profitability in the long run. The risk process $R_t$ is defined as
\[ R_t = ct - \sum_{i=1}^{N(t)} Y_i. \]
For initial capital reserve $U_0 = u$, we define the surplus process as $U_t = u + R_t$ and the time of ruin as
\[ T = \inf\{t > 0 : U_t = u + R_t < 0\}. \]
Two fundamental problems related to the risk process are the (ultimate) ruin probability
\[ \psi(u) = P[T < \infty \mid U_0 = u] \]
and the distribution of the deficit at ruin
\[ \psi(u, y) = P[\,|u + R_T| \le y,\ T < \infty \mid U_0 = u]. \]
Without loss of generality, we assume $c = 1$. Note that the surplus process $U_t$ is linearly increasing between arrival times. Thus, ruin can only occur at the arrival times. Define
\[ \tau_u = \inf\Big\{n > 0 : u + \sum_{i=1}^n [X_i - Y_i] < 0\Big\} = \inf\Big\{n > 0 : \sum_{i=1}^n [Y_i - X_i] > u\Big\} \]
as the number of claims when ruin occurs. Thus, $\psi(u) = P[\tau_u < \infty]$.

A.K. Gupta et al., Probability and Statistical Models: Foundations for Problems in Reliability and Financial Mathematics, DOI 10.1007/978-0-8176-4987-6_9, © Springer Science+Business Media, LLC 2010

Denote $\bar\psi(u) = 1 - \psi(u)$ as the nonruin probability. In the following, we derive a renewal equation for $\bar\psi(u)$. After the first claim, the capital reserve becomes $u + X_1 - Y_1$. By conditioning on $X_1$ and $Y_1$, we have
\[\begin{aligned}
\bar\psi(u) &= E\,\bar\psi(u + X_1 - Y_1)\\
&= \int_0^\infty \lambda e^{-\lambda x}\int_0^{u+x}\bar\psi(u+x-y)\,dF(y)\,dx\\
&= \lambda e^{\lambda u}\int_u^\infty e^{-\lambda x}\int_0^x \bar\psi(x-y)\,dF(y)\,dx,
\end{aligned}\]
where in the last equation we change the variable $u + x$ to $x$. By differentiating both sides with respect to $u$, we get
\[ \bar\psi'(u) = \lambda\bar\psi(u) - \lambda\int_0^u \bar\psi(u-y)\,dF(y). \]
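Integrating both sides with respect to $u$ from $0$ to $t$, and then integrating by parts, we get
\[\begin{aligned}
\bar\psi(t) - \bar\psi(0) &= \lambda\int_0^t \bar\psi(u)\,du - \lambda\int_0^t\int_0^u \bar\psi(u-y)\,dF(y)\,du\\
&= \lambda\int_0^t \bar\psi(u)\,du - \lambda\int_0^t\int_y^t \bar\psi(u-y)\,du\,dF(y)\\
&= \lambda\int_0^t \bar\psi(u)\,du - \lambda\int_0^t\int_0^{t-y} \bar\psi(u)\,du\,dF(y)\\
&= \lambda\int_0^t \bar\psi(u)\,du + \lambda\Big\{\Big[\int_0^{t-y}\bar\psi(u)\,du\,(1-F(y))\Big]_{y=0}^{t} + \int_0^t \bar\psi(t-y)(1-F(y))\,dy\Big\}\\
&= \lambda\int_0^t \bar\psi(t-y)(1-F(y))\,dy.
\end{aligned}\]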
Thus, $\bar\psi(u)$ satisfies the following defective renewal equation:
\[ \bar\psi(t) = \bar\psi(0) + \frac{1}{1+\theta}\int_0^t \bar\psi(t-y)\,d\tilde F(y), \]
where
\[ \tilde F(y) = \frac{1}{\mu}\int_0^y (1-F(x))\,dx \]
is the equilibrium distribution of $F(y)$, and $\lambda\mu = 1/(1+\theta)$.
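Using the monotone property of $\bar\psi(u)$ in $u$ and noting that $\bar\psi(\infty) = 1$, we get
\[ 1 = \bar\psi(0) + \frac{1}{1+\theta}. \]
Thus,
\[ \psi(0) = 1 - \bar\psi(0) = \frac{1}{1+\theta}. \]
Thus, $\psi(u)$ satisfies the following renewal equation:
\[ \psi(u) = \frac{1}{1+\theta}\,(1-\tilde F(u)) + \frac{1}{1+\theta}\int_0^u \psi(u-y)\,d\tilde F(y). \]
Obviously, this is a defective renewal equation with $\rho = 1/(1+\theta)$.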
9.2 Approximation and Bounds for Ruin Probability

Suppose there exists a $\gamma > 0$ satisfying
\[ \int_0^\infty e^{\gamma y}\,d\tilde F(y) = 1+\theta. \]
$\gamma$ is called the Lundberg coefficient. Then, we have the following asymptotic result for $\psi(u)$.

Theorem 9.1. As $u\to\infty$,
\[ \psi(u) \sim \frac{\theta}{\gamma\mu_\gamma}\,e^{-\gamma u}, \]
where $\mu_\gamma = \int_0^\infty y e^{\gamma y}\,d\tilde F(y)$.

As mentioned earlier, the asymptotic result may not describe the case of finite $u$ very well, so a conservative bound may be needed. The following result is the well-known Lundberg bound.

Theorem 9.2. For all $u \ge 0$,
\[ \psi(u) \le e^{-\gamma u}. \]

Proof. Here, we again prove the result based on the following recursive renewal equations. Let
\[ \psi_0(u) = \frac{1}{1+\theta}\,(1-\tilde F(u)), \]
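and for $k \ge 1$,
\[ \psi_k(u) = \frac{1}{1+\theta}\,(1-\tilde F(u)) + \frac{1}{1+\theta}\int_0^u \psi_{k-1}(u-x)\,d\tilde F(x). \]
Then, we already know that $\psi_k(u) \uparrow \psi(u)$ for all finite $u$. Now we shall use induction to show that $\psi_k(u) \le e^{-\gamma u}$ for all $k$. For $k = 0$, it is easy to see that
\[\begin{aligned}
\frac{1}{1+\theta}\,(1-\tilde F(u)) &= \frac{1}{1+\theta}\int_u^\infty e^{\gamma u}\,d\tilde F(x)\,e^{-\gamma u}\\
&\le \frac{1}{1+\theta}\int_u^\infty e^{\gamma x}\,d\tilde F(x)\,e^{-\gamma u}\\
&\le \frac{1}{1+\theta}\int_0^\infty e^{\gamma x}\,d\tilde F(x)\,e^{-\gamma u}\\
&= e^{-\gamma u}.
\end{aligned}\]
Suppose the bound holds for $k-1$. Then for $k$,
\[\begin{aligned}
\psi_k(u) &\le \frac{1}{1+\theta}\int_u^\infty e^{\gamma x}\,d\tilde F(x)\,e^{-\gamma u} + \frac{1}{1+\theta}\int_0^u e^{-\gamma(u-x)}\,d\tilde F(x)\\
&= \frac{1}{1+\theta}\Big[\int_u^\infty e^{\gamma x}\,d\tilde F(x) + \int_0^u e^{\gamma x}\,d\tilde F(x)\Big]e^{-\gamma u}\\
&= e^{-\gamma u}.
\end{aligned}\]
By letting $k\to\infty$, we get the expected bound. □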
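Similar to the result given in Theorem 8.8, we have the following improved bounds for $\psi(u)$. The proof is omitted.

Theorem 9.3.
\[ \alpha(u)e^{-\gamma u} \le \psi(u) \le \beta(u)e^{-\gamma u}, \]
where
\[ \alpha(u) = \inf_{0\le x\le u}\left[\int_0^\infty e^{\gamma z}\,d_z\,\frac{\tilde F(x+z)}{1-\tilde F(x)}\right]^{-1} \]
and
\[ \beta(u) = \sup_{0\le x\le u}\left[\int_0^\infty e^{\gamma z}\,d_z\,\frac{\tilde F(x+z)}{1-\tilde F(x)}\right]^{-1}. \]
When $F(y)$ is 2-NWU, i.e., $\tilde F(y)$ is NWU (new worse than used), then
\[ \psi(u) \le \frac{1}{1+\theta}\,e^{-\gamma u}. \]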
When $F(y)$ is 2-NBU, i.e., $\tilde F(y)$ is NBU (new better than used), then
\[ \psi(u) \ge \frac{1}{1+\theta}\,e^{-\gamma u}. \]

Remark. A sufficient condition for $F(y)$ being 2-NWU (2-NBU) is that $F(y)$ is IMRL (DMRL). In fact, $F(y)$ being IMRL implies that $\tilde F(y)$ is DFR, since the failure rate of $\tilde F(y)$ is equal to $(1-F(y))/\int_y^\infty (1-F(x))\,dx$.

Example 9.1. When $F(y)$ is exponential with rate $1/\mu$, then $\gamma$ satisfies
\[ \int_0^\infty e^{\gamma x}\,\frac{1}{\mu}\,e^{-x/\mu}\,dx = 1+\theta, \]
which gives $\gamma = \theta/(\mu(1+\theta))$. The lower bound is equal to the upper bound and
\[ \psi(u) = \frac{1}{1+\theta}\,e^{-\gamma u}. \]
That means the bounds given in Theorem 9.3 are sharp.
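As a numerical companion to Example 9.1, the sketch below solves the defining equation of the Lundberg coefficient by bisection and compares the result with the closed form $\gamma = \theta/(\mu(1+\theta))$. The function name `lundberg_coefficient`, the parameter values, and the use of bisection are illustrative assumptions; the only inputs taken from the text are the equation $\int_0^\infty e^{\gamma y}\,d\tilde F(y) = 1+\theta$ and the exponential case.

```python
import math

def lundberg_coefficient(mgf_equilibrium, theta, upper, tol=1e-12):
    """Solve mgf_equilibrium(gamma) = 1 + theta on (0, upper) by bisection.
    Assumes the moment generating function is finite and increasing there."""
    lo, hi = 0.0, upper
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mgf_equilibrium(mid) < 1.0 + theta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mu, theta = 2.0, 0.25  # hypothetical mean claim size and loading

def mgf(g):
    # For exponential claims the equilibrium distribution is again exponential
    # with mean mu, so its moment generating function is 1 / (1 - g * mu), g < 1 / mu.
    return 1.0 / (1.0 - g * mu)

gamma = lundberg_coefficient(mgf, theta, upper=(1.0 / mu) * (1.0 - 1e-9))
closed_form = theta / (mu * (1.0 + theta))

def ruin_probability(u):
    """psi(u) = exp(-gamma * u) / (1 + theta) for exponential claims (Example 9.1)."""
    return math.exp(-gamma * u) / (1.0 + theta)

print(gamma, closed_form, ruin_probability(10.0))
```

For other claim distributions only `mgf` and the search interval would change, provided the moment generating function of the equilibrium distribution is available.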
9.3 Deficit at Ruin

To evaluate the distribution of the deficit at ruin, we further notice that ruin can occur only at claim points, and in fact only at those claim points called ladder points. More specifically, let $S_n = \sum_{i=1}^n (Y_i - X_i)$. We define the ladder times as follows:
\[ \tau_+^{(1)} = \tau_+ = \inf\{n > 0 : S_n > 0\} \]
and
\[ \tau_+^{(k+1)} = \inf\{n > \tau_+^{(k)} : S_n > S_{\tau_+^{(k)}}\} \]
for $k = 1, 2, \ldots$. Obviously, the ladder times and heights $\{\tau_+^{(k+1)} - \tau_+^{(k)},\ S_{\tau_+^{(k+1)}} - S_{\tau_+^{(k)}}\}$ are independent and identically distributed as $\{\tau_+, S_{\tau_+}\}$ for $k = 1, 2, \ldots$.

Ruin can only occur at the ladder points $\{\tau_+^{(k)}\}$, $k = 1, 2, \ldots$. At these ladder times, the risk process $S_{\tau_+^{(k)}} = \sum_{i=1}^k Z_i$, where the $Z_i$ are the ladder heights, can be seen as a defective renewal process with terminating probability $P[\tau_+ = \infty]$ and underlying distribution
\[ H(y) = P[S_{\tau_+} \le y \mid \tau_+ < \infty]. \]
Note that
\[ \psi(0) = P[\tau_+ < \infty] = 1/(1+\theta). \]
By conditioning on the value of $S_{\tau_+}$ given $\tau_+ < \infty$, we have another form of defective renewal equation for $\psi(u)$:
\[ \psi(u) = \frac{1}{1+\theta}\,(1-H(u)) + \frac{1}{1+\theta}\int_0^u \psi(u-x)\,dH(x), \]
which has the same form as the one derived in the first section. By taking Laplace transforms of both sides of the two equations, we see that $H(x)$ and $\tilde F(x)$ have the same Laplace transform. Thus,
\[ H(x) = \tilde F(x) = \frac{1}{\mu}\int_0^x (1-F(y))\,dy, \]
which is the equilibrium distribution of $F(x)$. That means the ladder height $S_{\tau_+}$ given $\tau_+ < \infty$ has the distribution $\tilde F$, or
\[ P[S_{\tau_+} \le y,\ \tau_+ < \infty] = \frac{1}{1+\theta}\,\tilde F(y). \]
Now we define
\[ \psi(u, y) = P[T < \infty,\ |U_T| \le y]. \]
By conditioning on the value of $S_{\tau_+}$ given $\tau_+ < \infty$, we have the following defective renewal equation for $\psi(u, y)$:
\[ \psi(u, y) = \frac{1}{1+\theta}\,[H(u+y) - H(u)] + \frac{1}{1+\theta}\int_0^u \psi(u-x, y)\,dH(x). \]
Under the existence of $\gamma > 0$, we define a changed probability under which the distribution of $S_{\tau_+}$ becomes
\[ dH_\gamma(x) = e^{\gamma x}\,\frac{1}{1+\theta}\,dH(x). \]
Then, $H_\gamma(x)$ is a proper distribution function with mean
\[ \frac{1}{1+\theta}\int_0^\infty x e^{\gamma x}\,d\tilde F(x) > 0. \]
By changing the defective renewal equation to a proper renewal equation, we have the following asymptotic result.

Theorem 9.4. Under the existence of $\gamma > 0$, as $u\to\infty$,
\[ \psi(u, y) \sim \frac{\int_0^\infty e^{\gamma x}(H(y+x) - H(x))\,dx}{\int_0^\infty x e^{\gamma x}\,dH(x)}\,e^{-\gamma u}, \]
and
\[ P[\,|U_T| \le y \mid T < \infty] \to \frac{\gamma}{\theta}\int_0^\infty e^{\gamma x}(H(y+x) - H(x))\,dx. \]
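Corollary 9.1. Suppose $\gamma > 0$ exists.
(a) If $F(x)$ is 2-NWU, then $P[\,|U_T| \le y \mid T < \infty] \le H(y)$, and thus
\[ \psi(u, y) \le \frac{H(y)}{1+\theta}\,e^{-\gamma u}. \]
(b) If $F(x)$ is 2-NBU, then $P[\,|U_T| \le y \mid T < \infty] \ge H(y)$, and thus
\[ \psi(u, y) \ge \frac{H(y)}{1+\theta}\,e^{-\gamma u}. \]

Proof. If $F(x)$ is 2-NWU, then $H(x)$ is NWU. That means $1-H(x+y) \ge (1-H(x))(1-H(y))$. Thus,
\[\begin{aligned}
\frac{\gamma}{\theta}\int_0^\infty e^{\gamma x}(H(y+x) - H(x))\,dx &= \frac{\gamma}{\theta}\int_0^\infty e^{\gamma x}\big(1-H(x) - (1-H(y+x))\big)\,dx\\
&\le \frac{\gamma}{\theta}\int_0^\infty e^{\gamma x}(1-H(x))\big(1-(1-H(y))\big)\,dx\\
&= H(y).
\end{aligned}\]
Combining this with the bound for $P[T < \infty]$, we get the upper bound. The lower bound follows similarly. □

Example 9.2. Let $F(x)$ be exponential with mean $\mu$. Then $H(y) = \tilde F(y) = F(y)$. The lower bound equals the upper bound. That means
\[ \psi(u, y) = \frac{1-e^{-y/\mu}}{1+\theta}\,e^{-\gamma u}, \]
where $\gamma = \theta/(\mu(1+\theta))$. Thus, the bounds given in the above corollary are sharp.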
9.4 Large Claim Case

The large claim case has recently attracted extensive research in risk theory. Here, large claim means that the distribution of claim sizes has a heavier tail than the exponential distribution. Typical examples include log-normal, Weibull, and Pareto distributions. In this case the Lundberg coefficient does not exist, so the simple exponential approximations and bounds for the ruin probability do not hold. Theoretically, it is difficult to get any accurate approximations. There are two main approaches to deal with this case. One is to give bounds based on heavy-tailed distributions that belong to certain lifetime distribution classes. The other is to give approximations within certain classes of heavy-tailed distributions. In this section, we first give a method of obtaining simple bounds based on the induction method. Then, we introduce some approximation results based on some simple heavy-tailed distribution classes.
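9.4.1 Bounds in terms of NWU (NBU) Distribution Classes

Let $B_\gamma(y)$ be a family of distribution functions for $0 \le \gamma < \infty$ satisfying $B_0(y) = 0$ and $B_\infty(y) = 1$ for all $y$, and write $\bar B_\gamma = 1 - B_\gamma$. Suppose there exists a unique $\gamma > 0$ satisfying
\[ \int_0^\infty (\bar B_\gamma(y))^{-1}\,d\tilde F(y) = 1+\theta. \]
Here $\gamma$ plays the role of the Lundberg coefficient. The following theorem gives an upper bound for the ruin probability $\psi(u)$ when $B_\gamma(y)$ is NWU.

Theorem 9.5. Suppose $B_\gamma(y)$ is NWU and $B_\gamma(0) = 0$. Then
\[ \psi(u) \le \beta\,\bar B_\gamma(u), \]
where $\beta = \sup_{0\le x<\infty}[h(x)]^{-1}$ and
\[ h(x) = \int_0^\infty \left(\frac{\bar B_\gamma(z+x)}{\bar B_\gamma(x)}\right)^{-1} d_z\!\left(\frac{\tilde F(x+z)}{1-\tilde F(x)}\right). \]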
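Proof. We again use induction on the iterative renewal equations
\[ \psi_0(u) = \frac{1}{1+\theta}\,(1-\tilde F(u)), \]
and for $k \ge 1$,
\[ \psi_k(u) = \frac{1}{1+\theta}\,(1-\tilde F(u)) + \frac{1}{1+\theta}\int_0^u \psi_{k-1}(u-x)\,d\tilde F(x). \]
First, for $k = 0$,
\[\begin{aligned}
\psi_0(u) &= \frac{1}{1+\theta}\,(1-\tilde F(u))\\
&= \frac{1}{1+\theta}\,[h(u)]^{-1}\bar B_\gamma(u)\int_u^\infty (\bar B_\gamma(y))^{-1}\,d\tilde F(y)\\
&\le \beta\,\frac{\bar B_\gamma(u)}{1+\theta}\int_u^\infty (\bar B_\gamma(y))^{-1}\,d\tilde F(y)\\
&\le \beta\,\frac{\bar B_\gamma(u)}{1+\theta}\int_0^\infty (\bar B_\gamma(y))^{-1}\,d\tilde F(y)\\
&= \beta\,\bar B_\gamma(u).
\end{aligned}\]
Now suppose the result is true for $k-1$. Then for $k$,
\[\begin{aligned}
\psi_k(u) &= \frac{1}{1+\theta}\,[h(u)]^{-1}\bar B_\gamma(u)\int_u^\infty (\bar B_\gamma(y))^{-1}\,d\tilde F(y) + \frac{1}{1+\theta}\int_0^u \psi_{k-1}(u-y)\,d\tilde F(y)\\
&\le \frac{\beta}{1+\theta}\Big[\bar B_\gamma(u)\int_u^\infty (\bar B_\gamma(y))^{-1}\,d\tilde F(y) + \int_0^u \bar B_\gamma(u-y)\,d\tilde F(y)\Big]\\
&\le \frac{\beta}{1+\theta}\Big[\bar B_\gamma(u)\int_u^\infty (\bar B_\gamma(y))^{-1}\,d\tilde F(y) + \int_0^u \bar B_\gamma(u)(\bar B_\gamma(y))^{-1}\,d\tilde F(y)\Big]\\
&= \frac{\beta}{1+\theta}\,\bar B_\gamma(u)\Big[\int_u^\infty (\bar B_\gamma(y))^{-1}\,d\tilde F(y) + \int_0^u (\bar B_\gamma(y))^{-1}\,d\tilde F(y)\Big]\\
&= \beta\,\bar B_\gamma(u),
\end{aligned}\]
where in the last inequality we use the NWU property of $B_\gamma(y)$, which implies
\[ \bar B_\gamma(u-y) \le \bar B_\gamma(u)(\bar B_\gamma(y))^{-1}. \]
By letting $k\to\infty$, we get the expected result. □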
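A lower bound can be obtained when $B_\gamma(y)$ is NBU.

Theorem 9.6. Suppose $B_\gamma(y)$ is NBU and $B_\gamma(0) = 0$. Then
\[ \psi(u) \ge \alpha\,\bar B_\gamma(u), \]
where $\alpha = \inf_{0\le x<\infty}[h(x)]^{-1}$.

Proof. We first introduce a proper distribution
\[ F_\gamma(x) = \frac{1}{1+\theta}\int_0^x (\bar B_\gamma(y))^{-1}\,d\tilde F(y), \]
and we shall show that
\[ \psi_k(u) \ge \alpha\,\bar B_\gamma(u)\,(1-F_\gamma^{(k)}(u)) \]
for all $k$. For $k = 0$,
\[\begin{aligned}
\psi_0(u) &= \frac{1}{1+\theta}\,(1-\tilde F(u))\\
&= \frac{1}{1+\theta}\,[h(u)]^{-1}\bar B_\gamma(u)\int_u^\infty (\bar B_\gamma(y))^{-1}\,d\tilde F(y)\\
&\ge \alpha\,\bar B_\gamma(u)\,(1-F_\gamma(u)).
\end{aligned}\]
Suppose it is true for $k-1$. Then for $k$,
\[\begin{aligned}
\psi_k(u) &= \frac{1}{1+\theta}\,[h(u)]^{-1}\bar B_\gamma(u)\int_u^\infty (\bar B_\gamma(y))^{-1}\,d\tilde F(y) + \frac{1}{1+\theta}\int_0^u \psi_{k-1}(u-y)\,d\tilde F(y)\\
&\ge \alpha\,\bar B_\gamma(u)(1-F_\gamma(u)) + \frac{\alpha}{1+\theta}\int_0^u \bar B_\gamma(u-y)\,(1-F_\gamma^{(k-1)}(u-y))\,d\tilde F(y)\\
&\ge \alpha\,\bar B_\gamma(u)(1-F_\gamma(u)) + \frac{\alpha}{1+\theta}\int_0^u \bar B_\gamma(u)(\bar B_\gamma(y))^{-1}(1-F_\gamma^{(k-1)}(u-y))\,d\tilde F(y)\\
&= \alpha\,\bar B_\gamma(u)\Big[(1-F_\gamma(u)) + \int_0^u (1-F_\gamma^{(k-1)}(u-y))\,dF_\gamma(y)\Big]\\
&= \alpha\,\bar B_\gamma(u)\,(1-F_\gamma^{(k)}(u)),
\end{aligned}\]
where we note that
\[ (1-F_\gamma(u)) + \int_0^u (1-F_\gamma^{(k-1)}(u-y))\,dF_\gamma(y) = 1-F_\gamma^{(k)}(u). \]
By letting $k\to\infty$ and noting that $F_\gamma^{(k)}(u)\to 0$ for all $u$, we get the expected result. □

In the following, we give two examples with typical large claims.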
Example 9.3 (Weibull distribution). Let $F(x) = 1-e^{-bx^a}$ for $0 < a \le 1$ and $b > 0$. Then $F(x)$ is DFR from Chapter 4 and thus NWU. Obviously, when $a < 1$, the Lundberg coefficient does not exist. Note that
\[ \mu = \Gamma(1+1/a)/b^{1/a}. \]
We select $B_\gamma(x) = 1-e^{-\gamma x^a}$. To force
\[ \frac{1}{\mu}\int_0^\infty (\bar B_\gamma(x))^{-1}(1-F(x))\,dx = 1+\theta, \]
we get
\[ \gamma = b\,\big(1-1/(1+\theta)^a\big). \]
From the above theorem, we get
\[ \psi(u) \le \beta\,\bar B_\gamma(u) = \beta\,e^{-b(1-1/(1+\theta)^a)u^a}. \]
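Example 9.4 (Pareto distribution). Let $F(x) = 1-(1+x/b)^{-(1+a)}$ for $a > 0$. Then $\mu = b/a$. Since $F(x)$ is DFR, it is NWU. Note that
\[ 1-\tilde F(x) = (1+x/b)^{-a}. \]
We select
\[ \bar B_\gamma(x) = (1+x/b)^{-\gamma}. \]
Then $\gamma$ satisfies
\[ \frac{1}{\mu}\int_0^\infty (\bar B_\gamma(y))^{-1}\bar F(y)\,dy = \frac{1}{\mu}\int_0^\infty (1+x/b)^{-(1+a-\gamma)}\,dx = 1+\theta. \]
This gives
\[ \frac{a}{a-\gamma} = 1+\theta, \quad\text{i.e.,}\quad \gamma = \frac{a\theta}{1+\theta}. \]
By using the fact that
\[ \frac{\bar B_\gamma(x+z)}{\bar B_\gamma(x)} = \Big(1+\frac{z}{b+x}\Big)^{-\gamma}, \]
we get
\[ \beta = \sup_{x\ge 0}\left[\int_0^\infty \Big(1+\frac{z}{b+x}\Big)^{\gamma} d_z\Big(1-\Big(1+\frac{z}{b+x}\Big)^{-a}\Big)\right]^{-1} = \Big(\frac{a}{a-\gamma}\Big)^{-1} = \frac{1}{1+\theta}. \]
Thus, we have the upper bound
\[ \psi(u) \le \frac{1}{1+\theta}\,(1+u/b)^{-a\theta/(1+\theta)}. \]
This bound is sharp at $u = 0$, as we know $\psi(0) = \frac{1}{1+\theta}$. It gets better as $\theta$ gets larger.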
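The bound of Example 9.4 is easy to evaluate. The sketch below is only an illustration with hypothetical parameter values; the function names are assumptions. It computes $\gamma = a\theta/(1+\theta)$, checks that it satisfies $a/(a-\gamma) = 1+\theta$, and evaluates the polynomially decaying bound.

```python
def pareto_gamma(a, theta):
    """Coefficient gamma = a * theta / (1 + theta) from Example 9.4."""
    return a * theta / (1.0 + theta)

def pareto_ruin_bound(u, a, b, theta):
    """Upper bound (1 + u/b)^(-gamma) / (1 + theta) on the ruin probability
    for claims with F(x) = 1 - (1 + x/b)^(-(1 + a))."""
    gamma = pareto_gamma(a, theta)
    return (1.0 + u / b) ** (-gamma) / (1.0 + theta)

a, b, theta = 2.0, 1.0, 0.25  # hypothetical shape, scale, and loading
gamma = pareto_gamma(a, theta)
for u in (0.0, 10.0, 100.0, 1000.0):
    print(u, pareto_ruin_bound(u, a, b, theta))
```

The printed values decay like a power of $u$ rather than exponentially, which is the qualitative difference from the Lundberg bound of Section 9.2.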
9.4.2 Subexponential Classes

To obtain accurate approximations for the ruin probability in the heavy-tailed case, we need to characterize the tail behavior of the claim size distribution. For independent and identically distributed random variables $Y_1, Y_2, \ldots$ with common distribution $G(y)$, large claims should dominate the accumulated claim amount. That means $Y_1 + \cdots + Y_n$ should have the same right-tail behavior as $\max(Y_1, \ldots, Y_n)$, i.e.,
\[ P\Big[\sum_{i=1}^n Y_i > x\Big] \sim P\Big[\max_{1\le i\le n} Y_i > x\Big] \]
as $x\to\infty$. Note that
\[ P\Big[\max_{1\le i\le n} Y_i > x\Big] = 1 - P\Big[\max_{1\le i\le n} Y_i \le x\Big] = 1-(G(x))^n \sim n(1-G(x)), \]
where the last step is due to L'Hospital's rule as $x\to\infty$. Formally, we introduce the following subexponential distribution.

Definition 9.1. The distribution $G(x)$ is called subexponential, denoted by $G\in\mathcal{S}$, if and only if
\[ \lim_{x\to\infty}\frac{1-G^{(n)}(x)}{1-G(x)} = n \quad\text{for all } n. \]

The following theorem shows that in the above definition, we only need the result to hold for $n = 2$.

Theorem 9.7. If
\[ \lim_{x\to\infty}\frac{1-G^{(2)}(x)}{1-G(x)} = 2, \]
then $G(x)\in\mathcal{S}$.
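Proof. We first note that, for $0 \le y \le x$,
\[\begin{aligned}
\frac{1-G^{(2)}(x)}{1-G(x)} &= 1 + \frac{G(x)-G^{(2)}(x)}{1-G(x)}\\
&= 1 + \int_0^y \frac{1-G(x-t)}{1-G(x)}\,dG(t) + \int_y^x \frac{1-G(x-t)}{1-G(x)}\,dG(t)\\
&\ge 1 + G(y) + \frac{1-G(x-y)}{1-G(x)}\,(G(x)-G(y)).
\end{aligned}\]
Thus, for $G(x)-G(y) > 0$, we have
\[ 1 \le \frac{1-G(x-y)}{1-G(x)} \le \left(\frac{\bar G^{(2)}(x)}{\bar G(x)} - 1 - G(y)\right)(G(x)-G(y))^{-1}. \]
As $x\to\infty$, since
\[ \frac{1-G^{(2)}(x)}{1-G(x)} \to 2, \]
we see that
\[ \frac{\bar G(x-y)}{\bar G(x)} \to 1. \]
Next we prove the result by induction. Suppose the result is true for $n$. Then for $n+1$,
\[\begin{aligned}
\frac{\bar G^{(n+1)}(x)}{\bar G(x)} &= 1 + \frac{G(x)-G^{(n+1)}(x)}{\bar G(x)}\\
&= 1 + \int_0^x \frac{\bar G^{(n)}(x-t)}{\bar G(x)}\,dG(t)\\
&= 1 + \left(\int_0^{x-y} + \int_{x-y}^x\right)\frac{\bar G^{(n)}(x-t)}{\bar G(x-t)}\,\frac{\bar G(x-t)}{\bar G(x)}\,dG(t)\\
&= 1 + I_1(x) + I_2(x).
\end{aligned}\]
For $I_2(x)$, the ratio $\bar G^{(n)}(x-t)/\bar G(x-t)$ is bounded for $x-y \le t \le x$, and
\[ \int_{x-y}^x \frac{\bar G(x-t)}{\bar G(x)}\,dG(t) \le \frac{G(x)-G(x-y)}{\bar G(x)} = \frac{\bar G(x-y)}{\bar G(x)} - 1 \to 0. \]
Thus, $I_2(x)\to 0$. For $I_1(x)$, we note that for $0 \le t \le x-y$ and $y$ sufficiently large,
\[ \frac{\bar G^{(n)}(x-t)}{\bar G(x-t)} = n + o(1), \]
and hence
\[ \int_0^{x-y}\frac{\bar G(x-t)}{\bar G(x)}\,dG(t) = \frac{G(x)-G^{(2)}(x)}{\bar G(x)} - \int_{x-y}^x \frac{\bar G(x-t)}{\bar G(x)}\,dG(t) = 1 + o(1). \]
Thus, $I_1(x) = n + o(1)$, which completes the proof. □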
From the result for defective renewal equations, we see that the ruin probability can be written as the series
\[ \psi(u) = \frac{\theta}{1+\theta}\sum_{n=0}^\infty \Big(\frac{1}{1+\theta}\Big)^n (1-\tilde F^{(n)}(u)), \]
where $\tilde F(x) = \frac{1}{\mu}\int_0^x (1-F(z))\,dz$ is the equilibrium distribution of $F(x)$. By assuming $\tilde F(x)\in\mathcal{S}$, we have as $u\to\infty$,
\[ \psi(u) \sim \frac{\theta}{1+\theta}\sum_{n=0}^\infty \Big(\frac{1}{1+\theta}\Big)^n n\,(1-\tilde F(u)) = \frac{1}{\theta}\,(1-\tilde F(u)). \]
Note that $\lambda\mu(1+\theta) = c = 1$, so that $\theta\mu = 1/\lambda - \mu$. Thus, we have the following result:

Theorem 9.8. Suppose $\tilde F(x)\in\mathcal{S}$. Then as $u\to\infty$,
\[ \psi(u) \sim \frac{1}{\frac{1}{\lambda}-\mu}\int_u^\infty (1-F(z))\,dz. \]

Sometimes it is more convenient to give sufficient conditions directly for $F(y)$ which imply $\tilde F(x)\in\mathcal{S}$. The following definition gives a subclass of $\mathcal{S}$.

Definition 9.2. We call $F(x)\in\mathcal{S}^*$ if and only if
\[ \lim_{x\to\infty}\frac{\bar F^{(n)}(x)}{\bar F(x)} = n\mu^{n-1} \quad\text{for all } n, \]
where
\[ \bar F^{(n)}(x) = \int_0^x \bar F^{(n-1)}(y)\,\bar F(x-y)\,dy, \]
with $\bar F^{(1)}(x) = \bar F(x)$.

Theorem 9.9. $F\in\mathcal{S}^*$ implies $\tilde F(x)\in\mathcal{S}$.

We shall not give the proof here. A different sufficient condition is given in the case when the failure rate function $r(t)$ exists.

Theorem 9.10. If the failure rate function $r(t)$ of $F(t)$ satisfies one of the following two conditions, then $\tilde F\in\mathcal{S}$:
(a) $\limsup_{x\to\infty} x r(x) < \infty$;
(b) $\lim_{x\to\infty} r(x) = 0$, $\lim_{x\to\infty} x r(x) = \infty$, and $\limsup_{x\to\infty} (x r(x)/R(x)) < 1$, where $R(x) = \int_0^x r(t)\,dt = -\ln \bar F(x)$.

From this theorem, we have the following two examples as illustration.

Example 9.5 (Weibull). Let $\bar F(x) = e^{-x^a}$ for $0 < a < 1$. Then $F\in\mathcal{S}^*$. Thus, as $u\to\infty$,
\[ \psi(u) \sim \frac{u^{1-a}}{a/\lambda - \Gamma(1/a)}\,e^{-u^a}. \]

Example 9.6 (Pareto). Let $\bar F(x) = (a/x)^b I_{[a,\infty)}(x)$ for $a > 0$, $b > 1$. Then $F(x)\in\mathcal{S}^*$. Thus,
\[ \psi(u) \sim \frac{a}{(b-1)/\lambda - ab}\Big(\frac{a}{u}\Big)^{b-1}, \quad\text{as } u\to\infty. \]
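The heavy-tail approximation can be evaluated directly from the integrated tail. The sketch below assumes the normalization $c = 1$ used in this chapter and the Pareto tail of Example 9.6; the function names and parameter values are hypothetical. It evaluates $\int_u^\infty (1-F(z))\,dz / (1/\lambda - \mu)$ and compares it with the same quantity written in the closed form $\frac{a}{(b-1)/\lambda - ab}(a/u)^{b-1}$.

```python
def pareto_tail_integral(u, a, b):
    """int_u^infty (a/x)^b dx for u >= a and b > 1."""
    return a / (b - 1.0) * (a / u) ** (b - 1.0)

def heavy_tail_ruin_approx(u, lam, mu, tail_integral):
    """Approximation psi(u) ~ tail_integral(u) / (1/lam - mu), assuming c = 1."""
    if lam * mu >= 1.0:
        raise ValueError("need lam * mu < 1, i.e. a positive loading when c = 1")
    return tail_integral(u) / (1.0 / lam - mu)

lam, a, b = 0.1, 1.0, 3.0           # hypothetical claim rate and Pareto parameters
mu = a * b / (b - 1.0)              # mean of the Pareto claim size on [a, infinity)
u = 10.0
approx = heavy_tail_ruin_approx(u, lam, mu, lambda x: pareto_tail_integral(x, a, b))
closed = a / ((b - 1.0) / lam - a * b) * (a / u) ** (b - 1.0)
print(approx, closed)
```

The approximation is asymptotic in $u$, so for moderate reserves it should be read as an order-of-magnitude guide rather than a bound.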
9.5 Risk Sharing and Stop-Loss Reinsurance

Reinsurance treaties provide a tool for sharing the risk between the insurer and the cedant. Consider an insurer who wants to find a reinsurance policy which gives the smallest expected harm for a fixed insurance risk premium $P$ (without any safety or expense loading). Such a policy can be considered optimal from the cedant's point of view because it provides the best protection against volatility for a fixed net risk premium. Typically, the harm is treated as a measure of risk, and the range of variation is a key concept. When alternative policies are compared, their optimality can often be assessed on the basis that the smaller the volatility, the better. The variance $\mathrm{Var}(X)$ is commonly used as the measure of variation. The smaller the variance, the smaller the risk. In many applications, however, only the positive deviations (large claim amounts) are harmful, whereas smaller values may even be beneficial. Thus, a more satisfactory approach is to use a convex function $h(X)$ of $X$ to measure the "harm" caused by a loss $X$. The "harmfulness" of a loss $X$ is now measured by the expected harm:
\[ H_X = E[h(X)] = \int_0^\infty h(x)\,dF(x). \]
The smaller the expected harm, the smaller the risk. Let $X_{tot}$ be the total aggregate amount of claims during a year and $X = X_{ced}$ the cedant's share. We consider the problem of finding the reinsurance arrangement which has the smallest expected harm $H_X$ subject to the conditions
\[ 0 \le X \le X_{tot} \quad\text{and}\quad E[X] = P. \]
We first give a result which generalizes Jensen's inequality.

Theorem 9.11. Suppose two random variables $X$ and $X^*$ satisfy
(a) $E[X] = E[X^*]$,
(b) there exists a constant $M$ such that $X^*$ is always between $X$ and $M$, i.e., either $X \le X^* \le M$ or $X \ge X^* \ge M$.
Then for any convex function $h(\cdot)$,
\[ E[h(X^*)] \le E[h(X)]. \]

Proof. Let $h'(x)$ denote either the right derivative or the left derivative if the two derivatives are different. From the convexity of $h(x)$, we have
\[ h(X) - h(X^*) \ge h'(M)(X - X^*), \]
since $X$ and $X^*$ are always either both to the left or both to the right of $M$. Taking the expectation of both sides, we get
\[ E[h(X)] - E[h(X^*)] \ge h'(M)(E(X) - E(X^*)) = 0. \] □

Remark. By taking $X^* = E(X) = M$, we get Jensen's inequality
\[ E[h(X)] \ge h(E(X)). \]

The next theorem gives the main result (Arnold 1963).
Theorem 9.12. For the above-defined optimization problem, the optimal reinsurance treaty is of the stop-loss type with
\[ X^* = X_{ced} = \min(X_{tot}, M), \]
where $M$, called the retention level, is such that $E[X^*] = P$.

Proof. Any admissible $X$ (with $0 \le X \le X_{tot}$ and $E[X] = P$) and $X^*$ satisfy the conditions given in the previous theorem. Thus, we have
\[ H_{X^*} = E[h(X^*)] \le E[h(X)] = H_X. \] □

Corollary 9.2. If $h(x) = (x-\mu)^2$, then the stop-loss treaty is the optimal solution of minimizing the variance $\sigma_X^2$ subject to $P = E[\min(X, M)]$.

Note that the insurer's coverage will be
\[ \tilde X = X_{tot} - X^* = X_{tot} - \min(X_{tot}, M) = (X_{tot} - M)^+. \]
The final result shows that the sum of the variances of the cedant's coverage and the insurer's coverage is smaller than the total variance.

Theorem 9.13. Under the optimal stop-loss reinsurance treaty,
\[ \mathrm{Var}(X_{tot}) \ge \mathrm{Var}(X^*) + \mathrm{Var}(\tilde X). \]

Proof. Obviously,
\[ \mathrm{Var}(X_{tot}) = \mathrm{Var}(X^* + \tilde X) = \mathrm{Var}(X^*) + \mathrm{Var}(\tilde X) + 2\,\mathrm{Cov}(X^*, \tilde X). \]
Note that both $X^*$ and $\tilde X$ are increasing functions of $X_{tot}$, and thus they are positively correlated. That means
\[ \mathrm{Cov}(X^*, \tilde X) \ge 0. \] □

Remark. The condition $P = E[X]$ can be generalized to include higher moments of $X$ as well. For example, we can require
\[ P = g(E[X], \mathrm{Var}[X]) \]
for a proper function $g(x, y)$. Under some mild conditions, one can show that the coverage by the insurer is of one of the following three forms:
(a) Stop-loss: $\tilde X = (X_{tot} - M)^+$;
(b) Quota share: $\tilde X = kX_{tot}$;
(c) Combination of the two: $\tilde X = k(X_{tot} - M)^+$.
We shall not give the details here.
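The stop-loss treaty can be illustrated on a small empirical distribution. The sketch below uses a hypothetical list of yearly total claims and a hypothetical premium; the helper names and the bisection search are assumptions. It finds the retention level $M$ with $E[\min(X_{tot}, M)] = P$ and then compares $\mathrm{Var}(X_{tot})$ with $\mathrm{Var}(X^*) + \mathrm{Var}(\tilde X)$ as in Theorem 9.13.

```python
def retention_level(claims, premium, tol=1e-10):
    """Find M with mean(min(x, M)) = premium by bisection, treating the list
    `claims` as an empirical distribution of nonnegative total claims."""
    n = len(claims)

    def mean_capped(M):
        return sum(min(x, M) for x in claims) / n

    if not 0.0 <= premium <= mean_capped(max(claims)):
        raise ValueError("premium must lie between 0 and the mean total claim")
    lo, hi = 0.0, max(claims)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_capped(mid) < premium:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def variance(xs):
    """Population variance of a list of numbers."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

claims = [0.5, 1.0, 1.5, 2.0, 4.0, 7.0, 12.0, 30.0]  # hypothetical yearly totals
M = retention_level(claims, premium=3.0)
capped = [min(x, M) for x in claims]         # X* = min(X_tot, M)
excess = [max(x - M, 0.0) for x in claims]   # X~ = (X_tot - M)^+
print(M, variance(claims), variance(capped) + variance(excess))
```

The gap between the two printed variances equals $2\,\mathrm{Cov}(X^*, \tilde X)$, which is nonnegative because both parts are nondecreasing functions of the total claim.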
Problems

1. Suppose the claims arrive according to a Poisson process $N(t)$ with rate $\lambda$ and the consecutive claim sizes $Y_1, \ldots, Y_n, \ldots$ follow the distribution $F(y)$ and are independent of $N(t)$.
(a) Using Wald's identity, show that the total claim amount up to time $t$, $Y_{tot} = \sum_{i=1}^{N(t)} Y_i$, has the mean $\mu_{tot} = \lambda\mu t$, where $\mu$ is the mean of $Y_i$.
(b) Suppose the moment generating function of $Y_i$ is $m_Y(s) = E\,e^{sY_i}$. What is the moment generating function of $Y_{tot}$?
(c) Suppose $F(y) = 1-e^{-y/\mu}$. What are the mean and moment generating function of $Y_{tot}$?

2. Under the classical risk model, suppose the claim size distribution is a mixture of two exponential distributions,
\[ F(y) = (1-p)(1-e^{-\lambda_1 y}) + p(1-e^{-\lambda_2 y}), \]
for some $\lambda_1 < \lambda_2$.
(a) Find the Lundberg coefficient.
(b) Find the approximation for the ruin probability.
(c) Give the bounds for the ruin probability.
(d) Find the bounds for the distribution of the deficit at ruin.

3. Under the classical risk model, suppose the claim size is exponential with mean $\mu$.
(a) What is the Lundberg coefficient?
(b) Give the exact formula for the ruin probability.
(c) Give the exact formula for the distribution of the deficit at ruin.

4. Under the stop-loss reinsurance model, suppose the total claim follows an exponential distribution with mean $1/\lambda$. For given premium $P$,
(a) Derive the retention level $M$;
(b) Calculate the sum of shared variances $\mathrm{Var}(X^*) + \mathrm{Var}(\tilde X)$.
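5. Under the stop-loss reinsurance model, suppose the total claim follows the mixture exponential density
\[ f(x) = (1-\alpha)\lambda_1 e^{-\lambda_1 x} + \alpha\lambda_2 e^{-\lambda_2 x}. \]
(a) Derive the retention level $M$ given premium $P$.
(b) Calculate $\mathrm{Var}(X^*)$ and $\mathrm{Var}(\tilde X)$.

6. Given $t > 0$, define
\[ \tilde F_t(x) = \begin{cases} \dfrac{\tilde F(x)}{\tilde F(t)}, & 0 \le x \le t,\\[1ex] 1, & x > t, \end{cases} \]
as the truncated $\tilde F(x)$ at time $t$. Suppose there exists a constant $\gamma_t$ satisfying
\[ \int_0^t e^{\gamma_t y}\,d\tilde F_t(y) = \frac{1+\theta}{\tilde F(t)} = 1+\theta_t. \]
(a) Show that, for $0 \le x \le t$,
\[ 1-\tilde F_t^{(n)}(x) = 1-\frac{\tilde F^{(n)}(x)}{[\tilde F(t)]^n}. \]
(b) Show that the corresponding ruin probability
\[ \psi_t(x) = \frac{\theta_t}{1+\theta_t}\sum_{n=1}^\infty \Big(\frac{1}{1+\theta_t}\Big)^n (1-\tilde F_t^{(n)}(x)) \]
satisfies
\[ \psi(x) = \frac{\theta\psi_t(x) + 1-\tilde F(t)}{\theta + 1-\tilde F(t)}. \]
(c) From the Lundberg bound for $\psi_t(x)$ for $0 \le x \le t$, show that
\[ \psi(x) \le \frac{\theta e^{-\gamma_t x} + 1-\tilde F(t)}{\theta + 1-\tilde F(t)}. \]
(d) Show that for $0 \le x \le t$,
\[ \psi(x) \le e^{-\gamma_t x} + \frac{1-\tilde F(t)}{\theta + 1-\tilde F(t)}. \]

7. Show that the Weibull distribution with $\bar F(x) = e^{-x^a}$ for $0 < a < 1$ is a subexponential distribution.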
8. Show that the Pareto distribution with $\bar F(x) = (a/x)^b I_{[a,\infty)}(x)$ for $a > 0$, $b > 1$ is subexponential.

9. A lifetime distribution $F(x)$ is said to belong to the class $\mathcal{L}$ if, for all $y$,
\[ \lim_{x\to\infty}\frac{1-F(x+y)}{1-F(x)} = 1. \]
Show that if the failure rate function $r(t)$ exists and is decreasing, the above condition is equivalent to $r(t)\to 0$ as $t\to\infty$.

10. A lifetime distribution $F(x)$ is said to belong to the class $\mathcal{D}$ if
\[ \limsup_{x\to\infty}\frac{1-F(x/2)}{1-F(x)} < \infty. \]
Show that if $F(x)$ is DFR and has the failure rate function $r(t)$, then $F(x)\in\mathcal{D}$ if $\lim_{x\to\infty} x r(x) < \infty$.

11. Show the following equality:
\[ 1-F^{(2)}(x) = 2\int_0^{x/2} (1-F(x-y))\,dF(y) + (1-F(x/2))^2. \]

12. Using the equality in the above question, show that if $F(x)$ belongs to both $\mathcal{L}$ and $\mathcal{D}$, then it belongs to $\mathcal{S}$.

13. Using the above result, show that the distribution with tail $1-F(x) = \frac{1}{(1+x)^\alpha}$ belongs to $\mathcal{S}$ for $\alpha > 0$.
Chapter 10
Asset Pricing Theory
10.1 Utility, Risk, and Pricing Kernel

10.1.1 Utility and Risk

An investor must decide how much to save and how much to consume, and what portfolio of assets to hold. In this chapter, we shall introduce the basic theory of asset pricing and portfolio management in the discrete time case. Consider the current time $t$. Let $e_t$ be the consumption level (if the investor bought none of the asset) and denote by $\xi$ the amount of a certain kind of asset he chooses to buy at a price of $p_t$ per unit. Here, the asset can be bonds, stocks, options, etc. Then the consumption at time $t$ will be
\[ C_t = e_t - p_t\xi. \]
Suppose that the payoff at time $t+1$ from each unit of asset is $X_{t+1}$, which is random. Then the total payoff (the consumption level) at time $t+1$ becomes
\[ C_{t+1} = X_{t+1}\xi. \]
Associated with the consumption level, we introduce the utility function $u(C_t)$ to measure how much "happiness" the investor can get. The utility function $u(c)$ is an increasing and concave function. Increasing means that more consumption gives more "happiness". Concave means that the first-order derivative $u'(c)$ is decreasing, i.e., the marginal utility decreases with increasing consumption. The utility function captures the fundamental desire for more consumption. Some examples of utility functions are
\[ u(c) = \frac{1}{1-\gamma}\,c^{1-\gamma} \ \text{ for } 0 < \gamma < 1, \qquad u(c) = \ln(c), \qquad\text{or}\qquad u(c) = -e^{-\alpha c}. \]

A.K. Gupta et al., Probability and Statistical Models: Foundations for Problems in Reliability and Financial Mathematics, DOI 10.1007/978-0-8176-4987-6_10, © Springer Science+Business Media, LLC 2010

To understand the relationship between the utility function and risk, suppose one invests $c$ and is willing to pay a premium $\pi$ for the random (risk) change $Z$ with mean $E[Z]$ during the next time period. That means
\[ u(c + E[Z] - \pi) = E[u(c+Z)]. \]
For simplicity, we assume $E[Z] = 0$. Let $\sigma^2 = \mathrm{Var}(Z)\to 0$ and $\pi\to 0$. By Taylor expansion of $u(x)$, we have
\[ u(c-\pi) = u(c) - \pi u'(c) + o(\pi); \]
\[ E[u(c+Z)] = E\Big[u(c) + Zu'(c) + \frac{1}{2}Z^2u''(c) + O(Z^3)\Big] = u(c) + \frac{1}{2}\sigma^2u''(c) + o(\sigma^2). \]
Thus, locally we have
\[ \pi \approx -\frac{1}{2}\sigma^2\,\frac{u''(c)}{u'(c)}. \]
We call the curvature of the utility function
\[ A(c) = -\frac{u''(c)}{u'(c)} = -\frac{d}{dc}\log u'(c) \]
the (absolute) risk-aversion coefficient. Obviously, the investor prefers a consumption stream that is steady over time.
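The local approximation $\pi \approx \frac{1}{2}\sigma^2 A(c)$ can be compared with an exact premium in a simple case. The sketch below assumes log utility, for which $A(c) = 1/c$, and a hypothetical symmetric two-point risk $Z = \pm z$ with probability $1/2$ each; the function names and numbers are illustrative assumptions.

```python
import math

def exact_risk_premium_log(c, z):
    """Premium pi solving log(c - pi) = E[log(c + Z)], where Z = +z or -z
    with probability 1/2 each (so E[Z] = 0 and Var(Z) = z**2)."""
    expected_utility = 0.5 * (math.log(c + z) + math.log(c - z))
    return c - math.exp(expected_utility)

def approx_risk_premium(c, sigma2, risk_aversion):
    """Local approximation pi ~ (1/2) * sigma^2 * A(c)."""
    return 0.5 * sigma2 * risk_aversion(c)

c, z = 100.0, 5.0
exact = exact_risk_premium_log(c, z)
approx = approx_risk_premium(c, z ** 2, lambda x: 1.0 / x)  # A(c) = 1/c for log utility
print(exact, approx)
```

For this two-point risk the exact premium is $c - \sqrt{c^2 - z^2}$, so the two values agree to second order in $z/c$ and drift apart as the risk becomes large relative to $c$.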
10.1.2 Asset Pricing Formula and Pricing Kernel By introducing a discount factor ˇ from the future, which represents impatience, we set our goal to maximize the total expected utility over time period Œt; t C 1/: U.Ct ; Ct C1 / D u.Ct / C ˇEt Œu.Ct C1 /; where Et Œ: means that the expectation is taken at time t. By plugging the conditions on Ct and Ct C1 into the objective function, we have U.Ct ; Ct C1 / D u.et pt / C ˇEt Œu.Xt C1 /:
10.1 Utility, Risk, and Pricing Kernel
201
Differentiating with respect to \(\xi\), we have
\[ p_t u'(C_t) = \beta E_t[u'(C_{t+1}) X_{t+1}]. \]
The above equation expresses the standard marginal condition for the optimum solution. On the left-hand side, \(p_t u'(C_t)\) is the marginal utility loss if the investor buys another unit of the asset. On the right-hand side, \(\beta E_t[u'(C_{t+1}) X_{t+1}]\) is the (discounted, expected) marginal utility gain from the extra payoff at \(t+1\). The investor continues to buy or sell the asset until the marginal loss equals the marginal gain. From this equation, we get the central asset pricing formula:
\[ p_t = E_t\left[\beta \frac{u'(C_{t+1})}{u'(C_t)} X_{t+1}\right]. \]
Given the payoff \(X_{t+1}\) and the investor's consumption levels \(C_t\) and \(C_{t+1}\), it tells what market price \(p_t\) to expect. Its economic content is simply the first-order conditions for the optimal consumption and portfolio formation. Most of the asset pricing theory consists of specialization and manipulation of this formula. We shall call
\[ m_{t+1} = \beta \frac{u'(C_{t+1})}{u'(C_t)} \]
the stochastic discount factor (or pricing kernel). So we can write
\[ p_t = E_t[m_{t+1} X_{t+1}], \]
which is the fundamental pricing formula.

Example 10.1 (Risk-free bond). For the one-period zero-coupon risk-free bonds from \(t\) to \(t+1\), the payoff is \(X_{t+1} = 1\), and the price is just equal to the discount factor:
\[ p_t = E_t[m_{t+1}] = \frac{1}{1 + r_{t+1}}, \]
where \(r_{t+1}\) is the risk-free rate. Thus, the mean pricing kernel is equal to the risk-free discount factor. Consequently, for \(n\)-period risk-free bonds of face value $1.00, the price at time 0 will be
\[ p_0 = \frac{1}{1 + r_1} \cdots \frac{1}{1 + r_n}. \]
In particular, for constant risk-free rate \(r_t = r\), \(p_0 = (1 + r)^{-n}\). In other words, for $1.00 investment at time 0, the payoff at time \(n\) will be \((1 + r)^n\).

Remark. Typically, the time unit for a period can be a day, a month, or a year. For example, a deposit of $1,000.00 at annual interest rate \(r\), compounded monthly, will have value \(1{,}000(1 + r/12)^n\) at the \(n\)-th month.
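The bond price and the compounding rule in the remark can be sketched in a few lines. The function names and the sample rates below are illustrative assumptions, not part of the text.

```python
def bond_price(rates):
    """Time-0 price of a risk-free zero-coupon bond with face value 1,
    given the one-period risk-free rates r_1, ..., r_n."""
    price = 1.0
    for r in rates:
        price /= 1.0 + r
    return price

def compounded_value(principal, annual_rate, periods, periods_per_year=12):
    """Value of a deposit after `periods` compounding periods."""
    return principal * (1.0 + annual_rate / periods_per_year) ** periods

print(bond_price([0.05, 0.05, 0.05]))        # constant rate: (1 + r)**(-n)
print(compounded_value(1000.0, 0.12, 2))     # two months at 12% annual rate
```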
Example 10.2. Consider a 30-year loan of $100,000.00 at 6% annual rate compounded monthly. Suppose an amount $C is paid at the end of each month. Then the total discounted amount of payments at the present time is
\[ C\left[\frac{1}{1 + 0.06/12} + \frac{1}{(1 + 0.06/12)^2} + \cdots + \frac{1}{(1 + 0.06/12)^{360}}\right] \]
\[ = C\,\frac{1}{1 + 0.06/12}\cdot\frac{1 - (1 + 0.06/12)^{-360}}{1 - (1 + 0.06/12)^{-1}} \]
\[ = C\,\frac{1 - (1 + 0.06/12)^{-360}}{0.06/12} = C \times 166.7916. \]
Thus, the mortgage payment for each month is
\[ C = 100{,}000/166.7916 = 599.5506. \]
When the length of the time unit approaches zero, the discount factor for one period becomes
\[ \lim_{N \to \infty} (1 + r/N)^{-N} = e^{-r}, \]
where \(r\) is called the continuously compounded interest rate.
\(\square\)
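The geometric-sum argument of Example 10.2 translates directly into a small payment calculator. This is a minimal sketch; the function names are chosen here for illustration.

```python
def annuity_factor(rate_per_period, periods):
    """Present value of 1 paid at the end of each period:
    (1 - (1 + i)**(-n)) / i."""
    i = rate_per_period
    return (1.0 - (1.0 + i) ** (-periods)) / i

def monthly_payment(loan, annual_rate, years):
    """Level end-of-month payment that repays `loan` with monthly compounding."""
    return loan / annuity_factor(annual_rate / 12.0, 12 * years)

print(annuity_factor(0.06 / 12, 360))
print(monthly_payment(100_000.0, 0.06, 30))
```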
Using Example 10.1, we can write
\[ p_t = E[m_{t+1} X_{t+1}] = E[m_{t+1}]E[X_{t+1}] + \mathrm{Cov}[m_{t+1}, X_{t+1}] \]
\[ = E\left[\frac{X_{t+1}}{1 + r_{t+1}}\right] + \mathrm{Cov}[m_{t+1}, X_{t+1}] = E\left[\frac{X_{t+1}}{1 + r_{t+1}}\right] + \frac{\mathrm{Cov}[\beta u'(C_{t+1}), X_{t+1}]}{u'(C_t)}. \]
The first term is the standard discounted expected payoff. The second term, the covariance between the pricing kernel and the payoff, is called the risk adjustment. Since \(u'(c)\) decreases with \(c\), \(p_t\) is lowered if the payoff is positively correlated with consumption. The economic interpretation is that the investor does not like uncertainty (risk) about consumption, so an increase of this uncertainty lowers the price of the asset.
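10.2 Models for Returns

10.2.1 \(\beta\)-Representation

For modeling purposes, we are often interested in the stochastic return rate \(R_{t+1}\) defined as
\[ 1 + R_{t+1} = \frac{X_{t+1}}{p_t}. \]
From the fundamental pricing formula by letting \(p_t = 1\), \(R_{t+1}\) satisfies
\[ 1 = E_t[m_{t+1}(1 + R_{t+1})]. \]
Using the fact that \(E[m_{t+1}] = 1/(1 + r_{t+1})\), we can write
\[ 1 = E[m_{t+1}(1 + R_{t+1})] = E[m_{t+1}]E[1 + R_{t+1}] + \mathrm{Cov}[m_{t+1}, 1 + R_{t+1}] = \frac{E[1 + R_{t+1}]}{1 + r_{t+1}} + \mathrm{Cov}[m_{t+1}, R_{t+1}]. \]
Thus, we have
\[ E[R_{t+1}] - r_{t+1} = -(1 + r_{t+1})\mathrm{Cov}[m_{t+1}, R_{t+1}] = -\frac{\mathrm{Cov}[m_{t+1}, R_{t+1}]}{E[m_{t+1}]} \]
\[ = -\frac{\mathrm{Var}(m_{t+1})}{E[m_{t+1}]}\cdot\frac{\mathrm{Cov}[m_{t+1}, R_{t+1}]}{\mathrm{Var}(m_{t+1})} = \beta_{t+1}\lambda_{t+1}, \]
where
\[ \lambda_{t+1} = -\frac{\mathrm{Var}(m_{t+1})}{E[m_{t+1}]}, \quad\text{and}\quad \beta_{t+1} = \frac{\mathrm{Cov}[m_{t+1}, R_{t+1}]}{\mathrm{Var}(m_{t+1})}. \]
Note that \(\beta_{t+1}\) is the slope coefficient of the linear projection of \(R_{t+1}\) (equivalently, of \(X_{t+1} = 1 + R_{t+1}\) when \(p_t = 1\)) on \(m_{t+1}\), in the sense that, together with an intercept \(a\), it minimizes
\[ E[(R_{t+1} - a - \beta m_{t+1})^2]. \]
Thus \(\beta_{t+1}\) is proportional to the covariance between \(R_{t+1}\) and \(m_{t+1}\). We call
\[ E[R_{t+1}] = r_{t+1} + \beta_{t+1}\lambda_{t+1} \]
the \(\beta\) pricing model for the return rate.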
The \(\beta\)-representation is important since it shows that every expected excess return is proportional to the regression coefficient \(\beta_{t+1}\) of the linear regression of \(R_{t+1}\) (equivalently, of \(X_{t+1} = 1 + R_{t+1}\) when \(p_t = 1\)) on \(m_{t+1}\). \(\lambda_{t+1}\) is often called the price of risk; it depends only on the pricing kernel. \(\beta_{t+1}\) is accordingly called the quantity of risk in each asset.
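The identity behind the \(\beta\)-representation, \(E[R] - r = \beta\lambda\) with \(\beta = \mathrm{Cov}[m, R]/\mathrm{Var}(m)\) and \(\lambda = -\mathrm{Var}(m)/E[m]\), can be verified on any finite state space. The sketch below uses a five-state economy with illustrative numbers; all names and values are assumptions made for the check.

```python
def expectation(probs, values):
    return sum(p * v for p, v in zip(probs, values))

def covariance(probs, xs, ys):
    ex, ey = expectation(probs, xs), expectation(probs, ys)
    return sum(p * (x - ex) * (y - ey) for p, x, y in zip(probs, xs, ys))

# Illustrative five-state economy.
probs = [0.2] * 5
m = [0.98, 0.96, 0.94, 0.92, 0.90]       # pricing kernel by state
payoff = [1.0, 2.0, 3.0, 4.0, 5.0]       # asset payoff by state

price = expectation(probs, [mi * x for mi, x in zip(m, payoff)])
R = [x / price - 1.0 for x in payoff]     # return rate by state
rf = 1.0 / expectation(probs, m) - 1.0    # from E[m] = 1 / (1 + r)

beta = covariance(probs, m, R) / covariance(probs, m, m)
lam = -covariance(probs, m, m) / expectation(probs, m)

excess = expectation(probs, R) - rf
print(excess, beta * lam)
```

Here the payoff is high exactly when the kernel is low, so \(\beta < 0\), \(\lambda < 0\), and the expected excess return is positive.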
10.2.2 Frontier Expression

Furthermore, by denoting \(\mathrm{Corr}(X, Y)\) as the correlation between \(X\) and \(Y\), we can also write
\[ E[R_{t+1}] - r_{t+1} = -\frac{\mathrm{Cov}[m_{t+1}, R_{t+1}]}{E[m_{t+1}]} = -\mathrm{Corr}(m_{t+1}, R_{t+1})\,\frac{\sqrt{\mathrm{Var}(m_{t+1})}}{E[m_{t+1}]}\,\sqrt{\mathrm{Var}(R_{t+1})}. \]
Thus,
\[ |E[R_{t+1}] - r_{t+1}| \le \frac{\sqrt{\mathrm{Var}(m_{t+1})}}{E[m_{t+1}]}\,\sqrt{\mathrm{Var}(R_{t+1})}, \]
or
\[ \frac{|E[R_{t+1}] - r_{t+1}|}{\sqrt{\mathrm{Var}(R_{t+1})}} \le \frac{\sqrt{\mathrm{Var}(m_{t+1})}}{E[m_{t+1}]}. \]
The right-hand side \(\sqrt{\mathrm{Var}(m_{t+1})}/E[m_{t+1}]\) is called the slope of the frontier. The frontier lines give the bounds for the mean \(E[R_{t+1}]\) in terms of \(\sqrt{\mathrm{Var}(R_{t+1})}\). The equality holds only when \(R_{t+1}\) is a linear function of \(m_{t+1}\), that is, when \(R_{t+1}\) and \(m_{t+1}\) are perfectly correlated. In this case, \(R_{t+1}\) is called the frontier return rate.
10.2.3 Log-Normal Model

The following is a classical example under this pricing model.

Example 10.3 (Log-Normal Model). Let us assume
\[ u(c) = \frac{1}{1 - \gamma} c^{1 - \gamma} \]
for \(0 < \gamma < 1\). Then \(u'(c) = c^{-\gamma}\). A common model is to assume that \(C_{t+1}/C_t\) is log-normally distributed and can be written as \(e^{Z_t}\), where \(Z_t\) is normal with mean \(\mu_t\) and variance \(\sigma_t^2\). We can write
\[ m_{t+1} = e^{\ln\beta - \gamma Z_t}. \]
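Thus,
\[ \frac{1}{1 + r_{t+1}} = E_t[m_{t+1}] = E[e^{\ln\beta - \gamma Z_t}] = e^{\ln\beta - \gamma\mu_t + \gamma^2\sigma_t^2/2}. \]
For the \(\beta\)-pricing model, we have
\[ \lambda_{t+1} = -\frac{\mathrm{Var}(m_{t+1})}{E[m_{t+1}]} = -\frac{E[m_{t+1}^2] - (E[m_{t+1}])^2}{E[m_{t+1}]} \]
\[ = -\frac{e^{2\ln\beta - 2\gamma\mu_t + 2\gamma^2\sigma_t^2} - \left(e^{\ln\beta - \gamma\mu_t + \gamma^2\sigma_t^2/2}\right)^2}{e^{\ln\beta - \gamma\mu_t + \gamma^2\sigma_t^2/2}} = -E[m_{t+1}]\left(e^{\gamma^2\sigma_t^2} - 1\right). \]
The slope of the frontier is equal to
\[ \frac{\sqrt{\mathrm{Var}(m_{t+1})}}{E[m_{t+1}]} = \sqrt{e^{\gamma^2\sigma_t^2} - 1}. \]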
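The closed forms of the log-normal model are easy to evaluate. The following sketch computes the risk-free rate, the price of risk, and the frontier slope from the moments of \(m = \beta e^{-\gamma Z}\) with \(Z\) normal; parameter values and function names are illustrative assumptions.

```python
import math

def lognormal_kernel_moments(beta, gamma, mu, sigma):
    """E[m] and E[m^2] for m = beta * exp(-gamma * Z), Z ~ N(mu, sigma^2)."""
    first = math.exp(math.log(beta) - gamma * mu + 0.5 * gamma ** 2 * sigma ** 2)
    second = math.exp(2 * math.log(beta) - 2 * gamma * mu + 2 * gamma ** 2 * sigma ** 2)
    return first, second

beta, gamma, mu, sigma = 0.97, 0.5, 0.02, 0.03   # illustrative values
em, em2 = lognormal_kernel_moments(beta, gamma, mu, sigma)

risk_free = 1.0 / em - 1.0          # from E[m] = 1 / (1 + r)
var_m = em2 - em ** 2
price_of_risk = -var_m / em         # lambda
slope = math.sqrt(var_m) / em       # slope of the frontier
print(risk_free, price_of_risk, slope)
```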
10.3 Examples of Risk Assets

Example 10.4. For stocks, \(X_{t+1} = p_{t+1} + d_{t+1}\), where \(d_{t+1}\) is the dividend. The gross return rate becomes
\[ 1 + R_{t+1} = \frac{X_{t+1}}{p_t} = \frac{p_{t+1}}{p_t} + \frac{d_{t+1}}{p_t}. \]
Quite often the stock price is adjusted backward in time to account for the dividends. In this case, \(X_{t+1} = p_{t+1}\), and
\[ 1 + R_{t+1} = \frac{p_{t+1}}{p_t}. \]

Example 10.5. For one-period European call options, suppose the strike (exercise) price is \(K\) and the stock price at the end of the period is \(S\). Then at the end of the period, the buyers have the right to buy the stock at the strike price \(K\) and sell it at price \(S\). Thus, the payoff
\(X_{t+1}\) will be the excess amount of the stock price \(S\) over the strike price \(K\). That means,
\[ X_{t+1} = \max(S - K, 0). \]
Similarly, a one-period put option with strike price \(K\) will have the payoff
\[ X_{t+1} = \max(K - S, 0), \]
by selling the stock at price \(K\). Both options are typical examples of derivative securities.

Remark. Although the pricing kernel is defined from the utility function, the concepts can be used independently for statistical modeling purposes by fitting historical data.

Example 10.6. Suppose a stock is liquidated in one period and its payoff is $k if state \(k\) occurs, for \(k = 1, 2, \ldots, 5\). Assume each state has equal probability of occurring. Let the pricing kernel \(m_{t+1}\) take the values 0.98, 0.96, 0.94, 0.92, and 0.90 for states 1 to 5, respectively.

(a) The risk-free rate can be calculated as
\[ \frac{1}{1 + r_t} = E_t[m_{t+1}] = \frac{1}{5}(0.98) + \frac{1}{5}(0.96) + \frac{1}{5}(0.94) + \frac{1}{5}(0.92) + \frac{1}{5}(0.9) = 0.94. \]
Thus, \(1 + r_t = 1/0.94 = 1.0638\).

(b) The stock price at current time \(t\) is calculated as
\[ p_t = E_t[m_{t+1} X_{t+1}] = \frac{1}{5}[0.98 \times 1 + 0.96 \times 2 + 0.94 \times 3 + 0.92 \times 4 + 0.9 \times 5] = 2.78. \]

(c) Similarly, if the strike price is $3 for a call option, then the payoffs will be 0 in states 1 to 3, 1 in state 4, and 2 in state 5. The price of this one-period call option is
\[ p_t = \frac{1}{5}[0.92 \times 1 + 0.9 \times 2] = 0.54. \]
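The three calculations of Example 10.6 all apply the same rule, \(p_t = E_t[m_{t+1} X_{t+1}]\), to different payoffs. A minimal sketch (variable and function names are chosen here for illustration):

```python
probs = [0.2] * 5                          # equally likely states 1..5
kernel = [0.98, 0.96, 0.94, 0.92, 0.90]    # pricing kernel m_{t+1} by state
stock_payoff = [1.0, 2.0, 3.0, 4.0, 5.0]   # payoff $k in state k
strike = 3.0

def price(payoffs):
    """p_t = E_t[m_{t+1} X_{t+1}] on a finite state space."""
    return sum(p * m * x for p, m, x in zip(probs, kernel, payoffs))

bond_price = price([1.0] * 5)              # E_t[m_{t+1}] = 1 / (1 + r_t)
gross_risk_free = 1.0 / bond_price         # 1 + r_t
stock_price = price(stock_payoff)
call_price = price([max(x - strike, 0.0) for x in stock_payoff])
print(gross_risk_free, stock_price, call_price)
```

The call price evaluates to 0.544 before rounding to two decimals.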
10.4 Risk-Neutral Probabilities

In practical situations, one rarely knows the true model form for the pricing kernel, but only the risk-free rate. Therefore, the calculation of prices should only be based on the risk-free rate. The following discussion provides a method: one finds the risk-neutral probabilities, and the price is then calculated under this probability measure. For risk-free assets (e.g., bonds), we already know
\[ 1 = E_t[m_{t+1}(1 + r_t)]. \]
Thus, we can define a new probability measure \(P^*_t(A)\) for any event \(A\) by
\[ P^*_t(A) = E_t[m_{t+1}(1 + r_t) I_A] = E_t[m_{t+1}(1 + r_t); A]. \]
Under this probability measure, we can write the price as
\[ p_t = E_t\left[m_{t+1}(1 + r_t)\frac{X_{t+1}}{1 + r_t}\right] = E^*_t\left[\frac{X_{t+1}}{1 + r_t}\right], \]
where \(E^*_t[\cdot]\) is defined as the expectation under the probabilities \(P^*_t(\cdot)\). That means, the price is just the expected discounted payoff under the new probabilities \(P^*_t(\cdot)\). Therefore, we call \(P^*_t(\cdot)\) the risk-neutral probabilities. The significance of this formula is that, theoretically, the calculation of price can be directly carried out under \(P^*_t(\cdot)\).

Example 10.6 (Cont'd).

(a) Since \(1 + r_t = 1.0638\), the risk-neutral probabilities are calculated as
\[ p^*_i = p_i\, m_{t+1}(i)(1 + r_t). \]
For example, \(p^*_1 = 0.2 \times 0.98 \times 1.0638 = 0.2085\), and \(p^*_2 = 0.2043\), \(p^*_3 = 0.2\), \(p^*_4 = 0.1957\), and \(p^*_5 = 0.1915\).

(b) The stock price can be calculated as
\[ p_t = [1 \times 0.2085 + 2 \times 0.2043 + \cdots + 5 \times 0.1915]/1.0638 = 2.78. \]

(c) The call-option price with strike price \(K = 3\) can be calculated as
\[ p_t = [1 \times 0.1957 + 2 \times 0.1915]/1.0638 = 0.54. \]
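The change of measure amounts to reweighting each state probability by \(m_{t+1}(1 + r_t)\). The sketch below recomputes the continued example this way; names are illustrative, and the result should agree with pricing directly through the kernel.

```python
probs = [0.2] * 5
kernel = [0.98, 0.96, 0.94, 0.92, 0.90]
stock_payoff = [1.0, 2.0, 3.0, 4.0, 5.0]

gross_rf = 1.0 / sum(p * m for p, m in zip(probs, kernel))   # 1 + r_t

# Risk-neutral probabilities p*_i = p_i * m(i) * (1 + r_t).
rn_probs = [p * m * gross_rf for p, m in zip(probs, kernel)]

def rn_price(payoffs):
    """Discounted expected payoff under the risk-neutral probabilities."""
    return sum(q * x for q, x in zip(rn_probs, payoffs)) / gross_rf

stock_price = rn_price(stock_payoff)
call_price = rn_price([max(x - 3.0, 0.0) for x in stock_payoff])
print([round(q, 4) for q in rn_probs], stock_price, call_price)
```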
10.5 Option Pricing for Binomial Model

10.5.1 Pricing Formula for Multiple Stages

The importance of the risk-neutral probability lies in the fact that if one can construct a risk-neutral probability measure directly, then it can be used to calculate the asset price, since the risk-free return rate is typically known. Let us generalize the above model to an \(N\)-period model. For simplicity, we assume that the risk-free return rate is a constant \(r_t = r\) and the payoff is the adjusted price \(X_t = p_t\) (which includes dividends as part of it) for \(t = 0, 1, 2, \ldots, N\). At time \(t\), under the risk-neutral probability measure, we already know that
\[ p_t = E^*_t\left[\frac{p_{t+1}}{1 + r}\,\Big|\, p_t\right], \]
i.e.,
\[ \frac{p_t}{(1 + r)^t} = E^*_t\left[\frac{p_{t+1}}{(1 + r)^{t+1}}\,\Big|\, p_t\right], \]
for \(t = 0, 1, 2, \ldots, N - 1\). In probability terms, \(\{p_t/(1 + r)^t\}\) forms a martingale. Denoting by \(E^*[\cdot]\) the joint expectation of \(E^*_1[\cdot], \ldots, E^*_N[\cdot]\), by the chain rule we have
\[ p_0 = E^*\left[\frac{p_N}{(1 + r)^N}\right] = E^*\left[\frac{p_k}{(1 + r)^k}\right], \]
for all \(k\). Therefore, if one knows the probability structure of \(p_k\) under the risk-neutral probability measure \(P^*[\cdot]\), then the price can be calculated. In the following, we give the details for the calculation of the European call option under the binomial model.
10.5.2 Binomial Model

Suppose the stock evolves following a binomial model such that the return rate \(R_t\) only takes two values \(u\) (up) and \(d\) (down) and
\[ P[R_t = u] = p = 1 - P[R_t = d], \]
where \(-1 < d < u\) and \(p\) does not depend on \(t\). Also assume that the risk-free return rate is \(r\). Then, the risk-neutral probability \(p^*\) satisfies
\[ r = E^*_t[R_t] = p^* u + (1 - p^*) d. \]
Thus,
\[ p^* = \frac{r - d}{u - d}. \]
Therefore, we naturally assume \(d < r < u\) for the existence of \(p^*\). Under this risk-neutral probability, for \(N\) steps, denote by \(M\) the number of periods in which the stock goes up. Then \(M\) is a binomial random variable with
\[ P^*[M = j] = \binom{N}{j}(p^*)^j(1 - p^*)^{N - j}, \]
for \(j = 0, 1, \ldots, N\). Let the initial price of the stock be \(S_0\); then the stock price at the end of the \(N\)th step is
\[ S_N = S_0(1 + u)^M(1 + d)^{N - M}. \]
For the European call option, let the strike price be \(K\); then the payoff at the end of the \(N\)th period is \(p_N = \max(0, S_N - K)\). Denote by \(m\) the smallest integer such that the payoff is positive,
\[ m = \min\{j \ge 0 : S_0(1 + u)^j(1 + d)^{N - j} > K\}, \]
i.e., when \(K \ge S_0(1 + d)^N\),
\[ m = 1 + \left[\ln\left(\frac{K}{S_0(1 + d)^N}\right)\Big/\ln\left(\frac{1 + u}{1 + d}\right)\right], \]
where \([x]\) represents the largest integer \(\le x\). Then the price of this call option can be calculated as
\[ p_0 = E^*\left[\frac{p_N}{(1 + r)^N}\right] = \frac{1}{(1 + r)^N}E^*[\max(0, S_N - K)] \]
\[ = \frac{1}{(1 + r)^N}\sum_{j=m}^{N} P^*[M = j]\left[S_0(1 + u)^j(1 + d)^{N - j} - K\right] \]
\[ = \frac{S_0}{(1 + r)^N}\sum_{j=m}^{N}\binom{N}{j}(p^*)^j(1 - p^*)^{N - j}(1 + u)^j(1 + d)^{N - j} - \frac{K}{(1 + r)^N}\sum_{j=m}^{N}P^*[M = j] \]
\[ = S_0\sum_{j=m}^{N}\binom{N}{j}\left(\frac{p^*(1 + u)}{1 + r}\right)^j\left(\frac{(1 - p^*)(1 + d)}{1 + r}\right)^{N - j} - \frac{K}{(1 + r)^N}\sum_{j=m}^{N}P^*[M = j] \]
\[ = S_0(1 - \mathrm{Bin}(m - 1, N, q^*)) - \frac{K}{(1 + r)^N}(1 - \mathrm{Bin}(m - 1, N, p^*)), \]
where
\[ \mathrm{Bin}(m - 1, N, p^*) = \sum_{j=0}^{m-1}P^*[M = j], \qquad q^* = \frac{p^*(1 + u)}{1 + r}, \]
and
\[ 1 - q^* = 1 - \frac{p^*(1 + u)}{1 + r} = \frac{1 - p^* + r - p^* u}{1 + r} = (1 - p^*)\frac{1 + d}{1 + r}. \]
Thus we have the well-known Cox–Ross–Rubinstein formula:
\[ p_0 = S_0(1 - \mathrm{Bin}(m - 1, N, q^*)) - \frac{K}{(1 + r)^N}(1 - \mathrm{Bin}(m - 1, N, p^*)). \]
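As a sanity check on the Cox–Ross–Rubinstein formula, the sketch below evaluates it with a binomial cumulative distribution function and compares the result with the direct discounted risk-neutral expectation of the payoff. Function names and the sample parameters are assumptions for illustration; `m` is found by a direct search rather than through the logarithm formula.

```python
from math import comb

def binom_cdf(k, n, p):
    """Bin(k, n, p) = P[M <= k] for M ~ Binomial(n, p); zero when k < 0."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(0, k + 1))

def crr_call_price(S0, K, r, u, d, N):
    p_star = (r - d) / (u - d)
    q_star = p_star * (1 + u) / (1 + r)
    # Smallest number of up moves with a positive payoff (N + 1 if none exists).
    m = next((j for j in range(N + 1)
              if S0 * (1 + u) ** j * (1 + d) ** (N - j) > K), N + 1)
    return (S0 * (1 - binom_cdf(m - 1, N, q_star))
            - K / (1 + r) ** N * (1 - binom_cdf(m - 1, N, p_star)))

def expected_payoff_price(S0, K, r, u, d, N):
    """Direct evaluation of E*[max(0, S_N - K)] / (1 + r)^N."""
    p_star = (r - d) / (u - d)
    total = 0.0
    for j in range(N + 1):
        S_N = S0 * (1 + u) ** j * (1 + d) ** (N - j)
        total += comb(N, j) * p_star ** j * (1 - p_star) ** (N - j) * max(S_N - K, 0.0)
    return total / (1 + r) ** N

args = (100.0, 100.0, 0.1, 0.3, -0.1, 2)   # S0, K, r, u, d, N (illustrative)
print(crr_call_price(*args), expected_payoff_price(*args))
```

Both routes return the same value, which is the point of the algebra above: the change from \(p^*\) to \(q^*\) absorbs the stock-price factor into the binomial weights.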
10.6 Portfolio Management

10.6.1 Discrete Financial Market

We first introduce the concept of a discrete financial market. Suppose at time \(t\), an investor holds \(x_{jt}\) units of risk asset \(j\) (securities) at price \(S_{jt}\) for \(j = 1, 2, \ldots, m\), and \(y_t\) units of risk-free bonds at price \(A_t\). Then the total wealth will be
\[ V_t = \sum_{j=1}^{m} x_{jt}S_{jt} + y_t A_t. \]
Automatically, we shall assume \(V_t \ge 0\) for all \(t\). The sequence \(\{x_{1t}, \ldots, x_{mt}, y_t\}\) is called an investment strategy.

Definition 10.1.
(a) An investment strategy is called self-financed if
\[ V_t = \sum_{j=1}^{m} x_{j(t+1)}S_{jt} + y_{t+1}A_t. \]
(b) An investment strategy is called predictable if \(\{x_{1(t+1)}, \ldots, x_{m(t+1)}, y_{t+1}\}\) depends only on the history of the wealth up to time \(t\).
(c) The market admits no arbitrage if no strategy is an arbitrage. Arbitrage means that \(V_0 = 0\) and \(P[V_t > 0] > 0\) for some \(t = 1, 2, \ldots\).
(d) A discrete financial market is called complete if there exists a unique risk-neutral probability measure \(P^*(\cdot)\).

Under the binomial model for only one type of stock introduced in the previous section, we have the following result.

Theorem 10.1. The binomial model for stocks and risk-free assets admits no arbitrage if and only if \(d < r < u\) or, equivalently, there exists a risk-neutral probability \(0 < p^* < 1\).

Proof. We only have to prove it for one step; the general case is similar. Take the bond price to be 1.
(a) If \(r \le d\), let \(x_0 = 1/S_0\) and \(y_0 = -1\). That means borrow one unit of risk-free bond and buy \(1/S_0\) shares of stock. Then \(V_1 = -r + d \ge 0\) if the stock goes down and \(V_1 = -r + u > 0\) if the stock goes up.
(b) If \(r \ge u\), we let \(x_0 = -1/S_0\) and \(y_0 = 1\). A similar result as in (a) holds.
(c) If \(d < r < u\), every portfolio with \(V_0 = 0\) must satisfy \(x_0 = a/S_0\), \(y_0 = -a\), for some real number \(a\). If \(a = 0\), obviously \(V_1 = 0\). If \(a > 0\), then \(V_1 = a(d - r) < 0\) when the stock goes down. If \(a < 0\), then \(V_1 = a(u - r) < 0\) when the stock goes up. That means, there is no arbitrage.
\(\square\)

Under general strategies, we have the following result. The proof is not given here.

Theorem 10.2. The no-arbitrage principle is equivalent to the existence of a unique positive risk-neutral probability measure.
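The one-step argument in the proof can be mirrored in code. The sketch below, with illustrative names and a bond price of 1 assumed, computes the time-1 wealth of the zero-cost portfolio holding \(a/S_0\) shares and \(-a\) bonds, and flags an arbitrage when the wealth is nonnegative in both states and positive in at least one.

```python
def one_step_wealth(a, r, u, d):
    """Time-1 wealth (down state, up state) of the zero-cost portfolio
    with a/S0 shares of stock and -a units of a bond priced at 1."""
    return a * (d - r), a * (u - r)

def is_arbitrage(a, r, u, d):
    down, up = one_step_wealth(a, r, u, d)
    return min(down, up) >= 0 and max(down, up) > 0

print(is_arbitrage(1.0, r=0.05, u=0.30, d=0.10))    # r <= d: buying stock is an arbitrage
print(is_arbitrage(1.0, r=0.10, u=0.30, d=-0.10))   # d < r < u: not an arbitrage
print(is_arbitrage(-1.0, r=0.10, u=0.30, d=-0.10))  # shorting stock is not one either
```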
10.6.2 Risk Management

We restrict our discussion to two types of securities. Suppose at time 0, one holds \(x_{10}\) and \(x_{20}\) shares of the two securities with prices \(S_{10}\) and \(S_{20}\), respectively. So the total wealth is
\[ V_0 = x_{10}S_{10} + x_{20}S_{20}. \]
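Suppose the returns for the two stocks are \(R_{10}\) and \(R_{20}\), respectively. Let
\[ w_1 = \frac{x_{10}S_{10}}{V_0}, \qquad w_2 = \frac{x_{20}S_{20}}{V_0}. \]
Then the wealth at time \(t = 1\) is
\[ V_1 = x_{10}S_{10}(1 + R_{10}) + x_{20}S_{20}(1 + R_{20}) = V_0(w_1(1 + R_{10}) + w_2(1 + R_{20})) = V_0(1 + w_1R_{10} + w_2R_{20}). \]
Thus, the return on the portfolio will be
\[ R_0 = w_1R_{10} + w_2R_{20}. \]
Denote by \(\mu_i = E[R_{i0}]\), \(\sigma_i^2 = \mathrm{Var}(R_{i0})\), and \(\rho_{12} = \mathrm{Corr}(R_{10}, R_{20})\). Then the mean and variance of \(R_0\) can be obtained as
\[ \mu_V = E[R_0] = w_1\mu_1 + w_2\mu_2, \]
and
\[ \sigma_V^2 = \mathrm{Var}(R_0) = \mathrm{Var}(w_1R_{10} + w_2R_{20}) = w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2w_1w_2\mathrm{Cov}(R_{10}, R_{20}) = w_1^2\sigma_1^2 + w_2^2\sigma_2^2 + 2w_1w_2\rho_{12}\sigma_1\sigma_2. \]
Since \(|\rho_{12}| \le 1\), it is easy to see that, for nonnegative weights,
\[ \sigma_V^2 \le \max(\sigma_1^2, \sigma_2^2). \]
By taking the derivative with respect to the variable \(s = w_2\) and noting that \(w_1 = 1 - w_2\), the following theorem gives the optimal weight \(w_2 = s\), which minimizes \(\mathrm{Var}(R_0)\).

Theorem 10.3. For \(-1 < \rho_{12} < 1\), the optimal weight with minimum variance \(\mathrm{Var}(R_0)\) is attained at
\[ s_0 = w_2 = \frac{\sigma_1^2 - \rho_{12}\sigma_1\sigma_2}{\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2}. \]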
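The minimum-variance weight of Theorem 10.3 can be checked against the variance formula itself. In this sketch the function names and the sample volatilities and correlation are illustrative assumptions.

```python
def portfolio_variance(w2, s1, s2, rho):
    """Var(R_0) for weights (1 - w2, w2), volatilities s1, s2, correlation rho."""
    w1 = 1.0 - w2
    return w1 ** 2 * s1 ** 2 + w2 ** 2 * s2 ** 2 + 2 * w1 * w2 * rho * s1 * s2

def min_variance_weight(s1, s2, rho):
    """Weight on the second security that minimizes Var(R_0)."""
    return (s1 ** 2 - rho * s1 * s2) / (s1 ** 2 + s2 ** 2 - 2 * rho * s1 * s2)

s1, s2, rho = 0.2, 0.3, 0.25           # illustrative values
w_opt = min_variance_weight(s1, s2, rho)
print(w_opt, portfolio_variance(w_opt, s1, s2, rho))
```

Perturbing `w_opt` in either direction increases the variance, as expected for the minimizer of a convex quadratic.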
10.6.3 Hedging Options

From an individual investor's point of view, one usually only buys or sells options. But for writers (issuers) of options (e.g., financial institutions), the risk can be significant. For example, for a call option, the payoff will be
\[ X = \max(0, S - K), \]
where \(S\) is the price and \(K\) is the strike price. Theoretically, the payoffs have no upper limit. The situation is similar to insurance, where large claims can be limited by a stop-loss reinsurance policy. To prevent large payoffs, the writer can purchase a certain number of shares of the same stock plus some amount of the risk-free bonds to replicate the option at each step; this portfolio is called a hedge fund. Intuitively, the hedge fund will cancel most of the large payoffs for the issued call options. For example, under the \(N\)-step binomial model, we already know that the call-option price can be written as
\[ p_0 = S_0(1 - \mathrm{Bin}(m - 1, N, q^*)) - \frac{K}{(1 + r)^N}(1 - \mathrm{Bin}(m - 1, N, p^*)), \]
where \(p^* = \dfrac{r - d}{u - d}\), \(q^* = p^*\dfrac{1 + u}{1 + r}\), and
\[ m = 1 + \left[\ln\left(\frac{K}{S_0(1 + d)^N}\right)\Big/\ln\left(\frac{1 + u}{1 + d}\right)\right]. \]
This is equivalent to saying that at the beginning, one buys \(x_0 = (1 - \mathrm{Bin}(m - 1, N, q^*))\) shares of stock and borrows \(y_0 = \frac{K}{(1 + r)^N}(1 - \mathrm{Bin}(m - 1, N, p^*))\) amount of risk-free bond. That means, the option can be replicated (hedged) by a fund formed by stock and bond, called a hedge fund. To match the option price, after each step, the writer needs to reevaluate the option price depending on the new "initial" stock price, and then reset the portfolio of the hedge fund. We use a simple example as an illustration.

Example 10.7. We consider a binomial model with only two steps, \(N = 2\). Assume the initial stock price is \(S_0 = 100\), the bond price is \(A = 100\), and the strike price for the call option is also \(K = 100\). Assume the risk-free rate is \(r = 0.1\), and \(u = 0.3\) and \(d = -0.1\) for each step. Suppose at \(t = 0\) the writer issues an option.

(a) To calculate the option price \(p_0\), we first calculate the risk-neutral probability as
\[ p^* = \frac{r - d}{u - d} = \frac{0.1 + 0.1}{0.3 + 0.1} = \frac{1}{2}, \]
and
\[ q^* = p^*\frac{1 + u}{1 + r} = \frac{1}{2}\cdot\frac{1 + 0.3}{1 + 0.1} = \frac{13}{22}. \]
Since \(N = 2\), the smallest \(m\) such that
\[ S_2 = S_0(1 + u)^m(1 + d)^{2 - m} > K \]
is obviously \(m = 1\). Thus, the option price at \(t = 0\) can be calculated as
\[ p_0 = S_0\left((q^*)^2 + 2q^*(1 - q^*)\right) - \frac{K}{(1 + r)^2}\left((p^*)^2 + 2p^*(1 - p^*)\right) \]
\[ = 100\left[\left(\frac{13}{22}\right)^2 + 2\cdot\frac{13}{22}\cdot\frac{9}{22}\right] - \frac{100}{1.1^2}\left[\left(\frac{1}{2}\right)^2 + 2\cdot\frac{1}{2}\cdot\frac{1}{2}\right] \]
\[ = 100 \times 0.8326 - 100 \times 0.6198 = 21.28. \]
That means, the writer gets 21.28 cash for issuing an option. To prevent large payoffs at the end, the writer spends the cash using the following replicating strategy: buy 0.8326 shares of stock and borrow 0.6198 unit of bond.

(b) Let us look at \(t = 1\). We distinguish two situations depending on whether the stock goes up or down.

(i) The stock goes up at \(t = 1\) and \(S_1 = 130\). Since there is only one step left, \(S_2\) will always be larger than \(K\) no matter whether the stock goes up or down at the second step. Thus, the option price at \(t = 1\) becomes
\[ p_1 = S_1 - \frac{K}{1 + r} = 130 - \frac{100}{1.1} = 39.091. \]
Therefore, if the option is not hedged and the writer keeps the cash with interest, the gain for the writer is negative:
\[ 21.28 \times (1 + r) - 39.091 = -15.68. \]
If we use the hedging strategy, then for the hedge fund the stock value will be \(100 \times 0.8326 \times 1.3 = 108.24\), the bond owes \(61.98 \times 1.1 = 68.18\), and the option owes 39.09. Thus, the total gain for the writer is
\[ 108.24 - 68.18 - 39.09 = 0.97. \]

(ii) The stock goes down at \(t = 1\) and \(S_1 = 90\). Again, there is only one step left, and \(S_2 > K\) only if the stock goes up at the second step. Thus, the option price becomes
\[ p_1 = S_1 q^* - \frac{K}{1 + r}p^* = 90 \times \frac{13}{22} - \frac{100}{1.1}\times\frac{1}{2} = 7.73. \]
Thus, the gain without hedging for the writer is
\[ 21.28 \times 1.1 - 7.73 = 15.68. \]
Similarly, for the hedge fund, the stock value is now \(100 \times 0.8326 \times 0.9 = 74.94\), the bond owes 68.18, and the option owes 7.73. Thus, the total gain for the writer is
\[ 74.94 - 68.18 - 7.73 = -0.97. \]

(c) Therefore, we see that at \(t = 1\), under the risk-neutral probability \(p^*\), the two strategies with and without hedging have the same mean
\[ \frac{1}{2}(-15.68) + \frac{1}{2}(15.68) = \frac{1}{2}(0.97) + \frac{1}{2}(-0.97) = 0. \]
The hedged portfolio has variance
\[ \frac{1}{2}(0.97)^2 + \frac{1}{2}(-0.97)^2 = 0.94, \]
while the nonhedged fund has variance
\[ \frac{1}{2}(-15.68)^2 + \frac{1}{2}(15.68)^2 = 245.92. \]
Thus, the hedged fund reduces the variance (risk) dramatically.

(d) At time \(t = 1\), a new hedging strategy will be set up depending on the new stock price (going up or down) and the new bond price. The writer has to rebalance the hedge fund at every step.

Remark. Denote the price of a general derivative security (such as options) for the initial stock price \(S_0\) by \(D(S_0)\). Then if we buy \(x\) shares of stock and \(y\) units of bond (with price 1), the hedge fund should satisfy
\[ V_0 = 0 = xS_0 + y - D(S_0). \]
By differentiating with respect to \(S_0\), we have
\[ \frac{dD(S_0)}{dS_0} = x. \]
That means, the number of shares for the hedge fund is the first-order derivative of the security price with respect to the initial price of the stock. For this reason, the above hedging strategy is called Delta Hedging. Along this line, one can develop many different types of hedging strategies.
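The one-period comparison of a hedged and an unhedged writer can be recomputed from the model parameters. The sketch below is an illustration under stated assumptions: it uses the two-step parameters \(S_0 = K = 100\), \(r = 0.1\), \(u = 0.3\), \(d = -0.1\), treats the stock and bond legs of the binomial call price as the hedge fund, and evaluates the writer's gain at \(t = 1\) in both states. All function names are hypothetical.

```python
from math import comb

def binom_tail(m, n, p):
    """P[M >= m] for M ~ Binomial(n, p)."""
    return sum(comb(n, j) * p ** j * (1 - p) ** (n - j) for j in range(max(m, 0), n + 1))

def call_legs(S, K, r, u, d, n):
    """Stock leg and bond leg of the n-step binomial call price (price = stock - bond)."""
    p = (r - d) / (u - d)
    q = p * (1 + u) / (1 + r)
    m = next((j for j in range(n + 1)
              if S * (1 + u) ** j * (1 + d) ** (n - j) > K), n + 1)
    return S * binom_tail(m, n, q), K / (1 + r) ** n * binom_tail(m, n, p)

S0, K, r, u, d = 100.0, 100.0, 0.1, 0.3, -0.1
stock_leg, bond_leg = call_legs(S0, K, r, u, d, 2)
premium = stock_leg - bond_leg
p_star = (r - d) / (u - d)

gains_unhedged, gains_hedged = [], []
for move in (u, d):
    s_leg, b_leg = call_legs(S0 * (1 + move), K, r, u, d, 1)
    option_value = s_leg - b_leg                       # liability at t = 1
    gains_unhedged.append(premium * (1 + r) - option_value)
    gains_hedged.append(stock_leg * (1 + move) - bond_leg * (1 + r) - option_value)

def mean_var(gains):
    probs = (p_star, 1 - p_star)
    mean = sum(p * g for p, g in zip(probs, gains))
    return mean, sum(p * (g - mean) ** 2 for p, g in zip(probs, gains))

print(premium, mean_var(gains_unhedged), mean_var(gains_hedged))
```

Under the risk-neutral probability both strategies have the same mean gain, while the hedged position has a much smaller variance.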
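10.7 Black–Scholes Formula

At the end of this chapter, we go a little further to show how the binomial model can be used to approximate the European call-option price under a continuous time model, where the evolution of the logarithm of the stock price is modeled as a Brownian motion (see Chap. 11 for the definition) with drift parameter \(r_0\) and variance \(\sigma^2\). For the time period \([0, T]\), we partition it into \(N\) intervals with width \(\Delta = T/N\). We shall use the binomial model to approximate the evolution of the stock price at the end of each of these \(N\) intervals. More specifically, we let
\[ 1 + r = e^{r_0\Delta}, \qquad 1 + u = e^{\sigma\sqrt{\Delta}}, \qquad\text{and}\qquad 1 + d = e^{-\sigma\sqrt{\Delta}}, \]
which are the same for all steps. Then as \(\Delta \to 0\) (\(N \to \infty\)), we have
\[ m = 1 + \left[\ln\left(\frac{K}{S_0(1 + d)^N}\right)\Big/\ln\left(\frac{1 + u}{1 + d}\right)\right] \approx \frac{\ln[K/S_0] - N\ln(1 + d)}{\ln(1 + u) - \ln(1 + d)} = \frac{\ln[K/S_0]}{2\sigma\sqrt{\Delta}} + \frac{T}{2\Delta}. \]
Also,
\[ p^* = \frac{r - d}{u - d} = \frac{e^{r_0\Delta} - e^{-\sigma\sqrt{\Delta}}}{e^{\sigma\sqrt{\Delta}} - e^{-\sigma\sqrt{\Delta}}} = \frac{\sigma\sqrt{\Delta} + (r_0 - \frac{1}{2}\sigma^2)\Delta + o(\Delta)}{2\sigma\sqrt{\Delta} + o(\Delta)} = \frac{1}{2}\left[1 + \frac{r_0 - \sigma^2/2}{\sigma}\sqrt{\Delta} + o(\sqrt{\Delta})\right]. \]
Similarly,
\[ q^* = p^*\frac{1 + u}{1 + r} = p^* e^{\sigma\sqrt{\Delta} - r_0\Delta} = \frac{1}{2}\left[1 + \frac{r_0 - \sigma^2/2}{\sigma}\sqrt{\Delta} + o(\sqrt{\Delta})\right]\left[1 + \sigma\sqrt{\Delta} + o(\sqrt{\Delta})\right] = \frac{1}{2}\left[1 + \frac{r_0 + \sigma^2/2}{\sigma}\sqrt{\Delta} + o(\sqrt{\Delta})\right]. \]
By using the elementary normal approximation for the binomial distribution and after some simplifications, we have
\[ \mathrm{Bin}(m, N, p^*) \approx \Phi\left(\frac{m - Np^*}{\sqrt{Np^*(1 - p^*)}}\right) = \Phi\left(\frac{\ln[K/S_0] - (r_0 - \frac{\sigma^2}{2})T}{\sigma\sqrt{T}}\right), \]
where \(\Phi(x)\) is the standard normal distribution function. Similarly,
\[ \mathrm{Bin}(m, N, q^*) \approx \Phi\left(\frac{\ln[K/S_0] - (r_0 + \frac{\sigma^2}{2})T}{\sigma\sqrt{T}}\right). \]
Further we note that
\[ (1 + r)^N = e^{Nr_0\Delta} = e^{r_0T}. \]
By using the symmetry of the normal distribution, we finally get the following Black–Scholes formula for the European call option:
\[ p_0 \approx S_0\,\Phi\left(\frac{\ln[S_0/K] + (r_0 + \frac{\sigma^2}{2})T}{\sigma\sqrt{T}}\right) - Ke^{-r_0T}\,\Phi\left(\frac{\ln[S_0/K] + (r_0 - \frac{\sigma^2}{2})T}{\sigma\sqrt{T}}\right). \]
Under this formula, the price can be calculated based on the normal distribution function.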
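The convergence of the binomial price to the Black–Scholes price can be observed numerically. The sketch below uses the step parameters \(1 + r = e^{r_0\Delta}\), \(1 + u = e^{\sigma\sqrt{\Delta}}\), \(1 + d = e^{-\sigma\sqrt{\Delta}}\) from this section; the function names and sample inputs are assumptions, and binomial weights are computed in log space to avoid overflow for large \(N\).

```python
import math

def norm_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(S0, K, r0, sigma, T):
    d1 = (math.log(S0 / K) + (r0 + 0.5 * sigma ** 2) * T) / (sigma * math.sqrt(T))
    d2 = d1 - sigma * math.sqrt(T)
    return S0 * norm_cdf(d1) - K * math.exp(-r0 * T) * norm_cdf(d2)

def binomial_call(S0, K, r0, sigma, T, N):
    dt = T / N
    r = math.exp(r0 * dt) - 1.0
    u = math.exp(sigma * math.sqrt(dt)) - 1.0
    d = math.exp(-sigma * math.sqrt(dt)) - 1.0
    p = (r - d) / (u - d)                      # risk-neutral up probability
    total = 0.0
    for j in range(N + 1):
        log_w = (math.lgamma(N + 1) - math.lgamma(j + 1) - math.lgamma(N - j + 1)
                 + j * math.log(p) + (N - j) * math.log(1.0 - p))
        S_N = S0 * (1 + u) ** j * (1 + d) ** (N - j)
        total += math.exp(log_w) * max(S_N - K, 0.0)
    return total / (1 + r) ** N

bs = black_scholes_call(100.0, 100.0, 0.05, 0.2, 1.0)
approx = binomial_call(100.0, 100.0, 0.05, 0.2, 1.0, 500)
print(bs, approx)
```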
Problems

1. Let the annual interest rate be 10%, paid monthly and compounded. After \(k\) months, $1.00 will become \((1 + 0.10/12)^k\). For $1,000.00 invested,
   (a) What is the value after three months?
   (b) What will be the value after six months?
   (c) What will be the value after one year?
2. Suppose the annual interest rate is 10%, paid daily. Then after \(k\) days, $1.00 will become \((1 + 0.10/365)^k\). For $1,000.00 invested,
   (a) What is the daily interest rate?
   (b) What is the value after 60 days?
   (c) What is the value after 90 days?
   (d) What is the value after 180 days?
   (e) Which of the two investments, paid daily or monthly, with the same annual rate 10%, is better?
3. What initial investment subject to monthly compounding at annual rate 12% is needed to produce $1,200.00 after two years?
4. Which will deliver a better strategy, a deposit attracting interest at 15% compounded daily, or at 15.5% compounded semiannually?
5. How much should the monthly mortgage payment be for a 15-year $100,000.00 loan at 8% annual rate compounded monthly?
6. \(r\) is called the annual interest rate compounded continuously if, at time \(t\), a $1.00 deposit has value \(e^{rt}\). Which deposit strategy is better, annual interest at 10% compounded daily or continuously?
7. Suppose a stock is liquidated in one period and its payoff is $\((k + 4)\) if state \(k\) occurs, for \(k = 1, 2, \ldots, 6\). Assume each state has equal probability of occurring. Let the pricing kernel \(m_{t+1}\) take the values 0.99, 0.98, 0.97, 0.96, 0.95, and 0.94 for states 1 to 6, respectively.
   (a) Find the risk-free rate for one period.
   (b) Find the stock price at the current time.
   (c) If the strike price is $8.00, find the price for a call option.
   (d) Calculate the corresponding risk-neutral probabilities.
   (e) Calculate the stock price and option price based on the risk-neutral probabilities.
8. Under a binomial model for an \(N = 60\)-day European call option, suppose the initial stock price is $100.00, and the strike price is also $100.00.
   (a) Let the annual interest rate be 0.10, which is compounded daily. Calculate the daily interest rate \(r\) and the discount factor \(1/(1 + r)\).
   (b) Suppose \(u = 0.0004\) and \(d = 0.0002\). What will be the daily binomial risk-neutral probability \(p^*\)?
   (c) Calculate the option price at the beginning.
   (d) Suppose the writer wants to hedge this option. What are the shares of stock and the amount of bond he/she should buy (borrow) based on the option price?
9. For the utility function \(u(c) = \ln c\),
   (a) Calculate the absolute local measure of risk aversion
   \[ A(c) = -\frac{u''(c)}{u'(c)}; \]
   (b) Calculate the relative local measure of risk aversion defined by \(R(c) = cA(c)\). Verify that its relative local measure of risk aversion is constant.
10. For the utility function \(u(c) = -e^{-c}\), calculate the absolute local measure and relative local measure of risk aversion. Verify that its absolute local measure of risk aversion is constant.
11. Consider two stocks which are liquidated in three common states I, II, III with probabilities 0.4, 0.2, 0.4, respectively, after one period. Corresponding to the three states, the returns \(R_1\) and \(R_2\) are 10%, 0%, and 20% for the first stock and 20%, 20%, and 10% for the second stock, respectively.
   (a) Find the mean returns \(E(R_1)\) and \(E(R_2)\) for the two stocks.
   (b) Find the variance and covariance for the returns of the two stocks.
   (c) If a portfolio puts 40% on the first stock and 60% on the second stock, what will be the variance of the return for the portfolio?
   (d) What is the optimal allocation of the portfolio?
12. Let \(S_0 = 50\), \(r = 0.5\%\), \(u = 0.01\), and \(d = -0.01\). For a European call option with strike price 60 to be exercised after \(N = 60\) steps,
   (a) Find the value of \(m\);
   (b) Find the price of this call option.
13. For the model given in Example 10.7 with \(u\) being changed to 0.2 and the other parameters being the same,
   (a) Calculate the European call option price at the beginning;
   (b) Calculate the mean and variance after one period without hedging (the premium is paid as bond);
   (c) What is the hedging strategy for replicating the option? Calculate the mean and variance after one period with hedging;
   (d) How much difference do the two variances have?
14. In Problem 8,
   (a) Find the approximated parameters \(r_0\), \(\Delta\), and \(\sigma\) under the continuous model;
   (b) Using the Black–Scholes formula, calculate the option price.
Chapter 11
Credit Risk Modeling
In this chapter, we briefly introduce basic credit risk modeling, including measuring portfolio risk and pricing defaultable bonds, credit derivatives, and other securities exposed to credit risk. Conceptually, credit risk is one source of market risk, which models the risk of changes in the market value of a firm's portfolio of positions and thus includes the risk of default or fluctuation in the credit quality of one's counterparties. There are, however, important differences in measuring and pricing credit risk and market price risk. Market risk puts more emphasis on risk caused by the degree of volatility of market prices and of change in daily profit and loss. Credit risk is the risk of default or of reduction in market value caused by changes in the credit quality (ratings) of issuers or counterparties.
11.1 Two Models for Default Probability

There are two broad classes of models which are used for credit risk pricing. One is the reduced form, where the process of default probability is directly specified by modeling the default intensity. The second is the structural model, which frames a structure of a model for variation in assets related to liabilities.
11.1.1 Basic Notation

The following terms and notations are used specifically for credit risk. Let \(\tau\) denote the default time and \(p(t) = P[\tau > t]\) denote the probability of survival at time \(t\). For \(s > t\), we call
\[ P[\tau \le s \mid \tau > t] = 1 - \frac{p(s)}{p(t)} = \frac{p(t) - p(s)}{p(t)} \]
the forward default probability given no default up to time t. A.K. Gupta et al., Probability and Statistical Models: Foundations for Problems in Reliability and Financial Mathematics, DOI 10.1007/978-0-8176-4987-6 11, c Springer Science+Business Media, LLC 2010
If \(p'(t)\) exists, we call
\[ \lambda(t) = -\frac{p'(t)}{p(t)} \]
the forward default rate. Note that \(\lambda(t)\) has the same meaning as the failure rate in reliability terms. The forward survival probability can be written as
\[ p(t, s) = P[\tau > s \mid \tau > t] = e^{-\int_t^s \lambda(u)\,du}. \]
In particular, if \(\lambda(t) = \lambda\) is a constant, then \(p(t, s) = e^{-\lambda(s - t)}\).
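For a deterministic forward default rate, the forward survival probability is a one-dimensional integral. The sketch below evaluates it with the trapezoidal rule; the function name, the step count, and the sample hazard functions are illustrative assumptions.

```python
import math

def forward_survival(hazard, t, s, steps=1000):
    """p(t, s) = exp(-integral from t to s of hazard(u) du), trapezoidal rule."""
    h = (s - t) / steps
    total = 0.5 * (hazard(t) + hazard(s))
    total += sum(hazard(t + i * h) for i in range(1, steps))
    return math.exp(-h * total)

# Constant default rate: p(t, s) = exp(-lambda * (s - t)).
print(forward_survival(lambda u: 0.03, 1.0, 5.0), math.exp(-0.03 * 4.0))

# Linearly increasing default rate (illustrative).
print(forward_survival(lambda u: 0.01 * u, 0.0, 2.0))
```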
11.1.2 Reduced Form

In practice, using only the information of whether default has occurred before time \(t\) is too simple to capture all the information about the many random risk factors (covariates), such as credit ratings and market assets. Therefore, we need to extend the forward rate function \(\lambda(t)\) to a stochastic process \(\{\lambda(t)\}\), called the default intensity process. Here, we briefly introduce two models.

Example 11.1 (Jump mean-reverting process). Let \(\{\lambda(t)\}\) be defined as the mean-reverting process with jumps. Suppose the jumps arrive following a Poisson process with intensity \(c\) and the jump sizes are independent and identically distributed with mean \(\eta\). Between jumps, \(\lambda(t)\) changes at the rate
\[ \frac{d\lambda(t)}{dt} = \kappa(\gamma - \lambda(t)), \]
where \(\kappa\) is called the reverting rate and \(\gamma\) is called the reverting limit. Typically, we assume \(\lambda(0) = \gamma\). That means, \(\lambda(t)\) starts as a constant and after a jump, it decreases at an exponential rate before the second jump arrives, and so forth. The sample path of \(\lambda(t)\) behaves like a sawtooth. Denote by \(T(t)\) the last jump point before time \(t\) and by \(\lambda(T(t))\) the value of the intensity just after the last jump (right continuous). Then \(\lambda(t)\) has the solution between jumps
\[ \lambda(t) = \gamma + e^{-\kappa(t - T(t))}(\lambda(T(t)) - \gamma). \]
Denote by \(E_t[\cdot] = E[\cdot \mid \lambda(u), 0 \le u \le t; \tau > t]\) the conditional mean given all information up to time \(t\). We give the following formula without proof.

Theorem 11.1. Under the above jump mean-reverting model, the forward survival probability given all information up to time \(t\) is given by
\[ p(t, s) = P[\tau > s \mid \lambda(u), 0 \le u \le t; \tau > t] = e^{\alpha(s - t) + \beta(s - t)\lambda(t)}, \]
where
\[ \beta(t) = -\frac{1}{\kappa}\left(1 - e^{-\kappa t}\right), \]
\[ \alpha(t) = -\gamma\left[t - \frac{1}{\kappa}\left(1 - e^{-\kappa t}\right)\right] - \frac{c}{\eta + \kappa}\left[\eta t - \ln\left(1 + \frac{\eta}{\kappa}\left(1 - e^{-\kappa t}\right)\right)\right]. \]
In general, we can calculate the conditional survival probability as
\[ p(t, s) = E_t\left[e^{-\int_t^s \lambda(u)\,du}\right]. \]
In particular,
\[ p(t) = E\left[e^{-\int_0^t \lambda(u)\,du}\right]. \]
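The survival formula of Theorem 11.1 can be evaluated directly. In the sketch below the symbol names are chosen here for illustration: `kappa` is the reverting rate, `gamma` the reverting limit, `c` the jump intensity, and `eta` the mean jump size. The closed form used for \(\alpha\) is the standard affine result for exponentially distributed jump sizes, which is an assumption of this sketch rather than something stated in the text.

```python
import math

def beta_coef(t, kappa):
    return -(1.0 - math.exp(-kappa * t)) / kappa

def alpha_coef(t, kappa, gamma, c, eta):
    b = (1.0 - math.exp(-kappa * t)) / kappa
    return (-gamma * (t - b)
            - c / (eta + kappa) * (eta * t - math.log(1.0 + b * eta)))

def forward_survival(t, s, lam_t, kappa, gamma, c, eta):
    """p(t, s) = exp(alpha(s - t) + beta(s - t) * lambda(t))."""
    h = s - t
    return math.exp(alpha_coef(h, kappa, gamma, c, eta) + beta_coef(h, kappa) * lam_t)

# Without jumps and with lambda(t) at its limit, the intensity stays constant.
no_jumps = forward_survival(0.0, 5.0, 0.02, kappa=0.5, gamma=0.02, c=0.0, eta=0.01)
with_jumps = forward_survival(0.0, 5.0, 0.02, kappa=0.5, gamma=0.02, c=0.1, eta=0.01)
print(no_jumps, with_jumps)
```

Adding upward jumps in the intensity can only lower the survival probability, which the two printed values illustrate.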
Before we introduce the next model, we first give the definition of a Brownian motion.

Definition 11.1. A Brownian motion with drift \(\mu\) and variance \(\sigma^2\) is a family of random variables \(\{W_t, 0 \le t < \infty\}\) with the following properties:
(a) \(W_0 = 0\);
(b) \(W_s - W_t\) is a normal random variable with mean \(\mu(s - t)\) and variance \(\sigma^2(s - t)\) for \(0 \le t < s < \infty\);
(c) \(W_t\) has independent increments as defined in Chap. 3;
(d) \(W_t\) is a continuous function of \(t\).

Sometimes, we write a Brownian motion in the stochastic differential form as
\[ dW_t = \mu\,dt + \sigma\,dB_t, \]
where \(B_t\) is the standard Brownian motion with zero drift and unit variance. In general, when the drift and variance are general functions of \(t\) and of the process itself, we call it a diffusion process.

Example 11.2 (Cox-Ingersoll-Ross (CIR) Model). Here, we assume \(\{\lambda_t\}\) is a diffusion process satisfying the stochastic differential form
\[ d\lambda_t = \kappa(\theta - \lambda_t)\,dt + \sigma\sqrt{\lambda_t}\,dB_t, \]
where \(B_t\) is the standard Brownian motion with drift 0 and unit variance, \(\theta\) is the long-run mean of \(\lambda_t\), \(\kappa\) is the mean reverting rate, and \(\sigma\) is the volatility coefficient. By taking the expectation of both sides, we have
\[ dE[\lambda_t] = \kappa(\theta - E[\lambda_t])\,dt, \]
which gives the solution
\[ E[\lambda_t] = \theta + (\lambda_0 - \theta)e^{-\kappa t}. \]
Practically, the two models given above make very little difference.
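The mean of the CIR intensity can be compared with a simulation. The sketch below is an illustration under stated assumptions: it uses an Euler discretization with the square root truncated at zero (a common but not unique scheme), hypothetical parameter names `lam0`, `kappa`, `theta`, `sigma`, and a fixed random seed.

```python
import math
import random

def cir_mean(t, lam0, kappa, theta):
    """Solution of dE[lambda_t] = kappa * (theta - E[lambda_t]) dt."""
    return theta + (lam0 - theta) * math.exp(-kappa * t)

def simulate_cir(t, lam0, kappa, theta, sigma, steps, rng):
    """One Euler path of the CIR intensity, truncating at zero inside the square root."""
    dt = t / steps
    lam = lam0
    for _ in range(steps):
        dB = rng.gauss(0.0, math.sqrt(dt))
        lam += kappa * (theta - lam) * dt + sigma * math.sqrt(max(lam, 0.0)) * dB
    return lam

rng = random.Random(0)
paths = [simulate_cir(1.0, 0.05, 1.0, 0.02, 0.1, 100, rng) for _ in range(2000)]
sample_mean = sum(paths) / len(paths)
print(sample_mean, cir_mean(1.0, 0.05, 1.0, 0.02))
```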
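11.1.3 Structural Model

Suppose the total assets $A_t$ follow an exponential Brownian motion defined as
\[ \frac{dA_t}{A_t} = (\mu - \gamma)\,dt + \sigma\,dB_t, \]
where $\mu$ is the mean rate of return, $\gamma$ is the cash payment rate, and $\sigma$ is the asset volatility.

(a) (Black and Scholes (1973) and Merton (1974)) Suppose the default occurs when the asset is below $D$ at the maturity time $T$. Define
\[ X_t = \frac{\log A_t - \log D}{\sigma} \]
as the distance to default. Ito's formula for stochastic calculus states that for any twice-differentiable function $f(A_t)$ of $A_t$,
\[ df(A_t) = f'(A_t)\,dA_t + \frac{1}{2} f''(A_t) A_t^2 \sigma^2\,dt. \]
Thus,
\[ dX_t = \frac{1}{\sigma}\left(\mu - \gamma - \frac{\sigma^2}{2}\right)dt + dB_t, \]
which is a Brownian motion with drift parameter $m = \frac{1}{\sigma}\left(\mu - \gamma - \frac{\sigma^2}{2}\right)$ and unit variance. From the definition of a Brownian motion, we can calculate the default probability as
\begin{align*}
P[A_T \le D \mid A_t] &= P[X_T \le 0 \mid X_t] \\
&= P[X_T - X_t \le -X_t \mid X_t] \\
&= \Phi\left(\frac{-X_t - m(T-t)}{\sqrt{T-t}}\right) \\
&= 1 - \Phi\left(\frac{X_t + m(T-t)}{\sqrt{T-t}}\right) \\
&= 1 - \Phi(u(X_t, T-t)),
\end{align*}
where
\[ u(x, s) = \frac{x + ms}{\sqrt{s}}. \]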
(b) (Black and Cox (1976)) Different from the previous model, suppose the default occurs at the first time the asset drops below $D$, that is, at the first time $X_s \le 0$, denoted by
\[ \tau = \inf\{s : X_s \le 0\}. \]
Then, for given $\tau > t$ and $X_t > 0$, the conditional forward survival probability can be written as
\[ p(t, T \mid X_t) = P[\tau > T \mid \tau > t;\ X_t] = P[X_s > 0,\ t \le s \le T \mid X_t]. \]
The following theorem gives the formula for $p(t, T \mid X_t)$, which is a typical result for Brownian motion. Its proof can be found in Karlin and Taylor (1975, page 345), based on the reflection principle, and in Siegmund (1985, page 40), using the likelihood ratio identity.

Theorem 11.2. Suppose the default occurs at the first time $X_s \le 0$. Then the forward survival probability for given $X_t > 0$ is equal to
\[ p(t, T \mid X_t) = H(X_t, T-t), \]
where
\[ H(x, s) = \Phi\left(\frac{x + ms}{\sqrt{s}}\right) - e^{-2mx}\,\Phi\left(\frac{-x + ms}{\sqrt{s}}\right) = \Phi(u(x, s)) - e^{-2mx}\,\Phi(u(-x, s)). \]
In particular, at $t = 0$ for given $X_0 = x > 0$, the survival probability for $\tau$ is given by
\[ p(T \mid X_0 = x) = P(\tau > T \mid X_0 = x) = \Phi\left(\frac{x + mT}{\sqrt{T}}\right) - e^{-2mx}\,\Phi\left(\frac{-x + mT}{\sqrt{T}}\right), \]
which gives the density of $\tau$ as the inverse Gaussian density function
\[ -\frac{\partial}{\partial T}\, p(T \mid X_0 = x) = x\,T^{-3/2}\,\phi\left(\frac{x}{\sqrt{T}} + m\sqrt{T}\right), \]
where $\phi$ denotes the standard normal density.
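The two structural-model formulas of this section can be checked numerically. The following is a minimal sketch, not a calibrated implementation: the function names and all input values (rate of return, payout rate, volatility, asset level, default barrier, horizon) are hypothetical and chosen only for illustration. It evaluates the terminal default probability of model (a), the first-passage survival probability $H(x, s)$ of Theorem 11.2, and the inverse Gaussian density of $\tau$.

```python
import math


def norm_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def norm_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def u(x, s, m):
    """u(x, s) = (x + m s) / sqrt(s)."""
    return (x + m * s) / math.sqrt(s)


def terminal_default_prob(x, s, m):
    """Model (a): P[A_T <= D | X_t = x] = 1 - Phi(u(x, s)), with s = T - t."""
    return 1.0 - norm_cdf(u(x, s, m))


def survival_prob(x, s, m):
    """Model (b): H(x, s) = Phi(u(x, s)) - exp(-2 m x) Phi(u(-x, s))."""
    return norm_cdf(u(x, s, m)) - math.exp(-2.0 * m * x) * norm_cdf(u(-x, s, m))


def first_passage_density(x, s, m):
    """Inverse Gaussian density of tau at time s, given X_0 = x > 0."""
    return x * s ** -1.5 * norm_pdf(u(x, s, m))


# Hypothetical inputs, for illustration only.
mu, gamma, sigma = 0.08, 0.02, 0.20      # return, payout rate, volatility
assets, barrier, horizon = 120.0, 100.0, 5.0

m = (mu - gamma - 0.5 * sigma ** 2) / sigma            # drift of X_t
x = (math.log(assets) - math.log(barrier)) / sigma     # distance to default

print("terminal default probability     :", terminal_default_prob(x, horizon, m))
print("first-passage default probability:", 1.0 - survival_prob(x, horizon, m))
```

Because default in model (b) can occur at any time before $T$, its default probability $1 - H(x, s)$ is never smaller than the terminal default probability of model (a) for the same inputs.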
11.2 Valuation of Default Risk

Just as in the valuation of the price of a stock or its derivatives, the valuation of default risk depends on the determination of risk-neutral probabilities under a default probability model. We shall discuss two typical cases: zero recovery and nonzero recovery.
11.2.1 No Recovery Zero-Coupon Defaultable Bond

Suppose that, in the situation without default, the short rate process is $\{r(t)\}$. Under the risk-neutral probabilities $P^*(\cdot)$, the price at time $t$ of a bond at par without default and with maturity time $T$ is
\[ \delta(t, T) = E_t^*\left[e^{-\int_t^T r(s)\,ds}\right], \]
which is the expected discounted value at time $t$. Now let $\tau$ denote the default time and assume there is no recovery for loss at default. Let $P_t^*[\cdot]$ denote the conditional risk-neutral probabilities given all history up to $t$ and $\tau > t$. Then the price at time $t$ becomes
\[ d_0(t, T) = E_t^*\left[e^{-\int_t^T r(s)\,ds}\, I_{[\tau > T]}\right]. \]
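Let the risk-neutral default intensity be $\lambda^*(t)$ for given $r(s)$, $0 \le s \le T$. By conditioning on the values of $r(s)$ for $t \le s \le T$, the price can be written as
\begin{align*}
d_0(t, T) &= E_t^*\left[E_t^*\left[e^{-\int_t^T r(s)\,ds}\, I_{[\tau > T]} \,\Big|\, r(s);\ t \le s \le T\right]\right] \\
&= E_t^*\left[e^{-\int_t^T r(s)\,ds}\, P_t^*[\tau > T \mid r(s),\ t \le s \le T]\right] \\
&= E_t^*\left[e^{-\int_t^T (r(s) + \lambda^*(s))\,ds}\right].
\end{align*}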
We call
\[ s(t, T) = -\frac{\ln d_0(t, T) - \ln \delta(t, T)}{T - t} \]
the credit spread for maturity $(T - t)$ for the zero-coupon rate. In particular, if $r(t) = r$, a constant, then
\[ \delta(t, T) = e^{-(T-t)r}; \qquad d_0(t, T) = e^{-(T-t)r}\, E_t^*\left[e^{-\int_t^T \lambda^*(s)\,ds}\right], \]
and
\[ s(t, T) = -\frac{1}{T - t} \ln E_t^*\left[e^{-\int_t^T \lambda^*(s)\,ds}\right]. \]
11.2.2 Non-Zero Recovery

In this section, we further assume that a proportion $w$ of the face value is recovered and paid at the maturity time $T$. Suppose the short rate is $r(t)$ and the risk-neutral default intensity is $\lambda^*(t)$. Then the price at $t$ can be calculated as
\begin{align*}
d(t, T) &= E_t^*\left[e^{-\int_t^T r(s)\,ds}\left(I_{[\tau > T]} + w\,I_{[\tau \le T]}\right)\right] \\
&= E_t^*\left[e^{-\int_t^T r(s)\,ds}\left(w + (1 - w)\,I_{[\tau > T]}\right)\right] \\
&= w\,\delta(t, T) + (1 - w)\,d_0(t, T),
\end{align*}
which is a mixture of the values with no default and no recovery at default. When $r(s) = r$, a constant, the credit spread is equal to
\[ s(t, T) = -\frac{\ln d(t, T) - \ln \delta(t, T)}{T - t} = -\frac{1}{T - t} \ln\left[w + (1 - w)\,E_t^*\, e^{-\int_t^T \lambda^*(s)\,ds}\right]. \]
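As a small illustration of the pricing and spread formulas in Sects. 11.2.1 and 11.2.2, the sketch below evaluates them in the simplest case, assuming both the short rate and the risk-neutral default intensity are constants. The function names and the numerical inputs are hypothetical.

```python
import math


def riskless_price(r, s):
    """delta(t, T) = exp(-r s) for a constant short rate r and s = T - t."""
    return math.exp(-r * s)


def zero_recovery_price(r, lam, s):
    """d_0(t, T) = exp(-(r + lam) s) for constant r and constant intensity lam."""
    return math.exp(-(r + lam) * s)


def recovery_price(r, lam, s, w):
    """d(t, T) = w delta(t, T) + (1 - w) d_0(t, T) for recovery proportion w."""
    return w * riskless_price(r, s) + (1.0 - w) * zero_recovery_price(r, lam, s)


def credit_spread(price, riskless, s):
    """s(t, T) = -(ln price - ln riskless) / (T - t)."""
    return -(math.log(price) - math.log(riskless)) / s


# Hypothetical inputs: 5% short rate, 2% default intensity, 4 years to maturity.
r, lam, s = 0.05, 0.02, 4.0
for w in (0.0, 0.4, 1.0):
    spread = credit_spread(recovery_price(r, lam, s, w), riskless_price(r, s), s)
    print("recovery", w, "-> credit spread", spread)
```

With zero recovery the spread reduces to the default intensity itself, and with full recovery it vanishes; intermediate recovery proportions give a spread strictly between the two.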
11.2.3 Actual and Risk Neutral Default Intensity

Here, we briefly explain how to find the risk-neutral probability under a simple discrete time model with exponential default time. Consider a one-year par bond of $100 (its price equals its face value) that promises its face value plus an 8% coupon at maturity. The 1-year riskless rate is 6%. On the one hand, if the issuer survives with probability $0.99 = e^{-\lambda}$, or $\lambda = -\ln 0.99 = 0.01$, then the investor receives $108 at maturity. On the other hand, if the issuer defaults before maturity with probability 0.01, the investor is assumed to receive 50% of the par value, or $50. Thus, the expected discounted payoff is
\[ \frac{1}{1.06}\,(108 \times 0.99 + 50 \times 0.01) = 101.34, \]
which is higher than the face value. As in the technique for finding the risk-neutral probability under the binomial models for options, let $p^*$ denote the risk-neutral probability that the issuer survives through the maturity. Then
\[ \frac{1}{1.06}\,(108 \times p^* + 50 \times (1 - p^*)) = 100. \]
That is, the face value is equal to the expected discounted value. This gives $p^* = 56/58 = 0.9655 = e^{-\lambda^*}$, or $\lambda^* = -\ln 0.9655 = 0.0351$.
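The arithmetic of this one-period example can be reproduced with a few lines of code. This is only a sketch of the calculation described above; the function names are hypothetical, while the inputs (price 100, payoff 108 on survival, payoff 50 on default, riskless rate 6%, actual survival probability 0.99) are those of the example. The risk-neutral survival probability solves the pricing equation in closed form, $p^* = (100 \times 1.06 - 50)/(108 - 50)$.

```python
import math


def expected_discounted_payoff(p_survive, payoff_survive, payoff_default, riskless_rate):
    """One-period expected payoff, discounted at the riskless rate."""
    expected = payoff_survive * p_survive + payoff_default * (1.0 - p_survive)
    return expected / (1.0 + riskless_rate)


def risk_neutral_survival(price, payoff_survive, payoff_default, riskless_rate):
    """Survival probability p* for which the discounted expected payoff equals the price."""
    return (price * (1.0 + riskless_rate) - payoff_default) / (payoff_survive - payoff_default)


value_under_actual = expected_discounted_payoff(0.99, 108.0, 50.0, 0.06)
p_star = risk_neutral_survival(100.0, 108.0, 50.0, 0.06)
lam_star = -math.log(p_star)   # risk-neutral intensity for an exponential default time

print("discounted payoff under actual probabilities:", round(value_under_actual, 2))
print("risk-neutral survival probability p*        :", round(p_star, 4))
print("risk-neutral default intensity lambda*      :", round(lam_star, 4))
```

The risk-neutral intensity exceeds the actual one because the bond trades at par although its expected discounted payoff under the actual probabilities is above par.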
11.3 Credit Rating: Default and Transition

11.3.1 Credit Rating

The Securities and Exchange Commission (SEC) has currently designated several agencies as nationally recognized statistical rating organizations (NRSROs), including, e.g., Moody's KMV, Standard and Poor's, Fitch, and Thomson BankWatch.
Different agencies might assign different ratings for the same bond. In particular, S&P now rates more than USD 10 trillion in bonds and other financial obligations of obligors in more than 50 countries. Generally, the rating agencies provide two different sorts of ratings: issue-specific credit ratings and issuer credit ratings. Issue-specific credit ratings are current opinions of the creditworthiness of an obligor with respect to a specific financial obligation, a specific class of financial obligations, or a specific financial program; they also take into account the recovery prospects associated with the specific debt being rated. Issuer credit ratings give an opinion of the obligor's overall capacity to meet its financial obligations, that is, its financial creditworthiness. These so-called corporate credit ratings indicate the likelihood of default regarding all financial obligations of the firm. The long-term credit ratings, i.e., ratings of obligations with an original maturity of more than one year, are divided into several categories ranging from AAA, reflecting the strongest credit quality, to D, reflecting occurrence of default. Ratings in the four highest categories AAA, AA, A, and BBB are generally recognized as being investment grades, whereas BB or below are generally regarded as noninvestment grades. Ratings from AA to CCC may be modified by the addition of a plus or minus sign to show the relative standings. In the end, the ratings of a company or loan should also be transformed to a corresponding default probability. However, default probabilities may vary substantially through time, e.g., because of cyclical effects. Typical rating categories and the corresponding default probabilities (PD) according to Dartsch and Weinrich (2002) are given in Table 11.1 below.
Table 11.1 Rating and PD by Dartsch and Weinrich (2002)

18 Classes   7 Classes   Lower PD (%)   Upper PD (%)
AAA          AAA         0.00           0.025
AA+          AA          0.025          0.035
AA           AA          0.035          0.045
AA-          AA          0.045          0.055
A+           A           0.055          0.07
A            A           0.07           0.095
A-           A           0.095          0.135
BBB+         BBB         0.135          0.205
BBB          BBB         0.205          0.325
BBB-         BBB         0.325          0.5125
BB+          BB          0.5125         0.77
BB           BB          0.77           1.12
BB-          BB          1.12           1.635
B+           B           1.635          2.905
B            B           2.905          5.785
B-           B           5.785          11.345
CCC+         CCC         11.345         17.485
CCC          CCC         >17.495
11.3.2 Rating Assignment

Statistically, the prediction of default and ratings is based on different kinds of models. Let $X_1, \ldots, X_k$ be covariates associated with assets, e.g., financial ratios. Then the link model is
\[ Y = g(\beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k). \]
For example,
$X_1$ = working capital / total assets;
$X_2$ = retained earnings / total assets;
$X_3$ = earnings before interest and taxes / total assets;
$X_4$ = market value of equity / book value of total liability;
$X_5$ = sales / total assets.

Under the logit model, the default probability is equal to
\[ P[Y = 1 \mid X_1, \ldots, X_k] = \frac{e^{\beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k}}{1 + e^{\beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k}}. \]
Under the probit model,
\[ P[Y = 1 \mid X_1, \ldots, X_k] = \Phi(\beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k), \]
where $\Phi(x)$ is the standard normal distribution function. Generally, by denoting $Z = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k$, we can classify the ratings into $K$ categories based on the values of $Z$. For example, we can define
\[ Y = \begin{cases} 1 & \text{if } Z \le z_1, \\ 2 & \text{if } z_1 < Z \le z_2, \\ \ \vdots & \\ K & \text{if } z_{K-1} < Z. \end{cases} \]
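The link models and the threshold rule above translate directly into code. The sketch below is illustrative only: the coefficients, the financial ratios, and the thresholds are made-up numbers, not estimates from any data set, and the function names are hypothetical.

```python
import math
from bisect import bisect_left


def linear_score(beta0, betas, xs):
    """Z = beta_0 + beta_1 X_1 + ... + beta_k X_k."""
    return beta0 + sum(b * x for b, x in zip(betas, xs))


def logit_pd(z):
    """Logit model: e^z / (1 + e^z), written in the equivalent form 1 / (1 + e^{-z})."""
    return 1.0 / (1.0 + math.exp(-z))


def probit_pd(z):
    """Probit model: Phi(z), the standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def rating_class(z, thresholds):
    """Map Z to a class in 1..K using ascending thresholds z_1 < ... < z_{K-1}.

    Class 1 if Z <= z_1, class j if z_{j-1} < Z <= z_j, class K if Z > z_{K-1}.
    bisect_left counts the thresholds strictly below Z.
    """
    return bisect_left(thresholds, z) + 1


# Hypothetical coefficients and ratios (X_1, ..., X_5 as listed above).
beta0, betas = -1.0, [-1.2, -1.4, -3.3, -0.6, -1.0]
ratios = [0.10, 0.15, 0.08, 1.20, 0.90]
z = linear_score(beta0, betas, ratios)

print("score Z        :", z)
print("logit PD       :", logit_pd(z))
print("probit PD      :", probit_pd(z))
print("rating category:", rating_class(z, [-4.0, -3.0, -2.0, -1.0]))
```

Both link functions are increasing in $Z$, so any rule based on thresholds of $Z$ is equivalent to a rule based on thresholds of the fitted default probability.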
11.3.3 Rating Transition

Overall, not only does the worst event of default influence the price of a bond, but a change in the rating of a company can also affect the prices of its issued bonds. There are three basic approaches to modeling rating migrations. The first approach is based on historical data of rating transitions. This is typically modeled by a yearly Markov transition matrix. For example, for $K$ states, let $Y_t$ be the state at year $t$; then the transition matrix from year $t-1$ to year $t$ can be written as
\[ P = \begin{pmatrix} p_{11} & p_{12} & \cdots & p_{1K} \\ p_{21} & p_{22} & \cdots & p_{2K} \\ \vdots & \vdots & & \vdots \\ p_{K-1,1} & p_{K-1,2} & \cdots & p_{K-1,K} \\ 0 & 0 & \cdots & 1 \end{pmatrix}, \]
where $p_{ij} = P[Y_t = j \mid Y_{t-1} = i]$. Note that $P[Y_t = K \mid Y_{t-1} = K] = 1$, since state $K$ denotes default. In the continuous time case, a Markov process is used to model transition, with the generator matrix modeling the instant transition intensity. The following is an example of Moody's corporate bond rating transition matrix for the period 1982–2001 (rows: rating at the start of the year; columns: rating at the end of the year).

        Aaa      Aa       A        Baa      Ba       B        C        D
Aaa     0.9276   0.0661   0.0050   0.0009   0.0003   0.0000   0.0000   0.0000
Aa      0.0064   0.9152   0.0700   0.0062   0.0008   0.0011   0.0002   0.0001
A       0.0007   0.0221   0.9137   0.0546   0.0058   0.0024   0.0003   0.0005
Baa     0.0005   0.0029   0.0550   0.8753   0.0506   0.0108   0.0021   0.0029
Ba      0.0002   0.0011   0.0052   0.0712   0.8229   0.0741   0.0111   0.0141
B       0.0000   0.0010   0.0035   0.0047   0.0588   0.8323   0.0385   0.0612
C       0.0012   0.0000   0.0029   0.0053   0.0157   0.1121   0.6238   0.2389
D       0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   0.0000   1.0000
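Under the yearly Markov assumption, the $n$-year transition probabilities are the entries of the $n$-th power of the one-year matrix, and the last column gives the cumulative default probabilities. The sketch below illustrates this with the Moody's matrix quoted above; the helper names are hypothetical, and the published rows are used as they are, so they sum to one only up to rounding.

```python
RATINGS = ["Aaa", "Aa", "A", "Baa", "Ba", "B", "C", "D"]

# One-year transition matrix (rows: from-rating, columns: to-rating).
P = [
    [0.9276, 0.0661, 0.0050, 0.0009, 0.0003, 0.0000, 0.0000, 0.0000],
    [0.0064, 0.9152, 0.0700, 0.0062, 0.0008, 0.0011, 0.0002, 0.0001],
    [0.0007, 0.0221, 0.9137, 0.0546, 0.0058, 0.0024, 0.0003, 0.0005],
    [0.0005, 0.0029, 0.0550, 0.8753, 0.0506, 0.0108, 0.0021, 0.0029],
    [0.0002, 0.0011, 0.0052, 0.0712, 0.8229, 0.0741, 0.0111, 0.0141],
    [0.0000, 0.0010, 0.0035, 0.0047, 0.0588, 0.8323, 0.0385, 0.0612],
    [0.0012, 0.0000, 0.0029, 0.0053, 0.0157, 0.1121, 0.6238, 0.2389],
    [0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 1.0000],
]


def mat_mul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(p, years):
    """p raised to a nonnegative integer power by repeated multiplication."""
    n = len(p)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(years):
        result = mat_mul(result, p)
    return result


def cumulative_default_prob(p, start_rating, years):
    """Probability of being in state D within the given number of years."""
    return mat_pow(p, years)[RATINGS.index(start_rating)][-1]


for rating in ("Aaa", "Baa", "B"):
    probs = [cumulative_default_prob(P, rating, n) for n in (1, 2, 5)]
    print(rating, "default probability over 1, 2, 5 years:", probs)
```

Since D is an absorbing state, the cumulative default probability is nondecreasing in the horizon for every starting rating.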
The second approach is to develop a transition matrix $P^* = (p^*_{ij})$ for $1 \le i, j \le K$ under the risk-neutral probability to price the credit risk. The third approach is to develop a conditional transition matrix depending on business cycles. It is typical to use a two-state hidden Markov chain as the model. We shall not discuss these models in detail.
11.4 Correlated Defaults

The degree of correlation of default risk for $n$ entities can be captured by various measures of the joint distribution of the default times $\tau_1, \ldots, \tau_n$. These entities can be the obligors underlying a collateral debt obligation or a set of over-the-counter broker-dealer counterparties. Empirically, the banking sector has the highest intracorrelation, of more than 60%, while the technology sector has the lowest intracorrelation, of under 10%. There are two major approaches to modeling the correlations. One is based on credit ratings through the driving variable. The other is based on directly modeling the joint correlated default intensities or default distributions.
11.4.1 Credit Metrics

Typically, an underlying driving variable, say $Z$, such as the asset return of an entity, is used to determine the change in ratings; it is assumed to be a normal random variable. Default occurs when assets are insufficient to meet liabilities. For $n$ entities, we can observe their corresponding correlated drivers $Z_1(t), \ldots, Z_n(t)$ and assign new ratings for each entity according to the changes in the drivers. From the joint distribution of the driving variables, we can determine the correlation of the credit ratings and defaults. Time series models are typically used for the drivers.
11.4.2 Correlated Default Intensities

Here we introduce two models by directly introducing correlated default intensities.

Model 1 (Jump mean-reverting intensity). We can assume that $\lambda_i(t)$ for entity $i$ is mean reverting and has random jumps with random sizes. By allowing for the possibility of common jump times, one has relatively simple correlated defaults. Without correlated jumps, it is difficult to obtain a high degree of correlation for relatively high quality entities. More specifically, we assume that all entities' default intensities have the same parameters $(\kappa, \theta, c, J)$ under the jump mean-reverting intensity model, where $\kappa$ is the mean-reversion rate, $\theta$ is the level of reversion, $c$ is the arrival rate of jumps, and $J$ is the mean jump size. Assume that the common adverse changes in credit quality arrive according to a Poisson process with constant intensity $\lambda_c$. At such a common event, a given entity's default intensity jumps with probability $p$. Thus, an entity experiences commonly generated adverse credit shocks with arrival intensity $p\lambda_c$. On the other hand, we also assume that the remaining jumps to individual default intensities are idiosyncratic, i.e., not instigated by the arrival of common credit events, and hence the remaining jump intensity is $c - p\lambda_c$, so that the total jump arrival intensity is $c$. Thus, given that the $j$-th entity jumps, the $i$-th entity jumps with conditional probability $p\lambda_c / c$.

Model 2 (Correlated log-normal intensities). Assuming that the intensities change only once a year, for counterparty $i$, the default intensity $\lambda_{i,t}$ during year $t+1$ follows a log-normal model with mean reversion,
\[ \log \lambda_{i,t+1} = \kappa_i(\log \bar{\lambda}_i - \log \lambda_{i,t}) + \sigma_i\,\epsilon_{i,t+1}, \]
where $\kappa_i$ is the rate of mean reversion, $\log \bar{\lambda}_i$ is the steady-state mean level, and $\sigma_i$ is the volatility. Assuming that the $\epsilon_{i,t+1}$ are correlated normal random variables, the intensities become correlated as well.
11.4.3 Copula-Based Correlation Modeling

In recent years, copulas have been used extensively for modeling the correlation of $n$ default times. Let $p_1(t), \ldots, p_n(t)$ be the associated survival functions for the $n$ entities. Then $U_i = p_i(\tau_i)$ follows the uniform distribution on $[0, 1]$.
Let $C(u_1, \ldots, u_n) = P(U_1 \le u_1, \ldots, U_n \le u_n)$ define a copula function, i.e., a joint distribution function for $n$ uniform random variables on $[0, 1]$. Then the joint distribution function for $\tau_1, \ldots, \tau_n$ is
\[ F(x_1, \ldots, x_n) = C(F_1(x_1), \ldots, F_n(x_n)), \]
where $F_i(x)$ is the marginal distribution of $\tau_i$. Typical bivariate copulas include
(a) Independent: $C(u, v) = uv$;
(b) Perfect correlation: $C(u, v) = \min(u, v)$;
(c) Gumbel copula: $C(u, v) = \exp\left(-\left((-\ln u)^{\delta} + (-\ln v)^{\delta}\right)^{1/\delta}\right)$;
(d) Gaussian copula with correlation $\rho$: $C(u, v) = P(\Phi(X) \le u,\ \Phi(Y) \le v)$, where $(X, Y)$ follows the bivariate standard normal distribution with correlation $\rho$ and density
\[ \frac{1}{2\pi\sqrt{1 - \rho^2}} \exp\left(-\frac{x^2 - 2\rho x y + y^2}{2(1 - \rho^2)}\right). \]

A very nice property of copulas is given in the following lemma.

Lemma 11.1. If $(X_1, \ldots, X_n)$ has copula $C(u_1, \ldots, u_n)$, and $T_1(x), \ldots, T_n(x)$ are continuous increasing functions, then $(T_1(X_1), \ldots, T_n(X_n))$ also has the same copula.

It should be mentioned that the copula approach should be used with attention to the dynamic structure of the correlation, e.g., business cycle effects.
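To make the bivariate copulas listed above concrete, the following sketch evaluates the independent, perfect-correlation, and Gumbel copulas and draws a pair of correlated default times from a Gaussian copula. It is an illustration under stated assumptions, not a prescribed method: the exponential marginal survival functions, the intensities, the correlation value, and all function names are hypothetical choices.

```python
import math
import random


def norm_cdf(z):
    """Standard normal distribution function Phi(z)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def independent_copula(u, v):
    return u * v


def perfect_correlation_copula(u, v):
    return min(u, v)


def gumbel_copula(u, v, delta):
    """C(u, v) = exp(-((-ln u)^delta + (-ln v)^delta)^(1/delta)), delta >= 1."""
    total = (-math.log(u)) ** delta + (-math.log(v)) ** delta
    return math.exp(-(total ** (1.0 / delta)))


def sample_gaussian_copula(rho, rng):
    """Draw (U, V) = (Phi(X), Phi(Y)) with (X, Y) standard normal, correlation rho."""
    x = rng.gauss(0.0, 1.0)
    e = rng.gauss(0.0, 1.0)
    y = rho * x + math.sqrt(1.0 - rho * rho) * e
    return norm_cdf(x), norm_cdf(y)


def sample_default_times(rho, lam1, lam2, rng):
    """Correlated default times with (assumed) exponential survival functions.

    Each uniform U_i is treated as the survival probability exp(-lam_i * tau_i)
    and inverted to obtain tau_i.
    """
    u, v = sample_gaussian_copula(rho, rng)
    return -math.log(u) / lam1, -math.log(v) / lam2


rng = random.Random(2010)
print("Gumbel C(0.3, 0.7), delta = 2:", gumbel_copula(0.3, 0.7, 2.0))
print("sample default times         :", sample_default_times(0.6, 0.02, 0.05, rng))
```

For $\delta = 1$ the Gumbel copula reduces to the independent copula, and as $\delta$ grows it approaches the perfect-correlation copula, so $\delta$ acts as a dependence parameter.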
11.5 Credit Derivatives

Credit derivatives may be defined as a specific class of financial instruments whose value is derived from an underlying asset bearing the credit risk of private or sovereign debt issuers. The rationale behind credit derivatives depends on the types of players in the market. For example, the banks, as the largest group of users, intend to free up capital, optimize their balance sheets, manage loan exposure without the consent of the debtor, and compile the regulatory offsets, as well as to reduce and diversify risk. Insurance companies and fund managers have the opportunity to access new classes of assets, such as bank loans. In addition, their aim is to hedge and diversify their portfolios and to reduce risk by buying credit derivative instruments.
In the following, we mainly explain the two most widely used derivatives: credit default swaps (CDS) and collateral debt obligations (CDO).
11.5.1 Credit Default Swaps

A credit swap is a typical type of derivative for defaultable bonds. The credit event is the default of the bond issuer. Party A can sell an insurance contract, called a credit default swap (CDS). A buyer, Party B, can purchase the contract by paying a stream of premiums, say $U$ per unit time (called the default swap spread), until the default time or the maturity time, whichever comes first. In case a default occurs, the contingent payment amount will be paid to the buyer at the default time. For example, it can be the difference between the face value and the market value, or a proportion of the face value.

Let us consider the simplest case. Under the risk-neutral probabilities, suppose the default time $\tau$ follows an exponential distribution with parameter $\lambda^*$. On the one hand, assume that the short rate is $r^*$, a constant. The total expected premium payment at time 0 will be
\begin{align*}
U E^*\left[\int_0^{\min(T, \tau)} e^{-r^* t}\,dt\right]
&= U\left\{E^*\left[\int_0^{\tau} e^{-r^* t}\,dt;\ \tau < T\right] + \int_0^T e^{-r^* t}\,dt\ P^*[\tau > T]\right\} \\
&= U\left\{\int_0^T \left(\int_0^x e^{-r^* t}\,dt\right) \lambda^* e^{-\lambda^* x}\,dx + e^{-\lambda^* T}\int_0^T e^{-r^* t}\,dt\right\} \\
&= U\int_0^T e^{-(r^* + \lambda^*)x}\,dx \\
&= U\,\frac{1}{r^* + \lambda^*}\left[1 - e^{-(r^* + \lambda^*)T}\right].
\end{align*}
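On the other hand, assume the contingent payment at default is the difference between the face value and the market value of the bond maturing at time $T$, that is, $1 - e^{-(T - \tau)r^*}$. Then the expected contingent payment at time 0 becomes
\begin{align*}
E^*\left[e^{-r^*\tau}\left(1 - e^{-(T-\tau)r^*}\right);\ \tau < T\right]
&= E^*\left[\left(e^{-r^*\tau} - e^{-r^* T}\right);\ \tau < T\right] \\
&= \int_0^T \lambda^* e^{-(r^* + \lambda^*)x}\,dx - e^{-r^* T}\left(1 - e^{-\lambda^* T}\right) \\
&= \frac{\lambda^*}{r^* + \lambda^*}\left(1 - e^{-(r^* + \lambda^*)T}\right) - e^{-r^* T} + e^{-(r^* + \lambda^*)T} \\
&= 1 - e^{-r^* T} - \frac{r^*}{r^* + \lambda^*}\left(1 - e^{-(r^* + \lambda^*)T}\right).
\end{align*}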
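By matching the two quantities, we see that the default swap spread $U$ satisfies
\[ U\,\frac{1}{r^* + \lambda^*}\left[1 - e^{-(r^* + \lambda^*)T}\right] = 1 - e^{-r^* T} - \frac{r^*}{r^* + \lambda^*}\left[1 - e^{-(r^* + \lambda^*)T}\right]. \]
Thus,
\[ U = (r^* + \lambda^*)\,\frac{1 - e^{-r^* T}}{1 - e^{-(r^* + \lambda^*)T}} - r^*. \]
Similarly, if the contingent payment at default is a constant proportion $L$ of the face value, then the expected contingent payment becomes
\[ L\,E^*\left[e^{-r^*\tau};\ \tau < T\right] = L\,\frac{\lambda^*}{r^* + \lambda^*}\left[1 - e^{-(r^* + \lambda^*)T}\right]. \]
In this case, the credit swap spread $U$ is simply
\[ U = L\lambda^*. \]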
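The closed-form spreads of this section can be verified by checking that the premium leg and the contingent leg have the same expected discounted value. The sketch below does this for a constant short rate and an exponential default time, as assumed in the text; the function names and the numerical inputs are hypothetical.

```python
import math


def annuity(r, lam, T):
    """E*[integral_0^{min(T, tau)} e^{-r t} dt] = (1 - e^{-(r + lam) T}) / (r + lam)."""
    return (1.0 - math.exp(-(r + lam) * T)) / (r + lam)


def contingent_leg_market_value(r, lam, T):
    """Expected discounted payment of 1 - e^{-(T - tau) r} at default before T."""
    return 1.0 - math.exp(-r * T) - r * annuity(r, lam, T)


def contingent_leg_proportional(r, lam, T, L):
    """Expected discounted payment of a proportion L of face value at default before T."""
    return L * lam * annuity(r, lam, T)


def spread_market_value(r, lam, T):
    """U = (r + lam)(1 - e^{-rT}) / (1 - e^{-(r + lam)T}) - r."""
    return (r + lam) * (1.0 - math.exp(-r * T)) / (1.0 - math.exp(-(r + lam) * T)) - r


def spread_proportional(lam, L):
    """U = L * lam."""
    return L * lam


# Hypothetical inputs: 4% short rate, 3% risk-neutral intensity, 5-year contract.
r, lam, T = 0.04, 0.03, 5.0
U = spread_market_value(r, lam, T)
print("spread (market-value payment):", U)
print("premium leg                  :", U * annuity(r, lam, T))
print("contingent leg               :", contingent_leg_market_value(r, lam, T))
print("spread (proportion L = 0.6)  :", spread_proportional(lam, 0.6))
```

In the proportional case the spread does not depend on the maturity or the short rate, because both legs are multiples of the same annuity factor.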
11.5.2 Collateral Debt Obligations

A CDO is a debt security issued by a special-purpose vehicle (SPV) and backed by a diversified loan or bond portfolio. Traditionally, the SPV purchases the portfolio of bonds and loan securities either in the secondary market or from the balance sheet of a bank. A CDO cash flow structure allocates interest income and principal payments from a collateral pool of different debt instruments to a prioritized collection of CDO securities, called tranches. A standard prioritization scheme is simple subordination: senior CDO notes are paid before mezzanine and lower-subordinated notes are paid, with any residual cash flow paid to an equity piece. Note that very often the SPV also enters into a portfolio default swap contract with a protection buyer, such as a bank, to assume the credit risk of the underlying portfolio.

For modeling a single issuer's default risk for the collateral, we can use the models introduced in Sect. 11.1. For example, to model the intensity of default, we can use a more general model by combining the mean-reversion jump Poisson process and the CIR process. That is, we let
\[ d\lambda_t = \kappa(\theta - \lambda_t)\,dt + \sigma\sqrt{\lambda_t}\,dB_t + \Delta J_t, \]
where $B_t$ is the standard Brownian motion and $J_t$ is a jump Poisson process with mean arrival rate $l$ and mean jump size $\mu$.
For the multi-issuer default model, there are $N$ participants in the collateral pool, whose default times $\tau_1, \ldots, \tau_N$ have default intensity processes $\lambda_1, \ldots, \lambda_N$, respectively. To introduce the correlation in a simple way, we can assume
\[ \lambda_i = X_c + X_i, \quad \text{for } i = 1, \ldots, N, \]
where $X_1, \ldots, X_N$ and $X_c$ are independent and have the same structure as defined by $\lambda_t$. $X_i$ has the parameters $(\kappa, \theta_i, \sigma, \mu, l_i)$ for $i = 1, 2, \ldots, N$, and $X_c$ has the parameters $(\kappa, \theta_c, \sigma, \mu, l_c)$. It can be shown that, marginally, $\lambda_i$ also has the same structure as $\lambda_t$, with parameters $(\kappa, \theta_i + \theta_c, \sigma, \mu, l_i + l_c)$.
Problems

1. Suppose the forward default rate is $\lambda(t) = \alpha t^{\alpha - 1}$ for some $\alpha > 0$. Find the corresponding survival probability and forward survival probability.
2. For a no recovery zero-coupon defaultable bond, suppose both the short rate and the risk-neutral default intensity are constant, $r$ and $\lambda^*$, respectively. Show that the credit spread is equal to $s(t, T) = \lambda^*$.
3. For a $w$-proportion recovery zero-coupon defaultable bond, suppose both the short rate and the risk-neutral default intensity are constant, $r$ and $\lambda^*$, respectively. Find the credit spread $s(t, T)$.
4. Suppose the default time $\tau$ has the inverse Gaussian density as given in Sect. 11.1. Show that for $r > 0$,
\[ E[e^{-r\tau} \mid X_0 = x] = e^{-x\left[\sqrt{2r + m^2} + m\right]}. \]
Bibliographical Notes and Further Reading
Chapter 1 Ross (2005) gives an elementary introduction to probability for students with a calculus background. A more extensive introduction to probability models, mainly for students in industrial engineering, is given by Ross (2006).
Chapter 2 The exponential distribution is discussed in almost all elementary textbooks in statistics and probability. Our discussion is mainly focused on its characterization. Balakrishnan (1996) gives an extensive review of the exponential distribution and its applications. Its primary application in statistics has been in life-testing and survival analysis; see Zacks (1992) for an elementary introduction. A classical introduction to extreme value distributions is given by Gumbel (2004). For a more recent review, see Kotz and Nadarajah (2001). Order statistics have been discussed extensively in the literature. For further reading on this topic, see David and Nagaraja (2003). Associated with order statistics are the record values. A convenient reference is Arnold et al. (1998).
Chapter 3 Almost all elementary textbooks on stochastic processes cover the Poisson process and its related models in one or few chapters (see, e.g., Karlin and Taylor 1975). An advanced discussion on Poisson process is given by Kingman (1993). Our emphasis given here is again on the characterization.
Chapter 4 A collection of common parametric distribution classes is given by Johnson et al. (1995). A more recent discussion emphasizing on data fitting is given by Krishnamoorthy (2006). A more specialized contribution on Weibull models from an engineering statistics point of view is provided by Murthy et al. (2003). Change-point models have been discussed extensively with applications in almost every field. Testing of change on failure (hazard) rate is discussed by Matthews and Farewell (1982). A review of statistical inference on the parametric changepoint models is given by Chen and Gupta (2000). Mixture models have been discussed just as popularly as change-point models in statistical literature, see the books by Everitt (1981), Titterington et al. (1985), and McLachlan and Peel (2000). The materials on the mixture Erlang distribution presented here are taken from Tijms (1994) and Willmot and Lin (2001).
Chapter 5 A classical reference on life distribution class and its related reliability properties, such as closure property under convolution, mixture, and shock models, is given by Barlow and Proschan (1975). A more recent reference on generalizations to reliability models is given by Aven and Jensen (1999). There have been enormous discussions on a variety of extensions of life distribution classes. A common way of introducing life distribution classes is by using stochastic orders. A recent summary of the results is given by Shaked and Shanthikumar (2006). Here, the authors give a different way of defining the life distribution classes and discussing their relationships.
Chapter 6 The bivariate exponential distribution is mainly defined from the memoryless property (Marshall and Olkin 1967). There are many other ways of generalization by just fixing the marginal distributions. The readers are referred to Kotz et al. (2000) for other multivariate exponential distributions as well as multivariate generalizations of gamma and Weibull distributions.
Chapter 7 A variety of concepts of association and dependence are initiated in Lehmann (1966) and Esary et al. (1967). A recent review on multivariate generalizations is given
by Main and Kotz (2001), see also Joe (1997). There are many discussions on multivariate generalizations of life distribution classes, mainly based on the definitions or properties in the univariate case. A recent review based on stochastic orders is given by Shaked and Shanthikumar (2006). A presentation with emphasis on queueing systems is given by Szekli (1995). Total positivity is studied in Karlin (1968) and Karlin and Rinott (1980). Negative association and dependence are discussed in Joag-Dev and Proschan (1983) and Block et al. (1982). The interesting increasing property of Polya functions is given by Efron (1965).
Chapter 8 Renewal theory has played a fundamental role in many applied probability areas. The renewal theorem is established by Blackwell (1948). An important second order approximation for the renewal function is given by Smith (1958). An earlier monograph is given by Cox (1962). Elementary discussions of renewal theorem and models are presented by Karlin and Taylor (1975) and Ross (2006). The presentation given here is mainly based on the elementary induction method and recursive renewal equations. The bounds and monotone properties given here are taken from Brown (1980), Lorden (1970), and Willmot et al. (2001).
Chapter 9 A popular reference for the ruin probability under classical risk models is Grandell (1991). A large amount of literature has been recently presented in the situation of the large claims (heavy-tailed distribution). The induction technique used here is discussed by Cai and Wu (1997), and also, Willmot and Lin (2001). A more practical discussion on risk models, including reinsurance policies, is given by Daykin et al. (1994). For an advanced discussion on risk models with large claims, we refer to Embrechts et al. (2008).
Chapter 10 We only presented the basic concepts and simple techniques in financial mathematics. Stochastic calculus has been playing the fundamental role on more complicated continuous time models and cost structures. An excellent introduction of pricing theory based on utility function and statistical modeling is given by Cochrane (2005). Recent undergraduate textbooks on financial mathematics are referred to Ross (1999) and Capinski and Zastawniak (2003). One of the important topics we missed here is the term structure models for stochastic interest since more
advanced models and techniques are needed. The derivation of Black–Scholes formula from the binomial model is given by Cox et al. (1979).
Chapter 11 Most material of this chapter is taken from Duffie and Singleton (2003). For the pricing of credit risk under more general Cox process with stochastic intensities, we refer to Lando (1998). McNeil et al. (2005) discuss the technique of constructing copulas by fitting available data.
References
Arnold, B.C., Balakrishnan, N., and Nagaraja, H.N. (1998). Records, Wiley, New York
Arrow, K.J. (1963). “Uncertainty and the welfare economics of medical care”, American Economic Review, 53, 941–973
Aven, T. and Jensen, U. (1999). Stochastic Models in Reliability, Springer, New York
Balakrishnan, K. (1996). Exponential Distribution: Theory, Methods and Applications, CRC
Barlow, R.E. and Proschan, F. (1975). Statistical Theory of Reliability and Life Testing, Probability Models, Holt, Rinehart, and Winston, New York
Black, F. and Cox, J. (1976). “Valuing corporate securities: Some effects of bond indenture provisions”, Journal of Finance, 31, 351–367
Black, F. and Scholes, M. (1973). “The pricing of options and corporate liabilities”, Journal of Political Economy, 81, 637–654
Blackwell, D. (1948). “A renewal theorem”, Duke Mathematical Journal, 15, 145–150
Block, H.W. and Savits, T.H. (1976). “The IFRA closure problem”, Annals of Probability, 4, 1020–1032
Block, H.W., Savits, T.H., and Shaked, M. (1982). “Some concepts of negative dependence”, Annals of Probability, 10, 765–772
Brown, M. (1980). “Bounds, inequalities and monotonicity properties for some specialized renewal processes”, Annals of Probability, 8, 227–240
Cai, J. and Wu, Y. (1997). “Some improvements on the Lundberg bound for the ruin probability”, Statistics and Probability Letters, 33, 395–403
Capinski, M. and Zastawniak, T. (2003). Mathematics for Finance, Springer, London
Chen, J. and Gupta, A.K. (2000). Parametric Statistical Change-Point Analysis, Birkhäuser, Boston
Cochrane, J.H. (2005). Asset Pricing, Princeton University Press, Princeton, NJ
Cox, D.R. (1962). Renewal Theory, Longman, London
Cox, J.C., Ross, S.A., and Rubinstein, M. (1979). “Option pricing: a simplified approach”, Journal of Financial Economics, 7, 229–263
Dartsch, A. and Weinrich, G. (2002). “Das gesamtprojekt internes ratings”, In: Hofmann, G. (Ed.), Basel II und MaK – Vorgaben, bankinterne Verfahren, Bewertungen, Bankakademie-Verlag, 131–145
David, H.A. and Nagaraja, H.N. (2003). Order Statistics, 3rd ed., Wiley-Interscience
Davis, D.J. (1952). “An analysis of some failure data”, Journal of American Statistical Association, 47, 113–150
Daykin, C.D., Pentikainen, T., and Pesonen, M. (1994). Practical Risk Theory for Actuaries, Chapman and Hall, London
Duffie, D. and Singleton, K.J. (2003). Credit Risk: Pricing, Measurement, and Management, Princeton University Press, Princeton and Oxford
Efron, B. (1965). “Increasing properties of Polya frequency functions”, Annals of Mathematical Statistics, 36, 272–279
Embrechts, P., Klüppelberg, C., and Mikosch, T. (2008). Modelling Extremal Events for Insurance and Finance, Springer, New York
Esary, J.D., Proschan, F., and Walkup, D. (1967). “Association of random variables with applications”, Annals of Mathematical Statistics, 38, 1466–1474
Everitt, B. (1981). Finite Mixture Distributions, Springer
Grandell, J. (1991). Aspects of Risk Theory, Springer, New York
Gumbel, E.J. (2004). Statistics of Extremes, Dover Publications
Hardy, G.H., Littlewood, J.E., and Polya, G. (1952). Inequalities, 2nd ed., Cambridge University Press, London/New York
Joag-Dev, K. and Proschan, F. (1983). “Negative association of random variables with applications”, Annals of Statistics, 11, 286–295
Joe, H. (1997). Multivariate Models and Dependence Concepts, Chapman and Hall, London
Johnson, N.L., Kotz, S., and Balakrishnan, N. (1995). Continuous Univariate Distributions, Volume 2, Wiley-Interscience
Kao, J.H.K. (1956). “A new life-quality measure for electron tubes”, IRE Transactions on Reliability and Quality Control PGRQC-7
Karlin, S. (1968). Total Positivity, Stanford University Press, Stanford, CA
Karlin, S. and Rinott, Y. (1980). “Classes of orderings of measures and related correlation inequalities. I Multivariate totally positive distributions”, Journal of Multivariate Analysis, 10, 467–498
Karlin, S. and Taylor, H.M. (1975). A First Course in Stochastic Processes, 2nd ed., Academic
Kingman, J.F.C. (1993). Poisson Processes, Oxford
Kotz, S., Balakrishnan, N., and Johnson, N.L. (2000). Continuous Multivariate Distributions, Wiley-Interscience
Kotz, S. and Nadarajah, S. (2001). Extreme Value Distributions: Theory and Applications, World Scientific
Krishnamoorthy, K. (2006). Handbook of Statistical Distributions with Applications, Chapman and Hall/CRC
Lando, D. (1998). “On Cox processes and credit risky securities”, Review of Derivatives Research, 2 (2–3), 99–120
Lehmann, E.L. (1966). “Some concepts of dependence”, Annals of Mathematical Statistics, 37, 1137–1153
Lieblein, J. and Zelen, M. (1956). “Statistical investigation of the fatigue life of deep-groove ball bearings”, Journal of Research of National Bureau of Standards, 57, 273–316
Lorden, G. (1970). “On the excess over the boundary”, Annals of Mathematical Statistics, 41, 520–527
Main, D.D. and Kotz, S. (2001). Correlation and Dependence, World Scientific
Marshall, A.W. and Olkin, I. (1967). “A multivariate exponential distribution”, Journal of the American Statistical Association, 62, 30–44
Matthews, D.E. and Farewell, V.T. (1982). “On testing for constant hazard rate against a change-point alternative”, Biometrics, 38, 463–468
McLachlan, G. and Peel, D. (2000). Finite Mixture Models, Wiley-Interscience
McNeil, A., Frey, R., and Embrechts, P. (2005). Quantitative Risk Management: Concepts, Techniques and Tools, Princeton University Press, New Jersey
Merton, R. (1974). “On the pricing of corporate debt: The risk structure of interest rates”, Journal of Finance, 29, 449–470
Murthy, D.N.P., Xie, M., and Jiang, R. (2003). Weibull Models, Wiley-Interscience
Ross, S.M. (1999). An Introduction to Mathematical Finance: Options and Other Topics, Cambridge University Press, Cambridge
Ross, S.M. (2005). A First Course in Probability, Prentice Hall, NJ
Ross, S.M. (2006). Introduction to Probability Models, 9th ed., Academic
Shaked, M. and Shanthikumar, J.G. (2006). Stochastic Orders, Springer, New York
Siegmund, D. (1985). Sequential Analysis: Tests and Confidence Intervals, Springer, New York
Smith, W.L. (1958). “Renewal theory and its ramifications”, Journal of the Royal Statistical Society, Series B, 20, 243–302
Stoyanov, J. (1997). Counterexamples in Probability, Wiley, New York
Szekli, R. (1995). Stochastic Ordering and Dependence in Applied Probability, Lecture Notes in Statistics, 97, Springer, New York
Tata, M.N. (1969). “On outstanding values in a sequence of random variables”, Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 12, 9–20
Tijms, H. (1994). Stochastic Models: An Algorithmic Approach, Wiley, Chichester
Titterington, D.M., Smith, A.F.M., and Makov, U.E. (1985). Statistical Analysis of Finite Mixture Distributions, Wiley, Chichester
Weibull, W. (1939). “The phenomenon of rupture in solids”, Ingeniörs Vetenskaps Akademiens Handlingar, 153, 293–297
Willmot, G., Cai, J., and Lin, X.S. (2001). “Lundberg inequalities for renewal equations”, Advances in Applied Probability, 33, 674–689
Willmot, G. and Lin, X.S. (2001). Lundberg Approximations for Compound Distributions with Insurance Applications, Lecture Notes in Statistics, 156, Springer, New York
Zacks, S. (1992). Introduction to Reliability Analysis: Probability Models and Statistical Methods, Springer, New York
Answers and Solutions to Selected Problems
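Chapter 1
1. (a) $F(x) = 1 - (1-x)^5$ for $0 < x < 1$. (b) $\mu = 1/6$. (c) From $\bar F(c) = 0.01$, we get $c = 1 - 0.01^{1/5}$.
2. (a) $f(x) = 4/(200(1 + x/200)^5)$ for $0 < x < \infty$. (b) $\mu = \int_0^\infty (1 + x/200)^{-4}\,dx = 200/3$. (c) $F(t|200) = 1 - 1/(1 + t/400)^4$. (d) $\mu(200) = \int_0^\infty (1 + t/400)^{-4}\,dt = 400/3$.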
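3. (a) $F(1) = (e-1)/(e+1)$. (b) $F(1|1) = e(e-1)/(e^2+1)$.
4. (a) $\bar F(x) = e^{-2\sqrt{x}}$ for $0 < x < \infty$. (b) $f(x) = r(x)\bar F(x) = (1/\sqrt{x})\,e^{-2\sqrt{x}}$. (c) $\mu = 1/2$.
5. By using the definition of conditional expectation and integration by parts, we have
\[
\begin{aligned}
\mu(t) = E[X - t \mid X > t] &= \frac{E[X - t;\, X > t]}{P[X > t]} = \frac{1}{\bar F(t)} \int_t^\infty (x - t)\,dF(x) \\
&= \frac{1}{\bar F(t)} \left[ -\bar F(x)(x - t)\Big|_t^\infty + \int_t^\infty \bar F(x)\,dx \right] \\
&= \frac{1}{\bar F(t)} \int_t^\infty \bar F(x)\,dx.
\end{aligned}
\]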
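The formula in Problem 5 can be checked numerically. The following is a minimal sketch, not part of the original solutions: the helper name `mean_residual_life`, the trapezoidal rule, and the exponential rate 0.5 are illustrative assumptions.

```python
import math

def mean_residual_life(sf, t, upper, steps=200_000):
    """Approximate mu(t) = (1/sf(t)) * integral_t^upper sf(x) dx by the trapezoidal rule."""
    h = (upper - t) / steps
    total = 0.5 * (sf(t) + sf(upper))
    for i in range(1, steps):
        total += sf(t + i * h)
    return total * h / sf(t)

# Problem 1: survival function (1 - x)^5 on (0, 1); mu = mu(0) should be close to 1/6.
mu_problem1 = mean_residual_life(lambda x: (1.0 - x) ** 5, 0.0, 1.0)

# Exponential check (rate 0.5 is an arbitrary choice): mu(t) = 1/rate for every t.
rate = 0.5
mu_exponential = mean_residual_life(lambda x: math.exp(-rate * x), 3.0, 3.0 + 80.0)
```

The upper limit stands in for infinity, so it has to be large enough that the neglected tail is negligible for the survival function at hand.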
6. (a) By writing $k = \sum_{i=1}^{k} 1$ and exchanging the summations, we have
\[
E[N] = \sum_{k=1}^{\infty} k P[N = k] = \sum_{k=1}^{\infty} \sum_{i=1}^{k} P[N = k] = \sum_{i=1}^{\infty} \sum_{k=i}^{\infty} P[N = k] = \sum_{i=1}^{\infty} P[N \ge i].
\]
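(b) Again by exchanging the summations, we have
\[
\begin{aligned}
2\sum_{k=1}^{\infty} k P[N > k] &= 2\sum_{k=1}^{\infty} k \sum_{i=k+1}^{\infty} P[N = i] = 2\sum_{i=2}^{\infty} P[N = i] \sum_{k=1}^{i-1} k \\
&= \sum_{i=2}^{\infty} i(i-1) P[N = i] = \sum_{i=1}^{\infty} i^2 P[N = i] - \sum_{i=1}^{\infty} i P[N = i] = E[N^2] - E[N].
\end{aligned}
\]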
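Both identities in Problem 6 are easy to confirm on a small example. This sketch is illustrative only: the function name `tail_sum_checks` and the four-point pmf are assumptions, not taken from the text.

```python
def tail_sum_checks(pmf):
    """pmf maps k = 1, 2, ... to P[N = k]; returns the four quantities compared in Problem 6."""
    support = sorted(pmf)
    mean = sum(k * pmf[k] for k in support)
    second = sum(k * k * pmf[k] for k in support)
    tail_ge = lambda i: sum(pmf[k] for k in support if k >= i)  # P[N >= i]
    tail_gt = lambda i: sum(pmf[k] for k in support if k > i)   # P[N > i]
    upper = support[-1]
    sum_tails = sum(tail_ge(i) for i in range(1, upper + 1))
    weighted = 2 * sum(k * tail_gt(k) for k in range(1, upper + 1))
    return mean, sum_tails, second - mean, weighted

# Illustrative pmf on {1, 2, 3, 4}
pmf = {1: 0.1, 2: 0.2, 3: 0.3, 4: 0.4}
mean, sum_tails, factorial_moment, weighted = tail_sum_checks(pmf)
```

Here `mean` should agree with `sum_tails` (part (a)) and `factorial_moment` with `weighted` (part (b)).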
7. $M(t) = (e^t - 1)/t$.
8. (a) $\bar F(50) = e^{-1.43}$. (b) $\bar F(60) = e^{-2.28}$.
9. $E[X^k] = \int P[X^k > y]\,dy = \int P[X > x]\,d(x^k) = \int k x^{k-1} P[X > x]\,dx$.
10. (a) $\bar F(6) = e^{-3.25}$. (b) $F(2|6) = 1 - \bar F(8)/\bar F(6) = 1 - e^{-\int_6^8 r(t)\,dt} = 1 - e^{-2.2}$. (c) $\bar F(y|6) = e^{-1.1y}$. Thus $\mu(6) = 1/1.1$.
11. (a) $E[(X - D)_+] = \int_D^\infty (x - D) f(x)\,dx = \int_D^\infty \bar F(x)\,dx$. (b) $E[X;\, X > D] = D\bar F(D) + E[(X - D)_+] = D\bar F(D) + \int_D^\infty \bar F(x)\,dx$.
12. See the proof of Theorem 2.1.
13. $E[X \mid X > t] = t + E[X - t \mid X > t] = t + \mu(t)$; $E[X \mid X \le t] = t - \frac{1}{F(t)} \int_0^t F(x)\,dx$.
14. Note that $\frac{d}{dt}(t + \mu(t)) = r(t)\mu(t)$.
15. (a) Use the independence of the $n$ Bernoulli trials.
16. (a) $c = 2$; (b) $P[X > 2] = e^{-4}$; (c) $P[X > 3 \mid X > 2] = e^{-2}$.
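17. (a) It is obtained from the normal table. (b)
\[
\begin{aligned}
E[L \mid L \ge l] &= \mu + E[L - \mu \mid L \ge l] = \mu + \frac{1}{0.01} \int_l^\infty (x - \mu) \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}\,dx \\
&= \mu + \frac{\sigma}{0.01\sqrt{2\pi}} \int_{2.326}^\infty x\, e^{-\frac{x^2}{2}}\,dx = \mu + \frac{\sigma}{0.01\sqrt{2\pi}}\, e^{-\frac{2.326^2}{2}}.
\end{aligned}
\]
18. (a) $f_Y(y) = e^{-y}$ for $0 < y < \infty$; (b) $f_{X|Y}(x|y) = \frac{1}{y}$ for $0 < x < y$; (c) $E[X \mid Y = y] = \frac{y}{2}$; (d) $\mathrm{Var}(X \mid Y = y) = \frac{y^2}{12}$.
20. $E[X \mid X > \frac{1}{2}] = \frac{3}{4}$.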
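21.
\[
\begin{aligned}
E[X] &= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x f(x, y)\,dx\,dy = \int_{-\infty}^{\infty} f_Y(y) \int_{-\infty}^{\infty} x\, \frac{f(x, y)}{f_Y(y)}\,dx\,dy \\
&= \int_{-\infty}^{\infty} f_Y(y) \int_{-\infty}^{\infty} x f_{X|Y}(x|y)\,dx\,dy.
\end{aligned}
\]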
22. Note that $\mathrm{Cov}(X, Y \mid X) = 0$, as $X$ given $X$ is a constant.
23. Note that $\mathrm{Cov}(X, E[Y|X]) = b\,\mathrm{Var}(X)$.
24. Note that
\[
\mathrm{Cov}(E[X|Z], E[Y|Z]) = E[E[X|Z]E[Y|Z]] - E[E[X|Z]]\,E[E[Y|Z]] = E[E[X|Z]E[Y|Z]] - E[X]E[Y],
\]
and
\[
E[\mathrm{Cov}(X, Y \mid Z)] = E[E[XY|Z]] - E[E[X|Z]E[Y|Z]] = E[XY] - E[E[X|Z]E[Y|Z]].
\]
25. $M_Y(t) = E[e^{tY}] = E[E[e^{tY} \mid N]] = E[(M_X(t))^N] = E[e^{N \ln M_X(t)}] = M_N(\ln M_X(t))$.
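The identity in Problem 25 can be illustrated with a concrete pair of distributions. In this sketch the choice of a Poisson count $N$ and exponential summands, the parameter values, and the function names are all assumptions made for illustration.

```python
import math

def compound_mgf_direct(t, lam, mgf_x, terms=200):
    """E[e^{tY}] computed as sum_n P[N = n] * M_X(t)^n for N ~ Poisson(lam)."""
    m = mgf_x(t)
    total, weight = 0.0, math.exp(-lam)  # weight = P[N = 0]
    for n in range(terms):
        total += weight * m ** n
        weight *= lam / (n + 1)          # P[N = n + 1] from P[N = n]
    return total

def compound_mgf_formula(t, lam, mgf_x):
    """M_N(ln M_X(t)) with M_N(s) = exp(lam * (e^s - 1)) for Poisson N."""
    s = math.log(mgf_x(t))
    return math.exp(lam * (math.exp(s) - 1.0))

# Illustrative choice: exponential summands with rate 2, so M_X(t) = 2 / (2 - t) for t < 2.
mgf_exp = lambda t: 2.0 / (2.0 - t)
direct = compound_mgf_direct(0.5, 3.0, mgf_exp)
formula = compound_mgf_formula(0.5, 3.0, mgf_exp)
```

The truncated series and the closed form should agree to within floating-point error.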
Chapter 2
1. (a) $\bar F(x) = e^{-x/5}$. (b) $\bar F(2) = e^{-2/5}$. (c) $\bar F(2|2) = \bar F(2)$.
2. (a) $\bar F(2) = e^{-1}$. (b) $\bar F(1|9) = \bar F(1) = e^{-1/2}$.
3. $P[X/Y > z] = P[X > zY] = \int_0^\infty e^{-zy} e^{-y}\,dy = 1/(1 + z)$. Thus, $h(z) = 1/(1 + z)^2$.
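4. $E[X - a \mid X > a] = E[X]$ is equivalent to $E[X \mid X > a] = a + E[X]$.
5. (a)
\[
E[X \mid X \le c] = \frac{E[X;\, X \le c]}{P[X \le c]} = \frac{\int_0^c x \lambda e^{-\lambda x}\,dx}{1 - e^{-\lambda c}} = \frac{1}{\lambda} - \frac{c\, e^{-\lambda c}}{1 - e^{-\lambda c}}.
\]
(b) From $E[X] = E[X \mid X \le c]\,P[X \le c] + E[X \mid X > c]\,P[X > c]$, we have $\frac{1}{\lambda} = E[X \mid X \le c](1 - e^{-\lambda c}) + \left(c + \frac{1}{\lambda}\right) e^{-\lambda c}$. After simplifying, we get the same result.
6. (a) $E[X_{(1)}] = \frac{1}{2\lambda}$; $\mathrm{Var}(X_{(1)}) = \frac{1}{4\lambda^2}$.
(b) $E[X_{(2)}] = E[X_{(2)} - X_{(1)}] + E[X_{(1)}] = \frac{1}{\lambda} + \frac{1}{2\lambda}$. $\mathrm{Var}(X_{(2)}) = \mathrm{Var}(X_{(2)} - X_{(1)}) + \mathrm{Var}(X_{(1)}) = \frac{1}{\lambda^2} + \frac{1}{4\lambda^2}$.
7. (a) $E[X_{1:n}] = \frac{1}{n\lambda}$.
(b) $E[X_{k:n}] = \sum_{j=2}^{k} E[X_{j:n} - X_{(j-1):n}] + E[X_{1:n}] = \frac{1}{\lambda} \left( \frac{1}{n-k+1} + \cdots + \frac{1}{n} \right)$.
8. (a) $E[X] = \frac{1}{p}$; $\mathrm{Var}(X) = \frac{1-p}{p^2}$.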
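(b) $P[X > k] = (1 - p)^k$.
9. (a) $E[X_D] = \int_{400}^\infty e^{-0.02x}\,dx = 50e^{-8}$. (b) $E[X_{DF}] = 450e^{-8}$.
10. $P[R_2 \le y] = \int_0^y P[R_2 \le y \mid R_1 = x]\, f(x)\,dx = \int_0^y \left( 1 - \frac{\bar F(y)}{\bar F(x)} \right) f(x)\,dx = F(y) + \bar F(y) \ln \bar F(y)$.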
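As a sanity check on Problem 10, assume that $R_2$ denotes the second record value and take $F$ to be the unit exponential. By the memoryless property $R_2$ is then the sum of two independent unit exponentials, so the formula should reproduce the Erlang distribution function $1 - e^{-y}(1 + y)$. The sketch below compares the two; the function names are illustrative.

```python
import math

def second_record_cdf(y, cdf):
    """P[R_2 <= y] = F(y) + (1 - F(y)) * ln(1 - F(y)), as in Problem 10."""
    sf = 1.0 - cdf(y)
    return cdf(y) + sf * math.log(sf)

def erlang2_cdf(y):
    """CDF of the sum of two independent unit-rate exponentials."""
    return 1.0 - math.exp(-y) * (1.0 + y)

unit_exp_cdf = lambda y: 1.0 - math.exp(-y)
values = [(second_record_cdf(y, unit_exp_cdf), erlang2_cdf(y)) for y in (0.5, 1.0, 2.5)]
```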
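11. Note that for $0 < x < \infty$, $-\infty < y = \log x < \infty$. Thus,
\[
P[Y \le y] = P[\log X \le y] = P[X \le e^y] = 1 - e^{-e^y}.
\]
Thus, the density of $Y$ is $e^y e^{-e^y}$ for $-\infty < y < \infty$.
12. Note that for $0 \le x \le 1$, $1 \le y = e^x \le e$. Thus,
\[
P[Y \le y] = P[e^X \le y] = P[X \le \log y] = \log y.
\]
Thus, the density of $Y$ is $\frac{1}{y}$ for $1 \le y \le e$.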
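13. (a) $e^{-1.75/0.59}$; (b) $e^{-2/0.59}$.
14. (a) $0.7^2$; (b) 1.7.
15. (a) $\mu = 1/0.2$; (b) $e^{-0.8}$; (c) $e^{-0.4} - e^{-1.2}$.
16. (a) $P[\max(X_1, X_2) \le z] = P[X_1 \le z]\,P[X_2 \le z] = (1 - e^{-\lambda z})^2$; (b) $f(z) = 2\lambda e^{-\lambda z}(1 - e^{-\lambda z})$; (c) $r(z) = \frac{2\lambda(1 - e^{-\lambda z})}{2 - e^{-\lambda z}}$.
17. (a)
\[
P[X_1 + X_2 \le z] = \int_0^z P[X_1 < z - u]\, \lambda e^{-\lambda u}\,du = \int_0^z (1 - e^{-\lambda(z-u)})\, \lambda e^{-\lambda u}\,du = 1 - e^{-\lambda z} - \lambda z e^{-\lambda z} = 1 - (1 + \lambda z) e^{-\lambda z};
\]
(b) $f(z) = \lambda^2 z e^{-\lambda z}$; (c) $r(z) = \frac{\lambda^2 z}{1 + \lambda z}$.
18. (a) $P[\min(X, Y) > z] = P[X > z]\,P[Y > z] = e^{-(\lambda + \delta)z}$; (b) $P[X > Y] = \int_0^\infty P[X > y]\, \delta e^{-\delta y}\,dy = \delta \int_0^\infty e^{-(\lambda + \delta)y}\,dy = \frac{\delta}{\lambda + \delta}$.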
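19. (a) $1/(0.2 + 0.25)$; (b) $0.2/(0.2 + 0.25)$.
20. $\frac{1}{\alpha} \ln E[e^{\alpha X}] = -\frac{\ln(1 - \alpha/\lambda)}{\alpha}$.
21. (a) $l = -\frac{1}{\lambda} \ln 0.01$; (b) $E[L \mid L \ge l] = l + E[L - l \mid L \ge l] = l + \frac{1}{\lambda} = \frac{1}{\lambda}(-\ln 0.01 + 1)$.
22. By using the memoryless property,
\[
P[Y > X_1 + X_2 \mid Y > X_1] = \int_0^\infty P[Y > x + X_2 \mid Y > x]\,dF_{X_1}(x) = P[Y > X_2].
\]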
23. (a) It is obtained from the property of permutations. (b) By using $E[N] = \sum_{n=1}^{\infty} P[N \ge n]$ and noting that $P[N \ge n] = \frac{1}{(n-1)!}$, we have $E[N] = e$.
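Chapter 3
1. (a) $E(X) = \lambda$, $\mathrm{Var}(X) = \lambda$.
(b) $P[X = k] = \frac{\lambda^k}{k!} e^{-\lambda} = \frac{\lambda}{k} \cdot \frac{\lambda^{k-1}}{(k-1)!} e^{-\lambda} = \frac{\lambda}{k} P[X = k-1]$.
(c) $E[X(X-1)\cdots(X-i)] = \lambda^{i+1} \sum_{k=i+1}^{\infty} \frac{\lambda^{k-i-1}}{(k-i-1)!} e^{-\lambda} = \lambda^{i+1}$.
2.
\[
\begin{aligned}
P[X + Y = k] &= \sum_{j=0}^{k} P[X = j]\, P[Y = k - j] = \sum_{j=0}^{k} \frac{\lambda_1^j}{j!} e^{-\lambda_1} \frac{\lambda_2^{k-j}}{(k-j)!} e^{-\lambda_2} \\
&= \frac{(\lambda_1 + \lambda_2)^k}{k!} e^{-(\lambda_1 + \lambda_2)} \sum_{j=0}^{k} \frac{k!}{j!(k-j)!} \left( \frac{\lambda_1}{\lambda_1 + \lambda_2} \right)^j \left( \frac{\lambda_2}{\lambda_1 + \lambda_2} \right)^{k-j} \\
&= \frac{(\lambda_1 + \lambda_2)^k}{k!} e^{-(\lambda_1 + \lambda_2)}.
\end{aligned}
\]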
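The convolution in Problem 2 can be verified numerically. A minimal sketch follows; the rates 1.3 and 2.1 and the function names are arbitrary illustrative choices.

```python
import math

def poisson_pmf(k, lam):
    """P[X = k] for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def convolved_pmf(k, lam1, lam2):
    """P[X + Y = k] as the sum over j of P[X = j] P[Y = k - j]."""
    return sum(poisson_pmf(j, lam1) * poisson_pmf(k - j, lam2) for j in range(k + 1))

lam1, lam2 = 1.3, 2.1  # arbitrary illustrative rates
max_gap = max(abs(convolved_pmf(k, lam1, lam2) - poisson_pmf(k, lam1 + lam2))
              for k in range(15))
```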
3. (a) $e^{-2}$. (b) 2:00 p.m. (c) $1 - 2e^{-2.2}$.
4. (a) $e^{-3}$. (b) 67. (c) $\frac{50!}{5!\,45!}\, 0.05^5\, 0.95^{45}$.
5. (a) $e^{-\lambda T}$.
(b) By using the memoryless property of the exponential distribution, we have
\[
\begin{aligned}
E[W] &= E[W;\, X_1 > T] + E[W;\, X_1 \le T] \\
&= E[X_1;\, X_1 > T] + E[X_1;\, X_1 \le T] + E[W]\,P[X_1 \le T] \\
&= E[X_1] + E[W]\,P[X_1 \le T].
\end{aligned}
\]
Thus, $E[W] = E[X_1]/(1 - P[X_1 \le T]) = \frac{1}{\lambda} e^{\lambda T}$.
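6. (a) $E[S_4] = \frac{4}{\lambda}$. (b) $E[S_4 \mid N(1) = 2] = 1 + \frac{2}{\lambda}$. (c) $E[N(4) - N(2) \mid N(1) = 3] = 2\lambda$.
7. (a) $P[X(t) = 0] = e^{-2t}$. (b) $2t$.
8. For the $n$ “old” customers, the mean number still in service at time $t + s$ is $n e^{-\mu t}$. Given that a customer arrives between $s$ and $t + s$, its arrival time follows the uniform distribution. Thus, the probability that it is still in service at time $t + s$ is $\int_0^t e^{-\mu x} \frac{1}{t}\,dx = (1 - e^{-\mu t})/(\mu t)$. Thus, the total mean number of customers at time $t + s$ is
\[
E[X(t + s) \mid X(s) = n] = n e^{-\mu t} + \lambda (1 - e^{-\mu t})/\mu.
\]
9. (a) $(1/3)^2$. (b) $1 - (2/3)^2$.
10. (a) Given an arrival before time $t$, its arrival time follows the uniform distribution. Thus, $E[X \mid N(t) = n] = nt/2$.
(b) $\mathrm{Var}(X \mid N(t) = n) = nt^2/12$.
(c) To find the variance of $X$, we use the identity
\[
\mathrm{Var}(X) = E[\mathrm{Var}(X \mid N(t))] + \mathrm{Var}(E[X \mid N(t)]) = E[N(t)]\, t^2/12 + \mathrm{Var}(N(t))\,(t/2)^2 = \lambda t^3/12 + \lambda t^3/4 = \lambda t^3/3.
\]
11. (a) By conditioning on the departure time $T$, we have
\[
E[N] = \int_0^T E[N \mid T = t]\, \frac{dt}{T} = \int_0^T \lambda t\, \frac{dt}{T} = \frac{\lambda T}{2};
\]
(b) Similarly,
\[
\begin{aligned}
\mathrm{Var}[N] = E[N^2] - (E(N))^2 &= \int_0^T E[N^2 \mid T = t]\, \frac{dt}{T} - \left( \frac{\lambda T}{2} \right)^2 = \frac{1}{T} \int_0^T ((\lambda t)^2 + \lambda t)\,dt - \left( \frac{\lambda T}{2} \right)^2 \\
&= \frac{(\lambda T)^2}{3} + \frac{\lambda T}{2} - \frac{(\lambda T)^2}{4} = \frac{(\lambda T)^2}{12} + \frac{\lambda T}{2}.
\end{aligned}
\]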
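12. (a) Note that for $0 < x < t$,
\[
\begin{aligned}
P[T_1 \in dx \mid N(t) = 1] &= \frac{P[T_1 \in dx;\, N(t) = 1]}{P[N(t) = 1]} = \frac{P[T_1 \in dx;\, T_2 - T_1 > t - x]}{P[N(t) = 1]} \\
&= \frac{P[T_1 \in dx]\, P[T_2 - T_1 > t - x]}{P[N(t) = 1]} = \frac{\lambda e^{-\lambda x} e^{-\lambda(t - x)}\,dx}{\lambda t e^{-\lambda t}} = \frac{dx}{t}.
\end{aligned}
\]
(b) Generally, for $0 < t_1 < t_2 < \cdots < t_n < t$,
\[
P[T_1 \in dt_1, \ldots, T_n \in dt_n \mid N(t) = n] = \frac{\lambda e^{-\lambda t_1} \cdots \lambda e^{-\lambda(t_n - t_{n-1})}\, e^{-\lambda(t - t_n)}\,dt_1 \cdots dt_n}{(\lambda t)^n e^{-\lambda t}/n!} = \frac{n!}{t^n}\,dt_1 \cdots dt_n.
\]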
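13.
\[
\begin{aligned}
P[N(s) = k \mid N(t) = n] &= \frac{P[N(s) = k,\, N(t) - N(s) = n - k]}{P[N(t) = n]} = \frac{\frac{(\lambda s)^k}{k!} e^{-\lambda s} \cdot \frac{(\lambda(t - s))^{n-k}}{(n-k)!} e^{-\lambda(t - s)}}{\frac{(\lambda t)^n}{n!} e^{-\lambda t}} \\
&= \frac{n!}{k!(n-k)!} \left( \frac{s}{t} \right)^k \left( 1 - \frac{s}{t} \right)^{n-k}.
\end{aligned}
\]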
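Problem 13 says that the conditional count is binomial with success probability $s/t$, whatever the rate. The sketch below checks this for two rates; the values of `s`, `t`, `n`, the rates, and the function names are illustrative assumptions.

```python
import math

def poisson_pmf(k, mean):
    """P[X = k] for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def conditional_count(k, n, s, t, lam):
    """P[N(s) = k | N(t) = n] computed from independent Poisson increments."""
    joint = poisson_pmf(k, lam * s) * poisson_pmf(n - k, lam * (t - s))
    return joint / poisson_pmf(n, lam * t)

def binomial_pmf(k, n, p):
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)

s, t, n = 1.5, 4.0, 6
gaps = [abs(conditional_count(k, n, s, t, lam) - binomial_pmf(k, n, s / t))
        for lam in (0.3, 2.0) for k in range(n + 1)]
```

The rate cancels in the ratio, which is why the same binomial probabilities appear for both values of `lam`.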
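14. $U(t) = E[N(t)] = \lambda t$.
15. Obviously,
\[
P[X = i]/P[X = i - 1] = \frac{\lambda}{i}.
\]
Thus, $P[X = i]$ increases for $i \le \lambda$, decreases for $i \ge \lambda$, and reaches its maximum at $i = [\lambda]$, the largest integer not exceeding $\lambda$.
16. By taking the derivative of $P[X = k]$ with respect to $\lambda$, we see that $\frac{d}{d\lambda} P[X = k] = 0$ is equivalent to $k\lambda^{k-1} = \lambda^k$, or $\lambda = k$.
17. $E[X^3] = \lambda E[(X + 1)^2] = \lambda [E[X^2] + 2E[X] + 1] = \lambda [\mathrm{Var}(X) + (E(X))^2 + 2\lambda + 1] = \lambda(\lambda^2 + 3\lambda + 1)$.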
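18. Using the Taylor expansion for $e^{\lambda} + e^{-\lambda}$, we have
\[
e^{\lambda} + e^{-\lambda} = \sum_{k=0}^{\infty} \frac{\lambda^k}{k!} + \sum_{k=0}^{\infty} \frac{(-\lambda)^k}{k!} = 2 \sum_{i=0}^{\infty} \frac{\lambda^{2i}}{(2i)!}.
\]
Thus,
\[
P[X \text{ is even}] = \sum_{i=0}^{\infty} \frac{\lambda^{2i}}{(2i)!} e^{-\lambda} = \frac{1}{2}(1 + e^{-2\lambda}).
\]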
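A direct summation confirms the closed form in Problem 18. In this sketch the rate 1.7, the truncation at 200 terms, and the name `prob_even` are illustrative assumptions.

```python
import math

def prob_even(lam, terms=200):
    """Sum of Poisson(lam) probabilities over even k, accumulated term by term."""
    total, term = 0.0, math.exp(-lam)      # term = P[X = 0]
    for i in range(terms):
        total += term
        term *= lam * lam / ((2 * i + 1) * (2 * i + 2))  # P[X = 2i + 2] from P[X = 2i]
    return total

lam = 1.7  # arbitrary illustrative rate
gap = abs(prob_even(lam) - 0.5 * (1.0 + math.exp(-2.0 * lam)))
```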
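19. Obviously,
\[
E[Y] = \sum_{k=1}^{\infty} k P[Y = k] = \sum_{k=1}^{\infty} k P[X = k \mid X > 0] = \frac{1}{1 - e^{-\lambda}} \sum_{k=1}^{\infty} k P[X = k] = \frac{\lambda}{1 - e^{-\lambda}}.
\]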
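21. (a)
\[
E\left[ e^{zY(t)} \right] = E\left[ E\left[ \exp\left( z \sum_{i=1}^{N(t)} X_i \right) \Big| N(t) \right] \right] = E[(M(z))^{N(t)}] = E[e^{N(t) \ln M(z)}] = e^{\lambda t (M(z) - 1)};
\]
(b) $E[Y(t)] = \lambda t E[X_1]$;
(c) $\mathrm{Var}(Y(t)) = E[\mathrm{Var}(Y(t) \mid N(t))] + \mathrm{Var}(E[Y(t) \mid N(t)]) = E[N(t)]\,\mathrm{Var}(X_1) + \mathrm{Var}(N(t) E[X_1]) = \lambda t (\mathrm{Var}(X_1) + (E[X_1])^2)$;
(d) $\mathrm{Cov}(N(t), Y(t)) = \mathrm{Cov}(N(t), E[Y(t) \mid N(t)]) + E[\mathrm{Cov}(N(t), Y(t) \mid N(t))] = \mathrm{Cov}(N(t), N(t) E[X_1]) = \mathrm{Var}(N(t))\, E[X_1] = \lambda t E[X_1]$.
22. (a) $\mathrm{Cov}(T, N(T)) = \mathrm{Cov}(T, E[N(T) \mid T]) = \mathrm{Cov}(T, \lambda T) = \lambda \mathrm{Var}(T)$;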
(b) $\mathrm{Var}(N(T)) = \mathrm{Var}(E[N(T) \mid T]) + E[\mathrm{Var}(N(T) \mid T)] = \mathrm{Var}(\lambda T) + E[\lambda T] = \lambda^2 \mathrm{Var}(T) + \lambda E[T]$.
Chapter 4
1. (a) $\bar F(x) = e^{-x^{0.5}}$, $r(x) = 0.5x^{-0.5}$, $E[X] = 2$, and $\mathrm{Var}(X) = 20$.
(b) $\bar F(x) = e^{-(0.5x)^2}$, $r(x) = 0.5x$, $E[X] = \sqrt{\pi}$, and $\mathrm{Var}(X) = 4 - \pi$.
2. $E[X] = \mathrm{Var}(X) = 0.5$.
3. (a) $\bar F(x) = (1 + x)e^{-x}$; (b) $r(x) = x/(1 + x)$; (c) $E[X] = \mathrm{Var}(X) = 2$.
4. (a) $F(x) = 1 - e^{-x^2}$; (b) $r(x) = 2x$; (c) Yes.
5. (a) $F(x) = 0.7(1 - e^{-x}) + 0.3(1 - e^{-2x/3})$;
(b) $f(x) = 0.7e^{-x} + 0.2e^{-2x/3}$;
(c) $r(x) = \frac{0.7e^{-x} + 0.2e^{-2x/3}}{0.7e^{-x} + 0.3e^{-2x/3}}$;
(d) $r(x) = 1 - \frac{0.1}{0.7e^{-x/3} + 0.3}$ is decreasing.
6. (a) $\bar F(t) = e^{-0.2t}$ for $t \le 0.5$; $= e^{-0.1 - 0.5(t - 0.5)}$ for $t > 0.5$. (b) $\bar F(1) = e^{-0.35}$. (c) $E[X] = \frac{1}{0.2}(1 - e^{-0.1}) + \frac{1}{0.5} e^{-0.1}$.
7. $\bar F(2) = e^{-2\sqrt{2}}$.
8. (a) $F(100) = 1 - e^{-5}(1 + 5 + 5^2/2 + 5^3/3! + 5^4/4!)$. (b) $r(100) = 0.01 \cdot (5^4/4!)/(1 + 5 + 5^2/2 + 5^3/3! + 5^4/4!)$. (c) $E[X] = 500$, $\mathrm{Var}(X) = 50000$.
9. (a) $\bar F(1) = e^{-\sqrt{0.1}}$; (b) $\bar F(5) = e^{-\sqrt{0.5}}$; (c) $\bar F(10) = e^{-1}$; (d) $E[X] = 20$.
10. $F(3) = 1 - 5e^{-2}$.
11. (a) $E[X] = 500$; (b) $R(200) = 7e^{-2}$; (c) $R(200) \ge 0.95$, $\alpha = 5$.
12. Note that $X^\alpha$ is exponential with mean $\beta^\alpha$. Thus, by the memoryless property, we get the result.
13. Just note that $\bar F(x) = \exp\left( -\int_0^x r(t)\,dt \right)$.
14. Note that
\[
P[\min(X_1, \ldots, X_n) > x] = \bar F^n(x) = e^{-n(\lambda x)^\alpha}.
\]
Thus,
\[
f(x) = \alpha n \lambda^\alpha x^{\alpha - 1} e^{-n(\lambda x)^\alpha}.
\]
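The Weibull moments quoted in Problems 1 and 9 follow from $E[X^k] = \beta^k\,\Gamma(1 + k/\alpha)$ when the survival function is written as $e^{-(x/\beta)^\alpha}$. The sketch below evaluates them; this scale parameterization and the function name are assumptions made for the illustration, with $\beta = 1/\lambda$ relating it to the rate form used in Problem 14.

```python
import math

def weibull_mean_var(alpha, beta):
    """Mean and variance when the survival function is exp(-(x / beta) ** alpha)."""
    m1 = beta * math.gamma(1.0 + 1.0 / alpha)
    m2 = beta ** 2 * math.gamma(1.0 + 2.0 / alpha)
    return m1, m2 - m1 ** 2

mean_1a, var_1a = weibull_mean_var(0.5, 1.0)   # Problem 1(a): exp(-x^0.5)
mean_1b, var_1b = weibull_mean_var(2.0, 2.0)   # Problem 1(b): exp(-(0.5 x)^2)
mean_9, _ = weibull_mean_var(0.5, 10.0)        # Problem 9(d): exp(-(x / 10)^0.5)
```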
17. Note that
\[
f_{X|X+Y}(x|c) = \frac{\lambda e^{-\lambda x}\, \lambda e^{-\lambda(c - x)}}{\lambda^2 c\, e^{-\lambda c}} = \frac{1}{c}.
\]
Thus, $E[X \mid X + Y = c] = \frac{c}{2}$.
18. Since $\mathrm{sign}(r'(x)) = \mathrm{sign}(q_2^2) > 0$, $E_{1,2}$ is IFR.
19. Since $\mathrm{sign}(r'(x)) = \mathrm{sign}(-q_3 + q_3^2 + q_3^2(\lambda x) + q_3(\lambda x)^2/2)$, $r'(x)$ changes from negative to positive. That is, $E_{1,3}$ has a bathtub failure rate curve.
20. Since
\[
\log[\log(1 - F(x))^{-1}] = \log[\log e^{(\lambda x)^\alpha}] = \alpha \log \lambda + \alpha \log x,
\]
we see that it is a linear function of $\log x$.
21. By using the technique of calculating the mean, we have
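\[
\begin{aligned}
E[X^2] &= 2\int_0^\infty x(1 - F(x))\,dx = 2\left[ \int_0^\tau x e^{-\lambda_0 x}\,dx + e^{-\lambda_0 \tau} \int_\tau^\infty x e^{-\lambda_1(x - \tau)}\,dx \right] \\
&= 2\left[ -\frac{\tau}{\lambda_0} e^{-\lambda_0 \tau} + \frac{1}{\lambda_0} \int_0^\tau e^{-\lambda_0 x}\,dx + e^{-\lambda_0 \tau} \left( \frac{\tau}{\lambda_1} + \frac{1}{\lambda_1} \int_\tau^\infty e^{-\lambda_1(x - \tau)}\,dx \right) \right] \\
&= 2\left[ \left( \frac{1}{\lambda_1} - \frac{1}{\lambda_0} \right) \tau e^{-\lambda_0 \tau} + \frac{1}{\lambda_0^2}(1 - e^{-\lambda_0 \tau}) + \frac{1}{\lambda_1^2} e^{-\lambda_0 \tau} \right].
\end{aligned}
\]
Chapter 5
1. (a) For sufficiency, take $g(x) = I_{[x > t]}$. For necessity, we note that
\[
E[g(X)] = \int P[g(X) > t]\,dt = \int P[X > g^{-1}(t)]\,dt.
\]
2. Note that $F(x)$ being IFRA is equivalent to $\frac{1}{x} \ln \bar F(x)$ being decreasing, which means that for $0 < \alpha < 1$,
\[
\frac{1}{\alpha x} \ln \bar F(\alpha x) \ge \frac{1}{x} \ln \bar F(x).
\]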
Chapter 5 1. (a) For sufficiency, take g.x/ D IŒx>t . For necessity, we note Z Z EŒg.X / D P Œg.X / > tdt D P ŒX > g 1 .t/dt: 2. Note that F .x/ is IFRA is equivalent to x1 ln FN .x/ decreasing, which means for 0 < ˛ < 1, 1 1 ln FN .˛x/ ln FN .x/: ˛x x
256
Answers and Solutions
4. Similar to the proof for Problem 9. 8. From Chap. 2, by integrating by parts, we can find the survival function of Xk;n as n X
nŠ .FN .t//i .F .t//ni i Š.n i /Š i Dk Z FN .t / nŠ D x k1 .1 x/nk dx: .k 1/Š.n k/Š 0
P ŒXk;n > t D
Thus, nŠ d P ŒXk;n > t D f .t/.FN .t//k1 .F .t//nk : dt .k 1/Š.n k/Š Its failure rate can be calculated as
d P ŒXk;n dt
P ŒXk;n
> t D f .t/ > t
"Z 0
FN .t /
#1 1 x nk dx F .t/ !nk 31 N 1 y F .t/ dy 5 : 1 FN .t/
x N F .t/
2 Z f .t/ 4 1 k1 D y FN .t/ 0
k1
N
F .t / is increasing in FN .t/, we see that if Since 1y 1FN .t / Xk;n is IFR. i R h R 1 1 D r.x/ 9. E r.X/ f .x/dx D FN .x/dx.
f .t / FN .t /
is increasing in t, then
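10. By using L’Hospital’s rule,
\[
\lim_{y \to \infty} \mu(y) = \lim_{y \to \infty} \frac{\int_y^\infty \bar F(x)\,dx}{\bar F(y)} = \lim_{y \to \infty} \frac{\bar F(y)}{f(y)} = \frac{1}{r(\infty)}.
\]
11. $F(x)$ being NBUE implies
\[
\int_y^\infty \bar F(x)\,dx \le \mu \bar F(y).
\]
By integrating both sides in $y$, we get $E[X^2] \le 2\mu^2$.
13. The following failure rate gives a life distribution which is IFRA but not IFR:
\[
r(s) = \begin{cases} 1, & 0 < s < 1, \\ 2.1, & 1 \le s < 2, \\ 2, & 2 \le s. \end{cases}
\]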
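The example in Problem 13 can be checked on a grid: IFRA means that the average failure rate $\frac{1}{t}\int_0^t r(s)\,ds$ is nondecreasing, while IFR would require $r$ itself to be nondecreasing. The grid spacing, the tolerance, and the function names below are illustrative choices.

```python
def rate(s):
    """Piecewise-constant failure rate from Problem 13."""
    if s < 1.0:
        return 1.0
    return 2.1 if s < 2.0 else 2.0

def cumulative_hazard(t):
    """Integral of rate over (0, t), written out piece by piece."""
    return min(t, 1.0) * 1.0 + max(min(t, 2.0) - 1.0, 0.0) * 2.1 + max(t - 2.0, 0.0) * 2.0

grid = [0.01 * i for i in range(1, 501)]
averages = [cumulative_hazard(t) / t for t in grid]
average_is_nondecreasing = all(b >= a - 1e-12 for a, b in zip(averages, averages[1:]))
rate_is_nondecreasing = all(rate(b) >= rate(a) for a, b in zip(grid, grid[1:]))
```

The drop from 2.1 to 2 at $s = 2$ breaks monotonicity of the rate, but the running average keeps rising because it is still below 2 at that point.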
15. By noting that the density function of $\min(X_1, X_2)$ is given by
\[
-\frac{d}{dt}(\bar F_1(t)\bar F_2(t)) = f_1(t)\bar F_2(t) + \bar F_1(t) f_2(t),
\]
we have
\[
P[X_1 < X_2 \mid \min(X_1, X_2) = t] = \frac{P[X_1 < X_2;\, \min(X_1, X_2) = t]}{f_1(t)\bar F_2(t) + \bar F_1(t) f_2(t)} = \frac{P[X_2 > t]\, f_1(t)}{f_1(t)\bar F_2(t) + \bar F_1(t) f_2(t)} = \frac{r_1(t)}{r_1(t) + r_2(t)}.
\]
18. Note that $\int_y^\infty x\,dF(x)$.
20. First, we can show P ŒZ1 C Z2 > t D P ŒZ1 > t C P ŒZ1 C Z2 > tI Z1 t Z t P ŒZ2 > t udP .Z1 u/ D P ŒZ1 > t C 0 Z t P ŒZ1 > t C P ŒX2 > t udP .Z1 u/ 0
D P ŒZ1 C X2 > t: Similarly, we can show P ŒZ1 C X2 > t P ŒX1 C X2 > t. 21. Since
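$\frac{g(x)}{f(x)}$ is increasing, $\frac{g(x)}{f(x)} - 1$ changes from negative to positive. However,
\[
\int_0^\infty \left( \frac{g(x)}{f(x)} - 1 \right) f(x)\,dx = \int_0^\infty (g(x) - f(x))\,dx = 0.
\]
Thus, for any $t > 0$,
\[
\int_t^\infty \left( \frac{g(x)}{f(x)} - 1 \right) f(x)\,dx = P[Y > t] - P[X > t] \ge 0.
\]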
Chapter 6
1. $F(x, y) = 1 - e^{-0.2x - 0.3y - 0.1\max(x, y)}$.
2. $F_X(x) = 1 - e^{-0.3x}$; $F_Y(y) = 1 - e^{-0.4y}$.
3. $\mathrm{Cov}(X, Y) = 1/0.24$.
4. $P[X > x \mid Y = y] = e^{-0.2x}$ for $x < y$; $= 0.75e^{-0.2x - 0.6(x - y)}$ for $x > y$.
5. $E[X \mid Y = y] = 5 - 2.5e^{-0.2y}$.
8. $f(x, y) = 0.048e^{-0.2x - 0.3y - 0.1\max(x, y)}$ for $0 < x < y$; $= 0.108e^{-0.2x - 0.3y - 0.1\max(x, y)}$ for $0 < y < x$.
9. $F_s(x, y) = 1 - e^{-0.6\max(x, y)}$.
10. $1 - e^{-0.6z}$.
Chapter 7
1. For fixed $x + y = c$, we can assume $x \ge c/2$ and $y = c - x$. We only have to show that all functions are increasing in $x$.
2. Obviously, $S_k$ is increasing in $X_1, \ldots, X_n$.
3. Note that the joint density of $X_{1,2}$ and $X_{2,2}$ is $2f(x)f(y)$. The result follows from the independence of $X_1$ and $X_2$.
5. (a) Take $h(x, y) = I_{[x > s,\, y > t]}$. (b) Note that $P[X \le s;\, Y \le t] = 1 - P[X > s] - P[Y > t] + P[X > s;\, Y > t]$.
6. (b)
\[
\begin{aligned}
\mathrm{Cov}(X_1, X_2) &= E[\min(X, M)(X - M)_+] - E[\min(X, M)]\,E[(X - M)_+] \\
&= M E[(X - M)_+] - E[\min(X, M)]\,E[(X - M)_+] = E[(X - M)_+]\,E[(M - X)_+].
\end{aligned}
\]
13. Let $f(x, y)$ and $g(x, y)$ be the density functions of $(X_1, X_2)$ and $(Y_1, Y_2)$, respectively. For any increasing function $h(x, y)$, by changing the density function of $(Y_1, Y_2)$ to the density function of $(X_1, X_2)$, we have
\[
E[h(Y_1, Y_2)] = E\left[ h(X_1, X_2) \frac{g(X_1, X_2)}{f(X_1, X_2)} \right] \ge E[h(X_1, X_2)]\, E\left[ \frac{g(X_1, X_2)}{f(X_1, X_2)} \right] = E[h(X_1, X_2)].
\]
Chapter 8
1. (a) By conditioning on the first renewal point, the equation follows.
(b) We prove the result for the general renewal process. Note that $P[W > x]$ equals 1 for $x \le s$ and satisfies
\[
P[W > x] = \int_0^s P[W > x - u]\,dF(u)
\]
for $x \ge s$. Thus, using $E[W] = \int_0^\infty P[W > x]\,dx$ and exchanging the order of integration, we have
\[
\begin{aligned}
E[W] &= s + \int_s^\infty P[W > x]\,dx \\
&= s + \int_0^s \left( \int_s^{s+u} + \int_{s+u}^\infty \right) P[W > x - u]\,dx\,dF(u) \\
&= s + \int_0^s \left[ u + \int_s^\infty P[W > x]\,dx \right] dF(u) \\
&= s + \int_0^s [u - s + E[W]]\,dF(u) \\
&= s - \int_0^s (s - u)\,dF(u) + E[W]F(s) \\
&= \int_0^s \bar F(u)\,du + E[W]F(s),
\end{aligned}
\]
where in the last step we use integration by parts. Thus,
\[
E[W] = \frac{1}{\bar F(s)} \int_0^s \bar F(u)\,du.
\]
3. By using Lemma 8.1 and Wald’s identity, we have
\[
\frac{E[C(t)]}{t} = E[R]\, \frac{E[N(t)]}{t} \to \frac{E[R]}{E[X]}.
\]
4. (d) By differentiating with respect to $T$, we see that the derivative has the same sign as
\[
(a - b) r(T) - \frac{aF(T) + b(1 - F(T))}{\int_0^T (1 - F(x))\,dx}.
\]
As $T \to 0$, the above term is $-\infty$. As $T \to \infty$, if $r(T)$ increases to $r(\infty)$ and
\[
(a - b) r(\infty) - \frac{a}{\mu} > 0,
\]
then it can be shown that there exists a unique solution $T$ minimizing the long-run cost rate. In particular, the above condition is always valid if $r(\infty) = \infty$.
5. (a) $\int_0^T x\,dF(x)/T$; (b) $(aF(T) + b(1 - F(T)))/T$.
6. (a) By looking backward at time $t$, we see that $A(t)$ has the same distribution as $\min(t, X)$ and $R(t)$ has the same distribution as $Y$, where $X$ and $Y$ are independent exponential random variables with parameter $\lambda$. Thus,
\[
P[D(t) > z] = P[X + Y > z;\, X \le t] + P[t + Y > z;\, X > t].
\]
For $t \le z$, by conditioning on $X$, we have
\[
P[D(t) > z] = \int_0^t e^{-\lambda(z - x)} \lambda e^{-\lambda x}\,dx + e^{-\lambda(z - t)} e^{-\lambda t} = \lambda t e^{-\lambda z} + e^{-\lambda z};
\]
while for $t > z$,
\[
P[D(t) > z] = \int_0^z e^{-\lambda(z - x)} \lambda e^{-\lambda x}\,dx + P[z < X \le t] + P[X > t] = \lambda z e^{-\lambda z} + e^{-\lambda z}.
\]
9. (a) From the memoryless property of the exponential distribution, we have
\[
\begin{aligned}
P[Y > X_1 + \cdots + X_n] &= P[Y > X_1]\, P[Y > X_1 + \cdots + X_n \mid Y > X_1] \\
&= P[Y > X_1]\, P[Y > X_2 + \cdots + X_n] = P[Y > X_1] \cdots P[Y > X_n].
\end{aligned}
\]
(b) From the definition of the renewal function, we have
\[
E[N(Y)] = E[E[N(Y) \mid Y]] = E\left[ \sum_{n=1}^{\infty} P[X_1 + \cdots + X_n \le Y \mid Y] \right] = \sum_{n=1}^{\infty} (P[X_1 \le Y])^n = \frac{P[X_1 \le Y]}{1 - P[X_1 \le Y]} = \frac{E e^{-\lambda X_1}}{1 - E e^{-\lambda X_1}}.
\]
Chapter 9 1. (a) e.MY .s/1/ . (b) e.1=.1s/1/ . 2. The Lundberg coefficient satisfies p p 1p .1 p/ C D .1 C / : C 1 2 1 2 3. (a) D (b) (c)
. .1C / 1 .u/ D 1C e .1C / u . 1 .u; y/ D 1C .1 ey= /eu .
4. (a) From EŒmin.X; M / D P .P < 1=/, we get M D 1 ln.1 P /. (b) We use the relationships Var.X / D
1 D Var.X / C Var.XQ / C 2Cov.X ; XQ /; 2
and (Problem 6, Chap. 7) Cov.X ; XQ / D EŒ.X M /C EŒ.M X /C D
M .M P /:
Thus,
1 M 2 .M P /: 2 7. The Weibull distribution satisfies Condition (b) of Theorem 9.10. Var.X / C Var.XQ / D
8. The Pareto distribution satisfies Condition (a) of Theorem 9.10.
9. Note that
\[
\frac{1 - F(x + y)}{1 - F(x)} = \exp\left( -\int_x^{x+y} r(u)\,du \right).
\]
Thus, $F(x)$ belongs to $\mathcal{L}$ if and only if
\[
\lim_{x \to \infty} \int_x^{x+y} r(u)\,du = 0.
\]
Since $r(u)$ is decreasing, this is equivalent to $\lim_{x \to \infty} r(x) = 0$.
10. If $F(x)$ is DFR, then $F(x) \in \mathcal{D}$ if and only if
\[
\lim_{x \to \infty} \int_{x/2}^{x} r(u)\,du < \infty.
\]
This is equivalent to $\lim_{x \to \infty} x r(x) < \infty$.
11. Note that for independent random variables $X_1, X_2$,
\[
P(X_1 + X_2 > x) = P\left( X_1 \le \frac{x}{2};\, X_1 + X_2 > x \right) + P\left( X_2 \le \frac{x}{2};\, X_1 + X_2 > x \right) + P\left( X_1 > \frac{x}{2};\, X_2 > \frac{x}{2} \right).
\]
The equality is obtained when $X_1$ and $X_2$ follow the same distribution function.
12. From the above equality, we have
\[
\frac{1 - F^{(2)}(x)}{1 - F(x)} = 2\int_0^{x/2} \frac{1 - F(x - y)}{1 - F(x)}\,dF(y) + \frac{(1 - F(x/2))^2}{1 - F(x)}.
\]
Now $\limsup \frac{1 - F(x/2)}{1 - F(x)} < \infty$, and
\[
\frac{1 - F(x - y)}{1 - F(x)} \le \frac{1 - F(x/2)}{1 - F(x)}
\]
for $y \le x/2$. Therefore, by the dominated convergence theorem, the first term goes to 2 since $F(x) \in \mathcal{L}$. The second term goes to 0 since $F(x) \in \mathcal{D}$.
13. It can be checked that $F(x)$ belongs to both $\mathcal{L}$ and $\mathcal{D}$.
Chapter 10
1. (a) $(1 + 0.1/12)^3$; (b) $(1 + 0.1/12)^6$; (c) $(1 + 0.1/12)^{12}$; (d) $(1 + 0.1/365)^{90}$, $(1 + 0.1/365)^{180}$, and $(1 + 0.1/365)^{365}$.
3. Solve $x(1 + 0.12/12)^{24} = 1200$.
4. $(1 + 0.15/365)^{365} = 1.161798$; $(1 + 0.155/2)^2 = 1.161006$.
5. $100000 \times (0.08/12)/(1 - (1 + 0.08/12)^{-180}) = 955.652$.
8. (a) $r = 0.000274$; (b) $p = 0.79$; $q = 0.7900995$; (c) The price for a call option is 1.63. (d) The writer should buy 1 share of stock and borrow 98.37 amount of bond.
9. (a) $A(c) = 1/c$; (b) $R(c) = 1$; (c) $A(c) = 1$; $R(c) = c$.
10. (a) $E(R_1) = 0.04$, $E(R_2) = 0.16$; (b) $\mathrm{Var}(R_1) = 0.0184$, $\mathrm{Var}(R_2) = 0.0024$, and $\mathrm{Cov}(R_1, R_2) = -0.0064$; (c) $\mathrm{Var}(0.4R_1 + 0.6R_2) = 0.000736$; (d) Optimal weight $w_2 = 0.738$.
11. (a) $p = 2/3$, $q = 8/11$, and $m = 1$. Thus, $p_0 = 100[(8/11)^2 + 2 \times (8/11) \times (3/11)] - (100/1.1^2)[(2/3)^2 + 2 \times (2/3) \times (1/3)] = 100 \times 0.926 - 100 \times 0.735 = 19.1$.
(b) Without hedging, the mean is $(2/3)(-8.08) + (1/3)10.65 = -1.84$ and the variance is $(2/3)8.08^2 + (1/3)10.65^2 - 1.84^2 = 77.94$.
(c) Under the hedging strategy, we should buy 0.926 shares of stock and borrow 0.735 units of bond. With hedging, the mean is $(2/3)1.18 + (1/3)(-7.87) = -1.84$ and the variance is $(2/3)1.18^2 + (1/3)7.87^2 - 1.84^2 = 18.19$.
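Several of the numerical answers in Chapter 10 can be reproduced with a few lines of arithmetic. The sketch below recomputes Problems 4, 5, 10(c), and 10(d). The function names are illustrative, and the minimum-variance weight formula $w_2 = (\sigma_1^2 - \sigma_{12})/(\sigma_1^2 + \sigma_2^2 - 2\sigma_{12})$ is the standard two-asset result, assumed here to be the criterion behind "optimal weight".

```python
def effective_growth(rate, periods):
    """Growth factor of one unit over a year with compounding `periods` times."""
    return (1.0 + rate / periods) ** periods

def level_payment(principal, monthly_rate, months):
    """Monthly payment that amortizes `principal` over `months` payments."""
    return principal * monthly_rate / (1.0 - (1.0 + monthly_rate) ** (-months))

def portfolio_variance(w1, w2, var1, var2, cov):
    """Variance of w1 * R1 + w2 * R2."""
    return w1 * w1 * var1 + w2 * w2 * var2 + 2.0 * w1 * w2 * cov

def min_variance_weight2(var1, var2, cov):
    """Weight on asset 2 that minimizes variance when the two weights sum to 1."""
    return (var1 - cov) / (var1 + var2 - 2.0 * cov)

daily = effective_growth(0.15, 365)                 # Problem 4
semiannual = effective_growth(0.155, 2)             # Problem 4
payment = level_payment(100000, 0.08 / 12, 180)     # Problem 5
variance = portfolio_variance(0.4, 0.6, 0.0184, 0.0024, -0.0064)  # Problem 10(c)
w2 = min_variance_weight2(0.0184, 0.0024, -0.0064)  # Problem 10(d)
```

Note that the portfolio variance in 10(c) only comes out to 0.000736 when the covariance is negative, which is how the sign of $\mathrm{Cov}(R_1, R_2)$ can be confirmed.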
Index
β pricing model, 203
A Absolutely continuous BVE(ACBVE), 135 Age, 54 Age-dependent branching process, 174 Arithmetic distribution function, 160 Asset pricing, 199 Associated, 144 Association, 141
B Binomial model, 208 Bivariate distribution, 120 Bivariate exponential distribution(BVE), 125 Black–Scholes formula, 217
C Cauchy functional equation, 17 Censoring, 61 Change-point model, 71 Changed probability measure, 184 Claim, 19 Cold-redundant system, 77 Complete financial market, 211 Conditional probability, 5 Consumption level, 199 Convolution, 54 Correlation, 204 Counting process, 45 Covariance, 126 Cox–Ross–Rubinstein formula, 210 Cumulative damage model, 107
D Decomposition, 57 Deductible, 19 Deductible franchise, 19 Defective distribution, 169 Defective renewal equation, 180 Defective renewal process, 169 Deficit at ruin, 179 Delayed renewal process, 166 Delta hedging, 216 Density function, 5 Density function joint, 33 marginal, 34 Dependence, 141 Distribution function, 4 Binomial, 4 Distribution function
χ-square, 75 normal, 5 absolutely continuous, 5 continuous, 5 degenerate, 4 discrete, 4 Erlang, 71 gamma, 71 normal, 217 parametric, 71 Pareto, 189 Poisson, 46 Rayleigh, 72 singularly continuous, 120 underlying, 53 Weibull, 71 Dividend, 205
E Elementary renewal theorem, 160 Equilibrium distribution, 184 European call option, 209 Exponential distribution, 23 Exponential distribution mixture, 71 Exponential function, 16 Extreme value, 26
F Failure rate, 14, 25 Failure rate bathtub shaped, 82 increasing (decreasing), 82 Fatal shock model, 120
G Gamma function, 72 Geiger counter, 68 Geometric compound, 169
H H¨older’s inequality, 103 Hazard function, 79, 93 Hazard increased by Failures, 150 Heavy tail distribution, 185 Hedge fund, 213 High-order approximation, 163
I Independent, 8 Independent increment, 46 Induction method, 172 Insurance policy, 193
J Jensen’s inequality, 194
K Key renewal theorem, 161
L Ladder times and heights, 183 Large claims, 185 Left tail decreasing (LTD), 142
Life distribution class IFR(DFR), 87 Life distribution class 2 NM U , 182 DMRL(IMRL), 96 HDMRL(HIMRL), 96 HNBUE(HNWUE), 98 IFRA(DFRA), 93 NBU(NWU), 97 NBUC(NWUC), 97 NBUE(NWUE), 97 Log-concave, 88 Lower bound, 172 Lundberg bound, 181 Lundberg coefficient, 181
M Marginal utility, 199 Martingale, 208 Mean, 5 Mean conditional, 7 Mean residual life, 16, 26 Mean-Variance Frontier, 204 Memoryless property, 27 Mixture distribution, 102 Mixture Erlang distribution, 82 Moment generating function, 6, 24 Multivariate life distribution, 117 Multivariate total positivity (M TP2 ), 146
N Negatively associated, 152 Non-arbitrage, 211 Nonfatal shock model, 133 Normal distribution truncated, 91 Normal law, 9
O Order statistics, 32 Overshoot, 161
P Parameter, 24, 72 Parameter scale, 72 shape, 72 Payoff, 199 PF2 function, 87
Poisson law, 9 Poisson process, 46 Portfolio management, 199 Positive quadrant dependent, 142 Positive regression dependent (PRD), 142 Premium rate, 179 Price of risk, 204 Pricing formula, 201 Pricing kernel, 201 Probability distribution binomial, 60 geometric, 41 Q Quantity of risk, 204 Queueing system, 68 R Random variable, 4, 5 Record values, 37 Recursive renewal equations, 181 Reinsurance policy, 195 Reliability, 23 Renewal equation, 55 Renewal function, 54 Renewal process, 53 Renewal theorem, 161 Residual Life, 13 Residual life, 25 Retention level, 195 Return rate, 203 Right tail increasing (RTI), 142 Risk, 193 Risk adjustment, 202 Risk process, 179 Risk-aversion coefficient, 200 Risk-free bond, 201 Risk-free rate, 201 Risk-neutral probability, 207 Ruin probability, 179
S S(S ) distribution class, 192 Self-financing, 210 Shock model, 104 Spacing statistics, 32 Spacing statistics normalized, 32 Star-shaped function, 113 Stationary increment, 46 Stationary renewal process, 166 Stop-loss type, 195 Stopping time, 160 Sub(super)-additive function, 113 Subexponential distribution, 190 Superposition, 57 Surplus process, 179 Survival function, 13, 24 Survival function conditional, 14
T Time of ruin, 179 Total positivity, 106 Total time on test, 61
U Upper bound, 172 Utility function, 199
V Variance, 6
W Wald’s identity, 160 Weakened by failures, 150